Assignment 5 Color Blindness Simulator
CSE 13S, Spring 2023 Document version 1 (changes in Section 11)
1 Introduction About 4% of people are color blind , meaning that they have some decreased ability to see differences in colors compared to other people. The most common type of color blindness is red-green color blindness, which refers to an inability or a reduced ability to distinguish between red and green. The most severe form of red-green color blindness is deuteranopia , which is caused by a complete lack of green-sensitive light sensors (cone cells) in one’s eyes, leaving just the red-sensitive and blue-sensitive cone cells. Presented with two colors that differ only in their stimulation of green cone cells, someone with deuteranopia will be unable to detect any difference between the colors. Such a condition affects how someone interacts with the world around them. If user-interface designers can be made aware of how their color choices affect people with color blindness, they can design their products to avoid color confusion for those with the most common types of color blindness. To understand which pairs of colors are affected by deuteranopia, in this assignment, you will write an image-processing program that allows someone with normal color vision to appreciate the range of colors experienced by someone who has deuteranopia. As part of this exercise, you will □Use “unbuffered file-I/O” functions to read and write binary files (Section 2). (a) Original photo. (b) Simulating deuteranopia . Figure 1: A comparison of a normal photo and a simulation of deuteranopia . 1 □Get practice in reading and writing the data of a binary file into and out of a C data structure—a process called “marshaling” or “serialization” (Section 3 and Section 4). □Use a simple “shell script” to avoid needing to type complex commands (Section 5). 2 Unbuffered File I/O I’d spell creat with an e. Ken Thompson, when asked what he would do differently if he were redesigning the UNIX operating system[1]. The file-I/O functions that you’ve been using, like fopen(),fprintf() ,fscanf() , and fclose() , pro- vide convenience and a wealth of features, but they also have overhead. Sometimes, when writing high- performance programs, we want to minimize that overhead. So instead of using the “FILE ” functions, we use Unix system calls to access files using an interface of our own design. The Unix file-I/O functions include open(),read(),creat(),write(), and close(). One difference is that instead of a FILE , the Unix file-I/O functions use an intfile descriptor. Another difference is that these Unix file-I/O functions read and write raw bytes of data ( uint8_t s). There is no formatting! You specify the number of bytes that you would like to transfer into or out of a file, and the read()orwrite() function returns the number of bytes that it actually transferred (or an error code). Take a look at the manual page for write(): in Ubuntu enter this command: $ man 2 write (We specify chapter “2” because there’s also a writecommand in chapter 1 that does not interest us.) From the manual page for the write() system call you learn: •To use the function you need to
include
•The function prototype is ssize_t write(int fd, const void buf, size_t count); •The return value is the number of bytes actually written or -1to indicate an error (such as a full storage disk). fdis a file descriptor (described next), bufpoints to a byte buffer, and countis the number of bytes from the buffer to write to the file (although the return value may be less than that, if the function chooses not to write that many bytes). To get a file descriptor, you use either the open()system call (for reading a file) or the creat() system call (for writing a file). The details regarding the use of these system calls are covered in lecture. We say that the system calls support unbuffered file I/O, but your functions will provide a bitof buffering—just not the complexity that the 216-byte FILE structure supports. Instead of a pointer to FILE, your file-access functions will use a pointer to Buffer, a type that is defined below. The structure has fields that let access functions convert between large blocks of data and individual bytes. The functions of this section manage the buffers, and the functions of the Section 3 manage the individual bytes. We mention here what the structure’s fields are for. The a[]field is the actual buffer: it is a place for up to BUFFER_SIZE bytes. The purpose of the offset and num_remaining fields depends on whether this is a read buffer or a write buffer. When reading from a file, after a block of data is read into a[], the offsetfield contains the index of the next byte to take from the buffer, and the num_remaining field stores the number of bytes remaining in a[]until the buffer needs to be refilled. When writing to a file, bytes are stored sequentially in a[], the offsetfield contains the index of the next free location to place a byte into the buffer, and the num_remaining field is unused. The fdfield holds the file descriptor that will be used by either read()orwrite(). ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 2 of 13 / put in io.h / typedef struct buffer Buffer; / put in io.c / struct buffer { int fd; // file descriptor from open() or creat() int offset; // offset into buffer a[] // next valid byte (reading) // next empty location (writing) int num_remaining; // number of bytes remaining in buffer (reading) uint8_t a[BUFFER_SIZE]; // buffer }; 2.1 Functions The read_open() andread_close() functions allocate and free a Bufferthat can be used for reading from a file. The write_open() and write_close() functions allocate and free a Bufferthat can be used for writing to a file. (See Section 3.1 for the related functions that read and write data.) Buffer read_open(const char *filename); Open the file filename using the open(filename, O_RDONLY) system call. If unsuccessful (the return value <0) then return NULL. Otherwise malloc() orcalloc() a new Buffer, set the fdfield to the return value from open(), and set the rest of the fields to 0. Return the pointer to the new Buffer. Whenever your program calls read_open() , it will need to call a corresponding read_close() later. void read_close(Buffer pbuf); Call close((pbuf)->fd) to close the file. Free the Buffer. Set pbuf = NULL . Buffer write_open(const char filename); Open the file filename using the creat(filename, 0664); system call. If unsuccessful (the return value <0) then return NULL. Otherwise malloc() orcalloc() a new Buffer, set the fdfield to the return value from creat(), and set the rest of the fields to 0. Return the pointer to the new Buffer. Whenever your program calls write_open() , it will need to call a corresponding write_close() later. void write_close(Buffer pbuf); Write any accumulated bytes that are in the buffer a[]to the file indicated by (pbuf)->fd . (See Section 3.) Then call close((pbuf)->fd) to close the file. Free the Buffer. Set pbuf = NULL . 3 Marshaling/Serialization marshal ’mär-sh @l transitive verb 3.to lead ceremoniously or solicitously, usher: Marshaling her little group of children down the street. merriam-webster.com Different computers can use different in-memory representations of the same type of data. In fact, the same computer, running a different compiler, may represent the same data types differently. For example, ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 3 of 13 on different computers integers may have different sizes, floating-point numbers may use different internal representations, and even worse, pointers that contain the addresses of data values on one computer probably won’t point to the same data values on another computer. The classic example of a difference between memory representations is called “endianness.” If a data value, such as an int, is represented in memory using more than one byte, and a computer needs to access individual bytes of the data value, then there are two choices. A computer that considers the least-significant byte of the intto have a lower address than the most-significant byte is called “little-endian.” A computer that considers the most-significant byte of the intto have a lower address than the least-significant byte is called “big-endian”[2]. So if a program were to send its in-memory representation of a struct to a computer that uses a different endianness, almost surely the intended data values will not be communicated. So although it is tempting allow computers to communicate quickly by sending each other big blocks of memory, to ensure that the meaning of the bytes is the same on both computers, the sending computer “marshals” or “serializes” its memory data into an agreed-upon byte sequence that it sends to the other computer. Then the receiving computer “unmarshals” or “deserializes” the byte sequence into its own in- memory data format. Such a byte sequence can be used for communication, as described above, but it also can be used to define the format of a binary file, such as an audio file or an image file, which is what you will do in this assignment. 3.1 Functions The six read and write functions below transfer uint8_t,uint16_t , and uint32_t data values through a Bufferthat already has been initialized with read_open() orwrite_open() . Each call to a read function retrieves data from a read Buffer; each call to a write function stores data to a write Buffer. Since at some point the read buffer will become empty (or the write buffer will become full), when necessary, the functions below also will refill the read buffer from a file (or write the contents of a write buffer to a file). When you look at the descriptions of the functions, you’ll see that only the read_uint8() function and thewrite_uint8() function read from or write to files. The other read and write functions, for the larger data types, call the read_uint8() orwrite_uint8() functions in the proper order to “marshal” the data into or out of the file. For this assignment we will use the “little-endian” byte order, meaning that the first byte of a data value’s byte sequence will contain the data value’s least-significant bits. It is the responsibility of the functions below to use the little-endian byte order. bool read_uint8(Buffer buf, uint8_t x); If the buffer is empty ( buf->num_remaining == 0 ), then refill it from an open file using buf->fd. ssize_t rc = read(buf->fd, buf->a, sizeof(buf->a)); if (rc < 0) TODO report error if (rc == 0) return false; // end of file buf->num_remaining = rc; buf->offset = 0; Then set xto the next byte in the buffer, increment buf->offset , and return true. bool read_uint16(Buffer buf, uint16_t x); To deserialize a uint16_t , call read_uint8() twice to read two bytes. If either call returns false, we’ve gotten to the end of the file, and so this function returns falseas well. Otherwise, copy the second byte into a new uint16_t variable and shift the new variable to the left by 8 bits. Next, use the binary “or” ( |) operation to “or” the first byte into the uint16_t variable. You’ve now “unmarshaled” or “deserialized” the uint16_t value from the byte sequence. Set xto the resulting uint16_t and return true. ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 4 of 13 bool read_uint32(Buffer buf, uint32_t x); To deserialize a uint32_t , Call read_uint16() twice to read two uint16_t values (four bytes). If either call returns false, we’ve gotten to the end of the file, and so this function returns falseas well. Otherwise, copy the second uint16_t into a new uint32_t variable and shift the new variable to the left by 16 bits. Next, use the binary “or” ( |) operation to “or” the first uint16_t into the uint32_t variable. You’ve now “unmarshaled” or “deserialized” the uint32_t value from the byte sequence. Set xto the resulting uint32_t and return true. void write_uint8(Buffer buf, uint8_t x); If the buffer is full ( buf->offset == BUFFER_SIZE ), then call write() as many times as necessary to empty it. (write() may not write allbytes of the buffer, and so you need to check its return value and call it in a loop.) uint8_t start = buf->a; int num_bytes = buf->offset; do { ssize_t rc = write(buf->fd, start, num_bytes); if (rc < 0) TODO report error start += rc; // skip past the bytes that were just written num_bytes -= rc; // how many bytes are left? } while (num_bytes > 0); buf->offset = 0; Then set xto the next byte in the buffer, increment buf->offset , and return true. void write_uint16(Buffer buf, uint16_t x); To serialize the uint16 x , call write_uint8() twice: first with x, and then with x >> 8. void write_uint32(Buffer *buf, uint32_t x); To serialize the uint32 x , call write_uint16() twice: first with x, and then with x >> 16. 4 Reading/Writing Windows BMP Image Files Images that use the Windows BMP format[3] can be read by both Windows computers and Macs. Version 3 of the format for Microsoft Windows 3.x is very easy to read and write, and so that’s the version that we use. You will write functions to read and write Windows BMP files into and out of a BMP type: typedef struct color { uint8_t red; uint8_t green; uint8_t blue; } Color; typedef struct bmp { uint16_t height; uint16_t width; Color palette[MAX_COLORS]; uint8_t *a; } BMP; ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 5 of 13 4.1 Functions You will write functions to serialize and deserialize a BMP file: bmp_write() serializes a BMP image and creates a new BMP file. bmp_create() deserializes a BMP file, creating a BMP in memory. bmp_free() frees the BMP once you are finished with it. We provide you with bmp_adjust_palette() , which changes the colors of a BMP image to simulate deuteranopia. void bmp_write(const BMP bmp, Buffer buf) Write a BMP file. The pseudocode below represents data being written to the file as rows of a table. The left column indicates the number of bits to write, using write_uint8() ,write_uint16() , orwrite_uint32() . The right column gives the value to write. BMP File Format: Writing int32_t rounded_width = (width + 3) & ~3; int32_t image_size = height rounded_width; int32_t file_header_size = 14 int32_t bitmap_header_size = 40 int32_t num_colors = 256 int32_t palette_size = 4 num_colors int32_t bitmap_offset = file_header_size + bitmap_header_size + palette_size int32_t file_size = bitmap_offset + image_size 8'B' 8'M' 32 file_size 16 0 16 0 32 bitmap_offset 32 bitmap_header_size 32 bmp->width 32 bmp->height 16 1 16 8 32 0 32 image_size 32 2835 32 2835 32 num_colors 32 num_colors for i from 0 to num_colors - 1 8bmp->palette[i].blue 8bmp->palette[i].green 8bmp->palette[i].red 8(skip one byte) for y from 0 to bmp->height - 1 for x from 0 to bmp->width - 1 8bmp->a[x][y] for x from bmp->width to rounded_width - 1 80 ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 6 of 13 BMP bmp_create(Buffer buf) Create a new BMP struct, read a BMP file into it, and return a pointer to the new struct. The pseudocode below represents data being read from the file as rows of a table. The left column indicates the number of bits to read, using read_uint8() ,read_uint16() , orread_uint32() . The right column indicates which variable should receive the value. When the table indicates that data should be skipped, just read the data into a variable whose value will be ignored. BMP File Format: Reading BMP bmp = calloc(1, sizeof(BMP)); // TODO check for bmp == NULL 8uint8_t type1 8uint8_t type2 32(skip four bytes) 16(skip two bytes) 16(skip two bytes) 32(skip four bytes) 32 uint32_t bitmap_header_size 32 bmp->width 32 bmp->height 16(skip two bytes) 16 uint16_t bits_per_pixel 32 uint32_t compression 32(skip four bytes) 32(skip four bytes) 32(skip four bytes) 32 uint32_t colors_used 32(skip four bytes) verify type1 == 'B' verify type2 == 'M' verify bitmap_header_size == 40 verify bits_per_pixel == 8 verify compression == 0 uint32_t num_colors = colors_used if (num_colors == 0) num_colors = (1 << bits_per_pixel) for i from 0 to num_colors - 1 8bmp->palette[i].blue 8bmp->palette[i].green 8bmp->palette[i].red 8(skip one byte) // Each row must have a multiple of 4 pixels. Round up to next multiple of 4. uint32_t rounded_width = (bmp->width + 3) & ~3 // Allocate pixel array bmp->a = calloc(rounded_width, sizeof(bmp->a[0])); for x from 0 to rounded_width - 1 bmp->a[x] = calloc(bmp->height, sizeof(bmp->a[x][0])); // read pixels for y from 0 to bmp->height - 1 for x from 0 to rounded_width - 1 8bmp->a[x][y] return bmp; ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 7 of 13 void bmp_free(BMP *bmp) uint32_t rounded_width = ((bmp)->width + 3) & ~3 for i from 0 to rounded_width - 1 free((bmp)->a[i]) free((bmp)->a); free(bmp); bmp = NULL; void bmp_reduce_palette(BMP bmp); AdjustthecolorpaletteofabitmapimagetosimulatedeuteranopiausingtheCcodebelow. (SeeAppendixA if you are interested in the source of the equations.) int constrain(int x, int a, int b) { return x < a ? a : x > b ? b : x; } void bmp_reduce_palette(BMP bmp) { for (int i = 0; i < MAX_COLORS; ++i) { int r = bmp->palette[i].red; int g = bmp->palette[i].green; int b = bmp->palette[i].blue; int new_r, new_g, new_b; double SQLE = 0.00999 r + 0.0664739 g + 0.7317 b; double SELQ = 0.153384 r + 0.316624 g + 0.057134 b; if (SQLE < SELQ) { // use 575-nm equations new_r = 0.426331 r + 0.875102 g + 0.0801271 b + 0.5; new_g = 0.281100 r + 0.571195 g + -0.0392627 b + 0.5; new_b = -0.0177052 r + 0.0270084 g + 1.00247 b + 0.5; } else { // use 475-nm equations new_r = 0.758100 r + 1.45387 g + -1.48060 b + 0.5; new_g = 0.118532 r + 0.287595 g + 0.725501 b + 0.5; new_b = -0.00746579 r + 0.0448711 g + 0.954303 b + 0.5; } new_r = constrain(new_r, 0, UINT8_MAX); new_g = constrain(new_g, 0, UINT8_MAX); new_b = constrain(new_b, 0, UINT8_MAX); bmp->palette[i].red = new_r; bmp->palette[i].green = new_g; bmp->palette[i].blue = new_b; } } ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 8 of 13 5 Using Simple Shell Scripts We’ve provided you with a number of test images in the bmpsdirectory. You can run colorbon all of them manually using commands like this: $ ./colorb -i bmps/apples-orig.bmp -o bmps/apples-colorb.bmp $ ./colorb -i bmps/cereal-orig.bmp -o bmps/cereal-colorb.bmp $ ./colorb -i bmps/froot-loops-orig.bmp -o bmps/froot-loops-colorb.bmp $ ./colorb -i bmps/ishihara-9-orig.bmp -o bmps/ishihara-9-colorb.bmp $ ./colorb -i bmps/produce-orig.bmp -o bmps/produce-colorb.bmp $ ./colorb -i bmps/color-chooser-orig.bmp -o bmps/color-chooser-colorb.bmp However it is easier to use a “shell script.” We’ve provided a script called cb.shthat looks in the bmpsdirectory and runs colorbon all BMP files whose name ends in -orig.bmp . Each corresponding output file has the same base but ends in -colorb.bmp . $ ./cb.sh cb.shis a text file. You can edit it to see what it does. You may want to run colorbon your own files, but if you don’t have BMP files, no worries! You can use a program called convert from the ImageMagick software suite to convert images to and from the BMP format. You can install the ImageMagick software suite on your Ubuntu VM using this command: sudo apt install imagemagick Then the following command converts from nearly any other image format into the BMP format that colorbrequires. Change the input filename file-orig.gif into whatever file you have (any base name and any extension, such as .gif,.png,.jpg, etc.). You can change the output filename, too, except that it must end in .bmp. The BMP3:file-version prefix is required, too. $ convert file-orig.gif -colors 256 -alpha off -compress none BMP3:file-orig.bmp To convert back into a .gif(or whatever), use a command like this: $ convert file-colorb.bmp file-colorb.gif 6 Command line options Your program should support these command-line options. -iand-oare required. •-i: Sets the name of the input file. Requires a filename as an argument. •-o: Sets the name of the output file. Requires a filename as an argument. •-h: Prints a help message to stdout. 7 Program Output and Error Handling If any invalid options or files are specified, your program should report an error and exit cleanly. Your bmp_create() function should include the listed verification steps, reporting an error if one of them fails, but other than those checks, your code can assume that an input BMP file is valid. ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 9 of 13 8 Testing your code To get you started on testing your code, we have provided you io_test.c . •You will receive a folder of BMP images. Your program should be able to successfully process these images. •Your program should have no memory leaks . Make sure you free()before exiting. valgrind should pass cleanly with any combination of the specified command-line options, including on an error condi- tion. •Your program must pass the static analyzer, scan-build , 9 Submission For the report draft , you must submit a commit ID on canvas before Wednesday May 24th at 11:59 Pacific Time. You must have a PDF called report.pdf . For the entire project, you must submit a commit ID on canvas before Sunday May 28th at 11:59 pm Pacific Time . Your submission must have these files. You must have run clang-format on the .cand.hfiles. •bmp.c– contains your BMP functions (Section 4.1) •bmp.h— provided •colorb.c — contains your main()function •io.c— contains your serialization/deserialization functions (Section 3.1) •io.h— provided •Makefile — your Makefile 10 Supplemental Readings •The C Programming Language by Kernighan & Ritchie 11 Revisions Version 1 Original. References [1] Brian W. Kernighan and Rob Pike. The UNIX Operating Environment , page 204. Prentice-Hall, Inc., Englewood Cliffs, NJ, 1984. [2] Danny Cohen. On holy wars and a plea for peace. Internet Experiment Note , (137), April 1980. URL: https://www.rfc-editor.org/ien/ien137.txt . [3] FileFormat.info. Microsoft windows bitmap file format summary. https://www.fileformat.info/ format/bmp/egff.htm . [4] Hans Brettel, Françoise Viénot, and John D. Mollon. Computerized simulation of color appearance for dichromats. Journal of the Optical Society of America A , 14(10):2647–2655, October 1997. ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 10 of 13 Appendix A: Color Transformations This appendix documents the equations that are in the bmp_reduce_palette() function of Section 4. Read it if you are interested. A.1 Background Brettel et al.[4] teach how to convert an RGB (red/green/blue) representation of a color into a newRGB color that lets people with normal color vision experience the “color confusion” that someone with color blindness can experience. Their approach is to divide the conversion into three steps: (1) transform the RGB color into the more physiologically relevant LMS color space, which models the color response of the three kinds of cone cells of the human eye, (2) manipulate the LMS color to simulate the unique cone cells of someone who has color blindness, and (3) transform the manipulated LMS color back into an RGB value. These three steps can be combined into a single matrix multiplication of an RGB color value: V′=RV where V′= RV′ GV′ BV′ =converted color V= RV GV BV =original color andRis the desired transformation matrix. A.2 Deriving R Equations from the reference define colors in the LMS color space ( QandQ′) along with conversions to and from the colors in the RGB color space ( VandV′). Q′= L′ Q M′ Q S′ Q =converted color in LMS space Q= LQ MQ SQ =orginal color in LMS space (3) Q′=TV′Q=TV (4) V′=T−1Q′V=T−1Q (5) (The equation numbers used in this section match those of the reference.) Table 1 from the reference provides the values of T: T= 0.1992 0 .4112 0 .0742 0.0353 0 .2226 0 .0574 0.0185 0 .1231 1 .3550 At this point we merely lack a conversion in LMS color space from QintoQ′. The reference uses a matrix multiplication: Q′=CQ Combine equations from above into a conversion from VintoV′ V′=T−1(C(TV)) apply the associative property of matrix multiplication V′= (T−1CT)V and recall that we are looking for the Rof the original equation V′=RV, which reveals R=T−1CT ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 11 of 13 A.3 Deriving C The variable Cof the LMS multiplicative transformation Q′=CQis defined by the equations: LQ′=LQ (10) MQ′=−(aLQ+cSQ)/b SQ′=SQ or in matrix form LQ′ MQ′ SQ′ = 1 0 0 −a b0−c b 0 0 1 LQ MQ SQ so C= 1 0 0 −a b0−c b 0 0 1 The values of a,b, and ccome from a cross product of two vectors E= (LE, ME, SE)andA= (LA, MA, SA). a=MESA−SEMA (8) b=SELA−LESA c=LEMA−MELA To obtain the values of EandA, I needed to physically measure the plots in Figures 2(a) and 2(b) of the reference. Those measurements result in E= LE ME SE = 0.54 0.22 0.77 A575= LA MA SA = 0.47 0.18 0 A475= LA MA SA = 0 0.151 1.41 So depending on whether one uses A575orA475, the value of Cis either C575= 1 0 0 0.382978 0 0 .0171318 0 0 1 or C475= 1 0 0 0.254701 0 0 .107092 0 0 1 ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 12 of 13 A.4 Converting RGB Colors The reference teaches that if SQ/LQ< SE/LEthen we use A575. Otherwise, we use A475. To avoid division by zero, we multiply both sides of the inequality by LQLE, and then the inequality check becomes SQLE< SELQ. Recalling that Q=TV, or LQ= 0.1992 RV+ 0.4112 GV+ 0.0742 BV MQ= 0.0353 RV+ 0.2226 GV+ 0.0574 BV SQ= 0.0185 RV+ 0.1231 GV+ 1.3550 BV and using LE= 0.54andSE= 0.77from above, we get SQLE= 0.00999 RV+ 0.0664739 GV+ 0.7317 BV SELQ= 0.153384 RV+ 0.316624 GV+ 0.057134 BV IfSQLE< SELQthen R=T−1C575T= 0.426331 0 .875102 0 .0801271 0.281100 0 .571195 −0.0392627 −0.0177052 0 .0270084 1 .00247 so rnew= 0.426331 r+ 0.875102 g+ 0.0801271 b gnew= 0.2811 r+ 0.571195 g−0.0392627 b bnew=−0.0177052 r+ 0.0270084 g+ 1.00247 b Otherwise SQLE≥SELQ, and then R=T−1C475T= 0.758100 1 .45387 −1.48060 0.118532 0 .287595 0 .725501 −0.00746579 0 .0448711 0 .954303 so rnew= 0.7581 r+ 1.45387 g−1.4806 b gnew= 0.118532 r+ 0.287595 g+ 0.725501 b bnew=−0.00746579 r+ 0.0448711 g+ 0.954303 b ©2023 Dr. Kerry Veenstra, edits by TBD, Document version 1 13 of 13