Assignment 2: executing and pipelining processes
This assignment is worth 10% of your marks for this course.
For this assignment you will write two C programs to execute a list of commands.
- Your first program (sequence.c) will simply read in the commands and then execute them in a sequence.
- The second program (pipeline.c) will execute the commands in a pipeline, where the output of
each command becomes the input to the next command in the sequence:
cmd1 | cmd2 | cmd3 | ... | cmdn
Round 1: sequence.c
This program should read and execute a list of commands from stdin. Each command and its arguments (if any) will appear on a separate line. For example, if the file "cmdfile" contains the lines:
whoami
cal 4 2020
echo The time is:
date
./sequence < cmdfile
should output your username, a calendar of the month of April, the string "The time is:", and the current date/time, to standard output.
Suggested approach: first, make sure you can parse a command and its arguments correctly, and store them as an array of C strings. Test this thoroughly before moving on, making use of trace statements to make sure you've accounted for all the cases you can come up with! Debugging processes and pipes can be tricky so you want to have confidence you are giving later parts of your program the correct input. For the purposes of this assignment, you can assume a maximum of 10 arguments per command, 256 characters per line, and 100 lines per file.
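As an illustration of the parsing step, here is a minimal sketch that splits one line into a NULL-terminated array of C strings. The function name parse_line and the constant MAX_ARGS are hypothetical names chosen for this example. It assumes tokens are separated only by whitespace and makes no attempt to handle quoting.

```c
#include <assert.h>
#include <string.h>

#define MAX_ARGS 10   /* argument limit stated in the assignment */

/*
 * Split `line` in place on spaces, tabs and newlines.
 * Fills argv[0..n-1] with pointers into `line`, sets argv[n] = NULL
 * (the form the exec functions expect) and returns n.
 * argv must have room for MAX_ARGS + 2 entries: the command name,
 * up to MAX_ARGS arguments, and the terminating NULL.
 * Tokens beyond that limit are silently dropped.
 */
int parse_line(char *line, char *argv[])
{
    assert(line != NULL && argv != NULL);

    int n = 0;
    char *tok = strtok(line, " \t\r\n");
    while (tok != NULL && n < MAX_ARGS + 1) {
        argv[n++] = tok;
        tok = strtok(NULL, " \t\r\n");
    }
    argv[n] = NULL;
    return n;
}
```

A return value of 0 means the line was blank, which the caller would probably want to skip. Because strtok() modifies the buffer and the pointers refer into it, each line needs its own storage if all commands are parsed before any are executed.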
Second, look at the exec() family of functions and use one to execute each command in turn. Hint: there is a sample program from the first lecture on processes that does something very similar! Here is a pseudo-code version of the setup:
while there are remaining commands:
    read line from file
    parse line into command and arguments

for cmd in commands:
    fork and handle errors
    if (child)
        execute command
    else // parent
        wait for child to terminate
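The fork/execute/wait step for a single command might look roughly like the sketch below. The name run_command is hypothetical, and execvp() is just one reasonable choice from the exec() family (it searches PATH and takes an argument array). Returning 127 when the command cannot be executed is a convention borrowed from the shell, not a requirement of the assignment.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run one command and wait for it to finish.
 * argv is a NULL-terminated array with the command name in argv[0].
 * Returns the child's exit status, or -1 if fork/waitpid failed or
 * the child did not exit normally (for example, killed by a signal).
 */
int run_command(char *argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the command. */
        execvp(argv[0], argv);
        perror(argv[0]);   /* only reached if execvp failed */
        _exit(127);
    }

    /* Parent: block until this particular child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit() rather than exit() after a failed exec so that it does not flush stdio buffers it inherited from the parent.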
Note: Simply calling the system() function in a loop is not considered a correct solution to this question.
Round 2: pipeline.c
This program takes the same input as sequence.c, but executes the commands as a pipeline, where the output of each command is piped to the input of the next command in line. The input of the first command, and the output of the final command, should not be altered. For example, if the file "cmdpipe" contains the lines
ls -s1
sort -n
tail -n 5
./pipeline < cmdpipe
should output the 5 largest files in the current directory, in order of size.
Suggested approach: set it up so that the parent process forks a child to execute each command in turn, and keeps track of the pipes that the current child should read from and write to. The child redirects its input and output to these pipes, and then executes the current command. Here is a pseudo-code version of the setup:
This pseudocode will also be covered in lectures.
// For each command except the final one:
//     new_pipe = the output of the current command
// For each command except the first one:
//     prev_pipe = the output of the previous command, input of current command
for cmd in cmds:
    if this is not the final command:
        pipe (new_pipe)    // Create a new pipe
    fork and handle errors
    if (child)
        if this is not the first command:
            Redirect input to prev_pipe
        if this is not the final command:
            Redirect output to new_pipe
        execute command
    else // parent
        if this is not the first command:
            close prev_pipe
        if this is not the final command:
            prev_pipe = new_pipe
close any remaining pipes, clean up
Remember that once you call fork, both parent and child have their own copy of all variables, including pipes! Have a look at the pattern for setting up processes to read and write from pipes, from lectures.
Make sure you close all pipes when they are no longer needed as not doing so can cause behaviour that is hard to debug.
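To make the redirect-and-close pattern concrete, here is a sketch for the simplest case of exactly two commands (left | right). It is not the general n-command loop. The names spawn and run_pair are hypothetical, and error handling for dup2() is omitted for brevity. It assumes the pipe descriptors are not 0 or 1, which holds when the program starts with stdin and stdout open.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that runs argv. in_fd/out_fd replace the child's
 * stdin/stdout (-1 leaves that stream unchanged). close_fd is a
 * descriptor the child inherited but must not keep open (-1 for none).
 * Returns the child's pid to the parent, or -1 if fork failed.
 */
static pid_t spawn(char *argv[], int in_fd, int out_fd, int close_fd)
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;

    /* Child: has its own copies of every descriptor. */
    if (in_fd != -1)  { dup2(in_fd, STDIN_FILENO);   close(in_fd); }
    if (out_fd != -1) { dup2(out_fd, STDOUT_FILENO); close(out_fd); }
    if (close_fd != -1) close(close_fd);
    execvp(argv[0], argv);
    perror(argv[0]);   /* only reached if execvp failed */
    _exit(127);
}

/* Run "left | right"; return right's exit status, or -1 on error. */
int run_pair(char *left[], char *right[])
{
    assert(left != NULL && right != NULL);

    int fd[2];   /* fd[0] is the read end, fd[1] the write end */
    if (pipe(fd) < 0) {
        perror("pipe");
        return -1;
    }

    pid_t lp = spawn(left, -1, fd[1], fd[0]);    /* writes to the pipe */
    pid_t rp = spawn(right, fd[0], -1, fd[1]);   /* reads from the pipe */

    /* The parent must close both ends: while any write end stays
     * open, the reader never sees end-of-file and may hang. */
    close(fd[0]);
    close(fd[1]);

    int status = 0, result = -1;
    if (lp > 0)
        waitpid(lp, NULL, 0);
    if (rp > 0 && waitpid(rp, &status, 0) == rp && WIFEXITED(status))
        result = WEXITSTATUS(status);
    return result;
}
```

Commenting out the two close() calls in the parent is a quick way to see the hard-to-debug behaviour mentioned above: a reader such as sort or wc would then wait forever for input that never ends.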
Note that your code should work with any command. For example, the input:
cal
cat pipeline.c
grep .
wc -l
should print out the number of non-blank lines in your source file (the first line has no effect - why?).
Note: Simply calling the system() function with a big string containing | characters is not considered a correct solution to this question.
The handin key for this exercise is: assignment2. The following SVN commands will enable you to make a repository for this assignment. Please note the following:
- Perform these steps in the order written once only!
- Replace XXXXXX, where it appears in the commands, with YOUR student id.
- Some commands are long; they must be typed on one line.
(checks out a working copy in your directory)
You can now begin work. You can add a file to the repository by typing the commands:
svn add NAME-OF-FILE
svn commit -m "REASON-FOR-THE-COMMIT"
where REASON-FOR-THE-COMMIT should be some brief text that explains why you changed the code since the last commit. Note that you only need to add a file once; after that, SVN will "know" it is in the repository. You are now ready to commence working on the exercise.
The files you hand in must include:
- Your C source files as specified above.
- A Makefile that will compile your C sources as specified above (make sequence, make pipeline)
At the start of your Makefile please have the following lines:
all: sequence pipeline

# Your stuff goes below i.e.
sequence: ...
	<some commands you write>

pipeline: ...
	<some commands you write>
Make sure you commit your files frequently, in case you have an accident. The University's SVN repository is very reliable, and is backed up regularly; your computer probably is not. Regular submission is also a good defence against plagiarism by others, since the submissions are dated.
We will test the behaviour of your programs using an automated tester. The tester is thorough, and will find places where your programs do not work correctly. If it finds an error, it will offer a (somewhat vague) hint. You should also come up with your own tests, as we may add extra test cases when determining your final mark!
Note that we reserve the right to deduct marks if your code does anything egregious or games the system to obtain marks. End of Instructions.