Processes
LABORATORY
FOUR
OBJECTIVES
1. Unix Time
2. Programming with Processes
Unix Time and Y2K Problem
In Unix or Linux, the time is simply a counter (of data type time_t) that counts the number of seconds fromaspecificdate,namely,1 January 1970, 00:00:00andavalueof86399means1January 1970, 23:59:59. The integer counter is of size 32 bits. In each day, there are 24 hours, i.e. 86400 seconds. Each year is about 86400 * 365.2422 = 31556926 seconds. The size of a 32-bit integer can count up to 2147483647, i.e. a little more than 68 years. So the Unix time counter will overflow by year 2038. Remember the Y2K problem that people are using two digits (bytes) for the year instead of four digits to save storage and by the year 2000, the date will wrap around to 1900. This 2038 problem is the Unix Y2K bug or Unix Millennium bug. Many Unix programs would not function correctly by 19 January 2038 or when computing a date beyond that day. How many of you have heard of this Unix Y2K bug?
Programmers are already advised to write their programs with checking for the invalid entry for the date data type time_t to avoid this 2038 problem (Y2K38 problem) on Unix and Linux. Newer architectures based on 64-bit integers would no longer suffer from this problem by replacing the time_t data type. However, people need time to switch to new architectures and need to watch out for legacy codes and programs that rely on this 32-bit time representation. Could you estimate when will an overflow to this new 64-bit time data type occur? Remember that we still need to handle the Y10K problem in another 8000 years, after solving the Y2K problem.
Try to study lab4A.c for the calendar logic, and figure out how this program can be executed. You should also try to learn the skill of table-lookup for fast result determination. This is a very useful programming skill. As a simple exercise, you could extend the checking logic for leap years for year 1900, 2000 and 2100. When the year ends in “00”, it must be divisible by 400 to qualify as a leap year. So years 1900 and 2100 are not leap years, but year 2000 is a leap year.
// lab 4A
// try to find out how this calendar-related program can be executed #include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int month[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}; char *name[] = {”Jan”, ”Feb”, ”Mar”, ”Apr”, ”May”, ”June”, ”July”, ”Aug”, ”Sept”, ”Oct”, ”Nov”, ”Dec”};
int yy, mm, dd, numDay;
int i;
yy = atoi(argv[1]); numDay = atoi(argv[2]); // atoi returns an integer if (numDay <= 0 || numDay > 366 || (numDay == 366 && yy % 4 != 0)) {
printf(”Sorry: wrong number of days\n”);
exit(1); }
if (yy % 4 == 0) month[1] = 29;
for (i = 0; i < 12; i++) {
if (numDay <= month[i]) break;
numDay = numDay - month[i]; }
mm = i; dd = numDay;
printf(”Date is %d %s %4d\n”, dd, name[mm], yy); }
Programming with Processes
In Unix/Linux the way to create a process is to use the fork() system call. It will create a child process as an exact copy of the parent, including the value of the program counter. However, once fork() is executed, the two copies of parent and child will be separated and the variables are not shared. Note that the action of copying only occurs on demand but logically, we could assume that it occurs when fork() is completed. To get the id of a process, we could use the system call getpid(). The sleep() system call allows a process to stop for the specific number of seconds. The exit()system call allows the process to terminate successfully.
Since the two copies of parent and child are separated after fork(), how could a child know who is the parent? A child can get the id of the parent by getppid(). Try to observe the variable values produced by parent and child processes and draw a picture to illustrate the changes in values in lab4B.c. Note that there is no guarantee that the parent after fork() is executed before or after the child. The actual execution order depends on the CPU scheduling mechanism. Vary the sleep() parameters and observe the outputs. You should be able to generate different outputs by properly changing the waiting time.
// lab 4B
#include <stdio.h> #include <stdlib.h>
int main() {
int childid, myid, parentid, cid; int val;
val = 1;
myid = getpid();
printf("My id is %d, value is %d\n",myid,val);
printf("Fork Failed\n");
exit(1);
} else if (childid == 0) { // child process
myid = getpid(); parentid = getppid();
printf("Child: My parentid is %d, childid is %d\n",parentid,childid); printf("Child: My id is %d, value is %d\n",myid,val);
val = 12;
printf("Child: My id is %d, value is %d\n",myid,val);
sleep(4);
printf("Child: My id is %d, value is %d\n",myid,val);
val = 13;
printf("Child: My id is %d, value is %d\n",myid,val);
sleep(4);
printf("Child: My id is %d, value is %d\n",myid,val);
printf("Child: I %d completed\n",myid);
printf("Parent: My childid is %d\n",childid); printf("Parent: My id is %d, value is %d\n",myid,val); val = 2;
printf("Parent: My id is %d, value is %d\n",myid,val); sleep(4);
printf("Parent: My id is %d, value is %d\n",myid,val); val = 3;
printf("Parent: My id is %d, value is %d\n",myid,val); sleep(4);
printf("Parent: My id is %d, value is %d\n",myid,val); cid = wait(NULL);
printf("Parent: Child %d collected\n",cid);
exit(0);
} }
Again, since the child is an exact copy of the parent, all values stored in the variables before fork() are exactly the same. All values stored in the argument list and all other variables are directly inherited
childid = fork();
if (childid < 0) { // error occurred
exit(0);
} else { // parent process
by the child. This serves as a simple mechanism of passing information from parent to child, in case the information is known before the creation of the child process via fork().
Here is a program to compute the GPA of subjects of different levels, still based on the old GPA system. Would there be any difference if we create the child process in the first statement, i.e. move fork() to the beginning? In other words, move the line to the first line, before num_subj = (argc-1)/2; and
Orphan Processes (Optional)
In Unix, if a parent process terminates before a child process terminates, the child process would become an orphan will be adopted by a new parent. Orphan processes will be adopted by process init with pid 1, which becomes the new parent. This approach is also used by Linux. Run the following program in background (lab4E o &) on apollo2 and on a Mac. Type ps –lf and you will see that the orphan child process has been adopted by the process called init with pid 1 (look at the PPID column of information). This process with pid 1 is the systemd (system daemon) process on CentOS Linux. If a child terminates before a parent, it would become a zombie until waited or collected by its parent via wait() system call before the parent terminates. Run the program in background again (lab4E z &). Type ps –lf and you will see a zombie process. A zombie process is identified by a “Z” state (on the “S” or “STAT” column). Alternatively, you may use multiple windows to run and check, without putting the processes to the background. |
// lab 4E : o means orphan and z means zombie |
#include <stdio.h> int childid; childid = fork(); if (childid < 0) { // an error occurred printf(”Fork Failed\n”); exit(1); } else if (childid == 0) { // child process if (argv[1][0] == ’o’) printf(”Child completed\n”); exit(0); } else { // parent process // parent does not wait for child if (argv[1][0] == ’z’) sleep(30); // child completes before parent printf(”Parent completed\n”); } } |
Laboratory Exercise
Making use of lab4C.c for child process generation, you can create a program that will generate separate child processes to help dividing up a lot of processing tasks to be handled by each individual child. Effective division of labors among many processes actually forms the basis of cloud computing and big data processing, making use of the computing power of multiple processors.
One common application to make use of abundant computing power is to perform simulation study. For simple questions concerning probabilistic results, mathematical analysis could be performed, e.g. the probability of getting 12 with two dice is 1/36 and the probability of getting 7 is 1/6. It is still possible to compute the probability of getting 40 with 10 dice. Likewise, one can compute the unconditional probability of playing for a 2/2 split in the bridge game with an analytic solution, but the conditional probability when some of the cards are known will deviate from the analytic unconditional probability. One way to estimate this probability is to simulate. In other words, generate many hands, say N, under the existing constraints and count the number of hands, h, satisfying the required condition. The required probability will then be h/N. Simple simulation could also be applied to study queuing systems, e.g. bank tellers serving customers, traffic congestion situation at Cross Harbor Tunnel. More complex computer simulation could be applied to study the infection model in epidemic study, typhoon trajectory prediction in meteorology, stock price change in financial technology, or pilot training in flight simulation.
In this exercise, we are to study the impact of waiting queue building up for a service provided by the teller of a bank. Customers arrive randomly and queue up for teller service. Tellers take on customers and serve them, one after the other. There is also variation in the service time by each teller. We are to study how the queue is built up, and as a result, the impact on the waiting time of the customers. To simplify the scenario, let us assume that there is only one single queue, and only one single teller for the customers. You are given input data about the customer arrival pattern and the servicing pattern by the teller in the debug mode of the program and would generate the patterns in the simulation mode.
Data being used for the patterns can be generated randomly. For example, customer arrival events could follow a Poisson distribution, and the teller servicing events could follow a normal distribution. The standard queuing model usually assumes Poisson arrival and Poisson service in order to be analyzed mathematically, since Poisson events exhibit very nice mathematical properties. To achieve this, include the following code segment in your program, and make calls to the generator for the arrival and service patterns, after making appropriate modifications. Note that besides including the header file math.h, there is also a need to compile through linking to the maths library: cc -lm gen.c
#include <stdlib.h>
#include <math.h> // in order to use log() function
.. .
int seed = 2023;
int arrivalPattern[100];
int servicePattern[100];
.. .
srand(seed); // do this only once and before using rand() to generate .. .
// generate patterns for 100 customers generate(arrivalPattern,100,5.0); // arriving every 5 time units arrivalPattern[0] = 0; // set default arrival time for first customer generate(servicePattern,100,4.0); // servicing average 4 time units // your computation
.. .
int generate(int pattern[], int num, float timespan)
{
int i;
float u, r;
for (i = 0; i < num; i++) {
u = (float)rand() / RAND_MAX; // generate random number in (0,1]
r = -log(u)*timespan; // exponential random variable for Poisson events pattern[i] = (int)r + 1; // round up to integer
} }
In this exercise, there are n child processes. The parent is just like a server that accepts user request and passes it to a child process to do the simulation. Remember that a child will know everything about the parent before fork() is executed, so there is no need for the parent to pass in any argument to the child (and this simplifies the program design). The child should be able to extract its own part of the input data knowing its position from the argument list or from any stored array before fork() is executed. To facilitate program development and debugging, the program would be executed in different modes, as indicated by the first argument. Depending on the mode, other arguments should be interpreted accordingly. The modes to be provided include debug mode (D) and simulation mode (S). You can assume that there are no errors in the user input argument list. Do not forget to wait for the child to complete to avoid zombie processes.
them via reduction. That would require the ability that the child processes can report the computed information back to the parent (i.e., ability of inter-process communication). These complicated arrangements are not required in this exercise.
Hint: This is a highly simplified case for the discrete event simulation approach. All time units are in integers in this case. You could simply advance the clock for each time unit, and then check for the occurrence of the three key events: customer arrival (and queues up), customer completion (and departs), customer welcomed by the teller (and receives service). Once detected, appropriate actions can then be carried out for the specific event at that time unit.
Please provide appropriate comments and check to make sure that your program can be compiled and execute properly under the Linux (apollo or apollo2) environment before submission. Note that you have to let all children execute together concurrently and must not control the output order from different children. The final results produced by all the children may be in any mixed order and you do not need to do anything to change the output order. Just let the nature play the output game for you. It is quite likely that those tasks with longer pattern files or more customers would execute more slowly.
Sample inputs and outputs:
queue D 0 1 4 2 -1 2 4 3 2
Parent, pid 12345: debug mode
Child 1, pid 12346: 4 customers arriving at 0 1 5 7
Child 1, pid 12346: 4 customers requiring service for 2 4 3 2
Child 1, pid 12346: time 0 customer 1 arrives, customer 1 waits for 0, queue length 0 Child 1, pid 12346: time 1 customer 2 arrives, queue length 1
Child 1, pid 12346: time 2 customer 1 departs, customer 2 waits for 1, queue length 0 # just for debugging, no need for output: Child 1, pid 12346: time 3 queue length 0
# just for debugging, no need for output: Child 1, pid 12346: time 4 queue length 0 Child 1, pid 12346: time 5 customer 3 arrives, queue length 1
Child 1, pid 12346: time 6 customer 2 departs, customer 3 waits for 1, queue length 0 Child 1, pid 12346: time 7 customer 4 arrives, queue length 1
# just for debugging, no need for output: Child 1, pid 12346: time 8 queue length 1 Child 1, pid 12346: time 9 customer 3 departs, customer 4 waits for 2, queue length 0 # just for debugging, no need for output: Child 1, pid 12346: time 10 queue length 0 Child 1, pid 12346: time 11 customer 4 departs, queue length 0
Child 1, pid 12346: all customers served
Child 1, pid 12346: maximum queue length
Child 1, pid 12346: average queue length
Child 1, pid 12346: total waiting time 4
Child 1, pid 12346: average waiting time
Child 1, pid 12346: child 1 completed execution Parent, pid 12345: child 1 completed execution
at time 11 1
0.364
1.000
queue D 0 1 4 2 3 1 2 3 -1 2 4 3 2 6 3 5 1
Parent, pid 12348: debug mode
for 1, queue length 0
for 1, queue length 0
for 2, queue length 0
。。
Child 1, pid 24322: average waiting time 41.441
Child 1, pid 24322: simulating 1000 customers, seed 2029, arrival 4.0, service 3.0 Child 3, pid 24324: all customers served at time 8934
Child 3, pid 24324: maximum queue length 7
Child 3, pid 24324: average queue length 0.414
Child 3, pid 24324: total waiting time 3698
Child 3, pid 24324: average waiting time 1.849
Child 3, pid 24324: child 3 completed execution
Parent, pid 24321: child 3 completed execution
Child 1, pid 24322: all customers served at time 4478
Child 1, pid 24322: maximum queue length 10
Child 1, pid 24322: average queue length 1.587
Child 1, pid 24322: total waiting time 7108
Child 1, pid 24322: average waiting time 7.108
Child 1, pid 24322: child 1 completed execution
Parent, pid 24321: child 1 completed execution
Child 2, pid 24323: all customers served at time 15568
Child 2, pid 24323: maximum queue length 13
Child 2, pid 24323: average queue length 2.312
Child 2, pid 24323: total waiting time 35992
Child 2, pid 24323: average waiting time 23.995
Child 2, pid 24323: child 2 completed execution
Parent, pid 24321: child 2 completed execution
Level 1 requirement: the child processes are created properly and can display the correct information about the given arrival and service patterns.
Level 2 requirement: the unique child process can produce partially correct results in the debug mode. Level 3 requirement: the unique child process can product almost all correct results in the debug mode. Level 4 requirement: all child processes can work correctly under all the cases and produce correct simulation results in terms of average queue length and waiting time.
Bonus level: you can handle the simulation with more than one teller. There can be several possible queuing strategies for the multiple tellers. Please explain briefly your adopted strategy.