PA1: Address Spaces and Resource Usage Monitoring
CMPSC 473, Spring 2021
1 Goals
This programming assignment (PA1) has three main goals. First, you will learn in concrete terms what the different segments of a virtual address space (or simply address space) are. Second, you will learn to use three Linux facilities - /proc, strace, and getrusage - to record/measure certain aspects of process resource usage. Finally, we hope that PA1 will serve to refresh your understanding of (or make you learn, in case you lack it): (i) logging into a Linux machine (ssh, VPN, 2FA) and using the basic Linux shell and commands, (ii) compiling a C program (gcc, make) and creating/linking against libraries, (iii) debugging (gdb), (iv) working with a code repository (GitHub), (v) using Linux man pages (the man command), and (vi) plotting experimental data for easy visualization. All of these are good programming and experimental analysis practices that we would like you to follow throughout this course (and beyond it).
2 Getting Started
After accepting the invitation link to the GitHub Classroom @CMPSC473, you will be given access to your own private repository and another repository named "PA1" containing all the files needed for completing PA1. When you open this repository, you will find 5 folders, named prog1, ..., prog5. These folders contain the files described in Section 3. On the web page of this repository, you will find a link to download it. To download, copy the link and type on the command line:
$ git clone <link of the repository>
Additional information on using GitHub, along with other important documents, is available in a separate document on Canvas.
As mentioned in class, you may do the bulk of your work on any Linux (virtual) machine of your choosing. However, the results in your report (which will be used for grading your work) must be obtained on the CSE department's Linux-based teaching machines. These machines are named cse-p204instxx.cse.psu.edu (where xx is a machine number). The reason for asking you to report results from these machines is to have relative consistency/uniformity in your measurements - this will help us grade in a consistent manner and identify possible bugs/shortcomings.
3 Description of Tasks
1. Stack, heap, and system calls: The executable named prog1 contains a function that is recursively called 10 times. This function has a local variable and a dynamically allocated variable. Upon each invocation, the function displays the addresses of the newly allocated variables on the console. After 10 invocations, the program waits for a key to be pressed on the keyboard before concluding. We would like you to observe the addresses displayed by prog1 and answer the following:
- (a) Which addresses are for the local variables and which ones are for the dynamically allocated variables? How were you able to deduce this? What are the directions in which the stack and the heap grow on your system?
- (b) What is the size of the process stack when it is waiting for user input? (Hint: Use the contents of /proc/PID/smaps that the /proc file system maintains for this process, where PID denotes its process ID. While the program waits for user input, try running ps -ef | grep prog1. This will give you the PID. You can then look at the smaps entry for this process (cat /proc/PID/smaps) to see a description of the current memory allocation to each segment of the process address space.)
- (c) What is the size of the process heap when it is waiting for user input?
- (d) What are the address limits of the stack and the heap? (Hint: Use the maps entry within the /proc file system for this process. This will show the starting and ending addresses assigned to each segment of the virtual memory of a process.) Confirm that the variables being allocated lie within these limits.
- (e) Use the strace command to record the system calls invoked while prog1 executes. For this, simply run strace prog1 on the command line. Look at the man page of strace to learn more about it. Similarly, use man pages to learn basic information about each of these system calls. For each unique system call, write in your own words (just one sentence should do) what purpose this system call serves for this program.
2. Debugging refresher: The program prog2.c calls a recursive function which has a local and a dynamically allocated variable. Unlike the last time, however, this program will crash due to a bug we have introduced into it. Use the Makefile that we have provided to compile the program. Execute it. The program will exit with an error printed on the console. You are to compile the program with 32-bit and 64-bit options and carry out the following tasks separately for each:
- (a) Observe and report the differences in the following for the 32-bit and 64-bit executables: (i) size of compiled code, (ii) size of code during run time, (iii) size of linked libraries.
- (b) Use gdb to find the program statement that caused the error. See some tips on gdb in the Appendix if needed.
- (c) Explain the cause of this error. Support your claim with address limits found from /proc.
- (d) Using gdb, backtrace the stack. Examine individual frames in the stack to find each frame's size. Combine this with your knowledge (or estimate) of the sizes of other address space components to determine how many invocations of the recursive function should be possible on your system. How many invocations occur when you actually execute the program?
- (e) What are the contents of a frame in general? Which of these are present in a frame corresponding to an invocation of the recursive function and what are their sizes?
3. More debugging: Consider the program prog3.c. It calls a recursive function which has a local and a dynamically allocated variable. Like the last time, this program will crash due to a bug that we have introduced in it. Use the provided Makefile to compile the program. Again, create both a 32-bit and a 64-bit executable. For each of these, execute it. Upon executing, you will see an error on the console before the program terminates. You are to carry out the following tasks:
- (a) Observe and report the differences in the following for the 32-bit and 64-bit executables: (i) size of compiled code, (ii) size of code during run time, (iii) size of linked libraries.
- (b) Use valgrind to find the cause of the error, including the program statement causing it. For this, simply run valgrind prog3 on the command line. Validate this alleged cause with address space related information gleaned from /proc.
- (c) How is this error different from the one for prog2?
4. And some more: The program prog4.c may seem to be error-free. But when executing it under valgrind, you will see many errors. You are to perform the following tasks:
- (a) Describe the cause and nature of these errors. How would you fix them?
- (b) Modify the program to use getrusage for measuring the following: (i) user CPU time used, (ii) system CPU time used - what is the difference between (i) and (ii)?, (iii) maximum resident set size - what is this?, (iv) signals received - who may have sent these?, (v) voluntary context switches, (vi) involuntary context switches - what is the difference between (v) and (vi)? Look at the sample code in the Appendix for an example on how to use getrusage().
5. Multi-process program that uses fork, exec, wait, kill, exit calls: TBD