MPCS 51082 Introduction to Unix Systems - Project: Unix-like Shell
Project: Unix-like Shell
CS Linux Machine
You will need access to a Linux-based machine when working on your project. You should not test your programs on macOS or on Linux under Windows because these environments do not provide all of the utility commands needed for this and possibly future assignments, and the commands they do provide may lack options that a Unix-like system offers. We will run and grade all assignments on the CS Linux machines, and all programming assignments must work correctly on these machines. You may work locally on a Unix or Unix-like machine, but make sure that you test your final solution on a CS Linux machine.
Please follow the instructions provided here to easily access and work remotely on a CS Linux machine.
If you have any difficulty accessing or working on a machine, please ask your question on Ed Discussion.
Creating Your Private Repository
For each assignment, a Git repository will be created for you on GitHub. However, before that repository can be created for you, you need to have a GitHub account. If you do not yet have one, you can get an account here: https://github.com/join.
To actually get your private repository, you will need this invitation URL:
When you click on an invitation URL, you will have to complete the following steps:
You will need to select your CNetID from a list. This lets us know which student is associated with each GitHub account. This step is only done for the very first invitation you accept.
Note
If you are on the waiting list for this course you will not have a repository made for you until you are admitted into the course. I will post the starter code on Ed so you can work on the assignment until you are admitted into the course.
You must click “Accept this assignment” or your repository will not actually be created.
After accepting the assignment, GitHub may take a few minutes to create your repository. You should receive an email from GitHub when your repository is ready. Normally, it’s ready within seconds and you can just refresh the page.
You now need to clone your repository (i.e., download it to your machine).
Make sure you’ve set up SSH access on your GitHub account.
For each repository, you will need to get the SSH URL of the repository. To get this URL, log into GitHub and navigate to your project repository (take into account that you will have a different repository per assignment). Then, click on the green “Code” button, and make sure the “SSH” tab is selected. Your repository URL should look something like this: git@github.com:mpcs51082-aut22/proj-GITHUB-USERNAME.git.
If you do not know how to use git clone to clone your repository, then follow this guide that GitHub provides: Cloning a Repository
If you run into any issues, or need us to make any manual adjustments to your registration, please let us know via Ed Discussion.
Unix-like System shell
The final project gives you the opportunity to show me what you learned in this course and to build your own emulated Unix-like system. In particular, your Unix system will include its own users, a /proc virtual filesystem, and its own implementation of a shell. The main focus of the project is building the shell; however, shell commands will interact with user-specific directories, the /proc filesystem, and the etc directory for the shell. Please note that some parts of this project differ from an actual Unix-like system, so keep that in mind. We will point these differences out in the specification.
The first task is understanding the repository structure, which is described in the next section.
Task 0: Repository Structure
Inside your repository, you will see the following structure (directory names end with /):
proc/ (empty at the moment)
home/
    root/
        .tsh_history
etc/
    passwd
tsh.c
The contents of each directory are explained below:
- proc - represents the proc virtual filesystem for the shell. This directory will contain PID directories, each with its own proc/PID/status file. We will discuss the contents of these files in a later section.
- home - represents the home directories of the users. As on a Unix system, each user has a separate home directory. Our shell does not have permissions, so everything is accessible to all users. Inside each home directory is a .tsh_history file that contains the last 10 commands run by the user before they quit or logged out of the shell. You can see something similar by typing history on the CS Linux servers; however, that shows many more than the last 10 commands.
- etc - contains only one file, the etc/passwd file. As a reminder, this file contains information about users. Unlike the normal /etc/passwd, which contains more fields, each line of the shell’s passwd file has the following structure:
  username:password:home_directory_path
  Each line represents a single user of the shell. Currently there is only one user in the file, root:
  root:pass:/home/root
  In a normal Unix system, the actual password of the user is not embedded in the passwd file, for security reasons; instead, an encrypted form of the password is kept in /etc/shadow. For our shell we will keep things simple (in a very insecure way) by storing the actual passwords in the etc/passwd file. Additionally, root’s home directory is normally /root rather than a directory under /home, but for the purposes of this assignment it is home/root.
- tsh.c - this is where you will implement the entire shell.
Task #1: Understanding the tsh.c File
Looking at the tsh.c file, you will see that it contains a functional skeleton of a simple Unix shell. To help you get started, we have already implemented the less interesting functions. Your assignment is to complete the remaining empty functions listed below, along with writing additional functions and possibly struct definitions.
eval: Main routine that parses and interprets the command line.
builtin_cmd: Recognizes and interprets the built-in commands: quit, fg, bg, and jobs.
login: Logs in a specific user and returns the username of the user logged in.
do_bgfg: Implements the bg and fg built-in commands.
waitfg: Waits for a foreground job to complete.
sigchld handler: Catches SIGCHLD signals.
sigint handler: Catches SIGINT (ctrl-c) signals.
sigtstp handler: Catches SIGTSTP (ctrl-z) signals.
Please take some time to look over the comments and code in the file to make sure you understand how to use these functions. You may need to add to or modify them. Additionally, you may need to write helper functions and define structs and global variables to implement all aspects of the project.
Task #2: User Login
Compile and run the shell as follows:
$ gcc -std=gnu11 -o tsh tsh.c
$ ./tsh
When the shell begins running, it must prompt the user to enter a username (username:) and password (password:) in order to start the shell. The shell must then authenticate the user by verifying that the username and password match an entry in the etc/passwd file. If there is a match, then it will begin the shell with the prompt tsh>; otherwise, if the user entered an incorrect username and/or password, the shell responds with "User Authentication failed. Please try again.". The shell keeps asking the user to log in until authentication succeeds or the user enters the command quit, which terminates the shell. Here are a few example runs:
$ ./tsh
username: root
password: badPass
User Authentication failed. Please try again.
username: lamonts
password: pass
User Authentication failed. Please try again.
username: quit
$ ./tsh
username: root
password: pass
tsh>
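To make the authentication step concrete, here is a minimal sketch of checking a username and password against a file in the username:password:home_directory_path format. The function name authenticate, its parameters, and the MAXLINE value are assumptions for illustration; the login function in your tsh.c may be organized quite differently.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 1024  /* assumed maximum length of one passwd line */

/*
 * Returns 1 if passwd_path has a line "username:password:home" whose first
 * two fields match exactly, 0 otherwise. On success, the home directory
 * field is copied into home (at most home_len - 1 characters).
 */
int authenticate(const char *passwd_path, const char *username,
                 const char *password, char *home, size_t home_len)
{
    FILE *fp = fopen(passwd_path, "r");
    if (fp == NULL)
        return 0;                          /* unreadable file: fail */

    char line[MAXLINE];
    int ok = 0;
    while (!ok && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';  /* strip the trailing newline */

        /* Split the line in place at the first two ':' separators. */
        char *user = line;
        char *pass = strchr(user, ':');
        if (pass == NULL)
            continue;                      /* malformed line: skip it */
        *pass++ = '\0';
        char *dir = strchr(pass, ':');
        if (dir == NULL)
            continue;
        *dir++ = '\0';

        if (strcmp(user, username) == 0 && strcmp(pass, password) == 0) {
            snprintf(home, home_len, "%s", dir);
            ok = 1;
        }
    }
    fclose(fp);
    return ok;
}
```

A login loop could call this once per attempt and print the failure message whenever it returns 0, checking for the input quit before calling it.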
Task #3: The tiny (tsh) Shell
The main objective for the project is to implement tsh (tiny shell). As a reminder, a shell is an interactive command-line interpreter that runs programs on behalf of the user. A shell repeatedly prints a prompt, waits for a command line on stdin, and then carries out some action, as directed by the contents of the command line. The command line is a sequence of ASCII text words delimited by whitespace. The first word in the command line is either the name of a built-in command or the pathname of an executable file. The remaining words are command-line arguments. If the first word is a built-in command, the shell immediately executes the command in the current process. Otherwise, the word is assumed to be the pathname of an executable program. In this case, the shell forks a child process, then loads and runs the program in the context of the child. The child processes created as a result of interpreting a single command line are known collectively as a job. In general, a job can consist of multiple child processes connected by Unix pipes.
If the command line ends with an ampersand "&", then the job runs in the background, which means that the shell does not wait for the job to terminate before printing the prompt and awaiting the next command line. Otherwise, the job runs in the foreground, which means that the shell waits for the job to terminate before awaiting the next command line. Thus, at any point in time, at most one job can be running in the foreground. However, an arbitrary number of jobs can run in the background.
For example, typing the command line
tsh> jobs
causes the shell to execute the built-in jobs command. Typing the command line
tsh> /bin/ls -l -d
runs the ls program in the foreground. Alternatively, typing the command line
tsh> /bin/ls -l -d &
runs the ls program in the background.
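As an illustration of the foreground/background distinction, the sketch below checks whether a command line ends with an ampersand and removes it before the line is split into arguments. The helper name strip_background_flag is hypothetical, and the provided skeleton may already handle this while parsing, so treat it only as a picture of the rule.

```c
#include <ctype.h>
#include <string.h>

/*
 * Returns 1 if the last non-whitespace character of cmdline is '&' and
 * removes it in place; returns 0 otherwise. Trailing whitespace (including
 * the newline left by fgets) is trimmed in both cases.
 */
int strip_background_flag(char *cmdline)
{
    size_t len = strlen(cmdline);

    while (len > 0 && isspace((unsigned char)cmdline[len - 1]))
        len--;

    if (len == 0 || cmdline[len - 1] != '&') {
        cmdline[len] = '\0';
        return 0;                     /* foreground job */
    }

    len--;                            /* drop the '&' itself */
    while (len > 0 && isspace((unsigned char)cmdline[len - 1]))
        len--;
    cmdline[len] = '\0';
    return 1;                         /* background job */
}
```

For the two ls examples above, the first line would yield 0 and the second 1, with both buffers reduced to "/bin/ls -l -d".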
Unix shells support the notion of job control, which allows users to move jobs back and forth between background and foreground, and to change the process state (running, stopped, or terminated) of the processes in a job. Typing ctrl-c causes a SIGINT signal to be delivered to each process in the foreground job. The default action for SIGINT is to terminate the process. Similarly, typing ctrl-z causes a SIGTSTP signal to be delivered to each process in the foreground job. The default action for SIGTSTP is to place a process in the stopped state, where it remains until it is awakened by the receipt of a SIGCONT signal. Unix shells also provide various built-in commands that support job control. For example:
- jobs: List the running and stopped background jobs.
- bg <job>: Change a stopped background job to a running background job.
- fg <job>: Change a stopped or running background job to a running foreground job.
- kill <job>: Terminate a job.
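The state changes described above (terminated by a signal such as SIGINT, stopped by a signal such as SIGTSTP, or exited normally) are visible to a parent through the status value that waitpid reports. The following sketch shows how the standard status macros distinguish the three cases. The function name describe_status and the message wording are assumptions for illustration, not output required by this assignment.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Formats a one-line description of a status value obtained from waitpid
 * into buf. Stopped children are only reported when waitpid is called with
 * the WUNTRACED option.
 */
void describe_status(char *buf, size_t len, pid_t pid, int status)
{
    if (WIFSTOPPED(status))
        snprintf(buf, len, "(%ld) stopped by signal %d",
                 (long)pid, WSTOPSIG(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "(%ld) terminated by signal %d",
                 (long)pid, WTERMSIG(status));
    else if (WIFEXITED(status))
        snprintf(buf, len, "(%ld) exited with status %d",
                 (long)pid, WEXITSTATUS(status));
    else
        snprintf(buf, len, "(%ld) changed state", (long)pid);
}
```

Note that snprintf is not on the list of async-signal-safe functions, so a real handler would need to take care about where such formatting happens.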
tsh Specification
Your tsh shell must have the following features:
- The prompt should be the string "tsh> ".
- The command line typed by the user should consist of a name and zero or more arguments, all separated by one or more spaces. If name is a built-in command, then tsh should handle it immediately and wait for the next command line. Otherwise, tsh should assume that name is the path of an executable file, which it loads and runs in the context of an initial child process (in this context, the term job refers to this initial child process). Each command line needs to be saved in the user’s history file (i.e., .tsh_history) so that when the user logs back in, the history is reloaded with the previously run commands.
- Any process (i.e., foreground or background) that is started must have an entry in the proc directory. Specifically, the directory created will be proc/PID, where PID is the unique identifier for the process. Each proc/PID directory will contain only a single file, status, that has the following structure:
  Name: <name of process, argv[0]>
  Pid: <unique identifier for the process>
  PPid: <unique identifier for the parent process>
  PGid: <unique identifier for the process group>
  Sid: <unique identifier for the session leader id>
  STAT: <process state, two letter state see m7 Slide 20>
  Username: <the name of the user who owns this process>
  As the process runs, the only line that changes in this file is the STAT line. The shell will always be the session leader and is required to have an entry in the proc directory. If a process changes its state, then its proc/PID/status file must be updated. If a process terminates, then its proc/PID directory is removed.
- tsh need not support pipes (|) or I/O redirection (< and >).
- Typing ctrl-c (ctrl-z) should cause a SIGINT (SIGTSTP) signal to be sent to the current foreground job, as well as any descendants of that job (e.g., any child processes that it forked). If there is no foreground job, then the signal should have no effect.
- If the command line ends with an ampersand &, then tsh should run the job in the background. Otherwise, it should run the job in the foreground.
- Each job can be identified by either a process ID (PID) or a job ID (JID), which is a positive integer assigned by tsh. JIDs should be denoted on the command line by the prefix '%'. For example, "%5" denotes JID 5, and "5" denotes PID 5. (We have provided you with all of the routines you need for manipulating the job list.)
- tsh should support the following built-in commands:
  - The quit command terminates the shell immediately.
  - The logout command logs the user out of the shell and then terminates the shell. If there are any suspended (i.e., stopped) jobs, then the command prints "There are suspended jobs." and does not log the user out. The user must kill them or bring them back into the foreground to allow them to terminate. Once no jobs are suspended, running the logout command terminates the shell.
  - The history command shows the last 10 commands run by the user, each numbered on a separate line. The first line is the oldest command and the last line is the most recently run command. Lines are numbered from 1 up to N, where N is at most 10.
  - The jobs command lists all the jobs currently active. I will provide the implementation for this as follows:
    if (strcmp(argv[0], "jobs") == 0) {
        listjobs(jobs);
    }
  - !N, where N is a line number from the history command, reruns the Nth command from the user’s history list. Do not add the !N command to the history of the user.
  - The bg <job> command restarts <job> by sending it a SIGCONT signal, and then runs it in the background. The <job> argument can be either a PID or a JID.
  - The fg <job> command restarts <job> by sending it a SIGCONT signal, and then runs it in the foreground. The <job> argument can be either a PID or a JID.
  - The adduser new_username new_password command creates a new user for the shell. This command can only be run when the root user is logged in. If any other user tries to run it, the command responds with "root privileges required to run adduser." Otherwise, the shell creates an entry for the new user in the etc/passwd file and creates a new home directory (i.e., home/new_username) containing an empty .tsh_history file. We do not have a delete user command for the shell.
- tsh should reap all of its zombie children. If any job terminates because it receives a signal that it didn’t catch, then tsh should recognize this event and print a message with the job’s PID and a description of the offending signal.
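The proc/PID/status layout in the specification above can be produced with a single formatted write. The sketch below builds the seven lines in a buffer; the function name format_status and its parameter list are assumptions for illustration, and the STAT string is passed in as-is because the two-letter codes are defined in the course slides rather than here.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Writes the seven status lines for one process into buf.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if buf is too small or formatting fails.
 */
int format_status(char *buf, size_t len, const char *name, pid_t pid,
                  pid_t ppid, pid_t pgid, pid_t sid, const char *stat,
                  const char *username)
{
    /* pid_t is cast to long because its exact width is platform-defined. */
    int n = snprintf(buf, len,
                     "Name: %s\nPid: %ld\nPPid: %ld\nPGid: %ld\nSid: %ld\n"
                     "STAT: %s\nUsername: %s\n",
                     name, (long)pid, (long)ppid, (long)pgid, (long)sid,
                     stat, username);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Since only the STAT line changes over a process's lifetime, one simple design is to keep the other six values in the job entry and rewrite the whole file on each state change rather than editing a single line in place.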
Hints & Tips
- The waitpid, kill, fork, execve, setpgid, and sigprocmask functions will come in very handy. The WUNTRACED and WNOHANG options to waitpid will also be useful.
- When you implement your signal handlers, be sure to send SIGINT and SIGTSTP signals to the entire foreground process group, using "-pid" instead of "pid" in the argument to the kill function.
- One of the tricky parts of the assignment is deciding on the allocation of work between the waitfg and sigchld handler functions. We recommend the following approach:
  - In waitfg, use a while(1) loop around the sleep function.
  - In sigchld handler, use exactly one call to waitpid.
While other solutions are possible, such as calling waitpid in both waitfg and sigchld handler, these can be very confusing. It is simpler to do all reaping in the handler.
- In eval, the parent must use sigprocmask to block SIGCHLD signals before it forks the child, and then unblock these signals, again using sigprocmask, after it adds the child to the job list by calling addjob. Since children inherit the blocked vectors of their parents, the child must be sure to then unblock SIGCHLD signals before it execs the new program.
  The parent needs to block the SIGCHLD signals in this way in order to avoid the race condition where the child is reaped by sigchld handler (and thus removed from the job list) before the parent calls addjob.
- Programs such as more, less, vi, and emacs do strange things with the terminal settings. Don’t run these programs from your shell. Stick with simple text-based programs such as /bin/ls, /bin/ps, and /bin/echo.
- When you run your shell from the standard Unix shell, your shell is running in the foreground process group. If your shell then creates a child process, by default that child will also be a member of the foreground process group. Since typing ctrl-c sends a SIGINT to every process in the foreground group, typing ctrl-c will send a SIGINT to your shell, as well as to every process that your shell created, which obviously isn’t correct.
  Here is the workaround: After the fork, but before the execve, the child process should call setpgid(0, 0), which puts the child in a new process group whose group ID is identical to the child’s PID. This ensures that there will be only one process, your shell, in the foreground process group. When you type ctrl-c, the shell should catch the resulting SIGINT and then forward it to the appropriate foreground job (or more precisely, the process group that contains the foreground job).
- The remove(path) and rmdir(path) functions will be helpful for deleting files and removing directories.
- You may want to define a history array as follows:
  char history[MAXHISTORY][MAXLINE];
  where MAXHISTORY=10, to easily store the history of the current user. For example, you can do something like strcpy(history[1], "/bin/ls -l") to store a command in the user’s history.
- To update the passwd file, open it with the "a" flag so that the new user is appended to the file.
- Make sure not to add a new user if they already exist in the passwd file. You can display an error message such as User already exists. if they do.
- Built-in commands do not need a proc file since they are handled by the shell itself.
Project Grading
When grading the project, we will use the following as the criteria for getting specific grades for the project:
High A Range (96-100)
The project is fully-working based on the specification above. All job-control is working (i.e., signals handlers) and the built-in commands are working as specified.
High B and Low A Range (86-95)
Job-control is working fully for foreground processes. Proc files are created/deleted correctly and history commands are saved correctly for foreground processes. All built-in commands are working correctly with the exception of handling background processes or signals (i.e., bg, fg, jobs, etc.). You do not need to have the signal handlers working fully/correctly to receive a grade in this range.
High C and Low B Range (75-85)
Job-control is not fully working for either background/foreground processes. However, significant progress has been made to get them working correctly. The user can login and the root user can create new users. The built-in commands are implemented with the exception of handling foreground and background processes or signals correctly (i.e., bg, fg, jobs, etc.).
Lower than Low C (0-74)
This will be graded on a case-by-case basis based on what is submitted; however, you cannot receive higher than a 75 on the project.
As you can see, we will be pretty lenient in grading the project. The ranges in the above categories are there because design and style will be a factor in the grading. Please make sure you have modular code (i.e., break your code down into functions). Even if you are missing edge cases in implementing a few features, you can still receive a good grade in the above ranges.
Project Paper
The final paper must provide an overview of the structure of your shell and the important data structures used within it. In particular, discuss the different parts of your shell (listed below). Each of the following can be addressed in a paragraph or two.
- Login: Describe the code that handles logging into the shell and how it interacts with the initialization of the shell. Describe how a new user is added to the shell.
- Command Evaluation: Discuss how you implemented the eval function and how it interacts with signal handlers and job control. If you did not implement signal handling and/or job control, then conceptually describe how you would have implemented these components.
- Built-in commands: Describe how each built-in command is implemented and any helper functions/global variables needed to implement them. This includes describing in detail how the history command and the .tsh_history file are implemented in your code.
- Proc: Describe how proc files are implemented in your code. What code needed to be added/changed in the tsh.c file to get this to work?
- Job Control: Describe how job control is implemented in your solution (i.e., waitfg, do_bgfg, sigchld, sigint, sigtstp) and how it interacts with the other functions and helper functions you implemented in the system. If you did not implement this feature, then describe at a high level how you would implement it.
In addition to explaining the above components, make sure to answer the following questions (these may take more than 1 to 2 paragraphs to answer):
- What was the most challenging aspect of the project for you?
- Right now, the tsh shell is running within the filesystem of your default shell (i.e., most likely bash) and uses the filesystem structure associated with it. What if we required you to provide the built-in commands mount and unmount, where the shell had to ensure that directories and files that are not mounted (or have been unmounted) could not be accessed or modified? Conceptually, what changes would you have to make to the program in the tsh.c file to make mount and unmount work?
- What would need to change if you wanted to implement pipes (e.g., sort longsort.txt | uniq | wc -l) in tsh? Conceptually, what aspects of the shell would need to be modified and added to implement this functionality?
I expect the paper to be in the range of 1-3 pages. You can certainly describe these details in fewer than 3 pages, and you may also go past 3 pages if you need to.