Simulating Scheduling
In this question, you’ll be writing a simulator that:
- Implements the MLFQ scheduling policy.
- Simulates the behaviour of scheduling on a multi-CPU system using threads (where each “CPU” is one thread).
Your simulator will implement a Multi-level feedback queue (preemptive, priority-based) with tasks that enter the system at start time, and while the simulator is running.
Implementation
While the overall design of your program is up to you, a recommended architecture that generally makes sense to use here looks like:
Before the simulator starts, you will load the workload (your set of tasks) into the scheduler, and initialize all of your “CPU” threads. The “CPU” threads should wait after they have been initialized until the dispatcher notifies them that tasks are available. A “reading” thread will sleep, “waiting” for the next batch of processes to enter the system.
When the simulator starts, your scheduling policy will decide which task should be executed next given the complete set of tasks that are waiting to run.
The dispatcher will notify all waiting “CPU” threads that a new task is available to be executed. One of the “CPU” threads will take the task and “run” it.
The “CPU” threads will each run one task at a time. Running a task here means doing two things: Deciding if/when the task will make a system call (i.e., request I/O) and sleeping until either the task is done, the time slice is complete, or the time until the task makes a system call elapses.
Once sleeping is done, your “CPU” should update the appropriate accounting values for the task before placing the task back into the queue or into the done area.
If a task still has time left to complete, it will be sent back to the scheduling policy to be rescheduled. Note that when the task is rescheduled, it may or may not be “run” on the same CPU again (we’re not implementing processor affinity).
If a task does not have any time left to complete after running on the CPU, the CPU will put the task into the “Done” area.
When there are no tasks left to be scheduled (NOTE: this is not the same as just no tasks in the scheduler, this is the condition that there are no tasks in the scheduler and the “reading” thread has finished reading all tasks from the file), the simulator shuts down and produces a report.
When the “reading” thread wakes, it resumes reading the processes file, adding tasks to the system until the file end, or another delay is found.
If you implement your scheduler using this architecture, your implementation will require several locks: minimally where the “reader” thread adds tasks to the ready queue, where the CPUs get a task from the dispatcher, where the CPUs return a task to the ready queue, and where the CPUs write tasks into the “Done” area. The level of granularity that you choose for locking the queue is your decision.
You’re also going to need to use condition variables to tell CPUs that are not currently working on a task that a new task is available.
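As a sketch of how the waiting and notification could fit together (the variable and function names here are illustrative assumptions, not requirements), each “CPU” thread can block on a condition variable until the dispatcher signals that the ready queue is non-empty, or that the simulation is shutting down:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state; a real solution would track actual tasks. */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t task_available = PTHREAD_COND_INITIALIZER;
static int ready_count = 0;           /* tasks waiting in the ready queue */
static bool simulation_done = false;

/* Body of each "CPU" thread: sleep until the dispatcher signals work. */
void *cpu_thread(void *arg)
{
    (void)arg;
    while (true) {
        pthread_mutex_lock(&queue_lock);
        while (ready_count == 0 && !simulation_done) {
            pthread_cond_wait(&task_available, &queue_lock);
        }
        if (simulation_done && ready_count == 0) {
            pthread_mutex_unlock(&queue_lock);
            break;                    /* no tasks left, reader finished */
        }
        ready_count--;                /* take one task from the queue */
        pthread_mutex_unlock(&queue_lock);
        /* ... "run" the task here, then requeue it or mark it done ... */
    }
    return NULL;
}

/* Called by the dispatcher/reader when a task enters the ready queue. */
void add_task(void)
{
    pthread_mutex_lock(&queue_lock);
    ready_count++;
    pthread_cond_signal(&task_available);
    pthread_mutex_unlock(&queue_lock);
}
```

At shutdown, use `pthread_cond_broadcast` (rather than `signal`) so every idle CPU thread wakes up, observes the termination condition, and exits.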
Workload
Some tasks are initially loaded from a file and are put in the scheduler’s ready queue before the start of the simulation (all the tasks that appear before the first DELAY line).
The file of tasks has a space-delimited format:
task_name task_type task_length odds_of_IO
There are 4 types of tasks to run, which are indicated by an id:
- id 0 - Short tasks
- id 1 - Medium tasks
- id 2 - Long tasks
- id 3 - I/O tasks
Note: I/O tasks are effectively the same as “Long tasks”, where the only difference is that I/O tasks have a higher chance of doing an I/O operation (see taskmaker.py to get an idea of how the tasks are created).
Here’s an example of a single line that’s generated by taskmaker.py, which represents the properties for one task:
io_task_1 3 100 50
This line can be decoded into a task in your system as:
- task_name: “io_task_1”
- task_type: 3, io task
- task_length: 100 microseconds left to complete
- odds_of_IO: 50%
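One possible way to read such a line into a record (the struct layout and field names here are assumptions for illustration, not requirements) is with sscanf:

```c
#include <stdio.h>

/* Illustrative task record; adapt the fields to your own design. */
typedef struct {
    char name[64];
    int type;          /* 0..3 */
    int length_usecs;  /* remaining execution time in microseconds */
    int odds_of_io;    /* 0..100 */
} task_t;

/* Parse one space-delimited task line; returns 1 on success, 0 otherwise.
 * A DELAY line fails this parse, which lets the caller handle it separately. */
int parse_task_line(const char *line, task_t *task)
{
    return sscanf(line, "%63s %d %d %d",
                  task->name, &task->type,
                  &task->length_usecs, &task->odds_of_io) == 4;
}
```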
You can find a file of tasks here. Note that your program will be evaluated on a different tasks.txt than what is provided here, so you should be sure to generate at least one other workload using taskmaker.py and verify that your program does not crash on different input. Include this extra workload in your submission and note its filename in the README. The graders may use the file that you submit.
I/O
The odds of a task yielding the processor for I/O are a percentage out of 100.
The strategy that you can use to determine whether or not a task makes a system call (does I/O) is random numbers: when a task starts on a CPU, generate a random number between 0 and 100. If the number you generate is less than or equal to the “odds_of_IO” for the task, then the task will do I/O; if the number is greater than the “odds_of_IO”, then the task will not do I/O.
- If a task does do I/O, then you should generate another random number to decide how long the task executes before making its system call. This random number should be between 0 and the length of a time slice.
- If a task does not do I/O, then it should just run for its complete time slice.
I/O in this system is satisfied instantly. That means that if a task does I/O, it should be placed back into the scheduler and it should be eligible to be scheduled, possibly immediately.
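The two random draws above can be sketched as a single helper, assuming the 50-microsecond quantum from this assignment’s MLFQ configuration (`rand()` is used for simplicity; any random number generator works):

```c
#include <stdlib.h>

#define QUANTUM_USECS 50  /* quantum length from the MLFQ configuration */

/* Decide how long a task runs for this dispatch: if a roll in 0..100 is
 * at most odds_of_io, the task "blocks" for I/O partway through its slice
 * (a second roll in 0..quantum); otherwise it uses the full time slice. */
int run_duration_usecs(int odds_of_io)
{
    int roll = rand() % 101;                  /* 0..100 inclusive */
    if (roll <= odds_of_io) {
        return rand() % (QUANTUM_USECS + 1);  /* 0..quantum: does I/O */
    }
    return QUANTUM_USECS;                     /* no I/O: full slice */
}
```

If the task has less remaining time than the chosen duration, your CPU thread should of course only run (sleep) for the remaining time.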
Delays
There are special lines in the input file that slow the input of the tasks, so not all the tasks are put in the ready queue at time 0.
The format is DELAY [number of milliseconds]
Example:
DELAY 100
Or, in a potential file:
io_task_1 3 100 50
DELAY 100
io_task_2 3 100 50
This would read in io_task_1 and put it on the ready queue. The reading thread would then wait 100 ms before reading in io_task_2 and the rest of the file, adding those processes to the simulation.
Note: You do not need to preempt running jobs with new jobs read in from the file. Simply add the jobs to the queue, and the jobs should be picked up appropriately when the running jobs’ time slice is completed.
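The reading thread needs to distinguish DELAY directives from task lines. One way to sketch that check (the function name is an illustrative assumption) is again with sscanf, relying on the literal `DELAY` prefix in the format string:

```c
#include <stdio.h>

/* If the line is a "DELAY <ms>" directive, store the delay in
 * milliseconds and return 1; any other line returns 0. */
int parse_delay_line(const char *line, int *delay_ms)
{
    return sscanf(line, "DELAY %d", delay_ms) == 1;
}
```

On a match, the reader would sleep for `delay_ms` milliseconds before continuing to read the file; otherwise it parses the line as a task and adds it to the ready queue.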
“Running” tasks
The tasks that you’re running don’t actually do anything, they just go to sleep. Once you’ve decided how long your task is going to run for, the “CPU” thread should just go to sleep for that long, in microseconds.
You can use the following code snippet to “run” your task:
#include <time.h>

#define NANOS_PER_USEC 1000
#define USEC_PER_SEC 1000000

static void microsleep(unsigned int usecs)
{
    long seconds = usecs / USEC_PER_SEC;
    long nanos = (usecs % USEC_PER_SEC) * NANOS_PER_USEC;
    struct timespec t = { .tv_sec = seconds, .tv_nsec = nanos };
    int ret;
    do
    {
        ret = nanosleep(&t, &t);
        // need to loop, `nanosleep` might return before sleeping
        // for the complete time (see `man nanosleep` for details)
    } while (ret == -1 && (t.tv_sec || t.tv_nsec));
}
MLFQ Scheduling Policies
Your implementation of a multi-level feedback queue (MLFQ) should match what is described in Chapter 9 of OSTEP. Specifically, it should implement the final set of rules that are described there:
- Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).
- Rule 2: If Priority(A) = Priority(B), A and B run in round-robin fashion using the time slice of the given queue.
- Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
- Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced (it moves down one queue).
- Rule 5: After some time period S, move all the jobs in the system to the topmost queue.
The configuration for your implementation of MLFQ should be:
- Three (3) priority levels.
- Quantum length for all queues is 50 microseconds.
- Time allotment for tasks is 200 microseconds before their priority is lowered.
- The time value S will be modified to do some analysis of the system.
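The per-task bookkeeping that this configuration implies can be sketched as follows (the struct and function names are assumptions for illustration; here a larger priority number means a lower-priority queue):

```c
#define NUM_PRIORITY_LEVELS 3
#define ALLOTMENT_USECS 200   /* allotment before priority is lowered */

/* Hypothetical per-task MLFQ state. */
typedef struct {
    int priority;         /* 0 = topmost queue, 2 = lowest queue */
    int allotment_used;   /* microseconds consumed at the current level */
} mlfq_state;

/* Called after a task has run for ran_usecs on a CPU: track the
 * allotment across time slices and demote once it is used up. */
void mlfq_account(mlfq_state *s, int ran_usecs)
{
    s->allotment_used += ran_usecs;
    if (s->allotment_used >= ALLOTMENT_USECS &&
        s->priority < NUM_PRIORITY_LEVELS - 1) {
        s->priority++;            /* move down one queue */
        s->allotment_used = 0;
    }
}

/* Called every S milliseconds: boost every task to the topmost queue. */
void mlfq_boost(mlfq_state *s)
{
    s->priority = 0;
    s->allotment_used = 0;
}
```

Note that the allotment is accumulated across dispatches, so a task cannot dodge demotion by doing I/O just before its slice ends.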
Queues
You are welcome to implement your own queue(s) as a linked data structure in C, but you also have other options:
- Use the queue wrapper library defined in sys/queue.h (see man queue).
- Use a “ticket-based” queuing/priority system, similar to the ticket lock implementation described in Chapter 28 of OSTEP.
- Use an array/some arrays and constantly sort them by some property of each task.
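If you write your own linked structure, a minimal FIFO queue (one instance per priority level) might look like the sketch below. This version is deliberately not thread-safe; you would still wrap operations on it with your locks:

```c
#include <stdlib.h>

/* Minimal singly linked FIFO queue; names are illustrative. */
typedef struct node {
    void *task;             /* pointer to your task struct */
    struct node *next;
} node_t;

typedef struct {
    node_t *head, *tail;
} queue_t;

/* Append a task at the tail. */
void queue_push(queue_t *q, void *task)
{
    node_t *n = malloc(sizeof(*n));
    n->task = task;
    n->next = NULL;
    if (q->tail)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
}

/* Remove and return the task at the head, or NULL if empty. */
void *queue_pop(queue_t *q)
{
    if (!q->head)
        return NULL;
    node_t *n = q->head;
    void *task = n->task;
    q->head = n->next;
    if (!q->head)
        q->tail = NULL;
    free(n);
    return task;
}
```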
Measuring time
You will ultimately be reporting on the average turnaround time and response time for each type of task in the system. Since your tasks actually take time to complete, you should be measuring real (wall-clock) time.
You can use clock_gettime to get the current time with sufficient granularity (see man clock_gettime). You should look at the options that you can pass to clock_gettime, but you almost certainly want to use CLOCK_REALTIME.
You should also take a look at Profiling Code Using clock_gettime by Guy Rutenberg for a complete example of how to compute differences between times gathered from clock_gettime. The code in Guy’s blog post is explicitly permitted to be used in your own solutions.
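A small helper in the same spirit (written from scratch here, not taken from the blog post) converts two clock_gettime readings into an elapsed time in microseconds, assuming the end reading is not earlier than the start:

```c
#include <time.h>

/* Elapsed time between two clock_gettime() readings, in microseconds.
 * Assumes `end` is not earlier than `start`. */
long long elapsed_usecs(struct timespec start, struct timespec end)
{
    long long secs = (long long)(end.tv_sec - start.tv_sec);
    long long nanos = (long long)(end.tv_nsec - start.tv_nsec);
    return secs * 1000000LL + nanos / 1000LL;
}
```

You would record one timestamp when a task enters the system, one when it first starts running (for response time), and one when it finishes (for turnaround time).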
Running
Your program should take three command-line arguments:
- The number of CPUs in the system you’re simulating,
- The value for S, the time between moving all the jobs to the topmost queue, in milliseconds, and
- The name of the file that contains the workload for the simulation.
Some examples:
./a3q1 4 5000 tasks.txt   # 4 "CPUs" with S set to 5000 ms
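A minimal sketch of handling these three arguments (the function and variable names are illustrative assumptions):

```c
#include <stdlib.h>

/* Parse "<cpus> <S in ms> <workload file>"; returns 0 on success,
 * -1 if the argument count or values are invalid. */
int parse_args(int argc, char *argv[],
               int *cpus, int *s_ms, const char **file)
{
    if (argc != 4)
        return -1;
    *cpus = atoi(argv[1]);
    *s_ms = atoi(argv[2]);
    *file = argv[3];
    if (*cpus <= 0 || *s_ms <= 0)
        return -1;
    return 0;
}
```

Your main function would then spawn that many CPU threads, start an S-millisecond boost timer, and hand the filename to the reader thread.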
Output
Print the following statistics at the end of the simulation:
- Average turnaround time for each type of task.
- Average response time for each type of task.
Sample output
Using mlfq with 4 CPUs.
Average turnaround time per type:
Type 0: 861 usec
Type 1: 2476 usec
Type 2: 16384 usec
Type 3: 18239 usec
Average response time per type:
Type 0: 900 usec
Type 1: 2017 usec
Type 2: 1092 usec
Type 3: 9816 usec
Note: This is sample output to show you the expected format and isn’t necessarily an example of the numbers you would actually see.
Report and analysis
Run the simulation that you implemented with
- 1 CPU, 2 CPUs, 4 CPUs, and 8 CPUs,
- each with 4 different values for S,
for a total of 16 runs.
Compute an average turnaround and response time for each type of task for each of these settings.
Include these results in your README.md, and explain whether or not the results that you’re seeing make sense by answering the following:
- Is the difference in turnaround time and response time what you expected to see as S and the number of CPUs change? Why or why not?
- How does adjusting the S value in the system affect the turnaround time or response time for long-running tasks? Does it appear to be highly correlated?
You may make custom workloads to focus your tests.