1. Homepage
  2. Programming
  3. COSC 1114 Operating Systems Principles Semester 2, 2022 Assignment 1 : Map Reduce

COSC 1114 Operating Systems Principles Semester 2, 2022 Assignment 1 : Map Reduce

Engage in a Conversation
COSC 1114Operating Systems Principles C++CMap Reduce澳洲AustraliaRmit University

COSC 1114 Operating Systems Principles Semester 2, 2022 CourseNana.COM

Assignment 1 CourseNana.COM


1. Research Question CourseNana.COM

We can use Load Balancing as a technique of ensuring that all processes/threads in a particular [problem are treated in such a fashion that they all finish at approximately the same time. We frame the problem in the form of a map/reduce problem, and then use Static Load Balancing (S-LBAN) to adjust the process load for optimal net performance. CourseNana.COM

The question is how accurate can we do this? Is it worth the effort? Clearly Amazon’s Elastic Computing (EC2) tells us that this is so on a large scale, which typically includes a lot of data movement and networking. But what about in a real-time system, with no significant networking – the data arriving in a batch before the program begins, and having repeatable statistical properties? CourseNana.COM

In other words, we measure, adjust, and deliver optimally. CourseNana.COM

The central task of this assignment is to produce an evidence-based report answering the above questions. CourseNana.COM

Methodology CourseNana.COM

We propose to answer the RQ by creating a series of tasks. First some simple ones in order to establish a baseline of result expectations, then to rewrite the tasks using the map/reduce software pattern to produce a set of concurrent threads of different complexity. We can then adjust the scheduling in order to answer the research question. CourseNana.COM

2. Overview of Methodology CourseNana.COM

In this assignment, you must write C/C++ code to implement a map / reduce problem. Map() is a process or function that maps the input problem to the available resources which may mean subdividing the problem into components and solving each of those component problems using prioritized job scheduling. CourseNana.COM

Then reduce() gathers the component solutions and combines them to form the global solution. CourseNana.COM

You will be provided with “dirty data” and the assignment is then divided into a number of tasks, each of which purports to solve the problem a different way. In each case, there will be a performance measuring phase, and for the later methods, an adjustment phase based on the performance metrics obtained. CourseNana.COM

Throughout, you will be measuring and properly documenting performance (and hopefully improving it) CourseNana.COM

In the last task, where streaming data is used, it will be important to reduce idle time by ensuring that all subtasks created by map() finish at the same time, so that reduce() is not kept waiting. CourseNana.COM

3. Learning Outcomes CourseNana.COM

This assessment relates to the following course learning outcomes: CourseNana.COM

  • Summarise the full range of considerations in the design of file systems, summarise techniques for achieving synchronisation in an operation system,
  • Compare and contrast the common algorithms used for both pre-emptive and non-pre-emptive scheduling of tasks in operating systems, such as priority, performance comparison, and fair- share schemes. Contrast kernel and user mode in an operating system
  • Evaluate and report appropriate design choices when solving real-world problems
  • Analyse the key trade-offs between multiple approaches to operating system design.

4. Assessment details CourseNana.COM

General CourseNana.COM

Your solution to the problem you choose must not use busy waiting, which is where a program sits in a tight loop until some condition is met as this waste’s CPU time. Instead, you must have each thread sleep until a condition is met (use a condition variable), wake up and do some processing and then perhaps signal other threads that the job has been done. Please ensure sufficient output so that it is clear when your program performs any actions such as adding to an array, locking a lock, etc. CourseNana.COM

You should ensure in your design of the program that you only share between threads the minimum state (variables) possible as the more information that is shared the more likely there is to be a conflict (deadlocks, race conditions, etc). Please see the courseware material for more detail on these problems. If your algorithm requires randomness, you should ensure you seed the random number generator correctly. CourseNana.COM

To ease marking of your assignment, you must call your executables "taskN", where N is the task number as described below. All materials necessary for the markers to build your tasks must be included in the submitted ZIP file. The markers will use the CS servers (titan, saturn, jupiter) for the marking, and the code is expected to run there. CourseNana.COM

Makefile CourseNana.COM

Your solution must be written in C / c++ and you must supply a Makefile that builds your solution incrementally and then links them together. If you are feeling a big rusty about make files or have not used them before, we recommend going through the following tutorial: CourseNana.COM

https://www.tutorialspoint.com/makefile/index.htm CourseNana.COM

Compiler Settings CourseNana.COM

Please note that as a minimum you must use the ‘-Wall -g’ flags on gcc or equivalent on other compilers to generate warnings. You may use any supported c++ compiler on the server - if you wish to use a standard above c++ on the server with g++ you will need to use the scl command, e.g.:  scl enable devtoolset-9 bash
This should get you gcc version 9.3.1 (2020) instead of gcc 4.8.5 (2015). Use “gcc –version” to confirm.
CourseNana.COM

Graceful Exit CourseNana.COM

In order to account for the possibility of thread starvation, you will need to gracefully exit your simulation at the end of around 10 seconds. We would normally expect even the slowest task – task1() – to not take longer than 10 seconds to execute. CourseNana.COM

Use the following method to do this: once you have launched all the required threads from main, call the sleep function, to specify sleep for ten seconds. Once the sleep function finishes, change the value of a global variable to indicate that all the threads should exit, which they should be checking regularly. CourseNana.COM

A less graceful exit might be to call exit() after 10 seconds, but that risks leaving zombies behind with unfinished business – potentially leading to data loss. CourseNana.COM

The solution you submit will ideally be as bug-free as possible including free of memory bugs. Your markers will mark your submission on the titan/jupiter/saturn servers provided by the university and we will use the tool "valgrind" to test that your program is memory correct (no invalid access to memory and all memory freed) and as such you should check your program on titan before submission with this tool. This is for debugging and memory testing only. When gathering performance data, you should do this on a dedicated machine or VM, since on titan, being a shared resource, the timings will obviously not be repeatable. CourseNana.COM

The following command:
valgrind --track-origins=yes --leak-check=full --show-leak-kinds=all ./simulation
CourseNana.COM

Will run your program with valgrind. Please note that in this case the name of the executable produced by compilation of your program is 'executable'. CourseNana.COM

Allowed Concurrency Functions CourseNana.COM

For the concurrency part of the assignment, you must limit yourself to the following functions: CourseNana.COM

  • pthread_create
  • pthread_join
  • pthread_detach
  • pthread_mutex_init
  • pthread_mutex_destroy
  • pthread_mutex_trylock
  • pthread_mutex_lock
  • pthread_cond_wait
  • pthread_cond_signal
  • pthread_cond_init
  • you may also need the pthread_mutexattr_* functions
  • you may also use the scheduling priority function.

This list is not exhaustive and so if you are unsure, please ask on the discussion board about the function you wish to use. CourseNana.COM

Please note that in practice beyond this course, you would use higher-level c++ functions that call these functions, but part of what you are learning here is the underlying calls that are made as part of managing concurrency. That is, part of what you are learning is to apply the theory that you learn in the workshops to practical problems. CourseNana.COM

The Source Data CourseNana.COM

To start off, you may use a words list conventionally stored in ‘/usr/share/dict/linux.words’ as a clean data file. You may need to install this in your distro. Titan does not have it. CourseNana.COM

The actual data to be used is at https://www.keithv.com/software/wlist/. CourseNana.COM

You will find a number of ZIP files containing text files containing (ideally) 1 word per line which you read into an array for subsequent processing. In fact, the data is ‘dirty’ and you will need to clean It up first. In your performance measurement using Amdahl’s Law, this is the serial part. CourseNana.COM

Performance Measurement and Reporting CourseNana.COM

You will need to devise a way of measuring improvement using primarily execution time, but also resource usage. Consider some of the metrics discussed in class. CourseNana.COM

You then need to describe this using graphs and other ways to describe how what you did improved performance, and why. See reporting details below. CourseNana.COM

5. The Tasks in detail. CourseNana.COM

The fundamental task is to receive a set of words, clean it up, then use map/reduce ultimately to sort it. Words of length 3-15 characters are to be sorted on the third character onwards. Words of other lengths are to be ignored (filtered out). The tasks below achieve this in different ways. CourseNana.COM

Below you will find a selection of five tasks called task1 ... task5 that must all be done in sequence. In each case, the simulations should not exceed about 10 seconds and terminate cleanly - no crashes, no deadlocks, no race conditions. This should be enough time to gather event statistics. You may need to replicate the data in order to make it run long enough. If so, then please describe what you did. CourseNana.COM

You should also print a line of text for each action in your program such as adding an element to an array. All normal messages should be printed to stdout (use cout) and error messages should be printed to stderr (use cerr). This will allow your marker to capture normal output and error output separately if they wish to. In bash, a command like: CourseNana.COM

./task2 (args) 1>outfile 2>errfile
would do the job (where args are whatever arguments deemed necessary).
CourseNana.COM

Get in Touch with Our Experts

WeChat (微信) WeChat (微信)
Whatsapp WhatsApp
COSC 1114代写,Operating Systems Principles 代写,C++代写,C代写,Map Reduce代写,澳洲代写,Australia代写,Rmit University代写,COSC 1114代编,Operating Systems Principles 代编,C++代编,C代编,Map Reduce代编,澳洲代编,Australia代编,Rmit University代编,COSC 1114代考,Operating Systems Principles 代考,C++代考,C代考,Map Reduce代考,澳洲代考,Australia代考,Rmit University代考,COSC 1114help,Operating Systems Principles help,C++help,Chelp,Map Reducehelp,澳洲help,Australiahelp,Rmit Universityhelp,COSC 1114作业代写,Operating Systems Principles 作业代写,C++作业代写,C作业代写,Map Reduce作业代写,澳洲作业代写,Australia作业代写,Rmit University作业代写,COSC 1114编程代写,Operating Systems Principles 编程代写,C++编程代写,C编程代写,Map Reduce编程代写,澳洲编程代写,Australia编程代写,Rmit University编程代写,COSC 1114programming help,Operating Systems Principles programming help,C++programming help,Cprogramming help,Map Reduceprogramming help,澳洲programming help,Australiaprogramming help,Rmit Universityprogramming help,COSC 1114assignment help,Operating Systems Principles assignment help,C++assignment help,Cassignment help,Map Reduceassignment help,澳洲assignment help,Australiaassignment help,Rmit Universityassignment help,COSC 1114solution,Operating Systems Principles solution,C++solution,Csolution,Map Reducesolution,澳洲solution,Australiasolution,Rmit Universitysolution,