IOT304TC Cloud Computing Coursework 2
ASSIGNMENT TASK (INDIVIDUAL WORK)
You are required to work individually to complete the assignment. There are 5 questions and you must answer all of them. To complete the assessment, you need to apply the skills you developed in the lectures, where you learned how to analyse different cloud models and mechanisms and use cloud computing technology to solve industrial and practical problems. You are strongly encouraged to do additional research to identify, collect, and compare further relevant information that can be incorporated into your report. When using different sources, you must ensure that they are correctly referenced, and you need to synthesise your own ideas and present them in your own words.
SUBMISSION FORMAT INSTRUCTIONS
The assignment must be typed and submitted via Learning Mall Online to the correct dropbox. Only electronic submissions are accepted - no hard copies. The report should be submitted in a single PDF file and the format should follow the below structure:
• Cover page filled in with your student ID
• Your answer to each question
• List of references
All students must download their file and check that it is viewable after submission. Document uploads may become corrupted during the uploading process (e.g., due to slow internet connections). Therefore, students themselves are responsible for submitting a functional and correct file that needs to be tested after submitting it.
Deadline reminder: 11:59 PM China time (UTC+8 Beijing) on Tuesday 30 May 2023
LEARNING OUTCOMES
This assignment tests your ability to:
A. Demonstrate systematic understanding and critical awareness of well-defined concepts, models, and technologies for Cloud computing technologies and practices.
B. Demonstrate expertise in different Cloud models and mechanisms, including their strengths and weaknesses.
C. Adapt or combine the key elements of existing Cloud models and mechanisms to design cloud solutions to real-world application problems.
MARKING CRITERIA
The following criteria will be used to assess the assignment. This report is marked for the whole group.
• Outstanding: Report format is consistent throughout including heading styles, fonts, and margins, figure/table/diagram/program/chart are correctly labelled, effectively interpreted and discussed, writing flows smoothly from one idea to another, information is presented in a logical and interesting way, all information is located in the appropriate section, calculation process is clearly presented before arriving at the final answer or conclusion.
• Appropriate: Report format is generally consistent, figure/table/diagram/program/chart are properly interpreted, sentences are structured and words are chosen to communicate ideas clearly, information is presented in a logical manner, information is located in the appropriate section, calculation process is properly presented before arriving at the final answer or conclusion.
• Needs Improvement: Report format is inconsistent, figure/table/diagram/program/chart are poorly interpreted and discussed, sentence structure and/or word choice sometimes interfere with clarity, information is hard to follow as there is very little continuity, many items are in the wrong section, some steps or procedures are missing before arriving at the final answer or conclusion.
• Hard to Understand: Report format is inconsistent, figure/table/diagram/program/chart are not used effectively, sentence structure and word choice make reading and understanding difficult, sequence of information is difficult to follow, lack of appropriate sections and many items are in the wrong section, some steps or procedures are missing in the calculation process, the final answer and/or conclusion are incorrect.
• No submission or Missing Section: No submission or missing section of the discussion in the report.

Question #1 (15 marks)
Basis of marking:
✓ Ability to demonstrate correctly implemented code on cloud platform based on existing cloud computing technologies/models.
✓ Ability to analyse the strength or factors that influence the adoption of cloud technology in a relevant industry.
Mark:
• Outstanding: 11 - 15
• Appropriate: 7 - 10
• Needs improvement: 4 - 6
• Hard to understand: 1 - 3
• No submission or missing section: 0

Question #2 (20 marks)
Basis of marking:
✓ Ability to analyse the recent trend and development of cloud computing technology.
✓ Ability to analyse the strength or factors that influence the adoption of cloud technology in a relevant industry.
Mark:
• Outstanding: 15 - 20
• Appropriate: 10 - 14
• Needs improvement: 6 - 9
• Hard to understand: 1 - 5
• No submission or missing section: 0

Question #3 (20 marks)
Basis of marking:
✓ Ability to analyse the various cloud deployment models/technologies, including their practical or industrial applications.
✓ Ability to analyse the strengths and weaknesses/limitations of various cloud deployment models.
Mark:
• Outstanding: 15 - 20
• Appropriate: 10 - 14
• Needs improvement: 6 - 9
• Hard to understand: 1 - 5
• No submission or missing section: 0

Question #4 (25 marks)
Basis of marking:
✓ Ability to analyse the various cloud service models, including their practical applications.
✓ Ability to analyse the strengths and weaknesses/limitations of various cloud service models.
Mark:
• Outstanding: 20 - 25
• Appropriate: 15 - 19
• Needs improvement: 10 - 14
• Hard to understand: 1 - 9
• No submission or missing section: 0

Question #5 (20 marks)
Basis of marking:
✓ Ability to design and integrate the existing cloud models and technology to solve real-world application problems.
✓ Ability to demonstrate understanding and critical awareness of well-defined concepts, models, and technologies for Cloud computing technologies and practices.
Mark:
• Outstanding: 15 - 20
• Appropriate: 10 - 14
• Needs improvement: 6 - 9
• Hard to understand: 1 - 5
• No submission or missing section: 0

Overall Mark: 100

The following table indicates what is expected for each classification category, highlighting generic marking criteria that bring together expectations in performance for each percentage (or alphabetical) band and the criteria that need to be satisfied.

Generic Marking Criteria

Grade A, Point Scale 81+, First. Criteria to be satisfied:
➢ Outstanding work that is at the upper limit of performance.
➢ Work would be worthy of dissemination under appropriate conditions.
➢ Mastery of advanced methods and techniques at a level beyond that explicitly taught.
➢ Ability to synthesise and employ in an original way ideas from across the subject.
➢ In group work, there is evidence of an outstanding individual contribution.
➢ Excellent presentation.
➢ Outstanding command of critical analysis and judgment.

Grade B, Point Scale 70 - 80, First. Criteria to be satisfied:
➢ Excellent range and depth of attainment of intended learning outcomes.
➢ Mastery of a wide range of methods and techniques.
➢ Evidence of study and originality clearly beyond the bounds of what has been taught.
➢ In group work, there is evidence of an excellent individual contribution.
➢ Excellent presentation.
➢ Able to display a command of critical thinking, analysis and judgment.

Grade C, Point Scale 60 - 69, Upper Second. Criteria to be satisfied:
➢ Attained all the intended learning outcomes for a module or assessment.
➢ Able to use well a range of methods and techniques to come to conclusions.
➢ Evidence of study, comprehension, and synthesis beyond the bounds of what has been explicitly taught.
➢ Very good presentation of material.
➢ Able to employ critical analysis and judgement.
➢ Where group work is involved there is evidence of a productive individual contribution.

Grade D, Point Scale 50 - 59, Lower Second. Criteria to be satisfied:
➢ Some limitations in attainment of learning objectives but has managed to grasp most of them.
➢ Able to use most of the methods and techniques taught.
➢ Evidence of study and comprehension of what has been taught.
➢ Adequate presentation of material.
➢ Some grasp of issues and concepts underlying the techniques and material taught.
➢ Where group work is involved there is evidence of a positive individual contribution.

Grade E, Point Scale 40 - 49, Third. Criteria to be satisfied:
➢ Limited attainment of intended learning outcomes.
➢ Able to use a proportion of the basic methods and techniques taught.
➢ Evidence of study and comprehension of what has been taught, but grasp insecure.
➢ Poorly presented.
➢ Some grasp of the issues and concepts underlying the techniques and material taught, but weak and incomplete.

Grade F, Point Scale 0 - 39, Fail. Criteria to be satisfied:
➢ Attainment of only a minority of the learning outcomes.
➢ Able to demonstrate a clear but limited use of some of the basic methods and techniques taught.
➢ Weak and incomplete grasp of what has been taught.
➢ Deficient understanding of the issues and concepts underlying the techniques and material taught.
➢ Attainment of nearly all the intended learning outcomes deficient.
➢ Lack of ability to use at all or the right methods and techniques taught.
➢ Inadequately and incoherently presented.
➢ Wholly deficient grasp of what has been taught.
➢ Lack of understanding of the issues and concepts underlying the techniques and material taught.
➢ Incoherence in presentation of information that hinders understanding.

Grade G, Point Scale 0, Fail. Criteria to be satisfied:
➢ No significant assessable material, absent, or assessment missing a “must pass” component.
Overview
MLlib is a Spark-based machine learning library. Its goal is to make popular machine learning algorithms scalable and easy to use. It provides algorithms including classification, regression, clustering, and collaborative filtering. In this assessment, we will create a cluster on the cloud based on the components learned in this course, deploy this tool, and complete a binary classification task.

Dataset
We will provide a dataset ‘data.csv’, which can be used to test your program. It is a 2-category, 10-feature dataset with about 1 × 10^6 samples.
Question 1 (15 marks)
Create a cluster on a cloud platform that contains at least 2 ECS instances with at least 8 vCPUs in total. (Hint: Explain the main installation process and take screenshots of the key steps.)
Question 2 (20 marks)
Deploy the Spark environment on the cluster. Explain the main installation process and take screenshots of the key steps.
Question 3 (20 marks)
Import the dataset 'data.csv' into a cloud storage service; you may choose the specific service type yourself. Explain the main installation process and take screenshots of the key steps. (Note that in this question, the storage should be defined independently of the ECS in Question 1.)
Question 4 (25 marks)
Write a program to read the data stored in Question 3. Choose any 3 classifiers in MLlib and estimate the F-score and AUC of each by 5-fold cross-validation (CV). Fill in the table below based on the program and indicate which classifier performed best.
NOTE: Parameters such as ‘number of iterations’ or ‘number of estimators’ in the various classifiers should be at least 100.

Model Name          F-score          AUC
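
For reference, the two metrics and the 5-fold split can be illustrated with a minimal pure-Python sketch. This is only an illustration of the definitions, not the MLlib program the question asks for: the function names `f_score`, `auc`, and `k_fold_indices`, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and are not taken from 'data.csv'.

```python
# Illustrative helpers only; these are hypothetical names, not MLlib APIs.

def f_score(labels, predictions):
    """F1 score for the positive class (label 1): 2PR / (P + R)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive sample has the
    higher score; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists; sample i is tested in fold i % k."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


# Toy data (made up for illustration).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

print("F-score:", f_score(toy_labels, toy_predictions))
print("AUC:", auc(toy_labels, toy_scores))
for train_idx, test_idx in k_fold_indices(10, k=5):
    print("test fold:", test_idx)
```

The pairwise AUC loop is quadratic and is only suitable for small examples; on the full dataset the metrics should come from the evaluators in the library you deploy.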
Question 5 (20 marks)
Set the maximum number of vCPUs available to Spark to each value from 1 to 8 in turn, measure the running time of the 3 classifiers from Question 4, and fill in the table below. Then point out which classifier runs fastest on average, and which classifier shows the most obvious speed-up as the number of processors increases.
NOTE: Parameters such as ‘number of iterations’ or ‘number of estimators’ in the various classifiers should be at least 100.

Num. of processors      Running Time (s)
                        Model 1      Model 2      Model 3
1
2
3
4
5
6
7
8
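
One possible way to compare the classifiers once the table is filled in is to compute the average running time and the speed-up relative to the 1-processor run. The sketch below assumes speed-up is defined as T(1) / T(n) and efficiency as speed-up / n; the function names and the running times in `placeholder_times` are hypothetical placeholders, not measured results.

```python
from statistics import mean


def speedup_table(times_by_cores):
    """Map each core count n to (speed-up, efficiency), where
    speed-up = T(1) / T(n) and efficiency = speed-up / n."""
    baseline = times_by_cores[1]
    table = {}
    for cores, seconds in sorted(times_by_cores.items()):
        speedup = baseline / seconds
        table[cores] = (speedup, speedup / cores)
    return table


def average_time(times_by_cores):
    """Mean running time over all tested core counts."""
    return mean(times_by_cores.values())


# Placeholder running times in seconds (made up, NOT measurements).
placeholder_times = {
    "Model 1": {1: 100.0, 2: 50.0, 4: 40.0},
    "Model 2": {1: 80.0, 2: 64.0, 4: 50.0},
}

for name, times in placeholder_times.items():
    print(name, "average time (s):", average_time(times))
    for cores, (speedup, efficiency) in speedup_table(times).items():
        print("  cores:", cores, "speed-up:", speedup, "efficiency:", efficiency)

# Classifier with the lowest average time, and the one with the largest
# speed-up at the highest tested core count.
fastest = min(placeholder_times, key=lambda m: average_time(placeholder_times[m]))
best_scaling = max(
    placeholder_times,
    key=lambda m: speedup_table(placeholder_times[m])[max(placeholder_times[m])][0],
)
print("fastest on average:", fastest)
print("largest speed-up:", best_scaling)
```

Comparing speed-up at the largest core count is only one reasonable reading of "most obvious speed-up"; the slope of the whole curve could be used instead.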