[2022] Coventry - 5011CEM Big Data Programming Project

Assessment Overview

Over the course of this module you have been introduced to a range of techniques that may be used for programming a big data project. This assessment allows you to pull together these techniques in a realistic scenario to complete a big data analysis project. Below is a realistic project scenario. By using the techniques presented during class you are to carry out the project and write a final project report for your client.

Project Scenario

You have been approached by a client who analyses atmospheric science and climate model data. They have developed a new analysis technique, but it takes too long to run for them to use it. They have asked you to investigate the use of big data techniques to reduce the processing time.

They have a large volume of data to process, and the analysis needs to be repeated frequently. They have the following basic requirements:

1.     The current analysis takes approximately 2.5 hours to process the climate model output data for a 1-hour time period.

2.     The data for a single day of model output is approximately 250MB. However, they have over 100 years’ worth of data to analyse, making a total of over 9TB.

3.     Each day, they need to analyse the new data set for that day, so they wish to complete the analysis of the data for a 24-hour period (25 data sets) in under 2 hours.

4.     It is not possible to hold all of this data in memory at one time, so the new process should load only 1 hour of data for processing at a time. If parallel processing is used, then 1 hour of data per worker can be loaded as needed.
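
The figures in these requirements already fix the size of the speed-up being asked for. The short calculation below is a sketch of that arithmetic only. It assumes perfect linear scaling (N workers give an N-fold speed-up), which real measurements are unlikely to match, so the worker count it prints is a lower bound rather than an estimate.

```python
import math

# Figures taken from the requirements above.
HOURS_PER_DATASET = 2.5   # current analysis time for one 1-hour data set
DATASETS_PER_DAY = 25     # data sets in one 24-hour period, as stated
TARGET_HOURS = 2.0        # the daily run must finish in under this time

sequential_hours = HOURS_PER_DATASET * DATASETS_PER_DAY
required_speedup = sequential_hours / TARGET_HOURS

# Assumption: perfect linear scaling, so N workers give an N-fold speed-up.
# floor(...) + 1 is the smallest whole number strictly above the required
# speed-up, so the run finishes in under the target rather than exactly on it.
ideal_workers = math.floor(required_speedup) + 1

print(f"Sequential time per day: {sequential_hours} h")    # 62.5 h
print(f"Required speed-up: {required_speedup}x")           # 31.25x
print(f"Workers needed under ideal scaling: {ideal_workers}")  # 32
```

One consequence worth noting: if each worker simply processes whole data sets and each data set still takes 2.5 hours, then 25 workers finish in 2.5 hours at best, which misses the target. Under that assumption the work within a single data set would also need to be divided.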

You have been tasked with investigating the use of parallel processing to achieve the analysis speed required, with the following expectations:

1.     Test and compare the processing speed of sequential and parallel processing.

2.     Extrapolate your findings to indicate the number of processors required to achieve the target processing time.

3.     Test how your code responds to common errors, e.g. data that is text instead of numeric, use of NaN in the data as an error code.

4.     Run automated tests that allow your client to set the tests running and return later to see the results, without user intervention.
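
As an illustration of the first expectation, the sketch below times the same workload sequentially and on a pool of workers. It is not the client's code: `analyse_hour` is a hypothetical CPU-bound stand-in for the supplied analysis, and the function names are chosen for this example. Python's standard `concurrent.futures` module is used here as one possible tool; the brief does not prescribe a language or library.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def analyse_hour(hour_index):
    """Hypothetical CPU-bound stand-in for analysing one hour of data."""
    total = 0
    for i in range(200_000):
        total += (i * (hour_index + 1)) % 7
    return total


def run_sequential(hours):
    """Process each hour in turn; return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [analyse_hour(h) for h in hours]
    return results, time.perf_counter() - start


def run_parallel(hours, workers, executor_cls=ProcessPoolExecutor):
    """Process the hours on a pool; the timing includes pool start-up."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(analyse_hour, hours))
    return results, time.perf_counter() - start


def compare_timings(hours, worker_counts, executor_cls=ProcessPoolExecutor):
    """Return [(workers, seconds), ...]; workers == 1 is the sequential run."""
    baseline, t_seq = run_sequential(hours)
    timings = [(1, t_seq)]
    for workers in worker_counts:
        results, t_par = run_parallel(hours, workers, executor_cls)
        # Guard against a parallel run that silently changes the answers.
        assert results == baseline, "parallel and sequential results differ"
        timings.append((workers, t_par))
    return timings
```

A script would call, for example, `compare_timings(list(range(25)), [2, 4, 8])` from inside an `if __name__ == "__main__":` guard, which process pools need on platforms that start workers by re-importing the script. The returned list can be logged or plotted without any user interaction.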

The data has been provided by the European Centre for Medium-Range Weather Forecasts (ECMWF).

Project Deliverables

Your project should deliver the following:

1.     Working code that demonstrates:

a.     Loading of only the data required for the processing taking place

b.     Sequential processing of the data

c.     Parallel processing of the data

d.     Plots of the comparisons between sequential processing and parallel processing with different numbers of workers

e.     Automated testing of your code to deal with pre-defined data error types.

2.     A formal project report for your client covering:

a.     Comparisons between parallel and sequential data processing

b.     Estimated number of processors required to achieve the goal of processing 24 hours of data in under 2 hours.

c.     Testing the code to see how it deals with:

        i.     Text instead of numeric values

        ii.     NaN values indicating data errors.

        iii.     Note: it is not necessary to solve these problems to pass, but you should be able to suggest methods of dealing with these problems so that the code will not crash.

d.     A summary of the evidence generated during your project and how it helps you arrive at your conclusions

e.     Recommendations

f.     References

g.     Appendices containing:

        i.     Code flow charts

        ii.     Gantt chart for your project

        iii.     Logbook

        iv.     Specification items

3.     VIVA / presentation. You will be expected to present your work in a formal presentation / VIVA. Details of this can be found in the VIVA assessment brief.

This assessment brief covers only parts 1 and 2. The assessment brief for part 3, VIVA, is found in a separate document.
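
For the processor estimate in deliverable 2b, measured speed-ups have to be extrapolated beyond the worker counts actually tested. One common way to do this, used here as an assumption rather than something the brief requires, is Amdahl's law: speed-up on N workers is 1 / (s + (1 - s) / N), where s is the fraction of the work that cannot be parallelised. The sketch below estimates s from timings and solves for N. The measurements in it are hypothetical placeholders, not results.

```python
import math


def serial_fraction(workers, speedup):
    """Serial fraction implied by one measurement under Amdahl's law."""
    return (workers / speedup - 1) / (workers - 1)


def workers_needed(s, required_speedup):
    """Smallest worker count whose Amdahl speed-up exceeds the requirement.

    Returns None when the serial fraction caps the achievable speed-up at or
    below the requirement, however many workers are added.
    """
    if s >= 1 / required_speedup:
        return None
    exact = (1 - s) / (1 / required_speedup - s)
    return math.floor(exact) + 1


# Hypothetical (workers, observed speed-up) pairs; replace with real timings.
measurements = [(2, 1.96), (4, 3.80), (8, 7.20)]
fractions = [serial_fraction(n, sp) for n, sp in measurements]
s_mean = sum(fractions) / len(fractions)

# 25 data sets at 2.5 h each, against a 2 h target.
required = (25 * 2.5) / 2.0

print(f"estimated serial fraction: {s_mean:.4f}")
print(f"workers needed: {workers_needed(s_mean, required)}")
```

Averaging the per-measurement fractions is the simplest choice; a least-squares fit over all measurements would be a reasonable alternative. A `None` result means that, under this model, no number of processors reaches the target and the serial part of the process has to be reduced first.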

Additional Information

1.     You will be provided with NetCDF data files:

a.     One complete, correct data file.

b.     One file containing instrument errors, recorded as NaN.

c.     One file containing a data storage error, where the numerical values have been saved as text.

2.     You are provided with code files for the analysis technique. You should not edit these files in any way. You are required to run the analysis, for timing purposes, but are not expected to analyse, display, report on, or deal with the results of the analysis in any way.

3.     You are expected to define your project by means of a list of 5 SMART specification items. These should be included in an appendix.

4.     You are expected to plan the work required for this project and provide a complete Gantt chart, including identifying the critical path. This should be included in an appendix.

5.     This is a formal report and it is expected that appropriate formal grammar and language are used. Where this is not the case, a penalty of up to 10% may be applied to the marks for the report structure. For help with formal writing, please contact the Centre for Academic Writing.
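
The two error files described in item 1 (NaN values and numbers stored as text) can be checked before the analysis runs, so that a bad file is reported rather than crashing the job. The sketch below shows one possible check that runs unattended and collects its results in a list. The data in it are hypothetical stand-ins held in plain Python lists; in the real project the values would come from a NetCDF reader, loaded one hour at a time.

```python
import math


def classify_values(values):
    """Count numeric, NaN and text entries in a sequence of values."""
    counts = {"ok": 0, "nan": 0, "text": 0}
    for value in values:
        if isinstance(value, str):
            counts["text"] += 1
        elif isinstance(value, float) and math.isnan(value):
            counts["nan"] += 1
        else:
            counts["ok"] += 1
    return counts


def check_dataset(name, values):
    """Summarise one data set as a log line instead of raising an error."""
    counts = classify_values(values)
    status = "PASS" if counts["nan"] == 0 and counts["text"] == 0 else "FAIL"
    return f"{name}: {status} {counts}"


# Hypothetical stand-ins for the three supplied files.
test_cases = {
    "complete": [1.0, 2.5, 3.2],
    "instrument_errors": [1.0, float("nan"), 3.2],
    "stored_as_text": ["1.0", "2.5", 3.2],
}

# Run every case without user input and keep the results for later review.
report = [check_dataset(name, values) for name, values in test_cases.items()]
for line in report:
    print(line)
```

Writing `report` to a file instead of printing it would let the client start the tests and read the outcome later. How a failing data set is then handled (skipped, repaired, or converted from text to numbers) is a separate design decision for the report.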
