
INF2-FDS: Informatics 2 - Foundations of Data Science Project


FDS Coursework 3, 2023/24

Data Science Project

Released: Monday 26 February 2024
Submission deadline: Tuesday 2 April 2024 at 12:00 UK time

Late submission rules:

Rule 1: extensions (3 days) and ETA (7 days). Late penalties apply. See Late coursework and extension requests for full details of rules and late penalties.

If working as a group, please see the details on Learn of how Extensions and ETAs apply to this project.

Group or individual work:
This is a marked assignment which will count towards 40% of your final grade for Inf2-FDS.

Project description

For your final project in FDS you will work on a data science project. The goal of the project is to go through the complete data science process to answer a question. You will:

To reduce workload, and to make the project more enjoyable and potentially more interesting, we strongly encourage you to undertake the project in self-selected groups of two or three. However, we are offering the option of undertaking the project individually. There will be slight differences between the individual and group projects, as described below.

Project options

We are offering a choice of three project options:

  • Historical and world-wide trends in ultramarathon running

  • The cancellations of planned operations in the Scottish NHS

  • National university student satisfaction survey data

Later in the document there are more details of each option, including questions to address.

If you are working individually, you should address the main question we have supplied.
If you are working in a pair, you should address the main question we have supplied, and propose and address an extra question.
If you are working in a group of three, you should address the main question we have supplied and propose and address two extra questions.

Feedback on your progress

We offer the opportunity to share progress on your project, either via a one-page written progress update (due Week 8) or by presenting at a workshop in Week 9 or 10. This is not a marked component of the assignment; its purpose is to help you reflect on your progress and to get feedback from your tutor and peers (or another FDS staff member for those who submit a written page). Details are outlined in the section below, Feedback via written update or presentations (not for credit).

Submission

We will ask you to submit:

  1. A short report of your project written in LaTeX, using the supplied template and word limits (see “Report Structure”, below). The report will be assessed according to the criteria below. The report will be submitted using Gradescope. For group submissions, only one member of the group submits and should use the Gradescope interface to tag their other group members at the time of submission.

  2. Jupyter notebooks and/or Python files containing the code. We will not mark the code, but we may wish to run it. The code must run with no errors. The code will be submitted to Learn.

  3. If you are doing a project in pairs or threes, you will each need to write a short individual statement about how you divided the work, and what the individual contributions of each member of the group were. This can be a brief statement of contributions, e.g. “X & Y planned the analysis, Y implemented the analysis, X did the visualisations, X & Y wrote the report”. This is common practice in scientific reports. This statement will be submitted via a Microsoft Form that will be distributed near the submission date.

Submission details for the report and individual statements will be released closer to the deadline.

Report Structure

Getting and using the LaTeX template

You must use the LaTeX template we supply, and not change margins or font sizes.

To get the template, first open it in Overleaf (https://www.overleaf.com/read/brpnfsptvxnp) and then either:

  1. use “Copy Project” from the Overleaf menu to start editing your own version, or

  2. download the source as a zip file if you wish to edit it locally using another LaTeX editor.

The training resource LaTeX for Beginners using Overleaf by the University of Edinburgh Digital Skills & Training Team contains a step-by-step guide to using LaTeX with Overleaf, including how to do equations, tables, citations and references. Tutorials on LaTeX are also available from InfPALS.

Format

The report format is as follows:

  • Overview, giving a description of the problem, the work carried out, and the results (maximum 250 words)

  • Introduction (suggested 400 words): Background to the question, to be read by someone with no prior knowledge of the question. It should give:
    o Context and motivation: what is the area of this data science study, and why is it interesting to investigate?
    o Brief description of any previous work in this area (e.g., in the media, scientific literature or blogs)
    o Objectives of the project: what questions are you setting out to answer?

  • Data (suggested 300 words): A description of the dataset(s), and how you processed it or them:
    o Data provenance: Who created the dataset(s)? How have you obtained it (e.g., file or web scraping), and do the T&Cs allow you to obtain and use the data for the project?
    o Description of the variables in each table, e.g. variables in each table, number of records.
    o Description of how you have processed the dataset, e.g., removing missing values, joining tables

  • Exploration and analysis (suggested 500 words for individual report; proportionately longer for group projects). A data science analysis of the paper, including:
    o Visualisations and tables
    o Interpretation of the results
    o Description of how you have applied one or more of the statistical and ML methods learned in FDS to the data
    o Interpretation of the findings

  • Discussion & Conclusions (suggested 400 words)
    o Summary of findings
    o Evaluation of own work: strengths and limitations
    o Comparison with any other related work
    o Improvements and extensions: note that this is just discussing what improvements and extensions you would make if you had more time, not actually implementing them.

  • References: A list of work cited; the template has examples of how to cite various types of work. Please ask if you need more help with citing.
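The processing steps named under Data (removing missing values, joining tables) are easier to describe in the report if the code records how many rows each step affects. The following is only a minimal sketch using pandas; the tables `records` and `lookup` and their column names are hypothetical and do not come from any of the project datasets.

```python
import pandas as pd

# Hypothetical toy tables; real column names will differ per dataset.
records = pd.DataFrame({
    "entity_id": [1, 2, 3, 4],
    "value": [9.5, None, 12.1, 10.3],
})
lookup = pd.DataFrame({
    "entity_id": [1, 2, 3],
    "region": ["North", "South", "East"],
})

# Step 1: drop rows with a missing measurement, and count how many were lost.
n_before = len(records)
clean = records.dropna(subset=["value"])
n_dropped = n_before - len(clean)

# Step 2: left join onto the lookup table.
# validate= raises an error if entity_id is duplicated in `lookup`;
# indicator= adds a "_merge" column showing which rows found a match.
merged = clean.merge(
    lookup,
    on="entity_id",
    how="left",
    validate="many_to_one",
    indicator=True,
)
n_unmatched = int((merged["_merge"] == "left_only").sum())

print(f"dropped {n_dropped} rows with missing values")
print(f"{n_unmatched} rows had no match in the lookup table")
```

Counts such as `n_dropped` and `n_unmatched` can be quoted directly when describing the processing in the Data section.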

Page limits

We will limit the report length depending on whether the project is individual, in pairs, or in threes:

  • For an individual project you can have 6 pages of the main text, including tables and visualisations, with the references section starting at the top of page 7. However, you can have the references within the 6 pages if you want.

  • For a 2-person project you can have 8 pages of the main text, including tables and visualisations, with the references section starting at the top of page 9. However, you can have the references within the 8 pages if you want.

  • For a 3-person project you can have 10 pages of the main text, including tables and visualisations, with the references section starting at the top of page 11. However, you can have the references within the 10 pages if you want.

Figure & Table format

  • Ensure that the font size in the figures is at least 9pt in the actual PDF file you submit (not just specified as 9pt in matplotlib; see the second visualisation lecture for how to get font sizes correct).

  • Do not change the font size in tables.

  • All figures and tables should have a meaningful caption and should be referred to in the text.

  • Note that the plots do not necessarily need to have a title above them; the figure caption (i.e. everything inside the \caption{} in LaTeX) can fulfil that role. However, titles above multiple axes in a figure can make them easier to read.
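A font set to 9pt in matplotlib only appears as 9pt in the PDF if the figure is not rescaled when it is included in the report. One common approach, sketched below, is to create the figure at the physical width it will occupy on the page and include it in LaTeX without a scaling option. The width `FIG_WIDTH_IN` is an assumed placeholder, not a value taken from the supplied template, and the lecture may recommend a different method.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, suitable for scripts
import matplotlib.pyplot as plt

FIG_WIDTH_IN = 3.3  # ASSUMPTION: measure the real text/column width of the template
FONT_PT = 9         # minimum font size required in the submitted PDF

# Apply the same point size to every text element of the figure.
plt.rcParams.update({
    "font.size": FONT_PT,
    "axes.labelsize": FONT_PT,
    "axes.titlesize": FONT_PT,
    "xtick.labelsize": FONT_PT,
    "ytick.labelsize": FONT_PT,
    "legend.fontsize": FONT_PT,
})

# Create the figure at its final printed size (in inches).
fig, ax = plt.subplots(figsize=(FIG_WIDTH_IN, 2.2))
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], label="example")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
fig.tight_layout()

# Save as vector PDF; in practice pass a filename such as "figure.pdf".
buf = io.BytesIO()
fig.savefig(buf, format="pdf")
```

If the saved file is then included with a plain `\includegraphics{...}` (no `width=` or `scale=` option), LaTeX places it at its natural size, so the fonts stay at the size set here.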

Forming groups

You can choose your own groups.

  • If you haven’t found anyone to work with but would like to find prospective group members, please use this form:
    https://forms.office.com/e/r3u5Y1gQW6
    We will try to find you group members with similar project interests. Please fill in this form by 5pm on Thursday 29 February. We will form the groups on Friday morning.

  • We recommend setting up a private repository on GitHub to keep track of your code within your groups.

  • We recognise that individual schedules, preferences, and other constraints might limit your ability to work in a group. The default expectation is that grades for each group member will be the same, but if your statements of how you worked as a group indicate that one member did significantly less than the others, we reserve the right to reduce the mark of that group member.

Please divide up tasks between yourselves; e.g. after an initial discussion, one of you might focus on data cleaning, another on coding, and another on presentation.

Project options

Project option 1: Historical and world-wide trends in ultramarathon running

Ultramarathon running pushes the limits of human endurance, making it a fascinating subject for data analysis. An ultramarathon is defined as any footrace longer than the traditional marathon length (42.2 kilometres) and consequently demands even greater resilience and perseverance of athletes. The following dataset spans over two centuries of ultramarathon race records and offers a unique opportunity to explore global trends, factors influencing performance, and the evolution of the sport: https://www.kaggle.com/datasets/aiaiaidavid/the-big-dataset-of-ultra-marathon-running/data

Everybody (individuals and groups): We would like you to explore the evolution of participation and performance in ultramarathons across the world over time. How has finishing time performance changed for different distances and athlete demographics? Have there been any significant shifts in regional participation patterns? Additionally, we would like you to identify whether certain factors (e.g., event characteristics, athlete attributes) predict better performance. For example, can you develop a model to predict finishing times for future ultramarathon events based on available data?

Groups: The extra questions should extend the basic findings. Examples of questions are:

  • Can you detect any influence of external factors (e.g., climate, terrain, technology, historical events) on finishing times or participation rates?

  • You could also choose to take a ‘deep-dive’ into a shorter time period or the performance of one (or a collection of) athletes with remarkable achievements.

  • Any other questions that arise as you explore the data.
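As an illustration of what "a model to predict finishing times" could mean at its simplest, the sketch below fits an ordinary least-squares linear regression and evaluates it on held-out rows. It is not a recommended solution: the data are synthetic, the predictors `distance_km` and `age` are hypothetical stand-ins, and the real dataset will need cleaning and probably a richer model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in data (NOT from the Kaggle dataset).
distance_km = rng.choice([50.0, 80.0, 100.0, 160.0], size=n)
age = rng.uniform(20, 65, size=n)
# Assumed generating process: hours grow with distance and age, plus noise.
hours = 0.1 * distance_km + 0.05 * (age - 20) + rng.normal(0, 0.5, size=n)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), distance_km, age])

# Hold out the last 50 rows so the error is measured on unseen data.
train, test = slice(0, 150), slice(150, None)

# Ordinary least squares fit on the training rows only.
coef, *_ = np.linalg.lstsq(X[train], hours[train], rcond=None)

# Root-mean-square error of the predictions on the held-out rows.
pred = X[test] @ coef
rmse = float(np.sqrt(np.mean((pred - hours[test]) ** 2)))

print("intercept, per-km, per-year coefficients:", coef)
print("held-out RMSE (hours):", rmse)
```

Reporting the error on held-out rows rather than on the training rows gives a fairer picture of how such a model might behave on future events.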

Project option 2: Analysing the cancellations of planned operations in the Scottish NHS

Healthcare systems worldwide constantly strive to balance patient needs with resource limitations. Understanding the factors contributing to cancelled planned operations is crucial for improving efficiency, reducing patient anxiety, and ensuring equitable access to essential medical services. The following dataset contains information on the number of cancelled planned operations in Scotland by NHS hospitals since 2015, https://www.opendata.nhs.scot/dataset/cancelled-planned-operations. We can gain valuable insights into this issue by analysing the number of cancelled operations, their reasons, and the responsible NHS boards/hospitals.

Everybody (individuals or groups): We would like you to explore the data to report statistics on the number of planned operations and reasons for their cancellation (e.g., clinical, capacity, patient-related) across Scotland, using tables, summary statistics and/or visualisations. Are there any interesting patterns in cancellations that you can identify over time? Additionally, are there significant differences in cancellation rates between different NHS Boards and hospitals? Can you identify regions requiring specific attention?

Groups: The extra questions should extend the basic findings. Examples of questions are:

  • Do specific reasons for cancellation dominate in certain regions or hospitals, suggesting variations in practice or resource management?

  • Do the number of cancellations and their reasons vary significantly by month, suggesting potential resource strain during specific seasons?

  • You could also find additional data to compare the cancellations with a related issue (e.g. drug and alcohol treatment waiting times, https://www.opendata.nhs.scot/dataset/drug-and-alcohol-treatment-waiting-times).

  • You could also identify and explore any outliers of particular interest within the data (e.g. relating to COVID-19 measures).

  • Any other questions that arise as you explore the data.

Note that explanatory data dictionaries relevant to the data can be found under the “Explore > Preview” option at the bottom of the given URL.
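When comparing cancellation rates between boards or months, it matters whether counts are pooled before dividing or whether monthly rates are averaged. The sketch below shows the pooled version with pandas on a made-up table; the column names `month`, `board`, `planned` and `cancelled` are hypothetical, so check the data dictionary for the real field names.

```python
import pandas as pd

# Made-up monthly counts for two fictional boards "A" and "B".
df = pd.DataFrame({
    "month": ["2023-01", "2023-01", "2023-02", "2023-02"],
    "board": ["A", "B", "A", "B"],
    "planned": [1000, 500, 900, 450],
    "cancelled": [100, 25, 45, 45],
})

# Pooled rate per board: sum the counts first, then divide.
by_board = df.groupby("board")[["planned", "cancelled"]].sum()
by_board["rate"] = by_board["cancelled"] / by_board["planned"]

# Pooled rate per month across all boards.
by_month = df.groupby("month")[["planned", "cancelled"]].sum()
by_month["rate"] = by_month["cancelled"] / by_month["planned"]

# For contrast: the unweighted mean of monthly rates for each board.
df["monthly_rate"] = df["cancelled"] / df["planned"]
mean_of_rates = df.groupby("board")["monthly_rate"].mean()

print(by_board)
print(by_month)
print(mean_of_rates)
```

The pooled rate weights each month by how many operations were planned, whereas the mean of monthly rates treats a quiet month the same as a busy one; which is appropriate depends on the question being asked.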

Project option 3: National university student satisfaction survey data

The National Student Survey (NSS) is an annual survey of final-year undergraduate students in UK higher education institutions. It is a valuable source of data on student satisfaction with their courses and universities. The 2023 NSS data release includes responses from over 339,000 students, making it a rich resource for data analysis: https://www.officeforstudents.org.uk/data-and-analysis/national-student-survey-data/download-the-nss-data/

Everybody (individuals or groups): We would like you to explore what factors contribute to higher student satisfaction within and across different subject areas and universities in the UK. How does student satisfaction vary by subject of study? For example, are students in STEM subjects more or less satisfied than students in humanities subjects? Are there any universities that consistently outperform or underperform in terms of student satisfaction? Where appropriate, make sure to relate your answers to the factors underlying student satisfaction as indicated by the survey’s questions.

The website provides various versions of the data; we recommend that you mainly look at the “2023 NSS results by registering provider (full-time) (XLSX, 95.4 MB)” under ‘Provider-level data’.

Groups: The extra questions should extend the basic findings to explore advanced relationships in the data. Examples of questions are:

  • How has student satisfaction changed over time? Are there any trends that can be identified?

  • Are there any trends in satisfaction between England, Scotland, Wales and Northern Ireland?

  • Do students in smaller, niche subjects express different satisfaction levels compared to those in larger, ‘mainstream’ subjects?

  • You may also wish to explore the ‘student characteristic’ data which gives student satisfaction broken down by characteristics such as age, disability, ethnicity (and more).

  • Any other questions that arise as you explore the data.
