STAT 1261 - PRINCIPLES OF DATA SCIENCE - Project Guidelines

STAT 1261/2260 Project Guidelines

Teams

          Team size

         2 to 4 students.

         I will consider smaller teams, but you must give a convincing justification. In particular, you need to convince me that you have the capacity to complete the project with fewer people.

          Team request

         Email me with your request to form a team.

         One team member should send the email, with a Cc to the other members. The email should include a list of the team members and their majors.

Scope

I only expect you to use the techniques that I have shown you in the lectures. You should not use any technique that you do not understand.

I would rather you do relatively simple, clear analyses using simple techniques than complex analyses that you do not fully understand. Your job as a data scientist is to draw clear conclusions from data.

Suggested Structure

See the rubric for the requirements of your project files.

I also recommend that you:

          Download the data you are working on, and save it with your project files. Leave instructions on how I can get the original data that you downloaded.

          Consider using a “setup” file, such as a markdown file, that runs once, to set up the project. For example, it might install any libraries that you will use. It may download the data from a URL and save it to your project directory.
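
As a rough illustration of the “setup” file idea, here is a minimal sketch in base R. The function names, the package list, the URL and the file path are all hypothetical placeholders and not part of the course materials; adapt them to your own project.

```r
# Hypothetical setup script (for example, setup.R or a chunk in a setup .Rmd).
# All names below are illustrative assumptions.

# Return the packages from `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Download `url` to `dest` only if `dest` does not already exist.
# Returns TRUE if a download happened, FALSE if the local copy was reused.
download_if_absent <- function(url, dest) {
  if (file.exists(dest)) {
    return(FALSE)
  }
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  download.file(url, destfile = dest, mode = "wb")
  TRUE
}

# Example usage (placeholder package list and URL; replace with your own):
# to_install <- missing_packages(c("ggplot2", "dplyr"))
# if (length(to_install) > 0) install.packages(to_install)
# download_if_absent("https://example.org/data.csv", "data/raw/data.csv")
```

Skipping the download when the file already exists means the setup can be re-run safely, and the saved copy stays with your project files as recommended above.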

Part I and Part II of the Project

Part I Data Wrangling and Visualization

The first part of the project concentrates on data exploration. Your task is to select a dataset and carry out a comprehensive analysis of it, using graphical displays and summary statistics to gain insights. To illustrate, try to address the following questions:

          What is the distribution of each variable? Is the distribution roughly symmetric? Are there any extreme values? Is it approximately normal?

          What relationships can be observed between the variables? You can employ graphs and descriptive statistics to answer this question.

          Based on the answers to the preceding questions, what hypotheses can you formulate for testing? Which variable can be considered as a response variable? Which variables appear to be valuable in the estimation or prediction of the response variable?
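
The first two questions above could be approached along the following lines. This is only a sketch in base R: the built-in mtcars data set and the three columns chosen are stand-ins for your own data, and the Shapiro-Wilk test is an assumption about what you may want to use. Prefer whichever tools were shown in the lectures.

```r
# Stand-in data: the built-in mtcars data set (an assumption, not a suggested project data set).
dat <- mtcars[, c("mpg", "wt", "hp")]

# Distribution of each variable: numerical summary, then a histogram
# and a normal Q-Q plot for one variable.
summary(dat)
hist(dat$mpg, main = "Distribution of mpg", xlab = "Miles per gallon")
qqnorm(dat$mpg)
qqline(dat$mpg)

# Extreme values in each variable under the 1.5 * IQR boxplot rule.
extreme_values <- lapply(dat, function(x) boxplot.stats(x)$out)

# A formal check of approximate normality for one variable (Shapiro-Wilk test).
normality <- shapiro.test(dat$mpg)

# Relationships between variables: pairwise correlations and scatterplots.
cor_matrix <- cor(dat)
pairs(dat)
```

The correlation matrix and scatterplots can then suggest which variable to treat as the response and which variables look useful for predicting it.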

Part II Modeling

Based on your findings from Part I, choose a few machine learning models to fit. Make sure that you tune their parameters to optimize the models according to a model assessment criterion.
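
One way to tune a parameter against an assessment criterion is sketched below, using 5-fold cross-validation in base R. Everything specific here is an assumption for illustration: the built-in mtcars data, the model family (polynomial regression of mpg on hp), the tuning parameter (polynomial degree) and the criterion (cross-validated RMSE). In your project, use the models and criteria covered in the lectures.

```r
set.seed(1261)  # make the random fold assignment reproducible

dat <- mtcars[, c("mpg", "hp")]  # stand-in data (assumption)
k <- 5
folds <- sample(rep(1:k, length.out = nrow(dat)))

# Cross-validated RMSE for a polynomial of the given degree:
# the square root of the average per-fold mean squared error.
cv_rmse <- function(degree) {
  fold_mse <- sapply(1:k, function(i) {
    train <- dat[folds != i, ]
    test  <- dat[folds == i, ]
    fit   <- lm(mpg ~ poly(hp, degree), data = train)
    mean((test$mpg - predict(fit, newdata = test))^2)
  })
  sqrt(mean(fold_mse))
}

# Evaluate each candidate value of the tuning parameter and keep the best one.
degrees <- 1:4
results <- data.frame(degree = degrees, rmse = sapply(degrees, cv_rmse))
best_degree <- results$degree[which.min(results$rmse)]
results
```

Because each candidate is scored on data that was held out from fitting, the comparison is less likely to reward overfitting than a comparison based on the training error.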

Project Report

The first part of the project report should be roughly between 1000 and 1500 words of explanatory text and code, and the final report (including Parts I and II) should be between 1500 and 3000 words, not including figures and tables.

It can be in the form of a PDF, Word, or HTML document. Note that your R Markdown file needs to be submitted as well.

Project Calendar

Week   Date (Monday)   Task
1      Aug. 28         Start to form teams
2      Sept. 4         Data exploration
3      Sept. 11        Data exploration
4      Sept. 18        Teams finalized
5      Sept. 25        Start to work on Part 1
6      Oct. 2
7      Oct. 9
8      Oct. 16         Part 1 Due on Oct. 20
9      Oct. 23         Start to work on Part 2
10     Oct. 30
11     Nov. 6
12     Nov. 13
13     Nov. 20         Thanksgiving Break
14     Nov. 27         Part 2 Due on Dec. 1
15     Dec. 4          Project Presentations

Presentation

The project presentation is a short talk about your project, given to me, your instructor, and the rest of the class.

Presentation Details

          The project presentations should be 5-7 minutes.

          You should send your slides to me the day before the presentation day.

          I will video record the presentations, to make sure that the grading is fair and consistent.

Goal of the presentation

The goal of the project presentation is to get quickly to your main conclusions, and the evidence for these conclusions. A good presentation will help your listeners engage with your analysis, and think about new questions to ask. The focus of the presentation should be on the following:

          Summarize your data.

          Describe your main analysis strategy.

          Describe your main findings.

          Draw conclusions with care, citing evidence from your data, and from any relevant literature.

What if you get the “opposite” conclusion or no conclusion?

If all your analyses have so far proved negative or inconclusive, that’s fine too. Say what you tried, what evidence you were able to find, and whether you need new evidence or a new analysis strategy. You might also conclude that your initial hypothesis was wrong, and that the data gives evidence against it.

Who will do the presentation?

Please discuss and decide with your teammates who prepares and gives the presentation.

Data for Projects

Your task is open-ended, as is typical of real data analysis projects. Each project will go in a different direction, and you will find that your group becomes expert in interpreting your own data. You might even end up writing a little paper from your report.

Data Sets Suggested

If this is the first data science course for you or you have no previous experience in data analysis, I recommend you use one of the following data sets from Kaggle.

1. Heart Failure Prediction (4 kB)

2. Data Science Job Salaries (8 kB)

3. Sleep Health and Lifestyle Dataset (3 kB)

4. Heart Attack Analysis & Prediction Dataset (4 kB)

5. Airline Passenger Satisfaction (3.04 MB)

6. Credit Risk of Customers (19 kB)

7. American Citizens Annual Income (343 kB)

8. Loan Approval Prediction Data (83 kB)

9. Travel Insurance Prediction Data (13 kB)

10. Employee Satisfaction Index Dataset (8 kB)

Other Data Sources

Feel free to use data from your own discipline. Ask around to see if you can find interesting data from the University of Pittsburgh, maybe in your School.

Below are some links for finding data sets.

          Kaggle Datasets: Kaggle has a large list of datasets. You may use filters to choose the format and the size of the data set. I suggest you use small to medium-sized (<5MB) data sets because otherwise it will take a long time to fit and tune models.

          Google Dataset Search: Try the link; you’ll get the idea.

          World Bank Data: The site has a lot of data on global development and related issues.

          UK Government Open Data
