
Math 10 Introduction to Programming for Data Science - Final Project: Supervised Learning and Unsupervised Learning


Math 10 Final Project, Winter 2023

Due: 11:59 PM, Wed, March 22, 2023

Submission: Upload the .ipynb file to Canvas.

The submission should be a well-organized report: well-structured sections, high-quality figures, and the necessary descriptions written as markdown in the notebook, with all code cells already executed. Submitting only code and/or incomplete results will severely hurt your grade. Include everything in a single .ipynb file; other formats (.pdf, .doc, ...) and redundant files are not valid and won’t be graded.

Dataset Downloading:

For the final project, we provide the following three datasets, and you are required to use one of them. Click the links for details and to download the data:

1) Tabular Data: Health Insurance Cross Sell Prediction

2) Image Data: American sign language-MNIST https://www.kaggle.com/datasets/datamunge/sign-language-mnist

3) Text Data: TripAdvisor Review https://www.kaggle.com/datasets/andrewmvd/trip-advisor-hotel-reviews

You must pick one of the datasets and complete all the tasks described below. Your choice won’t affect your grade. Of course, exploring all three datasets is especially welcome if you are aiming for an A+ in this course.

Tasks and Grading Policy (20 pts of the total course grade)

Task 1: Data Loading, Processing and Exploratory Data Analysis (EDA) (4 pts)

  • Write code to load the data correctly, and use markdown to state which Python data type/package you selected to store/process the data and briefly explain why.

Additional requirements for each dataset:

Tabular Data: Generate summary statistics for all columns. Write the code and use text to explain how you transform the data into the final number-valued matrix.
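
As a rough illustration of what "transforming to a number-valued matrix" can mean, the sketch below summarizes a numeric column and one-hot encodes string columns using only the standard library. The rows and column names are hypothetical toy values, not the real dataset schema, and one-hot encoding is just one possible choice of transformation.

```python
import statistics

# Hypothetical toy rows; the column names and values are illustrative only.
rows = [
    {"Gender": "Male", "Age": 44, "Vehicle_Damage": "Yes"},
    {"Gender": "Female", "Age": 29, "Vehicle_Damage": "No"},
    {"Gender": "Male", "Age": 35, "Vehicle_Damage": "No"},
]

def summarize(rows, column):
    """Return simple summary statistics for one column."""
    values = [r[column] for r in rows]
    if all(isinstance(v, (int, float)) for v in values):
        return {"mean": statistics.mean(values), "min": min(values), "max": max(values)}
    # Non-numeric column: report the distinct categories instead.
    return {"categories": sorted(set(values))}

def encode(rows):
    """One-hot encode string columns and pass numeric columns through as floats."""
    columns = list(rows[0])
    categories = {c: sorted({r[c] for r in rows})
                  for c in columns if isinstance(rows[0][c], str)}
    header = []
    for c in columns:
        header += [f"{c}={v}" for v in categories[c]] if c in categories else [c]
    matrix = []
    for r in rows:
        vec = []
        for c in columns:
            if c in categories:
                vec += [1.0 if r[c] == v else 0.0 for v in categories[c]]
            else:
                vec.append(float(r[c]))
        matrix.append(vec)
    return header, matrix

age_summary = summarize(rows, "Age")
header, matrix = encode(rows)
print(header)
print(matrix)
```

In a notebook, the same steps would more commonly be done with pandas (for example `DataFrame.describe` and `pandas.get_dummies`); the plain-Python version is shown only to make each step explicit.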

Image Data: For each category, randomly pick one sample image and plot them all together in the same figure.
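
The sampling step could be sketched as below, assuming the labels are available as a plain list with one entry per image. The helper name `one_index_per_label` and the toy labels are hypothetical; the plotting itself is left out.

```python
import random
from collections import defaultdict

def one_index_per_label(labels, seed=0):
    """Pick one random sample index for every distinct label."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return {label: rng.choice(by_label[label]) for label in sorted(by_label)}

# Hypothetical toy labels standing in for the real label column.
labels = [0, 2, 1, 0, 2, 1, 1]
picked = one_index_per_label(labels)
print(picked)
```

Each selected index can then be drawn in its own panel of a single figure, for instance a grid of matplotlib subplots with the category as the panel title.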

Text Data: Use text and code to show the whole process of transforming the texts into vectors/matrices.
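
One simple way to turn text into a matrix is a bag-of-words count representation. The sketch below is a minimal version of that idea, assuming plain lowercase word tokens; the two reviews are hypothetical and are not taken from the TripAdvisor dataset. Other representations (such as TF-IDF) would be equally valid choices.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}

def vectorize(documents, vocabulary):
    """Return a document-term count matrix as a list of rows."""
    matrix = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        row = [0] * len(vocabulary)
        for tok, n in counts.items():
            if tok in vocabulary:  # ignore tokens unseen when the vocabulary was built
                row[vocabulary[tok]] = n
        matrix.append(row)
    return matrix

# Hypothetical toy reviews for illustration only.
reviews = ["Great hotel, great staff", "Terrible room, noisy hotel"]
vocab = build_vocabulary(reviews)
X = vectorize(reviews, vocab)
print(vocab)
print(X)
```

Each row of `X` corresponds to one review and each column to one vocabulary word, so the matrix can be passed directly to a learning algorithm.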

Task 2: Supervised Learning (8 pts)

  • Define a meaningful supervised learning problem for the dataset in a markdown cell. Use code to assign the appropriate 1) training data, 2) test data and 3) labels.
  • Use at least three supervised learning models to solve the problem. If you use a third-party package, explain why you chose it. If you use custom code, write documentation strings. If the package has not been covered in this course, cite the original resources and explain your understanding of the package, or you won’t receive credit for it.
  • Choose at least one method and use text and equations to describe the algorithm in more detail. It’s okay to refer to lecture notes/discussion files, but please rephrase.
  • Choose at least one method (it can be the same one or a different one) and change a hyperparameter of the model to see how the performance changes. Use a markdown cell to describe your findings.
  • Describe how you evaluate the performance of all the methods, and what the results are. Be specific about the performance on both training and test data.
  • It’s totally fine to drop some variables from the original data (e.g. simple feature selection), especially in the tabular dataset, as long as you can explain your rationale.
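
To make the split, hyperparameter, and evaluation bullets concrete, here is a minimal sketch using a hand-written k-nearest-neighbours classifier on synthetic data. It covers only one model (the task asks for at least three), the data are randomly generated for illustration, and the choice of k-NN and of the values of `k` is an assumption rather than a recommendation.

```python
import math
import random
from collections import Counter

def train_test_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle indices with a fixed seed and split into train and test parts."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])

def knn_predict(X_train, y_train, x, k):
    """Predict the majority label among the k nearest training points."""
    nearest = sorted(range(len(X_train)),
                     key=lambda i: math.dist(X_train[i], x))[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

def accuracy(X_train, y_train, X_eval, y_eval, k):
    """Fraction of evaluation points whose predicted label is correct."""
    preds = [knn_predict(X_train, y_train, x, k) for x in X_eval]
    return sum(p == t for p, t in zip(preds, y_eval)) / len(y_eval)

# Synthetic two-class data (hypothetical, for illustration only).
rng = random.Random(42)
X = [[rng.gauss(c * 3, 1.0), rng.gauss(c * 3, 1.0)]
     for c in (0, 1) for _ in range(40)]
y = [c for c in (0, 1) for _ in range(40)]

X_train, y_train, X_test, y_test = train_test_split(X, y)

# Vary the hyperparameter k and report both training and test accuracy.
results = {}
for k in (1, 5, 15):
    results[k] = (accuracy(X_train, y_train, X_train, y_train, k),
                  accuracy(X_train, y_train, X_test, y_test, k))
    print(f"k={k}: train accuracy={results[k][0]:.2f}, "
          f"test accuracy={results[k][1]:.2f}")
```

Reporting training and test accuracy side by side for each hyperparameter value makes it easier to discuss overfitting: with `k=1` every training point is its own nearest neighbour, so the training accuracy says little about how the model generalizes.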

Task 3: Unsupervised Learning (6 pts)

  • Choose at least two unsupervised learning methods to analyze the data. For at least one method, use text and equations to describe the algorithm in more detail.
  • Describe your findings from the unsupervised learning task using markdown cells.
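
As an example of the level of description asked for, and assuming k-means is one of the chosen methods (the handout does not prescribe any particular method), the algorithm can be stated as minimizing the within-cluster sum of squared distances, where $C_1, \dots, C_K$ are the clusters and $\mu_k$ is the centroid of cluster $C_k$:

```latex
\min_{C_1, \dots, C_K} \; \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
```

The usual iterative procedure alternates between assigning each point $x_i$ to the cluster with the nearest centroid and recomputing each $\mu_k$ as the mean of its assigned points, until the assignments stop changing.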

Task 4: Organizing your report (2 pts)

These 2 pts will be determined by the overall quality of your report in .ipynb form (judged by the instructor when grading). There is no guarantee that you will get the full 2 points for submitting a merely correct (rather than nearly perfect) report. In other words, it’s entirely possible to receive zero on this task if your report merely copies and pastes code and markdown from lecture notes/discussion files.

Try to write the descriptions and code (including documentation strings and comments) in a well-organized and logical way, and generate high-quality figures. A practical tip is to imagine that you’re writing a thesis rather than just code. You therefore need to include:

  • Meaningful title of the whole project reflecting its scientific contents
  • Sections and subsections at different heading levels using markdown language
  • LaTeX equations when introducing the methods used
  • Abstract/conclusion/transition paragraphs
  • Clear figure axis labels when plotting
  • Document strings or comments for long functions

You may also consult highly voted Kaggle notebooks for each dataset as possible examples. In particular, repeating the words of this file or copying the task requirements is neither necessary nor encouraged. I expect to see your own results and the associated descriptions/explanations.

Other Requirements/Resources:

  1. Each student should work on the final project independently. Direct discussion of the content (especially debugging) with other students, the TA, or the teacher is NOT allowed. Violations of the academic integrity rules will be reported to the department.
  2. Make sure to submit the final project .ipynb file to Canvas before the deadline. The deadline will not be extended.
  3. Computer/software issues are not a valid excuse for submitting incomplete results: we have already tested the datasets and basic tasks on a personal laptop that satisfies the minimum requirements set by the university, and we have also introduced free Kaggle and Google Colab resources for running the code in the cloud.
