
Assignment No 2: Predicting Housing Prices with Polynomial Linear Regression



Introduction

In the lectures and tutorials, which are based on https://github.com/materials-discovery/ML_tutorials, we have learned how to manipulate data in Python using libraries such as Pandas and NumPy. We have also learned how to visualise the data we process using Matplotlib and Seaborn, and we have worked with loops and conditional statements. Having mastered these tools, we moved on to machine learning techniques such as regression and classification, and built different models using algorithms such as Random Forest, Linear Regression, and Support Vector Machine.

The purpose of this assignment is to estimate housing prices using polynomial linear regression.

Problem Statement:

You work for a real estate company that wants to predict the price of a house based on its size, number of bedrooms, and other features. You are given a dataset that contains information about houses, including their size, number of bedrooms, number of bathrooms, and other relevant features, as well as the price at which they were sold. Your task is to build a machine learning model that can predict the price of a house based on these features.

Dataset:

The dataset contains 13 columns: a record Id, 11 independent features, and the target SalePrice:

1. Id: To count the records.
2. MSSubClass: Identifies the type of dwelling involved in the sale.
3. MSZoning: Identifies the general zoning classification of the sale.
4. LotArea: Lot size in square feet.
5. LotConfig: Configuration of the lot.
6. BldgType: Type of dwelling.
7. OverallCond: Rates the overall condition of the house.
8. YearBuilt: Original construction year.
9. YearRemodAdd: Remodel date (same as construction date if no remodeling or additions).
10. Exterior1st: Exterior covering on house.
11. BsmtFinSF2: Type 2 finished square feet.
12. TotalBsmtSF: Total square feet of basement area.
13. SalePrice: To be predicted.

The dataset can be downloaded from the following Google spreadsheet: https://docs.google.com/spreadsheets/d/1caaR9pT24GNmq3rDQpMiIMJrmiTGarbs/edit#gid=1150341366
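One possible way to get the spreadsheet into pandas is sketched below. It assumes the sheet is shared publicly and that Google's `export?format=csv` URL pattern works for this file. The helper names `sheet_csv_url` and `load_housing` are hypothetical and only used for illustration. If the export link does not work, downloading the file manually and reading it with `pd.read_csv` or `pd.read_excel` is a safe alternative.

```python
import re

import pandas as pd

SHEET_URL = (
    "https://docs.google.com/spreadsheets/d/"
    "1caaR9pT24GNmq3rDQpMiIMJrmiTGarbs/edit#gid=1150341366"
)


def sheet_csv_url(edit_url: str) -> str:
    """Turn a Google Sheets 'edit' link into a CSV export link (assumed URL pattern)."""
    sheet_id = re.search(r"/spreadsheets/d/([^/]+)", edit_url).group(1)
    gid_match = re.search(r"gid=(\d+)", edit_url)
    gid = gid_match.group(1) if gid_match else "0"  # fall back to the first tab
    return (
        f"https://docs.google.com/spreadsheets/d/{sheet_id}"
        f"/export?format=csv&gid={gid}"
    )


def load_housing(edit_url: str = SHEET_URL) -> pd.DataFrame:
    """Download the sheet as CSV (needs network access and a publicly shared sheet)."""
    return pd.read_csv(sheet_csv_url(edit_url))
```

Calling `df = load_housing()` followed by `df.info()` and `df.describe()` gives a first look at column types and missing values before the exploratory analysis.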

Your tasks

Steps:

1) Import the dataset into a pandas DataFrame and perform exploratory data analysis (EDA) to understand the structure of the data. Plot the correlation matrix between the features and the target.

2) Clean and preprocess the data: handle missing values, encode categorical variables, and scale the data.

3) Split the data into training and testing sets, initially using the simple holdout method.

4) Build a polynomial linear regression model of degree 2 using the scikit-learn library in Python, and extract the coefficients to write out the mathematical model.

5) Train the model again on the training data using 5-fold cross-validation, and evaluate its performance on the testing data using appropriate evaluation metrics such as mean squared error (MSE) and R-squared.

6) Optimise the hyperparameters of the model using grid search, then randomized search, both with 5-fold cross-validation. Compare the two search methods and show the outcome of each search.

7) Rebuild the model using the optimised hyperparameters, without cross-validation, and visualise the results of the regression with graphs such as scatter plots, residual plots, and prediction plots.

8) Compare the model from step 7 with the model from step 4. Why is this relevant?

9) Present the results in a report, including the methodology, results, and conclusions.

Note: You can also experiment with different polynomial degrees to see which one gives the best results.
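The hyperparameter search in step 6, including the polynomial degree mentioned in the note, could look roughly like the sketch below. The data is synthetic, and the choice of hyperparameters (`degree`, `interaction_only`, `fit_intercept`), the candidate values, and `n_iter=6` are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    train_test_split,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data whose true relationship is quadratic (made up for illustration).
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(400, 3))
y = (
    1.5 * X[:, 0] ** 2 - 2.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2]
    + rng.normal(0, 0.3, 400)
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(include_bias=False)),
    ("reg", LinearRegression()),
])

# Candidate hyperparameters, addressed as <step name>__<parameter>.
param_space = {
    "poly__degree": [1, 2, 3, 4],
    "poly__interaction_only": [False, True],
    "reg__fit_intercept": [True, False],
}

# Grid search tries all 16 combinations with 5-fold cross-validation.
grid = GridSearchCV(pipe, param_space, cv=5, scoring="neg_mean_squared_error")
grid.fit(X_train, y_train)

# Randomized search samples only 6 of the combinations.
rand = RandomizedSearchCV(
    pipe, param_space, n_iter=6, cv=5,
    scoring="neg_mean_squared_error", random_state=1,
)
rand.fit(X_train, y_train)

for name, search in [("grid", grid), ("randomized", rand)]:
    print(
        name, search.best_params_,
        "CV MSE:", round(-search.best_score_, 4),
        "test R2:", round(search.best_estimator_.score(X_test, y_test), 4),
    )
```

Grid search is exhaustive, so its best cross-validated score can never be worse than the randomized one over the same candidates. Randomized search trades that guarantee for fewer model fits, which matters more as the search space grows. Both objects refit the best configuration on the full training set by default, which is available as `best_estimator_`.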

Bonus:

10) Perform feature engineering by adding features such as the square or cube of one of the existing features, and assess how this affects the robustness of the predictive model.

11) Experiment with different feature selection methods, such as recursive feature elimination (RFE), to identify the most relevant features for the model.

12) Try different types of models, such as support vector regression or random forest regression, and compare their performance with polynomial linear regression.
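The bonus steps can be sketched in the same way. The example below uses synthetic data in which only two of five columns carry signal. The engineered feature, the number of features kept by RFE, and the model settings are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

# Synthetic data: only columns 0 and 1 influence the target (made up).
rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, size=(300, 5))
y = 3.0 * X[:, 0] ** 2 + 2.0 * X[:, 1] + rng.normal(0, 0.3, 300)

# Step 10: feature engineering, appending the square of column 0 as column 5.
X_eng = np.column_stack([X, X[:, 0] ** 2])

# Step 11: recursive feature elimination keeps the 2 strongest features.
# Features are scaled first so coefficient magnitudes are comparable.
X_scaled = StandardScaler().fit_transform(X_eng)
rfe = RFE(LinearRegression(), n_features_to_select=2).fit(X_scaled, y)
selected = np.flatnonzero(rfe.support_)
print("Selected column indices:", selected)

# Step 12: compare model families with the same 5-fold cross-validation.
models = {
    "polynomial (degree 2)": make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree=2, include_bias=False),
        LinearRegression(),
    ),
    "SVR (RBF kernel)": make_pipeline(StandardScaler(), SVR()),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: mean CV R2 = {score:.3f}")
```

On the real dataset the ranking of the models may well differ, so the comparison is only meaningful when every model is evaluated with the same folds and the same metric.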

Have fun!
