MSBA - Bootcamp
Group Project Part 1
(Both Part 1 and Part 2 are based on the data (Ford-Data-Group Project.xlsx) provided with this assignment.)
There is also an optional (bonus) question at the end. This question does not require coding/programming.
Assignment type | Group Project Part 1 |
Delivery format | · A single PDF (ONLY the PDF format): Written responses to questions + code file (saved as PDF and appended at the end of written responses)
· Jupyter code file (.py, same file as the one you appended in PDF format) |
Points | 15 |
Due Date | Aug 22nd, 5pm ET |
Disclosure | Please read the description below, understand its meaning and implications, and include it on the top of the first page of your assignment.
The work I (or We) submitted as part of this assignment is original, and due credit is given to others where appropriate. I (or We) accept and acknowledge that I am (or We are) solely responsible if the assignment is found to be plagiarized in any way, and I (or We) will be subject to the School’s Academic Integrity policy. Name and ID (of all members in the group): Date: |
Suppose the data was provided to you by a client who is a leading automobile dealer in the UK. The client wants to understand the data (e.g., basic summary statistics, key trends, etc.) and also wants to build a regression model for the prices of used Ford cars.
Your responses to the following questions must be based on appropriate data analysis using Python. You are free to decide what code to write and which steps to include in it, wherever you feel it is necessary.
(Note: you are the one running the analysis, and choosing the specific steps within it, to help your client. The client does not know, and cannot advise you on, what, how, or where to write code or perform other types of analysis.)
1. (5 points)
a. Describe your data by analyzing summary statistics of variables, groupby summary statistics, etc. You have to identify appropriate variables for groupby and produce summary statistics based on that for your description (hint: the client is interested in understanding how prices vary across attributes (model, transmission, etc.) and price is the only variable which the client can manipulate so it is the key variable for them!)
You may use the following article for help on how to describe a dataset
Describing a dataset. Anil Doshi. Retrieved from https://www.scribd.com/document/495644797/How-to-Describe-a-Data-Set
b. The description should be no more than 300-400 words, and you should limit it to explaining 3-4 key patterns and trends (i.e., you do not have to write about every little thing you notice in the data). Be specific in your description and make sure it matches the summary statistics you produced in step 1a.
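As a starting point for step 1a, the following is a minimal sketch of overall and groupby summary statistics in pandas. It assumes the column names are `model`, `transmission`, `price`, and `mileage` (check them against the actual file), and it uses a few made-up rows in place of the real data so that it runs on its own.

```python
import pandas as pd

# Made-up rows for illustration only. With the real data you would load
# the workbook instead, e.g. pd.read_excel("Ford-Data-Group Project.xlsx").
# Column names below are assumptions; adjust them to match the file.
df = pd.DataFrame({
    "model": ["Fiesta", "Fiesta", "Focus", "Focus", "Kuga", "Kuga"],
    "transmission": ["Manual", "Automatic", "Manual", "Manual", "Automatic", "Manual"],
    "price": [9000, 11000, 12000, 13000, 18000, 16000],
    "mileage": [30000, 20000, 25000, 15000, 10000, 22000],
})

# Overall summary statistics for the numeric columns.
overall = df[["price", "mileage"]].describe()

# Price summary for each model.
price_by_model = df.groupby("model")["price"].agg(
    ["count", "mean", "median", "min", "max"]
)

# Mean price for each model/transmission combination.
price_by_model_trans = df.groupby(["model", "transmission"])["price"].mean()

print(overall)
print(price_by_model)
print(price_by_model_trans)
```

Which variables to group by is part of the question, so the two groupings shown here are only examples.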
2. (5 points)
a. For price, mileage, tax, and mpg, write code to identify whether these variables have outlier values in the data.
b. For the variables that have outlier values, describe how the outliers differ from the regular pattern in the data (200-300 words).
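One common way to flag outliers is the 1.5 × IQR rule, sketched below. This is only one possible definition of an outlier (z-scores or box plots are alternatives), the helper name `iqr_outliers` is hypothetical, and the rows are made up so that the example runs on its own.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
df = pd.DataFrame({
    "price": [9000, 11000, 12000, 13000, 12500, 60000],
    "mileage": [30000, 20000, 25000, 15000, 22000, 21000],
    "tax": [145, 150, 145, 160, 150, 155],
    "mpg": [55.0, 57.0, 60.0, 58.0, 56.0, 59.0],
})

def iqr_outliers(series, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (series < lower) | (series > upper)

# Number of flagged values per variable.
outlier_counts = {
    col: int(iqr_outliers(df[col]).sum())
    for col in ["price", "mileage", "tax", "mpg"]
}
print(outlier_counts)

# The flagged rows themselves, useful for describing how they differ.
print(df[iqr_outliers(df["price"])])
```

Looking at the flagged rows next to the overall summary statistics makes it easier to describe how they depart from the regular pattern.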
3. (5 points)
a. For price, mileage, tax, and mpg, write code to produce a correlation matrix among these variables. Explain the key patterns and trends you notice based on the correlation analysis (200-300 words).
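A correlation matrix for these four variables can be produced with `DataFrame.corr`, which computes Pearson correlations by default. The sketch below uses made-up rows and assumed column names, so the numbers it prints say nothing about the real dataset.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
df = pd.DataFrame({
    "price":   [9000, 11000, 12000, 13000, 18000, 16000],
    "mileage": [40000, 30000, 25000, 20000, 8000, 12000],
    "tax":     [145, 145, 150, 150, 160, 155],
    "mpg":     [60.0, 58.0, 57.0, 55.0, 48.0, 50.0],
})

# Pairwise Pearson correlations among the four variables.
corr = df[["price", "mileage", "tax", "mpg"]].corr()
print(corr.round(2))
```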
Group Project Part 2
Assignment type | Group Project Part 2 |
Delivery format | · A single PDF (ONLY the PDF format): Written report based on the analysis + code file (saved as PDF and appended at the end of the report)
· Jupyter code file (.py, same file as the one you appended in PDF format)
· Page limit for the written report: 3 pages (excluding title page, table of contents, appendix, code file appended, etc.) |
Points | 15 |
Due Date | Aug 29th, 11:59 pm ET |
Disclosure | Please read the description below, understand its meaning and implications, and include it on the top of the first page of your assignment.
The work I (or We) submitted as part of this assignment is original, and due credit is given to others where appropriate. I (or We) accept and acknowledge that I am (or We are) solely responsible if the assignment is found to be plagiarized in any way, and I (or We) will be subject to the School’s Academic Integrity policy. Name and ID (of all members in the group): Date: |
The client wants to run a regression model with price as the dependent variable, but is not sure which independent variables to include or exclude, or how to think about the different steps in the analysis. The client needs your help in understanding how prices change with the variables in the dataset, and is interested in building a predictive model of price based on regression analysis.
Your responses to the following questions must be based on appropriate data analysis using Python. You are free to decide what code to write and which steps to include in it, wherever you feel it is necessary. (Remember: you are the one running the analysis, and choosing the specific steps within it, to help your client. The client does not know, and cannot advise you on, what, how, or where to write code or perform other types of analysis.)
1. (15 points)
a. Which variables would you suggest including in (and excluding from) the regression (or segmentation) analysis, and why? (4 points)
b. Based on the variables included in the analysis from step 1a, estimate two regression models and provide a comparative analysis of the results (e.g., R-sq, estimated coefficients, contribution of each independent variable in explaining price, etc.) (8 points)
c. Which of the two regression models from step 1b would you propose to the client as the final recommendation, and why? (3 points)
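The comparison in step 1b can be sketched as fitting two ordinary least squares models and placing their R-sq, adjusted R-sq, and coefficients side by side. The example below uses plain NumPy least squares as a stand-in for whichever regression library you prefer. The helper `fit_ols`, the choice of predictors, and the rows are all hypothetical and only illustrate the mechanics.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
df = pd.DataFrame({
    "price":   [9000, 11000, 12000, 13000, 18000, 16000, 10000, 14500],
    "mileage": [40000, 30000, 25000, 20000, 8000, 12000, 36000, 15000],
    "mpg":     [60.0, 58.0, 57.0, 55.0, 48.0, 50.0, 62.0, 52.0],
})

def fit_ols(data, predictors, target="price"):
    """Fit OLS with an intercept; return R-sq, adjusted R-sq, and coefficients."""
    X = data[predictors].to_numpy(dtype=float)
    X = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    y = data[target].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1 - ss_res / ss_tot
    n, p = len(y), len(predictors)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    coefs = dict(zip(["intercept"] + predictors, beta))
    return {"r2": r2, "adj_r2": adj_r2, "coefs": coefs}

# Two candidate specifications: one predictor versus two.
model_a = fit_ols(df, ["mileage"])
model_b = fit_ols(df, ["mileage", "mpg"])

for name, m in [("Model A", model_a), ("Model B", model_b)]:
    print(name, "R-sq:", round(m["r2"], 3), "adj. R-sq:", round(m["adj_r2"], 3))
    print("  coefficients:", {k: round(float(v), 4) for k, v in m["coefs"].items()})
```

Adding a predictor never lowers R-sq in a nested model, so adjusted R-sq (or out-of-sample error) is usually the fairer basis for the recommendation in step 1c.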
Bonus (Optional) Question (does not require coding/programming)
Q. As a fresh graduate of the MSBA program with a good job, you are now planning to purchase a house in Boston. Currently, there are 6000 houses listed on the market that are within your budget. There are two types of houses on the market, good (G) and bad (B): 3600 are good and 2400 are bad.
The seller knows EXACTLY the type of house s/he is selling, but you cannot distinguish between a good and a bad house. However, you can hire a professional home inspector to inspect the house before you make an offer to the seller. The home inspector can provide you with a report indicating whether the house is G or B. The report, however, is not always error free. It correctly identifies a good house 100% of the time but correctly identifies a bad house only 70% of the time. In other words, if the house is G, the report will show it is G 100% of the time, but if the house is B, the report will show it is B only 70% of the time.
Each house, whether G or B, is listed for $500k in the market. Your valuation* for a good house is $510k, and for a bad house is $410k. The seller’s valuation for a good house is $495k and for a bad house is $405k.
(* For a seller, valuation $v1 means the seller will accept any offer of $v1 or above. For a buyer, valuation $v2 means the maximum amount the buyer is willing to pay for the house)
A. What are the possible decisions you can take based on the home inspector's report? Please write out all possible decisions using notation or simple wording.
B. Please explain the decisions in QA intuitively. For example, if a friend (who is not analytically savvy and does not understand technical wording and explanations) asks you about your possible decisions, how would you explain them to him/her?
C. Suppose you fully trust the home inspection report, i.e., if the report says it is G, then you also believe it is G and if the report says it is B, then you also believe it is B. What would be your expected valuation of the house under this decision? Please show your steps in calculations.
D. Suppose you make an offer to a seller at a price equal to your expected valuation in QC. What are the seller's possible actions? And which action(s) is the seller going to choose?
E. Suppose the seller accepts your offer from QD. Are you going to proceed with the next step of purchasing the house, or will you back out of the deal (assuming you can legally back out at this stage with no penalty)? Please provide a brief explanation for your choice.
F. If you back out of the deal in QE, how are you going to plan for purchasing a house in the future? Are there any creative conditions or clauses you could add to your offer to potential sellers, i.e., something that can help protect both you and the seller and may help you identify the type of the house**?
(** You made an offer in QD based on your expected valuation, and we assume that in QE you rationally backed out of the deal. The same outcome will occur if you keep making offers based on expected valuation, so you have to find a creative but logical way to make sure a deal happens.)