DS 64510 Linear Models Homework Assignment 1
1. Conduct some exploratory analysis:
   a. How many observations are in the data? How many variables?
   b. Make a scatterplot of educ (horizontal axis) vs. wage (vertical axis). Does the plot suggest an overall linear relationship between the two variables?
   c. What is the correlation between educ and wage?

2. Fit a linear model with wage as the response variable and educ as the predictor variable.
   a. Report and interpret the value $\hat{\beta}_0$.
   b. Report and interpret the value $\hat{\beta}_1$.

3. Refer to the model fit in the previous problem.
   a. Compute the predicted value of wage for the first individual in the data set.
   b. Compute the residual for the prediction in the previous part.
   c. Compute the mean of all the residuals. Do you think this value would be similar for any linear model?

4. Let’s now think more generally and connect to previous concepts. Recall the sample mean $\bar{x}$. In Probability and Statistics you learned about the sampling distribution of $\bar{x}$.
   a. State the definition of the sampling distribution of $\bar{x}$. You can start with “If we sampled many times from the population,…”
   b. Once you recall the definition of the sampling distribution of $\bar{x}$, apply the concept to linear regression and state the definition of the sampling distribution of $\hat{\beta}_1$. Start the same way: “If we sample many times from the population…”
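
The quantities named in Problems 1 through 4 can be illustrated with a minimal sketch in Python. This is not the assignment's data set or a required workflow: the data below are synthetic, the helper names fit_simple_ols and simulate_sample are hypothetical, and the "population" values (intercept -1.0, slope 0.5, noise standard deviation 3.0) are made up for illustration. Only the variable names educ and wage are borrowed from the assignment.

```python
import numpy as np


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    b1 = np.sum(x_centered * (y - y.mean())) / np.sum(x_centered ** 2)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1


def simulate_sample(rng, n, beta0=-1.0, beta1=0.5, sigma=3.0):
    """Draw one synthetic sample. All parameter values are assumptions."""
    educ = rng.integers(8, 19, size=n).astype(float)  # whole years, 8 to 18
    wage = beta0 + beta1 * educ + rng.normal(0.0, sigma, size=n)
    return educ, wage


rng = np.random.default_rng(0)
educ, wage = simulate_sample(rng, n=500)

# Problem 1: number of observations and the educ-wage correlation.
n_obs = educ.shape[0]
corr = np.corrcoef(educ, wage)[0, 1]

# Problem 2: estimated intercept and slope.
b0_hat, b1_hat = fit_simple_ols(educ, wage)

# Problem 3: fitted value and residual for the first individual,
# then the mean of all residuals.
fitted = b0_hat + b1_hat * educ
residuals = wage - fitted
first_fitted = fitted[0]
first_residual = residuals[0]
mean_residual = residuals.mean()

# Problem 4: approximate the sampling distribution of the slope estimate
# by drawing many fresh samples and refitting the model each time.
slopes = np.array(
    [fit_simple_ols(*simulate_sample(rng, n=500))[1] for _ in range(2000)]
)

print(f"observations: {n_obs}, correlation: {corr:.3f}")
print(f"intercept: {b0_hat:.3f}, slope: {b1_hat:.3f}")
print(f"first fitted: {first_fitted:.3f}, first residual: {first_residual:.3f}")
print(f"mean residual: {mean_residual:.2e}")
print(f"slope estimates: mean {slopes.mean():.3f}, sd {slopes.std(ddof=1):.3f}")
```

The closed-form slope and intercept are used here so the sketch depends only on NumPy; a model-fitting function from a statistics package should return the same two estimates. With real data the repeated-sampling loop is not available, since only one sample is observed, which is why Problem 4 asks for a definition rather than a computation.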