DS 64510 Linear Models Homework Assignment 2
Fit a model with wage as the response $y$, educ as predictor $x_1$, and exper as predictor $x_2$, then complete the following problems 1-8.
-
Report and interpret the value of $\hat{\beta}_1$, the slope parameter for education.
-
What is the predicted wage for an individual with 10 years of experience and 10 years of education? (Show how you used the regression equation to obtain your answer.)
-
Use the predict() function to obtain the predicted value of wage for the first 10 subjects in the uswages data.
-
Compute the residual sum of squares (RSS) for the model.
-
Explain the concept of Least Squares estimation in the context of this problem.
-
In the Week 2 Live Session, we will see that highly correlated predictor variables tend to destabilize a regression model. Is there cause for concern in the model above?
-
Find the definition of the hat matrix $H$ on page 16 of the text, and use matrix operations to compute it. How many columns does $H$ have? (Hint: In the formula on page 16, $X$ is a matrix with $n$ rows and $p$ columns. The first column consists of all 1's for the intercept, and each predictor in the model contributes another column. Assuming you named your regression model mod, $X$ can be obtained using the code: X <- model.matrix(mod))
-
Use linear algebra functions in R to obtain the model parameter estimates.
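The computational steps in problems 1-8 can be sketched in R roughly as follows. This is only an illustration of the mechanics, not a worked answer: it assumes a small simulated data frame (the hypothetical `dat`, with made-up coefficients) in place of the uswages data so that it runs without extra packages, and it assumes the usual least squares formulas $H = X(X^T X)^{-1} X^T$ and $\hat{\beta} = (X^T X)^{-1} X^T y$. The model name mod follows the hint above.

```r
# Hypothetical stand-in data (NOT uswages); columns mimic wage, educ, exper.
set.seed(1)
n <- 50
dat <- data.frame(educ  = sample(8:18, n, replace = TRUE),
                  exper = sample(0:40, n, replace = TRUE))
dat$wage <- 100 + 50 * dat$educ + 10 * dat$exper + rnorm(n, sd = 100)

# Fit the two-predictor linear model and extract the coefficient estimates.
mod <- lm(wage ~ educ + exper, data = dat)
b <- coef(mod)

# Prediction for educ = 10, exper = 10: by hand from the regression
# equation, then with predict() as a check.
by_hand    <- b["(Intercept)"] + b["educ"] * 10 + b["exper"] * 10
by_predict <- predict(mod, newdata = data.frame(educ = 10, exper = 10))

# Predicted values for the first 10 rows of the data.
first10 <- predict(mod, newdata = dat[1:10, ])

# Residual sum of squares: the quantity least squares minimizes.
rss <- sum(residuals(mod)^2)

# Sample correlation between the two predictors.
r_pred <- cor(dat$educ, dat$exper)

# Hat matrix from the design matrix X.
X <- model.matrix(mod)
H <- X %*% solve(t(X) %*% X) %*% t(X)
dim(H)

# Parameter estimates from the normal equations (X'X) beta = X'y.
y <- dat$wage
beta_hat <- solve(t(X) %*% X, t(X) %*% y)
```

With the real data, `dat` would be replaced by uswages; comparing `beta_hat` with `coef(mod)` and `H %*% y` with `fitted(mod)` is a quick way to check the matrix computations.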
Note: Items 9 and 10 concern simulating linear regression data, which we will cover in Live Session 2.
-
Recall, the mathematical equation for a linear regression with two predictors is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$. Explain how the $\varepsilon$ term is included in the simulation. (1 or 2 sentences should be enough.)
-
Explain why including the $\varepsilon$ term is important in the simulation.
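For orientation, a simulation of this kind might look like the sketch below. It is not the Live Session code: the sample size, predictor ranges, coefficient values, and error standard deviation are all assumptions chosen for illustration, and the error term is assumed to be normal with mean zero.

```r
# All numeric settings below are illustrative assumptions.
set.seed(42)
n  <- 200
x1 <- runif(n, 8, 18)   # hypothetical first predictor
x2 <- runif(n, 0, 40)   # hypothetical second predictor

beta0 <- 1
beta1 <- 2
beta2 <- 0.5
sigma <- 3

# Random error term: one independent normal draw per observation.
eps <- rnorm(n, mean = 0, sd = sigma)

# Response = linear predictor + error.
y <- beta0 + beta1 * x1 + beta2 * x2 + eps

# Fit the model to the simulated data and compute its RSS.
sim_mod <- lm(y ~ x1 + x2)
rss_noisy <- sum(residuals(sim_mod)^2)

# For comparison: without the error term, y is an exact linear function
# of x1 and x2, so the fitted residuals are zero up to rounding.
y_noiseless   <- beta0 + beta1 * x1 + beta2 * x2
rss_noiseless <- sum(residuals(lm(y_noiseless ~ x1 + x2))^2)
```

Comparing `rss_noisy` with `rss_noiseless` shows the difference between data that scatter around the regression plane and data that lie exactly on it.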