3. The dataset scores-Version-0.txt gives the scores in the final examination (Final) and in two preliminary examinations (Pre1 and Pre2) for 22 students in a statistics course.
The dataset scores-Version-0.txt is available online in the LMS unit STAT2401. Please download the data, save it in “your working directory”, and read it in with
setwd("your working directory")
scores <- read.table(file = "scores-Version-0.txt", header = TRUE)
(a) Fit each of the following models to the data:
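Model 1: $\text{Final} = \beta_0 + \beta_1 \text{Pre1} + \beta_2 \text{Pre2} + \varepsilon$,
Model 2: $\text{Final} = \beta'_0 + \beta'_1 \text{Pre1} + \varepsilon'$,
Model 3: $\text{Final} = \beta''_0 + \beta''_2 \text{Pre2} + \varepsilon''$,
where the errors $\varepsilon$, $\varepsilon'$ and $\varepsilon''$ are all normally distributed with mean zero and constant variance. Write down the R code and the fitted equations for these three models. [6 marks]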
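As a minimal sketch of how fits of this kind are usually set up in R, the snippet below calls lm() on a small simulated data frame. The data frame toy and all of its values are made up here so that the code runs without the course file; they have nothing to do with the real scores. With the real data, toy would be replaced by scores.

```r
# Simulated stand-in for the course data (hypothetical values, NOT the real scores)
set.seed(1)
Pre1  <- round(runif(22, 50, 100))
Pre2  <- round(20 + 0.7 * Pre1 + rnorm(22, sd = 6))
Final <- 5 + 0.4 * Pre1 + 0.5 * Pre2 + rnorm(22, sd = 4)
toy   <- data.frame(Final, Pre1, Pre2)

# Model 1 uses both preliminary scores; Models 2 and 3 use one predictor each
fit1 <- lm(Final ~ Pre1 + Pre2, data = toy)
fit2 <- lm(Final ~ Pre1, data = toy)
fit3 <- lm(Final ~ Pre2, data = toy)

# The estimated coefficients give the fitted equation of each model
coef(fit1)
coef(fit2)
coef(fit3)
```

summary(fit1) would additionally report standard errors and the residual standard error for the same fit.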
(b) Use all three models to predict the final examination score for a student who scored 78 and 85 on the first and second preliminary examinations, respectively. What are the 95% prediction intervals for this student? Report the R code you use to obtain the intervals. Also determine which model gives the shortest interval. [6 marks]
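One common way to obtain prediction intervals in R is predict() with interval = "prediction". The sketch below illustrates this on simulated data; the data frame toy, the list fits and the object new_student are names introduced here for illustration only, and the simulated values are not the real scores.

```r
# Simulated stand-in for the course data (hypothetical values, NOT the real scores)
set.seed(1)
Pre1  <- round(runif(22, 50, 100))
Pre2  <- round(20 + 0.7 * Pre1 + rnorm(22, sd = 6))
Final <- 5 + 0.4 * Pre1 + 0.5 * Pre2 + rnorm(22, sd = 4)
toy   <- data.frame(Final, Pre1, Pre2)

fits <- list(model1 = lm(Final ~ Pre1 + Pre2, data = toy),
             model2 = lm(Final ~ Pre1, data = toy),
             model3 = lm(Final ~ Pre2, data = toy))

# One new student; each model only uses the columns that appear in its formula
new_student <- data.frame(Pre1 = 78, Pre2 = 85)

# 95% prediction interval from each model, stacked into one matrix
pi95 <- do.call(rbind, lapply(fits, predict, newdata = new_student,
                              interval = "prediction", level = 0.95))
rownames(pi95) <- names(fits)

# Interval width per model, and the model with the narrowest interval
width <- pi95[, "upr"] - pi95[, "lwr"]
cbind(pi95, width)
names(which.min(width))
```

Note that interval = "confidence" would instead give an interval for the mean response, which is narrower and answers a different question.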
(c) The relationship between the simple and the multiple regression coefficients can be seen when we compare the previous regression equations (Model 1 & Model 2) and the following regression equation:
Model 4: $\text{Pre2} = \alpha_0 + \alpha_1 \text{Pre1} + e$
Verify numerically that $\hat{\beta}'_1 = \hat{\beta}_1 + \hat{\beta}_2 \hat{\alpha}_1$, that is, the simple regression coefficient estimate for Final on Pre1 is the sum of the Model 1 coefficient estimate for Pre1, and the product of the Model 1 coefficient estimate for Pre2 and the Model 4 coefficient estimate from the regression of Pre2 on Pre1. The estimates are all least squares estimates. [5 marks]
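The identity holds for any least squares fit because the Model 1 residuals are uncorrelated with Pre1, so regressing the Model 1 fit on Pre1 leaves the Pre1 coefficient plus the Pre2 coefficient times the slope of Pre2 on Pre1. A hedged sketch of a numerical check on simulated data follows; the data frame toy and the names b, b_simple, a, lhs and rhs are introduced here for illustration, and the values are not the real scores.

```r
# Simulated stand-in for the course data (hypothetical values, NOT the real scores)
set.seed(1)
Pre1  <- round(runif(22, 50, 100))
Pre2  <- round(20 + 0.7 * Pre1 + rnorm(22, sd = 6))
Final <- 5 + 0.4 * Pre1 + 0.5 * Pre2 + rnorm(22, sd = 4)
toy   <- data.frame(Final, Pre1, Pre2)

b        <- coef(lm(Final ~ Pre1 + Pre2, data = toy))  # multiple regression of Final
b_simple <- coef(lm(Final ~ Pre1, data = toy))         # simple regression of Final on Pre1
a        <- coef(lm(Pre2 ~ Pre1, data = toy))          # regression of Pre2 on Pre1

# Left- and right-hand sides of the identity
lhs <- unname(b_simple["Pre1"])
rhs <- unname(b["Pre1"] + b["Pre2"] * a["Pre1"])

c(lhs = lhs, rhs = rhs)
all.equal(lhs, rhs)  # compares the two sides up to floating-point tolerance
```

Using all.equal() rather than == avoids a spurious mismatch caused by rounding error in the last few digits.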