
STAT 361 Applied Methods in Statistics I (Fall 2023) Assignment 3: Linear regression

Queen's University · STAT 361 · Applied Methods in Statistics I · R · linear regression

STAT 361 (Fall 2023) Assignment 3

The assignment is due on Nov. 04 (Saturday) at 23:00 (Kingston, Ontario time). Please submit to Crowdmark.

You can still submit your assignment after the scheduled submission deadline; the penalty for a late assignment is 1% per hour. Watch out for a Crowdmark technical feature: if you have clicked submit for a question before the deadline, then you CANNOT resubmit for that question once the deadline has passed.

Please read the course outline posted in Week 1, OnQ, if you need special accommodation for your assignment.
Requests to extend the submission deadline by < 24 hours (say, 1 hour late) will not be considered.

Guidelines for Preparing Solutions

For questions that need R coding, please include only the important R output and the necessary results in the main text of your solutions. Present them in a clear and concise fashion (for example, tabulate models and output).
If there is other long code and output related to your work and exploration, please put it in an Appendix at the end of EACH problem.

These Appendix sections will NOT be marked, but you can submit them as evidence of your independent work.
If you do not submit Appendix sections, make sure your assignment solutions are presented clearly and show your independent work.

Do not expect TAs to search through lengthy code and output for your answers. Identical solutions between students or copying from other sources will be investigated for academic integrity violations.

1. How is $R^2$ related to the sample correlation coefficient? Recall the correlation coefficient for random variables $X$ and $Y$, defined as

$$\rho = \frac{E\{[X - E(X)][Y - E(Y)]\}}{\sqrt{Var(X)\,Var(Y)}} = \frac{Cov(X, Y)}{\sqrt{Var(X)\,Var(Y)}}.$$

The sample correlation coefficient for the observed data $x$ and $y$ is

$$\hat{\rho} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}.$$

Show that the $R^2$ of the simple linear regression, model (1) of Chapter 2, is the square of the sample correlation coefficient between $x$ and $y$,

$$R^2 = \hat{\rho}^2.$$
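Before attempting the proof, it can help to see the identity hold numerically. The following is a minimal R sketch on simulated data, not the course data; the seed, sample size, intercept, and slope are arbitrary assumptions chosen only for illustration. It is a sanity check and does not replace the algebraic argument the question asks for.

```r
# Numerical sanity check on simulated data (hypothetical values throughout).
set.seed(361)                      # arbitrary seed, for reproducibility only
n <- 50                            # assumed sample size
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)          # assumed intercept 1, slope 2, N(0, 1) errors

fit <- lm(y ~ x)                   # simple linear regression of y on x
r2 <- summary(fit)$r.squared       # R^2 as reported by lm()
rho_hat <- cor(x, y)               # sample (Pearson) correlation coefficient

# The two numbers should agree up to floating-point error.
print(c(R2 = r2, rho_hat_squared = rho_hat^2))
```

Changing the seed or the simulated coefficients should leave the agreement intact, which is what the proof must explain.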

2. Consider the multiple regression model $Y = X\beta + \varepsilon$, where $\varepsilon \sim MVN_n(0, \sigma^2 I)$. See descriptions of model forms (1) and (2) in Chapter 4.

(a) Show that the residual vector $r = (I - P)Y$, where $P = X(X^T X)^{-1}X^T$, and show that $I - P$ is also a projection matrix.

(b) Let $U = (\hat{\beta}, r)^T$. Find the joint distribution of the random vector $U$. It may be helpful to notice that

$$U = \begin{pmatrix} (X^T X)^{-1}X^T \\ I - P \end{pmatrix} Y.$$

(c) Show that $\hat{\beta}$ and $r$ are independent.

Hint: For (b) and (c), properties of multivariate normal distribution may be useful.
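The matrix facts in this problem can be explored numerically before proving them. The sketch below is a hedged illustration in R, assuming a small simulated design matrix `X` with an intercept column and hypothetical coefficients; none of these values come from the course. It builds $P$ and $I - P$ directly and compares them with what `lm()` returns.

```r
# Simulated design (hypothetical sizes and coefficients, for illustration only).
set.seed(361)
n <- 20
p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # intercept + 2 covariates
beta <- c(1, -2, 0.5)
Y <- drop(X %*% beta + rnorm(n))

A <- solve(crossprod(X)) %*% t(X)  # (X^T X)^{-1} X^T, so beta_hat = A Y
P <- X %*% A                       # hat matrix P = X (X^T X)^{-1} X^T
M <- diag(n) - P                   # I - P

fit <- lm(Y ~ X - 1)               # X already contains the intercept column

# Residuals from lm() versus (I - P) Y
res_gap <- max(abs(unname(residuals(fit)) - drop(M %*% Y)))

# Idempotence and symmetry of I - P
idem_gap <- max(abs(M %*% M - M))
sym_gap <- max(abs(M - t(M)))

# Product of the two blocks of the stacked matrix in part (b)
cross_gap <- max(abs(A %*% M))

print(c(res_gap = res_gap, idem_gap = idem_gap,
        sym_gap = sym_gap, cross_gap = cross_gap))
```

All four gaps should be at the level of floating-point noise. The last one is worth keeping in mind when working out the covariance structure of $U$.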

3. Consider the “Savings.txt” data posted. It is an economic dataset collected in 48 different countries. The variable “sr” is the savings ratio (aggregate personal saving divided by disposable income). The variables “pop15” and “pop75” are the percentages of the population under 15 and over 75, respectively. The variable “dpi” is disposable income (per capita, in dollars), while the variable “ddpi” is the rate (percent) of change in disposable income (per capita).

(a) Draw a scatter plot matrix for all the variables involved. Comment on the possible relationships between variables, focusing on those that appear interesting to you.

(b) Fit a simple linear regression model with disposable income (“dpi”) as response and percentage of population under 15 as the only covariate. Describe the model clearly in mathematical form. Report and interpret the fitted model: is there a significant association between the variables, is this what you expect?

(c) Find the sample correlation coefficient between the two variables you studied in (b). How is it related to R2 of the model you fitted in (b)?

(d) Fit a regression model with ratio of savings ($Y$, “sr”) as the response, and all other variables as the covariates. Describe the model clearly in mathematical form, report and discuss the fit of the model. Interpret the estimated coefficient for the rate of change in disposable income.

(e) Present the analysis of variance table for the model in (d), i.e., the ANOVA table in the form of Table 1 of Section 4.4. The model you specified in (d) assumes that the error terms are i.i.d. normal with mean 0 and variance $\sigma^2$. An estimate of $\sigma$, denoted by $\hat{\sigma}$, can be extracted from your fitted model (suppose it is named “fitd” in your code) by the R code “sigma(fitd)”. How is $\hat{\sigma}$ related to SS(Res), the residual sum of squares?
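The R calls involved in parts (a) to (e) can be sketched as follows. This is only a sketch of the mechanics under a stated assumption: it uses R's built-in `LifeCycleSavings` data frame as a stand-in, because it has the same five variable names but 50 rows rather than 48, so it is not the posted file and its numbers will differ. The object names `savings` and `fitb` are hypothetical; `fitd` follows the name suggested in part (e).

```r
# Stand-in data: LifeCycleSavings (base R) has columns sr, pop15, pop75,
# dpi, ddpi, but it is NOT the posted "Savings.txt" file.
# For the posted data, something like
#   savings <- read.table("Savings.txt", header = TRUE)
# may work, assuming a whitespace-delimited file with a header row.
savings <- LifeCycleSavings

pairs(savings)                                   # (a) scatter plot matrix

fitb <- lm(dpi ~ pop15, data = savings)          # (b) simple linear regression
summary(fitb)                                    # estimates, t-tests, R^2
cor(savings$dpi, savings$pop15)                  # (c) sample correlation

fitd <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)  # (d) full model
summary(fitd)

anova(fitd)          # (e) sequential ANOVA; may need regrouping to match
                     #     the layout of Table 1 of Section 4.4
sigma(fitd)          # estimate of sigma
deviance(fitd)       # residual sum of squares, SS(Res), for an lm fit
df.residual(fitd)    # residual degrees of freedom
```

Note that `anova()` lists one row per covariate in the order entered, so the regression sum of squares for the overall table is the sum of those rows. Comparing the last three printed quantities is a useful starting point for the final question in (e).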
