# Economic and Statistical Software: Introduction to R


Note: (i) Clearly state your name and student ID. (ii) Send your answers and code to the TA before the deadline. (iii) There are 100 marks in total. (iv) You may answer the questions in English or Chinese unless otherwise stated.

- (10 marks) By definition, the projection matrices are $P_X \equiv X(X^\top X)^{-1}X^\top$ and $M_X = I - P_X$. Assume that we have a matrix $X_1$ such that the space spanned by $X_1$, denoted $S_1$, belongs to the space spanned by $X$, denoted $S_X$. Define the projection matrix $P_1 \equiv X_1(X_1^\top X_1)^{-1}X_1^\top$.
  (a) Prove that $P_1 \cdot P_X = P_1$.
  (b) What is the result of $M_1 \cdot M_X$? Why? Provide intuition.
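One standard starting observation for part (a), sketched here as a hint rather than a full solution: because $S_1 \subseteq S_X$, every column of $X_1$ already lies in the span of $X$, so projecting $X_1$ onto $S_X$ leaves it unchanged:

$$
P_X X_1 = X_1 \quad\Longrightarrow\quad X_1^\top P_X = X_1^\top,
$$

where the second equality uses the symmetry of $P_X$. Substituting this into the definition of $P_1 P_X$ gives the claimed result.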
- (10 marks) Suppose $u_t$ follows the stationary AR(1) process

  $$u_t = \rho u_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{iid}(0, \sigma_\varepsilon^2), \qquad |\rho| < 1.$$

  Derive the covariance and correlation between $u_t$ and $u_{t-j}$.
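One common route, sketched here as a starting point: under $|\rho| < 1$, repeated back-substitution gives the MA($\infty$) representation

$$
u_t = \sum_{j=0}^{\infty} \rho^j \varepsilon_{t-j}, \qquad \operatorname{Var}(u_t) = \frac{\sigma_\varepsilon^2}{1-\rho^2},
$$

from which $\operatorname{Cov}(u_t, u_{t-j})$ follows by multiplying $u_t$ by $u_{t-j}$, taking expectations, and using the iid property of $\varepsilon_t$.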

- (10 marks) Explain why endogeneity is rarely considered in machine learning forecasting exercises. First describe your understanding of endogeneity, then use one machine learning algorithm you are familiar with as an example.
- (20 marks) Along with this final assignment, you should find two PDF files. These are academic articles published in the Journal of Economic Perspectives by well-known scholars. The two papers are:

(a) “Mismeasured Variables in Econometric Analysis: Problems from the Right and Problems from the Left” by Jerry Hausman (2001)
(b) “Avoiding Invalid Instruments and Coping with Weak Instruments” by Michael P. Murray (2006)

Please write a short report in English on (i) paper (a), if the last digit of your student ID is an odd number; or (ii) paper (b), if the last digit of your student ID is an even number. DO NOT write reports on both papers! Your report should be at least 300 words long and summarize the article. You should discuss the findings, contribution, and conclusion of the paper. You may add equations and technical terms if necessary. You may cite other references, but keep in mind that the references will not be counted as part of the 300 words. Any form of plagiarism will not be tolerated.

- (20 marks) This question is about the VIX data set vixlarge.csv, which contains the VIX data and the associated dates.
  (a) (5 marks) Plot the VIX data against the date as a line chart. Clearly label the horizontal and vertical axes.
  (b) (10 marks) Let the dependent variable $y$ be the VIX, and let the first and second columns of the independent variable $X$ be the intercept term and the lag of the VIX (set $x_0 = 0$). Conduct a one-step-ahead rolling window exercise.
    i. Set the window length at 3000 and forecast the next period $y_{t+1}$.
    ii. Start from the beginning and roll until the end.
    iii. For each roll, make forecasts using the ridge and lasso methods with tuning parameters $\lambda = 1, 10$ for each method. In total, we compare 4 methods.
    iv. Compare the forecasts with the actual values of $y_{t+1}$. Compute the mean squared forecast errors and the mean absolute forecast errors for the four methods and report them in a table.
    v. Which method has the best performance and which one has the worst? Provide your understanding and explanation of the results.
    vi. Come up with an algorithm that can beat the best-performing method from question v. Clearly describe your motivation, the details of the algorithm, and the results.
  (c) (5 marks) We now consider a more general forecasting exercise with the model

    $$y_{t+h} = f(x_t) + u_{t+h}, \qquad t = 1, \dots, n - h,$$

    where $h$ is the forecasting horizon. Note that part (b) is the special case with $h = 1$ and $f(\cdot)$ being the ridge or LASSO estimator. Now replicate part (b) with $h = 1, 5, 10, 22$ using LASSO and the regression tree. Choose your own tuning parameters this time, state them clearly, and report your forecasting results in a table. What do you observe?
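To make the one-step-ahead rolling-window mechanics of part (b) concrete, here is a minimal sketch using closed-form ridge regression. It is written in Python with NumPy purely for illustration (the course itself uses R); the lasso leg has no closed form and is omitted, and the synthetic-data usage below is an assumption, not the VIX data.

```python
import numpy as np

def ridge_coef(X, y, lam):
    # Closed-form ridge estimate: (X'X + lam * I)^{-1} X'y
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

def rolling_forecast(y, window=3000, lam=1.0):
    """One-step-ahead rolling-window ridge forecasts of y.

    Predictors are an intercept and one lag of y (with x0 = 0, as in
    the question). Returns (MSFE, MAFE) over all rolled forecasts.
    """
    y = np.asarray(y, dtype=float)
    x_lag = np.concatenate(([0.0], y[:-1]))          # lagged VIX, x0 = 0
    X = np.column_stack([np.ones_like(y), x_lag])    # intercept + lag
    preds, actual = [], []
    # Roll from the first full window to the end of the sample
    for t in range(window, len(y)):
        beta = ridge_coef(X[t - window:t], y[t - window:t], lam)
        preds.append(X[t] @ beta)                    # forecast of y[t]
        actual.append(y[t])
    preds, actual = np.array(preds), np.array(actual)
    msfe = np.mean((actual - preds) ** 2)            # mean squared forecast error
    mafe = np.mean(np.abs(actual - preds))           # mean absolute forecast error
    return msfe, mafe
```

Running `rolling_forecast` once per $(\text{method}, \lambda)$ pair produces the entries of the comparison table asked for in item iv.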

- (30 marks, 5 marks each) This question requires you to use the movielarge.csv data. As usual, the OpenBox variable in the first column is the response and all others are predictors. This larger data set contains 18 predictors.
  (a) Apply a regression tree, a bagging tree, and a random forest to fit the data using all the predictors. Clearly state the approach you adopt (which method, whether you prune, and if so how, your number of bootstrap samples $B$, etc.). Report the respective centered $R^2$'s.
  (b) Briefly describe how we measure predictor importance using the OOB error. Do not copy from the lecture notes; use your own language.
  (c) Use the OOB error to measure predictor importance with the bagging tree. Clearly label the top 5 predictors.
  (d) Treat the first 90 observations as the training set and the remaining 4 observations as the evaluation set. Compare the prediction performance of the regression tree, bagging tree, and random forest by MSFE. Describe your results in detail.
  (e) Repeat (d), this time setting the first 80 observations as the training set and the rest as the evaluation set.
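For part (a), the centered $R^2$ can be computed from fitted values as one minus the residual sum of squares over the total sum of squares taken around the mean of $y$. A minimal sketch (in Python with NumPy for illustration, though the course uses R):

```python
import numpy as np

def centered_r2(y, y_fit):
    # Centered R^2: 1 - SSR / TSS, with TSS measured around the mean of y
    y, y_fit = np.asarray(y, dtype=float), np.asarray(y_fit, dtype=float)
    ss_res = np.sum((y - y_fit) ** 2)                # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)           # centered total sum of squares
    return 1.0 - ss_res / ss_tot
```

A perfect fit returns 1, and fitting only the sample mean returns 0; a fit worse than the mean can make the statistic negative.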