STA4003 Project
Due date: May 13, 2022
• Outstanding projects will be invited to give a presentation on Zoom on April 28, 2022. Students who give a presentation can receive up to 10 bonus points on their final exam.
• Students who want to present their work must submit the project by April 22, 2022. Submissions after April 22 will not be invited for presentation. All students can revise their work before May 13, 2022.
• The submitted code must be clearly written in an R file that outputs the MSE.
• A report describing your analysis is required.
This project is a full analysis of the dataset “AppData.Rdata”, which contains the average daily data usage (MB) for 9 categories of mobile apps (App i, for i = 1, ..., 9) from a mobile carrier operating in Chengdu, Sichuan. The data spans December 1, 2019 to April 30, 2020, during which the city government imposed three levels of public health restrictions: Policy-level 1, starting January 25, 2020, when households were limited to daily grocery trips; Policy-level 2, starting February 26, 2020, when most mobility restrictions were lifted; and Policy-level 3, starting March 25, 2020, when all restrictions were lifted.
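As an illustration of how the calendar and policy periods described above might be encoded, here is a minimal R sketch. It assumes a level 0 for the days before January 25, 2020, and the names `policy_start` and `covariates` are hypothetical. Neither the level 0 label nor these names come from the project description, and the actual layout of AppData.Rdata may differ.

```r
# Daily date index for the study period stated in the description.
dates <- seq(as.Date("2019-12-01"), as.Date("2020-04-30"), by = "day")

# Start dates of the three policy levels, taken from the description.
policy_start <- as.Date(c("2020-01-25", "2020-02-26", "2020-03-25"))

# findInterval() counts how many start dates are <= each day, giving
# 0 (assumed label for the period before any restriction) through 3.
policy_level <- findInterval(as.numeric(dates), as.numeric(policy_start))

# Hypothetical covariate table; column names are illustrative only.
covariates <- data.frame(
  date    = dates,
  policy  = factor(policy_level, levels = 0:3),
  weekday = as.integer(format(dates, "%u"))  # 1 = Monday, ..., 7 = Sunday
)

# Number of days observed under each policy level.
print(table(covariates$policy))
```

Note that Policy-level 3 begins inside the forecast window. Whether a policy change may be treated as known before its start date is a modelling decision that the description leaves open.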
In this project, you are required to forecast the average usage from March 1, 2020 to April 30, 2020. Let $x_{s,i}$ be the true average data usage of App i on Day s and $\hat{x}_{s,i}$ be your forecast. Your goal is to minimize the mean squared error (MSE) between $\hat{x}_{s,i}$ and $x_{s,i}$.
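The criterion can be written out as follows. This is one natural reading, not a formula from the handout: it assumes equal weights across the 9 categories and across the 61 forecast days from March 1 to April 30, 2020. The symbol $\mathcal{T}$ for the set of forecast days is introduced here for illustration only.

```latex
\[
\mathrm{MSE}
  = \frac{1}{9\,|\mathcal{T}|}
    \sum_{i=1}^{9} \sum_{s \in \mathcal{T}}
    \left( \hat{x}_{s,i} - x_{s,i} \right)^{2},
\qquad |\mathcal{T}| = 61 .
\]
```

A different normalizing constant would change the reported number, but not which forecast minimizes it.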
Please note the following.
1. Your work will be evaluated on other data sets that have the same background as the given data set. The given data set itself will not be used in the evaluation.
2. Only the given data set and the information provided in the project can be used. Do not use any additional information in your analysis.
3. Your forecast $\hat{x}_{s,i}$ must depend on past values only. Do not use any information from “the future”.
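The requirement to use past values only can be enforced structurally in code. The R sketch below assumes a one-step-ahead setting, in which the forecast for Day s may use every day before s. The handout does not say whether this or a longer horizon is intended. The data are synthetic, and `usage`, `forecast_next`, and the 7-day mean are hypothetical placeholders rather than the real data layout or a recommended model.

```r
# Synthetic stand-in for the real data: a days x 9 matrix of average usage.
set.seed(1)
dates <- seq(as.Date("2019-12-01"), as.Date("2020-04-30"), by = "day")
usage <- matrix(rexp(length(dates) * 9, rate = 1 / 100),
                nrow = length(dates), ncol = 9,
                dimnames = list(NULL, paste0("App", 1:9)))

# Row indices of the forecast window given in the description.
test_idx <- which(dates >= as.Date("2020-03-01"))

# Placeholder forecaster (an assumption, not a recommended model):
# per-category mean of the 7 most recent observed days.
forecast_next <- function(history) {
  colMeans(tail(history, 7))
}

# Rolling one-step-ahead forecasts: Day s sees only rows 1..(s - 1).
forecasts <- matrix(NA_real_, nrow = length(test_idx), ncol = ncol(usage),
                    dimnames = list(NULL, colnames(usage)))
for (k in seq_along(test_idx)) {
  s <- test_idx[k]
  history <- usage[seq_len(s - 1), , drop = FALSE]
  forecasts[k, ] <- forecast_next(history)
}

# Mean squared error averaged over all forecast days and categories.
mse <- mean((forecasts - usage[test_idx, , drop = FALSE])^2)
cat("MSE:", mse, "\n")
```

Because `forecast_next` only ever receives `history`, swapping in a different model cannot leak future observations as long as the loop is left unchanged.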