2024/4/23 20:54 Project 2
Project 2 Semester One 2024
The content and delivery of content in this course are protected by copyright. Material
belonging to others may have been used in this course and copied by and solely for the
educational purposes of the University under license. You may copy the course content
for the purposes of private study or research, but you may not upload onto any third
party site, make a further copy or sell, alter or further reproduce or distribute any
part of the course content to another person.
This project draws on lecture and lab content from Module 2: Creating web-based
dynamic reporting systems. The knowledge and skills assessed by the project will be
covered in lectures/labs by the end of Module 2. Consequently, the order of the
project instructions below does not necessarily reflect the order the material is
covered in lectures/labs.
Prep work
You will need access to a computer that has both R (https://cran.r-project.org/)
and RStudio (https://posit.co/download/rstudio-desktop/) installed.
If installing R and RStudio onto your computer is a problem, you can use the free
level of Posit Cloud (https://posit.cloud/).
You should already have the R package {tidyverse} installed from Project 1.
Remember, you can use the lab sessions to get help with completing your project, and
with any technical difficulties you face, like installing R or RStudio, or any
packages.
Part A: Form design
The data context for Project 2 is tracking human behaviour. You need to develop a
form/survey that could be completed on a weekly basis.
You need to select a human behaviour focus for tracking (e.g., spending money, volunteer work, going for walks, cleaning, helping whānau with tasks, sports-related training, using social media, socialising, studying, watching TV, etc.).
Decide on what questions you will use for your form, in order to collect data. Think
carefully about what aspect(s) of the human behaviour that you specifically want to
focus on, and that would be meaningful to track and compare over several weeks.
Develop a new Google form with between five and eight questions that align with your
focus. Use the four guidelines for designing forms discussed in Module 2. Create
different kinds of questions and add text, visual, and structural elements as needed.
At least two questions that will generate numeric data using a short answer
question with response validation.
At least two questions that will generate categorical data using multiple choice
questions with fixed options (i.e. no “other” option).
At least one question that will generate categorical data using check boxes with
fixed options (i.e. no “other” option).
Ensure that responses are anonymous.
Change the settings of your Google form so that it is NOT restricted to only users from the University of Auckland. [Note: You can use your non-university gmail account for this project, in which case this instruction won’t apply!]
Print your Google form to PDF and save this file within your project folder as
“project2_form.pdf”
You may be asked to discuss your form design process as part of the test, so you should
keep notes or drafts of your development of the form.
Part B: Data collection
Generate a link to share the Google form with others, and send your Google form to friends and whānau, aiming for at least 20 responses.
If you are unable to gain enough responses, you can complete the form yourself,
making sure to vary your answers so you end up with interesting data for analysis.
Link your Google form to a new Google sheet.
Publish your Google sheet as a CSV and record this URL (this is a different process
from sharing the link or just copying the link to your Google sheet).
Keep your form open to collecting responses when you submit your project, as the
marker will complete your form as part of marking your project.
You only need to collect “one week” of data from your participants (i.e. they only need
to complete it once, based on their behaviour for that week).
Part C: Data exploration
Within your stats220 folder on your computer, create a new project/folder using
RStudio that is called “Project2”.
Create a new R script/file called “exploration.R”. You will not submit this file, but
it is highly encouraged that you spend time exploring your data first, before you
create your project report.
Write R code to load the package {tidyverse}.
Remember that the package {tidyverse} includes the packages {readr}, {dplyr} and
{ggplot2}, so you don’t need to load these packages as well.
Use the function read_csv() to read data directly from the URL of your published CSV file into a data frame called learning_data.
Rename the variables of the data frame learning_data using the rename() function.
Use relevant R functions covered in Module 2 to explore your data, with the purpose
of identifying two bar charts and two summary values (e.g. min, max, mean, length,
etc.) that you can use for your dynamic report.
Ensure your R code uses comments, indenting, and “white space”.
You need to spend a bit of time getting to know your data and figuring out what key
aspects you want to present for your analytics section of the report.
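The exploration steps above might look something like the sketch below. It is only an illustration: the question wording, the made-up responses, and the new variable names (walk_count, walk_time) are assumptions, and the inline text stands in for the URL of your own published CSV file.

```r
library(tidyverse)

# Assumption: inline CSV text standing in for the published Google sheet.
# In your own script, replace csv_source with the URL you recorded in Part B.
csv_source <- I("Timestamp,How many walks did you go on this week?,What time of day do you usually walk?
2024-04-01 09:00:00,3,Morning
2024-04-02 10:30:00,5,Evening
2024-04-03 18:15:00,2,Morning
")

# Read the data, then give the long question text shorter variable names.
# rename() takes new_name = old_name; here the old columns are picked by position.
learning_data <- read_csv(csv_source, show_col_types = FALSE) %>%
  rename(walk_count = 2,
         walk_time = 3)

# Quick look at the variable names and types
glimpse(learning_data)

# Candidate summary values for the report
mean_walks <- learning_data$walk_count %>% mean()
max_walks <- learning_data$walk_count %>% max()

# Counts for a categorical variable, a starting point for a bar chart
walk_time_counts <- learning_data %>%
  count(walk_time)
walk_time_counts
```

Renaming by position is one option; you can equally write the original question text (in backticks) on the right-hand side of each rename() argument.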
Part D: RMarkdown report setup and introduction
Create a new Rmd file within the Project2 project folder called
“project2_report.Rmd”.
Make the title “Project 2” and put your name as the author.
Edit the YAML of the project2_report.Rmd file so that:
the subtitle is “STATS 220 Semester One 2024”.
code is folded (see example code below)
output:
  html_document:
    code_folding: hide
For the r setup chunk:
load the tidyverse library using library(tidyverse)
use the following settings
knitr::opts_chunk$set(echo=TRUE, message=FALSE, warning=FALSE, error=FALSE)
Structure the report (the project2_report.Rmd file) using second-level headings as
follows:
Introduction
Analytics
Creativity
Learning reflection
Under the Introduction section of your report:
Describe what human behaviour you decided to focus on for your report.
Discuss one of the guidelines for designing forms and how you considered this
when designing your form.
Explain how the data collected from your form would allow you to analyse changes
in this human behaviour over different weeks by referring to a specific
question(s) used in your form.
Include the link to your Google form (the same one that you shared with your
respondents - this needs to be accessible to the marker).
Write at least 200 words for this section.
Under the Analytics section of your report:
Write three “static” statements that describe what you learned about your data
when you explored it in Part C
A “static” statement is one that you write entirely using only markdown
“Static” means it will not change when your Rmd is knitted to HTML
Use what you observed when you explored your data in Part C to write these
statements (e.g. Most of the people who completed my form liked cats)
Copy code you have developed in your exploration.R script into R code chunks to
do the following:
Read the data from your published CSV file (using its URL) into a data frame
called learning_data.
Rename the variables of the data frame learning_data using the rename()
function.
Manipulate the data using appropriate functions.
Produce text that contains at least two summary values based on your data,
using R code and the paste() function or similar.
Produce at least two different and informative bar charts using {ggplot2}
functions.
A good report will mix together writing (using markdown) and data analysis output
(produced by R chunks).
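As a rough sketch of the kind of code an R chunk in the Analytics section could hold, the example below produces one text statement with two summary values and one bar chart. The data frame is hypothetical (it stands in for learning_data after reading and renaming), the variable names are invented for illustration, and the comma-separated format for check box answers is an assumption you should confirm against your own Google sheet.

```r
library(tidyverse)

# Hypothetical data standing in for learning_data after read_csv() and rename()
learning_data <- tibble(
  walk_count = c(3, 5, 2, 4),
  walk_places = c("Park, Beach", "Park", "Street, Park", "Beach")
)

# Two summary values, calculated from the data each time the report is knitted
mean_walks <- learning_data$walk_count %>% mean() %>% round(1)
max_walks <- learning_data$walk_count %>% max()

# A "dynamic" statement: the numbers update if the responses change
summary_text <- paste("The mean number of walks per week was", mean_walks,
                      "and the most walks reported by one respondent was", max_walks)
summary_text

# Assumption: each check box response is stored as one string with the
# selected options separated by ", ", so split it into one row per option
place_data <- learning_data %>%
  separate_rows(walk_places, sep = ", ")

# Bar chart of how often each option was selected
place_chart <- ggplot(data = place_data) +
  geom_bar(aes(x = walk_places), fill = "steelblue") +
  labs(title = "Where respondents walked this week",
       x = "Place",
       y = "Number of responses")
place_chart
```

Because the summary values and the chart are computed from the data, they change when new responses arrive, unlike the three static statements written in markdown.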
Under the Creativity section of your report:
Describe and justify how your project demonstrates creativity.
What does “demonstrate creativity” mean? It means that you have gone beyond what was
asked in terms of either your explanations, creations, presentation or use of data
technologies. For this project, that could mean re-using skills and knowledge covered
in Module 1 that were not asked for in this project, using regex or other features of
Google forms above what was asked for when designing your form, producing an analytics
section of the report that is statistically insightful, or developing code that is not
just copied from the lectures or labs. You can check with Anna or the lab tutors to
confirm your plans for creativity are sufficient!
Under the Learning reflection section of your report:
Describe in your own words (i.e., not using an AI tool to generate what you
write) at least ONE important idea you learned from Module 2: Creating web-based
dynamic reporting systems.
Discuss which things related to data technologies you are more curious about
exploring further.
Write at least 100 words for this section of your report.
Knit your project2_report.Rmd file to create a self-contained project2_report.html
file.
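Knitting is normally done with the Knit button in RStudio. If you prefer to knit from the console, a minimal sketch is shown below; it assumes your working directory is the Project2 folder and that the {rmarkdown} package is installed.

```r
# Assumption: run from the Project2 folder, where project2_report.Rmd lives
report_file <- "project2_report.Rmd"

if (file.exists(report_file)) {
  # Uses the output settings in the YAML header; html_document output
  # is self-contained by default
  rmarkdown::render(report_file)
}
```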
Marking guide
For this project, you will submit the following files:
project2_report.html
project2_form.pdf
The project will be marked out of 10. The criteria given below are based on the six
learning objectives of Module 2 and the three focuses for this course:
Focus or objective | Criteria
Design a form for data collection | Form contains required questions and demonstrates use of four guidelines (1 mark)
Consider data with respect to analysis | Explanation of how the data collected would support analysis of changes in human behaviour over different weeks (1 mark)
Store rectangular (tidy) data persistently using Google sheets | Google sheet published as CSV and URL used within analysis (1 mark)
Develop communication with data and technology | Three statements have been written that describe what was learned from the data. (1 mark)
Create dynamic reports using R Markdown | The analytics section of the report contains use of R chunks and markdown (1 mark)
Develop R-coding related knowledge | The analytics section R chunks show use of relevant R functions for reading and renaming data (1 mark)
Manipulate tidy data with R | The analytics section R chunks show use of relevant R functions for summarising data and outputting text-based statements that include summary values (1 mark)
Develop simple graphics with {ggplot2} | The analytics section of the report contains R chunks to create and display at least two informative bar charts (1 mark)
Develop creativity with data and technology | The project demonstrates creativity (1 mark)
Submission requirements | The project report meets the stated requirements and all correct files were submitted for the project (1 mark)