title: 'ANOVA and multiple comparisons' author: "YOURNAME" date: "2/28/2023" output: html_document
knitr::opts_chunk$set(echo = TRUE)
1. Read in the farm milk dataset and assign to the object "milk".
Calculate the mean and std dev of milk production (milk column) for each breed. Also count the number of observations included in the dataset per breed.
2. Create a bar plot that compares the mean milk production of the four breeds. Include error bars appropriate for inference on mean differences.
3. Test the null hypothesis that breeds do not differ in milk production. Report the test statistic, the associated degrees of freedom, and the P value.
4. About how much variation in individual milk production is 'explained' by breed?
5. Conduct a post-hoc test to determine which breeds differ from each other in milk production. Report P values.
After Farmer Brown conducted his measurements, he wondered what else could explain variation in milk production. He wondered in particular whether it mattered whether he bought his cows from Georgia or Kentucky, or whether his cows were grass or grain fed. These are 'state' and 'feed' columns in the provided dataset.
6. Assume that his prior efforts testing the significance of 'breed' came down to the ANOVA P value. That's one test using the milk production response variable. Conduct two additional tests of whether milk production varies according to state of cow origin or feed type, and then adjust the P values for all three using a standard post hoc test method (hint: try p.adjust). What are your conclusions?
7. What if Farmer Brown had wanted to compare the effects of all three covariates (breed, state, feed) at the outset of the experiment (he is not assuming any interactions between variables)? Would his conclusions have changed?
8. How would you detect whether there is collinearity in the predictors?