Background
ST332 & ST409 Medical Statistics: 2023-2024
Assignment [20%]
Deadline: 13:00 Friday 26th April 2024
Clinicians at University Hospitals Coventry & Warwickshire (UHCW) NHS Trust have studied whether elevated levels (above 2.9 ng/mL) of a biomarker, carcinoembryonic antigen (CEA), measured after surgery (typically within 2 weeks following surgery) in patients with colorectal cancer are prognostic for all-cause mortality. They also collected data on co-morbidities, namely a diagnosis of cardiovascular disease (CVD) prior to surgery and a diagnosis of chronic obstructive pulmonary disease (COPD) prior to surgery, in addition to smoking status, which is thought to affect CEA levels, and age at the time of surgery.
Dataset
The R dataset crcsurg.RData contains synthetic data on 160 patients treated at UHCW NHS Trust over 15 years, with a median follow-up of 5.7 years. The dataset considers death as the only outcome and contains the following variables and associated codes:
id (Patient ID number)
CEA (0=normal,1=elevated)
Smoker (0=no,1=yes at time of surgery)
Male (0=female,1=male)
Age (age in years at time of surgery)
CVD (0=none,1=CVD at time of surgery)
COPD (0=none,1=COPD at time of surgery)
pys (person-years follow-up since surgery)
dead (0=alive,1=dead)
Assignment Tasks
PART A [Maximum 5 pages – 15 marks]
Answer the following questions based on the crcsurg.RData dataset described above. Note that it is not necessary to write a report; just answer the questions. Do not include R code or paste R output, as marks will be lost for this.
1. Undertake an Exploratory Data Analysis of the crcsurg.RData dataset; in particular, identify any covariates which may be potential confounders. [5 marks]
2. Produce a summary table of the numbers of deaths and person-years of follow-up by the CEA level of patients. Calculate and interpret the associated Incidence Rate Ratio (IRR), its 95% CI and P-value. [3 marks]
3. Present and interpret an appropriate log-linear Poisson regression model that captures the effect of elevated CEA levels after adjusting for other relevant variables. Carefully describe the model (defining any variables you refer to), how your model was chosen, how the results compare to those in Question 2, and any interesting/pertinent features. [5 marks]
4. Your clinical colleagues suggest that a 10% relative increase in mortality due to elevated CEA levels would be clinically important to detect. Describe how you would design a future study based on this information (and the results of your analyses in Questions 1-3). [2 marks]
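For orientation on Question 2, the arithmetic behind a crude IRR can be sketched as below. This is a minimal illustration in Python, not a prescribed method: the function name `incidence_rate_ratio` and the counts in the usage lines are hypothetical and are not taken from crcsurg.RData. It assumes the usual large-sample (Wald) approximation on the log scale, where the standard error of log(IRR) is sqrt(1/d1 + 1/d0) for d1 and d0 deaths in the elevated and normal CEA groups. The same steps carry over directly to R.

```python
import math


def incidence_rate_ratio(deaths_exposed, pys_exposed,
                         deaths_unexposed, pys_unexposed, z=1.96):
    """Crude IRR with a Wald 95% CI and two-sided P-value (log scale).

    Hypothetical helper for illustration only; assumes both death
    counts are greater than zero.
    """
    # Mortality rate in each group = deaths / person-years of follow-up
    rate_exposed = deaths_exposed / pys_exposed
    rate_unexposed = deaths_unexposed / pys_unexposed
    irr = rate_exposed / rate_unexposed

    # Large-sample standard error of log(IRR)
    se_log_irr = math.sqrt(1 / deaths_exposed + 1 / deaths_unexposed)

    # Confidence limits are formed on the log scale, then exponentiated
    lower = math.exp(math.log(irr) - z * se_log_irr)
    upper = math.exp(math.log(irr) + z * se_log_irr)

    # Two-sided P-value for H0: IRR = 1, using the normal approximation
    z_stat = math.log(irr) / se_log_irr
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))

    return {"irr": irr, "lower": lower, "upper": upper, "p_value": p_value}


# Made-up counts: 30 deaths over 300 person-years (elevated CEA)
# versus 20 deaths over 400 person-years (normal CEA).
result = incidence_rate_ratio(30, 300, 20, 400)
print(result)
```

The crude IRR obtained this way takes no account of confounders, which is why Question 3 asks for a comparison with the adjusted estimate from the Poisson regression model.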
PART B [500 words maximum - 5 marks]
Based on your answers to PART A, write a Press Release about the study for the University to make available to the media.
Marks will be awarded for:
- Appropriate interpretation of the results in PART A
- Discussion of strengths and limitations of the study
- Use of non-technical language for an informed lay audience
- Inclusion of an “anonymous” quote from you as “Researcher <Joe Bloggs>”
- Discussion of potential next steps/further research