Q4) (20 marks)
The Diabetes dataset contains ten baseline variables - age, sex, body mass index, average blood pressure, and six blood serum measurements - obtained for each of n = 442 diabetes patients, as well as a quantitative measure of disease progression one year after baseline (column named 'Y').
Your task is to predict the disease progression for each patient based on the given data. We have already split the data into training and test sets: diab_train is the training data and diab_test is the test data.
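import numpy as np
import pandas as pd

# Load the tab-separated diabetes data and print summary statistics
diab = pd.read_csv('data/diabetes.tab.txt', delimiter='\t')
print(diab.describe())
# Shuffle the rows, then take 300 for training and the remaining 142 for testing
diab = diab.iloc[np.random.permutation(len(diab))]
diab_train = diab.head(300)
diab_test = diab.tail(142)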
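As a starting point for the prediction task, one possible baseline is ordinary least squares on all predictor columns, scored by root mean squared error on the test set. The sketch below is only an illustration, not the required solution: the helper names fit_linear_baseline, predict, and rmse are hypothetical, and it assumes every column other than 'Y' is numeric.

```python
import numpy as np
import pandas as pd


def fit_linear_baseline(train: pd.DataFrame, target: str = "Y") -> np.ndarray:
    """Fit ordinary least squares with an intercept.

    Returns a coefficient vector whose first entry is the intercept.
    Assumes all columns other than `target` are numeric.
    """
    X = train.drop(columns=[target]).to_numpy(dtype=float)
    y = train[target].to_numpy(dtype=float)
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict(coef: np.ndarray, frame: pd.DataFrame, target: str = "Y") -> np.ndarray:
    """Predict the target for each row; the target column is ignored if present."""
    X = frame.drop(columns=[target], errors="ignore").to_numpy(dtype=float)
    return coef[0] + X @ coef[1:]


def rmse(y_true, y_pred) -> float:
    """Root mean squared error between observed and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

With the split above, this could be used as coef = fit_linear_baseline(diab_train) followed by rmse(diab_test['Y'], predict(coef, diab_test)). Because the rows are shuffled without a fixed seed, the score will vary from run to run.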