
CS615 Deep Learning Assignment 3 - Learning and Basic Architectures Spring 2024


Introduction

In this assignment we will implement backpropagation and train/validate a few simple architectures using real datasets.

Allowable Libraries/Functions

Recall that you cannot use any ML functions to do the training or evaluation for you. Using basic statistical and linear algebra functions like mean, std, cov, etc. is fine, but using ones like train is not. Using any ML-related functions may result in a zero for the programming component. In general, use the "spirit of the assignment" (where we're implementing things from scratch) as your guide, but if you want clarification on whether you can use a particular function, DM the professor on Slack.

Grading

Do not modify the public interfaces of any code skeleton given to you. Class and variable names should be exactly the same as in the skeleton code provided, and no default parameters should be added or removed.

Part 1 (Theory)                        20pts
Part 2 (Visualizing Gradient Descent)  10pts
Part 3 (Update Weights method)         10pts
Part 4 (Linear Regression)             25pts
Part 5 (Logistic Regression)           25pts
TOTAL                                 100pts

Table 1: Grading Rubric

Datasets

Medical Cost Personal Dataset. For our regression task we'll once again use the medical cost dataset, which consists of data for 1338 people in a CSV file. The data for each person includes:

1. age
2. sex
3. bmi
4. children
5. smoker
6. region
7. charges (target value, Y)

This time I preprocessed the data for you to again convert the sex and smoker features into binary features and the region into a set of binary features (basically one-hot encoding it). In addition, the charges information is now included, as we will want to predict it.

For more information, see https://www.kaggle.com/mirichoi0218/insurance
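The one-hot encoding of the region column described above can be sketched as follows (the `one_hot` helper is our own illustration, not part of the provided skeleton or the preprocessed file):

```python
import numpy as np

def one_hot(column):
    """Map each distinct category in a column to its own 0/1 column."""
    categories = sorted(set(column))
    encoded = np.zeros((len(column), len(categories)))
    for row, value in enumerate(column):
        encoded[row, categories.index(value)] = 1.0
    return categories, encoded

# Example with a few hypothetical region values:
cats, enc = one_hot(["southwest", "southeast", "southwest", "northwest"])
```

Each row of `enc` has exactly one 1, in the column matching that row's category.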

Kid Creative. We will use this dataset for binary classification. It consists of data for 673 people in a CSV file. The data for each person includes:

1. Observation Number (we'll want to omit this)
2. Buy (binary target value, Y)
3. Income
4. Is Female
5. Is Married
6. Has College
7. Is Professional
8. Is Retired
9. Unemployed
10. Residence Length
11. Dual Income
12. Minors
13. Own
14. House
15. White
16. English
17. Prev Child Mag
18. Prev Parent Mag

We'll omit the first column and use the second column for our binary target Y. The remaining 16 columns provide our feature data for our observation matrix X.
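Assembling X and Y per the description above might look like this (`split_features_target` and the fake rows are illustrative, not part of the skeleton; in practice the rows would come from the CSV):

```python
import numpy as np

def split_features_target(data):
    """Drop the observation-number column, pull out Buy as Y, keep the rest as X."""
    data = np.asarray(data, dtype=float)
    Y = data[:, 1:2]   # column 2 ("Buy") is the binary target
    X = data[:, 2:]    # remaining 16 columns are the features
    return X, Y        # column 1 (observation number) is discarded

# Two fake rows standing in for the real 18-column CSV data:
fake = [[1, 0, 24000, 1] + [0] * 14,
        [2, 1, 75000, 0] + [0] * 14]
X, Y = split_features_target(fake)
```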

1 Theory

1. For the function J = (x1w1 - 5x2w2 - 2)^2, where w = [w1, w2]^T are our weights to learn:

   (a) What are the partial gradients ∂J/∂w1 and ∂J/∂w2? Show work to support your answer (6pts).

   (b) What are the values of the partial gradients, given current values of w = [0, 0]^T, x = [1, 1] (4pts)?
       Answer: ∂J/∂w1 = -4, ∂J/∂w2 = 20.

2. Given the objective function J = (1/4)(x1w1)^4 - (4/3)(x1w1)^3 + (3/2)(x1w1)^2:

   (a) What is the gradient ∂J/∂w1 (2pts)?

   (b) What are the locations of the extrema points for this objective function J if x1 = 1? Recall that to find these you take the derivative of the objective function with respect to the unknown, set that equal to zero, and solve for said unknown (in this case, w1) (5pts).

   (c) What does J evaluate to at each of your extrema points, again when x1 = 1 (3pts)?

1.1 answer

1.(a) With J = (x1w1 - 5x2w2 - 2)^2, let u = x1w1 - 5x2w2 - 2, so that J = u^2 and, by the chain rule, ∂J/∂w1 = 2u * ∂u/∂w1, where ∂u/∂w1 = x1.

So ∂J/∂w1 = 2(x1w1 - 5x2w2 - 2)x1.

We also have ∂u/∂w2 = -5x2, so ∂J/∂w2 = -10x2(x1w1 - 5x2w2 - 2).

2.(a) ∂J/∂w1 = x1^2 w1 (x1^2 w1^2 - 4x1w1 + 3)

(b) Setting x1 = 1 gives ∂J/∂w1 = w1(w1^2 - 4w1 + 3) = w1(w1 - 1)(w1 - 3), which is zero at w1 = 0, 1, 3.

(c) For w1 = 0, J = 0. For w1 = 1, J = 5/12. For w1 = 3, J = -2.25.

2 Visualizing Gradient Descent

In this section we want to visualize the gradient descent process for the following function (which was part of one of the theory questions):

J = (x1w1 - 5x2w2 - 2)^2

Note that this is more of a toy problem to explore the idea of gradient-based learning than it is a deep learning architecture.

Hyperparameter choices will be as follows:

1. Initialize your weights to zero.
2. Set the learning rate to η = 0.01.
3. Terminate after 100 epochs.

Using the partial gradients you computed in the theory question, perform gradient descent using x = [1, 1]. After each training epoch, evaluate J so that you can plot w1 vs. w2 vs. J as a 3D line plot. Put this figure in your report.
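The procedure above can be sketched in plain Python (plotting omitted; the variable names are our own):

```python
# Gradient descent on J = (x1*w1 - 5*x2*w2 - 2)**2 with the hyperparameters
# above: weights start at zero, eta = 0.01, 100 epochs, x = [1, 1].
x1, x2 = 1.0, 1.0
w1, w2 = 0.0, 0.0
eta = 0.01
history = []  # (w1, w2, J) after each epoch, for the 3D line plot
for epoch in range(100):
    u = x1 * w1 - 5 * x2 * w2 - 2        # shared inner term
    w1 -= eta * (2 * u * x1)             # dJ/dw1 = 2*u*x1
    w2 -= eta * (-10 * x2 * u)           # dJ/dw2 = -10*x2*u
    history.append((w1, w2, (x1 * w1 - 5 * x2 * w2 - 2) ** 2))
```

Unpacking `history` into three lists and passing them to matplotlib's `Axes3D.plot` would produce the requested 3D line plot.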

2.1 answer

Figure 1: w1 vs. w2 vs. J during gradient descent

3 Updating Fully Connected Layer's Weights and Biases

We also need to add an updateWeights method to our fully connected layer. This method takes the incoming (backpropagated) gradient and a learning rate as parameters, and updates the layer's weights and biases according to the formulas in lecture. The method's prototype should look like:

def updateWeights(self, gradIn, eta=0.0001):
    # TODO
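One way the update could look, assuming (as in earlier assignments) that the layer caches its most recent input during forward and that the update averages the gradient over the batch; treat this as a sketch under those assumptions, not the required implementation:

```python
import numpy as np

class FullyConnectedLayer:
    def __init__(self, sizeIn, sizeOut):
        # Small random initialization in +/- 1e-4, as used later in this assignment.
        rng = np.random.default_rng(0)
        self.weights = rng.uniform(-1e-4, 1e-4, (sizeIn, sizeOut))
        self.biases = rng.uniform(-1e-4, 1e-4, (1, sizeOut))
        self.prevIn = None

    def forward(self, dataIn):
        self.prevIn = dataIn                  # cache input for the update step
        return dataIn @ self.weights + self.biases

    def updateWeights(self, gradIn, eta=0.0001):
        n = gradIn.shape[0]
        dJdW = (self.prevIn.T @ gradIn) / n   # weight gradient, averaged over the batch
        dJdb = gradIn.sum(axis=0, keepdims=True) / n
        self.weights -= eta * dJdW
        self.biases -= eta * dJdb
```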

4 Linear Regression

In this section you'll use your modules to train a linear regression model for the medical cost dataset. The architecture of your linear regression should be as follows:

Input → Fully-Connected → Squared-Error Objective

Your code should do the following:

1. Read in the dataset to assemble X and Y (recall that our target Y is the charges column for this dataset).

2. Shuffle the rows of the dataset (both X and Y, together) and use approximately 2/3 for training and 1/3 for validation.

3. Train, via gradient learning, your linear regression system using the training data. Refer to the pseudocode in the lecture slides for how this training loop should look. Initialize your weights to random values in the range of ±10^-4. Play with your learning rate such that you get to (near) convergence in a reasonable amount of time with stability. Terminate the learning process when the absolute change in the mean squared error on the training data is less than 10^-10 or you pass 100,000 epochs. During training, keep track of the mean squared error (MSE) for both the training and the validation sets so that we can plot these as a function of the epoch.
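The training loop in step 3 can be sketched directly in NumPy (the layer-class version follows the same math; the helper name and demo learning rate are our own):

```python
import numpy as np

def train_linear_regression(Xtr, Ytr, eta=1e-3, max_epochs=100000):
    """Gradient-descent linear regression with the termination rule from step 3."""
    rng = np.random.default_rng(0)
    W = rng.uniform(-1e-4, 1e-4, (Xtr.shape[1], 1))   # weights in +/- 1e-4
    b = rng.uniform(-1e-4, 1e-4, (1, 1))
    mses = []
    prev_mse = np.inf
    for epoch in range(max_epochs):
        Yhat = Xtr @ W + b
        err = Yhat - Ytr
        mse = float(np.mean(err ** 2))
        mses.append(mse)
        if abs(prev_mse - mse) < 1e-10:                # |change in MSE| < 1e-10
            break
        prev_mse = mse
        grad = 2 * err / Xtr.shape[0]                  # squared-error gradient, batch-averaged
        W -= eta * (Xtr.T @ grad)
        b -= eta * grad.sum(axis=0, keepdims=True)
    return W, b, mses
```

Validation MSE would be tracked the same way inside the loop, evaluated on the held-out third without updating the weights.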

In your report provide:

1. Your plots of training and validation MSE vs. epoch.
2. Your final RMSE for the training and validation data.
3. Your final SMAPE for the training and validation data.
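For reference, RMSE and one common form of SMAPE could be computed as below; SMAPE conventions differ (some definitions halve the denominator), so this is one plausible reading of the metric reported later:

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def smape(y, yhat):
    """Symmetric mean absolute percentage error, as a percentage."""
    return float(100 * np.mean(np.abs(y - yhat) / (np.abs(y) + np.abs(yhat))))
```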

4.1 answer

(a)

Figure 2: MSE vs epoch

(b), (c):

Training RMSE: 8544.019074654007
Validation RMSE: 8693.1881630646
Training SMAPE: 51.3827515030682
Validation SMAPE: 54.37187011111296

5 Logistic Regression

Next we'll use a logistic regression model on the kid creative dataset to predict whether a user will purchase a product. The architecture of this model should be:

Input → Fully-Connected → Sigmoid Activation → Log-Loss Objective

Your code should do the following:

1. Read in the dataset to assemble X and Y (recall that our target Y is the Buy column for this dataset).

2. Shuffle the rows of the dataset (both X and Y, together) and use approximately 2/3 for training and 1/3 for validation.

3. Train, via gradient learning, your logistic regression system using the training data. Initialize your weights to random values in the range of ±10^-4. Play with your learning rate such that you get to (near) convergence in a reasonable amount of time with stability. Terminate the learning process when the absolute change in the log loss is less than 10^-10 or you pass 100,000 epochs. During training, keep track of the log loss for both the training and the validation sets so that we can plot these as a function of the epoch.
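As with the linear case, the loop in step 3 can be sketched in NumPy (the layer-class version follows the same math; the helper name, epsilon, and demo learning rate are our own):

```python
import numpy as np

def train_logistic_regression(Xtr, Ytr, eta=1e-2, max_epochs=100000):
    """Gradient-descent logistic regression with the step-3 termination rule."""
    rng = np.random.default_rng(0)
    W = rng.uniform(-1e-4, 1e-4, (Xtr.shape[1], 1))    # weights in +/- 1e-4
    b = rng.uniform(-1e-4, 1e-4, (1, 1))
    eps = 1e-7                                         # guards log(0)
    losses = []
    prev = np.inf
    for epoch in range(max_epochs):
        Yhat = 1.0 / (1.0 + np.exp(-(Xtr @ W + b)))    # sigmoid activation
        loss = float(-np.mean(Ytr * np.log(Yhat + eps) +
                              (1 - Ytr) * np.log(1 - Yhat + eps)))
        losses.append(loss)
        if abs(prev - loss) < 1e-10:                   # |change in log loss| < 1e-10
            break
        prev = loss
        grad = (Yhat - Ytr) / Xtr.shape[0]             # combined sigmoid + log-loss gradient
        W -= eta * (Xtr.T @ grad)
        b -= eta * grad.sum(axis=0, keepdims=True)
    return W, b, losses
```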

In your report provide:

1. Your plots of training and validation log loss vs. epoch.

2. Assigning an observation to class 1 if the model outputs a value greater than 0.5, report the training and validation accuracy.
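The thresholding rule in item 2 amounts to the following (the `accuracy` helper is illustrative, not part of the skeleton):

```python
import numpy as np

def accuracy(Yhat, Y):
    """Threshold sigmoid outputs at 0.5 and compare against the true labels."""
    predictions = (np.asarray(Yhat) > 0.5).astype(float)
    return float(np.mean(predictions == np.asarray(Y)))
```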

5.1 answer

(a)

Figure 3: log loss vs epoch

(b)

Training Accuracy: 0.9333333333333333
Validation Accuracy: 0.8878923766816144

Submission

For your submission, upload to Blackboard a single zip file containing:

1. PDF Writeup
2. Source Code
3. readme.txt file

The readme.txt file should contain information on how to run your code to reproduce results for each part of the assignment.

The PDF document should contain the following:

1. Part 1: Your solutions to the theory questions.
2. Part 2: Your plot.
3. Part 3: Nothing.
4. Part 4: Your plot and requested statistics.
5. Part 5: Your plot and requested accuracies.
