CS 615 - Deep Learning
Assignment 2 - Objectives, Gradients, and Backpropagation Spring 2024
Introduction
In this assignment we’ll implement our output/objective modules and add computing the gradients to each of our modules.
Allowable Libraries/Functions
Recall that you cannot use any ML functions to do the training or evaluation for you. Using basic statistical and linear algebra functions like mean, std, cov, etc. is fine, but using ones like train is not. Using any ML-related functions may result in a zero for the programming component. In general, use the “spirit of the assignment” (where we’re implementing things from scratch) as your guide, but if you want clarification on whether you can use a particular function, DM the professor on Slack.
Grading
Do not modify the public interfaces of any code skeleton given to you. Class and variable names should be exactly the same as the skeleton code provided, and no default parameters should be added or removed.
Theory                                                             20pts
Testing fully-connected and activation layers' gradient methods   30pts
Testing objective layers' loss computations and gradients         30pts
Forwards-Backwards Propagate Dataset                               20pts
TOTAL                                                             100pts

Table 1: Grading Rubric
1 Theory

1. (10 points) Given $H = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ as an input, compute the gradients of the output with respect to this input for the following activation layers.
   (a) A ReLU layer
   (b) A Softmax layer
   (c) A Logistic Sigmoid layer
   (d) A Tanh layer
   (e) A Linear layer

2. (2 points) Given $H = \begin{bmatrix} 4 & 5 & 6 \end{bmatrix}$ as an input, compute the gradient of the output of a fully connected layer with respect to this input, if the fully connected layer has weights $W = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$ and biases $b = \begin{bmatrix} -1 & 2 \end{bmatrix}$.

3. (2 points) Given target values of $Y = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and estimated values of $\hat{Y} = \begin{bmatrix} 0.2 \\ 0.3 \end{bmatrix}$, compute the loss for:
   (a) A squared error objective function
   (b) A log loss (negative log likelihood) objective function

4. (1 point) Given target distributions of $Y = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$ and estimated distributions of $\hat{Y} = \begin{bmatrix} 0.2 & 0.2 & 0.6 \\ 0.2 & 0.7 & 0.1 \end{bmatrix}$, compute the cross entropy loss.

5. (4 points) Given target values of $Y = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and estimated values of $\hat{Y} = \begin{bmatrix} 0.2 \\ 0.3 \end{bmatrix}$, compute the gradient of the following objective functions with respect to their input, $\hat{Y}$:
   (a) A squared error objective function
   (b) A log loss (negative log likelihood) objective function

6. (1 point) Given target distributions of $Y = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$ and estimated distributions of $\hat{Y} = \begin{bmatrix} 0.2 & 0.2 & 0.6 \\ 0.2 & 0.7 & 0.1 \end{bmatrix}$, compute the gradient of the cross entropy loss function with respect to the input distributions $\hat{Y}$.
1.1 answer

1. (a) $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$
   (b) $\begin{bmatrix} 0.09003 & 0.02237 & 0.06766 \\ 0.02237 & 0.09003 & 0.06766 \end{bmatrix}$
   (c) $\begin{bmatrix} 0.19661 & 0.10499 & 0.04518 \\ 0.01767 & 0.00665 & 0.00247 \end{bmatrix}$
   (d) $\begin{bmatrix} 0.41997 & 0.07065 & 0.00987 \\ 0.00067 & 0.00019 & 0.00001 \end{bmatrix}$
   (e) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$

2. $\begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}$

3. (a) 0.265   (b) −0.7136

4. 0.9831
5. (a) $\begin{bmatrix} 0.2 \\ -0.4 \end{bmatrix}$   (b) $\begin{bmatrix} -0.625 \\ -1.6667 \end{bmatrix}$

6. $\begin{bmatrix} -1.875 & 0.625 & 1.25 \\ 0.625 & -0.7143 & 0.625 \end{bmatrix}$
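As a quick sanity check, the answers for 3(a) and 4 can be reproduced directly, assuming each loss is averaged over the $N = 2$ observations:

$$J_{\text{squared error}} = \frac{1}{2}\left[(0 - 0.2)^2 + (1 - 0.3)^2\right] = \frac{0.04 + 0.49}{2} = 0.265$$

$$J_{\text{cross entropy}} = -\frac{1}{2}\left[\ln(0.2) + \ln(0.7)\right] \approx \frac{1.6094 + 0.3567}{2} \approx 0.9831$$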
Datasets
Kid Creative We will use this dataset for binary classification. This dataset consists of data for
673 people in a CSV file. The data for each person includes:
1. Observation Number (we’ll want to omit this)
2. Buy (binary target value, Y )
3. Income
4. Is Female
5. Is Married
6. Has College
7. Is Professional
8. Is Retired
9. Unemployed
10. Residence Length
11. Dual Income
12. Minors
13. Own
14. House
15. White
16. English
17. Prev Child Mag
18. Prev Parent Mag
We’ll omit the first column and use the second column for our binary target Y . The remaining 16 columns provide our feature data for our observation matrix X.
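For example, loading the CSV with NumPy might look like the sketch below (the filename and the presence of a single header row are assumptions; adjust to however your copy of the file is stored):

import numpy as np

# assumes a comma-delimited file with one header row
data = np.genfromtxt('KidCreative.csv', delimiter=',', skip_header=1)

Y = data[:, 1].reshape(-1, 1)   # column 2: the binary "Buy" target
X = data[:, 2:]                 # remaining 16 columns: the observation matrix
# column 1 (the observation number) is dropped by simply not selecting it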
2 Update Your Codebase
In this assignment you’ll add gradient and backwards methods to your existing fully-connected layer and activation functions, and implement your objective functions. Again, make sure these work for a single observation and multiple observations (both stored as matrices). We will be unit testing these.
Adding Gradient Methods
Implement gradient methods for your fully connected layer, and all of your activation layers. The prototype of these methods should be:
# Input: None
# Output: An N by (D by D) tensor
def gradient(self): # TODO
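As an illustration only (not the required implementation), here is a minimal sketch of what gradient might look like for a ReLU layer, assuming the layer caches its most recent 2-D input during forward and that a diagonal Jacobian per observation satisfies the N by (D by D) specification:

import numpy as np

class ReluLayer:
    # hypothetical minimal layer; adapt to your own Layer base class and skeleton
    def forward(self, dataIn):
        self.prevIn = dataIn            # cache the N by D input for gradient()
        return np.maximum(0, dataIn)

    # Input: None
    # Output: An N by (D by D) tensor of per-observation Jacobians
    def gradient(self):
        deriv = (self.prevIn > 0).astype(float)           # N by D elementwise derivative
        return np.array([np.diag(row) for row in deriv])  # stack into N by (D by D)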
Adding Backwards Methods
Add the backward method to our activation and fully-connected layers! You might want to consider having a default version in the abstract class Layer, although we’ll leave those design decisions to you. In general, the backward method should take as input the gradient coming backwards from the next layer and return the updated gradient to be backpropagated further. The method’s prototype should look like:
def backward(self, gradIn): # TODO
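One possible default is sketched below, under the assumption that gradient() returns an N by (K by D) tensor of per-observation Jacobians and that gradIn is N by K; if you orient your Jacobians differently, the multiplication order changes accordingly:

import numpy as np

class Layer:
    # abstract base class sketch; concrete layers override gradient()
    def gradient(self):
        raise NotImplementedError

    def backward(self, gradIn):
        # multiply each observation's incoming gradient (1 by K) by that
        # observation's Jacobian (K by D), yielding the N by D gradient to pass back
        grad = self.gradient()                        # N by (K by D)
        return np.einsum('nk,nkd->nd', gradIn, grad)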
Adding Objective Layers
Now let’s implement a module for each of our objective functions. These modules should again each be in their own file with the same filename as the class/module, and implement (at least) two methods:
• eval - This method takes two explicit parameters, the target values and the incoming/estimated values, and computes and returns the loss (as a single float value) according to the module’s objective function. This should work both for a single observation and for a set of observations.
• gradient - This method takes the same two explicit parameters as the eval method and computes and returns the gradient of the objective function using those parameters.
Implement these for the following objective functions:
• Squared Error as SquaredError
• Log Loss (negative log likelihood) as LogLoss
• Cross Entropy as CrossEntropy
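For reference, the standard mean-over-observations forms of these objectives are sketched below; the averaging is an assumption here, so defer to the course slides for the exact scaling you are expected to use:

$$J_{\text{SquaredError}} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$

$$J_{\text{LogLoss}} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i\ln\hat{y}_i + (1 - y_i)\ln(1 - \hat{y}_i)\right]$$

$$J_{\text{CrossEntropy}} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} y_{ik}\ln\hat{y}_{ik}$$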
Your public interface is:
class XXX():
    # Input: Y is an N by K matrix of target values.
    # Input: Yhat is an N by K matrix of estimated values.
    #        Where N can be any integer >= 1.
    # Output: A single floating point value.
    def eval(self, Y, Yhat): # TODO

    # Input: Y is an N by K matrix of target values.
    # Input: Yhat is an N by K matrix of estimated values.
    # Output: An N by K matrix.
    def gradient(self, Y, Yhat): # TODO
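As an illustration of this interface, a squared-error module might look roughly like the sketch below (whether the gradient should also be scaled by 1/N is a course convention, so treat the exact scaling as an assumption):

import numpy as np

class SquaredError():
    # Input: Y is an N by K matrix of target values.
    # Input: Yhat is an N by K matrix of estimated values.
    # Output: A single floating point value.
    def eval(self, Y, Yhat):
        return float(np.mean((Y - Yhat) ** 2))

    # Input: Y is an N by K matrix of target values.
    # Input: Yhat is an N by K matrix of estimated values.
    # Output: An N by K matrix.
    def gradient(self, Y, Yhat):
        return 2 * (Yhat - Y)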
3 Forwards-Backwards Propagate a Dataset
In HW1 you implemented forwards propagation for the Kid Creative dataset with the following architecture (note that I have added on a LogLoss layer):
Input→FC (1 output)→Logistic Sigmoid→LogLoss
Now let’s do forwards-backwards propagation. Using the code shown in the Objectives and Gradients slides, perform one forwards-backwards pass. In your report provide the gradient due to the first observation coming backwards out of:
1. Log Loss
2. Logistic Sigmoid Layer
3. Fully-Connected Layer
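A minimal sketch of one such forwards-backwards pass is below; the class names and constructor signature are placeholders for whatever your own HW1/HW2 implementations use, and X and Y are the observation matrix and targets loaded from the Kid Creative data:

# placeholder class names and constructor signature; substitute your own
L1 = FullyConnectedLayer(X.shape[1], 1)   # FC layer with one output
L2 = LogisticSigmoidLayer()
L3 = LogLoss()

# forward pass
h = L1.forward(X)
yhat = L2.forward(h)
loss = L3.eval(Y, yhat)

# backward pass, reporting the gradient for the first observation at each step
grad = L3.gradient(Y, yhat)
print('Out of LogLoss:', grad[0])
grad = L2.backward(grad)
print('Out of Logistic Sigmoid:', grad[0])
grad = L1.backward(grad)
print('Out of Fully Connected:', grad[0])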
3.1 answer
Gradient out of Log Loss for the first observation: [1.99983166]
Gradient out of Logistic Sigmoid Layer for the first observation: [0.49995791]
Gradient out of Fully Connected Layer for the first observation:
[-4.98777029e-05  3.93557355e-06  3.61814772e-05 -8.14937026e-06
 -4.95566822e-05 -4.03753562e-05  8.27928752e-06 -2.90450515e-05
 -9.80522957e-06 -6.82904674e-06  2.57671326e-05  3.63388693e-05
 -2.55929394e-05  1.16773737e-05  4.75451430e-05  9.68959219e-06]
Submission
For your submission, upload to Blackboard a single zip file containing:
1. PDF Writeup
2. Source Code
3. readme.txt file
The readme.txt file should contain information on how to run your code to reproduce results for each part of the assignment.
The PDF document should contain the following:
• Part 1: Your solutions to the theory questions.
• Part 2: Nothing. We will unit test these, but again we encourage you to do so yourself, particularly using the examples from the theory questions.
• Part 3: The gradient pertaining to the first observation as it comes backwards out of the three modules.