Question 1: A simple classifier (60%)
For this exercise, we provide demo code showing how to train a network on a small dataset called FashionMNIST. Please go through the following tutorials first; they will give you a basic understanding of how to train an image classification network in PyTorch. You may change the training scheme and the network structure. Then answer the following questions. You can organize your own text and code cells to show the answer to each question.
Note: Please plot the loss curve for each experiment (2 points).
Requirements:
Q1.1 (1 point) Change the learning rate and train for 10 epochs. Fill in this table:
Lr | Accuracy |
---|---|
1 | |
0.1 | |
0.01 | |
0.001 | |
Q1.2 (2 points) Report the number of epochs the network needs to converge. Hint: The network is considered "converged" when the accuracy no longer changes (or the change between epochs is smaller than a threshold that you choose).
Fill in this table:
Lr | Accuracy | Epoch |
---|---|---|
1 | | |
0.1 | | |
0.01 | | |
0.001 | | |
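The convergence hint in Q1.2 can be turned into a small check. The following is only a minimal sketch: the function name `convergence_epoch`, the threshold of 0.002, and the requirement of two consecutive small changes (`patience`) are illustrative assumptions, not part of the assignment. Choose your own criterion and state it in your answer.

```python
def convergence_epoch(accuracies, threshold=0.002, patience=2):
    """Return the 1-based epoch after which accuracy stops changing, or None.

    accuracies: list of per-epoch accuracies as fractions in [0, 1].
    threshold, patience: hypothetical values chosen for illustration.
    """
    stable = 0  # number of consecutive epoch-to-epoch changes below threshold
    for i in range(1, len(accuracies)):
        if abs(accuracies[i] - accuracies[i - 1]) < threshold:
            stable += 1
            if stable >= patience:
                # 1-based number of the epoch where the stable run began
                return i - patience + 1
        else:
            stable = 0  # a large change resets the run
    return None  # never converged under this criterion


# Made-up accuracies, for illustration only.
history = [0.70, 0.80, 0.85, 0.851, 0.8515, 0.8512]
epoch = convergence_epoch(history)
```

With a stricter threshold or a larger `patience`, the reported epoch moves later, so the criterion should be the same for every learning rate in the table.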
Q1.3 (2 points) Compare the results in Table 1 and Table 2. What do you observe, and what is your understanding of the effect of the learning rate?
Q1.4 (3 points) Build a deeper network and a wider network. Report the accuracy and the number of parameters for each structure. "Parameters" means the number of trainable parameters in your model, e.g. a 3 x 3 conv with one input channel, one output channel, and no bias has 9 parameters.
Structure | Accuracy | Parameters |
---|---|---|
Base | | |
Deeper | | |
Wider | | |
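The 3 x 3 example in Q1.4 generalizes to the usual counting rules for convolutional and fully connected layers. The sketch below is illustrative only: the helper names `conv2d_params` and `linear_params` are hypothetical, and the layer sizes in the usage lines are made up rather than taken from the demo network.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters of a 2D convolution with a square kernel."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    biases = out_channels if bias else 0  # one bias per output channel
    return weights + biases


def linear_params(in_features, out_features, bias=True):
    """Trainable parameters of a fully connected layer."""
    weights = in_features * out_features
    biases = out_features if bias else 0  # one bias per output unit
    return weights + biases


# The example from the question: one 3 x 3 kernel, no bias.
single_kernel = conv2d_params(1, 1, 3, bias=False)

# Hypothetical layers, for illustration only.
conv_total = conv2d_params(1, 32, 3)
fc_total = linear_params(28 * 28, 512)
```

In PyTorch, a hand count like this can be cross-checked against `sum(p.numel() for p in model.parameters() if p.requires_grad)`.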
Q1.5 (2 points) Do one of the following two tasks:
a. Write code to calculate the number of parameters and explain the code.
OR
b. Write down the process of calculating the number of parameters by hand.
Q1.6 (1 point) What are your observations and conclusions about changing the network structure?
Q1.7 (2 points) Calculate the mean of the gradients of the loss with respect to all trainable parameters. Plot the gradient curve for the first 100 training steps. What are your observations? Note that the gradients are stored automatically alongside the trainable weights (in each parameter's .grad attribute) after you call loss.backward(). Hint: the mean of the gradients should decrease as training proceeds.
For more explanation of Q1.7, you can refer to the following simple instructions: https://colab.research.google.com/drive/1XAsyNegGSvMf3_B6MrsXht7-fHqtJ7OW?usp=sharing
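As a rough illustration of Q1.7, the sketch below records one gradient statistic per training step. Several things here are assumptions and not part of the demo code: the helper name `mean_gradient`, the tiny stand-in model, the random batch used in place of FashionMNIST, and the choice to average absolute values (positive and negative entries can cancel in a signed mean; drop `.abs()` if the signed mean is wanted).

```python
import torch
from torch import nn


def mean_gradient(model):
    """Mean absolute gradient over all trainable parameters that have a gradient."""
    grads = [p.grad.flatten() for p in model.parameters()
             if p.requires_grad and p.grad is not None]
    return torch.cat(grads).abs().mean().item()


torch.manual_seed(0)
# Stand-in model and data (assumptions); replace with the demo network and loader.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
images = torch.randn(64, 1, 28, 28)    # random batch shaped like FashionMNIST
labels = torch.randint(0, 10, (64,))

grad_means = []
for step in range(100):
    optimizer.zero_grad()               # clear gradients from the previous step
    loss = loss_fn(model(images), labels)
    loss.backward()                     # fills p.grad for each trainable parameter
    grad_means.append(mean_gradient(model))
    optimizer.step()
```

Plotting `grad_means` against the step index (for example with matplotlib) gives the requested curve. The statistic is read after `loss.backward()` and before the next `optimizer.zero_grad()`, since that is when the `.grad` values belong to the current step.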