
CMPUT 328 Visual Recognition - Assignment 5: Generative Models (VAE and Diffusion Models)


Assignment 5

Generative Models (VAE and Diffusion Models) CMPUT 328 - Fall 2023

1 Assignment Description

The main objective of this assignment is to implement and evaluate two of the most popular generative models, namely Variational Auto-Encoders (VAEs) and Diffusion Models. Our goal is to implement each of these models on the FashionMNIST dataset and see how they can generate new images. However, instead of simply training the models on the whole dataset, we would like to be able to tell the model which class it should generate samples from. Hence, we are going to implement class-conditional VAEs and Diffusion Models.

Figure 1: Sample images from the FashionMNIST dataset

Note: Please watch the video provided for this assignment to better understand the tasks and objectives.

2 What You Need to Do

For this assignment, 5 files are given to you:

  • A5_vae_submission.py

  • A5_vae_helper.ipynb

  • A5_diffusion_submission.py

  • A5_diffusion_helper.ipynb

  • classifier.pt

You only need to submit “A5_vae_submission.py”, “A5_diffusion_submission.py”, and the weights of your networks (“vae.pt”, “diffusion.pt”).

2.1 Task 1: Conditional VAE (40%)

2.1.1 A5_vae_submission.py

In this file there is a skeleton of a VAE class, which you are required to complete.

  1. For the VAE you need to implement the following components as specified in the code file: the encoder, mu_net (for estimating the mean), logvar_net (for estimating the log-variance), the class embedding module (for properly embedding the labels), and the decoder (for reconstructing the samples).

  2. The forward function of the VAE class must receive the batch of images and their labels, and return the reconstructed image, the estimated mean (output of mu_net), and the estimated logvar (output of logvar_net).

  3. You need to fill in the “reparameterize” method of the class given the mu and logvar vectors (as provided in the code), and implement the reparameterization trick to sample from a Gaussian distribution with mean “mu” and log-variance “logvar”.

  4. You need to fill in the “kl_loss” method of the class given the mu and logvar vectors, and compute the Kullback-Leibler (KL) divergence between the Gaussian distribution with mean “mu” and log-variance “logvar” and the standard Gaussian distribution $\mathcal{N}(0, I)$. Recall that if the mean and variance of a Gaussian distribution are $\mu$ and $\sigma^2$, respectively, the KL divergence with the standard Gaussian can be calculated as

     $$\mathrm{KL}\left(\mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(0, I)\right) = \frac{1}{2} \sum_{i=1}^{n} \left(\sigma_i^2 + \mu_i^2 - 1 - \ln(\sigma_i^2)\right) \tag{1}$$

  5. You need to fill in the “get_loss” method of the class given the input batch of images and their labels. In this method you need to compute the estimated mu, the estimated logvar, and the reconstructed image, compute the KL divergence from mu and logvar, and compute the reconstruction loss between the input image and the reconstructed image. The Binary Cross-Entropy loss is the usual choice for the reconstruction loss.

  6. Most importantly, you need to fill in the “generate_sample” method of the class, which receives the number of images to be generated along with their labels, and generates new samples from the VAE. Basically, you need to sample standard Gaussian noise, combine it with the class embedding, and pass the result to the network's decoder to generate new images.

  7. Please do not rename the VAE class and its methods. You can add as many extra functions/classes as you need in this file. You can change the arguments passed to the “__init__” method of the class based on your needs.

  8. Finally, you need to complete the “load_vae_and_generate” function at the bottom of the file, which merely requires you to define your VAE.

2.1.2 A5_vae_helper.ipynb

This file is provided to you so that you can train and validate your model more easily. Once you are done with your implementation of the VAE class, you can start running the blocks of this file to train your model, save the weights of your model, and generate new samples. You only need to specify some hyperparameters such as batch size, optimizer, learning rate, and epochs, and, of course, your model.

There is also a brief description of VAEs at the beginning of this file.
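For Task 1, the reparameterization trick and the KL term against the standard Gaussian can be sketched as follows. This is a minimal illustration in plain Python on lists of floats, not the code expected in the submission file, where the same arithmetic would be written with tensor operations. The function names mirror the method names mentioned above, but their standalone form, their signatures, and the `rng` parameter are assumptions made for this example.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    z = []
    for m, lv in zip(mu, logvar):
        # logvar = ln(sigma^2), so sigma = exp(logvar / 2)
        sigma = math.exp(0.5 * lv)
        eps = rng.gauss(0.0, 1.0)
        z.append(m + sigma * eps)
    return z


def kl_loss(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)) = 0.5 * sum(sigma^2 + mu^2 - 1 - ln(sigma^2))."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))
```

Because the randomness is isolated in `eps`, the sample is a deterministic function of `mu` and `logvar`, which is what lets gradients reach both networks during training. The KL term is zero exactly when `mu` is all zeros and `logvar` is all zeros, i.e. when the estimated distribution already equals the standard Gaussian.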

2.2 Task 2: Conditional Diffusion Model (60%)

2.2.1 A5_diffusion_submission.py

In this file there are skeletons of the VarianceScheduler class, the NoiseEstimatingNet class, and the DiffusionModel class, which you are required to complete.

  1. For the VarianceScheduler class you need to store the statistical variables required for making the images noisy and for sampling from the diffusion model, such as $\beta_t$, $\alpha_t$, and $\bar{\alpha}_t$. You also need to complete the “add_noise” method, which receives a batch of images and a batch of timesteps and computes the noisy version of the images based on the timesteps.

  2. You need to complete the NoiseEstimatingNet class, which is supposed to be a neural network (preferably a UNet) that receives the noisy version of the image, the timestep, and the label of the image, and estimates the noise added to the image. You are encouraged to look at the network architectures you have seen in the notebooks provided to you on eClass resources. Note that you can add extra functions and classes (e.g., for a time embedding module) in this file.

  3. You need to complete the “DiffusionModel” class. The forward method of the class receives a batch of input images and their labels, randomly adds noise to the images, estimates the noise using the NoiseEstimatingNet, and finally computes the loss between the ground truth noise and the estimated noise. The forward method outputs the loss.

  4. Most importantly, you need to fill in the “generate_sample” method of the DiffusionModel class, which receives the number of images to be generated along with their labels, and generates new samples using the diffusion model.

  5. Please do not rename the VarianceScheduler, NoiseEstimatingNet, and DiffusionModel classes and their methods. You can add as many extra functions/classes as you need in this file.

  6. Finally, you need to complete the “load_diffusion_and_generate” function at the bottom of the file, which merely requires you to define your VarianceScheduler and NoiseEstimatingNet.
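The bookkeeping asked of the VarianceScheduler can be sketched roughly as below. This is a plain-Python illustration under several assumptions that the handout does not state: a linear schedule for beta, the endpoint values `1e-4` and `0.02`, the constructor arguments, and an add-noise method that handles one flattened image and one timestep instead of a batch. The closed-form noising rule used is the common one, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.

```python
import math
import random


class VarianceScheduler:
    """Linear beta schedule (an assumed choice) with precomputed alpha and alpha-bar."""

    def __init__(self, num_steps=1000, beta_start=1e-4, beta_end=0.02):
        # num_steps must be at least 2 for the linear spacing below
        step = (beta_end - beta_start) / (num_steps - 1)
        self.betas = [beta_start + i * step for i in range(num_steps)]
        self.alphas = [1.0 - b for b in self.betas]
        # alpha_bar_t is the running product alpha_0 * alpha_1 * ... * alpha_t
        self.alpha_bars = []
        running = 1.0
        for a in self.alphas:
            running *= a
            self.alpha_bars.append(running)

    def add_noise(self, x0, t, rng=random):
        """Return (x_t, eps) for a single flattened image x0 at timestep t."""
        abar = self.alpha_bars[t]
        eps = [rng.gauss(0.0, 1.0) for _ in x0]
        xt = [math.sqrt(abar) * x + math.sqrt(1.0 - abar) * e for x, e in zip(x0, eps)]
        return xt, eps
```

Returning `eps` alongside the noisy image is convenient because the training loss described in item 3 compares the network's estimate against exactly this ground-truth noise.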

2.2.2 A5_diffusion_helper.ipynb

This file is provided to you so that you can train and validate your model more easily. Once you are done with your implementation of the VarianceScheduler, NoiseEstimatingNet, and DiffusionModel classes, you can start running the blocks of this file to train your model, save the weights of your model, and generate new samples. You only need to specify some hyperparameters such as batch size, optimizer, learning rate, and epochs, and, of course, your model.

There is also a brief description of Diffusion Models at the beginning of this file, including how to make the noisy images and how to sample from the diffusion model, which could be helpful.
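As a rough picture of what sampling involves, the sketch below runs a standard DDPM-style ancestral sampling loop in plain Python. It is an assumption that the notebook describes this particular rule, so the notebook's description takes precedence. The function name `ddpm_sample`, its parameters, and the `predict_noise` callable (a stand-in for a trained, label-conditioned noise-estimating network) are hypothetical names chosen for this example.

```python
import math
import random


def ddpm_sample(predict_noise, betas, dim, rng=random):
    """Start from pure Gaussian noise and denoise from t = T-1 down to t = 0."""
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        eps_hat = predict_noise(x, t)  # stand-in for the trained network's estimate
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * e) / math.sqrt(alphas[t]) for xi, e in zip(x, eps_hat)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # one common choice for the step's noise scale
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    return x
```

Each generated image costs one network evaluation per timestep, which is why the number of diffusion steps directly determines how long generation takes.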

3 Deliverables

  • The correct (working) implementation of the modules explained in the previous section.

  • For the diffusion model, use a number of diffusion steps less than or equal to 1000 for reasonably fast image generation.

  • We verify the quality of the images generated by your models by using a classifier trained on the dataset. This classifier is provided to you in the helper notebooks, and without changing the code you can run the corresponding blocks to load the classifier and apply it to your generated images.

  • For the VAE model, a final accuracy of 65% gets a full mark and an accuracy of < 55% gets no mark. Your mark will vary linearly for any accuracy in between.

  • For the Diffusion Model, a final accuracy of 60% gets a full mark and an accuracy of < 50% gets no mark. Your mark will vary linearly for any accuracy in between.
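The accuracy-based marking above can be read as a clamped linear interpolation between the two thresholds. The sketch below works that out under the assumptions that the lower threshold itself maps to zero and that each task's mark depends only on classifier accuracy. The function names and the way the 40% and 60% task weights are combined are illustrative and not taken from the handout.

```python
def mark_fraction(accuracy, zero_below, full_at):
    """Linearly interpolate between the two accuracy thresholds, clamped to [0, 1]."""
    frac = (accuracy - zero_below) / (full_at - zero_below)
    return max(0.0, min(1.0, frac))


def total_mark(vae_accuracy, diffusion_accuracy):
    """Combine both tasks using the 40% / 60% weights from the task headings."""
    vae = mark_fraction(vae_accuracy, zero_below=0.55, full_at=0.65)
    diffusion = mark_fraction(diffusion_accuracy, zero_below=0.50, full_at=0.60)
    return 40.0 * vae + 60.0 * diffusion
```

Under this reading, a VAE accuracy of 60% sits halfway between its thresholds and would earn about half of the 40 points for Task 1.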

In the following you can see some sample outputs of a simple VAE and a simple DiffusionModel trained on FashionMNIST.

Figure 2: Sample images generated by the VAE and Diffusion Model
