# Homework 3. Variational Inference
## 1. Evidence Lower Bound
$\newcommand{\bX}{\mathbf{X}}\newcommand{\by}{\mathbf{y}}\newcommand{\bI}{\mathbf{I}}$
Recall from Lab 8, our example of variational inference for a Bayesian linear regression model. Namely,
$$\begin{align*}
\by | \bX, \beta &\sim N(\bX\beta, \bI_n \sigma^2) \\
\beta &\sim N(0, \bI_p \sigma^2_b).
\end{align*}$$
We assume a mean-field variational family in which $Q$ factorizes as $$Q(\beta) = \prod_{j=1}^p Q_j(\beta_j).$$
### 1.1
Using the parameter definitions for each $Q_j$ derived in Lab 8, derive the *evidence lower bound* (ELBO) for this model.
### 1.2
Starting from the CAVI implementation in Lab 8 for the model above, evaluate the ELBO at each iteration rather than the mean squared error (MSE). The ELBO should *increase* monotonically across iterations; if it does not, there is likely a bug.
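As a point of comparison for your derivation in 1.1, here is a minimal sketch of an ELBO evaluator, assuming (as in a standard mean-field treatment of this model) that each $Q_j = N(\mu_j, s_j^2)$. The argument names are illustrative and should be matched to Lab 8's conventions.

```python
import numpy as np

def elbo(y, X, mu, s2, sigma2, sigma2_b):
    """ELBO for Bayesian linear regression with a mean-field Gaussian
    variational family Q_j = N(mu_j, s2_j). Sketch only; parameter
    names (mu, s2) are assumptions, to be matched against Lab 8."""
    n, p = X.shape
    resid = y - X @ mu
    # E_Q ||y - X beta||^2 = ||y - X mu||^2 + sum_j s2_j ||x_j||^2
    exp_sq_err = resid @ resid + np.sum(s2 * np.sum(X**2, axis=0))
    # E_Q[log p(y | X, beta)]
    log_lik = -0.5 * n * np.log(2 * np.pi * sigma2) - exp_sq_err / (2 * sigma2)
    # E_Q[log p(beta)] under the N(0, sigma2_b I) prior
    log_prior = (-0.5 * p * np.log(2 * np.pi * sigma2_b)
                 - np.sum(mu**2 + s2) / (2 * sigma2_b))
    # entropy of the Gaussian variational factors
    entropy = 0.5 * np.sum(np.log(2 * np.pi * np.e * s2))
    return log_lik + log_prior + entropy
```

Calling this after every CAVI update and asserting that the returned value never decreases is a cheap correctness check on the implementation.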
## 2. Bayesian Linear Regression Pt II
Here we assume a slightly different linear model, which is given by, $$\begin{align*}
\by | \bX, \beta &\sim N(\bX\beta, \bI_n \sigma^2) \\
\beta_j &\sim \text{Laplace}(0, b).
\end{align*}$$
We again assume a mean-field variational family in which $Q$ factorizes as $$Q(\beta) = \prod_{j=1}^p Q_j(\beta_j).$$ Rather than identify the optimal $Q_j$ through CAVI, we will first assume $Q_j := \text{Laplace}(\mu_j, b_j)$. Next, to identify updates for each $\mu_j, b_j$, we take the derivative of the ELBO with respect to each; however, the ELBO is an expectation with respect to $Q$, which itself depends on $\mu_j, b_j$, so we cannot simply differentiate through samples drawn from $Q_j$.
### 2.1
Re-write the ELBO so that the expectation is taken over a parameter-free noise variable, expressing $\beta_j$ as a deterministic transformation of that variable and of $\mu_j, b_j$ via location-scale rules (i.e., the reparameterization trick).
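The location-scale rule being invoked is that if $\epsilon \sim \text{Laplace}(0, 1)$, then $\mu_j + b_j \epsilon \sim \text{Laplace}(\mu_j, b_j)$, so sampling can be moved into a parameter-free noise variable. A sketch of such a reparameterized draw (function name and inverse-CDF sampler are illustrative):

```python
import numpy as np

def sample_beta(mu, b, rng):
    """Reparameterized draw from Laplace(mu, b): beta = mu + b * eps with
    eps ~ Laplace(0, 1) parameter-free noise. Because the transformation is
    deterministic in (mu, b), gradients of a downstream objective can flow
    through it. Sketch only; names here are illustrative."""
    u = rng.uniform(-0.5, 0.5, size=np.shape(mu))
    # inverse-CDF sample of Laplace(0, 1): eps = -sign(u) * log(1 - 2|u|)
    eps = -np.sign(u) * np.log1p(-2 * np.abs(u))
    return mu + b * eps
```

With this substitution the expectation in the ELBO is over $\epsilon$ alone, which no longer depends on $\mu_j, b_j$.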
### 2.2
Implement the above: perform stochastic variational inference, optimizing the reparameterized ELBO by gradient ascent on Monte Carlo samples.
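For orientation, a minimal single-sample stochastic-VI loop might look like the sketch below. It assumes $Q_j = \text{Laplace}(\mu_j, b_j)$ with $b_j = e^{\rho_j}$ to keep the scales positive, and uses the closed-form entropy $1 + \log(2 b_j)$ of each factor (whose $\rho_j$-gradient is simply $1$). All names, step sizes, and the $\exp$ parameterization are illustrative choices, not requirements of the assignment.

```python
import numpy as np

def svi_laplace(y, X, b_prior, sigma2, n_iters=5000, lr=1e-3, seed=0):
    """Stochastic VI sketch for the Laplace-prior linear model using
    single-sample reparameterized gradients. Q_j = Laplace(mu_j, b_j),
    b_j = exp(rho_j). Illustrative only: step size, iteration count, and
    parameterization are assumptions, not Lab 8's choices."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    mu, rho = np.zeros(p), np.full(p, np.log(0.1))
    for _ in range(n_iters):
        b = np.exp(rho)
        # reparameterized draw: beta = mu + b * eps, eps ~ Laplace(0, 1)
        u = rng.uniform(-0.5, 0.5, size=p)
        eps = -np.sign(u) * np.log1p(-2 * np.abs(u))
        beta = mu + b * eps
        # gradient of the log joint wrt beta (Gaussian likelihood + Laplace prior)
        g_beta = X.T @ (y - X @ beta) / sigma2 - np.sign(beta) / b_prior
        # chain rule: dbeta/dmu = 1, dbeta/drho = b * eps;
        # the entropy term 1 + log(2 b_j) = 1 + log 2 + rho_j contributes +1
        mu += lr * g_beta
        rho += lr * (g_beta * eps * b + 1.0)
    return mu, np.exp(rho)
```

In practice one would average several samples per step and/or use an adaptive optimizer to reduce gradient variance, but the single-sample estimator above is already unbiased.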