Directions

Most diffusion models are switching from DDPM to rectified flows, which can converge in fewer steps.

Diffusion (DDPM)

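The score function is defined as the gradient of the log-density with respect to the data, $\nabla_x \log p(x)$. (This is sometimes called the Stein score. The Fisher score is a different object: the gradient of the log-likelihood with respect to the model parameters, $\nabla_\theta \log p(x; \theta)$.)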
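
As a concrete check of the definition, here is a minimal sketch for a 1-D Gaussian, where the score has the closed form $-(x - \mu)/\sigma^2$. The function names are made up for illustration; the finite-difference version is only there to confirm the analytic one.

```python
import math

def gaussian_log_density(x, mu=0.0, sigma=1.0):
    """log N(x; mu, sigma^2) for scalar x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def gaussian_score(x, mu=0.0, sigma=1.0):
    """Analytic score: d/dx log N(x; mu, sigma^2) = -(x - mu) / sigma^2."""
    return -(x - mu) / sigma ** 2

def numerical_score(x, mu=0.0, sigma=1.0, h=1e-5):
    """Central finite difference of the log-density, used as a sanity check."""
    up = gaussian_log_density(x + h, mu, sigma)
    down = gaussian_log_density(x - h, mu, sigma)
    return (up - down) / (2 * h)
```

The score always points toward higher density: it is positive to the left of the mean, negative to the right, and zero at the mean.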

Train a model $\epsilon_\theta$ to predict the noise, given the noised image and a timestep embedding.
- Conceptually, this model finds the direction in which to move $x$ to make it more likely to be an image (up to scale, the predicted noise points opposite to the score).
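
A minimal sketch of what one training sample of this objective could look like, assuming the usual mean-squared-error noise-prediction loss and a linear $\beta$ schedule (1e-4 to 0.02 over 1000 steps, a common choice that these notes do not specify). `eps_model` is a hypothetical callable `(x_t, t) -> predicted noise`; images are plain lists of floats to keep the example dependency-free.

```python
import math
import random

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of alpha_t = 1 - beta_t for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def training_loss(eps_model, x0, alpha_bars, rng=random):
    """One Monte Carlo sample of the noise-prediction loss for a single image x0."""
    t = rng.randrange(len(alpha_bars))            # random timestep index
    eps = [rng.gauss(0.0, 1.0) for _ in x0]       # the noise the model must recover
    a = math.sqrt(alpha_bars[t])
    s = math.sqrt(1.0 - alpha_bars[t])
    x_t = [a * x + s * e for x, e in zip(x0, eps)]  # noised image
    pred = eps_model(x_t, t)                      # model's guess at eps
    return sum((p - e) ** 2 for p, e in zip(pred, eps)) / len(x0)
```

In a real setup `eps_model` would be a neural network (typically a U-Net) and this loss would be averaged over a batch and minimized by gradient descent.
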
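Sampling: at each timestep, take one small denoising step.
- Predicting the initial image directly from pure noise doesn’t work well (the model predicts the mean over plausible images, which is blurry), so move in small steps instead, which yields better results.
- At each step, subtract the predicted noise, then add back a smaller amount of fresh noise. The added noise makes each step a sample from the reverse distribution rather than its mean, which seems to help stability.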
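
The sampling loop described above can be sketched as follows. This assumes the standard DDPM ancestral update with $\sigma_t^2 = \beta_t$ (one of several variance choices) and a linear $\beta$ schedule; neither is stated in these notes. `eps_model` is again a hypothetical callable `(x_t, t) -> predicted noise`.

```python
import math
import random

def linear_betas(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed; other schedules are common too)."""
    return [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
            for i in range(num_steps)]

def ddpm_sample(eps_model, dim, betas, rng=random):
    """Run the reverse chain from pure noise down to a sample of length `dim`."""
    alpha_bars, running = [], 1.0
    for b in betas:
        running *= 1.0 - b
        alpha_bars.append(running)

    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # start from pure noise
    for t in reversed(range(len(betas))):
        beta, alpha = betas[t], 1.0 - betas[t]
        eps_hat = eps_model(x, t)
        # Subtract the (scaled) predicted noise to get the mean of the reverse step.
        coef = beta / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alpha) for xi, ei in zip(x, eps_hat)]
        if t > 0:
            # Add back fresh noise on every step except the last one.
            sigma = math.sqrt(beta)
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x
```

Skipping the noise on the final step is the usual convention, so the returned sample is the model's clean estimate rather than a noised one.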

The forward (noising) process can jump to any timestep in one step (the “nice property” of Gaussians, via the “reparameterization trick”).
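
A sketch of the closed form this most likely refers to, in standard DDPM notation that these notes do not define: $\beta_t$ is the noise variance added at step $t$, $\alpha_t = 1 - \beta_t$, and $\bar\alpha_t = \prod_{s=1}^{t} \alpha_s$.

$$
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1 - \bar\alpha_t)\,I\right)
\quad\Longleftrightarrow\quad
x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1 - \bar\alpha_t}\,\epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)
$$

It follows from unrolling the single-step update $x_t = \sqrt{\alpha_t}\,x_{t-1} + \sqrt{1 - \alpha_t}\,\epsilon_t$: the sum of two independent Gaussians is Gaussian, and the variances combine as $\alpha_t(1 - \alpha_{t-1}) + (1 - \alpha_t) = 1 - \alpha_t\alpha_{t-1}$, so the pattern repeats down to $x_0$.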

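First we need to understand what we’re modeling: the distribution $q$. The forward direction $q(x_t \mid x_{t-1})$ is easy (just add Gaussian noise), but the reverse direction $q(x_{t-1} \mid x_t)$ is intractable, so it has to be approximated by a learned model.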
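
Written out, as a sketch in standard DDPM notation (with $\beta_t$ the per-step noise variance, which these notes do not define), the forward step is a fixed Gaussian:

$$
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1 - \beta_t}\,x_{t-1},\ \beta_t I\right)
$$

The reverse step follows from Bayes' rule:

$$
q(x_{t-1} \mid x_t) = \frac{q(x_t \mid x_{t-1})\,q(x_{t-1})}{q(x_t)}
$$

The marginals $q(x_{t-1})$ and $q(x_t)$ require integrating over the unknown data distribution, which is what makes the reverse direction intractable.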
