tldr: unnormalized probability distributions. can only compare/contrast the relative likelihood of different points.

Overview

Energy-based models (EBMs) are a class of probabilistic models that define probability distributions through an energy function. Rather than directly modeling the probability of data, they assign an energy (scalar value) to each possible configuration, where lower energy states correspond to higher probabilities.

The core idea is to define a probability distribution as:

$$ p(x) = \frac{e^{-E(x)}}{Z} $$

where:

- $E(x)$ is the energy function, a scalar-valued function of the configuration $x$ (lower energy = higher probability)
- $Z = \int e^{-E(x)} \, dx$ is the partition function (normalizing constant), which is often intractable to compute

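
A minimal sketch of this idea, using a hypothetical toy quadratic energy function: because the partition function $Z$ cancels in ratios, we can compare the relative probability of two points without ever computing $Z$.

```python
import math

# Hypothetical toy energy function: a quadratic bowl centered at 0,
# so configurations near 0 get low energy (high probability).
def energy(x):
    return 0.5 * x ** 2

# Unnormalized probability e^{-E(x)}; Z is unknown/intractable in general.
def unnormalized_p(x):
    return math.exp(-energy(x))

# We can still compare two points: the ratio p(a)/p(b) needs no Z,
# since Z cancels: p(a)/p(b) = e^{-E(a)} / e^{-E(b)} = e^{E(b)-E(a)}.
ratio = unnormalized_p(0.0) / unnormalized_p(2.0)
print(ratio)  # e^2 ≈ 7.389: x=0 is ~7.4x more probable than x=2
```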
Key Characteristics

Flexibility: The energy function $E(x)$ can be any differentiable function (often a neural network), giving EBMs great expressiveness in modeling complex distributions.

Unnormalized modeling: EBMs only need to learn the relative energies between different states, not the absolute probabilities. This is useful when the partition function $Z$ is intractable to compute.

Implicit density: Unlike autoregressive models or VAEs that explicitly model how to generate data, EBMs implicitly define the distribution through the energy landscape.

Applications and Examples


You can also combine existing (generative) probability models to get the AND of them: for example, a model that generates young faces and a model that generates female faces can be combined into a model that generates young female faces. The product of the distributions is not a normalized distribution. (Note that this is unlike mixture models, which behave more like an OR.)
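
This "product of experts" combination is just energy addition, since $e^{-E_1(x)} \cdot e^{-E_2(x)} = e^{-(E_1(x) + E_2(x))}$. A sketch with two hypothetical 1-D experts standing in for the face-attribute models in the text:

```python
# Two hypothetical "expert" energy functions over a scalar x:
# one prefers x near +1, the other x near -1 (stand-ins for the
# "young" and "female" models mentioned above).
def E_young(x):
    return (x - 1.0) ** 2

def E_female(x):
    return (x + 1.0) ** 2

# Product of the unnormalized distributions = sum of the energies:
# e^{-E1(x)} * e^{-E2(x)} = e^{-(E1(x) + E2(x))}
def E_and(x):
    return E_young(x) + E_female(x)

# The combined energy is lowest at x = 0, a configuration both experts
# accept; the result is unnormalized, so again only comparisons are valid.
print(E_and(0.0) < E_and(1.0), E_and(0.0) < E_and(-1.0))  # True True
```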


Models