Variational Autoencoders (VAEs)
From compressor to generator
A plain autoencoder maps each input to a single point in latent space. The decoder is never trained on the gaps between those points, so decoding a random latent point usually gives gibberish and you can't just invent new data. A VAE fixes this by encoding each input to a distribution (a mean and a spread) instead of a point. Train it with the right loss and the latent space becomes smooth: points drawn near the origin decode to something sensible, so you can sample brand-new data.
The encoder outputs a mean μ and standard deviation σ — a little cloud in latent space.
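Draw a random point z from that cloud. The reparameterization trick writes the draw as z = μ + σ·ε, with ε sampled from a standard normal, so gradients can still flow back to μ and σ.
A KL-divergence penalty pulls all the clouds toward a standard normal, so they overlap and the space is left with few holes.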
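The three steps above can be sketched in a few lines. This is a minimal illustration using only Python's standard library, not a trainable model: the names `reparameterize` and `kl_to_standard_normal` are made up for this example, and the encoder outputs `mu` and `sigma` are hard-coded rather than produced by a network. It assumes one independent Gaussian per latent dimension.

```python
import math
import random


def reparameterize(mu, sigma, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), independently per dimension.

    The randomness lives in eps, so z is a deterministic function of
    mu and sigma. That is what lets gradients reach the encoder.
    """
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return sum(0.5 * (m * m + s * s - 1.0) - math.log(s) for m, s in zip(mu, sigma))


# Hypothetical encoder output for one input: a small cloud in a 2-D latent space.
mu = [1.5, -0.5]
sigma = [0.3, 0.8]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
z = reparameterize(mu, sigma, rng)
penalty = kl_to_standard_normal(mu, sigma)

print("sampled code:", z)
print("KL penalty:", penalty)  # zero only when mu = 0 and sigma = 1
```

In a full VAE this penalty would be added to a reconstruction loss. Implementations commonly have the encoder predict log σ² rather than σ so the spread stays positive.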
Encode, sample, decode — then generate
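Follow one input through the pipeline: the encoder maps it to a latent distribution, a code is sampled from that distribution, and the decoder maps the code back to data space. The real payoff comes after training, when you can skip the encoder entirely: sample random points from the smooth latent space and decode them to generate data the model never saw.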
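Generation itself needs only the decoder. The sketch below assumes a trained decoder is available; `toy_decoder` is a hypothetical stand-in with fixed, made-up weights, so its outputs are meaningless. Only the control flow is the point: draw z from a standard normal, then decode it.

```python
import math
import random


def toy_decoder(z):
    """Hypothetical stand-in for a trained decoder: maps a 2-D code to 3 values in (0, 1)."""
    weights = [[0.9, -0.4], [0.1, 0.7], [-0.6, 0.3]]  # made-up numbers, not learned
    bias = [0.0, 0.2, -0.1]
    out = []
    for row, b in zip(weights, bias):
        pre = sum(w * zi for w, zi in zip(row, z)) + b
        out.append(1.0 / (1.0 + math.exp(-pre)))  # sigmoid keeps outputs in (0, 1)
    return out


def generate(decoder, latent_dim, n_samples, rng):
    """Sample codes from the prior N(0, I) and decode each one. No encoder involved."""
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        samples.append(decoder(z))
    return samples


rng = random.Random(42)  # fixed seed so the sketch is repeatable
for x in generate(toy_decoder, latent_dim=2, n_samples=3, rng=rng):
    print([round(v, 3) for v in x])
```

Sampling from a standard normal works here because the KL penalty pushed the encoder's clouds toward that same distribution during training, so the decoder has already seen codes from that region.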
Where VAEs sit
- Smooth, structured latent space you can interpolate
- Stable training (just two loss terms: reconstruction error plus the KL penalty)
- Encoder + decoder in one model, so it is good for representation learning as well as generation
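The interpolation mentioned in the first bullet can be sketched as a straight-line walk between two latent codes. `lerp_latents` is an illustrative name and the two codes are made-up values. A trained decoder (not shown) would be applied to each point on the path to produce the in-between outputs.

```python
def lerp_latents(z_start, z_end, steps):
    """Return `steps` evenly spaced latent codes from z_start to z_end, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 to 1.0
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Hypothetical latent codes, e.g. the encoder means of two different inputs.
z_a = [0.0, 2.0]
z_b = [1.0, 0.0]

for z in lerp_latents(z_a, z_b, steps=5):
    print(z)
```

Linear interpolation is the simplest choice. Because the latent space is smooth, each intermediate code should decode to a plausible blend rather than noise.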
A smooth, samplable latent space is something many modern generative models build on. Latent diffusion, for example, runs the diffusion process inside a VAE-style latent space instead of directly on pixels, in part for this reason.