A variational autoencoder (VAE) is a generative model that learns a probabilistic latent space in which similar inputs map to nearby regions, so new samples can be generated by sampling points from that space and decoding them. Unlike a standard autoencoder, which encodes each input to a fixed point, a VAE encodes each input to a probability distribution, typically a Gaussian parameterized by a mean and a variance. During training, samples are drawn from these distributions and decoded.

The training objective balances two goals: reconstruction accuracy (decoded samples should match their inputs) and latent-space regularity (the encoded distributions should stay close to a standard normal prior). This regularization keeps the latent space smooth and complete: any point sampled from it decodes to something sensible. The mathematics comes from variational inference: the encoder approximates an intractable posterior, and the loss is a variational lower bound (the ELBO) on the data likelihood.

VAEs generate new samples by sampling from the prior and decoding. They enable interpolation between samples by moving through the latent space, and they support conditional generation by conditioning the decoder on additional inputs. Applications include image generation, molecular design, music synthesis, and learned representations for downstream tasks. VAEs tend to produce blurrier images than GANs or diffusion models, but they offer stable training and a principled probabilistic framework.
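The two-term objective can be sketched numerically. This is a minimal illustration, not a full model: the encoder outputs (`mu`, `logvar`) and the stand-in decoder are hypothetical placeholders, included only to make the reparameterization step and the closed-form KL term for a diagonal Gaussian concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder output for one input: mean and log-variance
# of the approximate posterior q(z|x) = N(mu, diag(sigma^2)).
mu = np.array([0.5, -0.2])
logvar = np.array([-1.0, 0.3])

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# which keeps sampling differentiable with respect to mu and logvar.
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * logvar) * eps

# Closed-form KL divergence from N(mu, diag(sigma^2)) to the
# standard normal prior N(0, I); always non-negative.
kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))

# Reconstruction term: squared error against the input, with z
# standing in for a real decoder network's output.
x = np.array([1.0, 0.0])
x_hat = z  # placeholder for decoder(z)
recon = np.sum((x - x_hat) ** 2)

# Negative ELBO (up to constants): reconstruction + regularization.
loss = recon + kl
```

In a real implementation both terms are averaged over a minibatch and minimized by gradient descent through the encoder and decoder networks; the KL term is what pulls the encoded distributions toward the prior.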