A generative adversarial network trains two neural networks in competition: a generator that creates synthetic samples and a discriminator that distinguishes real samples from generated ones, with each network improving to defeat the other. The generator takes random noise as input and transforms it into samples that mimic the training distribution, whether images, audio, or other data types. The discriminator receives both real training samples and generated samples, outputting a probability that each sample is real. The generator is trained to maximize the probability that the discriminator mistakes generated samples for real ones, while the discriminator is trained to correctly classify real and fake samples. This adversarial game continues until, ideally, the generator produces samples indistinguishable from real data.

GANs revolutionized image synthesis, producing photorealistic faces, artwork, and scene modifications that fooled human observers. StyleGAN enabled fine-grained control over generated image attributes. CycleGAN learned to translate between image domains without paired examples. BigGAN scaled the approach to ImageNet-level diversity.

However, GAN training is notoriously unstable: mode collapse occurs when the generator produces limited variety, training oscillates without converging, and hyperparameter sensitivity is extreme. Diffusion models have largely replaced GANs for image generation due to more stable training and better sample diversity, though GANs remain important for real-time applications requiring fast generation.
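The adversarial objective described above can be sketched as a pair of loss functions. This is a minimal illustration in numpy, not a full training loop; the function names are illustrative, and the generator loss shown is the common "non-saturating" variant rather than the original minimax form.

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(real) toward 1 and D(fake) toward 0."""
    return -(np.log(d_real) + np.log(1.0 - d_fake)).mean()

def generator_loss(d_fake):
    """Non-saturating generator loss: push D(G(z)) toward 1,
    i.e. maximize the probability the discriminator is fooled."""
    return -np.log(d_fake).mean()

# At the theoretical equilibrium the discriminator outputs 0.5 everywhere
# (it can no longer tell real from fake), giving losses of log 4 and log 2.
d_real = np.array([0.5, 0.5])
d_fake = np.array([0.5, 0.5])
print(round(discriminator_loss(d_real, d_fake), 4))  # 1.3863 (= log 4)
print(round(generator_loss(d_fake), 4))              # 0.6931 (= log 2)
```

In practice each step alternates: update the discriminator on a batch of real and generated samples, then update the generator through the discriminator's gradient, which is where the instabilities noted above arise.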