
Diffusion Model

A diffusion model generates data by learning to reverse a gradual noising process, iteratively denoising random noise into coherent samples through many small refinement steps.

The forward diffusion process progressively adds Gaussian noise to training data over many steps until the original signal is completely destroyed, leaving pure noise. The model learns to predict and remove the noise added at each step. Generation runs in reverse: start from random noise, repeatedly apply the learned denoising, and gradually reveal a sample from the data distribution. This iterative refinement is the key to diffusion model quality: each denoising step makes a small correction, and many small corrections accumulate into dramatic transformations.

DALL-E 2, Midjourney, Stable Diffusion, and Imagen are diffusion models that produce photorealistic images from text descriptions, achieving higher sample quality than GANs with more stable, reliable training. The architecture typically uses a U-Net that processes the noisy image conditioned on the timestep and on optional guidance signals such as text embeddings. Classifier-free guidance trades sample diversity for closer adherence to the conditioning prompt.

The main drawback is slow generation: producing one image requires hundreds or thousands of neural network evaluations. Techniques such as DDIM, progressive distillation, and consistency models accelerate sampling by reducing the number of required steps. Diffusion models now dominate image, video, and audio generation.
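The forward noising, reverse denoising loop, and classifier-free guidance described above can be sketched in a few lines of NumPy. This is a minimal toy illustration in the style of DDPM, not any library's API: the schedule constants, function names, and the stand-in noise predictor are all assumptions made for the example.

```python
import numpy as np

# Toy DDPM-style setup (hypothetical constants, not from any real model).
# A linear beta schedule sets how much Gaussian noise each forward step adds.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative signal retention per step

def forward_noise(x0, t, rng):
    """Jump directly to forward step t:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps  # the model is trained to predict eps from (xt, t)

def reverse_sample(predict_eps, shape, rng):
    """Ancestral sampling: start from pure noise, denoise step by step."""
    x = rng.standard_normal(shape)
    for t in reversed(range(T)):
        eps_hat = predict_eps(x, t)
        # Subtract the predicted noise (DDPM posterior mean).
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:  # add fresh noise at every step except the last
            x = x + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

def cfg_eps(eps_uncond, eps_cond, w):
    """Classifier-free guidance: extrapolate toward the conditional
    prediction; larger w means closer prompt adherence, less diversity."""
    return eps_uncond + w * (eps_cond - eps_uncond)

rng = np.random.default_rng(0)
x0 = np.zeros((4, 4))                  # stand-in "image"
xT, _ = forward_noise(x0, T - 1, rng)  # near-pure noise by the final step
# A real model would be a U-Net; here a zero predictor keeps the demo runnable.
sample = reverse_sample(lambda x, t: np.zeros_like(x), x0.shape, rng)
```

Note that `reverse_sample` calls the noise predictor once per step, which is exactly the cost the entry describes: `T = 1000` network evaluations for a single sample, and why step-reduction techniques like DDIM matter.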
