DDIM (Denoising Diffusion Implicit Models)
DDIM is a faster deterministic sampling method for [[Diffusion Model|diffusion models]] that generalizes [[Diffusion Model|DDPM]] by relaxing the Markovian assumption. It enables 10-50× fewer sampling steps while maintaining comparable quality, and introduces the crucial concept of deterministic inversion between noise and data.
1. Core Concept
1.1 Motivation
[[Diffusion Model|DDPM]] problem: Requires 1000 steps for high-quality sampling.
Root cause: The reverse process must closely follow the forward process’s path, which was defined as a [[Markov Process|Markov chain]] with small Gaussian steps.
DDIM insight: The [[Diffusion Model|DDPM]] objective depends only on the marginals $q(x_t \mid x_0)$, not on the joint distribution $q(x_{1:T} \mid x_0)$.
1.2 Key Innovation
DDIM defines a non-Markovian forward process:
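- [[Diffusion Model|DDPM]] forward: $q(x_t \mid x_{t-1})$ (Markovian)
- DDIM forward: $q_\sigma(x_{t-1} \mid x_t, x_0)$ (non-Markovian, conditions on $x_0$)

Both share the same marginal distribution:
$$q(x_t \mid x_0) = \mathcal{N}\left(\sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\right)$$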
[!NOTE] Core Insight
DDIM proves that the [[Diffusion Model|DDPM]] training objective is valid for a whole family of inference distributions, not just the Markovian one. Choosing a non-Markovian inference distribution lets the same trained model produce high-quality samples in far fewer steps.
2. Mathematical Foundation
2.1 [[Diffusion Model|DDPM]] Review
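[[Diffusion Model|DDPM]] forward process (Markov):
$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{\alpha_t}\, x_{t-1},\ \beta_t I\right)$$
where $\alpha_t = 1 - \beta_t$.
Marginal:
$$q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\right)$$
where $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.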
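The closed-form marginal means $x_t$ can be drawn in one shot, without simulating the chain. A minimal NumPy sketch, assuming a linear $\beta$ schedule from $10^{-4}$ to $0.02$ (a common choice, not something this note specifies); the helper names `make_alpha_bar` and `q_sample` are illustrative:

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Return (alphas, alpha_bar) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)  # alpha_bar[t-1] holds the value for step t
    return alphas, alpha_bar

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ q(x_t | x_0) directly from the marginal; t is 1-indexed."""
    eps = rng.standard_normal(x0.shape)
    ab = alpha_bar[t - 1]
    x_t = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps
    return x_t, eps

rng = np.random.default_rng(0)
alphas, alpha_bar = make_alpha_bar()
x0 = np.ones((4, 8))  # stand-in for a batch of clean data
x_t, eps = q_sample(x0, t=500, alpha_bar=alpha_bar, rng=rng)
```

The same `alpha_bar` array is all that the sampler needs later; the per-step $\beta_t$ values never appear in the DDIM update.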
2.2 DDIM Inference Distribution
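DDIM defines a non-Markovian forward process:
$$q_\sigma(x_{1:T} \mid x_0) = q_\sigma(x_T \mid x_0) \prod_{t=2}^{T} q_\sigma(x_{t-1} \mid x_t, x_0)$$
where:
$$q_\sigma(x_T \mid x_0) = \mathcal{N}\left(\sqrt{\bar{\alpha}_T}\, x_0,\ (1 - \bar{\alpha}_T) I\right)$$
and for $t > 1$:
$$q_\sigma(x_{t-1} \mid x_t, x_0) = \mathcal{N}\left(\sqrt{\bar{\alpha}_{t-1}}\, x_0 + \sqrt{1 - \bar{\alpha}_{t-1} - \sigma_t^2} \cdot \frac{x_t - \sqrt{\bar{\alpha}_t}\, x_0}{\sqrt{1 - \bar{\alpha}_t}},\ \sigma_t^2 I\right)$$
Parameter $\sigma_t$ controls the stochasticity. It is usually written in terms of a scalar $\eta$:
$$\sigma_t(\eta) = \eta\, \sqrt{\frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}}\, \sqrt{1 - \frac{\bar{\alpha}_t}{\bar{\alpha}_{t-1}}}$$
- $\eta = 1$: [[Diffusion Model|DDPM]] (fully stochastic)
- $\eta = 0$: DDIM (fully deterministic)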
2.3 Marginally Consistent
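Theorem: For any choice of $\sigma_t$, the marginals satisfy
$$q_\sigma(x_t \mid x_0) = \mathcal{N}\left(\sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\right)$$
This means all $q_\sigma$ share the same marginals as [[Diffusion Model|DDPM]], regardless of how much noise each step injects.
Consequence: A [[Diffusion Model|DDPM]]-trained model (which only depends on these marginals) can be used with any $\sigma_t$ without retraining.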
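A short induction step shows why the marginals are preserved. This is a sketch of the argument rather than the full proof, written with the conditional $q_\sigma(x_{t-1} \mid x_t, x_0)$ spelled out. Suppose $q_\sigma(x_t \mid x_0) = \mathcal{N}(\sqrt{\bar{\alpha}_t}\, x_0, (1 - \bar{\alpha}_t) I)$ holds at step $t$ (it holds at $t = T$ by definition). A draw from the conditional can be written as

$$x_{t-1} = \sqrt{\bar{\alpha}_{t-1}}\, x_0 + \sqrt{1 - \bar{\alpha}_{t-1} - \sigma_t^2}\; \underbrace{\frac{x_t - \sqrt{\bar{\alpha}_t}\, x_0}{\sqrt{1 - \bar{\alpha}_t}}}_{\sim\, \mathcal{N}(0, I)} + \sigma_t z, \qquad z \sim \mathcal{N}(0, I)$$

The two noise terms are independent Gaussians, so their variances add:

$$\mathrm{Var}(x_{t-1} \mid x_0) = (1 - \bar{\alpha}_{t-1} - \sigma_t^2) + \sigma_t^2 = 1 - \bar{\alpha}_{t-1}$$

and the mean is $\sqrt{\bar{\alpha}_{t-1}}\, x_0$. The $\sigma_t^2$ terms cancel, which is why the marginal at step $t-1$ does not depend on $\sigma_t$.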
3. DDIM Sampling
3.1 Generative Process
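DDIM reverse process:
Given a noisy sample $x_t$, compute $x_{t-1}$ as
$$x_{t-1} = \sqrt{\bar{\alpha}_{t-1}}\, \hat{x}_0 + \sqrt{1 - \bar{\alpha}_{t-1} - \sigma_t^2}\; \epsilon_\theta(x_t, t) + \sigma_t\, \epsilon_t$$
where $\hat{x}_0 = \dfrac{x_t - \sqrt{1 - \bar{\alpha}_t}\, \epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}$ and $\epsilon_t \sim \mathcal{N}(0, I)$.
Three components:
- Predicted $x_0$: Estimate of clean data from noisy $x_t$
- Direction to $x_t$: Points toward the current noisy sample
- Random noise: Controlled by $\sigma_t$ (zero for deterministic case)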
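The three components map directly onto code. The sketch below is a single update step in NumPy; the function name `ddim_step` and its argument names are illustrative, and the noise prediction is passed in as an array so that no particular network is assumed. When `sigma_t` is nonzero the caller is expected to supply `noise`.

```python
import numpy as np

def ddim_step(x_t, eps_pred, ab_t, ab_prev, sigma_t=0.0, noise=None):
    """One DDIM update from noise level ab_t to ab_prev (alpha-bar values)."""
    # 1) Predicted x_0: invert the marginal using the predicted noise
    x0_pred = (x_t - np.sqrt(1.0 - ab_t) * eps_pred) / np.sqrt(ab_t)
    # 2) Direction pointing to x_t: re-add the predicted noise at the new level
    dir_xt = np.sqrt(1.0 - ab_prev - sigma_t ** 2) * eps_pred
    # 3) Random noise: vanishes when sigma_t == 0 (deterministic case)
    if noise is None:
        noise = np.zeros_like(x_t)
    return np.sqrt(ab_prev) * x0_pred + dir_xt + sigma_t * noise

# Sanity check with an oracle noise prediction: if eps_pred is the true noise,
# the deterministic step lands exactly on the marginal at the new level.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(5)
eps = rng.standard_normal(5)
ab_t, ab_prev = 0.5, 0.8
x_t = np.sqrt(ab_t) * x0 + np.sqrt(1.0 - ab_t) * eps
x_prev = ddim_step(x_t, eps, ab_t, ab_prev)
```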
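3.2 Deterministic DDIM ($\eta = 0$)
When $\eta = 0$ (so $\sigma_t = 0$), the update reduces to
$$x_{t-1} = \sqrt{\bar{\alpha}_{t-1}}\, \hat{x}_0 + \sqrt{1 - \bar{\alpha}_{t-1}}\; \epsilon_\theta(x_t, t)$$
Properties:
- Deterministic mapping: $x_0$ uniquely determined by $x_T$
- Invertible: Can (approximately) compute $x_T$ from $x_0$ (DDIM inversion)
- Consistent: Same noise produces same output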
3.3 Accelerated Sampling
DDIM can use a subsequence of timesteps:
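Full schedule: $[1, 2, \dots, T]$
Subsampled: $[\tau_1, \tau_2, \dots, \tau_S]$ with $S \ll T$
Example (T=1000, S=50), taking every 20th step:
- Original: $[1, 2, 3, \dots, 1000]$ (1000 steps)
- Subsampled: $[20, 40, 60, \dots, 1000]$ (50 steps)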
[!TIP] Practical Choice
DDIM with 50-100 steps typically achieves quality close to full [[Diffusion Model|DDPM]] (1000 steps), giving 10-20× speedup.
3.4 Sampling Pseudocode
```python
# Deterministic DDIM Sampling
```
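A minimal sketch of a deterministic sampling loop over a subsampled schedule. Several things here are assumptions for illustration: the linear $\beta$ schedule, the convention that `alpha_bar[0] = 1` corresponds to clean data, and `toy_eps_model`, a hypothetical stand-in for the trained network (it is the closed-form optimal noise predictor when the data distribution is $\mathcal{N}(0, I)$).

```python
import numpy as np

def ddim_sample(eps_model, x_T, alpha_bar, timesteps):
    """Deterministic DDIM sampling (eta = 0).

    alpha_bar: array indexed by t = 0..T with alpha_bar[0] = 1.
    timesteps: increasing integers in 1..T (the subsequence tau).
    """
    x = x_T
    ts = list(timesteps)
    for i in reversed(range(len(ts))):
        t = ts[i]
        t_prev = ts[i - 1] if i > 0 else 0  # the last step lands on t = 0
        ab_t, ab_prev = alpha_bar[t], alpha_bar[t_prev]
        eps = eps_model(x, t)
        # Predict x_0, then move to the noise level of t_prev
        x0_pred = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        x = np.sqrt(ab_prev) * x0_pred + np.sqrt(1.0 - ab_prev) * eps
    return x

T, S = 1000, 50
betas = np.linspace(1e-4, 0.02, T)  # assumed linear schedule
alpha_bar = np.concatenate([[1.0], np.cumprod(1.0 - betas)])

def toy_eps_model(x, t):
    # Hypothetical stand-in for the network: optimal predictor for N(0, I) data
    return np.sqrt(1.0 - alpha_bar[t]) * x

rng = np.random.default_rng(0)
x_T = rng.standard_normal((2, 4))
timesteps = np.arange(T // S, T + 1, T // S)  # 20, 40, ..., 1000
sample_a = ddim_sample(toy_eps_model, x_T, alpha_bar, timesteps)
sample_b = ddim_sample(toy_eps_model, x_T, alpha_bar, timesteps)
```

Running the sampler twice from the same `x_T` gives identical outputs, which is the property that editing and inversion rely on.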
4. DDIM Inversion
4.1 Forward (Encoding) Process
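DDIM inversion reverses the sampling process: given a real image $x_0$, it runs the deterministic update in the direction of increasing noise to find a latent $x_T$ that approximately reconstructs $x_0$ under deterministic DDIM sampling.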
Algorithm:
```python
# DDIM Inversion (encoding real image to noise)
```
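A minimal sketch of inversion followed by decoding, under the same illustrative assumptions as elsewhere in this note (linear $\beta$ schedule, `alpha_bar[0] = 1`, and a hypothetical `toy_eps_model` standing in for the trained network). Inversion reuses the noise predicted at level $t$ for the step $t \to t+1$, so the round trip is approximate and tightens as the number of steps grows.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)  # assumed linear schedule
alpha_bar = np.concatenate([[1.0], np.cumprod(1.0 - betas)])

def toy_eps_model(x, t):
    # Hypothetical stand-in for the network: optimal predictor for N(0, I) data
    return np.sqrt(1.0 - alpha_bar[t]) * x

def ddim_invert(eps_model, x0, alpha_bar, T):
    """Encode x_0 into x_T by running the deterministic update upward in t."""
    x = x0
    for t in range(0, T):
        ab_t, ab_next = alpha_bar[t], alpha_bar[t + 1]
        eps = eps_model(x, t)  # approximation: eps at level t used for t -> t+1
        x0_pred = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        x = np.sqrt(ab_next) * x0_pred + np.sqrt(1.0 - ab_next) * eps
    return x

def ddim_decode(eps_model, x_T, alpha_bar, T):
    """Decode x_T back to x_0 with deterministic DDIM sampling."""
    x = x_T
    for t in range(T, 0, -1):
        ab_t, ab_prev = alpha_bar[t], alpha_bar[t - 1]
        eps = eps_model(x, t)
        x0_pred = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        x = np.sqrt(ab_prev) * x0_pred + np.sqrt(1.0 - ab_prev) * eps
    return x

rng = np.random.default_rng(0)
x0 = rng.standard_normal((2, 4))
x_T = ddim_invert(toy_eps_model, x0, alpha_bar, T)
x_rec = ddim_decode(toy_eps_model, x_T, alpha_bar, T)
max_err = np.max(np.abs(x_rec - x0))  # small but not exactly zero
```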
4.2 Applications of Inversion
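1. Real Image Editing:
- Encode image to noise: $x_0 \to x_T$
- Modify or guide the reverse process
- Decode back: $x_T \to x_0'$ (edited)

2. Semantic Interpolation:
- Encode two images: $x_T^{(1)}$, $x_T^{(2)}$
- Interpolate in noise space: $x_T^{(\lambda)} = (1 - \lambda)\, x_T^{(1)} + \lambda\, x_T^{(2)}$ (or spherical interpolation, see 11.3)
- Decode: $x_T^{(\lambda)} \to x_0^{(\lambda)}$ (interpolated)

3. Attribute Manipulation:
- Encode image
- Apply semantic direction in noise space
- Decode for controlled editing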
5. Theoretical Analysis
5.1 Why Fewer Steps Work
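[[Diffusion Model|DDPM]] problem: Each reverse step is a small Gaussian step that assumes close proximity between $x_t$ and $x_{t-1}$, so skipping steps breaks the approximation.
DDIM solution: The generative process directly jumps to the predicted $\hat{x}_0$ and then re-noises it to the level of the next timestep in the schedule, so consecutive timesteps do not have to be adjacent.
Mathematical justification:
The [[Diffusion Model|DDPM]] training loss:
$$L_{\text{simple}} = \mathbb{E}_{t, x_0, \epsilon}\left[\left\| \epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\ t\right) \right\|^2\right]$$
depends only on the marginal $q(x_t \mid x_0)$, not on how consecutive steps are coupled. Any generative process that is consistent with these marginals can therefore reuse the same trained $\epsilon_\theta$.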
5.2 Consistency Properties
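Consistency property: For the same initial noise $x_T$, deterministic DDIM samples generated with different numbers of steps share the same high-level features, and the outputs converge as the number of steps increases.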
Practical implication:
- 10 steps: Approximate result, some artifacts
- 50 steps: Good quality, minor differences from 1000-step [[Diffusion Model|DDPM]]
- 100 steps: Nearly identical to [[Diffusion Model|DDPM]]
5.3 Connection to [[Probability Flow ODE]]
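DDIM as ODE discretization:
The deterministic update can be rewritten as
$$\frac{x_{t - \Delta t}}{\sqrt{\bar{\alpha}_{t - \Delta t}}} = \frac{x_t}{\sqrt{\bar{\alpha}_t}} + \left(\sqrt{\frac{1 - \bar{\alpha}_{t - \Delta t}}{\bar{\alpha}_{t - \Delta t}}} - \sqrt{\frac{1 - \bar{\alpha}_t}{\bar{\alpha}_t}}\right) \epsilon_\theta(x_t, t)$$
As $\Delta t \to 0$, this is the Euler discretization of the ODE
$$d\bar{x}(t) = \epsilon_\theta\left(\frac{\bar{x}(t)}{\sqrt{\rho(t)^2 + 1}},\ t\right) d\rho(t), \qquad \bar{x} = \frac{x}{\sqrt{\bar{\alpha}}},\quad \rho = \sqrt{\frac{1 - \bar{\alpha}}{\bar{\alpha}}}$$
which is equivalent to the [[Probability Flow ODE]] of the corresponding diffusion.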
DDIM vs ODE solvers:
| Method | Type | Steps | Quality |
|---|---|---|---|
| DDIM-50 | Discrete ODE | 50 | High |
| DDIM-100 | Discrete ODE | 100 | Very High |
| [[DPM-Solver]]-2 | Higher-order ODE | 20 | Very High |
| Euler ODE | 1st-order ODE | 100 | Medium-High |
[!NOTE] Historical Significance
DDIM was among the first works to show that a trained diffusion model can be sampled in far fewer steps without retraining, paving the way for subsequent ODE-based samplers such as [[DPM-Solver]] and for the [[Probability Flow ODE]] view of sampling.
6. Comparison with [[Diffusion Model|DDPM]]
6.1 Forward Process
| Aspect | [[Diffusion Model|DDPM]] | DDIM |
|---|---|---|
| Type | Markovian | Non-Markovian |
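| Joint distribution | $q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})$ | $q_\sigma(x_{1:T} \mid x_0) = q_\sigma(x_T \mid x_0) \prod_{t=2}^{T} q_\sigma(x_{t-1} \mid x_t, x_0)$ |
| Marginals | $q(x_t \mid x_0) = \mathcal{N}(\sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I)$ | Same |
| Inference | $q(x_t \mid x_{t-1})$ | $q_\sigma(x_{t-1} \mid x_t, x_0)$ |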
6.2 Reverse (Sampling) Process
| Aspect | [[Diffusion Model|DDPM]] | DDIM ($\eta = 1$) | DDIM ($\eta = 0$) |
|---|---|---|---|
| Stochasticity | Random | Random | Deterministic |
| Noise injection | Yes | Yes | No |
| Invertible | No | No | Yes |
| Steps | 1000 | 10-1000 | 10-1000 |
6.3 Quality vs Speed Trade-off
| Method | FID (CIFAR-10) | Steps | Time (relative) |
|---|---|---|---|
| [[Diffusion Model|DDPM]] | 3.17 | 1000 | 1.0× |
| DDIM ($\eta = 0$) | 4.16 | 100 | 0.1× |
| DDIM ($\eta = 0$) | 4.67 | 50 | 0.05× |
| DDIM ($\eta = 0$) | 6.84 | 20 | 0.02× |
| DDIM ($\eta = 0$) | 13.36 | 10 | 0.01× |
[!TIP] Speed-Quality Balance
DDIM-100 provides an excellent balance: nearly the same quality as [[Diffusion Model|DDPM]]-1000 but 10× faster.
7. Stochastic DDIM ($\eta > 0$)
7.1 Continuous Stochasticity Control
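General DDIM update:
$$x_{t-1} = \sqrt{\bar{\alpha}_{t-1}}\, \hat{x}_0 + \sqrt{1 - \bar{\alpha}_{t-1} - \sigma_t^2}\; \epsilon_\theta(x_t, t) + \sigma_t\, \epsilon_t, \qquad \sigma_t = \eta\, \sqrt{\frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}}\, \sqrt{1 - \frac{\bar{\alpha}_t}{\bar{\alpha}_{t-1}}}$$
Effects of $\eta$:
| $\eta$ | Stochasticity | Diversity | Determinism | Best For |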
|---|---|---|---|---|
| 0.0 | None | Fixed output | Perfect | Editing, inversion |
| 0.2 | Low | Some variation | Near-deterministic | Balanced quality |
| 0.5 | Medium | Moderate | Partial | General sampling |
| 0.8 | High | High | Low | Diverse generation |
| 1.0 | Full ([[Diffusion Model|DDPM]]) | Maximum | None | Maximum quality |
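To make the role of $\eta$ concrete, the sketch below computes $\sigma_t$ from $\eta$ and checks the two endpoints: $\eta = 0$ gives zero noise, and $\eta = 1$ recovers the DDPM posterior variance $\tilde{\beta}_t = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t} \beta_t$. The function name `ddim_sigma` and the example $\bar{\alpha}$ values are illustrative.

```python
import numpy as np

def ddim_sigma(ab_t, ab_prev, eta):
    """Noise scale sigma_t for a step from alpha-bar ab_t to ab_prev."""
    return eta * np.sqrt((1.0 - ab_prev) / (1.0 - ab_t)) * np.sqrt(1.0 - ab_t / ab_prev)

ab_t, ab_prev = 0.5, 0.8  # arbitrary example noise levels
sigma_det = ddim_sigma(ab_t, ab_prev, eta=0.0)   # deterministic DDIM
sigma_half = ddim_sigma(ab_t, ab_prev, eta=0.5)  # intermediate
sigma_ddpm = ddim_sigma(ab_t, ab_prev, eta=1.0)  # DDPM-like

# DDPM posterior variance for the same pair of noise levels
beta_t = 1.0 - ab_t / ab_prev
posterior_var = (1.0 - ab_prev) / (1.0 - ab_t) * beta_t
```

Because $\sigma_t$ is linear in $\eta$, intermediate values simply scale the injected noise between these two endpoints.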
7.2 When to Use Each Mode
Deterministic ($\eta = 0$):
- Image editing (need reproducibility)
- DDIM inversion
- Semantic interpolation
- Latent space exploration
Stochastic ($\eta = 1$):
- Maximum sample diversity
- Unconditional generation
- When quality trumps speed
Intermediate ($0 < \eta < 1$):
- Controlled diversity
- Balanced speed-quality trade-off
8. Applications
8.1 Real Image Editing
DDIM Inversion + Editing Pipeline:
- Encode: Use DDIM inversion to map real image to noise
- Modify: Apply text-guided editing in the denoising process
- Decode: Generate edited image
Key advantage: Starting the reverse process from the inverted noise preserves the structure of the original image, whereas starting from random noise does not.
Examples: Prompt-to-Prompt, Null-text Inversion, EDICT.
8.2 Semantic Interpolation
Process:
- Encode image A and B via DDIM inversion
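- Interpolate noise codes: $x_T^{(\lambda)} = (1 - \lambda)\, x_T^{A} + \lambda\, x_T^{B}$ (or spherical interpolation, see 11.3)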
- Decode interpolated noise
Result: Smooth semantic transition between images.
8.3 Latent Space Manipulation
Finding semantic directions:
- Encode many images
- Find directions in noise space corresponding to attributes
- Apply directional shifts for controlled editing
8.4 Accelerated Training
Progressive distillation:
- Train teacher model with [[Diffusion Model|DDPM]]
- Use DDIM to generate high-quality samples
- Train student model with fewer steps
- Repeat for further acceleration
9. Practical Implementation
9.1 Training
Key fact: DDIM uses the same training as [[Diffusion Model|DDPM]]!
```python
# DDIM Training = DDPM Training
```
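A minimal sketch of the shared training objective, written in NumPy so it runs without a deep learning framework. It only evaluates the loss for one batch; the gradient step is omitted. The linear $\beta$ schedule and `zero_model` (a hypothetical untrained predictor that always outputs zero) are assumptions for illustration.

```python
import numpy as np

def ddpm_loss(eps_model, x0, alpha_bar, rng):
    """Monte Carlo estimate of the epsilon-prediction loss for one batch."""
    batch = x0.shape[0]
    T = len(alpha_bar)
    t = rng.integers(1, T + 1, size=batch)  # one uniform timestep per example
    eps = rng.standard_normal(x0.shape)
    # Broadcast alpha_bar[t] over the non-batch dimensions
    ab = alpha_bar[t - 1].reshape(batch, *([1] * (x0.ndim - 1)))
    x_t = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps  # sample from the marginal
    return np.mean((eps - eps_model(x_t, t)) ** 2)

rng = np.random.default_rng(0)
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))  # assumed schedule
x0 = rng.standard_normal((16, 8))  # stand-in batch of clean data

def zero_model(x_t, t):
    # Hypothetical untrained predictor: always predicts zero noise
    return np.zeros_like(x_t)

loss = ddpm_loss(zero_model, x0, alpha_bar, rng)
```

Nothing in this loss refers to $\eta$, $\sigma_t$, or the sampling schedule, which is why a model trained this way can be used with either sampler.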
9.2 Timestep Selection
Strategies for subsampling:
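- Uniform: Select every $(T/S)$-th step
  - Simple but suboptimal
- Quadratic: More steps near $t = 0$

```python
import numpy as np

def quadratic_schedule(T, S):
    # Square an even grid so steps cluster near t = 0, then rescale to [0, T]
    steps = np.linspace(0, T, S + 1) ** 2
    steps = (steps / steps[-1] * T).astype(int)
    return sorted(set(steps.tolist()))  # rounding can merge steps
```

- Linear: Evenly spaced steps

```python
import numpy as np

def linear_schedule(T, S):
    return np.linspace(0, T, S + 1).astype(int)  # S + 1 points, 0..T inclusive
```
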
Recommendation: Linear schedule works well for 50+ steps; quadratic better for very few steps (< 20).
9.3 Debugging Checklist
- [ ] Verify $\alpha_t$ and $\bar{\alpha}_t$ values are correct
- [ ] Check $\hat{x}_0$ prediction formula matches noise schedule
- [ ] Test with $\eta = 0$ (deterministic) first
- [ ] Compare output with [[Diffusion Model|DDPM]] baseline
- [ ] Verify inversion consistency: $\text{decode}(\text{encode}(x_0)) \approx x_0$
- [ ] Monitor numerical stability (no NaN/Inf)
- [ ] Test with different step counts (10, 20, 50, 100)
10. Core Formula Cards
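[!QUOTE] DDIM Marginal Distribution
$$q_\sigma(x_t \mid x_0) = \mathcal{N}\left(\sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\right)$$

[!QUOTE] DDIM Generative Process
$$x_{t-1} = \sqrt{\bar{\alpha}_{t-1}}\, \hat{x}_0 + \sqrt{1 - \bar{\alpha}_{t-1} - \sigma_t^2}\; \epsilon_\theta(x_t, t) + \sigma_t\, \epsilon_t$$

[!QUOTE] Deterministic DDIM ($\eta = 0$)
$$x_{t-1} = \sqrt{\bar{\alpha}_{t-1}}\, \hat{x}_0 + \sqrt{1 - \bar{\alpha}_{t-1}}\; \epsilon_\theta(x_t, t)$$

[!QUOTE] Predicted $x_0$ (One-step Estimation)
$$\hat{x}_0 = \frac{x_t - \sqrt{1 - \bar{\alpha}_t}\, \epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}$$

[!QUOTE] DDIM Inversion (Forward)
$$x_{t+1} = \sqrt{\bar{\alpha}_{t+1}}\, \hat{x}_0 + \sqrt{1 - \bar{\alpha}_{t+1}}\; \epsilon_\theta(x_t, t)$$

[!QUOTE] Stochasticity Parameter
$$\sigma_t(\eta) = \eta\, \sqrt{\frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}}\, \sqrt{1 - \frac{\bar{\alpha}_t}{\bar{\alpha}_{t-1}}}$$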
11. Extensions and Variants
11.1 DDIM with Classifier Guidance
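Combine DDIM sampling with classifier guidance:
$$\hat{\epsilon} = \epsilon_\theta(x_t, t) - s\, \sqrt{1 - \bar{\alpha}_t}\; \nabla_{x_t} \log p_\phi(y \mid x_t)$$
where $s$ is the guidance scale and $p_\phi(y \mid x_t)$ is a classifier trained on noisy inputs; $\hat{\epsilon}$ replaces $\epsilon_\theta(x_t, t)$ in the DDIM update.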
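11.2 DDIM with Classifier-Free Guidance
Replace the noise prediction in the DDIM update with a combination of conditional and unconditional predictions (one common convention, with guidance weight $w$):
$$\tilde{\epsilon} = \epsilon_\theta(x_t, t, \varnothing) + w \left(\epsilon_\theta(x_t, t, c) - \epsilon_\theta(x_t, t, \varnothing)\right)$$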
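A small sketch of how classifier-free guidance plugs into a deterministic DDIM step. It assumes the convention $\tilde{\epsilon} = \epsilon_{\text{uncond}} + w\,(\epsilon_{\text{cond}} - \epsilon_{\text{uncond}})$, where $w = 1$ means no extra guidance; other write-ups use a shifted weight. The function names and the random arrays standing in for network outputs are illustrative.

```python
import numpy as np

def cfg_eps(eps_cond, eps_uncond, w):
    """Guided noise prediction: extrapolate from unconditional toward conditional."""
    return eps_uncond + w * (eps_cond - eps_uncond)

def ddim_step_guided(x_t, eps_cond, eps_uncond, w, ab_t, ab_prev):
    """Deterministic DDIM step using the guided noise prediction."""
    eps = cfg_eps(eps_cond, eps_uncond, w)
    x0_pred = (x_t - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
    return np.sqrt(ab_prev) * x0_pred + np.sqrt(1.0 - ab_prev) * eps

rng = np.random.default_rng(0)
x_t = rng.standard_normal(6)
eps_c = rng.standard_normal(6)  # stand-in for the conditional prediction
eps_u = rng.standard_normal(6)  # stand-in for the unconditional prediction
x_prev = ddim_step_guided(x_t, eps_c, eps_u, w=7.5, ab_t=0.5, ab_prev=0.8)
```

Guidance changes only the noise estimate, so the sampler itself and any inversion routine keep the same structure.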
11.3 Spherical DDIM
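Motivation: In high dimensions, Gaussian noise codes concentrate near a sphere of radius $\sqrt{d}$, so a linear blend of two codes has a smaller norm than the noise the model expects.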
Fix: Normalize to unit sphere for better interpolation.
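One common way to interpolate along the sphere is spherical linear interpolation (slerp). The sketch below is a generic NumPy implementation, not code from this note; it compares the norm of the midpoint under slerp and under plain linear interpolation for two random noise codes.

```python
import numpy as np

def slerp(z1, z2, lam):
    """Spherical linear interpolation between two noise codes, lam in [0, 1]."""
    a, b = z1.ravel(), z2.ravel()
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # angle between the codes
    if np.isclose(np.sin(theta), 0.0):
        # Nearly parallel codes: fall back to linear interpolation
        return (1.0 - lam) * z1 + lam * z2
    w1 = np.sin((1.0 - lam) * theta) / np.sin(theta)
    w2 = np.sin(lam * theta) / np.sin(theta)
    return w1 * z1 + w2 * z2

rng = np.random.default_rng(0)
z1 = rng.standard_normal(1024)
z2 = rng.standard_normal(1024)
mid_slerp = slerp(z1, z2, 0.5)
mid_lerp = 0.5 * z1 + 0.5 * z2

norm_inputs = (np.linalg.norm(z1), np.linalg.norm(z2))
norm_slerp = np.linalg.norm(mid_slerp)  # stays close to the input norms
norm_lerp = np.linalg.norm(mid_lerp)    # noticeably smaller
```

For nearly orthogonal codes, which is the typical case in high dimensions, the linear midpoint loses roughly a factor of $\sqrt{2}$ in norm while the slerp midpoint does not.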
11.4 Comparison with Other Fast Samplers
| Method | Type | Steps | Inversion | Training Required |
|---|---|---|---|---|
| DDIM | Discrete ODE | 10-100 | Yes | No (uses [[Diffusion Model|DDPM]] model) |
| [[Diffusion Model|DDPM]] (few-step) | [[Stochastic Differential Equation (SDE)|SDE]] | 50-100 | No | No |
| [[DPM-Solver]] | High-order ODE | 10-20 | Possible | No |
| [[Flow Matching]] | ODE | 10-100 | Yes | Separate training |
| Consistency Models | Direct mapping | 1-8 | No | Separate training |
Related Concepts
- [[Diffusion Model]]
- [[Probability Flow ODE]]
- [[Stochastic Differential Equation (SDE)]]
- [[Score Function]]
- [[DPM-Solver]]
- [[Langevin Dynamics]]
- [[Flow Matching]]
- [[Consistency Models]]
- [[Wiener Process]]
- [[Markov Process]]
- [[Neural ODE]]
- [[Prompt-to-Prompt]]
References
- Paper: Denoising Diffusion Implicit Models (Song et al., 2021)
- Paper: Denoising Diffusion Probabilistic Models (Ho et al., 2020)
- Paper: Score-Based Generative Modeling through SDEs (Song et al., 2021)
- Blog: What are Diffusion Models? - Lilian Weng
- Blog: DDIM Explained - Papers with Code
- GitHub: https://github.com/ermongroup/ddim
- Course: CS236 Deep Generative Models (Stanford)