DPM-Solver
DPM-Solver is a fast ordinary differential equation (ODE) solver specifically designed for [[Diffusion Model|Diffusion Probabilistic Models]] (DPMs). It exploits the semi-linear structure of the [[Probability Flow ODE]] to achieve high-order convergence with significantly fewer function evaluations, enabling fast sampling with only 10-20 steps.
1. Core Concept
1.1 Motivation
Standard [[Diffusion Model|diffusion models]] require 1000+ steps for high-quality sampling due to:
- Discretization error in the Euler-Maruyama method
- Stiff dynamics near $t = 0$
- Random noise accumulation in [[Stochastic Differential Equation (SDE)|SDE]]-based sampling
DPM-Solver addresses this by:
- Using the deterministic [[Probability Flow ODE]] formulation
- Exploiting the semi-linear structure for analytical solutions
- Designing high-order solvers specifically for diffusion models
1.2 Key Innovation
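The [[Probability Flow ODE]] has a semi-linear structure:
$$\frac{dx_t}{dt} = f(t)\,x_t - \frac{1}{2}g^2(t)\,\nabla_x \log p_t(x_t)$$
where:
- $f(t)\,x_t$: Linear term (can be solved analytically)
- $-\frac{1}{2}g^2(t)\,\nabla_x \log p_t(x_t)$: Nonlinear term (score function, requires numerical integration)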
[!NOTE] Semi-linear Advantage
By solving the linear part exactly and only approximating the nonlinear part, DPM-Solver achieves much higher accuracy than general-purpose ODE solvers with the same number of function evaluations.
2. Mathematical Foundation
2.1 [[Probability Flow ODE]] Recap
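The forward [[Stochastic Differential Equation (SDE)|SDE]]:
$$dx_t = f(t)\,x_t\,dt + g(t)\,dw_t$$
has an equivalent [[Probability Flow ODE]]:
$$\frac{dx_t}{dt} = f(t)\,x_t - \frac{1}{2}g^2(t)\,\nabla_x \log p_t(x_t)$$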
2.2 Change of Variables
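Define signal-to-noise ratio parameters, with $p(x_t \mid x_0) = \mathcal{N}(\alpha_t x_0, \sigma_t^2 I)$:
$$\lambda_t = \log\frac{\alpha_t}{\sigma_t}$$
The ODE can be rewritten in terms of the noise prediction network $\epsilon_\theta(x_t, t) \approx -\sigma_t \nabla_x \log p_t(x_t)$ as:
$$\frac{dx_t}{dt} = f(t)\,x_t + \frac{g^2(t)}{2\sigma_t}\,\epsilon_\theta(x_t, t)$$
where $f(t) = \frac{d\log\alpha_t}{dt}$ and $g^2(t) = \frac{d\sigma_t^2}{dt} - 2\frac{d\log\alpha_t}{dt}\sigma_t^2$.
2.3 Semi-linear Form
Rearranging terms, using $\frac{g^2(t)}{2\sigma_t} = -\sigma_t\frac{d\lambda_t}{dt}$:
$$\frac{dx_t}{dt} = \frac{d\log\alpha_t}{dt}\,x_t - \sigma_t\frac{d\lambda_t}{dt}\,\epsilon_\theta(x_t, t)$$
This is a semi-linear ODE: the first term is linear in $x_t$ and can be integrated exactly, so only the $\epsilon_\theta$ term needs numerical approximation.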
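To make the quantities above concrete, here is a minimal sketch of $\alpha_t$, $\sigma_t$, $\lambda_t = \log(\alpha_t/\sigma_t)$ and the inverse map from $\lambda$ back to $t$. It assumes a linear variance-preserving noise schedule with endpoints `BETA_0` and `BETA_1`; the schedule, the constants, and all function names are illustrative assumptions, not something this note specifies.

```python
import math

BETA_0, BETA_1 = 0.1, 20.0  # assumed linear VP schedule endpoints


def log_alpha(t):
    """log(alpha_t) for the assumed linear VP schedule."""
    return -0.25 * (BETA_1 - BETA_0) * t * t - 0.5 * BETA_0 * t


def alpha(t):
    return math.exp(log_alpha(t))


def sigma(t):
    # sigma_t = sqrt(1 - alpha_t^2); expm1 keeps precision when t is near 0
    return math.sqrt(-math.expm1(2.0 * log_alpha(t)))


def lam(t):
    """Half log-SNR: lambda_t = log(alpha_t / sigma_t)."""
    return log_alpha(t) - math.log(sigma(t))


def t_of_lambda(l):
    """Inverse of lam(t) for t > 0 (closed form for the linear schedule)."""
    log_term = math.log1p(math.exp(-2.0 * l))  # equals -2 * log(alpha_t)
    root = math.sqrt(BETA_0 ** 2 + 2.0 * (BETA_1 - BETA_0) * log_term)
    return (root - BETA_0) / (BETA_1 - BETA_0)


for t in (1.0, 0.5, 1e-3):
    print(t, alpha(t), sigma(t), lam(t), t_of_lambda(lam(t)))
```

Because $\lambda_t$ is strictly decreasing in $t$, a solver can step in $\lambda$ and recover the matching time with `t_of_lambda` whenever the network needs a time input.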
3. DPM-Solver Algorithm
3.1 Exact Solution of Linear Part
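The linear ODE $\frac{dx_t}{dt} = f(t)\,x_t$ has the exact solution $x_t = \exp\left(\int_s^t f(\tau)\,d\tau\right) x_s$, which equals $\frac{\alpha_t}{\alpha_s}\,x_s$ when $f(t) = \frac{d\log\alpha_t}{dt}$.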
3.2 Variation of Constants Formula
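Using variation of constants, the full solution is:
$$x_t = e^{\int_s^t f(\tau)\,d\tau}\,x_s + \int_s^t e^{\int_\tau^t f(r)\,dr}\,\frac{g^2(\tau)}{2\sigma_\tau}\,\epsilon_\theta(x_\tau, \tau)\,d\tau$$
Define:
$$\lambda_t = \log\frac{\alpha_t}{\sigma_t},\qquad \hat x_\lambda = x_{t_\lambda(\lambda)},\qquad \hat\epsilon_\theta(\hat x_\lambda, \lambda) = \epsilon_\theta\left(x_{t_\lambda(\lambda)}, t_\lambda(\lambda)\right)$$
where $t_\lambda(\cdot)$ is the inverse of $t \mapsto \lambda_t$. Then:
$$x_t = \frac{\alpha_t}{\alpha_s}\,x_s - \alpha_t \int_{\lambda_s}^{\lambda_t} e^{-\lambda}\,\hat\epsilon_\theta(\hat x_\lambda, \lambda)\,d\lambda$$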
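As a worked step, the change of variable from $t$ to $\lambda$ can be sketched as follows. This assumes the noise-prediction form of the ODE with $f(t) = \frac{d\log\alpha_t}{dt}$, $\lambda_t = \log(\alpha_t/\sigma_t)$, and the identity $\frac{g^2(\tau)}{2\sigma_\tau} = -\sigma_\tau\frac{d\lambda_\tau}{d\tau}$, which follows from $g^2(t) = \frac{d\sigma_t^2}{dt} - 2\frac{d\log\alpha_t}{dt}\sigma_t^2$. The hatted symbols denote the same quantities indexed by $\lambda$ instead of $t$.

```latex
\begin{aligned}
x_t &= \frac{\alpha_t}{\alpha_s}\,x_s
      + \int_s^t \frac{\alpha_t}{\alpha_\tau}\,\frac{g^2(\tau)}{2\sigma_\tau}\,
        \epsilon_\theta(x_\tau,\tau)\,d\tau \\
    &= \frac{\alpha_t}{\alpha_s}\,x_s
      - \alpha_t \int_s^t \frac{d\lambda_\tau}{d\tau}\,\frac{\sigma_\tau}{\alpha_\tau}\,
        \epsilon_\theta(x_\tau,\tau)\,d\tau \\
    &= \frac{\alpha_t}{\alpha_s}\,x_s
      - \alpha_t \int_{\lambda_s}^{\lambda_t} e^{-\lambda}\,
        \hat\epsilon_\theta(\hat x_\lambda,\lambda)\,d\lambda
\end{aligned}
```

The last line uses $\sigma_\tau/\alpha_\tau = e^{-\lambda_\tau}$ and $d\lambda = \frac{d\lambda_\tau}{d\tau}\,d\tau$.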
3.3 First-Order DPM-Solver (DPM-Solver-1)
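Approximate $\hat\epsilon_\theta(\hat x_\lambda, \lambda) \approx \epsilon_\theta(x_s, s)$ as constant over $[\lambda_s, \lambda_t]$; the remaining integral then has a closed form:
$$x_t = \frac{\alpha_t}{\alpha_s}\,x_s - \sigma_t\left(e^{h} - 1\right)\epsilon_\theta(x_s, s),\qquad h = \lambda_t - \lambda_s$$
This is equivalent to the DDIM update rule.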
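The equivalence with DDIM can be checked numerically. The sketch below writes the first-order update and the deterministic DDIM update as two separate functions and evaluates both on the same inputs; the function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def dpm_solver_1_step(x, eps, alpha_s, sigma_s, alpha_t, sigma_t):
    """One first-order step from time s to time t, given eps = eps_theta(x_s, s)."""
    h = math.log(alpha_t / sigma_t) - math.log(alpha_s / sigma_s)  # lambda_t - lambda_s
    return (alpha_t / alpha_s) * x - sigma_t * math.expm1(h) * eps


def ddim_step(x, eps, alpha_s, sigma_s, alpha_t, sigma_t):
    """Deterministic DDIM: predict x_0, then re-noise to level t with the same eps."""
    x0_pred = (x - sigma_s * eps) / alpha_s
    return alpha_t * x0_pred + sigma_t * eps


# Hypothetical values for a variance-preserving process (alpha^2 + sigma^2 = 1).
alpha_s, sigma_s = 0.6, 0.8
alpha_t, sigma_t = 0.8, 0.6
x_s, eps = 1.3, -0.4

a = dpm_solver_1_step(x_s, eps, alpha_s, sigma_s, alpha_t, sigma_t)
b = ddim_step(x_s, eps, alpha_s, sigma_s, alpha_t, sigma_t)
print(a, b)  # the two updates agree up to floating-point round-off
```

The agreement follows from $\sigma_t e^{h} = \alpha_t \sigma_s / \alpha_s$, which turns one expression into the other.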
3.4 Second-Order DPM-Solver (DPM-Solver-2)
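Use linear approximation for $\hat\epsilon_\theta$, with $h = \lambda_t - \lambda_s$ and $t_\lambda(\cdot)$ the inverse of $t \mapsto \lambda_t$:
- Predictor step: Compute intermediate point $u$ at $s_1 = t_\lambda\left(\lambda_s + \frac{h}{2}\right)$:
$$u = \frac{\alpha_{s_1}}{\alpha_s}\,x_s - \sigma_{s_1}\left(e^{h/2} - 1\right)\epsilon_\theta(x_s, s)$$
- Corrector step: Use midpoint rule:
$$x_t = \frac{\alpha_t}{\alpha_s}\,x_s - \sigma_t\left(e^{h} - 1\right)\epsilon_\theta(u, s_1)$$
Function evaluations: 2 per step (at $s$ and $s_1$)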
3.5 Third-Order DPM-Solver (DPM-Solver-3)
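Use quadratic approximation of $\hat\epsilon_\theta$ with two intermediate points, placed at fractions $r_1 = \frac{1}{3}$ and $r_2 = \frac{2}{3}$ of the log-SNR step:
- First intermediate: $u_1$ at $s_1 = t_\lambda(\lambda_s + r_1 h)$, obtained by a first-order step from $x_s$
- Second intermediate: $u_2$ at $s_2 = t_\lambda(\lambda_s + r_2 h)$, obtained from $x_s$ with a correction based on $\epsilon_\theta(u_1, s_1) - \epsilon_\theta(x_s, s)$
- Final step: Combine the evaluations into a third-order update for $x_t$, analogous to Simpson's rule
where $h = \lambda_t - \lambda_s$ and $t_\lambda(\cdot)$ is the inverse of $t \mapsto \lambda_t$.
Function evaluations: 3 per step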
4. Algorithm Summary
4.1 DPM-Solver Pseudocode
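DPM-Solver-2 sampling algorithm: draw $x_T \sim \mathcal{N}(0, I)$, then for each interval of a decreasing time grid apply the predictor and corrector steps of Section 3.4; the final iterate is the sample.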
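A minimal sketch of a DPM-Solver-2 sampling loop follows. It is not the reference implementation: it assumes a linear variance-preserving schedule, works on a scalar state, and replaces the trained network with the exact noise prediction for a toy Gaussian data distribution $\mathcal{N}(\mu, c^2)$, so that the result can be compared with a closed-form solution. All names (`eps_model`, `dpm_solver_2_sample`, `exact_sample`, the constants) are hypothetical.

```python
import math

BETA_0, BETA_1 = 0.1, 20.0  # assumed linear VP schedule endpoints
MU, C = 1.0, 0.5            # toy data distribution N(MU, C^2), chosen for illustration


def log_alpha(t):
    return -0.25 * (BETA_1 - BETA_0) * t * t - 0.5 * BETA_0 * t


def sigma(t):
    return math.sqrt(-math.expm1(2.0 * log_alpha(t)))


def lam(t):
    return log_alpha(t) - math.log(sigma(t))


def t_of_lambda(l):
    log_term = math.log1p(math.exp(-2.0 * l))
    root = math.sqrt(BETA_0 ** 2 + 2.0 * (BETA_1 - BETA_0) * log_term)
    return (root - BETA_0) / (BETA_1 - BETA_0)


def marginal_var(t):
    """Variance of x_t when x_0 ~ N(MU, C^2)."""
    return math.exp(2.0 * log_alpha(t)) * C * C + sigma(t) ** 2


def eps_model(x, t):
    """Exact noise prediction for the toy Gaussian; stands in for a trained network."""
    return sigma(t) * (x - math.exp(log_alpha(t)) * MU) / marginal_var(t)


def dpm_solver_2_sample(x, eps_fn, t_start=1.0, t_end=1e-3, n_steps=20):
    """Integrate from t_start down to t_end on a grid that is uniform in log-SNR."""
    l_start, l_end = lam(t_start), lam(t_end)
    h = (l_end - l_start) / n_steps
    s = t_start
    for i in range(1, n_steps + 1):
        t = t_end if i == n_steps else t_of_lambda(l_start + i * h)
        mid = t_of_lambda(l_start + (i - 0.5) * h)
        eps_s = eps_fn(x, s)
        # Predictor: first-order half step to the log-SNR midpoint
        u = math.exp(log_alpha(mid) - log_alpha(s)) * x - sigma(mid) * math.expm1(0.5 * h) * eps_s
        # Corrector: full step using the midpoint evaluation
        x = math.exp(log_alpha(t) - log_alpha(s)) * x - sigma(t) * math.expm1(h) * eps_fn(u, mid)
        s = t
    return x


def exact_sample(x, t_start=1.0, t_end=1e-3):
    """Closed-form probability-flow solution, available only for this Gaussian toy."""
    scale = math.sqrt(marginal_var(t_end) / marginal_var(t_start))
    return math.exp(log_alpha(t_end)) * MU + scale * (x - math.exp(log_alpha(t_start)) * MU)


x_T = 1.0
approx = dpm_solver_2_sample(x_T, eps_model)
print(approx, exact_sample(x_T))
```

With a real model, `eps_fn` would be the batched network call and `x` a tensor; the loop structure stays the same, at two model evaluations per step.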
4.2 Order Comparison
| Solver | Order | Function Evaluations | Steps Needed | Total NFE |
|---|---|---|---|---|
| Euler | 1st | 1 per step | 50-100 | 50-100 |
| DPM-Solver-1 | 1st | 1 per step | 50-100 | 50-100 |
| DPM-Solver-2 | 2nd | 2 per step | 10-20 | 20-40 |
| DPM-Solver-3 | 3rd | 3 per step | 10-15 | 30-45 |
| RK4 | 4th | 4 per step | 20-30 | 80-120 |
[!TIP] Efficiency Insight
DPM-Solver-2 achieves high quality with only 20-40 function evaluations, compared to 1000+ for [[Diffusion Model|DDPM]] or 100+ for general-purpose ODE solvers.
5. Advanced Variants
5.1 DPM-Solver++
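Improvements over DPM-Solver:
- Better numerical stability, particularly for guided sampling
- Unified framework for different parameterizations
- Multistep variants that reuse earlier model outputs
Key Innovation: Use the data prediction model $x_\theta(x_t, t)$, which predicts the clean sample $x_0$, instead of the noise prediction model $\epsilon_\theta(x_t, t)$ when solving the ODE.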
5.2 DPM-Solver-Adaptive
Adaptive Step Size Control:
- Estimate local truncation error using embedded methods
- Adjust step size $h$ based on error tolerance
- Accept/reject steps dynamically
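Error Estimation (for DPM-Solver-2): compare the second-order result with the embedded first-order result for the same step:
$$E = \left\lVert x_t^{(2)} - x_t^{(1)} \right\rVert$$
where $x_t^{(1)}$ and $x_t^{(2)}$ are the DPM-Solver-1 and DPM-Solver-2 estimates of $x_t$; the step is accepted when $E$ is within the tolerance.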
5.3 DPM-Solver with Correctors
Predictor-Corrector Framework:
- Predictor: Take one DPM-Solver step
- Corrector: Apply few steps of Langevin dynamics
- Repeat: For enhanced sample quality
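The predictor-corrector loop above can be sketched on a toy problem. This is an illustration under stated assumptions, not a recipe from the DPM-Solver papers: the data distribution is a scalar Gaussian $\mathcal{N}(0, c^2)$ so the exact noise prediction is known, the predictor is a DPM-Solver-1 step on a uniform log-SNR grid, and the Langevin step size `step_scale * sigma_t**2` is an assumed heuristic. All names are hypothetical.

```python
import math
import random

C = 0.5  # std of the toy data distribution N(0, C^2); an assumption for illustration


def alpha_sigma(l):
    """(alpha, sigma) of a variance-preserving process at half log-SNR l."""
    return math.sqrt(1.0 / (1.0 + math.exp(-2.0 * l))), math.sqrt(1.0 / (1.0 + math.exp(2.0 * l)))


def marginal_var(l):
    a, s = alpha_sigma(l)
    return a * a * C * C + s * s


def eps_hat(x, l):
    """Exact noise prediction for the toy Gaussian; stands in for a trained network."""
    return alpha_sigma(l)[1] * x / marginal_var(l)


def predictor_corrector_sample(x, rng, l_start=-5.0, l_end=4.5, n_steps=20,
                               corrector_steps=1, step_scale=0.1):
    h = (l_end - l_start) / n_steps
    for i in range(n_steps):
        l_s, l_t = l_start + i * h, l_start + (i + 1) * h
        a_s, _ = alpha_sigma(l_s)
        a_t, s_t = alpha_sigma(l_t)
        # Predictor: one DPM-Solver-1 step from l_s to l_t
        x = (a_t / a_s) * x - s_t * math.expm1(h) * eps_hat(x, l_t - h)
        # Corrector: a few Langevin steps targeting the marginal at l_t
        delta = step_scale * s_t * s_t
        for _ in range(corrector_steps):
            score = -eps_hat(x, l_t) / s_t
            x = x + delta * score + math.sqrt(2.0 * delta) * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
samples = [predictor_corrector_sample(rng.gauss(0.0, 1.0), rng) for _ in range(2000)]
sample_var = sum(v * v for v in samples) / len(samples)
print(sample_var, marginal_var(4.5))
```

Setting `corrector_steps=0` recovers the purely deterministic solver, which makes it easy to compare sample statistics with and without the corrector.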
6. Theoretical Analysis
6.1 Convergence Order
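Theorem: DPM-Solver-$k$ ($k = 1, 2, 3$) is a $k$-th order solver: under suitable regularity assumptions on $\epsilon_\theta$, the global error at the final time is $O(h_{\max}^k)$, where $h_{\max}$ is the largest log-SNR step.
Proof Sketch:
- Expand $\hat\epsilon_\theta(\hat x_\lambda, \lambda)$ in Taylor series around $\lambda_s$
- Match terms up to order $k$ in $h$
- Show the local truncation error is $O(h^{k+1})$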
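The convergence orders can be checked empirically on a problem with a known solution. The sketch below is an assumption-laden toy: a scalar variance-preserving process with Gaussian data $\mathcal{N}(0, c^2)$, for which both the exact noise prediction and the exact ODE solution are available in closed form. It halves the log-SNR step and reports $\log_2$ of the error ratio, which should be close to the solver order. All names are hypothetical.

```python
import math

C = 0.5  # std of the toy data distribution N(0, C^2); an assumption for illustration


def alpha_sigma(l):
    """(alpha, sigma) of a variance-preserving process at half log-SNR l."""
    return math.sqrt(1.0 / (1.0 + math.exp(-2.0 * l))), math.sqrt(1.0 / (1.0 + math.exp(2.0 * l)))


def marginal_var(l):
    a, s = alpha_sigma(l)
    return a * a * C * C + s * s


def eps_hat(x, l):
    """Exact noise prediction for Gaussian data, written as a function of log-SNR."""
    return alpha_sigma(l)[1] * x / marginal_var(l)


def solve(x, l_start, l_end, n_steps, order):
    """DPM-Solver-1 (order=1) or DPM-Solver-2 (order=2) on a uniform log-SNR grid."""
    h = (l_end - l_start) / n_steps
    for i in range(n_steps):
        l_s = l_start + i * h
        a_s, _ = alpha_sigma(l_s)
        a_t, s_t = alpha_sigma(l_s + h)
        eps = eps_hat(x, l_s)
        if order == 2:
            a_m, s_m = alpha_sigma(l_s + 0.5 * h)
            u = (a_m / a_s) * x - s_m * math.expm1(0.5 * h) * eps  # predictor
            eps = eps_hat(u, l_s + 0.5 * h)                         # midpoint evaluation
        x = (a_t / a_s) * x - s_t * math.expm1(h) * eps
    return x


def exact(x, l_start, l_end):
    """Closed-form ODE solution for the Gaussian toy."""
    return math.sqrt(marginal_var(l_end) / marginal_var(l_start)) * x


L_START, L_END, X_START = -5.0, 4.5, 1.0
truth = exact(X_START, L_START, L_END)


def observed_order(order, n_steps=100):
    """log2 of the error ratio when the step size is halved."""
    e_coarse = abs(solve(X_START, L_START, L_END, n_steps, order) - truth)
    e_fine = abs(solve(X_START, L_START, L_END, 2 * n_steps, order) - truth)
    return math.log2(e_coarse / e_fine)


print(observed_order(1), observed_order(2))
```

This only probes discretization error; with a learned network the score approximation error would put a floor under the total error.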
6.2 Stability Analysis
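Linear Stability: For the linear test ODE $\frac{dx}{dt} = a\,x$ with $\mathrm{Re}(a) < 0$, a one-step method produces $x_{n+1} = R(ha)\,x_n$ and is stable when $|R(ha)| \leq 1$,
where $R$ is the method's stability function and $h$ is the step size.
DPM-Solver Advantage: Because the linear part is solved exactly, DPM-Solver is more stable than general-purpose explicit methods applied to the full ODE.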
6.3 Error Decomposition
Total error consists of:
- Discretization error: From numerical integration ($O(h^k)$ for a $k$-th order solver)
- Score approximation error: From imperfect score model ($\epsilon_\theta(x_t, t) \neq -\sigma_t \nabla_x \log p_t(x_t)$)
- Accumulated error: Propagated through multiple steps
Key Insight: Higher-order methods reduce discretization error, making score approximation error dominant.
7. Practical Implementation
7.1 Time Schedule Design
Uniform vs Non-uniform Schedules:
| Schedule | Steps Distribution | Best For |
|---|---|---|
| Uniform | Equal spacing | Simple implementation |
| Log-SNR | More steps near $t = 0$ | Better accuracy |
| Adaptive | Dynamic based on error | Optimal efficiency |
Recommended Schedule (for 20 steps): a helper with the signature `get_time_schedule(N=20)` returns the time grid for $N$ steps.
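One plausible way to build such a grid is to space the points uniformly in log-SNR, as sketched below. This is an assumed implementation, not the original helper: it presumes a linear variance-preserving schedule, and the name `log_snr_time_schedule` and the endpoint defaults are hypothetical.

```python
import math

BETA_0, BETA_1 = 0.1, 20.0  # assumed linear VP schedule endpoints


def lam(t):
    """Half log-SNR lambda_t = log(alpha_t / sigma_t) for the assumed schedule."""
    log_alpha = -0.25 * (BETA_1 - BETA_0) * t * t - 0.5 * BETA_0 * t
    return log_alpha - 0.5 * math.log(-math.expm1(2.0 * log_alpha))


def t_of_lambda(l):
    """Inverse of lam(t) for t > 0."""
    log_term = math.log1p(math.exp(-2.0 * l))
    root = math.sqrt(BETA_0 ** 2 + 2.0 * (BETA_1 - BETA_0) * log_term)
    return (root - BETA_0) / (BETA_1 - BETA_0)


def log_snr_time_schedule(n_steps=20, t_start=1.0, t_end=1e-3):
    """Return n_steps + 1 decreasing times, equally spaced in log-SNR."""
    l_start, l_end = lam(t_start), lam(t_end)
    grid = [t_of_lambda(l_start + (l_end - l_start) * i / n_steps) for i in range(n_steps + 1)]
    grid[0], grid[-1] = t_start, t_end  # pin the endpoints against round-off
    return grid


ts = log_snr_time_schedule(20)
print([round(t, 4) for t in ts])
```

For this schedule the grid places more points at small $t$ than a uniform grid would, which is the behaviour the Log-SNR row of the table describes.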
7.2 Numerical Stability Tips
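1. Avoid Division by Small Values: dividing directly by $\sigma_t$ (or $\alpha_t$) is unstable when the value is close to zero; clamp the denominator to a small positive floor or work in log space.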
2. Clamp Time Values:
`t = torch.clamp(t, min=1e-5, max=1.0)`
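3. Use Log-SNR Parameterization: work with $\lambda_t = \log\frac{\alpha_t}{\sigma_t} = \log\alpha_t - \log\sigma_t$ rather than with $\alpha_t$ and $\sigma_t$ separately.
This provides better numerical conditioning.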
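The tips above can be illustrated with the standard library alone. The sketch below is an assumption-based example, not code from this note: `safe_divide` clamps a non-negative denominator, `half_log_snr` computes $\lambda$ in log space for a variance-preserving process (it requires `log_alpha < 0`), and the last lines show why `math.expm1` is preferable to `exp(h) - 1` for the factor $e^{h} - 1$ when $h$ is tiny. Both helper names are hypothetical.

```python
import math


def safe_divide(num, den, floor=1e-8):
    """Divide by a non-negative denominator clamped away from zero."""
    return num / max(den, floor)


def half_log_snr(log_alpha):
    """lambda = log(alpha) - log(sigma) for a VP process, computed in log space."""
    log_sigma = 0.5 * math.log(-math.expm1(2.0 * log_alpha))  # log sqrt(1 - alpha^2)
    return log_alpha - log_sigma


h = 1e-10
naive = math.exp(h) - 1.0  # loses most significant digits to cancellation
stable = math.expm1(h)     # accurate for small h

print(naive, stable)
print(half_log_snr(-5.5e-5), safe_divide(1.0, 0.0))
```

In a tensor library the same ideas apply through the corresponding element-wise functions.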
7.3 Batch Processing
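Parallel Sampling: sample multiple images in parallel by stacking the initial noise along the batch dimension, so that each solver step is a single batched model call.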
Memory Optimization:
- Process in batches to fit GPU memory
- Use gradient checkpointing if needed
- Precompute $\alpha_t$, $\sigma_t$ values
8. Performance Comparison
8.1 Sampling Speed vs Quality
| Method | FID (CIFAR-10) | Steps | Time (s) | NFE |
|---|---|---|---|---|
| [[Diffusion Model|DDPM]] | 3.17 | 1000 | 21.7 | 1000 |
| DDIM | 4.16 | 100 | 2.2 | 100 |
| DPM-Solver-2 | 3.28 | 20 | 0.5 | 40 |
| DPM-Solver-3 | 3.19 | 15 | 0.4 | 45 |
| DPM-Solver++ | 3.15 | 20 | 0.5 | 20 |
8.2 Comparison with Other Fast Samplers
| Method | Type | Steps | Quality | Training Required |
|---|---|---|---|---|
| DPM-Solver | ODE solver | 10-20 | High | No (plug-and-play) |
| DDIM | ODE solver | 50-100 | Medium-High | No |
| Consistency Models | Distillation | 1-8 | High | Yes (retraining) |
| Progressive Distillation | Distillation | 2-8 | High | Yes (retraining) |
| Rectified Flows | Retraining | 1-10 | High | Yes (retraining) |
[!NOTE] Key Advantage
DPM-Solver is a plug-and-play solver that works with any pre-trained [[Diffusion Model]] without retraining, unlike distillation-based methods.
9. Applications
9.1 Text-to-Image Generation
Stable Diffusion + DPM-Solver:
- Original: 50 DDIM steps (~10 seconds)
- With DPM-Solver-2: 20 steps (~4 seconds)
- Quality: Comparable or better FID scores
9.2 High-Resolution Synthesis
Benefits for Large Images:
- Fewer steps = less memory accumulation
- Better stability for high-dimensional data
- Enables real-time generation (1-2 seconds for 1024×1024)
9.3 Video Generation
Temporal Consistency:
- Deterministic ODE trajectories ensure smooth transitions
- Fewer steps reduce temporal artifacts
- Suitable for frame interpolation tasks
9.4 3D Generation
Score Distillation Sampling (SDS):
- DPM-Solver provides stable gradients
- Faster convergence in optimization-based generation
- SDS is the optimization scheme used in DreamFusion, Magic3D, etc.
10. Core Formula Cards
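[!QUOTE] [[Probability Flow ODE]]
$$\frac{dx_t}{dt} = f(t)\,x_t - \frac{1}{2}g^2(t)\,\nabla_x \log p_t(x_t)$$
[!QUOTE] Semi-linear Form
$$\frac{dx_t}{dt} = \frac{d\log\alpha_t}{dt}\,x_t - \sigma_t\frac{d\lambda_t}{dt}\,\epsilon_\theta(x_t, t)$$
[!QUOTE] Exact Linear Solution
$$x_t = \frac{\alpha_t}{\alpha_s}\,x_s$$
[!QUOTE] DPM-Solver-1 (First-Order)
$$x_t = \frac{\alpha_t}{\alpha_s}\,x_s - \sigma_t\left(e^{h} - 1\right)\epsilon_\theta(x_s, s)$$
[!QUOTE] DPM-Solver-2 (Second-Order)
$$u = \frac{\alpha_{s_1}}{\alpha_s}\,x_s - \sigma_{s_1}\left(e^{h/2} - 1\right)\epsilon_\theta(x_s, s),\qquad x_t = \frac{\alpha_t}{\alpha_s}\,x_s - \sigma_t\left(e^{h} - 1\right)\epsilon_\theta(u, s_1)$$
[!QUOTE] Step Size in Log-SNR Space
$$h = \lambda_t - \lambda_s,\qquad \lambda_t = \log\frac{\alpha_t}{\sigma_t}$$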
11. Debugging and Troubleshooting
11.1 Common Issues
Problem 1: Sample quality degrades with fewer steps
Causes:
- Step size too large
- Low-order solver (DPM-Solver-1)
- Stiff dynamics near $t = 0$
Solutions:
- Increase number of steps (try 20-30)
- Use DPM-Solver-2 or DPM-Solver-3
- Use non-uniform time schedule (more steps near $t = 0$)
Problem 2: Numerical instability (NaN values)
Causes:
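- Division by $\sigma_t$ when $\sigma_t \to 0$
- Score function explosion
- Accumulated rounding errors
Solutions:
- Clamp $t$ away from zero (for example $t \geq 10^{-5}$)
- Clip score values to a bounded range
- Use $x_0$-prediction instead of $\epsilon$-prediction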
Problem 3: Slow sampling despite DPM-Solver
Causes:
- Too many function evaluations
- Inefficient implementation
- Large batch size causing memory bottleneck
Solutions:
- Use DPM-Solver-2 with 15-20 steps
- Precompute $\alpha_t$, $\sigma_t$ values
- Optimize model inference (TensorRT, ONNX)
11.2 Quality Checklist
Before deploying DPM-Solver:
- [ ] Test with different step counts (10, 15, 20, 30)
- [ ] Compare FID/IS scores with baseline (DDIM-50)
- [ ] Check for artifacts in generated samples
- [ ] Verify time schedule is appropriate
- [ ] Monitor numerical stability (no NaN/Inf)
- [ ] Profile inference time and memory usage
12. Extensions and Variants
12.1 Unified Framework
DPM-Solver can handle different model parameterizations:
| Parameterization | Network Predicts | Best For |
|---|---|---|
| $\epsilon$-prediction | Noise $\epsilon$ | Standard training |
| $x_0$-prediction | Clean data $x_0$ | Better stability |
| $v$-prediction | Velocity $v$ | Balanced performance |
12.2 Multistep Methods
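DPM-Solver-Multistep: Use information from previous steps (like Adams-Bashforth). The derivative of $\hat\epsilon_\theta$ is estimated from model outputs that are already available, instead of from an extra intermediate evaluation:
$$\frac{d\hat\epsilon_\theta}{d\lambda} \approx \frac{\epsilon_\theta(x_s, s) - \epsilon_\theta(x_{s'}, s')}{\lambda_s - \lambda_{s'}}$$
where $s'$ is the time point preceding $s$.
Advantage: Fewer function evaluations (1 per step after initialization)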
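A second-order multistep update can be sketched as follows. This is a simplified illustration under stated assumptions: equal log-SNR steps (so the finite difference reduces to extrapolating the noise prediction by half a step), noise-prediction parameterization, and a scalar Gaussian toy problem with data $\mathcal{N}(0, c^2)$ whose exact noise prediction and exact solution are known. The function names and the `use_history` flag are hypothetical.

```python
import math

C = 0.5  # std of the toy data distribution N(0, C^2); an assumption for illustration


def alpha_sigma(l):
    """(alpha, sigma) of a variance-preserving process at half log-SNR l."""
    return math.sqrt(1.0 / (1.0 + math.exp(-2.0 * l))), math.sqrt(1.0 / (1.0 + math.exp(2.0 * l)))


def marginal_var(l):
    a, s = alpha_sigma(l)
    return a * a * C * C + s * s


def eps_hat(x, l):
    """Exact noise prediction for the toy Gaussian; stands in for a trained network."""
    return alpha_sigma(l)[1] * x / marginal_var(l)


def multistep_solve(x, l_start, l_end, n_steps, use_history=True):
    """Second-order multistep solver on a uniform log-SNR grid, one model call per step.

    With use_history=False it reduces to DPM-Solver-1.
    """
    h = (l_end - l_start) / n_steps
    eps_prev = None
    for i in range(n_steps):
        l_s = l_start + i * h
        a_s, _ = alpha_sigma(l_s)
        a_t, s_t = alpha_sigma(l_s + h)
        eps = eps_hat(x, l_s)  # the only model evaluation in this step
        if use_history and eps_prev is not None:
            # Extrapolate eps by half a step using the previous output (equal steps)
            eps_eff = eps + 0.5 * (eps - eps_prev)
        else:
            eps_eff = eps  # first step falls back to the first-order update
        x = (a_t / a_s) * x - s_t * math.expm1(h) * eps_eff
        eps_prev = eps
    return x


def exact(x, l_start, l_end):
    """Closed-form ODE solution for the Gaussian toy."""
    return math.sqrt(marginal_var(l_end) / marginal_var(l_start)) * x


truth = exact(1.0, -5.0, 4.5)
for n in (200, 400):
    err_multistep = abs(multistep_solve(1.0, -5.0, 4.5, n) - truth)
    err_first_order = abs(multistep_solve(1.0, -5.0, 4.5, n, use_history=False) - truth)
    print(n, err_multistep, err_first_order)
```

With unequal steps the extrapolation weight would depend on the ratio of consecutive log-SNR step sizes rather than the fixed factor 0.5 used here.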
12.3 Integration with Other Methods
DPM-Solver + Consistency Models:
- Use DPM-Solver for high-quality sampling
- Distill to consistency model for fast deployment
DPM-Solver + Rectified Flows:
- Straighter ODE trajectories
- Even fewer steps needed (5-10)
Related Concepts
- [[Diffusion Model]]
- [[Probability Flow ODE]]
- [[Stochastic Differential Equation (SDE)]]
- [[Score Function]]
- [[DDIM]]
- [[Numerical ODE Methods]]
- [[Runge-Kutta Methods]]
- [[Consistency Models]]
- [[Rectified Flows]]
- [[Flow Matching]]
- [[Wiener Process]]
- [[Markov Process]]
- [[Neural ODE]]
- [[Fast Sampling Methods]]
- [[Langevin Dynamics]]