Kolmogorov Equations

The Kolmogorov equations are a family of fundamental equations that govern the time evolution of transition probabilities in [[Markov Process|Markov processes]]. They form the mathematical backbone connecting discrete-state jump processes, continuous-state diffusions, and the PDE descriptions of stochastic dynamics — including the [[Fokker-Planck Equation|Fokker-Planck equation]] and the backward equation used in option pricing and hitting-time problems.


1. Core Concept

1.1 The Kolmogorov Triplet

Kolmogorov’s legacy in stochastic processes crystallizes into three interlocking equations:

Kolmogorov Equations — Three Pillars
═══════════════════════════════════════════════════════

Chapman-Kolmogorov
(Semigroup Property)
P(s+t) = P(s) P(t)

├── Differentiate w.r.t. forward time t
│ ↓
│ Kolmogorov Forward Equation
│ ∂_t p = L* p
│ (Fokker-Planck / Master Equation)
│ "Given where I started, where will I be?"

└── Differentiate w.r.t. backward time s

Kolmogorov Backward Equation
∂_t u = L u
"Given where I'll end, what's my expected payoff?"
═══════════════════════════════════════════════════════

| Equation | Domain | What It Describes | Key Application |
|---|---|---|---|
| Chapman-Kolmogorov | Discrete + continuous time | Semigroup property of transitions | Foundation of all Markov models |
| Forward (Fokker-Planck) | Continuous time | Evolution of probability density $p(x,t)$ | Diffusion models, physics, population dynamics |
| Backward | Continuous time | Evolution of conditional expectations $u(x,t)$ | Option pricing, hitting times, Feynman-Kac |

1.2 Why They Matter Together

The three equations are not independent — they are different facets of the same underlying semigroup structure:

  • Chapman-Kolmogorov is the algebraic identity — it asserts that transitions compose
  • Forward equation is the differential identity — it describes the future of the density
  • Backward equation is the adjoint identity — it describes the past-dependence of expectations

In modern [[Diffusion Model|diffusion models]], all three appear:

  • Chapman-Kolmogorov: the Markov chain of the forward noising process
  • Forward (Fokker-Planck): the evolution of $p_t(x)$ along the forward SDE
  • Backward: the foundation for score matching via denoising

2. Chapman-Kolmogorov Equation

2.1 Discrete-Time Markov Chains (DTMC)

For a discrete-time [[Markov Process|Markov chain]] with transition matrix $P = [p_{ij}]$, the $n$-step transition probabilities are:

$$p_{ij}^{(n)} = P(X_{m+n} = j \mid X_m = i)$$

The Chapman-Kolmogorov equation states that multi-step transitions compose via matrix multiplication:

$$p_{ij}^{(m+n)} = \sum_k p_{ik}^{(m)}\, p_{kj}^{(n)}$$

Or in matrix form:

$$P^{(m+n)} = P^{(m)} P^{(n)}$$

Interpretation: To go from $i$ to $j$ in $m+n$ steps, you must pass through some intermediate state $k$ at step $m$, and the probability is the sum over all possible intermediate states.
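The matrix form can be checked numerically. The sketch below is only an illustration: the 3-state transition matrix and the helper names `matmul` and `matpow` are made up for this example, and it uses plain Python lists rather than a linear-algebra library.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(P, n):
    """n-step transition matrix, obtained by multiplying P with itself n times."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, P)
    return result

# Hypothetical 3-state chain (each row sums to 1); not taken from the text.
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.2, 0.2, 0.6]]

m, n = 2, 3
lhs = matpow(P, m + n)                      # (m+n)-step transitions
rhs = matmul(matpow(P, m), matpow(P, n))    # m steps followed by n steps
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print("largest entrywise difference:", max_gap)
```

The two sides agree up to floating-point rounding, and every row of the multi-step matrix still sums to 1.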

2.2 Continuous-Time Markov Chains (CTMC)

For a CTMC with generator matrix $Q$, the transition probability $P(t) = [p_{ij}(t)]$ satisfies:

$$P(s+t) = P(s)\,P(t), \qquad P(0) = I$$

This is the semigroup property: the transition operator P(t) forms a one-parameter semigroup.
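For a finite state space the semigroup is the matrix exponential $P(t) = e^{tQ}$, which makes the property easy to test. The sketch below assumes a hypothetical 3-state generator and uses a hand-rolled truncated Taylor series (`expm` here is a local helper, not a library call); that is adequate for a small, well-scaled $tQ$ but is not a robust general-purpose method.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(Q, t, terms=30):
    """exp(tQ) via a truncated Taylor series (fine for small, well-scaled tQ)."""
    n = len(Q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]       # holds (tQ)^k / k!
    for k in range(1, terms):
        term = matmul(term, [[t * q / k for q in row] for row in Q])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Hypothetical generator: non-negative off-diagonal rates, rows summing to 0.
Q = [[-1.0, 0.6, 0.4],
     [0.5, -0.8, 0.3],
     [0.2, 0.7, -0.9]]

s, t = 0.3, 0.5
P_sum = expm(Q, s + t)
P_prod = matmul(expm(Q, s), expm(Q, t))
semigroup_gap = max(abs(P_sum[i][j] - P_prod[i][j])
                    for i in range(3) for j in range(3))
row_sums = [sum(row) for row in P_sum]
print("semigroup gap:", semigroup_gap)
print("row sums of P(s+t):", row_sums)
```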

2.3 General State Space

For a Markov process on a general (possibly continuous) state space $S$ with transition kernel $P(t,x,A)$:

$$P(s+t, x, A) = \int_S P(t, y, A)\, P(s, x, dy)$$

This is the most general form: to transition from $x$ to any state in the set $A$ over time $s+t$, integrate over all possible intermediate positions $y$ at time $s$.
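A concrete continuous-state instance is standard Brownian motion, whose transition kernel has the Gaussian density $N(y; x, t)$. The sketch below approximates the integral over the intermediate position with a trapezoid rule and compares it with the direct $(s+t)$ kernel. The choice of Brownian motion, the integration limits, and the function names are assumptions made for this illustration.

```python
import math

def bm_kernel(t, x, y):
    """Transition density of standard Brownian motion from x to y over time t."""
    return math.exp(-(y - x) ** 2 / (2 * t)) / math.sqrt(2 * math.pi * t)

def chapman_kolmogorov_rhs(s, t, x, z, lo=-12.0, hi=12.0, n=4000):
    """Trapezoid-rule approximation of the integral over the midpoint y."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        y = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * bm_kernel(t, y, z) * bm_kernel(s, x, y)
    return total * h

s, t, x, z = 0.4, 0.9, 0.3, -0.5
lhs = bm_kernel(s + t, x, z)                 # go from x to z directly
rhs = chapman_kolmogorov_rhs(s, t, x, z)     # go via every intermediate y
print(lhs, rhs)
```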

2.4 Probabilistic Interpretation

The Chapman-Kolmogorov equation is more than a formula — it’s a consistency condition:

For a time-homogeneous chain, the one-step transition probabilities together with the initial distribution determine every finite-dimensional distribution of the process.

Time:  0 ────── s ────── s+t
i ──→── k ──→── j
\______________/
m+n steps

Every path from i to j decomposes uniquely into a prefix and a suffix — and the probability factors accordingly.


3. Kolmogorov Forward Equation

3.1 CTMC Form (Master Equation)

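Starting from $P(s+t) = P(s)\,P(t)$ and differentiating with respect to forward time $t$ at $t = 0$:

$$\frac{dP(t)}{dt} = P(t)\,Q, \qquad P(0) = I$$

In component form:

$$\frac{dp_{ij}(t)}{dt} = \sum_k p_{ik}(t)\, q_{kj}$$

Interpretation: The rate of change of $p_{ij}$ equals the net probability flux into state $j$: inflow from the other occupied states $k \neq j$ (at rates $q_{kj}$) minus outflow from $j$ itself (the $k = j$ term, since $q_{jj} \le 0$).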

3.2 Diffusion Form (Fokker-Planck Equation)

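For a [[Stochastic Differential Equation (SDE)|diffusion process]] $dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t$, the forward equation becomes the [[Fokker-Planck Equation]]:

$$\frac{\partial p(x,t)}{\partial t} = -\frac{\partial}{\partial x}\big[\mu(x)\,p(x,t)\big] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\big[\sigma^2(x)\,p(x,t)\big]$$

Compact operator notation:

$$\frac{\partial p}{\partial t} = L^* p$$

where $L^*$ is the adjoint of the infinitesimal generator $L$.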
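One way to build confidence in the signs is to plug in a known solution. For constant drift and diffusion (an assumption made only for this check), the transition density is Gaussian with mean $x_0 + \mu t$ and variance $\sigma^2 t$. The sketch below evaluates the Fokker-Planck residual with central differences; the parameter values are arbitrary.

```python
import math

MU, SIGMA, X0 = 0.7, 1.3, 0.0    # hypothetical constant coefficients and start point

def p(x, t):
    """Transition density of dX = MU dt + SIGMA dW started at X0."""
    var = SIGMA ** 2 * t
    return math.exp(-(x - X0 - MU * t) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fp_residual(x, t, h=1e-3):
    """Central-difference estimate of dp/dt + d/dx[MU p] - 0.5 d2/dx2[SIGMA^2 p]."""
    dp_dt = (p(x, t + h) - p(x, t - h)) / (2 * h)
    dp_dx = (p(x + h, t) - p(x - h, t)) / (2 * h)
    d2p_dx2 = (p(x + h, t) - 2 * p(x, t) + p(x - h, t)) / h ** 2
    return dp_dt + MU * dp_dx - 0.5 * SIGMA ** 2 * d2p_dx2

residuals = [abs(fp_residual(x, 1.0)) for x in (-1.0, 0.0, 0.5, 2.0)]
print("largest residual:", max(residuals))
```

The residual is zero up to discretization error. Flipping the sign of the drift term in `fp_residual` makes it clearly non-zero.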

[!NOTE] Unified View
Both the CTMC master equation and the Fokker-Planck equation are Kolmogorov forward equations — they differ only in the state space (discrete vs continuous) and the form of the generator.

3.3 Forward vs Fokker-Planck Terminology

| Context | Equation Name | Generator |
|---|---|---|
| Discrete state (CTMC) | Kolmogorov Forward / Master Equation | $Q$-matrix |
| Continuous state (diffusion) | Fokker-Planck Equation / Forward Kolmogorov | $L^* p = -\partial_x(\mu\, p) + \frac{1}{2}\partial_x^2(\sigma^2 p)$ |
| General Markov process | Kolmogorov Forward Equation | $A^*$ (adjoint of generator) |

3.4 Role in Diffusion Models

In [[Diffusion Model|diffusion models]], the forward noising process is a Markov diffusion. Its density $p_t(x)$ satisfies the forward equation:

$$\frac{\partial p_t}{\partial t} = -\nabla \cdot \big[f(t)\,x\,p_t\big] + \frac{1}{2}\, g(t)^2\, \nabla^2 p_t$$

This equation determines how the data distribution $p_0$ evolves toward Gaussian noise $p_T$, and it forms the theoretical basis for deriving the [[Probability Flow ODE]].


4. Kolmogorov Backward Equation

4.1 CTMC Form

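Differentiating $P(s+t) = P(s)\,P(t)$ with respect to backward time $s$ at $s = 0$:

$$\frac{dP(t)}{dt} = Q\,P(t), \qquad P(0) = I$$

In component form:

$$\frac{dp_{ij}(t)}{dt} = \sum_k q_{ik}\, p_{kj}(t)$$

Key difference from forward: In the forward equation $Q$ multiplies on the right, so $q_{kj}$ carries the destination $j$ and the sum runs over its first index. In the backward equation $Q$ multiplies on the left, so $q_{ik}$ carries the origin $i$ and the sum runs over its second index.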
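Although the forward system $dP/dt = PQ$ and the backward system $dP/dt = QP$ look different, both start from $P(0) = I$ and describe the same transition matrix. The sketch below integrates each with a classical Runge-Kutta scheme and compares the results. The generator and the helper names are hypothetical, and RK4 is chosen only for brevity.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rk4(rhs, P0, T, steps):
    """Classical fourth-order Runge-Kutta for the matrix ODE dP/dt = rhs(P)."""
    n = len(P0)

    def axpy(A, B, c):                       # entrywise A + c * B
        return [[A[i][j] + c * B[i][j] for j in range(n)] for i in range(n)]

    P, h = P0, T / steps
    for _ in range(steps):
        k1 = rhs(P)
        k2 = rhs(axpy(P, k1, h / 2))
        k3 = rhs(axpy(P, k2, h / 2))
        k4 = rhs(axpy(P, k3, h))
        incr = [[(k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j]) / 6
                 for j in range(n)] for i in range(n)]
        P = axpy(P, incr, h)
    return P

# Hypothetical generator: non-negative off-diagonal rates, rows summing to 0.
Q = [[-1.0, 0.6, 0.4],
     [0.5, -0.8, 0.3],
     [0.2, 0.7, -0.9]]
I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

P_forward = rk4(lambda P: matmul(P, Q), I3, T=1.0, steps=200)    # dP/dt = P Q
P_backward = rk4(lambda P: matmul(Q, P), I3, T=1.0, steps=200)   # dP/dt = Q P
gap = max(abs(P_forward[i][j] - P_backward[i][j])
          for i in range(3) for j in range(3))
print("forward vs backward gap:", gap)
```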

4.2 Diffusion Form

For a diffusion, the backward equation governs the conditional expectation:

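$$u(x,t) = E[f(X_T) \mid X_t = x]$$

$$\frac{\partial u}{\partial t} + \mu(x)\frac{\partial u}{\partial x} + \frac{1}{2}\sigma^2(x)\frac{\partial^2 u}{\partial x^2} = 0$$

with terminal condition $u(x,T) = f(x)$.

Compact operator form:

$$\frac{\partial u}{\partial t} + L u = 0$$

where $L$ is the infinitesimal generator (NOT its adjoint $L^*$; this is the key difference from the forward equation).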
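A short worked example may help fix the signs. It assumes, purely for illustration, zero drift, a constant $\sigma$, and the payoff $f(x) = x^2$. Since $X_T - X_t$ is then Gaussian with mean $0$ and variance $\sigma^2 (T - t)$:

$$
\begin{aligned}
u(x,t) &= E[X_T^2 \mid X_t = x] = x^2 + \sigma^2 (T - t), \\
\frac{\partial u}{\partial t} &= -\sigma^2, \qquad \frac{\partial^2 u}{\partial x^2} = 2, \\
\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2 \frac{\partial^2 u}{\partial x^2} &= -\sigma^2 + \sigma^2 = 0, \qquad u(x,T) = x^2 = f(x).
\end{aligned}
$$

The expectation satisfies both the backward equation and its terminal condition.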

4.3 Forward vs Backward — Side-by-Side

| Aspect | Forward (Fokker-Planck) | Backward |
|---|---|---|
| Variable | $x$ (future state) | $x$ (starting state at time $t$) |
| Unknown | Density $p(x,t \mid x_0,0)$ | Expectation $u(x,t) = E[f(X_T) \mid X_t = x]$ |
| Operator | $L^*$ (adjoint) | $L$ (generator) |
| Initial/Boundary | $p(x,0) = \delta(x - x_0)$ (initial) | $u(x,T) = f(x)$ (terminal) |
| Direction | Forward in time | Backward in time |
| Lineage | From $p_t$ to $p_{t+dt}$ | From expectation at $t$ to $t - dt$ |
| CTMC Form | $dP/dt = PQ$ | $dP/dt = QP$ |

[!WARNING] The Adjoint Distinction
The forward equation uses $L^*$ (the adjoint), the backward equation uses $L$. For self-adjoint generators (e.g., pure Brownian motion, $L = \frac{1}{2}\Delta$), the forward and backward equations coincide, but this is the exception, not the rule.

4.4 Feynman-Kac Extension

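The backward equation generalizes to include a potential (discount/killing) term $r(x)$:

$$\frac{\partial u}{\partial t} + L u - r(x)\,u = 0, \qquad u(x,T) = f(x)$$

This has the stochastic representation:

$$u(x,t) = E\left[ f(X_T)\, \exp\left(-\int_t^T r(X_s)\,ds\right) \,\Big|\, X_t = x \right]$$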
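The stochastic representation suggests a direct Monte Carlo estimator: simulate paths, accumulate the integral of $r$ along each one, and average the discounted payoff. The sketch below assumes a driftless diffusion $dX = \sigma\,dW$ and checks itself against the constant-rate case, where $u(x,t) = e^{-r(T-t)}\,(x^2 + \sigma^2 (T-t))$ for $f(x) = x^2$. The function name, parameters, and sample sizes are illustrative choices, not part of the original text.

```python
import math
import random

def feynman_kac_mc(x, t, T, sigma, r, f, n_paths=10000, n_steps=20, seed=0):
    """Monte Carlo estimate of E[f(X_T) exp(-int_t^T r(X_s) ds) | X_t = x]
    for the driftless diffusion dX = sigma dW (a simplifying assumption)."""
    rng = random.Random(seed)
    dt = (T - t) / n_steps
    total = 0.0
    for _ in range(n_paths):
        x_s, integral = x, 0.0
        for _ in range(n_steps):
            integral += r(x_s) * dt                    # left-point rule for the r-integral
            x_s += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += f(x_s) * math.exp(-integral)          # discounted payoff on this path
    return total / n_paths

x, t, T, sigma, rate = 1.0, 0.0, 1.0, 0.5, 0.1
estimate = feynman_kac_mc(x, t, T, sigma, r=lambda y: rate, f=lambda y: y * y)
exact = (x * x + sigma ** 2 * (T - t)) * math.exp(-rate * (T - t))
print("Monte Carlo:", estimate, " closed form:", exact)
```

Passing a state-dependent `r` works the same way, although the left-point rule then introduces a time-discretization error in addition to the sampling error.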

Applications: Option pricing (Black-Scholes), exit problems, reaction-diffusion systems.


5. The Generator and Semigroup Framework

5.1 Transition Semigroup

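The transition operators $\{P_t\}_{t \ge 0}$ form a semigroup on the space of bounded measurable functions. For Feller processes this semigroup is strongly continuous (a C₀-semigroup) on the continuous functions vanishing at infinity:

$$(P_t f)(x) = E[f(X_t) \mid X_0 = x]$$

Properties:

  1. Identity: $P_0 = I$
  2. Semigroup: $P_{s+t} = P_s P_t$ (Chapman-Kolmogorov)
  3. Continuity: $\lim_{t \downarrow 0} P_t f = f$ (strong continuity)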

5.2 Infinitesimal Generator

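The generator $A$ is the derivative of the semigroup at zero, defined for every $f$ for which the limit exists:

$$A f = \lim_{t \downarrow 0} \frac{P_t f - f}{t}$$

Together with its domain, this single operator determines the semigroup and therefore encodes all information about the process dynamics.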

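| Process | Generator $A$ |
|---|---|
| [[Wiener Process\|Wiener Process]] | $\frac{1}{2}\frac{d^2}{dx^2}$ |
| General diffusion | $\mu(x)\frac{d}{dx} + \frac{\sigma^2(x)}{2}\frac{d^2}{dx^2}$ |
| CTMC | $(Qf)(i) = \sum_j q_{ij}\, f(j)$ |
| Jump-diffusion | $(Af)(x) = (A_{\text{diff}} f)(x) + \lambda \int [f(x+y) - f(x)]\,\nu(dy)$ |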
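To see the limit definition reproduce the first row of the table, take the Wiener process and the test function $f(x) = x^2$ (a choice made here only for illustration). Since $X_t = x + W_t$ with $E[W_t] = 0$ and $E[W_t^2] = t$:

$$
\begin{aligned}
(P_t f)(x) &= E[(x + W_t)^2] = x^2 + t, \\
(Af)(x) &= \lim_{t \downarrow 0} \frac{(x^2 + t) - x^2}{t} = 1 = \frac{1}{2}\frac{d^2}{dx^2}\, x^2 .
\end{aligned}
$$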

5.3 The Unified Kolmogorov Equations

From the semigroup property, BOTH Kolmogorov equations follow:

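Forward: $\frac{d}{dt} P_t f = P_t A f$ (the generator acts first, then the semigroup)
→ Dual statement for the density: $\frac{\partial p_t}{\partial t} = A^* p_t$

Backward: $\frac{d}{dt} P_t f = A P_t f$ (the semigroup acts first, then the generator)
→ With $u(x,t) = (P_t f)(x)$: $\frac{\partial u}{\partial t} = A u$, $u(\cdot, 0) = f$. Substituting $t \to T - t$ gives the terminal-value form $\frac{\partial u}{\partial t} + A u = 0$ of Section 4.2.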

               Chapman-Kolmogorov
P_{s+t} = P_s P_t

┌────────────┴────────────┐
│ │
Differentiate s Differentiate t
(backward) (forward)
│ │
▼ ▼
Backward Equation Forward Equation
d/dt P_t f = A P_t f ∂p/∂t = A* p
(test functions) (densities)

6. Connection to Diffusion Models

6.1 The Full Kolmogorov Picture

In [[Diffusion Model|diffusion models]], the Kolmogorov equations provide the complete mathematical scaffolding:

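| Component | Kolmogorov Equation | Role |
|---|---|---|
| Forward noising process | Chapman-Kolmogorov | $q(x_t \mid x_0) = \int q(x_t \mid x_s)\, q(x_s \mid x_0)\, dx_s$ |
| Marginal density evolution | Forward (Fokker-Planck) | $\partial_t p_t = -\nabla \cdot [f\,x\,p_t] + \frac{1}{2} g^2 \nabla^2 p_t$ |
| Score matching | Backward | Links $\nabla_x \log p_t(x)$ to denoising objective |
| Probability Flow ODE | Forward (continuity form) | Same marginals as SDE |
| Reverse-time SDE | Backward (time-reversed) | $dx = [f\,x - g^2 \nabla_x \log p_t]\,dt + g\,d\bar{W}_t$ |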

6.2 From Chapman-Kolmogorov to DDPM

The DDPM forward process is defined by discrete Markov transitions:

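$$q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1 - \beta_t}\, x_{t-1},\ \beta_t I\big)$$

The Chapman-Kolmogorov equation allows compressing multiple steps:

$$q(x_t \mid x_0) = \int \cdots \int q(x_t \mid x_{t-1}) \cdots q(x_1 \mid x_0)\, dx_1 \cdots dx_{t-1}$$

Due to the Gaussian structure and Chapman-Kolmogorov, this simplifies to a single Gaussian:

$$q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\big), \qquad \bar{\alpha}_t = \prod_{s=1}^{t} (1 - \beta_s)$$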

This is the Chapman-Kolmogorov equation in action: multi-step transitions reduce to a single closed-form expression, making efficient training possible.
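The collapse to one Gaussian can be verified by pushing the mean coefficient and variance through the chain one step at a time. If $x_{t-1} \mid x_0$ has mean $a\,x_0$ and variance $v$, one more step gives mean $\sqrt{1-\beta_t}\,a\,x_0$ and variance $(1-\beta_t)\,v + \beta_t$. The sketch below compares that recursion with the closed form. The linear schedule is a commonly used choice assumed here for illustration, and the function names are hypothetical.

```python
import math

def compose_steps(betas):
    """Apply the Gaussian steps one by one; returns (mean coefficient, variance)."""
    a, v = 1.0, 0.0                    # x_0 given x_0: coefficient 1, variance 0
    for beta in betas:
        a = math.sqrt(1.0 - beta) * a  # mean is scaled by sqrt(1 - beta_t)
        v = (1.0 - beta) * v + beta    # old variance shrinks, new noise is added
    return a, v

def closed_form(betas):
    """Single-Gaussian shortcut: sqrt(alpha_bar) and 1 - alpha_bar."""
    alpha_bar = math.prod(1.0 - b for b in betas)
    return math.sqrt(alpha_bar), 1.0 - alpha_bar

# Hypothetical linear schedule with 1000 steps.
betas = [1e-4 + (0.02 - 1e-4) * k / 999 for k in range(1000)]

a_rec, v_rec = compose_steps(betas)
a_cf, v_cf = closed_form(betas)
print("mean coefficient:", a_rec, a_cf)
print("variance:", v_rec, v_cf)
```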

6.3 Kolmogorov Equations in Score-Based Models

In score-based generative models ([[Score Function|Score SDE]] framework):

  1. Forward equation describes how $p_t$ spreads from data → noise
  2. Backward equation describes the reverse-time dynamics for sampling
  3. The score function $\nabla_x \log p_t(x)$ appears in the reverse-time dynamics as the drift correction term

The equivalence between SDE sampling and [[Probability Flow ODE]] sampling follows from the fact that both share the same Kolmogorov forward equation — they produce identical marginal distributions at all times.
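The shared forward equation can be made explicit in a short derivation, written here for the linear drift $f(t)\,x$ and diffusion coefficient $g(t)$ of Section 3.4. Using $\nabla^2 p_t = \nabla \cdot (p_t \nabla_x \log p_t)$, the Fokker-Planck equation becomes a continuity equation:

$$
\begin{aligned}
\frac{\partial p_t}{\partial t}
&= -\nabla \cdot \big[f(t)\,x\,p_t\big] + \frac{1}{2}\, g(t)^2\, \nabla \cdot \big(p_t \nabla_x \log p_t\big) \\
&= -\nabla \cdot \Big[\big(f(t)\,x - \tfrac{1}{2}\, g(t)^2\, \nabla_x \log p_t\big)\, p_t\Big].
\end{aligned}
$$

This is the transport equation of the deterministic flow $\frac{dx}{dt} = f(t)\,x - \frac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x)$, so the ODE and the SDE carry the same marginals $p_t$.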


7. Mathematical Properties

7.1 Uniqueness

Under standard regularity conditions (Lipschitz drift, bounded diffusion, non-degenerate noise):

  • The forward equation has a unique solution for a given initial density
  • The backward equation has a unique solution for a given terminal condition
  • Both solutions are $C^{1,2}$ (continuously differentiable in $t$, twice in $x$)

7.2 Positivity and Conservation

Both the forward and backward equations preserve fundamental properties:

  • Forward: $\int p(x,t)\,dx = 1$ for all $t$ (conservation of probability)
  • Forward: $p(x,0) \ge 0 \Rightarrow p(x,t) \ge 0$ (positivity preservation)
  • Backward: Maximum principle, i.e. $u(x,t)$ is bounded by its terminal values
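The two forward properties can be seen in a discretized setting. The sketch below applies an explicit flux-form finite-difference step to the simplest forward equation, $\partial_t p = \frac{1}{2}\partial_x^2 p$, with zero-flux walls. The grid, time step, and initial spike are arbitrary assumptions, and the time step is kept below $dx^2$ so that the explicit update stays non-negative.

```python
def heat_step(p, dx, dt):
    """One explicit step of dp/dt = 0.5 d2p/dx2 in flux form with zero-flux walls."""
    n = len(p)
    flux = [0.0] * (n + 1)      # flux[i]: flow across the interface between cells i-1 and i
    for i in range(1, n):
        flux[i] = -0.5 * (p[i] - p[i - 1]) / dx
    # Each cell changes by (flux in) - (flux out); wall fluxes stay 0, so mass is conserved.
    return [p[i] - dt * (flux[i + 1] - flux[i]) / dx for i in range(n)]

n_cells, dx, dt = 101, 0.1, 0.005           # dt <= dx**2 keeps the update non-negative
p = [0.0] * n_cells
p[n_cells // 2] = 1.0 / dx                  # unit mass concentrated in the middle cell

for _ in range(400):
    p = heat_step(p, dx, dt)

mass = sum(p) * dx
p_min = min(p)
print("total mass:", mass, " smallest value:", p_min)
```

Because every interior flux appears once with each sign, the total mass telescopes to a constant. That is the discrete counterpart of integrating the divergence form of the forward equation.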

7.3 Self-Adjoint Case

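When $L = L^*$ (self-adjoint), forward and backward equations coincide. This occurs for:

  • [[Wiener Process|Wiener Process]]: $L = \frac{1}{2}\frac{d^2}{dx^2}$ (the Laplacian is self-adjoint)
  • Gradient diffusions $dX_t = -\nabla V(X_t)\,dt + \sqrt{2}\,dW_t$: here $L$ is self-adjoint only in the weighted space $L^2(e^{-V}dx)$ (reversibility), so the two equations coincide once the density is expressed relative to the invariant density $e^{-V}$

For a general diffusion with non-zero drift, the generator is not self-adjoint with respect to Lebesgue measure, and the forward and backward equations are genuinely different.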

7.4 Spectral Interpretation

The forward and backward equations share the same spectrum (eigenvalues of $L$), but different eigenfunctions:

  • Forward eigenfunctions = left eigenvectors of $L$ (eigenfunctions of $L^*$)
  • Backward eigenfunctions = right eigenvectors of $L$

The spectral gap $\lambda_1$ (the smallest non-zero eigenvalue of $-L$) sets the exponential rate of convergence to stationarity, and hence the mixing time of the process.
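A two-state CTMC makes this concrete. The rates $a, b > 0$ below are an illustrative assumption:

$$
Q = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix},
\qquad \text{eigenvalues } 0 \text{ and } -(a+b).
$$

The right (backward) eigenvectors are $\mathbf{1} = (1, 1)^\top$ and $(a, -b)^\top$. The left (forward) eigenvectors are the stationary distribution $\pi = \frac{1}{a+b}(b, a)$ and $(1, -1)$. The transition matrix then splits along these modes:

$$
P(t) = \mathbf{1}\pi + e^{-(a+b)t}\,\big(I - \mathbf{1}\pi\big),
$$

so every row of $P(t)$ converges to $\pi$ at the exponential rate $\lambda_1 = a + b$.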


8. Core Formula Cards

[!QUOTE] Chapman-Kolmogorov (General)

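$$P(s+t, x, A) = \int_S P(t, y, A)\, P(s, x, dy)$$

[!QUOTE] Chapman-Kolmogorov (DTMC)

$$P^{(m+n)} = P^{(m)} P^{(n)}$$

[!QUOTE] Kolmogorov Forward (CTMC)

$$\frac{dP(t)}{dt} = P(t)\,Q, \qquad P(0) = I$$

[!QUOTE] Kolmogorov Backward (CTMC)

$$\frac{dP(t)}{dt} = Q\,P(t), \qquad P(0) = I$$

[!QUOTE] Kolmogorov Forward (Diffusion / Fokker-Planck)

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}[\mu\, p] + \frac{1}{2}\frac{\partial^2}{\partial x^2}[\sigma^2 p]$$

[!QUOTE] Kolmogorov Backward (Diffusion)

$$\frac{\partial u}{\partial t} + \mu \frac{\partial u}{\partial x} + \frac{\sigma^2}{2}\frac{\partial^2 u}{\partial x^2} = 0$$

[!QUOTE] Infinitesimal Generator

$$A f(x) = \lim_{t \downarrow 0} \frac{E[f(X_t) \mid X_0 = x] - f(x)}{t}$$

[!QUOTE] Semigroup Property

$$P_{s+t} = P_s P_t, \qquad P_0 = I$$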

9. Summary

| Aspect | Description |
|---|---|
| What they describe | Time evolution of transition probabilities in Markov processes |
| Chapman-Kolmogorov | Algebraic consistency: transitions compose via semigroup property |
| Forward (Fokker-Planck) | How probability density flows forward in time |
| Backward | How conditional expectations evolve backward from terminal conditions |
| Unifying framework | All three derive from the semigroup property of Markov transitions |
| Key distinction | Forward uses adjoint generator $L^*$, backward uses generator $L$ |
| Role in diffusion models | Forward = density evolution; Backward = reverse process / score matching foundation |
| Named after | Andrey Kolmogorov (1931), who also axiomatized probability theory |

Kolmogorov’s equations are the mathematical thread that connects the algebraic (Chapman-Kolmogorov), probabilistic (forward density), and analytic (backward expectations) descriptions of stochastic dynamics — a unification that remains at the heart of modern generative modeling.


Dataview Query

LIST
FROM #kolmogorov_equations OR #stochastic_process OR #markov_process
SORT file.ctime DESC

  • [[Markov Process]]
  • [[Fokker-Planck Equation]]
  • [[Stochastic Differential Equation (SDE)]]
  • [[Wiener Process|Wiener Process]]
  • [[Diffusion Model]]
  • [[Score Function]]
  • [[Probability Flow ODE]]
  • [[Martingale]]
  • [[Langevin Dynamics]]
  • [[Feynman-Kac Formula]]

References

  • Paper: Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung (Kolmogorov, 1931 — foundational paper)
  • Book: Continuous Martingales and Brownian Motion (Revuz & Yor, Chapter III: Markov Processes)
  • Book: Stochastic Differential Equations (Øksendal, Chapter 8: Diffusions and Kolmogorov Equations)
  • Book: Markov Processes: Characterization and Convergence (Ethier & Kurtz)
  • Book: Probability and Random Processes (Grimmett & Stirzaker)
  • Book: Diffusion Models: A Comprehensive Guide (Yang Song, Chapter on Score SDE)
  • Wikipedia: Chapman-Kolmogorov equation, Fokker-Planck equation, C₀-semigroup, Infinitesimal generator