Optimal Transport Plans

This tutorial builds intuition for what SDOT computes, how the Newton solver works, and what quantities you can extract from an OT plan.

The Problem

Given:

f: a discrete measure, that is, a finite collection of weighted Dirac masses at positions $y_{1}, \dots, y_{n}$ with masses $m_{1}, \dots, m_{n}$ (with $\sum_{i} m_{i} = 1$)
g: a continuous density on some domain $\Omega$, with total mass 1 so that it can be transported onto f

Find the transport map $T : \Omega \to \{ y_{1}, \dots, y_{n} \}$ that moves $g$ onto $f$ at minimum cost:

$$ W_{2}^{2}(f, g) = \min_{T} \int_{\Omega} \| x - T(x) \|^{2} \, g(x) \, dx \quad \text{s.t.} \quad \int_{T^{-1}(y_{i})} g(x) \, dx = m_{i} \quad \forall i $$

Power Diagrams (Laguerre Cells)

In the semi-discrete setting, the optimal transport map is always induced by a power diagram (also called a Laguerre tessellation): a partition of $\Omega$ into cells $C_{i}(w)$, one per Dirac $y_{i}$, where membership is defined by a weighted distance:

$$ x \in C_{i}(w) \iff \| x - y_{i} \|^{2} - w_{i} \leq \| x - y_{j} \|^{2} - w_{j} \quad \forall j \neq i $$

When all weights are equal (for example $w_{i} = 0$ for every $i$), this reduces to the standard Voronoi diagram. Raising $w_{i}$ inflates cell $i$ and lowering it deflates the cell, which is how the weights are tuned to match the target masses.

The transport plan then simply sends every point $x$ in cell $C_{i}$ to position $y_{i}$.
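The membership rule can be tried out directly on sample points. The following is a minimal NumPy sketch, not SDOT code: the helper name `laguerre_labels` and the two-site setup are illustrative assumptions. It assigns each sample to the site that minimizes the weighted distance, and shows that raising one weight enlarges the corresponding cell.

```python
import numpy as np

def laguerre_labels(points, sites, weights):
    """Index of the Laguerre cell containing each point (hypothetical helper).

    points: (N, d), sites: (n, d), weights: (n,).
    """
    # Squared distances from every point to every site, shape (N, n).
    sq_dist = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(axis=-1)
    # Cell membership: minimise ||x - y_i||^2 - w_i over i.
    return np.argmin(sq_dist - weights[None, :], axis=1)

rng = np.random.default_rng(0)
points = rng.random((5000, 2))                 # samples of the unit square
sites = np.array([[0.25, 0.5], [0.75, 0.5]])   # two Diracs, side by side

voronoi = laguerre_labels(points, sites, np.zeros(2))
inflated = laguerre_labels(points, sites, np.array([0.2, 0.0]))

# Fraction of samples in cell 0 before and after raising w_0.
share_voronoi = np.mean(voronoi == 0)
share_inflated = np.mean(inflated == 0)
```

With zero weights the boundary is the bisector $x_{1} = 0.5$. With $w_{0} = 0.2$ the inequality $(x_{1} - 0.25)^{2} - 0.2 \leq (x_{1} - 0.75)^{2}$ simplifies to $x_{1} \leq 0.7$, so cell 0 should hold roughly 70% of the samples.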

The Newton Solver

SDOT finds the optimal weights by solving:

$$ \int_{C_{i}(w)} g(x) \, dx = m_{i} \quad \forall i $$

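These equations are the optimality conditions of a smooth concave functional of the weights (the Kantorovich dual), which is strictly concave once the weights are fixed up to a common additive constant. Newton's method converges in a handful of iterations in practice (typically 5–20). The key cost at each step is computing the power diagram and the integrals of $g$ over its cells, which runs in O(n log n) time in the number of Diracs n.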

Reference: Kitagawa, Mérigot, Thibert — Convergence of a Newton algorithm for semi-discrete optimal transport, JEMS 2019.
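To see the mass-matching equations at work without a power-diagram library, here is a hedged sketch that replaces Newton's method with plain gradient ascent on the dual and replaces exact cell integrals with Monte Carlo sampling. It is not how SDOT solves the problem. The helper `cell_masses`, the site positions, the target masses, and the step size are all illustrative assumptions.

```python
import numpy as np

def cell_masses(points, sites, weights):
    """Monte Carlo estimate of the g-mass of each Laguerre cell (hypothetical helper)."""
    sq_dist = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(axis=-1)
    labels = np.argmin(sq_dist - weights[None, :], axis=1)
    return np.bincount(labels, minlength=len(sites)) / len(points)

rng = np.random.default_rng(0)
points = rng.random((20000, 2))   # samples of g, here uniform on [0, 1]^2
sites = np.array([[0.2, 0.2], [0.8, 0.3], [0.5, 0.5], [0.3, 0.8], [0.8, 0.8]])
target = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # masses m_i, summing to 1

weights = np.zeros(len(sites))
step = 0.1                        # hand-picked step size (assumption)
for _ in range(400):
    # Gradient of the dual objective: m_i minus the current mass of cell i.
    residual = target - cell_masses(points, sites, weights)
    weights += step * residual    # grow cells that hold too little mass

max_error = np.abs(target - cell_masses(points, sites, weights)).max()
```

Gradient ascent needs hundreds of cheap iterations here, whereas a Newton step uses the Jacobian of the cell masses with respect to the weights to get there in far fewer. That is the trade-off motivating the solver described above.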

Computing a Plan

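```python
from sdot import SplineGrid, SumOfDiracs, optimal_transport_plan
import numpy as np

# f: 300 Diracs at random positions in the unit square
f = SumOfDiracs( np.random.rand( 300, 2 ) )
# g: a density built from a 10 x 10 grid of random values
g = SplineGrid( np.random.rand( 10, 10 ) )

# solve for the optimal weights and keep the resulting plan
plan = optimal_transport_plan( f, g )
```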

Extracting quantities

```python
plan.distance          # W2^2 transport cost (scalar)
plan.barycenters       # centroid of each Laguerre cell, shape (n, d)
plan.cell_masses       # mass of each cell,              shape (n,)
plan.brenier_potential # dual potential psi,             shape (n,)
plan.power_diagram     # the underlying PowerDiagram object
```

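The Brenier potential $\psi$ is the dual variable of the transport problem. It encodes the map through $T(x) = x - \frac{1}{2} \nabla \psi(x)$: with the cell definition above, $\psi(x) = \min_{i} (\| x - y_{i} \|^{2} - w_{i})$ has gradient $2 (x - y_{i})$ inside $C_{i}$, so $T(x) = y_{i}$ there. The transport cost can also be recovered from the dual variables, as described in the next section.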
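To make these quantities concrete, the following NumPy sketch estimates cell masses, barycenters, and the cost from samples of $g$. It is an illustration under stated assumptions, not SDOT's implementation: $g$ is taken uniform on the unit square, the weights are left at zero as placeholders, and all variable names are hypothetical. The values match those of an optimal plan only when the weights are the optimal ones.

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.random((10000, 2))              # samples of g (uniform here)
sites = rng.random((6, 2))                   # Dirac positions y_i
weights = np.zeros(len(sites))               # placeholder weights w_i

sq_dist = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(axis=-1)
labels = np.argmin(sq_dist - weights, axis=1)    # Laguerre cell of each sample

counts = np.bincount(labels, minlength=len(sites))
masses = counts / len(points)                    # mass of each cell, shape (n,)

# Centroid of each cell: average of the samples that fall in it.
sums = np.zeros_like(sites)
np.add.at(sums, labels, points)
barycenters = sums / np.maximum(counts, 1)[:, None]   # shape (n, d)

# Cost of sending every sample to the Dirac of its cell.
cost = sq_dist[np.arange(len(points)), labels].mean()
```

A quick sanity check on such estimates: the masses sum to 1, and the mass-weighted average of the barycenters equals the mean of the samples, because the cells partition the domain.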

The Wasserstein Distance

The scalar plan.distance equals:

$$ W_{2}^{2}(f, g) = \int_{\Omega} \| x - T(x) \|^{2} \, g(x) \, dx = \sum_{i} \int_{C_{i}} \| x - y_{i} \|^{2} \, g(x) \, dx $$

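At the optimal weights it also equals the value of the Kantorovich dual objective, written here with the same weight convention as the Laguerre cells:

$$ W_{2}^{2}(f, g) = \sum_{i} m_{i} w_{i} + \int_{\Omega} \min_{i} \left( \| x - y_{i} \|^{2} - w_{i} \right) g(x) \, dx $$

The minimum is attained by the index of the cell containing $x$, and each cell has mass $m_{i}$ at the optimum, so the weight terms cancel and the primal cost remains.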

For a shortcut when you only need the distance:

```python
from sdot import distance

d = distance( f, g )   # same as plan.distance, but doesn't store the full plan
```

Barycenters and the Lloyd Algorithm

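Each entry plan.barycenters[i] is the centroid of transport cell $i$, weighted by $g$:

$$ b_{i} = \frac{1}{m_{i}} \int_{C_{i}(w)} x \, g(x) \, dx $$

Moving each Dirac to its barycenter and repeating gives the Lloyd algorithm. In practice it converges to a locally optimal quantization of $g$, a fixed point where every Dirac sits at its own barycenter, and the result depends on the initial positions: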

```python
# g is the density defined in "Computing a Plan"
positions = np.random.rand( 200, 2 )

for _ in range( 30 ):
    f    = SumOfDiracs( positions )
    plan = optimal_transport_plan( f, g )  # re-solve for the new positions
    positions = plan.barycenters          # Lloyd step
```
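For comparison, here is a sketch of the classical Lloyd iteration in plain NumPy. It is deliberately a different variant from the loop above: the cells are ordinary Voronoi cells (all weights zero) rather than mass-constrained Laguerre cells, and integrals are replaced by averages over samples of $g$. The helper name `lloyd_step` and the uniform $g$ are assumptions made for illustration.

```python
import numpy as np

def lloyd_step(points, sites):
    """One classical Lloyd step: move each site to its Voronoi centroid.

    Returns the moved sites and the quantization energy measured
    before the move (hypothetical helper, not part of SDOT).
    """
    sq_dist = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(axis=-1)
    labels = np.argmin(sq_dist, axis=1)
    energy = sq_dist[np.arange(len(points)), labels].mean()
    counts = np.bincount(labels, minlength=len(sites))
    sums = np.zeros_like(sites)
    np.add.at(sums, labels, points)
    centroids = sums / np.maximum(counts, 1)[:, None]
    # Sites whose cell received no sample stay where they are.
    moved = np.where(counts[:, None] > 0, centroids, sites)
    return moved, energy

rng = np.random.default_rng(0)
points = rng.random((20000, 2))      # samples of g, here uniform on [0, 1]^2
sites = rng.random((20, 2))

energies = []
for _ in range(30):
    sites, energy = lloyd_step(points, sites)
    energies.append(energy)
```

The recorded energies never increase from one step to the next: recomputing centroids cannot raise the energy for a fixed assignment, and reassigning samples to their nearest site cannot raise it for fixed sites. The sequence settles at a local optimum that depends on the starting positions.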

Gradients

Both plan.distance and distance(f, g) are differentiable with respect to the Dirac positions and masses. With the cost $\| x - y_{i} \|^{2}$ used above, the gradient with respect to the position $y_{i}$ is:

$$ \frac{\partial W_{2}^{2}}{\partial y_{i}} = 2 \, m_{i} \, (y_{i} - b_{i}) $$

where $b_{i}$ is the barycenter of cell $i$. The factor 2 comes from differentiating the squared norm and disappears under a $\frac{1}{2} \| x - y \|^{2}$ cost convention. This has a clean geometric interpretation: the gradient points from the barycenter toward the Dirac, and vanishes exactly when the Dirac is at its barycenter (i.e., at the Lloyd fixed point).

In SDOT, this gradient is computed automatically through JAX or PyTorch autodiff:

```python
import jax

# differentiate the distance with respect to the Dirac positions
grad_positions = jax.grad( lambda pos: distance( SumOfDiracs( pos ), g ) )( positions )
# grad_positions[i] is proportional to masses[i] * (positions[i] - barycenters[i])
# (factor 2 with the unnormalized squared cost used on this page)
```
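The shape of this gradient can be checked numerically without SDOT. The sketch below freezes the cell assignment (Voronoi cells on samples of a uniform $g$, both assumptions) and differentiates only the cost $\sum_{i} \int_{C_{i}} \| x - y_{i} \|^{2} g(x) \, dx$ with respect to the positions, using central finite differences. With this unnormalized cost the frozen-cell derivative is $2 m_{i} (y_{i} - b_{i})$; a $\frac{1}{2} \| x - y \|^{2}$ convention would drop the 2. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((5000, 2))       # samples of g (uniform here)
sites = rng.random((4, 2))           # Dirac positions y_i

sq_dist = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(axis=-1)
labels = np.argmin(sq_dist, axis=1)  # cell assignment, frozen from here on

def energy(y):
    """Sample estimate of sum_i int_{C_i} ||x - y_i||^2 g(x) dx, cells frozen."""
    return ((points - y[labels]) ** 2).sum(axis=1).mean()

counts = np.bincount(labels, minlength=len(sites))
masses = counts / len(points)
sums = np.zeros_like(sites)
np.add.at(sums, labels, points)
barycenters = sums / np.maximum(counts, 1)[:, None]

# Closed form for the frozen-cell gradient.
closed_form = 2.0 * masses[:, None] * (sites - barycenters)

# Central finite differences, one coordinate at a time.
eps = 1e-5
numeric = np.zeros_like(sites)
for i in range(sites.shape[0]):
    for k in range(sites.shape[1]):
        shift = np.zeros_like(sites)
        shift[i, k] = eps
        numeric[i, k] = (energy(sites + shift) - energy(sites - shift)) / (2 * eps)
```

Because the frozen-cell energy is quadratic in the positions, the central differences agree with the closed form up to rounding. Comparing `numeric` with `closed_form` is a cheap way to confirm which normalization a given implementation uses.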
