Analysis on Wasserstein space

8 minute read

Published:

In this blog post, we focus on some analysis on the space of probability measures, as it is the main technical component arising in McKean-Vlasov SDEs.

This blog post is mainly inspired by Chapter 5 of the excellent book by Carmona and Delarue, Probabilistic Theory of Mean Field Games with Applications I, available at this link.


Some preliminaries on probability measures

In this section, we consider a metric space $(\mathcal{X},d)$ and we equip $\mathcal{X}$ with the Borel $\sigma$-algebra induced by the topology generated by $d$. We denote by $\mathcal{P}(\mathcal{X})$ the set of probability measures on $\mathcal{X}$ and by $\mathcal{C}_b(\mathcal{X})$ the set of bounded continuous real-valued functions on $\mathcal{X}$. We now define the notion of weak convergence of a sequence $(\mu_n)_{n \in \mathbb{N}^*} \subset \mathcal{P}(\mathcal{X})$ towards $\mu \in \mathcal{P}(\mathcal{X})$.

Weak convergence of probability measures

Given a probability measure $\mu \in \mathcal{P}(\mathcal{X})$ and a sequence of probability measures $(\mu_n)_{n \in \mathbb{N}^*} \subset \mathcal{P}(\mathcal{X})$, we say that $(\mu_n)_{n \in \mathbb{N}^*}$ converges weakly to $\mu$ if

\[\begin{align} \underset{n \to \infty}{\lim} \int_{\mathcal{X}} f(x) d\mu_n(x) = \int_{\mathcal{X}} f(x) d \mu(x), \quad \forall f \in \mathcal{C}_b(\mathcal{X}). \end{align}\]

We also provide a more probabilistic definition involving random variables. Given a sequence of $\mathcal{X}$-valued random variables $(X_n)_{n \in \mathbb{N}^*}$, we say that $(X_n)_{n \in \mathbb{N}^*}$ converges $\textbf{weakly}$ (or in $\textbf{distribution}$) to another $\mathcal{X}$-valued random variable $X$ if

\[\begin{align} \underset{n \to \infty}{\lim} \mathbb{E}[ f(X_n)] = \mathbb{E}[f(X)], \quad \forall f \in \mathcal{C}_b(\mathcal{X}). \end{align}\]

It is worth mentioning that the random variables $X_n$ do not need to be defined on the same probability space $(\Omega,\mathcal{F},\mathbb{P})$. We could have random variables $X_n$ defined on $(\Omega_n, \mathcal{F}_n,\mathbb{P}_n)$ and $X$ defined on $(\Omega,\mathcal{F},\mathbb{P})$, and the definition would still make sense since, by the transfer theorem, $(2)$ can be rewritten as

\[\begin{align} \underset{n \to \infty}{\lim} \int_{\mathcal{X}} f(x) d \mathbb{P}_{X_n}(x) = \int_{\mathcal{X}} f(x) d \mathbb{P}_{X}(x), \quad \forall f \in \mathcal{C}_b(\mathcal{X}). \end{align}\]

As a simple example, if a sequence $(x_n)_{n \in \mathbb{N}^*}$ valued in $\mathcal{X}$ converges towards $x$, then the sequence of measures $(\delta_{x_n})_{n \in \mathbb{N}^*}$ converges weakly towards $\delta_{x}$. Indeed, for every $f \in \mathcal{C}_b(\mathcal{X})$, we have $\int_{\mathcal{X}} f d\delta_{x_n} = f(x_n) \to f(x) = \int_{\mathcal{X}} f d\delta_{x}$ by continuity of $f$.

Moreover, it is important to notice that weak convergence of probability measures implies neither the convergence of $\mu_n(A)$ towards $\mu(A)$ for every Borel set $A \in \mathcal{B}(\mathcal{X})$, nor the convergence of the moments for measures in $\mathcal{P}(\mathbb{R}^d)$. As a simple counterexample for the moments, consider the laws in $\mathcal{P}(\mathbb{R})$ defined by

\[\begin{align} \mu_n(A) = \left(1- \frac{1}{n}\right) \delta_{0}(A) + \frac{1}{n} \delta_{n}(A). \end{align}\]

It is easy to see that $\mu_n$ converges weakly to $\delta_0$. However, for the monomials $f(x) = x^k$, $k \in \mathbb{N}^*$, we have

\[\begin{align} \int_{\mathbb{R}} f(x) d \mu_n(x) = n^{k-1} \nrightarrow 0 = \int_{\mathbb{R}} f(x) d \delta_0(x), \end{align}\]

which shows that weak convergence does not ensure the convergence of the $\textbf{moments}$ of the laws.
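The counterexample can be checked numerically. The following is a minimal Python sketch, not taken from the reference: the helper `integrate_mu_n` and the choice of $\cos$ as bounded continuous test function are illustrative assumptions. It integrates a function exactly against $\mu_n$, which only charges the two atoms $0$ and $n$.

```python
import math

def integrate_mu_n(f, n):
    """Integral of f against mu_n = (1 - 1/n) * delta_0 + (1/n) * delta_n."""
    return (1.0 - 1.0 / n) * f(0.0) + (1.0 / n) * f(float(n))

def bounded_test(x):
    # A bounded continuous test function (an arbitrary illustrative choice).
    return math.cos(x)

def first_moment(x):
    # The monomial x^k with k = 1: continuous but unbounded.
    return x

for n in (10, 100, 1000, 10000):
    # First column tends to cos(0) = 1; second column stays equal to n^(k-1) = 1.
    print(n, integrate_mu_n(bounded_test, n), integrate_mu_n(first_moment, n))
```

The integral of the bounded function approaches its value at $0$, as weak convergence to $\delta_0$ requires, while the first moment stays at $1$ instead of tending to $0$.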

The famous $\textbf{Portmanteau theorem}$ characterizes weak convergence in terms of the behavior of $\mu_n(A)$ on suitable sets $A$ (for instance, $\mu_n(A) \to \mu(A)$ for every Borel set $A$ with $\mu(\partial A) = 0$); its proof can easily be found online.

We now give an important result which shows that convergence in distribution is stable under continuous mappings.

$\textbf{Theorem : \text{Continuous mapping theorem} }$

Let $\mathcal{X}$ and $\mathcal{Y}$ be two metric spaces, and let $(X_n)_{n \in \mathbb{N}^*}$ be a sequence of $\mathcal{X}$-valued random variables which converges in distribution towards $X$. Let $g$ be a continuous mapping from $\mathcal{X}$ to $\mathcal{Y}$. Then, the sequence of $\mathcal{Y}$-valued random variables $(g(X_n))_{n \in \mathbb{N}^*}$ converges weakly towards $g(X)$.

$\emph{Proof : }$ Let $f \in \mathcal{C}_b(\mathcal{Y})$. Since $g$ is continuous, we have $f \circ g \in \mathcal{C}_b(\mathcal{X})$, and the weak convergence of $(X_n)_{n \in \mathbb{N}^*}$ towards $X$ gives $\mathbb{E}[f(g(X_n))] \to \mathbb{E}[f(g(X))]$, which is the claim.
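As an illustration of the continuous mapping theorem, here is a small Python sketch under assumptions that are ours and not from the reference: $X_n$ is uniform on the grid $\lbrace 1/n, \ldots, n/n \rbrace$ (which converges in distribution to a uniform variable $X$ on $[0,1]$), $g(x) = x^2$ and the test function is $f = \cos$. The limit $\mathbb{E}[f(g(X))]$ is approximated by a fine midpoint rule rather than computed in closed form.

```python
import math

def expect_uniform_grid(h, n):
    """E[h(X_n)] when X_n is uniform on the grid {1/n, 2/n, ..., 1}."""
    return sum(h(i / n) for i in range(1, n + 1)) / n

def expect_uniform_01(h, m=200_000):
    """Midpoint-rule approximation of E[h(X)] for X uniform on [0, 1]."""
    return sum(h((i + 0.5) / m) for i in range(m)) / m

def g(x):
    # Continuous map applied to the random variables.
    return x * x

def f_after_g(x):
    # f o g with f = cos, a bounded continuous test function.
    return math.cos(g(x))

limit = expect_uniform_01(f_after_g)
for n in (10, 100, 1000):
    # The gap between E[f(g(X_n))] and E[f(g(X))] shrinks as n grows.
    print(n, abs(expect_uniform_grid(f_after_g, n) - limit))
```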

We now discuss the weak convergence of empirical measures, as it is one of the main components in the study of limiting processes depending on the law of the state.

For this, let $(X_i)_{i \in \mathbb{N}^*}$ be i.i.d. $\mathcal{X}$-valued random variables, and define the $\mathcal{P}(\mathcal{X})$-valued random variable (the empirical measure)

\[\begin{align} \mu_n = \frac{1}{n} \sum_{i=1}^{n} \delta_{X_i} \end{align}\]
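For a fixed $f \in \mathcal{C}_b(\mathcal{X})$, the strong law of large numbers gives $\int_{\mathcal{X}} f d\mu_n = \frac{1}{n} \sum_{i=1}^{n} f(X_i) \to \mathbb{E}[f(X_1)]$ almost surely. The following Python sketch illustrates this under assumptions chosen only for the example: standard Gaussian samples on $\mathbb{R}$, the test function $f = \cos$ (for which $\mathbb{E}[\cos(Z)] = e^{-1/2}$), and a fixed seed for reproducibility.

```python
import math
import random

def empirical_integral(f, samples):
    """Integral of f against the empirical measure (1/n) * sum_i delta_{X_i}."""
    return sum(f(x) for x in samples) / len(samples)

rng = random.Random(0)   # fixed seed, only to make the sketch reproducible
exact = math.exp(-0.5)   # E[cos(Z)] for Z ~ N(0, 1)

for n in (100, 10_000, 200_000):
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # The error is random, but typically of order 1 / sqrt(n).
    print(n, abs(empirical_integral(math.cos, samples) - exact))
```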

Some compactness results on $\mathcal{P}(\mathcal{X})$

Given a set $S \subset \mathcal{P}(\mathcal{X})$, we say that the family $S$ of probability measures is $\textbf{tight}$ if for all $\epsilon > 0$, there exists a compact set $K \subset \mathcal{X}$ such that

\[\begin{align} \underset{ \mu \in S}{\sup } \hspace{0.2 cm} \mu(K^c) \leq \epsilon. \end{align}\]
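As a quick worked example (added for illustration, not taken from the reference), consider on $\mathbb{R}$ the family $\lbrace \delta_n \rbrace_{n \in \mathbb{N}^*}$ and the family $\lbrace \mu_n \rbrace_{n \in \mathbb{N}^*}$ with $\mu_n = (1-\frac{1}{n}) \delta_0 + \frac{1}{n} \delta_n$ introduced earlier. With the compact sets $K_R = [-R,R]$, $R > 0$, we get

\[\begin{aligned} \underset{n \in \mathbb{N}^*}{\sup} \hspace{0.2 cm} \delta_n(K_R^c) &= 1, \\ \underset{n \in \mathbb{N}^*}{\sup} \hspace{0.2 cm} \mu_n(K_R^c) &= \underset{n > R}{\sup} \hspace{0.2 cm} \frac{1}{n} \leq \frac{1}{R}. \end{aligned}\]

Since every compact subset of $\mathbb{R}$ is contained in some $K_R$, the first family is not tight, while the second one is tight: it suffices to take $R \geq 1/\epsilon$.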

$\textbf{Theorem : \text{Prokhorov for compactness} }$

Let $(\mu_n)_{n \in \mathbb{N}^*} \subset \mathcal{P}(\mathcal{X})$. If $(\mu_n)_{n \in \mathbb{N}^*}$ is tight, then it is precompact in the sense that every subsequence admits a further subsequence which converges weakly to some $\mu \in \mathcal{P}(\mathcal{X})$. Conversely, if $(\mu_n)_{n \in \mathbb{N}^*}$ is precompact and if the metric space $(\mathcal{X},d)$ is separable and complete, then $(\mu_n)_{n \in \mathbb{N}^*}$ is tight.

Note that we say that a set $S \subset \mathcal{X}$ of a metric space $\mathcal{X}$ is precompact if its closure $\bar{S}$ is compact. The previous theorem is really useful because it characterizes the precompact sets of $\mathcal{P}(\mathcal{X})$ through tightness, which is often easier to check.

$\textbf{Theorem : \text{Skorokhod’s representation} }$

Suppose $(\mathcal{X},d)$ is separable. Let $(\mu_n)_{n \in \mathbb{N}^*}$ converge weakly to $\mu \in \mathcal{P}(\mathcal{X})$. Then, there exists a probability space $(\Omega,\mathcal{F},\mathbb{P})$ supporting $\mathcal{X}$-valued random variables $X_n$ and $X$ with $X_n \sim \mu_n$, $X \sim \mu$ and $X_n \to X$ a.s.

This theorem is important because it lets us prove properties related to weak convergence using results on almost sure convergence. For instance, we give the following theorem, which extends weak convergence to the case of uniformly integrable maps.

$\textbf{Theorem : \text{Extension of weak convergence} }$

Let $(\mu_n)_{n \in \mathbb{N}^*}$ converge weakly to $\mu \in \mathcal{P}(\mathcal{X})$. If $f : \mathcal{X} \to \mathbb{R}$ is continuous and uniformly integrable in the sense that

\[\begin{align} \underset{r \to \infty}{\lim} \underset{n \in \mathbb{N}^*}{\sup} \int_{\mathcal{X}} |f(x)| \mathbb{1}_{\lbrace|f(x)| \geq r \rbrace} d \mu_n(x) = 0, \end{align}\]

then we have $\underset{n \to \infty}{\lim} \int_{\mathcal{X}} f(x) d \mu_n(x) = \int_{\mathcal{X}} f(x) d \mu(x)$.

$\textbf{Proof}$

First, as $f$ is a continuous mapping, we know that the measures $f_{\#} \mu_n$ converge weakly to $f_{\#} \mu$. Therefore, since $\mathbb{R}$ is separable, Skorokhod's representation theorem provides, on a common probability space $(\Omega,\mathcal{F},\mathbb{P})$, random variables $X_n \sim f_{\#} \mu_n$ and $X \sim f_{\#} \mu$ with $X_n \to X$ almost surely. By the transfer theorem, the uniform integrability condition $(8)$ of the theorem can be rewritten as

\[\begin{align} \underset{r \to \infty}{\lim} \underset{n \in \mathbb{N}^*}{\sup} \hspace{0.2 cm} \mathbb{E}[ |X_n| \mathbb{1}_{\lbrace |X_n| \geq r \rbrace}] = 0, \end{align}\]

which means that $(X_n)_{n \in \mathbb{N}^*}$ is uniformly integrable. Since $X_n$ converges to $X$ almost surely, and therefore in probability, Vitali's convergence theorem (sometimes called the optimal dominated convergence theorem) gives $X_n \to X$ in $L^1(\mathbb{P})$, hence

\[\begin{align} \underset{n \to \infty}{\lim} \int_{\mathcal{X}} f(x) d\mu_n(x)=\underset{n \to \infty}{\lim} \mathbb{E}[X_n] = \mathbb{E}[X] = \int_{\mathcal{X}} f(x) d\mu(x) \end{align}\]
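To see that the uniform integrability assumption cannot simply be dropped, here is a worked example (added for illustration) reusing the sequence $\mu_n = (1-\frac{1}{n}) \delta_0 + \frac{1}{n} \delta_n$ on $\mathbb{R}$ with $f(x) = x$. For every $r > 0$, only the atom at $n$ can lie in $\lbrace |x| \geq r \rbrace$, so

\[\begin{aligned} \underset{n \in \mathbb{N}^*}{\sup} \int_{\mathbb{R}} |x| \mathbb{1}_{\lbrace |x| \geq r \rbrace} d \mu_n(x) = \underset{n \geq r}{\sup} \hspace{0.2 cm} \frac{1}{n} \cdot n = 1. \end{aligned}\]

The supremum does not vanish as $r \to \infty$, so $f$ is not uniformly integrable with respect to $(\mu_n)_{n \in \mathbb{N}^*}$, and indeed $\int_{\mathbb{R}} x \, d\mu_n(x) = 1$ does not converge to $\int_{\mathbb{R}} x \, d\delta_0(x) = 0$.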

$\mathcal{P}(\mathcal{X})$ as a metric space

We now discuss metrics on the space $\mathcal{P}(\mathcal{X})$ which are compatible with the weak convergence that we just defined. We start with the link between Wasserstein distances and weak convergence; the main possible choices of metrics are discussed in the following section.

$\textbf{Theorem : Wasserstein distance and weak convergence}$

Let $p \geq 1$, and let $\mu$ and a sequence $(\mu_n)_{n \in \mathbb{N}^*}$ belong to $\mathcal{P}_p(\mathcal{X})$, the set of probability measures on $\mathcal{X}$ with a finite moment of order $p$. Then, the following are equivalent:

  • $\mathcal{W}_p(\mu_n,\mu) \to 0$.
  • For every continuous function $f : \mathcal{X} \to \mathbb{R}$ for which there exist $x_0 \in \mathcal{X}$ and $c > 0$ such that $|f(x)| \leq c ( 1+ d(x_0,x)^p)$ for all $x \in \mathcal{X}$, we have \(\begin{align} \int_{\mathcal{X}} f(x) d \mu_n(x) \to \int_{\mathcal{X}} f(x) d \mu(x). \end{align}\)
  • $\mu_n \to \mu$ weakly and \(\int_{\mathcal{X}} d(x,x_0)^p \mu_n(dx) \to \int_{\mathcal{X}} d(x,x_0)^p \mu(dx)\) for some $x_0 \in \mathcal{X}$.
  • $\mu_n \to \mu$ weakly and \(\underset{r \to \infty}{\lim} \underset{n \in \mathbb{N}^*}{\sup} \int_{\lbrace d(x,x_0) \geq r \rbrace} d(x,x_0)^p \mu_n(dx) = 0\) for some $x_0 \in \mathcal{X}$.
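The role of the moment condition can be illustrated on $\mathbb{R}$ with a small Python sketch. It assumes the usual definition of $\mathcal{W}_p$ as the $p$-th root of the minimal transport cost for the cost $|x-y|^p$, and uses the fact that the only coupling of a measure $\nu$ with $\delta_0$ is the product measure, so that $\mathcal{W}_p(\nu,\delta_0)^p = \int_{\mathbb{R}} |x|^p d\nu(x)$. The helper names are hypothetical and chosen only for this example.

```python
def wasserstein_to_dirac0(atoms, weights, p):
    """W_p between sum_i weights[i] * delta_{atoms[i]} on R and delta_0.

    The only coupling with a Dirac marginal is the product measure, so
    W_p^p reduces to the p-th absolute moment of the first measure.
    """
    return sum(w * abs(x) ** p for x, w in zip(atoms, weights)) ** (1.0 / p)

def mu_n(n):
    # (1 - 1/n) * delta_0 + (1/n) * delta_n: converges weakly to delta_0,
    # but its moments of order p >= 1 do not converge to 0.
    return [0.0, float(n)], [1.0 - 1.0 / n, 1.0 / n]

def nu_n(n):
    # delta_{1/n}: converges weakly to delta_0 and all its moments tend to 0.
    return [1.0 / n], [1.0]

for n in (10, 100, 1000):
    # First distance equals n^((p-1)/p) and does not vanish; second equals 1/n.
    print(n, wasserstein_to_dirac0(*mu_n(n), p=2), wasserstein_to_dirac0(*nu_n(n), p=2))
```

Both sequences converge weakly to $\delta_0$, but only the second one converges in $\mathcal{W}_p$, in line with the moment convergence required in the third item.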

$\textbf{Theorem : \text{Kantorovitch Duality} }$

Some metrics on $\mathcal{P}(\mathcal{X})$

The Lévy-Prokhorov and total variation distance

Wasserstein metrics

Some topological properties of the Wasserstein space $\mathcal{P}_2(\mathbb{R}^d)$

To be done asap

Differentiability on $\mathcal{P}(\mathbb{R}^d)$

Some notions of derivatives on $\mathcal{P}(\mathbb{R}^d)$

To be done asap.

An Itô’s formula along the flow of measures

To be done asap.