A Note on the Monge-Kantorovich Problem in the Plane

The Monge-Kantorovich mass-transportation problem has been shown in recent years to be fundamental to various basic problems in analysis and geometry. Shen and Zheng (2010) proposed a probabilistic method to transform the celebrated Monge-Kantorovich problem in a bounded region of the Euclidean plane into a Dirichlet boundary value problem associated with a nonlinear elliptic equation. Their results are original and sound; however, the arguments leading to the main results omit many steps and are difficult to follow. In the present paper, we adopt a different approach and give a short, detailed, and easy-to-follow proof of their main results.


Introduction
The optimal transportation problem was first raised by Monge in 1781. Let $X$ and $Y$ be two separable metric spaces, and let $c : X \times Y \to [0, \infty]$ be a Borel-measurable function, where $c(x, y)$ is the cost of transportation from $x$ to $y$. Given probability measures $\mu$ on $X$ and $\nu$ on $Y$, Monge's formulation of the optimal transportation problem is to find a transport map $T : X \to Y$ that realizes the infimum
$$\inf \left\{ \int_X c(x, T(x)) \, d\mu(x) : T^{-1}(\mu) = \nu \right\},$$
where $T^{-1}(\mu) = \nu$ means that $\nu(A) = \mu(T^{-1}(A))$ for every Borel set $A$ in $Y$. Sometimes one writes $\mu T^{-1} = \nu$. A map $T$ that attains this infimum is called an optimal transport map. Monge's formulation can be ill-posed, because sometimes there is no $T$ satisfying $T^{-1}(\mu) = \nu$. Kantorovich (1942) reformulated the problem as follows: find a probability measure $\gamma$ on $X \times Y$ that attains the infimum
$$\inf \left\{ \int_{X \times Y} c(x, y) \, d\gamma(x, y) : \gamma \in \Gamma(\mu, \nu) \right\},$$
where $\Gamma(\mu, \nu)$ denotes the collection of all probability measures on $X \times Y$ with marginals $\mu$ on $X$ and $\nu$ on $Y$. It is known that a minimizer for this problem always exists when the cost function $c$ is lower semi-continuous and $\Gamma(\mu, \nu)$ is a tight collection of measures. This optimization problem is called the Monge-Kantorovich problem.
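The gap between the two formulations can be seen on a toy example. The sketch below is only an illustration under assumed data (a point mass $\mu$ at 0 and a measure $\nu$ charging $-1$ and $+1$ equally, with quadratic cost; none of these come from the text). It enumerates all maps between the supports to confirm that no transport map exists, and then checks that the product coupling is admissible in Kantorovich's sense. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed toy data: mu is a point mass at 0, nu splits its mass between -1 and +1.
mu = {0: Fraction(1)}
nu = {-1: Fraction(1, 2), 1: Fraction(1, 2)}


def cost(x, y):
    """Quadratic transportation cost c(x, y) = |x - y|^2."""
    return (x - y) ** 2


def pushforward(measure, T):
    """Image of a discrete measure under the map T (a dict x -> T(x))."""
    image = {}
    for x, mass in measure.items():
        image[T[x]] = image.get(T[x], Fraction(0)) + mass
    return image


def monge_maps(mu, nu):
    """All maps from supp(mu) to supp(nu) that transport mu onto nu."""
    xs, ys = list(mu), list(nu)
    admissible = []
    for images in product(ys, repeat=len(xs)):
        T = dict(zip(xs, images))
        if pushforward(mu, T) == nu:
            admissible.append(T)
    return admissible


def marginals(gamma):
    """First and second marginals of a discrete coupling {(x, y): mass}."""
    first, second = {}, {}
    for (x, y), mass in gamma.items():
        first[x] = first.get(x, Fraction(0)) + mass
        second[y] = second.get(y, Fraction(0)) + mass
    return first, second


def transport_cost(gamma):
    """Integral of the cost against the coupling gamma."""
    return sum(mass * cost(x, y) for (x, y), mass in gamma.items())


# The product coupling always has the prescribed marginals.
gamma = {(x, y): mu[x] * nu[y] for x in mu for y in nu}

print("admissible Monge maps:", monge_maps(mu, nu))
print("coupling has marginals (mu, nu):", marginals(gamma) == (mu, nu))
print("Kantorovich cost of the coupling:", transport_cost(gamma))
```

Since $\mu$ is a point mass here, the product coupling is the only element of $\Gamma(\mu, \nu)$, so its cost is also the Kantorovich minimum for this toy instance.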
We consider a special case: $X$ and $Y$ are both one-dimensional Euclidean domains, and $c(x, y) = |x - y|^2$. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a non-atomic probability space. We denote by $L(F, G)$ the set of all 2-dimensional random vectors whose marginal distributions are $F$ and $G$, respectively. Then the Monge-Kantorovich problem can be reformulated as follows: find an optimal coupling $(X, Y) \in L(F, G)$ such that $E[(X - Y)^2]$ attains the minimum. It is well known and easily proved (see, e.g., Rachev and Rüschendorf (1998a,b)) that if $(\tilde{X}, \tilde{Y}) \in L(F, G)$, and $\tilde{X}$ and $\tilde{Y}$ are comonotonic, then $(\tilde{X}, \tilde{Y})$ is an optimal coupling:
$$E[(\tilde{X} - \tilde{Y})^2] = \min_{(X, Y) \in L(F, G)} E[(X - Y)^2],$$
and the minimum value is
$$\int_0^1 \left( F^{(-1)}(t) - G^{(-1)}(t) \right)^2 dt,$$
where $F^{(-1)}(\cdot)$ and $G^{(-1)}(\cdot)$ denote the left-continuous inverse functions of $F(\cdot)$ and $G(\cdot)$, respectively.
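For two empirical distributions with the same number of equally weighted atoms, the comonotonic coupling pairs the $k$-th smallest value of one sample with the $k$-th smallest value of the other. The following sketch, with assumed random samples and hypothetical helper names, compares the cost of that pairing with a brute-force minimum over all one-to-one pairings. It restricts attention to permutation couplings and is meant as a sanity check of the statement above, not as a proof.

```python
import random
from itertools import permutations


def coupling_cost(xs, ys):
    """Mean squared distance of the pairing (xs[i], ys[i])."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def comonotone_cost(xs, ys):
    """Cost of pairing the k-th smallest x with the k-th smallest y."""
    return coupling_cost(sorted(xs), sorted(ys))


def brute_force_min(xs, ys):
    """Minimum cost over all one-to-one pairings (only feasible for tiny samples)."""
    return min(coupling_cost(xs, perm) for perm in permutations(ys))


# Assumed sample data, generated with a fixed seed for reproducibility.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(6)]
ys = [rng.uniform(-2.0, 2.0) for _ in range(6)]

best = brute_force_min(xs, ys)
sorted_cost = comonotone_cost(xs, ys)
print("brute-force minimum:", best)
print("comonotone pairing :", sorted_cost)
```

The two printed values should agree up to floating-point rounding. In this empirical setting the sorted pairing cost is the discrete counterpart of the quantile integral given above.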

Main Idea of Shen and Zheng (2010)
Shen and Zheng (2010) consider the Monge-Kantorovich problem in the Euclidean plane. Given two 2-dimensional distribution functions $F$ and $G$, they seek an optimal coupling $(\mathcal{X}, \mathcal{Y})$ whose marginal distributions are $F$ and $G$, respectively, such that $E[|\mathcal{X} - \mathcal{Y}|^2]$ attains the minimum, where $\mathcal{X} = (X_1, X_2)$ and $\mathcal{Y} = (Y_1, Y_2)$. Denote $\mathcal{Z} = (X_1, Y_2)$; then
$$E[|\mathcal{X} - \mathcal{Y}|^2] = E[|\mathcal{X} - \mathcal{Z}|^2] + E[|\mathcal{Z} - \mathcal{Y}|^2] = E[(X_2 - Y_2)^2] + E[(X_1 - Y_1)^2].$$
Assuming that the random vectors $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ all have smooth and strictly positive density functions, Shen and Zheng (2010) use the random vector $\mathcal{Z}$ to reduce the dimension of the decision variable, turning the original optimal coupling problem on $(\mathcal{X}, \mathcal{Y})$ into an optimization problem on the distribution of $\mathcal{Z}$. They assume that the random vectors take values in a bounded region and reformulate the problem in a new probability space $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$ with $\tilde{\Omega} = [0, 1] \times [0, 1]$ and $\tilde{\mathbb{P}}$ being the Lebesgue measure. In fact, as we will see below, this restriction and reformulation are not needed.
The main approach in Shen and Zheng (2010) consists of two steps. First, for each fixed pair $(X_1, Y_2)$, they adopt a probabilistic approach to find the best $X_2$ and $Y_1$ under the constraint that the joint distribution of $(X_1, X_2)$ is the given distribution $F$ and that of $(Y_1, Y_2)$ is $G$. After this step, the optimal coupling problem boils down to an optimization problem over all possible joint probability density functions of $(X_1, Y_2)$. Second, they propose a calculus of variations method to solve this optimization problem.
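As in Shen and Zheng (2010), our first step is to construct two functions $g(\cdot, \cdot)$ and $h(\cdot, \cdot)$ satisfying
$$F_{X_2|X_1}(g(x, y) \mid x) = F_{Y_2|X_1}(y \mid x), \qquad F_{Y_1|Y_2}(h(x, y) \mid y) = F_{X_1|Y_2}(x \mid y).$$
Let $f(\cdot, \cdot)$ be the probability density function of the 2-dimensional random vector $\mathcal{X} = (X_1, X_2)$. Then the conditional distribution of $X_2$ given $X_1 = x$ is
$$F_{X_2|X_1}(y \mid x) = \frac{\int_{-\infty}^{y} f(x, t) \, dt}{\int_{\mathbb{R}} f(x, t) \, dt}.$$
Let $p(\cdot, \cdot)$ be the probability density function of the 2-dimensional random vector $\mathcal{Z} = (X_1, Y_2)$. Then the conditional distribution of $Y_2$ given $X_1 = x$ is
$$F_{Y_2|X_1}(y \mid x) = \frac{\int_{-\infty}^{y} p(x, t) \, dt}{\int_{\mathbb{R}} p(x, t) \, dt} = \frac{\int_{-\infty}^{y} p(x, t) \, dt}{\int_{\mathbb{R}} f(x, t) \, dt},$$
where the last identity is due to the fact that $\int_{\mathbb{R}} p(x, t) \, dt = \int_{\mathbb{R}} f(x, t) \, dt$, both sides being the density of $X_1$. The conditional distributions $F_{Y_1|Y_2}(\cdot \mid y)$ and $F_{X_1|Y_2}(\cdot \mid y)$ of $Y_1$ and $X_1$ given $Y_2 = y$ are obtained in the same way from the density of $\mathcal{Y}$ and from $p$, whose integrals over the first variable both equal the density of $Y_2$. Set $\tilde{\mathcal{X}} = (X_1, g(X_1, Y_2))$ and $\tilde{\mathcal{Y}} = (h(X_1, Y_2), Y_2)$. Shen and Zheng (2010) claim that $\tilde{\mathcal{X}} \sim \mathcal{X}$ and $\tilde{\mathcal{Y}} \sim \mathcal{Y}$ without giving a proof. For the reader's convenience, we give a proof here.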
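In fact, write $\tilde{\mathcal{X}} = (X_1, g(X_1, Y_2))$. Differentiating the relation $\int_{-\infty}^{g(x, y)} f(x, t) \, dt = \int_{-\infty}^{y} p(x, t) \, dt$ in $y$ gives $f(x, g(x, y)) \, \partial_y g(x, y) = p(x, y)$. Hence, for any bounded Borel function $B$ on $\mathbb{R}^2$, the change of variable $u = g(x, y)$ yields
$$E[B(\mathcal{X})] = \iint_{\mathbb{R}^2} B(x, u) f(x, u) \, dx \, du = \iint_{\mathbb{R}^2} B(x, g(x, y)) \, p(x, y) \, dx \, dy,$$
and the last integral equals $E[B(\tilde{\mathcal{X}})]$ because $(X_1, Y_2)$ has density $p$.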
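This indicates $\tilde{\mathcal{X}} \sim \mathcal{X}$. Similarly, one can prove $\tilde{\mathcal{Y}} \sim \mathcal{Y}$. Shen and Zheng (2010) claim that "if $(\mathcal{X}, \mathcal{Y})$ is the optimal coupling, then the above vector $(\tilde{\mathcal{X}}, \tilde{\mathcal{Y}})$ have the same optimal joint distribution." We are not able to prove that $(\mathcal{X}, \mathcal{Y})$ and $(\tilde{\mathcal{X}}, \tilde{\mathcal{Y}})$ have the same joint distribution. We will show, however, that if $(\mathcal{X}, \mathcal{Y})$ is an optimal coupling, then $E[|\tilde{\mathcal{X}} - \tilde{\mathcal{Y}}|^2] = E[|\mathcal{X} - \mathcal{Y}|^2]$, that is to say, $(\tilde{\mathcal{X}}, \tilde{\mathcal{Y}})$ is an optimal coupling as well.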
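In fact, given $Y_2 = y$, the conditional distributions of $X_1$ and $Y_1$ are $F_{X_1|Y_2}(\cdot \mid y)$ and $F_{Y_1|Y_2}(\cdot \mid y)$, respectively. Therefore,
$$E[(X_1 - Y_1)^2 \mid Y_2 = y] \ge \min_{(X, Y) \in B} E[(X - Y)^2],$$
where $B$ is the set of all 2-dimensional random vectors whose marginal distributions are $F_{X_1|Y_2}(\cdot \mid y)$ and $F_{Y_1|Y_2}(\cdot \mid y)$, respectively. If $(\tilde{X}, \tilde{Y}) \in B$ and $\tilde{X}$ and $\tilde{Y}$ are comonotonic, then $(\tilde{X}, \tilde{Y})$ is an optimal coupling:
$$E[(\tilde{X} - \tilde{Y})^2] = \min_{(X, Y) \in B} E[(X - Y)^2].$$
It is an easy exercise to show that $(\tilde{X}, \tilde{Y}) \in B$ and $\tilde{X}$ and $\tilde{Y}$ are comonotonic if and only if $\tilde{X}$ has distribution $F_{X_1|Y_2}(\cdot \mid y)$ and, almost surely,
$$\tilde{Y} = F^{(-1)}_{\tilde{Y}}\big(F_{\tilde{X}}(\tilde{X})\big) = F^{(-1)}_{Y_1|Y_2}\big(F_{X_1|Y_2}(\tilde{X} \mid y) \mid y\big),$$
where $F^{(-1)}_{\tilde{Y}}(\cdot)$ and $F^{(-1)}_{Y_1|Y_2}(\cdot \mid y)$ denote the left-continuous inverse functions of $F_{\tilde{Y}}(\cdot)$ and $F_{Y_1|Y_2}(\cdot \mid y)$, respectively. Therefore, writing $\tilde{Y}_1$ for the first component of $\tilde{\mathcal{Y}}$, which by construction has this form given $Y_2 = y$,
$$E[(X_1 - \tilde{Y}_1)^2 \mid Y_2 = y] = \min_{(X, Y) \in B} E[(X - Y)^2] \le E[(X_1 - Y_1)^2 \mid Y_2 = y],$$
where we used the fact that the conditional distribution of $X_1$ given $Y_2 = y$ is the same as the distribution of $\tilde{X}$, that is, $F_{X_1|Y_2}(\cdot \mid y)$. Taking expectations over $Y_2$, we obtain
$$E[(X_1 - \tilde{Y}_1)^2] \le E[(X_1 - Y_1)^2].$$
Similarly, writing $\tilde{X}_2$ for the second component of $\tilde{\mathcal{X}}$, one can prove that
$$E[(\tilde{X}_2 - Y_2)^2] \le E[(X_2 - Y_2)^2].$$
Adding them up, we get
$$E[|\tilde{\mathcal{X}} - \tilde{\mathcal{Y}}|^2] \le E[|\mathcal{X} - \mathcal{Y}|^2].$$
Thus, we have proved that if $(\mathcal{X}, \mathcal{Y})$ is an optimal coupling, so is $(\tilde{\mathcal{X}}, \tilde{\mathcal{Y}})$.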
Note that
$$E[|\tilde{\mathcal{X}} - \tilde{\mathcal{Y}}|^2] = \iint_{\mathbb{R}^2} \left[ (x - h(x, y))^2 + (g(x, y) - y)^2 \right] p(x, y) \, dx \, dy.$$
The optimal coupling problem in the Euclidean plane boils down to minimizing the right-hand side of the above identity over $H$, the set of all probability density functions $p(\cdot, \cdot)$ such that $\int_{\mathbb{R}} p(\cdot, t) \, dt = \int_{\mathbb{R}} f(\cdot, t) \, dt$ and $\int_{\mathbb{R}} p(u, \cdot) \, du$ equals the marginal density of $Y_2$ determined by $G$. Shen and Zheng (2010) propose a calculus of variations method to solve this optimization problem. However, their arguments omit many steps and are difficult to follow. The main objective of this note is to modify their method and give a detailed proof of their main results.

Solving the Problem: Calculus of Variations
Proof. This follows immediately from the mean value theorem.
Lemma 3.2 If $\beta$ is twice continuously differentiable in a neighbourhood of $(a, b) \in \mathbb{R}^2$, then
Proof. Note Applying Lemma 3.1 to each term above, we obtain which then follows The proof is complete.
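Our objective is to minimize the functional $L(p)$ over $H$, where
$$L(p) = \iint_{\mathbb{R}^2} \left[ (x - h(x, y))^2 + (g(x, y) - y)^2 \right] p(x, y) \, dx \, dy$$
and $g$ and $h$ are the functions constructed from $p$ as above. Suppose $p > 0$ minimizes the functional $L(\cdot)$ over $H$. Let $\eta$ be any bounded function with compact support on $\mathbb{R}^2$ satisfying $\int_{\mathbb{R}} \eta(x, \cdot) \, dx = \int_{\mathbb{R}} \eta(\cdot, y) \, dy = 0$.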
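The zero-marginal condition on $\eta$ is presumably what allows perturbations of the form $p + \varepsilon \eta$ to remain in $H$ for small $\varepsilon$. The sketch below checks this on a discrete analogue, where densities become matrices and integrals over one variable become row or column sums. The grid, the weights, the matrix `eta`, and the helper names are all assumptions made for illustration.

```python
from fractions import Fraction


def row_sums(m):
    """Discrete analogue of integrating out the second variable."""
    return [sum(row) for row in m]


def col_sums(m):
    """Discrete analogue of integrating out the first variable."""
    return [sum(col) for col in zip(*m)]


def perturb(p, eta, eps):
    """Entrywise perturbation p + eps * eta."""
    return [
        [p_ij + eps * e_ij for p_ij, e_ij in zip(p_row, e_row)]
        for p_row, e_row in zip(p, eta)
    ]


# Assumed strictly positive probability mass function on a 3 x 3 grid.
weights = [[2, 1, 1],
           [1, 2, 1],
           [1, 1, 2]]
p = [[Fraction(w, 12) for w in row] for row in weights]

# Assumed perturbation direction: every row sum and every column sum is zero.
eta = [[1, -1, 0],
       [-1, 1, 0],
       [0, 0, 0]]

eps = Fraction(1, 24)  # small enough that all entries stay positive
q = perturb(p, eta, eps)

print("row marginals preserved   :", row_sums(q) == row_sums(p))
print("column marginals preserved:", col_sums(q) == col_sums(p))
print("still strictly positive   :", all(v > 0 for row in q for v in row))
```

Exact rational arithmetic is used so that the marginal comparisons are equalities rather than floating-point approximations.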