Invariance properties of the Monge-Kantorovich mass transport problem

We consider the multidimensional Monge-Kantorovich transport problem in an abstract setting. Our main results state that if a cost function and marginal measures are invariant under a family of transformations, then a solution of the Kantorovich relaxation problem and a solution of its dual can be chosen so that they are invariant under the same family of transformations. This provides a new tool to study and analyze the support of optimal transport plans and consequently to scrutinize the Monge problem. Birkhoff's Ergodic theorem is an essential tool in our analysis.

(M K) The dual formulation of (M K) takes the following form [1].
In the case where X 1 = X 2 = ... = X n , we have the following result.
Theorem 1.4 Let X be a Polish space, µ 1 , ..., µ n be n Borel probability measures on X, and c : X n → [0, ∞] be Borel measurable and ⊗ n i=1 µ i -a.e. finite. Let σ : X n → X n be the permutation defined by σ(x 1 , ..., x n ) = (x 2 , ..., x n , x 1 ). We also assume that there exists a finite transport plan, and that the dual problem (DK) has a solution (ϕ 1 , ..., ϕ n ) such that ϕ j ∈ L 1 (µ j ) for all j ∈ {1, ..., n}. The following assertions hold: i. Suppose that R : X → X is a periodic map of order n. If R is µ j measure preserving for every j ∈ {1, ..., n} and c(x 1 , x 2 , ..., x n ) = c(σ(Rx 1 , Rx 2 , ..., Rx n )), then (DK) has a solution (ψ 1 , ..., ψ n )
Remark 1.5 The arguments presented in the proofs of Theorems 1.3 and 1.4 are broad enough to capture more invariance properties of the dual problem. We made an effort to state the above theorems in a general setting. In a specific case one may be able to obtain more information about the dual problem. For instance, part (ii) of Theorem 1.4 can be generalized to an infinite sequence {R k } (1 ≤ k < ∞) if solutions of the dual problem are uniformly bounded in certain spaces.
If the dual problem has a unique solution then the aforementioned theorems are quite useful for characterizing it. Even though the existence of an optimal transport plan for the Monge-Kantorovich problem holds under rather general hypotheses on the cost function and marginal measures, its uniqueness remains an issue. In general, there are very few cases in which one can obtain uniqueness for (M K). However, the dual problem seems to have a better chance of admitting a unique solution even when solutions of the primal problem (M K) fail to be unique. Here, we state the following uniqueness result for the dual problem.
Proposition 1.1 Let c : (R d ) n → R be differentiable and locally Lipschitz, and µ 1 , ..., µ n be n probability measures on R d with bounded and regular supports. If each µ i is absolutely continuous with respect to the d-dimensional Lebesgue measure, L d , and has a positive density with respect to L d , then (DK) admits a unique solution (up to the addition of constants summing to 0 to each potential).
Finally, we shall show that an optimal plan can be chosen so that it inherits the invariance properties of the cost function and the marginal measures.
Theorem 1.6 Let (X 1 , µ 1 ), ..., (X n , µ n ) be n Polish probability spaces, and c : X 1 × ... × X n → [0, ∞] be a lower semi-continuous Borel measurable function. The following statements hold.
ii. Let X 1 = ... = X n = X and µ 1 = ... = µ n = µ. Let G be a set of µ-measure preserving maps on X such that c ∘ (U, ..., U ) = c for all U ∈ G, and such that U ∘ R = R ∘ U for all U, R ∈ G. If G is a countable set then (M K) has a solution γ̄ that is invariant under G, i.e. γ̄ = (U, ..., U )#γ̄ for all U ∈ G.
The following result is an immediate consequence of the above theorem.
Corollary 1.7 Under the assumptions of part (ii) in Theorem 1.6, if the cost function c is invariant under a permutation σ : X n → X n , i.e. c ∘ σ = c, then (M K) has a solution γ̄ such that γ̄ = σ#γ̄, and γ̄ = (U, ..., U )#γ̄ for all U ∈ G. Furthermore, if X is a metric space and G equipped with the point-wise topology on X is separable then the countability assumption on G is not required.
Remark 1.8 A comparison of Theorems 1.4 and 1.6 shows that the primal problem (M K) is more flexible than its dual (DK) when it comes to capturing invariance properties of the cost function and marginal measures. The reason is that transport plans live in the set Π(µ 1 , ..., µ n ), which is pre-compact for the weak topology. This allows one to analyze the limit points of optimal plans with certain properties.
The literature on the theory and applications of optimal mass transportation is too vast to provide an exhaustive bibliography here; we refer to the books of Villani [16] and Rachev and Rüschendorf [15] and the references therein.
The next section is devoted to the proof of Theorems 1.3 and 1.4. In section 3, by recalling the measure isomorphism theorem we address the uniqueness issue for the dual problem. Section 4 is concerned with the invariance properties of (M K) and the proof of Theorem 1.6. Applications to optimal mass transportation problems arising in volume maximization and in semi-classical Hohenberg-Kohn functionals are discussed in section 5.

Invariance properties of (DK)
This section is devoted to the proof of Theorems 1.3 and 1.4. We first recall the celebrated ergodic theorem, due to George David Birkhoff (1931).
Theorem 2.1 (Birkhoff's Ergodic Theorem) Let T : X → X be a measure-preserving transformation on a measure space (X, Σ, µ) and suppose f is a µ-integrable function, i.e. f ∈ L 1 (µ). Then the averages
$$A_m(f)(x) = \frac{1}{m} \sum_{k=0}^{m-1} f(T^k x)$$
converge a.e. on X (as m → ∞) to an integrable function f̄. Furthermore, f̄ is T-invariant, i.e. f̄ ∘ T = f̄ holds almost everywhere, and if µ(X) is finite, then the normalization is the same,
$$\int_X \bar f \, d\mu = \int_X f \, d\mu.$$
The above theorem is an essential tool in Ergodic theory to study the behavior of a dynamical system with an invariant measure. We shall see that it is also surprisingly relevant and essential in the theory of optimal mass transportation.
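As a concrete illustration of Theorem 2.1 in the simplest possible setting, the following sketch computes Birkhoff averages for a periodic map on a finite set equipped with the uniform measure. The state space, the map T(x) = x + 2 mod 6 and the function f(x) = x² are hypothetical choices made for this example only. For a map of period m the average A_m(f) already coincides with the limit, so invariance and preservation of the integral can be checked exactly.

```python
from fractions import Fraction


def birkhoff_average(f, T, m):
    """Return A_m(f) = (1/m) * sum_{k=0}^{m-1} f(T^k x) on a finite state space.

    f maps each state to a value, T maps each state to its image.
    """
    avg = {}
    for x in f:
        total, y = Fraction(0), x
        for _ in range(m):
            total += f[y]
            y = T[y]
        avg[x] = total / m
    return avg


# Hypothetical example: X = {0, ..., 5} with the uniform measure.
# T is a bijection (hence measure preserving) of period 3.
X = range(6)
T = {x: (x + 2) % 6 for x in X}
f = {x: Fraction(x * x) for x in X}

# Since T has period 3, A_3(f) equals the limit of the averages.
f_bar = birkhoff_average(f, T, 3)
```

Here `f_bar` is constant on each orbit of T, and summing it over X gives the same value as summing f, which is the discrete counterpart of the normalization statement in the theorem.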
Proof of Theorem 1.3. For each j ∈ {1, ..., n}, define the average of ϕ j under the transformation R j as in Theorem 2.1,
$$A_m(\varphi_j)(x) = \frac{1}{m} \sum_{k=0}^{m-1} \varphi_j(R_j^k x).$$
It follows from Theorem 2.1 that A m (ϕ j ) converges µ j -a.e. to an integrable function Φ j satisfying Denote by Φ c j the c-conjugate of Φ j defined by Claim. For each j ∈ {1, ..., n}, Φ c j satisfies the following properties: and therefore By Birkhoff's Ergodic theorem each Φ i is integrable and by assumption there exists a finite transport plan, from which the integrability of Φ c j follows. To prove part (c), we first note that It follows from part (a) of the claim and (2) that The optimality of (ϕ 1 , ..., ϕ n ) implies that the inequality in the latter expression is in fact an equality. Therefore, ... a.e. on X j , as the same property holds for Φ j .
It now follows from the definition of Φ c j that, the latter inequality yields that It also follows from the above claim that By induction for each 1 < k < n define, and finally we define It follows that, and, Consider now the functions, It follows from (4) that ψ̄ j ≥ ψ j . It also follows from (3) that and therefore, It then follows from (5), (6) and the maximality of ψ j that ψ̄ j = ψ j for all j ∈ {1, ..., n}. Therefore, one obtains To complete the proof we will show that (ψ 1 , ψ 2 , ..., ψ n ) is a solution of (DK). The inequality relation in (6) together with the integral equality given in (2) and therefore, the optimality of (ϕ 1 , ϕ 2 , ..., ϕ n ) implies that This completes the proof of part (i) for possibly non-periodic maps R 1 , ..., R n . Now assume that R 1 , ..., R n are periodic. There exist positive integers m 1 , ..., m n such that $R_j^{m_j}$ is the identity map on X j for each j ∈ {1, ..., n}. Define Φ j to be the finite average $\Phi_j = \frac{1}{m_j} \sum_{k=0}^{m_j-1} \varphi_j \circ R_j^k$. The rest of the proof proceeds along the same lines as in the previous case.
For the case where X 1 = ... = X n = X one can hope for more invariance properties, as stated in Theorem 1.4. We now proceed with the proof of this theorem.
Proof of Theorem 1.4. Define, it follows from the above expression that We also have that Φ 1 ≤ Φ c 1 . In fact, Consider now the function, It follows that ψ̄ ≥ ψ. Set w(x) = (ψ̄(x) + (n − 1)ψ(x))/n and note that w ≥ ψ. One can easily check that Φ c ., x n ) on X n . By the maximality of ψ we have that ψ = w and therefore ψ̄ = ψ. This completes the proof of part (i).
Proof of part (ii). By assuming R to be the identity map in part (i), it follows that a solution (ϕ 1 , ..., ϕ n ) of (DK) can be chosen in such a way that ϕ j = ϕ 1 for all j ∈ {1, ..., n}. Define the average of ϕ 1 under the transformation R 1 as in Theorem 2.1, It follows from Theorem 2.1 that A m (ϕ 1 ) converges µ-a.e. to an integrable function Φ 1 satisfying Note also that We now define the average of Φ 1 under the transformation R 2 , By making use of Theorem 2.1 again, it follows that A m (Φ 1 ) converges µ-a.e. to an integrable function Φ 2 satisfying e. on X. It also follows from (7) and (8) that By repeating the above argument we obtain, by the l-th step, a function Φ l enjoying the following properties: , it follows from the above expression that We also have that Φ c l ≥ Φ l , and Φ c l (x) = Φ l (x) for µ-a.e. x ∈ X. It follows that Φ c Similar to part (i), one has Note that Φ c l ≥ ψ ≥ Φ l , and Φ l and Φ c l are R k invariant µ-a.e. on X. It follows that ψ ∘ R k = ψ, µ-a.e. on X for all k ∈ {1, ..., l}. This completes the proof for possibly non-periodic maps R 1 , ..., R l . An argument similar to the one in the proof of Theorem 1.3 shows that if R 1 , ..., R l are periodic then ψ ∘ R k = ψ on all of X.
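Both proofs above pass from an invariant averaged potential to its c-conjugate. The sketch below checks, on a small finite example with n = 2, that c-conjugation preserves invariance when the cost is invariant under the same map. It assumes the usual two-marginal form Φ^c(y) = min_x { c(x, y) − Φ(x) } of the c-conjugate; the cyclic space, the squared circular distance and the parity potential are hypothetical choices for illustration and are not taken from the paper.

```python
N = 6  # hypothetical finite space X = Z_6
points = list(range(N))


def circ_dist_sq(x, y):
    """Squared circular distance on Z_N; invariant under simultaneous shifts."""
    d = abs(x - y)
    return min(d, N - d) ** 2


def c_conjugate(phi, cost):
    """Assumed two-marginal c-conjugate: phi_c(y) = min_x { cost(x, y) - phi(x) }."""
    return {y: min(cost(x, y) - phi[x] for x in points) for y in points}


# R is the shift by 2, so cost(R x, R y) = cost(x, y).
R = {x: (x + 2) % N for x in points}

# An R-invariant potential: it depends only on the parity of x.
phi = {x: x % 2 for x in points}

phi_c = c_conjugate(phi, circ_dist_sq)
```

In this example `phi_c` again depends only on parity, so it is R-invariant, and the pair (`phi`, `phi_c`) satisfies the dual constraint by construction.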

Uniqueness
In this section we address the uniqueness issues for both (M K) and (DK). Under rather general conditions on the cost and marginals, the existence of solutions to both (M K) and (DK) is guaranteed. For (M K) with n = 2, the well-known twist condition, i.e., that the map $x_2 \mapsto \nabla_{x_1} c(x_1, x_2)$ is injective for fixed x 1 , ensures the uniqueness and Monge structure of the optimal plan [6,12]. For larger n, the uniqueness question is still largely open. Examples of special cost functions for which the optimal measure has this structure are known [3,9,10,11,12,13], as well as several examples for which uniqueness and Monge solutions fail [4,13,7].
To the best of our knowledge, there is no result on the uniqueness of (DK). We now proceed with the proof of Proposition 1.1. To do this, we first recall some preliminary notation and results from the theory of measure isomorphisms. Let X be a topological space and denote by B(X) the set of all Borel subsets of X. The following is the standard measure isomorphism theorem. We also have the following definition.
Proof of Proposition 1.1 Since µ i has a regular support there exists a connected open set X i with full measure such that µ i (∂X i ) = 0. Let X̄ = X 1 × ... × X n , and assume that γ is a solution of (M K). Since each µ i is non-atomic we have that γ is a non-atomic measure on B(X̄). It follows from Theorem 3.2 that (X̄, B(X̄), γ) is isomorphic to (X 1 , B(X 1 ), µ 1 ). Thus, there exists an isomorphism T = (T 1 , ..., T n ) from (X 1 , B(X 1 ), µ 1 ) onto (X̄, B(X̄), γ). Note that each T i : X 1 → X i is onto and pushes µ 1 forward to µ i , i.e.
Let (ϕ 1 , ..., ϕ n ) be a solution of (DK) satisfying Since c is locally Lipschitz and X 1 , ..., X n are bounded, each ϕ i is locally Lipschitz (Lemma C.1 [6]). Suppose that A i ⊂ R d is the set of non-differentiable points of ϕ i on X i . It follows from Rademacher's theorem that L d (A i ) = 0, and since µ i is absolutely continuous with respect to L d , one has µ i (A i ) = 0. It now follows from (9) that On the other hand, it follows from (9) and the duality between (M K) and (DK) that If (DK) has another solution (ψ 1 , ..., ψ n ) satisfying (10), the above argument shows that for some measurable set B with µ 1 (B) = 0. Therefore, We show that ∇ϕ i (z) = ∇ψ i (z) for L d -a.e. z ∈ X i . Let λ i be the measure L d restricted to X i and λ * i its corresponding outer measure. By assumption dµ i /dλ i is a positive function. Set Λ i = T i (X 1 \ (A ∪ B)) and note that Λ i is not in general Borel measurable. However, there exists an open set O i containing Λ i such that Thus, since dµ i /dλ i is a positive function, we obtain λ i (O i ) = λ i (X i ) and therefore ∇ϕ i (z) = ∇ψ i (z) for L d -a.e. z ∈ X i .
Lemma 4.1 Let X 1 , ..., X n be Polish spaces and c : X 1 × ... × X n → [0, ∞] be lower semi-continuous. Assume that γ k is a sequence in Π(µ 1 , ..., µ n ) converging weakly to a transport plan γ. Then
$$\int c \, d\gamma \le \liminf_{k \to \infty} \int c \, d\gamma_k.$$
The above result does not imply that the optimal cost is finite. In fact, all transport plans may lead to an infinite cost, i.e. ∫ c dγ = ∞ for all γ ∈ Π(µ 1 , ..., µ n ). We are now ready to prove Theorem 1.6.
For each nonnegative integer k, set R k = (R k 1 , ..., R k n ) with the convention that R 0 i is the identity map on X i for each i ∈ {1, ..., n}. By Lemma 4.1 and the fact that there exists a finite transport plan, the existence of an optimal plan γ with a finite cost is ensured. Define Note first that γ k ∈ Π(µ 1 , ..., µ n ). Indeed, if f is a bounded continuous function on X j then By a similar argument and using the fact that the cost function c is invariant under each R k we obtain Since the sequence {γ k } k∈N is tight, up to a subsequence, there exists γ̄ ∈ Π(µ 1 , ..., µ n ) such that γ k converges weakly to γ̄. It follows from Lemma 4.1 that ∫ X̄ c dγ̄ ≤ lim inf k→∞ ∫ X̄ c dγ k , from which together with the optimality of γ we obtain ∫ X̄ c dγ̄ = ∫ X̄ c dγ.
To conclude the proof we shall show that R#γ̄ = γ̄. Take a bounded continuous function f on X̄. We show Define the average of f as in Birkhoff's theorem by and note that It follows from Birkhoff's theorem that A k+1 (f ) converges γ-a.e. to an integrable function f̄ and f̄ ∘ R = f̄ γ-a.e. on X̄. Since f is bounded, so is A k+1 (f ), and therefore by the dominated convergence theorem we have lim k→∞ ∫ X̄ A k+1 (f )(x) dγ = ∫ X̄ f̄(x) dγ. It implies that Therefore, and since f̄ = f̄ ∘ R γ-a.e., it follows that This completes the proof of the first part.
Proof of part (ii): For each nonnegative integer k and each m ∈ N, set U m = (R m , ..., R m ) and U k m = (R k m , ..., R k m ). It follows from part (i) that there exists an optimal plan π 1 such that U 1 #π 1 = π 1 . For each k ∈ N ∪ {0}, define It follows that U 1 #γ k = γ k for all k ∈ N ∪ {0}. In fact, for a bounded continuous function f on X̄, we have This implies that U 1 #γ k = γ k . Since {γ k } is tight, up to a subsequence, γ k converges weakly to some π 2 . It then follows that U 1 #π 2 = π 2 , U 2 #π 2 = π 2 , and π 2 is a solution of (M K). By repeating this argument, we obtain a sequence of optimal transport plans {π m } m∈N such that U k #π m = π m for all k ∈ {1, ..., m}. The tightness of {π m } and Lemma 4.1 ensure the existence of an optimal plan π such that, up to a subsequence, π m converges weakly to π. It also follows that U m #π = π for every m ∈ N. This completes the proof of the theorem when there exists a finite transport plan. If there is no finite transport plan then the plans γ̄ in part (i) and π in part (ii) possess the desired symmetry and ∫ c dγ̄ = ∫ c dπ = ∞.
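The averaging construction used in this proof can be made concrete on a finite space. The following minimal sketch, under stated assumptions, averages a finitely supported two-marginal plan over the pushforwards (R^i, R^i)#γ for a periodic map R and checks that the result has the same marginals and total cost and is invariant. The cyclic space Z_4, the shift R, the cost and the starting plan `gamma` are hypothetical; `gamma` is not claimed to be optimal, since the point is only that averaging preserves marginals and cost when both are invariant.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(plan, R):
    """Push a finitely supported plan forward under (R, ..., R)."""
    out = defaultdict(Fraction)
    for point, mass in plan.items():
        out[tuple(R[x] for x in point)] += mass
    return dict(out)


def orbit_average(plan, R, period):
    """Return (1/period) * sum_{i=0}^{period-1} (R^i, ..., R^i)#plan."""
    avg = defaultdict(Fraction)
    current = dict(plan)
    for _ in range(period):
        for point, mass in current.items():
            avg[point] += mass / period
        current = pushforward(current, R)
    return dict(avg)


def marginal(plan, i):
    """Marginal of the plan on its i-th coordinate."""
    out = defaultdict(Fraction)
    for point, mass in plan.items():
        out[point[i]] += mass
    return dict(out)


def total_cost(plan, cost):
    return sum(mass * cost(*point) for point, mass in plan.items())


def circular_cost(x, y, N=4):
    """Squared circular distance on Z_4; invariant under (R, R)."""
    d = abs(x - y)
    return min(d, N - d) ** 2


# Hypothetical data: uniform marginals on Z_4 and the shift R(x) = x + 1 mod 4.
R = {x: (x + 1) % 4 for x in range(4)}
gamma = {(0, 0): Fraction(1, 4), (1, 2): Fraction(1, 4),
         (2, 1): Fraction(1, 4), (3, 3): Fraction(1, 4)}
gamma_bar = orbit_average(gamma, R, period=4)
```

Because R is periodic here, the finite average is already invariant and no weak limit is needed; in the general case of the theorem the limit along a subsequence plays this role.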
Let X be a metric space with a metric d. Assume that G is a set of measure preserving maps on X. One can equip the set G with the point-wise topology induced by the metric d. Indeed, for a sequence {U k } k∈N ⊂ G, we say that U k converges to some U ∈ G if and only if for every x ∈ X,
$$\lim_{k \to \infty} d(U_k x, U x) = 0.$$
The set G equipped with the point-wise topology is called separable if it has a countable dense subset.
Proof of Corollary 1.7. By assumption the cost function c is invariant under a permutation σ : X n → X n . By part (ii) of Theorem 1.6, (M K) has a solution γ such that γ = (U, ..., U )#γ for all U ∈ G. Define
$$\bar\gamma = \frac{1}{n} \sum_{i=1}^{n} \sigma^i \# \gamma.$$
If (X, d) is a metric space and G equipped with the point-wise topology on X is separable then G has a countable dense subset {U k } k∈N . Thus, by the latter argument, (M K) has a solution γ̄ that is invariant under σ and satisfies γ̄ = (U k , ..., U k )#γ̄ for all k ∈ N. We shall show that γ̄ = (U, ..., U )#γ̄ for every U ∈ G. Let f be a bounded continuous function on X n , and let U ∈ G. There exists a subsequence {U km } m∈N approaching U in the point-wise topology. We also have By the dominated convergence theorem we obtain
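The symmetrization step in the proof of Corollary 1.7 can be sketched on a finite example. The code below assumes that σ is the cyclic shift σ(x_1, ..., x_n) = (x_2, ..., x_n, x_1) from Theorem 1.4, so that σ^n is the identity, and forms the average of σ^i#γ for i = 1, ..., n. The plan `gamma` and the symmetric pairwise cost are hypothetical illustrations.

```python
from collections import defaultdict
from fractions import Fraction


def sigma(point):
    """Cyclic shift (x_1, ..., x_n) -> (x_2, ..., x_n, x_1)."""
    return point[1:] + point[:1]


def push_sigma(plan):
    """Pushforward of a finitely supported plan under sigma."""
    out = defaultdict(Fraction)
    for point, mass in plan.items():
        out[sigma(point)] += mass
    return dict(out)


def symmetrize(plan, n):
    """Return (1/n) * sum_{i=1}^{n} sigma^i # plan."""
    sym = defaultdict(Fraction)
    current = dict(plan)
    for _ in range(n):
        current = push_sigma(current)  # sigma^i # plan for i = 1, ..., n
        for point, mass in current.items():
            sym[point] += mass / n
    return dict(sym)


def pair_cost(point):
    """A hypothetical permutation-invariant cost: sum of |x_i - x_j| over i < j."""
    return sum(abs(a - b) for i, a in enumerate(point) for b in point[i + 1:])


def total_cost(plan):
    return sum(mass * pair_cost(point) for point, mass in plan.items())


# Hypothetical plan on {0, 1}^3 with all three marginals uniform on {0, 1}.
gamma = {(0, 1, 1): Fraction(1, 2), (1, 0, 0): Fraction(1, 2)}
gamma_bar = symmetrize(gamma, 3)
```

Since the cost is invariant under σ, each σ^i#γ has the same total cost as γ, so the average `gamma_bar` does too, while also being σ-invariant.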

Applications
In this section we provide two applications of our results. The first one is concerned with a volume maximizing problem. In the second example the cost function under study is a repulsive function.
(M K d ) We focus on (M K d ) in the case where the measures (µ i ) are radially symmetric. This means that for all U in the orthogonal group of R n , U #µ i = µ i for all i ∈ {1, ..., n}.
Denote by |.| the Euclidean norm in R n . The problem (M K d ) is dual to: where K d is the set of n-tuples of lower semi-continuous functions (ϕ 1 , ..., ϕ n ) from R n to R ∪ {+∞} such that The above problem is studied in [4]. In the radial case the authors provided an explicit solution for (M K d ), from which an extremality condition was established and solutions of (DK d ) were studied. We shall use Theorem 1.3 to find an explicit solution for (DK d ) and derive an extremality condition for (M K d ). Here is our result for problem (M K d ) and its dual (DK d ).
Theorem 5.1 Let µ 1 , ..., µ n be radially symmetric, compactly supported measures on R n with the same support. If each µ i is absolutely continuous with respect to the n-dimensional Lebesgue measure, L n , and has a positive density with respect to L n , then the following statements hold.
Proof. Let B = supp(µ 1 ). It follows from Proposition 1.1 that (DK d ) has a unique solution (ϕ 1 , ..., ϕ n ) with The above expression shows that each ϕ i is a finite convex function. Suppose U is a rotation matrix. Since each µ i is radial one has U #µ i = µ i . Note also that det(x 1 , x 2 , ..., x n ) = det(U x 1 , ..., U x n ). It follows from part (i) of Theorem 1.3 that ϕ i (U x) = ϕ i (x) for µ i -a.e. x ∈ B. Since ϕ i is continuous and µ i is absolutely continuous with respect to the n-dimensional Lebesgue measure, we have that ϕ i (U x) = ϕ i (x) for all x ∈ B. Thus, ϕ i is invariant under all rotation matrices. We may now observe that, since each ϕ i is determined by its behavior on the half-line {βe 1 ; β ≥ 0}, where e 1 = (1, 0, ..., 0) ∈ R n , one can write ϕ i (x) = h i (|x|), where the function h i : R + → R is defined by h i (β) = ϕ i (βe 1 ).
We now prove part (ii). Since γ̄ is a solution of (M K d ), it follows from the duality relation between (M K d ) and (DK d ) together with part (i) of the current theorem that |x i | n n dγ̄ and therefore, On the other hand, by the Young inequality, Therefore, on the support of γ̄, the Young inequality becomes an equality, from which part (ii) follows.
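The displayed inequalities of this proof are not reproduced above, so the following numerical check rests on an assumption: that the Young-type bound in question is the chain det(x_1, ..., x_n) ≤ |x_1| ⋯ |x_n| ≤ (1/n) Σ |x_i|^n, that is, Hadamard's inequality followed by the arithmetic-geometric mean inequality. The sketch tests this chain on random vectors in R^3 and verifies that equality holds for mutually orthogonal vectors of equal norm. The sample size, the seed and the equality example are hypothetical choices.

```python
import math
import random


def det(rows):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(rows)
    if n == 1:
        return rows[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * rows[0][j] * det(minor)
    return total


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def young_chain(vectors):
    """Return (det, product of norms, (1/n) * sum of n-th powers of norms)."""
    n = len(vectors)
    norms = [norm(v) for v in vectors]
    return det(vectors), math.prod(norms), sum(r ** n for r in norms) / n


random.seed(0)
samples = [[[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
           for _ in range(200)]
chains = [young_chain(vs) for vs in samples]

# Assumed equality case: positively oriented orthogonal vectors of equal length.
equality_case = young_chain([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]])
```

Under this reading, equality forces the vectors in the support of an optimal plan to be pairwise orthogonal with equal norms, which is the kind of extremality condition the proof refers to.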

The Coulomb cost.
The Coulomb cost is the repulsive part of the Hohenberg-Kohn functional and is given by
$$c(x_1, \dots, x_n) = \sum_{i \neq j}^{n} \frac{1}{|x_i - x_j|}.$$
It represents the Coulombic interaction energy between the electrons. Therefore, the marginals, which represent the single particle densities of the electrons, are all the same, embodying the indistinguishability of the electrons. To precisely formulate this problem, fix a probability measure ρ on R d . Let Π(ρ) be the set of all probability measures on R dn whose marginals are all ρ. We shall consider the following problem,
$$\inf_{\gamma \in \Pi(\rho)} \int_{\mathbb{R}^{d \times n}} \sum_{i \neq j}^{n} \frac{1}{|x_i - x_j|} \, d\gamma. \qquad (MK_c)$$
We focus on (M K c ) in the case where the measure ρ is radially symmetric. The problem (M K c ) is dual to:
$$\sup_{(\varphi_1, \dots, \varphi_n) \in K_c} \sum_{i=1}^{n} \int_{\mathbb{R}^d} \varphi_i \, d\rho, \qquad (DK_c)$$
where K c is the set of n-tuples of lower semi-continuous functions (ϕ 1 , ..., ϕ n ) from R d to R ∪ {−∞} such that
$$\sum_{i=1}^{n} \varphi_i(x_i) \le \sum_{i \neq j}^{n} \frac{1}{|x_i - x_j|}.$$
The optimal mass transport problem with the Coulomb cost has recently been studied by many authors. We refer the interested reader to [2,5,14] and references therein. Here is our result for the Coulomb cost.
ii. If ρ is absolutely continuous with respect to the d-dimensional Lebesgue measure and has a positive density function, then (DK c ) has a solution (ϕ 1 , ..., ϕ n ) where ϕ 1 = ... = ϕ n and ϕ 1 is radially symmetric.
We refer the interested reader to [8] where the existence of optimal transport maps is studied and a detailed proof of the above theorem is provided. Here, we shall sketch the proof.
Proof of Theorem 5.2. Note first that for all U, R ∈ SO(d) one has U ∘ R = R ∘ U as they are rotation matrices. Note also that SO(d) has a countable dense subset {R 1 , R 2 , ...}. It follows by Corollary 1.7 that there exists an optimal plan γ such that (U, ..., U )#γ = γ for each U ∈ SO(d).
We shall now prove part (ii). It follows from Theorem 1.4 that (DK c ) has a solution (ϕ 1 , ..., ϕ n ) such that ϕ 1 = ... = ϕ n and
$$\varphi_1(x_j) = \inf \Big\{ c(x_1, x_2, \dots, x_n) - \sum_{i=1,\, i \neq j}^{n} \varphi_1(x_i) \Big\}. \qquad (12)$$
It follows from Theorem (4) in [2] that any solution of (DK c ) satisfying (12) is bounded and almost everywhere differentiable. The same argument as in the proof of Proposition 1.1 shows that solutions of (DK c ) satisfying (12) are unique (up to the addition of constants summing to 0 to each potential). It then follows from part (ii) of Theorem 1.4 that ϕ 1 is invariant under the group SO(d). This proves that ϕ 1 has to be a radial function.
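The two invariance properties of the Coulomb cost that drive this argument, invariance under permutations of the particles and under a common rotation of all positions, are easy to check numerically. The sketch below does so in dimension d = 2 with n = 3 particles, assuming the standard form of the cost, the sum of 1/|x_i − x_j| over ordered pairs i ≠ j. The particle positions and the rotation angle are hypothetical.

```python
import math
from itertools import permutations


def coulomb_cost(points):
    """Sum of 1 / |x_i - x_j| over all ordered pairs with i != j."""
    return sum(1.0 / math.dist(p, q)
               for i, p in enumerate(points)
               for j, q in enumerate(points) if i != j)


def rotate(p, theta):
    """Rotate a point of R^2 about the origin by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


# Hypothetical configuration of three distinct particles in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
base = coulomb_cost(points)

# Cost after relabelling the particles in every possible order.
permuted = [coulomb_cost(list(perm)) for perm in permutations(points)]

# Cost after applying the same rotation U to every particle.
rotated = coulomb_cost([rotate(p, 0.7) for p in points])
```

Up to floating-point rounding, every entry of `permuted` and the value `rotated` agree with `base`, which is the finite-dimensional counterpart of the relations c ∘ σ = c and c(Ux_1, ..., Ux_n) = c(x_1, ..., x_n).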