On the twist condition and c-monotone transport plans.

We consider optimal transport problems in $\mathbb{R}^d$ as well as on manifolds for cost functions $c$ which satisfy a nonsmooth version of the classical Left Twist condition (i


Introduction
Consider the optimal transport problem
(P) $\inf \left\{ \int_{M \times N} c(x, y)\, d\gamma(x, y) \;:\; \gamma \in \Pi(\mu, \nu) \right\}$
where $M$ is a second countable $C^1$ manifold, $N$ is a Polish space, $c : M \times N \to \mathbb{R}$ is a cost function, $\mu$ and $\nu$ are Borel probabilities over $M$ and $N$ respectively, and $\Pi(\mu, \nu)$ denotes the set of transport plans from $\mu$ to $\nu$, i.e. the set of Borel probabilities $\gamma$ on $M \times N$ with first marginal $(\pi_x)_\sharp \gamma$ equal to $\mu$ and second marginal $(\pi_y)_\sharp \gamma$ equal to $\nu$. In this work, we are interested in identifying conditions on the cost function $c$ and the measure $\mu$ which ensure the existence of an optimal transport plan $\gamma$ for (P) induced by a transport map, that is, for which there exists a $\mu$-measurable function $T : M \to N$ such that $\gamma = (\mathrm{id} \times T)_\sharp \mu$.
This problem has been extensively studied in the past twenty years, and one may distinguish two main types of sufficient conditions for the existence of an optimal transport plan for (P) induced by a map, depending on whether or not the cost function $c$ satisfies the following regularity assumption:
(LT) for all $x \in M$, $y \mapsto \frac{\partial c}{\partial x}(x, y)$ is injective on its domain,
where (LT) stands for "left twist" (we refer to Fathi and Figalli [1] for comments on this terminology). For example, in the special case where $M = N = \mathbb{R}^d$ and $c(x, y) = \|y - x\|^p$, with $p > 0$ and $\|\cdot\|$ denoting the Euclidean norm, the assumption (LT) holds whenever $p \neq 1$ but fails for $p = 1$. Let us briefly describe the corresponding existence results for those particular instances of problem (P).
First, in the regular case where (LT) holds, i.e. for $p \neq 1$, the unique solution of (P) is induced by a map under the general hypothesis that $\mu$ does not give mass to sets with σ-finite $\mathcal{H}^{d-1}$ measure, see Brenier [2] and Rüschendorf and Rachev [3] for the case $p = 2$ and Caffarelli [4], Gangbo and McCann [5,6] and Rüschendorf [7] for the general case $p \neq 1$.
On the other hand, in the case $p = 1$, which corresponds to the relaxation proposed by Kantorovitch [8,9] for the optimal mass transport problem originally studied by Monge [10], the existence issue turns out to be quite involved, and one only obtains that some particular solution of (P) is induced by a map under the hypothesis that $\mu$ is absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^d$, see Evans and Gangbo [11] with the further assumption that $\mu$ and $\nu$ have Lipschitz densities with respect to $\mathcal{L}^d$, and Ambrosio and Pratelli [12], Caffarelli, Feldman and McCann [13], Caravenna [14], Champion and De Pascale [15,16], and Trudinger and Wang [17] for the general case on $\mu$. In that case the problem (P) may have solutions which are not induced by a transport map even when both measures $\mu$ and $\nu$ are absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^d$, or may have no solution induced by a transport map in the case where $\mu$ is not absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^d$ (see Ambrosio and Pratelli [12]).
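The dichotomy between $p \neq 1$ and $p = 1$ can be checked numerically on two targets lying on the same ray issued from $x$. The following Python sketch is only an illustration under our own conventions (the helper names `grad_x_cost` and `same_gradient` are hypothetical and do not come from the cited works): it evaluates $\nabla_x \|y - x\|^p = -p\,\|y - x\|^{p-2}(y - x)$ away from the diagonal and compares the values obtained for two distinct targets.

```python
import math

def grad_x_cost(x, y, p):
    """Gradient in x of c(x, y) = |y - x|^p (Euclidean norm), for x != y."""
    diff = [yi - xi for xi, yi in zip(x, y)]
    r = math.sqrt(sum(d * d for d in diff))
    if r == 0.0:
        raise ValueError("the formula is only used away from the diagonal x == y")
    # d/dx |y - x|^p = -p |y - x|^(p-2) (y - x)
    return [-p * r ** (p - 2) * d for d in diff]

def same_gradient(p, x, y1, y2, tol=1e-12):
    """True when the two targets y1, y2 give the same gradient at x."""
    g1 = grad_x_cost(x, y1, p)
    g2 = grad_x_cost(x, y2, p)
    return all(abs(a - b) <= tol for a, b in zip(g1, g2))

x = (0.0, 0.0)
y1 = (1.0, 0.0)
y2 = (2.0, 0.0)  # same direction from x as y1, different distance

print(same_gradient(1.0, x, y1, y2))  # True: y -> grad_x c(x, y) is not injective
print(same_gradient(2.0, x, y1, y2))  # False
print(same_gradient(0.5, x, y1, y2))  # False
```

For $p = 1$ the gradient only retains the direction of $y - x$, so the two targets cannot be distinguished, which is the failure of (LT); for $p = 2$ and $p = 1/2$ the norm of the gradient also encodes the distance.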
In the present work we shall concentrate on the regular case where the assumption (LT) holds. In this framework and under the hypothesis that the admissible plan $\gamma$ is concentrated on a $c$-cyclically monotone set (which amounts to $\gamma$ being optimal for (P) when the value is finite, see §4 below), the fact that the plan $\gamma$ is induced by a transport map has been proven by many authors under various regularity hypotheses. When $M = \mathbb{R}^d$, this was in particular obtained by Caffarelli [4], Gangbo and McCann [5,6] and Rüschendorf [7] in the case $c(x, y) = h(y - x)$ (see also Villani [18] for a presentation), while the general form for the cost function $c$ was investigated in Carlier [19] and Levin [20]. For the case where $(M, d)$ is a Riemannian manifold, this was studied for the square distance cost $c = d^2$ by Cordero-Erausquin [21] (in the special case of the flat torus) and McCann [22], and the general case where $c$ is induced by a Lagrangian was studied by Bernard and Buffoni [23], Fathi and Figalli [1] and Figalli [24,25].
While all these works heavily use the fact that the plan $\gamma$ is concentrated on a $c$-cyclically monotone set, our paper is innovative in that it applies to any plan $\gamma$ which is supported on a $c$-monotone set (see section 4 below): to obtain such a result we have to strengthen the hypothesis (LT) on the cost function $c$ (see the Nonsmooth Left Twist condition (NLT) in section 2 below), but we prove that the most common cost functions that satisfy (LT) also satisfy this stronger condition. Surprisingly enough, our result holds true with the same requirement that the initial measure $\mu$ gives zero mass to sets with σ-finite $\mathcal{H}^{d-1}$ measure (in fact we slightly weaken this hypothesis). Our approach follows the method of proof designed by Champion, De Pascale and Juutinen [26] for the study of the $\infty$-Wasserstein distance problem and generalized by Champion and De Pascale [16,27] to deal with the classical Monge transportation problem. The main feature of our proof is that it relies on local arguments and makes no use of the regularity theory for $c$-convex functions; see the discussion following the proof of Theorem 4.3 below.
The paper is organized as follows: in sections §2 and §3 we present the assumptions we shall make on the cost function $c$ and the initial measure $\mu$ respectively. In §3 we also deduce a regularity property for any admissible plan $\gamma \in \Pi(\mu, \nu)$. Section 4 is devoted to our main result and some comments, and we prove in section 5 that this result applies to the most usual costs induced by Lagrangians.

Hypotheses on the cost function
We hereafter describe the common regularity assumptions that we shall impose on the cost function $c$, and in particular we introduce our refined version of the left twist condition (LT), which is the basis of our main result Theorem 4.3. Since our method of proof is local, we first state those assumptions in $U \times N$, where $U$ is an open subset of $\mathbb{R}^d$ and $N$ is a Polish space, and we shall then briefly discuss the manifold setting, the specific case of costs induced by a Tonelli Lagrangian being treated in section §5 below. A more detailed presentation of the following notions (except for the nonsmooth left twist condition (NLT) below) and related properties in the context of optimal transportation may be found for example in chapter 10 of Villani [28] or the appendix of Fathi and Figalli [1].
Definition 2.1. The function $c : U \times N \to \mathbb{R}$ is superdifferentiable in $x$ at $(x_0, y_0)$ if there exist $p_0 \in \mathbb{R}^d$ and a function $\eta_0 : \mathbb{R}_+ \to \mathbb{R}$ such that, for $x \in U$ near $x_0$,
(1) $c(x, y_0) \le c(x_0, y_0) + p_0 \cdot (x - x_0) + \|x - x_0\|\, \eta_0(\|x - x_0\|)$
with $\eta_0(r) \to 0$ as $r \to 0$, where $\cdot$ denotes the canonical scalar product on $\mathbb{R}^d$. The convex set of all such vectors $p_0$ is the superdifferential in $x$ of $c$ at $(x_0, y_0)$ and is denoted by $\nabla^+_x c(x_0, y_0)$. The corresponding notions of subdifferentiability and subdifferential $\nabla^-_x c(x_0, y_0)$ are obtained by reversing the inequality in (1).
Remark 2.2. Note that a function $c$ is differentiable with respect to $x$ at $(x_0, y_0)$ if and only if it is both sub- and superdifferentiable with $\nabla^+_x c(x_0, y_0) = \nabla^-_x c(x_0, y_0) = \{\nabla_x c(x_0, y_0)\}$ (we refer to Proposition 10.7 of Villani [28]).
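To make the notion of superdifferential concrete, here is a minimal one-dimensional Python sketch. It is an illustration under simplifying assumptions and not part of the original argument: the helper `is_supergradient` is hypothetical, and it tests the inequality $f(x) \le f(x_0) + p\,(x - x_0)$ on a grid, that is, without the remainder term $\eta_0$. This stronger inequality is sufficient for $p$ to be a supergradient, and it characterizes supergradients when $f$ is concave.

```python
def is_supergradient(f, x0, p, radius=1.0, samples=2001, tol=1e-12):
    """Check f(x) <= f(x0) + p * (x - x0) on a uniform grid around x0.

    Assumption: the remainder term of the definition is dropped, so a True
    answer is a sufficient (not a necessary) condition in general.
    """
    for k in range(samples):
        x = x0 - radius + 2.0 * radius * k / (samples - 1)
        if f(x) > f(x0) + p * (x - x0) + tol:
            return False
    return True

def concave_kink(x):
    return -abs(x)  # superdifferentiable at 0, with superdifferential [-1, 1]

def convex_kink(x):
    return abs(x)   # subdifferentiable but not superdifferentiable at 0

print(is_supergradient(concave_kink, 0.0, 0.5))   # True
print(is_supergradient(concave_kink, 0.0, 1.5))   # False
print(is_supergradient(convex_kink, 0.0, 0.0))    # False
```

The function $x \mapsto -|x|$ is superdifferentiable at $0$ without being differentiable there, and its superdifferential is not reduced to a singleton, in accordance with Remark 2.2.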
We shall need the following usual refined notion of local regularity.
Definition 2.3. The function $c : U \times V \to \mathbb{R}$ is locally superdifferentiable in $x$ at $(x_0, y_0)$ locally in $y$ if there exists a neighborhood $W$ of $(x_0, y_0)$ such that $c$ is superdifferentiable in $x$ at any $(x, y) \in W$ and there exists a continuous function $\eta : \mathbb{R}_+ \to \mathbb{R}$, with $\eta(r) \to 0$ as $r \to 0$, such that inequality (1) holds at every $(x, y) \in W$ with $\eta$ in place of $\eta_0$. The corresponding notion of local subdifferentiability is defined likewise.
Notice that in the above local notion of superdifferentiability it is assumed that the function $\eta$ involved in (1) does not depend on $(x, y)$ locally around $(x_0, y_0)$.
Remark 2.4. It is to be noted that when $c : U \times V \to \mathbb{R}$ is locally superdifferentiable (resp. subdifferentiable) in $x$ at $(x_0, y_0)$ locally in $y$, then it is differentiable in $x$ at $(x_0, y_0)$ whenever $\nabla^+_x c(x_0, y_0)$ is reduced to a singleton (see Theorem 10.8(iii) in Villani [28]).
Definition 2.5. A set-valued map $F : U \times N \rightrightarrows \mathbb{R}^d$ is locally bounded if for any $(x, y) \in U \times N$ there exist a neighborhood $W$ of $(x, y)$ and a compact set $K$ such that $F(x', y') \subset K$ for all $(x', y') \in W$.
The following result is a direct consequence of the above definitions, and states that under mild assumptions the superdifferential (resp. subdifferential) set-valued map has closed values and closed graph.
$\nabla^+_x c(x_n, y_n)$ for any $n$, it holds that $(p_n)_{n \ge 0}$ is bounded and any of its cluster points $p_\infty$ belongs to $\nabla^+_x c(x_\infty, y_\infty)$. A similar statement holds for the subdifferential.
As already mentioned, the proof of our main result is based on a local argument, so that for our purpose all the previous notions may be generalized through local charts to a cost $c : U \times N \to \mathbb{R}$ where $U$ is an open set of a $C^1$ manifold $M$. Indeed, if $(V, \varphi)$ and $(W, \psi)$ are two local charts around $x_0 \in U$, then $c \circ (\varphi^{-1} \times \mathrm{id})$ is locally superdifferentiable with respect to the first variable $x$ locally in $y$ at $(\varphi(x_0), y_0)$ if and only if the same holds for $c \circ (\psi^{-1} \times \mathrm{id})$ at $(\psi(x_0), y_0)$, and in this case the two superdifferentials are related, for any $(x, y) \in (V \cap W) \times N$, by the change of chart formula (2): this directly follows from the fact that $\psi \circ \varphi^{-1}$ is a $C^1$ diffeomorphism on $\varphi(V \cap W)$. Note that equation (2) uniquely characterizes a subset of the cotangent space $T^*_{x_0} M$: such a subset does not depend on the local coordinate and is, by definition, the superdifferential of $c$ with respect to $x$ at $(x_0, y_0)$. The more specific case where $M$ is endowed with a Riemannian structure is treated in section §5 below.
We finally introduce the refined twist condition.
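Definition 2.7. Assume that $c : U \times N \to \mathbb{R}$ is superdifferentiable in $x$ on $U \times N$. We say that $c$ satisfies the Nonsmooth Left Twist condition whenever
(NLT) for all $x \in U$, the set-valued map $y \mapsto \nabla^+_x c(x, y)$ is injective,
that is, whenever $\nabla^+_x c(x, y) = \nabla^+_x c(x, z)$ implies $y = z$.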
Remark 2.8. We use the same terminology of "Nonsmooth Left Twist condition" when $c$ is subdifferentiable, with $\nabla^-_x c(x, y)$ replacing $\nabla^+_x c(x, y)$: it indeed follows from Remark 2.2 that these two conditions coincide with (LT) when $c$ is both sub- and superdifferentiable.
Let us comment briefly on the above definition. First, the conditions (LT) and (NLT) obviously coincide in the regular case when $c$ is differentiable in $x$ on its domain. This is for example the case for the usual $l^p$ costs $c(x, y) = \|y - x\|^p$, which do satisfy both conditions on $\mathbb{R}^d \times \mathbb{R}^d$ for $p$ positive and $p \neq 1$. In fact the main reason for introducing this nonsmooth version of the left twist condition is to address the case of non-regular costs $c$ which naturally appear in the case where $M$ is a manifold. In particular, if one considers the case of a smooth Riemannian manifold $(M, g)$ with corresponding Riemannian distance $d$, a natural generalization of the $l^p$ costs are the costs $c(x, y) = d(x, y)^p$, and one cannot expect in general that this cost is smooth even when $p \neq 1$. For example if $M$ is the sphere $S^d$ or the flat torus $T^d$ endowed with the Riemannian structure induced by $\mathbb{R}^{d+1}$, then $(x, y) \mapsto d(x, y)^p$ is not differentiable in $x$ at $(x, y)$ whenever $x$ and $y$ are antipodal. Nevertheless we shall see in section §5 below that those particular costs also satisfy (NLT) as particular cases of costs associated to a Tonelli Lagrangian.
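The loss of differentiability at antipodal points can be observed on the simplest example. The Python sketch below is an illustration under our own assumptions: the manifold is taken to be the circle $\mathbb{R}/\mathbb{Z}$ of length $1$ with its geodesic distance, the cost is $d(x, y)^2$, and the helper names are hypothetical. It compares one-sided difference quotients in $x$ at an antipodal pair and at a generic pair.

```python
def circle_dist(x, y):
    """Geodesic distance on the circle R/Z of total length 1."""
    t = abs(x - y) % 1.0
    return min(t, 1.0 - t)

def cost(x, y, p=2.0):
    return circle_dist(x, y) ** p

def one_sided_derivatives(x, y, h=1e-6, p=2.0):
    """Left and right difference quotients of x -> cost(x, y)."""
    left = (cost(x, y, p) - cost(x - h, y, p)) / h
    right = (cost(x + h, y, p) - cost(x, y, p)) / h
    return left, right

y = 0.0
left_anti, right_anti = one_sided_derivatives(0.5, y)  # x antipodal to y
left_gen, right_gen = one_sided_derivatives(0.2, y)    # generic position

print(left_anti, right_anti)  # close to +1 and -1: a concave kink
print(left_gen, right_gen)    # both close to 0.4: the cost is differentiable
```

At the antipodal point the left derivative is larger than the right one, so the cost has a concave kink: it is superdifferentiable in $x$ there, with a superdifferential equal to the whole interval between the two one-sided derivatives, but it is not differentiable.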
In view of the previous discussion, one may wonder whether (LT) implies (NLT), or at least if the fact that $c$ satisfies (LT) on $U \times N$ implies that it satisfies (NLT) at least on some part of the non-differentiability points for $c$. We hereafter provide a counter-example.

A property of transport plans
We now discuss the regularity assumption we shall make upon the initial measure $\mu$. As noted in the introduction, a common hypothesis on $\mu$ is to assume that it does not give mass to sets with σ-finite $\mathcal{H}^{d-1}$ measure. Following Fathi and Figalli [1], we shall make the slightly more general assumption that the measure $\mu$ does not give mass to countably $(d-1)$-Lipschitz sets. We recall the definition below.
It directly follows from the above definition that a countably $(d-1)$-Lipschitz set is of σ-finite $\mathcal{H}^{d-1}$ measure: as a consequence, if a Borel probability $\mu$ on $\mathbb{R}^d$ does not give mass to sets of σ-finite $\mathcal{H}^{d-1}$ measure then it does not give mass to countably $(d-1)$-Lipschitz sets. In the case where $\mu$ is a Borel probability on a second countable $C^1$ manifold $M$, we say that it does not give mass to countably $(d-1)$-Lipschitz sets if this holds in local charts, that is if $\varphi_\sharp \mu$ has this property for any local chart $(V, \varphi)$. Here again this notion does not depend on the chosen local chart: if $(V, \varphi)$ and $(W, \psi)$ are two local charts then $\psi \circ \varphi^{-1}$ is a $C^1$ diffeomorphism on $\varphi(V \cap W)$, so that it preserves the countably $(d-1)$-Lipschitz property.
As the proof of our main result in the manifold setting relies on a localization argument, we shall assume that $M = \mathbb{R}^d$ in the rest of this section. Our aim here is to prove Lemma 3.4 below, which states that when the initial measure $\mu$ does not give mass to countably $(d-1)$-Lipschitz sets then any transport plan in $\Pi(\mu, \nu)$ is concentrated on a σ-compact subset of $\mathbb{R}^d \times N$ whose elements are somehow interior points of its support (see Remark 3.5 below). We first recall the following definition from Champion, De Pascale and Juutinen [26].
Definition 3.2. Let $\Gamma$ be a σ-compact subset of $\mathbb{R}^d \times N$. For $y \in N$ and $r > 0$ we define
$\Gamma^{-1}(B(y, r)) := \pi_x\big(\Gamma \cap (\mathbb{R}^d \times B(y, r))\big)$
where $\pi_x : (x, y) \mapsto x$ is the projection on $\mathbb{R}^d$.
In terms of mass transportation, if $\Gamma$ is a σ-compact subset of the support of a transport plan $\gamma \in \Pi(\mu, \nu)$, and if $\gamma$ is concentrated on $\Gamma$, then $\Gamma^{-1}(B(y, r))$ may be interpreted as a pre-image of $B(y, r)$ by $\gamma$ (restricted to $\Gamma$). Note that the set $\Gamma^{-1}(B(y, r))$ is also σ-compact, and thus is a Borel set.
We shall also use the cones $C(x, \xi, \delta)$ and, for a subset $A$ of $\mathbb{R}^d$ and $\varepsilon > 0$, the associated sets $C(A; \xi, \delta, \varepsilon)$. The set $C(x, \xi, \delta)$ is a pointed cone with apex $x$, with direction $\xi$ and angle $\arccos(1 - \delta)$ around $\xi$. Note that when $A$ is closed, the set $C(A; \xi, \delta, \varepsilon)$ is also closed.
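For readers who wish to experiment with these cones, the following Python sketch tests membership in a cone of apex $x$, direction $\xi$ and half-angle $\arccos(1 - \delta)$. The explicit formula used here, $(x' - x) \cdot \xi \ge (1 - \delta)\,\|x' - x\|$, is an assumption of ours chosen to match the stated angle (in particular, whether the apex itself belongs to the cone is a convention), and the helper name `in_cone` is hypothetical.

```python
import math

def in_cone(x_prime, x, xi, delta):
    """Membership test for the cone of apex x, unit direction xi, half-angle arccos(1 - delta).

    Assumed description: (x' - x) . xi >= (1 - delta) * |x' - x|.
    With this convention the apex x itself passes the test.
    """
    v = [a - b for a, b in zip(x_prime, x)]
    norm_v = math.sqrt(sum(c * c for c in v))
    dot = sum(a * b for a, b in zip(v, xi))
    return dot >= (1.0 - delta) * norm_v

apex = (0.0, 0.0)
xi = (0.0, 1.0)   # unit direction of the cone
delta = 0.5       # half-angle arccos(0.5), i.e. 60 degrees

print(in_cone((0.0, 1.0), apex, xi, delta))  # True: on the axis
print(in_cone((1.0, 1.0), apex, xi, delta))  # True: 45 degrees from the axis
print(in_cone((2.0, 1.0), apex, xi, delta))  # False: about 63 degrees from the axis
print(in_cone((1.0, 0.0), apex, xi, delta))  # False: orthogonal to the axis
```

Small values of $\delta$ give thin cones around $\xi$, while $\delta$ close to $1$ gives cones close to a half-space.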
The following Lemma of Geometric Measure Theory was shown to us by Tapio Rajala.
Proof. Without loss of generality we assume that $\xi = e_d$ is the $d$-th vector of the canonical basis of $\mathbb{R}^d$, and we shall denote by $x_{-d}$ the projection of a vector $x \in \mathbb{R}^d$ onto $\mathbb{R}^{d-1} \times \{0\}$.
We fix $\delta \in ]0, 1[$ and $\varepsilon > 0$, and for $k \in \mathbb{Z}$ we define a set $B_k$. Then $B_k$ is a compact subset of $\mathbb{R}^d$. We also remark that if $x$ and $y$ both belong to $B_k$ then $x \notin C(y, \xi, \delta)$ and $y \notin C(x, \xi, \delta)$. Using the definition (4) we thus obtain an inequality, valid whenever $x$ and $y$ both belong to $B_k$, from which we infer that the compact set $B_k$ is the graph of a Lipschitz function.
The following result is a finer version of Lemma 5.2 and Proposition 5.4 of Champion, De Pascale and Juutinen [26], in which the assumption that $\mu$ is absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^d$ is weakened thanks to the above Lemma 3.3.
Lemma 3.4. Assume that the Borel probability measure $\mu$ on $\mathbb{R}^d$ gives zero measure to countably $(d-1)$-Lipschitz sets. Let $\gamma \in \Pi(\mu, \nu)$, and let $\Gamma$ be a σ-compact set on which $\gamma$ is concentrated. Then there exists a Borel subset $R(\Gamma)$ of $\Gamma \cap \mathrm{support}(\gamma)$ on which $\gamma$ is concentrated, and such that (6) holds for any $(x, y) \in R(\Gamma)$, $r > 0$, $\xi \in S^{d-1}$, $\delta \in ]0, 1[$ and $\varepsilon > 0$.
Proof. Let $(y_n)_n$ be a dense sequence in $N$ and $(\xi_n)_n$ be a dense sequence in $S^{d-1}$, then for any $(i, j, k) \in \mathbb{N}^5$ we define a set $B_{i,j,k}$. We remark that if $(x, y) \in \Gamma$ is such that $x$ belongs to $C(\Gamma^{-1}(B(y, r)); \xi, \delta, \varepsilon)$ for some $r$, $\xi$, $\delta$, $\varepsilon$ then $x$ belongs to $\bigcup_{i,j,k} B_{i,j,k}$. We infer from Lemma 3.3 that each compact set $B_{i,j,k}$ is countably $(d-1)$-Lipschitz, so that it has $\mu$-measure $0$. Then the resulting Borel set $R(\Gamma)$ fulfills the desired property.
Remark 3.5. An element $x$ of the set $\pi_x(R(\Gamma))$ of Lemma 3.4 may be considered as a sort of interior point of $\pi_x(\Gamma \cap \mathrm{support}(\gamma))$, in the sense that in any direction $\xi \in S^{d-1}$ one may find $x' \neq x$ in $\pi_x(\Gamma \cap \mathrm{support}(\gamma))$ such that $\|x - x'\|$ and $\big\|\xi - (x - x')/\|x - x'\|\big\|$ are as small as desired.

c-monotone transport plans are induced by transport maps
In this section we prove the main result of this paper, namely that under mild assumptions on the initial measure $\mu$ and the cost function $c$, any $c$-monotone transport plan is induced by a transport map. Let us first recall the notions of $c$-monotonicity and $c$-cyclical monotonicity.
The set $\Gamma$ is $c$-cyclically monotone if for any $N \ge 2$, any permutation $\sigma$ of $\{1, \ldots, N\}$ and any $N$ elements $(x_1, y_1), \ldots, (x_N, y_N)$ of $\Gamma$ it holds
$\sum_{i=1}^{N} c(x_i, y_i) \le \sum_{i=1}^{N} c(x_i, y_{\sigma(i)}).$
When $c$ is continuous and (P) has finite value, it is well known that the fact that the support of a transport plan $\gamma \in \Pi(\mu, \nu)$ is $c$-cyclically monotone is a necessary and sufficient condition for its optimality (see e.g. Pratelli [29], Schachermayer and Teichmann [30]). This is of course not the case for $c$-monotonicity, as the simple example below shows.
We are now in position to state the main result of this paper.
Proof (Proof of Theorem 4.3). We first do the proof in the case $M = \mathbb{R}^d$.
From Lemma 3.4 we get that $\gamma$ is concentrated on a Borel subset $R(\Gamma)$ of $\Gamma$ which is $c$-monotone and such that (6) holds for any $(x, y) \in R(\Gamma)$, $r > 0$, $\xi \in S^{d-1}$, $\delta \in ]0, 1[$ and $\varepsilon > 0$. We prove by contradiction that $R(\Gamma) \cap (U \times N)$ is a graph: assume that $(x_0, y_0)$ and $(x_0, z_0)$ both belong to that set with $y_0 \neq z_0$. By (NLT) we may assume that there exists $p_0 \in \nabla^+_x c(x_0, y_0) \setminus \nabla^+_x c(x_0, z_0)$. Then, since the convex set $\nabla^+_x c(x_0, z_0)$ is non-empty and closed (see Lemma 2.6), we infer from the Hahn-Banach separation theorem that there exist $\xi \in S^{d-1}$ and $\beta > 0$ which strictly separate $p_0$ from $\nabla^+_x c(x_0, z_0)$. Now there exists $\eta : \mathbb{R}^d \to \mathbb{R}$ such that $\eta(r) \to 0$ as $r \to 0$ and such that the inequality of Definition 2.3 holds for all $(x, z)$ in a neighborhood of $(x_0, z_0)$ and any $q_{x,z} \in \nabla^+_x c(x, z)$. We thus obtain a corresponding estimate. Since $(x, z) \mapsto \nabla^+_x c(x, z)$ has closed graph (see Lemma 2.6), there exist $\varepsilon > 0$ and $r > 0$ such that a further estimate holds for all $x \in B(x_0, \varepsilon)$ and $z \in B(z_0, r)$. We may also assume that for such a choice of $\varepsilon$ it holds $\eta(x - x_0) < \beta/4$.
In the general case where $M$ is a second countable $C^1$ manifold, we consider a $C^1$ atlas $(U_i, \varphi_i)_{i \in \mathbb{N}}$ of $M$ and for any $i$ we define the cost $c_i$ obtained by reading $c$ in the chart $\varphi_i$. For any $i$ we also associate a compact covering family $(K^i_j)_j$ of $U_i \cap U$, and then for any $i, j$ we define the initial measure $\mu_{i,j} = (\varphi_i \llcorner K^i_j)_\sharp \mu$ and the transport plan $\gamma_{i,j} := ((\varphi_i \llcorner K^i_j) \times \mathrm{id})_\sharp \gamma$. Then we are in position to apply the first part of the proof to each triple of data $(c_i, \mu_{i,j}, \gamma_{i,j})$, and obtain that each $\gamma_{i,j}$ is concentrated on a graph, which yields that $\gamma$ itself is concentrated on a graph.
Let us now comment on the statement of Theorem 4.3 and its proof. In the more recent related results for the existence of an optimal transport map for (P) in the literature (see Fathi and Figalli [1], Figalli [25,24], Villani [28]), the basic common assumptions on the data are the following:
(a) the initial measure $\mu$ gives zero mass to sets with σ-finite $\mathcal{H}^{d-1}$ measure;
(b) the cost $(x, y) \mapsto c(x, y)$ is lower semicontinuous and locally superdifferentiable in $x$ locally in $y$ (generally expressed in terms of local semiconcavity, which is equivalent by Proposition 10.12 in Villani [28]);
(c) the cost function $c$ satisfies the Left Twist condition (LT);
(d) the transport plan $\gamma$ is concentrated on a $c$-cyclically monotone set.
In the statement of Theorem 4.3, hypothesis (a) is replaced by the slightly weaker requirement that $\mu$ gives zero measure to countably $(d-1)$-Lipschitz sets, which is sufficient for our purpose by Lemma 3.4. Regarding hypothesis (b), in our framework the continuity of $c$ yields Lemma 2.6, which is necessary to obtain that the superdifferential of $c$ has closed graph; one may note that continuity of the cost function is in fact a usual feature in this setting, and holds for example for any cost associated with a Tonelli Lagrangian (see Proposition 5.3 below). The main differences between the hypotheses of Theorem 4.3 and the common assumptions listed above are then that (c) is clearly strengthened while hypothesis (d) is considerably weakened, so we shall focus on those two points.
As commented after Definition 4.1, under the hypothesis (d) the plan $\gamma$ is an optimal solution of problem (P) (or a generalized optimal plan in the case where the value of (P) is not finite): as such, its support is included in the $c$-subdifferential of a $c$-convex function $\psi$, which is heuristically a Kantorovich potential for problem (P) (that is, a solution of its dual problem, when the value of (P) is finite). It then remains to obtain that this function $\psi$ is in fact differentiable $\mu$-almost everywhere to get that the plan $\gamma$ is concentrated on the graph of $D\psi$: this last point relies on a deep study of the regularity theory for $c$-convex functions. We refer to Chapter 10 of Villani [28] for a presentation of this approach and extended bibliographical notes on this topic. As a consequence, the usual approach to obtain that a plan $\gamma$ is concentrated on a graph relies on a dual argument and the regularity theory for $c$-convex functions.
On the contrary, our approach is essentially primal, since we assume that the plan $\gamma$ is concentrated on a set which is just $c$-monotone and thus may fail to be optimal even when the value of problem (P) is finite (recall Example 4.2). Therefore we cannot use the regularity theory for $c$-convex functions, and the assumption that $c$ satisfies the Nonsmooth Left Twist condition (NLT) compensates for the weakening of hypothesis (d) in the statement of Theorem 4.3. As a conclusion, we obtain a result which is more general since it applies to plans that are not necessarily optimal, but also weaker in the sense that we only obtain that such a plan $\gamma$ is concentrated on the graph of a Borel function (e.g. see Proposition 2.1 in Ambrosio [31]), and we do not get a description of that transport map as a gradient (although one can find such a description as a partial derivative of a Hamiltonian in the special case where $c(x, y) = \|y - x\|^2$ on $\mathbb{R}^2$ in Ghoussoub and Moradifam [32]).

On costs induced by a Lagrangian
In this section we prove that the most common Lagrangian costs do satisfy the Nonsmooth Left Twist condition (NLT). We first recall some basic definitions and known facts about those Lagrangian costs, and refer to Appendix B of Fathi and Figalli [1] and Chapter 10 of Villani [28] for a more detailed presentation.
In all this section $M$ is assumed to be a connected, complete and smooth Riemannian manifold of dimension $d$. When it exists, a minimizer $\xi$ of $c_t(x, y)$ is called an $L$-minimal curve. Some conditions are needed on the Lagrangian $L$ to ensure finite values and some regularity properties for the costs $c_t$. The notions of Tonelli Lagrangian and weak Tonelli Lagrangian are the most common in the Calculus of Variations, in Dynamical Systems theory and in Optimal Transportation.
Definition 5.2. We say that $L$ : where $\|\cdot\|_x$ is the norm on $T_x M$ associated to the Riemannian metric $g$; in the fibers, that is for every $a \ge 0$, there exists a constant $\alpha(a, K) > -\infty$ such that
The above notions are sufficient to obtain the following result, see Theorems B.6-7-19 of Fathi and Figalli [1].
Proposition 5.3. Let $L$ be a weak Tonelli Lagrangian and let $c_t$ be the associated cost for some $t > 0$. Then
1. $c$ is continuous and locally superdifferentiable in $x$ locally in $y$ on $M \times M$;
2. For any two points $x_0$ and $y_0$ in $M$ there exists at least one $L$-minimal curve, every such curve $\gamma_0$ is of class $C^1$ and moreover
We now turn to the main result of this section, namely that under mild assumptions on the weak Tonelli Lagrangian $L$ any of the costs $c_t$ satisfies (NLT).
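Then $c_t$ satisfies (NLT).
Proof. Assume that $\nabla^+_x c(x_0, y_0) = \nabla^+_x c(x_0, z_0)$; we aim to show that $y_0 = z_0$. First we prove that
(7) $\nabla^+_x c(x_0, y_0) = \mathrm{conv}\left\{ -\frac{\partial L}{\partial v}(x_0, \dot\xi(0)) \;:\; \xi \text{ is } L\text{-minimal from } x_0 \text{ to } y_0 \right\}.$
By (2) of Proposition 5.3 and the convexity of the superdifferential, we already know that the right-hand side of (7) is included in the left-hand side. If equality does not hold in the previous inclusion then there exist $p_0 \in \nabla^+_x c(x_0, y_0)$ and $w \in T_{x_0} M$ such that
(8) $p_0 \cdot w < \min\left\{ -\frac{\partial L}{\partial v}(x_0, \dot\xi(0)) \cdot w \;:\; \xi \text{ is } L\text{-minimal from } x_0 \text{ to } y_0 \right\}.$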
Now let $X_w$ be the geodesic such that $X_w(0) = x_0$ and $\dot X_w(0) = w$, and for $\varepsilon$ small enough set $x_\varepsilon = X_w(\varepsilon) = \exp_{x_0}(\varepsilon w)$ and $w_\varepsilon = \dot X_w(\varepsilon)$, where $\exp_{x_0}$ denotes the exponential mapping on $T_{x_0} M$. Note that $X^\varepsilon_w : t \mapsto X_w(\varepsilon - t)$ is also a geodesic and then $\exp_{x_\varepsilon}(-\varepsilon w_\varepsilon) = x_0$, and also that $w_\varepsilon \to w$ as $\varepsilon \to 0$ since the geodesic $X_w$ is smooth. If $\xi_\varepsilon$ denotes an $L$-minimal curve from $x_\varepsilon$ to $y_0$, then we may apply (2) of Proposition 5.3. Dividing the resulting inequality by $\varepsilon$ and then passing to the limit, we obtain from the continuity of the Legendre transform and assumption (CP) that $-\frac{\partial L}{\partial v}(x_0, \dot\xi_0(0)) \cdot w \le p_0 \cdot w$ for some $L$-minimal curve $\xi_0$ from $x_0$ to $y_0$: this contradicts (8).
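We are now in position to conclude from the following fact: when $A \subset \mathbb{R}^d$ is compact, it contains the set of extremal points of $\mathrm{conv}(A)$, which is nonempty by the Krein-Milman theorem. By (CP) the sets $\left\{ -\frac{\partial L}{\partial v}(x_0, \dot\xi(0)) : \xi \text{ is } L\text{-minimal from } x_0 \text{ to } y_0 \right\}$ and $\left\{ -\frac{\partial L}{\partial v}(x_0, \dot\xi(0)) : \xi \text{ is } L\text{-minimal from } x_0 \text{ to } z_0 \right\}$ are compact, and by (7) they have the same convex hull, so they both contain its extremal points and thus have nonempty intersection. By the injectivity of $\frac{\partial L}{\partial v}(x_0, \cdot)$ and by (UC) we conclude that $y_0 = z_0$.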
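The convexity fact used in this last step can be illustrated on finite subsets of the plane. The Python sketch below is a toy illustration which is not part of the proof: the helper `extreme_points` is a hypothetical name for a standard monotone chain convex hull routine, and the two sets are chosen by us. It shows two different finite sets with the same convex hull, which must then share the extremal points of that hull.

```python
def cross(o, a, b):
    """Signed area of the parallelogram spanned by (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def extreme_points(points):
    """Vertices of the convex hull of a finite planar set (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return set(pts)

    def half_hull(seq):
        hull = []
        for p in seq:
            # Pop points that are not strict vertices (clockwise turn or collinear).
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return set(lower[:-1] + upper[:-1])

# Two different sets whose convex hull is the same square [0, 2] x [0, 2].
A = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
B = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 0), (0, 1)]

ext_A = extreme_points(A)
ext_B = extreme_points(B)
print(ext_A == ext_B)                      # True: same extremal points
print(sorted(set(A) & set(B)))             # the four corners of the square
```

In the proof, the role of $A$ and $B$ is played by the two compact sets of momenta associated with $y_0$ and $z_0$; a common element of these sets is what allows the injectivity of $\frac{\partial L}{\partial v}(x_0, \cdot)$ and (UC) to be applied.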
The following two examples show that the most common Lagrangian costs, and in particular the powers (strictly larger than 1) of the distance on a Riemannian manifold, fall in the framework of our study. In particular, if $g$ is the Riemannian metric over $M$ then the square distance cost $c_t(x, y) = t\, d_g(x, y)^2$ is associated to a Tonelli Lagrangian.
Example 5.6. As a corollary of the above example, one also obtains that the costs associated to the Lagrangian $L(x, v) = g_x(v, v)^{p/2}$ for $p > 1$ also enter the framework of our study: it follows from Proposition B.24 in Fathi and Figalli [1] that the associated cost for such a Lagrangian is of the form $c_t(x, y) = t^{p-1} d_g(x, y)^p$, and as such it has the same $L$-minimal curves as the square distance cost, so that the hypotheses of Theorem 5.4 hold. Note that in the case $p \ge 2$ such a Lagrangian is in fact a Tonelli Lagrangian.

Example 4.2. Let $M = N = \mathbb{R}^2$ and consider the usual squared Euclidean norm cost $c(x, y) = \|y - x\|^2$. Then the problem (P) for the probabilities $\mu = \nu = \frac{1}{\pi} \mathcal{L}^2 \llcorner B(0, 1)$ obviously has a unique solution $(\mathrm{id} \times \mathrm{id})_\sharp \mu$ induced by the identity, which is then the only admissible transport plan supported by a $c$-cyclically monotone set. On the other hand, if $R_{\pi/2}$ denotes the rotation of center $0$ and angle $\frac{\pi}{2}$ in $\mathbb{R}^2$, then it is clear that the transport plan $(\mathrm{id} \times R_{\pi/2})_\sharp \mu$ is $c$-monotone and not optimal for (P).
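The two claims of this example can be checked on a finite sample of the unit disc. The Python sketch below is an illustration under our own choices (the sample points and the helper names are hypothetical): it tests the pairwise inequality $c(x_1, y_1) + c(x_2, y_2) \le c(x_1, y_2) + c(x_2, y_1)$, that is the case $N = 2$ of cyclical monotonicity, and then all permutations of the sample, using exact rational arithmetic. For the squared Euclidean cost the pairwise inequality is equivalent to $(x_1 - x_2) \cdot (y_1 - y_2) \ge 0$, which holds with equality for a quarter turn.

```python
from fractions import Fraction
from itertools import permutations

def cost(x, y):
    """Squared Euclidean distance in the plane."""
    return (y[0] - x[0]) ** 2 + (y[1] - x[1]) ** 2

def rotate_quarter_turn(x):
    """Rotation of center 0 and angle pi/2."""
    return (-x[1], x[0])

def is_c_monotone(pairs):
    """Pairwise test: c(x1,y1) + c(x2,y2) <= c(x1,y2) + c(x2,y1)."""
    return all(
        cost(x1, y1) + cost(x2, y2) <= cost(x1, y2) + cost(x2, y1)
        for (x1, y1) in pairs
        for (x2, y2) in pairs
    )

def is_c_cyclically_monotone(pairs):
    """Brute-force test over all permutations of a small finite set of pairs."""
    n = len(pairs)
    base = sum(cost(x, y) for x, y in pairs)
    return all(
        base <= sum(cost(pairs[i][0], pairs[s[i]][1]) for i in range(n))
        for s in permutations(range(n))
    )

# A few points of the closed unit disc, with exact coordinates.
sample = [(1, 0), (0, 1), (-1, 0), (0, -1), (Fraction(1, 2), Fraction(1, 3))]
rotation_pairs = [(x, rotate_quarter_turn(x)) for x in sample]
identity_pairs = [(x, x) for x in sample]

print(is_c_monotone(rotation_pairs))             # True
print(is_c_cyclically_monotone(rotation_pairs))  # False
print(is_c_cyclically_monotone(identity_pairs))  # True

total_rotation = sum(cost(x, y) for x, y in rotation_pairs)
total_identity = sum(cost(x, y) for x, y in identity_pairs)
print(total_rotation > total_identity)           # True: the rotation costs more
```

The violation of cyclical monotonicity is produced by the four points $(\pm 1, 0)$, $(0, \pm 1)$, which the rotation permutes cyclically: sending each of them to itself costs nothing, while the rotation has a positive cost.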

Theorem 4.3. Let $M$ be a second countable $C^1$ manifold of dimension $d$ and let $N$ be a Polish space, let $\mu$ and $\nu$ be Borel probabilities over $M$ and $N$ respectively, and let the cost $c$ be continuous and locally superdifferentiable in $x$ locally in $y$ on $U \times N$, where $U$ is an open set on which $\mu$ is supported. We assume that $\mu$ gives zero measure to countably $(d-1)$-Lipschitz sets and that $c$ satisfies (NLT). If a transport plan $\gamma \in \Pi(\mu, \nu)$ is concentrated on a σ-compact and $c$-monotone set $\Gamma$, then $\gamma$ is concentrated on a Borel graph.
Remark 4.4. Following the lines of the proof of Theorem 4.3 below, a similar result holds whenever $c$ is subdifferentiable instead of superdifferentiable.

Example 5.5. If $L$ is a Tonelli Lagrangian then it satisfies the assumptions of Theorem 5.4. Indeed, property (UC) follows from Lemma B.22 and Proposition B.23 of Fathi and Figalli [1], while (CP) is a consequence of the classical regularity theory in the Calculus of Variations.