An optimal transport formulation of the Einstein equations of general relativity

The goal of the paper is to give an optimal transport formulation of the full Einstein equations of general relativity, linking the (Ricci) curvature of a space-time with the cosmological constant and the energy-momentum tensor. This formulation is expressed in terms of convexity/concavity properties of the Shannon-Boltzmann entropy along curves of probability measures extremizing suitable optimal transport costs. The result gives a new connection between general relativity and optimal transport; moreover, it gives a mathematical reinforcement of the strong link between general relativity and thermodynamics/information theory that has emerged in the physics literature in recent years.

Here let us just quote two of the many applications to partial differential equations. In the pioneering work of Jordan-Kinderlehrer-Otto [55], a new optimal transport formulation of the Fokker-Planck equation (and in particular of the heat equation) was discovered: the equation is a gradient flow of a suitable functional (roughly, the Boltzmann-Shannon entropy defined below in (1.5) plus a potential) in the Wasserstein space (i.e. the space of probability measures with finite second moments endowed with the quadratic Kantorovich-Wasserstein distance); later, Otto [68] found a related optimal transport formulation of the porous medium equation. The impact of these works on the optimal transport community has been huge, and they opened the way to a more general theory of gradient flows (see for instance the monograph by Ambrosio-Gigli-Savaré [3]).
The goal of the present work is to give a new optimal transport formulation of another fundamental class of partial differential equations: the Einstein equations of general relativity. First published by Einstein in 1915, the Einstein equations describe gravitation as a result of space-time being curved by mass and energy; more precisely, the space-time (Ricci) curvature is related to the local energy and momentum expressed by the energy-momentum tensor. Before entering into the topic, let us first recall that the Einstein equations are hyperbolic evolution equations (for a comprehensive treatment see the recent monograph by Klainerman-Nicoló [52]). Instead of a gradient flow/PDE approach, we will view the evolution from a geometric/thermodynamic/information point of view.
Next we briefly recall the formulation of the Einstein equations. Let $M^n$ be an n-dimensional manifold (n ≥ 3, the physical dimension being n = 4) endowed with a Lorentzian metric g, i.e. g is a nondegenerate symmetric bilinear form of signature (−, +, . . . , +). Denote by Ric and Scal the Ricci and the scalar curvatures of $(M^n, g)$. The Einstein equations read as

(1.1) $\mathrm{Ric} - \tfrac{1}{2}\,\mathrm{Scal}\, g + \Lambda g = 8\pi T,$

where Λ ∈ R is the cosmological constant, and T is the energy-momentum tensor. Physically, the cosmological constant Λ corresponds to the energy density of the vacuum; the energy-momentum tensor is a symmetric bilinear form on M representing the density of energy and momentum, acting as the source of the gravitational field. A physical particle moving in the space-time (M, g, C) is represented by a causal curve, that is, an absolutely continuous curve γ satisfying $\dot\gamma_t \in C$ for a.e. t ∈ [0, 1].
If the particle cannot reach the speed of light (e.g. a massive particle), then it is represented by a chronological curve, that is, an absolutely continuous curve γ satisfying $\dot\gamma_t \in \mathrm{Int}(C)$ for a.e. t ∈ [0, 1], where Int(C) is the interior of the cone C made of future pointing time-like vectors. The Lorentz length of a causal curve is

$L_g(\gamma) := \int_0^1 \sqrt{-g(\dot\gamma_t, \dot\gamma_t)}\, dt.$

A point y is in the future of x, denoted $y \gg x$, if there is a future oriented chronological curve from x to y; in this case, the Lorentz distance or proper time between x and y is defined by $\sup\{L_g(\gamma) : \gamma_0 = x \text{ and } \gamma_1 = y,\ \gamma \text{ chronological}\} > 0$, which is achieved by a geodesic which is called a maximal geodesic. See for example [31,67,83].

In this paper, we consider the following Lorentzian Lagrangian on TM for p ∈ (0, 1):

$L_p(v) := \begin{cases} -\frac{1}{p}\,(-g(v,v))^{p/2} & \text{if } v \in C, \\ +\infty & \text{otherwise.} \end{cases}$
Note that if p were 1 this would be the negative of the integrand for the Lorentz length given above. Here we study p ∈ (0, 1) because this Lorentzian Lagrangian $L_p$ has good convexity properties for such p [see Lemma 2.1]. Let AC([0, 1], M) denote the space of absolutely continuous curves from [0, 1] to M. The Lagrangian action $A_p$, corresponding to the Lagrangian $L_p$ and defined for any γ ∈ AC([0, 1], M), is given by

$A_p(\gamma) := \int_0^1 L_p(\dot\gamma_t)\, dt.$

Observe that $A_p(\gamma) \in (-\infty, 0]$ if and only if γ is a causal curve. Note that, if p were 1, this would be the negative of the Lorentz length of γ, i.e. of the proper time along γ. Thus $-A_p(\gamma)$ can be seen as a kind of non-linear p-proper time along γ, enjoying better convexity properties. The reader may note the parallel with the theory of Riemannian geodesics, where one often studies the energy functional $\int |\dot\gamma|^2$ in place of the length functional $\int |\dot\gamma|$, due to the analogous advantages. The choice of the minus sign in (1.3) is motivated by optimal transport theory, in order to have a minimization problem instead of a maximization one (as in the sup defining the Lorentz distance between points above). It is readily checked that the critical points of $A_p$ with negative action are time-like geodesics [see Lemma 2.2]. The advantage of $A_p$ with p ∈ (0, 1) is that it automatically selects an affine parametrization for its critical points with negative action.
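To see concretely why an exponent p ∈ (0, 1) favors affine parametrizations, one can compare two parametrizations of the same time-like segment. The computation below is only an illustrative sketch: it assumes two-dimensional Minkowski space with metric $-dt^2 + dx^2$, the normalization $L_p(v) = -\frac{1}{p}(-g(v,v))^{p/2}$ on C, a proper time τ > 0, and a hypothetical reparametrization $t \mapsto t^2$; none of these choices is taken from the text.

```latex
% Assumptions (illustrative only): Minkowski metric g = -dt^2 + dx^2,
% L_p(v) = -\frac{1}{p}(-g(v,v))^{p/2} on C, \tau > 0, reparametrization t \mapsto t^2.
\begin{align*}
\gamma_t &= (\tau t, 0): & -g(\dot\gamma_t,\dot\gamma_t) &= \tau^2,
  & A_p(\gamma) &= -\frac{1}{p}\int_0^1 \tau^{p}\,dt = -\frac{\tau^{p}}{p},\\
\eta_t &= (\tau t^2, 0): & -g(\dot\eta_t,\dot\eta_t) &= 4\tau^2 t^2,
  & A_p(\eta) &= -\frac{\tau^{p}}{p}\int_0^1 (2t)^{p}\,dt
               = -\frac{\tau^{p}}{p}\cdot\frac{2^{p}}{p+1}.
\end{align*}
% s \mapsto 2^s is strictly convex with 2^0 = 1 and 2^1 = 2, hence 2^p < 1 + p on (0,1):
\[
A_p(\eta) > A_p(\gamma) \quad \text{for } p \in (0,1),
\qquad\text{whereas}\qquad A_1(\eta) = A_1(\gamma) = -\tau .
\]
```

Under these assumptions the affinely parametrized segment has strictly smaller action for p ∈ (0, 1), while for p = 1 the action does not distinguish the two parametrizations.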
Note that, if p were 1, then this would be the negative of the Lorentz distance between x and y. The Ricci curvature, Ric x (v, v) at a point x ∈ M in the direction v, is a trace of the curvature tensor so that intuitively it measures the average way in which geodesics near γ x bend towards or away from it. See Section 1.3. In Riemannian geometry, the Ricci curvature influences the volumes of balls.
Here, instead of balls, we define for any $x \in p_{TM\to M}(E)$ the set

$B^{g,E}_r(x) := \{\exp^g_x(tw) : w \in T_xM \cap E,\ g(w,w) = -1,\ t \in [0,r]\},$

precisely to avoid the null directions.
Rather than considering individual paths between a given pair of points, we will consider distributions of paths between a given pair of distributions of points using the optimal transport approach.
We denote by P(M) the set of Borel probability measures on M. For any $\mu_1, \mu_2 \in P(M)$, we say that a Borel probability measure π ∈ P(M × M) is a coupling of $\mu_1$ and $\mu_2$ if $(p_i)_\sharp \pi = \mu_i$, i = 1, 2, where $p_1, p_2 : M \times M \to M$ are the projections onto the first and second coordinate. Recall that the push-forward $(p_1)_\sharp \pi$ is defined by $(p_1)_\sharp \pi(A) := \pi(p_1^{-1}(A))$ for any Borel subset A ⊂ M. The set of couplings of $\mu_1, \mu_2$ is denoted by $\mathrm{Cpl}(\mu_1, \mu_2)$. The $c_p$-cost of a coupling π is given by

$\int_{M\times M} c_p(x,y)\, d\pi(x,y).$

Denote by $C_p(\mu_1, \mu_2)$ the minimal cost relative to $c_p$ among all couplings from $\mu_1$ to $\mu_2$, i.e.

$C_p(\mu_1,\mu_2) := \inf\Big\{ \int_{M\times M} c_p\, d\pi : \pi \in \mathrm{Cpl}(\mu_1,\mu_2) \Big\}.$

If $C_p(\mu_1, \mu_2) \in \mathbb{R}$, a coupling achieving the infimum is said to be $c_p$-optimal.
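As a sanity check of these definitions, consider two Dirac masses. The following short computation is a sketch that assumes the cost of a coupling is the integral of $c_p$ against it, as described above; the points x, y ∈ M are arbitrary.

```latex
% Assumption: the c_p-cost of a coupling \pi is \int_{M\times M} c_p \, d\pi.
\[
\mu_1 = \delta_x,\quad \mu_2 = \delta_y
\;\Longrightarrow\;
\mathrm{Cpl}(\delta_x,\delta_y) = \{\delta_{(x,y)}\},
\qquad
C_p(\delta_x,\delta_y) = \int_{M\times M} c_p \, d\delta_{(x,y)} = c_p(x,y).
\]
```

In this degenerate case the transport cost between measures reduces to the cost between points, so $C_p$ can be read as an extension of $c_p$ from points to distributions of points.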
We will mainly consider a special class of $c_p$-optimal dynamical plans, that we call regular: roughly, a $c_p$-optimal dynamical plan is said to be regular if it is obtained by exponentiating the gradient (which is assumed to be time-like) of a smooth Kantorovich potential φ:

(1.4) $\mu_t = (\Psi^t_{1/2})_\sharp\, \mu_{1/2}, \qquad \Psi^t_{1/2}(x) := \exp^g_x\big(-(t-1/2)\,|\nabla_g \phi|_g^{q-2}\, \nabla_g \phi(x)\big),$

and moreover $\mu_t \ll \mathrm{vol}_g$ for all t ∈ (0, 1), where $\mathrm{vol}_g$ denotes the standard volume measure of (M, g). For the precise notions, the reader is referred to Section 2.5.
A key role in our optimal transport formulation of the Einstein equations will be played by the (relative) Boltzmann-Shannon entropy. Denote by $\mathrm{vol}_g$ the standard volume measure on (M, g). Given an absolutely continuous probability measure $\mu = \rho\, \mathrm{vol}_g$ with density $\rho \in C_c(M)$, its Boltzmann-Shannon entropy (relative to $\mathrm{vol}_g$) is defined as

(1.5) $\mathrm{Ent}(\mu|\mathrm{vol}_g) := \int_M \rho \log \rho \; d\mathrm{vol}_g.$
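For orientation, the entropy of a normalized volume measure can be computed by hand. The sketch below formally extends (1.5) to a bounded, compactly supported density that is not continuous; the Borel set B ⊂ M with $0 < \mathrm{vol}_g(B) < \infty$ is a hypothetical choice made for illustration.

```latex
% Assumption: B \subset M Borel with 0 < vol_g(B) < \infty, and
% \mu = \rho\, vol_g with \rho := vol_g(B)^{-1} \mathbf{1}_B (uniform distribution on B).
\[
\mathrm{Ent}(\mu|\mathrm{vol}_g)
= \int_B \frac{1}{\mathrm{vol}_g(B)} \,\log\frac{1}{\mathrm{vol}_g(B)} \; d\mathrm{vol}_g
= -\log \mathrm{vol}_g(B).
\]
```

With this sign convention the entropy decreases as the mass spreads over a larger volume, which is why volume distortion along transport (and hence Ricci curvature) is visible in the second derivative of the entropy.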
We will prove in Theorem 4.10 that a suitable control of the second order derivative of this entropy along $c_p$-optimal dynamical plans is equivalent to the Einstein equations. See Figure 1. Throughout the paper we will assume the cosmological constant Λ and the energy-momentum tensor T to be given, say from physics and/or mathematical general relativity. Given g, Λ and T it is convenient to set

(1.6) $\tilde T := \frac{2\Lambda}{n-2}\, g + 8\pi T - \frac{8\pi}{n-2}\, \mathrm{Tr}_g(T)\, g,$

so that the Einstein equations can be written as $\mathrm{Ric} = \tilde T$, see Lemma 4.1.
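The reduction of the Einstein equations to an identity between the Ricci tensor and a modified energy-momentum tensor (denoted $\tilde T$ in the text) is a trace computation. The following sketch assumes that the Einstein equations are normalized as $\mathrm{Ric} - \frac12 \mathrm{Scal}\, g + \Lambda g = 8\pi T$; with a different normalization of the coupling constant the factors of 8π change accordingly.

```latex
% Assumption: Einstein equations in the form Ric - (1/2) Scal g + \Lambda g = 8\pi T.
% Step 1: take the g-trace, using Tr_g(g) = n and Tr_g(Ric) = Scal.
\[
\Big(1 - \frac{n}{2}\Big)\mathrm{Scal} + n\Lambda = 8\pi\, \mathrm{Tr}_g(T)
\;\Longrightarrow\;
\mathrm{Scal} = \frac{2n\Lambda - 16\pi\, \mathrm{Tr}_g(T)}{n-2}.
\]
% Step 2: substitute back into Ric = 8\pi T - \Lambda g + (1/2) Scal g.
\begin{align*}
\mathrm{Ric}
&= 8\pi T - \Lambda g + \frac{n\Lambda - 8\pi\, \mathrm{Tr}_g(T)}{n-2}\, g \\
&= \frac{2\Lambda}{n-2}\, g + 8\pi T - \frac{8\pi}{n-2}\, \mathrm{Tr}_g(T)\, g .
\end{align*}
```

In the vacuum case T ≡ 0 this reduces to $\mathrm{Ric} = \frac{2\Lambda}{n-2}\, g$, and to Ric ≡ 0 when in addition Λ = 0.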
Theorem 1.1 (Theorem 4.10). Let (M, g, C) be a space-time of dimension n ≥ 3. Then the following assertions are equivalent:
(1) (M, g, C) satisfies the Einstein equations (1.1), which can be rewritten in terms of $\tilde T$ of (1.6) as $\mathrm{Ric} = \tilde T$.
(2) For every p ∈ (0, 1) and for every relatively compact open subset E ⊂⊂ Int(C) there exist R = R(E) ∈ (0, 1) and a function [...] the next assertion holds. For every r ∈ (0, R), setting $y = \exp^g_x(rv)$, there exists a regular $c_p$-optimal dynamical plan Π = Π(x, v, r) with associated curve of probability measures [...] and which has convex/concave entropy in the following sense: [...]
(3) There exists p ∈ (0, 1) such that the assertion as in (2) holds true.

Remark 1.2 (On the regularity of the space-time). For simplicity, in the paper we work with a smooth space-time. However, all the statements and proofs would be valid assuming that M is a differentiable manifold endowed with a $C^3$-atlas and that g is a $C^2$-Lorentzian metric on M.

Remark 1.3. [...] be interpreted as the evolution (a) of a distribution of gas passing through a given gas distribution $\mu_{1/2}$ (that in Theorem 1.1 is assumed to be concentrated in the space-time near x). Theorem 1.1 says that the Einstein equations can be equivalently formulated in terms of the convexity properties of the Boltzmann-Shannon entropy along such evolutions $(\mu_t)_{t\in[0,1]} \subset P(M)$. Extrapolating a bit more, we can say that the second law of thermodynamics (i.e. in a natural thermodynamic process, the sum of the entropies of the interacting thermodynamic systems decreases, due to our sign convention) concerns the first derivative of the Boltzmann-Shannon entropy; gravitation (under the form of Ricci curvature) is instead related to the second order derivative of the Boltzmann-Shannon entropy along a natural thermodynamic process.

(a) Strictly speaking, t is not the proper time, but only a variable parametrizing the evolution.
Remark 1.4 (Disclaimer). In Theorem 1.1 we are not claiming to solve the general Einstein equations via optimal transport; we are instead proposing a novel formulation/characterization of the solutions of the Einstein equations based on optimal transport, assuming that the cosmological constant Λ and the energy-momentum tensor T are already given (this can be a bit controversial for a general T; however, the characterization is already new and interesting in the vacuum case T ≡ 0, where there is no controversy). The aim is indeed to bridge optimal transport and general relativity, with the goal of stimulating fruitful connections between these two fascinating fields. In particular, optimal transport tools have been very successful in the study of Ricci curvature bounds in a (low regularity) Riemannian and metric-measure framework (see later in the introduction for the related literature), and it is thus natural to expect that optimal transport can be useful also in a low-regularity Lorentzian framework, where singularities play an important part in the theory; for example, it is expected that, at least generically, singularities occur in black-hole interiors.
For equivalent formulations of Theorem 1.1, see Remark 4.11 and Remark 4.12.
In the vacuum case T ≡ 0 with zero cosmological constant Λ = 0, the Einstein equations read as (1.8) Ric ≡ 0 for an n-dimensional space-time (M, g, C). Specializing Theorem 1.1 to the choice $\tilde T = 0$ (plus a small extra observation to sharpen the lower bound in (1.9) from −ε(r) to 0; moreover the same proof extends to n = 2) gives the following optimal transport formulation of the Einstein vacuum equations with zero cosmological constant.
1.2. Outline of the argument. As already mentioned, the Einstein equations can be written as $\mathrm{Ric} = \tilde T$, where $\tilde T$ was defined in (1.6), see Lemma 4.1.
The optimal transport formulation of the Einstein equations will consist separately of an optimal transport characterization of the two inequalities (1.10) $\mathrm{Ric} \ge \tilde T$ and (1.11) $\mathrm{Ric} \le \tilde T$, respectively. The optimal transport characterization of the lower bound (1.10) will be achieved in Theorem 4.3 and consists in showing that (1.10) is equivalent to a convexity property of the Boltzmann-Shannon entropy along every regular $c_p$-optimal dynamical plan. The optimal transport characterization of the upper bound (1.11) will be achieved in Theorem 4.7 and consists in showing that (1.11) is equivalent to the existence of a large family of regular $c_p$-optimal dynamical plans (roughly the ones given by exponentiating the gradient of a smooth Kantorovich potential with Hessian vanishing at a given point) along which the Boltzmann-Shannon entropy satisfies the corresponding concavity condition.

Important ingredients in the proofs will be the following. In Theorem 4.3, for proving that Ricci lower bounds imply convexity properties of the entropy, we will perform Jacobi fields computations relating the Ricci curvature with the Jacobian of the change of coordinates of the optimal transport map (see Proposition 3.4); in order to establish the converse implication we will argue by contradiction via constructing $c_p$-optimal dynamical plans very localized in the space-time (Lemma 3.2). In Theorem 4.7 we will consider the special class of regular $c_p$-optimal dynamical plans constructed in Lemma 3.2, roughly the ones given by exponentiating the gradient of a smooth Kantorovich potential with Hessian vanishing at a given point x ∈ M. For proving that Ricci upper bounds imply concavity properties of the entropy, we will need to establish the Hamilton-Jacobi equation satisfied by the evolved Kantorovich potentials (Proposition 3.1) and a non-linear Bochner formula involving the p-Box operator (Proposition A.1), the Lorentzian counterpart of the p-Laplacian.
In order to show the converse implication we will argue by contradiction using Theorem 4.3.

1.3. An Example. FLRW Spacetimes. We illustrate Theorem 1.1 for the class of Friedmann-Lemaître-Robertson-Walker spacetimes (FLRW spacetimes for short), a family of cosmological models well known in general relativity. See [67, Chapter 12] for a discussion of the geometry in the case n = 4.
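For orientation, the Ricci curvature in the time direction of such a spacetime can be written in terms of the warping function. The sketch below assumes the standard warped-product form $M = I \times \Sigma$, $g = -ds^2 + a(s)^2 \sigma$ with a > 0 and n = dim M, and evaluates the formula on an illustrative quadratic warping function; the constant λ is a hypothetical parameter introduced here.

```latex
% Assumptions: M = I \times \Sigma, g = -ds^2 + a(s)^2 \sigma, a > 0, n = \dim M.
\[
\mathrm{Ric}(\partial_s,\partial_s) = -(n-1)\,\frac{a''(s)}{a(s)} .
\]
% Illustrative warping function with a hypothetical constant \lambda \in \mathbb{R}:
\[
a(s) = \lambda s^2 + 1
\;\Longrightarrow\;
\mathrm{Ric}(\partial_s,\partial_s) = -\frac{2(n-1)\lambda}{\lambda s^2 + 1}
\;\longrightarrow\; -2(n-1)\lambda \quad \text{as } s \to 0 .
\]
```

In particular, gluing two different values of λ at s = 0 produces a warping function whose second derivative, and hence the Ricci curvature in the direction $\partial_s$, jumps across s = 0.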

The foliation
is a geodesic foliation by c p -minimal geodesics. The orthogonal complement ∂ ⊥ s = T Σ with respect to g is integrable. Consider the projection S : M = I × Σ → I and for r > 0 the function It is easy to see that where the second equality follows from the fact that the geodesics in O minimize c p to the level sets of φ. It follows that )} for all x ∈ M whenever the right hand side is well defined (see Section 2.5 for the definition).
Consider a probability measure $\nu_{1/2} \ll \mathrm{vol}_\sigma$ on Σ and set [...] for $s_0, s_1 \in I$ with $s_0 < s_1$, where $\mathcal{L}^1$ denotes the 1-dimensional Lebesgue measure. For [...] Assuming $s_1 - s_0 \ll r$ and r > 0 sufficiently small (i.e. the setting of Theorem 1.1), since a is continuous, we obtain: [...] for some function ε(r) → 0 as r → 0. As an example, we discuss the case of the following $C^{1,1}$-warping function a : I → R,

$a(s) := \begin{cases} \lambda_+ s^2 + 1, & s \ge 0, \\ \lambda_- s^2 + 1, & s < 0, \end{cases}$

for $\lambda_-, \lambda_+ \in \mathbb{R}$. Ignoring terms of higher order near s = 0 we get [...] It is now easy to see that the possible accumulation points of the right hand side for r → 0 and $s \in [-\frac{r}{2}, \frac{r}{2}]$ lie between $-2(n-1)\lambda_+ = \lim_{s\downarrow 0} \mathrm{Ric}(\partial_s, \partial_s)$ and $-2(n-1)\lambda_- = \lim_{s\uparrow 0} \mathrm{Ric}(\partial_s, \partial_s)$. Furthermore, every value in that interval is an accumulation point for the right hand side, as r → 0. In case the warping function lies in $C^2$, we get from (1.12) the asymptotic formula: [...]

[...] [27,28], Otto-Villani [69] and von Renesse-Sturm, has culminated in a characterization of Ricci-curvature lower bounds (by a constant K ∈ R) involving only the displacement convexity of certain information-theoretic entropies [77]. This in turn led Sturm [74,75] and independently Lott-Villani [57] to develop a theory for lower Ricci curvature bounds in a non-smooth metric-measure space setting. The theory of such spaces has seen a very fast development in recent years, see e.g. [2,4,5,6,7,19,23,24,32,39,40,65]. An approach to the complementary upper bounds on the Ricci tensor (again by a constant K ∈ R) has been recently proposed by Naber [66] (see also Haslhofer-Naber [44]) in terms of functional inequalities on path spaces and martingales, and by Sturm [76] (see also Erbar-Sturm [33]) in terms of contraction/expansion rate estimates of the heat flow and in terms of displacement concavity of the Shannon-Boltzmann entropy. The Lorentzian time-like Ricci upper bounds of this paper have been inspired in particular by the work of Sturm [76].

1.4.2. Optimal transport in Lorentzian setting. The optimal transport problem in Lorentzian geometry was first proposed by Brenier [14] and further investigated in [12,78,50]. An intriguing physical motivation for studying the optimal transport problem in the Lorentzian setting is the so-called "early universe reconstruction problem" [16,38]. The Lorentzian cost $C_p$, for p ∈ (0, 1), was proposed by Eckstein-Miller [29] and thoroughly studied by McCann [62] very recently. In the same paper [62], McCann gave an optimal transport formulation of the strong energy condition Ric ≥ 0 of Penrose-Hawking [71,45,46] in terms of displacement convexity of the Shannon-Boltzmann entropy, under the assumption that the space-time is globally hyperbolic. We learned of the work of McCann [62] when we were already in the final stages of writing the present paper. Though both papers (inspired by the aforementioned Riemannian setting) are based on the idea of analyzing convexity properties of entropy functionals on the space of probability measures endowed with the cost $C_p$, p ∈ (0, 1), the two approaches are largely independent: while McCann develops a general theory of optimal transportation in globally hyperbolic space-times focusing on the strong energy condition Ric ≥ 0, in this paper we decided to take the quickest path in order to reach our goal of giving an optimal transport formulation of the full Einstein equations. Compared to [62], in the present paper we remove the assumption of global hyperbolicity on the space-time, we extend the optimal transport formulation to any lower bound of the type $\mathrm{Ric} \ge \tilde T$ for any symmetric bilinear form $\tilde T$, and we also characterize general upper bounds $\mathrm{Ric} \le \tilde T$.
1.4.3. Physics literature. The existence of strong connections between thermodynamics and general relativity is not new in the physics literature; it has its origins at least in the work of Bekenstein [10] and of Hawking with collaborators [8] in the mid-1970s on black hole thermodynamics. These works inspired a new research field in theoretical physics, called entropic gravity (also known as emergent gravity), asserting that gravity is an entropic force rather than a fundamental interaction. Let us give a brief account. In 1995 Jacobson [43] derived the Einstein equations from the proportionality of entropy and horizon area of a black hole, exploiting the fundamental relation δQ = T δS linking heat Q, temperature T and entropy S. Subsequently, other physicists, most notably Padmanabhan (see for instance the recent survey [70]), have been exploring links between gravity and entropy.
More recently, in 2011 Verlinde [80] proposed a heuristic argument suggesting that (Newtonian) gravity can be identified with an entropic force caused by changes in the information associated with the positions of material bodies. A relativistic generalization of those arguments leads to the Einstein equations.
The optimal transport formulation of the Einstein equations obtained in the present paper, involving the Shannon-Boltzmann entropy, can be seen as an additional strong connection between general relativity and thermodynamics/information theory. It would be interesting to explore this relationship further.
Acknowledgement. The authors wish to thank Christina Sormani and the anonymous referee for several comments that improved the exposition of the paper.

2. Preliminaries
2.1. Some basics of Lorentzian geometry. Let M be a smooth manifold of dimension n ≥ 2. It is convenient to fix a complete Riemannian metric h on M. The norm | · | on $T_xM$ and the distance dist(·, ·) : M × M → $\mathbb{R}_+$ are understood to be induced by h, unless otherwise specified. Recall that h induces a Riemannian metric on TM. Distances on TM are understood to be induced by such a metric. The metric ball around x ∈ M with radius r, with respect to h, is denoted by $B^h_r(x)$ or simply by $B_r(x)$. A Lorentzian metric g on M is a smooth (0, 2)-tensor field such that $g_x$ is symmetric and non-degenerate with signature (−, +, . . . , +) for all x ∈ M. It is well known that, if M is compact, the vanishing of the Euler characteristic of M is equivalent to the existence of a Lorentzian metric; on the other hand, any non-compact manifold admits a Lorentzian metric.
A Lorentzian manifold (M, g) is said to be time-oriented if M admits a continuous nowhere vanishing time-like vector field X. The vector field X induces a partition of the set of causal vectors into two equivalence classes:
• the future pointing tangent vectors v, for which g(X, v) < 0;
• the past pointing tangent vectors v, for which g(X, v) > 0.
The closure of the set of future pointing time-like vectors is denoted by C ⊂ TM. An absolutely continuous curve γ : I → M is called (C)-causal if $\dot\gamma_t \in C$ for every differentiability point t ∈ I. A causal curve γ : I → M is called time-like if for every s ∈ I there exist ε, δ > 0 such that $\mathrm{dist}(\dot\gamma_t, \partial C) \ge \varepsilon |\dot\gamma_t|$ for every t ∈ I for which $\dot\gamma_t$ exists and |s − t| < δ. In [17, Section 2.2] time-like curves are defined in terms of the Clarke differential of a Lipschitz curve. Whereas the definition via the Clarke differential is probably more satisfying from a conceptual point of view, the definition given here is easier to state. In any case, all relevant sets and curves used below are independent of the chosen definition, see [17, Lemma 2.11] and Proposition 2.4.
We denote by $J^+(x)$ (resp. $J^-(x)$) the set of points y ∈ M such that there exists a causal curve with initial point x (resp. y) and final point y (resp. x), i.e. the causal future (resp. past) of x.

2.2. The Lagrangian $L_p$, the action $A_p$ and the cost $c_p$. On a space-time (M, g, C) consider, for any p ∈ (0, 1), the following Lagrangian on TM:

$L_p(v) := \begin{cases} -\frac{1}{p}\,(-g(v,v))^{p/2} & \text{if } v \in C, \\ +\infty & \text{otherwise.} \end{cases}$
The following fact appears in [62, Lemma 3.1]. We provide a proof for the reader's convenience.
Lemma 2.1. The function $L_p$ is fiberwise convex, finite (and non-positive) on its domain and positively homogeneous of degree p. Moreover $L_p$ is smooth and fiberwise strictly convex on Int(C).
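The strict convexity claim can be checked by an explicit fiberwise computation. The following is a sketch under the assumption that $L_p(v) = -\frac{1}{p}(-g(v,v))^{p/2}$ on Int(C); the shorthand $f(v) := -g(v,v)$ and the coefficient λ are notation introduced here for illustration, and D denotes differentiation in the fiber variable v.

```latex
% Assumption: L_p(v) = -\frac{1}{p} f(v)^{p/2} on Int(C), with f(v) := -g(v,v) > 0.
\begin{align*}
DL_p(v)[w] &= f(v)^{\frac{p}{2}-1}\, g(v,w), \\
D^2L_p(v)[w,w] &= f(v)^{\frac{p}{2}-1}\, g(w,w) + (2-p)\, f(v)^{\frac{p}{2}-2}\, g(v,w)^2 .
\end{align*}
% Decompose w = \lambda v + w^\perp with g(v,w^\perp) = 0, so that
% g(w,w) = -\lambda^2 f(v) + g(w^\perp,w^\perp) and g(v,w) = -\lambda f(v):
\[
D^2L_p(v)[w,w]
= f(v)^{\frac{p}{2}-1}\, g(w^\perp,w^\perp) + (1-p)\,\lambda^2 f(v)^{\frac{p}{2}} \;\ge\; 0 .
\]
```

Since v is time-like, its g-orthogonal complement is space-like, so $g(w^\perp, w^\perp) > 0$ unless $w^\perp = 0$; together with p < 1 this gives equality only for w = 0. For p = 1 the second term drops out and convexity degenerates in the direction of v.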
Proof. It is clear from its very definition that the restriction of $L_p$ to Int(C) is smooth. A direct computation gives (2.4) [...] Fix v ∈ Int(C). Decompose $w \in T_xM$ into $w^\parallel$, the part parallel to v, and $w^\perp$, the part orthogonal to v, all with respect to g. Then we have [...] Since $g(w^\perp, w^\perp) \ge 0$ and p < 1 we have [...]

We define the Lagrangian action $A_p$ associated to $L_p$ as follows: $A_p(\gamma) := \int_0^1 L_p(\dot\gamma_t)\, dt$ for γ ∈ AC([0, 1], M).

Lemma 2.2. Any $A_p$-minimizer with finite action is either a future pointing time-like geodesic of (M, g) or a future pointing light-like pregeodesic of (M, g), i.e. an orientation preserving reparametrization is a future pointing light-like geodesic of (M, g).
Proof. Let γ : [0, 1] → M be an $A_p$-minimizer with finite action. Then $\dot\gamma(t) \in C$ for a.e. t. By Jensen's inequality we have [...] for any causal curve η : [0, 1] → M, with equality if and only if η is parametrized proportionally to arclength. Recall that the restriction of a minimizer to any subinterval of [0, 1] is a minimizer of the restricted action. Since any point in a space-time admits a globally hyperbolic neighborhood, see [64, Theorem 2.14], the Avez-Seifert Theorem [67, Proposition 14.19] implies that every minimizer of $A_1$ with finite action is a causal pregeodesic.
Combining both points we see that if the action of γ is negative, the curve is a time-like pregeodesic parameterized with respect to constant arclength, i.e. a time-like geodesic. If the action of γ vanishes, the curve is a light-like pregeodesic.
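The Jensen step in the argument above can be made explicit. The following is a sketch assuming the normalization $L_p(v) = -\frac{1}{p}(-g(v,v))^{p/2}$ on C; the symbol ℓ for the speed of a causal curve η is introduced here for illustration only.

```latex
% Assumptions: L_p(v) = -\frac{1}{p}(-g(v,v))^{p/2} on C;
% \ell(t) := \sqrt{-g(\dot\eta_t,\dot\eta_t)} \ge 0 for a causal curve \eta : [0,1] \to M.
% Jensen's inequality for the concave function s \mapsto s^p, p \in (0,1), on [0,\infty):
\[
\int_0^1 \ell(t)^{p}\,dt \;\le\; \Big(\int_0^1 \ell(t)\,dt\Big)^{p}.
\]
% Multiplying by -1/p < 0 reverses the inequality:
\[
A_p(\eta) = -\frac{1}{p}\int_0^1 \ell(t)^{p}\,dt
\;\ge\; -\frac{1}{p}\Big(\int_0^1 \ell(t)\,dt\Big)^{p}
= -\frac{1}{p}\,\big(-A_1(\eta)\big)^{p}.
\]
```

Because $s \mapsto s^p$ is strictly concave, equality holds exactly when ℓ is constant almost everywhere, i.e. when η is parametrized proportionally to arclength. Minimizing $A_p$ therefore amounts to maximizing the Lorentz length $-A_1$ and then choosing the constant speed parametrization.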
Consider the cost function relative to the p-action $A_p$:

$c_p(x, y) := \inf\{A_p(\gamma) : \gamma \in AC([0,1], M),\ \gamma_0 = x,\ \gamma_1 = y\}.$

Remark 2.3. We will always assume that: [...]

Proposition 2.4. Fix p ∈ (0, 1) and let (M, g, C) be a space-time. Then every point has a neighborhood U such that the following holds for the space-time $(U, g|_U, C|_U)$. For every pair of points x, y ∈ U with $(x, y) \in J^+_U$, the causal relation of $(U, g|_U, C|_U)$, there exists a curve γ : [0, 1] → U with $\gamma_0 = x$, $\gamma_1 = y$, and minimizing $A_p$ among all curves η ∈ AC([0, 1], M) with $\eta_0 = x$ and $\eta_1 = y$. Moreover γ is a constant speed geodesic for the metric g, $\dot\gamma \in C$ whenever the tangent vector exists, and $A_p(\gamma) \in \mathbb{R}$.
Proof. It is well known that in a space-time every point has a globally hyperbolic neighborhood. Let U be such a neighborhood. If $(x, y) \in J^+_U$ there exists a curve with finite action $A_p$ between x and y. At the same time the action is bounded from below, e.g. by a steep Lyapunov function, see [17]. Therefore any minimizer γ : [0, 1] → U has finite action, i.e. $\dot\gamma(t) \in C$ for almost all t. By Jensen's inequality we have [...] for any causal curve η : [0, 1] → U, with equality if and only if η is parametrized proportionally to arclength. By the Avez-Seifert Theorem [67, Proposition 14.19] every minimizer of the right hand side is a causal pregeodesic. Combining both it follows that every $A_p$-minimizer is a causal geodesic.

2.3. Ricci curvature and Jacobi equation. We now fix the notation regarding curvature for a Lorentzian manifold (M, g) of dimension n ≥ 2.
Denoting by ∇ the Levi-Civita connection of (M, g), the Riemann curvature tensor is defined by [...] where X, Y, Z are smooth vector fields on M and [X, Y] is the Lie bracket of X and Y.
For each x ∈ M, the Ricci curvature is a symmetric bilinear form $\mathrm{Ric}_x$ : [...] Given an endomorphism $U : T_xM \to T_xM$ and a g-orthonormal basis $\{e_i\}_{i=1,\dots,n}$ of $T_xM$, we associate to U the matrix [...] The trace $\mathrm{Tr}_g(U)$ and the determinant $\mathrm{Det}_g(U)$ of the endomorphism U with respect to the Lorentzian metric g are by definition the trace $\mathrm{tr}(U_{ij})$ and the determinant $\det(U_{ij})$ of the matrix $(U_{ij})_{i,j=1,\dots,n}$, respectively. It is standard to check that such a definition is independent of the chosen orthonormal basis of $T_xM$.

A vector field J along a geodesic γ is said to be a Jacobi field if it satisfies the Jacobi equation: [...]

2.4. The q-gradient of a function. Finally let us recall the definition of gradient and Hessian. Given a smooth function f : M → R, the gradient of f, denoted by $\nabla_g f$, is defined by the identity $g(\nabla_g f, Y) = df(Y)$ for every Y ∈ TM, where df is the differential of f. The Hessian of f, denoted by $\mathrm{Hess}_f$, is defined to be the covariant derivative of df: $\mathrm{Hess}_f := \nabla(df)$. It is related to the gradient through the formula $\mathrm{Hess}_f(X, Y) = g(\nabla_X \nabla_g f, Y)$ and satisfies the symmetry $\mathrm{Hess}_f(X, Y) = \mathrm{Hess}_f(Y, X)$.

Next we recall some notions for the causal character of functions. [...]

Notice that, since p ranges in (0, 1), q ranges in (−∞, 0). In order to describe the optimal transport maps later in the paper, it is useful to introduce the q-gradient (cf. [49]) [...] if and only if [...] The motivation for the use of the q-gradient comes from the Hamiltonian formulation of the dynamics; let us briefly mention a few key facts that will play a role later in the paper.
[...] be the Legendre transform of $L_p$. Denote by $g^*$ the dual Lorentzian metric on $T^*M$ and by $C^* \subset T^*M$ the dual cone field to C. Then $H_p$ satisfies [...] for (p − 1)(q − 1) = 1. By computations analogous to those performed in the proof of Lemma 2.1, one can check that [...]. By well known properties of the Legendre transform (see for instance [21, Theorem A.2.5]) it follows that $DH_p$ is invertible on $\mathrm{Int}(C^*)$ with inverse given by $DL_p$. Thus (2.17) is equivalent to [...]

2.5. $c_p$-concave functions and regular $c_p$-optimal dynamical plans. We denote by P(M) the set of Borel probability measures on M. For any $\mu_1, \mu_2 \in P(M)$, we say that a Borel probability measure π ∈ P(M × M) is a coupling of $\mu_1$ and $\mu_2$ if $(p_i)_\sharp \pi = \mu_i$, i = 1, 2, where $p_1, p_2 : M \times M \to M$ are the projections onto the first and second coordinate. Recall that the push-forward $(p_1)_\sharp \pi$ is defined by $(p_1)_\sharp \pi(A) := \pi(p_1^{-1}(A))$ for any Borel subset A ⊂ M. The set of couplings of $\mu_1, \mu_2$ is denoted by $\mathrm{Cpl}(\mu_1, \mu_2)$. The $c_p$-cost of a coupling π is given by

$\int_{M\times M} c_p(x,y)\, d\pi(x,y).$

Denote by $C_p(\mu_1, \mu_2)$ the minimal cost relative to $c_p$ among all couplings from $\mu_1$ to $\mu_2$, i.e.

$C_p(\mu_1,\mu_2) := \inf\Big\{ \int_{M\times M} c_p\, d\pi : \pi \in \mathrm{Cpl}(\mu_1,\mu_2) \Big\}.$
If C p (µ 1 , µ 2 ) ∈ R, a coupling achieving the infimum is said to be c p -optimal.
We next define the notion of c p -optimal dynamical plan. To this aim, it is convenient to consider the set of A p -minimizing curves, denoted by Γ p . The set Γ p is endowed with the sup metric induced by the auxiliary Riemannian metric h. It will be useful to consider the maps for t ∈ [0, 1]: A c p -optimal dynamical plan is a probability measure Π on Γ p such that (e 0 , e 1 ) Π is a c p -optimal coupling from µ 0 := (e 0 ) Π to µ 1 := (e 1 ) Π. We will mostly be interested in c p -optimal dynamical plans obtained by "exponentiating the q-gradient of a c p -concave function", what we will call regular c p -optimal dynamical plans. In order to define them precisely, let us first recall some basics of Kantorovich duality (we adopt the convention of Note that i.e. φ is a causal function. The same argument gives that −φ cp is a causal function as well. Definition 2.5 (Regular c p -optimal dynamical plan). A c p -optimal dynamical plan Π ∈ P(Γ p ) is regular if the following holds. There exists U, V ⊂ M relatively compact open subsets and a smooth c pconcave (with respect to (U, V )) function φ 1/2 : U → R such that where Inj g (U ) is the injectivity radius of g on U ; Roughly, the above notion of regularity asks that the A p -minimizing curves performing the optimal transport from µ 0 := (e 0 ) Π to µ 1 := (e 1 ) Π have velocities contained in K, i.e. they are all "uniformly" time-like future pointing. Moreover it also implies that ∪ t∈[0,1] supp(µ t ) ⊂ M is compact; in addition the optimal transport is assumed to be driven by a smooth potential φ 1/2 . Even if these conditions may appear a bit strong, we will prove in Lemma 3.2 that there are a lot of such regular plans; moreover in the paper we will show that it is enough to consider such particular optimal transports in order to characterize upper and lower bounds on the (causal-)Ricci curvature and thus characterize the solutions of Einstein equations.

3. Existence, regularity and evolution of Kantorovich potentials
In order to characterize Lorentzian Ricci curvature upper bounds, the next proposition will be useful: it concerns the evolution of Kantorovich potentials along a regular $A_p$-minimizing curve of probability measures $(\mu_t)_{t\in[0,1]}$ given by exponentiating the q-gradient of a smooth $c_p$-concave function with time-like gradient. To this aim it is convenient to consider, for 0 ≤ s < t ≤ 1, the restricted minimal action [...]
The fact that t → Ψ t 1/2 is a smooth 1-parameter family of maps performing c p -optimal transport gives that φ defined in (3.19) In particular it holds Since by construction everything is defined inside the injectivity radius and all the transport rays are non-constant, from (3.5) (respectively (3.6)) it is manifest that the map (t, Step 2: validity of the Hamilton-Jacobi equation (3.2). We consider t ∈ (1/2, 1], the case t ∈ [0, 1/2] being analogous. Fix y = Ψ t 1/2 (x) for some arbitrary x ∈ U and t ∈ (1/2, 1], and let γ : Dividing by s and taking the limit for s → 0, we obtain Note that equality holds for denote the Legendre transform of L p . Thus we get Recalling that H p has the representation (2.16), we have which, together with (3.7), implies (3.2).
Step 3: validity of (3.3). Since $\Psi^t_{1/2}$ is a smooth 1-parameter family of maps performing $c_p$-optimal transport and the function φ defined in (3.19) is smooth, it coincides with the viscosity solution (resp. backward solution) [...] for every s ∈ (0, 1), $y \in \Psi^t_{1/2}(U)$. Let us discuss the case t ∈ (s, 1], the other is analogous. From (3.4) it follows that $\Psi^s_{1/2}(x)$ is a maximum point in the right hand side of (3.8). [...] By construction $\frac{d}{ds}\Psi^s_{1/2}(x) \in \mathrm{Int}(C)$ and, as already observed, $DL_p$ is invertible on Int(C) with inverse given by $DH_p$. We conclude that [...]

We next show that for every point $\bar x \in M$ and every $v \in C_{\bar x}$ "small enough" we can find a smooth $c_p$-concave function φ defined on a neighbourhood of $\bar x$, such that $\nabla^q_g \phi = v$ and the Hessian of φ vanishes at $\bar x$. This is well known in the Riemannian setting (e.g. [81, Theorem 13.5]) and should be compared with the recent paper by McCann [62] in the Lorentzian framework. The second part of the next lemma shows that the class of regular $c_p$-optimal dynamical plans is non-empty, and actually rather rich.

Lemma 3.2. [...]
(1) For every s ∈ (0, ε), for every $C^2$ function φ : M → R satisfying (3.9) $\nabla^q_g \phi(\bar x) = sv$, $\mathrm{Hess}_\phi(\bar x) = 0$, there exists a neighbourhood $U_{\bar x}$ of $\bar x$ and a neighbourhood $U_{\bar y}$ of $\bar y := \exp^g_{\bar x}(sv)$ such that φ is $c_p$-concave relatively to $(U_{\bar x}, U_{\bar y})$.
Proof. (1) Calling $\bar y = \bar y(sv) := \exp^g_{\bar x}(sv)$, notice that $\nabla^q_g\phi(\bar x) = sv$ is equivalent to (3.10) $d\phi(\bar x) = D_x c_p(\bar x,\bar y)$, where $D_x c_p(\bar x,\bar y)$ denotes the differential at $\bar x$ of the function $x \mapsto c_p(x,\bar y)$. Indeed, a computation shows that $D_x c_p(\bar x,\bar y) = -DL_p(sv)$ and thus the claim follows from (2.18). Let $\phi : M \to \mathbb{R}$ be any smooth function satisfying In what follows we denote with $\mathrm{Hess}_{x,c_p}(\bar x,\bar y)$ (resp. $\mathrm{Hess}_{v,L_p}(sv)$) the Hessian of the function $x \mapsto c_p(x,\bar y)$ evaluated at $x = \bar x$ (resp. the Hessian of the function $T_{\bar x}M \ni w \mapsto L_p(sv+w)$). By taking normal coordinates centred at $\bar x$ one can check that the operator norm $\|\mathrm{Hess}_{x,c_p}(\bar x,\bar y) - \mathrm{Hess}_{v,L_p}(sv)\| \to 0$ as $t \to 0$.
Differentiating the last equation in y atȳ and using that Hess φ (x) = 0, we obtain is the covector associated to sv.
For every fixed $y \in U_{\bar y}$, the function $U_{\bar x} \ni x \mapsto c_p(x,y) - \phi(x) - u(y)$ vanishes at $x = F(y)$; moreover, from (3.12), it follows that $x = F(y)$ is the strict global minimum of such a function on $U_{\bar x}$, possibly after further reducing $U_{\bar x}$ and $U_{\bar y}$. In other words, the function $U_{\bar x} \times U_{\bar y} \ni (x,y) \mapsto c_p(x,y) - \phi(x) - u(y)$ is always non-negative and vanishes exactly on the graph of $F$. It follows that (3.14) $\phi(x) = \inf_{y \in U_{\bar y}} c_p(x,y) - u(y)$ for every $x \in U_{\bar x}$, i.e. $\phi : U_{\bar x} \to \mathbb{R}$ is a smooth $c_p$-concave function relative to $(U_{\bar x}, U_{\bar y})$ satisfying (3.9).
Since by construction $c_p : U_{\bar x} \times U_{\bar y} \to \mathbb{R}$ is smooth, by classical optimal transport theory it is well known that the $c_p$-superdifferential $\partial^{c_p}\phi \subset U_{\bar y}$ is $c_p$-cyclically monotone (see for instance [1, Theorem 1.13]). Therefore, in order to have (3.15), it is enough to prove (3.16). Let us first show that $\partial^{c_p}\phi(x) \neq \emptyset$ for every $x \in U_{\bar x}$. From the proof of part (1), there exists a smooth diffeomorphism $F : U_{\bar y} \to U_{\bar x}$ such that (3.17) holds. From the definition of $\phi^{c_p}$ in (2.19), it is readily seen that $\phi^{c_p} = u$ on $U_{\bar y}$. Thus (3.17) combined with (2.21) gives that $y \in \partial^{c_p}\phi(F(y))$ for every $y \in U_{\bar y}$ or, equivalently, where $w \in \mathrm{Int}(C_x)$ is such that $y = \exp^g_x(w)$, which by (2.17) is equivalent to $w = DH_p(-d\phi(x)) = \nabla^q_g\phi(x)$, which yields $y = \exp^g_x(w) = \exp^g_x(\nabla^q_g\phi(x))$, concluding the proof of (3.16).
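The identity $DH_p(-d\phi) = \nabla^q_g\phi$ used in the last step can be checked by hand. The following is only a sketch: it assumes that the Lagrangian has the form $L_p(v) = -\frac{1}{p}(-g(v,v))^{p/2}$ on $\mathrm{Int}(C)$ and that $q$ is the conjugate exponent of $p$, i.e. $\frac1p + \frac1q = 1$. Both conventions are taken as assumptions here, since they are not restated in this part of the text. Since $DH_p$ is the inverse of $DL_p$ on $\mathrm{Int}(C)$, it suffices to verify that $-d\phi = DL_p(v)$ forces $\nabla^q_g\phi = v$.

```latex
% Assumptions of this sketch (not restated in this section):
%   L_p(v) = -(1/p) (-g(v,v))^{p/2} for v in Int(C),
%   1/p + 1/q = 1, i.e. (p-1) q = p.
\begin{align*}
DL_p(v)[w] &= (-g(v,v))^{\frac{p-2}{2}}\, g(v,w), \\
-d\phi = DL_p(v) \;&\Longleftrightarrow\;
  \nabla_g\phi = -(-g(v,v))^{\frac{p-2}{2}}\, v, \\
|g(\nabla_g\phi,\nabla_g\phi)|
  &= (-g(v,v))^{p-2}\,(-g(v,v)) = (-g(v,v))^{p-1}, \\
\nabla^q_g\phi
  &= -|g(\nabla_g\phi,\nabla_g\phi)|^{\frac{q-2}{2}}\,\nabla_g\phi
   = (-g(v,v))^{\frac{(p-1)(q-2)}{2}+\frac{p-2}{2}}\, v
   = v .
\end{align*}
% The exponent vanishes because (p-1)(q-2) = p - 2(p-1) = 2 - p.
```

Under these assumptions the $q$-gradient is exactly the velocity that the Legendre transform associates with the covector $-d\phi$, which is why the transport rays emanate in the direction $\nabla^q_g\phi$.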
Step 2. Up to further reducing the open set Ux and the scale parameter s > 0 in the definition of φ, we can assume that φ satisfies the assumptions of Proposition 3.1 and thatφ : is still a Lyapunov function satisfying (3.20) − g(∇ q gφ , ∇ q gφ ) 1/2 < Inj g (Ψ 1/2 (Ux)).
It is easily seen that Moreover, thanks to (3.20), the curve [0, 1] t → exp z (t − 1)∇ q gφ (z) , ∀x ∈ Ux, z := Ψ 1/2 (x), is still a g-geodesic, solving the method of characteristics associated to the optimal transport problem (see for instance [21,Ch. 5.1] and [81,Ch. 7]; this is actually a variation of step 1 and of the proof of Proposition 3.1). It follows that the map , ∀x ∈ Ux, z := Ψ 1/2 (x) is a c p -optimal transport map. In other words, for everyμ ∈ P(M ) with suppμ ⊂ Ψ 1/2 (Ux), is a c p -optimal coupling for its marginals. We conclude that (e 0 , e 1 ) Π = (Ξ, Id) (e 1 ) Π is a c p -optimal coupling for its marginals and thus Π is a c p -optimal dynamical plan.
We next establish some basic properties of c p -optimal dynamical plans which will turn out to be useful for the OT-characterization of Lorentzian Ricci curvature upper and lower bounds.
(1) By construction, $\phi$ is smooth on $U$ and $g(\nabla_g\phi,\nabla_g\phi) < 0$. Thus also $\nabla^q_g\phi : M \to TM$ is a smooth section of the tangent bundle and the symmetry of the endomorphism $\nabla\nabla^q_g\phi(x) : T_xM \to T_xM$ follows by Schwarz's Lemma.
(3) is a straightforward consequence of the change of variable formula.
It will be convenient to consider the matrix of Jacobi fields along the geodesic $t \mapsto \gamma_t := \Psi_{1/2}^t(x)$; recalling (2.12), $B_t(x)$ satisfies the Jacobi equation (3.23) $\nabla_t\nabla_t B_t + R(B_t,\dot\gamma_t)\dot\gamma_t = 0$, where we denoted $\nabla_t := \nabla_{\dot\gamma_t}$ for short. Since by Lemma 3.3 we know that $B_t$ is non-singular for all $x \in \mathrm{supp}\,\mu_{1/2}$, we can define (3.24) $U_t := \nabla_t B_t \circ B_t^{-1}$ for all $x \in \mathrm{supp}\,\mu_{1/2}$. The next proposition will be key in the proof of the lower bounds on causal Ricci curvature. It is well known in Riemannian and Lorentzian geometry, see for instance [28, Lemma 3.1] and [30]; in any case we report a proof for the reader's convenience.
Proposition 3.4. Let $U_t$ be defined in (3.24). Then $U_t$ is a symmetric endomorphism of $T_{\gamma_t}M$ (i.e. the matrix $(U_t)_{ij}$ with respect to an orthonormal basis is symmetric) and it holds Taking the trace with respect to $g$ yields

Proof. Using (3.23) we get Taking the trace with respect to $g$ yields the second identity. The rest of the proof is devoted to showing (3.26). Let $(e_i(t))_{i=1,\dots,n}$ be an orthonormal basis of $T_{\gamma_t}M$ parallel along $\gamma$. Setting $y(t) = \log\det B_t$, we have that We next show that $U_t$ is a symmetric endomorphism of $T_{\gamma_t}M$, i.e. the matrix $(U_t)_{ij}$ is symmetric. To this aim, calling $U_t^*$ the adjoint, we observe that (3.28) . Now the Jacobi equation (3.23) reads where $R(t) : T_{\gamma_t}M \to T_{\gamma_t}M$, $R(t)[v] := R(v,\dot\gamma_t)\dot\gamma_t$, is symmetric; indeed, in the orthonormal basis $(e_i(t))_{i=1,\dots,n}$, it is represented by the symmetric matrix $\big(g(e_i(t),e_j(t))\, g(R(e_i(t),\dot\gamma_t)\dot\gamma_t, e_j(t))\big)_{i,j=1,\dots,n}$. Plugging (3.30) into (3.29), we obtain that $\nabla\nabla^q_g\phi$ is symmetric by assertion (1) in Lemma 3.3. Taking into account (3.28), we conclude that $U_t$ is symmetric for every $t \in [0,1]$. Using that $U_t$ is symmetric, by the Cauchy-Schwarz inequality, we have that The desired estimate (3.26) then follows from the combination of (3.25), (3.27) and (3.31).
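For orientation, the chain of identities behind this proof can be sketched as follows. The sketch assumes that (3.24) defines $U_t = (\nabla_t B_t)\circ B_t^{-1}$ and that the Jacobi equation (3.23) has the standard form $\nabla_t\nabla_t B_t + R(t)B_t = 0$, with $R(t)[v] = R(v,\dot\gamma_t)\dot\gamma_t$ as in the proof. The displayed forms are the standard Riccati-type computation and are meant as an illustration, not as a verbatim restatement of (3.25)-(3.31).

```latex
% Sketch under the assumptions
%   U_t = (\nabla_t B_t) B_t^{-1},
%   \nabla_t \nabla_t B_t + R(t) B_t = 0,   R(t)[v] = R(v,\dot\gamma_t)\dot\gamma_t.
\begin{align*}
\nabla_t U_t
  &= (\nabla_t\nabla_t B_t)\,B_t^{-1}
     - (\nabla_t B_t)\,B_t^{-1}\,(\nabla_t B_t)\,B_t^{-1}
   = -R(t) - U_t^2, \\
y'(t) &= \frac{d}{dt}\log\det B_t
       = \operatorname{tr}\!\big((\nabla_t B_t)\,B_t^{-1}\big)
       = \operatorname{tr}(U_t), \\
y''(t) &= \operatorname{tr}(\nabla_t U_t)
        = -\operatorname{tr}(U_t^2) - \operatorname{Ric}(\dot\gamma_t,\dot\gamma_t), \\
\operatorname{tr}(U_t^2) &\ge \tfrac{1}{n}\,\big(\operatorname{tr} U_t\big)^2
  \qquad \text{(Cauchy-Schwarz, using that the matrix of $U_t$ is symmetric)}, \\
y''(t) + \tfrac{1}{n}\,\big(y'(t)\big)^2
  + \operatorname{Ric}(\dot\gamma_t,\dot\gamma_t) &\le 0 .
\end{align*}
```

The last line is the form in which a pointwise Ricci bound along $\gamma$ controls the second derivative of $\log\det B_t$, and hence, through the change of variable formula, the behaviour of the entropy along the transport.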

Optimal transport formulation of the Einstein equations
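The Einstein equations of General Relativity for an $n$-dimensional spacetime $(M^n, g, C)$, $n \geq 3$, read as (4.1) $\mathrm{Ric} - \frac{1}{2}\,\mathrm{Scal}\, g + \Lambda g = 8\pi T$, where $\mathrm{Scal}$ is the scalar curvature, $\Lambda \in \mathbb{R}$ is the cosmological constant, and $T$ is the energy-momentum tensor. They can be equivalently written as $\mathrm{Ric} = \frac{2\Lambda}{n-2}\, g + 8\pi T - \frac{8\pi}{n-2}\,\mathrm{Tr}_g(T)\, g$.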

Proof.
Taking the trace of (4.1), one can express the scalar curvature as $\mathrm{Scal} = \frac{2n\Lambda - 16\pi\,\mathrm{Tr}_g(T)}{n-2}$. Substituting back into (4.1) gives the claim.

The optimal transport formulation of the Einstein equations will consist separately of an optimal transport characterization of the two inequalities $\mathrm{Ric} \geq \tilde T$ and $\mathrm{Ric} \leq \tilde T$, where (4.4) $\tilde T := \frac{2\Lambda}{n-2}\, g + 8\pi T - \frac{8\pi}{n-2}\,\mathrm{Tr}_g(T)\, g$. A key role in such an optimal transport formulation will be played by the (relative) Boltzmann-Shannon entropy defined below. Denote by $\mathrm{vol}_g$ the standard volume measure on $(M,g)$. Given an absolutely continuous probability measure $\mu = \varrho\, \mathrm{vol}_g$ with density $\varrho \in C_c(M)$, define its Boltzmann-Shannon entropy (relative to $\mathrm{vol}_g$) as $\mathrm{Ent}(\mu|\mathrm{vol}_g) := \int_M \varrho \log\varrho \, d\mathrm{vol}_g$.

so that for all $t \in (0,1)$ one has In particular, if $f \equiv c \in \mathbb{R}$ then

The characterization of Ricci curvature lower bounds (i.e. $\mathrm{Ric} \geq Kg$ for some constant $K \in \mathbb{R}$) via displacement convexity of the entropy is by now classical in the Riemannian setting; let us briefly recall the key contributions. Otto & Villani [69] gave a nice heuristic argument for the implication "$\mathrm{Ric} \geq Kg \Rightarrow K$-convexity of the entropy"; this implication was proved for $K = 0$ by Cordero-Erausquin, McCann & Schmuckenschläger [27]; the equivalence for every $K \in \mathbb{R}$ was then established by Sturm & von Renesse [77]. Our optimal transport characterization of $\mathrm{Ric} \geq \frac{2\Lambda}{n-2}\, g + 8\pi T - \frac{8\pi}{n-2}\,\mathrm{Tr}_g(T)\, g$ is inspired by such fundamental papers (compare also with [51] for the implication (3)⇒(1)). Let us also mention that the characterization of $\mathrm{Ric} \geq Kg$ for $K \geq 0$ via displacement convexity in the globally hyperbolic Lorentzian setting has recently been obtained independently by McCann [62]. Note that Corollary 4.4 extends such a result to any lower bound $K \in \mathbb{R}$ and to the case of general (possibly non globally hyperbolic) space-times.
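For reference, the trace computation relating (4.1) to the tensor $\frac{2\Lambda}{n-2}\, g + 8\pi T - \frac{8\pi}{n-2}\,\mathrm{Tr}_g(T)\, g$ can be spelled out as follows. The sketch assumes that (4.1) is the Einstein equation in its standard form $\mathrm{Ric} - \frac12\,\mathrm{Scal}\, g + \Lambda g = 8\pi T$, and it uses $\mathrm{Tr}_g(g) = n$ and $\mathrm{Tr}_g(\mathrm{Ric}) = \mathrm{Scal}$; the assumption $n \geq 3$ is what allows dividing by $n-2$.

```latex
% Trace of  Ric - (1/2) Scal g + \Lambda g = 8\pi T  (assumed form of (4.1)),
% with Tr_g(g) = n and Tr_g(Ric) = Scal.
\begin{align*}
\mathrm{Scal} - \tfrac{n}{2}\,\mathrm{Scal} + n\Lambda = 8\pi\,\mathrm{Tr}_g(T)
 \;&\Longrightarrow\;
 \mathrm{Scal} = \frac{2n\Lambda - 16\pi\,\mathrm{Tr}_g(T)}{n-2}, \\
\mathrm{Ric}
 &= 8\pi T + \tfrac12\,\mathrm{Scal}\, g - \Lambda g \\
 &= 8\pi T + \frac{n\Lambda - 8\pi\,\mathrm{Tr}_g(T)}{n-2}\, g - \Lambda g \\
 &= \frac{2\Lambda}{n-2}\, g + 8\pi T - \frac{8\pi}{n-2}\,\mathrm{Tr}_g(T)\, g .
\end{align*}
```

In particular, for $T \equiv 0$ the right hand side reduces to $Kg$ with $K = \frac{2\Lambda}{n-2}$, which is the constant appearing in the vacuum case.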
The next general result will be applied with $n \geq 3$ and $\tilde T$ as in (4.4). (1) $\mathrm{Ric}(v,v) \geq \tilde T(v,v)$ for every causal vector $v \in C$.

By applying Lemma 4.2 we get that
where, in the equality, we used that for every fixed $x \in B_\delta(x_0)$ the function $t \mapsto g(\dot\gamma^x_t,\dot\gamma^x_t)$ is constant (as $t \mapsto \gamma^x_t$ is by construction a $g$-geodesic). This clearly contradicts (4.10), as $\int g(\dot\gamma,\dot\gamma)\, d\Pi(\gamma) < 0$.
In the vacuum case, i.e. $T \equiv 0$, the inequality $\mathrm{Ric} \geq \tilde T$ with $\tilde T$ as in (4.4) reads as $\mathrm{Ric} \geq Kg$ with $K = \frac{2\Lambda}{n-2} \in \mathbb{R}$. Note that for $v \in C$ it holds $g(v,v) \leq 0$ so, when comparing the next result with its Riemannian counterparts [69, 27, 77], the sign of the lower bound $K$ is reversed. (2) For every $p \in (0,1)$, for every regular dynamical $c_p$-optimal plan $\Pi$ it holds where we denoted $\mu_i := (e_i)_\sharp\Pi$, $i = 0, 1$, the endpoints of the curve of probability measures associated to $\Pi$.
Remark 4.5 (The strong energy condition). The strong energy condition asserts that, called T the energy-momentum tensor, it holds T (v, v) ≥  [71,45,46], plays a key role in general relativity. For instance, in the presence of trapped surfaces, it implies that the space-time has singularities (e.g. black holes) [31,83].

4.2. OT-characterization of $\mathrm{Ric} \leq \tilde T$. The goal of the present section is to provide an optimal transport formulation of upper bounds on time-like Ricci curvature in the Lorentzian setting. More precisely, given a quadratic form $\tilde T$ (which will later be chosen to be equal to the right hand side of the Einstein equations, i.e. as in (4.4)), we aim to find an optimal transport formulation of the condition $\mathrm{Ric}(v,v) \leq \tilde T(v,v)$ for every $v \in C$. The Riemannian counterpart, in the special case of $\mathrm{Ric} \leq Kg$ for some constant $K \in \mathbb{R}$, has been recently established by Sturm [76].
In order to state the result, let us fix some notation. Given a relatively compact open subset E ⊂⊂ Int(C) let p T M →M : T M → M be the canonical projection map and inj g (E) > 0 be the injectivity radius of the exponential map of g restricted to E. For x ∈ p T M →M (E) and r ∈ (0, inj g (E)) we denote x (rv). The next general result will be applied with n ≥ 3 andT as in (4.4).
the next assertion holds. For every r ∈ (0, R), there exists an rconcentrated regular c p -optimal dynamical plan Π = Π(x, v, r) in the direction of v (with respect to E) which hasT (v, v)-concave entropy in the sense that There exists p ∈ (0, 1) such that the analogous assertion as in (2) holds true. Moreover, both in (2) and (2') one can replace B g,E r 4 (x) (resp. {exp g y (r 2 w) : w ∈ T y M ∩ C, g(w, w) = −1}) by B h r 4 (x) (resp. B h r 2 (y)).

Proof. (1)⇒ (2)
Let (M, g, C) be a space time and let h be an auxiliary Riemannian metric on M such that We denote with d T M h the distance on T M induced by the auxiliary Riemannian metric h. Once the compact subset E ⊂⊂ Int(C) is fixed, thanks to Lemma 3.2 there exist a constant R = R(E) ∈ (0, min(1, inj g (E))) and a function we can find a c p -convex function φ : M → R with the following properties: 10r (x). For t ∈ [0, 1], consider the map Ψ t 1/2 : z → exp z (r(t − 1/2)∇ q g φ(z)). Notice that Ψ t 1/2 (B g,E r 4 (x)) ⊂ B h 10r (x), ∀t ∈ [0, 1]. Let µ 1/2 = vol g (B g,E r 4 (x)) −1 vol g B g,E r 4 (x) and define µ t := (Ψ t 1/2 ) (µ 1/2 ) ∀t ∈ [0, 1]. By the properties of φ, the plan Π representing the curve of probability measures (µ t ) t∈[0,1] is a regular c p -optimal dynamical plan and supp(µ 1 ) ⊂ {exp g y (r 2 w) : w ∈ T y M ∩ C, g(w, w) = −1}. By Proposition 3.1 we can find a smooth family of functions The curve [0, 1] t → Ent(µ t |vol g ) ∈ R is smooth and, in virtue of (4.24), it satisfies is the q-Box of φ t (the Lorentzian analog of the q-Laplacian), and where we used the continuity equation For what follows it is useful to consider the linearization of the q-Box at a smooth function f , denoted by L q f and defined by the following relation: The map [0, 1] t →´M 2 q g φ t dµ t ∈ R is smooth and, in virtue of (4.23) and (4.24), it satisfies for every t ∈ [0, 1]. Using the q-Bochner identity (A.2) together with the assumption Ric(w, w) ≤T (w, w) for any w ∈ C and the estimates (4.25) on φ t , we can rewrite the last formula as d dt up to renaming (r) with a suitable function (2)⇒ (3): trivial.
In the vacuum case $T \equiv 0$, the inequality $\mathrm{Ric} \leq \tilde T$ with $\tilde T$ as in (4.4) reads as $\mathrm{Ric} \leq Kg$ with $K = \frac{2\Lambda}{n-2} \in \mathbb{R}$. Note that for $v \in C$ it holds $g(v,v) \leq 0$ so, when comparing the next result with its Riemannian counterpart [76], the sign of the bound $K$ is reversed.

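4.3. Optimal transport formulation of the Einstein equations. Recall that the Einstein equations of general relativity, with cosmological constant equal to $\Lambda \in \mathbb{R}$ and energy-momentum tensor $T$, read as (4.32) $\mathrm{Ric} - \frac{1}{2}\,\mathrm{Scal}\, g + \Lambda g = 8\pi T$ for an $n$-dimensional space-time $(M, g, C)$. Combining Theorem 4.3 with Theorem 4.7, both with the choice $\tilde T = \frac{2\Lambda}{n-2}\, g + 8\pi T - \frac{8\pi}{n-2}\,\mathrm{Tr}_g(T)\, g$, we obtain the following optimal transport formulation of (4.32).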
(3) For every $p \in (0,1)$ and for every relatively compact open subset $E \subset\subset \mathrm{Int}(C)$ there exist $R = R(E) \in (0,1)$ and a function the next assertion holds. For every $r \in (0,R)$, there exists an $r$-concentrated regular $c_p$-optimal dynamical plan $\Pi = \Pi(x,v,r)$ in the direction of $v$ (with respect to $E$) satisfying (4.33). (4) There exists $p \in (0,1)$ such that the analogous assertion as in (3) holds true.
Remark 4.11 ($\mu_{1/2}$ can be chosen more general). From the proof of Theorem 4.10 it follows that one can replace (2) (and analogously (3)) with the following (a priori stronger, but a fortiori equivalent) statement. For every $p \in (0,1)$ the following holds. For every relatively compact open subset $E \subset\subset \mathrm{Int}(C)$ there exist $R = R(E) \in (0,1)$ and a function the next assertion holds. For every $r \in (0,R)$ and every $\mu_{1/2} \in \mathcal{P}(M)$ with $\mu_{1/2} \ll \mathrm{vol}_g$ and $\mathrm{supp}(\mu_{1/2}) \subset B^{g,E}_{r^4}(x)$, setting $y = \exp^g_x(rv)$, there exists a regular $c_p$-optimal dynamical plan $\Pi = \Pi(\mu_{1/2}, v, r)$ with associated curve of probability measures

For every $r \in (0,R)$, there exists an $r$-concentrated regular $c_p$-optimal dynamical plan $\Pi = \Pi(x,v,r)$ in the direction of $v$ satisfying (4.33). (3) and (3') one can replace $B^{g,E}_{r^4}(x)$ by $B^h_{r^4}(x)$ and $\{\exp^g_y(r^2 w) : w \in T_yM \cap C,\ g(w,w) = -1\}$ by $B^h_{r^2}(y)$.

Remark 4.13 (The tensor $\tilde T$). As mentioned above, we will assume the cosmological constant $\Lambda$ and the energy-momentum tensor $T$ to be given, say from physics and/or mathematical general relativity. Given $g$, $\Lambda$ and $T$, for convenience of notation we set $\tilde T$ to be defined in (1.6). Let us stress that not every symmetric bilinear form $\tilde T$ would correspond to a physically meaningful situation; in order to be physically relevant, it is crucial that $\tilde T$ is given by (1.6) where $T$ is a physical energy-momentum tensor (in particular $T$ has to satisfy $\nabla^a T_{ab} = 0$, i.e. be "freely gravitating", it has to satisfy some suitable energy condition like the "dominant energy condition", etc.).

(2)⇒(3): From the implication (1)⇒(2) in Theorem 4.7, we get a regular $c_p$-optimal dynamical plan $\Pi = \Pi(x,v,r)$ as in (3) such that the upper bound in (4.33) holds. Moreover, from (4.25) it holds that (4.34) $d^{TM}_h(\dot\gamma_t, rv) \leq r\,\epsilon(r)$, for $\Pi$-a.e. $\gamma$, for all $t \in [0,1]$.
Recalling that the implication (1)⇒ (2) in Theorem 4.3 gives the convexity property (4.10) of the entropy along every regular c p -optimal dynamical plan, and using (4.34), we conclude that also the lower bound in (4.33) holds.
(4)⇒(2). The fact that $\mathrm{Ric}(v,v) \leq \tilde T(v,v)$ for all $v \in C$ follows directly from Theorem 4.7. The fact that $\mathrm{Ric}(v,v) \geq \tilde T(v,v)$ for all $v \in C$ can be shown following arguments already used in the paper; let us briefly discuss it. Fix $p \in (0,1)$ given by (4) and assume by contradiction that . Thanks to Lemma 3.2, up to replacing $v$ with $sv$ for some $s \in (0,1)$ small enough, we know that we can construct a $c_p$-convex function $\phi : M \to \mathbb{R}$, smooth in a neighbourhood of $x$ and satisfying Then, by continuity, we can find . By the properties of $\phi$, the plan $\Pi$ representing the curve of probability measures $(\mu_t)_{t\in[0,1]}$ is a regular $c_p$-optimal dynamical plan and $\mathrm{supp}(\mu_1) \subset \{\exp^g_y(r^2 w) : w \in T_yM \cap C,\ g(w,w) = -1\}$. Moreover We can now follow verbatim the arguments in (1)⇒(2) of Theorem 4.7 by using (4.35) and (4.36), obtaining a function $\epsilon(r) \to 0$ as $r \to 0$ such that The last inequality clearly contradicts the lower bound in (4.33).
In the vacuum case $T \equiv 0$ with cosmological constant $\Lambda \in \mathbb{R}$, the Einstein equations read as $\mathrm{Ric} = \frac{2\Lambda}{n-2}\, g$.

the next assertion holds. For every $r \in (0,R)$, there exists an $r$-concentrated regular $c_p$-optimal dynamical plan $\Pi = \Pi(x,v,r)$ in the direction of $v$ (with respect to $E$) satisfying (4.38) (3) There exists $p \in (0,1)$ such that the analogous assertion as in (2) holds true.
It is worth isolating the case of zero cosmological constant.
the next assertion holds. For every $r \in (0,R)$, there exists an $r$-concentrated regular $c_p$-optimal dynamical plan $\Pi = \Pi(x,v,r)$ in the direction of $v$ (with respect to $E$) satisfying (3) There exists $p \in (0,1)$ such that the analogous assertion as in (2) holds true.

Appendix A. A q-Bochner identity in Lorentzian setting
In this section we prove a Bochner type identity in the Lorentzian setting for the linearization of the q-Box operator, the Lorentzian analog of the q-Laplacian. Related results have been obtained in the Riemannian [59, 79] and Finsler settings [84, 85] but, to the best of our knowledge, this section is original in the Lorentzian $L^p$ framework.

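Denote by $\nabla^q_g\phi := -|g(\nabla_g\phi,\nabla_g\phi)|^{\frac{q-2}{2}}\,\nabla_g\phi$ the q-gradient, by $\Box^q_g\phi := \mathrm{div}(-\nabla^q_g\phi)$ the q-Box operator of $\phi$, and by $L^q_\phi$ the linearization of the q-Box operator at $\phi$, defined by the following relation: $L^q_\phi u := \frac{d}{dt}\big|_{t=0}\,\Box^q_g(\phi + tu)$. The ultimate goal of the section is to prove the following result.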
Proposition A.1. Under the above notation, the following q-Bochner identity holds: The proof of Proposition A.1 requires some preliminary lemmas. First of all we derive an explicit expression for the operator $L^q_\phi$.

Lemma A.2. Under the above notation, it holds Proof. By the very definitions of $L^q_\phi u$ and $\Box^q_g\phi$, we have In order to make the last formula explicit, compute $\nabla_g\, g(\nabla_g\phi,\nabla_g u) = \mathrm{Hess}_u(\nabla_g\phi) + \mathrm{Hess}_\phi(\nabla_g u)$. We next show a q-Bochner identity for the operator $A^q_\phi$ defined as

Lemma A.3. Under the above notation, the following identity holds: Proof. We perform the computation at an arbitrary point $x_0 \in U$. In order to simplify the computations, we consider normal coordinates $(x^i)$ in a neighbourhood of $x_0$ with $\frac{\partial}{\partial x^1} \in C$. It holds Now, from the symmetry of second order derivatives and the very definition of the Riemann tensor (2.9), we have Thus and we can rewrite (A.9) as We now compute the second part of $-L^q_\phi\Big(\frac{(-g(\nabla_g\phi,\nabla_g\phi))^{q/2}}{q}\Big)$. To this aim observe that It is useful to express $\Box^q_g$ in terms of $\Box_g$: Using (A.14), we can write . (A.17) Now, the combination of (A.17) and (A.8) yields

Appendix B. A synthetic formulation of Einstein's vacuum equations in a non-smooth setting

B.1. The Lorentzian synthetic framework. The goal of this appendix is to give a synthetic formulation of Einstein's vacuum equations (i.e. zero stress-energy tensor $T \equiv 0$ but possibly non-zero cosmological constant $\Lambda$) and to show its stability under a natural adaptation, for Lorentzian synthetic spaces, of the measured Gromov-Hausdorff convergence (classically designed as a notion of convergence for metric measure spaces). This appendix was written in early 2021, more than two years after the rest of the paper was posted on arXiv (in October 2018). One of the reasons is that we will build on top of the synthetic time-like Ricci curvature lower bounds in a non-smooth setting developed by the first author in collaboration with Cavalletti in [25] (posted on arXiv in 2020).
The general framework is given by Lorentzian pre-length/geodesic spaces. Let us start by recalling some basics of the theory, following the approach of Kunzinger-Sämann [54]. A causal space $(X, \ll, \leq)$ is a set $X$ endowed with a preorder $\leq$ and a transitive relation $\ll$ contained in $\leq$. We write $x < y$ if $x \leq y$ and $x \neq y$. When $x \ll y$ (resp. $x \leq y$), we say that $x$ and $y$ are time-like (resp. causally) related.
We define the chronological (resp. causal) future of a subset $E \subset X$ as $I^+(E) := \{y \in X : \exists\, x \in E,\ x \ll y\}$ and $J^+(E) := \{y \in X : \exists\, x \in E,\ x \leq y\}$, respectively. Analogously, we define $I^-(E)$ (resp. $J^-(E)$), the chronological (resp. causal) past of $E$. In case $E = \{x\}$ is a singleton, with a slight abuse of notation, we will write $I^\pm(x)$ (resp. $J^\pm(x)$) instead of $I^\pm(\{x\})$ (resp. $J^\pm(\{x\})$).
Recall that a metric space (X, d) is said to be proper if closed and bounded subsets are compact.
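Definition B.1 (Lorentzian pre-length space $(X, d, \ll, \leq, \tau)$). A Lorentzian pre-length space $(X, d, \ll, \leq, \tau)$ is a causal space $(X, \ll, \leq)$ additionally equipped with a proper metric $d$ and a lower semicontinuous function $\tau : X \times X \to [0,\infty]$, called time-separation function, satisfying (B.1) $\tau(x,y) + \tau(y,z) \leq \tau(x,z)$ for all $x \leq y \leq z$ (reverse triangle inequality), $\tau(x,y) = 0$ if $x \not\leq y$, and $\tau(x,y) > 0$ if and only if $x \ll y$. We will consider $X$ endowed with the metric topology induced by $d$.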
We say that $X$ is (resp. locally) causally closed if $\{x \leq y\} \subset X \times X$ is a closed subset (resp. if every point $x \in X$ has a neighbourhood $U$ such that $\{x \leq y\} \cap \bar U \times \bar U$ is closed in $\bar U \times \bar U$).
Throughout this appendix, $I \subset \mathbb{R}$ will denote an arbitrary interval. A non-constant curve $\gamma : I \to X$ is called (future-directed) time-like (resp. causal) if $\gamma$ is locally Lipschitz continuous (with respect to $d$) and if for all $t_1, t_2 \in I$, with $t_1 < t_2$, it holds $\gamma_{t_1} \ll \gamma_{t_2}$ (resp. $\gamma_{t_1} \leq \gamma_{t_2}$). The length of a causal curve is defined via the time separation function, in analogy to the theory of length metric spaces: for $\gamma : [a,b] \to X$ future-directed causal we set $L_\tau(\gamma) := \inf\big\{\sum_{i=0}^{N-1} \tau(\gamma_{t_i},\gamma_{t_{i+1}}) : a = t_0 < t_1 < \dots < t_N = b,\ N \in \mathbb{N}\big\}$.

A Lorentzian pre-length space $(X, d, \ll, \leq, \tau)$ is called
• non-totally imprisoning if for every compact set $K \Subset X$ there is a constant $C > 0$ such that the $d$-arc-length of all causal curves contained in $K$ is bounded by $C$;
• globally hyperbolic if it is non-totally imprisoning and for every $x, y \in X$ the set $J^+(x) \cap J^-(y)$ is compact in $X$;
• K-globally hyperbolic if it is non-totally imprisoning and for every pair of compact subsets $K_1, K_2 \Subset X$, the set $J^+(K_1) \cap J^-(K_2)$ is compact in $X$;
• geodesic if for all $x, y \in X$ with $x \leq y$ there is a future-directed causal curve $\gamma$ from $x$ to $y$ with $\tau(x,y) = L_\tau(\gamma)$, i.e. a (maximizing) geodesic from $x$ to $y$.
For a globally hyperbolic Lorentzian geodesic (actually length would suffice) space $(X, d, \ll, \leq, \tau)$, the time-separation function $\tau$ is finite and continuous, see [54, Theorem 3.28]. Moreover, any globally hyperbolic Lorentzian length space (for the definition of Lorentzian length space see [54, Definition 3.22]; we omit it for brevity since we will not use it) is geodesic [54, Theorem 3.30].
A measured Lorentzian pre-length space $(X, d, \mathfrak{m}, \ll, \leq, \tau)$ is a Lorentzian pre-length space endowed with a Radon non-negative measure $\mathfrak{m}$ with $\mathrm{supp}\,\mathfrak{m} = X$.
Examples entering the class of Lorentzian synthetic spaces. In this short section we briefly recall some notable examples entering the aforementioned framework of Lorentzian synthetic spaces.
Spacetimes with a continuous Lorentzian metric. Let $M$ be a smooth manifold endowed with a continuous Lorentzian metric $g$. Assume that $(M, g)$ is time-oriented, i.e. there is a continuous time-like vector field. Observe that, for $C^0$-metrics, the natural class of differentiability of the underlying manifolds is $C^1$; now, $C^1$ manifolds always admit a $C^\infty$ sub-atlas, and one can pick such a sub-atlas whenever convenient. A causal (respectively time-like) curve in $M$ is by definition a locally Lipschitz curve whose tangent vector is causal (resp. time-like) almost everywhere. One could also start from absolutely continuous (AC for short) curves, but since causal AC curves always admit a Lipschitz re-parametrisation [63, Sec. 2.1, Rem. 2.3], we do not lose generality with the above convention. Set $L_g(\gamma)$ to be the $g$-length of a causal curve $\gamma : I \to M$, i.e. $L_g(\gamma) := \int_I \sqrt{-g(\dot\gamma_t,\dot\gamma_t)}\, dt$, and define the time-separation function by $\tau(x,y) := \sup\{L_g(\gamma) : \gamma \text{ is future directed causal from } x \text{ to } y\}$ if $x \leq y$, and $\tau(x,y) = 0$ otherwise. The reverse triangle inequality (B.1) follows directly from the definition. Moreover, every $L_g$-maximal curve $\gamma$ is also $L_\tau$-maximal, and $L_g(\gamma) = L_\tau(\gamma)$ (see for instance [54, Remark 5.1]). In order to have an underlying metric structure, we also fix a complete Riemannian metric $h$ on $M$ and denote by $d_h$ the associated distance function.
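As a concrete illustration of the reverse triangle inequality, one can take the simplest smooth example, $(1+1)$-dimensional Minkowski space with $g = -dt^2 + dx^2$; this choice of example is ours and is not taken from the text. There the supremum defining $\tau$ is attained by straight segments, so that $\tau(a,b) = \sqrt{(\Delta t)^2 - (\Delta x)^2}$ whenever $a \leq b$.

```latex
% Illustration in (1+1)-dimensional Minkowski space, g = -dt^2 + dx^2
% (an assumed example); points are written as (t, x).
\begin{align*}
a &= (0,0), \qquad b = (1,\,0.6), \qquad c = (2,\,0), \\
\tau(a,b) &= \sqrt{1 - 0.36} = 0.8, \qquad
\tau(b,c) = \sqrt{1 - 0.36} = 0.8, \qquad
\tau(a,c) = \sqrt{4 - 0} = 2, \\
\tau(a,b) + \tau(b,c) &= 1.6 \;\le\; 2 = \tau(a,c).
\end{align*}
```

The broken causal path through $b$ is strictly shorter than the straight one, which is the opposite of what happens for a distance in a metric space.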
For a spacetime with a Lorentzian $C^0$-metric:

Some examples towards quantum gravity. The framework of Lorentzian pre-length spaces allows one to handle situations where one may not have the structure of a manifold or a Lorentz(-Finsler) metric. A remarkable example of such a situation is given by certain approaches to quantum gravity, see for instance [60] where it is shown that it is possible to reconstruct a globally hyperbolic spacetime and the causality relation from a countable dense set of events, in a purely order theoretic manner. Two approaches to quantum gravity, particularly linked to Lorentzian pre-length spaces, are the theory of causal Fermion systems [36, 37] and the theory of causal sets [13]. The basic idea in both cases is that the structure of spacetime needs to be adjusted on a microscopic scale to include quantum effects. This leads to non-smoothness of the underlying geometry, and the classical structure of Lorentzian manifold emerges only in the macroscopic regime. For the connection to the theory of Lorentzian (pre-)length spaces we refer to [54, Section 5.3], [36, Section 5.1].
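B.2. The p-Lorentz-Wasserstein distance on a Lorentzian pre-length space. We denote with $\mathcal{P}(X)$ (resp. $\mathcal{P}_c(X)$, or $\mathcal{P}_{ac}(X)$) the collection of all Borel probability measures (resp. with compact support, or absolutely continuous with respect to $\mathfrak{m}$). Given a probability measure $\mu \in \mathcal{P}(X)$ we define its relative entropy by $\mathrm{Ent}(\mu|\mathfrak{m}) := \int_X \rho \log\rho \, d\mathfrak{m}$ if $\mu = \rho\,\mathfrak{m} \in \mathcal{P}_{ac}(X)$ and $(\rho\log\rho)_+$ is $\mathfrak{m}$-integrable. Otherwise we set $\mathrm{Ent}(\mu|\mathfrak{m}) = +\infty$. We denote by $\mathrm{Dom}(\mathrm{Ent}(\cdot|\mathfrak{m}))$ the finiteness domain of $\mathrm{Ent}(\cdot|\mathfrak{m})$.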
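A quick worked example may help fix the sign conventions. Assuming the relative entropy has the usual form $\mathrm{Ent}(\mu|\mathfrak{m}) = \int \rho\log\rho\, d\mathfrak{m}$ for $\mu = \rho\,\mathfrak{m}$, consider the normalized restriction of $\mathfrak{m}$ to a Borel set $A$ with $0 < \mathfrak{m}(A) < \infty$ (the set $A$ is a hypothetical choice made for illustration).

```latex
% Uniform measure on a Borel set A with 0 < m(A) < \infty (illustrative choice),
% assuming Ent(\mu | m) = \int \rho \log \rho \, dm for \mu = \rho m.
\begin{align*}
\mu &= \frac{1}{\mathfrak{m}(A)}\, \mathfrak{m}|_A, \qquad
\rho = \frac{1}{\mathfrak{m}(A)}\, \mathbf{1}_A, \\
\mathrm{Ent}(\mu|\mathfrak{m})
  &= \int_A \frac{1}{\mathfrak{m}(A)} \log\frac{1}{\mathfrak{m}(A)} \, d\mathfrak{m}
   = -\log \mathfrak{m}(A).
\end{align*}
```

In particular, concentrating the mass on sets of smaller measure increases the entropy, and $\mathrm{Ent}(\mu|\mathfrak{m}) \to +\infty$ as $\mathfrak{m}(A) \to 0$.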
A key property of $\ell_p$ is the reverse triangle inequality. This was proved in the smooth Lorentzian setting by Eckstein-Miller [29, Theorem 13] and in the present synthetic setting in [25, Proposition 2.5]. Such a property is the natural Lorentzian analogue of the fact that the Kantorovich-Rubinstein-Wasserstein distances $W_p$, $p \geq 1$, in the metric space setting satisfy the usual triangle inequality (see for instance [81, Section 6]).
has time-like Ricci curvature bounded above by $K$ with respect to $p \in (0,1)$, $r_0 > 0$ and with remainder function $\omega$ in the synthetic sense of Definition B.5. Then also the limit space $(X_\infty, d_\infty, \mathfrak{m}_\infty, \ll_\infty, \leq_\infty, \tau_\infty)$ has time-like Ricci curvature bounded above by $K$ with respect to $p \in (0,1)$, $r_0 + 1$ and with remainder function $\omega$ in the synthetic sense of Definition B.5.
Proof. For simplicity of notation, we will identify $X_j$ with its isomorphic image $\iota_j(X_j) \subset X$ and the measure $\mathfrak{m}_j$ with $(\iota_j)_\sharp\mathfrak{m}_j$, for each $j \in \mathbb{N} \cup \{\infty\}$.
We will write B d j Step 1. We claim that, for j ∈ N sufficiently large, there exist x j , y j ∈ X j such that (1) x j → x ∞ and y j → y ∞ in (X, d); (2) d j (x j , y j ) =: r j → r; . Since by assumption m j m ∞ weakly as measures in (X, d), by the lowersemicontinuity over open subset we have that for every r > 0 it holds In particular, for every r > 0 there exists J(r) > 0 such that B d r (x ∞ )∩X j = ∅ for all j ≥ J(r). By a diagonal argument we obtain that there exist x j ∈ X j with x j → x ∞ in (X, d). The analogous argument gives a sequence y j ∈ X j with y j → y ∞ . Using that the inclusion maps ι j are isometric, we also get We are left to show the third claim. For every j ∈ N, let x j ∈ B d j r 4 j (x j ) and Combining the volume non-collapsing assumption with the weak convergence m j m ∞ and the properness of the metrics, we obtain that there exist x ∞ ∈ B d∞ (r+ε) 4 (x ∞ ) and y ∞ ∈ B d∞ (r+ε) 2 (y ∞ ) such that x j → x ∞ and y j → y ∞ up to a subsequence. Recalling (B.6) and that ι ∞ is an isomorphic embedding, we infer that (x ∞ , y ∞ ) ∈ X 2 . Since X 2 ⊂ X 2 is an open subset, we conclude that for j sufficiently large it holds (x j , y j ) ⊂ X 2 ∩ X 2 j = (X 2 j ) as desired.
Step 2. We claim that, for every j ∈ N, there exists µ j 0 ∈ Dom(Ent( · |m j )) with supp µ j Since m j m ∞ weakly, there exists R ∈ (2r 4 , 3r 4 ) such that For j ∈ N ∪ {∞}, denotem j := z j m j B d R (x ∞ ) and observe thatm j m ∞ weakly and thus (since now the supports are uniformly bounded in X) in W (X,d) q for some (or equivalently every) q ∈ [1, ∞). In particularm j →m ∞ in W -optimal coupling.
B.4. Synthetic time-like Ricci lower bounds. In this subsection we briefly recall some basics of the synthetic theory of time-like Ricci lower bounds developed in [25].
The motivation for considering (strongly) time-like $p$-dualisable pairs of measures is twofold: firstly, the $p$-optimal coupling $d\pi(x,y)$ matches events described by $d\mu(x)$ with events described by $d\nu(y)$ so that $x \ll y$; secondly, Kantorovich duality holds (see [78, Proposition 2.7] in the smooth Lorentzian setting and in case $p = 1$, and [25, Section 2.4] for the non-smooth setting and general $p \in (0,1]$).
The following stability result for synthetic time-like Ricci lower bounds was proved in [25, Theorem 3.14].
The combination of the stability of time-like Ricci lower and upper bounds (i.e. Theorem B.6 and Theorem B.9) gives the stability of the synthetic vacuum Einstein's equations under the aforementioned natural Lorentzian variant of measured Gromov-Hausdorff convergence.
Remark B.12. By [25, Theorem 3.1] after [62] (see also Corollary 4.4) and by Theorem 4.7 (see also Remark 4.8), if $(X, d, \mathfrak{m}, \ll, \leq, \tau)$ is a (for simplicity, say a compact subset in a) smooth Lorentzian manifold, then $(X, d, \mathfrak{m}, \ll, \leq, \tau)$ satisfies the Einstein equations $\mathrm{Ric} \equiv \Lambda$ in the smooth classical sense if and only if it satisfies the Einstein equations in the synthetic sense of Definition B.10. Therefore, Theorem B.11 gives that the corresponding limits of smooth solutions to the Einstein equations $\mathrm{Ric} \equiv \Lambda$ satisfy the weak synthetic Einstein equations $\mathrm{Ric} \equiv \Lambda$ in the sense of Definition B.10. In other terms, the vacuum Einstein equations are stable under the conditions (and with respect to the notion of convergence) of Theorem B.11.
Let us mention that the stability of the Einstein equations under various notions of (weak) convergence is a topic of high interest in General Relativity. Classically, the problem is phrased in terms of a sequence of Lorentzian metrics $g_j$ converging to a limit Lorentzian metric $g_\infty$ on a fixed underlying manifold. It is well known that, if $g_j$ are solutions of the vacuum Einstein equations, $g_j \to g_\infty$ in $C^0_{loc}$ and the derivatives of $g_j$ converge in $L^2_{loc}$, then the limit $g_\infty$ satisfies the vacuum Einstein equations as well. However, if $g_j \to g_\infty$ in $C^0_{loc}$ and the derivatives of $g_j$ converge only weakly in $L^2_{loc}$, explicit examples are known (see [20, 42] for examples in symmetry classes) where the limit $g_\infty$ may satisfy the Einstein equations with a non-vanishing stress-energy-momentum tensor. Burnett [20] conjectured that, if there exist $C > 0$ and $\lambda_j \to 0$ such that $|g_j - g_\infty| \leq \lambda_j$, $|\partial g_j| \leq C$, $|\partial^2 g_j| \leq C\lambda_j^{-1}$, then $g_\infty$ is isometric to a solution to the Einstein-massless Vlasov system for some appropriate choice of Vlasov field. Such a conjecture remains open, although there has been recent progress [47, 48] when the $g_j$ are assumed to be $U(1)$-symmetric. We also mention the recent work [58] where concentrations (at the level of $\partial g_j$) are allowed in addition to oscillations.
Theorem B.11 gives a new point of view on the stability of the vacuum Einstein equations. Indeed, while in the aforementioned results the metrics $g_j$ converge on a fixed underlying manifold, in Theorem B.11 the underlying space $X$ may also vary (along the sequence and at the limit), allowing a change of topology in the limit, as one may expect in case of formation of singularities. Moreover, the notion of convergence is quite different in spirit: while in the aforementioned results $g_j \to g_\infty$ in a suitable functional analytic sense, in Theorem B.11 the spaces converge in a more geometric sense (inspired by the measured Gromov-Hausdorff convergence).