Gradient flow structures for discrete porous medium equations

We consider discrete porous medium equations of the form \partial_t \rho_t = \Delta \phi(\rho_t), where \Delta is the generator of a reversible continuous time Markov chain on a finite set X, and \phi is an increasing function. We show that these equations arise as gradient flows of certain entropy functionals with respect to suitable non-local transportation metrics. This may be seen as a discrete analogue of the Wasserstein gradient flow structure for porous medium equations in R^n discovered by Otto. We present a one-dimensional counterexample to geodesic convexity and discuss Gromov-Hausdorff convergence to the Wasserstein metric.


Introduction
Recently it has been shown that discretisations of heat equations and Fokker-Planck equations can be formulated as gradient flows of the entropy with respect to a non-local transportation metric W on the space of probability measures. Results in this direction have been obtained independently in various settings, including Fokker-Planck equations on graphs [5], reversible Markov chains [11], and systems of reaction-diffusion equations [14]. Related gradient flow structures have been found for fractional heat equations [7] and quantum mechanical evolution equations [4,15].
The above-mentioned results can be regarded as discrete counterparts to the by now classical result of Jordan, Kinderlehrer and Otto [10], who showed that Fokker-Planck equations on R^n are gradient flows of the entropy with respect to the L^2-Wasserstein metric on the space of probability measures. In this continuous setting there are many other interesting partial differential equations which admit a formulation as a Wasserstein gradient flow. Among them, one of the most prominent examples is the porous medium equation, which has been identified as the Wasserstein gradient flow of the Rényi entropy in Otto's seminal paper [16].
It seems therefore natural to ask whether a similar gradient flow structure exists for discrete versions of porous medium equations. In this paper we show that suitable discretisations indeed admit a gradient flow structure. The associated metrics turn out to be variations on the metric W, which have already been studied in [11].
1.1. The discrete setting. In this paper we let X be a finite set. We consider a matrix Q : X × X → R satisfying Q(x, y) ≥ 0 for all x ≠ y and ∑_{y∈X} Q(x, y) = 0 for all x ∈ X. We assume that Q is irreducible, so that basic Markov chain theory implies the existence of a unique invariant probability measure π on X. We assume that π is reversible, i.e., the detailed balance equations Q(x, y)π(x) = Q(y, x)π(y) (1.1) hold for all x, y ∈ X. For a function ψ : X → R we write ∆ψ(x) := ∑_{y∈X} Q(x, y)ψ(y) = ∑_{y∈X} Q(x, y)(ψ(y) − ψ(x)). (1.2)
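The setting above can be illustrated with a minimal sketch. The birth-death chain, its jump rates and the helper names `make_generator`, `stationary_measure` and `laplacian` are illustrative assumptions and do not appear in the paper; only the detailed balance equations (1.1) and the formula (1.2) are taken from the text.

```python
# Hypothetical example: a birth-death chain on X = {0, 1, 2, 3}.
# The rates below are illustrative assumptions, not taken from the paper.

def make_generator(up, down):
    """Build the Q-matrix of a birth-death chain with the given jump rates."""
    n = len(up) + 1
    Q = [[0.0] * n for _ in range(n)]
    for x in range(n - 1):
        Q[x][x + 1] = up[x]      # rate of the jump x -> x + 1
        Q[x + 1][x] = down[x]    # rate of the jump x + 1 -> x
    for x in range(n):
        Q[x][x] = -sum(Q[x])     # diagonal chosen so that each row sums to zero
    return Q

def stationary_measure(up, down):
    """Solve detailed balance pi(x) up[x] = pi(x + 1) down[x] and normalise."""
    weights = [1.0]
    for u, d in zip(up, down):
        weights.append(weights[-1] * u / d)
    total = sum(weights)
    return [w / total for w in weights]

def laplacian(Q, psi):
    """Discrete Laplacian (1.2): sum over y of Q(x, y) (psi(y) - psi(x))."""
    n = len(psi)
    return [sum(Q[x][y] * (psi[y] - psi[x]) for y in range(n)) for x in range(n)]

up, down = [1.0, 2.0, 0.5], [3.0, 1.0, 1.5]
Q = make_generator(up, down)
pi = stationary_measure(up, down)
psi = [0.3, -1.2, 2.0, 0.7]
lap = laplacian(Q, psi)
```

Birth-death chains are always reversible, which is why the stationary measure can be read off directly from detailed balance in this sketch.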
Consider now the relative entropy functional H : P(X) → R defined by H(ρ) := ∑_{x∈X} ρ(x) log ρ(x) π(x). The following result has been shown in [5,11,14]: Solutions to the heat equation ∂_t ρ_t = ∆ρ_t are gradient flow trajectories of H with respect to the metric W, provided that θ is the logarithmic mean defined by θ(r, s) := (r − s)/(log r − log s). (1.3)
In this paper we consider discrete porous medium equations of the form ∂_t ρ_t = ∆ϕ(ρ_t), where ϕ is an increasing function and ∆ denotes the discrete Laplacian defined in (1.2). This equation can be analysed using classical Hilbertian gradient flow techniques in a suitable "discrete H^{−1}-space" (see Proposition 3.2 below).
Here we are interested in more general gradient flow structures in the spirit of the Wasserstein gradient flow structure for the porous medium equation [16]. For this purpose, for suitable (see Assumption 3.3) strictly convex functions f : [0, ∞) → R, we consider the entropy functional F : P(X) → R defined by F(ρ) := ∑_{x∈X} f(ρ(x)) π(x). (1.5) We shall prove the following result.
Theorem 1.1. Let ϕ and f be as above, and let W denote the non-local transportation metric induced by the weight function θ(r, s) := (ϕ(r) − ϕ(s))/(f′(r) − f′(s)). (1.6) Then the gradient flow equation of the entropy functional F with respect to W is the discrete porous medium equation ∂_t ρ_t = ∆ϕ(ρ_t).
Remark 1.2. In the special case where ϕ(r) = r and f(r) = r log r we recover the result from [5,11,14]. In this case we have θ(r, s) = (r − s)/(log r − log s), which coincides with (1.3). The possibility of allowing more general functions f has already been discussed in [11].
Remark 1.3. Of particular interest is the case where ϕ(r) = r^m and f(r) = r^m/(m − 1) for some 0 < m ≤ 2 with m ≠ 1. In this case the equation can be considered as a discrete porous medium equation (if m > 1) or fast diffusion equation (if m < 1). The functional F becomes a Rényi entropy F_m, so that Theorem 1.1 can be considered as a discrete analogue of the gradient flow structure obtained by Otto [16]. The expression for θ from (1.6) becomes θ_m(r, s) = ((m − 1)/m) · (r^m − s^m)/(r^{m−1} − s^{m−1}). (1.7) Several classical means are special cases of this expression. In fact, θ_2(r, s) = (r + s)/2 is the arithmetic mean, lim_{m→1} θ_m(r, s) is the logarithmic mean, and θ_{1/2}(r, s) = √(rs) is the geometric mean. Moreover, θ_{−1}(r, s) = 2rs/(r + s) is the harmonic mean, but we shall not consider negative values of m in the sequel.
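The special cases listed in Remark 1.3 can be checked numerically. The sketch below assumes that the weight function for ϕ(r) = r^m and f(r) = r^m/(m − 1) is the quotient (ϕ(r) − ϕ(s))/(f′(r) − f′(s)), that is, ((m − 1)/m)(r^m − s^m)/(r^{m−1} − s^{m−1}); the function names `theta_m` and `log_mean` are hypothetical.

```python
import math

def theta_m(m, r, s):
    """Weight function for phi(r) = r^m, f(r) = r^m / (m - 1); m not in {0, 1}.

    Assumed form: (m - 1)/m * (r^m - s^m) / (r^(m-1) - s^(m-1)).
    """
    if r == s:
        return r  # continuous extension to the diagonal
    return (m - 1) / m * (r**m - s**m) / (r**(m - 1) - s**(m - 1))

def log_mean(r, s):
    """Logarithmic mean (r - s) / (log r - log s), extended by r on the diagonal."""
    return r if r == s else (r - s) / (math.log(r) - math.log(s))

arithmetic = theta_m(2.0, 1.0, 3.0)        # arithmetic mean of 1 and 3
geometric = theta_m(0.5, 1.0, 4.0)         # geometric mean of 1 and 4
harmonic = theta_m(-1.0, 1.0, 3.0)         # harmonic mean of 1 and 3
near_log = theta_m(1.0 + 1e-6, 1.0, 3.0)   # close to the logarithmic mean
```

The value at m = 1 itself is excluded because the formula degenerates there; the logarithmic mean only appears in the limit m → 1.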
Remark 1.4. For integer values of m, the porous medium equation takes the form of a chemical reaction equation, for which a gradient flow structure has been found in [14]. However, there the driving functional is the relative entropy H, so that the associated weight function can be expressed in terms of logarithmic means. Here we also allow for different driving functionals, such as the Rényi entropy.
1.3. Geodesic convexity of entropy functionals. Of crucial importance in the theory of Wasserstein gradient flows is that the relevant entropy functionals exhibit good convexity properties along Wasserstein geodesics. Recent work [8,13] shows that analogous properties hold for the relative entropy functional H in some discrete examples. In particular, H turns out to be convex along W-geodesics in one-dimensional discrete Fokker-Planck equations [13] as well as in heat equations on d-dimensional square lattices in arbitrary dimension [8].
It thus seems natural to ask whether similar convexity properties hold for the more general functionals F along W-geodesics with the appropriate choice of θ given by (1.6). In the continuous setting, it follows from a fundamental result by McCann [12] that the Rényi entropy F_m(ρ) = (1/(m − 1)) ∫_{R^n} ρ^m(x) dx is displacement convex, i.e., convex along L^2-Wasserstein geodesics in P(R^n), for m ≥ 1 − 1/n.
However, in this paper we shall show that the discrete analogue fails in general. We present a counterexample (Proposition 4.5) in the case m = 2, which shows that even in one dimension the Rényi entropy fails to be convex along geodesics in the associated non-local transportation metric.
Proposition 1.5. For N ≥ 6, let Q be the generator of simple random walk on the discrete circle T_N = Z/NZ. Let W be the non-local transportation metric associated with the arithmetic mean θ_2. Then the Rényi entropy F_2 is not convex along W-geodesics.
1.4. Gromov-Hausdorff convergence of discrete transportation metrics. Since the discrete transportation metrics discussed in this paper take over the role of the L^2-Wasserstein metric in a discrete setting, it seems natural to ask whether they converge to the L^2-Wasserstein metric by a suitable limiting procedure. First results in this spirit have been obtained in [9], where it was proved that discrete transportation metrics W_N associated with simple random walk on the discrete torus (Z/NZ)^d converge to the L^2-Wasserstein metric W_2 over the torus T^d. The weight function considered in [9] is the logarithmic mean. In the present paper we observe that the result extends to more general weight functions associated to porous medium equations.
Theorem 1.6. Let d ≥ 1, let 0 < m ≤ 2, and let W_N be the renormalised discrete transportation metric on P(T^d_N) associated with simple random walk on T^d_N and weight function θ_m. Then the metric spaces (P(T^d_N), W_N) converge to (P(T^d), W_2) in the sense of Gromov-Hausdorff as N → ∞.
Since this result can be obtained by a minor modification of the proof in [9], we do not give a detailed proof here, but merely point out the crucial properties which make the argument go through.

Preliminaries on non-local transportation metrics
In this section we collect some preliminary results on non-local transportation metrics. The presentation here is based on [11].
In this paper, a reversible Markov chain consists of a triple (X, Q, π) where
• X is a finite set;
• Q : X × X → R is a Q-matrix, i.e., Q(x, y) ≥ 0 for all x ≠ y and ∑_{y∈X} Q(x, y) = 0 for all x ∈ X. Throughout this paper we assume that Q is irreducible, i.e., for all x, y ∈ X there exist n ≥ 1 and x_0, ..., x_n ∈ X such that x_0 = x, x_n = y and Q(x_{i−1}, x_i) > 0 for all 1 ≤ i ≤ n;
• π is the associated stationary probability measure on X, which exists uniquely by basic Markov chain theory. We impose the standing assumption that π is reversible, i.e., we assume that the detailed balance equations Q(x, y)π(x) = Q(y, x)π(y) are satisfied for all x, y ∈ X.
The set of probability densities with respect to π will be denoted by P(X) := {ρ : X → [0, ∞) | ∑_{x∈X} ρ(x)π(x) = 1}, and we set P_*(X) := {ρ ∈ P(X) | ρ(x) > 0 for all x ∈ X}.
Note that (A5) implies in particular the following estimate:
Let us fix a reversible Markov chain (X, Q, π) and the weight function θ throughout the remainder of this section. We recall the definition of the associated non-local transportation metric from [11], as given in a slightly different (but equivalent) form in [8]. Actually, in [11,8] a slightly less general case has been considered, which corresponds to Q(x, x) = −1 for all x ∈ X. However, it is easily checked that the results extend to the present setting verbatim.

2)
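and for ρ ∈ P(X) and ψ ∈ R^X, A(ρ, ψ) := (1/2) ∑_{x,y∈X} (ψ(y) − ψ(x))^2 ρ̂(x, y) Q(x, y) π(x), where ρ̂(x, y) := θ(ρ(x), ρ(y)). The equation appearing in (2.2) can be regarded as a "discrete continuity equation". The analogy with the continuous case becomes more apparent if we introduce some notation. For a function ψ : X → R, we consider the discrete gradient ∇ψ : X × X → R given by ∇ψ(x, y) := ψ(y) − ψ(x). For a function Ψ : X × X → R we define its discrete divergence ∇ · Ψ : X → R by (∇ · Ψ)(x) := (1/2) ∑_{y∈X} (Ψ(x, y) − Ψ(y, x)) Q(x, y). With this notation the integration by parts formula ⟨∇ψ, Ψ⟩_π = −⟨ψ, ∇ · Ψ⟩_π holds. Here we write, for ϕ, ψ : X → R and Φ, Ψ : X × X → R, ⟨ϕ, ψ⟩_π := ∑_{x∈X} ϕ(x)ψ(x)π(x) and ⟨Φ, Ψ⟩_π := (1/2) ∑_{x,y∈X} Φ(x, y)Ψ(x, y) Q(x, y)π(x). The continuity equation above can now be written as ∂_t ρ_t + ∇ · (ρ̂_t ∇ψ_t) = 0. Moreover, for ρ ∈ P(X) we shall write ⟨Φ, Ψ⟩_ρ := (1/2) ∑_{x,y∈X} Φ(x, y)Ψ(x, y) ρ̂(x, y) Q(x, y)π(x) and ‖Φ‖_ρ := ⟨Φ, Φ⟩_ρ^{1/2}.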
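The discrete calculus described above can be sketched in a few lines. The conventions used here are assumptions chosen so that the integration by parts formula and the identity ∆ = (∇·)∇ hold: ∇ψ(x, y) = ψ(y) − ψ(x), (∇ · Ψ)(x) = (1/2) ∑_y (Ψ(x, y) − Ψ(y, x)) Q(x, y), and the edge inner product (1/2) ∑_{x,y} Φ(x, y)Ψ(x, y) Q(x, y)π(x). The three-point chain and all function names are hypothetical.

```python
def grad(psi):
    """Discrete gradient: (grad psi)(x, y) = psi(y) - psi(x)."""
    n = len(psi)
    return [[psi[y] - psi[x] for y in range(n)] for x in range(n)]

def div(Q, Psi):
    """Discrete divergence: 1/2 * sum_y (Psi(x, y) - Psi(y, x)) Q(x, y)."""
    n = len(Q)
    return [0.5 * sum((Psi[x][y] - Psi[y][x]) * Q[x][y] for y in range(n))
            for x in range(n)]

def inner_nodes(pi, a, b):
    """L^2(pi) inner product of two functions on X."""
    return sum(a[x] * b[x] * pi[x] for x in range(len(pi)))

def inner_edges(Q, pi, A, B):
    """Inner product of two functions on X x X, weighted by Q(x, y) pi(x)."""
    n = len(pi)
    return 0.5 * sum(A[x][y] * B[x][y] * Q[x][y] * pi[x]
                     for x in range(n) for y in range(n) if x != y)

# Hypothetical reversible chain on three points, built from symmetric
# edge weights c(x, y) = Q(x, y) pi(x), so that detailed balance holds.
pi = [0.2, 0.3, 0.5]
c = {(0, 1): 0.1, (0, 2): 0.05, (1, 2): 0.2}
Q = [[0.0] * 3 for _ in range(3)]
for (x, y), w in c.items():
    Q[x][y] = w / pi[x]
    Q[y][x] = w / pi[y]
for x in range(3):
    Q[x][x] = -sum(Q[x])

psi = [1.0, -0.5, 2.0]
Psi = [[0.0, 1.0, -2.0], [0.5, 0.0, 3.0], [1.5, -1.0, 0.0]]

lhs = inner_edges(Q, pi, grad(psi), Psi)      # <grad psi, Psi>_pi
rhs = -inner_nodes(pi, psi, div(Q, Psi))      # -<psi, div Psi>_pi
lap = [sum(Q[x][y] * (psi[y] - psi[x]) for y in range(3)) for x in range(3)]
div_grad = div(Q, grad(psi))                  # should agree with lap
```

Note that the integration by parts identity relies on detailed balance; for a non-reversible Q the two sides would in general differ.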

With this notation we have A(ρ, ψ) = ‖∇ψ‖_ρ^2. Basic properties of W have been studied in [11]. Under the current assumptions we have the following result.
Proof. Under somewhat weaker assumptions, it has been proved in [11] that W defines an extended (i.e., possibly [0, ∞]-valued) metric. To show that W is finite under the current assumptions, we apply [11, Theorem 3.12], which asserts that it suffices to check that Using the concavity assumption (A5) we infer that The final assertion has been proved in [8, Theorem 3.2].
Some basic properties of the metric W are collected in the following result, which asserts that the distance W is induced by a Riemannian metric on the interior P * (X ) of P(X ).
Proposition 2.4 (Riemannian structure). The restriction of W to P_*(X) is the Riemannian distance induced by the following Riemannian structure:
- The tangent space of ρ ∈ P_*(X) can be identified with the set of discrete gradients T_ρ := {∇ψ : ψ ∈ R^X} by means of the following identification: given a smooth curve (−ε, ε) ∋ t ↦ ρ_t ∈ P_*(X) with ρ_0 = ρ, there exists a unique element ∇ψ_0 ∈ T_ρ such that the continuity equation (2.2)(v) holds at t = 0.
- The Riemannian metric on T_ρ is given by the inner product ⟨∇ϕ, ∇ψ⟩_ρ.
The following result characterises the geodesic equations for W. The equation for ψ is reminiscent of the Hamilton-Jacobi equation, which describes geodesics in the Wasserstein space.
Proposition 2.5 (Geodesic equations). Let ρ̄ ∈ P_*(X) and ψ̄ ∈ R^X. On a sufficiently small time interval around 0, the unique constant speed geodesic with ρ_0 = ρ̄ and initial tangent vector ∇ψ_0 = ∇ψ̄ satisfies the following equations:
Let ∆ be the discrete Laplacian associated with Q, i.e., for a function ψ : X → R, ∆ψ(x) := ∑_{y∈X} Q(x, y)(ψ(y) − ψ(x)). (2.4) The operator ∆ is the generator of the continuous time Markov semigroup associated with Q. Note that we have the usual formula ∆ = (∇·)∇. By the reversibility assumption, ∆ is selfadjoint on L^2(X, π).
Let f ∈ C([0, ∞); R) be a strictly convex function, which is smooth on (0, ∞). We consider the associated entropy functional F : P(X) → R defined by F(ρ) := ∑_{x∈X} f(ρ(x))π(x). The following result explains the relevance of the non-local transportation metrics W.
Proposition 2.6 (Gradient flows). The heat flow generated by ∆ is the gradient flow of the entropy F with respect to W, provided that θ is given by θ(r, s) := (r − s)/(f′(r) − f′(s)). (2.5)
Let us note that among all weight functions θ of the form (2.5), the logarithmic mean (1.3) is the only one satisfying θ(r, r) = r.

Discrete porous medium equations as gradient flows of the entropy
In this note we shall be concerned with equations of porous medium-type associated with a reversible Markov chain (X , Q, π). The following assumption will be in force throughout this section.
We shall study the discrete porous medium equation ∂_t ρ_t = ∆ϕ(ρ_t), where ∆ denotes the discrete Laplacian defined in (2.4). Of course, if ϕ(r) = r, we recover the usual discrete heat equation associated with Q, which has been studied in [5,11,14]. We shall analyse these equations using gradient flow methods, first in a Hilbertian setting of discrete Sobolev spaces, and then in a Riemannian setting of non-local transportation metrics.
3.1. A Hilbertian gradient flow structure. In order to apply Hilbertian gradient flow methods we shall introduce a "discrete H^{−1}-space" H^{−1}. First we observe that, by the irreducibility assumption, Ker(∆) = lin{1}, where 1 denotes the function identically equal to 1. Since ∆ is selfadjoint on L^2(X, π), it follows that the operator ∆ is bijective on Ran(∆) = {ψ ∈ R^X : ∑_{x∈X} ψ(x)π(x) = 0}. For ψ, ψ_1, ψ_2 ∈ Ran(∆) it thus makes sense to define the inner product ⟨ψ_1, ψ_2⟩_{H^{−1}} := −⟨ψ_1, ∆^{−1}ψ_2⟩_π, where ⟨·, ·⟩_π denotes the L^2(X, π)-inner product. The associated norm is given by ‖ψ‖_{H^{−1}}^2 = −⟨ψ, ∆^{−1}ψ⟩_π. For arbitrary functions ψ : X → R we set c_ψ = ∑_{x∈X} ψ(x)π(x) and note that ψ − c_ψ 1 belongs to Ran(∆). A Hilbertian norm on R^X can then be defined by The equation ∂_t ρ_t = ∆ϕ(ρ_t) can now be studied using classical Hilbertian gradient flow arguments: for ρ ∈ P(X), and Φ(ρ) = +∞ otherwise. Then the functional Φ is strictly convex on P(X) and attains its unique minimum at 1. As a consequence, the following assertions hold for all ρ̄_0 ∈ P(X): (1) Among all locally absolutely continuous curves (ρ_t)_t in P(X) with ρ_0 = ρ̄_0, there exists a unique one satisfying the evolution variational inequality for all σ ∈ P(X) and a.e. t ≥ 0.
Proof. Strict convexity of Φ follows from the assumption that ϕ is strictly increasing. Part  (2), it suffices to compute the H −1 -gradient of Φ. To do this, note that for a smooth curve which shows that the H −1 -gradient of the functional Φ at ρ ∈ P(X ) is given by −∆ϕ(ρ), as desired.
Since the gradient of the convex functional Φ vanishes at 1, it follows that Φ attains its minimum at 1.
The associated metric on P(X) will be denoted by W_{ϕ,f}. Often, if confusion is unlikely to arise, we will simply write W.
We consider the entropy functional F : P(X) → R defined by F(ρ) := ∑_{x∈X} f(ρ(x))π(x). (3.3) Of course, if f(r) = r log r we recover the usual relative entropy with respect to π.
Proposition 3.4. Let W be any non-local transportation metric. The W-gradient of the functional F at ρ ∈ P * (X ) is given by ∇f ′ (ρ).
Here we use the identification of the tangent space provided in Proposition 2.4.
Proof. Let ρ ∈ P_*(X) and pick a smooth curve (ρ_t) satisfying the continuity equation ∂_t ρ_t + ∇ · (ρ̂_t ∇ψ_t) = 0, where we use the notation from Section 2. It then follows that d/dt F(ρ_t) = ⟨f′(ρ_t), ∂_t ρ_t⟩_π = −⟨f′(ρ_t), ∇ · (ρ̂_t ∇ψ_t)⟩_π = ⟨∇f′(ρ_t), ∇ψ_t⟩_{ρ_t}, hence the W_{ϕ,f}-gradient of the functional F at ρ ∈ P_*(X) is given by ∇f′(ρ).
The next result is now an immediate consequence.
Theorem 3.5. The gradient flow equation of the entropy functional F with respect to the metric W ϕ,f is given by the porous medium equation ∂ t ρ t = ∆ϕ(ρ t ).
Proof. The gradient flow equation of the smooth functional F on P_*(X) is given by ∂_t ρ_t = ∇ · (ρ̂_t ∇ψ_t), where ∇ψ_t denotes the W-gradient of F at ρ_t. Using Proposition 3.4 and the fact that ρ̂ ∇f′(ρ) = ∇ϕ(ρ) by the definition of θ_{ϕ,f}, we infer that ∇ · (ρ̂_t ∇f′(ρ_t)) = ∇ · ∇ϕ(ρ_t) = ∆ϕ(ρ_t), hence the gradient flow equation reduces to the porous medium equation ∂_t ρ_t = ∆ϕ(ρ_t), as desired.
Clearly, the gradient flow structure for the heat equation given in Proposition 2.6 corresponds to the special case of Theorem 3.5 where ϕ(r) = r.
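The key identity behind Theorem 3.5 can be checked numerically. The sketch below assumes ϕ(r) = r^m, f(r) = r^m/(m − 1) and the weight θ(r, s) = (ϕ(r) − ϕ(s))/(f′(r) − f′(s)), and compares the edge-by-edge expression ∑_y θ(ρ(x), ρ(y)) (f′(ρ(y)) − f′(ρ(x))) Q(x, y) with ∆ϕ(ρ)(x). The discrete circle, the explicit Euler scheme, the step size and all function names are illustrative assumptions, not the authors' method.

```python
def cycle_generator(N, q):
    """Generator of nearest-neighbour random walk on Z/NZ with rate q (N >= 3)."""
    Q = [[0.0] * N for _ in range(N)]
    for x in range(N):
        Q[x][(x + 1) % N] = q
        Q[x][(x - 1) % N] = q
        Q[x][x] = -2.0 * q
    return Q

def laplacian(Q, psi):
    n = len(psi)
    return [sum(Q[x][y] * (psi[y] - psi[x]) for y in range(n)) for x in range(n)]

def theta_m(m, r, s):
    """Assumed weight function for phi(r) = r^m and f(r) = r^m / (m - 1)."""
    if r == s:
        return r
    return (m - 1) / m * (r**m - s**m) / (r**(m - 1) - s**(m - 1))

def pme_rhs(Q, rho, m):
    """Right-hand side Delta(rho^m) of the discrete porous medium equation."""
    return laplacian(Q, [r**m for r in rho])

def flow_rhs(Q, rho, m):
    """div(rho_hat grad f'(rho)), written out edge by edge."""
    n = len(rho)
    fp = [m / (m - 1) * r**(m - 1) for r in rho]
    return [sum(theta_m(m, rho[x], rho[y]) * (fp[y] - fp[x]) * Q[x][y]
                for y in range(n) if y != x) for x in range(n)]

def renyi_entropy(rho, pi, m):
    return sum(r**m * p for r, p in zip(rho, pi)) / (m - 1)

def euler_steps(Q, rho, m, dt, steps):
    """Explicit Euler discretisation in time (a crude illustrative scheme)."""
    traj = [list(rho)]
    for _ in range(steps):
        v = pme_rhs(Q, traj[-1], m)
        traj.append([r + dt * vx for r, vx in zip(traj[-1], v)])
    return traj

N = 5
Q = cycle_generator(N, 1.0)
pi = [1.0 / N] * N
rho0 = [0.5, 1.5, 1.0, 1.2, 0.8]          # density w.r.t. pi: mean value 1
traj = euler_steps(Q, rho0, 2.0, 0.01, 50)
entropies = [renyi_entropy(rho, pi, 2.0) for rho in traj]
```

In this example the Rényi entropy F_2 decreases along the computed trajectory and the total mass is preserved, as one would expect from a gradient flow of F_2; the small time step is what keeps the explicit scheme well behaved.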
Let us note that also the H −1 -gradient flow structure from Proposition 3.2 is a special case of Theorem 3.5. Indeed, if f = Φ, then F = Φ. Moreover, the following result asserts that the H −1 -distance on P(X ) coincides with the non-local transportation metric W 1 , which is given by Definition 2.2 in the special case where θ(r, s) = 1 for all r, s ≥ 0.

Porous medium equations and Rényi entropy.
Let us now specialise to a particularly interesting setting, motivated by the Wasserstein gradient flow structure for porous medium equations of the form ∂_t ρ_t = ∆ρ_t^m in R^n from [16]. Let 0 < m ≤ 2 with m ≠ 1, and consider the functions ϕ_m(r) := r^m and f_m(r) := r^m/(m − 1). In this case the porous medium equation and the entropy functional are given by ∂_t ρ_t = ∆ρ_t^m and F_m(ρ) = (1/(m − 1)) ∑_{x∈X} ρ(x)^m π(x); thus F_m is the usual Rényi entropy. The weight function is given by θ_m(r, s) = ((m − 1)/m) · (r^m − s^m)/(r^{m−1} − s^{m−1}).
which is negative semi-definite for all α ∈ [0, 1]. Property (2) follows from the monotonicity in p of L^p-"norms", applied on a two-point space {0, 1} with probability measure µ_α := (1 − α)δ_0 + αδ_1. This monotonicity follows from Jensen's inequality and holds for negative p as well.
Let us note that θ m is not a weight function for m > 2, since in this case Minkowski's inequality for L m−1 -norms implies that θ m is convex.
Corollary 3.8. For any reversible Markov chain (X, Q, π) the distance W_m is finite and non-increasing in m. More precisely, for any 0 < m ≤ m′ ≤ 2 and ρ_0, ρ_1 ∈ P(X) we have W_{m′}(ρ_0, ρ_1) ≤ W_m(ρ_0, ρ_1).
Proof. This is an immediate consequence of the monotonicity of θ_m stated in Lemma 3.7 combined with the equivalent definition of the distance W given in [8, Lemma 2.9].
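The monotonicity of θ_m in m, on which the corollary rests, can be probed numerically on a grid. This is only a sanity check under the assumed formula θ_m(r, s) = ((m − 1)/m)(r^m − s^m)/(r^{m−1} − s^{m−1}), with the logarithmic mean at m = 1; it proves nothing, and the grid, the list of exponents and the function names are hypothetical choices.

```python
import math

def theta_m(m, r, s):
    """Assumed weight function; the logarithmic mean is used at m = 1."""
    if r == s:
        return r
    if m == 1:
        return (r - s) / (math.log(r) - math.log(s))
    return (m - 1) / m * (r**m - s**m) / (r**(m - 1) - s**(m - 1))

def is_monotone_in_m(ms, points, tol=1e-12):
    """Check that m -> theta_m(r, s) is non-decreasing at every grid point."""
    ms = sorted(ms)
    for r, s in points:
        values = [theta_m(m, r, s) for m in ms]
        if any(b < a - tol for a, b in zip(values, values[1:])):
            return False
    return True

ms = [-1, 0.25, 0.5, 0.75, 1, 1.5, 2]
points = [(0.1 * i, 0.1 * j) for i in range(1, 30) for j in range(1, 30)]
monotone = is_monotone_in_m(ms, points)
```

A larger weight θ makes the action of a given curve smaller, which is the mechanism by which monotonicity of θ_m translates into W_{m′} ≤ W_m for m ≤ m′.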
Applied to ϕ = ϕ m and f = f m , Theorem 3.5 reduces to the following result.

Geodesic κ-convexity of entropy functionals
In this section we analyse convexity properties of entropy functionals along geodesics for non-local transportation metrics. We fix a reversible Markov chain (X , Q, π), and we fix ϕ, f : [0, ∞) → R satisfying Assumptions 3.1 and 3.3. Throughout this section we set Let W be the associated metric, and let F be the entropy functional defined in (3.3).
Since the Riemannian metric does not necessarily extend continuously to the boundary of P(X ), it is not obvious that a lower bound κ on the Hessian in the interior P * (X ) implies geodesic κ-convexity in the metric space (P(X ), W). Nevertheless, an Eulerian argument by Daneri and Savaré [6] can be adapted to the current setting, to show that this is indeed the case. We refer to [8,Theorem 4.5] and [13, Proposition 2.1] for the details in the case where θ is the logarithmic mean. (1) For every constant speed geodesic (ρ t ) t∈[0,1] in (P(X ), W) we have (2) For allρ 0 ∈ P(X ), the solution (ρ t ) to the porous medium equation with ρ 0 =ρ 0 from Proposition 3.2, satisfies the evolution variational inequality for all σ ∈ P(X ) and a.e. t ≥ 0.
(4) For all ρ ∈ P * (X ) and ψ ∈ R X we have The evolution variational inequality (4.4) can be regarded as a definition of a (strong) notion of a gradient flow in the setting of a metric space. This inequality has been extensively studied recently, see for example [1] and [6, Section 3].

4.2.
Examples. In this section we will study W-geodesic convexity of F in some simple concrete examples. To simplify notation let us write, for i = 1, 2, with the understanding that sup{∅} = −∞.

4.2.1.
The two-point space. We consider the two-point space X = {a, b} endowed with the Q-matrix Q defined by Q(a, b) = p, Q(b, a) = q, Q(a, a) = −p and Q(b, b) = −q for p, q > 0. In this case the stationary probability measure π is given by π(a) = q/(p + q) and π(b) = p/(p + q). We have the following characterisation of κ_Q in terms of p and q. where r = ((p + q)/(2q))(1 − α), s = ((p + q)/(2p))(1 + α), and the infimum runs over all α ∈ (−1, 1).
Proof. An explicit computation shows that In view of Proposition 4.2, the result follows from these expressions.
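The parametrisation by α used in the two-point case can be made concrete as follows. The sketch assumes jump rates Q(a, b) = p and Q(b, a) = q, which is the choice consistent with the values r = ((p + q)/(2q))(1 − α) and s = ((p + q)/(2p))(1 + α) being a probability density with respect to π; the function names are hypothetical.

```python
def two_point_chain(p, q):
    """Assumed Q-matrix on {a, b} with rates a -> b equal to p and b -> a equal to q."""
    Q = {('a', 'a'): -p, ('a', 'b'): p, ('b', 'a'): q, ('b', 'b'): -q}
    pi = {'a': q / (p + q), 'b': p / (p + q)}
    return Q, pi

def density(p, q, alpha):
    """Density rho^alpha = (r, s) with respect to pi, for alpha in (-1, 1)."""
    r = (p + q) / (2 * q) * (1 - alpha)
    s = (p + q) / (2 * p) * (1 + alpha)
    return {'a': r, 'b': s}

p, q, alpha = 2.0, 0.5, 0.3
Q, pi = two_point_chain(p, q)
rho = density(p, q, alpha)
mass_a = rho['a'] * pi['a']   # equals (1 - alpha) / 2
mass_b = rho['b'] * pi['b']   # equals (1 + alpha) / 2
```

With this parametrisation α ranges over (−1, 1) exactly when ρ ranges over the strictly positive densities, so an infimum over α is an infimum over the interior P_*(X).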
The following result provides a simplified expression for κ Q in the case where π is symmetric and θ = θ m . The case m = 1 has already been considered in [11,Proposition 2.12].
Proof. This follows readily from Proposition 4.3.

4.2.2.
The discrete circle. As announced in Proposition 1.5, we will exhibit an instance of the discrete porous medium equation where convexity fails. For this purpose, we consider the discrete circle of length N, i.e., X = T_N = Z/NZ for some N ≥ 2. All computations below are understood modulo N. Let Q denote the discrete Laplacian, normalised so that Q(x, y) = q if |x − y| = 1 and Q(x, y) = 0 for all other y ≠ x. In this case π is the uniform probability given by π(x) = 1/N for all x ∈ X.
Proposition 4.5. Let m = 2, let f_m, ϕ_m and θ_m be as in Section 3.3, and let W be the associated non-local transportation metric. Let Q be the discrete Laplacian on the discrete circle T_N = Z/NZ with N ≥ 6 defined above. Then the functional F_m is not convex along W-geodesics. More precisely,
Proof. Since m = 2 we have the simple expressions Inserting these into (4.2), we calculate the Hessian of the Rényi entropy as Choosing in particular ψ = (ψ_1, ..., ψ_N) = (0, 1, 2, ..., 2, 0) and ρ = (ρ_1, ..., ρ_N) = (ε, N − (N − 1)ε, ε, ..., ε), we see that only three terms in the first and one term in the second sum are non-zero, and we find Since A(ρ, ψ) = q + O(ε), the result follows.
Let us remark that in the case where q = N^2, the discrete Laplacian converges to the continuous Laplacian on the torus T = R/Z. In this limit space, it is well known that the Rényi entropy F_2 is convex along 2-Wasserstein geodesics in P(T). However, the previous result shows that the behaviour at the discrete level is very different, since κ_Q ≤ −N^3/2.

4.3.
Consequences of geodesic convexity. In this subsection we will show that convexity of the entropy functional along W-geodesics implies a contraction property for solutions to the porous medium equation as well as a number of functional inequalities for the invariant measure of the Markov chain. These results are in the spirit of the work of Otto-Villani [17]. Similar results for the heat flow associated to a finite Markov chain have been obtained in [8].
As a first result we single out the following κ-contractivity property, which is a direct consequence of the evolution variational inequality (4.4).
Proposition 4.6 (κ-Contractivity of the PME). Suppose that κ_Q ∈ R and let (ρ_t)_t, (σ_t)_t be two solutions of the discrete porous medium equation as given by Proposition 3.2. Then we have for all t ≥ 0: W(ρ_t, σ_t) ≤ e^{−κ_Q t} W(ρ_0, σ_0).
Proof. This follows from Proposition 4.2 by applying [6, Proposition 3.1] to the functional F on the metric space (P(X), W).
We introduce the following functional, which we regard as an analogue of the Fisher information: I(ρ) := (1/2) ∑_{x,y∈X} (f′(ρ(x)) − f′(ρ(y)))(ϕ(ρ(x)) − ϕ(ρ(y))) Q(x, y)π(x). If f is not differentiable at 0, we use the convention that I(ρ) = +∞ for ρ ∉ P_*(X). Using Proposition 3.4, we see that for ρ ∈ P_*(X) we have I(ρ) = ‖∇f′(ρ)‖_ρ^2. The significance of this functional is due to the fact that it gives the change, or dissipation, of the entropy functional along solutions of the equation ∂_t ρ_t = ∆ϕ(ρ_t), namely: d/dt F(ρ_t) = −I(ρ_t). Note that in the setting of Section 3.3, the dissipation functional I_m associated to ϕ_m and f_m is given by I_m(ρ) = (m/(2(m − 1))) ∑_{x,y∈X} (ρ(x)^{m−1} − ρ(y)^{m−1})(ρ(x)^m − ρ(y)^m) Q(x, y)π(x). As before, let 1 ∈ P(X) denote the density of the stationary distribution, which is everywhere equal to 1. It follows from the definition that F(1) = f(1).
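The dissipation identity can be tested with a finite difference. The sketch below assumes ϕ(r) = r^m, f(r) = r^m/(m − 1) and takes the dissipation functional to be I(ρ) = (1/2) ∑_{x,y} (f′(ρ(x)) − f′(ρ(y)))(ϕ(ρ(x)) − ϕ(ρ(y))) Q(x, y)π(x), which is what integration by parts gives for −⟨f′(ρ), ∆ϕ(ρ)⟩_π. The discrete circle, the step h and the function names are hypothetical.

```python
def cycle_generator(N, q):
    """Generator of nearest-neighbour random walk on Z/NZ with rate q (N >= 3)."""
    Q = [[0.0] * N for _ in range(N)]
    for x in range(N):
        Q[x][(x + 1) % N] = q
        Q[x][(x - 1) % N] = q
        Q[x][x] = -2.0 * q
    return Q

def laplacian(Q, psi):
    n = len(psi)
    return [sum(Q[x][y] * (psi[y] - psi[x]) for y in range(n)) for x in range(n)]

def entropy(rho, pi, m):
    """Renyi-type entropy F_m(rho) = 1/(m - 1) * sum_x rho(x)^m pi(x)."""
    return sum(r**m * p for r, p in zip(rho, pi)) / (m - 1)

def dissipation(Q, pi, rho, m):
    """Assumed dissipation functional I_m(rho) (see the lead-in)."""
    n = len(rho)
    fp = [m / (m - 1) * r**(m - 1) for r in rho]
    ph = [r**m for r in rho]
    return 0.5 * sum((fp[x] - fp[y]) * (ph[x] - ph[y]) * Q[x][y] * pi[x]
                     for x in range(n) for y in range(n) if x != y)

def entropy_derivative(Q, pi, rho, m, h=1e-5):
    """Central difference of F_m in the direction v = Delta(rho^m)."""
    v = laplacian(Q, [r**m for r in rho])
    plus = [r + h * vx for r, vx in zip(rho, v)]
    minus = [r - h * vx for r, vx in zip(rho, v)]
    return (entropy(plus, pi, m) - entropy(minus, pi, m)) / (2 * h)

N = 5
Q = cycle_generator(N, 1.0)
pi = [1.0 / N] * N
rho = [0.5, 1.5, 1.0, 1.2, 0.8]
gap = {m: entropy_derivative(Q, pi, rho, m) + dissipation(Q, pi, rho, m)
       for m in (0.5, 1.5, 2.0)}
```

For each tested m the two quantities cancel up to discretisation error, and the dissipation is non-negative because f′ and ϕ are both increasing.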
We introduce the following functional inequalities.
Definition 4.7. The Markov chain (X, Q, π) satisfies
(1) an FWI inequality with constant κ ∈ R if for all ρ ∈ P(X), F(ρ) − F(1) ≤ W(ρ, 1) √I(ρ) − (κ/2) W(ρ, 1)^2;
(2) an entropy-dissipation inequality with constant λ > 0 if for all ρ ∈ P(X), F(ρ) − F(1) ≤ (1/(2λ)) I(ρ).
The following result relates these inequalities to W-geodesic convexity of the entropy functional F. Recall the definition of κ_Q from (4.5).
Theorem 4.8. The following assertions hold.
Proof. The proof follows from similar arguments as the corresponding results in [8] and uses the heuristics developed in the continuous case in [17]. However, for the convenience of the reader we give a self-contained proof here.
Hence, the first assertion of the theorem follows by passing to the limit s → 0 in (4.7). Let us now prove (2). Assume that κ_Q > 0. From part (1) we know that Q satisfies FWI(κ_Q). From this we derive EDI(κ_Q) by an application of Young's inequality xy ≤ c x^2 + (1/(4c)) y^2, in which we set x = W(ρ, 1), y = √I(ρ) and c = κ_Q/2.

Gromov-Hausdorff convergence
In this final section we discuss the convergence of discrete transportation metrics to the Wasserstein metric in a simple setting.
For N ≥ 2 let T^d_N = (Z/NZ)^d be the discrete torus in dimension d ≥ 1, which we regard as an approximation of the continuous torus T^d = [0, 1]^d. We consider the matrix Q : T^d_N × T^d_N → R defined by otherwise.
Note that the entries are scaled in such a way that the associated discrete Laplacian ∆_N defined by (1.2) approximates the Laplacian ∆ on T^d. Fix 0 < m ≤ 2 and let W_N denote the discrete transportation metric corresponding to the weight function θ_m.
The following result has already been announced in the introduction. Since this result can be proved by following the argument in [9], we shall not give a complete proof here. Instead, let us point out the three crucial properties of θ = θ m which allow us to reapply the argument from [9]: (1) θ(t, t) = t for all t ≥ 0.
(2) θ is concave.
where θ_{−1} denotes the harmonic mean defined by θ_{−1}(a, b) := 2ab/(a + b). Assumption (1) is clearly necessary, since one can already check at the formal level that the metrics may converge to a different limit if (1) is violated. The concavity (2) is used in the proof in [9] to ensure that the discrete heat semigroup (P_N(t))_{t≥0} contracts the distance: W_N(P_N(t)ρ_0, P_N(t)ρ_1) ≤ W_N(ρ_0, ρ_1). This estimate is used in regularisation arguments. Finally, the estimate (3) enters the argument, since it turns out to be easier to compare the Wasserstein metric to the discrete transportation metric defined using the harmonic mean. The estimate (3) can then be combined with a regularisation argument, which shows that the difference between W_N and this harmonic-mean metric becomes negligible as N gets large, provided that the weight function satisfies (3).
Let us note that these properties are satisfied for θ_m with 0 < m ≤ 2. Indeed, (1) is obvious, and concavity of θ_m for 0 < m ≤ 2 has been proved in Lemma 3.7. Moreover, since θ_{−1} is the harmonic mean, it follows from the monotonicity of θ_m in m (proved in Lemma 3.7) that it suffices to check (3) for m = 2. In this case an elementary computation shows that (3) indeed holds.