The gradient flow of a generalized Fisher information functional with respect to modified Wasserstein distances

This article is concerned with the existence of nonnegative weak solutions to a particular fourth-order partial differential equation: it is a formal gradient flow with respect to a generalized Wasserstein transportation distance with nonlinear mobility. The corresponding free energy functional is referred to as generalized Fisher information functional since it is obtained by autodissipation of another energy functional which generates the heat flow as its gradient flow with respect to the aforementioned distance. Our main results are twofold: For mobility functions satisfying a certain regularity condition, we show the existence of weak solutions by construction with the well-known minimizing movement scheme for gradient flows. Furthermore, we extend these results to a more general class of mobility functions: a weak solution can be obtained by approximation with weak solutions of the problem with regularized mobility.


Introduction
This work is concerned with the existence of nonnegative weak solutions $u : [0,\infty)\times\Omega\to[0,\infty)$ to the partial differential equation

$$\partial_t u(t,x) = \mathrm{div}\Big(m(u(t,x))\,\nabla\frac{\delta F}{\delta u}(u(t,x))\Big) \quad\text{for } (t,x)\in(0,\infty)\times\Omega, \tag{1}$$

where $\Omega\subset\mathbb{R}^d$ ($d\ge1$) is a bounded and convex domain with smooth boundary $\partial\Omega$ and exterior unit normal vector field $\nu$. The assumptions on the mobility function $m$ and the free energy functional $F$ are specified below. Above, $\frac{\delta F}{\delta u}$ denotes the first variation of $F$ in $L^2$. Additionally, the sought-for solution $u$ to (1) is subject to the no-flux and homogeneous Neumann boundary conditions

$$m(u(t,x))\,\partial_\nu\frac{\delta F}{\delta u}(u(t,x)) = 0 = \partial_\nu u(t,x) \quad\text{for } t>0 \text{ and } x\in\partial\Omega, \tag{2}$$

and to the initial condition

$$u(0,\cdot) = u_0\in L^1(\Omega), \quad\text{with } u_0\ge0 \text{ and } \int_\Omega u_0\,dx = U \text{ for some fixed } U>0. \tag{3}$$
Formally, (1) possesses a gradient flow structure with respect to a modified version $W_m$ of the $L^2$-Wasserstein distance on the space of probability measures, given by

$$W_m(u_0,u_1)^2 = \inf_{(u_s,w_s)\in\mathcal{C}}\int_0^1\!\int_\Omega\frac{|w_s|^2}{m(u_s)}\,dx\,ds, \tag{4}$$

where the admissible set $\mathcal{C}$ is a suitable subclass of curves $(u_s,w_s)_{s\in[0,1]}$ satisfying the continuity equation $\partial_s u_s = -\mathrm{div}\,w_s$ on $[0,1]\times\Omega$ and the initial and terminal conditions $u_s|_{s=0}=u_0$, $u_s|_{s=1}=u_1$. We refer to [5,9] (see also [4,17]) for more details concerning the definition and properties of $W_m$.

Before stating our assumptions on the mobility function $m$, let us consider the linear case $m=\mathrm{id}$. Then, it is well known [2] that $W_m$ coincides (up to a scaling factor depending on the value of $U$) with the classical $L^2$-Wasserstein distance $W_2$ for probability measures on $\Omega$. Using techniques from optimal transportation theory (see, for instance, [16]), various equations of the form (1) for $m=\mathrm{id}$ have been interpreted as gradient systems with respect to $W_2$. The specific gradient flow structure often allows for the analysis of the well-posedness of the underlying evolution equation as well as for the study of the qualitative behaviour of the associated gradient flow solutions [7,14]. A powerful tool for those investigations is the so-called minimizing movement scheme (cf. (6) below), a metric version of the implicit Euler scheme, which is used for the construction of a time-discrete approximate gradient flow. This approach can be of use even in very general situations (see, for example, [1] for an abstract description or [3,17] in the context of coupled systems). A certain class of equations of the form (1) has already been proved to be well-posed by Lisini, Matthes and Savaré [10], using its variational structure with respect to the distance $W_m$. This work aims at a further extension.
In their seminal paper [7] on the variational formulation of the Fokker-Planck equation, Jordan, Kinderlehrer and Otto rigorously interpreted the heat equation $\partial_t u=\Delta u$ as the gradient flow of Boltzmann's entropy $H(u)=\int_\Omega u\log u\,dx$ with respect to $W_2$. Boltzmann's entropy $H$ has an important property [13]: it is 0-convex along (generalized) geodesics with respect to the Wasserstein distance $W_2$ and thus generates a 0-contractive gradient flow (in the sense of Ambrosio, Gigli and Savaré [1]) along which $H$ decreases most. The corresponding dissipation of $H$ over time, $-\frac{d}{dt}H(u(t))=\int_\Omega\frac{|\nabla u(t)|^2}{u(t)}\,dx$, is given by the so-called Fisher information functional $F$. Formally, $F$ induces the (fourth-order) Derrida-Lebowitz-Speer-Spohn equation as its $W_2$-gradient flow. A thorough analysis of the relationship between $H$, $F$ and their corresponding evolution equations has been carried out by Gianazza, Savaré and Toscani [6], and was later generalized by Matthes, McCann and Savaré [12] to more general energy functionals in the Wasserstein framework. Even though the Fisher information functional $F$ does not satisfy a convexity condition along geodesics w.r.t. $W_2$, results on existence and convergence to equilibrium can be deduced using the fact that $F=|\partial H|^2$, i.e., that $F$ is the squared Wasserstein slope of Boltzmann's entropy $H$. The main aim of this work is to extend this specific connection to nonlinear mobility functions $m$ and the generalized Wasserstein distance $W_m$.
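The relation between entropy dissipation and Fisher information in the linear-mobility case can be checked numerically. The following is a minimal sketch under assumptions that are not part of the original analysis: $d=1$, $\Omega=(0,1)$, an explicit finite-volume discretization of the heat equation with no-flux boundary, and an arbitrarily chosen positive initial datum. All function names are hypothetical.

```python
import math

def heat_step(u, lam):
    """One explicit finite-volume step of u_t = u_xx with no-flux boundaries.

    lam = dt / dx**2 should not exceed 0.5 (stability of the explicit scheme).
    """
    n = len(u)
    new = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]        # mirrored ghost cell: zero flux
        right = u[i + 1] if i < n - 1 else u[i]   # mirrored ghost cell: zero flux
        new.append(u[i] + lam * (left - 2.0 * u[i] + right))
    return new

def entropy(u, dx):
    """Discrete Boltzmann entropy H(u) = int u log u dx."""
    return dx * sum(v * math.log(v) for v in u)

def fisher(u, dx):
    """Discrete Fisher information int |u_x|^2 / u dx, evaluated on cell faces."""
    total = 0.0
    for i in range(len(u) - 1):
        grad = (u[i + 1] - u[i]) / dx
        total += grad * grad / (0.5 * (u[i] + u[i + 1]))
    return dx * total

def run(n=200, steps=400, lam=0.4):
    """Return the final state and (H, dissipation rate, Fisher information) per step."""
    dx = 1.0 / n
    dt = lam * dx * dx
    # Assumed initial datum: positive, with total mass 1.
    u = [1.0 + 0.5 * math.cos(math.pi * (i + 0.5) * dx) for i in range(n)]
    records = []
    for _ in range(steps):
        h_old = entropy(u, dx)
        fisher_old = fisher(u, dx)
        u = heat_step(u, lam)
        records.append((h_old, (h_old - entropy(u, dx)) / dt, fisher_old))
    return u, records
```

In this sketch, the second and third entries of each record (the discrete dissipation rate of $H$ and the discrete Fisher information) should agree up to discretization error, while the total mass is conserved and the entropy does not increase.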
However, in this case, the structure of the slope w.r.t. $W_m$ is not known explicitly. Moreover, convexity along geodesics in this space is a very rare property (cf. [4,17]), with the following exception: the heat entropy functional $H$ is known to be 0-convex along geodesics w.r.t. $W_m$ and to generate the heat flow as its 0-contractive gradient flow. The formal dissipation of $H$ along its own gradient flow, i.e., its autodissipation, motivates our specific definition (5) of the free energy functional $F$ to be considered in (1), which is consequently a fourth-order equation: the expression (5) is used if $f(u)\in H^1(\Omega)$, and $F(u):=+\infty$ otherwise. We call $F$ the generalized Fisher information functional associated to the mobility $m$. For linear mobility, functionals of the form (5) for other choices of $f$ have already been studied in [11].
We impose the additional constraint $u(t,x)\le S$, where $S>0$ is either a fixed real number or equal to $+\infty$. The specific value of $S$ is determined by the structure of the mobility function $m$, which is assumed to satisfy condition (M). For mobilities satisfying (M), one can define the distance $W_m$ via formula (4) on the space $X$ of densities $u$ with $0\le u\le S$ and total mass $U$, where $U>0$ is a fixed given number, see [5,9]. In this work, we use some of the topological properties of the metric space $(X,W_m)$. The respective statements are omitted here for the sake of brevity.
In the case $S=\infty$, we need an additional assumption on the growth of $m$ for large $z$, referred to as (M-PG) and formulated in terms of exponents $\gamma_0,\gamma_1\in[0,1]$ subject to an additional requirement. In particular, (M-S) is met by the paradigmatic examples $m(z)=z^\beta$ for $\beta\in(\frac23,1]$ and $S=\infty$ as well as by $m(z)=z^{\beta_1}(S-z)^{\beta_2}$ for $\beta_1,\beta_2\in(\frac23,1]$ and $S<\infty$. Compared to [10], where $\beta\in(\frac12,1]$ is allowed, we need a slightly stricter condition on $m$ here due to the appearance of $m$ in the definition of $f$.

Our proof of existence of solutions to (1) with $F$ defined as in (5) and mobilities satisfying (M) and (M-LSC) relies on the formal gradient structure of equation (1). We use the time-discrete minimizing movement scheme for the construction of weak solutions in the space $X$: given a step size $\tau>0$, define a sequence $(u_\tau^n)_{n\ge0}$ in $X$ recursively via

$$u_\tau^0:=u_0,\qquad u_\tau^n\in\operatorname*{argmin}_{u\in X}\Big(\frac{1}{2\tau}W_m(u,u_\tau^{n-1})^2+F(u)\Big)\quad\text{for } n\in\mathbb{N}. \tag{6}$$

With the sequence $(u_\tau^n)_{n\ge0}$ from (6), we define a time-discrete function $u_\tau:[0,\infty)\to X$ by piecewise constant interpolation in time (7). The resulting main theorem on the limit behaviour of $(u_\tau)_{\tau>0}$ in the continuous time limit $\tau\searrow0$ is as follows.
Then, for each step size $\tau>0$, a time-discrete function $u_\tau:[0,\infty)\to X$ can be constructed via the minimizing movement scheme (6)&(7). Moreover, for each vanishing sequence $\tau_k\to0$ of step sizes, there exists a (non-relabelled) subsequence and a limit function $u:[0,\infty)\to X$ such that the following is true for arbitrary

Notice that due to the non-convexity of the problem, we do not obtain uniqueness of solutions. The study of qualitative properties of $u$ with respect to (large) time is postponed to future research.
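The structure of the minimizing movement scheme and of the associated time-discrete function can be illustrated in a much simpler setting. The following is a minimal sketch under loudly stated assumptions: the metric space $(X,W_m)$ is replaced by the real line with the Euclidean distance, the functional $F$ by a convex scalar energy, and each variational step is solved by a golden-section search. None of this reproduces the analysis of this paper, and all names are hypothetical.

```python
import math

def golden_section_min(phi, a, b, tol=1e-12):
    """Minimize a unimodal function phi on [a, b] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if phi(c) < phi(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

def minimizing_movement(energy, x0, tau, n_steps, radius=10.0):
    """Euclidean analogue of the scheme: x_n minimizes |x - x_{n-1}|^2 / (2 tau) + E(x).

    The search is restricted to [x_{n-1} - radius, x_{n-1} + radius] (an assumption),
    and the penalized objective is assumed to be unimodal there.
    """
    xs = [x0]
    for _ in range(n_steps):
        prev = xs[-1]
        penalized = lambda x, prev=prev: (x - prev) ** 2 / (2.0 * tau) + energy(x)
        xs.append(golden_section_min(penalized, prev - radius, prev + radius))
    return xs

def interpolate(xs, tau, t):
    """Piecewise constant interpolation: value xs[n] for t in ((n-1) tau, n tau]."""
    if t <= 0.0:
        return xs[0]
    n = min(math.ceil(t / tau), len(xs) - 1)
    return xs[n]
```

For the quadratic energy $E(x)=x^2/2$, each step of this sketch coincides with an implicit Euler step, $x_n=x_{n-1}/(1+\tau)$, which reflects the description of the scheme as a metric version of the implicit Euler method.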
Our second main theorem extends the result to mobilities $m$ which are not Lipschitz continuous. We obtain a weak solution to (1) no longer by approximation via the minimizing movement scheme (6), but as a limit of solutions of (1) for mobilities $m_\delta$ which are close to $m$ and satisfy (M-LSC), as $\delta\to0$. Specifically, we approximate $m$ by $m_\delta$ as in [10]: if $S<\infty$, the definition (8) of $m_\delta$ involves the two solutions $z_{\delta,1}<z_{\delta,2}$ of $m(z)=\delta$; if $S=\infty$, the definition (9) of $m_\delta$ involves the unique solution $z_\delta$ of $m(z)=\delta$.
Since $m_\delta$ is constructed in such a way that (M-LSC) is fulfilled, there exists a weak solution $u_\delta$ to (1) with initial condition $u_0$ (which is assumed to be independent of $\delta$ and such that $F_\delta(u_0)<\infty$), in the sense of Theorem 1.1, for each $\delta$. To indicate the dependence of $F$ on $m$, we write, e.g., $f_\delta$ and $F_\delta$ when $m_\delta$ is considered in place of $m$. Analysing the limit behaviour of the family $(u_\delta)_{\delta>0}$ as $\delta\searrow0$, we find that the properties of $u_\delta$ carry over to the limit:

Theorem 1.2 (Existence for non-Lipschitz mobilities). Assume that $m$ satisfies (M), (M-S), and if $S=\infty$, also (M-PG), and define $m_\delta$ for $\delta\in(0,\bar\delta)$ and sufficiently small $\bar\delta>0$ as in (8)/(9). Let an initial condition $u_0\in X$ with $F_\delta(u_0)<\infty$ be given and denote by $u_\delta$ a weak solution to (1) with $m_\delta$ in place of $m$ and initial condition $u_0$ in the sense of Theorem 1.1, for each $\delta\in(0,\bar\delta)$. Then, there exists a vanishing sequence $\delta_k\to0$ and a map $u:[0,\infty)\to X$ such that for the sequence $(u_{\delta_k})_{k\in\mathbb{N}}$ and the limit $u$, one has ; $H^1(\Omega)$) and weakly in $L^2([0,T];H^2(\Omega))$. (c) For almost every $t\in[0,T]$, one has $F_{\delta_k}(u_{\delta_k}(t))\to F(u(t))$ as $k\to\infty$; and the map $t\mapsto F(u(t))$ is almost everywhere equal to a nonincreasing function.

The plan of the paper is as follows. Section 2 is concerned with the proof of Theorem 1.1. We first study the discrete curves obtained by the minimizing movement scheme in Section 2.1 before passing to the continuous time limit in Section 2.2. In Section 3, we show Theorem 1.2. Properties of the approximation with Lipschitz mobilities are investigated in Section 3.1, its convergence in Section 3.2.

Lipschitz mobility functions
In this section, we prove Theorem 1.1. Before studying the properties of the scheme (6)&(7), we first prove an auxiliary result on the relationship between $F$ and $H$. To this end, we make the following specific choice of the integrand $h$ of $H$ (compare with [10]):

$$h(z):=\int_{s_0}^{z}\frac{z-r}{m(r)}\,dr. \tag{10}$$

Note that the following statement does not require condition (M-LSC).
Proof. Note that $h(z)\ge0$ for all $z\in(0,S)$ and $h(s_0)=0$. We first investigate the behaviour of $h$ as $z\searrow0$: for $z<s_0$, one has where the last step follows from concavity of $m$, viz. $m(r)\ge\frac{m(z)}{z}r$. One directly verifies that the limit $\lim_{z\searrow0}\int_z^{s_0}\frac{(r-z)z}{m(z)r}\,dr$ exists. Thanks to the monotonicity of $h$ for $z<s_0$ (clearly, $h'(z)<0$ for $z<s_0$), also the limit $\lim_{z\searrow0}h(z)$ exists. If $S<\infty$, existence of the limit $\lim_{z\nearrow S}h(z)$ follows in analogy. Hence, the integrand $h$ can be continuously extended onto the boundary of $(0,S)$.
So, if $S<\infty$, $h$ is a bounded function. Hence, as $f$ is increasing, we have for some $C>0$ and all $z\in[0,S]$, from which then (11) with $q=1$ follows using Poincaré's inequality.

Consider the case $S=\infty$. By assumption (M-PG), there exist $\bar z>\max(s_0,1)$ and constants $C_0,C_1>0$ such that for all $z>\bar z$: In view of the previous arguments on bounded value spaces, we may restrict ourselves to the case $z>\bar z$ in the following. First, we have by definition of $h$ (10). By similar considerations as above, one easily finds that $\int_{s_0}^{\bar z}\frac{z-r}{m(r)}\,dr\le Cz$, which can be verified by elementary calculations, using that $m(z)\le C(z+1)$ as a consequence of assumption (M). With (14), we again arrive at For the second term, we use (13) to obtain: All in all, we end up with To estimate $f$ from below, we use (13) again: Consider the case $\gamma_0<1$. If $d\ge3$, one has by (M-PG) that Putting $z=u(x)$ and integrating over $x\in\Omega$, we obtain (11) for some $q\ge1$ with the Gagliardo-Nirenberg-Sobolev inequality. The cases $d\in\{1,2\}$ and $\gamma_0=1$ can be treated by similar, but easier arguments.
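The properties of the integrand $h$ used in the proof above ($h\ge0$, $h(s_0)=0$, finite limit as $z\searrow0$) can be checked by hand for linear mobility. The following is a minimal sketch that assumes the form $h(z)=\int_{s_0}^z\frac{z-r}{m(r)}\,dr$ suggested by the estimates in the proof, and the example $m(r)=r$ with $s_0=1$, for which $h(z)=z\log(z/s_0)-z+s_0$. The midpoint rule and all function names are illustrative choices, not taken from the source.

```python
import math

def h_numeric(z, s0, mobility, n=20000):
    """Midpoint rule for h(z) = int_{s0}^{z} (z - r) / m(r) dr (assumed form of h)."""
    width = (z - s0) / n          # negative if z < s0, giving the signed integral
    total = 0.0
    for k in range(n):
        r = s0 + (k + 0.5) * width
        total += (z - r) / mobility(r)
    return total * width

def h_linear(z, s0):
    """Closed form of h for the linear mobility m(r) = r."""
    return z * math.log(z / s0) - z + s0

if __name__ == "__main__":
    for z in (0.01, 0.5, 1.0, 2.0):
        print(z, h_numeric(z, 1.0, lambda r: r), h_linear(z, 1.0))
```

In this example, $h$ vanishes at $s_0$, is nonnegative elsewhere, and tends to $s_0$ as $z\searrow0$, in line with the continuous extension to the boundary described in the proof.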
Proposition 1 (Properties of the scheme). Assume that (M) and (M-LSC) hold, and if $S=\infty$, let (M-PG) be satisfied. Let an initial condition $u_0\in X$ with $F(u_0)<\infty$ be given. Then, for each $\tau>0$, the scheme (6) is well-defined and yields a sequence $(u_\tau^n)_{n\ge0}$ and a discrete solution $u_\tau$ via (7). Moreover, the following statements hold:
(a) For all $n\in\mathbb{N}$, $s,t\ge0$, one has:
(b) There exists a constant $C>0$ such that for all $\tau>0$ and all $n\in\mathbb{N}$, one has:
(c) There exists $p>1$ such that for all $T>0$, there exists a constant $C>0$ such that for all $\tau>0$, the following holds:

Proof. A straightforward application of the direct method from the calculus of variations and Poincaré's inequality shows that, given that
The properties in (a) are a direct consequence of the scheme (6)&(7) and are well-known (see, for instance, [1]).
To prove the additional regularity property (17), we apply the flow interchange technique from [12], using the heat entropy $H$ as 0-convex auxiliary functional. The relevant symbolic calculations are already contained in the proof of [11, Prop. 7.3] where a functional of similar form has been studied: here, it is sufficient to show that Indeed, using the definition of $f$, one sees and the claim follows by assumption (M) on the mobility $m$.

The first estimate in (c) is an immediate consequence of the energy estimate (15) and Poincaré's inequality. The last one is nontrivial only for $d\ge3$ and $S=\infty$. Using the first estimate and the Gagliardo-Nirenberg-Sobolev inequality, the above estimate with $p=\frac{d}{d-2}$ follows from the inequality (14). For the second statement in (c), integrate (17) over time and simplify the telescoping sum to see With (11) and the energy estimate (15), we have for some $C'>0$: which is a finite constant.
Note that since $f'(z)=\frac{2}{\sqrt{m(z)}}$ holds, one can also rewrite the term involving $f$ in (18) as

2.2. Passage to the continuous time limit.
The remainder of this section is concerned with the passage to the continuous time limit $\tau\searrow0$. In particular, we show that the passage to the limit inside (18) yields the time-continuous weak formulation of (1), completing the proof of Theorem 1.1. In order to obtain convergence in a strong sense, we make use of the following extension of the Aubin-Lions compactness lemma.

Furthermore, Proposition 1(c) and the Banach-Alaoglu theorem yield the existence of $v\in L^2([0,T];H^2(\Omega))$ such that $f(u_{\tau_k})$ converges weakly to $v$ in $L^2([0,T];H^2(\Omega))$, possibly extracting another subsequence. We now prove that $f(u_{\tau_k})\to f(u)$ strongly in $L^2([0,T];L^2(\Omega))$ for $u=f^{-1}\circ v$. By a standard interpolation inequality, the desired strong convergence $f(u_{\tau_k})\to f(u)$ in $L^2([0,T];H^1(\Omega))$ then follows. Strong convergence of $u_{\tau_k}$ to $u$ in $L^p([0,T];L^p(\Omega))$ (on a subsequence) is achieved by essentially the same technique as in [17,11], applying Theorem 2.3 for the admissible choices $Y:=L^p(\Omega)$ (with $p>1$ from Proposition 1(c)), We refer to [11] for the details. A rather straightforward application of Vitali's convergence theorem subsequently yields the strong convergence $f(u_{\tau_k})\to f(u)$ in $L^2([0,T];L^2(\Omega))$.

Non-Lipschitz mobility functions
In this section, we consider mobility functions which do not satisfy (M-LSC), but can be approximated in a suitable way by mobilities satisfying (M-LSC), see (8)&(9). Our strategy of proof for Theorem 1.2 is as follows: first, we demonstrate that the a priori estimates from Proposition 1 are uniform w.r.t. the approximation parameter $\delta>0$ when considering a family $(u_\delta)_{\delta\in(0,\bar\delta)}$ of weak solutions to (1) with initial condition $u_0$ (independent of $\delta$) for $m_\delta$ in place of $m$ (in the sense of Theorem 1.1). This will allow us to pass to the limit $\delta\searrow0$ in the weak formulation (19) of (1) for $m_\delta$ to obtain the sought-for weak formulation of (1) for $m$.
Proof. For part (a), we obtain by Theorem 1.1(c): Consequently, (b) immediately follows from the Hölder estimate for $u_\delta$ in $(X,W_{m_\delta})$ (recall (16) and Theorem 1.1(b)) and the monotonicity $W_m(u,\tilde u)\le W_{m_\delta}(u,\tilde u)$ for each $u,\tilde u\in X$. The claims (c)-(e) are a consequence of (a) and the respective estimates of Proposition 1(c), the proof of which does not rely on condition (M-LSC).

3.2. Convergence.
In this section, we prove Theorem 1.2. As a preparation, we show

Lemma 3.2 (Local uniform convergence). Let a vanishing sequence $(\delta_k)_{k\in\mathbb{N}}$ in $(0,\bar\delta)$ be given and denote, for each $k$, $g_{\delta_k}:=f_{\delta_k}^{-1}$. The following statements hold: (a) If $S=\infty$, there exists a (non-relabelled) subsequence on which the sequence $(G_{\delta_k})_{k\in\mathbb{N}}$, defined by converges locally uniformly to the continuous map

Proof. At first, we prove, for arbitrary $S$, that $(g_{\delta_k})_{k\in\mathbb{N}}$ converges (on a suitable subsequence) locally uniformly on $[0,S)$ to $g:=f^{-1}$. Indeed, using the monotonicity $m_{\delta_k}\le m$ on $[0,S)$ and (20), we obtain the differential estimate

$$0\le g_{\delta_k}'(w)=\frac12\sqrt{m_{\delta_k}(g_{\delta_k}(w))}\le\frac12\sqrt{m(g_{\delta_k}(w))}\le C(g_{\delta_k}(w)+1),$$

where the constant $C>0$ does not depend on $k$. Using Gronwall's lemma, we deduce that $g_{\delta_k}$, and consequently also $g_{\delta_k}'$, is $k$-uniformly bounded on compact subsets of $[0,S)$. The application of the Arzelà-Ascoli theorem yields the desired local uniform convergence.
Consider the case $S=\infty$. Using the monotonicity properties (see [10] again) $m_{\delta_k}'(z)\le m'(z)$ and $g(w)\ge g_{\delta_k}(w)$ in combination with the concavity of $m$, we find that for all $w>0$:

measurable set $A\subset[0,T]\times\Omega$, one has for sufficiently large $k$ that where we used the well-known Lions-Villani estimate on square roots [8] (see [10, Lemma A.1] for a formulation in the framework at hand) in the last step. Since $(v_{\delta_k})_{k\in\mathbb{N}}$ is $L^2$-uniformly integrable (by Vitali's theorem), the former estimate also yields $L^2$-uniform integrability of $(2G_{\delta_k}(v_{\delta_k})\nabla\sqrt{v_{\delta_k}})_{k\in\mathbb{N}}$ as $v_{\delta_k}$ is $k$-uniformly bounded in $L^2([0,T];H^2(\Omega))$. Applying Vitali's theorem once again gives $m'(g_{\delta_k}(v_{\delta_k}))\nabla v_{\delta_k}\to m'(g(v))\nabla v$ strongly in $L^2([0,T];L^2(\Omega))$, extracting a subsequence if necessary.

Consider the remaining case $S<\infty$ and notice that almost everywhere on $[0,T]\times\Omega$. Thanks to Lemma 3.2, one has 2 )$\nabla v$ almost everywhere on $[0,T]\times\Omega$. Now, similarly as above, for each measurable set $A\subset[0,T]\times\Omega$: , for some $C'>0$ which does not depend on $k$. Applying Vitali's theorem as in the former case yields the asserted strong convergence $m'(g_{\delta_k}(v_{\delta_k}))\nabla v_{\delta_k}\to m'(g(v))\nabla v$ in $L^2([0,T];L^2(\Omega))$. All in all, we have proved that $u=f^{-1}(v)$ satisfies the weak formulation (19) of (1), so the proof of Theorem 1.2 is finished.