On microscopic origins of generalized gradient structures

Classical gradient systems have a linear relation between rates and driving forces. In generalized gradient systems we allow for arbitrary relations derived from general non-quadratic dissipation potentials. This paper describes two natural origins for these structures. A first microscopic origin of generalized gradient structures is given by the theory of large-deviation principles. While Markovian diffusion processes lead to classical gradient structures, Poissonian jump processes give rise to cosh-type dissipation potentials. A second origin arises via a new form of convergence, which we call EDP-convergence. Even when starting with classical gradient systems, where the dissipation potential is a quadratic functional of the rate, we may obtain a generalized gradient system in the evolutionary $\Gamma$-limit. As examples we treat (i) the limit of a diffusion equation having a thin layer of low diffusivity, which leads to a membrane model, and (ii) the limit of diffusion over a high barrier, which gives a reaction-diffusion system.


Introduction
We consider evolution equations $\dot u = V(t,u)$ that are generated by gradient systems (GS). By a GS we understand a triple $(X, E, R)$, where the state space $X$ is a weakly closed convex subset of a Banach space containing the states $u(t)$. The functional $E : X \to \mathbb{R} \cup \{\infty\}$ is called energy, but in applications it may be a free energy, a relative entropy, or the negative of the entropy. Finally, $R$ is the dissipation potential depending on the state $u$ and the rate $\dot u$ such that $D_{\dot u} R(u,\dot u) \in X^*$ denotes the dissipation force. The induced evolution equation is the force balance
$$0 = D_{\dot u} R(u(t),\dot u(t)) + D_u E(t,u(t)), \tag{1.1}$$
where the symbol $D$ denotes the (partial) Gateaux derivative or the convex subdifferential. Quite often, we will use the dual dissipation potential $R^*$ that is defined by the Legendre-Fenchel transform of $R(u,\cdot)$. Then, the evolution equation can be rewritten as
$$\dot u(t) = D_\xi R^*\big(u(t), -DE(u(t))\big), \tag{1.2}$$
see Section 2.1 for the details. Since $R$ and $R^*$ are in one-to-one correspondence, we will sometimes denote $(X, E, R)$ also by $(X, E, R^*)$, in particular if $R^*$ is given explicitly. We call $\mathfrak{D}$, the time integral of the sum $R(u,\dot u) + R^*(u,-DE(u))$ along a curve $u$, the De Giorgi dissipation functional.
A GS is called classical if the dissipation potential $R(u,\cdot)$ is quadratic, i.e. $R(u,\dot u) = \frac12 \langle G(u)\dot u, \dot u\rangle$ for a linear, symmetric, and positive definite operator $G(u) : X \to X^*$. If we want to emphasize that a GS is not classical, we call it a generalized GS. The aim of this work is to show that generalized GS arise in two natural ways. First, it is shown in [MPR14, MP*15] that they appear via large-deviation principles from a microscopic $N$-particle system for $N \to \infty$, see Section 2.4 for a brief summary of the main result. Second, generalized GS occur as suitable multiscale limits of classical GS.
Obviously, every GS generates exactly one gradient-flow evolution equation by (1.1) or (1.2), but a given evolution equation $\dot u = V(t,u)$ may be generated by many GS. If there exists at least one such GS and we do not want to specify the particular GS, we say that the evolution equation has a gradient structure. As an elementary example we treat the scalar ODE $\dot p = 1 - 2p$ with $p(t) \in [0,1] = \mathrm{Prob}(\{1,2\})$, which we interpret as the Kolmogorov forward equation for a Markov process $(X_t)_{t\ge 0}$ with $X_t \in \{1,2\}$. Obviously, this ODE is generated by the GS $([0,1], E_2, R_2)$ with $E_2(p) = a\,(p - 1/2)^2$ and $R_2(p,\dot p) = \frac{a}{2}\dot p^2$ for any $a > 0$. Of course, GS that simply differ by a scaling constant such as $a > 0$ are not considered as different. Motivated by a Markovian large-deviation principle, a truly different GS $([0,1], E_{Mv}, R_{Mv})$ is obtained for $a > 0$ by
$$E_{Mv}(p) = a\big(p\log p + (1-p)\log(1-p)\big) \quad\text{and}\quad R^*_{Mv}(p,\xi) = a\sqrt{p(1-p)}\; C^*(\xi/a),$$
where the function $C$ and its Legendre dual $C^*$ are given by
$$C(v) = 2v\,\mathrm{arsinh}(v/2) - 2\sqrt{4+v^2} + 4 \quad\text{and}\quad C^*(\xi) = 4\big(\cosh(\xi/2) - 1\big). \tag{1.4}$$
The functions $C$ and $C^*$ will play a fundamental role, so we give some elementary relations: $C(v) = \frac12 v^2 + O(v^4)$, $C^*(\xi) = \frac12 \xi^2 + O(\xi^4)$, $C'(v) = 2\,\mathrm{arsinh}(v/2)$, $\sqrt{pq}\; C^*(\log p - \log q) = 2\big(\sqrt p - \sqrt q\big)^2$, and $\sqrt{pq}\,(C^*)'(\log p - \log q) = p - q$. Indeed, using the last relation and $DE_{Mv}(p) = a\big(\log p - \log(1-p)\big)$ we easily find $\dot p = D_\xi R^*_{Mv}\big(p, -DE_{Mv}(p)\big) = 1 - 2p$. Moreover, using $(C^*)'(\xi) = 2\sinh(\xi/2)$ we see that the evolution takes the exponential form $\dot p = 2\sqrt{p(1-p)}\,\sinh\big(-DE_{Mv}(p)/(2a)\big)$. This form is derived and extensively studied in [BoP14]; it occurs in mechanics [RRG00, Eqn. (5)] and in chemistry, see the discussion at the end of Section 2.3.2. The usage of generalized GS is common in the modeling of materials, e.g. for plasticity, ferromagnetism, etc., where the nonsmoothness and nonlinearity of the constitutive law $\dot u \mapsto D_{\dot u}R(u,\dot u)$ for the dissipative forces is essential, see Section 2.3.1 and the survey [Mie15b]. The mathematical usage of generalized GS in smooth models such as reaction-diffusion equations and systems is rather new.
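The relations for $C$ and $C^*$ can be checked numerically. The following Python sketch is only an illustration: it assumes the cosh-type dual potential $R^*(p,\xi) = a\sqrt{p(1-p)}\,C^*(\xi/a)$ together with $DE(p) = a(\log p - \log(1-p))$, and all function names are hypothetical.

```python
import math

def C(v):
    """C(v) = 2 v arsinh(v/2) - 2 sqrt(4 + v^2) + 4."""
    return 2.0 * v * math.asinh(v / 2.0) - 2.0 * math.sqrt(4.0 + v * v) + 4.0

def C_star(xi):
    """C*(xi) = 4 (cosh(xi/2) - 1)."""
    return 4.0 * (math.cosh(xi / 2.0) - 1.0)

def C_star_prime(xi):
    """(C*)'(xi) = 2 sinh(xi/2)."""
    return 2.0 * math.sinh(xi / 2.0)

def legendre_of_C_star(v):
    """sup_xi (xi*v - C*(xi)); the maximizer solves (C*)'(xi) = v."""
    xi = 2.0 * math.asinh(v / 2.0)
    return xi * v - C_star(xi)

def rhs_cosh(p, a):
    """D_xi R*(p, -DE(p)) for the assumed R*(p, xi) = a sqrt(p(1-p)) C*(xi/a)."""
    dE = a * (math.log(p) - math.log(1.0 - p))
    return math.sqrt(p * (1.0 - p)) * C_star_prime(-dE / a)
```

Under these assumptions, `rhs_cosh(p, a)` should agree with $1 - 2p$ for every $a > 0$, which reflects that the scaling constant $a$ does not change the induced equation.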
One of the remarkable origins of gradient structures arises from the interpretation of a macroscopic system as a Kolmogorov forward equation
$$\dot\rho = Q^*\rho, \quad\text{where } \rho(t) \in \mathrm{Prob}(S), \tag{1.5}$$
for a Markov process $(X(t))_{t\ge 0}$ on the set $S$ with generator $Q$. Considering $N$ independent particles $X_j(t)$, $j = 1,\dots,N$, one can define the empirical process $\rho^N(t) = \frac1N \sum_{j=1}^N \delta_{X_j(t)} \in \mathrm{Prob}(S)$. For $N \to \infty$ the process $\rho^N$ converges to a solution $\rho$ of (1.5). Moreover, according to the program in [AD*11, AD*13], $\rho^N$ satisfies a large-deviation principle that gives rise to a rate functional $I(\rho(\cdot)) = \int_0^T L(\rho(t),\dot\rho(t))\,dt$, where $L$ can be characterized explicitly by $Q$. The main observation in [MPR14] is that $L$ defines a GS $(\mathrm{Prob}(S), E, R)$ via the explicit representation
$$L(\rho,\dot\rho) = R(\rho,\dot\rho) + R^*(\rho,-DE(\rho)) + DE(\rho)[\dot\rho], \tag{1.6}$$
whenever $Q$ has a unique steady state $\pi \in \mathrm{Prob}(S)$ and $Q$ satisfies the detailed-balance condition with respect to $\pi$ (i.e. the Markov process is reversible). We refer to Section 2.4 for details, where we also highlight that the arising gradient systems are classical only in the case of diffusion processes. In case of jumps, one obtains generalized GS involving the function $C$. In particular, for $\dot p = 1 - 2p$ one finds $([0,1], E_{Mv}, R_{Mv})$ with $a = \frac12$.

We consider the above stochastic approach as a first microscopic origin of GS. The second origin involves the concept of evolutionary $\Gamma$-convergence for GS, see the surveys [Ser11,Mie15a] for the general ideas. Here we concentrate on convergence results based on the EDP, cf. (1.3), which is an ideal tool for doing a limit passage for solutions $u_\varepsilon : [0,T] \to X$ for a family $(X, E_\varepsilon, R_\varepsilon)$ of GS depending on a small parameter $\varepsilon$. The aim is then to derive a limiting GS $(X, E_0, R_0)$ such that a limit $u$ of the solutions $u_\varepsilon$ is indeed a solution for the limiting GS.
Our Definition 3.2 introduces the concept of EDP-convergence: A family of GS $(X, E_\varepsilon, R_\varepsilon)$ converges to the GS $(X, E_0, R_0)$ in the EDP sense, if the following holds:
$$\left.\begin{array}{l} u_\varepsilon : [0,T] \to X \text{ is a solution of } (X, E_\varepsilon, R_\varepsilon),\\ u_\varepsilon(0) \rightharpoonup u_0, \text{ and } E_\varepsilon(0,u_\varepsilon(0)) \to E_0(0,u_0) < \infty \end{array}\right\} \Longrightarrow \left\{\begin{array}{l} \exists\, u \text{ sol. of } (X, E_0, R_0) \text{ with } u(0) = u_0 \text{ and a subsequence } \varepsilon_k \to 0:\\ \forall\, t \in\, ]0,T]:\ u_{\varepsilon_k}(t) \rightharpoonup u(t) \text{ and } E_{\varepsilon_k}(u_{\varepsilon_k}(t)) \to E_0(u(t)); \end{array}\right. \tag{1.7a}$$
and, in addition, condition (1.7c) holds. When asking only for condition (1.7a) we speak of pE-convergence, see Definition 3.1. Note that (1.7c) enforces a liminf estimate of De Giorgi's dissipation functionals $\mathfrak{D}_\varepsilon$ along general functions $\tilde u_\varepsilon$, not only along the solutions of the GS $(X, E_\varepsilon, R_\varepsilon)$. Having this liminf estimate, it is easy to pass to the limit in the $\varepsilon$-dependent energy-dissipation estimate (1.3), since the initial energy on the right-hand side is assumed to converge according to (1.7a). Then, applying the EDP for the limiting GS $(X, E_0, R_0)$ we see that $u$ is a solution.
In fact, many approaches to evolutionary $\Gamma$-convergence establish EDP-convergence, but do not explicitly state condition (1.7c) as a main result. E.g. the Sandier-Serfaty approach [SaS04,Ser11], where the terms $\int_0^T R_\varepsilon\,dt$ and $\int_0^T R^*_\varepsilon\,dt$ are treated separately, provides EDP-convergence. Our approach is more general than the latter, since we only ask that the sum $\int_0^T R_\varepsilon\,dt + \int_0^T R^*_\varepsilon\,dt$ behaves well, but not necessarily the individual terms. This has two effects: (i) we can allow for general functions $\tilde u_\varepsilon$, and (ii) it can lead to exchanges between the two terms in the limit $\varepsilon \to 0$. Point (i) is important to explore $\mathfrak{D}_0$ outside of the set of solutions and thus to provide the full information about the GS $(X, E_0, R_0)$, while the set of solutions of the limit equation $\dot u = V_0(t,u) := D_\xi R^*_0\big(u, -D_u E_0(t,u)\big)$ only contains information on $V_0$. Point (ii) is relevant for another important message of this paper: The EDP-limit of classical GS can be a generalized GS. This phenomenon is considered as another microscopic origin of generalized GS.
Here we provide three different examples for point (ii), the first of which is an ODE example in Section 3.3.3, while Sections 4 and 5 contain more elaborate examples treating the membrane limit of a thin layer and the limit of diffusion to reaction, respectively.
For the membrane limit we consider a diffusion equation with a thin layer with very small diffusivity. In [Lie12,Lie13] pE-convergence to the membrane limit was established; however, EDP-convergence was not studied. We start with the diffusion equation, which is the gradient flow for a classical GS on $\mathrm{Prob}(\Omega)$. Using suitable scalings for the diffusion coefficient $a_\varepsilon(x)$, Theorem 4.1 provides EDP-convergence to the generalized GS $(\mathrm{Prob}(\Omega), E, R^*_0)$, whose dual dissipation potential $R^*_0$ contains a membrane term evaluated at $u(0^-)$ and $u(0^+)$, the limits of $u(x)$ at $x = 0$ from the left and from the right, respectively. Thus, $R^*_0$ involves $C^*$ and is therefore non-quadratic.

Section 5 follows [PSV10, PSV12, AM*12] by considering the limit from pure diffusion in physical space and along a reaction-path variable $y \in \Upsilon \subset \mathbb{R}$ to a limit of a reaction-diffusion system on $\Omega$. The Fokker-Planck equation involves a potential $V$ with two global minima $y_0$ and $y_1$ and one global maximum in between. This equation is generated by the classical GS $(\mathrm{Prob}(\Omega\times\Upsilon), E_\varepsilon, R^*_\varepsilon)$, where $E_\varepsilon$ is the relative entropy and $R^*_\varepsilon$ is the quadratic Wasserstein dissipation potential, see (5.2). Theorem 5.2 establishes EDP-convergence to a generalized GS $(\mathrm{Prob}(\Omega\times\{y_0,y_1\}), E, R^*)$, where $R^*$ again involves the non-quadratic function $C^*$.

We conclude our introduction by a general and surprising observation. The three main models in this work (i.e. the ODE, the membrane, and the reaction-to-diffusion model in Sections 3.3.2, 4, and 5, respectively) can be seen as Kolmogorov forward equations for naturally associated Markov processes. Thus, the large-deviation theory of Section 2.4 is applicable and provides entropic GS $(\mathrm{Prob}(S), E_\varepsilon, R_\varepsilon)$ for the associated Kolmogorov forward equations $\dot\rho_\varepsilon = Q^*_\varepsilon \rho_\varepsilon$ for each $\varepsilon \in\, ]0,1[$ as well as for $\varepsilon = 0$. The limit for $\varepsilon = 0$ can also be defined in terms of the classical convergence for Markov processes. Ignoring the (linear) Markovian structure, we can also consider EDP-convergence of the induced entropic GS.
In all our three examples we find the surprising result that the EDP-limit is exactly the entropic GS of the limiting Markov process. This means that applying the described large-deviation principle and taking the limit $\varepsilon \to 0$ (either on the level of Markov semigroups or as EDP-convergence for GS) commute, see Figure 1.1. This result appears naturally if we use representation (1.6) of the rate function $I$.

Figure 1.1: For reversible, time-continuous Markov processes the large-deviation principle (LDP) of Section 2.4 provides a (generalized) gradient structure. This mapping commutes with taking the limit $\varepsilon \to 0$ and EDP-convergence, respectively.
Hence, the above large-deviation principle exactly encodes the energy-dissipation principle, and EDP-convergence for the induced entropic GS can be interpreted as $\Gamma$-convergence of the rate functionals. The question of how general this interchangeability of suitable large-deviation principles and EDP-convergence is seems to be challenging, but goes beyond the scope of this work. We mention that in [BoP14] similar relations between large-deviation principles and evolutionary $\Gamma$-convergence are studied.
As a final general remark, we emphasize that this paper focuses on the modeling aspects of the emergence of generalized GS. Thus, we do not give the full analytical details in terms of estimates and convergences in the proper functional spaces, but rather highlight the structures and manipulations needed to understand the corresponding limit procedures.

Classical and generalized gradient systems
We now convert the formal ideas from the introduction into rigorous mathematical statements. We call a triple $(X, E, R)$ a gradient system (GS), if $X$ is a Banach space, $E : [0,T] \times X \to \mathbb{R}_\infty := \mathbb{R} \cup \{\infty\}$ is a functional (such as the free energy, the negentropy, etc.), and $R : X \times X \to [0,\infty]$ is a dissipation potential, which means that for all $u \in X$ the functional $R(u,\cdot) : X \to \mathbb{R}_\infty$ is lower semicontinuous, nonnegative, convex, and satisfies $R(u,0) = 0$. In this section, we allow for the case that the energy functional depends on the time variable $t \in [0,T]$ to show that the abstract principle is valid in this general case. However, for notational convenience we will restrict to the autonomous case (i.e. $\partial_t E(t,u) \equiv 0$) in all other parts. We speak of a classical GS, if $R(u,\cdot)$ is quadratic, i.e. there exists a symmetric and positive definite operator $G$ such that $R(u,v) = \frac12 \langle G(u)v, v\rangle$. However, plasticity requires non-quadratic dissipation potentials, e.g. of the form $R(\dot\pi) = \sigma_{yield} \|\dot\pi\|_{L^1} + \frac12 \mu_{visc} \|\dot\pi\|^2_{L^2}$, see [Mie03,MiR15]. In particular, the rate-independent case is based on $R(u,\lambda v) = \lambda R(u,v)$ for all $\lambda > 0$, which is incompatible with a quadratic form. If $R(u,\cdot)$ is non-quadratic, we call $(X, E, R)$ a generalized GS.

Variational principles for gradient systems
The following proposition from convex analysis shows that there are several completely equivalent formulations of the generalized force balance (1.1). The equivalences of the points (ii) to (iv) below are also called Fenchel equivalences, cf. [Fen49]. The essential tool is the Legendre-Fenchel transform $\psi^* : X^* \to \mathbb{R}_\infty$ of a convex function $\psi$, namely $\psi^*(\xi) := \sup\{\, \langle \xi, v\rangle - \psi(v) \mid v \in X \,\}$. In a reflexive Banach space we have $(\psi^*)^* = \psi$ whenever $\psi$ is proper, convex, and lower semicontinuous.
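Proposition 2.1 (Equivalent formulations) Let $X$ be a reflexive Banach space and $\psi : X \to \mathbb{R}_\infty$ be proper, convex, and lower semicontinuous. Then, for every $\xi \in X^*$ and every $v \in X$ the following five statements are equivalent:
(i) $v$ minimizes $w \mapsto \psi(w) - \langle \xi, w\rangle$;
(ii) $\xi \in \partial\psi(v)$;
(iii) $\psi(v) + \psi^*(\xi) = \langle \xi, v\rangle$;
(iv) $v \in \partial\psi^*(\xi)$;
(v) $\xi$ minimizes $\eta \mapsto \psi^*(\eta) - \langle \eta, v\rangle$.
Note that the definition of $\psi^*$ immediately implies the Young-Fenchel inequality $\psi(w) + \psi^*(\eta) \ge \langle \eta, w\rangle$ for all $w$ and $\eta$. Thus, (iii) expresses an optimality as well.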
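As a small numerical illustration of the Legendre-Fenchel transform and the Young-Fenchel inequality, the following Python sketch treats a scalar potential of plasticity type, $\psi(v) = r|v| + \frac12\mu v^2$. The parameter values, the grid, and all function names are assumptions chosen for illustration; the closed form $\psi^*(\xi) = \max\{|\xi| - r, 0\}^2/(2\mu)$ follows from an elementary maximization.

```python
def legendre_fenchel(psi, xi, grid):
    """Brute-force approximation of psi*(xi) = sup_v (xi*v - psi(v)) on a grid."""
    return max(xi * v - psi(v) for v in grid)

def psi(v, r=1.0, mu=2.0):
    """Non-quadratic potential: rate-independent part r|v| plus viscous part."""
    return r * abs(v) + 0.5 * mu * v * v

def psi_star_exact(xi, r=1.0, mu=2.0):
    """Closed-form dual: zero inside the threshold |xi| <= r, quadratic outside."""
    return max(abs(xi) - r, 0.0) ** 2 / (2.0 * mu)

# Illustrative grid on [-10, 10] with spacing 0.001 (an assumption, not from the text).
GRID = [-10.0 + 0.001 * k for k in range(20001)]
```

The flat region of `psi_star_exact` for $|\xi| \le r$ mirrors the kink of $\psi$ at $v = 0$: forces below the threshold produce no rate.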

The energy-dissipation principle
The above formulations can already be understood in a variational sense, since the evolution is expressed by extremizing a functional or by variations or derivatives of the two functionals $E$ and $R$. However, for mathematical purposes it is desirable to have formulations in terms of a minimization problem for the whole solution trajectories $u : [0,T] \to X$. One such principle can be derived on the basis of the power balance (PB) by integration in time and using the chain rule and finally employing the Young-Fenchel inequality $\psi(w) + \psi^*(\eta) \ge \langle \eta, w\rangle$, cf. [DMT80] or the survey [Mie15a]. This leads to the celebrated energy-dissipation principle, also called De Giorgi's $(R, R^*)$ principle, see [AGS05] for some historical remarks.
Under additional technical conditions it is sufficient to have only the upper estimate where "$=$" is replaced by "$\le$". In this case, we speak of the energy-dissipation estimate (EDE).
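The energy-dissipation balance can be tested on the elementary quadratic example $\dot p = 1 - 2p$ with $E_2(p) = a(p - 1/2)^2$ and $R_2(p,\dot p) = \frac a2 \dot p^2$. The Python sketch below assumes the autonomous form $E(u(T)) + \int_0^T [R(u,\dot u) + R^*(u,-DE(u))]\,dt = E(u(0))$ of the principle; the Simpson quadrature and all names are illustrative choices.

```python
import math

def solution(t, p0):
    """Exact solution of p' = 1 - 2p with p(0) = p0."""
    return 0.5 + (p0 - 0.5) * math.exp(-2.0 * t)

def energy(p, a):
    """E_2(p) = a (p - 1/2)^2."""
    return a * (p - 0.5) ** 2

def dissipation_rates(p, a):
    """Return (R_2(p, p'), R_2*(p, -DE_2(p))) along the solution."""
    p_dot = 1.0 - 2.0 * p
    force = -2.0 * a * (p - 0.5)           # -DE_2(p)
    primal = 0.5 * a * p_dot ** 2          # R_2(p, p') = (a/2) p'^2
    dual = force ** 2 / (2.0 * a)          # R_2*(p, xi) = xi^2 / (2a)
    return primal, dual

def simpson(f, t0, t1, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (t1 - t0) / n
    s = f(t0) + f(t1)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * f(t0 + k * h)
    return s * h / 3.0

def edp_defect(p0, a, T):
    """E(p(T)) + integral of (R + R*) - E(p(0)); vanishes for solutions."""
    total = simpson(lambda t: sum(dissipation_rates(solution(t, p0), a)), 0.0, T)
    return energy(solution(T, p0), a) + total - energy(p0, a)
```

For this classical GS the two entries returned by `dissipation_rates` coincide along the solution, so the dissipation is split equally between $R$ and $R^*$.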

Examples of generalized gradient structures
Here we give some examples of generalized gradient structures. First, we discuss dissipative material models like plasticity or shape-memory materials that form a huge class of generalized GS. Second, we treat nonlinear reaction-diffusion systems (RDS), which will be closer to the main theme of this paper. The third class of examples concerns reversible Markov processes, where the Kolmogorov forward equation has a gradient structure with the relative entropy as energy functional. This latter class is so important that it is treated in the subsequent Subsection 2.4.

Dissipative material models
The state of a body $\Omega \subset \mathbb{R}^d$, composed of so-called dissipative materials (also called standard generalized materials), is given in terms of the elastic displacement $u : \Omega \to \mathbb{R}^d$ and an additional internal variable $z : \Omega \to \mathbb{R}^k$. The latter may describe plastic deformations, damage, phase-field variables, magnetization, or other internal states of the material. The total stored energy $E$ depends on $u$, $z$, and usually also on a process time $t \in [0,T]$, i.e. $E(t,u,z)$. As introduced in the theory of standard generalized materials in [HaN75], the dissipative forces are given in terms of a (primal) dissipation potential $R$ that may combine a viscoelastic part $R_{visc}$, formulated through the linearized strain $e(u) = \frac12\big(\nabla u + (\nabla u)^\top\big)$, with a part $R_{diss}$ for the internal variables. As before, the corresponding force balance equations (FB) are
$$0 = D_{\dot u} R(\dot u,\dot z) + D_u E(t,u,z), \qquad 0 \in \partial_{\dot z} R(\dot u,\dot z) + D_z E(t,u,z),$$
with $\partial_{\dot z} R(\dot u,\dot z)$ denoting the set-valued convex subdifferential. While the viscoelastic potential $R_{visc}$ is assumed to be quadratic in many applications, the potential $R_{diss}$ for the internal variables $z$ is often supposed to be non-quadratic. E.g. in viscoplasticity with yield stress $\sigma_{yield}$ one takes a form whose growth exponent exceeds 1 only by a small positive parameter, e.g. 0.012 in [ZR*06]. This weak, slightly superlinear growth is sometimes even replaced by a growth $O(|\dot z| \log|\dot z|)$ as given by our function $C$ (see e.g. [RRG00, Eqn. (5)] and [BoP14]). For later reference, we mention the very simple scalar hysteresis model of a so-called play operator. It is given by the generalized GS $(\mathbb{R}, E_{play}, R_{play})$ with
$$E_{play}(t,z) = \tfrac12 z^2 - \ell(t)\,z \quad\text{and}\quad R_{play}(\dot z) = r|\dot z| \text{ with } r > 0. \tag{2.1}$$
It serves as a limit for evolutionary $\Gamma$-convergence in Example 3.3 as well as a large-deviation limit in [BoP14].

Nonlinear reaction-diffusion systems
We consider concentrations $c(t) : \Omega \to [0,\infty[^I$ of chemical species $C_1, \dots, C_I$ that can react according to $R$ reactions of mass-action type, each given by a stoichiometric relation, where $r = 1, \dots, R$ is the index of the reaction, $k^f_r$ and $k^b_r$ are the forward and backward reaction coefficients, and the stoichiometric coefficients $\alpha^r_i$ and $\beta^r_i$ are nonnegative integers. The reaction-diffusion system (RDS) for the concentrations $c = (c_1, \dots, c_I)$ is referred to as (2.2). It is shown in [Mie11,Mie13b] that (2.2) has a (classical) gradient structure under the additional assumption of the detailed-balance condition (2.3). Using the Boltzmann function $\lambda_B(z) = z\log z - z + 1$ one defines the relative entropy, which gives rise to the vector of thermodynamic driving forces (also called chemical potentials). Because of the logarithm laws and the detailed-balance conditions, the reaction terms can be related to these driving forces. To construct the dual dissipation potentials we may choose any scalar, strictly convex dual dissipation functional $\Phi : \mathbb{R} \to \mathbb{R}$ with $\Phi(0) = \Phi'(0) = 0$ and $\Phi''(0) > 0$, see also [Maa11,ErM12,MaM15a]. However, it was already criticized in the 1930s that the linear relation $\dot c = -K(c)\mu$ (i.e. $R^*(c,\mu) = \frac12 \langle \mu, K(c)\mu\rangle$ is quadratic) arising from Onsager's principle is not suitable for chemical reactions if one wants to model systems that are not very close to thermal equilibrium. As a solution Marcelin and de Donder introduced exponential dependencies between $\mu$ and $\dot c$, see [Fei72, Def. 3.3] or [GK*00, Eqn. (11)]. In [Grm10], Remark iii on p. 77 gives some historical comments, and Eqn. (69) explicitly features an exponential dissipation potential $\Xi$ involving the function $e^{\xi/2} + e^{-\xi/2} - 2$. Since the choice $\Phi(\xi) = C^*(\xi)$ is central for our paper, the dual dissipation potential $R^*$ for this case is given explicitly in (2.5). We will see that exactly the same structure, up to a trivial scaling factor 1/2, arises via the large-deviation principle described next, see also [MP*15].

Markov processes, large deviations, and GS
Here we give a rough sketch of the theory in [MPR14] about gradient structures for the Kolmogorov forward equation $\dot\rho = Q^*\rho$ of Markov processes satisfying a detailed-balance condition, which are also called reversible Markov processes, for short. The idea that large-deviation principles generate gradient structures goes back to [OnM53] (see Eqn. (4-21) therein for a quadratic version of the energy-dissipation principle derived by large deviations, called Boltzmann's principle). The mathematical theory was developed only recently, see [AD*11, AD*13, MPR14]. In Section 2.4.1 we first describe a time-dependent large-deviation principle for general Markov processes providing a formula for the rate function $I(\rho(\cdot)) = \int_0^T L(\rho(t),\dot\rho(t))\,dt$ and then present the result of [MPR14], which shows that for reversible Markov processes the functional $L$ is induced by an EDP for a GS $(\mathrm{Prob}(S), E, R)$. In Sections 2.4.2 to 2.4.4 we then discuss a few applications of the abstract result in Theorem 2.3.

Gradient structures obtained via large deviations
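We consider a compact metric space $S$ and denote by $\mathrm{Prob}(S)$ the subset of probability Radon measures on $S$ equipped with the narrow convergence $\overset{*}{\rightharpoonup}$ defined by duality with continuous, bounded functions. The Kolmogorov forward equation $\dot\rho = Q^*\rho$ describes the evolution of the probability laws $\rho(t)$ of a Markov process $(X_t)_{t\ge 0}$, if the law of $X_0$ is given by $\rho(0) \in \mathrm{Prob}(S)$. The Markov generator is given as $Q$ acting on functions on $S$, while its dual $Q^*$ acts on measures. For $N$ independent particles $X_j$ the empirical process $\rho^N(t) = \frac1N \sum_{j=1}^N \delta_{X_j(t)}$ can be defined. Using the law of large numbers the limit $N \to \infty$ gives $\rho^N(t) \overset{*}{\rightharpoonup} \rho(t)$, which solves the Kolmogorov forward equation $\dot\rho = Q^*\rho$, see e.g. [Ren13, Thm. 2.3.1]. Under suitable assumptions, see [MPR14], it is shown in [FeK06] that the empirical process $\rho^N$ satisfies a large-deviation principle with a rate function $I(\rho(\cdot))$, i.e.
$$\mathrm{Prob}\big(\rho^N \approx \rho(\cdot)\big) \simeq e^{-N\, I(\rho(\cdot))} \quad\text{for } N \to \infty,$$
see the above references for the proper definition of "$\simeq$". The main result is that $I$ has the form $I(\rho(\cdot)) = \int_0^T L(\rho(t),\dot\rho(t))\,dt$, where $L(\rho,\cdot)$ is the Legendre dual of $H(\rho,\cdot)$ with
$$H(\rho,\xi) = \int_S e^{-\xi(x)}\,\big(Q e^{\xi}\big)(x)\,\rho(dx).$$
We emphasize the simplicity of this formula and the (separate) linearity in $\rho$ and in $Q$.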
A main observation in [MPR14] is that the deterministic case, which is characterized by a vanishing rate function, can be interpreted as an energy-dissipation principle if and only if the Markov process is reversible, which is the same as asking for the detailed-balance condition (cf. (2.3)) for the linear Kolmogorov forward equation $\dot\rho = Q^*\rho$. Hence, we now further assume that there exists a stationary measure $\pi \in \mathrm{Prob}(S)$ which has, without loss of generality, the full set $S$ as its support. We say that $Q$ satisfies the detailed-balance condition with respect to $\pi$, if
$$\int_S f\,(Qg)\,d\pi = \int_S g\,(Qf)\,d\pi \tag{2.6}$$
for all $f$ and $g$ in the domain of $Q$. Choosing $g \equiv 1$, we find that $Q^*\pi = 0$, i.e. the detailed-balance condition implies the stationarity. This version of the detailed-balance condition for Markov processes coincides with the detailed-balance condition for chemical reactions in (2.3), as can be seen in the ODE case. In the sequel we will use the Radon-Nikodym derivative of $\rho$ with respect to $\pi$, denoted by $f = \frac{d\rho}{d\pi} \in L^1_{\ge 0}(S,\pi)$ and defined via $\rho(A) = \int_A f\,d\pi$.

Theorem 2.3 (cf. [MPR14]) If the Markov process is reversible, i.e. a stationary measure $\pi \in \mathrm{Prob}(S)$ satisfying the detailed-balance condition (2.6) exists for the Kolmogorov forward equation $\dot\rho = Q^*\rho$, then the large-deviation rate functional $\int_0^T L(\rho,\dot\rho)\,dt$ has the form of an energy-dissipation principle, namely $L$ has the representation (1.6).

The cited reference contains not only a full proof, but also specifies under what assumptions this implication is in fact an equivalence, i.e. the existence of a gradient structure implies the existence of a steady state satisfying the detailed-balance condition.
Since the arguments and proofs in [MPR14] are quite involved, we highlight here the main structures and formal calculations to see that $(\mathrm{Prob}(S), E, R^*)$ is a GS and that it generates the Kolmogorov equation $\dot\rho = Q^*\rho$.
We first observe that $R^*$ is defined in terms of $H$ and inherits the convexity of $H(\rho,\cdot)$. This convexity is independent of the detailed-balance condition and can be established as follows. Consider the Markov semigroup $P_t = e^{tQ}$ for $t \ge 0$. For fixed $\rho \in \mathrm{Prob}(S)$ and $t \ge 0$ define
$$A_t(\xi) = \int_S \int_S e^{-\xi(x)+\xi(y)}\, p_t(x,dy)\,\rho(dx),$$
where $p_t$ with $p_t(x,\cdot) \in \mathrm{Prob}(S)$ denotes the time-dependent Markov kernel. From the convexity of $\xi \mapsto e^{-\xi(x)+\xi(y)}$ and the nonnegativity of $p_t$ and $\rho$, we conclude that $\xi \mapsto A_t(\xi)$ is convex. Using $\frac1t (P_t\eta - \eta) \to Q\eta$ and $A_0(\xi) \equiv 1$, we see that the limit $H(\rho,\xi) = \lim_{t\to 0} \frac1t\big(A_t(\xi) - 1\big)$ is also convex in $\xi$. By definition we have $R^*(\rho,0) = 0$, and the detailed-balance condition implies the time reversibility $R^*(\rho,-\xi) = R^*(\rho,\xi)$. Since this implies $DR^*(\rho,0) = 0$, convexity gives the positivity $R^*(\rho,\xi) \ge 0$. Thus, $R^*$ is indeed a dual dissipation potential.

A finite-state Markov process
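We consider the finite state space $S = \{1, \dots, I\}$, such that $\mathrm{Prob}(S)$ consists of the vectors $c \in [0,1]^I$ with $\sum_{i=1}^I c_i = 1$ and the Kolmogorov forward equation is the linear ODE $\dot c = Ac$. Note that the Markov generator is given by $Q_{finite} = A^\top$, and the conditions for a Markov generator are $A_{ij} \ge 0$ for all $i \ne j$ and $\sum_{j=1}^I A_{ji} = 0$ for all $i = 1, \dots, I$. We further assume that there is a unique positive steady state $\pi = w \in \mathrm{Prob}(S)$ such that the detailed-balance condition holds, namely $A_{ij} w_j = A_{ji} w_i$ for all $i$ and $j$. To calculate the dissipation potential we use that $Q_{finite}$ can be split into elementary jump generators between pairs of states, expressed via the unit vectors $e_k \in \mathbb{R}^I$. Using the linearity in $Q_{finite}$ of the formula (2.7) for $R_{finite}$ we can first calculate the contribution of each pair. Summing these terms and using the function $C^*$, we conclude by Theorem 2.3 that the equation $\dot c = Q^\top_{finite} c$ is induced by the GS $(\mathrm{Prob}(S), E_{finite}, R_{finite})$.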
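The claim that a cosh-type GS induces the linear equation $\dot c = Ac$ can be probed numerically. The Python sketch below rests on assumptions that are not spelled out in the text above: the energy is taken as $E(c) = \frac12 \sum_i c_i \log(c_i/w_i)$ (one-half of the relative entropy) and the dual dissipation potential as $R^*(c,\xi) = \frac12 \sum_{i<j} k_{ij}\sqrt{c_i c_j}\, C^*\big(2(\xi_i - \xi_j)\big)$ with the symmetric coefficients $k_{ij} = A_{ij}\sqrt{w_j/w_i}$. All function names are hypothetical.

```python
import math

def c_star_prime(xi):
    """(C*)'(xi) = 2 sinh(xi/2)."""
    return 2.0 * math.sinh(xi / 2.0)

def generator_from(k, w):
    """Build A with A_ij = k_ij sqrt(w_i/w_j) for i != j and zero column sums."""
    n = len(w)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                A[i][j] = k[i][j] * math.sqrt(w[i] / w[j])
    for j in range(n):
        A[j][j] = -sum(A[i][j] for i in range(n) if i != j)
    return A

def gradient_flow_rhs(c, k, w):
    """D_xi R*(c, -DE(c)) for the assumed cosh-type gradient structure."""
    n = len(c)
    # xi = -DE(c) with DE(c)_i = (log(c_i/w_i) + 1)/2
    xi = [-0.5 * (math.log(c[i] / w[i]) + 1.0) for i in range(n)]
    return [sum(k[i][j] * math.sqrt(c[i] * c[j]) * c_star_prime(2.0 * (xi[i] - xi[j]))
                for j in range(n) if j != i)
            for i in range(n)]

def linear_rhs(c, A):
    """Right-hand side A c of the Kolmogorov forward equation."""
    n = len(c)
    return [sum(A[i][j] * c[j] for j in range(n)) for i in range(n)]
```

By construction `generator_from` yields a matrix satisfying $A_{ij} w_j = A_{ji} w_i$, so the comparison of `gradient_flow_rhs` with `linear_rhs` tests exactly the reversible case.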

Linear reaction-diffusion systems
We now return to RDS as discussed in Section 2.3.2, but now consider only linear reactions where all stoichiometric vectors $\alpha^r$ and $\beta^r$ are given by unit vectors $e_i$ and $e_j$, respectively. This means that the reaction is a simple exchange reaction $C_i \rightleftharpoons C_j$. The linear RDS (2.8) on a bounded smooth domain $\Omega \subset \mathbb{R}^d$ is complemented by no-flux boundary conditions. The matrix $A$ is as before. Now, $c_j(t,\cdot) \in L^1(\Omega)$ is the nonnegative concentration of the chemical species $C_j$. This system can be understood as the Kolmogorov forward equation on the state space $S = \Omega \times \{1, \dots, I\}$: a particle $(x(t), i(t))$ undergoes a Brownian motion in $\Omega$ with the $j$-th diffusion constant as long as $i(t) = j$. At discrete times the particle can change its type within $\{1, \dots, I\}$, according to the jump process induced by the generator $Q_{finite} = A^\top$, and then continue a Brownian motion with the new diffusion constant. We now assume that the linear reaction system satisfies the detailed-balance condition, i.e. we assume that there is an equilibrium state $w$ with $w_i > 0$ and $A_{ij} w_j = A_{ji} w_i$ for all $i$ and $j$. Then, the steady state $\pi \in \mathrm{Prob}(S)$ is given by the product of the $d$-dimensional Lebesgue measure on $\Omega$ and $w$, up to a suitable normalization factor $Z$. By normalizing $w$ suitably, we may assume $Z = 1$ subsequently. Using the Neumann boundary conditions, it is easy to check that the generator $Q$ satisfies the detailed-balance condition (2.6) with respect to $\pi$.
Hence, we can apply Theorem 2.3, which provides the large-deviation GS for (2.8). Note that the probability density $\frac{d\rho}{d\pi}$ equals the vector of relative concentrations $(c_i/w_i)_{i=1,\dots,I}$. The driving functional $E$ is the relative entropy up to a factor 1/2, see (2.9).
For calculating the dissipation potential $R$ we can take advantage of the linearity in $Q$. In fact, $Q$ can be split into $I$ diffusion processes and the reaction part. The dual dissipation potential $R^*$ is obtained by replacing $\xi(x,j)$ by $\xi(x,j) + \frac12 \log\big(c_j(x)/w_j\big)$. Subtracting the term at $\xi = 0$, writing $\xi = (\xi_j)_j$ with $\xi_j(x) = \xi(x,j)$, and using the result for $Q_{finite}$ from above, we arrive at a formula for $R^*$. This is the same dual dissipation potential as given in (2.5), except for the factors $\frac12$ and 2 outside and inside of $C^*$. These scaling factors arise since in the large-deviation result in Theorem 2.3 a factor $\frac12$ appears in the definition of $E(\rho)$ (see (2.9)), which is one-half of the usual relative entropy.

Large deviations for a membrane model
We consider a diffusion equation in the interval $\Omega = \,]{-1},1[$, where at $x = 0$ there is a membrane giving rise to a transmission condition. The Kolmogorov forward equation (where $\dot{\ } = \partial_t$ and ${}' = \partial_x$) consists of a diffusion equation on each side of the membrane, no-flux conditions at $x = \pm 1$, and a transmission condition at $x = 0$. The last relation means first that the mass flowing out of $]{-1},0[$ has to equal the flow into $]0,1[$, and second that this flow is proportional to the difference of the densities. The invariant measure is $\pi = \frac12\,dx$. The functional $H_{memb}$ is defined for functions $\zeta$ with $\zeta'(\pm 1) = 0$ and $a_+ \zeta'(0^+) e^{\zeta(0^+)} = b\big(e^{\zeta(0^+)} - e^{\zeta(0^-)}\big) = a_- \zeta'(0^-) e^{\zeta(0^-)}$.
Inserting $\zeta = \xi + \frac12 \log(2\rho)$ and doing an integration by parts using the nonlinear boundary conditions, one obtains the dual dissipation potential, which again features the non-quadratic dissipation function $C^*$.

Evolutionary $\Gamma$-convergence
Following the notions in the survey [Mie15a] we consider families of GS $(X, E_\varepsilon, R_\varepsilon)_{\varepsilon \in ]0,1[}$ and ask the question whether the solutions $u_\varepsilon$ for these systems have a limit $u$ for $\varepsilon \to 0$ and whether $u$ is again a solution to a GS $(X, E_0, R_0)$. Ideally, one might hope that it is sufficient for $E_\varepsilon$ and $R_\varepsilon$ to converge in a suitable topology to $E_0$ and $R_0$, respectively. Such results indeed exist and can be found in the surveys [Ser11,Mie15a]. However, the aim of this work is to highlight the fact that starting with classical (i.e. quadratic) dissipation potentials $R_\varepsilon$ we may end up with a limiting dissipation $R_0$ that is non-quadratic. Thus, limits of classical GS may be generalized GS. First such examples were given in [Mie12,MiT12] in the context of plasticity.

pE-convergence of gradient systems
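We first recall the general definition of pE-convergence, which is a short name for evolutionary $\Gamma$-convergence with well-prepared initial conditions. Hence, the letter "E" stands for both 'E'volutionary convergence and 'E'nergy convergence, while the letter "p" stands for well-'P'reparedness of the initial conditions, i.e., $E_\varepsilon(0,u_\varepsilon(0)) \to E_0(0,u_0) < \infty$.

Definition 3.1 (pE-convergence of $(X, E_\varepsilon, R_\varepsilon)$) We say that the generalized gradient systems $(X, E_\varepsilon, R_\varepsilon)$ pE-converge to $(X, E_0, R_0)$, if the following holds: whenever $u_\varepsilon : [0,T] \to X$ are solutions of $(X, E_\varepsilon, R_\varepsilon)$ with $u_\varepsilon(0) \rightharpoonup u_0$ and $E_\varepsilon(0,u_\varepsilon(0)) \to E_0(0,u_0) < \infty$, there exist a solution $u$ of $(X, E_0, R_0)$ with $u(0) = u_0$ and a subsequence $\varepsilon_k \to 0$ such that
$$\forall\, t \in\, ]0,T]:\quad u_{\varepsilon_k}(t) \rightharpoonup u(t) \ \text{ and } \ E_{\varepsilon_k}(u_{\varepsilon_k}(t)) \to E_0(u(t)). \tag{3.1}$$
Here $u_\varepsilon \rightharpoonup u$ means the weak convergence in the Banach space $X$. We emphasize that the notion of pE-convergence asks for convergence of both the solutions and the energies, but not of the dissipation potentials. However, using the EDP and the convergence of the energies, we easily obtain convergence of the integrated dissipations, namely
$$\int_0^T \big[ R_{\varepsilon_k}(u_{\varepsilon_k},\dot u_{\varepsilon_k}) + R^*_{\varepsilon_k}(u_{\varepsilon_k},-DE_{\varepsilon_k}(u_{\varepsilon_k})) \big]\,dt \ \to\ \int_0^T \big[ R_0(u,\dot u) + R^*_0(u,-DE_0(u)) \big]\,dt.$$
A first systematic study of evolutionary $\Gamma$-convergence relying on gradient structures was initiated in Sandier-Serfaty [SaS04], see also [Ser11,Mie15a]. In this approach one derives sufficient conditions for pE-convergence based on a limiting passage in the energy-dissipation balance (EDB)
$$E_\varepsilon(u_\varepsilon(T)) + \int_0^T \big[ R_\varepsilon(u_\varepsilon,\dot u_\varepsilon) + R^*_\varepsilon(u_\varepsilon,-DE_\varepsilon(u_\varepsilon)) \big]\,dt = E_\varepsilon(u_\varepsilon(0)). \tag{3.2}$$
We observe that on the right-hand side we have the initial energy, which converges to the desired limit because of the well-prepared initial conditions. Thus, to obtain an (EDE) for the limiting process it suffices to show three liminf estimates for the terms on the left-hand side, namely
$$\liminf_{\varepsilon\to 0} E_\varepsilon(u_\varepsilon(T)) \ge E_0(u(T)), \tag{3.3a}$$
$$\liminf_{\varepsilon\to 0} \int_0^T R_\varepsilon(u_\varepsilon,\dot u_\varepsilon)\,dt \ge \int_0^T R_0(u,\dot u)\,dt, \tag{3.3b}$$
$$\liminf_{\varepsilon\to 0} \int_0^T R^*_\varepsilon(u_\varepsilon,-DE_\varepsilon(u_\varepsilon))\,dt \ge \int_0^T R^*_0(u,-DE_0(u))\,dt. \tag{3.3c}$$
Of course, it is sufficient that these convergences hold only along (a subsequence of) the solutions $u_\varepsilon$ of (EDP). In the following subsection, we will generalize this approach by keeping the terms $R_\varepsilon$ and $R^*_\varepsilon$ together.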

EDP-convergence for gradient systems
Here we define a new notion of evolutionary $\Gamma$-convergence for GS that, on the one hand, is more restrictive but, on the other hand, gives more precise information on the limiting dissipation potential $R_0$. We use the fact that we do not need to have the two convergences (3.3b) and (3.3c) separately. Indeed, it is sufficient that only the integral over the sum of the two terms in (3.2) converges. This approach relaxes the sufficient conditions for pE-convergence substantially, since in the limit $\varepsilon \to 0$ the different parts of the dissipation may be distributed differently.
In the quadratic case we always have equidistribution $R_\varepsilon(u_\varepsilon,\dot u_\varepsilon) = R^*_\varepsilon(u_\varepsilon,-DE_\varepsilon(u_\varepsilon))$ for solutions $u_\varepsilon$. So, if (3.3b) and (3.3c) hold, we will still have equidistribution in the limit, but only along the limit solutions, while the limit functionals need not be quadratic. In [Mie12] it is shown that the limit of a classical GS can be a rate-independent system, where we always have $R^*_0(u,-DE_0(u)) = 0$, so there is not even equidistribution along solutions.
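That equidistribution is a feature of quadratic potentials can be seen on the scalar example $\dot p = 1 - 2p$. The Python sketch below assumes the cosh-type structure with $R^*(p,\xi) = a\sqrt{p(1-p)}\,C^*(\xi/a)$, its Legendre dual $R(p,v) = a\sqrt{p(1-p)}\,C\big(v/\sqrt{p(1-p)}\big)$, and $DE(p) = a(\log p - \log(1-p))$; these formulas and all names are illustrative assumptions.

```python
import math

def C(v):
    """C(v) = 2 v arsinh(v/2) - 2 sqrt(4 + v^2) + 4."""
    return 2.0 * v * math.asinh(v / 2.0) - 2.0 * math.sqrt(4.0 + v * v) + 4.0

def C_star(xi):
    """C*(xi) = 4 (cosh(xi/2) - 1)."""
    return 4.0 * (math.cosh(xi / 2.0) - 1.0)

def cosh_dissipation_split(p, a):
    """Return (R, R*, power) along p' = 1 - 2p for the assumed cosh-type GS.

    R      = a s C(v/s)      with s = sqrt(p(1-p)) and v = 1 - 2p,
    R*     = a s C*(xi/a)    with xi = -DE(p),
    power  = xi * v          (the product of force and rate).
    """
    s = math.sqrt(p * (1.0 - p))
    v = 1.0 - 2.0 * p
    xi = -a * (math.log(p) - math.log(1.0 - p))
    return a * s * C(v / s), a * s * C_star(xi / a), xi * v
```

For instance, at $p = 0.2$ and $a = 1/2$ the two parts sum to the power $\xi v$, as the Young-Fenchel equality requires along solutions, but they are not equal to each other.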
We also add a strengthening condition to obtain our notion of "EDP convergence" by asking for the convergence not only along solutions, but rather along a suitable general class of functions $u : [0,T] \to X$. For this, we associate with the GS $(X, E_\varepsilon, R_\varepsilon)$ De Giorgi's dissipation functionals $D_\varepsilon$, which are defined as
$$D_\varepsilon(u) = \int_0^T \big[ R_\varepsilon(u, \dot u) + R^*_\varepsilon(u, -\mathrm{D}E_\varepsilon(u)) \big]\,\mathrm{d}t.$$

Definition 3.2 (EDP convergence) We say that the family $(X, E_\varepsilon, R_\varepsilon)_{\varepsilon>0}$ converges in the EDP sense to $(X, E_0, R_0)$ if the conditions (3.4) hold.

We emphasize that in condition (3.4c), the functions $\tilde u_\varepsilon$ are arbitrary and need not be solutions of the GS $(X, E_\varepsilon, R_\varepsilon)$. From this definition we see that the convergence conditions (3.3) obviously imply EDP convergence. However, we will study cases where (3.3) does not hold, but we still have EDP convergence.
We emphasize that EDP-convergence is to be expected whenever one uses the EDP for establishing pE-convergence. Indeed, from general arguments one presumes that De Giorgi's dissipation functional $D_\varepsilon$ has (after extraction of a subsequence) a $\Gamma$-limit $D_0$ in the form
$$D_0(u) = \int_0^T M_0(u(t), \dot u(t))\,\mathrm{d}t.$$
From the lower semicontinuity of $\Gamma$-limits, one expects that $M_0(u, \cdot)$ is convex. Hence, one can define $R_M$ via $R_M(u, v) := M_0(u, v) - M_0(u, 0)$ and hope that it is a dissipation potential. For this, one needs to show (i) the positivity $R_M(u, v) \ge 0$ and (ii) $M_0(u, 0) \ge R^*_M(u, -\mathrm{D}E_0(u))$. Often, the positivity (i) follows simply from the evenness $M_0(u, -v) = M_0(u, v)$ and convexity. Moreover, if it is possible to show $M_0(u, v) \ge -\langle \mathrm{D}E_0(u), v\rangle$ (this holds for $\varepsilon > 0$ in the form $R_\varepsilon(u, v) + R^*_\varepsilon(u, -\mathrm{D}E_\varepsilon(u)) \ge -\langle \mathrm{D}E_\varepsilon(u), v\rangle$), then we find (ii) via the estimate
$$R^*_M(u, -\mathrm{D}E_0(u)) = \sup_v \big( -\langle \mathrm{D}E_0(u), v\rangle - M_0(u, v) \big) + M_0(u, 0) \le M_0(u, 0).$$
Thus, we arrive at the desired EDE
$$E_0(u(T)) + \int_0^T \big[ R_M(u, \dot u) + R^*_M(u, -\mathrm{D}E_0(u)) \big]\,\mathrm{d}t \le E_0(u(0)).$$

It would be interesting to study more generally the relations between pE- and EDP-convergence. Obviously, showing the liminf estimate for $D_\varepsilon$ is the major step in establishing pE-convergence. Hence, it seems redundant to ask for the pE-convergence explicitly, yet it is not obvious under what additional condition (e.g. the validity of a suitable chain rule) we really can deduce the pE-convergence from the liminf estimate for $D_\varepsilon$.

We end this section with two examples concerning EDP-convergence. Example 3.3 shows that the model discussed in [Mie12] satisfies pE-convergence but not EDP-convergence. Example 3.4 emphasizes the fact that pE- and EDP-convergence are not properties of an evolution equation $\dot u = V_\varepsilon(u)$ but of a GS $(X, E_\varepsilon, R_\varepsilon)$. Indeed, for a given equation one may have different gradient structures leading to different limits in the EDP sense, which in turn generate different limit evolutions.

For sufficiently smooth loading curves $\ell : [0,T] \to \mathbb{R}$ it was shown in [Mie12, Thm. 3.2] that the GS $(\mathbb{R}, E_\varepsilon, R_\varepsilon)$ pE-converge to the generalized GS $(\mathbb{R}, E_{\mathrm{play}}, R_{\mathrm{play}})$ defined in (2.1).
Obviously, we have the uniform convergence $E_\varepsilon \to E_{\mathrm{play}}$, while $D_\varepsilon$ converges to a limit $D_0$ that cannot be written in terms of $R_{\mathrm{play}} + R^*_{\mathrm{play}}$, see [Mie12, Prop. 3.1].
Example 3.4 (Different limit equations) Here, we provide an example of an evolution equation $\dot u = V_\varepsilon(u)$ with two different gradient structures. Both gradient structures have an evolutionary $\Gamma$-limit in the EDP sense, and the surprising fact is that the generated limit evolutions are different. Thus, EDP-convergence and pE-convergence are not properties of the family of evolution equations $\dot u = V_\varepsilon(u)$, but of the chosen gradient structures.
We introduce two di↵erent gradient structures (X, E " , R " ) and (X, e E " , e R " ) via It is shown in [Mie15a,Cor. 3.8] that these GS converge in the EDP sense to the limit systems (X, E 0 , R 0 ) and (X, e E 0 , e R 0 ), respectively, where In particular, the limit evolution for the first isu = a min u, while it isu = a max u for the second. This is not a contradiction, but has its origin in the well-preparedness condition for the initial data. No sequence can be well-prepared for both systems, i.e. if u " ⇤ * u and , and vice versa.

EDP-convergence for an ODE example
We discuss a very simple example of a discrete Markov process with state space $S = \{1, 2, 3\}$. The jump rates are such that in the limit $\varepsilon \to 0$ the particles never stay in the state 2. Thus, the limiting Markov process has the state space $\{1, 3\}$ only, see Figure 3.1. We will start with three different GS, namely (i) the quadratic one, where both $E_\varepsilon$ and $R_\varepsilon$ are quadratic, (ii) the entropic one with classical $R^*_\varepsilon$, and (iii) the entropic one with the dual dissipation potential defined in terms of $C^*$. The interesting point is that in the cases (i) and (iii) the limiting GS obtained via EDP-convergence will still be in the same modeling class. However, in case (ii) we will lose the classical GS and obtain a generalized GS that cannot be described via $C^*$.
We will study the limit $\varepsilon \to 0$ in several gradient structures. For general strictly convex and superlinear functions and we consider the GS (X, .
Using the fact that $v = \dot u$ satisfies $v_2 = -v_1 - v_3$ we obtain the primal dissipation potential.
In particular, this implies that the limiting ODE $\dot p = 1 - 2p$ is induced by the reduced generalized GS $([0,1], E, R)$, i.e. $\dot p = 1 - 2p = \mathrm{D}_\eta R^*(p, -\mathrm{D}E(p))$. The above theorem follows directly from the next proposition and the general theory described in Section 3.2. For the energy-dissipation principle we consider De Giorgi's dissipation functional

Proposition 3.6 We have the $\Gamma$-limits where $R$ is given as in Theorem 3.5.
Obviously, the definition of gives the relation (p) = m(p, 0). Thus, we have derived the reduced dissipation potential R in the form R(p, v) = m(p, v) (p). Doing a Legendre transform with respect to v we obtain the form of R ⇤ given in Theorem 3.5, since the sum turns into an inf-convolution.
Next we consider three different choices for these two functions.

Quadratic energy and dissipation
First, we consider the case which gives $E(p) = p^2 + (1-p)^2$ and $\hat a(p, r) \equiv 1$ and simplifies all expressions considerably: Here, the crucial point in the definition of $R^*$ is that the inf-convolution over $\tau$ does not involve any dependence on $z$, so that the $p$-dependent term exactly cancels $\sup_{z>0} \Sigma(p, z)$, which is generally not the case. Thus, the limiting GS is $([0,1], E, R)$ where $R(p, \nu) = \nu^2$ is quadratic, and $([0,1], E, R)$ is again a classical GS.
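As a consistency check for the quadratic case, the following minimal sketch verifies numerically that the reduced system with $E(p) = p^2 + (1-p)^2$ and $R(p,\nu) = \nu^2$ generates the limiting ODE $\dot p = 1 - 2p$ and satisfies the energy-dissipation balance. The dual potential $R^*(\eta) = \eta^2/4$ is computed here by hand as the Legendre transform of $\nu^2$; the function names and the midpoint quadrature are illustrative choices, not part of the text.

```python
import math

def energy(p):
    """Reduced quadratic energy E(p) = p^2 + (1-p)^2."""
    return p * p + (1.0 - p) ** 2

def d_energy(p):
    """Derivative E'(p) = 4p - 2."""
    return 4.0 * p - 2.0

def R(v):
    """Primal dissipation potential R(p, v) = v^2 (independent of p)."""
    return v * v

def R_dual(eta):
    """Legendre dual of v^2, namely eta^2 / 4."""
    return eta * eta / 4.0

def velocity(p):
    """Rate p' = d/d eta R*(eta) evaluated at eta = -E'(p), i.e. eta / 2."""
    return -d_energy(p) / 2.0

def solution(t, p0):
    """Explicit solution of p' = 1 - 2p with p(0) = p0."""
    return 0.5 + (p0 - 0.5) * math.exp(-2.0 * t)

def energy_dissipation_defect(p0, T, n=20000):
    """E(p(T)) + int_0^T [R(p') + R*(-E'(p))] dt - E(p(0)) via the midpoint rule."""
    h = T / n
    dissipation = 0.0
    for i in range(n):
        p = solution((i + 0.5) * h, p0)
        dissipation += (R(velocity(p)) + R_dual(-d_energy(p))) * h
    return energy(solution(T, p0)) + dissipation - energy(p0)

print(velocity(0.2), 1.0 - 2.0 * 0.2)
print(energy_dissipation_defect(0.9, 1.5))
```

The defect vanishes up to quadrature error, and the two integrands $R$ and $R^*$ coincide along the solution, as expected for a classical GS.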

Entropic energy and C -type dissipation
Next, we consider the case of the Boltzmann entropy and the dissipation defined in terms of $C$, which coincides with Section 2.4 except for the trivial scaling factor 2: the energy density is the Boltzmann function $\lambda_{\mathrm B}(r) = r \log r - r + 1$, and the dual dissipation function is $C^*(\xi) = 4\big(\cosh(\xi/2) - 1\big)$.
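The duality between $C$ and $C^*$ can be checked numerically. The sketch below takes $C^*(\xi) = 4(\cosh(\xi/2) - 1)$ from the text and compares a numerically computed Legendre transform with the closed form $C(v) = 2v\,\mathrm{arsinh}(v/2) - 2\sqrt{4 + v^2} + 4$, which is derived here by solving $v = 2\sinh(\xi/2)$ and is stated as an assumption of this sketch. The ternary search and its bracket are illustrative choices.

```python
import math

def C_dual(xi):
    """Dual dissipation function C*(xi) = 4 (cosh(xi/2) - 1)."""
    return 4.0 * (math.cosh(xi / 2.0) - 1.0)

def C_closed(v):
    """Closed form of the primal function, obtained from v = 2 sinh(xi/2)."""
    return 2.0 * v * math.asinh(v / 2.0) - 2.0 * math.sqrt(4.0 + v * v) + 4.0

def legendre_of_C_dual(v, lo=-40.0, hi=40.0, iters=200):
    """sup_xi (v*xi - C*(xi)) by ternary search; the objective is strictly concave."""
    def objective(xi):
        return v * xi - C_dual(xi)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))

for v in (-3.0, 0.0, 0.5, 5.0):
    print(v, legendre_of_C_dual(v), C_closed(v))
```

For large $|v|$ the primal function grows like $2|v|\log|v|$, in contrast to the quadratic growth of a classical dissipation potential.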
This gives the reduced energy functional $E(p) = \tfrac12 \lambda_{\mathrm B}(2p) + \tfrac12 \lambda_{\mathrm B}(2 - 2p)$ and $\hat a(p, z) = \sqrt{2pz}$. In the latter expression and in the definition of $\Sigma$ we profit from the interaction of $\lambda'_{\mathrm B}(r) = \log r$ and the exponential form of $C^*$, viz.
Minimizing in $z > 0$ we arrive at

For calculating $R^*$ we first observe, for $a, b > 0$, the formula which follows by writing the left-hand side via $C^*(\xi - \tau) = 2\mathrm{e}^{\xi/2}/x + 2\mathrm{e}^{-\xi/2} x - 4$, where $x = \mathrm{e}^{\tau/2}$, and minimizing in $x > 0$. With $a = \hat a(p, z)$ and $b = \hat a(1-p, z)$ we find

We emphasize that in this minimization with respect to $z$ it is crucial to keep the terms involving the dual dissipation potential $C^*(\eta)$ and the term $\Sigma$ together. We observe that the resulting gradient structure is again the structure that is obtained from the large-deviation principle of Section 2.4. This confirms the statement that gradient structures obtained from the large-deviation theory are very stable against taking further limits in the sense of EDP-convergence, see Figure 1.1.
We obtain the same limit energy $E(p) = \tfrac12 \lambda_{\mathrm B}(2p) + \tfrac12 \lambda_{\mathrm B}(2 - 2p)$ as in the previous case, but the functions $a^\varepsilon_j(u)$ are quite different as they involve the logarithmic mean $\Lambda(r, s) = \frac{r - s}{\log r - \log s}$. Indeed we have $\hat a(p, z) = \Lambda(2p, z) = \frac{2p - z}{\log(2p) - \log z}$. We further obtain the functions
$$\Sigma(p, z) = \tfrac12 \Big( (2p - z)\big(\log(2p) - \log z\big) + (2 - 2p - z)\big(\log(2 - 2p) - \log z\big) \Big)$$
and have no explicit formula for the infimum $\inf_{z>0} \Sigma(p, z)$. In the definition of $R^*$ we can do the inf-convolution explicitly, since the dual dissipation function is quadratic, so we find a formula involving $\hat a(p, z) + \hat a(1-p, z)$.

We claim that the growth of $R^*(p, \eta)$ is no longer quadratic, but exponential. For this we insert $z = \mathrm{e}^{b\eta}$ for $\eta \gg 1$ for some $b \in \,]0, 1/2[$ into the supremum to obtain a lower bound. From $\Sigma(p, z) \approx z \log z$ and $\hat a(p, z) \approx z / \log z$ for $z \to \infty$ we find an asymptotic lower bound. Hence, we see that the growth is at least as $\mathrm{e}^{b|\eta|}$ for all $b \in \,]0, 1/2[$. Moreover, we expect that the function $R^*(p, \eta)$ no longer has a product structure, i.e. it is not a function of $p$ times a function of $\eta$. Thus, we see that the classical gradient structure for the relative entropy is not stable under EDP, in general. Nevertheless, in [GiM13, DiL14, MaM15a] evolutionary $\Gamma$-limits between discrete Markov processes and continuous Fokker-Planck equations are studied, where the classical gradient structure survives.
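The logarithmic mean and the asymptotics used in the growth argument can be explored with a few lines of code. This is a minimal sketch: the helper names `log_mean` and `a_hat` and the diagonal tolerance are illustrative choices; only the formulas $\Lambda(r,s) = (r-s)/(\log r - \log s)$ and $\hat a(p,z) = \Lambda(2p, z)$ come from the text. The bounds between the geometric and the arithmetic mean are a standard property of $\Lambda$.

```python
import math

def log_mean(r, s):
    """Logarithmic mean (r - s) / (log r - log s) for r, s > 0, extended by r on the diagonal."""
    if r <= 0.0 or s <= 0.0:
        raise ValueError("arguments must be positive")
    if abs(r - s) <= 1e-12 * max(r, s):
        return 0.5 * (r + s)  # limit value on the diagonal
    return (r - s) / (math.log(r) - math.log(s))

def a_hat(p, z):
    """Coefficient a_hat(p, z) = Lambda(2p, z) from the entropic case with quadratic dual potential."""
    return log_mean(2.0 * p, z)

# Lambda lies between the geometric and the arithmetic mean.
for r, s in ((0.2, 3.0), (1.0, 7.5), (4.0, 0.01)):
    print(math.sqrt(r * s), log_mean(r, s), 0.5 * (r + s))

# a_hat(p, z) behaves like z / log z for large z.
z = 1.0e12
print(a_hat(0.5, z) * math.log(z) / z)
```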

The membrane as a thin-layer limit
In our first major application of EDP-convergence as a microscopic origin of generalized GS, we follow [Lie12, Lie13] and consider a one-dimensional diffusion equation with a thin layer of very small diffusivity. Assuming that the diffusion coefficient and the width of the layer scale in the proper way, we will arrive at a membrane model in the limit. While the limit passage of the linear diffusion problem to the linear transmission problem at the membrane can be done directly (or with the quadratic gradient structure, see [Lie13]), we prefer to do the somewhat more elaborate EDP-limit using the GS with the relative entropy as energy functional and the classical dissipation potential of Wasserstein type. This case was already studied in [Lie12, Sec. 3.2] in a more special setting and without explicitly calculating $R^*_0$.

We start from the equation where $\dot{\ } = \partial_t$ and ${}' = \partial_x$. By our choice of the boundary conditions, the total mass $\int_\Omega u(t, x)\,\mathrm{d}x = 1$ is conserved, thus we can interpret the equation as the Fokker-Planck equation of a Markov process. Defining the equilibrium density we have the GS $(\mathrm{Prob}(\Omega), E, R^*_\varepsilon)$ with

The nontrivial behavior happens in the thin layer given by the small interval $[0, \varepsilon]$. In particular, we allow $a_\varepsilon$ and $V_\varepsilon$ to depend non-trivially on $x$: We assume that there are functions $a_*, a_+, V_*, V_+ \in C^1([0,1])$ and $a_-, V_- \in C^1([-1,0])$ $a_-(x)$ for $x < 0$, Here $V_\varepsilon$ is constructed to be continuous on $\Omega = [-1, 1]$, while $a_\varepsilon$ has jumps for $x \in \{0, \varepsilon\}$. The pE-convergence result established in [Lie12, Sec. 3] states that the limiting system is given as a membrane problem, where the thin layer is replaced by a transmission condition. The interesting point is that the EDP-convergence reveals that the limiting GS is no longer classical but involves C for the jump of the driving forces at the membrane.
For passing to the limit we note that the function $w_\varepsilon$ converges pointwise to the limit which may be discontinuous at $x = 0$, but has well-defined limits $w_0(0^-)$ and $w_0(0^+)$ from the left and from the right, respectively. This limit is totally independent of the potential $V_*$ inside the layer. The influence of the layer potential $V_*$ and the layer diffusion profile $a_*$ will only survive in one coefficient $A_*$.
Before we go into the details of the proof, some comments are in order. First, we emphasize that the constant $Z_0$ in the definition of the coupling coefficient $A_*$ is not related to $V_*$, but only depends on $V_\pm$, see (4.3). Hence, for a large barrier $V_*$ the transmission coefficient $A_*$ becomes indeed small.
Second, the limiting equation is a PDE in the subdomains $\Omega_- = \,]-1, 0[$ and $\Omega_+ = \,]0, 1[$ coupled by a transmission condition. It can be obtained easily by considering test functions. Using the fact that $\hat\xi$ may have a jump at $x = 0$, the transmission conditions arise via the boundary terms when integrating by parts. We arrive at

We refer to [GlM13] for a similar derivation of more general nonlinear transmission conditions and active interface conditions using gradient structures. Finally, we remark that the primal dissipation potential can be written using the integration operator $I[\dot u](x) := \int_{-1}^{x} \dot u(y)\,\mathrm{d}y = -\int_{x}^{1} \dot u(y)\,\mathrm{d}y$, where the last relation follows from $\int_{-1}^{1} \dot u\,\mathrm{d}y = 0$, which in turn is due to $u(t) \in \mathrm{Prob}(\Omega)$. Noting that the functions $\xi$ may have a jump at $x = 0$, one has the identity

Here $I[\dot u](0)$ is the flux through the membrane, which is thermodynamically conjugate to the jump $\xi(0^+) - \xi(0^-)$ in the driving forces. With this and $I[\dot u](-1) = 0 = I[\dot u](1)$ the evaluation of the Legendre transform for $R^*_0$ yields the primal dissipation potential

Step 1. Blow up = transformation from $D_\varepsilon$ to $\hat D_\varepsilon$: To study the $\Gamma$-limit $D_0$ of $D_\varepsilon$ we blow up the thin layer such that its transformed thickness becomes of order one. For this we use $Y_\varepsilon : [-1, 1] \to [-1, 2]$ and its inverse $X_\varepsilon = Y_\varepsilon^{-1}$: $x$ for $x \le 0$, $U_\varepsilon(t, y) = u(t, X_\varepsilon(y))$, $W_\varepsilon(y) = w_\varepsilon(X_\varepsilon(y))$, $A_\varepsilon(y) = a_\varepsilon(X_\varepsilon(y)) / X'_\varepsilon(y)$, and the functionals $\hat D_\varepsilon$. Following the arguments in [Lie12, Sec. 3.2] it is not difficult to establish the $\Gamma$-convergence of $\hat D_\varepsilon$ to $\hat D_0$, where the latter is given in the form $(a_-(y), \mathrm{e}^{-V_-(y)}/Z_0, 1)$ for $y < 0$, $(a_*(y), \mathrm{e}^{-V_*(y)}/Z_0, 0)$ for $y \in [0, 1]$, $(a_+(y), \mathrm{e}^{-V_+(y)}/Z_0, 1)$ for $y > 1$.
Step 2. Minimization over the rescaled layer: The main structure in this limit model is that $\hat D$ Inserting the definitions of $\hat A$ and $\hat W$ gives exactly the formula for $A_*$ in the theorem. Thus, we have constructed a simpler functional, which bounds $\hat D_0(U)$ from below for all $U$ and has the important property that it does not depend on $U|_{[0,T] \times ]0,1[}$.
Step 3. Relation between $D_0$ and the simpler functional of Step 2: Using the special form of $R_0$ and $R^*_0$ stated in (4.4) and the theorem, we define the limiting dissipation functional $D_0(u) = \int_0^T \big[ R_0(u, \dot u) + R^*_0(u, -\mathrm{D}E_0(u)) \big]\,\mathrm{d}t$. By construction and the special form of $\hat G$ the functional $D_0$ is closely related to the simpler functional of Step 2 in the following way. For any function $U$: Moreover, for any function $u$ one can construct an optimal $U$ as follows. We split $u$ at $x = 0$, the right part is shifted by 1 to the right, and the minimizer $U(t, \cdot) \in H^1([0, 1])$ of $\hat G(I[\dot u(t)](0), u(t, 0^-), u(t, 0^+))$ is inserted into the gap. Then, $D_0(u) = \hat D_0(U)$.
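Step 4. The liminf estimate (3.4c): To establish the fundamental liminf estimate we consider, w.l.o.g., sequences $u_\varepsilon$ satisfying $1/R \le u_\varepsilon \le R$ for some large $R > 1$. In particular, by minimum and maximum principles these bounds can be expected for solutions of (4.1). Defining $U_\varepsilon(t, y) = u_\varepsilon(t, X_\varepsilon(y))$ we again have $U_\varepsilon(t, y) \in [1/R, R]$. Thus, we find a subsequence converging to a limit $U_0$. Moreover, we have $u_0(t, x) = U_0(t, Y_0(x))$. Now, using $D_\varepsilon(u_\varepsilon) = \hat D_\varepsilon(U_\varepsilon)$ we arrive at the desired liminf estimate
$$\liminf_{\varepsilon \to 0} D_\varepsilon(u_\varepsilon) = \liminf_{\varepsilon \to 0} \hat D_\varepsilon(U_\varepsilon) \ge \hat D_0(U_0) \ge D_0(u_0).$$
This concludes the proof of Theorem 4.1.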
We conclude this section by observing that the EDP-limit of the thin-layer diffusion system given by the classical GS $(\mathrm{Prob}(\Omega), E_\varepsilon, R^*_\varepsilon)$ is the generalized GS for the membrane problem. For $\varepsilon > 0$ and for $\varepsilon = 0$ the gradient structures are exactly the ones obtained from the large-deviation principle, see Section 2.4.4. Hence, we again found an instance where the diagram in Figure 1.1 commutes, which means that applying the large-deviation principle can be interchanged with taking the EDP-limit $\varepsilon \to 0$.

From diffusion to reaction
In our second major application of EDP-convergence as a microscopic origin of generalized GS, we continue the work in [PSV10, PSV12, AM*12], which shows that linear reactions can be obtained as limits of diffusion for a suitably scaled energy barrier. In [PSV10, PSV12] the method relies on a quadratic energy functional and a classical gradient structure. In [AM*12] the pE-convergence for the entropic GS is shown, but only diffusion along the reaction path is allowed. In fact, the result therein gives EDP-convergence, if one takes the addition in [MPR14,Prop. 4,4] into account.
Here we generalize the latter work by also allowing diffusion in a physical space $\Omega$, such that the resulting limit equation will be a (linear) reaction-diffusion system. Our physical domain $\Omega \subset \mathbb{R}^d$ is bounded and has a Lipschitz boundary. For the reaction path we choose $\Upsilon = [0, 7] \subset \mathbb{R}$ and define the cylinder $Q = \Omega \times \Upsilon$. (Indeed, $\Upsilon$ could be any bounded or unbounded interval.) For densities $u \in L^1(Q)$ the integral $\int_D \int_{y_0}^{y_1} u\,\mathrm{d}y\,\mathrm{d}x$ denotes the number of particles per unit volume that are in the subdomain $D \subset \Omega$ and have a reaction state $y \in [y_0, y_1] \subset \Upsilon$. The evolution of the density $u$ is driven by diffusion in the $x$-direction with diffusion constant $m_\Omega > 0$ and a much faster diffusion in the $y$-direction with diffusion constant $\tau_\varepsilon \gg 1$ to allow the particles to overcome a huge potential barrier given by $V_\varepsilon(y) = \frac{1}{\varepsilon} V(y)$, see Figure 5.1.
For simplicity we assume that the total mass $\int_Q u\,\mathrm{d}x\,\mathrm{d}y$ as well as the volume $|\Omega|$ of the physical domain equal 1. Hence, we can again consider the model as a Markov process with continuous paths $t \mapsto (X_t, Y_t) \in \Omega \times \Upsilon = Q$, whose distribution laws can be described by densities $u(t) \in \mathrm{Prob}(Q)$. The Kolmogorov forward equation reads
$$\dot u = m_\Omega \Delta_x u + \tau_\varepsilon \partial_y\big( \partial_y u + u\,\partial_y V_\varepsilon \big), \qquad (\nabla_x u,\ \partial_y u + u\,\partial_y V_\varepsilon) \cdot \nu = 0 \ \text{on } \partial Q.$$
Clearly, the unique steady state $\tilde w_\varepsilon$ is independent of $x$ and takes the form

For studying the limit $\varepsilon \to 0$ we now assume that $V \in C^2(\Upsilon)$ has exactly two nondegenerate minimizers as pure states, where $V = 0$ w.l.o.g., and one global maximum as barrier, namely (5.5) Here the convergence means $\int_\Upsilon \phi(y)\, w_\varepsilon(y)\,\mathrm{d}y \to \alpha_0 \phi(2) + \alpha_1 \phi(6)$ for all $\phi \in C^0(\Upsilon)$. The important point is now to choose the diffusion constant $\tau_\varepsilon$ sufficiently large such that the transitions between $y = 2$ and $y = 6$ can occur on times of order 1. According to Kramers' rule (see e.g. [AM*12]), this is achieved by choosing $m_\Upsilon > 0$ and setting

From the concentration of $w_\varepsilon$ in the points $\{2, 6\}$ we obtain that $\tilde w_\varepsilon \in \mathrm{Prob}(Q)$ concentrates in the sets $\Omega \times \{2\}$ and $\Omega \times \{6\}$, namely $\tilde w_\varepsilon \overset{*}{\rightharpoonup} \tilde w_0 := \mathrm{d}x \otimes w_0$ in $\mathrm{Prob}(Q)$.
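The concentration of the equilibrium density $w_\varepsilon \propto \mathrm{e}^{-V/\varepsilon}$ in the two wells can be observed numerically. The following minimal sketch assumes a hypothetical double-well potential $V(y) = \big((y-2)(y-6)\big)^2/16$ on $\Upsilon = [0, 7]$; this particular $V$ is not taken from the text and shares with it only the two nondegenerate minima at $y = 2$ and $y = 6$ with $V = 0$. Since $V''(2) = V''(6)$ for this choice, Laplace's method suggests equal weights $\alpha_0 = \alpha_1 = 1/2$ in this example.

```python
import math

def V(y):
    """Hypothetical double-well potential on [0, 7]: V(2) = V(6) = 0 and V(4) = 1."""
    return ((y - 2.0) * (y - 6.0)) ** 2 / 16.0

def well_masses(eps, n=20000, radius=1.0):
    """Masses of w_eps = exp(-V/eps)/Z_eps within `radius` of y = 2 and y = 6 (midpoint rule)."""
    h = 7.0 / n
    total = near2 = near6 = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        w = math.exp(-V(y) / eps)  # unnormalized density; underflows harmlessly to 0.0
        total += w
        if abs(y - 2.0) <= radius:
            near2 += w
        elif abs(y - 6.0) <= radius:
            near6 += w
    return near2 / total, near6 / total

for eps in (1.0, 0.1, 0.01):
    m2, m6 = well_masses(eps)
    print(f"eps={eps}: mass near 2 = {m2:.6f}, mass near 6 = {m6:.6f}, rest = {1.0 - m2 - m6:.2e}")
```

For $\varepsilon = 1$ a substantial part of the mass still sits on the barrier, while for small $\varepsilon$ essentially all mass is split between the two wells.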
Recalling that $E_\varepsilon$ is the relative entropy with respect to $w_\varepsilon$, the $\Gamma$-convergence $E_\varepsilon \overset{*}{\rightharpoonup} E_0$ appears natural. To be more precise concerning densities and measures, we define

Proposition 5.1 We have $E_\varepsilon \overset{*}{\rightharpoonup} E_0$ in the weak$^*$ topology of $\mathrm{Prob}(Q)$.

Proof: The liminf estimate is established in [AGS05, Lem. 9.4.3].
One difficulty in deriving the liminf estimate for De Giorgi's dissipation functional is that $R_\varepsilon$ is only implicitly defined via the Legendre transform of $R^*_\varepsilon$. Moreover, we are not able to employ the classical Wasserstein gradient flow theory in [AGS05] using the Benamou-Brenier formulation, because of the different roles of the diffusion in $x$ with mobility $m_\Omega$ and the diffusion in $y$ with mobility $\tau_\varepsilon \to \infty$. The first step to establish the following result follows the idea in [MaM15a], where one obtains a lower estimate by replacing $R_\varepsilon(u_\varepsilon, \dot u_\varepsilon)$ by the smaller term $\langle \xi_\varepsilon, \dot u_\varepsilon\rangle - R^*_\varepsilon(u_\varepsilon, \xi_\varepsilon)$ and by choosing a suitable recovery sequence $\xi_\varepsilon \to \xi_0$ for the limit passage $\varepsilon \to 0$. Finally, one takes the supremum over all $\xi_0$ to recover $R_0$ as dual of $R^*_0$. The second step involves a suitable transformation of the reaction variable $z = Z_\varepsilon(y)$ (first introduced in [AM*12]) which allows us to control the relative densities $v_\varepsilon := u_\varepsilon / w_\varepsilon$ and the dual potentials $\xi_\varepsilon$ along the reaction path $\Upsilon$. In the following result we will again describe the limit GS $(\mathrm{Prob}(Q), E_0, R_0)$ by a reduced GS $(\mathrm{Prob}(\Omega \times \{0, 1\}), E, R)$, since in the limit every $\mu \in \mathrm{Prob}(Q)$ with finite relative entropy satisfies $\mu = c_0\,\mathrm{d}x \otimes \delta_2(y) + c_1\,\mathrm{d}x \otimes \delta_6(y)$ with $(c_0\,\mathrm{d}x, c_1\,\mathrm{d}x) \in \mathrm{Prob}(\Omega \times \{0, 1\})$, see (5.6).
Theorem 5.2 (From diffusion to reaction-diffusion) The family of gradient systems $(\mathrm{Prob}(Q), E_\varepsilon, R_\varepsilon)$ defined via (5.2) converges in the EDP sense to the gradient system $(\mathrm{Prob}(Q), E_0, R_0)$, where $E_0$ is given in (5.6) via $E$ and accordingly $R_0$ is given via $R$, which is defined in terms of the dual dissipation potential

The above result means that the limiting GS is a generalized gradient system defined for $c = (c_0, c_1) \in \mathrm{Prob}(\Omega \times \{0, 1\})$, where the limiting system is the coupled system of linear PDEs given in the form with Neumann boundary conditions $\nabla c_j \cdot \nu = 0$. We emphasize that the original GS $(\mathrm{Prob}(Q), E_\varepsilon, R_\varepsilon)$ is the classical GS for the Fokker-Planck equation, while the EDP limit provides the generalized gradient structure discussed in Section 2.4.3. We observe that for $\varepsilon > 0$ as well as for $\varepsilon = 0$ we have the GS that is induced by the large-deviation principle discussed in Section 2.4. Thus, we have found another instance of the interchangeability of the large-deviation principle and the EDP-limit, as displayed in Figure 1.1.
Sketch of proof of Theorem 5.2: Since the $\Gamma$-convergence $E_\varepsilon \overset{*}{\rightharpoonup} E_0$ was already established in Proposition 5.1, it remains to show the liminf estimate for the dissipation functional $D_\varepsilon$. More precisely, assume u

Step 1. Dualization of $R_\varepsilon$: The first major idea follows [MaM15a] and exploits the definition of $R_\varepsilon$ as Legendre transform of $R^*_\varepsilon$. Introducing the functional we easily see that $D_\varepsilon(u)$ can be reconstructed via $\sup_\xi B_\varepsilon(u, \xi)$. Using the definitions of $E_\varepsilon$ and $R^*_\varepsilon$ we have the explicit form

Step 2. Rescaling the reaction-path variable. The second major idea follows [AM*12, Sec. 2.1], where no $x$-direction was present. We define the diffeomorphism $Z_\varepsilon : \Upsilon \to Z := [0, 1]$ and its inverse $Y_\varepsilon = Z_\varepsilon^{-1} : Z \to \Upsilon$ via The transformed equilibrium density $\hat w_\varepsilon$ on $Z$ is $\hat w_\varepsilon(z) := w_\varepsilon(Y_\varepsilon(z))\, Y'_\varepsilon(z)$ and satisfies $\hat w_\varepsilon \overset{*}{\rightharpoonup} \hat w_0 := \alpha_0 \delta_0 + \alpha_1 \delta_1$.

A Evaluation of some functionals
Here we give explicit calculations for the functional $G$ occurring in the membrane limit and the functional $N$ occurring in the limit of diffusion to reaction. It is surprising that both functionals are closely related, see (A.5).
A.1 Derivation of the potential $G(\alpha, u_0, u_1)$

We first give the result of the standard case of constant coefficients $\hat A$ and $\hat W$, which was already derived in [MPR14,Prop. 4 where the last term simplifies to $G(0, u_0, u_1) = 2\big(\sqrt{u_0} - \sqrt{u_1}\big)^2$. Moreover, the unique minimizer is given by $u(x) = (1 - x) u_0 + x u_1 + b\,(x^2 - x)$ with $b = u_0 + u_1 - \sqrt{\alpha^2 + 4 u_0 u_1}$.
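The two facts stated above can be cross-checked numerically. The sketch below ASSUMES that $G(\alpha, u_0, u_1)$ is the minimum of $\int_0^1 \frac{\alpha^2 + u'(x)^2}{2u(x)}\,\mathrm{d}x$ over profiles with $u(0) = u_0$ and $u(1) = u_1$; this form is not displayed in the text and is chosen here only because it is consistent with the stated value $G(0, u_0, u_1) = 2(\sqrt{u_0} - \sqrt{u_1})^2$ and with the stated quadratic minimizer. The function names, the midpoint rule and the sine perturbation are illustrative choices.

```python
import math

def minimizer(x, alpha, u0, u1):
    """Stated minimizer u(x) and its derivative u'(x)."""
    b = u0 + u1 - math.sqrt(alpha ** 2 + 4.0 * u0 * u1)
    u = (1.0 - x) * u0 + x * u1 + b * (x * x - x)
    du = u1 - u0 + b * (2.0 * x - 1.0)
    return u, du

def layer_cost(alpha, u0, u1, bump=0.0, n=4000):
    """Midpoint rule for the ASSUMED functional int_0^1 (alpha^2 + u'^2) / (2u) dx.

    The profile is the stated minimizer plus bump * sin(pi x), which keeps the
    boundary values u(0) = u0 and u(1) = u1 unchanged.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        u, du = minimizer(x, alpha, u0, u1)
        u += bump * math.sin(math.pi * x)
        du += bump * math.pi * math.cos(math.pi * x)
        total += (alpha ** 2 + du ** 2) / (2.0 * u) * h
    return total

# alpha = 0: the value should equal 2 (sqrt(u0) - sqrt(u1))^2.
print(layer_cost(0.0, 1.0, 4.0), 2.0 * (math.sqrt(1.0) - math.sqrt(4.0)) ** 2)

# alpha != 0: perturbing the stated minimizer should increase the cost.
print(layer_cost(1.0, 1.0, 2.0), layer_cost(1.0, 1.0, 2.0, bump=0.05))
```

Under this assumption the integrand is jointly convex in $(\alpha, u, u')$ for $u > 0$, so a critical point is a global minimizer, which is what the perturbation test probes.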
In Section 4 we need a more general version with non-constant functions $\hat A$ and $\hat W$: We will show that the influence of the coefficient functions $\hat A$ and $\hat W$ can be calculated from Proposition A.1 by a suitable rescaling of the layer variable in the form $x = X(y)$. Using the strong link (A.5) between $N$ and $G$ we show that $N$ can be calculated from $G$.
Proposition A.3 We have the relation Proof: Using (A.5) we want to show that N is related to the Legendre transform G ⇤ ( , v 0 , v 1 ) := sup ↵2R ↵ G(↵, v 0 , v 1 ) of G from (A.1). For this we keep 2 R fixed. The functional (↵, v) 7 ! G(↵, v) ↵ is jointly convex, such that it can be minimized in any desired order of ↵ and v.
Thus, evaluating $G^*$ with $G$ from (A.2) explicitly gives the desired result.