Large-deviation principles of switching Markov processes via Hamilton-Jacobi equations

We prove pathwise large-deviation principles of switching Markov processes by exploiting the connection to associated Hamilton-Jacobi equations, following Jin Feng's and Thomas Kurtz's method. In the limit that we consider, we show how the large-deviation problem in path-space reduces to a spectral problem of finding principal eigenvalues. The large-deviation rate functions are given in action-integral form. As an application, we demonstrate how macroscopic transport properties of stochastic models of molecular motors can be deduced from an associated principal-eigenvalue problem. The precise characterization of the macroscopic velocity in terms of principal eigenvalues implies that breaking of detailed balance is necessary for obtaining transport. In this way, we extend and unify existing results about molecular motors and place them in the framework of stochastic processes and large-deviation theory.


Introduction
In this paper we investigate large deviations for switching Markov processes that are motivated by stochastic models of molecular motors. Molecular motors are proteins that are capable of moving along filaments in a living cell. Molecular motors such as kinesin and dynein drag vesicles along while moving and thereby transport them within the cell. For more background on the phenomenon of molecular motors, we refer to a number of reviews [JAP97, How01, KF07, Kol13].
Molecular motors have a directionality: they typically move in one direction only. A central challenge in the study of such motors is to understand the origin of this directionality and to characterize the speed of movement. In fact, mathematical models of molecular motors typically show no energetic benefit in moving in one direction or the other; the directionality arises from a non-trivial interplay between the microscopic features of such models and the dynamics of the motor. As a result, understanding how directionality arises as symmetry breaking in a-directional models is somewhat of a puzzle.
For certain models this puzzle has been solved, at least partially. Hastings, Kinderlehrer and McLeod studied stationary solutions of certain Fokker-Planck equations and found sufficient conditions for the occurrence of transport [HKM08b, HKM08a]. Vorotnikov proved sufficient conditions for transport in deterministically switching [Vor11] and randomly switching systems [Vor14]. Perthame, Souganidis and Mirrahimi developed a dynamic point of view on systems of molecular motors [PS09a, PS09b, MS13]. In particular, Mirrahimi and Souganidis prove convergence of solutions of a Fokker-Planck equation to a ballistically travelling pulse, with a velocity that is characterized by a periodic cell problem.
In this paper we extend the results of [MS13] to a much broader class of systems, make explicit the connection to stochastic processes, and place the treatment squarely in the context of large-deviation theory. In this way we elaborate on the work by Perthame, Souganidis and Mirrahimi, which appears to be inspired by large-deviation theory, as evidenced by the title of [PS09a] and the use of terms such as 'Hamiltonian'.
The larger class of stochastic processes that we consider is that of switching Markov processes in a periodic setting. This class contains different models of molecular motors as special cases, including the continuum ratchet and discrete stochastic models (see [Kol13] and Section 2, as well as [PS09a, PS09b, MS13, HKM08b, HKM08a]).
The first mathematical results of this paper (Theorems 4.2 and 4.3; see Figure 1 below) are large-deviation theorems for such switching Markov processes. These generalize results by Kumar and Popovic [KP17] by focusing on pathwise large deviations, while placing more restrictive assumptions on the microscopic dynamics. Furthermore, instead of assuming the comparison principle to be satisfied as in [KP17, Lemma 1], we formulate conditions that imply the comparison principle. Faggionato and Silvestri establish large-deviation principles for fully discrete, 'pseudo-one-dimensional' systems [FS17].
A related line of research focuses on large-deviation principles for switching diffusions in a setting where the diffusion potentials do not have small-scale oscillations. Typical results provide large-deviation rate functionals that are simple sums of small-diffusion ('Freidlin-Wentzell') and occupation ('Donsker-Varadhan') rate functionals (see e.g. [FL96, HY14, HMS16, BDG18, KS20]). The rapid-scale oscillation of the potentials in this paper creates a stronger intertwining between the diffusion and switching dynamics, and consequently the rate function is not a simple sum but an expression that fully combines the dynamics of both components. Theorems 4.2 and 4.3 recover previous convergence results such as those of Mirrahimi and Souganidis [MS13, Th. 1.1-1.2]. While the methods that Mirrahimi and Souganidis apply are inspired by large-deviation theory, they do not explicitly prove large-deviation principles but convergence statements on the level of Fokker-Planck equations. By proving large-deviation principles instead, we are able to make a clear distinction between the contributions that come from general large-deviation theory on the one hand, and the model-specific contributions on the other hand.
For instance, our results explain from a large-deviation point of view why the velocity v can be characterized by a cell problem that can be interpreted as defining a large-deviation Hamiltonian H, through v = H′(0). The Hamiltonian depends on the specific model, while the relation v = H′(0) is independent of the microscopic details. This relation then also explains the well-known fact that detailed balance (microscopic reversibility) forces zero velocity. Indeed, we prove under general conditions (Theorem 4.8) that detailed balance leads to a symmetric Hamiltonian. By the characterization of the velocity as v = H′(0), this means that detailed balance has to be broken in order for transport to occur.
As another example, the numerical results of Wang, Peskin and Elston suggest that there is no transport in the limit of large reaction rates [WPE03, Section 4.3, Figure 8(a)]. We also recover this result by proving that in this limit regime the Hamiltonian becomes symmetric (Theorem 4.9).

Overview of the paper
In Section 2, we illustrate the general results by means of a concrete example of a stochastic molecular-motor model. This provides a 'running example' with which to interpret the general results that follow. We also outline with this example the relation to the papers of Perthame, Souganidis and Mirrahimi.

In Section 3, we introduce the concepts that we work with in order to rigorously formulate our results. In Section 4 we present our main results. Figure 1 summarizes the relationships between the main theorems. Theorem 4.2 provides general conditions under which the so-called spatial component of a switching Markov process satisfies a large-deviation principle. We identify the Hamiltonian H(p), a principal eigenvalue, as the central ingredient. Under the additional assumption that p ↦ H(p) is convex, Theorem 4.3 establishes an action-integral representation. Theorems 4.2 and 4.3 highlight the arguments that come from large-deviation theory.

We then specialize to a concrete ratchet model of molecular motors. Theorems 4.6 and 4.7 establish the large-deviation theorems for two limit regimes. While Theorem 4.6 generalizes the results in [MS13], Theorem 4.7 characterizes yet another limit regime. We include this result to illustrate how the general structure of the proof remains unaffected by the choice of scaling. Finally, we show the symmetry of Hamiltonians under detailed balance (Theorem 4.8) and in the regime of scale separation (Theorem 4.9).

Definition of the system
In this example, we consider a two-component Markov process (X^n, I^n) with values in T × {1, 2}, where T = R/Z is the one-dimensional flat torus. We fix the initial condition (X^n(0), I^n(0)) = (x_0, i_0) for some (x_0, i_0) ∈ T × {1, 2}. Let ψ(·, 1) and ψ(·, 2) be smooth functions on the torus, and write ψ′(x, i) for the derivative of x ↦ ψ(x, i). We call these functions potentials. The evolution of (X^n, I^n) is characterized by a stochastic differential equation (1) driven by a standard Brownian motion B_t. The process I^n is a continuous-time Markov chain on {1, 2}, which evolves with jump rates r_ij(·) as specified in (2). In summary, the spatial component X^n is a drift-diffusion process, the configurational component I^n is a continuous-time Markov chain on {1, 2}, and the two are coupled through their respective rates. For details about the rigorous construction of such switching drift-diffusion processes, we refer to [YZ10, Chapter 2]. Figure 2 depicts a typical realization of (X^n, I^n), where the trajectory of the spatial component is lifted from the torus to R.
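To make the dynamics concrete, the following sketch simulates one consistent reading of (1) and (2) by Euler-Maruyama, namely dX^n = −ψ′(nX^n, I^n) dt + √(2/n) dB with switching rate n·r(I^n, nX^n). The diffusion constant, the potentials and the rates here are illustrative stand-ins, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20                                   # scaling parameter in (1)-(2)

# Hypothetical potentials psi(., 1), psi(., 2) and rates r_12, r_21;
# the paper only assumes smoothness, so these are illustrative stand-ins.
def dpsi(y, i):                          # psi(y, i) = cos(2 pi (y - 0.3 i))
    return -2 * np.pi * np.sin(2 * np.pi * (y - 0.3 * i))

def rate(i, y):                          # r_12(y) for i = 0, r_21(y) for i = 1
    return 2.0 + np.cos(2 * np.pi * y + i)

def simulate(T=10.0, dt=1e-4):
    """Euler-Maruyama for one reading of (1)-(2), modulo constants:
    dX^n = -psi'(n X^n, I^n) dt + sqrt(2/n) dB, switching at rate n r(I^n, n X^n)."""
    steps = int(T / dt)
    x, i = 0.0, 0                        # lifted spatial component, configuration
    xs = np.empty(steps + 1)
    xs[0] = x
    for k in range(steps):
        x += -dpsi(n * x, i) * dt + np.sqrt(2 * dt / n) * rng.standard_normal()
        if rng.random() < n * rate(i, n * x) * dt:
            i = 1 - i                    # configurational switch 1 <-> 2
        xs[k + 1] = x
    return xs

traj = simulate()
print("empirical velocity:", (traj[-1] - traj[0]) / 10.0)
```

The empirical velocity printed at the end is the finite-n, finite-time analogue of the macroscopic velocity v studied below; the large-deviation analysis characterizes its limit.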

Figure 2: A typical time evolution of (X^n, I^n) satisfying (1) and (2). In the left diagram, the black bullet represents a particle that moves according to (1). A red arrow indicates the dynamics of the spatial component X^n. A green arrow indicates a switch of the configurational component I^n, which switches the potential in which the particle is diffusing. In the right diagram, the spatial evolution is shown in an x-t-diagram. The red dots represent the values of X^n, while a green bullet indicates a switch of the configurational component I^n. The dynamics of the particle comprises the following typical phases. 1 and 4: diffusive motion of X^n near a potential minimum; 2 and 5: configurational switch of I^n with the effect of switching to another potential; 3 and 6: flow of X^n towards a minimum of the other potential. In both diagrams, the spatial trajectory is shown lifted from the torus T to R.
The specific n-scaling may be motivated by starting from a process (X_t, I_t) satisfying an unrescaled stochastic differential equation, where the jump process I_t on {1, 2} evolves with jump rates r_ij(·). The large-scale behaviour of (X_t, I_t) is studied by considering the rescaled process (X^n_t, I^n_t) defined by X^n_t := (1/n) X_{nt} and I^n_t := I_{nt}, and characterizing the dynamics of (X^n_t, I^n_t) for large values of n. This rescaling may be interpreted as zooming out of the x-t phase space, which is illustrated below in Figure 3. Itô calculus implies that the process (X^n_t, I^n_t) satisfies (1) and (2).
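The Itô computation behind the last sentence can be sketched as follows, assuming for concreteness the unrescaled dynamics $\mathrm{d}X_t = b(X_t, I_t)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}B_t$ (the diffusion constant is illustrative):

```latex
X^n_t := \tfrac{1}{n} X_{nt}
\;\Longrightarrow\;
\mathrm{d}X^n_t = \tfrac{1}{n}\,\mathrm{d}X_{nt}
  = b\big(nX^n_t,\, I^n_t\big)\,\mathrm{d}t + \tfrac{\sqrt{2}}{n}\,\mathrm{d}B_{nt}
  \overset{d}{=} b\big(nX^n_t,\, I^n_t\big)\,\mathrm{d}t + \sqrt{\tfrac{2}{n}}\,\mathrm{d}\tilde{B}_t,
```

since $B_{nt}$ equals $\sqrt{n}\,\tilde{B}_t$ in distribution for a standard Brownian motion $\tilde{B}$; likewise $I^n_t = I_{nt}$ jumps at the accelerated rates $n\,r_{ij}(nX^n_t)$. This recovers the n-scaling of (1) and (2), modulo constants.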

Large deviations for this example
We are interested in the behaviour of the spatial component X^n as n → ∞.
The behaviour of X^n for large n is shown in Figure 3. This figure suggests that X^n closely follows a path with a constant velocity. Indeed, when specializing the results of this paper to the example at hand, the process (X^n, I^n) defined by (1) and (2), we find that the spatial component X^n satisfies a pathwise large-deviation principle in the limit n → ∞.

Figure 3: Two typical realizations of the spatial component X^n of the two-component process (X^n, I^n) satisfying (1) and (2). On the left, a realization is depicted for n of order one, and on the right for large n. Both graphs depict the lifted trajectory of X^n on R. For large n, realizations of X^n closely follow a path with a constant velocity v = H′(0), where the Hamiltonian H = H(p) may be derived from large-deviation theory. A more detailed illustration of the dynamics is shown in Figure 2 further above.
To describe this fact more precisely, let X := C_T[0, ∞) be the set of continuous trajectories in T, equipped with the topology of uniform convergence on compact time intervals. The spatial component X^n is a random variable in X, with a path distribution P(X^n ∈ ·) ∈ P(X). We will show that there exists a rate function I : X → [0, ∞] with which {X^n}_{n∈N} satisfies a pathwise large-deviation principle in the sense of Definition 3.2 below. The gist of this statement is that for any trajectory x ∈ X, we have, at least intuitively, (3). The notation "X^n ≈ x" indicates that X^n is close to x with respect to the topology on X, and "∼ e^{−n I(x)}" indicates a dominant contribution of the exponential. The rate function I is given by means of a Lagrangian L : R → [0, ∞) as in (4). Here I_0 : T → [0, ∞] is the rate function of the initial conditions X^n(0); because of the deterministic initial condition X^n(0) = x_0, this functional is given by I_0(x_0) = 0 and I_0 = +∞ otherwise. The Lagrangian is the Legendre dual of the Hamiltonian, and the Hamiltonian is the principal eigenvalue of an associated cell problem described in a more general context in Lemma 7.1.
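Written out schematically (with symbols as above; a sketch consistent with Definition 3.3 below), the intuition (3) and the action-integral form (4) read:

```latex
\mathbb{P}\big(X^n \approx x\big) \sim e^{-n\,\mathcal{I}(x)},
\qquad
\mathcal{I}(x) = \mathcal{I}_0(x(0)) + \int_0^\infty \mathcal{L}(\dot{x}(t))\,\mathrm{d}t
\quad\text{for absolutely continuous } x,
\qquad
\mathcal{L}(v) = \sup_{p \in \mathbb{R}} \big[\,p\,v - \mathcal{H}(p)\,\big],
```

and I(x) = +∞ for trajectories x that are not absolutely continuous.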
Here, we focus on how this large-deviation result confirms the claim suggested by Figure 3. The rate function (4) has two properties, (i) and (ii), which together characterize the unique minimizer of the rate function, and thereby in particular the typical behaviour of X^n for large n. Whenever I(x) > 0 for a path x ∈ X, then by (3), the probability that a realization of X^n is close to x in X is exponentially small in n. In fact, the large-deviation principle implies almost-sure convergence of X^n to the unique minimizer of the rate function (Theorem A.1). Uniqueness of the minimizer, Item (ii), follows from strict convexity of H(p). For the Hamiltonian of this example, strict convexity can be proven as demonstrated in [MS13, Step 4 in Appendix A].
With the large-deviation principle we can investigate which sets of potentials and rates {ψ_1, ψ_2, r_12, r_21} induce transport, that is, a non-zero macroscopic velocity v = H′(0). We do not find general sufficient conditions for transport, but can draw some conclusions if the process (X^n, I^n) satisfies detailed balance, that is, r_12 e^{−ψ_1} = C r_21 e^{−ψ_2} for some constant C > 0. Detailed balance implies that the Hamiltonian is symmetric (Theorem 4.8), and therefore v = 0 under detailed balance.

Preliminaries
In the previous section we sketched the results of this paper by means of an example. In this section we introduce the concepts that we use in the subsequent sections to obtain the general results of this paper in a rigorous way.
Large deviations. For a Polish space E, let X := D_E[0, ∞) be the set of trajectories in E that are right-continuous and have left limits. We equip X with the Skorohod topology [EK86, Section 3.5]. We work with the definition of a rate function as given in [BD19, Chapter 1]. Definition 3.1 (Rate function). We call a map I : X → [0, ∞] a rate function if for every C ≥ 0, the sub-level set {x ∈ X : I(x) ≤ C} is compact.
In particular, a rate function is lower semi-continuous. For a Borel subset A ⊆ X, we write int(A) and clos(A) for its interior and closure. Definition 3.2 (Large-deviation principle). For n = 1, 2, . . ., let P_n be a probability measure on X, and let I : X → [0, ∞] be a rate function. We say that the sequence {P_n}_{n∈N} satisfies a large-deviation principle with rate function I if for every Borel subset A ⊆ X,

− inf_{x ∈ int(A)} I(x) ≤ lim inf_{n→∞} (1/n) log P_n(A) ≤ lim sup_{n→∞} (1/n) log P_n(A) ≤ − inf_{x ∈ clos(A)} I(x).
A large-deviation principle provides an estimate of the probabilities P_n(A) on the logarithmic scale: at least intuitively, P_n(A) ∼ e^{−n inf_{x∈A} I(x)} as n → ∞. Illustrative examples of large-deviation principles can be found for instance in Ellis' note on Boltzmann's discoveries [Ell99]. General introductions to the topic are also provided in [BD19, Chapter 1] and [FK06, Chapter 3].
Identifying tractable formulas for a rate function is crucial for drawing conclusions from a large-deviation principle. In this paper, we aim for action-integral representations of rate functions. Let T^d := R^d/Z^d be the flat d-dimensional torus, and let AC([0, ∞); T^d) be the set of absolutely continuous trajectories in T^d. Definition 3.3 (Action-integral form of rate function). We say that a rate function I : X → [0, ∞] is of action-integral form if there is a map L : R^d → [0, ∞) such that I(x) = I_0(x(0)) + ∫_0^∞ L(ẋ(t)) dt for x ∈ AC([0, ∞); T^d), and I(x) = +∞ otherwise, where I_0 : T^d → [0, ∞] is a rate function. We refer to the map L as the Lagrangian.
Switching Markov processes in a periodic setting. We shall consider Markov processes defined by two-component stochastic processes (X^n, I^n) taking values in state spaces E_n that satisfy the following condition.
Condition 3.4 (Setting). Fix J ∈ N. For n ∈ N, the state space E_n is a product space E_n := E^X_n × {1, . . ., J}, where E^X_n is a compact Polish space satisfying the following: there are continuous maps ι_n : E^X_n → T^d such that for all x ∈ T^d there exist x_n ∈ E^X_n with ι_n(x_n) → x. This condition means that the E^X_n are asymptotically dense in the torus T^d. The typical example is the periodic lattice (n^{−1}Z)^d/Z^d, where the torus is recovered in the limit n → ∞. Another example is simply E^X_n ≡ T^d. When it is clear from the context, we omit ι_n in the notation. Let X_n := D_{E_n}[0, ∞). For a distribution µ ∈ P(E_n), we define an E_n-valued two-component process (X^n, I^n) with initial condition µ by defining its path distribution P^n_µ ∈ P(X_n). In order to define a path distribution, we specify a linear map L_n : D(L_n) ⊆ C(E_n) → C(E_n) on a domain D(L_n) and assume well-posedness of the martingale problem of the pair (L_n, µ); we refer to [EK86, Section 4.3] for a precise treatment of the martingale problem. We call a linear map L_n as above a generator if it gives rise to a well-posed martingale problem. We specify the generators of (X^n, I^n) from maps L^i_n, i ∈ {1, . . ., J}, with suitable domains. Condition 3.5 (Well-posedness). Let µ ∈ P(E_n). Existence and uniqueness hold for the martingale problem for (L_n, µ). Denote the solution of the martingale problem of L_n by P^n_µ. The map E_n ∋ z ↦ P^n_{δ_z} ∈ P(X_n) is Borel measurable with respect to the weak topology on P(X_n).
Condition 3.5 is the basic assumption on the processes in [FK06]. A sufficient condition for the measurability therein is given in [EK86, Theorem 4.4.6]. We do not give general conditions on a map L_n that imply Condition 3.5. For examples regarding existence and regularity properties we refer to the book of Yin and Zhu about switching hybrid diffusions [YZ10, Part I]. Definition 3.6 (Switching Markov processes in a periodic setting). Let (X^n, I^n) be a two-component Markov process taking values in E_n = E^X_n × {1, . . ., J} satisfying Condition 3.4. We call (X^n, I^n) a switching Markov process if its generator L_n is given by (5) and satisfies Condition 3.5.

Main results
In the previous section we introduced the notion of a large-deviation principle and defined switching Markov processes in a periodic setting. In this section we present our main results as depicted in the flow diagram of Figure 1 above. First, we formulate general conditions for a large-deviation principle of switching Markov processes (Theorem 4.2). Then we find an action-integral representation of the rate function under an additional convexity assumption (Theorem 4.3). The remaining theorems arise from specifications of the general setting to specific models. We prove large-deviation principles for molecular-motor models in two limit regimes (Theorems 4.6 and 4.7), and derive the fact that detailed balance and separation of scales imply symmetry of Hamiltonians (Theorems 4.8 and 4.9).

Large-deviation principle for switching Markov processes
We consider switching Markov processes (X^n, I^n) in a periodic setting in the sense of Definition 3.6, with generators of the form (5). The essence of this section is Theorem 4.2, which provides general conditions that imply a pathwise large-deviation principle of the spatial component X^n. We state the conditions in terms of nonlinear generators defined as follows.
Definition 4.1 (Nonlinear generators). Let L_n be the map defined by (5). The nonlinear generator is the map H_n f := (1/n) e^{−nf} L_n e^{nf}. We shall work under the assumption that the nonlinear generators H_n converge in the limit n → ∞. To formulate this convergence assumption, we need to introduce an additional state space E′ for collecting up-scaled variables. In the diagram relating the state spaces, η_n : E_n → T^d is the projection defined by η_n(x, i) := ι_n(x), where ι_n : E^X_n → T^d is the embedding of Condition 3.4, and the map η′_n : E_n → E′ is assumed to be continuous. We shall assume that the E_n are asymptotically dense in the sense of condition (C1), and we consider a multivalued operator H ⊆ C(T^d) × C(T^d × E′) satisfying the convergence conditions (C2) and (C3). Frequently, for any f in the domain of H, the corresponding image functions g are naturally parametrized by a set of functions on E′. Theorem 4.2 (Large-deviation principle for switching processes). Let (X^n, I^n) be a switching Markov process in the sense of Definition 3.6, with nonlinear generators H_n of Definition 4.1. Let E′ be a compact metric space satisfying (C1), and let H ⊆ C(T^d) × C(T^d × E′) be a multivalued operator satisfying (C2) and (C3) from above. Suppose that conditions (T1) and (T2) below hold. Suppose furthermore that {X^n(0)}_{n∈N} satisfies a large-deviation principle in T^d with rate function I_0. Then {X^n}_{n∈N} satisfies a large-deviation principle, and there exists a semigroup V(t) with which the rate function is given by (9).
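To see what H_n looks like in a concrete case, the following formal computation applies Definition 4.1 to an assumed switching drift-diffusion generator; the precise form of L_n and all constants are model-dependent, so this is a sketch only:

```latex
L_n f(x,i) = b_i(nx)\cdot\nabla f(x,i) + \tfrac{1}{n}\Delta f(x,i)
  + \gamma(n)\sum_{j} r_{ij}(nx)\big[f(x,j) - f(x,i)\big]
\;\Longrightarrow\;
H_n f(x,i) = \tfrac{1}{n} e^{-nf} L_n e^{nf}(x,i)
  = b_i(nx)\cdot\nabla f(x,i) + |\nabla f(x,i)|^2 + \tfrac{1}{n}\Delta f(x,i)
  + \tfrac{\gamma(n)}{n}\sum_{j} r_{ij}(nx)\big[e^{\,n(f(x,j)-f(x,i))} - 1\big].
```

The drift and the quadratic term |∇f|² survive as n → ∞, while the exponential switching term is balanced by the rate scaling γ(n); this interplay is what the convergence conditions (C2)-(C3) and the eigenvalue problem in (T2) capture.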
We give the proof in Section 5. The formula for the rate function I is not important here, which is why we report it only in (9) in the proof section. Condition (T1) means that the images depend on the variable x ∈ T^d only via the gradients ∇f(x). In the molecular-motor models, Condition (T2) is verified by solving a principal-eigenvalue problem, in which the constant H(p) is the unique principal eigenvalue of a certain cell problem.

Action-integral representation of the rate function
In the previous section, we formulated general conditions that imply a pathwise large-deviation principle. The rate function of Theorem 4.2, however, is still generic (equation (9) below). The following theorem shows that under an additional convexity assumption, the rate function is of action-integral form in the sense of Definition 3.3 above.
Theorem 4.3. Consider the setting of Theorem 4.2. For p ∈ R^d, let H(p) be the constant in (T2) of Theorem 4.2. Suppose further that p ↦ H(p) is convex. Then the rate function of Theorem 4.2 is of action-integral form with the Lagrangian defined by L(v) := sup_{p∈R^d} [⟨p, v⟩ − H(p)]. Theorem 4.3 is proven in Section 6.

Large deviations for models of molecular motors
In the previous two sections we considered general switching Markov processes in a periodic setting. In this section we further specialize to a class of stochastic processes motivated by molecular motors.
Definition 4.4 (Process modeling molecular motors). The pair (X^n_t, I^n_t) is the switching Markov process whose generators L^i_n are defined on the core C^2(T^d) as in (7), with switching rates γ(n) r_ij(n ·). This is an example of a switching Markov process in the sense of Definition 3.6. The example of Section 2, a stochastic model of molecular motors, corresponds to particular choices of the potentials and rates. Definition 4.5. Let J ∈ N. We call a matrix A ∈ R^{J×J} irreducible if there is no decomposition of {1, . . ., J} into two disjoint sets J_1 and J_2 such that A_ij = 0 whenever i ∈ J_1 and j ∈ J_2. Theorem 4.6 (Limit I). Let (X^n_t, I^n_t) be the Markov process of Definition 4.4 with parameter γ(n) = n. Assume that the matrix R with entries R_ij = sup_{y∈T^d} r_ij(y) is irreducible. Suppose furthermore that the family of initial conditions {X^n(0)}_{n∈N} satisfies a large-deviation principle in T^d with rate function I_0. Then the family of stochastic processes {X^n}_{n∈N} satisfies a large-deviation principle in C_{T^d}[0, ∞) with rate function of action-integral form. The Hamiltonian H(p) is the principal eigenvalue of an associated cell problem described in Lemma 7.1.
The irreducibility condition is imposed to solve the principal-eigenvalue problem that we obtain, and is inspired by sufficient conditions for solvability of a coupled system of elliptic PDEs [Swe92].
The parameter γ(n) allows us to model a time-scale separation of the components. The following theorem shows that if γ(n) scales super-linearly, then the spatial component is effectively driven by potentials averaged over the stationary measure of the fast configurational component, and the large-deviation principle is governed by an averaged Hamiltonian.
Theorem 4.7 (Limit II). Let (X^n_t, I^n_t) be the Markov process of Definition 4.4, with parameter γ(n) such that n^{−1}γ(n) → ∞ as n → ∞. Assume that for every y ∈ T^d, the matrix R(y) with entries R(y)_ij = r_ij(y) is irreducible. Suppose furthermore that the family of random variables {X^n(0)}_{n∈N} satisfies a large-deviation principle in T^d with rate function I_0. Then {X^n}_{n∈N} satisfies a large-deviation principle in C_{T^d}[0, ∞) with rate function of action-integral form. The Hamiltonian H(p) is the principal eigenvalue of an associated averaged cell problem described in Lemma 7.2.
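The principal-eigenvalue problems of Lemmas 7.1 and 7.2 are not reproduced here, but their structure can be illustrated numerically under assumptions. The sketch below discretizes a tilted operator of the schematic form (A_p φ)_i = φ_i'' + (2p + b_i)φ_i' + (p² + p b_i)φ_i + Σ_j r_ij(φ_j − φ_i), a stand-in obtained by formally conjugating a unit-diffusion switching generator with e^{py} (not the paper's exact cell problem), and takes H(p) to be the eigenvalue of maximal real part. The potentials and rates are hypothetical and chosen to satisfy the detailed-balance relation r_12 e^{−ψ_1} = r_21 e^{−ψ_2} of Section 2, so the numerics can be checked against the symmetry H(p) = H(−p) of Theorem 4.8:

```python
import numpy as np

N = 200                                  # grid points on T = R/Z
h = 1.0 / N
y = np.arange(N) * h
psi = [np.cos(2 * np.pi * y), np.sin(2 * np.pi * y)]   # stand-in potentials
b = [2 * np.pi * np.sin(2 * np.pi * y),                # b_1 = -psi_1'
     -2 * np.pi * np.cos(2 * np.pi * y)]               # b_2 = -psi_2'
rates = [np.exp(psi[0]), np.exp(psi[1])]  # r_12, r_21: detailed balance, C = 1

def tilted_matrix(p):
    """Periodic central-difference discretization of the schematic tilted
    operator (A_p phi)_i = phi_i'' + (2p + b_i) phi_i'
                         + (p^2 + p b_i) phi_i + r_i (phi_other - phi_i)."""
    A = np.zeros((2 * N, 2 * N))
    for i in range(2):
        o, oo = i * N, (1 - i) * N       # own block, other block
        for k in range(N):
            kp, km = o + (k + 1) % N, o + (k - 1) % N
            A[o + k, kp] += 1 / h**2 + (2 * p + b[i][k]) / (2 * h)
            A[o + k, km] += 1 / h**2 - (2 * p + b[i][k]) / (2 * h)
            A[o + k, o + k] += -2 / h**2 + p**2 + p * b[i][k] - rates[i][k]
            A[o + k, oo + k] += rates[i][k]
    return A

def H(p):
    """Principal eigenvalue: maximal real part of the discretized operator."""
    return np.linalg.eigvals(tilted_matrix(p)).real.max()

print("H(0) =", H(0.0))                  # untilted generator: eigenvalue 0
print("H(1) - H(-1) =", H(1.0) - H(-1.0))  # ~0 under detailed balance
```

With these stand-ins, H(0) vanishes (the untilted operator is a Markov generator) and H(1) − H(−1) is zero up to discretization error, consistent with Theorem 4.8.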

Detailed balance implies symmetric Hamiltonians
The large-deviation principles established by Theorems 4.6 and 4.7 can be used to analyse which sets of potentials and rates induce transport on macroscopic scales. To that end, we specialize to b_i(y) = −∇_y ψ_i(y) and γ(n) = n in the generators defined in (7). We say that the set of potentials and rates {r_ij, ψ_i} satisfies detailed balance if condition (8) holds for all i, j ∈ {1, . . ., J} and y ∈ T^d. Theorem 4.8 (Detailed balance implies a symmetric Hamiltonian). Consider the setting and assumptions of Theorem 4.6. Suppose that the detailed-balance condition (8) is satisfied. Then the Hamiltonian H(p) of Theorem 4.6 satisfies H(p) = H(−p). We give the proof of Theorem 4.8 here, since it is solely based on a suitable formula for H(p).
Proof of Theorem 4.8. We prove in Proposition 8.1 that under the detailed-balance condition, the principal eigenvalue H(p) is given by a variational formula, where P ⊂ P(E′) is a subset of probability measures on E′ = T^d × {1, . . ., J} specified in Proposition 8.1, R(µ) is the relative Fisher information specified in (31), and K_p(µ) is given by an infimum over vectors of functions. Let µ ∈ P. We show that K_p(µ) = K_{−p}(µ), which implies H(p) = H(−p). The sum in which the cosh(·) terms appear is symmetric under the substitution p ↦ −p, which yields the claim. With a similar analysis, we can study the behaviour of molecular motors under external forces. Let (X^n, I^n) be the stochastic process of Theorem 4.6 in dimension d = 1 with drift b_i(y) = F − ψ′(y, i), where F is a constant (modeling an external force) and ψ ∈ C^∞(T) is a smooth periodic potential. The process (X^n, I^n) is T × {1, . . ., J}-valued, where I^n_t is a jump process on {1, . . ., J} with jump rates nr_ij(nx). Under detailed balance, one can show with arguments similar to those above that the Hamiltonian for this process is symmetric around (−F). Since H(0) = 0 and H(p) is strictly convex, this means that the model predicts a positive force-velocity feedback under detailed balance: F > 0 implies ∂_p H(0) > 0, and F < 0 implies ∂_p H(0) < 0. Theorem 4.9 (Separation of time scales implies a symmetric Hamiltonian). Let the stochastic process (X^n_t, I^n_t) of Definition 4.4, with b_i = −∇ψ_i, satisfy the assumptions of Theorem 4.7. Suppose in addition that the rates r_ij(·) are constant on T^d. Then H(p) = H(−p), where H(p) is the Hamiltonian in Theorem 4.7.
Since the derivation of the required formula for H(p) is similar to that in the proof of Theorem 4.8, we omit the details and only give a sketch of the argument here. Sketch of proof of Theorem 4.9. The principal eigenvalue H(p) is given by a variational formula with P and R specified below. The bijective transformation ϕ ↦ (−ϕ) leaves the infimum in K_p(µ) invariant, and therefore we have K_p(µ) = K_{−p}(µ) for all µ ∈ P. This implies H(p) = H(−p).
In the formula for H(p), the set of probability measures P ⊂ P(T^d) and the map R, the relative Fisher information, are defined with respect to the stationary measure ν of the jump process on {1, . . ., J} with rates r_ij.

Proof of large-deviation principle for switching Markov processes
The main point of this section is to prove Theorem 4.2, the large-deviation principle for switching Markov processes in a periodic setting. The proof is based on a connection between large deviations and Hamilton-Jacobi equations, which we first make explicit in Section 5.1 by adapting theorems of [FK06] to our setting.

Strategy of proof
(iii) A function u_1 : E → R is a strong viscosity subsolution of (1 − τH)u = h if it is bounded and upper semicontinuous, and if for all (f, g) ∈ H and x ∈ E, whenever (u_1 − f)(x) = sup_E (u_1 − f), then there exists a z′ ∈ E′ such that u_1(x) − τ g(x, z′) ≤ h(x). Similarly for strong viscosity supersolutions.
A function u ∈ C(E) is called a viscosity solution of (1 − τH)u = h if it is both a viscosity sub-and supersolution.
Let us briefly highlight the adaptations we made with respect to [FK06]. First, formulating viscosity solutions via sequences as in [FK06, Definition 7.1] is only required when working with non-compact spaces, while in the context of this paper we only work in compact spaces. Second, the product space E × E′ in this paper corresponds to the set E in [FK06].

Definition 5.2 (Comparison principle). The comparison principle holds for viscosity sub- and supersolutions of (1 − τH)u = h if for any viscosity subsolution u_1 and viscosity supersolution u_2, we have u_1 ≤ u_2 on E.
If the comparison principle holds, then viscosity solutions are unique, since two viscosity solutions u, v satisfy u ≤ v and v ≤ u. A general large-deviation theorem. Just as in Theorem 4.2, we work with compact Polish spaces E_n, E and E′ that are related via continuous embeddings η_n and η′_n. The following theorem is an adaptation of [FK06, Theorem 7.18] to our setting. This adaptation is obtained by collecting in one place assumptions that are mentioned in several places in [FK06], and specializing them to the compact setting.
Theorem 5.3. Let L_n be the generator of an E_n-valued process Y_n, and let H_n be the nonlinear generators defined by H_n f = (1/n) e^{−nf} L_n e^{nf}. Let the compact Polish spaces E_n, E and E′ be related as in the above diagram. In addition, suppose: (i) (Condition 7.9 of [FK06] on the state spaces) There exist an index set Q and approximating state spaces A^q_n ⊆ E_n, q ∈ Q, with the properties (a)-(d) required there. (ii) (Convergence of nonlinear generators) There are operators H† and H‡ which are the limit of the H_n's in the following sense: for each pair (f, g) in the operator and each q ∈ Q, there exist f_n ∈ D(H_n) such that lim_{n→∞} sup_{y∈A^q_n} |f_n(y) − f(η_n(y))| = 0, and, for every sequence y_n ∈ E_n such that η_n(y_n) → x ∈ E and η′_n(y_n) → z′ ∈ E′, we have lim inf_{n→∞} H_n f_n(y_n) ≥ g(x, z′), together with the corresponding lim sup condition for the other operator.
(iii) (Comparison principle) For each h ∈ C(E) and τ > 0, the comparison principle holds for viscosity subsolutions of (1 − τH†)u = h and viscosity supersolutions of (1 − τH‡)u = h. Let X^n_t := η_n(Y^n_t) be the corresponding E-valued process. Suppose that {X^n(0)}_{n∈N} satisfies a large-deviation principle in E with rate function I_0. Then {X^n}_{n∈N} satisfies the large-deviation principle with a rate function I given by (9), in terms of conditional rate functions I_t(z | y) for z, y ∈ E and a semigroup V(t). The semigroup V(t) is defined via the Crandall-Liggett theorem; for details we refer to [FK06, Chapter 5].
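For orientation, the rate function (9) of [FK06, Theorem 7.18] has, schematically, the following form (a sketch of the cited statement, not reproduced verbatim):

```latex
\mathcal{I}(x) = \mathcal{I}_0(x(0))
  + \sup_{k \in \mathbb{N}}\;\sup_{0 = t_0 < t_1 < \dots < t_k}\;
    \sum_{i=1}^{k} \mathcal{I}_{t_i - t_{i-1}}\big(x(t_i)\,\big|\,x(t_{i-1})\big),
\qquad
\mathcal{I}_t(z \mid y) = \sup_{f \in C(E)} \big[\,f(z) - \mathbf{V}(t)f(y)\,\big].
```

The suprema run over finite partitions of the time axis, so the rate function is determined by the semigroup V(t) acting on continuous functions.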

Proof of Theorem 4.2
We prove Theorem 4.2 by verifying the conditions of Theorem 5.3, which are the convergence of nonlinear generators (Proposition 5.4) and the comparison principle (Proposition 5.5). The rest of this section below the proof of Theorem 4.2 is devoted to proving these propositions. We point out that the main challenge is to prove the comparison principle using only (T1) and (T2) of Theorem 4.2.
Here η_n : E_n → T^d is defined by η_n(x, i) = ι_n(x), and η′_n : E_n → E′ is a continuous map. In the notation of Theorem 5.3, we have E = T^d. For verifying the general condition (i) of Theorem 5.3 on the approximating state spaces A^q_n, we take the singleton Q = {q} and set A^q_n := E_n. Then part (a) holds, and parts (b) and (d) are a consequence of Condition 3.4 on E_n, which says that for any x ∈ T^d, there exist x_n ∈ E^X_n such that ι_n(x_n) → x. Part (c) follows by taking the compact sets K^q_1 := T^d and K^q_2 := T^d × E′. We verify the convergence condition (ii) of Theorem 5.3. By (T1), part (C2), there exist suitable approximating functions f_n. With these f_n, both conditions (a) and (b) are simultaneously satisfied for the operator H = H† = H‡, where condition (C1) guarantees convergence at any point, and the required uniformity follows from the uniform-convergence condition (C2).
For proving Proposition 5.5, we use two operators H_1, H_2 that are derived from a multivalued limit H. We define H_1, H_2 : C(E) → M(E) in terms of two maps H_1, H_2 : R^d → R. We prove Proposition 5.5 with the following lemmas.
Lemma 5.6 (Local operators admit strong solutions). Let H ⊆ C^1(T^d) × C(T^d × E′) be a multivalued limit operator satisfying (T1) of Theorem 4.2. Then for any τ > 0 and h ∈ C(T^d), viscosity solutions of (1 − τH)u = h coincide with strong viscosity solutions in the sense of Definition 5.1.
Proof of Proposition 5.5. Let u_1 be a subsolution and u_2 be a supersolution of the equation (1 − τH)u = h. By Lemma 5.6, u_1 is a strong subsolution and u_2 a strong supersolution of (1 − τH)u = h. By Lemma 5.7, u_1 is a strong subsolution of (1 − τH_1)u = h, and u_2 is a strong supersolution of (1 − τH_2)u = h.
With that, we establish below the inequality (11), which bounds max(u_1 − u_2) in terms of points x_δ, x′_δ ∈ T^d with dist(x_δ, x′_δ) → 0 as δ → 0, and certain p_δ ∈ R^d. Then, using that h ∈ C(T^d) is uniformly continuous since T^d is compact, and that H_1(p_δ) ≤ H_2(p_δ) by Lemma 5.8, we can estimate further, and max(u_1 − u_2) ≤ 0 follows by taking the limit δ → 0.
We are left with proving (11). Define Φ_δ(x, x′) := u_1(x) − u_2(x′) − Ψ(x, x′)/δ, where Ψ is given by (12). Then Ψ ≥ 0, and Ψ(x, x′) = 0 holds if and only if x = x′. By boundedness and upper semicontinuity of u_1 and (−u_2), and compactness of T^d × T^d, for each δ > 0 there exists a pair (x_δ, x′_δ) ∈ T^d × T^d maximizing Φ_δ. Since u_1 and u_2 are bounded, we obtain a uniform bound on Ψ(x_δ, x′_δ)/δ. Hence Ψ(x_δ, x′_δ) → 0 as δ → 0.
In order to use the sub- and supersolution properties of u_1 and u_2, introduce the smooth test functions f^δ_1(x) := u_2(x′_δ) + Ψ(x, x′_δ)/δ and f^δ_2(x′) := u_1(x_δ) − Ψ(x_δ, x′)/δ, which are both in the domain of H, and hence in the domain of H_1 and H_2, respectively. Furthermore, (u_1 − f^δ_1) has a maximum at x = x_δ, and (f^δ_2 − u_2) has a maximum at x′ = x′_δ, by definition of (x_δ, x′_δ) and Φ_δ. Since u_1 is a strong subsolution of (1 − τH_1)u = h and u_2 is a strong supersolution of (1 − τH_2)u = h, we can estimate max(u_1 − u_2), which establishes (11) and thereby finishes the proof.
In the rest of this section, we prove Lemmas 5.6, 5.7 and 5.8. Regarding Lemma 5.6, a proof for single-valued operators is given in [FK06, Lemma 9.9].
Proof of Lemma 5.6. Let τ > 0 and h ∈ C(T^d). We first verify that subsolutions are strong subsolutions. Let u_1 be a subsolution of (1 − τH)u = h, let (f, H_{f,ϕ}) ∈ H, and let x ∈ T^d be a point at which (u_1 − f) attains its supremum. The function f′ defined by f′(x′) := Ψ(x′, x), with Ψ(x′, x) defined by (12), is smooth, and therefore (f + f′) is in the domain D(H). Then x is the unique maximal point of (u_1 − (f + f′)). Since u_1 is a subsolution, there exists a point z ∈ E′ such that the subsolution inequality holds at x with this z. Using ∇f′(x) = 0 and that H depends only on gradients by (T1), we obtain the same inequality for f itself. Thus u_1 is a strong subsolution. The argument is similar for the supersolution case, where one can use (−f′).
Conversely, when given a strong sub- or supersolution u_1 or u_2, for every f ∈ D(H) the functions (u_1 − f) and (f − u_2) attain their suprema at some x_1, x_2 ∈ T^d, due to the continuity assumptions on the domain of H, the semicontinuity properties of u_1 and u_2, and compactness of T^d. By the strong-solution properties, the sub- and supersolution inequalities follow.
Proof of Lemma 5.7. Let u_1 be a strong subsolution of (1 − τH)u = h. For any ϕ there exists a point z ∈ E′ such that the subsolution inequality (14) holds. Therefore the inequality holds after taking the infimum over z ∈ E′. Since the point x ∈ T^d is independent of ϕ, we may subsequently optimize over ϕ, and obtain that u_1 is a strong subsolution of (1 − τH_1)u = h. The argument is similar for supersolutions.
Proof of Lemma 5.8. By assumption, for every p ∈ R^d there exists a function ϕ_p ∈ C(E′) such that the image H_{ϕ_p}(p, z) is constant in z ∈ E′ and equal to H(p). Thus the infimum over z and the supremum over z of H_{ϕ_p}(p, ·) both equal H(p). Taking the supremum and infimum over ϕ, we find H_1(p) ≤ H(p) ≤ H_2(p), which finishes the proof.

Proof of action-integral representation
In this section we prove Theorem 4.3, the action-integral representation of the rate function of Theorem 4.2, by following the strategy outlined in [FK06, Chapter 8]. We first briefly summarize the strategy in Section 6.1, specialized to our setting.

Strategy of proof
Let H = H(p) be the Hamiltonian of Theorem 4.3 and let L = L(v) be the associated Lagrangian defined by the Legendre transform L(v) := sup_{p ∈ R^d} [⟨p, v⟩ − H(p)], and let AC_{T^d}[0, ∞) denote the set of absolutely continuous paths in the torus. The map V_NS(t) is the Nisio semigroup with cost function L. In Definition 8.1 and Equation (8.10) in [FK06], the Nisio semigroup is defined by means of relaxed controls in order to cover a general class of possible cost functions. Since the Lagrangian L(v) is convex, the semigroup V_NS(t) equals the semigroup given in (8.10) of [FK06], which can be seen by using that λ_s = δ_{∂_s x(s)} is an admissible control and by applying Jensen's inequality. Such an argument is given for example in Theorem 10.22 in [FK06].
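To make the Legendre-transform relation between H and L concrete, the following numerical sketch (our illustration, not part of the proof) uses the hypothetical convex Hamiltonian H(p) = cosh(p) − 1, which satisfies H(0) = 0:

```python
import math

# Hypothetical convex Hamiltonian with H(0) = 0 (a stand-in for
# the principal eigenvalue, not the paper's actual H).
def H(p):
    return math.cosh(p) - 1.0

def lagrangian(v, p_max=10.0, steps=20001):
    # L(v) = sup_p [ p*v - H(p) ], approximated by a grid search in p
    grid = (-p_max + 2.0 * p_max * k / (steps - 1) for k in range(steps))
    return max(p * v - H(p) for p in grid)

# For this H the transform is explicit: L(v) = v*asinh(v) - sqrt(1+v^2) + 1.
exact = math.asinh(1.0) - math.sqrt(2.0) + 1.0
assert abs(lagrangian(1.0) - exact) < 1e-3
assert lagrangian(0.0) == 0.0   # L >= 0 with L(0) = 0, since H >= 0 = H(0)
```

The nonnegativity L ≥ 0 with L(0) = 0 used below in verifying Item (a) is visible here: it follows solely from H ≥ 0 = H(0).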
The rate function I of Theorem 4.2 is given in terms of a limiting semigroup V(t), as shown in equations (9) and (10). The desired action-integral representation follows if the semigroup V(t) of Theorem 4.2 equals the Nisio semigroup V_NS(t) defined by (16). In [FK06, Chapter 8], the equality of semigroups is traced back to conditions on their generators. In our case, the generator of the limiting semigroup is the limiting multivalued operator H of Theorem 4.2, and the generator of the Nisio semigroup is an operator H defined by the Hamiltonian H(p). We summarize in Proposition 6.1 below that the generators satisfy the required conditions of [FK06, Chapter 8], and show that these conditions suffice to prove the action-integral representation.

Proof of Theorem 4.3
In this section, we first prove Theorem 4.3 by means of Proposition 6.1 below; the rest of the section is then devoted to proving Proposition 6.1. The items of Proposition 6.1 used in the proof are the following. (i) The Lagrangian (15) and the operator H satisfy Conditions 8.9, 8.10 and 8.11 of [FK06], with a suitable set of controls. (ii) The comparison principle (Definition 5.2) holds for viscosity sub- and supersolutions of (1 − τH)u = h.
(iii) Every viscosity solution u of (1 − τH)u = h, with the multivalued operator H of Theorem 4.2, is also a viscosity solution of (1 − τH)u = h with the single-valued operator H defined above.

Proof of Theorem 4.3. Let V(t) be the semigroup obtained in Theorem 4.2 and let V_NS(t) be the Nisio semigroup (16). We shall verify that V(t) = V_NS(t).
Then by [FK06, Theorem 8.14], the rate function of Theorem 4.2 (given by (9)) satisfies the control representation (8.18) of [FK06]. The action-integral representation follows from this control representation by applying Jensen's inequality.
By [FK06, Theorem 8.27], we obtain V_NS(t) = V̄(t), where the semigroup V̄(t) is defined via the operator H. The conditions of Theorem 8.27 are satisfied since Conditions 8.9, 8.10 and 8.11 of [FK06] hold by Item (i), and since the comparison principle holds by Item (ii).
By [FK06, Corollary 8.29], we obtain V̄(t) = V(t). The conditions of Corollary 8.29 are satisfied: Item (iii) above corresponds to Item (a) of Corollary 8.29, the conditions of [FK06, Theorem 6.14] are satisfied under the assumptions of our Theorem 4.2, the conditions of [FK06, Theorem 8.27] are satisfied for the same reasons as mentioned above, and D_α = D(H).
Proof of (i) in Proposition 6.1. We first show that the following Items (a), (b), (c) imply Conditions 8.9, 8.10 and 8.11 of [FK06], which are formulated there in order to cover a more general and non-compact setting.
(c) For each x_0 ∈ E and every f ∈ D(H), there exists an absolutely continuous path x : [0, ∞) → T^d with x(0) = x_0 such that (18) holds.

Regarding Items (1)–(5) of [FK06, Condition 8.9], the operator H and the Lagrangian L are checked against each item further below. We turn to verifying Items (a), (b) and (c). Since H(0) = 0, we have L ≥ 0. The Legendre transform L is convex, and it is lower semicontinuous since the map H(p) is convex and finite-valued, hence in particular continuous. For C ≥ 0, the set {v ∈ R^d : L(v) ≤ C} is bounded, and hence relatively compact. Item (b) can be proven as in [FK06, Lemma 10.21]; we give the proof here. Let f ∈ D(H). There exists a constant C_f such that the relevant bound holds for all (x_0, v); for s ≥ 0, one then defines a suitable right-continuous nondecreasing map ϕ(s). We finish the proof by verifying Item (c). This is shown in [Kra16, Lemma 3.2.3] under the assumption of continuous differentiability of H(p), by solving a differential equation with a globally bounded vector field. Here, we verify Item (c) under the milder assumption of convexity of H(p), by solving a suitable subdifferential equation. For p_0 ∈ R^d, define the subdifferential ∂H(p_0) at p_0 as the set

∂H(p_0) := {ξ ∈ R^d : H(p) ≥ H(p_0) + ⟨ξ, p − p_0⟩ for all p ∈ R^d}.

We shall solve, for any f ∈ C^1(T^d), the subdifferential equation ẋ ∈ ∂H(∇f(x)). That is, we show that for any initial condition x_0 ∈ T^d, there exists an absolutely continuous path x : [0, ∞) → T^d satisfying both x(0) = x_0 and ẋ(t) ∈ ∂H(∇f(x(t))) almost everywhere on [0, ∞). Then (18) follows: by the definition of L as a supremum, L(ẋ(s)) ≥ ⟨∇f(x(s)), ẋ(s)⟩ − H(∇f(x(s))), and integrating gives one inequality in (18). Regarding the other inequality, since ẋ ∈ ∂H(∇f(x)), we know that for almost every t ∈ [0, ∞) and for all p ∈ R^d we have H(p) ≥ H(∇f(x(t))) + ⟨ẋ(t), p − ∇f(x(t))⟩, hence L(ẋ(t)) ≤ ⟨∇f(x(t)), ẋ(t)⟩ − H(∇f(x(t))), and integrating gives the other inequality.
For solving the subdifferential equation, define F(x) := ∂H(∇f(x)), where the function f ∈ C^1(T^d) is regarded as a periodic function on R^d. We apply Lemma 5.1 in [Dei92] for solving ẋ ∈ F(x). The conditions of Lemma 5.1, in the case of R^d, are satisfied if the following holds: sup_{x ∈ R^d} sup_{ξ ∈ F(x)} |ξ| is finite; for all x ∈ R^d, the set F(x) is non-empty, closed and convex; and the map x ↦ F(x) is upper semicontinuous.
Any ξ ∈ ∂H(∇f(x)) is bounded in terms of the values of H in a neighborhood of ∇f(x). By continuous differentiability and periodicity of f, and continuity of H, this bound is uniform in x, and we obtain sup_{x ∈ R^d} sup_{ξ ∈ F(x)} |ξ| < ∞. For any x ∈ R^d, the set F(x) is non-empty, since the subdifferential of a proper convex function H(·) is non-empty at points where H(·) is finite and continuous (see e.g. [Roc66, Th. 23.4]). Furthermore, F(x) is convex and closed, which follows from the properties of a subdifferential set.
Regarding upper semicontinuity, recall the definition from [Dei92]: the map x ↦ F(x) is upper semicontinuous if for every closed set A ⊆ R^d, the preimage F^{-1}(A) := {x : F(x) ∩ A ≠ ∅} is closed. Thus let A be closed, and let x_n ∈ F^{-1}(A) with x_n → x. That means for all n ∈ N that the sets ∂H(∇f(x_n)) ∩ A are non-empty, and consequently there exists a sequence ξ_n ∈ F(x_n) ∩ A. We proved above that the set F(y) is uniformly bounded in y ∈ R^d. Hence the sequence ξ_n is bounded, and passing to a subsequence if necessary, it converges to some ξ. By definition of F(x_n), for all p ∈ R^d, H(p) ≥ H(∇f(x_n)) + ⟨ξ_n, p − ∇f(x_n)⟩. Passing to the limit, we obtain that for all p ∈ R^d, H(p) ≥ H(∇f(x)) + ⟨ξ, p − ∇f(x)⟩. This implies by definition that ξ ∈ ∂H(∇f(x)). Since ξ_n ∈ A and A is closed, we have ξ ∈ A. Hence x ∈ F^{-1}(A), and F^{-1}(A) is indeed closed.
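In the differentiable case the subdifferential flow reduces to an ODE; the following sketch (hypothetical smooth H and f, our illustration only) solves it by explicit Euler on the one-dimensional torus and checks the subgradient inequality that yields (18):

```python
import math

# Hypothetical smooth convex Hamiltonian: dH(p) = {H'(p)}, so the
# subdifferential equation xdot in dH(grad f(x)) is an ODE.
H = lambda p: math.cosh(p) - 1.0        # convex, H(0) = 0
dH = lambda p: math.sinh(p)             # its derivative (the subgradient)
f = lambda x: math.sin(2 * math.pi * x) / (2 * math.pi)
df = lambda x: math.cos(2 * math.pi * x)

def flow(x0, T=1.0, steps=10000):
    x, dt = x0, T / steps
    for _ in range(steps):
        x = (x + dt * dH(df(x))) % 1.0  # remain on the torus [0, 1)
    return x

# Along the flow, the subgradient inequality
#   H(q) >= H(p0) + H'(p0) * (q - p0)   for all q,
# with p0 = f'(x(t)), is what integrates to estimate (18).
x = flow(0.1)
p0 = df(x)
for q in (-1.0, 0.0, 0.5, 2.0):
    assert H(q) >= H(p0) + dH(p0) * (q - p0) - 1e-12
```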
Proof of (ii) in Proposition 6.1. The comparison principle for the operator H follows from the fact that H f = H(∇f) depends on x only via gradients. Indeed, the doubling-of-variables argument in the proof of Proposition 5.5 applies verbatim to subsolutions u_1 and supersolutions u_2 of (1 − τH)u = h, and max(u_1 − u_2) ≤ 0 follows by taking the limit δ → 0.
Proof of (iii) in Proposition 6.1. Let u ∈ C(T^d) be a viscosity solution of (1 − τH)u = h, with the multivalued operator H of Theorem 4.2. By Lemmas 5.6 and 5.7, u is a strong viscosity subsolution of (1 − τH_1)u = h and a strong viscosity supersolution of (1 − τH_2)u = h. In the proof of Lemma 5.8 we obtained H_1 ≤ H ≤ H_2, which in particular implies the inequalities −H_1 ≥ −H ≥ −H_2. With that, we find that u is both a strong viscosity sub- and supersolution of (1 − τH)u = h, with the single-valued operator H defined by the Hamiltonian H(p).

Proof of large deviations for molecular motors
In this section, we consider the stochastic process (X_n, I_n) of Definition 4.4 and prove Theorems 4.6 and 4.7. The generator L_n of (X_n, I_n) is given below, with state space E_n = E^X_n × {1, …, J} and γ(n) > 0. We frequently write f(x, i) = f_i(x).
The nonlinear generators defined by H_n f = (1/n) e^{−nf} L_n e^{nf} are given by
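The definition H_n f = (1/n) e^{−nf} L_n e^{nf} can be illustrated in the simplest finite-state setting (a sketch with made-up switching rates, not the paper's L_n):

```python
import math

# Two-state Markov jump generator Ln with illustrative rates r12, r21,
# and the nonlinear generator (H_n f)(i) = (1/n) e^{-n f_i} (Ln e^{n f})(i).
def nonlinear_generator(Ln, f, n):
    J = len(f)
    return [sum(Ln[i][j] * math.exp(n * (f[j] - f[i])) for j in range(J)) / n
            for i in range(J)]

r12, r21 = 2.0, 3.0
Ln = [[-r12, r12], [r21, -r21]]      # generator matrix: rows sum to zero

# Constant f: e^{nf} lies in the kernel of Ln, so H_n f vanishes.
assert nonlinear_generator(Ln, [0.7, 0.7], n=10) == [0.0, 0.0]

# Non-constant f: (H_n f)(i) = (1/n) sum_{j != i} r_ij (e^{n(f_j - f_i)} - 1),
# i.e. the jump rates are exponentially tilted by the differences of f.
out = nonlinear_generator(Ln, [0.0, 0.1], n=10)
assert abs(out[0] - (r12 / 10) * (math.exp(1.0) - 1.0)) < 1e-12
```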

Proof of Theorem 4.6
Verification of (T1) of Theorem 4.2. Recall that γ(n) = n. Choosing suitable functions f_n (a perturbation of f by a term of order 1/n involving ϕ), we find an expansion in which ∇_y and ∆_y denote the gradient and Laplacian with respect to the variable y = nx. The only term of order 1/n that remains is (1/n) ∆f(x)/2. This suggests taking the remainder terms as the definition of the multivalued operator H. In the notation of Theorem 4.2, we choose E′ = T^d × {1, …, J} as the state space of the macroscopic variables, and define H accordingly, where we write ϕ = (ϕ_1, …, ϕ_J) via the identification C^2(E′) ≅ (C^2(T^d))^J.
We now verify (C1), (C2) and (C3) of (T1). For (C1), define the maps η̂_n : E_n → E′ by η̂_n(x, i) := (nx, i), and recall that the maps η_n : E_n → T^d are the projections η_n(x, i) := x. For any (x, y, i) ∈ T^d × E′, we search for elements (y_n, i_n) ∈ T^d × {1, …, J} such that both η_n(y_n, i_n) → x and η̂_n(y_n, i_n) → (y, i) as n → ∞. For d = 1, the point y_n := (1/n)(⌊nx⌋ + y) satisfies y_n → x and ny_n = y in T^d (i.e. modulo 1). For d ≥ 2, this construction can be done for each coordinate. Therefore, (C1) holds. Item (C3), the fact that the images H_{f,ϕ} depend on x only via the gradients of f, can be recognized in (21).
Verification of (T2) of Theorem 4.2. Let f be a function in D(H) = C^2(T^d) and x ∈ T^d. We establish the existence of a vector function ϕ = (ϕ_1, …, ϕ_J) ∈ (C^2(T^d))^J such that for all (y, i) ∈ E′ = T^d × {1, …, J} and some constant H(∇f(x)) ∈ R, the image H_ϕ(∇f(x), y, i) equals H(∇f(x)). For the flat torus E′, this means that for fixed ∇f(x) = p ∈ R^d, we search for a vector function ϕ_p such that H_{ϕ_p}(p, y, i) = H(p) becomes independent of the variables (y, i) ∈ E′. We can find this vector function by solving a principal-eigenvalue problem. We prove Item (T2) with the following lemma.
Lemma 7.1. Let E′ = T^d × {1, …, J} and let H be the limit operator (20). Then: (a) For f ∈ D(H), the limiting images have the form H_ϕ(∇f(x), y, i) = (e^{−ϕ} (B_p + V_p + R) e^{ϕ})(y, i) with p = ∇f(x). (b) For any p ∈ R^d, there exists an eigenfunction g_p = (g^1_p, …, g^J_p) ∈ (C^2(T^d))^J with strictly positive component functions, g^i_p > 0 on T^d for i = 1, …, J, and an eigenvalue H(p) ∈ R such that (B_p + V_p + R) g_p = H(p) g_p. Now (T2) follows by (a) and (b), since with ϕ_p := log g_p, the image H_{ϕ_p}(p, y, i) is constant and equal to H(p).
Proof of Lemma 7.1. Writing p = ∇f(x), Item (a) follows directly by regrouping the terms in (21). Regarding Item (b), (B_p + V_p + R) g_p = H(p) g_p is a system of weakly coupled nonlinear elliptic PDEs on the flat torus. They are weakly coupled in the sense that the component functions g^i_p are coupled only in the lowest-order terms by means of the operator R, while the operators B_p and V_p act solely on the diagonal. By Proposition B.2, there exist a λ(p) ∈ R and g_p > 0 such that (−B_p − V_p − R) g_p = λ(p) g_p. Thereby, (B_p + V_p + R) g_p = H(p) g_p follows with the same eigenfunction g_p > 0 and the principal eigenvalue H(p) = −λ(p). This finishes the verification of (T2).
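A finite-dimensional caricature of this principal-eigenvalue problem (made-up numbers, J = 3, spatial dependence frozen; our illustration of the Perron–Frobenius mechanism behind Proposition B.2, not the elliptic setting itself): for a matrix A = V + R with a potential on the diagonal and irreducible non-negative switching rates off the diagonal, the principal eigenvalue and a strictly positive eigenvector can be computed by power iteration on the non-negative matrix A + cI.

```python
# Power iteration for the Perron eigenpair of A + c*I, shifted back to A.
def principal_eigenvalue(A, iters=3000):
    J = len(A)
    c = 1.0 + max(-A[i][i] for i in range(J))   # shift to non-negativity
    v, lam = [1.0] * J, 1.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(J)) + c * v[i] for i in range(J)]
        lam = max(w)
        v = [x / lam for x in w]                # max-normalized iterate
    return lam - c, v

# Illustrative A = V + R: potential on the diagonal, irreducible
# non-negative switching rates off the diagonal.
A = [[0.5, 1.0, 0.0],
     [2.0, -0.2, 1.0],
     [0.0, 3.0, 0.1]]
lam, v = principal_eigenvalue(A)
assert all(x > 0 for x in v)                    # strictly positive eigenvector
for i in range(3):                              # eigenvalue residual is tiny
    assert abs(sum(A[i][j] * v[j] for j in range(3)) - lam * v[i]) < 1e-8
```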
Verification of (T3) of Theorem 4.3. We prove that the principal eigenvalue H(p) of Lemma 7.1 is convex in p ∈ R^d and satisfies H(0) = 0. By Proposition B.2, the eigenvalue H(p) = −λ(p) admits a variational representation in terms of a map F = F(p, ϕ), which is jointly convex in p and ϕ. For the eigenfunction ϕ = ϕ_p, equality holds in the sense that for any z ∈ E′, we have H(p) = F(p, ϕ_p)(z). Therefore, we obtain for τ ∈ [0, 1] and any p_1, p_2 ∈ R^d with corresponding eigenfunctions g_1 = e^{ϕ_1} and g_2 = e^{ϕ_2} that H(τp_1 + (1 − τ)p_2) ≤ τH(p_1) + (1 − τ)H(p_2). Regarding the claim H(0) = 0, we choose the constant function ϕ = (1, …, 1) in the variational representation of H(p), and obtain the estimate H(0) ≤ 0. For the opposite inequality, we note that any ϕ ∈ C^2(E′), being continuous on the compact set E′, admits a global minimum z_m = (y_m, i_m) ∈ E′. Thereby, noting that V_0 ≡ 0, we find sup_z F(0, ϕ)(z) ≥ F(0, ϕ)(z_m) ≥ 0. This finishes the verification of (T3), and thereby the proof of Theorem 4.6.

Proof of Theorem 4.7
In this section, we consider the process (X_n, I_n) of Definition 4.4 in the limit regime (1/n) γ(n) → ∞ as n → ∞. As above in the proof of Theorem 4.6, we start with the nonlinear generator H_n given by (19), and verify Conditions (T1), (T2) and (T3) of Theorems 4.2 and 4.3.
Verification of (T1) of Theorem 4.2. We choose functions f_n(x, i) that perturb f by a correction of order 1/γ(n) involving ξ, and abbreviate y = nx in the following computation. Computing H_n f_n results in terms of the form r_ij(y) [e^{n(ξ(y,j) − ξ(y,i))/γ(n)} − 1].
The n/γ(n) terms vanish as n → ∞, and the last term converges to a switching term linear in the differences ξ(y, j) − ξ(y, i). Therefore, we choose again E′ := T^d × {1, …, J} as the state space of the macroscopic variables, and use the following limit operator H, with functions ϕ ∈ C^2(T^d) and ξ = (ξ_1, …, ξ_J) ∈ C^2(E′). Then H satisfies (T1), which is shown by the same line of argument as above in the proof of Theorem 4.6, with the same maps η_n and η̂_n. The image functions depend only on gradients: H_{f,ϕ,ξ}(x, y, i) = H_{ϕ,ξ}(∇f(x), y, i).
Verification of (T2) of Theorem 4.2. For any p ∈ R^d, we establish the existence of functions ϕ_p ∈ C^2(T^d) and ξ ∈ C^2(E′) such that H_{ϕ,ξ}(p, ·) becomes constant on E′ = T^d × {1, …, J}. To that end, we find a constant H(p) ∈ R and functions ϕ_p and ξ_p such that for all (y, i) ∈ E′, the image H_{ϕ_p,ξ_p}(p, y, i) equals H(p). We reduce the problem to finding a principal eigenvalue; this is the content of Lemma 7.2.

Proof of Lemma 7.2. Regarding (a), writing ξ(y, i) = ξ_y(i) and p = ∇f(x) ∈ R^d, for all (y, i) ∈ E′ we find a decomposition involving a generator R_y of a jump process with frozen jump rates r_ij(y).
For (b), let ϕ ∈ C^2(T^d) and y ∈ T^d. We wish to find a function ξ_y(·) = ξ(y, ·) ∈ C({1, …, J}) such that (e^{−ϕ}(B_{p,i} + V_{p,i}) e^{ϕ}) + R_y ξ_y(i) becomes constant in i = 1, …, J. By the Fredholm alternative, for any vector h ∈ C({1, …, J}), the equation R_y ξ_y = h has a solution ξ_y(·) ∈ C({1, …, J}) if and only if h ⊥ ker(R*_y). Since R_y is the generator of a jump process on the finite discrete set {1, …, J} with rates r_ij(y), the null space ker(R*_y) is one-dimensional and spanned by the unique stationary measure µ_y ∈ P({1, …, J}), which exists by our irreducibility assumption of Theorem 4.7 (e.g. [Kle13, Theorem 17.51]). Hence (e^{−ϕ}(B_{p,i} + V_{p,i}) e^{ϕ}) + R_y ξ_y(i) = h̄(p, y) is independent of i ∈ {1, …, J} if and only if the centered right-hand side is orthogonal to µ_y; this solvability condition determines h̄(p, y) as the µ_y-average of e^{−ϕ}(B_{p,i} + V_{p,i}) e^{ϕ}. With this choice, there exists ξ(y, i) solving the resulting Poisson equation for R_y. Furthermore, since the stationary measure spans a one-dimensional eigenspace and the rates r_ij(·) are smooth by assumption, the solutions ξ_y depend smoothly on y as well, and (b) follows.
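The solvability step is explicit in the simplest case J = 2 (toy rates, our illustration of the Fredholm alternative, not the paper's operator): centering the right-hand side by its average against the stationary measure of R_y makes the Poisson equation solvable.

```python
# Two-state jump generator R = [[-a, a], [b, -b]] with illustrative
# rates; its stationary measure is mu = (b, a) / (a + b).
a, b = 2.0, 5.0
mu = (b / (a + b), a / (a + b))          # mu R = 0

def solve_poisson(h):
    hbar = mu[0] * h[0] + mu[1] * h[1]   # the solvability condition fixes hbar
    g = (h[0] - hbar, h[1] - hbar)       # centered right-hand side, <g, mu> = 0
    xi = (0.0, g[0] / a)                 # fix xi[0] = 0 (constants span ker R)
    return xi, hbar

h = (1.0, 4.0)
xi, hbar = solve_poisson(h)
# verify R xi = h - hbar in both states
assert abs(a * (xi[1] - xi[0]) - (h[0] - hbar)) < 1e-12
assert abs(b * (xi[0] - xi[1]) - (h[1] - hbar)) < 1e-12
```

Without the centering by hbar, the equation R ξ = h has no solution unless h happens to be orthogonal to µ already; this is exactly the one-dimensionality of ker(R*) used above.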
For proving (c) in Lemma 7.2, we note that Equation (25) corresponds to a principal-eigenvalue problem.

Proof of symmetry of Hamiltonians
In Theorem 4.8, we proved that detailed balance implies symmetric Hamiltonians. The proof was based on a suitable variational representation of the Hamiltonian. In this section, we show in Proposition 8.1 how to obtain this representation.
Before giving the rigorous proof, we sketch the argument. To that end, we recall the setting. We work with E′ = T^d × {1, …, J} and denote by P(E′) the set of probability measures on E′. The Hamiltonian H(p) is the principal eigenvalue of the cell problem (22) described in Lemma 7.1, and satisfies the variational representation (26). In this formula, we have the continuous map V_p and the Donsker–Varadhan functional (28), in which the infimum is over strictly positive u ∈ C^2(E′), and the operator L_p is given by (29). The variational representation (26) is a special case of Donsker's and Varadhan's representation theorem on principal eigenvalues [DV75]. Under their general conditions, the infimum is taken over functions that are in the domain of the infinitesimal generator of the semigroup generated by L_p. Pinsky showed that the infimum can be taken over C^2 functions if the coefficients appearing in the operator L_p are sufficiently regular (Theorem 1.4 in [Pin85], Equation (3.1) in [Pin07]).
Since it is not clear from (26) that H(p) is symmetric under the detailed-balance condition, we perform a suitable shift in the infimum of the functional (28) to obtain a more convenient representation. Rewrite in (28) the strictly positive functions as u = exp(ϕ), and suppose that dµ_i = µ_i dx with strictly positive densities µ_i, where dx is the Lebesgue measure on the torus. Then shifting in the infimum as ϕ_i → ϕ_i + ψ_i + (1/2) log µ_i, we find by calculation that (30) holds, where R(µ) is the Fisher information given by (31) and K_p(µ) is given by (32). Plugging formula (30) into the variational representation (26) leads to the desired representation of the Hamiltonian. The transformation we used is equivalent to shifting by (1/2) log(µ_i/π_i), where π_i = e^{−2ψ_i} is the stationary measure up to a multiplicative constant. This transformation is reminiscent of a symmetrization discussed in Touchette's notes [Tou18, Eq. (36)]. Also when formulating the detailed-balance condition with additional constants in (8), that is, when not shifting the potentials by constants to renormalize, one can include these constants in the shift and arrive at the same conclusions.
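The symmetry mechanism can be observed in a finite-state caricature (hypothetical rates on a 3-site ring; a sketch of the phenomenon, not the paper's continuum operator): for the exponentially tilted generator of a ring walk, detailed balance forces the principal eigenvalue to satisfy H(p) = H(−p), while a driven ring breaks the symmetry, which is the signature of transport.

```python
import math

def tilted_matrix(r, l, p):
    # Tilted generator of a ring walk: clockwise rates r[i] weighted by
    # e^{p}, counterclockwise rates l[i] by e^{-p}; diagonal untilted.
    J = len(r)
    A = [[0.0] * J for _ in range(J)]
    for i in range(J):
        A[i][(i + 1) % J] = r[i] * math.exp(p)
        A[i][(i - 1) % J] = l[i] * math.exp(-p)
        A[i][i] = -(r[i] + l[i])
    return A

def principal_eigenvalue(A, iters=5000):     # power iteration on A + c*I
    J = len(A)
    c = 1.0 + max(-A[i][i] for i in range(J))
    v, lam = [1.0] * J, 1.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(J)) + c * v[i] for i in range(J)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam - c

r, l = [1.0, 2.0, 3.0], [3.0, 1.0, 2.0]      # l[i] = r[i-1]: detailed balance
H = lambda p: principal_eigenvalue(tilted_matrix(r, l, p))
assert abs(H(0.0)) < 1e-9                    # H(0) = 0 (generator itself)
assert abs(H(0.7) - H(-0.7)) < 1e-9          # symmetry under detailed balance

rd, ld = [2.0, 2.0, 2.0], [1.0, 1.0, 1.0]    # driven ring: no detailed balance
Hd = lambda p: principal_eigenvalue(tilted_matrix(rd, ld, p))
assert Hd(0.5) - Hd(-0.5) > 0.5              # symmetry broken
```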
In order to make the strategy outlined above rigorous, we prove that we can restrict to measures µ having the required regularity properties. The central step is to exploit finiteness of I_p(µ), which allows restricting the supremum in (26) to the set P ⊂ P(E′) of probability measures µ = (µ_1, …, µ_J) such that: (P1) Each µ_i is absolutely continuous with respect to the uniform measure on T^d. (P2) For each i, we have ∇(log µ_i) ∈ L²_{µ_i}(T^d). Here R and K_p are the maps given by (31) and (32) above. In K_p(µ), the infimum can be taken over vectors of smooth functions. The representation (34) follows from (32) by rewriting the sums appearing therein as symmetrized pair terms; this leads to the cosh(·) terms in (34), and proves (c). We now give the proof of (a) and (b) of Proposition 8.1.
Proof of (a) in Proposition 8.1.
with jump rates s_ij defined as s_ij ≡ 1 and s_ji ≡ e^{2ψ_j − 2ψ_i} for i ≤ j, and with γ := sup_{T^d} r_ij/s_ij < ∞, where r_ij(·) are the jump rates appearing in L_p. Furthermore, define I_{L_rev} : P(E′) → [0, ∞] as the corresponding Donsker–Varadhan functional. We shall prove two statements: (I) If I_{L_rev}(µ) is finite, then the measure µ satisfies (P1) and (P2). (II) If I_p(µ) is finite, then I_{L_rev}(µ) is finite.
The two statements combined finish the proof.
Regarding (I), suppose that I_{L_rev}(µ) is finite. Since s_ij e^{−2ψ_i} = s_ji e^{−2ψ_j}, the operator L_rev admits a reversible measure ν_rev ∈ P(E′) with densities proportional to e^{−2ψ_i}. The measure ν_rev is reversible for L_rev in the sense that the corresponding Dirichlet form is symmetric for all f, g ∈ D(L_rev). In particular, since I_{L_rev}(µ) is finite, we find that µ ≪ ν_rev, and that I_{L_rev}(µ) is explicitly given in terms of the functions f_i = (dµ_i/dν^i_rev)^{1/2}, where dµ/dν_rev is the Radon–Nikodym derivative. Furthermore, µ_i is absolutely continuous with respect to ν^i_rev = e^{−2ψ_i} dx. Since e^{−2ψ_i} dx ≪ dx, we find that µ_i is absolutely continuous with respect to the volume measure on T^d. Hence (P1) holds true.
By the elementary estimates e^x ≥ e^x 1_{x≥0} and (e^x 1_{x≥0} − e^x) ≥ −1, we obtain the inequalities a_n ≥ a(ϕ_n) and a^λ(ϕ_n) ≥ a^λ_n − C. Furthermore, the bound a_n ≤ a^λ_n is obtained by noting that the corresponding integrand against dµ_i is bounded above by zero, since λ = 1 + ε > 1 and e^{x/λ} ≤ e^x for x ≥ 0.
In conclusion, we obtain finiteness of I_{L_rev}(µ), which proves statement (II).

Proof of (b) of Proposition 8.1. It is sufficient to show that for any µ ∈ P, the Donsker–Varadhan functional I_p(µ) satisfies (30). Integration by parts gives an explicit expression, where dµ_i = µ_i dx. By a density argument, the infimum can be taken over functions ϕ such that ∇ϕ_i ∈ L²_{µ_i}(T^d). Now shifting in the infimum as ϕ_i → ϕ_i + (1/2) log(µ_i) + ψ_i, we find after some algebra an expression containing the terms ∑_j r_ij (µ_j/µ_i)^{1/2} e^{ψ_j − ψ_i} (e^{ϕ_j − ϕ_i} − 1) dµ_i.
The terms containing the square roots and logarithms are not singular, since they are integrated against dµ_i, so that the integration is effectively over the set {µ_i > 0}. Writing out the terms and reorganizing them leads to the claimed equality.

A Large-deviation principle implies almost-sure convergence
It is well known that a large-deviation principle implies a strong type of convergence of random variables. We provide a sketch of the proof here, since we know of no reference in the literature. Let I : X → [0, ∞] be a rate function. We denote by {I = 0} the set of its global minimizers.
Theorem A.1. For n = 1, 2, …, let X_n be a random variable taking values in a Polish space (X, d). Suppose that {X_n}_{n∈N} satisfies a large-deviation principle with rate function I. Then d(X_n, {I = 0}) → 0 almost surely as n → ∞.
We point out that as specified in Definition 3.1, the rate function in Theorem A.1 is assumed to have compact sub-level sets.
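Theorem A.1 can be illustrated by a simulation (a sketch, not a proof): sample means of i.i.d. Uniform(0, 1) variables satisfy a large-deviation principle whose rate function vanishes exactly at the mean 1/2, and the distance to that zero set indeed becomes small.

```python
import random

random.seed(0)  # fixed seed for reproducibility

def sample_mean(n):
    # X_n: empirical mean of n i.i.d. Uniform(0,1) draws;
    # here {I = 0} = {1/2} by Cramer's theorem.
    return sum(random.random() for _ in range(n)) / n

for n in (10, 1000, 100000):
    print(n, abs(sample_mean(n) - 0.5))   # distance d(X_n, {I = 0})

assert abs(sample_mean(100000) - 0.5) < 0.01
```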

B Principal eigenvalues
In this section we collect results on principal-eigenvalue problems that we encounter in the proofs concerning the molecular-motor models.
Proposition B.1. Let P be a second-order uniformly elliptic operator on T^d, given below. In the above propositions, the eigenvalue λ is referred to as the principal eigenvalue. The principal-eigenvalue problem on closed manifolds, such as the torus T^d, is solved for instance by Padilla [Pad97]. Donsker and Varadhan's variational representations for principal eigenvalues, [DV75, DV76], apply to the case of compact metric spaces without boundary. A proof of how to obtain the principal eigenvalue for coupled systems of equations is given by Sweers [Swe92] and Kifer [Kif92]. Sweers considers a Dirichlet boundary problem, but his results transfer to the compact setting without boundary. Kifer gives an independent proof for the case of a compact manifold, in Lemma 2.1 and Proposition 2.2 in [Kif92].
Proposition 5.4. In the setting of Theorem 4.2, let H ⊆ C^1(T^d) × C(T^d × E′) be a multivalued operator satisfying (T1). Then H satisfies the convergence condition (ii) of Theorem 5.3.

Proposition 5.5. In the setting of Theorem 4.2, let H ⊆ C^1(T^d) × C(T^d × E′) be a multivalued operator satisfying conditions (T1) and (T2). Then for τ > 0 and h ∈ C(T^d), the comparison principle is satisfied for viscosity sub- and supersolutions of (1 − τH)u = h.

Proof of Theorem 4.2. By Proposition 5.4, conditions (i) and (ii) of Theorem 5.3 hold with the single operator H = H† = H‡. By Proposition 5.5, the comparison principle is satisfied for (1 − τH)u = h, and hence condition (iii) of Theorem 5.3 holds with the single operator H = H† = H‡. Therefore the large-deviation principle follows by Theorem 5.3.

Proof of Proposition 5.4. We recall that with E_n = E^X_n × {1, …, J} and ι_n : E^X_n → T^d of Condition 3.4, the state spaces are related as in the following diagram,

Proposition 6.1.
Under the same assumptions as Theorems 4.2 and 4.3, define the operator H : D(H) ⊆ C^1(T^d) → C(T^d) on the domain D(H) := D(H) (the domain of the multivalued operator of Theorem 4.2), by setting H f(x) := H(∇f(x)). Let τ > 0 and h ∈ C(T^d). Then: (a) The function L : R^d → [0, ∞] is lower semicontinuous, and for every C ≥ 0 the level set {v ∈ R^d : L(v) ≤ C} is relatively compact in R^d. (b) For all f ∈ D(H) there exists a right-continuous, nondecreasing function
Equation (25) corresponds to a principal-eigenvalue problem for a second-order uniformly elliptic operator. By Proposition B.1, the principal-eigenvalue problem (−B_p − V_p) g_p = λ(p) g_p has a solution g_p > 0, with eigenvalue λ(p) ∈ R. The same function g_p and the eigenvalue H(p) = −λ(p) solve (25). Verification of (T3) of Theorem 4.3. The principal eigenvalue H(p) is of the form H(p) = inf_ϕ sup_{y∈T^d} F(p, ϕ)(y), with F jointly convex in p and ϕ. Convexity of H(p) and H(0) = 0 follow as above in the proof of Theorem 4.6.
with smooth coefficients a_k, b_k, c ∈ C^∞(T^d). Then there exists a strictly positive function u ∈ C^∞(T^d) and a unique λ ∈ R such that Pu = λu, and λ is given by a variational formula.

Proposition B.2. Let L : C^2(T^d)^J → C(T^d)^J be a J × J diagonal matrix of uniformly elliptic operators with smooth coefficients a^(i)_k(·), b^(i)_k(·), c^(i)(·) ∈ C^∞(T^d), and let R be a J × J matrix with non-negative functions on the off-diagonal, R_ij ≥ 0 for all i ≠ j. Suppose that the matrix R̄ with entries R̄_ij := sup_{y∈T^d} R_ij(y) is irreducible. Then for the operator P := L − R, there exists a unique λ ∈ R and a strictly positive vector u ∈ C^∞(T^d)^J, u_i(·) > 0 for all i = 1, …, J, such that Pu = λu. Furthermore, λ is given by a variational formula.

Regarding the verification of [FK06, Condition 8.9]: the relevant image function is bounded and lower semicontinuous, and since for all (f, g) ∈ H there exists a point (x, z′) ∈ E × E′ at which the required inequality holds, the operator satisfies Item (1). For Item (2), we can take Γ = T^d × R^d, and for x_0 ∈ T^d, take the pair (x, λ) with x(t) = x_0 and λ(dv × dt) = δ_0(dv) × dt. Item (3) is a consequence of the above Item (a). Item (4) holds since T^d is compact. Item (5) is implied by the above Item (b). Condition 8.10 is implied by Condition 8.11 and the fact that H1 = 0, see Remark 8.12 (e) in [FK06]. Finally, Condition 8.11 is implied by the above Item (c), with the control λ(dv × dt) = δ_{ẋ(t)}(dv) × dt.

The functional I_p(µ) is finite since H(p) is finite. By a result of Stroock [Str12, Theorem 7.44], finiteness of the Donsker–Varadhan functional implies certain regularity properties in case the generator is reversible. Since the generator L_p is not reversible, we further bound I_p by a suitable Donsker–Varadhan functional I_rev corresponding to a reversible process, in order to be able to apply [Str12, Theorem 7.44].

Proposition 8.1. The Hamiltonian H(p) given by (26) satisfies the following: (a) The supremum in (26) can be taken over a smaller set P of measures.
Let p ∈ R^d. The supremum in (26) can be taken over measures µ such that I_p(µ) is finite, because H(p) is finite and V_p(·) is bounded. We show that finiteness of I_p(µ) implies that µ must satisfy (P1) and (P2). To that end, define the map L_rev : D(L_rev) ⊆ C(E′) → C(E′) by setting D(L_rev) := C^2(E′) and