Local stability of Kolmogorov forward equations for finite state nonlinear Markov processes

The focus of this work is on local stability of a class of nonlinear ordinary differential equations (ODE) that describe limits of empirical measures associated with finite-state weakly interacting N-particle systems. Local Lyapunov functions are identified for several classes of such ODE, including those associated with systems with slow adaptation and Gibbs systems. Using results from [5] and large deviations heuristics, a partial differential equation (PDE) associated with the nonlinear ODE is introduced and it is shown that positive definite subsolutions of this PDE serve as local Lyapunov functions for the ODE. This PDE characterization is used to construct explicit Lyapunov functions for a broad class of models called locally Gibbs systems. This class of models is significantly larger than the family of Gibbs systems and several examples of such systems are presented, including models with nearest neighbor jumps and models with simultaneous jumps that arise in applications.


Introduction
In this paper we consider local stability properties of the nonlinear ordinary differential equation (ODE)

d/dt p(t) = p(t)Γ(p(t)), (1.1)

where p(t) takes values in P(X). Here X is a finite set that we denote by X = {1, . . . , d}, P(X) is the space of probability measures on X equipped with the topology of weak convergence, which we identify with the unit (d − 1)-dimensional simplex S = {r ∈ R^d : r_x ≥ 0 for all x ∈ X, and Σ_{x∈X} r_x = 1}, and for each p ∈ P(X), Γ(p) is a rate matrix for a Markov chain on X. Such ODEs describe the evolution of the law of so-called nonlinear Markov or McKean-Vlasov processes that arise as limits of weakly interacting Markov chains (see, for example, Section 2 of the companion paper [5]). In this context, the ODE (1.1) is referred to as the forward equation of the nonlinear Markov process. The focus of the current paper is local stability (see Definition 2.3) of the ODE (1.1), and therefore of the corresponding nonlinear Markov process, for several families of models.
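To fix ideas, here is a minimal numerical sketch of the forward equation (1.1): forward Euler on the simplex with a hypothetical p-dependent two-state rate matrix. The choice of Γ below is purely illustrative and is not a model from the paper.

```python
import math

# Forward Euler for the nonlinear forward equation dp/dt = p * Gamma(p)
# on the probability simplex.  Gamma is a hypothetical 2-state rate
# matrix whose entries depend on the current law p.
def Gamma(p):
    # rows sum to zero; off-diagonal rates depend smoothly on p
    a = 1.0 + p[1]          # rate 1 -> 2
    b = 2.0 + p[0]          # rate 2 -> 1
    return [[-a, a], [b, -b]]

def step(p, dt):
    G = Gamma(p)
    # (p Gamma(p))_y = sum_x p_x Gamma_xy(p)
    return [p[y] + dt * sum(p[x] * G[x][y] for x in range(2)) for y in range(2)]

p = [0.9, 0.1]
for _ in range(20000):
    p = step(p, 1e-3)

# at a fixed point, p Gamma(p) = 0
G = Gamma(p)
drift = [sum(p[x] * G[x][y] for x in range(2)) for y in range(2)]
print(p, drift)
```

For this particular Γ the balance condition p_1(1 + p_2) = p_2(2 + p_1) forces p_1 = 2 p_2, so the trajectory settles at (2/3, 1/3) and the drift vanishes there.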
As usual in the study of stability of dynamical systems, the basic approach is to construct a suitable local Lyapunov function (see Definition 2.5). It is known (see, for example, Section 3 of [5]) that for an ergodic linear Markov process on X (i.e., the case where Γ is constant) the mapping q ↦ R(q‖π), where R is relative entropy and π is the unique stationary distribution, defines a Lyapunov function for the associated linear Kolmogorov equation. Although one does not expect this property to hold for general nonlinear Markov processes (see Section 3 of [5] for a discussion of this point), in Section 3 we consider a family of models, which we call systems with slow adaptation, for which relative entropy is in fact a Lyapunov function when the adaptation parameter is sufficiently small. This result says that relative entropy continues to serve as a Lyapunov function for suitably small nonlinear perturbations of linear Markov processes, but it does not yield Lyapunov functions for general nonlinear Markov processes. For one particular family of models whose stationary distributions take an explicit form, and which we call systems of Gibbs type, Section 4 of [5] proposed a local Lyapunov function defined as the limit of certain scaled relative entropies that involve the stationary distributions of the associated N-particle weakly interacting Markov processes. In Section 4 of the current work we show that this function is in fact a local Lyapunov function in the sense of Definition 2.5 under suitable positive definiteness assumptions.
For non-Gibbs families, stationary distributions usually will not take an explicit form and thus a different approach is needed. One such approach was developed in Section 5 of [5], where analogous limits of scaled relative entropies, but with the stationary distributions of the N-particle system replaced by the joint law of the N particles at time t, were identified in terms of the large deviation rate function J_t(·) for the empirical measure of the state of the weakly interacting Markov process at time t. The limit of J_t as t → ∞ was proposed in [5] as a local Lyapunov function for the ODE (1.1), though the question of when these limits exist and how they can be evaluated was not tackled. In this work we approach this question as follows. We begin by formally deriving a nonlinear partial differential equation (PDE) for {J_t(q), t ≥ 0, q ∈ S}. We next show that classical-sense positive definite subsolutions of the stationary form of the PDE (see (5.7)), which is formally the equation governing the limit of J_t as t → ∞, are local Lyapunov functions for (1.1). With this result, the problem of constructing Lyapunov functions reduces to finding suitable subsolutions of (5.7). Although finding explicit subsolutions can be challenging in general, in Section 6 we introduce an interesting family of models, which we call locally Gibbs systems, for which one can in fact give an explicit solution for (5.7). These models contain, as a special case, the Gibbs type systems studied in Section 4. Moreover, in Sections 6.2–6.5 we present other examples of locally Gibbs systems, including models with nearest neighbor jumps and models with simultaneous jumps that arise in telecommunications applications. Finally, we give an example to illustrate that solutions to the PDE (5.7) can be found for systems that are not locally Gibbs as well.
The paper is organized as follows. Section 2 collects some definitions and basic results related to stability of the ODE (1.1). In Section 3 we study systems with slow adaptation. Section 4 considers the setting of systems of Gibbs type. We then study more general models than the Gibbs systems of Section 4. In Section 5, we present the formal derivation of a nonlinear time-dependent PDE that is satisfied by the large deviation rate function {J_t(q)}. The main result of this section shows that a positive definite subsolution of the stationary version of this PDE is a local Lyapunov function of (1.1). Finally, in Section 6 we identify a broad family of models, referred to as locally Gibbs systems, for which a non-trivial subsolution of (5.7) can be given explicitly, and thus, under suitable additional conditions that ensure positive definiteness, one can obtain tractable Lyapunov functions for such systems, ensuring local stability. We also present several examples that illustrate the range of applicability of these results.

Local Stability and Lyapunov Functions
In this section we collect some definitions and basic results related to stability of the dynamical system (1.1). The following condition on Γ will be assumed on several occasions. Some results, such as the main result of this section (Proposition 2.6), only need that Γ be continuous, which is sufficient to ensure the existence of a solution for any initial condition. Denote by S• the relative interior of S, namely S• = {p ∈ S : p_x > 0 for all x ∈ X}. We first recall the definition of a locally stable fixed point of an ODE.
Definition 2.2 A point π* ∈ S is said to be a fixed point of the ODE (1.1) if the right-hand side of (1.1) evaluated at p = π* is equal to zero, namely, π*Γ(π*) = 0.

Definition 2.3 A fixed point π* ∈ S• of the ODE (1.1) is said to be locally stable if there exists a relatively open subset D of S that contains π* and has the property that whenever p(0) ∈ D, the solution p(t) of (1.1) with initial condition p(0) converges to π* as t → ∞.
Our approach to proving local stability will be based on the construction of suitable Lyapunov functions. In order to state the Lyapunov function property precisely, we begin with some notation. Let H_1 = {r ∈ R^d : Σ_{x∈X} r_x = 1} be the hyperplane containing the simplex S, and let H_0 = {r ∈ R^d : Σ_{x∈X} r_x = 0} be the shifted version of this hyperplane that goes through the origin. Given a set D ⊂ H_1, a function U : D → R will be called differentiable (respectively, C¹) if it is differentiable (respectively, continuously differentiable) on some relatively open subset D′ of H_1 such that D ⊂ D′. In particular, for a differentiable function U on a relatively open subset D of H_1 and every r ∈ D, there exists a unique vector D_tan U(r) ∈ H_0, called the gradient of U at r, such that for every v ∈ H_0, (d/ds) U(r + sv)|_{s=0} = ⟨D_tan U(r), v⟩. If {b_1, . . . , b_{d−1}} is an orthonormal basis of the subspace H_0, we can write D_tan U(r) = Σ_{i=1}^{d−1} ⟨D_tan U(r), b_i⟩ b_i. Finally, we say that the differentiable function U : D → R is C¹ if the mapping r ↦ D_tan U(r) from D to H_0 is continuous. Frequently, with an abuse of notation, we write D_tan U simply as DU.
We introduce the following notion of positive definiteness.
Definition 2.4 Let π* ∈ S• be a fixed point of (1.1) and let D be a relatively open subset of S that contains π*. A function J : D → R is called positive definite if for some K* ∈ R, the sets M_K = {r ∈ D : J(r) ≤ K} decrease continuously to {π*} as K ↓ K*.
In Definition 2.4, by "decrease continuously to {π*}" we mean that: (i) for every ε > 0, there exists K_ε ∈ (K*, ∞) such that M_{K_ε} ⊂ B_ε(π*) ∩ D, where B_ε(π*) is the open Euclidean ball of radius ε centered at π*, and (ii) for every K > K*, there exists ε > 0 such that B_ε(π*) ∩ S ⊂ M_K. Note that if J is a uniformly continuous function on D which attains its minimum uniquely at π*, then J is positive definite. A basic example of such a function is the relative entropy function p ↦ R(p‖π*) introduced in the next section.
Definition 2.5 Let π* ∈ S• be a fixed point of (1.1), and let D be a relatively open subset of S that contains π*. A positive definite, C¹ and uniformly continuous function J : D → R is said to be a local Lyapunov function associated with (D, π*) for the ODE (1.1) if, given any p(0) ∈ D, the solution p(·) to the ODE (1.1) with initial condition p(0) satisfies (d/dt) J(p(t)) < 0 for all 0 ≤ t < τ such that p(t) ≠ π*, where τ := inf{t ≥ 0 : p(t) ∈ D^c}. In the case D = S•, we refer to J as a Lyapunov function.
The following result shows that, as one would expect, the existence of a local Lyapunov function implies local stability. The proof is standard, but is included for completeness.

Proposition 2.6 Let π* ∈ S• be a fixed point of (1.1) and suppose that Condition 2.1 holds. Suppose there exists a local Lyapunov function associated with (D, π*) for (1.1), where D is some relatively open subset of S that contains π*. Then π* is locally stable.
Proof. Let J be a local Lyapunov function associated with (D, π*) for (1.1). Since J is positive definite, there exists K* ∈ R such that the sets M_K = {r ∈ D : J(r) ≤ K} decrease continuously to {π*} as K ↓ K*. In particular, there exist L ∈ (K*, ∞) and a relatively open subset D_0 of S such that π* ∈ D_0 ⊂ M_L ⊂ D.
Now fix p(0) ∈ D_0, let K_n ↓ K* with K_1 < L, and set τ_n := inf{t ≥ 0 : p(t) ∈ M_{K_n}}. Note that if τ_n < ∞, then p(t) ∈ M_{K_n} for all t ≥ τ_n. Since the sets M_{K_n} decrease continuously to {π*}, it suffices to show that τ_n < ∞ for every n.
Suppose p(·) is the solution of (1.1) with p(0) = q ∈ D_0. Recalling the continuity of q ↦ ⟨DJ(q), qΓ(q)⟩, and observing that this map is strictly negative, and hence bounded away from zero, on the compact set of points of M_L at which J is at least K_1, we have sup_{t<τ_1} (d/dt) J(p(t)) < 0. This shows that τ_1 < ∞. By repeating this argument we see that τ_n < ∞ for every n, and the result follows.

Systems with Slow Adaptation
Here we consider the case where the ODE (1.1) exhibits a structure we call slow adaptation, for which the strength of the nonlinear component is adjusted through a small parameter. The long-time behavior of systems of this type, in the context of nonlinear diffusions arising as limits of weakly interacting Itô diffusions, is studied in [14] based on coupling arguments and hitting times (and not in terms of Lyapunov functions).
Suppose that Condition 2.1 holds and π* ∈ P(X) is a fixed point of the ODE (1.1). The rate matrix Γ_λ(p) := Γ(λ(p − π*) + π*) corresponds to a version of the original system but with slow adaptation when λ > 0 is small. With λ ∈ (0, 1] fixed, the rate matrices Γ_λ(p), p ∈ P(X), determine a family of nonlinear Markov processes. The corresponding forward equation

d/dt p_λ(t) = p_λ(t) Γ_λ(p_λ(t)) (3.1)

has a unique solution given any initial distribution p_λ(0) ∈ P(X). Note that for any λ ∈ [0, 1], π* is also a fixed point for (3.1). We are interested in the question of when the fixed point π* is locally stable for sufficiently slow adaptation.
Recall that given p, π* ∈ P(X), the relative entropy of p with respect to π* is given by

R(p‖π*) := Σ_{x∈X} p_x log(p_x/π*_x). (3.2)

It is known (see, e.g., [12, pp. I-16-17] or [5, Lemma 3.1]) that the mapping

F̄(p) := R(p‖π*) (3.3)

serves as a Lyapunov function for finite-state linear Markov processes with stationary distribution π*. The forward equation of a finite-state linear Markov process has the form (1.1), but with a constant rate matrix Γ, and the proof of the Lyapunov function property of relative entropy for such Markov processes crucially uses the fact that Γ is constant. In contrast, since in general the rate matrix in the ODE (1.1) depends on the state, one does not expect R(·‖π*) to serve as a Lyapunov function for general finite-state nonlinear Markov processes.
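The classical fact for constant Γ can be checked numerically. The sketch below (with an arbitrary ergodic 2-state rate matrix, chosen for illustration) evolves the linear forward equation by Euler steps and records R(p(t)‖π) = Σ_x p_x(t) log(p_x(t)/π_x); the Euler map here is itself a stochastic matrix fixing π, so the recorded values decrease monotonically.

```python
import math

# Along the linear forward equation dp/dt = p*Gamma, the relative
# entropy R(p(t) || pi) with respect to the stationary distribution pi
# is non-increasing.  Illustrative constant rate matrix:
Gamma = [[-2.0, 2.0], [1.0, -1.0]]   # ergodic; pi = (1/3, 2/3)
pi = [1.0 / 3.0, 2.0 / 3.0]

def rel_entropy(p, q):
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

p, dt = [0.99, 0.01], 1e-3
values = []
for _ in range(5000):
    values.append(rel_entropy(p, pi))
    p = [p[y] + dt * sum(p[x] * Gamma[x][y] for x in range(2)) for y in range(2)]

monotone = all(a >= b - 1e-12 for a, b in zip(values, values[1:]))
print(monotone, values[0], values[-1])
```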
Nevertheless, in this section we will show that for systems with slow adaptation with λ sufficiently small, the function R(·‖π*) does in fact have the desired property. The following is the main result of the section. Note that the function F̄ in (3.3) is positive definite (in the sense of Definition 2.4). Thus the proposition below, together with Definition 2.5, says that F̄ is a Lyapunov function associated with π* for the ODE (3.1).

Proposition 3.1 There exists λ_0 ∈ (0, 1] such that for every λ ∈ (0, λ_0], every solution p_λ(·) of (3.1) satisfies, for all t ≥ 0,

d/dt R(p_λ(t)‖π*) ≤ 0,

with a strict inequality if and only if p_λ(t) ≠ π*.
Proof. By construction and hypothesis, there exists C ∈ (1, ∞) such that for all x, y ∈ X, all λ > 0, and all p ∈ P(X), |Γ^λ_{yx}(p) − Γ_{yx}(π*)| ≤ Cλ. Recall that since π* is stationary, π*Γ_λ(π*) = 0. Using the definition (3.2) of relative entropy, the ODE (3.1), and the relation Σ_{x,y∈X} p_y Γ^λ_{yx}(p) = Σ_{x,y∈X} p_y Γ_{yx}(π*) = 0, one obtains an expression for (d/dt) R(p_λ(t)‖π*) as a sum over pairs of distinct states, where we use the convention that 0 log 0 = 0. For x, y ∈ X with x ≠ y and p ∈ P(X), let γ_yx(p) denote the corresponding summand. To complete the proof we will show that there is λ_0 > 0 such that for every p ∈ P(X) and λ ∈ (0, λ_0], Σ_{x,y∈X: x≠y} γ_yx(p) ≤ 0, with equality if and only if p = π*. It is straightforward to check that p ↦ γ_yx(p) is concave. However, we will need more than that, namely a uniform estimate on its second derivative. Let r ∈ H_0 with ‖r‖ = 1, and consider the second derivative d²/ds² of the relevant sum along the direction r, which yields the expression (3.5). If the expression in (3.5) is zero then, since π*_y > 0 for all y and all states communicate, r_y/π*_y = r_x/π*_x for all x ≠ y. However, this is impossible, since r ∈ H_0 requires that at least one component be of the opposite sign of some other component. Hence the expression in (3.5) is negative. Using that {r : ‖r‖ = 1} is compact, continuity in r shows that (3.5) is in fact bounded above away from zero on this set, which shows that the matrix of second derivatives is negative definite. Using the fact that P(X) is compact, we find that there is c > 0, not depending on p, such that the second derivative along any unit direction is at most −c. Set γ_min := min{Γ_yx(π*) : Γ_yx(π*) > 0} > 0, and note the corresponding finite upper bound max_{x,y∈X} Γ_yx(π*). We distinguish two cases.
This last quantity is strictly negative if λ < λ_2.
The bound on λ obtained in the proof is obviously conservative, and better bounds that depend on Γ(π * ) and π * can be found.

Systems of Gibbs Type
In this section we revisit the class of Gibbs models introduced in Section 4 of [5]. We begin by recalling the basic definitions. Let K : X × R^d → R be such that for each x ∈ X, K(x, ·) is a continuously differentiable function on R^d. We sometimes write K(x, p) as K_x(p). One special case we discuss in detail is given by

K(x, p) = β [ V(x) + Σ_{y∈X} W(x, y) p_y ], (4.1)

where V : X → R, W : X × X → R and β > 0. Let (α(x, y))_{x,y∈X} be an irreducible and symmetric matrix with diagonal entries equal to zero and off-diagonal entries either one or zero. Define Ψ : R² → (0, ∞) by

Ψ(u, v) := e^{(u−v)/2}, (4.2)

and, for p ∈ S, let Γ(p) be the rate matrix with off-diagonal entries

Γ_xy(p) := α(x, y) Ψ(K_x(p), K_y(p)), x ≠ y, (4.3)

where recall that we identify P(X) with the simplex S. Then for p ∈ S, Γ(p) is the generator of an ergodic finite-state Markov process, and the unique invariant distribution on X is given by π(p) with

π_x(p) := e^{−K_x(p)} / Z(p), Z(p) := Σ_{y∈X} e^{−K_y(p)}. (4.4)

By studying the asymptotics of certain scaled relative entropies, a candidate Lyapunov function F, defined for p ∈ S in (4.5), was identified in Theorem 4.2 of [5]. We note that in [5] K(x, ·) was taken to be twice continuously differentiable (this property was used in the proof of Lemma 4.1 of [5]); however, here we merely assume that K(x, ·) is C¹. Also note that in the special case of (4.1),

F(p) = Σ_{x∈X} p_x log p_x + β Σ_{x∈X} V(x) p_x + (β/2) Σ_{x,y∈X} W(x, y) p_x p_y.

Since Σ_{x∈X} p_x log p_x is the negative of the entropy of p, F is the sum of a convex function, an affine function and a quadratic function on P(X). This fact is useful in determining whether or not the fixed points of (1.1) are stable.
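To make the Gibbs setup concrete, the following sketch (with hypothetical V, W and β, and assuming the invariant distribution has the Gibbs form π(p)_x = e^{−K(x,p)}/Z(p) as in (4.4)) iterates the mean-field map p ↦ π(p); fixed points of this map are exactly the fixed points of (1.1).

```python
import math

# Illustration of a Gibbs-type family: given an energy K(x, p), take
# pi(p)_x proportional to exp(-K(x, p)) and iterate the mean-field
# fixed-point map p -> pi(p).  V, W, beta are hypothetical choices,
# not taken from the paper.
V = [0.0, 0.5, 1.0]
W = [[0.0, 0.2, 0.1], [0.2, 0.0, 0.3], [0.1, 0.3, 0.0]]  # symmetric
beta = 1.0

def K(x, p):
    return beta * (V[x] + sum(W[x][y] * p[y] for y in range(3)))

def pi_map(p):
    w = [math.exp(-K(x, p)) for x in range(3)]
    Z = sum(w)
    return [wx / Z for wx in w]

p = [1/3, 1/3, 1/3]
for _ in range(200):
    p = pi_map(p)

# p is (numerically) a fixed point of p -> pi(p), hence a candidate
# fixed point of the forward equation (1.1)
residual = max(abs(a - b) for a, b in zip(p, pi_map(p)))
print(p, residual)
```

For this weak interaction the map is a contraction and the iteration converges to the unique fixed point; for strong interactions multiple fixed points can appear, as the two-state example below illustrates.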
Recall the set H_0 introduced in Section 2 and note that for p ∈ S• = {p ∈ S : p_x > 0 for all x = 1, . . . , d}, the directional derivative of the function F in (4.5) in any direction v ∈ H_0 is given by

⟨DF(p), v⟩ = Σ_{x∈X} v_x [ log p_x + K_x(p) ], (4.6)

where we have used that Σ_{x∈X} v_x = 0. The following result shows that the fixed points of (1.1) can be characterized as critical points of F.

Theorem 4.1 A point p ∈ S is a fixed point of (1.1) if and only if p ∈ S• and ⟨DF(p), v⟩ = 0 for every v ∈ H_0.

Proof. Recall that π(p) is the unique invariant probability associated with Γ(p), and hence π(p)Γ(p) = 0. Also note that p is a fixed point for (1.1) if and only if pΓ(p) = 0, which, since Γ(p) is a rate matrix of an ergodic Markov process, can be true if and only if p = π(p). Since π(p) ∈ S• for every p ∈ S, we have that any fixed point of (1.1) is in S•. For x, y ∈ X, x ≠ y, set v_{x,y} := e_y − e_x, where e_x is the unit vector in direction x. Then by (4.7), (4.2) and (4.4), for any p ∈ S•,

⟨DF(p), v_{x,y}⟩ = log(p_y/π(p)_y) − log(p_x/π(p)_x).

If p is a fixed point of (1.1) then p = π(p), and so ⟨DF(p), v_{x,y}⟩ = 0 for all x ≠ y, and hence for all v ∈ H_0. Conversely, if ⟨DF(p), v_{x,y}⟩ = 0 for all x ≠ y, then p_x/π(p)_x does not depend on x, and since both p and π(p) are probability vectors this forces p = π(p), which says that p is a fixed point of (1.1).
According to Theorem 4.1, the equilibrium points of the forward equation (1.1) are precisely the critical points of F on P(X ). Note that although for each p, Γ(p) is a rate matrix of a Markov process with a unique invariant measure, the dynamical system (1.1) can have multiple stable and unstable equilibria. Here is an example.
The critical points of F on P({1, 2}) are in one-to-one correspondence with the critical points of a function f on [0, 1], obtained by writing p = (x, 1 − x) and setting f(x) := F((x, 1 − x)). We have f(0) = 0 = f(1), and f′ can be computed explicitly for x ∈ (0, 1). Moreover, f′(x) → −∞ as x tends to zero, and f′(x) → ∞ as x tends to one.
If β ≤ 1 then f has exactly one critical point, namely a global minimum at x = 1/2. If β > 1 then there are three critical points: a local maximum at x = 1/2 and two minima at x_β and 1 − x_β, respectively, for some x_β ∈ (0, 1/2), where x_β → 1/2 as β ↓ 1 and x_β → 0 as β goes to infinity. The two minima of f correspond to stable equilibria of the forward equation, while the local maximum corresponds to an unstable equilibrium.
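The two-state picture can be checked numerically. The explicit formula f(x) = x log x + (1 − x) log(1 − x) + 2βx(1 − x) used below is an assumption chosen to be consistent with the stated properties (f(0) = f(1) = 0, a unique minimum at 1/2 for β ≤ 1, and a double well for β > 1), not a formula quoted from the text.

```python
import math

def fprime(x, beta):
    # derivative of the assumed f(x) = x log x + (1-x) log(1-x) + 2*beta*x*(1-x)
    return math.log(x / (1 - x)) + 2 * beta * (1 - 2 * x)

def critical_points(beta, n=20000):
    # count sign changes of f' on a fine grid over (0, 1)
    xs = [(i + 0.5) / n for i in range(n)]
    roots = 0
    for a, b in zip(xs, xs[1:]):
        fa, fb = fprime(a, beta), fprime(b, beta)
        if fa == 0.0 or fa * fb < 0:
            roots += 1
    return roots

print(critical_points(0.5))  # beta <= 1: the single minimum at x = 1/2
print(critical_points(2.0))  # beta > 1: two minima and a local maximum
```

Since f″(1/2) = 4(1 − β) for this choice, the transition from one to three critical points occurs exactly at β = 1.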

Lyapunov function property
Suppose that the function F defined in (4.5) is positive definite (in the sense of Definition 2.4) in a neighborhood of a fixed point π* of (1.1) which contains no other fixed point of (1.1). In this section we show that F is a local Lyapunov function for the ODE (1.1) (associated with the neighborhood and the fixed point π*), with Γ defined by (4.3). This result is an immediate consequence of the theorem below and Definition 2.5. Together with Proposition 2.6 this will imply that π* is locally stable.

Theorem 4.3 Let p(·) be the solution of (1.1) with Γ defined by (4.3) and some initial distribution p(0) ∈ P(X). Then for all t ≥ 0,

d/dt F(p(t)) = [ d/ds R(p(s)‖π(p(t))) ]|_{s=t} ≤ 0. (4.9)

Moreover, (d/dt) F(p(t)) = 0 if and only if p(t) = π(p(t)).
Proof. We will show that if p(·) is the solution to (1.1) with p(0) = q, then

d/dt F(p(t))|_{t=0} = d/dt R(p(t)‖π(q))|_{t=0}. (4.10)

In view of the semigroup property of solutions to the ODE (1.1), and since q is arbitrary, the validity of (4.10) implies the first equality in (4.9). Let p(0) = q. By the definition (4.5) of F and since Σ_{x∈X} (dp_x/dt)(0) = 0, one can compute the left side of (4.10) explicitly. On the other hand, by the definition of relative entropy, (4.4) and (4.2), and again using the relation Σ_{x∈X} (dp_x/dt)(0) = 0, one can compute the right side of (4.10). Comparing the two resulting expressions we see that (4.10) holds. The rest of the assertion follows from the observation that π(q) is the stationary distribution for the (linear) Markov family associated with Γ(q) and from the Lyapunov property of relative entropy in the case of ergodic (linear) Markov processes; see Lemma 3.1 in [5].
Remark 4.4 Consider the slow adaptation setting of Section 3 for the Gibbs model with K as in (4.1). Thus we start from a family of rate matrices Γ(p), p ∈ P(X), defined according to (4.3). Suppose that π* is a fixed point of the mapping p ↦ π(p). For λ ∈ [0, 1] and p ∈ P(X), set Γ_λ(p) := Γ(π* + λ(p − π*)). The rate matrices Γ_λ(p) are again of Gibbs type, that is, Γ_λ(p) satisfies (4.3), but with Ψ replaced by Ψ_λ, where Ψ_λ is defined exactly as Ψ is with K as in (4.1), but with potentials V_{λ,β} and W_λ in place of V and W. Fix λ ≥ 0. Then (4.6) and Theorem 4.3 imply that if the corresponding function F_λ is positive definite in some neighborhood of π*, then it is a local Lyapunov function for (3.1) (associated with that neighborhood and the fixed point π*). Proposition 3.1, on the other hand, implies that F̄(p) := R(p‖π*) is also a local Lyapunov function when λ is positive but sufficiently small. By the definition of relative entropy, (4.4), (4.2) and (4.1), R(p‖π*) can be written out explicitly, and the resulting expression is equal to F_λ(p) + log Z(π*) for λ = 0. Observe that the term log Z(π*) has no impact on the Lyapunov function property as it does not depend on p. Thus, the function F_λ includes "correction terms" (that vanish when λ = 0) and serves as a Lyapunov function (when positive definite) not just for small λ but rather for all λ ∈ (0, 1].

Comparison with existing results for Itô diffusions
A situation analogous to that of this section is considered in [13], where the author studies the long-time behavior of "nonlinear" Itô-McKean diffusions of the form

dX(t) = −∇V(X(t)) dt − ∫_{R^d} ∇_1 W(X(t), y) μ_t(dy) dt + dB(t), (4.12)

where μ_t is the probability law of X(t), B is a standard d-dimensional Wiener process, V is a function R^d → R, the environment potential, and W is a symmetric function R^d × R^d → R with zero diagonal, the interaction potential. Here ∇_1 denotes the gradient with respect to the first R^d-valued variable. Signs and constants have been chosen in analogy with the finite-state models considered here. Solutions of (4.12) arise as weak limits of the empirical measure processes associated with weakly interacting Itô diffusions. The N-particle model is described by the system

dX_i(t) = −∇V(X_i(t)) dt − (1/N) Σ_{j=1}^{N} ∇_1 W(X_i(t), X_j(t)) dt + dB_i(t),

where i ∈ {1, . . . , N} and B_1, . . . , B_N are independent standard Brownian motions.
In [13] a candidate Lyapunov function F : P(R^d) → [0, ∞], referred to as the "free energy function", is introduced without explicit motivation, and then shown to be in fact a valid Lyapunov function. The same function is also considered in [6] and plays a key role in that study of the convergence properties of μ_t as t → ∞. The function takes the following form. If μ is a probability measure that is absolutely continuous with respect to Lebesgue measure and of the form f_μ(x) dx, then F(μ) is the free energy: the sum of an entropy term involving ∫ f_μ log f_μ dx, an environment-potential term involving ∫ V dμ, and an interaction term involving (1/2)∫∫ W(x, y) μ(dx)μ(dy), with constants matched to (4.12) (see (4.13)). In all other cases F(μ) := ∞. This function is clearly a close analogue of the function in (4.5), which was derived as the limit of scaled relative entropies. There are, however, some interesting differences in the presentation and proof of the needed properties. The most significant of these is how one represents the derivative of the composition of the Lyapunov function with the solution to the forward equation. In [13] the descent property is established by expressing the orbital derivatives of F in terms of the Donsker-Varadhan rate function associated with the empirical measures of solutions to (4.12), when the measure μ_t is frozen at μ ∈ P(R^d). In contrast, in our case the orbital derivative of the Lyapunov function is expressed as the orbital derivative of relative entropy with respect to the invariant distribution π(p) that is obtained when the dynamics of the nonlinear Markov process are frozen at p. The latter expression also applies to the diffusion case in the sense that for all t ≥ 0,

d/dt F(μ_t) = [ d/ds R(μ_s‖π_ν) ]|_{s=t, ν=μ_t}, (4.14)

where μ_t is the law of X(t), X being the solution to (4.12) for some (absolutely continuous) initial condition, and π_ν ∈ P(R^d) is given by the Gibbs-type density

π_ν(dx) := Z_ν^{−1} exp( −2 [ V(x) + ∫ W(x, y) ν(dy) ] ) dx, (4.15)

with Z_ν the normalizing constant. Clearly, the probability measures given by (4.15) correspond to the distributions π(p) ∈ P(X) defined in (4.4). The relationship (4.14) can be established in a way analogous to the proof of Theorem 4.1.
On the other hand, the representation for the orbital derivative of the Lyapunov function in terms of the Donsker-Varadhan rate function as established in [13] for the diffusion case does not carry over to the finite-state Gibbs models studied above. Because of this, we argue that (4.14) is the more natural and general way to demonstrate that F has the properties required of a Lyapunov function.
To make this more precise, consider the case of linear Markov processes. Let Γ be the infinitesimal generator (rate matrix) of an X-valued ergodic Markov family with unique stationary distribution π ∈ P(X). Let p(·) be a solution of the corresponding forward equation (1.1). Then

d/dt R(p(t)‖π) = −E_Γ(f_t, log f_t),

where f_t := dp(t)/dπ is the density of p(t) with respect to π and E_Γ(·, ·) is the Dirichlet form associated with Γ and its stationary distribution π, that is,

E_Γ(f, g) := −Σ_{x∈X} π_x f(x) (Γg)(x)

for test functions f, g : X → R. On the other hand, the Donsker-Varadhan I-function associated with Γ is given by

I(μ) := −inf_{u>0} Σ_{x∈X} μ_x (Γu)(x)/u(x), μ ∈ P(X).

If Γ is also reversible (i.e., Γ_xy π_x = Γ_yx π_y for all x, y ∈ X), then I takes the more explicit form

I(μ) = E_Γ(√f, √f), f := dμ/dπ;

see, for instance, Theorem IV.14 and Exercise IV.24 in [10, pp. 47-50]. In general, the functions f ↦ E_Γ(√f, √f) and f ↦ E_Γ(f, log f), with f ranging over all non-degenerate π-densities, are not proportional. As a counterexample, it is enough to evaluate the Dirichlet forms for

Γ = ( −1  1 ; 1  −1 ) and π = (1/2, 1/2).
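The suggested counterexample is easy to carry out numerically. The sketch below uses the convention E(f, g) = −Σ_x π_x f(x)(Γg)(x) for the Dirichlet form; if the two functionals were proportional, the ratio computed below would be the same for every non-degenerate π-density.

```python
import math

# Dirichlet-form counterexample for Gamma = [[-1, 1], [1, -1]],
# pi = (1/2, 1/2): the maps f -> E(sqrt f, sqrt f) and
# f -> E(f, log f) are not proportional over pi-densities.
Gamma = [[-1.0, 1.0], [1.0, -1.0]]
pi = [0.5, 0.5]

def dirichlet(f, g):
    Gg = [sum(Gamma[x][y] * g[y] for y in range(2)) for x in range(2)]
    return -sum(pi[x] * f[x] * Gg[x] for x in range(2))

def ratio(f):
    sq = [math.sqrt(v) for v in f]
    lg = [math.log(v) for v in f]
    return dirichlet(f, lg) / dirichlet(sq, sq)

# two non-degenerate pi-densities (pi-mean equal to 1)
r1 = ratio([1.5, 0.5])
r2 = ratio([1.9, 0.1])
print(r1, r2)   # distinct ratios => the functionals are not proportional
```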

A PDE for Limits of Relative Entropies
In the last section we saw that the scaling limits of relative entropies with respect to stationary distributions of certain N-particle Markov processes X^N yield candidate Lyapunov functions for (1.1). In this section we consider the case where closed form expressions for the stationary distributions are not available and consequently these limits cannot be evaluated explicitly. Recall from the discussion in Section 5 of [5] that in such cases our basic approach to constructing Lyapunov functions is to take limits of the scaled relative entropy F^N_t specified in equation (1.4) of [5] (see also equation (5.5) in this section), first as N → ∞ and then as t → ∞. Theorem 5.5 of [5] shows that under some basic assumptions, the limit as N → ∞ coincides with the large deviation rate function J_t(·) : S → [0, ∞) for a certain sequence of empirical measures of N-particle systems that converge to the solution of the ODE (1.1). This large deviations result [11, 4, 8] is recalled in Section 5.1. Next, we formally derive a time-dependent PDE for the associated large deviation rate function J_t(r) in Section 5.2, and present the stationary version of this PDE in Section 5.3. These formal calculations simply motivate the form of the PDE; the main result, presented in Section 5.5, that subsolutions to the stationary PDE serve as local Lyapunov functions for the ODE (1.1) does not rely on this derivation. The proof of the main result relies on certain properties that are first established in Section 5.4.

A large deviation result
Let, as in Section 2 of [5], X^N = (X^{1,N}, . . . , X^{N,N}) be an X^N-valued Markov process with transitions governed by the family of matrices {Γ(r), r ∈ P(X)}, where for each r ∈ P(X), Γ(r) = {Γ_{x,y}(r), x, y ∈ X} is a transition rate matrix of a continuous-time Markov chain on X (here for simplicity we assume that Γ^N = Γ). Specifically, the transition mechanism is as follows. Given X^N(t) = x ∈ X^N, an index i ∈ {1, . . . , N} and y ≠ x_i, the jump rate at time t for the transition that replaces the i-th coordinate x_i by y is Γ_{x_i, y}(r^N(x)), where r^N(x) denotes the empirical measure of x. The jump rates for transitions of any other type are zero. Under the assumption of exchangeability of the initial random vector (X^{1,N}(0), . . . , X^{N,N}(0)), the processes X^{i,N}, i = 1, . . . , N, are also exchangeable. From this, it follows that the empirical measure process μ^N = {μ^N(t)}_{t≥0} is a Markov chain taking values in S_N = S ∩ (1/N)Z^d, where S is the unit simplex which is identified with P(X), with the generator L^N given by

L^N f(r) = Σ_{x,y∈X: x≠y} N r_x Γ_{xy}(r) [ f(r + (e_y − e_x)/N) − f(r) ] (5.2)

for real-valued functions f on S_N.
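The empirical-measure dynamics just described can be simulated directly: from state r, the chain jumps to r + (e_y − e_x)/N at rate N r_x Γ_xy(r). The sketch below does this with a standard Gillespie-type scheme for a hypothetical p-dependent two-state rate matrix (chosen for illustration only); for large N the path stays close to the solution of the limit ODE (1.1).

```python
import random

# Simulate the empirical-measure jump chain mu^N: from r in S_N,
# jump r -> r + (e_y - e_x)/N at rate N * r_x * Gamma_xy(r).
random.seed(0)
N, d = 200, 2

def Gamma(r):
    # hypothetical p-dependent rates, illustration only
    a = 1.0 + r[1]
    b = 2.0 + r[0]
    return [[-a, a], [b, -b]]

def simulate(r, horizon):
    t = 0.0
    while True:
        G = Gamma(r)
        # rates of all admissible jumps x -> y, x != y
        jumps = [(x, y, N * r[x] * G[x][y]) for x in range(d) for y in range(d)
                 if x != y and r[x] > 0]
        total = sum(rate for _, _, rate in jumps)
        t += random.expovariate(total)
        if t > horizon:
            return r
        u, acc = random.random() * total, 0.0
        for x, y, rate in jumps:
            acc += rate
            if u <= acc:
                r = r[:]
                r[x] -= 1.0 / N
                r[y] += 1.0 / N
                break

r = simulate([0.5, 0.5], horizon=30.0)
print(r)   # fluctuates around the fixed point of the limit ODE
```

For this Γ the limit ODE has fixed point (2/3, 1/3), so for N = 200 the terminal state is within O(1/√N) of it.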
We recall the following locally uniform LDP for the empirical measure process. The LDP has been established in [11,4] while the locally uniform version used here is taken from [8].
Theorem 5.1 Suppose that for each p ∈ S, Γ(p) is the transition rate matrix of an ergodic Markov chain and that Condition 2.1 holds. For t ∈ [0, ∞) let p^N(t) be the distribution of X^N(t) = (X^{1,N}(t), . . . , X^{N,N}(t)). Recall the mapping r^N : X^N → P_N(X) given by (5.1), i.e., r^N(x) is the empirical measure of x. Assume that the initial random vector {X^{i,N}(0)}_{i=1,...,N} is exchangeable and assume that r^N under the distribution p^N(0) satisfies a large deviation principle (LDP) with a rate function J_0. Then for each t ∈ [0, ∞), r^N under the distribution p^N(t) satisfies a locally uniform LDP on P(X) with a rate function J_t; thus, given any sequence {q_N} with q_N ∈ P_N(X) and q_N → q ∈ P(X),

lim_{N→∞} −(1/N) log p^N(t)( {x ∈ X^N : r^N(x) = q_N} ) = J_t(q).

Furthermore, J_t(q) < ∞ for all q ∈ P(X).
We will now formally derive a PDE solved by J t (q).

A time-dependent PDE
For notational convenience, throughout this section for t ≥ 0 and r ∈ S we write J_t(r) as J(r, t). For t ≥ 0, let u^N(t) denote the distribution of μ^N(t), that is, for r ∈ S_N, let u^N_r(t) = P(μ^N(t) = r). Then u^N satisfies the Kolmogorov forward equation

d/dt u^N(t) = (L^N)* u^N(t), (5.3)

where L^N is as in (5.2) and (L^N)* denotes its adjoint. For r ∈ S_N, substituting into (5.3) the approximation u^N_r(t) ≈ e^{−N J(r,t)} that follows from the LDP stated in Theorem 5.1, and recalling the form of the generator L^N from (5.2), we obtain

∂/∂t e^{−N J(r,t)} = Σ_{x,y∈X: x≠y} N (r_x + 1/N) Γ_xy(r + (e_x − e_y)/N) e^{−N J(r + (e_x − e_y)/N, t)} − Σ_{x,y∈X: x≠y} N r_x Γ_xy(r) e^{−N J(r,t)}.

Observing that the left-hand side of the last display equals −N e^{−N J(r,t)} (∂J(r, t)/∂t), and multiplying both sides by e^{N J(r,t)}/N, we obtain

−∂J/∂t (r, t) = Σ_{x,y∈X: x≠y} (r_x + 1/N) Γ_xy(r + (e_x − e_y)/N) e^{−N [J(r + (e_x − e_y)/N, t) − J(r, t)]} − Σ_{x,y∈X: x≠y} r_x Γ_xy(r).

If J is smooth and N is large, we can use the approximation

N [ J(r + (e_x − e_y)/N, t) − J(r, t) ] ≈ ⟨DJ(r, t), e_x − e_y⟩.

Substituting this approximation into the previous display, sending N → ∞ and recalling that Γ_{x,y}(·) is continuous, we obtain the following PDE for J(r, t): for r ∈ S and t ∈ [0, ∞),

−∂J/∂t (r, t) = Σ_{x,y∈X: x≠y} ( e^{⟨DJ(r,t), e_y − e_x⟩} − 1 ) r_x Γ_xy(r). (5.4)

As mentioned earlier, this derivation is not rigorous because we did not establish the smoothness properties of J assumed in the calculations. In fact, in general one does not expect this smoothness property to hold, and one would have to interpret J as a viscosity solution to the PDE (5.4) with appropriate boundary conditions. However, the derivation simply serves to motivate the form of the stationary PDE, and the proof of the main result given in the next section does not rely on this derivation.

The stationary PDE and Lyapunov functions
We now introduce our main tool for constructing local Lyapunov functions for (1.1). Recall the stationary version of the PDE (5.4), namely

H(r, DJ(r)) = 0, r ∈ S, (5.7)

where

H(r, α) := Σ_{x,y∈X: x≠y} r_x Γ_xy(r) ( e^{−⟨α, e_x − e_y⟩} − 1 ), r ∈ S, α ∈ R^d. (5.8)

A C¹ function J : D → R is called a (classical) subsolution of (5.7) on D if H(r, DJ(r)) ≤ 0 for every r ∈ D. Recall the definition of positive definiteness given in Definition 2.4. Theorem 5.3 below says that a positive definite subsolution of the PDE (5.7) is a local Lyapunov function.

Theorem 5.3 Suppose Condition 2.1 holds. Let π* ∈ S• be a fixed point of (1.1), and let D be a relatively open subset of S that contains π*. Let J : D → R be a C¹ positive definite function that is a subsolution of (5.7) on D. Then J is a local Lyapunov function for (1.1) associated with (D, π*).
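As a sanity check on the role of the stationary equation, note that for a constant (linear) ergodic rate matrix the relative entropy with respect to the stationary distribution should solve it exactly, consistent with Section 3. The sketch below assumes the stationary equation is (5.4) with ∂J/∂t = 0, i.e. Σ_{x≠y}(e^{⟨DJ(q), e_y − e_x⟩} − 1) q_x Γ_xy = 0, and evaluates the left side for J(q) = R(q‖π).

```python
import math

# For a constant ergodic rate matrix Gamma with stationary pi,
# J(q) = R(q || pi) solves sum_{x != y} (exp(<DJ, e_y - e_x>) - 1)
# * q_x * Gamma_xy = 0, since <DJ(q), e_y - e_x> = log(q_y/pi_y)
# - log(q_x/pi_x).
Gamma = [[-3.0, 2.0, 1.0], [1.0, -2.0, 1.0], [2.0, 1.0, -3.0]]
d = 3

# find pi by running the linear forward equation to stationarity
pi, dt = [1.0 / 3.0] * d, 1e-3
for _ in range(200000):
    pi = [pi[y] + dt * sum(pi[x] * Gamma[x][y] for x in range(d)) for y in range(d)]

def stationary_pde_residual(q):
    grad = [math.log(q[x] / pi[x]) for x in range(d)]
    return sum((math.exp(grad[y] - grad[x]) - 1.0) * q[x] * Gamma[x][y]
               for x in range(d) for y in range(d) if x != y)

qs = ([0.2, 0.3, 0.5], [0.6, 0.3, 0.1], [0.05, 0.05, 0.9])
residuals = [stationary_pde_residual(q) for q in qs]
print(residuals)   # each ~ 0
```

That the residual vanishes identically follows from Σ_{x≠y} π_x Γ_xy = −π_y Γ_yy, which makes the two sums in the equation cancel term by term.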
Before proceeding with the proof of the theorem we note some basic properties of the function H introduced in (5.8).

Lemma 5.4 (a) For every r ∈ S, H(r, 0) = 0. (b) For every r ∈ S, the map α ↦ H(r, α) is smooth and convex on R^d. (c) Suppose that for each p ∈ S, Γ(p) is the rate matrix of an ergodic Markov chain. Then given r ∈ S•, α, ᾱ ∈ R^d with ᾱ − α ∈ R^d \ {c1 : c ∈ R}, and any ρ ∈ (0, 1),

H(r, ρα + (1 − ρ)ᾱ) < ρ H(r, α) + (1 − ρ) H(r, ᾱ).

Properties of H
Proof. The definition of H in (5.8) immediately implies part (a), and (b) follows since the map α ↦ e^{−⟨α, v⟩} − 1 is smooth and convex for any vector v ∈ R^d and r_x Γ_xy(r) ≥ 0 for all x ≠ y, r ∈ S. To prove (c), fix r ∈ S• and α, ᾱ ∈ R^d such that w := ᾱ − α ∉ {c1 : c ∈ R}. Then there exist x̄, ȳ ∈ {1, . . . , d} such that w_x̄ ≠ w_ȳ. Due to the smoothness and convexity of H(r, ·), it suffices to show that

(d²/ds²) H(r, α + sw) > 0 for all s ∈ (0, 1). (5.10)

Since Γ is ergodic there is a sequence of distinct states x̄ = x_1, x_2, . . . , x_j = ȳ such that Γ_{x_i x_{i+1}}(r) > 0. Also, since w_x̄ ≠ w_ȳ, for some i we have w_{x_i} ≠ w_{x_{i+1}}, and so (5.10) follows.
Given r ∈ S and β ∈ R^d, define

L(r, β) := inf { Σ_{x,y∈X: x≠y} r_x Γ_xy(r) ℓ( u_xy / (r_x Γ_xy(r)) ) : u_xy ≥ 0, Σ_{x,y∈X: x≠y} (e_y − e_x) u_xy = β }, (5.11)

where ℓ(q) := q log q − q + 1 for q ≥ 0. The following lemma establishes duality relations between L and H.
Proof. Fix r ∈ S, define V := {e_y − e_x : x, y ∈ X, x ≠ y}, and for v = e_y − e_x ∈ V set Λ_v(r) := r_x Γ_{xy}(r). Then note that for α ∈ R^d,
H(r, α) = Σ_{v∈V} h_v(r, α), where h_v(r, α) := Λ_v(r) (e^{−⟨α, v⟩} − 1).
For v ∈ V, let ℓ_v(r, ·) be the Legendre transform of h_v(r, ·):
ℓ_v(r, β) := sup_{α∈R^d} [⟨α, β⟩ − h_v(r, α)], β ∈ R^d. (5.14)
From (5.14), it follows using standard properties of Legendre transforms (see, e.g., Corollary D.4.2 of [7]) that the function L(r, ·) defined in (5.11) is the Legendre transform of the function H(r, ·), that is,
L(r, β) = sup_{α∈R^d} [⟨α, β⟩ − H(r, α)], β ∈ R^d,
which is easily seen to be equivalent to (5.12). Finally, since H(r, ·) is convex and continuous by Lemma 5.4(b), the duality property of Legendre transforms shows that
H(r, α) = sup_{β∈R^d} [⟨α, β⟩ − L(r, β)], α ∈ R^d.
This is clearly equivalent to the relation (5.13), and so the proof is complete.
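The duality of Lemma 5.5 can be checked numerically at the candidate optimizer. In the sketch below, ℓ(q) = q log q − q + 1 and Λ_{xy} = r_x Γ_{xy}(r); the choices of r, Γ and α, as well as the sign convention in the exponent of H, are assumptions of the sketch.

```python
# Numerical sketch of the duality in Lemma 5.5: at the candidate optimizer
# u_xy = Lambda_xy exp(<alpha, e_y - e_x>), the cost in (5.11) equals
# <alpha, beta> - H(r, alpha), where beta = sum (e_y - e_x) u_xy.  Perturbing
# u while keeping beta fixed can only increase the cost.
import math

d = 3
r = [0.4, 0.35, 0.25]
G = [[-2.0, 1.0, 1.0], [0.5, -1.5, 1.0], [1.0, 2.0, -3.0]]
alpha = [0.2, -0.4, 0.7]

def ell(q):
    return q * math.log(q) - q + 1.0 if q > 0 else 1.0

edges = [(x, y) for x in range(d) for y in range(d) if x != y]
Lam = {(x, y): r[x] * G[x][y] for (x, y) in edges}
u = {(x, y): Lam[x, y] * math.exp(alpha[y] - alpha[x]) for (x, y) in edges}
beta = [sum(u[x, y] * ((z == y) - (z == x)) for (x, y) in edges)
        for z in range(d)]

cost = sum(Lam[e] * ell(u[e] / Lam[e]) for e in edges)
H_val = sum(Lam[e] * (math.exp(alpha[e[1]] - alpha[e[0]]) - 1.0) for e in edges)
gap = cost - (sum(a * b for a, b in zip(alpha, beta)) - H_val)

u2 = dict(u)
u2[0, 1] += 0.1          # adding equal mass to u_01 and u_10 leaves beta
u2[1, 0] += 0.1          # unchanged, since (e_1 - e_0) + (e_0 - e_1) = 0
cost2 = sum(Lam[e] * ell(u2[e] / Lam[e]) for e in edges)
```

The identity gap = 0 is the Fenchel–Young equality at the optimizer; the perturbed cost illustrates that the infimum in (5.11) is attained there.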
We now return to the proof of Theorem 5.3.

Locally Gibbs Systems
The PDE characterization of Section 5 gives a recipe for constructing local Lyapunov functions for (1.1). Although explicit solutions of (5.7) are in general not available, there is an important class of nonlinear Markov processes, introduced below, for which solutions to the PDE (5.7) can be constructed explicitly and which generalizes the class of Gibbs systems.
Definition 6.1 A family of transition rate matrices {Γ(r)}_{r∈S} on X is said to be locally Gibbs if the following two properties hold: (a) for each r ∈ S, Γ(r) is the rate matrix of an ergodic Markov chain on X, whose stationary distribution we denote by π(r); (b) there exists a C^1 function U on S such that for every x, y ∈ X, x ≠ y,
−log (π(r)_y / π(r)_x) = D_{e_y−e_x} U(r), r ∈ S. (6.1)
The function U is referred to as the potential associated with the locally Gibbs family.
The following result gives a local Lyapunov function for the ODE (1.1) associated with a locally Gibbs family.

Theorem 6.2 Suppose {Γ(r)}_{r∈S} is locally Gibbs with potential U, and define
J(r) = U(r) + Σ_{x∈X} r_x log r_x, r ∈ S. (6.2)
Then J solves the stationary PDE (5.7) on S•. In particular, if π* ∈ S• is a fixed point of (1.1), D is a relatively open subset of S containing π*, and J is positive definite in the sense of Definition 2.4, then J is a local Lyapunov function for (1.1) associated with (D, π*).

Proof. Let {Γ(r)}_{r∈S}, U and J be as in the statement of the theorem. Let {π(r)}_{r∈S} be the corresponding collection of stationary distributions on X.
Since U is C^1 on S• by assumption, J is clearly also C^1 on S•. We now show that J is a solution to the equation (5.9). First note that, due to the locally Gibbs condition (6.1), for r ∈ S• and x, y ∈ X, x ≠ y,
e^{D_{e_y−e_x} J(r)} = (r_y π(r)_x) / (r_x π(r)_y).
Moreover, since π(r) is the stationary distribution for the Markov chain with transition rate matrix Γ(r), for any y ∈ X,
Σ_{x∈X: x≠y} π(r)_x Γ_{xy}(r) = −π(r)_y Γ_{yy}(r).
Therefore, for r ∈ S•,
Σ_{x,y∈X: x≠y} (e^{D_{e_y−e_x} J(r)} − 1) r_x Γ_{xy}(r) = Σ_{y∈X} (r_y / π(r)_y) Σ_{x: x≠y} π(r)_x Γ_{xy}(r) − Σ_{x∈X} r_x Σ_{y: y≠x} Γ_{xy}(r) = −Σ_{y∈X} r_y Γ_{yy}(r) + Σ_{x∈X} r_x Γ_{xx}(r) = 0.
Thus J solves the PDE (5.7) on S•. The result now follows from Theorem 5.3.
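The cancellation in the proof above can be verified numerically in the simplest locally Gibbs case, a constant ergodic Γ with stationary distribution π (reversibility is not needed), for which U(r) = −Σ_x r_x log π_x and hence J(r) = Σ_x r_x log(r_x/π_x). The cyclic rate matrix below is a hypothetical example with πΓ = 0 but no detailed balance.

```python
# Numerical check of the cancellation above for a constant ergodic Gamma
# with stationary pi: here exp(D_{e_y-e_x} J(r)) = r_y pi_x / (r_x pi_y),
# and the sum over x != y should vanish identically in r.  The cyclic rate
# matrix below (unit probability flux around 0 -> 1 -> 2 -> 0) is a
# hypothetical non-reversible example.
d = 3
pi = [0.5, 0.3, 0.2]
G = [[0.0] * d for _ in range(d)]
for x in range(d):
    G[x][(x + 1) % d] = 1.0 / pi[x]    # Gamma_{x,x+1} = 1/pi_x
    G[x][x] = -1.0 / pi[x]

# Stationarity: (pi Gamma)_y = 0 for all y.
stat = max(abs(sum(pi[x] * G[x][y] for x in range(d))) for y in range(d))

def pde_residual(r):
    # sum_{x != y} (exp(D_{e_y-e_x} J(r)) - 1) r_x Gamma_xy
    return sum((r[y] * pi[x] / (r[x] * pi[y]) - 1.0) * r[x] * G[x][y]
               for x in range(d) for y in range(d) if x != y)

res = max(abs(pde_residual(r)) for r in ([0.2, 0.5, 0.3], [0.6, 0.1, 0.3],
                                         [1 / 3, 1 / 3, 1 / 3]))
```

The residual vanishes by the same telescoping cancellation used in the proof, here around the cycle.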
In the rest of this section we describe several examples that correspond to locally Gibbs systems, and also give an example that falls outside this category; we show that even in the latter setting the PDE (5.7) can in some cases still be used to construct local Lyapunov functions. In what follows we will not discuss the positive definiteness property, and instead refer to a function that satisfies (5.9) as a candidate Lyapunov function, with the understanding that if positive definiteness is added, such a function will in fact be a local Lyapunov function.
The rest of the section is organized as follows. Section 6.1 considers a class of models that are a slight extension of the Gibbs systems studied in Section 4. A particular case of locally Gibbs that appears in several contexts is introduced and discussed in Section 6.2. Section 6.3 presents two examples of three-dimensional systems which in particular illustrate that Gibbs systems are a strict subset of locally Gibbs systems. In Section 6.4 we consider models with nearest neighbor transitions. Section 6.5 studies an example from telecommunications [1] for which the associated N -particle system has the feature of "simultaneous jumps." We show that an explicit construction of a Lyapunov function carried out in [1] follows as a special case of Theorem 6.2. All examples in Sections 6.1-6.5 are locally Gibbs systems. Section 6.6 considers an example that demonstrates that the class of models for which a non-trivial solution to the PDE (5.7) can be obtained is strictly larger than that of locally Gibbs systems.

Gibbs systems
Recall the empirical measure functional r N : X N → S defined in (5.1). Also recall that throughout we assume r → Γ(r) is Lipschitz continuous. We now introduce a class of models that slightly extend those studied in Section 4 which, with an abuse of terminology, we once more refer to as Gibbs systems.
Definition 6.3 Suppose K = (K_z)_{z∈X}, where each K_z is a continuously differentiable function on R^d. We say a family of rate matrices {Γ(r)}_{r∈S} on X is Gibbs with potential function K if: (a) for each r ∈ S, Γ(r) is a rate matrix of an ergodic Markov chain with state space X; (b) for each N ∈ N there exists a collection of rate matrices {Γ^N(r)}_{r∈S} such that Γ^N → Γ uniformly on S, and the N-particle Markov process X^N, for which the jump rate of the transition x ↦ T_j^y x is Γ^N_{x_j y}(r^N(x)), is reversible with stationary distribution
π^N(x) = (1/Z^N) exp( −N Σ_{z∈X} K_z(r^N(x)) r^N_z(x) ), x ∈ X^N, (6.3)
where Z^N is the normalization constant.
From Section 4 of [5] it follows that the family of rate matrices in equation (4.3) is Gibbs in the sense of Definition 6.3. Note however that Definition 6.3 allows for more general forms of rate matrices than (4.3).
The following lemma shows that a Gibbs system is locally Gibbs in the sense of Definition 6.1.

Lemma 6.4
If {Γ(r)}_{r∈S} is Gibbs with some potential K, then it is locally Gibbs with potential U(r) = Σ_{z∈X} K_z(r) r_z.
Proof. Since X^N is reversible, the following detailed balance condition must hold: for every x ∈ X^N, y ∈ X and j ∈ {1, . . . , N},
π^N(x) Γ^N_{x_j y}(r^N(x)) = π^N(T_j^y x) Γ^N_{y x_j}(r^N(T_j^y x)), (6.4)
where T_j^y x has jth coordinate equal to y and all other coordinates identical to those of x. Since r^N(T_j^y x) = r^N(x) + (1/N)(e_y − e_{x_j}), by (6.3) and (6.4) it follows that for x ∈ X^N,
Γ^N_{x_j y}(r^N(x)) / Γ^N_{y x_j}(r^N(T_j^y x)) = exp( −N [ Σ_{z∈X} K_z(r^N(T_j^y x)) r^N_z(T_j^y x) − Σ_{z∈X} K_z(r^N(x)) r^N_z(x) ] ). (6.5)
Fix x, y ∈ X, x ≠ y and j ∈ N. Given r ∈ S, for each N ≥ j let x^N = (x_1, . . . , x_N) ∈ X^N be such that x_j = x and r^N(x^N) → r as N → ∞, and define
π(r)_x = e^{−H_x(r)} / Z(r), x ∈ X,
where H was defined in (4.2) and Z(r) is a normalization constant to make π(r) a probability measure. By sending N → ∞ in (6.5) (with x replaced by x^N), using the uniform convergence of Γ^N to Γ and the fact that K is C^1, we have
π(r)_x Γ_{xy}(r) = π(r)_y Γ_{yx}(r), x, y ∈ X, x ≠ y. (6.6)
This shows that π(r) is the stationary distribution for the rate matrix Γ(r), and thus verifies condition (a) of Definition 6.1. Condition (b) holds because U is C^1 due to the assumptions on K, and (6.1) is verified by combining the last display with the fact that U(r) = Σ_{z∈X} K_z(r) r_z and (4.4) imply
−⟨DU(r), e_y − e_x⟩ = log π(r)_y − log π(r)_x, x, y ∈ X, x ≠ y.
Thus, the family {Γ(r)}_{r∈S} is locally Gibbs with potential U.
Given a Gibbs family of matrices {Γ(r)} r∈S , it follows from Lemma 6.4 that the function J : S → [0, ∞) defined in (6.2) solves the stationary PDE (5.7) and thus serves as a candidate Lyapunov function. Example 4.2 shows that in general multiple fixed points of the forward equation (1.1) exist and that the function J may be positive definite in the sense of Definition 2.4 for some of the fixed points and not positive definite for others.
The locally Gibbs condition is significantly weaker than the Gibbs property. Indeed, it follows from (6.6) that Gibbs systems satisfy the detailed balance condition for their corresponding rate matrices, whereas systems with the locally Gibbs property need not. The simplest example is the constant family Γ(r) ≡ Γ, where Γ is an ergodic rate matrix with invariant distribution π for which detailed balance does not hold; such a family cannot be Gibbs. However, it is still locally Gibbs, with U(r) = ⟨r, v⟩, v_x = −log π_x, x ∈ X. Note that in this case the proposed Lyapunov function J(r) in Theorem 6.2 is just the relative entropy R(r ‖ π). Example 6.11 below will also illustrate this point.
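For this constant-rate case the Lyapunov property can also be observed directly: along numerical solutions of (1.1), the relative entropy R(r(t) ‖ π) should be non-increasing. The sketch below uses a hypothetical non-reversible cyclic Γ with stationary π, integrated by forward Euler.

```python
# Numerical sketch: for a constant (hence locally Gibbs) non-reversible
# rate family, J(r) = R(r || pi) decreases along solutions of d/dt r = r Gamma.
# The cyclic Gamma below (unit flux around 0 -> 1 -> 2 -> 0) is hypothetical.
import math

d = 3
pi = [0.5, 0.3, 0.2]
G = [[0.0] * d for _ in range(d)]
for x in range(d):
    G[x][(x + 1) % d] = 1.0 / pi[x]
    G[x][x] = -1.0 / pi[x]

def J(r):
    return sum(r[x] * math.log(r[x] / pi[x]) for x in range(d))

r = [0.1, 0.1, 0.8]
dt, increases = 1e-3, 0.0
values = [J(r)]
for _ in range(40000):
    r = [r[y] + dt * sum(r[x] * G[x][y] for x in range(d)) for y in range(d)]
    values.append(J(r))
    increases = max(increases, values[-1] - values[-2])

final_gap = max(abs(r[x] - pi[x]) for x in range(d))
```

Monotone decay of the relative entropy for a linear master equation is classical; the sketch merely illustrates it for a chain without detailed balance.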

A class of locally Gibbs systems
We now introduce a family of ergodic rate matrices {Γ(r)}_{r∈S} that describe limits of particle systems whose dynamics need not be reversible for each N (and hence may not be Gibbs systems), but nevertheless have a structure with some similarities to Gibbs systems. We show that they are locally Gibbs, and then give two concrete examples where they arise.

Condition 6.5 For a family of transition rate matrices {Γ(r)}_{r∈S} on X the following two properties hold.
(a) For each r ∈ S, Γ(r) is the rate matrix of an ergodic Markov chain on X with stationary distribution π(r).
(b) There exist R : X × [0, 1] → R and K : X × R^d → R such that for each x ∈ X, R_x(·) = R(x, ·) is a continuous function and K_x(·) = K(x, ·) is a C^1 function on S, and such that for each r ∈ S, π(r) has the form
π(r)_x = (1/Z(r)) exp( −R(x, r_x) − H(x, r) ), x ∈ X, (6.7)
where H is defined in terms of K as in (4.2), and Z(r) is, as usual, the normalization constant.

Note that the Gibbs systems from Section 6.1 satisfy Condition 6.5 with R(x, t) = 1, (x, t) ∈ X × [0, 1].

Remark 6.6 Given a family {Γ(r)}_{r∈S}, suppose there exist R and K as in Condition 6.5 such that for every x, y ∈ X, x ≠ y,
exp( −R(x, r_x) − H(x, r) ) Γ_{xy}(r) = exp( −R(y, r_y) − H(y, r) ) Γ_{yx}(r), r ∈ S. (6.8)
Then, for fixed r ∈ S, Γ(r) satisfies the detailed balance conditions with stationary distribution π(r) given by (6.7), and so {Γ(r)}_{r∈S} satisfies Condition 6.5.

Lemma 6.7 Suppose {Γ(r)}_{r∈S} satisfies Condition 6.5. Then {Γ(r)}_{r∈S} is locally Gibbs with potential
U(r) = Σ_{x∈X} ∫_0^{r_x} R(x, t) dt + Σ_{z∈X} K(z, r) r_z, r ∈ S. (6.9)

Proof. First, note that the conditions on R and K ensure that U is a C^1 function on S. Thus, it suffices to verify equation (6.1) of Definition 6.1, namely to show that for every r ∈ S and x, y ∈ X, x ≠ y,
−log (π(r)_y / π(r)_x) = D_{e_y−e_x} U(r).
But this is a simple consequence of the identity
∂/∂r_x Σ_{z∈X} ∫_0^{r_z} R(z, t) dt = R(x, r_x), x ∈ X,
the fact that from (4.2) we have
∂/∂r_x Σ_{z∈X} K(z, r) r_z = H(x, r), x ∈ X,
and the definitions of π(r) and U in (6.7) and (6.9), respectively.
We now provide two classes of models that satisfy Condition 6.5. The first class is a system with only nearest-neighbor jumps.
The following is a sufficient condition for Condition 6.5.
It follows that {Γ(r)}_{r∈S} satisfies Condition 6.5 with R(i, ·) = log φ_i(·), i = 1, . . . , d, and K ≡ 0. By Lemma 6.7, it follows that {Γ(r)}_{r∈S} is locally Gibbs with potential
U(r) = Σ_{i=1}^d ∫_0^{r_i} log φ_i(t) dt.

Example 6.9 This example can be viewed as a generalization of the Glauber dynamics introduced in Section 4, in which the rate at which a particle changes state can depend both on the state of the particle and on the fraction of particles in that state. Suppose that we are given R and K as in Condition 6.5, let H be defined as in (4.2) and, as in Section 4, let Ψ : X × X × S → R be given by
Ψ(x, y, r) = H(y, r) − H(x, r), x, y ∈ X, r ∈ S,
and let (α(x, y))_{x,y∈X} be an irreducible and symmetric matrix with diagonal entries equal to zero and off-diagonal entries equal to either one or zero. Then, for r ∈ S, define the rates Γ_{xy}(r), x ≠ y, in terms of α, R and Ψ in such a way that the equality in (6.8) holds; then {Γ(r)}_{r∈S} satisfies Condition 6.5 by Remark 6.6. Next recall that Theorem 6.2 shows that J(r) = U(r) + Σ_{x∈X} r_x log r_x, with U defined by (6.9), is a candidate Lyapunov function for the associated ODE (1.1). In the present example, consider the case when there exists a common R_0 : [0, 1] → R such that R(x, ·) = R_0(·) for every x ∈ X and K(x, r) = −log(ν_x e^{R_0(ν_x)}) for some probability measure ν ∈ P(X) (and hence K(x, r) does not depend on r). Setting R̃_0(u) = u e^{R_0(u)} for u ∈ [0, 1], we then have, up to an additive constant,
J(r) = Σ_{x∈X} ∫_{ν_x}^{r_x} [ log R̃_0(u) − log R̃_0(ν_x) ] du,
which is non-negative if R̃_0 is non-decreasing. An analog of this functional for nonlinear diffusions living in an open subset Ω of a Riemannian manifold appears in [2], where it was shown to be equal to the large deviation functional of the so-called zero range process. Moreover, under the condition that R̃_0 is strictly increasing, it was shown in [3] that this functional (and a slight generalization of it, where the logarithm in the integrand is replaced by the derivative of a more general C^2 function) serves as a Lyapunov function for the associated nonlinear PDE.
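The sign claim for the functional above reduces to the elementary fact that ∫_ν^r (f(u) − f(ν)) du ≥ 0 for non-decreasing f, whether r > ν or r < ν. A small numeric check, with a hypothetical non-decreasing choice of f = log R̃_0:

```python
# Numeric check of the sign claim: for non-decreasing f = log R0_tilde,
# each term integral_{nu_x}^{r_x} (f(u) - f(nu_x)) du is non-negative,
# whether r_x > nu_x or r_x < nu_x.  f below corresponds to the hypothetical
# choice R0_tilde(u) = u e^u, i.e. R_0(u) = u.
import math

def f(u):
    return math.log(u) + u          # log R0_tilde(u) for R0_tilde(u) = u e^u

def term(nu, r, n=10000):
    # midpoint-rule approximation of integral_{nu}^{r} (f(u) - f(nu)) du
    h = (r - nu) / n
    return sum((f(nu + (k + 0.5) * h) - f(nu)) * h for k in range(n))

t_up = term(0.2, 0.7)               # r above nu: integrand >= 0
t_down = term(0.7, 0.2)             # r below nu: integrand <= 0, orientation flips
```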

Some three-dimensional examples
Both classes of locally Gibbs families studied so far had the property that for each r ∈ S, Γ(r) is associated with a reversible Markov chain for which the detailed balance condition (6.6) holds. This leads to two natural questions: (a) does every locally Gibbs family have the property that Γ(r) satisfies detailed balance for each r ∈ S? (b) if Γ(r) satisfies detailed balance for each r ∈ S, does it correspond to a locally Gibbs family? To address these questions, we consider some simple three-dimensional examples; specifically, Example 6.11 answers question (a) in the negative, while Example 6.10 gives a partial answer to (b) by showing that Γ(r) may satisfy detailed balance for each r ∈ S and yet fail to be a locally Gibbs family with any C^2 potential.
Example 6.10 Suppose that d = 3, fix a_i, b_j > 0, i = 1, 2, j = 2, 3, let B : S → (0, ∞) be some given function, and for r ∈ S let Γ(r) be the rate matrix in (6.11). For any fixed r ∈ S, the matrix in (6.11) corresponds to an ergodic rate matrix with stationary distribution π(r) as in (6.12), where Z(r) is, as usual, the normalization constant. Note also that the detailed balance condition (6.6) is satisfied for this model. However, as we see below, in general this is not a locally Gibbs system with a C^2 potential. Let Γ̄ be the rate matrix in (6.11) when B(r) is replaced by 1, and let r* be the associated stationary distribution. Fix c = (c_1, c_2, c_3) ∈ R^3 and κ ∈ R. Also, for r ∈ S, let Γ(r) be the matrix in (6.11) with
B(r) = e^{κ⟨r − r*, c⟩}. (6.13)
A simple calculation shows that if κ ≠ 0 and c_2 ≠ c_3, there is no C^2 function U that satisfies the equality in (6.1) for all x ≠ y and r ∈ S. Indeed, such a function would have to satisfy, for suitable real numbers α and β,
⟨DU(r), e_2 − e_1⟩ = κ⟨c, r⟩ + α, ⟨DU(r), e_2 − e_3⟩ = β. (6.14)
Differentiating the first equality in (6.14) in the direction e_2 − e_3 gives (e_2 − e_3)^T D^2U(r) (e_2 − e_1) = κ(c_2 − c_3), while differentiating the second in the direction e_2 − e_1 gives (e_2 − e_1)^T D^2U(r) (e_2 − e_3) = 0. By the symmetry of second derivatives the two left-hand sides coincide, so κ(c_2 − c_3) = 0. Thus we must have c_2 = c_3 or κ = 0. Consequently, when c_2 ≠ c_3 and κ ≠ 0 the model is not locally Gibbs.
On the other hand, if c_2 = c_3, one can check that (6.1) is satisfied with an appropriate choice of C^1 potential U. Thus, when c_2 = c_3, the model is a locally Gibbs system with this potential.

Systems with nearest neighbor jumps
The nearest neighbor model in Example 6.8 imposed certain symmetry conditions (see (6.10)) on the rate parameters. In the following example we consider a more general family of nearest neighbor models with certain monotonicity conditions on the rates. Suppose we are given birth and death rates Γ_{i,i+1}(r) and Γ_{i+1,i}(r), i = 1, . . . , d − 1, such that for every r ∈ S, Γ(r) is the rate matrix of a birth-death chain that satisfies the monotonicity condition (6.16), with Γ_{ij}(r) = 0 when |i − j| > 1 and, as usual, Γ_{ii}(r) = −Σ_{j: j≠i} Γ_{ij}(r), so that Γ(r) is a rate matrix. Let π(r) denote the stationary distribution of the chain with rate matrix Γ(r). Since π(r)Γ(r) = 0, we have for i = 1, . . . , d − 1,
π(r)_{i+1} Γ_{i+1,i}(r) = π(r)_i Γ_{i,i+1}(r). (6.17)
Then we claim that {Γ(r)}_{r∈S} is a locally Gibbs family with an explicit potential U. Indeed, for every r ∈ S and i = 1, . . . , d − 1, using (6.17),
−log (π(r)_{i+1} / π(r)_i) = log ( Γ_{i+1,i}(r) / Γ_{i,i+1}(r) ).
Together with (6.16) this shows that condition (6.1) is satisfied, and thus {Γ(r)}_{r∈S} is locally Gibbs.
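The role of (6.17) can be illustrated numerically: for a birth-death rate matrix at a fixed r, the stationary distribution obtained from the detailed balance recursion π_{i+1} Γ_{i+1,i} = π_i Γ_{i,i+1} satisfies global balance πΓ = 0. The birth and death rates below are a hypothetical r-dependent example.

```python
# Sketch for the birth-death setting: build pi(r) from the detailed balance
# recursion pi_{i+1} Gamma_{i+1,i} = pi_i Gamma_{i,i+1} at a fixed r, then
# confirm pi Gamma = 0.  The birth/death rates are hypothetical examples.
d = 4
r = [0.1, 0.2, 0.3, 0.4]
birth = [1.0 + r[i] for i in range(d - 1)]        # Gamma_{i,i+1}(r)
death = [2.0 - r[i + 1] for i in range(d - 1)]    # Gamma_{i+1,i}(r)

# Assemble the rate matrix with nearest neighbor jumps only.
G = [[0.0] * d for _ in range(d)]
for i in range(d - 1):
    G[i][i + 1] = birth[i]
    G[i + 1][i] = death[i]
for i in range(d):
    G[i][i] = -sum(G[i][j] for j in range(d) if j != i)

# Detailed balance recursion, then normalize.
pi = [1.0]
for i in range(d - 1):
    pi.append(pi[i] * birth[i] / death[i])
Z = sum(pi)
pi = [p / Z for p in pi]

balance = max(abs(sum(pi[i] * G[i][j] for i in range(d))) for j in range(d))
```

For birth-death chains detailed balance across each edge implies global balance, which is what the final check confirms.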

Models with simultaneous jumps
Weakly interacting particle systems with "simultaneous jumps" are described in [1, 8]. For our purposes here we need only know that the nonlinear Markov process associated with such models can also be interpreted as the limit process for an ordinary single-jump process, with an effective rate matrix that is defined in terms of the various rate matrices used in the definition of the original process. We describe one model with simultaneous jumps that arises naturally in telecommunications and was studied in [1], and show that the associated family of effective rate matrices is locally Gibbs.
In this model there are N nodes, each with capacity C ∈ N, and there are M ∈ N classes, each with parameters λ_m, µ_m, γ_m > 0 and, in addition, a capacity requirement A_m ∈ N. The state of node i is the number of calls of each class present at that node in an N-node network, and thus the state space takes the form X = {x ∈ N_0^M : Σ_{m=1}^M A_m x_m ≤ C}. The effective rates Γ_{xy}(r), x ≠ y, are as given in [1] and, as usual, Γ_{xx}(r) = −Σ_{y∈X, y≠x} Γ_{xy}(r). Moreover, it is easily verified (see Proposition 1 of [1]) that for each r ∈ S, Γ(r) is the rate matrix of an ergodic Markov chain with stationary distribution π(r) of product form, π(r)_x = (1/Z(r)) times an explicit product over the classes; taking logarithms, the quantity −log(π(r)_y/π(r)_x) is easily seen to coincide with D_{e_y−e_x} U(r), thus establishing that (6.1) is satisfied.
By Theorem 6.2, it then follows under positive definiteness that J(r) = U(r) + Σ_{x∈X} r_x log r_x is a local Lyapunov function for (1.1). Using the definition
a_m(r) = Σ_{x∈X} x_m r_x, m = 1, . . . , M, (6.18)
it is easily seen that J coincides with the Lyapunov function g constructed in Proposition 4 of [1].

Remark 6.14 Two features of the last example are that the state space X is a subset of R^M and that the rate matrix depends on r only through the mean values a_m(r), m = 1, . . . , M. The example can be generalized slightly. Indeed, consider a family of ergodic rate matrices {Γ(r)}_{r∈S} on X ⊂ R^M with the property that for each r ∈ S, Γ(r) has a stationary distribution π(r) of the form
π(r)_x = (1/Z(r)) Π_{m=1}^M Φ^{(m)}(a_m(r))^{x_m} exp(−H(x, r)), x ∈ X, r ∈ S,
where H is the function defined in (4.2) for some K : X × R^d → R, a_m is defined by (6.18) and, for each m = 1, . . . , M, Φ^{(m)} : R → (0, ∞) is continuous. Then, using arguments exactly analogous to those used previously in this section, one can show that {Γ(r)}_{r∈S} is locally Gibbs with potential
U(r) = −Σ_{m=1}^M ∫_0^{a_m(r)} log Φ^{(m)}(w) dw + Σ_{z∈X} K(z, r) r_z.
The last example presented then coincides with the case Φ^{(m)}(w) = λ_m + γ_m w, w ∈ R, and, for r ∈ S,
K(x, r) = H(x, r) = Σ_{m=1}^M ( x_m log(µ_m + γ_m) + log(x_m!) ), x ∈ X.
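The verification of (6.1) in Remark 6.14 can be sketched numerically in the simplest case M = 1, using the potential U from the remark (with Φ in the numerator of π and a minus sign on the integral term — an assumption of this sketch), together with hypothetical choices of Φ and an r-independent K, so that H(x, r) = K(x).

```python
# Numerical sketch of (6.1) for Remark 6.14 with M = 1: with
# pi(r)_x proportional to Phi(a(r))^x exp(-K(x)) and
# U(r) = -int_0^{a(r)} log Phi(w) dw + sum_z K(z) r_z, a finite-difference
# directional derivative of U should match -log(pi(r)_y / pi(r)_x).
# Phi and K below are hypothetical choices.
import math

X = [0, 1, 2]                  # one class; state = number of calls at a node
K = [0.0, 0.5, 1.2]            # hypothetical r-independent K(x)

def a(r):                      # mean a(r) = sum_x x r_x, cf. (6.18)
    return sum(x * rx for x, rx in zip(X, r))

def phi(w):                    # hypothetical Phi(w) = 1 + w (cf. lambda + gamma w)
    return 1.0 + w

def U(r):                      # int_0^a log(1+w) dw = (1+a) log(1+a) - a
    am = a(r)
    return -((1.0 + am) * math.log(1.0 + am) - am) + sum(K[i] * r[i] for i in range(3))

def neg_log_pi_ratio(r, x, y):  # -log(pi_y / pi_x); Z(r) cancels in the ratio
    return -(X[y] - X[x]) * math.log(phi(a(r))) + (K[y] - K[x])

def dir_deriv_U(r, x, y, h=1e-6):  # central difference along e_y - e_x
    rp = list(r); rm = list(r)
    rp[x] -= h; rp[y] += h
    rm[x] += h; rm[y] -= h
    return (U(rp) - U(rm)) / (2.0 * h)

r = [0.5, 0.3, 0.2]
err = max(abs(dir_deriv_U(r, x, y) - neg_log_pi_ratio(r, x, y))
          for x in range(3) for y in range(3) if x != y)
```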

Remark 6.15
Another example of a model with simultaneous jumps is the model of alternative routing in loss networks introduced in Gibbens, Hunt and Kelly [9]. It can be shown that the family of jump matrices {Γ(r)} r∈S associated with this model is not locally Gibbs with any C 2 potential U . This may explain why this problem has withstood analysis for more than a decade. It is an interesting open problem to see if the PDE characterization introduced here can be used to construct Lyapunov functions for this model and related ones.

A candidate Lyapunov function for a model that is not locally Gibbs
The example in this section demonstrates that the class of models for which explicit non-zero solutions of (5.7) can be found is larger than that of locally Gibbs models. Let d = 3, and for r ∈ S define the rate matrix Γ(r) by
Γ(r) = [ −a_1(r), a_1(r), 0 ; b_2(r), −(a_2(r) + b_2(r)), a_2(r) ; 0, b_3(r), −b_3(r) ],
where a_1 and a_2 are measurable functions from S to (0, 1), and b_2 and b_3 are given as follows. Let ψ : [0, 1] → (0, 1) be a continuous function that is bounded away from 0. We set
b_2(r) = (1 + (r_2 − r_3 ψ(r_3)) a_2(r)) a_1(r), b_3(r) = ψ(r_3) a_2(r) (1 + (r_2 − r_1) a_1(r)).