The Stationary Behaviour of Fluid Limits of Reversible Processes is Concentrated on Stationary Points

Assume that a stochastic process can be approximated, when some scale parameter gets large, by a fluid limit (also called "mean field limit" or "hydrodynamic limit"). A common practice, often called the "fixed point approximation", consists in approximating the stationary behaviour of the stochastic process by the stationary points of the fluid limit. It is known that this may be incorrect in general, as the stationary behaviour of the fluid limit may not be described by its stationary points. We show however that, if the stochastic process is reversible, the fixed point approximation is indeed valid. More precisely, we assume that the stochastic process converges to the fluid limit in distribution (hence in probability) at every fixed point in time. This assumption is very weak and holds for a large family of processes, among which are many mean field and other interaction models. We show that the reversibility of the stochastic process implies that any limit point of its stationary distribution is concentrated on stationary points of the fluid limit. If the fluid limit has a unique stationary point, it is an approximation of the stationary distribution of the stochastic process.


Erratum:
In the publication version, the definition of f N i , just after (14), should be f N i (n)

Introduction
This paper is motivated by the use of fluid limits in models of interacting objects or particles, in contexts such as communication and computer system modelling [6], biology [7] or game theory [3]. Typically, one has a stochastic process Y N , indexed by a size parameter N; under fairly general assumptions, one can show that the stochastic process Y N converges to a deterministic fluid limit ϕ [17]. We are interested in the stationary distribution of Y N , assumed to exist and be unique, but which may be too complicated to be computed explicitly. The "fixed point assumption" is then sometimes invoked [15,5,22,14]: it consists in approximating the stationary distribution of Y N by a stationary point of the deterministic fluid limit ϕ. In the frequent case where the fluid limit ϕ is described by an Ordinary Differential Equation (ODE), say of the form ẏ = F(y), the stationary points are obtained by solving F(y) = 0. If Y N is an empirical measure, convergence to a deterministic limit implies propagation of chaos, i.e. the states of different objects are asymptotically independent, and the distribution of any particular object at any time is obtained from the fluid limit. Under the fixed point assumption, the stationary distribution of one object is approximated by a stationary point of the fluid limit.
A critique of the fixed point approximation method is formulated in [2], which observes that one may only say, in general, that the stationary distribution of Y N converges to a stationary distribution of the fluid limit. For a deterministic fluid limit, a stationary distribution is supported by the Birkhoff center of the fluid limit, which may be larger than the set of stationary points. An example is given in [2] where the fluid limit has a unique stationary point, but the stationary distribution of Y N does not converge to the Dirac mass at this stationary point; in contrast, it converges to a distribution supported by a limit cycle of the ODE. If the fluid limit has a unique limit point, say y * , to which all trajectories converge, then this unique limit point is also the unique stationary point and the stationary distribution of Y N does converge to the Dirac mass at y * (i.e. the fixed point approximation is then valid). However, as illustrated in [2], this assumption may be difficult to verify, as it often does not hold, and when it does, it may be difficult to establish. For example, in [9] it is shown that the fixed point assumption does not hold for some parameter settings of a wireless system analyzed in [5], due to limit cycles in the fluid limit.
In this paper we show that there is a class of systems for which such complications do not arise, namely the class of reversible stochastic processes. Reversibility is classically defined as a property of time reversibility in the stationary regime [13]. There is a large class of processes that are known to be reversible, for example product-form queuing networks with reversible routing, or the stochastic process in [14], which describes the occupancy of inter-city telecommunication links; in Section 5 we give an example motivated by crowd dynamics. In such cases, we show that the fluid limit must have stationary points, and any limit point of the stationary distribution of Y N must be supported by the set of stationary points. Thus, for reversible processes that have a fluid limit, the fixed point approximation is justified.

A Collection of Reversible Random Processes
Let $E$ be a Polish space and let $d$ be a distance that metrizes $E$. Let $\mathcal{P}(E)$ be the set of probability measures on $E$, endowed with the topology of weak convergence. Let $C_b(E)$ be the set of bounded continuous functions from $E$ to $\mathbb{R}$, and similarly for $C_b(E \times E)$. We are given a collection of probability spaces $(\Omega^N, \mathcal{F}^N, \mathbb{P}^N)$ indexed by $N = 1, 2, 3, \ldots$ and for every $N$ we have a process $Y^N$ defined on $(\Omega^N, \mathcal{F}^N, \mathbb{P}^N)$. Let $E^N \subset E$ denote the support of $Y^N(0)$, and let $\mathbb{E}^N$ denote expectation under $\mathbb{P}^N$.
We assume that, for every $N$, the process $Y^N$ is Feller, in the sense that for every $t \ge 0$ and $h \in C_b(E)$, the conditional expectation $\mathbb{E}^N\left[h(Y^N(t)) \mid Y^N(0) = y_0\right]$ is a continuous function of $y_0 \in E^N$. Examples of such processes are continuous time Markov chains as in [16], or linear interpolations of discrete time Markov chains as in [4], or the projections of a Markov process as in [12].
We are interested in reversible processes, i.e. processes that keep the same stationary law under time reversal. A weak form of such a property is defined as follows.
Definition 2. Assume $\Pi^N$ is a probability on $E$ such that $\Pi^N(E^N) = 1$, for some $N$. We say that $Y^N$ is reversible under $\Pi^N$ if for every time $t \ge 0$ and any $h \in C_b(E \times E)$:
$$\int_{E^N} \mathbb{E}^N\left[h(y, Y^N(t)) \mid Y^N(0) = y\right] \Pi^N(dy) = \int_{E^N} \mathbb{E}^N\left[h(Y^N(t), y) \mid Y^N(0) = y\right] \Pi^N(dy).$$
Note that, necessarily, $\Pi^N$ is an invariant probability for $Y^N$. If $Y^N$ is an ergodic Markov process with enumerable state space, then Definition 2 coincides with the classical definition of reversibility by Kelly in [13]. Similarly, if $Y^N$ is a projection of a reversible Markov process $X^N$, as in [10], then $Y^N$ is reversible under the projection of the stationary probability of $X^N$; note that in such a case, $Y^N$ is not Markov.
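For a finite-state ergodic Markov chain, the classical notion of reversibility is equivalent to the detailed balance equations $\pi_i P_{i,j} = \pi_j P_{j,i}$. The following minimal Python sketch checks them on two toy chains that are not taken from this paper: a birth-death chain (reversible) and a biased cycle (stationary under the uniform law, but not reversible). The function names and matrices are illustrative assumptions.

```python
def is_stationary(P, pi, tol=1e-12):
    """True if pi is invariant for the transition matrix P, i.e. pi P = pi."""
    n = len(P)
    return all(
        abs(sum(pi[i] * P[i][j] for i in range(n)) - pi[j]) <= tol
        for j in range(n)
    )


def is_reversible(P, pi, tol=1e-12):
    """True if detailed balance pi[i] P[i][j] = pi[j] P[j][i] holds for all i, j."""
    n = len(P)
    return all(
        abs(pi[i] * P[i][j] - pi[j] * P[j][i]) <= tol
        for i in range(n)
        for j in range(n)
    )


# Toy example 1 (assumption, not from the paper): birth-death chain on {0, 1, 2}.
P_birth_death = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi_birth_death = [0.25, 0.50, 0.25]

# Toy example 2 (assumption, not from the paper): biased cycle on three states.
# The matrix is doubly stochastic, so the uniform law is stationary,
# but probability mass circulates and detailed balance fails.
P_cycle = [
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.9],
    [0.9, 0.1, 0.0],
]
pi_cycle = [1 / 3, 1 / 3, 1 / 3]

print(is_reversible(P_birth_death, pi_birth_death))
print(is_reversible(P_cycle, pi_cycle))
```

The second chain shows that invariance alone does not give reversibility, which is why Definition 2 is a genuinely stronger requirement than stationarity.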

A Limiting, Continuous Semi-Flow
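Further, let $\phi$ be a deterministic process, i.e. a measurable mapping $\phi : [0, \infty) \times E \to E$, $(t, y) \mapsto \phi_t(y)$. We assume that $\phi_t$ is a semi-flow, i.e. $\phi_0(y) = y$ and $\phi_{s+t}(y) = \phi_s(\phi_t(y))$ for all $s, t \ge 0$ and $y \in E$. A point $y \in E$ is a stationary point of $\phi$ if $\phi_t(y) = y$ for all $t \ge 0$.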
In cases where E is a subset of R d for some integer d, the semi-flow ϕ may be an autonomous ODE, of the form ẏ = F(y); here the stationary points are the solutions of F(y) = 0.
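Definition 4. We say that the semi-flow $\phi$ is reversible under the probability $\Pi \in \mathcal{P}(E)$ if for every time $t \ge 0$ and any $h \in C_b(E \times E)$:
$$\int_E h(y, \phi_t(y)) \, \Pi(dy) = \int_E h(\phi_t(y), y) \, \Pi(dy).$$
As we show in the next section, reversible semi-flows must concentrate on stationary points.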

Convergence Hypothesis
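We assume that, for every fixed $t$, the processes $Y^N$ converge in distribution to some space continuous deterministic process $\phi$ as $N \to \infty$, for every collection of converging initial conditions. More precisely:
Hypothesis 1. For every $y_0 \in E$ and every sequence $(y_0^N)_{N = 1, 2, \ldots}$ such that $y_0^N \in E^N$ and $\lim_{N \to \infty} y_0^N = y_0$:
$$\lim_{N \to \infty} \mathbb{E}^N\left[h(Y^N(t)) \mid Y^N(0) = y_0^N\right] = h(\phi_t(y_0)) \tag{2}$$
for all $h \in C_b(E)$ and any fixed $t \ge 0$. In the above, $\phi$ is a space continuous semi-flow.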
Hypothesis 1 is commonly true in the context of fluid or mean field limits. The stronger convergence results in [18,16,24,6] imply that Hypothesis 1 is satisfied; we give a detailed example illustrating this in Section 5. Similarly, [2] gives very general conditions (called H1 to H5) that ensure convergence of a stochastic process to its mean field limit; under these conditions, Hypothesis 1 is automatically satisfied (the deterministic process ϕ is then an ODE). Note that the results in these references are stronger than what we require in Hypothesis 1; for example in [16] there is almost sure, uniform convergence for all t ∈ [0, T ], for any T ≥ 0; in [12] the convergence is on the set of trajectories.
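Under Hypothesis 1, $\phi$ is called the hydrodynamic limit or simply fluid limit of $Y^N$.
Proof. For every $y \in S^c$ let $\alpha$ be as in Eq. (6) and pick some $q(y) \in Q$ and $n(y) \in \mathbb{N}$ such that $d(y, q(y)) < \frac{1}{n(y)} < \alpha$. Thus $y \in B\left(q(y), \frac{1}{n(y)}\right)$ and $\Pi\left(B\left(q(y), \frac{1}{n(y)}\right)\right) = 0$. Let $F = \bigcup_{y \in S^c} \{(q(y), n(y))\}$. $F \subset Q \times \mathbb{N}$, thus $F$ is enumerable and
$$S^c \subset \bigcup_{(q, n) \in F} B\left(q, \tfrac{1}{n}\right), \quad \text{so that} \quad \Pi(S^c) \le \sum_{(q, n) \in F} \Pi\left(B\left(q, \tfrac{1}{n}\right)\right) = 0.$$
Note that it follows that a semi-flow that does not have any stationary point cannot be reversible under any probability.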

Stationary Behaviour of Fluid Limits of Reversible Processes
Theorem 2. Assume for every N the process Y N is reversible under some probability Π N . Assume the convergence Hypothesis 1 holds and that Π ∈ P(E) is a limit point (for weak convergence) of the sequence Π N . Then the fluid limit is reversible under Π. In particular, it follows from Theorem 1 that Π is concentrated on the set of stationary points S of the fluid limit ϕ.
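Proof. All we need to show is that $\Pi$ verifies Definition 4. Let $N_k$ be a subsequence such that $\lim_{k \to \infty} \Pi^{N_k} = \Pi$ in the weak topology on $\mathcal{P}(E)$. By Skorohod's representation theorem for Polish spaces [11, Thm 1.8], there exists a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which some random variables $X_k$ for $k \in \mathbb{N}$ and $X$ are defined such that $X_k$ has law $\Pi^{N_k}$, $X$ has law $\Pi$, and $X_k \to X$ $\mathbb{P}$-almost surely. Fix some $t \ge 0$ and $h \in C_b(E \times E)$, and define, for $k \in \mathbb{N}$ and $y \in E$,
$$a_k(y) = \mathbb{E}^{N_k}\left[h(y, Y^{N_k}(t)) \mid Y^{N_k}(0) = y\right], \qquad b_k(y) = \mathbb{E}^{N_k}\left[h(Y^{N_k}(t), y) \mid Y^{N_k}(0) = y\right].$$
Since $Y^{N_k}$ is reversible under $\Pi^{N_k}$, we have $\mathbb{E}[a_k(X_k)] = \mathbb{E}[b_k(X_k)]$. By Hypothesis 1, $a_k(X_k) \to h(X, \phi_t(X))$ $\mathbb{P}$-almost surely. Now $|a_k(X_k)| \le \|h\|_\infty$ and, thus, by dominated convergence:
$$\lim_{k \to \infty} \mathbb{E}[a_k(X_k)] = \mathbb{E}[h(X, \phi_t(X))] = \int_E h(y, \phi_t(y)) \, \Pi(dy)$$
and similarly for $b_k$. Thus
$$\int_E h(y, \phi_t(y)) \, \Pi(dy) = \int_E h(\phi_t(y), y) \, \Pi(dy).$$
In particular, if the semi-flow has a unique stationary point, we have:
Corollary 1. Assume that the hypotheses of Theorem 2 hold and that, in addition, (1) the sequence $(\Pi^N)_{N = 1, 2, \ldots}$ is tight and (2) the semi-flow $\phi$ has a unique stationary point $y^*$. It follows that the sequence $\Pi^N$ converges weakly to the Dirac mass at $y^*$.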
We leave the proof of the corollary to the reader (it follows in a classical way from compactness arguments; the tightness condition implies that the set {Π N , N = 1, 2, 3, ...} is relatively compact in P(E)).
Recall that tightness means that for every ε > 0 there is some compact set K ⊂ E such that Π N (K) ≥ 1 − ε for all N. If E is compact then (Π N ) N =1,2,... is necessarily tight, therefore condition 1 in the corollary is automatically satisfied. For mean field limits where E is the simplex in finite dimension, the corollary says that, if the pre-limit process is reversible, then the existence of a unique stationary point implies that the Dirac mass at this stationary point is the limit of the stationary probability of the pre-limit process.
Compare Corollary 1 to known results for the non reversible case [1]: there we need that the fluid limit ϕ has a unique limit point to which all trajectories converge. In contrast, here, we need a much weaker assumption, namely, the existence and uniqueness of a stationary point. It is possible for a semi-flow to have a unique stationary point, without this stationary point being a limit of all trajectories (for example because it is unstable, or because there are stable limit cycles as in [2]). In the reversible case, we do not need to show stability of the unique stationary point y * .
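To make the last remark concrete, here is a small Python sketch of a planar ODE with a unique stationary point that attracts no other trajectory. The vector field is a textbook example chosen for illustration (it is not the example of [2]): in polar coordinates it reads $\dot r = r(1 - r^2)$, $\dot\theta = 1$, so the origin is the only stationary point and the unit circle is a stable limit cycle. The function names and step size are assumptions of this sketch.

```python
import math


def field(x, y):
    """Planar vector field whose polar form is r' = r (1 - r^2), theta' = 1."""
    r2 = x * x + y * y
    return x - y - x * r2, x + y - y * r2


def rk4_step(x, y, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = field(x, y)
    k2 = field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = field(x + dt * k3[0], y + dt * k3[1])
    x_new = x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    y_new = y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x_new, y_new


def final_radius(x0, y0, t_end=30.0, dt=0.01):
    """Distance to the origin after integrating from (x0, y0) up to time t_end."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt)
    return math.hypot(x, y)


# Trajectories started near the stationary point, or far outside,
# both end up close to the unit circle rather than at the origin.
print(final_radius(0.01, 0.0))
print(final_radius(2.0, 0.0))
```

For such a flow, a stationary measure of the fluid limit can be carried by the limit cycle, which is the situation that reversibility rules out.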

Example: Crowd Dynamics
In this section we give an example to illustrate the application of Theorem 2; a detailed study of this example beyond the application of Theorem 2 is outside the scope of this paper. We consider the crowd dynamics model of [23]. The model captures the emergence of crowds in a city. A city is modelled as a fully connected bidirectional graph with $I$ vertices, every vertex representing a square, where bars are located. There is a fixed total population $N$. People spend some time in a square and once in a while decide to leave a square and move to some other square. The original model is in discrete time, and at every time slot, the probability that a tagged person present in square $i$ leaves square $i$ is assumed to be equal to $(1 - c)^{N_i(t) - 1}$, where $N_i(t)$ is the population of square $i$ at time $t$. In this equation, $c$ is the chat probability, and this model thus assumes that a person leaves a square when they have no one to chat with. The model also assumes that departure events are independent. When a person leaves a square $i$, they move to some other square $j$ according to Markov routing, with probability $Q_{i,j}$ given by
$$Q_{i,j} = \begin{cases} \dfrac{1}{d(i)} & \text{if } j \text{ is a neighbor of } i, \\ 0 & \text{otherwise,} \end{cases} \tag{12}$$
where $d(i)$ is the degree of node $i$, i.e. the person picks a neighboring square $j$ uniformly at random among all neighboring squares. In [23], the authors study by simulation the emergence of concentration in one square. They also show that for regular graphs (i.e. when all vertices have the same degree) there is a critical value $c^*$ such that for $c > c^*$ concentrations occur, whereas for $c < c^*$ the stationary distribution of people is uniform. The analysis is based on the study of stationary points for the empirical distribution. Note that, as mentioned in the introduction, the analysis with stationary points may, in general, miss the main part of the stationary distribution, and it is quite possible that the stationary distribution is not concentrated on stationary points (for example if there is a limit cycle [2]).
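The discrete-time dynamics described above can be sketched in a few lines of Python. The sketch assumes that a person in a square holding $n$ people leaves with probability $(1 - c)^{n - 1}$, the form consistent with the scaling used later in this section, and that all departures within a slot are decided from the populations at the start of the slot. The helper names, the graph and the parameter values are illustrative assumptions, not those of [23].

```python
import random


def routing_matrix(neighbors):
    """Q[i][j] = 1/d(i) if j is a neighbor of i, and 0 otherwise."""
    size = len(neighbors)
    Q = [[0.0] * size for _ in range(size)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            Q[i][j] = 1.0 / len(nbrs)
    return Q


def step(pop, neighbors, c, rng):
    """One time slot: each person in square i leaves with prob. (1 - c)^(n_i - 1)."""
    new_pop = list(pop)
    for i, n_i in enumerate(pop):
        if n_i == 0:
            continue
        p_leave = (1.0 - c) ** (n_i - 1)  # assumed departure probability
        for _ in range(n_i):
            if rng.random() < p_leave:
                j = rng.choice(neighbors[i])  # uniform choice among neighbors
                new_pop[i] -= 1
                new_pop[j] += 1
    return new_pop


# Illustrative run: complete graph on 4 squares, 20 people, chat probability 0.1.
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
rng = random.Random(0)
pop = [5, 5, 5, 5]
for _ in range(200):
    pop = step(pop, neighbors, 0.1, rng)
print(pop, sum(pop))
```

Only the total population is guaranteed to be preserved by this sketch; whether a concentration appears in a given run depends on the parameters and on the random seed.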
A fluid flow approximation is proposed in [21], and similar results are found.
To understand whether the stationary point analysis is justified, we study the large $N$ asymptotics for an appropriately rescaled version. To avoid unnecessary complications, we replace the original model of [23], which is in discrete time, by its continuous time counterpart. The probability that a tagged person present in square $i$ leaves square $i$ in a time slot is replaced by the rate of service given by
$$\mu^N(N_i) = (1 - c)^{N_i - 1},$$
i.e. the probability that a tagged person present in square $i$ leaves the square in the next $dt$ seconds is $\mu^N(N_i) \, dt + o(dt)$.
The corresponding continuous process $X^N(t) = (N_1(t), \ldots, N_I(t))$ is a Markov process on an enumerable state space. More precisely, it is a queuing network of infinite server stations, with state dependent service rate and with Markov routing. It follows from classical results on quasi-reversibility that it has product-form (see for example [19, Chapter 8]), i.e. it is ergodic (since the graph of squares is fully connected and the population is finite) and its stationary probability is given, for every $(n_1, \ldots, n_I) \in \mathbb{N}^I$ such that $n_1 + \cdots + n_I = N$, by
$$p^N(n_1, \ldots, n_I) = \eta^N \prod_{i=1}^{I} f^N_i(n_i). \tag{14}$$
In this formula, $\eta^N$ is a normalizing constant,
$$f^N_i(n) = \frac{\theta_i^{\,n}}{n! \, \prod_{m=1}^{n} \mu^N(m)},$$
and $\theta$ is the stationary probability of the routing matrix $Q$, i.e.
$$\theta_i = \frac{d(i)}{\sum_{j=1}^{I} d(j)}. \tag{15}$$
To apply Theorem 2, we need to show that $X^N$ is reversible. A product-form queuing network is, in general, not reversible. However, it is so if the Markov routing chain is reversible [20], which is the case here.
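The product form and its reversibility can be checked numerically on a small instance. The Python sketch below assumes the standard form of the weights for infinite-server stations in which each of the $n$ people present leaves at rate $\mu(n) = (1 - c)^{n - 1}$, namely $f_i(n) = \theta_i^n / (n! \prod_{m \le n} \mu(m))$; this form, the function names and the parameter values are assumptions of the sketch. It enumerates all states for a complete graph and measures the largest violation of detailed balance between states that differ by one move.

```python
from itertools import product
from math import factorial


def mu(n, c):
    """Assumed per-person departure rate when n people share a square."""
    return (1.0 - c) ** (n - 1)


def product_form(I, N, c, theta):
    """Normalised weights prod_i theta_i^n_i / (n_i! * prod_{m <= n_i} mu(m))."""
    def f(i, n):
        w = theta[i] ** n / factorial(n)
        for m in range(1, n + 1):
            w /= mu(m, c)
        return w

    weights = {}
    for state in product(range(N + 1), repeat=I):
        if sum(state) != N:
            continue
        w = 1.0
        for i, n in enumerate(state):
            w *= f(i, n)
        weights[state] = w
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


def max_detailed_balance_gap(pi, Q, c):
    """Largest |pi(n) q(n, n') - pi(n') q(n', n)| over single moves i -> j."""
    I = len(Q)
    gap = 0.0
    for state, p in pi.items():
        for i in range(I):
            if state[i] == 0:
                continue
            for j in range(I):
                if i == j or Q[i][j] == 0.0:
                    continue
                target = list(state)
                target[i] -= 1
                target[j] += 1
                target = tuple(target)
                forward = p * state[i] * mu(state[i], c) * Q[i][j]
                backward = pi[target] * target[j] * mu(target[j], c) * Q[j][i]
                gap = max(gap, abs(forward - backward))
    return gap


# Illustrative instance: 3 squares, complete graph, 6 people, c = 0.3.
Q3 = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
pi3 = product_form(3, 6, 0.3, [1 / 3, 1 / 3, 1 / 3])
print(max_detailed_balance_gap(pi3, Q3, 0.3))
```

A gap at the level of floating-point rounding indicates that, on this instance, the product form is not only invariant but satisfies detailed balance.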
Theorem 3. For every N, the process X N is reversible.
Proof. Take θ given by Eq. (15). Then θ i Q i,j = θ j Q j,i for any pair (i, j), thus the Markov chain with transition matrix Q given by Eq. (12) is reversible. By [20], it follows that the product-form queuing network X N is reversible.
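The identity used in this proof is easy to verify numerically. The Python sketch below assumes that $\theta$ is taken proportional to the vertex degrees, the usual reversible measure of a uniform random walk on an undirected graph, and checks $\theta_i Q_{i,j} = \theta_j Q_{j,i}$ on a small non-regular graph. The graph and the helper names are illustrative assumptions.

```python
def routing_matrix(neighbors):
    """Q[i][j] = 1/d(i) if j is a neighbor of i, and 0 otherwise."""
    size = len(neighbors)
    Q = [[0.0] * size for _ in range(size)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            Q[i][j] = 1.0 / len(nbrs)
    return Q


def degree_theta(neighbors):
    """Probability vector proportional to the vertex degrees (assumed form)."""
    total = sum(len(nbrs) for nbrs in neighbors)
    return [len(nbrs) / total for nbrs in neighbors]


def balance_gap(theta, Q):
    """Largest |theta_i Q[i][j] - theta_j Q[j][i]| over all pairs (i, j)."""
    size = len(Q)
    return max(
        abs(theta[i] * Q[i][j] - theta[j] * Q[j][i])
        for i in range(size)
        for j in range(size)
    )


# Path graph 0 - 1 - 2 - 3: bidirectional edges, degrees (1, 2, 2, 1).
path = [[1], [0, 2], [1, 3], [2]]
Q_path = routing_matrix(path)
theta_path = degree_theta(path)
print(theta_path, balance_gap(theta_path, Q_path))
```

The check relies on the graph being bidirectional: if $j$ is a neighbor of $i$ then $i$ is a neighbor of $j$, so both sides equal one over the sum of the degrees.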
In [8], it is suggested to scale the chat probability as
$$c = \frac{s}{N} \tag{16}$$
in order to account for the fact that, for large populations, meetings tend to be limited by space or size of the friend's group. We use this scaling law and consider the re-scaled process $Y^N$ of occupancy measures, i.e. given by
$$Y^N(t) = \frac{1}{N} X^N(t).$$
Obviously, for every $N$ the process $Y^N$ is Markov and is reversible. Further, it converges to an ODE, as we see next.
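To establish the convergence of $Y^N$, we compute its drift, defined for every possible value $y$ of $Y^N(t)$ by
$$V^N(y) = \lim_{dt \to 0} \frac{1}{dt} \, \mathbb{E}\left[Y^N(t + dt) - Y^N(t) \mid Y^N(t) = y\right].$$
When the occupancy measure is $y$, there are $N_i = N y_i$ people in square $i$, and the rate of departure from square $i$ is $N_i \mu^N(N_i) = N y_i \mu^N(N y_i)$; the delta to the occupancy measure due to one person moving from square $i$ to square $j$ is $\frac{-e_i + e_j}{N}$, where $e_i$ is the row vector with a 1 in position $i$ and 0 elsewhere. Therefore
$$V^N(y) = \sum_{i,j} y_i \, \mu^N(N y_i) \, Q_{i,j} \, (-e_i + e_j).$$
Taking into account Eq. (16), we obtain
$$V^N(y) = \sum_{i,j} y_i \left(1 - \frac{s}{N}\right)^{N y_i - 1} Q_{i,j} \, (-e_i + e_j).$$
Let $\Delta_I$ denote the simplex, i.e.
$$\Delta_I = \left\{ y \in \mathbb{R}^I, \; y_i \ge 0 \text{ for all } i \text{ and } \sum_{i=1}^{I} y_i = 1 \right\}.$$
When $N \to \infty$, $V^N(y)$ converges for every $y \in \Delta_I$ to
$$V(y) = \sum_{i,j} y_i \, e^{-s y_i} \, Q_{i,j} \, (-e_i + e_j).$$
This suggests that the fluid limit of $Y^N$, if it exists, would be the deterministic process $y(t)$, with sample paths in $\mathbb{R}^I$, obtained as solution of the ODE $\frac{dy}{dt} = V(y)$. As we show next, this is indeed the case and follows from "Kurtz's theorem" [24, Theorem 9.2.1]. Before that, we rewrite the ODE more explicitly as
$$\frac{dy_i}{dt} = -y_i \, e^{-s y_i} + \sum_{j} Q_{j,i} \, y_j \, e^{-s y_j}, \qquad i = 1, \ldots, I. \tag{21}$$
Theorem 4. Assume that the initial condition $y^N_0$ of the process $Y^N$ is deterministic and converges to some $y_0 \in \Delta_I$. Let $\phi$ be the semi-flow defined by the ODE (21), i.e. $\phi_t(y_0)$ is the solution of the ODE (21) with initial condition $y(0) = y_0$ (this solution exists and is unique by the Cauchy-Lipschitz theorem). Then for each $T > 0$ and $\epsilon > 0$:
$$\lim_{N \to \infty} \mathbb{P}^N\left( \sup_{0 \le t \le T} \left\| Y^N(t) - \phi_t(y_0) \right\| > \epsilon \right) = 0 \tag{22}$$
(the notation $\|\cdot\|$ stands for any norm on $\mathbb{R}^I$). It follows that Hypothesis 1 is verified.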
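Proof. We apply Theorem 9.2.1 in [24]. We need to find a sequence of numbers $\delta_N \to 0$ such that the following three conditions hold:
$$\lim_{N \to \infty} \sup_{y \in \Delta^N_I} \left\| V^N(y) - V(y) \right\| = 0 \tag{23}$$
$$\sup_{N} \sup_{y \in \Delta^N_I} A^N(y) < \infty \tag{24}$$
$$\lim_{N \to \infty} \sup_{y \in \Delta^N_I} A^N_{\delta_N}(y) = 0 \tag{25}$$
In the above, $\Delta^N_I$ is the set of feasible states of $Y^N$, i.e. the set of $y \in \Delta_I$ such that $N y$ is integer, $A^N(y)$ is the expected norm of jump per time unit, and $A^N_{\delta_N}(y)$ is the absolute expected norm of jump per time unit due to jumps travelling further than $\delta_N$.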
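We now show Eq. (23). First consider the case $y \in \Delta^N_I$ such that $y_i > 0$ (thus we have $1/N \le y_i \le 1$). We apply the inequality $\left| e^{-x} - e^{-x'} \right| \le |x - x'|$, valid for $x \ge 0$ and $x' \ge 0$, to $x = -(N y_i - 1) \log\left(1 - \frac{s}{N}\right)$, $x' = s y_i$ and obtain:
$$\left| \left(1 - \frac{s}{N}\right)^{N y_i - 1} - e^{-s y_i} \right| \le \left| (N y_i - 1) \log\left(1 - \frac{s}{N}\right) + s y_i \right|.$$
The right-hand side is convex in $y_i$, thus its maximum for $y_i \in [1/N, 1]$ is obtained at one end of the interval. Thus
$$\left| \left(1 - \frac{s}{N}\right)^{N y_i - 1} - e^{-s y_i} \right| \le a_N(s) := \max\left( \frac{s}{N}, \; \left| (N - 1) \log\left(1 - \frac{s}{N}\right) + s \right| \right).$$
Second, multiply by $y_i$ and note that $y_i \le 1$; it follows that, whenever $y \in \Delta^N_I$ and $y_i > 0$:
$$\left| y_i \left(1 - \frac{s}{N}\right)^{N y_i - 1} - y_i \, e^{-s y_i} \right| \le a_N(s)$$
and this is also obviously true if $y_i = 0$. It follows that
$$\left\| V^N(y) - V(y) \right\| \le a_N(s) \sum_{i,j} Q_{i,j} \left\| -e_i + e_j \right\|,$$
from which Eq. (23) follows since $\lim_{N \to \infty} a_N(s) = 0$.
We now show Eq. (24). We take the sup norm on $\mathbb{R}^I$ so that $\left\| -e_i + e_j \right\| = 1$ for $i \ne j$; thus
$$A^N(y) = \sum_{i,j} N y_i \, \mu^N(N y_i) \, Q_{i,j} \, \frac{\left\| -e_i + e_j \right\|}{N} \le \sum_{i} y_i = 1.$$
Moreover, every jump has norm $1/N$, so that $A^N_{\delta_N}(y) = 0$ as soon as $\delta_N > 1/N$ (for instance $\delta_N = 2/N$), which trivially shows Eq. (25). By Theorem 9.2.1 in [24], this establishes Eq. (22). It follows (as a much weaker convergence) that for any fixed $T$, $Y^N(T)$ converges in probability to the deterministic $\phi_T(y_0)$. Thus there is also convergence in law, since convergence in probability to a deterministic variable implies convergence in distribution, i.e. Eq. (2) in Hypothesis 1 is verified. It remains to see that $\phi_t$ is space continuous: this follows from the fact that the right-hand side of the ODE is Lipschitz continuous and from the Cauchy-Lipschitz theorem.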
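The uniform convergence of the drift can also be observed numerically. The Python sketch below assumes $\mu^N(n) = (1 - s/N)^{n - 1}$, takes $a_N(s)$ to be the larger of the two endpoint values of the convex bound (one natural choice consistent with the argument above), and uses a complete graph on $I = 3$ squares with the sup norm, for which $\sum_{i,j} Q_{i,j} \| -e_i + e_j \| = I$. All function names and parameter values are assumptions of this sketch.

```python
import math
from itertools import product


def complete_graph_Q(I):
    """Routing matrix of the complete graph: uniform over the other squares."""
    return [[0.0 if i == j else 1.0 / (I - 1) for j in range(I)] for i in range(I)]


def drift_N(counts, s, Q):
    """Pre-limit drift V^N(y) at y = counts / N, with mu^N(n) = (1 - s/N)^(n-1)."""
    N, I = sum(counts), len(counts)
    v = [0.0] * I
    for i, n_i in enumerate(counts):
        if n_i == 0:
            continue
        flow = (n_i / N) * (1.0 - s / N) ** (n_i - 1)
        for j in range(I):
            v[i] -= flow * Q[i][j]
            v[j] += flow * Q[i][j]
    return v


def drift_limit(y, s, Q):
    """Limiting drift V(y), with per-person departure rate exp(-s * y_i)."""
    I = len(y)
    v = [0.0] * I
    for i in range(I):
        flow = y[i] * math.exp(-s * y[i])
        for j in range(I):
            v[i] -= flow * Q[i][j]
            v[j] += flow * Q[i][j]
    return v


def a_N(N, s):
    """Assumed bound: maximum of the convex right-hand side at y_i = 1/N and 1."""
    return max(s / N, abs((N - 1) * math.log(1.0 - s / N) + s))


def sup_gap(N, s, Q):
    """Sup over feasible states of the sup-norm distance between V^N and V."""
    I = len(Q)
    worst = 0.0
    for head in product(range(N + 1), repeat=I - 1):
        last = N - sum(head)
        if last < 0:
            continue
        counts = list(head) + [last]
        y = [n / N for n in counts]
        vn, v = drift_N(counts, s, Q), drift_limit(y, s, Q)
        worst = max(worst, max(abs(a - b) for a, b in zip(vn, v)))
    return worst


Q = complete_graph_Q(3)
for N in (10, 40, 160):
    print(N, sup_gap(N, 2.0, Q), 3 * a_N(N, 2.0))
```

The second printed column should stay below the third, and both shrink as $N$ grows, which is what Eq. (23) requires.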
It follows from Theorem 2 that any limit point of the stationary probability Eq. (14) is concentrated on the stationary points of the ODE (21). This justifies a posteriori the method in [23], which looked only at stationary points.
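As an illustration, the ODE can be integrated numerically and its stationary points checked directly. The Python sketch below assumes the limiting drift $V(y) = \sum_{i,j} y_i e^{-s y_i} Q_{i,j} (-e_i + e_j)$ on a complete graph with $I = 3$ squares, and uses a classical Runge-Kutta scheme; the function names, the value $s = 1$ and the initial condition are assumptions of this sketch.

```python
import math


def complete_graph_Q(I):
    """Routing matrix of the complete graph: uniform over the other squares."""
    return [[0.0 if i == j else 1.0 / (I - 1) for j in range(I)] for i in range(I)]


def V(y, s, Q):
    """Assumed right-hand side of the fluid ODE."""
    I = len(y)
    out = [0.0] * I
    for i in range(I):
        flow = y[i] * math.exp(-s * y[i])  # mass leaving square i per time unit
        for j in range(I):
            out[i] -= flow * Q[i][j]
            out[j] += flow * Q[i][j]
    return out


def rk4_step(y, s, Q, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = V(y, s, Q)
    k2 = V([a + 0.5 * dt * b for a, b in zip(y, k1)], s, Q)
    k3 = V([a + 0.5 * dt * b for a, b in zip(y, k2)], s, Q)
    k4 = V([a + dt * b for a, b in zip(y, k3)], s, Q)
    return [
        a + dt * (b + 2 * c + 2 * d + e) / 6
        for a, b, c, d, e in zip(y, k1, k2, k3, k4)
    ]


def integrate(y0, s, Q, t_end, dt=0.01):
    """Approximate phi_t(y0) at t = t_end."""
    y = list(y0)
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(y, s, Q, dt)
    return y


Q = complete_graph_Q(3)
uniform = [1 / 3, 1 / 3, 1 / 3]
print(V(uniform, 1.0, Q))                      # drift at the uniform point
print(integrate([0.7, 0.2, 0.1], 1.0, Q, 40.0))
```

With $s = 1$ the map $y \mapsto y e^{-s y}$ is increasing on $[0, 1]$, so on the complete graph the largest coordinate decreases and the smallest increases; the trajectory is therefore expected to approach the uniform point. For larger $s$ no such conclusion should be drawn from this sketch.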
For the case of a regular graph (this is the case studied analytically in [23]), the stationary points can be obtained explicitly (Theorem 6.1 in [8]). In particular, there is a critical value $s^*$ below which there is only one stationary point, equal to the uniform distribution $y^* = \left(\frac{1}{I}, \ldots, \frac{1}{I}\right)$, and above which there are other stationary points. The critical value is given in [8]; its expression involves the function $\varphi(x) \overset{\text{def}}{=} -W_0(-x e^{-x})$, $W_0$ being the Lambert-W function of index 0. For example, for $I = 3$, the critical value is $s^* \approx 2.7456$.
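The function $\varphi$ is easy to evaluate without special libraries. The Python sketch below computes the principal branch $W_0$ on $[-1/e, 0]$ by bisection, using the fact that $w \mapsto w e^w$ is increasing on $[-1, 0]$; the function names and the bisection approach are assumptions of this sketch, and the formula for $s^*$ itself is not reproduced here.

```python
import math


def lambert_w0(a, iterations=200):
    """Principal branch W_0(a) for a in [-1/e, 0], by bisection on w * exp(w) = a."""
    if not (-1.0 / math.e - 1e-15 <= a <= 0.0):
        raise ValueError("this sketch only handles a in [-1/e, 0]")
    lo, hi = -1.0, 0.0  # w * exp(w) is increasing on this interval
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi(x):
    """phi(x) = -W_0(-x exp(-x)) for x >= 0."""
    return -lambert_w0(-x * math.exp(-x))


# For x <= 1, phi(x) = x; for x > 1, phi(x) is the other solution z < 1
# of z * exp(-z) = x * exp(-x).
z = phi(2.0)
print(phi(0.5), z, z * math.exp(-z), 2.0 * math.exp(-2.0))
```

In other words, $\varphi$ pairs each $x > 1$ with the point $z < 1$ at which $y \mapsto y e^{-y}$ takes the same value.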
We can apply Corollary 1: since the state space E = ∆ I is compact, it follows that for s < s * , the stationary distribution given by Eq. (14), re-scaled by 1/N, converges as N → ∞ to the uniform distribution. This illustrates the interest of the reversibility results in this paper: we do not need to show that all trajectories converge to the single stationary point; its uniqueness and the reversibility argument are sufficient.