Opinion dynamics with Lotka-Volterra type interactions

We investigate a class of models for opinion dynamics in a population with two interacting families of individuals. Each family has an intrinsic mean-field "voter-like" dynamics which is influenced by the interaction with the other family. The interaction terms describe a cooperative/conformist or competitive/nonconformist attitude of one family with respect to the other. We prove propagation of chaos, i.e., we show that on any time interval [0, T], as the size of the system goes to infinity, each individual behaves independently of the others, with transition rates driven by a macroscopic equation. We focus in particular on models with Lotka-Volterra type interactions, i.e., models with a cooperative family and a competitive one. For these models, although the microscopic system is driven a.s. to consensus within each family, a periodic behaviour arises on the macroscopic scale. In order to describe fluctuations between the limiting periodic orbits, we identify a slow variable in the microscopic system and, through an averaging principle, we find a diffusion which describes the macroscopic dynamics of this variable on a larger time scale.


Introduction
A frequent phenomenon observed in social communities is the emergence of self-organized behaviours. In many large communities of randomly interacting individuals, such behaviours appear on a macroscopic scale and seem to follow an independent rule, namely, each individual in the community feels the influence of the others through one or more macroscopic variables whose time evolution is deterministic. To a first approximation, one can assume that members of a social community are described by identical units that evolve randomly in time, choosing their actions from a set of possible "states" and interacting with their "neighbours". This assumption has motivated the interest in describing social systems with models based on a statistical physics approach. An introduction to the most popular of these models, with a general discussion on the usefulness of ideas and tools of statistical physics in the description of social dynamics, can be found in [6]. Typical questions for these models concern their behaviour when the size of the population or time becomes large. Within this context, the field of opinion dynamics models is extremely vast and has attracted researchers from different areas, such as social science, physics, computer science and mathematics. All these models differ from one another depending on the set of possible opinions, the structure of the underlying social network and the interaction mechanism between members of the population. Without claiming to give a complete description of such a wide field, we limit ourselves to mentioning here a few standard examples coming from the classes of discrete and continuous opinion dynamics. According to social scientists (see, e.g., [4], [14], [10]), two fundamental characteristics in opinion formation are social influence, i.e. the tendency of each individual to adjust her opinion to those of her neighbours, and homophily, i.e. 
the tendency to interact more frequently with individuals who are more similar. In dichotomic models, opinions are binary and social influence is usually described in terms of an attractive interaction between agents. A basic example is the voter model [18], where each agent, at random times, adopts the opinion of an agent who is randomly chosen from the set of her neighbours. A similar mechanism holds for the Axelrod model ([4], [21]), where opinions are vector valued (with entries belonging to a finite set) and an agent interacts with one neighbour by copying one of the entries of her opinion. In continuous dynamics models, opinions are represented by points in a subset of R^d and each agent may adjust her opinion by adopting a weighted average of her own and one (or more) neighbour's opinion. Examples of such models are the Deffuant-Weisbuch ([1], [16]) and the Hegselmann-Krause [17] models. In the Axelrod and Deffuant-Weisbuch models, the mechanism of homophily is introduced as follows: two agents interact only if their "cultural distance", i.e. the distance between the vectors representing their opinions (given by the discrete L^1 distance for the Axelrod model and the Euclidean distance for the Deffuant-Weisbuch model), does not exceed a certain threshold. Models with this feature are known as bounded confidence models (see [22] for a survey; see also [9] for models with heterogeneous populations). With this mechanism, convergence to consensus, which typically occurs when social influence is present, may fail, yielding phenomena such as polarization or fragmentation of opinions within the population. A way of describing homophily in dichotomic opinion models is the introduction of some form of inhomogeneity in the population. For example, one may assume that individuals in the population have different cultural traits, which affect the way one agent's opinion is influenced by the opinions of other agents (see, e.g., the models considered in [7]).
In this paper we consider a dichotomic opinion model where the population is divided into two social groups, each one characterized by its attitude with respect to the other. Members of the same group interact with each other, while the other group exerts on them a social influence, which may also be null or even negative. We assume that the cultural characteristics of an individual do not change with her opinions. The model is defined as an interacting particle system with quenched disorder taking values in {0, 1}^N, where N is the size of the population, and can be informally described as follows. A population is divided into two families of individuals that may have one of two possible opinions (labelled 0 and 1) on a certain subject. For i = 1, 2, an individual of family i chooses at random one member of the population, and interaction occurs only if such a member belongs to her family: then, the decision to adopt the opinion of her neighbour is amplified or damped by a perceived utility, which is a (strictly positive) function φ_i of the fraction of individuals with the same opinion in the other community.
The derivative of such functions may be interpreted as a measure of the social influence of one community with respect to the other. For example, an increasing function describes a "cooperative" attitude, while a decreasing one corresponds to a "competitive" attitude. A zealot family may be represented by a constant φ_i, or one whose derivative is close to zero. Other classes of functions can be considered; for example, the attitude of one family could change from competitive to cooperative if consensus on a given opinion becomes widespread in the other community. Notice that this system has four absorbing states, corresponding to the configurations where each of the two families has reached consensus. We consider the mean field variables m^N_i = {m^N_i(t)}_{t≥0}, i = 1, 2, where m^N_i(t) denotes the fraction of agents with opinion 1 in family i at time t, and we show that they satisfy a law of large numbers: for large N, the behaviour of such variables is described by a macroscopic deterministic equation. Then we prove propagation of chaos, i.e., we show that, for large populations, any finite set of particles evolves as an independent family with jump rates driven by the macroscopic mean field variables.
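The informal mechanism above can be turned into a small simulation sketch. Everything below is illustrative: the helper `simulate` is hypothetical, the flip rate (own-family fraction holding the target opinion, multiplied by φ_k evaluated at the other family's fraction holding it) is our reading of the verbal description rather than the paper's displayed rates, the linear choices of φ_1, φ_2 are just one admissible pair, and the family split is frozen to one quenched realization.

```python
import random

def simulate(N=200, r1=0.5, T=50.0, seed=0,
             phi1=lambda z: 0.6 + 0.3 * z,    # increasing: "cooperative" (assumed)
             phi2=lambda z: 0.9 - 0.3 * z):   # decreasing: "competitive" (assumed)
    """Gillespie-type simulation of the two-family opinion model (a sketch).

    A family-k agent currently in state s flips to 1-s at rate
    (fraction of her own family in state 1-s) * phi_k(fraction of the
    OTHER family in state 1-s).  This rate is an assumption
    reconstructing the verbal description, not the paper's rates (1).
    """
    rng = random.Random(seed)
    n1 = max(1, min(N - 1, round(r1 * N)))   # fixed quenched family split
    fam = [1] * n1 + [2] * (N - n1)
    op = [rng.randint(0, 1) for _ in range(N)]
    n = {1: n1, 2: N - n1}
    ones = {k: sum(op[i] for i in range(N) if fam[i] == k) for k in (1, 2)}
    phi = {1: phi1, 2: phi2}
    t = 0.0
    while t < T:
        rates = []
        for i in range(N):
            k, kbar, target = fam[i], 3 - fam[i], 1 - op[i]
            own = (ones[k] if target else n[k] - ones[k]) / n[k]
            other = (ones[kbar] if target else n[kbar] - ones[kbar]) / n[kbar]
            rates.append(own * phi[k](other))
        R = sum(rates)
        if R == 0.0:                 # both families at consensus: absorbed
            break
        t += rng.expovariate(R)      # exponential waiting time
        u, i = rng.random() * R, 0
        while i < N - 1 and u >= rates[i]:
            u -= rates[i]
            i += 1
        ones[fam[i]] += 1 - 2 * op[i]
        op[i] = 1 - op[i]
    return ones[1] / n[1], ones[2] / n[2]    # (m_1^N, m_2^N) at final time
```

Running `simulate()` returns the two mean field variables at the final time; for moderate N one typically observes long excursions of the pair of opinion fractions before absorption.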
We are mainly interested in the case of a cooperative family interacting with a competitive one. For this model, the microscopic interaction between individuals of the two families is a generalization of the classical Lotka-Volterra interaction, where the utility functions are linear. In particular, the macroscopic system evolves through periodic orbits and we are able to identify a quantity H that is conserved along such orbits. Stochastic Lotka-Volterra models (see, e.g., [20] and the references therein) have been introduced to study extinction in predator-prey models. Indeed, the deterministic models exhibit a cyclic behaviour and extinction is never achieved, while the introduction of noise drives the system towards extinction. However, in real social interactions extinction of a given opinion rarely occurs, so we adopt the opposite viewpoint: we give a stochastic microscopic description of a bipartite particle system with "predator-prey" type interactions. Such a system converges a.s. to a configuration where all the members of the same family share the same opinion. On the other hand, letting the size of the population grow to infinity, we obtain a deterministic Lotka-Volterra type dynamics as the result of a law of large numbers. The emergence of orbitally stable solutions in the macroscopic dynamics suggests that the microscopic system spends a considerably long time close to these sets. A one dimensional analogue of this scenario is given, for example, by the epidemic model considered in [13], where the authors show that the macroscopic equation has a stable fixed point close to which the microscopic system spends a time that is exponential in the size of the population. 
Thus, we consider the microscopic counterpart of the quantity H and, through a change of variables, we represent the microscopic system by means of an "action-angle" pair (H^N, Θ^N). Then, in order to study how the system fluctuates between the mean field periodic orbits before reaching its absorbing set, we speed up the dynamics and consider the process (H^N(Nt), Θ^N(Nt)). Following the approach of [11], where a two-population Curie-Weiss model is considered, we prove an averaging principle, extending their result to the case when the velocity of the fast variable is not constant. From such a principle we derive that, for large N, the dynamics of the pair (H^N, Θ^N) becomes essentially one dimensional, and we prove that the sped-up process H^N(N·) converges weakly, as N → ∞, to the solution of a stochastic differential equation. If we interpret as "more evolved" a population where two possible opinions coexist and have majorities that change over time, our model suggests that evolution is promoted by cultural diversity; but when the speed of interactions is large compared to the size of the population, convergence to consensus within one community is favoured, leading the system to a "less evolved" state.
2 The model and its mean-field approximation in the quenched regime

Microscopic system: In what follows, we fix two positive real functions φ_1, φ_2 of class C^2 on [0, 1]. Consider a filtered probability space (Ω, F, {F_t}, P) satisfying the usual conditions, on which is defined a family N = {N_{i,k}; k = 1, 2, i ≥ 1} of i.i.d. adapted Poisson random measures with intensity ℓ ⊗ ℓ, where ℓ denotes the restriction to [0, ∞) of the Lebesgue measure, and a probability space (Ω′, F′, ν) on which is defined a sequence of i.i.d. Bernoulli random variables {Φ_i; i ≥ 1} with values in {φ_1, φ_2} and ν{Φ_i = φ_1} = r_1 ∈ (0, 1). Our reference probability space will be (Ω̄, A, P̄), where Ω̄ = Ω′ × Ω, A = F′ ⊗ F and P̄ = ν ⊗ P. For a fixed integer N ≥ 2, we consider N interacting particles, each one assuming two possible values, 0 or 1. We denote by σ^N the particle configuration and by σ^N_i, i = 1, . . . , N, the state of particle i. To each particle i we assign a function Φ_i, which is randomly chosen from {φ_1, φ_2}, so that particles are divided into two (random) families, which we call 1 and 2. Let M_i = {h : Φ_h = Φ_i} be the set of neighbours of particle i and M̄_i its complement in {1, . . . , N}. Then, conditionally on {Φ_i; i ≥ 1}, particle i jumps between states 0 and 1 with the rates (1), with the convention ∑_{j∈A} a_j = 0 if A = ∅. Since particles in the same family have the same jump rates, denoting, for k = 1, 2, by N_k = ∑_i 1_{{Φ_i = φ_k}} the number of particles in family k and, for j = 1, . . . , N_k, by σ^N_{j,k} the state of particle j in family k, for each realization of {Φ_i; i ≥ 1} we can write the jump rates for families 1 and 2 accordingly. We consider the system in the quenched regime, so that we have, ν-a.s., lim_{N→∞} N_k/N = r_k for k = 1, 2,
where r_2 = 1 − r_1. Moreover, since we want to study the system for large N, we can assume without loss of generality that N_1, N_2 > 0 for all N.
Let m^N_1, m^N_2 be the fractions of 1's in the first and second family, i.e., m^N_k = (1/N_k) ∑_{j=1}^{N_k} σ^N_{j,k}. We can rewrite the rates (1) in the form (4), where k̄ = 3 − k and k ∈ {1, 2}, so that the N-particle system in the quenched regime is described by the Markov process on {0, 1}^N whose generator (2) has the jump form L_N f(σ^N) = ∑_{k=1,2} ∑_{i=1}^{N_k} λ^N_{i,k}(σ^N)[f(σ^{N,i,k}) − f(σ^N)], where f : {0, 1}^N → R and σ^{N,i,k} denotes the configuration obtained from σ^N by replacing σ^N_{i,k} with 1 − σ^N_{i,k}. Note that the process has four absorbing states, corresponding to the configurations where all the particles within a given family share the same state. In the language of opinion dynamics, such configurations are usually called "consensus", when all the particles in the population share the same state, or "polarization" otherwise.
In what follows, we shall use the bold notation σ^N = {σ^N(t)}_{t≥0} to denote a Markov process with generator (2). We denote by σ^N_{i,k}(t) the state of particle i of family k at time t and by m^N_k(t) the corresponding fraction of 1's. The process σ^N can be realized on (Ω, F, {F_t}_t, P) as the solution of the SDE (3). Remark 1. Note that the functions ψ and λ^N are uniformly bounded. Moreover, since ψ and the φ_k are Lipschitz functions, the following Lipschitz condition holds: where ‖·‖ denotes the L^1 norm on R^N and C is a suitable constant. Strong existence and uniqueness of solutions to (3), with any initial condition σ^N(0) independent of the family N, can be derived by adapting the proof of Theorem 1.2 in [15].
We then obtain a family of Markov processes {σ^N; N > 1}, where σ^N has sample paths in the space of càdlàg functions D([0, ∞), R^N) and generator given by L_N.
Macroscopic system: At a heuristic level, let us assume that a law of large numbers holds for {m^N_k; N > 1}, k = 1, 2, i.e., that it converges, as N → ∞, to a deterministic function m_k. Then, for large N, the system can be described by a macroscopic dynamics: if σ_k denotes the state of the "limit particle" of family k, we expect that it evolves as a time-inhomogeneous Markov process with the jump rates (5), where m_k(t) = E[σ_k(t)] for all t and the m_k satisfy a suitable evolution equation.
We can obtain such an equation using the generator of the Markov process above, which we denote by L. We obtain the system (6) for (m_1, m_2). The following proposition shows indeed that, as N → +∞, the sequence {(m^N_1, m^N_2); N > 1} converges in distribution to the deterministic process (m_1, m_2) described by equation (6).
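Since the displayed system (6) did not survive extraction, the sketch below integrates one plausible mean-field system of voter type with linear utilities. The right-hand side ṁ_k = ±a m_k(1 − m_k)(2m_k̄ − 1) is an assumption (the paper's equation (6) may carry additional constant factors such as r_1, r_2); it is chosen only to exhibit the closed orbits around (1/2, 1/2) discussed in section 4.

```python
def rhs(m1, m2, a=1.0):
    # Assumed Lotka-Volterra-type mean-field system: family 1 is
    # cooperative (rate increases with the other family's majority),
    # family 2 competitive.  Illustrative stand-in for equation (6).
    return (m1 * (1 - m1) * a * (2 * m2 - 1),
            -m2 * (1 - m2) * a * (2 * m1 - 1))

def rk4(m1, m2, dt=1e-3, steps=60000):
    """Classical fourth-order Runge-Kutta integration of the sketch ODE."""
    traj = [(m1, m2)]
    for _ in range(steps):
        k1 = rhs(m1, m2)
        k2 = rhs(m1 + dt / 2 * k1[0], m2 + dt / 2 * k1[1])
        k3 = rhs(m1 + dt / 2 * k2[0], m2 + dt / 2 * k2[1])
        k4 = rhs(m1 + dt * k3[0], m2 + dt * k3[1])
        m1 += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        m2 += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        traj.append((m1, m2))
    return traj

traj = rk4(0.7, 0.6)
# the orbit stays strictly inside (0,1)^2: neither opinion goes extinct
assert all(0 < u < 1 and 0 < v < 1 for u, v in traj)
```

The trajectory keeps returning close to its starting point, i.e. the assumed macroscopic dynamics is periodic, in contrast with the a.s. absorption of the microscopic system.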
Remark 2. The above computation shows that the system undergoes a Hopf bifurcation determined by the sign of φ′_1(r_2/2) φ′_2(r_1/2). When φ_1 and φ_2 are not both monotone, other equilibria may appear in the macroscopic equation and their stability depends locally on the sign of φ′_i, i = 1, 2. In particular, periodic orbits may be observed around different points of the phase space (see figure 2).

Propagation of chaos
In this section we prove propagation of chaos, i.e., that as N → ∞ particles of both families behave independently, each one evolving according to (5) with transition rates depending on the solution (m_1, m_2) of equation (6). To this purpose, we use a coupling technique following the approach of [15] and [2].
Definition 3.1. Let (E, d) be a Polish space, µ a probability measure on E and, for each N ≥ 1, let µ^N be a probability measure on E^N. For a fixed integer n, denote by µ^N_{1,...,n} the marginal distribution of µ^N over the first n components. The sequence {µ^N; N ≥ 1} is said to be µ-chaotic if each µ^N is permutation invariant and, for every n < N, the sequence {µ^N_{1,...,n}; N ≥ 1} converges weakly to the product measure µ^{⊗n} as N → ∞. We say that propagation of chaos holds for a sequence of random vectors {X^N; N ≥ 1}, where X^N takes values in E^N, if the sequence of their distributions is µ-chaotic for some probability measure µ on E.
A stronger notion of chaoticity uses convergence with respect to the Wasserstein distance, which implies weak convergence. Let M_1(E) be the set of probability measures on E with finite first moment. The Wasserstein metric on M_1(E) is defined by W_1(µ, ν) = inf { ∫_{E×E} d(x, y) π(dx, dy) : π has marginals µ and ν }.
For n ≥ 1 and T > 0, we call ρ_n and ρ_{n,T} the Wasserstein distances W_1 on M_1(R^n) and on M_1(D([0, T]; R^n)) respectively, where the underlying metrics are the L^1 metric ‖·‖ on R^n and the uniform metric ‖·‖_∞ on the Skorohod space of càdlàg functions D([0, T]; R^n).
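As a small concrete illustration of the metric underlying this stronger chaoticity notion, the hypothetical helper below computes W_1 in the special (one-dimensional) case of two empirical measures with the same number of atoms, where the optimal coupling is known to be the monotone one; this is only a sketch, not the path-space distance ρ_{n,T} used in the proofs.

```python
def wasserstein1(xs, ys):
    """W_1 between two empirical measures on R with equally many atoms.

    On the real line the optimal transport plan pairs the order
    statistics, so W_1 is the average gap between sorted samples.
    """
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# translating every atom by c moves the measure by exactly c in W_1
print(wasserstein1([0.0, 1.0, 2.0], [0.5, 1.5, 2.5]))  # 0.5
```

Note that two samples listing the same atoms in different orders represent the same measure and are at W_1 distance zero.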
We say that the sequence {µ^N; N ≥ 1} is µ-chaotic in the Wasserstein sense if {µ^N_{1,...,n}; N ≥ 1} converges to µ^{⊗n} with respect to the metric ρ_n (resp. ρ_{n,T}). Now, consider the following SDE on (Ω, F, {F_t}_t, P): with i = 1, . . . , N_k, k = 1, 2, k̄ = 3 − k, where (m_1, m_2) is the solution of equation (6), the Poisson random measures N_{i,k}, i = 1, . . . , N_k, k = 1, 2, are the same as in equation (3), and the jump rate function is given by the rates (5). The solution σ̄^N of equation (7) is a system of N = N_1 + N_2 particles evolving independently on {0, 1} with jump rates (5), and it is coupled with the solution σ^N of equation (3) through the random measures (N_{i,k}). Such a coupling allows us to prove propagation of chaos for the sequence {σ^N; N > 1}.
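The coupling idea of sharing the Poisson measures N_{i,k} between (3) and (7) can be mimicked in a toy one-family discrete-time dynamics: drive the N-particle system and the independent limit particles with the same uniform random variables, so that the two systems flip together whenever their rates agree. Everything here is illustrative: the helper `coupled_run` and the rate functions `phi`, `psi` are assumptions, not the paper's rates.

```python
import random

def phi(m):   # 0 -> 1 rate as a function of the mean field (assumed, increasing)
    return 0.5 + 0.5 * m

def psi(m):   # 1 -> 0 rate as a function of the mean field (assumed, decreasing)
    return 1.0 - 0.5 * m

def coupled_run(N=2000, T=5.0, dt=0.01, seed=3):
    """Interacting system vs. independent particles driven by the limit m(t),
    both fed the SAME uniforms (analogue of sharing the N_{i,k})."""
    rng = random.Random(seed)
    sigma = [0] * N        # interacting N-particle system
    sigma_bar = [0] * N    # independent particles driven by m(t)
    m = 0.0                # deterministic mean-field limit, Euler scheme
    for _ in range(round(T / dt)):
        mN = sum(sigma) / N
        for i in range(N):
            u = rng.random()                      # shared noise
            p_int = (phi(mN) if sigma[i] == 0 else psi(mN)) * dt
            p_lim = (phi(m) if sigma_bar[i] == 0 else psi(m)) * dt
            if u < p_int:
                sigma[i] = 1 - sigma[i]
            if u < p_lim:
                sigma_bar[i] = 1 - sigma_bar[i]
        m += dt * ((1 - m) * phi(m) - m * psi(m))
    gap = abs(sum(sigma) / N - m)                 # law of large numbers error
    mismatches = sum(s != sb for s, sb in zip(sigma, sigma_bar)) / N
    return gap, mismatches
```

For large N the empirical mean of the interacting system stays close to the deterministic limit, which is the mechanism the coupling estimate (8) quantifies.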
Let σ^N and σ̄^N be the solutions of the microscopic equation (3) and the macroscopic equation (7) respectively, with initial conditions σ^N_k(0) and σ̄^N_k(0), k = 1, 2, independent of the family of Poisson random measures N. Denote by µ_k the law of the limit particle of family k. Then, for k = 1, 2 and for any T > 0, the chaoticity bound (8) holds. To prove chaoticity in W_1 it is enough to prove that, for any T > 0 and i ∈ {1, . . . , N_k}, we have (8). For a shorter notation we write: For any t ≥ 0 we have: Taking the expectation on both sides, and recalling that the compensator of N_{i,k} is given by the Lebesgue measure, we obtain: Consider the integral in the expectation above and observe that: Then, from (9) we obtain: where the last equality holds by symmetry. Now, fix i_1 ∈ {1, . . . , N_1} and i_2 ∈ {1, . . . , N_2}. Using (10) and (11) we obtain: By the hypothesis on Λ^N_{i_h,h} and the law of large numbers for {m^N_h; N > 1}, choosing i_h = i for h = k and t = T in the above inequality, we obtain (8).
4 Cooperative vs. competitive families: fluctuations around the mean field limit

From now on we focus on the case φ′_1 φ′_2 < 0. Our aim in the next sections is to investigate how the microscopic dynamics fluctuates around its mean field approximation before it reaches its absorbing states. As observed in section 2, the macroscopic system has a Hamiltonian H that is conserved on the mean-field orbits C_k, k ∈ (−∞, 0). Then H may be considered as a radial coordinate, and we can change variables so as to represent the system through "action-angle" variables (H, Θ) (see [3]). Consider the macroscopic equation (6) for (m_1, m_2) ∈ (0, 1)^2. Even though our results will be proved for general monotonic functions φ_1, φ_2, let us restrict for the moment to a simpler case for which we can write explicit formulas. Set φ_1 and φ_2 linear as in (12), where a > 0 and b_1, b_2 are such that the two functions are positive. In this case ψ_1(z) = r_2 a(2z − 1) and ψ_2(z) = −r_1 a(2z − 1), and the Hamiltonian is given explicitly. We first change variables in order to shift the point (1/2, 1/2) to the origin; with this shift, (6) becomes (13). Taking the equivalent Hamiltonian e^{a^{−1} H(m_1,m_2)} and, with an abuse of notation, denoting it again by H, we consider the change of variables given by (14) and (15), where H ∈ (0, 1/16) and Θ ∈ R/2πZ. The derivative of Θ is computed from (14); recalling that x^2 ∈ (0, 1/4), we obtain the expression of Θ̇. Equation (13) in the new coordinates is thus given in terms of F(ϕ|m), Legendre's elliptic integral of the first kind with amplitude ϕ and parameter m, and of a constant k(Θ_0) which depends on the initial condition. The solution is then given by Θ(t) = −(1/2) am_{1−16H}(a r_1 r_2 t + k(Θ_0)), with am_m(u) denoting the inverse function of F(ϕ|m), known as the Jacobi amplitude function (see [24]).
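A quick numerical check of the conserved quantity can be done in the shifted coordinates. The vector field and the candidate "equivalent Hamiltonian" G(x, y) = (1/4 − x²)(1/4 − y²) below are assumptions: they correspond to the linear-utility system sketched earlier (without the constants r_1, r_2), chosen so that G takes values in (0, 1/16) as stated in the text.

```python
def vf(x, y, a=1.0):
    # assumed shifted system, x = m1 - 1/2, y = m2 - 1/2 (a sketch of (13))
    return (2 * a * y * (0.25 - x * x), -2 * a * x * (0.25 - y * y))

def G(x, y):
    # candidate equivalent Hamiltonian e^{H/a} (assumed explicit form)
    return (0.25 - x * x) * (0.25 - y * y)

x, y, dt = 0.2, 0.1, 1e-3
g0 = G(x, y)                      # initial level, should lie in (0, 1/16)
for _ in range(30000):            # RK4 over 30 time units
    k1 = vf(x, y)
    k2 = vf(x + dt / 2 * k1[0], y + dt / 2 * k1[1])
    k3 = vf(x + dt / 2 * k2[0], y + dt / 2 * k2[1])
    k4 = vf(x + dt * k3[0], y + dt * k3[1])
    x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    y += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
# G is constant along the flow up to integrator error
```

An exact computation confirms the conservation: dG/dt = −2xẋ(1/4 − y²) − 2yẏ(1/4 − x²) = 0 for the vector field above.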
Note that the angular velocity can be written as a function F(H, Θ) proportional to a r_1 r_2. The above representation suggests that the microscopic dynamics may be described in terms of a slow motion (of the microscopic variable corresponding to H) and a faster one (of the variable corresponding to Θ). In particular, assuming that the fast motion has an invariant distribution µ_x for each fixed value x of the slow component H, we expect that on the "larger" time scale on which the slow motion of H is observable, the fast variable Θ averages out. This means that on such a time scale, for N large enough, the dynamics becomes essentially one dimensional, being described by H, and its dependence on Θ should appear only through an integral with respect to the measure µ_H.

A change of variables for the microscopic system
In the light of what we have discussed above, we now give a new representation of the microscopic system by introducing two variables (H, Θ). The resulting Markovian dynamics has a generator whose form shows that these variables evolve on different time scales.
Let ϕ be the change of variables defined by ϕ(x, y) = (H(x, y), Θ(x, y)), where Θ is the function defined in (15). The process (H^N, Θ^N) has a generator of the form (16), where f is a C^3 function on (−1/ε, −ε′) × R/2πZ, lim_{N→∞} o(1) = 0 and a_H, a_HH, F, G, a_HΘ, a_ΘΘ are regular functions (at least C^1) obtained from the coefficients in formula (18) by taking x = x(h, θ), y = y(h, θ).
Proof. We recall that, for i, j = 1, 2: In what follows, for i, j = 1, 2, we use the notations: For N > 1, let (x^N, y^N) be the process defined accordingly, and let (H^N, Θ^N) be the Markov process defined by (H^N(t), Θ^N(t)) = ϕ(x^N(t ∧ τ_N), y^N(t ∧ τ_N)). Notice that (H^N, Θ^N) has as absorbing states all the points of the form (−∞, θ) with θ ∈ R/2πZ and the state corresponding to x = y = 0, which can be identified with the point (0, 0). Define, for ε, ε′ > 0, the stopping time τ^N_{−1/ε,−ε′}. Now, consider the stopped process (H^N(· ∧ τ^N_{−1/ε,−ε′}), Θ^N(· ∧ τ^N_{−1/ε,−ε′})) and let G^{ε,ε′}_N be the generator obtained from G_N by replacing 1_D with 1_{D_{ε,ε′}}. Let us apply G^{ε,ε′}_N to f ∘ ϕ, where f is a smooth function. Then, with the usual notation (·)_x := ∂/∂x for the partial derivatives, we can write: where lim_{N→∞} N R^{ε,ε′}_N(x, y) = 0. The above derivatives are computed explicitly, so that (17) can be written as follows: We rewrite (18) as: Using the inverse change of variables ϕ^{−1}(h, θ) = (x(h, θ), y(h, θ)), the above expression can be written in terms of the variables (h, θ). We pose a_H(h, θ) = a^{(h)}(ϕ^{−1}(h, θ)) and define analogously a_HH, a_Θ, a_HΘ, a_ΘΘ. Then, using (18), we can write the asymptotic expansion of the generator K^{ε,ε′}_N, where, to emphasize the presence of the term of order N in a_Θ, we write a_Θ(h, θ) = N(−F(h, θ) + o(1)) + G(h, θ).
Note that by (19) and the form of a_Θ we recover the macroscopic dynamics (6).

Main result
As can be seen from the coefficients in (16), the term of order N which appears in a_Θ indicates that the variable Θ^N evolves on a time scale faster than that of H^N. The goal of this subsection is to describe the macroscopic behaviour of the process {H^N(Nt)}_{t∈[0,T]} as N → ∞. Let us consider the generator (16) and change the time scale by multiplying it by N. We obtain the following expression: The next theorem shows that, as N → ∞, the process {H^N(Nt)}_{t∈[0,T]} behaves like the solution of a stochastic differential equation; the coefficients of such an equation are averages with respect to the invariant distribution for the macroscopic dynamics of the variable Θ.
The theorem will be proved in subsection 4.4, using the results of subsection 4.3 and Proposition 4.2 of the next paragraph.
The limit process. In order to show that equation (23) is well posed and to state its properties, we shall use some known results concerning existence and uniqueness of solutions of stochastic differential equations in an interval of the real line (see, e.g., [19], section 5.5, p. 329). We recall the definition and a fundamental result.
We denote by τ_I the exit time from I, i.e., τ_I = lim_{n→∞} τ_n, where τ_n is the exit time from (l_n, r_n) and {l_n}, {r_n} are strictly monotone sequences with l < l_n < r_n < r for all n and lim_{n→∞} l_n = l, lim_{n→∞} r_n = r.
Theorem 4.2. (Thm 5.1 and subsection C of [19]) Suppose that the coefficients of (24) satisfy the conditions below. Then, for every initial distribution µ with µ(I) = 1, equation (24) has a weak solution in I, and this solution is unique in the sense of probability law.
In the next proposition we show that (23) has a weak solution in (−∞, 0) and that, for any ε > 0, the solution in the interval (−1/ε, 0) a.s. exits from it, and does so through the left endpoint. Remark 3. In the next paragraph we illustrate the case when φ_1 and φ_2 are two linear functions. In that case we can obtain explicit expressions for the coefficients ā_H and ā_HH, and the random time in (28) will be replaced by τ_I. In the general case we have to restrict to the interval I_ε. Indeed, since we cannot obtain explicit expressions of a_H(h, θ) and a_HH(h, θ) in terms of elementary functions, in order to obtain information about H near the endpoints of (−∞, 0) we need estimates on such coefficients, which are possible only when h is close to 0. However, we are interested in the behaviour of H before it eventually reaches −∞, since this should describe the behaviour of the microscopic variable H^N(Nt), for large N, before it reaches its absorbing state −∞. Therefore, for our purposes it will be enough to study the process in the interval (−1/ε, 0) for ε arbitrarily small.
The limit process in the linear case. Let us consider the simpler case proposed at the beginning of this section, i.e., the case when φ_1 and φ_2 are as in (12) and the change of variables ϕ is defined by (14) and (15). Applying the same arguments used in the proof of Proposition 4.1, we are able to obtain an explicit expression for the equation satisfied by the limit process.
We recall that Legendre's elliptic integrals of the first and second kind are defined respectively as F(ϕ|m) = ∫_0^ϕ (1 − m sin²θ)^{−1/2} dθ and E(ϕ|m) = ∫_0^ϕ (1 − m sin²θ)^{1/2} dθ, and we write K(m) and E(m) for the complete integrals F(π/2|m) and E(π/2|m). In order to simplify notations, let us pose R(h, θ) = 1 − (1 − 16h) sin²(2θ) and β = 2[a(r_2 − r_1) + 2(b_1 + b_2)]. Then we obtain explicit expressions for ā_H and ā_HH. Then, adapting the proof of Proposition 4.2, we conclude that the limit process is the (unique, in the sense of probability law) weak solution in the interval I = (0, 1/16) of the corresponding equation. Letting v be the function defined below, (28) can be improved by showing that the exit occurs through the left endpoint. Indeed, let us prove that lim_{z→0+} v(z) < ∞. The scale function is given by (29). Note that g is a positive function and, for c > z > 0, we have exp(−16 ∫_x^c g(h) dh) < 1. Moreover, by the relations lim_{x→0} [K(1 − x) − ln(4/√x)] = 0 and lim_{x→0} E(1 − 16x) = 1 (see [24], ch. 22, p. 521), it follows that: where C is a positive constant, and for all ε > 0 we can choose c sufficiently close to 0 such that: From this it follows that lim_{z→0+} v(z) < ∞.
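The two limit relations for K and E invoked in the boundary analysis can be checked numerically. The helper `K_and_E` below is a hypothetical utility computing the complete integrals in the parameter convention m (the same convention as F(ϕ|m) above) via the arithmetic-geometric mean; the asserted asymptotics are the ones cited from [24].

```python
import math

def K_and_E(m, tol=1e-15):
    """Complete elliptic integrals K(m), E(m), parameter convention,
    computed with the arithmetic-geometric mean iteration."""
    a, b, c = 1.0, math.sqrt(1.0 - m), math.sqrt(m)
    s, p = 0.5 * c * c, 0.5          # running sum of 2^{n-1} c_n^2
    while c > tol:
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        p *= 2.0
        s += p * c * c
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - s)          # E(m) = K(m) * (1 - sum 2^{n-1} c_n^2)

# logarithmic blow-up of K at the right endpoint: K(1-x) ~ ln(4/sqrt(x))
Kv, _ = K_and_E(1.0 - 1e-4)
# E tends to 1 as its parameter tends to 1: E(1 - 16x) -> 1 as x -> 0
_, Ev = K_and_E(1.0 - 16 * 1e-6)
```

At the left endpoint the iteration degenerates gracefully: for m = 0 both integrals equal π/2.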

An averaging principle
In this section we prove an averaging principle for a sequence {(X^N, Y^N); N ≥ 1} of Markov processes, where Y^N describes a "fast" variable with values in R/2πZ and X^N describes a "slow" variable with values in a closed interval of R. This result extends Proposition 3.2 of [11] to the case when the velocity of the fast variable is not necessarily constant. The idea is to compare (X^N, Y^N) with a process close to it, in which the slow variable is piecewise constant in time.
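The effect of a nonconstant fast velocity can be seen numerically on a deterministic slow-fast toy system (an assumption, not the processes of the theorem): θ′ = N F(θ) with F nonconstant, x′ = G(x, θ). The averaged drift weights G by the invariant measure µ_x(dθ) = dθ/(T(x) F(x, θ)), which differs from the naive Lebesgue average precisely because F is not constant.

```python
import math

N = 500                                           # time-scale separation
F = lambda x, th: 1.0 + 0.5 * math.sin(th)        # fast speed, >= 1/2 > 0
G = lambda x, th: -x + math.cos(th) ** 2          # slow drift (illustrative)

# invariant measure of the frozen fast motion: mu(dth) = dth / (T * F(th))
n = 4000
ths = [2 * math.pi * i / n for i in range(n)]
w = 2 * math.pi / n
Tper = sum(1.0 / F(0, t) for t in ths) * w
c = sum(math.cos(t) ** 2 / F(0, t) for t in ths) * w / Tper   # weighted mean

# integrate the full slow-fast system with RK4
x, th, dt, steps = 0.0, 0.0, 1e-4, 80000          # total time T = 8
def vf(x, th):
    return (G(x, th), N * F(x, th))
for _ in range(steps):
    k1 = vf(x, th)
    k2 = vf(x + dt / 2 * k1[0], th + dt / 2 * k1[1])
    k3 = vf(x + dt / 2 * k2[0], th + dt / 2 * k2[1])
    k4 = vf(x + dt * k3[0], th + dt * k3[1])
    x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    th += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])

# averaged ODE x' = -x + c, started from 0, at time T = 8
x_avg = c * (1.0 - math.exp(-8.0))
```

The slow component tracks the solution of the averaged equation with the µ-weighted constant c ≈ 0.464, and visibly not the Lebesgue-averaged constant 1/2: this is the content of hypothesis ii) with a nonconstant F.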
where lim_{N→∞} e(N) = 0. ii) Denoting by L_N the generator of the process (X^N, Y^N) and by p_y : I × R/2πZ → R/2πZ the projection on the second coordinate, we can write: where (x_N, y_N) ∈ E_N and δ_N is a sequence converging to zero; F is a Lipschitz function in both variables with inf_{(x,y)∈E} |F(x, y)| ≥ ε > 0, and G is a continuous function. iii) The martingale given by the following expression satisfies the bound below, where lim_{N→∞} ē(N) = 0.
Proof. Arguing as in [11], by virtue of the Skorohod representation theorem (see, e.g., [5], Ch. 2, Theorem 2.2.2), we can suppose that the processes {(X^N, Y^N); N ≥ 1} are defined on a suitable probability space in which {X^N; N ≥ 1} converges to X̄ almost surely. We shall prove that on this space we have the claimed convergence. Let us pose T(x) := ∫_0^{2π} 1/F(x, y) dy. Observe that, since by assumption ε ≤ F ≤ ‖F‖_∞, we have, for all x ∈ I, the corresponding two-sided bound on T(x). Then the invariant distribution of (33) is given by µ_x(dy) = 1/(T(x) F(x, y)) dy, and the function ξ̄ is Lipschitz. After rewriting, the last term in the above inequality goes to zero as N → ∞ thanks to the regularity of ξ̄, the convergence of X^N to X̄ and the dominated convergence theorem. In order to study the term B we introduce a suitable construction. Fix t_0 = 0 and X^N(0) = x_0, Y^N(0) = y_0 as the initial conditions of the ODE, and let Y_1 be its solution. By the definition of T we have Y_1(T(x_0)/N) = 2π + y_0. Now, for i ≥ 0, we proceed recursively as follows: let X^N(t_i) = x_i and Y^N(t_i) = y_i be the initial conditions of the equation, and denote by Y_{i+1} its solution. Let T(x_i) < ∞ be such that Y_{i+1}(T(x_i)/N) = 2π + y_i. We pose t_{i+1} = t_i + T(x_i)/N and consider the process (X̃(t), Ỹ(t)) defined piecewise on the intervals [t_i, t_{i+1}). Note that, by (34), it follows that: where, for a given set A, |A| denotes its cardinality. Now, for each ω, define n(ω) = inf{i : t_{i+1}(ω) > T}.
For the term B in (35) it holds: We study separately each term of the above inequality. By hypothesis, the function ξ is Lipschitz (with constant, say, L_ξ); using (34) and (31) we have: Analogously, for the term B_2 we obtain: By hypothesis ii) and by the construction of Y_i we can write: The function F is Lipschitz in both variables with constant, say, L_F; then from (36) we obtain: By Gronwall's inequality we obtain: Then, taking expectations and using (31) and (32), we have: Recalling the construction of the process Ỹ, we change variable in each integral of the sum in B_3, setting θ = Y_i(t). Note that if t_{i+1} < T, using the periodicity of F and ξ, we can write each integral as: Using the definition of the invariant measure µ_x we have: Note that the only non-zero term of the last sum is the one corresponding to i = n, and that |T − t_n| is bounded above and below by constants times 1/N for all N. Therefore, recalling that ξ is uniformly bounded on E, the last term of the inequality tends to zero as N → ∞.
For the first sum, arguing as for the term B 1 and applying again (31) we obtain
The proof of this proposition needs some preliminary results.
Tightness. Let T > 0 be fixed and let ε, ε′ > 0 be small constants. In order to prove tightness for the sequence of stopped processes {H^N(· ∧ τ^N_{−1/ε,−ε′}); N ≥ 1} we use Aldous' tightness criterion (see [8]), namely, we check the following sufficient conditions: i) for every η > 0 there exists a constant C > 0 such that: ii) for any η > 0 and α > 0 there exists δ > 0 such that: Proof. As in Proposition 4.1, we can write H̃^N in the form (37), so that condition i) of Aldous' criterion is immediately satisfied since |H̃^N(t ∧ τ^N_{−1/ε,−ε′})| ≤ 1/ε. In order to check condition ii), let us fix η, α > 0 and take any pair of stopping times τ_1, τ_2 with 0 ≤ τ_1 ≤ τ_2 ≤ (τ_1 + δ) ∧ T for some δ. By (37) we have: Using the optional stopping theorem and the Itô isometry, we have: Choosing δ sufficiently small, the proposition holds true.
Averaging principle for the stopped processes. In this paragraph we show that the sequence {(H̃^N(· ∧ τ^N_{−1/ε,−ε′}), Θ̃^N(· ∧ τ^N_{−1/ε,−ε′})); N > 1} satisfies the conditions of Theorem 4.3. We start by proving the following Lemma 4.1. For any bounded {F_t}-stopping time τ and any ζ ∈ R_+ there exists a constant C_ζ, independent of N and τ, such that: Proof. From the definition of the process H̃^N we can write: with λ^N being the jump rate function defined in (4). In what follows we use the short notation: Let τ be a bounded stopping time and s ≥ 0. Then: where p_x is the projection on the first coordinate and the martingale M̃^N is obtained from the sum (38) by replacing N_{i,k} with its compensated process Ñ_{i,k} for each k, i. Now, we apply the Burkholder-Davis-Gundy inequality to the martingale {R̃^N(s)}_{s≥0} defined below, where C_1 is a constant and ⟨R̃^N⟩ denotes the quadratic variation of R̃^N. Using the fact that M̃^N is a sum of orthogonal martingales we obtain: where N̄_{i,k} is the compensator of N_{i,k}. We recall that (Δ_{i,k} H̃^N)²(s−) ≤ C_1²/N², and we note also that the jump rate function satisfies ‖λ^N‖_∞ ≤ C_3 N for a constant C_3 > 0. Then, choosing the right constant K_ζ, we obtain: Moreover, from the properties of the generator K^{ε,ε′}_N we know that, for any C²-function f, the function K^{ε,ε′}_N(f ∘ p_x) is uniformly bounded. Then there exists a constant C > 0, independent of N, such that the claimed bound holds, and the proof is complete.
Let A be the operator defined on functions f ∈ C²((−∞, 0) × R/2πZ) by: where a_H, a_HH are defined as in (16).
Then, up to passing to a subsequence, we have, as N → ∞, the convergences below, where ā_H and ā_HH are defined as in (21) and (22). From (16) and the fact that F ≥ c(ε, ε′) > 0 (see the proof of Proposition 4.1 in subsection 4.2) we immediately observe that hypothesis ii) holds too. To show hypothesis iii), we find a uniform bound for the martingale term in the relation: where t, s ∈ [0, T] and p_y denotes the projection on the second coordinate.
Proof of Proposition 4.3. Before giving the proof we state a useful result. Let L be a linear operator defined on bounded measurable functions on a metric space E, let U be an open subset of E and let X be a càdlàg process. We recall that the process X(· ∧ τ), where τ is the exit time of X from U, is said to be a solution of the (L, U)-stopped martingale problem if the associated process is a martingale for all f ∈ dom(L). Since the jumps of H̃^N are bounded by C/N for a suitable constant C, then (see [12], Ch. 3, Thm 10.2) the limit process H̄_{−1/ε,−ε′} is continuous. In the setting of the proof of Proposition 4.3, we consider a suitable probability space where the above convergence is almost sure. In such a space we also have, up to passing to a subsequence,

Let us write it explicitly:
Note that all the terms in the expectations above are uniformly bounded with respect to N. Consider the second term of (43) and observe that: where the last equality comes from the fact that f has compact support. Therefore, the conclusion follows from (43), using (42) and the dominated convergence theorem.
Using again Proposition 4.2, we can choose ε′ small enough and N big enough such that: Finally, for the second term of (44), by the convergence of H̃^N(· ∧ τ^N_{−1/ε,−ε′}) to H̄_{−1/ε,−ε′}, we can take N big enough such that the bound holds, and the proof is complete.