Multipopulation Spin Models: A View from Large Deviations Theoretic Window

This paper studies large deviations properties of vectors of empirical means and measures generated as follows. Consider a sequence $X_1, X_2, \ldots, X_n$ of independent and identically distributed random variables partitioned into $d$ subgroups with sizes $n_1, \ldots, n_d$. Further, consider a $d$-dimensional vector $m_n$ whose coordinates are the empirical means of the subgroups. We prove the following. (1) The sequence of vectors of empirical means $m_n$ satisfies a large deviations principle with rate $n$ and rate function $I$ when the sequence $X_1, X_2, \ldots, X_n$ is $\mathbb{R}^l$-valued, with $l \ge 1$. (2) Similar large deviations results hold for the corresponding sequence of vectors of empirical measures $L_n$ if the $X_i$, $i = 1, 2, \ldots, n$, take on finitely many values. (3) The rate functions for the above large deviations principles are convex combinations of the corresponding rate functions arising from the large deviations principles of the coordinates of $m_n$ and $L_n$. The probability distribution used in the convex combinations is given by $\alpha = (\alpha_1, \ldots, \alpha_d) = \lim_{n\to\infty} (1/n)(n_1, \ldots, n_d)$. These results are then used to derive a variational formula for the thermodynamic limit of the pressure of the multipopulation Curie-Weiss (I. Gallo and P. Contucci (2008), and I. Gallo (2009)) and mean-field Potts' models, via a version of Varadhan's integral lemma for an equicontinuous family of functions. These multipopulation models serve as a paradigm for decision-making contexts where social interaction and other socioeconomic attributes of individuals play a crucial role.


Introduction
The early 1970s saw the use of two-population mean-field models in the study of the phase transitions and critical behaviour of antiferromagnetic systems. Such models were used as mean-field approximations of bipartite lattice systems for studying metamagnets [1, 2]. The two-population mean-field idea was used in [3] to investigate Gibbs-non-Gibbs transitions in Gibbs measures for the Curie-Weiss model subjected to Glauber dynamics. There the analysis was based on a complete analysis of the phase diagram of the evolving spins constrained to have a given magnetization. A phase transition in such a constrained system indicates loss of the Gibbs property for the evolving system; otherwise the Gibbs property is preserved.
Statistical mechanical models have seen applications in the socioeconomic literature. Here the focus is on how decisions of individuals are influenced by their socioeconomic environment, for instance, how one's choice of employment, residence, education, etc. is influenced by the social and economic environment. Spin models have appeared as natural models for such discrete choice contexts where social interactions play a crucial role [4-7]. More recently, the authors of [8-10] have introduced two-population mean-field models for a binary choice context where the reference population is partitioned into subgroups of individuals sharing the same socioeconomic attributes. The key assumption is that individuals with the same attributes tend to behave the same way. In these papers it is assumed that the fractions of individuals in the subgroups are independent of the size of the reference population. The thermodynamic limit for the pressure was proved for a many-body interaction version of the interacting Curie-Weiss model, but the variational expression for the pressure and the almost sure factorization of the correlation functions were proved only for the case of one- and two-body interactions [10].
The aim of this paper is to set up large deviations machinery for obtaining the variational formula for the pressure of the general model introduced in [10], and to extend that model to the case where the fractions of individuals in the subgroups depend on the size of the reference population. We also employ the tools developed here to derive a variational formula for the pressure of a multibody multipopulation mean-field Potts' model.
We establish large deviation results for vectors of empirical means associated with a collection of random variables modelling the behaviour of interest of the subgroups that constitute the reference group. These empirical means are derived from uneven numbers of random variables; that is, the vector components are empirical means of different numbers of independent and identically distributed random variables. Because the sizes of the subgroups vary, the large-population asymptotics of the free energy requires a version of Varadhan's integral lemma for a sequence of functionals of the vector of empirical measures, instead of the usual case where the functional is fixed throughout the asymptotics. We provide a necessary condition for such a sequence to admit the desired asymptotic result.
The rest of the paper is organized as follows. Section 2 discusses the generalities on large deviations theory and the main results of the paper. In Section 3 we introduce the multipopulation Curie-Weiss and mean-field Potts' models that motivate the large deviations problem we address in this paper. The proofs of the results in Section 2 are given in Section 4.

Generalities on Large Deviations Theory and Main Results
Large deviation theory describes how, on an exponential scale, the probability of an atypical event decays to zero. More formally, large deviations are defined as follows.
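Definition 1. Let $\mathcal{X}$ be a complete separable metric space, $\mathcal{B}(\mathcal{X})$ the Borel $\sigma$-algebra of $\mathcal{X}$, and $\{Q_n;\ n = 1, 2, \ldots\}$ a sequence of probability measures on $\mathcal{B}(\mathcal{X})$.
(1) $\{Q_n\}$ is said to have a large deviation property if there exist a sequence of positive numbers $\{a_n;\ n = 1, 2, \ldots\}$ which tends to $\infty$ and a function $I$ which maps $\mathcal{X}$ into $[0, \infty]$ such that the following hypotheses hold:
(a) $I$ is lower semicontinuous on $\mathcal{X}$;
(b) $I$ has compact level sets;
(c) $\limsup_{n\to\infty} a_n^{-1} \log Q_n(K) \le -\inf_{x \in K} I(x)$ for each closed set $K \subset \mathcal{X}$;
(d) $\liminf_{n\to\infty} a_n^{-1} \log Q_n(G) \ge -\inf_{x \in G} I(x)$ for each open set $G \subset \mathcal{X}$.
(2) $I$ is called the entropy/rate function of $\{Q_n\}$.
In the above definition condition (b) implies that the rate function $I$ is good.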

The Set Up.
Suppose $X_1, X_2, \ldots, X_n, \ldots$ is a sequence of independent and identically distributed $\mathbb{R}^l$-valued random variables, for a positive integer $l \ge 1$. Let $A_n^1, \ldots, A_n^d$ be a partition of the set $A_n = \{1, 2, \ldots, n\}$. The partition may be interpreted as the indexing sets of the $d \ge 1$ subpopulations in a population of size $n$. We denote by $n_j$ the size of the $j$th subpopulation and we assume that
$$\lim_{n\to\infty} \frac{n_j}{n} = \alpha_j$$
for any $j = 1, 2, \ldots, d$. For each subpopulation $j$ we are interested in the empirical mean
$$m_n^j = \frac{1}{n_j} \sum_{i \in A_n^j} X_i,$$
and the vector of empirical means for the multipopulation is given by
$$m_n = (m_n^1, \ldots, m_n^d).$$
For the case $l = 1$, it is clear that $m_n \in \mathbb{R}^d$, but it is different from the empirical mean of a sequence of $\mathbb{R}^d$-valued random variables. Note that in the latter each coordinate of the empirical mean is a sum of $n$ random variables. In our case the coordinates of the empirical mean vector $m_n$ are sums of uneven numbers of random variables. In what follows we write $\mu_n$ for the distribution of $m_n$ and put $\mathcal{X} = (\mathbb{R}^l)^d$.
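The construction above can be illustrated numerically. The following is a minimal sketch, assuming fair $\pm 1$ spins and a partition into consecutive blocks; the helper name `subgroup_means` and the chosen sizes are hypothetical and only serve the illustration. It computes the vector of empirical means for subgroups of uneven sizes and checks that the overall empirical mean is the convex combination of the subgroup means with weights $n_j/n$.

```python
import random


def subgroup_means(samples, sizes):
    """Split `samples` into consecutive blocks of the given (positive) sizes
    and return the empirical mean of each block."""
    if sum(sizes) != len(samples):
        raise ValueError("subgroup sizes must add up to the sample size")
    means, start = [], 0
    for size in sizes:
        block = samples[start:start + size]
        means.append(sum(block) / size)
        start += size
    return means


random.seed(0)
n, sizes = 1000, [200, 300, 500]          # uneven subgroup sizes (assumed values)
spins = [random.choice([-1, 1]) for _ in range(n)]

m_n = subgroup_means(spins, sizes)         # vector of empirical means
alpha_n = [size / n for size in sizes]     # subgroup fractions n_j / n
overall = sum(a * m for a, m in zip(alpha_n, m_n))
print(m_n, overall)
```

Each coordinate of `m_n` averages a different number of variables, which is exactly what distinguishes this vector from the empirical mean of an i.i.d. sequence of vector-valued variables.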

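The R-Valued Case.
Suppose the sequence $X_1, X_2, \ldots$ of $\mathbb{R}$-valued random variables is independent and identically distributed with common distribution $\rho$. Suppose the logarithmic moment generating function associated with $\rho$ is given by
$$\Lambda(\lambda) = \log \mathbb{E}\big[e^{\lambda X_1}\big] = \log \int_{\mathbb{R}} e^{\lambda x}\, \rho(dx), \qquad \lambda \in \mathbb{R}.$$
We assume $\Lambda(\lambda) < \infty$ for all $\lambda \in \mathbb{R}$. The Fenchel-Legendre transform of $\Lambda$ is defined as
$$\Lambda^*(x) = \sup_{\lambda \in \mathbb{R}} \big[\lambda x - \Lambda(\lambda)\big].$$
We now state our first large deviations result for the vector of empirical means $m_n$. Recall that $\mu_n$ is the law of $m_n$.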
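As a concrete instance of the two functions just introduced, the sketch below assumes the common distribution is uniform on $\{-1, +1\}$, so that the logarithmic moment generating function is $\log\cosh\lambda$. It approximates the Fenchel-Legendre transform by a grid search over $\lambda$ and compares it with the known closed form $\tfrac{1}{2}[(1+x)\log(1+x) + (1-x)\log(1-x)]$. The function names and the grid parameters are illustrative assumptions, not part of the paper.

```python
import math


def log_mgf(lam):
    """log E[exp(lam * X)] for X uniform on {-1, +1}, i.e. log cosh(lam),
    written in a numerically stable way."""
    a = abs(lam)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def legendre_numeric(x, lam_max=20.0, steps=40001):
    """Grid approximation of sup over lam of [lam * x - log_mgf(lam)]."""
    grid = (-lam_max + 2.0 * lam_max * k / (steps - 1) for k in range(steps))
    return max(lam * x - log_mgf(lam) for lam in grid)


def legendre_closed(x):
    """Closed form of the transform for fair +-1 spins, valid for |x| <= 1."""
    def xlogx(t):
        return 0.0 if t <= 0.0 else t * math.log(t)
    return 0.5 * (xlogx(1.0 + x) + xlogx(1.0 - x))


for x in (0.0, 0.25, 0.5, 0.9):
    print(x, legendre_numeric(x), legendre_closed(x))
```

The grid search is only a sanity check; for $|x| < 1$ the supremum is attained at $\lambda = \operatorname{artanh}(x)$, which is where the closed form comes from.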
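In view of the above, we will write the inner product for any pair $x, y \in (\mathbb{R}^l)^d$ as follows:
$$\langle x, y \rangle = \sum_{j=1}^{d} \langle x_j, y_j \rangle.$$
For the empirical measures, suppose the $X_i$ take values in a finite set $\Sigma$, and let $L_n[X] = (L_n^1[X], \ldots, L_n^d[X])$, where
$$L_n^j[X] = \frac{1}{n_j} \sum_{i \in A_n^j} \delta_{X_i}$$
is the empirical measure of the $j$th subgroup. Note that $L_n[X]$ is a vector whose coordinates are probability measures on $\Sigma$. To see the connection between our vector of empirical measures and the vector of empirical means considered earlier, we introduce the following sequence of random variables: for each positive integer $i$, define
$$Y_i = \big(\mathbf{1}_{\{X_i = s\}}\big)_{s \in \Sigma}.$$
Then $Y_1, Y_2, Y_3, \ldots$ is a sequence of i.i.d. $\mathbb{R}^{|\Sigma|}$-valued random variables with the property that
$$\frac{1}{n_j} \sum_{i \in A_n^j} Y_i = L_n^j[X], \qquad j = 1, \ldots, d.$$
Thus the empirical mean of the $Y$-sequence is the same as the empirical measure of the $X$-sequence, and the vector of empirical means associated with the $Y$-sequence coincides with the vector of empirical measures of the $X$-sequence.
For every $\lambda \in \mathbb{R}^{|\Sigma|}$, note that
$$\Lambda(\lambda) = \log \mathbb{E}\big[e^{\langle \lambda, Y_1 \rangle}\big] = \log \sum_{s \in \Sigma} \rho(s)\, e^{\lambda_s},$$
where $\rho$ denotes the common distribution of the $X_i$. Further, for any probability measure $\nu$ on $\Sigma$, the relative entropy of $\nu$ relative to $\rho$ is given by
$$H(\nu \mid \rho) = \sum_{s \in \Sigma} \nu(s) \log \frac{\nu(s)}{\rho(s)}.$$
Suppose $M_1(\Sigma)$ is the set of all probability measures on $\Sigma$.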
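For a finite alphabet both objects in this passage are easy to compute. The sketch below is an illustration under the assumption that measures are represented as lists of probabilities in a fixed order of the alphabet; the helper names are hypothetical. It computes an empirical measure and the relative entropy $\sum_s \nu(s)\log(\nu(s)/\rho(s))$, with the usual conventions $0 \log 0 = 0$ and value $+\infty$ when $\nu$ charges a point that $\rho$ does not.

```python
import math
from collections import Counter


def empirical_measure(samples, alphabet):
    """Fraction of samples equal to each symbol, in the order of `alphabet`."""
    counts = Counter(samples)
    n = len(samples)
    return [counts[s] / n for s in alphabet]


def relative_entropy(nu, rho):
    """Relative entropy of nu with respect to rho on a finite alphabet."""
    total = 0.0
    for p, q in zip(nu, rho):
        if p > 0.0:
            if q == 0.0:
                return math.inf   # nu is not absolutely continuous w.r.t. rho
            total += p * math.log(p / q)
    return total


alphabet = [1, -1]
nu = empirical_measure([1, 1, -1, 1], alphabet)   # three +1 and one -1
rho = [0.5, 0.5]                                  # assumed reference law
print(nu, relative_entropy(nu, rho))
```

For the two-point alphabet $\{-1, +1\}$ with uniform reference law, the relative entropy of a measure with mean $x$ coincides with the Fenchel-Legendre transform of $\log\cosh$ evaluated at $x$, which is one way to see the link between empirical means and empirical measures described above.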

Varadhan's Integral Lemma.
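In this section we consider Varadhan's integral lemma for a sequence of equicontinuous functions. Here we put $\mathcal{X} = (\mathbb{R}^l)^d$.
Theorem 5. Suppose $\{\mu_n\}$ satisfies a large deviation principle with rate $n$ and a good rate function $I : \mathcal{X} \to [0, \infty]$. Further, let the sequence of functions $F_n : \mathcal{X} \to \mathbb{R}$ be equicontinuous and converge pointwise to a function $F : \mathcal{X} \to \mathbb{R}$. Assume either the tail condition
$$\lim_{M\to\infty} \limsup_{n\to\infty} \frac{1}{n} \log \int_{\mathcal{X}} e^{n F_n(x)}\, \mathbf{1}_{\{F_n(x) \ge M\}}\, \mu_n(dx) = -\infty \qquad (21)$$
or the following moment generating condition for some $\gamma > 1$,
$$\limsup_{n\to\infty} \frac{1}{n} \log \int_{\mathcal{X}} e^{\gamma n F_n(x)}\, \mu_n(dx) < \infty. \qquad (22)$$
Then
$$\lim_{n\to\infty} \frac{1}{n} \log \int_{\mathcal{X}} e^{n F_n(x)}\, \mu_n(dx) = \sup_{x \in \mathcal{X}} \big[F(x) - I(x)\big]. \qquad (23)$$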
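A Varadhan-type limit of the form $\frac{1}{n}\log \mathbb{E}[e^{n F_n(m_n)}] \to \sup_x [F(x) - I(x)]$ can be checked numerically in the simplest setting. The sketch below is only an illustration under stated assumptions: a single population of fair $\pm 1$ spins (so the law of the empirical mean is binomial and the rate function is the closed-form Cramér rate), and a hypothetical sequence $F_n(x) = \tfrac{1}{2} J x^2 + h x + x/n$ that is equicontinuous on $[-1,1]$ and converges pointwise to $F(x) = \tfrac{1}{2} J x^2 + h x$. The parameter values are arbitrary.

```python
import math


def rate(x):
    """Cramer rate function of fair +-1 spins, for |x| <= 1."""
    def xlogx(t):
        return 0.0 if t <= 0.0 else t * math.log(t)
    return 0.5 * (xlogx(1.0 + x) + xlogx(1.0 - x))


def finite_n_value(n, f_n):
    """(1/n) log E[exp(n * f_n(m_n))], computed exactly from the binomial
    law of the number of +1 spins, using a log-sum-exp for stability."""
    logs = []
    for k in range(n + 1):
        log_prob = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1) - n * math.log(2.0))
        logs.append(log_prob + n * f_n(2.0 * k / n - 1.0))
    top = max(logs)
    return (top + math.log(sum(math.exp(v - top) for v in logs))) / n


def variational_value(f, grid=20001):
    """Grid approximation of sup over x in [-1, 1] of [f(x) - rate(x)]."""
    xs = (-1.0 + 2.0 * k / (grid - 1) for k in range(grid))
    return max(f(x) - rate(x) for x in xs)


J, h = 0.5, 0.2                                   # assumed parameters
limit_f = lambda x: 0.5 * J * x * x + h * x       # pointwise limit F
n = 2000
f_n = lambda x: limit_f(x) + x / n                # n-dependent functional F_n

print(finite_n_value(n, f_n), variational_value(limit_f))
```

The two printed numbers should be close for large `n`; the gap is a finite-size effect and shrinks as `n` grows.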

Applications
Let us now introduce the models that motivated the large deviations questions addressed in this paper. We are interested in a model that captures how individual decisions or choices are influenced by the choices of the rest of the people in their reference group. Additionally, individuals have attributes, such as gender, place of residence, level of education, and ethnicity, that also influence their decisions. The Curie-Weiss case introduced in [10] is discussed first, followed by the corresponding mean-field Potts' case. The method discussed here could also apply to continuous spin models with compact support. In particular, it will apply to the mean-field versions of the following models: the $O(n)$ model [11], the spherical model [12, 13], the liquid crystal model [11, 14, 15], and the Kuramoto model [16]. The detailed analysis of continuous spin models and their phase diagrams will be carried out in a future paper. We present the results for the Curie-Weiss and Potts' models here because we have already studied the thermodynamic limits of these models in [17, 18].

Multipopulation Curie-Weiss Model.
Suppose each individual in a population of $N$ agents chooses a binary action, such as voting YES or NO on some issue at some common time. This binary action is coded by $\sigma_i \in \{1, -1\}$. The choices made by all the $N$ individuals are coded by $\sigma \in \Omega_N = \{-1, +1\}^N$. The level of satisfaction of the population for deciding on $\sigma \in \Omega_N$ is given by the Curie-Weiss Hamiltonian $H_N$ in (25). The function $H_N$ on the configurations $\sigma$ represents the utility of individuals as a result of their choices and the influences on them while making those choices [19]. $H_N(\sigma)$ measures the level of satisfaction of the entire population for making the choice $\sigma$: the higher $H_N(\sigma)$ is, the higher the level of satisfaction of the population. It has two parts; the first part models the social incentive of individuals in the population and the second part models the private incentive of individuals. Here $J_{ij}$ measures the influence of individual $j$ on individual $i$. When $J_{ij}$ is positive, conformity or imitation is rewarded; when $J_{ij}$ is negative, conformity is not rewarded. The parameter $h_i$ controls the part of the utility that is specific to individual $i$.
Next we reparametrize the parameters $J_{ij}$ and $h_i$ as follows: suppose that each individual $i$ in the reference population has $K$ attributes $a_i = (a_i^{(1)}, \ldots, a_i^{(K)}) \in \{0, 1\}^K$. For instance, attributes 1 and 2 may be, respectively, employment $a_i^{(1)}$ and marital status $a_i^{(2)}$, each coded by 0 or 1. With respect to the attributes, the reference group can therefore be partitioned into $2^K$ nonoverlapping subgroups. Members of a given subgroup share the same attributes and it is therefore reasonable to assume that they also behave the same way. In view of this, we shall assume in what follows that $J_{ij} = J_{g g'}$ for all $i$ coming from subgroup $g$ and all $j$ taken from subgroup $g'$. Further, we assume that $h_i = h_g$ for all individuals $i$ in subgroup $g$. Therefore, by (25) and the above parametrization of $J_{ij}$ and $h_i$, the Hamiltonian can be written in terms of the subgroup sums of the spins. Note that if $K = 0$, we get the original Curie-Weiss Hamiltonian. For the case $K \ge 1$, we end up with $2^K$ Curie-Weiss models on the subgroups $A_N^1, \ldots, A_N^{2^K}$ that interact with one another. Here we have $2^K$ subgroups because the attributes are binary. We could allow the attributes to have any finite number of alternatives, and the alternatives for the attributes need not come from the same set. Therefore, in what follows we will assume there are $d \ge 1$ subgroups and that the Hamiltonian takes the form (30). The Hamiltonian in (30) consists of one-body and two-body interactions. In what follows we will extend the number of bodies in the interaction to range from 1 to $d$, where $d$ is the number of subgroups [10], and consider the Hamiltonian in (31). Note from (30) and (31) that for $r = 1$, $J_{g_1} = h_g$ and for $r = 2$, $J_{g_1 g_2} = J_{g g'}$. The $J_{g_1, \ldots, g_r}$ are interaction coefficients associated with the $r$-body interaction among individuals coming from the subgroups $g_1, \ldots, g_r$, respectively. Thus the interaction is defined with the help of a tensor $J_{g_1, \ldots, g_r}$ of rank $r$ for each of the $r$-body interactions [10]. Further, we assume that there is a probability measure $\alpha$ on the set $\{1, 2, \ldots, d\}$, such that $\alpha = (\alpha_1, \ldots, \alpha_d)$ and
$$\lim_{N\to\infty} \alpha_g^N = \alpha_g, \quad \text{for any } g = 1, 2, \ldots, d. \qquad (32)$$
Note that the model we consider here is more general than the cases in [8, 10], in that for any finite $N$ the fractions of the subgroups $\alpha_1^N, \ldots, \alpha_d^N$ are dependent on $N$. In [8, 10] these fractions are chosen to be independent of $N$. This simplifies the proofs, especially the existence of the thermodynamic limit.
In the sequel, for every positive integer $N$, a map $F_N : \Delta_d \to \mathbb{R}$ is defined, and the $F_N$'s are uniformly bounded. Suppose the spins $\sigma_1, \sigma_2, \ldots$ are a sequence of independent and identically distributed random variables, and denote by $P_N$ the corresponding product measure on $\Omega_N$. We write $Z_N$ for the partition function of the model and $\mu_N$ for the law of the vector of empirical means $m_N = (m_N^1, \ldots, m_N^d)$ under $P_N$. In (38) we have used (35). In what follows $J$, $h$, and $\alpha$ denote the full collections of the corresponding parameters. The pressure function of the model is then given by
$$p_N = \frac{1}{N} \log Z_N.$$
The large-$N$ behaviour of the model is governed by the pressure function. It is known from [18] that the thermodynamic limit exists. The proof for the case $\alpha_g^N = \alpha_g$ was given earlier in [8, 10].
Theorem 6. For any choice of the parameters $J$, $h$, and $\alpha$, the limiting pressure admits a variational representation in which $F$ is given in (34).
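A variational representation of this kind can be sanity-checked numerically for two subgroups. The sketch below rests on several assumptions that are not taken from the paper: fair $\pm 1$ spins as the a priori measure, the partition function normalized as $Z_N = \mathbb{E}[e^{N F_N(m_N)}]$, and an illustrative interaction $F(x) = \tfrac{1}{2}\sum_{g,g'} \alpha_g \alpha_{g'} J_{gg'} x_g x_{g'} + \sum_g \alpha_g h_g x_g$ with only one- and two-body terms. Under these assumptions it compares the exact finite-$N$ pressure, with size-dependent fractions $N_g/N$, against $\sup_x [F(x) - \sum_g \alpha_g \Lambda^*(x_g)]$.

```python
import math


def rate(x):
    """Cramer rate function of fair +-1 spins, for |x| <= 1."""
    def xlogx(t):
        return 0.0 if t <= 0.0 else t * math.log(t)
    return 0.5 * (xlogx(1.0 + x) + xlogx(1.0 - x))


def interaction(x, alpha, J, h):
    """Assumed quadratic form with one- and two-body terms (illustrative)."""
    d = len(x)
    pair = sum(alpha[a] * alpha[b] * J[a][b] * x[a] * x[b]
               for a in range(d) for b in range(d))
    field = sum(alpha[a] * h[a] * x[a] for a in range(d))
    return 0.5 * pair + field


def log_binom_law(size):
    """log P(k plus-spins among `size` fair spins) for k = 0..size."""
    return [math.lgamma(size + 1) - math.lgamma(k + 1)
            - math.lgamma(size - k + 1) - size * math.log(2.0)
            for k in range(size + 1)]


def finite_pressure(sizes, J, h):
    """Exact (1/N) log E[exp(N F_N(m_N))] for two subgroups."""
    n = sum(sizes)
    alpha_n = [s / n for s in sizes]          # fractions depend on N
    law1, law2 = log_binom_law(sizes[0]), log_binom_law(sizes[1])
    logs = []
    for k1, lp1 in enumerate(law1):
        x1 = 2.0 * k1 / sizes[0] - 1.0
        for k2, lp2 in enumerate(law2):
            x2 = 2.0 * k2 / sizes[1] - 1.0
            logs.append(lp1 + lp2 + n * interaction((x1, x2), alpha_n, J, h))
    top = max(logs)
    return (top + math.log(sum(math.exp(v - top) for v in logs))) / n


def variational_pressure(alpha, J, h, grid=201):
    """Grid approximation of sup over [-1,1]^2 of F minus the weighted rate."""
    pts = [-1.0 + 2.0 * k / (grid - 1) for k in range(grid)]
    return max(interaction((x1, x2), alpha, J, h)
               - alpha[0] * rate(x1) - alpha[1] * rate(x2)
               for x1 in pts for x2 in pts)


alpha = [0.4, 0.6]                    # assumed limiting fractions
J = [[2.0, 0.5], [0.5, 1.5]]          # assumed couplings
h = [0.3, 0.2]                        # assumed fields
p_lim = variational_pressure(alpha, J, h)
p_n = finite_pressure([161, 239], J, h)   # N = 400, fractions close to alpha
print(p_n, p_lim)
```

The subgroup sizes are chosen so that $N_g/N$ differs slightly from $\alpha_g$, mimicking the size-dependent fractions discussed above; the two printed values should still be close.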

Multipopulation Mean-Field Potts' Model.
The choices made by all the $N$ individuals are coded by $\sigma \in \Omega_N = \{1, \ldots, q\}^N$. The level of satisfaction of the population for deciding on $\sigma \in \Omega_N$ is given by the mean-field Potts' Hamiltonian $H_N$ in (45). Here $\delta_{\sigma_i, \sigma_j}$ is the Dirac-delta measure. The function $H_N$ on the configurations $\sigma$ represents the utility of individuals as a result of their choices and the influences on them while making a decision [19]. $H_N$ and its parameters have the usual interpretation given for the Curie-Weiss model. Therefore, by (45) and the above parametrization of $J_{ij}$ and $h_i$, the Hamiltonian can be written in terms of the subgroups. Note that if $K = 0$, we get the original mean-field Potts' Hamiltonian. For the case $K \ge 1$, we end up with $2^K$ mean-field Potts' models on the subgroups $A_N^1, \ldots, A_N^{2^K}$ that interact with one another. Here we have $2^K$ subgroups because the attributes are binary. We could allow the attributes to have any finite number of alternatives, and the alternatives for the attributes need not come from the same set. Therefore, in what follows we will assume there are $d \ge 1$ subgroups, which gives rise to the Hamiltonian in (51). The Hamiltonian in (51) consists of one-body and two-body interactions. In what follows we will extend the number of bodies in the interaction to range from 1 to $d$, where $d$ is the number of subgroups, and consider the Hamiltonian in (52). Note from (51) and (52) that for $r = 1$, $J_{g_1} = h_g$. The $J_{g_1, \ldots, g_r}$ are interaction coefficients associated with the $r$-body interaction among individuals coming from subgroups $g_1, \ldots, g_r$, respectively. Thus the interaction is defined with the help of a tensor $J_{g_1, \ldots, g_r}$ of rank $r$ for each of the $r$-body interactions. The model considered here is the mean-field Potts' version of the Curie-Weiss model considered in [8, 10, 17]. Further, for any finite $N$ the fractions of the subgroups $\alpha_1^N, \ldots, \alpha_d^N$ are dependent on $N$. In [8, 10] these fractions were chosen to be independent of $N$, which simplified the proofs, especially the proof of the existence of the thermodynamic limit.
Note that $L_N[\sigma] = (L_N^1[\sigma], \ldots, L_N^d[\sigma]) \in \mathcal{P}(\Sigma)^d$. Further, for every positive integer $N$, a map $F_N : \mathcal{P}(\Sigma)^d \to \mathbb{R}$ is defined, and the $F_N$'s are uniformly bounded. Since the maps $\nu \mapsto F_N(\nu)$ are continuous for every $N$, and $\Delta_d$ is a compact subset of $(\mathbb{R}^q)^d$, it follows from Theorems 7.13 and 7.24 of [20] that the sequence $\{F_N\}$ is equicontinuous. Further, the Hamiltonian $H_N$ in (52) can be expressed through $F_N$ and the vector of empirical measures. Suppose the spins $\sigma_1, \sigma_2, \ldots$ are a sequence of independent and identically distributed $\Sigma$-valued random variables, and denote by $P_N$ the corresponding product measure on $\Omega_N = \Sigma^N$. The equilibrium state associated with the Hamiltonian $H_N$ in (52) is defined through the partition function $Z_N$ of the model. Here $\mu_N$ is the law of the vector of empirical measures $L_N[\sigma]$. The pressure function of the model is then given by
$$p_N = \frac{1}{N} \log Z_N.$$
It follows from [17] that the thermodynamic limit exists.
Theorem 7. For any choice of the parameters $J$, $h$, and $\alpha$, the limiting pressure admits a variational representation in terms of $F$, given in (55), and the relative entropies $H(\nu_g \mid \rho) = \sum_{s \in \Sigma} \nu_g(s) \log(\nu_g(s)/\rho(s))$ of $\nu_g$ with respect to $\rho$.
Proof. The proof of this theorem follows from (59) to (61) and Theorems 4 and 5, upon identifying the measures, rate, rate function, and functions there with those of the present model, and noting that the $F_N$'s form an equicontinuous family and are uniformly bounded.

Proofs
The proofs of the results of this paper are given in this section.
In the proof below we will use the following properties of the functions $\Lambda$ and $\Lambda^*$. For the proof of these properties we refer the reader to the proof of Lemma 2.2.5 of [21].

Lemma 8.
(1) Λ is a convex function and Λ * is a convex rate function.

Proof of Theorem 2
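Proof. The proof comes in two steps. In step one we establish a large deviations upper bound; the corresponding lower bound is proved in step two. The proof is an adaptation of arguments used to prove the large deviations principle for the empirical mean of an i.i.d. sequence of $\mathbb{R}^d$-valued random vectors.
Step 1. Let $I(x) = \Lambda^*(x)$ be the intended rate function of the problem and define the $\delta$-rate function as follows:
$$I^\delta(x) = \min\{I(x) - \delta,\ 1/\delta\}.$$
The proof of the upper bound will follow if we can show, for every $\delta > 0$ and every closed subset $C$ of $\mathbb{R}^d$, that
$$\limsup_{n\to\infty} \frac{1}{n} \log \mu_n(C) \le \delta - \inf_{x \in C} I^\delta(x).$$
We will first prove this inequality for compact subsets and then extend it to closed subsets with the help of an exponential tightness argument. Suppose $\Gamma$ is a compact subset of $\mathbb{R}^d$. Then for every $\delta > 0$ and $q \in \Gamma$, we can find a $\lambda_q \in \mathbb{R}^d$ with the required property; it follows from the definition of $\Lambda^*$ that such a $\lambda_q$ exists. For any $\lambda \in \mathbb{R}^d$, define the rescaled vector $\lambda^{(n)}$ so that if $\lambda = (\lambda_1, \ldots, \lambda_d)$, then $(1/n)\lambda^{(n)} = (\alpha_1^n \lambda_1, \ldots, \alpha_d^n \lambda_d)$, where $\alpha_j^n = n_j/n$. For each $q \in \Gamma$, choose $\rho_q > 0$ such that $\rho_q |\lambda_q| \le \delta$ and consider the open ball $B_{q, \rho_q}$ centred at $q$ with radius $\rho_q$. The bound on $\mu_n(B_{q, \rho_q})$ uses that $0 \le \alpha_j^n \le 1$ for each $j = 1, 2, \ldots, d$ and follows from the exponential Chebycheff inequality, together with the fact that the sequence $X_1, X_2, \ldots$ is i.i.d.
To extend the above to all closed subsets of $\mathbb{R}^d$, we need to establish exponential tightness of the measures $\mu_n$. Let $\rho > 0$ and put $H_\rho = [-\rho, \rho]^d$. Then
$$\mu_n(H_\rho^c) \le \sum_{j=1}^{d} \big[\mu_n^j((-\infty, -\rho]) + \mu_n^j([\rho, \infty))\big],$$
where $\mu_n^j$ is the law of the $j$th coordinate of $m_n$, i.e., the law of $m_n^j = (1/n_j) \sum_{i \in A_n^j} X_i$. The one-dimensional tail estimates (64) and (65) applied to each coordinate then yield the exponential tightness of $\{\mu_n\}$. For any $i = 1, \ldots, n$, let $j$ be such that $i \in A_n^j$; the remaining limit is obtained by the dominated convergence theorem.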

Proof of Theorem 3
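Proof. The proof of this theorem follows from that of the $\mathbb{R}$-valued case upon making appropriate substitutions. For instance, every $\mathbb{R}$, $\mathbb{R}^d$, $\lambda_j x_j$, and $\Lambda'(\lambda_j)$ in the proof of the $\mathbb{R}$-valued case should be replaced with $\mathbb{R}^l$, $(\mathbb{R}^l)^d$, $\langle \lambda_j, x_j \rangle$, and $\nabla\Lambda(\lambda_j)$, respectively. In particular, in establishing the exponential tightness of the $\mu_n$'s, for any $\rho > 0$ we define the box $H_\rho = \{x \in (\mathbb{R}^l)^d : |x_j^k| \le \rho \text{ for all } j, k\}$, where $x_j^k$ is the $k$th coordinate of the $j$th component of $x = (x_1, \ldots, x_d) \in (\mathbb{R}^l)^d$, and $\mu_n^{j,k}$ is the law of the empirical mean $(1/n_j) \sum_{i \in A_n^j} X_i^k$. Here $X_i^k$ is the $k$th coordinate of $X_i$.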

Proof of Theorem 5
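Proof. The proof comes in three steps. In step one we prove a lower bound for (23). Using condition (21), we prove an upper bound for (23) in step two. Step three shows that condition (22) implies condition (21), and that completes the proof.
Step One. Fix $x \in \mathcal{X}$ and $\delta > 0$. Due to the equicontinuity of the sequence $\{F_n\}$, the functions $F_n$ are lower semicontinuous. Thus, there exists an open neighbourhood $G$ of $x$ such that $\inf_{y \in G} F_n(y) \ge F_n(x) - \delta$ for all $n = 1, 2, 3, \ldots$.
We then have that
$$\int_{\mathcal{X}} e^{n F_n(y)}\, \mu_n(dy) \ge e^{n (F_n(x) - \delta)}\, \mu_n(G).$$
We get from here that
$$\liminf_{n\to\infty} \frac{1}{n} \log \int_{\mathcal{X}} e^{n F_n(y)}\, \mu_n(dy) \ge F(x) - \delta - \inf_{y \in G} I(y) \ge F(x) - I(x) - \delta,$$
where we have used the pointwise convergence $F_n(x) \to F(x)$ and the large deviations lower bound on the open set $G$. Since $x \in \mathcal{X}$ and $\delta > 0$ were arbitrarily chosen, we have that
$$\liminf_{n\to\infty} \frac{1}{n} \log \int_{\mathcal{X}} e^{n F_n(y)}\, \mu_n(dy) \ge \sup_{x \in \mathcal{X}} \big[F(x) - I(x)\big].$$
Step Two. The result for this case follows from the tail condition (21) as we let $M \to \infty$.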

Conclusion
This paper has developed large deviations machinery for the empirical means and measures of partitions of a sequence of independent and identically distributed random variables. The large deviations result is then applied to derive the limiting free energy for the multipopulation Curie-Weiss and Potts' models. The method proposed here can be applied to multipopulation versions of spin models with continuous spins such as the $O(n)$ model [11], the spherical model [12, 13], the liquid crystal model [11, 14, 15], and the Kuramoto model [16]. The multipopulation Potts' model may have applications in discrete choice contexts with more than two alternatives to choose from. This serves as a natural extension of the Ising cases considered in [8-10, 22, 23].
The knowledge gained from studying the minimizers of the minimization problem that leads to the limiting pressure will offer insight into the scaling limit behaviour of the empirical measures associated with the multipopulation Potts' model. This will be a natural extension of the work in [24].

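Theorem 4. The sequence of vectors of empirical measures $L_n[X]$ satisfies a large deviations principle with rate $n$ and rate function $I(\nu) = \sum_{j=1}^{d} \alpha_j H(\nu_j \mid \rho)$ for $\nu = (\nu_1, \ldots, \nu_d) \in M_1(\Sigma)^d$, where $H(\nu_j \mid \rho)$ denotes the relative entropy of $\nu_j$ relative to $\rho$.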
Since the maps $x \mapsto F_N(x)$ are continuous for every $N$, and $\Delta_d$ is a compact subset of $\mathbb{R}^d$, it follows from Theorems 7.13 and 7.24 of [20] that the sequence $\{F_N\}$ is equicontinuous. Further, the Hamiltonian $H_N$ in (31) becomes $H_N(\sigma) = N F_N(m_N(\sigma))$.
The equilibrium state associated with the Hamiltonian $H_N$ in (31) is given by
Mean-Field Potts' Model. Suppose this time round that the individuals in the population of $N$ agents choose from a finite number of alternatives, say various alternatives of employment, at some common time. This discrete choice action of individual $i$ is coded by $\sigma_i \in \Sigma = \{1, \ldots, q\}$, where $q \ge 2$, with $\sigma_i = k$ if individual $i$ chooses employment alternative $k$, for $k \in \{1, \ldots, q\}$.
$X_2, \ldots$ is i.i.d. and the second inequality uses that $n_j/n = \alpha_j^n$. Since $\Gamma$ is compact, it has a finite covering consisting of $N = N(\Gamma, \delta)$ open balls $B_{q_i, \rho_{q_i}}$ centred at $q_1, \ldots, q_N$. It follows from the subadditivity property of probability measures and the choice of the $\lambda_{q_i}$'s that the desired bound holds for compact sets. The remaining quantity is finite, by the finiteness assumption on $\Lambda$; it is a concave function of its argument and consequently continuous.