ON THE WORK OF SARIG ON COUNTABLE MARKOV CHAINS AND THERMODYNAMIC FORMALISM

This paper is a nontechnical survey that aims to illustrate Sarig's profound contributions to statistical physics and, in particular, to the thermodynamic formalism for countable Markov shifts. I will discuss some of Sarig's work on the characterization of the existence of Gibbs measures, on the existence and uniqueness of equilibrium states, and on phase transitions for Markov shifts on a countable set of states.


INTRODUCTION
Omri Sarig, the winner of the fourth Brin Prize in Dynamical Systems, has made many fundamental contributions to the theory of dynamical systems in general and specifically to the thermodynamics of countable Markov chains. In this paper I will describe some of Sarig's results on the characterization of the existence of Gibbs measures, on the existence and uniqueness of equilibrium states, and on the presence of phase transitions for countable Markov chains. It should be stressed that the results on Gibbs measures for Markov chains on a finite or countable set of states serve as a basis for studying existence, uniqueness and ergodic properties of equilibrium measures for smooth hyperbolic dynamical systems: uniformly hyperbolic systems are modeled by subshifts of finite type, while nonuniformly hyperbolic systems are modeled by countable Markov chains. A symbolic representation of a hyperbolic dynamical system can be obtained using Markov partitions with a finite or, respectively, countable collection of partition elements. The construction of such partitions for uniformly hyperbolic systems is due to Sinai [29] and Bowen [3]. Recently Sarig has constructed countable Markov partitions for nonuniformly hyperbolic surface diffeomorphisms (see [27] and also the paper by Ledrappier [15] in this volume).

GIBBS DISTRIBUTIONS
The thermodynamic formalism, i.e., the formalism of equilibrium statistical physics, originated in the work of Boltzmann and Gibbs and was later adapted to the theory of dynamical systems in the classical works of Sinai [28], Ruelle [19,20] and Bowen [3]. In this paper we shall only discuss the thermodynamic formalism for symbolic dynamical systems, starting with subshifts of finite type and then moving to countable state Markov chains. It is in the latter setting that Sarig has obtained many principal results that essentially shaped the area.
In statistical mechanics the study of thermodynamics deals with systems which appear to human perception as being "static" despite the motion of the particles of which the systems are built. These systems can be described simply by a set of macroscopically observable variables and are thought of as statistical ensembles that depend on a few observable parameters and are in statistical equilibrium. We consider systems (canonical ensembles) with a finite (but sufficiently large) number of particles for which the energy is not known exactly and, in place of the energy, the temperature is specified. We begin with a simple example that will help us reveal some principal components of the thermodynamic formalism. Consider a physical system $A$ of finitely many particles. Each particle is characterized by its position and velocity, and we call a given collection of such positions and velocities over all particles a state. There are physical systems for which the set of all states is a finite set $X = \{1, \dots, N\}$. We denote by $E_i$ the energy of the state $i$. We further assume that this mechanical system is in thermal equilibrium with a reservoir, i.e., the particles interact with a heat bath $B$ so that
1. $A$ and $B$ can exchange energy, but not particles;
2. $B$ is at equilibrium and has temperature $T$;
3. $B$ is much larger than $A$, so that its contact with $A$ does not affect its equilibrium state.
Since the energy of the system is not fixed, every state can be realized with a probability $p_i$ given by the Gibbs distribution
$$p_i = \frac{1}{Z(\beta)}\, e^{-\beta E_i}, \qquad Z(\beta) = \sum_{i=1}^{N} e^{-\beta E_i},$$
where $\beta = 1/(\kappa T)$ is called the inverse temperature and $\kappa$ is Boltzmann's constant. It is not difficult to show (see [13]) that the Gibbs distribution maximizes the quantity
$$H - \beta E = H + \sum_{i=1}^{N} \varphi(i)\, p_i,$$
where $H = -\sum_{i=1}^{N} p_i \log p_i$ is the entropy of the distribution, $E = \sum_{i=1}^{N} p_i E_i$ is the average energy, and $\varphi(i) = -\beta E_i$ is the potential, which in our case is a function that depends on a given coordinate $i$ only.
In other words, the Gibbs distribution minimizes the quantity $E - \kappa T H$, known as the free energy of the system. The importance of this statement cannot be overestimated: it means that one of the main principles of thermodynamics, that nature maximizes entropy, is applicable when the energy is fixed and should otherwise be replaced with the principle that nature minimizes the free energy.
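The extremal property of the Gibbs distribution can be checked numerically. The following is a minimal sketch, not taken from the survey: the energies, the value of the inverse temperature and all function names are hypothetical choices made for illustration. It compares the value of $H - \beta E$ at the Gibbs distribution with its value at randomly drawn probability vectors and with $\log Z(\beta)$, which is the value attained at the maximum.

```python
import math
import random

def gibbs(energies, beta):
    """Gibbs distribution p_i = exp(-beta * E_i) / Z(beta)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)  # partition function Z(beta)
    return [w / z for w in weights]

def entropy(p):
    """H(p) = -sum p_i log p_i, with the convention 0 log 0 = 0."""
    return -sum(q * math.log(q) for q in p if q > 0)

def objective(p, energies, beta):
    """H(p) - beta * E(p), the quantity maximized by the Gibbs distribution."""
    mean_energy = sum(q * e for q, e in zip(p, energies))
    return entropy(p) - beta * mean_energy

energies = [0.0, 1.0, 2.5, 4.0]  # hypothetical energies E_i
beta = 0.7                       # hypothetical inverse temperature

p_gibbs = gibbs(energies, beta)
best = objective(p_gibbs, energies, beta)
log_z = math.log(sum(math.exp(-beta * e) for e in energies))

rng = random.Random(0)

def random_distribution(n):
    """A randomly drawn probability vector of length n."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

# Values of H - beta * E at 1000 competing distributions.
others = [objective(random_distribution(len(energies)), energies, beta)
          for _ in range(1000)]

print(best, log_z, max(others))
```

Multiplying $H - \beta E$ by $-\kappa T$ turns the maximization into the minimization of the free energy $E - \kappa T H$, so the same experiment illustrates both formulations.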
Consider now a one-dimensional lattice and assume that with each integer one associates a physical system with a finite set $X = \{1, \dots, m\}$ of states. A configuration of our infinite system is a point $\omega = (\omega_j)_{j \in \mathbb{Z}}$ of the space $\Sigma_n = X^{\mathbb{Z}}$. We assume that the set $X$ is endowed with the discrete topology and the set $\Sigma_n$ with the direct product topology.
Consider all possible finite configurations $(\omega_{-n}, \dots, \omega_0, \dots, \omega_n)$, called cylinders (there are $N = m^{2n+1}$ such configurations), and assume that each configuration has an energy $E_n$ determined by functions $\varphi_0$ and $\varphi_1$ that are continuous functions of their coordinates, with $\varphi_1$ subject to an additional condition. Arguing as above we obtain Gibbs distributions with probabilities $\mu_n$ that are proportional to $e^{-\beta E_n}$. Assume now that for every configuration $(\omega_{-n}, \dots, \omega_n)$ there exists the limit
$$\mu(\omega_{-n}, \dots, \omega_n) = \lim_{k \to \infty} \sum \mu_k(\omega'_{-k}, \dots, \omega'_k),$$
where the sum is taken over all $(\omega'_{-k}, \dots, \omega'_k)$ for which $\omega'_j = \omega_j$ for every $|j| \le n$. The measure $\mu$ is called the Gibbs distribution on $\Sigma_n$ and is an invariant measure for the (left) shift $\sigma$ on $\Sigma_n$.

SUBSHIFTS OF FINITE TYPE
We shall now describe a substantial generalization of the above example, which is due to Bowen [3], Parry [17] and Walters [33] (see also [34]). It is designed to describe systems that are modeled by subshifts of finite type.

Gibbs measures for subshifts of finite type.
Let $(\Sigma_A^+, \sigma)$ be a (one-sided) subshift of finite type (in the literature the pair $(\Sigma_A^+, \sigma)$ is also called a topological Markov chain). Here $A = (a_{ij})$ is a transition matrix ($a_{ij} = 0$ or $1$, no zero columns or rows),
$$\Sigma_A^+ = \{x = (x_n)_{n \ge 0} : a_{x_n x_{n+1}} = 1 \text{ for all } n \ge 0\},$$
and $\sigma$ is the (left) shift. We view $\Sigma_A^+$ as a metric space with the metric $d(x, y)$ given as follows: for any $x = (x_n)$ and $y = (y_n)$,
$$d(x, y) = \sum_{n \ge 0} \frac{|x_n - y_n|}{2^n}.$$
We assume that $A$ is irreducible (i.e., $A^n > 0$ for some $N > 0$ and all $n \ge N$), implying that $\sigma$ is topologically mixing. Consider a continuous function $\varphi$ on $\Sigma_A^+$, which we call a potential.
THEOREM 3.1 (see [3]). Assume that the potential $\varphi$ is Hölder-continuous. Then there exist a unique $\sigma$-invariant Borel probability measure $\mu$ on $\Sigma_A^+$ and constants $C_1 > 0$, $C_2 > 0$ and $P$ such that for every $x = (x_i) \in \Sigma_A^+$ and $m \ge 0$,
$$C_1 \le \frac{\mu([x_0, \dots, x_{m-1}])}{\exp\left(-Pm + \sum_{k=0}^{m-1} \varphi(\sigma^k(x))\right)} \le C_2. \qquad (3.1)$$
The measure $\mu = \mu_\varphi$ is called a Gibbs measure for the potential $\varphi$ and the constant $P = P(\varphi)$ the topological pressure of $\varphi$.

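Ruelle's Perron-Frobenius Theorem.
The proof of this theorem is based on Ruelle's version of the classical Perron-Frobenius Theorem for matrices. For a continuous potential $\varphi$ on $\Sigma_A^+$ define a linear operator $L = L_\varphi$ on the space $C(\Sigma_A^+)$ of continuous functions on $\Sigma_A^+$ by
$$(L_\varphi f)(x) = \sum_{\sigma(y) = x} e^{\varphi(y)} f(y). \qquad (3.2)$$
It is called the Ruelle operator and it provides a great tool for constructing and studying Gibbs measures. Given a continuous potential $\varphi$ on $\Sigma_A^+$, let
$$\varphi_n(x) = \sum_{k=0}^{n-1} \varphi(\sigma^k(x))$$
be the $n$-th ergodic sum of $\varphi$. Note that for all $n > 0$,
$$(L_\varphi^n f)(x) = \sum_{\sigma^n(y) = x} e^{\varphi_n(y)} f(y).$$
THEOREM 3.2 (see [3]). Let $\varphi$ be a Hölder-continuous potential on $\Sigma_A^+$. Then there exist $\lambda > 0$, a continuous positive function $h$ and a Borel measure $\nu$ such that
1. $L_\varphi h = \lambda h$, $L_\varphi^* \nu = \lambda \nu$ and $\int h \, d\nu = 1$;
2. for every continuous function $f$ on $\Sigma_A^+$, uniformly in $x$,
$$\lambda^{-n} (L_\varphi^n f)(x) \to h(x) \int f \, d\nu \quad \text{as } n \to \infty; \qquad (3.4)$$
3. the measure $\mu_\varphi = h\nu$ is the $\sigma$-invariant Gibbs measure for $\varphi$.
One can show (see [3]) that the rate of convergence in (3.4) is exponential. This implies that the measure $\mu_\varphi$ has exponential decay of correlations with respect to the class of Hölder-continuous test functions on $\Sigma_A^+$. Recall that a continuous transformation $T$ has exponential decay of correlations with respect to an invariant Borel probability measure $\mu$ and a class $H$ of test functions if there exists $0 < \theta < 1$ such that, for any $h_1, h_2 \in H$ and some constant $C = C(h_1, h_2) > 0$,
$$\left| \int h_1(T^n(x)) h_2(x) \, d\mu - \int h_1 \, d\mu \int h_2 \, d\mu \right| \le C \theta^n.$$
One can further show (see [3] and [8]) that the measure $\mu_\varphi$ in Theorem 3.2 satisfies the Central Limit Theorem (CLT). Recall that a continuous transformation $T$ satisfies the Central Limit Theorem (CLT) for test functions from a class $H$ if for any $h \in H$ which is not a coboundary (i.e., $h \ne g \circ T - g$ for any $g$), there exists $\gamma > 0$ such that
$$\mu\left\{ x : \frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} \left( h(T^i(x)) - \int h \, d\mu \right) < t \right\} \to \frac{1}{\gamma \sqrt{2\pi}} \int_{-\infty}^{t} e^{-\tau^2 / (2\gamma^2)} \, d\tau \quad \text{as } n \to \infty.$$
3.3. Conformal measures. The measure $\nu$ in Theorem 3.2 has the important property of being conformal. Recall that given a potential $\varphi$ on $\Sigma_A^+$, a Borel probability measure $\mu$ on $\Sigma_A^+$ (which is not necessarily invariant under the shift) is said to be conformal (with respect to $\varphi$) if for some constant $\lambda$ and almost every $x \in \Sigma_A^+$,
$$\frac{d\mu}{d(\mu \circ \sigma)}(x) = \lambda^{-1} e^{\varphi(x)}.$$
One can show that in our case the relation $L_\varphi^* \nu = \lambda \nu$ is equivalent to the fact that $\nu$ is a conformal measure for $\varphi$.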
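The Ruelle operator becomes a finite matrix in one simple situation, which makes the statements above easy to test. The sketch below is an illustration under explicit assumptions, not a construction from the survey: it assumes a potential of the form $\varphi(x) = f(x_0, x_1)$ and test functions that depend on $x_0$ only, in which case $L_\varphi$ acts on vectors through the weighted matrix $M_{ab} = a_{ab} e^{f(a,b)}$. The transition matrix, the values of $f$, the test function and all names in the code are hypothetical. The code computes the leading eigenvalue, the eigenfunction and the one-cylinder weights of the eigenmeasure by power iteration, and compares $\lambda^{-n} L_\varphi^n g$ with $h \int g \, d\nu$.

```python
import math

# Hypothetical example: the golden-mean shift on two symbols (the word "11" is forbidden).
A = [[1, 1],
     [1, 0]]
# Hypothetical potential phi(x) = f(x_0, x_1) depending on two coordinates.
f = [[0.3, -0.2],
     [0.5, 0.0]]
m = len(A)
# Weighted transition matrix M[a][b] = a_ab * exp(f(a, b)).
M = [[A[a][b] * math.exp(f[a][b]) for b in range(m)] for a in range(m)]

def ruelle(g):
    """(L g)(b) = sum over admissible a of exp(f(a, b)) * g(a), for g = g(x_0)."""
    return [sum(M[a][b] * g[a] for a in range(m)) for b in range(m)]

def dual(nu):
    """L* on cylinder weights nu_a = nu([a]): (L* nu)_a = sum_b M[a][b] * nu_b."""
    return [sum(M[a][b] * nu[b] for b in range(m)) for a in range(m)]

def power_iteration(op, steps=200):
    """Leading eigenvalue and eigenvector (normalized to sum 1) of a positive map."""
    v = [1.0 / m] * m
    lam = 0.0
    for _ in range(steps):
        w = op(v)
        lam = sum(w)  # sum(v) == 1, so this estimates the eigenvalue
        v = [x / lam for x in w]
    return lam, v

lam, h = power_iteration(ruelle)      # L h = lam * h
lam_dual, nu = power_iteration(dual)  # L* nu = lam * nu, nu sums to 1
scale = sum(h[a] * nu[a] for a in range(m))
h = [x / scale for x in h]            # normalize so that the integral of h d(nu) is 1

g = [1.0, 2.0]                        # hypothetical test function g(x) = g(x_0)
iterate = g[:]
for _ in range(60):
    iterate = [x / lam for x in ruelle(iterate)]  # lam^{-n} L^n g with n = 60

integral = sum(g[a] * nu[a] for a in range(m))    # integral of g d(nu)
limit = [h[b] * integral for b in range(m)]
print(lam, iterate, limit)
```

In this finite-dimensional reduction the exponential rate of convergence is governed by the ratio of the second eigenvalue of $M$ to the leading one.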
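3.4. The topological pressure and the variational principle. We defined above the topological pressure of the potential $\varphi$ as the constant $P$ in (3.1). Its existence is guaranteed by Theorem 3.1. We shall now give another equivalent definition of the topological pressure.
Denote by
$$Z_n(\varphi) = \sum_{\sigma^n(x) = x} \exp \varphi_n(x), \qquad \varphi_n = \sum_{k=0}^{n-1} \varphi \circ \sigma^k.$$
THEOREM 3.3. The following limit exists:
$$P(\varphi) = \lim_{n \to \infty} \frac{1}{n} \log Z_n(\varphi). \qquad (3.5)$$
If the potential $\varphi$ is Hölder-continuous, then $P(\varphi)$ coincides with the constant $P$ given by (3.1).
The formula (3.5) extends the notion of the topological pressure to continuous potentials.
One of the fundamental results in thermodynamics is the variational principle for the topological pressure.
THEOREM 3.4. For every continuous potential $\varphi$,
$$P(\varphi) = \sup_{\mu} \left( h_\mu(\sigma) + \int \varphi \, d\mu \right), \qquad (3.6)$$
where $h_\mu(\sigma)$ is the entropy of $\mu$ and the supremum is taken over the set of all $\sigma$-invariant Borel probability measures on $\Sigma_A^+$.
Note that one can restrict the supremum in (3.6) to all $\sigma$-invariant ergodic Borel probability measures on $\Sigma_A^+$. For proofs of Theorems 3.3 and 3.4 we refer the reader to [3] (see also [34]).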
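A definition of the pressure through sums over periodic points can be tried out on a small example. The sketch below rests on assumptions that are not part of the survey: the partition function is taken to be the sum of $\exp \varphi_n(x)$ over the points with $\sigma^n(x) = x$, the potential is of the form $\varphi(x) = f(x_0, x_1)$, and the transition matrix, the values of $f$ and all names are hypothetical. Under these assumptions the partition function equals the trace of the $n$-th power of the weighted matrix $M_{ab} = a_{ab} e^{f(a,b)}$, and $\frac{1}{n}$ times its logarithm should approach the logarithm of the leading eigenvalue of $M$.

```python
import math
from itertools import product

# Hypothetical example: golden-mean shift with a two-coordinate potential.
A = [[1, 1],
     [1, 0]]
f = [[0.3, -0.2],
     [0.5, 0.0]]
m = len(A)
M = [[A[a][b] * math.exp(f[a][b]) for b in range(m)] for a in range(m)]

def mat_mul(P, Q):
    """Product of two m-by-m matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def partition_function(n):
    """Z_n = trace(M^n), the sum of exp(phi_n(x)) over points with sigma^n(x) = x."""
    power = M
    for _ in range(n - 1):
        power = mat_mul(power, M)
    return sum(power[i][i] for i in range(m))

def brute_force(n):
    """The same sum, enumerating admissible periodic words x_0 ... x_{n-1} directly."""
    total = 0.0
    for word in product(range(m), repeat=n):
        pairs = [(word[k], word[(k + 1) % n]) for k in range(n)]
        if all(A[a][b] for a, b in pairs):
            total += math.exp(sum(f[a][b] for a, b in pairs))
    return total

n = 40
pressure_estimate = math.log(partition_function(n)) / n

# Leading eigenvalue of the 2-by-2 matrix M from its characteristic polynomial.
tr = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
lam = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0

print(pressure_estimate, math.log(lam))
```

The brute-force enumeration is included only as a cross-check of the trace formula for small $n$; it grows exponentially with $n$ and is not meant for computing the limit.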

REMARK.
The definition of the topological pressure based on Theorem 3.3 has an advantage over the definition based on formula (3.1): one can replace the space $\Sigma_A^+$ with a compact invariant subset $K$ (i.e., $\sigma^{-1}(K) = K$), and one can replace the requirement that the potential $\varphi$ is Hölder-continuous with the weaker condition that it is just continuous. Furthermore, it is exactly Theorem 3.3 that shows the way to extend the notion of topological pressure to countable state Markov chains (see Section 4.5), which is the main subject of Sarig's work.
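3.5. Equilibrium measures. Given a continuous potential $\varphi$, a $\sigma$-invariant measure $\mu = \mu_\varphi$ on $\Sigma_A^+$ is said to be an equilibrium measure if
$$h_\mu(\sigma) + \int \varphi \, d\mu = P(\varphi),$$
that is, if $\mu$ attains the supremum in the variational principle (3.6).
THEOREM 3.5 (see [3]). If the potential $\varphi$ is Hölder-continuous, then the Gibbs measure $\mu_\varphi$ in Ruelle's Perron-Frobenius Theorem is the unique equilibrium measure for $\varphi$. Moreover, $\log \lambda = P(\varphi)$.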
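The relation $\log \lambda = P(\varphi)$ and the extremal property of the equilibrium measure can be illustrated numerically in a simple case. The sketch below is only an illustration under stated assumptions: the potential is assumed to have the form $\varphi(x) = f(x_0, x_1)$, in which case the Gibbs measure is a Markov measure built from the left and right Perron eigenvectors of the weighted matrix $M_{ab} = a_{ab} e^{f(a,b)}$. The transition matrix, the values of $f$, the family of competing Markov measures and all names are hypothetical. The code evaluates $h_\mu(\sigma) + \int \varphi \, d\mu$ for this measure and for a few other invariant Markov measures.

```python
import math

# Hypothetical example: golden-mean shift with a two-coordinate potential.
A = [[1, 1],
     [1, 0]]
f = [[0.3, -0.2],
     [0.5, 0.0]]
m = len(A)
M = [[A[a][b] * math.exp(f[a][b]) for b in range(m)] for a in range(m)]

def power_iteration(op, steps=200):
    """Leading eigenvalue and eigenvector (normalized to sum 1) of a positive map."""
    v = [1.0 / m] * m
    lam = 0.0
    for _ in range(steps):
        w = op(v)
        lam = sum(w)
        v = [x / lam for x in w]
    return lam, v

# Right eigenvector: M r = lam r.  Left eigenvector: l M = lam l.
lam, r = power_iteration(lambda v: [sum(M[a][b] * v[b] for b in range(m)) for a in range(m)])
_, l = power_iteration(lambda v: [sum(v[a] * M[a][b] for a in range(m)) for b in range(m)])

# Markov measure with transition probabilities P and stationary vector pi.
P = [[M[a][b] * r[b] / (lam * r[a]) for b in range(m)] for a in range(m)]
norm = sum(l[a] * r[a] for a in range(m))
pi = [l[a] * r[a] / norm for a in range(m)]

def free_energy(pi, P):
    """Entropy plus the integral of the potential for a stationary Markov measure."""
    total = 0.0
    for a in range(m):
        for b in range(m):
            if P[a][b] > 0:
                total += pi[a] * P[a][b] * (f[a][b] - math.log(P[a][b]))
    return total

equilibrium_value = free_energy(pi, P)

# Competing invariant Markov measures on the same subshift, indexed by q.
competitors = []
for q in (0.2, 0.4, 0.6, 0.8):
    Q = [[q, 1.0 - q], [1.0, 0.0]]
    stationary = [1.0 / (2.0 - q), (1.0 - q) / (2.0 - q)]
    competitors.append(free_energy(stationary, Q))

print(equilibrium_value, math.log(lam), competitors)
```

The competing measures form only a one-parameter family, so the experiment illustrates the variational principle rather than verifying it over all invariant measures.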
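3.6. Two-sided subshifts. Many results in the thermodynamic formalism of one-sided subshifts can be extended to two-sided subshifts $(\Sigma_A, \sigma)$, where
$$\Sigma_A = \{x = (x_n)_{n \in \mathbb{Z}} : a_{x_n x_{n+1}} = 1 \text{ for all } n \in \mathbb{Z}\}$$
and $\sigma$ is the left shift. One can view $(\Sigma_A, \sigma)$ as the natural extension of $(\Sigma_A^+, \sigma)$. This is based on results by Sinai [28] and Bowen [3]: for every Hölder-continuous potential $\varphi$ on $\Sigma_A$ there are a Hölder-continuous potential $\psi$, which depends only on the coordinates $x_n$ with $n \ge 0$, and a Hölder-continuous function $u$ such that
$$\varphi = \psi + u - u \circ \sigma.$$
This equation means that the potentials $\varphi$ and $\psi$ are cohomologous. Due to the variational principle, one can show that two cohomologous potentials have the same set of Gibbs measures.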

COUNTABLE MARKOV CHAINS
We now move from subshifts of finite type, i.e., Markov chains with finitely many states, to Markov chains with countably many states, or countable Markov chains, $(X = \Sigma_A^+, \sigma)$, where $A$ is a transition matrix on a countable set $S$ of states and $\sigma$ is the left shift. The Borel $\sigma$-algebra $\mathcal{B}$ is generated by all cylinders. The main obstacle in constructing equilibrium measures in this case is that the space $X$ is not compact and hence the space of probability measures on $X$ is not compact either, so new methods are needed to overcome this difficulty.

4.1. A bit of history. I will mention a few major developments that preceded Sarig's work. The list below is far from complete and is only meant to set the ground for the results described below.
1. In 1967-68 Dobrushin [6], Lanford [14] and Ruelle [20] introduced what is now known as DLR measures; they characterize Gibbs measures (also called Gibbsian distributions) in terms of families of conditional probabilities (see below).
2. Gurevic [9,10] studied the topological entropy (corresponding to the case $\varphi = 0$) and established the variational principle for the topological entropy; later he introduced the notion of topological pressure and obtained the variational principle (see [11] and also [12]). Vere-Jones [30] studied recurrence properties that are central for constructing Gibbs measures. Both Gurevic and Vere-Jones assumed that the potential function depends on finitely many coordinates, which allowed them to use some ideas from renewal theory.
3. Yuri [35] proved convergence in (3.4) requiring the finite-images property (see below).
4. Aaronson, Denker and Urbanski [2] studied ergodic properties of conformal measures (in particular, they proved that these measures are conservative), and Aaronson and Denker [1] established convergence in (3.4) requiring the big images property (which they called the Gibbs-Markov property; see below).

Dobrushin-Lanford-Ruelle (DLR) measures.
We begin with a description of DLR measures. We think of $X$ as a one-dimensional lattice whose points are called sites, so that each site $n$ can be in one of countably many states $x_n$. Given a probability measure $\mu$ on $X$, consider the conditional measures on cylinders $[a_0, \dots, a_{n-1}]$ generated by $\mu$, i.e., the conditional distribution of the configuration of the first $n$ sites $(a_0, \dots, a_{n-1})$ given that site $n$ is in state $x_n$, site $(n+1)$ is in state $x_{n+1}$, etc. More precisely, for almost all $x \in X$,
$$\mu(a_0, \dots, a_{n-1} \mid x_n, x_{n+1}, \dots)(x) = E_\mu(1_{[a_0, \dots, a_{n-1}]} \mid \sigma^{-n}\mathcal{B})(x).$$
Given $\beta > 0$ and a measurable function $U : X \to \mathbb{R}$, we call a probability measure $\mu$ on $X$ a Dobrushin-Lanford-Ruelle (DLR) measure for the potential $\varphi = -\beta U$ if for all $N \ge 1$ and almost every $x \in X$ the conditional measures of $\mu$ satisfy the DLR equation:
$$\mu(x_0, \dots, x_{N-1} \mid x_N, x_{N+1}, \dots)(x) = \frac{\exp \varphi_N(x)}{\sum_{\sigma^N(y) = \sigma^N(x)} \exp \varphi_N(y)}, \qquad \varphi_N = \sum_{k=0}^{N-1} \varphi \circ \sigma^k.$$
The problem now is to recover $\mu$ from its conditional probabilities.

Conformal measures.
In the particular case ϕ(x) = f (x 0 , x 1 ) recovering the measure µ from its conditional probabilities is the well-known Kolmogorov Theorem in the theory of classical Markov chains where the stochastic matrix P = (p i j ) is given by p i j = exp( f (i , j )) if a i j = 1 and p i j = 0 otherwise. For general potentials ϕ DLR measures can be recovered using conformal measures. More precisely, the following statement holds.

THEOREM 4.1 (see [1]). Let ϕ be a Borel function and µ a nonsingular conformal probability measure for ϕ on a countable Markov chain X . Then µ is a DLR measure for ϕ.
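Note that this result is quite general as it imposes essentially no restrictions on the potential $\varphi$. One can obtain much stronger statements assuming a certain level of regularity of the potential. Let
$$\mathrm{var}_n(\varphi) = \sup\{|\varphi(x) - \varphi(y)| : x_i = y_i \text{ for } 0 \le i \le n-1\}$$
be the $n$-th variation of $\varphi$. We say that $\varphi$ has summable variations if
$$\sum_{n \ge 2} \mathrm{var}_n(\varphi) < \infty, \qquad (4.2)$$
and $\varphi$ is locally Hölder-continuous if there exist $C > 0$ and $0 < \theta < 1$ such that for all $n \ge 2$,
$$\mathrm{var}_n(\varphi) \le C \theta^n.$$
It is easy to see that if the potential is locally Hölder-continuous, then it has summable variations.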

The Gurevic-Sarig pressure.
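In what follows we shall always assume that the shift $\sigma$ is topologically mixing, that is, given $i, j \in S$ there is $N = N(i, j)$ such that for any $n \ge N$ there is an admissible word of length $n$ connecting $i$ and $j$. For $i \in S$ let
$$Z_n(\varphi, i) = \sum_{\sigma^n(x) = x} \exp(\varphi_n(x)) \, 1_{[i]}(x). \qquad (4.4)$$
The Gurevic-Sarig pressure of $\varphi$ (see footnote 3) is the number
$$P(\varphi) = \lim_{n \to \infty} \frac{1}{n} \log Z_n(\varphi, i). \qquad (4.5)$$
This notion is a generalization of the notion of topological entropy $h_G(\sigma)$ for countable Markov chains introduced by Gurevic in [9,10], so that $P(0) = h_G(\sigma)$. Existence of the limit in (4.5) and some basic properties of the pressure are described by the following theorem.
THEOREM 4.2 (Sarig [21,26]). Assume that the potential $\varphi$ has summable variations, i.e., it satisfies (4.2). Then
1. the limit in (4.5) exists for all $i \in S$ and is independent of $i$;
2. $-\infty < P(\varphi) \le \infty$;
3. $P(\varphi)$ is the supremum of the topological pressures of the restrictions of $\varphi$ to the topologically mixing subshifts of finite type contained in $X$.
THEOREM 4.3 (Sarig [21]). Assume that $\varphi$ has summable variations and $\sup \varphi < \infty$. Then
$$P(\varphi) = \sup_{\mu} \left( h_\mu(\sigma) + \int \varphi \, d\mu \right),$$
where the supremum is taken over all $\sigma$-invariant Borel probability measures $\mu$ on $X$ such that $-\int \varphi \, d\mu < \infty$.
Our goal now is to construct an equilibrium measure $\mu_\varphi$ for $\varphi$, find conditions under which it is unique and study its ergodic properties. We will achieve this by first constructing a Gibbs measure for $\varphi$ and then showing that it is an equilibrium measure for $\varphi$ provided it has finite entropy.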

Gibbs measures for countable Markov chains.
The construction of Gibbs measures is based on the study of the Ruelle operator $L_\varphi$ and on establishing a generalized version of Ruelle's Perron-Frobenius Theorem. The role of the Ruelle operator in the study of Gibbs measures can be seen from the following result that connects this operator with the Gurevic-Sarig pressure.
We say that a nonzero function $f$ is a test function if it is bounded, continuous, nonnegative and supported inside a finite union of cylinders.
Footnote 3: The notion of the pressure for countable Markov shifts that we discuss in this section was introduced by Gurevic [11] (see also [12]) for potentials of a special kind that only depend on the first coordinate. It was Sarig who extended Gurevic's approach to potentials that may depend on all coordinates (see [21,26]) and described all the main properties of the pressure (see Theorem 4.2). This is why we propose to call this pressure after both Gurevic and Sarig.

THEOREM 4.4 (Sarig [21]). Assume that $\varphi$ has summable variations. Then for every test function $f$ and all $x \in X$,
$$\lim_{n \to \infty} \frac{1}{n} \log (L_\varphi^n f)(x) = P(\varphi),$$
where $L_\varphi$ is the Ruelle operator given by (3.2).
This result implies that if $P(\varphi)$ is finite, then for every $x \in X$ the asymptotic growth of $L_\varphi^n f(x)$ is $\lambda^n$, where $\lambda = \exp P(\varphi)$.

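4.8. Recurrence properties of the potential. We now wish to obtain more refined information on the asymptotic behavior of $\lambda^{-n} L_\varphi^n$. To this end, given a state $i \in S$, let $Z_n(\varphi, i)$ be given by (4.4) and
$$Z_n^*(\varphi, i) = \sum_{\sigma^n(x) = x} \exp(\varphi_n(x)) \, 1_{[\varphi_i = n]}(x),$$
where $\varphi_i(x) = 1_{[i]}(x) \inf\{n \ge 1 : \sigma^n(x) \in [i]\}$ is the first return time to the cylinder $[i]$ (not to be confused with the ergodic sums $\varphi_n$). Let $\lambda = \exp P(\varphi)$. We say that the potential $\varphi$ is
1. recurrent if $\sum_{n \ge 1} \lambda^{-n} Z_n(\varphi, i) = \infty$ and transient otherwise;
2. positive recurrent if it is recurrent and $\sum_{n \ge 1} n \lambda^{-n} Z_n^*(\varphi, i) < \infty$;
3. null recurrent if it is recurrent and $\sum_{n \ge 1} n \lambda^{-n} Z_n^*(\varphi, i) = \infty$.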
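The modes of recurrence can be explored numerically. The sketch below rests on assumptions that go beyond the text: it uses Sarig's criteria that $\varphi$ is recurrent when the series of $\lambda^{-n} Z_n(\varphi, i)$ diverges and positive recurrent when, in addition, the series of $n \lambda^{-n} Z_n^*(\varphi, i)$ converges, where $Z_n^*$ counts only first returns to $[i]$ and $\lambda = \exp P(\varphi)$. It also assumes a potential depending on finitely many coordinates, for which every loop at $i$ splits into first-return loops and one obtains the renewal equation $Z_n = \sum_{k=1}^{n} Z_k^* Z_{n-k}$ with $Z_0 = 1$. The two families of first-return weights and all names are hypothetical; both are chosen so that $\lambda = 1$.

```python
import math

def loop_sums(first_return, n_max):
    """Z_0, ..., Z_{n_max} from first-return weights via the renewal equation
    Z_n = sum_{k=1}^{n} Z*_k * Z_{n-k}, with Z_0 = 1."""
    z = [1.0]
    for n in range(1, n_max + 1):
        z.append(sum(first_return(k) * z[n - k] for k in range(1, n + 1)))
    return z

N = 400

def geometric(k):
    """Hypothetical weights Z*_k = 2^{-k}; they sum to 1 and have finite mean."""
    return 2.0 ** (-k)

def polynomial(k):
    """Hypothetical weights Z*_k = 3 / (pi^2 k^2); they sum to 1/2."""
    return 3.0 / (math.pi ** 2 * k ** 2)

z_recurrent = loop_sums(geometric, N)
z_transient = loop_sums(polynomial, N)

# First example: partial sums of Z_n grow linearly (recurrence), while the
# series of k * Z*_k converges to 2 (positive recurrence).
partial_recurrent = sum(z_recurrent[1:])
mean_return = sum(k * geometric(k) for k in range(1, N + 1))

# Second example: the full series of Z_n, n >= 0, sums to 1 / (1 - 1/2) = 2,
# so the partial sums stay bounded (transience).
partial_transient = sum(z_transient)

print(partial_recurrent, mean_return, partial_transient)
```

In the first example $Z_n = 1/2$ for every $n \ge 1$, which is the reciprocal of the mean return time, in agreement with the classical renewal theorem.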

Generalized Ruelle's Perron-Frobenius (GRPF) Theorem.
Our standing assumption now is that the potential $\varphi$ has summable variations and finite Gurevic-Sarig pressure, i.e., $P(\varphi) < \infty$. The following result provides a complete characterization of each of the above types of potentials, i.e., the necessary and sufficient conditions for the potential $\varphi$ to belong to one of the above classes.
THEOREM 4.5 (Sarig [21,22,25]).
This theorem is a far-reaching generalization of Theorem 3.2 and it covers earlier results by Vere-Jones [30,31], by Aaronson and Denker [1] and by Yuri [35,36]. In particular, the result by Yuri requires the finite-images property and the result by Aaronson and Denker requires the big images property (BIP). In fact, the BIP property can be used to characterize the existence of $\sigma$-invariant Gibbs measures.
THEOREM 4.6 (Sarig [25]). Assume that the potential $\varphi$ has summable variations. Then $\varphi$ admits a unique $\sigma$-invariant Gibbs measure $\mu_\varphi$ if and only if
1. $X$ satisfies the BIP property;
2. $P(\varphi) < \infty$ and $\mathrm{var}_1 \varphi < \infty$ (i.e., $\sum_{n \ge 1} \mathrm{var}_n(\varphi) < \infty$).
In this case $\varphi$ is positive recurrent and $\mu_\varphi = h\nu$, where $\nu$ is the conformal measure for $\varphi$ in the GRPF Theorem 4.5.

Existence and uniqueness of equilibrium measures.
Let $\varphi$ be a potential on $\Sigma_A^+$. The following result by Sarig [25] provides some sufficient conditions on $\varphi$ that guarantee the existence of an equilibrium measure for $\varphi$.
THEOREM 4.7. Assume that the potential $\varphi$ has summable variations and $P(\varphi) < \infty$. Assume also that $\varphi$ is positive recurrent and $\sup \varphi < \infty$. If the measure $\nu$ in the GRPF Theorem is such that the measure $\mu_\varphi = h\nu$ has finite entropy, then $\mu_\varphi$ is an equilibrium measure for $\varphi$.
We shall now state a result by Buzzi and Sarig [4] that guarantees uniqueness of the equilibrium measure for the potential $\varphi$. It also shows that the requirement that $\varphi$ is positive recurrent is necessary for the existence of the equilibrium measure provided the potential is bounded from above.
THEOREM 4.8. Assume that the potential $\varphi$ has summable variations and $P(\varphi) < \infty$. Assume also that $\sup \varphi < \infty$. Then $\varphi$ has at most one equilibrium measure. In addition, if such a measure exists, then $\varphi$ is positive recurrent and this measure coincides with the measure $\nu$ in the GRPF Theorem and has finite entropy.
4.11. Ergodic properties.
THEOREM 4.9 (Sarig [21,25]). Assume that the potential $\varphi$ has summable variations and $P(\varphi) < \infty$. Assume also that $\sup \varphi < \infty$. If $\mu = \mu_\varphi$ is an equilibrium measure for $\varphi$ then $\mu$ is strongly mixing and
The strong mixing property is a corollary of a general result by Aaronson, Denker and Urbanski, which asserts that if $\nu$ is a nonsingular $\sigma$-invariant measure that is finite on cylinders, conservative, and whose log of the Jacobian has summable variations, then $\nu$ is strongly mixing. One can show that the measure $\mu$ in the above theorem is indeed a Bernoulli measure.
4.12. Decay of correlations and Central Limit Theorem.
THEOREM 4.10 (Sarig [25]). Assume that the potential $\varphi$ is locally Hölder-continuous and $P(\varphi) < \infty$. Assume also that $\sup \varphi < \infty$. Then the equilibrium measure $\mu_\varphi$ for $\varphi$ has exponential decay of correlations (with respect to the class of Hölder-continuous functions on $X$) and satisfies the Central Limit Theorem.
The proof of this result is based on the crucial spectral gap property (SGP) of the Ruelle operator, which asserts that in an appropriate (sufficiently "large") Banach space $\mathcal{B}$ of continuous functions,
$$L_\varphi = \lambda P + N,$$
where $\lambda = \exp P(\varphi)$, $PN = NP = 0$, $P^2 = P$ and $\dim(\mathrm{Im}\, P) = 1$.
Furthermore, the spectral radius of $N$ is less than $\lambda$. The SGP implies the exponential rate of convergence in (3.4), leading to the exponential decay of correlations and the Central Limit Theorem for functions in $\mathcal{B}$ (see [8]).
For subshifts of finite type there is a subspace $\mathcal{B}$ on which the Ruelle operator has the SGP (due to Ruelle and Doeblin-Fortet), but this may not be true for countable Markov chains due to the presence of phase transitions. Indeed, the SGP guarantees that the function $t \mapsto p(t) = P(\varphi + t\psi)$ (where $\varphi$ and $\psi$ are locally Hölder-continuous) is real-analytic, implying uniqueness of equilibrium measures. However, for countable Markov chains, as $t$ varies the function $\varphi + t\psi$ can change its mode of recurrence (e.g., move from being positive recurrent to null recurrent or to transient), resulting in nonanalyticity of the function $p(t)$ and hence the appearance of phase transitions. Given a subshift of countable type, Cyr and Sarig found a necessary and sufficient condition for the existence of a space $\mathcal{B}$ on which the Ruelle operator has the SGP. Using this condition they showed that absence of phase transitions is open and dense in the space of locally Hölder-continuous potentials (see [5]).
4.13. Analyticity of the pressure function. We close our presentation by describing a result of Sarig that guarantees analyticity of the pressure function (and hence absence of phase transitions). We recall that our standing assumption is that the potential $\varphi$ has summable variations and finite Gurevic-Sarig pressure. We are interested in the existence of the directional derivatives $\frac{d}{dt}\big|_{t=0} P(\varphi + t\psi)$, and we restrict ourselves to the set of directions $\mathrm{Dir}(\varphi)$, which consists of those $\psi$ for which $\sum_{n=2}^{\infty} V_n(\psi) < \infty$ (recall that $V_n(\psi)$ is the $n$-th variation of $\psi$) and there exists $\varepsilon > 0$ such that for any $|t| < \varepsilon$ we have $P(\varphi + t\psi) < \infty$.
THEOREM 4.11 (Sarig [22]). If $\varphi$ is strongly positive recurrent, then for every $\psi \in \mathrm{Dir}(\varphi)$ that is Hölder-continuous there exists $\varepsilon > 0$ such that $\varphi + t\psi$ is positive recurrent for all $|t| < \varepsilon$ and such that the function $t \mapsto P(\varphi + t\psi)$ is real analytic in $(-\varepsilon, \varepsilon)$.
Of particular interest is the case when $\psi = \varphi$, as it appears in the study of the one-parameter family of potentials $\{\beta\varphi\}_{\beta \ge \beta_0}$. One can show that if $P(\beta_0 \varphi) < \infty$, then $P(\beta\varphi) < \infty$ for all $\beta > \beta_0$ and hence $\varphi \in \mathrm{Dir}(\beta\varphi)$. This, however, may not be true for $\beta = \beta_0$; see an example in [24].