On the replica symmetric solution of the K-sat model

In this paper we translate Talagrand's solution of the K-sat model at high temperature into the language of asymptotic Gibbs measures. Using exact cavity equations in the infinite volume limit allows us to remove many technicalities of the inductions on the system size, which clarifies the main ideas of the proof. This approach also yields a larger region of parameters where the system is in a pure state and, in particular, for small connectivity parameter we prove the replica symmetric formula for the free energy at any temperature.


Introduction
The replica symmetric solution of the random K-sat model at high temperature was first proved by Talagrand in [8], and the argument was later improved in [9] and, again, in [10]. The main technical tool of the proof is the so-called cavity method, but there are several other interesting and non-trivial ideas that play an important role. In this paper, we will translate these ideas into the language of asymptotic Gibbs measures developed by the author in [7]. The main advantage of this approach is that the cavity equations become exact in the infinite volume limit, which allows us to bypass all the subtle inductions on the size of the system and to clarify the essential ideas. Using the exact cavity equations, we will also be able to prove that the system is in a pure state for a larger region of parameters.
Consider an integer p ≥ 2 and real numbers α > 0, called the connectivity parameter, and β > 0, called the inverse temperature parameter. Consider a random function on {−1,1}^p,

θ(σ_1, …, σ_p) = −β ∏_{1≤i≤p} (1 + J_i σ_i)/2,   (1)

where (J_i)_{1≤i≤p} are independent random signs, P(J_i = ±1) = 1/2. Let (θ_k)_{k≥1} be a sequence of independent copies of the function θ, defined in terms of independent copies of (J_i)_{1≤i≤p}. Using this sequence, we define a Hamiltonian H_N(σ) on Σ_N = {−1,1}^N by

−H_N(σ) = ∑_{k≤π(αN)} θ_k(σ_{i_{1,k}}, …, σ_{i_{p,k}}),   (2)

where π(αN) is a Poisson random variable with the mean αN and the indices (i_{j,k})_{j,k≥1} are independent and uniform on {1,…,N}. This is the Hamiltonian of the random K-sat model with K = p, and our goal will be to compute the limit of the free energy

F_N = (1/N) E log Z_N,  where  Z_N = ∑_{σ∈Σ_N} exp(−H_N(σ)),

as N → ∞ in some region of parameters (α, β). It will be convenient to extend the definition of the function θ from {−1,1}^p to [−1,1]^p as follows. Since the product over 1 ≤ i ≤ p in (1) takes only the two values 0 and 1, we can write

exp θ(σ_1, …, σ_p) = 1 + (e^{−β} − 1) ∏_{1≤i≤p} (1 + J_i σ_i)/2.   (3)

At some point, we will be averaging exp θ over the coordinates σ_1, …, σ_p independently of each other, so the resulting average will be of the same form with σ_i taking values in [−1,1]. It will be our choice to represent this average again as exp θ, with θ now defined on [−1,1]^p by

θ(σ_1, …, σ_p) = log(1 + (e^{−β} − 1) ∏_{1≤i≤p} (1 + J_i σ_i)/2).   (4)

Given ζ ∈ Pr[−1,1], the set of probability distributions on [−1,1], let (z_{j,k})_{j,k≥1} and (z_j)_{j≥1} be i.i.d. random variables with the distribution ζ and define

P(ζ) = log 2 + E log Av exp ∑_{k≤π(αp)} θ_k(z_{1,k}, …, z_{p−1,k}, ε) − (p − 1)α E θ(z_1, …, z_p),   (5)

where π(αp) is a Poisson random variable with the mean αp independent of everything else and Av denotes the average over ε ∈ {−1,1}. The functional P(ζ) is called the replica symmetric formula in this model. Our first result will hold in the region of parameters (6). In this case, we will show that asymptotically the system is always in a pure state, in the sense that will be explained in Section 3, and the following holds.
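As a quick sanity check on these definitions, the following sketch (all function names are ours, not from the paper) samples the disorder of the Hamiltonian (2), with each clause term θ_k equal to −β exactly when the clause is violated, and computes the finite-size free energy (1/N) log Z_N for one disorder sample by brute-force enumeration of Σ_N.

```python
import itertools
import math
import random

def poisson(mean, rng):
    # Knuth's method; fine for the small means used here.
    L = math.exp(-mean)
    k, t = 0, rng.random()
    while t > L:
        k += 1
        t *= rng.random()
    return k

def make_instance(N, p, alpha, rng):
    """Draw the disorder of (2): pi(alpha*N) clauses, each with p uniform
    variable indices and p independent random signs J_i."""
    clauses = []
    for _ in range(poisson(alpha * N, rng)):
        idx = [rng.randrange(N) for _ in range(p)]
        J = [rng.choice((-1, 1)) for _ in range(p)]
        clauses.append((idx, J))
    return clauses

def minus_hamiltonian(sigma, clauses, beta):
    """-H_N(sigma): each clause contributes theta = -beta * prod_i (1+J_i s_i)/2,
    i.e. a penalty -beta exactly when the clause is violated."""
    total = 0.0
    for idx, J in clauses:
        prod = 1.0
        for i, j in zip(idx, J):
            prod *= (1 + j * sigma[i]) / 2
        total -= beta * prod
    return total

def free_energy(N, p, alpha, beta, seed=0):
    """(1/N) log Z_N for one disorder sample, by enumerating {-1,1}^N."""
    rng = random.Random(seed)
    clauses = make_instance(N, p, alpha, rng)
    Z = sum(math.exp(minus_hamiltonian(sigma, clauses, beta))
            for sigma in itertools.product((-1, 1), repeat=N))
    return math.log(Z) / N

F = free_energy(N=10, p=3, alpha=1.0, beta=1.0)
```

For N this small the enumeration is exact; the theorems of the paper concern the limit N → ∞, so this only illustrates the objects involved.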

Theorem 1 If (6) holds then

lim_{N→∞} F_N = inf_{ζ ∈ Pr[−1,1]} P(ζ).   (7)
Notice that when the connectivity parameter α is small, (p − 1)pα < 1, the formula (7) holds at all temperatures, which is a new feature of our approach. One can say more under the additional assumption that

(1/2)(e^β − 1)(p − 1)pα < 1.   (8)
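Assuming the replica symmetric functional has the standard form for diluted models, namely P(ζ) = log 2 + E log Av exp ∑_{k≤π(αp)} θ_k(z_{1,k},…,z_{p−1,k}, ε) − (p−1)α E θ(z_1,…,z_p) (our reading of (5)), it can be estimated by plain Monte Carlo for any candidate ζ. The sketch below (helper names are ours) does this for the trivial choice ζ = δ_0.

```python
import math
import random

def poisson(mean, rng):
    L = math.exp(-mean)
    k, t = 0, rng.random()
    while t > L:
        k += 1
        t *= rng.random()
    return k

def exp_theta(J, sigma, beta):
    """exp(theta) with theta extended to [-1,1]^p as in the text:
    exp theta = 1 + (e^{-beta} - 1) * prod_i (1 + J_i sigma_i)/2."""
    prod = 1.0
    for j, s in zip(J, sigma):
        prod *= (1 + j * s) / 2
    return 1 + (math.exp(-beta) - 1) * prod

def rs_functional(sample_zeta, p, alpha, beta, n_mc=10000, seed=1):
    """Monte Carlo estimate of the replica symmetric functional (our reading of (5)):
    log 2 + E log Av exp sum_{k<=pi(alpha p)} theta_k(z_{1,k},...,z_{p-1,k}, eps)
          - (p-1) alpha E theta(z_1,...,z_p),
    where Av averages over eps = +-1 and all z's are i.i.d. draws from zeta."""
    rng = random.Random(seed)
    first = 0.0
    for _ in range(n_mc):
        w = {1: 1.0, -1: 1.0}
        for _ in range(poisson(alpha * p, rng)):
            J = [rng.choice((-1, 1)) for _ in range(p)]
            z = [sample_zeta(rng) for _ in range(p - 1)]
            for eps in (1, -1):
                w[eps] *= exp_theta(J, z + [eps], beta)
        first += math.log((w[1] + w[-1]) / 2)
    second = 0.0
    for _ in range(n_mc):
        J = [rng.choice((-1, 1)) for _ in range(p)]
        z = [sample_zeta(rng) for _ in range(p)]
        second += math.log(exp_theta(J, z, beta))
    return math.log(2) + first / n_mc - (p - 1) * alpha * second / n_mc

# Example: zeta = delta_0, all cavity magnetizations equal to zero.
P0 = rs_functional(lambda rng: 0.0, p=3, alpha=0.5, beta=1.0)
```

Theorem 1 then identifies the limiting free energy with the infimum of such values over all ζ.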
In particular, in this case one can show that the asymptotic Gibbs measure, which will be defined in the next section, is unique and, as a result, the infimum in (7) can be replaced by P(ζ), where ζ can be characterized as a fixed point of a certain map arising from the cavity computations. For r ≥ 1, let us consider a (random) function T_r : [−1,1]^{(p−1)×r} → [−1,1] given by

T_r((σ_{j,k})_{j≤p−1, k≤r}) = (Av ε exp A(ε)) / (Av exp A(ε)),  where  A(ε) = ∑_{k≤r} θ_k(σ_{1,k}, …, σ_{p−1,k}, ε).   (9)

We set T_0 = 0 and define a map T on Pr[−1,1] in terms of the functions (T_r) as follows. Given ζ ∈ Pr[−1,1], if we again let (z_{j,k})_{j≤p−1, k≥1} be i.i.d. random variables with the distribution ζ then T(ζ) is defined by

T(ζ) = L(T_{π(αp)}((z_{j,k})_{j≤p−1, k≤π(αp)})) = ∑_{r≥0} P(π(αp) = r) L(T_r((z_{j,k})_{j≤p−1, k≤r})),   (12)

where L(X) denotes the distribution of X. In the second line, we simply wrote the distribution as a mixture over the possible values of π(αp), since this Poisson random variable is independent of everything else. The following is essentially the main result of Chapter 6 in [10].

Theorem 2 If (8) holds then the map T has a unique fixed point, T(ζ) = ζ. If both (6) and (8) hold then lim_{N→∞} F_N = P(ζ).
As we already mentioned, the main ideas of the proof we give here will be the same as in [10] but, hopefully, more transparent. Of course, there is a trade-off in the sense that, instead of working with approximate cavity computations for systems of finite size and using the induction on N, one needs to understand how these cavity computations can be written rigorously in the infinite volume limit, which was the main point of [7]. However, we believe that passing through this asymptotic description makes the whole proof less technical and more conceptual. Moreover, the results in [7] hold for all parameters, and here we simply specialize the general theory to the high temperature region using methods developed in [8,9,10].
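Although the proofs are purely analytic, the fixed point ζ = T(ζ) appearing in Theorem 2 lends itself to a numerical sketch by population dynamics: represent ζ by a finite pool of samples and repeatedly resample through the cavity map. The sketch below assumes the ratio form T_r = Av ε exp(∑_{k≤r} θ_k) / Av exp(∑_{k≤r} θ_k) for the function in (9); all names are ours.

```python
import math
import random

def poisson(mean, rng):
    L = math.exp(-mean)
    k, t = 0, rng.random()
    while t > L:
        k += 1
        t *= rng.random()
    return k

def exp_theta(J, sigma, beta):
    # exp(theta) extended to [-1,1]^p: 1 + (e^{-beta}-1) * prod (1+J_i sigma_i)/2
    prod = 1.0
    for j, s in zip(J, sigma):
        prod *= (1 + j * s) / 2
    return 1 + (math.exp(-beta) - 1) * prod

def T_sample(pool, p, alpha, beta, rng):
    """One draw from T(zeta): with r ~ Poisson(alpha p) and z's from the pool,
    T_r = Av(eps * prod exp theta_k) / Av(prod exp theta_k), Av over eps = +-1."""
    w = {1: 1.0, -1: 1.0}
    for _ in range(poisson(alpha * p, rng)):
        J = [rng.choice((-1, 1)) for _ in range(p)]
        z = [rng.choice(pool) for _ in range(p - 1)]
        for eps in (1, -1):
            w[eps] *= exp_theta(J, z + [eps], beta)
    return (w[1] - w[-1]) / (w[1] + w[-1])

def population_dynamics(p, alpha, beta, pool_size=2000, sweeps=30, seed=2):
    """Iterate zeta <- T(zeta) on an empirical pool of samples; under the
    contraction condition (8) the pool approximates the unique fixed point."""
    rng = random.Random(seed)
    pool = [0.0] * pool_size
    for _ in range(sweeps):
        pool = [T_sample(pool, p, alpha, beta, rng) for _ in range(pool_size)]
    return pool

pool = population_dynamics(p=3, alpha=0.2, beta=1.0)
mean_abs = sum(abs(z) for z in pool) / len(pool)
```

By the symmetry of the signs J_i, the fixed-point distribution is symmetric around zero, so only quantities such as the mean absolute value carry information here.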
In the next section, we will review the definition of asymptotic Gibbs measures and recall the main results from [7], namely, the exact cavity equations and the formula for the free energy in terms of asymptotic Gibbs measures. In Section 3, we will prove that, under (6), all asymptotic Gibbs measures concentrate on one (random) function (so the system is in a pure state) and in Section 4 we will deduce Theorem 1 from this fact. Finally, in Section 5, we will prove Theorem 2 by showing that, under (6) and (8), the asymptotic Gibbs measure is unique. Of course, as in [10], the same proof works for diluted p-spin models as well but, for simplicity of notations, we will work only with the Hamiltonian (2) of the p-sat model.

Asymptotic Gibbs measures
In this section we will review the main results of [7], starting with the definition of asymptotic Gibbs measures. The Gibbs measure G_N corresponding to the Hamiltonian (2) is a (random) probability measure on {−1,1}^N defined by

G_N(σ) = exp(−H_N(σ)) / Z_N,

where the normalizing factor Z_N = ∑_{σ∈Σ_N} exp(−H_N(σ)) is called the partition function. Let (σ^ℓ)_{ℓ≥1} be an i.i.d. sequence of replicas drawn from the Gibbs measure G_N and let µ_N denote the joint distribution of the array of all spins on all replicas, (σ^ℓ_i)_{1≤i≤N, ℓ≥1}, under the average product Gibbs measure E G_N^{⊗∞}. In other words, for any choice of signs a^ℓ_i ∈ {−1,1} and any n ≥ 1,

µ_N(σ^ℓ_i = a^ℓ_i : 1 ≤ i ≤ N, 1 ≤ ℓ ≤ n) = E G_N^{⊗∞}(σ^ℓ_i = a^ℓ_i : 1 ≤ i ≤ N, 1 ≤ ℓ ≤ n).

Let us extend µ_N to a distribution on {−1,0,1}^{ℕ×ℕ} simply by setting σ^ℓ_i = 0 for i ≥ N + 1. Let M be the set of all possible limits of (µ_N) over subsequences, with respect to weak convergence of measures on the compact product space {−1,0,1}^{ℕ×ℕ}. We will call these limits the asymptotic Gibbs measures. One crucial property that these measures inherit from µ_N is the invariance under permutations of both the spin indices i and the replica indices ℓ. Invariance under permutations of the replica indices is obvious, and invariance under permutations of the spin indices holds because the distribution of the Hamiltonian (2) is invariant under any such permutation. In other words, the coordinates are symmetric in distribution, which is called symmetry between sites.
Because of these symmetries, all asymptotic Gibbs measures have a special structure. By the Aldous-Hoover representation [1,4], for any µ ∈ M, there exists a measurable function σ : [0,1]^4 → {−1,1} such that µ is the distribution of the array

s^ℓ_i = σ(w, u_ℓ, v_i, x_{i,ℓ}),   (15)

where the random variables w, (u_ℓ), (v_i), (x_{i,ℓ}) are i.i.d. uniform on [0,1]. The function σ is defined uniquely for a given µ ∈ M, up to measure-preserving transformations (Theorem 2.1 in [5]), so we can identify the distribution µ of the array (s^ℓ_i) with σ. Since, in our case, σ takes values in {−1,1}, the distribution µ is completely encoded by the function

σ̄(w, u, v) = E_x σ(w, u, v, x),   (16)

where E_x is the expectation in x only. The last coordinate x_{i,ℓ} in (15) is independent for all pairs (i,ℓ), and we can think of it as flipping a coin with the expected value σ̄(w, u_ℓ, v_i). In fact, given the function (16), we can always redefine σ by

σ(w, u, v, x) = 2 I(x ≤ (1 + σ̄(w, u, v))/2) − 1

without changing the distribution of the array (15). One can think of the function σ̄ in a more geometric way, as a Gibbs measure on a space of functions, as follows. It is well known that, asymptotically, the joint distribution µ ∈ M of all spins contains the same information as the joint distribution of all the so-called multi-overlaps

R_{ℓ_1,…,ℓ_n} = (1/N) ∑_{i≤N} σ^{ℓ_1}_i ⋯ σ^{ℓ_n}_i   (17)

for all n ≥ 1 and all ℓ_1, …, ℓ_n ≥ 1. This is easy to see by expressing the joint moments of one array in terms of the joint moments of the other. In particular, one can check that the asymptotic distribution of the array (17), over a subsequence of µ_N converging to µ ∈ M, coincides with the distribution of the array

R_{ℓ_1,…,ℓ_n} = E_v ∏_{j≤n} σ̄(w, u_{ℓ_j}, v)   (18)

for n ≥ 1 and ℓ_1, …, ℓ_n ≥ 1, where E_v denotes the expectation in the last coordinate v only. The average over spins of the products of replicas in (17) has been replaced by the average of the products of functions over the last coordinate, and we can think of the sequence (σ̄(w, u_ℓ, ·))_{ℓ≥1} as an i.i.d. sequence of replicas sampled from the (random) probability measure

G_w = du ∘ (u ↦ σ̄(w, u, ·))^{−1},   (19)

the image of the Lebesgue measure du under the map u ↦ σ̄(w, u, ·) ∈ L²([0,1], dv). Here, both du and dv denote the Lebesgue measure on [0,1].
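To make the geometric picture concrete, here is a toy sketch in which sigma_bar is an arbitrary illustrative function (not derived from the model): replicas are the sections σ̄(w, u_ℓ, ·), the overlap R_{1,2} in (18) is their inner product in L²([0,1], dv), and one can compare E R²_{1,2} with E R_{1,2} R_{3,4}, whose equality will characterize a pure state in Section 3. Since, conditionally on w, R_{1,2} and R_{3,4} are i.i.d., Jensen's inequality forces E R²_{1,2} ≥ E R_{1,2} R_{3,4} for any choice of σ̄.

```python
import math
import random

# A purely illustrative choice of the function sigma_bar(w, u, v) from (16);
# it is NOT derived from the model and only serves to illustrate the
# geometric picture of replicas and overlaps.
def sigma_bar(w, u, v):
    return math.tanh(math.sin(2 * math.pi * (w + u)) * math.cos(2 * math.pi * v))

def overlap(w, u1, u2, grid=100):
    """R_{1,2}: inner product in L2([0,1], dv) of the replica sections
    sigma_bar(w, u1, .) and sigma_bar(w, u2, .), by a Riemann sum."""
    return sum(sigma_bar(w, u1, k / grid) * sigma_bar(w, u2, k / grid)
               for k in range(grid)) / grid

rng = random.Random(3)
w = rng.random()
n = 1000
R12_sq, R12_R34 = 0.0, 0.0
for _ in range(n):
    u = [rng.random() for _ in range(4)]  # four i.i.d. replicas
    R12_sq += overlap(w, u[0], u[1]) ** 2
    R12_R34 += overlap(w, u[0], u[1]) * overlap(w, u[2], u[3])
R12_sq /= n
R12_R34 /= n
```

For this deliberately non-constant σ̄ the two quantities differ; in a pure state, where G_w is a single point mass, they would coincide.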
Thus, thanks to the Aldous-Hoover representation, to every asymptotic Gibbs measure µ ∈ M we can associate a function σ̄ on [0,1]³ or a random measure G_w on the above space of functions. One can find a related interpretation in terms of exchangeable random measures in [2].
The main idea introduced in [7] was a special regularizing perturbation of the Hamiltonian H_N(σ) that allows one to pass some standard cavity computations for the Gibbs measure G_N to the limit and to state them in terms of the asymptotic Gibbs measures µ ∈ M. We refer to [7] for details and only mention that the perturbation mimics adding to the system a random number (of order log N) of cavity coordinates from the beginning. Because of this perturbation, treating a finite number of coordinates as cavity coordinates is "not felt" by the Gibbs measure, which results in a number of useful properties in the limit. The perturbation is small enough that it does not affect the limit of the free energy F_N. In the rest of this section, we will describe the cavity equations in terms of the functions σ in (15) and state some of their consequences. Let us introduce some notation. We will often need to pick various sets of different spin coordinates in the array (s^ℓ_i) in (15), and it is quite inconvenient to enumerate them using one index i ≥ 1. Instead, we will use multi-indices (i_1, …, i_n) for n ≥ 1 and i_1, …, i_n ≥ 1 and consider

s^ℓ_{i_1,…,i_n} = σ(w, u_ℓ, v_{i_1,…,i_n}, x^ℓ_{i_1,…,i_n}),   (20)

where the random variables (v_{i_1,…,i_n}) and (x^ℓ_{i_1,…,i_n}) are i.i.d. uniform on [0,1]. In addition to (20), we will need

ŝ^ℓ_{i_1,…,i_n} = σ(w, u_ℓ, v̂_{i_1,…,i_n}, x̂^ℓ_{i_1,…,i_n})

for some independent copies v̂ and x̂ of the sequences v and x.
Let (θ_{i,k})_{i,k≥1} and (θ̂_k)_{k≥1} be independent copies of the random function θ. Take arbitrary integers n, m, q, r ≥ 1 such that n ≤ m. The index q will represent the number of replicas selected, m will be the total number of spin coordinates, and n will be the number of cavity coordinates. The parameter r ≥ 1 will index certain terms in the cavity equations that are allowed because of the stability properties of the Hamiltonian (2); these terms played an important role in [7] and will appear in the formulation of the main results from [7], but will not be used in the rest of this paper after that. For each replica index ℓ ≤ q we consider an arbitrary subset of coordinates C_ℓ ⊆ {1, …, m} and split it into cavity and non-cavity coordinates,

C_ℓ^1 = C_ℓ ∩ {1, …, n},  C_ℓ^2 = C_ℓ ∩ {n + 1, …, m}.

The following quantities represent the cavity fields: for i ≥ 1,

A_i(ε) = ∑_{k≤π_i(αp)} θ_{i,k}(s_{1,i,k}, …, s_{p−1,i,k}, ε),   (23)

where ε ∈ {−1,1}, the spins s_{j,i,k} are defined as in (20), and (π_i(αp))_{i≥1} are i.i.d. Poisson random variables with the mean αp. Let E′ denote the expectation in u and the sequences x and x̂, and let Av denote the average over ε = (ε^ℓ_i)_{i≤n, ℓ≤q} ∈ {−1,1}^{n×q}. The following result, proved in Theorem 1 in [7], expresses some standard cavity computations in terms of the asymptotic Gibbs measures.

Theorem 3
For any µ ∈ M and the corresponding function σ in (15), the cavity equations (25) hold for all choices of the parameters above. The left-hand side of (25) can be written using replicas as E ∏_{ℓ≤q} ∏_{i∈C_ℓ} s^ℓ_i, so it represents an arbitrary joint moment of spins in the array (15). The right-hand side expresses what happens to this joint moment when we treat the first n spins as cavity coordinates. As in [7], we will denote by M_inv the set of distributions of exchangeable arrays generated by functions σ : [0,1]^4 → {−1,1} as in (15) that satisfy the cavity equations (25) for all possible choices of parameters. Theorem 3 shows that M ⊆ M_inv, which was the key to proving the formula for the free energy in terms of asymptotic Gibbs measures. Let us consider the functional P(µ) defined in (26). The next result was proved in Theorem 2 in [7].

Theorem 4
The following holds:

lim_{N→∞} F_N = inf_{µ ∈ M_inv} P(µ).   (27)

Remark. This result was stated in [7] for even p ≥ 2 only, where this condition was used in the proof of the Franz-Leone upper bound [3]. However, in the case of the p-sat model the proof works for all p without any changes, as was observed in Theorem 6.5.1 in [10]. The condition that p is even is needed in the corresponding result for the diluted p-spin model, and that is why it appears in [6,7], where both models were treated at the same time.
For some applications, it will be convenient to rewrite (25) in a slightly different form. From now on, we will not be using the terms θ̂_k in (24), so we will set r = 0. Let us consider a function f(σ_1, σ_2) on {−1,1}^{m×q} of the arguments

σ_1 = (σ^ℓ_i)_{i≤n, ℓ≤q},  σ_2 = (σ^ℓ_i)_{n<i≤m, ℓ≤q}.   (28)

For example, if we consider the function

f(σ_1, σ_2) = ∏_{ℓ≤q} ∏_{i∈C_ℓ} σ^ℓ_i,   (29)

then the left-hand side of (25) can be written as E f(s_1, s_2), where s_1 and s_2 are the corresponding subarrays of (s^ℓ_i) in (15). To rewrite the right-hand side, similarly to (20), let us consider independent copies of the arrays of spins indexed by replicas of the random variables u and x, as in (30)-(32). Then, with this notation, the equation (25) can be rewritten in the form (33). Simply, we expressed a product of expectations E′ over replicas ℓ ≤ q as an expectation of the product, using replicas of the random variables u and x that are being averaged. Since any function f on {−1,1}^{m×q} is a linear combination of monomials of the type (29), (33) holds for any such f. From here, it is not difficult to conclude that, for any functions f_1, …, f_k on {−1,1}^{m×q} and any continuous function F : R^k → R, the analogous identity (34) holds. It is enough to prove this for functions F(a_1, …, a_k) = a_1^{n_1} ⋯ a_k^{n_k} with integer powers n_1, …, n_k ≥ 0, and this immediately follows from (33) by considering a function f on q(n_1 + ⋯ + n_k) replicas, given by the product of copies of f_1, …, f_k on different replicas, so that each f_i appears n_i times in this product.

Pure state
In this section, we will show that in the region (6) the function σ̄(w,u,v) in (16) corresponding to any µ ∈ M_inv essentially does not depend on the coordinate u. In other words, for almost all w, the Gibbs measure G_w in (19) is concentrated on one function in L²([0,1], dv) ∩ {‖σ̄‖_∞ ≤ 1}. This is expressed by saying that the system is in a pure state.

Theorem 5 Under (6), σ̄(w,u,v) = E_u σ̄(w,u,v) for almost all w, u, v ∈ [0,1], where E_u denotes the expectation in u only.
When the system is in a pure state, we will simply omit the coordinate u and write σ̄(w,v). In this case, a joint moment of finitely many spins does not depend on the replica indices, which means that we can change them freely; for example, E s^1_1 s^2_1 s^1_2 s^2_2 = E s^1_1 s^2_1 s^3_2 s^4_2. As in [10], the strategy of the proof will be to show that we can change one replica index at a time,

E s^1_1 ∏_{(i,ℓ)∈C} s^ℓ_i = E s^{ℓ′}_1 ∏_{(i,ℓ)∈C} s^ℓ_i,   (35)

where C is a finite set of index pairs (i,ℓ) that contains neither (1,1) nor (1,ℓ′). Using this repeatedly, we can make all replica indices different from each other, showing that any joint moment depends only on how many times each spin index i appears in the product. Of course, this implies that the array (s^ℓ_i) has the same distribution as the array generated by the function E_u σ̄(w,u,v), so we could replace the function σ̄(w,u,v) by E_u σ̄(w,u,v) without changing the distribution of the array (s^ℓ_i). This would be sufficient for our purposes, since we do not really care how the function σ̄ looks as long as it generates the array of spins (s^ℓ_i) with the same distribution. However, it is not difficult to show that, in this case, the function σ̄(w,u,v) essentially does not depend on u anyway. Let us explain this first.

Proof of Theorem 5 (assuming (35)). If (35) holds then E s^1_1 s^2_1 s^1_2 s^2_2 = E s^1_1 s^2_1 s^3_2 s^4_2. This can also be written in terms of the asymptotic overlaps R_{ℓ,ℓ′} defined in (18) as E R²_{1,2} = E R_{1,2} R_{3,4}. Since R_{ℓ,ℓ′} is the scalar product in L²([0,1], dv) of the replicas σ̄(w, u_ℓ, ·) and σ̄(w, u_{ℓ′}, ·) drawn from the asymptotic Gibbs measure G_w in (19), the overlaps R_{1,2} and R_{3,4} are, conditionally on w, independent and identically distributed. Hence E R_{1,2} R_{3,4} = E (E(R_{1,2} | w))², while E R²_{1,2} = E E(R²_{1,2} | w), and their equality means that the conditional variance of R_{1,2} given w is zero, which implies that for almost all w the overlap is constant almost surely. Obviously, this can happen only if G_w is concentrated on one function (that may depend on w), and this finishes the proof. ⊓ ⊔

In the rest of the section we will prove (35). The main idea of the proof will be almost identical to Section 6.2 in [10], even though there will be no induction on the system size. One novelty is that the cavity equations (25) for the asymptotic Gibbs measures will allow us to give a different argument for large values of β, improving the dependence of the pure state region on the parameters. We will begin with this case, since it is slightly simpler. Without loss of generality, we can assume that ℓ′ = 2 in (35). Given m, q ≥ 1, for j = 1, 2, let us consider functions f_j(σ_1, σ_2) on {−1,1}^{m×q} with σ_1 and σ_2 as in (28). We will suppose that

f_2 > 0  and  |f_1| ≤ f_2.   (36)

Let us fix n ≤ m and, as before, we will treat the first n coordinates as cavity coordinates. Consider the map T that switches the coordinates (σ^1_1, …, σ^1_n) with (σ^2_1, …, σ^2_n) and leaves the other coordinates untouched. The statement of the following lemma does not involve β, but it will be used when β is large enough.

Lemma 1 Suppose that (36) holds and that f_1 ∘ T = −f_1. If p(p−1)α < 1 then

E (E′f_1(s^1, s^2) / E′f_2(s^1, s^2)) = 0.   (38)
To see that (38) implies (35) with ℓ′ = 2, take n = 1, f_2 = 1 and f_1 = (1/2)(σ^1_1 − σ^2_1) ∏_{(i,ℓ)∈C} σ^ℓ_i, and note that f_1 ∘ T = −f_1 since T switches σ^1_1 and σ^2_1 and does not move the coordinates in C.

Proof of Lemma 1. By (36), the function f_2 on {−1,1}^{m×q} is strictly separated from 0, so we can use (34) with k = 2 and F(a_1, a_2) = a_1/a_2 to get (39). Recall that Av is the average over ε = (ε^ℓ_i)_{i≤n, ℓ≤q} ∈ {−1,1}^{n×q} and that

E(ε) = exp ∑_{ℓ≤q} ∑_{i≤n} A_{i,ℓ}(ε^ℓ_i),  where  A_{i,ℓ}(ε) = ∑_{k≤π_i(αp)} θ_{i,k}(s^ℓ_{1,i,k}, …, s^ℓ_{p−1,i,k}, ε).   (40)

For a moment, let us fix all the random variables π_i(αp) and θ_{i,k}, and let r := ∑_{i≤n} π_i(αp). Observe right away that if r = 0 then E(ε) = 1 and

Av f_1 E(ε) = Av f_1 = 0.   (41)

This is because the average Av does not change if we switch the coordinates (ε^1_1, …, ε^1_n) with (ε^2_1, …, ε^2_n) (in other words, just rename the coordinates) and, by assumption, f_1 ∘ T = −f_1. Now, let us denote the set of all triples (j,i,k) that appear as subscripts in (40) by

J = {(j, i, k) : j ≤ p − 1, i ≤ n, k ≤ π_i(αp)}.   (42)

If we denote by s̃^1 = (s^ℓ_e)_{e∈J, ℓ≤q} all the coordinates of the array s that appear in E(ε) then, for r ≥ 1, we can think of the averages on the right-hand side of (39) as functions of s^2 and s̃^1,

f̃_j(s^2, s̃^1) = Av f_j(ε, s^2) E(ε).   (43)

Even though s^2 and s̃^1 are random variables, for simplicity of notation we here think of them also as the variables of the functions f̃_j. First of all, since |f_1| ≤ f_2, we have |f̃_1| ≤ f̃_2. Similarly to T, let T̃ now be the map that switches the vectors of spins (s^1_e)_{e∈J} and (s^2_e)_{e∈J} in s̃^1, corresponding to the first and second replica. Let us show that f̃_1 ∘ T̃ = −f̃_1. First, we write

f̃_1 ∘ T̃ = Av f_1(ε, s^2) (E(ε) ∘ T̃).

As above, we use that the average Av does not change if we switch the coordinates (ε^1_1, …, ε^1_n) with (ε^2_1, …, ε^2_n), so

f̃_1 ∘ T̃ = Av (f_1 ∘ T)(ε, s^2) (E(ε) ∘ T̃ T).

By assumption, f_1 ∘ T = −f_1, and it remains to notice that E(ε) ∘ T̃ T = E(ε), because T̃ T simply switches all the terms A_{i,1} and A_{i,2} in the definition of E(ε). We showed that (39) can be rewritten as

E (E′f_1(s^1, s^2) / E′f_2(s^1, s^2)) = E (f̃_1(s^2, s̃^1) / f̃_2(s^2, s̃^1)),   (44)

and, conditionally on π_i(αp) and θ_{i,k}, the pair of functions f̃_1, f̃_2 satisfies the same properties as the pair f_1, f_2. The only difference is that now n is replaced by the cardinality of the set J in (42), equal to (p−1)r.
For a fixed n, let us denote by D(n) the supremum of the left-hand side of (39) over m ≥ n and all choices of functions f_1, f_2 with the required properties. Then the equation (44) implies (first integrating the right-hand side conditionally on all π_i(αp) and θ_{i,k})

D(n) ≤ E D((p − 1)π(nαp)),   (45)

where π(nαp) := r = ∑_{i≤n} π_i(αp) is a Poisson random variable with the mean nαp. Recall that, by (41), f̃_1 = 0 when r = 0, so we can set D(0) = 0. Also, the assumption |f_1| ≤ f_2 gives that D(n) ≤ 1 and, thus, D(n) ≤ n. Then, (45) implies D(n) ≤ E (p−1)π(nαp) = (p−1)pαn. Iterating (45), we get D(n) ≤ ((p−1)pα)^k n for all k ≥ 1 and, since p(p−1)α < 1, letting k → ∞ shows that D(n) = 0 for all n, which proves (38).

⊓ ⊔
For small values of β, we will give a slightly different argument, following Section 6.2 in [10].

Lemma 2 In the notation of Lemma 1, suppose that n = 1 and

(p − 1)pαβ e^{2β} exp(αp(e^{2β} − 1)) < 1.   (46)

Then (38) holds.

Proof. The first part of the proof proceeds exactly as in Lemma 1, and we obtain (44) for the functions f̃_1, f̃_2 defined in (43). Since n = 1, we can rewrite (40) as

E(ε) = exp ∑_{ℓ≤q} A_{1,ℓ}(ε^ℓ_1),   (47)

and the set (42) now becomes

J = {(j, k) : j ≤ p − 1, k ≤ π_1(αp)}.

Its cardinality is (p−1)r, where r = π_1(αp). Even though we showed that f̃_1 ∘ T̃ = −f̃_1, we cannot draw any conclusions yet, since the map T switches only one spin in the first and second replicas, while T̃ switches (p−1)r spins (s^1_e)_{e∈J} and (s^2_e)_{e∈J} in s̃^1, of course, conditionally on π_1(αp) and θ_k. We will decompose f̃_1 into the sum f̃_1 = ∑_{e∈J} f̃_e, where each f̃_e satisfies f̃_e ∘ T̃_e = −f̃_e for some map T̃_e that switches s^1_e and s^2_e only. We begin by writing

f̃_1 = (1/2)(f̃_1 − f̃_1 ∘ T̃).

If we order the set J by some linear order ≤ then we can expand this into a telescopic sum,

f̃_1 = (1/2) ∑_{e∈J} (f̃_1 ∘ T̃_{<e} − f̃_1 ∘ T̃_{≤e}),

where T̃_{<e} (respectively, T̃_{≤e}) switches the coordinates s^1_{e′} and s^2_{e′} for all e′ < e (respectively, e′ ≤ e). Then we simply define

f̃_e = (1/2)(f̃_1 ∘ T̃_{<e} − f̃_1 ∘ T̃_{≤e})

and notice that f̃_e ∘ T̃_e = −f̃_e, since T̃_e T̃_e is the identity. Equation (44) implies

E (E′f_1(s^1, s^2) / E′f_2(s^1, s^2)) = E ∑_{e∈J} (f̃_e(s^2, s̃^1) / f̃_2(s^2, s̃^1)).   (49)

We keep the sum inside the expectation because the set J is random. Recalling the definition of f̃_j in (43), we can write (for simplicity of notation, we will write E instead of E(ε) from now on)

f̃_e = (1/2) Av f_1(ε, s^2) (E ∘ T̃_{<e} − E ∘ T̃_{≤e}).   (50)

All the maps T̃_e switch coordinates only in the first and second replicas. This means that if we write E defined in (47) as E = E′E′′, where E′ = exp(A_1 + A_2) collects the terms corresponding to the first two replicas and E′′ collects the rest, then E′′ is not affected by the maps T̃_{<e} and T̃_{≤e}. If e = (j,k) then the two terms in the last difference only differ in the term θ_k(s^ℓ_{1,k}, …, s^ℓ_{p−1,k}, ε^ℓ_1). Since θ_k ∈ [−β, 0] and A_1 + A_2 ≤ 0, we can use that |e^x − e^y| ≤ |x − y| for x, y ≤ 0 to get that

|E ∘ T̃_{<e} − E ∘ T̃_{≤e}| ≤ 2β E′′.

Therefore, from (50) we obtain

|f̃_e| ≤ β Av f_2(ε, s^2) E′′.

Similarly, using that A_1 + A_2 ∈ [−2βπ_1(αp), 0], we get that

Av f_2(ε, s^2) E′′ ≤ exp(2βπ_1(αp)) f̃_2,

and together the last two inequalities yield

|f̃_e(s^2, s̃^1)| ≤ β exp(2βπ_1(αp)) f̃_2(s^2, s̃^1).   (51)
Let D be the supremum of the left-hand side of (49) over all pairs of functions f_1, f_2 such that |f_1| ≤ f_2 and f_1 ∘ T = −f_1 under switching one coordinate in the first and second replicas. Then, conditionally on π_1(αp) and the randomness of all θ_k, each pair f̃_e, f̃_2 on the right-hand side of (49) satisfies (51), and we showed above that f̃_e ∘ T̃_e = −f̃_e under switching one coordinate in the first and second replicas. Therefore, (49) implies that

D ≤ E ((p − 1)π_1(αp) β exp(2βπ_1(αp))) D.

Even though, formally, this computation was carried out in the case π_1(αp) ≥ 1, it is still valid when π_1(αp) = 0 because of (41). Finally, since π_1(αp) has the Poisson distribution with the mean αp,

E ((p − 1)π_1(αp) β exp(2βπ_1(αp))) = (p − 1)pαβ e^{2β} exp(αp(e^{2β} − 1)).   (52)

The condition (46) together with (52) obviously implies that D = 0, and this finishes the proof. ⊓ ⊔

To finish the proof of Theorem 5, it remains to show that the region (6) is contained in the union of the two regions in the preceding lemmas.

Lemma 3 If (6) holds then either p(p−1)α < 1 or (46) holds.

Inside the pure state
Suppose now that the system is in a pure state and, for each µ ∈ M_inv, the corresponding function σ̄(w,u,v) does not depend on the second coordinate, in which case we will write it as σ̄(w,v). Let us begin by proving Theorem 1.
Proof of Theorem 1. When the system is in a pure state, we can rewrite the functional P(µ) in (26) as follows. First of all, since the expectation E′ is now only in the random variables x, which are independent for all spin and replica indices, and since the extension (4) of θ was defined precisely so that averaging exp θ over independent coordinates again gives exp θ, we can write

E′ exp ∑_{k≤π(αp)} θ_k(s_{1,k}, …, s_{p−1,k}, ε) = exp ∑_{k≤π(αp)} θ_k(σ̄_{1,k}, …, σ̄_{p−1,k}, ε),

where σ̄_{i,k} = σ̄(w, v_{i,k}). Similarly, E′ exp θ(s_1, …, s_p) = exp θ(σ̄_1, …, σ̄_p) with σ̄_i = σ̄(w, v_i). Therefore, the functional P(µ) in (26) can be written as

P(µ) = log 2 + E log Av exp ∑_{k≤π(αp)} θ_k(σ̄_{1,k}, …, σ̄_{p−1,k}, ε) − (p − 1)α E θ(σ̄_1, …, σ̄_p),   (53)

where E_v denotes the expectation only in the random variables (v_i) and (v_{i,k}). For a fixed w, the random variables σ̄_i and σ̄_{i,k} are i.i.d. with some distribution ζ_w on [−1,1] and, comparing with (5), the expectations in (53) taken conditionally on w yield P(ζ_w), so P(µ) is bounded from below by inf_{ζ ∈ Pr[−1,1]} P(ζ).
Since this lower bound holds for all µ ∈ M_inv, Theorem 4 then implies that

lim_{N→∞} F_N ≥ inf_{ζ ∈ Pr[−1,1]} P(ζ).

The matching upper bound follows from the Franz-Leone theorem [3], by considering functions σ̄(w,u,v) that depend only on the coordinate v (see Section 2.3 in [7], and also [6,10]). As we mentioned above, it was observed in Theorem 6.5.1 in [10] that the upper bound holds for all p ≥ 2.

⊓ ⊔
Let us also write down one consequence of the cavity equations (25) for a system in a pure state. Again, let σ̄_i = σ̄(w, v_i) and denote σ̄_{j,i,k} = σ̄(w, v_{j,i,k}). For i ≥ 1, let

A_i(ε) = ∑_{k≤π_i(αp)} θ_{i,k}(σ̄_{1,i,k}, …, σ̄_{p−1,i,k}, ε),   (55)

where (π_i(αp))_{i≥1} are i.i.d. Poisson random variables with the mean αp. We will now show that the cavity equations (25) imply the following.

Lemma 4 If the system is in a pure state, for example in the region (6), then for any n ≥ 1 and any integers n_1, …, n_n ≥ 0,

E ∏_{i≤n} σ̄_i^{n_i} = E ∏_{i≤n} ((Av ε exp A_i(ε)) / (Av exp A_i(ε)))^{n_i}.   (56)

Proof. This can be seen as follows. Take r = 0 and n = m in (25), so that all coordinates are viewed as cavity coordinates. Since the expectation E′ is now only in the random variables x, which are independent for all spin and replica indices, as in the proof of Theorem 1 we can write the right-hand side of (25) (slightly abusing notation) in terms of the averages Av exp A_i(ε), where the A_i(ε) are now given by (55) instead of (23), i.e. after averaging the random variables x.
By choosing q and the sets C_ℓ so that each index i appears n_i times in the resulting joint moment, this gives

E ∏_{i≤n} σ̄_i^{n_i} = E ∏_{i≤n} ((Av ε exp A_i(ε)) / (Av exp A_i(ε)))^{n_i},

and this finishes the proof. ⊓ ⊔

Proof of Theorem 2
In this section we will prove Theorem 2, and we begin with the following key estimate. For a moment, we fix the randomness of (θ_k)_{k≥1} and think of T_r defined in (9) as a nonrandom function.

Lemma 5 For any r ≥ 1, all the partial derivatives of T_r on [−1,1]^{(p−1)×r} satisfy

|∂T_r/∂σ_{j,k}| ≤ (e^β − 1)/2.   (57)
Proof. Write ⟨·⟩ for the average Av weighted proportionally to exp ∑_{k≤r} θ_k, so that T_r = ⟨ε⟩, and let θ′_1 denote the partial derivative of θ_1 in σ_{1,1}. Since θ_1 ∈ [−β, 0], we see that J_{1,1} θ′_1 ∈ [(1 − e^β)/2, 0] and

∂T_r/∂σ_{1,1} = ⟨ε θ′_1⟩ − ⟨ε⟩⟨θ′_1⟩ = ⟨(ε − ⟨ε⟩) θ′_1⟩,

which implies that |∂T_r/∂σ_{1,1}| ≤ sup |θ′_1| ≤ (e^β − 1)/2. The same bound obviously holds for all partial derivatives, and this finishes the proof. ⊓ ⊔

Step 1. Let us first show that, under (8), there exists a unique fixed point T(ζ) = ζ. The claim will follow from the Banach fixed point theorem once we show that the map T is a contraction with respect to the Wasserstein metric W(P,Q) on Pr[−1,1]. This metric is defined by

W(P, Q) = inf E |z^1 − z^2|,

where the infimum is taken over all pairs (z^1, z^2) of random variables whose distribution belongs to the family M(P,Q) of measures on [−1,1]^2 with marginals P and Q. It is well known that this infimum is achieved on some measure µ ∈ M(P,Q). Let (z^1_{j,k}, z^2_{j,k}) for j ≤ p−1 and k ≥ 1 be i.i.d. copies with the distribution µ. By (57) and Wald's identity,

E |T_{π(αp)}((z^1_{j,k})) − T_{π(αp)}((z^2_{j,k}))| ≤ (1/2)(e^β − 1) E ∑_{j≤p−1, k≤π(αp)} |z^1_{j,k} − z^2_{j,k}| = (1/2)(e^β − 1)(p−1)pα W(P, Q).

On the other hand, by the definition (12), the pair of random variables on the left-hand side has a distribution in the family M(T(P), T(Q)) and, therefore,

W(T(P), T(Q)) ≤ (1/2)(e^β − 1)(p−1)pα W(P, Q).
The condition (8) implies that the map T is a contraction with respect to W . Since the space (Pr[−1, 1],W ) is complete, this proves that T has a unique fixed point ζ .
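The Lipschitz estimate behind Step 1 can be probed numerically. The sketch below implements T_r in the ratio form Av ε exp(∑_{k≤r} θ_k) / Av exp(∑_{k≤r} θ_k) (our reading of (9); all names are ours) and checks by finite differences that the partial derivative in the first coordinate stays below (e^β − 1)/2.

```python
import math
import random

def exp_theta(J, sigma, beta):
    # exp(theta) extended to [-1,1]^p: 1 + (e^{-beta}-1) * prod (1+J_i sigma_i)/2
    prod = 1.0
    for j, s in zip(J, sigma):
        prod *= (1 + j * s) / 2
    return 1 + (math.exp(-beta) - 1) * prod

def T_r(z, J, beta):
    """T_r in the ratio form (our reading of (9)): z is an r x (p-1) list of
    inputs in [-1,1], J an r x p list of signs (the last slot receives eps)."""
    w = {1: 1.0, -1: 1.0}
    for zk, Jk in zip(z, J):
        for eps in (1, -1):
            w[eps] *= exp_theta(Jk, list(zk) + [eps], beta)
    return (w[1] - w[-1]) / (w[1] + w[-1])

def max_partial_estimate(p, r, beta, trials=2000, h=1e-6, seed=4):
    """Finite-difference probe of sup |dT_r / d sigma_{1,1}| over random
    inputs and random signs; the lemma's bound is (e^beta - 1)/2."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        J = [[rng.choice((-1, 1)) for _ in range(p)] for _ in range(r)]
        z = [[rng.uniform(-1, 1) for _ in range(p - 1)] for _ in range(r)]
        zp = [row[:] for row in z]
        zp[0][0] += h
        best = max(best, abs(T_r(zp, J, beta) - T_r(z, J, beta)) / h)
    return best

beta = 1.0
bound = (math.exp(beta) - 1) / 2
est = max_partial_estimate(p=3, r=2, beta=beta)
```

Random sampling only gives a lower estimate of the supremum, so the point of the check is that the estimate never exceeds the bound, not that it attains it.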
Step 2. Now, suppose that both (6) and (8) hold. Let ζ be the unique fixed point, T(ζ) = ζ, and let σ̄(w,u,v) be the function corresponding to a measure µ ∈ M_inv in the statement of Theorem 4. By Theorem 5, we know that σ̄ does not depend on u and, therefore, σ̄(w,v) satisfies Lemma 4. Recall that σ̄_i = σ̄(w, v_i) and let (z_i)_{i≥1} be i.i.d. random variables with the distribution ζ. We will now show that

(σ̄_i)_{i≥1} has the same distribution as (z_i)_{i≥1},   (59)

which together with (53) will imply that P(µ) = P(ζ) for all µ ∈ M_inv, finishing the proof. (By the way, the fact that (σ̄_i)_{i≥1} are i.i.d. does not mean that the function σ̄(w,v) does not depend on w; it simply means that the distribution of (σ̄_i)_{i≥1} is independent of w.) To show (59), we will again utilize the Wasserstein metric. For any n ≥ 1, we will denote by D(n) the Wasserstein distance between the distribution of (σ̄_i)_{i≤n} and the distribution of (z_i)_{i≤n} (equal to ζ^{⊗n}) with respect to the metric d(x,y) = ∑_{i≤n} |x_i − y_i| on [−1,1]^n. For any r = (r_1, …, r_n) ∈ N^n (we assume now that 0 ∈ N), let us denote

p_r = P(π_1(αp) = r_1, …, π_n(αp) = r_n) = ∏_{i≤n} ((αp)^{r_i} / r_i!) e^{−αp}.
Since ζ = T(ζ), recalling the definition of T(ζ) in (12), we get

ζ^{⊗n} = T(ζ)^{⊗n} = ∑_{r∈N^n} p_r ∏_{i≤n} L(T_{r_i}((z_{j,k})_{j≤p−1, k≤r_i})),

where the random variables z_{j,k} are i.i.d. and have the distribution ζ. Next, similarly to (9), let us define