On the Potts antiferromagnet on random graphs

Extending a prior result of Contucci et al. (Comm. Math. Phys. 2013), we determine the free energy of the Potts antiferromagnet on the Erdős-Rényi random graph at all temperatures for average degrees $d \le (2k-1)\ln k - 2 - k^{-1/2}$. In particular, we show that no phase transition occurs in this regime of $d$.

The Potts antiferromagnet is one of the best-known models of statistical physics. Accordingly, it has been studied extensively on a wide class of graphs, particularly lattices [10,25,27]. The aim of the present paper is to study the model on the Erdős-Rényi random graph G = G(n, m). Throughout the paper, we let m = ⌈dn/2⌉ for a number d > 0 that remains fixed as n → ∞. We also assume that the number k ≥ 3 of colors remains fixed as n → ∞. The Potts model on the random graph G is of interest partly due to the connection to the k-colorability problem. Indeed, the larger β, the more severe the "penalty factor" of exp(−β) that each monochromatic edge induces in (1.1). Thus, if the underlying graph is k-colorable, then for large β the Gibbs measure will put most of its weight on color assignments that leave few edges monochromatic. Ultimately, one could think of the uniform distribution on k-colorings as the "β = ∞"-case of the Gibbs measure (1.1). Now, consider the problem of finding a k-coloring of the random graph by a local search algorithm such as Simulated Annealing. Then most likely the algorithm will start from a color assignment that has quite a few monochromatic edges. As the algorithm proceeds, it will attempt to gradually reduce the number of monochromatic edges by running the Metropolis process for the Gibbs measure (1.1) with a value of β that increases over time. Specifically, β has to be large enough to make progress but small enough so that the algorithm does not get trapped in a local minimum of the Hamiltonian. Hence, to figure out whether such a local search algorithm will find a proper k-coloring in polynomial time, it is instrumental to study the "shape" of the Hamiltonian.
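The local search procedure described above can be sketched as follows. This is a minimal illustration, assuming that the Hamiltonian counts monochromatic edges and that a Metropolis step recolors one uniformly random vertex; the function names, the schedule `betas` and the parameter `sweeps_per_beta` are hypothetical and not taken from any of the cited works.

```python
import math
import random

def count_monochromatic(edges, sigma):
    """Hamiltonian (assumed): number of edges whose endpoints share a color."""
    return sum(1 for v, w in edges if sigma[v] == sigma[w])

def anneal(n, edges, k, betas, sweeps_per_beta=20, seed=0):
    """Metropolis dynamics run along an increasing schedule of inverse temperatures."""
    rng = random.Random(seed)
    neighbors = [[] for _ in range(n)]
    for v, w in edges:
        neighbors[v].append(w)
        neighbors[w].append(v)
    # Start from a uniformly random color assignment.
    sigma = [rng.randrange(k) for _ in range(n)]
    for beta in betas:
        for _ in range(sweeps_per_beta * n):
            v = rng.randrange(n)
            old, new = sigma[v], rng.randrange(k)
            # Change in the number of monochromatic edges if v is recolored.
            delta = sum((sigma[u] == new) - (sigma[u] == old) for u in neighbors[v])
            # Always accept improvements; accept a worsening with probability exp(-beta * delta).
            if delta <= 0 or rng.random() < math.exp(-beta * delta):
                sigma[v] = new
    return sigma
```

For instance, `anneal(5, [(0,1),(1,2),(2,3),(3,4),(4,0)], 3, [0.5, 1.0, 2.0, 4.0])` runs the dynamics on a 5-cycle with three colors; whether the final assignment is a proper coloring depends on the schedule and the random seed.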
To this end, it is key to get a handle on the free energy, defined as $\mathrm{E}[\ln Z_\beta(G)]$. We take the logarithm because $Z_\beta(G)$ scales exponentially in the number $n$ of vertices. Since a standard application of Azuma's inequality shows that $\ln Z_\beta(G)$ is concentrated about its expectation (see Fact 1.2 below), $\frac{1}{n}|\ln Z_\beta(G) - \mathrm{E}[\ln Z_\beta(G)]|$ converges to 0 in probability. Furthermore, if $\mathrm{E}[\ln Z_\beta(G)] \sim \ln \mathrm{E}[Z_\beta(G)]$ for certain $d, \beta$, then the Hamiltonian can be studied via an easily accessible probability distribution called the planted model. This trick has been applied successfully to the "proper" graph coloring problem as well as to other random constraint satisfaction problems [2,26].
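To make the distinction between $\mathrm{E}[\ln Z_\beta(G)]$ and $\ln \mathrm{E}[Z_\beta(G)]$ concrete, the following sketch estimates both quantities by brute force on tiny instances. It assumes that $Z_\beta(G)$ is the sum of $\exp(-\beta \cdot \#\text{monochromatic edges})$ over all color assignments and, for simplicity, that the $m$ edges are drawn independently and uniformly from all pairs (so multi-edges may occur). All function names are hypothetical.

```python
import itertools
import math
import random

def partition_function(n, edges, k, beta):
    """Sum of exp(-beta * #monochromatic edges) over all k^n color assignments."""
    total = 0.0
    for sigma in itertools.product(range(k), repeat=n):
        mono = sum(1 for v, w in edges if sigma[v] == sigma[w])
        total += math.exp(-beta * mono)
    return total

def random_edges(n, m, rng):
    """m edges drawn independently and uniformly from all pairs (an assumed model)."""
    pairs = list(itertools.combinations(range(n), 2))
    return [rng.choice(pairs) for _ in range(m)]

def quenched_vs_annealed(n, m, k, beta, samples=200, seed=0):
    """Empirical estimates of E[ln Z]/n and ln E[Z]/n from sampled graphs."""
    rng = random.Random(seed)
    zs = [partition_function(n, random_edges(n, m, rng), k, beta) for _ in range(samples)]
    quenched = sum(math.log(z) for z in zs) / samples / n
    annealed = math.log(sum(zs) / samples) / n
    return quenched, annealed
```

By Jensen's inequality the first returned value never exceeds the second; the regime of interest in this paper is the one where the two agree asymptotically. The enumeration is exponential in `n`, so this is only usable for very small graphs.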
1.2. The main result. Because our motivation largely comes from the random graph coloring problem, we are going to confine ourselves to values of $d$ where the random graph $G$ is $k$-colorable w.h.p. Although the precise $k$-colorability threshold $d_{k\text{-col}}$ is not currently known, we have [12,14]
$$(2k-1)\ln k - 2\ln 2 + o_k(1) \le d_{k\text{-col}} \le (2k-1)\ln k - 1 + o_k(1), \qquad (1.2)$$
where $o_k(1)$ hides a term that tends to 0 in the limit of large $k$. The following theorem determines $\frac{1}{n}\mathrm{E}[\ln Z_\beta(G)]$ almost up to the lower bound from (1.2).
Clearly, the function on the r.h.s. of (1.3) is analytic in $\beta \in (0, \infty)$. Thus, in the language of mathematical physics, Theorem 1.1 implies that the Potts antiferromagnet on the random graph does not exhibit a phase transition for any average degree $d < d_\star$.
1.3. Related work. The problem of determining the k-colorability threshold of the random graph was raised in the seminal paper by Erdős and Rényi and is thus the longest-standing open problem in the theory of random graphs [20]. Achlioptas and Friedgut [1] proved the existence of a non-uniform sharp threshold. Moreover, a simple greedy algorithm finds a k-coloring for degrees up to about $k \ln k$, approximately half the k-colorability threshold [3]. Further, Achlioptas and Naor [4] used the second moment method to establish a lower bound of $d_{k\text{-col}} \ge 2(k-1)\ln k + o_k(1)$, which matches the first-moment upper bound $d_{k\text{-col}} \le (2k-1)\ln k + o_k(1)$ up to about an additive $\ln k$. Coja-Oghlan and Vilenchik [14] improved the lower bound to $d_{k\text{-col}} \ge (2k-1)\ln k - 2\ln 2 + o_k(1)$ via a second moment argument that incorporates insights from non-rigorous physics work [28]. On the other hand, Coja-Oghlan [12] proved that $d_{k\text{-col}} \le (2k-1)\ln k - 1 + o_k(1)$. The results from [4,14] were subsequently generalized to various other models, including random regular graphs and random hypergraphs [5,13,17,22].
The Potts antiferromagnet on the random graph was studied before by Contucci, Dommers, Giardina and Starr [15], who generalized the second moment argument from [4] to the Potts model. In particular, [15] establishes (1.3) for average degrees about an additive $\ln k$ below the k-colorability threshold (see Theorem 2.3 below). An analogous result was recently obtained (among other things) by Banks and Moore [6] for a variant of the stochastic block model that resembles the Potts antiferromagnet. Their proof is based on [4] as well. In the present paper we improve the corresponding results of [6,15] by extending the physics-enhanced second moment argument from [14] to the Potts antiferromagnet.
Physics considerations suggest that for average degrees $d > (2k-1)\ln k - 2\ln 2 + o_k(1)$ a phase transition does occur, i.e., the function $\beta \in (0, \infty) \mapsto \lim_{n\to\infty} \frac{1}{n}\mathrm{E}[\ln Z_\beta(G)]$ is non-analytic [23,24,28]. The existence and location of the condensation phase transition has been established asymptotically in the hypergraph 2-coloring and the hardcore model and precisely in the regular k-SAT model and the k-colorability problem [7,8,9,11]. However, the Potts antiferromagnet is conceptually more challenging than hardcore, k-SAT or hypergraph 2-coloring because the "variables" (viz. vertices) can take more than two values (colors). Potts is also more difficult than k-coloring because of the presence of the inverse temperature parameter β. In fact, the present work is partly motivated by studying condensation in the Potts antiferromagnet, and we hope that Theorem 1.1 and its proof may pave the way to pinpointing the phase transition precisely, see Section 2.5 below. Additionally, as mentioned above, Theorem 1.1 implies that for $d \le (2k-1)\ln k - 2 - k^{-1/2}$ the Hamiltonian can be studied by way of the planted model. Finally, the ferromagnetic Potts model (where the Gibbs measure favors monochromatic edges) is far better understood than the antiferromagnetic version [16].

Preliminaries.
Throughout the paper we assume that $k \ge k_0$ for a large enough constant $k_0 > 0$. Moreover, let $c_\beta = 1 - \exp(-\beta)$. Unless specified otherwise, the standard O-notation refers to the limit n → ∞. We always assume tacitly that n is sufficiently large. Additionally, we use asymptotic notation in the limit of large k with a subscript k.
Proof. If $G, G'$ are multi-graphs such that $G'$ can be obtained from $G$ by adding or deleting a single edge, then $|\ln Z_\beta(G) - \ln Z_\beta(G')| \le 2\beta$. Hence, the assertion follows from Azuma's inequality.
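The Lipschitz bound used in this proof can be verified in one line. The following is a sketch under the assumption that the Hamiltonian $H_G(\sigma)$ counts the monochromatic edges of $G$ under $\sigma$, in line with the penalty factor $\exp(-\beta)$ per monochromatic edge; the symbol $H_G$ is used here for illustration.

```latex
% Sketch: G' = G + e for a single edge e = \{v, w\}.
\[
  H_{G'}(\sigma) = H_G(\sigma) + \mathbf{1}\{\sigma(v) = \sigma(w)\}
  \quad\Longrightarrow\quad
  e^{-\beta}\, e^{-\beta H_G(\sigma)} \le e^{-\beta H_{G'}(\sigma)} \le e^{-\beta H_G(\sigma)} .
\]
% Summing over all \sigma : [n] \to [k] yields
\[
  e^{-\beta} Z_\beta(G) \le Z_\beta(G') \le Z_\beta(G),
  \qquad\text{hence}\qquad
  \bigl|\ln Z_\beta(G) - \ln Z_\beta(G')\bigr| \le \beta \le 2\beta .
\]
```

Under this assumption the bound $2\beta$ stated in the proof holds with room to spare, which is all that Azuma's inequality requires.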
If s is an integer, we write [s] for the set {1, . . . , s}. Further, if v is a vertex of a graph G, then $\partial v = \partial_G(v)$ is the set of neighbors of v in G. If ρ is a matrix, then by $\rho_i$ we denote the i-th row of ρ and by $\rho_{ij}$ the j-th entry of $\rho_i$. Further, the Frobenius norm of a k × k-matrix ρ is $\|\rho\|_2 = \bigl(\sum_{i,j\in[k]} \rho_{ij}^2\bigr)^{1/2}$. For a probability distribution p : Ω → [0, 1] on a finite set Ω we denote by $H(p) = -\sum_{x\in\Omega} p(x)\ln p(x)$ the entropy of p (with the convention that 0 ln 0 = 0). Additionally, if ρ is a k × k-matrix with non-negative entries, then we let $H(\rho) = -\sum_{i,j\in[k]} \rho_{ij}\ln\rho_{ij}$. We will use the following standard fact about the entropy.
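The notation above translates directly into code. The sketch below assumes the standard definitions, namely the entropy $-\sum_x p(x)\ln p(x)$ with $0 \ln 0 = 0$ and the Frobenius norm as the square root of the sum of squared entries; the entry-wise extension of the entropy to matrices is an assumed reading, and the function names are hypothetical.

```python
import math

def entropy(p):
    """Entropy -sum p(x) ln p(x), skipping zero entries (convention 0 ln 0 = 0)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def frobenius_norm(rho):
    """Square root of the sum of the squared entries of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in rho for x in row))

def matrix_entropy(rho):
    """Entropy of the entries of a non-negative matrix, read as one long vector (assumed)."""
    return entropy([x for row in rho for x in row])
```

As a sanity check, for the k × k matrix with all entries equal to $1/k$ the Frobenius norm is 1, and the matrix with all entries $1/k^2$ has entropy $2\ln k$.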

OUTLINE
We prove Theorem 1.1 by generalizing the second moment argument for k-colorings from [14] to the partition function of the Potts antiferromagnet. In this section we describe the proof strategy. Most of the technical details are left to the subsequent sections.
2.1. The first moment. As a first step we calculate the first moment $\mathrm{E}[Z_\beta(G)]$. This is pretty straightforward; in fact, it has been done before [15]. Nonetheless, we go over the calculations to introduce a few concepts that will prove important in the second moment argument as well.
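The structure of the first moment calculation can be checked numerically on tiny instances. The sketch below assumes a simplified random graph model in which the $m$ edges are drawn independently and uniformly from all pairs, and it writes $c_\beta = 1 - \exp(-\beta)$. Under these assumptions each edge is monochromatic under a fixed σ with probability equal to the fraction of monochromatic pairs, so the expectation factorizes over edges. The function names are hypothetical.

```python
import itertools
import math

def first_moment_formula(n, m, k, beta):
    """Sum over sigma of (1 - c_beta * fraction of monochromatic pairs)^m."""
    pairs = list(itertools.combinations(range(n), 2))
    c_beta = 1.0 - math.exp(-beta)
    total = 0.0
    for sigma in itertools.product(range(k), repeat=n):
        mono_pairs = sum(1 for v, w in pairs if sigma[v] == sigma[w])
        total += (1.0 - c_beta * mono_pairs / len(pairs)) ** m
    return total

def first_moment_enumerated(n, m, k, beta):
    """Exact expectation of Z, averaging over all sequences of m pairs (assumed model)."""
    pairs = list(itertools.combinations(range(n), 2))
    total = 0.0
    for edges in itertools.product(pairs, repeat=m):
        for sigma in itertools.product(range(k), repeat=n):
            mono = sum(1 for v, w in edges if sigma[v] == sigma[w])
            total += math.exp(-beta * mono)
    return total / len(pairs) ** m
```

Both functions agree up to floating-point error, e.g. for `n=4, m=3, k=3, beta=0.7`. Only the first is feasible beyond toy sizes, and even it enumerates all $k^n$ assignments.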
To lower-bound $Z_\beta(G)$ we follow Achlioptas and Naor [4] and work with "balanced" color assignments whose color classes are all about the same size. Specifically, call σ : be the partition function restricted to balanced maps. Moreover, let be the number of monochromatic edges of the complete graph. Then uniformly for all balanced σ, Hence, by Stirling's formula Combining (2.1) and (2.2), we find On the other hand, for all σ we have $H_{K_n}(\sigma) \ge \frac{1}{k}\binom{n}{2} - n$ by convexity. Therefore, (2.2) yields Further, define Then an elementary argument similar to the proof of Proposition 2.1 yields
The function $f_{d,\beta}$ is a sum of an entropy term $H(k^{-1}\rho)$ and an "energy term" For future reference we note that The number |R| of summands on the right hand side of (2.6) is easily bounded by $n^{k^2}$. Therefore, Denote by S the set of all singly-stochastic matrices and by D the set of all doubly-stochastic k × k matrices, respectively. Then $\bigcup_{n\ge 1} R(n,k) \cap D$ is a dense subset of D. Together with (2.10) the continuity of f therefore implies (2.11) Setting $\bar\rho = k^{-1}\mathbf{1}$ to be the barycenter of D, we obtain from Proposition 2.2 that Hence, just as in the case of proper k-colorings [4,15], a necessary condition for the success of the second moment method is that the function $f_{d,\beta}$ attains its maximum on D at the point $\bar\rho$.
2.3. Small average degree or high temperature. Contucci, Dommers, Giardina and Starr [15] proved that the maximum in (2.11) is indeed attained at $\bar\rho$ if the average degree is a fair bit below the k-colorability threshold.
Comparing this result with (1.2), we see that Theorem 2.3 applies to degrees about an additive $\ln k$ below the k-colorability threshold. The proof of Theorem 2.3 builds upon ideas of Achlioptas and Naor [4]. More precisely, solving the maximization problem from (2.11) directly turns out to be surprisingly difficult. Hence, Achlioptas and Naor suggested enlarging the domain to the set of singly-stochastic matrices. Clearly, the maximum over the larger space is an upper bound on the maximum over the set of doubly-stochastic matrices. Further, because the set of singly-stochastic matrices is a product of simplices, the relaxed optimization problem can be tackled with a fair bit of technical work. Crucially, for $d < 2(k-1)\ln(k-1)$ the maximum of the relaxed problem is attained at $\bar\rho$. However, for only slightly larger values of d the maximum is attained at a different point, and thus the relaxed second moment argument fails.
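To illustrate the two domains in the discussion above, the following sketch computes the overlap matrix of two color assignments and tests whether it is singly or doubly stochastic. It assumes the normalization $\rho_{ij} = \frac{k}{n}|\sigma^{-1}(i) \cap \tau^{-1}(j)|$ and reads "singly-stochastic" as "all row sums equal 1"; both are assumptions made for illustration, and the function names are hypothetical.

```python
def overlap_matrix(sigma, tau, k):
    """Entry (i, j) is (k/n) times the number of vertices colored i by sigma and j by tau."""
    n = len(sigma)
    rho = [[0.0] * k for _ in range(k)]
    for s, t in zip(sigma, tau):
        rho[s][t] += k / n
    return rho

def is_singly_stochastic(rho, tol=1e-9):
    """All row sums equal 1 (entries are non-negative by construction)."""
    return all(abs(sum(row) - 1.0) <= tol for row in rho)

def is_doubly_stochastic(rho, tol=1e-9):
    """All row sums and all column sums equal 1."""
    columns = list(zip(*rho))
    return is_singly_stochastic(rho, tol) and all(abs(sum(c) - 1.0) <= tol for c in columns)
```

With this normalization the overlap of two perfectly balanced assignments is doubly stochastic, whereas it is only singly stochastic if σ is balanced but τ is not. The relaxation to singly-stochastic matrices thus amounts to dropping the balance constraint on the second assignment.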
Apart from the case of small d, the second case that is relatively straightforward is that of small β (the "high temperature" case in physics jargon). More precisely, in Section 3 we will prove the following.
Proposition 2.4 improves upon the result from [15], which yields (1.3) merely for $\beta \le \beta_0$ for an absolute constant $\beta_0$ (independent of k). The proof of Proposition 2.4 also proceeds by relaxing (2.10) to singly-stochastic matrices and builds upon arguments developed in [14] for k-colorability.
2.4. Large degree and low temperature. The most challenging constellation is that of d beyond $2(k-1)\ln(k-1)$ and β large. In this regime we do not know how to solve the maximization problem (2.10). In particular, the trick of relaxing the problem to the set of all singly-stochastic matrices does not work. Instead, following [14] we are going to add further constraints to the problem. That is, we are going to apply the second moment method to a modified random variable that is constructed so as to ensure that certain parts of the domain D cannot contribute to (2.10) significantly.
The construction is guided by the physics prediction [23] that for large d and β the Gibbs measure $\mu_G$ "decomposes" into an exponential number of well-separated clusters. Of course, it would be non-trivial to turn this notion into a precise mathematical statement because the support of $\mu_G$ is the entire cube $[k]^n$. However, the probability mass is expected to be distributed very unevenly, with large swathes of the cube carrying very little mass.
Fortunately, we do not need to define clusters etc. precisely. Instead, adapting the construction from [14], we just define a new random variable $Z_{\beta,\mathrm{sep}}(G)$ that comes with a "hard-wired" notion of well-separated clusters. To be precise, for a graph G denote by $\Sigma_{G,\beta}$ the set of all τ ∈ B that enjoy the following property. SEP1: for every i ∈ [k] the set $\tau^{-1}(i)$ spans at most $2n\exp(-\beta)k^{-1}\ln k$ edges.
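Condition SEP1 is straightforward to test for a given graph and color assignment. The sketch below checks it literally as stated, i.e., that every color class spans at most $2n\exp(-\beta)k^{-1}\ln k$ edges. The function name and the edge-list representation are hypothetical, and the balance requirement τ ∈ B is not checked.

```python
import math

def satisfies_sep1(n, edges, tau, k, beta):
    """True if every color class of tau spans at most 2 n exp(-beta) ln(k) / k edges."""
    bound = 2 * n * math.exp(-beta) * math.log(k) / k
    spanned = [0] * k
    for v, w in edges:
        if tau[v] == tau[w]:
            # The edge lies inside the color class tau[v].
            spanned[tau[v]] += 1
    return all(count <= bound for count in spanned)
```

A proper coloring satisfies SEP1 for every β, since no color class spans an edge. As β grows the bound shrinks, so an assignment with a fixed positive number of monochromatic edges in some class eventually violates the condition.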
Let $B_{\mathrm{sep}} = B_{\mathrm{sep}}(G, \beta) \subset B$ denote the set of all separable maps and define To elaborate, condition SEP1 provides that the subgraphs induced on the individual color classes are quite sparse. Indeed, recalling that each monochromatic edge incurs a "penalty factor" of exp(−β), we expect that in a typical sample from the Gibbs measure the total number of monochromatic edges is about $nd\exp(-\beta)/(2k)$. Moreover, suppose that $\sigma \in \Sigma_{G,\beta}$ satisfies SEP2 and $\tau \in \Sigma_{G,\beta}$ is another color assignment. Let i, j ∈ [k]. Then SEP2 provides that there are only two possible scenarios.
The upshot is that separability rules out the existence of any "middle ground", i.e., we do not have to consider overlaps ρ with entries $\rho_{ij} \in (0.51, 1 - \kappa)$.
The following proposition, which we prove in Section 4, shows that imposing separability has no discernible effect on the first moment.
The point of working with separable color assignments is that the maximization problem that arises in the second moment computation of $Z_{\beta,\mathrm{sep}}(G)$ comes with further constraints that are not present in (2.10). Specifically, we only need to optimize over ρ ∈ D such that $\rho_{ij} \notin (0.51, 1 - \kappa)$ for all i, j ∈ [k]. In Section 5 we will use these constraints to derive the following.
Proof. On the one hand, Jensen's inequality gives (2.13) On the other hand, by Propositions 2.5 and 2.6 and the Paley-Zygmund inequality, (2.14) Combining (2.14) with Proposition 2.1, (2.3) and Proposition 2.5, we obtain Further, (2.15) and Fact 1. (1).
Thus, for $d > d_{k,\mathrm{cond}}$ there occurs a phase transition at a certain critical inverse temperature $\beta_{k,\mathrm{cond}}(d)$. The existence of a critical $\beta_{k,\mathrm{cond}}(d)$ follows from prior results on the random graph coloring problem [8]. However, the value of $\beta_{k,\mathrm{cond}}(d)$ is not (rigorously) known. The physics intuition of how this phase transition comes about is as follows. For $\beta < \beta_{k,\mathrm{cond}}(d)$ the Gibbs measure decomposes into an exponential number of clusters that each have probability mass exp(−Ω(n)). Hence, if we sample σ, τ independently from the Gibbs measure, then most likely they belong to different clusters, in which case their overlap should be very close to $\bar\rho$. By contrast, for $\beta > \beta_{k,\mathrm{cond}}(d)$ a bounded number of clusters dominate the Gibbs measure, i.e., there are individual clusters whose probability mass is Ω(1). In effect, for $\beta > \beta_{k,\mathrm{cond}}(d)$ the overlap of two randomly chosen color assignments is no longer concentrated on the single value $\bar\rho$, because there is a non-vanishing probability that both belong to the same cluster. Consequently, the second moment method fails. In fact, we expect that But even the second moment argument for separable color assignments does not quite reach the expected critical degree $d_{k,\mathrm{cond}}$. Indeed, for $d > (2k-1)\ln k - 2 + o_k(1)$ the maximum over the set of separable overlaps is attained In terms of the physics intuition, this overlap matrix corresponds to pairs of color assignments that belong to the same cluster. In other words, the second moment method fails because the expected cluster size blows up. A similar problem occurs in the k-colorability problem [14].
There the issue was resolved by explicitly controlling the median cluster size, which is by an exponential factor smaller than the expected cluster size [8]. We expect that a similar remedy applies to the Potts model, although the fact that monochromatic edges are allowed entails that the proof method from [8] does not apply. In any case, Theorem 1.1 reduces the task of determining the phase transition to the problem of controlling the median cluster size.
Furthermore, in the case of degrees above $d_{k\text{-col}}$ as well, at least the existence of a phase transition has been established rigorously [15]. It would be most interesting to see if the present methods can be extended to $d > d_{k\text{-col}}$ in order to obtain a more precise estimate of $\beta_{k,\mathrm{cond}}(d)$.

SINGLY STOCHASTIC ANALYSIS
We prove Proposition 2.4 by way of the following proposition regarding the maximum of $f_{d,\beta}$ over the set of singly-stochastic matrices.
To prove Proposition 3.1 we will closely follow the proof strategy developed for the graph coloring problem in [14, Section 4]. Basically, that argument dealt with optimizing the function $f_{d,\infty}$ (i.e., $c_\beta$ is replaced by 1) over S, and we extend that argument to finite values of β. In fact, the following monotonicity statement shows that it suffices to prove Proposition 3.1 for β = ln k; related monotonicity statements were used in [9] for hypergraph 2-coloring and in [7] for regular k-SAT.
The following basic observation concerning the partial derivatives of $f_{d,\beta}$ is reminiscent of [14, Lemma 4.11].
If $\partial E(\rho)/\partial\rho_{ij} \ge 1/k$, the left hand side of (3.3) is negative for all δ > 0.
Proof. By (2.8), (2.9) and the choice of δ, The first part of the claim follows because the signs of the terms in (3.4) are invariant under exponentiation of the minuend $\varphi(\delta) = \ln(1 + \delta/\rho_{ij})$ and subtrahend $\psi(\delta) = d c_\beta^2 \delta/(k - 2c_\beta + c_\beta\|\rho\|_2^2/k)$. The second part follows from the observation that the linear function $\exp(\varphi) : \mathbb{R}_+ \to \mathbb{R}$ intersects at most once with the strictly convex function $\exp(\psi) : \mathbb{R}_+ \to \mathbb{R}$. This is only the case if the derivative of $\exp(\varphi)$ at δ = 0 is strictly greater than that of $\exp(\psi)$.
The following lemma provides a general "maximum entropy" principle that we will use repeatedly (cf. [14,Proposition 4.7]).
Proof. We may assume that 0 ≤ min j ∈J ρ i j < max j ∈J ρ i,j . Otherwise, we would haveρ = ρ and there is nothing to prove. Now let denote the set of all possible overlaps. S ρ is a closed subset of S and therefore contains a maximal overlapρ ∈ argmaxρ ∈S f d ,β (ρ). Evidently the derivative of H tends to infinity as ρ i j tends to zero, while the derivative of E remains bounded. Therefore in a maximal overlap each entryρ i j , j ∈ J is positive. As a whole, we know that 0 < min j ∈Jρi j ≤ max j ∈Jρi j ≤ 1. By means of Claim 3.3 it remains to show thatδ = max j ∈Jρi j − min j ∈Jρi j = 0.
In order to achieve a global bound on $\max_{\rho\in S} f_{d,\beta}(\rho)$ we need to pin down the structure of a maximizing matrix ρ. To this end, the following elementary fact is going to be useful.
The following lemma rules out the possibility that the maximizer of $f_{d,\beta}$ has an entry close to 1/2 (cf. [14, Lemma 4.13]).
Proof. By means of Lemma 3.4 we will specify $\rho'$ and establish the above bound on $f_{d,\beta}(\rho) - f_{d,\beta}(\rho')$ by distinguishing two cases. Without loss of generality we may assume that the entry in the interval [0.49, 0.51] is $\rho_{11}$. Suppose ρ maximizes $f_{d,\beta}$ subject to the condition that $\rho_{11} \in [0.49, 0.51]$.
Generalizing [14,Lemma 4.16], as a next step we characterize the structure of the local maxima of f d ,β on S.   Proof. Claim (1) is an immediate consequence of Lemma 3.4 when setting J = [k], λ = 1 and applying the ρ →ρ operation on the i -th row.

HIGH DEGREE, LOW TEMPERATURE: THE FIRST MOMENT
Throughout this section we assume that d ∈ [2(k − 1) ln(k − 1), (2k − 1) ln k − 2] and β ≥ ln k. In this section we prove Proposition 2.5. The principal tool is going to be the following experiment called the planted model; similar constructions for hypergraph 2-coloring or k-SAT played an important role in [7,9].
obtain a random graphĜ on [n] by independently including every edge {v, w} of the complete graph such thatσ(v) =σ(w) with probability p 2 and every edge {v, w} such thatσ(v) =σ(w) with probability p 1 .
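The experiment can be sketched in code as follows. Since the first step of the construction is not reproduced above, the sketch assumes that the planted color assignment is chosen uniformly at random. It also does not commit to which of $p_1, p_2$ belongs to which type of edge: the two inclusion probabilities are passed as the hypothetical parameters `p_same` (endpoints with equal colors) and `p_diff` (endpoints with distinct colors).

```python
import itertools
import random

def sample_planted(n, k, p_same, p_diff, seed=0):
    """Draw a planted assignment, then include each pair independently.

    p_same / p_diff are the inclusion probabilities for pairs whose endpoints
    receive equal / distinct colors (hypothetical parameter names).
    """
    rng = random.Random(seed)
    # Assumed first step: a uniformly random color assignment.
    sigma = [rng.randrange(k) for _ in range(n)]
    edges = []
    for v, w in itertools.combinations(range(n), 2):
        p = p_same if sigma[v] == sigma[w] else p_diff
        if rng.random() < p:
            edges.append((v, w))
    return sigma, edges
```

Choosing `p_same` smaller than `p_diff` mimics the antiferromagnetic penalty on monochromatic edges; in the extreme case `p_same = 0` the planted assignment is a proper coloring of the sampled graph.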
The following lemma sets out the connection between the planted model and the first moment.
We are going to combine Lemma 4.1 with the following proposition, which shows that separability is a likely event in the planted model. To prove Proposition 4.2 we generalize the argument for proper k-colorings from [14, Section 3] to the Potts antiferromagnet. In the following we let $V_i = \hat\sigma^{-1}(i)$ for i ∈ [k].
Hence, the first derivative is negative at the left boundary point $y = k^{-0.499}$, positive at the right boundary point $y = 0.499$ and convex on the entire interval. Furthermore, we check that $y(3 - 2\ln y) + (2y^2 - (1 - 2\kappa)y + \kappa)\ln k < 0$ for $y \in \{0.499, k^{-0.499}\}$. Therefore, the assertion follows from (4.3).

Let i ∈ [k] and let $Y = Y(\hat G, \hat\sigma)$ be the number of vertices $v \in V_i$ with fewer than 15 neighbors in $V_i$. Then $Y \le \frac{\kappa n}{3k\ln k}$. (4.4)