Phase transitions of extremal cuts for the configuration model

The $k$-section width and the Max-Cut for the configuration model are shown to exhibit phase transitions according to the values of certain parameters of the asymptotic degree distribution. These transitions mirror those observed on Erd\H{o}s-R\'enyi random graphs, established by Luczak and McDiarmid (2001), and Coppersmith et al. (2004), respectively.


Introduction
Graph cut problems have a very rich history in Combinatorics and Theoretical Computer Science. Given a graph $G = (V, E)$, the $k$-section problem seeks to partition the vertices into $k$ sets of equal size (or differing in size by at most 1) such that the number of edges between distinct sets is minimized. The minimum number of cross edges $w_k(G)$ thus obtained is referred to as the $k$-section width. The related Max-Cut problem seeks to divide the vertices into two sets (not necessarily equal) such that the number of edges between the two sets is maximized. These graph-partitioning problems are extremely important for numerous practical applications in network optimization, VLSI circuit design, computational geometry, and statistical physics [10,11,14,19,33,36,38]. On the other hand, from the perspective of Theoretical Computer Science, these problems are computationally hard, and even approximating the Max-Cut up to a constant factor is NP-hard [16,18,21,34]. The study of these problems in the average case is mainly motivated by a desire to understand various graph partitioning heuristics. Problem instances are usually chosen to be the Erdős-Rényi random graph or the random regular graph. An Erdős-Rényi random graph $\mathrm{ER}_n(d/n)$ is constructed on $n$ vertices, where any two vertices share an edge with probability $d/n$, independently of all other pairs. A $d$-regular random graph is drawn uniformly at random from the space of all $d$-regular graphs on $n$ vertices. We note that these graph ensembles are sparse, in that typical graphs on $n$ vertices have order $n$ edges and the degree of a typical vertex is of constant order. See [6,23,24,32] for a detailed review of the properties of these random graphs.
Both the $k$-section width and the Max-Cut undergo phase transitions on the sparse Erdős-Rényi random graph. These transitions reflect certain structural characteristics of the underlying graphs. Consider the $k$-section width problem for $\mathrm{ER}_n(d/n)$ with $k = 2$. For $d < 2\ln(2)$, the bisection width is exactly 0 with high probability, while for $d > 2\ln(2)$, the bisection width is of order $n$ with high probability [35]. The Max-Cut also undergoes a phase transition: for $d < 1$, the difference between the total number of edges and the Max-Cut is of constant order, while it is of order $n$ for $d > 1$ [12]. The distribution of the Max-Cut within the critical window is analyzed by Daudé et al. [13], while the critical behavior of the bisection width is largely unknown.
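The threshold $d = 2\ln(2)$ can be checked numerically. The sketch below assumes the standard fact (not derived in this paper) that the asymptotic fraction $\eta$ of vertices in the largest component of $\mathrm{ER}_n(d/n)$ is the largest solution of $1 - \eta = e^{-d\eta}$; the function name `er_giant_fraction` and the iteration count are illustrative choices.

```python
import math

def er_giant_fraction(d, iterations=2000):
    """Largest solution in [0, 1] of 1 - eta = exp(-d * eta).

    Assumption: this is the asymptotic fraction of vertices in the
    largest component of ER_n(d/n); it equals 0 for d <= 1.
    """
    if d <= 1.0:
        return 0.0
    eta = 1.0
    # Iterating from eta = 1 decreases monotonically to the largest fixed point.
    for _ in range(iterations):
        eta = 1.0 - math.exp(-d * eta)
    return eta

# The largest component holds half of the vertices exactly at d = 2 ln 2.
print(er_giant_fraction(2 * math.log(2)))
```

Under this assumption, the bisection threshold is the value of $d$ at which the largest component occupies exactly half of the vertices, which matches the condition $\eta = 1/k$ with $k = 2$.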
A crucial point to note in this context is that both sparse Erdős-Rényi and random regular graph ensembles lead to homogeneous instances, in the sense that any two vertices share an edge with equal probability. This is very different from the instances actually encountered in practical applications. Real networks are highly inhomogeneous, and often display certain characteristic features, such as a power-law decay in the tail of the degree distribution [1,2,15,24,39]. Thus, it is of natural interest to study the behavior of the extremal cuts for graphs with more general degree distributions. The configuration model [5,37] provides a canonical scheme for generating uniform random graphs with any prescribed degree sequence. This model is thus attractive for studying real-world networks, and the analysis of its structural properties has attracted considerable attention in recent years [25,26,28,30,31,37]. It is worth mentioning that, despite the presence of very high degree vertices, a substantial body of recent research shows that various statistics in this model behave qualitatively like those in Erdős-Rényi random graphs, confirming empirical evidence.
In this paper, we initiate a study of similar phase transition phenomena of the extremal cuts for the configuration model. The main takeaway of our results is that the phase transitions for the extremal cuts are robust, and are present in a large class of random graphs, viz. configuration models with finite second moment. This emphasizes that in the class of sparse non-spatial random graphs these phase transition phenomena are not intimately dependent on the precise model details, but are determined by the component sizes and the structures of the typical local neighborhoods. Technically, the proofs in the Erdős-Rényi case crucially utilize the independence and homogeneity in the model, while we rely on recent insights about the structure of the configuration model [25,28,30,31] to establish our results. We also prove several novel structural properties of the connected components (see Sections 4.2 and 4.3). Among many other intermediate results, we show that the largest connected component consists of a well-connected 2-core (Lemma 4.5) and several thin hanging trees (Lemma 4.7), and most of the connected components except the largest are finite (Lemma 4.10). Furthermore, we obtain that when the largest connected component is of order $n$, it must be stable, in the sense that $\Theta(n)$ edges must be deleted in order to separate out any $\Theta(n)$ vertices (Proposition 4.3). The latter notion is particularly useful to study the stability of the largest connected component subject to intelligent attacks (edge deletion) on networks.
The rest of the paper is organized as follows: Section 2 formally introduces the configuration model along with the assumptions on the underlying degree sequence and summarizes certain preliminary properties of this model. Section 3 states the main results of this paper and offers several key insights. The proofs are included in Sections 4 and 5.

Preliminaries
The configuration model. Consider a degree sequence $d = (d_1, d_2, \ldots, d_n)$ on the vertex set $[n] = \{1, 2, \ldots, n\}$. Equip vertex $j$ with $d_j$ stubs, or half-edges. Two half-edges create an edge once they are paired. Therefore, initially there are $\ell_n = \sum_{i \in [n]} d_i$ half-edges. Pick any one half-edge and pair it with a uniformly chosen half-edge from the remaining unpaired half-edges. Keep repeating the above procedure until all the unpaired half-edges are exhausted. The random graph constructed in this way is called the configuration model, and will henceforth be denoted by $\mathrm{CM}_n(d)$.
Note that the graph constructed by the above procedure may contain self-loops and multiple edges. It can be shown that, conditionally on $\mathrm{CM}_n(d)$ being simple, the law of such graphs is uniform over all possible simple graphs with degree sequence $d$ (cf. [24, Proposition 7.7], [29]). Moreover, under rather general assumptions (see Assumption 1 below), the asymptotic probability of the graph being simple is bounded away from zero [27].
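The pairing procedure above can be sketched in a few lines. The version below is an illustration rather than the construction used in the proofs: instead of pairing half-edges one at a time, it shuffles all half-edges and pairs consecutive entries, which also produces a uniform perfect matching. The names `configuration_model` and `is_simple` are hypothetical, and vertices are labelled $0, \ldots, n-1$ instead of $1, \ldots, n$.

```python
import random

def configuration_model(degrees, seed=None):
    """Edge list of the multigraph obtained by uniformly pairing half-edges.

    degrees[v] is the number of half-edges of vertex v (vertices are 0-indexed,
    an assumption of this sketch). Self-loops and multiple edges may occur.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the total number of half-edges must be even")
    rng = random.Random(seed)
    # One list entry per half-edge, labelled by the vertex it belongs to.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    # Shuffling and pairing consecutive entries gives a uniform perfect matching.
    rng.shuffle(half_edges)
    return [(half_edges[i], half_edges[i + 1])
            for i in range(0, len(half_edges), 2)]

def is_simple(edges):
    """True if the edge list has no self-loops and no multiple edges."""
    seen = set()
    for u, v in edges:
        key = (min(u, v), max(u, v))
        if u == v or key in seen:
            return False
        seen.add(key)
    return True

edges = configuration_model([3, 2, 2, 1], seed=0)
print(edges, is_simple(edges))
```

Repeating the call until `is_simple` returns `True` would sample, by the uniformity statement above, a uniform simple graph with the given degree sequence.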
A vertex chosen uniformly at random from the vertex set [n], independently of the graph CM n (d) is called a typical vertex.Let D n be the degree of a typical vertex.Throughout this paper we assume the following: ] (moment assumptions); c.P(D = 1) > 0 (positive proportion of degree one vertices).
Like most other sparse random graph models, CM n (d) exhibits a phase transition in terms of the size of its largest connected component, and this has been studied extensively in [31,37].The phase transition occurs when the value of the parameter exceeds one (cf.[31]).More precisely, let g D (x) := E[x D ] be the probability generating function of D, and let ξ be the unique nonzero solution to the equation Then the following theorem characterizes the asymptotic proportion of vertices in each component: where η is as defined in (2.2).Further, η > 0 if and only if ν > 1.
(ii) Moreover, Notation.For any graph G, the k-section width and Max-Cut are denoted by w k (G) and MaxCut(G), respectively.We denote to be the (asymptotic) expected degree of a typical vertex.The degree of a vertex v is denoted by d v , and the number of vertices of degree k by n k , k ≥ 0. If two vertices u and v share an edge, then we write u v.For a nonempty subset U ⊆ [n] of vertices, the neighborhood (or 1-neighborhood) is defined as and the r-neighborhood is defined as For any subset of vertices A, we denote the half-edges incident to the vertices in A by S(A), and the number of edges between A and A c by E(A, A c ).For any integer m ≥ 1, we denote (2m)!! := (2m − 1)(2m − 3)

Main results
In this section we state the main results of this paper, and discuss several heuristics.(ii) If η > 1/k, then there exists ζ > 0, such that with high probability w k (CM n (d)) > ζn.
Theorem 3.1 is proved in Section 4. This result is comparable to [35, Theorem 1], established in the context of Erdős-Rényi random graphs. As mentioned earlier, the proof for the Erdős-Rényi case makes crucial use of the fact that the edge occupancies are independent and identically distributed, a feature that is absent in this case. The proof in this paper, on the other hand, is more robust, and depends on a clear understanding of the local neighborhood structure in these random graphs. Roughly speaking, when $\eta < 1/k$, the strategy is to distribute all the components of size at least 3 among $k$ partitions as evenly as possible, and then to add the components of size at most 2 to balance the partitions. Since the size of the largest component is smaller than $n/k$ and the other components are very small ($o(n)$) in size, a $k$-partition can be made using the components of size at least 3, with at most $n/k$ vertices in each part. Because there are sufficiently many components of size at most 2 (Lemma 4.2), these can be used to balance the partitions. The latter step results in at most $k/2$ cross edges between the partitions. The above proof outline for the subcritical case is formalized in Section 4.1.
Alternatively, when $\eta > 1/k$, the size of the largest connected component is more than $n/k$. Therefore, in order to split the graph into $k$ equal partitions, the largest component must be split into at least two (possibly unequal) parts, each containing a positive proportion of vertices, and from the structural properties of the largest component, we show that with high probability this creates $\Theta(n)$ cross edges. The proof for the supercritical case is provided in Section 4.2.
Remark 1. [35,Theorem 1] establishes that the k-section width is exactly zero below a critical threshold given by η = 1/k.This holds for the Erdős-Rényi case due to the natural presence of many isolated vertices.For a general configuration model, this is not necessarily true, and therefore, Theorem 3.1 (i) is indeed the best possible result that one can hope for in this case.In particular, if we assume the presence of a positive fraction of isolated vertices in the degree sequence, then using Lemma 4.1 below, we recover the same result as in [35].
We continue to describe our results for MaxCut(CM n (d)).For this let us introduce a further notation.The difference between the total number of edges and the Max-Cut is often referred to as the distance from bipartiteness of a graph G, and will be denoted by DistBip(G).In other words, DistBip(G) counts the minimum number of edges in G to be deleted in order to make it bipartite.Recall that µ = E [D].Then MaxCut(CM n (d)) admits a phase transition around ν = 1.More precisely, (ii) (Supercritical) If ν > 1, then there exists δ > 0, such that with high probability, (iii) (High-density regime) Furthermore, when µ > 2, then there exists 0 < c (µ) < √ µ/4, such that for any c > c (µ), with high probability, and c (µ) ln(2)/2 as µ ∞.
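For very small graphs, the quantity DistBip can be computed directly from its definition as the number of edges minus the Max-Cut. The brute-force sketch below is only an illustration of the definition (it is exponential in the number of vertices); the names `max_cut` and `dist_bip` and the 0-indexed vertex labels are assumptions of the sketch.

```python
from itertools import product

def max_cut(n, edges):
    """Brute-force Max-Cut of a (multi)graph on vertices 0..n-1."""
    best = 0
    # Fixing the side of vertex 0 halves the search without loss of generality.
    for rest in product((0, 1), repeat=max(n - 1, 0)):
        side = (0,) + rest
        cut = sum(1 for u, v in edges if side[u] != side[v])
        best = max(best, cut)
    return best

def dist_bip(n, edges):
    """Minimum number of edges to delete in order to make the graph bipartite."""
    return len(edges) - max_cut(n, edges)

path = [(0, 1), (1, 2), (2, 3)]                    # a tree
five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # an odd cycle
print(dist_bip(4, path), dist_bip(5, five_cycle))
```

A tree contributes nothing to DistBip, while a single odd cycle contributes exactly one edge, since deleting any one of its edges leaves a bipartite path.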
The proof of Theorem 3.2 is included in Section 5. Theorem 3.2 establishes the phase transition for DistBip(CM n (d)) for a wide class of degree sequences. The heuristic behind this phase transition is that when ν < 1, CM n (d) is roughly a collection of trees and a finite number of unicyclic components. The trees do not contribute any edge to DistBip(CM n (d)) at all, and each unicyclic component with an odd cycle (i.e., a cycle containing an odd number of edges) contributes exactly one to DistBip(CM n (d)). On the other hand, when ν > 1, this is no longer true, and any partition must leave Θ(n) edges uncut. Results analogous to Theorem 3.2 (i) and (ii) were established for Erdős-Rényi random graphs by Daudé et al. [13] and Coppersmith et al. [12], respectively.
Remark 2. It was shown in [27] that under Assumption 1, the probability of the graph being simple is bounded away from zero. Thus the phase transition results in Theorems 3.1 and 3.2 also hold for the uniformly chosen simple graph with a prescribed degree sequence. Hence, all the results proved in the paper are also true for ER n (d/n), as well as for the generalized random graphs under appropriate conditions [24, Theorem 6.15] on the weight sequence w. In fact, the results are true for an even more general class of inhomogeneous random graph models (cf. [24, Theorem 6.18]).
Remark 3. Figure 1 shows the numerical values of c (µ) for 3 ≤ µ ≤ 50. An exact expression of c (µ) is given in (5.21). Notice that even for µ-values as low as 30, c (µ) is already close to ln(2)/2. This value agrees with the upper bound of Max-Cut for Erdős-Rényi random graphs and random regular graphs in the high-density regime as observed in [12, Theorem 20] and [3, Theorem 2], respectively. Thus, our result again establishes a universal behavior for a large class of inhomogeneous random graphs (see Remark 2) as special cases.
To further illustrate the usefulness of the above phase transition results, we consider graphs obtained by random deletion of edges from a given graph. Such results are crucial for studying the stability of networks under random link failures. Percolation refers to keeping each edge of a graph with a given probability $p_n$, independently of the other edges and of the underlying (random) graph. Using Theorems 3.1 and 3.2 we are able to characterize the threshold of the percolation probability for the configuration model, with respect to the $k$-section width and the Max-Cut. Let $\mathrm{CM}_n(d, p_n)$ be the graph obtained by retaining each edge of $\mathrm{CM}_n(d)$ with probability $p_n$. An important property of $\mathrm{CM}_n(d)$ is that $\mathrm{CM}_n(d, p_n)$ is again distributed as a configuration model conditionally on its degree sequence [17,26]. Therefore, one can deduce the phase transition results for the extremal cuts of $\mathrm{CM}_n(d, p_n)$ from Theorems 3.1 and 3.2. In fact, since the percolated graphs always have a positive proportion of isolated vertices in the sparse regime (Assumption 1), the minimum bisection below the threshold $\eta = 1/k$ becomes exactly zero with high probability (see Remark 1).
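The percolation step itself is easy to sketch. The following minimal illustration retains each edge of a given edge list independently with probability $p$ and recomputes the degree sequence of the percolated graph; the function names `percolate` and `degree_sequence` are hypothetical and vertices are assumed to be labelled $0, \ldots, n-1$.

```python
import random

def percolate(edges, p, seed=None):
    """Keep each edge independently with probability p."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() < p]

def degree_sequence(n, edges):
    """Degrees of vertices 0..n-1; a self-loop adds 2 to its vertex."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
kept = percolate(edges, 0.5, seed=1)
print(kept, degree_sequence(4, kept))
```

Vertices whose edges are all deleted become isolated, which is the source of the positive proportion of isolated vertices mentioned above.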
Let k ≥ 2 be an integer.Then the phase transition for w k (CM n (d, p n )) with p n → p, occurs at p = p min (k, d), such that the asymptotic proportion of vertices in the largest connected component of CM n (d, p n ) is precisely equal to 1/k.For an arbitrary degree sequence, the explicit solution for p min (k, d) is not immediate from [26, Theorem 3.9].However, in the particular case of percolation on the d-regular graph (i.e.d = d1 = (d, d, . . ., d)) with d ≥ 3, notice that by [26, (3.13), (3.14)], p min (k, d) can be obtained as a solution for p in the following system of equations: and thus, It was shown in [26,Theorem 3.9] that, when p n → p, the phase transition for the largest connected component occurs at p = 1/ν.This implies that the phase transition for MaxCut(CM n (d, p n )) also occurs at 1/ν, which for d-regular random graphs equals Therefore, given the phase transition results in Theorems 3.1 and 3.2, we have proved the following theorem: Theorem 3.3 (Extremal cuts for percolation on random d-regular graphs).Let p n → p as n → ∞.Then for any d ≥ 3, , then there exists δ > 0, such that with high probability, with high probability, where c (•) is as given by Theorem 3.2.

Proof for the k-section width
In this section we prove the phase transition of the k-section width stated in Theorem 3.1.

Subcritical case
In this subsection we present the proof of Theorem 3.1 (i). In Lemma 4.1 we first state a useful graph-theoretic result, which ensures that if (i) the size of the largest component is smaller than $n/k$, (ii) there are $\Theta(n)$ small components (i.e., of size at most 2), and (iii) the size of every component other than the $k$ largest components is smaller than a $1/k$ fraction of the number of small components, then the $k$-section width is at most $k/2$. This lemma is an extension of [35, Lemma 9] to the scenario in which there may be no isolated vertices. Then in Lemma 4.2 we show that under Assumption 1, $\Theta(n)$ such small components are created. This will complete the proof of Theorem 3.1 (i).
Proof.Suppose that G contains m 2 components of size more than 2, and enumerate them as i.e., sequentially at each step add all the vertices in components of size more than 2, to the partitions in a way such that the size of each partition does not exceed n/k.The claim below establishes that the above steps are feasible.
due to condition (ii).This in turn implies |C t 0 | > rn k , which contradicts condition (iii).
After step $t = m_2$, we first add the components of size 2 and finally the components of size 1 (the isolated vertices), if any. Observe that components of size 1 and 2 can be added to the partitions such that each partition is of size n/k or n/k + 1, there are no cross edges between the partitions, and the number of vertices remaining to be included in any partition is at most $k − 1$. Now, if $\#\{i : c_i = 1\} \ge k − 1$, then at the last step the remaining vertices must be isolated ones, and these do not create any cross edge, and thus the $k$-section width is exactly zero. Otherwise, the remaining $k − 1$ vertices can form at most $k/2$ cross edges (the worst case being when there are no isolated vertices).
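The idea of filling the parts with whole components, largest first, can be illustrated with a simple greedy rule. The sketch below is a stand-in and not the capped procedure of the proof: it always places the next component into the currently lightest part, and it does not by itself guarantee a balanced partition. The function name `greedy_assign` is hypothetical.

```python
def greedy_assign(component_sizes, k):
    """Place whole components into k parts, largest first, lightest part first.

    Returns the part sizes and, for each part, the sizes of its components.
    Since no component is split, no cross edges are created.
    """
    loads = [0] * k
    parts = [[] for _ in range(k)]
    # Descending order handles components of size > 2 first, then pairs,
    # then isolated vertices, as in the order described above.
    for size in sorted(component_sizes, reverse=True):
        i = min(range(k), key=loads.__getitem__)  # index of a lightest part
        loads[i] += size
        parts[i].append(size)
    return loads, parts

loads, parts = greedy_assign([5, 4, 3, 2, 2, 2, 1, 1], k=2)
print(loads, parts)
```

In this toy instance the small components of size 1 and 2 are what even out the part sizes, which is the role played by conditions (ii) and (iii) of Lemma 4.1.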
We will now verify that CM n (d) with η < 1/k satisfies all the conditions of Lemma 4.1, with high probability.Condition (i) follows from Theorem 2.1 (i) and the fact that η < 1/k.In Lemma 4.2 below, we will show that the number of components of size 2 scaled by n, converges in probability to a positive constant, which verifies Condition (ii).Finally, Condition (iii) is a consequence of Theorem 2.1 (ii).The proof of Theorem 3.1 (i) is now complete by Lemma 4.1.
Recall that n 1 denotes the number of vertices in CM n (d) with degree one, and n 1 /n → P(D = 1) = p 1 > 0. Suppose that the degree one vertices are indexed as 1, 2, . . ., n 1 .We say that a pair is created if a degree one vertex is joined with another degree one vertex.Thus, the pairs are the components of size 2 in CM n (d).

Lemma 4.2. Let P
Proof.Note that, by Assumption 1 1 n Therefore, and an application of Chebyshev's inequality completes the proof.

Supercritical case
In this subsection we prove the supercritical case of the $k$-section width stated in Theorem 3.1 (ii). As mentioned earlier, since $\eta > 1/k$, the fraction of vertices in the largest component is more than $1/k$, with high probability. Therefore, in any balanced $k$-partition of the graph $G$, there must exist two distinct partitions each containing an asymptotically positive proportion of vertices from the largest component. It is thus enough to show that if the largest component is partitioned into two sets $V_1$, $V_2$, each containing a positive proportion of vertices, then with high probability, there exist $\Theta(n)$ cross edges between $V_1$ and $V_2$. The following key definition formalizes this cut-property:
We now briefly sketch the outline of the proof of Proposition 4.3. The idea was first introduced by Bollobás et al. [7] in the context of stability of the largest connected component of inhomogeneous random graphs. We leverage their technique for the configuration model, and in conjunction with suitable structural properties of the giant component, prove Proposition 4.3. The application of this technique to the configuration model poses a substantial challenge due to the dependence among edges, and the methods for inhomogeneous random graphs [7] or Erdős-Rényi random graphs [35] are not directly applicable. In this paper, we therefore present some novel arguments that establish the necessary structural properties for this proof technique to work. In particular, we introduce a sequential construction of the configuration model in Subsection 4.2.1 that facilitates the comparison between $\mathrm{CM}_n(d)$ and the graph with one deleted vertex.
For any graph $G$ with vertex set $V$, define the $k$-core to be the maximal set of vertices $V_k \subseteq V$ such that in the subgraph induced by $V_k$, each vertex has degree at least $k$. Note that the $k$-core of any graph is unique, although it can possibly be empty. It is worthwhile to note that the 2-core of any connected graph is also connected. Algorithmically, the $k$-core of a graph can be obtained by sequentially deleting the vertices of degree less than $k$ along with all their incident edges, until all the vertices in the remaining graph have degree at least $k$. Observe that $V_k \supseteq V_{k+1}$, and the subgraph induced by $V \setminus V_2$ is a forest. See Figure 2a for an instance of the 2-core of a graph and the trees hanging from it. Figure 2b visualizes the 3-core as a subset of the 2-core.
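The peeling description of the $k$-core translates directly into code. The sketch below is a minimal illustration for simple graphs (no self-loops or multiple edges, an assumption of the sketch) on vertices $0, \ldots, n-1$; the function name `k_core` is hypothetical.

```python
from collections import defaultdict

def k_core(n, edges, k):
    """Vertex set of the k-core of a simple graph on vertices 0..n-1."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    # Start with every vertex whose degree is already below k.
    stack = [v for v in range(n) if len(adj[v]) < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already deleted via an earlier stack entry
        alive.remove(v)
        # Deleting v lowers the degree of each remaining neighbour.
        for w in adj[v]:
            adj[w].discard(v)
            if w in alive and len(adj[w]) < k:
                stack.append(w)
        adj[v].clear()
    return alive

# A triangle 0-1-2 with a path 2-3-4 hanging from it.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]
print(k_core(5, edges, 2), k_core(5, edges, 3))
```

In the example the 2-core is the triangle, the hanging path is peeled off, and the 3-core is empty, consistent with the inclusion of the 3-core in the 2-core.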
As explained above, the largest connected component $C_{(1)}$ of $\mathrm{CM}_n(d)$ can be decomposed into two disjoint subsets of vertices: the 2-core $C^2_{(1)}$, and a forest of vertex-disjoint trees hanging from the 2-core. Informally speaking, the 2-core is the denser part of the graph. Therefore, at a high level, splitting the 2-core into two parts, each containing a positive proportion of vertices, is in general costly, and would lead to the formation of a large number of cross edges. Thus the optimal strategy might be to peel off the hanging trees, since moving each hanging tree to some other partition would form precisely one cross edge. But in that case too, we show that the number of vertices in each of the hanging trees is small (essentially finite), and hence in order to move $\Theta(n)$ vertices to some other partition, $\Theta(n)$ trees must be cut, and thus $\Theta(n)$ cross edges must be created.
To formalize the above heuristics, the proof of Proposition 4.3 breaks into two key steps, each being true with high probability: (i) The hanging trees are not heavy, in the sense that peeling off a small number of them cannot separate out a large number of vertices.This is formalized in Lemma 4.4.
Denote by $\mathcal{T}_h$ the set of all trees attached to the 2-core of $C_{(1)}$, i.e., $T \in \mathcal{T}_h$ if and only if the subgraph of $C_{(1)}$ induced by $T$ is a tree, $T \cap C^2_{(1)} = \emptyset$, and there exists only one vertex $v_T \in C^2_{(1)}$ that shares an edge with some vertex in $T$. With a slight abuse of notation we also write $T$ for the set of vertices in $T$. We always assume that each tree $T \in \mathcal{T}_h$ is rooted at the unique vertex $w_T$ such that $(v_T, w_T)$ is an edge and $v_T \in C^2_{(1)}$.
Lemma 4.4 (Hanging trees are not heavy). For any $\varepsilon > 0$, there exists $\delta = \delta_1(\varepsilon) > 0$ such that, with high probability, any collection $\mathcal{T} \subseteq \mathcal{T}_h$ of $\delta n$ trees contains at most $\varepsilon n$ vertices in total.
The proofs of Lemmas 4.4 and 4.5 are rather technical, and are provided at the end of the subsection. Now we prove Proposition 4.3 using Lemmas 4.4 and 4.5. In Figure 3 we provide a schematic diagram for the structure of the proof of Proposition 4.3 and the interdependence of the different intermediate lemmas.
We now claim that for this choice of δ, there is no (ε, δ)-cut in C (1) .Indeed, existence of an (ε, δ)-cut in C (1) implies that there exists δn edges, whose removal splits C (1) into two parts, both containing at least εn vertices.Observe that due to the choice of δ, removal of any set of δn edges can separate out at most εn/2 vertices belonging to ∪ T ∈T h {T }, and at most εn/2 vertices belonging to C 2 (1) with high probability, and the proof is complete.

Hanging trees are not heavy
Proof of Lemma 4.4.The proof consists of two main steps.The first step establishes a property of the underlying degree sequence, which states that the sum of the degrees of 'small' number of vertices is 'small'.
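Lemma 4.6. Under Assumption 1.b, given any $\varepsilon, r > 0$, there exists $\delta = \delta(\varepsilon, r) > 0$ such that, for all sufficiently large $n$, the sum of the degrees of the vertices in the $r$-neighborhood of any set of fewer than $\delta n$ vertices is less than $\varepsilon n$, i.e., $\sum_{u \in N[U, r]} d_u < \varepsilon n$ uniformly over all subsets $U \subseteq [n]$ such that $|U| < \delta n$.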
In the second step we show that r can be chosen large enough, so that with high probability, the total number of vertices at depth more than r in all hanging trees combined, is arbitrarily 'small'.This is formalized in Lemma 4.7.
Given Lemmas 4.6 and 4.7, the proof of Lemma 4.4 can now be completed. Consider the following equivalent restatement of Lemma 4.4: For any $\varepsilon, \beta > 0$, there exist $\delta = \delta(\varepsilon) > 0$ and $n_0 = n_0(\varepsilon, \beta)$ such that the probability that there exists a subset $\mathcal{T} \subseteq \mathcal{T}_h$ with $|\mathcal{T}| < \delta n$ and $|\bigcup_{T \in \mathcal{T}} T| \ge \varepsilon n$ is at most $\beta$ for all $n \ge n_0$.
To show the above statement, fix any ε, β > 0. Using Lemma 4.7, choose r = r(ε/2) and Also, appealing to Lemma 4.6, we choose δ = δ(ε/2, r) and where w T is the unique vertex in T that has a neighboring vertex in C 2 (1) .Choose n 0 = max{n 1 , n 2 } so that, for all n ≥ n 0 , the probability of the first event is 0, and that of the latter event is at most β, which concludes the proof.
It remains to prove Lemmas 4.6 and 4.7.We start with Lemma 4.6.
Proof of Lemma 4.6.Fix any ε > 0. We first verify the case when r = 1 and prove this lemma by induction.Due to Assumption 1b, K = K(ε) > 0 can be chosen such that for all sufficiently large n, Take δ = ε/(2K), and fix any V ⊆ [n] with |V | < δn.Then, To prove Lemma 4.7 we require a detailed understanding of the local neighborhood structure of CM n (d).For the ease of readability, we start with a heuristic road-map of the arguments.Observe that for any fixed r > 0, and given any random observation G of , where we recall that V n denotes a typical vertex.Therefore, it is enough to show that for any ε > 0, r = r(ε) can be chosen large enough, such that However, it is challenging to obtain the latter probability.For this reason, we will use the local event approximation technique, a key element in the study of sparse random graphs [7,9,23,24,25,32].In particular, our results for the configuration model mirror the ones proved in [7] in the context of inhomogeneous random graphs.Roughly speaking, the crucial idea is based upon two observations: (i) The local neighborhood of a typical vertex resembles a branching process, i.e., with high probability, the breadth-first-search (BFS) exploration starting from V n up to suitable depth can be coupled with a branching process.This is formally stated in Proposition 4.8.
(ii) By looking at the local neighborhood of $V_n$ up to a suitable distance, it can be determined whether $V_n$ is near the 2-core. More specifically, the event that $V_n$ is within the $r$-neighborhood of the 2-core is asymptotically 'equivalent' to the event that, for some $L_n \to \infty$, there exist two vertex-disjoint paths of length $L_n$ from a vertex within the $r$-neighborhood of $V_n$. This fact is later formalized in Lemma 4.9.
The proof follows once we have these ingredients in place. First we introduce some notation. Denote by $X$ the branching process with initial distribution $D$ and progeny distribution $D^* − 1$, where $D$ is the limiting random variable as in Assumption 1, and $D^*$ follows the size-biased distribution of $D$, i.e., $P(D^* = j) = j\,P(D = j)/E[D]$, $j \ge 1$.
Note that the survival probability of $X$ is given by $\eta$, as in (2.2) (cf. [31]). The number of offspring of $X$ in generation $l$ is denoted by $Z_l$, and the number of vertices at distance $l$ in the breadth-first neighborhood exploration tree (i.e., the BFS tree) starting from vertex $v$ is denoted by $Z_l(v)$. Furthermore, define the following events:
(a) $\mathrm{TC}_r(v)$: the vertex $v$ is within distance $r$ of the 2-core of $C_{(1)}$.
(b) $\mathrm{LTC}_r(v, L)$: there exists a vertex $v'$ at distance $t$ from $v$, $t \le r$, with two vertex-disjoint paths of length $L$ starting at $v'$ which join $v'$ to vertices at distance $t + L$ from $v$.
(c) $\mathrm{DS}_r$: the branching process $X$ has a progeny within the first $r$ generations that has two children, both of which survive till infinity.
(d) $\mathrm{LDS}_r(L)$: the branching process $X$ has a progeny within the first $r$ generations that has two children surviving a further $L$ generations.
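The survival probability of the branching process $X$ can be computed numerically for a finitely supported degree distribution. The sketch below assumes the standard branching-process identity, not restated in the text above: if $\xi$ is the smallest fixed point in $[0, 1]$ of the generating function of $D^* − 1$, then the survival probability equals $1 − E[\xi^D]$. The function name `survival_probability` and the list encoding `pmf[j] = P(D = j)` are choices made for this illustration.

```python
def survival_probability(pmf, iterations=10_000):
    """Survival probability of the branching process with root offspring D
    and later offspring D* - 1, where pmf[j] = P(D = j).

    Assumption: survival = 1 - E[xi^D], with xi the extinction probability
    of a single non-root subtree.
    """
    mean = sum(j * p for j, p in enumerate(pmf))
    # Law of D* - 1: P(D* - 1 = j) = (j + 1) P(D = j + 1) / E[D].
    offspring = [(j + 1) * pmf[j + 1] / mean for j in range(len(pmf) - 1)]
    # Iterating the generating function from 0 increases monotonically to the
    # smallest fixed point, i.e. the extinction probability of a subtree.
    xi = 0.0
    for _ in range(iterations):
        xi = sum(q * xi ** j for j, q in enumerate(offspring))
    # The root dies out iff all of its D subtrees die out.
    return 1.0 - sum(p * xi ** j for j, p in enumerate(pmf))

# Half of the vertices have degree 1 and half have degree 3.
print(survival_probability([0.0, 0.5, 0.0, 0.5]))
```

For this example the offspring law of $D^* − 1$ puts mass $1/4$ on 0 and $3/4$ on 2, so $\xi = 1/3$ and the survival probability is $22/27$.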
As explained in the proof sketch above, the following proposition couples the local neighborhood of a typical vertex with the branching process X .l } l≥1 , {Z 2 l } l≥1 be two independent copies of {Z l } l≥1 , and V n , W n be two independent typical vertices of CM n (d).There exists (L n ) n≥1 such that L n → ∞, and a coupling ( Ẑ1 The next lemma shows that for any (L n ) n≥1 that increases to infinity at a rate slower than log(n), the two events TC r (V n ) and LTC r (V n , L n ) are equivalent.Lemma 4.9.Let (L n ) n≥1 be such that L n → ∞ and L n / log(n) → 0.Then, for any fixed r ≥ 1, We defer the proof of Lemma 4.9 until Section 4.3, and complete the proof of Lemma 4.7 using Lemma 4.9.
Proof of Lemma 4.7.Fix any r > 0. Observe that for any L n such that L n → ∞, Furthermore, choose L n such that Lemma 4.9 holds.Therefore, for L n = min{L n }, Also, and hence using (4.7), we get To find Var |N [C 2 (1) , r]| , consider two vertices V n , W n chosen uniformly at random independently of the graph and independently of each other.Again, note that Thus, Recall from Proposition 4.8 that with high probability, the L n neighborhoods of V n , W n can be coupled with two independent copies of X .Hence, under the given coupling and it follows that Therefore,  Now for any supercritical branching process conditioned on survival, the probability that the root has atleast two children surviving to infinity is bounded away from zero.Therefore, conditioned on survival, the probability that any progeny in an infinite line of descendants has another child that survives till infinity is bounded away from zero.Thus, P(X survives \ DS r ) ≤ c r , for some c < 1.Further, since DS r is an increasing event in r, P(X survives \ ∪ r≥0 DS r ) ≤ lim r→∞ c r = 0, and hence lim r→∞ P (DS r ) = P (X survives) = η. (4.12) Using Theorem 2.1, (4.11) yields Now, P(DS r ) η as r → ∞.Thus, for any ε > 0, we can choose r 0 = r 0 (ε) such that η − P (DS r ) < ε for all r ≥ r 0 .Hence, with high probability C (1) \ N [C 2 (1) , r] < εn, for all r ≥ r 0 .

2-core is well-connected
Proof of Lemma 4.5. In this proof we leverage the first moment method argument as used in [7]. Condition on the degree sequence d = ( d1 | and let m 2 be the number of edges in the 2-core. Recall that C 2 (1) can be obtained from C (1) by sequentially deleting the vertices of degree one until all the vertices in the remaining subgraph have degree at least two. Thus, two paired half-edges are deleted at each step, and conditional on the deleted half-edges the perfect matching on the rest of the half-edges remains a uniform perfect matching. In particular, C 2 (1) is distributed as a configuration model conditioned on the degree sequence d (cf. [30, Section 3]). Furthermore, we will need the following estimate for the number of degree three vertices in the 2-core:
Claim 2. Let N j denote the number of vertices in the 2-core having degree j. Denote by ρ j the probability that the root of the branching process X has exactly j neighbors that survive. Then, as
Proof. The proof follows using similar arguments as in the proof of Lemma 4.7. Note that , and D n = j), where V n is a typical vertex, and D n is the degree of V n . Let TS j (V n ) denote the event that {V n ∈ C 2 (1) , and D n = j} and LTS j (V n ) denote the (localized) event that there are j disjoint non-self-intersecting paths starting from V n of length L n , where L n → ∞ such that Proposition 4.8 holds. Essentially the same arguments as in the proof of Lemma 4.9 (see Section 4.3) can be followed to show that, for L n → ∞ and Moreover, an application of Proposition 4.8 and an argument identical to (4.9) again yields Var (N j ) = o(n 2 ) and the proof follows.
Having proved the local event approximation in Section 4.3, the rest of the proof is similar to [7], and will be sketched briefly for completeness. For any subset $A \subset \mathcal{C}^2_{(1)}$, we define $\bar{A} = \mathcal{C}^2_{(1)} \setminus A$. Further, recall that for $A \subset \mathcal{C}^2_{(1)}$, we denote the half-edges incident to the vertices in $A$ by $S(A)$. For a set of half-edges $S$, denote by $p(S; \boldsymbol{d})$ the probability that the half-edges of $S$ are paired among each other in $\mathcal{C}^2_{(1)}$, conditional on $\boldsymbol{d}$. Using the fact that the half-edges of $\mathcal{C}^2_{(1)}$ form a uniform perfect matching conditional on the degrees, we obtain and there is a subset $S \subset S(A)$ with $|S(A) \setminus S| \le \delta n$ such that all the half-edges in $S$ are paired with each other during the random matching of the half-edges. Let $\Gamma_n$ denote the number of bad partitions of $\mathcal{C}^2_{(1)}$. Thus, where $\mathbb{E}_{\boldsymbol{d}}[\cdot]$ denotes the conditional expectation given the degree sequence of $\mathcal{C}^2_{(1)}$ to be $\boldsymbol{d}$.
We need to show that for all $\varepsilon > 0$, there exists We first derive a lower bound on $\binom{m_2}{|S|/2}$. Observe that each vertex in $\mathcal{C}^2_{(1)}$ is at least $\varepsilon_1 n$ with high probability. Fix such an $\varepsilon_1 > 0$, and let $A_n$ denote the event that the number of degree-three vertices in $\mathcal{C}^2_{(1)}$ is at least $\varepsilon_1 n$. Note that on $A_n$, one of the parts among $A$ and $\bar{A}$ contains at least $\varepsilon_1 n/2$ degree-three vertices. Consequently, either Using these bounds, and the fact that $|A|, |\bar{A}| \ge \varepsilon n$, it follows that for some $a > 0$ chosen as a function of $\varepsilon$ and $\varepsilon_1$. Moreover, for any partition $A, \bar{A}$ we have $|S(A) \setminus S| \le \delta n$, and $\delta$ can be chosen small enough such that which gives the requisite lower bound.
To derive an upper bound on the number of possible choices for $A$ and $S$ in (4.16), we note that given $|A| = a_0$, there are $\binom{n_2}{a_0}$ ways of choosing $A$. Also, given $A$, there are at most $\binom{2m_2}{\delta n}$ choices for $S(A) \setminus S$ such that $|S(A) \setminus S| \le \delta n$. Plugging these estimates back into (4.16) yields Thus, for a small enough choice of $\delta > 0$, it follows that This completes the proof of Lemma 4.5.
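The leaf-deletion description of the 2-core used in the proof above can be illustrated with a short sketch. It is only an illustration under simplifying assumptions: the input is taken to be a simple graph (no self-loops or multiple edges, unlike a general configuration model), and the function name `two_core` is hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def two_core(edges):
    """Return the vertex set of the 2-core of a simple undirected graph.

    Vertices of degree at most one are deleted repeatedly until every
    remaining vertex has degree at least two. Assumes no self-loops.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Vertices whose current degree is at most one are queued for deletion.
    stack = [v for v in adj if len(adj[v]) <= 1]
    removed = set()
    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        for u in adj[v]:
            adj[u].discard(v)
            # Deleting v may turn its neighbor into a new leaf.
            if u not in removed and len(adj[u]) <= 1:
                stack.append(u)
        adj[v].clear()
    return set(adj) - removed
```

For a triangle with a pendant path attached, the sketch returns the three triangle vertices; for a tree it returns the empty set.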

Approximation of typical local neighborhoods
We prove Lemma 4.9 in this section. A component $\mathcal{C}_{(i)}$ for $i \ge 2$ (i.e., any component except the largest) will be called an intermediate component if We need to study some structural properties of the intermediate components that will play a key role in establishing Lemma 4.9. For any $L > 0$, define Consequently, for any

Proof. Fix any $K \ge 1$. Recall from Theorem 2.1 that Now, based on the information about the $K$-neighborhood of $V_n$, it can be exactly determined whether the event $\{|\mathcal{C}(V_n)| > K\}$ has occurred or not. Therefore, using Proposition 4.8, we have and hence (4.18) follows. To see (4.19), notice that by (4.18),

Let us now introduce the following novel construction of the configuration model, which will allow us to relate it to the graph after deletion of one vertex. This will be crucial for completing the proof of Lemma 4.9.

(S2) At step $t + 1$, $0 \le t \le \tilde{n} - 1$, choose $d_{\sigma_t}$ degree-one vertices from the graph $G(t)$ uniformly at random, independently of the perfect matching, and coalesce them into a single black vertex with index $\sigma_t$. Let $G(t + 1)$ be the new modified graph, and set $V(t + 1) = V(t) \cup \{\sigma_t\}$. See Figure 4 for an illustration of this step.

(S3) After the $\tilde{n}$th step, when all indices $i$ with $d_i > 1$ are exhausted, label all the degree-one vertices at random, independently of (S1) and (S2).

Note that the vertex index assignment process is independent of the initial perfect matching, and therefore, at any time step $t$, $G(t)$ is a configuration model given its degree sequence. The algorithm thus indeed produces a configuration model with degree sequence $\boldsymbol{d}$ in the end. This is formally stated in Lemma 4.11. Also, notice that at any time step $t$, the subgraph in $G(t)$ induced by the set of black vertices remains fixed until the formation of $\mathrm{CM}_n(\boldsymbol{d})$.

Lemma 4.11. For all $t \ge 0$, $G(t)$ is a configuration model given its degree sequence. In particular, the final graph is distributed as $\mathrm{CM}_n(\boldsymbol{d})$.
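Steps (S1) to (S3) of Algorithm 1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `coalescing_configuration_model` and the `seed` parameter are hypothetical, vertices are indexed from 0, and a single random shuffle of the red vertices is used as a stand-in for the sequential uniform choices in (S2), which has the same distribution.

```python
import random


def coalescing_configuration_model(degrees, seed=None):
    """Sample the edge list of a configuration model by coalescing.

    degrees[i] is the prescribed degree of vertex i. Self-loops and
    multiple edges may occur, as in any configuration model.
    """
    rng = random.Random(seed)
    total = sum(degrees)  # sum of the degrees (number of half-edges)
    if total % 2:
        raise ValueError("the sum of the degrees must be even")

    # (S1) Uniform perfect matching on `total` red degree-one vertices.
    reds = list(range(total))
    rng.shuffle(reds)
    partner = {}
    for a, b in zip(reds[0::2], reds[1::2]):
        partner[a] = b
        partner[b] = a

    # Order in which red vertices are consumed; independent of the matching.
    unassigned = list(range(total))
    rng.shuffle(unassigned)
    owner = {}

    # (S2) Coalesce d_i red vertices into vertex i, for all i with d_i > 1.
    order = [i for i, d in enumerate(degrees) if d > 1]
    rng.shuffle(order)  # any permutation of these indices is allowed
    for i in order:
        for _ in range(degrees[i]):
            owner[unassigned.pop()] = i

    # (S3) Label the remaining degree-one vertices.
    for i, d in enumerate(degrees):
        if d == 1:
            owner[unassigned.pop()] = i

    # Each matched pair of red vertices becomes one edge.
    return [(owner[a], owner[b]) for a, b in partner.items() if a < b]
```

The matching is drawn before any index is assigned, which reflects the independence between the matching and the index assignment that Lemma 4.11 relies on.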

Remark 4.
In Algorithm 1, the indices corresponding to the vertices with degrees at most one are assigned at the final step (S3). It is worthwhile to note that this is not strictly necessary in order for the algorithm to work. In particular, since the uniform matching is created independently of the index assignments, any assignment ordering produces $\mathrm{CM}_n(\boldsymbol{d})$ in the end. In the proof of Lemma 4.9 below, however, we will require the stated order of indexing the vertices.
Figure 4: The red vertices are the ones that have not yet been assigned any index. At the $\tilde{n}$th step, five of the unlabeled degree-one vertices are selected, and the vertex $v$ is formed.
Fix any vertex $v$ of degree at least 2, and any permutation $\{\sigma_1, \sigma_2, \dots, \sigma_{\tilde{n}}\}$ of the set $\{i \in [n] : d_i > 1\}$ such that $\sigma_{\tilde{n}} = v$. Denote the sequence of graphs constructed in Algorithm 1 by $\{G_v(t)\}_{t \ge 0}$, i.e., $G_v(t)$ denotes the graph at the $t$th step. The $(\tilde{n} - 1)$th and $\tilde{n}$th steps of the algorithm are schematically presented in Figure 4. Now, we complete the proof of Lemma 4.9. We will use the following fact:

Lemma 4.12. For any degree sequence satisfying Assumptions 1.a and 1.b, the maximum degree Moreover, observe that Since the left side of the above inequality does not depend on $K$, it follows that

Proof of Lemma 4.9. Fix $r > 0$, and The proof is split into two steps: we show that (i)

Case-(i):
Define the event $C(v, r, L)$ that the vertex $v$ is within distance $r$ from a cycle of length at most $L$. Then note that ) be such that there exists a path $P_1$ of length at most $r$ from $v$ to $v_1$ (take ). Now, since $v_1$ is in the two-core, there exist at least two vertex-disjoint paths (disjoint from $P_1$) starting from $v_1$, and because $\mathrm{LTC}_r(v, L)$ does not happen, any two such paths must either meet each other, or one of them intersects itself within distance $L$ from $v$. In either case, a cycle of length at most $2L$ is created that is joined to $v$ via a path of length at most $r + L$, and therefore $C(v, r + L, 2L)$ must hold.
Proof. In the proof we will make use of the path-counting techniques as in [4,28]. Define $\ell_n' := \ell_n - 4L_n + 1$. Note that due to Assumption 1.b, a constant $\kappa > 1$ can be chosen such that 1 The event $C(V_n, L_n, L_n)$ implies that there is a path $(V_n, x_1, x_2, \dots, x_l)$ of length $l \le L_n$, and $x_l$ belongs to a cycle $(x_l, x_{l+1}, \dots, x_{l+m-1})$ of length $m \le L_n$, where the $x_i$'s are distinct. Fix some $V_n = v$. Then the number of structures with a path $(v, x_1, x_2, \dots, x_l)$ and a cycle $(x_l, x_{l+1}, \dots, x_{l+m-1})$ is given by where the first term in the product is due to the number of ways the path can be formed, and the second is due to the cycle. Furthermore, each of these specific configurations has probability for some constant $K > 0$, where in the final step we have used (4.23) and Lemma 4.12. Therefore, by Assumption 1.b, and the fact that $L_n = o(\log(n))$.
Therefore, for any fixed r ≥ 1, and the proof of part (i) is complete.

Case-(ii)
We prove this part for $r = 0$. The proof of the general case is included at the end. Fix any vertex $v \in [n]$, and condition on , then $\mathbb{P}(\mathrm{LTC}_0(v, L_n)) = \mathbb{P}(\mathrm{TC}_0(v)) = 0$. So, without loss of generality, assume that $d_v > 1$ and $v \in \mathcal{C}_{(1)}$. Recall the construction in Algorithm 1 and the definition of the graph $G_v(t)$. Note that, if $\mathrm{LTC}_0(v, L_n) \setminus \mathrm{TC}_0(v)$ happens, then there are two vertex-disjoint paths in $\mathrm{CM}_n(\boldsymbol{d})$ starting from $v$, which have length at least $L_n$, but they do not meet each other. Furthermore, the event $\mathrm{LTC}_0(v, L_n) \setminus \mathrm{TC}_0(v)$ is determined by the graph $G_v(\tilde{n})$. Define the event $E(v)$ that, while creating the vertex with index $v$ at time $\tilde{n}$, one of the degree-one vertices in one of the intermediate components of $G_v(\tilde{n} - 1)$ was chosen. Observe that Let $Q^v_n(L_n)$ denote the total number of vertices in the intermediate components of size more than $L_n$ in the graph $G_v(\tilde{n} - 1)$. Using Lemmas 4.11 and 4.12, it follows that $G_v(\tilde{n} - 1)$ is a configuration model given its degree sequence, which satisfies Assumption 1.
Proof. Note that an application of Lemma 4.10 directly implies that n ). Therefore, we get (iii) $\max_{v \in [n]} |\nu^v_n - \nu_n| \to 0$ as $n \to \infty$. Now, while approximating the breadth-first exploration of $G_v(\tilde{n} - 1)$ by a suitable branching process in (4.6), one can in fact obtain error estimates that are uniform over $v$. This is a consequence of the precise bounds stated in [23, Lemma 5.6], which are used as the main ingredient for the proof of [23, Proposition 5.4]. Therefore, while proving (4.21) for the graph $G_v(\tilde{n} - 1)$, one can use (i) and (iii) above to get error estimates that are uniform in $v$. Thus, the claim follows.
Finally, we bound the probability of the event $E(V_n)$. Note that in $G_v(\tilde{n} - 1)$ there are $n_1 + d_v - 1$ degree-one vertices. Therefore, conditional on $G_v(\tilde{n} - 1)$, the vertex $v$ is created at step $\tilde{n}$ by choosing $d_v$ vertices from a set of $n_1 + d_v - 1$ vertices, and $E(v)$ occurs if at least one of those degree-one vertices is from an intermediate component (for which there are at most $Q^v_n(L_n)$ choices). Thus, Again, by Assumption 1, there exists a constant $K > 0$ such that $n_1 + d_v - 1 \ge \ell_n / K$ for all large $n$. Hence, where the last step follows from Claim 4. Thus it follows that
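The counting step behind the bound on $E(v)$ is a union bound for sampling without replacement: if $d$ vertices are chosen uniformly from $N$ degree-one vertices, of which at most $Q$ lie in intermediate components, the chance of hitting at least one of them is at most $dQ/N$. The sketch below compares the exact probability with this bound. The function names and the symbols `N`, `Q`, `d` are illustrative stand-ins (for $n_1 + d_v - 1$, $Q^v_n(L_n)$ and $d_v$), and the displayed inequality in the paper may be stated differently.

```python
from math import comb


def prob_hit_marked(N, Q, d):
    """Exact probability that a uniform d-subset of N items
    contains at least one of Q marked items."""
    return 1 - comb(N - Q, d) / comb(N, d)


def union_bound(N, Q, d):
    """Union bound: each of the d chosen items is marked w.p. Q / N."""
    return min(1.0, d * Q / N)


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the bound is tight when Q << N.
    for N, Q, d in [(1000, 30, 5), (10_000, 50, 8), (200, 100, 3)]:
        print(N, Q, d, prob_hit_marked(N, Q, d), union_bound(N, Q, d))
```

When $Q = o(N)$ and $d$ is bounded, both quantities vanish, which is the regime in which the argument is applied.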

Proof for the Max-Cut
We prove Theorem 3.2 in this section. The proofs for the subcritical and supercritical cases in Theorem 3.2 (i) and (ii) are provided in Sections 5.1 and 5.2, respectively. The case of large mean degree stated in Theorem 3.2 (iii) is proved in Section 5.3.

Subcritical case
The idea in the subcritical regime is to count the number of cycles. This idea has also been adopted in the proof of [12, Theorem 19] for Erdős-Rényi random graphs. Observe that the bipartite components (components with no cycles or only cycles of even length) contribute all of their edges to the Max-Cut. To analyze the non-bipartite components, we first observe in Lemma 5.1 that all the components of a subcritical $\mathrm{CM}_n(\boldsymbol{d})$ contain at most one cycle with high probability. Observe that the Max-Cut leaves precisely one edge uncut in each of the unicyclic, non-bipartite components. Therefore, the number of uncut edges in the Max-Cut is, with high probability, equal to the number of cycles of odd length that the graph contains. Now, the asymptotic number of cycles of length $k$ in $\mathrm{CM}_n(\boldsymbol{d})$, for any fixed $k \ge 1$, is derived in [8, Theorem 2.18], and is stated in the following lemma. Let $C^n_k$ denote the number of cycles of length $k$ in $\mathrm{CM}_n(\boldsymbol{d})$ (a cycle of length one denotes a loop and a cycle of length two denotes a multiple edge).
where The next lemma proves that, with high probability, there are no cycles of growing length. This will be used to show that, asymptotically, the total number of odd-length cycles is equal to the sum of the numbers of cycles of each fixed odd length. (5.2) Lemma 5.3 is proved at the end of this subsection. Now, we prove the result for the subcritical Max-Cut by using Lemmas 5.1, 5.2 and 5.3.

Proof of Theorem 3.2 (i). As mentioned earlier, the Max-Cut leaves precisely one edge uncut in each of the unicyclic, non-bipartite components, and by Lemma 5.1, with high probability, the total number of uncut edges is precisely equal to the total number of odd-length cycles. Therefore, recalling that the total number of edges equals $\ell_n/2$, it follows that $\ell_n/2 - \mathrm{MaxCut}(\mathrm{CM}_n(\boldsymbol{d})) = \sum_{k \ge 1,\ k \text{ odd}} C^n_k$ with high probability. Hence, Lemmas 5.2 and 5.3 yield, as $n \to \infty$,

Proof of Lemma 5.3. For brevity of notation, denote by $M$ the total number of edges, i.e., $M = \ell_n/2$. We find the expected value of $C^n_k$, again using path-counting techniques. To this end, we first fix $k$ distinct vertices $x_1, \dots, x_k$ which participate in the cycle in the given order. We denote by $I_k = \{(x_1, \dots, x_k) : x_i \ne x_j, \ \forall i \ne j\}$. For each vertex $x_i$, the two half-edges which participate in the cycle may be chosen in $d_{x_i}(d_{x_i} - 1)$ ways. The number of ways to pair these half-edges is thus i d where the last step follows from [20, Theorem 52] (see also the proof of [27, Lemma 5.1]). Now, using Stirling's formula we have, In analogy with (2.1), we define where the constant $\kappa_1 > 0$ can be chosen to be independent of $K$. Now, for subcritical $\mathrm{CM}_n(\boldsymbol{d})$, we have $M < n$. To see this, note that by the Cauchy-Schwarz inequality, $\ell_n = \sum_{i \in [n]} d_i \le \big(n \sum_{i \in [n]} d_i^2\big)^{1/2}$. Taking the square on both sides and using the fact that $\sum_{i \in [n]} d_i^2 < 2\ell_n$ since $\nu_n < 1$, we get $\ell_n < 2n$. Therefore, $(1/M - 1/n) > 0$, and hence, where the constant $\kappa_2 > 0$ is independent of $K$.
Thus, if we first take $n \to \infty$ and then $K \to \infty$. To count the number of cycles of length $> \sqrt{n}$, note that P ∃ a cycle of length more than
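The identity used in the proof of Theorem 3.2 (i), namely that the number of uncut edges equals the number of odd cycles when every component has at most one cycle, can be checked by brute force on a small example. The sketch below is only an illustration: the function names are hypothetical, the exhaustive search is exponential in the number of vertices, and the count of non-bipartite components coincides with the count of odd cycles only under the unicyclic assumption.

```python
from itertools import product


def max_cut_bruteforce(n, edges):
    """Exact Max-Cut of a multigraph on vertices 0..n-1 by enumeration."""
    best = 0
    for side in product((0, 1), repeat=n):
        cut = sum(1 for u, v in edges if side[u] != side[v])
        best = max(best, cut)
    return best


def count_nonbipartite_components(n, edges):
    """Number of components containing an odd cycle (loops count as odd)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    count = 0
    for s in range(n):
        if color[s] is not None:
            continue
        color[s] = 0
        stack, odd = [s], False
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if color[y] is None:
                    color[y] = 1 - color[x]
                    stack.append(y)
                elif color[y] == color[x]:
                    odd = True  # an edge inside one color class
        count += odd
    return count


# A triangle with a pendant vertex, a 4-cycle, and a single edge.
example_edges = [(0, 1), (1, 2), (2, 0), (2, 3),
                 (4, 5), (5, 6), (6, 7), (7, 4), (8, 9)]
uncut = len(example_edges) - max_cut_bruteforce(10, example_edges)
```

In this example only the triangle is an odd cycle, so exactly one edge is left uncut.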

Supercritical case
The proof for the supercritical case builds upon the following idea: the fact that a graph has a small Max-Cut implies that deletion of a small number of edges can make the graph bipartite. When the graph is supercritical, deletion of a small number of edges can still leave it supercritical. In that case, if one can show that the probability of the latter supercritical graph being bipartite is small, then the original supercritical graph cannot have a small Max-Cut. This idea has been leveraged in [12, Theorem 21] to prove the phase transition of the Max-Cut for the Erdős-Rényi random graph. The main challenge of implementing this idea for the configuration model is that if a set of edges is deleted from a configuration model (possibly depending on the outcome of the random graph topology), then the edge-deleted graph is not distributed as a configuration model given its degree sequence. It thus becomes challenging to approximate the probability that the graph becomes bipartite after a number of edge deletions.

Motivated by the above issues, in the case of $\mathrm{CM}_n(\boldsymbol{d})$ we introduce a notion of blowing up vertices. In a way, blowing up a vertex is the reverse process of forming a vertex at (S2) of Algorithm 1. Let $G = (V, E)$ be any graph. Also let $v \in V$ be a vertex of degree $d_v \ge 2$ with $\{u_1, u_2, \dots, u_{d_v}\}$ being the set of neighbors in $G$. Then define the graph $G_b(v)$ as follows: replace $v$ by a collection of $d_v$ degree-one vertices $\{v_1, v_2, \dots, v_{d_v}\}$, and for $i = 1, 2, \dots, d_v$, add the edge $(u_i, v_i)$. We say that $G_b(v)$ is obtained by blowing up the vertex $v$. The graph obtained by blowing up a set of vertices $U \subseteq V$, each with degree at least 2, is defined by sequentially blowing up each vertex in $U$. Just as with edge deletion, note that if a graph has a small Max-Cut, then by blowing up a small number of vertices it should be possible to make the graph bipartite. Now, it is crucial to note that for $G = \mathrm{CM}_n(\boldsymbol{d})$ and any set of vertices $U \subseteq V$, each with degree at least 2, the graph $G_b(U)$ is distributed as a configuration model given its degree sequence. This key observation enables us to estimate the probability that blowing up a small set of vertices makes the graph bipartite.
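The blow-up operation defined above can be sketched on an edge list as follows. This is a minimal illustration under stated assumptions: the names `blow_up` and `is_bipartite` are hypothetical, and the new degree-one vertices are labeled by tuples `(v, i)`, a labeling convention chosen here for convenience and not taken from the paper.

```python
from collections import defaultdict, deque


def blow_up(edges, v):
    """Replace v by one new degree-one vertex (v, i) per incident half-edge."""
    new_edges = []
    copies = 0
    for a, b in edges:
        if a == v:
            a = (v, copies)
            copies += 1
        if b == v:  # a self-loop at v becomes an isolated pair
            b = (v, copies)
            copies += 1
        new_edges.append((a, b))
    return new_edges


def is_bipartite(edges):
    """Two-color every component by breadth-first search."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    color = {}
    for s in list(adj):
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in color:
                    color[y] = 1 - color[x]
                    queue.append(y)
                elif color[y] == color[x]:
                    return False
    return True


triangle = [(0, 1), (1, 2), (2, 0)]
path = blow_up(triangle, 0)  # the odd cycle is opened into a path
```

Blowing up a single vertex of the triangle keeps the number of edges fixed but destroys the odd cycle, which is the mechanism used in place of edge deletion.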
Thus our proof proceeds in two steps: (i) first, in Lemma 5.4, we show that the probability that a supercritical configuration model is bipartite is exponentially small, and then (ii) using a union bound, we establish that for any $\nu > 1$, there exists a $\delta > 0$ for which the probability that blowing up any set of $\delta n$ vertices makes the graph bipartite converges to 0. This will complete the proof of Theorem 3.2 (ii). First we formally state and prove Lemma 5.4.
Notice that since $\nu > 1$ and $\mathbb{P}(D = 1) > 0$, we must have some $k \ge 2$ such that $\mathbb{P}(D = k) > 0$. Without loss of generality, in the rest of this section we assume that $\mathbb{P}(D = 2) > 0$. The argument below remains identical when $\mathbb{P}(D = 2) = 0$, in which case we proceed with $\min\{k : \mathbb{P}(D = k) > 0\} < \infty$ instead of 2. Recall that $\tilde{n} = n - n_0 - n_1$.
Denote $n^* = \tilde{n} - n_2$. Note that in Algorithm 1, until the time step $n^*$, first the vertices of degree larger than 2 are formed. After this, the vertices of degree 2 are formed during time steps $n^* + 1 \le t \le \tilde{n}$, followed by creating vertices of degree one for $t > \tilde{n}$. It is crucial to observe that for $t(\varepsilon) = \tilde{n} - \varepsilon \ell_n$, the graph $G_n(t(\varepsilon))$ is distributed as a configuration model with the criticality parameter (5.9)

Proof. First note that it is enough to show for some $\varepsilon_n \ge 0$ almost surely. Recall Algorithm 1. Also, observe that if $\mathcal{C}_{(1)}(t(\varepsilon))$ is non-bipartite, then $\mathrm{CM}_n(\boldsymbol{d})$ will also be non-bipartite. Indeed, if $\mathcal{C}_{(1)}(t(\varepsilon))$ is non-bipartite, then it must contain an odd cycle of black nodes, and the process of merging degree-one (red) vertices does not affect these existing cycles. Thus, if there is an odd-length cycle in $\mathcal{C}_{(1)}(t(\varepsilon))$, that cycle will be present in $\mathrm{CM}_n(\boldsymbol{d})$ as well. So assume $\mathcal{C}_{(1)}(t(\varepsilon))$ is bipartite. For $t \ge t(\varepsilon)$ we will now describe an algorithm for partitioning $\mathcal{C}_{(1)}(t)$ into two vertex-disjoint sets $H_1(t)$ and $H_2(t)$ in a coupled way. The sets $H_1(t)$ and $H_2(t)$ are such that if $\mathcal{C}_{(1)}(t)$ is bipartite, then these are the unique partite sets. For $i = 1, 2$, let $H^B_i(t)$ and $H^R_i(t)$ be the sets of black and red vertices in $H_i(t)$, respectively. Also, let $R(t)$ and $R^P(t)$ denote the set of all red vertices and red pairs in $G_n(t)$, respectively (recall that a pair is two degree-one vertices joined with each other). With a slight abuse of notation, we will also write $H^B_i(t)$, $R(t)$, etc. to denote the cardinality of the respective sets.

Algorithm 2. Initially consider the unique bipartition of $\mathcal{C}_{(1)}(t(\varepsilon))$: $\mathcal{C}_{(1)}(t(\varepsilon)) = H_1(t(\varepsilon)) \sqcup H_2(t(\varepsilon))$, say.
It is important to note that both $r(\varepsilon)$ and $r^P(\varepsilon)$ are bounded away from zero as $\varepsilon \to 0$. Thus $\delta_1$ remains positive even when $\varepsilon$ is chosen small enough. Recall from Algorithm 2(ii) that during the time interval $[t(\varepsilon), t^*(\varepsilon)]$, $H^R_2$ increases by at least one if some red vertex in $H^R_1$ is coalesced with one of the vertices outside the set $H^R_1 \cup H^R_2$, in particular, with one belonging to some red pair in $R^P$. Using (5.13), at each time step the probability of the latter event, conditionally on $G_n(t(\varepsilon))$, is at least r(ε)n 8 r P (ε)n for some $c_1 \in (0, 1]$. Denote by $A(t)$ the cumulative number of red vertices thus added to $H^R_2$ up to time $t$, starting from $t(\varepsilon)$. Observe that $A(t)$ stochastically dominates a binomial random variable with $\min\{r(\varepsilon)n/8, \varepsilon \ell_n/2\}$ trials and success probability at least $\delta_1 c_1 (1 + o_P(1))$. Therefore, standard concentration inequalities for the binomial distribution [32, Corollary 2.3] yield

$$\mathbb{P}\big(A(t^*(\varepsilon)) \le \delta_2 n \mid G_n(t(\varepsilon))\big) \le e^{-C_0 n (1 + o_P(1))} \qquad (5.14)$$

for some suitable $\delta_2 > 0$ and a constant $C_0 > 0$. Further notice that some of the red vertices in H R Then observe that for $t \in [t(\varepsilon), \tau(\varepsilon)]$, the quantity $B(t)$ is dominated by a binomial random variable with $\min\{r(\varepsilon)n/8, \varepsilon \ell_n/2\}$ trials and success probability at most $\delta_2 n/2n_1$. The mean of this binomial random variable is of the order at most $\varepsilon \delta_2 n$, which can be made arbitrarily small compared to $\delta_2 n$ by choosing $\varepsilon$ small enough. Therefore, standard concentration inequalities for the binomial distribution again imply that

$$\mathbb{P}\big(B(\tau(\varepsilon)) \ge \delta_2 n/4 \mid G_n(t(\varepsilon)),\ \{H^R_1(t) \le \delta_2 n,\ \forall\, t(\varepsilon) \le t < \tau(\varepsilon)\}\big) \le e^{-C_0 n (1 + o_P(1))} \qquad (5.15)$$

for some constant $C_0 > 0$.
Now set $\delta_1$ as above and δ 2 := δ 2 /2, and observe that $H^R_2(t) = H^R_2(t(\varepsilon)) + A(t) - 2B(t)$. Thus (5.14) and (5.15) yield that either the probability that the following will not occur is exponentially small (5.16) Recall that while forming the degree-2 vertices from $G_n(\tau(\varepsilon))$ to $\mathrm{CM}_n(\boldsymbol{d})$, the bipartiteness is lost if one red vertex is chosen from $H^R_1$ and the other is chosen from $H^R_2$. Therefore, for

Theorem 3.1 (Phase transition of the $k$-section width). Consider $\mathrm{CM}_n(\boldsymbol{d})$ satisfying Assumption 1, and let $k \ge 2$ be an integer. Then $w_k(\mathrm{CM}_n(\boldsymbol{d}))$ exhibits a phase transition around $\eta = 1/k$. More precisely, (i) If $\eta < 1/k$, then with high probability $w_k(\mathrm{CM}_n(\boldsymbol{d})) \le k/2$;

Lemma 4.1. Consider a graph $G$ on $n$ vertices, with $m$ components of sizes c 1 and $V_2$ such that $|V_1|, |V_2| > \varepsilon|V|$, and the number of edges between $V_1$ and $V_2$ is at most $\delta|V|$. Now observe that the following proposition is enough to conclude Theorem 3.1 (ii):

Proposition 4.3. Consider $\mathrm{CM}_n(\boldsymbol{d})$ with $\nu > 1$ and satisfying Assumption 1. For any $\varepsilon > 0$, there exists $\delta = \delta(\varepsilon) > 0$ such that, with high probability, the giant component $\mathcal{C}_{(1)}$ does not have an $(\varepsilon, \delta)$-cut.

Figure 2: (a) The highlighted (red) 2-core and the trees hanging from it.(b) The yellow part highlights the 3-core, which is contained in the 2-core (union of red and yellow parts).

Figure 3: Proof structure and interdependence of different lemmas.

Algorithm 1. Consider a given degree sequence $\boldsymbol{d}$ on vertex set $[n]$. Recall that $\ell_n = \sum_i d_i$ is the sum of the degrees. First, the $n_0$ isolated vertices are assigned their vertex labels. The algorithm below generates the random topology induced by the vertices of degree one or larger.

(S1) Initially there are $\ell_n$ degree-one vertices labeled $v(1), \dots, v(\ell_n)$, each with an attached half-edge. Call these the set of red vertices. Construct a uniform matching of these $\ell_n$ half-edges. Denote the corresponding graph by $G(0)$, and set $V(0) = \emptyset$. Also, take any permutation of the index set $\{i \in [n] : d_i > 1\}$ of all vertices of degree more than one, and denote it by $\{\sigma_1, \sigma_2, \dots, \sigma_{\tilde{n}}\}$, where $\tilde{n} = n - n_0 - n_1$.

(4.27)
To see the general case for $r \ge 1$, note that (4.27) implies $\mathbb{E}[\#\{v \in [n] : \mathrm{LTC}_0(v, L) \setminus \mathrm{TC}_0(v) \text{ occurs}\}]/n \to 0$. Using Lemma 4.6, it now follows that the fraction of vertices which are within the $r$-neighborhood of a vertex $v'$ for which $\mathrm{LTC}_0(v', L) \setminus \mathrm{TC}_0(v')$ occurs converges to zero in $L^1$. Therefore, $\mathbb{P}(\mathrm{LTC}_r(V_n, L) \setminus \mathrm{TC}_r(V_n)) = o(1)$, and the proof is complete.

Lemma 5.1 ([22, Theorem 1.2 (b)]). For subcritical $\mathrm{CM}_n(\boldsymbol{d})$ satisfying Assumption 1, the probability that there exists a component with more than one cycle tends to zero as $n \to \infty$.