An asymmetric container lemma and the structure of graphs with no induced $4$-cycle

The method of hypergraph containers, introduced recently by Balogh, Morris, and Samotij, and independently by Saxton and Thomason, has proved to be an extremely useful tool in the study of various monotone graph properties. In particular, a fairly straightforward application of this technique allows one to locate, for each non-bipartite graph $H$, the threshold at which the distribution of edges in a typical $H$-free graph with a given number of edges undergoes a transition from 'random-like' to 'structured'. On the other hand, for non-monotone hereditary graph properties the standard version of this method does not allow one to establish even the existence of such a threshold. In this paper we introduce a refinement of the container method that takes into account the asymmetry between edges and non-edges in a sparse member of a hereditary graph property. As an application, we determine the approximate structure of a typical graph with $n$ vertices, $m$ edges, and no induced copy of the $4$-cycle, for each function $m = m(n)$ satisfying $n^{4/3} (\log n)^4 \leqslant m \ll n^2$. We show that almost all such graphs $G$ have the following property: the vertex set of $G$ can be partitioned into an 'almost-independent' set (a set with $o(m)$ edges) and an 'almost-clique' (a set inducing a subgraph with density $1-o(1)$). The lower bound on $m$ is optimal up to a polylogarithmic factor, as standard arguments show that if $n \ll m \ll n^{4/3}$, then almost all such graphs are 'random-like'. As a further consequence, we deduce that the random graph $G(n,p)$ conditioned to contain no induced $4$-cycles undergoes phase transitions at $p = n^{-2/3 + o(1)}$ and $p = n^{-1/3 + o(1)}$.


Introduction
Two of the central objects of study in combinatorics are the family of H-free graphs, that is, the collection of graphs that do not contain H as a subgraph, and the family of induced-H-free graphs, that is, graphs without an induced subgraph isomorphic to H. An extremely well-studied problem (see, e.g., [25] and references therein) is to determine the largest number of edges in an H-free graph with a given number of vertices. This line of research dates back to the seminal works of Turán [47] and of Erdős and Stone [23], which are considered to be the cornerstones of the field of extremal graph theory.
Another natural and well-studied problem, which also makes sense in the setting of induced-H-free graphs, can be informally phrased as follows: What does a typical H-free (or induced-H-free) graph look like?
The first to address this problem were Erdős, Kleitman, and Rothschild [22], who proved that almost all triangle-free graphs are bipartite. That is, the proportion of triangle-free graphs on a given set of $n$ vertices that are bipartite tends to one as $n$ tends to infinity. This result was generalised by Kolaitis, Prömel, and Rothschild [34], who showed that for every $r \geqslant 2$, almost all $K_{r+1}$-free graphs are $r$-partite.
1.1. The structure of graphs with no induced 4-cycle. Given a graph $H$ and $n \in \mathbb{N}$, let $\mathcal{F}^{\mathrm{ind}}_n(H)$ denote the family of all graphs with vertex set $\{1, \dots, n\}$ that contain no induced copy of $H$ and let $\mathcal{F}^{\mathrm{ind}}_{n,m}(H)$ denote the family of graphs in $\mathcal{F}^{\mathrm{ind}}_n(H)$ with precisely $m$ edges. A split graph is a graph whose vertex set can be partitioned into a clique and an independent set. It is easy to check that a split graph cannot contain an induced copy of $C_4$; indeed, the property of being a split graph is hereditary and $C_4$ itself is not a split graph. Conversely, as mentioned above, it was proved by Prömel and Steger [38] over 25 years ago that almost all graphs in $\mathcal{F}^{\mathrm{ind}}_n(C_4)$ are split graphs. However, since almost all $n$-vertex split graphs admit a partition into a clique and an independent set of roughly equal sizes and have approximately $n^2/4$ edges, this result says nothing about a typical member of $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$ when $m$ is not approximately $n^2/4$. It is worth mentioning that Gishboliner and Shapira [26] recently described the structure of all induced-$C_4$-free graphs; their description is much coarser, however.
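The claim that split graphs contain no induced $C_4$ can be sanity-checked by brute force on very small vertex sets. The following is a minimal sketch only; the function names (`has_induced_c4`, `is_split`, `every_split_graph_is_c4_free`) are hypothetical helpers introduced for this illustration and do not come from the paper.

```python
from itertools import combinations

def has_induced_c4(n, edges):
    """Return True if the graph on vertices 0..n-1 contains an induced 4-cycle."""
    adj = {frozenset(e) for e in edges}
    for quad in combinations(range(n), 4):
        inside = [p for p in combinations(quad, 2) if frozenset(p) in adj]
        if len(inside) != 4:
            continue
        # Four edges on four vertices form a C4 exactly when every degree is 2.
        if all(sum(v in p for p in inside) == 2 for v in quad):
            return True
    return False

def is_split(n, edges):
    """Brute force: can the vertex set be split into a clique and an independent set?"""
    adj = {frozenset(e) for e in edges}
    for size in range(n + 1):
        for clique in combinations(range(n), size):
            rest = [v for v in range(n) if v not in clique]
            clique_ok = all(frozenset(p) in adj for p in combinations(clique, 2))
            rest_ok = not any(frozenset(p) in adj for p in combinations(rest, 2))
            if clique_ok and rest_ok:
                return True
    return False

def every_split_graph_is_c4_free(n):
    """Check all labelled graphs on n vertices (feasible only for tiny n)."""
    pairs = list(combinations(range(n), 2))
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if (mask >> i) & 1]
        if is_split(n, edges) and has_induced_c4(n, edges):
            return False
    return True
```

For example, the 5-cycle is induced-$C_4$-free but is not a split graph, so the two families differ for individual graphs even though they almost coincide asymptotically.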
We will prove that if $n^{4/3} (\log n)^4 \leqslant m \ll n^2$, then a typical member of $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$ is 'almost' a split graph. We will write a.a.s. (asymptotically almost surely) as an abbreviation of "with probability tending to $1$ as $n \to \infty$" and say that a graph $G$ with $n$ vertices and $p\binom{n}{2}$ edges is $\varepsilon$-quasirandom if every subset of more than $\varepsilon n$ vertices of $G$ induces a subgraph with density between $(1-\varepsilon)p$ and $(1+\varepsilon)p$. We will say that a graph $G$ is $\varepsilon$-close to a split graph if there exists a partition $V(G) = A \cup B$ such that $e_G(A) \geqslant (1-\varepsilon)\binom{|A|}{2}$ and $e_G(B) \leqslant \varepsilon e(G)$. Our first main result is the following structural description of a typical graph in $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$.
Theorem 1.2. For every $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds. Let $G$ be a uniformly chosen random graph in $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$. (a) If $n \ll m \leqslant \delta n^{4/3}$, then a.a.s. $G$ is $\varepsilon$-quasirandom.
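To make the notion of being $\varepsilon$-close to a split graph concrete, here is a small sketch that tests the two inequalities $e_G(A) \geqslant (1-\varepsilon)\binom{|A|}{2}$ and $e_G(B) \leqslant \varepsilon e(G)$ for one given partition. The definition asks for the existence of such a partition; this hypothetical helper only verifies a candidate supplied by the caller.

```python
from math import comb

def is_eps_close_witness(edges, A, B, eps):
    """Check whether the partition (A, B) witnesses eps-closeness to a split graph.

    edges: list of pairs (u, v); A, B: disjoint vertex sets covering the graph.
    """
    A, B = set(A), set(B)
    e_A = sum(1 for u, v in edges if u in A and v in A)  # edges inside the almost-clique
    e_B = sum(1 for u, v in edges if u in B and v in B)  # edges inside the almost-independent set
    return e_A >= (1 - eps) * comb(len(A), 2) and e_B <= eps * len(edges)
```

In the test graph used below, $A$ spans five of the six possible edges and $B$ spans one of the ten edges of the graph, so the partition is a witness for $\varepsilon = 0.2$ but not for $\varepsilon = 0.05$.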
The following result is a relatively straightforward consequence of Theorem 1.2. It determines the number of edges in (and therefore, by Theorem 1.2, the typical structure of) the random graph $G(n,p)$ conditioned on not containing an induced copy of $C_4$. We write $G^{\mathrm{ind}}_{n,p}(C_4)$ to denote the random graph chosen according to this conditional distribution. Note that it follows immediately from Theorem 1.2 that $G^{\mathrm{ind}}_{n,p}(C_4)$ is a.a.s. $\varepsilon$-quasirandom if $n^{-1} \ll p \ll n^{-2/3}$ and a.a.s. $\varepsilon$-close to a split graph if $p \geqslant n^{-1/3} (\log n)^4$. We remark that we have not attempted to optimize the exponents of $\log n$, since (we believe that) our technique cannot give the correct power.
We would like to draw the reader's attention to the (somewhat surprising) fact that in the middle range $n^{-2/3+o(1)} \leqslant p \leqslant n^{-1/3+o(1)}$, the typical value of $e\big(G^{\mathrm{ind}}_{n,p}(C_4)\big)$ stays essentially constant. This is because the proportion of $n$-vertex graphs with $m$ edges that are induced-$C_4$-free drops very sharply from $e^{-o(m)}$ to $e^{-\Omega(m \log n)}$ as $m$ crosses a very narrow interval around $n^{4/3}$, as shown by Theorem 1.2. A similar phenomenon has been observed in several random Turán problems for forbidden bipartite graphs (even cycles [32,36] and complete bipartite graphs [36]) as well as Turán-type problems in additive combinatorics [15,16]. It would be very interesting to determine whether a similar 'long flat segment' appears in the graphs of $p \mapsto e\big(G^{\mathrm{ind}}_{n,p}(H)\big)$ and $p \mapsto \mathrm{ex}\big(G(n,p), H\big)$ for every bipartite $H$.
Our proof of Theorem 1.2 relies on two new results: (i) an asymmetric container lemma, which generalises the main results of [10,44], and (ii) a new robust stability theorem for induced copies of C 4 in 'pregraphs' (see below). We discuss these two ingredients in the remainder of this section.
1.2. The asymmetric container lemma. The hypergraph container theorems, proved independently by Balogh, Morris, and Samotij [10] and by Saxton and Thomason [44], state (roughly speaking) that the family of independent sets of a uniform hypergraph whose edges are distributed somewhat evenly can be covered with a small number of sets, called containers, each of which is 'almost independent' in the sense that it contains only few edges of the hypergraph. This fact has proved to be a very convenient and useful tool in the study of the families of H-free graphs, as well as other monotone properties of graphs, hypergraphs, sets of integers, etc. There are several reasons for this. First, there is a natural correspondence between H-free graphs with a given number n of vertices and independent sets in the e H -uniform hypergraph H whose vertex set is E(K n ), the edge set of the complete graph with n vertices, and whose edges are the edge sets of all copies of H found in K n . Second, classical results in extremal graph theory provide very precise and explicit descriptions of graphs with few copies of H, which correspond to the containers for independent sets of H. Third, the bounds for the number of containers given by [10,44] are essentially optimal, which allows one to deduce many best-possible estimates on the number of H-free graphs with given numbers of vertices and edges and describe their typical structure.
The container theorems can also be used to enumerate graphs with no induced copy of $H$. In fact, this was already done by Saxton and Thomason in their original paper [44], where they obtained (implicitly) upper bounds on $|\mathcal{F}^{\mathrm{ind}}_n(H)|$ for all $H$. One way to phrase this problem in the language of independent sets is to consider the hypergraph $\mathcal{H}$ whose vertex set is $E(K_n) \times \{0,1\}$ and whose edges are (i) all sets of the form $\{(e,1) : e \in E\} \cup \{(e,0) : e \in \binom{W}{2} \setminus E\}$, where $W$ ranges over all $v_H$-element sets of vertices of $K_n$ and $E$ is the subset of $\binom{W}{2}$ covered by $E(H)$ in one of the $v_H!/|\mathrm{Aut}(H)|$ non-isomorphic embeddings of $H$ into $W$, and (ii) all the $\binom{n}{2}$ pairs $\{(e,0), (e,1)\}$, where $e$ ranges over all edges of $K_n$. One can see that $n$-vertex graphs with no induced copy of $H$ are in a natural one-to-one correspondence with the independent sets of $\mathcal{H}$ with $\binom{n}{2}$ elements. Even though the container theorems may be applied only to uniform hypergraphs, since one is usually interested in upper bounds, one may disregard the 2-uniform edges of type (ii) and construct containers for independent sets of the resulting smaller $\binom{v_H}{2}$-uniform hypergraph, which clearly include all independent sets of the original hypergraph.
One soon realises that the above approach is somewhat flawed when one is interested in the family $\mathcal{F}^{\mathrm{ind}}_{n,m}(H)$ whenever $m$ is either very small or very close to $\binom{n}{2}$ and $H$ is neither complete nor empty. This is because the original container theorems completely disregard the obvious asymmetry between the edges and the non-edges of $H$ in each of the $\binom{v_H}{2}$-uniform edges of $\mathcal{H}$. As a result, one cannot expect to deduce optimal bounds on $|\mathcal{F}^{\mathrm{ind}}_{n,m}(H)|$ for all $m$ using this approach. Our main motivation for this work is to address this issue.
Departing somewhat from the language of independent sets, we shall regard a graph $G \subseteq K_n$ as the characteristic function $h_G \colon E(K_n) \to \{0,1\}$ of its edge set; that is, $h_G(e) = 1$ if $e \in E(G)$ and $h_G(e) = 0$ otherwise. The family $\mathcal{F}^{\mathrm{ind}}_n(H)$, viewed as a set of functions $h \colon E(K_n) \to \{0,1\}$, may be described by a set of constraints of the following form: a function $h \in \mathcal{F}^{\mathrm{ind}}_n(H)$ cannot simultaneously map all elements of $E$ to $1$ and all elements of $\binom{W}{2} \setminus E$ to $0$, for any $W \subset V(K_n)$ with $|W| = v_H$ and any $E$ that is the edge set of an embedding of $H$ into $W$.
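For $H = C_4$ these constraints are easy to list explicitly: each $4$-set $W$ carries $4!/|\mathrm{Aut}(C_4)| = 3$ copies of $C_4$, one for each choice of the two 'diagonals'. The sketch below (with hypothetical helper names `c4_constraints`, `violates`, and `char_function`) writes each constraint as a pair $(A_0, A_1)$, where $A_1$ is the edge set of the cycle and $A_0$ consists of the two diagonals, and checks that the characteristic function of a graph violates some constraint exactly when the graph has an induced $C_4$ on that vertex set.

```python
from itertools import combinations

def c4_constraints(n):
    """All pairs (A0, A1): A1 = edges of a 4-cycle on a 4-set W, A0 = its two diagonals."""
    constraints = []
    for a, b, c, d in combinations(range(n), 4):
        W2 = {frozenset(p) for p in combinations((a, b, c, d), 2)}
        # Removing a perfect matching (the diagonals) from K4 leaves a 4-cycle.
        for diag in (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))):
            A0 = frozenset(frozenset(p) for p in diag)
            A1 = frozenset(W2 - A0)
            constraints.append((A0, A1))
    return constraints

def violates(h, constraint):
    """h violates (A0, A1) if it maps all of A0 to 0 and all of A1 to 1."""
    A0, A1 = constraint
    return all(h[e] == 0 for e in A0) and all(h[e] == 1 for e in A1)

def char_function(n, edges):
    """Characteristic function h_G of the edge set, defined on all pairs of K_n."""
    E = {frozenset(e) for e in edges}
    return {frozenset(p): int(frozenset(p) in E) for p in combinations(range(n), 2)}
```

With this encoding, the $4$-cycle on four vertices violates exactly one of the three constraints on its vertex set, while $K_4$ and the $5$-cycle violate none.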
There is nothing special here about the family $\mathcal{F}^{\mathrm{ind}}_n(H)$ or the set $E(K_n)$. Therefore, for the remainder of this discussion, we shall replace $E(K_n)$ with an arbitrary finite set $V$, let $\mathcal{H}$ be an arbitrary family of pairs of disjoint subsets of $V$, and let $\mathcal{F}(\mathcal{H})$ be the family obtained from $\{0,1\}^V$ by discarding all $h \colon V \to \{0,1\}$ that map each element of $A_0$ to $0$ and each element of $A_1$ to $1$ for some pair $(A_0, A_1) \in \mathcal{H}$. We shall informally refer to these pairs of sets as constraints and say that $h$ violates (resp. satisfies) a constraint $(A_0, A_1)$ if $h$ maps (resp. does not map) each element of $A_0$ to $0$ and each element of $A_1$ to $1$. Finally, let us note here for future reference that according to the above definition, $\mathcal{F}(\mathcal{H})$ is empty whenever $\mathcal{H}$ contains the pair $(\emptyset, \emptyset)$; in other words, every function violates the 'empty' constraint $(\emptyset, \emptyset)$.
The container theorems imply that if such a family $\mathcal{H}$ contains only pairs $(A_0, A_1)$ with a given value of $|A_0| + |A_1|$ and the sets $A_0 \cup A_1$ are distributed somewhat uniformly, then there is a small family $\mathcal{C}$ of partitions $V = V_0 \cup V_1 \cup V_*$ such that the corresponding cylinders $\{0\}^{V_0} \times \{1\}^{V_1} \times \{0,1\}^{V_*}$ cover $\mathcal{F}(\mathcal{H})$ and, importantly, every function in each of these cylinders violates only few constraints in $\mathcal{H}$. In particular, one does not allow a trivial covering of $\mathcal{F}(\mathcal{H})$ with $\{0,1\}^V$, which corresponds to $V_* = V$. Roughly speaking, we might say that $\mathcal{F}(\mathcal{H})$ may be 'tightly' covered by a small family of cylinders.
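In this work, we take a refined approach to this covering problem. Unlike in previous works, we shall build families of containers that are tailored to the subfamily of all $h \in \mathcal{F}(\mathcal{H})$ that attain the values $0$ and $1$ given numbers of times. More precisely, for each integer $m$ with $0 \leqslant m \leqslant |V|$, we shall consider the subfamily $\mathcal{F}_m(\mathcal{H}) \subseteq \mathcal{F}(\mathcal{H})$ of all functions $h \in \mathcal{F}(\mathcal{H})$ that take the value $1$ at most $m$ times, that is, $|h^{-1}(1)| \leqslant m$, and build a family of containers for the elements of $\mathcal{F}_m(\mathcal{H})$ only.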
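On a tiny ground set the families $\mathcal{F}(\mathcal{H})$ and $\mathcal{F}_m(\mathcal{H})$ can be computed by exhaustive search. The sketch below is only an illustration; the names `family` and `family_m` are hypothetical, and it assumes that $\mathcal{F}_m(\mathcal{H})$ consists of the functions in $\mathcal{F}(\mathcal{H})$ that take the value $1$ at most $m$ times, which is how the later sections use the restriction.

```python
from itertools import product

def family(V, constraints):
    """All h: V -> {0,1} that violate no constraint (A0, A1) in the list."""
    V = list(V)
    result = []
    for values in product((0, 1), repeat=len(V)):
        h = dict(zip(V, values))
        violated = any(
            all(h[v] == 0 for v in A0) and all(h[v] == 1 for v in A1)
            for A0, A1 in constraints
        )
        if not violated:
            result.append(h)
    return result

def family_m(V, constraints, m):
    """Members of family(V, constraints) taking the value 1 at most m times (assumed reading)."""
    return [h for h in family(V, constraints) if sum(h.values()) <= m]
```

Note that `all` over an empty set is true, so the empty constraint `(set(), set())` is violated by every function and `family` returns an empty list, matching the remark about $(\emptyset, \emptyset)$.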
We shall focus our attention on families $\mathcal{F}(\mathcal{H})$ determined by collections $\mathcal{H}$ of constraints that are uniform in the sense that each $(A_0, A_1) \in \mathcal{H}$ satisfies $|A_0| = k_0$ and $|A_1| = k_1$ for some fixed integers $k_0$ and $k_1$. We shall refer to such collections $\mathcal{H}$ as $(k_0, k_1)$-uniform hypergraphs. In standard applications of the container method this should not be a huge restriction, provided that we are only interested in constraints of bounded size, that is, pairs $(A_0, A_1)$ where $|A_0| + |A_1|$ is bounded from above by a constant. Indeed, given a non-uniform family of constraints of bounded size, we may restrict our attention to the 'densest' $(k_0, k_1)$-uniform hypergraph that is contained in the family, losing only some constant factors. In fact, this is precisely what we are going to do in our proof of Theorem 1.2.
Yet another rephrasing of condition (a) is that whenever $g(h) = S$, then $h$ is forced to take the value $0$ on $f(S)^{-1}(0)$ and it is forced to take the value $1$ on $f(S)^{-1}(1)$. Note the asymmetry between the guaranteed lower bounds on the cardinalities of the sets $f(S)^{-1}(0)$ and $f(S)^{-1}(1)$ in (b). Roughly speaking, we are equally satisfied with (i) containers forcing our function $h$ to take the value $0$ on a positive proportion of $V(\mathcal{H})$ and (ii) containers forcing our function to take the value $1$ only on some $\delta r$ elements of $V(\mathcal{H})$. Condition (c) states that for every $h \in \mathcal{F}_m(\mathcal{H})$, the value of $g(h)$ is 'consistent' with $h$. This additional property of the function $g$ will not be used in our application of the theorem to enumerating $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$. However, we state it here as the analogous property in the original container theorems was crucial in avoiding superfluous logarithmic factors in many applications of the container method. Finally, let us point out here that we shall be allowing all of our hypergraphs to contain edges with multiplicities greater than one. In particular, both $e(\cdot)$ and $\deg_{\mathcal{H}}(\cdot, \cdot)$ count edges with their multiplicities.
A reader who is familiar with the container method might notice that by setting $r = m = v(\mathcal{H})$ in Theorem 1.4, one recovers the statement of the original container theorem [10, Proposition 3.1] in the somewhat more general context of $(k_0, k_1)$-uniform hypergraphs. To illustrate the 'asymmetry' in Theorem 1.4, we need to assume that $m \ll v(\mathcal{H})$. For brevity, let $N = v(\mathcal{H})$ and consider two cylinders, described by the following two partitions $V(\mathcal{H}) = V_0 \cup V_1 \cup V_*$: (i) $|V_0| = \delta N$ and $V_1 = \emptyset$; (ii) $V_0 = \emptyset$ and $|V_1| = \delta r$. Observe that the cylinder described in (i) contains at most $\binom{(1-\delta)N}{\leqslant m}$ functions from $\mathcal{F}_m(\mathcal{H})$, whereas the cylinder described in (ii) contains at most $\binom{N - \delta r}{\leqslant m - \delta r}$ functions from $\mathcal{F}_m(\mathcal{H})$, where $\binom{a}{\leqslant b}$ denotes $\sum_{j \leqslant b} \binom{a}{j}$. Assume that $r \ll m \ll N$.
To obtain Theorem 1.5, we simply apply Theorem 1.4 to the $(0, k)$-uniform hypergraph with the same vertex set as $\mathcal{H}$ whose edges are all pairs $(\emptyset, A)$ such that $A$ is an edge of $\mathcal{H}$. We shall spell out a few more details at the end of Section 2.
1.3. Robust balanced stability for induced $C_4$s. In order to determine the structure of a typical graph in $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$ using the container method, we ought to characterise all containers whose volume is (close to) the largest possible. Our containers for $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$ will be cylinders in $\{0,1\}^{E(K_n)}$ that correspond to partitions $E(K_n) = E_0 \cup E_1 \cup E_*$ with the following property: there are only few $4$-vertex subsets $\{v_1, v_2, v_3, v_4\}$ that induce a copy of $C_4$ in some graph described by the partition $E(K_n) = E_0 \cup E_1 \cup E_*$. Since we are interested only in graphs with exactly $m$ edges, the volume of a container is simply the number of graphs with $m$ edges that this cylinder contains, that is, $\binom{|E_*|}{m - |E_1|}$. The precise statements of our results are rather technical, but roughly speaking we show that each container whose volume is close to largest possible has the following structure: the graph $E_1$ contains an 'almost-complete' graph with vertex set $W$, and most edges in $E_*$ have an endpoint in $W$.
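The volume formula $\binom{|E_*|}{m - |E_1|}$ also gives a quick numerical feel for the asymmetry between forcing zeros and forcing ones. The sketch below uses hypothetical helper names and counts functions with exactly $m$ ones, as in the definition of volume above; the parameter values in the comparison are arbitrary illustrative choices, not values from the paper.

```python
from itertools import combinations
from math import comb

def volume(num_free, num_forced_ones, m):
    """Number of 0/1 functions with exactly m ones in a cylinder with
    num_free undetermined coordinates and num_forced_ones coordinates fixed to 1."""
    k = m - num_forced_ones
    return comb(num_free, k) if 0 <= k <= num_free else 0

def volume_brute_force(E1, E_star, m):
    """Same count by enumerating which free coordinates are set to 1."""
    E_star = list(E_star)
    count = 0
    for k in range(len(E_star) + 1):
        for extra in combinations(E_star, k):
            if len(E1) + len(extra) == m:
                count += 1
    return count

# Illustrative comparison with N = 1000, m = 100, delta = 0.1, r = 10 (so delta * r = 1).
N, m = 1000, 100
total = comb(N, m)
vol_zeros = volume(900, 0, m)   # delta * N coordinates forced to 0
vol_ones = volume(999, 1, m)    # delta * r = 1 coordinate forced to 1
```

Forcing a single coordinate to $1$ already shrinks the count by the factor $m/N$, whereas forcing $\delta N$ coordinates to $0$ shrinks it by a factor of at most $(1-\delta)^m$; this is why far fewer forced ones than forced zeros are needed when $m \ll N$.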
To avoid excessive use of indices, we shall view partitions of $E(K_n)$ of the above type as partial two-colourings of the edges of $K_n$ that we shall call pregraphs. More precisely, by a pregraph $P$ of $E(K_n)$ we will mean a pair $(M, E)$ of disjoint subsets of $E(K_n)$. We shall refer to the elements of the set $E$ as edges and the elements of the set $M$ as mixed edges. A good copy of $C_4$ in $P$ is a copy of $C_4$ in $M$ whose vertex set is independent in $E$. Note that each good copy of $C_4$ corresponds to a set $\{v_1, v_2, v_3, v_4\}$ described in the previous paragraph (but not vice-versa). This means, in particular, that the pregraph corresponding to each container contains only few good copies of $C_4$. We shall therefore restrict our attention to characterising pregraphs with few good copies of $C_4$. As we will later see, a sufficiently precise and useful characterisation of containers can be derived from a robust stability theorem for pregraphs, which we state here in an abbreviated form; for the full statement, we refer the reader to Section 3. We will say that a graph $G$ is $\varepsilon$-close to $K_\ell$ if one can transform $G$ into $K_\ell$ by adding or deleting at most $\varepsilon\binom{\ell}{2}$ edges.
Theorem 1.6. For every $\varepsilon > 0$ there exist positive constants $C$, $\delta$, and $\beta$ such that the following holds for all integers $\ell$ and $n$ with $\ell \geqslant C\sqrt{n}$. Let $P = (M, E)$ be a pregraph on $n$ vertices with Then either $E$ is $\varepsilon$-close to $K_\ell$ or $P$ contains at least $\beta\ell^4$ good copies of $C_4$.
Observe that Theorem 1.6 provides a structural characterisation of all those pregraphs $(M, E)$ on $n$ vertices with |E|
1.4. Organisation of the paper. The rest of the paper is organised as follows. In Section 2 we prove the asymmetric container lemma, in Section 3 we prove Theorem 1.6, in Section 4 we prove the lower bounds in Theorem 1.2, and in Section 5 we complete the proof of Theorem 1.
This last property is a simple consequence of the fact that the algorithmic construction of $f^*$ can be encoded as a sequence of decisions that naturally correspond to a pair of subsets of $V(\mathcal{H})$ containing at most $k_0 b$ and $k_1 b$ elements, respectively. In particular, we shall obtain an implicit decomposition $f^* = f \circ g$ promised in Theorem 1.4.
The function $f^*$ is constructed by an algorithm that operates in a sequence of at most $k_0 + k_1 - 1$ rounds. At the beginning of each round, we are given an $(i_0, i_1)$-uniform hypergraph $\mathcal{G}$ with the same vertex set as $\mathcal{H}$ and such that $h \in \mathcal{F}(\mathcal{G})$; at the beginning of the first round, $(i_0, i_1) = (k_0, k_1)$ and $\mathcal{G} = \mathcal{H}$. By the end of the round, we will have either (i) defined a function $f^*_h \colon V(\mathcal{H}) \to \{0, 1, *\}$ satisfying both (a) and (b) above, or (ii) constructed an $(i_0', i_1')$-uniform hypergraph $\mathcal{G}_*$ with $V(\mathcal{G}_*) = V(\mathcal{H})$ and such that $h \in \mathcal{F}(\mathcal{G}_*)$, whose maximum degrees satisfy conditions akin to the conditions on the maximum degrees of $\mathcal{H}$ given by (1). This is achieved in the following way.
We start with $\mathcal{G}_*$ empty and $f^*_h \equiv *$. We set $c = 1$ if $i_1 > 0$ and $c = 0$ otherwise, so that $i_c' = i_c - 1$. Our algorithm considers a sequence of questions of the form "Is $h(v) = c$?" for some carefully chosen (sequence of) vertices $v \in V(\mathcal{H})$. If the answer is YES, then we set $f^*_h(v) = c$ and, more importantly, we add new $(i_0', i_1')$-uniform constraints to $\mathcal{G}_*$ in the following way. As $h(v) = c$, if $h$ satisfies a constraint $(A_0, A_1)$ with $v \in A_c$, then it also satisfies the constraint $(A_0', A_1')$ obtained from $(A_0, A_1)$ by removing $v$ from $A_c$. In view of this, for each $(A_0, A_1) \in \mathcal{G}$ with $v \in A_c$, we add to $\mathcal{G}_*$ the corresponding $(A_0', A_1')$. If the answer is NO, then we only set $f^*_h(v) = 1 - c$. (We thus choose to ignore all the constraints $(A_0, A_1) \in \mathcal{G}$ such that $v \in A_{1-c}$.) The round ends when either the number of YES answers reaches $b$ or no constraints remain that involve only the vertices that we have not yet asked about. Our assumptions on the maximum degrees of the hypergraph $\mathcal{G}$ imply that in the latter case, the number of NO answers will be sufficiently large to deduce that $|(f^*_h)^{-1}(1-c)|$ is sufficiently large (that is, at least $\delta v(\mathcal{H})$ if $c = 1$ and at least $\delta r$ if $c = 0$). If this does not happen (and hence the number of YES answers reaches $b$), then we shall be able to show that the hypergraph $\mathcal{G}_*$, which we have created based on the YES answers, contains a subhypergraph with sufficiently many edges, whose maximum degrees satisfy the required conditions. In this case, we let $\mathcal{G} \leftarrow \mathcal{G}_*$ and $(i_0, i_1) \leftarrow (i_0', i_1')$ and proceed to the next round.
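The informal description of a round can be turned into a toy simulation. The sketch below is a deliberately simplified version and not the algorithm analysed in Section 2: it always asks about a vertex lying in the largest number of remaining sets $A_c$ (ties broken by label), and it omits the degree-based bookkeeping that the real algorithm uses to decide which constraints to discard. The function name `run_round` and the data representation (constraints as pairs of frozensets, $h$ as a dictionary) are assumptions made for this illustration.

```python
from collections import Counter

def run_round(constraints, h, c, b):
    """One simplified round of the question-asking procedure.

    constraints: list of (A0, A1) pairs of frozensets, each with non-empty A_c,
                 all satisfied by h; c: the value asked about; b: cap on YES answers.
    Returns (partial, yes_vertices, new_constraints).
    """
    active = list(constraints)
    partial = {}           # values of h revealed so far
    yes_vertices = []      # vertices whose answer was YES, in order
    new_constraints = []   # the smaller constraints collected from YES answers
    while active and len(yes_vertices) < b:
        degree = Counter(v for pair in active for v in pair[c])
        v = min(degree, key=lambda u: (-degree[u], u))
        if h[v] == c:      # YES: record v and shrink every constraint with v in A_c
            partial[v] = c
            yes_vertices.append(v)
            for A0, A1 in active:
                if v in (A0, A1)[c]:
                    new_constraints.append((A0 - {v}, A1) if c == 0 else (A0, A1 - {v}))
        else:              # NO: only record the value 1 - c
            partial[v] = 1 - c
        # Each vertex is asked about once: drop all constraints involving v.
        active = [(A0, A1) for A0, A1 in active if v not in A0 and v not in A1]
    return partial, yes_vertices, new_constraints
```

Two features of the real algorithm are visible even in this simplified form: the input function $h$ satisfies every new constraint, and two functions that produce the same set of YES vertices produce the same output.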
Since, as noted before, no function satisfies the empty constraint (∅, ∅), it follows that in the round when i 0 + i 1 = 1, no YES answers can be given. (Otherwise, a non-empty (0, 0)-uniform hypergraph G * with h ∈ F (G * ) would be constructed.) In particular, the function f * h will have to be defined in this round, provided that the algorithm reaches it. Even though the sequence of values of c that we choose (i.e., we let c = 1 as long as i 1 is not yet zero) may seem somewhat arbitrary, it has a very important consequence.
e(G)/m. Indeed, the set h −1 (1) has at most m elements and it has to intersect A 0 for each (A 0 , ∅) ∈ G. Note that if m ≪ v(H), then e(G)/m is much larger than the average degree of G. This simple observation is the reason why restricting to the family F m (H) allows us to create a smaller family of containers.
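The pigeonhole step behind this observation can be checked on a toy instance. In the sketch below (hypothetical helper name `degree_vs_pigeonhole`), the list `sets` plays the role of the sets $A_0$ of constraints $(A_0, \emptyset)$, and `ones` plays the role of $h^{-1}(1)$: since `ones` has at most $m$ elements and meets every set, some element of `ones` must lie in at least `len(sets) / m` of the sets.

```python
from collections import Counter

def degree_vs_pigeonhole(sets, ones, m):
    """Return (maximum vertex degree, e/m) for a family of sets hit by `ones`.

    Raises ValueError if `ones` has more than m elements or misses some set,
    since the pigeonhole bound is only claimed under those hypotheses.
    """
    if len(ones) > m or not all(ones & A0 for A0 in sets):
        raise ValueError("ones must have at most m elements and meet every set")
    degree = Counter(v for A0 in sets for v in A0)
    return max(degree.values()), len(sets) / m
```

For four sets hit by a two-element set, the bound guarantees a vertex of degree at least $2$; in the test instance the maximum degree is $3$.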
Finally, since each of the questions asked by the algorithm is a YES/NO question, we may encode the execution of the algorithm, and thus also the function f * , as a set of at most (k 0 + k 1 − 1) · b vertices for which the answer was YES.
We conclude this outline with an important technical remark. Throughout this section we allow all of our hypergraphs to contain edges with multiplicities greater than one. Moreover, when computing various degrees deg(·, ·) or cardinalities e(·) of the edge sets of various hypergraphs, we shall always count edges with multiplicities. As first discovered by Saxton and Thomason in [44] and later reiterated in [10], this seemingly insignificant detail has far-reaching consequences in both the statement and the proof of the container theorems.

2.2. Setup. Let k 0 and k 1 be nonnegative integers and let K be a positive real. Let b, m, and r be positive integers and suppose that H is a (k 0 , k 1 )-uniform hypergraph satisfying (1) for every pair (ℓ 0 , ℓ 1 ) as in the statement of Theorem 1.4. We claim that without loss of generality we may assume that b ≤ m ≤ v(H). Indeed, if m > v(H), then we may replace m with v(H) as F m (H) ⊆ F (H) = F v(H) (H) and the right-hand side of (1) is a non-increasing function of m. ) and the assumed upper bounds on the maximum degrees of H remain true even after we replace b with v(H). Indeed, if ℓ 0 > 0, then for every ℓ 1 ∈ {0, . . . , k 1 }, as v(H) m, and if ℓ 0 = 0, then for every ℓ 1 ∈ {1, . . . , k 1 }, (1) does not depend on m, and if ℓ 0 > 0, then for every ℓ 1 ∈ {0, . . . , k 1 }.
We now define a collection of numbers that will be upper bounds on the maximum degrees of the hypergraphs constructed by our algorithm. To be more precise, for each (i 0 , i 1 ) ∈ U and all (ℓ 0 , ℓ 1 ), we shall force the maximum (ℓ 0 , ℓ 1 )-degree of the (i 0 , i 1 )uniform hypergraph not to exceed the quantity ∆ . .
The above recursive definition will be convenient in some parts of our analysis. In other parts, we shall require the following explicit formula for ∆ For future reference, we note the following two simple corollaries of Observation 2.2 and our assumptions on the maximum degrees of H, see (1). Suppose that (i 0 , i 1 ) ∈ U. If i 1 > 0, then necessarily i 0 = k 0 and hence, where the maximum is over all pairs and an (i 0 , i 1 )-uniform hypergraph G, we define . By the definition of U, it follows that 1 is compatible with (i 0 , i 1 ) ∈ U if and only if i 1 > 0.
2.3. The algorithm. We shall now define precisely a single round of the algorithm that we described informally in Section 2.1. To this end, fix some (i 0 , i 1 ) ∈ U and a compatible c ∈ {0, 1} and (as in the definition of a compatible c) set Suppose that G is an (i 0 , i 1 )-uniform hypergraph with V (G) = V (H). A single round of the algorithm takes as input an arbitrary h ∈ F (G) and outputs an (
The algorithm. Set A (0) := G, let S be the empty set, and let G . Do the following for each integer j ≥ 0 in turn: (S4) Let A (j+1) be the hypergraph obtained from A (j) by removing from it all pairs (A 0 , A 1 ) such that either of the following hold: Observe that the algorithm always stops after at most v(G) iterations of the main loop. Indeed, since all constraints
2.4. The analysis. We shall now establish some basic properties of the algorithm described in the previous subsection. To this end, let us fix some (i 0 , i 1 ) ∈ U and a compatible c ∈ {0, 1} and let i ′ 0 and i ′ 1 be the numbers defined in (5). Moreover, suppose that G is an (i 0 , i 1 )-uniform hypergraph and that we have run the algorithm with input h ∈ F (G) and obtained the (i ′ 0 , i ′ 1 )-uniform hypergraph G * , the integer J, the injective map {0, . . . , J − 1} ∋ j → v j ∈ V (G), and the partition of {0, . . . , J − 1} into S and W such that h(v j ) = c if and only if j ∈ S. We first state two straightforward, but fundamental, properties of the algorithm.
Proof. Observe that G * contains only constraints of the form: The next observation says that if the algorithm applied to two functions h and h ′ outputs the same set {v j : j ∈ S}, then the rest of the output is also the same.
The only step of the algorithm that depends on the input function h is (S3). There, an index j is added to the set S if and only if h(v j ) = c. Therefore, the execution of the algorithm depends solely on the set {v j : j ∈ S}.
The next two lemmas will allow us to maintain suitable upper and lower bounds on the degrees and densities of the hypergraphs obtained by applying the algorithm iteratively. The first lemma, which is the easier of the two, states that if all the maximum degrees of G are appropriately bounded, then all the maximum degrees of G * are also appropriately bounded.
and note that j 0, since G (0) * is empty. We claim first that , and therefore the algorithm removes from We next claim that To see this, recall that when we extend G where the last inequality is by our assumption, as claimed. Combining (6) and (7), it follows immediately that , where the final inequality holds by Definition 2.1. This contradicts our choice of (T ′ 0 , T ′ 1 ) and therefore the lemma follows.
We are now ready for our final lemma, which is really the heart of the matter. We will show that if G has sufficiently many edges and all of the maximum degrees of G are appropriately bounded, then either the output hypergraph G * has sufficiently many edges or the value of h(v) will be determined for sufficiently many vertices v. We remark that here we shall use the assumption that h takes the value 1 at most m times.
. . , i 1 }, then at least one of the following statements is true: Proof. Suppose first that c = 0 and observe that 7 To bound the right-hand side of (8), we count the edges removed from A (j) in (a) and (b) of step (S4), which gives Summing over j ∈ {0, . . . , J − 1}, it follows (using (8)) that Observe also that if c = 1, then we obtain an identical bound, with ∆ (1,0) (G) replaced by ∆ (0,1) (G).
In order to discuss both cases simultaneously, we set χ(0) = (1, 0) and since A ⊆ A (j) ⊆ G and G satisfies (A2). It follows that, for both c ∈ {0, 1}, 7 Recall that G * (and G (j) * etc.) are multi-hypergraphs and that edges are counted with multiplicity. Now, recall that v j is the c-maximum vertex of A (j) and observe that therefore, by (8) and (9), where the equality is due to the fact that |S| = b only when A is empty, see step (S1). Next, to bound the sum in (10), observe that, by Definition 2.3, we have for each (ℓ 0 , ℓ 1 ) and therefore We claim that ∆ . We split the remainder of the proof into two cases, depending on the value of c.
Suppose first that c = 1 and observe that substituting (12) into (10) yields, using the bound ∆ Moreover, by (11), and since i 1 1 when c = 1, we have since the maximum degree of a hypergraph is at least as large as its average degree. Combining (13) and (14), we obtain since b v(H). Now, if the first summand on the right-hand side of (15) exceeds e(G)/2, then (A1) implies (P1), since (i ′ 0 , i ′ 1 ) = (i 0 , i 1 − 1). Otherwise, the second summand is at least e(G)/2 and by (A1) and (3), which is (P2). The case c = 0 is slightly more delicate; in particular, we will finally use our assumption that |h −1 (1)| m. Observe first that if c = 0, then substituting (12) into (10) yields, using the bound ∆ cf. (13). We claim that The first inequality follows from (11), so we only need to prove the second inequality. To do so, observe that G is an (i 0 , 0)-uniform hypergraph (since c = 0) and therefore each function in F (G) must take the value 1 on at least one element of each set A 0 such that (A 0 , ∅) ∈ G. Now, recall that h ∈ F (G), that A ⊆ G, and that h takes the value 1 at most m times. It follows that e(A) m · ∆ (1,0) (A), as claimed.
and STOP.
We will show that the above procedure indeed constructs containers for F m (H) that have the desired properties. To this end, we first claim that for each pair (i 0 , i 1 ) ∈ U ∪ {(0, 0)}, the hypergraph H (i 0 ,i 1 ) , if it was defined, satisfies: Indeed, one may easily prove (i ) and (ii ) by induction on (k 0 + k 1 ) − (i 0 + i 1 ). The basis of the induction is trivial as H (k 0 ,k 1 ) = H, see Definition 2.1. The inductive step follows immediately from Observation 2.4 and Lemma 2.6.
Suppose, therefore, that step (C4) is executed when G = H (i 0 ,i 1 ) for some (i 0 , i 1 ) ∈ U, and note that s = (k 0 + k 1 ) − (i 0 + i 1 ). We claim that e(H (i 0 ,i 1 ) ) β s e(H). Indeed, this is trivial if s = 0, whereas if s > 0 and this were not true, then we would have executed step (C4) at the previous step. We therefore have and e(G * ) < β s+1 · e(H), which, by Lemma 2.7 and (ii ), implies that either (P2) or (P3) of Lemma 2.7 holds. Note that if c = 1, then k 1 i 1 > 0 and we have where δ = 2 −(k 0 +k 1 )(k 0 +k 1 +1) K −1 . On the other hand, if c = 0, then k 0 i 0 > 0 and This verifies that f * h satisfies property (b) from the statement of Theorem 1.4. To complete the proof, we need to show that f * decomposes as f * = f • g for some g : and to verify that properties (a) and (c) from the statement of the theorem hold. We claim that one may take g(h) = (S 0 , S 1 ), where S 0 and S 1 are the sets constructed by the above procedure, see (C3). To this end, it suffices to show that if for some h, h ′ ∈ F (H) the above procedure produces the same pair (S 0 , S 1 ), then To see this, observe first that the set S defined in step (C2) is precisely the set of all indices j ∈ {0, . . . , J − 1} that satisfy v j ∈ S c . Indeed, the former set is contained in the latter by construction, see (C3). The reverse inclusion holds because

Robust balanced stability for induced C 4 s
Recall from Section 1.3 that a pregraph is a pair $(M, E)$ of disjoint subsets of $E(K_n)$. The elements of $E$ are called edges whereas the elements of $M$ are called mixed edges. A good copy of $C_4$ in a pregraph $(M, E)$ is a copy of $C_4$ in $M$ whose vertex set is independent in $E$. In particular, the vertex set of each good copy of $C_4$ induces four, five, or six edges of $M$, four of which play the roles of edges of $C_4$. Given a pregraph $P = (M, E)$, we define three hypergraphs with vertex set $M$, denoted $\mathcal{H}^P_0$, $\mathcal{H}^P_1$, and $\mathcal{H}^P_2$. The $(i, 4)$-uniform hypergraph $\mathcal{H}^P_i$ comprises all pairs $(A, B)$ such that $B$ is a good copy of $C_4$ and $A$ is the set of the remaining $i$ mixed edges induced by the vertex set of this copy (which induces exactly $4 + i$ edges of $M$). Recall that we say that a graph $G$ is $\varepsilon$-close to $K_\ell$ if one can transform $G$ into $K_\ell$ by adding or deleting at most $\varepsilon\binom{\ell}{2}$ edges. The following theorem, a robust stability statement for good copies of the 4-cycle in a pregraph, is the main result of this section.
Let us say that an $(i, 4)$-uniform hypergraph $\mathcal{H}_i$ is permissible if it satisfies both (all three, if $i > 0$) maximum degree conditions stated in Theorem 3.1. We shall thus be looking for a permissible subhypergraph $\mathcal{H}_i \subseteq \mathcal{H}^P_i$, for some $i \in \{0, 1, 2\}$, that has $\Omega(\ell^4)$ edges. We shall build the $\mathcal{H}_0$, $\mathcal{H}_1$, and $\mathcal{H}_2$ by adding to them one edge at a time, making sure that we stay within the class of permissible hypergraphs, until one of them has sufficiently many edges. (Trivially, an empty hypergraph is permissible.) It will be convenient to use the following nomenclature. A pair $(S, T)$ of disjoint sets of edges of $K_n$ is saturated in a hypergraph $\mathcal{H}$ if $\deg_{\mathcal{H}}(S, T)$ attains or exceeds its maximum permitted value. That is, if Thus, in the setting of Theorem 3.1, we shall be looking for an $i \in \{0, 1, 2\}$ and an edge of $\mathcal{H}^P_i \setminus \mathcal{H}_i$ which does not contain any saturated pair.
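The definitions of good copies and of the hypergraphs $\mathcal{H}^P_0$, $\mathcal{H}^P_1$, $\mathcal{H}^P_2$ can be made concrete for small pregraphs. The following sketch enumerates all pairs $(A, B)$ by brute force; the function names `good_copies` and `split_by_extra_edges` are hypothetical, and the enumeration over all $4$-sets is only meant for tiny examples.

```python
from itertools import combinations

def good_copies(n, M, E):
    """List all pairs (A, B): B is a good copy of C4 in the pregraph (M, E) on
    vertices 0..n-1, and A is the set of remaining mixed edges on its vertex set."""
    M = {frozenset(e) for e in M}
    E = {frozenset(e) for e in E}
    result = []
    for a, b, c, d in combinations(range(n), 4):
        W2 = {frozenset(p) for p in combinations((a, b, c, d), 2)}
        if W2 & E:
            continue  # the vertex set must be independent in E
        # Each 4-cycle on the vertex set is K4 minus a perfect matching.
        for diag in (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))):
            B = W2 - {frozenset(p) for p in diag}
            if B <= M:
                A = (W2 & M) - B
                result.append((frozenset(A), frozenset(B)))
    return result

def split_by_extra_edges(copies):
    """Group the pairs (A, B) by i = |A|, mirroring the three hypergraphs indexed by i."""
    groups = {0: [], 1: [], 2: []}
    for A, B in copies:
        groups[len(A)].append((A, B))
    return groups
```

For instance, if all six pairs on four vertices are mixed edges, then there are three good copies, each with two extra mixed edges; if only the four cycle edges are mixed, there is a single good copy with no extra mixed edges.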
We first show how to deduce Theorem 3.1 from the following, seemingly weaker, statement by performing an appropriate preprocessing of the pregraph P. This preprocessing of P will 'disable' all saturated pairs of types (i ) and (ii ), so that we will only have to worry about pairs of type (iii ).  δ)ℓn, E is not ε-close to K ℓ , and ℓ λn.
Then for any collection $\mathcal{C}$ of at most $12\beta\ell^3$ pairs of elements of $M$, there exist at least $3\beta\ell^4$ good copies of $C_4$ in $P$ that contain no pair from $\mathcal{C}$.
Derivation of Theorem 3.1 from Theorem 3.2. Given 0 < ε 2, 9 let β 3.2 , δ 3.2 , λ 3.2 , and C 3.2 be the constants whose existence is asserted by Theorem 3.2 with ε 3.2 ← ε/4 and let Suppose that a pregraph P = (M, E) satisfies the assumptions of Theorem 3.1. We shall build the (initially empty) hypergraphs H 0 , H 1 , and H 2 edge by edge, making sure that we stay within the class of permissible hypergraphs, until one of them has sufficiently many edges. To this end, suppose that we have succeeded in constructing some permissible H 0 , H 1 , and H 2 , but each of them has fewer than βℓ 4 edges. We shall modify the pregraph P by removing from M all mixed edges f for which there exists i ∈ {0, 1, 2} such that either (∅, {f }) or ({f }, ∅) (or both) is saturated in H i . This will ensure that every good copy of C 4 that we will later find in this modified colouring will not contain any saturated pair (S, T ) of type (i ) or (ii ). To achieve this, we first move all mixed edges f for which ({f }, ∅) is saturated in either H 1 or H 2 from M to E and then move all f for which (∅, {f }) is saturated in any of the H i from M to an initially empty set N. Denote the modified pregraph by P ′ = (M ′ , E ′ ). Observe, crucially, that each good copy of C 4 in P ′ is also good in P, as E ′ ⊇ E and M ′ ⊆ M. Moreover, each such copy yields an edge of one of the H P i with no saturated pair of type (i ) or (ii ), where 4 + i is the number of edges of M ′ ∪ N induced by the vertex set of this 4-cycle. 10 Let ℓ ′ = ⌊(1 + δ)ℓ⌋. As each of the H i has fewer than βℓ 4 edges, then In particular, Moreover, if e(M) 4ℓn, then e(M ′ ) 3ℓ ′ n, and if e(M) 2δℓ 2 , and δ ε/10.

9 Note that the result for ε > 2 is implied by the statement for ε = 2, since condition (M3) is then stronger than condition (M1), and every graph with at most ℓ 2 edges is 2-close to K ℓ .
Therefore, if $P$ satisfies the assumptions of Theorem 3.1 with either (M1) or (M2), then $P'$ satisfies the assumptions of Theorem 3.2 with $\varepsilon_{3.2} \leftarrow \varepsilon/4$ and $\ell_{3.2} \leftarrow \ell'$; see (M1*) and (M2*).
Now, let C be the collection of all T such that (∅, T ) is a saturated pair of type (iii ) in one of the H i and observe that as each edge of H i contains at most four such saturated pairs (if f 1 , f 2 ∈ M do not share a vertex, then deg H P i (∅, {f 1 , f 2 }) 2). Therefore, if P satisfies the assumptions of Theorem 3.1 with either (M1) or (M2), then we may invoke Theorem 3.2 to find at least 3β 3.2 (ℓ ′ ) 4 3βℓ 4 good copies of C 4 in P ′ , none of which contains a pair from C.
On the other hand, if P satisfies the assumptions of Theorem 3.1 with (M3), then P ′ restricted to the set U c satisfies the assumptions of Theorem 3.2 with ℓ 3.2 ← 2 √ εℓ, as we may again invoke Theorem 3.2 to find at least 3β 3.2 (2 √ εℓ) 4 3βℓ 4 good copies of C 4 in P ′ , none of which contains a pair from C.
Finally, it follows from our construction that each good copy of $C_4$ in $P'$ corresponds to an edge of $\mathcal{H}_i^P$ for some $i \in \{0, 1, 2\}$ that additionally does not contain any saturated pairs of type (i) or (ii). Moreover, by our definition of $\mathcal{C}$, none of the at least $3\beta\ell^4$ copies we have found above contains a saturated pair of type (iii) either. Recalling that $e(\mathcal{H}_0) + e(\mathcal{H}_1) + e(\mathcal{H}_2) < 3\beta\ell^4$, it follows that one of these good $C_4$s yields a pair $(A, B) \in \mathcal{H}_i^P \setminus \mathcal{H}_i$ such that $\mathcal{H}_i \cup \{(A, B)\}$ is permissible. Iterating this process, we must eventually arrive at a permissible hypergraph $\mathcal{H}_i$ (for some $i \in \{0, 1, 2\}$) with at least $\beta\ell^4$ edges, as required.
The remainder of this section is dedicated to the proof of Theorem 3.2. We begin by proving the following proposition, which establishes Theorem 3.2 when condition (M1*) holds and will moreover serve as a helpful warm-up for the proof of the theorem. It will also be a step in the proof of the theorem under the assumption (M2*). Our proofs will use the following two auxiliary statements. The first is a well-known result of Caro [13] and Wei [48]. We remark that, in this section, if $G$ is a graph (such as $M$ or $E$), we will write $d_G(v)$ and $d_G(v, S)$ to denote the number of neighbours of $v$ and the number of neighbours of $v$ in $S$, respectively.
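The statement of the Caro–Wei lemma is not reproduced in this text; in its standard form it guarantees that every graph $G$ contains an independent set of size at least $\sum_{v} 1/(1 + d_G(v))$. The sketch below, which is only an illustration and not the argument used in the paper, computes this bound and runs the classical minimum-degree greedy procedure, which is known to produce an independent set at least that large. The function names and the adjacency-dictionary encoding are assumptions.

```python
def caro_wei_bound(adj):
    """Return the Caro-Wei lower bound sum_v 1 / (1 + d(v))."""
    return sum(1.0 / (1 + len(nbrs)) for nbrs in adj.values())


def greedy_independent_set(adj):
    """Minimum-degree greedy: pick a vertex of minimum degree in the current
    graph, add it to the independent set, delete it and its neighbours, repeat."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    independent = []
    while remaining:
        v = min(remaining, key=lambda u: len(remaining[u]))
        independent.append(v)
        removed = remaining[v] | {v}
        for u in removed:
            remaining.pop(u, None)
        # Update the neighbourhoods of the surviving vertices.
        for nbrs in remaining.values():
            nbrs -= removed
    return independent
```

On the 5-cycle, for example, the bound equals $5/3$ and the greedy procedure returns two non-adjacent vertices.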
The second is an easy consequence of Jensen's inequality applied to the convex function [0, ∞) ∋ x → 1/(1 + x). Given a nonnegative integer d and a real number q ∈ [0, 1], we shall denote by Bin(d, q) the binomial random variable with parameters d and q.

Proof of Proposition 3.3. Fix a pregraph P = (M, E) on n vertices and a collection C satisfying the assumptions of the proposition. We first remove all vertices whose degree in M is less than 2ℓ. As this way we lose at most 2ℓn edges of M, we arrive at an m-vertex subset W ⊆ V (K n ), for some 2ℓ ≤ m ≤ n, such that δ(M[W ]) ≥ 2ℓ. Clearly, it is sufficient to find ℓ 4 /40 good copies of C 4 in P restricted to W , none of which contains a pair from C. Therefore, we shall replace the original M, E, and P with their restrictions to the set W . Set q = m/ℓ 2 ≤ n/ℓ 2 ≤ 1 and form a random subset R ⊆ W by retaining each element of W independently with probability q. We apply Lemma 3.4 to the graph E[R] to find an independent set I ⊆ R with By Fact 3.5, we have .
As the function [0, ∞) ∋ x → q/(1 + qx) is convex, the sum in the right-hand side above is minimised when d E (v) = 2e(E)/m for every v ∈ W . As e(E) ℓ 2 /2, then Next, let us choose, for each vertex v ∈ W , an arbitrary set M v of 2ℓ edges of M that are incident to v. We shall say that a copy of K 1,2 is good if its centre v lies in I, both of its edges are in M v , and the pair comprising its two non-centre vertices does not belong to E. The number X g of such good K 1,2 s satisfies We shall say that a copy of K 1,2 in M is saturated if (the set consisting of) its two edges belong to C. Let X s be the number of saturated K 1,2 s in M whose centre vertex belongs to the (random) set I ⊆ R. Writing X for the number of good K 1,2 s that are not saturated, we have X X g − X s and hence, recalling that |C| ℓ 3 /40, where we have used (19), (20), and the inequality m 2ℓ.
Since I is an independent set in E, it follows that any pair of good K 1,2 s with the same non-centre vertices forms a good C 4 and therefore we have at least X − m 2 such C 4 s. However, we must disregard those C 4 s that contain a saturated K 1,2 whose two non-centre vertices lie in I, since the two edges of such a saturated K 1,2 could come from two different good non-saturated K 1,2 s whose centre vertices lie in I. The expected number of saturated K 1,2 s of this type is at most q 2 · |C| and each of them lies in at most 2ℓ of our good C 4 s, since the edges of our good C 4 s came only from the sets M v . We must therefore discard (in expectation) at most 2ℓq 2 |C| of the (at least) X − m 2 good C 4 s found using pairs of good K 1,2 s.
To summarise, let Z be the number of good C 4 s that contain no saturated K 1,2 and at least two vertices of I. By (21) and the argument above, we have Finally, observe that each good copy of C 4 containing no saturated K 1,2 has probability at most 2q 2 of being counted by Z. It therefore follows that the total number of such copies of C 4 must be at least m 2 /(40q 2 ) = ℓ 4 /40, as required.
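The proof above invokes Fact 3.5, whose statement is not reproduced in this text. Judging from its description (Jensen's inequality applied to the convex function $x \mapsto 1/(1+x)$ and the variable $\mathrm{Bin}(d, q)$), it is presumably the inequality $\mathbb{E}\big[1/(1 + \mathrm{Bin}(d, q))\big] \geq 1/(1 + dq)$. The following sketch checks this inequality numerically under that assumption; the function names are hypothetical.

```python
from math import comb


def expected_inverse(d, q):
    """Exact value of E[1 / (1 + Bin(d, q))], computed from the binomial pmf."""
    return sum(
        comb(d, k) * q**k * (1 - q) ** (d - k) / (1 + k) for k in range(d + 1)
    )


def jensen_lower_bound(d, q):
    """Value of 1 / (1 + E[Bin(d, q)]) = 1 / (1 + d * q)."""
    return 1.0 / (1 + d * q)
```

For $q > 0$ the expectation also has the closed form $\big(1 - (1-q)^{d+1}\big) / \big((d+1) q\big)$, which gives an independent check of the sum above.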
We next consider pregraphs P = (M, E) for which one can find a small set A of vertices of K n that contains only a tiny proportion of the edges of E, but still a large proportion of mixed edges have an endpoint in A. The following proposition will be invoked in the proof of Theorem 3.2. Then for any collection C of at most αℓ 3 pairs of elements of M, there exist at least αℓ 4 good copies of C 4 in P that contain no pair from C.
Proof. The proof of Proposition 3.6 follows the general strategy of the proof of Proposition 3.3, but there are some key differences. In particular, we will find the independent set I inside the set A alone and we shall select vertices of R with different probabilities. Rather than invoking Lemma 3.4 and Fact 3.5, we shall give a somewhat finer argument to produce a large independent set I ⊆ R and use it to construct good copies of C 4 . We start by iteratively removing from A all vertices v that do not satisfy Observe that the set A ′ of vertices remaining after this deletion satisfies Let a = |A ′ | and order the elements of and form a random set R ⊆ A ′ by keeping each v i independently with probability q i . Define and observe that I is an independent set in the graph E. 11 Similarly to before, we shall say that a copy of K 1,2 in M is good if its centre lies in I and the pair comprising its two non-centre vertices does not belong to E. Observe that the number X g of good K 1,2 s satisfies as d M (v) 2ℓ for each v ∈ I. We shall now estimate the probability that a given vertex v ∈ A ′ belongs to the random set I. To this end, suppose that v = v i for some i ∈ [a] and note that, by (22), there are at most d M (v i ) · 16αℓ/n indices j such that v i v j ∈ E. Moreover, by our choice of the ordering, q j q i whenever j > i. Letting , and recalling that 8n/(ℓd) 1/4, it follows that where we used the bounds 1 − x e −5x/4 when 0 x 1/4 and e −1/4 > 3/4. We will need to disregard the saturated K 1,2 s, that is, all those whose pair of edges belongs to C. Let X s be the number of those saturated K 1,2 s whose centre vertex belongs to the set I. Writing X for the number of good K 1,2 s that are not saturated, we have X X g − X s , and hence where we have used (23) and the inequality n 2ℓ (which holds since A ′ is non-empty).
Since I is an independent set in E, it follows that any pair of good K 1,2 s with the same non-centre vertices forms a good C 4 . Thus we have at least X − n 2 such C 4 s. However, we must still disregard those C 4 s that contain a saturated K 1,2 with two non-centre vertices in I. Fix some K 1,2 from C and suppose that its non-centre vertices are v i and v j . Observe that it can lie in at most d M (v i ) of our good copies of C 4 . Therefore, the expected number of good C 4 s that we are forced to disregard because of this single K 1,2 is at most Consequently, the expected number of good copies of C 4 that we have to disregard because of one of the saturated K 1,2 s from C is at most 32|C|n 2 /ℓ 3 .
To summarise, let Z be the number of good C 4 s that contain no saturated K 1,2 and at least two vertices of I. We have shown that But as each good copy of C 4 containing no saturated K 1,2 has chance at most 2q 2 1 to be counted by Z, the number of them is at least n 2 /(40q 2 1 ) ℓ 4 /640. This completes the proof of the proposition.
Proof of Theorem 3.2. We begin by defining the constants whose existence is claimed in the statement of the theorem. Given 0 < ε 1/2, set α = 2 −16 and define Suppose that ℓ C √ n and let P = (M, E) be a pregraph on n vertices with e(E) ℓ 2 . If P satisfies (M1*), then we may immediately invoke Proposition 3.3, noting that |C| 12βℓ 3 ℓ 3 /40, to find ℓ 4 /40 good copies of C 4 that contain no pair from C.
We may therefore assume from now on that P satisfies (M2*), that is, We begin by iteratively removing all vertices v whose degree in M is smaller than (1−2δ)ℓ. As this way we can remove at most (1 − 2δ)ℓn edges of M, we will eventually arrive at a set W ⊆ V (K n ) with δ(M[W ]) (1 − 2δ)ℓ. Set m = |W |, and note that, since we removed at most (1 − 2δ)ℓ(n − m) edges of M, we have e M (W ) max (1 − δ)ℓm, δℓn , and therefore Observe that the subgraph of E induced by W is also not (ε/2)-close to K ℓ . Indeed, otherwise there would be an ℓ-element set U ⊆ W with e E (U) (1 − ε/2) ℓ 2 , which would imply that E itself is ε-close to K ℓ , as e(E) ℓ 2 . We may thus work with the restrictions of M, E, and P to the set W . We shall suppress W from the notation and write M, E, and P in place of M and moreover δ(M) (1 − 2δ)ℓ. We split the proof into two cases, depending on the shape of the degree sequence of E.
Set q = Cm/ℓ 2 1/C and form a random subset R ⊆ W by keeping each element of W independently with probability q. We apply Lemma 3.4 to the graph E[R] (cf. the proof of Proposition 3.3) to find an independent set I ⊆ R with By Fact 3.5, we have .
As the function [0, ∞) ∋ x → q/(1 + qx) is convex, the sum in the right-hand side above is minimised when d E (v) = 2e(E)/m for every v ∈ W . However, we assumed that d E (v) (1 − α)ℓ 2 /m for every v ∈ L, so a slightly stronger bound holds. Indeed, since then it follows that One may verify the last inequality in (25) by multiplying the numerators and the denominators in the left-hand side by m/(ℓ 2 q) = 1/C = α 3 /4 and observing that Set d = (1 − 2δ)ℓ and choose, for each vertex v ∈ W , an arbitrary set M v of d edges of M that are incident to v. As before, we shall say that a copy of K 1,2 is good if its centre v lies in I, both of its edges are in M v , and the pair of its non-centre vertices is not in E. As E is not (ε/2)-close to K ℓ , then for every v ∈ W , the set M v of the d other endpoints of the edges in M v contains at least d 2 − (1 − ε/2) ℓ 2 pairs that do not belong to E. In particular, as δ ε/16, each vertex of I is the centre of at least εℓ 2 /8 good K 1,2 s. Unfortunately, this lower bound is not sufficiently strong for the naive argument given in the proof of Proposition 3.3 to work, as E[|I|] is too small. Instead, we shall exploit the rough structure of E.
To this end, we partition the set W into sets W L and W H of low and high degree vertices, which are defined as follows: Given an independent set I, we split it into I L and I H , which are defined as follows: Observe that if v ∈ I L , then M v contains at least d−δℓ 2 − δℓ 2 /2 (1 − 7δ)ℓ 2 /2 pairs that do not belong to E. We shall argue differently for different I, depending on the relative sizes of the sets I L and I H .
In both cases, we will find a (random) collection of at least δ 2 m 2 /16 good C 4 s (in expectation) each of which is the union of two K 1,2 s centred at some v, w ∈ I and such that neither of (the pairs of edges of) these K 1,2 s belongs to C. We first argue that this is sufficient. Indeed, even though we will still have to disregard those copies of C 4 that contain a K 1,2 with two non-centre vertices in I whose edges belong to C, the expected number of such saturated K 1,2 s is at most q 2 · |C| and each of them lies in at most d ℓ of our good copies of C 4 , as the edges of these good C 4 s came only from the sets M v . Hence, letting Z be the (random) number of good C 4 s that contain at least two vertices of I and no K 1,2 whose edges belong to C, we will have But as each good copy of C 4 containing no saturated K 1,2 has chance at most 2q 2 to be counted by Z, the number of them is at least Therefore, in order to complete the proof of the theorem in Case 1, it suffices to prove the existence of (a random collection of) δ 2 m 2 /16 good copies of C 4 (in expectation) of the less restrictive type described above.
Recall that if v ∈ I L , then M v contains at least (1 − 7δ)ℓ 2 /2 pairs that do not belong to E. It follows that the number X g of good K 1,2 s satisfies Writing again X s for the number of saturated K 1,2 s (those whose edges belong to C) whose centre vertex belongs to I and X for the number of good K 1,2 s that are not saturated, we have X X g − X s and consequently, where we used (25), the facts that δ < α 3 /2 7 and β < α 3 /(8 · 24C), and the trivial inequality m (1 − 2δ)ℓ ℓ/2. Since I is an independent set in E, any pair of good Let us write X g for the number of good K 1,2 s with at least one non-centre vertex in W H . We will show in this case that from which it will be straightforward (as in Subcase 1A) to deduce the existence of the required collection of good C 4 s.
To prove the lower bound on X g , recall first that each vertex v ∈ I H is the centre of at least εℓ 2 /8 good K 1,2 s; we claim that at least δℓ 2 /4 of these have at least one non-centre vertex in W H . To prove this, set w = | M v ∩ W L | and suppose first that w εℓ/2. Then at most ε 2 ℓ 2 /8 good K 1,2 s centred at v have both non-centre vertices in W L and since ε/8 − ε 2 /8 ε/16 δ/4, the claim follows in this case. On the other hand, if w > εℓ/2, then there are at least good K 1,2 s centred at v with at least one non-centre vertex in W H . Indeed, since | M v | = d and each u ∈ M v ∩ W L has degree at most δℓ/2 in E, there are at least d − w − δℓ/2 good K 1,2 s centred at v that contain u and a third vertex from W H . Thus as claimed. To prove the claimed upper bound on |W H |, observe that which implies, by (24), that as required. Now, writing X for the number of good K 1,2 s with a non-centre vertex in W H that are moreover not saturated and X s for the number of saturated K 1,2 s (that is, K 1,2 s whose pair of edges belongs to C) whose centre vertex belongs to I, we have X X g − X s and hence, where we again used the bounds β < δ 2 /(8 · 24C) and m (1 − 2δ)ℓ ℓ/2. Finally, since I is an independent set in E, it follows that there are at least X − |W H |m good C 4 s formed by pairs of K 1,2 s that are counted by X and hence the expected number of good copies of C 4 that are formed by two K 1,2 s centred at vertices in I, neither of which belongs to C, is at least as required. This completes the proof in Case 1.
Case 2. There are fewer than αm vertices v satisfying d E (v) (1 − α)ℓ 2 /m. In this case, we shall find our good C 4 s in various ways, depending on the distribution of degrees (in both the graphs M and E) on the set A of vertices whose degree in E is somewhat larger than average. To be precise, set γ = 1/32 and define We claim that e E (A) γ 2 ℓ 2 . To prove this, observe first that Noting that α = γ 3 /2, and recalling that e(E) For the rest of the proof, we will search for good C 4 s formed by two K 1,2 s whose centre vertices belong to B. Let us say that a copy of K 1,2 in M is good if its centre lies in B and the pair of its non-centre vertices does not belong to E. Observe that for each v ∈ B, letting N M (v) denote the M-neighbourhood of v, we have since d M (v) δ(M) ℓ/2 and ℓ/m γ/16 by (24). We therefore have at least It only remains to bound the number of good C 4 s composed of two good K 1,2 s, and remove those that contain a pair from C. Our strategy will be similar to that used above, but there are two additional problems to overcome in this case: the set B is not an independent set and we do not have an upper bound on the degrees d M (v). To deal with the first problem, we will use our upper bound on d E (v) for v ∈ B, together with a slightly more careful application of convexity than was needed earlier in the proof. To deal with the second issue, we will partition B according to the approximate size of d M (v) and restrict our search to one of the parts.
We first partition B into two parts, depending (roughly speaking) on whether or not We first consider the case in which sufficiently many of the mixed edges incident to B have an endpoint in B L .
Let X denote the number of good K 1,2 s whose centre vertex lies in B L and whose pair of edges does not belong to the family C. By (28), we have since by the Cauchy-Schwarz inequality and (29), Let Y denote the number of (ordered) pairs of K 1,2 s that are counted by X and have the same non-centre vertices. By the convexity of the function x → x(x − 1) and by (30), we have Cn Cm. Now, let us denote by Y b the number of (ordered) pairs of K 1,2 s counted by Y that do not correspond to good C 4 s (that is, pairs of good K 1,2 s with the same non-centre vertices, whose centre vertices are adjacent in E). By the definition (26) of B, this number satisfies Thus, writing Z g for the number of good C 4 s consisting of pairs of K 1,2 s counted by Y and combining the last three displayed equations, we obtain Finally, we must disregard those good C 4 s, counted in Z g , that contain a K 1,2 of mixed edges that belongs to the family C. The edges of such a K 1,2 must come from different good K 1,2 s counted by X and therefore (by the definition of B L ) there are at most 2 20 ℓ·|C| such C 4 s. It follows that the number Z of good C 4 s that contain no K 1,2 s whose edges belong to C satisfies as required.
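The convexity step used above can be illustrated in isolation: if $X$ copies of $K_{1,2}$ are distributed over the $\binom{n}{2}$ possible pairs of non-centre vertices, then the number $Y$ of ordered pairs of copies sharing their non-centre vertices is at least $\binom{n}{2} \cdot x(x-1)$, where $x = X / \binom{n}{2}$, by Jensen's inequality for $x \mapsto x(x-1)$. The sketch below is a simplified illustration that ignores the requirements that the non-centre pair avoids $E$ and that the edge pair avoids $\mathcal{C}$; the function name and input encoding are assumptions.

```python
from collections import Counter
from itertools import combinations


def cherry_pair_count(n, mixed, centres):
    """Return (X, Y, bound) for copies of K_{1,2} in `mixed` centred in `centres`.

    X is the number of such copies, Y the number of ordered pairs of distinct
    copies with the same non-centre vertices, and bound the convexity lower
    bound binom(n, 2) * x * (x - 1) with x = X / binom(n, 2).
    """
    nbrs = {v: set() for v in range(n)}
    for u, v in mixed:
        nbrs[u].add(v)
        nbrs[v].add(u)
    per_pair = Counter()
    for c in centres:
        for u, w in combinations(sorted(nbrs[c]), 2):
            per_pair[(u, w)] += 1  # one K_{1,2} with centre c and ends u, w
    X = sum(per_pair.values())
    Y = sum(k * (k - 1) for k in per_pair.values())
    total_pairs = n * (n - 1) // 2
    x = X / total_pairs
    return X, Y, total_pairs * x * (x - 1)
```

When every pair of vertices receives the same number of copies, as in a complete graph with all vertices allowed as centres, the bound is attained with equality.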
Note that if (29) fails to hold, then v∈B H d M (v) 4/3 − 5/4 ℓm = ℓm/12, by (27). In this case we will choose a subset of B H on which the M-degrees are roughly constant and apply the same argument as in Subcase 2A.
For each integer t 0, set b t = 2 −4t−28 m and d t = 2 3t+20 ℓ and define We claim that there exists t such that contradicting (31). Fix any such t and let X denote the number of K 1,2 s whose centre vertex lies in B t , whose pair of non-centre vertices is not in E, and whose pair of edges does not belong to the family C. Observe that As before, let Y denote the number of (ordered) pairs of K 1,2 s that are counted by X and have the same non-centre vertices. By the convexity of the function where we again used the assumption that ℓ 2 Cm. The number Y b of (ordered) pairs counted by Y that do not correspond to good C 4 s (that is, pairs of good K 1,2 s whose centre vertices are adjacent in E) satisfies Thus, the number Z g of good C 4 s counted by Y satisfies Finally, we disregard those good C 4 s, counted in Z g , that contain a K 1,2 of mixed edges that belongs to the family C. For each element of C, there are at most d t+1 such C 4 s and therefore the number Z of good C 4 s that contain no K 1,2 s whose edges belong to C satisfies as required. This completes the proof of the theorem.

The number of split graphs and the non-structured regime
In this section, we prove assertions (a) and (b) of Theorem 1.2. We first establish two lower bounds on the cardinality of $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$: a stronger bound for all $m \ll n^{4/3}$ and a weaker bound for all $m \ll n^{4/3} (\log n)^{1/3}$. Second, we carefully estimate the number of split graphs with $n$ vertices and $m$ edges for all $n$ and $m$ with $n \ll m \ll n^2$. Third, we provide a simple upper bound on the number of graphs that are not $\varepsilon$-quasirandom. A straightforward comparison of these bounds yields the claimed results.

4.1. Lower bounds for $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$. We first show that if $m \ll n^{4/3}$, then the family $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$ forms an $e^{-o(m)}$-proportion of all graphs with $n$ vertices and $m$ edges. In particular, as we shall later verify, if $m \gg n$, then for every fixed $\varepsilon$, graphs with no induced copy of $C_4$ outnumber the graphs that are not $\varepsilon$-quasirandom and thus a typical member of $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$ is $\varepsilon$-quasirandom.

Markov's inequality gives $\mathbb{P}(X \geq m' - m) = \mathbb{P}(X \geq \delta m) \leq 1/2$. In particular, at least half of all graphs with vertex set $\{1, \dots, n\}$ and $m'$ edges contain a subgraph with $m$ edges and no copy of $C_4$. This implies that Finally, by our assumption on $\delta$, This completes the proof.
The derivation of our second lower bound on |F ind n,m (C 4 )| follows a similar strategy, but the simple deletion argument is replaced with the following result of Kohayakawa, Kreuter, and Steger [32], stated here for the random graph G n,m rather than the binomial random graph G(n, p). The heart of the proof of this theorem (which we shall not give here, but rather refer the reader to [32,Theorem 8] or to [24, Appendix A]) is a classical result of Ajtai, Komlós, Pintz, Spencer, and Szemerédi [1], or rather its corollary derived by Duke, Lefmann, and Rödl [19], that gives a lower bound on the independence number of a uniform hypergraph that contains few short cycles.   Proof. Let c be the constant from the statement of Theorem 4.2. Given a positive γ, choose δ > 0 sufficiently small so that δ c(γ/2) 1/3 , let m ′ = n 4/3+γ/2 , and observe that cn 4/3 log(m ′ /n 4/3 ) 1/3 c(γ/2) 1/3 n 4/3 (log n) 1/3 δn 4/3 (log n) 1/3 .
Suppose that m δn 4/3 (log n) 1/3 . It follows from Theorem 4.2 that at least half of all graphs with vertex set {1, . . . , n} and m ′ edges contain a subgraph with m edges and no copy of C 4 , provided that n is sufficiently large. Therefore, similarly as in the proof of Proposition 4.1, This completes the proof.

4.2. The number of split graphs. As we shall need to compare the family of split graphs (and graphs that are close to a split graph) to various other families of graphs, we will need to derive some estimates on its cardinality. Let $\mathcal{S}_{n,m}$ denote the family of split graphs with vertex set $\{1, \dots, n\}$ that have precisely $m$ edges. Moreover, let $N_{n,m}(\ell)$ denote the number of those graphs that are complete on the set $\{1, \dots, \ell\}$ and empty on its complement. Observe that
$$N_{n,m}(\ell) = \begin{cases} \dbinom{\ell(n-\ell)}{m - \binom{\ell}{2}} & \text{if } \binom{\ell}{2} \leq m \leq \ell(n-\ell) + \binom{\ell}{2}, \\ 0 & \text{otherwise,} \end{cases}$$
and
$$\max_{\ell} N_{n,m}(\ell) \leq |\mathcal{S}_{n,m}| \leq \sum_{\ell} \binom{n}{\ell} N_{n,m}(\ell).$$
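The two-sided estimate on the number of split graphs can be sanity-checked by brute force for very small $n$. The sketch below assumes that $N_{n,m}(\ell)$ counts the ways of placing the remaining $m - \binom{\ell}{2}$ edges among the $\ell(n-\ell)$ pairs between the clique and its complement, which is what the definition suggests; the function names are hypothetical and the vertex set is taken to be $\{0, \dots, n-1\}$ rather than $\{1, \dots, n\}$.

```python
from itertools import combinations
from math import comb


def N(n, m, l):
    """Graphs with m edges that are complete on a fixed l-set and empty outside it."""
    clique, cross = comb(l, 2), l * (n - l)
    if clique <= m <= cross + clique:
        return comb(cross, m - clique)
    return 0


def is_split(n, edge_set):
    """Check whether some vertex subset U is a clique with independent complement."""
    for size in range(n + 1):
        for U in combinations(range(n), size):
            rest = [v for v in range(n) if v not in U]
            if all(frozenset(p) in edge_set for p in combinations(U, 2)) and not any(
                frozenset(p) in edge_set for p in combinations(rest, 2)
            ):
                return True
    return False


def count_split_graphs(n, m):
    """Brute-force count of split graphs on {0, ..., n-1} with exactly m edges."""
    all_pairs = [frozenset(p) for p in combinations(range(n), 2)]
    return sum(is_split(n, set(es)) for es in combinations(all_pairs, m))
```

The enumeration is exponential in $n^2$ and is meant only as a check for $n \leq 5$ or so; for such values one can compare `count_split_graphs(n, m)` with the maximum and the weighted sum of `N(n, m, l)` over `l`.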
Since (32) is rather hard to work with due to its inexplicit form, we establish several asymptotic properties of the function ℓ → N n,m (ℓ), summarised in Proposition 4.4 below. We postpone the rather dull and technical proof of the proposition to Appendix A. Proof of parts (a) and (b) of Theorem 1.2. Fix an arbitrary positive ε, suppose that m ≫ n, and let G be the uniformly chosen random graph with vertex set {1, . . . , n} and exactly m edges. A standard averaging argument shows that if G is not ε-quasirandom, then it contains a subset A with exactly εn vertices and density differing from m/ n 2 by more than εm/ n 2 . Consequently, Hoeffding's inequality for the hypergeometric distribution [27] asserts the existence of a positive ρ that depends only on ε such that P (G is not ε-quasirandom) n εn · exp (−3ρm) .
It now follows from Proposition 4.1 invoked with γ ← ρ that if δ is sufficiently small, then for all sufficiently large n and all m satisfying n ≪ m δn 4/3 , In other words, graphs that are not ε-quasirandom constitute only an exponentially small fraction of F ind n,m (C 4 ). Now, denote by S n,m (ε) the family of graphs with vertex set {1, . . . , n} and m edges that are ε-close to a split graph. Each graph in S n,m (ε) can be obtained from some graph in S n,m by removing from it some εm edges and replacing them with arbitrarily chosen εm edges of K n . Hence, if m ≫ n and n is sufficiently large, then |S n,m (ε)| |S n,m | · m εm · n 2 εm |S n,m | · em εm · en 2 2εm εm n εm · |S n,m |.
Suppose now that n ≪ m δn 4/3 (log n) 1/3 . As ℓ n,m ≪ n 2/3 , it follows that for all sufficiently large n. Therefore, Proposition 4.3 invoked with γ = 1/24 implies that if δ is sufficiently small, then In other words, graphs that are 1/4-close to a split graph constitute only a superexponentially small proportion of F ind n,m (C 4 ), as required.

An approximate structural theorem
In this section, we shall use Theorems 1.4 and 3.1 to construct a collection of containers for the family F ind n,m (C 4 ) whenever n 4/3 (log n) 4 m ≪ n 2 . Our aim is to do this in such a way that all but a tiny proportion of the family will be covered by containers that describe predominantly graphs that are close to a split graph. To make this notion precise, let us say that a pregraph P = (M, E) on n vertices is an ε-almost split pregraph if there exists a partition V (K n ) = U ∪ W such that , and e M (W ) 7 √ ε|U|n.
We will prove the following container theorem for sparse induced-C 4 -free graphs. Recall from Section 1.2 that a graph G is contained in (described by) a pregraph P = (M, Theorem 5.1. For every ε > 0, there exists λ > 0 such that the following holds. For every n ∈ N and n 4/3 (log n) 4 m λn 2 , there exists a collection C of ε-almost split pregraphs on n vertices with |C| = e o(m) such that all but at most e −λm · |F ind n,m (C 4 )| of the graphs in F ind n,m (C 4 ) are contained in some P ∈ C. To prove Theorem 5.1, we will apply Theorem 1.4 recursively, starting with the trivial container, which is defined by the 'complete' pregraph with M = E(K n ) (and therefore E empty). We continue until we obtain a family of containers, each of which admits only few good copies of C 4 ; we will be able to control this process with the use of Theorem 3.1, which provides us with a precise structural description of such pregraphs. Finally, we will show that the containers that are not ε-almost split pregraphs contain at most e −λm · |F ind n,m (C 4 )| members of F ind n,m (C 4 ). More formally, we shall build a rooted tree T whose vertices are pregraphs with n vertices. The root of T is the pregraph with M = E(K n ) corresponding to the trivial container. The children (in T ) of a pregraph will correspond to refinements of it that we obtain by applying Theorem 1.4 to one of the hypergraphs H i supplied by Theorem 3.1. This way, each graph in F ind n,m (C 4 ) that is described by some pregraph P in T will be described by one of the children of P in T . As a consequence, each graph in F ind n,m (C 4 ) will be accounted for by one of the leaves of T .
In order to decide whether a pregraph P = (M, E) should be a leaf of the tree or not (in which case we will apply Theorem 1.4 to it), we use the following definition.
Definition 5.2. A pregraph P = (M, E) on n vertices is a leaf pregraph (with respect to m, ε, and δ) if either P is an ε-almost split pregraph, or there exists ℓ ∈ N such that e(E) ℓ 2 and e(M) (1 − δ)ℓn, or either of the following holds: e(E) > m or e(M) < n 2 m 2 8 log(n 2 /m) Recall that, given a pregraph P = (M, E), the (i, 4)-uniform hypergraph H P i comprises all pairs (A, B) such that B is a good copy of C 4 in P and A is the set of the remaining i mixed edges induced by the vertex set of this copy (which induces exactly 4 + i edges of M). Also, with foresight, let us set r = m 2 13 log n .
We will use Theorem 3.1 to prove the following lemma.
In this case it follows immediately from Theorem 3.1 that there exists a hypergraph H with the claimed properties. Next, suppose that there exists ℓ C √ n and a set U of size ℓ such that e(E) ℓ 2 and e E (U) (1 − ε) ℓ 2 . (39)
Note that e(M) < 4ℓn, otherwise (38) holds and we are done as above. Since P is not a leaf pregraph, it follows that ℓ 2 r, as above, and e M (U c ) > 7 √ εℓn, as P is not an ε-almost split pregraph. This means that P satisfies condition (M3) of Theorem 3.1 and so we obtain a hypergraph H with the claimed properties, as before. Finally, let ℓ ∈ N be minimal such that e(M) (1 − δ)ℓn and observe that e(E) ℓ 2 , since P is not a leaf pregraph, and that ℓ e(M) (1 − δ)n m 2 8 log(n 2 /m) where the second inequality follows since P is not a leaf pregraph and the third by our bounds on m, since n is sufficiently large. It follows that E is not ε-close to K ℓ , since if it were, then there would exist a set U of size ℓ such that e E (U) (1 − ε) ℓ 2 , in which case (39) would hold and we would be done as before. Note also that e(M) (1 − 2δ)ℓn, by our choice of ℓ and since δℓ δ √ n δ/ √ λ 1. Now, observe that if (38)  where in the second step we used the fact that e(E) m (which holds if P is not a leaf pregraph) and in the third we used our upper bound on m. In either case, it follows that ℓ 2e(M)/n λ 3.1 n, since λ = 2 −8 λ 3.1 /C 2 and C 1. Hence P satisfies condition (M2) of Theorem 3.1 and we again obtain the desired hypergraph H. This completes the proof of the lemma.
We next combine Theorem 1.4 and Lemma 5.3 to construct a rooted tree whose leaves correspond to a family of containers for the family $\mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$.

Lemma 5.4. For every $\varepsilon > 0$, there exist positive constants $\delta$ and $\lambda$ such that the following holds. For every $n \in \mathbb{N}$ and $n^{4/3} (\log n)^4 \leq m \leq \lambda n^2$, there exists a collection $\mathcal{C}$ of $e^{o(m)}$ pregraphs on $n$ vertices such that
(a) every $P \in \mathcal{C}$ is a leaf pregraph with respect to $m$, $\varepsilon$, and $\delta$ and
(b) every graph $G \in \mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$ is contained in some $P \in \mathcal{C}$.

Proof. We will construct a rooted tree $T$ whose vertices are pregraphs on $n$ vertices that has the following properties:
(i) the root of $T$ is the complete pregraph with $M = E(K_n)$;
(ii) if $G \in \mathcal{F}^{\mathrm{ind}}_{n,m}(C_4)$ is contained in a pregraph $P \in V(T)$ that is not a leaf of $T$, then $G$ is contained in some child of $P$ in $T$;
(iii) the height of $T$ is $O(\log n)$;
(iv) the maximum degree of $T$ is $\exp\big(o(m / \log n)\big)$;
(v) every leaf of $T$ is a leaf pregraph with respect to $m$, $\varepsilon$, and $\delta$.
It will then follow immediately that the leaves of $T$ form a collection $\mathcal{C}$ as required. Indeed, by (iii) and (iv), the tree $T$ has at most $\exp\big(o(m/\log n)\big)^{O(\log n)} = e^{o(m)}$ leaves, whereas (a) follows from (v) and (b) follows from (i) and (ii).
To define the children of a vertex $P \in V(T)$, we will apply Theorem 1.4 to the hypergraph given by Lemma 5.3. To begin, let $\beta = \beta_{5.3}$, $\delta = \delta_{5.3}$, and $\lambda = \lambda_{5.3}$ be the constants given by Lemma 5.3 applied with $\varepsilon_{5.3} \leftarrow \varepsilon$ and set $\xi(n) = (\log\log n)^{-1}$ (here we could use any function that tends to zero sufficiently slowly as $n \to \infty$). Note that, due to the form of the statement, we may assume throughout that $n$ is sufficiently large.
Claim. For every $(\ell_0, \ell_1) \in \{0, \dots, i\} \times \{0, \dots, 4\}$ with $(\ell_0, \ell_1) \neq (0, 0)$, we have that

are closed under taking induced subgraphs. As we mentioned in the Introduction, the rough structure of a typical member of an arbitrary hereditary property of graphs was determined a few years ago by Alon, Balogh, Bollobás, and Morris [3]. It would be very interesting (and, most likely, extremely challenging) to obtain a corresponding statement for a typical sparse graph in a hereditary property. In order to give the reader an idea of what it might be possible to prove in this very general setting, let us take this opportunity to state a theorem for monotone properties of graphs (that is, properties of graphs that are closed under taking subgraphs) which follows easily from the container theorems proved in [10,44] but which, as far as we are aware, has not previously been stated explicitly in the literature.
Given a monotone property of graphs $\mathcal{P}$, let $\mathcal{F}(\mathcal{P})$ denote the family of minimal forbidden subgraphs, i.e., the family of all graphs that are not in $\mathcal{P}$, but all of whose proper subgraphs are in $\mathcal{P}$. Theorem 6.4, below, gives an approximate structural description of a typical member of $\mathcal{P}$ with (essentially) any given order $n$ and size $m$, as long as $\mathcal{F}(\mathcal{P})$ is finite. In order to state the theorem, we will need the following definition.
Definition 6.3. Given a non-trivial monotone property of graphs $\mathcal{P}$ such that $\mathcal{F}(\mathcal{P})$ is finite, we define the sequence $m(\mathcal{P}) = (a_1, r_1), \dots, (a_s, r_s)$ as follows: (i) Set $a_0 = 0$ and $r_0 = \infty$. If $r_{i+1} = 1$, then set $s = i + 1$ and stop; otherwise, increase $i$ by one and go to (ii).
Given integers $n$ and $m$ and a graph property $\mathcal{P}$, denote by $\mathcal{P}_{n,m}$ the family of all graphs with vertex set $\{1, \dots, n\}$ and precisely $m$ edges that belong to $\mathcal{P}$.
Theorem 6.4. Let $\mathcal{P}$ be a non-trivial monotone property of graphs such that $\mathcal{F}(\mathcal{P})$ is finite and let $G$ be a uniformly chosen random graph in $\mathcal{P}_{n,m}$. Suppose that $m(\mathcal{P}) = (a_1, r_1), \dots, (a_s, r_s)$. The following holds for every $\varepsilon > 0$:
(a) If $n \ll m \ll n^{2-1/a_1}$, then a.a.s. $G$ is $\varepsilon$-quasirandom.
Since the proof of Theorem 6.4 is a (nowadays) standard application of the container method, using (a robust version of) the stability theorem of Erdős and Simonovits [20,46] (cf. the proof of [10, Theorem 1.7]), we leave the details to the reader.
Finally, we remark that the assumption that $\mathcal{F}(\mathcal{P})$ is finite is essential. Indeed, suppose that $\mathcal{F}(\mathcal{P})$ contains all (minimal) non-bipartite graphs $H$ with $m_2(H) \geqslant a$ for a given $a > 1$. If $m \geqslant an$, then $\mathcal{P}_{n,m}$ contains only bipartite graphs and thus if $\varepsilon > 0$ is sufficiently small, then there are no graphs in $\mathcal{P}_{n,m}$ that are $\varepsilon$-quasirandom.
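The step from the edge count to bipartiteness can be checked directly. The following sketch assumes that $m_2$ denotes the usual 2-density, $m_2(H) = \max\{(e(F)-1)/(v(F)-2) : F \subseteq H,\ v(F) \geqslant 3\}$, which is not restated in this section. For a graph $G$ with $n \geqslant 3$ vertices and at least $an$ edges,

\[
  m_2(G) \;\geqslant\; \frac{e(G)-1}{v(G)-2} \;\geqslant\; \frac{an-1}{n-2} \;\geqslant\; a,
\]

where the last inequality is equivalent to $2a \geqslant 1$. Hence, if such a $G$ is not bipartite, then it is itself a non-bipartite graph with 2-density at least $a$, so it contains a minimal such graph as a subgraph and therefore does not belong to the property.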