Discrepancies of Spanning Trees and Hamilton Cycles

We study the multicolour discrepancy of spanning trees and Hamilton cycles in graphs. As our main result, we show that under very mild conditions, the $r$-colour spanning-tree discrepancy of a graph $G$ is equal, up to a constant, to the minimum $s$ such that $G$ can be separated into $r$ equal parts by deleting $s$ vertices. This result arguably resolves the question of estimating the spanning-tree discrepancy in essentially all graphs of interest. In particular, it allows us to immediately deduce as corollaries most of the results that appear in a recent paper of Balogh, Csaba, Jing and Pluh\'{a}r, proving them in wider generality and for any number of colours. We also obtain several new results, such as determining the spanning-tree discrepancy of the hypercube. For the special case of graphs possessing certain expansion properties, we obtain exact asymptotic bounds. We also study the multicolour discrepancy of Hamilton cycles in graphs of large minimum degree, showing that in any $r$-colouring of the edges of a graph with $n$ vertices and minimum degree at least $\frac{r+1}{2r}n + d$, there must exist a Hamilton cycle with at least $\frac{n}{r} + 2d$ edges in some colour. This extends a result of Balogh et al., who established the case $r = 2$. The constant $\frac{r+1}{2r}$ in this result is optimal; it cannot be replaced by any smaller constant.


Introduction
Combinatorial discrepancy theory aims to quantify the following phenomenon: if a hypergraph $H$ is "sufficiently rich", then in every 2-colouring of the vertices of $H$ there will be some hyperedge which is unbalanced, namely, has significantly more vertices in one of the colours than in the other. The corresponding parameter, called the discrepancy of $H$, is then defined as the maximum unbalance that is guaranteed to occur (on some hyperedge) in every 2-colouring of $V(H)$. More concretely, assuming that the colours are $-1$ and $1$, we can define the unbalance of a hyperedge $e$ under a colouring $f : V(H) \to \{-1, 1\}$ to be $|f(e)|$, where $f(e) := \sum_{x \in e} f(x)$, and the discrepancy is then the minimum of $\max_e |f(e)|$ over all colourings $f$. The study of such problems has a long and rich history, with several influential results. We refer the reader to Chapter 4 in the book of Matoušek [28] for a thorough overview.
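As a small illustration of the definition above, the following sketch computes the 2-colour discrepancy of a tiny hypergraph by exhaustive search. The function name `two_colour_discrepancy` and the toy hypergraph are illustrative assumptions, and the search is exponential in the number of vertices, so it is only meant for very small examples.

```python
from itertools import product

def two_colour_discrepancy(num_vertices, hyperedges):
    """Minimum, over all colourings f: V -> {-1, +1}, of max_e |sum_{x in e} f(x)|."""
    best = None
    for colouring in product((-1, 1), repeat=num_vertices):
        # Unbalance of the worst hyperedge under this colouring.
        worst = max(abs(sum(colouring[x] for x in e)) for e in hyperedges)
        if best is None or worst < best:
            best = worst
    return best

# Toy hypergraph (illustrative only): all 3-element subsets of {0, 1, 2, 3}.
triples = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
print(two_colour_discrepancy(4, triples))
```

Every hyperedge here has odd size, so its unbalance is at least 1, and colouring two vertices with each colour attains this value.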
In this paper, we are concerned with discrepancy questions in the context of graphs. In this setting, the vertices of the hypergraph H are the edges of some graph G, and the hyperedges of H correspond to subgraphs of G of a particular type, such as spanning trees, cliques, Hamilton cycles, clique factors, etc. There are several classical results in this vein for the case that G is a complete graph, including those of Erdős and Spencer [12] and Erdős, Füredi, Loebl and Sós [11]. Recently, Balogh, Csaba, Jing and Pluhár [3] (see also [4]) initiated the study of discrepancy problems for arbitrary graphs G, focusing on the discrepancy of spanning trees and Hamilton cycles. In the present paper, we continue this study. Our main result is a very general theorem on the discrepancy of spanning trees, which arguably resolves the problem of its estimation for all 3-vertex-connected graphs (as well as all "sufficiently expanding" 2-vertex-connected graphs; see the next section for the precise definition).
Our results in fact apply to the more general setting of multicolour discrepancy. While there are several natural ways to generalise the above definition of 2-colour discrepancy to an arbitrary number of colours, the resulting parameters are all within a multiplicative factor of each other, making the choice mostly a matter of convenience. Here we have chosen to use the following definition. For an integer $r \ge 2$, a hypergraph $H$ and an $r$-colouring $f : V(H) \to [r]$, let $D(H, f) := \max_{e \in E(H)} \max_{i \in [r]} \left( r \cdot |f^{-1}(i) \cap e| - |e| \right)$, and define the $r$-colour discrepancy of $H$ as $D_r(H) := \min_f D(H, f)$, where the minimum is taken over all $r$-colourings $f$ of $V(H)$. Note that $D_2(H)$ coincides with the (2-colour) discrepancy of $H$ defined in the beginning of this section. For a graph $G$ and a set $\mathcal{X}$ of subgraphs of $G$, we define $D_r(G, \mathcal{X})$ to be the $r$-colour discrepancy of the hypergraph $H$ with vertex-set $V(H) = E(G)$ and edge-set $E(H) = \mathcal{X}$. We will also sometimes use the notation $D(G, \mathcal{X}, f)$, which is analogous to $D(H, f)$. It is worth mentioning that discrepancy-type questions for more than two colours already appear in the literature, see e.g. [10]. Moreover, very recently such questions have also been considered in the context of discrepancy of graphs. Specifically, the multicolour discrepancy of Hamilton cycles in random graphs has been studied in [19].

Discrepancy of Spanning Trees
Spanning trees are among the most basic objects in graph theory. Let us denote the set of all trees on $n$ vertices by $\mathcal{T}_n$. Hence, for an $n$-vertex graph $G$, $D_r(G, \mathcal{T}_n)$ denotes the $r$-colour discrepancy of spanning trees of $G$. We now introduce a graph parameter which will play a central role in our results in this section. For an integer $r \ge 2$ and a graph $G$, denote by $s_r(G)$ the minimum $s$ for which there is a partition $V(G) = V_1 \cup \dots \cup V_r \cup S$ such that $|V_1| = \dots = |V_r|$, $|S| = s$, and there are no edges between $V_i$ and $V_j$ for any $1 \le i < j \le r$. Such a partition is called a balanced $r$-separation of $G$, a set $S$ as above is called a balanced $r$-separator of $G$, and $s_r(G)$ will be referred to as the balanced $r$-separation number of $G$.
Figure 1: (a) A hedgehog on 120 vertices, whose body is coloured red and whose spikes are coloured blue. (b) A hedgehog on 120 vertices, whose body is coloured red and whose spikes are coloured green and blue. (c) A 5-regular "hedgehog" on 84 vertices. Its body is a random 3-regular graph on 12 vertices.
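To make the definition of the balanced $r$-separation number concrete, here is a brute-force sketch that computes it for very small graphs. The helper names (`components`, `balanced_separation_number`) are assumptions made for this illustration. The search tries every candidate separator and every assignment of the remaining connected components to the $r$ parts, so it is exponential and intended only as a sanity check.

```python
from itertools import combinations, product

def components(vertices, edges):
    """Connected components of the graph induced on `vertices`."""
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps

def balanced_separation_number(n, edges, r):
    """Smallest |S| such that V minus S splits into r equal parts with no edges between parts."""
    for s in range(n + 1):
        if (n - s) % r:
            continue
        target = (n - s) // r
        for S in combinations(range(n), s):
            comps = components(set(range(n)) - set(S), edges)
            # Each component must lie inside a single part; try all assignments.
            for assignment in product(range(r), repeat=len(comps)):
                sizes = [0] * r
                for comp, part in zip(comps, assignment):
                    sizes[part] += len(comp)
                if all(size == target for size in sizes):
                    return s
    return n

path5 = [(0, 1), (1, 2), (2, 3), (3, 4)]
cycle6 = [(i, (i + 1) % 6) for i in range(6)]
print(balanced_separation_number(5, path5, 2), balanced_separation_number(6, cycle6, 2))
```

For the path on 5 vertices, deleting the middle vertex leaves two parts of size 2. For the 6-cycle, two opposite vertices must be deleted.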
Balogh, Csaba, Jing and Pluhár [3] observed that the 2-colour spanning-tree discrepancy of $G$ is no larger than $s_2(G) - 1$. This easily generalises to $r$ colours. Indeed, given an $n$-vertex graph $G$ and a partition $V(G) = V_1 \cup \dots \cup V_r \cup S$ as above, consider the $r$-colouring $f : E(G) \to [r]$ defined by assigning colour $i$ to all edges which intersect $V_i$ ($i = 1, \dots, r$), and colouring the edges contained in $S$ arbitrarily. Observe that if $T$ is a spanning tree of $G$, then for every $i \in [r]$, the forest $T[V_i \cup S]$ has at least $|V_i|$ edges touching $V_i$, hence at least $|V_i|$ edges coloured $i$. Since the total number of edges of $T$ is $n - 1$, the size of a maximum colour class is at most $(n - 1) - (r - 1)|V_1| = |S| - 1 + (n - |S|)/r$, hence
$$D_r(G, \mathcal{T}_n) \le (r - 1)(s_r(G) - 1). \qquad (1)$$
Given (1), it is natural to ask to what extent $s_r(G)$ "controls" $D_r(G, \mathcal{T}_n)$. Unfortunately, these two parameters might be arbitrarily far apart. In fact, it is not hard to construct graphs on $n$ vertices with $s_r(G) = \Theta(n)$ but $D_r(G, \mathcal{T}_n) \le 1$. Indeed, consider the following family of graphs. A hedgehog with proportion $r$ on $n$ vertices is a clique ("body") on $n/r$ vertices (assuming $r$ divides $n$), each vertex of which is connected to $r - 1$ distinct vertices outside the clique ("spikes"; see Figs. 1a and 1b). It is not hard to see that any balanced $r$-separator of the hedgehog is of linear size. By colouring its body with the colour $r$ and the $r - 1$ spikes emerging from each vertex of the body by colours $1, \dots, r - 1$, one may verify that the $r$-colour spanning-tree discrepancy of the hedgehog is 1. This construction can be generalised to obtain graphs of large minimum degree (and even regular ones; see Fig. 1c) which still have the property that their balanced $r$-separation number is $\Theta(n)$ while their spanning-tree discrepancy is only $O(1)$. However, a common notable property of all of these examples is that their vertex connectivity is 1, namely, that they are not 2-connected.
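The hedgehog colouring can be checked numerically on a small instance. The sketch below builds a hedgehog with the colouring described above and, for each colour, greedily builds a spanning tree that uses as many edges of that colour as possible (a Kruskal-style greedy, which is optimal for this 0/1 weighting). The function names are illustrative, and the normalisation $r \cdot (\text{largest colour class}) - (n - 1)$ used in the last line is an assumption about how the $r$-colour discrepancy of a colouring is measured, chosen so that it agrees with the 2-colour definition.

```python
def hedgehog(body_size, r):
    """Return (n, edges) for a hedgehog with proportion r; edges are (u, v, colour), colours 1..r."""
    # Body: a clique on body_size vertices, coloured r.
    edges = [(u, v, r) for u in range(body_size) for v in range(u + 1, body_size)]
    next_vertex = body_size
    # Spikes: each body vertex gets r - 1 pendant neighbours, coloured 1..r-1.
    for b in range(body_size):
        for colour in range(1, r):
            edges.append((b, next_vertex, colour))
            next_vertex += 1
    return next_vertex, edges

def max_colour_edges_in_spanning_tree(n, edges, colour):
    """Largest number of `colour` edges in a spanning tree of a connected graph."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    count = 0
    # Process edges of the chosen colour first, then all the others.
    for u, v, c in sorted(edges, key=lambda e: e[2] != colour):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count += (c == colour)
    return count

r = 3
n, edges = hedgehog(4, r)
counts = [max_colour_edges_in_spanning_tree(n, edges, c) for c in range(1, r + 1)]
print(counts, r * max(counts) - (n - 1))
```

On this instance ($n = 12$, $r = 3$) every spike colour appears on exactly $n/r = 4$ tree edges and the body colour on at most 3, so the normalised discrepancy of the colouring is 1.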
Our main result shows that already the (rather weak) requirement of 3-connectivity guarantees that there is a very strong relation between balanced $r$-separations and $r$-colour spanning-tree discrepancy, namely, that these two parameters are a constant factor apart from each other. The same conclusion is valid for 2-connected graphs in which $s_r(G)$ is large enough. These statements are given in the following theorem.
Theorem 1.1. For every $r \ge 2$ there exists $C = C(r) > 0$ such that the following holds. Let $G$ be an $n$-vertex graph satisfying one of the following conditions:
1. $G$ is 3-connected;
2. $G$ is 2-connected and $s_r(G) \ge C\sqrt{n}$.
Then $D_r(G, \mathcal{T}_n) \ge s_r(G)/C$.
Theorem 1.1 can be interpreted as saying that under very mild assumptions, balanced separations are the only obstructions to having large spanning-tree discrepancy.
We also show that the lower-bound condition on s r (G) in the second item of Theorem 1.1 is essentially tight, as there are 2-connected graphs with s r (G) = Ω( √ n) and with D r (G, T n ) ≤ 1.
Proposition 1.2. For every r ≥ 2 there exists c = c(r) > 0 such that for infinitely many integers n, there exists an n-vertex 2-connected graph G with s r (G) ≥ c √ n and D r (G, T n ) ≤ 1.
Theorem 1.1 allows us to determine the spanning-tree discrepancy for many graphs of interest. In particular, it immediately implies all results of [3] concerning the discrepancy of spanning trees (up to constant factors), and generalises them to any number of colours. Several new results can also be obtained. Below we give a representative sample of such corollaries.
When applying Theorem 1.1, we need to be able to lower-bound the balanced $r$-separation number $s_r$ of the graphs in question. As it turns out, it will often be more convenient to lower-bound other graph parameters, which are in turn lower bounds for $s_r$. One such parameter is the following. For a graph $G$, let $\iota(G)$ denote its vertex isoperimetric constant, namely, the minimum of $|N(U)|/|U|$, taken over all sets $U \subseteq V(G)$ with $0 < |U| \le |V(G)|/2$, where $N(U)$ denotes the external neighbourhood of $U$ (namely, the set of vertices outside $U$ which have a neighbour in $U$). It is not hard to see that the balanced $r$-separation number of any graph $G$ is at least linear in its vertex isoperimetric constant. Indeed, given a balanced $r$-separation $V_1 \cup \dots \cup V_r \cup S$ of $G$, the size of $V_1$ is $(n - |S|)/r$ and its neighbourhood is contained entirely in $S$, meaning that $r|S|/(n - |S|) \ge \iota(G)$ and hence $|S| = \Omega(\iota(G) \cdot n)$. Thus, Theorem 1.1 has the following useful corollary:
Corollary 1.3 (Isoperimetry). For every $r \ge 2$ there is $C = C(r)$ such that the following holds. Let $G$ be an $n$-vertex graph and suppose either that $G$ is 3-connected, or that $G$ is 2-connected and $\iota(G)$ [...]
Before proceeding to applications of Corollary 1.3, let us quickly mention another corollary of a similar flavour, stating that highly-connected graphs have large spanning-tree discrepancy. Denote by $\kappa(G)$ the vertex connectivity of $G$.
Corollary 1.4 (Discrepancy vs. vertex-connectivity). For every $r \ge 2$ there is $C = C(r) > 0$ such that for every connected $n$-vertex graph $G$ it holds that $D_r(G, \mathcal{T}_n) \ge \kappa(G)/C$.
Corollary 1.4 follows immediately from Theorem 1.1, since $s_r(G) \ge \kappa(G)$ for every graph $G$. Note that as a special case, Corollary 1.4 implies that if $G$ is a graph on $n$ vertices satisfying $\delta(G) \ge n/2 + k$ for some $k > 0$, then $D_r(G, \mathcal{T}_n) \ge k/C$ for some $C = C(r) > 0$.
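For small graphs the vertex isoperimetric constant can be computed directly from its definition. The following sketch does so by enumerating all vertex sets of size at most $n/2$. The function name is an illustrative assumption, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations

def vertex_isoperimetric_constant(n, edges):
    """min |N(U)| / |U| over nonempty U with |U| <= n/2, where N(U) is the external neighbourhood."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best = None
    for size in range(1, n // 2 + 1):
        for U in combinations(range(n), size):
            U = set(U)
            # Vertices outside U having a neighbour in U.
            boundary = set().union(*(adj[u] for u in U)) - U
            ratio = Fraction(len(boundary), size)
            if best is None or ratio < best:
                best = ratio
    return best

cycle6 = [(i, (i + 1) % 6) for i in range(6)]
print(vertex_isoperimetric_constant(6, cycle6))
```

For the 6-cycle the minimum is attained by three consecutive vertices, whose external neighbourhood has size 2.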
To see that Corollary 1.4 is tight, consider the graph on the vertex set $V$ with the balanced $r$-separation $V = V_1 \cup \dots \cup V_r \cup S$ having $|S| = k \le n/(r + 1)$, and endow the graph with all possible edges except for those connecting $V_i$ to $V_j$ for $i \ne j$. This graph is clearly $k$-connected, but according to (1), its spanning-tree discrepancy is $O(k)$.
To demonstrate the usefulness of Corollary 1.3, let us apply it to estimate the spanning-tree discrepancy of random regular graphs. For an integer $d \ge 3$, let $G_{n,d}$ denote the uniform distribution over the set of all $d$-regular graphs on $n$ vertices (assuming $dn$ is even). Balogh et al. [3, Theorem 3] have shown that $D_2(G_{n,d}, \mathcal{T}_n) = \Theta(n)$ whp. Here we immediately obtain an extension of this result to any number of colours.
For large $r$ we can go a step further, determining the asymptotics, as a function of $r$, of the multiplicative constant appearing in Corollary 1.5. This is stated in the following proposition:
In other words, whp in every $r$-colouring of $E(G)$ there is a spanning tree with at least $\left(\frac{d}{2r} - o\left(\frac{1}{r}\right)\right) n$ edges of the same colour, and the constant $\frac{d}{2r}$ is tight.
Results similar to Corollary 1.5 can be obtained for regular expander graphs. Let $G$ be a $d$-regular $n$-vertex graph, and let $\lambda = \lambda(G)$ be the second largest eigenvalue of its adjacency matrix. It is widely known that a small ratio $\lambda/d$ implies good expansion properties (for a survey we refer the reader to [22]).
We remark that a similar statement holds for d 1 as well by strengthening the assumption to $\lambda \le (1 - \varepsilon)d$ for some $\varepsilon > 0$.
As our next application, we determine the spanning-tree discrepancy of the d-dimensional hypercube, denoted here by Q d .
The derivation of the lower bound in Corollary 1.8 from Theorem 1.1 follows the same lines as the derivation of the preceding corollaries. Naturally, we require estimates for the vertex isoperimetric constant of the hypercube. Such estimates are indeed available [21]. The details are given in Section 4.
For our final application, we let P d k denote the d-dimensional grid on k d vertices (d ≥ 2). Balogh et al. [3,Theorem 1.5] Here we obtain a generalisation of this result to every d ≥ 2 and every number of colours.
Again, the proof is achieved by combining Corollary 1.3 with a suitable isoperimetric inequality. Such an inequality for the grid was given in [8]. The details appear in Section 4. In fact, our methods yield similar results for a much wider family of "grid-like" graphs, such as tori (of dimension d ≥ 2), hexagonal and triangular lattices, etc.
It is natural to ask about the spanning-tree discrepancy of the complete graph $K_n$. Since discrepancy is monotone with respect to adding edges, this is also the maximum possible spanning-tree discrepancy that an $n$-vertex graph can have. As it turns out, the $r$-colour spanning-tree discrepancy of $K_n$ is closely related to a certain parameter $\phi(r)$, defined in terms of covering a complete graph by smaller complete graphs. The definition of $\phi(r)$ is as follows.
Let $\phi(r, n)$ denote the smallest integer $k$ such that there is a covering of the edges of $K_n$ with $r$ cliques of size $k$. In other words, $\phi(r, n)$ is the smallest integer $k$ for which there is a collection of $k$-sets $A_1, \dots, A_r \in \binom{[n]}{k}$ such that every $e \in \binom{[n]}{2}$ is contained in $A_i$ for some $i \in [r]$. This parameter has been studied by Mills [29] and by Horák and Sauer [23]. In both of these works it was shown that the limit $\phi(r) := \lim_{n \to \infty} \phi(r, n)/n$ exists, and its value for several small $r$ was determined; for example, $\phi(2) = 1$, $\phi(3) = \frac{2}{3}$, $\phi(4) = \frac{3}{5}$, $\phi(5) = \frac{5}{9}$, $\phi(6) = \frac{1}{2}$ and $\phi(7) = \frac{3}{7}$ (for the values of $\phi(r)$ for $r \le 13$, see [29]). A trivial counting argument shows that $\phi(r) \ge \frac{1}{\sqrt{r}}$: if $r$ sets of size $k$ cover all pairs in $\binom{[n]}{2}$, then $r\binom{k}{2} \ge \binom{n}{2}$, which gives $k \ge \frac{n}{\sqrt{r}} - o(n)$. On the other hand, the value of $\phi(r)$ is known exactly if $r = p^2 + p + 1$ (for some $p \ge 1$) and a projective plane of order $p$ exists (see [16, Section 7] and the references therein); in particular, by blowing up a projective plane of order $p$ one can show that $\phi(r) = \frac{1 + o_r(1)}{\sqrt{r}}$ for such values of $r$. Using this (and known facts about the existence of projective planes and the distribution of primes), one can show that $\phi(r) = \frac{1 + o_r(1)}{\sqrt{r}}$ for every $r$.
It is not hard to see that for every $n$-vertex graph $G$, one can $r$-colour the edges of $G$ so that no spanning tree of $G$ has more than $\phi(r, n) - 1$ edges of the same colour. Indeed, setting $k = \phi(r, n)$, take $A_1, \dots, A_r \in \binom{[n]}{k}$ as in the definition of $\phi(r, n)$, and colour an edge $e \in E(G)$ with colour $i \in [r]$ if $e \subseteq A_i$ (choosing arbitrarily if there are several such $i$); such an $i$ always exists because $A_1, \dots, A_r$ cover all pairs in $\binom{[n]}{2}$. Observe that in this colouring, every edge of colour $i$ is contained in $A_i$, meaning that any spanning tree of $G$ contains at most $|A_i| - 1 = \phi(r, n) - 1$ edges of colour $i$, as required.
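The covering-based colouring can be tried out on a toy instance. The sketch below uses three 4-sets covering all pairs of a 6-element ground set (an illustrative choice, not claimed to be taken from [29] or [23]), colours each edge of $K_6$ by the first set containing it, and checks that every colour class spans a forest with at most $k - 1 = 3$ edges. All function names are assumptions made for this example.

```python
from itertools import combinations

def covers_all_pairs(n, sets):
    """True if every pair of elements of range(n) lies inside some set."""
    return all(any(u in A and v in A for A in sets)
               for u, v in combinations(range(n), 2))

def colour_by_cover(n, sets):
    """Colour edge {u, v} of K_n by the index of the first set containing both endpoints."""
    return {(u, v): next(i for i, A in enumerate(sets) if u in A and v in A)
            for u, v in combinations(range(n), 2)}

def spanning_forest_size(n, edges):
    """Number of edges in a spanning forest of the graph (range(n), edges)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    size = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            size += 1
    return size

n, k = 6, 4
cover = [{0, 1, 2, 3}, {0, 1, 4, 5}, {2, 3, 4, 5}]
colouring = colour_by_cover(n, cover)
# The most edges of colour i that any spanning tree can use equals the
# size of a spanning forest of the colour-i subgraph.
largest = [spanning_forest_size(n, [e for e, c in colouring.items() if c == i])
           for i in range(len(cover))]
print(covers_all_pairs(n, cover), largest)
```

Here each colour class is connected on its 4-set, so the bound $k - 1$ is attained for every colour.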
In the other direction, one can show that in any r-colouring of the edges of K n , there is a spanning tree with at least (ϕ(r) − o(1))n edges of the same colour (we shall prove this as part of a more general result). The construction in the previous paragraph shows that the constant ϕ(r) is optimal. It would be interesting to obtain an exact result. One might wonder whether the upper bound ϕ(r, n) − 1 is tight, namely, whether it is true that in every r-colouring of E(K n ) there is a spanning tree with ϕ(r, n) − 1 edges of the same colour.
As our next theorem, we show that the spanning-tree discrepancy of graphs with certain expansion properties is essentially as high as that of K n . In other words, we show that the optimal bound (ϕ(r) − o(1))n holds for these graphs as well. The precise notion of expansion is as follows: say that a graph G = (V, E) is a β-graph (for a given β > 0) if there is an edge in G between every pair of disjoint sets U, W ⊆ V with |U |, |W | ≥ β|V | (see, e.g., in [14]). We note that a β-graph G need not be connected, as for example it may have up to β|V (G)| − 1 isolated vertices. Hence, when studying the spanning-tree discrepancy of β-graphs, we need to explicitly assume that the graphs in question are connected. Theorem 1.10 (β-graphs). For every r ≥ 2 and ε > 0 there is β = β(r, ε) > 0 such that every connected n-vertex β-graph G satisfies the following: in any r-colouring of E(G) there is a spanning tree with at least (ϕ(r) − ε) · n edges of the same colour.
In light of the above discussion, Theorem 1.10 can be interpreted as saying that β-graphs essentially achieve the maximum possible r-colour spanning-tree discrepancy of any graph on the same number of vertices.
It is worth noting that a relation of a similar flavour, that is, between a "covering-pairs" type parameter and a multicolour Ramsey-type problem, was demonstrated in [9].

Discrepancy of Hamilton Cycles
Hamilton cycles are among the most well-studied objects in graph theory, boasting many hundreds of papers. Here we study the multicolour discrepancy of Hamilton cycles in dense graphs. One of the main results of [3] establishes that for every $\varepsilon > 0$, every $n$-vertex graph $G$ with minimum degree at least $(\frac{3}{4} + \varepsilon)n$ satisfies $D_2(G, \mathcal{H}_n) = \Omega(n)$, and that moreover, the fraction $\frac{3}{4}$ is best possible. Here we generalise this result to any number of colours.
Theorem 1.11. Let $r \ge 2$ and $0 \le d \le \frac{n}{20r^2}$, and let $G$ be a graph with $n \ge n_0(r)$ vertices and minimum degree at least $\frac{(r+1)n}{2r} + d$. Then in every $r$-colouring of the edges of $G$ there is a Hamilton cycle with at least $\frac{n}{r} + 2d$ edges of the same colour.
We remark that the same result has been very recently independently obtained by Freschi, Hyde, Lada and Treglown [13]. Our proof is completely different from the one given in [13], and gives a slightly better dependence of the bound on $d$.
The bound in Theorem 1.11 is tight, as is shown by the following example. Let $G$ be the graph on the vertex set $V = V_1 \cup \dots \cup V_r$, where $|V_1| = \dots = |V_{r-1}| = \frac{n}{2r}$ and $|V_r| = \frac{(r+1)n}{2r}$, in which $V_r$ is a clique, every vertex of $V_1 \cup \dots \cup V_{r-1}$ is adjacent to every vertex of $V_r$, and there are no other edges; we colour the edges between $V_i$ and $V_r$ in colour $i$, and the rest of the edges in colour $r$ (see Fig. 2). It is easy to see that any Hamilton cycle in $G$ has two edges touching every vertex in any $V_i$, $i = 1, \dots, r - 1$, and these edges are distinct for distinct vertices. Thus, the number of edges in any colour is exactly $\frac{n}{r}$. In the same construction, assuming $n$ is even, every perfect matching has exactly $\frac{n}{2r}$ edges touching $V_i$ ($i = 1, \dots, r - 1$), and thus coloured $i$; this leaves exactly $\frac{n}{2r}$ edges for colour $r$. On the other hand, looking again at Theorem 1.11, under its assumptions we are guaranteed to find a Hamilton cycle with at least one biased colour (colour 1, say), namely, in which at least $\frac{n}{r} + 2d$ edges are coloured 1. If $n$ is even, this Hamilton cycle can be decomposed into two perfect matchings, at least one of which will be biased in colour 1. In conclusion, Theorem 1.11 and the described extremal example yield a similar optimal result for perfect matchings:
Corollary 1.12. Let $r \ge 2$ and $0 \le d \le \frac{n}{20r^2}$, and let $G$ be a graph with $n \ge n_0(r)$ vertices and minimum degree at least $\frac{(r+1)n}{2r} + d$. Then in every $r$-colouring of the edges of $G$ there is a perfect matching with at least $\frac{n}{2r} + d$ edges of the same colour.
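The extremal example can be verified exhaustively for a very small instance. The sketch below assumes the following reading of the construction: parts $V_1, \dots, V_{r-1}$ are independent sets of size $n/(2r)$, the part $V_r$ is a clique containing the remaining vertices, all edges between $V_i$ and $V_r$ are present and coloured $i$, and edges inside $V_r$ are coloured $r$. These details, like the function names, are assumptions made for illustration. The code enumerates all Hamilton cycles for $n = 8$, $r = 2$ and records the colour counts that occur.

```python
from itertools import permutations

def extremal_example(n, r):
    """Edge colouring {(u, w): colour} of the assumed extremal graph; colours are 1..r."""
    t = n // (2 * r)
    part = {}
    v = 0
    for i in range(1, r):          # small parts V_1, ..., V_{r-1}
        for _ in range(t):
            part[v] = i
            v += 1
    while v < n:                   # remaining vertices form V_r
        part[v] = r
        v += 1
    colour = {}
    for u in range(n):
        for w in range(u + 1, n):
            pu, pw = part[u], part[w]
            if pu == r and pw == r:
                colour[(u, w)] = r
            elif pu == r or pw == r:
                colour[(u, w)] = min(pu, pw)
            # No edge when both endpoints lie in small parts.
    return colour

def colour_counts_of_hamilton_cycles(n, r, colour):
    """Set of colour-count vectors realised by Hamilton cycles (brute force)."""
    seen = set()
    last = n - 1                   # a vertex of V_r, used as a fixed starting point
    for perm in permutations(range(n - 1)):
        cycle = (last,) + perm
        edges = [tuple(sorted((cycle[k], cycle[(k + 1) % n]))) for k in range(n)]
        if all(e in colour for e in edges):
            seen.add(tuple(sum(colour[e] == c for e in edges)
                           for c in range(1, r + 1)))
    return seen

print(colour_counts_of_hamilton_cycles(8, 2, extremal_example(8, 2)))
```

In this instance every Hamilton cycle uses exactly $n/r = 4$ edges of each colour, in line with the counting argument above.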
Organisation In Section 2 we prove Theorem 1.1 and Proposition 1.2. Theorem 1.6 is proved in Section 3, and in Section 4 we give the full details of the proofs of Corollaries 1.8 and 1.9. Theorem 1.10 is proved in Section 5, and finally in Section 6 we establish Theorem 1.11.
Notation and terminology Let G = (V, E) be a graph. For two vertex sets U, W ⊆ V we denote by E G (U ) the set of edges of G spanned by U and by E G (U, W ) the set of edges having one endpoint in U and the other in W . The degree of a vertex v ∈ V is denoted by d G (v), and we write d G (v, U ) = |E G ({v}, U )|. We let δ(G) and ∆(G) denote the minimum and maximum degrees of G. When the graph G is clear from the context, we may omit the subscript G in the notations above.
For the sake of simplicity and clarity of presentation, we often make no particular effort to optimise the constants obtained in our proofs, and omit floor and ceiling signs whenever they are not crucial.

Proof of Theorem 1.1 and Proposition 1.2
The goal of this section is to prove Theorem 1.1 and Proposition 1.2, starting with the former. Let us introduce some definitions and terminology which will be used in the proof. Let r ≥ 2, let G be a graph, and let f : E(G) → [r] be an r-colouring of the edges of G. For each 1 ≤ i ≤ r, let G i be the spanning graph of G consisting of the edges of colour i. Connected components of G i will be called colour-i components. We use C i to denote the set of all colour-i components. For a vertex v ∈ V (G), we denote by C i (v) the unique colour-i component containing v. Crucially, define H = H r (G, f ) to be the r-partite r-uniform multi-hypergraph with sides C 1 , . . . , C r , where for each v ∈ V (G) we add the hyperedge (C 1 (v), . . . , C r (v)) ∈ C 1 × · · · × C r (see Fig. 3). Note that |E(H)| = |V (G)|, and that d H (C) = |C| for every C ∈ V (H). In what follows, we will denote vertices of H by capital letters, while vertices of G will be denoted by lowercase letters. For a vertex v ∈ V (G), we will denote by e v the hyperedge of H corresponding to v; and vice versa, for a hyperedge e ∈ E(H), we will denote by v e the corresponding vertex of G. It turns out that the hypergraph construction H r (G, f ) is precisely what is needed to prove Theorem 1.1. It is worth noting that this construction has already been used in prior works, see e.g. [17], as well as the survey [20].
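The construction of $H_r(G, f)$ is easy to carry out mechanically. The following sketch computes the colour components with a union-find structure and returns one hyperedge per vertex of $G$. Colours are indexed from 0 in the code, a vertex of the hypergraph is represented as a pair (colour, component representative), and all names are illustrative assumptions.

```python
from collections import Counter

def colour_components(n, coloured_edges, colour):
    """For each vertex, a representative of its component in the subgraph of the given colour."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, c in coloured_edges:
        if c == colour:
            parent[find(u)] = find(v)
    return [find(v) for v in range(n)]

def build_hypergraph(n, coloured_edges, r):
    """Hyperedge e_v = (C_1(v), ..., C_r(v)) for every vertex v of G."""
    comp = [colour_components(n, coloured_edges, i) for i in range(r)]
    return [tuple((i, comp[i][v]) for i in range(r)) for v in range(n)]

# A path 0-1-2-3 whose edges are coloured 0, 1, 0 (toy example).
path = [(0, 1, 0), (1, 2, 1), (2, 3, 0)]
H = build_hypergraph(4, path, 2)
degree = Counter(x for e in H for x in e)
print(len(H), dict(degree))
```

The number of hyperedges equals the number of vertices of $G$, and the degree of each component in the hypergraph equals its size, matching the two observations made above.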
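A walk in a hypergraph $H$ is a sequence of vertices $v_1, \dots, v_k$ such that for every $1 \le i \le k - 1$, the vertices $v_i$ and $v_{i+1}$ are contained together in some hyperedge of $H$. We say that $H$ is connected if there is a walk between any given pair of vertices.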
Lemma 2.1. If $G$ is connected, then $H = H_r(G, f)$ is connected.
Proof. Let $X, Y \in V(H)$, and fix vertices $x \in X$ and $y \in Y$ of $G$. Since $G$ is connected, it contains a path $z_0 = x, z_1, \dots, z_k = y$ between $x$ and $y$. For each $0 \le j \le k - 1$, let $i_j$ be the colour of the edge $\{z_j, z_{j+1}\} \in E(G)$, and let $Z_j$ be the colour-$i_j$ component containing this edge. Observe that for every $0 \le j \le k - 2$, either $\{z_j, z_{j+1}\}$ and $\{z_{j+1}, z_{j+2}\}$ have the same colour, in which case $Z_j = Z_{j+1}$, or $Z_j$ and $Z_{j+1}$ are contained together in a hyperedge of $H$, namely the hyperedge corresponding to the vertex $z_{j+1} \in Z_j \cap Z_{j+1}$. Similarly, the hyperedge $e_x$ contains both $X$ and $Z_0$, and the hyperedge $e_y$ contains both $Y$ and $Z_{k-1}$ (it is possible that $X = Z_0$ or $Y = Z_{k-1}$). It is now easy to see that $X, Z_0, \dots, Z_{k-1}, Y$ is a walk in $H$ between $X$ and $Y$, as required.
We now introduce some additional definitions related to the hypergraph H = H r (G, f ). Throughout this section, we assume that G is connected, which in turn implies that H is connected as well (by Lemma 2.1). A leaf in H is a hyperedge which contains only one vertex of degree at least 2 (note that since H is connected, every hyperedge must contain at least one such vertex). Let H 0 = H r 0 (G, f ) be the subhypergraph of H obtained by deleting, for each leaf e of H, all (r − 1) vertices of e which have degree 1 (in particular, we delete the hyperedge e). Note that deleting a leaf from a connected hypergraph leaves it connected, and hence H 0 is connected. Next, we show that the 2-connectedness of G translates into H 0 having no leaves.
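The passage from $H$ to $H_0$ can be sketched as follows. Hyperedges are given as tuples of vertex labels; a leaf is a hyperedge with exactly one vertex of degree at least 2, and $H_0$ keeps the non-leaf hyperedges together with all vertices other than the degree-1 vertices of leaves. The function name and the toy hypergraph (which arises from a star with three leaves, edges coloured with two colours) are assumptions made for this illustration.

```python
from collections import Counter

def leaves_and_core(hyperedges):
    """Return (leaves of H, hyperedges of H_0, vertices of H_0)."""
    degree = Counter(x for e in hyperedges for x in e)
    is_leaf = [sum(degree[x] >= 2 for x in e) == 1 for e in hyperedges]
    leaves = [e for e, leaf in zip(hyperedges, is_leaf) if leaf]
    core_edges = [e for e, leaf in zip(hyperedges, is_leaf) if not leaf]
    # Delete the degree-1 vertices of every leaf; all other vertices survive.
    removed = {x for e in leaves for x in e if degree[x] == 1}
    core_vertices = set(degree) - removed
    return leaves, core_edges, core_vertices

# Toy example: A, B are colour-1 components and C, D, E are colour-2 components.
H = [("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")]
leaves, core_edges, core_vertices = leaves_and_core(H)
print(leaves, core_edges, sorted(core_vertices))
```

In this example the underlying graph is a star, which is not 2-connected, and indeed the single hyperedge of $H_0$ has no vertex of degree at least 2 in $H_0$.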
Lemma 2.2. Suppose that $G$ is 2-connected. Then $H_0 = H_0^r(G, f)$ has the following property: every hyperedge $e \in E(H_0)$ has at least two vertices whose degree in $H_0$ is at least 2.
Proof. Suppose, by contradiction, that e ∈ E(H 0 ) has only a single vertex whose degree in H 0 is at least 2, and denote this vertex by X. Let v = v e be the vertex of G corresponding to e; so e = (C 1 (v), . . . , C r (v)) and hence X = C i (v) for some 1 ≤ i ≤ r. Since e ∈ E(H 0 ), the hyperedge e is not a leaf in H. Hence, there must be Y ∈ e \ {X} such that d H (Y ) ≥ 2 (while d H 0 (Y ) = 1). Now, the definition of H 0 implies that all hyperedges of H containing Y , apart from e, are leaves (in H).
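Let $U$ be the set of vertices $u \in V(G)$ such that $Y \in e_u$ and $e_u \ne e$; as mentioned above, these conditions imply that $e_u$ is a leaf (in $H$). Note that $U \ne \emptyset$ because $d_H(Y) \ge 2$. Set $W := V(G) \setminus (U \cup \{v\})$.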
We now observe that $W \ne \emptyset$ as well. To see this, fix any hyperedge $e' \in E(H) \setminus \{e\}$ which contains $X$ (such a hyperedge has to exist as $d_H(X) \ge 2$), and let $w \in V(G)$ be such that $e' = e_w$. We claim that $w \in W$. Indeed, if, by contradiction, we had $w \in U$, then it would be the case that $Y \in e'$, which in turn would mean that $e'$ is not a leaf of $H$ (as both $X, Y \in e'$ have degree at least 2 in $H$). However, this would then contradict the fact that all hyperedges of $H$ containing $Y$, other than $e$, are leaves. So we see that $w \in W$, and hence $W \ne \emptyset$.
We will now show that in $G$ there are no edges between $U$ and $W$. This will stand in contradiction to the assumption that $G$ is 2-connected, thus completing the proof of the lemma. Suppose then, for the sake of contradiction, that there are $u \in U$, $w \in W$ with $\{u, w\} \in E(G)$, and let $j \in [r]$ be the colour of $\{u, w\}$. Then $\{u, w\} \subseteq C_j(u)$ and hence $|C_j(u)| \ge 2$. Now, as $d_H(Z) = |Z|$ for every $Z \in V(H)$, we have that $d_H(C_j(u)) \ge 2$. As $C_j(u) \in e_u$ and $e_u$ is a leaf (as $u \in U$), it must be the case that $C_j(u) = Y$. Hence also $C_j(w) = Y$, meaning that $Y \in e_w$. It now follows that either $e_w$ is a leaf, in which case $w \in U$, or $e_w = e = e_v$, in which case $w = v$. In both cases $w \notin W$, a contradiction.
For $X \in V(H_0)$, denote by $L(X)$ the set of leaves of $H$ in which the unique vertex of degree at least 2 is $X$. If $G$ is 3-connected, then $H$ additionally possesses the following useful property:
Lemma 2.3. Suppose that $G$ is 3-connected, and let $X \in V(H_0)$ be such that $L(X) \ne \emptyset$. Then either $d_{H_0}(X) \ge 3$, or $X$ is contained in every hyperedge of $H$.
Proof. The proof is similar to that of Lemma 2.2. Suppose, for the sake of contradiction, that $d_{H_0}(X) \le 2$ and that there is a hyperedge of $H$ which does not contain $X$. Let $e_1, e_2 \in E(H_0)$ be the only hyperedges of $H_0$ containing $X$ (where possibly $e_1 = e_2$), and let $v_1, v_2 \in V(G)$ be the vertices of $G$ corresponding to $e_1, e_2$.
Let $U$ be the set of $u \in V(G)$ such that $e_u \in L(X)$, and note that $U \ne \emptyset$ as $L(X) \ne \emptyset$ by assumption. Set $W := V(G) \setminus (U \cup \{v_1, v_2\})$. Observe that $W$ is precisely the set of $w \in V(G)$ such that $X \notin e_w$, since every hyperedge which contains $X$ is either in $E(H_0)$ (and hence equals either $e_{v_1}$ or $e_{v_2}$) or belongs to $L(X)$. It follows that $W \ne \emptyset$, since by assumption there is a hyperedge of $H$ which does not contain $X$. We now show that in $G$ there are no edges between $U$ and $W$, which would mean that $\{v_1, v_2\}$ is a separator of $G$, in contradiction to the assumption that $G$ is 3-connected. So let $u \in U$, $w \in W$, and suppose, by contradiction, that $\{u, w\} \in E(G)$. Let $j \in [r]$ be the colour of $\{u, w\}$. Then $\{u, w\} \subseteq C_j(u) = C_j(w)$ and hence $|C_j(u)| \ge 2$. Now, as $C_j(u) \in e_u$, and as $e_u$ is a leaf in which the only vertex of degree at least 2 is $X$ (by the definition of $U$), it must be the case that $X = C_j(u)$. However, as $C_j(u) = C_j(w)$, we get that $X = C_j(w) \in e_w$, in contradiction to $w \in W$.
We are now ready to prove Theorem 1.1.
Proof of Theorem 1.1. Let $G$ be a 2-connected graph on $n$ vertices. It will be convenient to prove the theorem in the following (perhaps slightly convoluted) form: $s_r(G) = O(rd + r^2)$, where $d$ is defined as:
We may and will assume that $d \le cn$ for some suitable $c = c(r) > 0$. Since $D_r(G, \mathcal{T}_n) \le d$, there exists an $r$-colouring of the edges of $G$ in which there is no spanning tree with more than $\frac{n-1+d}{r}$ edges of the same colour. Fixing one such colouring $f$, we claim that $|C_i| \ge \frac{(r-1)n-d}{r}$ for every $1 \le i \le r$ (recall that $C_i$ is the set of colour-$i$ components). Indeed, by taking a spanning tree of each colour-$i$ component, and connecting these spanning trees using edges of other colours (this is possible because $G$ is connected), we obtain a spanning tree of $G$ with $n - |C_i|$ edges of colour $i$. Thus, our assumption implies that $n - |C_i| \le \frac{n-1+d}{r}$, and hence $|C_i| \ge \frac{(r-1)n-d}{r}$. Consequently,
$$|V(H)| = |C_1| + \dots + |C_r| \ge (r-1)n - d. \qquad (2)$$
Observe that since $H_0$ is obtained from $H$ by deleting leaves, we have $|V(H_0)| = |V(H)| - (r-1) \cdot (|E(H)| - |E(H_0)|) = |V(H)| - (r-1)n + (r-1) \cdot |E(H_0)|$. Now, using (2), we get:
$$|V(H_0)| \ge (r-1) \cdot |E(H_0)| - d. \qquad (3)$$
Recalling the statement of Lemma 2.3, we now observe that the latter option in the conclusion of the lemma is impossible, as it would imply that one of the parts $C_1, \dots, C_r$ contains just one vertex (namely, $X$), in contradiction to the fact that $|C_i| \ge \frac{(r-1)n-d}{r} > 1$ for every $1 \le i \le r$. Hence, we have the following:
Claim 2.4. If $G$ is 3-connected, then $d_{H_0}(X) \ge 3$ for every $X \in V(H_0)$ with $L(X) \ne \emptyset$.
Next, we show that by omitting $O(d)$ hyperedges, one can obtain a spanning subhypergraph of $H_0$ in which all vertex degrees are not larger than 2, and every hyperedge contains at least $r - 2$ vertices of degree 1 (and hence at most 2 vertices of degree 2). It is easy to see that a hypergraph with these properties is a disjoint union of loose paths and cycles. For the application of this claim, it will in fact be convenient to not only make sure that the maximum degree of $H_1$ is at most 2, but also that every vertex $X$ with $d_{H_0}(X) \ge 3$ is isolated in $H_1$.
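Claim 2.5. There exists a spanning subhypergraph $H_1$ of $H_0$ with $|E(H_1)| \ge |E(H_0)| - 8d$, having the following properties:
1. The maximum degree of $H_1$ is at most 2, and every vertex $X$ with $d_{H_0}(X) \ge 3$ is isolated in $H_1$.
2. Every hyperedge of $H_1$ contains at least $r - 2$ vertices of degree 1.
Proof. Let $A$ be the set of vertices of $H_0$ whose degree in $H_0$ is 1. By Lemma 2.2, every hyperedge of $H_0$ contains at most $r - 2$ vertices of $A$, and hence
$$|A| \le (r-2) \cdot |E(H_0)|. \qquad (4)$$
On the other hand, as $H_0$ has no isolated vertices,
$$r \cdot |E(H_0)| = \sum_{X \in V(H_0)} d_{H_0}(X) \ge |A| + 2(|V(H_0)| - |A|) = 2|V(H_0)| - |A|.$$
By combining the above with (3), we obtain $|A| \ge 2|V(H_0)| - r \cdot |E(H_0)| \ge (r-2) \cdot |E(H_0)| - 2d$. Using (4), we now get that
$$\sum_{e \in E(H_0)} \left( r - 2 - |e \cap A| \right) = (r-2) \cdot |E(H_0)| - |A| \le 2d,$$
where every term of the sum is non-negative. Hence, $H_0$ contains at most $2d$ hyperedges which contain more than 2 vertices of degree at least 2. Let $E_1$ be the set of such hyperedges, and note that $|E_1| \le 2d$.
Next, let us handle high-degree vertices. For each $i \ge 1$, let $m_i$ be the number of vertices of $H_0$ whose degree in $H_0$ is $i$. Then
$$\sum_{i \ge 1} i \cdot m_i = r \cdot |E(H_0)| \qquad (5)$$
and
$$\sum_{i \ge 1} m_i = |V(H_0)| \ge (r-1) \cdot |E(H_0)| - d, \qquad (6)$$
where the inequality is (3). Subtracting (6) from (5), we obtain
$$\sum_{i \ge 1} (i-1) \cdot m_i \le |E(H_0)| + d. \qquad (7)$$
Next, note that $m_1 = |A|$ and hence $m_1 \le (r-2) \cdot |E(H_0)|$ by (4). Combining this with (6), we see that $\sum_{i \ge 2} m_i \ge |E(H_0)| - d$. Now, subtracting this inequality from (7), we obtain
$$\sum_{i \ge 3} (i-2) \cdot m_i \le 2d. \qquad (8)$$
From (8) it follows, in particular, that $\sum_{i \ge 3} m_i \le 2d$. Multiplying this inequality by two and adding the result to (8), we get that
$$\sum_{i \ge 3} i \cdot m_i \le 6d.$$
Now observe that $\sum_{i \ge 3} i \cdot m_i$ is an upper bound on the number of hyperedges of $H_0$ which contain a vertex of degree at least 3. Let $E_2$ be the set of such hyperedges (so $|E_2| \le 6d$). By deleting from $H_0$ the hyperedges in $E_1 \cup E_2$, we obtain a hypergraph satisfying Items 1-2 in the claim. Furthermore, the number of deleted hyperedges is at most $8d$, as required.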
For $1 \le i \le r$, define $L_i := \bigcup_{X \in C_i \cap V(H_0)} L(X)$. In other words, $L_i$ is the set of leaves of $H$ whose (unique) vertex of degree at least 2 belongs to $C_i$. Note that
$$n = |E(H)| = |E(H_0)| + |L_1| + \dots + |L_r|. \qquad (9)$$
Claim 2.6. $|L_i| \le \frac{n+d}{r}$ for every $1 \le i \le r$.
Proof. Let $1 \le i \le r$. For $j \in [r] \setminus \{i\}$, let $U_i^j$ be the set of vertices $X \in C_i$ which belong to some leaf in $L_j$. Observe that $|U_i^j| = |L_j|$ for every $j \in [r] \setminus \{i\}$, and that the sets $(U_i^j : j \in [r] \setminus \{i\})$ are pairwise-disjoint and their union is $C_i \setminus V(H_0)$. Hence $|C_i| - |C_i \cap V(H_0)| = \sum_{j \ne i} |L_j|$. By combining this with (9), we get
$$|L_i| = n - |E(H_0)| - |C_i| + |C_i \cap V(H_0)|.$$
Now, the fact that $H_0$ has no isolated vertices (as it is connected) implies that $|C_i \cap V(H_0)| \le |E(H_0)|$. Furthermore, we know that $|C_i| \ge \frac{(r-1)n-d}{r}$. Plugging these facts into the above equality, we obtain that indeed $|L_i| \le n - |C_i| \le \frac{n+d}{r}$.
To complete the proof, we need to find a balanced r-separator of G of size O(rd + r 2 ). In the following claim, we essentially achieve this task by finding a partition of the edges of H which (roughly) corresponds to such a separator. We then explain how to conclude using this claim.
2. e i ∩ e j = ∅ for every 1 ≤ i < j ≤ r and e i ∈ E i , e j ∈ E j .
Proof. It will be convenient to construct the sets E 1 , . . . , E r , F gradually, i.e. by placing various elements in one of these sets at certain stages in the proof. The final sets E 1 , . . . , E r , F form the required partition. Let H 1 be a spanning subhypergraph of H 0 satisfying the assertion of Items 1-2 in Claim 2.5. Put all hyperedges in E(H 0 ) \ E(H 1 ) into F . Claim 2.5 guarantees that there are at most 8d such hyperedges.
Let $M$ be the set of vertices In particular, if $G$ is not 3-connected, then our choice of $d$ for that case guarantees that $|M| < d$. For each $1 \le i \le r$ and $X \in C_i \cap M$, place into $E_i$ all elements of $L(X)$. Then at the moment we have $E_i \subseteq L_i$, and hence $|E_i| \le |L_i| \le \frac{n+d}{r}$ by Claim 2.6. Moreover, the current $E_1, \dots, E_r$ satisfy the assertion of Item 2 because for every $1 \le i < j \le r$, no two hyperedges $e_i \in L_i$, $e_j \in L_j$ intersect.
Let $H_2$ be the subhypergraph of $H_1$ obtained by deleting from it all vertices belonging to $M$. Put in $F$ all hyperedges of $H_1$ which touch vertices of $M$. Since the maximum degree of $H_1$ is at most 2 (by Item 1 in Claim 2.5), the number of such hyperedges is at most $2|M|$, which is less than $2d$ in the case that $G$ is not 3-connected. If, on the other hand, $G$ is 3-connected, then there are no hyperedges of $H_1$ whatsoever which touch vertices of $M$. Indeed, this follows from Claim 2.4, which implies that if $X \in M$ then $d_{H_0}(X) \ge 3$, and Item 1 of Claim 2.5, which guarantees that $d_{H_1}(X) = 0$ for each such $X$. We conclude that in any case, the number of hyperedges added to $F$ at this step is less than $2d$, and hence $|F| = O(d)$ at this moment.
Since $H_2$ is a subhypergraph of $H_1$, it also satisfies the assertion of Items 1-2 in Claim 2.5. As mentioned above, this means that $H_2$ is a disjoint union of loose paths and cycles (some of which may be isolated vertices). Let $P_1, \dots, P_\ell$ be the connected components of $H_2$ (each being a loose path or cycle). We now go over $P_1, \dots, P_\ell$ in some order and, when processing $P_k$, do as follows. Let $X_1, \dots, X_t$ be the vertices of $P_k$, ordered so that each hyperedge of $P_k$ is a (possibly cyclic) interval in this order (such an ordering exists since $P_k$ is a loose path or cycle). Fix $1 \le i \le r$ such that $|E_i| \le \frac{n}{r}$ at this moment (such an $i$ must exist, as $E_1, \dots, E_r \subseteq E(H)$ are disjoint and $|E(H)| = n$). Let $j$ be the largest integer $1 \le j \le t$ with the property that adding to $E_i$ all hyperedges in $E' := E(\{X_1, \dots, X_j\}) \cup L(X_1) \cup \dots \cup L(X_j)$ does not increase the size of $E_i$ beyond $\frac{n+d}{r} + 2$. Here $E(\{X_1, \dots, X_j\})$ denotes the set of edges of $P_k$ contained in $\{X_1, \dots, X_j\}$. Observe that $j$ is well-defined, because $|L(X_1)| \le d/r$ (as $X_1 \notin M$), and because $i$ was chosen so that $|E_i| \le \frac{n}{r}$ before this step. If $j = t$, in which case we added to $E_i$ all edges in $E(P_k) \cup \bigcup_{X \in P_k} L(X)$, then we simply continue to the next connected component. Suppose now that $j < t$. Then after placing $E'$ into $E_i$, the size of $E_i$ must exceed $\frac{n}{r}$, because otherwise we could also add to $E_i$ the set $L(X_{j+1})$ and all (at most 2) hyperedges in $E(\{X_1, \dots, X_{j+1}\})$ containing $X_{j+1}$, of which there are altogether at most $|L(X_{j+1})| + 2 \le d/r + 2$, in contradiction to the maximality of $j$. We will say that $E_i$ is saturated whenever $|E_i| > \frac{n}{r}$. Place in $F$ any hyperedges $e \in E(P_k)$ satisfying $e \cap \{X_1, \dots, X_j\} \ne \emptyset$ and $e \cap \{X_{j+1}, \dots, X_t\} \ne \emptyset$, of which there are at most 2, and put $P_k[\{X_{j+1}, \dots, X_t\}]$ into the list of connected components to be processed.
The fact that we remove such hyperedges $e$ guarantees that the assertion of Item 2 will be satisfied. Note that if $E_i$ becomes saturated, then no more hyperedges will be added to it at any later stage. Since each $E_i$ ($1 \le i \le r$) can become saturated only once, the overall number of edges added to $F$ in this process is at most $2r$. Hence, at the end of the process we have $|F| \le O(d) + 2r = O(d + r)$. This completes the proof of the claim.
We now complete the proof using Claim 2.7. Let $E(H) = E_1 \cup \dots \cup E_r \cup F$ be a partition satisfying Items 1-2 in that claim. We have $\min\{|E_1|, \dots, |E_r|\} \ge n - (r-1) \cdot \left(\frac{n+d}{r} + 2\right) - |F| = \frac{n}{r} - O(d + r)$. Hence, by moving at most $O(rd + r^2)$ elements from $E_1, \dots, E_r$ to $F$, we may assume that $|E_1| = \dots = |E_r|$. Now, put $V_i := \{v_e : e \in E_i\}$ (for $1 \le i \le r$) and $S := \{v_e : e \in F\}$. Then $|S| = |F| = O(rd + r^2)$ and $|V_1| = \dots = |V_r|$. We claim that $S$ is a separator of $G$. To see this, suppose by contradiction that there are But this contradicts Item 2 in Claim 2.7, as $e_{v_i} \in E_i$ and $e_{v_j} \in E_j$. It follows that $S$ is a balanced $r$-separator of $G$ of size $O(rd + r^2)$, as required. This completes the proof.
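The arithmetic behind the bound on $\min\{|E_1|, \dots, |E_r|\}$ can be spelled out as follows. This is only a sketch of the calculation; it assumes that Item 1 of Claim 2.7 bounds each class by $|E_i| \le \frac{n+d}{r} + 2$ (as the displayed estimate suggests) and uses $|F| = O(d + r)$ from the proof of the claim.

\[
n - (r-1)\left(\frac{n+d}{r} + 2\right) - |F|
= \frac{n}{r} - \frac{(r-1)d}{r} - 2(r-1) - |F|
\ge \frac{n}{r} - d - 2r - |F|
= \frac{n}{r} - O(d + r).
\]

Under the same assumption, every class exceeds the smallest one by at most $\frac{d}{r} + 2 + O(d + r) = O(d + r)$ elements, so levelling all $r$ classes moves at most $r \cdot O(d + r) = O(rd + r^2)$ elements into $F$.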

Tightness: Proof of Proposition 1.2
The goal of this section is to show that the lower bound on the balanced $r$-separation number of the graph that appears in Theorem 1.1 is essentially tight. This is achieved by proving Proposition 1.2. To this end, we shall construct an $n$-vertex graph $G$ with $s_r(G) = \Omega(\sqrt{n})$ and $D_r(G, T_n) \le 1$. The graph will be a clique cycle, namely, a cycle (of length $\Theta(\sqrt{n})$) with a disjoint clique attached to each of its edges (see Fig. 4). Such graphs are obviously 2-connected. We will have to choose the sizes of the hanging cliques carefully; indeed, if the cliques are of equal sizes, say, then one could easily construct a balanced $r$-separator of size $O(r)$. We will also have to make sure that the clique sizes allow a balanced colouring that will guarantee small discrepancy.
For the first task (i.e., forcing a large balanced $r$-separator) we shall use the next technical lemma. For integers $0 \le x < k$ and $0 \le i < k$ let $a^{k,x}_i = x$ if $i < x$ and $a^{k,x}_i = k + x$ otherwise. One can easily verify that for every $0 \le x < k$ we have $\sum_{i=0}^{k-1} a^{k,x}_i = k^2$. Fix $r \ge 2$. Let $\mathbf{x} = (x_1, \dots, x_r)$ be a vector of strictly increasing nonnegative integers, and let $k > x_r$ (so $k \ge r$). Write $R := \max\{r, x_r\}$. Set $\mu = \mu(\mathbf{x}) := \frac{1}{r} \sum_{j=1}^{r} x_j$ and assume that $\mathbf{x}$ is such that $\mu$ is not an integer. For $0 \le i < rk$ write $i = q_i r + t_i$ for $0 \le q_i < k$ and $0 \le t_i < r$, and set For $I \subseteq \{0, 1, \dots, rk-1\}$ write $\Sigma I = \sum_{i \in I} b^{k,\mathbf{x}}_i$ for the "sum" of $I$, and let $D(I) = \Sigma I - k^2$ denote its "discrepancy", namely, the deviation of its sum from the "mean" $k^2$. Let $C(I)$ denote the number of connected components of the subgraph of the cycle $C_{rk}$ spanned by the vertex set $I$, that is, the number of disjoint consecutive (cyclic) intervals of $I$ in $\mathbb{Z}/(rk)\mathbb{Z}$. The following lemma shows that every index set either spans many disjoint intervals or has large discrepancy.
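The two elementary quantities defined above can be checked on small cases. The following is a minimal sketch, not part of the original argument: the function names `a`, `row_sum` and `cyclic_components` are hypothetical, and the sequence $b^{k,\mathbf{x}}_i$ is deliberately not modelled. It verifies the identity $\sum_{i=0}^{k-1} a^{k,x}_i = k^2$ and counts the cyclic intervals $C(I)$ of an index set.

```python
def a(k, x, i):
    """a^{k,x}_i: equals x for i < x and k + x otherwise (0 <= x < k, 0 <= i < k)."""
    return x if i < x else k + x


def row_sum(k, x):
    """Sum of a^{k,x}_i over 0 <= i < k; the text asserts this equals k^2."""
    return sum(a(k, x, i) for i in range(k))


def cyclic_components(I, m):
    """C(I): number of maximal runs of consecutive residues of I in Z/mZ."""
    I = set(I)
    if len(I) == m:
        return 1  # the whole cycle is a single component
    # Each run has exactly one element whose cyclic predecessor is not in I.
    return sum(1 for i in I if (i - 1) % m not in I)
```

The identity holds because the $x$ small terms contribute $x^2$ and the remaining $k - x$ terms contribute $(k - x)(k + x) = k^2 - x^2$.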
hence $D(I) + C(I) \cdot R^2 \ge k/r - R^2$. To conclude, we have
Assume from now that $k \gg R$ (more precisely, $R$ is constant and $k \to \infty$). We proceed by constructing a clique cycle $G$ on $n = rk^2 + rk = \Theta_r(k^2)$ vertices with $s_r(G) = \Omega_r(\sqrt{n})$ and $D_r(G, T_n) \le 1$. The construction depends on a vector $\mathbf{x} = (x_1, \dots, x_r)$ whose entries are strictly increasing nonnegative integers and for which $\mu = \mu(\mathbf{x})$ is not an integer. Start with a vertex set $W = \{w_0, \dots, w_{rk-1}\}$ of size $rk$ (the cycle), and let $A_0, \dots, A_{rk-1}$ be vertex sets with the following properties: (a) $A_i \cap W = \{w_i, w_{i+1}\}$ for every $0 \le i < rk$ (here and in the rest of this section, indices are taken modulo $rk$); (10), $|V(G)| = rk^2 + rk$), and $G$ is obtained by letting each $A_i$ ($0 \le i < rk$) be a clique, with no further edges (see Fig. 4). As mentioned earlier, $G$ is 2-connected. The proof of Proposition 1.2 will follow from the next two claims.
Proof. Let $V(G) = V_1 \cup \dots \cup V_r \cup S$ be a balanced $r$-separation and let $s := |S|$. For $j \in [r]$ set $I_j$ to be the set of indices $0 \le i < rk$ such that $A_i \cap V_j \ne \emptyset$ (and hence $A_i \subseteq V_j \cup S$). We may assume without loss of generality that $\Sigma I_j = \sum_{i \in I_j} |A_i|$ is maximised for $j = 1$, and thus $\Sigma I_1 \ge k^2$ and
(a) A 2-coloured clique cycle on $84 = 2 \cdot 6^2 + 2 \cdot 6$ vertices, constructed using $\mathbf{x} = (0, 1)$.
Since $C(I_1) \le 1 + |S|$ we have that $\Sigma I_1 \ge k^2 + k/r - (s + 4R^2)R^2$. It follows that and therefore where the second inequality holds for large enough $k$.
Observe that in every spanning tree of $G$, the number of edges of colour $j$ is at most $n - |C_j| = n/r$. Thus $D_r(G, T_n) \le r \cdot (n/r - (n-1)/r) = 1$.

Spanning-Tree Discrepancy in Random Regular Graphs
In this section we prove Theorem 1.6. We begin with the following useful fact, which appears as Corollary 2.15 in [6].

Lemma 3.1 ([6]). For every $d \ge 3$ there is $C = C(d)$ such that the following holds. Let $G \in G_{n,d}$, and let $E_0$ be a set of pairs of vertices of $G$ of size at most $0.49nd$. Then the
We now use Lemma 3.1 to show that whp, all small enough subgraphs of $G_{n,d}$ are quite sparse; or, equivalently, that no small edge-set $E_0 \subseteq E(G_{n,d})$ spans few vertices.
Lemma 3.2. For every $d \ge 3$ and $\varepsilon > 0$ there is $\alpha = \alpha(d, \varepsilon) > 0$ such that whp $G \in G_{n,d}$ satisfies the following. For every $E_0 \subseteq E(G)$ of size at most $\alpha n$, the number of vertices spanned by $E_0$ is at least $(1 - \varepsilon)|E_0|$.
Proof. Choose $\alpha = \alpha(d, \varepsilon) > 0$ to satisfy $e^2 \cdot C \cdot \alpha^{\varepsilon} \le \frac{1}{2}$, where $C = C(d)$ is from Lemma 3.1. Fixing $1 \le m \le \alpha n$, let us estimate the probability that there is a set of fewer than $(1 - \varepsilon)m$ vertices which span (at least) $m$ edges. By the union bound and Lemma 3.1, this probability is at most where in the first inequality we used the estimate $\binom{n}{k} \le \left(\frac{en}{k}\right)^k$ and the (obvious) fact that $\binom{(1-\varepsilon)m}{2} \le \frac{m^2}{2}$, and in the second inequality we used the bound $\left(\frac{1}{1-\varepsilon}\right)^{1-\varepsilon} \le 2$, which holds for every $\varepsilon \in (0, 1)$. We conclude that the probability that there exists $E_0$ violating the statement of the lemma is at most It is easy to see that for $m$ in the range $1 \le m \le \sqrt{n}$, say, the corresponding sum is $o(1)$. As for the range $\sqrt{n} \le m \le \alpha n$, we use our choice of $\alpha$ to obtain
Another fact we need regarding random regular graphs concerns the distribution of short cycles. It easily follows from Lemma 3.1 that for every fixed $k$, the expected number of $k$-cycles in $G_{n,d}$ can be upper bounded by a function of $k$. (In fact, much more precise results are known; see e.g. Theorem 2.5 in [32] and the references therein.) Using Markov's inequality, we obtain the following (rather weak, though sufficient for our purposes) fact:
Lemma 3.3. For every fixed $k$, the number of cycles of length at most $k$ in $G_{n,d}$ is $o(n)$ whp.
We now combine Lemmas 3.2 and 3.3 to show that whp, every small enough edge-set in $G_{n,d}$ can be made into (the edge-set of) a forest by omitting only a small fraction of its elements.
Lemma 3.4. For every $d \ge 3$ and $\varepsilon > 0$ there is $\beta = \beta(d, \varepsilon) > 0$ such that whp $G \in G_{n,d}$ satisfies the following. For every $E^* \subseteq E(G)$ of size at most $\beta n$, there is $F \subseteq E^*$, $|F| \ge (1 - \varepsilon)|E^*|$, such that $F$ is the edge-set of a forest.

Proof of Corollaries
To prove the upper bounds in Corollaries 1.8 and 1.9, it will be convenient to use the following observation.
Observation 4.1. Let $G = (V, E)$ be a connected graph, and let $V = L_1 \cup \dots \cup L_k$ be a partition of its vertex set such that $|L_i| \le D$ for $i = 1, \dots, k$ and $E(L_i, L_j) = \emptyset$ whenever $1 \le i < j - 1 < k$. Then $s_r(G) \le 3rD$.
Proof. We will call the sets $L_1, \dots, L_k$ layers. Let $n = |V|$. We may assume $2D < n/r$, otherwise the statement is trivial. A balanced $r$-separator of size $(1 + r)D$ can be obtained as follows. Let $\sigma : V \to [n]$ be an ordering of the vertices of $G$ satisfying $\sigma(u) < \sigma(v)$ whenever $u \in L_i$ and $v \in L_j$ for $i < j$. For $j = 1, \dots, r-1$ let $L_{i_j}$ be the layer containing the vertex labelled $\lceil jn/r \rceil$. Set $V' = V \setminus \bigcup_j L_{i_j}$, and observe that $V'$ consists of $r$ parts sized between $n/r - 2D$ and $n/r$ each, with no edges between distinct parts. We now level all parts by deleting a total of at most $2rD$ vertices, and set $S$ to be the set of vertices outside these parts. Then $S$ is a balanced $r$-separator of $G$ with $|S| \le 3rD$.
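The construction in the proof of Observation 4.1 can be sketched in code. This is an illustrative sketch only: the function name `layered_separator` and the list-of-lists representation of layers are assumptions, the cut position $\lceil jn/r \rceil$ is one possible rounding, and the code relies on the assumption $2D < n/r$ from the proof so that the $r - 1$ cut layers are distinct.

```python
def layered_separator(layers, r):
    """Split a layered vertex set into r equal parts plus a separator.

    `layers` is a list of lists of vertices; edges are assumed to run only
    inside a layer or between consecutive layers. Assumes 2*D < n/r, where
    D is the largest layer size.
    """
    n = sum(len(layer) for layer in layers)
    # prefix[i] = number of vertices in layers 0..i (positions are 1-based).
    prefix, total = [], 0
    for layer in layers:
        total += len(layer)
        prefix.append(total)
    # Cut the layer containing the vertex at position ceil(j*n/r), j = 1..r-1.
    cut = set()
    for j in range(1, r):
        position = (j * n + r - 1) // r  # integer ceiling of j*n/r
        cut.add(next(i for i, p in enumerate(prefix) if p >= position))
    parts, current, separator = [], [], []
    for i, layer in enumerate(layers):
        if i in cut:
            separator.extend(layer)  # the whole cut layer goes to S
            parts.append(current)
            current = []
        else:
            current.extend(layer)
    parts.append(current)
    # Level the parts: trim each one down to the size of the smallest.
    size = min(len(part) for part in parts)
    for part in parts:
        separator.extend(part[size:])
    return [part[:size] for part in parts], separator
```

For example, with 30 layers of size 3 and $r = 3$, the sketch cuts two layers and trims three further vertices, giving three parts of size 27 and a separator of size 9, well within the bound $3rD = 27$.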
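We remark that the "right" notion to use here is that of the bandwidth of a graph (see, e.g., [24]). The bandwidth of an $n$-vertex graph $G$ is the minimum $D$ such that there exists an ordering $\sigma : V(G) \to [n]$ with $|\sigma(u) - \sigma(v)| \le D$ for every edge $uv \in E(G)$; in other words, $\mathrm{bw}(G) := \min_{\sigma} \max_{uv \in E(G)} |\sigma(u) - \sigma(v)|$, where the minimum is over all orderings $\sigma$ of $V(G)$. Using this terminology, the statement of Observation 4.1 can be simplified as follows: for every graph $G$ with $\mathrm{bw}(G) \le D$ it holds that $s_r(G) \le 3rD$.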
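For very small graphs, the bandwidth can be computed directly from the standard min-max formulation by trying all orderings. The sketch below is purely illustrative (the function name `bandwidth` is hypothetical, and the exhaustive search takes $n!$ steps, so it is only usable for a handful of vertices).

```python
from itertools import permutations


def bandwidth(vertices, edges):
    """Minimum over all orderings of the longest edge stretch |pos(u) - pos(v)|."""
    best = None
    for order in permutations(vertices):
        pos = {v: i for i, v in enumerate(order)}
        width = max((abs(pos[u] - pos[v]) for u, v in edges), default=0)
        if best is None or width < best:
            best = width
    return best
```

Given an ordering of width $D$, consecutive blocks of $D$ vertices form layers of the kind required in Observation 4.1, since an edge can only join vertices in the same block or in adjacent blocks.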
The hypercube
Here we prove Corollary 1.8. Identify the vertices of $Q^d$ with the set $\{0, 1\}^d$ in the obvious way. Denote by $L_0, \dots, L_d$ the layers of the hypercube, namely, $L_i := \{x \in \{0,1\}^d : \sum_{j=1}^{d} x_j = i\}$. For the lower bound in Corollary 1.8, we shall need the following lemma.
Lemma 4.2. For every $\alpha > 0$ there exists $\beta > 0$ such that for every set $U$ of size at least $\alpha \cdot 2^d$ it holds that $|N(U)| \ge \beta \cdot 2^d / \sqrt{d}$.
Proof. Fix $\alpha > 0$ and let $U$ be a vertex set with $|U| \ge \alpha \cdot 2^d$. A Hamming ball (with centre 0 and radius $k$) in $Q^d$ is a vertex set of the form $L_0 \cup L_1 \cup \dots \cup L_{k-1} \cup L'_k$ for some $0 \le k \le d$ and $\emptyset \ne L'_k \subseteq L_k$. A classical result by Harper ([21], see also, e.g., Theorem 31 of [18]) implies that the vertex boundary of sets of a given size is minimised by Hamming balls. Thus, we may assume $U$ is a Hamming ball (of radius $k$). Note that (e.g., by Chernoff bounds) $k \ge d/2 - \alpha' \sqrt{d}$ for some $\alpha' = \alpha'(\alpha)$. Now, note that $|N(U)|$ is at least asymptotically half of the size of $L_k$, which is at least $\beta \cdot 2^d / \sqrt{d}$ for some $\beta = \beta(\alpha)$, as required.
The grid
Here we prove Corollary 1.9. Identify the vertices of $P^d_k$ with the set $[k]^d$ in the obvious way. Denote by $L_1, \dots, L_k$ the $k$ layers of the grid, each of which spans a copy of $P^{d-1}_k$, namely,
Proof of Corollary 1.9. From [8] (see also [31]) we know that $\iota(P^d_k) = \Omega(k^{d-1})$. The lower bound for $d \ge 3$ thus follows from Corollary 1.3 by noting that $P^d_k$ is 3-connected. For $d = 2$, let $P^+_k$ be obtained from $P^2_k$ by adding a cycle through the "corner" vertices $(1, 1), (1, k), (k, k), (k, 1)$. Note that any spanning tree of $P^+_k$ contains at most 3 edges which are not in $P^2_k$. As $P^+_k$ is clearly 3-connected, we have by Corollary 1.3 that $D_r(P^2_k, T_n) \ge D_r(P^+_k, T_n) - 3r = \Omega(k)$. The upper bound (for $d \ge 2$) is obtained from the combination of (1) and Observation 4.1 by noting that $|L_i| = k^{d-1}$ for every $i = 1, \dots, k$.
For each $1 \le i \le r$, let $G_i$ be the graph on $V(G)$ consisting of the edges coloured with colour $i$. Connected components of $G_i$ will be called colour-$i$ components, and their number will be denoted by $\ell_i$. Suppose first that $\ell_i \le (1 - \varphi(r) + \varepsilon) \cdot n$ for some $1 \le i \le r$. In this case, take a spanning tree of each colour-$i$ component, and connect these components using $\ell_i - 1$ edges to obtain a spanning tree $T$ of $G$ (this is possible since $G$ is connected). The number of edges of $T$ of colour $i$ is at least $n - 1 - (\ell_i - 1) \ge n - (1 - \varphi(r) + \varepsilon) \cdot n = (\varphi(r) - \varepsilon) \cdot n$, as required.
So we see that in order to complete the proof, it suffices to rule out the possibility of having $\ell_i > (1 - \varphi(r) + \varepsilon) \cdot n$ for all $1 \le i \le r$. Suppose then, for the sake of contradiction, that this is the case. For $1 \le i \le r$, let $V_i$ be the union of all colour-$i$ components of size at most $2/\varepsilon$. Evidently, the number of colour-$i$ components of size larger than $2/\varepsilon$ is less than $\varepsilon n/2$. Since $|V_i|$ is at least as large as the number of colour-$i$ components of size at most $2/\varepsilon$, we have where in the last inequality we used our assumption that $\ell_i > (1 - \varphi(r) + \varepsilon) \cdot n$. For each
Now, consider the Venn diagram of the sets $W_1, \dots, W_r$. Partition each of the (at most) $2^r$ "regions" of this Venn diagram into sets of size $n/m$, plus a "residual set" of size less than $n/m$. Then, collect all residual sets and partition their union into sets of size $n/m$. Let $U_1, \dots, U_m$ be the resulting partition of $V(G)$. Note that for each $1 \le i \le r$ and for all but at most $2^r$ of the indices $1 \le s \le m$, it holds that either $U_s \subseteq W_i$ or $U_s \cap W_i = \emptyset$. Indeed, if $U_s$ is not contained in the union of residual sets, then $U_s$ is contained in one of the regions of the Venn diagram of $W_1, \dots, W_r$, implying that either $U_s \subseteq W_i$ or $U_s \cap W_i = \emptyset$.
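The first case of the argument above (few colour-$i$ components) is constructive, and can be illustrated with a Kruskal-style sketch that processes the edges of the chosen colour first. This is a minimal sketch under stated assumptions: the function name `spanning_tree_favouring`, the vertex labels $0, \dots, n-1$ and the `(u, v, colour)` edge format are hypothetical, and the input graph is assumed to be connected.

```python
def spanning_tree_favouring(n, edges, colour):
    """Spanning tree of a connected graph using as many `colour` edges as possible.

    `edges` is a list of (u, v, c) triples on vertices 0..n-1. Edges of the
    chosen colour are scanned first, so the tree contains a spanning tree of
    every component of that colour class.
    """
    parent = list(range(n))

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    # Stable sort: edges of the favoured colour (key False) come first.
    for u, v, c in sorted(edges, key=lambda e: e[2] != colour):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, c))
    return tree
```

If the chosen colour class has $\ell_i$ components, the resulting tree contains exactly $n - \ell_i$ edges of that colour, matching the count used in the text.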
Claim 5.1. For every pair $1 \le s < t \le m$, there exists $1 \le i \le r$ such that $|U_s \cap W_i| \ge \frac{n}{2mr}$ and $|U_t \cap W_i| \ge \frac{n}{2mr}$.
Proof. Let $1 \le s < t \le m$. Suppose, for the sake of contradiction, that for each $1 \le i \le r$ it holds that $|U_s \cap W_i| < \frac{n}{2mr}$ or $|U_t \cap W_i| < \frac{n}{2mr}$. We now define subsets $X \subseteq U_s$ and $Y \subseteq U_t$, as follows. For each $1 \le i \le r$, if $|U_s \cap W_i| < \frac{n}{2mr}$ then remove the elements of $U_s \cap W_i$ from $U_s$, and if $|U_t \cap W_i| < \frac{n}{2mr}$ then remove the elements of $U_t \cap W_i$ from $U_t$. Let $X$ be the set of remaining elements in $U_s$ and $Y$ be the set of remaining elements in $U_t$. By definition, $X \cap Y = \emptyset$. Moreover, we have $|X| > |U_s| - r \cdot \frac{n}{2mr} = \frac{n}{2m}$ and similarly $|Y| > |U_t| - r \cdot \frac{n}{2mr} = \frac{n}{2m}$. Let us estimate the number of edges (in $G$) between $X$ and $Y$. To this end, fix $1 \le i \le r$ and suppose, without loss of generality, that $X \cap W_i = \emptyset$ (the case $Y \cap W_i = \emptyset$ is symmetrical). The definitions of $W_i$ and $V_i$ imply that for every $x \in X$, the colour-$i$ component containing $x$ has size at most $2/\varepsilon$. This means that for every $x \in X$, there are fewer than $2/\varepsilon$ edges of colour $i$ incident to $x$. Hence, $e_{G_i}(X, Y) < \frac{2}{\varepsilon} \cdot |X| \le \frac{2}{\varepsilon} \cdot |U_s| = \frac{2n}{\varepsilon m}$. Summing over all colours $1 \le i \le r$, we conclude that
\[ e_G(X, Y) < \frac{2rn}{\varepsilon m}. \tag{13} \]
On the other hand, since $G$ is a $\beta$-graph, there are at least $|Y| - \beta n$ edges between $X'$ and $Y$ for every subset $X' \subseteq X$ of size $\beta n$. Hence, by considering a partition of $X$ into sets of size $\beta n$, we see that where in the second inequality we used the fact that $|X|, |Y| \ge \frac{n}{2m}$, and in the last inequality we used the fact that $\beta \le \frac{1}{4m}$, which follows from our choice of $\beta$. By combining (13) and (14), we get $\frac{1}{8m^2\beta} < \frac{2r}{\varepsilon m}$, or, equivalently, $\beta > \frac{\varepsilon}{16rm}$. But this contradicts our choice of $\beta$.
Let us now complete the proof of the theorem using Claim 5.1.
where the last inequality follows from our choice of m.
On the other hand, we now observe that $|A_i| \le |W_i| \cdot m/n + 2^r$ for every $1 \le i \le r$. Indeed, fixing any $1 \le i \le r$, recall that for all but at most $2^r$ of the indices $1 \le s \le m$ it holds that either $U_s \subseteq W_i$ or $U_s \cap W_i = \emptyset$. It is now easy to see that indeed $|A_i| \le |W_i| \cdot m/n + 2^r$. Combining this with (12), we get that for every $1 \le i \le r$. Here the last inequality holds because $m$ was chosen to satisfy $m \ge 2^{r+2}/\varepsilon$. But (16) stands in contradiction with (15). This completes the proof of the theorem.

Discrepancy of Hamilton Cycles
The goal of this section is to prove Theorem 1.11. As a first step, we establish the following result, showing that in every $r$-colouring of the edges of a graph with minimum degree $\left(\frac{1}{2} + \alpha\right) n$ (for $\alpha > 0$), there is a monochromatic matching of size $\left(\frac{1}{2r} + \beta\right) n$, where $\beta = \beta(r, \alpha) > 0$ is a suitable constant. Note that the (weaker) bound $\frac{n}{2r}$ holds trivially by taking the most popular colour of a perfect matching, which must be present in any graph with minimum degree at least $\frac{n}{2}$ (by Dirac's theorem). Moreover, this result is tight, in a sense, since graphs of minimum degree $\frac{n-1}{2}$ need not contain a perfect matching (for example, consider the complete bipartite graph with sides of size $\frac{n-1}{2}$ and $\frac{n+1}{2}$).
Lemma 6.1. Let $r \ge 2$, $\alpha \in (0, 0.2)$. Let $G$ be an $n$-vertex graph with minimum degree at least $\left(\frac{1}{2} + \alpha\right) n$. Then in every $r$-colouring of the edges of $G$ there is a monochromatic matching of size at least $\left(\frac{1}{2r} + \frac{\alpha}{3r}\right) n - O(1)$.
Proof. Let $G$ be an $n$-vertex graph with minimum degree at least $\left(\frac{1}{2} + \alpha\right) n$. Set $k := \left\lfloor \frac{2\alpha}{3} n \right\rfloor$. We run the following procedure for $k$ steps. If, at a given step, the (remaining) graph has vertices $x, y, z$ such that $\{x, y\}, \{y, z\}$ are edges of different colours, then remove $x, y, z$ from the graph. Otherwise (i.e., if no such triple of vertices exists), stop. Suppose this procedure ran for $\ell$ steps (for some $0 \le \ell \le k$), and let $G'$ be the resulting graph. Let $x_j, y_j, z_j$, $1 \le j \le \ell$, be the triples of vertices which we removed, and put $E := \{\{x_j, y_j\}, \{y_j, z_j\} : 1 \le j \le \ell\}$. Observe that $|V(G')| = n - 3\ell$, and that the minimum degree of $G'$ is at least $\delta(G) - 3\ell \ge$ Note that We now consider two cases. Suppose first that $\ell < k$, namely that the procedure terminated "ahead of time". By the definition of the procedure, it must be the case that in $G'$ there is no pair of incident edges $\{x, y\}, \{y, z\}$ such that $\{x, y\}$ and $\{y, z\}$ have different colours. Since $G'$ is connected, this implies that all edges of $G'$ are of the same colour.
It follows that $G'$ (and hence also $G$) contains a monochromatic matching (namely, $M$) of size at least $|M| \ge \frac{n-1}{2} - \alpha n \ge \left(\frac{1}{2r} + \frac{\alpha}{3r}\right) n - O(1)$. Here the first inequality uses (17), and the second holds for $\alpha \le 0.2$.
Suppose now that $\ell = k$. Then $|E| = 2k$. For each $1 \le i \le r$, let $A_i$ be the set of edges of $E$ of colour $i$. Then $A_i$ is a matching for every $1 \le i \le r$, since every pair of incident edges in $E$ have different colours. It now follows by definition that $M \cup A_i$ is a matching for every $1 \le i \le r$. Next, for each $1 \le i \le r$, let $B_i$ be the set of edges of $M$ of colour $i$. Note that where the inequality follows from (17), and the last equality from our choice of $k$. By averaging, there is $1 \le i \le r$ such that $|A_i \cup B_i| = |A_i| + |B_i| \ge \left(\frac{1}{2r} + \frac{\alpha}{3r}\right) n - O(1)$. As $A_i \cup B_i$ is a matching all of whose edges have colour $i$, the proof is complete.
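The removal procedure used in the proof of Lemma 6.1 can be sketched as follows. This is only an illustration: the names `find_cherry` and `strip_cherries` and the nested-dictionary representation `adj[u][v] = colour` are assumptions, and the sketch makes no attempt to choose triples cleverly (the proof allows any valid choice).

```python
def find_cherry(adj, alive):
    """Return (x, y, z) with edges xy and yz of different colours, or None."""
    for y in alive:
        nbrs = [v for v in adj[y] if v in alive]
        for x in nbrs:
            for z in nbrs:
                if adj[y][x] != adj[y][z]:
                    return x, y, z
    return None


def strip_cherries(adj, k):
    """Run the removal procedure for at most k steps.

    `adj[u][v]` is the colour of the edge uv (stored in both directions).
    Returns the removed triples and the set of remaining vertices.
    """
    alive = set(adj)
    triples = []
    for _ in range(k):
        cherry = find_cherry(adj, alive)
        if cherry is None:
            break  # every pair of incident edges now has the same colour
        triples.append(cherry)
        alive -= set(cherry)
    return triples, alive
```

If the loop stops early, no two incident edges of the remaining graph have different colours, which is exactly the situation analysed in the first case of the proof; if it runs for all $k$ steps, the $2k$ edges of the removed triples play the role of $E$.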
In what follows, we will need (a multicolour version of) Szemerédi's regularity lemma. Let us then recall the relevant definitions. For a pair of disjoint vertex-sets $U, W$ in a graph, the density of $(U, W)$ is defined as $d(U, W) := |E(U, W)|/(|U||W|)$. When several graphs are considered at the same time, we will write $d_G(U, W)$ for the density in the graph $G$. A pair $(U, W)$ of disjoint vertex-sets is called $\gamma$-regular if for all $U' \subseteq U$, $W' \subseteq W$ with $|U'| \ge \gamma|U|$ and $|W'| \ge \gamma|W|$ it holds that $|d(U', W') - d(U, W)| \le \gamma$. An equipartition of a set is a partition in which the sizes of any two parts differ by at most 1 (to keep the presentation clean, we will ignore divisibility issues and just assume that all parts have the same size). For our needs, it will be convenient to use the following version of the multicolour regularity lemma.
Theorem 6.2 (Multicolour Szemerédi's regularity lemma, degree version). For every $r, t_0 \ge 1$ and $\gamma \in (0, 1)$, there exist $T = T(r, t_0, \gamma)$ and $n_0 = n_0(r, t_0, \gamma)$ such that for every collection $G_1, \dots, G_r$ of graphs on the same vertex-set $V$ with $|V| \ge n_0$, there exists an equipartition $\{V_1, \dots, V_t\}$ of $V$ with $t_0 \le t \le T$, such that the following is satisfied: for every $1 \le i \le t$ it holds that for all but at most $\gamma t$ of the indices $j \in [t] \setminus \{i\}$, the pair $(V_i, V_j)$ is $\gamma$-regular in $G_\ell$ for every $1 \le \ell \le r$.
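The definitions of density and $\gamma$-regularity can be made concrete on small examples. The sketch below is illustrative only: the function names `density` and `witnesses_irregularity` are hypothetical, edges are given as unordered pairs, and the second function merely tests whether one given sub-pair $(U', W')$ violates the regularity condition (it does not decide regularity, which would require checking all sub-pairs).

```python
def density(U, W, edges):
    """d(U, W) = |E(U, W)| / (|U| |W|) for disjoint vertex sets U and W."""
    U, W = set(U), set(W)
    crossing = sum(1 for u, v in edges
                   if (u in U and v in W) or (u in W and v in U))
    return crossing / (len(U) * len(W))


def witnesses_irregularity(U1, W1, U, W, edges, gamma):
    """True if (U1, W1) is large enough and its density deviates by more than gamma."""
    large_enough = len(U1) >= gamma * len(U) and len(W1) >= gamma * len(W)
    deviation = abs(density(U1, W1, edges) - density(U, W, edges))
    return large_enough and deviation > gamma
```

For instance, with $U = \{0, 1\}$, $W = \{2, 3\}$ and edges $\{0,2\}, \{0,3\}$, the pair has density $1/2$, while the sub-pair $(\{0\}, W)$ has density 1, so it witnesses that $(U, W)$ is not $0.4$-regular.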
The standard proof of the regularity lemma and its degree version (see [25]) can be easily adapted to give the above variant.
The remaining tools needed in the proof of Theorem 1.11 are the following two lemmas: the first, Lemma 6.3, originated in [5] and is proved by analysing the DFS algorithm (see also [26] and [27, Corollary 2.1]); and the second, Lemma 6.4, appears in [30].

Lemma 6.3 ([5]). Let $n, k \ge 1$ be integers, and let $F$ be a bipartite graph with sides $X, Y$ of size $n$ each. Suppose that there is an edge between every pair of sets $X' \subseteq X$ and $Y' \subseteq Y$ with $|X'| = |Y'| = k$. Then $F$ contains a path of length at least $2n - 4k$.
Lemma 6.4 ([30]). Let $a \ge 0$ and let $G$ be a graph with $n$ vertices and minimum degree at least $\frac{n}{2} + a$. Let $E \subseteq E(G)$ be an edge-set which forms a path-forest and has size at most $2a$. Then there exists a Hamilton cycle in $G$ which uses all edges in $E$.
We are now ready to prove Theorem 1.11.
Proof of Theorem 1.11. Set $\gamma := \min\left\{\frac{1}{6r(r+2)}, \frac{1}{100r}\right\}$. Let $t_0$ be large enough as a function of $r$ (to be chosen implicitly later). Let $T := T(r, t_0, \gamma)$ and $n_0 := n_0(r, t_0, \gamma)$ be as in Theorem 6.2. Let $d \le \frac{n}{20r^2}$, and let $G$ be a graph on $n \ge n_0$ vertices with minimum degree at least $\frac{(r+1)n}{2r} + d$. Fix any $r$-colouring of the edges of $G$. For each $i \in [r]$, let $G_i$ be the graph on $V(G)$ whose edges are the edges of $G$ coloured by colour $i$. Let $\{V_1, \dots, V_t\}$ be the partition of $V = V(G)$ obtained by applying Theorem 6.2 to $(G_1, \dots, G_r)$. Let $R'$ be the graph on $[t]$ in which $\{i, j\} \in E(R')$ if and only if $(V_i, V_j)$ is $\gamma$-regular in $G_\ell$ for every $1 \le \ell \le r$. By the guarantees of Theorem 6.2, we have $\delta(R') \ge t - 1 - \gamma t$. Now, let $R$ be the (spanning) subgraph of $R'$ in which $\{i, j\} \in E(R)$ if and only if $\{i, j\} \in E(R')$ and $d_G(V_i, V_j) > r\gamma$. We claim that $\delta(R) \ge \left(\frac{1}{2} + \frac{1}{3r}\right) t$. To this end, fix any $1 \le i \le t$ and observe that where the second inequality relies on the fact that $d_G(V_i, V_j) \le r\gamma$ whenever $\{i, j\} \notin E(R)$, the third inequality uses the trivial bound $\deg_{R'}(i) - \deg_R(i) \le \deg_{R'}(i) \le t$ and the fact that