Long paths and cycles in random subgraphs of graphs with large minimum degree

For a graph $G$ and $p\in [0,1]$, let $G_p$ arise from $G$ by deleting every edge mutually independently with probability $1-p$. The random graph model $(K_n)_p$ is certainly the most investigated random graph model and also known as the $G(n,p)$-model. We show that several results concerning the length of the longest path/cycle naturally translate to $G_p$ if $G$ is an arbitrary graph of minimum degree at least $n-1$. For a constant $c$, we show that asymptotically almost surely the length of the longest path is at least $(1-(1+\epsilon(c))ce^{-c})n$ for some function $\epsilon(c)\to 0$ as $c\to \infty$, and the length of the longest cycle is at least $(1-O(c^{- \frac{1}{5}}))n$. The first result is asymptotically best-possible. This extends several known results on the length of the longest path/cycle of a random graph in the $G(n,p)$-model.


Introduction
Around 1960, Erdős and Rényi proved the first results about random graphs, especially about graphs on $n$ vertices where every possible edge is present independently with probability $p$, which is nowadays known as the $G(n,p)$-model. It is not an overstatement to say that this field has grown enormously since then, and for numerous graph parameters the typical value is (precisely) known for large $n$. In particular, the lengths of paths and cycles have been investigated. As for any $\epsilon > 0$ and $p \ge \frac{(1+\epsilon)\log n}{n}$ a graph in $G(n,p)$ is a.a.s. Hamiltonian, we consider the length of a longest path/cycle if $p = \frac{c}{n}$ for some constant $c > 1$. A series of papers [1,3,4,6,5] addresses this question.

Let us consider a more general random graph model. For a graph $G$, we denote by $G_p$ the random subgraph obtained by deleting every edge independently with probability $1-p$ from the edge set of $G$. Thus $(K_n)_p$ is a member of $G(n,p)$ chosen at random. In this paper we consider the typical asymptotic behavior of $(G_k)_p$ instead of $(K_n)_p$, where $G_k$ is a simple graph of minimum degree at least $k$. In our setting $p$ depends on $k$ instead of the order of $G_k$.
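The model $G_p$ described above can be sampled directly. The following is a minimal sketch and not part of the paper: the function name `random_subgraph`, the edge-list representation and the parameter values are assumptions chosen for illustration only.

```python
import random

def random_subgraph(edges, p, rng):
    """Keep each edge of the host graph independently with probability p."""
    return [e for e in edges if rng.random() < p]

# Host graph K_n, so the sample below is distributed as G(n, p) with p = c/n.
n, c = 200, 3.0
complete_edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
sample = random_subgraph(complete_edges, c / n, random.Random(0))
print(len(sample))  # random; its expectation is c * (n - 1) / 2
```

Replacing `complete_edges` by the edge list of any host graph of minimum degree at least $k$ and using $p = c/k$ gives a sample of the more general model.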
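We denote by $\mathcal{G}$ the set of all graph sequences $G_1, G_2, \ldots$ such that $G_k$ has minimum degree at least $k$. With $p = \frac{c}{k}$, we define
$$\alpha'(c) = \sup\{\alpha \ge 0 : (G_k)_p \text{ contains a path of length } \alpha k \text{ a.a.s. for every } (G_k)_{k \ge 1} \in \mathcal{G}\}$$
and $\beta'(c)$ analogously for cycles. It is clear that $\alpha'(c) \le \alpha(c)$ and $\beta'(c) \le \beta(c)$.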
We prove that there is essentially no difference between $\alpha'(c)$ and $\alpha(c)$, and our second contribution is a lower bound on $\beta'(c)$.
Theorem 3 improves a result of Krivelevich, Lee and Sudakov [8] and Riordan [10] implying $\beta'(c) = 1 - o(1)$. It also generalizes several results on the length of the longest cycle in the $G(n,p)$-model.
Note that the question of Hamiltonicity in the $G(n,p)$ setting translates to the question whether $G_k$ has a cycle of length at least $k+1$. These extensions were successfully settled by Krivelevich, Lee and Sudakov [8], and by Glebov, Naves and Sudakov [7].

Preliminaries
We will frequently need to show that a binomial random variable is very close to its expected value and use for these purposes Chernoff's inequality.
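To illustrate the kind of concentration meant here, the sketch below compares the exact two-sided tail of a binomial variable with the bound $2\exp(-\lambda^2/(3np))$ for $0 < \lambda \le np$. This is one standard form of Chernoff's inequality and is used here as an assumption; it is not necessarily the exact statement the paper uses. The function names and parameters are illustrative.

```python
import math

def binom_two_sided_tail(n, p, lam):
    """Exact P(|X - n*p| >= lam) for X ~ Bin(n, p)."""
    mean = n * p
    return sum(
        math.comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(n + 1)
        if abs(j - mean) >= lam
    )

def chernoff_bound(n, p, lam):
    """Assumed standard bound 2*exp(-lam^2 / (3*n*p)), valid for 0 < lam <= n*p."""
    return 2 * math.exp(-lam**2 / (3 * n * p))

n, p = 200, 0.3
for lam in (10.5, 20.5, 40.5):
    print(lam, binom_two_sided_tail(n, p, lam), chernoff_bound(n, p, lam))
```

The exact tail decays much faster than the bound; the bound is convenient because it only involves $\lambda$ and the mean $np$.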
Theorem 4 (Chernoff's inequality [2]). If $X$ is a binomially distributed random variable with $X \sim \mathrm{Bin}(n,p)$ and $0 < \lambda \le np = \mathbb{E}X$, then

Several results in this paper are based on the depth-first-search algorithm (DFS-algorithm), which is a frequently used exploration method for graphs. We briefly describe this algorithm and introduce some notation along the way. Several recent results apply this algorithm to random graphs, leading to very nice and short proofs [8,9,10].
The DFS-algorithm traverses a given graph $G$ so that all vertices of $G$ are eventually visited, and it outputs a rooted spanning forest $T$ of $G$. It proceeds in the following way.
At any step, there is a partition of the vertex set $V(G)$ into three sets $R$, $S$ and $U$. The set $U$ contains the vertices that have not yet been visited during the exploration, $R$ denotes the set of vertices whose exploration is complete, and all the remaining vertices, which are currently under exploration, are contained in $S$. The vertices of $S$ are kept in a stack, which is a last-in-first-out data structure.
The algorithm starts with $U = V(G)$ and $R = S = \emptyset$ and executes the following rounds until every vertex is explored, i.e. $R = V(G)$ and $S = U = \emptyset$.
• If $S = \emptyset$, then some unreached vertex $v$ in $U$ is moved to $S$. This vertex $v$ will be the root of a new component of our rooted spanning forest $T$.
• Otherwise, let $v$ be the top element of the stack $S$ (the last-in vertex). The algorithm queries whether $v$ has some neighbor $w$ in $U$. If so, $w$ is placed on top of the stack $S$. If $v$ has no neighbor in $U$, it is completely explored and is moved to $R$.
• As long as $S \cup U \neq \emptyset$, the algorithm moves to the next round.
In each round of the algorithm exactly one vertex is moved, either from $U$ to $S$ or from $S$ to $R$. So indeed, after $2|V(G)|$ rounds every vertex has been moved from $U$ to $R$ through $S$, and the algorithm terminates with a rooted spanning forest $T$.
The following properties of the DFS-algorithm are important to us:
(I) Every positively answered query about a neighbor in $U$ increases the size of $R \cup S$ by exactly one.
(II) The set $S$ always spans a path.
(III) At any round of the algorithm, all possible edges between the sets $R$ and $U$ have been queried and answered negatively.
(IV) Every edge $e = uv$ of the graph $G$ which is not tested during the exploration of $G$ joins two vertices on some vertical path in the rooted spanning forest $T$ (because otherwise the algorithm would have queried for the edge $uv$ during the exploration).
We will use the DFS-algorithm to explore the random graph $G_p$. Therefore, we assume that the algorithm already knows the underlying graph $G$ and all the edges of $G$. The DFS-algorithm only queries these edges of $G$ during the exploration of $G_p$. That is, if the DFS-algorithm looks for neighbors of some vertex $v$, it only considers the neighbors $w$ of $v$ in $G$, and queries whether $w$ is also a neighbor of $v$ in $G_p$. We receive a positive answer to each such query independently with probability $p$. In this way, following this algorithm, we explore a rooted spanning forest of our random graph $G_p$. Note that by definition the answer to a query does not depend on the answers to the previous queries. We say an edge of $G$ is tested if the DFS-algorithm queried whether this edge is in $G_p$, and otherwise we say it is untested.
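The exploration of $G_p$ by the DFS-algorithm can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `dfs_explore`, the adjacency-dictionary representation of the host graph and the rule for picking a new root (`min(U)`) are choices made here. Each edge of the host graph is queried at most once, and a query is answered positively with probability $p$.

```python
import random

def dfs_explore(adj, p, rng):
    """Explore G_p by DFS, revealing each edge of the host graph at most once.

    adj maps every vertex to the list of its neighbours in the host graph G.
    Returns (parent, tested, rounds): the rooted spanning forest as a parent
    map, the set of tested edges, and the number of rounds executed.
    """
    U = set(adj)      # vertices not yet visited
    R = set()         # vertices whose exploration is complete
    S = []            # stack of vertices under exploration
    parent, tested, rounds = {}, set(), 0
    while U or S:
        rounds += 1
        if not S:
            v = min(U)            # any unreached vertex may serve as a new root
            U.remove(v)
            S.append(v)
            parent[v] = None
            continue
        v = S[-1]
        found = None
        for w in adj[v]:
            edge = frozenset((v, w))
            if w in U and edge not in tested:
                tested.add(edge)          # query: is vw an edge of G_p?
                if rng.random() < p:
                    found = w
                    break
        if found is None:
            R.add(S.pop())                # v has no neighbour in U within G_p
        else:
            U.remove(found)
            S.append(found)
            parent[found] = v
    return parent, tested, rounds

host = {v: [w for w in range(8) if w != v] for v in range(8)}  # K_8
parent, tested, rounds = dfs_explore(host, 0.4, random.Random(0))
print(rounds, len(tested))  # rounds equals 2 * 8; the second value is random
```

Each round moves exactly one vertex from $U$ to $S$ or from $S$ to $R$, so the number of rounds is $2|V(G)|$, and the stack `S` always consists of a path of tree edges, matching property (II).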
Throughout the paper we consider graphs $G_k$ of minimum degree at least $k$. Almost all our results are asymptotic statements, and an event occurs asymptotically almost surely (a.a.s.) if the probability that this event occurs tends to 1 as $k \to \infty$. Furthermore, several inequalities in our computations are only correct if $k$ is large enough, and for the purpose of readability we often drop the index $k$ and simply write $G$.

Auxiliary Results
Before we begin with the proofs of Theorems 2 and 3, we cite and prove some results for later use. The first one uses a nice and direct analysis of the DFS-algorithm.
Lemma 5 (Krivelevich, Lee, Sudakov [8]). Let $p = \frac{c}{k}$ for $c$ sufficiently large, and let $G$ be a graph of minimum degree at least $k$. If $G$ is bipartite, then $G_p$ a.a.s. contains a path of length $(2 - 6c^{-1/2})k$.
The next lemma is of a similar flavor as the last one. We suitably modify a result of [8] for our purposes.
Lemma 6. Let $p = \frac{c}{k}$ for $c$ sufficiently large, and let $G$ be a graph of minimum degree at least $k$.
, and we may assume that |V 0 | = ⌈log k⌉. We modify the DFS-algorithm as follows.
Recall that the stack $S$ contains the vertices that are currently under exploration. If $S = \emptyset$ in some step of the algorithm, then as long as possible we take a vertex of $V_0 \cap U$ as the new root of a component and put it onto the stack $S$. Hence, by this modified DFS-algorithm, at least up to the point when we have explored at most $\log k$ vertices, the root of the current component is in $V_0$.
We run this modified DFS-algorithm until the moment at which we reach $|R \cup S| = (1-\epsilon)k$. Let $A$ be the event that $S = \emptyset$ at some moment after $\frac{1}{2}\log k$ steps of the algorithm, and let $B$ be the event that there are fewer than $(1-\epsilon)k$ positive answers among the first $\frac{k}{p} = \epsilon^2 k^2$ tested edges.
Assuming this claim we can a.a.s. find a path of length (1 − ǫ)k starting in a vertex of V 0 as follows.
Suppose neither $A$ nor $B$ holds. Consider the step of the DFS-algorithm at which we reach $|R \cup S| = (1-\epsilon)k$; due to property (I), such a step exists. The root of the current component is contained in $V_0$, as $A$ does not hold. Recall that the vertices in $S$ form a path (property (II)). If $|S| \ge (1-2\epsilon)k$, then the statement of the lemma follows directly. Thus, we may assume that $|S| < (1-2\epsilon)k$, which implies $|R| > \epsilon k$. Moreover, each vertex in $R$ has at least $k - |R \cup S| \ge \epsilon k$ neighbors in $G$ in the set of unreached vertices $U$. Due to property (III), all these edges between $R$ and $U$ have been queried and answered negatively. Hence at least $|R| \cdot \epsilon k > \epsilon^2 k^2$ queries are answered negatively and fewer than $(1-\epsilon)k$ are answered positively. Thus $B$ holds, which is a contradiction.

We complete the proof of the lemma by proving the claim. For a positive integer $i$, let $A_i$ be the event that we complete exploring a component when $|R| = i$. Since every vertex has degree at least $k$, at this moment of the algorithm every vertex in $R$ has at least $k - i \ge \epsilon k$ neighbors in $U$ (for $i \le (1-\epsilon)k$), and all these edges have been queried and answered negatively. Thus we queried at least $i\epsilon k$ edges in total and had at most $i$ positive answers. The probability that this occurs is at most the probability that a binomially distributed random variable $X_i$ with $X_i \sim \mathrm{Bin}(i\epsilon k, p)$ is at most $i$. Here $\mathbb{E}X_i = i\epsilon c = ic^{1/2}$. By Chernoff's inequality, we obtain

Using the union bound leads to the desired result

An upper bound for the probability of the event $B$ follows by a direct application of Chernoff's inequality. Let $Y$ be a binomially distributed random variable with $Y \sim \mathrm{Bin}\left(\frac{k}{p}, p\right)$, so that $\mathbb{E}Y = k$. Then $\mathbb{P}[B] \le \mathbb{P}[Y < (1-\epsilon)k] = o(1)$, which completes the proof of the claim and thus the proof of the lemma.

Long Cycles
In this section we prove Theorem 3. Let $G$ be a graph of minimum degree at least $k$ on $n$ vertices and let $p = \frac{c}{k}$ for $c$ sufficiently large. This proof is based on ideas of Riordan [10] and follows its strategy. In particular, the first two short lemmas naturally transfer to our setting.
In this section, we consider a rooted forest $T$ which is an output of the DFS-algorithm described in the beginning. We emphasize that every untested edge of $G$ is in $G_p$ independently of $T$.
Lemma 7. During the DFS-algorithm on $G_p$ a.a.s. at most $\frac{2n}{p} = \frac{2nk}{c}$ edges are tested.
Proof. We run the DFS-algorithm on $G_p$. Note that the rooted spanning forest $T$ of $G_p$ has at most $n-1$ edges and that every positively answered query contributes an edge to our exploration of this forest. Let $X$ be the number of tested edges. If at least $\frac{2n}{p}$ edges are tested, then let $Y$ be the number of positively answered queries among the first $\frac{2n}{p}$ tested edges. Thus, $Y$ is a binomially distributed random variable with $Y \sim \mathrm{Bin}\left(\frac{2n}{p}, p\right)$ and $\mathbb{E}Y = \frac{2n}{p} \cdot p = 2n$. By Chernoff's inequality, we obtain

This completes the proof.
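The Chernoff step in the proof of Lemma 7 can be made explicit as follows. This is a sketch under an assumption: it uses the common two-sided form $\mathbb{P}[|Y - \mathbb{E}Y| \ge \lambda] \le 2\exp(-\lambda^2/(3\,\mathbb{E}Y))$ for $0 < \lambda \le \mathbb{E}Y$, which may differ from the exact statement and constants used by the authors. If at least $\frac{2n}{p}$ edges are tested, then at most $n-1$ of the first $\frac{2n}{p}$ queries are answered positively, so with $\lambda = n$:

```latex
\[
  \mathbb{P}\Big[X \ge \tfrac{2n}{p}\Big]
  \;\le\; \mathbb{P}[Y \le n-1]
  \;\le\; \mathbb{P}\big[\,|Y - 2n| \ge n\,\big]
  \;\le\; 2\exp\!\Big(-\frac{n^{2}}{3\cdot 2n}\Big)
  \;=\; 2e^{-n/6}
  \;=\; o(1).
\]
```

The last step uses $n \ge k+1 \to \infty$, which holds because $G$ has minimum degree at least $k$.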
From now on, let $\epsilon = c^{-1/5}$. Let $E_u$ be the set of untested edges of $G$ during the DFS-algorithm. We call a vertex free if it is incident with at least $(1-\epsilon)k$ untested edges in $E_u$.
Lemma 8. A.a.s. at most $4\epsilon^4 n$ vertices of the rooted forest $T$ are not free.
Proof. Let $v \in V(T)$ be a vertex that is not free. Since the minimum degree of $G$ is at least $k$, the vertex $v$ is incident with at least $\epsilon k$ tested edges. Assume that there are more than $4\epsilon^4 n$ vertices that are not free. Since every edge is counted at most twice, we have more than $\frac{1}{2} \cdot 4\epsilon^4 n \cdot \epsilon k = \frac{2nk}{c}$ tested edges in total. By Lemma 7, the probability of this is $o(1)$, which implies the statement.
For a rooted forest $T$ and a vertex $v \in V(T)$, we introduce the following notation.
(iv) For two vertices $u, v$, let $d(u,v)$ be the number of edges on a shortest $u,v$-path in $T$.
(v) We say a vertex $v$ is up if it has many descendants, say if $|D(v)| \ge \epsilon k$. If this is not the case, then $v$ is down.
the set of vertices in T that are not skinny.
Lemma 9. If the rooted forest $T$ of $G_p$ contains at most $5\epsilon^4 n$ down vertices, then, for any constant $h \ge 1$, at most $6h\epsilon^3 n$ vertices of $T$ are at height less than $hk$.
Proof. For each up vertex $v \in V(T)$, let $P(v)$ be a set of $\epsilon k$ descendants of $v$, obtained by choosing vertices of $D(v)$ one-by-one starting with those with largest distance to $v$ in $T$. For every $w \in P(v)$, we have $|D(w)| < |P(v)| = \epsilon k$, because $D(w) \subsetneq P(v)$. This implies that every vertex $w \in P(v)$ is down. We define the set $S_1 = \{(v,w) : v \text{ is up and } w \in P(v)\}$. Each up vertex $v$ appears in exactly $\epsilon k$ pairs $(v,w) \in S_1$, and by the assumption of the lemma, we have at least $(1 - 5\epsilon^4)n$ up vertices. Hence, we obtain $|S_1| \ge (1 - 5\epsilon^4)\epsilon k n$.

We consider the pairs $(v,w) \in S_1$ that satisfy $d(v,w) \le hk$. For pairs $(v,w) \in S_1$, we conclude that $v \in A(w)$ and $w$ is down. Note that each vertex has at most one ancestor at each distance, hence $|A_{\le hk}(w)| \le hk$. Since we have at most $5\epsilon^4 n$ down vertices, this implies that there are at most $hk \cdot 5\epsilon^4 n$ pairs $(v,w) \in S_1$ satisfying $d(v,w) \le hk$. Hence, if we consider the set $S_1' = \{(v,w) \in S_1 : d(v,w) > hk\}$, then $|S_1'| \ge (1 - 5\epsilon^4 - 5h\epsilon^3)\epsilon k n$.

Recall that each up vertex $v$ appears in exactly $\epsilon k$ pairs $(v,w) \in S_1$, and since $S_1' \subset S_1$, each such $v$ appears also in at most $\epsilon k$ pairs $(v,w) \in S_1'$. Hence, at least $(1 - 5\epsilon^4 - 5h\epsilon^3)n \ge (1 - 6h\epsilon^3)n$ vertices $v$ appear in some pair $(v,w) \in S_1'$. By the definition of $S_1'$, each such vertex $v$ is at height at least $hk$, which completes the proof.
Lemma 10. If the rooted forest $T$ of $G_p$ contains at most $5\epsilon^4 n$ down vertices and $X \subseteq V(T)$ such that $|X| \le 5\epsilon^4 n$, then, for $c$ sufficiently large, $T$ contains a vertical path $P$ of length at least $4k$ containing at most $\frac{1}{4}\epsilon k$ vertices in $X \cup Y$.
Proof. Let $X$ be a subset of $V(T)$ of size at most $5\epsilon^4 n$. First we show that the set $Y \subseteq V(T)$ which contains the vertices that are not skinny is small enough for our purposes. We define the set

Since a vertex has at most one ancestor at any given distance, we conclude $|S_2| \le (1-5\epsilon)kn$. By Lemma 9, all but at most $6\epsilon^3 n$ vertices $v$ are at height at least $k$, and thus each such $v$ appears in at least $(1-5\epsilon)k$ pairs $(v,w) \in S_2$. This contributes at least $(1-5\epsilon)(1-6\epsilon^3)kn$ pairs to the set $S_2$. Since $|S_2| \le (1-5\epsilon)kn$, the number of vertices $v$ that appear in more than $(1-4\epsilon)k$ pairs $(v,w) \in S_2$ is at most $(1-5\epsilon)6\epsilon^2 n$: if a vertex $v$ appears in at least $(1-4\epsilon)k$ pairs $(v,w)$, then it contributes $\epsilon k$ more pairs than counted in the lower bound given before, while $(1-5\epsilon)kn$ is an upper bound for $|S_2|$.
By the definition of $S_2$, all vertices $v$ appearing in at most $(1-4\epsilon)k$ pairs $(v,w) \in S_2$ are skinny. Hence,

Next we want to find the desired path $P$. We define the set

Since a vertex has at most one ancestor at each distance, for a pair $(v,w) \in S_3$, the vertex $w$ can appear in at most $4k$ different pairs in $S_3$. We obtain $|S_3| \le 25\epsilon^2 kn$. This implies that the number of vertices $v$ that can appear in more than $\frac{1}{4}\epsilon k$ pairs $(v,w) \in S_3$ is bounded from above by $\frac{25\epsilon^2 kn}{\frac{1}{4}\epsilon k} = 100\epsilon n$.
By Lemma 9, all but at most $24\epsilon^3 n$ vertices of $T$ are at height at least $4k$, and from the above it follows that all but at most $100\epsilon n$ vertices $v$ appear in at most $\frac{1}{4}\epsilon k$ pairs $(v,w) \in S_3$. Hence, for $c$ sufficiently large such that $\epsilon$ is small enough, there exists a vertex $v$ at height at least $4k$ that appears in at most $\frac{1}{4}\epsilon k$ pairs $(v,w) \in S_3$. Let $P$ be the vertical path from $v$ to some vertex in $D_{4k}(v)$. Then $P$ has length $4k$, and by the choice of $v$, the path $P$ contains at most $\frac{1}{4}\epsilon k$ vertices in $X \cup Y$.
Proof of Theorem 3. Recall that $G$ is a graph of minimum degree at least $k$ and $p = \frac{c}{k}$ for $c$ sufficiently large. We run the DFS-algorithm on $G_p$. Let $T$ be the spanning forest and let $E_u$ be the set of untested edges of $G$ that we obtain from this algorithm. By Lemma 8, we may assume that all but at most $4\epsilon^4 n$ vertices of $T$ are free, that is, incident with at least $(1-\epsilon)k$ untested edges in $E_u$. Due to property (IV) of the DFS-algorithm, for every untested edge $uv \in E_u$, either $u \in A(v)$ or $u \in D(v)$.
Assume that for more than $2\log k$ vertices $v$, we have

This means that we can find at least $\epsilon k \log k$ untested edges $uv \in E_u$ in $G$ with $d(u,v) \ge (1-5\epsilon)k$. Using Chernoff's inequality, we can easily find one of these edges present in $G_p$ with probability $1 - o(1)$. As we expect $\epsilon c \log k = \epsilon^{-4}\log k$ edges, the probability for the event that at least one edge is present is at least

Thus we can a.a.s. find such an edge present in $G_p$ that forms together with $T$ a cycle of length at least $(1-5\epsilon)k$ in $G_p$. Now assume that for all vertices $v$ except for at most $2\log k$, we have

Let $V_0$ be the set of vertices $v$ that do not satisfy (3), that is, $|V_0| \le 2\log k$.

Claim. A.a.s. there are at most $5\epsilon^4 n$ down vertices.
Proof. Assume that some vertex $v \in V(T) \setminus V_0$ is free and down. Since $|D(v)| < \epsilon k$ and $v$ is free, there are at least $(1-\epsilon)k - \epsilon k = (1-2\epsilon)k$ untested edges $uv \in E_u$ with $u \in A(v)$. Since each vertex has at most one ancestor at each distance, $v$ has at least $(1-2\epsilon)k - (1-5\epsilon)k = 3\epsilon k$ ancestors $u$ with $uv \in E_u$ and $d(u,v) \ge (1-5\epsilon)k$, which is a contradiction as $v \notin V_0$. Therefore, no down vertex in $V(T) \setminus V_0$ is free. By Lemma 8, a.a.s. all but $4\epsilon^4 n$ vertices are free. Hence, at most $4\epsilon^4 n + |V_0| \le 4\epsilon^4 n + 2\log k \le 5\epsilon^4 n$ vertices are down.
Thus we may apply Lemma 10, where $X$ is the union of $V_0$ and the set of vertices that are not free, that is, $|X| \le 5\epsilon^4 n$, and recall that $Y$ is the set of vertices that are not skinny. Let $P$ be the path that is given by Lemma 10 and let $Z$ be the set of vertices of $V(P) \setminus V_0$ that are free and skinny. By Lemma 10, we obtain

For any vertex $v \in Z$, there are at least $(1-\epsilon)k$ untested edges $uv \in E_u$ with $u \in A(v) \cup D(v)$. We want to show that there are sufficiently many of these vertices $u$ in $A(v)$.

Because of (3) and because
We define a set of ancestors of $v$ within a certain distance, namely

Again, since every vertex has at most one ancestor at each distance, we obtain $|B(v)| \ge \epsilon k$.
Let $u_1 \in V(P)$ be the vertex on the path $P$ which is at height $k$. Let $V_1$ be the set of the first descendants of $u_1$ on $P$ such that $|V_1 \cap Z| \ge \log k$. Hence, there are at least $\epsilon k \log k$ untested edges $uv \in E_u$ such that $v \in V_1 \cap Z$ and $u \in B(v)$. Using Chernoff's inequality similarly as before, there is an edge $v_1 u_2$ present in $G_p$ such that $v_1 \in V_1$, $u_2 \in B(v_1)$ and $\epsilon k \le d(v_1, u_2) \le (1-5\epsilon)k$ with probability $1 - o(k^{-1})$.
Let $V_2$ be the set of the first descendants of $u_2$ on $P$ such that $|V_2 \cap Z| \ge \log k$. Thus for every vertex $w \in V_2$, we have $d(w, u_1) \ge \epsilon k - \frac{1}{4}\epsilon k - 2\log k > \frac{\epsilon}{2}k$. Again, as $|V_2 \cap Z| \ge \log k$, there is an edge $v_2 u_3$ present in $G_p$ with $v_2 \in V_2$ and $u_3 \in B(v_2)$ with probability $1 - o(k^{-1})$.
Next, let $V_3$ be the set of the first descendants of $u_3$ on $P$ such that $|V_3 \cap Z| \ge \log k$.
We may continue in this manner to find such edges $v_i u_{i+1}$ until we reach a vertex $u_{j+1}$ which is at least $2k$ steps higher than the vertex $v_1$. Since each vertex $v_{i+1}$ is at least $\frac{1}{2}\epsilon k$ steps above $v_i$, after at most $4\epsilon^{-1}$ steps we reach the vertex $u_{j+1}$, that is, $j \le 4\epsilon^{-1}$. Thus, by the union bound over these at most $4\epsilon^{-1}$ steps, the procedure does not fail with probability $1 - o(1)$. Note that we also remain within the path $P$, since $P$ has length at least $4k$, we start at most at height $k$, and with each step we go up at most $(1-5\epsilon)k$.
Suppose $j$ is even. Consider the following cycle $C$:

Note that every vertex in $V(P) \setminus V(v_1 P u_{j+1})$ is contained in some $V_i$. Therefore, the length of $C$ is at least

A similar argument applies if $j$ is odd.

Long Cycles in Pseudo-Cliques
Consider the well-known $G(n,p)$-model; in our notation, a member chosen at random is $(K_n)_p$. It is very natural and intuitive that $(K_n)_p$ and $H_p$ typically have the same properties if $H$ is a graph on $n$ vertices which is almost a clique. In this section we indicate that a result of Frieze [6] can be suitably modified. Let $\gamma > 0$ be a sufficiently small constant. We call a graph $G$ on $n$ vertices a $k$-pseudo-clique (or simply pseudo-clique) if its minimum degree is at least $k$ and $n \le (1+\gamma)k$. We start with some properties of a pseudo-clique $G$, but first we need to introduce some notation.
A vertex $v$ has small degree if $d_{G_p}(v) \le \frac{c}{10}$, and otherwise its degree is large. Let $S$ and $L$ be the sets of all vertices of small and large degree in $G_p$, respectively. For $1 \le i \le 4$, let $W_i$ be the set of all vertices $v$ of small degree such that there is a vertex $w$ of small degree and a $v,w$-path of length $i$, or $v$ is contained in a cycle of length $i$. We set $W = W_1 \cup \ldots \cup W_4$.
The following lemmas are extensions of the results of Frieze [6], who proved the analogous results for $G = K_{k+1}$. As the proofs are quite standard, a bit tedious, and can be done along the lines of the proofs of Frieze, we omit them.
Lemma 11. Let $G$ be a $k$-pseudo-clique on $n$ vertices, $p = \frac{c}{k}$, and let $\ell \ge 7$ be an integer. Then a.a.s. $G_p$ has the following properties.

Lemma 12. Let $G$ be a $k$-pseudo-clique on $n$ vertices, $p = \frac{c}{k}$, and let $X_1, X_2, \ldots$ be a sequence obtained by the following rule

If $X = \bigcup_{j \ge 1} X_j$, then $|X| \le 500c^4 e^{-\frac{4c}{3}}k$ a.a.s.
Let $V_2$ be the vertex set of the largest subgraph of $G_p$ with minimum degree 2 ($G_p[V_2]$ is also known as the 2-core). Moreover, let $Y$ be the set of all vertices $v$ in $G$ which have degree 2 and have a neighbor in $X$ in $G_p$. Let $A = V_2 \setminus (W \cup X \cup Y)$.
Lemma 13. Let $G$ be a $k$-pseudo-clique on $n$ vertices and $p = \frac{c}{k}$. Then, a.a.s.
Having proved these three lemmas for pseudo-cliques, one can go once again along the lines of the result of Frieze to obtain the following.
Theorem 14. If $G$ is a $k$-pseudo-clique on $n$ vertices and $p = \frac{c}{k}$, then a.a.s. $G_p$ contains a cycle of length at least $(1 - (1+\epsilon(c))ce^{-c})k$, where $\epsilon(c) \to 0$ as $c \to \infty$.

Long Paths
This section is devoted to the proof of Theorem 2. This proof is inspired by a result in [8] proving that a.a.s. the random subgraph $G_p$ of a graph $G$ of minimum degree at least $k$ contains a path of length $k$ if $p = \frac{(1+\epsilon)\log k}{k}$ for any fixed $\epsilon > 0$.
Proof of Theorem 2. Let c be sufficiently large and let ǫ = 5 c and the minimum degree of the graph $G[V']$ is at least $\left(1 - \frac{2}{\log k}\right)k$, then by Theorem 14, $G_p$ a.a.s. contains a cycle of length at least $(1 - (1+\epsilon(c))ce^{-c})k$, for some function $\epsilon(c) \to 0$ as $c \to \infty$, which implies the statement. Hence, we may assume that $G$ does not contain such a set $V'$.
In the following, we use a technique which is known as sprinkling. In our case, we expose the edges of $G_p$ in three rounds, and in each round we suppose an edge to be present independently with probability $\frac{c}{3k}$. Thus we consider the union of three graphs $G_{p_1} \cup G_{p_2} \cup G_{p_3}$, where $p_i = \frac{c}{3k}$. The union of these three graphs underestimates the model $G_p$. Therefore, if we can show that $G_{p_1} \cup G_{p_2} \cup G_{p_3}$ a.a.s. contains a path of the desired length, then also $G_p$ a.a.s. contains such a path. By Theorem 3, we know that $G_{p_1}$ a.a.s. contains a cycle $C$ of length at least $(1-\epsilon)k$. Moreover, we may assume that $|C| \le k$.

We divide the proof into two parts. First, we suppose that $|A| \le 10\epsilon k$. Hence, if $B \neq \emptyset$, then $G[B]$ has minimum degree at least $10\epsilon k$.
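The claim that the union of the three sprinkling rounds underestimates $G_p$ rests on a one-line computation: an edge of $G$ lies in the union with probability $1 - (1 - p/3)^3 \le p$. The sketch below checks this numerically; the function name `union_edge_probability` and the sample values of $k$ and $c$ are illustrative assumptions, not taken from the paper.

```python
def union_edge_probability(p, rounds=3):
    """Probability that an edge appears in at least one of `rounds`
    independent samples, each keeping the edge with probability p / rounds."""
    return 1 - (1 - p / rounds) ** rounds

k, c = 1000, 50.0
p = c / k
print(union_edge_probability(p), "<=", p)
```

Since the union keeps every edge independently with probability at most $p$, it can be coupled as a subgraph of $G_p$, so any path found in the union is also present in $G_p$.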
Suppose first that there are at least $4k\log k$ edges joining $C$ and $B$ in $G$, and denote this set by $E$. Consider an ordering $b_1, b_2, \ldots$ of the vertices in $B$ and an ordering $e_1, e_2, \ldots$ of the edges in $E$ which respects the ordering of $B$, that is, if $i < j$, then the indices of the edges incident to $b_i$ are smaller than the indices of the edges incident to $b_j$. For $1 \le i \le \lceil 2\log k\rceil$, let $E_i = \{e_j : (2i-2)k + 1 \le j \le (2i-1)k\}$. This implies that there is no vertex $b \in B$ incident to an edge in $E_i$ and an edge in $E_j$ for $i \neq j$, since a vertex in $B$ has at most $|V(C)| \le k$ neighbors in $C$. Moreover, every set $E_i$ contains at least one edge in $G_{p_2}$ with probability at least $1 - e^{-\frac{c}{3}}$, independently for every $i$. Thus by Chernoff's inequality, at least $\log k$ sets $E_i$ contain an edge in $G_{p_2}$ with probability $1 - o(1)$. Let $S$ be a set of $\log k$ vertices in $B$ incident to an edge in $G_{p_2}$. By Lemma 6, with probability $1 - o(1)$, there is a path in $G_{p_3}[B]$ starting in $S$ of length, say, $\epsilon k$. Combining $C$, a suitable edge in some $E_i$, and this path leads to a path in $G_{p_1} \cup G_{p_2} \cup G_{p_3}$ of length at least $k$ with probability $1 - o(1)$.
Therefore, we may assume that there are at most $4k\log k$ edges joining $C$ and $B$ in $G$. Hence $|A \cup C| \ge k - 5\log k$, as otherwise every vertex in $C$ has at least $5\log k$ neighbors in $B$, contradicting our assumption.
Next, we suppose that there exists a set $A' \subseteq A$ of at least $\sqrt{k}$ vertices, each having at least $k^{\frac{2}{3}}$ neighbors in $B$. As any vertex in $A'$ is adjacent to at least one vertex in $C$ in $G_{p_2}$ with probability close to 1, say $\frac{3}{4}$, independently of each other, with probability $1 - o(1)$ there exists a set $A'' \subseteq A'$ of size at least $\frac{|A'|}{2}$ such that every vertex in $A''$ is adjacent to $C$ in $G_{p_2}$. By a similar argument as before, with probability $1 - o(1)$, there are $\log k$ vertices in $B$ such that each of them has a neighbor in $A''$ in $G_{p_2}$. Again, with probability $1 - o(1)$, there is a path in $G_{p_3}$ of length at least $\epsilon k$ starting in one of these vertices in $B$, and this leads to a path of length at least $k$ in $G_{p_1} \cup G_{p_2} \cup G_{p_3}$ with probability $1 - o(1)$.
Therefore, there are at most $2\sqrt{k}$ vertices $v$ in $A \cup C$ with $d_B(v) \ge k^{\frac{2}{3}}$; let $Z$ be obtained from $A \cup C$ by deleting all these vertices. Clearly, $|Z| \ge k - 2\sqrt{k}$. As $|Z| \le (1+10\epsilon)k$, the set $Z$ is a set as in (4), which is a contradiction.
Thus from now on, we may assume that $|A| \ge 10\epsilon k$. Let $A_1 \subseteq A$ with $|A_1| = 10\epsilon k$. We partition $C$ into $\frac{1}{10\epsilon}$ cycle segments $S_1, S_2, \ldots$, each of length almost $10\epsilon k$. As every vertex in $A_1$ has at least $(1-20\epsilon)k$ neighbors in $C$, by a simple averaging argument, there is a segment, say $S_1$, such that the number of edges between $S_1$ and $A_1$ is at least $(1-20\epsilon)|A_1||S_1|$. Let $H$ be the bipartite subgraph of $G$ which is induced by $A_1$ and $S_1$. This implies that the bipartite complement of $H$ has at most $2000\epsilon^3 k^2$ edges. Of course, this graph contains at most $100\epsilon^{\frac{3}{2}}k$ vertices of degree at least $100\epsilon^{\frac{3}{2}}k$. Let $H'$ be the graph obtained by deleting these vertices from $H$. Thus $H'$ has minimum degree at least $(1 - 20\sqrt{\epsilon}) \cdot 10\epsilon k$. For some orientation of $C$, let $L$ and $R$ be the first and last $\epsilon k$ vertices on $C$ in $S_1$. Moreover, remove an arbitrary subset of $A_1$ to obtain from the graph $H' \setminus (R \cup L)$ a balanced bipartite graph $H''$. Thus $H''$ has minimum degree at least $(1 - 25\sqrt{\epsilon}) \cdot 8\epsilon k$. By Lemma 5, $H''$ contains a path $P$ of length $15\epsilon k$ in $G_{p_2}$ with probability $1 - o(1)$. Let $P_1$ and $P_2$ be the subpaths at the beginning and at the end of $P$ of length $\epsilon k$, respectively. By Chernoff's inequality, with probability $1 - o(1)$, in $G_{p_3}$ there exists an edge $e_1$ joining a vertex in $L$ and a vertex in $V(P_1) \cap A_1$, and an edge $e_2$ joining a vertex in $R$ and a vertex in $V(P_2) \cap A_1$.
Combining the subpath of C between the endpoints of e 1 and e 2 that contains the segment S 2 , the subpath of P between the endpoints of e 1 and e 2 , and