Fixation Probability for Competing Selective Sweeps

We consider a biological population in which a beneficial mutation is undergoing a selective sweep when a second beneficial mutation arises at a linked locus and we investigate the probability that both mutations will eventually fix in the population. Previous work has dealt with the case where the second mutation to arise confers a smaller benefit than the first. In that case population size plays almost no role. Here we consider the opposite case and observe that, by contrast, the probability of both mutations fixing can be heavily dependent on population size. Indeed the key parameter is $\rho N$, the product of the population size and the recombination rate between the two selected loci. If $\rho N$ is small, the probability that both mutations fix can be reduced through interference to almost zero while for large $\rho N$ the mutations barely influence one another. The main rigorous result is a method for calculating the fixation probability of a double mutant in the large population limit.


Introduction
Natural populations incorporate beneficial mutations through a combination of chance and the action of natural selection. The process whereby a beneficial mutation arises (in what is generally assumed to be a large and otherwise neutral population) and eventually spreads to the entire population is called a selective sweep. When beneficial mutations are rare, we can make the simplifying assumption that selective sweeps do not overlap. A great deal is known about such isolated selective sweeps (see e.g. Chapter 5 of Ewens 1979). Haldane (1927) showed that under a discrete generation haploid model, the probability that a beneficial allele with selective advantage $\sigma$ eventually fixes in a population of size $2N$, i.e. its frequency increases from $1/(2N)$ to 1, is approximately $2\sigma$. Much less is understood when selective sweeps overlap, i.e. when further beneficial mutations arise at different loci during the timecourse of a sweep.
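Haldane's $2\sigma$ approximation can be recovered numerically as the survival probability of a branching process. The sketch below assumes Poisson$(1+\sigma)$ offspring numbers (a standard idealisation, not specified in the text) and solves the fixed-point equation $p = 1 - e^{-(1+\sigma)p}$ by iteration.

```python
import math

def fixation_prob(sigma, tol=1e-12, max_iter=100_000):
    """Survival probability of a branching process with Poisson(1 + sigma)
    offspring: the positive root of p = 1 - exp(-(1 + sigma) * p), the
    classical approximation behind Haldane's 2*sigma result."""
    p = 0.5  # initial guess for the survival probability
    for _ in range(max_iter):
        p_new = 1.0 - math.exp(-(1.0 + sigma) * p)
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p

for s in (0.01, 0.02, 0.05):
    print(s, fixation_prob(s), 2 * s)
```

For small $\sigma$ the root is close to $2\sigma$, with a correction of order $\sigma^2$.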
Our aim here is to investigate the impact of the resulting interference in the case when two sweeps overlap. In particular, we shall investigate the probability that both beneficial mutations eventually become fixed in the population. Because genes are organised on chromosomes and chromosomes are in turn grouped into individuals, different genetic loci do not evolve independently of one another. On the other hand, in a diploid population (in which chromosomes are carried in pairs), chromosomes are not passed down as intact units either. A given chromosome is inherited from one of the two parents, but recombination or crossover events can result in the allelic types at two distinct loci being inherited one from each of the corresponding pair of chromosomes in the parent. We refer to these chromosomes as 'individuals'.
Each individual in the population will have a type denoted $ij$ where $i, j \in \{0, 1\}$. We use the first and second digit, respectively, to indicate whether the individual carries the more recent or the older beneficial mutation, and assume that the fitness effects of these two mutations are additive. Suppose that a single advantageous allele with selective advantage $\sigma_1$ arises in an otherwise neutral (type 00) population of size $2N$, corresponding to a diploid population of size $N$. If we use $X_{ij}$ to denote the proportion of individuals of type $ij$, then the frequency of the favoured allele, $X_{01}$, will be well approximated by the solution to the stochastic differential equation
$$ dX_{01} = \sigma_1 X_{01}(1 - X_{01})\,ds + \sqrt{\tfrac{1}{2N} X_{01}(1 - X_{01})}\,dW(s), $$
where $s$ is the time variable, $\{W(s)\}_{s \ge 0}$ is a standard Wiener process, and $X_{01}(0) = 1/(2N)$ (Ethier & Kurtz 1986, Eq. 10.2.7). If the favoured allele reaches frequency $p$, then the probability that it ultimately fixes is
$$ \frac{1 - e^{-2N\sigma_1 p}}{1 - e^{-2N\sigma_1}}. $$
If a sweep does take place then (conditioning on fixation) we obtain
$$ d\tilde X_{01} = \sigma_1 \tilde X_{01}(1 - \tilde X_{01}) \coth(N\sigma_1 \tilde X_{01})\,ds + \sqrt{\tfrac{1}{2N} \tilde X_{01}(1 - \tilde X_{01})}\,dW(s), $$
and from this it is easy to calculate the expected duration of the sweep. Writing $T_{fix} = \inf\{s \ge 0 : \tilde X_{01}(s) = 1\}$ with $\tilde X_{01}(0) = 1/(2N)$, we have (see for example Etheridge et al. 2006)
$$ \mathbb{E}[T_{fix}] \approx \frac{2}{\sigma_1} \log(2N). $$

Now suppose that, at the moment when the frequency of type 01 individuals is $U$, a new beneficial mutation with selection coefficient $\sigma_2$ occurs at a second linked locus in a randomly chosen individual, and that the recombination rate between these two loci is $\rho$. If we assume that the arrival time of the second mutation is uniformly distributed over the timecourse of the sweep of the first mutation and that $N$ is large, then we can expect either $U$ or $1 - U$ to be close to 0 but $\gg 1/(2N)$. The new mutation can arise in a type 00 or type 01 individual, forming a single type 10 individual in the former case and a single type 11 individual in the latter. If the second mutation arises during the first half (in terms of time) of the sweep of the first mutation, then $U$ is likely to be very small and it is more likely that a type 10 individual is formed. Otherwise, the second mutation arises during the second half of the sweep and the formation of a type 11 individual is more likely. The case of the second beneficial mutation forming a type 11 individual is relatively straightforward. Since type 11 is fitter than all other types, its fixation is almost certain once it becomes 'established' in the population, i.e. once the number of type 11 individuals is much larger than 1. If the population size is very large, then it only takes a short time to determine whether type 11 establishes itself, and we can assume the proportion of type 01 individuals remains roughly constant during this time. Hence the fixation probability of type 11 is essentially its establishment probability, which is approximately $2(\sigma_2 + \sigma_1(1 - U))$, twice the 'effective' selective advantage of type 11 in a population consisting of $2NU$ type 01 and $2N(1 - U)$ type 00 individuals.
The case of the second beneficial mutation forming a type 10 individual is far more interesting. In order for both mutations to sweep through the population, recombination must produce an individual carrying both mutations. The relative strength of selection acting on the two loci now becomes important. The case of $\sigma_1 > \sigma_2$ has been dealt with in Barton (1995) and Otto & Barton (1997). Here, since type 01 is already present in significant numbers when the new mutation arises (and type 01 is fitter than type 10), the trajectory of $X_{01}$ is well approximated by the logistic growth curve $1/(1 + \exp(-\sigma_1 t))$ until $X_{11}$ reaches a level of $O(1)$. At that point, fixation of type 11 is all but certain. Barton (1995) then uses a branching process approximation to estimate the establishment probability of a type 11 individual produced by recombination. In particular, his approach is independent of population size. Not surprisingly, he finds that the fixation probability of the second mutation is reduced if it arises as a type 10 individual, but increased if it arises as a type 11 individual. Simulation studies performed in Otto & Barton (1997) confirm these findings in the case $\sigma_1 > \sigma_2$. Gillespie (2001) considers the effects of repeated substitutions at a strongly selected locus on a completely linked (i.e. there is no recombination) weakly selected locus, extending his work in Gillespie (2000), where he considers a linked neutral locus. He too sees little dependence of his results on population size, leading him to suggest repeated genetic hitchhiking events as an explanation for the apparent insensitivity of the genetic diversity of a population to its size. Kim (2006) extends the work of Gillespie (2001) by considering the effect of repeated sweeps on a tightly (but not completely) linked locus. This whole body of work is concerned, in our terminology, with $\sigma_1 > \sigma_2$.
The case of $\sigma_2 > \sigma_1$ brings quite a different picture. The analysis used in Barton (1995) breaks down for the following reason: because the second beneficial mutation is more competitive than the first, type 10 is destined to start a sweep itself if it gets established in the population. Once $X_{10}$ reaches $O(1)$, $X_{01}$ is no longer well approximated by a logistic growth curve and in fact will decrease to 0. The fixation probability of type 11 will then depend on the nonlinear interaction of all four types, $\{11, 10, 01, 00\}$, and our analysis will show that it is heavily dependent on population size. See Figure 1 below. This paper is organized as follows. In §2.1 we set up a continuous time Moran model for the evolution of our population. In the biological literature, it would be more usual to consider a Wright-Fisher model, in which the population evolves in discrete, non-overlapping generations. The choice of a Moran model, in which generations overlap, is a matter of mathematical convenience. One expects similar results for a Wright-Fisher model. The choice of a discrete individual based model rather than a diffusion is forced upon us by our method of proof, but is anyway natural in a setting where population size plays a rôle in the results. A brief analysis of our model, for very large $N$, leads to our main rigorous result, Theorem 2.3, which provides a method to calculate the asymptotic ($N \to \infty$) fixation probability of type 11 when $\sigma_2 > \sigma_1$. We discuss the case of moderate $N$ in §2.3. The rest of the paper is devoted to proofs, with §3 containing the proof of Theorem 2.3 and §4 containing the proof of Proposition 3.1. Results in §4 rely on supporting lemmas of §5.

A Moran Model for Two Competing Selective Sweeps
In this section we describe our model for the evolution of two competing selective sweeps. We use the notation from the introduction for the four possible types of individual in the population, $I = \{00, 10, 01, 11\}$, and assume that at the time when the second mutation arises, the number $U \in \{0, 1, \ldots, 2N\}$ of type 01 individuals in the population is known. From now on we use $t = 0$ to denote the time when the second mutation arises. As explained in §1, we may assume that $U$ is much larger than 1. Let $\sigma \in [0, 1]$ be the selective advantage of the second beneficial mutation and $\sigma\gamma$ the selective advantage of the first beneficial mutation (for some $\gamma > 0$). The recombination rate between the two selected loci is denoted by $\rho$, which we assume to be $o(1)$. We use $\{(\eta_n \zeta_n), n = 1, \ldots, 2N\}$ to denote the types of the individuals in the population, so that an individual of type $(\eta\zeta)$ has selective advantage $\sigma(\eta + \gamma\zeta)$. At time $t = 0$, we assume that the population of $2N$ individuals consists of $2N - U - 1$ type 00 individuals, $U$ type 01 individuals and 1 type 10 individual. The dynamics of the model are as follows:

1. Recombination: each ordered pair of individuals, $(\eta_m \zeta_m)$ and $(\eta_n \zeta_n)$, is chosen at rate $\rho/(2N)$. With probability $1/2$, a type $(\eta_m \zeta_n)$ individual replaces $(\eta_m \zeta_m)$; otherwise, a type $(\eta_n \zeta_m)$ individual replaces $(\eta_m \zeta_m)$.

2. Resampling (and selection): each ordered pair of individuals, $(\eta_m \zeta_m)$ and $(\eta_n \zeta_n)$, is chosen at rate $1/(2N)$. With probability
$$ p(\eta_m \zeta_m, \eta_n \zeta_n) = \tfrac{1}{2}\left(1 + \sigma(\eta_m + \gamma\zeta_m) - \sigma(\eta_n + \gamma\zeta_n)\right), $$
a type $(\eta_m \zeta_m)$ individual replaces $(\eta_n \zeta_n)$; otherwise a type $(\eta_n \zeta_n)$ individual replaces $(\eta_m \zeta_m)$.
Remark 2.1. Evidently we must assume σ(1 + γ) ≤ 1 to ensure that all probabilities used in the definition of the model are in [0, 1].
Remark 2.2. If $\rho$ and $\sigma$ are small, then decoupling recombination from the rest of the reproduction process does not greatly affect the behaviour of the model, and it simplifies the analysis.
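The model of §2.1 can be simulated directly. The sketch below is a minimal event-driven version under our reading of the dynamics: the replacement probability is taken to be $p(\eta_m\zeta_m, \eta_n\zeta_n) = \frac{1}{2}(1 + \sigma(\eta_m + \gamma\zeta_m) - \sigma(\eta_n + \gamma\zeta_n))$ (a reconstruction, since the display defining $p$ is not reproduced here), relative event rates are folded into per-event probabilities so absolute time is not tracked, and all parameter values are illustrative.

```python
import random

def simulate(N2, U, sigma, gamma, rho, rng):
    """One run of a discrete Moran-type model for types 00, 01, 10, 11.
    N2 is the population size 2N. Returns the type that eventually fixes."""
    # fitness of type (eta, zeta) is sigma * (eta + gamma * zeta)
    fitness = {'00': 0.0, '10': sigma, '01': sigma * gamma,
               '11': sigma * (1 + gamma)}
    # initial population: one type 10 mutant, U type 01, the rest type 00
    pop = ['10'] + ['01'] * U + ['00'] * (N2 - U - 1)
    while len(set(pop)) > 1:
        m, n = rng.sample(range(N2), 2)
        if rng.random() < rho / (1.0 + rho):
            # recombination: individual m keeps one locus, takes the other from n
            if rng.random() < 0.5:
                pop[m] = pop[m][0] + pop[n][1]
            else:
                pop[m] = pop[n][0] + pop[m][1]
        else:
            # resampling with selection: m replaces n with probability p(m, n)
            # (reconstructed form: 1/2 * (1 + fitness(m) - fitness(n)))
            p = 0.5 * (1 + fitness[pop[m]] - fitness[pop[n]])
            if rng.random() < p:
                pop[n] = pop[m]
            else:
                pop[m] = pop[n]
    return pop[0]

rng = random.Random(2)
runs = [simulate(60, 10, 0.4, 0.5, 0.1, rng) for _ in range(40)]
print(sum(r == '11' for r in runs) / len(runs))
```

Note that with $\rho = 0$ and no initial type 11 individual, type 11 can never be created, which gives a useful sanity check on the implementation.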

Analysis and Results for Large N
We are concerned primarily with the case of very large population sizes, which is the regime where our main rigorous result, Theorem 2.3, operates. A nonrigorous analysis for moderate population sizes based on very similar ideas is also possible but will appear in Yu & Etheridge (2008).
To motivate our result, we present a heuristic analysis of the possible scenarios. The proof of our main result fills in the necessary steps to make this rigorous. If the second beneficial mutation gives rise to a single type 10 individual, then the process whereby type 11 becomes fixed must proceed in three stages, and our approach is to estimate the probability of each of these hurdles being overcome. First, following the appearance of the new mutant, $X_{10}$ must 'become established', by which we mean achieve appreciable frequency in the population. Without this, there will be no chance of step two: recombination of a type 01 and a type 10 individual to produce a type 11. Finally, type 11 must become established (after which its ultimate fixation is essentially certain). Of course this may not happen the first time a new recombinant is produced. If type 11 becomes extinct and neither $X_{01}$ nor $X_{10}$ is one, then we can go back to step two.
We assume the first mutation has been undergoing a selective sweep prior to the arrival of the second mutation. Before the arrival of the second beneficial mutation (during which time $X_{10}$ and $X_{11}$ are both 0), we can write
$$ X_{01}(s) = X_{01}(0) + \sigma\gamma \int_0^s X_{01}(u)(1 - X_{01}(u))\,du + M_{01}(s), $$
where $M_{01}$ is a martingale with maximum jump size $1/(2N)$ and quadratic variation
$$ \langle M_{01} \rangle(s) = \frac{1+\rho}{2N} \int_0^s X_{01}(u)(1 - X_{01}(u))\,du, $$
i.e. $\langle M_{01} \rangle$ is the unique previsible process such that $M_{01}(s)^2 - \langle M_{01} \rangle(s)$ is a martingale; see e.g. §II.3.9 of Ikeda & Watanabe (1981). We drop the martingale term $M_{01}$ and approximate the trajectory of $X_{01}$ by a logistic growth curve, i.e. $X_{01}(s) \approx 1/(1 + (2N - 1)e^{-\sigma\gamma s})$, which solves $\frac{dX_{01}}{ds} = \sigma\gamma X_{01}(s)(1 - X_{01}(s))$ with $X_{01}(0) = 1/(2N)$. As discussed in §1, if we assume that the arrival time of the second mutation is uniformly distributed over the timecourse of the sweep of the first and $N$ is large, then $X_{01}$ spends most of the time near 0 or near 1.
We divide into two cases.
1. The second mutation arises during the first half of the sweep of the first mutation, i.e. when $X_{01} < 1/2$.
2. The second mutation arises during the second half of the sweep of the first mutation, i.e. when $X_{01} \ge 1/2$.
In Case 2, $X_{01}$ is close to 1 and it is most likely that the second mutation arises in a type 01 individual, forming a single type 11 individual. In that case the fixation probability of type 11 is roughly the same as its establishment probability in a population consisting entirely of type 01 individuals, which in turn is roughly $2\sigma/(1 + \sigma)$.
From now on, we focus on the more interesting Case 1. In what follows, $t = 0$ is the time of arrival of the second beneficial mutation. In this case it is most likely that the second mutation arises in a type 00 individual, resulting in a single type 10 individual in the population. If we approximate the growth of $X_{01}$ by a logistic growth curve, then it reaches $1/2$ at time $\frac{1}{\sigma\gamma}\log(2N - 1) \approx \frac{1}{\sigma\gamma}\log(2N)$. Choosing the time of the introduction of the new mutation uniformly on $[0, \frac{1}{\sigma\gamma}\log(2N)]$, we see that at $t = 0$,
$$ X_{01}(0) \approx (2N)^{-\zeta}, \qquad \text{where } \zeta \text{ is uniformly distributed on } (0, 1). $$
The establishment probability of type 10 in this case is relatively easy to estimate. Since $\sigma_2 > \sigma_1$, type 10 either dies out or becomes established before $X_{01}$ can grow to be a significant proportion of the population. Therefore the establishment probability of type 10 is almost the same as that of a type 10 individual arising in a population consisting entirely of type 00 individuals, roughly $2\sigma/(1 + \sigma)$.
We observe that if type 11 does get established, then since it has a fitness advantage over all other types, the probability that it eventually fixes is very close to 1 (this follows from Lemma 3.2). Therefore we can concentrate on the behaviour of $X$ before $X_{11}$ reaches, say, $\log(2N)/(2N)$, which is still very small compared to 1. After type 10 is established and prior to type 11 being established, we approximate $X_{10}$ and $X_{01}$ deterministically. Until either $X_{10}$ or $X_{01}$ is $O(1)$, both grow roughly exponentially, so assuming that type 10 gets established, we have
$$ X_{10}(t) \approx (2N)^{-1} e^{\sigma t}, \qquad X_{01}(t) \approx (2N)^{-\zeta} e^{\sigma\gamma t}. \qquad (2.2) $$
We divide Case 1 further into two sub-cases. See Figure 2 for an illustration.

Case 1a, $\zeta < \gamma$. The approximation (2.2) fails once either $X_{10}$ or $X_{01}$ reaches $O(1)$, which occurs at time $\frac{1}{\sigma}\log(2N) \wedge \frac{\zeta}{\sigma\gamma}\log(2N)$. If $\zeta < \gamma$, then $X_{01}$ reaches $O(1)$ before $X_{10}$ does, and will further increase to almost 1 (which takes time only $O(1)$) before $X_{10}$ reaches $O(1)$. At this time, which we denote $T_1$, the population consists almost entirely of types 01 and 10. Type 10, already established but still just a small proportion of the population, will then proceed to grow logistically, displacing type 01 individuals until $X_{10}$ is close to 1 at time $T_2$. During $[T_1, T_2]$ (of length $O(1)$), both $X_{01}$ and $X_{10}$ are $O(1)$, so we expect $O(\rho N)$ recombination events between them, producing $O(\rho N)$ type 11 individuals. Each type 11 individual has a probability of at least $2\sigma\gamma/(1 + \sigma\gamma)$ of eventually becoming the common ancestor of all individuals in the population. So if we want a nontrivial limit (as $N \to \infty$) for the fixation probability of type 11, we should take $\rho = O(1/N)$. When we use the term nontrivial here, we mean that as $N \to \infty$, (i) the fixation probability does not tend to 0, due to a lack of recombination events between type 10 and type 01 individuals, and (ii) nor does it tend to the establishment probability of type 10, due to infinitely many type 11 births, one of which is bound to sweep to fixation.

Case 1b, $\zeta > \gamma$.
In this case, $X_{10}$ reaches $O(1)$ at time roughly $\frac{1}{\sigma}\log(2N)$, before $X_{01}$ does, and $X_{01}$ is $O((2N)^{\gamma-\zeta})$ at this time. Furthermore, the biggest $X_{01}$ can get is $O((2N)^{\gamma-\zeta})$, since $X_{10}$ will very soon afterwards increase to almost 1, after which $X_{01}$ will decrease exponentially (since type 01 is less fit than type 10). Hence we expect $O(\rho N^{1+\gamma-\zeta})$ recombination events between type 10 and type 01, and the 'correct' scaling for $\rho$ is $\rho = O(N^{\zeta-\gamma-1})$ in this case.
In Case 1a, we take $\rho = O(1/N)$; then most of the recombination events between type 10 and type 01 individuals occur while type 10 is logistically displacing type 01, i.e. in the time interval $[T_1, T_2]$. During this time, we can approximate $X_{10}$ and $X_{01}$ by $Z_{10}$ and $1 - Z_{10}$, respectively, where $Z_{10}$ is deterministic and obeys the logistic growth equation with parameter $\sigma(1 - \gamma)$, twice the advantage of type 10 over type 01. We can further approximate $X_{11}$ by a birth and death process $Z_{11}$ with deterministic but time-varying rates that depend on $Z_{10}$. Specifically, the rates of increase and decrease for $Z_{11}$ are the same as $r^{\pm}_{11}$ in (2.1), but with $X_{10}$ replaced by $Z_{10}$, $X_{01}$ replaced by $1 - Z_{10}$, and $X_{11}$ replaced by 0.
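The $O(\rho N)$ count of recombination events can be made concrete. If $Z_{10}$ follows the logistic curve with parameter $\theta = \sigma(1-\gamma)$ and type 11 births occur at rate $\rho(2N)Z_{10}(1 - Z_{10})$ (the rate used later in §3), then the substitution $dz = \theta z(1-z)\,dt$ gives an expected count of $\rho(2N)(1 - 2c_1)/\theta$ while $Z_{10}$ runs from $c_1$ to $1 - c_1$. A numerical check, with illustrative parameters (here `N2` denotes $2N$):

```python
import numpy as np

def expected_recombinants(N2, rho, sigma, gamma, c1, steps=100_000):
    """Numerically integrate rho * (2N) * z * (1 - z) dt while the logistic
    curve z (parameter sigma * (1 - gamma)) runs from c1 up to 1 - c1."""
    theta = sigma * (1 - gamma)
    t_end = (2 / theta) * np.log((1 - c1) / c1)   # time for z: c1 -> 1 - c1
    t = np.linspace(0.0, t_end, steps)
    z = c1 / (c1 + (1 - c1) * np.exp(-theta * t))  # closed-form logistic
    f = rho * N2 * z * (1 - z)
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t))  # trapezoid rule

N2, sigma, gamma, c1 = 10_000, 0.1, 0.5, 0.05   # illustrative values
rho = 1.0 / N2                                  # the rho = O(1/N) regime
numeric = expected_recombinants(N2, rho, sigma, gamma, c1)
closed = rho * N2 * (1 - 2 * c1) / (sigma * (1 - gamma))
print(numeric, closed)
```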
The probability that $X_{11}$ gets established, i.e. reaches $\delta_{11}$, is then approximated by the probability that the birth and death process $Z_{11}$ reaches $\delta_{11}$. The latter can be found by solving the forward equation for the process $Z_{11}$, which can be found in (3.3). We define the fixation time of the Moran particle system of §2.1 to be the first time at which the population consists entirely of individuals of a single type. We observe that the Markov chain $(X_{00}, X_{01}, X_{10})$ has finitely many states and that the recurrent states are $R = \{(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)\}$. Every other state is transient, and there is positive probability of reaching $R$ from any transient state in finite time. Therefore the fixation time is almost surely finite. Our main result, Theorem 2.3 below, concerns Case 1a, which is the most likely scenario if $\gamma$ is close to 1.
Theorem 2.3. If $\zeta < \gamma < 1$ and $\rho = O(1/N)$, then there exists $\delta > 0$, whose value depends on $\rho$, $\sigma$, $\gamma$, and $\zeta$, such that
$$ P(\text{type 11 eventually fixes}) = \frac{2\sigma}{1+\sigma}\, p^{(11)}_{\delta_{11}}(T_\infty) + O((2N)^{-\delta}). $$
In the above, $\frac{2\sigma}{1+\sigma}$ corresponds to the establishment probability of type 10, while $p^{(11)}_{\delta_{11}}(T_\infty)$ approximates the establishment probability of type 11 conditional on type 10 becoming established.

Figure 3 compares fixation probabilities obtained from simulation, our non-rigorous calculation (which we briefly discuss in §2.3 below), and the large population limit of Theorem 2.3. In Figure 3(a) we hold $\rho N$ constant, and observe that the fixation probability of type 11 increases with $N$ but does not change drastically as $N$ becomes large. The drop in the fixation probability of type 11 when $N$ is small may be explained by the fact that in this case the early phase for $X_{01}$ is very short, so $X_{01}$ quickly grows to a level at which it reduces the establishment probability of type 10. We use a population size of $2N = 50{,}000$ to approach the large population limit of Theorem 2.3. At $2N = 50{,}000$, it takes roughly 12 hours on a PC to obtain one data point in Figure 3, each based on 20,000 realisations. Apparently this population size still results in underestimates of the large population limit.

We expect a similar result for Case 1b, for which we provide an outline here. We take $\epsilon \le (\gamma - \zeta)/(2 + \gamma)$ and $t_1 = \frac{1-\epsilon}{\sigma}\log(2N)$; then at time $t_1$ we expect $X_{10}$ to be either 0 (with probability approximately $\frac{1-\sigma}{1+\sigma}$, as in Case 1a) or $O((2N)^{-\epsilon})$, and $X_{01}$ to be roughly $(2N)^{(1-\epsilon)\gamma-\zeta} \le (2N)^{-2\epsilon}$. Since $X_{01}$ and $X_{11}$ can be expected to be quite small before $t_1$, they exert little influence on the trajectory of $X_{10}$, which jumps by $\pm 1/(2N)$ at roughly the rates
$$ r^{+}_{10} \approx 2N X_{10}\,\tfrac{1}{2}(1 + \sigma + \rho), \qquad r^{-}_{10} \approx 2N X_{10}\,\tfrac{1}{2}(1 - \sigma - \rho). $$
Hence before $t_1$, $2N X_{10}$ resembles a continuous-time branching process $Z$ with generating function of the offspring distribution in the form $u(s) = \tfrac{1}{2}(1+\sigma+\rho)s^2 + \tfrac{1}{2}(1-\sigma-\rho)$, and, conditional on non-extinction, $X_{10}(t_1)$ is approximately exponentially distributed with mean $\frac{1+\sigma+\rho}{2\sigma}(2N)^{-\epsilon}$, as $N \to \infty$.
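The exponential limit quoted above can be checked by simulation: for a supercritical binary branching process with per-individual birth rate $\lambda$ and death rate $\mu$, $Z(t)e^{-(\lambda-\mu)t}$ conditioned on survival is asymptotically exponential with mean $\lambda/(\lambda-\mu)$. With the reconstructed rates $\lambda = \frac{1}{2}(1+\sigma+\rho)$ and $\mu = \frac{1}{2}(1-\sigma-\rho)$, this mean is $\frac{1+\sigma+\rho}{2(\sigma+\rho)}$, which agrees with $\frac{1+\sigma+\rho}{2\sigma}$ up to $O(\rho)$ terms since $\rho = o(1)$. A Monte Carlo sketch with illustrative parameters:

```python
import math, random

def branching_W(lam, mu, t_end, rng):
    """Simulate a binary branching process started from one individual;
    return Z(t_end) * exp(-(lam - mu) * t_end), or None if extinct."""
    z, t = 1, 0.0
    while z > 0:
        rate = z * (lam + mu)           # total event rate
        t += rng.expovariate(rate)      # time to the next birth or death
        if t > t_end:
            break
        z += 1 if rng.random() < lam / (lam + mu) else -1
    if z == 0:
        return None
    return z * math.exp(-(lam - mu) * t_end)

rng = random.Random(0)
sigma, rho = 0.5, 0.02
lam, mu = (1 + sigma + rho) / 2, (1 - sigma - rho) / 2
ws = [branching_W(lam, mu, 10.0, rng) for _ in range(1500)]
ws = [w for w in ws if w is not None]   # condition on survival
print(sum(ws) / len(ws), lam / (lam - mu))
```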
From time $t_1$ onwards, until either $X_{10}$ gets very close to 0 or $X_{01}$ becomes much smaller than $O((2N)^{(1-\epsilon)\gamma-\zeta})$, we can assume that the paths of $X_{01}$ and $X_{10}$ resemble those of deterministic processes $Z_{01}$ and $Z_{10}$, respectively, with the initial condition $Z_{10}(t_1)$ drawn according to an exponential distribution with mean $\frac{1+\sigma+\rho}{2\sigma}(2N)^{-\epsilon}$ and $Z_{01}(t_1) = (2N)^{(1-\epsilon)\gamma-\zeta}$. As in Case 1a, we can then approximate $X_{11}$ by a birth and death process $Z_{11}$ with rates the same as $r^{\pm}_{11}$ from (2.1), but with $X_{10}$ replaced by $Z_{10}$ and $X_{01}$ replaced by $Z_{01}$. The probability that $Z_{11}$ reaches $\delta_{11}$ can then be found by solving the forward equation for $Z_{11}$. Finally, we integrate this probability against the distribution of the initial condition $Z_{10}(t_1)$. The proof of such a result is more tedious than that of Theorem 2.3 but makes use of similar ideas.

Brief Comment on Moderate N
For moderate population sizes, the observation in Case 1a of §2.2 that $X_{01}$ increases to close to 1 before $X_{10}$ reaches $O(1)$ breaks down. We can, however, compute the density $f_T$ of the random time $T_{10;\delta_{10}}$ at which $X_{10}$ hits a certain level $\delta_{10}$, assuming that $X_{01}$ grows logistically before $T_{10;\delta_{10}}$. From $T_{10;\delta_{10}}$ onwards and before $X_{11}$ hits $\delta_{11}$, $X_{10}$ grows roughly deterministically, displacing both type 01 and type 00, so we can approximate $X_{11}$ by $Z_{11}$, a birth and death process with time-varying jump rates of the form $r^{\pm}_{11}$ in (2.1), but with $X_{10}$, $X_{01}$ and $X_{00}$ replaced by their deterministic approximations. Given $T_{10;\delta_{10}} = t$, we can numerically solve the forward equation for $Z_{11}$, which is directly analogous to (3.3), to find the probability that $Z_{11}$ eventually hits $\delta_{11}$, which we denote by $p^{(11)}_{est}(t)$. The dependence of $p^{(11)}_{est}$ on $t$ comes through the initial condition of the ODE system, which depends on $T_{10;\delta_{10}}$ via $X_{01}$. The fixation probability of type 11 is then approximately $\int p^{(11)}_{est}(t) f_T(t)\,dt$. This is the algorithm we use to produce the solid line in Figure 3(a); it is given in full detail in Yu & Etheridge (2008).
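The final step is a one-dimensional quadrature of $p^{(11)}_{est}(t)$ against $f_T(t)$. The sketch below only shows the shape of that computation: both functions are hypothetical stand-ins (the real $p^{(11)}_{est}$ comes from the forward equation for $Z_{11}$ and the real $f_T$ from the law of $T_{10;\delta_{10}}$, neither of which is reproduced here).

```python
import numpy as np

# Hypothetical stand-ins, tabulated on a time grid exactly as the real
# p_est and f_T would be in the algorithm described above.
def p_est(t):
    return 0.6 / (1.0 + np.exp(-(t - 5.0)))   # assumed establishment curve

def f_T(t):
    return 0.5 * np.exp(-0.5 * t)             # assumed Exp(rate 1/2) density

t = np.linspace(0.0, 60.0, 20_001)
integrand = p_est(t) * f_T(t)
# trapezoid rule for the mixture integral  int p_est(t) f_T(t) dt
fix_prob = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
print(fix_prob)
```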

Proof of the Main Theorem
We first define some of the functions, events, and stochastic processes needed for the proof, then give some intuition, before we proceed with the proof of Theorem 2.3. We begin by describing a deterministic process $Y_{10}$ and a birth and death process $Y_{11}$ which, up to a shift by a random time, are the processes $Z_{10}$ and $Z_{11}$ described in §2.2, respectively. They approximate the trajectories of $X_{10}$ and $X_{11}$, respectively, after the establishment of type 10. To describe the (time-inhomogeneous) rates we need the solution to the logistic growth equation
$$ L(t; y_0, \theta) = y_0 + \theta \int_0^t L(s; y_0, \theta)(1 - L(s; y_0, \theta))\,ds. \qquad (3.1) $$
In what follows, $a_0 = \zeta/(3\gamma)$ is a constant, $c_1, c_2, c_3$ are constants (slightly smaller than $O(1)$) that we specify precisely in Proposition 3.1, and $t_0 = \frac{a_0}{\sigma}\log(2N)$, together with the times $t_{early}$, $t_{mid}$ and $t_{late}$, are defined in (3.2). These deterministic times roughly correspond to the lengths of the 'stochastic', 'early' (an upper bound), 'middle', and 'late' phases of $X_{10}$ and $X_{01}$, whose rôle is described in more detail in §4. During the time interval when $Y_{10}$ is between $c_1$ and $1 - c_1$, whose length is exactly $t_{mid}$, there are birth events of $Y_{11}$ corresponding roughly to recombination events between type 10 and type 01 individuals.
It is convenient to write $k_- = k - 1/(2N)$ and $k_+ = k + 1/(2N)$. $Y_{11}$ is run until time $t_{mid} + t_{late}$. The probability that $Y_{11}$ hits $\delta_{11}$ before then can be found by solving a system of ODEs: let $p^{(11)}$ satisfy
$$ \frac{d}{dt} p^{(11)}_k(t) = r^+_{11}(k_-)\,p^{(11)}_{k_-}(t) + r^-_{11}(k_+)\,p^{(11)}_{k_+}(t) - \left(r^+_{11}(k) + r^-_{11}(k)\right) p^{(11)}_k(t), \qquad (3.3) $$
with initial condition $p^{(11)}_0(0) = 1$ and $p^{(11)}_k(0) = 0$ for $k \ne 0$. We use the following convention for stopping times: for any $ij \in \{00, 01, 10, 11\}$, level $x$, and processes $Y$ and $Z$,
$$ T_{ij;x} = \inf\{t \ge 0 : X_{ij}(t) \ge x\}, \qquad T_{Y_{ij};x} = \inf\{t \ge 0 : Y_{ij}(t) \ge x\}, \qquad T_{Z_{ij};x} = \inf\{t \ge 0 : Z_{ij}(t) \ge x\}, $$
and define the stopping times $T_\infty = T_{10;c_1} + t_{mid} + t_{late}$ and $S_{10,01,rec} = \inf\{t \ge 0 :$ there is a recombination event between a type 10 and a type 01 individual before time $t\}$.
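A generic solver for such a forward system, for a birth and death chain with absorbing endpoints, can be sketched as follows; the rate functions are placeholders to be replaced by the $r^{\pm}_{11}$ of (2.1), which we do not reproduce. With constant per-individual rates, the absorption probability must agree with the classical gambler's-ruin formula, which the example checks.

```python
import numpy as np

def establishment_prob(r_plus, r_minus, K, t_end, dt=2e-4):
    """Solve the forward (Kolmogorov) equations for a birth-death chain on
    {0, 1, ..., K} with 0 and K absorbing, started from state 1, by explicit
    Euler time-stepping. r_plus(k, t) and r_minus(k, t) are (possibly
    time-varying) jump rates, vectorised over the state vector k.
    Returns P(absorbed at K by time t_end)."""
    p = np.zeros(K + 1)
    p[1] = 1.0
    ks = np.arange(1, K)                     # interior (non-absorbing) states
    for step in range(int(t_end / dt)):
        t = step * dt
        rp = np.asarray(r_plus(ks, t), dtype=float)
        rm = np.asarray(r_minus(ks, t), dtype=float)
        dp = np.zeros(K + 1)
        dp[1:K] -= (rp + rm) * p[1:K]        # flow out of interior states
        dp[2:K + 1] += rp * p[1:K]           # births: k -> k + 1
        dp[0:K - 1] += rm * p[1:K]           # deaths: k -> k - 1
        p += dt * dp
    return p[K]

# sanity check: constant per-individual rates give the gambler's-ruin answer
est = establishment_prob(lambda k, t: 2.0 * k, lambda k, t: 1.0 * k, 20, 30.0)
ruin = (1 - 0.5) / (1 - 0.5 ** 20)   # (1 - mu/lam) / (1 - (mu/lam)^K)
print(est, ruin)
```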
We define the events $E_1, \ldots, E_9$ in terms of these stopping times, and we observe that $T_{11;1/(2N)} \ge S_{10,01,rec}$. First we outline the intuition behind these definitions: $t_0$ is the length of the initial 'stochastic' phase for $X_{10}$. At $t_0$, with high probability, $X_{10}$ either is $O((2N)^{a_0-1})$ or has hit 0 (event $E^c_1$). In the latter case, there is no need to approximate $X_{10}$ any further. On the other hand, if $E_1$ occurs, then type 10 is very likely to be established by $t_0$ and, with high probability, grows almost deterministically to reach level $c_1$ (slightly smaller than $O(1)$) at time $T_{10;c_1}$. Furthermore, as discussed in §1, in Case 1a, since $\zeta < \gamma$, with high probability $X_{01}(T_{10;c_1})$ is close to 1. Hence conditional on $E_1$, the event $E_2$ is very likely.
For paths in $E_1 \cap E_2$, we define $Z_{10}$ and $Z_{11}$ to be the approximations for the trajectories of $X_{10}$ and $X_{11}$, respectively, from time $T_{10;c_1}$ onwards. For convenience, we set $Z_{10}(t) = Z_{11}(t) = 0$ for $t \le T_{10;c_1}$. With the convention of (3.5), we observe that $Z_{10}(t) = 1$ for $t \ge T_{Z_{10};1-c_1}$. Since $X_{01}(T_{10;c_1}) \approx 1$, $X_{00}(T_{10;c_1})$ is very small and is unlikely to recover, because type 00 is the least fit type. During $[T_{10;c_1}, T_{Z_{10};1-c_1}]$, with high probability, type 10 grows logistically at rate $\sigma(1-\gamma)$, displacing type 01. Hence conditional on $E_1 \cap E_2$, $E_3$ is very likely. During $[T_{10;c_1}, T_{Z_{10};1-c_1}]$, the definition of $Z_{11}$ takes into account recombination events between type 01 and type 10 individuals, which produce type 11 individuals at rate $\rho(2N)X_{01}X_{10}$; in the definition of $Z_{11}$, this is approximated by $\rho(2N)Z_{10}(1 - Z_{10})$. Notice that we can approximate $X_{01}$ by $1 - Z_{10}$ since we assume throughout that $X_{11} \le \delta_{11}$, which is very small. Outside the time interval $[T_{10;c_1}, T_{Z_{10};1-c_1}]$, either $X_{10}$ is very small or very close to 1 (in which case $X_{01}$ is very small), so we ignore any recombination events. Because $Z_{11}$ closely approximates $X_{11}$, conditional on $E_1 \cap E_2 \cap E_3$, event $E_7$ has high probability.
After $T_{Z_{10};1-c_1}$, $X_{11} + X_{10}$ is likely to remain close to 1 (event $E_5$) and hit 1 at time $T_\infty$ (event $E_6$). We ignore any further recombination events between types 10 and 01, and $Z_{11}$ is a time-changed branching process during this period. If $Z_{11}$ has not hit $\delta_{11}$ by time $T_{Z_{10};1-c_1}$ (event $E_4$), then we continue to keep track of $Z_{11}$ until $T_\infty$, by which time it has most likely already hit either $\delta_{11}$ or 0 (event $E_8$). In the latter case, we regard type 11 as having failed to establish, and since $X_{10}$ is most likely to be 1 at $T_\infty$ (event $E_6$), the earlier mutation has gone extinct. If $X_{11}$ hits $\delta_{11}$ before $T_\infty$, we regard type 11 as established, and it will then, with high probability, eventually sweep to fixation (Lemma 3.2).
Proposition 3.1 below estimates the probabilities of the events $E_1$ through $E_8$. These are 'good' events, on which we can approximate the establishment probability of type 11 by the probability that $Z_{11}$ hits $\delta_{11}$ by time $T_\infty$. Proposition 3.1 is essential for the proof of Theorem 2.3, and will be proved in §4.
Proof of Theorem 2.3. Recall from (3.2) that $a_0 = \zeta/(3\gamma)$ and $t_0 = \frac{a_0}{\sigma}\log(2N)$. We first show that we can safely ignore $E^c_1$. Comparing with (2.1), we see that the jump process $\hat X_{10}$ with initial condition $\hat X_{10}(0) = 1/(2N)$, jump size $1/(2N)$, and jump rates dominating those of $X_{10}$, dominates $X_{10}$ for all time. We can write $\hat X_{10}$ as the sum of a drift term and a martingale $M$ with maximum jump size $1/(2N)$. We recall Burkholder's inequality, which may be derived from its discrete-time version, Theorem 21.1 of Burkholder (1973), and use it together with Jensen's inequality to bound $\mathbb{E}[\sup_{s \le t_0} \hat X_{10}(s)]$. Since $\hat X_{10}$ dominates $X_{10}$, the same bound holds for $X_{10}$. On $\{\sup_{s \le t_0} X_{10}(s) < (2N)^{2a_0-1}\}$, the number of recombination events between type 10 and type 01 individuals during $[0, t_0]$ is stochastically dominated by a Poisson$(2\rho(2N)^{2a_0-1} t_0)$ random variable, hence is 0 with high probability for sufficiently large $N$. On $E_9 \cap E^c_1$, type 10 has gone extinct by time $t_0$, before a single individual of type 11 has been born, hence type 11 will not get established, let alone fix. Therefore the contribution of $E^c_1$ to the fixation probability of type 11 is negligible.

Now we concentrate on $E_1$, on which type 10 has most likely established itself by time $t_0$. The events listed in (3.9) have small probabilities by Proposition 3.1(b), Proposition 3.1(g-h), and Proposition 3.1(f), respectively, where the last estimate comes from the fact that $E_{82} \subset E_4$. There are two events with significant probabilities: on $E_{82} \cap E_7 \cap E_6 \cap E_2 \cap E_1$ we have $X_{11}(T_\infty) = 0$ and $X_{10}(T_\infty) = 1$, hence type 10 fixes by time $T_\infty$; and on $E_{81} \cap E_7 \cap E_2 \cap E_1$, $X_{11} = Z_{11}$ hits $\delta_{11}$ and gets established by time $T_\infty$. On both of these events, $X_{11} = Z_{11}$ until at least $T_\infty \wedge T_{11;\delta_{11}}$. The union of these two events and the three events in (3.9) is $E_1$. On $E_1 \cap E_2$, for exactly one of the two events $\{T_{11;\delta_{11}} < \infty\}$ and $\{T_{Z_{11};\delta_{11}} \le T_\infty\}$ to occur (i.e. either the former occurs but the latter does not, or the latter occurs and the former does not), one of the following three scenarios must occur:

1. $X_{11}$ and $Z_{11}$ disagree before $T_\infty$, i.e. $E^c_7$;
2. $X_{11}$ and $Z_{11}$ agree up to $T_\infty$, but do not hit $\{0, \delta_{11}\}$ before $T_\infty$, i.e. $E^c_8$;
3. $X_{11}$ and $Z_{11}$ agree up to $T_\infty$ and $X_{11}(T_\infty) = 0$, but $X_{10}(T_\infty) < 1$, thus allowing the possibility of type 11 being born through recombination between type 10 and type 01 individuals after $T_\infty$, i.e. $E^c_6$.

Hence, by (3.9), the probability that exactly one of these two events occurs is small. Combining this with (3.8) and Proposition 3.1(a), we conclude that the desired estimate holds for some $\delta > 0$, and then use Lemma 3.2, as well as (3.4) and (3.6), to obtain the conclusion of the theorem.

Proof of Proposition 3.1
We divide the evolution of $X_{10}$ and $X_{01}$ roughly into four phases, 'stochastic', 'early', 'middle', and 'late', and use Lemmas 5.1, 5.2, and 5.3 for the last three phases, respectively. Lemma 4.1 deals with the early, middle, and late phases of $X_{01}$. Because $X_{01}$ starts at $U = (2N)^{-\zeta} \gg 1/(2N)$ at $t = 0$, it has no stochastic phase. Its early phase is between $t = 0$ and the time when $X_{01}$ reaches $c_{01,1}$. Its middle phase is between $c_{01,1}$ and $1 - c_{01,2}$, after which it enters the late phase. For type 10, since $X_{10}(0) = 1/(2N)$, whether it establishes itself is genuinely stochastic (i.e. its probability tends to a positive constant strictly less than 1 as $N \to \infty$). The stochastic phase lasts for time $t_0$, by which time, with high probability, type 10 has either established or gone extinct. If $X_{10}$ reaches $O((2N)^{a_0-1})$ by time $t_0$, it enters the early phase, which is dealt with by Lemma 4.2. Part (b) of that lemma says that if $\zeta < \gamma$ (as mentioned before, we only deal with Case 1a of §1), then it does not reach $c_{10,2}$ until $X_{01}$ has entered its late phase, while part (c) says that it does reach $c_{10,3}$ at some finite time. The proof of Proposition 3.1(a-b) reconciles the various stopping times used in Lemmas 4.1 and 4.2, and prepares for part (c) of Proposition 3.1, which deals with the middle phase of $X_{10}$, during which $X_{10}$ increases from $c_{10,3}$ to $1 - c_{10,3}$, displacing $X_{01}$ in the process. The $c_{ij,k}$'s we use throughout the rest of this paper are small positive constants, all of $O((2N)^{-b_{ij,k}})$, whose exact values are specified immediately below (4.2).
Recall the definition of the logistic growth curve $L(t; y_0, \theta)$ from (3.1). We use $L(t; (2N)^{-\zeta}, \sigma\gamma)$ to approximate the trajectory of $X_{01}$ during its early phase, and $t_{01;x}$ to denote the time when this approximation hits $x$; e.g. $t_{01;c_{01,1}}$ below is when it hits $c_{01,1}$. Furthermore, we use $t_{01,x,y}$ to denote the time this approximation spends between $x$ and $y$. Thus $L(t_{01;x}; (2N)^{-\zeta}, \sigma\gamma) = x$ and $L(t_{01,x,y}; x, \sigma\gamma) = y$.
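The logistic growth curve has the closed form $L(t; y_0, \theta) = y_0/(y_0 + (1-y_0)e^{-\theta t})$, so the hitting times $t_{01;x}$ are explicit: $t_{01;x} = \frac{1}{\theta}\log\frac{x(1-y_0)}{y_0(1-x)}$. A quick consistency check with illustrative values:

```python
import math

def L(t, y0, theta):
    """Closed form of the logistic growth curve solving
    dL/dt = theta * L * (1 - L), L(0) = y0."""
    return y0 / (y0 + (1 - y0) * math.exp(-theta * t))

def hit_time(x, y0, theta):
    """t such that L(t; y0, theta) = x (for 0 < y0 < x < 1)."""
    return math.log(x * (1 - y0) / (y0 * (1 - x))) / theta

N2, zeta, sg = 10_000, 0.4, 0.05    # illustrative 2N, zeta, sigma*gamma
y0 = N2 ** (-zeta)                  # X_01 starts near (2N)^(-zeta)
t_half = hit_time(0.5, y0, sg)
print(t_half, L(t_half, y0, sg))
```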

Supporting Lemmas
In this section, we establish Lemmas 5.1 to 5.3, one each for the early, middle, and late phases. They are used in the proof of Proposition 3.1 in §4. Lemma 5.1 deals with the early phase and approximates a 1-dimensional jump process undergoing selection by a deterministic function, where the error bound depends only on the initial condition of the process, as long as the process is stopped before it reaches $O(1)$. Lemma 5.2 deals with the middle phase and uses the logistic growth curve as an approximation. The main difference between the early phase and the middle phase is the error bound: in Lemma 5.2, the error bound depends on both the initial and terminal conditions of the process. Lemma 5.3 deals with the late phase, for which we only need to show that the process does not stray too far from 1 (or 0 for $X_{00}$) once it gets close to 1 (or 0).
The dominant term in the denominator of the above quantity is $e^{(a-b)t} s^K (1-s)$, which achieves its maximum at $s = K/(K+1)$. For sufficiently large $K$, this maximum is at least $e^{(a-b)t}/(3K)$, which implies the desired conclusion of (a.2) if $K \le e^{(a-b)t}/6$. Therefore $\mathbb{E}[\sup_{s \le t} \xi(s)] \le C_{a,b}\, e^{(a-b)t}$, which implies (a.3). For (b), the required bound holds if $t$ is sufficiently large and $k e^{-(b-a)t}$ is sufficiently small. For (c), we observe that $\xi^{(k)} = \sum_{i=1}^{k} \xi_i$, where $\xi_i$, $i = 1, \ldots, k$, are independent copies of $\xi^{(1)}$. Therefore $P(\xi^{(k)}(t) \in [1, K]) \le P(\xi$