The impact of selection in the \Lambda-Wright-Fisher model

The purpose of this article is to study some asymptotic properties of the \Lambda-Wright-Fisher process with selection. This process represents the frequency of a disadvantaged allele. The resampling mechanism is governed by a finite measure \Lambda on [0,1] and the selection by a parameter \alpha. When the measure \Lambda satisfies \int_0^1 -\log(1-x) x^{-2} \Lambda(dx) < \infty, some particular behaviours in the frequency of the allele can occur: the selection coefficient \alpha may be large enough to compensate for the random genetic drift. In other words, for a sufficiently strong selection pressure, the disadvantaged allele vanishes asymptotically almost surely. This phenomenon cannot occur in the classical Wright-Fisher diffusion. We study the dual process of the \Lambda-Wright-Fisher process with selection and prove this result through martingale arguments.


Introduction and main result
We recall here the basics about the Λ-Wright-Fisher process with selection. This process represents the evolution of the frequency of a deleterious allele. When no selection is taken into account, we refer the reader to Bertoin-Le Gall [3] and Dawson-Li [5], who introduced this process as the solution of some specific stochastic differential equations driven by Poisson random measures. Recently Bah and Pardoux [1] have considered a lookdown approach to construct a particle system whose empirical distribution converges to the strong solution of

\[ X_t = X_0 + \int_{[0,t]\times[0,1]\times[0,1]} x\,\big(1_{\{u \le X_{s-}\}} - X_{s-}\big)\, \bar{M}(ds, du, dx) - \alpha \int_0^t X_s (1 - X_s)\, ds, \qquad (1) \]

where M̄ stands for the compensated Poisson measure M on R_+ × [0, 1] × [0, 1] whose intensity is ds ⊗ du ⊗ x^{−2} Λ(dx). Strong uniqueness of the solution of (1) follows from an application of Theorem 2.1 of [5]. The process (X_t, t ≥ 0) should be interpreted as follows: it represents the frequency of a deleterious allele as time passes. When α > 0, the logistic term −αX_t(1 − X_t)dt makes the frequency of the allele decrease; this is the phenomenon of selection. More precisely, this equation can be understood as follows:
• When the current frequency is X_{s−}, at an atom (s, u, x) of M,
  - the frequency of the allele increases by a fraction x(1 − X_{s−}) (when u ≤ X_{s−} above);
  - the frequency of the allele decreases by a fraction xX_{s−} (when u > X_{s−} above).
• Continuously in time, the frequency decreases due to the deterministic selection.
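The dynamics just described can be sketched numerically. The following minimal Python example simulates one path in the special case Λ = cδ_y with y ∈ (0, 1), an assumption made here only because the total jump rate c/y² is then finite. Between jumps the frequency follows dX = −αX(1 − X)dt, solved in closed form; at each jump a uniform u decides whether the frequency moves up by y(1 − X) or down by yX. Since the jump averages to zero over u, no compensating drift appears. All function names and parameter values are hypothetical.

```python
import math
import random

def logistic_flow(x, alpha, t):
    """Solve dX/dt = -alpha * X * (1 - X) from X(0) = x over a time span t."""
    w = x * math.exp(-alpha * t)
    return w / (w + (1.0 - x))

def resample(x, y, u):
    """Event of size y: up by y * (1 - x) if u <= x, down by y * x otherwise."""
    return x + y * (1.0 - x) if u <= x else x - y * x

def simulate(x0, alpha, c, y, horizon, rng):
    """Path (time, frequency) at jump times, assuming Lambda = c * delta_y."""
    rate = c / y**2                     # total mass of x^{-2} Lambda(dx)
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        wait = rng.expovariate(rate)
        if t + wait > horizon:          # no further jump before the horizon
            x = logistic_flow(x, alpha, horizon - t)
            path.append((horizon, x))
            return path
        t += wait
        x = logistic_flow(x, alpha, wait)   # selection acts between jumps
        x = resample(x, y, rng.random())    # then a resampling event
        path.append((t, x))

path = simulate(x0=0.5, alpha=1.0, c=1.0, y=0.3, horizon=20.0, rng=random.Random(0))
```

For measures Λ with infinite mass of x^{−2}Λ(dx) near 0, jumps accumulate and this event-by-event scheme no longer applies as written.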
Note that we are dealing with a two-allele model: at any time t, the advantageous allele has frequency 1 − X_t. The purely diffusive case is well understood (this is the classical Wright-Fisher diffusion) and we exclude it from our study (see e.g. Chapters 3 and 5 of Etheridge's monograph [8] for a complete study). We mention also that Section 5 of Bah and Pardoux [1] incorporates a diffusion term in the SDE (1). Finally, the process (X_t, t ≥ 0) should be interpreted as one of the simplest models incorporating natural selection and random genetic drift (that is, the random resampling governed by Λ).
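Plainly, the process (X_t, t ≥ 0) lies in [0, 1] and is a supermartingale. Therefore, it has an almost-sure limit denoted by X_∞. This random variable is the frequency's equilibrium. Since 0 and 1 are the only absorbing states, the random variable X_∞ lies in {0, 1}. Moreover if α > 0, the supermartingale property yields that for all x in [0, 1],

\[ P[X_\infty = 1 \mid X_0 = x] = E[X_\infty \mid X_0 = x] \le x. \]

Our main result is the following theorem.

Theorem 1. Let α^⋆ := −∫_0^1 log(1 − x) x^{−2} Λ(dx).
1) If α < α^⋆, then the law of X_∞ charges both 0 and 1: for all x ∈ (0, 1), 0 < P[X_∞ = 0 | X_0 = x] < 1.
2) If α > α^⋆, then X_∞ = 0 almost surely.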
In other words, the dust-free condition ensures that, whatever the selection coefficient, the deleterious allele does not vanish almost surely. However, it is worth observing that this is not an equivalence: for some measures Λ, we have −∫_0^1 log(1 − x) x^{−2} Λ(dx) = ∞ and ∫_0^1 x^{−1} Λ(dx) < ∞. An example is provided in the proof of Corollary 4.2 in Möhle and Herriger [15].
• Bah and Pardoux, in Section 4.3 of [1], have obtained a first result on the impact of selection. Namely, they show that if α > µ := ∫_0^1 1/(x(1 − x)) Λ(dx), then X_∞ = 0 almost surely. We highlight that the quantity µ is strictly larger than α^⋆ and that our method does not rely on the lookdown construction.
• Der, Epstein and Plotkin [6, 7] obtained several results in the framework of finite populations with selection. They announced the results of Theorem 1 in [7]. However, they only treated the case where Λ is a Dirac measure. Their method is based on a study of the generator of (X_t, t ≥ 0) and differs from ours.
Apart from simple measures Λ, the expression of α^⋆ is rather complicated. We provide a few examples.
• Let x ∈ [0, 1] and c > 0, and consider Λ = cδ_x. We have

\[ \alpha^\star(x) = -c\, \frac{\log(1-x)}{x^2}. \]

The case x = 0 corresponds to the Wright-Fisher diffusion and we have α^⋆(0) = ∞. When x = 1, we also have α^⋆(1) = ∞ (this is the so-called star-shaped mechanism). Note that the map x → α^⋆(x) is convex and attains its minimum in (0, 1). Thus, in this model (called the Eldon-Wakeley model, see e.g. Birkner and Blath [4]), the selection pressure which ensures extinction of the disadvantaged allele is not a monotonic function of x.
• The computation is more involved for a general Beta(a, b) measure; see Gnedin et al. [11], page 1442.
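For the Dirac example, the threshold can be tabulated directly. The sketch below evaluates −c log(1 − x)/x², which is what the integral defining α^⋆ reduces to when Λ = cδ_x, and compares it with the Bah-Pardoux quantity µ = c/(x(1 − x)) for the same measure. The function names and the grid are illustrative choices, not taken from the article.

```python
import math

def alpha_star_dirac(x, c=1.0):
    """Threshold -c * log(1 - x) / x^2 for Lambda = c * delta_x, 0 < x < 1."""
    return -c * math.log1p(-x) / x**2

def mu_dirac(x, c=1.0):
    """Bah-Pardoux quantity c / (x * (1 - x)) for the same measure."""
    return c / (x * (1.0 - x))

grid = [i / 1000 for i in range(1, 1000)]          # interior points of (0, 1)
values = [alpha_star_dirac(x) for x in grid]
x_min = grid[values.index(min(values))]            # grid minimiser of alpha_star

print(f"alpha_star(0.5) = {alpha_star_dirac(0.5):.4f}, mu(0.5) = {mu_dirac(0.5):.4f}")
print(f"grid minimiser of alpha_star: x = {x_min}")
```

On this grid the minimum is attained at an interior point (around x ≈ 0.7), which illustrates the non-monotonicity, and µ stays above α^⋆ throughout.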
A direct study of the process (X_t, t ≥ 0) and its limit from the SDE (1) seems a priori rather involved. The key tool that will allow us to get information about X_∞ is a duality between (X_t, t ≥ 0) and a continuous-time Markov chain valued in N. Namely, consider (R_t, t ≥ 0) with generator L defined, for g : N → R, by

\[ Lg(n) = \sum_{k=2}^{n} \binom{n}{k} \lambda_{n,k} \big[g(n-k+1) - g(n)\big] + \alpha n \big[g(n+1) - g(n)\big], \]

with

\[ \lambda_{n,k} = \int_0^1 x^{k-2} (1-x)^{n-k}\, \Lambda(dx). \]

We have the following crucial duality lemma:

Lemma 2. For all x ∈ [0, 1], n ≥ 1 and t ≥ 0,

\[ E[X_t^n \mid X_0 = x] = E[x^{R_t} \mid R_0 = n]. \]

When no selection is taken into account, this duality is well known (see for instance the recent survey on duality methods by Jansen and Kurt [12]). Several works have incorporated selection and studied the dual process. For a proof of this lemma, which relies on standard generator calculations, see Equation 3.11 page 21 in Bah and Pardoux [1].
The process (R_t, t ≥ 0) is irreducible and its properties are related to those of (X_t, t ≥ 0). On the one hand, since (X_t, t ≥ 0) is bounded and converges almost surely, the transience of (R_t, t ≥ 0) entails X_∞ = 0 almost surely. On the other hand, to show that the law of the limit X_∞ charges both 0 and 1, it suffices to prove that the process (R_t, t ≥ 0) is positive recurrent.
Like the block counting process of a Λ-coalescent, the process (R_t, t ≥ 0) has a genealogical interpretation: roughly speaking, it counts the lineages in the population when going backward in time. Two kinds of events can occur:
1. A coagulation of the lineages: when there are n lineages, k of them merge into one (the process jumps to n − k + 1) at rate \binom{n}{k} ∫_0^1 x^{k−2} (1 − x)^{n−k} Λ(dx), for 2 ≤ k ≤ n.
2. A branching (a birth) event (modelling the selection): when there are n lineages, the process jumps to n + 1 at rate αn.
These branching events were introduced to model selection by Neuhauser and Krone [13] in the Wright-Fisher diffusion case. When a lineage splits in two, the two resulting lineages should be understood as two potential ancestors. We refer the reader to Sections 5.2 and 5.4 of [8], and also to Etheridge, Griffiths and Taylor [9], where a dual coalescing-branching process is defined for a general Λ mechanism.
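The two kinds of events translate into a short Gillespie-type simulation of the dual chain. The sketch below assumes Λ = cδ_y, for which the rate of a merger of k lineages among n is \binom{n}{k} c y^{k−2}(1 − y)^{n−k}; the function names, the cap on the state and the parameter values are hypothetical choices made for illustration.

```python
import math
import random

def jump_rates(n, alpha, c, y):
    """Jump rates of the dual chain out of state n, as {target state: rate}."""
    rates = {n + 1: alpha * n}                      # branching (selection): n -> n + 1
    for k in range(2, n + 1):                       # k lineages merge: n -> n - k + 1
        lam = c * y ** (k - 2) * (1.0 - y) ** (n - k)   # Lambda = c * delta_y assumed
        rates[n - k + 1] = math.comb(n, k) * lam
    return rates

def simulate_dual(n0, alpha, c, y, horizon, rng, cap=200):
    """State of R at `horizon`, or at the first time R reaches `cap` if earlier."""
    t, n = 0.0, n0
    while n < cap:
        rates = jump_rates(n, alpha, c, y)
        total = sum(rates.values())
        if total == 0.0:                            # absorbing state (alpha = 0, n = 1)
            break
        t += rng.expovariate(total)
        if t > horizon:
            break
        n = rng.choices(list(rates), weights=list(rates.values()))[0]
    return n

rng = random.Random(1)
# For c = 1 and y = 0.5 the threshold -c * log(1 - y) / y^2 equals 4 * log(2), about 2.77.
sub = simulate_dual(5, alpha=0.5, c=1.0, y=0.5, horizon=20.0, rng=rng)
sup = simulate_dual(5, alpha=6.0, c=1.0, y=0.5, horizon=20.0, rng=rng)
print(sub, sup)
```

A single run proves nothing, but one would expect the chain to stay among small values when α is below the threshold and to escape towards the cap when α is above it. The cap also keeps the binomial coefficients within floating-point range.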
2 Study of (R_t, t ≥ 0) and a word about coming down from infinity

Proof of Theorem 1
Rather than working with the process satisfying the SDE (1), we will work with the continuous-time Markov chain (R_t, t ≥ 0). We will adapt some arguments due to Möhle and Herriger [15]. Denote ν(dx) := x^{−2} Λ(dx) and define for all n ≥ 2,

\[ \delta(n) := -n \int_0^1 \log\Big(1 - \frac{1}{n}\big(nx - 1 + (1-x)^n\big)\Big)\, \nu(dx). \]

The maps n → δ(n) and n → δ(n)/n are both non-decreasing and δ(n)/n ↑ α^⋆. For the proof of the monotonicity we refer the reader to the proof of Lemma 4.1 and to Corollary 4.2 of [15]. The next lemma is lifted from their work; however, we provide a proof for the sake of completeness. We mention that the lemma below is proved in a more general setting in Lemma 4.1 of [15].
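The behaviour of δ can be checked numerically in a simple case. The sketch below assumes the Möhle-Herriger form δ(n) = −n ∫ log(1 − (nx − 1 + (1 − x)^n)/n) ν(dx) and the usual ψ(n) = ∫ (nx − 1 + (1 − x)^n) ν(dx); both definitions are assumptions of this example. It specialises them to Λ = cδ_y, where ν = c y^{−2} δ_y and the integrals become point evaluations. Function names are hypothetical.

```python
import math

def mean_loss(n, y):
    """n*y - 1 + (1 - y)**n: mean number of lineages lost in one event of size y."""
    return n * y - 1.0 + (1.0 - y) ** n

def psi(n, c, y):
    """psi(n) for Lambda = c * delta_y (assumed definition, see lead-in)."""
    return c / y**2 * mean_loss(n, y)

def delta(n, c, y):
    """delta(n) for Lambda = c * delta_y (assumed definition, see lead-in)."""
    return -n * c / y**2 * math.log(1.0 - mean_loss(n, y) / n)

def alpha_star(c, y):
    """Limit of delta(n) / n, namely -c * log(1 - y) / y^2."""
    return -c * math.log1p(-y) / y**2

c, y = 1.0, 0.5
ratios = [delta(n, c, y) / n for n in range(2, 2001)]
print(ratios[0], ratios[-1], alpha_star(c, y))
```

In this example the ratios δ(n)/n increase with n and approach α^⋆ from below, while ψ(n) ≤ δ(n) follows from −log(1 − u) ≥ u.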
Let n ≥ 2 and x ∈ (0, 1), we consider the auxiliary random variable Y_n(x) with law: n n j λ n,j .

Proof of Lemma 3. The first statement is obtained by binomial calculations and left to the reader; see Remark 7.2.2 for the Λ-coalescent and Equation (2) of [15]. We focus on the second statement. We have

We highlight that Lemmas 4 and 5 below are valid for α^⋆ = ∞ (in that case 1/α^⋆ = 0).

Proof of Lemma 4. By definition,
Thus, the process (R_t, t ≥ 0) is positive recurrent (and in particular does not go to ∞ almost surely).
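To conclude that the law of X_∞ charges both 0 and 1, we use Lemma 2. Letting t → ∞, we have, for all x ∈ (0, 1),

\[ P[X_\infty = 1 \mid X_0 = x] = E\big[x^{R_\infty}\big] \in (0, 1), \]

where R_∞ denotes a random variable whose law is the stationary distribution of (R_t, t ≥ 0).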
In order to get statement 2), it remains to show that if α > α^⋆, then X_∞ = 0 almost surely. Assume now that α > α^⋆. Let f : l → l; we have

\[ Lf(l) = \alpha l - \sum_{k=2}^{l} \binom{l}{k} \lambda_{l,k} (k-1) = \alpha l - \psi(l). \]

It is readily checked that ψ(l) ≤ δ(l); moreover the map l → δ(l)/l is increasing, thus

\[ Lf(l) \ge \alpha l - \delta(l) \ge (\alpha - \alpha^\star)\, l. \]

Therefore the process (e^{−(α−α^⋆)t} R_t, t ≥ 0) is a positive submartingale. On the one hand, if the process is unbounded then obviously R_t → ∞ as t → ∞. On the other hand, if the process is bounded, then it converges almost surely to a variable which is positive with positive probability. On this event, R_t → ∞ as t → ∞. Actually, since the Markov chain is irreducible, we have R_t → ∞ almost surely. Plainly, applying Lebesgue's theorem in Lemma 2, we have, for all x ∈ [0, 1),

\[ E[X_\infty \mid X_0 = x] = \lim_{t \to \infty} E[x^{R_t} \mid R_0 = 1] = 0, \]

and therefore X_∞ = 0 almost surely.

Revisiting the coming-down from infinity for the Λ-coalescent
A nice introduction to the Λ-coalescent processes is given in Chapter 3 of Berestycki [2]. Denote by (R_t, t ≥ 0) the number of blocks in a Λ-coalescent. Started from n, this process has generator L with α = 0. An interesting property is that this process can start from infinity. We say that the coming down from infinity occurs if almost surely for any time t > 0, R_t < ∞, while R_0 = ∞. In this case, (R_t, t ≥ 0) will actually be absorbed at 1 in finite time. The arguments that we used to establish Theorem 1 are mostly adapted from techniques due to Möhle and Herriger [15]. They have established a new condition for general coalescents to come down from infinity, which involves the function δ. Their work mostly relied on linear random recurrences. We give here a proof in a "martingale fashion" for the simpler setting of the Λ-coalescent and deduce an upper bound for the absorption time of the Λ-Wright-Fisher process when selection is incorporated.
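Proof. Schweinsberg [17] established that a necessary and sufficient condition for the coming down from infinity is the convergence of the series Σ_{k≥2} 1/ψ(k). As mentioned above, for all n ≥ 2, δ(n) ≥ ψ(n). Therefore the divergence of the series Σ 1/δ(n) entails that of Σ 1/ψ(n), and we just have to focus on the sufficient part (for a proof of the necessary part based on martingale arguments, we refer to Section 6 of [10]). Assume Σ_{n≥2} 1/δ(n) < ∞. The generator of the block counting process corresponds to L with α = 0, thus we study

\[ f(l) := \sum_{k=l+1}^{\infty} \frac{k}{\delta(k)} \log\frac{k}{k-1}. \]

Since j → δ(j)/j is non-decreasing, we have, for 2 ≤ k ≤ l,

\[ f(l-k+1) - f(l) = \sum_{j=l-k+2}^{l} \frac{j}{\delta(j)} \log\frac{j}{j-1} \ge \frac{l}{\delta(l)} \log\frac{l}{l-k+1}, \]

and then

\[ Lf(l) \ge \frac{l}{\delta(l)} \sum_{k=2}^{l} \binom{l}{k} \lambda_{l,k} \Big(-\log\frac{l-k+1}{l}\Big). \]

By Lemma 3, we have

\[ \sum_{k=2}^{l} \binom{l}{k} \lambda_{l,k} \Big(-\log\frac{l-k+1}{l}\Big) \ge \frac{\delta(l)}{l}. \]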
We deduce that Lf(l) ≥ 1. Then, since f(R_t) − ∫_0^t Lf(R_s) ds is a martingale, by applying the optional stopping theorem at time T_n ∧ k, where T_n := inf{t; R_t = 1} when R_0 = n, we get:

\[ E[T_n \wedge k] \le E\big[f(R_{T_n \wedge k})\big] - f(n) \le f(1). \]

The result follows by letting k → ∞ and then n → ∞.
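The inequality borrowed from Lemma 3 in the proof above can be checked numerically in a simple case. The sketch below assumes Λ = cδ_y, so that λ_{l,k} = c y^{k−2}(1 − y)^{l−k}, and assumes the form δ(l)/l = −(c/y²) log(1 − (ly − 1 + (1 − y)^l)/l); under these assumptions the inequality is an instance of Jensen's inequality for the logarithm. The helper name is hypothetical.

```python
import math

def lemma3_sides(l, c, y):
    """Left and right sides of the Lemma 3 inequality for Lambda = c * delta_y."""
    lhs = sum(
        math.comb(l, k) * c * y ** (k - 2) * (1.0 - y) ** (l - k)
        * -math.log((l - k + 1) / l)
        for k in range(2, l + 1)
    )
    mean_loss = l * y - 1.0 + (1.0 - y) ** l
    rhs = -(c / y**2) * math.log(1.0 - mean_loss / l)   # assumed form of delta(l) / l
    return lhs, rhs

checks = [lemma3_sides(l, 1.0, y) for l in range(2, 60) for y in (0.1, 0.5, 0.9)]
print(all(lhs >= rhs for lhs, rhs in checks))
```

For l = 2 and y = 0.5, for instance, the left side is log 2 and the right side is −4 log(0.875), which is smaller.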
Bah and Pardoux [1] have established (Theorem 4.3) that the absorption of the process (X_t, t ≥ 0) in finite time is almost sure if and only if the underlying Λ-coalescent comes down from infinity. In order to establish this property, they use a "lookdown approach". They obtain an upper bound for the expectation of the absorption time. Note that if the Λ-coalescent comes down from infinity, then δ(k)/k → ∞ as k → ∞.
Corollary 7. If Λ satisfies Σ_{k=2}^∞ 1/δ(k) < ∞, then (X_t, t ≥ 0) is absorbed in a finite time ζ and

Proof. The proof is similar to that of Bah and Pardoux and we do not give details. It suffices to replace the function f(l) = Σ_{k=l+1}^∞ 1/((ψ(k) − αk) ∨ 1) (defined page 26 in [1]) by f(l) = Σ_{k=l+1}^∞ [k/((δ(k) − αk) ∨ 1)] log(k/(k − 1)).

We end this article by observing a link between the threshold α^⋆ and the first moment of a subordinator involved in the dust.
Remark 2.1. Assume α^⋆ < ∞. Then the corresponding Λ-coalescent has dust, meaning that it has infinitely many singleton blocks at any time. As time passes, the asymptotic frequency of the singleton blocks altogether is given by a process (D(t), t ≥ 0) valued in ]0, 1] such that

\[ D(t) = e^{-\xi_t}, \qquad t \ge 0, \]

where ξ is a subordinator with Laplace exponent

\[ \phi(q) = \int_0^1 \big(1 - (1-x)^q\big)\, x^{-2} \Lambda(dx). \]

We refer the reader to Proposition 26 in Pitman [16]. An interesting feature, easily checked, is that α^⋆ = E[ξ_1]. Hence one could expect some fluctuations in (R_t, t ≥ 0) when considering the critical case α = α^⋆. This case is a priori more involved and will be studied in a future work.
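The identity α^⋆ = E[ξ_1] can be checked numerically in a simple case. The sketch below assumes the Laplace exponent φ(q) = ∫_0^1 (1 − (1 − x)^q) x^{−2} Λ(dx) and takes Λ = cδ_y; since E[ξ_1] is the right derivative of φ at 0 and φ(0) = 0, it is approximated by the forward difference φ(h)/h. Function names and the step h are hypothetical choices.

```python
import math

def laplace_exponent(q, c, y):
    """phi(q) = c * (1 - (1 - y)**q) / y^2 for Lambda = c * delta_y (assumed form)."""
    # expm1 keeps precision when q is tiny: 1 - (1 - y)**q = -expm1(q * log(1 - y))
    return -c * math.expm1(q * math.log1p(-y)) / y**2

def alpha_star(c, y):
    """Threshold -c * log(1 - y) / y^2 for the same measure."""
    return -c * math.log1p(-y) / y**2

c, y, h = 1.0, 0.3, 1e-6
mean_xi_1 = laplace_exponent(h, c, y) / h    # forward difference for phi'(0)
print(mean_xi_1, alpha_star(c, y))
```

The two printed values agree up to the discretisation error of the finite difference.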
Last, several authors have studied coalescents with a dust component through the theory of regenerative compositions. A forward model with mutations has been defined by Lagerås [14] and an interesting question could be to recover the model with selection through random compositions.