A nearly-neutral biallelic Moran model with biased mutation and linear and quadratic selection

In this article, a biallelic reversible mutation model with directional and quadratic selection is analyzed that reconnects to an approach proposed by Kimura [20]. Kimura starts from a diffusion model and derives its equilibrium distribution up to a constant. We use a boundary-mutation Moran model, which approximates general mutation models for small effective mutation rates, and derive its equilibrium distribution for polymorphic and monomorphic variants in small to moderately sized populations. Using this model, we show that biased mutation rates and directional selection alone can cause polymorphism rates within and substitution rates between populations that are ascribed to balancing or overdominant selection by the McDonald-Kreitman test [26]. We illustrate this using a data set of short intronic and fourfold degenerate sites from Drosophila simulans and Drosophila melanogaster.


Introduction
The distribution of fitness effects of new mutations has been the focus of many population genetic studies. Under the neutral theory [18], newly mutated alleles are assumed to be either subject only to random drift and hence selectively neutral, or very strongly deleterious and therefore quickly weeded out. Due to its simplicity, the neutral model is still widely used in population genetics: common usages include inference of molecular diversity measures and within-population demography parameters, as well as estimation of substitution rates of diverging populations or species.
Ohta [32] argued for the pervasive occurrence of slightly deleterious mutations; this later led to the nearly-neutral theory. She proposed a model in which new mutations have exponentially distributed deleterious effects [31]. The exponential distribution corresponds to a gamma distribution with shape parameter α = 1. Kimura [19] showed that a gamma distribution with shape parameter α = 1/2, i.e., with a heavier tail, actually improves the fit to the data. Kimura [19] chose an infinite sites model to approximate a locus with a finite number of nucleotide or amino acid sites. The infinite sites model does not allow for an equilibrium to develop. In fact, the fitness of the population decreases steadily. To attain equilibrium, an infinite number of sites with advantageous mutations would be needed to balance out the deleterious ones. This would increase the number of model parameters. In particular, the relative proportion of positive to negative mutations would have to enter the model as an empirically estimated parameter.
Comparing frequencies of amino acid changing substitutions D n and silent substitutions D s among species or populations has become paradigmatic for testing for adaptive evolution. In this context, the silent substitutions serve as the neutral standard and ratios of D n /D s are reported [43]. Within this framework, a debate about the relative proportions of neutral, positive, and negative mutations has developed [e.g., 44,36,29]. Contrary to their use as a neutral reference, silent mutations seem to be under selective constraint in a wide array of organisms: in particular, codon usage bias in particular has been shown to alter the silent substitution rate in mammals and birds [34] and the plant species Populus tremula [14] as well as the fruitfly species Drosophila [1].
Codon usage bias is central to another paradigmatic testing scenario: the comparison of two classes of nucleotides, one class with the bases A and T and the other with the bases C and G. In D. melanogaster and D. simulans the ratio of [AT ] to [CG] nucleotides is approximately 2 : 1 in short autosomal introns, which likely reflects the mutation bias [8]. In fourfold degenerate sites of D. simulans, however, the ratio is approximately 1 : 2, which likely reflects the joint action of mutation bias and directional selection [7]. Contrasting fourfold degenerate sites and short autosomal introns is therefore a strategy for inferring directional selection in the presence of mutation bias. It results in a polymorphism ratio that can be used as a proxy for directional selection as it correlates well with divergence measures [25,23]. In populations of D. simulans which are not too far from mutation-selection-drift equilibrium, directional se-lection of about e γ = 4 (such that γ = 1.39) favoring C and G nucleotides is needed to compensate for the mutation bias in fourfold degenerate sites [40,17]. The magnitude of this directional selection is within the nearly-neutral range of 0.2 < |γ| < 3 [37], where γ = 4N s with the diploid Wright-Fisher model and γ = N s with the haploid Moran model. More recently, Machado et al. [25] and Lawrie et al. [23] have also shown that codon usage bias appears to account for a substantial amount of the total selective pressure acting on fourfold degenerate sites in D. melanogaster.
Kimura [20] developed a theoretical model that can be applied to codon usage bias: Starting from a diffusion model, he expands the selective forces up to second order. This means that dominance and over-and underdominance can be accounted for. In the first half of his article, Kimura assumes a single segregating mutation. It initially occurs at a proportion 1 2N (with a diploid model) or 1 N (with a haploid model); this proportion changes by drift and selection until the allele is either fixed or lost. In the second half, Kimura incorporates reversible mutation: He analyzes Wright's distribution [20, formula (19)] and derives an approximate formula for the equilibrium divergence (or substitution rates) under directional selection [20, formula (25)]. With Wright's distribution as the underlying mutation model, the equilibrium distribution of allele frequencies can only be derived up to a proportionality constant in the presence of selection. In contrast to the infinite sites model with unidirectional mutations, however, such a reversible mutation model allows for an equilibrium, i.e., for populations with a constant fitness that does not erode with time. Such equilibrium models are compatible with thermodynamic approaches; in particular, functions analogous to entropy can be defined [16,35,2].
Kimura's mathematical model is comprehensive in that it includes both directional and quadratic selection. However, his biological reasoning was criticized: While Kimura based his reasoning for codon usage bias on stabilizing selection of a phenotypic trait, Li [24] and Bulmer [5] argued that mutation, drift, and directional selection alone can account for the observed patterns at silent sites. Like Kimura, they reconnect to Wright's [42] model. The Li-Bulmer model has in turn been questioned for the biological legitimacy of its assumption of constant, weak selection: In the context of codon usage bias, Machado et al. [25] and Lawrie et al. [23] find evidence that many synonymous sites are under strong purifying selection-indeed the strength of selection seems to vary across sites.
The still ongoing discussion around mutation bias and directional selection has therefore relied mainly by models that draw on Wright's distribution. For this model, the equilibrium distribution is only known up to a constant of proportionality. In contrast, a Moran model with finite population sizes and mutations occurring only in the monomorphic boundaries permits the derivation of an exact stationary distribution. It is one of many scenarios in which exact results can be obtained using a Moran model when the Wright-Fisher model and diffusion approaches yield only the approximate equivalents [28]. The neutral boundary-mutation Moran model has been shown to result in an identical sample distribution as a Taylor series expansion of Wright's general mutation model [39]. With selection, the equivalence has not yet been investigated rigorously.
In this article, the equilibrium distribution of a boundary-mutation Moran model with linear and quadratic selection as well as mutation bias and drift is derived. We hope that our efforts help to revive the apparently forgotten model with quadratic selection proposed by Kimura [20] that allows for modeling dominance and over-and underdominance, particularly since we can iron out some previous mathematical inconsistencies. Note that although quadratic selection has received more attention in recent years, a lot of the focus has been on novel modelling approaches rather than its interplay with other population genetic forces in already established settings. Many studies have been based on the signature it leaves on linked sites [4,9]. There are, however, also widely known cases of quadratic selection where linkage does not play a role, such as the human ABO blood polymorphism [38]. Models such as ours that rely only on the direct target of selection for detection are applicable to such cases.
Here we will, however, once again place special emphasis on mutation bias: Many models assume symmetric mutation rates [e.g., 16]. But it has repeatedly been shown that the interplay between (directional) selection and mutation bias has the potential to substantially affect heterozygosity and/or substitution rates in unexpected ways [27,7,40,22,25,23]. We will examine how varying the strength of mutation bias and (directional) selection relative to each other affects the equilibrium rate of new mutations and heterozygosity, and how this affects the neutrality index. Our theoretical discussion will be complemented by application to a D. melanogaster and D. simulans data set. We hope that models such as ours will eventually be applied to other genome-wide data.
2. Moran model with mutation, drift, directional and quadratic selection

The model
Consider a phenotypic trait influenced by K sites, indexed by k with 1 ≤ k ≤ K. Each site is assumed to be biallelic with one allele coded as 1 and the other as 0. Hence, 2 K allelic combinations may contribute to the trait. Per generation (a generation corresponds to N birth-death Moran events) and per site, a mutation from allele 0 to allele 1 occurs at a scaled rate βθ; a mutation in the reverse direction occurs at a scaled rate (1−β)θ. We define β = µ 1 /(µ 1 +µ 0 ) and θ = N (µ 1 + µ 0 ), whereby µ 1 is the mutation rate towards the focal allele, µ 0 the mutation rate away from the focal allele, and N is the (effective) haploid population size. The effect of exchanging an allele 0 with an allele 1 is an increase in the phenotype that is proportional, irrespective of the site. Thus there are K + 1 fitness states. Since we are interested in dominance and other quadratic effects, we will consider diploid individuals. Three genotypes per locus are therefore relevant for fitness.
We will assume small scaled effective mutation rates so we can apply a boundary-mutation Moran model [41]: This means that we assume that in a small to moderate sample no more than a single mutation segregates per site.
The boundary-mutation Moran model approximates the general mutation model [42] well, provided 2β(1 − β)θ < 0.025 per site holds in neutral equilibrium [41]. For many analyses, Kimura assumed a mutated allele initially segregating at a proportion of 1/N (or 1 − 1/N ) and then used diffusion theory to model the course of the allele proportion within only the polymorphic region (between 1/N and 1 − 1/N ) conditional on drift and selection. The reason for neglecting the monomorphic region is that passing a boundary model to the diffusion limit leads to inconsistencies: The diffusion approximation requires the assumption of the population size N → ∞ for the polymorphic interior. The same assumption causes negative-and thus impossible-probabilities of occupancy at the boundaries. The boundary-mutation Moran model [41,3] does not require taking the limit and thus avoids this problem, but the population size is restricted. With it, exact results can be obtained relatively easily, and from these the forward and backward diffusion approximations can be determined in the polymorphic region using only the first and second symmetric derivatives. However, compared to a general mutation model and especially to the Wright-Fisher model, biological realism is sacrificed.
Let i, with 0 ≤ i ≤ N , denote the frequency of allele 1 at a focal locus within the population. With the neutral boundary-mutation Moran model that includes only potentially biased mutation and drift (Appendix 6.1) the equilibrium distribution of alleles at each locus is: whereby H n = n i 1 i is the harmonic number. Substituting N = 2, we obtain the expected heterozygosity at a single locus. This can also be obtained by replacing the allele frequency by the allele proportion x = i/N and taking the limit N → ∞: Assume furthermore that each site fixes independently. This is the case if the scaled recombination rate is much larger than the scaled mutation rate. Even with very low effective recombination rates, the assumption is valid if scaled mutation rates are so small that effectively only one of the K sites segregates in the population at a time.
Bearing this in mind, let us now move towards incorporating selection: We can define the selection coefficient for a change of the frequency of the focal allele from i to i + 1 as (Appendix 6.2.1): and in the reverse direction as: where B 1 determines the strength of positive and B 2 the strength of balancing selection. In Appendix 7 of Bergman et al. [3] it is shown that a diffusion model can be derived from the decoupled mutation-drift Moran model by only using the definitions of the first and second symmetric derivative. Following this procedure, one easily sees that the diffusion approximation of a boundary-mutation Moran model including the selection coefficients above corresponds to Kimura's diffusion approach [20] based on Wright's equilibrium distribution [42] of allele frequencies except for a reversal of signs for the parameter B 2 .
Note that with a strictly haploid model only first order, i.e., directional, selection is possible. A diploid selection model allowing for dominance or overand underdominance involves two alleles that are lost or gained when a diploid individual dies or competes against another diploid individual. In our diploid model, however, the transition matrix is assumed to be tridiagonal for simplicity, such that only single steps are allowed. One may then think of the selection process as follows: The focal allele partners up with an allele randomly drawn from the population to obtain its fitness; it competes with another allele, which also obtains its fitness by partnering up with a further allele randomly drawn from the population. The relative fitnesses for a diploid individual with 0, 1, and 2 alleles of the focal type are given by 1, 1+B 1 +B 2 , and 1+2B 1 . If B 2 = 0, the fitness effects are purely additive. The focal allele replaces its competitor according to their marginal fitness difference, whereas the two partner alleles remain unaffected.-As generally with the boundary-mutation Moran model, biological realism is sacrificed for mathematical tractability. Note that a similar argument for obtaining selection coefficients was used in Muirhead and Wakeley [28].
While we derive the exact equilibrium distribution of the boundary-mutation Moran model with the above selection terms (see Appendix 6.2.2), we also provide a convenient exponential approximation reminiscent of the diffusion approach (see Appendices 6.2.3 and 6.2.4) and employ this approximation for the data analyses within the main text. In the Appendix (6.2.4), we show that the approximate per site equilibrium distribution of the boundary-mutation model with directional and quadratic selection is: Note that the Taylor series expansion e s = (1 + s) + O(s 2 ) means that the exponential reliably approximates the discrete process for the large population sizes usually encountered in population genetics (in particular, for approx-imately s ≤ 0.1).
For small θ, this approximate equilibrium of the boundary-mutation Moran model is an excellent approximation of Wright's equilibrium distribution with general mutation rates [42] (Fig. 1).
Varying the values of B 1 and B 2 -either individually or simultaneouslyaccounts for a wide range of possible selection scenarios and allele frequency distributions; the latter are generally good approximations for the exact version even for very strong selection (Fig. 2, Fig. 3). Note that balancing selection acts symmetrically around a maximum at frequency 1 2 when B 1 = 0, and adjusting of the latter shifts the target frequency.   Recall that γ = B 1 . For all K sites, the approximate equilibrium distribution is the product of individual per site equilibria:

Equilibrium distribution among populations
We will now examine the equilibrium among populations, and look more closely at the effect mutation bias may have on divergence: Let → 0 in eq. (5) and compare to the version without selection by letting θ → 0 in eq. (1). We see from the boundary terms at i = 0 and i = N of eq. (5) that the selective advantage of the preferred allele in the entire population is γ = B 1 to zeroth order in θ. Given K sites with equal effects on the phenotype, a tridiagonal transition rate matrix results. Set the number of sites fixed for the focal allele to y: In equilibrium the following detailed balance equation must hold: From this, one sees that the binomial distribution The mean of the binomial distribution is Kρ, the variance Kρ(1 − ρ). This variance corresponds to the variance among populations. The variance within populations is zero since each population is assumed fixed at all sites with the first order approximation we made at the start of this subsection. When selection opposes mutation bias, it may increase the variance compared to the neutral equilibrium. Note that ρ corresponds to the expected proportion of favored alleles fixed among the K sites. The equilibrium rates of favored (+) and disfavored (-) new mutations are: The equilibrium ratio of favorable to unfavorable new mutations is independent of the mutation parameters and depends only on selection, as previously noted by McVean and Charlesworth [27]: Note that the ratio of the probability of fixation of favorable and unfavorable mutations in equilibrium is 1.

Neutral boundary-mutation Moran model
The equilibrium distribution of the neutral boundary-mutation Moran model is identical to the first order expansion of Wright's equilibrium distribution in θ (see Appendix 6.1). It has been the subject of previous efforts; we recapitulate the main theoretical results in Appendix (6.2).

Boundary-mutation Moran model with selection
Let us now look at the boundary-mutation Moran model that includes directional selection, i.e., γ = B 1 , as well as mutation and drift: The expected heterozygosity for N → ∞ is [40]: The ratio of the expected heterozygosity under directional selection to the expected heterozygosity at neutrality is While directional selection always decreases heterozygosity when mutation rates are unbiased, directional selection opposing mutation bias may increase heterozygosity ( Fig. 4A-C).
This happens because directional selection increases the overall mutation rate by favoring the allele with the higher mutation rate. Note that our formula for expected heterozygosity (eq. 13) is identical to that given by McVean and Charlesworth [27] (see also Appendix (7.1) for a comparison).
Assuming only the interplay between mutation bias and directional selection in Drosophila melanogaster, the mutation bias of approximately β = 1 3 in autosomal fourfold degenerate sites is reversed by directional selection of approximately γ = log(4) [8,7]. This leads to an increase in intra-population heterozygosity by a factor of 3 2 log(4) = 1.08.

Fixation probabilities, substitution rates, and the neutrality index
The fixation probability of a new mutation that initially segregates at the boundary and comes under both directional and quadratic selection is given by, e.g., formula (15) of Kimura [20]. With only directional selection, i.e., γ = B 1 , the substitution rate per generation in equilibrium is balanced between favorable and deleterious mutations. The overall rate is then twice the rate of the favorable mutations: Without selection this rate reduces to The ratio of the nearly-neutral and neutral rates is then: .
Note that while γ e γ −1 is always smaller than 1 for γ > 0, ρ β may be larger, depending on selection intensity and mutation bias. Therefore, nearly-neutral directional selection may increase substitution rates over the neutral rate with biased mutation (Fig. 4D-E).
Following Ohta [30], the common understanding of selection against deleterious mutations is that it slows down the substitution rate and thus the rate of divergence in proportion to the effective population size. Yet we are not the first to note that in equilibrium, a strong mutation bias against the optimal codon may actually increase the substitution rate: McVean and Charlesworth [27] revived the investigation into this phenomenon; Lawrie et al. [22] also discuss how this interplay can confound maximum likelihood estimates of branch lengths and therefore the inference of positive selection on phylogenies. Usually substitution rates elevated above the neutral rate are interpreted as resulting from recent positive selection and not from an equilibrium of directional selection and biased mutation.
In the Appendix (7.2), we provide a comparison between our formula (16) and the equivalent formula derived by Kimura [20].
With the same Drosophila data as before, we get: We thus note that while heterozygosity is increased with directional selection, the substitution rate is -in our case -decreased. With parameters appropriate for directional selection opposing the mutation bias in fourfold degenerate sites of Drosophila melanogaster, we thus get both an increase in heterozygosity and a decrease in divergence compared to the neutral case. Such a result would usually be explained by balancing or overdominant selection.
We note that the neutrality index is independent of mutation rates: It is 1 for γ = 0, and always greater than 1 for γ = 0. This can be shown as follows: The power series contains only odd powers of γ: If γ is positive, all terms are positive. Otherwise all terms are negative. Squaring this power series produces a sum of positive even powers of γ, such that the neutrality index must be greater than 1 (since the term for i = 0 is always 1). Indeed, the neutrality index is unchanged by reversing the sign of γ, which actually corresponds to an exchange of the labels of the two alleles.

Application to short introns and fourfold degenerate sites in Drosophila
The allele frequency spectrum of autosomal fourfold degenerate sites (FF) in Drosophila simulans is close to equilibrium [8,7,17], while that of the closely related species D. melanogaster is far from it. We use an alignment of ten haploid Malagasy D. simulans and ten haploid mainland African D. melanogaster individuals to extract a joint site frequency spectrum of FF and short intronic sites (SI; positions 8 − 30bp of introns < 66bp long) for all autosomal loci. The bases A and T are encoded as allele 0 and the bases C and G as allele 1.
The maximum likelihood estimator for the expected heterozygosity 2β(1 − β)θ [39] is 0.0270 for FF sites and 0.0245 for SI sites, while divergence is 0.036 for FF sites and 0.042 for SI sites. Note that the expected heterozygosity of FF sites, which are likely influenced by directional selection, is higher than that of the presumably neutral SI sites. The situation is reversed for the estimated divergence, where the FF sites show lower values than SI sites.
Assuming neutrality, the maximum likelihood estimator [39] of the mutation bias towards GC from SI sites isβ = 0.34. A first estimator of the directional selection coefficient may be obtained as that which maximizes the heterozygosity of FF sites with a GC proportion of 0.34. But even with the maximalγ 1 = 0.98, only a heterozygosity of about 0.2645 is reached. This is still below the observed 0.027 for FF sites. From the bias of 0.66 towards GC in the FF sites, a second estimate of the directional selection coefficient in FF sites is obtained:γ 2 = 1.32.
The McDonald-Kreitman test [26] evaluates a contingency table for association. In our case, the table consists of two rows with i) non-synonymous and ii) synonymous amino acid sites and of two columns with i) polymorphisms within a population and ii) fixed differences (substitutions) between two populations.
We compiled a table of polymorphic and divergent sites and also report the total number of sites (including sites monomorphic in both species and polymorphic in only D. melanogaster):  (17), is therefore estimated to beγ 3 = 1.81. This estimate is independent from the ratio of allele proportions above. We note that the deviation of the FF site frequency spectrum of D. melanogaster from equilibrium may have biased this estimate of γ, but should not have affected the previous two estimates, since those are only based on the D. simulans site frequency spectrum.
In summary: We have a bias towards A and T of β = 0.34 that is counteracted by effective directional selection γ of a strength within the nearly-neutral range in fourfold degenerate sites. In Section (3.3) we see that such a parameter combination leads to increased polymorphism of fourfold degenerate sites, which are presumably under directional selection, compared to the presumably neutrally evolving short intronic sites, and yet also to a decreased substitution rate. Such a result would usually be interpreted as indicating balancing selection. At the same time, evaluating the neutrality index-which is independent of the mutation rates-from Section (3.3) gives a value of approximately 1.303. Usually such a positive neutrality index is interpreted as indicative of negative selection. Thus calculating these two traditional statistics without knowledge of the underlying interplay of directional selection and mutation bias leads to contradictory interpretations.

Directional and quadratic selection
One can model dominance and under-and overdominance by including quadratic selection [20]. The selection strength is then frequency dependent (Fig. 2B, Fig. 3). In the case of overdominance, i.e., concave fitness on the locus level, a fitness maximum inside the polymorphic region may lead to an increase in polymorphism (Fig. 2B). But in this case the main assumption of the boundary-mutation model, i.e., that mutations only occur in monomorphic states, will likely be violated. With stabilizing selection around a given optimum, selection may be either mainly directional (far away from the optimum) or underdominant (right at the optimum) [20].
Recall that the fitness advantage (or disadvantage) through fixation of a mutant allele of the focal type is B 1 (irrespective of B 2 ). With the focal allele completely dominant and favored by selection, we have B 2 = B 1 ; without dominance B 2 = 0. Hence, dominance makes no difference to zeroth order in θ, i.e., when drift is strong relative to mutation. This changes when first order terms are included, as polymorphism may be increased with overdominance and decreased with underdominance, even with relatively low selection coefficients.

Conclusions
Inspired by Kimura [20], this article investigates a model of linear (i.e., directional) and quadratic (i.e., dominant, over-and underdominant) selection, mutation, and drift for small scaled mutation rates θ. In particular, we introduce the boundary-mutation Moran model for small scaled mutation rates that includes linear and quadratic selection, mutation, and drift.
Assuming that drift is strong relative to mutation, i.e., to zeroth order in θ, simple formulas based on fixation probabilities for both the equilibrium distribution among sites as well as the divergence between populations are obtained (note that this means variation within populations is ignored). In fact, the reversibility of the mutations implies that the rate of negatively and positively selected new mutations (the ratio of which is determined by mutation bias and selection) reaches a stationary distribution. Thus the effects of deleterious mutations [background selection, 6] as well as positively selected mutations [hitchhiking, 15] on the effective populations size will also equilibrate. Indeed, whether one parameterizes such models with positive or negative selection strength, the resultant equilibrium distribution will be identical. This is a promising setting for future explorations of more complex dynamics involving, e.g., background selection, to further other points raised by McVean and Charlesworth [27].
The neutral boundary-mutation Moran model including drift and mutation bias can be thought of as a first-order approximation in θ of a general neutral biallelic mutation-drift model with reversible mutations. Further incorporating linear and quadratic selection into the boundary-mutation Moran model offers a consistent approach that permits polymorphic states in addition to fixed ones. Importantly, the mutation-selection-drift equilibrium distribution can be obtained, as shown in this article. This goes beyond the analysis of Kimura [20] (and also McVean and Charlesworth [27]), who only obtained the equilibrium distribution up to a constant. It also extends the previous derivation of the exact equilibrium distribution of the boundary-mutation Moran model with only directional selection, mutation, and drift [40].
We use estimators derived from the equilibrium distribution of the boundarymutation Moran model with drift, mutation and selection to study a dataset of fourfold degenerate sites and short introns of Drosophila simulans and D. melanogaster. We show that the strong mutation bias of β = 0.34 towards A and T and therefore against against the preferred codon is counteracted by effective directional selection γ of a strength within the nearly-neutral range within the fourfold degenerate sites. This leads to an increase in heterozygosity and a decrease in the substitution rate. This combination of results is conflicting in a traditional framework that involves evaluating the McDonald-Kreitman test and the neutrality index, as it is suggestive of both negative and balancing selection.
The discrepancy among our different estimators of γ indicates deviations from model assumptions. These will be explored in the future. The potential dynamics created by the interplay of directional and balancing selection will also be investigated further on data suitable for that purpose.
In the struggles over the neutral [21] and nearly-neutral [33] theories following Gillespie's review [10], both the subtleties in the approaches of individual scientists as well as the common ideas they were drawing on have -at least to a certain extent -been lost. Nowadays, two strands of research appear to prevail: one centered on data sets that contrast amino acid changing and silent substitutions [26,44,36], where theory is based on the infinite site model with deleterious mutations [33]; the other centered on data sets that contrast fourfold degenerate sites with short introns [12,13,8,7,17], where theory is based on biallelic reversible mutation models and selection-mutation-drift equilibrium [24,5,27,40]. Note that both approaches only allow for directional selection; quadratic selection [20] is ignored. We posit that reversible mutation models such as ours can provide the framework for rigorous analysis of the interplay between mutational bias and both linear and quadratic selection. Broadening the field of application of such reversible models to scenarios in which they are not traditionally used could bring together modelling approaches that seem to have diverged unnecessarily.

Acknowledgments
We thank Nick Barton, Juraj Bergman, Rui Borges, Reinhard Bürger, and Joachim Hermisson for helpful discussions, Carolin Kosiol for proofreading and commenting on the manuscript, and Burcin Yildirim for catching a slip in the conclusions.
CV's research is supported by the Austrian Science Fund (FWF): DK W1225-B20; LCM's by the School of Biology at the University of St.Andrews.

Mutation and drift (no selection)
Let us start with a general reversible mutation-drift Moran model: µ 1 is the mutation rate towards the focal allele, µ 0 the mutation rate away from the focal allele, and N is the (effective) haploid population size. Set β = µ 1 /(µ 1 + µ 0 ) and θ = N (µ 1 + µ 0 ). In the diffusion limit, the equilibrium distribution of the allele proportion x is Wright's beta distribution [42]: Assume binomial sampling; we draw a sample of size N of the random variable i conditional on the "probability" x. Integrate the joint distribution of i and x over x to obtain the beta-binomial compound distribution: Then expand the beta-binomial into a Taylor series in θ and retain terms up to first order to obtain equation (1) [39], the result of which we reproduce here: For large N , we note that either Pr(i = 0), or Pr(i = N ) or both become negative. Hence too large sample sizes are impossible. Note that in order to bypass this issue a functional may be defined: which takes over the role of the beta distribution (18) in the general mutation model [42] and results in a marginal posterior distribution identical to the distribution (1). For this, the sequence of taking the limit in equation (20) and integration over the allele proportions needs to be exchanged. This can be shown to work by an argument of monotone convergence (Vogl in prep.).
With the boundary-mutation Moran model, mutations occur only at the boundaries while the interior transitions are due to either drift or selection. Let us first consider the case with pure drift, i.e., no selection. In the interior, i.e., with allele counts i between 1 and N − 1, the transition rates from t to t + 1 are: At the boundary i = 0, we have: The term 1 1−βθH N −1 ensures that, in equilibrium, mutations enter the polymorphic region at an identical average rate per Moran drift event β θ N , irrespective of the population size N .
At the boundary i = N , we have analogously: Since a generation corresponds to N Moran events, the mutation rates must be multiplied by N to obtain the mutation rate per generation. The equilibrium distribution is given by eq. (1). Since the transition matrix is tridiagonal, it is necessarily reversible and therefore in detailed balance. Therefore we can show that the distribution (1) is indeed the equilibrium distribution by demonstrating that it fulfills this necessary condition of detailed balance with its nearest neighbours (all other transitions being disallowed). In the interior, the flow between states i and i + 1 can be shown to balance as follows: at the boundary i = 0, the flow also balances: and analogously at the boundary i = N .

Selection coefficients
We use the following parameterization for the fitness of diploid genotypes: Genotype 00 01 11 Fitness We note that Kimura [20] reverses the sign of the second order coefficient B 2 . This is because he starts from a normally distributed phenotype, assumes a fitness function proportional to a normal distribution, and then considers the response of a biallelic locus influencing the trait under stabilizing phenotypic selection. Near the fitness optimum there are two possible scenarios: i) The population itself is close to the fitness optimum, resulting in directional selection towards the optimum; ii) the population is right at the optimum, resulting in underdominance. Thus, while in our case a positive B 2 corresponds to overdominant selection, in Kimura's case it corresponds to underdominant selection.
The following considerations we use in determining the selection coefficient are similar to those in Muirhead and Wakeley [28]: Two alleles are taken randomly from a population in Hardy-Weinberg equilibrium. These two alleles (boldfaced alleles of the genotypes in the rows and columns of the table, respectively) are paired with two further alleles, such that the joint fitnesses of the two (ordered) genotypes are: The two columns on the right correspond to the ordered genotypes with the focal allele, the two on the left to the ordered genotypes of the other allele; and analogously for the rows. The probability of a change in the allele frequency from i to i + 1 per Moran event corresponds to the fitness of the genotypes with the focal allele, i.e., of the two columns on the right, multiplied by their probabilities given the allele frequency i: where s i→i+1 is the selection coefficient. The selection coefficient in the reverse direction s i→i−1 , i.e., from i to i − 1, corresponds to the sum of the left two columns multiplied by the genotype probabilities and is minus that of the right two.

Exact stationary distribution
Using the selection coefficients derived in the previous subsection, we get the exact transition probabilities in the interior: Note that this transition matrix deviates from the one we used earlier for only directional selection [41,40], but converges to the same diffusion limit. At the boundary i = 0, we have: with Set At the boundary i = N , we then have: with It follows that the exact equilibrium distribution for a boundary-mutation Moran model with both directional and quadratic selection, mutation, and drift can be written as: Note that the form in which the drift and selection coefficients enter the equilibrium distribution corresponds to the standard equilibrium distribution recurrence relation for birth-death processes (see [11], for instance). This can be determined by confirming the existence of the stationary distribution, and then deriving by induction. Equivalently, we recall that because of the tridiagonal transition matrix, detailed balance must hold for nearest neighbours. So confirming the detailed balance conditions shows that this is indeed the stationary distribution: In the interior, we have the balanced flow between states i and i + 1: At the boundary i = 0, we have the balanced flow: At the other boundary i = N , we have analogously: By showing that these three conditions hold, we have validated eq. (35) as the unique stationary distribution, as long as the boundary terms at i = 0 and i = 1 are positive.

Asymptotics of the Drift and Selection Terms
Let us examine the asymptotics of the terms for drift plus selection: Assuming that N is suitably large and B 1 and B 2 are at most of first order, the numerator can be approximated as: Note that this is essentially a first order Taylor expansion of the exponential in reverse. The denominator can be analogously approximated: Therefore the approximate drift plus selection terms are given by:

Approximate Exponential Transition Rates and Stationary Distribution
We can therefore define the approximate interior transition rates for the boundary-mutation Moran model with directional and quadratic selection, mutation, and drift as follows: The boundary at i = 0 becomes: with Analogously, at the boundary i = N , the approximation yields: with The approximate equilibrium distribution (also given in the main text as eq. 5) therefore becomes: The proof of the detailed balance conditions works identically to the exact case, and is therefore omitted here.

McVean and Charlesworth's formula for expected heterozygosity
McVean and Charlesworth [27] give the following approximate formula for the expected heterozygosity as (in our notation): One can easily see that this is identical to our formula up to a factor of two that stems from the difference between the Wright-Fisher and the Moran model: Our formula was derived using the exponential approximation of boundarymutation Moran model with selection.

Kimura's formula for the evolutionary rate
Kimura gives the relative evolutionary rate (in terms of mutant substitutions) under directional selection compared with the strictly neutral case in his formula (25): where f 2 is the equilibrium proportion of the positively selected allele and f 1 that of the other allele. Without mutation bias Kimura's formula (22) gives f 2 /f 1 = e γ . Substituting this into formula (49), we get u γ /u 0 = 2 e γ (1 + e γ ) 2 log(e γ ) 1 + e γ e γ − 1 = 2 γe γ (1 + e γ )(e γ − 1) This is identical to our formula (16), as long as there is no mutation bias, i.e., β = 1 2 . With mutation bias, a different formula from ours would result. From the numerical example, it can be concluded that Kimura assumed the absence of mutation bias, to which he generally seems to have given little thought. Without mutation bias, the substitution rate cannot increase under directional selection compared to neutrality.