The adaptation rate of a quantitative trait in an environmental gradient

The spatial range of a species habitat is generally determined by the ability of the species to cope with biotic and abiotic variables that vary in space. Therefore, the species range is itself an evolvable property. Indeed, environmental gradients permit a mode of evolution in which range expansion and adaptation go hand in hand. This process can contribute to rapid evolution of drug resistant bacteria and viruses, because drug concentrations in humans and livestock treated with antibiotics are far from uniform. Here, we use a minimal stochastic model of discrete, interacting organisms evolving in continuous space to study how the rate of adaptation of a quantitative trait depends on the steepness of the gradient and various population parameters. We discuss analytical results for the mean-field limit as well as extensive stochastic simulations. These simulations were performed using an exact, event-driven simulation scheme that can deal with continuous time-, density- and coordinate-dependent reaction rates and could be used for a wide variety of stochastic systems. The results reveal two qualitative regimes. If the gradient is shallow, the rate of adaptation is limited by dispersion and increases linearly with the gradient slope. If the gradient is steep, the adaptation rate is limited by mutation. In this regime, the mean-field result is highly misleading: it predicts that the adaptation rate continues to increase with the gradient slope, whereas stochastic simulations show that it in fact decreases with the square root of the slope. This discrepancy underscores the importance of discreteness and stochasticity even at high population densities; mean-field results, including those routinely used in quantitative genetics, should be interpreted with care.


Introduction
In those classical models of evolving populations that include spatial degrees of freedom, the habitat typically has a fixed range and geometry [1]. Adaptation, then, is driven by the selection of traits that improve an organism's performance in that habitat. In reality, the habitat and spatial range of a species are themselves evolvable properties. Generally, the species range is affected by the species' ability to cope with biotic, abiotic, or antibiotic conditions that are heterogeneous in space, and traits that affect this ability are clearly subject to natural selection [2]. Many basic results of population genetics, based on standard models such as the Wright-Fisher and Moran models [3], are not applicable to this mode of adaptation, and attempts to formulate and study suitable alternatives are few.
Many range boundaries observed in nature appear relatively stable, and yet cannot be explained by an obvious shift in environmental conditions [4]. For this reason, a significant amount of theoretical work has been devoted to identifying the factors that determine the range limits of species and the mechanisms that contribute to their stability (e.g., see [2,[4][5][6][7][8][9] and references therein). These studies often rely on the framework of quantitative genetics. Perhaps because of this focus on understanding stable boundaries, less attention has been given to the dynamics of range expansion when the boundaries are not evolutionarily stable.
In the recent years, however, adaptation driven by environmental heterogeneity and range expansion have received additional attention because of their possible implications for the evolution of drug resistant bacteria and viruses [10][11][12][13][14][15][16]. In humans and livestock treated with antibiotics, drug concentrations differ between tissues, organs, and individuals [17]. Environments with low concentrations can function as 'sanctuaries', 'reservoirs', or 'sources' [10][11][12][18][19][20][21] of poorly adapted strains, and local concentration gradients can provide a 'staircase' for adaptation, allowing it to proceed step by step, with adaptation and range expansion going hand in hand [14][15][16]. Indeed, in experiments using micro-environments with an antibiotic gradient, E. coli rapidly evolved drug resistance [22]. Despite this recent work, many basic questions remain unanswered. If heterogeneity can promote adaptation, then what geometries optimize the rate of adaptation? What parameters and quantities tend to affect the rate of adaptation most strongly? What determines the 'Goldilocks point' for the rate of adaptation [23], or in other words, what magnitude of the gradient leads to the optimal 'resistance-selective environment' [24]?
Here, we present a toy model designed to explore such questions. It describes an asexual population adapting to a linear environmental gradient that limits its range. We choose a highly simplified description that allows basic intuitions to be developed and tested. While motivated by, and possibly relevant to, the evolution of antibiotic resistance in bacteria, we deliberately keep the formulation general.
Many phenotypes, including antibiotic resistance levels, are determined by a combination of genetic factors as well as many intrinsic and extrinsic nongenetic (environmental) influences. Nevertheless, previous models of resistance evolution in antibiotic gradients assumed a discrete, one-dimensional, unbranched space of genotypes and a one-to-one genotype-to-phenotype mapping [14,15]. (We discuss these models in more detail in the Discussion and Conclusions section.) While such a fitness landscape can perhaps approximate more complex, higherdimensional systems projected onto a single reaction coordinate, the regular discretization is certainly an idealization. Here, we take an orthogonal approach by assuming that the evolving phenotype is a quantitative trait, that is, a phenotype that can be characterized by a continuous variable [25]. Examples of quantitative traits are height, weight, and blood pressure, and their (apparent) continuity usually derives from a combination of polygenic effects and/or environmental influences. Even though the drug resistance level of a bacterium, often expressed as the minimal inhibitory concentration (MIC), can in principle take on continuous values and is certainly affected by many genetic and non-genetic factors, we do not claim that this approach is more realistic per se; rather, the approaches should be seen as complementary.
Below, we first define the model. We then proceed to study its behavior in the mean-field limit, where we can derive an analytical expression for the adaptation rate. Next, we compare the mean-field result to discrete-stochastic simulations. In order to simulate large numbers of interacting individuals dispersing and evolving in continuous space and time, we have used an event-driven simulation method that is both exact and surprisingly efficient.
We will see that the behavior of the model shows two regimes: a dispersion-limited regime and a mutation-limited regime. In the dispersion-limited regime, the mean-field prediction provides useful intuitions and qualitatively sound results, even though, quantitatively, its deviations from stochastic simulations can be substantial. In the mutation-limited regime, however, the mean-field prediction becomes highly misleading even at very large population densities, demonstrating that such approximations, including those routinely used in quantitative genetics, must be used with caution.

The 'ramp' model
We envision a population of organisms in a onedimensional continuous space. Each organism is characterized by its position x and its trait value y; the population can therefore be represented by a cloud of points (x, y) in a two-dimensional phenotype-position plane (see figure 1). The organisms disperse in real space (horizontally) by diffusion, with diffusion constant D x . Assuming that mutations cause small changes in the trait value y, we also model phenotypic The 'ramp' model of adaptation in an environmental gradient. Asexual organisms (black dots) inhabit a one-dimensional environment, plotted horizontally, and are characterized by the value of a quantitative trait y, plotted vertically. They disperse diffusively (horizontally) with diffusion constant D x . Their quantitative trait value can change due to mutation and/or environmental factors, modeled as diffusion in the vertical direction with diffusion constant D y . Organisms die at a rate d. They reproduce (duplicate) logistically, that is, at a rate decreasing linearly from r to 0 as the local density increases from 0 to the carrying capacity k; the local density is assessed by a kernel density estimate with standard deviation (interaction range) s. Crucially, reproduction is possible for organisms with trait value y at position x only if > y ax, where the slope a represents the steepness of an environmental gradient that limits the species range. This defines a diagonal 'ramp' (red line), below which organisms are not sufficiently adapted to reproduce. As the population adapts, it 'climbs' the ramp at an asymptotic rate v  that can be decomposed into the invasion rate v x and the adaptation rate v y . change as a diffusion process (vertically), with diffusion constant D y .
Organisms reproduce asexually by division; the daughter is placed at the exact same coordinates as the mother. To implement local resource competition, the stochastic rate of reproduction g i of an organism i depends logistically on the local population density. This local density is measured by a kernel density estimate, that is, by the convolution of the population distribution with a compact interaction kernel O(x). In the simulations below, we choose O(x) to be Gaussian with standard deviation ('interaction range') s. The effect of the environmental gradient is included by stipulating that individuals at position x can reproduce only if they have a trait value > y ax. In the diagram in figure 1, this defines a diagonal 'ramp' (red line) dividing the organisms that are sufficiently adapted to reproduce from those that are not; its slope a reflects the steepness of the environmental gradient. To summarize, the rate of reproduction of individual i can be expressed as where r is the maximal rate of reproduction, k is the carrying capacity per unit of length, and H(y) is the Heaviside step function. We stress that, through equation (1), the growth rate of each organism depends on its own coordinates, as well as on the relative position of every other individual in the population. Lastly, organisms die at a fixed stochastic rate d.
Below we demonstrate that an initially compact population distribution starts to 'climb' the ramp, simultaneously invading new territory and adapting to the gradient. In time, the population front obtains a fixed average velocity in the phenotype-position space, denoted v  . The component of this velocity in the x-direction v x we identify as the rate of invasion, and the component in the y direction v y as the rate of adaptation, as illustrated in figure 1 with black arrows.

Results
In this section, we first study the mean-field limit of the ramp model and calculate the rate of adaptation in this limit analytically. Next, we compare the mean-field results to discrete-stochastic simulations.

The mean-field model
If the population density is large, the following meanfield description would seem appropriate: (Remember that d is the death rate; dn is not a differential.) Here ( ) n x y , is the (now continuous) density of organisms with trait value y and position x. For convenience, we replace all variables and parameters by dimensionless ones denoted by corresponding Greek characters. Effectively, we scale time, position and phenotype such that the net growth rate above the ramp -( ) r d and both diffusion constants D x and D y are set to unity; see table 1. Thus, equation (2) simplifies to This reveals that the behavior of the mean-field model depends on just three lumped, dimensionless parameters: the slope of the ramp α, the death rate δ, and the interaction range σ (the standard deviation of the interaction kernel).
For given parameters, equation (3) can be integrated numerically (see the Methods section). As an example, for .  Table 1. Dimensionless variables and parameters. All quantities of the original model (left column, Roman characters) are replaced by convenient dimensionless alternatives (right column, Greek characters). Effectively, the net growth rate above the ramp -( ) r d and both diffusion constants D x and D y are set to unity by scaling time t, position x and phenotype y. Only four dimensionless parameters remain: the slope of the ramp α, the death rate δ, the carrying capacity κ, and the interaction range σ. The population density ( ) n x y , appears only in the mean-field theory; the rescaled density n c y ( ) , absorbs κ, thus eliminating one more parameter.

Original quantity
Dimensionless alternative , d ) for a series of (equally spaced) time points. This figure clearly shows how, in time, the population front obtains a fixed shape and a constant invasion rate. The asymptotic pulse shape in the two-dimensional phenotype-position space of figure 1 is displayed in figure 2 We wish to derive the asymptotic velocity of the front analytically. To do so, we exploit that the pulse is a pulled front [26], similar to the famous Fisher-Kolmogorov-Petrovsky-Piscounov (F-KPP) front [27,28]. The asymptotic velocity of pulled fronts developing from a compact initial distribution is determined entirely by the propagation velocity of small perturbations of the linearly unstable state n = 0 [26]. Therefore, it can be obtained from linearized equations. Linearizing equation (3) around n = 0 results in n t n n y a c dn This equation describes a diffusing population in two dimensions of individuals that reproduce above the ramp at rate 1 and die below the ramp at a rate δ. Two important conclusions follow. First, the asymptotic velocity of the pulse does not depend on the interaction kernel chosen. Second, we note that a rotation of the coordinate frame leaves equation (4) invariant except for a change in α; conversely, that means that the solutions of equation (4) for different values of α are identical up to a rotation. This directly proves that the velocity of the front parallel to the ramp v  (see figure 1) is independent of α. The rate of adaptation v ψ then follows by projecting v  onto the ψ-axis. The result is This analysis does not exclude that v  might depend on δ. However, we derive in the appendix that = v 2  (as expected for F-KPP-type fronts), independent of δ.
In figure 3 we verify equation (5) by measuring the adaptation rate using numerical solutions of the full nonlinear equation (equation (3)), for a range of values α and δ (see the Methods section below for technical details). Also included are results for the limit d  ¥, where the ramp effectively becomes an absorbing boundary. All results are in excellent agreement with the analytical prediction (solid red line). The rates calculated for different values of δ (symbols) fall right on top of each other, confirming that the adaptation rate is independent of δ.
In terms of the original dimensional parameters, the adaptation rate reads: It is useful to distinguish two regimes in this result. If x . In this regime, the adaptation rate is clearly not limited by the mutation rate D y . Phenotypic diversity is generated at a sufficiently high rate to allow the pulse to invade new territory essentially unimpeded by the required adaptation, at the invasion rate y y , and the adaptation is directly limited by the mutation rate. In this regime the rate becomes independent of both D x and a.
Remarkably, equations (5) and (6) show that the adaptation rate increases strictly monotonically with the gradient slope. This is unexpected: intuitively, adaptation against a very steep gradient should be slow, because in this limit only those rare mutations that cause an unusually large increase in the quantitative trait allow for a significant range expansion. And yet, in the mean-field limit a larger slope always yields faster adaptation, even in the mutation-limited regime of a 1  . To test whether this is an artifact of the mean-field limit, we now compare the above results to After some time, the leading edge of the pulse obtains a fixed shape (red curve) with a constant invasion rate. (B)The front of the twodimensional population distribution converges to the fixed shape shown here. We normalized the maximum density to 1, and both axes have an arbitrary offset. stochastic simulations of discrete organisms at a finite carrying capacity.

Discrete-stochastic simulations
Exact simulations of the stochastic ramp model are challenging. Through equation (1), the reproduction rate of each individual depends on the coordinates of all organisms. Because each organism diffuses in continuous time and space, the reproduction rates of all individuals fluctuate rapidly and in unpredictable ways, which disqualifies the standard Gillespie algorithm [29]. We therefore used an event-driven simulation scheme that is both exact and remarkably efficient. We call it the WHINE algorithm. It can deal with reaction rates that depend on time or on any number of other variables, and could therefore be applied to a large variety of stochastic systems studied within and outside of population biology.
The crucial idea is to extend the stochastic model with a 'dummy' process: an additional process that has no consequences for the behavior of the system. This process is given a reaction rate such that the sum of all rates of reactions affecting each given individual (in this case growth, death, and 'dummy') becomes constant-that is, independent of time and other coordinates, and the same for all individuals. Consequently, the next-reaction time and the 2D diffusion (in continuous space and time) of the organisms become statistically independent and can therefore be sampled from their exact continuous probability distributions, without a need to discretize space, quantitative trait values, or time. The exact algorithm is detailed in the Methods section below.
We performed simulations for a large number of parameter combinations. Figure 4(a) is a snapshots from one of these, showing a dense cloud of nearly 30 000 organisms steadily climbing the ramp (k = 500). Figure 4(b) displays a kernel density estimate of the exact same configuration. In this simulation, the same slope and death rate (α and δ) were used as in the mean-field snapshot of figure 2(b). Comparison between figures 4(b) and 2(b) confirms the expected similarity at high carrying capacities between the pulse of the mean-field model and that of the stochastic model.
We have used such simulations to measure the adaptation rate for a broad range of parameter values. Some results are shown in figure 5(a), were the adaptation rate is plotted against the slope α for many values of the carrying capacity κ. In the dispersion-limited regime (a 1  ), we see that the results are qualitatively similar to the mean-field result (red line), even though significant quantitative deviations are found even at large carrying capacities (this was to be expected; see Discussion). In the mutation-limited regime (a 1  ) however, the mean-field result deviates qualitatively from the stochastic simulations: in the meanfield, the adaptation rate saturates at = y v 2, whereas the stochastic simulations show a rapid decrease in v ψ with increasing α. For any fixed slope α, the adaptation rate approaches the mean-field prediction as the carrying capacity is increased. Nevertheless, for each fixed value of the carrying capacity, the adaptation rate is non-monotonic as a function of α, in striking contrast with the mean-field result. At small carrying capacities, the slope providing the maximal rate of adaptation is near a = 1; at larger carrying capacities the optimum slowly creeps to the right. In all cases, however, the adaptation rate is already close to optimal at a = 1.
In order to understand the observed discrepancy, we study the convergence of the simulation results to the mean-field prediction. We consider the deviation from the mean-field prediction D º - ,S , where we denoted the adaptation rate in the stochastic model by y v ,S and the mean-field expectation by y v ,MF . Figure S1 shows the relative deviation D y v ,MF for three values of the slope α as a function of the carrying capacity κ. In the simulated regime of k = 1 to 1000, we find that the convergence to the mean-field result is well described by a power law where the prefactor Q and exponent μ depend on the parameters other than κ (to wit, α, δ and σ). Figure 6(a) plots exponent μ as a function of the slope α, for two values of interaction range σ. In the dispersion-limited regime, μ is roughly constant, but in the mutation-limited regime the scaling is well described by m a µ -1 2 . This does not depend on a particular choice of σ or δ: While the convergence becomes slower over-all if σ is increased, the scaling is not affected ( figure 6(a)), and if the death rate δ is reduced, the convergence becomes slower in the mutation-limited regime, but the same square-root behavior is found for large enough α (see figure  S2(a)). We conclude that, in the mutation-limited regime, the convergence to the mean-field limit slows down strongly as α increases. This non-uniform convergence explains how a non-monotonic dependence of the adaptation rate on the slope at any finite population density can, in fact, be consistent with a monotonic dependence in the mean-field limit. Figure 6(b) plots the prefactor Q as a function of α, again for two values of σ. It appears that Q is proportional with α when a  1 but converges toa 2 1 when a  1. This is remarkably insensitive to changes in σ (figure 6(b)) or δ ( figure S2(b)).
As a result of these scalings, we can reasonably summarize our simulation results as follows: Here, the coefficients c 1 , c 2 and c 3 depend on σ and δ, even though c 1 and c 2 appear to be very insensitive to changes in these parameters (as seen in figures 6(b) and S2(b)). Figure S3 demonstrates the quality of equation (7) as an approximation to the data of figure 5. Incidentally, we note that an expansion of equation (7) in the limit of large α shows that y v scales as a µ -; 1 2 this behavior can indeed be seen in figures 5 and S3 for sufficiently large α and κ.
One rather drastic assumption of the ramp model is that the rate of reproduction switches discontinuously from zero to a finite value as an organism moves from below to above the ramp. To test whether this discontinuity crucially affects the above results, we  figure A (Gaussian kernel, band width b = 1). The maximal density is normalized to 1 to allow comparison with the mean-field pulse in figure 2(b), which has a highly similar shape. Figure 5. Adaptation rate v ψ versus the slope α at various carrying capacities: k Î { } 1, 2, 5, 10, 20, 50, 100, 500 . The red solid line shows the mean-field prediction. While a strictly monotonic relation is predicted by the mean-field result, stochastic simulations show an optimum around a  1. With increasing carrying capacity, the simulation results approach the mean-field result, but at any finite carrying capacity the adaptation rate is non-monotonic as a function of α. At sufficiently large α and κ, we find that v ψ scales as a -1 2 . Parameters: d = 1 9 and s = 1, as in figure 4. performed additional simulations in which the step function H(y) of equation (2)b is replaced by a sigmoidal (logistic) function L(y) defined as: Here w is the width of the transition region, arbitrarily defined as the interval where L(y) increases from 12% to 88% of its maximal value. As shown in figure S4, the results are essentially the same as for the step function, even if the transition region is wide (w = 16).

Discussion and conclusions
We have presented results from stochastic simulations and a corresponding mean-field model. These results identified two qualitatively different regimes: a dispersion-limited regime (a D D y x 2  ) and a mutationlimited regime (a D D y x 2  ). In the dispersionlimited regime, the invasion rate v x is not affected by the gradient. As a consequence, the adaptation rate = v av y x is linear in a. In this regime, the velocity of the pulse is insensitive to the mutation rate D y , which is sufficiently high to provide the phenotypic variation required for the population to adapt sufficiently fast. The dispersion rate D x , however, does directly limit the invasion rate, which is = - x in the mean field-the rate expected for an F-KPP frontwith quantitative corrections at finite carrying capacities. In the mutation-limited regime the situation becomes more complex. The mean-field model predicts an adaptation rate = -( ) v r d D 2 y y independent of a or D x . This prediction, however, proves highly misleading: the adaptation rate in fact decreases with a at any finite carrying capacity. This discrepancy is caused by a non-uniform convergence to the meanfield limit for different values of a. At any finite carrying capacity, then, a non-monotonic dependence of the adaptation rate on the slope is found. The optimal gradient for adaptation (all else equal) occurs where these two regimes meet, that is, at » a D D y x . The effect of stochasticity on the propagation of pulled fronts has been studied extensively, in particular in the context of (variants of) the F-KPP front in one and two dimensions (e.g., [26,[30][31][32][33][34][35][36][37]). This work has shown that, in such fronts, the convergence to the mean-field limit with increasing population density is anomalously slow, so that significant quantitative differences can be found between mean-field and finitedensity stochastic models even at large population densities. The result above show, however, that the consequences are not necessarily limited to quantitative deviations but can actually lead to qualitatively wrong conclusions.
Heuristically, the large difference between the mean-field result and the stochastic simulations in the mutation-limited regime can be understood by considering the effects of scaling the spatial dimension (the x axis). In both models, scaling space must leave the rate of adaptation (i.e., the propagation in the y direction) unchanged, but does change the slope a. In the mean field, the velocity of the pulse is determined by the behavior far ahead of the bulk, where the linearized equation reigns and the nonlinear interaction kernel can be ignored. Then, the effect of scaling the x axis by a factor ò is only to replace the slope by  a and the dispersion rate by  D x 2 . The adaptation rate at a given a and D x must therefore equal the adaptation rate at slope 1 and dispersion rate D a x 2 . That is, increasing a has the same effect on the adaptation rate as increasing D x : it leads to an increase in the adaptation rate that saturates when dispersion is no longer limiting. In the stochastic model, however, the interaction kernel cannot be ignored. Scaling of the x axis now has two Figure 6. Convergence of stochastic simulations to the mean-field result with increasing carrying capacity κ. In the simulated regime of k = 1 to 1000, the deviation Δ from the mean-field result appears to obey a power law , where the exponent μ and the prefactor Q depend on the slope α (see figure S1). (A) Plotted is μ as a function of α, for s = 1 (black crosses) and s = 8 (red triangles). Two regimes are found: for a < 1, μ is approximately constant, whereas for a > 1 the data are well summarized by m a µ -1 2 . The standard deviation of the interaction kernel σ affects the numerical constants, but not the scaling. (B) The prefactor Q as a function of α, for s = 1 (black crosses) and s = 8 (red triangles). For a < 1, we find that a µ Q , whereas for a  1 the results are well fitted by a = -Q 2 1 . These results are remarkably insensitive to changes in σ. (In both plots, d = 1 9.) additional effects: it simultaneously decreases the carrying capacity k and increases the standard deviation s of the interaction kernel by a factor ò. Both of these effects tend to reduce the rate of adaptation and move the system away from the mean-field limit towards a strong-noise regime. In addition, as D x is increased, individuals at the leading edge of the pulse rapidly diffuse away, into the regime > x y awhere they cannot reproduce; this further reduces the population density at the front. Unfortunately, we have not yet been able to find a mathematical derivation of the precise scalings found in figure 6; we hope that future work will fill this void, perhaps based on the scalings found for the strong-noise regime of the F-KPP wave [35,37].
In the past few years, two models have been published that seem very similar to the ramp model on the surface, but differ crucially from it upon closer inspection [14,15].
The first model is the so-called staircase model [14]. The assumptions of this model are closely analogous to the ramp model. The main difference is that in the staircase model both space and phenotype are discretized. Bacteria can stochastically migrate between neighboring 'compartments', but in order to reproduce in compartment i a bacterium must accumulate i 1 mutations. Consequently, instead of the continuous ramp of the ramp model, (red line in figure 1) it features a discrete 'staircase' allowing adaptation and invasion to proceed step by step. This seemingly minor difference makes a direct comparison between the models highly problematic. First and foremost, in the staircase model the steepness of the gradient is not a parameter; because the staircase is directly linked to the discretization of genotype and space, the steepness cannot be varied independently. Consequently, the main results of the ramp model presented above have no direct counterpart in the staircase model. Second, it is tempting to view the ramp model as the continuum limit of the staircase model; that is, as the limit in which the size of the compartments Dx and the spacing between the discrete phenotypes Dy are reduced while simultaneously the migration and mutation rates ν and μ are adjusted such as to keep the macroscopic rates of diffusion nDx 2 and mDy 2 constant. But there are complications. In the staircase model, individuals compete only if they are in the same spatial compartment. In the continuum limit, the interaction range therefore tends to zero, with pathological consequences. Moreover, the published analysis of the staircase model explicitly assumed that the compartments are somewhat isolated (specifically, that the migration rate is low compared to the death rate) and that the carrying capacity per compartment is large. Both assumptions are violated in a continuum limit; consequently, in this limit the published results on the staircase model become invalid.
Indeed, while so-called mutation-limited and dispersion-limited (migration-limited) regimes were found in both models, these regimes have entirely different interpretations. For instance, in the ramp model, the dispersion-limited regime exists because, at a very high mutation rate, the invasion rate approaches but cannot exceed the velocity of the F-KPP model, which depends on the dispersion rate and not on the mutation rate. In the staircase model, the dispersion-limited regime (as defined in [14]) occurs when the mutation rate is high enough to expect that, during each step on the staircase, multiple mutant lineages compete to become the founder of the next compartment; increasing the mutation rate further then has a sub-linear effect on the adaptation rate. These mechanisms have little in common.
The second model is by Greulich et al [15]. In this model, real space is essentially continuous, but the phenotype/genotype space is again discrete. In order to more faithfully mimic the evolution of antibiotic resistance in a drug gradient, a few specific assumptions are made that deviate from the staircase and ramp models. For example, the drug concentration profile is assumed to be exponential and the effect of mutations on the resistance level multiplicative. It turns out, however, that these differences can be removed by a simple variable transformation: a logtransformation of both the drug concentration and the resistance levels yields a model in which the gradient is linear and the effect of mutations additive, more comparable to the ramp model. Like the ramp model, the Greulich model predicts a dispersion-limited and a mutation-limited regime for shallow and steep gradients, respectively. Greulich et al do not discuss the dispersion-limited regime, but we expect the adaptation rate to increase linearly with the gradient, in line with the ramp model. In the mutation-limited regime, the adaptation rate is limited by the arrival and fixation of consecutive mutations. Under the assumption that the dispersion rate is not too high, the authors derive a scaling of m µ v dk a y , where μ is the mutation probability per cell division. This differs from the scaling ofa 1 2 found in the ramp model. This seems to be related to another difference between the models: the Greulich model assumes that the growth rate of a given genotype reduces gradually with the drug concentration, whereas the ramp model assumes a step function. The gradual decrease results in a spatial domain within the habitat of each given genotype in which the next mutation is under positive selection; new mutations are most likely to establish in this domain. Importantly, the width of this domain decreases linearly with a, which produces the observed scaling of the dispersion rate. We conclude that, in the Greulich model, the rate of adaptation is non-monotonic as a function of the slope, as in the ramp model -but the scaling is qualitatively different due to several differences in the modeling choices and the parameter regimes considered.
We have presented a model that was highly idealized; our aim was to only include the minimal ingredients required for adaptation of a quantitative trait to an environmental gradient. In future work, more complex and realistic models can be studied; the simple model presented here can provide a null expectation for such extensions. We do expect that several basic results will hold up in more realistic versions. First, we expect that every reasonable model will feature dispersion-and mutation-limited regimes for shallow and steep gradients, respectively. In the dispersion-limited regime, the rate of adaptation should increase linearly with the magnitude of the gradient; in the mutation-limited regime, a decreasing function should be found, the scaling of which may however depend on details of the fitness landscape and genotype-phenotype mapping assumed. Regardless of those details, the optimal adaptation rate should occur in the transition region between these two regimes. Generally, we expect this transition to occur at a modest slope, that is, a slope that hampers the invasion rate mildly. Such qualitative predictions may in principle be tested experimentally, e.g. using bacterial populations in microfluidic devices or on agar plates in which an environmental gradient is set up. We hope that the ramp model, together with its predecessors, will prove useful in interpreting such experiments.

Numerical solutions of the mean-field equations
To numerically solve the mean-field equation (equation (3)) we used the Method Of Lines: we defined a 2D grid, approximated the PDE using finite differences, and integrated the resulting set of ODEs in Matlab using the ode113() function, which is a variable-step, variable-order Adams-Bashforth-Moulton PECE solver.
For cases with a  1, we used a straightforward rectangular grid. For cases with a < 1, we instead used a grid that was sheared vertically such as to align the grid points with the ramp. In both cases, the Laplacian at each grid point was evaluated using the 5×5 array of grid points surrounding it; the proper finitedifference coefficients were calculated specifically for each α.
The simulation box itself was shaped as a parallelogram, with sides parallel the vertical (phenotype) axis and to the ramp. This box was shifted periodically (parallel to the ramp) in order to keep the pulse front near the middle of the box. On the left-hand side of the box, reflecting (homogeneous Neuman) boundary conditions were imposed; this introduces small artifacts near this boundary, but does not affect the front of the pulse. The other three sides were absorbing (homogeneous Dirichlet) boundaries.

Stochastic simulations: the WHINE algorithm
As explained above, we performed individual-based stochastic simulations using an event-driven algorithm that treats the complex density-and position-dependent growth rates exactly. The main challenge for our simulations is that the organisms move diffusively in continuous space, and therefore the reaction rates (of reproduction and death) cannot be assumed constant between reactions, which rules out (variants of) the Gillespie algorithm. The key idea is to extend the model with a dummy reaction, with a reaction rate chosen such that the diffusive path of each organism (in two dimensions) becomes statistically independent of the next-reaction time. This idea can be used to simulate a diverse class of stochastic systems; here we describe its implementation for the current model.
For vividness, we may imagine that our organisms tend to whine, particularly under conditions unfavorable for reproduction; this is the dummy reaction. The rate of whining w i of individual i is chosen to be = -( ) w r g. 9 i i (Remember that r and g i are the maximal rate of reproduction, at low densities, and the actual rate of reproduction of individual i; see equation (1).) With this choice, the rates of all discrete reactions that can affect any given organism (growth, death, whining) add up to a constant R: Note that, by construction, R is the same for all organisms and independent of density, time, or any coordinates. The algorithm then repeats the following steps as often as desired: 1. Choose next-reaction time. Denote the total number of organisms in the system at time t by N(t). Then the sum of all reaction rates in the whole system is ( ) N t R, which is not affected by the 2D diffusion of the individuals and therefore does not change between reactions. Given the time of the previous event t p , this allows a next-event time t n to be sampled as in the standard Gillespie algorithm, by drawing a waiting time D ºt t t n p from the exponential waiting-time for x and y, respectively).
3. Choose the affected individual. We now know when the next reaction will take place, and where all organisms are at that time; we proceed to choose whom the event will affect. The total event rate R is the same for all individuals; therefore, the affected individual a may be chosen uniformly at random.
4. Choose the reaction. Next, we decide which reaction occurs: reproduction, death, or whining. Now that the next-event time t n , the coordinates of all individuals at that time, and the affected individual a are known, the probabilities of the event types are simply proportional to their rates at t n . To choose the event the ordinary 'roulette wheel' algorithm can be used (as in the standard Gillespie scheme) after the reproduction rate of a has been calculated explicitly using equation (1).
(We note that the expensive calculation of the reproduction rate can be avoided in many time steps: because the probability that a death event occurs is always d/R, the reaction rate needs to be calculated only if it is decided that a does not die.)

Execute the reaction.
If it turns out that the organism merely whines, nothing needs to be done; otherwise, the chosen reaction is executed.
The introduction of the dummy reaction obviously introduces overhead. In some of our simulations, 5 out of 6 events are dummy events, effectively resulting in a sixfold reduced time step. That said, the algorithm is in many ways remarkably efficient. As we emphasized, the reaction rates affecting a given organism depend on its own coordinates-whether it is above or below the ramp-and on the relative position of all other organisms. All these coordinates fluctuate due to the continuous-time diffusion of each organism. Yet, to deal with this exactly, at each time step of WHINE algorithm the reaction rates for only one organism (at most) need to be calculated explicitly.
We note that dummy reactions have been used before to efficiently simulate stochastic systems-e.g., see [15]. In these applications, however, spatial coordinates were discretized, with individuals performing a discrete random walk rather than diffusion in continuous space and time (a Wiener process). As a result, in these systems all reaction rates are constant between reactions. The above algorithm is remarkable in that it properly treats systems in which the reaction rates fluctuate unpredictably in continuous time.
During the simulations, the population expands as it invades new territory. To cap the population size, the simulation box we used is the half-plane bounded on the left by a reflective boundary = ( ) x b t that moves along with the leading edge of the pulse; there is no need to limit the simulation box in the y direction or on the right-hand side. As the boundary moves, organisms are 'lost' on the left. The boundary b(t) is frequently updated such that the pulse has a simulated range in the x-direction of »60 dimensionless units (as in figure 4). This is enough: extensive test simulations with a range of »100 gave nearly identical results.
The simulations were performed using custom code written in Matlab (prototype and first results) and Fortran. In the latter case we used the ziggorat method [38] (translated to Fortran by Alan Miller), to efficiently draw Gaussian random numbers (the algorithm needs many).

Acknowledgments
Part of this work was performed at the Department of Bionanoscience of the TU Delft; I am grateful for their valuable support. In particular, I thank Cees Dekker for his encouragement and advice. This work was also supported by the NWO (Nederlandse Organisatie voor Wetenschappelijk Onderzoek) (VENI 680-47-419).

Appendix. Derivation of the velocity of the pulse
We here derive that the velocity of the pulse parallel to the ramp v  equals 2 (in terms of the dimensionless quantities of table 1) independent of the death rate δ. To obtain v  , we analyze the linearized equation where ( ) x Di is the Dirac delta function. Because we established that v  is independent of α (see the Results) we can choose a = 0 without loss of generality. We then exploit that equation (4) is separable by making the Ansatz n c y t c t y t = t ( ) ( ) ( ) ( ) f g , , e , , .
A.2 (The prefactor t ( ) exp is arbitrary but will simplify the equations below.) Our task is to derive the functions f and g. We note, however, that we do not need the full solution of y t ( ) g , : we are satisfied with an expression for ò t y t y º ( ) ( ) † g g , d , because the velocity of the front can be obtained from the (marginal) population distribution in real space, n c t ( ) † , , defined as ò n c t n c y t y t , , d e , . A.3 We combine equation (A.2) with equation (4) of the main text to obtain separate differential equations for f and g. We find that c t ( ) f , simply obeys the diffusion equation in one dimension, with the standard solution (Again, ( ) x Di is the Dirac delta function.) This describes an ensemble of particles that diffuse in one dimension and are degraded in the domain y  0 at a rate d + 1 , after initially being placed right at the boundary of that domain.
To make progress, we perform a Laplace transformation. We denote the Laplace transform of c t ( ) g , by c ( ) G s , , and use that the Laplace transform of the time-derivative of c t ( ) g , equals c ( ) sG s , c -( ) g , 0 . Thus equation (A.5) is converted to a second-order differential equation that can be solved exactly. To deal with the delta peak of the initial condition, we temporarily replace it by a rectangular peak with a finite width ò and height  1 , solve the differential equation, and then take the limit of   0. We thus obtain: For any δ, this equation has the same functional form as the equivalent result for the F-KPP front in two dimensions, which is known to have the asymptotic velocity of = v 2 F . Therefore, the asymptotic velocity of the pulse is equal to that of the 2D F-KPP equation: