The Molecular Clock of Neutral Evolution Can Be Accelerated or Slowed by Asymmetric Spatial Structure

Over time, a population acquires neutral genetic substitutions as a consequence of random drift. A famous result in population genetics asserts that the rate, K, at which these substitutions accumulate in the population coincides with the mutation rate, u, at which they arise in individuals: K = u. This identity enables genetic sequence data to be used as a “molecular clock” to estimate the timing of evolutionary events. While the molecular clock is known to be perturbed by selection, it is thought that K = u holds very generally for neutral evolution. Here we show that asymmetric spatial population structure can alter the molecular clock rate for neutral mutations, leading to either Ku. Our results apply to a general class of haploid, asexually reproducing, spatially structured populations. Deviations from K = u occur because mutations arise unequally at different sites and have different probabilities of fixation depending on where they arise. If birth rates are uniform across sites, then K ≤ u. In general, K can take any value between 0 and Nu. Our model can be applied to a variety of population structures. In one example, we investigate the accumulation of genetic mutations in the small intestine. In another application, we analyze over 900 Twitter networks to study the effect of network topology on the fixation of neutral innovations in social evolution.


Introduction
A half-century ago, Zuckerkandl and Pauling [1] discovered that amino acid substitutions often occur with sufficient regularity as to constitute a "molecular clock". Theoretical support for this observation was provided by Kimura [2], who argued that observed rates of amino acid substitution could only be explained if the majority of substitutions are selectively neutral. Under simple models of evolution, a single neutral mutation has probability 1/N of becoming fixed in a haploid population of size N. It follows that the rate K of neutral substitution per generation-given by the product of the population size N, the mutation probability u per reproduction, and the fixation probability ρ-is simply equal to u. (A similar cancellation occurs in diploids, leading again to K = u.) In other words, for any neutral genetic marker, the rate of substitution at the population level equals the rate of mutation at the individual level [2].
A number of factors can alter the rate of neutral substitution [3,4], including selection, changes in population demography over time, or mutation rates that vary systematically by demographic classes [5,6] or sex [7]. The extent to which these factors compromise the applicability of the molecular clock hypothesis to biological sequence data has been intensely debated [3,4,[8][9][10][11][12][13][14].
Here we show that the absence of spatial effects on the rate of neutral substitution in these models does not represent a general principle of evolution. Rather, it is an artifact of two common modeling assumptions: (i) all spatial locations are equivalent under symmetry, and (ii) mutations are equally likely to arise in each location. While assumption (i) is relaxed in a number of models, assumption (ii) is made almost universally. Either of these assumptions alone is sufficient to guarantee ρ = 1/N and K = u, as we will show. However, neither of these assumptions is necessarily satisfied in a biological population. In particular, assumption (ii) is violated if some spatial locations experience more turnover (births and deaths) than others. Since each birth provides an independent opportunity for mutation, the rate at which new mutations appear at a location is proportional to its turnover rate [32]. Thus, even with a constant probability of mutation per birth, mutations may appear with different frequency at different locations. If, in addition, fixation probability depends on a mutant's initial location, the rate of neutral substitution is altered.
Our goal is to identify conditions under which the molecular clock rate is maintained (K = u), accelerated (K > u), or slowed (K < u) by spatial population structure. Our main results are as follows (see also Fig. 1): Result 1 If the death rate is constant over sites, the molecular clock rate is identical to that of a well-mixed population: K = u.
Result 2 If the birth rate equals the death rate at each site, the molecular clock rate is identical to that of a well-mixed population: K = u.
Result 3 If the birth rate is constant over sites, the molecular clock rate is less than or equal to that of a well-mixed population: K u.

Result 4
In general (with no constraints on birth or death rates), the molecular clock rate K can take any non-negative value less than Nu.

Obtaining the neutral substitution rate
Our investigations apply to a class of evolutionary models (formally described in the Methods) in which reproduction is asexual and the population size and spatial structure are fixed. Specifically, there are a fixed number of sites, indexed i = 1, . . ., N. Each site is always occupied by a single individual. At each time-step, a replacement event occurs, meaning that the occupants of some sites are replaced by the offspring of others. Replacement events are chosen according to a fixed probability distribution-called the replacement rule-specific to the model in question.
Since we consider only neutral mutations that have no phenotypic effect, the probabilities of replacement events do not depend on the current population state. How spatial structure affects the molecular clock rate K. Relative to the rate in a well-mixed population (K = u), spatial structure can either accelerate (K > u) or slow (K < u) the accumulation of neutral substitutions, depending on how birth rates b i and death rates d i vary across sites. The rate is unchanged from that of a well-mixed population (K = u) if either death rates are uniform across sites (Result 1), or the birth and death rates are equal at each site (Result 2). Almost all previous studies of neutral drift in spatially structured populations fall into one of these two categories; thus the effects of spatial structure on the molecular clock rate are unappreciated. We show that, in general, K can take any non-negative value less than Nu (Result 4). If one adds the constraint that the birth rate is the same at each site, then the molecular clock rate cannot exceed that of a well-mixed population (K u; Result 3).
This class includes many established evolutionary models. One important subclass is spatial Moran processes [23,[33][34][35], in which exactly one reproduction occurs each time-step. This class also includes spatial Wright-Fisher processes, in which the entire population is replaced each time-step [36,37]. In general, any subset R & {1, . . ., N} of individuals may be replaced in a given replacement event. Parentage in a replacement event is recorded in an offspring-to-parent map α:R ! {1, . . ., N} (see Methods, [32,38]), which ensures that each offspring has exactly one parent and allows us to trace lineages over time.
For a given model in this class, we let e ij denote the (marginal) probability that the occupant of site j is replaced by the offspring of site i in a single time-step. Thus the expected number of offspring of site i over a single time-step is b i ¼ P N j¼1 e ij . The probability that node i dies (i.e., is replaced) in a time-step is d i ¼ P N j¼1 e ji . The death rate d i can also be regarded as the rate of turnover at site i. The total expected number of offspring per time-step is denoted We define a generation to be N/B time-steps, so that, on average, each site is replaced once per generation.
We use this framework to study the fate of a single neutral mutation, as it arises and either disappears or becomes fixed. The probability of fixation depends on the spatial structure and the initial mutant's location. We let ρ i denote the probability that a new mutation arising at site i becomes fixed. (ρ i can also be understood as the reproductive value of site i [39].) We show in the Methods that the fixation probabilities ρ i are the unique solution to the system of equations Equation (2) arises because ρ i equals the probability that the current occupant of site i will become the eventual ancestor of the population, which is true for exactly one of the N sites.
To determine the overall rate of substitution, we must take into account the likelihood of mutations arising at each site. The rate at which mutations arise at site i is proportional to the turnover rate d i , because each new offspring provides an independent chance of mutation. Specifically, if mutation occurs at rate u ( 1 per reproduction, then new mutations arise at site i at rate Nud i /B per generation [32]. Thus the fraction of mutations that arise at site i is d i /B.
The overall fixation probability ρ of new mutations, taking into account all possible initial sites, is therefore The molecular clock rate K is obtained by multiplying the fixation probability ρ by the total rate of mutation per generation: The units of K are substitutions per generation. Alternatively, the molecular clock can be expressed in units of substitutions per time-step, in which case the formula is

Effects of spatial structure
How does spatial structure affect the rate of neutral substitution? In a well-mixed population, each individual's offspring is equally likely to replace each other individual, meaning that e ij is constant over all i and j (Fig. 2a). In this case, the unique solution to Eqs. (1)-(2) is ρ i = 1/N for all i, and we recover Kimura's [2] result K = Nu(1/N) = u. Moreover, if each site is equivalent under symmetry, as in Fig. 2b, this symmetry implies that ρ i = 1/N for all i and K = u as in the well-mixed case. However, asymmetric spatial structure can lead to faster (K > u) or slower (K < u) molecular clock rates than a well-mixed population, as shown in Fig. 3. From Eqs. (2) and (4) we can see that K > u is equivalent to the condition dr > dr, where the bars indicate averages over i = 1, . . ., N. This means that the molecular clock is accelerated if and only if d i and ρ i are positively correlated over sites; that is, if and only if there is a positive spatial correlation between the arrival of new mutations and the success they enjoy.
These results led us to seek general conditions on the spatial structure leading to faster, slower, or the same molecular clock rates as a well-mixed population. We first find Result 1. If the death rates d i are constant over all sites i = 1, . . ., N, then ρ = 1/N, and consequently K = u.
Thus the molecular clock rate is unaffected by spatial structure if each site is replaced at the same rate (Fig. 4a). This result can be seen by noting that if the d i are constant over i, then since  Thus if births and deaths are balanced at each site, then all sites provide an equal chance for mutant fixation (Fig. 4b). In this case the molecular clock is again unchanged from the baseline value. In particular, if dispersal is symmetric in the sense that e ij = e ji for all i and j then K = u. Result 2 can be obtained by substituting ρ i = 1/N for all i into Eq. (1) and simplifying to obtain b i = d i for all i (details in Methods). Alternatively, Result 2 can be obtained as a corollary to the Circulation Theorem of Lieberman et al. [23].
Our third result reveals a "speed limit" to neutral evolution in the case of constant birth rates: In other words, a combination of uniform birth rates and nonuniform death rates slows down the molecular clock. An instance of this slowdown in shown in Fig. 3a. Intuitively, the sites at which mutations occur most frequently are those with high death rates d i ; because of these high death rates, these sites on the whole provide a reduced chance of fixation. The proof of this result, however, is considerably more intricate than this intuition would suggest (see Methods).
Finally, we investigate the full range of possible values for K with no constraints on birth and death rates. We find the following: Result 4. For arbitrary spatial population structure (no constraints on e ij ) the fixation probability can take any value 0 ρ < 1, and consequently, the molecular clock can take any rate 0 K < Nu. Asymmetric spatial structure affects the rate of neutral substitution. This is because the frequency of mutations and the probability of fixation differ across sites. Turnover rates are indicated by coloration, with red corresponding to frequent turnover and consequently frequent mutation. (a) A star consists of a hub and n leaves, so that the population size is N = n+1. Edge weights are chosen so that the birth rates are uniform (b i = 1 for all i). Solving Eqs. (1)-(2), we obtain site-specific fixation probabilities of ρ H = 1/(1+n 2 ) and ρ L = n/(1+n 2 ) for the hub and each leaf, respectively. From Eq. (4), the molecular clock rate is K ¼ 2n 1þn 2 u, which equals u for n = 1 and is less than u for n ! 2. Thus the star structure slows down the rate of neutral substitution, in accordance with Result 3. Intuitively, the slowdown occurs because mutations are more likely to arise at the hub, where their chances of fixation are reduced.   Result 4 shows that K can achieve any value 0 K < Nu. This is proven by considering a population structure with unidirectional gene flow from a hub (H) to N−1 leaves (L). Fixation is guaranteed for mutations arising in the hub (ρ H = 1) and impossible for those arising in leaves (ρ L = 0). The overall fixation probability is equal by Eq. (3) to the rate of turnover at the hub: ρ = d H = 1−(N−1)a. The molecular clock rate is therefore K = Nuρ = N[1−(N−1)a]u. It follows that K > u if and only if a < 1/N. Intuitively, the molecular clock is accelerated if the hub experiences more turnover (and hence more mutations) than the other sites. Any value of ρ greater than or equal to 0 and less than 1 can be achieved through a corresponding positive choice of a less than or equal to 1/(N−1). For a = 1/(N−1) we have K = 0, because mutations arise only at the leaves where there is no chance of fixation. At the opposite extreme, in the limit a ! 0, we have K ! Nu. This result is especially surprising, in that it implies that the probability of fixation of a new mutation can come arbitrarily close to unity. Result 4 can be proven by considering the hypothetical spatial structure illustrated in Fig. 5. Any non-negative value of ρ less than 1 can be obtained by an appropriate choice of parameters (details in Methods).

Application to upstream-downstream populations
To illustrate the effects of asymmetric dispersal on the molecular clock, we consider a hypothetical population with two subpopulations, labeled "upstream" and "downstream" (Fig. 6). The sizes of these subpopulations are labeled N " and N # , respectively. Each subpopulation is well-mixed, with replacement probabilities e " for each pair of upstream sites and e # for each pair of downstream sites. Dispersal between the subpopulations is represented by the replacement probabilities e ! from each upstream site to each downstream site, and e from each downstream site to each upstream site. We assume there is net gene flow downstream, so that e ! > e .
Solving Eqs. (1)-(2), we find that the fixation probabilities from each upstream site and each downstream site, respectively, are These fixation probabilities were previously discovered for a different model of a subdivided population [40]. Substituting these fixation probabilities into Eq. (4) yields the molecular clock rate: Above, d " and d # are the turnover rates in the upstream and downstream populations, respectively, and B = N " d " +N # d # is the total birth rate per time-step. In Methods, we show that K > u if and only if d " > d # ; that is, the molecular clock is accelerated if and only if there is more turnover in the upstream population than in the downstream population.
In the case of unidirectional gene flow, e = 0, the molecular clock rate is simply K = (d " /B) Nu. The quantity d " /B represents the relative rate of turnover in the upstream population, and can take any value in the range 0 d " /B < 1/N " ; thus K takes values in the range 0 K < (N/ N " )u. We note that the upper bound on K is inversely proportional to the size N " of the upstream population. The largest possible values of K are achieved when N " = 1, in which case K can come arbitrarily close to Nu. These bounds also hold if there are multiple downstream subpopulations, since for unidirectional gene flow, the spatial arrangement of downstream sites does not affect the molecular clock rate. In particular, if the hub and leaves in Fig. 5 are each replaced by well-mixed subpopulations, then K is bounded above by (N/N H )u, where N H is the size of the hub subpopulation.

Application to epithelial cell populations
Our results are also applicable to somatic evolution in self-renewing cell populations, such as the crypt-like structures of the intestine. Novel labeling techniques have revealed that neutral mutations accumulate in intestinal crypts at a constant rate over time [41]. The cell population in each crypt is maintained by a small number of stem cells that reside at the crypt bottom and continuously replace each other in stochastic manner ( Fig. 7; [42][43][44]). We focus on the proximal small intestine in mice, for which recent studies [41,45] suggest there are * 5 active stem cells per crypt, each replaced * 0.1 times per day by one of its two neighbors. In our framework, this corresponds to a cycle-structured population of size 5 with replacement rates 0.05/ day between neighbors, so that d i = 0.1/day for all i.
Only mutations that arise in stem cells can become fixed within a crypt; thus we need only consider the fixation probabilities and turnover rates among stem cells. By symmetry among the stem cells, ρ i = 1/5 for each of the five stem cell sites. The molecular clock rate is therefore K ¼ u P 5 i¼1 d i r i ¼ 0:1u substitutions per day. This accords with the empirical finding that, for a neutral genetic marker with mutation rate u % 1.1×10 −4 , substitutions accumulate at a ratẽ K % 1:1 Â 10 À5 per crypt per day [41].
Does crypt architecture limit the rate of genetic change in intestinal tissue? Intestinal crypts in mice contain * 250 cells and replace all their cells about once per day [46]. If each crypt were a well-mixed population, the molecular clock rate would beK ¼ Bu=N % u substitutions per day. Thus the asymmetric structure of these epithelial crypts slows the rate of neutral genetic substitution tenfold.

Application to the spread of ideas
Our results can also be applied to ideas that spread by imitation on social networks. In this setting, a mutation corresponds to a new idea that could potentially replace an established one. Neutrality means that all ideas are equally likely to be imitated.
To investigate whether human social networks accelerate or slow the rate of idea substitution, we analyzed 973 Twitter networks from the Stanford Large Network Dataset Collection [47]. Each of these "ego networks" represents follower relationships among those followed by a single "ego" individual (who is not herself included in the network). We oriented the links in each network to point from followee to follower, corresponding to the presumed direction of information flow. Self-loops were removed. To ensure that fixation is possible, we eliminated individuals that could not be reached, via outgoing links, from the node with greatest eigenvector centrality. The resulting networks varied in size from 3 to 241 nodes.
To model the spread of ideas on these networks, we set e ij = 1/L if j follows i and zero otherwise, where L is the total number of links. This can be instantiated by supposing that at each time-step, one followee-follower link is chosen with uniform probability. The follower either adopts the idea of the followee, with probability 1−u, or innovates upon it to create a new idea, with probability u, where u ( 1. With these assumptions, the resulting rate of idea substitution (as a multiple of u) depends only on the network topology and not on any other parameters.
We found that the mean value of K among these ego networks is 0.557u, with a standard deviation of 0.222u. 19 of the 973 networks (2%) have K > u. Two networks have K = u exactly; each of these has N = 3 nodes and uniform in-degree d i , thus K = u follows from Result 1 for these networks. We found a weak but statistically significant negative relationship between the  [41] and [45]. A small number of stem cells (N " * 5) residing at the bottom of the intestinal crypt and are replaced at rate d " * 0.1 per stem cell per day. Empirical results [41,45] suggest a cycle structure for stem cells. To achieve the correct replacement rate we set e ij = 0.05/day for each neighboring pair. Stem cells in an individual crypt replace a much larger number of progenitor and differentiated cells (* 250; [46]). These downstream progenitor and differentiated cells are replaced about every day [46]. The hierarchical organization of intestinal crypts, combined with the low turnover rate of stem cells, limits the rate of neutral genetic substitutions (K~% 0:1u substitutions per day), since only mutations that arise in stem cells can fix.
doi:10.1371/journal.pcbi.1004108.g007 One possible explanation for the rarity of networks that accelerate idea substitution has to do with the intrinsic relationship between the turnover rates d i and the site-specific fixation probabilities ρ i . From Eq. (1), we see that ρ i can be written as the product where the first factor can be interpreted as the "attention span" of node i and the second can be interpreted as its influence. While these two factors are not strictly independent, we would not necessarily expect a systematic relationship between them in our Twitter network ensemble. In the absence of such a relationship, ρ i is inversely related to d i , which implies K < u. In other words, the most fertile nodes (in terms of generating new ideas) are also the most fickle (in terms of adopting the ideas of others); thus many new ideas are abandoned as soon as they are generated. This heuristic argument suggests that K > u, while possible, might be an uncommon occurrence in networks drawn from statistical or probabilistic ensembles.

Discussion
The spatial structure of a population affects its evolution in many ways, for example by promoting cooperative behaviors [36,[48][49][50][51][52][53][54], genetic variation [55][56][57][58], and speciation [59][60][61]. Asymmetric spatial structure in particular is known to have important consequences for adaptation [23,27,28,35,[62][63][64][65] and for genetic diversity [66][67][68]. Our work shows that asymmetric spatial structure also affects the rate of neutral substitution. In light of Results 1 and 2, we see that the critical factors driving the changes in molecular clock rate are differential probabilities of turnover (d i ) and net offspring production (b i −d i ) across sites. If both d i and b i −d i differ across sites, the molecular clock rate will in general differ from that of a well-mixed population (Result 4). If additionally b i is constant across sites, Result 3 guarantees that neutral substitution is slowed relative to the well-mixed case.
Our Result 4 shows that the rate of neutral substitution in a population can come arbitrarily close to Nu. However, this result depends on the existence of a single "hub" individual seeding the rest of the population as in Fig. 5. A similar but more plausible scenario (especially in sexually reproducing populations) involves a well-mixed hub subpopulation seeding one or more "leaf" subpopulations. In this case, the upper bound on the molecular clock rate is (N/N H )u, where N H is the size of the hub subpopulation.
Conditions leading to altered molecular clock rate may occur frequently in natural populations. Asymmetric dispersal may result, for example, from prevailing winds [69,70], water currents [71,72], or differences in elevation [73,74]. Differences in habitat quality may lead to variance in birth and death rates across sites (nonuniform b i and d i ). It is known that such asymmetries in spatial structure have important consequences for adaptation [23,[62][63][64] and for genetic diversity [66][67][68]; our work shows that they also have consequences for the rate of neutral genetic change. In particular, the molecular clock is accelerated if there is greater turnover in "upstream" subpopulations.
One important assumption made in our work is that mutation occurs with a constant probability per reproduction. Alternatively, one might suppose that heritable mutations accrue at a constant rate per individual (e.g. due to germline cell divisions). With this alternate assumption, mutations would arise at uniform rates over sites, resulting in a molecular clock rate of K = u for all spatial structures. The applicability of our results thus depends on the mutation process of the population in question, which may in many cases lie between these extremes. In humans, for example, recent evidence suggests that maternal mutations occur with constant probability per reproduction, whereas paternal mutations (which are more frequent) increase in probability with the father's age [75][76][77]. In general, we expect deviations from K = u to scale with the extent to which mutations in a lineage depend on the number of generations rather than chronological time.
Many somatic cell populations have strongly asymmetric patterns of replacement, with a small number of stem cell pools feeding a much larger number of progenitor and differentiated cells. The rate at which mutations accumulate in these populations has significant implications for the onset of cancer [78][79][80] and the likelihood of successful cancer therapy [81][82][83][84]. It is well-known that this rate is proportional to the rate of stem cell division, since only mutations that arise in stem cells can persist [45,[85][86][87][88]. Our work shows this how principle arises from a general analysis of neutral evolution in spatially structured populations. It is important to note, however, that cancer can alter the structure of cell hierarchies; for example, by altering the number and replacement rate of stem cells [41] and/or allowing differentiated cells to revert to stem cells [89]. This restructuring may, in turn, alter the rate of genetic substitution, with further ramifications for cancer progression.
The influence of social network topology on the spread of ideas and behaviors is a question of both theoretical and practical interest [90][91][92][93][94][95][96][97]. The neutral substitution rate K on social networks describes how innovations spread when they are equally likely to be imitated as an existing convention. Our finding that most Twitter ego networks have rates less than those of wellmixed populations contrasts with results from epidemiological models, which generally find that the heterogeneity of real-world social networks accelerates contagion [90,98,99]. We note, however, that this finding is sensitive to the assumption that individuals generate new ideas in proportion to the rate of incoming ideas (which itself is proportional to the number of individuals followed). Since the success of an idea varies according to the node at which it arises, the overall rate of substitution depends on the distribution of new ideas among nodes.
For example, if one were to instead assume that each individual generates new ideas at an equal rate, one would find that the network topology has no effect on the rate of substitution.
If spatial structure remains constant over time, then the neutral substitution rate K is in all cases a constant multiple of the mutation rate u. In this case, absent other complicating factors such as selection, neutral mutations will accrue at a constant rate that can be inferred from genetic data. However, if the spatial structure changes over time-due, for instance, to changes in climate, tumorogenesis, or social network dynamics [100][101][102][103]-the rate of neutral substitutions may change over time as well.
In our framework, the molecular clock rate is assumed to depend only on the rate at which mutations arise and their probability of becoming fixed. This approach assumes that the time to fixation is typically shorter than the expected waiting time 1/(Nuρ) for the next successful mutation. If this is not the case, then substitution rates are also affected by fixation times. These fixation times are themselves affected by spatial structure [22,30,58,104], leading to further ramifications for the molecular clock rate [105].
The starting point of our analysis is that the convention, commonly assumed in evolutionary models, that mutations arise with equal frequency at each site, is not necessarily the most natural choice. If there is a constant probability of mutation per birth, then mutations instead arise in proportion to the rate of turnover at a site. Here we have applied this principle to study the rate of neutral substitution. However, this principle also holds for advantageous and deleterious mutations, as well as those whose effect varies with location. It also applies to frequencydependent selection [65,106]. Re-analyzing existing models using this new mutation convention may reveal further surprises about how spatial structure affects evolution.

Class of Models
In the class of models we consider, there are N sites indexed i = 1, . . ., N, each always occupied by a single individual. At each time-step, a replacement event occurs, in which the occupants of some positions are replaced by the offspring of others. A replacement event is identified by a pair (R, α), where R & {1, . . ., N} is the set of sites whose occupants are replaced by new offspring, and α:R ! {1, . . ., N} is a set mapping indicating the parent of each new offspring. (This notation was introduced in Ref. [32].) A sample replacement event is illustrated in Fig. 10.
A model of neutral evolution is specified by a probability distribution over the set of possible replacement events. We call this probability distribution the replacement rule of the model. The probability of a replacement event (R, α) in this distribution will be denoted p(R, α). Neutrality is represented by independence of the probabilities p(R, α) from the state of the evolutionary process.
The only assumption we place on the replacement rule is that it should be possible for at least one site i to contain the eventual ancestor of the population: Assumption 1. There is an i 2 {1, . . ., N}, a positive integer n and a finite sequence fðR k ; a k Þg n k¼1 of replacement events such that • p(R k , α k ) > 0 for all k, and • For all individuals j 2 {1, . . ., N}, where k 1 , . . . < k m is the maximal subsequence of 1, . . ., n such that the compositions in Eq. (7) are well-defined.
We observe that Eq. (7) traces the ancestors of j backwards in time to i. This assumption is equivalent to saying that there is at least one site i such that mutations arising at site i have nonzero fixation probability. It is also equivalent to the statement that the weighted digraph with edge weights e ij is out-connected from at least one vertex. Assumption 1 precludes degenerate cases such as two completely separate subpopulations with no gene flow between them, in which case fixation would be impossible.
For a specific replacement event (R, α), the sites that are replaced by the offspring of i is the given by the preimage α −1 (i) & R, i.e., the set of indices that map to i under α. The number of offspring of site i is equal to jα −1 (i)j, the cardinality (size) of this preimage. Taking all possible replacement events into account, the birth rate (expected offspring number) of site i is given by pðR; aÞ ja À1 ðiÞj: The death rate (probability of replacement) of site i is equal to pðR; aÞ: The probability that the offspring of i displaces the occupant of j in a replacement event is e ij ¼ Pr ½ j 2 R and aðjÞ ¼ i ¼ P ðR; aÞ j 2 R aðjÞ ¼ i pðR; aÞ: We observe that b i ¼ P N j¼1 e ij and d i ¼

The evolutionary Markov chain
To study the fixation of new mutations, we consider evolution with two genetic types: mutant (M) and resident (R). The type occupying site i in a given state of the evolutionary process is denoted s i 2 {M, R}. The overall state of the process can be recorded as a string s = (s 1 , . . ., s N ) of length N with alphabet {M, R}. We assume that there is no further mutation after an initial mutant appears; thus offspring faithfully inherit the type of the parent. It follows that if the current state is s = (s 1 , . . ., s N ) and the replacement event (R, α) occurs, then the new state s 0 ¼ ðs 0 1 ; . . . ; s 0 N Þ is given by The above assumptions describe a Markov chain on the set of strings of length N with alphabet {M, R}. We call this the evolutionary Markov chain. It is straightforward to show that, from any initial state, this Markov chain will eventually converge upon one of two absorbing states: (M, . . ., M) or (R, . . ., R) [32]. In the former case, we say that the mutant type has gone to fixation; in the latter case we say that the mutant type has disappeared.

The ancestral Markov chain
It is useful to consider a variation on the evolutionary Markov chain called the ancestral Markov chain, denoted A. Instead of two types, the ancestral Markov chain has N types, labeled 1, . . ., N, which correspond to the N members of a "founding generation" of the population. Evolution proceeds according to the given replacement rule, as described above. The states of the ancestral Markov chain are strings of length N with alphabet {1, . . ., N}.
The ancestral Markov chain has a canonical initial state a 0 = (1, . . ., N), in which the type of each individual corresponds to its location. This initial state identifies the locations of each founding (t = 0) member of the population. The ancestral Markov chain with initial state a 0in our notation, (A, a 0 )-has the useful feature that at any time t ! 0, the state a(t) = (a 1 (t), . . Basic results on fixation probabilities We define the fixation probability from site i, ρ i , to be the probability that, from an initial state a mutant in site i and residents in all other sites, the mutant type goes to fixation: We can use the relationship between the ancestral Markov chain A and the evolutionary Markov chain M to obtain an alternate expression for the site-specific fixation probability ρ i : In words, the site-specific fixation probability ρ i equals the probability that founding individual i becomes the eventual ancestor of the whole population.
Proof. For any i = 1, . . ., N, define the set mapping γ i :{1, . . ., N} ! {M, R} by More generally, we can consider the probability of fixation from an arbitrary set of sites. For any set S & {1, . . ., N}, we let ρ S denote the probability that the mutant type becomes fixed, given the initial state m S with mutants occupying the sites specified by S and residents occupying all other sites: In particular, This result has previously been obtained for a number of specific evolutionary processes on graphs [39,107,108]. Intuitively, the probability of fixation from the initial state described by S equals the probability that one of the individuals in a site i 2 S becomes the eventual ancestor of the population. Since this cannot be true of more than one site in S, the overall probability ρ S is obtained by summing over all i 2 S the individual probabilities ρ i that site i contains the eventual ancestor. We now derive Eq. (1), which allows the fixation probabilities ρ i to be calculated from the replacement rates e ij .
Theorem 3. For each i = 1, . . ., N, Proof. Considering the change that can occur over a single time-step, we have the following recurrence relation: pðR; aÞ r a À1 ðiÞ : The first term above represents the case that the occupant of i survives the current time-step and becomes the eventual ancestor of the population, while the second term represents the case that one of i's offspring from the current time-step is the eventual ancestor. Subtracting (1 −d i )ρ i from both sides and applying Theorem 2 with S = ρ α −1 (i) yields Now interchanging the summations on the right-hand side yields Proof. Assume first that the fixation probabilities from each site are all equal to 1/N. Substituting ρ i = 1/N for all i into Eq. (1) yields d i = b i . This proves the "only if" direction.
Next assume that the birth rate is equal to the death rate at each site (b i = d i for all i = 1, . . ., N). Eq. (1) can then be rewritten as Clearly, ρ i = 1/N for all i satisfies the above equation for all i, and also satisfies P N i¼1 r i ¼ 1. Assumption 1 guarantees that the solution to these equations is unique. Therefore b i = d i for all i implies ρ i = 1/N for all i, proving the "if" direction. Proof. We separate our proof into six steps. First, we show that there is no loss of generality in assuming that B = 1. Second, we use the method of Lagrange multipliers to find the critical points of the function r ¼ P N i¼1 d i r i with respect to the variables fe ij g N i;j¼1 and fr i g N i¼1 and the constraints

Proof of Result 3
e ij r j i 2 f1; . . . ; Ng; We obtain that the critical points are precisely those for which d i = 1/N for all i. In the next three steps, we use partial derivatives of Eqs. (1), (2) and (3) to form a second-order Taylor expansion of ρ in the variables {e ij } i 6 ¼ j around the critical points. Finally, we show that the critical points are maxima, completing the proof.
Step 1: Normalize the expected number of offspring per time-step We first show that there is no loss of generality in assuming that the total expected number of offspring per time-step, B ∑ i, j e ij , is one. We will demonstrate this by showing that, if a maximum of ρ is achieved by a combination of replacement rates fe Ã ij g i;j , this maximum is also achieved using the normalized rates fe ij 0 g i;j with e 0 ij ¼ e Ã ij =B Ã , B Ã ∑ i, j e Ã ij . First, we see that the equations e 0 ij r j are equivalent, differing only by a factor of B. (Above, we have introduced the notation ) Thus the node-specific fixation probabilities fr i g N i¼1 resulting from fe ij 0 g i;j are the same as those resulting from fe Ã ij g i;j . Turning now to the overall fixation probabilities ρ Ã and ρ 0 corresponding to fe Ã ij g i;j and fe ij 0 g i;j respectively, we see that Thus the same maximum is achieved by the normalized replacement rates fe ij 0 g i;j , which satisfy P i;j e 0 ij ¼ 1. We conclude that there is no loss of generality in assuming B = 1. Together with uniform birth rates, this implies the constraint Step 2: Determine critical points We use the method of Lagrange multipliers to find critical points of the function r ¼ P N i¼1 d i r i with respect to the variables fe ij g Critical points with respect to the above constraints are solutions to where the λ i 's, μ i 's, and σ are the Lagrange multipliers. The individual components of the gradient are obtained by taking the partial derivative with respect to e kl , yielding for k, l 2 {1, . . ., N}, and the partial derivative with respect to ρ k , yielding m j e jk þ s; ð15Þ for k = 1, . . ., N. Solving Eq. (14) for ρ l gives Note that in the case l = k, Eq. (16) becomes ρ k = λ k . Therefore, we replace λ k with ρ k in Eq. (16): Interchanging k and l, we obtain Combining Eqs. (17) and (18) yields μ l = μ k and ρ l = ρ k for all k, l 2 {1, . . ., N}. That is, all node-specific probabilities ρ i are equal. It follows from Eq. (2) that ρ i = 1/N for all i = 1, . . ., N. Furthermore, given that μ l = μ k , Eq. (15) yields d k = σ for all k = 1, . . ., N. That is, all death rates d i are equal. Therefore d i = 1/N for i = 1, . . ., N because we have assumed (without loss of generality) that P N i¼1 d i ¼ 1. In summary, we have shown that the overall fixation probability ρ has a critical point whenever all node-specific fixation probabilities and death rates are constant (d i = ρ i = 1/N for all i = 1, . . ., N). Consequently, ρ = 1/N at all critical points. It still remains to prove that this critical value of ρ is a maximum.
Step 3: Taylor series expansion To prove that the overall fixation probability has an absolute maximum of 1/N, we pick a critical point fe Ã ij g i; j i 6 ¼ j . Then, viewing ρ as a function of the independent variables fe ij g i; j i 6 ¼ j , we form a second-order Taylor expansion of ρ around the critical point. (Since we are operating under the constraint P N j¼1 e ij ¼ 1=N for all i, this set of variables suffices to determine the value of ρ.) We will show that the second-order term of this Taylor expansion is always less than or equal to zero.
The second order multivariate Taylor series expansion for ρ about the critical point fe ij g i; j i 6 ¼ j can be written as where all derivatives are taken at fe Ã ij g i; j i 6 ¼ j and De kl ¼ e kl À e Ã kl . More simply, we write this expansion as with ρ (1) and ρ (2) representing the firstand second-order terms, respectively, in Eq. (19). The first-order term ρ (1) is zero since fe Ã ij g i; j i 6 ¼ j is a critical point. Our goal will be to show that the second-order term, ρ (2) is always negative or zero. We can find an alternative expression for ρ (2) using the definition of ρ in Eq. (3) as follows. We introduce the notation We substitute the first-order Taylor series expansion for ρ i , into Eq. (20), noting that Δd i = O(jΔẽj) and P N i¼1 Dd i ¼ 0. This yields Thus, the second-order term of the overall fixation probability can be written as Step 4: Obtain recursion for first-order term of the site-specific fixation probability We now investigate the properties of r ð1Þ i . We begin by rewriting Eq. (1) as Replacing e ii with 1 N À P j6 ¼i e ij and simplifying, we obtain Next we take the partial derivative of both sides with respect to e kl , where k 6 ¼ l, @r j @e kl À @r i @e kl ; and evaluate at the critical point (r Ã i ¼ 1 N and Δd i = 0): ij @r j @e kl À @r i @e kl : Multiplying both sides by Δe kl and then summing over all k, l 2 {1, . . ., N} for k 6 ¼ l yields We observe that r ð1Þ The restriction j 6 ¼ i in the sums on the right-hand side can be removed, since the entire righthand side is zero when j = i. Rearranging further, we have Since the birth rate is uniform over sites This equation provides a recurrence relation for the first-order term of the Taylor expansion of ρ i about the critical point.
Step 5: Show second-order term of the overall fixation probability is nonpositive We now combine Eq. (23) with Eq. (21) to show that ρ (2) 0, with equality if and only if Δd i = 0 for all i, according to the following argument. We first multiply both sides of Eq. (23) by r ð1Þ i and sum over i = 1, . . . , N: By Eq. (21) we can substitute ρ (2) for P N i¼1 Dd i r ð1Þ i . Then, solving for ρ (2) , we obtain We now make use of the fact that the product r ð1Þ i r ð1Þ j can be written as a difference of squares: Substituting this identity into Eq. (24) yields Uniform birth rates implies that P N j¼1 e Ã ij ¼ 1=N for each i. Furthermore, since fe Ã ij g i;j is a critical point, death rates are uniform as well; thus P N i¼1 e Ã ij ¼ 1=N for each j. Making these substitutions yields This shows that ρ (2) 0. We now consider the case that ρ (2) = 0. By Eq. (25), this occurs if and only if r ð1Þ i ¼ r ð1Þ j for any pair i, j with e Ã ij 6 ¼ 0. It follows that each term e Ã ij r ð1Þ j in the sum on the right-hand side of Eq. (23) can be replaced by e Ã ij r ð1Þ i . Making these substitutions and using P N j¼1 e Ã ij ¼ 1=N, we find that Δd i = 0. Thus ρ (2) = 0 implies that Δd i = 0 for each i.
Step 6: Show that the critical points are maxima We now adopt a vector notation, in which e ¼ fe ij g i; j i 6 ¼ j is an arbitrary combination of replacement probabilities satisfying the conditions of the theorem, and e Ã ¼ fe Ã ij g i; j i 6 ¼ j is an arbitrary critical point of ρ. By Taylor's theorem, we can rewrite the second-order expansion (19) of ρ around e Ã as where H e Ã is the Hessian matrix of ρ at e Ã and {R e;e Ã} is a family of symmetric matrices with the property that R e;e Ã converges to the zero matrix as e ! e Ã .
Since H e Ã and R e;e Ã are symmetric, all eigenvalues of these matrices are real, and the eigenspaces corresponding to these eigenvalues are orthogonal. We showed in Step 5 that ρ (2) 0, which implies that H e Ã is negative semidefinite. We also showed that ρ (2) = 0 if and only if Δd i = 0 for all i; thus the null space of H e Ã is the vector space D consisting of all displacement vectors for which Δd i = 0 for all i. It follows that H e Ã is negative definite as an operator on the orthogonal complement D ? of D.
We now fix a particular critical point e Ã 0 , and consider a neighborhood Nðe Ã 0 ; eÞ of the form For each point e Ã 0 þ v þ w 2 Nðe Ã 0 ; eÞ, we observe that e Ã 0 þ v is also a critical point of ρ, since the addition of v 2 D leaves the value of d i unchanged for each i. Now, applying the Taylor expansion (26) Let λ H (v) denote the largest nonzero eigenvalue of H e Ã 0 þv , and let λ R (v, w) denote the largest eigenvalue of R e Ã 0 þvþw;e Ã is negative definite as an operator on D ? and since w 2 D ? , it follows that We also have that, for each Furthermore, since R e Ã 0 þvþw;e Ã 0 þv converges to the zero matrix as w ! 0, we have lim w ! 0 λ R (v, w) = 0 for each fixed v. It follows that, for sufficiently small ε > 0, all points e Ã 0 þ v þ w 2 Nðe Ã 0 ; eÞ satisfy Combining Eqs. (27)-(30), we have that for all e Ã 0 þ v þ w 2 Nðe Ã 0 ; eÞ with ε sufficiently small, Equality is achieved in Eq. (31) if and only if w = 0, in which case e Ã 0 þ v þ w is another critical point. Thus ρ has a local maximum of 1/N at every critical point. It follows that these points are global maxima as well. We showed in Step 2 that the critical points of ρ are precisely those for which d i = 1/N for all i. This completes the proof.

Result 4
We now turn to the full range of possible values for K with no constraints on birth and death rates. Consider the example spatial structure illustrated in Fig. 5, which consists of a hub with outgoing edges to n leaves, so that the population size is N = n+1. There is an edge of weight a, 0 a < 1/(N−1), from the hub to each leaf. The hub also has a self-loop of weight 1−(N−1)a, so that B = 1 births are expected per time-step. The death rates are d H = 1−(N−1)a for the hub and d L = a for each leaf. Solving Eqs. (1)-(2) we obtain the site-specific fixation probabilities ρ L = 0 for each leaf and ρ H = 1 for the hub. By Eq. (3), the overall fixation probability is equal to the rate of turnover at the hub: Any value 0 ρ < 1 can be obtained by an appropriate choice of a, with 0 < a 1/(N−1), specifically, a = (1−ρ)/(N−1). This proves: Theorem 6 (Result 4). For arbitrary spatial population structure (no constraints on e ij ) the fixation probability can take any value 0 ρ < 1, and consequently, the molecular clock can take any rate 0 K < Nu.
We observe that if a = 1/N then the death rate is 1/N at each site, and therefore ρ = 1/N by Result 1. If a = 1/(N−1) then there is no turnover at the hub and thus ρ = 0. At the other extreme, as a approach zero, ρ comes arbitrarily close to unity.
Exact expressions for ρ with N 3 To complement the above arguments, which apply to general population size N, we here derive exact expressions for the overall fixation probability ρ in the cases N = 2 and N = 3. Our results for node-specific fixation probabilities coincide with those found previous for a different model of evolution in a subdivided population [40].
When the condition of uniform birth rates is relaxed, we find, by substituting Eqs. (32) and (33) into Eq. (3), the following expression for the overall fixation probability: Bðe 12 þ e 21 Þ : We discover that r ¼ 1 2 , and consequently K = u, if (i) death rates are equal, d 1 ¼ d 2 ¼ B 2 (Result 1), or (ii) node-specific fixation probabilities are equal, r 1 ¼ r 2 ¼ 1 2 (Result 2). When death rates are unequal and node-specific fixation probabilities are unequal, there are cases in which r < 1 2 (K < u) and cases in which r > 1 2 (K > u). In particular, suppose that d 1 > B 2 . Then ρ is less than 1 2 if e 12 < e 21 and greater than 1 2 if e 12 > e 21 .

Population size N = 3
We solve Eqs. (1) and (2) to obtain expressions for ρ 1 , ρ 2 and ρ 3 : We factor to obtain an expression for num(Δρ) in terms of From Eq. (37) we see that r ¼ 1 3 , and consequently K = u, in the case of equal death rates 3 ) and in the case of equivalent birth and death rate at each site (b i = d i for i 2 {1, 2, 3}). This conforms to Result 1 and 2. When death rates are nonuniform and node-specific probabilities are also nonuniform, we find cases in which K > u and K < u. For example, if Dd 1 ¼ 0; Dd 2 > 0; d 2 > b 2 ; and d 3 < b 3 then Eq. (37) yields num(Δρ) < 0 and consequently, r < 1 3 and K < u. On the other hand, if Dd 1 ¼ 0; b 2 > d 2 > B 3 ; and d 3 > b 3 then Eq. (37) yields num(Δρ) > 0 and therefore, r > 1 3 and K > u.

Upstream-downstream populations
We now turn to the upstream-downstream model introduced in the Results and in Fig. 6.
Theorem 7. In the upstream-downstream model, ρ > 1/N, and consequently K > u, if and only if d " > d # .
Proof. From Eq. (3) we obtain the following expression for the overall fixation probability Since fixation probabilities sum to one, and since the total population size is N = N " +N # , we have Substituting in Eq. (38) yields It follows from Eq. (5) and from e ! > e that ρ " > ρ # . Moreover, Eq. (39) implies that ρ " −1/N and ρ # −1/N have opposite signs, and since ρ " > ρ # , it follows that ρ " −1/N must be positive. Thus the second term on the right-hand side of Eq. (40) has the sign of d " −d # . We therefore conclude that ρ > 1/N, and consequently the molecular clock is accelerated relative to the well-mixed case (K > u), if and only if d " > d # .