Symmetry in models of natural selection

Symmetry arguments are frequently used—often implicitly—in mathematical modelling of natural selection. Symmetry simplifies the analysis of models and reduces the number of distinct population states to be considered. Here, I introduce a formal definition of symmetry in mathematical models of natural selection. This definition applies to a broad class of models that satisfy a minimal set of assumptions, using a framework developed in previous works. In this framework, population structure is represented by a set of sites at which alleles can live, and transitions occur via replacement of some alleles by copies of others. A symmetry is defined as a permutation of sites that preserves probabilities of replacement and mutation. The symmetries of a given selection process form a group, which acts on population states in a way that preserves the Markov chain representing selection. Applying classical results on group actions, I formally characterize the use of symmetry to reduce the states of this Markov chain, and obtain bounds on the number of states in the reduced chain.


Introduction
Natural selection is a complex process.Even in the relatively simple case of two alleles competing in a fixed environment, heterogeneity among individuals can lead to a large number of possible population states, presenting a challenge for mathematical modeling.To make analytical progress, mathematical models typically assume some form of symmetry (Fig. 1), which reduces the number of distinct states that need be considered.
The simplest models of natural selection [1][2][3][4] assume that individuals with the same genotype are interchangeable.Under this strong form of symmetry, one needs only keep track of the number of individuals with each genotype.
These models can refined to take into account individual differences based on sex [2,5], age [2,6], and other factors, collectively referred to as "class" [7][8][9].In class-structured models of selection, one typically assumes that individuals of the same class and genotype are interchangeable.Consequently, one needs only track the abundance of each genotype within each class.[1] and Wright-Fisher [2,3] models, every permutation of the set G of sites is a symmetry.The symmetry group Sym(G, p) is therefore isomorphic to the symmetric group S n , where n = |G| is the number of sites.b In a diploid well-mixed population of size N , symmetries may permute individuals, and may also swap the two sites within individuals; the symmetry group is therefore (isomorphic to) S N × (S 2 ) N .c In a haploid class-structured population, where the classes have respective sizes n 1 , . . ., n k , symmetries may permute the members of each class.The symmetry group is therefore S n1 × . . .× S n k .d For a population structured as a cycle graph, symmetries may rotate and/or reflect the cycle; the symmetry group is the dihedral group D n .e For a directed cycle, only rotations are allowed; the symmetry group is the cyclic group Z n .e For a star graph, any permutation of the n − 1 leaves is a symmetry, yielding the symmetry group S n−1 .
Other models incorporate spatial structure.The earliest models of selection in spatially structured populations, such as Wright's island model [10] and lattice models [11][12][13], also possessed a high degree of symmetry.More recently, attention has turned to asymmetric spatial structures.These are usually represented as graphs, in which each vertex corresponds to an individual and edges represent spatial or social relationships [14][15][16][17][18][19][20][21].
A symmetry can be mathematically represented as a permutation-a bijective mapping from a set to itself-that preserves some relevant structure.For example, a symmetry of a graph (also called a graph automorphism) is a permutation σ of the vertex set that preserves edges-meaning that vertices i and j are joined by an edge if and only if σ(i) and σ(j) are as well.The symmetries of a given object form a group under composition of permutations (see, e.g., [22]).
Applying this mathematical notion of symmetry, Taylor [23] studied selection on graphs possessing a form of symmetry called bi-transitivity.This was followed by other investigations of selection on graphs with various symmetry properties [17,[24][25][26][27].An overarching theory of symmetry for selection on graphs was developed by McAvoy and Hauert [28], who showed how graph symmetries preserve properties of the Markov chain representing selection.
In an effort to unify the wide variety of models of natural selection in structured populations, my collaborators and I have proposed a general mathematical modeling framework for natural selection [29,30].This framework defines a class of models of natural selection, which includes the classical Moran [1] and Wright-Fisher [2,3] models, game-theoretic processes in finite populations [31][32][33], selection processes on graphs [14][15][16][17][18][19][20][21], and models of haplodiploid social insect colonies [30].For models in this framework, selection occurs between two alleles at a single genetic locus, in a population of fixed size.The population structure is represented by a fixed set of sites at which alleles can live.A given population state is identified by specifying which allele occupies each state.Natural selection is represented as a Markov chain, in which each transition involves the replacement of some alleles by others, possibly with mutation.Using this framework, general results-applicable to any model in the class-have been derived regarding fixation probabilities [34][35][36], allele frequencies [37], social behavior [38], and genetic assortment [39].
Here, using this framework, I develop a mathematical theory of symmetry in models of natural selection.A symmetry will be defined as a permutation of sites that preserves probabilities of replacement and mutation.The symmetries of a given natural selection process form a group under composition.This group acts on the set of sites, and by extension, on the set of possible population states.
Applying this theory, I formalize the use of symmetry to reduce the size of the Markov chain representing selection.Two states are equivalent if there is a symmetry transforming one into the other.By grouping equivalent states together, one obtains a reduced Markov chain, the states of which are orbits of the group action.Using classical results on orbit counting, I derive upper and lower bounds for the number of states in the reduced chain.
Section 2 summarizes the relevant modeling framework.Section 3 introduces the formal definition of symmetry.This definition is used in Section 4 to 1 2 3 4 5 6 Current state x = (0,0,1,0,1,1) Next state x' = (0,1,0,0,0,1) Figure 2: Modeling framework.The modeling framework used here [29,30] defines a class of models with two competing alleles (numbered 0 and 1) at a single genetic locus.There a fixed set G of sites at which alleles can live; here, G = {1, 2, 3, 4, 5, 6}.Haploids (pictured here) have one site per individual; diploids two.The population state is a binary vector x = (x g ) g∈G specifying which allele occupies each site.Each time-step, a parentage map α and mutation set U are sampled from a joint probability distribution that depends on the current state x.To form the next state, x , each non-mutated site g / ∈ U inherits the allele of the parent, x g = x α(g) , while each mutated site g ∈ U inherits the opposite allele, x g = 1 − x α(g) .This leads to a Markov chain M(G, p) representing natural selection.formalize the notion of classes of sites, and in Section 5 to show how symmetry reduces the Markov chain representation of selection.

Modeling framework
I begin by reviewing the mathematical modeling framework-developed in Refs.[29,30] and depicted in Figure 2-to which the proposed definition of symmetry will apply.This framework provides a common notation and foundational assumptions for a class of models, each representing selection between two alleles at a single genetic locus.

Sites, alleles, and states
There are two allele types, numbered 0 and 1. Taking a gene's-eye view, the population structure is represented by a finite set G of n genetic sites, at which alleles live.Each site houses one allele, and each individual contains a number of alleles equal to its ploidy (one for haploids, two for diploids).
The allele occupying site g ∈ G is represented by the variable x g ∈ {0, 1}.These variables are combined into a g-indexed binary vector x = (x g ) g∈G ∈ {0, 1} G representing the population state.

Replacement and mutation
State transitions occur via replacement (of some alleles by copies of others) and mutation.Replacement is represented by a set mapping α : G → G, called a parentage map.For each site g ∈ G, α(g) indicates the site from which the new allele in g is either survived or inherited.In other words, α(g) = h can mean either that the allele in g died and was replaced by a copy of that in h, or else that the allele in h survived and moved to g (or stayed in g, if g = h).These two possibilities are formally equivalent, in that both result in site h transmitting its allele to g.
Mutation occurs after replacement, and is represented by a subset U ⊆ G of sites that acquire mutations.Mutation interchanges alleles 0 and 1.
In each state x, the joint probability that parentage map α and mutation set U occur is denoted p x (α, U ).This probability p may be understood as a real-valued function of three arguments, p • (•, •), such that, for each x ∈ {0, 1} G , p x (•, •) is a joint probability distribution over all possible set mappings α : G → G and subsets U ⊆ G.

Fixation axiom
For the population to be unitary, at least one site should be able (with positive probability) to eventually spread its descendants to all sites.This principle is formalized as follows [30]: Fixation Axiom.There exists a site g ∈ G, and a finite sequence of parentage maps α 1 , . . ., α m , such that (i) p x (α k ) > 0 for all 1 ≤ k ≤ m and x ∈ {0, 1} G , and A set of sites G and probability function p = p • (•, •), satisfying the Fixation Axiom, define a selection process.A selection process (G, p) captures all structures and processes relevant to selection, including behavioral interaction, spatial structure, migration, and mating pattern; these are all represented implicitly in the probability function p.

Selection Markov chain
Given a selection process (G, p), one can construct a Markov chain M(G, p), which I call the selection Markov chain, on the set of population states {0, 1} G .From a given state x, the subsequent state x is determined by first sampling a parentage map α and mutation set U from the probability distribution p x (•, •), and then setting (2.1) In this way, each non-mutated site g / ∈ U inherits the allele from its parent site α(g), while each mutated site g ∈ U inherits the opposite allele from its parent site.

Symmetries of a selection process
We are now prepared to define the symmetries of a model encompassed by this framework.

Definition
The definition of symmetry employs the following notation: For any state x ∈ {0, 1} G and set mapping τ : G → G, let x τ ∈ {0, 1} G denote the state that has allele x τ (g) in each site g ∈ G; that is, (x τ ) g = x τ (g) .To illustrate the usefulness of this notation, observe that if parentage map α occurs in state x, and there is no mutation (U = ∅), the subsequent state is then x = x α .This notation obeys a composition rule: if σ, τ : G → G are two set mappings, then With this notation, symmetry is defined as follows: Definition.A symmetry of a selection process (G, p) is a permutation σ : G → G such that, for every state x, parentage map α : G → G, and mutation set Figure 3 provides an example to illustrate Eq. (3.1).The idea is that, if σ is a symmetry, then any state x is equivalent to its permuted state x σ , and every transition from state x has a corresponding equivalent in state x σ .If, in a transition from state x, site g has parent α(g) = h, the equivalent transition in state in a transition from state x, site g acquires a mutation (g ∈ U ), the equivalent transition in state x σ has σ(g) ∈ U , or g ∈ σ −1 (U ).So overall, the pair (α, U ) occurring in state x is equivalent to the pair (σ −1 • α • σ, σ −1 (U )) occurring in state x σ , which motivates Eq. (3.1).These arguments are made precise in the proof of Theorem 1 in Appendix A. x σ = (0,0,1,1,1,0,0,0) Figure 3: Symmetry on a cycle.a.A population is structured as a cycle graph of size 8.For ease of reference, the sites here are numbered g = 1, . . ., 8. (In general, sites need not be numbered.)The group of symmetries is the dihedral group D 8 .The symmetry σ : g → g + 2 (mod 8) is illustrated here.b.
An illustration of the symmetry condition, Eq. (3.1).In the state x depicted, site 8 is replaced by mutated offspring of site 7.This is represented by a parentage map with α(8) = 7 (and α(g) = g for all g = 8) and U = {8}.As shown at right, this is equivalent to parentage map σ • α • σ −1 and mutation set σ −1 (U ) occurring in state x σ .

Symmetry group and its actions
The symmetries of a given selection process (G, p) form a group under composition of permutations.This is proven as Lemma 1 in Appendix A. We denote this group Sym(G, p). Figure 1 shows the symmetry groups for various models of structured populations.Since each symmetry is a permutation, Sym(G, p) is a subgroup of the symmetric group Sym(G)-that is, the group of all permutations of G.For a haploid, well-mixed population, all permutations are symmetries, so Sym(G, p) = Sym(G).For models with additional structure (space, sex, class, graph, etc.), Sym(G, p) is the subgroup of Sym(G) that preserves this structure, i.e. the subgroup of permutations satisfying Eq. (3.1).
Two group actions of Sym(G, p) are relevant to the role of symmetry in selection processes.The first is the left group action of Sym(G, p) on the set of sites G, given by g → σ(g).The second is the right group action of Sym(G, p) on the set of states {0, 1} G given by x → x σ .As will be shown later, these group actions help to formalize the symmetry arguments that often arise in analyses of models of natural selection.

Examples
Without going into full modeling details, it is helpful to identify the kinds of symmetry that arise in different varieties of evolutionary models: • Well-mixed haploid populations (Fig. 1a): The simplest models of selection, such as the Moran [1], haploid Wright-Fisher [2,3], and Cannings exchangeable [40] models-as well as frequency-dependent generalizations thereof [31][32][33]-describe a well-mixed, haploid population.In this case, each genetic site corresponds to an individual, and any two genetic sites are interchangeable.For such models, any permutation of the set G of sites is a symmetry.The symmetry group is Sym(G), which is isomorphic to S n , the symmetry group on n = |G| elements.
• Well-mixed diploid populations (Fig. 1b): A well-mixed hermaphroditic diploid population can be represented by specifying a finite set I of individuals and a partition of the set of sites, G = i∈I G i , where each subset G i contains two sites to house the two alleles in individual i.The diploid Wright-Fisher model [2,3], for example, may be represented this way [39].For such models, any permutation σ that preserves the partition into individuals-meaning that {σ(G i )} i∈I = {G i } i∈I -is a symmetry.A symmetry may interchange individuals, and also may interchange the two alleles within an individual.This is because an individual's two alleles are generally considered interchangeable (Aa and aA genotypes are equivalent), unless the alleles are sex-linked.Letting N = |I| = n/2 denote the number of individuals, Sym(G, p) is isomorphic to S N × (S 2 ) N .
• Class-structured populations (Fig. 1c): In a class-structured population [7], individuals are distinguished by sex or by other factors such as role within a colony.Class structure in a haploid population can be represented by a partition G = k j=1 C j , with each subset C j representing the sites in individuals of class j.Models of class-structured populations typically assume that individuals with the same class and genotype are indistinguishable.In this case, a symmetry is a permutation σ that leaves each class fixed, in the sense that σ(C j ) = C j for each j = 1, . . ., k. Sym(G, p) is therefore isomorphic to S n1 ×• • •×S n k , where n j = |C j | is the number of indivdiuals in class j.
• Island-structured populations (Fig. 1d): In island models [3,10,45,46] and metapopulation models [47][48][49], the population is divided into subpopulations ("islands"), which are joined to each other by low levels of migration.Within each island, individuals with the same genotype are considered indistinguishable.In our framework, assuming haploid genetics, this kind of model is again represented using a partition G = m j=1 G j , where here j = 1, . . ., m index the islands.It is often assumed that the islands have equal size and are interchangeable with each other.In this case, a symmetry is any permutation that preserves the partition into islands; that is {σ(G j )} m j=1 = {G j } m j=1 .Note that symmetries may interchange islands with each other, and may also interchange the sites within any island.Sym(G, p) is therefore isomorphic to S m × (S k ) m , where k is the size of each island, i.e. k = |G j | for each j = 1, . . ., m.

Symmetry and transition probabilities
If the proposed definition of symmetry is to be useful, symmetries should preserve the transition probabilities of the selection Markov chain M(G, p).Denoting the transition probability from state x to state y by P x→y , this principle is formalized in the following theorem (proven in Appendix A): Theorem 1.For any symmetry σ ∈ Sym(G, p) of a selection process (G, p), and any states x, y ∈ {0, 1} G , P x→y = P xσ→yσ .
Theorem 1 guarantees that the action of Sym(G, p) on {0, 1} G preserves the transition probabilities of M(G, p) and, by exetension, all quantities derived from them.For example, m-step transition probabilities are also preserved by symmetries, in the sense that P (m) xσ→yσ .Likewise, if M(G, p) admits a unique stationary distribution π (which occurs whenever there is complete gene flow and recurring mutation; see Theorem 1 of Ref. [29]), then π(x) = π(x σ ) for any state x and symmetry σ.This connects our definition of symmetry to the notion of "evolutionary equivalence" introduced by McAvoy and Hauert [28].

Equivalent sites
Theoretical analyses of natural selection often invoke the idea that certain individuals or locations are equivalent within a given model.We can formalize this idea by defining an equivalence relation ∼ on on the sites G of a given selection process (G, p).Two sites g, h ∈ G are equivalent, g ∼ h, if and only if there is a symmetry σ ∈ Sym(G, p) such that σ(g) = h.This equivalence relation induces a partition of G into equivalence classes.We observe that these equivalence classes are the orbits of the group action of Sym(G, p) on G.These equivalence classes formalize the notion of "class" used in class-structured models of selection (Fig. 1c).
If sites g and h are equivalent, then any property of site g must hold for site h, and vice versa.As an example, consider fixation probability-the probability that a new mutant allele eventually spreads throughout a population [14,35,51].For heterogeneously structured populations, the fixation probability depends on where the mutant allele originates [28,34,52,53].However, if g and h are equivalent sites, mutations arising at these two sites should have the same fixation probability.To formalize this principle, we observe that if there is no mutation (p x (α, U ) = 0 unless U = ∅), the single-allele states 0 = (0, . . ., 0) and 1 = (1, . . ., 1) are absorbing in M(G, p), and all other states are transient (see Theorem 2 of Ref. [29]).Fix two equivalent sites g, h ∈ G and a symmetry σ ∈ Sym(G, p) with σ(g) = h.Let x be the state with allele 1 in site h and 0's elsewhere; i.e., x h = 1 and x = 0 for all = h.Let ρ x denote the probability of eventual absorption in state 1 from initial state x.Then, applying Theorem 1, we have This demonstrates that the fixation probability of allele 1 is the same whether starting from site h (as it does in state x) or site g (as it does in state x σ ).In short, symmetry preserves transition probabilities.This generalizes a result proved in Ref. [28] for models of selection on graphs.
An important special case arises when the action of Sym(G, p) on G is transitive, meaning that for every g, h ∈ G there is some σ ∈ Sym(G, p) such that σ(g) = h.For selection processes with transitive symmetry, all sites are equivalent.This formalizes the idea of a homogeneous population, in which any individual may be understood as representative of all.In Figure 1, the models depicted in panels a, b, d, e, and f possess transitive symmetry, while those in panels c and g do not.Transitive symmetry has been formally applied in models of selection on graphs [17,23,54,55], and is implicitly invoked in evolutionary arguments that reason from the point of view of an arbitrary "focal individual" [56,57].

Reduction of the selection Markov chain
In models of natural selection, symmetry is used primarily as a way to reduce the number of states to be considered.This is a useful principle, since a model with |G| = n sites has 2 n possible states, which quickly becomes unwieldy.

Construction of the reduced chain
To formalize the reduction of the selection Markov chain by symmetry, observe that in light of Theorem 1, two states x and y can be considered equivalent if there is a symmetry σ such that y = x σ .This provides an equivalence relation on the set of states {0, 1} G .The equivalence classes of are the orbits of the group action of Sym(G, p) on {0, 1} G .We denote the equivalence class (orbit) of a state x ∈ {0, 1} G by [x]: (5.1) In particular, since the action of Sym(G, p) permutes alleles among sites, equivalent states must have the same number of each allele.Denoting the number of 1 alleles in state x by Σx, we have x y ⇒ Σx = Σy.These orbits form the states of a reduced Markov chain, R(G, p).The transition probability from [x] to [y] in R(G, p) is In light of Theorem 1, it does not matter which representative x of the orbit [x] is used in the right-hand side of Eq. (5.2).This reduced chain R(G, p) simplifies the original chain M(G, p) by grouping together all states that are equivalent by symmetry.

Examples
This idea of a reduced chain is implicitly used in many established models: • For a well-mixed haploid population (Fig. 1a), two states x and y are equivalent if and only if they have the same number of 1 alleles: x y ⇔ Σx = Σy.The reduced chain R(G, p) therefore has n + 1 states, which can be indexed k = 0, . . ., n according to the number of 1 alleles.This corresponds to the standard representation of the Moran and haploid Wright-Fisher models as Markov chains on {0, . . ., n} (e.g., [58], Chapter 3).
• For a diploid hermaphroditic well-mixed population (Fig. 1b), we can regard the alleles {x i1 , x i2 } in each individual i ∈ I as an unordered pair, representing its genotype.Two states are equivalent if and only if they have the same number of {0, 0}, {0, 1}, and {1, 1} genotypes.The states of the reduced chain R(G, p) can therefore be represented as triples (k 00 , k 01 , k 11 ) of nonnegative integers, subject to k 00 +k 01 +k 11 = n.There are n(n + 1)/2 reduced states in total.Representations of this form have been used at least as far back as 1908 [59].
• For a class-structured population (Fig. 1c) with classes of size n 1 , . . ., n k , if any permutation that preserves classes is a symmetry, then two states are equivalent if they have the same number of 1 alleles in each class.
States of R(G, p) can be represented as k-tuples (m 1 , . . ., m k ), where m j is the number of 1 alleles in class C j .There are k j=1 (n j +1) such reduced states.

Size of the reduced chain
One may now ask, how much of a reduction is achieved?That is, how does the number of states in the reduced chain R(G, p) compare to the 2 n states of the original chain M(G, p)? ).The cycle decomposition is used in Eq. ( 5.3) to count the states in the reduced Markov chain.In general, for a directed cycle of size n (Fig. 1f), rotating by j vertices creates gcd(n, j) cycles, yielding Eq. (5.4).
This is equivalent to a well-known problem in combinatorics of finding the number of distinct colorings of a symmetric object [60,61].For us, the object is the set G of sites and the colors correspond to alleles.Pólya's Enumeration Theorem [60][61][62] provides a formula in terms of cycle decompositions (Fig. 4).Each symmetry σ ∈ Sym(G, p) partitions G into nonoverlapping cycles, where sites g, h ∈ G belong to the same cycle if and only if g = σ k (h) for some power k of σ. (Here, σ k means the permutation σ is iteratively applied k times.)Letting c(σ) denote the number of cycles in the cycle decomposition associated to σ, and S = | Sym(G, p)| the number of symmetries, the number R of reduced states is given by This formula generalizes to m > 2 alleles by replacing 2 with m.
For the directed cycle (Fig. 1f), Sym(G, p) consists of rotations by j vertices, for j = 1, . . ., n.Such a rotation has gcd(n, j) cycles, where gcd denotes greatest common divisor (Fig. 4).Applying Eq. ( 5.3), the number of reduced states is 2 gcd(n,j) . (5.4) This equivalent to a known result for the "necklace problem" in combinatorics [61,63], but expressed in a more elementary way.In general, applying Eq. (5.3) requires computing the cycle decomposition of each symmetry, which is laborious if the number of symmetries is large.However, the following theorem (proven in Appendix A) provides bounds on R using only the numbers of sites and symmetries: Theorem 2. For a selection process (G, p), let n = |G| be the number of states, S = | Sym(G, p)| be the number of symmetries, and R be the number of states in the reduced chain R(G, p).Then R is bounded by The lower bound in Eq. (5.5) is really two lower bounds.First, there must be at least n + 1 reduced states, one for each possible number k = 0, . . ., n of alleles of type 1.Second, the number of states cannot be reduced by a factor greater than the number S of symmetries, and there is no reduction for the two single-allele states 0 and 1, which are invariant under permutation.The upper bound in Eq. (5.5) comes from considering the minimal reduction that can be achieved in a state with k alleles of type 1 and n − k of type 0, for each k = 0, . . ., n.
Let us illustrate these bounds with some examples: • If S = n! (i.e., if every permutation is a symmetry), then gcd(S, k!(n − k)!) = k!(n − k)! and both bounds become n + 1.This agrees with our earlier observation that well-mixed haploid population models have n + 1 reduced states.
• If S = 1, both bounds become 2 n .This simply means that if there are no nontrivial symmetries among the n sites, then there is no reduction and all 2 n states are non-equivalent.
• A directed cycle graph (Fig. 1f) has S = n.If n is prime, then gcd(n, k!(n− k)!) = 1 for each k = 1, . . ., n − 1.So for prime n, both bounds coincide, yielding in agreement with Eq. (5.4).The number of states is reduced by a factor of almost n, corresponding to the n rotations of the cycle.
• For a star graph (Fig. 1g), S = (n − 1)!.The lower bound is n + 1 as long as n ≥ 3.If n is prime, then n k is divisible by n, meaning that k!(n−k)!divides (n−1)! and hence gcd(S, k!(n−k)!) = k!(n−k)!, for each k = 1, . . ., n−1.Using this result, the upper bound evaluates to 2+n(n−1) for prime n.In this case, the bounds of n + 1 ≤ R ≤ 2 + n(n − 1) do not determine the value of R.However, considering the 2 possible states of the hub node, and the n possible numbers of 1 alleles among the leaves, the number of reduced states is R = 2n, which satisfies the lower and upper bounds.

Discussion
Symmetry arguments have been used since the earliest models of natural selection, first implicitly and in recent decades more explicitly [17,23,25,28,54,64].This work aims to formalize these arguments for a broad class of models, and provide a common language of definitions and foundational results.In particular, this work provides mathematical theory for how symmetry can reduce the states of the Markov chain representing selection.
For most well-studied models of selection, such as those depicted in Figure 1, identifying the symmetries is straightforward.However, for an arbitrary selection process, symmetries may be much more difficult to identify.Indeed, this problem includes, as a special case, the Graph Automorphism Problemto determine whether a given graph has a nontrivial automorphism-which is NP-hard [65].
The framework employed in this work describes selection between two alleles, at a single genetic locus, in a population of fixed size and structure.This framework can be extended in a number of ways.Generalizing to more than two alleles is relatively straightforward [39]: one can consider a set A of allele types; the allele x g occupying site g is then an element of A, and the state x = (x g ) g∈G becomes an element of A G .One must also specify a Markov chain on A to characterize mutation of alleles.The definition of symmetry in Eq. (3.1), and the proofs of Lemma 1 and Theorem 1, carry over into the multiallele setting.However, Theorem 2 requires modification depending on number |A| of alleles.
Generalizing to multiple genetic loci is also straightforward-the set G of sites can be expanded to include one site per locus for each haploid individual, or two sites per locus for each diploid.In most applications of multi-locus models, the loci play distinct roles, and therefore a symmetry must preserve the sites that correspond to each locus.
It is less obvious how to extend the definition of symmetry to dynamic population structures [42,[42][43][44]66].In this case, the state of the selection Markov chain consists not only of the population state x, but also the state of the population structure, which can be represented by a variable s belonging to some set of possibilities S. The selection Markov chain must then include a transition rule for structures as well as population states.How can symmetry be defined in this context?One idea-used in a different context by McAvoy [64]is to require that the set S of structures admit an action by the symmetric group Sym(G).This action characterizes how population structures change when sites are permuted.Symmetry can then be represented by the subgroup of Sym(G) whose elements satisfy Eq. (3.1) and whose action on S is compatible with the transition rule for population structures.
Symmetry, in the sense used here, is a property of models of natural selection, rather than the manifestation of natural selection in the real world.Real-world biological processes rarely possess exact symmetry.Still, exploiting symmetry simplifies the analysis of models, enabling tractable predictions for the real-world processes they represent.In this way, a formal theory of symmetry is not only of mathematical interest, but has practical value in aiding our understanding of natural selection.

Figure 1 :
Figure1: Population structures and their symmetry groups.a In a haploid well-mixed population model, such as the Moran[1] and Wright-Fisher[2,3] models, every permutation of the set G of sites is a symmetry.The symmetry group Sym(G, p) is therefore isomorphic to the symmetric group S n , where n = |G| is the number of sites.b In a diploid well-mixed population of size N , symmetries may permute individuals, and may also swap the two sites within individuals; the symmetry group is therefore (isomorphic to) S N × (S 2 ) N .c In a haploid class-structured population, where the classes have respective sizes n 1 , . . ., n k , symmetries may permute the members of each class.The symmetry group is therefore S n1 × . . .× S n k .d For a population structured as a cycle graph, symmetries may rotate and/or reflect the cycle; the symmetry group is the dihedral group D n .e For a directed cycle, only rotations are allowed; the symmetry group is the cyclic group Z n .e For a star graph, any permutation of the n − 1 leaves is a symmetry, yielding the symmetry group S n−1 .