Evolution of Genetic Redundancy: The Relevance of Complexity in Genotype-Phenotype Mapping

Genetic redundancy is ubiquitous and can be found in any organism. However, it has been argued that genetic redundancy reduces total population fitness, and therefore, redundancy is unlikely to evolve. In this letter, we study an evolutionary model with high-dimensional genotype-phenotype mapping (GPM) to investigate the relevance of complexity in GPM to the evolution of genetic redundancy. By applying the replica method to deal with quenched randomness, the redundancy dependence of the fitness is analytically obtained, which demonstrates that genetic redundancy can indeed evolve, provided that the GPM is complex. Our result provides a novel insight into how genetic redundancy evolves.

All living organisms are under selection pressures, which act on their phenotypes, while only their genotypes are heritable. Therefore, the connection between genotype and phenotype, referred to as genotype-phenotype mapping (GPM), is indispensable for fully understanding evolutionary processes. However, GPM is generally complex and stochastic. Many phenotypic traits are now known to result from complex processes involving interactions among many proteins, RNAs, and genes. For example, developmental processes are largely regulated by transcriptional networks that are modeled by high-dimensional, non-linear equations [1,2]. Since such inherent complexity in GPM obscures which genotype is associated with a high-fitness phenotype, this complexity can have an impact on evolution. However, the relevance of GPM complexity to evolutionary processes has not yet been fully explored.
An important characteristic of GPM is genetic redundancy, i.e., the coding of a phenotypic trait by two or more genes. Numerous examples of genetic redundancy have been found in higher organisms [3,4] and even in microorganisms [5,6]. A classical premise of evolutionary theory is that genetic redundancy lowers fitness at the population level, and thus redundancy would be evolutionarily suppressed [7-9]. For example, studies have shown that after a gene is duplicated, one of the redundant copies tends to lose its function [10-13]. According to this argument, genetic redundancy reduces the sensitivity of fitness to mutation, and thus deleterious mutations may not be eliminated, leading to a decrease in the total fitness of the population. In contrast, individuals without genetic redundancy are generally more susceptible to the deleterious effects of mutations, so much so that most mutants are lethal and mutants with lower fitness are effectively removed, thus maintaining the high fitness of the population. This suggests that genetic redundancy is evolutionarily unstable. In particular, the suppression of redundancy is pronounced in asexual populations. Several studies have explored the conditions that enable the evolution of genetic redundancy, but these conditions are still not well understood [6,8,14-16]. Specifically, the relevance of complexity in GPM to the evolution of genetic redundancy has not been evaluated.
In the present Letter, we take an asexual evolutionary model and study the relevance of genetic redundancy to evolution by comparing the results from simple and complex GPMs. Unlike the classical view mentioned above, we find that populations with higher genetic redundancy can have higher fitness under complex GPMs. This preference for genetic redundancy is characteristic of the high-dimensional complex GPM and is independent of previously reported mechanisms [6,8,14-16]. Therefore, this study provides a novel explanation for the ubiquity of genetic redundancy in biological systems.
In order to model both simple and complex GPM, we introduce an adiabatic spin system, where complex GPM is represented as a non-trivial mapping with quenched randomness. Indeed, Sakata et al. [17] adopted a spin system with quenched randomness to study evolution (not of genetic redundancy) under complex GPM.
In this model, the configuration of the $N$ sites of a locus is described by $g = (g_1, g_2, \ldots, g_N)$, where each $g_j$ can take two different allelic states, $g_j = \pm 1$. The genotype $g$ determines $M$ phenotypic traits, represented by $p = (p_1, p_2, \ldots, p_M)$, where each $p_i = \pm 1$ indicates whether the $i$th trait is expressed or not. Genetic redundancy is characterized by the parameter $\gamma \equiv N/M$. For a given genotype $g$, the phenotype $p$ is determined stochastically, with a probability given by the stochastic GPM $P(p|g)$. We model this GPM by the following conditional probability,
$$P(p|g) = \frac{1}{Z_p(g)} \exp\Big(\beta \sum_{i=1}^{M} \sum_{j=1}^{N} p_i J_{ij} g_j\Big), \qquad (1)$$
where $J_{ij}$ couples the $j$th genetic site to the $i$th trait, $\beta^{-1} \equiv T_p$ is the temperature for the stochasticity in GPM, which represents the strength of phenotypic fluctuation in isogenic individuals, and $Z_p(g)$ is the partition function of $p$. The phenotype fluctuates during an individual's lifetime due to developmental noise or environmental variation. These phenotypic fluctuations are much faster than the changes in genotype mediated by the evolutionary processes of selection, reproduction, and mutation. Thus, the phenotypic fluctuation can be adiabatically eliminated. Hence, once $g$ is given, one can assume that the distribution of $p$ is uniquely determined by Eq. (1). Changes in the genotype distribution on an evolutionary time-scale are dominated by mutation and selection processes under a given fitness function of the phenotype $p$. Instead of introducing a complex fitness landscape, we adopt a simple fitness function, $\Phi(p) = \sum_i p_i$, to focus on the relevance of GPM to evolution. By the adiabatic assumption, only the effective fitness $\phi(g)$, the average of the fitness $\Phi(p)$ over $p$ for a given $g$, contributes to the slower evolutionary dynamics of $g$. The distribution $P(g)$ of genotype $g$ in the population at equilibrium is determined only by the effective fitness $\phi(g)$ and the "genotypic temperature" $T_g$, which represents the ratio of mutation rate to selection pressure.
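The adiabatic averaging over the phenotype can be checked numerically. The following is a minimal sketch, assuming the GPM has the form $P(p|g) \propto \exp(\beta \sum_{i,j} p_i J_{ij} g_j)$, which is the form implied by the expressions for $\phi(g)$ quoted in the text. The function names and the small system size are illustrative choices, not part of the original work. It compares the closed-form effective fitness with a brute-force average of $\Phi(p) = \sum_i p_i$ over all $2^M$ phenotypes.

```python
import itertools

import numpy as np


def effective_fitness(J, g, beta):
    """Closed form: phi(g) = sum_i tanh(beta * sum_j J_ij g_j)."""
    return float(np.sum(np.tanh(beta * (J @ g))))


def effective_fitness_bruteforce(J, g, beta):
    """Average Phi(p) = sum_i p_i over P(p|g) by enumerating every phenotype p.

    Assumes P(p|g) is proportional to exp(beta * sum_ij p_i J_ij g_j).
    """
    h = J @ g  # local field acting on each phenotypic trait
    weighted_sum = 0.0
    partition = 0.0
    for p in itertools.product((-1, 1), repeat=J.shape[0]):
        p = np.array(p)
        weight = np.exp(beta * (p @ h))
        partition += weight
        weighted_sum += weight * p.sum()
    return weighted_sum / partition


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, N = 4, 8  # illustrative sizes; gamma = N / M = 2
    J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(M, N))
    g = rng.choice([-1, 1], size=N)
    print(effective_fitness(J, g, beta=1.3))
    print(effective_fitness_bruteforce(J, g, beta=1.3))
```

The two values agree because, under the assumed form, the traits $p_i$ are conditionally independent given $g$, so each contributes $\tanh(\beta \sum_j J_{ij} g_j)$ to the average.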
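The distribution of genotypes is approximated by the Boltzmann distribution as
$$P(g) = \frac{\exp\big(\beta' \phi(g)\big)}{Z_g},$$
where $\beta' = T_g^{-1}$ and $Z_g$ is the partition function of $g$. We study two extreme cases of GPM in order to compare the evolutionary steady states under complex and simple GPMs. As an example of simple GPM, $J_{ij}$ is chosen to be $J_{ij} = J_0/N$ without randomness, whereas an example of complex GPM is represented by quenched random variables $J_{ij}$ drawn from $P(J_{ij}) = \exp\big[-J_{ij}^2 / (2\sigma_J^2/N)\big] / \sqrt{2\pi\sigma_J^2/N}$. First, we consider the behavior of the simple GPM, $J_{ij} = J_0/N$. In this case, the explicit form of the effective fitness $\phi(g)$ can be calculated as $\phi(g) \equiv \mathrm{Tr}_p\, P(p|g)\Phi(p) = M \tanh\big(\beta J_0 \sum_{j=1}^{N} g_j / N\big)$. For sufficiently large $N$, $Z_g$ is thus obtained as
$$Z_g \simeq \exp\Big\{ N \Big[ \frac{\beta'}{\gamma} \tanh(\beta J_0 m) + s_0(m) \Big] \Big\}, \qquad s_0(m) = -\frac{1+m}{2}\ln\frac{1+m}{2} - \frac{1-m}{2}\ln\frac{1-m}{2},$$
where $m$ is obtained from the following saddle point equation:
$$m = \tanh\Big[ \frac{\beta' \beta J_0}{\gamma} \big( 1 - \tanh^2(\beta J_0 m) \big) \Big]. \qquad (5)$$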
The mean fitness per unit phenotype $\gamma\langle\phi\rangle$ is given by $\gamma\langle\phi\rangle = \frac{\gamma}{N} \frac{\partial}{\partial\beta'} \ln Z_g = \tanh(\beta J_0 m)$. Here, we adopt the fitness per unit phenotype $\gamma\langle\phi\rangle$, rather than the fitness per locus $\langle\phi\rangle$, in order to evaluate the contribution of genetic redundancy to fitness, because $\gamma\langle\phi\rangle$ provides a natural measure to compare systems with high redundancy (large $N$) to ones with low redundancy (small $N$), each with a fixed number of phenotypes $M$. Fig. 1-(a) shows the dependence of $\gamma\langle\phi\rangle$ on $T_g$ and $T_p$ for $\gamma = 1$. This figure illustrates that $\gamma\langle\phi\rangle$ decreases as either $T_g$ or $T_p$ increases, i.e., with an increase in mutation rate or phenotypic fluctuation. Fig. 1-(b) and (c) show the $\gamma$ dependence of $\gamma\langle\phi\rangle$ with fixed $T_p$ and $T_g$, showing that $\gamma\langle\phi\rangle$ decreases as $\gamma$ increases. Since the $\gamma$ dependence of $\gamma\langle\phi\rangle$ enters Eq. (5) through the combination $\beta'/\gamma$, increasing $\gamma$ is equivalent to increasing $T_g$ and thus leads to a decrease in fitness. This implies that more mutations tend to accumulate in the presence of redundant genes, which leads to a reduction in fitness. This is consistent with the suggestion that genetic redundancy weakens the effectiveness of selection in eliminating deleterious mutations in a population, thus causing a decline in population mean fitness [7-9]. As a consequence of this accumulation of mutations, genetic redundancy enhances genetic diversity. This is represented by the increase in entropy per unit phenotype, $\gamma s = \gamma\big(N^{-1}\log Z_g - \beta'\langle\phi\rangle\big)$, as shown in Fig. 1-(d). It follows from these results that, in the case of simple GPM, genetic redundancy $\gamma$ decreases the fitness per unit phenotype $\gamma\langle\phi\rangle$ and increases the entropy per unit phenotype $\gamma s$.
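The decrease of fitness with redundancy in the simple GPM can be reproduced with a few lines of numerics. The sketch below assumes the large-$N$ form that follows from $\phi(g) = M\tanh(\beta J_0 m)$ with $m = \sum_j g_j/N$: the exponent per site is $(\beta'/\gamma)\tanh(\beta J_0 m)$ plus the binary entropy of $m$, and the fitness per unit phenotype is $\tanh(\beta J_0 m)$ at its maximum. The function names, the grid search, and the parameter values are illustrative assumptions, not the authors' procedure.

```python
import numpy as np


def binary_entropy(m):
    """Entropy per site of binary sites (+1/-1) with mean value m."""
    a, b = (1.0 + m) / 2.0, (1.0 - m) / 2.0
    return -(a * np.log(a) + b * np.log(b))


def fitness_per_phenotype(T_g, T_p, gamma, J0=1.0, grid=200001):
    """Return (gamma*<phi>, m) at the maximum of the large-N exponent.

    The exponent per site is (beta'/gamma) * tanh(beta*J0*m) + entropy(m),
    maximised here by a plain grid search over m in (-1, 1).
    """
    beta_g, beta_p = 1.0 / T_g, 1.0 / T_p
    m = np.linspace(-1.0 + 1e-9, 1.0 - 1e-9, grid)
    exponent = (beta_g / gamma) * np.tanh(beta_p * J0 * m) + binary_entropy(m)
    m_star = float(m[np.argmax(exponent)])
    return float(np.tanh(beta_p * J0 * m_star)), m_star


if __name__ == "__main__":
    for gamma in (1.0, 2.0, 4.0):
        fitness, m_star = fitness_per_phenotype(T_g=0.5, T_p=1.0, gamma=gamma)
        print(f"gamma={gamma:.0f}  m={m_star:.4f}  fitness per phenotype={fitness:.4f}")
```

Because $\gamma$ appears only in the ratio $\beta'/\gamma$, doubling $\gamma$ has the same effect on $m$ as doubling $T_g$, which is why the printed fitness falls as $\gamma$ grows.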
Next, let us consider the case of a complex GPM, where $J_{ij}$ is a quenched random variable drawn from the Gaussian distribution $P(J_{ij})$ with zero mean and variance $\sigma_J^2/N$ introduced above. The explicit form of the effective fitness can be calculated as $\phi(g) = \sum_{i=1}^{M} \tanh\big(\beta \sum_{j=1}^{N} J_{ij} g_j\big)$. In this case, we need to calculate the expectation value of the free energy over the quenched random variables, rather than that of $Z_g$. To calculate the quenched average of the free energy, $[F(T_g)]_J = \int F(T_g) P(J)\, dJ$, the replica method [18] is useful: we first calculate $[Z_g^n]_J$ for an integer $n$, and then $[\log Z_g]_J$ is obtained from
$$[\log Z_g]_J = \lim_{n \to 0} \frac{[Z_g^n]_J - 1}{n}.$$
After some algebra, $[Z_g^n]_J$ for an integer $n$ is expressed in terms of $q_{\mu\mu'}$ and $\omega_{\mu\mu'}$, the spin-glass order parameter and its conjugate, through the functions $\Psi(q_{\mu\mu'})$ and $S(q_{\mu\mu'}, \omega_{\mu\mu'})$. These functions involve $m_i = (m_i^1, \ldots, m_i^n)$, $f_0(m_i^\mu) = \tanh \beta m_i^\mu$, and $Q$, an $n \times n$ matrix with diagonal elements $1$ and off-diagonal elements $q_{\mu\mu'}$. For sufficiently large $N$, the order parameters $q_{\mu\mu'}$ and $\omega_{\mu\mu'}$ in Eq. (8) and Eq. (9) are determined by saddle point equations, in which $\langle \cdot \rangle_L$ indicates an average over the associated distribution. For further calculation, we assume the replica symmetry (RS) ansatz, $q_{\mu\mu'} = q$ and $\omega_{\mu\mu'} = \omega$ for all $\mu \neq \mu'$. This ansatz provides equations for $q$ and $\omega$, Eq. (12), in which $Dz = e^{-z^2/2}\, dz/\sqrt{2\pi}$ and $[\cdot]_\Theta$ indicates an average over the distribution $\exp(\Theta(m, z; \beta')) / \int dm\, \exp(\Theta(m, z; \beta'))$ with $\Theta(m, z; \beta') \equiv \beta' f_0(m) - (m + \sqrt{\sigma^2 q}\, z)^2 / [2\sigma^2(1 - q)]$. Over the entire $T_g$-$T_p$ plane, $q$ is always positive except when $T_g = \infty$ or $T_p = \infty$, indicating that the model has no paramagnetic phase. Using $q$ and $\omega$ from Eq. (12), the free energy per locus for the RS solution, $f_{RS}$, is calculated.
The RS solution becomes invalid when the de Almeida-Thouless (AT) condition [18,19] is broken, accompanied by replica symmetry breaking (RSB). Therefore, we perform Monte Carlo simulations (MCS) to estimate the fitness and entropy in the RSB phase, while theoretical estimates based on the RS ansatz are used in the RS phase. In Fig. 2-(a), the yellow dashed line shows the phase boundary between the RS and RSB phases on the $T_g$-$T_p$ plane obtained from the AT condition, and indicates that RSB occurs when either $T_g$ or $T_p$ is small. Details of the derivation of the AT condition and the phase boundary are given in the Supplemental Materials. The mean fitness per unit phenotype of the RS solution is obtained by differentiating $-\gamma\beta' f_{RS}$ with respect to $\beta'$, i.e., $\gamma\langle\phi\rangle = \partial(-\gamma\beta' f_{RS})/\partial\beta'$. In the RS phase, the estimates obtained using MCS agree with these theoretical estimates. The color map in Fig. 2-(a) shows the $T_p$ and $T_g$ dependence of the fitness per unit phenotype $\gamma\langle\phi\rangle$ with fixed $\gamma$, and illustrates that $\gamma\langle\phi\rangle$ decreases as either $T_p$ or $T_g$ increases. This decrease of $\gamma\langle\phi\rangle$ with $T_p$ and $T_g$ is consistent with the results of the analysis of the simple GPM. The redundancy dependence of $\gamma\langle\phi\rangle$ with fixed $T_p$ and $T_g$ is shown in Fig. 2-(b) and (c). In contrast to the case of simple GPM, $\gamma\langle\phi\rangle$ increases with increasing $\gamma$. This result is opposite to the classical point of view, according to which genetic redundancy decreases fitness at evolutionary equilibrium.
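A Monte Carlo estimate of the fitness per unit phenotype can be sketched as follows. The text does not specify the update rule used in the simulations, so single-site Metropolis flips are an assumption here, as are the function names, system sizes, and sweep counts. The sketch samples genotypes from $P(g) \propto \exp(\beta'\phi(g))$ with $\phi(g) = \sum_i \tanh(\beta \sum_j J_{ij} g_j)$ for one realization of Gaussian couplings with variance $\sigma_J^2/N$.

```python
import numpy as np


def sample_couplings(M, N, sigma_J=1.0, seed=0):
    """Draw one quenched realization J_ij ~ Normal(0, sigma_J^2 / N)."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, sigma_J / np.sqrt(N), size=(M, N))


def metropolis_fitness(J, T_g, T_p, sweeps=2000, burn_in=500, seed=0):
    """Estimate <phi>/M under P(g) ~ exp(phi(g)/T_g) by single-site Metropolis.

    The update rule is an illustrative assumption, not the authors' algorithm.
    """
    rng = np.random.default_rng(seed)
    M, N = J.shape
    beta_p, beta_g = 1.0 / T_p, 1.0 / T_g
    g = rng.choice([-1, 1], size=N)
    h = J @ g  # local fields on the M traits
    phi = np.tanh(beta_p * h).sum()
    total, count = 0.0, 0
    for sweep in range(sweeps):
        for j in rng.permutation(N):
            h_new = h - 2.0 * g[j] * J[:, j]  # fields after flipping site j
            phi_new = np.tanh(beta_p * h_new).sum()
            # Accept with probability min(1, exp(beta' * delta_phi)).
            if rng.random() < np.exp(min(0.0, beta_g * (phi_new - phi))):
                g[j] = -g[j]
                h, phi = h_new, phi_new
        if sweep >= burn_in:
            total += phi
            count += 1
    return total / (count * M)


if __name__ == "__main__":
    J = sample_couplings(M=8, N=16, seed=1)
    for T_g in (0.2, 1.0, 50.0):
        print(T_g, metropolis_fitness(J, T_g=T_g, T_p=0.5))
```

A single small sample like this only illustrates the sampling procedure. Comparing different $\gamma$ quantitatively would require averaging over many coupling realizations and larger systems, and at low temperatures simple Metropolis dynamics may equilibrate slowly.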
The entropy per unit phenotype, $\gamma s$, is interpreted as the diversity of genotypes in a population, and is calculated as $\gamma s = -\gamma\beta'(\langle\phi\rangle + f_{RS})$ for the RS phase. $\gamma s$ is evaluated numerically for the RS and RSB phases by using a multicanonical Monte Carlo method [20-22], which is often used in spin glass systems and other fields [23,24]. Figure 2-(d) shows the $\gamma$ dependence of $\gamma s$ with fixed $T_p$ and $T_g$. As shown in Fig. 2-(d), $\gamma s$ increases almost linearly with $\gamma$. This increase of $\gamma s$ with $\gamma$ is similar to the case of simple GPM, where the presence of redundant genes allows for the accumulation of mutations and thus an increase in the diversity of genotypes found in a population. To conclude, in the case of complex GPM, genetic redundancy $\gamma$ increases both the fitness per unit phenotype, $\gamma\langle\phi\rangle$, and the entropy per unit phenotype, $\gamma s$.
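For very small systems the entropy can be computed exactly, which makes its meaning as genotype diversity concrete. The sketch below is not the multicanonical method used in the text. It simply enumerates all $2^N$ genotypes for one coupling realization and checks that the thermodynamic expression $\log Z_g - \beta'\langle\phi\rangle$ (with $\phi$ the total effective fitness) equals the Shannon entropy of $P(g)$. Both are divided by $M$ to give an entropy per unit phenotype. Names and sizes are illustrative assumptions.

```python
import itertools

import numpy as np


def genotype_entropy(J, T_g, T_p):
    """Exact entropy per unit phenotype by enumerating all 2^N genotypes.

    Returns (thermodynamic estimate, Shannon entropy), both divided by M.
    Only feasible for small N.
    """
    M, N = J.shape
    G = np.array(list(itertools.product((-1, 1), repeat=N)))  # shape (2^N, N)
    phi = np.tanh((G @ J.T) / T_p).sum(axis=1)  # total effective fitness of each g
    log_w = phi / T_g  # log Boltzmann weights, beta' * phi(g)
    log_Z = np.logaddexp.reduce(log_w)
    log_P = log_w - log_Z
    P = np.exp(log_P)
    mean_phi = float(P @ phi)
    s_thermo = (log_Z - mean_phi / T_g) / M
    s_shannon = float(-np.sum(P * log_P)) / M
    return float(s_thermo), s_shannon


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, N = 5, 10
    J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(M, N))
    print(genotype_entropy(J, T_g=1.0, T_p=1.0))
```

The upper bound on the result is $(N/M)\log 2 = \gamma \log 2$, reached when all genotypes are equally likely, so the maximum attainable diversity per phenotype grows linearly with $\gamma$.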
In the present study, an adiabatic spin model with high-dimensional GPM is investigated, where both the complexity in GPM (represented by quenched random variables $J_{ij}$) and genetic redundancy ($\gamma$) are controllable parameters. We compared evolutionary steady states between the simple and complex GPM cases. In the case of simple GPM, the fitness per unit phenotype $\gamma\langle\phi\rangle$ decreases as $\gamma$ increases (Fig. 1-(b) and (c)), indicating that redundancy should be suppressed by selection pressure, which is consistent with the classical view [7-9]. As shown in Fig. 1-(d), a decrease in redundancy reduces the entropy per unit phenotype $\gamma s$. Since less genetic diversity in the population hinders accessibility to novel genotypes and suppresses evolvability, selection pressure toward less genetic redundancy for simple GPM will lead a population to an evolutionary dead-end. Remarkably, this is not true for a complex GPM. Under complex GPM, a population with higher $\gamma$ exhibits higher fitness $\gamma\langle\phi\rangle$, as shown in Fig. 2-(b) and (c). Therefore, genetic redundancy, i.e., a system with higher $\gamma = N/M$, can evolve, which is contrary to the classical view of genetic redundancy. A possible explanation is as follows: a larger $N$ provides greater degrees of freedom in $g$ to optimize the fitness, and thus enables the realization of a $g$ that provides higher fitness than the highest fitness found in systems with smaller $N$. At the same time, a system with smaller $M$ decreases the variety of connections from $g_i$ to $p_i$, and diminishes the frustration in $g_i$ when optimizing the fitness. Therefore, a system with larger $N$ and smaller $M$ can provide higher fitness.
As shown in Fig. 2-(d), a larger $\gamma$ is accompanied by an increase in the entropy per unit phenotype $\gamma s$, and thus by growth in genetic diversity. This allows a population to access a variety of novel genotypes, which could accelerate the emergence of evolutionary innovations. Thus, a population with complex GPM can retain evolvability. It has been suggested that genetic diversity enables a population to respond rapidly to large environmental changes [16,26,27]; our results therefore also suggest that a population with complex GPM could have the potential to adapt to new environments.
The model with complex GPM exhibits an RS/RSB transition. Interestingly, such a transition was also reported in a different model by Sakata et al. [17], who argue that the RSB phase is biologically unfavorable. Similar to their study, we also demonstrate that phenotypic fluctuations suppress the appearance of the RSB phase. This RS/RSB transition can appear in many evolutionary models with complex GPM, and the biological significance of this transition is worthy of further investigation.
In summary, we have demonstrated how complexity in GPM promotes the evolution of genetic redundancy. The mechanism is general, and the recombination process in a sexual population does not weaken the proposed preference for genetic redundancy. Selection processes with complex GPM make gene duplication preferable, which can further enhance the potential for novelty to emerge in evolution.
We would like to acknowledge helpful comments made by Koji Hukushima, and discussions with Ayaka Sakata and Tomoyuki Obuchi. This work was supported by a Grant-in-Aid for Scientific Research (No. 21120004) on Innovative Areas "Neural creativity for communication" (No. 4103) of MEXT, Japan.