Divergent evolution of genetic sex determination mechanisms along environmental gradients

Abstract Sex determination (SD) is a crucial developmental process, but its molecular underpinnings are very diverse, both between and within species. SD mechanisms have traditionally been categorized as either genetic (GSD) or environmental (ESD), depending on the type of cue that triggers sexual differentiation. However, mixed systems, with both genetic and environmental components, are more prevalent than previously thought. Here, we show theoretically that environmental effects on expression levels of genes within SD regulatory mechanisms can easily trigger within-species evolutionary divergence of SD mechanisms. This may lead to the stable coexistence of multiple SD mechanisms and to spatial variation in the occurrence of different SD mechanisms along environmental gradients. We applied the model to the SD system of the housefly, a global species with world-wide latitudinal clines in the frequencies of different SD systems, and found that it correctly predicted these clines if specific genes in the housefly SD system were assumed to have temperature-dependent expression levels. We conclude that environmental sensitivity of gene regulatory networks may play an important role in diversification of SD mechanisms.


Introduction
Sex determination (SD) is a crucial aspect of the development of sexually reproducing organisms, yet the regulatory mechanisms underlying SD are very diverse and prone to evolutionary change (Bachtrog et al., 2014;Beukeboom & Perrin, 2014). SD mechanisms have traditionally been classified as either environmental (ESD) or genetic (GSD) depending on the type of signal that initiates the determination of an individual's sex. Under ESD, such signals include temperature, salinity, and acidity during a sensitive period in embryonic development (reviewed in Devlin & Nagahama, 2002;Janzen & Paukstis, 1991). Under GSD, the signal is a specific gene (or set of genes) present in zygotes, leading to either male or female development, such as the male-determining Sex-determining Region Y (SRY) gene in mammals or transformer (tra) in many insects Goodfellow & Lovell-Badge, 1993;Pane et al., 2005;Sinclair et al., 1990;Verhulst et al., 2010). Diversification of SD mechanisms occurs via the evolution of novel SD mechanisms that replace their predecessors, a process called SD transition. SD transitions are prevalent in some taxa but not in others (Bachtrog et al., 2014;Beukeboom & Perrin, 2014), indicating variable evolvability of SD systems. What enhances the evolvability of some SD systems but not others and what causes SD transitions is still poorly understood.
One often-overlooked aspect of the evolution of SD is how environmental and genetic factors may simultaneously affect SD. Rather than being mutually exclusive, ESD and GSD could instead be considered as two extremes of a continuum (Beukeboom & Perrin, 2014;Pen et al., 2010;Uller & Helanterä, 2011), with mixed systems occurring in several organismal groups, such as amphibians, fish, and insects (Devlin & Nagahama, 2002;Doums et al., 1996;Hamm et al., 2015;Ma et al., 2016;Mezzasalma et al., 2021;Nigro et al., 2007;Pen et al., 2010). In such mixed systems, SD may reflect a delicate balance between genetic effects that bias the process of SD toward male or female development, counteracted by environmental effects that push the system in the opposite direction (Alho et al., 2010;Holleley et al., 2015). GSD has repeatedly evolved in species which previously had ESD (e.g., Ezaz et al., 2009), and several theoretical models have been developed to predict when such turnovers should occur (Muralidhar & Veller, 2018;Pen et al., 2010;Van Dooren & Leimar, 2003). However, such models typically do not explicitly consider the underlying molecular mechanisms of sex determination. Although sex is determined by genes in species with GSD, environmental conditions can affect SD by modifying the expression levels of these genes (Hodson et al., 2017;Schmidt et al., 1997). Despite clear evidence that such environmental effects may perturb the action of GSD mechanisms (e.g., Devlin & Nagahama, 2002), their effect on the evolution of GSD systems is still unknown.
In many species, SD involves hierarchical gene regulation (Kopp, 2012;Pomiankowski et al., 2004;Wilkins, 1995), where an initial signal targets a small number of regulatory genes that in turn regulate downstream targets. Evolutionary transitions between GSD systems are thought to occur primarily by the displacement of the initial signal gene by another gene with a similar function, or by the recruitment of a new regulatory gene on top of the ancestral SD regulatory pathway (Bull & Charnov, 1977;Pomiankowski et al., 2004;Wilkins, 1995). Genes downstream of the top regulatory genes are considered to be more constrained in terms of evolutionary change, as mutations in such genes may interfere with their pre-existing function in regulating SD. Nonetheless, they are not fully prohibited from undergoing further evolution, and some changes may still occur (Herpin et al., 2013).
A new evolutionary framework needs to integrate the views that (1) SD is not solely environmentally or genetically determined but is to varying degrees affected by both types of cues; and (2) changes in SD cascades do not only occur at the top but may also occur via changes in the underlying genetic network. Here, we formalize this framework by developing a theoretical model of the evolution of SD systems in the presence of spatial environmental variation that affects expression of SD genes. The model is inspired by the polymorphic SD system of the housefly Musca domestica, but can be applied more broadly to other systems as well (Table 1). In this system, temperature is likely to act as an environmental cue because (1) variation in SD systems is correlated with variation in temperature between natural populations, and (2) temperature affects SD in several M. domestica strains harboring mutant SD genes (see also Box 1). Like in M. domestica SD, the model features two types of SD genes: an environmentally-sensitive gene F induces femaleness when active, and one or more M-genes that induce maleness by inhibiting F (Figure 1, see details below). We investigate here how the (co-) evolution of F and M can yield novel SD systems under the influence of environmental sensitivity of F. We then use the model to explain how the multifactorial SD system of houseflies has evolved (Box 1).

Methods
Here, we briefly describe the model; a more detailed description of the model and simulation techniques is provided in the Supplementary Methods. We developed an individual-based simulation model, where individuals occupy a linear array of subpopulations (demes) arranged along a temperature gradient. The life cycle is as follows: adults reproduce sexually and then die; their offspring undergo sexual development and viability selection; a fraction of the surviving offspring migrates from their natal subpopulation to a neighboring subpopulation; finally, individuals mature and the next round of reproduction begins.
Motivated by the SD mechanisms of the housefly (Box 1), individual sexual development was modeled to result from interactions between a single feminizing gene F, one or more masculinizing genes M and the local temperature of an individual's environment; note that labels such as "female" and "feminizing" are interchangeable for "male" and "masculinizing" to suit other SD mechanisms, e.g., ZW systems instead of XY systems. The F gene produces a temperature-dependent amount of product which is partially inhibited or degraded by the products from the M genes; the remaining or net amount of F product, denoted by ẑ F, determines sex: if ẑ F exceeds a threshold value θ F, the individual develops as female, whereas if ẑ F is below a second threshold value θ M < θ F it develops into a male. If ẑ F is between the two thresholds, θ M ≤ẑ F ≤ θ F, then the individual develops into an infertile intersex.
The value of ẑ F is obtained by summing up the net expression levels z F of both F alleles in an individual. The quantitative value of z F can vary between F alleles and depends on (1) the temperature-dependent expression level of the allele, (2) the allele's sensitivity to M product, and (3) the amount of M product. Specifically, normalized temperature varies between T = 0 at one end (the "north") of the array of subpopulations and T = 1 at the other end (the "south") and it affects the net amount of F product according to The first factor on the right, z F0 (1 + βT) + ε, represents the initial amount of F product, before partial degradation by M product, where z F0 is the F allele's baseline expression level at T = 0, β ≥ 0 quantifies the linear rate of increase in F expression with temperature, and ε represents random variation in expression levels due to environmental and/or developmental noise. The second factor, 1 − s FMẑM, represents the proportion of F product that is not degraded by M product, where 0 ≤ s FM ≤ 1 is the F allele's sensitivity to M product and ẑ M is the cumulative amount of M product produced by the individual's M alleles.
The baseline expression level z F0 and sensitivity s FM of F alleles are evolvable trait values, as are the expression levels of M alleles. Whenever a gamete is produced, each allelic trait incurs a mutation with a certain probability and its trait value is modified. Most mutations are "regular" mutations that modify traits by adding a small Gaussian amount with mean zero, but a small fraction of mutations are "null mutations" such that the resulting trait values (allelic expression levels or sensitivity to M product) are zero and cannot evolve any further. See Supplementary Table 2  The initial populations all have an XY male heterogametic system: all individuals carry two F alleles on an autosomal F locus and males additionally carry a single M allele (designated M Y ) on their Y-chromosome. Initially there are no M alleles on autosomes, but we assume that during meiosis sometimes a new M allele (designated M A ) is created on an autosomal locus; this is assumed to occur via de novo evolution of a novel M allele, but can also occur via transposition from a Y-chromosomal M allele. Individuals carry at most four M alleles: two on the Y-chromosome (if they have two Y chromosomes) and two on an autosome. In natural systems, many loci may be capable of evolving a male-determining function (Bopp, 2010), but we consider here only a single autosomal locus for simplicity. Thus, the initial XY population has the potential to evolve into a population with a new system of male heterogamety where an autosomal chromosome with an M allele has replaced the original Y-chromosome. Evolution of populations with female heterogamety is also a potential outcome, if females become heterozygous for an insensitive F allele (i.e., with s FM = 0) with a sufficiently high expression level.
We also allowed for Y-chromosomal fitness effects: (1) individuals homozygous for the Y-chromosome will have reduced viability 0 ≤ s YY ≤ 1, and (2) Y-chromosomes carry sexually antagonistic alleles that are beneficial to males and detrimental in females, with additive effect 1 + s a males and 1 − s a in females (where s a ≥ 0 is the antagonistic effect). The combined effects of Y-homozygosity and sexual antagonism are assumed to be multiplicative, i.e., a male with two Y-chromosomes will have his expected fitness modified by a factor s YY (1 + s a ), whereas a female with two Y-chromosomes gets s YY (1 − s a ).

F activity relative to SD thresholds determines which SD systems are evolutionary stable
As an initial exploration of our model, we performed a set of 10,000 simulations without temperature-dependent overexpression, but for different values for the SD thresholds θ M and θ F, as well as different M A activation rates µ D (sampled from a uniform distribution with range [0, 0.05]). Here, de novo activation of M A causes a male-biased sex ratio, thereby promoting the invasion of female-determining alleles (Bull & Charnov, 1977;Wilkins, 1995), similar to, e.g., sex chromosome meiotic drive (Jaenike, 2001;Kozielska et al., 2010). In addition, for these simulations, we did not incorporate fitness effects associated with M Y . For each simulation, we determined the most prevalent genotype among females and males at equilibrium, based on the expression and sensitivity levels of their F alleles as well as the number of expressed M Y and M A alleles. We categorized simulations according to the activity of a single F allele (zF) relative to θ M and θ F to Note. Divergent evolution of sex determination mechanisms within a single species may occur as different populations adapt to different environmental conditions as explained by our model. Similarly, closely related species may exhibit different sex determination mechanisms when occupying different habitats. These species c.q. species complexes may provide opportunities to validate our model or aspects thereof, though a lack of knowledge on the genetic basis of sex determination and/or effect of environmental variation may need to be resolved to do so. a Temperature effects on sex determination are described in related species (same order or genus), b Temperature effect theorized, but lacking empirical validation.
determine which SD systems can evolve under different levels of F activity. We find that under different relationships between the maximum potential activity of a single F allele (zF) and the SD thresholds, different SD systems can evolve (Supplementary Table 3). When θ M < z F < θ F, nearly all simulations yield an SD system where both sexes have two sensitive and expressed F alleles, and males are heterozygous for a single M (either M Y or M A ; females F/F; +/+, males F/F; +/M, with + indicating the absence of an expressed M allele). In contrast, when z F < θ M < θ F or θ M < θ F < z F, we additionally encounter systems in which F alleles evolved to become insensitive and/or unexpressed, but the distribution of these alleles over the sexes differs between these two scenarios: when z F < θ M < θ F, females carry two insensitive F alleles, whereas males carry a single insensitive F and one unexpressed F; here, insensitive F alleles can be regarded as recessive female-determining alleles, whereas the sensitive and expressed F allele (in presence of M) or unexpressed F plays the role of a dominant male-determining allele. When θ M < θ F < z F, females may carry a single insensitive F allele, whereas males carry none, suggesting that insensitive F alleles are dominant female-determining alleles. This is corroborated by the presence of expressed M alleles in both sexes in the simulations, which would otherwise induce maleness in their carriers.
In simulations where insensitive F alleles have evolved, we find that the remaining F alleles have become unexpressed, and in some cases that expressed M alleles are lost as well. Here, insensitive F alleles spread first, along with an increased frequency of expressed M alleles in both sexes (Supplementary Figure 1), so that an equal sex ratio is maintained. Next, sensitive F alleles become unexpressed (and subsequently insensitive due to the constant mutation pressure affecting F sensitivity); the insensitive expressed F allele is retained as it now performs the SD function. Along with the increase in unexpressed F alleles, M alleles similarly become unexpressed. Based on these dynamics, we infer that evolution of F insensitivity indirectly renders the loss of expression selectively neutral for sensitive F alleles, which in turn renders the loss of M expression neutral; both genes may therefore decay via mutation accumulation in a stepwise order (Supplementary Figure 2). In addition to the systems identified in our simulations, we speculate some other systems may also be evolutionary stable ( Figure 2). Their absence from our simulations may be because they only rarely arise through evolution, or because they represent intermediate states between some of the systems that are observed. The latter for example applies to systems where females have one insensitive and one sensitive F allele, males have two sensitive F alleles, and M is fixed in both sexes (females F I /F; M/M, males F/F; M/M in Figure 2). This system is prevalent in some simulations prior to the accumulation of unexpressed F alleles and loss of M (e.g., Supplementary Figure  1). In absence of fitness effects related to M Y , recurrent de novo mutation of M A results in M A replacing M Y as the male-determining factor; without evolution of F, this represents a transition from one male heterogamety system to another, as M Y and M A perform equivalent functions. Male heterogamety systems with M Y or M A as the male-determining gene are observed regardless of the activity of F relative to the SD thresholds (Supplementary  Table 3). Moreover, male heterogamety with M Y or M A as a dominant male-determiner is observed in all simulations when θ M < z F < θ F, suggesting this is the only stable SD system under these conditions, so that insensitive F alleles cannot be part of any SD mechanism under these conditions.

Evolution of F insensitivity and establishment of SD system gradients
The above results show how F, M, and the SD system as a whole evolve in relation to the SD thresholds θ F and θ M. Most importantly, we find that F can evolve into a dominant female-determining gene by becoming insensitive to M, provided that a single F allele generates enough F product to induce feminization, i.e., the threshold for feminization θ F is sufficiently low relative to F expression z F. We refer to such insensitive dominant feminizing F alleles as F I . Invasion of F I alleles in our model is primarily driven by the de novo evolution of M A , which causes a slight male-biased sex ratio, thereby causing feminizing mutations to be favored (but see Discussion) (Bull & Charnov, 1977). Provided that the feminization threshold θ F is constant under all conditions, local variation in temperature could then lead to divergent evolution of SD systems as temperature effects on F expression could then allow for local evolution of a female heterogamety system.
To test this reasoning, we used simulations in which we varied two parameters: the rate β at which temperature increases F Figure 1. Model overview. (A) Sex is determined based on the net total active F product. Active F is produced by the F locus, and degraded by M, produced by M Y and/or M A . Higher temperature increases expression levels of F. If the net total active F product exceeds a threshold θ F, individuals become females, whereas below the threshold θ M individuals become males; otherwise, individuals develop into infertile intersexes. (B) Demes 1 through N are arranged along a linear gradient where T increases from T min to T max. Each deme contains a variable number of females, males, and intersexes. Reproduction occurs by mating between males and females within the same deme; intersexes do not reproduce. Dispersal occurs at a rate d to neighboring demes (indicated by arrows). Every individual has three gene loci F, M Y , and M A that jointly determine sex.
expression (see Equation 1) and the threshold value θ F for feminization. Here, we exclude Y-chromosomal fitness effects, but explore their impact on SD evolution in the following section.
We found that F I can spread in the entire metapopulation when the feminization threshold θ F is sufficiently small (Figure 3A), regardless of how strongly temperature affects F expression. For Figure 2. Classification of sex determination systems depending on relative values of thresholds and F activity. Each system is defined by a recurrent pair of female and male genotypes that can only generate those two genotypes (conform Bull & Charnov, 1977). Here we define three alleles for locus F: a regular F that is expressed and sensitive to M; a variant F I that is insensitive to M and expressed; and a variant F 0 that is unexpressed. For M, we distinguish between active (M) and inactive (0) variants. We use z F to refer to the activity of a single F allele. In systems with both F I and F 0 , M has no function and may be present or absent; this is indicated by asterisks (*/*). sufficiently high values of θ F, F I was unable to spread in colder demes because it would result in intersexual development (Supplementary Figure 3), but could still spread in warmer demes. Under these conditions, a geographical cline in the frequency of F I evolves ( Figure 3B).
These results underline that the distribution of F I is constrained by the expression level of F, and show that temperature-dependent effects on gene expression can establish gradients when temperature varies. Due to its feminizing effect, an F I allele is transmitted as if it were a female-limited W-chromosome in a ZW system, wherein males have a ZZ genotype, whereas females have a ZW genotype (in contrast to XY systems where females are XX and males are XY). As a result, it occurs only in F I /F females (or non-reproducing intersexes). In the presence of M product, the F I product is not broken down but the product of regular sensitive F alleles is degraded, so that regular F alleles do not contribute to the total F product. This scenario becomes increasingly probable as either M Y and/or M A frequency increases, as more F I -bearing individuals will also bear M Y and/or M A . Therefore, feminization of developing embryos under these conditions is achieved solely by the activity of the F I allele. Because F I is insensitive to M, the total F activity is determined by its expression (see Equation 1). When no temperature-dependent expression occurs, feminization is only achieved when the baseline genetic expression level exceeds the feminization threshold, but in the presence of temperature effects is less constrained. Therefore, when θ F is sufficiently low, F I can spread everywhere independent of temperature ( Figure 3A, left panel), whereas otherwise F I frequencies depend on the temperature-dependent expression rate ( Figure 3A, right panel) and the local temperature ( Figure 3B).

Y-chromosomal fitness effects modulate the conditions under which SD turnover can occur and can enable complex SD polymorphisms
For the simulations discussed above, we assumed that M Y is not associated with any fitness effects. Under these conditions, M Y was always lost due to de novo evolution of M A , which causes a male-biased sex ratio so that both M Y and M A are selected against. Recurrent mutation of MA ensures that it nonetheless remains present in the population, but this does not hold for M Y ; in absence of any associated selective effect, M Y is therefore purged. However, as M Y is the ancestral SD gene, it may have induced the chromosome on which it is located to undergo Y-chromosome evolution (reviewed in Bachtrog, 2013;Schenkel & Beukeboom, 2016). If so, this chromosome is expected to become enriched for sexually antagonistic (SA) genetic variants as well as recessive deleterious mutations. In effect, this would cause M Y to be associated with higher fitness in heterozygous +/M Y males, but to induce a fitness cost to M Y -bearing females (who also carry F I ) as well as all homozygous M Y /M Y individuals. Both sexually antagonistic genetic variation and a cost of homozygosity can in theory strongly affect evolutionary transitions in SD (reviewed in van Doorn, 2014). We therefore performed additional simulations where we considered a sexually antagonistic fitness effect of the Y-chromosome (see Supplementary Table 1), causing the Y-chromosome to affect individual fitness during the mating phase or during development (positively in males, negatively in females). In addition, we include costs of M Y /M Y homozygosity as a form of viability selection during embryogenesis.
When M Y is under sexually antagonistic selection, we find that it is maintained over M A (Figure 4). Sexually antagonistic selection on M Y can also inhibit invasion of F I if selection is sufficiently strong, so that negative effects of M Y in females reduces the fitness of F I /F females relative to non-M Y -carrying F/F females. However, when the rate of introduction of novel M A alleles is sufficiently high, F I can always invade even if the selective effects associated with M Y are strong. When M A originates more frequently and therefore reaches higher frequencies, a more male-biased sex ratio results (similar to Y-chromosomal meiotic drive; Jaenike, 2001;Kozielska et al., 2010) and hence the selective benefit for F I as a female-determining factor increases.
Y-chromosomal fitness effects can also enable maintenance of both M Y and M A in the population, albeit in different locations. This occurs when M Y is disfavored through reduced survival of YY homozygotes in subpopulations with F I , in contrast to subpopulations without F I , where M Y is favored over M A via sexually antagonistic selection in +/M Y heterozygotes. Such M Y versus M A polymorphism is highly similar to the distribution of Y-chromosomal versus autosomal M-factors in the housefly M. domestica. In this species, autosomal M-factors are more prevalent at lower latitudes and coincide with a dominant feminizing allele tra D (which is equivalent to F I as described above), resulting in three latitudinal gradients in Y-chromosomal M-factors, autosomal M-factors, and the tra D allele. We find that Y-chromosomal fitness effects enable the evolution of this complex system in our model (see Box 1). This provides an adaptive explanation for the evolution of this system in contrast to existing models of SD evolution, which have been unable to predict when stable polymorphisms for tra versus tra D along with Y-chromosomal M-factors versus autosomal M-factors may occur in general, and in particular when these coincide along environmental clines as observed under natural conditions (e.g., Kozielska et al., 2006, but see Meisel et al., 2016, for an alternative explanation that invokes linkage between different SD genes and sexually antagonistic loci).

Discussion
We have presented a model for the evolution of SD systems in a context where sex is determined by genetic factors in combination with environmental effects. In our model, female development is induced when the activity of a feminizing gene F exceeds a certain threshold, whereas male development is achieved when F activity is below a different and lower threshold. This can be caused by inhibition of F activity by the maleness-promoting gene(s) M Y and/or M A , or by loss of F expression. We incorporated a positive effect of temperature on the expression of a feminizing locus F. We find that several different SD systems can be realized depending on the activity of an F allele relative to the threshold values for masculinization and feminization. Temperature-dependent effects on F expression may alter the relationships between F activity and SD thresholds, thereby enabling the evolution of different genetic SD systems. A particular prediction is the transition from male heterogamety to female heterogamety, which occurs in our model via the evolution of an insensitive dominant feminizing variant (F I ) that induces femaleness even in the presence of M Y and/or M A . F I can spread when activity of a single F or F I allele is sufficient to induce feminization; when activity is modulated by temperature, this can lead to local invasions of F I and subsequently differentiation between populations along temperature gradients. Differentiation can also occur for M Y and M A , with M Y being favored in absence of F I and M A in presence of F I . This occurs when M Y is associated with certain fitness effects such due to linkage with sexually antagonistic variants or recessive deleterious mutations. Altogether, we show that this can lead to the coexistence of multiple gradients in SD genes as found in, e.g., the housefly M. domestica.
Our model can be amended to other SD systems than the M. domestica system on which it is based, provided that they have a basic GSD framework influenced by an environmental effect (Table 1). Environmental effects on genetic sex determination systems are being reported in an increasing number of species (Edmands, 2021;Holleley et al., 2015). Although temperature-dependent effects are well-documented, other environmental effects may also influence SD in certain systems such as hormonal imbalance in fish due to pollution (Devlin & Nagahama, 2002). Other components and assumptions of the model that are based on the housefly system may be represented differently in other species but with similar effects. For example, the impact of M A evolution is not due to a specific mechanism of mutation, but more generally by causing a male-biased sex ratio, thereby promoting the invasion of F I . Male-biased sex ratios occur due to M A overrepresentation in the gene pool via its de novo evolution, but the same effect can be achieved by translocation of a Y-chromosomal male-determining gene or via association with meiotic drivers (Green, 1980;Kozielska et al., 2010). Similarly, although F I is favored due to sex ratio selection in our model, other selective processes such as sexually antagonistic selection (van Doorn & Kirkpatrick, 2010) as well as neutral processes such as genetic drift (Saunders et al., 2018;Veller et al., 2017) can also drive transitions from male to female heterogamety. Inversely, the spread of F I in our model may be impeded by selection against intersexes, but other factors may also limit its spread. For example, when F I is linked to a gene experiencing intralocus sexual conflict in one environment but not another (Plesnar-Bielak & Figure 4. Y-chromosomal fitness effects alter the scope for SD transitions. Shown are the predicted equilibrium frequencies of F I in females (maternally inherited alleles), M Y and M A in males (paternally inherited alleles); we restrict our analysis to the maternal (F I ) c.q. paternal (M Y , M A ) alleles to account for their (potential) sex-specific transmission. Horizontal labels indicate locus and temperature, vertical labels the M A activation rate µ D. Further parameter values used in simulations: θ F = 1.2; θ M = 0.3; β = 0.5. Predicted allele frequencies were smoothed with binomial generalized additive models. Figure 5. Evolution of a housefly-like SD system. A housefly-like SD system is defined by M Y being the major allele at T = 0 but the minor allele at T = 1 (in males, paternally inherited allele), and vice versa for M A (in males, paternally inherited alleles) and F I (in females, maternally inherited alleles). Frequency denotes the predicted frequency of observing a housefly-like system at equilibrium in the model. Parameter values: θ F = 1.2; θ M = 0.3; β = 0.5. Simulations were scored as exhibiting a housefly-like system as described above (10, 000 simulations in total). To obtain these predicted scores, we fitted a binomial generalized additive model.

Box 1. Evolution of the polymorphic housefly system
Our model has been inspired by the common housefly Musca domestica, in which wild populations harbor different SD systems (reviewed in Hamm et al., 2015). Here, we discuss how our model can explain the evolution of this system.
In houseflies, sex is determined by a linear cascade of genes. First, transformer (tra) induces female development when active in developing embryos (Hediger et al., 2010). Its activity depends on an autocatalytic feedback loop where female-specific TRA F protein directs the splicing of Tra pre-mRNA into the female-specific Tra F mRNA, thereby ensuring the production of novel TRA F protein to maintain the loop. Possibly, temperature effects on tra work by affecting pre-mRNA splicing, as splicing is sensitive to temperature as well as other stressors (Palusa et al., 2007). Second, masculinizing factors (M-factors) such as Mdmd (Sharma et al., 2017) trigger male development by inhibition of the tra loop through splicing of tra pre-mRNA to a male-specific variant. Intriguingly, the M-factors in M. domestica are found on different chromosomes in different populations; most of these correspond to orthologs of Mdmd, and the genomic regions harboring these M-factors consist of multiple (partial) repeats of the Mdmd gene, suggesting that these have originated through gene duplication and/or translocation (Sharma et al., 2017, Li et al., in prep). There also exists an insensitive variant of tra, tra D , that induces female development in all carriers regardless of whether they carry any M-factors. In our model, F represents tra and likewise tra D corresponds to the dominant F I as discussed in the main text; M Y and M A in our model represent M-factors.
High-latitude M. domestica populations have a male heterogamety (XY) system in which the Y-chromosomal Mdmd gene induces maleness, and all individuals carry two regularly sensitive tra alleles. At lower latitudes, however, females usually carry the insensitive tra D allele, and both sexes can be homozygous for an autosomal copy of Mdmd; hence, these populations have a female heterogamety (ZW) system (Supplementary Figure 4; Hamm et al., 2015;Kozielska et al., 2008). The geographical transition between these two SD systems is gradual, so that clines exist in the frequencies of Y-chromosomal Mdmd (decreasing toward lower latitudes), autosomal Mdmd and tra D (both increasing toward lower latitudes). Given that a single tra D allele is sufficient for feminization (even in the presence of multiple M-factors; Hamm et al., 2015) suggests that at least in natural populations it would mimic an F I allele with allele in its place as a co-factor for male development. In effect, a transition has occurred from XY male heterogamety to ZW female heterogamety as the sex-determining role has been taken over by F I . a net expression level z F that exceeds θ F. Therefore, in our model this system would likely mimic a transition from θ M < z F < θ F to θ M < θ F < z F. Temperature likely plays a causal role in maintaining these gradients by affecting the SD process Delclos et al., 2021;Feldmeyer, 2009;Feldmeyer et al., 2008). Temperature effects on housefly SD have been reported in the form of biased sex ratios produced in wildtype crosses (Feldmeyer, 2009) as well as in females carrying the masculinizer (man) mutation, another variant of tra (Hediger et al., 2010;Schmidt et al., 1997). The man mutation represents a maternal-effect male-determining gene, where man-carrying females can produce all-male progeny even if the progeny do not carry an M-factor. However, this effect is incomplete and temperature-sensitive (Schmidt et al., 1997), with offspring sex ratios more male-biased at higher temperatures. Altogether, temperature seems to have an important influence on SD in M. domestica, but the underlying mechanisms are not yet fully understood.
The housefly with its different SD systems is represented in our model by gradients in the frequencies of M Y (decreasing with temperature), M A , and F I (both increasing with temperature). Presumably, tra D is limited to warmer localities for similar reasons as F I in our model, i.e., because a single tra D allele may not be sufficient to induce feminization at low temperatures. Y-chromosomal Mdmd and autosomal Mdmd may follow similar dynamics as M Y and M A . Genomic analyses of the Mdmd loci on the Y-chromosome and various other chromosomes suggests that Mdmd may exhibit (or have exhibited) transposon-like activity, leading to tandem duplications of transposition of these duplications as a complex to other chromosomes (Sharma et al., 2017, Li et al., in prep). Y-chromosomal fitness effects can yield gradients in M Y and M A , but may also prevent the spread of F I , particularly when novel M A alleles enter the population at a low rate. The evolution of a housefly-like polymorphic SD system therefore depends on a balance between the Y-chromosomal fitness effects and the rate at which new autosomal Mdmd arises. In our model, we find that a housefly-like system can evolve under various rates of M A de novo evolution ( Figure 5). Higher rates of M A evolution require stronger SA effects for M Y to be maintained in low-temperature demes. Costs of YY homozygosity do not appear essential for M Y to be lost in the presence of F I , although they may increase the likelihood of M Y being lost in favor of M A in the presence of F I by reducing the fitness of YY homozygotes. Possibly, costs of M Y in females due to sexually antagonistic selection may suffice to promote the loss of M Y . Our model therefore can explain the evolution of complex SD systems such as those found in houseflies. Thus, we propose a novel hypothesis for the evolution of the housefly system in which the sex determination cascade is shaped by a combination of environmental influences on tra, recurrent evolution of autosomal Mdmd, and fitness effects associated with the Y-chromosomal copy of Mdmd ( Figure 6). Łukasiewicz, 2021;Schenkel et al., 2018), it may be favored in the first environment due to sex-specific benefits to females; if such benefits are absent in the second type, and instead it is similarly costly to males and females alike, this would restrict the spread of F I to the first environment.
Additionally, we see that in absence of an association between M Y and fitness effects, M A replaces M Y altogether due to recurrent mutation at M A but not M Y , yet still drives the invasion of F I , showing that our model does not strictly require a third locus. Inversely, it is likely that a more complex genetic basis generates similar evolutionary patterns, such as when various genes can evolve into a male-determining gene (Bopp, 2010). Bopp (2010) describes how a mutation that interferes with the key switch transformer, which regulates female development in houseflies and several other species (see also Box 1), is more likely to occur than a mutation that enhances the function of transformer, simply because there are many ways in which a gene's function can be disrupted, but only a single way for it to function as intended. In our model, this mechanism may apply when F function can be disrupted in a variety of ways (though a similar reasoning could be applied to a gene with male determining function). In this scenario, many genes having a small chance to evolve into a male-determining gene may have the same net effect on sex ratio as a gene that is prone to evolving a masculinizing function. In this light, it will be interesting to test whether genes that have acquired sex-determining function in one species are prone to evolve a similar function in a related species, where it has no sex-determining function.
Previous work has shed light on the evolution of ESD and GSD systems, and when transitions between these two may arise (Muralidhar & Veller, 2018;Pen et al., 2010;Schwanz et al., 2020). However, our understanding of the evolution of polymorphic SD systems and the potential for environmental heterogeneity to influence this process remains limited. Our results highlight environmental effects on GSD systems, and under which conditions this can lead to within-species polymorphism in GSD systems. Alternatively, these mechanisms may also explain why in some groups GSD mechanisms may be highly divergent between closely related species, such as in the mosquitofish Gambusia affinis and the closely related G. holbrooki (Kottler et al., 2020). In such cases, different species experience different ecological conditions in their respective niches, among which environmental factors that might impinge on the SD process as described here. Between-species divergence in SD mechanisms may then occur in much the same way as occurs in our model between populations. Our results furthermore suggest that hybridization need not be an impediment to such differentiation; potentially, such diversification may even enable the evolution of reproductive isolation via Haldane's rule (Haldane, 1922).
In conclusion, even in systems that appear to be fully GSD, the role of environmental influences on the SD processes must not be ignored as these may have played an important role in their evolution. In extension of this, changes in environment, e.g., due to global warming, may impose yet unforeseen selective pressures on species with GSD systems.

Data availability
Source code for the agent-based model and data analysis scripts are made freely available via GitHub (https://github.com/ MartijnSchenkel/Environmental_GSD_evolution).  This hypothesis, along with the results in this manuscript, highlights several promising aspects of housefly SD and its evolution that require further investigation. First and foremost, the effect of temperature on tra functionality needs to be verified, as current data are inconclusive Feldmeyer, 2009). Our model predicts that tra function is enhanced at high temperatures, allowing for the evolution of a dominant feminizing allele tra D . Given that tra regulates its own activity via an autoregulatory feedback loop (Dübendorfer et al., 2002;Hediger et al., 2010), this enhanced function may be effectuated in a variety of manners, including pre-mRNA expression, mRNA splicing, and protein catalytic activity, so that a purely molecular analysis of tra function may be inconclusive. These issues are aggravated by the fact that whether or not the loop is established is typically determined in the zygotic stage, so that analyses of tra function in the adult stage may not be informative (in contrast to Adhikari et al., 2021). Given that tra is expressed even in presence of Mdmd, it seems likely that the hypothesized gain-of-function of tra D must act post-transcriptionally. Instead, an inverse approach may be more fruitful; if tra D can only evolve at high temperatures, then tra D -bearing flies reared at low temperatures should be more prone to sex reversal or intersexuality. Second, the evolution of Mdmd must be further explored to evaluate the transpositional activity of Mdmd itself or the supergene complex containing this gene. The complex arrangement of Mdmd copies in M-loci strongly suggests Mdmd may at one point have transposed frequently (Li et al., in prep), but it is undetermined whether this is still the case. Third and final, fitness assays of houseflies with Y-chromosomal versus autosomal Mdmd must be made to determine if there are any fitness effects linked to these genes. In vitro studies on houseflies strains that carry Mdmd on different chromosomes found that the Mdmd-bearing chromosome may exhibit some genetic differentiation (Meisel et al., 2020), and that these may be associated with fitness effects (Son et al., 2019). In vivo characterization is however required to validate these effects, and to determine their impact on the maintenance of M Y over M A as predicted in our model.