Environment-dependent epistasis increases phenotypic diversity in gene regulatory networks

Mutations to gene regulatory networks can be maladaptive or a source of evolutionary novelty. Epistasis confounds our understanding of how mutations affect the expression patterns of gene regulatory networks, a challenge exacerbated by the dependence of epistasis on the environment. We used the toolkit of synthetic biology to systematically assay the effects of pairwise and triplet combinations of mutant genotypes on the expression pattern of a gene regulatory network expressed in Escherichia coli that interprets an inducer gradient across a spatial domain. We uncovered a preponderance of epistasis that can switch in magnitude and sign across the inducer gradient to produce a greater diversity of expression pattern phenotypes than would be possible in the absence of such environment-dependent epistasis. We discuss our findings in the context of the evolution of hybrid incompatibilities and evolutionary novelties.


INTRODUCTION
The regulation of gene expression is essential for the spatiotemporal control of diverse biological functions. Gene regulation is mainly mediated by trans-regulatory proteins and protein complexes such as transcription factors and RNA polymerase, which target specific DNA sequences in cis-regulatory regions such as promoters and enhancers to modulate gene expression levels (1). Transcription factors often regulate their own expression levels, as well as the expression levels of other transcription factors, giving rise to gene regulatory networks (2). Gene regulatory networks drive fundamental physiological and developmental processes, such as the interpretation of morphogen gradients for spatial patterning during embryogenesis (3).
Given their central role in essential biological functions, it is crucial that gene regulatory networks are robust to genetic perturbation. Mutations in cis-regulatory regions that induce quantitative (e.g., DNA mutations that alter the affinity of a transcription factor binding site) or qualitative (e.g., DNA mutations that create or destroy a transcription factor binding site) changes to a gene regulatory network often do not change the network's spatiotemporal expression pattern phenotype (4,5). This robustness causes gene regulatory networks to be highly evolvable, because it facilitates the neutral accumulation of mutations (6)(7)(8)(9). This creates genetic diversity and promotes "system drift" (10), in which a population undergoes a series of quantitative and qualitative changes to a gene regulatory network that are phenotypically neutral (5,11). Subsequent mutations to, or recombination events among, such diverse network configurations can then generate phenotypic variation (7,12,13).
Mutations to gene regulatory networks are commonly implicated in evolutionary adaptations and innovations (13)(14)(15). There has been an intense research effort to understand the molecular details of these mutations and the mechanistic basis of how they alter gene expression pattern phenotypes (from here on referred to as pattern phenotypes) (16). A common observation is that the combination of two (or more) mutations can result in phenotypic effects that would not be expected based on an additive assumption of each single mutant's phenotypic contribution (17)(18)(19)(20). Such context dependence of mutational effects is called epistasis and is referred to as negative (positive) when the combined effects of mutations are less than (more than) expected based on their individual effects. Moreover, epistasis can itself be dependent on environmental conditions, such as the concentration of an expression inducer or an enzymatic cofactor (17,19,(21)(22)(23)(24)(25). For example, in the lambda phage promoter, a canonical gene regulatory system, Lagator and colleagues (17) found that 67% (14%) of 141 double mutants exhibited negative (positive) epistasis when the transcription factor that competes for binding with RNA polymerase was not expressed and that 58% of the double mutants switched from negative to positive epistasis (or vice versa) when the transcription factor was expressed. Epistatic interactions and their dependence on the environment are not only limited to pairs of mutations but can also occur among three or more mutations (18,26), a phenomenon known as higher-order epistasis (27).
Epistasis in gene regulatory networks has many causes including specific interactions across intermolecular interfaces, such as protein-protein (28,29) and transcription factor-DNA interactions (30). These interactions result in the typical nonlinearities inherent to competitive and cooperative binding of several transcription factors to the same target (31,32). In addition, the presence of feedback and feedforward loops in the network introduce additional epistatic effects that depend on the topology of the gene regulatory network (16,33).
Despite substantial progress in the characterization of epistasis in macromolecules (34)(35)(36)(37) and gene regulatory networks (17,38,39), it remains poorly understood how environment-dependent epistasis influences the spatiotemporal pattern phenotypes of gene regulatory networks. This is an important knowledge gap, because epistasis can constrain or facilitate evolvability (9,16), and the interpretation of chemical gradients (i.e., the environment) by gene regulatory networks is fundamental to developmental patterning (40,41), as well as other essential biological processes like chemotaxis (42). One of the reasons this knowledge gap persists is the difficulty of studying gene regulatory networks in situ, due to their embedding in large and complex cellular networks.
Synthetic biology offers a path forward (43,44). By extracting a gene regulatory network from its native cellular environment, it becomes feasible to systematically study the effects of mutations, individually and in combination, on the pattern phenotype of a gene regulatory network. We recently built and studied a synthetic threenode gene regulatory network, expressed in Escherichia coli, that produces a stripe pattern phenotype (low-high-low) along an inducer gradient-analogous to a morphogen gradient interpreted during embryogenesis (45). The network topology was based on the incoherent feedforward loop 2 motif (2), which drives numerous biological functions, such as blastoderm patterning in Drosophila (46). We previously used this synthetic gene regulatory network to study how mutation brings forth phenotypic variation in the network's pattern phenotype (39).
Here, we systematically combined mutations in the cis-regulatory regions of each of the network's three nodes, covering promoters and transcription factor binding sites, into pairwise and triplet combinations. We analyzed how the epistasis of pairwise and triplet combinations varies along the inducer gradient. In doing so, we provide the first study of how environment-dependent epistasis influences the pattern phenotype of a gene regulatory network. Our results reveal a context-dependent picture of epistasis in which genotype combinations exhibit diverse epistatic effects, including positive and negative epistasis, that can change drastically along the inducer gradient. In turn, this inducer-dependence of epistasis leads to more diverse pattern phenotypes in our network than would be expected if the mutations did not interact epistatically or depend on the environment. In the context of evolution, such increased diversity could facilitate adaptive evolution but may also underlie hybrid incompatibility and speciation.

A synthetic gene regulatory network provides an experimental system
We studied a synthetic gene regulatory network with three nodes, which we refer to as sensor, regulator, and output ( Fig. 1A) (39,45). Given the correct promoter activity and repression between nodes, the network produces a stripe pattern phenotype (low-high-low gene expression of the output node) along an inducer gradient, using the following mechanism (Fig. 1B): The sensor is activated by the inducer and represses the regulator and the output. The regulator also represses the output, but its activity decreases with increasing inducer concentration, due to repression by the sensor. Consequently, the output is the least repressed at intermediate inducer levels, resulting in high expression and the formation of a stripe along the inducer gradient. Our definition of a stripe pattern phenotype requires the output expression to be the highest at intermediate inducer levels but leaves room for variation in terms of the stripe shape, intensity, and overall expression (39).
We encoded the three nodes of the synthetic gene regulatory network on separate plasmids (one per node), which we expressed in E. coli. The arabinose-responsive promoter pBAD receives the inducer signal (arabinose) (Fig. 1C). The inhibitions are implemented by the transcriptional repressors TetR (tetracycline repressor) and LacI (lactose repressor) binding to their operator sites (TetO and LacO), which we placed downstream of the regulator and output promoters, respectively. The observable network output is expression of the superfolder green fluorescent protein (GFP) from the output node (47).
We previously introduced random nucleotide changes in the cisregulatory regions spanning the promoter and operator sequences separately in each of the network nodes (39). From this study, we selected 31 genotypes: the starting network ("wild-type," WT) and 10 mutant genotypes for each of the three network nodes (fig. S1). These genotypes were chosen from a pool of 77 genotypes. We required that the genotype robustly displayed a stripe phenotype in our experimental setup and that each mutant genotype contained one to three nucleotide changes in its cis-regulatory regions in only one of the nodes, either sensor, regulator, or output ( fig. S2). Of the genotypes meeting these requirements, we randomly chose 10 mutant genotypes for each node. The changes in promoter and repression levels ( fig. S3) resulted in quantitative variations in overall fluorescence level and shape of the stipe pattern ( Fig. 1D  and fig. S1). However, as the stripe phenotype is maintained, we consider the mutations as qualitatively neutral. We refer to the 30 mutant genotypes as sensor-1, regulator-1, output-1, sensor-2, etc. The variation in GFP expression along the gradient reflects the mechanistic role of the mutated node in the network. For example, sensor genotypes showed high variation at higher inducer concentrations, caused by a weaker promoter and/or lower sensitivity towards the inducer, which resulted in a lower repression of the output node at high inducer concentrations. In contrast, output genotypes showed variation along the gradient, which is caused by changes in their promoter and operator strengths.
To study epistasis between genotypes, we systematically combined all 3 × 10 mutant genotypes to generate all possible 300 (3 × 10 × 10) pairwise and 1000 (10 × 10 × 10) triplet combinations by transforming the plasmid combinations into E. coli cells (Fig. 1E). We cultured cells in 384-well plates and measured GFP expression of all 300 pairwise and 1000 triplet genotypes in triplicate at low (0%), medium (0.0002%), and high (0.2%) inducer concentrations using a fluorescence spectrophotometer (Fig. 1B). Fluorescence values correlated well between the three replicate measurements of pairwise and triplet genotypes (R 2 of 0.88 for all pairwise and 0.95 for all triplet combinations measurements; fig. S4A). The coefficient of variation was similar between single, pairwise, and triplet genotypes ( fig. S4B). In addition to the measurements at three inducer concentrations, we also assayed all single mutant genotypes [i.e., the 30 genotypes selected from (39)] and 40 pairwise and triplet genotypes at 16 inducer concentrations ( fig. S5). These measurements over 16 inducer concentrations confirmed that the measurements at three inducer concentrations capture the pattern phenotypes well ( fig. S4C).

Most genotype combinations exhibit epistasis
We used a multiplicative model to determine whether pairwise or triple combinations exhibited epistasis. In this model, the null hypothesis is that the fold-change in gene expression of the pairwise or triplet combinations is identical to the product of the fold-change observed in single mutant expression levels. This way, epistasis is defined as the deviation of the observations from this product. The multiplicative model is commonly used to detect epistasis in gene regulatory systems (17)(18)(19)38), as well as in other systems ( Fig. 2A) (48,49). The multiplicative model is also referred as log-additive, since it becomes additive when the concentrations are transformed to logarithmic scale.
Concretely, we first calculated the observed fold-change in fluorescence relative to the WT (g obs  ; log 10 ½g expected ði;j;kÞ � ¼ log 10 ðg obs i Þ þ log 10 ðg obs j Þ þ log 10 ðg obs k Þ ð2Þ ( fig. S6C). Next, we compared these expected values predicted from the single mutant genotypes with the actual observed GFP expression for pairwise or triplet combinations to calculate the magnitude of epistasis as where G observed is the logarithm of the observed fluorescence fold change of the mutant with respect to the WT [log 10 g observed ði;jÞ and log 10 g observed ði;j;kÞ for pairwise and triplet combinations, respectively]. Note that this definition of triplet epistasis measures the difference between the expression of the observed triplet expression and the expected expression from the multiplicative model using single mutants. One can also isolate the third-order epistatic effects present in the triplets that cannot be predicted from pairwise epistasis (26,50). Detailed derivations and relationships of the different epistasis contributions can be found in Material and Methods (Eqs. 8 to 10).
The magnitude and sign of the parameter ɛ measure the strength and sign of epistasis ( Fig. 2A). In particular, we defined epistasis values as significant if ɛ deviates from 0 with a false discovery rate (FDR) adjusted P value of <0.05 ( Fig. 2B) (26,48). From all measurements of pairwise combinations at all inducer concentrations, we found that 53% (476 of 900) resulted in significant epistasis (FDR adjusted P value of <0.05) (Fig. 2C). Similarly, 57% (1726 of 3000) of triplet combinations resulted in significant epistasis (FDR adjusted P value of <0.05) (Fig. 2C). Of the significant epistatic pairwise combinations, 82% were negative and only 18% were positive, whereas for the triplet combination, 88% were negative and only 12% were positive (Fig. 2C). Thus, most mutant genotype combinations resulted in lower GFP expression than expected.
Epistatic interactions can be classified depending on the phenotypic effects of each single mutant genotype and their combinations. For example, in magnitude epistasis, the expression level associated to a genotype, but not its sign, changes with the genetic background more or less than would be expected under additivity. In contrast, if a genotype has the opposite effect when in combination with another genotype, i.e., it changes the sign of its relative effect, it is called sign epistasis. Reciprocal sign epistasis (RSE) is a special case of sign epistasis, in which each single genotype has the opposite effect when combined with other genotypes (33). Notably, most of the 2202 cases of significant epistasis for pairwise and triplet genotypes (n = 1230) could be attributed to negative RSE (Fig. 2E). In our case, single mutant genotypes had a higher gene expression than the WT network, but in combination, they had a lower gene expression than any of the single mutant genotypes and the WT, resulting in negative RSE. These observations are in line with prior work in other systems, which uncovered epistasis among components that interact functionally (17,38,51,52) and physically (29,30). In sum, most interactions are epistatic in our system, with a preponderance of negative epistasis and negative RSE.
For the triplet combinations, we also calculated the exclusive third-order epistasis (Eqs. 8 to 10; fig. S7). In contrast to the total epistasis of triplet combinations (Fig. 2), this is the epistasis remaining when taking into account the epistasis already present in the pairwise interactions. We found that there is no trend in the contribution of exclusive third-order effects to the total epistasis of triplet combinations as there is no correlation between triplet epistasis and third-order effects ( fig. S7A). Consequently, the trends observed in the total epistasis in triplet combinations can be mostly attributed to the epistasis of pairwise combinations ( fig. S7B), in line with previous work showing that higher-order epistasis decreases with increasing order (53)(54)(55). Moreover, the exclusive third-order epistasis is negatively correlated with the contribution from pairwise combinations ( fig. S7C). In other words, the exclusive thirdorder epistasis tends to decrease the magnitude of the total epistasis, suggesting that epistatic effects cannot be added in a boundless manner and are controlled by the architecture of the gene regulatory network.

Epistasis depends on the genetic background
Next, we compared the effects of the mutated nodes across the complete set of genetic backgrounds. For this, we plotted the values of ɛ of all pairwise and triplet combinations for each of the 30 mutant genotypes separately and calculated the mean and variability of epistasis (Fig. 3). Notably, we found that every genotype displayed both negative and positive epistasis, depending on the genetic background. Most genotypes have a negative mean value of ɛ, the exceptions being combinations of regulator-10, output-2, and output-10, whose mean ɛ was slightly positive (Fig. 3, A and B). In sum, these results show that epistasis is highly idiosyncratic in our system (56) because the same mutation tends to have different effects in different genetic backgrounds.
To quantify the variability of epistasis, we calculated the coefficient of variation (CoV: SD normalized by the mean) of epistasis for each genotype (Fig. 3C). The variability of epistasis decreased from sensor to regulator to output node, particularly at low and medium inducer concentrations, suggesting that the variability of ɛ was dependent on which network component was mutated ( fig. S8). This may be explained by the components' positions in the regulatory hierarchy, which is known to influence epistasis (51).

Epistasis is inducer-dependent
Next, we asked whether inducer concentrations influence the sign and variability of epistasis values. For this, we plotted the mean of epistasis values at low (0%), medium (0.0002%), and high (0.2%) inducer concentrations separately (Fig. 3D). We found that the epistasis values increased, i.e., became more positive, with the inducer concentration for all three network nodes (figs. S9 and S10).
To test whether epistasis is indeed different at different inducer concentrations, we plotted the epistasis for each genotype of all pairwise and triplet combinations at low concentration against epistasis at medium and high concentrations, as well as the epistasis at medium concentration against epistasis at high concentration and performed a t test with FDR adjusted P values. Specifically, we tested whether the epistasis is independent of the inducer concentration (null hypothesis), defining epistasis as inducer-dependent if the epistasis at any concentration is significantly different from the epistasis at any of the two other concentrations (FDR q value is <0.1) ( fig. S11). At this significance cutoff, 37% of pairwise (111 of 300) and 45% of triplet (447 of 1000) combinations exhibited significant inducer-dependent epistasis.
To further explore the inducer-dependence of epistasis in our dataset, we explored how ɛ changed from low, medium, to high inducer concentrations for each genotype combination. To do so, we classified the changes in epistasis along the inducer concentration into four categories, depending on which inducer concentration the epistasis value was highest or lowest ( Fig. 4): (A) with epistasis being highest at medium inducer concentrations; (B) with epistasis increasing with increasing inducer concentrations; (C) with epistasis decreasing with increasing inducer concentrations; (D) with epistasis being lowest at medium inducer concentrations. We quantified the distribution of pairwise and triplet combinations into these four categories. We found that of all 300 pairwise combinations 33% fell into category (A), 31% into category (B), 18% each into categories (C) and (D). For the 1000 triplet combinations, this distribution changed to 21% falling into category (A), 59% into category (B), 5% into category (C), and 15% in category (D). Thus, our results show that epistasis changes drastically with inducer concentration.
Next, we asked whether epistasis also switched sign between inducer concentrations. For this, we subclassified the four categories into three different scenarios: remaining negative, remaining positive, or switching sign at different inducer concentrations (Fig. 4,  right part). Overall, we found that 31% of the pairwise combinations switched sign along the inducer concentrations, and 13% remained always positive and 56% always negative. For the triplet combinations, we found that 21% switched sign and 10% remained always positive and 69% always negative. In sum, these results demonstrate that epistasis is environment-dependent in our system, changing in magnitude and sign along the inducer gradient.

Inducer-dependent epistasis increases phenotypic diversity
We have shown that epistasis depends on the inducer concentration in our system. Since the function of the network is to interpret an inducer gradient into a gene expression pattern, we next analyzed how this inducerdependence of epistasis affected the network's pattern phenotype. To this end, we characterized the pattern phenotype of all single, pairwise, and triplet genotypes. For this, we calculated the difference in gene expression between low and medium arabinose concentrations and between high and medium arabinose concentrations (Fig. 5A) (39). This approach allows for a simple visualization on a Cartesian plot and classification of pattern phenotypes into four categories for each one of the four quadrants: increase (Q1), anti-stripe (Q2), decrease (Q3), and stripe (Q4). Projections near the origin correspond to a constant expression phenotype.
We first plotted the pattern phenotypes of the WT and all 30 single mutant genotypes (Fig. 5B, left). As expected, they all fell into the stripe category (Q4). We further examined the pattern phenotypes of all 300 pairwise (Fig. 5B, middle) and 1000 triplet (Fig. 5B, right) genotypes. On the basis of the multiplicative model, most of the pairwise (n = 289 of 300) and triplet (n = 920 of 1000) genotypes were also expected to display a stripe pattern phenotype, with few exceptions (11 pairwise and 80 triplet) that were expected to adopt an increase pattern phenotype. These cases were the result of combinations of single mutant genotypes

S C I E N C E A D VA N C E S | R E S E A R C H A R T I C L E
with high GFP expression at high inducer concentrations (e.g., sensor-3, sensor-7, sensor-10, regulator-5, and regulator-6) or low GFP expression at low inducer concentrations (e.g., regulator-1 and output-10) (fig. S1). However, the observed phenotypes were more frequently nonstripe patterns than expected. Specifically, for pairwise and triplet combinations, we observed 22 (11 expected versus 33 observed) and 237 (80 expected versus 317 observed) more increase pattern phenotypes, respectively (Fig. 4B). In addition, observed pairwise and triplet combinations also each showed one decrease phenotype.
In addition to the changes of distribution into the pattern phenotype categories, we also observed that the pattern phenotypes within one category were more diverse than expected from the multiplicative model. For example, several genotypes with epistasis highest at medium inducer concentration had a stronger stripe pattern phenotype than the WT, i.e., a bigger difference between lowest and highest GFP expression levels ( fig. S11). To quantify and compare this spread of pattern phenotypes, we calculated the Euclidean distance between the pattern of the WT and the pattern of each genotype (schematic Fig. 5C). We found that, for most genotypes, the observed mean distance to the WT was significantly higher than expected, both for pairwise (0.24 versus 0.18, paired t test P value of <0.001) and triplet combinations (0.36 versus 0.23, paired t test P value of <0.001) (Fig. 5D). All four epistasis categories in Fig. 4 contribute to this increased phenotypic diversity (fig. S12). Thus, we conclude that environment-dependent epistasis causes a greater diversity of pattern phenotypes than would be expected if the mutations did not interact epistatically or depend on the environment.

DISCUSSION
Mutations in cis-regulatory regions can alter the spatiotemporal gene expression patterns of gene regulatory networks. Such alterations are often deleterious, resulting in developmental abnormalities, disease, or death (57,58). However, they are occasionally advantageous, as evidenced by their common implication in evolutionary adaptations and innovations (13,14). Here, we used the toolkit of synthetic biology to systematically interrogate how combinations of mutations in the cis-regulatory regions of a three-gene

S C I E N C E A D VA N C E S | R E S E A R C H A R T I C L E
regulatory network interact to influence the network's spatial pattern phenotype. We uncovered pervasive epistasis, particularly negative epistasis and negative RSE. Moreover, epistasis depended on the environment, because it varied across the inducer gradient. Last, we showed that this inducer-dependent epistasis resulted in a greater diversity of pattern phenotypes than would be expected if the mutations did not interact epistatically and their phenotypic effects did not depend on the environment.
Prior work has shown that epistasis can either constrain or facilitate the evolution of phenotypic diversity. For example, in individual macromolecules such as DNA, RNA, and proteins, RSE has been shown to constrain the evolution of phenotypic diversity by forming maladaptive valleys in fitness landscapes, which can trap evolving populations on suboptimal adaptive peaks and preclude the generation of further adaptive phenotypic variation (9,59). However, when such macromolecules interact, epistasis among mutations in the interacting components can alleviate the constraints of the individual components, facilitating the evolution of phenotypic diversity (17,30). Lagator and colleagues (38) provide a representative example. They studied the phenotypic effects of mutations in the canonical Lambda bacteriophage switch, a regulatory network consisting of RNA polymerase, a transcriptional repressor, and a cis-regulatory element to which both proteins bind. Their study showed that the phenotypic variation induced by combinations of mutations in the three interacting components was greater than that induced by mutations in the individual components. Moreover, the amount of phenotypic variation brought forth by combinations of mutations in the interacting components was different in the presence and absence of the transcriptional repressor, thus revealing environment-dependent epistasis in this system (17). Our study complements and extends this work by studying environment-dependent epistasis in a larger regulatory network in which the mutations can interact functionally, but not physically (since mutations are only in cis-regulatory regions), to influence phenotypic diversity in a spatial pattern phenotype. While we can explain some of these diverse expression patterns intuitively ( fig. S13), further work is required to develop a mechanistic understanding, particularly of how network topology modulates the influence of epistasis on expression pattern phenotypes. It is our hope that the data reported here will be used to this end, for example, by parameterizing biophysical models of gene regulatory networks.
Our experiments also shed light on the robustness of gene expression pattern phenotypes to recombination, and how recombination generates novel phenotypes in gene regulatory networks. Prior work with computational models of gene regulatory networks compared the pattern phenotypes of recombinant offspring derived from parental networks that have the same phenotype to the pattern phenotypes of mutated offspring derived from these same parents (12). Recombination was far less likely to cause a change in pattern phenotype than mutation. For example, more than 90% of recombinant offspring that differed from their parent by one regulatory interaction preserved the parental phenotype, as compared to only~75% of mutated offspring that differed from their parent by one regulatory interaction. These differences in the robustness of pattern phenotypes to recombination and mutation only increased as the difference in the number regulatory interactions between parent and offspring increased. Our study provides experimental support for these findings, at least qualitatively. Specifically, in our previous work (39), we found that 64.3% of mutations to our regulatory network in one node and 93.8% of mutations in two to three nodes resulted in a nonstripe phenotype, whereas here we found that only 11.7 and 31.8% of pairwise and triplet genotype combinations resulted in a nonstripe phenotype, respectively (fig. S14). Further, we found that when recombination does generate a novel phenotype, this can be partly explained by environment-dependent epistasis. Without such epistasis, recombination is far more likely to preserve the parental pattern phenotype, as evidenced by comparisons with our null model. Our study thus supports the theoretical prediction of a low cost to recombination in creating novel phenotypes in gene regulatory networks (60) and highlights environment-dependent epistasis as one cause of phenotypic novelty in recombinant offspring.
The robustness of a gene regulatory network's pattern phenotype to mutation and recombination facilitates so-called system drift (10), in which an evolving population accumulates diversity in the regulatory and coding regions of the networks' constituent components without causing a change in phenotype. System drift can facilitate the evolution of phenotypic novelties, because the resulting genetic diversity may serve as the basis for subsequent mutations or recombination to bring forth novel phenotypes, or it may be revealed as phenotypic variation upon environmental change (9). Modeling work has long suggested that gene regulatory networks are susceptible to system drift, because many different mutationally connected networks have the same expression phenotype (7,8,11,61), and recent empirical work has demonstrated a role for system drift in the evolution of biofilm formation in the fungus Candida albicans (62). Our work bridges these theoretical and empirical studies, using synthetic gene regulatory networks to experimentally interrogate the phenotypic effects of mutation and recombination, confirming the susceptibility of regulatory networks to system drift and the constructive role of the resulting genetic diversity in the evolution of novel phenotypes. A key finding of our study is that such novelties are at least partly explained by epistatic interactions among network components, which can change in magnitude and sign along an environmental gradient, thus altering gene expression levels across the spatial domain.
Our finding that environment-dependent epistasis can cause novel pattern phenotypes in recombinant offspring is germane to a growing body of literature on hybrid incompatibilities in gene regulatory networks (63)(64)(65)(66)(67)(68)(69)(70)(71). These modeling studies highlight several factors that influence the evolution of such hybrid incompatibilities, including population structure (64), population genetic conditions (69,70), mutational target size (67,68), network topology (66), and whether selection is directional or stabilizing (64). Under stabilizing selection, where system drift can occur, these models suggest that hybrid incompatibilities are most likely to arise when many network variants that have the same pattern phenotype are mutationally connected with one another, forming so-called genotype networks (72) or neutral networks (73), and when at least some mutations in network components interact epistatically. Genotype networks facilitate system drift, because an evolving population can spread across the network while preserving pattern phenotype (72), whereas epistasis can create maladaptive "holes" in these networks, into which recombinant offspring may fall (74). Our work here and in previous studies (39,45,75) provides experimental support that gene regulatory networks form genotype networks and demonstrates that epistasis is not only prevalent among mutations in network components but also depends on the environment. Thus, our results are in line with prior modeling work (65,66,68,70,71), which highlights the role of system drift in causing hybrid incompatibilities in recombinant regulatory networks. Whereas we observe that most recombinant offspring preserve the parental stripe phenotype, we emphasize that the genetic diversity in our parental population is limited. Recombination among a more diverse pool of parental networks may reveal a larger fraction of nonstripe phenotypes. We emphasize further that the synthetic gene regulatory network studied here, expressed in E. coli cells using plasmids, is a highly stylized representation of the complex regulatory networks of multicellular organisms. While it is exactly this reduced complexity that allowed us to study epistasis with minimal confounding factors, we suggest caution in extrapolating our findings to more complex regulatory networks, in which the phenotypic and fitness effects of mutations may be buffered by downstream network components (76).
In sum, we used the toolkit of synthetic biology to perform a systematic analysis of the combined effects of mutations that are individually phenotypically neutral on the pattern phenotype of a gene regulatory network, uncovering pervasive epistasis that changed in magnitude and sign along an inducer gradient, thus giving rise to novel spatial gene expression patterns that in natural gene regulatory networks may cause hybrid incompatibilities or embody evolutionary innovations. Such environment-dependent epistasis therefore strongly influences the evolution of gene regulatory networks, with implications for our understanding of speciation and evolutionary novelty.

Materials
Chemicals and media components, unless stated otherwise, were purchased from Sigma-Aldrich.

The synthetic regulatory network and selected mutants
Our model system, a synthetic stripe-forming regulatory network based on the incoherent feedforward loop type 2 architecture, was constructed and characterized previously (45). Each of the three nodes is encoded on a separate plasmid, and the WT sequences are available at the National Center for Biotechnology Information GenBank (www.ncbi.nlm.nih.gov/genbank/). The GenBank accession codes are KM229377 (sensor, pCOLA backbone, kanamycin resistance marker), KM229382 (regulator, pCDF backbone, spectinomycin resistance marker), and KM229387 (output, pET backbone, ampicillin resistance marker). For a functional network to generate an inducer-dependent gene expression pattern, all three plasmids need to be transformed into E. coli MKO1 cells (77). Mutated networks were selected from a previous study, in which we introduced mutations in the promoter and operator region of each node (39). Each single mutant genotype contains only mutations in one of the three plasmids, whereas the other two plasmids did not contain mutations. Figure S2 shows the nucleotide changes in promoter and operator regions of the 30 selected mutant genotypes.

Generating pairwise and triplet genotype combinations
Plasmids of selected mutants were extracted and purified using the QIAprep Spin Miniprep Kit (Qiagen) according to the manufacturer's protocol. Each mutant selected from our previous study (39) contains one mutated and two WT plasmids. To extract only the mutated plasmids, we first removed the WT plasmids with restriction digest, retransformed the single plasmids into E. coli NEB5α cells, plated and cultured the cells with the appropriate antibiotics, and extracted the single plasmids. Xho I was used to isolate pCOLA plasmids (sensor), Aat II and Xma I were used to isolate regulator pCDF plasmids (regulator), and Xba I was used to isolate pET plasmids (output).
We transformed the selected 10 sensor, 10 regulator, and 10 output mutants in all possible pairwise and triplet combinations into chemical competent MK01 E. coli cells. To generate pairwise combinations, we transformed the two mutant plasmids into competent cells that already contained the third WT plasmid. For the triplet combinations, we transformed the sensor and regulator mutant plasmids (10 × 10 = 100) into competent cells that already carried 1 of the 10 output mutant plasmids. Combinatorial libraries were cultured overnight in 96-deep-well plates using selective LB medium and stored in 96-well plates at −80°C as glycerol stocks. Each 96-well plate contained 50 mutant genotype combinations and three wells with the WT variant and a well containing only media for data normalization purposes as described below.

Fluorescence measurement at 16 inducer concentrations
We measured the pattern phenotype over a gradient of 16 inducer concentrations for the WT, 30 single and 40 selected pairwise and triplet genotypes. We performed the measurements as follows: Starting from a glycerol stock, we inoculated three 5-ml cultures [LB medium containing 0.4% (w/v) glucose and antibiotics] for each genotype, which served as our biological replicates and were from this point on treated independently. They were cultured overnight at 37°C and 200 rpm shaking. The following morning, we inoculated a fresh culture of 5 ml with 200 μl of the overnight culture followed by incubation for 3 hours at 37°C and 200 rpm shaking. From these precultures, we used 5 μl to inoculate 384-well plates (Sigma-Aldrich, Nunc, flat-bottom) containing 55 μl of SM media per well with inducer concentrations of 0.2, 0.1, 0.05, 0.025, 0.0125, 0.00625, 0.00313, 0.0016, 0.0008, 0.0004, 0.0002, 0.0001, 0.00005, 0.000025, and 0.000012% (w/v) arabinose. Immediately after inoculation, 384-well plates were covered with clear lids to reduce evaporation and loaded into plate readers (Biotek Synergy H1) and measured as described below.

Fluorescence measurement of libraries
Measurement of the complete pairwise and triplet mutant libraries were performed as follows: Starting from a glycerol stock plate, we inoculated three 96-well plates, which served as our biological replicates and were from this point on treated independently. Each 96well plate contained one well with media only and three wells with the WT network, which were used for data normalization as described below. Each well contained 120 μl of stripe LB medium containing 0.4% (w/v) glucose and antibiotics, and plates were cultured overnight at 37°C and 700 rpm (THERMOstar, BMG Labtech). From each overnight culture, 5 μl was used to inoculate a 120-μl selective LB medium preculture, which was incubated for 3 hours at 37°C and 700 rpm (THERMOstar, BMG Labtech). From the preculture plate, we used 5 μl to inoculate each of four different wells of a 384-well plate (Sigma-Aldrich, Nunc, flat-bottom) each containing 55 μl of SM containing: 1) 0% arabinose ("low") 2) 0.0002% arabinose ("medium") 3) 0.2% arabinose ("high") 4) 0.2% arabinose + 700 μM IPTG ("metabolic load" control) Position A1 of a 96-well plate was used to inoculate position A1 (low), A2 (medium), B1 (high), and B2 (metabolic load) of a 384well plate. The pipetting steps to inoculate 384-well plates were carried out using a semi-automatic pipetting robot (Rainin Smart 96, Mettler Toledo). Immediately after inoculation, 384-well plates were covered with clear lids to reduce evaporation and loaded into plate readers (Biotek Synergy H1) and measured as described below.

Plate reader assay and data normalization
Microplates (96-well and 384-well) were incubated in plate readers (Biotek Synergy H1) with clear lids to reduce evaporation and shaken continuously in double orbital mode at a 2-mm radius and monitored for cell growth (optical density at 600 nm) and green fluorescence (excitation: 485 nm, emission: 520 nm) every 10 min at 37°C. Approximately after 3 hours, GFP expression peaked and E. coli cells started to reach stationary phase. As described previously (45), the time point when the fluorescence of the WT network at the medium arabinose concentration (0.0002%) peaked was chosen for further analysis of all fluorescence measurements. Fluorescence measurements were corrected for media background fluorescence and variation in cell number by dividing fluorescence with absorbance values. To adjust for plate-toplate variation, we first normalized the three replicate plates based on the average of the three WT replicates on each plate. We then calculated the mean and SD from the replicates and normalized again using the WT values from each plate to adjust for variation between all plates. The final data represent the average of three replicates, independent cultures started from the same glycerol stock, and errors correspond to the SD between replicates.
Replicate measurements were excluded if values of cell growth differed by >0.2 from the absorbance of the WT controls on the same plate for any of the four conditions or suffered from metabolic load [as described previously (39)]. Measurements were repeated if more than two replicates failed. Twenty-one of the 1300 genotypes miss one replicate measurement due to growth differences. Note that we did not observe any metabolic load for any of the genotypes and combinations.
Correlations of R 2 between replicates and between measurements of three and 16 inducer concentrations were calculated with Prism (version 9.4.0, GraphPad Software LLC) using linear regression.

Calculating and defining significant epistasis
On the basis of a multiplicative model of epistasis, we calculated epistasis for all pairwise and triplet combinations at three inducer concentrations [low (0%), medium (0.0002%), and high (0.2%)] (17)(18)(19)38). First, on the basis of the normalized fluorescence values, we calculated the relative fold change in fluorescence (g obs i ) for each of the 30 single mutant genotypes at each inducer concentration with respect to WT, shown in fig. S6A Note that for triplet mutants, the value of ɛ (i,j,k) measures the deviation between the expected fold change in fluorescence of the multiplicative model using single mutants and the measured fluorescence in triplets. Alternatively, one can also attempt to identify the epistatic effects intrinsic to triplets, not present in pairwise combinations, i.e., exclusive third-order epistasis. Under the multiplicative model, and including pairwise measurements, we can calculate the expected triplet fluorescence as the sum of the contributions of each pair minus the contributions of the single mutants (that otherwise would be counted twice) G exp ði;j;kÞ ¼ log 10g exp ði;j;kÞ ¼ log 10 g obs ðj;kÞ þ log 10 g obs ði;kÞ þ log 10 g obs ði;jÞ À log 10 g obs ðiÞ À log 10 g obs ðjÞ À log 10 g obs This allows to define the exclusive third-order epistasis as ɛ ði;jÞ ¼ G obs ði;j;kÞ ÀG exp ði;j;kÞ ¼ log 10 g obs ði;j;kÞ À log 10g exp ði;j;kÞ ð9Þ S C I E N C E A D VA N C E S | R E S E A R C H A R T I C L E Combining Eqs. 6 to 9, we can write an expression for the total epistasis ɛ (i,j,k) of triple combinations in terms of the pairwise epistasis ɛ (i,j) and the exclusive third-order epistasisɛ ði;j;kÞ (26,50) ɛ ði;j;kÞ ¼ɛ ði;j;kÞ þ ½ɛ ði;j;Þ þ ɛ ðj;kÞ þ ɛ ðk;iÞ � ð10Þ To define significant epistasis, we first calculated the uncertainty for each G expected by propagating the log-transformed SD (σ) from the single mutant measurements σ expected ¼ x ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi and σ expected ¼ x ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi σ 2 mut1 þ σ 2 mut2 þ σ 2 mut3 q ð12Þ for pairwise and triplet combinations, respectively, with x as the mean value of G expected . Using the mean of triplicate measurements and their SD or propagated errors, we then defined significant ɛ through series of t tests (R function "tsum.test" with n.y and n.x = 3, alternative = "two.sided," var.equal = TRUE). The resulting P values were then corrected for multiple testing using the "qvalue" package in R with its base parameters (78). We defined epistasis values as significant if ɛ deviates from 0 with a FDR adjusted P value of <0.05 (Fig. 2B) (26,48,79).
To define significant epistasis between low and medium inducer levels for each mutant genotype combination, we first propagated the error of observed and expected values as follows σ epistasis ¼ x ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi σ 2 obs þ σ 2 exp q ð13Þ with x as the epistasis value and σ the SD of G observed and G expected , for each inducer concentration separately. Significant differences of epistasis between low, medium, and high inducer levels were calculated with a series of t tests (Welch's test) and corrected for multiply testing with the Benjamini-Krieger-Yekutieli method in Prism (version 9.4.0, GraphPad Software LLC). We defined significance values below a FDR q value of <0.1.

Classifying types of epistasis
We classified epistatic interactions depending on the phenotypic effects of each single mutant genotype and their combinations into magnitude, sign, and RSE (80). Their definition is schematically illustrated in Fig. 2D. Briefly, magnitude epistasis is defined when the combined phenotypic effect of a genotype combination deviates from the expected effect but does not change the sign of the phenotypic effect of each single mutant. For instance, if two single mutant genotypes have a higher gene expression than the WT and their combination results in an even higher than expected gene expression, it is defined as positive magnitude epistasis. If their combined effect is lower than expected but remains higher than any of the two single mutant genotypes, this is defined as negative magnitude epistasis. A combination is classified as sign epistasis when the combined effect is lower than one of the two mutants and thus changes the sign. For example, if two single mutant genotypes have a higher gene expression than the WT and their combination results in a value which is lower than one but still higher than the other single mutant genotype. A special case of sign epistasis, RSE, occurs when the combined phenotypic effect has the opposite effect compared to any of the single mutant genotype's effect. For example, RSE occurs when two single mutant genotypes have a higher gene expression than the WT, but their combined effect is lower than any single mutant genotype or even lower than the WT.

Pattern phenotype analysis
We visualized the pattern phenotypes on a Cartesian plot (39). To this end, we calculated the difference in gene expression between low and medium arabinose concentrations for and between high and medium arabinose concentrations for all single mutant genotypes and observed and expected pairwise and triplet genotype combinations. The values of M x and M y for each genotype were then plotted as Cartesian coordinates and classified into stripe (Q4), decrease (Q3), anti-stripe (Q2), and increase (Q1) pattern phenotypes, corresponding to the four quadrants. Projections near the origin correspond to a constant expression phenotype, i.e., a flat pattern phenotype.
To quantify and compare the spread of pattern phenotypes between observed and expected mutant genotypes, we calculated the Euclidean distance ‖M exp − M obs ‖ between the pattern of the WT and the pattern of each genotype as follows kM exp À M obs k ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ðM exp x À M obs x Þ 2 þ ðM exp y À M obs y Þ 2 q ð16Þ Significant differences between observed and expected Euclidean distance values were calculated using a two-tailed paired t test assuming no Gaussian distribution (Wilcoxon matched-pairs signed-rank test) in Prism (version 9.4.0, GraphPad Software LLC).

Supplementary Materials
This PDF file includes: Figs. S1 to S14 Legend for source data Other Supplementary Material for this manuscript includes the following: Source data View/request a protocol for this paper from Bio-protocol.