Mechanistic causes of constrained phenotypic variation revealed by synthetic gene regulatory networks

Phenotypic variation is the raw material of adaptive Darwinian evolution. The phenotypic variation brought forth by organismal development is biased towards specific phenotypes, but the molecular mechanisms behind such biases are still poorly understood. Here we study these biases in two synthetic gene regulatory networks expressed in E. coli that produce a gene expression stripe, which is a pivotal pattern in embryonic development. The two gene regulatory networks display the same phenotype, but create it through different regulatory mechanisms. Because these regulatory mechanisms are well understood, we can use a combination of experimental measurements, mathematical modelling and DNA sequencing to understand why mutations bring forth only some but not other novel gene expression phenotypes. Our observations emphasize the interactions between genes in a regulatory network as a mechanistic cause of biased phenotype production.


Introduction
The ability of biological systems to bring forth novel and beneficial phenotypes as a 3 consequence of genetic mutations is essential for evolutionary adaptation and innovation. This 4 ability is encapsulated in the concept of evolvability (Kirschner and Gerhart, 1998;Wagner, 5 2005b). Evolvability can be limited by evolutionary constraints, which are biases or limitations 6 in the production of novel phenotypes (Smith et al., 1985). An example of such constraints 7 comes from laboratory selection experiments with butterfly populations for enhanced wing 8 eyespot colours (Allen et al., 2008). Selection was able to increase the amount of black or gold 9 colouring in the two eyespots simultaneously, but was unable to do so for the two different 10 colours independently in the two eyespots. Constrained variation can have multiple genetic and 11 developmental causes that can be difficult to disentangle in a complex developing organism 12 (Arnold, 1992;Wagner, 2011). Therefore, few experimental demonstrations of evolutionary 13 constraints exist. What is more, thirty years after this concept rose to prominence (Smith et al., 14 1985), we still do not understand the mechanistic causes of evolutionary constraints. 15 16 The instructions for an organism's development are encoded in gene regulatory networks 17 (GRNs)networks of interacting transcription factors that control gene expression in both time 18 and space (Davidson, 2006). Mutations in the cis-regulatory regions of GRNs play an important 19 part in evolutionary adaptation and innovation (Payne and Wagner, 2014; Prud'homme et al., 20 2007; Wray, 2007). Examples include the evolution of the vertebrate spine (Guerreiro et al.,21 2013), of wing pigmentation in butterflies  and of hindwing 22 reduction in flies (Carroll et al., 2001). GRNs are thus primary candidates for systems that might 23 lead to the production of constrained variation (Gompel and Carroll, 2003;Sorrells et al., 2015). 24 However, no experimental work exists to find out whether GRNs might constrain novel gene 25 expression phenotypes, and what the mechanistic causes of such constraints might be. These 26 questions require us to study the relationship between genotypic and phenotypic changes in 27 GRNs. Computational models of gene regulation provide one avenue to understand such  Payne and Wagner, 2015). However, experimental 1 validation of the latter prediction is still lacking. 2 3 To help fill these gaps in experimental evidence, we here use the toolbox of synthetic biology. 4 It allows us to create novel GRNs by assembling well-characterised parts. We are therefore no 5 longer limited to studying GRNs in situ, that is, in one or few well-studied organisms where 6 influences of genetic background or environment may be difficult to control. Instead, we can 7 construct and modify GRNs to understand the properties and potential of GRNs to create novel . We previously built multiple 3-gene synthetic networks that display the same gene 10 expression phenotype, but create this phenotype through different regulatory mechanisms - 11 different regulatory dynamics and regulatory interactions among network genes (Schaerli et al., 12 2014). Their phenotype is a "stripe" of gene expression (low-high-low) along a spatial axis in 13 response to a chemical concentration gradient that is analogous to a morphogen gradient in 14 development. A GRN's ability to "interpret" such a gradient by producing such stripes is crucial 15 in the development of many organisms and body structures, such as axial patterning of the 16 Drosophila embryo and vertebrate neural tube differentiation (Lander, 2007; Rogers and 17 Schier, 2011; Sagner and Briscoe, 2017; Wolpert, 1996). The question of which regulatory 18 mechanisms can produce stripes is therefore itself crucial for developmental genetics (Cotterell 19 and Sharpe, 2010). Here we go beyond this question to ask whether different GRNs that have 20 the same phenotype (a "stripe" of gene expression) can produce different novel gene expression 21 phenotypes in response to mutations, and if so, why.

23
Specifically, we use here two synthetic circuits that employ different regulatory mechanisms to 24 produce a striped gene expression pattern. Both of these circuits are hosted by Escherichia coli 25 bacteria. When these bacteria are grown as a lawn in the presence of a concentration gradient 26 of the morphogen analogue, they display a spatially striped gene expression pattern (Fig. 1 c). 27 We introduced random mutations into the regulatory regions of these circuits, and analysed the 28 resulting phenotypes. The two circuits indeed produce a different spectrum of novel gene 29 expression phenotypes. That is, the gene expression variation they produce is constrained. To 30 identify the mechanistic causes of these constraints, we combined experimental DNA sequence 31 and phenotypic data with a mathematical model of gene expression dynamics.  Fig. 1 shows the topologies (Fig. 1a) and the molecular implementations (Fig. 1b) of our two 6 starting networks, which we had constructed and characterized previously (Schaerli et al., 7 2014). Briefly, their regulatory input is the sugar arabinose, which serves as a molecular 8 analogue of a developmental morphogen. The arabinose is sensed by the arabinose-responsive 9 promoter pBAD that acts in a concentration dependent-manner. The observable network output 10 is fluorescence, which is produced by superfolder green fluorescent protein (GFP) (Pedelacq et 11 al., 2006). Positive regulatory interactions are encoded by T7 and SP6 phage RNA polymerases 12 (RNAPs), which start transcription at T7 or SP6 promoters, respectively. Negative interactions 13 are encoded by the transcriptional repressors LacI (lactose operon repressor protein) and TetR 14 (tetracycline repressor). They inhibit transcription when bound to their operator sites (LacO, 15 TetO), which are placed downstream of promoters. The two networks employ distinct 16 mechanisms to produce a gene expression stripe pattern (Cotterell and Sharpe, 2010;Jimenez 17 et al., 2015). We call these mechanisms the "opposing gradients" and the "concurring  is known to be involved in Drosophila melanogaster anterior-posterior patterning (hunchback, 23 knirps, krüppel) (Jaeger, 2011), to the best of our knowledge the concurring gradients 24 mechanism has so far not been observed in a natural stripe-forming regulatory network.
Box 1. Two starting circuits producing stripes through two different mechanisms 28 Opposing gradients mechanism (Incoherent FFM type 2): The "red" gene (with the open 29 reading frames (ORFs) for LacI and TetR encoded on the same transcript) is activated by the 30 "morphogen" arabinose (vertical arrow). Its products thus form a gradient of increasing 31 concentration with increasing arabinose concentration. The "blue" gene (LacI) and the "green" 32 gene (GFP) are expressed from constitutive promoters. However, the "blue" gene is also 33 repressed by the "red" gene product (TetR). Thus, the "blue" gene product forms an opposing 34 6 gradient with respect to the gradient of the "red" gene product. Both the "blue" (LacI) and "red" 1 (LacI) gene products repress the "green" gene. The GFP thus reaches a high expression only at 2 medium morphogen concentration where the repression from the "red" and "blue" genes are 3 low. 4 Concurring gradients mechanism (Incoherent FFM type 3): The "red" gene (with the ORFs for 5 SP6 RNA polymerase (RNAP) and LacI encoded on the same transcript) is activated by the 6 "morphogen" arabinose, just as in the previous circuit. Its expression thus also mimics the 7 arabinose gradient. However, in this circuit the "red" gene product SP6 RNAP activates the 8 "blue" gene, which thus forms a concurring gradient with respect to the gradient of the "red" 9 gene product. The "green" gene is activated by the "blue" gene (T7 RNAP) and repressed by 10 LacI of the "red" gene. Its maximum expression occurs at medium arabinose concentration 11 where there is already activation from the "blue" gene, but not yet a high level of repression of 12 the "red" gene.    14 We introduced mutations into the regulatory regions of these two networks by replacing the 15 wild-type regulatory sequence with semi-randomised weighted oligonucleotides (Isalan, 2006). 16 Resulting mutation rates ranged from 2.6 to 3.5 mutations (point mutations, insertions and 17 deletions, SI Table 4) per regulatory region. For each of our two networks, we first generated 18 three libraries of mutant networks in which mutations were restricted to regulatory regions of 19 the "red", "blue" or "green" gene ( Fig. 1). After plating cells from a population whose members 20 harboured a synthetic network variant, we randomly picked colonies, grew them in liquid 21 culture, and measured their GFP expression at low (0%), middle (0.0002%) and high (0.2%) 22 arabinose concentrations. We classified the observed fluorescence phenotypes into six 23 categories ( Fig. 2a) (see Methods for exact definitions): "stripe", "increase", "decrease", "flat" 24 and "broken" (all expression values below a threshold) and "other" (phenotypes that do not fall 25 in any of the previous categories). We emphasize that phenotypes like these are not peculiarities 26 of our synthetic system. They have also been observed in developing organisms, for example 27 in the anterior-posterior patterning of mutant fly embryos (Drosophila melanogaster and     Fig. 2b summarises the spectrum of phenotypes we observed after mutagenesis. We first note 15 that both networks are to some extent robust to mutations; that is, a considerable fraction of 16 mutations do not change the "stripe" phenotype (black sectors in Fig. 2b). What is more, the 17 two types of networks we study differ in their robustness. Averaged across the three genes, 45.5 18 % of analysed mutants preserve the stripe phenotype in the concurring gradients network, 19 whereas only 32.9% do so in the opposing gradients network. The concurring gradients network 20 is thus significantly more robust to mutations [Chi-square goodness of fit test, X2 (1, N = 215) 21 = 13.67, p =0.0002]. Next, we note that within any one of the two networks the novel 22 phenotypes do not occur at the same frequency, providing evidence for the biased production 23 of novel phenotypes, where certain types of phenotypes are more common than others. We also 24 observed differences in the types of novel phenotypes between the two networks. For example, 25 8.2% of mutants of the opposing gradients networks show a "flat" GFP expression phenotype, 26 where the GFP expression is invariant to arabinose concentrations (yellow sector in Fig. 2b). In 27 contrast, mutations in the concurring gradients network did not produce a single such 28 phenotype. In addition, mutations in the opposing gradients network are more likely to create a 29 "decrease" phenotype (purple, 29.8% of all novel phenotypes) rather than an "increase" 30 phenotype (orange, 15.4%). For the concurring gradients network, the opposite is true: 31 Mutations are more likely to create "increase" (23.0%) rather than "decrease" (18.1%) 32 phenotypes.

34
Subsequently, we analysed the differences in novel phenotypes created by mutations in specific 1 regulatory regions (i.e. of the "red", "blue" or "green" gene). Within any one of the two network 2 types, regulatory mutations in the "red" gene most often create "increase" phenotypes ( Fig. 2b, 3 small pie charts left to the "red" genes), whereas those in the "blue" gene most often create 4 "decrease" phenotypes ( Fig. 2b, small pie charts at the bottom of the "blue" genes), and those 5 in the "green" gene preferably create "broken" phenotypes ( Fig. 2b, small pie charts to the right 6 of the "green" genes). As a consequence, not all phenotypes can be reached by introducing 7 mutations in the regulatory region of any of the three genes. For example, in the opposing 8 gradient network the "increase" phenotype is only reachable by introducing mutations in the 9 "red" gene, but not in the "blue" and "green" genes. 10 The two networks differ in the spectrum of novel phenotypes that mutations in individual genes 11 create, which is especially obvious for mutations in the "green" gene: Unless regulatory 12 mutations in this gene lead to a complete loss of expression ("broken"), the opposing gradients 13 network is >5 times more likely to create a "flat" phenotype (23.2 %) than a "decrease" 14 phenotype (4.1 %). In contrast, the concurrent gradients network does not produce any "flat" 15 phenotype at all, but readily produces "increase" phenotypes (4.5 %). In sum, mutations in 16 networks which start with the same phenotype (single stripe formation), but which have 17 alternative topologies and regulatory mechanisms, create different kinds of novel phenotypes. 18 Phenotypic variation is subject to constraints, and these constraints differ between regulatory 19 regions and networks. 20 21 Differences in constrained variation can be explained by differences in the regulatory 22 mechanisms behind stripe formation 23 24 We next asked whether the regulatory mechanisms contributing to stripe formation can help 25 explain these phenotypic constraints. In doing so, we focused on novel phenotypes produced 26 by regulatory mutations in the "green" gene, because such mutations produced the most distinct 27 spectrum of novel phenotypes (Fig. 2b). Also, the regulation of this gene is most complex, 28 because it receives two regulatory inputs instead of just one for the other genes ( Fig. 1). (Similar 29 analyses for the "red" and "blue" genes can be found in SI Figures 2-4.) 30 To address this question, we first used a mathematical model that we had developed previously 31 and validated experimentally to describe the regulatory dynamics of our networks (Schaerli et  SI Tables 1 and 2 for details). The unmutated ("wild-type") model for each circuit used 1 parameter values determined in our previous study (Schaerli et al., 2014). Into these models we 2 now introduced quantitative changes in the parameters relating to the promoter activity (binding 3 constants of activators and transcription rates) and to the operator activity (binding constants of 4 repressors) in order to predict phenotypes that are accessible by mutations (see Methods for 5 details). We represent the unmutated network as a point in parameter space, and study regions 6 near this point that are accessible by mutations, and the novel phenotypes they contain (Dichtel-7 Danjoy and Felix, 2004). We visualise the results in phenotype diagrams ( Fig. 3a and SI Fig.   8 2), which are projections of the higher-dimensional parameter space onto two dimensions. 9 Coloured regions indicate the phenotype (see legend, Fig. 3a) that a circuit produces if 10 mutations change parameters from their wild-type values to other values in this region, 11 expressed as a percentage of the wild-type value. For example, the black region in Fig. 3 12 corresponds to mutant parameter combinations that maintain the "stripe" phenotype. Its area is 13 therefore a measure for a network's robustness to parameter changes. 14 For each of the novel phenotypes observed experimentally when mutating the regulatory region 15 of the "green" gene ( Fig. 2b), some "mutated" model parameter values exist which reproduce 16 the phenotype (Fig. 3a). However, the phenotype diagrams of the two networks ( Fig. 3a) are 17 visually very distinct, indicating that the two networks have different potentials to access 18 specific phenotypes. Specifically, for the opposing gradients network we find regions 19 corresponding to the "broken" (grey), "decrease" (purple), "flat" (yellow) and "other" (beige) 20 phenotypes, whereas for the concurring gradients network we find regions for the "broken" and 21 "increase" (orange) phenotypes. Especially instructive are mutants with strongly decreased 22 repressor binding (i.e. reduced operator activity, arrows in Fig. 3a). Such mutants produce a 23 "flat" phenotype (yellow region) in the opposing gradient network, but an "increase" phenotype 24 (orange region) in the concurring gradient network (Fig. 3a). In other words, even though both 25 networks contain the same operator (LacO) in the "green" gene, the model predicts that 26 identical operator mutations can lead to different novel phenotypes. Fig. 3b illustrates how this 27 is possible: Through an operator mutation that removes the incoming negative interaction of 28 the "green" gene in the opposing gradient network, the constitutive promoter becomes the sole 29 driver of "green" gene expression. Consequently, GFP expression becomes independent of 30 arabinose concentrations, which results in a "flat" phenotype. In contrast, after removing the 31 repression of the "green" gene in the concurring gradient network, the "green" gene is still 32 regulated by the activating "blue" gene (T7 RNAP) in an arabinose dependent-manner. Hence, 33 in this mutant circuit, GFP expression increases with increasing arabinose concentrations. In 34 sum, different biases in the production of novel phenotypes can be explained by differences in 1 the regulatory mechanisms of the networks. types. a Phenotype diagrams for parameters that describe the activity of the "green" gene. 6 Horizontal and vertical axes indicate promoter and operator activities of the "green" gene 7 relative to the wild-type value (WT, 100%). Colours indicate phenotypes predicted by the 8 model over the whole range of promoter and operator activity values. White squares indicate 9 the parameter combination of the unmutated circuit, which produces the "stripe" phenotype, 10 and white lines are visual guides that project these values onto the two parameter axes. Arrows 11 point to the phenotype observed when operator activity decreases to a value near zero percent.   17 We next validated our model predictions with DNA sequence analysis. We sequenced the 18 regulatory regions of the mutated "green" genes in the 73 opposing and 67 concurrent gradient 19 networks, and studied their phenotypes. Because many circuits have multiple regulatory 20 mutations, we first categorised circuits according to the number of mutations that they 21 13 contained, counted the number of circuits in each category, and classified their phenotypes 1 (Figure 4a, big diagrams: All mutations). We followed the same procedure for the subset of 2 circuits that have mutations only in the promoter sequence or only in the operator sequence 3 (Fig. 4a, smaller diagrams). This categorisation reveals that mutations in the operator produce 4 a "flat" phenotype in the opposing gradient network, but an "increase" phenotype in the 5 concurring gradient network (Fig. 4a, smaller diagrams), thus validating model predictions (Fig.   6 3). In addition, the model predicts that mutations in the operator of the opposing gradient 7 network are able to produce a "decrease" phenotype ( Fig. 3a). Even though we did not find a 8 circuit with a "decrease" phenotype that has only operator mutations, all circuits with this 9 phenotype carry at least one mutation in the operator (and additional mutations in the promoter, 10 SI sequences). predicted by the model (Fig. 3). For the "decrease" phenotype in the opposing gradients network 18 and the "increase" phenotype in the concurring gradients network our dataset is too small to 19 detect the predicted enrichment of operator mutations. 20 21 Especially informative are those mutants with a novel phenotype that carry only a single point   with a single mutation at the indicated position. 12 13 Encouraged by this agreement between model and mutational data, we also aimed to see 14 whether a model of mutational effects can correctly fit the frequencies instead of just the kinds 15 of phenotypes caused by mutations. To find out, we simulated mutations by changing specific 16 parameters of the model. If a mutation affected the promoter (or operator), we changed all the 17 parameters determining promoter (or operator) activity. We did so by replacing the wildtype 18 promoter (or operator) activity with an activity drawn from a uniform distribution, and aimed 19 to identify upper and lower bounds for this distribution that reproduce the observed frequencies 20 of novel phenotypes (SI Table 3, see Methods). Such bounds do indeed exist, and enable our 1 model (Fig. 2c) to reproduce experimental phenotype distributions (Fig. 2b) very well. Because mutations rarely occur in isolation in a single gene, we next asked whether mutations 7 in different regulatory regions independently affect gene expression phenotypes. To this end, 8 we pooled networks with mutations in single regulatory regions to obtain networks with 9 mutations in the regulatory regions of two or three genes. We then measured the gene 10 expression phenotypes of these multiple-gene mutants and sequenced their regulatory regions 11 (SI sequences). Fig. 2d shows the resulting distribution of phenotypes. Similar to the one-gene 12 mutants (Fig. 2b), some phenotypes occur more frequently than others, and the two networks 13 produce any one phenotype at different frequencies.

Sequence analysis confirms model predictions of constrained phenotypic variation
14 Regardless of the regulatory mechanism, the frequencies of novel phenotypes differed 15 significantly between networks with mutations in multiple versus single genes ( Fig. 2d and  produce the "flat" phenotype in response to multiple mutations, but the concurring gradient 19 networks did not produce this phenotype in response to single-gene mutations (compare yellow 20 sectors in Fig. 2d and 2b, large circles). 21 Because the phenotypes we observed in the multiple-gene mutants are not just additive 22 superpositions or "sums" of phenotypes observed when the mutations occur separately, the 23 mutations in the different genes must interact non-additively (epistatically) to produce novel 24 phenotypes, such that a mutation's phenotypic effect depends on the genetic background in 25 which it occurs (Lehner, 2011;Mackay, 2014). 26 In SI Fig. 5 we show an experimental example of how mutations in the "green" and "blue" 27 genes can interact to produce a "flat" phenotype in the opposing gradient network: The network 28 with the mutated "green" gene maintains the "stripe" phenotype (SI Fig. 5a) and the network 29 with the mutated "blue" gene leads to a "decrease" phenotype (SI Fig. 5b). When these two 30 mutations are combined, the resulting phenotype is "flat" (SI Fig. 5c). Importantly, this new 31 phenotype cannot just be explained as an additive superposition of the two individual 32 phenotypes.

33
To understand the phenotype distributions of the multiple-gene mutants and in particular, the 1 non-additive interactions, we turned again to our model. Analogous to the experiments, we now 2 changed parameters of multiple genes simultaneously within the same ranges as used to model 3 the single-gene mutants. The resulting phenotype distributions (Fig. 2e) predict the 4 experimentally observed distributions very well (Fig. 2d). This implies that both constrained  14 Both networks produced a non-uniform distribution of novel phenotypes, but the different 15 networks displayed different constraints in the production of novel phenotypes. The identity of 16 the mutated regulatory region, and non-additive interactions among mutations in multiple 17 regions also influenced these constraints. 18 A mathematical model describing the regulatory mechanisms of the two networks allowed us 19 to understand the differences between available phenotypes for the two networks ( Fig. 2 and 3) 20 and this understanding was also supported by DNA sequencing data (Fig. 4). We thus provide 21 for the first time empirical evidence that GRNs with different regulatory mechanisms can cause 22 different constrained variation, as was recently proposed (Jimenez et al., 2015). Additional 23 experiments show that the mechanism by which a network produces a stripe constrains the 24 origin of novel expression phenotypes more than quantitative parameters driving gene 25 expression dynamics (SI Fig. 6). 26 Comparisons of GRNs in related species indicate that they indeed use many distinct molecular 27 solutions to produce a desired phenotypic outcome and that these solutions diverge substantially  capture the complexity of a developing animal, this reduced complexity also allowed us to study 30 the potential of GRNs to bias phenotype production without cofounding effects.  (Lehner, 2011). To our knowledge, our study shows for the first time that non-additive 6 interactions in a nonlinear gene regulatory network help produce constrained phenotypic 7 variation. These interactions enabled the origin of novel phenotypes that were not observed 8 when we mutated a single gene (e.g., the "flat" phenotype in concurring gradients network, SI 9 Fig. 5). Our results suggest that these epistatic interactions can also be predicted, if the 10 corresponding GRN, its regulatory mechanism, and the effect of mutations in single regulatory 11 regions are known. Ultimately, understanding the nonlinearities inherent in complex biological 12 systems will be essential to understand how such systems constrain the production of 13 phenotypes. 14 15 Since the 19 th century, Darwinian evolutionary biology has focused on natural selection and its 16 power to shape populations and species. Natural selection, however, requires phenotypic 17 variation, and the molecular mechanisms by which DNA mutations produce novel phenotypes 18 have only become understood in recent years. While orthodox evolutionary theory assumed, 19 often tacitly, that DNA mutations may produce any kind of variation (Mayr, 1963), the 20 discovery of constrained phenotypic variation challenged this view (Arnold, 1992; Smith et al., 21 1985). As we show here, constrained variation in simple yet important spatial gene expression 22 patterns can be explained by the simple fact that genes are embedded in regulatory networks. 23 What is more, the regulatory mechanisms of these GRNs can help explain why specific gene 24 expression patterns originate preferentially. Given the pervasive nonlinearity of gene regulatory 25 networks (Davidson, 2006), we surmise that constraints like those we observe are pervasive in 26 biological pattern forming systems. Future work will show whether they can also influence the 27 trajectories of adaptive evolution. Cloning experiments used Luria-Bertani medium (LB:10 g Bacto-tryptone, 5 g yeast extract, 4 10 g NaCl per 1 l) supplemented with appropriate antibiotic (100 μg/ml ampicillin, 30 μg/ml 5 kanamycin or 50 μg/ml spectinomycin). Experiments with the complete synthetic circuits used 6 'Stripe Medium' (SM: LB plus 0.4% (w/v) glucose, 50 μg/ml ampicillin, 15 μg/ml kanamycin 7 and 25 μg/ml spectinomycin). For the opposing gradients network SM was supplemented with 8 5 μM isopropyl β-D-1-thiogalactopyranoside (IPTG). 9 10 Molecular cloning reagents 11 Restriction enzymes and T4 DNA ligase were purchased from New England BioLabs (NEB). 12 Oligonucleotides were obtained from Microsynth and chemicals were obtained from Sigma-   Mutating one regulatory region of a circuit at a time 25 Oligonucleotides covering the regulatory regions of these circuits were synthesised with 26 weighted base mixtures (Isalan, 2006). That is, four custom-weighted phosphoramidite 27 mixtures were prepared, with the WT base pair constituting 95% and each of the other bases 28 constituting 1.67% of any one mixture. These mixtures were used to randomise the regulatory 29 regions (Table 1) during oligonucleotide synthesis (Microsynth). These semi-randomised 30 weighted single-stranded oligonucleotides (2 µM) were annealed to a reverse primer ( Table 2,  oligonucleotides encoding the mutated regulatory regions were cloned into the plasmid 1 encoding the gene whose regulatory region was to be mutated. For the "blue" and "green" 2 regulatory regions this was done by restriction enzyme digest and ligation. For the "red" 3 regulatory region Gibson assembly (Gibson et al., 2009) was performed instead. 4 5 The cloning of the "blue" and "green" regulatory regions by restriction enzyme digest and 6 ligation was performed as follows: double-stranded oligonucleotides encoding one of the 7 regions were digested with EcoRI and SacI. The plasmid into which the region was to be 8 inserted was also digested with EcoRI and SacI, dephosphorylated with CIP, and gel purified 9 (QIAquick gel extraction kit, QIAGEN). The double-stranded digested oligonucleotide library 10 (25 ng) was then ligated into the cut plasmid (70 ng). 11 12 The cloning of the "red" regulatory regions by Gibson assembly was performed as follows: The 13 plasmid into which the region was to be inserted was first amplified using the primers 14 pBAD_Gibson_f and pBAD_Gibson_r (Table 2), and was then assembled with the double-15 stranded oligonucleotide library using the Gibson assembly master mix (NEB). Due to a 16 mistake in primer design, Gibson assembly produced a two-nucleotide deletion (CT) 17 downstream of the pBAD promoter. Therefore, the library contains sequences with and without 18 this deletion. We confirmed that this change did not affect the initial "stripe" phenotype. 19 20 Ligation ("blue" and "green" regulatory regions) and Gibson assembly reaction products ("red" 21 regulatory region) were transformed into electrocompetent MK01 cells (Kogenaru and Tans,22 2014) that carried already the two other plasmids necessary to complete the synthetic circuit. 23 Transformants were plated out on SM-agar plates. 24 25 Mutating multiple regulatory regions of a circuit 26 For experiments where multiple regulatory regions of a circuit were to be mutated at the same 27 time, cloning was performed as described above for mutating one regulatory region at a time. 28 However, instead of transforming plasmids with mutagenised regulatory regions directly into 29 MK01 cells, ligation products and Gibson assembly reaction products were first transformed 30 into electrocompetent NEBα cells, and plated out on LB-agar plates containing the appropriate 31 antibiotic (100 μg/ml ampicillin, 30 μg/ml kanamycin or 50 μg/ml spectinomycin). All colonies 32 were resuspended in LB, diluted 100-fold into LB containing the appropriate antibiotic, and     Table 3: Sequences of primers used for sequencing 1 2 Fluorescence measurements of mutagenised circuits 3 Colonies were picked from agar plates, inoculated into SM medium in a single well of a 96-4 well plate and grown overnight. Each plate also contained three clones of the WT circuit and a 5 "blank" (SM medium only). A glycerol stock plate was prepared from the overnight cultures. 6 This plate was used to inoculate three further 96-well overnight pre-culture plates with SM 7 medium. Five µl of each well from the 96-well plate were transferred to four wells of a 384-8 well plate containing 55 µl of SM medium, arabinose, and IPTG. Specifically, the four wells 9 contained the following amounts of arabinose and IPTG: 10 1) 0% arabinose ("low") 11 2) 0.0002% arabinose ("medium") 12 3) 0.2% arabinose ("high") 13 4) 0.2% arabinose with 700 µM IPTG for the opposing gradients network and 0.2% arabinose 14 with 100 µM IPTG for the concurring gradients network ("metabolic load control", see section 15 "Phenotype classification, metabolic load" for details). 16 The pipetting steps for this part of the experiment were carried out with a manual pipetting 17 system (Rainin Liquidator 96, METTLER TOLEDO). 18 19 The 384-well plate was incubated at 37 °C in a Tecan plate reader (Infinite F200 Pro or SPARK   expecting to observe a "stripe" phenotype. If the observed phenotype is nevertheless a "stripe", 28 this is a strong indication that the network suffers from metabolic load. Specifically, a circuit 29 was excluded due to high metabolic load if its nF value in the 4 th condition was lower than 90% 30 of its nF value at the medium arabinose concentration, and if the nF value at the medium 31 arabinose concentration was higher than the corresponding nF of the WT controls (we know 32 that expression levels as high as that of the WT do not induce metabolic load (Schaerli et al.,33 2014)).

34
Phenotypic categories 1 All mutant circuits that remained after the filtering procedure just described were classified into 2 the following phenotypic categories (in this order): 3 4 Broken: A threshold value of nF below which a phenotype was considered "broken" was 5 defined as follows: For the opposing gradients networks, nF needed to lie below the nF of the 6 WT controls at the highest of our three arabinose concentration. For the concurring gradients 7 networks, nF needed to lie below 1/3 of the nF of the WT controls at this highest arabinose 8 concentration. Any circuit whose nF was below this threshold in all four conditions was 9 assigned the "broken" phenotype. The reason for the different definitions of the threshold for 10 the two networks is that the WT concurring gradient network circuit has a much higher level of 11 basal fluorescence. 12 13 Flat: The average nF at the lowest, medium and highest arabinose concentration was calculated.
14 If all three nFs differed by less than 10% from this average, the circuit was assigned the "flat" 15 phenotype. 16 17 Decrease: If the following three statements were true, the mutant circuit was assigned the 18 "decrease" phenotype: 19 1) The nF at the lowest arabinose concentration was higher than 90% of the nF at the 20 medium arabinose concentration. 21 2) The nF at the medium arabinose concentration was higher than 90% of the nF at the 22 highest arabinose concentration. 23 3) The nF at the lowest arabinose concentration was higher than 120% of the nF at the 24 highest arabinose concentration. 25 26 Increase: If the following three statements were true, the mutant was assigned the "increase" 27 phenotype: 28 1) The nF at the highest arabinose concentration was higher than 90% of the nF at the 29 medium arabinose concentration. 30 2) The nF at the medium arabinose concentration was higher than 90% of the nF at the 31 lowest arabinose concentration. 32 3) The nF at the highest arabinose concentration was higher than 120% of the nF at the 33 lowest arabinose concentration.

25
Stripe: If the nF at medium arabinose concentration was higher than 120% of the nF at the 1 lowest and highest arabinose concentrations, the mutant was assigned the "stripe" phenotype. 2 3 Other: Any phenotype that did not fall into one of the previous categories. The mutagenised region(s) of all circuits whose phenotype fell into one of our six main 7 categories, and did so consistently in three independent measurements were sent for Sanger 8 sequencing (High-throughput service, Microsynth, see Table 3 for primers).  All sequences are listed in the SI. 20 21 Experimental confirmation of rare observed phenotypes 22 Phenotypes observed fewer than three times in a library (except "others") were experimentally 23 confirmed. The fluorescence output of these circuits was measured in a 96-well plate assay as 24 described previously (Schaerli et al., 2014) at the arabinose concentrations also used in the 386- 25 well plate assay. If the phenotypes of the two assays did not agree, the circuits were excluded 26 from the data set. 27 28 Experimental confirmation of epistasis (SI Fig. 5) 29 The plasmids from mutant 4_4_f (see SI sequences) were isolated. For SI Fig. 5a the mutated 30 plasmid coding for the "green" gene was transformed together with the WT plasmids for the 31 "blue" and "red" genes into electrocompetent MK01 cells (Kogenaru and Tans, 2014). For SI 32 Fig. 5b the mutated plasmid coding for the "blue" gene was transformed together with the WT 33 plasmids for the "green" and "red" genes into electrocompetent MK01 cells (Kogenaru and 34 Tans, 2014). For SI Fig. 5c the initial 4_4_f mutant was assayed (the "red" gene is not mutated). 1 The fluorescent phenotypes were measured in a 96-well plate assay as described ( Fig. 2, combinations of two parameters as indicated on the axes were varied. For Fig. 3a 24 multiple parameters affecting the promoter or operator were varied jointly and to the same 25 extent. For the opposing gradients network, these were parameters a and b for promoter activity, 26 and parameter c for operator activity. For the concurring gradients network these were 27 parameters b, c and e for promoter activity and parameter d for operator activity. 28 29 Distributions of novel phenotypes 30 In order to fit quantitatively the distributions of novel phenotypes for the single-gene mutants 31 and predict the distributions of novel phenotypes for the multiple-gene mutants a custom-made 32 Python scripts was used. We first discuss the single-gene mutants: Multiple iterations of a procedure were performed that consisted of the following three steps: 1 First, a series of simulated mutants was created. For each gene, a random binary vector whose 2 length corresponds to the number of nucleotides in the regulatory sequence (SI Table 4) was 3 generated. In this binary vector 0's and 1's indicate whether a nucleotide is not mutated (0) or 4 mutated (1), and the probability to obtain either 0 or 1 is given by the average mutation rate 5 extracted from the experimental sequencing data (SI Table 4). For every network, it was then 6 assessed which genes were mutated, according to the entries of this vector. For the single-gene 7 mutants, mutants that had only one gene mutated were selected. This process was repeated until 8 1000 single-gene mutants had been obtained. For a particular mutant, all parameters related to 9 the mutated sequence were varied (see SI Tables 1-3). A given single-gene mutant had either 10 only its promoter mutated, only its operator mutated, or both. If a mutation affected the 11 promoter (operator), all parameters determining promoter (operator) activity were changed (SI 12   Table 3). A subset of the changed parameters was varied jointly and to the same extent (i.e. all 13 of them were changed to same percentage of their wild-type parameter value), because a 14 mutation is likely to affect these parameters in a similar way (SI Table 3). New (mutant) 15 parameters were chosen according to a standard uniform distribution between upper and lower 16 ranges which were kept constant for a given model iteration. 17 Second, for each mutant the phenotype of the model (SI Tables 1 and 2) was evaluated at 3 18 arabinose concentrations (0%, 0.000195%, 0.2% for opposing gradients network and 0%, 19 0.000195%, 0.1% for concurring gradients network). In order to allocate the resulting GFP 20 expression pattern to a phenotype category the same rules as described above for the 21 experimental data ("Phenotypic categories") were used.

22
Third, the obtained phenotype distribution of the 1000 assessed mutants was compared to the 23 experimentally observed phenotype distribution. 24 After each iteration of these three steps, the upper and lower ranges of each parameter were 25 manually adjusted to best fit the results of the model to the phenotype distributions observed in 26 the experimental data (see SI "Discussion of intervals"). Finally, the best upper and lower 27 ranges for each mutant parameter distribution were kept (SI Table 3) and used to produce Fig.   28 2c. 29 To predict the distributions of novel phenotypes for the multiple-gene mutants (Fig. 2e) the 30 same procedure was used, with the following modifications: 1) only mutants containing more 31 than one mutated gene were kept for analysis.
2) The upper and lower ranges of the parameter 32 distributions were not adjusted, but the intervals derived from the single-gene mutants were 33 used (SI Table 3). 3) Only one iteration of the three steps above was performed.   Programme/Generalitat de Catalunya. 25 26 Competing interests 27 The authors declare not competing interests.