Combinatorial genetic analysis of a regulatory network reveals the importance of higher order epistasis for gene deletion phenotypes

A key challenge in biology is to understand how mutations combine to alter phenotypes. Each genetic variant in a genome can have diverse effects, for example decreasing, increasing, inactivating, or changing the function of a protein or RNA. In contrast, systematic analyses of how mutations interact have typically used a single variant of each gene, most often a null allele. We therefore lack an understanding of how the full range of genetic variants that occur in individuals can interact. To address this shortcoming, we developed an approach to combine >5000 pairs of diverse mutations in a model regulatory network. The outcome of most mutation combinations could be accurately predicted by simple rules that capture the ‘stereotypical’ genetic interactions (epistasis) in the network. However, for individual genotypes, additional, unexpected pairwise and higher order genetic interactions can be important. These include ‘harmonious’ combinations of individually detrimental alleles that reconstitute alternative functional switches. Our results provide an overview of how the full spectra of possible mutations in genes interact and how these interactions can be predicted. Moreover, they illustrate the importance of rare genetic interactions for individuals, including the impact of higher order epistatic interactions that dramatically alter the consequences of inactivating genes.

Beyond these stereotypical outcomes, particular combinations of mutations sometimes had very different phenotypes to those expected from their individual phenotypic effects. For example, rare pairings of Constitutive GAL4 alleles and Uninducible GAL80 variants yielded Inducible phenotypes, an example of reciprocal sign epistasis 10 . Similar interactions have been previously described for distinct GAL4-GAL80 allele combinations [11][12][13] . Other combinations yielding viable Inducible or Leaky phenotypes included pairings of Uninducible GAL3 with Leaky GAL80 variants (Fig 2E and Fig S2). These alternative combinations of GAL80 and GAL4 genetic variants that permit a WT-like phenotype represent examples of 'harmonious' combinations 14 : alternative genetic solutions to the core Inducible phenotypic characteristic of the GAL pathway.
We hypothesized that the harmonious combinations of GAL80 and GAL4 alleles could simply reflect reconstitution of the original regulatory network that exists in WT cells, with all GALR genes functioning as in the WT network. Alternatively, the functionality of these mutants might reflect different solutions to the same regulatory task, with the roles of other genes changed 15,16 . To distinguish between these two possibilities, we tested whether mutating the additional GALR genes, GAL3 and GAL1, had the same effect in these functional combinations of GAL80 and GAL4 alleles as in the WT system.
We tested a subset of GAL80 and GAL4 alleles in a combinatorially complete set of genotypes incorporating additional third-and fourth-order deletions in the potential galactose sensors GAL3 and GAL1 ( Fig 3A). All Inducible allelic combinations depended on GAL1 or GAL3 for robust growth, reflecting the importance of galactose sensing for pathway induction (Fig 3B,C). However, the consequences of deleting GAL3 or GAL1 sensing activity varied extensively across the different genotypes. For example, whereas the WT switch is completely dependent upon GAL3 (Fig 1D and Fig   3), GAL3 was not required for induction in the Leaky GAL80.07 mutant (Fig 2D and Fig 3), nor was it required for Constitutive expression in GAL4C mutants ( Fig 3D). Moreover, GAL80S-1 + GAL4C double mutants were still Inducible when GAL3 was deleted ( GAL1's galactokinase activity is required for growth in galactose, so the dependency of a network on GAL1's sensing activity cannot be determined by simply deleting the gene. To test the dependency of each genotype on GAL1 sensing activity, we therefore used a strategy in which GALK genes from other species with different galactose sensing mechanisms were expressed from the GAL1 promoter ('GAL1::GALK', Fig 3A, Methods). Consistent with GAL1 sensing activity not being required for induction in WT cells, replacing GAL1 with GALK from other species had no effect on gene expression or growth ( Fig 3B-C and Fig S4, Fig S5). In the Leaky GAL80.07 background, replacing GAL1 by GAL1::GALK also had no effect. However, it completely prevented growth and expression when GAL3 was also deleted. In 6 GAL3 and GAL1, an example of higher order epistasis. Such changes in gene essentiality have been widely observed between and within species [17][18][19][20][21] but the genetic causes are poorly understood. The altered requirement for GAL1 and GAL3 across combinations of mutations in GAL4 and GAL80 suggests that selective pressures on paralogous genes can substantially change with variation in other molecular players.
Together, our results illustrate how the genetic interactions between diverse alleles can be accurately predicted using models that capture the 'stereotypical' epistasis in a system. However they also demonstrate the importance of rare unexpected pairwise and higher order epistasis for the fitness of individual genotypes, and illustrate how higher order epistasis can rapidly change the essentiality of genes.           are Constitutive and therefore grow at high rates, when the prediction is that they will grow slowly due to     Tables 1, and and 2, respectively. All S. cerevisiae yeast strains were generated starting from BY4741, (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 22 ). All cloning in E. coli was performed using DH5-alpha or its commercial derivative NEB 10-Beta (New England Biolabs product number C3019I or C3020). Genomic DNA of BY4741, DH5alpha and C. albicans strain SC5314 were used as templates to generate GALR and GALK constructs. were transformed into the GAL4::URA3 locus.

Generation of screening plasmids for mutagenesis
Primers and strains are in Supplementary Tables 1 and 2. Table 3 includes information specific to the PCR products and transformations used. To ensure that all phenotypic variation observed in the PCR mutagenesis experiment were due to the locus targeted by mutagenesis and not the result of mutations in the other two GALR loci (potentially generated in the assembly or PCR process), we generated a screening plasmid with unique cutting sites between the GALR genes. These plasmids were generated by assembling three-GALR plasmids with the targeted GALR flanked by NotI sites and linker sequences.
One plasmid was assembled per GALR locus. After in vivo gap-repair assembly of these plasmids, they were prepped and transformed into E. coli and screened for correct digestion patterns. These plasmids were named pAMN26, pAMN27 and pAMN28. After identifying correct banding patterns and phenotypic behavior, plasmids were prepared for downstream analysis by removal of the gene of interest. The locus of interest for each one (GAL4, GAL3 and GAL80, respectively) could be liberated with a NotI digest.
Recircularization of these plasmids with a 5-minute room-temperature T4 ligase reaction and transformation into E. coli yielded plasmids pAMN32, pAMN33 and pAMN34 with deletions of their locus of interest.
Each GALR-specific screening plasmid (pAMN32, pAMN33 and pAMN34) was digested at a concentration of 100 ng / µl overnight in 7 ml in a 15 ml Falcon tube at 37ºC with NEB Cutsmart Buffer + High efficiency gap repair transformation in yeast.
In vivo homologous recombination ("gap repair") was used for generation of all plasmids except pAMN31, which was made by Gibson cloning. For gap repair, yeast clones were generated using combined with a high efficiency yeast transformation adapted from on a protocol described in 23 .
Appropriate yeast strain backgrounds were streaked from the freezer and Inoculated galactose plates were sealed with Microseal B seals and placed immediately at 4ºC to prevent growth or gene expression prior to beginning of growth experiment. We found that only turbid cultures could be resuspended by shaking on the 2.5 mm-radius orbital shaker. Therefore, while inoculating into galactose, we took care to distribute the cells evenly across the whole well. Plates were then sealed with Microseal B seals and put at 4ºC. At the end of the day all glucose-pregrown cultures that had been inoculated into galactose media were placed at 30ºC in stacks of 1-2 plates to begin growth.
After inoculation of galactose plates with glucose-grown cells, we put the glucose plates at 4ºC until measurement at the flow cytometer (BD FACS Canto; FACS Diva v 5.0.3 Firmware V 1.4). Prior to measurement at the cytometer, plates were put back on the shaker for 2 hours. Plates were visually inspected to be sure that the cells in all wells were well-suspended. We measured cell density and gene expression ( bandpass filters "FITC-A" 530±15 nm and "PE" 585±21 nm were used for YFP signal and 488±5 nm for SSC signal). High-throughput sampling mode was used with no mixing. The median time to complete a plate was 18 minutes. During this time we determined that cell density measurements did not appreciably change. If any problem was encountered during the cytometry and the measurements needed to be stopped, we took the plate out and put back on the plate shaker briefly to resuspend the cells before resuming the cytometry.
After 12 hours of growth at 30ºC in SC-[LEU or HIS]+0.2% galactose, samples were placed on ice or on a cold surface in a 4ºC room to arrest growth and allowed to cool at least 30 minutes prior to exposure back at room temperature. Prior to measurement at the cytometer, Microseal B covers were removed and samples put on the orbital shaker for 2 hours covered by a breathable plate seal. As mentioned above, samples that did not grow appreciably could not be easily resuspended by the orbital shaker. Selection and isolation of plasmids from single yeast clones for combinatorial genetics experiment.
After screening and phenotypically characterizing mutagenized GALR variants by flow cytometry (see above) we isolated plasmids of interesting phenotypes. Clones were selected to reflect either outlying phenotypes or more typical behaviour. "Outlying phenotypes" included samples where both fracon.glu and fracon.gal measurements were > 0 and less than 1, indicating that the clones had a constitutive character but could not fully induce the GAL pathway. Another rare phenotype we tried to isolate were clones where mean signal in ON cells was less than mean signal of typical inducible ON clones (as discussed in the text this phenotype was quite rare). Although some GAL3 mutants appeared to have constitutive characters in the first screen, we found that none of these phenotypes were recapitulated upon subcloning.
After selection based on phenotype, clones were thawed from the freezer and struck to single colonies.
These were inoculated into SC-HIS+2%glucose media in PCR plates and gDNA prepped in 96-well plates as described above. Clones were transformed into electrocompetent 10-Beta cells (New England Biolabs # C3020) using a 96-well plate electroporator using fresh electroporation plates (BTX 45-0450-M). Cells were recovered in deepwell plates (Thermo Scientific # 260252) and recovered in 0.6 ml SOC media for 1 hour prior to inoculation into 0.6 ml LB+100 mg / ml ampicillin. The next day plasmids were prepped in 96-well plates as described above. Preps were digested with NdeI and PmeI enzymes, which Gene expression distribution clustering.
For gene expression distribution clustering, we used the HDBScan* function (R package dbscan) 24 . This algorithm establishes hierarchical clusters based upon distance-weighted graphs assembled according to the density of data points across n-dimensions. Its final cluster assignments are mainly sensitive to the "minPts" parameter, which sets the minimum cluster size. This sensitivity arises primarily from a 1) the size of the dataset (because more points for a large dataset will yield the same number of clusters as for a small dataset) and 2) how distantly spread the clusters of data points are in n-dimensional space.
For gene expression distribution clustering we wanted to cluster based on expression in both glucose and galactose. For this we first paired the densities of expression across 60 bins of pseudo-log10transformed A.U. FITC-A signal for each sample at time 0 (glucose expression) and after 12 hours of growth in galactose (galactose expression) to generate vectors of 120 units for each observation. The mean vector for each unique genotype was then calculated. These mean values were then pseudo-log transformed to exaggerate signal within bins exhibiting low density values, for example such that a small fraction of cells active in glucose would be more salient to the HDBScan* algorithm (Fig S3A). For this, the density values for each bin transformed as follows: Where V is the final pseudo-log density value and v is the original density value. A matrix was made comprised of rows corresponding to each genotype's vector of measurements of V, with pseudo-log10 expression profiles with a classification. Finally, visual inspection revealed that some clones with constitutive characters exhibited a nonetheless high degree of activation in galactose relative to glucose.
To quantify this we took the mean YFP signal in galactose / mean YFP signal in glucose as a measure of a clone's induction. Using this parameter, we could identify clones mis-classified as constitutive to be "inducible" or "leaky". Similarly, some "inducible" clones displayed low mean induction values, so were classified as uninducible, and some "uninducible" clones actually showed >40% of cells ON in galactose with high a mode of expression and so were classified as inducible.
Predictive modeling of double mutant growth rates based single mutant gene expression clusters. This is because certain triple mutant classes were not measured. Specifically "uninducible" GAL3 was never paired with the three classes of GAL80 and GAL4 that do not include "inducible". These classes number 3 X 3 X 1 GAL4, GAL80 and GAL3 alleles = 9, so 32-9 = 23 unique clusters arising from singlemutant combinations.
Notably, these missing classes are all almost included in the final experiment, including constitutive GAL80, leaky GAL80, uninducible GAL80, uninducible GAL4, constitutive GAL4. The only one not included are the weak expression class of GAL4, three alleles of which all behaved very similarly to WT GAL4 (e.g., uninducible with uninducible GAL80S variants or uninducible GAL3 variants, and constitutive with constitutive GAL80 variants) but with lower peak expression level.
Predictive modeling of double mutant growth rates based on single mutant mutation effects. Standard errors for these predictions were propagated as se mut = abs(µ MUT )* sqrt(((N-1)*(se WT /µ WT )+se mut1 /µ mut1 + se mut2 /µ mut2 + … se mutN /µ mutN )^2) Phenotype values used for µ were equal to growth rate minus the background growth rate observed in the absence of any GAL regulator. These expectation values were used to predict overall variance in the dataset, explaining 52%, 20%, and 85% of variance for pairings between GAL3 vs. GAL4, GAL80 vs GAL3, and GAL4 vs GAL80, respectively. Overall this model explained 55% of variance across the dataset. These low values stem from the fact that pathway-level epistasis leads to constitutive and leaky expression profiles, which dominate signal in the dataset (Fig S3H). For example, all pairwise combinations of constitutive GAL80 and uninducible GAL3 are constitutive and therefore grow at high rates, when the prediction prediction is that they will grow slowly due to GAL3's low growth rate. Similarly Fitting a logistic model to the relationship between GAL1-YFP expression and growth rate To demonstrate that GALK orthologs from E. coli and C. albicans did not display signaling activity, we compared expression of the GAL1pr-YFP fusion to growth rates in which all sensors were deleted and found a sigmoidal relationship between mean YFP expression in glucose. We used this latter expression in glucose a measure of "pathway leakiness" or constitutivity to predict the expected growth rate of the culture in galactose. For this, we fitted a logistic curve using R's SSlogis() function of growth rate as a function of initial gene expression level to demonstrate that inducible or leaky "harmonious combination" mutant backgrounds are able to mount an induction response and high growth rate from an initially OFF state. A one-sided t-test was performed for each within-genotype mean across all backgrounds compared to this null expectation. To account for the differential growth rates observed for C. albicans Samples were then centrifuged on high (~3200 x g, 4000 RPM) in a bucket centrifuge for 10 minutes.
After centrifugation, isopropanol was dumped off and while still upside down, the plates gently dabbed on paper towels to absorb more isopropanol clinging to the plate. 60 µl of 70% ethanol was then added, plates resealed and inverted gently one time to mix the remaining isopropanol and ethanol together.
Plates were then centrifuged on high (~3200 x g, 4000 RPM) in a bucket centrifuge for 2-5 minutes. An optional second 70% ethanol wash was was sometimes performed. Debris was pelleted by centrifugation on high (~3200 x g, 4000 RPM) in a bucket centrifuge for 10 minutes. Then plate seals were removed and 60 µl of plasmid-containing supernatant was transferred ingo into 60 µl of isopropanol in a new 96-well plate. Plates were re-sealed and inverted several times to mix the DNA and isopropanol. Samples were then centrifuged on high (~3200 x g, 4000 RPM) in a bucket centrifuge for 10 minutes. After centrifugation, isopropanol was dumped off and while still upside down, the plates dabbed on paper towels to absorb more isopropanol clinging to the plate. 60 µl of 70% ethanol was then added, plates resealed and inverted gently one time to mix the remaining isopropanol and ethanol together. Plates were then centrifuged on high (~3200 x g, 4000 RPM) in a bucket centrifuge for 5 minutes. An optional second wash was sometimes performed. Then ethanol was dumped and while still upside down put the plate on a paper towel to absorb of yet more ethanol clinging to the plate.
Plates were then spun briefly to bring remaining ethanol to the bottom of the wells, and using a Rainin multichannel P10 with LTS tips (catalog number 17005873; very fine and flexible tips) the remaining ethanol was removed from the plates. Plates were allowed to dry 10 minutes. DNA was resuspended in