Gene Erosion Can Lead to Gain-of-Function Alleles That Contribute to Bacterial Fitness

ABSTRACT Despite our extensive knowledge of the genetic regulation of heat shock proteins (HSPs), the evolutionary routes that allow bacteria to adaptively tune their HSP levels and corresponding proteostatic robustness have been explored less. In this report, directed evolution experiments using the Escherichia coli model system unexpectedly revealed that seemingly random single mutations in its tnaA gene can confer significant heat resistance. Closer examination, however, indicated that these mutations create folding-deficient and aggregation-prone TnaA variants that in turn can endogenously and preemptively trigger HSP expression to cause heat resistance. These findings, importantly, demonstrate that even erosive mutations with disruptive effects on protein structure and functionality can still yield true gain-of-function alleles with a selective advantage in adaptive evolution.

loci for protein quality control, thereby massively boosting their resistance against heat and oxidative stress and raising concerns for their increased survival in medical and industrial settings (15,16). However, the prevalence of these transmissible islands is rather modest (17), and likely, more subtle evolutionary routes for modulating HSP levels exist as well.
In this study, we provide evidence for such an unanticipated adaptive evolutionary route in which E. coli mitigates the impact of proteotoxic stress by mutationally sacrificing the folding fidelity of a single, nonessential, and transiently expressed protein in order to preemptively activate its heat shock response. As such, we demonstrate that erosive mutations with disruptive effects on protein structure can serve as true gain-of-function mutations that adaptively raise HSP levels.

RESULTS
Adaptation to heat stress selects for mutants with altered PA management. During preliminary directed evolution experiments, we (i) found that an E. coli MG1655 ΔrpoS mutant (i.e., deprived of its σS general stress response regulator) could readily compensate for its intrinsic hypersensitivity to heat (Fig. 1A), and (ii) serendipitously noticed through phase-contrast microscopy that some of these independently evolved heat-resistant MG1655 ΔrpoS mutants typically displayed cells bearing a highly refractive polar structure, reminiscent of a protein aggregate (PA) (Fig. 1B and C). To address these initial experimental observations more systematically, we subsequently heat-cycled 17 independent lineages of E. coli MG1655 ΔrpoS ibpA-msfGFP (able to fluorescently report on the presence of intracellular PAs through the IbpA-msfGFP reporter [18]), and from each lineage retained one random heat-resistant clone (Fig. 2A). Interestingly, 14 out of these 17 independent heat-resistant mutants (all except H4, H14, and H17) displayed considerably higher fractions of PA-bearing cells (especially toward the early and/or late stationary phase of growth) compared to the parental strain and its two control-cycled derivatives (Fig. 2B and C), thereby consolidating our observation that adaptive evolution toward heat resistance can alter cellular PA management.
Heat resistance causally stems from gain-of-function TnaA variants. Surprisingly, whole-genome sequencing of the 17 independent heat-resistant mutants revealed that 13 of them harbored a mutation in the tnaA gene (encoding the tryptophanase enzyme; Table 1). Out of these 13 tnaA mutants, 11 stood out as having clearly altered cellular PA management, while 2 (H4 and H17) were hardly distinguishable from the parental MG1655 ΔrpoS ibpA-msfGFP strain with regard to the IbpA-msfGFP PA reporter (Fig. 2B). Moreover, checking the tnaA allele of some of the preliminary heat-cycled mutants, and even of previously reported hydrostatic pressure-cycled mutants (19), revealed another 4 tnaA alleles (Table 1). Interestingly, most of these tnaA alleles (14 out of 17) were found to be unique and incurred point mutations, premature stop codons, or indels (either in-frame or frameshifting) throughout the tnaA open reading frame (Table 1). Moreover, while many tnaA mutants were partially or fully compromised in their tryptophanase activity (i.e., the ability to produce indole from tryptophan), some of them displayed no obvious loss-of-function signs (Table 1).
To infer causality between possible tnaA alterations and heat resistance, the wild-type tnaA allele (tnaA WT) of E. coli MG1655 ΔrpoS was replaced with either (i) tnaA D106 as one of the evolved alleles (coming from mutant MT3; Table 1), (ii) ΔtnaA (unable to produce the TnaA protein), or (iii) tnaA K270A (producing a catalytically compromised TnaA variant [20]) and subsequently heat challenged (Fig. 3A). Interestingly, this revealed not only that the tnaA D106 allele itself was indeed causally sufficient to increase heat resistance of MG1655 ΔrpoS to the same level as seen in the original MT3 mutant, but also that the ΔtnaA and tnaA K270A alleles failed to affect this phenotype (Fig. 3A), despite sharing an impaired ability to produce indole with the tnaA D106 allele (Fig. 3C). Moreover, in contrast to the tnaA WT, ΔtnaA, and tnaA K270A alleles, implementation of the tnaA D106 allele also caused MG1655 ΔrpoS to produce PAs to the same level as seen in MT3, indicating that expression of this variant directly or indirectly causes protein aggregation (Fig. 3B). As such, it could be inferred that neither the complete absence of the TnaA protein nor the more subtle loss of its enzymatic function (i.e., indole production) is sufficient or required to impose heat resistance, indicating that the evolved tnaA variants should be considered subtle gain-of-function alleles.
In further consolidation of this causality and the need for tnaA D106 to actually become expressed, we also observed that preventing tnaA D106 expression by depriving the growth medium of tryptophan (i.e., the inducer of the tnaCAB operon [21,22]) abrogated the heat resistance effect (Fig. S1). Moreover, in tryptone soy broth (TSB) batch cultures, induction of the tnaCAB operon naturally occurs toward early stationary phase (when glucose becomes depleted and catabolite repression is alleviated [23-25]), which also corresponds in timing to the rise in PA-containing cells in the selected tnaA mutants (Fig. 2B). In contrast, induction of the tnaCAB operon already occurs in mid-exponential phase in LB batch cultures (where amino acids are the main carbon source [26]), and exponential-phase heat resistance in the tnaA D106 mutant could accordingly be observed in LB but not in TSB medium (Fig. S2).
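Heat resistance in these experiments is scored as a logarithmic reduction factor, and the figure legends report means with standard errors over three independent experiments. The sketch below illustrates that arithmetic under the common assumption that the reduction factor is log10(N0/N), with N0 and N the viable counts before and after the heat challenge; the function names and the plate counts are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def log_reduction(cfu_before: float, cfu_after: float) -> float:
    """Logarithmic reduction factor, assumed here to be log10(N0 / N)."""
    return math.log10(cfu_before / cfu_after)


def mean_and_sem(values):
    """Mean and standard error of the mean over independent experiments."""
    sem = stdev(values) / math.sqrt(len(values))
    return mean(values), sem


# Hypothetical viable counts (CFU/ml) before and after heat exposure
# for three independent experiments of one strain.
experiments = [(1.0e9, 2.0e5), (1.2e9, 1.5e5), (0.9e9, 3.0e5)]
factors = [log_reduction(before, after) for before, after in experiments]
average, sem = mean_and_sem(factors)
print(f"log reduction = {average:.2f} +/- {sem:.2f} (n = {len(factors)})")
```

A lower reduction factor corresponds to higher heat resistance, which is why the evolved mutants are reported as showing a decrease in this value compared to the parental strain.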
FIG 1
(A) Logarithmic reduction factor after exposure to heat (55°C for 15 min) of TSB-grown stationary-phase cultures of MG1655 wild-type (WT), MG1655 ΔrpoS, and a representative heat-resistant mutant (MTH1) derived from MG1655 ΔrpoS by directed evolution (i.e., successive cycles of increasingly severe heat shocks with intermittent outgrowth to stationary phase in TSB). Different letters indicate statistically (Tukey HSD post hoc test, P value ≤ 0.05) significant differences among strains. (B) Fraction of cells that contain an inclusion body visible through phase-contrast microscopy for the indicated strains in TSB-grown stationary-phase populations and 6 h after reinoculating 1/100 in fresh TSB. On average, 93.2 cells were observed per strain and condition per independent experiment, and all sample sizes were between 85 and 110 cells. Different letters indicate statistically significant differences (Tukey HSD post hoc test, P value ≤ 0.05) among different strains and time points. For panels A and B, the displayed means were determined over three independent experiments, and the error bars indicate the standard error over these experiments. (C) Representative phase-contrast images of MG1655 ΔrpoS (upper panel) and its evolved heat-resistant MTH1 mutant (lower panel) after stationary-phase growth in TSB. White arrows indicate inclusion bodies. Scale bar corresponds to 2 μm.

FIG 2
Mutants H1 to H17 were derived as single clones from independent lineages of the MG1655 ΔrpoS ibpA-msfGFP parental strain that were subjected to successive cycles of increasingly severe heat shocks (ranging from 15 min at 51°C to 55°C with 0.5°C increments) with intermittent outgrowth to stationary phase in TSB. As a control, UC1 and UC2 were derived as single clones from independent lineages of MG1655 ΔrpoS ibpA-msfGFP that were similarly cycled but without being exposed to heat stress. (A) Logarithmic reduction factor after exposure to heat (55°C for 15 min) of stationary-phase TSB cultures of the parental MG1655 ΔrpoS ibpA-msfGFP strain and its indicated derivatives. Asterisks indicate a statistically significant decrease in the logarithmic reduction factor compared to the parental strain (Tukey HSD post hoc test; *, P ≤ 0.05; **, P ≤ 0.01). (B) The fraction of cells containing a fluorescent IbpA-msfGFP labeled PA in the parental MG1655 ΔrpoS ibpA-msfGFP strain and its indicated derivatives was determined microscopically by sampling a growing TSB culture every 3 h after reinoculating 1/100 from an overnight culture. Out of the 17 heat-selected mutants, 14 (H1, H2, H3, H5, H6, H7, H8, H9, H10, H11, H12, H13, H15, and H16) could be considered as having aberrant cellular PA management compared to the controls on the basis of the IbpA-msfGFP reporter. On average, 100.96 cells were observed per strain and time point per independent experiment, and all sample sizes were between 64 and 148 cells. For panels A and B, the displayed means were determined over three independent experiments, and the error bars indicate the standard error over these experiments. Blue labels indicate mutants with mutations in their tnaA allele as determined by whole-genome sequencing. (C) Representative GFP epifluorescence (reporting IbpA-msfGFP expression and localization) images of MG1655 ΔrpoS ibpA-msfGFP and one of its heat-selected mutants (H2) after 6 h of exponential growth in TSB. Cell outlines are shown in white, and the white arrows indicate IbpA-msfGFP labeled PAs. Scale bar corresponds to 2 μm.

Compromised TnaA folding fidelity boosts the level of heat shock proteins. We subsequently hypothesized that the protein aggregation directly or indirectly inflicted by TnaA variants could lead to activation of the heat shock response, which in turn would provide the cell with increased heat resistance, while simultaneously leading to the emergence of PAs in the cell. Subsequent Western blot analysis indeed revealed that MG1655 ΔrpoS equipped with tnaA D106 displayed increased levels of heat shock proteins (HSPs) such as DnaK and GroEL compared to MG1655 ΔrpoS equipped with either tnaA WT or tnaA K270A (Fig. 4A). Moreover, the increase in DnaK and GroEL levels was similar to that of the included positive control, an MG1655 ΔrpoS rpoH I54T mutant equipped with a constitutively active variant of the heat shock response sigma factor RpoH (Fig. 4A) (27). In fact, the extent of increased heat resistance of the ΔrpoS tnaA D106 mutant coincided with that of the ΔrpoS rpoH I54T strain (Fig. 4B).
To more closely and causally link the expression of TnaA variants to induction of PAs, HSPs, and heat resistance, MG1655 (i.e., now bearing its wild-type σS general stress response regulator) was chromosomally equipped with different reporter constructs and different tnaA alleles (or the rpoH I54T allele as an HSP-upregulated control), after which corresponding exponential-phase cultures were briefly induced with tryptophan and examined for (i) tnaCAB promoter activity (using the PtnaCAB-msfGFP reporter; Fig. 5A), (ii) heat resistance (Fig. 5B), (iii) HSP expression level (using the RpoH-controlled PhtpG-msfGFP reporter; Fig. 5C), and (iv) appearance of PAs (using the IbpA-msfGFP reporter; Fig. 5D). Next to tnaA WT, the examined tnaA alleles included evolved allele tnaA D106 (causing a very high fraction of PA-bearing cells; Fig. 3B), evolved allele tnaA 259fs (causing a low fraction of PA-bearing cells and emerged in heat-resistant mutant H17; Fig. 2B), and tnaA K270A (expressing the catalytically compromised TnaA variant). While the short exposure to tryptophan clearly induced all TnaA variants (Fig. 5A), only induction of TnaA D106 and TnaA 259fs coincided with induction of heat resistance (Fig. 5B), HSPs (Fig. 5C), and PAs (Fig. 5D). In fact, the heat resistance and HSP induction of MG1655 tnaA D106 and MG1655 tnaA 259fs mutants could rival those of MG1655 rpoH I54T, in which HSPs and heat resistance are constitutively upregulated in the absence of any obvious protein aggregation (Fig. 5B to D). Additionally, these results also indicate that σS deficiency is not a requirement for these tnaA alleles to confer their protective effect.
To independently validate these findings across more of the selected tnaA alleles, both control alleles (i.e., tnaA WT and tnaA K270A) and a set of heat-selected alleles (i.e., tnaA D106, tnaA 259fs, tnaA D31, tnaA A130E, tnaA Q240P, tnaA V224E, and tnaA A359P; Table 1) were individually cloned downstream of the IPTG (isopropyl-β-D-thiogalactopyranoside)-controlled Ptrc promoter in the pTrc99A vector and transformed into the MG1655 ΔlacY ibpA-msfGFP PdnaK-mScarlet-I (Fig. 6) and MG1655 ΔlacY dnaK-msfGFP (Fig. 7) reporter strains. Exponential-phase cultures of the resulting strains were subsequently induced with IPTG and examined for (i) heat resistance (Fig. 6A), (ii) HSP expression level (via the RpoH-controlled PdnaK-mScarlet-I reporter in Fig. 6B and the DnaK-msfGFP reporter in Fig. 7), (iii) appearance of PAs (via the IbpA-msfGFP reporter; Fig. 6C), and (iv) PA content (as assessed by SDS-PAGE and mass spectrometry analysis of the insoluble and soluble protein content; Fig. 6D and E and Fig. S3). This confirmed that ectopic expression of the heat-selected tnaA alleles led to increased heat resistance (Fig. 6A) and increased levels of HSPs (Fig. 6B and Fig. 7) and PAs (Fig. 6C) compared to the tnaA WT and tnaA K270A alleles. In two of the heat-selected tnaA alleles (tnaA Q240P and tnaA V224E), the increase in heat resistance was found not to be statistically significant (Fig. 6A), presumably because plasmid-based overexpression of TnaA Q240P and TnaA V224E coincided with very large amounts of misfolded proteins and PAs (Fig. 6C) that likely imposed a burden on the cell. These experiments also indicate the dominance of the heat-selected tnaA alleles over the chromosomal tnaA WT allele that is still present in the reporter strains, which is in line with our hypothesis that these are gain-of-function alleles.
Importantly, although the wild-type TnaA protein is known to be a soluble cytoplasmic protein, SDS-PAGE analysis of the insoluble and soluble protein fractions of the above-mentioned plasmid strains clearly revealed that the heat-selected TnaA D106, TnaA 259fs, TnaA D31, TnaA A130E, TnaA Q240P, TnaA V224E, and TnaA A359P variants (in contrast to TnaA WT and TnaA K270A) massively ended up in the insoluble protein fraction when their expression was induced with IPTG (Fig. 6D and E), indicating that these variants are themselves aggregation-prone and can thus serve as the direct molecular cause of HSP induction. The abundant presence of TnaA in the insoluble protein fraction was indeed confirmed by mass spectrometry analysis for the strains expressing TnaA D106 and TnaA 259fs: TnaA was identified in the gel bands with a 100% protein probability (Fig. S3, Data Set S1).

FIG 3 Legend (Continued)
(B) The fraction of cells that contain an inclusion body visible in phase contrast was determined for the indicated strains after growth in TSB to stationary phase and 6 h after reinoculating 1/100 in fresh TSB medium. On average, 95.9 cells were observed per strain and condition per independent experiment, and all sample sizes were between 80 and 110 cells. (C) Indole concentrations produced by the indicated strains after growth in TSB to stationary phase. For panels A to C, the displayed means were determined over three independent experiments, and the error bars indicate the standard error over these experiments. Within each panel, different letters indicate statistically significant (Tukey HSD post hoc test, P value ≤ 0.05) differences among different strains and time points.

In silico analysis of heat-selected TnaA variants. Closer examination of the heat-selected full-length TnaA variants indicated that they typically incurred single amino acid substitutions to Glu or Pro (Table 2 and represented in Fig. S4), residues that (upon misplacement) have been described as particularly detrimental to overall protein structure (28). In fact, thermodynamic stability calculations (using FoldX [29]) revealed that each of the selected substitutions significantly destabilizes the native structure of the TnaA protein, while the catalytically compromised TnaA K270A variant experiences no significant structural destabilization (ΔΔG scores in Table 2). Moreover, comparing the distribution of the FoldX values of the heat-selected mutants with the distributions obtained either from (i) all TnaA single substitutions that can theoretically be genetically accessed through a single point mutation or (ii) naturally occurring TnaA orthologs strongly suggests heat selection toward TnaA variants with severe structural destabilization (Fig. 8). In addition, the heat-selected substitutions had a zero or extremely low frequency of occurrence in the natural (and likely functional) orthologs (Table 2), further suggesting clear selective pressure toward structural disruption. As such, these findings further underscore that the heat-selected TnaA variants are indeed affected in their folding fidelity and inclined to trigger HSP expression because of this feature.
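One of the reference distributions mentioned above covers all single substitutions that can be reached through a single point mutation. As a minimal sketch of how such a set can be enumerated for one codon, the example below assumes the standard genetic code; the function name and the example codon are illustrative and do not reflect the actual tnaA sequence or the pipeline used in the study.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def accessible_substitutions(codon: str) -> set[str]:
    """Amino acids reachable from `codon` by changing exactly one nucleotide.

    Synonymous changes and stop codons are excluded, so only missense
    substitutions are returned.
    """
    codon = codon.upper()
    original = CODON_TABLE[codon]
    reachable = set()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid != original and amino_acid != "*":
                reachable.add(amino_acid)
    return reachable


# Hypothetical alanine codon: both Glu (E) and Pro (P) are one step away.
print(sorted(accessible_substitutions("GCG")))
```

Applying such an enumeration to every codon of an open reading frame yields the set of substitutions whose predicted stability changes could then serve as a background distribution.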
TnaA-PAs also provide epigenetically inheritable longer-term heat resistance. Time-lapse fluorescence microscopy monitoring of MG1655 cells chromosomally equipped with tnaA D106 or tnaA 259fs (and different reporter constructs) after halting tryptophan-mediated induction revealed that over subsequent generations HSP expression (as judged by the PhtpG-msfGFP reporter) homogeneously returned to basal levels (shown for tnaA D106 in Fig. S5A). On the other hand, the formed TnaA-PAs (as judged by the IbpA-msfGFP reporter) segregated asymmetrically among sister cells, thereby creating PA-bearing (PA+) and PA-lacking (PA−) siblings (shown for tnaA D106 in Fig. S5B). The homogeneous extinction of HSP expression in an emerging microcolony (despite heterogeneous segregation of PAs) underscores that increased HSP expression is triggered by the initial production of misfolded TnaA variants, and not per se by the PA structures into which they assemble. Moreover, the heat resistance of MG1655 tnaA D106 and MG1655 tnaA 259fs was similar (Fig. 5B), despite the fact that in MG1655 tnaA D106 the TnaA-PAs were more abundant (Fig. 5D), further emphasizing that the size or stability of the PA structures is not necessarily instructive for the level of HSP and heat resistance raised.
Nevertheless, we recently documented the occurrence of a different PA-based heat resistance phenomenon (18). In fact, asymmetric inheritance of an ancestral PA (i.e., stemming from a prior heat shock or prior expression of an aggregation-prone protein many generations before) was shown to epigenetically endow the PA+ cell with improved heat resistance compared to its PA− siblings (18). In order to examine whether ancestral TnaA-PAs could have a similar protective effect, well after the increased HSP expression coinciding with their original emergence has dampened out, MG1655 tnaA D106 was tryptophan-induced for 2 h in a liquid culture and subsequently grown for 2 h on agarose pads lacking tryptophan. The resulting microcolonies, now typically consisting of one PA+ and several PA− isogenic siblings (Fig. 9A), were subsequently heat challenged, after which survival and resuscitation time of both types of cells were compared. This clearly revealed (i) that PA+ siblings were endowed with an (on average) 1.9-fold increased survival chance (Fig. 9B), while (ii) the PA+ survivors were furthermore endowed with an (on average) 2.5-h decrease in resuscitation time compared to PA− siblings (Fig. 9C). In fact, survival frequency and resuscitation time of PA− siblings were similar to those of MG1655 tnaA D106 cells not previously triggered with tryptophan (Fig. 9B and C), indicating that the contribution of the ancestrally elevated HSP levels in the tryptophan-triggered PA− cells had meanwhile indeed dampened out completely.
As such, the aggregating TnaA variants cause (i) an immediate population-level heat resistance resulting from the surge in HSP levels and (ii) a secondary subpopulation-level heat resistance in those siblings retaining/inheriting the ancestral TnaA-PA many generations afterward.

FIG 5 Legend (Continued)
or rpoH I54T derivatives with (trp; light gray) or without (no trp; dark gray) tryptophan induction. The average number of observed cells per strain and condition per independent experiment was 347.0, and sample sizes were always between 143 and 745 cells. (B) Logarithmic reduction factor after exposure to heat (54.5°C for 15 min, recovery on LB agar) of AB-grown exponential-phase populations of MG1655 (harboring tnaA WT and rpoH WT) and its tnaA K270A, tnaA D106, tnaA 259fs, or rpoH I54T derivatives with (trp; light gray) or without (no trp; dark gray) tryptophan induction. (C) Average cellular fluorescence of AB-grown exponential-phase populations of the MG1655 PhtpG-msfGFP fluorescent reporter (harboring tnaA WT and rpoH WT) and its tnaA K270A, tnaA D106, tnaA 259fs, or rpoH I54T derivatives with (trp; light gray) or without (no trp; dark gray) tryptophan induction. The average number of observed cells per strain and condition per independent experiment was 421.7, and sample sizes were always between 116 and 965 cells. (D) Fraction of cells containing an IbpA-msfGFP labeled PA of AB-grown exponential-phase populations of the MG1655 ibpA-msfGFP fluorescent reporter (harboring tnaA WT and rpoH WT) and its tnaA K270A, tnaA D106, tnaA 259fs, or rpoH I54T derivatives with (trp; light gray) or without (no trp; dark gray) tryptophan induction. On average, 91.3 cells were observed per strain and condition per independent experiment, and all sample sizes were between 50 and 106 cells.
For all panels, strains were grown to exponential phase in AB medium (i.e., 4 h of growth after 1/1,000 dilution of stationary-phase cultures) with or without addition of 1.25 mM tryptophan for the last 2 h of growth. For all panels, the displayed means were determined over three independent experiments, and the error bars indicate the standard error over these experiments. Asterisks indicate a statistically significant difference for induced compared to noninduced populations of the same strain (Student's t tests followed by Bonferroni correction; *, P ≤ 0.05; **, P ≤ 0.01), while different lowercase (for tryptophan-induced populations) or capital (for noninduced populations) letters indicate statistically significant differences among strains (Student's t tests followed by Bonferroni correction, P value ≤ 0.05).
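The legend above relies on Student's t tests followed by Bonferroni correction, with asterisks marking the significance level. The sketch below shows only the correction and labeling step, assuming the raw P values are already available; the function names and the example P values are hypothetical and are not taken from the study.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted P values: each raw P times the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


def significance_label(adjusted_p):
    """Asterisk labels following the thresholds used in the legend."""
    if adjusted_p <= 0.01:
        return "**"
    if adjusted_p <= 0.05:
        return "*"
    return ""


# Hypothetical raw P values for induced versus noninduced comparisons.
raw = [0.001, 0.012, 0.2]
for p, adjusted in zip(raw, bonferroni(raw)):
    print(f"raw P = {p:.3f}, adjusted P = {adjusted:.3f} {significance_label(adjusted)}")
```

Bonferroni correction is conservative: with many pairwise comparisons, a difference needs a correspondingly smaller raw P value to retain its asterisk.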

DISCUSSION
Our results essentially indicate that adaptive evolution can capitalize on erosive mutations that alter protein folding fidelity and subsequent aggregation dynamics, thereby enabling cells to autonomously raise HSP expression and boost their robustness against future proteotoxic stresses. This (bottom-up) adaptive evolutionary strategy deviates from common expectations to find (top-down) upregulation of the heat shock response through mutations in its master regulators (such as the dedicated RpoH sigma factor), but instead relies on preemptive self-inflicted damage. Indeed, while the protein quality control network is known to alleviate the impact of a proteotoxic stress by restoring the stability of stress-affected proteins, our findings now (reciprocally) indicate that the acquisition of mutations sacrificing protein folding fidelity of even a single nonessential protein presents an evolutionary mechanism to tune the basal levels and readiness of the chaperone network. More generally, our findings indicate that some loss-of-function mutations might as well be orthogonal gain-of-function mutations, further implying that eroding genes with a degenerating sequence (e.g., pseudogenes that are no longer under stringent selection [30]) are not necessarily en route to being evolutionarily purged and lost from the genome but can actually be actively retained by natural selection.

Table 1). For all panels, strains were grown to exponential phase in AB medium (i.e., 4.5 h of growth after 1/1,000 dilution of stationary-phase cultures) with or without addition of 100 mM IPTG for the last 2 h of growth. For panels A to C, the displayed means were determined over three independent experiments, and the error bars indicate the standard error over these experiments. Asterisks indicate a statistically significant difference for induced compared to noninduced populations of the same strain (Student's t tests followed by Bonferroni correction; *, P ≤ 0.05; **, P ≤ 0.01), while different lowercase (for IPTG-induced populations) or capital (for noninduced populations) letters indicate statistically significant differences among strains (Tukey HSD post hoc test, P value ≤ 0.05).

Interestingly, the reason for the tnaA allele being an apparent hot spot for such folding-compromising mutations in our setup is not related to the enzymatic activity of the encoded protein (Fig. 3C and Table 1), despite the fact that indole is otherwise known as a signal molecule with a myriad of physiological effects (e.g., on plasmid stability [31], cell division [32], antibiotic tolerance [33,34], virulence [35], and biofilm formation [36]). In fact, it rather seems related to the fact that tnaA presents a nonessential (i.e., disposable) gene that becomes fully expressed toward the early stationary phase of TSB-grown cells (i.e., when alleviated catabolite repression and the presence of tryptophan boost the tnaCAB operon [23-25]). Since, during our iterative directed evolution regime, heat shocks were consistently administered to TSB-grown stationary-phase cultures, the timing of HSP expression seems to have been optimized by incurring folding-compromising mutations in a gene properly expressed toward this state. This apparent selection for the proper timing of resistance deployment underscores the possible modularity of this adaptive mechanism. Indeed, incurring folding-compromising mutations in a (nonessential) gene that is only expressed during a particular environmental condition that tends to coincide with proteotoxic stress would provide leverage to preemptively upregulate HSP expression and cellular robustness under such conditions. Furthermore, in terms of fitness cost, the transient induction of HSPs by selectively sacrificing the folding of a timely expressed protein might outcompete mechanisms that more constitutively upregulate HSPs via upstream regulatory mutations (e.g., rpoH mutants) or a more general attenuation of folding fidelity (e.g., ribosome mutants). In fact, it was most recently shown that synthetically engineered E. coli mutants with increased proteome-wide mistranslation rates (through an error-prone ribosome [37] or depleted levels of cellular tRNA [38]) displayed constitutively upregulated HSP levels and concurrent heat resistance, although indeed often at the cost of reduced protein synthesis and growth (38). Additionally, mistranslation inherently imposes a proteome-wide burden on the cell to deal with a potentially nonfunctional quasi-species of protein, while limiting that burden to a single protein would naturally be more beneficial.
For some folding-compromised TnaA variants, the specifics of aggregation seem to result in persistent PAs that become asymmetrically inherited for multiple generations (Fig. 9A). This is of interest since PA inheritance itself has most recently been linked to beneficial features in both Saccharomyces cerevisiae (39) and E. coli (18) model systems. In fact, in agreement with our observations regarding heat-induced PAs in E. coli (18), asymmetric cross-generational segregation of ancestral TnaA-PAs leads to a subpopulation of PA+ siblings endowed with clearly improved cellular robustness in terms of higher survival frequency and faster resuscitation speed in response to a heat shock (Fig. 9B and C). Unlike cells experiencing protein aggregation, cells inheriting ancestral (i.e., preformed/established) PAs do not display increased expression of HSPs, although their heat resistance might actually stem from coinheritance of PA-associated HSPs that could fortify the PA+ cell (18). While the underlying mechanism of this phenomenon is still elusive, this additional (longer-term) benefit of protein aggregation might nevertheless also constitute a supplementary selectable advantage that shaped the actual nature of some TnaA variants and the stability of the resulting TnaA-PAs. Interestingly, and in contrast to the faster resuscitation displayed in this study, intracellular PAs have recently also been correlated with dormancy and persistence, indicating that we are currently still scratching the surface with regard to PA physiology (8,40).
Finally, since our data indicate that erosive folding-compromising mutations in proteins can be considered selectable gain-of-function mutations, it is likely that such mutations have left currently unrecognized evolutionary watermarks in the protein sequence space. Indeed, next to their structural or enzymatic cellular function, the evolution of (some) proteins might also have been instructed by their (conditional) folding instability and resulting capacity to boost HSP expression. Likewise, expression patterns of folding-compromised proteins might become tuned to those conditions most in need of higher HSP levels.
In summary, our results reveal a new paradigm in bacterial adaptive physiology in which mutations compromising the folding stability of specific proteins can counterintuitively have a selective advantage because of their subtle upregulating effect on cellular HSP levels. The finding that folding-compromising mutations can actually be positively selected for and are not per se loss-of-function mutations can change our view on the evolution of protein sequences.

MATERIALS AND METHODS
Bacterial strains and growth conditions. The bacterial strains and plasmids used in this study are listed in Table S1, and primers are listed in Table S2. Escherichia coli K-12 strain MG1655 was used as the main background throughout this study. For liquid culturing of bacteria, either tryptone soy broth (TSB; Oxoid, Basingstoke, UK), lysogeny broth according to Lennox (LB), or AB medium (supplemented with 10 μg/ml thiamine, 25 μg/ml uracil, and 1% Casamino Acids) was used as indicated. AB medium (with the above-mentioned supplements) was also used as solid medium to make agarose pads intended for time-lapse microscopy by the addition of 2% agarose (Eurogentec, Seraing, Belgium). In the case of single time point microscopy snapshots, 0.85% KCl agarose pads (2% agarose) were used. Incubation for cell growth was always done at 37°C, except for the appropriate times during strain construction when growth at 30°C or 42°C was required. Liquid cultures were incubated aerobically with shaking (250 rpm) in tubes containing 4 ml of medium. Stationary-phase cultures were obtained by ca. 16 h of growth, while exponential-phase cultures were obtained by diluting stationary-phase cultures 1/100 or 1/1,000 in fresh medium and allowing growth for ca. 3 to 5 h (depending on the dilution and the medium used).
Construction of mutant strains. Mutant alleles (i.e., tnaA K270A, tnaA D106, tnaA 259fs, and rpoH I54T) were exchanged with the corresponding MG1655 wild-type alleles using a previously described two-step process of selection and counterselection (41). For the tnaA D106 and tnaA 259fs alleles, the gene was first replaced by an amplicon containing the tetA-sacB marker prepared on E. coli XTL298 using the primers P1 and P2 (Table S2). In the following step, counterselection against the tetA-sacB cassette was used to replace it with a tnaA amplicon obtained on the tnaA spontaneous mutants (MT3 and H17) using primers P3 and P4 (Table S2). The tnaA K270A allele was constructed by placing a tetA-sacB amplicon (obtained with primers P2 and P6; Table S2) immediately downstream of codon 270, and the desired mutation was synthetically incorporated in the upstream primer used to obtain the amplicon for counterselection (P7; Table S2). For the point mutation located in the rpoH allele, the tetA-sacB cassette (obtained using primers P10 and P11) was inserted downstream of the gene, since rpoH is important for proper cell fitness (42,43), and the clones lacking the cassette after the counterselection step (with an amplicon obtained with primers P12 and P13 on a laboratory strain harboring the rpoH I54T allele [unpublished data]) were screened for the presence of the mutation by sequencing.
The in-frame deletion of tnaA was performed according to the method of Datsenko and Wanner (44). Briefly, an amplicon prepared with primers P8 and P9 on pKD13 (containing the kanamycin resistance cassette) was recombineered in-frame after the start codon of the target gene of a pKD46-equipped strain. The kanamycin resistance gene was flanked by frt sites, enabling it to be excised by transiently equipping the strain with the plasmid pCP20 (expressing the Flp site-specific recombinase [45]). An in-frame deletion of lacY was made in a similar fashion by creating an amplicon on MG1655 lacY::frt-nptI-frt (18) using primers P27 and P28.
To construct chromosomal C-terminal transcriptional fusions of htpG or tnaA with msfGFP, the msfGFP-frt-nptI-frt cassette was amplified from the previously described pDHL1029 plasmid (46) using primers P15 and P16 (htpG) or P3 and P4 (tnaA). Similarly, to create a C-terminal transcriptional fusion of DnaK with mScarlet-I, the mScarlet-I-frt-nptI-frt cassette was amplified from pBAM1-Tn5-mScarlet-I using primers P35 and P36. Subsequent recombineering into the correct locus was achieved by lambda Red-mediated recombination using pKD46 (44). The amplicon was placed 5 bp after the stop codon of the gene of interest, creating an artificial operon and ensuring cotranscription. A strong synthetic ribosome binding site (BBa_B0034; sequence AAAGAGGAGAA [47]) was used to facilitate msfGFP or mScarlet-I expression. Subsequently, the frt-flanked kanamycin resistance cassette was excised by site-specific recombination by transiently equipping the strain with plasmid pCP20. Equipping strains with the previously described chromosomal translational ibpA-msfgfp fusion (18) was achieved in similar fashion. The amplicon for recombineering was obtained from MG1655 ibpA-msfGFP-frt-nptI-frt (18) using primers P19 and P20, followed by the above-mentioned steps to achieve the desired translational fusion.
All constructed mutants were verified by PCR, using primers that anneal upstream and downstream of the engineered locus (Table S2). Subsequently, the mutated genes of interest were confirmed by sequencing (Macrogen, Amsterdam, the Netherlands).
Construction of plasmids. A set of pTrc99A plasmids capable of the IPTG-inducible expression of different tnaA alleles was constructed by first making an amplicon of the corresponding tnaA allele with primer pair P23-P24. These primers amplified the tnaA open reading frame and introduced NcoI and XbaI restriction sites to the ends of the amplicon. Subsequently, digestion of both the pTrc99A vector and the amplicons with NcoI and XbaI allowed the directed ligation of the amplicon into the vector.
All constructed plasmids were confirmed by amplicon sequencing of the engineered sites (Macrogen) and introduced into the target host strains by electroporation and selection for antibiotic resistance encoded by the plasmid.
Heat treatment of liquid cultures. For heat treatment, cultures were harvested by centrifugation (6,000 × g, 5 min) and resuspended in an equal volume of 0.85% KCl. Next, 50 µl of the resuspended culture was transferred aseptically to a sterile PCR tube and heat-treated for 15 min in a T3000 thermocycler (Biometra, Göttingen, Germany) at the indicated temperatures. Additionally, unstressed control cultures were kept at room temperature for the duration of the heat treatment. Samples were aseptically retrieved from the PCR tubes and subsequently used to determine survival, as described below.
Population-level determination of viability. Heat-stressed and unstressed bacterial cultures were serially diluted in 0.85% KCl and subsequently spotted (5 µl) onto tryptone soy agar (TSA; Oxoid) or LB agar (corresponding to the initial growth conditions unless indicated otherwise) as previously described (49). In the case of growth in AB medium, cells were recovered on either TSA or LB agar plates, as indicated. After 24 h of incubation at 37°C, the colonies in spots containing between 5 and 50 colonies were counted to determine the CFU/ml, so that the limit of quantification was 1,000 CFU/ml. Subsequently, the logarithmic reduction factor, log10(N0/N), was determined, in which N0 and N represent the CFU/ml prior to and after heat treatment, respectively.
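The arithmetic behind the plate counts can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes a spot volume of 5 µl (the volume implied by a limit of quantification of 1,000 CFU/ml at a minimum of 5 colonies), and the function names and the example colony counts are hypothetical.

```python
import math

SPOT_VOLUME_UL = 5  # assumed spot volume in microliters


def cfu_per_ml(colonies, dilution_factor, spot_volume_ul=SPOT_VOLUME_UL):
    """CFU/ml of the undiluted sample, from the colony count in one spot."""
    return colonies * dilution_factor * 1000 / spot_volume_ul


def log_reduction(n0, n):
    """Logarithmic reduction factor log10(N0/N)."""
    return math.log10(n0 / n)


# Lowest countable spot: 5 colonies in a spot of the undiluted sample.
limit_of_quantification = cfu_per_ml(colonies=5, dilution_factor=1)

# Hypothetical counts before and after heat treatment (illustration only).
n0 = cfu_per_ml(colonies=20, dilution_factor=10**5)
n = cfu_per_ml(colonies=8, dilution_factor=10**2)
reduction = log_reduction(n0, n)
```

Under these assumptions the lowest countable spot corresponds to the stated limit of quantification, which is a useful consistency check on the spot volume.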
Selection of heat-resistant mutants by directed evolution. Heat-resistant mutants of MG1655 ΔrpoS ibpA-msfGFP were obtained by repeatedly subjecting independent overnight TSB cultures to increasingly severe 15-min heat shocks (from 51°C to 55°C with 0.5°C increments) with intermittent resuscitation and outgrowth of the survivors. After each heat shock, the heat-treated samples were diluted 1/100 in fresh TSB and regrown for 23 h at 37°C before the next round of heat treatment. Following nine cycles of selection, a single surviving clone from each culture was purified on TSA and tested for increased heat resistance at 55°C (15 min), as described above. Additionally, a number of independent lineages were subjected to the same regime in the absence of heat stress in order to determine the potential selective effect of serially passing through TSB.
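As a quick check on the selection regime, the temperature schedule can be enumerated. The sketch below assumes one temperature per selection cycle, which the text implies but does not state explicitly; the function name is hypothetical.

```python
def heat_shock_schedule(start_c=51.0, stop_c=55.0, step_c=0.5):
    """Temperatures (in degrees Celsius) of successive heat shocks, inclusive of both ends."""
    n_steps = round((stop_c - start_c) / step_c)
    return [start_c + i * step_c for i in range(n_steps + 1)]


schedule = heat_shock_schedule()
number_of_cycles = len(schedule)
```

With 0.5°C increments from 51°C to 55°C, the schedule contains nine temperatures, matching the nine cycles of selection reported.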
Whole-genome sequencing (WGS). Genomic DNA was isolated from overnight LB cultures using the GeneJET genomic DNA purification kit (Thermo Fisher Scientific, Waltham, MA, USA), after which 150-bp paired-end libraries were prepared using the Flex library prep kit (Illumina, San Diego, CA, USA) and the Nextera DNA CD index kit (Illumina). Sequencing was performed with an Illumina MiniSeq sequencer, and the data were analyzed with CLC Genomics Workbench (Qiagen, Hilden, Germany). The sequencing reads were trimmed and mapped to the reference genome (MG1655) and analyzed for single nucleotide polymorphisms (SNPs), indels, and structural variants.
Determining indole concentration. Indole concentration was determined based on a previously described method (50). Stationary-phase TSB cultures were harvested (6,000 × g, 5 min), and the supernatant was transferred to a new container, to which 300 µl of Kovac's reagent (Sigma-Aldrich, Saint Louis, MO, USA) was added. After 2 min of incubation, 20 µl from the top was transferred to a container holding 100 µl of 37% HCl and 300 µl of 100% ethanol. The absorbance of 200 µl of this mixture was measured at 550 nm in a microtiter plate using a Multiskan RC instrument (Thermo LabSystems, Vantaa, Finland). Indole concentration was estimated using a calibration curve obtained from the absorbance values at 550 nm after adding 0 to 400 mM indole (Applichem) to stationary-phase cultures of MG1655 ΔtnaA.
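Reading a concentration off a calibration curve can be sketched as below. This assumes a linear relation between absorbance at 550 nm and indole concentration, which the text does not specify, and the standard values are invented for illustration. Concentrations are in the same units as the calibration standards.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(a550, slope, intercept):
    """Invert the calibration line to estimate a concentration."""
    return (a550 - intercept) / slope


# Hypothetical calibration standards (illustration only, not measured data).
standard_concentrations = [0, 100, 200, 300, 400]
standard_absorbances = [0.05, 0.25, 0.45, 0.65, 0.85]

slope, intercept = fit_line(standard_concentrations, standard_absorbances)
estimated = concentration_from_absorbance(0.55, slope, intercept)
```

Spiking the standards into cultures of the indole-negative ΔtnaA strain, as described above, means the intercept absorbs the background absorbance of spent medium.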
Protein extraction. For protein extraction, 40 ml of an exponential-phase culture was harvested (6,000 × g, 10 min, 4°C) and resuspended in 10 mM Tris-HCl (pH 7.5) containing 150 mM NaCl and 5 mM β-mercaptoethanol. The resuspension volume was normalized based on the initial optical density at 600 nm. To break the cells, resuspensions were sonicated (VCX30, Sonics & Materials, Newtown, CT, USA) on ice for a total of 2 min (6 pulses of 20 s at 30% amplitude with 20-s pauses in between). To separate the cellular protein content into soluble and insoluble fractions, the lysate was centrifuged at 13,000 × g for 15 min at 4°C. The supernatant was collected and retained as the soluble fraction. The pellet (insoluble fraction) was subsequently washed three times (13,000 × g for 15 min at 4°C) with buffer X (50 mM HEPES [pH 7.5], 300 mM NaCl, 5 mM β-mercaptoethanol, 0.1 mM EDTA, and protease inhibitor cocktail [Sigma-Aldrich]). The samples were further purified by three washing steps (13,000 × g for 15 min at 4°C) with buffer Y (same as buffer X but without protease inhibitor and supplemented with 0.8% [vol/vol] Triton X-100 and 0.1% sodium deoxycholic acid), with sonication after every wash step for 10 s (5 pulses of 2 s at 30% amplitude with 2-s pauses in between). Finally, samples were resuspended in buffer Z (50 mM HEPES [pH 7.5], 8.0 M urea), heat-treated (10 min, 90°C), and used for SDS-PAGE (12% gel, 20 µl per well). After SDS-PAGE, gels were stained with Coomassie brilliant blue G-250 as previously described (51).
Western blotting. Cells were harvested by centrifugation at 4,000 × g for 30 min, washed first with physiological water, and then washed with 10 ml of buffer A (50 mM HEPES [pH 7.5], 300 mM NaCl, 5 mM β-mercaptoethanol, and 1.0 mM EDTA). Pellets were finally resuspended in 20 ml of buffer B (buffer A supplemented with a protease inhibitor cocktail). Cells were broken with a Glen Creston cell homogenizer (20,000 to 25,000 lb/in²) and additional sonication (Branson digital sonifier 50/60 HZ) on ice with alternating 2-min cycles (15 pulses at 50% power with 30-s pauses on ice, until completing 2 min of total sonication time).
For protein separation, SDS-PAGE was performed using Any kD precast gels (Bio-Rad, Hercules, CA, USA), after which proteins were transferred onto a nitrocellulose membrane (Trans-Blot Turbo Mini 0.2-µm nitrocellulose transfer packs; Bio-Rad) with the Trans-Blot Turbo transfer system (Bio-Rad).
Afterward, the membrane was incubated for 1 h in 5% nonfat dry milk in TBST (Tris-buffered saline with Tween 20, pH 8.0) and then stained overnight with the primary antibodies, i.e., monoclonal mouse anti-DnaK (1 mg/ml; Aviva Systems Biology, San Diego, CA) or anti-GroEL (1 mg/ml; Abcam, Cambridge, UK). The membranes were washed three times for 10 min with 0.5% Tween 20 in phosphate-buffered saline (PBS) and stained with the secondary antibody for 2 h (goat anti-mouse IgG HRP; Abcam). After washing, horseradish peroxidase (HRP) was detected with an enhanced chemiluminescence system (ChemiDoc Imaging System; Bio-Rad), and band intensity was quantified with the Image Lab software (Bio-Rad). The density of each band after background subtraction was expressed as the fold change relative to the parental ΔrpoS strain. Western blot (WB) analysis was performed four times on four independent stationary-phase cultures of each strain.
Liquid chromatography and mass spectrometry (LC-MS/MS). For protein identification, gel pieces were extracted based on the protocol of Shevchenko et al. (52). Digestion was performed overnight at 37°C. Samples were desalted using Pierce C18 solid-phase extraction columns according to the manufacturer's instructions (Thermo Scientific), dried in a SpeedVac, and dissolved in 20 µl of 5% acetonitrile (ACN) and 0.1% formic acid. The digested and desalted samples were diluted 10 times, injected (5 µl), and separated on an Ultimate 3000 UPLC system (Dionex, Thermo Scientific) equipped with an Acclaim PepMap100 precolumn (C18; particle size, 3 µm; pore size, 100 Å; diameter, 0.075 mm; length, 20 mm; Thermo Scientific) and a C18 PepMap rapid-separation liquid chromatography (RSLC) column (particle size, 2 µm; pore size, 100 Å; diameter, 50 µm; length, 150 mm; Thermo Scientific) using a linear gradient (0.300 µl/min). Buffer A was pure water containing 0.1% formic acid. Buffer B was pure water containing 0.08% formic acid and 80% acetonitrile. The fraction of buffer B increased from 0 to 4% in 3 min, from 4 to 10% in 12 min, from 10 to 35% in 20 min, from 35 to 65% in 5 min, and from 65 to 95% in 1 min, and then stayed at 95% for 10 min. The fraction of buffer B then decreased from 95 to 5% in 1 min and stayed at 5% for 10 min. The Orbitrap Elite mass spectrometer (Thermo Scientific) was operated in positive ion mode with a nanospray voltage of 2.1 kV and a source temperature of 250°C. Pierce LTQ Velos ESI positive ion calibration mix (catalog no. 88323, Thermo Scientific) was used as an external calibrant.
The instrument was operated in data-dependent acquisition mode with a survey MS scan at a resolution of 70,000 (full width at half maximum [fwhm]) for the mass range of m/z 400 to 1,600 for precursor ions, followed by MS/MS scans of the top 10 most intense peaks with +2, +3, +4, and +5 charged ions above a threshold ion count of 16,0001e16 at 17,500 resolution (fwhm) using a normalized collision energy of 25 eV with an isolation window of 3.0 m/z, Apex trigger of 5 to 15 s, and dynamic exclusion of 10 s. All data were acquired with Xcalibur 3.1.66.10 software (Thermo Scientific).
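The chromatographic gradient described above can be written down as a table of segments, which makes the total run time and the buffer B fraction at any time point easy to check. This is only a sketch: it assumes that the listed steps follow one another without gaps and that the composition changes linearly within each step; the names are hypothetical.

```python
# Each segment: (duration in min, % buffer B at start, % buffer B at end)
GRADIENT_SEGMENTS = [
    (3, 0, 4),
    (12, 4, 10),
    (20, 10, 35),
    (5, 35, 65),
    (1, 65, 95),
    (10, 95, 95),  # hold at 95% B
    (1, 95, 5),
    (10, 5, 5),    # re-equilibration at 5% B
]


def total_run_time(segments=GRADIENT_SEGMENTS):
    """Total gradient length in minutes."""
    return sum(duration for duration, _, _ in segments)


def percent_b_at(t_min, segments=GRADIENT_SEGMENTS):
    """Percentage of buffer B at time t_min, by linear interpolation."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time lies beyond the end of the gradient")
```

Under these assumptions the program runs for 62 min in total, and `percent_b_at(25)` falls halfway through the 10 to 35% segment.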
Protein identification. Tandem mass spectra were extracted using Progenesis. Mascot was set up to search the uniprot_ecolimax_ database (282,225 entries, where the mutated sequences have been added) assuming the digestion enzyme trypsin. X! Tandem was set up to search a reverse concatenated uniprot_ecolimax_ database (unknown version, 564,448 entries). Mascot and X! Tandem were searched with a fragment ion mass tolerance of 0.20 Da, a parent ion tolerance of 12 ppm, and 2 possible miscleavages. Carbamidomethyl of cysteine was specified in Mascot and X! Tandem as a fixed modification. Deamidation of asparagine and glutamine and oxidation of methionine were specified in Mascot as variable modifications. Glu->pyro-Glu of the N terminus, ammonia-loss of the N terminus, Gln->pyro-Glu of the N terminus, deamidation of asparagine and glutamine, and oxidation of methionine were specified in X! Tandem as variable modifications.
Scaffold v 4.11.0 (Proteome Software, Inc., Portland, OR) was used to validate MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 97.0% probability by the Peptide Prophet algorithm (53) with Scaffold delta-mass correction. Protein probability was assigned by the Protein Prophet algorithm (54).
Time-lapse fluorescence microscopy (TLFM). For TLFM, appropriate dilutions of cell cultures were transferred to agarose pads containing the appropriate medium on a microscopy slide and covered with a cover glass attached to a 125-µl Gene Frame (Thermo Fisher Scientific) to hold the cover glass on the microscopy slide. TLFM was performed on a Ti-Eclipse inverted microscope (Nikon, Champigny-sur-Marne, France) equipped with a ×60 Plan Apo λ oil objective, a TI-CT-E motorized condenser, and a Nikon DS-Qi2 camera. Green fluorescent protein (GFP) was imaged using a quad-edge dichroic (395/470/550/640 nm) and a fluorescein isothiocyanate (FITC) single emission filter. A SpectraX LED illuminator (Lumencor, Beaverton, OR, USA) was used as the light source, using the 470/24 excitation filter. Temperature was controlled with a cage incubator (Okolab, Ottaviano, Italy).
Images were acquired using NIS-Elements software (Nikon), and the resulting pictures were further handled with the open source software ImageJ. The average cellular fluorescence was determined using the open source software Ilastik (55), which was trained to robustly identify and segment bacterial cells and exclude debris and out-of-focus cells. Background fluorescence was subtracted using NIS-Elements software.
Single-cell determination of viability and resuscitation time. Cells were placed on agarose pads on a microscopy slide, as described above, and monitored with TLFM during growth for 2 h at 37°C (approximately 3 generations). Subsequently, the XY coordinates of the observed microcolonies were determined with NIS-Elements software, and the slide was subjected to a semilethal heat shock by taping the slide to the lid of a thermocycler set to the appropriate temperature (53.5°C, 15 min). After the heat shock, the locations of the heat-exposed microcolonies were traced back using the XY coordinates, and the cells were further monitored for 8 h to determine survival and lag time. Subsequently, the number of cells surviving the heat shock in microcolonies was determined by monitoring cells with TLFM. Cells were marked as surviving cells when they were observed to grow and divide within 8 h after the heat treatment. The time of the first binary fission after heat treatment was used as a proxy for resuscitation time. Cells were binned according to whether they had an inclusion body by visual determination or a fluorescent IbpA-msfGFP focus before heat treatment.
Thermodynamic stability calculations of TnaA variants. Based on the TnaA sequence of the E. coli K-12 MG1655 strain used throughout this study (protein and nucleotide sequences in Data Set S2), all possible single amino acid substitutions that can theoretically result from a single nucleotide mutation were generated, resulting in a list of 2,765 variants. In addition, naturally occurring substitutions were gathered from a set of sequences orthologous to E. coli K-12 TnaA from the OMA (Orthologous Matrix) database (56), resulting in 218 sequences after removing outliers with internal indels over 10 residues compared to the reference sequence (Data Set S3). A substitution was classified as naturally occurring if it occurs in at least one of the TnaA orthologues. The effects of these substitutions and the heat-selected substitutions on TnaA thermodynamic stability were calculated and compared using FoldX3.0 with default settings (29) on the pdb-structure 2oqx (57).
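The enumeration step can be sketched as follows: for every codon, each of the nine single-nucleotide changes is translated, and the resulting amino acid change is recorded if it differs from the wild type. This is an illustration under stated assumptions, not the authors' script. In particular, it excludes changes to a stop codon and includes the start codon, and either choice may differ from how the list of 2,765 variants was compiled. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_substitutions(cds):
    """Return sorted (position, wild_type, mutant) amino acid substitutions
    reachable by one nucleotide change in the coding sequence `cds`."""
    variants = set()
    for codon_index in range(len(cds) // 3):
        codon = cds[3 * codon_index: 3 * codon_index + 3]
        wild_type = CODON_TABLE[codon]
        if wild_type == "*":
            continue  # skip the stop codon itself
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant_codon = codon[:pos] + base + codon[pos + 1:]
                mutant = CODON_TABLE[mutant_codon]
                # Assumption: nonsense (stop) and synonymous changes are excluded.
                if mutant != wild_type and mutant != "*":
                    variants.add((codon_index + 1, wild_type, mutant))
    return sorted(variants)


# Toy example: Met-Trp
toy_variants = single_nucleotide_substitutions("ATGTGG")
```

Each resulting tuple could then be passed to a stability predictor as a point mutation on the reference structure.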
Statistical analysis. Statistical analyses (analysis of variance [ANOVA], Tukey honestly significant difference [HSD] post hoc test, t test, Kolmogorov-Smirnov [KS] test, generalized linear mixed models, Bonferroni correction, and bootstrapping, together with the appropriate tests for underlying assumptions) were carried out using the open source software R (R Core Team, 2020) (60). Differences were regarded as significant when the P value was ≤0.05. Means and the corresponding standard errors were typically calculated over three independent experiments.
To estimate the standard error of the surviving cellular fractions in Fig. 9B, the original sample size was bootstrapped (sampled with replacement) 10,000 times to calculate the mean fractions of surviving cells. This allowed the calculation of a bootstrapped estimation of the standard error of the mean fraction of surviving cells by determining the standard deviation of the bootstrapped means.
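The bootstrap procedure described above can be sketched as follows. The analyses in this study were carried out in R; the Python version below is only an illustration, and the per-cell outcome coding (1 for a surviving cell, 0 otherwise), the function name, and the example data are assumptions.

```python
import random
import statistics


def bootstrap_se_of_fraction(outcomes, n_resamples=10_000, seed=0):
    """Bootstrap estimate of the standard error of the mean of `outcomes`.

    `outcomes` holds one value per cell (1 = survived, 0 = did not survive).
    The sample is resampled with replacement at its original size, and the
    standard deviation of the resampled means is returned.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(outcomes, k=n)
        means.append(sum(resample) / n)
    return statistics.stdev(means)


# Hypothetical data: 30 surviving cells out of 100 observed cells.
example_outcomes = [1] * 30 + [0] * 70
example_se = bootstrap_se_of_fraction(example_outcomes)
```

For a simple fraction such as this one, the bootstrapped value should lie close to the analytical binomial standard error, sqrt(p(1 - p)/n), which offers a convenient sanity check.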

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. DATA SET S1, XLSX file, 0.01 MB.