Structural and thermodynamic analyses of the β-to-α transformation in RfaH reveal principles of fold-switching proteins

The two-domain protein RfaH, a paralog of the universally conserved NusG/Spt5 transcription factors, is regulated by autoinhibition coupled to the reversible conformational switch of its 60-residue C-terminal Kyrpides, Ouzounis, Woese (KOW) domain between an α-hairpin and a β-barrel. In contrast, NusG/Spt5-KOW domains only occur in the β-barrel state. To understand the principles underlying the drastic fold switch in RfaH, we elucidated the thermodynamic stability and the structural dynamics of two RfaH- and four NusG/Spt5-KOW domains by combining biophysical and structural biology methods. We find that the RfaH-KOW β-barrel is thermodynamically less stable than that of most NusG/Spt5-KOWs and we show that it is in equilibrium with a globally unfolded species, which, strikingly, contains two helical regions that prime the transition toward the α-hairpin. Our results suggest that transiently structured elements in the unfolded conformation might drive the global folding transition in metamorphic proteins in general.


Introduction
A fundamental understanding of how proteins fold has long been one of the most important questions in structural biology, and it remains unanswered despite recent progress in protein structure prediction (Jumper et al., 2021; Tunyasuvunakool et al., 2021). Since the formulation of the 'thermodynamic hypothesis of protein folding' by Anfinsen (Epstein et al., 1963), it has been generally accepted that the amino acid sequence of a protein determines its three-dimensional structure and that a protein adopts only a single folded conformation, which is referred to as the physiological state and which corresponds to its global energy minimum. This conformation, in turn, fulfills one distinct function. While this 'one sequence-one structure-one function' dogma holds true for most well-folded (globular) proteins, it has been challenged by several discoveries over the past decades. Among those are, for instance, (i) moonlighting proteins, which fulfill two completely unrelated functions (Jeffery, 2014;

Figure 1 (caption fragment). Tubes indicate α-helical elements, arrows represent β-strands. The amino acid sequence is shown above. (C) Secondary chemical shift of VcRfaH. The plots show the difference between the observed chemical shift and the corresponding predicted random coil value of 13Cα (top) and 13CO (bottom). Positive values indicate helical, negative values elongated (β-sheet) structures, and values close to zero are observed for random coil-like structures. The secondary structure elements inferred from the analysis are shown above the graphs (code for secondary structure elements as in (B)). The position of the identified disulfide bridge (see also

Evolutionary conservation of fold-switching within the RfaH family
To date, three-dimensional structures and comprehensive evidence for fold-switching are available only for EcRfaH (Belogurov et al., 2007; Burmann et al., 2012; Zuber et al., 2019), although other RfaH orthologs seem to employ a similar mechanism to carry out their function (Carter et al., 2004; Porter et al., 2022). Thus, we first asked whether this ability might be a general feature of RfaH proteins. We chose RfaH from Vibrio cholerae (VcRfaH) for a structural analysis by solution-state NMR spectroscopy as it is evolutionarily remote from EcRfaH (sequence identity Ec/VcRfaH: 43.6% [full-length] or 35.8% [KOW domain], respectively). We first identified the secondary structure elements of the full-length protein by performing an NMR backbone assignment and calculating the secondary chemical shift for each 13Cα and 13CO atom, which depends on the main chain geometry (Figure 1C). In full-length VcRfaH, the KOW domain exhibits two stretches with helical structure that are separated by about four residues, and the overall pattern of secondary structure elements matches that of autoinhibited EcRfaH (Burmann et al., 2012), suggesting similar tertiary structures for EcRfaH and VcRfaH (compare Figure 1A), but with helix α3* being 1.5 turns longer in VcRfaH. Interestingly, the Cα and Cβ atoms of C34 and C102 exhibit chemical shifts typical for cystines (Sharma and Rajarathnam, 2000; Figure 1-figure supplement 1A). These residues are located at the end of helix α3* and in strand β3*, respectively, and are, most probably, in close proximity, as indicated by the structure of EcRfaH.
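The secondary chemical shift analysis described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis pipeline: the function names, the 0.7 ppm classification threshold, and all shift values are assumptions made for the example and are not data from this study.

```python
def secondary_shift(observed_ppm, random_coil_ppm):
    """Secondary chemical shift: observed minus random-coil value (ppm)."""
    return observed_ppm - random_coil_ppm


def classify(delta_ppm, threshold=0.7):
    """Crude per-residue label from a 13Ca (or 13CO) secondary shift.

    Positive deviations point to helical, negative ones to extended
    (beta-sheet) geometry. The threshold is an illustrative assumption.
    """
    if delta_ppm > threshold:
        return "helical"
    if delta_ppm < -threshold:
        return "extended"
    return "coil-like"


# Hypothetical 13Ca shifts (ppm), one entry per residue; not measured values.
observed = [55.9, 54.8, 52.6, 56.1]
random_coil = [52.5, 52.5, 52.5, 58.3]

labels = [classify(secondary_shift(o, r)) for o, r in zip(observed, random_coil)]
```

In practice, runs of consecutive residues with the same sign of deviation are more informative than single values, which is why the analysis in the text refers to stretches of helical structure.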
The addition of a reducing agent to [2H, 15N, 13C]-VcRfaH led to drastic changes of the chemical shifts of C34 and C102 as well as of residues in spatial proximity in a [1H, 15N]-heteronuclear single quantum coherence (HSQC) spectrum (Figure 1-figure supplement 1B). From this we conclude that C34 and C102 form a disulfide bridge that covalently tethers the α3*-helix to the core of VcRfaH-NGN, a feature absent in EcRfaH. However, upon refolding from a solution containing 8 M urea and reducing agent, 15N-VcRfaH adopted the same conformation as before denaturation (Figure 1-figure supplement 1C), suggesting that the disulfide bridge is not required for VcRfaH to fold into the autoinhibited state.
Next, we determined the solution structure of the isolated VcRfaH-KOW domain by NMR spectroscopy. VcRfaH-KOW also shows the five-stranded β-barrel topology typical for NusG/Spt5-KOW domains (Figure 1D and Table 1), with a Cα root mean square deviation (rmsd) of 1.4 Å as compared to isolated EcRfaH-KOW.
Although we do not present functional data on VcRfaH here, these results strongly suggest that VcRfaH-KOW can also switch between an all-α and an all-β state and that VcRfaH thus is, most probably, also a transformer protein.

The model systems
The sequence of NusG/Spt5-KOW domains has been evolutionarily optimized to fold in only one defined conformation. Consequently, in the case of RfaH-KOW, the ability to switch between the all-α and the all-β state must be encoded within the primary structure, whereas the 'decision' which state to adopt solely depends on the availability of RfaH-NGN (Tomar et al., 2013). Sequence alignments and bioinformatic approaches (Balasco et al., 2015; Bernhardt and Hansmann, 2018; Gc et al., 2014; Joseph et al., 2019; Li et al., 2014; Shi et al., 2017; Xiong and Liu, 2015) gave first hints why RfaH, in contrast to NusG, is a metamorphic protein and how the structural switch might proceed. Yet, experimental evidence is still scarce. Thus, we analyzed isolated KOW domains of six NusG/Spt5 or RfaH proteins to identify characteristic properties of fold-switching proteins and to understand the molecular mechanisms underlying the refolding of RfaH-KOW. Because NusG proteins are universally conserved, we chose NusG-KOWs from E. coli and Mycobacterium tuberculosis (Ec/MtNusG-KOW), the Spt5-KOW from the hyperthermophilic archaeon Methanocaldococcus jannaschii (MjSpt5-KOW), and the fifth KOW domain from human Spt5 (hSpt5-KOW5) as representative NusG/Spt5-KOWs, and the Ec/VcRfaH-KOWs as representatives for RfaH proteins. The constructs used are about 65 residues in length and contain the structured region and parts of the neighboring linker(s) (Figure 1-figure supplement 2A). In isolation, all six domains exhibit the typical β-barrel topology (Figure 1-figure supplement 2B) with major differences only in the loops or turns connecting the β-strands (Figure 1-figure supplement 2C).

Table 1 (fragment).
Dihedral restraints: 76
Deviation from ideal geometry: bond length (Å) 0.000544±0.000039; bond angle (°) 0.1096±0.0056
Coordinate precision*, †: backbone heavy atoms (Å) 0.32; all heavy atoms (Å) 0.90
Ramachandran plot statistics ‡ (%): 91.8/7.9/0.2/0.1
* The precision of the coordinates is defined as the average atomic root mean square difference between the accepted simulated annealing structures and the corresponding mean structure calculated for the given sequence region.

Thermal and chemical stability of the KOW domains
Metamorphic proteins that switch between two stable conformations are expected to show two main minima in their energy landscape, each corresponding to one of these states (Dishman and Volkman, 2018). This implies that (i) in order to control the structural interconversion, one of the conformations has to be (de)stabilized according to a molecular signal, and (ii) the energy minima cannot be as deep as the global minimum of a protein with a single, stable conformation, to avoid permanent trapping of one state. Consequently, the all-β RfaH-KOW should show a limited thermodynamic stability to allow facile refolding to the all-α state when RfaH-NGN is available after transcription termination. To test this hypothesis, we analyzed the thermal stability of the six KOW domains by CD-based thermal denaturation experiments (Figure 2A) and by DSC (Figure 2B) at pH 4 and pH 7. At pH 7, unfolding was reversible for all KOW domains except for hSpt5-KOW5, which showed aggregation; the opposite effect was observed at pH 4 (Figure 2-figure supplement 1). All observed unfolding transitions were analyzed with a two-state model to determine the melting temperature, Tm, the enthalpy of unfolding at Tm, ΔHu(Tm), and, in the case of the DSC thermograms, the temperature-dependent difference in heat capacity between the N and U states, ΔCp(T) (Figure 2C and D and Table 2). Because the KOW domains are β-barrels, the precision of the thermodynamic parameters determined by CD spectroscopy is not as high as for proteins with helical elements. Nevertheless, the results obtained by DSC and CD spectroscopy are in good agreement, showing that EcNusG-KOW, MtNusG-KOW, and MjSpt5-KOW have much higher Tm values (87°C, 77°C, and 111°C, respectively) than hSpt5-KOW5 (58-60°C), EcRfaH-KOW (47-50°C), and VcRfaH-KOW (65-70°C). The same trend was observed for ΔHu(Tm) values. Consequently, these data indicate that EcNusG-KOW, MtNusG-KOW, and MjSpt5-KOW have a higher thermodynamic stability than hSpt5-KOW5, EcRfaH-KOW, and VcRfaH-KOW.
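As an illustration of how a two-state model relates Tm, ΔHu(Tm), and ΔCp to the fraction of unfolded protein, the following minimal sketch uses the Gibbs-Helmholtz equation. It assumes a temperature-independent ΔCp, which is a simplification of the temperature-dependent ΔCp(T) used for the DSC fits in the text; the function names and the numbers in the usage lines are hypothetical and are not fitted values from this study.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_g_unfolding(T, Tm, dH_Tm, dCp=0.0):
    """Gibbs-Helmholtz free energy of unfolding (kJ/mol) at temperature T (K).

    Assumes a constant heat capacity change dCp (kJ/(mol K)); this is a
    simplifying assumption for the sketch.
    """
    return dH_Tm * (1.0 - T / Tm) + dCp * ((T - Tm) - T * math.log(T / Tm))


def fraction_unfolded(T, Tm, dH_Tm, dCp=0.0):
    """Fraction of unfolded protein for a two-state N <-> U equilibrium."""
    K = math.exp(-delta_g_unfolding(T, Tm, dH_Tm, dCp) / (R * T))
    return K / (1.0 + K)


# Hypothetical parameters: Tm = 77 degC, dH(Tm) = 200 kJ/mol, dCp = 3 kJ/(mol K).
Tm = 350.15
f_at_tm = fraction_unfolded(Tm, Tm, 200.0, 3.0)
f_room = fraction_unfolded(298.15, Tm, 200.0, 3.0)
f_hot = fraction_unfolded(370.0, Tm, 200.0, 3.0)
```

By construction, the free energy of unfolding is zero at Tm, so the two states are equally populated there; a measured CD or DSC signal would be fitted by combining this fraction with folded and unfolded baselines.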
To corroborate and complement the previous findings, we next performed far-UV CD-based chemical unfolding experiments at pH 4 and pH 7 using urea as denaturant (Figure 3A-F, left). EcNusG-KOW, MtNusG-KOW, hSpt5-KOW5, and VcRfaH-KOW show a sigmoidal unfolding curve at either pH, indicative of a two-state unfolding process. Analysis of these data by the linear extrapolation model yields transition midpoints ([urea]1/2 values) and ΔGu(H2O) values that confirm the relative order of stability as determined by thermal denaturation (Figure 3G and H, Table 3, and Figure 2C and D). For MjSpt5-KOW, only the native state baseline is observable at both pH values, demonstrating that no denaturation could be achieved and that, consequently, this KOW domain exhibits the highest thermodynamic stability (assuming an m value comparable to that of the other KOW domains, MjSpt5-KOW likely has a ΔGu(H2O) value >30-40 kJ/mol). Notably, we obtained a ΔGu(H2O) value for hSpt5-KOW5 at pH 7, showing that this domain has a stability comparable to that of VcRfaH-KOW at physiological pH (Table 3). As VcRfaH-KOW, in contrast to all other KOW domains in this study, contains a Trp residue, an additional fluorescence-based denaturation experiment was performed, and the obtained parameters are in good agreement with the CD data (Table 3).
To complement the analysis, we repeated the unfolding experiments at pH 7 using guanidinium chloride (GdmCl; Figure 3A-F, right, Table 3, and Figure 3-figure supplement 1B). As GdmCl is a more potent denaturant than urea, we were now able to denature even MjSpt5-KOW, giving a [GdmCl]1/2 value of 5.03 M, which is more than twice the value of the next most stable protein. In accordance with the urea-based unfolding experiments at pH 7, MjSpt5-KOW, EcNusG-KOW, and MtNusG-KOW exhibit higher ΔGu(H2O) and [denat]1/2 values than VcRfaH-KOW and hSpt5-KOW5, although the relative order of stability of MtNusG-KOW and EcNusG-KOW is swapped. This difference, as well as the difference between the absolute ΔGu(H2O) values derived from the urea- and GdmCl-based denaturations, is a well-documented phenomenon and may be attributed to the limited applicability of the linear extrapolation model for the analysis of denaturations by GdmCl (see e.g. Gupta et al., 1996; Makhatadze, 1999). Thus, we base our conclusions on the relative comparison of the obtained values. We finally note that chemical unfolding was completely reversible in all cases (Figure 3-figure supplement 2).

Figure 2 (caption, partial). (A) Thermal unfolding monitored via change in the circular dichroism (CD) signal with a temperature gradient from 20°C to 95°C. The line corresponds to the best fit to a two-state unfolding model. Measurements were carried out with proteins in 10 mM K-acetate (pH 4.0) buffer for hSpt5-KOW5 and in 10 mM K-phosphate (pH 7.0) buffer for all other domains. The wavelength for monitoring the transition was chosen based on the largest difference between the spectra of the folded and unfolded protein (for details, see Materials and methods). Data for EcNusG-KOW were not fitted due to the lack of the baseline of the unfolded state. MjSpt5-KOW could not be denatured at all. (B) Thermograms obtained from differential scanning calorimetry (DSC) measurements. All profiles are normalized to one mole of protein. The lines correspond to best fits to a two-state unfolding model that includes a T-dependent ΔCp change. Buffers are as in (A). (C, D) Tm (C) and ΔHu(Tm) (D) values derived from thermal unfolding experiments monitored by CD and DSC. Error bars result from data fitting.
Source data 1 for Figure 2. Data for thermal denaturation experiments for all Kyrpides, Ouzounis, Woese (KOW) domains.
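The linear extrapolation model used for the chemical denaturation curves can be sketched as follows. The model assumes that the free energy of unfolding decreases linearly with denaturant concentration, ΔGu([D]) = ΔGu(H2O) − m·[D], so that the transition midpoint is ΔGu(H2O)/m. The function names, the temperature of 298.15 K, and the example values are assumptions made for illustration and are not the fitted parameters of Table 3.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_g_lem(denaturant_M, dG_water, m_value):
    """Free energy of unfolding (kJ/mol) at a given denaturant concentration (M)."""
    return dG_water - m_value * denaturant_M


def midpoint(dG_water, m_value):
    """Denaturant concentration (M) at which folded and unfolded states are equal."""
    return dG_water / m_value


def fraction_unfolded_lem(denaturant_M, dG_water, m_value, T=298.15):
    """Two-state fraction unfolded; T = 298.15 K is an assumed temperature."""
    K = math.exp(-delta_g_lem(denaturant_M, dG_water, m_value) / (R * T))
    return K / (1.0 + K)


# Hypothetical domain: dG(H2O) = 14 kJ/mol, m = 3.5 kJ/(mol M).
c_half = midpoint(14.0, 3.5)
f_native = fraction_unfolded_lem(0.0, 14.0, 3.5)
f_half = fraction_unfolded_lem(c_half, 14.0, 3.5)
```

Because ΔGu(H2O) is obtained by extrapolating from the transition region back to zero denaturant, small errors in m propagate strongly, which is one reason the text compares relative rather than absolute values.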
Surprisingly, and in contrast to all other domains, EcRfaH-KOW shows a more complex unfolding curve in both urea- and GdmCl-based denaturation experiments at pH 7, with an additional plateau at ≈3 M urea or ≈1 M GdmCl, respectively, between the N and U baselines (Figure 3F; no curve could be obtained at pH 4 due to native state aggregation). This suggests that the unfolding of EcRfaH-KOW may be described by a three-state model including an observable equilibrium intermediate that might play an important role in the fold-switching mechanism of EcRfaH-KOW.
In summary, the poor spectroscopic properties of the analyzed domains limit the precision of the absolute values of the thermodynamic parameters obtained from CD experiments. However, our findings reveal clear differences in the global stability of the six domains and allow a grouping into two classes: MjSpt5-KOW and Ec/MtNusG-KOW are considered as 'stable domains', whereas the β-barrel Ec/VcRfaH-KOW as well as hSpt5-KOW5 show a reduced thermodynamic stability.
Regions that are unfolded in all-α RfaH-KOW are destabilized in the all-β conformation
We next asked whether the less stable KOW domains also exhibit local differences in their stability as compared to the NusG-KOWs and MjSpt5-KOW. Therefore, we identified the backbone H-bond pattern in the six domains and quantified the magnitude of the through H-bond coupling constant, h3JNC', by long-range HNCO NMR experiments (Table 4). This parameter is inversely proportional to the length of the H-bond and the deviation from its optimum angle, thus reflecting the H-bond strength (Grzesiek et al., 2004). To allow comparison between the six domains, we grouped H-bonds that are located at equivalent positions of the β-barrels and ordered them according to their position in the individual β-sheets (Figure 4A and B). Most |h3JNC'| values are in the range of 0.5-0.9 Hz, which is typical for H-bonds of β-sheets (Grzesiek et al., 2004). In line with having the highest Tm, MjSpt5-KOW often exhibits the highest coupling constants, which is indicative of a highly rigid packing of the β-barrel. Strikingly, MjSpt5-KOW has three additional H-bonds between strands β5 and β1 (#22-24), which provide an extra stabilization of the C-terminal β-strand that may contribute to the high thermostability of this protein. The 'stable' domains (i.e. Ec/MtNusG-KOW and MjSpt5-KOW) show their strongest H-bonds in two regions, namely between strands β1:β2 and β3:β4. In addition, most of these H-bonds are more stable than the corresponding H-bonds in Ec/VcRfaH-KOW and hSpt5-KOW5, implying that the H-bonds in the domains with reduced stability are more dynamic and on average longer or involve a less optimal bonding angle. From this we conclude that in Ec/VcRfaH-KOW and hSpt5-KOW5 strand β1 and parts of β4 are less stably bound to the rest of the β-barrel than in the stable domains. Moreover, together with the fact that β1, the C-terminal half of β4, and β5 are disordered in the all-α state of the Ec/VcRfaH-KOW (Figure 4C), this also reflects the chameleonic folding behavior of these regions in the all-β state.
hSpt5-KOW5, Ec- and VcRfaH-KOW exchange with a globally unfolded conformer on the ms time scale
To assess the folding mechanism of the KOW domains at the amino acid level, we performed an NMR-based analysis of the structural dynamics of the six β-barrel proteins. As larger structural rearrangements, such as folding events, mostly occur at the μs-ms time scale for small proteins or are even slower (Maxwell et al., 2005), we focused on the analysis of the slow chemical exchange regime. Therefore, we performed amide 15N-based chemical exchange saturation transfer (CEST) experiments (Vallurupalli et al., 2012). This method allows the sensitive detection and characterization of sparsely populated states (= minor species; relative population pB) that exchange with a major species (relative population pA = 1 − pB) with a rate kex of 10-200 s⁻¹. The detection is achieved by frequency-selective saturation along the 15N dimension that is 'transferred' from the minor to the major species. This decreases the signal intensity of the major species and then leads to an additional dip in the CEST profile (major species signal intensity versus saturation frequency) next to the large major species minimum if there is a difference in the resonance frequencies of the two species. None of the CEST profiles of EcNusG-KOW, MtNusG-KOW, and MjSpt5-KOW exhibits an exchange peak (Figure 5A-C), demonstrating that these domains are stable on the ms time scale, in agreement with their high thermodynamic stabilities (see above). In contrast, most CEST traces of hSpt5-KOW5, EcRfaH-KOW, and VcRfaH-KOW have a second dip, indicating exchange with a second, low-populated state (exemplary traces are shown in Figure 5D-F).

Table footnote (fragment). The m value is a measure of the broadness of the transition and correlates with the difference in the accessible surface area between N and U, and the transition midpoint.
Using a two-state exchange model, we fitted all CEST traces that showed an exchange signal individually to determine the residue-specific kex and pB values. In all three cases, the kex/pB values appear to cluster in one region, suggesting a global, cooperative process (Figure 5-figure supplement 1A). Thus, we next performed a global fit of all CEST traces for each of the three proteins, resulting in global rate constants and populations as well as lifetimes of the two states (Table 5). This analysis yields a relatively high pB value (5.50%) but low kex (15.0 s⁻¹) for EcRfaH-KOW, a much lower pB value (0.43%) but higher kex (75.0 s⁻¹) for VcRfaH-KOW, and pB/kex values of 0.85% and 89.0 s⁻¹ for hSpt5-KOW5.

To characterize the exchanging species structurally, we analyzed the chemical shifts of the minor species. In all three cases, the minor species shifts show a very good correlation with those of a completely unfolded conformation (Figure 5G-I; R 2 >96%, rmsd <1.04 ppm). Note that the chemical shifts for the unfolded state of EcRfaH-KOW were obtained experimentally by backbone assignment of the protein in 8 M urea, whereas those of VcRfaH-KOW and hSpt5-KOW5 are predicted values (see Materials and methods for details). Determination of the relative populations finally results in the equilibrium constant and the difference in Gibbs free energy, ΔG, separating the energy levels of the two species ( Table 5). As expected, these ΔG values are similar to those obtained from the urea-based unfolding experiments at pH 7 (Table 3).
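The step from CEST populations to an equilibrium constant and ΔG can be sketched as below. For a two-state exchange with kex = kAB + kBA, the forward and reverse rate constants follow from the populations, and ΔG = −RT ln(pB/pA). The function name and the temperature of 298.15 K are assumptions made for this example; the measurement temperature is not stated in this section. The input values for EcRfaH-KOW (pB = 5.5%, kex = 15 s⁻¹) are taken from the text.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def exchange_parameters(p_minor, k_ex, T=298.15):
    """Derive two-state exchange quantities from a minor population and k_ex.

    T = 298.15 K is an assumed temperature for this sketch.
    """
    p_major = 1.0 - p_minor
    k_forward = p_minor * k_ex  # major -> minor, in 1/s
    k_reverse = p_major * k_ex  # minor -> major, in 1/s
    K_eq = p_minor / p_major
    return {
        "K_eq": K_eq,
        "dG_kJ_per_mol": -R * T * math.log(K_eq),
        "lifetime_major_s": 1.0 / k_forward,
        "lifetime_minor_s": 1.0 / k_reverse,
    }


# EcRfaH-KOW values quoted in the text: pB = 5.50%, kex = 15.0 1/s.
ec = exchange_parameters(0.055, 15.0)
```

Under the assumed temperature, the EcRfaH-KOW populations give a ΔG of roughly 7 kJ/mol, in line with the value quoted for the CEST experiment later in the text.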
Taken together, the CEST experiments show that the folded all-β state of the isolated RfaH-KOWs and also hSpt5-KOW5 is in equilibrium with a species that resembles an unfolded conformation. As this state is easily accessible from the β-barrel, we conclude that the folding barrier separating the two states cannot be too high as this would prohibit an exchange on the ms time scale.

The unfolded conformers of Ec- and VcRfaH-KOW contain transient helical structures
Although the chemical shifts of the minor species of EcRfaH-KOW nicely correlate with the chemical shifts of the urea-unfolded protein ( Figure 5I), there are some noticeable differences in the 15 N chemical shifts (Δδ 15 N) of the two data sets (red bars in Figure 6A, top panel). In particular, two regions (region 1: Q127-T131, region 2: E136-I150) show significant deviations of -1 to -3 ppm, indicating local residual structures in these regions. As the type of present (secondary) structure cannot be derived from 15 N data, we recorded a CEST experiment on the 13 Cα carbons of 13 C, 15 N-EcRfaH-KOW ( Figure 6-figure supplement 1) and calculated Δδ 13 Cα between the observed minor species values and the random coil values obtained from the urea-unfolded protein (red bars in Figure 6A, bottom panel). The deviations are positive in regions 1 and 2, indicating the presence of helical structures at these sites. This is in agreement with secondary structure predictions, which show that the Leu-rich motif (LLLNL) in region 2, where the deviations of δ 15 N and δ 13 Cα are most pronounced, has high α-helical propensity (Figure 6-figure supplement 2; see also Balasco et al., 2015). Moreover, the two helical elements are located at the positions of the two α-helices in the all-α form of EcRfaH-KOW (compare Figure 1B). Due to the presence of two dips the CEST profiles can be analyzed using a two-state model (minor versus major species). Interestingly, the resulting 15 N transverse relaxation rates (R 2 values) of regions 1 and 2 in the minor species are significantly higher than corresponding rates in the β-barrel state ( Figure 6A, mid panel). Generally, one would expect that the minor species exhibits lower R 2 values as it is more flexible due to its largely unfolded nature (Farrow et al., 1995). The increased relaxation rates thus indicate the presence of additional exchange processes on the intermediate to fast chemical exchange (i.e. μs-ms) time scale. 
Consequently, the minor species itself seems to be an ensemble of predominantly unfolded, fast interconverting structures with transient helical elements in regions 1 and 2 rather than a static population. As no dips in addition to the ones of the minor and major species can be observed in the CEST profiles, the population of other states is low and beyond the detection limit of CEST experiments.
Like EcRfaH-KOW, the minor species of VcRfaH-KOW also seems to contain residual structure (Figure 6-figure supplement 3A). As the unfolded state of this domain was not assigned experimentally, predicted chemical shift values for the random coil structure were used for the correlation plot (Figure 5H). When plotting the Δδ15N values versus the sequence position (Figure 6-figure supplement 3A), the resulting pattern resembles the one obtained for EcRfaH-KOW (compare Figure 6A, top panel). The regions around residues 103-125 (linker) and 155-165 (C-terminus) show relatively low Δδ15N values, indicating a random coil structure, whereas the region around residues 140-150 (corresponding to region 2 in EcRfaH-KOW) exhibits significantly increased Δδ15N values, suggesting residual structure, similar to EcRfaH-KOW. However, only very small minor species dips were observed in some traces of a CEST experiment recorded on the 13Cα carbons of 13C, 15N-VcRfaH-KOW (Figure 6-figure supplement 3B), which we attribute to the very low population of the VcRfaH-KOW minor species (0.43%) that is at the detection limit of the Cα-CEST experiment (which is less sensitive than the 15N-CEST). Consequently, we analyzed the CEST profiles only qualitatively. Unambiguous minor species dips could be identified for amino acids predominantly located in the region with residual structure, with chemical shifts that are downfield-shifted as compared to random coil values (Figure 6-figure supplement 3B), indicating the presence of helical elements. As for EcRfaH-KOW, this is in full agreement with secondary structure predictions, which suggest that all NusG/Spt5-KOW domains adopt four to five β-strands whereas both RfaH-KOW domains exhibit propensities for both β-strands and α-helices, especially in the regions with residual structure in the CEST minor species (Figure 6-figure supplement 2). Taken together, these data suggest that the VcRfaH-KOW minor species also contains transient residual helical structures.

Figure 6 (caption fragment). The system was buffered by 20 mM Na-phosphate (pH 6.5), 100 mM NaCl, 1 mM ethylenediaminetetraacetic acid (EDTA), 10% (v/v) D2O. Boxed regions mark signals corresponding to the β-barrel state, with the signal of S139 being labeled with 'N' ('native'). Arrows and further labels indicate signals of residues that exhibit strong chemical shift changes in the indirect dimension (15N in (B), 13C in (C)). The spectra are colored as indicated.
Source data 1 for Figure 6. 8-Anilino-1-naphthalenesulfonic acid (ANS) binding by EcRfaH-KOW during urea-based denaturation.
The hSpt5-KOW5 domain is part of an 'RNA clamp' during transcription elongation in eukaryotes (Bernecky et al., 2017) and exhibits the typical β-barrel fold in all available structures. Strikingly, hSpt5-KOW5 also exchanges with an unfolded species under non-denaturing conditions ( Figure 5G), just as EcRfaH-KOW and VcRfaH-KOW. The magnitude of the differences between the minor species 15 N chemical shifts and the predicted random coil values ( Figure 6-figure supplement 4A) is similar to that observed for VcRfaH-KOW ( Figure 6-figure supplement 3A). Interestingly, the minor species' chemical shifts of a 13 Cα -CEST of 13 C, 15 N-hSpt5-KOW5 clearly indicate the absence of any substantial residual structure ( Figure 6-figure supplement 4). In contrast to all other KOW domains in this study, hSpt5-KOW5 is not located at the very C-terminus of full-length hSpt5, but it is just one out of seven KOW domains being flanked by several hundreds of residues at either terminus. Thus, the stability of this domain may be different in its physiological environment. Taken together, this data suggests that hSpt5-KOW5 is a typical monomorphic β-barrel and that its decreased stability, accompanied by the existence of a minor, unfolded species, may be attributed to the absence of the neighboring domains, although we cannot completely rule out that these features are real, intrinsic properties of hSpt5-KOW5 in the full-length protein with (yet unknown) functional relevance.
As the completely unfolded state was only experimentally assigned for EcRfaH-KOW we will focus on this domain in the following analysis. Owing to its population of 5.5% (Table 5), EcRfaH-KOW's minor species should be detectable in standard HSQC spectra, given a sufficiently high signal-to-noise ratio.

In both series of spectra, we observed a decrease in peak intensity/volume of the β-barrel signals with increasing urea concentration (boxed regions in Figure 6B and C), which is completed at ≈4 M urea, indicating that the first transition in the far-UV CD-based chemical denaturation of EcRfaH-KOW (Figure 3F) corresponds to the unfolding of its β-barrel (tertiary) structure. This is also corroborated by near-UV CD spectroscopy-based chemical denaturation experiments using urea or GdmCl, respectively (Figure 6-figure supplement 5A, B), which clearly show that the transition during the titration from 0 to ~3 M urea/~1 M GdmCl is accompanied by a loss in tertiary structure. The possibility that the resulting conformation corresponds to an equilibrium molten globule is, however, excluded due to its inability to bind 8-anilino-1-naphthalenesulfonic acid (ANS; Figure 6-figure supplement 5C).
In order to identify signals corresponding to the minor species in the HSQC spectra of EcRfaH-KOW, we started with the spectrum of the urea-unfolded protein (8 M urea, purple spectra in Figure 6B and C). Most of the corresponding signals shifted linearly with decreasing urea concentration and also lost intensity at urea concentrations <3 M (e.g. signal of S139 in Figure 6B and C). At 0 M urea, finally, only a set of weak signals remained, which we identified as signals of the minor species as these match the chemical shifts of the minor species identified in the CEST experiments (compare red and gray bars in Figure 6A, top/bottom panels). Based on the linear transition between the positions of the (urea) unfolded state toward the positions of the minor species signals, we conclude that addition of urea shifts the minor species' population toward the completely unfolded state. Although we cannot assess if the minor species samples the completely unfolded state in the absence of any denaturant, the increased 15 N R 2 values indicate additional exchange processes of the minor species on the µs-ms time scale ( Figure 6A, middle panel). Thus, we hypothesize that the minor species can be described as an ensemble of exchanging sub-states, some corresponding to the completely unfolded state 'U' and some exhibiting residual helical structure, hereby referred to as α-helical unfolding intermediate 'Uα' with the minor species observed in the CEST experiments being the average population under native conditions.
If this is true, the urea-induced chemical shift perturbations experienced by the minor species signals in the [ 1 H, 15 N]-HSQCs can be explained by a combination of two effects: (i) change of the chemical environment of the spins due to the presence of urea, which particularly affects δ 1 H (see e.g. signal of T157 in Figure 6B), and (ii) change in the relative populations of the minor species' sub-states toward the unfolded state, which mainly affects Δδ 15 N. Since the Hα/Cα chemical shifts are relatively independent of the solvent conditions, their perturbations in the urea denaturation series ( Figure 6C) even better reflect the change in the ratio of the minor species' sub-states. The shifting of the minor species' peaks in Figure 6C is completed at ≈7 M urea, implying that the second transition in the far-UV CD-based unfolding experiment ( Figure 3F) corresponds to the denaturation of Uα. Interestingly, the R 2 values of residues in region 1 are more than twice as high as those of residues in region 2 ( Figure 6A) and, in the [ 1 H, 15 N]-HSQC-based denaturation experiment ( Figure 6B), the minor species' signals of residues in region 1 do not shift in a linear manner as it is typical for two exchanging states. Instead, they show a curved transition that is 'kinked' at ≈ 2 M urea (see e.g. T131), implying a more complex unfolding process and thus structural heterogeneity of this region. Although our experiments do not allow a precise structural characterization of all states of the minor species, it may be described as an ensemble of largely unfolded, interconverting structures with states U and Uα constituting the extrema.
Due to the fast chemical exchange between the EcRfaH-KOW's U and Uα states during the chemical denaturation, their relative populations in a certain titration step are encoded in the chemical shift of the minor species signal, whereas the volume of the minor species peak is proportional to the sum of the populations of both states (assuming similar transverse relaxation rates for the species). The chemical shifts of Cα/Hα groups depend to a much lower extent on the urea concentration in the sample than the chemical shifts of amide groups and therefore they provide better measures for the exchange between U and Uα. To first quantify the decay of the all-β conformation and the increase of the minor species during the urea denaturation, we analyzed the peak intensity of both species exemplarily for residue S139 in the [ 1 H, 13 C]-ctHSQC-based titration ( Figure 6C and Figure 6-figure  supplement 6A). The resulting ΔG value of ≈ 7 kJ/mol between the energy levels of major and minor species agrees well with the results from the CEST experiment (7 kJ/mol). Additionally, the m value of 3.4 kJ/(mol M) is very similar to the m values obtained for the other KOW domains by CD spectroscopy ( Table 3), indicating that the minor species is indeed close to a completely unfolded state with a small buried surface area.
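The statement that fast exchange encodes the relative populations of U and Uα in the observed chemical shift can be written as a population-weighted average, δobs = pUα·δUα + (1 − pUα)·δU. The sketch below inverts this relation. It is purely illustrative: the function name is hypothetical, and the shift of a pure Uα state is an assumed input, since a baseline for Uα is not available from the experiments described here.

```python
def fraction_u_alpha(delta_obs, delta_u, delta_u_alpha):
    """Fraction of the helical sub-state from a fast-exchange averaged shift.

    All shifts in ppm. delta_u_alpha (the shift of pure U-alpha) is an
    assumed value in this sketch, not a measured quantity.
    """
    return (delta_obs - delta_u) / (delta_u_alpha - delta_u)


# Hypothetical 13Ca shifts (ppm): unfolded 52.5, fully helical 54.5, observed 53.5.
p_helical = fraction_u_alpha(53.5, 52.5, 54.5)
```

The estimate depends linearly on the assumed Uα shift, so without an experimental Uα baseline it can only indicate trends across a titration, not absolute populations.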
The complete denaturation of the minor species, that is, the transition of Uα to a fully unfolded state U, can be followed in the [ 1 H, 13 C]-ctHSQC-based denaturation experiment by analyzing the change of the minor species' chemical shifts from the positions in the absence of urea toward those of the completely unfolded state. For example, the Hα/Cα correlation peaks of residues A137, S139, or M140, which are located in region 2, clearly shift from regions typical for α-helical structures (upfield 1 H, downfield 13 C relative to random coil values) to positions corresponding to an unstructured conformation (downfield 1 H, upfield 13 C), and finally they localize next to the signals of the Ala, Ser, or Met residues that do not reside in regions with residual helical structure (Figure 6C). Plotting the changes of the 13 Cα chemical shifts of A137, S139, and M140 versus the urea concentration (Figure 6-figure supplement 6B) results in curves that resemble the second half of an unfolding transition (Uα ⇋ U) and approach the baseline of the fully unfolded state at ≈6 M urea. The absence of a baseline for Uα precludes a quantitative analysis, but it indicates that the transition mid-point of the curve is probably close to or below 0 M urea. In summary, the data of the NMR-based denaturation experiments (i) strongly support our hypothesis that the minor species identified in the CEST experiments is an ensemble of fast interconverting, mostly unfolded structures with U and Uα being the extrema and (ii) suggest that the minor species might be an important intermediate during the refolding process.

Fold-switching is conserved among RfaH proteins
Genes coding for RfaH orthologs can be found in many bacterial pathogens, including Salmonella, Klebsiella, Vibrio, and Yersinia spp. (Carter et al., 2004). Despite their divergent evolution, RfaH proteins seem to have a conserved mechanism of action (Carter et al., 2004). To date, only EcRfaH was structurally characterized in detail, revealing that this protein has unique structural features classifying it as a transformer protein (Belogurov et al., 2007;Burmann et al., 2012;Zuber et al., 2019). Here, we show that VcRfaH, an evolutionarily rather divergent representative sharing 35.8% sequence identity with EcRfaH, exhibits very similar structural properties, that is, VcRfaH-KOW, like EcRfaH-KOW, folds as an α-hairpin in the full-length protein, but adopts a NusG-type β-barrel conformation in its isolated form (Figure 1). Interestingly, in VcRfaH helix α 3 * is 1.5 turns longer than in EcRfaH, and VcRfaH has a disulfide bridge connecting strand β 3 * and helix α 3 *, stabilizing this helix. These two features imply a stabilization of the domain interface and thus an increased affinity between the domains as compared to EcRfaH. This might also explain the increased stability of the isolated VcRfaH-KOW domain (≈14 kJ/mol), which compensates for the higher energy gain of the domain interaction. Further, the increased stability of the VcRfaH-KOW domain may be the cause for the sigmoid-shaped CD-based chemical denaturation curves, in agreement with an apparent two-state unfolding process: global unfolding of the folded state occurs at higher denaturant concentrations, where potential partly structured folding intermediates are already largely destabilized and therefore escape detection.
This conclusion is supported by the Trp fluorescence-based denaturation data (Figure 3-figure supplement 1), suggesting that the change in the CD signal is almost exclusively caused by the decay of the β-barrel conformation and that the contribution of Uα to the change of the CD signal is negligible. Nevertheless, we conclude that VcRfaH may be regulated by fold-switching just like EcRfaH, and that this metamorphic behavior is conserved in the class of RfaH proteins and may even be found in other NusG paralogs, in agreement with a recent study that predicts that nearly 25% of bacterial NusG proteins might perform α ↔ β transitions similar to EcRfaH (Porter et al., 2022).

Model for the structural plasticity of RfaH
EcRfaH switches the conformation and function of its KOW domain in a reversible manner to achieve a tight control of gene expression (Zuber et al., 2019). In free EcRfaH, the α-helical hairpin conformation is the preferred state of EcRfaH-KOW, whereas domain separation or isolation of EcRfaH-KOW fosters population of the all-β state in solution (Burmann et al., 2012), suggesting that the all-α conformation is intrinsically unstable, but becomes the thermodynamic minimum in free EcRfaH due to interaction with EcRfaH-NGN.
Interestingly, our thermodynamic analysis (Figures 2 and 3) of the isolated EcRfaH-KOW domain reveals that, although the all-β conformation is the preferred state in isolation, it is only marginally stable, and it is in rapid equilibrium with an 'unfolded' state, which is populated to a significant extent, even under physiological conditions. The 'unfolded' state is a mixture of random-coil-type unfolded species U and species Uα containing two helical regions.
Based on our results, we suggest a model for the structural transitions of EcRfaH-KOW (Figure 7). In the autoinhibited state the all-α conformation of EcRfaH-KOW corresponds to the minimum of the Gibbs free energy as it is stabilized by contacts to the EcRfaH-NGN. During recruitment of EcRfaH to an ops-paused elongation complex, the EcRfaH-NGN:KOW interface is destabilized (most probably via an encounter complex), the domains dissociate and EcRfaH-NGN is sequestered to RNAP (Zuber et al., 2019). The freed all-α EcRfaH-KOW is not stable as G increases due to the loss of EcRfaH-NGN contacts. Consequently, EcRfaH-KOW unfolds, resulting in an ensemble of rapidly interconverting sub-states. Some of these sub-states still contain two residual α-helical regions (intermediate Uα) that correspond to the tip of the α-hairpin in the all-α state, in agreement with hydrogen/deuterium exchange data, which indicate that the hairpin tip is the most stable part of the all-α conformation (Galaz-Davison et al., 2020). Other sub-states represent the completely unfolded protein, which then rapidly refolds into the all-β form. Upon transcription termination EcRfaH is released, and the process is reversed with unfolding of the β-barrel starting, most probably, by detaching β 1 and β 4 /β 5 from the central strands as the corresponding H-bonds are the least stable ones (Figure 4). The U state is in equilibrium with Uα, where two α-helical regions that will later constitute the α-hairpin tip are formed transiently and may thus serve both as the nucleation point for the completion of the all-α structure and as starting point for recognition of its cognate binding site on the NGN. This mechanism ensures rapid re-autoinhibition and prevents aggregation of EcRfaH. Although we did not analyze VcRfaH as extensively as EcRfaH, our results suggest that the VcRfaH-KOW domain most likely employs a similar mechanism for its structural transformation, indicating that the presented model is a general scheme for RfaH proteins.
Figure 7. Model for the conformational plasticity of EcRfaH-KOW. Qualitative Gibbs free energy level diagram and associated structures for the all-α to all-β transition of EcRfaH-KOW and vice versa. In its ground state, that is, the autoinhibited conformation, the energy of the all-α conformation of EcRfaH-KOW is strongly lowered by the extensive inter-domain contacts with the EcRfaH-NGN domain. Upon recruitment, the domains dissociate, the helical structure of the released KOW domain becomes destabilized in isolation, and rapidly decays toward an ensemble of mainly unfolded sub-states that interconvert on the µs time scale. Some of the sub-states correspond to the completely unfolded state (U) whereas others retain some residual (α-) helical elements (Uα). The scheme displays exemplary structures of these sub-states. Due to their fast structural interconversion, U and Uα may be grouped into a single macro-state/ensemble (as is the case during the chemical exchange saturation transfer [CEST] experiments) that exhibits helical structures for a limited amount of time and is otherwise unfolded. Uα is either marginally stable or even unstable (therefore, its energy level is blurred). The disordered conformation then allows for easy and rapid refolding to the all-β conformation. Due to their low thermodynamic stability, or even instability of all-β and Uα, respectively, the last two steps are reversible, that is, the all-α state can be rapidly regained when the EcRfaH-NGN domain becomes available for re-association after transcription termination.
In support of our model, all computational studies on EcRfaH found that the all-α conformation is stable only when in contact with the NGN. Modification of the strength of the EcRfaH-NGN:KOW interface (Ramírez-Sarmiento et al., 2015) or deletion of the linker (Xun et al., 2016) destabilizes the all-α fold and ultimately drives EcRfaH-KOW into the β-barrel state. Moreover, the β-barrel fold is stable and corresponds to or is close to the energy minimum of the energy landscape of EcRfaH-KOW, whereas the all-α fold rapidly unfolds and has a higher G value than the all-β state (Balasco et al., 2015;Bernhardt and Hansmann, 2018;Gc et al., 2014;Joseph et al., 2019;Li et al., 2014;Xiong and Liu, 2015). Apart from these general concepts, most studies differ in several key points, such as the extent to which the all-α state is populated in the isolated EcRfaH-KOW, or the precise folding pathway from all-α to all-β. Strikingly, a recent bioinformatical study very nicely mirrors our data as the authors also observed a significant portion of transiently formed helical structure within the unfolded state ensemble in their simulations (Seifi and Wallin, 2021).

Requirements for fold-switching proteins
Previous work on designed and naturally occurring fold-switching proteins has identified several specific properties that make fold-switching proteins distinct from others (Bryan and Orban, 2010;Porter and Looger, 2018). In this study, we show that RfaH meets all these requirements and is thus a showcase example for fold-switching proteins:
1. Reduced thermodynamic stability (Bryan and Orban, 2010). A diminished stability is both the result of and key to the function of fold-switching proteins. As the fold-switching sequence must be compatible with both adopted topologies, it can only be optimized to a certain extent to stabilize one specific fold, ensuring that both conformations can be interconverted and that the structure is not 'trapped' in one state. This is reflected by a dual-funneled energy landscape with two main minima, which are, however, not as deep as the global minimum of a stable protein.
Our comprehensive thermodynamic analysis (Figures 2 and 3) reveals that the all-β fold of both RfaH-KOWs is indeed less stable than the bacterial and archaeal NusG/Spt5-KOW domains. As general transcription factors, NusG/Spt5 proteins do not require regulation as sophisticated as that of RfaH (for hSpt5-KOW5 see below) and thus benefit from a stable structure to carry out their function. The difference in thermodynamic stability is especially striking for EcNusG-KOW and EcRfaH-KOW as both belong to the same class of transcriptional regulators, originate from the same organism, and share a sequence identity of 35.8% (43.6% for the full-length proteins), yet are subject to completely different regulatory mechanisms that, in turn, strongly depend on the difference of this thermodynamic parameter. As a result, EcRfaH is tightly regulated by autoinhibition coupled to the conformational transformation of a whole domain and controls just a small set of specific genes whereas EcNusG is a stable, monomorphic protein involved in the transcription of most host genes.
2. Generation of new binding surfaces (Bryan and Orban, 2010). The regulation of conformational transitions in fold-switching proteins is achieved by energetically stabilizing one of the two conformations in response to a molecular trigger, resulting in a far more dynamic energy landscape than that of well-folded, monomorphic proteins as the energy level of a particular conformation strongly depends on the environment. This context-dependent stabilization of one state is possible because the two different folds exhibit different surface topologies, each allowing distinct interactions. The ability to selectively hide/expose 'latent' binding sites within different folds is also the most important function of fold-switching in general, as it enables a level of control that cannot be achieved by other mechanisms. In RfaH, autoinhibition is coupled to a conformational switch.
In the autoinhibited state the all-α KOW interacts with the RfaH-NGN to prevent off-target recruitment and interference with NusG (Belogurov et al., 2009), whereas the refolded state allows simultaneous binding of RfaH to RNAP via RfaH-NGN and to the ribosome via all-β RfaH-KOW to activate translation (Kang et al., 2018;Zuber et al., 2019).
3. Unfolded regions in one of the two states (Bryan and Orban, 2010). In RfaH-KOW, the central β-strands β 2 , β 3 , and β 4 transform into two α-helices during the all-β-to-all-α transition and vice versa (Figure 1B). However, the all-α KOW domain contains unstructured N- and C-termini, whereas the corresponding regions form β-strands β 1 and β 4 /β 5 in the all-β conformation (Figure 1B and Figure 4C). These disordered parts provide an entropic stabilization of the respective state as they do not adopt a defined structure and the corresponding β-strands are less stably bound to the rest of the β-barrel than in the stable KOW domains (Figure 4). A bioinformatic study indicated that these regions of the CTD additionally stabilize the NGN:KOW interface by forming transient, IDP-like interactions (Xun et al., 2016). We show that the structural interconversion between the two RfaH states proceeds via a chiefly unfolded intermediate and we propose that the disordered segments may help to facilitate and/or initiate this transition, similar to the mechanism suggested for the human chemokine XCL1 (lymphotactin) (Tyler et al., 2011). Finally, disordered regions in one state have the advantage that they can be evolutionarily optimized to selectively stabilize one of the two states of a fold-switch pair, whereas there is no need to fit a defined three-dimensional structure in the other state. This is reflected by the secondary structure predictions of both EcRfaH-KOW and VcRfaH-KOW, which suggest propensities for both helical structures and β-strands in the central region that indeed interconvert between α-helices and β-strands, whereas only β-strands are predicted for the termini (Figure 6-figure supplement 2).
4. Divergence in predicted and observed secondary structure (Porter and Looger, 2018).
Secondary structure predictions show that both VcRfaH-KOW and EcRfaH-KOW contain stretches with high propensity for both β-strands and α-helical structures, whereas NusG/Spt5-KOW domains are predicted to adopt four to five β-strands (Figure 6-figure supplement 2). Three-dimensional structures of the KOW domains of this study confirm that the NusG/Spt5-KOW domains are indeed β-barrels, whereas the fold of the RfaH-KOW domains depends on the context (Figure 1 and Figure 1-figure supplement 2). Interestingly, this tendency is also visible in the isolated KOW domain as the disordered regions in the all-α fold correspond to the β-strands that are less stable in the RfaH-KOWs as compared to NusG-KOWs, whereas the helical propensity is reflected in the structure of Uα. However, one should keep in mind that secondary structure predictions strongly depend on the underlying algorithms, as can be seen for MtNusG-KOW (Figure 6-figure supplement 2).
5. Cooperatively folding units (Porter and Looger, 2018). The folding cooperativity of EcRfaH-KOW depends on the presence of the EcRfaH-NGN, that is, in the absence of EcRfaH-NGN EcRfaH-KOW folds cooperatively on its own. However, the cooperativity is generally rather low and the activation barrier separating the 'unfolded' and the folded states is small, allowing fast transitions.

Fold-switching is a highly efficient principle of regulation with a steadily increasing importance
To date, about six fold-switching proteins have been studied in detail (summarized in Dishman and Volkman, 2018;Lella and Mahalakshmi, 2017;Zamora-Carreras et al., 2020), but estimates suggest that up to 4% of the proteins in the PDB may have the ability to switch folds (Porter and Looger, 2018). Our study demonstrates which molecular mechanisms confer RfaH its structural plasticity that allows operon-specific regulation without competing with its monomorphic paralog NusG/Spt5. In line with our findings, a recent study on XCL1, another model system for fold-switching proteins, identified very similar principles for the evolution and design of fold-switching proteins (Dishman et al., 2021).

Importance of a chiefly unfolded state in protein fold-switching
In summary, our results highlight two key features in protein fold-switching: decreased thermodynamic stability and defined local structures in 'unfolded' intermediates. Diminished stability is often thought to be detrimental for proteins as it favors non-native contacts and promotes aggregation. However, it is essential for conferring conformational plasticity on fold-switching proteins, and, as all transitions from and to the unfolded states are very fast and the population of these states is rather low, fold-switchers can evade aggregation. Further, the capability of the 'unfolded' state to harbor residual defined structures, for example, α-helices, allows a second conformation to be pre-encoded that could be readily adopted upon a molecular signal.

Key resources: the recombinant plasmids used are listed in Table 7; the sequence-based reagents (primers used for cloning) are listed in Table 6.
The quality of all recombinantly produced proteins was ensured according to the guidelines established by ARBRE-MOBIEU and P4EU (https://arbre-mobieu.eu/guidelines-on-protein-quality-control/) (de Marco et al., 2021). In brief, purity was checked by sodium dodecyl sulfate polyacrylamide gel electrophoresis, the absence of nucleic acids by UV spectroscopy, the identity by mass spectrometry and/or NMR spectroscopy, the folding state by CD and/or NMR spectroscopy, and the absence of aggregation by analytical gel filtration or dynamic light scattering.

NMR spectroscopy
NMR experiments were conducted at Bruker Avance 600, Avance 700, Ascend Aeon 900, and Ascend Aeon 1000 spectrometers, each equipped with room temperature (Avance 600) or cryogenically cooled, inverse 1 H, 13 C, 15 N triple resonance probes (all other spectrometers). All measurements were conducted in 5 mm tubes with a sample volume of 550 µl at 25°C, if not stated otherwise. NMR data was processed using in-house software and analyzed using NMRViewJ (OneMoon Scientific).
Backbone resonance assignments for VcRfaH, VcRfaH-KOW, hSpt5-KOW5, MjSpt5-KOW, and urea-unfolded EcRfaH-KOW were obtained using standard band-selective excitation short transient (Lescop et al., 2007;Schanda et al., 2006) transverse relaxation optimized spectroscopy (TROSY)-based triple resonance experiments (Pervushin et al., 1997;Salzmann et al., 1998). Additionally, carbon-detected CACO, CAN, and NCO experiments (Bermel et al., 2005) were recorded for VcRfaH-KOW. Side chain assignments for VcRfaH-KOW were obtained from CCH- and H(C)CH-TOCSY, HBHA(CO)NH, C(CO)NH, aromatic [ 1 H, 13 C]-HSQC, and 13 C-edited aromatic nuclear Overhauser enhancement spectroscopy (NOESY) experiments (Sattler et al., 1999). Three-dimensional assignment and NOESY experiments were acquired using a non-uniform sampling scheme with a sparsity of 25-50%. Spectra were subsequently reconstructed with in-house written software using the iterative soft thresholding algorithm (Hyberts et al., 2012). The EcRfaH-KOW, VcRfaH-KOW, hSpt5-KOW5, and MjSpt5-KOW samples contained 0.5-1 mM [ 15 N, 13 C]-labeled protein in 20 mM Na-phosphate (pH 6.5), 100 mM NaCl, 1 mM ethylenediaminetetraacetic acid (EDTA), 10% (v/v) D 2 O. The EcRfaH-KOW sample further contained 6 M urea. Due to limited sample stability and poor quality of the initial spectra, VcRfaH (0.3 mM) was [ 2 H, 15 N, 13 C]-labeled and was in an optimized buffer (25 mM Bis-Tris-Propane [pH 6.5], 25 mM Na-Tartrate, 50 mM NaCl, 10% (v/v) D 2 O), and the measurements were conducted at 20°C. The Cα and CO secondary chemical shifts for VcRfaH were calculated as the difference between the observed chemical shift and the predicted random coil value (Wishart and Sykes, 1994) using a deuterium correction as given in Venters et al., 1996. Chemical shift assignments for EcNusG-KOW, MtNusG-KOW, and native EcRfaH-KOW were taken from previous studies (Burmann et al., 2012;Mooney et al., 2009;Strauß et al., 2016).
The random coil chemical shifts for characterization of the minor species in case of VcRfaH-KOW and hSpt5-KOW were calculated using the Poulsen IDP/IUP random coil chemical shifts calculator tool (https://spin.niddk.nih.gov/bax/nmrserver/Poulsen_rc_CS/).
Distance restraints for the structure calculation of VcRfaH-KOW were obtained from standard 13 C- and 15 N-edited 3D NOESY experiments (Sattler et al., 1999) with mixing times of 120 ms. NOESY cross-signals were classified according to their intensities and converted to distance restraints with upper limits of 3 Å (strong), 4 Å (medium), 5 Å (weak), and 6 Å (very weak), respectively. Hydrogen bonds were identified from corresponding experiments (see below). Psi/Phi angle restraints were obtained from the geometry dependence of the backbone chemical shifts using TALOS (Cornilescu et al., 1999). The structure calculation was performed with XPLOR-NIH version 2.1.2 using a three-stage simulated annealing protocol with floating assignment of prochiral groups including a conformational database potential (Schwieters et al., 2003). Structures were analyzed with XPLOR-NIH and PROCHECK-NMR (Laskowski et al., 1996).
15 N-based CEST experiments were conducted according to Vallurupalli et al., 2012. All samples contained ≈0.7-1 mM 15 N-labeled protein. For initial CEST experiments, the domains were in 20 mM HEPES (pH 7.5), 100 mM NaCl, 10% (v/v) D 2 O, and a single CEST B 1 field (ν 1 =18-25 Hz) during an exchange period of 500 ms was employed. Proteins showing an exchange peak in their CEST profiles were further studied in 20 mM Na-phosphate (pH 6.5), 100 mM NaCl, 1 mM EDTA, 10% (v/v) D 2 O to decrease amide proton-H 2 O exchange. CEST experiments were then recorded using two different B 1 fields (ν 1 =13 Hz/26 Hz) and an exchange period of 500 ms. The B 1 frequencies were calibrated using a 1D approach on an isolated signal (Guenneugues et al., 1999).
The CEST traces obtained at 13/26 Hz were fitted simultaneously according to a two-state exchange model using ChemEx (version 0.6.1, Vallurupalli et al., 2012). Due to the monodisperse distribution of the resulting k ex /p B values (Table 5), the CEST traces were then fitted globally, yielding a global k ex and p B value. Only those CEST profiles were included in the global fit that showed a Δω>1 ppm. 13 Cα-CEST experiments were recorded on [ 15 N, 13 C]-labeled protein samples using a [ 1 H, 13 C] constant-time (ct) HSQC-based approach (Bouvignies et al., 2014). To maximize the number of analyzable signals, the proteins were in 20 mM Na-phosphate (pH 6.5), 100 mM NaCl, 1 mM EDTA, 99.9% (v/v) D 2 O (pH uncorrected for D 2 O). In this case, the chemical shift was referenced via 0.5 mM internal DSS. The experiment was performed at a single B 1 field strength (25 Hz .5], 50 mM NaCl, 2 mM DTT) was recorded before the protein was incubated in refolding buffer containing 8 M urea for 24 hr. Having recorded another [ 1 H, 15 N]-HSQC spectrum urea was removed by stepwise dialysis against 4 l of refolding buffer containing 4, 2, 1, 0.5, and 0 M urea, respectively (2-4 hr for the first four steps and over-night for the last step). Finally, a [ 1 H, 15 N]-HSQC spectrum of the refolded protein was recorded.

CD spectroscopy
CD data were collected on a Jasco J-1100 spectrometer (Jasco Deutschland GmbH, Pfungstadt, Germany), using quartz cuvettes (Hellma GmbH & Co. KG, Müllheim, Germany). CD spectra were normalized (Equation 1) to obtain the mean residue-weighted ellipticity (Θ MRW ):
$$\Theta_{\mathrm{MRW}} = \frac{100 \cdot \theta}{N \cdot c \cdot d} \qquad (1)$$
where θ is the ellipticity in mdeg, N the number of amino acids, c the protein concentration in mM, and d the pathlength of the cuvette in cm.
Thermal unfolding and refolding curves were obtained by measuring the CD signal of 15 μM (≈0.1 mg/ml) protein buffered by either 10 mM K-phosphate (pH 7.0) or 10 mM K-acetate (pH 4.0), respectively, in a 1 cm quartz cuvette upon heating to 95°C and subsequently re-cooling to the initial temperature. The scan speed was 1°C/min, the dwell time 1 min, and the integration time 4 s. Checking the reversibility of thermal unfolding and determination of the wavelength used for temperature transition curves was done by recording far-UV CD spectra at 25°C, then 95°C, and after subsequent re-cooling to 25°C in a 1 mm pathlength cuvette using 25 μM protein solutions in either 10 mM K-phosphate (pH 7.0) or 10 mM K-acetate (pH 4.0). The wavelength to follow a thermal transition corresponds to the wavelength >215 nm with the largest difference in the CD signal between folded and unfolded state and was chosen for each transition individually. Using wavelengths <215 nm led to noisy signals at high temperatures and resulted in non-interpretable data.
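The wavelength selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure's original implementation: the function name is hypothetical and the spectra in the usage example are made-up numbers.

```python
def pick_transition_wavelength(wavelengths_nm, folded, unfolded, min_nm=215.0):
    """Return the wavelength above min_nm with the largest absolute
    difference between the folded-state and unfolded-state CD signals."""
    if not (len(wavelengths_nm) == len(folded) == len(unfolded)):
        raise ValueError("all input sequences must have the same length")
    candidates = [
        (abs(f - u), wl)
        for wl, f, u in zip(wavelengths_nm, folded, unfolded)
        if wl > min_nm  # wavelengths at or below the cutoff are excluded
    ]
    if not candidates:
        raise ValueError("no wavelength above the cutoff")
    # max() compares the difference first; ties go to the longer wavelength
    return max(candidates)[1]


if __name__ == "__main__":
    # Made-up spectra (arbitrary units) at 25 and 95 degrees C
    wavelengths = [210.0, 215.0, 218.0, 222.0, 230.0]
    spectrum_25c = [-9.0, -6.0, -5.0, -3.5, -1.0]
    spectrum_95c = [-2.0, -3.0, -2.5, -2.0, -0.8]
    print(pick_transition_wavelength(wavelengths, spectrum_25c, spectrum_95c))
```

In this made-up example the largest difference overall lies at 210 nm, but that wavelength is excluded by the 215 nm cutoff, so the function returns 218 nm.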
Changes in ellipticity (θ) upon thermal unfolding were analyzed with a two-state model using Fit-o-Mat version 0.752 (Möglich, 2018) to obtain the melting temperature (T m ) and enthalpy change at T m (ΔH u (T m )) of the transition (both fit parameters) (Equation 2):
$$\theta(T) = f_N \, (y_N + m_N T) + (1 - f_N) \, (y_U + m_U T) \qquad (2)$$
with T being the absolute temperature in K, y N and y U the y-intercepts, and m N and m U the slopes of the N- and U-state baselines, respectively. f N is the fraction of folded molecules, which is related to the equilibrium constant K u according to Equation 3:
$$f_N = \frac{1}{1 + K_u} \qquad (3)$$
Finally, K u is related to the change in Gibbs free energy of the unfolding reaction (ΔG u ) and ΔH u (T m ) by Equation 4:
$$K_u = \exp\left(-\frac{\Delta G_u}{RT}\right) = \exp\left[\frac{\Delta H_u(T_m)}{R}\left(\frac{1}{T_m} - \frac{1}{T}\right)\right] \qquad (4)$$
where R is the ideal gas constant.
CD-based chemical equilibrium unfolding experiments were performed at 25°C. Urea (BioScience Grade; ≈10 M) and GdmCl (≈8 M; both from Carl Roth GmbH & Co. KG, Karlsruhe, Germany) stock solutions were prepared according to Pace et al., 1990. Far-UV CD unfolding experiments were conducted using a 1 mm cuvette. All points of the unfolding curves were obtained from individual samples, each containing 40-60 μM (≈0.25-0.4 mg/ml) protein in either 10 mM K-phosphate (pH 7.0) or 10 mM K-acetate (pH 4.0), respectively. All samples were equilibrated over-night. The denaturant concentration of each sample was determined refractometrically after CD data acquisition.
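The two-state model used for the thermal transitions can be sketched in a few lines. This is an illustration only, not the Fit-o-Mat implementation: it assumes the van 't Hoff form of K u with a temperature-independent unfolding enthalpy (no heat capacity term), and the function and parameter names are chosen here for readability.

```python
import math

R = 8.314462618  # ideal gas constant in J/(mol K)


def fraction_folded(temp_k, tm_k, dh_tm):
    """Fraction of folded molecules f_N at temperature temp_k (K).

    Assumes K_u = exp[(dH_u(T_m)/R) * (1/T_m - 1/T)], i.e. a van 't Hoff
    description with dH_u (J/mol) treated as temperature independent.
    """
    k_u = math.exp(dh_tm / R * (1.0 / tm_k - 1.0 / temp_k))
    return 1.0 / (1.0 + k_u)


def cd_signal(temp_k, tm_k, dh_tm, y_n, m_n, y_u, m_u):
    """Observed ellipticity as the population-weighted sum of linear
    native-state and unfolded-state baselines."""
    f_n = fraction_folded(temp_k, tm_k, dh_tm)
    native = y_n + m_n * temp_k
    unfolded = y_u + m_u * temp_k
    return f_n * native + (1.0 - f_n) * unfolded


if __name__ == "__main__":
    # Hypothetical parameters: T_m = 330 K, dH_u(T_m) = 200 kJ/mol
    for t in (300.0, 330.0, 360.0):
        print(t, round(fraction_folded(t, 330.0, 200e3), 4))
```

To obtain T m and ΔH u (T m ), a function such as `cd_signal` would be handed to a non-linear least-squares routine together with the measured curve; that fitting step is omitted here.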
As for the thermal transitions, the wavelength to follow a chemical denaturation corresponds to the wavelength >215 nm with the largest difference in the CD signal between folded and unfolded state and was chosen for each transition individually (wavelengths <215 nm led to noisy signals and non-interpretable data at high denaturant concentrations).
Unfolding curves that indicate a two-state transition were analyzed using the linear extrapolation method (Santoro and Bolen, 1988) with Fit-o-Mat version 0.752 (Möglich, 2018) to obtain ΔG u (H 2 O) and the m value (Equation 5):
$$S = f_N \, (y_N + m_N \, [\mathrm{denat}]) + (1 - f_N) \, (y_U + m_U \, [\mathrm{denat}]) \qquad (5)$$
where S is the signal derived from far-UV CD spectroscopy (i.e. the Θ MRW value), intrinsic Trp fluorescence (for VcRfaH-CTD), or the normalized peak volumes of the [ 1 H, 13 C]-ctHSQC major/minor species signals for EcRfaH-KOW residue S139, respectively. [denat] is the denaturant (i.e. urea or GdmCl) concentration in M, y N and y U are the y-intercepts, and m N and m U the slopes of the N- and U-state baselines, respectively. f N is given by Equation 3. In this case, K u is defined as (Equation 6):
$$K_u = \exp\left(-\frac{\Delta G_u(\mathrm{H_2O}) - m \, [\mathrm{denat}]}{RT}\right) \qquad (6)$$
Finally, the [denat] 1/2 value is obtained by (Equation 7):
$$[\mathrm{denat}]_{1/2} = \frac{\Delta G_u(\mathrm{H_2O})}{m} \qquad (7)$$
Near-UV CD unfolding experiments of EcRfaH-KOW were conducted using a 1 cm quartz cuvette and 0.5 mM protein in 10 mM K-phosphate (pH 7.0). As the exchange between folded and unfolded state is reasonably fast (k ex ≈ 15 s -1 at 0 M urea/GdmCl), all points were obtained from a titration of the initial denaturant-free protein sample with a 10 M urea or 8 M GdmCl solution in 10 mM K-phosphate (pH 7.0).
The sample was then incubated for 5 min at 25°C to reach equilibrium. Curves were smoothed mathematically using a Savitzky-Golay filter.
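The linear extrapolation analysis of the chemical denaturation curves can be illustrated with a short sketch. It is not the Fit-o-Mat implementation: it assumes the standard form of the method, in which ΔG u depends linearly on the denaturant concentration, and the function and parameter names are chosen here for illustration.

```python
import math

R = 8.314462618  # ideal gas constant in J/(mol K)


def lem_signal(denat_m, dg_h2o_kj, m_kj, y_n, m_n, y_u, m_u, temp_k=298.15):
    """Two-state signal at a given denaturant concentration (M).

    dg_h2o_kj: dG_u in the absence of denaturant (kJ/mol)
    m_kj:      m value (kJ/(mol M)); dG_u = dG_u(H2O) - m * [denat]
    """
    dg_j = (dg_h2o_kj - m_kj * denat_m) * 1000.0
    k_u = math.exp(-dg_j / (R * temp_k))
    f_n = 1.0 / (1.0 + k_u)
    return f_n * (y_n + m_n * denat_m) + (1.0 - f_n) * (y_u + m_u * denat_m)


def denaturant_midpoint(dg_h2o_kj, m_kj):
    """Denaturant concentration at which dG_u is zero."""
    return dg_h2o_kj / m_kj


if __name__ == "__main__":
    # Approximate values quoted for the S139-based analysis of EcRfaH-KOW:
    # dG of about 7 kJ/mol and m = 3.4 kJ/(mol M)
    print(round(denaturant_midpoint(7.0, 3.4), 2))
```

With the approximate values quoted for the S139-based analysis, the midpoint comes out close to 2 M urea; with flat baselines of 1 and 0, `lem_signal` returns the folded fraction directly.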
To probe reversibility of chemical unfolding and validate incubation times used to reach equilibrium, proteins were dialyzed against 20 mM NH 4 HCO 3 (pH 7.0) buffer, shock-frozen, lyophilized, and subsequently dissolved in 10 mM K-phosphate (pH 7.0) or 10 mM K-acetate (pH 4.0) with or without 10 M urea/8 M GdmCl, respectively. CD samples containing the identical denaturant concentration (1-2 samples in pre-transition region, 1 at [denat] 1/2 , 1 in post-transition region) were then prepared from the native or unfolded proteins. All samples were equilibrated over-night; far-UV CD spectra were then recorded using a 1 mm quartz cuvette.

Fluorescence spectroscopy
Fluorescence spectra were recorded at 25°C using a Peltier-controlled Fluorolog-3 fluorimeter (Horiba Europe GmbH, Oberursel, Germany) equipped with a 1 cm quartz cuvette (Hellma GmbH & Co. KG, Müllheim, Germany). Samples for chemical denaturation of VcRfaH-KOW contained ≈11 μM protein and were prepared as described for the far-UV CD samples. The VcRfaH-KOW Trp residue was excited at 295 nm; emission spectra were then recorded from 300 to 400 nm with slit widths between 2.65/2.65 and 2.8/2.8 nm (excitation/emission) and an integration time of 0.2 s. Analysis of the resulting denaturation curve was performed as described for CD data.
ANS (Sigma/Merck KGaA, Darmstadt, Germany) interaction experiments were conducted by preparing a urea denaturation series of EcRfaH-KOW (final concentration: 5 μM) as described for the CD-based unfolding experiments, equilibrating over-night and adding ANS at a fluorophore:protein ratio of 100:1. Fluorescence spectra were then recorded from 410 to 650 nm following excitation at 395 nm with slit widths of 2.6/2.6 nm (excitation/emission) and 0.1 s integration time. A control experiment was conducted with an identical experimental and instrument setup, but with samples lacking protein. The obtained fluorescence at a given wavelength was then plotted against the urea concentration of the respective sample.

Differential scanning calorimetry
The KOW domains were in either 10 mM K-acetate (pH 4.0; hSpt5-KOW5) or 10 mM K-phosphate (pH 7.0; all other domains), respectively. Given a lack of Trp residues in most domains, the protein concentration was determined via absorption at 205 nm using the molar extinction coefficient (ε 205 ) as calculated by the Protein Calculator tool (Anthis and Clore, 2013).
Initial DSC experiments were carried out on a MicroCal VP-DSC instrument (MicroCal/Malvern Panalytical, Malvern, UK; active volume: 509 µl). The samples were vacuum degassed at room temperature just before the measurements. Prior to the protein-buffer scans, several buffer-buffer scans were performed. All thermograms were recorded at a scan rate of 1.5 K/min under an excess pressure of 30 psi in passive feed-back mode from ≈10°C to 110°C or 130°C (MjSpt5-KOW5), respectively. The unfolding was calorimetrically reversible for EcNusG-KOW, MtNusG-KOW, MjSpt5-KOW, and EcRfaH-KOW (data not shown). hSpt5-KOW5 aggregated at pH 7.0 upon unfolding at all tested concentrations, whereas VcRfaH-KOW aggregated at concentrations >0.2 mg/ml.
We repeated the measurements for all proteins but MtNusG-KOW using a MicroCal VP-Capillary DSC instrument (Malvern Panalytical, Malvern, UK; active volume 137 µl). The thermograms were obtained at a heating rate of 1.5 K/min with excess pressure (30 psi) and at mid gain feed-back mode. Buffer-buffer runs were done prior to the protein measurements. Thermograms were recorded from ≈5°C to 130°C. The protein concentration was 0.2-1 mg/ml for EcNusG-KOW, 0.25-1 mg/ml for MjSpt5-KOW, 0.15-0.25 mg/ml for hSpt5-KOW5, 0.2-1 mg/ml for EcRfaH-KOW, and 0.1-0.15 mg/ml for VcRfaH-KOW. The measurement for hSpt5-KOW5 was carried out with 10 mM K-acetate (pH 4.0), all other KOW domains were in 10 mM K-phosphate (pH 7.0).
The obtained raw DSC data (VP-DSC data for MtNusG-KOW, VP-Capillary DSC data for all other KOW domains) were scan rate normalized, the corresponding buffer-buffer baseline was subtracted, and the thermograms were then normalized to 1 mol of protein. To extract the thermodynamic parameters, the data were fitted to a two-state unfolding model including a temperature-dependent change in heat capacity from the native to the unfolded state (Viguera et al., 1994). The temperature dependence of the native state heat capacity ($C_{p,0}$) is assumed to be linear (Equation 8; note that $C_{p,0}$ contains an instrument-specific offset), whereas the difference in heat capacity to the unfolded state ($\Delta C_{p,u}(T)$) is approximated by a parabolic function (Equation 9).
The value for the pre-factor of the quadratic term, $c$, was obtained by calculating the theoretical partial molar heat capacity, $C_p(T)$, of the unfolded state for each of the six protein domains at 5°C, 25°C, 50°C, 75°C, 100°C, and 125°C according to Makhatadze and Privalov (1990). The values for $C_p(T)$ were then plotted against the temperature and a parabolic function was fitted, yielding $c$.
The concentration-normalized heat capacity ($C_p$) is then the sum of $C_{p,0}$, the change of the 'internal' heat capacity, $\delta C_p^{int}$, which depends on the fraction of the protein in the folded and unfolded state (i.e. on the equilibrium constant $K_u$), and the excess heat absorption of the unfolding reaction, $\delta C_p^{exc}$ (Equation 10), with $\delta C_p^{int}$ and $\delta C_p^{exc}$ given in Equation 11. $K_u$ is related to the change in Gibbs energy of the unfolding reaction ($\Delta G_u(T)$) by Equation 12. The temperature-dependent enthalpy and entropy changes ($\Delta H_u(T)$ and $\Delta S_u(T)$, respectively) are given by Equations 13 and 14.
During fitting of $C_p$, the parameters $a_0$, $b_0$, $a$, $b$, $T_m$, and $\Delta H_u(T_m)$ were allowed to float, while $c$ was kept constant.
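The two-state model described above can be sketched in code. The functional forms below are the standard textbook expressions implied by the parameter names ($a_0 + b_0 T$ for the native baseline, $a + bT + cT^2$ for $\Delta C_{p,u}$, Kirchhoff integration from $T_m$ for $\Delta H_u$ and $\Delta S_u$); they are an assumption and may differ in detail from the authors' Equations 8-14. All numerical values are hypothetical.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def delta_cp_u(T, a, b, c):
    """Parabolic heat capacity difference between unfolded and native state."""
    return a + b * T + c * T ** 2


def delta_h_u(T, Tm, dHm, a, b, c):
    """Unfolding enthalpy: dH(Tm) plus the integral of delta_cp_u from Tm to T."""
    return (dHm + a * (T - Tm) + b / 2 * (T ** 2 - Tm ** 2)
            + c / 3 * (T ** 3 - Tm ** 3))


def delta_s_u(T, Tm, dHm, a, b, c):
    """Unfolding entropy: dH(Tm)/Tm plus the integral of delta_cp_u/T."""
    return (dHm / Tm + a * math.log(T / Tm) + b * (T - Tm)
            + c / 2 * (T ** 2 - Tm ** 2))


def delta_g_u(T, Tm, dHm, a, b, c):
    """Gibbs energy of unfolding, dG = dH - T * dS."""
    return delta_h_u(T, Tm, dHm, a, b, c) - T * delta_s_u(T, Tm, dHm, a, b, c)


def fraction_unfolded(T, Tm, dHm, a, b, c):
    """K_u / (1 + K_u) with K_u = exp(-dG / (R T))."""
    K = math.exp(-delta_g_u(T, Tm, dHm, a, b, c) / (R * T))
    return K / (1 + K)


def cp_model(T, a0, b0, a, b, Tm, dHm, c):
    """Molar heat capacity: native baseline + internal + excess term."""
    f_u = fraction_unfolded(T, Tm, dHm, a, b, c)
    cp_native = a0 + b0 * T
    cp_int = delta_cp_u(T, a, b, c) * f_u
    # K / (1 + K)^2 equals f_u * (1 - f_u)
    cp_exc = delta_h_u(T, Tm, dHm, a, b, c) ** 2 / (R * T ** 2) * f_u * (1 - f_u)
    return cp_native + cp_int + cp_exc


# Hypothetical parameters (temperatures in K, energies in J/mol).
params = dict(a0=5000.0, b0=20.0, a=4000.0, b=0.0, Tm=340.0, dHm=300e3, c=0.0)
for T in (300.0, 340.0, 370.0):
    print(T, round(cp_model(T, **params), 1))
```

To reproduce the fitting strategy from the text, `c` would be fixed beforehand (for example with `functools.partial`) and the remaining six parameters passed to a least-squares routine such as `scipy.optimize.curve_fit`.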

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.