Protein folding of the SAP domain, a naturally occurring two-helix bundle

Highlights
• Tho1 SAP domain is one of the smallest model protein folding domains.
• SAP domain folds through a diffuse transition state in which helix 1 is most formed.
• Native state stability is dominated by contacts formed after the transition state.


Introduction
The principles which govern the folding and unfolding of proteins have fascinated the scientific community for decades [1]. One of the most successful approaches has been to apply chemical transition state theory and treat the folding reaction as a barrier-limited process between two conformational ensembles of proteins populated at equilibrium: the native and denatured state ensembles.
The transition state, the high energy conformation transiently adopted by the polypeptide chain as the protein crosses the barrier between native and denatured ensembles, provides information on the structural mechanism of protein folding. This information cannot be determined by traditional structural techniques, and instead is inferred from kinetics and mutational analysis using the technique of Φ-value analysis [2][3][4][5][6][7]. The Φ-value of a mutation is the ratio of the change in stability of the transition state to the change in stability of the native state upon mutation, both measured relative to the denatured state (Φ = ΔΔG_‡-D/ΔΔG_D-N). A value of Φ = 1 indicates that the interactions removed by the mutation are as fully formed in the transition state as in the native state. A value of Φ = 0 indicates that the interactions are not present in the transition state, and a fractional value of Φ may indicate partial formation of interactions, complete formation of a subset of multiple interactions, or complete formation of interactions in a fraction of cases (i.e. heterogeneity in the ensemble of transition states).
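The Φ-value calculation can be sketched in a few lines. This is a minimal illustration, assuming that ΔΔG_‡-D is obtained from the ratio of wild-type and mutant folding rate constants in water (RT ln(k_f,wt/k_f,mut)) and that ΔΔG_D-N is positive for a destabilising mutation. The function names and all numerical inputs are hypothetical and are not taken from the measured data.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_ts_d(kf_wt, kf_mut, temp_k=283.0):
    """Change in transition-state stability relative to the denatured state.

    Assumes ddG(TS-D) = RT ln(kf_wt / kf_mut), in kcal mol^-1.
    """
    return R_KCAL * temp_k * math.log(kf_wt / kf_mut)


def phi_value(kf_wt, kf_mut, ddg_d_n, temp_k=283.0):
    """Phi = ddG(TS-D) / ddG(D-N); ddg_d_n is in kcal mol^-1."""
    return ddg_ts_d(kf_wt, kf_mut, temp_k) / ddg_d_n


# Hypothetical example: folding rate unchanged by the mutation, so Phi = 0.
phi_unformed = phi_value(kf_wt=1000.0, kf_mut=1000.0, ddg_d_n=1.5)

# Hypothetical example: the whole destabilisation appears in k_f, so Phi = 1.
ddg_full = ddg_ts_d(kf_wt=1000.0, kf_mut=100.0)
phi_formed = phi_value(kf_wt=1000.0, kf_mut=100.0, ddg_d_n=ddg_full)
```

Values between these two limits correspond to the fractional Φ-values discussed above, and the calculation alone cannot distinguish between the alternative interpretations of a fractional value.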
In order to draw general conclusions about the principles governing protein folding and unfolding, it is important to determine detailed folding information (including structural information on the transition state) for a number of model proteins of different sizes, structures and topologies. We have previously presented the folding and unfolding behaviour of the L31W (fluorophore) mutant of the SAP domain from the Saccharomyces cerevisiae Tho1 protein [8] (SAP is so named after the first initial of the three proteins in which it was first identified [9]). Tho1 SAP is monomeric in solution and folds reversibly in an apparent two-state transition, making it ideal for further study. The overall fold comprises just 51 residues, which form two approximately parallel helices separated by an extended loop, and possesses a hydrophobic core of just four residues (Leu13, Leu17, Trp31 and Leu35). Its motif of two parallel helices is quite unusual (model α-helical proteins more frequently contain antiparallel or perpendicular helices in a helix-turn-helix arrangement [10][11][12][13][14]), and Tho1 SAP is one of the smallest proteins whose folding has been studied experimentally. It is therefore of interest to study the folding of Tho1 SAP in more detail.
In this paper we have conducted a Φ-value analysis of the Tho1 SAP domain. The Φ-values we obtained were fractional, indicating that Tho1 SAP folds through a transition state with transient formation of a core and flickering elements of helical structure. The best formed element of secondary structure was helix 1. Interestingly, the contacts which contributed most to native state stability were not formed in the transition state. In order to obtain a crude indication of the validity of our results across multiple temperatures, we measured the folding of L31W SAP across the range of 283-323 K. As judged by the change in solvent-accessible surface area upon folding (β_T value and analogous ratio of heat capacities), there are no gross changes in the transition state of Tho1 SAP with temperature.

Reagents
L31W SAP domain was expressed and purified as detailed previously [8]. Point mutations were generated using Stratagene QuikChange mutagenesis. Mutant proteins were expressed and purified as described for SAP L31W until completion of cleavage of the fusion protein, when tag and target were separated by flowing once more through Ni-charged IMAC resin (GE Healthcare BioSciences, Sweden) before concentration and gel filtration on an S75 column into 50 mM MES pH 6.0, with the total ionic strength made up to 500 mM using NaCl. A single peak was obtained for all proteins and the fractions within this peak were pooled.

Equilibrium denaturation
Far-UV CD spectroscopy (thermal denaturation) and fluorescence emission (chemical denaturation) were carried out as described previously [8].

Kinetic measurements
We measured relaxation kinetics on the µs-ms timescale using T-jump fluorescence spectroscopy and temperature jumps of 3-5 K on a modified Hi-Tech PTJ-64 (Hi-Tech Ltd., Salisbury, UK) capacitor-discharge T-jump apparatus as previously described [8]. Arrhenius analysis of the plot of microscopic rate constant against temperature was carried out constraining the overall ΔH, ΔS and ΔC_p to their equilibrium values at the thermal midpoint. Urea denaturation (chevron) plots were fitted to an apparent two-state transition (Eq. 1) with each point weighted by the fitting error on the rate constant. No further constraints were placed upon the data fit.
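A weighted two-state chevron model of the kind described here can be sketched as follows. The functional form is an assumption: it is the commonly used expression ln k_obs = ln[k_f exp(-m_‡-D [urea]/RT) + k_u exp(m_‡-N [urea]/RT)], with both m-values taken as positive magnitudes in cal mol⁻¹ M⁻¹, and it may differ in detail from Eq. 1. Function names and parameter values are hypothetical.

```python
import math

R_CAL = 1.987  # gas constant in cal mol^-1 K^-1


def chevron_ln_kobs(urea_m, kf_water, ku_water, m_ts_d, m_ts_n, temp_k=283.0):
    """ln(k_obs) for an assumed two-state chevron with linear arms."""
    rt = R_CAL * temp_k
    kf = kf_water * math.exp(-m_ts_d * urea_m / rt)  # folding arm
    ku = ku_water * math.exp(m_ts_n * urea_m / rt)   # unfolding arm
    return math.log(kf + ku)


def weighted_sse(params, points, temp_k=283.0):
    """Sum of squared residuals, each weighted by the error on that point.

    params: (kf_water, ku_water, m_ts_d, m_ts_n)
    points: iterable of (urea_m, ln_kobs, sigma) tuples
    """
    kf_water, ku_water, m_ts_d, m_ts_n = params
    total = 0.0
    for urea_m, ln_kobs, sigma in points:
        model = chevron_ln_kobs(urea_m, kf_water, ku_water, m_ts_d, m_ts_n, temp_k)
        total += ((ln_kobs - model) / sigma) ** 2
    return total


# Hypothetical parameters and synthetic, noise-free data for illustration only.
example_params = (2000.0, 20.0, 480.0, 200.0)
example_points = [
    (c, chevron_ln_kobs(c, *example_params), 0.05) for c in (0.0, 2.0, 4.0, 6.0)
]
example_sse = weighted_sse(example_params, example_points)
```

Minimising `weighted_sse` over the four parameters with any standard optimiser would give a weighted fit; holding one parameter fixed during the minimisation corresponds to a constrained fit.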

Estimating Φ-values for severely destabilised mutants
We estimated the group (low, medium or high) into which the Φ-values for L17A, R20A and L35A were likely to fall by using our measured data to place bounds on k_f for these severely destabilised mutants. The data for all three destabilised mutants describe the unfolding arm of the chevron well (m_‡-N and k_u) and we used the average value of m_‡-D from all other mutants (480 ± 21 cal mol⁻¹ M⁻¹) as a fixed parameter in fitting these chevrons to enable convergence on a solution. We also fit the data with m_‡-D fixed at 400 cal mol⁻¹ M⁻¹ and 600 cal mol⁻¹ M⁻¹ to check that constraining this value did not overly affect the value of k_f obtained (Supplementary Table SI). In addition, we fit the data for the R20A mutant using Eq. (1) with all parameters allowed to float freely. In all cases, the value of [D]_50% calculated from the kinetic data was used in calculating Φ and was checked against the data from equilibrium urea denaturation to ensure that the two were consistent.
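The kinetic estimate of [D]_50% can be illustrated with a short sketch. It assumes a two-state chevron with linear arms, so that the midpoint is the urea concentration at which the folding and unfolding rate constants are equal: [D]_50% = RT ln(k_f/k_u)/(m_‡-D + m_‡-N). Only the m_‡-D values of 400, 480 and 600 cal mol⁻¹ M⁻¹ come from the text; the rate constants and m_‡-N are hypothetical.

```python
import math

R_CAL = 1.987  # gas constant in cal mol^-1 K^-1


def kinetic_midpoint(kf_water, ku_water, m_ts_d, m_ts_n, temp_k=283.0):
    """Urea concentration (M) at which kf equals ku, assuming linear chevron arms."""
    rt = R_CAL * temp_k
    return rt * math.log(kf_water / ku_water) / (m_ts_d + m_ts_n)


# Hypothetical rate constants in water (s^-1) and unfolding m-value.
KF_WATER, KU_WATER, M_TS_N = 1200.0, 400.0, 200.0

# Midpoints obtained when m(TS-D) is fixed at each of the three values used in the text.
midpoints = {
    m_ts_d: kinetic_midpoint(KF_WATER, KU_WATER, m_ts_d, M_TS_N)
    for m_ts_d in (400.0, 480.0, 600.0)
}
```

Comparing the entries of `midpoints` with the equilibrium urea midpoint is one way to carry out the consistency check described above.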

R20A chevron
As an additional check on data fitting, we observed that the measured rate constant for R20A in 0 M urea is at, or is a little above, the midpoint of the transition (visual inspection of Fig. 2d). This places an upper bound of 1500 s⁻¹ on k_f (half the measured observed rate constant, since k_obs = k_f + k_u and k_f = k_u at the midpoint), consistent with the values in [...] to the 'medium' category.
Table 1. Equilibrium characterisation of L31W deletion mutants.

Contribution of specific side-chains to native state stability
As a preparation for Φ-value analysis, and in order to probe the contribution of different residues to the native state stability of L31W, we made a series of thirty-five non-disruptive side-chain deletion mutations (Table 1, Fig. 1b and Supplementary Fig. S1).
The greatest contribution to L31W stability (>2 kcal mol⁻¹) was made by a cluster of residues on the buried faces of the C-terminal end of both helices and at the beginning of the connecting loop. Of the five residues making up this cluster, Leu17 and Leu35 are buried in the hydrophobic core of the SAP domain, and Leu22 and the alkyl chain of Arg20 pack against them. Asp39 and the guanidinium group of Arg20 are solvent-exposed.
To probe the role of the surface-exposed residues further, we mutated Asp39 to both alanine and asparagine. Both mutations destabilised L31W by similar amounts (2.2 and 2.6 kcal mol⁻¹), indicating that the carboxylic acid group in the aspartate is forming a stabilising interaction. Mutating Arg20 to alanine destabilised L31W by a greater amount (>2.6 kcal mol⁻¹) than mutating Asp39. Of this destabilisation, 2.0-2.5 kcal mol⁻¹ is likely to arise from loss of a hydrogen bond or salt bridge between the guanidinium group of Arg20 and the carboxylic acid of Asp39 (Fig. 1d), and the remainder is likely to be due to the loss of the Arg20 alkyl chain packing against the hydrophobic core. We speculate that a further hydrogen bond may be formed between Asp39 and Tyr5. Like Arg20, Tyr5 is close to Asp39, and removal of the tyrosine hydroxyl destabilised L31W considerably (Y5F mutation; 2.4 kcal mol⁻¹).
Most residues within helix 1 played little role in stabilising the native state. However, comparison of T9S and T9A mutants enabled us to determine the energetic contribution of the N-capping Thr9 hydroxyl group to the overall stability of L31W (i.e. to determine the contribution of the virtual mutation S9A) to be 1.3 kcal mol⁻¹. Leu8 acts as a hydrophobic staple (L8A gave a ΔΔG_D-N of 2.0 kcal mol⁻¹).
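The virtual mutation S9A follows from a simple thermodynamic cycle: going from Thr to Ala can be split into Thr to Ser followed by Ser to Ala, so ΔΔG(S9A) = ΔΔG(T9A) - ΔΔG(T9S). The sketch below illustrates the subtraction, assuming that the two measured values are independent so that their errors add in quadrature. The input numbers are placeholders, not the measured values for T9A and T9S.

```python
import math


def virtual_mutation_ddg(ddg_full, ddg_partial, err_full=0.0, err_partial=0.0):
    """ddG for the second step of a two-step side-chain truncation.

    ddg_full:    ddG(D-N) for the complete truncation (e.g. T -> A)
    ddg_partial: ddG(D-N) for the first step alone (e.g. T -> S)
    Errors are combined in quadrature (assumes independent measurements).
    """
    ddg = ddg_full - ddg_partial
    err = math.sqrt(err_full ** 2 + err_partial ** 2)
    return ddg, err


# Placeholder inputs in kcal mol^-1, chosen only to illustrate the arithmetic.
ddg_virtual, err_virtual = virtual_mutation_ddg(2.0, 0.5, err_full=0.3, err_partial=0.4)
```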

Φ-value analysis of SAP L31W
In order to build a picture of the transition state of the SAP domain, we undertook a Φ-value analysis of L31W. Twenty of our thirty-five point mutants were suitable for Φ-value analysis [15]. All gave rise to apparent two-state chevron plots with linear arms (Fig. 2) which, in most cases, could be fitted with high confidence to yield Φ-values (Table 2). Three mutants, L17A, R20A and L35A, were so destabilised that it was not possible to fit the folding arm of the chevron reliably. However, by placing bounds on k_f and on the equilibrium denaturation midpoint, and by careful analysis of the resulting data fits, we could estimate into which category the Φ-values for these mutants fell (see Section 2.4 for full details).
Most Φ-values measured were fractional, which we interpret to indicate that the transition state ensemble for SAP L31W does not contain extensive regions of fully consolidated secondary structure (Fig. 1c & e). The highest Φ-values (Φ = 0.5-0.6) were measured for Leu8, Thr9 and Leu16. Leu8 and Thr9 form the base of helix 1, while Leu16 (on helix 1) participates in core hydrophobic packing. Alanine to glycine mutations at the N-terminus of helix 1 (which probe the degree of helix formation [2,16]) gave Φ-values ranging from 0.3 to 0.7, although the error on the largest value is considerable. Together, these results indicate that helix 1 is partially formed in the transition state for folding.
Our Φ-value mutants provide several probes of the degree of formation of the hydrophobic core in the L31W transition state. Leu13 and Arg20 both have low Φ-values, indicating that their core contacts are not formed in the transition state. Leu35 (helix 2) and Leu16 (helix 1) have medium Φ-values, with that for Leu35 being lower than that for Leu16. It was not possible to measure a Φ-value for Leu22 since the L22A mutant was too destabilised for kinetic [...] indicates that some hydrophobic collapse occurs in this region. We thus conclude that helix 1 is the most structured region in the transition state, with flickering native-like core contacts from Leu16 and Leu35.
Table 2. Kinetic characterisation of L31W deletion mutants.

Solvent accessibility of the Tho1 SAP transition state
Our measurements of SAP domain folding (here and in ref [8]) have all been made at 283 K because the baselines of our equilibrium chemical denaturation are best defined, and our measurements most accurate, at this temperature. However, many protein folding experiments are carried out at higher temperatures and we wanted some assurance that our conclusions were valid across a range of temperatures.
The denaturant m-value of a protein unfolding transition reports on changes in solvent-accessible surface area (ΔSASA) upon protein denaturation [17]. In order to determine whether the ΔSASA between unfolded and transition states (ΔSASA_‡-D) varied with temperature, we measured the kinetics of SAP domain folding at 10 K intervals between 283 K and 323 K (Fig. 3a and Table 3).
The Tanford β-value (β_T = m_‡-D/m_D-N) reports on ΔSASA_‡-D as the fractional ΔSASA upon formation of the transition state. Changes in β_T with temperature or chemical denaturant concentration can indicate Hammond behaviour [18][19][20] or one of many mechanistic changes [21][22][23][24][25]. The average value of β_T for Tho1 SAP was 0.70 ± 0.03 (mean ± S.E.M.) and did not vary with temperature.
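As an illustration of how β_T and its mean ± S.E.M. across temperatures could be computed, the sketch below uses the definition β_T = m_‡-D/m_D-N given above. The m-values listed are hypothetical placeholders, not the entries of Table 3, and the S.E.M. is assumed to be the sample standard deviation divided by the square root of the number of temperatures.

```python
import math
import statistics


def tanford_beta(m_ts_d, m_d_n):
    """Fractional change in solvent-accessible surface area at the transition state."""
    return m_ts_d / m_d_n


def mean_and_sem(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical (m_ts_d, m_d_n) pairs in cal mol^-1 M^-1, one per temperature.
m_value_pairs = [(480.0, 700.0), (470.0, 680.0), (500.0, 690.0), (460.0, 670.0), (490.0, 710.0)]

beta_values = [tanford_beta(m_ts_d, m_d_n) for m_ts_d, m_d_n in m_value_pairs]
beta_mean, beta_sem = mean_and_sem(beta_values)
```

A β_T that stays within its error across the temperature series, as reported above, is what would be expected if the position of the transition state does not shift with temperature.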
We also probed the folding of SAP L31W in buffer alone by measuring the observed rate constant, k_obs, as a function of temperature (Fig. 3b). We included the rate constants in the absence of denaturant from Table 3 as they overlaid the other kinetic data excellently and helped define the curvature in k_f at low temperatures. The fractional position of the L31W transition state at 322 K (calculated from the ratio ΔC_p(‡-D)/ΔC_p(N-D) [26]) is also a measure of fractional ΔSASA upon formation of the transition state analogous to β_T [17]. The ratio of heat capacities for SAP L31W was 0.9.
Both β_T and the ratio of heat capacities for Tho1 SAP are high, indicating that the greatest change in solvent-accessible surface area occurs between denatured and transition states. Thus the transition state of Tho1 SAP is a compact, partially dehydrated, structured species. The β_T value for Tho1 SAP did not vary with temperature, giving us confidence that the model of the transition state we have determined is likely to be relevant across a range of temperatures.

Brønsted/Leffler analysis of Tho1 SAP
As an alternative method of determining the extent of structure formation in the Tho1 SAP transition state, we plotted ΔΔG_‡-D against ΔΔG_D-N for each of our Φ-value mutants (Fig. 4). By comparison with Brønsted analysis [27] and Leffler α-values [28] of organic chemistry, linearity among residues present in the same region of the protein (e.g. within an element of secondary structure) indicates that this region is equally formed in the transition state. The gradient of the line gives the extent of formation of this region on a reaction coordinate of free energy [29]. Outliers in this analysis can indicate long-range contacts. Depending on the position of the outlier (i.e. above or below the trend line), a point may indicate long-range contacts formed in the transition state, or may indicate contacts formed in the native state but not present in the transition state.
For the SAP domain, there is a weak correlation between ΔΔG_‡-D and ΔΔG_D-N for those residues in helix 1 which make only local contacts in the native state (solid red and black points in Fig. 4; slope of best fit line = 0.4 ± 0.1; R² = 0.8). Two points in helix 1 and one point in helix 2 are outliers in this analysis (open circles in Fig. 4). These points are from residues which make long-range contacts in the SAP native state (L17A, R20A and L35A) and deletion of these contacts results in severe destabilisation of the SAP domain. The value of ΔΔG_‡-D for Leu35 is much higher than expected from the other residues in helix 2. This is consistent with Leu35 (but not the other probes in helix 2) forming native-like long-range contacts in the transition state, and is also indicated by the Φ-values for this residue. Relative to ΔΔG_‡-D, ΔΔG_N-D for both Leu17 and Arg20 is much larger than expected from other residues in helix 1. Both of these residues have low Φ-values, and their position as outliers on the Brønsted plot is consistent with both residues forming long-range contacts in the native state which are not formed in the transition state.
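A Brønsted-type fit of this kind can be sketched as an ordinary least-squares line through (ΔΔG_D-N, ΔΔG_‡-D) pairs, with outliers identified from their residuals. This is a minimal illustration, assuming an unweighted fit with a free intercept, which may differ from the fitting procedure used for Fig. 4. The data points are hypothetical.

```python
def least_squares_line(x, y):
    """Unweighted least-squares slope, intercept and R^2 for y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def residual(x_point, y_point, slope, intercept):
    """Vertical distance of a point from the fitted line (positive = above)."""
    return y_point - (slope * x_point + intercept)


# Hypothetical ddG values in kcal mol^-1 for residues making only local contacts.
ddg_d_n = [0.5, 1.0, 1.5, 2.0]
ddg_ts_d = [0.2, 0.4, 0.6, 0.8]
slope, intercept, r_squared = least_squares_line(ddg_d_n, ddg_ts_d)

# Hypothetical strongly destabilising mutant with little effect on the transition state.
outlier_residual = residual(3.0, 0.3, slope, intercept)
```

In this sketch a point well below the line, such as the hypothetical one above, corresponds to contacts that stabilise the native state but are absent from the transition state, while a point well above the line corresponds to contacts already formed in the transition state.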

Comparing contacts stabilising the native and transition states
The contacts which contribute most to the native state stability of Tho1 SAP (ΔΔG_D-N > 2.6 kcal mol⁻¹) were not formed in the transition state (low Φ-values for Y5F, R20A and D39A/N; and Brønsted analysis above). These contacts form a network of hydrogen bonds in the native state (Fig. 1d). Likewise, the sidechain of Leu17 forms part of the SAP hydrophobic core and contributes >2.6 kcal mol⁻¹ to native state stability, yet it only has a Φ-value of 0.1. From this, we conclude that the late stages of Tho1 folding involve 'locking' the diffuse transition state ensemble into the native state using strong, long-range interactions.

Summary & conclusions
We carried out a Φ-value analysis of the folding reaction of the Tho1 SAP domain in order to build a structural model of the transition state. All Φ-values were fractional, indicating that no elements of secondary structure were fully formed in the transition state. Overall, helix 1 was the most highly structured region in the transition state with occasional native-like core contacts from Leu16 and Leu35.
We also probed the folding of the SAP domain by determining the degree of ΔSASA at different temperatures. Both β_T and the analogous ratio of heat capacities indicated a compact transition state. β_T remained constant across the range 283-323 K, indicating no gross changes in the folding mechanism.
The contacts which contributed most to Tho1 SAP native state stability were not formed in the transition state. We thus conclude that the early stages of SAP domain folding involve chain collapse and exclusion of solvent water, while the late stages involve formation of strong long-range interactions.
[...] Tables 1 and 2. A minimum error of 10% has been applied to ΔΔG_‡-D.