Spatially resolved free-energy contributions of native fold and molten-globule-like Crambin

The folding stability of a protein is governed by the free-energy difference between its folded and unfolded states, which results from a delicate balance of much larger but almost compensating enthalpic and entropic contributions. The balance can therefore easily be shifted by an external disturbance, such as a mutation of a single amino acid or a change of temperature, in which case the protein unfolds. Effects such as cold denaturation, in which a protein unfolds because of cooling, provide evidence that proteins are strongly stabilized by the solvent entropy contribution to the free-energy balance. However, the molecular mechanisms behind this solvent-driven stability, their quantitative contribution in relation to other free-energy contributions, and how the involved solvent thermodynamics is affected by individual amino acids are largely unclear. Therefore, we addressed these questions using atomistic molecular dynamics simulations of the small protein Crambin in its native fold and a molten-globule-like conformation, which here served as a model for the unfolded state. The free-energy difference between these conformations was decomposed into enthalpic and entropic contributions from the protein and spatially resolved solvent contributions using the nonparametric method Per|Mut. From the spatial resolution, we quantified the local effects on the solvent free-energy difference at each amino acid and identified dependencies of the local enthalpy and entropy on the protein curvature. We identified a strong stabilization of the native fold by almost 500 kJ mol−1 due to the solvent entropy, revealing it as an essential contribution to the total free-energy difference of (53 ± 84) kJ mol−1. Remarkably, more than half of the solvent entropy contribution arose from induced water correlations.


INTRODUCTION
Folding free energies of proteins at room temperature typically range in the order of a few tens of kilojoules per mol (1)(2)(3)(4), which approximately corresponds to the interaction energy of just a few hydrogen bonds (5). This small folding free energy results from a delicate balance between competing enthalpy and entropy contributions, each of which are much larger in magnitude but compensate each other almost entirely (2). Shifting the balance by external factors, e.g., because of a mutation of an amino acid or simply by changing the temperature, can lead to drastic and sometimes counterintuitive consequences. For example, proteins can unfold at low temperatures (2,6,7), although the protein-internal interaction energies favor the folded state and the entropic protein free-energy contribution ÀTDS, which favors the unfolded state, should decrease in magnitude. This effect, known as cold denaturation, seems paradoxical only if the protein, but not the solvent, contributions are considered for the free-energy difference estimation. Indeed, cold denaturation has been attributed to a temperature-induced weakening of the hydrophobic effect (2), which arises from the thermodynamics of the solvation shell. The hydrophobic effect is known to be a major driving force for protein folding and stability (2,(8)(9)(10).
Cold denaturation exemplarily illustrates the importance of solvation thermodynamics for protein stability. However, its quantitative role in relation to other free-energy contributions, the molecular mechanisms of solvent-driven protein stability, and how the solvent thermodynamics is affected by individual amino acids are still largely unresolved. To study the effect of solvation on protein stability, a decomposition of the individual solvent-and proteinrelated free-energy contributions of a protein native fold and the unfolded state is required. To furthermore characterize the effect of individual amino acids on the solvent free-energy contribution, a spatially resolved map of local solvent enthalpies and entropies is needed.
Here, we address these questions using all-atom molecular dynamics (MD) simulations of the quite typical soluble protein Crambin, for which high-resolution structural data are available. Its small size furthermore allowed us to study the full hydration shell. Because a representative ensemble of the extremely broad distribution of unfolded protein configurations is not sufficiently sampled within the timescales of MD simulations, we assessed the free-energy difference between the native fold and a transiently stable moltenglobule-like conformation. Although in a molten-globulelike state, hydrophobic collapse has already partially occurred, we still expect a solvent-induced thermodynamic driving force toward the native fold.
We decomposed the free-energy difference between the two conformations into enthalpy contributions from protein-protein, protein-solvent, and solvent-solvent interactions and entropy contributions from both solvent and protein. To accurately compare the spatially resolved solvent entropy contributions to their respective enthalpic terms, we preferred a nonparametric method that also captures the entropy effects of multibody correlations. We therefore selected Per|Mut (11)(12)(13)(14) to calculate the spatially resolved solvent entropy over alternative methods, such as the grid inhomogeneous solvation theory (15)(16)(17)(18)(19)(20)(21), which usually excludes multibody correlations, and the grid cell theory (22) or three-dimensional two-phase thermodynamics (23)(24)(25), which both rely on strong model assumptions. Per|Mut provides a further solvent entropy decomposition into the entropy of the individual water molecules and the entropy changes arising from multibody water-water correlations from both translational and rotational degrees of freedom. From the decomposition, we identified a substantial stabilizing effect of the solvent free-energy contributions of almost 500 kJ mol À1 . The further breakdown into different interaction energies and entropic water correlation terms allowed for an interpretation of the free-energy changes on a molecular level. Here, a significant increase of water-water correlations around the molten-globule-like conformation compared to the native fold was found, which corresponds to a stabilizing entropic free-energy contribution. From the spatial resolution of both solvation enthalpy and entropy, we obtain the local free-energy changes due to specific amino acid side chains and capture free-energy effects of the protein shape.

MD simulations
All MD simulations were carried out using the software package Gromacs 2018 (26)(27)(28)(29)(30), the CHARMM36m force field (31)(32)(33) with the TIP3P water model (34) (CHARMM version), and the leapfrog integrator with a 2 fs time step. Unless stated otherwise, the temperature was kept at 300 K using the V-rescale thermostat (35) with a time constant of 0.1 ps. For equilibration simulations under NPT (fixed number of particles, constant pressure, and constant temperature) conditions, the Berendsen barostat (36) at 1 bar pressure with a time constant of 0.5 ps was used. All production runs were carried out using the Parrinello-Rahman barostat (37,38) with a time constant of 1.0 ps and 1 bar pressure. During all simulations, a short-range Coulomb cutoff of 1.2 nm was used; long-ranged electrostatic interactions were calculated using the particle-mesh Ewald method (39). Lennard-Jones forces (40) were switched smoothly between 1.0 and 1.2 nm. Bond-constraints were imposed on all water molecules using the SETTLE algorithm (41) and on all other bonds to hydrogen atoms using the LINCS algorithm (42). During production runs, configurations were stored every 10 ps for subsequent analysis.

System setup
A high-resolution structure of the protein Crambin (46 amino acids) was retrieved from the Protein Data Bank (1CBN) (43). The molecule was placed inside a cubic simulation box of size (7 nm) 3 and solvated with 10,950 water molecules. The system was then subjected to gradient descent energy minimization, which was terminated once the largest force on any atom decreased below 100 kJ mol À1 nm À1 . To equilibrate the solvent water, a 1 ns MD simulation under NVT (fixed number of particles, constant volume, and constant temperature) conditions was started, during which all heavy atoms of the protein were restrained using a force constant of 1000 kJ mol À1 nm À2 . To relax the box size to 1 bar, the system was subsequently simulated for 10 ns under NPT conditions, during which the restraints on the protein were maintained. Subsequently, the protein was equilibrated during a further 10 ns simulation under NVT conditions.
To obtain a molten-globule-like conformation of an unfolded state, the equilibrated protein was unfolded during 50 ns of heating and subsequent annealing. Here, the temperature was linearly increased to 600 K during the first 5 ns; the temperature was then kept at 600 K for the following 40 ns and eventually decreased back to 300 K during the last 5 ns. The unfolded structure was simulated for 2 ms under NPT conditions (Parrinello-Rahman barostat), during which a partial refolding occurred, resulting in the formation of two antiparallel b-sheets, shown in Fig. 1 B,   FIGURE 1 Ribbon-style visualization of Crambin in its native fold (A) and in a molten-globule-like conformation (B). Images were rendered using VMD (44). To see this figure in color, go online.
whereas the rest of the molten-globule-like conformation remained unstructured. To ensure a proper equilibration of the folded system, it was also simulated for further 2 ms.
Ensembles of four Crambin structures were retrieved for both the native fold and the molten-globule-like conformation by extracting equidistant frames from the last 3 ns of each of the equilibration runs. The ensembles therefore cover short-timescale processes such as the side chain reorientation but do not capture larger configurational motions, particularly of the molten-globule-like conformation.
A total of 10 5 statistically independent microstates, which are necessary to converge the nine-dimensional configurational subspace density estimates of the three-body correlations, were obtained at 10 ps intervals from 1 ms production runs under NPT conditions for each of the eight replicas. To restrict configurational changes of the protein, each atom was restrained using a force constant of 1000 kJ mol À1 nm À2 .

Entropy calculation
Hydration entropies were calculated from the production runs of the eight replicas using the method Per|Mut (13,14). For each replica, a permutation reduction (11,12) was carried out, which enhances sampling of the highdimensional water configuration space by the Gibbs factor N! and localizes the water molecules without changing their physical and statistical properties. The first trajectory frame was used as the reference configuration.
Spatially resolved entropies were calculated for the 4000 water molecules closest to the protein using a third-order mutual information (MI) expansion (45)(46)(47)(48). Pairwise MI terms of the translational and rotational entropy, as well as for the translation-rotation correlation, were calculated for water molecules with a maximal average distance of 1.0 nm. For third-order terms, a 0.33 nm cutoff was used. The individual entropy terms in the expansion were calculated using a k-nearest-neighbor algorithm with a value of k ¼ 1 for all expansion orders. The entropy of the solvation shell consisting of the closest 1000 molecules was used for analysis. From those results, the entropy of the inner solvation shell, consisting of the closest 1000 molecules (equivalent to a thickness of $0.8 nm), was calculated as S internal À I/2. Here, S internal is the sum of all expansion terms concerning molecules from the inner solvation shell and I is the sum of all MI terms with a molecule inside and a molecule outside the inner solvation shell. Error bars were obtained as standard deviations of the four replicas per state.
Protein entropies were estimated for both states using Schlitter's method (49) on all nonhydrogen protein atoms from additional unrestrained MD simulations. Here, the molten-globule-like conformation partially refolded after 1.5 ms (see Fig. S1). To obtain values consistent with the restrained states, for which the remaining free-energy terms were calculated, the analysis was limited to the first 1 ms of the trajectories. Errors were estimated from a second analysis using only 0.5 ms of the trajectories.

Enthalpy calculation
Enthalpies were calculated as U þ pV, where U is the sum of all interaction energies, averaged over the MD trajectory. The work term pV was included as the product of pressure (1 bar) and average box volume.
For each water molecule of the inner solvation shell, the interaction energies with the protein, the outer solvation shell, and all other molecules of the inner shell were calculated. Lennard-Jones parameters and partial charges were taken from the CHARMM36m force field, and a 2 nm cutoff was used for Lennard-Jones interactions. The interaction energy of the protein was calculated directly with the software package Gromacs.
Enthalpy and interaction energy differences were then calculated with respect to bulk water (Figs. 2 and 3), as well as between the native fold and molten-globule-like configurations (Fig. 4). Interaction energy differences were defined as the difference between the mean interaction energies in each of two considered states.

Visualization of entropy and enthalpy
The entropy and interaction energy contributions of each molecule were calculated by splitting the two-body MI terms, three-body MI terms, and the pairwise interaction energies equally between the respective molecules, accounting for the otherwise inherent double or triple-counting (50) of pairwise interaction energies as well as two-and three-body correlations. The simulation box was then divided by a 129 Â 129 Â 129 grid. The entropy and energy values of each voxel were calculated as the average contribution of all water molecules that visited the voxel, weighted by the number of times the voxel was visited by each water molecule. Because of permutation reduction, each voxel was only visited by a few of the localized water molecules.

Error estimation
Errors of solvent entropy and solvent enthalpy contributions of each water molecule were estimated as standard deviation of bulk-phase water molecules, assuming that water molecules close to the solvent are subject to the same spread (14). The estimated errors are listed in Table 1.
Here, U SS is the solvent-solvent interaction energy, and ÀTS trans and ÀTS rot are the translational and rotational free-energy contributions, respectively. Both are split into a single-body term ÀTS 1 and a multibody term ÀTS R2 . ÀTS transÀrot R2 corresponds to the entropy contribution from translation-rotation correlations.

Residue contributions and convexity
The local interaction energy and the local solvation entropy contribution around each amino acid were calculated as the average contribution of all water molecules within 0.4 nm of the amino acid. Here, the distance was measured as the minimal distance to any atom of the residue. The contribution of the pV term was small compared to the other contributions (19.8 kJ mol À1 for both the native fold and molten-globule-like conformations) and was therefore neglected. The local free energy was thus calculated as the sum of the interaction energy and the entropy contribution.
As a measure of the local topology, the convexity of the protein-solvent surface was calculated for each residue as where N d is the number of water molecules within a radius of d ¼ 1 nm around the amino acid, V res is the volume of the residue itself, and n W is the number density of water. Accordingly, a convexity of c ¼ 0 indicates that the entire volume within the radius d around the residue is occupied by other protein atoms, i.e., that the residue is buried. Conversely, a value of c ¼ 1 shows that the residue is fully exposed to the solvent.
To quantify the hydrophobicity of a residue and for comparison with our calculations, the empirical hydrophobicity scale by Zhao et al. (51) was used; a small value indicates a hydrophilic amino acid, whereas a large value denotes hydrophobicity.

Spatially resolved solvation free-energy contributions
To quantify and characterize the solvent contribution to the stability of the protein Crambin, we carried out and compared MD simulations of the native fold and a moltenglobule-like protein conformation, as described in the Materials and methods. We will first address the contribution of individual amino acids and the protein shape to the solvent free-energy contributions for a number of exemplary resi-dues and subsequently provide a more systematic assessment. To this end, we calculated spatially resolved solvent interaction energies and corresponding entropies for both the native fold and the molten-globule-like conformation, respectively. FIGURE 2 Spatially resolved hydration free-energy decomposition of Crambin in its native fold (A) and the molten-globule-like conformation (B) relative to bulk water, visualized on a representative two-dimensional slice through the center of the molecules. The total interaction energy DU (top row, left) is split into protein-solvent interactions DU PS (top row, center), and solvent-solvent interactions DU SS (top row, right). In a similar manner, the translational and rotational entropy contributions ÀTDS (center rows) are split into the single-molecule entropy ÀTDS 1 and multibody correlations ÀTDS R2 . The entropy contribution from the translation-rotation correlation is shown in the bottom left. The spatially resolved free-energy change (sum of the first column) is shown in the bottom right. All numerical values are given in kJ mol À1 , and important residues are marked by arrows. To see this figure in color, go online. Because any spatially resolved picture of the solvation shell thermodynamics depends on and changes with the particular conformation of the protein, we chose a representative conformation for each of the folded and the moltenglobule-like ensembles. To obtain better statistics, we carried out simulations of four similar replicas for each conformation. Because the molten-globule-like conformation is structurally unstable, we restrained the structures as described in the Materials and methods.
For each of the 2 Â 4 replicas, spatially resolved solvent interaction energies DU were calculated directly from the average interaction energies between protein and solvent (DU PS ) and between the solvent molecules (DU SS ), assuming pairwise interaction energy functions. Solvent entropy contributions were calculated using the method Per| Mut (13,14), which yields a spatially resolved entropy decomposition into terms S trans

R2
. Here, S trans 1 and S rot 1 are the translational and rota-tional entropies of the individual water molecules, ignoring any correlations. S trans R2 and S rot R2 capture the entropy loss due to two-and three-body correlations between water molecules for translational and rotational degrees of freedom, respectively. Higher-order correlations were assumed to be negligible to approximate the solvent entropy. A positive value of ÀTDS R2 denotes an unfavorable free-energy increase relative to bulk water because of increased waterwater correlations. Conversely, a negative value of ÀTDS R2 indicates a favorable free-energy contribution because of weakened correlations relative to bulk water. Similarly, DS transÀrot R2 is the entropy change due to correlations between translational and rotational degrees of freedom of water molecules. Note that contrary to other decompositions of entropy (52)(53)(54), ÀTDS 1 and ÀTDS R2 are not the associated entropy differences to changes of the protein-solvent (U PS ) or solvent-solvent (U SS ) interaction energies, respectively, but provide an independent decomposition into singlebody entropies and multibody correlations. Fig. 2 compares these spatially resolved free-energy contributions for the native fold (Fig. 2 A) and the moltenglobule-type conformation (Fig. 2 B), respectively. As shown in the upper left of Fig. 2 A, the interaction energy DU of water molecules near the protein is generally more negative (stronger) than in bulk water, particularly close to charged residues such as R17 and D43, with interaction energy differences of À48 and À27 kJ mol À1 , respectively. Similarly, in the vicinity of the polar hydroxyl groups of the tyrosine residues Y29 and Y44, the energy is lower than in bulk water (DU ¼ À22 kJ mol À1 ), and close to residue C40, the interaction energy difference is À27 kJ mol À1 . Residues T2, F13, and R10 form a ''groove'' in the native fold structure, with strong interaction energies below À30 kJ mol À1 relative to bulk water. Fig. 2 A (top row, center and right) reveals the individual interaction energy contributions from the interactions between protein and solvent DU PS , as well as from interactions between solvent molecules DU SS . The interaction energy of hydration is dominated by the protein-solvent interactions, in which charged and polar residues show particularly favorable interactions. Although solvent-solvent interactions (top row, right) have no net effect on the overall free-energy difference (52)(53)(54), they partially counteract the locally

0.18
The solvation thermodynamics of Crambin favorable protein-solvent contributions and therefore contribute to the total enthalpy component. Here, water molecules that bind to charged residues perturb the hydrogen bond network of the surrounding water, thereby weakening the solvent-solvent interactions. Accordingly, around the charged residues R10, R17, and D43, the solvent-solvent interaction energy becomes strongly unfavorable with values between þ17 and þ30 kJ mol À1 relative to bulk water. Similarly, but to a lesser extent, solvent-solvent interactions are weakened for water molecules around the hydroxyl groups of residues Y29 and Y44, resulting in unfavorable energy changes ranging between þ10 and þ15 kJ mol À1 .
Next, we quantified the impact of these strong interactions on the local entropy. For the charged residues R10, R17, and D43, as well as for the polar residues Y29 and Y44, both the translational and rotational solvent entropies are reduced (Fig. 2 A, center rows). At these sites, the free-energy contributions ÀTDS vary from þ5.5 to þ12 kJ mol À1 (translational) and from þ3.5 to þ8.2 kJ mol À1 (rotational) and therefore weaken the otherwise strong solvation. The entropy loss is dominated by the first-order terms DS trans 1 and DS rot 1 , revealing that the reduced mobilities of the individual water molecules due to the protein (partial) charges, and not water-water correlations, are the main cause of the entropy loss.
Translational entropy free-energy contributions from multibody correlations ÀTDS R2 very close to the charged and polar protein parts are indeed relatively weak, but not negligibly so, with values between À2.0 and À1.5 kJ mol À1 relative to bulk water. We note that the water-water correlations, here quantified by ÀTDS R2 , differ from the solvent-solvent entropy contribution, as defined, e.g., by Ben-Naim (52,53) and Yu et al. (54), and hence, our results are compatible with their finding that the latter is exactly compensated by the respective enthalpy term. Remarkably, for second-shell water molecules around charged residues, these correlations again change sign, e.g., for residues R10 (þ1.0 kJ mol À1 ), D43 (þ1.2 kJ mol À1 ), and Y44 (þ0.6 kJ mol À1 ), thus being less favorable compared to bulk water. A similar effect is seen for the rotational multibody correlations in which the free-energy contribution almost vanishes at the first hydration shell (e.g., þ0.2 kJ mol À1 at R10 and þ0.4 kJ mol À1 at Y29 and Y44, each with an estimated error of 0.22 kJ mol À1 ) but increases to values from þ0.7 to þ1.0 kJ mol À1 within the second hydration shell.
A possible explanation for this effect is that the strong binding of the first-shell water molecules (for rotational degrees of freedom, the strong directionality due to the water dipole moment (55)) results in reduced fluctuations, which limits the correlations with the remaining water molecules (for a more detailed analysis, see Supporting materials and methods). As a result, the multibody correlation entropy contribution is more favorable for the bound molecules compared to bulk water. In contrast, the fluctuations of the second-shell water molecules are larger because of the reduced electrostatic interactions with the charged and polar residues, as shown by bulk-like single-body entropy terms. Therefore, the correlations with the adjacent water molecules reduce the entropy by a larger amount.
As the last remaining entropy term, we also quantified the entropy reduction due to translation-rotation correlations (Fig. 2 A, lower left), which reveals increased correlations close to the protein surface, particularly in the vicinity of charged residues and polar chemical groups. Here, the respective solvent free-energy contribution is increased by þ3.2 kJ mol À1 (R10), þ5.9 kJ mol À1 (R17), þ2.7 kJ mol À1 (Y29), þ5.0 kJ mol À1 (D43), and þ4.0 kJ mol À1 (Y44) compared to bulk water. The correlations and the corresponding unfavorable free-energy changes arise because the strong directionality of the water molecules close to a protein (partial) charge quickly decreases with distance. A water molecule close to a charged amino acid is therefore, on average, more strongly oriented than a water molecule at a larger distance.
To characterize the local solvent free-energy change relative to bulk water, we considered the sum of all free-energy contributions ( Fig. 2 A, bottom right). Close to hydrophilic residues, enthalpic contributions dominate, such that the total free-energy change is favorable, as expected. The water molecules that form hydrogen bonds to the hydroxyl groups of Y29 and Y44 contribute À5.3 and À6.2 kJ mol À1 to the hydration free-energy difference of the protein, respectively. Around the charged residues R17 and D43, the free energy per water molecule is reduced by À17.5 and À32 kJ mol À1 , respectively. Close to the residue C40, the free energy is reduced by À5.4 kJ mol À1 , and in the ''groove'' formed by residues T2, R10, and F13, the free-energy contribution per water molecule is À6.6 kJ mol À1 compared to bulk water. Here, the average interaction energy of the closest water molecule to the three amino acids is þ4, À57, and 0 kJ mol À1 , respectively. Although the entropic contributions cannot be assigned to specific residues, their total contribution to the free energy ($30 kJ mol À1 ) is smaller than the enthalpic contribution of R10. In this sense, the favorable free energy is enthalpically dominated because of the charged residue R10. Because of the water-water correlation effects discussed above, second-shell water molecules around protein charges contribute unfavorably to the hydration free energies, with values ranging from þ1.8 to 3.2 kJ mol À1 per molecule.
Next, we carried out the same analysis for a moltenglobule-like conformation to identify possible differences in the solvation thermodynamics between the two states. As shown in Fig. 2 B, the enthalpic and entropic changes for water molecules close to charged and polar residues, e.g., for Y29, D43, and Y44, are similar to the native fold. However, the spatial distribution of the individual free-energy contributions is different. Whereas in the folded state, prominent differences of the local free-energy contributions are isolated and can be well attributed to individual (charged or polar) residues, a particularly large region of favorable interaction energies and unfavorable entropies is seen around residues D43, Y44, and F13 of the molten-globulelike fold. Here, we considered two possibilities: first, the (coincidental) colocalization of specific amino acids in the molten-globule-like state could give rise to the spread-out distribution. Alternatively, the locally concave shape of the molten-globule-like conformation could provide a possible explanation of the large volume of the affected region. In the latter case, the semicavity exposes a markedly larger surface area to the solvent, which strengthens the solvent-protein interactions (À50 kJ mol À1 per molecule) but weakens the solvent-solvent interactions (þ20 kJ mol À1 ) as the number of neighboring water molecules is reduced because of the protein geometry. Correspondingly, the concave shape restricts the mobility of the water molecules, resulting in an unfavorable entropy change compared to bulk water (þ5.8 kJ mol À1 translational, þ5.0 kJ mol À1 rotational). Indeed, similar entropy and enthalpy effects were also seen to a smaller extent at the concave ''groove'' formed by residues T2, R10, and F13 in the native fold. To probe whether this observation is anecdotal, we systematically compared the local curvature to the local free-energy contributions of all amino acids in the next subsection.
In addition to the single-body entropy reduction in the concavity, we also observed unfavorable entropy changes due to the multibody correlation terms, which contribute þ2.2 kJ mol À1 (translational), þ2.1 kJ mol À1 (rotational), and þ9.5 kJ mol À1 (translation-rotation correlation). In light of the geometry-induced weaker solvent-solvent interactions, such high correlations are unexpected. To check whether this observation is anecdotal or, alternatively, more general, we systematically compared the multibody entropy terms to the local convexity for each amino acid of the 2 Â 4 replicas. As shown in Fig. S2, there are indeed stronger correlations (i.e., unfavorable free-energy contributions) between rotational degrees of freedom ðÀTDS rot R2 Þ and between translational and rotational degrees of freedom ðÀTDS transÀrot R2 Þ for smaller convexity values (i.e., locally more concave surfaces). Translational correlations ðÀTDS trans R2 Þ show an inverted dependence on the local convexity. So far, we are unable to provide an explanation for these effects.
For the native fold, as well as for the molten-globule-like conformation, the well-known tug of war between enthalpic and entropic free-energy contributions results in a partial compensation of the two contributions (56). For the native fold, this compensation often also applies to the multibody correlation entropies and the solvent-solvent interaction energies, e.g., close to residues Y29, Y44, and R10. However, this compensation of solvent-solvent terms is not seen for water molecules in the concavity of the molten-globulelike conformation, which might be an effect of the concave shape of the protein. Despite the partial enthalpy-entropy compensation, the solvent free energy is mainly affected by the protein-solvent interaction energies.
The entropy contributions in the native fold state mainly originate from water molecules that are bound to protein charges and mostly affect the first hydration shell. To the contrary, the entropy contributions in the molten-globulelike state are more spread out, and the second hydration shell is also significantly affected. This observation might be a direct consequence of the hydrophobic driving forces of protein folding (9), in which hydrophobic residues, which result in increased multibody correlation entropies (14,57,58), are predominantly packed into the protein interior of the native fold.
We furthermore note that the water entropy seems to become more unfavorable in concave parts of the molecules, for example, as seen in the ''groove'' at residues T2, R10, and F13 for the native fold and around residues R13, D43, and Y44 for the molten-globule-like conformation. The water interaction energy seems to show a compensating effect.

Residue contributions
So far, we have focused on a few illustrative example residues. For a more comprehensive and systematic assessment of the protein-shape effect on each of the local free-energy contributions, we compared the free energies close to all residues to the local convexity, calculated as described in Residue contributions and convexity. Here, the interaction energies and solvation entropies associated to each amino acid were calculated as the average contribution of all water molecules within 0.4 nm of the amino acid. The reported results therefore depend on this cutoff, which was chosen to correspond to the range within which water molecules were affected by amino acids, as shown in Fig. 2. To distinguish protein topology effects from those of the solvation properties of the individual amino acids, we furthermore compared the local free-energy changes around each residue to its hydrophobicity index (51). Fig. 3 reveals a clear correlation between the local interaction energy differences with respect to bulk water and the convexity for both the native fold (Fig. 3 A) and the moltenglobule-like states (Fig. 3 B), reflected in Pearson correlation coefficients of 0.50 and 0.42, respectively. Here, the interaction energy of the water molecules at a concave surface (convexity z 0.3) is reduced by up to 40 kJ mol À1 for the native fold and by up to 33 kJ mol À1 for the moltenglobule-like fold, whereas the interaction energy of water molecules at convex surfaces of the protein (convexity > 0.8) only differs by a small amount (DU > À5 kJ mol À1 ) relative to bulk water. This effect is to be expected, as more solvent-exposed residues interact with a larger number of water molecules.
An even stronger but inverse correlation (correlation coefficients of À0.76 and À0.71 for the native fold and the molten-globule-like states, respectively) is seen in Fig. 3 for the solvation entropy contribution ÀTDS; here, the water entropy becomes strongly unfavorable at concave surfaces, with contributions of up to 28 kJ mol À1 for the native fold and of 24 kJ mol À1 for the molten-globule-like configurations. Correspondingly, the entropy contributions at convex surfaces are small (ÀTDS < 5 kJ mol À1 ). This observation is in line with previous reports about unfavorable water entropies and favorable water enthalpies in cavities (59,60) and altered water behavior at concave surfaces (61).
Interestingly, the convexity dependencies of enthalpy and entropy almost compensate each other, such that the local free energy, shown on the top right of Fig. 3, A and B, shows no significant dependency on the surface convexity (r ¼ 0.21 and À0.08 for the native fold and the moltenglobule-like state, respectively). Furthermore, the side chain polarity, as indicated by the color code in Fig. 3, has no significant impact on the convexity effects of interaction energy and entropy.
As shown in the bottom rows of Fig. 3, A and B, there is only a modest correlation of the local interaction energy and entropy contributions with the hydrophobicity index of each amino acid (r ¼ 0.32 and 0.20 for native fold and moltenglobule-like state, respectively). However, as expected, the local free energy correlates more strongly with the hydrophobicity index (r ¼ 0.39 and 0.42 for native fold and molten-globule-like state, respectively), in which the most hydrophilic residues show favorable local solvation free energies of À16 to À5 kJ mol À1 , whereas local free-energy changes of À11 to þ3 kJ mol À1 are attributed to hydrophobic residues. Unexpectedly, for the nonpolar residues, no correlation between their solvation free energies in the protein context and their hydrophobicity index is seen. As shown in the bottom right of Fig. 3, A and B, the most favorable free energy is observed around charged residues (colored orange), with a free-energy change of (À8.1 5 2.3) kJ mol À1 . Around polar residues (colored cyan), the average free-energy change is (À3.14 5 3.72) kJ mol À1 ; around apolar residues (colored gray), an average contribution of (À1.53 5 4.45) kJ mol À1 is seen.
Comparing the native fold with the molten-globule-like conformation, the most striking difference is seen for the charged residues, for which the local free-energy contribution is more favorable for the native fold ((À9.64 5 2.58) kJ mol À1 ) compared to the molten-globule-like conformation ((À6.53 5 1.97) kJ mol À1 ). This finding agrees with the well-known and important mechanism of stabilizing a folded state by an optimized geometry that maximally exposes hydrophilic charged residues.

Free-energy decomposition
To obtain a quantitative understanding of the overall hydration contribution to protein stability, we decomposed the total free-energy difference DG ¼ G m À G f (following Dias et al. (2)) of the molten-globule-like conformation and the native fold into individual free-energy contributions. In addition to solvent contributions, we also calculated protein interaction energies U PP directly from the ensemble of the eight restrained replicas, as well as protein entropies S P from unrestrained MD simulations of the native fold and the molten-globule-like conformation, respectively. To ensure that the unrestrained simulation of the moltenglobule-type only system samples the conformation of the four similar molten-globule-like ensemble members used for the calculation of the other free-energy contributions, the simulation time was restricted to 1 ms. During this time, the conformation did not change markedly, with a root-mean-square deviation below 0.7 nm. Accordingly, our free-energy budget includes only part of the conformational entropy of the whole molten-globule-type ensemble. Fig. 4 summarizes the calculated differences between the native fold and the molten-globule-like conformation for the various free-energy contributions. Positive values indicate a stabilization of the native fold, and negative values favor the molten-globule-like conformation. Here, absolute free energies, enthalpies, and entropy contributions were calculated for each of the 2 Â 4 replicas separately, from which average differences between the two states were then obtained. As expected, we obtained a total free-energy difference that favors the folded state ((53 5 84) kJ mol À1 ). Although, to the best of our knowledge, a measured value is not available, this result agrees with the expected range of a few tens of kilojoules per mol (2,3).
Likely because the molten-globule-like conformation has a larger solvent-accessible surface area ((36.9 5 0.6) nm 2 vs. (31.3 5 0.3) nm 2 for the native fold), the total interaction energy between the protein and inner solvent shell U PS destabilizes the native fold by (À706 5 216) kJ mol À1 with respect to the molten-globule-like conformation. In contrast, the interaction energies within the inner solvation shell (closest 1000 molecules), given by DU SS ¼ (287 5 86) kJ mol À1 , partially compensate the destabilizing contribution of DU PS . The work term pDV ¼ (À0.002 5 0.002) kJ mol À1 shows no significant difference between the two conformations. As expected, the internal interaction energies of the protein strongly stabilize the folded state (DU PP ¼ (648 5 59) kJ mol À1 ). Overall, the enthalpy contributions strongly favor the folded state (DH ¼ (228 5 106) kJ mol À1 ), in line with textbook thermodynamics of protein folding.
Next, we quantified all entropic contributions to the freeenergy difference, which are expected to largely compensate the strongly stabilizing enthalpy difference. Indeed, the protein entropy strongly favors the molten-globule-like state even though only one single metastable conformation was considered (ÀTDS P ¼ (À673 5 42) kJ mol À1 ). Crucially, however, all solvation entropy terms stabilize the native fold, adding up to 498 kJ mol À1 . One of the largest contributions is the singe-particle translational entropy S trans 1 ¼ (139 5 22) kJ mol À1 . In contrast, the respective rotational entropy contribution, which many studies focus on, yields only a difference of (68 5 17) kJ mol À1 . Also intriguingly, the second-and third-order MI terms S trans R2 and S rot R2 , which denote the two-and threebody correlations, contribute large differences of (39 5 6) kJ mol À1 and (63 5 4) kJ mol À1 , respectively. Strikingly, the largest solvent entropy contributions stem from the translation-rotation correlation entropy, which favors the native fold by (189 5 12) kJ mol À1 . Obviously, these entropies, which arise from the correlated motion of water molecules, contribute markedly to the folding thermodynamics of Crambin.
Interestingly, partial error compensation between the free-energy terms shown in Fig. 4 results in a total free-energy error (584 kJ mol À1 ) that is smaller than the error of some of its components (e.g., DU PS with an error of 5216 kJ mol À1 ). We think that this is a result of the fact that free-energy differences typically converge faster than their individual components; however, this could also be a coincidental result of the limited sample size of four replicas for each of the two states.
In summary, the largest contribution to solvent-induced entropic stabilization (498 kJ mol À1 ) of the native fold is not due to the mobility of single water molecules but to their correlated motion (291 kJ mol À1 ). Conversely, the proteinsolvent interaction energies DU PS and the protein entropy DS P are the only destabilizing terms.

CONCLUSIONS
To quantitatively assess the role of the protein structure and individual side chains in the solvation thermodynamics and stability of the native fold, we compared spatially resolved enthalpic and entropic solvation free-energy contributions of the native fold and the molten-globule-like state of the small globular protein Crambin. A systematic analysis of the solvation thermodynamics of all residues revealed that their local enthalpies and entropies are mainly affected by the protein charges and the local curvature of the protein surface. Close to charged or polar residues, a more favorable interaction energy and a less favorable entropy contribution were observed. Interaction energies are here dominated by protein-solvent terms, whereas solvent-solvent interactions are weakened and therefore become less favorable around protein charges.
Single-body entropy terms are strongly reduced close to protein charges, as the water molecules bind to the protein and therefore lose translational and orientational mobility. For solvent multibody correlation contributions, a more complex picture emerged. Whereas correlations with the remaining water molecules are weakened for the bound water molecules, the second hydration shell water molecules show an increased correlation and therefore an unfavorable multibody entropy contribution. The translation-rotation correlation entropy becomes unfavorable close to the protein surface, particularly close to protein charges, indicating that the correlation between translational and rotational degrees of freedom increases close to the protein surface.
The local contributions from enthalpy and entropy largely compensate each other, as expected, so that the local freeenergy values are smaller in magnitude. The spatial distribution of the local free-energy changes is mainly dominated by favorable contributions close to polar or charged residues. This result suggests that, at least for Crambin, the solvation of hydrophilic residues contributes to a larger extent to the stability of the folded protein than burying hydrophobic residues into the protein interior. We furthermore observed that concave protein surfaces seem to cause more favorable hydration enthalpies and less favorable entropies.
To distinguish between a possible curvature effect and the influence of individual amino acids, we compared the local convexity of the protein surface and the hydrophobicity index of each protein residue to the local enthalpies, entropies, and free energies. Indeed, strong correlations between surface convexity and solvation interaction energies and entropies are seen. A more concave shape of the protein surface generally correlates with more favorable interaction energies and less favorable solvation entropy contributions. Because of the opposing effects on the two quantities, the local free energy of solvation remains almost unaffected by the protein curvature. As expected, we found more favorable local solvent free energies close to hydrophilic residues (i.e., those with a small hydrophobicity index) than close to hydrophobic residues. The finding indicates that the physicochemical properties of the respective amino acids have a larger effect on the local free energy than the protein curvature, although curvature strongly affects both entropy and enthalpy.
To see how much the solvation contributions weigh in the total free-energy balance, we decomposed the total folding free-energy difference between the native fold and moltenglobule-like conformation of Crambin into individual enthalpy and entropy contributions from both protein and solvent. Although other computational methods, such as replica exchange MD (62)(63)(64)(65), allow one to calculate DG with better precision and higher accuracy, our decomposition into individual energy and entropy terms provides additional insights into the thermodynamics of protein stability, even on the globular protein energy budget level. Specifically, the protein-solvent interaction energies and the protein entropy turn out to be the only destabilizing contributions to the free energy, whereas the solvent-related entropy contributions favor the folded state of Crambin by almost 500 kJ mol À1 . Furthermore, the solvent entropy difference is not dominated by water mobilities but by induced multibody water correlations.
One may ask whether this result is compatible with the well-known identity by Ben-Naim (52,53) and Yu et al. (54), who have shown that the solvent-solvent enthalpy change DH SS (which here is approximately DU SS , as pDV z 0) upon solvation compensates exactly the ''solvent reorganization'' entropy contribution (usually denoted DS SS ). In fact, our solvent correlation entropy contributions ÀTDS R2 do not at all cancel with DU SS , as one might expect. Note, however, that ÀTDS R2 , as we have defined and calculated it, differs from the definition of DS SS . Similarly, the Ben-Naim DS PS does not correspond to our DS 1 .
As a result, whereas the Ben-Naim ''solvent reorganization'' cannot provide any net thermodynamic driving force, our solvent correlations, as measured by an MI expansion, are generally not compensated by DU SS and therefore provide new insight, to our knowledge, into thermodynamic driving forces.
As the full ensemble of all potential molten-globule-like conformations is not sampled by our MD simulations, all differences of the free-energy contributions were calculated with respect to four ensemble members of the same moltenglobule-like conformation. Particularly, the calculated protein entropy is therefore underestimated. For a rough estimate of the entropy error, we assume an effective number of 100-1000 metastable conformations (66,67) with similar conformational entropies. In this case, a configurational entropy contribution between k B T log 100 ¼ 11 kJ mol À1 and k B T log 1000 ¼ 17 kJ mol À1 would be missed, thereby not changing the budget qualitatively. As the first transition occurred after 1.5 ms (see Fig. S1), 1000 metastable states would correspond to an approximate but reasonable equilibration time of 1.5 ms.
With this caveat in mind, our calculations are in line with the known delicate balance between enthalpy and entropy, resulting a free-energy difference of (53 5 84) kJ mol À1 , which corresponds to the enthalpy change due to just a few hydrogen bonds. The solvent entropy (free-energy contribution of $500 kJ mol À1 ) was found to be one of the most important stabilizing contributions to the free-energy difference. Our study therefore also provides quantitative mechanistic insights into cold denaturation, driven by temperature-induced weakening of hydrophobic interactions (2). Indeed, the obtained mean entropy and enthalpy contributions would yield cold and heat denaturation temperatures of Crambin within the range of À42 to À22 C and 60 to 68 C, respectively, assuming protein-typical heat capacity differences DC P between 5 and 15 kJ mol À1 K À1 (2,68) (see Supporting materials and methods). These temperature ranges agree with typical unfolding temperatures observed for globular proteins (2,6,7). The spatial resolution of the solvent free-energy terms identified favorable free-energy contributions due to solvated hydrophilic residues (charged or polar) as significantly larger than unfavorable contributions of solvated hydrophobic amino acids. The methodology might be relevant for computational in the context of computational protein design.
Because the local free-energy contributions around each residue also depend on the stereochemical environment formed by adjacent residues, the results presented here are specific to Crambin in the assessed native fold and the molten-globule-like configuration. Nevertheless, because Crambin exhibits a number of stereochemical motifs also present in most other globular proteins, it seems quite likely that the observed effects are, at least qualitatively, features of most globular proteins in general.

AUTHOR CONTRIBUTIONS
L.P.H. performed the research, analyzed data, and co-wrote the manuscript. H.G. designed the research and co-wrote the manuscript.