Theoretical analysis on thermodynamic stability of chignolin

Understanding the dominant factor in thermodynamic stability of proteins remains an open challenge. Kauzmann’s hydrophobic interaction hypothesis, which considers hydrophobic interactions between nonpolar groups as the dominant factor, has been widely accepted for about sixty years and attracted many scientists. The hypothesis, however, has not been verified or disproved because it is difficult, both theoretically and experimentally, to quantify the solvent effects on the free energy change in protein folding. Here, we developed a computational method for extracting the dominant factor behind thermodynamic stability of proteins and applied it to a small, designed protein, chignolin. The resulting free energy profile quantitatively agreed with the molecular dynamics simulations. Decomposition of the free energy profile indicated that intramolecular interactions predominantly stabilized collapsed conformations, whereas solvent-induced interactions, including hydrophobic ones, destabilized them. These results obtained for chignolin were consistent with the site-directed mutagenesis and calorimetry experiments for globular proteins with hydrophobic interior cores.

As temperature increases at 1 bar, unfolded conformations become stable over a wide range of the distance and the dihedral angle (Fig. S1b). In contrast, as pressure increases at 298 K, some extended conformations at specific local dihedral angles become stable (Fig. S1c). In addition, the native state becomes unstable with increasing pressure, whereas the misfolded state remains stable even at high pressure. Similar temperature and pressure dependences have been observed in a generalized-ensemble MD simulation 9. This agreement with the previous MD simulation implies that the present method can provide reliable multidimensional free energy profiles.

Figure S2 shows free energy profiles of chignolin on the 2D plane of the distance R and another backbone dihedral angle, that of Gly7. The native state corresponds to R ≈ 0.5 nm and a dihedral angle of about 100° 5, whereas the misfolded state 5-8 cannot be identified in this 2D free energy profile. We again observe that unfolded conformations become stable over a wide range of R and the dihedral angle upon heating (Fig. S2b), whereas some specific extended states at large values of R become stable upon pressurization (Fig. S2c).

Influence of multiple states on the intramolecular energy and excess chemical potential profiles
In order to calculate the excess chemical potential of the GB model, two sets of simulations were performed independently. As shown in Fig. S3a, the two sets of simulations yield almost the same results. We also show the corresponding intramolecular energy profiles in vacuum in Fig. S3b. Again, we confirm no significant difference between these results.

Appendix A. Relationship between the solvation free energy and a partition function of a protein in solvent.
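The partition function of a protein in a solvent, $Z$, is given by 10

$$Z = \int d\mathbf{r}^N \exp\left[-\beta E(\mathbf{r}^N)\right] \Xi\left[U(\mathbf{r}^N)\right].$$

Here, $N$ is the number of atoms in the protein, $\mathbf{r}^N$ is the conformation of the protein, $E(\mathbf{r}^N)$ is the intramolecular energy of the protein, $\beta = 1/k_\mathrm{B}T$, and $\Xi[U(\mathbf{r}^N)]$ is the grand canonical partition function of the solvent under an external field, $U(\mathbf{r}^N)$, which is caused by the protein-solvent interactions. This equation can be recast as

$$Z = \Xi_0 \int d\mathbf{r}^N \exp\left[-\beta E(\mathbf{r}^N)\right] \frac{\Xi\left[U(\mathbf{r}^N)\right]}{\Xi_0},$$

where $\Xi_0$ is the grand canonical partition function of the pure solvent. The solvation free energy of the protein in a given conformation $\mathbf{r}^N$ is given by

$$\Delta\mu(\mathbf{r}^N) = -k_\mathrm{B}T \ln \frac{\Xi\left[U(\mathbf{r}^N)\right]}{\Xi_0}.$$

The partition function $Z$ can be expressed by using $\Delta\mu(\mathbf{r}^N)$ as

$$Z = \Xi_0 \int d\mathbf{r}^N \exp\left[-\beta\left\{E(\mathbf{r}^N) + \Delta\mu(\mathbf{r}^N)\right\}\right] = Z_0 \left\langle \exp\left[-\beta \Delta\mu(\mathbf{r}^N)\right] \right\rangle_{\mathrm{vac}},$$

where

$$\langle A \rangle_{\mathrm{vac}} = \frac{\int d\mathbf{r}^N A(\mathbf{r}^N) \exp\left[-\beta E(\mathbf{r}^N)\right]}{\int d\mathbf{r}^N \exp\left[-\beta E(\mathbf{r}^N)\right]}$$

denotes an ensemble average with respect to the conformation of the protein in vacuum and

$$Z_0 = \Xi_0 \int d\mathbf{r}^N \exp\left[-\beta E(\mathbf{r}^N)\right]$$

is the partition function for the pure solvent plus the protein in vacuum. The excess chemical potential of the protein is defined by the free energy difference between the pure solvent plus the protein in vacuum and the protein immersed in the solvent,

$$\mu^{\mathrm{ex}} = -k_\mathrm{B}T \ln \frac{Z}{Z_0} = -k_\mathrm{B}T \ln \left\langle \exp\left[-\beta \Delta\mu(\mathbf{r}^N)\right] \right\rangle_{\mathrm{vac}}. \qquad \mathrm{(A7)}$$

This equation can also be regarded as a one-step free energy perturbation method for determining the free energy difference, where the vacuum phase is the initial state and the solution phase is the final state. In general, the distribution of protein conformations with low intramolecular energy in vacuum does not overlap sufficiently with the distribution of conformations with low solvation free energy in the solvent. We therefore apply the free energy perturbation method given by Eqs. 4 and 5 to the calculation of the ensemble average in Eq. A7.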
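As a rough numerical illustration of the one-step free energy perturbation average described in this appendix, the following Python sketch evaluates $-k_\mathrm{B}T \ln \langle \exp(-\beta \Delta\mu) \rangle$ from a set of solvation free energies sampled over vacuum conformations. The function name, the kJ/mol unit convention, and the synthetic Gaussian samples are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K); unit choice is an assumption


def excess_chemical_potential(delta_mu, temperature=298.0):
    """One-step FEP estimate: -kT * ln < exp(-beta * delta_mu) >.

    delta_mu: solvation free energies (kJ/mol) of conformations that are
    assumed to have been sampled from the vacuum ensemble.
    """
    delta_mu = np.asarray(delta_mu, dtype=float)
    beta = 1.0 / (K_B * temperature)
    x = -beta * delta_mu
    x_max = x.max()  # shift by the maximum to avoid overflow in exp
    log_mean = x_max + np.log(np.mean(np.exp(x - x_max)))
    return -log_mean / beta


# Synthetic, made-up samples: Gaussian solvation free energies.
rng = np.random.default_rng(0)
samples = rng.normal(loc=-100.0, scale=1.0, size=200_000)
mu_ex = excess_chemical_potential(samples)
print(f"mean delta_mu = {samples.mean():.3f} kJ/mol, FEP estimate = {mu_ex:.3f} kJ/mol")
```

The exponential average is dominated by the conformations with the lowest solvation free energy, so the estimate lies below the plain mean. When such conformations are rarely visited in vacuum, the one-step estimate converges poorly, which is the overlap problem noted in the text.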

Appendix B. Free energy profile in a two-dimensional (2D) plane
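We choose the distance between the alpha carbon atoms at the C-terminus side and at the N-terminus side of chignolin as the primary coordinate, $R$. The one-dimensional probability distribution $P(R)$ is defined as

$$P(R) = \left\langle \delta\left(R - R(\mathbf{r}^N)\right) \right\rangle,$$

where the angular brackets denote an ensemble average over the conformations of the protein in the solvent. We choose the backbone dihedral angle of Gly7 as the secondary coordinate, denoted here by $\theta$. The relation between the 1D and 2D probability distributions is given by

$$P(R) = \int d\theta \, P(R, \theta).$$

The RMDFT hydration free energy is given by Eq. (C1), in which $f'_{\mathrm{HS}}(\rho)$ is the first derivative of $f_{\mathrm{HS}}(\rho)$, the excess free energy of the HS system per particle.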
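The relation between the 1D and 2D distributions in Appendix B can be sketched numerically. The Python example below builds a 2D histogram over the distance and a dihedral angle, integrates out the angle to recover the 1D distribution, and converts it to a free energy profile with the standard relation $F(R) = -k_\mathrm{B}T \ln P(R)$. The function names, bin settings, and uniformly distributed synthetic samples are assumptions for illustration and do not reproduce the data of the paper.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K); unit choice is an assumption


def marginal_over_angle(p2d, angle_edges):
    """Integrate a 2D density P(R, angle) over the angle axis (axis 1)."""
    widths = np.diff(angle_edges)
    return p2d @ widths


def free_energy_profile(prob, temperature=298.0):
    """Return -kT ln P, shifted so that the lowest finite value is zero."""
    kt = K_B * temperature
    with np.errstate(divide="ignore"):
        f = -kt * np.log(prob)  # empty bins give +inf
    return f - f[np.isfinite(f)].min()


# Synthetic, made-up samples of the two coordinates.
rng = np.random.default_rng(1)
distance = rng.uniform(0.3, 1.5, size=5000)      # nm
angle = rng.uniform(-180.0, 180.0, size=5000)    # degrees

p2d, r_edges, a_edges = np.histogram2d(
    distance, angle, bins=[12, 18],
    range=[[0.3, 1.5], [-180.0, 180.0]], density=True,
)
p1d = marginal_over_angle(p2d, a_edges)
f1d = free_energy_profile(p1d)
print(np.round(f1d, 3))
```

Integrating the normalized 2D histogram over the angle bins gives the same values as a normalized 1D histogram of the distance on the same bins, which is a quick consistency check when 1D and 2D profiles are computed separately.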
A highly accurate expression for $f_{\mathrm{HS}}(\rho)$, obtained from the Carnahan-Starling (CS) equation of state, 16,17 is available:

$$\beta f_{\mathrm{HS}}(\rho) = \frac{\eta(4 - 3\eta)}{(1 - \eta)^2}, \qquad \eta = \frac{\pi}{6} \rho \, d_{\mathrm{HS}}^3,$$

where $d_{\mathrm{HS}}$ is the diameter of the reference HS fluid and $\beta = 1/k_\mathrm{B}T$. The effective density in Eq. (C1) is defined by the EDA excess intrinsic free energy functional for the reference HS system and is assumed to be a functional of the density distribution of the solvent around the solute. The effective density is approximated by a first-order density-functional Taylor series expansion. The expansion coefficient also appears in Eq. (C1) and is related to the second-order direct correlation function of the reference HS fluid via a relation between their Fourier transforms, which is evaluated using Eq. (C4).

In this study, we calculated the site-site direct correlation functions for bulk water and the site-density distribution functions of water around a solute molecule using the 1D-RISM-KH and 3D-RISM-KH integral equations 18,19, respectively. Before we calculate the RMDFT hydration free energy from Eq. (C1) using these sets of functions, it is necessary to determine the expansion coefficient or the second-order direct correlation function of the reference HS fluid by solving the Ornstein-Zernike (OZ) integral equation, 16 in which the following EDA equation combined with Percus' relation 16,20 is used as the closure: Here, Ta