Theoretical and computational validation of the Kuhn barrier friction mechanism in unfolded proteins

A long time ago, Kuhn predicted that long polymers should approach a limit where their global motion is controlled by solvent friction alone, with ruggedness of their energy landscapes having no consequences for their dynamics. In contrast, internal friction effects are important for polymers of modest length. Internal friction in proteins, in particular, affects how fast they fold or find their binding targets and, as such, has attracted much recent attention. Here we explore the molecular origins of internal friction in unfolded proteins using atomistic simulations, coarse-grained models and analytic theory. We show that the characteristic internal friction timescale is directly proportional to the timescale of hindered dihedral rotations within the polypeptide chain, with a proportionality coefficient b that is independent of the chain length. Such chain length independence of b provides experimentally testable evidence that internal friction arises from concerted, crankshaft-like dihedral rearrangements. In accord with phenomenological models of internal friction, we find the global reconfiguration timescale of a polypeptide to be the sum of solvent friction and internal friction timescales. At the same time, the time evolution of inter-monomer distances within polypeptides deviates both from the predictions of those models and from a simple, one-dimensional diffusion model.


Simulations of the coarse grained model Systems and parameterization.
A coarse-grained C a -only homopeptide model similar to those described earlier 1-5 represents each amino acid residue as a single bead and employs a 3-letter alphabet for the amino acid sequence, consisting of hydrophobic, neutral, and polar beads. To describe the unfolded polypeptide, sequences of various lengths (N=10, 20, 30, 40) consisting entirely of neutral beads were used. All calculations were done using an in-house Langevin code reported elsewhere 6,7 .
The following parameters were used to specify the system's energetics and dynamics: The results are given in terms of dimensionless units, with the thermal energy T setting the energy units, the peptide bond length σ defining the units of length, and the parameter t u = σ 2 / mT ( ) 1/2 , m being an effective monomer mass, setting the units of time (t u later in the SI text and figures). Chain connectivity was accounted for by a harmonic interaction potential acting between adjacent monomers, V bond (r) = k bond (r − σ ) 2 / 2 , with k bond = 100T /σ 2 , and a bending potential of the form V bend (θ ) = k θ (θ − θ 0 ) 2 / 2 with k θ = 20T / rad 2 and θ 0 =105 0 enforced proper peptide bond geometry. Interactions between non-bonded monomers were described by a repulsive potential of the form V nonb (r) = 4T (σ / r) −12 . The dependence of the energy on each of the dihedral angles ϕ was described by the potential V dih (ϕ) = ε(1− cos3ϕ) / 2 , with the height of the dihedral barrier, ε , varied to study how the dihedral energy landscape affects the peptide dynamics. The dynamics of the chain was governed by the Langevin equation with a ; the Langevin equation was integrated using the Verlet algorithm with a time step of Δt = 0.02 .
Simulation Protocol. Starting with a linear initial structure of the chain, the system was equilibrated over 100000 simulation steps. Equilibration was followed by a production run including 55000000 steps. The peptide structure was saved every 1000 steps. This protocol was repeated for values of ε ranging from 2T to 8T, with an increment of Δε = 0.3T . The probability distributions of the end-to-end distance R for each chain are shown in Figure S1, illustrating adequate sampling of the structural ensemble. Estimation of relaxation times. Characteristic relaxation times for the dihedral angles, end-to-end distance, and the end-to-end vector were estimated from the respective autocorrelation functions (ACF), as described in the main text. By fitting the decaying autocorrelation functions with an exponential function of the form ACF(t) = a + ce −bt , the corresponding relaxation time was simply given by = 1 . The dihedral relaxation times varied depending on the sequence location within the chain, with the dihedral belonging to chain extremities relaxing faster. This is illustrated in Fig. S2. In reporting the dihedral relaxation times averaged over the entre chain we thus removed such edge effects by excluding the contributions from the first and last 2 dihedrals in the chain.
In order to assess the spectrum of relaxation times within our coarse grained peptides, we compared the end-to-end distance relaxation times, t EE , with the "end-tomiddle" relaxation time t EM estimated from the autocorrelation function of the distance between a chain end and the monomer located in its middle. In Rouse or Zimm theory (zero internal friction), t EM is dominated by shorter-wavelength relaxation modes and, therefore, is shorter than t EE 2 . In contrast, RIF and ZIF predict that all relaxation modes essentially become identical in the high friction limit, in which case t EE should converge with t EM 8 . Consistent with these predictions, t EM is shorter than t EE for low values of the dihedral barrier, but the two times become comparable as ε is increased (Fig. S3).  . Relaxation times of the end-to-end distance (EE), end-to-end vector (V), dihedral angle (D, averaged over dihedrals) and end-to-the-middle distance (EM) as a function of the dihedral barrier. Note that the end-to-end distance (EE) and the end-to-middle distance (EM) relaxation times converge as the dihedral barrier increases.
Validity of the Rouse picture for long chains and/or low dihedral barriers. As our coarse grained simulations did not include hydrodynamic interactions, we compare them with the Rouse model. In this model, the polypeptide is treated as a chain of n k statistically independent Kuhn segments; the length of each is l k . Since the original chain consists of N bonds of the length s, its total length is given by L = Nσ = n k l k . Further assuming Gaussian statistics, we also have R 2 = n k l k 2 , allowing us to estimate the Kuhn length and the number of Kuhn segments as Under the Rouse model assumption, the slowest relaxation time of the chain (i.e. the Rouse time) can be estimated as where ξ is the monomer friction coefficient. The end-to-end vector relaxation time of the Rouse chain is then 8 ! 0.8τ Rouse . In Table S1, the parameters of the equivalent Rouse chain are estimated from the coarse-grained simulations for a low value of the dihedral barrier, ε / T = 3.6 . The relaxation times τ R estimated under the assumption of the Rouse model are further compared with the actual end-to-end distance (EE) and end-to-end vector (vec) relaxation times. Average dihedral relaxation times (dih) are also reported. Importantly, the end-to-end vector relaxation time is close to its Rouse time, while the end-to-to end distance relaxation time is about 3 times shorter than the end-to-end vector relaxation time, consistent with the behavior expected for a Rouse chain 9 .

Table S1
Peptide  Figure S4 for an example). States I, II, and III correspond, respectively, to the dihedral angles in the ranges [0,120), [120,180] (or [-180,-120]), and (-120,0]. This mapping is consistent with the symmetry of the dihedral potential. Consider the lag time between the successive flips of a dihedral angle i (i.e. an event where it jumps between two distinct states as defined above). In the absence of memory effects, we expect this time to be exponentially distributed, P i (t) = λ i e − λ i t , so that the jumps of this dihedral constitute a Poisson process. Indeed, we find this to be the case, Fig. S5. Now consider a pair of dihedrals, i and j. If the flipping of each is an independent Poisson process, then the distribution of the time lag t between the flip of i and the subsequent flip of j is also exponential, P ij (t) = λ i + λ j ( ) e − λ i +λ j ( )t . For sequencedistant dihedrals, we, indeed, find this to be the case (exemplified by the case j=i+5 in Fig. S5). For the dihedrals that are close in sequence, however, P ij (t) deviates from a single exponential (Fig. S5), showing a bunching effect, where the average lag time between the flips of i and j is shorter than P ij (t) = λ i + λ j ( ) −1 . Fig. S4. Mapping of a dihedral trajectory ϕ(t) onto a 3-state process. The data is given for N=30 and ε /T= 8.

Simulations of the rotational isomeric state (RIS) model.
In the (alpha-carbon only, coarse grained) RIS model, the configuration of the polypeptide chain is entirely specified by its dihedral angles, {ϕ 1 ,ϕ 2 ,...,ϕ N −2 } , where N is the number of monomers. Same geometry (i.e. same bending angles) was assumed as in Langevin dynamics of the coarse grained model described above, but the dihedrals are the only degrees of freedom in the RIS model. Each dihedral was assumed to undergo jumps between three equivalent states 1,2,3, with the same value of the jumping rate coefficients, k 12 = k 21 = k 23 = k 32 = k 13 = k 31 = k . The time evolution of each dihedral was computed using the standard kinetic Monte Carlo scheme, and the resulting trajectory of the end-to-end distance R(t) was computed. The relaxation time of the end-to-end distance was computed, for different values of the chain length, as Fig. S6, this time was found to be inversely proportional to the chain length.

Atomistic simulations of CSP fragments
Simulation Protocol. Molecular dynamics simulations were performed using the GROMACS software package, version 4.5.5. 10 Parameters were taken from the Amber03 parameter set 11 . An extended simple point charge (SPC/E) explicit water model was used to represent the solvent. Starting from the NMR structure of the 66-residue Thermotoga maritima CSP (pdb access code 1G6P) 12 , the initial peptide models were built by cutting the protein in six equal 11-residue peptide fragments. We also modeled one 11-residue peptide fragment with Gly-Ser repeat. Each of these peptides was then solvated separately in a cubic box of water molecules. The dimensions of the box were selected such that all the atoms were at all events 10 Å from any cubic wall. Counterions were added using the genion module of GROMACS, which randomly replaces water molecules with counterions in favorable locations determined by computing the electrostatic potential at the insertion site. Each system was energy minimized by, first, the steepest descent algorithm and then, by a conjugate gradient algorithm, to arrive at a conformation with no steric clashes. Each of these minimized conformations was equilibrated in two steps, with position restraints applied to all the heavy atoms throughout. The first phase involved simulating each system for 500 ps under a constant volume (NVT) ensemble. All the atoms were coupled to a bath, with the temperature kept at 300 K using the Berendsen weak coupling method. Following NVT equilibration, 500 ps of constant pressure (NPT) equilibration was accomplished to maintain pressure isotropically at 0.138 atm. All the simulations were performed in a cubic cell employing periodic boundary conditions with the standard minimum image convention in all three directions. Long-range electrostatics was treated with the particle mesh Ewald method. 13 The cutoff used for Lennard-Jones interactions was 9 Å. Particle mesh Ewald method with a real space cut-off at 9 Å was used to account for the electrostatic interactions. All bond lenghts and all angles involving hydrogens were constrained using the LINCS algorithm. 14 An integration time step of 2 fs was used for all the simulations. Production MD runs were performed at T=300 K and P=0.138 atm in the absence of any restraints; these conditions are close to those employed in previous experimental 15 and theoretical 16 studies. The modified Berendsen thermostat 17 was used to maintain temperature, and the Parrinello-Rahman barostat 18 was used to isotropically regulate pressure during the production runs. The total length of a production run for each of the peptides was 2 μs, resulting in a total of 16 μs of simulation time.

Modified Dihedral Potentials.
To explore the connection between internal friction and the dihedral energy landscape, we repeated our simulations for the 11-residue Gly-Ser repeat peptide using a softer dihedral potential, with all dihedral barriers reduced by a factor of 2, i.e., V n → V n /2. The potential energy function describing the backbone dihedral angles has the form: where θ is the dihedral angle (either φ or ψ) and V n is the corresponding force constant. The phase angle γ takes values of either 0° or 180°, and n is an integer that determines the periodicity of the potential. 11 Autocorrelation Functions. We investigated the conformational dynamics of the peptides by probing the relaxation kinetics of different structural properties. Given the time dependence of a dihedral angle, θ(t), the corresponding autocorrelation function (ACF), as a function of the lag time τ, was defined as 19 : For the end-to-end vector R, the ACF was defined as C τ ( ) = R t ( )R t + τ ( ) , and for the end-to-end distance R we have C τ ( ) = R t ( )R t + τ ( ) . The relaxation time associated with each of these parameters was computed from the normalized ACF, as Because of the noise in the raw ACFs, the integral of Eq. 4 was evaluated using analytic fits of C N (t) . Single exponential, bi-exponential, and stretched exponential fits were, respectively, used for the end-to-end vector, dihedral angle, and end-to-end distance ACFs. Those fits along with the raw autocorrelation functions are shown in Figs. S7-S24, S33-S40.