Molecular dynamics ensemble refinement of the heterogeneous native state of NCBD using chemical shifts and NOEs

Many proteins display complex dynamical properties that are often intimately linked to their biological functions. As the native state of a protein is best described as an ensemble of conformations, it is important to be able to generate models of native state ensembles with high accuracy. Due to limitations in sampling efficiency and force field accuracy it is, however, challenging to obtain accurate ensembles of protein conformations by the use of molecular simulations alone. Here we show that dynamic ensemble refinement, which combines an accurate atomistic force field with commonly available nuclear magnetic resonance (NMR) chemical shifts and NOEs, can provide a detailed and accurate description of the conformational ensemble of the native state of a highly dynamic protein. As both NOEs and chemical shifts are averaged on timescales up to milliseconds, the resulting ensembles reflect the structural heterogeneity that goes beyond that probed, e.g., by NMR relaxation order parameters. We selected the small protein domain NCBD as object of our study since this protein, which has been characterized experimentally in substantial detail, displays a rich and complex dynamical behaviour. In particular, the protein has been described as having a molten-globule like structure, but with a relatively rigid core. Our approach allowed us to describe the conformational dynamics of NCBD in solution, and to probe the structural heterogeneity resulting from both short- and long-timescale dynamics by the calculation of order parameters on different time scales. These results illustrate the usefulness of our approach since they show that NCBD is rather rigid on the nanosecond timescale, but interconverts within a broader ensemble on longer timescales, thus enabling the derivation of a coherent set of conclusions from various NMR experiments on this protein, which could otherwise appear in contradiction with each other.


INTRODUCTION
Molecular dynamics (MD) simulations have the potential ability to provide an accurate, atomic-level description of the conformational ensembles of proteins and their macromolecular complexes (Lindorff-Larsen et al., 2005;Dror et al., 2012;Perilla et al., 2015). Nevertheless, simulations are limited by both the accuracy of the physical models (force fields) and the precision due to conformational sampling (Mobley, 2012;Esteban-Martín, Fenwick & Salvatella, 2012). To overcome these problems, it is possible to bias the simulations using experimental data as structural restraints taking into account the inherent averaging in the experiments (Lindorff-Larsen et al., 2005;Camilloni et al., 2012;Lehtivarjo et al., 2012;Pitera & Chodera, 2012;Ravera et al., 2016). In this way, the experimental data We therefore sought to extend this approach to study conformational variability using more commonly available data, thus making the DER method more generally applicable. We thus focus on using NMR chemical shifts and NOEs as these are both commonly available and are averaged over long, millisecond timescales. We demonstrate the potential by describing the structural heterogeneity of a highly dynamic protein. Our method relies on supplementing the sparse experimental data with the experimentally-validated CHARMM22* force field (Piana, Lindorff-Larsen & Shaw, 2011), which provides a relatively accurate description of the subtle balance among the stability of the different secondary structure classes, and which has been shown to provide a good description of many structural and dynamical aspects related to protein structure (Shaw et al., 2010;Lindorff-Larsen et al., 2012a,b;Piana, Lindorff-Larsen & Shaw, 2012;Papaleo et al., 2014;Rauscher et al., 2015). Our hypothesis was that using a more accurate force field would make it possible to determine an accurate ensemble from less information-rich experimental data. In particular, though chemical shifts in principle contain very detailed information, this information is difficult to extract using current methods.
As object of our study we selected NCBD (the Nuclear Coactivator Binding Domain) of CBP (CREB Binding Protein), a 59-residue protein domain that has been experimentally characterized in substantial detail. Experiments on NCBD have revealed a rich and complex dynamical behaviour of the protein in solution (Demarest et al., 2004;Ebert et al., 2008;Kjaergaard, Teilum & Poulsen, 2010;Kjaergaard, Poulsen & Teilum, 2012;Kjaergaard et al., 2013). For a protein of its size, NCBD displays surprisingly broad NMR peaks, suggestive of conformational heterogeneity with relatively slow interconversion between different states. Nevertheless, it was possible to assign both backbone and side chain chemical shifts and determine a number of conformationally-averaged inter-nuclear distances, including a few long-range contacts, via NOE experiments (Ebert et al., 2008;Kjaergaard, Teilum & Poulsen, 2010;Kjaergaard, Poulsen & Teilum, 2012). NMR relaxation experiments suggest that the protein, at least on the nanosecond timescale, is relatively rigid (Kjaergaard, Poulsen & Teilum, 2012). NCBD forms complexes with several other proteins, where it intriguingly folds into remarkably different tertiary structures (Demarest et al., 2002;Qin et al., 2005). For example, the structure of NCBD in complex with ACTR (Demarest et al., 2002) and certain other partners (Waters et al., 2006;Lee et al., 2010) resembles the average structure populated by NCBD in the absence of binding partners (Figure 1), whereas the structure of NCBD is markedly different when bound to the protein IRF-3 (Qin et al., 2005). Thus, the dynamical properties of NCBD, and its ability to adopt different conformations, appear crucial for its diverse biological functions.
Our results show that a dynamic ensemble refinement that combines NOEs, chemical shifts and the CHARMM22* force field provides a rather accurate description of the structural dynamics of the ground state structure of NCBD. We show via cross-validation with independent NMR data that all three components (the two sources of experimental information and the force field) contribute to the overall accuracy. The ensemble that we obtained reveals a relatively broad distribution of conformations, reflecting the conformational heterogeneity of NCBD on the millisecond timescale. Further, we quantified the level of structural fluctuations that would be measured by relaxation experiments and demonstrate that, on the nanosecond timescale, NCBD is more rigid, thus helping to reconcile earlier conflicting views of this protein.

MATERIALS AND METHODS
Ensemble generation. MD simulations were performed using Gromacs 4.5, (Pronk et al., 2013) coupled to a modified version of Plumed 1.3, (Bonomi et al., 2009) and using either the CHARMM22* (Piana, Lindorff-Larsen & Shaw, 2011) or CHARMM22 (MacKerell, et al., 1998 force fields. As starting structure for most simulations we used the first conformer from a previously determined NMR structure of free NCBD as deposited in the PDB entry 2KKJ (Kjaergaard, Teilum & Poulsen, 2010). To evaluate the effect of our choice of the initial structure, we also performed one simulation starting from an alternative NCBD conformation (PDB entry: 1ZOQ, chain C) (Qin et al., 2005). Missing residues in 1ZOQ (compared to 2KKJ) were rebuilt by Modeller 9.11 (Fiser & Šali, 2003).
The protein was embedded in a dodecahedral box containing 8372 TIP3P water molecules (Jorgensen et al., 1983) and simulated using periodic boundary conditions with a 2 fs timestep and LINCS constraints (Hess et al., 1993). Production simulations were performed in the NVT ensemble with the Bussi thermostat (Bussi, Donadio & Parrinello, 2007) using a pre-equilibrated starting structure for which the volume was selected based on a short NPT simulation.
NaCl was added to a concentration of ~20 mM to reproduce the experimental conditions at which chemical shifts and NOEs were determined (Kjaergaard, Teilum & Poulsen, 2010). The van der Waals and short-range electrostatic interactions were truncated at 9 Å, whereas longrange electrostatic effects were treated with the particle mesh Ewald method (Essmann et al., 1995).
We carried out MD simulations with replica-averaged experimental restraints using 1, 2, 4 or 8 replicas (Table S1 gives an overview of the simulations that were performed). The use of replica-averaged restrained simulations enables us to use different equilibrium experimental observable as a restraint in MD simulation in a way that minimises the risk of over restraining because replica-averaging is a practical implementation of the maximum entropy principle. As a control we also performed a simulation that was not biased by any experimental restraints (i.e. an unbiased simulation). To examine the role played by each of the different types of experimental data, we also performed simulations in which we included different combinations of the experimental restraints: chemical shifts only (CS), NOEs only (NOE), and both chemical shifts and NOEs (CS-NOE). In the simulations, each replica was evolved through a series of simulated annealing (SA) cycles between 304 and 454K for a total duration of 0.6 ns per cycle.
We only used structures from the 304K portions of the simulations for our analyses.
Chemical shifts for the backbone atoms (Ca, C', Ha, H and N) and Cb CS (deposited in BMRB entry 16363) were used as restraints (with the exception of the Cb of glutamines). The resulting dataset includes 54 Ca, 37 Cb, 52 Ha and 48 C', H and N chemical shifts, respectively. The backbone chemical shifts cover most of the NCBD sequence with the exception of the first four to six N-terminal residues, depending on type of chemical shifts. The Cb chemical shifts for the first seven N-terminal and last five C-terminal residues, as well as for some residues of the loops connecting the a-helices, are missing with few exceptions.
During the structure determination protocol, chemical shifts were calculated by CamShift (Kohlhoff et al., 2009) for all the nuclei for which an experimental value is available and then averaged over the replicas of the replicas. The resulting average over the replicas was compared with the experimental value, and the ensemble as a whole restrained using a harmonic function with a force constant of 5.2 kJ mol -1 ppm -2 (Camilloni et al., 2012;Camilloni, Cavalli & Vendruscolo, 2013a). At the higher temperatures, T, explored during the simulated annealing, the force constant was scaled by a factor of (304 K/T). The value of the force constant was chosen roughly to match the calculated chemical shifts to experiments within the uncertainty of the CamShift predictor; the experimental uncertainty of the chemical shifts is negligible in comparison. NOE restraints were obtained by 455 NOE-derived distance intervals (Kjaergaard, Teilum & Poulsen, 2010) (BMRB entry 16363) of which 46 were long-range (i.e. separated by more than 4 residues). The proton-proton distances, r, were calculated and averaged as r -6 over the replicas (Tropp, 1980;Lindorff-Larsen et al., 2005). We used a flat-bottomed harmonic function implemented in Gromacs to restrain the calculated averaged distances within the experimentallyderived intervals. We used a variable force constant for the NOE-restraints during the SA cycles, allowing the protein to sample more diverse structures in the high-temperature regime and thus to decrease the risk of getting trapped in local minima. Force constants of 1000, 20 and 125 kJ mol -1 nm -2 were used for the 304K phase, a heating phase (from 304K to 454K) and cooling phase (from 454K to 304K), respectively.
In short, in the replica-averaged simulations we calculated at each step and for each replicaconformation the atomic distances that were measured by the NOE experiments and the backbone chemical shifts. These calculated single-conformer values were then averaged (linearly for the shifts and using r -6 averaging for the distances) to determine the replica-averaged values, which were then compared to the experimentally determined values. Thus, the simulations penalize deviations between the calculated ensemble averages and experimental values but allow fluctuations of individual structures. In this way, the simulations are biased so as to agree with the experimental data as a whole, while allowing individual conformations to take on conformations whose NMR parameters differ from the experimentally derived averages.
To examine the role of the force field used in our approach, we compared the results from two different force fields belonging to the same family (CHARMM). These force fields mostly differ for the main-chain dihedral angle potential, as well a few parameters for certain side chains. Further, in a previous comprehensive evaluation of protein force fields it, was demonstrated that these two force fields resulted in very different levels of agreement between simulations and experiments (Lindorff-Larsen et al., 2012a), making it possible for us to evaluate the importance of force field accuracy in restrained simulations.
Unbiased simulations for the calculation of fast-timescale order parameters. We also performed 28 independent unbiased MD simulations, each 50 ns long, at 304K and with the same computational setup as the restrained simulations, but without any restraints. As starting points, we selected seven different structures from each of the four replicas obtained in the CS-NOE-4 ensemble (Table S1). In particular, the seven structures were selected from the SA cycles after convergence (i.e. at SA cycles 65,75,85,95,100,110,125). We calculated fast timescale order parameters, which correspond to those measured by NMR relaxation measurements, from these 28 unbiased simulations using a previously described approach (Maragakis et al., 2008). In particular, we calculated bond-vector autocorrelation functions (independently from each simulation) including both internal motions and overall tumbling of NCBD. The resulting correlation functions were then averaged over the 28 simulations and subsequently fitted globally to a Lipari-Szabo model (Lipari & Szabo, 1982) to yield relaxation order parameters. To calculate order parameters that report on the long-timescale motions we first aligned the full ensemble and then calculated order parameters as ensemble averages (Maragakis et al., 2008).

Analyses of convergence and cross validation.
We used two different methods to examine the convergence of our simulations. First, we used the ENCORE ensemble comparison method (Lindorff-Larsen & Ferkinghoff-Borg, 2009;Tiberti et al., 2015) to quantify the overlap between the structural ensembles. The latter is based on clustering the structures using affinity propagation (setting the "preference value" in the clustering to 12) and subsequent comparison of the ensembles by calculating the Jensen-Shannon (JS) divergence between pairs of ensembles by comparing how they populate the different clusters. For additional details, please confer to original descriptions of the method (Lindorff-Larsen & Ferkinghoff-Borg, 2009;Tiberti et al., 2015). As an alternative method, we calculated the Root Mean Square Inner Product (RMSIP) over the first 10 eigenvectors obtained from a principal component analysis of the covariance matrix of atomic (Ca-atoms) fluctuations (Amadei, Linssen & Berendsen, 1993).
To cross-validate our ensembles we calculated the chemical shifts of side chain methyl hydrogen and carbon atoms using CH3Shift (Sahakyan et al., 2011) (both 1 H and 13 C shifts) and PPM (Li & Brüschweiler, 2012) (only 1 H shifts) and compared to the previously determined experimental side chain chemical shifts. In particular, we compared the calculated side chain chemical shifts with the experimental values (deposited in BMRB entry 16363) using a reduced χ 2 metric. In this metric, the square deviation between the calculated and experimental values were normalized by the variance of the chemical shift predictor (for each type of chemical shift) and the total number of chemical shifts, so that low numbers indicate good agreement between experimental and calculated chemical shifts.

RESULTS AND DISCUSSION
Convergence of the simulations. Before assessing the accuracy of the different structural ensembles that we generated, we first ensured that the simulated annealing protocol allowed us to obtain converged ensembles that represent the dynamical properties encoded in the experimental restraints and the molecular force field. To quantify convergence of the ensembles, we calculated two different measures of the overlap between the subspaces sampled by different simulations.
First, we used a previously described approach (Lindorff-Larsen & Ferkinghoff-Borg, 2009;Tiberti et al., 2015), which is based on a quantification of the extent to which the different ensembles mix during conformational clustering, to calculate the Jensen-Shannon (JS) divergence between the ensembles (Figure 2). A JS divergence of zero is evidence of identical ensembles, and it has previously been observed that a JS divergence in the range of 0.1-0.3 represents similar ensembles (Lindorff-Larsen & Ferkinghoff-Borg, 2009;Tiberti et al., 2015).
We expect that in a converged replica-averaged simulation that the different replicas should populate equally the different structural basins. With this in mind, we calculated the JS divergence between two replicas in a simulation restrained by NOEs and chemical shifts (Figure 2, black line). We find that after approximately ~30 cycles of simulated annealing the two replicas have covered approximately the same conformational space with the JS divergence stabilizing around 0.2-0.3 with the fluctuations in the JS-divergence representing the stochastic nature of the simulations. Thus, we decided to discard the first 45 simulated annealing cycles from all the simulations. As an alternative measure of ensemble similarity we also calculated the Root Mean Square Inner Product (Hess, 2002) (RMSIP) with very similar results. In particular, the similarity of the two replicas converge to an RMSIP value greater than 0.83 (here RMSIP=1 is expected for fully overlapping ensembles).
As a second, perhaps even more stringent, test of convergence we also examined whether two simulations with the same number of replicas and experimental restraints, but initiated from substantially different starting structures, converge to similar ensembles. Indeed, we find that simulations initiated from two distinct structures of NCBD (Table S1) converge to similar ensembles when the first 45 cycles are discarded as initial equilibration (Figure 2, grey line). Thus, based on these two tests we concluded that our sampling protocol allows us to obtain structural ensembles that represent the force field and restraints employed.
Assessment of the accuracy of the NCBD ensembles. Once we had assessed the convergence of the simulations, we analysed the different ensembles to evaluate their accuracy. To do so, we back-calculated experimental parameters that were not used as restraints and compared them with the experimental values. As our different simulations employed different sets of experimental restraints, not all experimental data can be employed for validation purposes. For example, while the NOEs can be used to evaluate the quality of an ensemble obtained using CS-restraints, they can obviously not be used to validate an ensemble that was generated using those NOEs as restraints.
We first examined whether the CS or NOE restraints alone are sufficient to increase the accuracy in the description of the conformational ensemble of NCBD. We thus compared unbiased simulations with simulations biased by either CS or NOEs by cross-validation with the measured NOEs and CS, respectively.
We back-calculated NOEs from the inter-proton distances and observed substantial violations (some greater than 2 Å) in both unbiased and CS ensembles ( Figure S1) independently of the number of replicas used for the averaging. To determine the origin of these discrepancies we calculated intramolecular contacts between side chains, and observed an overall decrease in these (from 27 in the previously-determined NMR ensemble, to 14 and 17 in unbiased and CSrestrained, respectively). More specifically we found a loss of inter-helical contacts between helices a1 and a2 in the simulations, in agreement with our finding of several long-range NOEs that are violated in these ensembles.
These results demonstrate that the CS-restraints and MD force field, as implemented here, are not sufficient to provide a fully accurate description of the conformational ensemble of NCBD.
Similarly, we found that back-calculation of backbone chemical shifts from the unbiased simulation and, to a lesser extent a NOE-restrained ensemble, resulted in deviations from experiments. We therefore decided to determine conformational ensembles that combine the information of the NOEs, chemical shifts and force field in replica-averaged simulations (CS-NOE) aiming to provide a more accurate structural ensemble of NCBD than possible via the application of just one of the two classes of restraints. We also assessed the influence of the choice of force field since we expected that a more accurate ensemble could be obtained with the relatively limited amounts of experimental data when using a more accurate force field. Thus, we compared simulations using either the CHARMM22 force field (CS-NOE-4-C22 simulation), or a more recent and accurate force field variant, CHARMM22* (CS-NOE simulations).
As both the NOEs and backbone chemical shifts were used as restraints they cannot be used for validation of these ensembles. Instead, we turned to side-chain methyl chemical shifts for a comparison and validation of the different ensembles. Methyl-containing residues, for which the chemical shifts are available, cover the entire protein structure and are thus excellent probes of both local structure ( 13 C methyl chemical shifts, which are mostly dependent on the rotameric state) and long-range contacts ( 1 H methyl chemical shifts). The methyl chemical shifts were predicted by CH3Shift (Sahakyan et al., 2011) and the resulting values compared to experiments, separating the contributions from 13 C and 1 H. We then calculated c 2 red thus taking into account the inherent uncertainty of the chemical shift predictions (Sahakyan et al., 2011).
As also indicated by the calculation of NOEs and backbone chemical shifts, we find that the side chain chemical shifts predicted from the unbiased simulation (green line in Figure 3) deviates substantially from experiments. The introduction of backbone chemical shift restraints (CS ensembles, orange line in Figure 3) provides a better structural ensemble than the force field alone, especially for 13 C methyl chemical shifts and when averaged over 2 or 4 replicas.
We also calculated the chemical shifts from NOE-derived ensembles, obtained with or without replica-averaging. Surprisingly, we find that the ensembles obtained using NOEs as replicaaveraged restraints (NOE, magenta line in Figure 3) perform slightly worse than the CS ensemble. Thus, when evaluated in this way, ensembles derived by MD refinement using either backbone chemical shifts or NOEs do not increase accuracy compared to the ensemble deposited in the PDB.
By combining the NOEs, chemical shifts and the CHARMM22* force field we were, however, able to obtain even more accurate ensembles, in particular when averaging over four replicas, as assessed by the ability to predict side chain 13 C and 1 H methyl chemical shifts ( Figure   3). Interestingly we find that not only the experimental data but also the CHARMM22* force field contributes to the improved agreement with the experimental data. Indeed, when we employ both chemical shift and NOE-based restraints in simulations averaged over 4 replicas, but replacing the CHARMM22* force field by an earlier, less accurate variant of the same force field (CHARMM22; CS-NOE-4-C22) (Lindorff-Larsen et al., 2012a) we find that the accuracy decreases dramatically. Calculations of 1 H methyl chemical shifts using PPM (Li & Brüschweiler, 2012) instead of CH3Shift demonstrate that the conclusions are robust to the method for calculating the chemical shifts ( Figure S2). Similarly, calculations of the chemical shifts using the ensemble generated from the alternative starting structure (CS-NOE-2-1ZOQ) resulted in essentially the same agreement with the experimental data as when simulations were initiated from the 2KKJ structure (Figure 3), confirming the conclusions from the convergence analysis described above (Figure 2). The CS-NOE-4 ensemble, which we found to provide the most accurate representation of the free state of NCBD in solution, is shown in Figure 4. It is a relatively broad ensemble of conformations, where the three helical regions are maintained overall, but differ in the lengths and relative positions of the three a-helices.
Small Angle X-ray scattering (SAXS) measurements have been carried out for NCBD in solution (Kjaergaard, Teilum & Poulsen, 2010) and previously been compared to simulation-derived ensembles of NCBD (Knott & Best, 2012;Naganathan & Orozco, 2013). We thus calculated the radius of gyration (Rg) using CRYSOL (Svergun, Barberato & Koch, 1995) for the various ensembles. In all cases we find that the average Rg values are in the range of 13.7Å -14.9Å. These values are comparable to that obtained previously from simulations (13.7 Å) (Knott & Best, 2012) but lower than the values estimated from a Guinier analysis of the experimental data (~16.5 Å) or an ensemble-optimization method (18.8 Å) (Kjaergaard, Teilum & Poulsen, 2010). We note, however, that the experimental values also include contributions from a ~8% population of unfolded protein that is not captured by our simulations. Although a detailed understanding is lacking for the role of solvation on the SAXS properties of partially disordered proteins we, however, expect that the discrepancy between experiment and simulation should be ascribed to remaining force field deficiencies. Indeed, overly large compaction of proteins is a common problem of most atomistic force fields (Piana, Klepeis & Shaw, 2014) though recent work suggests that, at least for fully disordered proteins, that modified proteinwater interactions can improve accuracy (Nerenberg et al., 2012;Best, Zheng & Mittal, 2014;Henriques, Cragnell & Skepö, 2015;Mercadante et al., 2015;Piana et al., 2015). We also note that while the force field used here (CHARMM22*) in certain cases has been shown to produce too compact structures, (Piana et al., 2015) in other cases it appears to perform quite well (Rauscher et al., 2015). We expect that resolving these issues will require both further force field developments (Best, 2017) as well as improved methods for comparing experiments and SAXS experiments (Hub, 2018).
A unified view of NCBD dynamics. While the broad peaks and sparse NOEs are suggestive of a rather dynamic protein, previous NMR relaxation measurements of side chain dynamics found relatively high order parameters (S 2 relaxation) comparable to values found in well-ordered proteins (Kjaergaard, Poulsen & Teilum, 2012). To shed light on this apparent discrepancy and to assess whether our relatively broad structural ensemble is compatible with mobility on different timescales, we calculated S 2 values representing different timescales.
To mimic the dynamics probed in relaxation experiments we selected 28 structures from each of the 4 replicas of the CS-NOE-4 ensemble sampled at seven different SA steps. Starting from each of these conformations we performed 50 ns of unbiased MD simulation (in total 1.4 µs, Figure S3), and from each simulation we calculated the autocorrelation functions of the N-H bond vectors (without removing the overall rotational motion of the protein). These correlation functions were subsequently averaged and fitted to the Lipari-Szabo model to estimate the S 2 relaxation values, which report on the nanosecond dynamics of the protein ( Figure 5, black line). The results show a relatively rigid ensemble on the ns timescale attested by high order parameters throughout most of the polypeptide backbone.
To quantify the backbone dynamics on the longer timescales that may influence both the NOE and chemical shifts (but which the relaxation measurements would not be sensitive to) we defined and calculated "S 2 chemical shift"-values from the structural variability in the ensemble after aligning the structures. These S 2 values include contributions also from any millisecond-timescale motions that might be present in the ground state of NCBD. As internal and overall motions cannot be decoupled, the results of such calculations will depend on how the ensemble is aligned. In our calculations we chose THESEUS (Theobald & Steindel, 2012) as the least biased method to align the structures (Figure 4). These order parameter calculations reveal a broader distribution of conformations with additional, longer-timescale dynamics evident both in loop regions and the C-terminal region, even though relatively high S 2 values are found in the regions of secondary structures ( Figure 5, grey line).
A similar analysis of side chain motions suggests even greater differences in motions present on relaxation and chemical shift timescales. In particular, we find that, for methyl-bearing side chains, S 2 chemical shift-values are on average lower than S 2 relaxation-values by 0.4 compared to an average difference of 0.2 for the backbone amides. Finally, we note that although both calculated S 2 chemical shift-values and S 2 relaxation-values correlate strongly with the experimentally determined side chain S 2 relaxation-values (Spearman correlation coefficient of 0.9 and 0.8, respectively), a more quantitative analysis is hampered by several issues including (i) the presence of a small population of unfolded protein in the experiments, (ii) the difficulty in appropriate model selection of the calculated correlation functions, (iii) the well-known observation of toofast rotational motions of proteins in the TIP3P model that we used and (iv) uncertainties in the parameterization of the rotational motions in the experimental analyses. We note, however, the potential complications that arise from the fact that the S 2 chemical shift-values were obtained from simulations with an experimental bias, whereas the S 2 relaxation-values were obtained from simulations starting from such a biased ensemble, but performed with the standard CHARMM22* force field.
Taken together, however, our calculations of order parameters demonstrate that NCBD may be described as a semi-rigid protein on fast-timescales, but with additional dynamics in the backbone and-in particular-side chains on timescales longer than the rotational correlation time of the protein, as also previously suggested (Kjaergaard, Poulsen & Teilum, 2012).

CONCLUSIONS
We have presented an application of the dynamic-ensemble refinement method to study the native state dynamics of NCBD. In the original implementation of DER we combined NMR relaxation order parameters with NOEs in MD simulations (Lindorff-Larsen et al., 2005). This approach was here extended to the combination of chemical shifts and NOEs to make it more generally applicable. In particular, our results show that it is possible to combine NOEs, backbone chemical shifts and an accurate MD force field into replica-averaged restrained simulations, and that all three components add substantially to the accuracy of the resulting NCBD ensemble.
NMR structures are typically obtained by combining distance information from NOE measurements with in vacuo simulations, in certain cases with subsequent refinement by short, MD simulations in explicit solvent. Further, the inherent ensemble averaging of the experimental data is typically not exploited explicitly. In this way, standard NMR structures can provide highly accurate models of the "average structure" of a protein, but only little information about the conformational heterogeneity around this average.
Replica-averaged MD simulations make it possible to obtain structural ensembles that match the experimental data according to the principle of maximum entropy (Pitera & Chodera, 2012;Roux & Weare, 2013;Cavalli, Camilloni & Vendruscolo, 2013;White & Voth, 2014;Olsson et al., 2014). In such calculations prior information, here in the form of a molecular mechanics force field, is biased in a minimal fashion to agree with the experimental data. Thus, to obtain an accurate ensemble, such simulations require an accurate force field, an efficient sampling approach as well as sufficient experimental information. Our results show that, at least in the case of the small, but relatively mobile protein NCBD, it is possible to perform such simulations when NOEs are supplemented by the information available in the backbone chemical shifts and a well-parameterized molecular force field. The application of the experimentally-derived structural restraints helps overcome at least some of the deficiencies in force field accuracy and also improves sampling of the relevant regions of conformational space.
Our approach also allowed us to probe the structural heterogeneity arising from both shortand long-timescale dynamics by the calculation of order parameters. In the case of NCBD we found that this protein can be described as a relatively rigid protein domain on a fast timescale, as attested by the high relaxation order parameters that, nevertheless, displays additional motions in both the backbone and side chains on longer timescales. This situation is reminiscent of the molten globule state of apomyoglobin, that also displays restricted motions on the nanosecond timescale but with greater motions on a slower timescale (Eliezer et al., 2000;Meinhold & Wright, 2011). The current study also provides the groundwork for further studies on NCBDs intricate conformational dynamics, and the relationship to ligand binding (Dogan et al., 2012;Zijlstra et al., 2017). Given the importance of understanding and quantifying protein dynamics, in particular on long timescales, we expect that our approach, which uses only commonly available data, and possible combined with novel algorithms for enhancing sampling , will have a wide range of applications.

Supporting Information
Supporting figures with the data on back-calculation of NOEs in MD-unbiased and MD-CS ensembles are reported in Figure S1. The deviation between experimental and calculated sidechain 1 H chemical shifts calculated with PPM software are shown in Figure S2. The main-chain RMSD profiles of the 28 unbiased constant temperature MD simulations of NCBD are reported in Figure S3. The structure is shown as a cartoon (PDB entry: 2KKJ) with the protein coloured from the N-to the C-terminal (blue to red). The three a-helices are labelled. The goal of this work is to provide an ensemble of structures that represent the conformational fluctuations associated with this average conformation. Figure 2. Assessment of the convergence of the simulations. The similarity between structural ensembles was quantified using structural clustering with Affinity Propagation and subsequent comparison of the ensembles by Jensen-Shannon (JS) divergence. The JS divergence between two identical ensembles is zero, and it has previously been found that values less than ~0.3 represent similar ensembles. We monitored the evolution of the JS-divergence in two different tests, either by comparing two replicas from the same simulation (i.e. CS-NOE-2, black) or two simulations with the same force field and restraints but different starting structures (i.e. CS-NOE-2 starting from 2KKJ and 1ZOQ structures, respectively, grey). As described in the text we discarded the first 45 SA cycles before calculating the ensemble similarity for the test with different starting structures.  Figure 5. Calculation of order parameters from MD simulations to probe short and long timescale dynamics. We calculated S 2 order parameters that reflect either motions faster than overall tumbling of the protein (black) or longer timescale motions that give rise to chemical shift and NOE averaging (red). For reference, the main chain Root Mean Square Deviation (RMSD) values of the 28 unbiased simulations that we used to calculate the S 2 relaxation values are shown in Figure S3. S 2 chem.shift. Figure S1. Cross-validation of unbiased and CS-restrained ensembles using NOE measurements. We calculated the total number of NOE violations in the unbiased and CS-restrained ensembles. We used 455 NOE-derived distance restraints (BMRB entry 16363) of which 409 are short-and 46 are long-range (i.e. separated by more than 4 residues). In addition to the total number of violations (all), we also separated the violations into different categories depending on their magnitude (0.5Å-1Å, 1Å-2Å or greater than 2Å) and whether they are short (0-4 residues apart) or long-range (more than 4 residues apart).  Figure S2. Validation of the structural ensemble using 1 H methyl chemical sidechain chemical shifts. We calculated the deviation between experimental and calculated side-chain 1 H chemical shifts from each MD ensemble with PPM to compare with the CH3Shift predictions reported in Figure 3B. The results are shown as a function of the number of replicas used for the averaging of the simulations. The previously determined NMR structure (black) and unbiased MD simulation (green) do not involve replica averaging and are shown as horizontal lines.