Advanced Sampling Methods for Multiscale Simulation of Disordered Proteins and Dynamic Interactions

Gong, Xiping; Zhang, Yumeng; Chen, Jianhan

doi:10.3390/biom11101416

Open Access Review

by Xiping Gong ¹,†, Yumeng Zhang ¹,† and Jianhan Chen ¹,²,*

¹ Department of Chemistry, University of Massachusetts Amherst, Amherst, MA 01003, USA

² Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, Amherst, MA 01003, USA

* Author to whom correspondence should be addressed.

† These authors contributed equally to this paper.

Biomolecules 2021, 11(10), 1416; https://doi.org/10.3390/biom11101416

Submission received: 31 August 2021 / Revised: 22 September 2021 / Accepted: 24 September 2021 / Published: 28 September 2021

(This article belongs to the Collection Biomolecules In Silico: Contemporary Advances in Computational Approaches to Investigating the Molecular Dynamics of Biological Systems)

Abstract:

Intrinsically disordered proteins (IDPs) are highly prevalent and play important roles in biology and human diseases. It is now also recognized that many IDPs remain dynamic even in specific complexes and functional assemblies. Computer simulations are essential for deriving a molecular description of the disordered protein ensembles and dynamic interactions for a mechanistic understanding of IDPs in biology, diseases, and therapeutics. Here, we provide an in-depth review of recent advances in the multi-scale simulation of disordered protein states, with a particular emphasis on the development and application of advanced sampling techniques for studying IDPs. These techniques are critical for adequate sampling of the manifold functionally relevant conformational spaces of IDPs. Together with dramatically improved protein force fields, these advanced simulation approaches have achieved substantial success and demonstrated significant promise towards the quantitative and predictive modeling of IDPs and their dynamic interactions. We will also discuss important challenges remaining in the atomistic simulation of larger systems and how various coarse-grained approaches may help to bridge the remaining gaps in the accessible time- and length-scales of IDP simulations.

Keywords:

conformational ensemble; enhanced sampling; generalized Born; Gō-model; implicit solvent; liquid-liquid phase transition; replica exchange; protein force fields

1. Introduction

Intrinsically disordered proteins (IDPs) or regions (IDRs), compared to well-structured proteins, do not have stable tertiary structures under physiological conditions. Nevertheless, IDPs or IDRs can be found in nearly a third of proteins encoded in the human proteome [1], and they play key roles in a variety of biological processes that underlie vital cellular functions ranging from signaling and regulation to transport [2,3]. The inherent thermodynamic instability of an IDP’s conformation allows it to respond sensitively to numerous stimuli, including binding, changes in cellular environments (e.g., pH), and post-translational modifications [4,5,6,7,8]. Such conformational plasticity arguably enables IDPs to interact with multiple signaling pathways and serve as scaffolds to form multi-protein complexes [9]. Importantly, IDPs and IDRs house around 25% of disease-associated missense mutations [10]. They have been considered promising therapeutic targets for treating various diseases (such as chronic diseases) [11,12,13]. While many IDPs have been shown to undergo binding-induced folding transitions upon specific binding [3], many examples are also emerging to demonstrate that IDPs can remain unstructured even in specific complexes and functional assemblies [14,15,16,17,18,19,20]. Such a dynamic mode of specific protein interactions seems much more prevalent than previously thought [21,22,23].

It is very challenging to provide reliable descriptions of the conformational ensembles of IDPs and IDRs. A disordered state does not lend itself to traditional structural determination methods that are geared toward describing a coherent set of similar structures. Biophysical techniques, such as NMR, SAXS, and FRET, can provide complementary information on various local and long-range structural organizations [7]. However, these ensemble-averaged measurements alone are not sufficient to unambiguously define the heterogeneous ensemble, due to the severely underdetermined nature of the structure calculation problem [8,24,25]. As a result, studies of IDPs have relied heavily on the traditional structure-function paradigm, by solving the folded structure of the bound state, analyzing coupled binding and folding mechanisms, or identifying putative pre-existing functional structures in the unbound state [3]. However, the disordered ensemble itself is arguably the central conduit of cellular signaling. The functional mechanism of an IDP is encoded in how the disordered ensemble as a whole responds to various stimuli, be it cooperative binding-induced folding or the redistribution of conformational sub-states in dynamic interactions. Multiple cellular signals can be naturally integrated through cooperative responses of the whole dynamic ensemble [26,27,28]. Therefore, there is a critical need for reliable characterization of disordered protein conformational ensembles, in both bound and unbound states, in order to establish the molecular basis of IDPs and IDRs in various physiological and pathophysiological processes.

Given the fundamental challenges of characterizing disordered protein states based on ensemble-averaged measurements alone, molecular modeling and simulations have a crucial and unique role to play in mechanistic studies of IDPs and IDRs [29,30,31,32,33]. This is reflected in the continuously increasing number of research articles containing the keywords “intrinsically disordered” and “molecular dynamics” published in the last 10 years (Figure 1). A particularly attractive approach is to first generate the disordered ensemble using transferable, physics-based force fields without any experimental restraints and then use the experimental data for independent validation [7]. Such de novo simulations of disordered protein ensembles require both high force field accuracy and adequate sampling of relevant conformational space, pushing the limit of these two central ingredients of molecular dynamics (MD) and Monte Carlo (MC) simulations. The challenges of simulating disordered proteins have driven significant interest in developing better protein force fields and advanced sampling methods (Figure 1). In particular, important advances have been made in the state-of-the-art atomistic force fields for describing the conformational equilibria of ordered and disordered proteins [13]. Enhanced sampling techniques have played crucial roles in both the development and application of atomistic force fields, by allowing one to cross energy barriers faster and accelerate the conformational sampling of IDPs [34,35,36,37,38,39,40,41]. Nonetheless, atomistic simulations still have limited capability in describing large systems such as biological condensates [42]. For this, multi-scale approaches are necessary to bridge the gaps in experimental and computational time- and length-scales, including implicit solvent models, which remove the solvent degrees of freedom [8], and various coarse-grained models, which significantly reduce both protein and solvent degrees of freedom [43].

In this review, we will start by highlighting the challenges of sampling IDP conformational ensembles and providing a summary on the state-of-the-art force fields available to describe the IDP conformations. It is noted that several excellent review papers have been published recently that cover general theoretical and computational approaches for studying IDPs, in particular regarding protein–protein interactions and biological condensates [29,44,45,46]. This review will therefore focus on the recent development of advanced sampling methods for simulating disordered conformational ensembles and dynamic interactions of IDPs. We will also discuss some of the key advances in the multi-scale modeling of IDPs that greatly extend the accessible length- and time-scales of molecular simulations. Finally, we discuss future directions in developing a robust computational framework for simulating IDP conformational equilibria and interactions.

2. Challenges of Simulating IDP Conformational Equilibria

Compared to globular proteins that have one or a few well-defined global energy minima, the energy landscape of an IDP is flatter and generally includes many local energy minima separated by modest energy barriers [47]. IDPs and IDRs typically have fewer hydrophobic residues, but a larger number of polar or charged as well as disorder-promoting residues (such as glycine and proline) [44]. These sequence features hamper the formation of hydrophobic cores that drive protein folding and thus prevent the formation of stable tertiary structures. Instead, IDPs and IDRs favor forming an ensemble of unfolded or partially folded states. This presents a major challenge for simulation, which depends critically on the ability of the force fields to accurately describe the energetics of relevant conformational states, especially for capturing both folded and unfolded states of an IDP. For example, one recent study tested eight force fields in atomistic simulations of IDPs and found marked differences in the resulting conformational ensembles, in particular the secondary structure content [48]. Similar observations have also been made in other benchmark studies, consistently showing that protein force fields previously optimized for folded proteins are not suitable for simulating disordered protein states, largely due to over-stabilization of protein–protein interactions [49]. These benchmark studies also suggested that the key to better protein force fields was to rebalance protein–protein, protein–water, and water–water interactions.

Besides accurate force fields, reliable simulation of IDPs also hinges on sufficient sampling of many relevant conformational states within a reasonable simulation time. Standard MD simulations are generally insufficient to generate representative conformational ensembles, even using the most accurate protein force fields coupled with advances in GPU computing or specialized hardware such as the ANTON supercomputer [50]. For example, a recent reanalysis of a 30-μs ANTON trajectory of the 40-residue Aβ40 peptide in explicit solvent revealed very limited convergence even at the secondary structure level [13]. This can be attributed to the diverse and large accessible conformational space of an IDP and the potentially high free energy barriers separating various sub-states, which require exponentially longer times to cross. Note that typical simulation times on conventional hardware (such as GPUs) are at least one order of magnitude shorter. There is thus great danger in relying on standard MD to calculate disordered protein conformational ensembles at the atomistic level. There is a critical need to develop and leverage so-called enhanced sampling techniques, which aim to generate statistically meaningful conformational ensembles with dramatically less computation.
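To make the exponential dependence on barrier height concrete, a simple Arrhenius-type estimate can serve as a rough guide. The prefactor τ₀, the barrier height ΔG‡, and the increment δ are illustrative symbols introduced here and are not taken from the studies cited above.

```latex
\tau \approx \tau_{0} \exp\!\left(\frac{\Delta G^{\ddagger}}{k_{B}T}\right),
\qquad
\frac{\tau(\Delta G^{\ddagger} + \delta)}{\tau(\Delta G^{\ddagger})} = \exp\!\left(\frac{\delta}{k_{B}T}\right)
```

At T = 300 K, k_BT ≈ 0.596 kcal/mol, so under this estimate each additional k_BT ln 10 ≈ 1.37 kcal/mol of barrier height lengthens the expected crossing time roughly tenfold.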

Computational studies of IDP interaction and assembly are even more demanding. The conformational equilibrium of an IDP can respond sensitively to specific and nonspecific binding, potentially shifting from a disordered to somewhat ordered state or fully folded state. In principle, simulations could provide the much-needed spatial and time resolutions to elucidate the kinetics and thermodynamics of coupled folding and binding processes and characterize the mechanistic features. However, the challenge is that this coupled process of folding and binding is a complex reaction involving the formation of many noncovalent interactions, which requires extremely long simulations generally beyond the current capabilities at the atomistic level. As such, coarse-grained models are generally required for computational studies of IDP interaction and assembly.

3. The State-of-the-Art Protein Force Fields for Describing IDP Conformations

Empirical protein force fields are potential energy functions that typically include physics-motivated bonded and non-bonded terms carefully parameterized based on a wide range of theoretical and experimental data [51]. These force fields can in principle be transferable between folded proteins and IDPs. To achieve this, it is also critical to develop suitable water models and better describe the water–protein interactions [52,53]. Two recent review articles have already provided comprehensive descriptions on the latest development of better protein force fields [51,54]. We therefore briefly summarize the state-of-the-art of nonpolarizable and polarizable force fields for IDP dynamics and interactions.

3.1. Nonpolarizable Protein Force Fields

Many previous nonpolarizable force fields have significant shortcomings in describing unfolded or disordered proteins. For example, they typically provide a poor description of the secondary structure content of IDPs and tend to produce conformations that are too compact compared to the experimentally measured dimensions of IDPs [48,55]. These problems can likely be attributed to the unbalanced parameterization of dihedral torsion space and the description of protein–protein and protein–water interactions [56]. As a result, most of the improved force fields managed to give more accurate secondary structure propensities by adjusting dihedral parameters or adding grid-based energy correction map (CMAP) parameters [54]. The over-compactness of disordered proteins can be alleviated by modifying protein–water van der Waals interactions or combining with refined water models [52]. Representative state-of-the-art force fields include the latest CHARMM36m/TIP3P* [57], ff19SB/OPC [58], and a99SB-disp/TIP4P-D [50]. Many benchmark studies have consistently demonstrated that these refined force fields do provide significant improvements in describing not only single folded and disordered proteins, but also multiprotein systems that are either soluble or aggregate in solution [55,59,60,61,62]. At the same time, these studies also identified significant remaining limitations in the description of the noncovalent interactions in multiprotein systems [60]. Recognizing limitations in the ability of the a99SB-disp/TIP4P-D force field to accurately describe protein–protein interactions, a new force field, DES-Amber, was recently developed to provide more accurate simulations of protein–protein complexes while maintaining reliable descriptions of both ordered and disordered single-chain proteins [61].
However, DES-Amber is still limited in reproducing the experimental protein–protein association free energies of some protein complexes, in particular for the systems with highly polar interfaces [61]. In the latter case, it was found that the charged sidechains were buried at the protein–protein interface instead of being solvent-exposed. It was further suggested that nonpolarizable force fields were fundamentally limited in achieving a balanced description of charged groups that were solvent-exposed or buried at a protein–protein interface.

3.2. Polarizable Protein Force Fields

Polarizable force fields explicitly consider electronic polarization using various empirical models to provide a better description of charged and polar protein motifs in heterogeneous biomolecular environments [63]. Exciting progress has been made in the last few years and several polarizable force fields are now available for the stable simulation of proteins in both aqueous and membrane environments [64,65]. Simulations using the latest polarizable force fields have also shown a high level of consistency with experimental observations, particularly for ion solvation and binding thermodynamics, the permeation free energy of ions or small charged molecules into the cell membrane, and protein–ligand binding [63]. For example, the Drude-2013 polarizable force field, compared to the CHARMM36 force field, is more accurate in describing the folding cooperativity of the (AAQAA)₃ peptide, which can be attributed to enhanced backbone dipole moments in the helical state [66]. Additional studies are still needed to show the necessity of considering polarizable force fields in IDP simulations, where the significantly higher computational cost adds to the challenge of generating converged ensembles [63]. Existing comparisons suggest that polarizable force fields, including the AMOEBA and Drude models, still frequently have problems in reproducing the native structures and folding of proteins [67,68,69]. For example, stronger protein–water interactions in polarizable force fields can destabilize the native protein structure, in contrast to the observations from nonpolarizable force fields where protein–water interactions have traditionally been underestimated [42]. Nonetheless, it can be anticipated that polarizable force fields will continue to be improved and become increasingly important for simulating IDP structure and interactions.

4. Enhanced Sampling Methods for Sampling IDP Conformational Ensembles

Enhanced sampling techniques generally accelerate the crossing of energy barriers to achieve better sampling efficiency, such as by introducing bias potentials, modifying the potential energy itself, and changing the effective temperature. These techniques have proven essential in atomistic simulations of IDPs [70,71], yielding levels of convergence that could not be achieved even with drastically longer standard constant-temperature MD simulations [13]. The central idea of biased MD simulations is similar to importance sampling in MC simulations, where a biased potential is introduced to construct a flat free energy landscape along single or multiple collective variables of interest, such that many states can be readily sampled due to the removal of free energy barriers. The replica-exchange (REX) class of sampling methods, particularly replica exchange molecular dynamics (REMD), has been one of the most popular methods for simulating protein conformations. Figure 2 shows the general scheme of REMD simulations, where the key point is to first set up multiple replicas with different unitless unbiased or biased potentials, given as the energy over k_BT (T is the temperature), and then use the Metropolis rule to allow MC to exchange the replicas and maintain the detailed balance. A key advantage of using multiple replicas and maintaining detailed balance is avoiding the reweighting problem generally required for biased simulations. Note that virtually all biased sampling strategies can be readily incorporated within the REX framework to benefit from both classes of enhanced sampling, including metadynamics (MTD) [72,73], accelerated MD (aMD) [74], umbrella sampling (US) [75,76], and integrated tempering sampling [77]. 
In practice, effective REMD protocols require a proper choice of (1) the optimal number of replicas and proper distributions of conditions, to ensure a uniform exchange acceptance rate and efficient random walk in the condition space, and (2) the unitless (biased) potentials for effective conformational diffusion at each condition [78]. Here, we divide various enhanced sampling strategies into two general groups depending on the need for collective variables and discuss their recent applications to IDP conformational sampling. These methods are summarized in Table 1.
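The Metropolis exchange step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the protocol of any particular MD package: the function names, the argument layout, and the choice of kcal/mol units are assumptions made here, and the reduced potentials would be supplied by the simulation engine.

```python
import math
import random


def exchange_probability(u_i_xi, u_i_xj, u_j_xi, u_j_xj):
    """Metropolis acceptance probability for swapping the configurations
    held by replicas i and j.

    u_a_xb is the unitless (reduced) potential of condition a, i.e. the
    energy over k_B*T, evaluated on the configuration held by replica b.
    """
    delta = (u_i_xj + u_j_xi) - (u_i_xi + u_j_xj)
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def temperature_exchange_probability(e_i, e_j, t_i, t_j, k_b=0.0019872041):
    """Special case for T-REMD, where u_m(X) = E(X) / (k_B * T_m).

    Energies are assumed to be in kcal/mol and temperatures in K.
    """
    b_i = 1.0 / (k_b * t_i)
    b_j = 1.0 / (k_b * t_j)
    return exchange_probability(b_i * e_i, b_i * e_j, b_j * e_i, b_j * e_j)


def attempt_swap(p_accept, rng=random):
    """Accept the swap with probability p_accept."""
    return rng.random() < p_accept
```

For two temperature replicas the exponent reduces to (β_i − β_j)(E_j − E_i), so a swap that moves the lower-energy configuration to the colder replica is always accepted.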

4.1. Collective Variables-Based Sampling Methods and Optimization

MTD and its variants have been considered among the most important collective variable (CV)-based sampling methods for protein simulations [90]. MTD uses a history-dependent bias potential, which is generally a sum of Gaussians, to eventually construct a flat free energy landscape along the predetermined CV(s). A well-tempered MTD (WT-MTD) was later developed to improve convergence, by gradually reducing the size of the Gaussians based on the total accumulated bias potential [72,73]. Furthermore, parallel tempering MTD (PT-MTD) and combinations with other biased sampling methods have also been developed to increase the sampling efficiency and convergence of free energy calculations [91,92]. Representative examples include PT-MTD, which combines WT-MTD with PT, and bias-exchange MTD, which uses a different CV in each replica rather than exchanging the temperatures. For example, PT-MTD and bias-exchange MTD have been employed to obtain the conformational ensembles and characterize the coupled binding and folding of the disordered pKID and KID proteins, using the α-score of helical structures as CVs [79]. It has also been shown that REMD-based MTD, compared to conventional MTD or T-REMD, can enhance the conformational sampling of N-Glycans using dihedral angles as CVs to characterize the global motions [93]. The binding mechanism of two disordered peptides, NRF2 and PTMA, was simulated by WT-MTD, and the results showed that the method could provide converged free energy profiles with 1.5 μs of sampling time [94]. Together, these applications have shown that the MTD class of sampling methods can be effectively applied to IDP simulations. Besides MTD, another important class of CV-based sampling strategies is the US method [76]. US is not strictly an enhanced sampling method like MTD. It typically uses multiple harmonic potentials to focus on sampling various states along the collective variables of interest.
US is often combined with REMD in studies of IDPs, as illustrated in a recent 2D window-exchange US simulation of the coupled folding and binding mechanism of HdeA homodimer [80]. The simulation was able to capture rare unfolding transitions of the dimer at neutral pH and provided a detailed description of the transition pathways.
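The history-dependent bias used in MTD, and the height reduction used in WT-MTD, can be illustrated with a one-dimensional sketch. The function names and the parameters `sigma`, `w0`, and `kb_delta_t` are hypothetical; the well-tempered scaling exp(−V(s)/k_BΔT) is the commonly used form and is assumed here rather than taken from the text above.

```python
import math


def bias(s, centers, heights, sigma):
    """History-dependent bias: a sum of Gaussians deposited along the CV."""
    return sum(
        h * math.exp(-((s - c) ** 2) / (2.0 * sigma ** 2))
        for c, h in zip(centers, heights)
    )


def deposit(s, centers, heights, sigma, w0, kb_delta_t=None):
    """Deposit one Gaussian at the current CV value s and return its height.

    Plain MTD uses the constant height w0. When kb_delta_t (k_B * dT, in
    the same energy units as w0) is given, the well-tempered rule shrinks
    the height by exp(-V(s) / (k_B * dT)), where V(s) is the bias that has
    already accumulated at s.
    """
    height = w0
    if kb_delta_t is not None:
        height = w0 * math.exp(-bias(s, centers, heights, sigma) / kb_delta_t)
    centers.append(s)
    heights.append(height)
    return height
```

Repeated visits to the same CV value add progressively smaller Gaussians in the well-tempered variant, which is what allows the accumulated bias to converge instead of growing without bound.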

A central limitation of CV-based sampling methods is that the efficiency strongly depends on the quality of the selected CV(s). For diffusion processes such as protein conformational fluctuation, it is often not clear which CVs can best capture large-scale transitions or even if these transitions could be effectively described using one or a few CVs [95,96,97]. Another practical limitation is that the computational cost of MTD and US grows exponentially as a function of the number of CVs, generally limiting the maximum to three. Parallel bias metadynamics (PBMetD) approaches have been proposed to overcome this limitation, by applying multiple low-dimensional bias potentials in parallel [98,99]. Nonetheless, the efficacy of PBMetD for sampling complex (disordered) protein conformational space is yet to be demonstrated. Another recent work presented a temperature accelerated sliced sampling method to explore the high-dimensional free energy landscape by combining temperature-accelerated MD/driven-adiabatic free energy dynamics (TAMD/d-AFED), MTD, and US methods to sample many CVs simultaneously [100]. However, the approach shares the limitation of PBMetD in that the underlying bias potentials remain low dimensional in nature. To address the problem of determining the best CVs for a particular problem of interest, machine learning algorithms and deep learning networks have recently been proposed to analyze information from many candidate CVs and construct the free energy landscape using low-dimensional representations [81,82]. On-the-fly discovery of optimal CVs was also demonstrated using artificial neural networks, which have a strong capacity for learning and optimization for given linear or nonlinear CVs [101]. In another recent study, an eight-dimensional optimal biased potential was constructed and applied to the free energy calculations of polypeptides using two machine learning algorithms, namely the nearest neighbor density estimator and an artificial neural network [102].
Similar deep neural networks have also been shown to be capable of constructing nontrivial biased potentials, for deep enhanced sampling of protein conformational space and overcoming so-called hidden barriers [103,104]. These are exciting developments that may greatly expand the applicability of MTD, US, and other CV-based sampling techniques to problems of increasing complexity, including simulations of IDPs and their dynamic interactions, especially when combined with REX.
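The exponential growth of cost with the number of CVs noted above can be illustrated by simply counting the grid points that a bias potential must cover. The sketch below assumes, purely for illustration, that each CV is discretized into the same number of bins and that a parallel-bias scheme uses one independent one-dimensional bias per CV; both function names are hypothetical.

```python
def joint_grid_points(bins_per_cv, n_cvs):
    """Grid points needed for a single bias defined jointly on all CVs."""
    return bins_per_cv ** n_cvs


def parallel_grid_points(bins_per_cv, n_cvs):
    """Grid points needed when one 1D bias is applied per CV in parallel."""
    return bins_per_cv * n_cvs


# Compare the two counts for an assumed 50 bins per CV.
for n in (1, 2, 3, 6):
    print(n, joint_grid_points(50, n), parallel_grid_points(50, n))
```

With 50 bins per CV, three CVs already require 125,000 joint grid points, whereas the parallel one-dimensional scheme needs 150, at the price of a bias that remains low dimensional.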

4.2. Collective Variables-Free Sampling Methods and Optimization

CV-free sampling avoids the need to identify a set of optimal CVs and can be highly desirable for simulating high-dimensional conformational fluctuations of IDPs. Many CV-free sampling methods have also been developed, including tempering-based and energy-scaled biased methods. Tempering-based sampling methods rely on increasing the effective simulation temperature (i.e., tempering) to accelerate barrier crossing. Examples include temperature cool walking [105], annealed importance sampling [106], simulated tempering [83], and temperature-based REMD (T-REMD) [36]. T-REMD, in particular, has proven highly effective for protein folding and studies of IDP conformational ensembles, where multiple replicas are simulated at different temperatures in parallel to promote barrier crossing as the system undertakes a random walk in the temperature space (Figure 2). Nevertheless, one potential limitation is that the number of replicas required for T-REMD scales as the square root of the number of degrees of freedom (DOFs) of the whole system, in order to maintain a reasonable exchange acceptance probability. This can dramatically increase the computational cost of explicit solvent T-REMD simulations. Several methods have been proposed to avoid this demanding cost, such as adding energy-related terms (as in accelerated MD or Gaussian accelerated MD, named GaMD) or scaling the potential energy function (including scaled MD, which scales all energy terms, and replica exchange solute tempering (REST) methods, which scale only part of the energy terms) [88,93,107,108].
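The square-root scaling of the replica count can be explored with a small sketch. The geometric temperature spacing is a common choice for T-REMD ladders, and the estimate n ≈ 1 + c·√(N_DOF)·ln(T_max/T_min) is a rough heuristic; the prefactor `c` is a hypothetical tuning constant that depends on the target acceptance rate and is not taken from the text above.

```python
import math


def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures spaced so that T[i+1] / T[i] is constant."""
    if n_replicas < 2:
        raise ValueError("at least two replicas are required")
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


def estimate_replicas(n_dof, t_min, t_max, c=1.0):
    """Rough replica count, assuming the tolerable spacing ln(T[i+1]/T[i])
    shrinks as 1 / sqrt(n_dof). The prefactor c is an assumed constant."""
    return 1 + math.ceil(c * math.sqrt(n_dof) * math.log(t_max / t_min))


ladder = geometric_ladder(300.0, 500.0, 8)
print([round(t, 1) for t in ladder])
print(estimate_replicas(10_000, 300.0, 500.0), estimate_replicas(40_000, 300.0, 500.0))
```

Under this heuristic, quadrupling the number of DOFs (for example by enlarging the water box) roughly doubles the number of replicas needed to span the same temperature range.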

aMD adds boost potentials to reduce the energy barriers and accelerate sampling [74]. However, it suffers from serious energetic noise upon reweighting [109]. GaMD was thus developed to reduce this noise by introducing a new harmonic boost potential, which allows a new reweighting technique that can accurately recover the free energy landscape using a cumulant expansion to the second order [86]. GaMD has achieved some success in studying protein folding, protein–ligand binding, and protein–protein interactions [109]. In particular, the specifically developed Ligand GaMD [110] and Peptide GaMD [111] can capture the binding and dissociation of molecular ligands and highly disordered peptides within microsecond simulations. Recently, the GaMD method has also been combined with the REMD protocol, which can avoid the energy reweighting problem [108]. A combination of replica-exchange umbrella sampling (REUS) and GaMD has also been designed for conformational sampling and free energy calculations [88]. It is noted that CV-free enhanced sampling methods are generally more suitable for simulating IDP conformations and dynamics, because of the difficulty of identifying appropriate CVs for IDP simulations as discussed above.
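The harmonic boost and the second-order cumulant reweighting mentioned above can be sketched as follows. The boost form ΔV = ½k(E − V)² for V < E is the one commonly given in the GaMD literature and is assumed here; the function names are hypothetical, and in practice the threshold `e_threshold` and force constant `k` would be determined by the simulation package.

```python
def gamd_boost(v, e_threshold, k):
    """Harmonic boost: 0.5 * k * (E - V)^2 when V < E, otherwise zero."""
    if v < e_threshold:
        return 0.5 * k * (e_threshold - v) ** 2
    return 0.0


def log_reweighting_factor(boosts, beta):
    """Approximate ln<exp(beta * dV)> by a cumulant expansion truncated at
    second order: beta * mean(dV) + 0.5 * beta^2 * var(dV).

    `boosts` holds the boost energies of the frames falling in one bin of
    the reaction coordinate; beta is 1 / (k_B * T) in matching units.
    """
    n = len(boosts)
    mean = sum(boosts) / n
    var = sum((b - mean) ** 2 for b in boosts) / n
    return beta * mean + 0.5 * beta ** 2 * var
```

The truncation is exact when the boost energies in a bin are Gaussian distributed, which is the motivation for choosing a harmonic boost with a narrow distribution.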

REST is a special variant of T-REMD designed specifically to reduce the number of DOFs that contribute to the Metropolis criterion of replica exchange, such that a smaller number of replicas is needed [37,85]. The basic idea of REST is to separate the system into two regions, a ‘hot solute’ and a ‘cold solvent’. The ‘solvent’ could be actual water molecules but could also be any region of the system where no tempering is to be applied. This offers great flexibility in tailoring REST for a specific system of interest. Even more generally, the ‘solute’ can be defined to include only a subset of interaction terms within the ‘solute’ region, such as the dihedral-angle energy or Lennard–Jones energy term in the generalized REST (gREST) method [84]. Temperature-dependent factors are used to scale the ‘solute’–‘solute’ and ‘solute’–‘solvent’ interactions, while keeping the ‘solvent’–‘solvent’ interactions intact:

\begin{array}{l}
u_{m}^{\mathrm{REST}}(X) = \lambda_{m}^{pp} E_{pp}(X) + \lambda_{m}^{pw} E_{pw}(X) + \lambda_{m}^{ww} E_{ww}(X), \\
\mathrm{REST1}: \quad \lambda_{m}^{pp} = \beta_{m}, \quad \lambda_{m}^{pw} = \frac{\beta_{0} + \beta_{m}}{2}, \quad \lambda_{m}^{ww} = \beta_{0}, \\
\mathrm{REST2}: \quad \lambda_{m}^{pp} = \beta_{m}, \quad \lambda_{m}^{pw} = \sqrt{\beta_{0} \beta_{m}}, \quad \lambda_{m}^{ww} = \beta_{0},
\end{array} \qquad (1)

where X is the conformational coordinates and β_m is the inverse of k_BT_m. The scaling of ‘solute’–‘solute’ interactions allows the ‘solute’ to be simulated with an effective temperature of T_m while maintaining the ‘solvent’ temperature at T₀. As a result, the exchange acceptance probability will be independent of ‘solvent’–‘solvent’ interactions, which reduces the effective system size and requires fewer replicas to cover the same temperature range. A key open choice in REST is how the ‘solute’–‘solvent’ term is scaled (Equation (1)). Different solute–solute and solute–solvent scaling factors can strongly affect the ability to drive conformational transitions of the selected ‘solute’ region. A strong solute–solute interaction favors compact protein conformations, whereas a strong solute–solvent interaction favors disordered, solvent-exposed conformations. Different scaling schemes lead to very different characteristics of the REST1 (original) and REST2 (revised) protocols (Equation (1)). High temperature conditions favor the unfolded conformations in REST1, while both folded and unfolded conformations are observed in REST2 at the same effective ‘solute’ temperature. The reason for this is that REST2 was designed to have weaker solute–solvent interactions to promote the sampling of folded conformations even at high temperatures [85]. While this could allow the sampling of reversible folding transitions at all temperatures in REST2, it could lead to conformational trapping, hampering the sampling of disordered conformations of IDPs. One important implication is that the performance of REST can be sensitive to the balance of protein–protein and protein–water interactions of a given protein force field. For example, Liu et al. showed that, while REST2 was highly effective in generating converged ensembles of the 61-residue p53 N-terminal transactivation domain (TAD) using a99SB-disp, it completely failed to converge even with ~1 μs/replica in the CHARMM36m and CHARMM36mw force fields [112]. Separate standard MD simulations revealed that p53-TAD can readily escape the apparently trapped conformations observed during REST2, suggesting that these traps arise due to the imbalance of scaled protein–protein, protein–water, and water–water interactions [112].
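The scaling factors in Equation (1) translate directly into code. The sketch below is a minimal illustration under stated assumptions: the function names, the tuple layout (E_pp, E_pw, E_ww), and the use of kcal/mol units are choices made here, and the energy components would in practice be provided by the MD engine.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K), assumed units


def rest_factors(t_m, t_0, variant="REST2"):
    """Return (lambda_pp, lambda_pw, lambda_ww) of Equation (1) for a
    replica with effective solute temperature t_m and solvent at t_0."""
    b_m = 1.0 / (K_B * t_m)
    b_0 = 1.0 / (K_B * t_0)
    if variant == "REST1":
        l_pw = 0.5 * (b_0 + b_m)
    elif variant == "REST2":
        l_pw = math.sqrt(b_0 * b_m)
    else:
        raise ValueError("variant must be 'REST1' or 'REST2'")
    return b_m, l_pw, b_0


def reduced_potential(energies, t_m, t_0, variant="REST2"):
    """Unitless potential u_m(X) for energies = (E_pp, E_pw, E_ww)."""
    l_pp, l_pw, l_ww = rest_factors(t_m, t_0, variant)
    e_pp, e_pw, e_ww = energies
    return l_pp * e_pp + l_pw * e_pw + l_ww * e_ww


def exchange_delta(x_m, x_n, t_m, t_n, t_0, variant="REST2"):
    """Metropolis exponent for swapping configurations x_m and x_n;
    the swap is accepted with probability min(1, exp(-delta))."""
    return (
        reduced_potential(x_n, t_m, t_0, variant)
        + reduced_potential(x_m, t_n, t_0, variant)
        - reduced_potential(x_m, t_m, t_0, variant)
        - reduced_potential(x_n, t_n, t_0, variant)
    )
```

Because λ^ww equals β₀ in every replica, the E_ww contributions cancel in `exchange_delta`, which is the reason the acceptance probability does not depend on the solvent–solvent energy. For T_m > T₀ the geometric mean used in REST2 is smaller than the arithmetic mean used in REST1, consistent with the weaker solute–solvent scaling described above.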

REST has proven to be one of the most reliable choices for enhanced sampling of protein folding and particularly disordered conformational ensembles [113,114]. Sugita and co-workers leveraged gREST to target the dihedral-angle energy term and successfully sampled folding transitions of beta-hairpins and Trp-cage in explicit water, using fewer replicas but covering wider conformational space compared to REST2 [84]. Walsh et al. applied REST to investigate the conformational ensembles of the disordered n16N peptide [115]. The conformations obtained via REST showed a high consistency with NMR experimental data. Furthermore, REST is particularly appropriate for simulating IDRs, as the disordered region can be targeted in REST without tempering the well-structured region (or water). Zhou and co-workers studied the disorder-to-order transition of a disordered loop of Staphylococcus aureus sortase A (SrtA) upon calcium binding [116]. Chen and Liu characterized Bcl-xL interfacial conformational dynamics in explicit solvent [117]. Both works directly showed that REST covered broader conformational spaces for intrinsically disordered regions and led to faster convergence compared to either standard MD or T-REMD simulations. REST simulations have also been successfully integrated with experiments in recent years to study how cancer-associated mutations and drug molecules may modulate the disordered ensembles of p53-TAD and Aβ peptides [118,119,120,121].

Despite the success of REST for CV-free enhanced sampling, it does not benefit from targeted acceleration along specific CVs that are known to be rate limiting. To address this, REST (or REX in general) has been combined with CV-based enhanced sampling to maximize the efficiency of sampling the complex, high-dimensional conformational space of proteins. Some examples are discussed in the sections above. Here, we note a couple of additional recent examples. By integrating free energy perturbation (FEP) and REST, Abel et al. obtained more thorough sampling of different ligand conformations around the active site and achieved relative binding affinity predictions [122]. Okamoto and co-workers applied the two-dimensional REUS/REST replica-exchange method to predict binding in two protein–ligand complex systems, using REST to weaken the solute–solvent interactions and promote binding events and REUS to enhance sampling along the reaction coordinates [87].

Multiscale enhanced sampling (MSES) is yet another fascinating example of a CV-free enhanced sampling strategy. Protein folding and other cooperative transitions such as self-assembly are known to be dominated by entropic barriers, which renders tempering ineffective for driving faster transitions. Coupled with the lack of obvious CVs, this makes sampling complex conformational transitions of IDPs and their interactions challenging for both CV-based and REX-based CV-free methods. An effective solution is to couple atomistic simulations with a coarse-grained (CG) model, such that one can benefit from both the faster transitions of CG modeling and the accuracy of atomistic force fields [123]. A particularly attractive approach was first introduced by Kidera and coworkers, where restraint potentials were used to couple CG and atomistic conformational dynamics along “essential” DOFs shared by the two models [35]. The bias introduced by the coupling potential is removed using Hamiltonian REX (H-REX). Chen and coworkers further adapted the method to utilize topology-based CG models (see below), improved coupling potentials, and advanced Hamiltonian/temperature REX (H/T REX) [34,124,125]. Coupling the CG and atomistic models using restraints is a key strength of these MSES protocols. It allows full control of the energetic impact of diverged structures at different resolutions, which improves exchange efficiency and provides superior scalability to large systems. MSES coupling also provides robust tolerance of CG defects by preventing the CG model from dictating the conformational dynamics. The efficacy of MSES has been illustrated using several systems. It was highly effective in simulating reversible transitions of small β-hairpins and helical IDPs [34,124,125] and proved instrumental in the further refinement of a GBMV2 implicit solvent protein force field for both ordered and disordered peptides [126]. Very recently, MSES was also observed to be effective in sampling the cis–trans transitions of lutein by coupling the atomistic model with the Martini CG model [127]. Nonetheless, the application of MSES to larger and more complex proteins has proven more challenging than originally expected, apparently due to the difficulty of effectively coupling the CG and atomistic conformational fluctuations of a larger protein.
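As a rough illustration of how H-REX can remove the coupling bias, the Python sketch below assumes a simple harmonic restraint between matching atomistic and CG distances, scaled by a factor λ that differs between replicas (with λ = 0 corresponding to the uncoupled, unbiased condition). The harmonic form, the force constant, and all names are assumptions made for illustration; they are not the actual MSES coupling potentials of refs [34,35].

```python
import math


def coupling_energy(d_atomistic, d_cg, k=1.0):
    """Harmonic restraint between atomistic and CG distances along shared DOFs.

    Hypothetical functional form; k is a force constant in kcal/(mol*A^2).
    """
    return 0.5 * k * sum((a - c) ** 2 for a, c in zip(d_atomistic, d_cg))


def hrex_delta(u_mix_m, u_mix_n, lam_m, lam_n, beta):
    """Metropolis exponent for swapping configurations between two replicas.

    Replica i uses H_i = H_atomistic + H_CG + lam_i * U_mix, so only the
    coupling term differs between replicas and all other terms cancel.
    u_mix_m and u_mix_n are U_mix evaluated for the configurations currently
    held by replicas m and n.
    """
    return beta * (lam_m - lam_n) * (u_mix_n - u_mix_m)


def acceptance_probability(delta):
    """Metropolis criterion: min(1, exp(-delta))."""
    return 1.0 if delta <= 0.0 else math.exp(-delta)


# Hypothetical pair distances (Angstrom) in the atomistic and CG copies.
u_m = coupling_energy([5.0, 8.0], [5.5, 8.2], k=2.0)
u_n = coupling_energy([5.0, 8.0], [7.0, 9.5], k=2.0)
beta = 1.0 / (0.0019872041 * 300.0)
print(acceptance_probability(hrex_delta(u_m, u_n, 1.0, 0.5, beta)))
```

In this picture, weakening the restraint (smaller k or λ) reduces the energetic penalty for diverged atomistic and CG structures. This is one way to read the statement that restraint-based coupling gives control over exchange efficiency.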

Other tempering methods, including integrated tempering and simulated tempering, have also been combined with different biasing potentials to enhance sampling [89,128]. For example, an integrated accelerated MD method has recently been used to sample the conformations of pepX peptides, and it was shown that this method can improve the sampling efficiency and provides a good strategy for simulating IDPs [69,89]. A combination with metadynamics has also been used to sample the conformational space of silica, where the sampling was accelerated by over one order of magnitude [128]. One significant benefit of these methods is that only a single replica is required, which could make them suitable for the Anton specialized hardware [50]. However, one drawback is that the relative free energies of all conditions (or, equivalently, the density of states) must be estimated, which requires recursive simulations and can be difficult to converge for complex systems, such as large IDPs and complexes.
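The need for per-condition free energy estimates can be seen from the generic simulated tempering acceptance rule, sketched below in Python. This is the textbook form for a single trajectory hopping between temperatures at fixed coordinates, not the specific protocol of any work cited here; the weights `g_i` stand for unitless free energy estimates, and all numerical values are hypothetical.

```python
import math

KB = 0.0019872041  # Boltzmann constant, kcal/(mol*K)


def st_acceptance(energy, t_i, t_j, g_i, g_j):
    """Probability of accepting a temperature move i -> j at fixed coordinates.

    The joint distribution sampled is p(X, i) ~ exp(-beta_i * E(X) + g_i).
    Uniform visits to all temperatures require g_i close to beta_i * F_i,
    which is why the relative free energies must be known in advance.
    """
    beta_i = 1.0 / (KB * t_i)
    beta_j = 1.0 / (KB * t_j)
    log_ratio = -(beta_j - beta_i) * energy + (g_j - g_i)
    return math.exp(min(0.0, log_ratio))


# With zero weights, a low-energy state rarely moves up in temperature.
print(st_acceptance(-100.0, 300.0, 310.0, 0.0, 0.0))
# A weight difference close to the free energy difference restores the move.
print(st_acceptance(-100.0, 300.0, 310.0, 0.0, 5.0))
```

Poorly estimated weights leave the walker trapped at a subset of temperatures, which is the convergence difficulty noted above for large IDPs and complexes.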

4.3. Reweighting Techniques for Generating Unbiased Ensembles

When bias potentials are used to enhance sampling, reweighting is often required to obtain unbiased samples and construct statistically optimal unbiased free energy surfaces. Two reweighting methods are widely used for this purpose: the weighted histogram analysis method (WHAM), for biased simulations with specific CVs, and the more general multistate Bennett acceptance ratio (MBAR) approach [129,130]. The stability of both WHAM and MBAR can be susceptible to large energetic fluctuations because the weights depend exponentially on the unitless potentials. Large energy fluctuations among sampled conformations can lead to large uncertainties in reweighting and thus in the final unbiased distributions. Another, population-based reweighting method has been used for unbiasing scaled MD simulations by constructing a multidimensional histogram of all sampled configurations [131]. However, the dimensionality of configurational space is usually very high and thus can hardly be described completely using dimensionality reduction techniques (such as principal component analysis). Recently, it was proposed that the energetic noise can be alleviated by truncating the cumulant expansion of the exponential average [86], an approach originally used in accelerated molecular dynamics. It has been shown that this can accurately recover free energy profiles within an acceptable error (~k_BT), especially for near-Gaussian distributions of the biased unitless potentials [86]. This approximate reweighting method has therefore been successfully used for reweighting several biased simulations [88]. It should be mentioned that these reweighting techniques can be used for any biased simulation, including REMD simulations. Nonetheless, all reweighting methods, including MBAR, rely on good overlap between the true conformational space and the region sampled by the biased simulations. When the overlap is limited, the reweighted distributions will remain significantly different from the true result. The conformational space of even very short IDPs (e.g., ~10 residues or longer) can be complex enough to present formidable challenges for recovering the true disordered ensemble from a biased trajectory, generated either at high temperatures or with a modified Hamiltonian. Instead of analyzing self-convergence (as a function of simulation time), a more rigorous test of convergence is to analyze results obtained from simulations initiated from distinct initial states (such as highly structured and fully disordered conformations [7]).
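The difference between direct exponential reweighting and a truncated cumulant expansion can be made concrete with a small Python sketch. It assumes a boost potential ΔV that was added to the original potential, so that the unbiased weight of each frame is exp(βΔV), and truncates the expansion of ln⟨exp(βΔV)⟩ at second order, β⟨ΔV⟩ + (β²/2)σ². The function names, the binning scheme, and the synthetic Gaussian data are illustrative assumptions.

```python
import math
import random


def exact_log_average(boost, beta):
    """ln<exp(beta*dV)> by direct exponential averaging (log-sum-exp for stability)."""
    top = max(beta * v for v in boost)
    return top + math.log(sum(math.exp(beta * v - top) for v in boost) / len(boost))


def cumulant2_log_average(boost, beta):
    """Second-order cumulant approximation: beta*<dV> + 0.5*beta^2*var(dV)."""
    n = len(boost)
    mean = sum(boost) / n
    var = sum((v - mean) ** 2 for v in boost) / n
    return beta * mean + 0.5 * beta ** 2 * var


def reweighted_pmf(cv, boost, beta, edges):
    """Unbiased free energy per CV bin in units of k_BT, shifted so the minimum is 0.

    Empty bins are returned as None.
    """
    nbins = len(edges) - 1
    binned = [[] for _ in range(nbins)]
    for x, v in zip(cv, boost):
        for j in range(nbins):
            if edges[j] <= x < edges[j + 1]:
                binned[j].append(v)
                break
    total = sum(len(b) for b in binned)
    raw = [None if not b else -(math.log(len(b) / total) + cumulant2_log_average(b, beta))
           for b in binned]
    lowest = min(f for f in raw if f is not None)
    return [None if f is None else f - lowest for f in raw]


# Synthetic Gaussian boost potential (hypothetical numbers, beta in matching units).
random.seed(0)
beta = 1.0
sample = [random.gauss(2.0, 1.0) for _ in range(20000)]
# The analytic value for this Gaussian is beta*mu + 0.5*beta^2*sigma^2 = 2.5.
print(exact_log_average(sample, beta), cumulant2_log_average(sample, beta))
```

For a Gaussian ΔV the second-order expression is exact, and it depends only on the mean and variance rather than on rare large values of exp(βΔV). This is the sense in which truncation reduces energetic noise. For strongly non-Gaussian ΔV the truncation introduces a systematic error instead.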

5. Multi-Scale Approaches for Overcoming Sampling Problems of Large Systems

As discussed above, dramatic improvements in atomistic protein force fields, coupled with enhanced sampling and GPU computing, have now enabled us to generate the disordered conformational ensembles of increasingly complex IDPs in both bound and unbound states. Nonetheless, many important phenomena related to IDPs remain largely out of the reach of physics-based atomistic simulations, such as aggregation [132,133,134] and biological condensates [135,136,137,138]. Here, we review two of the key multi-scale approaches that allow one to simulate longer time-scale processes and more complex systems within current computational capabilities, namely implicit solvent and coarse-grained (CG) models. Both approaches have been extensively studied and applied to globular proteins as well as IDPs.

5.1. Implicit Solvent Models for Removing Solvent DOFs

Implicit treatment of solvent is an effective approach to reduce the computational cost of atomistic IDP simulations. The basic idea is to directly estimate the solvation free energy to capture the mean effect of solvent on the thermodynamic properties of the solute [139]. Implicit solvent is essentially a multi-scale model, where the solvent is represented using a physical model while the atomistic details of the solute are retained. These models have emerged as attractive alternatives to explicit solvent for simulations of IDPs and their interactions. In particular, many generalized Born (GB)-based implicit solvent models have been developed, including the fast analytical continuum treatment of solvation (FACTS) [140], Amber GB models (such as GB-HCT [141], GB-OBC [142], and GB-Neck [143,144]), analytical generalized Born plus nonpolar (AGBNP) [145,146], and GB models implemented in the CHARMM program (such as GBSW [147] and GBMV [148,149]). Several of these GB models can be optimized to provide the balance between computational efficiency and accuracy desired for IDP simulations [126,150,151], by systematic optimization of key physical parameters such as atomic radii to balance solvation and intramolecular interactions. Applied to various model IDPs with extensive experimental data, implicit solvent simulations have provided important insights into the detailed conformational properties of the unbound state and how these properties may support function [32,33,152,153,154].
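To make the GB idea concrete, the Python sketch below evaluates the widely used Still-type pairwise expression for the electrostatic solvation free energy, taking the effective Born radii as given inputs. How those radii are computed is precisely where models such as GB-OBC, GBSW, and GBMV differ, and that step is not shown. The function name, dielectric constants, and example values are assumptions for illustration.

```python
import math

COULOMB = 332.0637  # Coulomb constant, kcal*Angstrom/(mol*e^2)


def gb_energy(coords, charges, born_radii, eps_in=1.0, eps_w=78.5):
    """Still-type GB electrostatic solvation free energy in kcal/mol.

    dG = -0.5 * (1/eps_in - 1/eps_w) * sum_ij q_i q_j / f_GB(r_ij), with
    f_GB = sqrt(r^2 + R_i R_j exp(-r^2 / (4 R_i R_j))).
    The double sum includes i == j, which gives the Born self-energies.
    """
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_w)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            rr = born_radii[i] * born_radii[j]
            f_gb = math.sqrt(r2 + rr * math.exp(-r2 / (4.0 * rr)))
            total += charges[i] * charges[j] / f_gb
    return prefactor * total


# A single monovalent ion with a 2 Angstrom Born radius (hypothetical values).
print(gb_energy([(0.0, 0.0, 0.0)], [1.0], [2.0]))
# A hypothetical ion pair separated by 4 Angstrom.
print(gb_energy([(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)], [1.0, -1.0], [2.0, 2.0]))
```

For an isolated ion the expression reduces to the Born formula, and at large separations the cross term approaches a dielectrically screened Coulomb interaction. The accuracy for a flexible chain therefore rests almost entirely on how well the effective Born radii track the conformation.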

Despite many successes, implicit solvent models have not been widely tested and applied in studies of larger IDPs. Several factors likely contribute to this. Most implicit solvent models are built upon existing protein force fields, which until recent years have had significant limitations in describing disordered protein conformations. Implicit treatment of solvent also relies on various approximations for computational efficiency, such as treating water as a continuous dielectric medium in GB models, which limits the ability of implicit solvent to accurately capture the conformational dependence of the solvation free energy. A particular limitation is the common use of a surface area (SA)-based model to describe the nonpolar solvation energy, which has known limitations in describing the length-scale dependence as well as the solvent screening of dispersion interactions [151]. These limitations can result in a systematic bias towards an overly compact conformational ensemble, which is more pronounced for larger IDPs.

Several recent efforts have been made to further improve implicit solvent models for IDP simulations. The GB-Neck2 model has been optimized to reproduce solvation energies for a variety of protein systems [144]. Recent benchmark studies have shown that the GB-Neck2 model can reasonably discriminate folded and disordered peptides and could be used for quantitative protein folding simulations up to millisecond time scales [155,156,157]. Recently, the GBMV2 model, which includes an analytical approximation of molecular volume and is arguably one of the best GB models, has been implemented on the CUDA platform using the CHARMM/OpenMM interface [158]. The resulting GPU acceleration of ~2 orders of magnitude greatly improves the ability of GBMV2 to simulate the conformations and interactions of larger IDPs. The ABSINTH implicit solvent model focuses on recapitulating the polymer properties of peptides and has been successfully used for a variety of IDP simulations, including studies of Aβ peptides and the aggregation of phenylalanine [159,160], as well as of the sequence–conformation relationships of IDPs in general [8,161]. Recently, an ABSINTH-C model was developed to address the problem of overly shallow Ramachandran distributions in ABSINTH by adding residue-specific correction terms [162]. The new model not only maintains stable native structures of α-/β-folded proteins but also improves the reversible folding of β-hairpin peptides.

5.2. Coarse-Grain Models for Reducing the DOFs of Proteins

Notwithstanding the ever-improving atomistic modeling, coarse-graining has remained an attractive and often effective strategy for extending the accessible time- and length-scales of MD simulations. By grouping multiple (protein) atoms into CG beads and using simplified potential energy functions, CG modeling not only reduces the system size, often by ~10-fold, but also allows much larger MD integration time steps of up to tens of fs. Together, these factors can make CG models several orders of magnitude more efficient than atomistic ones. Numerous CG models have achieved varying levels of success in studies of protein folding, binding, and assembly [43,163]. Nonetheless, there are important distinctions between the conformational properties of globular proteins and IDPs, as well as in the relative importance of electrostatic, hydrophobic, and hydrogen-bonding interactions in governing their conformational equilibria. Therefore, CG models optimized for folded proteins are generally not suitable for IDP simulations. It is often necessary to readjust the parameters of protein–protein and protein–solvent interactions or add new terms for a more accurate description of IDP conformations (Figure 3). Here, we summarize several of these refined CG models for more efficient sampling of IDP conformations and interactions, as well as their successes and limitations.

Gō/Gō-like models, also known as topology-based models, are based on the funneled energy landscape theory [164] and have been highly successful in describing the folding mechanisms and pathways of structured proteins [165]. Somewhat surprisingly, Gō-like models have also proven effective for determining the mechanism and kinetics of IDP interactions, particularly the coupled binding and folding process [166,167,168,169,170,171]. The implication is that binding and folding are governed by similar principles that require minimal frustration for efficiency. Note that Gō-like models generally require additional calibration to provide a more quantitative description of the balance between intermolecular interactions and intrinsic conformational propensities [172]. A key limitation of the topology-based modeling of IDPs is the inability to capture the impacts of non-“native” structural features and nonspecific interactions, which could play important roles in IDP structure and function. This may be partially overcome by including new energy terms (Figure 3), such as explicit charge–charge interactions, inert crowder molecules, and confinement potentials. A particularly interesting discovery from such extended topology-based modeling of IDPs is the role of long-range electrostatic interactions in promoting efficient coupled binding and folding, allowing IDPs to fold at timescales beyond the μs “folding speed limit” to avoid a potential kinetic bottleneck in specific recognition [167,169,173]. IDP-binding proteins have evolved to contain charges near the binding interface that complement the highly conserved charges on IDPs. Long-range electrostatic interactions between these charges not only accelerate the encounter of IDPs with their binding partners but also promote the efficiency of IDP folding upon nonspecific encounters.

Several higher-resolution coarse-grained models have also been developed specifically for modeling IDPs. Thirumalai and co-workers reparametrized the two-bead self-organized polymer coarse-grained model (SOP-CG) to reproduce the Rg values of a diverse set of IDPs with 20 to 441 residues [174]. The resulting SOP-IDP model also accurately reproduces the small-angle X-ray scattering profiles of these IDPs. Nonetheless, SOP-IDP is designed solely for IDPs and lacks transferability and compatibility in describing even small globular proteins under physiological conditions. Recognizing the limitation of the C_α-only backbone representation in capturing the intrinsic conformational propensities of IDPs, Chen and Liu developed a hybrid resolution (HyRes) model that contains an atomistic description of the backbone, to provide a semi-qualitative description of the secondary structure propensities, and intermediate-resolution side chains, to allow a qualitative description of the overall peptide chain dimension and transient long-range interactions [175]. While HyRes was originally designed for driving faster atomistic sampling in MSES simulations, applications to a set of small and large IDPs including p53-TAD suggest that HyRes may be appropriate for simulating IDP structure and interactions by itself [175]. Papoian and co-workers have developed the AWSEM-IDP model, which can be used to efficiently sample the large conformational space of IDPs and at the same time can distinguish the levels of peptide chain expansion of globular proteins and IDPs [176]. AWSEM-IDP includes only C_α, C_β, and O atoms, and has been reparametrized for IDPs by adjusting the secondary structure-related potential energy terms as well as introducing a new term, V_Rg, for controlling the collapse and size fluctuation of the protein.
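Because Rg is the main calibration target for several of these models, it may help to recall how it is computed from a set of CG bead coordinates. The following Python sketch assumes equal bead masses and uses hypothetical coordinates; it is a generic definition and not code from any of the cited models.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of beads from their centroid (equal masses assumed)."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    msd = sum(sum((c[k] - center[k]) ** 2 for k in range(3)) for c in coords) / n
    return math.sqrt(msd)


# Four beads on a straight line with a 3.8 Angstrom spacing (hypothetical, fully extended chain).
rod = [(3.8 * i, 0.0, 0.0) for i in range(4)]
print(radius_of_gyration(rod))
```

In practice, the ensemble-averaged Rg, or its full distribution, over many sampled conformations is what is compared with SAXS-derived values.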

An important application of CG models is the study of liquid–liquid phase separation (LLPS), which is frequently mediated by IDPs [29,44,45,46]. Dignon et al. proposed a residue-based C_α-only CG model to represent the disordered low-complexity domain of the RNA-binding protein FUS (FUS-LCD) and the DEAD-box helicase protein LAF-1 in LLPS [177]. The model uses the Debye–Hückel approximation for long-range electrostatic interactions and either the hydrophobicity scale model [177] or the Kim–Hummer model [178] for short-range residue–residue interactions. The results indicated that both approaches could reproduce the experimentally observed phase behaviors and the changes in phase diagrams caused by mutation. However, the authors noted that the temperature-dependent phase behaviors did not match the experimental absolute temperatures and that the ionic strength dependence was not fully tested due to the breakdown of the Debye–Hückel electrostatic potential. The model could thus be further refined. For example, additional residue-type parameters have been introduced to account for phosphorylation and acetylation effects [179], which allows in-depth investigation of how post-translational modifications may control LLPS behaviors. Recently, Latham and Zhang re-tuned the model of Dignon et al. to better reproduce the Rg distributions of a set of folded and disordered proteins [180]. The resulting maximum entropy optimized force field (MOFF) includes a new residue–residue interaction matrix and is more transferable for modeling both globular proteins and IDPs. Hummer and co-workers modified the MARTINI model by re-scaling the solute–solute non-bonded Lennard-Jones potentials to reproduce the experimental transfer free energies between the dilute and dense liquid phases. They also proposed a more general approach for tuning CG models with MD for LLPS-related studies, which involves optimizing and balancing the solute–solute and solute–solvent interactions and then matching the CG data to atomistic simulation or experimental results [177]. The resulting MARTINI-IDP model was shown to successfully simulate droplet formation and capture reversible phase transformations. Such exciting progress highlights the strong potential of simple C_α-only CG models in molecular simulations of LLPS involving IDPs. Nonetheless, the difficulty in describing local structure propensities (such as transient helices) with the C_α-only representation may be an important limitation for studying certain specific effects of IDPs in LLPS.
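The Debye–Hückel treatment used in such C_α-only models amounts to a Coulomb interaction damped exponentially over the Debye screening length. The Python sketch below shows this generic form together with the standard expression for the screening length as a function of ionic strength. The dielectric constant of 80, the temperature, and the function names are assumptions for illustration, and the short-range hydrophobicity-scale or Kim–Hummer terms are not included.

```python
import math

COULOMB = 332.0637  # Coulomb constant, kcal*Angstrom/(mol*e^2)


def debye_length(ionic_strength_molar, eps_r=80.0, temperature=300.0):
    """Debye screening length in Angstrom for a given ionic strength in mol/L."""
    eps0 = 8.8541878128e-12   # vacuum permittivity, F/m
    kb = 1.380649e-23         # Boltzmann constant, J/K
    e = 1.602176634e-19       # elementary charge, C
    na = 6.02214076e23        # Avogadro constant, 1/mol
    ionic_si = ionic_strength_molar * 1000.0  # mol/m^3
    length_m = math.sqrt(eps0 * eps_r * kb * temperature / (2.0 * na * e * e * ionic_si))
    return length_m * 1e10


def dh_energy(q_i, q_j, r, screening_length, eps_r=80.0):
    """Screened Coulomb pair energy in kcal/mol for bead charges in units of e."""
    return COULOMB * q_i * q_j / (eps_r * r) * math.exp(-r / screening_length)


lam = debye_length(0.1)  # roughly 1 nm at 100 mM ionic strength
print(lam, dh_energy(1.0, -1.0, 10.0, lam))
```

Since the screening length shrinks with increasing salt concentration, electrostatic attraction between oppositely charged residues becomes short-ranged at high ionic strength within this approximation.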

6. Concluding Remarks

Effective and reliable molecular simulations are crucial for characterizing the details of disordered conformational ensembles of IDPs in isolation, dynamic complexes, or biological condensates. Such computational capability, integrated with experimental studies, makes it possible to determine how the dynamic protein states may respond to various cellular stimuli in signaling and regulation and to more rigorously establish the (dynamic) structure–function relationship of IDPs and IDRs. In this review, we highlight recent advances in meeting two central requirements for reliable IDP simulations, namely accurate force fields, for describing the energetics of protein conformations, and efficient MD simulation methods, for the adequate sampling of relevant conformational space. The need to simulate disordered protein ensembles has played a key role in driving significant improvements in empirical protein force fields in recent years. Many of these force fields are now well balanced for both folded and disordered proteins. Force field development itself has directly benefited from many advanced sampling methods that allow for accurate calculation of the conformational equilibria of model peptides and proteins during force field recalibration. These enhanced sampling techniques rely on carefully designed biasing potentials, modifications to the original Hamiltonian, and/or tempering to accelerate barrier crossing and generate statistically meaningful ensembles with far less computation. Many of the enhanced sampling strategies are complementary and can be readily integrated to further improve efficiency. Together, the improved protein force fields and powerful sampling techniques now allow realistic simulations of the conformations and interactions of at least modest-sized IDPs at the atomistic level.

Nonetheless, the high dimensionality and complex nature of disordered protein conformations continue to push the limits of force fields and sampling capability. In particular, none of these methods alone appears to be generally applicable to simulating IDPs that are large (e.g., more than a few dozen residues) and/or contain nontrivial residual structural features. There remain an urgent need and exciting opportunities for developing much more effective methods for sampling IDP conformations and dynamic interactions, such as through the careful integration of various existing CV-dependent and CV-free strategies. A particularly promising direction is to leverage machine learning to design superior adaptive sampling strategies that can generate optimal bias potentials on the fly to maximally drive the exploration of the free energy landscape.

Many protein models with various levels of resolution are also being developed and fine-tuned for IDP simulations, particularly for studying biological condensates. These models range from C_α-only single-bead protein models to implicit solvent ones with atomistic proteins. Many of the current models are geared towards modeling systems with minimal residual structures. A key challenge in the multi-scale modeling and simulation of IDPs is finding the optimal compromise between resolution, accuracy, and efficiency for the particular problem of interest. Nonetheless, it can be expected that multi-scale simulations will continue to play a central role in studying IDPs and dynamic interactions.

Author Contributions

Writing—original draft preparation, X.G. and Y.Z.; writing—review and editing, J.C.; visualization, X.G. and Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Institutes of Health (GM114300) and National Science Foundation (MCB 1817332).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

Csizmok, V.; Follis, A.V.; Kriwacki, R.W.; Forman-Kay, J.D. Dynamic Protein Interaction Networks and New Structural Paradigms in Signaling. Chem. Rev. 2016, 116, 6424–6462.
Oldfield, C.J.; Dunker, A.K. Intrinsically Disordered Proteins and Intrinsically Disordered Protein Regions. Annu. Rev. Biochem. 2014, 83, 553–584.
Wright, P.E.; Dyson, H.J. Intrinsically disordered proteins in cellular signalling and regulation. Nat. Rev. Mol. Cell Biol. 2015, 16, 18–29.
Uversky, V.N. Intrinsically disordered proteins and their (disordered) proteomes in neurodegenerative disorders. Front. Aging Neurosci. 2015, 7, 18.
Dyson, H.J.; Wright, P.E. Intrinsically unstructured proteins and their functions. Nat. Rev. Mol. Cell Biol. 2005, 6, 197–208.
Owen, I.; Shewmaker, F. The Role of Post-Translational Modifications in the Phase Transitions of Intrinsically Disordered Proteins. Int. J. Mol. Sci. 2019, 20, 5501.
Chen, J. Towards the physical basis of how intrinsic disorder mediates protein function. Arch. Biochem. Biophys. 2012, 524, 123–131.
Das, R.K.; Ruff, K.M.; Pappu, R.V. Relating sequence encoded information to form and function of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2015, 32, 102–112.
Hatos, A.; Hajdu-Soltesz, B.; Monzon, A.M.; Palopoli, N.; Alvarez, L.; Aykac-Fas, B.; Bassot, C.; Benitez, G.I.; Bevilacqua, M.; Chasapi, A.; et al. DisProt: Intrinsic protein disorder annotation in 2020. Nucleic Acids Res. 2020, 48, D269–D276.
Vacic, V.; Iakoucheva, L.M. Disease mutations in disordered regions—Exception to the rule? Mol. Biosyst. 2012, 8, 27–32.
Kulkarni, P.; Uversky, V.N. Intrinsically Disordered Proteins in Chronic Diseases. Biomolecules 2019, 9, 147.
Oldfield, C.J.; Cheng, Y.; Cortese, M.S.; Brown, C.J.; Uversky, V.N.; Dunker, A.K. Comparing and Combining Predictors of Mostly Disordered Proteins. Biochemistry 2005, 44, 1989–2000.
Chen, J.; Liu, X.; Chen, J. Targeting Intrinsically Disordered Proteins through Dynamic Interactions. Biomolecules 2020, 10, 743.
Mittag, T.; Marsh, J.; Grishaev, A.; Orlicky, S.; Lin, H.; Sicheri, F.; Tyers, M.; Forman-Kay, J.D. Structure/Function Implications in a Dynamic Complex of the Intrinsically Disordered Sic1 with the Cdc4 Subunit of an SCF Ubiquitin Ligase. Structure 2010, 18, 494–506.
McDowell, C.; Chen, J.; Chen, J. Potential Conformational Heterogeneity of p53 Bound to S100B(betabeta). J. Mol. Biol. 2013, 425, 999–1010.
Wu, H.; Fuxreiter, M. The Structure and Dynamics of Higher-Order Assemblies: Amyloids, Signalosomes, and Granules. Cell 2016, 165, 1055–1066.
Krois, A.S.; Ferreon, J.C.; Martinez-Yamout, M.A.; Dyson, H.J.; Wright, P.E. Recognition of the disordered p53 transactivation domain by the transcriptional adapter zinc finger domains of CREB-binding protein. Proc. Natl. Acad. Sci. USA 2016, 113, E1853–E1862.
Csizmok, V.; Orlicky, S.; Cheng, J.; Song, J.; Bah, A.; Delgoshaie, N.; Lin, H.; Mittag, T.; Sicheri, F.; Chan, H.S.; et al. An allosteric conduit facilitates dynamic multisite substrate recognition by the SCFCdc4 ubiquitin ligase. Nat. Commun. 2017, 8, 13943.
Borgia, A.; Borgia, M.B.; Bugge, K.; Kissling, V.M.; Heidarsson, P.O.; Fernandes, C.B.; Sottini, A.; Soranno, A.; Buholzer, K.J.; Nettels, D.; et al. Extreme disorder in an ultrahigh-affinity protein complex. Nature 2018, 555, 61–66.
Clark, S.; Myers, J.B.; King, A.; Fiala, R.; Novacek, J.; Pearce, G.; Heierhorst, J.; Reichow, S.L.; Barbar, E.J. Multivalency regulates activity in an intrinsically disordered transcription factor. Elife 2018, 7, e36258.
Fuxreiter, M. Fuzziness in Protein Interactions-A Historical Perspective. J. Mol. Biol. 2018, 430, 2278–2287.
Weng, J.; Wang, W. Dynamic multivalent interactions of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2019, 62, 9–13.
Miskei, M.; Antal, C.; Fuxreiter, M. FuzDB: Database of fuzzy complexes, a tool to develop stochastic structure-function relationships for protein complexes and higher-order assemblies. Nucleic Acids Res. 2017, 45, D228–D235.
Ganguly, D.; Chen, J. Structural interpretation of paramagnetic relaxation enhancement-derived distances for disordered protein states. J. Mol. Biol. 2009, 390, 467–477.
Fisher, C.K.; Stultz, C.M. Constructing ensembles for intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2011, 21, 426–431.
Ferreon, A.C.; Ferreon, J.C.; Wright, P.E.; Deniz, A.A. Modulation of allostery by protein intrinsic disorder. Nature 2013, 498, 390–394.
Garcia-Pino, A.; Balasubramanian, S.; Wyns, L.; Gazit, E.; De Greve, H.; Magnuson, R.D.; Charlier, D.; van Nuland, N.A.; Loris, R. Allostery and intrinsic disorder mediate transcription regulation by conditional cooperativity. Cell 2010, 142, 101–111.
Berlow, R.B.; Dyson, H.J.; Wright, P.E. Expanding the Paradigm: Intrinsically Disordered Proteins and Allosteric Regulation. J. Mol. Biol. 2018, 430, 2309–2320.
Levine, Z.A.; Shea, J.-E. Simulations of disordered proteins and systems with conformational heterogeneity. Curr. Opin. Struct. Biol. 2017, 43, 95–103.
Knott, M.; Best, R.B. A preformed binding interface in the unbound ensemble of an intrinsically disordered protein: Evidence from molecular simulations. PLoS Comput. Biol. 2012, 8, e1002605.
Mao, A.H.; Crick, S.L.; Vitalis, A.; Chicoine, C.L.; Pappu, R.V. Net charge per residue modulates conformational ensembles of intrinsically disordered proteins. Proc. Natl. Acad. Sci. USA 2010, 107, 8183–8188.
Ganguly, D.; Chen, J. Atomistic details of the disordered states of KID and pKID. implications in coupled binding and folding. J. Am. Chem. Soc. 2009, 131, 5214–5223.
Zhang, W.; Ganguly, D.; Chen, J. Residual structures, conformational fluctuations, and electrostatic interactions in the synergistic folding of two intrinsically disordered proteins. PLoS Comput. Biol. 2012, 8, e1002353.
Zhang, W.; Chen, J. Accelerate Sampling in Atomistic Energy Landscapes Using Topology-Based Coarse-Grained Models. J. Chem. Theory Comput. 2014, 10, 918–923.
Moritsugu, K.; Terada, T.; Kidera, A. Scalable free energy calculation of proteins via multiscale essential sampling. J. Chem. Phys. 2010, 133, 224105.
Sugita, Y.; Okamoto, Y. Replica-exchange molecular dynamics method for protein folding. Chem. Phys. Lett. 1999, 314, 141–151.
Liu, P.; Kim, B.; Friesner, R.A.; Berne, B.J. Replica exchange with solute tempering: A method for sampling biological systems in explicit water. Proc. Natl. Acad. Sci. USA 2005, 102, 13749–13754.
Mittal, A.; Lyle, N.; Harmon, T.S.; Pappu, R.V. Hamiltonian Switch Metropolis Monte Carlo Simulations for Improved Conformational Sampling of Intrinsically Disordered Regions Tethered to Ordered Domains of Proteins. J. Chem. Theory Comput. 2014, 10, 3550–3562.
Peter, E.K.; Shea, J.E. A hybrid MD-kMC algorithm for folding proteins in explicit solvent. Phys. Chem. Chem. Phys. 2014, 16, 6430–6440.
Zhang, C.; Ma, J. Enhanced sampling and applications in protein folding in explicit solvent. J. Chem. Phys. 2010, 132, 244101.
Zheng, L.Q.; Yang, W. Practically Efficient and Robust Free Energy Calculations: Double-Integration Orthogonal Space Tempering. J. Chem. Theory Comput. 2012, 8, 810–823. [Google Scholar] [CrossRef]
Best, R.B. Computational and theoretical advances in studies of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2017, 42, 147–154. [Google Scholar] [CrossRef]
Kmiecik, S.; Gront, D.; Kolinski, M.; Wieteska, L.; Dawid, A.E.; Kolinski, A. Coarse-Grained Protein Models and Their Applications. Chem. Rev. 2016, 116, 7898–7936. [Google Scholar] [CrossRef] [Green Version]
Bhattacharya, S.; Lin, X. Recent Advances in Computational Protocols Addressing Intrinsically Disordered Proteins. Biomolecules 2019, 9, 146. [Google Scholar] [CrossRef] [Green Version]
Wang, W. Recent advances in atomic molecular dynamics simulation of intrinsically disordered proteins. Phys. Chem. Chem. Phys. 2021, 23, 777–784. [Google Scholar] [CrossRef] [PubMed]
Shea, J.-E.; Best, R.B.; Mittal, J. Physics-based computational and theoretical approaches to intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2021, 67, 219–225. [Google Scholar] [CrossRef]
Arai, M. Unified understanding of folding and binding mechanisms of globular and intrinsically disordered proteins. Biophys. Rev. 2018, 10, 163–181. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Rauscher, S.; Gapsys, V.; Gajda, M.J.; Zweckstetter, M.; de Groot, B.L.; Grubmüller, H. Structural Ensembles of Intrinsically Disordered Proteins Depend Strongly on Force Field: A Comparison to Experiment. J. Chem. Theory Comput. 2015, 11, 5513–5524. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Best, R.B.; Zhu, X.; Shim, J.; Lopes, P.E.; Mittal, J.; Feig, M.; MacKerell, A.D., Jr. Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone phi, psi and side-chain chi(1) and chi(2) dihedral angles. J. Chem. Theory Comput. 2012, 8, 3257–3273. [Google Scholar] [CrossRef] [Green Version]
Robustelli, P.; Piana, S.; Shaw, D.E. Developing a molecular dynamics force field for both folded and disordered protein states. Proc. Natl. Acad. Sci. USA 2018, 115, E4758–E4766. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Huang, J.; MacKerell, A.D. Force field development and simulations of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2018, 48, 40–48. [Google Scholar] [CrossRef] [PubMed]
Piana, S.; Donchev, A.G.; Robustelli, P.; Shaw, D.E. Water Dispersion Interactions Strongly Influence Simulated Structural Properties of Disordered Protein States. J. Phys. Chem. B 2015, 119, 5113–5123. [Google Scholar] [CrossRef]
Wu, H.-N.; Jiang, F.; Wu, Y.-D. Significantly Improved Protein Folding Thermodynamics Using a Dispersion-Corrected Water Model and a New Residue-Specific Force Field. J. Phys. Chem. Lett. 2017, 8, 3199–3205. [Google Scholar] [CrossRef] [PubMed]
Mu, J.; Liu, H.; Zhang, J.; Luo, R.; Chen, H.-F. Recent Force Field Strategies for Intrinsically Disordered Proteins. J. Chem. Inf. Modeling 2021, 61, 1037–1047. [Google Scholar] [CrossRef] [PubMed]
Song, D.; Liu, H.; Luo, R.; Chen, H.-F. Environment-Specific Force Field for Intrinsically Disordered and Ordered Proteins. J. Chem. Inf. Modeling 2020, 60, 2257–2267. [Google Scholar] [CrossRef] [PubMed]
Yang, S.; Liu, H.; Zhang, Y.; Lu, H.; Chen, H. Residue-Specific Force Field Improving the Sample of Intrinsically Disordered Proteins and Folded Proteins. J. Chem. Inf. Modeling 2019, 59, 4793–4805. [Google Scholar] [CrossRef] [PubMed]
Huang, J.; Rauscher, S.; Nawrocki, G.; Ran, T.; Feig, M.; de Groot, B.L.; Grubmüller, H.; MacKerell, A.D. CHARMM36m: An improved force field for folded and intrinsically disordered proteins. Nat. Methods 2017, 14, 71–73. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Tian, C.; Kasavajhala, K.; Belfon, K.A.A.; Raguette, L.; Huang, H.; Migues, A.N.; Bickel, J.; Wang, Y.; Pincay, J.; Wu, Q.; et al. ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution. J. Chem. Theory Comput. 2020, 16, 528–552. [Google Scholar] [CrossRef]
Rahman, M.U.; Rehman, A.U.; Liu, H.; Chen, H.-F. Comparison and Evaluation of Force Fields for Intrinsically Disordered Proteins. J. Chem. Inf. Modeling 2020, 60, 4912–4923. [Google Scholar] [CrossRef]
Abriata, L.A.; Dal Peraro, M. Assessment of transferable forcefields for protein simulations attests improved description of disordered states and secondary structure propensities, and hints at multi-protein systems as the next challenge for optimization. Comput. Struct. Biotechnol. J. 2021, 19, 2626–2636. [Google Scholar] [CrossRef]
Piana, S.; Robustelli, P.; Tan, D.; Chen, S.; Shaw, D.E. Development of a Force Field for the Simulation of Single-Chain Proteins and Protein–Protein Complexes. J. Chem. Theory Comput. 2020, 16, 2494–2507. [Google Scholar] [CrossRef]
Song, D.; Luo, R.; Chen, H.-F. The IDP-Specific Force Field ff14IDPSFF Improves the Conformer Sampling of Intrinsically Disordered Proteins. J. Chem. Inf. Modeling 2017, 57, 1166–1178. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Jing, Z.; Liu, C.; Cheng, S.Y.; Qi, R.; Walker, B.D.; Piquemal, J.-P.; Ren, P. Polarizable Force Fields for Biomolecular Simulations: Recent Advances and Applications. Annu. Rev. Biophys. 2019, 48, 371–394. [Google Scholar] [CrossRef]
Bedrov, D.; Piquemal, J.-P.; Borodin, O.; MacKerell, A.D.; Roux, B.; Schröder, C. Molecular Dynamics Simulations of Ionic Liquids and Electrolytes Using Polarizable Force Fields. Chem. Rev. 2019, 119, 7940–7995. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Inakollu, V.S.S.; Geerke, D.P.; Rowley, C.N.; Yu, H. Polarisable force fields: What do they add in biomolecular simulations? Curr. Opin. Struct. Biol. 2020, 61, 182–190. [Google Scholar] [CrossRef] [PubMed]
Huang, J.; MacKerell, A.D., Jr. Induction of Peptide Bond Dipoles Drives Cooperative Helix Formation in the (AAQAA)3 Peptide. Biophys. J. 2014, 107, 991–997. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Kamenik, A.S.; Handle, P.H.; Hofer, F.; Kahler, U.; Kraml, J.; Liedl, K.R. Polarizable and non-polarizable force fields: Protein folding, unfolding, and misfolding. J. Chem. Phys. 2020, 153, 185102. [Google Scholar] [CrossRef]
Wang, A.; Zhang, Z.; Li, G. Higher Accuracy Achieved in the Simulations of Protein Structure Refinement, Protein Folding, and Intrinsically Disordered Proteins Using Polarizable Force Fields. J. Phys. Chem. Lett. 2018, 9, 7110–7116. [Google Scholar] [CrossRef]
Wang, A.; Peng, X.; Li, Y.; Zhang, D.; Zhang, Z.; Li, G. Quality of force fields and sampling methods in simulating pepX peptides: A case study for intrinsically disordered proteins. Phys. Chem. Chem. Phys. 2021, 23, 2430–2437. [Google Scholar] [CrossRef]
Yang, Y.I.; Shao, Q.; Zhang, J.; Yang, L.; Gao, Y.Q. Enhanced sampling in molecular dynamics. J. Chem. Phys. 2019, 151, 070902. [Google Scholar] [CrossRef] [Green Version]
Wang, A.H.; Zhang, Z.C.; Li, G.H. Advances in Enhanced Sampling Molecular Dynamics Simulations for Biomolecules. Chin. J. Chem. Phys. 2019, 32, 277–286. [Google Scholar] [CrossRef] [Green Version]
Barducci, A.; Bonomi, M.; Parrinello, M. Metadynamics. WIREs Comput. Mol. Sci. 2011, 1, 826–843. [Google Scholar] [CrossRef]
Barducci, A.; Bussi, G.; Parrinello, M. Well-Tempered Metadynamics: A Smoothly Converging and Tunable Free-Energy Method. Phys. Rev. Lett. 2008, 100, 020603. [Google Scholar] [CrossRef] [Green Version]
Hamelberg, D.; Mongan, J.; McCammon, J.A. Accelerated molecular dynamics: A promising and efficient simulation method for biomolecules. J. Chem. Phys. 2004, 120, 11919–11929. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Torrie, G.M.; Valleau, J.P. Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. J. Comput. Phys. 1977, 23, 187–199. [Google Scholar] [CrossRef]
Kästner, J. Umbrella sampling. WIREs Comput. Mol. Sci. 2011, 1, 932–942. [Google Scholar] [CrossRef]
Gao, Y.Q. An integrate-over-temperature approach for enhanced sampling. J. Chem. Phys. 2008, 128, 064105. [Google Scholar] [CrossRef]
MacCallum, J.L.; Muniyat, M.I.; Gaalswyk, K. Online Optimization of Total Acceptance in Hamiltonian Replica Exchange Simulations. J. Phys. Chem. B 2018, 122, 5448–5457. [Google Scholar] [CrossRef]
Liu, N.; Guo, Y.; Ning, S.; Duan, M. Phosphorylation regulates the binding of intrinsically disordered proteins via a flexible conformation selection mechanism. Commun. Chem. 2020, 3, 123. [Google Scholar] [CrossRef]
Dickson, A.; Ahlstrom, L.S.; Brooks, C.L., III. Coupled folding and binding with 2D Window-Exchange Umbrella Sampling. J. Comput. Chem. 2016, 37, 587–594. [Google Scholar] [CrossRef] [Green Version]
Sidky, H.; Chen, W.; Ferguson, A.L. Machine learning for collective variable discovery and enhanced sampling in biomolecular simulation. Mol. Phys. 2020, 118, e1737742. [Google Scholar] [CrossRef] [Green Version]
Chen, W.; Tan, A.R.; Ferguson, A.L. Collective variable discovery and enhanced sampling using autoencoders: Innovations in network architecture and error function design. J. Chem. Phys. 2018, 149, 072312. [Google Scholar] [CrossRef] [PubMed]
Marinari, E.; Parisi, G. Simulated Tempering: A New Monte Carlo Scheme. EPL (Europhys. Lett.) 1992, 19, 451–458. [Google Scholar] [CrossRef] [Green Version]
Kamiya, M.; Sugita, Y. Flexible selection of the solute region in replica exchange with solute tempering: Application to protein-folding simulations. J. Chem. Phys. 2018, 149, 072304. [Google Scholar] [CrossRef] [PubMed]
Wang, L.; Friesner, R.A.; Berne, B.J. Replica Exchange with Solute Scaling: A More Efficient Version of Replica Exchange with Solute Tempering (REST2). J. Phys. Chem. B 2011, 115, 9431–9438. [Google Scholar] [CrossRef] [Green Version]
Miao, Y.; Sinko, W.; Pierce, L.; Bucher, D.; Walker, R.C.; McCammon, J.A. Improved Reweighting of Accelerated Molecular Dynamics Simulations for Free Energy Calculation. J. Chem. Theory Comput. 2014, 10, 2677–2689. [Google Scholar] [CrossRef]
Kokubo, H.; Tanaka, T.; Okamoto, Y. Two-dimensional replica-exchange method for predicting protein–ligand binding structures. J. Comput. Chem. 2013, 34, 2601–2614. [Google Scholar] [CrossRef]
Oshima, H.; Re, S.; Sugita, Y. Replica-Exchange Umbrella Sampling Combined with Gaussian Accelerated Molecular Dynamics for Free-Energy Calculation of Biomolecules. J. Chem. Theory Comput. 2019, 15, 5199–5208. [Google Scholar] [CrossRef]
Peng, X.; Zhang, Y.; Li, Y.; Liu, Q.; Chu, H.; Zhang, D.; Li, G. Integrating Multiple Accelerated Molecular Dynamics to Improve Accuracy of Free Energy Calculations. J. Chem. Theory Comput. 2018, 14, 1216–1227. [Google Scholar] [CrossRef]
Bussi, G.; Laio, A. Using metadynamics to explore complex free-energy landscapes. Nat. Rev. Phys. 2020, 2, 200–212. [Google Scholar] [CrossRef]
Galvelis, R.; Sugita, Y. Replica state exchange metadynamics for improving the convergence of free energy estimates. J. Comput. Chem. 2015, 36, 1446–1455. [Google Scholar] [CrossRef] [PubMed]
Piana, S.; Laio, A. A Bias-Exchange Approach to Protein Folding. J. Phys. Chem. B 2007, 111, 4553–4559. [Google Scholar] [CrossRef] [Green Version]
Galvelis, R.; Re, S.; Sugita, Y. Enhanced Conformational Sampling of N-Glycans in Solution with Replica State Exchange Metadynamics. J. Chem. Theory Comput. 2017, 13, 1934–1942. [Google Scholar] [CrossRef]
Do, T.N.; Choy, W.-Y.; Karttunen, M. Binding of Disordered Peptides to Kelch: Insights from Enhanced Sampling Simulations. J. Chem. Theory Comput. 2016, 12, 395–404. [Google Scholar] [CrossRef]
Guo, J.; Zhou, H.X. Protein Allostery and Conformational Dynamics. Chem. Rev. 2016, 116, 6503–6515. [Google Scholar] [CrossRef]
Gianni, S.; Freiberger, M.I.; Jemth, P.; Ferreiro, D.U.; Wolynes, P.G.; Fuxreiter, M. Fuzziness and Frustration in the Energy Landscape of Protein Folding, Function, and Assembly. Acc. Chem. Res. 2021, 54, 1251–1259. [Google Scholar] [CrossRef]
Neupane, K.; Manuel, A.P.; Woodside, M.T. Protein folding trajectories can be described quantitatively by one-dimensional diffusion over measured energy landscapes. Nat. Phys. 2016, 12, 700–703. [Google Scholar] [CrossRef]
Pfaendtner, J.; Bonomi, M. Efficient Sampling of High-Dimensional Free-Energy Landscapes with Parallel Bias Metadynamics. J. Chem. Theory Comput. 2015, 11, 5062–5067. [Google Scholar] [CrossRef] [PubMed]
Prakash, A.; Fu, C.D.; Bonomi, M.; Pfaendtner, J. Biasing Smarter, Not Harder, by Partitioning Collective Variables into Families in Parallel Bias Metadynamics. J. Chem. Theory Comput. 2018, 14, 4985–4990. [Google Scholar] [CrossRef] [PubMed]
Awasthi, S.; Nair, N.N. Exploring high dimensional free energy landscapes: Temperature accelerated sliced sampling. J. Chem. Phys. 2017, 146, 094108. [Google Scholar] [CrossRef] [Green Version]
Chen, W.; Ferguson, A.L. Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration. J. Comput. Chem. 2018, 39, 2079–2102. [Google Scholar] [CrossRef] [Green Version]
Galvelis, R.; Sugita, Y. Neural Network and Nearest Neighbor Algorithms for Enhancing Sampling of Molecular Dynamics. J. Chem. Theory Comput. 2017, 13, 2489–2500. [Google Scholar] [CrossRef]
Salawu, E.O. DESP: Deep Enhanced Sampling of Proteins’ Conformation Spaces Using AI-Inspired Biasing Forces. Front. Mol. Biosci. 2021, 8, 121. [Google Scholar] [CrossRef] [PubMed]
Zhang, J.; Chen, M. Unfolding Hidden Barriers by Active Enhanced Sampling. Phys. Rev. Lett. 2018, 121, 010601. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Brown, S.; Head-Gordon, T. Cool walking: A new Markov chain Monte Carlo sampling method. J. Comput. Chem. 2003, 24, 68–76. [Google Scholar] [CrossRef] [PubMed]
Neal, R.M. Annealed importance sampling. Stat. Comput. 2001, 11, 125–139. [Google Scholar] [CrossRef]
Fukunishi, H.; Watanabe, O.; Takada, S. On the Hamiltonian replica exchange method for efficient sampling of biomolecular systems: Application to protein structure prediction. J. Chem. Phys. 2002, 116, 9058–9067. [Google Scholar] [CrossRef]
Huang, Y.-m.M.; McCammon, J.A.; Miao, Y. Replica Exchange Gaussian Accelerated Molecular Dynamics: Improved Enhanced Sampling and Free Energy Calculation. J. Chem. Theory Comput. 2018, 14, 1853–1864. [Google Scholar] [CrossRef]
Wang, J.; Arantes, P.R.; Bhattarai, A.; Hsu, R.V.; Pawnikar, S.; Huang, Y.-m.M.; Palermo, G.; Miao, Y. Gaussian accelerated molecular dynamics: Principles and applications. WIREs Comput. Mol. Sci. 2021, 11, e1521. [Google Scholar] [CrossRef]
Miao, Y.; Bhattarai, A.; Wang, J. Ligand Gaussian Accelerated Molecular Dynamics (LiGaMD): Characterization of Ligand Binding Thermodynamics and Kinetics. J. Chem. Theory Comput. 2020, 16, 5526–5547. [Google Scholar] [CrossRef] [PubMed]
Wang, J.; Miao, Y. Peptide Gaussian accelerated molecular dynamics (Pep-GaMD): Enhanced sampling and free energy and kinetics calculations of peptide binding. J. Chem. Phys. 2020, 153, 154109. [Google Scholar] [CrossRef] [PubMed]
Liu, X.; Chen, J. Residual Structures and Transient Long-Range Interactions of p53 Transactivation Domain: Assessment of Explicit Solvent Protein Force Fields. J. Chem. Theory Comput. 2019, 15, 4708–4720. [Google Scholar] [CrossRef]
Shrestha, U.R.; Smith, J.C.; Petridis, L. Full structural ensembles of intrinsically disordered proteins from unbiased molecular dynamics simulations. Commun. Biol. 2021, 4, 243. [Google Scholar] [CrossRef] [PubMed]
Hicks, A.; Zhou, H.-X. Temperature-induced collapse of a disordered peptide observed by three sampling methods in molecular dynamics simulations. J. Chem. Phys. 2018, 149, 072313. [Google Scholar] [CrossRef] [PubMed]
Brown, A.H.; Rodger, P.M.; Evans, J.S.; Walsh, T.R. Equilibrium Conformational Ensemble of the Intrinsically Disordered Peptide n16N: Linking Subdomain Structures and Function in Nacre. Biomacromolecules 2014, 15, 4467–4479. [Google Scholar] [CrossRef]
Pang, X.; Zhou, H.-X. Disorder-to-Order Transition of an Active-Site Loop Mediates the Allosteric Activation of Sortase A. Biophys. J. 2015, 109, 1706–1715. [Google Scholar] [CrossRef] [Green Version]
Liu, X.; Jia, Z.; Chen, J. Enhanced Sampling of Intrinsic Structural Heterogeneity of the BH3-Only Protein Binding Interface of Bcl-xL. J. Phys. Chem. B 2017, 121, 9160–9168. [Google Scholar] [CrossRef]
Liang, C.; Savinov, S.N.; Fejzo, J.; Eyles, S.J.; Chen, J. Modulation of Amyloid-β42 Conformation by Small Molecules Through Nonspecific Binding. J. Chem. Theory Comput. 2019, 15, 5169–5174. [Google Scholar] [CrossRef]
Liu, X.; Chen, J. Modulation of p53 Transactivation Domain Conformations by Ligand Binding and Cancer-Associated Mutations. Pac. Symp. Biocomput. 2020, 25, 195–206. [Google Scholar]
Schrag, L.G.; Liu, X.; Thevarajan, I.; Prakash, O.; Zolkiewski, M.; Chen, J. Cancer-Associated Mutations Perturb the Disordered Ensemble and Interactions of the Intrinsically Disordered p53 Transactivation Domain. J. Mol. Biol. 2021, 433, 167048. [Google Scholar] [CrossRef]
Zhao, J.; Blayney, A.; Liu, X.; Gandy, L.; Jin, W.; Yan, L.; Ha, J.H.; Canning, A.J.; Connelly, M.; Yang, C.; et al. EGCG binds intrinsically disordered N-terminal domain of p53 and disrupts p53-MDM2 interaction. Nat. Commun. 2021, 12, 986. [Google Scholar] [CrossRef]
Wang, L.; Wu, Y.; Deng, Y.; Kim, B.; Pierce, L.; Krilov, G.; Lupyan, D.; Robinson, S.; Dahlgren, M.K.; Greenwood, J.; et al. Accurate and Reliable Prediction of Relative Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J. Am. Chem. Soc. 2015, 137, 2695–2703. [Google Scholar] [CrossRef] [Green Version]
Zhou, H.X. Theoretical frameworks for multiscale modeling and simulation. Curr. Opin. Struct. Biol. 2014, 25C, 67–76. [Google Scholar] [CrossRef] [Green Version]
Lee, K.H.; Chen, J.H. Multiscale Enhanced Sampling of Intrinsically Disordered Protein Conformations. J. Comput. Chem. 2016, 37, 550–557. [Google Scholar] [CrossRef]
Liu, X.R.; Gong, X.P.; Chen, J.H. Accelerating atomistic simulations of proteins using multiscale enhanced sampling with independent tempering. J. Comput. Chem. 2021, 42, 358–364. [Google Scholar] [CrossRef]
Lee, K.H.; Chen, J.H. Optimization of the GBMV2 implicit solvent force field for accurate simulation of protein conformational equilibria. J. Comput. Chem. 2017, 38, 1332–1341. [Google Scholar] [CrossRef]
Liu, Y.; Pezeshkian, W.; Barnoud, J.; de Vries, A.H.; Marrink, S.J. Coupling Coarse-Grained to Fine-Grained Models via Hamiltonian Replica Exchange. J. Chem. Theory Comput. 2020, 16, 5313–5322. [Google Scholar] [CrossRef] [PubMed]
Yang, Y.I.; Niu, H.; Parrinello, M. Combining Metadynamics and Integrated Tempering Sampling. J. Phys. Chem. Lett. 2018, 9, 6426–6430. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Shirts, M.R.; Chodera, J.D. Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 2008, 129, 124105. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Kumar, S.; Rosenberg, J.M.; Bouzida, D.; Swendsen, R.H.; Kollman, P.A. The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J. Comput. Chem. 1992, 13, 1011–1021. [Google Scholar] [CrossRef]
Sinko, W.; Miao, Y.; de Oliveira, C.A.F.; McCammon, J.A. Population Based Reweighting of Scaled Molecular Dynamics. J. Phys. Chem. B 2013, 117, 12759–12768. [Google Scholar] [CrossRef]
Ilie, I.M.; Caflisch, A. Simulation Studies of Amyloidogenic Polypeptides and Their Aggregates. Chem. Rev. 2019, 119, 6956–6993. [Google Scholar] [CrossRef]
Zhou, H.X.; Pang, X. Electrostatic Interactions in Protein Structure, Folding, Binding, and Condensation. Chem. Rev. 2018, 118, 1691–1741. [Google Scholar] [CrossRef]
Fassler, J.S.; Skuodas, S.; Weeks, D.L.; Phillips, B.T. Protein Aggregation and Disaggregation in Cells and Development. J. Mol. Biol. 2021, 433, 167215. [Google Scholar] [CrossRef] [PubMed]
Alberti, S.; Gladfelter, A.; Mittag, T. Considerations and Challenges in Studying Liquid-Liquid Phase Separation and Biomolecular Condensates. Cell 2019, 176, 419–434. [Google Scholar] [CrossRef] [Green Version]
Holehouse, A.S.; Pappu, R.V. Functional Implications of Intracellular Phase Transitions. Biochemistry 2018, 57, 2415–2423. [Google Scholar] [CrossRef] [PubMed]
Brangwynne, C.P.; Tompa, P.; Pappu, R.V. Polymer physics of intracellular phase transitions. Nat. Phys. 2015, 11, 899–904. [Google Scholar] [CrossRef]
Mathieu, C.; Pappu, R.V.; Taylor, J.P. Beyond aggregation: Pathological phase transitions in neurodegenerative disease. Science 2020, 370, 56–60. [Google Scholar] [CrossRef] [PubMed]
Chen, J.; Brooks, C.L.; Khandogin, J. Recent advances in implicit solvent based methods for biomolecular simulations. Curr. Opin. Struct. Biol. 2008, 18, 140–148. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Haberthür, U.; Caflisch, A. FACTS: Fast analytical continuum treatment of solvation. J. Comput. Chem. 2008, 29, 701–715. [Google Scholar] [CrossRef] [PubMed]
Hawkins, G.D.; Cramer, C.J.; Truhlar, D.G. Parametrized Models of Aqueous Free Energies of Solvation Based on Pairwise Descreening of Solute Atomic Charges from a Dielectric Medium. J. Phys. Chem. 1996, 100, 19824–19839. [Google Scholar] [CrossRef]
Onufriev, A.; Bashford, D.; Case, D.A. Exploring protein native states and large-scale conformational changes with a modified generalized born model. Proteins Struct. Funct. Bioinform. 2004, 55, 383–394. [Google Scholar] [CrossRef] [Green Version]
Mongan, J.; Simmerling, C.; McCammon, J.A.; Case, D.A.; Onufriev, A. Generalized Born Model with a Simple, Robust Molecular Volume Correction. J. Chem. Theory Comput. 2007, 3, 156–169. [Google Scholar] [CrossRef] [PubMed]
Nguyen, H.; Roe, D.R.; Simmerling, C. Improved Generalized Born Solvent Model Parameters for Protein Simulations. J. Chem. Theory Comput. 2013, 9, 2020–2034. [Google Scholar] [CrossRef] [Green Version]
Gallicchio, E.; Levy, R.M. AGBNP: An analytic implicit solvent model suitable for molecular dynamics simulations and high-resolution modeling. J. Comput. Chem. 2004, 25, 479–499. [Google Scholar] [CrossRef] [PubMed]
Gallicchio, E.; Paris, K.; Levy, R.M. The AGBNP2 Implicit Solvation Model. J. Chem. Theory Comput. 2009, 5, 2544–2564. [Google Scholar] [CrossRef]
Im, W.; Lee, M.S.; Brooks, C.L., III. Generalized born model with a simple smoothing function. J. Comput. Chem. 2003, 24, 1691–1702. [Google Scholar] [CrossRef]
Lee, M.S.; Salsbury, F.R.; Brooks, C.L. Novel generalized Born methods. J. Chem. Phys. 2002, 116, 10606–10614. [Google Scholar] [CrossRef]
Lee, M.S.; Feig, M.; Salsbury, F.R., Jr.; Brooks, C.L., III. New analytic approximation to the standard molecular volume definition and its application to generalized Born calculations. J. Comput. Chem. 2003, 24, 1348–1356. [Google Scholar] [CrossRef]
Chen, J.; Im, W.; Brooks, C.L. Balancing solvation and intramolecular interactions: Toward a consistent generalized born force field. J. Am. Chem. Soc. 2006, 128, 3728–3736. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Chen, J.; Brooks, C.L. Implicit modeling of nonpolar solvation for simulating protein folding and conformational transitions. Phys. Chem. Chem. Phys. 2008, 10, 471–481. [Google Scholar] [CrossRef]
Chen, J. Intrinsically disordered p53 extreme C-terminus binds to S100B(ββ) through “fly-casting”. J. Am. Chem. Soc. 2009, 131, 2088–2089. [Google Scholar] [CrossRef]
Wang, Y.; Fisher, J.C.; Mathew, R.; Ou, L.; Otieno, S.; Sublet, J.; Xiao, L.; Chen, J.; Roussel, M.F.; Kriwacki, R.W. Intrinsic disorder mediates the diverse regulatory functions of the Cdk inhibitor p21. Nat. Chem. Biol. 2011, 7, 214–221. [Google Scholar] [CrossRef] [Green Version]
Ganguly, D.; Chen, J. Modulation of the disordered conformational ensembles of the p53 transactivation domain by cancer-associated mutations. PLoS Comput. Biol. 2015, 11, e1004247. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Nguyen, H.; Maier, J.; Huang, H.; Perrone, V.; Simmerling, C. Folding Simulations for Proteins with Diverse Topologies Are Accessible in Days with a Physics-Based Force Field and Implicit Solvent. J. Am. Chem. Soc. 2014, 136, 13959–13962. [Google Scholar] [CrossRef] [Green Version]
Maffucci, I.; Contini, A. An Updated Test of AMBER Force Fields and Implicit Solvent Models in Predicting the Secondary Structure of Helical, β-Hairpin, and Intrinsically Disordered Peptides. J. Chem. Theory Comput. 2016, 12, 714–727. [Google Scholar] [CrossRef] [PubMed]
Tao, P.; Xiao, Y. Using the generalized Born surface area model to fold proteins yields more effective sampling while qualitatively preserving the folding landscape. Phys. Rev. E 2020, 101, 062417. [Google Scholar] [CrossRef]
Gong, X.; Chiricotto, M.; Liu, X.; Nordquist, E.; Feig, M.; Brooks, C.L., III; Chen, J. Accelerating the Generalized Born with Molecular Volume and Solvent Accessible Surface Area Implicit Solvent Model Using Graphics Processing Units. J. Comput. Chem. 2020, 41, 830–838. [Google Scholar] [CrossRef] [PubMed]
Vitalis, A.; Pappu, R.V. ABSINTH: A new continuum solvation model for simulations of polypeptides in aqueous solutions. J. Comput. Chem. 2009, 30, 673–699. [Google Scholar] [CrossRef] [Green Version]
Vitalis, A.; Caflisch, A. Micelle-Like Architecture of the Monomer Ensemble of Alzheimer’s Amyloid-β Peptide in Aqueous Solution and Its Implications for Aβ Aggregation. J. Mol. Biol. 2010, 403, 148–165. [Google Scholar] [CrossRef]
Mittal, A.; Holehouse, A.S.; Cohan, M.C.; Pappu, R.V. Sequence-to-Conformation Relationships of Disordered Regions Tethered to Folded Domains of Proteins. J. Mol. Biol. 2018, 430, 2403–2421. [Google Scholar] [CrossRef]
Choi, J.-M.; Pappu, R.V. Improvements to the ABSINTH Force Field for Proteins Based on Experimentally Derived Amino Acid Specific Backbone Conformational Statistics. J. Chem. Theory Comput. 2019, 15, 1367–1382. [Google Scholar] [CrossRef] [PubMed]
Pak, A.J.; Voth, G.A. Advances in coarse-grained modeling of macromolecular complexes. Curr. Opin. Struct. Biol. 2018, 52, 119–126. [Google Scholar] [CrossRef] [PubMed]
Wolynes, P.G. Recent successes of the energy landscape theory of protein folding and function. Q. Rev. Biophys. 2005, 38, 405–410. [Google Scholar] [CrossRef]
Hills, R.D.; Brooks, C.L. Insights from Coarse-Grained Gō Models for Protein Folding and Dynamics. Int. J. Mol. Sci. 2009, 10, 889. [Google Scholar] [CrossRef] [Green Version]
Law, S.M.; Gagnon, J.K.; Mapp, A.K.; Brooks, C.L., III. Prepaying the entropic cost for allosteric regulation in KIX. Proc. Natl. Acad. Sci. USA 2014, 111, 12067–12072. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Chu, X.; Wang, J. Position-, disorder-, and salt-dependent diffusion in binding-coupled-folding of intrinsically disordered proteins. Phys. Chem. Chem. Phys. 2019, 21, 5634–5645. [Google Scholar] [CrossRef] [PubMed]
Ganguly, D.; Otieno, S.; Waddell, B.; Iconaru, L.; Kriwacki, R.W.; Chen, J. Electrostatically Accelerated Coupled Binding and Folding of Intrinsically Disordered Proteins. J. Mol. Biol. 2012, 422, 674–684. [Google Scholar] [CrossRef] [Green Version]
Ganguly, D.; Zhang, W.; Chen, J. Electrostatically Accelerated Encounter and Folding for Facile Recognition of Intrinsically Disordered Proteins. PLoS Comput. Biol. 2013, 9, e1003363. [Google Scholar] [CrossRef] [Green Version]
Liu, X.; Chen, J.; Chen, J. Residual Structure Accelerates Binding of Intrinsically Disordered ACTR by Promoting Efficient Folding upon Encounter. J. Mol. Biol. 2019, 431, 422–432. [Google Scholar] [CrossRef]
Liu, Z.R.; Huang, Y.Q. Advantages of proteins being disordered. Protein Sci. 2014, 23, 539–550. [Google Scholar] [CrossRef] [Green Version]
Ganguly, D.; Chen, J. Topology-based modeling of intrinsically disordered proteins: Balancing intrinsic folding and intermolecular interactions. Proteins Struct. Funct. Bioinform. 2011, 79, 1251–1266. [Google Scholar] [CrossRef] [PubMed]
Ganguly, D.; Zhang, W.; Chen, J. Synergistic folding of two intrinsically disordered proteins: Searching for conformational selection. Mol. Biosyst. 2012, 8, 198–209. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Baul, U.; Chakraborty, D.; Mugnai, M.L.; Straub, J.E.; Thirumalai, D. Sequence Effects on Size, Shape, and Structural Heterogeneity in Intrinsically Disordered Proteins. J. Phys. Chem. B 2019, 123, 3462–3474. [Google Scholar] [CrossRef] [PubMed]
Liu, X.; Chen, J. HyRes: A coarse-grained model for multi-scale enhanced sampling of disordered protein conformations. Phys. Chem. Chem. Phys. 2017, 19, 32421–32432. [Google Scholar] [CrossRef] [PubMed]
Wu, H.; Wolynes, P.G.; Papoian, G.A. AWSEM-IDP: A Coarse-Grained Force Field for Intrinsically Disordered Proteins. J. Phys. Chem. B 2018, 122, 11115–11125. [Google Scholar] [CrossRef]
Ashbaugh, H.S.; Hatch, H.W. Natively Unfolded Protein Stability as a Coil-to-Globule Transition in Charge/Hydropathy Space. J. Am. Chem. Soc. 2008, 130, 9536–9542. [Google Scholar] [CrossRef]
Kim, Y.C.; Hummer, G. Coarse-grained Models for Simulations of Multiprotein Complexes: Application to Ubiquitin Binding. J. Mol. Biol. 2008, 375, 1416–1433. [Google Scholar] [CrossRef] [Green Version]
Dignon, G.L.; Zheng, W.; Best, R.B.; Kim, Y.C.; Mittal, J. Relation between single-molecule properties and phase behavior of intrinsically disordered proteins. Proc. Natl. Acad. Sci. USA 2018, 115, 9929. [Google Scholar] [CrossRef] [Green Version]
Latham, A.P.; Zhang, B. Maximum Entropy Optimized Force Field for Intrinsically Disordered Proteins. J. Chem. Theory Comput. 2020, 16, 773–781. [Google Scholar] [CrossRef]

Figure 1. Number of articles published from 2011 to 2021 and identified with three different search keywords in the Web of Science Core Collection (as of 15 August 2021).

Figure 2. The generalized replica exchange molecular dynamics protocol based on unitless potentials, where each replica may be assigned a different temperature or a scaled potential. β_m is the inverse temperature and E_m(X) is the potential energy of the m-th condition for a given configuration X.
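The caption of Figure 2 does not spell out the exchange rule, so the following is only a minimal sketch of how an exchange attempt is commonly evaluated with unitless potentials u_m(X) = β_m E_m(X), assuming the standard Metropolis criterion. The function names, the argument layout, and the choice of kcal/mol units are illustrative assumptions and are not taken from the article.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption for this sketch.
K_B = 0.0019872041


def reduced_potential(beta, energy):
    """Unitless potential u_m(X) = beta_m * E_m(X)."""
    return beta * energy


def exchange_probability(beta_i, beta_j, e_i_xi, e_i_xj, e_j_xi, e_j_xj):
    """Metropolis probability of swapping configurations X_i and X_j
    between conditions i and j.

    e_a_xb denotes E_a(X_b): configuration X_b evaluated with the
    potential of condition a (hypothetical naming for this sketch).
    """
    # Reduced energy after the swap minus reduced energy before the swap.
    delta = (
        reduced_potential(beta_i, e_i_xj)
        + reduced_potential(beta_j, e_j_xi)
        - reduced_potential(beta_i, e_i_xi)
        - reduced_potential(beta_j, e_j_xj)
    )
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_exchange(probability, rng=random):
    """Accept the swap with the given probability."""
    return rng.random() < probability


# Temperature replica exchange is the special case E_i = E_j = E.
beta_300 = 1.0 / (K_B * 300.0)
beta_320 = 1.0 / (K_B * 320.0)
p_swap = exchange_probability(beta_300, beta_320, -100.0, -95.0, -100.0, -95.0)
```

In the temperature-only case the expression reduces to exp(-(β_i - β_j)(E(X_j) - E(X_i))), while Hamiltonian or solute-scaling variants differ only in how E_i and E_j are defined.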

Figure 3. Coarse-grain modeling for addressing various IDP-related challenges. These models can have a range of spatial resolutions and may be refined by introducing various effective potentials and/or re-calibrating the parameters of these energy terms.

Table 1. Summary of enhanced sampling methods for IDP simulations.

Types	Sampling Methods	Key Features	References
CV-based	WT-MTD	History-based adaptive bias potentials	[72,73]
	Bias-exchange MTD	Multiple replicas with bias on different CVs	[79]
	Umbrella sampling	Pre-determined bias potentials	[80]
	Machine learning	On-the-fly discovery of optimal CVs	[81,82]
Tempering-based	Simulated tempering	Random walk in the temperature space	[83]
	Parallel tempering	Multiple replicas to avoid the need for estimating the density of states	[36]
	Integrated tempering	Integral of Boltzmann distributions over a range of temperatures as the bias	[77]
	Solute tempering	Scaling the energies of only selected atoms or terms to achieve effective tempering	[37,84,85]
Accelerated MD	GaMD	Boost potentials to accelerate barrier crossing	[86]
Combinations	MSES	Temperature/Hamiltonian replica exchange simulation by coupling CG and atomistic models	[34]
	REUS/REST	Combined REUS and REST	[87]
	REUS/GaMD	Combined REUS and GaMD	[88]
	Integrated aMD	Integrated aMD and integrated tempering	[69,89]
	PT-MTD	Combined WT-MTD and PT	[79]
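As an illustration of the "history-based adaptive bias potentials" entry for WT-MTD in Table 1, the sketch below deposits Gaussians along a single collective variable s and rescales each new Gaussian height by exp(-V(s)/(k_B ΔT)), which is the usual well-tempered rule. It is a toy under stated assumptions: the one-dimensional CV, the parameter values, and all function names are hypothetical, and no molecular dynamics engine is involved.

```python
import math


def bias(s, centers, heights, sigma):
    """Current bias V(s): sum of all Gaussians deposited so far."""
    return sum(
        h * math.exp(-((s - c) ** 2) / (2.0 * sigma ** 2))
        for c, h in zip(centers, heights)
    )


def deposit(s, centers, heights, w0, sigma, kb_delta_t):
    """Add one Gaussian at CV value s using the well-tempered height.

    The height w0 is scaled by exp(-V(s) / (k_B * dT)), so regions that
    have already accumulated bias receive progressively smaller hills.
    """
    height = w0 * math.exp(-bias(s, centers, heights, sigma) / kb_delta_t)
    centers.append(s)
    heights.append(height)
    return height


# Toy usage: revisit the same CV value five times (illustrative numbers only).
centers, heights = [], []
for _ in range(5):
    deposit(0.0, centers, heights, w0=1.0, sigma=0.2, kb_delta_t=2.0)
```

Because the hill height decays where bias has accumulated, the bias converges smoothly instead of growing without bound, which is the property the "well-tempered" variant is designed to provide.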

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Gong, X.; Zhang, Y.; Chen, J. Advanced Sampling Methods for Multiscale Simulation of Disordered Proteins and Dynamic Interactions. Biomolecules 2021, 11, 1416. https://doi.org/10.3390/biom11101416

