Normal mode analysis of proteins: a comparison of rigid cluster modes with Cα coarse graining
Introduction
The search for inherent links between protein structure and function is a driving force behind the development of accurate and computationally efficient methods for describing the complex motions attainable by protein structures. X-ray crystallography can provide snapshots of protein structures in (near) equilibrium conformations. Other experimental methods such as fluorescence resonance energy transfer (FRET) and nuclear magnetic resonance (NMR) can provide partial information about large-amplitude protein motions [2], [3], [4]. Dynamical simulations of protein structures can provide crucial insights into their function which are not easily obtained experimentally. It has been observed that the low-frequency, large-amplitude motions are most closely related to protein function [5], [6], [7], [8], whereas the high-frequency localized vibrations may be more involved in signal transmission and other internal processes [9]. Computational models, normal mode analysis (NMA) in particular, enable one to derive these dynamic motions from the static conformations obtained from crystallography and are thus an essential tool in gaining insight into the structure–function relationship.
Molecular dynamics (MD) simulations rely on atomic details to predict the evolution of conformations of protein structures based on the interactions between all pairs of atoms. These simulations can be computationally prohibitive due to the high number of degrees-of-freedom (DOFs) required to capture motions of large structures and the complicated force calculations required at each iteration.
As a first level of simplification, consider the basic structure of any protein. A protein consists of a polypeptide backbone with an amino acid side chain extending from the alpha-carbon (Cα) of each peptide unit. Since the peptide-bonded backbone remains connected in all conformations (short of denaturing the protein), it is often useful to focus on the backbone structure, knowing that each side chain must closely follow its corresponding Cα. Bahar et al. [10] present a scalar model, called the Gaussian network model (GNM), which produces magnitudes of individual residue displacements consistent with experimentally derived quantities including X-ray crystallographic temperature factors [7], hydrogen exchange free energies [11], and the order parameters from NMR-relaxation measurements [12]. While these results validate the use of simple elastic networks, the GNM does not produce displacement directions. Atilgan et al. [13] present the anisotropic network model (ANM), which builds on the GNM by including parameters for displacement directions. Similarly, Kim et al. [14], [15] use Cα-NMA, in which the interactions between residues in contact are modelled with harmonic potentials, to produce three-dimensional displacements. While these methods are much faster than all-atom simulations, they are still computationally expensive for very large structures.
Cα-NMA, as mentioned above, is one of the highest resolution coarse-grained models (one Cα per grain). It uses the Cartesian displacements of each Cα to define the conformational displacement relative to the initial conformation [16]. Coarser-grained models have also been employed to capture large-amplitude motions. For example, in [17] the full protein structure is projected onto a reduced DOF subspace. The hybrid method MBO(N)D, as presented in [18], makes use of varying grain sizes to achieve desired levels of resolution according to the mobility and functional interest of each region within the structure.
Hinsen [5] uses a Fourier basis to capture a uniform vector field of displacements. This reduced degree-of-freedom model effectively captures the lowest modes, but has a number of limitations which are inherent to a Fourier basis: it is not well suited for capturing translational motion, the periodicity of the basis set must be accounted for, and displacements given in the basis coordinates do not have physical meaning.
Central to every modelling method is the choice of parameterization. In general, higher DOF parameterizations allow more complex motions to be captured (at a significant computational cost), while lower DOF parameterizations can impose unrealistic conformational constraints. The choice of parameterization allows the user to attain the desired combination of computational performance and motion resolution. This trade-off can be adjusted within a given structure so that regions of interest can be modelled with higher resolution than other regions of less importance. In this paper, we present a low DOF parameterization that produces low-frequency motions consistent with Cα-NMA, which in turn has shown strong agreement with all-atom NMA and MD simulations [5], [10], [19].
An n-residue structure requires 3n parameters for full resolution Cα-NMA. We refer to this as the standard parameterization, as it serves as our basis for comparison. This results in a computational complexity of $O(n^3)$. However, as mentioned above, the modes of interest are the low-frequency, large-amplitude motions and not the high-frequency localized vibrations (i.e. one is typically not interested in all 3n modes). We bypass these issues by using clustering algorithms to identify subsets of residues that form rigid clusters and thus move as rigid units under the modes of interest.
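The DOF reduction described above can be made concrete with a short calculation. The sketch below assumes that each rigid cluster carries six DOFs (three translations and three rotations) and that the eigenproblem is solved with a dense solver whose cost grows cubically with matrix size. The function names and the cluster count in the usage lines are hypothetical and are not taken from the paper.

```python
def dof_counts(n_residues, n_clusters):
    """Return (full, reduced) DOF counts.

    full    : 3 Cartesian DOFs per C-alpha (the standard parameterization)
    reduced : 6 rigid-body DOFs per cluster (assumption: every cluster is rigid)
    """
    return 3 * n_residues, 6 * n_clusters


def cubic_cost_ratio(n_residues, n_clusters):
    """Estimated speed-up of the eigenproblem, assuming cost ~ (matrix size)^3."""
    full, reduced = dof_counts(n_residues, n_clusters)
    return (full / reduced) ** 3


if __name__ == "__main__":
    # Hypothetical example: a 691-residue structure split into 20 clusters.
    full, reduced = dof_counts(691, 20)
    print(f"full DOFs: {full}, reduced DOFs: {reduced}")
    print(f"estimated eigenproblem speed-up: {cubic_cost_ratio(691, 20):.0f}x")
```

The estimate covers only the eigenproblem; building the reduced matrices adds its own cost, which this sketch ignores.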
When multiple conformations are known, as in the case of lactoferrin (PDB: 1LFG and 1LFH), the structure can be clustered by identifying sets of residues that experience minimal RMS deviation (after optimal alignment of each candidate cluster). For any structure, including those with only one known conformation, algorithms such as the pebble game [21] can identify rigid clusters by counting DOF constraints in a network of contacts. Proteins can also be clustered by secondary structure elements. In all cases, clustering effectively filters out the high-frequency modes and enables one to use a low DOF parameterization to more efficiently calculate the global modes.
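The RMS-deviation criterion for a candidate cluster can be sketched as follows. This is a minimal illustration that uses the Kabsch algorithm (SVD-based optimal superposition) to measure how rigidly a set of Cα positions moves between two conformations. The paper's own clustering procedure is not given in this excerpt, so the function names and the tolerance `tol` are assumptions.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (m, 3) coordinate sets after optimal rigid superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)          # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                    # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # rotation mapping Pc onto Qc
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


def moves_rigidly(P, Q, tol=1.0):
    """True if the candidate cluster deviates by less than tol (hypothetical
    threshold, same length units as the coordinates) between conformations."""
    return kabsch_rmsd(P, Q) < tol
```

A candidate cluster would be passed as the same residue subset taken from each conformation, for example `moves_rigidly(conf_a[idx], conf_b[idx])`, where `conf_a`, `conf_b` and `idx` are placeholder names for the two coordinate arrays and the residue indices.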
The rest of this paper is organized as follows. In Section 2.1, the Cα elastic network model is reviewed. In Section 2.2, the central points of clustering algorithms are discussed and cluster notation is introduced. In Section 2.3, the parameters necessary to capture the motions of this system of rigid bodies are defined. In Sections 2.4 and 2.5, the stiffness and mass matrices are obtained from the quadratic expressions for the potential and kinetic energies of the system. In Section 2.6, the mode shapes are extracted from the equation of motion (EOM) and projected onto the structure. This process requires a change of coordinates, the Gram–Schmidt orthonormalization process, and a low mode “unmixing” algorithm. In Section 3, cluster-NMA is applied to a variety of structures. Computational performance and mode accuracy are analyzed. In Section 4, a summary analysis is given.
Section snippets
Review of Cα-NMA
Since cluster-NMA will be compared to Cα-NMA, the Cα model [13], [14] is briefly reviewed here for completeness. Structures are represented as a system of point masses located at each Cα position with a network of connecting springs. Conformational changes are viewed as displacements of each Cα in the structure. The vector of generalized coordinates is $\vec{\delta} = [\vec{\delta}_1^T, \ldots, \vec{\delta}_n^T]^T \in \mathbb{R}^{3n}$, where $\vec{\delta}_i \in \mathbb{R}^3$ is the displacement of residue i and $\mathbb{R}^d$ denotes the d-dimensional space of real-valued vectors.
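A spring network of this kind can be illustrated with a short NumPy sketch. It builds an anisotropic-network-style stiffness matrix, with one harmonic spring between every pair of Cα atoms closer than a cutoff, and extracts the modes. The cutoff distance, the uniform spring constant `gamma`, and the function names are assumptions for illustration and are not values taken from the paper.

```python
import numpy as np


def anm_hessian(coords, cutoff=10.0, gamma=1.0):
    """3n x 3n stiffness matrix of a C-alpha spring network.

    coords : (n, 3) C-alpha positions
    cutoff : contact distance (assumed value)
    gamma  : uniform spring constant (assumed value)
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    K = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue                      # residues not in contact
            block = -gamma * np.outer(d, d) / r2
            K[3*i:3*i+3, 3*j:3*j+3] = block   # off-diagonal blocks
            K[3*j:3*j+3, 3*i:3*i+3] = block
            K[3*i:3*i+3, 3*i:3*i+3] -= block  # diagonal blocks balance the rows
            K[3*j:3*j+3, 3*j:3*j+3] -= block
    return K


def normal_modes(K, masses=None):
    """Squared frequencies (ascending) and displacement mode shapes (columns)."""
    n = K.shape[0] // 3
    m = np.ones(n) if masses is None else np.asarray(masses, dtype=float)
    s = 1.0 / np.sqrt(np.repeat(m, 3))        # mass-weighting factors
    evals, vecs = np.linalg.eigh(s[:, None] * K * s[None, :])
    return evals, s[:, None] * vecs
```

For a connected, non-collinear structure the first six eigenvalues are numerically zero (rigid-body translations and rotations), so the lowest internal mode is the seventh column.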
The 3n×3n mass
Application of cluster-NMA to various protein structures
In this section, we apply a high- and low-resolution helix-based clustering algorithm as described in Section 2.2. Cluster-NMA is tested on a sample set of 12 protein structures ranging in size from 85 to 1287 residues. An even coarser cluster-NMA is performed on lactoferrin (691 residues) by clustering by domain. Finally, a very coarse cluster-NMA is performed on the 8015 residue GroEL/GroES complex. The range in structure size and cluster resolutions is chosen so that computational savings
Conclusions
At the core of cluster-NMA is the rigid-body representation of the protein structure. This simultaneously reduces the number of DOFs and confines the structure to the space of low-frequency motions. Typically, NMA computational performance is limited by the eigenproblem. Cluster-NMA circumvents this limitation by using an transformation to project the structure into a reduced DOF representation. The eigenproblem is then performed in this smaller space and the results are transformed
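The idea of solving the eigenproblem in a reduced rigid-body space and mapping the result back to the residues can be sketched as follows. This is a generic rigid-block projection applied to an existing Cα stiffness matrix, and it may differ from how the paper assembles its reduced stiffness and mass matrices, which this excerpt does not show. All function names are hypothetical, and each cluster is assumed to contain at least three non-collinear residues so that the reduced mass matrix is positive definite.

```python
import numpy as np


def skew(r):
    """Matrix form of the cross product: skew(r) @ v == np.cross(r, v)."""
    x, y, z = r
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def rigid_projection(coords, labels):
    """(3n x 6N) matrix mapping cluster translations/rotations to residue
    displacements: disp_i = t_c + omega_c x (x_i - center_c)."""
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    P = np.zeros((3 * len(coords), 6 * len(clusters)))
    for c, lab in enumerate(clusters):
        idx = np.where(labels == lab)[0]
        center = coords[idx].mean(axis=0)
        for i in idx:
            P[3*i:3*i+3, 6*c:6*c+3] = np.eye(3)
            # omega x r equals -skew(r) @ omega
            P[3*i:3*i+3, 6*c+3:6*c+6] = -skew(coords[i] - center)
    return P


def reduced_modes(K, P, masses=None):
    """Solve the 6N-dimensional eigenproblem and map modes back to 3n space."""
    n = K.shape[0] // 3
    m = np.ones(n) if masses is None else np.asarray(masses, dtype=float)
    M = np.repeat(m, 3)
    Kc = P.T @ K @ P                       # reduced stiffness
    Mc = P.T @ (M[:, None] * P)            # reduced mass
    L_inv = np.linalg.inv(np.linalg.cholesky(Mc))
    A = L_inv @ Kc @ L_inv.T               # symmetric standard eigenproblem
    w2, Y = np.linalg.eigh((A + A.T) / 2)
    return w2, P @ (L_inv.T @ Y)           # modes as residue displacements
```

The returned mode shapes are mass-orthonormal in the full 3n-dimensional space, which makes them directly comparable, for example by overlap, with modes from a full Cα calculation.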
References (24)
A theorem on amplitudes of thermal atomic fluctuations in large molecules assuming specific conformations calculated by normal mode analysis. Biophys. Chem. (1990)
Single-molecule fluorescence resonance energy transfer. Methods (2001)
Probing molecular motion by NMR. Curr. Opin. Struct. Biol. (1997)
et al. Protein dynamics simulations from nanoseconds to microseconds. Curr. Opin. Struct. Biol. (1999)
Domain motions in proteins. J. Mol. Liquids (2000)
et al. Relating molecular flexibility to function: a case study of tubulin. Biophys. J. (2002)
et al. Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys. J. (2001)
et al. Elastic models of conformational transitions in macromolecules. J. Mol. Graphics Modell. (2002)
et al. Efficient generation of feasible pathways for protein conformational transitions. Biophys. J. (2002)
et al. Matrix multiplication via arithmetic progression. J. Symbolic Comput. (1990)
Analysis of domain motions by approximate normal mode calculations. Proteins: Struct. Function Genet.
Vibrational dynamics of folded proteins: significance of slow and fast motions in relation to function and stability. Phys. Rev. Lett.