Normal mode analysis of proteins: a comparison of rigid cluster modes with Cα coarse graining

doi:10.1016/S1093-3263(03)00158-X

Journal of Molecular Graphics and Modelling

Volume 22, Issue 3, January 2004, Pages 183-193

https://doi.org/10.1016/S1093-3263(03)00158-X

Abstract

The ability to infer dynamic motions from an equilibrium (static) conformation of a protein can be essential in establishing structure–function relationships. In particular, the low-frequency motions are of functional interest because statistical mechanics predicts these motions will have the largest amplitudes. In this paper, we address the computational cost of normal mode analysis (NMA) applied to a C_α-based elastic network model (C_α-NMA) and present a new coarse-grained rigid-body-based analysis (cluster-NMA). This new method represents a protein as a collection of rigid bodies interconnected with harmonic potentials. This representation produces reduced degree-of-freedom (DOF) equations of motion (EOMs) which, even in the case of large structures (10³⁺ residues), enable the computation of normal modes to be done on a desktop PC. We present the complete theory and analysis of cluster-NMA and also include its application to a variety of structures. The results of the new method are compared with C_α-NMA, and it is shown that cluster-NMA produces very good approximations to the lowest modes at a fraction of the computational cost.

Introduction

The search for inherent links between protein structure and function is a driving force behind the development of accurate and computationally efficient methods for describing the complex motions attainable by protein structures. X-ray crystallography can provide snapshots of protein structures in (near) equilibrium conformations.¹ Other experimental methods such as fluorescence resonance energy transfer (FRET) and nuclear magnetic resonance (NMR) can provide partial information about large-amplitude protein motions [2], [3], [4]. Dynamical simulations of protein structures can provide crucial insights into their function which are not easily obtained experimentally. It has been observed that the low-frequency, large-amplitude motions are most closely related to protein function [5], [6], [7], [8], whereas the high-frequency localized vibrations may be more involved in signal transmission and other internal processes [9]. Computational models, normal mode analysis (NMA) in particular, enable one to derive these desired dynamic motions from the static conformations obtained from crystallography and are thus an essential tool in gaining insight into the structure–function relationship.

Molecular dynamics (MD) simulations rely on atomic details to predict the evolution of conformations of protein structures based on the interactions between all pairs of atoms. These simulations can be computationally prohibitive due to the high number of degrees-of-freedom (DOFs) required to capture motions of large structures and the complicated force calculations required at each iteration.

As a first level of simplification, consider the basic structure of any protein. A protein is composed of a polypeptide backbone with amino acid residues extending from the alpha-carbon (C_α) of each peptide unit. Since the peptide-bonded backbone remains connected in all conformations (short of denaturing the protein), it is often useful to focus on the backbone structure, knowing that each side chain must closely follow its corresponding C_α. Bahar et al. [10] present a scalar model, called the Gaussian network model (GNM), which produces magnitudes of individual residue displacements consistent with experimentally derived quantities including X-ray crystallographic temperature factors [7], hydrogen exchange free energies [11], and the order parameters from NMR-relaxation measurements [12]. While these results validate the use of simple elastic networks, they do not produce displacement directions. Atilgan et al. [13] present the anisotropic network model (ANM), which builds on the GNM by including parameters for displacement directions. Similarly, Kim et al. [14], [15] use C_α-NMA, in which the interactions between residues in contact are modelled with harmonic potentials, to produce three-dimensional displacements. While these methods are much faster than all-atom simulations, they are still computationally expensive for very large structures.
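The scalar character of the GNM can be illustrated with a short sketch. It is not the implementation of Bahar et al.: the 7 Å contact cutoff, the unit spring constant, the function name `gnm_fluctuations`, and the toy coordinates are assumptions made here for illustration. The sketch builds a contact (Kirchhoff) matrix and reads relative mean-square fluctuations from the diagonal of its pseudo-inverse, which yields magnitudes only and no directions.

```python
import numpy as np

def gnm_fluctuations(coords, cutoff=7.0):
    """Relative mean-square fluctuation per residue from a GNM-style contact matrix.

    cutoff is an assumed contact distance (same units as coords).
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Pairwise distances between all C-alpha positions.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Contact map: pairs within the cutoff, excluding self-pairs.
    contact = (dist < cutoff) & ~np.eye(n, dtype=bool)
    # Kirchhoff matrix: -1 for each contact, contact count on the diagonal.
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))
    # Pseudo-inverse via eigendecomposition, discarding the zero (rigid) mode.
    vals, vecs = np.linalg.eigh(kirchhoff)
    keep = vals > 1e-8
    pinv = (vecs[:, keep] / vals[keep]) @ vecs[:, keep].T
    # The diagonal is proportional to each residue's mean-square fluctuation.
    return np.diag(pinv)

# Toy example: five beads on a line, 3.8 apart (hypothetical coordinates).
chain = np.array([[3.8 * i, 0.0, 0.0] for i in range(5)])
msf = gnm_fluctuations(chain)
```

In this toy chain only nearest neighbours are in contact, so the end beads fluctuate most and the central bead least, which is the qualitative pattern usually compared against temperature factors.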

C_α-NMA, as mentioned above, is one of the highest resolution coarse-grained models (one C_α per grain). It uses the Cartesian displacements of each C_α to define the conformational displacement relative to the initial conformation [16]. Coarser-grained models have been employed to capture large-amplitude motions. For example, in [17] the full protein structure is projected onto a reduced DOF subspace. The hybrid method MBO(N)D, as presented in [18], makes use of varying grain sizes to achieve desired levels of resolution according to the mobility and functional interest of each region within the structure.

Hinsen [5] uses a Fourier basis to capture a uniform vector field of displacements. This reduced degree-of-freedom model effectively captures the lowest modes, but has a number of limitations which are inherent to a Fourier basis: it is not well suited for capturing translational motion, the periodicity of the basis set must be accounted for, and displacements given in the basis coordinates do not have physical meaning.

Central to every modelling method is the choice of parameterization. In general, higher DOF parameterizations allow more complex motions to be captured (at a significant computational cost), while lower DOF parameterizations can impose unrealistic conformational constraints. The choice of parameterization allows the user to attain the desired combination of computational performance and motion resolution. This trade-off can be adjusted within a given structure so that regions of interest can be modelled with higher resolution than other regions of less importance. In this paper, we present a low DOF parameterization that produces low-frequency motions consistent with C_α-NMA, which in turn has shown strong agreement with all-atom NMA and MD simulations [5], [10], [19].

An n-residue structure requires 3n parameters for full resolution C_α-NMA. We refer to this as the standard parameterization,² as it serves as our basis for comparison. This results in a computational complexity of $O(n^{3})$.³ However, as mentioned above, the modes of interest are the low-frequency, large-amplitude motions and not the high-frequency localized vibrations (i.e. one is typically not interested in all 3n modes).⁴ We bypass these issues with clustering algorithms to identify subsets of residues that form rigid clusters and thus move as rigid units under the modes of interest.
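The saving from treating clusters as rigid units can be made concrete with a small sketch. It assumes each rigid cluster carries six DOFs (three translations and three small rotations about its centroid), which is the standard count for a rigid body; the function names and the residue and cluster counts are hypothetical and chosen only for illustration.

```python
import numpy as np

def dof_counts(n_residues, n_clusters):
    """Return (C-alpha DOFs, rigid-cluster DOFs): 3 per residue vs 6 per rigid body."""
    return 3 * n_residues, 6 * n_clusters

def rigid_displacements(coords, translation, rotation):
    """First-order displacement of every point in one rigid cluster.

    coords: (m, 3) positions; translation: (3,) vector;
    rotation: (3,) small rotation vector about the cluster centroid.
    """
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)
    # Small-angle rigid motion: d_i = t + w x (r_i - c).
    return np.asarray(translation, dtype=float) + np.cross(rotation, coords - centroid)

# Hypothetical sizes: 1000 residues grouped into 50 rigid clusters.
full, reduced = dof_counts(n_residues=1000, n_clusters=50)
# Ratio of eigenproblem costs if both scale cubically with the DOF count.
cost_ratio = (full / reduced) ** 3

# Two points rotated slightly about the z-axis through their centroid.
pair = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
disp = rigid_displacements(pair, translation=[0.0, 0.0, 0.0], rotation=[0.0, 0.0, 0.1])
```

Under these assumed sizes the reduced problem has 300 DOFs instead of 3000, so a cubic-cost eigensolver would do roughly a thousandth of the work.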

When multiple conformations are known, as in the case of lactoferrin (PDB: 1LFG and 1LFH), the structure can be clustered by identifying sets of residues that experience minimal RMS deviation (after optimal alignment of each candidate cluster). For all structures there are algorithms such as the pebble game [21] that count DOF constraints in a network of contacts. Proteins can also be clustered by secondary structure elements. In all cases, clustering effectively filters out the high-frequency modes and enables one to use a low DOF parameterization to more efficiently calculate the global modes.
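The RMS-deviation criterion for two known conformations can be sketched as follows. This uses the standard Kabsch superposition, which is one common way to compute the optimal alignment and is not necessarily the procedure used by the authors; the function name `aligned_rmsd` and the test coordinates are assumptions. A candidate cluster would be accepted as rigid when this value falls below some user-chosen threshold.

```python
import numpy as np

def aligned_rmsd(a, b):
    """RMSD between two (m, 3) coordinate sets after optimal rigid superposition."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove translation by centring both sets on their centroids.
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    # SVD of the 3x3 covariance matrix yields the optimal rotation (Kabsch).
    u, _, vt = np.linalg.svd(a0.T @ b0)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    # Rotate a onto b and measure the residual deviation.
    return float(np.sqrt(np.mean(np.sum((a0 @ rot - b0) ** 2, axis=1))))

# Hypothetical candidate cluster in conformation A.
conf_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
# Conformation B: the same cluster rotated 90 degrees about z and shifted.
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
conf_b = conf_a @ rot_z.T + np.array([5.0, -2.0, 1.0])
rigid_score = aligned_rmsd(conf_a, conf_b)

# A distorted copy, which should no longer superimpose exactly.
conf_c = conf_b.copy()
conf_c[3] += np.array([0.0, 1.0, 0.0])
distorted_score = aligned_rmsd(conf_a, conf_c)
```

A set of residues that moves as a rigid unit between the two conformations gives a score near zero, while internal distortion raises it.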

The rest of this paper is organized as follows. In Section 2.1, the C_α elastic network model is reviewed. In Section 2.2, the central points of clustering algorithms are discussed and cluster notation is introduced. In Section 2.3, the parameters necessary to capture the motions of this system of rigid bodies are defined. In Sections 2.4 and 2.5, the stiffness and mass matrices are obtained from the quadratic expressions for the potential and kinetic energies of the system. In Section 2.6, the mode shapes are extracted from the equation of motion (EOM) and projected onto the structure. This process requires a change of coordinates, the Gram–Schmidt orthonormalization process, and a low mode “unmixing” algorithm. In Section 3, cluster-NMA is applied to a variety of structures. Computational performance and mode accuracy are analyzed. In Section 4, a summary analysis is given.

Section snippets

Review of C_α-NMA

Since cluster-NMA will be compared to C_α-NMA, the C_α model [13], [14] is briefly reviewed here for completeness. Structures are represented as a system of point masses located at each C_α position with a network of connecting springs. Conformational changes are viewed as displacements of each C_α in the structure. The vector of generalized coordinates is $\sigma = [\sigma_{1}^{T}, \ldots, \sigma_{n}^{T}]^{T}$, where $\sigma_{i} \in R^{3}$ is the displacement of residue i and $R^{d}$ denotes the d-dimensional space of real-valued vectors.
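A point-mass-and-spring network of this kind can be sketched in a few lines. The sketch follows the usual anisotropic-network form of the stiffness matrix and is not taken from the paper: the 10 Å cutoff, the uniform spring constant `k`, unit masses (so that the eigenproblem is a standard one), and the function names are all assumptions.

```python
import numpy as np

def elastic_network_hessian(coords, cutoff=10.0, k=1.0):
    """3n x 3n stiffness matrix for springs between points closer than cutoff."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue  # not in contact: no spring
            # Off-diagonal 3x3 block for a linear spring along d.
            block = -k * np.outer(d, d) / r2
            hess[3*i:3*i+3, 3*j:3*j+3] = block
            hess[3*j:3*j+3, 3*i:3*i+3] = block
            # Diagonal blocks accumulate minus the off-diagonal blocks.
            hess[3*i:3*i+3, 3*i:3*i+3] -= block
            hess[3*j:3*j+3, 3*j:3*j+3] -= block
    return hess

def normal_modes(hess):
    """Eigenvalues (ascending) and mode shapes, assuming unit masses."""
    return np.linalg.eigh(hess)

# Toy structure: four points forming a tetrahedron (hypothetical coordinates).
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
hess = elastic_network_hessian(points)
eigvals, modes = normal_modes(hess)
n_zero = int(np.sum(np.abs(eigvals) < 1e-8))
```

Each column of `modes` is one displacement vector in the stacked $\sigma$ ordering. The six zero eigenvalues correspond to rigid translations and rotations of the whole structure, and the remaining modes describe internal deformation.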

The 3n×3n mass

Application of cluster-NMA to various protein structures

In this section, we apply a high- and low-resolution helix-based clustering algorithm as described in Section 2.2. Cluster-NMA is tested on a sample set of 12 protein structures ranging in size from 85 to 1287 residues. An even coarser cluster-NMA is performed on lactoferrin (691 residues) by clustering by domain. Finally, a very coarse cluster-NMA is performed on the 8015 residue GroEL/GroES complex. The range in structure size and cluster resolutions is chosen so that computational savings

Conclusions

At the core of cluster-NMA is the rigid-body representation of the protein structure. This simultaneously reduces the number of DOFs and confines the structure to the space of low-frequency motions. Typically, NMA computational performance is limited by the $O(n^{3})$ eigenproblem. Cluster-NMA circumvents this limitation by using an $O(n)$ transformation to project the structure into a reduced DOF representation. The eigenproblem is then performed in this smaller space and the results are transformed

References (24)

N. Gō. A theorem on amplitudes of thermal atomic fluctuations in large molecules assuming specific conformations calculated by normal mode analysis. Biophys. Chem. (1990)

T. Ha. Single-molecule fluorescence resonance energy transfer. Methods (2001)

A.G. Palmer. Probing molecular motion by NMR. Curr. Opin. Struct. Biol. (1997)

S. Doniach et al. Protein dynamics simulations from nanoseconds to microseconds. Curr. Opin. Struct. Biol. (1999)

K. Hinsen. Domain motions in proteins. J. Mol. Liquids (2000)

O. Keskin et al. Relating molecular flexibility to function: a case study of tubulin. Biophys. J. (2002)

A.R. Atilgan et al. Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys. J. (2001)

M. Kim et al. Elastic models of conformational transitions in macromolecules. J. Mol. Graphics Modell. (2002)

M.K. Kim et al. Efficient generation of feasible pathways for protein conformational transitions. Biophys. J. (2002)

D. Coppersmith et al. Matrix multiplication via arithmetic progressions. J. Symbolic Comput. (1990)

K. Hinsen. Analysis of domain motions by approximate normal mode calculations. Proteins: Struct. Function Genet. (1998)

I. Bahar et al. Vibrational dynamics of folded proteins: significance of slow and fast motions in relation to function and stability. Phys. Rev. Lett. (1998)

Cited by (50)

Efficient prediction of protein conformational pathways based on the hybrid elastic network model
2014, Journal of Molecular Graphics and Modelling
Citation Excerpt :
Recently, a new domain decomposition method for RTB, called density-cluster RTB was developed [24]. The cluster NMA (cNMA) [25,26] was developed to overcome several limitations in RTB and BNM method. cNMA can directly convert Cartesian coordinates into rigid clusters without any projection, vice versa, whereas RTB and BNM methods transform the full Hessian matrix into a reduced subspace.
Various computational models have gained immense attention by analyzing the dynamic characteristics of proteins. Several models have achieved recognition by fulfilling either theoretical or experimental predictions. Nonetheless, each method possesses limitations, mostly in computational outlay and physical reality. These limitations remind us that a new model or paradigm should advance theoretical principles to elucidate more precisely the biological functions of a protein and should increase computational efficiency. With these critical caveats, we have developed a new computational tool that satisfies both physical reality and computational efficiency. In the proposed hybrid elastic network model (HENM), a protein structure is represented as a mixture of rigid clusters and point masses that are connected with linear springs. Harmonic analyses based on the HENM have been performed to generate normal modes and conformational pathways. The results of the hybrid normal mode analyses give new physical insight to the 70S ribosome. The feasibility of the conformational pathways of hybrid elastic network interpolation (HENI) was quantitatively evaluated by comparing three different overlap values proposed in this paper. A remarkable observation is that the obtained mode shapes and conformational pathways are consistent with each other. Our timing results show that HENM has some advantage in computational efficiency over a coarse-grained model, especially for large proteins, even though it takes longer to construct the HENM. Consequently, the proposed HENM will be one of the best alternatives to the conventional coarse-grained ENMs and all-atom based methods (such as molecular dynamics) without loss of physical reality.
NMR spectroscopy on domain dynamics in biomacromolecules
2013, Progress in Biophysics and Molecular Biology
Domain dynamics in biomacromolecules is currently an area of intense research because of its importance for understanding the huge quantity of available data relating the structure and function of proteins and nucleic acids. Control of structural flexibility is essential for the proper functioning of the biomacromolecules. Biophysical discoveries as well as computational algorithms and databases have reshaped our understanding of the often spectacular domain dynamics. At the residue level, such flexibility occurs due to local relaxation of peptide bond angles whose cumulative effect results in large changes in the secondary, tertiary or quaternary structures. The flexibility, or its absence, most often depends on the nature of interdomain linkages. Both the flexible and relatively rigid linkers are found in many multidomain biomacromolecules. Large-scale structural heterogeneity of multidomain biomacromolecules and their complexes is now seen as the norm rather than the exception. Absence of such motion, as in the so-called molecular rulers, also has desirable functional effects in architecture of biomacromolecules. The contemporary methods of NMR spectroscopy are capable to provide the detailed information on domain motions in biomacromolecules in the wide range of timescales related to the timescales of their functioning. We review here the current point of view on the nature of domain motions based on these last achievements in the field of NMR spectroscopy. Experimental and theoretical aspects of the collective intra- and interdomain motions are considered.
Protein dynamics developments for the large scale and cryoEM: Case study of ProDy 2.0
2022, Acta Crystallographica Section D: Structural Biology
Influence of model resolution on geometric simulations of antibody aggregation
2016, Robotica
The importance of slow motions for protein functional loops
2012, Physical Biology
Quantified uncertainty of flexible protein-protein docking algorithm
2019, arXiv
