Molecular dynamics simulations of membrane proteins and their interactions: from nanoscale to mesoscale

Introduction
Membrane proteins play a key role in the biology of cells. Around 20% of genes encode membrane proteins, and they form a major class of drug targets. There has been considerable progress in the structural biology of membrane proteins, resulting in over 2500 structures in the PDB, corresponding to over 700 distinct membrane protein species [1]. Molecular dynamics (MD) and related molecular simulation approaches provide important tools which allow us to simulate both individual membrane proteins and more complex membrane systems [2]. Thus, MD simulations have become a valuable addition to the range of experimental structural and biophysical techniques for studying membrane proteins and their interactions with lipids [3].
In this article we will review two major and complementary trends in molecular simulations of membrane proteins: (i) probing protein-lipid interactions of single membrane proteins and (ii) modelling more complex membranes containing mixtures of multiple lipid species and multiple copies of membrane proteins (Figure 1). It remains a challenge to develop biologically realistic models of cell membranes, but recent methodological advances enable an integrated approach to the problem, drawing together structural, biophysical and biochemical data into dynamic models which aid interpretation of structural and imaging data on membranes of cells and their organelles. We will survey these advances and a number of recent applications. We will also discuss the development of mesoscale approaches which allow very large scale simulations, exploring membrane behaviour beyond the nanoscale and thus narrowing the gap between simulations and experiment.

Lipid-protein interactions at the nanoscale
MD simulations may be thought of as a computational microscope [4]: one may 'zoom in' to atomic resolution to examine detailed interactions of a membrane protein with water, ions, and lipids, or 'zoom out' to a lower resolution, using for example coarse-grained (CG) [5,6] simulations, to address longer length and timescales, albeit with some loss of detail in modelling interatomic interactions. This approach has been successfully used to reveal the dynamic interactions of membrane proteins with lipids at the nanoscale [1,7,8].
Simulations have been used to predict lipid interaction sites for a number of mammalian integral membrane proteins, providing detailed views both of the lipid annulus, as for aquaporin [9,10] (Figure 1a), and of interactions with specific lipids. A number of recent studies have characterised experimentally observed interactions between cholesterol molecules and G-protein coupled receptors (GPCRs), reviewed in [7], and have explored how such protein-lipid interactions may modulate the dimerization of GPCRs and its possible effects on receptor function (see e.g. [11]). In addition to interactions of membrane proteins with cholesterol, simulations have been used to identify binding sites for phosphatidylinositol 4,5-bisphosphate (PIP2) on ion channels, transporters, and receptor proteins. Thus, PIP2 binding sites have been characterised for ion channels including Kir2.2 [12] and Kv7.1 [13] potassium channels, and PIP2 regulation of dopamine transporters has been explored [14]. CG simulations have been used to compare interactions of PIP2 molecules with the transmembrane and juxtamembrane domains of all 58 human receptor tyrosine kinases [15], illustrating how high-throughput approaches to membrane protein simulations [16,17] enable systematic surveys of families of membrane proteins and their lipid interactions. CG simulations have also been used to explore the free energy landscapes for the interactions of PIP2 and of glycolipids with the transmembrane domain of the EGFR [18].
More recent simulation studies of intact receptor tyrosine kinases (i.e. not simply the transmembrane domain) have revealed how lipid-mediated interactions between the membrane surface and the ectodomains of the receptor may modulate the overall conformation of these complex multi-domain membrane proteins. Thus ectodomain/bilayer interactions may result in an asymmetric conformation of the EGFR dimer [19], and these interactions can be influenced by receptor glycosylation [20]. Ectodomain/bilayer interactions of the related EphA2 receptor are mediated primarily via anionic lipids, and may stabilize different conformations at the membrane of liganded versus unliganded forms of the receptor [21] (Figure 1b).
Simulations have also been used to explore the interactions with proteins of more 'specialized' lipids from mitochondrial and bacterial inner membranes, such as cardiolipin (CL). CG-MD simulations have been used to assess binding sites for CL on cytochrome bc1 [22] and cytochrome c oxidase [23], and to estimate the free energy landscapes of these interactions [23]. A comparable approach has been used to examine the interactions of cardiolipin with the ADP/ATP carrier ANT1 (Figure 2a). These studies confirm that such simulations can accurately reproduce lipid binding sites seen in the X-ray structure of this key mitochondrial transport protein [1,7].
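To illustrate how lipid interaction sites of the kind described above might be read off a trajectory, the following minimal Python sketch computes, for each protein residue, the fraction of frames in which at least one lipid particle lies within a distance cutoff. It is an illustration only and does not reproduce the protocol of any study cited here: the function name `contact_occupancy`, the 0.6 nm cutoff, and the nested-list coordinate layout are assumptions made for this example, and periodic boundary conditions are ignored.

```python
import math


def contact_occupancy(protein_frames, lipid_frames, cutoff_nm=0.6):
    """Per-residue fraction of frames with a lipid particle within the cutoff.

    protein_frames: list of frames; each frame is a list of (x, y, z) tuples,
                    one per protein residue (e.g. one CG bead per residue).
    lipid_frames:   list of frames; each frame is a list of (x, y, z) tuples,
                    one per lipid particle of interest (e.g. headgroup beads).
    cutoff_nm:      contact distance in nm (hypothetical default, an assumption).
    """
    n_frames = len(protein_frames)
    n_residues = len(protein_frames[0])
    counts = [0] * n_residues
    for residues, lipids in zip(protein_frames, lipid_frames):
        for i, residue_xyz in enumerate(residues):
            # A residue is "in contact" if any lipid particle is close enough.
            if any(math.dist(residue_xyz, lipid_xyz) <= cutoff_nm
                   for lipid_xyz in lipids):
                counts[i] += 1
    return [c / n_frames for c in counts]


# Toy data (hypothetical): two residues, one lipid particle, two frames.
protein = [[(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
           [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]]
lipid = [[(0.3, 0.0, 0.0)],   # frame 1: lipid next to residue 0
         [(3.0, 0.0, 0.0)]]   # frame 2: lipid far from both residues
print(contact_occupancy(protein, lipid))  # [0.5, 0.0]
```

Residues with high occupancy would be candidate binding-site residues. In practice such an analysis would be run over many thousands of frames with a trajectory analysis library, and the cutoff would be chosen to suit the model resolution.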
Simulations have also been applied to bacterial membranes [24] and their proteins. For example, combined structural, biophysical and computational studies have explored the role of lipids in the mechanosensitivity of the Escherichia coli ion channel MscS [25]. Selective interactions of CL with UraA, a bacterial inner membrane transporter, have also been explored [26]. Realistic modelling of the more complex outer membranes of Gram-negative bacteria has required development of models for lipopolysaccharide (LPS), the major constituent of the outer leaflet of these membranes [27,28]. Recent progress in both atomistic [29] and coarse-grained [30] simulations of LPS enables studies of the interactions of a number of E. coli outer membrane proteins, for example FecA [31], OmpLA [29] and OmpF [32], with this complex membrane environment.

Figure 1. Overview of MD simulations of membranes. For each simulation, the granularity (atomistic versus coarse-grained), the number of atoms/particles in the simulation system (including water, which is omitted for clarity from all of the images), the approximate linear dimension of the simulation box, the duration of the production run, and the resultant trajectory file size are given.

The nanoscale interactions of cytoplasmic peripheral membrane proteins and their lipid recognition domains with cell membranes may also be explored by simulations [33]. For example, MD simulations may be used to study how interactions with lipids such as PIPs may guide the recruitment of peripheral proteins such as PTEN to membranes within the cell [34], and also to explore the free energy landscapes [35] underlying the interactions of lipid-recognition domains, such as PH domains [36] (Figure 2b), with PIP-containing membranes. The nanoscale effects of curvature and lipid composition on recruitment of peripheral proteins to cell membranes have also been explored by a combination of experiments and simulation [37].

Beyond the nanoscale: simulation of complex and crowded membrane systems
The diversity of lipids and proteins simulated and the accuracy with which interaction sites are identified, as surveyed in the previous section, demonstrate the efficacy of the simulation approach and strengthen confidence in its extension to more complex membrane systems. Models can now incorporate the compositional complexity of cell membranes [38,39,40], and mimic the crowding of proteins in cell membranes. Simulations of such models allow us to explore the emergent dynamics of complex and crowded membranes. In multi-component membranes lipids move in concert, with correlation times in the range of hundreds of nanoseconds and correlation lengths of >10 nm [41], underlining the importance of large scale, extended simulations to fully sample the interactions of the complex in vivo environment experienced by membrane proteins. These larger scale models help us to understand the collective behaviour of multiple copies of membrane proteins, such as the influence of crowding of membrane proteins on their clustering and diffusion [42,43]. These emergent dynamic properties of membranes may play a key role as regulatory mechanisms [44,45], and will influence the mechanical properties of cell membranes.
Simulations have been used to explore lipid sorting and membrane (nano)domain formation. For example, long atomistic simulations have revealed substructures within ordered lipid phases, and have demonstrated coexistence of ordered and disordered lipid phases [46]. Large scale CG simulations have also provided insights into the degree of dynamic lateral heterogeneity as a consequence of lipid clustering within models of mammalian cell membranes [38,39]. Protein clustering and oligomerization are observed within such large scale simulations. A number of studies have focussed on GPCR oligomerization and the influence of lipids. Thus, Periole et al. used CG-MD simulations to model supra-molecular assemblies of rhodopsin [47]. Simulations of opioid receptors in a mixed POPC-cholesterol membrane helped to define the role of interfacial lipids at the protein-protein interface [48]. CG simulations of oligomerization of the β2-adrenergic receptor have explored the effects of protein-membrane hydrophobic mismatch [49]. Different mixtures of unsaturated and saturated lipids have been shown to affect the oligomerization of both adenosine and dopamine receptors [50].

Current Opinion in Structural Biology
Figure 2. Protein-lipid interactions via coarse-grained simulations. (a) The mitochondrial ADP/ATP carrier ANT1 (with the three domains in green, pink and blue) interacting with three cardiolipin molecules in yellow. The lipid bilayer is shown in grey [7]. (b) A GRP1 PH domain (green) at the surface of a lipid bilayer bound to a PIP2 molecule (green) [35] (figures courtesy of George Hedger).

Simulations of the sphingosine-1-phosphate receptor in a complex mixed-lipid asymmetric bilayer have revealed how protein-lipid-protein interactions may influence the dynamic clustering of GPCRs [51]. This approach has been extended beyond the interactions of GPCRs. For example, simulations of a mitochondrial inner membrane indicate how cardiolipin may 'glue' together respiratory proteins into supercomplexes [52]. Analysis of the free energy landscape of interaction of the bacterial outer membrane protein NanC has revealed how intervening lipids may stabilize a membrane protein dimer [53]. Such protein-lipid-protein interactions may underlie functionally important larger scale membrane organization. Thus, combining molecular dynamics simulations with in vitro and in vivo experimental studies has indicated how formation of large clusters of bacterial outer membrane proteins (OmpF and BtuB) may play a key role in the formation of membrane protein 'islands' during the division of bacterial cells [54] (Figure 3a). The impact of protein clustering on membrane curvature has been demonstrated in a study of ATP synthase, combining electron cryotomography with simulations of ATP synthase dimers in a phospholipid bilayer [55].
Clustering has also been explored in larger scale simulations of peripheral membrane proteins. For example, clustering of lipid-anchored H-Ras has been observed in simulations of a three-component lipid bilayer (di16:0 PC + di18:2 PC + cholesterol), in which the protein accumulated at the interface between lipid-ordered and lipid-disordered regions, resulting in an increased local membrane curvature [56]. CG simulations have also suggested that N-Ras clusters can alter the rate of formation of lipid phases in similar mixed lipid bilayers [57]. Other peripheral proteins may have dramatic effects on membrane properties. For example, simulations of α-synuclein aggregation [58] indicate how proteins may remodel the shapes of membranes, and large scale simulations of SNARE proteins suggest that hydrophobic mismatch may induce protein clustering and segregation [59].

Approaching experimental length scales: large scale membrane simulations
Ongoing advances, for example in the development of simulation codes that efficiently exploit very large scale computing resources, including combined CPU/GPU architectures [60], and in methods for setup and analysis of complex simulation systems [61], enable molecular simulations of membranes to reach length scales of several hundred nanometers, thus permitting direct comparison with cell membrane imaging by cryo-electron tomography and by super-resolution optical microscopies. Using these approaches, simulations of, for example, whole virus particles and subcellular organelles become feasible. Furthermore, more highly coarse-grained (or mesoscopic) simulation approaches have been developed to aid modelling of emergent behaviours in these complex protein-membrane systems.
A landmark early study in this field is provided by a combined experimental and modelling study of synaptic vesicles [62], in which diverse data (from structural biology, mass spectrometry, and biophysics) were integrated to develop a near atomic resolution model which could be compared to images from electron microscopy. With this proof of principle for developing such large scale models, it is now timely to embark on their simulation. For example, molecular simulations can enable dynamic structural models of enveloped viruses to be explored at atomic or near atomic resolution. Thus multiscale simulations, including a coarse-grained model of the lipid bilayer, have been used to study the early stages of formation of the HIV capsid [63], whilst all-atom molecular dynamics simulations have been used to fit a model of the mature HIV-1 capsid into cryo-electron microscopy density [64]. CG-MD simulations have been used to probe the dynamic behaviour of lipid bilayer components of two viral envelopes: those of influenza A [65] and of dengue virus [66]. In both cases simulations of the membrane envelopes of intact virions revealed slow and anomalous diffusion of the lipids, that is, 'raft-like' behaviour of the viral membrane. Taken together, these studies and others (reviewed in [67]) reveal considerable scope for the application of molecular simulations to viruses and their interactions with cell membranes. Other applications have explored large scale dynamic events including, for example, membrane fusion [68] and BAR domain-induced remodelling of vesicles [69,70], including the influence of membrane tension on BAR assembly [71].
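Anomalous diffusion of the kind noted above is commonly quantified by the exponent α in the relation MSD(t) ∝ t^α, where MSD is the mean squared displacement of the lipids, α = 1 corresponds to normal diffusion, and α < 1 indicates subdiffusion. A minimal sketch of how α might be estimated is given below, as the least-squares slope of log(MSD) against log(lag time). The function name `anomalous_exponent` and the synthetic input data are hypothetical, and the studies cited above may have used different estimators.

```python
import math


def anomalous_exponent(lag_times, msd):
    """Estimate alpha in MSD ~ t**alpha by a least-squares fit in log-log space.

    lag_times: positive lag times (any consistent unit, e.g. ns).
    msd:       positive mean squared displacements at those lag times.
    """
    xs = [math.log(t) for t in lag_times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the ordinary least-squares line through (log t, log MSD).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, noise-free subdiffusive data (hypothetical): MSD = 0.5 * t**0.7
lags = [1.0, 2.0, 4.0, 8.0, 16.0]
msd_values = [0.5 * t ** 0.7 for t in lags]
print(round(anomalous_exponent(lags, msd_values), 3))
```

With real trajectories the fit range matters: short lag times may be dominated by local rattling and long lag times by poor sampling, so the exponent is usually reported for a stated window of lag times.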
Molecular simulations in combination with AFM and spectroscopic data have been used to construct a model of an intact bacterial photosynthetic chromatophore (Figure 4a) ([72], M Sener, unpublished data), enabling detailed modelling of excitation transfer between pigment molecule clusters. Highly coarse-grained (DPD) models have been used recently to study the dynamic organization of PSII-LHCII supercomplexes in plant photosynthetic membranes [73]. These and related models, which address the dynamic organization of thylakoid membranes on a several-hundred nanometer length scale, can be used to model light harvesting mechanisms, thus enabling direct comparison with spectroscopic data on these processes [74].
The studies described above have made use of models over a range of scales, from atomistic to CG, building up to mesoscale simulations. Such investigations may be aided by simulation-based tools that allow high resolution structural data to be combined with lower resolution data from, for example, cryo-EM [75]. With the recent advances in the resolution of cryo-EM and cryo-ET, and the current expansion in simulations carried out at close to experimental length scales, there is much opportunity for further development of mesoscale models.
Molecular simulations of complex membrane assemblies have benefitted from the development of a range of tools, for example for semi-automated setup of complex mixed lipid bilayers [76,77]. On a larger scale, cellPACK provides mesoscale packing algorithms to generate and visualize three-dimensional models of complex biological environments, and has been evaluated for models of synaptic vesicles and of an HIV virion [78]. Larger simulations also necessitate the use of significant computing resources (Figure 3b), and careful consideration of scaling on thousands of CPUs becomes important. The volume of data generated by large scale simulations is appreciable, in the range of hundreds of GB per simulation (Figure 1), which imposes substantial data storage and processing demands. Very large scale simulations also require development of novel methods for visualization (e.g. QuickSurf in VMD) [79] and for analysis of, for example, lipid flows in complex membranes (Figure 4b) [80]. Future developments are likely to further integrate a range of tools for setup, running, visualization and analysis of larger and more complex membrane systems, in addition to development of databases for storage and dissemination of the results of membrane simulations (e.g. MemProtMD [1]).
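The scale of the data volumes mentioned above can be appreciated from a back-of-envelope estimate: an uncompressed, coordinates-only trajectory requires three numbers per particle per saved frame. The sketch below makes this arithmetic explicit. The function name, the assumption of 4-byte (single precision) coordinates, and the particle and frame counts in the example are all hypothetical and are not taken from the simulations in Figure 1.

```python
def trajectory_size_gb(n_particles, n_frames, bytes_per_coord=4):
    """Uncompressed size of a coordinates-only trajectory, in GB (10**9 bytes).

    Assumes three coordinates per particle per frame and ignores headers,
    velocities, forces and box information.
    """
    return n_particles * 3 * bytes_per_coord * n_frames / 1e9


# Hypothetical example: 5 million CG particles saved for 5000 frames.
print(trajectory_size_gb(5_000_000, 5_000))  # 300.0
```

Compressed, reduced-precision trajectory formats lower this figure, as does saving frames less often or omitting solvent, but the linear dependence on both system size and number of frames remains.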

Conclusions
Using multiscale molecular simulations as a 'computational microscope' we can characterize the interactions of membrane proteins with lipids, matching, incorporating and extending the information which may be obtained from experimental structural and biophysical (e.g. MS) studies. Simulation approaches have been extended to allow crowded and complex membranes to be simulated with increasing biological realism. With the accuracy and utility of computational approaches to cell membranes thus established, they are now being used to model and simulate cellular organelles and enveloped viruses. Paired with the growing wealth of cryo-EM and cryo-ET structural data, such simulations hold considerable promise for future 'in silico in vivo' studies of cell membranes.

23. Arnarez C, Marrink SJ, Periole X: Identification of cardiolipin binding sites on cytochrome c oxidase at the entrance of proton channels. Sci Rep 2013, 3:1263.
One of the first instances of CG potential of mean force calculations to derive free energy landscapes for lipid interaction with a membrane protein.