Abstract

In molecular simulation, there is a large gap between the capabilities of all-atom molecular dynamics (MD) and many biophysical systems of interest. Coarse-grained (CG) molecular dynamics addresses this gap by extending the timescales and system sizes that are accessible to simulation. It does so by mapping the all-atom system onto a smaller number of degrees of freedom known as CG beads or sites. Each site represents a collection of individual atoms, and the sites are often chosen via chemical intuition. The interactions between these sites are parameterized in one of two ways. The first, top-down CG, involves selecting experimental observables and hand-tuning the force field to match these targets. This process is laborious and is impossible to do systematically without extensive knowledge of force field development. Bottom-up CG, on the other hand, addresses these issues by algorithmically parameterizing force fields to minimize loss functions with respect to reference all-atom data that has been mapped to the CG resolution. In this way, bottom-up CG models can in theory be generated for arbitrary systems, provided one has access to short reference trajectories and an appropriate CG algorithm. In practice, though, bottom-up CG becomes increasingly difficult as it is applied to more complicated systems, particularly when many-body correlations, anisotropy, or large numbers of model parameters are in play. Unfortunately, these problematic aspects are all but certain to arise when generating CG models of relevant biophysical processes, such as membrane remodeling or protein assembly. Thus, it is imperative that the limitations of bottom-up CG are understood so that better methods can be developed to address them.

This work is divided into two main sections. Chapters 2 and 3 analyze bottom-up and top-down CG lipid bilayers at a resolution of four heavy atoms to one CG bead. In both cases, the CG models tend to fail to reproduce thermodynamic properties of the bilayer without explicit temperature dependence. In addition, the bottom-up CG lipid models require extensive and complicated optimization schemes, which limits their practicality. These issues stem from solvent effects and the inherent anisotropy of lipid membranes, and they are difficult to address without advanced methodologies such as semi-explicit solvent virtual particles. Ultimately, these models fail to match the accuracy and expressiveness of CG lipids at lower resolutions, which suggests that the higher CG resolution is inappropriate for such systems.

The second section of this work applies machine learning (ML) methods to CG modeling. Chapters 4 and 5 demonstrate that, for liquid systems with significant many-body and nuclear quantum effects, deep neural networks generate much more accurate models. Normally, neural networks deliver this increase in accuracy at the cost of integration speed and significant data requirements. In fact, these models in many cases run slower than the corresponding all-atom simulations. However, if a path integral representation of the system is required, these ML-based force fields are not only faster but also essentially as accurate. For classical MD mapped to CG resolution, this speed advantage does not hold, but equivariant neural networks can be applied to significantly reduce the amount of training data needed to produce an accurate model. In fact, a single frame of MD data is sufficient to generate a stable CG model of water with an equivariant neural network, which is two orders of magnitude less data than non-equivariant networks require. Overall, these projects demonstrate that bottom-up CG modeling remains difficult for complex systems, but recent advancements in machine learning and traditional CG methods provide a path towards more accurate and practical CG models.
