Molecular dynamics in a grand ensemble: Bergmann–Lebowitz model and adaptive resolution simulation

Animesh Agarwal; Jinglong Zhu; Carsten Hartmann; Han Wang; Luigi Delle Site

doi:10.1088/1367-2630/17/8/083042

1. Introduction

The physics of open systems is considered to be of primary importance in the understanding of natural phenomena and in the development of modern technology [1]. Systems in real life, as well as in experimental setups, are open systems, that is, systems which exchange energy and particles with their environment; the process of the exchange of particles is the basis of their interesting properties (see e.g. [2]). From a theoretical point of view the conceptual development of the classical and quantum statistical mechanics of open systems is challenging; in fact, theorems of statistical mechanics and dynamics derived for systems with a fixed number of particles are no longer valid in their standard formulation and must be revised accordingly, e.g., when the deterministic evolution is replaced by the stochastic evolution which controls the process of exchange of particles [3–7]. We will argue that extensive theoretical work, with effective, elegant and (from a practical point of view) useful concepts, was developed a long time ago (well before the advent of computer simulations) but has remained unnoticed by the majority of the molecular simulation community. As a matter of fact, until recently, open systems with a varying number of particles have been simulated using algorithms which did not succeed as expected. The lack of success was most probably due to reduced efficiency compared to techniques based on fixed particle numbers (see, e.g. [8]). However, recently algorithms of multiscale character, which aim at bridging different scales within one unified framework, have gained great popularity, which in turn has led to the construction of efficient techniques where systems exchange energy or particles with an external environment; for example techniques using molecular resolution that can adaptively change in space (adaptive resolution simulation), see e.g. [9] and [10] and references therein.

Adaptive resolution simulation techniques allow us to focus on a specific region in space, treated at a desired (high) resolution, while the rest of the system is treated at a lower resolution. In the resolved region, some interesting process takes place while the rest of the system stays in thermodynamic equilibrium with the subsystem of interest (or, beyond equilibrium, exchanges energy and particles according to well defined statistical physical laws). In contrast to the first generation of algorithms with a varying number of particles, such algorithms are technically highly efficient and flexible. This flexibility makes them feasible for use in the calculation of various statistical properties, such as time correlation functions, some of which require a theoretical redefinition (compared to the fixed particle number simulations). The necessity of a formal redefinition of equilibrium time correlation functions in modern open systems MD simulations calls for revisiting the theoretical concepts developed about five to six decades ago for the statistical mechanics of open systems in the context of state-of-the-art computer algorithms.

In this paper, following the terminology developed in [3–5], we will refer to open systems which exchange energy and matter with the environment as grand ensemble systems; the grand canonical ensemble is one particular realization of a grand ensemble, as discussed in [3–5]. The aim of this paper is: (a) a discussion of theoretical concepts of open systems present in the literature; (b) a brief overview about the development/application of algorithms with a varying number of particles in molecular dynamics; (c) the inclusion/adaptation of formal results about open systems into the framework of MD techniques; (d) to provide examples of merging theory and algorithms by reporting numerical results for one specific open system MD technique. We will treat the specific case of the grand canonical-like adaptive resolution simulation method (GC-AdResS) and discuss its conceptual consistency with the theory present in the literature, together with its technical advantages/limitations. The hope is that this may stimulate further research along this direction and add to the theoretical foundation of MD simulations in a grand ensemble; the need to approach more complex systems characterized by the realistic process of exchange of energy and matter with the environment, prohibitive in the past, is becoming a guiding principle in the development and the application of molecular simulation techniques [11].

The paper is organized as follows: in the first section we will give a general overview of the theoretical concepts developed about the statistical mechanics of open systems. Next we will focus on what we will call the Bergmann–Lebowitz approach, a flexible and conceptually robust model that is of utmost relevance for many state-of-the-art MD algorithms. In the second section we will briefly discuss the general features of techniques of MD with a varying number of particles and introduce the idea of MD with molecular adaptive resolution simulation. In the third section we will introduce one of the techniques of adaptive resolution simulation (GC-AdResS) and report results where the BL theory of the first section is employed to give conceptual justification for the simulations and to the corresponding calculation technique. Finally, conclusions and future perspectives will be given. It must be noticed that the technical setup and the numerical results reported in this work are original in the development of the GC-AdResS methodology. In fact the results show that, with the technical setup developed in this work, the method is reliable not only for the calculation of static properties, on which past research focused, but also for the calculation of dynamical properties, thus allowing the study of a much larger class of phenomena.

2. Basic concepts of a grand ensemble and extended Liouville equation

When one uses the keywords 'statistical mechanics of open systems' in an automatic literature search, one finds a considerable amount of rich material (see, e.g., [3–7, 12]). However, the vast majority of this material focuses mostly on the idea of coupling a system to a reservoir of energy, or on nonequilibrium scenarios, such as the transport of matter from an external source. Treatments of the exchange of matter are usually limited to simple extensions of the concept of heat exchange and heat flow [4–6]. As a matter of fact, the exchange of heat has been historically most relevant in MD simulations, as the coupling of a system to an external reservoir of energy makes simulations numerically stable and physically targeted to the desired thermodynamic state, without requiring systems as large as those necessary for NVE simulations [13]. The circumstances outlined above, together with the lack of success of grand canonical-like MD methods (see the later discussion), were the reasons why the theoretical concepts of grand ensemble, developed, e.g., in [4–6], did not become popular in MD simulations and thus were not implemented in practical calculation tools. However, as underlined in the introduction, the rediscovery and further development of such work became a timely necessity. In this section we will trace the idea behind the theoretical treatment of open systems in equilibrium and we will restrict the discussion to those approaches where the coupling between system and reservoir is not required in an explicit form; such approaches represent the most general model of an open system. Moreover, we will restrict the treatment to classical systems because our main interest lies in the field of classical MD. In particular we will define a generalized Liouville equation and associated operator (the Bergmann–Lebowitz Liouville equation/operator).

Instead, the class of approaches which explicitly require a coupling term in the Hamiltonian is usually limited to transport processes (out of equilibrium), whose external source can be formalized in specific cases only (e.g. [14]) and which in general do not admit a grand ensemble. The essential idea behind approaches which do not require an explicit coupling term, is that a small system is in contact with a large reservoir (or more than one, but for simplicity let us consider only one). The aim is to extract (thermo)dynamical laws governing the small system, from the microscopic equation of the global system (comprising the reservoir). The Liouville equation of the global system is the ideal starting point, however, the variables considered in the reservoir are macroscopic variables that can be considered to be averages over microscopic states; in the optimal case these variables do not explicitly enter into the description of the evolution of the small system. The general hypothesis at the basis of such models is that the reservoir exerts its influence on the small system only via intensive properties (see e.g. [4, 6]). The key idea is that, even if the extensive variables of the reservoir change, its intensive variables are constants of motion. As a consequence, the dynamical evolution of the small system does not contain any time-dependent function of the reservoir and the small system is then governed by a self-consistent dynamical evolution. In a pioneering work, Emch and Sewell [6] proposed a method based on the basic principles reported before. They treat quantum systems, and the generalized Liouville equation is a master equation governing the evolution of the statistical operator. However, they need an abstract projector operator which coarse-grains the microscopic variables of the reservoir into macroscopic variables, that, in turn, influence the small subsystem. 
For MD simulations, although the premises of the method and its formalism are certainly appealing, this idea is not practical; in fact, the explicit specification and formalization of a general coarse-graining operator is not straightforward. However, a similar but more appealing idea for its formal simplicity and from the viewpoint of practical implementation, has been put forward by Bergmann and Lebowitz [4], as will be outlined below; see also [5].

2.1. Bergmann–Lebowitz Liouville equation

In the seminal paper of Bergmann and Lebowitz [4] (and subsequently in the paper of Lebowitz and Shimony [5]), the authors derive a general model of a many-particle system that is interacting with different reservoirs. Here, for simplicity and for closer analogy to a standard grand canonical MD simulation, we will treat only the case of a single reservoir. The key ingredient of the model is an impulsive, Markovian interaction between the reservoir and the system. The effect of the reservoir on the system can be completely described if one specifies the stationary distribution of the reservoir before the reservoir–system interaction (thus the knowledge of the reservoir state as a function of time is not required). In their model, each interaction between the system and the reservoir produces a discontinuous transition of the system from a state with N particles ( ${X}_{N}^{\prime }$ ) to one with M particles (X_M). Such transitions are determined not only by the configuration of the system, ${X}_{N}^{\prime }$ , but also by the configuration of the reservoir in phase space. Ignoring the reservoir state upon collision, the change in the system state can be described in terms of a Markovian transition kernel, ${K}_{{NM}}({X}_{N}^{\prime },{X}_{M})$ , that is independent of time. Specifically, ${K}_{{NM}}({X}_{N}^{\prime },{X}_{M})$ is the probability that, in an infinitesimally small time interval, the system at X_M makes a transition to ${X}_{N}^{\prime }$ as a result of the interaction with the reservoir. The probability density function, $\rho ({X}_{M},M,t)$ , at some point X_M of the phase space is governed by the extended Liouville equation which we will name the Bergmann–Lebowitz Liouville equation:

$\begin{eqnarray}\displaystyle \frac{\partial \rho ({X}_{M},M,t)}{\partial t} & = & \{\rho ({X}_{M},M,t),H({X}_{M})\}\\ & & +\displaystyle \sum _{N=0}^{\infty }\displaystyle \int {\rm{d}}{X}_{N}^{\prime }[{K}_{{MN}}({X}_{M},{X}_{N}^{\prime })\rho ({X}_{N}^{\prime },N,t)\\ & & -{K}_{{NM}}({X}_{N}^{\prime },{X}_{M})\rho ({X}_{M},M,t)]\end{eqnarray} \tag{ 1 }$

where, as usual, $H({X}_{M})$ is the Hamiltonian of the system corresponding to the point X_M and $\{\ast ,\ast \}$ are the canonical Poisson brackets.

An important point worth mentioning is that the standard Liouville theorem

$\begin{eqnarray}&&\displaystyle \frac{{\rm{d}}\rho ({X}_{M},M,t)}{{\rm{d}}t}=0\end{eqnarray} \tag{ 2 }$

must be replaced by a generalized Liouville theorem:

$\begin{eqnarray}&&[\displaystyle \frac{{\rm{d}}}{{\rm{d}}t}+\hat{Q}]\rho ({X}_{M},M,t)=f({X}_{M},t)\end{eqnarray} \tag{ 3 }$

where

$\begin{eqnarray*}&&f({X}_{M},t)=\displaystyle \sum _{N=0}^{\infty }\int {\rm{d}}{X}_{N}^{\prime }[{K}_{{MN}}({X}_{M},{X}_{N}^{\prime })\rho ({X}_{N}^{\prime },N,t)]\end{eqnarray*}$

and

$\begin{eqnarray*}&&\hat{Q}(\ast )=\displaystyle \sum _{N=0}^{\infty }\int {\rm{d}}{X}_{N}^{\prime }[{K}_{{NM}}({X}_{N}^{\prime },{X}_{M}),\ast ].\end{eqnarray*}$

The generalized Liouville theorem expresses the fact that there is a probability flux in and out of the system as a result of the interaction with the reservoir which induces the change from N to M particles. The determinism of the Liouville equation, which characterizes a closed system, is now replaced by a stochastic evolution in time.

It is convenient to retain the original formulation of the Liouville theorem and define an extended Liouville operator (Bergmann–Lebowitz Liouville operator):

$\begin{eqnarray}&&{\rm{i}}{L}_{\mathrm{BL}}^{M}=\{\ast ,{H}_{M}\}+\hat{R}(\ast )\end{eqnarray} \tag{ 4 }$

where

$\begin{eqnarray*}&&\hat{R}(\ast )=\displaystyle \sum _{N=0}^{\infty }\int {\rm{d}}{X}_{N}^{\prime }[{K}_{{MN}}({X}_{M},{X}_{N}^{\prime })(\ast ({X}_{N}^{\prime },N,t))-{K}_{{NM}}({X}_{N}^{\prime },{X}_{M})(\ast ({X}_{M},M,t))].\end{eqnarray*}$

This allows us to formally write the Liouville theorem as in the standard case, namely,

$\begin{eqnarray}&&\displaystyle \frac{\partial \rho ({X}_{M},M,t)}{\partial t}+{\rm{i}}{L}_{\mathrm{BL}}^{M}\rho ({X}_{M},M,t)=0.\end{eqnarray} \tag{ 5 }$

If the kernel satisfies the following integral condition (flux balance)

$\begin{eqnarray}&&\displaystyle \sum _{N=0}^{\infty }\int [{{\rm{e}}}^{-\beta H({X}_{N}^{\prime })+\beta \mu N}{K}_{{MN}}({X}_{M},{X}_{N}^{\prime })-{K}_{{NM}}({X}_{N}^{\prime },{X}_{M}){{\rm{e}}}^{-\beta H({X}_{M})+\beta \mu M}]{\rm{d}}{X}_{N}^{\prime }=0,\end{eqnarray} \tag{ 6 }$

then the stationary grand ensemble is the grand canonical ensemble with density

$\begin{eqnarray*}&&{\rho }_{M}({X}_{M},M)=\displaystyle \frac{1}{Q}{{\rm{e}}}^{-\beta {H}_{M}({X}_{M})+\beta \mu M},\end{eqnarray*}$

with $\beta =1/({k}_{{\rm{B}}}T)$ the inverse temperature, μ the chemical potential and

$\begin{eqnarray*}&&Q=\displaystyle \sum _{M=0}^{\infty }{{\rm{e}}}^{\beta \mu M}\int {{\rm{e}}}^{-\beta {H}_{M}({X}_{M})}{\rm{d}}{X}_{M}.\end{eqnarray*}$
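As a brief check, added here as a sketch rather than taken from [4], one can verify directly that condition (6) is sufficient for stationarity. Because ${\rho }_{M}$ depends on X_M only through ${H}_{M}$ , its Poisson bracket with the Hamiltonian vanishes:

$\begin{eqnarray*}&&\{{\rho }_{M},{H}_{M}\}=\displaystyle \frac{{\rm{d}}{\rho }_{M}}{{\rm{d}}{H}_{M}}\{{H}_{M},{H}_{M}\}=0.\end{eqnarray*}$

Inserting the grand canonical density into the exchange term of equation (1) and factoring out 1/Q gives exactly the left-hand side of (6):

$\begin{eqnarray*}&&\displaystyle \sum _{N=0}^{\infty }\int {\rm{d}}{X}_{N}^{\prime }[{K}_{{MN}}({X}_{M},{X}_{N}^{\prime }){\rho }_{N}({X}_{N}^{\prime },N)-{K}_{{NM}}({X}_{N}^{\prime },{X}_{M}){\rho }_{M}({X}_{M},M)]\\ &&=\displaystyle \frac{1}{Q}\displaystyle \sum _{N=0}^{\infty }\int [{{\rm{e}}}^{-\beta H({X}_{N}^{\prime })+\beta \mu N}{K}_{{MN}}({X}_{M},{X}_{N}^{\prime })-{K}_{{NM}}({X}_{N}^{\prime },{X}_{M}){{\rm{e}}}^{-\beta H({X}_{M})+\beta \mu M}]{\rm{d}}{X}_{N}^{\prime }=0.\end{eqnarray*}$

Both contributions to $\partial \rho /\partial t$ in equation (1) therefore vanish, so the grand canonical density does not change in time.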

The flux balance (6) is both a necessary and sufficient condition for stationarity with respect to the grand canonical distribution. In such a case, due to the fact that

$\begin{eqnarray*}&&\displaystyle \sum _{N=0}^{\infty }\int {\rm{d}}{X}_{N}^{\prime }[{K}_{{MN}}({X}_{M},{X}_{N}^{\prime })\rho ({X}_{N}^{\prime },N,t)-{K}_{{NM}}({X}_{N}^{\prime },{X}_{M})\rho ({X}_{M},M,t)]=0,\end{eqnarray*}$

the BL Liouville operator is formally reduced to the standard Liouvillian

$\begin{eqnarray}&&{\rm{i}}{L}^{M}=\{\ast ,{H}_{M}\}\end{eqnarray} \tag{ 7 }$

that is a Liouvillian corresponding to a Hamiltonian which propagates the system in time with a variable number of particles (time dependent, stochastically regulated). As a consequence the BL Liouville equation is formally reduced to the standard Liouville equation,

$\begin{eqnarray}&&\displaystyle \frac{\partial \rho ({X}_{M},M,t)}{\partial t}=\{H({X}_{M}),\rho ({X}_{M},M,t)\},\end{eqnarray} \tag{ 8 }$

with the number of particles being a stochastic process.
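The flux balance condition (6) can be illustrated on a small discrete toy model. The sketch below is not the kernel of the BL model or of any MD code: it replaces phase space by a handful of hypothetical states, each with an assumed energy and particle number, and builds a Metropolis-type rate matrix (an illustrative choice) that allows only single-particle insertions and deletions. The function `flux_imbalance` is then the discrete analog of the left-hand side of (6); all names and numerical values are assumptions made for this example.

```python
import numpy as np

def grand_canonical_weights(energies, numbers, beta, mu):
    """Unnormalized weights exp(-beta*E + beta*mu*N), one per discrete state."""
    e = np.asarray(energies, dtype=float)
    n = np.asarray(numbers, dtype=float)
    return np.exp(-beta * e + beta * mu * n)

def metropolis_kernel(weights, numbers):
    """Rate matrix K[m, n]: rate of jumping from state n to state m.

    Jumps are allowed only between states whose particle numbers differ by
    exactly one (insertion or deletion). The Metropolis form
    min(1, w_m / w_n) is an illustrative assumption.
    """
    n_states = len(weights)
    K = np.zeros((n_states, n_states))
    for m in range(n_states):
        for n in range(n_states):
            if abs(numbers[m] - numbers[n]) == 1:
                K[m, n] = min(1.0, weights[m] / weights[n])
    return K

def flux_imbalance(K, weights):
    """Discrete analog of the left-hand side of eq. (6), one entry per state m."""
    gain = K @ weights               # sum_n K[m, n] * w_n  (flux into m)
    loss = K.sum(axis=0) * weights   # sum_n K[n, m] * w_m  (flux out of m)
    return gain - loss

# Hypothetical toy states: (energy, particle number)
energies = [0.0, 0.3, 0.1, 0.8, 0.5, 1.2]
numbers = [0, 1, 1, 2, 2, 3]
beta, mu = 1.0, 0.4

w = grand_canonical_weights(energies, numbers, beta, mu)
K = metropolis_kernel(w, numbers)
residual = flux_imbalance(K, w)
print("max |flux imbalance| =", np.abs(residual).max())
```

With this kernel each pair of states satisfies detailed balance, which is a stronger condition than the integral balance (6), so the residual vanishes up to rounding. Doubling a single rate, for instance `K[1, 0]`, breaks the balance, and the grand canonical weights are then no longer stationary for the modified kernel.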

3. Molecular dynamics of subsystems with a varying number of molecules

MD methods with a varying number of particles have been developed mostly for the calculation of the excess chemical potential following the Widom insertion or the thermodynamic integration techniques [15, 16]. Such methods describe the effect of inserting or deleting a molecule in a system of N molecules; they are computationally rather demanding and the calculation of the excess chemical potential is the only aim of such studies. An extension of such techniques is that of hybrid MD/MC methods, in which the dynamical evolution of the MD system is interfaced with MC moves which insert or remove particles and then equilibrate the system locally before the next MD step is carried out (see discussion in [16] and references therein). Such an approach is not optimal and is computationally expensive; in fact, each insertion would have a cost of the order of that of Widom-like techniques for the calculation of the chemical potential.

Fully MD grand canonical schemes that have been developed in the past did not gain popularity due to their computational costs and a certain conceptual and theoretical artificiality. A pioneering attempt was made by Pettitt and collaborators [8, 17]; see also the work of Lo and Palmer [18]. The method is based on the introduction of an additional dynamic variable s that represents the number of additional particles. At any instant the total number of molecules of the system can be written as $N+s$ and s, the new variable, corresponds to a fractional number depending on the degree of presence of an additional molecule. An extended Hamiltonian is then derived and equations of motion for $N+s$ variables are derived, moreover the knowledge a priori of excess chemical potentials is required at least when the molecular species are more than one (e.g. mixtures). It has been shown that such an approach was not optimal when applied to liquid water [19] and further improvements were implemented in extended versions such as that of Eslami and Müller–Plathe [16]. In our assessment, the method of [16] represents a substantial improvement over previous methods with regard to numerical robustness, nonetheless, it did not meet expectations and the number of applications presented in the literature is rather limited. In our view the idea of fractional particles is conceptually very appealing, but it introduces extra computational costs together with a more complex situation regarding the numerical stability of the algorithm and its implementation into pre-existing computational architectures of flexible (popular) MD codes. Later on, with the increasing success of multiscale MD techniques and the development of concurrent coupling techniques, a new generation of algorithms entered into the game [10]. Such a category is that of adaptive molecular resolution techniques. 
The idea common to all methods in this category is the definition of two main open boundary regions, one at high resolution (e.g. atomistic) and one at coarse-grained level (spherical liquid); they are interfaced by a smaller region where molecules crossing the border acquire or lose their high resolution degrees of freedom. Molecules in the different regions are coupled via space-dependent intermolecular forces [20, 21, 24], Hamiltonians [25, 26] or Lagrangians [27]. Each of these algorithms can, in principle, be easily converted to a grand canonical MD scheme if (1) the coarse-grained region is large enough to assure physically realistic particle number density fluctuations and (2) the high resolution region is large enough to be of statistical relevance⁴. The computational efficiency of these kinds of techniques is provably superior to methods with a varying number of particles of the previous generation (see e.g. [24, 28]). From this perspective they represent a realistic pathway to future MD simulations in general, and in particular for those cases in which the variation in time of the number of particles or the physics of a subsystem is of high relevance.

In the next section we will focus on one of these techniques, developed by some of the authors within the last five to six years, with the specific aim of designing a general grand ensemble algorithm via adaptive resolution simulation. We will present the grand canonical adaptive resolution simulation (GC-AdResS) method and connect its principles to the model of Bergmann and Lebowitz. In the following, the importance of such a connection for the definition and calculation of equilibrium time correlation functions will be discussed and illustrated with numerical results.

4. Grand canonical-like adaptive resolution simulation (GC-AdResS): basic principles

The basic structure of the original AdResS [20] is based on an intuitive technical requirement, namely, the construction of a numerical scheme which allows the system to pass smoothly from an atomistic to a coarse-grained dynamic evolution in space in such a way that the dynamics of the atomistic part is not perturbed significantly by the dynamics of the coarse-grained part and vice versa. The flow of molecules between the two regions must be constructed in such a way that the exchange happens under conditions of thermodynamic equilibrium; it is expected that static and dynamical properties of the atomistic region must be the same as in an equivalent subsystem of a fully atomistic reference simulation. The construction of such a numerical machine is reported step by step below:

The space is partitioned into three regions: one characterized by atomistic resolution (AT), one characterized by coarse-grained (usually spherical) resolution (CG), and a relatively small interface region with hybrid resolution (transition region or hybrid region, Δ or HY).
Molecules in the different regions are smoothly coupled through a spatial interpolation formula for the forces:
$\begin{eqnarray}&&{{\bf{F}}}_{i,j}=w({{\bf{r}}}_{i})w({{\bf{r}}}_{j}){{\bf{F}}}_{i,j}^{{\rm{AT}}}+[1-w({{\bf{r}}}_{i})w({{\bf{r}}}_{j})]{{\bf{F}}}_{i,j}^{\mathrm{CG}}\end{eqnarray} \tag{ 9 }$
where i and j indicate two molecules, ${{\bf{F}}}^{{\rm{AT}}}$ is the force corresponding to the atomistic interactions ( ${U}_{{\rm{AT}}}$ ) (e.g. standard Lennard–Jones or Coulomb atomistic potential) and ${{\bf{F}}}^{\mathrm{CG}}$ is the force corresponding to the coarse-grained interaction potential ${U}_{\mathrm{CG}}$ (e.g. standard COM-COM potential, where COM stands for 'center of mass'), ${\bf{r}}$ is the COM position of the molecule and $w(x)$ is a smooth function, defined over the transition region (Δ), which goes from 0 to 1 (or vice versa). It acts in such a way that the lower resolution is slowly transformed into the high resolution (or vice versa), as illustrated in figure 1.
A thermodynamic force, defined via first principles of thermodynamics, acts on the COM of each molecule and a thermostat is added to assure the overall thermodynamic equilibrium at the chosen temperature. The thermodynamic force is derived in such a way that: ${p}_{{\rm{AT}}}+{\rho }_{0}{\displaystyle \int }_{\Delta }{{\bf{F}}}_{{\rm{th}}}({\bf{r}}){\rm{d}}{\bf{r}}={p}_{\mathrm{CG}}$ , where ${p}_{{\rm{AT}}}$ is the chosen pressure of the atomistic system (region), ${p}_{\mathrm{CG}}$ is the pressure of the coarse-grained model, ${\rho }_{0}$ is the chosen molecular density of the atomistic system (region) [29] (the explicit expression of ${{\bf{F}}}_{{\rm{th}}}({\bf{r}})$ will be specified later on). A thermostat is added to take care of the loss/gain of energy in the transition region. This is the first step to pass from the original intuitive idea of AdResS to a well founded grand canonical framework of the method. In the original AdResS setup, the thermostat acts over the whole system (see top panel of figure 1), in this work the idea has been developed further and in order to match the requirements of the reservoir of the BL model for the calculation of equilibrium time correlation functions, we have constructed a setup in which the thermostat is applied to the reservoir only (i.e. hybrid and coarse-grained region); see bottom panel of figure 1.
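The force interpolation of equation (9) can be sketched in a few lines. This is a minimal illustration, not the implementation used in any AdResS code: it assumes a one-dimensional resolution change along x centred at the origin, and a cos² ramp for $w(x)$ in the hybrid region, which is an assumption of this example since the text only requires a smooth function going from 0 to 1. The names `weight`, `interpolated_pair_force`, `half_at` and `width_hy` are hypothetical.

```python
import numpy as np

def weight(x, half_at, width_hy):
    """Smooth switching function w(x) along one axis, centred at x = 0.

    w = 1 for |x| <= half_at (AT region), w = 0 for
    |x| >= half_at + width_hy (CG region), and a cos^2 ramp in between
    (HY region). The cos^2 form is an assumption made for this sketch.
    """
    d = abs(x) - half_at
    if d <= 0.0:
        return 1.0
    if d >= width_hy:
        return 0.0
    return float(np.cos(0.5 * np.pi * d / width_hy) ** 2)

def interpolated_pair_force(x_i, x_j, f_at, f_cg, half_at, width_hy):
    """Eq. (9): F_ij = w_i w_j F_ij^AT + (1 - w_i w_j) F_ij^CG."""
    lam = weight(x_i, half_at, width_hy) * weight(x_j, half_at, width_hy)
    f_at = np.asarray(f_at, dtype=float)
    f_cg = np.asarray(f_cg, dtype=float)
    return lam * f_at + (1.0 - lam) * f_cg

# Hypothetical pair forces between the COMs of molecules i and j
f_at = np.array([1.0, 0.0, 0.0])
f_cg = np.array([0.2, 0.0, 0.0])
half_at, width_hy = 2.0, 1.0

f_both_at = interpolated_pair_force(0.5, -1.0, f_at, f_cg, half_at, width_hy)
f_one_cg = interpolated_pair_force(0.5, 4.0, f_at, f_cg, half_at, width_hy)
f_one_hy = interpolated_pair_force(0.5, 2.5, f_at, f_cg, half_at, width_hy)
f_reverse = interpolated_pair_force(2.5, 0.5, -f_at, -f_cg, half_at, width_hy)
```

Because the prefactor $w({{\bf{r}}}_{i})w({{\bf{r}}}_{j})$ is symmetric under exchange of i and j, the interpolated pair force obeys Newton's third law whenever the atomistic and coarse-grained pair forces do, which is why the interpolation is carried out on forces between pairs.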

**Figure 1.** Pictorial representation of the AdResS scheme; CG indicates the coarse-grained region, HY the hybrid region where atomistic and coarse-grained forces are interpolated via a space dependent, slowly varying, function $w(x)$ , and AT the atomistic region (that is the region of interest). Top, the standard setup with the thermostat that acts globally on the whole system. Bottom, the 'local' thermostat technique employed in this work.

In [24] and [21] necessary conditions in $\Delta$ were derived so that the spatial probability distribution in the atomistic region was close to that of a fully atomistic reference system up to a certain chosen order. The probability distribution is that of a grand canonical ensemble, hence the name grand-canonical-AdResS (GC-AdResS). We define the mth order statistics of a joint probability distribution of M molecules, $p({{\bf{r}}}_{1},\cdots ,{{\bf{r}}}_{M})$ , as

$\begin{eqnarray}&&{p}^{(m)}({{\bf{r}}}_{1},\cdots ,{{\bf{r}}}_{m})=\int p({{\bf{r}}}_{1},\cdots ,{{\bf{r}}}_{m},{{\bf{r}}}_{m+1},\cdots ,{{\bf{r}}}_{M})\;{\rm{d}}{{\bf{r}}}_{m+1}\cdots {\rm{d}}{{\bf{r}}}_{M}.\end{eqnarray} \tag{ 10 }$

The molecular number density $\rho ({\bf{r}})$ corresponds to the first order, the radial distribution function to the second, three-body distributions to the third order statistics and so on; examples of how the statistics in the atomistic region are reproduced will be shown later on. We emphasize that, by construction of the method, the accuracy in the atomistic region is independent of the accuracy of the coarse-grained model; thus, in the coarse-grained region, one can use a generic liquid of spheres whose only requirement is that it has the same molecular density as the reference system (i.e. we need only to know the distribution of the reservoir and not its microscopic details, which is in accordance with the basic principle of construction of the BL reservoir). It was numerically demonstrated for the case of liquid water that the target grand canonical distribution, numerically defined as the probability distribution of a subsystem (of the size of the atomistic region in GC-AdResS) in a large fully atomistic simulation, is accurately reproduced to (at least) third order. To complete the idea of grand canonical-like setup, it was shown that the sum of the work of ${{\bf{F}}}_{{\rm{th}}}({\bf{r}})$ and of the thermostat in the transition region is equivalent to the difference of the chemical potentials between the atomistic and coarse-grained resolution (at the given thermodynamic conditions). Details will be given later on.
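The marginalization in equation (10) can be made concrete on a discretized toy distribution. In the sketch below, which is only an illustration, the joint probability of M molecules is stored as an array with one axis per molecule, each axis indexing a position bin, and the mth order statistics are obtained by summing out the last M − m axes. The function name `marginal`, the bin count and the random test distribution are assumptions of this example; normalization conventions for densities and distribution functions used in practice are not addressed.

```python
import numpy as np

def marginal(joint, m):
    """m-th order statistics of eq. (10) for a discretized joint distribution.

    `joint` has one axis per molecule (M axes in total), each axis indexing
    a position bin; the last M - m axes are summed out.
    """
    M = joint.ndim
    if not 0 < m <= M:
        raise ValueError("m must satisfy 0 < m <= M")
    if m == M:
        return joint
    return joint.sum(axis=tuple(range(m, M)))

# Hypothetical joint distribution of M = 3 molecules over 4 position bins
rng = np.random.default_rng(0)
joint = rng.random((4, 4, 4))
joint /= joint.sum()

p1 = marginal(joint, 1)   # first order: one-body distribution
p2 = marginal(joint, 2)   # second order: two-body distribution
```

Lower orders follow from higher ones by further marginalization, for example `p2.sum(axis=1)` reproduces `p1`, so agreement up to third order implies agreement at first and second order as well.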

The construction of a thermostat that acts only in the hybrid and CG regions makes the reservoir of GC-AdResS the effective technical translation of the reservoir hypothesized by Bergmann and Lebowitz in their model. A detailed discussion of the validity of the approximations of the method in the light of the theoretical hypothesis of the BL model is outlined in the next section.

5. Bergmann–Lebowitz model and GC-AdResS

In this section we analyze the correspondence between the BL model and GC-AdResS, more specifically we will discuss the possible mathematical mapping between the formulas of the two models and analyze the corresponding algorithmic meaning.

5.1. Mapping the Hamiltonian of the AT region

For the ith molecule, at position ${{\bf{r}}}_{i}$ in the AT region of AdResS (hereafter named 'system'), we have $w({{\bf{r}}}_{i})=1$ , thus the corresponding force can be divided into two contributions; one is the force generated by the interaction of molecule i with molecules of the AT region:

$\begin{eqnarray}&&{{\bf{F}}}_{i,j}={{\bf{F}}}_{i,j}^{{\rm{AT}}},\forall j\in {\rm{AT}}\end{eqnarray} \tag{ 11 }$

and one is the force generated by the interaction with molecules of the reservoir

$\begin{eqnarray}&&{{\bf{F}}}_{i,j}=w({{\bf{r}}}_{j}){{\bf{F}}}_{i,j}^{{\rm{AT}}}+[1-w({{\bf{r}}}_{j})]{{\bf{F}}}_{i,j}^{\mathrm{CG}},\forall j\in \Delta +\mathrm{CG}.\end{eqnarray} \tag{ 12 }$
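As a short check, both cases follow from equation (9) once $w({{\bf{r}}}_{i})=1$ is inserted:

$\begin{eqnarray*}&&{{\bf{F}}}_{i,j}=1\cdot w({{\bf{r}}}_{j}){{\bf{F}}}_{i,j}^{{\rm{AT}}}+[1-1\cdot w({{\bf{r}}}_{j})]{{\bf{F}}}_{i,j}^{\mathrm{CG}}=w({{\bf{r}}}_{j}){{\bf{F}}}_{i,j}^{{\rm{AT}}}+[1-w({{\bf{r}}}_{j})]{{\bf{F}}}_{i,j}^{\mathrm{CG}}.\end{eqnarray*}$

For j in the AT region, $w({{\bf{r}}}_{j})=1$ and the coarse-grained term drops out, which gives equation (11). For j in the CG region, $w({{\bf{r}}}_{j})=0$ and only ${{\bf{F}}}_{i,j}^{\mathrm{CG}}$ survives; both terms contribute only when j lies in Δ.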

Equation (11) implies the possibility of expressing the force acting on molecule i in terms of the gradient of the atomistic potential:

$\begin{eqnarray}&&{{\bf{F}}}_{i}=\displaystyle \sum _{j\ne i}{{\bf{F}}}_{i,j}^{{\rm{AT}}}=\displaystyle \sum _{j\ne i}{\nabla }_{i}{U}_{{\rm{AT}}}\end{eqnarray} \tag{ 13 }$

where ${\nabla }_{i}$ is the gradient w.r.t. molecule i. Equation (12) expresses instead the action of molecules of the reservoir on molecule i, that is an external force. The system–reservoir coupling term of equation (12) rules out the existence of a microscopic Hamiltonian for the system (embedded in the reservoir) and thus impedes a straightforward correspondence between the BL Hamiltonian, H_M, of equation (7) (or $H({X}_{M})$ of equation (8)) and the Hamiltonian of the AT region, H_AT, of the AdResS model. However, here we want to advocate the view that the AdResS model can be mapped to the BL framework, even though a rigorous derivation of the BL kernel from a microscopic model is beyond the scope of this paper. We will provide numerical evidence for this point of view later on in the text. Roughly speaking, one may argue that the nonintegrable part of the dynamics in the HY region represents a boundary effect that can be absorbed in the definition of the transition kernel. To elaborate on this point, we first notice that equation (12) can be recast as:

$\begin{eqnarray}&&{{\bf{F}}}_{i}=\displaystyle \sum _{j\in \Delta +\mathrm{CG}}[w({{\bf{r}}}_{j}){{\bf{F}}}_{i,j}^{{\rm{AT}}}+[1-w({{\bf{r}}}_{j})]{{\bf{F}}}_{i,j}^{\mathrm{CG}}]=\displaystyle \sum _{j\in \Delta +\mathrm{CG}}[w({{\bf{r}}}_{j}){\nabla }_{i}{U}_{\mathrm{AT}}+[1-w({{\bf{r}}}_{j})]{\nabla }_{i}{U}_{\mathrm{CG}}].\end{eqnarray} \tag{ 14 }$

Hence the net force on the ith particle can be considered as a (nonlocal) gradient field that is instantaneously produced by the external field generated by the other molecules. As a consequence, the energy of the ith molecule at time $t\gt 0$ associated with the coupling force of equation (14) can be defined as

$\begin{eqnarray}&&{W}_{\mathrm{AT}-\mathrm{RES}}^{i}(t)=\displaystyle \sum _{j\in \Delta +\mathrm{CG}}[w({{\bf{r}}}_{j}){U}_{\mathrm{AT}}^{{ij}}+[1-w({{\bf{r}}}_{j})]{U}_{\mathrm{CG}}^{{ij}}],\end{eqnarray} \tag{ 15 }$

where the ${U}_{\cdot }^{{ij}}$ represent the interaction energies between molecule i at position ${{\bf{r}}}_{i}$ and the other molecules sitting at ${{\bf{r}}}_{j}$ . The total energy in the system at time t is then defined as

$\begin{eqnarray}&&{W}_{\mathrm{AT}-\mathrm{RES}}(t)=\displaystyle \sum _{i\in \mathrm{AT}}{W}_{\mathrm{AT}-\mathrm{RES}}^{i}(t).\end{eqnarray} \tag{ 16 }$

The quantity of equation (16) should be compared to the amount of energy, ${W}_{\mathrm{AT}-\mathrm{AT}}$ , corresponding to the interaction between molecules of the AT region only: ${W}_{\mathrm{AT}-\mathrm{AT}}(t)=\displaystyle \sum _{i\lt j}{U}_{\mathrm{AT}}^{{ij}};i,j\in \mathrm{AT}$ . If

$\begin{eqnarray}&&\displaystyle \frac{| {W}_{\mathrm{AT}-\mathrm{AT}}(t)| -| {W}_{\mathrm{AT}-\mathrm{RES}}(t)| }{| {W}_{\mathrm{AT}-\mathrm{AT}}(t)| }\approx 1;\forall t\end{eqnarray} \tag{ 17 }$

then it seems reasonable to approximate the total energy of the atomistic system by the Hamiltonian of the AT region,

$\begin{eqnarray}&&{H}_{\mathrm{AT}}\approx {H}_{\mathrm{AT}-\mathrm{AT}}\;\end{eqnarray} \tag{ 18 }$

which corresponds to the microscopic Hamiltonian ${H}_{{\rm{M}}}$ of the BL model. For all practical purposes, equation (17) holds true when the HY region can be considered thin compared to the AT region and when the AT region is large. In this case, given the typical cutoff radius of interactions across the HY region, there is no direct interaction between the AT region and the CG region. However, equation (17) may not hold under more realistic conditions as they are routinely used in AdResS simulation, with a not too large AT region and an HY region that is not too thin so as to avoid numerically stiff systems. Figure 2 displays the behaviour of ${W}_{\mathrm{AT}-\mathrm{AT}}(t)$ and ${W}_{\mathrm{AT}-\mathrm{RES}}(t)$ for a system of 5000 molecules (about 450 in the AT region) that represents a worst case scenario in this regard. We observe that ${W}_{\mathrm{AT}-\mathrm{AT}}(t)$ is at least one order of magnitude larger than ${W}_{\mathrm{AT}-\mathrm{RES}}(t)$ , so that the modeling error in terms of equilibrium expectation values that arises from replacing ${H}_{{\rm{M}}}$ of the BL model by ${H}_{\mathrm{AT}-\mathrm{AT}}$ is about $10\%$ . This estimate is clearly an upper bound for the model error and the neglected terms can be remodeled by an appropriate choice or parametrization of the kernel, as will be discussed in the next paragraph. A numerical test with a system close to the ideal condition of thermodynamic limit (100000 molecules, with 20000 in the AT region) shows that the energy contribution ${W}_{\mathrm{AT}-\mathrm{RES}}(t)$ is less than $1\%$ . Hence, for all practical purposes, ${H}_{{\rm{M}}}={H}_{\mathrm{AT}-\mathrm{AT}}$ fully specifies the microscopic characteristics of the AT system.
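As an illustration of how the quantities of equations (15)–(17) could be evaluated from stored pair energies, the following is a minimal sketch and not the implementation used for figure 2. The function names and the array layout (one row per AT molecule, one column per molecule of the hybrid or CG region) are assumptions made for this example only.

```python
def coupling_energy(weights, u_at, u_cg):
    """Equations (15)-(16): coupling energy summed over AT molecules i.

    weights[j] : value of w(r_j) for reservoir molecule j (hybrid or CG region)
    u_at[i][j] : atomistic pair energy U_AT^ij
    u_cg[i][j] : coarse-grained pair energy U_CG^ij
    (hypothetical data layout, chosen for illustration)
    """
    total = 0.0
    for row_at, row_cg in zip(u_at, u_cg):                  # loop over i in AT
        for w, e_at, e_cg in zip(weights, row_at, row_cg):  # loop over j
            total += w * e_at + (1.0 - w) * e_cg
    return total


def relative_measure(w_at_at, w_at_res):
    """Left-hand side of equation (17) for one time frame."""
    return (abs(w_at_at) - abs(w_at_res)) / abs(w_at_at)
```

Evaluating `relative_measure` frame by frame gives the kind of time series shown in the inset of figure 2; values close to 1 indicate that the coupling energy is small compared to the AT–AT energy.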

**Figure 2.** Main figure: potential energy of the subsystem as a function of time, ${W}_{\mathrm{AT}-\mathrm{AT}}(t)$ , compared to the energy associated with the interaction between subsystem and reservoir, ${W}_{\mathrm{AT}-\mathrm{RES}}(t)$ ; the former is at least one order of magnitude larger than the latter. Inset: the relative effect of the interaction between the AT region and the reservoir as a function of time: $\displaystyle \frac{| {W}_{\mathrm{AT}-\mathrm{AT}}(t)| -| {W}_{\mathrm{AT}-\mathrm{RES}}(t)| }{| {W}_{\mathrm{AT}-\mathrm{AT}}(t)| }$ ; it can be clearly seen that the contribution is, at most, $10\%$ . It must be underlined that in a test done with a much larger system, the effect goes below $1.0\%$ .

5.2. The action of the reservoir and the interpretation of the transition kernel

We shall proceed with discussing the correspondence between the BL and GC-AdResS reservoirs and the role of the kernel. To this end we recall that, in the BL framework, ${K}_{{NM}}({X}_{N}^{\prime },{X}_{M})$ is the transition rate for the system in state X_M to make a transition to ${X}_{N}^{\prime }$ as a result of the interaction with the reservoir. Further, recall that (6) is both necessary and sufficient for the system to admit a unique stationary grand canonical distribution. This implies that (6) holds by construction for GC-AdResS, which is ergodic with respect to the grand canonical distribution. This clearly does not uniquely determine the transition kernel, nor does it guarantee its existence, but we will discuss how the transition kernel can be interpreted within the GC-AdResS framework.

The influence of the GC-AdResS reservoir on the dynamics in the AT region comprises three contributions: (a) the thermostat, (b) the thermodynamic force, and (c) the coupling force (14). Firstly, the function of the thermostat is that of assuring thermal stability of the reservoir and, as a consequence, of the system. Thermal stability is guaranteed by irreducibility of the kernel, so that it is possible to go from any region of the AT phase space to any other region with a positive probability [22]. A slightly stronger condition is that the dynamics are ergodic, which is guaranteed by the recurrence of the dynamics, i.e., every phase space region is visited infinitely often with a positive probability. We should emphasize that this condition is known to be false for almost all deterministic Hamiltonian systems except for certain billiards and geodesic flows on surfaces of constant negative mean curvature; therefore we use a gentle stochastic thermostat in AdResS. We refrain from going into detail here and instead refer to [23] for a discussion of this issue.

Secondly, the thermodynamic force is computed via the following iterative procedure:

$\begin{eqnarray}&&F{}_{k+1}^{{th}}(x)={F}_{k}^{{th}}(x)-\displaystyle \frac{{M}_{\alpha }}{{[{\rho }_{o}]}^{2}\kappa }\nabla {\rho }_{k}(x).\end{eqnarray} \tag{ 19 }$

The fixed point iteration converges locally as the density profile across the HY region becomes flat. This requires an exchange of particles between the AT and the CG regions; hence the thermodynamic force has the effect that the number of particles in the AT region varies in such a way that the average number density is constant (equal to the fixed target density). This also means that, by transporting the action of the thermostat, the effect of ${F}_{{th}}(x)$ is to impose the stationary distribution of the reservoir at the first order ( $\rho (x)$ ), independently of the interaction between the reservoir and the system; this condition is equivalent to the main condition requested/satisfied by the reservoir in the BL model. The computation of the thermodynamic force corresponds to the equilibration procedure of GC-AdResS; once the fixed-point iteration has converged (which it does at least locally), the obtained force is used for the simulations of production runs. The chemical potential, $\mu ={\mu }_{{\rm{AT}}}$ , in (6) is then automatically determined according to the equation (see [24, 28] for details)

$\begin{eqnarray}&&{\mu }_{\mathrm{CG}}={\mu }_{{\rm{AT}}}+{\omega }_{{\rm{th}}}+{\omega }_{{\rm{Q}}},\end{eqnarray} \tag{ 20 }$

where ${\omega }_{{\rm{th}}}={\displaystyle \int }_{\Delta }{{\bf{F}}}_{{\rm{th}}}({\bf{r}}){\rm{d}}{\bf{r}}$ and ${\omega }_{{\rm{Q}}}={\displaystyle \int }_{\Delta }\nabla w({\bf{r}})\langle w({U}^{{\rm{AT}}}-{U}^{\mathrm{CG}}){\rangle }_{{\bf{r}}}{\rm{d}}{\bf{r}}+{\omega }_{\mathrm{gas}}$ , $w({\bf{r}})$ is the force interpolation function of equation (9) and ${\omega }_{\mathrm{gas}}$ is the chemical potential in the absence of intermolecular interactions and $\langle \cdot {\rangle }_{{\bf{r}}}$ indicates the conditional equilibrium average for fixed AT configurations.

Equation (20) is the minimal necessary condition that the GC-AdResS system should satisfy in order to have a grand canonical-like molecular dynamics, i.e. to satisfy the condition equation (6), and, as stated above, it is imposed by the thermodynamic force. The numerical verification that indeed the AT region of GC-AdResS behaves as a grand canonical ensemble is then made by comparing quantities calculated in the GC-AdResS AT system with those calculated in an equivalent subsystem of a fully atomistic reference system (see results in section 6.1). A subsystem in a fully atomistic simulation, if the subsystem and the total system are large enough, is a natural grand canonical system. It follows that if the reservoir in the fully atomistic reference system and the GC-AdResS reservoir have the identical insertion/deletion behaviour (equation (6)), they must spend the same amount of energy in insertion/deletion, i.e. have the same chemical potential difference between the AT region and the rest of the system. This implies that the condition of equation (6) in the BL model corresponds to equation (20) of the GC-AdResS model.
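A single update step of the iteration in equation (19) can be sketched as follows, assuming a one-dimensional density profile sampled on a uniform grid along x. This is only an illustrative sketch: the function names, the finite-difference gradient and the treatment of $M_{\alpha}$, $\rho_o$ and $\kappa$ as plain numerical inputs are assumptions of this example, not the VOTCA implementation.

```python
def density_gradient(rho, dx):
    """Finite-difference gradient of a density profile on a uniform grid.

    Central differences in the interior, one-sided differences at the two
    end points (an assumed discretization, chosen for simplicity).
    """
    n = len(rho)
    if n < 2:
        raise ValueError("need at least two grid points")
    grad = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        grad.append((rho[hi] - rho[lo]) / ((hi - lo) * dx))
    return grad


def update_thermodynamic_force(force, rho, dx, mass, rho_target, kappa):
    """One step of equation (19): F_{k+1} = F_k - M / (rho_o^2 kappa) * grad(rho_k)."""
    prefactor = mass / (rho_target ** 2 * kappa)
    grad = density_gradient(rho, dx)
    return [f - prefactor * g for f, g in zip(force, grad)]
```

In such a scheme the density profile `rho` would be measured in a short simulation run with the current force, and the update repeated until the profile is flat within a chosen tolerance; a flat profile leaves the force unchanged, which is the fixed point referred to above.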

Thirdly, in accordance with the above reasoning, the coupling force in (14) does not give a major energetic contribution to the AT interactions. Nevertheless it involves strong repulsive forces that prevent the molecules entering in the AT region from overlapping with molecules that are already in the AT region, which would produce (numerical) singularities that would automatically stop the simulation. This soft collision-avoidance has the effect that the smooth density of the transition kernel is exponentially decaying outside the admissible (non-overlapping) particle configurations. Hence, even though the coupling force can be conceptually neglected as far as the construction of the transition kernel is concerned, it plays a key role in the numerical simulation as it imposes collision-avoidance between AT and HY/CG particles in a robust and numerically efficient way.

Altogether, even though we cannot give a rigorous derivation of the BL kernel within the GC-AdResS framework, we have described how some of the properties of the kernel that guarantee well-posedness of the dynamics can be inferred from the properties of the various force contributions. It is unclear whether it is possible to write the kernel explicitly in terms of the forces. We shall argue that, even though such a direct link may not exist, it is still possible to realize the BL model numerically, and GC-AdResS does exactly this. For example, stochastic insertion/removal of molecules in the system (see [8, 16]) can be used to realize ${K}_{{NM}}({X}_{N}^{\prime },{X}_{M})$ in a Monte Carlo fashion. The basic idea is that a molecule is inserted in the system by searching a location that is close to a minimum free energy configuration followed by a local equilibration where the rate of insertion is defined by the chemical potential of the system in accordance with (6); equivalently, within the framework of GC-AdResS the random particle number fluctuations (in the AT region) are realized by the self-consistent iteration of the thermodynamic force.

5.3. Bergmann–Lebowitz model as conceptual guideline for the calculation of equilibrium time correlation functions in the GC-AdResS

According to popular textbooks of statistical mechanics and molecular simulation (see e.g. [13]), the general definition of the equilibrium time correlation function, ${C}_{{AB}}(t)$ between two physical observables, A and B is:

$\begin{eqnarray}{C}_{{AB}}(t)=\langle a(0)b(t)\rangle & = & \displaystyle \int {\rm{d}}{\bf{p}}{\rm{d}}{\bf{q}}f({\bf{p}},{\bf{q}})a({\bf{p}},{\bf{q}}){{\rm{e}}}^{{iLt}}b({\bf{p}},{\bf{q}})\\ & = & \displaystyle \int {\rm{d}}{\bf{p}}{\rm{d}}{\bf{q}}f({\bf{p}},{\bf{q}})a({\bf{p}},{\bf{q}})b({{\bf{p}}}_{t}({\bf{p}},{\bf{q}}),{{\bf{q}}}_{t}({\bf{p}},{\bf{q}}))\end{eqnarray} \tag{ 21 }$

where, $a({\bf{p}},{\bf{q}})$ and $b({\bf{p}},{\bf{q}})$ are phase space functions corresponding to the observables A and B respectively, $a(0)=a(t=0)$ and $b(t)$ is the function at time t, $f({\bf{p}},{\bf{q}})$ is the equilibrium distribution function and the dynamics is generated by the Liouville operator iL. The notation ${{\bf{p}}}_{t}({\bf{p}},{\bf{q}}),{{\bf{q}}}_{t}({\bf{p}},{\bf{q}})$ is taken from [13] and indicates the time evolution at time t of the momenta and positions with initial condition ${\bf{p}},{\bf{q}}$ . For a canonical ensemble the definition in equation (21) takes the explicit form:

$\begin{eqnarray}&&{C}_{{AB}}(t)=\displaystyle \frac{1}{{Q}_{N}}\int {\rm{d}}{\bf{p}}{\rm{d}}{\bf{q}}{{\rm{e}}}^{-\displaystyle \frac{{H}_{N}({\bf{p}},{\bf{q}})}{{kT}}}a({\bf{p}},{\bf{q}})b({{\bf{p}}}_{t}({\bf{p}},{\bf{q}}),{{\bf{q}}}_{t}({\bf{p}},{\bf{q}})).\end{eqnarray} \tag{ 22 }$

where Q_N is the canonical partition function and ${H}_{N}({\bf{p}},{\bf{q}})$ the Hamiltonian of a system with N (constant) molecules. According to equation (22), the numerical calculation of ${C}_{{AB}}(t)$ can be done by calculating $a({\bf{p}},{\bf{q}})$ and $b({{\bf{p}}}_{t}({\bf{p}},{\bf{q}}),{{\bf{q}}}_{t}({\bf{p}},{\bf{q}}))$ along each MD trajectory and averaging over all the data obtained. The trajectories must be long enough so that the basic requirements of ergodicity and statistical relevance of the data can be safely assumed. In such a case the dynamics generated by the Liouvillian operator is well defined, since the Liouville operator is well defined by the Hamiltonian of N molecules:

$\begin{eqnarray}&&{iL}=\displaystyle \sum _{j=1}^{N}[\displaystyle \frac{\partial H}{\partial {{\bf{p}}}_{j}}\displaystyle \frac{\partial }{\partial {{\bf{q}}}^{j}}-\displaystyle \frac{\partial H}{\partial {{\bf{q}}}^{j}}\displaystyle \frac{\partial }{\partial {{\bf{p}}}_{j}}]=\{\ast ,H\}.\end{eqnarray} \tag{ 23 }$

Now let us formally generalize equation (22) to the case of a grand canonical ensemble:

$\begin{eqnarray}&&{C}_{{AB}}(t)=\displaystyle \frac{1}{{Q}_{\mathrm{GC}}}\displaystyle \sum _{N}\int {\rm{d}}{{\bf{p}}}_{N}{\rm{d}}{{\bf{q}}}_{N}{{\rm{e}}}^{-\displaystyle \frac{[{H}_{N}({{\bf{p}}}_{N},{{\bf{q}}}_{N})-\mu N]}{{kT}}}a({{\bf{p}}}_{N},{{\bf{q}}}_{N})b({{\bf{p}}}_{t}({{\bf{p}}}_{N},{{\bf{q}}}_{N}),{{\bf{q}}}_{t}({{\bf{p}}}_{N},{{\bf{q}}}_{N})).\end{eqnarray} \tag{ 24 }$

where ${Q}_{\mathrm{GC}}$ is the grand canonical partition function, μ the chemical potential and N the number of particles (now varying in time) of the system. The difficulty lies in how to interpret the quantity $b({{\bf{p}}}_{t}({{\bf{p}}}_{N},{{\bf{q}}}_{N}),{{\bf{q}}}_{t}({{\bf{p}}}_{N},{{\bf{q}}}_{N}))$ . In fact, at a given time t the system evolved from its initial condition and it is likely to have a number of particles/molecules ${N}^{\prime }$ different from the initial state. The correspondence of GC-AdResS with the model of Bergmann and Lebowitz plays a key role for making sense of $b({{\bf{p}}}_{t}({{\bf{p}}}_{N},{{\bf{q}}}_{N}),{{\bf{q}}}_{t}({{\bf{p}}}_{N},{{\bf{q}}}_{N}))$ in the numerical simulation as equation (7) states that there exists a Liouvillian ${{iL}}^{M}$ , the action of which is to evolve the system from $({{\bf{p}}}_{N},{{\bf{q}}}_{N})$ to $({{\bf{p}}}_{t},{{\bf{q}}}_{t})$ with ${N}^{\prime }$ molecules. As we have argued, the operator ${{iL}}^{M}$ is well defined within the GC-AdResS framework. Thus the correspondence between the BL model and GC-AdResS leads to the following ready-to-use definition of the equilibrium time correlation functions for numerical simulations with GC-AdResS: 'if a molecule leaves the AT region in the observation time window, its contribution to the correlation function is neglected'. This principle is in agreement with the philosophy of the BL model, which asserts that a molecule entering into the reservoir loses its microscopic identity.
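The counting rule stated above can be made concrete with a small sketch for a velocity autocorrelation function. It is a hedged illustration rather than the analysis code used in this work: the function name, the trajectory layout (a list of frames, each a list of per-molecule velocity tuples) and the Boolean membership flags are assumptions, and the normalization by the summed zero-lag term is a simplification chosen for brevity.

```python
def open_system_vacf(velocities, inside_at, max_lag):
    """Normalized velocity autocorrelation with the 'neglect leavers' rule.

    velocities[t][i] : velocity tuple of molecule i at frame t
    inside_at[t][i]  : True if molecule i is in the AT region at frame t
    A molecule contributes to a window starting at t0 only if it stays in
    the AT region for all frames t0 .. t0 + max_lag.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    n_frames = len(velocities)
    n_mol = len(velocities[0])
    numer = [0.0] * (max_lag + 1)
    denom = 0.0
    for t0 in range(n_frames - max_lag):
        for i in range(n_mol):
            window = range(t0, t0 + max_lag + 1)
            if not all(inside_at[t][i] for t in window):
                continue  # molecule left the AT region: contribution neglected
            v0 = velocities[t0][i]
            denom += dot(v0, v0)
            for lag in range(max_lag + 1):
                numer[lag] += dot(velocities[t0 + lag][i], v0)
    if denom == 0.0:
        raise ValueError("no molecule stayed in the AT region for a full window")
    return [x / denom for x in numer]
```

Because only molecules that remain in the AT region over the whole window are counted, the number of contributing molecules decreases as `max_lag` grows, which is the link between decay time and spatial locality discussed in section 6.2.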

6. Numerical results

Here we report numerical results for liquid water (SPC/E model) at room conditions. The section is divided in two parts: the first is dedicated to the calculation of static properties with the intention of demonstrating numerically that GC-AdResS produces results typical of a natural grand canonical system (as defined before). The second part is dedicated to the calculation of the equilibrium time correlation functions. In such a case the exchange of particles with the reservoir poses, on the one hand, the conceptual question of how to define the Liouville operator of the atomistic region and, on the other hand, the practical question of how to count correlations when a molecule leaves the atomistic region or enters it. The theoretical concepts of section 2 actually give the guidelines to solve both problems. We will first prove that, with the definitions taken from section 2, GC-AdResS gives the same results as those of an open subsystem of a fully atomistic NVE simulation. Next, since in the thermodynamic limit all ensembles are equivalent, we expect, for physical consistency, that by increasing the size of the atomistic region, results systematically converge to those of a full NVE simulation where the calculations are performed over the whole system; the numerical results reported below confirm our expectations.

6.1. Static properties

Figures 3–5 and tables 1 and 2 show static properties calculated with local thermostat GC-AdResS compared to NVE full atomistic calculations of an equivalent subsystem. In particular, figures 3 and 4 show that GC-AdResS, with the current definition of reservoir, can properly reproduce the probability distribution of a natural grand canonical system at least up to second order. The difference with the results of [24] is that the transition region is considerably smaller and that the thermostat acts only in the reservoir. A few remarks in this regard are in order: in figure 3 the particle number density of GC-AdResS agrees in a satisfactory way with that of the NVE calculation; the largest deviation (below $5\%$ ) is at the border of the atomistic region with the hybrid region. This is due to the abrupt absence of the thermostat. The effect is negligible; however, there are three technical options which allow us to make the effects of such a difference even smaller: (a) apply the rigorous GC-AdResS protocol and consider an additional (but, differently from [24], negligible) atomistic buffer as part of the transition region, (b) require that the convergence of the thermodynamic force is stricter, (c) slowly switch off the thermostat in the transition region near the atomistic region. Here we have opted for the simpler option (a), because in any case the effects of this discrepancy on the calculation of physical quantities produce no more than $10\%$ of deviation compared to the reference data (see discussion below). Figure 5 reports the particle number probability distribution of the subsystem compared with an equivalent NVE subsystem; the shape of both curves is a Gaussian and the curve of GC-AdResS is indeed shifted compared to the NVE reference, but by only two particles. If we apply the rigorous GC-AdResS protocol and consider an additional (negligible) atomistic buffer, then the two curves essentially overlap, see figure 5 (bottom).
Table 1 shows the robustness of the method as a grand canonical setup for the calculation of thermodynamic properties, namely the energy fluctuation and the covariance between particle number and energy (see appendix for definitions and technical details). Regarding the accuracy, in the worst case the deviation is no more than $10\%$ , which would already be numerically satisfactory. However, if we apply the rigorous GC-AdResS protocol (as in figure 5 (bottom)) the maximum deviation falls to only $3\%$ , see table 2.

**Figure 3.** Molecular number density calculated with AdResS where the thermostat is acting only in the reservoir. Results are compared with the density obtained for an equivalent subsystem (1.2 nm) in a full atomistic NVE simulation. A discrepancy of about $5\%$ can be observed at the border of the AT region (vertical lines). Besides the fact that a discrepancy of $5\%$ is not dramatic, in general the rigorous application of GC-AdResS requires that this part of the hybrid region contains a buffer of fully atomistic molecules. Here we want to show that even in the worst case scenario, the numerical accuracy is still very high.

**Figure 4.** Oxygen–oxygen (top), oxygen–hydrogen (middle) and hydrogen–hydrogen (bottom) radial distribution functions calculated with AdResS where the thermostat is acting only in the reservoir. Such functions are compared with the results obtained for an equivalent subsystem (1.2 nm) in a fully atomistic NVE simulation and with the same quantity calculated over the entire system in the fully atomistic simulation; the agreement is highly satisfactory.

**Figure 5.** (Top) Particle number probability distribution of AdResS compared with the equivalent NVE subsystem. The subsystem employed for this calculation is an open subsystem embedded into the NVE global system (i.e. we consider only molecules in a subregion of the global NVE system). Such a subsystem has the same size of atomistic region as AdResS; it freely exchanges molecules with the rest of the system. The shape of both curves is a Gaussian (reference black continuous curve); the curve of AdResS is shifted compared to the NVE results of only two particles. However, if we consider the additional atomistic buffer (bottom), as it should be if the principles of GC-AdResS are rigorously applied, then the two curves overlap.

Table 1. Thermodynamic fluctuations calculated in atomistic subregion (EX = 1.2) in GC-AdResS and full-atom simulations. There is a discrepancy of around 5–10% between the results of GC-AdResS and those of the reference full-atom simulation.

Quantity	Full-atomistic	GC-AdResS
$\frac{\langle {E}^{2}\rangle -\langle E{\rangle }^{2}}{\langle E\rangle }$	20.6 ± 0.4	19.3 ± 0.4
$\frac{\langle {NE}\rangle -\langle N\rangle \langle E\rangle }{\langle N\rangle }$	4.4 ± 0.2	3.9 ± 0.2

Table 2. Same quantities as above calculated in the region excluding the (negligible) part where the density is $5\%$ off compared to the reference density, as discussed in figure 3. The numerical results in GC-AdResS and the full-atom simulation agree now within $3\%$ , which is highly satisfactory.

Quantity	Full-atomistic	GC-AdResS
$\frac{\langle {E}^{2}\rangle -\langle E{\rangle }^{2}}{\langle E\rangle }$	27.1 ± 0.5	26.4 ± 0.5
$\frac{\langle {NE}\rangle -\langle N\rangle \langle E\rangle }{\langle N\rangle }$	5.1 ± 0.2	4.9 ± 0.2
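For orientation, the two fluctuation measures listed in tables 1 and 2 can be estimated from time series of the energy and of the particle number of the subsystem as in the following minimal sketch. The function name and the use of plain sample means over equally weighted frames are assumptions of this example.

```python
def fluctuation_measures(energies, counts):
    """Return ((<E^2> - <E>^2) / <E>, (<NE> - <N><E>) / <N>) from time series.

    energies[t] : energy of the subsystem at frame t
    counts[t]   : number of molecules in the subsystem at frame t
    """
    n = len(energies)
    if n == 0 or n != len(counts):
        raise ValueError("need two non-empty series of equal length")
    mean_e = sum(energies) / n
    mean_n = sum(counts) / n
    mean_e2 = sum(e * e for e in energies) / n
    mean_ne = sum(c * e for c, e in zip(counts, energies)) / n
    energy_fluct = (mean_e2 - mean_e ** 2) / mean_e
    covariance = (mean_ne - mean_n * mean_e) / mean_n
    return energy_fluct, covariance
```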

An additional test was done in order to prove that GC-AdResS satisfies a thermodynamic condition of a grand canonical ensemble in the thermodynamic limit. In fact in the thermodynamic limit the isothermal compressibility, ${\kappa }_{T}$ , in a grand canonical ensemble, can be related to the fluctuations encoded in the particle number distributions [31]:

$\begin{eqnarray}&&\rho {k}_{{\rm{B}}}T{\kappa }_{T}=\displaystyle \frac{\langle {N}^{2}\rangle -{\langle N\rangle }^{2}}{\langle N\rangle }\end{eqnarray} \tag{ 25 }$

where ρ is the density of particles, ${k}_{{\rm{B}}}$ the Boltzmann constant and $T=298\;{\rm{K}}$ the temperature. The test was done for a system of 20000 molecules with a reservoir of 80000 (total number of molecules 100000) at a pressure of $1\;{\rm{atm}}$ ; in this case we obtained ${\kappa }_{T}=(44.6\pm 1.6)\times {10}^{-6}\;{\mathrm{bar}}^{-1}$ , which should be compared with the value of $(45.9\pm 1.2)\times {10}^{-6}\;{\mathrm{bar}}^{-1}$ of the corresponding fully atomistic system, with the value of about $45.25\times {10}^{-6}\;{\mathrm{bar}}^{-1}$ [32, 33] of experiments and with $44.0\times {10}^{-6}\;{\mathrm{bar}}^{-1}$ from NPT simulations of SPC/E water [33]; the overall accuracy is within $5\%$ (in the worst case), which can be considered a satisfactory result. It must also be underlined that an effective compressibility, equation (25), was found to be the same in GC-AdResS and in the fully atomistic simulation (see also [29]). Given the satisfactory tests for static properties, which prove that indeed the reservoir based on the local thermostat of GC-AdResS produces grand canonical statistics, we can now proceed with the calculations of equilibrium time correlation functions where the notion of the BL Liouville operator in the limit of a grand canonical ensemble comes into play in order to provide theoretical solidity to the numerical calculations.
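Equation (25) translates directly into an estimator of ${\kappa }_{T}$ from the recorded particle numbers of the open subsystem. The sketch below is illustrative only; the function name, the choice of SI input units and the conversion to ${\mathrm{bar}}^{-1}$ are assumptions of this example.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K


def compressibility_from_counts(counts, volume_m3, temperature_k):
    """Isothermal compressibility in bar^-1 from equation (25).

    counts[t]  : number of molecules in the subsystem at frame t
    volume_m3  : volume of the subsystem in cubic metres
    """
    n = len(counts)
    if n == 0:
        raise ValueError("empty particle-number series")
    mean_n = sum(counts) / n
    mean_n2 = sum(c * c for c in counts) / n
    fluct = (mean_n2 - mean_n ** 2) / mean_n       # right-hand side of (25)
    rho = mean_n / volume_m3                        # number density in m^-3
    kappa_pa = fluct / (rho * K_B * temperature_k)  # compressibility in Pa^-1
    return kappa_pa * 1.0e5                         # 1 bar = 1e5 Pa
```

As noted in the text, the relation holds in the thermodynamic limit, so for small subsystems the value obtained this way should be read as an effective compressibility.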

6.2. Dynamic properties

Here we report the numerical results of the application of GC-AdResS to the calculation of three relevant equilibrium time correlation functions for SPC/E water at room conditions. The GC-AdResS results are compared with the results obtained for an equivalent subsystem in a fully atomistic NVE simulation. Figure 6 shows the velocity-velocity autocorrelation function, ${C}_{\mathrm{VV}}(t)$ (top), (molecular) dipole-dipole autocorrelation function, ${C}_{\mu \mu }(t)$ (middle), reactive flux correlation function, $k(t)$ (bottom); the agreement between GC-AdResS and the fully atomistic NVE simulation is remarkable. This implies that the 'ideal' reservoir of the GC-AdResS method is very close to the thermodynamic limit of a microscopic system. A necessary condition of general validity of the concepts and calculations shown here is that, as the AT region of GC-AdResS increases, results must systematically converge to those obtained for the whole system of the fully atomistic NVE simulation. This principle corresponds to the fact that in the thermodynamic limit all the ensembles are equivalent. Figure 7 shows the systematic convergence of the curves to the fully atomistic reference as a function of the size of the AT subsystem of AdResS. A general remark valid when adaptive resolution is used as a multiscale technique rather than as grand canonical setup must be made: it must be noticed that the procedure defined above to calculate time correlation functions introduces a connection between the decay of a correlation function in time and the spatial locality of the process associated with such a decay. For example, in dense gases decay times are relatively large, thus if the size of the atomistic region is too small, many molecules are likely to leave such a region with the effect that the decay time would be shorter than the real one. 
In practical terms, a way to probe whether or not our method captures a certain decay process is to perform a study where the size of the atomistic region is systematically varied and observe the convergence of the correlation function of interest. At the same time it must also be noticed that the connection between decay times and spatial locality is not necessarily a limitation of the procedure, but actually represents one of its main conceptual advantages; in fact it allows us to identify the essential (atomistic) degrees of freedom (in space and time) required for a certain process.

**Figure 6.** Three relevant equilibrium time correlation functions for SPC/E water at room conditions calculated with GC-AdResS and for an equivalent subsystem in a fully atomistic NVE simulation; as before, velocity–velocity autocorrelation function, ${C}_{\mathrm{VV}}(t)$ , (molecular) dipole–dipole autocorrelation function, ${C}_{\mu \mu }(t)$ , reactive flux correlation function, $k(t)$ (semilogarithmic plot). The agreement between GC-AdResS and the fully atomistic simulation is highly satisfactory.

**Figure 7.** Systematic convergence of ${C}_{\mathrm{VV}}(t)$ , ${C}_{\mu \mu }(t)$ and $k(t)$ (semilogarithmic plot) of GC-AdResS to the fully atomistic NVE results calculated over the whole system.

7. Conclusions

We have discussed the BL model as a prototypical theoretical construction for describing the statistical mechanics of open systems. Despite its conceptual solidity, the model has not been employed or discussed in connection with the development of MD techniques with a varying number of molecules. As we have argued, however, the model turns out to be very useful as far as the conceptual validation of MD techniques is concerned. We have discussed its connection to the GC-AdResS MD technique and used its principles to define equilibrium time correlation functions for a system with a varying number of molecules. Numerical results for a relevant system, liquid water at room conditions, are highly promising. We have then discussed the computational efficiency and convenience of GC-AdResS. Given the technical robustness of GC-AdResS and its conceptual validation within the BL model, one can think from this perspective about moving forward and also approaching systems out of equilibrium, e.g. subject to an external perturbation. An example is given by biomolecules in solution whose conformational dynamics is driven by an external (electric) field, as in [30]. The response of the system to an external perturbation requires a numerical technique similar to that employed in the calculation of equilibrium time correlation functions; moreover, the region of microscopic interest is limited to the first two to three solvation shells of the molecule, which is an ideal test case for an AdResS-like technique. The study of open systems is gaining popularity and the development of techniques which are both computationally efficient and theoretically well founded is a necessity of modern research in the field of molecular simulation; GC-AdResS is such an example.

Acknowledgments

We thank Giovanni Ciccotti for countless clarifying discussions about the problem of the Liouville theorem in open systems and Luca Ghiringhelli and Matej Praprotnik for a critical reading of the manuscript. This work was supported by the Deutsche Forschungsgemeinschaft (DFG), partially with the Heisenberg grant (grant code DE 1140/5-2) provided to LDS, and partially with the grant CRC 1114 provided to LDS and CH. The DFG grant (grant code DE 1140/7-1) associated with the Heisenberg grant for AG is also acknowledged. HW acknowledges support from the National High Technology Research and Development Program of China under grant no. 2015AA011201. Calculations were performed using the computational resources of the North German Supercomputing Alliance (HLRN), project bec00100.

Appendix A: Technical details

All simulations are performed with an in-house modified version of GROMACS [37], and the thermodynamic force in AdResS simulations is obtained using the VOTCA [38] package. The SPC/E [39] water model is used in all the simulations. The system contains 5000 water molecules and the dimensions of the system are $14.6\times 3.2\times 3.2$ ${\mathrm{nm}}^{3}$ . In AdResS simulations, the resolution of the molecules changes only in the x direction, as depicted in figure 1. Three different atomistic regions are used in AdResS simulations, whose sizes are $0.6\times 3.2\times 3.2$ ${\mathrm{nm}}^{3}$ , $1.2\times 3.2\times 3.2$ ${\mathrm{nm}}^{3}$ and $4.8\times 3.2\times 3.2$ ${\mathrm{nm}}^{3}$ (the latter being a worst case scenario for the reservoir; still, results are very promising). The size of the hybrid region is kept the same in all three cases, $2.9\times 3.2\times 3.2$ ${\mathrm{nm}}^{3}$ . The remaining system contains coarse-grained particles, which interact via a generic WCA potential of the form:

$\begin{eqnarray}&&U(r)=4\epsilon [{(\displaystyle \frac{\sigma }{r})}^{12}-{(\displaystyle \frac{\sigma }{r})}^{6}]+\epsilon ,r\leqslant {2}^{1/6}\sigma .\end{eqnarray} \tag{ A.1 }$
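As a hedged illustration of equation (A.1), the following Python sketch evaluates the WCA potential on an array of distances. It is not taken from the modified GROMACS code used in this work. The function name `wca_potential` is hypothetical, the default values are the σ and ε quoted in this appendix, and setting the potential to zero beyond the cutoff ${2}^{1/6}\sigma$ is the standard WCA convention, assumed here because equation (A.1) only specifies the range $r\leqslant {2}^{1/6}\sigma$.

```python
import numpy as np

SIGMA_NM = 0.30         # sigma quoted in Appendix A, in nm
EPSILON_KJ_MOL = 0.65   # epsilon quoted in Appendix A, in kJ/mol


def wca_potential(r, sigma=SIGMA_NM, epsilon=EPSILON_KJ_MOL):
    """Evaluate the WCA potential of equation (A.1) at distance(s) r in nm.

    Assumption: U(r) = 0 for r > 2^(1/6) * sigma (standard WCA convention).
    """
    r = np.asarray(r, dtype=float)
    r_cut = 2.0 ** (1.0 / 6.0) * sigma
    sr6 = (sigma / r) ** 6
    # Shifted Lennard-Jones form: purely repulsive, zero at the cutoff.
    u = 4.0 * epsilon * (sr6 ** 2 - sr6) + epsilon
    return np.where(r <= r_cut, u, 0.0)
```

With this form the potential equals ε at $r=\sigma$ and vanishes continuously at the cutoff, so the coarse-grained particles interact only through a short-ranged repulsion.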

The parameters σ and ε in the current simulations are 0.30 nm and 0.65 kJ ${\mathrm{mol}}^{-1}$ , respectively. The time step used in the simulations is 0.002 ps, and the coordinates and velocities are recorded every 10 time steps, i.e. every 0.02 ps. All simulations are performed at room temperature, 298 K. The coarse-grained and the hybrid regions in the AdResS system are coupled to a Langevin thermostat, whose time scale is 0.1 ps. The reaction field method [40, 41] is used for calculating the electrostatic interactions in the system, with dielectric constant ${\epsilon }_{{RF}}=\infty$ , as this tends to give good energy conservation. The 'switch' cutoff method is used to treat the van der Waals interactions. The cutoff radius for interactions is 1.2 nm. For a 1 ns full atomistic simulation (without any thermostat), the total energy obtained is $-195846$ kJ ${\mathrm{mol}}^{-1}$ and the drift is just 11.4 kJ ${\mathrm{mol}}^{-1}$ , which is less than 0.01%. The dynamical results from this micro-canonical ensemble are compared with results from AdResS simulations. All the dynamical properties are computed from equilibrated trajectories of 1 ns in fully atomistic and AdResS simulations. The velocity autocorrelation function is defined as:

$\begin{eqnarray}&&{C}_{\mathrm{VV}}(t)=\displaystyle \frac{1}{N}\displaystyle \sum _{i=1}^{N}\displaystyle \frac{\langle {v}_{i}(t)\cdot {v}_{i}(0)\rangle }{\langle {v}_{i}(0)\cdot {v}_{i}(0)\rangle }\end{eqnarray} \tag{ A.2 }$

where $\langle \cdot \rangle$ denotes the equilibrium average and $\langle {v}_{i}(t)\cdot {v}_{i}(0)\rangle$ computes the correlation between the velocities of the $i\mathrm{th}$ molecule at times 0 and t. In this work, the velocity autocorrelation function is calculated only for the oxygen atoms. In the same way, the dipole autocorrelation function is defined as:

$\begin{eqnarray}&&{C}_{\mu \mu }(t)=\displaystyle \frac{1}{N}\displaystyle \sum _{i=1}^{N}\displaystyle \frac{\langle {\mu }_{i}(t)\cdot {\mu }_{i}(0)\rangle }{\langle {\mu }_{i}(0)\cdot {\mu }_{i}(0)\rangle }\end{eqnarray} \tag{ A.3 }$

where $\langle {\mu }_{i}(t)\cdot {\mu }_{i}(0)\rangle$ computes the correlation between the dipole moments of the $i\mathrm{th}$ molecule at times 0 and t. In the current implementation of AdResS, the electrostatic interactions are calculated by a short-ranged reaction field method. The dipole autocorrelation function results are consistent with those of the fully atomistic simulation, which also uses the reaction field method. We also tested the particle mesh Ewald (PME) method [42] as an alternative approach to compute Coulomb interactions and calculated the dipole autocorrelation function in a fully atomistic simulation, and found that the results were identical. The reactive flux hydrogen bond correlation function [43–46] is defined as:

$\begin{eqnarray}&&k(t)=-{\rm{d}}{C}_{\mathrm{HH}}/{\rm{d}}t\end{eqnarray} \tag{ A.4 }$

where ${C}_{\mathrm{HH}}(t)$ is the hydrogen bond autocorrelation function defined as:

$\begin{eqnarray}&&{C}_{\mathrm{HH}}(t)=\displaystyle \frac{\langle h(0)\cdot h(t)\rangle }{\langle h\rangle }.\end{eqnarray} \tag{ A.5 }$

Here h is the hydrogen bond population operator for a particular pair of molecules. It is assigned a value of '1' if there is a hydrogen bond between this pair, and a value of '0' otherwise. The criteria for considering a hydrogen bond between two water molecules are (1) the inter-oxygen distance is less than 0.35 nm and (2) the $O-H\ldots O$ angle is smaller than $30^\circ$ . The function ${C}_{\mathrm{HH}}(t)$ is the conditional probability that a hydrogen bond between a pair of molecules is present at time 't', given that it was present at time zero. In both the fully atomistic and AdResS simulations, ${C}_{\mathrm{HH}}(t)$ was calculated first and $k(t)$ was then obtained by taking the numerical derivative of ${C}_{\mathrm{HH}}(t)$ , using a time step of 0.02 ps.
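The correlation functions defined in equations (A.2)–(A.5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used in this work. The function names and array layouts are hypothetical. The equilibrium average is estimated by averaging over all available time origins, ${C}_{\mathrm{HH}}(t)$ is additionally averaged over all tracked pairs, and the derivative in equation (A.4) is taken with NumPy's finite-difference routine `np.gradient`. The treatment of molecules entering and leaving the atomistic region in GC-AdResS is not addressed here.

```python
import numpy as np


def normalized_autocorrelation(x):
    """Normalized vector autocorrelation, as in equations (A.2) and (A.3).

    x has shape (n_frames, n_molecules, 3), e.g. oxygen velocities or
    molecular dipole moments. Returns C(t) for lags 0 .. n_frames - 1.
    """
    x = np.asarray(x, dtype=float)
    n_frames = x.shape[0]
    corr = np.empty((n_frames, x.shape[1]))
    for lag in range(n_frames):
        # x_i(t0 + lag) . x_i(t0), averaged over all time origins t0
        dots = np.sum(x[lag:] * x[:n_frames - lag], axis=2)
        corr[lag] = dots.mean(axis=0)
    # Normalize per molecule by <x_i(0).x_i(0)>, then average over molecules.
    return (corr / corr[0]).mean(axis=1)


def hbond_correlation(h):
    """C_HH(t) = <h(0) h(t)> / <h>, as in equation (A.5).

    h has shape (n_frames, n_pairs) with entries 0 or 1.
    Assumption: the average runs over time origins and over all pairs.
    """
    h = np.asarray(h, dtype=float)
    n_frames = h.shape[0]
    mean_h = h.mean()
    c_hh = np.empty(n_frames)
    for lag in range(n_frames):
        c_hh[lag] = (h[lag:] * h[:n_frames - lag]).mean() / mean_h
    return c_hh


def reactive_flux(c_hh, dt=0.02):
    """k(t) = -dC_HH/dt, equation (A.4), by finite differences (dt in ps)."""
    return -np.gradient(c_hh, dt)
```

The default `dt=0.02` matches the 0.02 ps sampling interval stated above. Statistics deteriorate at large lags because fewer time origins contribute, so in practice only the short-time part of each function would be used.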

Appendix B: Thermodynamic fluctuations

The following thermodynamic quantities are analyzed in this work:

$\begin{eqnarray}&&{Var}(E)=\displaystyle \frac{\langle {E}^{2}\rangle -\langle E{\rangle }^{2}}{\langle E\rangle }\end{eqnarray} \tag{ B.1 }$

and

$\begin{eqnarray}&&{CoVar}(N,E)=\displaystyle \frac{\langle {NE}\rangle -\langle N\rangle \langle E\rangle }{\langle N\rangle }\end{eqnarray} \tag{ B.2 }$

where ${Var}(E)$ is the variance in the total energy of the molecules in the atomistic subregion in AdResS and an equivalent subregion in the full-atom simulations, and ${CoVar}(N,E)$ is the covariance between the total energy of the molecules and the number of molecules which are present in the atomistic subregion in AdResS and an equivalent subregion in the full-atom simulations. The energy E is the sum of the kinetic energy of the molecules in the region considered and the energy coming from the interactions of each molecule with all the other molecules of the region considered. The interactions with the reservoir, defined in the text as 'technical interactions', are not counted, for consistency with the definition of the reservoir in the BL model. The different properties are calculated from a 2 ns long trajectory. The error in the data was calculated using 'block-averaging' analysis.
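The estimators of equations (B.1) and (B.2), together with a simple block-averaging error estimate, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis code used in this work. The function names are hypothetical, the inputs are assumed to be time series of E and N sampled in the subregion, and the number of blocks is a free parameter (the default of 5 is arbitrary, since the block length used here is not specified).

```python
import numpy as np


def var_energy(energy):
    """Normalized energy variance of equation (B.1)."""
    e = np.asarray(energy, dtype=float)
    return (np.mean(e ** 2) - np.mean(e) ** 2) / np.mean(e)


def covar_number_energy(number, energy):
    """Normalized covariance of N and E, equation (B.2)."""
    n = np.asarray(number, dtype=float)
    e = np.asarray(energy, dtype=float)
    return (np.mean(n * e) - np.mean(n) * np.mean(e)) / np.mean(n)


def block_average(estimator, *series, n_blocks=5):
    """Evaluate `estimator` on contiguous blocks of the time series.

    Returns the mean over blocks and the standard error of that mean.
    Assumption: blocks are long enough to be statistically independent.
    """
    split = [np.array_split(np.asarray(s, dtype=float), n_blocks) for s in series]
    values = np.array([estimator(*block) for block in zip(*split)])
    return values.mean(), values.std(ddof=1) / np.sqrt(n_blocks)
```

For example, `block_average(covar_number_energy, n_series, e_series, n_blocks=10)` would return the block-averaged covariance and its standard error, where `n_series` and `e_series` are placeholder names for the sampled N and E.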
