Accurate and robust splitting methods for the generalized Langevin equation with a positive Prony series memory kernel

We study numerical methods for the generalized Langevin equation (GLE) with a positive Prony series memory kernel, in which case the GLE can be written in an extended variable Markovian formalism. We propose a new splitting method that is easy to implement and is able to substantially improve the accuracy and robustness of GLE simulations in a wide range of the parameters. An error analysis is performed in the case of a one-dimensional harmonic oscillator, revealing that several averages are exact for the newly proposed method. Various numerical experiments in both equilibrium and nonequilibrium simulations are also conducted to compare the method with popular alternatives in interacting multi-particle systems.


Introduction
It is well known that the popular Langevin dynamics is based on the assumption that there is a clear separation between the characteristic time scale of the "massive" particles and that of those smaller solvent particles forming the heat bath. Within this assumption, the effect of the solvent is simply reduced to an instantaneous drag force and a delta-correlated random force, thereby significantly reducing the computational cost of an "explicit solvent" model. However, it has been well documented that this underlying assumption breaks down in many physically compelling scenarios, in which case a generalized version of the Langevin dynamics is more suitable-it is known as the generalized Langevin equation (GLE) derived from the Mori-Zwanzig formalism [42,55]. The GLE has been widely applied in a large number of applications, including molecular simulations [21,30,17], mesoscopic modeling [11,22], solids [23,24,52], nuclear quantum effects [8,9], and various anomalous diffusion phenomena [16,40].
Unlike the Langevin dynamics, a temporally nonlocal drag force and a random force with nontrivial correlations are typically associated with the GLE, which make its numerical integration highly nontrivial [21,6]. To be more precise, the temporally nonlocal drag force, in the form of a convolution of the momentum with a memory kernel, requires the storage of the momentum history, whose numerical evaluation at each time step can be computationally demanding. On the other hand, it can also be computationally expensive to generate a random force with nontrivial correlations, which may require the storage of a sequence of random numbers at each time step. Although considerable effort has been devoted to developing accurate and robust numerical methods that circumvent either or both of these challenges mentioned above, it is very difficult to claim any individual method as being optimal, especially given the fact that there are such a broad range of applications where the GLE can be applied.
To this end, in this article we focus our attention on a general form of the memory kernel, which is a sum of exponentials and also known as a positive Prony series. It has been widely used in a large number of studies in the literature [15,41,45,6,46,20]. Moreover, as pointed out in [1,41,16], with certain choice of the parameters, a Prony series can be seen as an approximation of a power law [28,40,16], another primary example of the memory kernel.
The rest of the article is organized as follows. In Section 2, we discuss the mathematical formulations of the GLE, including not only the extended variable Markovian formalism and its white noise limit but also the Markovian approximations of the GLE with a more general memory kernel. Section 3 reviews three popular integration methods for GLE, followed by the derivations of a new promising scheme. Error analysis on the averages in the case of a one-dimensional harmonic oscillator will also be performed. A variety of numerical experiments are performed in Section 4 to compare all the schemes described in the article. Our findings are summarized in Section 5.

Mathematical formulations
Consider an N-particle system evolving in dimension d with position q i ∈ R d , momentum p i ∈ R d , and mass m i ∈ R for i = 1, . . . , N, the equations of motion for the non-Markovian form of the GLE are given by where U(q) is a smooth potential energy. Component-wise, the random force η x i (t), where the index x indicates the appropriate Cartesian components, is a mean zero stationary Gaussian process with an autocorrelation functionK, satisfying the fluctuation-dissipation relation [7,27], where β denotes the inverse temperature, both δ ij and δ xy are Kronecker delta functions. In this article, we focus our attention on the following general form of the memory kernel: where M is a positive integer that represent the number of modes, with λ k and α k being two constant parameters and ǫ > 0 being a rescaling parameter (see more discussions in [45]). Note that for the sake of notational simplicity we do not explicitly express the dependence ofλ k andα k on ǫ.

Extended variable Markovian formalism
In order to avoid dealing with the integral form of (1b), it is desirable to rewrite (1) as an extended variable Markovian formalism [28,9,45,6,46,10,36]. More specifically, following [6], we define the extended variable, ξ i,k ∈ R d , associated with the k-th Prony mode's action on the i-th component of q and p: Subsequently, (1) may be rewritten as Differentiating (5) gives a simple stochastic differential equation (SDE): In order to construct a random force that satisfies the fluctuation-dissipation relation (2), we consider the following SDE: where W i,k = W i,k (t) ∈ R d is a vector of uncorrelated standard Wiener processes. It can be easily seen that (8) is an Ornstein-Uhlenbeck (OU) process where, component-wise, η x i,k (t) has mean zero (i.e., η x i,k = 0) and time correlation function [49] of Therefore, the random force η i in (1b) can be rewritten as Moreover, combining the results of (3)-(10) and introducing a new variable of z i,k = ξ i,k + η i,k , the GLE (1) may be rewritten as the following extended variable Markovian form: It is worth mentioning that, in the special case of M = 1, (11) is very similar to a third-order Langevin dynamics that has certain advantages in controlling the discretization errors over its corresponding second-order form, i.e., an underdamped Langevin dynamics (see more discussions in [43]). Note also that, by introducing an additional variable, the extended variable Markovian form (11) can be cast into the GENERIC (General Equation for Non-Equilibrium Reversible-Irreversible Coupling) formalism (see more discussions in [44,14,26]), while this cannot be easily done in the original non-Markovian form (1). We can write down the Fokker-Planck (or forward Kolmogorov) equation associated with (11) It can then be shown that there exists a unique invariant measure defined by the density whereẐ is the partition function. That is, (13) is the unique solution of the stationary Fokker-Planck equation (12), i.e., L † GLE ρ β = 0.

White noise limit
From (11c) we have 4 and substituting it into (11b) gives which, in the white noise limit of ǫ → 0 (4), becomes Rewriting the summation of dW i,k in the equation above as we have dp where the friction coefficient is given by which is precisely the integral of the memory kernel defined in (3), Thus, we have heuristically verified that, in the white noise limit of ǫ → 0, the solution of the GLE (11) converges weakly to that of the Langevin dynamics (see a rigorous proof in [45]).

Markovian approximations of the GLE with a more general memory kernel
We have demonstrated in Section 2.1 that, for the particular form of the memory kernel (3), the GLE (1) may be rewritten as the extended variable Markovian form (11). We consider here the case of N = d = 1 while dropping the subscripts for simplicity. Moreover, for a more general form of the memory kernel, we may be able to approximate the trajectories of (q, p) in (1) by trajectories of (q,p) that solve a Markovian system with a vector of M auxiliary variables, ξ = (ξ 1 , ξ 2 , . . . , ξ M ) T ∈ R M : where g ∈ R M is a constant vector, A, C ∈ R M ×M are constant matrices, and W = W(t) ∈ R M is a vector of uncorrelated standard Wiener processes. As demonstrated in [28], (21) would be a Markovian approximation of (1) if g, A, C and Σ satisfy the following relations: with the approximated memory kernel being We can also write down the Laplace transform of the approximated memory kernel as where the right-hand side is a rational function of s. Thus, a Markovian approximation is established in two steps: first we approximate the Laplace transform of the memory kernel by a rational function; then we construct a matrix A and a vector g so that (24) holds. Subsequently, Σ and C are determined from (22). For a general memory kernelK, these tasks are nontrivial. The simplest case is when the memory kernel can be approximated by a sum of exponentials, e.g., the positive Prony series memory kernel (3), whose Laplace transform can be easily computed as In this case, A and g are chosen as Note also that in cases where A is diagonalizable they can be reduced to the case above after an appropriate orthogonal transformation [28].

Numerical methods
Splitting methods have been widely used in a range of systems, including Hamiltonian dynamics [35,19], dissipative systems [51], and various stochastic dynamics [31,32,34,37,38,39,50,48]. The techniques have also been adopted in the construction of numerical methods for the GLE (e.g., [9,6,36]). In what follows we will first review three splitting methods proposed by Baczewski and Bond [6]. We will then propose a new method based on an alternative splitting of the vector field. Error analysis on the averages in the case of a one-dimensional harmonic oscillator will also be performed.

The BACSCAB method
The vector field of the GLE (11) can be decomposed into pieces, for instance, A, B, C, and S: in such a way that each subsystem can be solved "exactly". It is worth mentioning that the S part consists of a vector of uncorrelated OU processes, each of which has an exact (in the sense of distributional fidelity) solution [25].
In describing splitting methods, we use the formal notation of the generator of the diffusion as in [12,47,54]. The generators for each part of the system may be written out as follows: The generator for the GLE thus can be written as Moreover, the flow map (or phase space propagator) of the system may be given by the shorthand notation where the exponential map is used to formally denote the solution operator to the equation ∂ t u = L GLE u. Furthermore, approximations of F t may be obtained as products (taken in different arrangements) of exponentials of the various splitting terms. For instance, the phase space propagation of a splitting method in Section II E of [6], termed BACSCAB, can be written as where exp (∆tL f ) denotes the phase space propagator associated with the corresponding vector field f . Note that the steplengths associated with various operations are uniform and span the interval ∆t. Therefore, each of the A, B, and C steps in (31) is taken with a steplength of ∆t/2, while a steplength of ∆t is associated with the S step. Moreover, it is worth mentioning that, while each of the OU processes in the S step is exactly solvable, known as the "method 2" in [6], where is the initial value of z i,k , and R i,k is a vector of independent and identically distributed (i.i.d.) standard normal random variables. Note that an alternative approach was used in the BACSCAB method, known as the "method 3" in [6], by slightly modifying the η k defined above,η in order to improve the stability [6]. However, we would like to point that in the current form there does not appear to have stability issues in the exact solutions (32)-(33) asα k → ∞.
(The memory kernel was written in a slightly different way in [6].) The Euler-Maruyama method can also be used for solving the OU process, known as the "method 1" in [6]. However, it will not be included for comparisons since it has been observed that its performance is not as good as the other two alternative methods. The integration steps of the BACSCAB

The PASP method
An alternative splitting method was proposed in Section II C of [6], termed PASP, whose phase space propagation can be written as where Depending on how the S part is solved, there are three variants of the PASP method. For example, if the S part is solved exactly as in (32)-(33), we arrive the PASP-2 method, whose integration steps read Note that the integration steps for the PASP-3 method [6] are exactly the same as PASP-2 except the η k in (38c) is replaced byη k as in (34).

The BAEOEAB method
BACSCAB and both PASP methods mentioned above rely on solving the S part exactly as in (32)- (33) or with a slight modification (34). However, one potential drawback of that approach is, asα k → ∞, the update on z i,k becomes increasingly dominated by the noise in (32), thereby shrinking the contribution from the "force term" (i.e., −λ k m −1 k p k ), which is likely to cause potential accuracy and/or stability issues in sampling the invariant measure (13). To this end, we propose to combine the "force term" with the original C part in (27) to form the E part as follows: In this case, each of the OU processes in the O step is still exactly solvable, Moreover, we can further split the E part in such a way that each subsystem, E x i,k , componentwise, with q x i remaining fixed, corresponds to a harmonic oscillator with the exact solution: where the superscript x ≤ d is a positive integer that represents a specific dimension, p x i (0) and z x i,k (0) are the initial values of p x i and z x i,k , respectively. We further propose a new splitting method, termed BAEOEAB, whose phase space propagation can be written as Given that the E part has been further split, the propagation of either of the E parts in (43) should be more explicitly defined as Note that one may wish to reverse the order in either of the E parts in (43), which would affect neither its overall performance nor the order of convergence to the invariant measure. The corresponding integration steps may be written out as follows: As indicated in (44), both (45c)-(45d) and (45f)-(45g) loop over not only each pair of i = 1 . . . N and k = 1 . . . M, but also each Cartesian component (i.e., x, y, or z). Note that we can combine the E and O parts in (39) together to form an OU process that can also be solved exactly as in [9,36]. However, this approach relies on operations involving potentially very large matrices, which could be computationally very demanding (especially when the number of modes, M, is large and/or some of the parameters,λ k andα k , are, for instance, position-dependent) and thus we would like to avoid. It is also worth mentioning that we can easily adopt the procedures based on the framework of the long-time Talay-Tubaro expansion [53,13,31,32,34,3,2,38,33,48] to analyze the accuracy of ergodic averages (with respect to the invariant measure) in the stochastic numerical methods previously mentioned in this section (with a general nonlinear force), and conclude that they all have second order convergence to the invariant measure.

PASP-2
In the PASP-2 method (36), we have the following averages: While the averages of qp and pz are exact, there exist second order errors in the averages of q 2 , p 2 , z 2 , and qz . Note also that the minus sign of the leading order term in qz appeared to be missing in [6].

PASP-3
Similarly in the PASP-3 method, we have the following averages: Similar to the case of the PASP-2 method, the averages of qp and pz are exact, while there exist second order errors in the averages of q 2 , p 2 , z 2 , and qz . Again, the minus sign of the leading order term in qz appeared to be missing in [6].

BACSCAB
In the BACSCAB method (31), we have the following averages: While the averages of q 2 , qp , qz , and pz are exact, there exist second order errors in the averages of p 2 and z 2 .

BAEOEAB
In the BAEOEAB method (43), we have the following averages: That is, all of the averages are exact, with the only exception of p 2 that has second order errors, which yields a friction-independent upper bound of the stepsize ∆t max = 2 m/K, coinciding with the (deterministic) Verlet stability threshold [32]. Note that a similar method to BAEOEAB (i.e., BAOEOAB) yields exactly the same averages, but requires additional generations of random numbers. It is also worth mentioning that the averages of q 2 , p 2 , and qp match perfectly with their counterparts in the BAOAB method of the underdamped Langevin dynamics [32].

Numerical experiments
In this section, a variety of numerical experiments are conducted to systematically compare the newly proposed BAEOEAB method (43) with alternative methods described in Section 3. While all the methods presented can be used in multi-mode systems, we focus our attention on a single mode (i.e., M = 1) for simplicity where the subscripts inλ and α (3) can be dropped.

Harmonic oscillator
In order to verify the error analysis results in Section 3.4, we compare the performance of various methods with a one-dimensional harmonic oscillator U(q) = Kq 2 /2 (K > 0). The following set of parameters were used: m = K = β =α = 1 andλ = 2. Figure 1 compares the control of the relative errors in the computed averages of q 2 and z 2 . According to the dashed order lines, both PSAP-2 and PASP-3 methods exhibit second order convergence in both averages whereas the BACSCAB method appears to be second order only in the average of z 2 on the right panel. Moreover, the BAEOEAB method is exact in both averages while the BACSCAB method is exact only in the average of q 2 on the left panel. It is worth mentioning that the error in cases where the corresponding averages are exact comes solely from the sampling error, rather than the discretization error. Note also that all the observations are consistent with our findings in Section 3.4.
More precisely, for the average of q 2 on the left panel of Figure 1, PASP-3 appears to be only slightly more accurate than PASP-2 while both methods are significantly outperformed by either BACSCAB or BAEOEAB in terms of accuracy. It is worth pointing out that the BAEOEAB method also appears to be more robust than alternative methods in a wide range of stepsizes-while both PASP-2 and PASP-3 methods exhibit around 50% relative errors at a stepsize of around ∆t = 0.75 and the BACSCAB method became unstable just over ∆t = 1, the BAEOEAB method is still extremely accurate (i.e., up to sampling error) when the stepsize is around ∆t = 1.9, close to its stability threshold ∆t max = 2. The behavior is largely similar for the average of z 2 on the right panel of Figure 1 except the performance of PASP-2, PASP-3, and BACSCAB are almost indistinguishable. In this case, the BAEOEAB method substantially outperforms alternative methods in terms of not only accuracy but also robustness.
The distributions of positions associated with different stepsizes were compared in Figure 2. The behavior is consistent with our findings on the left panel of Figure 1. That is, the distributions of both PASP-2 and PASP-3 methods visibly deviate from the reference solutions with a stepsize of ∆t = 0.7 and substantial deviations can be observed with a stepsize of ∆t = 0.9. In contrast, the distributions of both BACSCAB and BAEOEAB methods are both indistinguishable from the reference solutions; however the largest stepsizes used in the BACSCAB and BAEOEAB methods were ∆t = 1 and ∆t = 1.9, respectively. This again demonstrates the superior accuracy and robustness of the BAEOEAB method over alternative methods.  and z 2 (right) against stepsize by using various numerical methods for the GLE described in Section 3 with parameters ofλ = 2 andα = 1. The system was simulated for 10 7 reduced time units but only the last 80% of the data were collected to calculate the static quantity in order to make sure the system was well equilibrated. 1000 different runs were averaged to reduce the sampling errors. The stepsizes tested began at ∆t = 0.25 and were increased incrementally by 20% until all methods either started to show significant relative errors or became unstable. The dashed black line represents the second order convergence to the invariant measure.

Multi-particle system
We also compare the performance of various methods in systems where multiple particles are interacting with each other. To this end, we chose a soft pair potential energy that has been widely used in the so-called dissipative particle dynamics [37,39,48], where where parameter a ij denotes the maximum repulsion strength between particles i and j, r ij = |q ij | = |q i − q j | is the distance, and r c represents a certain cutoff radius, beyond which there is no interaction between particles. We adopted a standard set of parameters commonly used in algorithms tests as in [48]: m i = r c = β = 1. Moreover, a particle density of ρ d = 3 was used throughout the current article, which also determines the repulsion parameter of a ij = 75/(βρ d ) = 25 in order to match the compressibility of water [18]. A system of N = 500 identical particles was simulated in a cubic box with periodic boundary conditions [5], unless otherwise stated. The initial positions of the particles were i.i.d. with a uniform distribution over the box, while the initial momenta were i.i.d. normal random variables with mean zero and variance 1/β. The average of the computed configurational temperature in the canonical ensemble is expected to be precisely the target temperature: where ∇ i U and ∇ 2 i U, respectively, represent the gradient and Laplacian of the potential energy U with respect to the position of particle i. The control of the configurational temperature has been recommended in [4] as a verification of equilibrium; importantly, good control of the configurational temperature also appears to imply good performance in other configuration-based physical quantities (see more discussions on the configurational temperature in [37,39,48]).
The control of the configurational temperature was compared in Figure 3 withλ = 1 and varyingα. All the methods appear to have second order convergence to the invariant measure (dashed order lines not shown). On the top left panel whereα = 16, the PASP-3 method appears to be more accurate than the PASP-2 method; however it is outperformed by either BACSCAB or BAEOEAB methods, whose curves are almost indistinguishable. As we increase the value ofα, the behavior is largely similar except (a) PASP-2 becomes (bottom right). The system was simulated for 1000 reduced time units but only the last 80% of the data were collected to calculate the static quantity in order to make sure the system was well equilibrated. Ten different runs were averaged to reduce the sampling errors. The stepsizes tested began at ∆t = 0.03 and were increased incrementally by 15% until all methods either started to show significant relative errors or became unstable. less and less accurate than alternative methods and displays substantial relative error at a stepsize of around ∆t = 0.05 on the bottom left panel whereα = 64; (b) BAEOEAB becomes increasingly better than alternative methods in terms of not only accuracy but also robustness-while, on the bottom right panel whereα = 128, both PASP-3 and BACSCAB show an around 25% relative error with a stepsize of slightly over ∆t = 0.05, the relative error is around 36% for BAEOEAB with a stepsize of around ∆t = 0.08. Figure 4 also compares the control of the configurational temperature when the wellknown Lees-Edwards boundary conditions [29] were applied in order to generate a simple and steady shear flow (shear rate κ = 0.1), again withλ = 1 and varyingα. The behavior is very similar to Figure 3 except the superiority of the BAEOEAB method is even more evident particularly in the bottom row whereα is relatively large. For instance, on the bottom right panel whereα = 128, BAEOEAB achieves around an order of magnitude improvement over both PASP-3 and BACSCAB in terms of accuracy. This indicates that BAEOEAB has a  even better configurational temperature control over alternative methods in nonequilibrium simulations than in equilibrium whenα is relatively large. It is worth mentioning that, although BAEOEAB has been designed in equilibrium settings where the invariant measure is preserved, it appears that its performance is also very good in nonequilibrium, particularly when the shear rate is relatively small, which may be thought of as in "near-equilibrium".

Conclusions
We have reviewed the constructions of popular splitting methods proposed for the GLE. Having also pointed the potential drawback of those existing methods, we have proposed an alternative method based on a different splitting of the vector field of the GLE. We have also analyzed the errors on the averages associated with different methods in the case of a one-dimensional harmonic oscillator. We have demonstrated that all of the averages are exact for the newly proposed BAEOEAB method, with the only exception being the average of p 2 that has second order errors, leading to a friction-independent upper bound of the stepsize ∆t max = 2 m/K that coincides with the (deterministic) Verlet stability threshold.
In contrast, there are at least two averages that are not exact for all the other splitting methods examined.
Restricting our attention to a single mode (i.e., M = 1) for simplicity, we have systematically compared the BAEOEAB method with alternative methods in a variety of numerical experiments. In the case of a one-dimensional harmonic oscillator, the BAEOEAB method is exact in both averages of q 2 and z 2 and substantially outperforms alternative methods in terms of not only accuracy but also robustness. The BACSCAB method is also exact in the average of q 2 , and appears to be as accurate as the BAEOEAB method. However, the former became unstable just over ∆t = 1, while the latter is still extremely accurate (i.e., up to sampling error) when the stepsize is around ∆t = 1.9, close to its stability threshold ∆t max = 2. In the average of z 2 , the BACSCAB method becomes second order as both PASP methods. Moreover, the performance of the three methods is very similar to each other, all outperformed by the BAEOEAB method.
We have also examined the case of multiple particles that are interacting with each other-a soft pair potential energy widely used in the so-called dissipative particle dynamics. Fixingλ to be unity, as we increase the value ofα, the BAEOEAB method becomes increasingly better than alternative methods in terms of not only accuracy but also robustness. Moreover, the superiority of the BAEOEAB method is even more evident when the well-known Lees-Edwards boundary conditions were applied in order to generate a simple and steady shear flow, a technique commonly used in nonequilibrium simulations.
Although we have focused our attention on a single mode in this article, it is expected that the newly proposed BAEOEAB method will perform well in multi-mode cases (i.e., a general Prony series) that can be viewed as a summation of single modes. However, we leave a detailed examination (including more general forms of the memory kernel, for instance, in Section 2.3) for future work.
Following similar results in Langevin dynamics [31,38], a fourth-order convergence to the invariant measure has been demonstrated recently for a particular splitting method for the GLE with certain choices of the parameters [36]. However, such superconvergence results have not been generally observed in our numerical experiments. Therefore, we leave further exploration of this observation for future work.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.