Asynchronous coupling of hybrid models for efficient simulation of multiscale systems

We present a new coupling approach for the time advancement of multi-physics models of multiscale systems. This extends the method of E et al. (2009) [5] to deal with an arbitrary number of models. Coupling is performed asynchronously, with each model being assigned its own timestep size. This enables accurate long timescale predictions to be made at the computational cost of the short timescale simulation. We propose a method for selecting appropriate timestep sizes based on the degree of scale separation that exists between models. A number of example applications are used for testing and benchmarking, including a comparison with experimental data of a thermally driven rareﬁed gas ﬂow in a micro capillary. The multiscale simulation results are in very close agreement with the experimental data, but are produced almost 50,000 times faster than from a conventionally-coupled simulation. © 2014 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
A multi-physics description of a multiscale system is often referred to as a 'hybrid' model. In fluid dynamics, a typical hybrid combines a molecular treatment (a 'micro' model) with a continuum-fluid one (a 'macro' model), with the aim of obtaining the accuracy of the former with the efficiency of the latter [1][2][3][4]. The micro and macro models generally have characteristic timescales that are very different, which means that time-accurate simulations can be extremely challenging: the size of the timestep required to make the micro model stable and accurate is so small that simulations over significant macro-scale time periods are intractable. If the system is 'scale-separated', a physical (as distinct from numerical) approximation can be made that enables the coupled models to advance at different rates (asynchronously) with negligible penalty on macro-scale accuracy. E et al. [5] were the first to introduce and implement this concept in a time-stepping method for coupled systems, referred to in the classification of Lockerby et al. [6] as a continuous asynchronous (CA) scheme ('continuous' since the micro and macro models advance without interruption [5]). In this paper we extend this idea to multiscale systems comprising an arbitrary number of coupled models.

Extension to multi-model systems
We consider an N-model timescale-separated system, where the ith model has a characteristic timescale T i , and indexing is ordered such that T i ≤ T i+1 for i = 1, . . . , N − 1. (1) Model i = 1 is the micro model and i = N is the macro model; models i = 2 to i = N − 1 are 'meso' models. The degree of scale separation S i between model i and i + 1 is where the tolerance S tol requires each distinct model of the system to be scale separated from every other to some degree, for example, S tol = O (10). If this condition is not met, the two models are treated as one, and coupling is performed conventionally.
In general each model can be considered to have its own time variable (t i ), and be represented by where X i are the set of variables of the ith model, and F i is some function of the complete system's variables, X = {X 1 , X 2 , . . . , X N }. It is important to make clear the distinction between the characteristic timescale of the ith model in isolation (T i ) and the timescale of its variables within the coupled system; they are, potentially, completely different.
To solve this set of models, the independent time variables must be related to each other. If all time variables are equal (i.e. t = t 1...N ) the system is conventionally coupled. However, we can advance models at different rates with the physical modification where g i is the rate that the ith model advances relative to the micro model. This approximation provides a means to exchange fine timescale resolution for long timescale predictions, and the extent to which it is valid depends on the degree of scale separation between models, i.e. on the magnitude of S i . For coupled models that are highly scale separated (S i > S tol ), the smaller-scale model will remain quasi-equilibrated to the dynamics of the larger-scale model despite the physical modification, and so behave similarly to as in the unmodified system. The aim is thus to represent the scale-separated system (> S tol ) with one that is less, but still significantly, scale separated (= S tol ): this is how acceptable values of g i are determined (see below for the specific procedure). Detailed analyses of the error associated with this physical approximation for a two-model system are given in E et al. [5] and Lockerby et al. [6]. Fig. 1 provides an illustration of a numerical implementation of Eq. (4) using different timestep sizes for each model, while exchanging variables as if the timesteps were equivalent (this is asynchronous coupling). The timestep of the ith model is where t 1 is the micro model timestep.
The aim of this asynchronous coupling scheme is to maximise the total simulated period. This is done by maximising the timestep in each model subject to the following constraints: 1. The physical approximation of Eq. (4) represents a very scale-separated system by one that is less, but still to a degree, scale separated (i.e. has a scale separation of S tol ). This places an upper limit on the amount a timestep of one model can be increased relative to another: 2. The numerical accuracy is satisfactory and stability guaranteed for each individual model: where t i ,max is an estimation of the maximum timestep that is permissible for each model.
Based on these constraints we can set the timestep of each model recursively: where t 1 = t 1,max .
We now consider a series of examples. The examples of Sections 3-5 are used to illustrate the effectiveness, challenges, and shortcomings of employing the physical approximation in Eq. (4) and applying time stepping of Eq. (6); the example of Section 6 provides a demonstration of an application to hybrid continuum-molecular modelling. Note, there are a range of numerical methods that could be applied to the numerically-stiff systems of Sections 3-5, but which cannot be applied to the hybrid example of Section 6.

Example 1: A simple mass-spring system
A serial mass-spring system with N masses and N + 1 springs is shown in Fig. 2. The governing equations for the ith where x i and v i are the displacement and velocity of the ith mass (m i ) and k i is the ith spring constant. A characteristic timescale for each model can be obtained from its natural period in isolation: In this example N = 5, the spring constants are equal, and the (i + 1)th mass is 900× heavier than the ith mass, such that All models are thus significantly scale-separated. 1 The complete system of equations is solved using the midpoint method (a second-order Runge-Kutta scheme), but with timesteps for each model chosen according to Eq. (6), with t i,max = T i /100.
In the first case, the initial displacements x i,t=0 are chosen such that, when the masses are released from rest, only the slowest eigenmode of the system is excited. Fig. 3(a) shows the response of the lightest (m 1 ) and heaviest (m 5 ) masses governed by the micro and macro models, respectively; the analytical eigenvalue solution (the dashed line) is indistinguishable from the multiscale solution. Fig. 3(b) shows the solution using S tol = 5, which places a less conservative restriction on the balance between accuracy and efficiency. For S tol =10, the model advances 81× faster than if standard (synchronous) coupling were used; for S tol = 5 it advances 1296× faster.
The masses are now initially displaced an equal distance and then released from rest; this excites all eigenmodes, to a degree. Fig. 4 shows: (a) the response of a meso model (i = 3) and (b) the response of the macro model (i = 5, the heaviest mass), for S tol = 5. The macro description is accurate, while for the meso model only the low frequency eigenmode is accurately captured. This example illustrates the fundamental trade-off required in multiscaling: efficiency in predicting macro variations can be dramatically increased, but at the expense of micro/meso scale resolution.

Example 2: A Lotka-Volterra system
The Lotka-Volterra equations, in various forms, have been applied to an extremely diverse range of problems, spanning economics [7], biology [8] and chemistry [9]. Originating from the analysis of auto-catalytic chemical reactions [9], the equations are now most commonly used to study the population dynamics of competing biological systems, which is the example we consider here.
The population growth rate of a species in a (sequential) food chain is as follows: where y i is the population size of the ith species, r i is the intrinsic death rate (in the absence of any prey or predator), p i is the population growth rate due to the consumption of lower species in the food chain, and q i is the death rate due to predation from higher species in the food chain. The intrinsic death rate of each species defines a characteristic timescale for that species model (r i = 1/T i ), and which classifies the model in the macro-to-micro hierarchy; this rate increases moving up the food chain. Thus, the Apex Predator, at the top of the food chain, has the highest intrinsic death rate (in the absence of prey), and its population size, y 1 , is governed by the micro model. Table 1 gives parameters for a food-chain example consisting of four species. For the initial population sizes Y i the ecosystem is in equilibrium, and the numbers of each species will remain constant. If, however, all plants are removed (Y 4 = 0), the populations of the remaining species will eventually reduce to extinction. Fig. 5 shows the population response of each species in the ecosystem, as predicted by a standard numerical solution (the dashed line, used as a benchmark) and the multiscale approach using S tol = 10 (the solid line). As in Section 3, time integration is performed using a second-order Runge-Kutta method, with timesteps for each model chosen according to Eq. (6), with t i,max = T i /5. For the benchmark solution t i = t 1,max .
The first observation is that the slowly varying population sizes of the Apex Predators and the Herbivores are predicted very accurately by the multiscale scheme, which is 100× computationally faster than the benchmark solution. Fig. 6 shows the macro model prediction using less conservative tolerances on the minimum acceptable scale separation, i.e. S tol = 2.5 and S tol = 5 (which are 1600× and 400× faster to compute than the benchmark, respectively). As expected, the multiscale   solution converges to the numerical benchmark as S tol is increased; S tol = 10 appears to provide a very accurate result for the macro variable.
However, compared to the other species, the Predators' population decline is rapid, and occurs after some delay. The delay occurs because, initially, the Predators' prey and the Predators' predators are both reducing -only when their prey is significantly diminished is there a major reduction in Predator numbers. The multiscale approach does not capture, with any fidelity, these shorter scale phenomena, and actually introduces erroneous short timescale oscillations. This again highlights that exploiting scale separation affords very efficient prediction on large timescales, but at the expense of finer timescale resolution. Note, in this case, the micro model prediction is very good, because the short scale response only manifests itself in the meso model's variable.

Example 3: A lubrication system
In this section we consider air-layer lubrication of a liquid journal bearing, as depicted in Fig. 7. The air layer, despite being thin, can significantly reduce the overall drag on the bearing, due to the lower viscosity of air relative to liquids. This simple air-layer lubrication concept is exploited in super-hydrophobic coatings, which have a chemical hydrophobicity and surface topology that, when submerged in water, combine to trap air pockets on the surface. Such coatings have applications in marine drag reduction [10,11] and for self-cleaning surfaces [12]. In the context of multiscale modelling they are relevant because of the very different scales associated with the air layer, external water, and the body/vehicle.
The bearing example of this section consists of a steel cylinder, of radius R = 10 cm, to which is applied an oscillatory torque, T . The cylinder rotates within a fixed outer cylinder containing water; the surface of the inner cylinder is coated with an air layer of thickness air = 1 μm; and the annular thickness of water is wat = 0.1 mm (i.e. R wat air ); see Fig. 7.
The low-speed, unsteady, incompressible Navier-Stokes equations provide models for the air layer and the water (i.e. the micro and meso models, i = 1 and i = 2, respectively): and where r is the radial coordinate from the cylinder centre, v is tangential velocity, μ is dynamic viscosity, ρ is density, and the subscripts 'air' and 'wat' denote the respective fluids. Given that R wat air , the curvature of the geometry can be neglected. The micro and meso models are coupled by the requirement for the shear stress and the velocity to be continuous at the air-water interface (assuming no slip): μ air dv air dr r=r int = μ wat dv wat dr r=r int , (13) and v air | r=r int = v wat | r=r int , (14) where the radial position of the air-water interface is r int = R + air . The water at the wall of the outer cylinder is stationary (i.e. there is no slip).
Newton's second law determines the evolution of the tangential velocity of the inner cylinder surface (v cyl ); this is the where L is the length of the bearing (into the page) and I is the moment of inertia of the cylinder. The applied torque is where ω is the angular frequency and A is the amplitude. The macro model is coupled to the micro model through shear stress in the air layer at the cylinder wall (i.e. the term in parenthesis in Eq. (15)) and via no-slip at the cylinder-air interface: See Appendix A for values of the physical parameters used in this example. The characteristic timescales estimated for each model are the viscous timescale (for the two fluids), and a fraction of the torque period (for the cylinder):  12), is performed using a second-order central-difference approximation; for this illustrative example, only 10 grid points are used in each fluid layer (a finer mesh does not substantially change the results). Fig. 8 shows the variation of the velocity of the cylinder wall, and the velocity of the air-water interface, with macro time. The velocity of the cylinder wall is almost 50% higher than that of the air-liquid interface (which would be the approximate velocity of the cylinder wall if no air layer were present); the drag coefficient of the cylinder in water has been reduced by almost 50% due to the presence of the thin air layer.
Here it is not practical to benchmark the multiscale results against a conventional numerical solution (i.e. one with equal timestep sizes), because of the high computational cost to obtain the latter. Instead, and what must be done in practice, is to show the independence of the multiscale result to increases in S tol . This is akin to a grid-dependency study -setting a larger S tol has the effect of reducing the difference between timestep sizes. Fig. 8 shows results for S tol = 5, 10, and 20; the results for S tol = 10 and 20 are barely distinguishable, indicating that S tol = 5 is a fair prediction, and S tol = 10 is a very accurate one. The computational speed-up afforded by the asynchronous timestep coupling is in this case extremely high: ×9.5 · 10 7 (for S tol = 5); ×1.2 · 10 7 (for S tol = 10); and ×3 · 10 6 (for S tol = 20). Now we consider the sudden application of a constant torque, T = 1, to the stationary system. This case highlights the potential difficulties in identifying characteristic timescales. The macro model, Eq. (15), does not have an inherent timescale in the absence of an oscillatory torque. In other words, in isolation (i.e. without air or water), the cylinder would perpetually accelerate in response to the constant torque. In these circumstances some estimate of the timescale of the model when interacting with others, is needed. Here we achieve this with an approximation of the acceleration and velocity of the cylinder wall in terms of the air-layer shear stress, and combine them to get a timescale.
In the absence of an applied torque, the acceleration of the cylinder walls will be proportional to the shear stress in the air layer at the wall (τ wall ), and inversely proportional to the moment of inertia of the cylinder (see Eq. (15)): If we assume a linear velocity profile in the air layer, the velocity of the cylinder wall will be proportional to the shear stress and the air-layer thickness, but inversely proportional to the dynamic viscosity, i.e.
v cyl ∝ τ wall air μ air . (19) Division of Eq. (19) by (18) gives a characteristic timescale that we can use in our simulation: where ρ cyl is the density of the steel cylinder. With the exception of this macro timescale T 3 , and t 3,max = T 3 /200, all other parameters are the same as above. Fig. 9 shows the velocity response of the cylinder wall and air-water interface to the suddenly-applied constant torque.
Again, calculations are performed using S tol = 5, 10, and 20, with S tol = 10 providing a result that appears to be insensitive to further increases of S tol . This solution is achieved ×6.5 ·10 6 faster than a solution using equal timesteps. The characteristic timescale predicted by Eq. (20), T 3 = 78.5 s, is reasonable given the observed timescales. Even so, if this prediction had been much different, the main consequence would be that the S tol -independence threshold would be different, and the dependency study might have required additional simulations to find that threshold.

Example 4: A Knudsen compressor
Finally, we consider the rarefied gas flow between two reservoirs, held at different temperatures, connected by a thin cylindrical capillary: a single-stage Knudsen compressor, see Fig. 10. Rarefaction effects in the capillary transport gas from the cold to the hot reservoir; this counter-intuitive phenomenon is thermal transpiration (sometimes known as thermal creep) and was first observed by Reynolds [13]. The configuration shown in Fig. 10 was constructed by Rojas-Cárdenas et al. [14] in order to study the transient behaviour of thermal transpiration in a closed system; some of their experimental data is presented below.
In terms of simulation, this system cannot be modelled using standard Navier-Stokes equations and boundary conditions, since thermal transpiration is a thermodynamic non-equilibrium phenomenon [15,16]. On the other hand, an accurate gas-kinetic treatment would be computational intractable over the entire domain. To tackle this, we decompose the system into three coupled models, applying the appropriate modelling assumptions to each: the reservoir model (macro, i = 3); the capillary model (meso, i = 2); and the gas-kinetic model (micro, i = 1); see Fig. 10. The macro model defining the reservoir pressures is obtained from mass conservation and by assuming an ideal gas: dp c where p is pressure, R is the gas constant, θ is temperature, V is the reservoir volume, ṁ is the mass flow rate along the capillary, z is distance along the capillary from the cold to hot reservoir, and the subscripts c and h denote the cold and hot reservoirs, respectively. Note, here we assume that there is no significant change in mass of gas within the capillary, though this can easily be accounted for if necessary. The meso model for the high-aspect-ratio capillary is obtained from the continuity equation integrated over the cross section: where A is the cross-sectional area of the capillary. The meso model is coupled to the macro model by the boundary conditions: p = p c , ṁ =ṁ (z=0) at z = 0; and p = p h at z = , where is the length of the capillary. The temperature variation along the capillary is prescribed (using the same fit as in Rojas-Cárdenas et al. [14]) by where α is a constant.
The micro model, G, provides a means to close the entire system, by relating mass flow rate to pressure and temperature: where X contains information regarding the molecular structure of the gas, which is required to accurately model thermal transpiration. Here, the micro model G is a spatially-distributed array of low-variance deviational simulation Monte Carlo (LVDSMC) subdomains (see Fig. 10). LV-DSMC is a particularly accurate and low noise method for simulating small deviations from equilibrium in rarefied gas flows [17,18]. Using micro particle-simulation subdomains to represent points in the meso domain is substantially more efficient than modelling the entire channel with a single particle simulation -this is an application of the Internal Multiscale Method (IMM), and readers are referred to [19][20][21] for a detailed description. The simulated particles of each (streamwise periodic) subdomain are forced by an effective body force, which represents the equivalent pressure and temperature gradient occurring at that location in the meso model (the pressure and temperature are also set). The micro model is thus coupled to the meso model by the streamwise pressure gradient, pressure, and mass flow rate at each of the subdomain locations.
where R cap is the capillary radius and ρ and μ are the average initial density and viscosity of the gas. If we assume a quasi-steady velocity profile (which is only valid for t T 1 ), characteristic timescales of the meso and macro model can be estimated from Eqs. (22) and (21), respectively: and where p is the average initial pressure and V t is the total volume of the combined reservoirs.
The experiments of Rojas-Cárdenas et al. [14] were performed with Argon gas, a borosilicate (glass) capillary of circular cross-section with = 52.7 ± 0.1 mm, R cap = 242.5 ± 3 μm connecting two reservoirs of volume V c = 19.81 ± 0.54 cm 3 and V h = 14.85 ± 0.40 cm 3 , held at θ c = 301 K and θ h = 372 K. A heater applied to the hot reservoir generated a temperature distribution through the capillary fitted by Eq. (23), with α = 84.82 m −1 . Initially, the two reservoirs were held at fixed pressure (p = 237.7 Pa), allowing thermal transpiration flow to develop; the system was then closed, and the pressure in the two reservoirs allowed to equilibrate. Note, in a gas that is non-rarefied the initial state would be the equilibrium one. We ran multiscale simulations of this experimental configuration using Euler time-stepping, for simplicity, with timestep sizes chosen by Eq. (6). Our complete simulation parameters are provided in Appendix B.    11 shows the transient response of the pressure in each reservoir after the system is instantaneously closed; there is very close agreement between the experimental measurements and the multiscale simulation. The asymmetry of the figure around the initial pressure is caused by the different reservoir volumes (V c > V h ). The multiscale result is obtained with S tol = 10, and required the use of twelve Intel Xeon X5650 2.66 GHz cores for 4.13 hours (wall-clock time). If the time steps were set equal (i.e. a conventional synchronous coupling) the simulation would have taken over 20 years on the same hardware. In fact, the saving over conventional modelling is far greater than this, if we also take into account the saving due to spatial multiscaling by using the IMM [19][20][21]. Fig. 12 shows the impact of increasing S tol ; a similar result is obtained in all cases, but with lower noise at higher S tol . This highlights an important general limitation of multiscaling with stochastic models (e.g. LVDSMC) or inherently noisy methods (e.g. Molecular Dynamics): fewer timesteps means less sampling, and thus more noise. The trade-off in hybrid (continuum-particle) multiscaling can thus be summarised: fine-scale accuracy ↔ noise & large-scale prediction.

Discussion and summary
In this paper we have described how an asynchronous coupling of multiple models can be used to balance fine-scale accuracy with long timescale predictions in scale-separated systems. A physical approximation is made that represents the scale separated system by one that is less, but still to some extent, scale separated. In the examples presented, we have been able to make very large computational savings with very little reduction in macro-scale accuracy.
In the majority of these examples, it has been possible to identify clear characteristic timescales, upon which timestep sizes for each model are then based. In many practical cases, though, such identification will not be straightforward. Multiple time scales will be present, and heavy reliance on crude estimation will be necessary. In these cases, performing a dependency study on S tol is essential for obtaining reliable predictions.
Applying the scheme to sets of models that, despite having disparate characteristic timescales in isolation, are strongly nonlinearly coupled, presents another issue (e.g. chaotic systems). These systems can demonstrate extreme macro-scale sensitivity to small scale fluctuations (i.e. are not separable). Here, again, the minimum test of solution integrity must be that the results are independent beyond a threshold S tol .
Significant future technical development needs are towards coping with systems exhibiting time-varying degrees of scale separation. A fully adaptive and automated scheme -one which does not rely on the prediction of characteristic timescales of models -is the next major step forward.