Direct numerical simulation of particulate flows with an overset grid method

We evaluate an efficient overset grid method for two-dimensional and three-dimensional particulate flows for small numbers of particles at finite Reynolds number. The rigid particles are discretised using moving overset grids overlaid on a Cartesian background grid. This allows for strongly-enforced boundary conditions and local grid refinement at particle surfaces, thereby accurately capturing the viscous boundary layer at modest computational cost. The incompressible Navier--Stokes equations are solved with a fractional-step scheme which is second-order-accurate in space and time, while the fluid--solid coupling is achieved with a partitioned approach including multiple sub-iterations to increase stability for light, rigid bodies. Through a series of benchmark studies we demonstrate the accuracy and efficiency of this approach compared to other boundary conformal and static grid methods in the literature. In particular, we find that fully resolving boundary layers at particle surfaces is crucial to obtain accurate solutions to many common test cases. With our approach we are able to compute accurate solutions using as little as one third the number of grid points as uniform grid computations in the literature. A detailed convergence study shows a 13-fold decrease in CPU time over a uniform grid test case whilst maintaining comparable solution accuracy.


Introduction
Flows of finite-sized particles in viscous fluids are common to many industrial as well as natural processes, such as primary cementing in the oil and gas industry [52] and blood flow [3]. Being so ubiquitous, particulate-flow problems span a large range of material and flow properties. Of interest to this work are flows of an incompressible Newtonian fluid with rigid, spherical (circular) particles of finite Reynolds number, where the Reynolds number describes the relative strength of inertial to viscous forces.
Approximate solution methods have been applied to both high and low particle Reynolds number flow regimes, where by neglecting viscous or inertial contributions, respectively, the equations of motion can be linearised and solved with powerful mathematical tools; see [13,44,62] for examples in both flow regimes. It is the intermediate flow regime, where such approximations are not valid, that the full incompressible Navier-Stokes equations must be solved. Direct numerical simulation 5 methods designed for such intermediate Reynolds number flow regimes broadly fit into two categories: those methods that use a static grid that is unaligned with the particle boundaries, and those that use boundary-conforming grids. Below we give a brief overview of these two classes of methods and their application to particulate flow problems.
Arbitrary Lagrangian Eulerian (ALE) methods were developed for deforming boundary problems as a response to the difficulties encountered when using fully Lagrangian schemes in problems with large amounts of circulation, see for example, [11]. The ALE method belongs to the class of boundary conformal methods, to which the boundary integral method (only applicable to inviscid, irrotational flow) and the overset grids method belong. The semi-discrete, viz. finite element in space and finite difference in time, method of Hughes et al. [37] has proved popular with general fluid structure interaction (FSI) problems and has been continually developed over the last few decades. In this ALE approach, nodal velocities explicitly enter the convective terms of the momentum equations allowing the underlying grid to distort with the moving boundary, while an Eulerian description can be used in regions of large circulation. Hu et al. [35] used an ALE method with a combined fluid-solid formulation to simulate pairs of spheres interacting in Newtonian and viscoelastic liquids, and 2D sedimentation simulations of 90 circular particles.
This ALE method can be considered a special case of the deforming spatial domain or stabilised space-time (DSD/SST) method developed by Tezduyar et al. [66], where a finite element formulation is used both in space and time. By extending the finite element formulation in time any grid deformation is automatically accounted for [66]. To avoid turning a three dimensional problem into a four dimensional problem, courtesy of the time dimension, a discontinuous-in-time space-time finite element formulation of the problem is solved for one space-time "slab" at a time.
Both the conventional ALE and DSD/SST methods use unstructured grids to explicitly represent the particle surface. Unstructured grids allow for complex surfaces to be represented but are generally not amenable to fast elliptic solvers, such as geometric multigrid. The distortion of the underlying grid inevitably causes elements to become increasingly skewed. Highly skewed elements can often cause problems relating to accuracy and stability, requiring a new grid to be computed on to which the solution is then projected. Both the regridding and projection steps are costly procedures and are the main drawbacks of ALE and DSD/SST type methods [27]. Particulate flow simulations generally encounter large deformations, requiring frequent regridding.
Various approaches have been developed to improve the quality of the deformed grid. Johnson and Tezduyar [38], for example, used the equations of linear elasticity to govern the deformation of the grid for their particulate flow simulations. The grid distortion was controlled through a stiffness parameter, effectively weighting distortion towards larger elements. This preserved smaller elements in the surface regions that are important to the accuracy of the overall solution. By promoting less important elements, i.e. those far away from the particle surfaces, to distort more easily, regridding is reduced since the grid quality is, potentially, preserved for longer. However, this comes at the cost of solving an additional partial differential equation for the grid deformation.
Due to the complexities and costs of the unstructured regridding associated with boundaryconforming grid methods, the static grid methods have gained favour with many researchers for approaching FSI problems.
Static grid methods can be split into two categories based on the treatment of the solid-fluid interface: diffuse interface and sharp interface methods. An efficient class of diffuse interface methods is the immersed boundary (IB) method. The original IB method of Peskin [59] was developed to study flow patterns around heart valves. The entire computational domain is represented by a Cartesian grid and the interface is represented by a collection of massless Lagrangian points. The velocity of these Lagrangian points is used to compute the stress on the elastic interface through a constitutive relation, such as Hooke's law. This singular force distribution is transmitted from the Lagrangian interface points to the Eulerian fluid through a forcing term in the momentum equation. The information transfer is facilitated by a discrete Dirac delta function, smoothly spreading the force from the Lagrangian to the Eulerian grid. Aside from inherent stability benefits, this approach is attractive because only the right-hand side of the momentum equation is affected, so no changes to the underlying solver are required. Assuming constant density across the entire domain also allows fast Poisson solvers to be used without modification [41].
Directly applying Peskin's original IB method to rigid boundary problems is not straightforward because the constitutive relation for the elastic boundary is generally not well behaved in the rigid limit [50]. Uhlmann [67] used the direct forcing formulation of Mohd-Yusof [51] whilst retaining the smooth information transfer between Lagrangian and Eulerian grid points. Direct forcing methods use a virtual forcing term determined by the difference between the desired interface velocities and the velocities interpolated from the underlying Eulerian grid. Uhlmann proposed evaluating the forcing term at the Lagrangian interface points before smoothly transferring it to Eulerian grid points in order to reduce spatial oscillations common to direct forcing methods. With this direct forcing IB method, Uhlmann and Doychev [68] performed sedimentation simulations with O(10 4 ) spherical particles in tri-periodic domains.
Kempe and Fröhlich [41] modified the interpolation and spreading procedure of [67], greatly improving on the Courant-Friedrichs-Lewy (CFL) time step restriction and the stability for light bodies, achieving particle-fluid density ratios as low as ρ r = 0.3 for spherical particles. With this IB method Vowinckel et al. [69] investigated turbulent channel flow with mobile beds consisting of O(10 4 ) spherical particles.
Glowinski et al. [26] developed a distributed Lagrange multiplier/fictitious domain method (DLM/FD) reminiscent of Peskin's IB method using the principle of variational inequality in a finite element context. A combined variational formulation of the fluid-solid coupling was used-similar to that of [34]-but the rigidity was enforced in the particle subdomains through Lagrange multipliers, thereby allowing the governing equations to be solved on stationary grids. The original DLM/FD method imposed velocity on the particle subdomain. Patankar et al. [57] imposed the deformationrate tensor instead in order to simplify treatment for irregularly shaped particles, while Wachs [70] used this formulation in conjunction with a discrete element method to develop an improved collision mechanism for irregularly shaped bodies.
The smoothing intrinsic to diffuse interface methods reduces the solution accuracy in the immediate vicinity of the interface. This can be addressed through increased grid resolution, but at the cost of greatly increased calculation size given the grid uniformity. This problem is particularly severe in high Reynolds number flows, where the viscous boundary layer is thin. One popular way to improve accuracy at the interface is through the use of sharp interface methods, which generally reconstruct the solution near the interface, thereby enforcing the boundary conditions strongly. Fadlun et al. [21] developed a hybrid approach that borrowed on the momentum forcing approach of [59] but with the goal of enforcing boundary conditions exactly. Fadlun et al. [21] used the direct forcing formulation of [51], but imposed boundary conditions by directly modifying the coefficients of the linear system rather than explicitly calculating the forcing. This was done in order to avoid solving an implicit system for the forcing, which came about because of the implicit treatment of the viscous terms in their fractional step time advancement scheme [21]. In effect, this is equivalent to applying momentum forcing inside the fluid domain, leading to problems with moving grids when previously "solid" grid points are converted to "fluid" points. Yang and Balaras [73] demonstrated that unphysical information carried by a previously solid point into the flow field makes the evaluation of the right-hand side of the momentum equation in the fractional step algorithm problematic. They developed a "field extension" strategy, whereby the velocity and pressure fields are extended into the solid phase at the end or beginning of each sub-step. Borazjani et al. [12], developed a sharp interface method based on front-tracking for general curvilinear background grids that can handle an arbitrary number of bodies in three dimensions. Added-mass effects were treated with sub-iterations and an Aitken acceleration technique. Recently, Yang and Stern [74] improved on the field extension strategy through the introduction of temporary non-inertial reference frames attached to the solid phases. This is based on the fluid-solid coupling strategy of Kim and Choi [43] and allows the velocity coupling to be done in a non-iterative fashion. Crucially, [74] only use the non-inertial reference frame for the forcing, solving the flow field on the inertial reference frame and thereby allowing for an arbitrary number of solid phases. By allowing non-iterative methods to be used for the forcing and rigid-body motion calculations, [74] improved the performance by up to an order of magnitude. However, such field extension strategies reduce the sharpness of the immersed boundary, diminishing the advantage over diffuse methods [63].
Flow reconstruction based sharp interface methods suffer from force oscillations in moving boundary problems [48]. This is caused by the instantaneous role reversal of a grid point from an interpolation to a discrete Navier-Stokes point, and vice versa, when crossed by the boundary. This role reversal involves an immediate change in computational stencil, causing a temporal discontinuity that even the field extension of [73] does not address [48]. Seo and Mittal [63] explored the source of the temporal discontinuity in their study using a cut-cell sharp interface method and were able to reduce the oscillations by roughly an order of magnitude through improved geometric mass conservation near the interface.
Much recent effort has been focused on static grid methods for particulate flow, owing to the high efficiency of such methods and the desire to simulate engineering flows containing huge numbers of particles. Broadly speaking, boundary conformal methods can offer superior accuracy near the solid/fluid interface but may be inefficient for problems with large spatial deformations due to the costs associated with regridding at each time step. On the other hand, static grid methods-which can be highly efficient-have difficulty resolving the boundary layers near particles and may require very fine grids or adaptive grid refinement to obtain accurate results [71].
In this work we will evaluate the overset grid, or Chimera grid, method for viscous particulate flow. Overset grid methods have been widely used for problems with moving geometries. They were recognised early on to be a useful technique for treating rigid moving bodies, such as aircraft store separation [19], and have subsequently been applied to many other moving-grid aerodynamic applications, see for example [15,16,31,33,46,49,75]. English et al. [20] present a novel overset grid approach using a Voronoi grid to link Cartesian overset grids. This differs to the method used here, where interpolation stencils are directly substituted into the coefficient matrix and solid boundaries are represented using curvilinear grids. The basic approach of moving overset grids used in this article was developed for high-speed compressible and reactive flows by [33] and included the support for adaptive grid refinement. The deforming composite grid (DCG) approach was developed in [6] for treating deforming bodies with overset grids, and a partitioned scheme was developed for light deforming bodies that was stable without sub-iterations. A method to overcome the added-mass instability with compressible flows and rigid bodies was developed in [9]. More recently, stable partitioned schemes for incompressible flows and deforming solids have been developed [7,8,47] and extended to non-linear solids [5].
The method described in this paper retains much of the efficiency of static structured grid methods whilst still allowing for sharp representation of solid boundaries. The overset grid method can be seen as a bridge between the static grid methods such as IBM and boundary conformal grid methods; the curvilinear particle grids allow for higher than first-order accuracy and boundary conditions to be implemented strongly, while grid connectivity with the static Cartesian background grid is only locally updated. Since the grid connectivity is only updated locally, the regridding procedure is less costly and complex than for unstructured body conformal methods, such as ALE. Local grid refinement allows boundary layers to be fully resolved without appreciably affecting the total grid point count. This is in contrast with general static grid methods where the solver efficiency is offset by the unfavourable scaling associated with uniform grids, making large fully resolved simulations very costly [71]. For these reasons, we evaluate the suitability of the method for fully resolved simulations of incompressible fluid flow with rigid particles. Note that the scheme described here is implemented in the Cgins solver that is available as part of the Overture framework of codes (overtureFramework.org). Although past papers have described the use of Cgins for bodies undergoing specified motions (e.g. [14]), the discussion here of the algorithm involving freely-moving rigid-bodies is new.
In section 2 the overset grid method is summarised, before stating the mathematical formulation in section 3. These equations are discretised in space and time in section 4 and 5, respectively, while section 6 presents the fluid-solid coupling methodology. A grid convergence study is performed in section 7.1 using a representative test case to determine appropriate spatial resolutions for the wake structures captured by the background grid, and boundary layer captured by the particle grid. Following this, validation cases in both two and three space dimensions are presented, comparing against published experimental and numerical results. The results of our evaluation are summarised in section 8, along with an outlook towards future work.

Overset grids
The method of overset grids (also called overlapping, overlaid or Chimera grids) bridges the stationary and boundary conformal grid methods. A complex domain is represented by multiple body-fitted curvilinear grids that are allowed to overlap, as shown in Fig. 1. Overset grids bring flexibility to grid generation since component grids are not required to align along block boundaries. This flexibility allows component grids to be added in a relatively independent manner, requiring only local changes to grid connectivity.
A composite grid G consists of logically rectangular component grids G k , with k = 1, 2, . . . N g . As illustrated in Fig. 1 the grid points of G are classified as interior points, boundary points, interpolation points and exterior or un-used points. The algorithm for generating a composite grid from a collection of component grids is intricate; a detailed description is beyond the remit of this work and so it will be only briefly discussed, with the intention of showing why the method of overset grids is appropriate for particulate flow problems. The interested reader is referred to [17] and [29] for a full description of the algorithm and implementation. When moving grids are used, as is the case in particulate flow problems where each particle is represented by a separate component grid, the relative position of overset grids changes continuously. As a result, overlapping connectivity information, i.e. Chimera holes (regions of exterior points in the overset component grids) and interpolation points, must be recomputed at every time-step. Crucially, this is cheaper than complete grid regeneration and the required connectivity information recomputation can be locally confined to those grids affected by the moving grid.
Values of the solution at interpolation points are determined by standard tensor-product Lagrangeinterpolation. We use quadratic interpolation (three-point stencil in each direction) for the results in this paper, as required for second-order accuracy [17]. This interpolation is not locally conservative. Locally conservative interpolation on overset grids is possible [18] but has not been found necessary in our experience. Corrections to ensure global conservation are also possible and have been shown to have advantages, see for example [64].

Governing equations
Incompressible flow is governed by the Navier-Stokes equations, where u is the vector of Cartesian components of the velocity u i , p the pressure field, ρ the fluid density, and ν = µ/ρ the kinematic viscosity. For discretising the equations on a moving grid (on an overset grid some grids are static while others are attached to, and move with the body), we make a change of dependent variables x and t to a frame that moves with the grid. As a result, on moving domains, the governing equations transform to where w is the velocity of a point attached to the moving domain. The partial derivative in time in the moving frame, as appearing in (1), is therefore the derivative in time when keeping the spatial location fixed to a point that is attached to the moving domain. We solve an alternative formulation of system (1)- (2). A pressure Poisson equation is derived by taking the divergence of the momentum equation (1) and using (2). The resulting velocity-pressure formulation of the initial-boundary value problem is where J (∇u) ≡ ∇u : ∇u and Ω denotes the fluid domain in n d space dimensions. There are n d primary boundary conditions, denoted by B(u, p) = 0. The velocity-pressure formulation requires an additional boundary condition for the pressure. Here, the velocity divergence (6) is applied as the boundary condition on the pressure, making the velocity-pressure formulation equivalent to the velocity-divergence formulation [28]. For the second-order accurate scheme used here, boundary conditions are required to determine u and p at a line of fictitious (ghost) points outside the domain boundary. Some of the numerical boundary conditions are compatibility conditions, derived by applying the momentum and pressure equations on the boundary.
The motion of a rigid body immersed in the fluid is governed by the Newton-Euler equations, Here x b (t) and v b (t) are the position and velocity of the centre of mass, respectively, m b is the mass of the body, ω is the angular velocity, A is the moment of inertia matrix, e i are the principal axes of inertia, F (t) is the applied force, and T (t) is the applied torque about the centre of mass of the body. The principal axes of inertia are integrated over time to find the rotation matrix which is used to update positions, velocities and acceleration of points attached to the body surface.
The force and torque on the body are determined from both body forces, such as gravity, and hydrodynamic forces arising from the stresses exerted by the fluid on the body surface, Γ, where x is a point on Γ, τ = µ(∇u + (∇u) T ) is the viscous stress tensor, n is the unit normal vector to the body surface (outward pointing from the fluid domain), f b is any external body force and t b is any external body torque.

Spatial discretisation
The equations of the velocity-pressure formulation are discretised to second-order accuracy in space using finite difference methods on overset grids, see [28] and [32]. An overset grid consists of logically rectangular grids that cover a grid region and overlap where they coincide. The solutions between adjacent grids are connected via interpolation conditions. Each component grid (numbered k = 1, 2, . . . , N g ) is associated with a transformation d k : R 3 → R 3 from the unit square, with coordinates denoted by r = (r 1 , r 2 , r 3 ), into physical space, x = (x 1 , x 2 , x 3 ), and denoted by d k (r, t) = x(r, t), which allows for body fitted grids of non-rectangular shapes. Consider solving the equations in three space dimensions on a square component grid G k , with grid spacing h m = 1/N m , for a positive integer N m : is a multi-index and a and b denote the beginning and end grid line numbers, respectively. Ghost points are included at the boundaries, i m = N m,a,k or i m = N m,b,k , to facilitate discretising to second order. The component grid number k will be dropped in the following discussion.
, and P i ≈ p(x i , t) be the numerical approximations to u, w and p, respectively. The momentum and pressure equations are discretised with secondorder finite difference stencils. The derivatives with respect to r are standard second-order centred finite difference approximations, for example, where e m is the unit vector in the m-th coordinate direction. Using the chain rule the derivatives with respect to x are defined as where the entries in the Jacobian matrix, ∂r m /∂x n are obtained from the mapping x = d k (r, t).
The discretised governing equations are

Temporal discretisation
The method of lines is used for solving the equations in time. After discretising the governing equations in space they can be regarded as a system of ODEs, where pressure is considered a function of the velocity, P = P (U ). The equations are integrated in time using either a fully explicit or semi-implicit scheme, depending on the stability restriction imposed by the viscous time-step characteristic to the problem. The explicit scheme uses a secondorder accurate Adams-Bashforth predictor followed by a second-order accurate Adams-Moulton corrector. For light rigid bodies, multiple correction steps are used to stabilise the scheme, underrelaxing the computed forces on the bodies. The semi-implicit scheme treats the viscous term of the momentum equation implicitly with a second-order Crank-Nicolson method, which is once again combined with Adams-Moulton corrector steps if under-relaxed sub-iterations are required.
To illustrate this we will use the momentum equations as an example. Splitting the equations into explicit and implicit components we have where F E and F I are the explicit and implicit components, respectively The equations are integrated using either fully explicit or semi-implicit schemes. The explicit integration scheme used in the present work is the second-order in time Adams predictor-corrector method. It consists of an Adams-Bashforth predictor with the constants β 0 = 1 + ∆t 2∆t 1 and β 1 = − ∆t 2∆t 1 chosen for second-order accuracy even with a variable time-step where ∆t 1 = t n − t n−1 , and an Adams-Moulton corrector Though only one corrector step was taken here, one may in practice correct multiple times. In fact, this is necessary when dealing with moving, light rigid bodies and partitioned fluid-solid coupling, as will be discussed later.
A semi-implicit approach is taken in some diffusion dominated problems where the explicit diffusive time-step is overly restrictive. Generally, this is the case when the Reynolds number is very low or the grid is highly refined near solid boundaries. Here, the non-linear convective terms are treated with the explicit Adams predictor-corrector method while the viscous terms are treated with the implicit second-order in time Crank-Nicolson method. Using the notation introduced in (8) then the time-step consists of a predictor, and a corrector u c − u n ∆t = 1 2 where the superscript c denotes the corrected solution and α = 1 2 gives the second-order Crank-Nicolson method.
The basic Navier-Stokes solver uses a solution algorithm that decouples the pressure and velocity fields [28,32] in a similar fashion to many fractional-step and projection schemes, cf. [1,24,58] and many others. The advantage of the current scheme over typical projection schemes is that the boundary conditions for the pressure are well-defined and it is straightforward to obtain full second-order accuracy for all variables.
Assume that at time t − ∆t the values of U (t − ∆t) and P (t − ∆t) are known at all points in the solution domain and the values of F (U (t − ∆t), t − ∆t) are known at all interior points. To advance the solution in time to t the fully explicit algorithm proceeds as follows: Step 1 Determine an intermediate solution U * i (t) at all interior nodes using a predictor sub-step Step 2 Determine U * (t) at the boundary and ghost nodes by solving the boundary conditions where µ = 1, . . . , n d − 1 and only the tangential component of the momentum equation is used.
Step 3 Determine P i (t) by solving the pressure Poisson equation along with the remaining boundary conditions The normal component of the momentum equation is used here as a Neumann boundary condition for the pressure Poisson equation. The ν∆u term has been replaced by −ν∇ × ∇u to avoid a viscous time-step restriction, see [60] for more details.
Step 4 Given U * (t) and P (t) the pressure gradients can be computed and F (U * (t), t) found at interior nodes.
Step 5 Correction steps can now be taken to either increase the time-step, or as needed, to stabilise the algorithm for light rigid bodies. The correction steps consists of the Adams-Moulton corrector for the velocity followed by an additional pressure solve. For light bodies, when added mass effects are large, under-relaxed sub-iterations are used during these corrector steps to stabilise the scheme. Typically 3-7 corrector steps are used in the present work, depending on the significance of added mass effects in the problem.
For moving grids, additional steps in the algorithm are required to evolve the rigid-body equations (as discussed in the next section) and subsequently move the component grids. After the component grids have been moved the overset grid connectivity information is regenerated. Note that since the governing equations are solved in a reference frame moving with the grid, no additional interpolation is needed to transfer the solution at discretisation points from one time step to the next. As grids move, however, some unused points may become active and values at these exposed-points are interpolated at previous times as discussed in [33] For small problems (number of grid points O(10 4 )) the linear systems of equations for the velocity components and the separate system of equations for the pressure are effectively solved using direct solution methods. Larger problems necessitate iterative approaches; we use Krylov subspace methods from the PETSc library [4], algebraic multigrid solvers from the the Hypre package [22] and the geometric multigrid solvers for overset grids from Overture [30].

Fluid-solid coupling
This system of ODEs governing the particle motion is discretised in time using a Leapfrog predictor and Adams-Moulton corrector scheme. The predictor consists of and is performed before Step 1 in the time-stepping algorithm of Section 5. The corrector is , and is performed after Step 3 (pressure solve) in the time-stepping algorithm. A predictor-corrector scheme is used to facilitate the fluid-solid coupling, and to allow for sub-time-step iterations for light bodies as discussed next.
Low solid/fluid density ratios can cause the standard time-stepping routine to become unstable, owing principally to the added-mass instability [7]. To alleviate this, under-relaxed sub-iterations are performed during the correction stages (i.e. fluid velocity solve and pressure solve) of the time-stepping algorithm. These sub-iterations are thus relatively expensive although the implicit systems are not changed during these iterations.
The approach used here is similar to that used by many previous authors, although we prefer to under-relax the force on the rigid-body as opposed to under-relaxing the entire state of the rigid-body. Note that more sophisticated approaches exist to reduce the required number of subiterations such as those based on Aitken acceleration [12,45].
We illustrate the relaxed sub-iteration through consideration of the rigid-body velocity equation, though this is performed for the angular velocity equation as well. The force-relaxation subiteration replaces the update for v n+1 b in the corrector step above by the iteration where k denotes the iteration count. The iterative forcing used to evolve the equation is where α is a relaxation parameter and F k is the forcing at step k, which initially is simply the predicted force from the previous fluid solve step, i.e. F 1 = F p . During each sub-iteration, the fluid velocity and fluid pressure are recomputed and these updated fluid values are used to compute the next approximation to the force on the rigid body. A small α can ensure stability-at the cost of increased iterations. An optimal value of α is problem dependent and some experimentation is required to reach a good compromise between stability and computational cost. For example, a value of 0.1 was used in the pure wake interaction test case of §7.4 where the maximum number of sub-iterations was 39 during the first few time-steps, likely due to the non-smooth forcing at start up, and the minimum and average number of sub iterations were 5 and 7, respectively. Iterations are performed until the absolute or relative change in the force fall below their respective convergence criteria, ∆F k < τ a , or ∆F k /(|F k | + F ) < τ r , where ∆F k = |F n+1,k − F k |.

Collision model
A hard-sphere collision model based on the linear conservation of momentum is used to handle cases in which particles touch 6 . During the predictor step of the particle advancement scheme the new positions are used to determine whether or not particles breach the minimum separation distance, as stipulated by the requirements of the interpolation stencils. If this minimum separation distance is breached, a collision is deemed to have occurred and the particle velocities are corrected. The velocity corrections are calculated bŷ , e r is the coefficient of restitution and n A = −n B is the unit normal vector pointing from the centre of mass of particle A to the centre of mass of particle B.
In this work, collisions were modelled as perfectly elastic with a coefficient of restitution of e r = 1. This is a frictionless model, so tangential forces are assumed to be zero during the collision, and angular velocities are not corrected by the model. This hard-sphere model is also restricted to the contact of only two particles at any given moment in time.

Convergence study
To accurately simulate viscous flows the grid resolution must be fine enough to fully capture boundary layers adjacent to solid surfaces. These can be very thin, depending on the Reynolds number of the problem as the boundary layer depth scales approximately as 1/ √ Re, see [10]. A major advantage of boundary-conformal over static grid methods is the ability to selectively refine the grid near solid boundaries. In a detailed grid independence study of viscous flow past a static cylinder, Nicolle [53] investigated how refinement of different areas of the grid affected the behaviour of the cylinder. Predictably, the surface resolution was found to most affect the cylinder behaviour, but the downstream wake resolution was found to affect the Strouhal number. Nonetheless, large ratios between surface and wake resolution were found to give very accurate results.
In the present work, emphasis is placed on the grid characteristic length scales to optimise run time. Following Nicolle and Eames [54] we use two grid length scales to quantify the quality of the  computational grid: the domain length scale (DLS) is the background grid element size while the surface length scale (SLS) is the grid element size on the surface of the particle.
The grid independence study is first performed using a grid with nearly uniform grid spacings and is then repeated using grid refinement near the particle boundary. Descriptions of the grids used are provided in table 1. Because curvilinear grids are used to represent the particles, the cells are slightly distorted in physical space. Thus, table 1 provides minimum, average and maximum cell volumes (areas).
The test used in both convergence studies is as follows: the domain is a rectangular channel of dimensions (W * , H * ) = (4 D, 40 D) filled with an a priori quiescent fluid of density ρ f = 1 g/cm 3 and kinematic viscosity of 0.05 cm 2 /s. The particle (D, ρ p ) = (0.25, 1.5) is released from rest at (x * 0 , y * 0 ) = (D, 38.4 D) with the gravitational constant set to g = 981 cm/s 2 in the negative ydirection. The problem geometry can be seen in Fig. 2. The results are non-dimensionalised as follows: where U T = 8.6041 cm s −1 is the measured terminal settling velocity. Under the action of gravity the particle rotates in a clockwise sense, as if rolling down the wall. Immediately after release it moves a short distance towards the near wall before migrating towards its equilibrium position along the channel centreline.
The wake remains attached but is unsteady. This is reflected in the oscillatory u * and ω * velocity time histories, shown in figures 4 and 3. The angular velocity time history exhibits a large initial peak after which it is rapidly damped to low amplitude oscillations about ω * = 0. After a very small negative peak, the u * time history exhibits a large positive peak, much like ω * , with damped successive peaks. However, unlike in ω * , the following u * oscillations have a non-zero mean value as the disk drifts towards its equilibrium position. In contrast, the vertical velocity, v * , shows the particle rapidly reaching a steady settling velocity, unaffected by the attached unsteady wake.
where data from the GU1 simulation are used as reference values. Relative errors in v * and ω * taken early on in the simulation, at t = 0.2 s, and late in the simulation, at t = 1.0 s, show greater than second order convergence for both components early on but a decrease in convergence rate for ω * as the simulation progresses. The gravitational acceleration is impulsively turned on at t = 0, and this non-smooth forcing could have a detrimental effect on the convergence rates. The test was repeated using a series of refined grids. These were constructed using a fine grid immediately surrounding the particle surface, smoothly connected to a coarse background grid using a transition grid. This construction can be seen in Fig. 2. Detailed descriptions of the grids are provided in table 1, where uniform grids are denoted by the prefix GU and refinement grids by GR.  Table 2: Absolute values and relative errors for the convergence study taken at t = 1.0 s. Table 2 shows absolute values of u * , v * and ω * at t = 1.0 s as well as relative errors, where data from GU1 were taken as reference values. As before, it is evident that v * is fairly insensitive to the grid resolution, while u * and ω * show a large dependence on the near surface resolution. With a high resolution surface grid capturing the boundary layer, the motion of the disk can be captured quite accurately, even with a large surface to background grid resolution ratio. In fact, the GR5 grid with a resolution ratio of 6 : 1 reproduced solutions of GU1 with a maximum error of 5 %, with a more than 23 fold reduction in number of grid points. Fig. 5 shows the relative error in v * and ω * at t = 1 s against the total CPU time of the calculation.

Settling disk impacting a wall
This test simulates the fall of a rigid circular disk in a bounded domain and its impact with the bottom boundary. This test has been performed by other researchers using DLM/FDM [26], an FEM fictitious boundary method [72], and an immersed boundary lattice Boltzmann method [36].  (9), where the characteristic velocity scale U s is an estimated terminal velocity, The present results (Fig. 6) are in good agreement with the previous studies. The disk reaches the terminal settling velocity at t * = 20, with a terminal particle Reynolds number of Re T = 17. 45, consistent with the literature. As the disk approaches the bottom wall the results differ slightly.
The studies compared against in Fig. 6 all exhibit a rebound of the disk from the bottom boundary while the present results do not.  In the present study the grid around the disk and along the bottom of the tank is very fine (SLS = D/96 for both the disk and the bottom of the tank), allowing the lubrication forces and flow in the gap to be better resolved. This slows the particle down more before "contact" is made with the wall 7 . Additionally, the present study used a conservation of linear momentum based hard-sphere collision model approach to model the collision between the disk and the bottom boundary. The previous studies compared against here all used repulsive potential type methods. We can estimate a Stokes number for the particle to comment on the "correctness" of the present results. The Stokes number, Stk, is the ratio between the particle and fluid relaxation times, τ p and τ f , respectively. Taking τ f = R/U T , where R is the disk radius, τ p = mv/F d , and F d is the drag force on the disk, then Stk = mv 2 /(RF d ). Once the disk reaches its terminal settling velocity the drag force balances with the force due to gravity, so F d = πR 2 g(ρ d − ρ f ) and thus the Stokes number is Here, the Stokes number is approximately 2. It has been demonstrated that for 3D cases particles settling with Stk < 10 there is no rebound after contact is made with the bottom boundary [2,39]. Assuming this holds true for the 2D equivalent, then the above results indicate that the repulsive potential collision model is a poor sub-grid model for low speed impacts. The rebound evident in the study of Glowinski et al. [26] indicates that a higher grid resolution is required to adequately resolve the lubrication forces than the hydrodynamic interactions during free-fall. While the current approach is adequate here it is clear that in many situations resolving the gap is not practical and prohibitively expensive. Qiu et al. [61] presented a novel solution to computing incompressible flow in thin gaps using pressure degrees of freedom on virtual solid surfaces to provide solid-fluid coupling in the gap region, which when extended to no-slip boundaries could be a good alternative for this sort of problem.

Settling of two offset disks
Two offset cylinders settling in a quiescent fluid are simulated to demonstrate the drafting, kissing, tumbling behaviour observed experimentally by Fortes et al. [25]. This is a difficult problem to simulate owing to the non-linear nature of the particle motion and the particle-particle and particle-wall interactions. Results are compared to previous studies of Patankar et al. [56,57], Wan and Turek [72], Niu et al. [55], Zhang and Prosperetti [76] and Feng and Michaelides [23] for a low Reynolds number case and Uhlmann [67]  For the low Reynolds number case, the density ratio is ρ r = 1.01 and the fluid has kinematic viscosity ν = 0.1 cm 2 /s. For the moderate Reynolds number case, the density ratio is ρ r = 1.5 and the kinematic viscosity is ν = 0.01 cm 2 /s. In both cases the gravitational constant was taken as g = 981 cm/s 2 and the results are non-dimensionalised as in (9), where the characteristic velocity, U s , is again calculated using (10). Initially, the two particles begin moving from rest under the influence of gravity with the same acceleration. As the wake forms behind the lower particle the top particle becomes shielded in the resultant low pressure region. This allows the top particle to draft behind the lower particle, similar to cyclists in a peloton. This is the "drafting" stage. Eventually, the top particle makes near contact with the lower particle (they "kiss") and effectively form an elongated body with axis parallel to the fall. This configuration is inherently unstable and the elongated body rotates to align its long axis perpendicular to the fall. This is the "tumbling" stage described in [25]. The particles separate and the lower particle is overtaken by the top particle, which continues to sediment with a slightly negative u * velocity. The other particle impacts the wall, after which it, too, sediments with a slightly negative u * velocity. The results for the low Reynolds number case compare well qualitatively with those of [23,56,57] but not quantitatively (see Fig. 8). Though quantitative agreement is not necessarily apparent amongst the results of these studies themselves, what is apparent is that their settling velocities are all lower than those found in the present study. Although different methods were used, all three of these simulations used low grid resolutions around the particles, particularly [23]. The study in [76] used a higher resolution grid, with 20 computational nodes per particle diameter. Very good quantitative agreement is found between that study and the present one, as is evident from figure 9, though there is a discrepancy in the duration of the "kissing" contact and the onset of "tumbling". The onset of "tumbling" is caused by the build up of numerical error, so this is expected to be solver specific. In the absence of numerical error, or bias introduced by the grid, the disks would not leave the "kissing" stage [67].
Results for the moderate Reynolds number case are shown in Fig. 10. These are compared to results from [67], who used an immersed boundary method on high resolution grids. Both qualitatively and quantitatively the results are in excellent agreement for the particle positions and u, v velocity components, with the only differences found during the initial contact and subsequent "kissing" stage, due to the different collision models. All of the aforementioned studies used a repulsive force based model, while a conservation of linear momentum model is used here. Fig. 10 shows good qualitative but poor quantitative agreement for the angular velocity component. This is a very sensitive metric [67] and it is likely that the differences are due, in large part, to the different collision mechanisms used. The novel approach of Kempe and Fröhlich [42], which uses a subgrid lubrication force correction and conserves angular momentum, would probably be a more appropriate collision mechanism for this case.

Two particle wake interaction
This is a case presented by Uhlmann [67] to test the fluid-structure interaction, with particular emphasis on examining the effect of wake interactions between the particles on the angular velocity. Two particles of differing densities settle in an otherwise quiescent fluid. The heavier particle passes the lighter particle, subjecting it to perturbations from its wake. The particles do not collide and therefore no collision model is required, making it an attractive benchmark case.  Figure 11: Problem geometry for the two particle wake interaction test case.
The computational domain, shown in Fig. 11, has a width of 10 D, a height of 50 D and the grid characteristic length scales are DLS = D/40 and SLS = D/100, where the particle diameter is D = 0.2 m. A heavier particle of density ratio ρ r,1 = 1.5 is initially positioned at x = (−0.65 D, 4 D) from the channel centreline and top boundary respectively, while the lighter particle of density ratio ρ r,2 = 1.25 is positioned at x = (0.65 D, 6 D). Both particles are initially at rest and the fluid of kinematic viscosity ν = 0.0008 m 2 /s is quiescent at t = 0 s. The gravitational constant is set to g = 9.81 m/s 2 . The results are non-dimensionalised as in (9), where the characteristic velocity U s is calculated using (10) and the density ratio of the heavier disk, viz. ρ r,1 = 1.5.
The maximum particle Reynolds numbers of the heavy and light particle are 280 and 230, respectively [67]. Uhlmann [67] used a uniform grid resolution of DLS = D/40, allowing for fewer than three grid points across the estimated boundary layer depth for both the light and heavy particles.   Figure 13: Time histories of x * , y * , v * , u * and ω * for the pure wake interaction test case with ρr,1 = 1.5, ρr,2 = 1.25, ν = 0.0008 m 2 /s and data from Uhlman [67] overlayed.
Given the findings of the convergence study in section 7.1 it is not likely that the solutions at this  Table 3: Absolute values and relative errors for disk 2 taken at t = 1 s of the wake interaction grid independence study.
grid resolution are grid independent. A brief convergence study was performed, shown in table 3, and solutions were found to converge at SLS = D/100, allowing for 6 points across the boundary layers for both the light and heavy particle. Fig. 12 shows successive snapshots of the instantaneous vorticity field with snapshots from [67] below. The evolving flow field and particle positions match well. Fig. 13 shows the time histories of the particles position, velocity and angular velocity.
The converged solutions of the present study match well with those in [67], although a phase shift is apparent in the oscillatory components and the settling velocity is slightly higher in the present study.
The largest differences are found in the horizontal velocity components, particularly for the light particle. These differences are probably due in large part to the differences in the angular velocity components, which will affect vortex shedding and lift on the particles. The amplitudes of the angular velocity component oscillations for the heavier particle match well but a slight phase shift is apparent. This is reflected in the horizontal velocity components for the heavier particle by matching amplitudes but markedly different periods of oscillation. The angular velocity components of the lighter particle differ in both amplitude and period of oscillation, leading to more pronounced differences in the horizontal velocity components of the two studies.

Settling sphere
As a final validation case we compare experimental [65] and numerical [74] results on the motion of a single sphere in a closely confined container to numerical results produced by the current method. Four cases were run, with Reynolds numbers ranging from 1.5 to a moderate 31.9. The material parameters used in each case are detailed in table 4, with g = 9.81 m/s 2 throughout. For each case the grid characteristic length scales are DLS = D/10 and SLS = D/38, allowing for approximately six points across the estimated boundary layer depth for the highest Reynolds number case.  The sphere undergoes three distinct periods of motion after its release from rest: an initial acceleration followed by a period of steady fall at a terminal settling velocity and finally a deceleration as it approaches the wall. As the Reynolds number is increased the three stages become progressively shorter. Similar to Yang and Stern [74] the wall collisions were not considered here and the simulations stopped before the sphere made contact with the wall.
The present results are shown in Fig. 14 and match satisfactorily with the experimental results of ten Cate et al.
[65] although some slight differences remain: the terminal settling velocity in case 1 is found to be approximately 4.6 % lower here and in case 2 the sphere begins the wall induced deceleration sooner than in [65]. This earlier deceleration in case 2 is also present in the results of [74], as is the lower terminal settling velocity of case 1. The benefit of using overset grids is again evident; the grid used above consists of 5.18 × 10 5 grid points while the results are as good as those produced on a uniform grid over three times the size.

Conclusion
We evaluated the overset grid method for DNS of viscous, incompressible fluid flow with rigid, moving bodies. Several FSI benchmark test cases were carried out for verification and validation purposes. A systematic convergence test was carried out using six uniformly refined grids and six with local refinement near the particle surface. Local refinement was found to produce results deviating no more than five percent from the reference solutions, with a more than 23 fold decrease in grid point count and a subsequent 13 fold decrease in CPU time.
Finally, results for a sphere settling in a small tank at various Reynolds numbers compared well with both experimental results of ten Cate et al.
[65] and recent numerical results of Yang et al. [74], using only one third the number of grid points as the latter study.
The popular test cases presented in this work are all, to varying degrees, inertially dominated and exhibit viscous boundary layers that must be fully resolved to accurately simulate the behaviour of the rigid bodies in the flow. The second-order accurate boundary fitted method demonstrated here was found to produce reasonably converged results with approximately six grid points across the estimated boundary layer depth. By using a coarse-but fine enough to resolve wake structures-Cartesian background grid and refined, boundary-fitted grids, grid point counts were greatly reduced, even in two-dimensional problems.