A nearest-neighbour discretisation of the regularized stokeslet boundary integral equation

The method of regularized stokeslets is extensively used in biological fluid dynamics due to its conceptual simplicity and meshlessness. This simplicity carries a degree of cost in computational expense and accuracy because the number of degrees of freedom used to discretise the unknown surface traction is generally significantly higher than that required by boundary element methods. We describe a meshless method based on nearest-neighbour interpolation that significantly reduces the number of degrees of freedom required to discretise the unknown traction, increasing the range of problems that can practically solved, without excessively complicating the task of the modeller. The nearest-neighbour technique is tested against the classical problem of rigid body motion of a sphere immersed in very viscous fluid, then applied to the more complex biophysical problem of calculating the diffusion timescales of a macromolecular structure modelled by three closely-spaced non-slender rods. A heuristic for finding the required density of force and quadrature points by numerical refinement is suggested. Matlab/GNU Octave code for the key steps of the algorithm are given, which predominantly use basic linear algebra operations, with a full implementation being provided on github. Compared with the standard Nystr\"om discretisation more accurate and substantially more efficient results can be obtained by de-refining the force discretisation relative to the quadrature discretisation: a cost reduction of over 10 times with improved accuracy is observed. This improvement comes at minimal additional technical complexity. Future avenues to develop the algorithm are then discussed.


Introduction
When attempting to formulate and solve mathematical models of microscopic biological flow systems, for example involving macromolecular structures, swimming cells and cilia, a significant challenge to overcome is that the flow domain is typically bounded by curved, moving surfaces. Often it is of interest to model line-like objects such as cilia and flagella, and pointlike bodies such as suspensions of many bacteria, in addition to genuinely 2D surfaces. The Stokes flow equations are linear, and in some celebrated cases it has been found possible to make significant analytical progress, for example by exploiting small amplitude expansions in the boundary movement [1] or slenderness [2,3,4], for certain idealised problems (for a more detailed review of the field, see Lauga & Powers [5]). However the majority of problems of practical interest, typically involving multiple cells, non-planar domains and large amplitude motions, require computational modelling, and there has been intensive activity in this area in the last decade.
The linearity of the flow equations enables the formulation of methods based on the boundary integral equation for Stokes flow; these methods remove the need to discretise and solve directly in the flow volume, as would be necessary for the finite element method. This reduction in dimensionality both removes the need to mesh and re-mesh the evolving flow domain, and vastly reduces the size of the linear algebra problem resulting from discretisation. In certain respects these methods were anticipated by the computational slender body theory work of Higdon [6] and [7]; relatively early examples of the 'fully-fledged' boundary element method for Stokes flow was developed by Phan-Thien and colleagues [8,9]. The achievements of the latter group with late 1980s/early 1990s computational hardware set a benchmark for work in the current era of desktop machines with multi-gigabyte RAM. It should of course be noted that there have been major algorithmic developments in numerical methods for Stokes flow in the intervening period, including the completed double-layer boundary integral equation [10,11], hybrid boundary integral-multipole methods [12], spectral discretisation combined with the fast multipole method [13,14], quadrature by expansion [11], and slender body theory combined with these techniques [15]. These approaches are generally employed by numerical experts to solve problems at the limits of computational feasibility, involving very large numbers of interacting bodies.
The classical boundary element method for Stokes flow, along with the more advanced methods described above, are both accurate and efficient. However, they present two technical challenges in their implementation, particularly when considered from the point of view of users who are not computational specialists. The first challenge is the need to generate a surface mesh, i.e. a geometric discretisation of all surfaces in the problem consisting of oriented smooth, and smoothly-connected, patches which interpolate several surface points. 1 While much easier than the volumetric meshing that would be required for the finite element method, meshes of even moderately complicated biomolecular or cellular structures may require significant time and ingenuity to create, and may not be suited to automated generation -as might be needed to study biological heterogeneity. Furthermore, some objects will appear to a very good approximation as lines or points -detailed surface meshing of these bodies may involve a level of computational refinement that is unwarranted. The second challenge -which has arguably been addressed through the availability of library code such as BEMLIB [16] -is the singularity of the stokeslet velocity and stress kernels, and requirement for semi-analytical quadrature methods. The latter issue does however present an additional layer of complexity for those who are not numerical specialists.
The method of regularized stokeslets, introduced by Cortez and colleagues [17,18,19,20], has proved to be an effective and accessible method for simulating and analysing microscale biological flows. This method deals effectively with both of the above difficulties by removing the need for a true mesh, requiring only a set of discrete points approximating the solid objects in the flow, and regularising the integral kernel so that specialised quadrature is not required. The core idea is the derivation of a family of regularized versions of the singular stokeslet/Oseen tensor kernel that nevertheless satisfy exact conservation of mass. Whereas the singular stokeslet corresponds to the Stokes flow produced by a Dirac delta force-per-unit-volume distribution, a regularized stokeslet corresponds to the Stokes flow produced by a 'blob', i.e. a finite forceper-unit-volume distribution which approximates a Dirac delta function. Cortez and colleagues have derived various versions of the regularized stokeslet corresponding to both 2D [17] and 3D [18] domains, to various forms of blob distribution [17], with image systems to represent a plane boundary [19,21], and for periodic problems [20]. We will not attempt to give a comprehensive survey of applications of the method of regularized stokeslets; it suffices to note that a Google Scholar search on 28th April 2017 with the term "regularized stokeslets" produced 250 results since 2012.
The standard numerical implementation of the method of regularized stokeslets is to employ a Nyström discretisation of the Fredholm integral equation, which replaces the integral directly with a quadrature rule. This method is very simple to implement, and has been used in the great majority of published work. This simplicity does however come at a computational cost, arising from the fact that the quantity of interest in a boundary integral equation method, the surface traction distribution, varies much more slowly than the near-singular kernel. Therefore very many degrees of freedom, corresponding to the discretisation of the traction, are required in order for the quadrature to be accurate. Furthermore, there is a coupling between the discretisation length scale and the regularisation parameter that must be satisfied in order for results to be considered converged. As a consequence, the RAM requirements alone for relatively simple geometries may be very high, as evident in a number of recent studies on helical flagella for example.
The issue of the computational cost of the method of regularized stokeslets was discussed in an earlier paper [22], in which we suggested employing a boundary element discretisation of the regularized stokeslet boundary integral equation. This approach is undoubtedly computationally efficient, and formed the basis for subsequent detailed modelling of the left right organising structures of mouse [23] and zebrafish [24,25], however it becomes necessary to generate a mesh in the same way as the classical boundary element method.
In this paper we will describe an alternative 'nearest-neighbour' discretisation of the method of regularized stokeslets which retains the meshless simplicity of the standard approach, but has greatly reduced computational cost. Alongside the mathematical description, an implementation in Matlab R /GNU Octave will be given, and applied to a simple test problem of the drag and moment on a sphere or prolate spheroid undergoing rigid body motion, followed by a more complex problem of calculating the rotational diffusion timescale of a biological macromolecule.

Stokeslets and boundary integral methods
The very low Reynolds numbers associated with microscopic flows on the length scales of macromolecules and cells motivates the study of the Stokes flow equations for viscous-dominated flow. The dimensionless form of these equations is, augmented with the no-slip, no-penetration boundary condition u(X) =Ẋ for boundary points X. The basis for boundary integral and singularity methods is to exploit the linearity of eq. (1) to construct solutions satisfying the required boundary conditions from sums and/or integrals of fundamental solutions. The classical singular fundamental solution is the stokeslet or Oseen tensor, given by the second rank tensor S jk and first rank tensor P k for which u = (S 1k , S 2k , S 3k ) and p = P k are the solutions of the Stokes flow equations with a Dirac delta distribution force-per-unit-volume located at y: The form of the stokeslet in 3D is, The singularity method for Stokes flow involves seeking an approximate solution to equation (1) by locating Stokeslets, and sometimes higher order stokes-multipoles, outside of the flow domain. For example, singularities may be located inside cells, or along the centrelines of cilia and flagella as in slender body theory; the simplest example is perhaps the solution to Stokes flow driven by a translating sphere, which can be expressed as the sum of a stokeslet and source-dipole (the latter being a special case of the stokes-quadrupole) at the centre of the sphere. Review and references are given for example Smith et al. [22].
Conversely, the boundary integral method for Stokes flow involves formulating the exact integral equation, where T ijk is the stress tensor associated with the Stokes flow u = (S 1k , S 2k , S 3k ), p = P k , given by The summation convention for repeated indices is used throughout. The boundary integral equation is solved numerically by taking the limit of equation (5) as y approaches the bounding surfaces of the domain from within the fluid, then performing discretisation of the surface geometry ∂D and traction f . If the boundary of the domain is stationary and immersed objects in the domain are rigid bodies, the 'double layer' term arising from the integral of the stress is identically zero and so the flow is given exactly by a surface distribution of stokeslets only; under the weaker condition that ∂D u · n dS = 0 it can also be shown that the double layer integral may be eliminated by taking a modified Stokeslet density, which is no longer precisely the surface traction. In either case, the flow is given exactly by boundary integrals of 'single layer' stokeslet velocity tensors only [26]. A detailed exposition of the boundary element method for Stokes flow and its numerical implementation is given by Pozrikidis [26,16]. The boundary integral and singularity methods may be hybridised to formulate approximate but accurate and efficient simulation of cell movement [27].
The integral equation problem formed from equation (5) in the limit y → Y ∈ ∂D possesses singular integrals which require specialised evaluation; moreover line and point singularity distributions, while they may not lie strictly in the flow domain, may nevertheless complicate the evaluation of flow fields for purposes such as particle tracking. An additional complication for boundary element methods is the requirement to build a true surface mesh. It should be emphasised that these issues are technical complications rather than inherent problems, however methods which do not possess these complications are appealing, particularly for biological flow, as evidenced by the rapid adoption and use of the method of regularized stokeslets, which we will briefly review in the next section.
3 The method of regularized stokeslets and its numerical implementation Cortez [17] formulated the regularized stokeslet as the exact solution to the incompressible Stokes flow equations forced by a spatially-smoothed force per unit volume, φ (x − y), The 'blob' φ denotes a family of functions parameterised by satisfying · · · R n φ dV = 1, and tending to a Dirac delta distribution in the limit → 0. The derivation of specific forms of the regularized stokeslet were discussed by Cortez and colleagues [17,18]; we will suffice by noting that a frequently-used form for 3D flow is based on the blob function, which leads to the regularized Stokeslet pressure and velocity tensors, The regularized counterpart to the classical boundary integral equation (5) in 3D is, Unlike the classical boundary integral equation, the regularized version (12) is approximate even before the numerical discretisation is carried out; for the blob function (8) the error is O( 2 ) for y greater than distance 5 /2 from the boundary, and O( ) otherwise [18]. The double layer integral is typically eliminated in practical implementations of the regularized stokeslet. This elimination may be formally justified for boundaries undergoing rigid body motion, for example models of spirochetes as rotating helices [18] and cilia undergoing purely rotational motion [24], however for bodies which undergo significant flexible motion such as respiratory cilia and sperm flagella, this elimination is an approximation which must be justified by either post hoc numerical checks [22] or slender body theory analysis [28]. The resulting approximate single-layer boundary integral equation is then, In what follows we will treat the approximation as exact, however it should be borne in mind that there is error associated with both the continuous integral equation (13) in addition to the error associated with subsequent discretisation. In what follows we will find it convenient to use the identity S ij (x, y) = S ji (y, x); relabelling, and treating the approximation as exact we have, If the body motion is prescribed, the no-slip condition u(x) =ẋ can be applied on the surface ∂D to convert equation (14) to a Fredholm first kind integral equation for the unknown force distribution f (y) -a resistance problem.
If the body is rigid, or its surface velocity is known up to a rigid body motion, and the total force and moment F , M are known, the result is the mobility problem, where the rigid body velocity U and angular velocity Ω, and the force distribution f (y), are unknown; ijk is the Levi-Civita alterating tensor. The mobility problem arises from situations such as a sedimenting body (for which the force is given by gravity or centrifugal force and the moment is zero), or a swimming cell in the inertialess regime of Stokes flow (for which the force and moment are both zero).
To solve the problems (15) and (16), the method of numerical discretisation described by Cortez et al. [18] and used in the majority of studies to date takes advantage of the regularity of the S ij kernel and directly approximates the surface integrals with a quadrature rule followed by collocation on the quadrature points. The result is a system such as, for the resistance problem, where (x[n], A[n]) are quadrature nodes and weights, and g j [n] = −f j (x[n]). For the mobility problem, we have, , for m = 1, . . . , N, The above approach has the principal advantage of computational simplicity, and the principal disadvantage that the degrees of freedom of the resulting linear system are tied to the quadrature required to approximate the rapidly-varying kernel S ij (x, X) for |x − X| = O( ) -and associated high computational expense for a given level of accuracy. Boundary element methods take an alternative approach to numerical discretisation -to discretise the unknown density f (y) with basis functions Φ n (y), i.e. f (y) = − N n=1 g[n]Φ n (y). The integral operator can then be written as, In the simplest 'constant force' implementation, the basis functions {Φ 1 , . . . , Φ N } are indicator functions on the elements of the mesh {E 1 , . . . , E N }. The stokeslet integrals are then decoupled from the force discretisation, and can be subjected to suitably fine spatial discretisation as appropriate, without unnecessarily increasing the number of degrees of freedom in the system -a major saving in both computational storage and time. This approach was suggested in the context of regularized stokeslet methods by Smith [22], and subsequently applied to problems in developmental biology [24,25] and sperm cell motion [29]. The practical drawback of this method is the need to generate a true surface mesh, which for complex geometries may be time-consuming.
To retain the advantages of both approaches -ease of implementation and computational efficiency -we suggest an alternative approach based on nearest-neighbour interpolation.
4 Nearest-neighbour discretisation of the regularized stokeslet boundary integral which we will refer to as the force discretisation and quadrature discretisation respectively. These discretisations are not true meshes because they are not equipped with a mapping from nodes to elements, and we will not need to evaluate integrals in local coordinate systems. In general, N Q because the kernel S ij (x, y) varies much more rapidly than the surface traction f (y).
Provided that they do not vary rapidly relative to the force points, the force f (y) and surface metric dS(y) may then be discretised using nearest-neighbour interpolation. Denote by N : {1, . . . , Q} → {1, . . . , N } the nearest-neighbour discretisation such that, The nearestneighbour operator N can be expressed as a Q × N matrix, . With the above discretisation, the regularized stokeslet boundary integral may be approximated as, Applying the discretisation (22) to the boundary integral equation (14), followed by performing collocation on the force discretisation u(x[m]) =ẋ[m], leads to the discretised resistance problem,ẋ and mobility problem, The discrete resistance problem (22) can be written as a 3N × 3N linear system Af = b, where the unknown 3N -vector f has components, the 3N × 3N left hand side matrix A has components, and the right hand side velocity is given by, The discrete mobility problem can be written similarly as a 3(N +2)×3(N +2) linear system, where A and b are augmented by six rows discretising the force and moment constraints, and f has six additional scalar unknowns representing the values of U and Ω.

Numerical results and analysis
The core numerical codes for implementation of the method given by equations (25)- (27) are given in appendices A.1-A.3. The full code (approximately 1000 lines) used to produce the results in this report is available from github at https://github.com/djsmithbham/ NearestStokeslets. The quadrature weights are absorbed into the g i [n] and so are never calculated explicitly.
For numerical testing we will denote the maximum discretisation spacing (i.e. maximum distance of a point to its nearest-neighbour) by h f for the force points and h q for the quadrature points: This parameter may be computed for a given discretisation as described in appendix A.4.

Rigid body motion of a sphere
The simplest test case is perhaps Stokes' law for a translating or rotating sphere in an infinite fluid. Taking a sphere of radius 1 translating with velocity U = (1, 0, 0), the exact solution to the resistance problem yields total force F = (6π, 0, 0); rotation with velocity Ω = (1, 0, 0) yields total moment M = (8π, 0, 0). Discretising the sphere by projecting onto the six faces of a cube yields the discretisations shown in figure 1 (a -force/collocation points, b -quadrature points). Numerical experiments assessing the L 2 relative error in total force and moment compared with analytic solutions, for varying regularisation parameter , force points h f and quadrature points h q , are shown in tables 2, 6 and 7 and example computational timings are given in appendix C, table 9. The entries on the main diagonal (h f = h q ) correspond to the Nyström discretisation; 'non-trivial' nearest-neighbour results are above the main diagonal (h f < h q ). Results below the main diagonal correspond to more force points than quadrature points; in all cases the system is ill-conditioned (table 3) and the Matlab R linear solver returns 'NaN' (not-a-number). Conditioning is generally not a problem provided that h f < h q , or if h f = h q and the force and quadrature discretisations coincide. If h f = h q and the discretisations are non-overlappling, singular matrices can result -data not shown).
It is immediately clear from examining the table rows that for fixed force discretisation spacing h f , decreasing the quadrature discretisation spacing h q typically results in improved accuracy, notwithstanding a slight reversal in this tendency which may occur for very coarse h f = 0.58 and very fine h q < 0.02. This behaviour can be interpreted as progressively finer h q enabling more progressively more accurate quadrature, until the error is instead dominated by errors associated with force discretisation. An error estimate will follow in section 5.2.
Examining the columns of tables 2 (see also tables 6 and 7) reveals a more interesting behaviour of the algorithm. If the quadrature discretisation size h q is fixed, more accurate results are obtained with the force spacing h f taken coarser than the quadrature spacing (h f > h q ) than with the Nyström method (h f = h q ). Appendix C confirms that, for fixed h q , the choice h f = h q can be rather inaccurate, and is sensitive to the value of , whereas taking h f ≈ 2h q reliably produces results which are accurate to within a few percent, and at much lower computational cost (see appendix C). A similar result is observed for the slightly more complex problem of calculating the resistance tensor of a prolate spheroid (appendix D).
The effect of the regularisation parameter is discussed in appendix A.4. Reducing typically reduces the error for all finite tested, provided that h q < h f /2. The regularisation error is proportional to , however it may be expected that as is reduced, h q may have to be reduced proportionately in order to approximate the integral of the increasingly-peaked kernel more accurately. However, this behaviour was not observed in the test cases analysed (for which was taken as small as 10 −6 ). In applications in which evaluation of the velocity field is of interest, a balance between small regularisation error and smooth/efficient evaluation of the velocity field may be sought, motivating an intermediate choice of .
The final quantity to consider is the force discretisation length h f . This discretisation must be fine enough to resolve variations in the surface force density. The translating sphere case in fact is not a good way to assess this convergence, because the surface stress is constant [30, p. 233]! The rotating sphere does however possess a non-constant surface force density, which varies from zero at the poles to its maximum at the equator. From the results in tables 6-7 it is clear that the coarsest force discretisation h f = 0.58 produces acceptably accurate results (i.e. within about 1% error) provided that the quadrature discretisation is sufficiently fine.

Error estimate
Following these numerical experiments, we shall briefly outline an error estimate for the nearestneighbour method. There are three sources of error: (i) regularisation error associated with the use of the regularised version of the boundary integral equation with parameter -which was discussed above following equation (12), (ii) discretisation error associated with the approximation of the integral by its values on the quadrature points, which have spacing h q , (iii) discretisation error associated with the approximation of the force and metric by their values on the coarser force points, which have spacing h f .
The discretisation error associated with the approximation of the integral by its values on the quadrature points will be chiefly determined by the contribution associated with the rapid variation in the kernel. We will restrict to the case where h f . The lowest order estimate of quadrature error follows from taking the mean value inequality, i.e. |S jk ( The integrand is sharply-peaked but in a small area -to take account of this behaviour more precisely, the integral will be split into three regions based on the value of r = |x − y|, the regions (i) 0 < r < h f , (ii) h f < r < h The total discretisation error associated with quadrature can therefore be estimated as The first term may not be a sharp estimate; the results of table 8 suggest that accurate results may be obtained (perhaps for certain types of discretisation) for very small compared with h f and h q . The second term emphasises the advantage of taking h f > h q , i.e. the force points coarser than the quadrature points.
Finally, the discretisation error associated with the approximation of the force and metric by their values on the force points can be estimated by noting that the error of nearest-neighbour interpolation is again of the form M 2 |X[q]−x[N (q)]|, where M 2 is a bound on ∇ y (f (y)dS(y)) . Hence the force discretisation error is O(h f ).
In summary, our estimate of the error associated with the regularisation and nearestneighbour discretisation of the boundary integral equation The numerical results are consistent with the finding that there are independent errors due to regularisation (see appendix B) and to the force discretisation (see the rightmost column of table 8 for which the regularisation error is minimal); moreover it is advantageous to take h −1 f h q to be small, i.e. h f > h q .

A refinement heuristic
For practical purposes we can therefore recommend the heuristic in 1. Choose much smaller than the lengthscale of the problem geometry L. Regularisation error will typically be linear in , so results which are required to be highly accurate will require a proportionately small value of .
2. Generate the force discretisation -initially this discretisation would be chosen relatively coarse.
3. Generate the quadrature discretisation at least four times as fine as the force discretisation, i.e. h q is no larger than h f /4.   The heuristic in table 1 can be applied to the rotating sphere problem as follows. We choose = 0.01 as the regularisation parameter, and consider numerical errors comparable to 1% acceptable. Taking a relatively coarse force discretisation with h f = 0.5796 and a finer quadrature discretisation of h q = 0.0416 -less than h f /4 -we compute the total moment associated with the rigid body motion Ω = (1, 0, 0). We then assess convergence by halving each of h f and h q . The results are shown in table 4.

Rotational diffusion of a macromolecular structure
The technique will now be applied to a problem from bioinorganic chemistry: determining the rotational diffusion coefficient of a novel macromolecular structure. The scientific application of the calculations will be contained in a future colloborative publication. The structure can be modelled as three nanoscale rods with slightly different orientations, in close proximity, as shown in figure 2, moving together as a single rigid body. The rods are discretised by    subdividing equally in angle, and equally along the length of the rods; the angle and length spacings are chosen based on a target distance in both axial and azimuthal directions.
The grand resistance tensor [16] is defined as the 6 × 6 matrix, where R F U is the force-velocity resistance matrix, R F Ω is the force-rotation coupling, R M U is the moment-translation coupling and R M Ω is the moment-rotation resistance. This matrix relates the force F and moment M exerted by a rigid body on a viscous fluid to the body's translational velocity U and angular velocity Ω, The individual components of the 3 × 3 matrices R ·· are calculated by solving the resistance problems U = e j and Ω = e j in turn and calculating the force and moment in each case. The diffusion tensor is given by D = k B T R −1 , where k B is the Boltzmann constant and T is absolute temperature. The rotational part of the diffusion tensor is the lower right 3 × 3 block of D [31], which we denote D R , The D R block has no dependence on choice of origin [31] (unlike the other blocks of D); it has been verified numerically that moving the origin does not affect the calculation of D R . It is convenient to report the smallest eigenvalue λ 1 of D R , which corresponds to the smallest coefficient of rotational diffusion about each of the principal axes of rotation. The characteristic timescale of rotational diffusion is then given by τ 1 = 1/(6λ 1 ). The results are given in table 5(a).
Starting in the top left corner, applying our heuristic, repeating the process of dividing both h f and h q yields the values given on the main diagonal. The point to terminate the refinement process depends on the degree of accuracy required, and indeed if a relative error of less than 1% is required, the process should be continued further. However for many biophysical applications, the level of modelling error (for example, approximating the structure by three straight rigid  Table 5: Calculation of the rotational diffusion timescale τ 1 for the macromolecular model shown in figure 2. The regularisation parameter is taken as 0.01L where L is the approximate half-length of the peptide, 25Å. The absolute temperature T = 310 K and dynamic viscosity µ = 10 −3 Pa.s. Results with h f = h q (second sub-diagonal) relate to the classic Nyström discretisation, results with h f < h q relate to 'nearest-neighbour' discretisations. (a) Rotational diffusion timescale τ 1 in nanoseconds for each discretisation tested; discretisation parameters h f and h q are given inÅ. (b) Computational timings (in seconds; notebook specification given in appendix C).
rods, assuming rigidity) does not warrant extremely precise numerical calculations. Fixing h q and examining the first three columns, the nearest-neighbour method with coarser force discretisations h f ≈ 1.5h q -2.4h q out-performs the Nyström method (h f = h q ) for both accuracy and efficiency (table 5(b)).

Conclusions
We have presented a simple-to-implement modification of the standard discretisation of the method of regularized stokeslets for modelling particle dynamics at zero Reynolds number. The modification is based on the use of two discretisations, one for the unknown surface force per unit area and one for the stokeslet quadrature, combined with nearest-neighbour discretisation of the force distribution. Practically, the method can be implemented by assembling a nearest-neighbour operator matrix, which can be achieved with a few lines of Matlab R /GNU Octave code. Numerical experiments on the resistance problem of a sphere undergoing rigid body motion, and the calculation of the rotational diffusion timescale of a macromolecular structure provide evidence that the method enables more accurate results to be obtained at lower computational cost than the standard implementation, despite not being substantially more complicated to implement. Our initial error estimate provides insight into the independent effects of h f and h q on the numerical error, and the potential advantage of taking h f > h q , provided that h f is not too large. Numerical results did not however reflect the sensitivity to suggested by this estimate -further investigation of this phenomenon, and possible sharpening of the estimate, may be topics for future work.
The standard Nyström discretisation uses, in our framework, the same discretisation for the force and the quadrature. The present approach shows that this choice much less reliably produces accurate results than if the quadrature discretisation is kept the same but the force discretisation is made twice as coarse. Making the force discretisation twice as coarse means that the number of degrees of freedom is at least halved (more typically reduced by a factor of four). The matrix assembly cost is therefore reduced by a factor of at least four, and the linear solver cost is reduced by a factor of at least eight (for a direct solver). Our practical results suggest a cost reduction of over 10 times may be typical. This reduction in cost means that more complex problems can be solved with a given computational resource -a useful facet, particularly within biological and biophysical fluid dynamics. The code implementation described in this paper makes use primarily of basic linear algebra operations rather than serial for-loops, and therefore can be accelerated through built-in software and hardware parallelisation of these operations. It will be of interest to explore how the algorithm scales on multicore or GPU hardware.
The nearest-neighbour approach still has limitations, particularly if compared with boundary element methods -which may involve higher order force discretisation and adaptive quadrature, and accelerations such as the fast multipole method. However, the nearest-neighbour approach is very simple to implement, requiring only a small modification of the standard regularized stokeslet approach, and not requiring true mesh generation. It may be valuable to explore further whether adaptivity or fast multipole implementations can be introduced without excessively complicating the algorithm. Finally, we do not yet have theoretical results which definitively prove the improved efficiency and accuracy of the method. Nevertheless, for practical purposes, carrying out a sequence of discretisations with h f ≈ 4h q alongside a sequence with h f ≈ 2h q will establish convergence empirically.
The nearest-neighbour discretisation of the regularized stokeslet method is more efficient and accurate than the standard implementation, with minimal additional complexity. It may therefore enable researchers in biological and biophysical fluid dynamics to solve significantly more challenging open problems, for example involving many swimming cells, ciliated cavities, and/or suspended macromolecules. The task of explaining the properties of the method, which we are only able to explain heuristically at present, may stimulate theoretical work. Finally, the technique may also open the way for future algorithmic developments which possess the efficiency and accuracy of boundary element methods but retain its useful properties of meshlessness and simplicity. The above takes advantage of the speed of predominantly vector operations, whilst not exceeding the memory requirements of the system. The optional third argument, blockSize is a measurement in GB of the memory to be allocated to the regularized stokeslet matrix so that blockNodes=f l o o r ( b l o c k S i z e * 2ˆ27/(9 * N ) ) ; gives the number of columns (corresponding to a subset of the force points) which can be dealt with simultaneously. For example, blockSize=0.2 would be suitable for any modern hardware, and has been tested on a Raspberry Pi Model B. The matrix NClosest, which corresponds to ν[q, n] is sparse and so will not produce a memory overflow. The final line involving the Kronecker product operation is required because the nearest-neighbour operator must be copied into three blocks, acting on the f 1 , f 2 and f 3 components in turn.

A.3 Resistance problem
The 'left hand side' matrix A for the discrete resistance problem (26) can then be assembled as, A = R e g S t o k e s l e t ( x , X, ep ) * N ea re st Ne i gh bo ur Ma t ri x (X, x ) ; The regularized stokeslet matrix may be too large to fit in memory, particularly if Q is very large, as may be the case for problems possessing complex geometry. In this case, the problem may be assembled 'block-by-block' as follows, NN=N ea re s tN ei gh bo ur M at ri x (X, x , b l o c k S i z e ) ; A=zeros (3 * M, 3 * N ) ; for iMin =1: blockNodes :Q iMax=min( iMin+blockNodes −1,Q) ; iRange =[ iMin : iMax Q+iMin :Q+iMax 2 * Q+iMin : 2 * Q+iMax ] ; A=A+R e g S t o k e s l e t ( x ,X( iRange ) , ep ) * NN( iRange , : ) ; end As in the function NearestNeighbour, blocking is used to prevent overrun. In all calculations in the present report, the linear system was solved with the 'backslash' operator, i.e. f=A\b

A.4 Discretisation size calculation
The discretisation size parameters h f and h q are calculated using the following function,    Appendix B Effect of the regularization parameter for the rigid sphere test problem To assess the sensitivity of the method to regularisation parameter, we present test results with = 0.02 and = 0.005 in tables 6 and 7 respectively. When h f and h q are taken equal, the results are highly sensitive to the value of , however provided h q is taken no larger than 0.25h f , the error is relatively insensitive. As is reduced, the finite regularisation error (evident in the rightmost entries in the tables) is reduced to below 1%, however convergence to the smaller error with h q is slower. Perhaps surprisingly, it does not appear necessary to choose h q dependent on , at least within the range of values explored. There also does not appear to be any clear advantage to taking = 0.02 as opposed to = 0.005, begging the question of how small can be taken. While = 0 is equivalent to non-regularized stokeslets (and hence singular matrix entries whenever the collocation and quadrature points coincide), taking a very small but finite value of = 10 −6 yields the results of table 8, which are typically at least as accurate as the results with larger value of , provided that h f 2h q .   Table 9: Timing results for the calculation of the translation and rotation resistance problems for the unit sphere with = 0.01.  Appendix D Testing the method on a prolate spheroid To explore further whether the efficiency of the choice h f ≈ 2h q is problem-dependent, we may assess the performance of the nearest-neighbour method in calculating the grand resistance tensor R (defined in equation (29)) of a rigid prolate spheroid, which has a well-known analytical solution [32, p. 64]. Taking a prolate spheroid with long semi-axis a = 5 and short semi-axis c = 1, the relative error in R in the · 2 norm is given in table 10. The results are not likely to be optimal as the discretisation has been created by simply deforming the sphere discretisation depicted in figure 1 without any attempt to space the points uniformly in the directions of the long and short semi-axes.