Multidimensional method-of-lines transport for atmospheric flows over steep terrain using arbitrary meshes

Including terrain in atmospheric models gives rise to mesh distortions near the lower boundary that can degrade accuracy and challenge the stability of transport schemes. Multidimensional transport schemes avoid splitting errors on distorted, arbitrary meshes, and method-of-lines schemes have low computational cost because they perform reconstructions at fixed points. This paper presents a multidimensional method-of-lines finite volume transport scheme,"cubicFit", which is designed to be numerically stable on arbitrary meshes. Stability conditions derived from a von Neumann stability analysis are imposed during model initialisation to obtain stability and improve accuracy in distorted regions of the mesh, and near steeply-sloping lower boundaries. Reconstruction calculations depend upon the mesh only, needing just one vector multiply per face per time-stage irrespective of the velocity field. The cubicFit scheme is evaluated using three, idealised numerical tests. The first is a variant of a standard horizontal transport test on severely distorted terrain-following meshes. The second is a new test case that assesses accuracy near the ground by transporting a tracer at the lower boundary over steep terrain on terrain-following meshes, cut-cell meshes, and new, slanted-cell meshes that do not suffer from severe time-step constraints associated with cut cells. The third, standard test deforms a tracer in a vortical flow on hexagonal-icosahedral meshes and cubed-sphere meshes. In all tests, cubicFit is stable and largely insensitive to mesh distortions, and cubicFit results are more accurate than those obtained using a multidimensional linear upwind transport scheme. The cubicFit scheme is second-order convergent regardless of mesh distortions.


Introduction
Numerical simulations of atmospheric flows solve equations of motion that result in the transport of momentum, temperature, water species and trace gases. The numerical representation of Earth's terrain complicates the transport problem because the mesh is necessarily distorted next to the lower boundary. As new atmospheric models use increasingly fine mesh spacing, meshes are able to resolve steep, small-scale slopes. Numerical schemes in operational weather forecast models can perform poorly over large mountain ranges, exhibiting small-scale numerical noise in momentum [1], temperature, humidity [2] and potential vorticity fields [3], or even violating the Courant-Friedrich-Lewy stability constraint resulting in so-called 'grid-point storms' [4]. A transport scheme is desired that yields stable and accurate solutions, particularly near the surface where the weather affects us directly. We present a new transport scheme which is numerically stable on arbitrary meshes and which is generally insensitive to mesh distortions created by steep slopes. It has a low computational cost since most calculations are not repeated every time-step because they depend upon the mesh geometry only.
There are two main methods for representing terrain in atmospheric models: terrain-following layers and cut cells. Both methods modify regular meshes to produce distorted meshes with cells that are aligned in columns. Most operational models use terrain-following layers in which horizontal mesh surfaces are moved upwards to accommodate the terrain. A vertical decay function is chosen so that mesh surfaces slope less steeply with increasing height. The most straightforward is the linear decay function used by the basic terrain-following transform [5] (also called the sigma coordinate), but many atmospheric models suffer from large numerical errors on such meshes [2,6,7]. To reduce such errors, more complex decay functions have been developed so that mesh surfaces are smoother [8,2,9,6].
An alternative to terrain-following layers is the cut cell method. Cut cell meshes are constructed by 'cutting' a regular mesh with a piecewise-linear representation of the terrain. New vertices are created where the terrain intersects mesh edges, and cell volumes that lie beneath the ground are removed. Cut cell meshes can have arbitrarily small cells that impose severe time-step constraints on explicit transport schemes. Several techniques have been developed to alleviate this problem, known as the 'small-cell problem': small cells can be merged with adjacent cells [10], cell volumes can be artificially increased [11], or an implicit scheme can be used near the ground with an explicit scheme used aloft [12].
Another method for avoiding the small-cell problem was proposed by Shaw and Weller [13] in which cell vertices are moved vertically so that they are positioned at the terrain surface. We refer to this alternative method as the slanted cell method in order to distinguish it from the traditional cut cell method. Slanted cell meshes do not suffer from arbitrarily small cells because the horizontal cell dimensions are not modified by the presence of terrain.
Smoothed terrain-following layers, cut cells and slanted cell methods all reduce the amount of mesh distortion but any mesh that represents sloping terrain must necessarily be distorted, at least near the ground. Even when distortions are minimal, transport across mesh surfaces tends to be more common near steep slopes, and this misalignment between the flow and mesh surfaces increases numerical errors [14,2,13]. A huge variety of transport schemes have been developed for atmospheric models, but few are able to account for distortions associated with steep terrain because they treat horizontal and vertical transport separately [15], resulting in numerical errors called 'splitting errors'. Such errors can be reduced by explicitly accounting for transverse fluxes when combining fluxes [16], but splitting errors are still apparent in flows over steep terrain where meshes are highly distorted and metric terms in a terrain-following coordinate transform are large [17].
Transport schemes are often classified as dimensionally-split or multidimensional. Dimensionally-split schemes such as [18? ] calculate transport in each dimension separately before the flux contributions are combined. Such schemes are computationally efficient and allow existing one-dimensional high-order methods to be used. Dimensionally-split schemes have only been used with quadrilateral meshes where dimensions are inherently separable. Special treatment is required at the corners of cubed-sphere panels where local coordinates differ [20? ]. For similar reasons, dimensionallysplit schemes have only been used with terrain-following coordinate transforms and not cut cells. Perhaps confusingly, dimensionally-split schemes are sometimes called multidimensional, too, because they use one-dimensional techniques for multidimensional transport.
Unlike dimensionally-split schemes, multidimensional schemes consider transport in two or three dimensions together. There are several subclasses of multidimensional schemes that include semi-Lagrangian finite volume schemes (also called conservative mesh remapping), swept-area schemes (also called flux-form semi-Lagrangian, incremental remapping, or forward-in-time), and method-of-lines schemes (also called Eulerian schemes). Two-dimensional semi-Lagrangian finite volume schemes such as [21,22] integrate over departure cells that are found by tracing backward the trajectories of cell vertices. These schemes are conservative because departure cells are constructed so that there are no overlaps or gaps, which requires that cell areas are simply-connected domains [23]. SLICE-3D is a three-dimensional semi-Lagrangian finite volume scheme for latitude-longitude meshes that applies separate conservative remappings in each dimension [24]. Swept area schemes such as [25,26,27,28] calculate the flux through a cell face by integrating over the upstream area that is swept out over one time-step. Such schemes differ in their choice of area approximation, sub-grid reconstruction, and spatial integration method. Because swept area schemes integrate over the reconstructed field, they typically require a matrix-vector multiply per face per time-stage [28,26]. Method-of-lines schemes such as [29,30] use a spatial discretisation to reduce the transport PDE to an ODE that is typically solved using a multi-stage time-stepping method. A method-of-lines scheme using a spectral element reconstruction was recently developed to achieve accurate solutions near the surface of cut cell meshes [31]. Unlike semi-Lagrangian finite volume schemes, swept-area and method-of-lines schemes achieve conservation for small-scale rotational flows. Such flows can twist the departure domain to such an extent that the domain intersects itself [27]. In two dimensions, a self-intersecting departure domain has a bowtie or hourglass shape. There are many more types of atmospheric transport schemes, but all can be classified according to their treatment of the three spatial dimensions. A more comprehensive overview is presented by Lauritzen et al. [32].
For transport schemes that are ordinarily classified as 'multidimensional', a further distinction ought to made between horizontally-multidimensional and three-dimensional schemes. Most multidimensional schemes are only horizontally-multidimensional because, while the two horizontal dimensions are considered together, horizontal and vertical transport are still treated separately. This separate treatment becomes less justifiable as atmospheric models are using increasingly fine horizontal mesh spacings that resolve small-scale steep slopes, resulting in greater mesh distortion and possible splitting errors [15]. Three-dimensional schemes avoid any splitting errors over steep slopes, but only a few conservative three-dimensional schemes have been used in atmospheric models. The multi-moment constrained finite volume scheme [33] is a three-dimensional scheme that has been used to simulate nonhydrostatic flows over orography with terrain-following coordinates on a x-z plane [34]. Simulations of subcritical flow around a cylinder have also been performed on a three-dimensional hexahedral-prismatic hybrid mesh [35]. The Multidimensional Positive Definite Advection Transport Algorithm (MPDATA) is another three-dimensional scheme that is suitable for arbitrary meshes. It has been used on triangular unstructured meshes to simulate two-dimensional nonhydrostatic flows over orography [36], and in three-dimensional transport tests [37]. Most recently, MPDATA has been extended to enable semi-implicit integrations of the compressible Euler equations on arbitrary meshes [38]. The three-dimensional method-of-lines scheme developed by Weller and Shahrokhi [39] has been used in two-dimensional flows over orography on Cartesian x-z planes with distorted meshes [13,17]. This finite volume scheme uses a moving least-squares reconstruction that makes it suitable for arbitrary meshes. This least-squares approach has been applied previously to shallow water flows [40], aeronautic [41] and porous media [42] simulations.
In this paper, we present a new multidimensional method-of-lines scheme, 'cubicFit', that improves the stability of the Weller and Shahrokhi scheme [39] and avoids all splitting errors. To reconstruct values at cell faces, the scheme fits a multidimensional cubic polynomial over an upwind-biased stencil using a least-squares approach. The implementation uses constraints derived from a von Neumann stability analysis to select appropriate polynomial fits for stencils in highly-distorted mesh regions. Almost all of the least-squares procedure depends upon the mesh geometry only and reconstruction weights can be precomputed without knowledge of the velocity field or tracer field. Hence, the computational cost of the cubicFit scheme is lower than most swept-area schemes, being more comparable to dimensionally-split schemes, requiring only n multiplies per cell face per time-stage where n is the size of the stencil. Based on numerical experiments, the scheme is found to be conditionally stable up to maximum Courant numbers of about 1.3 to 3.3.
The remainder of this paper is organised as follows. Section 2 starts by discretising the transport equation using a method-of-lines approach before describing the cubicFit transport scheme and a multidimensional linear upwind transport scheme. Section 3 evaluates the cubicFit scheme using three idealised numerical tests. The first test follows Schär et al. [2], transporting a tracer horizontally above steep mountains on two-dimensional, highly-distorted terrainfollowing meshes. The second is a new test case designed to assess numerical accuracy next to a mountainous lower boundary. In this test, a tracer placed at the ground is transported over steep slopes by a terrain-following velocity field on terrain-following, cut cell and slanted cell meshes. The third is a standard test of deformational flow on a single-layer spherical Earth, specified by Lauritzen et al. [43], which we use to assess the cubicFit transport scheme on hexagonal-icosahedral meshes and cubed-sphere meshes. Concluding remarks are made in section 4.

Transport schemes for arbitrary meshes
The transport of a dependent variable φ in a prescribed, non-divergent velocity field u is given by the equation The time derivative is discretised using a two-stage, second-order Heun method, where g(φ (n) ) = −∇ · (uφ (n) ) at time level n. The same time-stepping method is used for both the cubicFit scheme and the multidimensional linear upwind scheme. Although the Heun method is unstable for a linear oscillator [44] and for solving the transport equation using centred, linear differencing, it is stable when it is used for transport schemes with sufficient upwinding. Using the finite volume method, the velocity field is prescribed at face centroids and the dependent variable is stored at cell centroids. The divergence term in equation (1) is discretised using Gauss's theorem: where subscript f denotes a value stored at a face and subscript F denotes a value approximated at a face from surrounding values. V c is the cell volume, u f is a velocity vector prescribed at a face, S f is the surface area vector with a direction outward normal to the face and a magnitude equal to the face area, φ F is an approximation of the dependent variable at the face, and f ∈ c denotes a summation over all faces f bordering cell c. Note that equation (3) is a second-order approximation of the divergence term which limits the cubicFit transport scheme to second-order numerical convergence. This discretisation is applicable to arbitrary meshes. A necessary condition for stability is given by the multidimensional Courant number, such that Co c ≤ 1 for all cells c in the domain. Hence, stability is constrained by the maximum Courant number of any cell in the domain. The accurate approximation of the dependent variable at the face, φ F , is key to the overall accuracy of the transport scheme. The cubicFit scheme and multidimensional linear upwind scheme differ in their approximations, and these approximation methods are described next.

Cubic fit transport scheme
The cubicFit scheme approximates the value of the dependent variable at the face, φ F , using a least-squares fit over a stencil of surrounding known values. To introduce the approximation method, we will consider how an approximate value is calculated for a face that is far away from the boundaries of a two-dimensional uniform rectangular mesh. For any mesh, every interior face connects two adjacent cells. The velocity direction at the face determines which of the two adjacent cells is the upwind cell. Since the stencil is upwind-biased and asymmetric, two stencils must be constructed for every interior face, and the appropriate stencil is chosen depending on the velocity direction at each face for every time-step.
The upwind-biased stencil for a face f is shown in figure 1a. The wind at the face, u f , is blowing from the upwind cell c u to the downwind cell c d . To obtain an approximate value at f , a polynomial least-squares fit is calculated using the stencil values. The stencil has 4 points in x and 3 points in y, leading to a natural choice of polynomial that is cubic in x and quadratic in y, φ = a 1 + a 2 x + a 3 y + a 4 x 2 + a 5 xy + a 6 y 2 + a 7 x 3 + a 8 x 2 y + a 9 xy 2 .
A least-squares approach is needed because the system of equations is overconstrained, with 12 stencil values but only 9 polynomial terms. The stencil geometry is expressed in a local coordinate system with the face centroid as the origin so that the approximated value φ F is equal to the constant coefficient a 1 . The stencil is upwind-biased to improve numerical stability, and the multidimensional cubic polynomial is chosen to improve accuracy in the direction of flow [14]. The remainder of this subsection generalises the approximation technique for arbitrary meshes and describes the methods for constructing stencils, performing a least-squares fit with a suitable polynomial, and ensuring numerical stability of the transport scheme.

Stencil construction
For every interior face, two stencils are constructed, one for each of the possible upwind cells. Stencils are not constructed for boundary faces because values of φ at boundaries are calculated from prescribed boundary conditions. For a given interior face f and upwind cell c u , we find those faces that are connected to c u and 'oppose' face f . These are called the opposing faces. The opposing faces for face f and upwind cell c u are determined as follows. Defining G to be the set of faces other than f that border cell c u , we calculate the 'opposedness', Opp, between faces f and g ∈ G, defined as where S f and S g are the surface area vectors pointing outward from cell c u for faces f and g respectively. Using the fact that a · b = |a| |b| cos(θ) we can rewrite equation (6) as where θ is the angle between faces f and g. In this form, it can be seen that Opp is a measure of the relative area of g and how closely it parallels face f . The set of opposing faces, OF, is a subset of G, comprising those faces with Opp ≥ 0.5, and the face with the maximum opposedness. Expressed in set notation, this is On a rectangular mesh, there is always one opposing face g, and it is exactly parallel to the face f such that Opp( f, g) = 1.
Once the opposing faces have been determined, the set of internal and external cells must be found. The internal cells are those cells that are connected to the opposing faces. Note that c u is always an internal cell. The external cells are those cells that share vertices with the internal cells. Note that c d is always an external cell. Finally, the stencil boundary faces are boundary faces having Dirichlet boundary conditions 1 that share a vertex with the internal cells.
Having found these three sets, the stencil is constructed to comprise all internal cells, external cells and stencil boundary faces. Figure 2 illustrates a stencil construction for face f connecting upwind cell c u and downwind cell c d . The two opposing faces are denoted by thick dashed lines and the centres of the three adjoining internal cells are marked by black circles. The stencil is extended outwards by including the external cells that share vertices with the internal cells, where the vertices are marked by black squares. A boundary at the far left has Dirichlet boundary conditions, and so the four stencil boundary faces are also included in the stencil, where the boundary face centres are marked by black triangles. The resultant stencil contains fourteen points.

Least-squares fit
To approximate the value of φ at a face f , a least-squares fit is calculated from a stencil of surrounding known values. First, we will show how a polynomial least-squares fit is calculated for a face on a rectangular mesh. Second, we will make modifications to the least-squares fit that are necessary for numerical stability.
For faces that are far away from the boundaries of a rectangular mesh, we fit the multidimensional polynomial given by equation (5) that has nine unknown coefficients, a = a 1 . . . a 9 , using the twelve cell centre values from the upwind-biased stencil, φ = φ 1 . . . φ 12 . This yields a matrix equation which can be written as The rectangular matrix B has one row for each cell in the stencil and one column for each term in the polynomial. B is called the stencil matrix, and it is constructed using only the mesh geometry. A local coordinate system is established in which x is normal to the face f and y is perpendicular to x. The coordinates (x i , y i ) give the position of the centroid of the ith cell in the stencil. A two-dimensional stencil is also used for the tests on spherical meshes in section 3.3. In these tests, cell centres are projected perpendicular to a tangent plane at the face centre. Previous studies found that results were largely insensitive to the projection method [30,25]. The unknown coefficients a are calculated using the pseudo-inverse, B + , found by singular value decomposition, Recall that the approximate value φ F is equal to the constant coefficient a 1 , which is a weighted mean of φ, where the weights b + 1,1 . . . b + 1,12 are the elements of the first row of B + . Note that the majority of the least-squares fit procedure depends on the mesh geometry only. An implementation may precompute the pseudo-inverse for each stencil during model initialisation, and only the first row needs to be stored. Since each face has two possible stencils depending on the orientation of the velocity relative to the face, the implementation stores two sets of weights for each face. Knowledge of the values of φ is only required to calculate the weighted mean given by equation (12), which is evaluated once per face per time-stage.
In the least-squares fit presented above, all stencil values contributed equally to the polynomial fit. It is necessary for numerical stability that the polynomial fits the cells connected to face f more closely than other cells in the stencil, as shown by [25,26]. To achieve this, we allow each cell to make an unequal contribution to the least-squares fit. We assign an integer multiplier to each cell in the stencil, m = m 1 . . . m 12 , and multiply equation (10) to obtaiñ whereB = MB and M = diag(m). The constant coefficient a 1 is calculated from the pseudo-inverse,B + , whereb + 1 =b + 1,1 . . .b + 1,12 are the elements of the first row ofB + . Again, a 1 is a weighted mean of φ, where the weights are nowb + 1 · m. Values for m are chosen so that the cells connected to face f make a greater contribution to the least-squares fit, as discussed later in section 2.1.4.
For faces of a non-rectangular mesh, or faces that are near a boundary, the number of stencil points and number of polynomial terms may differ: a stencil will have one or more cells and, for two-dimensional meshes, its polynomial will have between one and nine terms. Additionally, the polynomial cannot have more terms than its stencil has cells because this would lead to an underconstrained system of equations. The procedure for choosing suitable polynomials is discussed next.

Polynomial generation
The majority of faces on a uniform two-dimensional mesh have stencils with more than nine cells. For example, a rectangular mesh has 12 points (figure 1a), and a hexagonal mesh has 10 points (figure 1b). In both cases, constructing a system of equations using the nine-term polynomial in equation (5) leads to an overconstrained problem that can be solved using least-squares. However, this is not true for faces near boundaries: stencils that have fewer than nine cells (figure 3a) would result in an underconstrained problem, and stencils that have exactly nine cells may lack sufficient information to constrain high-order terms. For example, the stencil in figure 3b lacks sufficient information to fit the x 3 term. In such cases, it becomes necessary to perform a least-squares fit using a polynomial with fewer terms. Figure 3: Upwind-biased stencils for faces near the lower boundary of a rectangular x-z mesh, with (a) a 3 × 2 stencil for the face immediately adjacent to the lower boundary, and (b) a 3 × 3 stencil for the face immediately adjacent to the face in (a). Each stencil belongs to the face marked by a thick line. The local coordinate system is shown, having an x direction normal to the face a y direction tangent to the face. For both stencils, attempting a least-squares fit using the nine-term polynomial in equation (5) would result in an underconstrained problem. There is no normal flow at the lower boundary.
For every stencil, we find a set of candidate polynomials that do not result in an underconstrained problem. In two dimensions, a candidate polynomial has some combination of between one and nine terms from equation (5). There are two additional constraints that a candidate polynomial must satisfy.
First, high-order terms may be included in a candidate polynomial only if the lower-order terms are also included. More precisely, let M(x, y) = x i y j : i, j ≥ 0 and i ≤ 3 and j ≤ 2 and i + j ≤ 3 (15) be the set of all monomials of degree at most 3 in For example, the polynomial φ = a 1 + a 2 x + a 3 y + a 4 xy + a 5 x 2 + a 6 x 2 y is a dense subset of M(x, y), but φ = a 1 + a 2 x + a 3 y + a 4 x 2 y is not because x 2 y can be included only if xy and x 2 are also included. In total there are 26 dense subsets of the two-dimensional polynomial in equation (5). Second, a candidate polynomial must have a stencil matrix B that is full rank. The matrix is considered full rank if its smallest singular value is greater than 1 × 10 −9 . Using a polynomial with all nine terms and the stencil in figure 3b results in a rank-deficient matrix and so the nine-term polynomial is not a candidate polynomial.
The candidate polynomials are all the dense subsets of M(x, y) that have a cardinality greater than one with a stencil matrix that is full rank. The final stage of the cubicFit transport scheme selects a candidate polynomial and ensures that the least-squares fit is numerically stable.

Stabilisation procedure
So far, we have constructed a stencil and found a set of candidate polynomials. Applying a least-squares fit to any of these candidate polynomials avoids creating an underconstrained problem. The final stage of the transport scheme chooses a suitable candidate polynomial and appropriate multipliers m so that the fit is numerically stable.
The approximated value φ F is equal to a 1 which is calculated from equation (14). The value of a 1 is a weighted mean of φ where w =b + 1 · m are the weights. If the cell centre values φ are assumed to approximate a smooth field then we expect φ F to be close to the values of φ u and φ d , and expect φ F to be insensitive to small changes in φ. When the weights w have large magnitude then this is no longer true: φ F becomes sensitive to small changes in φ which can result in large, numerically unstable departures from the smooth field φ.
To avoid numerical instabilities, a simplified, one-dimensional von Neumann analysis was performed, presented in appendix A. The analysis is used to impose three stability conditions on the weights w, where w u and w d are the weights for the upwind and downwind cells respectively. The peripheral points P are the cells in the stencil that are not the upwind or downwind cells, and w p is the weight for a given peripheral point p. The upwind, downwind and peripheral weights sum to one such that w u + w d + p∈P w p = 1.
The stabilisation procedure comprises three steps. In the first step, the set of candidate polynomials is sorted in preference order so that candidates with more terms are preferred over those with fewer terms. If there are multiple candidates with the same number of terms, the minimum singular value of B is calculated for each candidate, and an ordering is imposed such that the candidate with the larger minimum singular value is preferred. This ordering ensures that the preferred candidate is the highest-order polynomial with the most information content. 2 In the second step, the most-preferred polynomial is taken from the list of candidates and the multipliers are assigned so that the upwind cell and downwind cell have multipliers m u = 2 10 and m d = 2 10 respectively, and all peripheral points have multipliers m p = 1. These multipliers are very similar to those used by [25], leading to a well-conditioned matrixB and a least-squares fit in which the polynomial passes almost exactly through the upwind and downwind cell centre values.
In the third step, we calculate the weights w and evaluate them against the stability conditions given in equation (16). If any condition is violated, the value of m d is halved and the conditions are evaluated with the new weights. This step is repeated until the weights satisfy the stability conditions, or m d becomes smaller than one. In practice, the conditions are satisified when m d is either small (between 1 and 4) or equal to 2 10 . The upwind multiplier m u is fixed at 2 10 and the peripheral multipliers m p are fixed at 1. If the conditions are still not satisfied, then we start again from the second step with the next polynomial in the candidate list.
Finally, if no stable weights are found for any candidate polynomial, we revert to an upwind scheme such that w u = 1 and all other weights are zero. In our experiments we have not encountered any stencil for which this last resort is required. Furthermore, our experiments show that the stabilisation procedure only modifies the least squares fit for stencils near boundaries and for stencils in distorted mesh regions. For stencils in the interior of a uniform rectangular mesh, the least squares fit includes all terms in equation (5) with m u = m d = 2 10 .
To illustrate the stabilisation procedure, figure 4a presents a one-dimensional example of a cubic polynomial fitted through five points, with the weight at each point printed beside it. The stabilisation procedure only uses the x positions of these points and does not use the values of φ themselves. The φ values are included here for illustration only. Hence, for a given set of x positions, the same set of weights are chosen irrespective of the φ values.
For a one-dimensional cubic polynomial fit, the list of candidate polynomials in preference order is We begin with the cubic equation (17). The multipliers are chosen so that the polynomial passes almost exactly through the upwind and downwind points that are immediately to the left and right of the y-axis respectively. The stability condition on the upwind point is violated because w u = 1.822 > 1 (equation 16a). Reducing the downwind multiplier does not help to satisfy the stability condition, so we start again with the quadratic equation (18), and the new fit is presented in figure 4b. Again, the multipliers are chosen to force the polynomial through the upwind and downwind points, but this violates the stability condition on the downwind point because w d = 0.502 > 0.5 (equation 16b). This time, however, stable weights are found by reducing m d to one (figure 4c) and these are the weights that will be used to approximate φ F , where the polynomial intercepts the y-axis.

Future extension to three dimensions
All the procedures used in the cubicFit scheme generalise to three dimensions. The stencil construction procedure described in section 2.1.1 creates a stencil with 12 cells for a face in the interior of a two-dimensional rectangular mesh. In three dimensions, the same procedure creates a stencil with 3 × 12 = 36 cells. A three-dimensional stencil has three times as many cells as its two-dimensional counterpart if the mesh has prismatic cells arranged in columns. Hence, the computational cost during integration increases three-fold when moving from two dimensions to three dimensions.
To extend the least squares fit to three dimensions, the two-dimensional polynomial in equation (5) is replaced with its three-dimensional counterpart, φ = a 1 + a 2 x + a 3 y + a 4 z + a 5 x 2 + a 6 xy + a 7 y 2 + a 8 xz + a 9 yz + a 10 z 2 + a 11 x 3 + a 12 x 2 y + a 13 xy 2 + a 14 x 2 z + a 15 xz 2 + a 16 yz 2 + a 17 y 2 z + a 18 xyz . (21) The procedure for generating candidate polynomials described in section 2.1.3 results in 26 dense subsets in two dimensions and 842 dense subsets in three dimensions. Note that the combinatorial explosion of dense subsets in three dimensions does not increase the computational cost during integration. The stabilisation procedure described in section 2.1.4 requires further numerical experiments to verify that it is sufficient for three-dimensional flows and arbitrary polyhedral meshes. An initial three-dimensional test with uniform flow and a uniform Cartesian mesh obtained a numerically stable result. For stencils in the interior of the domain, the least squares fit includes all polynomial terms in equation (21) with m u = m d = 2 10 . The stabilisation procedure does not modify the least squares fit for these stencils, but we have not explored the three-dimensional extension of cubicFit any further.

Multidimensional linear upwind transport scheme
The multidimensional linear upwind scheme, called "linearUpwind" hereafter, is documented here since it provides a baseline accuracy for the experiments in section 3. The approximation of φ F is calculated using a gradient reconstruction, where φ u is the upwind value of φ, and x f and x c are the position vectors of the face centroid and cell centroid respectively. The gradient ∇ c φ is calculated using Gauss' theorem: where φ F is linearly interpolated from the two neighbouring cells of face f . The resulting stencil comprises all cells sharing a face with the upwind cell, including the upwind cell itself. For a face in the interior of a two-dimensional rectangular mesh, the stencil for the linearUpwind scheme is a '+' shape with 5 cells. On the same mesh, the stencil for the cubicFit scheme is more than twice the size with 12 cells. For cells adjacent to boundaries having zero gradient boundary conditions, the boundary value is set to be equal to the cell centre value before equation (23) is evaluated. This implementation of the multidimensional linear upwind scheme is included in the OpenFOAM software distribution [45].

Results
Three idealised numerical tests are performed to compare the accuracy of the cubicFit transport scheme with the multidimensional linear upwind scheme and with other transport schemes in the literature. The first test transports a tracer horizontally on two-dimensional, highly-distorted terrain-following meshes, following Schär et al. [2]. The second is a new test case that modifies the velocity field and tracer position from the first test in order to challenge the stability and accuracy of the transport schemes near mountainous lower boundaries. The third test evaluates the cubicFit scheme on hexagonal-icosahedral meshes and cubed-sphere meshes with deformational flow on a spherical Earth, as specified by Lauritzen et al. [43].
We have implemented the cubicFit transport scheme and the numerical test cases using the OpenFOAM CFD library because it enables a like-for-like comparison between mesh types and transport schemes. We provide source code archives for the OpenFOAM implementation of the cubicFit scheme [46], the ASAM cut cell mesh generator [47] and associated OpenFOAM converter [48], and the hexagonal-icosahedral mesh generator [49]. For the numerical test cases presented here we also supply the source code [50] and result data [51].

Horizontal transport over mountains
A two-dimensional transport test was developed by Schär et al. [2] to study the effect of terrain-following coordinate transformations on numerical accuracy. In this standard test, a tracer is positioned aloft and transported horizontally over wave-shaped mountains. The test challenges transport schemes because the tracer must cross mesh layers, which acts to reduce numerical accuracy [2,13]. Here we use a more challenging variant of this test that has steeper mountains and highly-distorted terrain-following meshes. Convergence results are compared using the linearUpwind and cubicFit transport schemes.
The domain is defined on a rectangular x-z plane that is 301 km wide and 25 km high as measured between parallel boundary edges. Boundary conditions are imposed on the tracer density φ such that φ = 0 kg m −3 at the inlet boundary, and a zero normal gradient ∂φ/∂n = 0 kg m −4 is imposed at the outlet boundary. There is no normal flow at the lower and upper boundaries. A range of mesh spacings are chosen so that ∆x : ∆z = 2 : 1 to match the original test specification from Schär et al. [2].
The terrain is wave-shaped, specified by the surface elevation h such that where a = 25 km is the mountain envelope half-width, h 0 = 6 km is the maximum mountain height, λ = 8 km is the wavelength, α = π/λ and β = π/(2a). Note that, in order to make this test more challenging, the mountain height h 0 is double the mountain height used by [2]. A basic terrain-following (BTF) mesh is constructed by using the terrain profile to modify the uniform mesh. The BTF method uses a linear decay function so that mesh surfaces become horizontal at the top of the model domain [5], where z is the geometric height, H is the height of the domain, h(x) is the surface elevation and z is the computational height of a mesh surface. If there were no terrain then h = 0 and z = z . A velocity field is prescribed with uniform horizontal flow aloft and zero flow near the ground, where u 0 = 10 m s −1 , z 1 = 7 km and z 2 = 8 km.
A tracer with density φ has the shape with radius r given by where A x = 25 km, A z = 3 km are the horizontal and vertical half-widths respectively, and φ 0 = 1 kg m −3 is the maximum density of the tracer. At t = 0 s, the tracer is centred at (x 0 , z 0 ) = (−50 km, 12 km) so that the tracer is upwind of the mountain, in the region of uniform flow above z 2 . Tests are integrated for 10 000 s using time-steps chosen for each mesh so that the maximum Courant number is about 0.4. This choice yields a time-step that is well below any stability limit, as recommended by Lauritzen et al. [43]. By the end of integration the tracer is positioned downwind of the mountain. The analytic solution at t = 10 000 s is centred at (x 0 , z 0 ) = (50 km, 12 km). Error norms are calculated by subtracting the analytic solution from the numerical solution, where φ is the numerical value, φ T is the analytic value, c denotes a summation over all cells c in the domain, and max c denotes a maximum value of any cell. Tests were performed using the linearUpwind and cubicFit schemes at mesh spacings between ∆x = 250 m and ∆x = 5000 m. Numerical convergence in the 2 and ∞ norms is plotted in figure 5. The linearUpwind and cubicFit schemes are second-order convergent at all but the coarsest mesh spacings where errors are saturated for both schemes.
The cubicFit scheme achieves a given 2 error using a mesh spacing that is almost twice as coarse as that needed by the linearUpwind scheme. Doubling the mesh spacing results in a coarser mesh with four times fewer cells because the ∆x : ∆z aspect ratio is fixed. Recall that the stencil for the cubicFit scheme has about twice as many cells as the stencil for the linearUpwind scheme. Hence, for a given 2 error, the computational cost during integration of the cubicFit scheme is about half the computational cost of the linearUpwind scheme. This test demonstrates that cubicFit is second-order convergent in the domain interior irrespective of mesh distortions. The cubicFit scheme achieves In the next test, we assess the numerical accuracy of the transport schemes near a distorted, mountainous lower boundary.

Transport over a mountainous lower boundary
The horizontal transport test in the previous section is useful for assessing numerical accuracy on terrain-following meshes, but it presents no particular challenge on cut cell meshes because there is no flow through the distorted cut cells near the ground [52]. Here we present another variant of the standard horizontal transport test that challenges transport schemes on all mesh types. By positioning the tracer next to the ground and modifying the velocity field, we can assess the accuracy of the cubicFit scheme near the lower boundary. Results using the cubicFit scheme are compared with the linearUpwind scheme on basic terrain-following, cut cell and slanted cell meshes.
Cut cell meshes are constructed using the ASAM grid generator [53,54]. Slanted cell meshes are constructed following the approach by Shaw and Weller [13]: vertices that are underground are moved up to the surface and zero-area faces and zero-volume cells are removed. Unlike [13], vertices are never moved downwards.
Following Schär et al. [2], the domain is 301 km wide and 25 km high as measured between parallel boundary edges, with a mesh spacing of ∆x = 1000 m and ∆z = 500 m. The boundary conditions are the same as those used in section 3.1. Cell edges in the central region of the domain are shown in figure 6 for each of the three mesh types. Cells in the BTF mesh are highly distorted over steep slopes (figure 6a) while the cut cell mesh (figure 6b) and slanted cell mesh (figure 6c) are orthogonal everywhere except for cells nearest the ground.
Similar to the approach by [13], a velocity field is chosen that follows the terrain at the surface and becomes entirely horizontal aloft. A streamfunction Ψ is used so that the discrete velocity field is non-divergent, such that where u 0 = 10 m s −1 , which is the horizontal velocity where h(x) = 0. There is no normal flow at the lower and upper boundaries. The velocity field becomes purely horizontal above H 1 = 10 km and, elsewhere, the flow is predominantely horizontal, with non-zero vertical velocities only above sloping terrain. The horizontal and vertical components of  velocity, u and w, are given by Unlike the horizontal transport test in [2], the velocity field presented here extends from the top of the domain all the way to the ground. The flow is deliberately misaligned with the BTF, cut cell and slanted cell meshes away from the ground (figure 6) to ensure that flow always crosses mesh surfaces in order to challenge the transport scheme. The value of H 1 is chosen to be much smaller than the domain height H in equation (25) so that flow crosses the surfaces of the BTF mesh. This is evident in figure 6a where the the velocity streamlines are tangential to the mesh only at the ground.
The tracer is again defined by equation (27)  from the left side of the mountain to the right [13]: By solving this equation we find that x(t = 10 000 s) = 54 342.8 m when h 0 = 5 km. The tracer density boundary conditions are the same as those in the previous test. Since the cubicFit transport scheme uses values at boundaries with Dirichlet boundary conditions, the cubicFit scheme uses only inlet boundary values in this test case.
Three series of tests were performed using similar configurations. The first series uses a peak mountain height of h 0 = 5 km to examine errors on different mesh types using the two transport schemes. The second series varies the peak mountain height to examine the sensitivity of the transport schemes to mesh distortions. The third series verifies accuracy at Courant numbers close to the limit of stability, and examines the longest stable time-step for different mesh types.
For the first series of tests with h 0 = 5 km, tracer contours at the initial time t = 0 s, half-way time t = 5000 s, and end time t = 10 000 s are shown in figure 7 using the cubicFit scheme on the BTF mesh. As apparent at t = 5000 s, the tracer is distorted by the terrain-following velocity field as it passes over the mountain, but its original shape is restored once it has cleared the mountain by t = 10 000 s. A small phase lag is apparent when the numerical solution marked with solid contour lines is compared with the analytic solution marked with dotted contour lines.
Numerical errors are more clearly revealed by subtracting the analytic solution from the numerical solution. Error fields are compared between BTF, cut cell and slanted cell meshes using the linearUpwind scheme (figures 8a, 8b and 8c respectively) and the cubicFit scheme (figures 8d, 8e and 8f respectively). Results are least accurate using the linearUpwind scheme on the slanted cell mesh (figure 8c). The final tracer is slightly distorted and does not extend far enough towards the ground. The ∞ error magnitude is reduced by using the linearUpwind scheme on the cut cell mesh (figure 8b), but the shape of the error remains the same. The cubicFit scheme is less sensitive to the choice of mesh with similar error magnitudes on the BTF mesh (figure 8d), cut cell mesh (figure 8e) and slanted cell mesh (figure 8f). Errors using the cubicFit scheme on cut cell and slanted cell meshes are much smaller than the errors using the linearUpwind scheme on the same meshes.
To further examine the performance of the cubicFit scheme in the presence of steep terrain, a second series of tests were performed in which the peak mountain height was varied from 0 km to 6 km keeping all other parameters constant. Results were obtained on BTF, cut cell and slanted cell meshes using the linearUpwind scheme and cubicFit scheme. Again, the time-step was chosen for each test so that the maximum Courant number was about 0.4 (table 1) error was calculated by subtracting the analytic solution from the numerical solution ( figure 9). Note that the analytic solution is a function of mountain height, with the tracer travelling farther over higher mountains due to non-divergent flow through a narrower channel. In all cases, error increases with increasing mountain height because steeper slopes lead to greater mesh distortions. Errors are identical for a given transport scheme when h 0 = 0 km and the ground is entirely flat because the BTF, cut cell and slanted cell meshes are identical. Compared with the cubicFit scheme, the linearUpwind scheme is more sensitive to the mesh type and mountain height. The linearUpwind scheme is unstable on the slanted cell mesh with a peak mountain height h 0 = 6 km despite using a Courant number of 0.428. In contrast, the cubicFit scheme is less sensitive to the mesh type and errors grow more slowly with increasing mountain height. The cubicFit scheme yields stable results in all tests. A final series of tests were performed to determine the stability limit of the cubicFit scheme with the two-stage Heun time-stepping scheme (equation 2). The tracer was transported on BTF, slanted cell and cut cell meshes with a variety of mesh spacings between ∆x = 5000 m and ∆x = 125 m. ∆z was chosen so that a constant aspect ratio is preserved such that ∆x/∆z = 2. For each test, the time-step was increased until the result became unstable. The largest stable time-steps, ∆t max , are presented in figure 10a. BTF meshes permit the longest time-steps of all three mesh types since cells are almost uniform in volume. As expected, the longest stable time-step scales linearly with BTF mesh spacing. There is no such linear scaling on cut cell meshes because these meshes can have arbitrarily small cells. The time-step constraints on cut cell meshes are the most severe of the three mesh types. Slanted cell meshes have a slightly stronger time-step constraint than BTF meshes but still exhibit similar linear scaling with mesh spacing. Furthermore, a dynamical model that uses slanted cell meshes instead of BTF meshes is expected to calculate pressure gradients more accurately [13]. Figure 10b presents the largest stable maximum Courant numbers, max(Co), which were calculated by substituting ∆t = ∆t max into equation (4). On basic terrain following meshes, the maximum Courant number tends towards about 1.3 with finer mesh spacings. No such trend is found on cut cell or slanted cell meshes. Cut cell meshes permit the largest maximum Courant numbers of around 3, but the largest stable time-steps on cut cell meshes are still smaller than corresponding time-steps on basic terrain following and slanted cell meshes.
This paper focuses on the spatial discretisation of the cubicFit scheme, but the stability limit depends also upon the choice of time-stepping. As such, we have not calculated a theoretical Courant number limit, although such an analysis should be possible using the techniques in [55]. The transport tests presented in this section demonstrate that the cubicFit scheme is suitable for flows over very steep terrain on two-dimensional terrain-following, cut cell and slanted cell meshes. The cubicFit scheme is less sensitive to the mesh type and mountain steepness compared to the linearUpwind scheme. The linearUpwind scheme becomes unstable over very steep slopes but the cubicFit scheme is stable for all tests. The accuracy of the cubicFit scheme was largely insensitive to the choice of time-step. In the next section, we evaluate the cubicFit scheme using more complex, deformational flows on icosahedral meshes and cubed-sphere meshes.

Deformational flow on a sphere
The tests so far have used flows that are mostly uniform on meshes that are based on rectangular cells. To ensure that the cubicFit transport scheme is suitable for complex flows on a variety of meshes, we use a standard test of deformational flow on a spherical Earth, as specified by Lauritzen et al. [43]. Results are compared between linearUpwind and cubicFit schemes using hexagonal-icosahedral meshes and cubed-sphere meshes. Hexagonal-icosahedral meshes are constructed by successive refinement of a regular icosahedron following the approach by [28,56,57] without any mesh twisting. Cubed-sphere meshes are constructed using an equi-distant gnomic projection of a cube having a uniform Cartesian mesh on each panel [58].
Following appendix A9 in [32], the average equatorial spacing ∆λ is used as a measure of mesh spacing. It is defined as where ∆x is the mean distance between cell centres and R e = 6.3712 × 10 6 m is the radius of the Earth. The deformational flow test specified by Lauritzen et al. [43] comprised six elements: 1. a convergence test using a Gaussian-shaped tracer 2. a "minimal" resolution test using a cosine bell-shaped tracer 3. a test of filament preservation 4. a test using a "rough" slotted cylinder tracer 5. a test of correlation preservation between two tracers 6. a test using a divergent velocity field We assess the cubicFit scheme using the first two tests only. We do not consider filament preservation, correlation preservation, or the transport of a "rough" slotted cylinder because no shape-preserving filter has yet been developed for the cubicFit scheme. Stable results were obtained when testing the cubicFit scheme using a divergent velocity field, but no further analysis is made here. The first deformational flow test uses an infinitely continuous initial tracer that is transported in a non-divergent, time-varying, rotational velocity field. The velocity field deforms two Gaussian 'hills' of tracer into thin vortical filaments. Half-way through the integration the rotation reverses so that the filaments become circular hills once again. The analytic solution at the end of integration is identical to the initial condition. A rotational flow is superimposed on a time-invariant background flow in order to avoid error cancellation. The non-divergent velocity field is defined by the streamfunction Ψ, where λ is a longitude, θ is a latitude, λ = λ − 2πt/T , and T = 12 days is the duration of integration. The time-step is chosen such that the maximum Courant number is about 0.4. The initial tracer density φ is defined as the sum of two Gaussian hills, An individual hill φ i is given by where φ 0 = 0.95 kg m −3 and b = 5. The Cartesian position vector x = (x, y, z) is related to the spherical coordinates (λ, θ) by (x, y, z) = (R e cos θ cos λ, R e cos θ sin λ, R e sin θ) .
The centre of hill i is positioned at x i . In spherical coordinates, two hills are centred at (λ 1 , θ 1 ) = (5π/6, 0) (λ 2 , θ 2 ) = (7π/6, 0) The results in figure 11 are obtained using the cubicFit scheme on a hexagonal-icosahedral mesh with ∆λ = 0.542 • . The initial Gaussian hills are shown in figure 11a. At t = T/2 the tracer has been deformed into an S-shaped filament (figure 11b). By t = T the tracer has almost returned to its original distribution except for some slight distortion and diffusion that are the result of numerical errors (figure 11c).
To determine the order of convergence and relative accuracy of the linearUpwind and cubicFit schemes, the same test was performed at a variety of mesh spacings betweeen ∆λ = 8.61 • and ∆λ = 0.271 • on hexagonal-icosahedral meshes and cubed-sphere meshes. The results are shown in figure 12. The solution is slow to converge at coarse resolutions, and this behaviour agrees with the results from Lauritzen et al. [43]. Both linearUpwind and cubicFit schemes achieve second-order accuracy at smaller mesh spacings. For any given mesh type and mesh spacing, the cubicFit scheme is more accurate than the linearUpwind scheme. Results are more accurate using hexagonal-icosahedral meshes compared to cubed-sphere meshes. It is not known whether the larger errors on cubed-sphere meshes are due to mesh non-uniformities at panel corners but there is no evidence of grid imprinting in the error fields (not shown).
A slightly more challenging variant of the same test is performed using a quasi-smooth tracer field defined as the sum of two cosine bells, otherwise. Hexagonal-icosahedral 0.2 SLFV-SL, swept-area scheme [60] Hexagonal-icosahedral 0.25 cubicFit Cubed-sphere 0.25 cubicFit Hexagonal-icosahedral 0.3 ICON-FFSL, swept-area scheme [60] Triangular-icosahedral 0.42 Table 2: Minimal resolutions for the cubicFit and linearUpwind schemes in the test of deformational flow using cosine bells. Italicised values have been extrapolated using the second-order convergence obtained at coarser mesh spacings. For comparison with existing models, some results are also included for unlimited versions of the transport schemes from the intercomparison by Lauritzen et al. [32].
The velocity field is the same as before. This test is used to determine the "minimal" resolution, ∆λ m , which is specified by Lauritzen et al. [43] as the coarsest mesh spacing for which 2 ≈ 0.033. The minimal resolution for the cubicFit scheme on a hexagonal-icosahedral mesh is about ∆λ m = 0.3 • . Tests were not performed at mesh spacings finer than ∆λ = 0.271 • but approximate minimal resolutions have been extrapolated from the second-order convergence that is found at fine mesh spacings. These minimal resolutions are presented in table 2 along with a selection of transport schemes having similar minimal resolutions from the model intercomparison by Lauritzen et al. [32].
The series of deformational flow tests presented here demonstrate that the cubicFit scheme is suitable for transport on spherical meshes based on quadrilaterals and hexagons. The cubicFit scheme is largely insensitive to the mesh type, and results are more accurate compared to the linearUpwind scheme for a given mesh type and mesh spacing. Neither scheme requires special treatment at the corners of cubed-sphere panels.

Conclusion
Atmospheric models are using increasingly fine horizontal mesh spacings that resolve steep slopes in terrain resulting in highly-distorted meshes, increased numerical errors and numerical instabilities. We have presented a new multidimensional method-of-lines transport scheme, cubicFit, that applies constraints derived from a von Neumann stability analysis to make the scheme stable over steep terrain on highly-distorted, arbitrary meshes. The scheme has a low computational cost at runtime, requiring only n multiplies per face per time-stage using a stencil with n cells. Stability constraint calculations are pre-computed during model initialisation since they depend upon the mesh geometry only.
The cubicFit scheme was compared to a multidimensional linear upwind scheme using three idealised numerical tests. The first test transported a tracer horizontally above steep slopes on highly-distorted, two-dimensional terrainfollowing meshes. The cubicFit scheme was second-order convergent regardless of mesh distortions. The second test transported a tracer over a mountainous lower boundary using terrain-following, cut cell and slanted cell meshes. The cubicFit scheme was generally insensitive to the type of mesh and less sensitive to terrain steepness compared to the multidimensional linear upwind scheme, and the scheme maintained accuracy up to its stability limit. The third test evaluated the transport schemes in a standard deformational flow field on hexagonal-icosahedral meshes and cubed-sphere meshes. In all tests, compared to the multidimensional linear upwind scheme, the cubicFit transport scheme was more stable and more accurate.

Acknowledgements
James Shaw acknowledges support from a PhD studentship funded jointly by NERC grant NE/K500860/1 and the University of Reading with CASE support from the Met Office. We are grateful to the Leibniz Institute for Tropospheric Research for providing their cut cell mesh generator, to the three anonymous reviewers for their helpful questions, and to Dr Shing Hing Man for his assistance with candidate polynomial generation. We also thank Dr Tristan Pryer at the University of Reading for useful discussions about the cubicFit transport scheme.
Appendix A: One-dimensional von Neumann stability analysis Two analyses are performed in order to find stability constraints on the weights w =b + 1 ·m as appear in equation (14). The first analysis uses a two-cell approximation to derive separate constraints on the upwind weight w u and downwind weight w d . The second analysis uses three cells to derive a constraint that considers all weights in a stencil.

Two-cell analysis
We start with the conservation equation for a dependent variable φ that is discrete-in-space and continuous-in-time where v is the velocity, and the left and right fluxes, φ L and φ R , are weighted averages of the neighbouring cell centres. Assuming that v is positive where φ j−1 , φ j , φ j+1 are cell centre values, and j denotes a cell centre position x = j∆x where ∆x is a uniform mesh spacing. α u and β u are the upwind weights and α d and β d are the downwind weights for the left and right fluxes respectively, and α u + α d = 1 and β u + β d = 1. At a given time t = n∆t at time-level n and with a time-step ∆t, we assume a wave-like solution with an amplification factor A, such that where φ (n) j denotes a value of φ at position j and time-level n. Using this to rewrite the left-hand side of equation (44) ∂φ hence equation (44) becomes where the Courant number c = v∆t/∆x. Let = β u − α d + β d cos k∆x − α u cos k∆x and = β d sin k∆x + α u sin k∆x, then A = e −c e −ic (52) and the complex modulus of A is For stability we need |A| ≤ 1 and, imposing the additional constraints that α u = β u and α d = β d , then and, given 0 ≤ 1 − cos k∆x ≤ 2, then Additionally, we do not want more damping than a first-order upwind scheme (where α u = β u = 1, α d = β d = 0), having an amplification factor, A up , so we need |A| ≥ A up , hence therefore Now, knowing that α u + α d = 1 (or α d = 1 − α u ) then, using equations (55) and (57),

Three-cell analysis
We start again from equation (44) but this time approximate φ L and φ R using three cell centre values, having used the same weights α uu , α u and α d for both left and right fluxes. Substituting equation (47) into equation (44) we find so that, if the complex modulus |A| ≤ 1 then If k∆x = π then cos k∆x = −1 and cos 2k∆x = 1 and α u − α d ≥ α uu . If k∆x = π/2 then cos k∆x = 0 and cos 2k∆x = −1 and α u − α d ≥ −α uu . Hence we find that When the same analysis is performed with four cells, α uuu , α uu , α u and α d , by varying k∆x we find that equation (64) holds replacing |α uu | with max(|α uu | , |α uuu |). Hence, we generalise equation (64) to find the final stability constraint where the peripheral cells P is the set of all stencil cells except for the upwind cell and downwind cell, and α p is the weight for a given peripheral cell p. We hypothesise that the three stability constraints (equations 58, 59 and 64) are necessary but not sufficient for a transport scheme on arbitrary meshes. The stability of the one-dimensional transport equation discretised in space and time could be analysed using existing techniques [55], but we have only analysed the spatial stability of the cubicFit scheme. Numerical experiments presented in section 3.2 demonstrate that the cubicFit scheme is generally insensitive to the time-step, provided that it is below a stability limit.

Appendix B: Mesh geometry on a spherical Earth
The cubicFit transport scheme is implemented using the OpenFOAM CFD library. Unlike many atmospheric models that use spherical coordinates, OpenFOAM uses global, three-dimensional Cartesian coordinates with the z-axis pointing up through the North pole. In order to perform the experiments on a spherical Earth presented in section 3.3, it is necessary for velocity fields and mesh geometries to be expressed in these global Cartesian coordinates.

Velocity field specification
The non-divergent velocity field in section 3.3 is specified as a streamfunction Ψ(λ, θ). Instead of calculating velocity vectors, the flux u f · S f through a face f is calculated directly from the streamfunction, where e ∈ f denotes the edges e of face f , e is the edge vector joining the two vertices of the edge, x e is the position vector of the edge midpoint, and Ψ(e) is the streamfunction evaluated at the same position. Edge vectors are directed in a counter-clockwise orientation.

Spherical mesh construction
Since OpenFOAM does not support two-dimensional spherical meshes, instead, we construct meshes that have a single layer of cells that are 2000 m deep, having an inner radius r 1 = R e − 1000 m and an outer radius r 2 = R e + 1000 m. By default, OpenFOAM meshes comprise polyhedral cells with straight edges and flat faces. This is problematic for spherical meshes because face areas and cell volumes are too small. For tests on a spherical Earth, we override the default configuration and calculate our own face areas, cell volumes, face centres and cell centres that account for the mesh curvature. Note that the new centres are no longer centroids, but they are consistent with the horizontal transport tests on a sphere presented in section 3.3.
A face is classified as either a surface face or radial face. A surface face has any number of vertices, all of equal radius. A radial face has four vertices with two different radii, r 1 and r 2 , and two different horizontal coordinates, (λ 1 , θ 1 ) and (λ 2 , θ 2 ). A radial face centre is modified so that it has a radius R e . The latitudinal and longitudinal components of a radial face centre need no modification. The face area A f for a radial face f is the area of the annular sector, where d is the great-circle distance between (λ 1 , θ 1 ) and (λ 2 , θ 2 ). To calculate the centre of a surface face f , a new vertex is created that is positioned at the mean of the face vertices. Note that this centre position,c f , is used in intermediate calculations and it is not the face centre position. Next, the surface face is subdivided into spherical triangles that share this new vertex [61]. The face centre direction and radius are calculated separately. The face centre directionr is the mean of the spherical triangle centres weighted by their solid angle,r where t ∈ f denotes the spherical triangles t of face f , Ω t is spherical triangle's solid angle which is calculated using l'Huilier's theorem, x t,1 and x t,2 are the positions of the vertices shared by the face f and spherical triangle t, andc f is the position of the centre vertex shared by all spherical triangles of face f . The face centre radius r is the mean radius of the face vertices, again weighted by the solid angle of each spherical triangle, where the solid angle Ω f of face f is the sum of the solid angles of the constituent spherical triangles, We use equations (68) and (69) to calculate the centre c f of the face f , The area vector S f of the surface face f is the sum of the spherical triangle areas [61], Cell centres and cell volumes are corrected by considering faces that are not normal to the sphere such that Let F be the set of faces satisfying equation (73). Then, the cell volume V c is which can be thought of as the area A integrated between r 1 and r 2 such that R 0 A(r) dr = r 2 r 1 r 2 Ω dr = 1 3 Ω r 3 2 − r 3 1 . The cell centre is modified so that it has a radius R e , which is consistent with radial faces.
Edges can be classified in a similar manner to faces where surface edges are tangent to the sphere and radial faces are normal to the sphere. The edge midpoints x e are used to calculate the face flux for non-divergent velocity fields (equation 66). For transport tests, corrections to edge midpoints are unnecessary. Due to the choice of r 1 and r 2 during mesh construction, the midpoint of a radial edge is at a radial distance of R e which is necessary for the correct calculation of non-divergent velocity fields. The position of surface edge midpoints is unimportant because these edges do not contribute to the face flux since e · x e = 0. Edge lengths are the straight-line distance between the two vertices and not the great-circle distance. Again, the edge lengths are not corrected because it makes no difference to the face flux calculation.