Embedded discontinuous Galerkin transport schemes with localised limiters

Motivated by finite element spaces used for representation of temperature in the compatible finite element approach for numerical weather prediction, we introduce locally bounded transport schemes for (partially-)continuous finite element spaces. The underlying high-order transport scheme is constructed by injecting the partially-continuous field into an embedding discontinuous finite element space, applying a stable upwind discontinuous Galerkin (DG) scheme, and projecting back into the partially-continuous space; we call this an embedded DG scheme. We prove that this scheme is stable in L2 provided that the underlying upwind DG scheme is. We then provide a framework for applying limiters for embedded DG transport schemes. Standard DG limiters are applied during the underlying DG scheme. We introduce a new localised form of element-based flux-correction which we apply to limiting the projection back into the partially-continuous space, so that the whole transport scheme is bounded. We provide details in the specific case of tensor-product finite element spaces on wedge elements that are discontinuous P1/Q1 in the horizontal and continuous P2 in the vertical. The framework is illustrated with numerical tests.


Introduction
Recently there has been a lot of activity in the development of finite element methods for numerical weather prediction (NWP), using continuous (mainly spectral) finite elements as well as discontinuous finite elements (Fournier et al., 2004;Thomas and Loft, 2005;Dennis et al., 2011;Kelly and Giraldo, 2012;Giraldo et al., 2013;Marras et al., 2013;Brdar et al., 2013;Bao et al., 2015); see Marras et al. (2015) for a comprehensive review. A key aspect of NWP models is the need for transport schemes that preserve discrete analogues of properties of the transport equation such as monotonicity (shape preservation) and positivity; these properties are particularly important when treating tracers such as moisture. Discontinuous Galerkin methods can be interpreted as a generalisation of finite volume methods and hence the roadmap for the development of shape preserving and positivity preserving methods is relatively clear (see Cockburn and Shu (2001) for an introduction to this topic). However, this is not the case for continuous Galerkin methods, and so different approaches must be used. In the NWP community, limiters for CG methods have been considered by Marras et al. (2012), who used first-order subcells to reduce the method to first-order upwind in oscillatory regions, and Guba et al. (2014), who exploited the monotonicity of the element-averaged scheme in the spectral element method to build a quasi-monotone limiter.
1. The introduction of an embedded discontinuous Galerkin scheme which is demonstrated to be linearly stable, 2. The introduction of localised element-based limiters to remove spurious oscillations when projecting from from discontinuous to continuous finite element spaces, which are necessary to make the whole transport scheme bounded, 3. When combined with standard limiters for the discontinuous Galerkin stage, the overall scheme remains locally bounded, addressing the previously unsolved problem of how to limit partially continuous finite element spaces that arise in the compatible finite element framework.
Our bounded transport scheme can also be used for continuous finite element methods, although other approaches are available that do not involve intermediate use of discontinuous Galerkin methods. The rest of the paper is structured as follows. The problem is formulated in Section 2. In particular, more detail on the finite element spaces is provided in Section 2.1. The embedded discontinuous Galerkin schemes are introduced in Section 2.2; it is also shown that these schemes are stable if the underlying discontinuous Galerkin scheme is stable. The limiters are described in Section 2.3. In Section 3 we provide some numerical examples. Finally, in Section 4 we provide a summary and outlook.

Finite element spaces
We begin by defining the partially continuous finite element spaces under consideration. In three dimensions, the element domain is constructed as the tensor product of a two-dimensional horizontal element domain (a triangle or a quadrilateral) and a one-dimensional vertical element domain (i.e., an interval); we obtain triangular prism or hexahedral element domains aligned with the vertical direction. For a vertical slice geometry in two dimensions (frequently used in testcases during model development), the horizontal domain is also an interval, and we obtain quadrilateral elements aligned with the vertical direction.
To motivate the problem of transport schemes for a partially continuous finite element space, we first consider a compatible finite element scheme that uses a discontinuous finite element space for density. This is typically formed as the tensor product of the DG k space in the horizontal (degree k polynomials on triangles or bi-k polynomials on quadrilaterals, allowing discontinuities between elements) and the DG l space in the vertical. We consider the case where the same degree is chosen in horizontal and vertical, i.e. k = l, although there are no restrictions in the framework. We will denote this space as DG k × DG k .
In the compatible finite element framework, the vertical velocity space is staggered in the vertical from the pressure space; the staggering is selected by requiring that the divergence (i.e., the vertical derivative of the vertical velocity) maps from the vertical velocity space to the pressure space. This means that vertical velocity is stored as a field in DG k × CG k+1 (where CG k+1 denotes degree k + 1 polynomials in each interval element, with C 0 continuity between elements). To avoid spurious hydrostatic pressure modes, one may then choose to store (potential) temperature in the same space as vertical velocity (this is the finite element version of the Charney-Phillips staggering). Figure 1 provides diagrams showing the nodes for these spaces in the case k = 1. Details of how to automate the construction of these finite element spaces within a code generation framework are provided in McRae et al. (2015).
Monotonic transport schemes for temperature are often required, particularly in challenging testcases such as baroclinic front generation. Further, dynamics-physics coupling requires that other tracers such as moisture must be stored at the same points as temperature; many of these tracers are involved in parameterisation calculations that involve switches and monotonic advection is required to avoid spurious formation of rain patterns at the gridscale, for example. Hence, we must address the challenge of monotonic advection in the partially continuous DG k ×CG k+1 space.
In this paper, we shall concentrate on the case of DG 1 × CG 2 . This is motivated by the fact that we wish to use standard DG upwind schemes where the advected tracer is simply evaluated on the upwind side; the lowest order space DG 0 × CG 1 leads to a first order scheme in this case. We may return to higher order spaces in future work.

Embedded Discontinuous Galerkin schemes
In this section we describe the basic embedded transport scheme as a linear transport scheme without limiters. The scheme, which can be applied to continuous or partially-continuous finite element spaces, is motivated by the fact that limiters are most easily applied to fully discontinuous finite element spaces. We call the continuous or partially-continuous finite element space V , and letV be the smallest fully discontinuous finite element space containing V . A diagram illustrating V andV in our case of interest, namely the finite element space for temperature described in the previous section, is shown in Figure 2.
Before describing the transport scheme, we make a few definitions.
Definition 1 (Injection operator). For u ∈ V ⊂V , we denote I : V →V the natural injection operator.
The injection operator does nothing mathematically except to identify Iu as a member ofV instead of just V . However, in a computer implementation, it requires us to expand u in a new basis. This can be cheaply evaluated element-by-element.
Definition 2 (Propagation operator). Let A :V →V denote the operator representing the application of one timestep of an L 2 -stable discontinuous Galerkin discretisation of the transport equation.
For example, A could be the combination of an upwind discontinuous Galerkin method with a suitable Runge-Kutta scheme.
In a computer implementation, this requires the inversion of the mass matrix associated with V .
We now combine these operators to construct our embedded discontinuous Galerkin scheme.
Definition 4 (Embedded discontinuous Galerkin scheme). Let V ⊂V , with injection operator I, projection operator P and propagation operator A. Then one step of the embedded discontinuous Galerkin scheme is defined as The L 2 stability of this scheme is ensured by the following result.
Proposition 5. Let α > 0 be the stability constant of the the propagation operator A, such that where · denotes the L 2 norm. Then, the stability constant α * of the embedded discontinuous Galerkin scheme on V satisfies α * < α. Proof.
as required. In the last inequality we used the fact that Pẑ ≤ ẑ , which is a consequence of the Riesz representation theorem.
Corollary 6. For a given velocity field u, let A ∆t denote the propagation operator for timestep size ∆t. Let ∆t * denote the critical timestep for A ∆t , i.e., Then, the critical timestep size ∆t † for the embedded discontinuous Galerkin scheme P A ∆t I is at least as large as ∆t * .
Proof. If ∆t ≤ ∆t * , then P A ∆t I ≤ A ∆t ≤ 1, as required.
Hence, the embedded DG scheme is L 2 stable whenever the propagation operator A is. For the numerical examples in this paper, we consider the case V = DG 1 × CG 2 (our temperature space) andV = DG 1 × DG 2 . For a given divergence-free velocity field u, defined on the domain Ω and satisfying u · n = 0 on the domain boundary ∂Ω, A represents the application of one timestep applied to the transport equation discretised using the usual Runge-Kutta discontinuous Galerkin discretisation (see Cockburn and Shu (2001) for a review). To do this, first we define L :V →V by where Γ is the set of interior facets in the finite element mesh, with the two sides of each facet arbitrarily labelled by + and −, the jump operator denotes [ Then, the timestepping method is defined by the usual 3rd order 3 step SSPRK timestepping method (Shu and Osher, 1988), Since the finite element space V is discontinuous in the horizontal, the projection P :V → V decouples into independent problems to solve in each column (i.e., the mass matrix for DG 1 ×CG 2 is column-block diagonal).

Bounded transport
Next we wish to add limiters to the scheme. This is done in two stages. First, a slope limiter should be incorporated into theV propagator, A; we call the resulting schemeÃ. A suitable limiter is defined in Section 2.3.1 After replacing A withÃ, the only way that the solution can generate overshoots and undershoots is after the application of the projection P . To control these unwanted oscillations, we apply a (conservative) flux correction to the projection, referred to as flux corrected remapping ; this is described in Section 2.3.2. We denote the flux corrected remappingP , and the resulting bounded transport scheme may be written as θ n+1 =PÃI.

Slope limiter for the propagator A
In principle, any suitable discontinuous Galerkin slope limiter can be used in the propagator A. In this paper we used the vertex-based slope limiter of Kuzmin (2010). This limiter is both very easy to implement, and supports a treatment of the quadratic structure in the vertical. Before presenting the limiter forV = DG 1 × DG 2 (recall that this is the space we must use to obtain a transport scheme for our DG 1 × CG 2 space used for temperature), we first review the concepts in the simpler case ofV = DG 1 × DG 1 . The basic idea for θ ∈V = DG 1 × DG 1 is to write whereθ is the projection of θ into DG 0 , i.e. in each elementθ is the element-averaged value of θ. Then, for each vertex i in the mesh, we compute maximum and minimum bounds θ max,i and θ min,i by computing the maximum and minimum values ofθ over all the elements that contain that vertex, respectively. In each element e we then compute a constant 0 ≤ α e ≤ 1 such that the value of at each vertex i contained by element e. The optimal value of the correction factor α e can be determined using the formula of Barth and Jespersen (1989) where N e is the set of vertices of element e and θ e,i =θ e + (∆θ) e (x i ) is the unconstrained value of the DG 1 shape function at the i-th vertex. For our temperature space DG 1 × DG 2 applied to numerical weather prediction applications, we assume that we have a columnar mesh. This means that the prismatic elements are stacked vertically in layers, with vertical sidewalls (but possibly with tilted top and bottom faces to facilitate terrain-following meshes, so that the elements are trapezia). This allows us to adopt a Taylor basis in the vertical, i.e. the basis in local coordinates is the tensor product of a Taylor basis in the vertical with a Lagrange basis in the horizontal. We write where θ 1 ∈ DG 1 × DG 1 , and satisfies the following conditions: 1.θ 1 =θ, 2. ∂θ 1 ∂z and ∂θ ∂z take the same values along the horizontal element midline in local coordinates. Then, ∂θ ∂z ∈ DG 1 × DG 1 whilst ∂θ 1 ∂z ∈ DG 1 × DG 0 . First, we limit the quadratic component in the vertical (the third term in Equation (11)), performing the following steps.
1. In each element, compute ∂θ 1 ∂z , and evaluate the derivative at the horizontal cell midline to obtain ∂θ 1 ∂z ∈ DG 1 × DG 0 . If the quadratic component θ − θ 1 is limited to zero then ∂θ 1 ∂z will become equal to ∂θ 1 ∂z . 2. In each column, at each vertex i, compute bounds ∂θ ∂z | min,i and ∂θ ∂z | max,i by taking the maximum value of ∂θ 1 ∂z at that vertex in the elements sharing that vertex in the column.
3. In each element, compute element correction factors α 1,e according to This approach can also be extended to meshes in spherical geometry for which all side walls are parallel to the radial direction 1 , having replaced ∂ ∂z by the radial derivative. Second, we apply the vertex-based limiter to the DG 1 × DG 1 component θ 1 , obtaining limiting constants α 0 . We then finally evaluate To reduce diffusion of smooth extrema, it was recommended in Kuzmin (2010) to recompute the α 0 coefficients according to α 0,e → max(α 0,e , α 1,e ).
However, this does not work in the case of DG 1 × DG 2 since there is no quadratic component in the horizontal direction, and hence nonsmooth extrema in the horizontal direction will not be detected. A possible remedy is to use α 0,e for the horizontal gradient and max(α 0,e , α 1,e ) for the vertical gradient or to limit the directional derivatives separately using an anisotropic version of the vertex-based slope limiter (Kuzmin et al. (2015)). This limiter is applied to the input toÃ and after each SSPRK stage, to ensure that no new maxima or minima appear in the solution over the timestep.

Flux corrected remapping
The final step of the embedded DG scheme is the projection P of the DG solution (which we denote here asθ) back into V . We obtain a high-order, but oscillatory solution, which we denote θ H . To obtain a bounded solution, we introduce a localised element-based limiter that blends θ H with a low-order bounded solution θ L , such that high-order approximation is preserved wherever overshoots and undershoots are not present.
First, we must obtain the low-order bounded solution. Using the Taylor basis, we remove the quadratic part ofθ, to obtainθ ∈ DG 1 ×DG 1 . A low-order bounded solution can then be obtained by applying a lumped mass projection, where the lumped mass M is defined by the projection matrix Q is defined by is a Lagrange basis for DG 1 × DG 1 and {φ i } n i=1 is a Lagrange basis for DG 1 × CG 2 . The lumped mass M and projection matrix Q both have strictly positive entries. This means that for each 1 ≤ i ≤ n, the basis coefficient θ L i is a weighted average of values ofθ coming from elements that lie in S(i), the support of φ i . The weights are all positive, and hence the value of θ L i is bounded by the maximum and minimum values ofθ in S(i). Hence, no new maxima or minima appear in the low order solution.
Next, we combine the low order and high order solutions element-by-element, in a process called element-based flux correction. Element based flux correction was introduced in (Löhner et al., 1987) and formulated for conservative remapping in (Löhner, 2008;Kuzmin et al., 2010). Here, we use a new localised element-based formulation, where element contributions to the low and high order solutions are blended locally and then assembled.
To formulate the element-based limiter, we note that the consistent mass counterpart of (15) is given by where First, by repeated addition and subtraction of terms, we write (with no implied sum over the index i) where This can be decomposed into elements to obtain where Importantly, the contributions f e i of element e to its vertices sum to zero, since It follows that the total mass of the solution remains unchanged (i.e., n i=1 M i θ H i = n i=1 M i θ L i ) if all contributions of the same element are reduced by the same amount.
We can then choose element limiting constants α e to get where 0 ≤ α e ≤ 1 is a limiting constant for each element which is chosen to satisfy vertex bounds obtained from the nodal values ofθ. The bounds in each vertex are obtained as follows. First element bounds θ e max and θ e min are obtained fromθ by maximising/minimising over the vertices of element e. Then for each vertex i, maxima/minima are obtained by maximising/minimising over the elements containing the vertex: The correction factor α e is chosen so as to enforce the local inequality constraints Summing over all elements, one obtains the corresponding global estimate which proves that the corrected value θ C i := θ L i + 1 M i e α e f e i is bounded by θ max,i and θ min,i . To enforce the above maximum principles, we limit the element contributions f e i using This definition of α e corresponds to a localised version of the element-based multidimensional FCT limiter ((Löhner et al., 1987;Kuzmin and Turek, 2002)) and has the same structure as formula (10) for the correction factors that we used to constrain the DG 1 approximation. A further advantage of the localised formulation is that the limited fluxes can be built independently in each element, before assembling globally and dividing by the global lumped mass by iterating over nodes.

Numerical Experiments
In this section, we provide some numerical experiments demonstrating the localised limiter for embedded Discontinuous Galerkin schemes.

Solid body rotation
In this standard test case, the transport equations are solved in the unit square Ω = (0, 1) 2 with velocity field u(x, y) = (0.5 − y, x − 0.5), i.e. a solid body rotation in anticlockwise direction about the centre of the domain, so that the exact solution at time t = 2π is equal to the initial condition. The initial condition is chosen to be the standard hump-cone-slotted cylinder configuration defined in LeVeque (1996), and solved on a regular mesh with element width h = 1/100 and Figure 3: Solid body rotation test (Section 3.1) on a 100×100 grid. Solution is interpolated to a DG 1 discontinuous field before plotting. Left: initial condition. Right: Solution after one rotation. The solution is free from overand undershoots and exhibits comparable numerical dissipation to discontinuous Galerkin methods combined with limiters.
Courant number 0.3. The result, shown in Figure 3, is comparable with the result for the DG 1 discontinuous Galerkin vertex-based limiter shown in Figure 2 of Kuzmin (2010); it is free from overand undershoots and exhibits a similar amount of numerical diffusion. It is also hard to distinguish between the x-direction, where the finite element space is discontinuous, and the y-direction, where the space is continuous. This suggests that we have achieved our goal of constructing a limited transport scheme for our partially-continuous finite element space.

Advection of a discontinous function with curvature
In this test case, the transport equations are solved in the unit square Ω = (0, 1) 2 with velocity field u = (1, 0), i.e. steady translation in the x-direction (which is the direction of discontinuity in the finite element space). The initial condition is This test case is challenging because the height of the "plateau" next to the continuity varies as a function of y (i.e., in the direction tangential to the discontinuity); this means that the behaviour of the limiter is more sensitive to the process of obtaining local bounds. The equations are integrated until t = 0.4 in a 100 × 100 square grid and Courant number 0.3. The results are showing in Figure 4. One can see qualitatively that the degradation in the solution due to the limiter and numerical errors is not too great.

Convergence test: deformational flow
In this test, we consider the advection of a smooth function by a deformational flow field that is reversed so that the function at time t = 1 is equal to the initial condition. As is standard for The transport equations are solved in a unit square, with periodic boundary conditions in the x-direction. The initial condition is θ(x, 0) = 0.25(1 + cos(r)), r = min 0.2, (x − 0.3) 2 + (y − 0.5) 2 /0.2 , and the velocity field is u(x, t) = (1 − 5(0.5 − t) sin(2π(x − t)) cos(πy), 5(0.5 − t) cos(2π(x − t)) sin(πy)) , where x = (x, y). The problem was solved on a sequence of regular meshes with square elements at fixed timestep ∆t = 0.000856898, and the L 2 error was computed. A plot of the errors is provided in Figure 5. As expected, we obtain second-order convergence (the quadratic space in the vertical does not enhance convergence rate because the full two-dimensional quadratic space is not spanned).

Summary and Outlook
In this paper we described a limited transport scheme for partially-continuous finite element spaces. Motivated by numerical weather prediction applications, where the finite element space for temperature and other tracers is imposed by hydrostatic balance and wave propagation properties, we focussed particularly on the case of tensor-product elements that are continuous in the vertical direction but discontinuous in the horizontal. However, the entire methodology applies to standard C 0 finite element spaces. The transport scheme was demonstrated in terms of convergence rate on smooth solutions and dissipative behaviour for non-smooth solutions in some standard testcases.
Having a bounded transport scheme for tracers is a strong requirement for numerical weather prediction algorithms; the development of our scheme advances the practical usage of the compatible finite element methods described in the introduction. The performance of this transport scheme applied to temperature in a fully coupled atmosphere model will be evaluated in 2D and 3D testcases as part of the "Gung Ho" UK Dynamical Core project in collaboration with the Met Office. In the case of triangular prism elements we anticipate that it may be necessary to modify the algorithm above to limit the time derivatives as described in Kuzmin (2013).  A key novel aspect of our transport scheme is the localised element-based FCT limiter. This limiter has much broader potential for use in FCT schemes for continuous finite element spaces, which will be explored and developed in future work.