Efficient and scalable data structures and algorithms for goal-oriented adaptivity of space-time FEM codes

The cost- and memory-efficient numerical simulation of coupled volume-based multi-physics problems like flow, transport, wave propagation and others remains a challenging task with finite element method (FEM) approaches. Goal-oriented space and time adaptive methods derived from the dual weighted residual (DWR) method appear to be a shiny key technology to generate optimal space-time meshes to minimise costs. Current implementations for challenging problems of numerical screening tools including the DWR technology broadly suffer in their extensibility to other problems, in high memory consumption or in missing system solver technologies. This work contributes to the efficient embedding of DWR space-time adaptive methods into numerical screening tools for challenging problems of physically relevance with a new approach of flexible data structures and algorithms on them, a modularised and complete implementation as well as illustrative examples to show the performance and efficiency.


Introduction
The accurate, reliable and efficient numerical approximation of flow in heterogeneous deformable porous media include several coupled multi-physics phenomena, such as multi-phase diffusion and convection-dominated transport with underlying chemical reactions, deformation and poroelastic wave propagation as well as fluid-structure interactions in a more general sense, and is of fundamental importance in environmental, civil, energy, biomedical and many other engineering fields to yield a cost-effective numerical simulation screening tool to support the still challenging and fundamental research of understanding such multi-physics phenomena. Considerable representatives are coupled problems based on the Navier-Stokes equations for incompressible flow, the Euler equations for compressible flow, the convection-diffusion-reaction equations and the Biot-Allard equations for coupled flow with poroelastic wave propagation, which are characterised by partial differential equations of nonlinear and instationary character. Space-time finite element method (FEM) approximations offer appreciable benefits over finite difference and finite volume methods such as the flexibility with which they can accommodate discontinuities in the model, material parameters and boundary conditions as well as for a priori and a posteriori error estimation to establish optimal computational space-time grids in a self-adaptive way. The dual weighted residual (DWR) method for goal-oriented adaptivity was introduced by Becker and Rannacher ( [1,2]) and further studied; cf. [3,4,5,6] and references therein. The DWR method facilitates to find optimal spatial mesh adaptations, optimal temporal mesh adaptations, as well as optimal local spacetime polynomial degrees, tuning parameters and physical or numerical models, such that the overall computational costs are minimised for reaching a goal in a target quantity of interest.
A target quantity is a cost, error or energy functional J and depends on an user-chosen quantity of interest. Examples for a quantity of interest are the stress derived from the primal displacement variables or the drag coefficient derived from the primal fluid velocity variables of the underlying problem. The goal of the DWR method is to satisfy a guaranteed error bound in the target quantity of interest instead of satisfying a classical error bound of the primal solution variable in a standard norm.
Unfortunately, the DWR method still has challenging drawbacks compared to standard a posteriori error based adaptive methods. It needs to solve the primal or forward problem and an auxiliary dual or adjoint problem of higher approximation quality in each loop of an optimisation problem, it needs variational space-time discretisations for the primal and dual problem, which yield problems on (d+1)-dimensional domains, and there is a lack in efficient data structures, algorithms as well as (non-)linear system solver and preconditioning technologies for sophisticated problems of physical interest, which will be at least partially resolved by this work.
Fortunately, the DWR method works in general situations, in which the problem itself and the quantity of interest are of nonlinear fashion, and it does not rely on generally unknown assumptions of the initial space-time mesh. The key idea of the DWR method is to embed the given problem of finding solutions of a partial differential equation into the framework of optimal control to estimate the functional error of J(u) − J(u τ,h ), derive computable a posteriori error estimates η with an ap-proximation of the corresponding dual solution z of an auxiliary problem, execute space-time mesh (and other) adaptations based on the information given by η and loop until the goal in the target quantity is reached as it is outlined by Fig. 1. Consider to solve the constrained optimisation problem given by for a (cost, error or energy) functional J depending on u, under the constraint of finding u ∈ V from the variational problem which is set up from the partial differential equation problem in a standard way. V and W are corresponding variational trial and test space-time functional spaces such that primal and dual solutions exist at least locally, are unique and stable, but the latter ones can not be guaranteed for arbitrary nonlinear problems without restrictions. The calculus of variations theory associates a corresponding Lagrangian functional L to (1), with z ∈ W as a Lagrange multiplier or, equivalently, as the adjoint or dual solution variable, for the global optimisation problem (1). The primal solution u is the first component of a stationary point (u, z) of L whereas the second component yields the corresponding dual solution z; cf. [1,2,6] for details. Due to the mathematically challenging theory of even local existence of solutions in general nonlinear settings, we restrict ourselves further, without the loss of generality and the impact of our software, to a linear prototype model for instationary transport in heterogeneous porous media.

Exact Scientific Problem solved by the Software
The DTM++.Project/dwr-diffusion frontend simulation tool for the established finite element analysis library deal.II [7] yields a reference implementation of the prototype model for instationary transport in heterogeneous porous media. We consider the goal-oriented approximation of a target quantity J(u) subject to find u ∈ V from the diffusion equation in the space-time domain Q = Ω × I with Ω ⊂ R d (open and bounded), dimension d = 2, 3, and I = (t 0 , T ), 0 ≤ t 0 < T < ∞, and equipped with appropriate initial and boundary conditions with the partition of the boundary ∂Ω = Γ D ∪ Γ N and Γ D ∅. A target quantity J(u) of interest derived from the primal solution u of (3) might represent a local pollution concentration under diffusive transport. The coefficient functions ρ > 0 and > 0 are the mass density and the permeability of the medium, respectively, and the right hand side function f : Q → R represents volume forces acting on the interior domain Ω such as source or sink terms. In (3), the differential operators ∂ t (·), ∇ · (·) = ∂ x 1 (·) + · · · + ∂ x d (·) and ∇(·) = [∂ x 1 (·), . . . , ∂ x d (·)] T denote the partial derivative for the time variable t, the divergence and the gradient for the space variables x 1 , . . . , x d , respectively, and, n denotes the outward facing normal vector, in standard notation. The initial-boundary value problem is closed by the choice of an appropriate initial value function u 0 : Ω → R and a boundary value function g : Γ D × I → R acting on the Dirichlet part of the boundary partition Γ D ∅. Additionally, there is the choice for an appropriate inhomogeneous Neumann boundary type condition function h : The weak variational primal problem, for brevity consider to find u :=ũ − g or put g = 0 without loss of generality of the following discussion, reads as: Seek u ∈ V, V = L 2 (0, T ; H 1 0 (Ω)), satisfying u(0) = u 0 , u 0 ∈ H 1 0 (Ω), such that with the left-hand side bilinear form A : V × W → R, and the right-hand side linear form F : W → R, Remark thatũ ∈ g + V, withũ having inhomogeneous Dirichlet boundary conditions g 0, can be generated in a standard way as it will be explained in the following algorithm. Briefly, the weak variational dual problem results from the stationary condition of L(u, z) (2) given by for ψ ∈ V and z ∈ W; using the Euler-Lagrange method of constrained optimisation, cf. for details [6, Sec. 6.1], [4]. The dual problem for a general nonlinear goal functional J(u) reads as: Find z ∈ W, W = L 2 (0, T ; H 1 0 (Ω)), satisfying z(T ) = z T , z T ∈ H 1 0 (Ω), such that for all ψ ∈ V, V = L 2 (0, T ; H 1 0 (Ω)), and with the left-hand side adjoint bilinear form A * : W × V → R, which is derived from integration by parts in time of A(ψ)(z) of (4) and put this into the first equation of (5). Remark that the choice of trial and test functions for the dual problem here yields traces on t 0 and T . To estimate the functional error of J(u) − J(u τ,h ) and to derive computable a posteriori error estimates η for the space-time mesh adaptivity, we need to discretise the primal problem (4) and the dual problem (6). The primal and dual problem are discretised with appropriate space-time variational methods, i.e. here a piecewise constant discontinuous Galerkin method in time for the primal problem and a piecewise linear continuous Petrov-Galerkin method in time for the dual problem (Fig. 2), combined with a piecewise polynomial continuous Ritz-Galerkin (FEM) method in space; cf. for details [8] and [4,6]. The space-time cylinder Q = Ω × I is divided into non-overlapping space-time slabs Q n = Ω h × I n , with the partition ofĪ = [t 0 , T ] as t 0 =: t 0 < · · · < t N := T , I n := (t m , t n ), t m := t n−1 , n = 1, . . . , N , for the -th loop. On each Q n consider a not necessarily conforming partition T h,n of Ω h into non-overlapping elements K n , denoted as geometrical quadrilaterals for d = 2 or hexahedrons for d = 3. The fully discrete primal and dual solutions are represented by In (7)-(8), u n, j,ι and z n, j,ι denote the space-time degrees of freedom (DoF) of the primal and dual problem, φ primal,n, j and φ dual,n, j denote the global spatial trial basis functions and ζ primal,n, ι and ξ dual,n, ι denote the global temporal trial basis functions (Fig. 3). The basis functions are defined on a reference space-time slab Q = (0, 1) d × (0, 1), e.g. as piecewise Lagrange polynomials, and mapped appropriately to T h,n × I n ; cf. [8] for details. The software implements the goal functional J : V → R, for all ψ ∈ V, with u τ,h denoting the fully discrete primal solution of the -th DWR loop and u denoting the analytic solution (only possible for academic test problems) or an approximation of the exact primal solution generated on a finer space-time mesh or with higher-order approaches, which aims to reach for an absolute or relative tolerance tol criterion, on the spacetime control volume Q c = Ω c × I c ⊆ Q; cf.
for n = 1, . . . , N , by marching forwardly in time through the slabs. The mass and stiffness matrix of the primal problem on the slab n are denoted as M n and A n , respectively, the right hand side assemblies f n 0 and h n 0 correspond to volume forcing and inhomogeneous Neumann boundary terms and the vector (I h u τ,h (x, t n−1 )) j is the interpolation of the initial value function u 0 from (3) for n = 1 or the fully discrete primal solution of the previous (n-1) slab for n > 1 on the primal finite element space of the current slab Q n . Each system is modified such that the Dirichlet boundary conditions from (3) are applied strongly. To set up the dual problem with the goal (9), compute the contribution of the value u − u τ,h Q c (10) with a post-processing on each slab Q n . Solve the dual problem: Find the coefficient vector z n, 0 = (z n, j,0 ) j , j = 1, . . . , N dual,n, DoF , from for n = N , . . . , 1, by marching backwardly in time through the slabs. The mass and stiffness matrix of the dual problem on the slab n are denoted as M d n and A d n , respectively, the right hand side assembly vector J n, 0 corresponds to the goal defined by (9) and the vector (I d h z τ,h (x, t n )) j is the interpolation of the homogeneous initial value function z T = 0 (6) for the chosen goal (9) for n = N or the fully discrete dual solution of the next (n+1) slab for n < N on the dual finite element space of the current slab Q n . The same type of boundary colorisation (either Dirichlet or Neumann type) is used for the dual problem but with homogeneous boundary value functions, even in the case of inhomogeneous primal boundary conditions. Each system is modified such that homogeneous Dirichlet boundary conditions on Γ D are applied strongly to the dual solution z τ,h (8). The numerically approximated a posteriori space-time error estimatẽ is derived from the primal and dual solutions u τ,h and z τ,h , respectively, and each localisedη n, K , K ∈ T h,n , n = 1, . . . , N , is stored independently; cf. for details [5]. The space-time mesh refinement update is implemented as follows: mark a spacetime slab for refinement in time for which the corresponding value ofη n, (13) belongs to the top fraction 0 ≤ θ τ ≤ 1 of largest values, then, on each slab Q n , mark a mesh cell K ∈ T h,n for d-dimensional isotropic refinement in space for which the corresponding value of |η n, K | (13) belongs to the top fraction θ h,1 or θ h,2 , for a slab that is not or is marked for time refinement, of largest values, with 0 ≤ θ h,2 ≤ θ h,1 ≤ 1, then execute the spatial refinement and finally execute the temporal refinement.

Contribution of the Software to Scientific Discovery
Modules of the DTM++.Project, including higher-order in time discretisations with sophisticated solver technologies and distributed-memory parallel implementations, already have contributed to the process of scientific discovery for acoustic, elastic and coupled wave propagation (xwave), mass conservative transport (meat) and coupled deformation with transport (biot) for instance; cf. [8,9] and references therein. The stationary predecessor of dwr-diffusion is used in [4] and the successor is used in [5] for stabilised instationary convectiondominated transport problems. The contributed software shares the modularity and flexibility of all DTM++.Project modules, it provides a freely available open-source framework for efficiently solving problems with goal-oriented adaptivity and will support further scientific discovery with numerical simulations of challenging problems with physical relevance.

Basic Setup and Usage of the Software
The user needs to prepare a deal.II v9.0 toolchain to compile and run the software, this can be done on any major

Related work
The origin of the code is related to the step-14 tutorial code of deal.II [7] for the Laplace equation and other DTM++.Project modules [8]. Additionally, the code gallery of [7] provides some implementations for stationary elastoplasticity problems using DWR-based goal-oriented mesh adaptivity and for the instationary convection-diffusion-reaction equation without adaptivity for instance. All of the listed related implementations can gain advantages from our work since those are mostly specific implementations with partly lacking documentation and, more importantly, do not document their data structures and algorithms in an application-free way for allowing the reuse in other frameworks straight forwardly.

Software description
The software DTM++.Project/dwr-diffusion implements the introduced algorithm of Sec. 1.2 in an object-oriented way and is shipped with an extensive in-source documentation. DTM++ modules are written in the C++.17 language and make use of the new language features since C++.11 such as reference counted dynamic memory allocation, range-based loops, the auto specifier, strongly typed enum classes and others.

Software Architecture
The general workflow to use the dwr-diffusion solver module is illustrated by Fig. 5, it highly rely on user-driven inputs by simple text-based parameter input files for a wide range of similar numerical examples. To keep the code clean for the intended user group, the implementation of the presented algorithm from Sec. 1.2 is done with a procedural-structured single central solver class template Diffusion DWR<dim> : Figure 7: Pictorial overview of the elements to handle a computational slab Q n = T h,n × I n implemented by the DTM++::Grid DWR class. Remark that loop based slab data structures Q n is not stored for efficiency reasons.
DTM::Problem for slightly different time discretisations of the primal and dual problem. All assemblers, such as for the matrices, right-hand side vectors and even the space-time error estimateη, are using the deal.II workstream technology for thread-parallel and further MPI+X-parallel simulations. Thereby, the useful and efficient auxiliary classes of the DTM++ suite are provided independently, such that the input parameter handling and data output handling must not be included into the solver class. An overview on the general DTM++ software framework for PDE solver frameworks can be found in [8] and by the in-source doxygen documentation of the code.

Software Key Technologies
One of the key technologies of this work is the efficient handling of the space-time DoF data storage management as given in Fig. 6 which allow to efficiently iterate forwardly and backwardly to compute the primal and dual solutions as well as the error estimate. Another important key technology is a list approach of the new DTM++::Grid DWR class to handle (d+1)dimensional space-time computational slabs and grids as given in Fig. 7 and allows for inexpensive local time refinements.

Impact
The software gives new and efficient insights to the implementation of goal-oriented mesh adaptivity which supports the development of cost-effective numerical screening tools based on finite element approaches for fundamental problems in science and engineering. Derivatives of the software have already been used for stabilised stationary and instationary convectiondominated problems with goal-oriented mesh adaptivity in [4,5] as well as for coupled deformation and diffusion-driven transport in [9,8]. The intended user group of researchers and engineers using and developing finite element based numerical simulation screening tools for several problems is widespread and globally distributed; an example would be the extension of the work [10] in which they are using higher order variational time discretisations for convection-dominated transport with non-goal oriented adaptive time step control, or, an efficient instationary extension to the approach presented in [3]. The software is currently not used in commercial settings, but the license allows for the integration into commercial tools.

Conclusions
This original software publication provides efficient and scalable data structures and algorithms for the implementation of goal-oriented mesh adaptivity in a highly modular way. The key ideas of the data structures and algorithms can be reused, even independently of the programming language, in any adaptive finite element code. The performance and applicability of the software is shown with an illustrative example and several others are shipped with the code. The work yields a major breakthrough as a freely available open-source implementation with extensive in-source documentation for the dual weighted residual method in an application-free way such that it can be easily adopted by other frameworks. Ongoing work is on the distributed-memory parallelisation, on the serialisation of the data storage hiding memory operations, and on finding optimal tuning parameters of the minimisation problem.