An adaptive local time-stepping scheme for multiresolution simulations of hyperbolic conservation laws

Article history: Received 3 December 2018 Received in revised form 9 July 2019 Accepted 20 August 2019 Available online 14 October 2019


Introduction
Computational simulations of complex flows remain a challenge even today. The large range of temporal and spatial scales in combustive or turbulent flows or flow cavitation challenges the performance of even the largest parallel computers [26]. While a wide range of spatial scales causes high memory consumption, small temporal scales lead to small timestep sizes that inversely proportionally increase the number of required computational operations. Although the smallest spatial and temporal scales predominantly govern the flow evolution, in particular interfacial flows or flows with reaction fronts require the highest spatial resolution levels on co-dimension one-manifolds. Therefore, spatial and temporal resolution adaption schemes have been and are being developed as a remedy to reduce computational cost.
As an approach for spatial adaptivity, Harten [15,16] proposed a multiresolution (MR) framework for the spatial discretization of hyperbolic conservation laws to improve the efficiency of existing finite-volume solvers. The concept, which has been further developed to provide full spatial adaptivity [4], relies on a wavelet-based representation of the flow fields. This allows for compressing data to the minimum necessary amount for representing a field function at desired accuracy. Starting from a solution on a coarse mesh, higher-order wavelet coefficients are added successively to achieve the required accuracy using an increasingly finer mesh. In smooth flow regions only few wavelet coefficients may suffice to approximate the solution with the required accuracy on a coarse grid. In regions where the solution has strong local gradients, higherwavenumber wavelets need to be used to represent the solution which results in a locally refined grid. The underlying refinement strategy is based on dyadic refinement, where the cell spacing is halved in each direction for every additional set of wavelet coefficients. Compared to other spatial adaption methods, such as adaptive mesh refinement (AMR), MR techniques result in improved CPU and memory compression rates for the same truncation-error order discretizations of hyperbolic conservation laws [6].
Spatially varying grid resolution affects the temporal integration of the governing equations. The choice of a stable time-step size t from a stability criterion, such as the CFL condition, depends on the local cell sizes. Constant timestepping (CTS) schemes choose the minimum t based on the globally maximum wave speed |u ± c| together with the smallest occurring cell size to ensure stability in the entire domain [10,29]. Although formally and algorithmically simple, this approach is not very efficient as larger cells could use larger time-step sizes.
As a remedy, Osher and Sanders [25] introduced a local time-stepping (LTS) scheme for solving one-dimensional conservation laws on grids with varying spatial cell size. Each level of refinement uses its specific cell size to scale the time-step size t. For dyadic refinement in one dimension, the time step at each successively coarser level l is twice the time-step size of the finer level l + 1, thus the required number of time integration steps are strongly reduced compared to the CTS scheme. The LTS scheme has been combined with the MR approach by Domingues et al. [7][8][9] and Lopes et al. [22]. In their approach, the time-step size is fixed during each full cycle on the coarsest resolution level. At intermediate stages, interpolation is required in cells at resolution jumps. Müller and Stiriba [24] introduced transition zones between coarser and finer cells. This enables updating the time-step size after each full cycle on the finest levels and propagating this change upwards to coarser grid regions. Their scheme has been successfully employed for multi-fluid simulations [5]. However, their LTS algorithm is limited to stepwise grid refinements and cell-flux reconstruction schemes which require two adjacent cells only. Han et al. [12,13] applied the local time-stepping approach on a block-structured multiresolution scheme, with a fixed time-step size during each full cycle on the coarsest resolution level.
Several works [2,18,28] have demonstrated that the LTS approach with MR outperforms the computational efficiency of CTS. The minimum time-step size, however, can be determined only after each full cycle on the coarsest mesh, regardless of the interim local flow evolution on the finest mesh. Hence, the method does not deliver instantaneous time-step size adaptivity [8], and eventually may violate the stability criterion. Flows with strong temporal gradients, i.e. fast physical processes such as collapsing bubbles during cavitation [1] or chemical reactions [2] reveal such a deficiency of the LTS scheme. In the standard LTS approach, stability for such cases can be achieved by choosing a smaller CFL number, thus decreasing overall efficiency. Alternatively, the number of refinement levels can be reduced to allow more frequent timestep size adjustments, which decreases memory compression.
In this work, we present a revised LTS scheme without limitations on the spatial and temporal adaptivity for a blockstructured MR scheme. Projection and prediction operations of Harten [15] are applied on the numerical divergence of the flux vector rather than the state vector of the finite-volume discretization of the governing equations, similar to the approach of Bihari and Harten [3]. By advancing also halo cells of cell blocks in time, time-step size adaptation for different resolutions in different domain regions without additional synchronization on coarser resolution levels is obtained. The time-step size is adapted after each full cycle on the finest level, thus recovering the stability properties of the CTS approach. This is not limited to single-stage time-integration schemes, but also applies to multi-stage time integration schemes, e.g., the twostage second order Runge-Kutta (RK) Total Variation Diminishing (TVD) scheme [11,14]. In addition, the scheme does not constrain spatial discretization. Hence, it can be used with high-order spatial schemes, such as WENO, and does not limit the resolution jumps between neighboring domain regions.
The organization of the paper is the following: in section 2, the adaptive MR scheme for the finite volume (FV) method is reviewed. In section 3, we describe the new adaptive local time-stepping scheme. In section 4, we present simulation results of our fully adaptive scheme for one-and two-dimensional problems. We compare our simulation results with exact solutions and other numerical schemes to discuss the accuracy and stability of the new scheme. In section 5, we conclude our results.

Finite volume representation
We consider a finite-volume representation of hyperbolic conservation laws on a Cartesian grid (x, y, z, t) ∈ × [0, ∞), ⊂ R 3 to evolve the state vector u. The governing equations (1) constitute an initial-boundary value problem with u(x, y, z, 0) = u 0 (x, y, z) and appropriate boundary conditions on the boundary = ∂ . In the finite volume approach, the cell-averaged numerical solution U can be advanced in time by solving the ordinary differential equation (ODE) system dU dt where D(U ) represents the divergence of the evaluated numerical flux function. High-order shock capturing is achieved in this paper with a 5th-order WENO shock-capturing scheme for the reconstruction of the numerical flux function [19]. The ALTS scheme, however, can be applied with every other linear or nonlinear finite-volume or finite-difference scheme. The semi-discrete ODE system (2) is integrated in time using the 2nd-order RK-TVD scheme of Gottlieb and Shu [11]. The solution at time t (n) and t (n+1 with CFL ≤ 1, where c, u i and x i are the local speed of sound, the velocity component in i-direction, and the cell size in i-direction of the Cartesian mesh, respectively [11]. We use the discrete evolution operator as short notation for a single RK step [29]. For the first stage, it is D(

Multiresolution representation
Following the original concept of Harten [15], the cell average solution U is represented by the combination of data on a coarse grid and a series of values on successively finer grids. We use a Cartesian grid with global indices i, j, and k, where each cell can be refined dyadically into 2 D children cells, with D being the number of dimensions. Starting from the parent cell states U l, (i, j,k) at level l, the children at the next-finer level l + 1 are denoted as U l+1,(2i+α,2 j+β,2k+γ ) , α, β, γ ∈ {0, 1}.
A sketch of this dyadic refinement strategy is shown in Fig. 1.
Two basic operations are introduced to exchange data between consecutive levels l and l + 1 [15]. Cell averages at level l are estimated from a finer level l + 1 by the projection operation Q(l + 1 → l) U l+1,(2i+α,2 j+β,2k+γ ) . (7) Note that we formulate the projection operator for 3D data sets. For lower dimensions, the indices k (2D) or j and k (1D) as well as γ and β are omitted. This operation is exact for cell-averaged data.
The prediction operation P(l → l + 1) is used to estimate cell averages at level l + 1 from a coarser level l and is given bŷ This definition shows that knowing the cell average of the coarsest cell and details of all successively finer cells is equivalent to knowing the cell-averaged values of all finer cells. Implicit mesh adaption is achieved by considering details larger than a pre-defined level-dependent threshold ε, only. As details become negligible, i.e. ||d|| < ε, further grid refinement is not necessary, as a finer mesh does not lead to significantly better results than interpolation from a coarser grid.
For efficiency reasons, we follow the approach of Rossinelli et al. [27] and apply P, Q, and all other operations, e.g. the numerical flux estimation, always on a set of multiple cells, which is called a block. The refinement of cells occurs blockwise, too, i.e., all cells of a block are refined as soon as one single cell in the block triggers refinement. A two-dimensional example of the resulting tree structure for a maximum refinement level of l max = 3 is shown in Fig. 2. In this example, we start with one single block on the coarsest level, which is successively refined three times to give the maximum resolution in the region of interest.
The following definitions are introduced for the tree-structure to reflect the relationship between different blocks. Taking When performing E (or other operations) close to the edge of a block, data from neighboring blocks are required. We apply the approach of Han et al. [12] who use halo cells for each block to facilitate these operations without requiring the tree-structure with corresponding level labels, cell sizes, and time-step sizes for a local time-stepping algorithm. In the grid representation and the tree, existing blocks are shown as circles, where empty circles denote further refined blocks and filled circles mark leaves.  [29]. Note that fluxes of successive iterations need to be scaled with the different time-step sizes to ensure conservation. This way, we maintain strict conservation at cell patches between leaves on different resolution levels.

Adaptive local time-stepping scheme
The fixed cell-size ratio of successive refinement levels allows for the efficient formulation of a local time-stepping (LTS) scheme [8]. The level-dependent increment t l is scaled from the finest admissible stepsize at the maximum refinement level t l max for an arbitrary level l, where l max is the maximum number of refinement levels, and t l max the time-step size computed with the cell size of the finest level. This leads to an improved efficiency of the scheme as compared to the CTS approach, where all cells are integrated in time with t l max regardless of their resolution. Drawbacks of the LTS scheme are the necessary time synchronization at resolution jumps and the constant time-step size t l within each full cycle of the coarsest level [8]. Therefore, the time-step size adaption loses efficiency with increasing number of refinement levels. For example, for a setup with l max = 4, t l max is updated only after 2 4 = 16 full cycles on l max . For a setup with l max = 8, the earliest possible adjustment of t l max occurs after 2 8 = 256 full cycles on l max . In case of strong non-linear effects such as wavesteepening or bubble collapse events, this procedure may result in severe stability issues.
In this work, we propose a fully adaptive local time-stepping (ALTS) algorithm for multiresolution simulations to solve the aforementioned limitations. We split the evolution operation E into the evaluation of the divergence of the numerical flux functions F = F (U ), and the time integration T = T U (n) , D(U 1 ), D(U 2 ), t (n) , so that The key step to allow for instantaneous time-step size adaptivity is to apply the projection and prediction operations to the numerical divergence of the flux functions in halo cells at resolution jumps. The halo cells are integrated in time together with the internal cells according to their level-dependent time-step size. This is mathematically equivalent to performing first the time integration of the internal cells followed by the projection and prediction to update the halo cells, which is shown in the following derivation. The projection operation (7) with an Euler-forward time update can be written as The prediction operation (8) yieldŝ The Q s,(n+1) α terms change accordingly, for any component α. For example, α = x can be written as In other words, instead of integrating cells close to resolution jumps in time, we only compute the divergence of cell-face fluxes for both coarser and finer cells at the resolution jump. Then, we exchange them in the halo cells and integrate them in time for the finer block with its level-dependent time-step size. Halo cells at resolution jumps are now at the same time instant as the internal cells, hence we avoid costly synchronization in time which is necessary for the standard LTS scheme. In fact, the temporal integration of the coarser level is postponed until all finer levels reach the same time. This allows for adapting the time-step size after every full cycle on the finest level, and ensures strongly improved time-step size adaptivity.
Our scheme imposes no limits on the number of resolution jumps between two neighboring leaves. Assume two neighboring blocks, with block A being a leaf at level l and block B a leaf at level l + m with m = 2. First, halo values are exchanged at level l, thus the grandparent block of block B has updated halo values. Second, the halo cells are predicted twice, first to the parent block of block B, and from there to block B itself. Third, the predictions in the halo cells at level l + 2 are evaluated 2 2 times until the neighboring levels are synchronous again. This scheme works accordingly for m > 2. Fig. 4 illustrates our new algorithm in comparison to the classical LTS. The basic algorithm works as follows: the divergence of cell-face fluxes on all parent and child levels are computed. They are projected to update divergences of parent blocks in the full tree, exchanged with neighboring blocks in their halo cells, and then predicted to update halo cells at resolution jumps. The evolution is finished by finally performing the integration in time on the target level. In short notation, the full step is abbreviated with the updated evolution operator Following Domingues et al. [8], the three basic steps to evolve the numerical solution from U (n) to U (n+1) can be expressed as 1.Refinement : Here R is the refinement operator and C(ε) the coarsening operator with the coarsening threshold ε.
The main aspects of one time-step iteration in this scheme are: 1. Each macro cycle extends from t (n) to t (n+2 lmax ) .

Each sub
4. In the first cycle of each macro time step, i.e. cycle (0, 0), we compute the divergence of the fluxes (F ) for the first RK stage for all levels. We then perform the projection and prediction of the divergence to update halo cells at resolution jumps (Q, P). 5. In each following sub-cycle (τ , ς), leaves at all levels l which fulfill the constraint (τ + ς) mod 2 l max −l = 0 are integrated in time (T ). Afterwards, cells at the parent levels are updated through cell average projection (Q). Then, flux divergences are computed (F ), projection is performed, and halo cells at jump conditions are filled by prediction (Q, P). 6. In each sub-cycle, refinement (R) and coarsening (C) can be performed after time integration (T ). The condition for refining level l again is (τ + ς) mod 2 l max −l = 0, i.e. this level was integrated in this sub-cycle. For coarsening, the more restrictive condition (τ + ς) mod 2 l max −(l−1) = 0 is applied. Note that a leaf at level l can only be removed when also the parent level was updated in time during this iteration. This is required to allow for a consistent computation of the details as refinement criterion.
7. The last sub-cycle (2 l max − 1, ς max − 1) ends with the time integration of the second RK stage for all levels.
The scheme allows an update of t τ l max during each full cycle at the finest level. The update ensures that the CFL-condition is fulfilled at every level at all times τ , which is demonstrated in the following, for simplicity, for the two finest levels l max and l max − 1. The maximum wave-speed ||u i ± c|| τ ∞ is computed at all levels from the last completed full RK step and is used to determine the time-step size t τ l max from the CFL-condition (5), giving t τ l max . The timestep size t τ +1 l max is computed analogously from the maximum wave-speed ||u i ± c|| τ +1 ∞ . The time-step size at level l max − 1 is t τ We analyze at which level the maximum wave-speed ||u i ± c|| τ ∞ occurs during two time steps on the finest level to estimate whether the time-step size at levels l max and l max − 1 is within the bounds of the CFL-criterion: • ||u i ± c|| τ ∞ = ||u i ± c|| τ ∞,l max > ||u i ± c|| τ ∞,l max −1 and ||u i ± c|| τ +1 ∞ = ||u i ± c|| τ +1 ∞,l max > ||u i ± c|| τ ∞,l max −1 : the maximum wavespeed in the domain occurs for both time steps at level l max , the maximum admitted time-step size at level l max − 1 therefore is larger than t τ . Hence, the CFL-condition is fulfilled at l max − 1 and l max .
: the maximum wavespeed in the domain occurs for both time steps at level l max − 1, therefore t τ . Hence, the CFL-condition is fulfilled at l max − 1 and l max .
and ||u i ± c|| τ +1 ∞ = ||u i ± c|| τ +1 ∞,l max > ||u i ± c|| τ ∞,l max −1 : the level of the maximum wave-speed changes after the first time step. For level l max − 1, t τ l max −1 = 2 · t τ l max satisfies the CFL-condition l max also satisfies the CFL-condition at both levels l max − 1 and l max .
Thus, the ALTS scheme provides a CFL-number which globally is always within the bounds of the prescribed stability criterion for every RK cycle.

Examples
The following examples illustrate the accuracy, improved stability, and efficiency of the proposed scheme. Grid adaptation is employed using the MR analysis with level-and dimension-dependent error levels with the number of dimensions D. Unless stated otherwise, we use ε ref = 0.01 for the refinement from l max − 1 to l max , and each simulation is performed with one block at level 0 using 16 internal cells in each spatial direction.

Accuracy-order analysis
Two one-dimensional test cases are presented to demonstrate the accuracy of the underlying methods of our adaptive local time-stepping scheme. First, we solve the linear scalar advection equation govern the problem, where ρ is the density, u the velocity, p the pressure, and e the specific energy. The system is closed by the stiffened-gas equation-of-state with the material parameters γ for the ratio of specific heats and p ∞ for the background pressure. Here, we use γ = 1.4 and p ∞ = 0. Initial conditions for velocity and density fields are obtained from the Rankine-Hugoniot conditions for an ideal gas (γ = 1.4, p ∞ = 0). The problem is advanced to a final time of t = 0.2, when the wave has steepened already but still is continuous. For both cases, we present convergence rates for the spatial scheme (WENO 5), the standard RK2-TVD time-integration scheme, and the RK3-TVD time-integration scheme to demonstrate the applicability to higher-order explicit time-integration schemes. The simulations for the spatial error are run with CFL = 10 −3 to minimize the temporal error, and ε ref = 0.001 for a refinement from level 0 to level 1. The simulations for the temporal error are computed on a mesh with maximum level l max = 4.
Error plots of the L 1 norm of the density field are given in Fig. 5 for the two cases, numeric values are given in Table 1. The convergence rates of both the linear and the non-linear examples agree well with the values expected from theory. Only close to CFL= 1.0, the convergence rate of RK2 deviates from the expected value. Note that for RK3 the second-order treatment at resolution jumps [7,22] reduces the global order of the convergence to slightly less than three.

Table 1
The spatial and temporal accuracy of the linear sine-wave advection and non-linear sinewave steepening for RK2-TVD, RK3-TVD, and WENO5 schemes.

One-dimensional examples
Two one-dimensional testcases with strong discontinuities and therefore strong temporal gradients in the wave-speeds are considered. The simulations are again governed by the compressible Euler equations (16).
The first setup is the inviscid shock-tube problem of Sod [30]. The initial conditions for a domain of length 1.0 are The domain is refined to l max = 7 and the final simulation time is t = 0.2.
The second one-dimensional case is the two blast-waves problem of Woodward and Colella [31]. The initial conditions are The domain is refined to l max = 7 and the final simulation time is t = 0.038.
Results for both 1D test cases are shown in Fig. 6. The applied CFL-number is CFL= 1.0. The standard local time stepping cannot resolve the wave dynamics for this high CFL-number and becomes unstable within the first macro time step in both cases. The ALTS scheme allows for stable simulations in both cases. The results show good agreement with the ones from an equally-spaced grid without adaptive mesh. This underlines that the ALTS scheme shows the same temporal stability behavior as an homogeneous grid without local spatial adaptation, which is not the case for the classical LTS scheme.

Two-dimensional example
As two-dimensional test case we present the non-symmetric shock-induced collapse of a bubble near a rigid wall [20]. This problem is of practical relevance in a range of applications, e.g. shock-induced lithotripsy [21]. The wave-speed increases strongly during the bubble collapse, therefore instant adaption of the time-step size is required. Initial conditions for an axisymmetric domain with the bubble radius R 0 = 50 μm, x 0 = 0, and y 0 = 0.7 mm. This leads to a stand-off distance of 2R 0 of the bubble center from the wall. The bubble is resolved with 256 cells per initial radius on l max , and we use CFL= 0.9 for the simulation. A sketch of this setup, which was investigated by Johnsen and Colonius [20], is shown in Fig. 7. We consider the compressible Euler equations in axisymmetric formulation and employ a level-set approach for multi-phase flows. The system is closed by the complete stiffened-gas equation-of-state where E ∞ is the energy translation factor. We use γ = 4.4, p ∞ = 6.0 × 10 8 , E ∞ = 7.456 × 10 6 for water, and γ = 1.4, p ∞ = 0, E ∞ = 0 for air [17,23].
The pressure and axial velocity fields are shown in Fig. 8 at t = 2.47 ms and t = 2.48 ms for both schemes. At t = 2.47 ms, the bubble is already strongly deformed. The maximum pressure regions are slightly shifted from the center axis, and coincide with the location where the bubble is thinnest (a). The high-speed water-jet has formed in the inner part of the bubble (b). For this instant, both schemes exhibit the same results. At t = 2.48 ms, the jet strikes the downstream bubble wall, and the bubble breaks up at the location of the maximum pressure in the previous instant. The jet accelerates even further, and a strong shock wave can be observed propagating away from the bubble (c, d). This leads to a strong increase in the wave speed, so the time-step size needs to be adapted immediately to satisfy the stability criterion. The LTS scheme cannot provide this immediate adaption, and spurious oscillations occur at the wave front of the water hammer.
These oscillations indicate the onset of nonlinear numerical instability near the wave front. The ALTS provides immediate time-step size adaption, and does not exhibit such spurious oscillations.   Johnsen and Colonius [20] have proposed an empirical analytical profile for the localized maximum wall pressure as function of the circular off-center distance along the wall Here, c 1 and c 2 are two constants, which are determined from two arbitrary pressure profiles of the temporal evolution, and H c is the stand-off distance at collapse time. For the given case, H c = 1.5, c 1 = 0.34, and c 2 = 0.025. In Fig. 9, a series of radial wall-pressure profiles at several time instants is shown together with the asymptotical maximum wallpressure function eq. (24). The time difference between two subsequent radial profiles is 20 μs, and the first profile shows the pressure profile at t = 2.80 ms. Due to the radial spreading, the maximum pressure decreases in time. The overall maximum pressure at the wall infact occurs not at the centerline, but is shifted slightly outwards as the collapsing bubble shields the wall from the incoming pressure wave. This maximum pressure is slightly larger than the prediction of Johnsen and Colonius [20], which was already found earlier for this model configuration [1].

Performance estimation
Finally, we analyze the additional cost due to computational operations which were introduced for the ALTS scheme: the projection and prediction of flux functions and the temporal integration of halo cells. We analyze the results for the one-dimensional single-fluid shock-tube simulation (l max = 7) and the two-dimensional multi-fluid shock-bubble interaction (l max = 8) presented in sections 4.2 and 4.3. We have performed our numerical tests on a homogeneous cluster of Intel Xeon E5-2697 v3 "Haswell" CPUs with 28 cores at 2.7 GHz and 2.3 GB memory. One-dimensional single-fluid simulations are executed sequentially, on one core only. Two-dimensional multi-fluid results are run in parallel on all 28 cores using 28 MPI ranks. Operations of a full Runge-Kutta cycle are split into three main categories: the update of the time-step size, additional operations that were introduced with the ALTS algorithm (ALTS related), and operations that are required for both LTS and ALTS schemes (common operations). We compute for each category its share of the compute time over several macro time steps. Results are given in Table 2.
The share of each category varies with the simulation type: the time step size update and ALTS-related operations account for approximately 16% of computational cost in the one-dimensional single-phase simulation, and for approximately 7% in the two-dimensional multi-phase simulation. We conclude that the additional cost has no significant adverse impact on the overall computing efficiency.

Conclusion and outlook
We have presented an adaptive local time-stepping scheme for a block-based multiresolution algorithm. The key improvement of the new approach is the projection and prediction of the right-hand-side terms of the governing equations and additional halo-cell time integration. The benefit of the improved time-stepping scheme is full adaptivity of the numerical time-step size down to blocks with highest resolution and, thus, improved stability for larger CFL numbers.
The ALTS scheme is applied to various one-and two-dimensional examples with strong velocity gradients to demonstrate its advantages. Even for large CFL numbers up to CFL = 1.0, our approach allows for stable and accurate simulations. The scheme is not limited to single-phase simulations, but can also be used for multiple phases without constraints on, e.g., the variation of the wave speed across the material interface.
Despite the fact that additional operations are required during each iteration (projection and prediction of the numerical divergence of the flux function, time integration of halo cells), the additional work of the ALTS scheme in comparison to the LTS scheme is negligible for multi-dimensional realistic flow problems. The numerical stability is improved as compared to the classical LTS scheme. Overall, the stability behavior of the constant time-stepping scheme is recovered, which was lost with the introduction of the LTS scheme.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.