Sparse Grid Discontinuous Galerkin Methods for the Vlasov-Maxwell System

In this paper, we develop sparse grid discontinuous Galerkin (DG) schemes for the Vlasov-Maxwell (VM) equations. The VM system is a fundamental kinetic model in plasma physics, and its numerical computations are quite demanding, due to its intrinsic high-dimensionality and the need to retain many properties of the physical solutions. To break the curse of dimensionality, we consider the sparse grid DG methods that were recently developed in \cite{guo2016sparse,guo2017adaptive} for transport equations. Such methods are based on multiwavelets on tensorized nested grids and can significantly reduce the numbers of degrees of freedom. We formulate two versions of the schemes: sparse grid DG and adaptive sparse grid DG methods for the VM system. Their key properties and implementation details are discussed. Accuracy and robustness are demonstrated by numerical tests, with emphasis on comparison of the performance of the two methods, as well as with their full grid counterparts.


Introduction
The Vlasov-Maxwell (VM) system is a fundamental model in plasma physics for describing the dynamics of collisionless magnetized plasmas, which finds diverse applications in science and engineering, including thermo-nuclear fusion, satellite amplifiers, high-power microwave generation, to name a few. In this paper, we study the VM system that describes the evolution of a single species of nonrelativistic electrons under the self-consistent electromagnetic field while the ions are treated as uniform fixed background. Under the scaling of the characteristic time by the inverse of the plasma frequency ω −1 p , length by the Debye length λ D , and electric and magnetic fields by −mcω p /e (with m the electron mass, c the speed of light, and e the electron charge), the dimensionless form of the VM system is with where the equations are defined on Ω = Ω x × Ω ξ . x ∈ Ω x denotes position in physical space, and ξ ∈ Ω ξ , which is the velocity space. Here f (x, ξ, t) ≥ 0 is the distribution function of electrons at position x with velocity ξ at time t, E(x, t) is the electric field, B(x, t) is the magnetic field, ρ(x, t) is the electron charge density, and J(x, t) is the current density. The charge density of background ions is denoted by ρ i , which is chosen to satisfy total charge neutrality, Ωx (ρ(x, t) − ρ i ) dx = 0. Ideally Ω ξ = R d ξ , however numerical computation requires a truncation of the space Ω ξ and the assumption that f is compactly supported on Ω ξ . In this paper, for simplicity, we will assume Ω x , Ω ξ to be box-shaped domains. The simulations of VM systems are quite challenging. Particle-in-cell (PIC) methods [5,20] have long been very popular numerical tools, mainly because they can generate reasonable results with relatively low computational cost. However, as a Monte-Carlo type approach, the PIC methods are known to suffer from the statistical noise, which is O(N − 1 2 ) with N being the number of sampling particles. Such an inherent low order error of PIC methods prevents accurate description of physics of interest, when, for instance, the tail of the distribution function needs to be resolved accurately. In recent years, there has been growing interest in deterministic simulations of the Vlasov equation. In the deterministic framework, the schemes are free of statistical noise and hence able to generate highly accurate results in phase space. In the context of VM simulations, Califano et al. have used a semi-Lagrangian approach to compute the streaming Weibel (SW) instability [10], current filamentation instability [24], magnetic vortices [9], magnetic reconnection [8]. Also, various methods have been proposed for the relativistic VM system [27,4,28,21]. In this paper, we are interested in a class of successful deterministic Vlasov solvers based on the discontinuous Galerkin (DG) finite element discretization, because of not only their provable convergence and accommodation for adaptivity and parallel implementations, but also their excellent conservation property and superior performance in long time wave-like simulations. Those distinguishing properties of DG methods are very much desired for the VM simulations, and they have been previously employed to solve VM system [13,12] and the relativistic VM system [31]. However, due to the curse of dimensionality, traditional deterministic approaches including the DG methods are not competitive for high dimensional Vlasov simulations, even with the aid of high performance computing systems.
To break the curse of dimensionality, this paper will focus on the sparse grid approach. The sparse grid method [7,15] has long been an effective numerical tool to reduce the degrees of freedom for high-dimensional grid based methods. In the context of wavelets or sparse grid methods for kinetic transport equations, we mention the work of using wavelet-MRA methods for Vlasov equations [3], the combination technique for linear gyrokinetics [23], sparse adaptive finite element method [30], sparse discrete ordinates method [16] and sparse tensor spherical harmonics [17] for radiative transfer, among many others. In [29,18], a class of sparse grid DG schemes were proposed for solving high-dimensional partial differential equations (PDEs) based on a novel sparse DG finite element approximation space. Such a sparse grid space can be regarded as a proper truncation of the standard tenor approximation space, which reduces degrees of freedom of from , where h is the uniform mesh size in each direction and d is the dimension of the problem. Coupling the DG weak formulation with the sparse grid space, the resulting sparse grid DG method is demonstrated to save computational and storage cost without deteriorating the approximation quality by much. In particular, when applied to a d-dimensional linear transport equation, the scheme is proven to be L 2 stable and convergent with rate O(| log 2 h| d h k+1/2 ) in L 2 norm for a smooth enough solution, where k is the degree of polynomials [18]. Motivated by the development of sparse grid DG method [18] and the adaptive multiresolution DG method [19] for transport equations, it is of interest to this paper to develop sparse grid DG methods for solving the VM system. The proposed methods are well suited for VM simulations, due to their ability to handle high dimensional convection dominated problems, the ability to capture the main structures of the solution with feasible computational resource and the overall good performance in conservation of physical quantities in long time simulations.
The rest of the paper is organized as follows: in Section 2, we describe the numerical algorithms, outlining the schemes as well as their key properties. Section 3 is devoted to discussions of simulation results. We conclude the paper in Section 4.

Numerical methods
In this section, we describe two sparse grid DG methods for the VM system: the standard sparse grid DG method and the adaptive sparse grid DG method. We first review the finite element space on sparse grid introduced in [29], and then describe the details of the schemes when applied to the VM system.

DG finite element space on sparse grid
In this subsection, we review the notations of DG finite element space on sparse grid. First, we introduce the hierarchical decomposition of piecewise polynomial space in one dimension. Without loss of generality, consider the interval [0, 1], we define a set of nested grids, where the l-th level grid Ω l consists of 2 l uniform cells I j l = (2 −l j, 2 −l (j + 1)], j = 0, . . . , 2 l − 1, for any l ≥ 0. The nested grids result in the nested piecewise polynomial spaces. In particular, let V k l := {v : v ∈ P k (I j l ), ∀ j = 0, . . . , 2 l − 1} 3 be the usual piecewise polynomials of degree at most k on the l-th level grid Ω l . Then, we have We can now define the multiwavelet subspace W k l , l = 1, 2, . . . as the orthogonal complement of V k l−1 in V k l with respect to the L 2 inner product on [0, 1], i.e., Note that the space W k l , for l > 0, represents the finer level details when the grid is refined and this is the key to the reduction of the degrees of freedom in higher dimensions. We further let W k 0 := V k 0 , which is standard piecewise polynomial space of degree k on [0, 1]. Therefore, we have found a hierarchical representation of the standard piecewise polynomial space V k l on Ω l as V k l = 0≤j≤l W k j . In [29], we used the orthonormal basis functions for space W k l , which are constructed based on the one-dimensional orthonormal multiwavelet bases first introduced in [1]. We refer readers to [1] and [29] for more details. We now denote the basis functions in level l as v j i,l (x), i = 1, . . . , k + 1, j = 0, . . . , 2 l−1 − 1, and they are orthonormal, i.e.
By making use of the multi-index notation, we indicate by l = (l 1 , · · · , l d ) ∈ N d 0 the mesh level in a multivariate sense, where N 0 denotes non-negative integers. We consider the tensorproduct rectangular grid Ω l = Ω l 1 ⊗ · · · ⊗ Ω l d with mesh size h l = (h l 1 , · · · , h l d ). Based on the grid Ω l , an elementary cell is denoted by is the tensor-product piecewise polynomial space, where Q k (I j l ) denotes polynomials of degree up to k in each dimension on cell I j l . If l 1 = . . . = l d = N , the grid and space will be denoted by Ω N and V k N , respectively. For the increment space W k l = W k l 1 ,x 1 × · · · × W k l d ,x d , the orthonormal basis functions can be defined as where v jm im,lm (x m ) denote orthonormal bases in m-th dimension defined in one-dimensional case. With the one-dimensional hierarchical decomposition, we have The sparse finite element approximation space on mesh Ω N we use in this paper, on the other hand, is defined by [29,18] This is a subset of V k N (Ω), and its number of degrees of freedom scales as O((k , where h N = 2 −N denotes the finest mesh size in each direction [29]. This is significantly less than that of V k N (Ω) with O((k + 1) d h −d N ) number of elements when d is large. The sparse grid DG scheme in Section 2.2 is defined using this space. The adaptive sparse grid DG scheme, however, relies on an adaptive choice of the space, and will be discussed in details in Section 2.3.

The sparse grid DG scheme
Below, we formulate a DG scheme in the sparse finite element space for the VM system (1a)-(1b) inspired by [13]. On the PDE level, the two equations in (1c) involving the divergence of the magnetic and electric fields can be derived from the remaining part of the VM system. However, how to impose (1c) numerically is an important and nontrivial issue [25,2,22]. This will be studied in our future work.
Using the notations introduced in the previous subsections, and let d x and d ξ be the dimension of Ω x and Ω ξ , respectively, we consider the partitions of the domain Ω into mesh level N in all directions. We distinguish between the x-and ξ-directions. Let I j x,N , 0 ≤ j m ≤ 2 N − 1, ∀ m = 1, . . . , d x and I j ξ,N , 0 ≤ j m ≤ 2 N − 1, ∀ m = 1, . . . , d ξ be the collection of all elements in Ω x and Ω ξ , respectively. Let E x be the union of the edges for all elements in I j x,N , similarly let E ξ be the union of the edges for all elements in I j ξ,N . Furthermore, with E i ξ and E b ξ being the union of interior and boundary edges of I j ξ,N , respectively. For piecewise functions, we further introduce the jumps and averages as follows. Suppose T + and T − are two elements in I j x,N . For any edge e = {T + ∩ T − } ∈ E x , with n ± x as the outward unit normal to ∂T ± , g ± = g| ∂T ± , and U ± = U| ∂T ± , the jumps across e are defined as x and the averages are When considering periodic boundary conditions, the jumps and averages can be naturally defined on the boundary edges.
By replacing the subscript x with ξ, the jumps and averages can be defined similarly for the ξ-direction. For a boundary edge e ∈ E b ξ with n ξ being the outward unit normal, we use This is consistent with the fact that the exact solution f is assumed to be compactly supported in ξ-domain.
We are now ready to describe the scheme. The sparse discrete spaces on Ω and Ω x we use are defined asĜ Following [18], the semi-discrete DG methods for the VM system are: All "hat" functions are numerical fluxes. For the Vlasov part, we adopt the global Lax-Friedrichs flux: where where the maximum is taken for all possible x and ξ at time t in the computational domain. For the Maxwell part, we use the upwind flux or the alternating fluxes Previous studies in [12,13] have demonstrated the importance of numerical flux on the quality of DG simulations. It was understood that a dissipative flux choice is desired for 6 the Vlasov equation. The alternating and upwind fluxes are both optimal in order for the Maxwell solver, but alternating flux is energy-conserving for the Maxwell's equation, while upwind flux is not. We will not consider the central flux based on the recommendations made in [13].
Below, we will summarize the conservation and stability properties of the semi-discrete scheme. The proof is very similar to those in [13] and is thus omitted. where for any T > 0, the following holds: While for the scheme with alternating fluxes (7) for the Maxwell part, we have In the theorems above, terms like Θ h,1 (t), Θ h,3 (t) come from numerical errors of the velocity boundary truncation from an infinite to a finite domain. If those terms are negligible, i.e. when Ω ξ is chosen sufficiently large, we can conclude that the scheme is mass-conservative, and energy-conservative if the alternating flux is used for the Maxwell solver.
As for time, we use the total variation diminishing Runge-Kutta (TVD-RK) methods [26] to solve the ordinary differential equations resulting from the discretization (3a)-(3c), denoted by (u h ) t = R(u h ). In particular, we use the following third-order TVD-RK method in this paper Finally, we mention some details about implementation. A key to computational efficiency is the efficient evaluation of multidimensional integrations. To compute multidimensional integration, we apply the unidirectional principle. For example, if we want to evaluate Based on the hierarchical structure of the basis functions, we only need some small overhead to compute one-dimensional integrals and assemble them to obtain the multi-dimensional integrations. In (3a)-(3c), the numerical integrations with coefficients ξ and E h , B h (which belong toÛ k h ) can be done very efficiently because they can all be computed using this trick. This procedure is performed one time before time evolution starts, and the sparsity of the matrices due to the multiwavelet basis structures is utilized to accelerate the computation [29].

The adaptive sparse grid DG scheme
In this subsection, we will describe the adaptive sparse grid DG (also called adaptive multiresolution DG) method [19] for the VM system. The motivation to study the adaptive scheme is to offer better numerical resolutions for the probability density function. It is known that the solution to Vlasov equation develops filamentation, therefore the standard sparse grid method can not offer enough resolution when t becomes large (see [19] for comparison in the case of the Vlasov-Poisson system). Therefore, in this paper, we also consider the adaptive scheme.
The main idea of the algorithm is not to useV k N (Ω) in a pre-determined fashion, but rather to choose a subspace of V k N (Ω) adaptively. In this work, adaptivity is only implemented for f h , not for the lower-dimensional variables E h , B h , which are computed using the full finite element space. This will not cause much additional cost because the Maxwell's equations are in lower dimension than the Vlasov equation. If the computational and storage cost is a big concern, then the adaptive strategy can be potentially applied to the Maxwell's part as well.
The algorithm is described in details below. Given a maximum mesh level N and an accuracy threshold ε > 0, we first use the adaptive multiresolution projection algorithm in [19] to get the numerical initial condition f h (x, ξ) for the DG scheme. When the adaptive projection algorithm completes, it will generate a hash table H, leaf table L and . We denote the approximation space V k N,H (Ω) = span{v j i,l ∈ H} and it is a subspace of V k N (Ω). On the other hand, we compute the initializations of . Then we begin the time evolution algorithm which consists of several key steps. The first step is the prediction step, which means that, given the hash table H that stores the numerical solution f h at time step t n and the associated leaf table L, we need to predict the location where the details becomes significant at the next time step t n+1 , then add more elements in order to capture the fine structures. We solve for f h ∈ V k N,H (Ω) from t n to t n+1 based on the solutions E n h and B n h at time t n . The forward Euler discretization is used as the time integrator in this step and we denote the predicted solution at t n+1 by f indicating that such an element becomes significant at the next time step, then we need to add its children elements to H and L provided they are not added yet, and set the associated detail coefficients to zero. We also need to make sure that all the parent elements of the newly added element are in H (i.e., no "hole" is allowed in the hash table) and increase the number of children for all its parent elements by one. This step generates the updated hash table H (p) and leaf table L (p) . Note that (11) corresponds to L 2 -norm based refinement criteria in [19]. Based on the updated hash table H (p) , the third step is to evolve the numerical solution f h by the DG weak formulation with space V k N,H (p) (Ω). Namely, we solve for V k N,H (p) (Ω) from t n to t n+1 , by the TVD-RK scheme (10) to generate the pre-coarsened numerical solutioñ f n+1 h . Meanwhile, we also evolve the numerical solutions E h and B h from t n to t n+1 with space V k N (Ω x ). The last step is to coarsen the mesh by removing elements that become insignificant at time level t n+1 . The hash table H (p) that stores the numerical solutionf n+1 h is recursively coarsened by the following procedure. The leaf table L (p) is traversed, and if an element I j l ∈ L (p) satisfies the coarsening criterion where η is a prescribed error constant, then we remove the element from both table L (p) and H (p) , and set the associated coefficients f j i,l = 0, 1 ≤ i ≤ k + 1. For each of its parent elements in table H (p) , we decrease the number of children by one. If the number becomes zero, i.e, the element has no child any more, then it is added to the leaf table L (p) accordingly. Repeat the coarsening procedure until no element can be removed from the table L (p) . The output of this coarsening procedure are the updated hash table and leaf table, denoted by H and L respectively, and the compressed numerical solution f n+1 h ∈ V k N,H . In practice, η is chosen to be smaller than ε for safety. In the simulations presented in this paper, we use η = ε/10. For the adaptive sparse grid DG scheme, the properties of conservation of moments are not as clear as the sparse grid DG schemes, and will be subject to the error thresholds η, ε [19].

Numerical results
In this section, we use the streaming Weibel (SW) instability to benchmark the performance of the proposed schemes for simulating the VM system. The SW instability and related problems have been previously considered both analytically and numerically [10,13,12]. In this case, a reduced version of the single species VM system with one spatial variable x 2 , and two velocity variables ξ 1 , ξ 2 is considered: where Here, f = f (x 2 , ξ 1 , ξ 2 , t) is the distribution function of electrons, E = (E 1 (x 2 , t), E 2 (x 2 , t), 0) is the 2D electric field, and B = (0, 0, B 3 (x 2 , t)) is the 1D magnetic field. Ions are assumed to form a constant background. The initial condition is given by where b denotes the amplitude of the initial perturbation to the magnetic field, β 1/2 is the thermal velocity, which is take to be β = 0.01, and δ is a parameter measuring the symmetry of the electron beams. Note that when b = 0, this initial condition is an equilibrium state representing counter-streaming beams propagating perpendicular to the direction of inhomogeneity. As in [10], the instability can be triggered by taking b = 0.001 as a perturbation of the initial magnetic field. The computational domain is chosen to be Ω x = [0, L y ], where L y = 2π/k 0 and Ω ξ = [−1.2, 1.2] 2 . We consider two sets of parameters as in [10] choice 1 : δ = 0.5, v 0,1 = v 0,2 = 0.3, k 0 = 0.2; choice 2 : δ = 1/6, v 0,1 = 0.5, v 0,2 = 0.1, k 0 = 0.2, which lead to initially symmetric and strongly non-symmetric counter-streaming electron beams, respectively. For all numerical simulations, unless otherwise noted, the time step ∆t is chosen according to the mesh on the most refined level, i.e.

∆t = CFL
where c m is the maximum wave propagation speed in m-th direction, and we take CFL = 0.1.

Accuracy test
It is well-known that the VM system is time reversible, and such property provides practical means to test accuracy for VM solvers. |Ω| for B: where f h , E 1,h , E 2,h , B 3,h are the numerical approximations to the exact solutions f, E 1 , E 2 , B 3 . Sparse grid DG method: We test accuracy for the sparse grid DG method with k = 1, 2, 3 on different levels of meshes. For k = 3, we take ∆t = O(h 4/3 N ) to match the temporal and spatial orders in the convergence study. The L 2 errors and orders of the numerical solutions with upwind and alternating fluxes for parameter choice 1 are reported in Tables 1 and 2. Similar to [18], we observe at least (k + 1 2 )-th order accuracy for both fluxes. When comparing the results of the two tables, we find that the errors are identical for f , while there are some slight differences in the errors of the electromagnetic fields. This is expected because the solvers only differ in the discretization of the Maxwell's equation.
Adaptive sparse grid DG method: We then perform accuracy test for the adaptive sparse grid DG scheme. As in [19], we run the simulations with a fixed maximum mesh level N = 7 and different ε values, and report the L 2 errors and the number of active degrees of  convergence rate with respect to the error threshold R ε = log(e l−1 /e l ) log(ε l−1 /ε l ) , convergence rate with respect to degrees of freedom R DOF = log(e l−1 /e l ) log(DOF l /DOF l−1 ) , where e l is the L 2 error defined in (18) with refinement parameter ε l , and DOF l is the associated number of active degrees of freedom at final time. The two widely used convergence rates above provide important measurement of accuracy and effectiveness of adaptive schemes [6,19]. R demonstrates how much the numerical error is reduced when one picks a smaller error threshold, while R DOF reveals the relation between the numerical error and the active degrees of freedom. For comparison purposes, recall the DG schemes constructed by the tensor product full grid yields R ≈ 1 and R DOF ≈ k+1 d . Our numerical results are summarized in Tables 3 and 4 for the parameter choices 1 and 2 with the upwind flux for the Maxwell's equation. The results computed by the alternating fluxes are very similar, and are omitted. We observe that for both test cases, rate R is slightly smaller than 1, and R DOF is much larger than k+1 d but still smaller than k + 1 for f. This demonstrates the effectiveness as well as the computational savings of the adaptive scheme compared with the non-adaptive ones. We also observe the error saturation for B and E, i.e. when we reduce ε, unlike for f , the errors for B and E do not decrease. The error saturation can be ascribed to the fact that Maxwell's equations are solved on the fixed finest level mesh, and hence B and E are fully resolved by the scheme in this case. This is also evident from the magnitude of the errors for B and E, which is much smaller than the parameters.
In summary, the proposed sparse grid and adaptive sparse grid schemes are able to simulate the VM system with smooth solutions with desired orders of accuracy. Both schemes require much less degrees of freedom compared with their full grid counterpart in [13].   Table 4: L 2 errors and orders of accuracy for the adaptive sparse grid with parameter choice 2. Upwind flux for Maxwell's equations. Run to T = 1.0 and back to T = 2.0. N = 7.

Numerical results for long time simulations
Here, we are concerned with the performance of the methods in long time, which are often required for plasma simulations. Most simulation results presented below are compared with the more expensive full grid DG method in [13]. The scaled macroscopic quantities under considerations are defined as follows.
where K 1 , K 2 are the scaled kinetic energies in each direction, E 1 , E 2 are the scaled electric energies in each direction, and B 3 is the scaled magnetic energy. Henceforth, the scaled kinetic and electric energies are the summation of the corresponding components in each direction, and the scaled total energy is defined as the summation of K 1 , K 2 , E 1 , E 2 , B 3 . The scaled momentum P 1 and P 2 are where the first term in each expression represents the momentum in particles while the second represents that of the electromagnetic field. First, we consider the SW instability with parameter choice 1 up to t = 100. We take N = 8, k = 3 for the sparse grid scheme and N = 6, k = 3, ε = 2 × 10 −7 for the adaptive sparse grid scheme. We measure the errors in the conserved macroscopic quantities for both schemes. The time evolution of relative errors of mass (charge), total energy, and errors of momentum with the upwind flux are plotted in Figures 1 and 2. When t ≤ 75, our scheme can conserve mass, total energy and momentum very well. For longer time period, the errors start to accumulate, particularly for mass, total energy and the momentum P 1 . We notice that the largest relative error in mass is on the order of 10 −4 and 10 −8 for sparse grid and adaptive sparse grids respectively. Compared with the results in [13] for traditional RKDG method, the error in conservation is much larger. According to the analysis performed in Section 2, this is primarily due to the boundary effect at the velocity domain. Such errors can be reduced by enlarging Ω ξ or enhancing the resolution further. The adaptive sparse grid scheme offers better resolution (which is also validated by later plots in this section), therefore the contributions from the domain boundary is smaller, and the simulation retains better conservation in all quantities. As for total energy, the largest relative error is on the order of 10 −3 and 10 −5 for sparse grid and adaptive sparse grids respectively. We also performed simulations based on alternating flux for the Maxwell part. The results show no difference. This implies that the major contribution of error comes from the velocity domain boundary cutoff rather than the jump terms in the field, i.e. Θ h,2 (t) is much smaller than Θ h,3 (t) terms in Theorem 2.2. In fact, we find that there is no significant difference in performance for the two choices of fluxes for long time simulations. Therefore, for brevity, we only report numerical results for the scheme using the upwind numerical flux for the Maxwell DG solver from now on.
The time evolution of kinetic, electric and magnetic energies are presented in Figure 3. The results from both simulations are very close to [13]. The transfer of kinetic energy from one component to the other is evident and there is a small decay which is converted into the field energy. The magnetic and inductive electric field grow initially at a linear growth rate. For t ≥ 70, kinetic effects come into play and the instability saturates. The magnetic energy comes statistically constant, while the electric energy reaches its maximum value at saturation and then starts to decrease.
In Figure 4, we present the first four Log Fourier modes of E 1 , E 2 , B 3 . Here, the n-th Log Fourier mode for a function W (x, t) is defined as Here, k = k 0 = 0.2. The first and third Log Fourier modes of E 1 and B 3 as well as the second and fourth Log Fourier modes of E 2 agree relatively well with [13]. For the remaining less significant modes, our schemes produce larger magnitudes than those in [13].
In Figure 5, we further present the contour plots of distribution function f at x 2 = 0.05π at several times which are consistent with those plotted for the density in Figure 6. At later times, considerable fine structure generates in agreement with the Log Fourier plots. We can also observe that the figures obtained by both schemes are in agreement with those in [13] at t = 55 and t = 82. At later times, e.g. t = 100, our scheme produces more filamented structures than those in [13]. This is due to the poorer resolution in our approximation spaces. At such time period, the adaptive sparse grid scheme clearly offers better resolution. In Figure 6, we plot the density modulation, containing the expected spikes for the SW instability at selected time t. At t = 55 and t = 82, the results agree overall with the calculations in [13]. At t = 100, the density shows visible fluctuations comparing with those in [13].
To verify the effectiveness of the adaptive scheme, in Figure 7, we show the percentage of active elements for each incremental space W l , l = (l 1 , l 2 , l 3 ) in the adaptive scheme. If the percentage is 1, it means all elements on that level are active. If the percentage is 0, it means all elements on that level are deactivated. A full grid approximation corresponds to percentage being 1 on all levels, while the sparse grid approximation corresponds to percentage being 1 when |l| 1 ≤ N, and 0 otherwise. We observe that the number of active elements in the adaptive approximation space is increasing with time. This is consistent with the fact that more fine structures are generated at late times due to filamentation. Therefore more elements are needed to capture the fine structures than previous times. That also explains why the adaptive scheme produces better numerical results than the standard sparse grid scheme. In particular, we count the total numbers of active elements at t = 0, 55, 82, 100, which are 1924, 11556, 69750, 137406, respectively. The full grid space, on the other hand, has a total of 262144 elements. The percentage of active elements, therefore grows from 0.73% to 52.41% from t = 0 to 100. If t becomes larger, the percentage will eventually reach 100%, which amounts to a full grid calculation. This is intrinsically due to the properties of the VM solutions. After a certain time, it is known that any approximation with fixed mesh size (or grid resolution) will fail [11,14]. By using the adaptive scheme, the main advantage lies in the efficient calculation of the solution when t is not so large.
For choice 2, with the non-symmetric parameter set, we take N = 8, k = 3 for the sparse grid scheme and N = 6, k = 3, with a larger threshold ε = 1 × 10 −6 for the adaptive sparse grid scheme. Similar results in conservation errors are obtained as in choice 1, and the plots are omitted. Figure 8 demonstrates energy transfers in this case. Compared with choice 1, the equipartition of the magnetic and electric energies at the peak is not achieved. The Log Fourier modes are plotted in Figure 9. Compared with choice 1, for all four modes, the results from both schemes stay very close to the full grid approximations. The contour plots and the density plots are collected in Figures 10-11. The distribution of active elements for the adaptive scheme is presented in Figure 12. The conclusions are similar to choice 1, i.e. the adaptive scheme offers better resolution by correctly tracking the filament structures, and the solutions are closer to the full grid approximations. Compared with choice 1, the distribution function is less filamented, therefore less elements are needed. At t = 100, the percentage of active elements is 22.15% compared with 52.41% for choice 1. This also explains why the density plots in Figure 11 show less difference between the two methods when compared with Figure 6 for choice 1.

Conclusions and future work
In this paper, we developed sparse grid and adaptive sparse grid DG schemes for simulating VM equations. The key ingredients of the scheme are the multiwavelet tensorized finite element approximation space and its hierarchical basis representation. Based on this construction, it is possible to reduce the storage and computational cost in high-dimensional simulations. Numerical tests show that our schemes achieve the desired of accuracy and similar results with standard RKDG method for the VM equations [13] can be obtained with the adaptive scheme. In the future, we will extend the scheme to other high-dimensional applications.       (e) Sparse grid, t = 100.          (e) Sparse grid, t = 100.