Cubature rules for weakly and fully compressible off-lattice Boltzmann methods

Off-lattice Boltzmann methods increase the flexibility and applicability of lattice Boltzmann methods by decoupling the discretizations of time, space, and particle velocities. However, the velocity sets that are mostly used in off-lattice Boltzmann simulations were originally tailored to on-lattice Boltzmann methods. In this contribution, we show how the accuracy and efficiency of weakly and fully compressible semi-Lagrangian off-lattice Boltzmann simulations is increased by velocity sets derived from cubature rules, i.e. multivariate quadratures, which have not been produced by the Gauss-product rule. In particular, simulations of 2D shock-vortex interactions indicate that the cubature-derived degree-nine D2Q19 velocity set is capable to replace the Gauss-product rule-derived D2Q25. Likewise, the degree-five velocity sets D3Q13 and D3Q21, as well as a degree-seven D3V27 velocity set were successfully tested for 3D Taylor-Green vortex flows to challenge and surpass the quality of the customary D3Q27 velocity set. In compressible 3D Taylor-Green vortex flows with Mach numbers Ma={0.5;1.0;1.5;2.0} on-lattice simulations with velocity sets D3Q103 and D3V107 showed only limited stability, while the off-lattice degree-nine D3Q45 velocity set accurately reproduced the kinetic energy provided by literature.


Introduction
The lattice Boltzmann method (LBM) [1,2,3,4,5] is an efficient approach for the numerical simulation of fluids. Compared to other methods discretizing the Boltzmann equation, such as discrete-velocity models [6,7] or (unified) gas-kinetic schemes [8,9], the LBM exhibits three key properties. First, the distribution functions are integrated along characteristics in time, leading to the well-known 0.5-shift in the relaxation time [10]. Second, the equilibrium distribution function is expressed as Hermite series, mostly of degree two, but higher for compressible [11] and thermal cases [12]. Third, the velocity space is integrated via quadrature rules, in particular by Gauß-Hermite quadratures [12,13]. By the latter, the unbounded velocity space of the Boltzmann equation is expressed by only a limited number of weighted particle velocities, tied together as a velocity set. The type of numerical integration with the underlying weight function exp −|x| 2 does not only appear in the LBM, but for instance also in Kalman filters [14]. Therefore, the literature on numerical integration is a good starting point for the search of well-suited velocity sets [15,16,17]. More specifically, the literature specifies two main approaches to derive the respective rules: Gauß-product rules and non-product rules [15]. The literature denotes multivariate non-product rules also as cubature rules [16,17]. * Corresponding author: wilde.aerospace@gmail.com The simpler and widely-used approach of the abovementioned ones is the Gauß-product rule, which is calculated by the outer product of a 1D quadrature. For example, the D1Q3 velocity set is essentially the 1D Gauß-Hermite quadrature constructed by the roots of the third order Hermite polynomial [18]. By applying the Gauß-product rule, the D1Q3 is turned into a D2Q9 in two dimensions, and by applying the same rule once more the D3Q27 velocity set is derived [18]. Since the number of particle velocities exceeds the number of the encoded physical moments, pruning the D3Q27 velocity set leads to the D3Q15 and D3Q19 sets with different high-order truncation errors [19,20,21]. A modification of the regular D3Q27 is the recently introduced crystallographic LBM [22], which expresses the particle distributions in a body centered cubic arrangement of grid points. All above mentioned velocity sets fit into a regular grid, which is another key asset of the well-established on-lattice Boltzmann method, with the vector of the particle velocities ending on one of the neighbouring grid points. For weakly compressible flows, this collection of velocity sets in combination with a second-order polynomial expansion of the Maxwell-Boltzmann equilibrium is widely used, although not perfect in terms of Galilean invariance, due to well-known errors in the stress tensor [23]. However, when a higher degree of precision for the numerical integration is needed [24,25], in particular for compressible [26] or dilute flows [27], the resulting velocity sets become unfeasible, for two reasons. Firstly, Gauß-product rules suffer from the "curse of dimensionality", i.e. the number of abscissae rapidly increases especially for high spatial dimensions with high degree of precision. Secondly, when derived from Hermite polynomials the degree-nine one-dimensional velocity set D1Q5 holds abscissae, which do not fit on a regular grid, so that an even larger velocity set D1Q7 with equidistant abscissae must be used to obtain a sufficiently high integration order [28]. These multivariate lattices with equidistant nodes are constructed by solving the orthogonality relations [13,24,29].
Contrary to Gauß-product rules, the abscissae of cubature rules are not derived by quadratures of lower dimension. They are rather found individually for a certain pair of degree of precision and dimension. Due to this freedom of velocity discretization, off-lattice Boltzmann methods are required to apply them in the LBM. There is only a very limited number of publications, which display simulations using off-lattice velocity sets [30,31,32].
The present manuscript therefore derives velocity sets from cubature rules and explores them using the semi-Lagrangian lattice Boltzmann method (SLLBM, [33]), with applications to both weakly and fully compressible flows. In contrast to Eulerian time integration schemes like finite difference [34,35], finite volume [36,37], discontinuous Galerkin LBM schemes [38,39], the SLLBM inherits the Lagrangian time integration along characteristics from the LBM and recovers the off-lattice distribution function values by interpolation. Recent works [40,41,42,43,44,45,46] provided evidence that the SLLBM is a promising off-lattice Boltzmann method for the simulation of both weakly and compressible flows. However, these works used either on-lattice velocity sets, e.g. D2Q9, or D3Q27, or velocity sets that had been derived by the Gaußproduct rule, e.g. D2Q25 from 1D Gauß-Hermite quadrature. This gap is closed by the present article. This work shows that cubature-based velocity sets significantly enhance off-lattice Boltzmann simulations in terms of both efficiency and accuracy.
The remainder of this manuscript is structured as follows. Quadrature and cubature in the LBM are briefly recapitulated in section 2 with a list of all investigated velocity sets in this article. Section 3 details the semi-Lagrangian lattice Boltzmann model for both weakly and fully compressible flows. The results section 4 studies three test cases: a compressible twodimensional shock-vortex interaction and the three-dimensional Taylor-Green vortex both in the weakly and in the fully compressible regime, each of them requiring different equilibria and velocity sets. Sections 5 and 6 provide discussion and conclusion.

Cubature in lattice Boltzmann methods
Following [17], we consider the approximation of an integral of function F with Ω ⊂ R D , weight function ω(x) ≥ 0 and dimensions D ≥ 2 by quadrature or cubature rules of the form with number of abscissae Q and discrete weights w i . The degree of a quadrature is defined as the largest integer d that yields I(F ) = C(F ) for all monomials of degree ≤ d.
These cubature formulas are applied to the distribution function of the force-free BGK-Boltzmann equation with (unshifted) relaxation time λ, particle distribution function f , equilibrium distribution function f eq , and particle velocities ξ. To that end, the Hermite moments a (n) are gained in the form [12,13] where H (n) is a nth-order Hermite polynomial. The distribution function f is expressed as a finite Hermite series with truncation order N [13] with the result that moments of order M are exactly represented, if M ≤ N. The velocity space is discretized by replacing the integral of Eq. (5) by a weighted quadrature leading to with the replacements f i = w i f (ξ)/ω(ξ i ) and H (n) (ξ i ) = H (n) i . From [47] it is known that moments of order M can be exactly determined by quadrature or cubature rules of degree d ≥ N + M. In particular, for weakly compressible flows the order of the moments is usually limited to N = 2, although N = 3 cancels the cubic error when deriving the Navier-Stokes equations by a Chapman-Enskog analysis [13]. In contrast, fully compressible flows require even expansion order N = 4. By overcoming the restriction that abscissae ξ i need to match a regular grid, cubature rules become applicable, provided they approximate the weight function In the literature, cubature rules are usually listed for the weight functionω(x) = exp −|x| 2 . To obtain a velocity set to approximate the moments in Eq. (7), the cubature's abscissae have to  2. The lattice speed of sound c s of these velocity sets is-unless otherwise specified-unity c s = 1, which also implies the reference temperature T 0 = 1.
The following velocity sets were identified to be subject of an in-depth examination: • First, the newly introduced degree-nine D2Q19 in comparison to the D2Q25 derived from Gauß-product rule for fully two-dimensional compressible flows. The shape of D2Q19 is shown in Fig. 1 • Second, in three dimensions velocity sets for both weakly and fully compressible flows were chosen. The D3Q13 based on a icosahedron was already used in a weakly compressible finite difference LBM with a triangular mesh [32], whereas the D3Q21 derived from a dodecahedron has not been examined so far. Both platonic solids, shown in Fig. 2, possess a high geometric isotropy, i.e. the flow information encoded into the distribution function values is transported by regularly spaced abscissae ending on the surface of a sphere. The weights and abscissae of the D3Q13 and D3Q21 velocity sets are listed in Tables 2 and  3, respectively. Both velocity sets were compared to the customary D3Q27 on-lattice velocity set and in addition to a degree-seven velocity set, which we call D3V27 [15]. This velocity set is not affected by the cubic error in the stress tensor that troubles degree-five velocity sets. Although in the same spirit, it is not identical to the velocity set presented by Yudistiawan et al. [31].
• Last, for three-dimensional compressible flows, a recently introduced D3Q45 velocity set was applied [48], based on a cubature rule by Konyaev [49,50]. This off-lattice velocity set was compared in an off-lattice Boltzmann simulation to the state of the art on-lattice counterparts D3Q103 and D3V107 applied to compressible on-lattice Boltzmann simulations. Table 1 designates the velocity sets, which were used for the simulations in this article. The additional notation E Q D,d follows the literature on multivariate quadratures integrating the weight functionω(x) = exp −|x| 2 , listing dimension D, degree d and a) b) Figure 2: Shape of icosahedron a) and dodecahedron b), which are the bases for the velocity sets D3Q13 and D3Q21 used in this work. Gauß-product rule from 1D degree-nine Gauß-Hermite quadrature [28] D3Q13 E 13 3,5 Derived by the vertices of an icosahedron [15,32], Table 2 D3Q21 E 21 3,5 Derived by the vertices of a dodecahedron [15], Table 3 D3Q27 E 27 3,5 Doubly applied Gaußproduct rule from 1D Gauß-Hermite quadrature [ On-lattice velocity set [30] number of abscissae Q. To identify the suitability of the introduced velocity sets beforehand, the symmetry conditions in Appendix B were successfully tested [51]. For example, a degreefive velocity set only reproduces the symmetry condition up to fifth order, whereas degree-nine velocity sets will also correctly reproduce the symmetry conditions up to ninth order.

Model description
The lattice Boltzmann equation reads with relaxation time τ = ν/(c 2 s δ t ) + 0.5 depending on kinematic viscosity ν, and time step size δ t .
This work applies the discrete equilibrium distribution function based on an expansion in terms of Hermite polynomials Table 2: Abscissae ξ i and weights w i for the D3Q13 velocity set, based on an icosahedron [15,32,4] with r 2 = (5 + with ":" denoting full contraction, w i the discrete weights and N the expansion order. Both the moments a (n) eq and the Hermite tensors H (n) i are listed in the Appendix. In the weakly compressible case the local temperature θ = T/T 0 is set to θ = 1 with vanishing terms for second and higher order moments. With the help of Eq. 7 the moments density and momentum are obtained Compared to the standard LBM, the SLLBM replaces the nodeto-node streaming step by a cell-wise interpolation procedure using interpolation polynomials ψ of interpolation order p to obtain the departure points, whose locations are dictated by the respective reversed particle velocities f i (x − δ t ξ i ) , i.e. for all M points x in each cell Ξ: withf iΞ j denoting the distribution function value at the support points in each cell.
Gauß-Lobatto-Chebyshev points were used for the distribution of support points in the cells [44,33].
For the compressible test cases in Sections 4.1 and 4.3 we employed the recently introduced compressible SLLBM [44] with the following extensions. A second distribution function g is introduced to enable the calculation of flows with variable heat capacity ratio γ [47], following the same collide-andstream algorithm as applied to the distribution function f .
The equilibrium of g is determined by the relation with heat capacity at constant volume C v . To complement Eqs. (12) and (13), the local temperature θ is determined by with peculiar particle velocity The local speed of sound c * s consequently depends on the local relative temperature θ and on the heat capacity ratio γ, via The relaxation time τ in the compressible case is dependent on the local pressure P, i.e. τ = µ/(c 2 s δ t P) + 0.5 with dynamic viscosity µ. In addition, a quasi-equilibrium approach was used to vary the Prandtl number Pr. More details about the compressible SLLBM solver are listed in [44]. The NATriuM solver was used for the off-lattice Boltzman simulations [45], which is based on the finite element library deal.ii [54]. The on-lattice Boltzmann simulations were performed using the lettuce software [55,56], being based on the GPU-accelerated machine learning toolkit PyTorch [57].

Results
The present work considers three test cases to evaluate the introduced velocity sets: the two-dimensional shock-vortex interaction, comparing the degree-nine velocity sets D2Q19 and D2Q25. This flow challenges the solver by acoustic emissions by the vortex downstream the shock. Next, the weakly compressible three-dimensional Taylor-Green vortex [58] was simulated by various 3D velocity sets, e.g. D3Q13, D3Q21, D3V27 and D3Q27 to explore the capability of the velocity sets to deal with underresolved flows. Finally, simulations of fully compressible 3D Taylor-Green vortex flows were explored with Mach numbers 0.5, 1.0, 1.5, and 2.0. This test case exhibits shocklets and viscous effects and features a standardized initialization. The on-lattice simulations were performed using the D3Q103 and D3V107 velocity sets, while the off-lattice simulations were run by the D3Q45 velocity set.

Compressible 2D shock vortex interaction
The compressible two-dimensional shock-vortex interaction was tested as a first test case to compare the two velocity sets D2Q19 and D2Q25. This benchmark was intensively studied by Inoue and Hattori [59], with the following setup. Two regions with Ma a = 1.2 and Ma b determined by Rankine-Hugoniot conditions are divided by a steady shock. The Reynolds number is defined as Re = c * s,∞ R/ν ∞ with subscript ∞ denoting the inflow conditions. The flow field of the vortex is given by where Ma v denotes the Mach number of the vortex. The initial pressure and density field are and .
The resolution was 256×256 cells with polynomial order p = 4, ie. 1024×1024 grid points for the physical domain of 60R×24R. Initially the centre of the vortex was located at x v = 32, while the shock was located at x shock = 30. Likewise as in [44] we used a stretched grid to spatially resolve the shock region. The time step size was δ t = 0.002 with characteristic time t = R/c * s,∞ . The Prandtl number was Pr = 0.75 in all cases. Table 4 designates the most important parameters of the simulations. Fig. 3 depicts the density contours at t = 8 for test case a) with Ma v = 0.5, Re = 400 and provides evidence that there is no significant difference between the simulations run by the D2Q19 and D2Q25 velocity sets.
The evaluation of the sound pressure for test case b) at Ma v = 0.25 and Re = 800 confirms this observation, being shown in Fig. 4. The sound pressure ∆P = (P − P B )/P B , with subscript B denoting the pressure downstream, was measured at time t = 8 from the center of the vortex along the 135 • inclined radius with respect to the x-axis. As expected, the sound pressure of both SLLBM simulations coincided well with the reference solution by Inoue and Hattori [59]. These findings approve the cubature-derived D2Q19 as a potent alternative for compressible off-lattice simulations to the D2Q25 being based on the Gauß-product rule used in previous works [41,44].

Weakly compressible 3D incompressible Taylor-Green vortex
The three-dimensional Taylor Green vortex served as a test case to apply the degree-five velocity sets D3Q13, D3Q21, and degree-seven D3V27 in comparison to the customary degreefive D3Q27 using an isothermal configuration θ = 1 with truncation order of the equilibrium N = 2 for D3Q13, D3Q21, and D3Q27. The D3V27 simulation instead was equipped by a third truncation order N = 3. The Mach number was Ma = 0.1. The computational domain is S = [0, 2π] 3 and the following initial conditions were used P(x, t = 0) = 1 16 (cos(2x) + cos(2y))cos(2z + 2), The kinetic energy of the flow is defined as As explained in detail in [33] and references therein, the scaled enstrophy νE measures the physical dissipation in terms of the local shear stresses. In turn, its deviation from the phenomenological dissipation rate -dk/dt provides a measure for the numerical dissipation due to discretization. First, a well-resolved flow was tested by setting the Reynolds number to Re = 400. The domain was discretized by 32 3 cells with order of finite element p = 4, resulting in 128 3 grid points. The velocity sets were scaled in such a way that the time step size was δt = 0.0015 in all cases. A summary of the flow variables is listed in Table 5. Figure 5 depicts the dissipation dk/dt and the scaled enstrophy νE of the four focused velocity sets in comparison to the reference by Brachet et al. [58]. It is shown that all velocity sets match the reference solution for both dissipation and scaled enstrophy. Thus in case of well-resolved simulations, even the lean D3Q13 velocity set commended itself as a good choice for three-dimensional flow simulations. In case of the underresolved simulations with Reynolds number Re = 1600 with time step size δ t = 0.00075 the situation is slightly different though, displayed in Fig. 6 showing different evolutions of dissipation and scaled enstrophy for each of the velocity sets. Initially, the dissipation followed the DNS reference simulation. As the flow entered the fully turbulent regime, the dissipation became more volatile than in the resolved case due to under-resolution of small vortices. In this phase, the numerical discretization provided dissipation in the spirit of an implicit LES [61]. Despite the absence of an explicit subgrid model, all stencils produced macroscopic dissipation rates that roughly followed the reference solution. The D3Q13 and D3Q27 stencils even captured the location of the dissipation peak within 0.5 time units, whereas D3Q21 and D3V27 produced a plateau and a premature peak, respectively. That said, the broader implications of this difference are debatable, since the dissipation in the under-resolved regime is mostly a result of numerical errors interacting with the finest resolved scales. In contrast, the scaled enstrophy is a more meaningful observable as it directly reflects the local shear stresses. Fig. 7 compares the scaled enstrophies revealing that the enstrophy determined by simulations with the common D3Q27 most drastically underestimated the enstrophy in this configuration. Table 6 confirms this observation by listing the Euclidean norm of the difference between the scaled enstrophies obtained by simulation and reference, i.e.
whereN is the number of discrete measurements. The degreeseven D3V27 simulations as well as the degree-five D3Q21 simulations both more accurately predicted the scaled enstrophy.

Compressible 3D Taylor-Green vortex
As a final test case the compressible 3D Taylor-Green vortex was simulated. Recently, this test case has been by extensively investigated by Peng and Yang [62]. The Reynolds number was Re = 400, the heat capacity ratio was γ = 1.4, and the Prandtl number was Pr = 0.75. The initial velocity field corresponds to Eq. (22), whereas the pressure is obtained by P = ρT 0 with T 0 = 1, while the density is given by ρ(x, t = 0) = 1.0 + C 2 16 (cos(2x) + cos(2y))cos(2z + 2). (27) This relation of temperature and density corresponds to the constant temperature initial condition (CTIC) detailed in Peng and   Yang [62]. The nominator C can be either set to C = Ma 2 or to C = 1; this work made use of the latter case. Table 7 lists the most important simulation variables and Table 8 separately lists the time step sizes and relaxation times at θ = 1.0 of the respective simulations with the velocity sets D3Q103, D3V107, and D3Q45. The time step sizes of the on-lattice velocity sets were dictated by the configuration, whereas the time step sizes of the D3Q45 SLLBM simulation were adjustable. The results of the D3Q45 velocity set is discussed in more depth in a follow-up manuscript [48]. Figure 8 depicts the mean kinetic energy over time for a resolution of 256 3 obtained by the velocity sets D3Q103, D3V107 and by the SLLBM D3Q45 simulation. It displays that the reference by Peng and Yang was well captured for Mach numbers Ma = 0.5 and Ma = 1.0 by all three velocity sets. However, the on-lattice simulations became unstable for the larger Mach numbers Ma = 1.5 and Ma = 2.0.
These results confirm the assumptions of other works which predicted the capability of the on-lattice velocity sets D3Q103 and D3V107 to simulate compressible flows [25,30,29,63]. However, simulations applying these velocity sets lack stability, at least in the present configuration for Mach numbers exceeding one. In contrast, the variable time step size of the SLLBM enabled stable simulations even at higher Mach numbers using a substantially smaller velocity set with 45 discrete velocities.

Discussion
The discretization of the velocity space by quadrature is a key asset of the lattice Boltzmann method. However, the coupling of the velocity space discretization with the spatial and temporal discretization in on-lattice Boltzmann methods is obstructive to find sufficiently small velocity sets especially for high quadrature order. Even a recent systematic study as done by Spiller and Dünweg did not yield smaller velocity sets than     the already known D3Q103 [64], whose number of abscissae therefore appears to be the lower limit at this quadrature degree with equidistant nodes. The enormous size of these highdegree sets prevented their application in actual simulations, at least up to the present study. The simulations of the fully compressible Taylor-Green vortex in the present work showed onlattice velocity sets are generally capable for compressible simulations, but they lack stability at Mach numbers beyond unity. The biggest issue, from our point of view, is the fixed time step size of most compressible on-lattice Boltzmann methods. An exception to this are hybrid lattice Boltzmann methods, which solve the energy equation by finite volume or finite difference methods [65,66]. These methods are capable to adjust the time step size by changing the reference temperature, but they suffer from restrictions in the stability regions [67]. Despite the successes of recent works [26,42,68,69,70], the applicability of compressible on-lattice Boltzmann methods remains vague. Although shifted stencils have proven to extend the Mach number range of on-lattice Boltzmann methods [71,72] in certain situations, the Taylor-Green vortex at high Mach numbers would still not be stable with static reference frames. Solely dynamic shifts of the reference frame as presented by Coreixas and Latt [73] might be an alternative for on-lattice Boltzmann methods, although they cause additional computational costs and require further investigation. Compared to on-lattice Boltzmann methods, the discretization of the velocity space is a largely unexplored field of research in off-lattice Boltzmann methods. We were able, however, to identify a large potential in equipping off-lattice Boltzmann methods by velocity sets with significantly different shapes. Instead of deriving stencils case-by-case, the research on multivariate quadrature rules yielded suitable cubature rules as templates for velocity sets. Consequently, this work's main purpose was to explore these velocity sets by off-lattice Boltzmann simulations.
The results in Section 4 clearly indicate that the spatial freedom obtained in the collocation of abscissae, as done in cubature rules, reduces the number of discrete velocities significantly. Simultaneously, the computational costs halved, narrowing the gap between on-lattice and off-lattice Boltzmann methods in efficiency. For example, the D2Q19 degree-nine velocity set is approximately half the size of its on-lattice counterpart D2V37 with equal quadrature order [24]. The difference between the D3Q45 and the D3Q103 velocity set is even larger, but the size of the D3Q45 is also superior compared to a recently utilized degree-nine D3Q77 off-lattice velocity set for compressible finite volume LBMs [74]. The D3Q45 simulations of the Taylor-Green vortex proved to be stable and accurate even for high Mach number flows. This can be attributed to the temporal discretization of off-lattice Boltzmann methods, which is independent of the discretization of space and velocity space.
Differences were also identified in terms of accuracy. The D3Q13 and D3Q21 based on icodahedra and dodecahedra, shown in Fig. 2, proved to be both efficient and accurate offlattice velocity sets for weakly compressible flows. Both velocity sets showed a better enstrophy agreement with the reference solution in underresolved flow simulations. The reason for this is not necessarily that the precision of the cubature is higher, since the degree of precision of the D3Q13, D3Q21, and D3Q27 velocity sets is identical. Instead, we attribute the better agreement of the enstrophy with the reference to the fewer advection equations that have to be solved connected with fewer interpolation steps, leading to less artificial diffusion. Moreover, due to their derivation from platonic solids, these velocity sets exhibit a high geometric isotropy, which is favourable for the simulation of turbulent flows. This is probably also the reason why the D3Q21 outperforms the D3Q13 velocity set in terms of enstrophy. However, simulations with all of these degree-five velocity sets are affected by the cubic error term, being dependent on the Mach number. Obviously, this error term also spoils the simulations, since the Mach number of Ma=0.1 is not negligible. The degree-seven velocity set D3V27 with the aim to eliminate these cubic errors in the model proved to be more accurate than the on-lattice D3Q27, but coming at a comparable price. Despite the numerical diffusion introduced by the SLLBM, we showed in past works that the spatial convergence order corresponds to the order of the finite element [45]. By contrast, the spatial convergence order of on-lattice Boltzmann methods is restricted to second order.
Numerical diffusion and dispersion are critical issues of off-lattice Boltzmann methods compared to on-lattice Boltzmann methods, resulting from the interpolation in case of the SLLBM. High-order interpolation polynomials as used in the present article, however, significantly lower the dissipative effects in the simulation [45]. Contrary to that, on-lattice Boltzmann methods are valued for low dissipation [75] due to the exact streaming step, but they suffer from instabilities at high Reynolds numbers when applying the usual BGK collision operator. To encounter these stability issues, a number of collision models have been proposed to stabilize simulations by introducing minimal diffusion without sacrificing the locality of the collision operator. Examples are multi-relaxation time (MRT) [76,77,78,79], central MRT [80,81], entropic [82,83], regularized [84,85], or cumulant models [86,87]. Due to the non-negligible amount of dissipation evoked by the streaming step of off-lattice Boltzmann schemes, it is likely that these established on-lattice collision models fulfil a different role in offlattice LBM. Nevertheless, previous work has utilized the semi-Lagrangian streaming step in combination with a stabilized collision model [40]. Consequently, future research is necessary to clarify the impact of collision models on off-lattice Boltzmann methods.
Finally, some technical aspects need to be discussed. In particular, the streaming step's implementations of on-lattice and off-lattice Boltzmann methods are significantly different. On the one hand, the exact streaming step of on-lattice Boltzmann methods can naively be implemented by on-board routines of many software packages, e.g. by "numpy.roll" in Numpy [88]. On the other hand, the interpolation routines of the SLLBM require a thoughtful implementation, especially when following our recommendation to employ high-order polynomials with non-equidistant support points and cell-wise organization. For this task, a matured finite element library like deal.ii is helpful [54]. The execution time of the semi-Lagrangian streaming step is one magnitude larger [45] for the same configuration and velocity set.. Despite the computational overhead, offlattice Boltzmann methods have proven beneficial for unstructured meshes or compressible simulations [40,45,44]. More appropriate velocity sets as presented in this article further reduce the gap in computation cost, as the computational complexity of the SLLBM scales linearly with respect to the number of discrete velocities.
To sum up, off-lattice Boltzmann methods relax the LBM in terms of the velocity space discretization. As the development of sparse numerical cubatures is still progressing, the present work lays the foundations to directly utilize new cubature rules in CFD simulations with off-lattice Boltzmann methods.

Conclusion
This paper studied the utilization of cubature rules in offlattice Boltzmann methods for weakly and fully compressible flows. The deduced off-lattice velocity sets presented in this article reduce the number of discrete velocities and increase the accuracy of the simulation. In addition, fully compressible offlattice simulations with degree-nine velocity sets feature better stability compared to the corresponding on-lattice counterparts. Taken together, cubature rule-derived off-lattice velocity sets are a viable alternative to the customary on-lattice Boltzmann velocity sets and should have priority in off-lattice Boltzmann simulations.