A Time-Spectral Method for Initial-Value Problems Using a Novel Spatial Subdomain Scheme

We analyse a new subdomain scheme for a time-spectral method for solving initial boundary value problems. Whilst spectral methods are commonplace for spatially dependent systems, finite difference schemes are typically applied for the temporal domain. The Generalized Weighted Residual Method (GWRM) is a fully spectral method in that it spectrally decomposes all specified domains, including the temporal domain, with multivariate Chebyshev polynomials. The Common Boundary-Condition method (CBC) is a spatial subdomain scheme that solves the physical equations independently from the global connection of subdomains. It is here evaluated against two finite difference methods. For the linearised Burger equation the CBC-GWRM is $\sim30\%$ faster and $\sim50\%$ more memory efficient than the semi implicit Crank-Nicolson method at a maximum error $\sim10^{-5}$. For a forced wave equation the CBC-GWRM manages to average efficiently over the small time-scale in the entire temporal domain. The CBC-GWRM is also applied to the linearised ideal magnetohydrodynamic (MHD) equations for a screw pinch equilibrium. The growth rate of the most unstable mode was efficiently computed with an error $<0.1\%$.


Introduction.
In the field of fusion plasma physics complex sets of linear PDEs need frequently be solved. A particular example is the sub-discipline devoted to the analysis of linearised magnetohydrodynamic equilibria and the evolution of unstable modes [4,9,11,24,30]. The analysis consists of linearising the governing magnetohydrodynamic equations, assuming an equilibrium, introducing a perturbation, and then calculating the mode growth rate. The existence of multiple time scales require many time steps to resolve. It would be beneficial to average over small scale dynamics efficiently in order to obtain the dominant unstable mode.
The efforts of increasing the efficiency and accuracy of time dependent numerical schemes are long standing. Authors such as Y. Morchoisne (1979) [20], Peryet. et al. (1983) [22], Tal-Ezer (1983-84) [31,32], and Bar-Yoseph et. al. (1995) [3], to name a few, have pioneered the concept of time-spectral methods. A number of pseudo-spectral methods with polynomial basis functions such as Fourier, Legendre, and Chebyshev polynomials, with their respective advantages have been analyzed by the various authors. In the works by P. Dutt [8] and Maerschalck et. al. [18] a spacetime Chebyshev collocation method was implemented to solve initial value problems in a least-squared sense. It is in this vein that the current work is pursued.
The time-spectral method proposed in [25] is a fully spectral method that spectrally decomposes all domains including spatial, temporal, and parameter domains. Multivariate Chebyshev polynomials are used to represent all domains in a solution ansatz. The time-spectral method [25] executes all computations in spectral space, i.e. no collocation points are used. The only spatial points that are taken into consideration are the points where the spatial subdomains overlap. The spatial subdomains are overlapped for the purpose of having a two-point contact. This subdomain method ensures spectral accuracy in the entire domain. Unfortunately, the resulting global matrix equations associated with solving the algebraic equations for the solution coefficients may become excessively large [6,7,15].
There are several popular methods for dealing with this issue. One such method is to use parallel algorithms that solve for large sparse matrices [19,28], and another is to create independent spatial domains that can be solved locally with boundarysolvers that connect the subdomains [7,10,15,21,27]. The idea of reducing the global degrees of freedom was first implemented with the static condensation method [14], and thereafter has seen many variations that make the algorithm more general and dynamic [23]. The static condensation algorithm has been popular with finite element methods, however similar algorithms have been introduced to spectral methods. The influence matrix technique is such a condensation algorithm related to spectral methods [5,33]. In [5] the Navier-Stokes streamfunction-vorticity equations are solved with an iterative domain decomposition method by [16] in conjunction with the influence matrix technique. The influence matrix uses the fact that the vorticity and the stream function are linear so that they can be decomposed into a time-dependent term and a static term. The influence matrix includes the static terms, which then enforce the no-slip conditions on the boundaries in every finite time-step.
This brings us to the current spatial subdomain scheme termed the Common Boundary-Condition (CBC) Method. This method solves the "private" physical equations locally and separately, but the "boundary" equations that connect the subdomains are solved globally. This will in effect reduce the global degrees of freedom a substantial amount. The CBC method, compared to the other methods, is a general subdomain scheme for the time-spectral method by calculating how the external boundary conditions and internal patching conditions influence the private, physical equations. Thus, the CBC method does not decompose patching conditions, instead it calculates implicit and explicit derivatives of all patching conditions with respect to the private equations in the individual subdomains. The CBC method also avoids any matrix singularity issues by using a modal representation, instead of avoiding the corner points of the subdomains [5]. Most importantly, it takes into account the temporal modes, and adjusts the boundary and patching conditions throughout the entire temporal domain.
The paper is organized as follows. Section 2 briefly summarizes the Generalized Weighted Residual Method (GWRM), which is the foundation of the time-spectral method. The new spatial subdomain scheme that has been implemented is also laid out. Next follows solution of test problems in section 3. The test problems include the linearised 1D Burger equation, a forced wave equation and a case relating to the linearised ideal MHD equations. Section 4 contains a discussion, including some observations, possible objections, and resolutions. The paper closes with a conclusion.
2. Method. We will here present a brief review of the Generalized Weighted Residual Method GWRM. For a full and detailed presentation the reader is directed to [25].
2.1. Weighted residual formulation. We start by assuming a set of partial differential equations, where D is an arbitrary linear or non-linear operator and f (t, x; p) is a known force term. The solution is approximated with multivariate Chebyshev expansion series in time, space, and parameter space. For a single spatial dimension and one parameter we have Here ( ) denotes that the first term in the summation is divided by two. Since Chebyshev polynomials are defined in the range [−1, 1], a change of variables is performed; σ = (z − A z )/B z , A z = (z 1 + z 0 )/2, and B z = (z 1 − z 0 )/2. Here σ signifies the transformed variable and z is the physical variable, with indices 0 and 1 denoting domain limits.
The weighted residual formulation is obtained by substituting (2) into (1), multiplying the residual with weight functions, and integrating over the entire domain; The residual has the form and the weight function, The resulting algebraic equations may be written as a qrs = 2δ q0 b rs + A qrs + F qrs (6) This simple and general formula contains Chebyshev coefficients relating to the initial conditions b rs , the linear/non-linear operator term A qrs , and the force term F qrs . Boundary equations take the place of the highermost spatial modes of a qrs after inserting Eq. (2) into the given external boundary conditions; internal boundary conditions are produced by coupling the spatial subdomains. In the GWRM equation (6) plus boundary condition equations are solved with a semi-implicit root solver (SIR) [26]. This may entail solving a global matrix equation for all the (K + 1)(L + 1)(Λ + 1) coefficients. The number of numerical operations, and the CPU time, would then scale as ((K + 1)(L + 1)(Λ + 1)) 3 and the memory requirements as ((K + 1)(L + 1)(Λ + 1)) 2 . This is of course unacceptable from a viewpoint of efficiency.
2.2. Subdomain scheme. Subdomains, both in space and time, have the potential to remedy the efficiency problem. Concentrating here on spatial subdomains, it may be noted that a mere division of the spatial domain into subdomains does not have a radical influence on efficiency. Clearly, the possibility to adjust the subdomain length according to the physical terrain would optimize the solution procedure, but essentially the same global number of coefficient equations would need to be solved simultaneously. The Common Boundary-Condition method (CBC-GWRM) described in the following does, however, reduce the number of simultaneous global equations to be solved.
where ε is a small overlapping distance. This procedure allows us to use only a few Chebyshev modes in our ansatz for each subdomain. By overlapping the subdomains, point-wise and gradient continuity is ensured. The system of equations we wish to solve include N s spatial subdomains, N e number of physical equations, K temporal modes, L spatial modes (we here let Λ = 0). This results in N = N s N e (K + 1)(L + 1) equations to solve for the coefficients of the ansatz 2. A question arises: can the global amount of equations be reduced in some way?
Equations (6) can be written in fixed point form formally as x s = Φ s , where s refers to the respective subdomain; for details see [25]. The idea now is to realize that only the subset x s BC = Φ s BC representing the boundary equations need to be solved globally. The reason for this is that the physics equations (6) in each individual subdomain x s P = Φ s P , representing the systems of PDEs, are only dependent on x s BC . These "private" equations can be solved locally in each spatial subdomain in each iteration level. This process can be fully parallellized. Furthermore, we have the dependence x s BC = Φ s BC (s − 1, s, s + 1), depicted graphically in Figure 2. Thus boundary condition equations only contain Chebyshev coefficients (variables) of the immediately neighbouring spatial subdomains.
A numerical example may be elucidating. Let us assume that a system of five PDEs, representing second order in space, is solved in 1D. Employing K = 9 and 10 spatial subdomains with L = 9, a total of 5000 global equations result for the Chebyshev coefficients of Eq. (2). The CBC-GWRM algorithm reduces these to 1000. For 2D and 3D problems the reduction is even more profound. The reduced set of global equations is then solved iteratively by SIR [26]. SIR solves the algebraic set of equations by iterating the following equation, where ϕ ϕ ϕ denotes the right hand side of Eq. (6) plus the boundary equations, the matrix R with elements R mn = ∂Φ m /∂x n controls convergence in SIR and I is the identity matrix. It should be noted that Eq. (7) is a formal representation; for efficiency the matrix inversion is generally avoided and a matrix equation is solved instead. Thus the Jacobian J i BC needs to be computed, where δ pq is the Kronecker delta and T pq includes the explicit and implicit derivatives; The indices p and q refer to common BC variables and the index i refers to the N p private variables. The first term is the explicit derivative. The second term shows that the common BC variables are indirectly dependent on the private variables in the neighbouring subdomains, hence the implicit derivatives. The index ν is introduced in the sum to neglect subdomains that are not directly influencing the current common BC variables.
The ∂x P /∂x BC coefficients in the sum require some attention. The first step is to create a vector of all private equations f i = z i −ϕ i (z), i = 1. . .N e N p , where z and ϕ(z) only contain the private variables and equations from x and ϕ(x). Since the private z variables possess an implicit dependence on the common BC variables, the partial derivatives can be computed and saved in the matrix F ij = ∂f i /∂z j , i = 1. . .N e N p and j = 1. . .N e N BC . The implicit derivatives in F ij are evaluated with current private values from current scheme iteration and the ∂x P i /∂x BCj variables are obtained by solving the linear algebraic system of equations F ij = 0.
An alternative method for obtaining the ∂x P /∂x BC coefficients in Eq. 9 is to approximate the derivative numerically with forward difference ∂x The set of private equations numbering (L − 1)(K + 1) have to be solved for every BC variable, i.e. 2N s N e (K + 1) times. The amount of operations can be further reduced since the calculated Jacobian from the regular solution x P (x BC ) can be used for the perturbed solution x P (x BC + ) as well.
A work flow of the Common Boundary Condition subdomain scheme, employing the latter numerical scheme for the Jacobian, applied to the 1D case, omitting physical parameter dependence, is given in the box below.
Then discretize the spatial domain into overlapping subdomains Ns. The Ne partial differential equations, with initial conditions, are then spectrally decomposed. Assume here 2 external boundary conditions, so that L−1 spatial modes of Eq. 6 are computed.
Step 1: Initial common boundary condition values X s BC are substituted into x s P = ϕ ϕ ϕ s P (x s P , X s BC ), so that only private variables are unknown. Call on SIR to solve private equations.
Step 3: Compute the BC Jacobian (Eqs. 8-9) Step 4: Compute Eq. 7, , which includes JBC from step 3) and ϕ ϕ ϕBC (xBC ) with private variables from step 1). This here is a linear system, hence it only requires a single SIR iteration to calculate the new common BC values x i+1 BC .
Final: Update Chebyshev coefficients a kl in Eq. 2 using solution vector x.
3. Test problems. We have chosen to solve the linearised Burger equation, a forced wave equation, and a problem modelled with the linearised ideal MHD equations. Non-linear problems will be addressed in a forthcoming paper. All CBC-GWRM simulations are carried out on one global temporal interval, that is we here omit the use of time intervals.
3.1. Linearised 1D Burger equation. The performance of the CBC-GWRM with regards to accuracy is here compared to that of the explicit Lax-Wendroff method, the implicit Crank-Nicolson method, and the "standard" GWRM with a global spatial subdomain scheme. The purpose of employing the Crank-Nicolson and Lax-Wendroff methods is to include two well known finite difference methods that require little justification. The performance of other, more or less sophisticated, finite difference methods can easily be compared to these methods.
The parameters that determine the accuracy and efficiency of the GWRM are the order of temporal Chebyshev modes K, spatial Chebyshev modes L, and the number of spatial subdomains N s . Similarly, the grid values determine the performance of the FD methods, namely M temporal steps and N spatial steps. The CPU time and memory consumption of all the methods are documented throughout.
The linearised 1D Burger equation is stated as follows: where κ and b are constants. The initial condition and boundary conditions chosen here are This enables an exact solution that can be used for accurate benchmarking (see Figures  3a and 3b for an example),  Thus we see that for moderate accuracies the C-N method is approximately two times faster than CBC-GWRM and 10 times faster than L-W, all methods requiring comparable memory consumption.
However, when the accuracy is increased to ε ∼ 10 −5 , the CBC-GWRM with N s = 3, K = 6, and L = 6 features a CPU time of 0.50 s and memory of 22 MB. The C-N method gave a CPU time of 0.77 s and memory of 43 MB for M = 120 and N = 400, whilst the L-W method was unable to reach a comparable accuracy. Thus the CBC-GWRM is here faster and more memory efficient for higher accuracies than both FD-methods. With the same parameters and accuracy ε ∼ 10 −5 the "standard" GWRM (without the CBC-scheme) achieves a CPU time t = 0.27 s and memory 23 MB. In Discussion, however, it is argued that the CPU time of the GWRM using the CBC scheme is strongly advantageous when a large number of spatial subdomains is employed.

Fig. 4: Error plot vs CPU time [s]
The three methods have been analysed with optimal parameters so as to obtain the best accuracy for a given computational time. The data is displayed in Figure  4, which shows how all three methods scale in this regard. A simple curve-fit of the data shows that the L-W method scales as ε LW ∼ t −0.6 CP U , and the C-N method scales slightly better at ε CN ∼ t −0.9 CP U . The spectral convergence properties of the CBC-GWRM allows a much stronger scaling than the FD methods; ε GW RM ∼ t −3.47 CP U . It is also found that the CPU time scales linearly as t CP U ∼ N 1 s with regards to the number of subdomains used, see Figure 10, in Discussion. This feature of the CBC-GWRM becomes advantageous when modelling physical systems that require high local spatial accuracy.

Forced Wave Equation.
The forced wave equation employed here is a second order hyperbolic differential equation that features two time scales. This equation is used to test efficiency, in the sense that in some cases accuracy at small scales may be ignored in favour of efficiently resolving large scale dynamics. The wave equation has the form and can be posed as two first order in time partial differential equations: The initial and boundary conditions are Here A, n, α, β and c are free parameters. The forcing equation chosen is This gives the exact solution u(t, x) = cos(ncπt)sin(nπx) + Asin(αt)sin(βx).
First the free parameters are set to A = 10, n = 3, T = 80, α = 6π/T , β = 3π and c = 1. For the time interval t ∈ [0, 80] the CBC-GWRM requires a high number of temporal Chebyshev modes K≥12 to achieve a reasonable average (within 92% percent of the second, slowly evolving part of the solution (15)) of the fast time scale dynamics. The solution at x = 0.2 can be seen in Figure 5. Since it is inefficient to solve for more than one wavelength with high order temporal modes, a more pragmatic test would be to solve the wave equation with parameters A = 10, n = 3, T = 30, α = 2π/T , β = 3π and c = 1 in a time interval t ∈ [0, 30]. This allows a temporal mode K = 5 to accurately average the fast time scale (within 95% percent of the slow manifold). The CBC-GWRM with parameters K = 5, L = 5, and N s = 4 achieves a CPU time of 0.86 s, which is comparable to the standard GWRM subdomain scheme with the same parameters, requiring 0.83 s. In Figure 6a the standard GWRM and the CBC-scheme, employing both analytical and numerical derivatives, for difference numbers of increasing subdomains. The CBC-scheme with numerical derivatives achieves shorter CPU times and is more memory efficient. In Figure 6b the memory consumption of the three GWRM schemes are compared.  The parameters used for the C-N solution is M = 50 and N = 50, which gives a run time of 1.4 s. The C-N method fails to average correctly as it is slightly out of phase. Explicit methods, unlike implicit methods, are conditionally stable because they must obey the CFL condition γ t/ x≤1, where γ is the wave speed, t is the temporal step length, and x is the spatial step length. However, since the forced wave equation is a hyperbolic equation, the FD methods need to resolve the phase of the wave in order to be accurate. Furthermore, the phase can only be resolved by following the strong stability criterion, in which case the implicit method shows no favourable qualities over explicit schemes [2].
To conclude, the CBC-scheme is approximately 1.6 times faster than the finite difference methods when one wavelength is simulated with a K = 5 temporal Chebyshev mode. Similar, and even higher, accuracy and efficiency levels can be attained using lower temporal modes and multiple time intervals than for high temporal modes with few time intervals. The memory consumption of the CBC-GWRM is also superior in this regard compared to the GWRM global subdomain scheme. It can be seen in Figure 6b; if a low temporal mode K = 5 and about 20 subdomains are used the CBC-GWRM is 75% more memory efficient and 50% faster than the global GWRM. These are substantial increases in performance which allows the CBC-GWRM to be highly competitive when solving physical systems with high degrees of freedom.
3.3. Ideal MHD. The ideal MHD equations are included in our analysis to investigate whether the CBC-GWRM is capable of accurately computing complex physical systems. The ideal MHD model is a set of coupled partial differential equations that describe the dynamics of a perfectly conductive fluid. The ideal MHD equations provide a simple description of plasma dynamics. The equations are stated as First we have the fluid equations; the continuity Eq. 16 that describes the conservation of mass density ρ; the equation of motion (17) which solves for the fluid velocity v; and the energy equation (18) describing the evolution of the pressure profile p; where Γ = 5/3 is the ratio of specific heats. The second set involves the electromagnetic equations that describe the evolution of the magnetic field B, electric field E, and the current density J; Faraday's law (19), Ampere's law (20), where µ 0 is the permeability in vacuum, and Ohm's law (21).
The goal here is to solve for ideal MHD instabilities that occur in a magnetically confined cylindrical plasma with coordinates (r, θ, z). The CBC-GWRM is applied to the linearised ideal MHD equations about an equilibrium, which consequently consists of 14 (7 real and 7 imaginary) coupled partial differential equations. A perturbation ∝ exp[i(mθ + kz)] is introduced, where m and k are the azimuthal and transverse mode numbers, respectively.
The inner boundary conditions need to be handled with some care to avoid singularities at r = 0 [17]. The outer boundary r = 1 condition is that of a perfectly conducting wall, hence the radial variable components are set to zero at the wall. The equilibrium chosen is a simple screw-pinch equilibrium (in normalized variables), The equilibrium was then perturbed using m = 1 and k = 5. The time evolution of the perturbed radial velocity and pressure is shown in Figures 7a and 7b. Initially the perturbed variables are dominated by a host of different waves, which is why the simulation has to advance far enough for the dominating unstable mode to become distinguishable. The growth rate obtained from the CBC-GWRM is compared to that of a shooting code developed in [4]. The CBC-GWRM calculated a growth rate ω CBC = 0.839 whilst the shooting code obtained a growth rate ω B = 0.840. For higher accuracy it is advised to use multiple time intervals with fewer temporal modes. Also, this linearised MHD test shows the advantage of using a Chebyshev spectral ansatz because of its ability to average over small scale dynamics. The initial waves that propagate are of no interest, so they are averaged out, which leaves the dominating unstable mode.
4. Discussion. The CBC-GWRM has several favourable properties; 1) rapid spectral convergence and high accuracy is achieved in both time and space, 2) it uses real-valued Chebyshev polynomials having the useful mini-max property, 3) sparse matrix methods can be used for solving the coefficient equations, 4) the use of reduced global coefficient matrix equations where only the internal and external boundary conditions need be solved, and 5) the subdomain private equations can be parallellized. The first three points are common properties of time-spectral methods, while this work is focused on points 4 and 5. Regarding 4), in the example problems of this paper the number of operations is reduced by a factor ∼ (2/L) 3 and the memory usage by ∼ (2/L) 2 since only the internal and external boundary conditions are solved globally. The "2" comes from the fact that only the two highest spatial Chebyshev modes are allocated for the two boundary conditions employed.
It was found that the CBC-GWRM gained quite a substantial amount of computational speedup when the ∂x P /∂x BC variables were computed numerically rather than analytically. This is shown in Figure 6a. Either forward difference or center difference can be employed. It is recommended to use forward difference for optimal efficiency, whilst achieving similar accuracy.
We can find a scaling law estimate for the efficiency of the CBC-GWRM. The number of private equations for one physical equation in each subdomain of, for example the Burger equation is (L − 1)(K + 1). The private set of equations need to be solved twice for each common boundary condition and temporal mode. The total number of operations is then 2N s (K +1)(L−1) 3 (K +1) 3 when a single PDE is solved. The number of global common boundary condition equations are 2N s (K + 1). The standard GWRM scheme with sparse band matrix algorithms scales approximately as N 1.4 s . The same procedure is used to solve the BC equations in the CBC-method. Thus the ratio of the total number of operations for the GWRM-CBC and GWRM methods is From Eq. 23 it can thus be seen that the CBC-GWRM is more efficient than the standard GWRM when N s is large. For example; L = 5 and K = 5 gives the criterion that N s 27 for the CBC-GWRM to be more efficient. This is an important result since many fluid dynamics applications suggest N s > 100. The scaling law Eq. 23 only takes into account the amount of operations it takes to solve the total amount of algebraic equations for both the CBC-GWRM and the standard GWRM, i.e. excluding all overhead operations, such as calculating the Jacobian matrices; see Figure 9 to see actual CPU times consistent with the scaling argument in Eq. 23. The effect of the first term in Eq. 23 can be reduced substantially, employing parallellization. Thus, the limiting dependence (2/(L + 1)) 3 can be approached. It is not surprising that the unoptimized CBC-GWRM, without parallel implementation, is slower in some cases than its global GWRM counterpart since the CBCscheme requires extra overhead operations. It should be noted that the CBC-scheme will outperform the global GWRM in all cases once a sufficient amount of subdomains is used, parallelized or not (see Figure 6a). The work done in each subdomain can be allocated to multiple processors, which would decrease the computation time. Figure  10 shows how the CPU time, including overhead operations, for the linearised Burger equation, scales with the number of subdomains N s ∈[10, 60] for the CBC-GWRM, the standard GWRM with all equations solved globally, and the CBC-GWRM adjusted for approximate parallelizable speedup gains. The standard GWRM outperforms the CBC-GWRM in this simple case. However, when approximations of the parallel speed-up gains are made we see that CBC-GWRM could potentially outperform the standard GWRM, whilst featuring less memory consumption. The estimation of the parallelizable time was made from the speed-up gains ascertained from Amdahl's Law [1]. Amdahl's law is formulated as S = 1/[(1 − p) + p/s], where S is the speed-up gain, s is the proportion of the code that can be computed in parallel, and p is the unoptimized computation time of p. If parallelism is accounted for the difference in CPU time can be resolved and improved upon (see Figure 10). The avenue of parallelizable subdomains grows even more advantageous when studying the ideal MHD equations. For the ideal MHD case approximately 95% of the code can be parallelized, which according to Almdahl's Law gives a speedup > 12 if more than 32 processors are used and it saturates at speedup = 20 for cores numbering 2048 and higher [1].
To close, it has been widely held that time-spectral methods are not as effecient as their finite difference counterparts. While spectral methods do feature higher accuracies, the idea of lacking efficiency has been challenged by such methods as the GWRM and the spectral deferred correction method. The spectral deferred correction method still, most commonly, relies on implicit and explicit finite difference methods to create a crude initial approximation, which is then corrected. This is achieved by using spectral integration in-between the finite time steps which is included to correct the crude approximation. This makes the spectral deferred correction method a high order method, requiring longer computation times [12,13,29].

5.
Conclusion. The CBC-GWRM solves a set of initial-value ordinary or partial differential equations in the temporal and spatial domains with a time-spectral weighted residual method. This allows the method to obtain high accuracy and to efficiently average over small scale dynamics, as seen in the modelling of the linearised Burger and forced wave equations. Before this work the GWRM solved all subdomains simultaneously from the global set of algebraic equations. Our goal was to break down the problem into smaller pieces for enhancing efficiency. The CBC-method provides a solution to this problem.
For the 1D linearised Burger equation the CBC-GWRM was compared to finite difference methods. At low accuracy the implicit finite difference scheme outperformed both the explicit and the time-spectral methods. The maximum error, however, of the CBC-GWRM solution scales as ε GW RM ∼ t −3.47 CP U , compared to ε LW ∼ t −0.6

CP U
and ε CN ∼ t −0.9 CP U . This allows CBC-GWRM to be 30% faster than Crank-Nicolson for accuracies ε∼10 −5 . The approximated parallellized CBC-GWRM CPU time also scales as N 1.1 s , with increasing number of subdomains, which is an improvement from the standard GWRM N 1.4 s scaling. The ideal MHD equations were solved within 0.1% error for the instability growth in a screw-pinch, which agrees well with previous simulations [24]. It was able to reach this accuracy using one time interval, making the CBC-GWRM highly competitive when solving for slow unstable mode growth rates. This shows that the CBC-GWRM is capable of solving complex physical systems that are relevant for fusion plasma physics, whilst efficiently resolving the temporal domain.
It is found that the CBC-GWRM is more efficient than the global GWRM if the ∂x p /∂x BC variables are computed numerically. This allows the CBC-GWRM to be more competitive when many subdomains are used. A clear example of this can be seen when comparing the GWRM subdomain schemes for the linearised Burger equation. Given the same parameters and sixty subdomains the CBC-scheme uses approximately 60% less memory than the global subdomain scheme. More importantly, the CBC subdomain scheme can be parallelized so that the full potential of multiple CPUs and GPU acceleration can be harnessed. For large problems in fluid dynamics and MHD the CBC-scheme consists of 90-95% of potentially parallelizable code, making high speedup gains possible.