A static memory sparse spectral method for time-fractional PDEs

We introduce a method which provides accurate numerical solutions to fractional-in-time partial differential equations posed on $[0,T] \times \Omega$ with $\Omega \subset \mathbb{R}^d$ without the excessive memory requirements associated with the nonlocal fractional derivative operator. Our approach combines recent advances in the development and utilization of multivariate sparse spectral methods as well as fast methods for the computation of Gauss quadrature nodes with recursive non-classical methods for the Caputo fractional derivative of general fractional order $\alpha>0$. An attractive feature of the method is that it has minimal theoretical overhead when using it on any domain $\Omega$ on which an orthogonal polynomial basis is already available. We discuss the memory requirements of the method, present several numerical experiments demonstrating the method's performance in solving time-fractional PDEs on intervals, triangles and disks and derive error bounds which suggest sensible convergence strategies. As an important model problem for this approach we consider a type of wave equation with time-fractional dampening related to acoustic waves in viscoelastic media with applications in the physics of medical ultrasound and outline future research steps required to use such methods for the reverse problem of image reconstruction from sensor data.

1. Introduction.The Caputo derivative [13] of order α / ∈ N is a fractional generalization of the conventional integer order derivative and appears in many mathematical models in the natural science and medical applications [78,79,46,28,22,71,14,10].There are multiple non-equivalent generalizations of the fractional derivative which have received the attention of mathematical research.In this paper, we focus on Caputo type (as opposed to e.g.Riemann-Liouville type) fractional derivatives exclusively.In the general case of non-integer α the Caputo derivative is nonlocal and computing numerical solutions to fractional differential equations frequently more closely resembles integral rather than differential equation solvers.Treated naively, the nonlocality forces us to use an approach in which the history of the function at each point in time t must be stored -in principle indefinitely.Numerical methods have been considered which only retain a finite but hopefully still sufficiently large section of historical data to remain close to the true solution but methods relying on 'short memory' typically have poor approximation properties, cf.[60,25].Since fractional differential equations appear frequently in models used in the natural sciences (some of which we will mention below) where the desired spacetime dimension is usually 3 + 1 there has been a great deal of interest in the literature in mitigating the memory accumulation problem inherent to time-fractional differential equations.In this paper we introduce a nonclassical recursive and static memory ('history-free') method for time-fractional partial differential equations on any arbitrary dimensional domain on which sparse spectral methods for classical PDEs are available, with minimal theoretical overhead when adapting it to new problems and domains.Sparse spectral methods for fractional calculus problems of various kinds have recently been successfully used in a number of different problems: A sparse spectral method using a non-orthogonal sum space basis for ODEs involving Caputo or Riemann-Liouville fractional derivatives was introduced by Hale and Olver in [37].More recently, a spectral method for such ODEs was proposed by Pu and Fascondini in [61] which improved upon the stability of Olver and Hale's method by relying on specially constructed orthogonal functions at the cost of no longer resulting in sparse systems.Despite their efficiency and accuracy, neither of these spectral methods allow memory free computations for time-fractional PDEs, which is a potentially critical limitation for real world applications in higher dimensions.A relevant model of such an equation which we consider in some detail in this paper is a lossy wave equation model for acoustic waves in viscoelastic media as e.g.discussed in [79,78] for the purposes of developing medical ultrasound treatment devices.Colbrook and Ayton [18] recently introduced a competitive contour method for time-fractional PDEs based on taking inverse Laplace transforms.Among the advantages of Colbrook and Ayton's method are tight error control and excellent convergence properties.As a drawback one must be able to make a suitable choice of contour which requires knowledge about the singularities of the involved Laplace transform.The primary motivation of this paper is the development of efficient and accurate solvers for the above-mentioned viscoeleastic fractional wave equation [79] (see also [78]) with applications in medical ultrasound treatment devices: The main domain of interest for this problem is a union of three-dimensional ball and stacked spherical shells of varying thickness as an initial model for the human skull and brain.State-of-the-art methods for this problem, cf.[79,17], replace the timefractional derivative with a spatially nonlocal fractional Laplacian (−∆) α [42,19].While this results in a history-free method, it also introduces an inherent systematic approximation error prior to any considerations of error in the numerical solver.While the full multi-domain model presents additional difficulties which we intend to address in a follow-up paper, we discuss a closely related stand-in on the two-dimensional disk in the numerical experiments in Section 7.5 to demonstrate our method's applicability for this class of problems.This paper is structured as follows: In Section 2 we give a brief introduction into the theory of and recent advances made in sparse spectral methods.Section 3 provides a brief history of nonclassical methods for Caputo derivatives.Section 4 shows how to construct recursive history-free sparse spectral methods for time-fractional Caputo PDEs.In Section 5 we consider the question of how much stored data the introduced static memory method requires.Finally, we give an error analysis of the method in Section 6 and present five detailed numerical experiments in Secton 7 which discuss in practice properties such as error, stability as well as applicability in higher dimensional domains.We conclude with an outlook on future research directions.
2. Sparse spectral methods.In this section we introduce and discuss the building blocks of a class of numerical algorithms for solving differential and integral equations called sparse spectral methods.The specific type of spectral method we describe in this paper belongs to the class of ultraspherical spectral methods as described in a 2013 paper by Olver and Townsend [53], where the 'ultraspherical' is a reference to the ultraspherical polynomials (aka Gegenbauer polynomials) used therein.A more recent thorough treatment of these methods can be found in [52].
The basic idea behind these methods is to expand functions in bases of orthogonal polynomials in such a way that the linear operators appearing in the problem, e.g.differentiation and integration, have banded matrix forms.We call a complete set of degree-ordered polynomials {p j (x)} j∈N0 orthogonal with respect to a positive Borel measure µ(x) if the polynomials satisfy where δ nm is the Kronecker delta and c nm are constants independent of x.Throughout this paper we adopt the notation The advantages of this notation is that polynomial expansion of a sufficiently wellbehaved function in an orthogonal polynomial basis can be written concisely as In a computing context these function approximations are truncated at finite degree K.In addition to the above characterization, it is natural to speak of orthogonal polynomials as living on a characteristic domain Ω = supp(µ) determined by the support of the measure.For our purposes the measure µ(x) will always have an associated density function such that we have dµ(x) = w(x)dx where dx is the Lebesgue measure and we call w(x) the weight function or simply the weight associated with the orthogonal polynomials.Classical examples of orthogonal polynomials are the well-known Legendre, Chebyshev, Gegenbauer, Jacobi, Laguerre and Hermite polynomals.Of particular importance for spectral methods are the Jacobi polynomials P (a,b) (x), a two-parameter family of polynomials (with Legendre, Chebyshev, Gegenbauer as special cases) which are orthogonal with respect to the weight function Finally, we comment on how to solve integro-differential equations using sparse spectral methods.First, we note that differentiation as well as integration are known to satisfy recurrences and are thus banded operators when mapping between appropriate polynomial spaces, see e.g. the explicit forms given in [51, 18.9].As an explicit example we consider the entries of D for Jacobi polynomials and observe that the banded first derivative acts a shift operator between bases as follows [51, 18.9.15]: Obtaining banded derivative operators requires careful choice of target domain and may involve the use of banded conversion operators [51, 18.9] elsewhere in the equation.We will see examples of how this is done in practice in the numerical experiments of this paper.To solve differential equations one usually also requires a way to enforce boundary conditions.In the context of sparse spectral methods this is usually achieved by appending point-evaluation functionals, which take the form of row vectors acting on coefficients, to the linear operator and the corresponding boundary values to the right-hand side, cf.e.g.[53,29].Alternatively one can also build the boundary conditions directly into the basis, cf.[54].
In Section 7 we describe numerical experiments of our method on triangle and disk domains.We thus note that the discussions in this section can be straightforwardly generalized to higher dimensional domains by using multivariate orthogonal polynomials P(x), cf.[54].In general higher dimensional domains each variable x, y etc. has its own associated block-banded multiplication operator X, Y and so on and PDEs with non-zero boundary conditions may be solved using restriction operators analogously to the use of point evaluation operators mentioned above.We close this section with an overview of recent progress on these banded spectral methods on various domains of interest: Since the seminal report on ultraspherical spectral methods in [53] a flurry of papers have been published to expand the range of problems that they can be applied to by deriving recurrence relationships which result in banded operators for various types of problems, see e.g.[66,30,34,33,59,36,31].A directly related line of recent research has concerned the construction of efficiently computable orthogonal polynomial bases on various multivariate domains for the purpose of using them in such sparse spectral methods, recent successful examples of which include disks [85], triangles and simplices [54,55,6], disk slices and trapeziums [67], spherical caps [68], wedges [56], surfaces of revolution [57] as well as quadratic and cubic curves [24,58].For a recent reference on multivariate orthogonal polynomials on classical higher dimensional domains such as balls, simplices and cubes we refer to the monograph on the subject by Dunkl and Xu [23].Finally, we refer to [35] for a detailed discussion of computational aspects relating to constructing orthogonal polynomials with respect to non-classical weight functions, which can be used to construct orthogonal polynomials on non-classical domains such as annuli and spherical bands.
3. Previous work on non-classical methods for Caputo derivatives.The method we describe in this paper belongs to a family of non-classical numerical algorithms for time-fractional differential equations whose origin can be traced back to a 2002 paper by Yuan and Agrawal [92].In this section we provide an overview of the development of these methods without claiming historical completeness.Put concisely, Yuan and Agrawal made use of the following result: Then the Caputo fractional derivative of f can be expressed as where the function Furthermore, for fixed w > 0 the function φ f (w, t) satisfies the (non-fractional) differential equation with initial condition φ(w, 0) = 0.
Strictly speaking Yuan and Agrawal themselves only proved this result for 0 < α < 1, with an extension for 1 < α < 2 being provided by Trinks and Ruge [83] in the same year.The fully general version we reproduced above was proved by Diethelm in 2008 [20].The Yuan-Agrawal method for solving equations involving the Caputo fractional derivative computes φ f (w, t) via the differential equation (3.1) and then uses Gauss-Laguerre quadrature for the half-infinite integral.As all of the resulting terms then only depend on the functions f and φ f at the previous time step this eliminates the need to store the past of f in memory in exchange for the additional computational effort of computing φ f and its half-infinite integral.A similar way of rewriting the Caputo fractional derivative using a slightly different integrand function φ * (w, t) was introduced by Chatterjee [15] and also later analyzed by Diethelm [20] but we will not further discuss this alternative in this paper.The Yuan-Agrawal method initially received mixed responses in the literature due to its poor convergence properties [65,45], specifically in relation to the number of quadrature points required for the Gauss-Laguerre step, but also resulted in a number of papers being written with the specific intention of improving it.Lu and Hanyga split the half-infinite integral into a region of [0, 1] and [1, ∞) and used Gauss-Jacobi and Gauss-Laguerre respectively to improve the convergence behavior of the method [39,44,45].A full explanation of the poor convergence property and error analysis was later given by Diethelm [20].We also note a more recent exploration of positive and negative examples of these non-classical approaches in [43].In [20], Diethelm argued for a modification of the Yuan-Agrawal method which substantially improved its convergence properties by using a transformation of the function We note the connection of this idea to Cayley transforms which map the right half plane to the unit disk, cf.[12].In one dimension this changes the integral on the half-infinite Laguerre domain to an integral on the Jacobi domain: ). Diethelm's method thus uses Gauss-Jacobi quadrature instead of Gauss-Laguerre quadrature which as analyzed in [20] substantially improves the method's convergence in terms of the number of quadrature points needed.In 2010 Birk and Song [8] published a paper which provides an alternative point of view on the Yuan-Agrawal and Diethelm methods.Rather than working in the time domain, Birk and Song derive a frequency domain non-classical method for Caputo derivatives.It is widely known, see e.g.[64,84,8], that the Caputo derivative has the Fourier multiplier form if ∀t < 0 : f (t) = 0, with the Fourier transform defined by Birk and Song [8] show that we can express (iω) α by the following integral: where α := 2α−2 α +1.Note that α ∈ (−1, 1) for any non-integer order of fractional differentiation α > 0, α / ∈ N.This result has a natural interpretation as the frequency domain equivalent of the time domain Yuan-Agrawal form, cf.[8, Appendix A].Birk and Song [8] note that the approaches of Yuan-Agrawal and Diethelm can both be understood as resulting from a quadrature rule being applied to Equation (3.4): with A j , s j taking different values depending on whether an L-point Gauss-Laguerre or Gauss-Jacobi quadrature is used, see Table 1.The third row in Table 1 corresponds to an alternative Gauss-Jacobi quadrature option discussed by Birk and Song in [8] which they show has advantages in some frequency domains and disadvantages in others.

method
quadrature type A j s j Yuan-Agrawal [92] Gauss-Laguerre Table 1: Parameter values for the L-point Gauss-Laguerre and respectively Gauss-Jacobi methods corresponding to Yuan-Agrawal's, Diethelm's and Birk-Song's methods in Equation (3.5).Note that λ j and p j in each row denote the appropriate weights and abscissae of the quadrature methods, cf.[8, Section 4.2].
4. Towards a recursive non-classical sparse spectral method.While the approach discussed in the previous section eliminates the requirement to memorize all previous states of the function f (t) to which we apply the Caputo derivative, this is achieved by introducing additional internal variables.A recursive way to alleviate this drawback was suggested by Birk and Song [8].They use Equation (3.3) and the inverse Fourier transform of the rational approximation in (3.5) to obtain We note that one can arrive at an equivalent expression starting from Yuan-Agrawal's approach without needing to pass through frequency space.The integrals under the sum, i.e. ψ j (t), are readily found to satisfy the following recurrence relationship when advancing by a time step ∆t: Note that we have ψ j (0) = 0 and the right-hand side of (4.3)only depends on the integral which was already computed in the previous time step as well as an incrementing term without long term memory of either f (t) or ψ j (t).Birk and Song show in [8] how the results discussed until this point can be used to solve ODEs involving the Caputo derivative: For simplicity we use α ∈ (0, 1) for the remainder of the paper such that as the general α method becomes somewhat obfuscated by notation.Generalizations for higher orders of α are straightforward to obtain.Approximating f (t) via linear interpolation during each time step, i.e.: one readily obtains the following approximation from (4.3): By the above discussion we thus have Remark 4.1.Birk and Song [8] refer to the linear interpolation in (4.4) as 'trapezoidal rule' but while linear interpolation is behind the trapezoidal rule applied to this is not equivalent to applying the trapezoidal rule to integrate Naturally the linear interpolation can be replaced with a higher order interpolation to obtain a higher order method at the cost of storing additional function values.In fact any numerical method capable of numerically integrating the right-most term of (4.3) can be substituted in this approach.In this paper we restrict ourselves to using linear interpolation for notational simplicity but recommend using higher order methods in step size sensitive applications.Since our aim is to derive a memory free solver for PDEs featuring time-fractional Caputo derivatives the minimal problem of interest involves ∂ α ∂t α f (t, x), where f now depends on both space x ∈ Ω ⊂ R d and time t ∈ [0, T ].To obtain a method in which each time step involves the solution of a sparse linear system, while simultaneously providing excellent spatial convergence properties we choose to combine Birk-Song's recursive method with sparse spectral methods as described in Section 2. At any fixed point in time n∆t we represent the function f (t, x) and the L auxiliary node functions ψ j (t, x) via their coefficient vectors in an orthogonal polynomial basis P(x) on the desired domain, i.e.: We store only the current states of these functions such that the coefficient vectors f n and ψ n j approximating f (n∆t) and ψ j (n∆t) are overwritten in each step.By Equation (4.6) we can thus approximate the Caputo derivative (with α ∈ (0, 1)) at a point in time n∆t as with the following recurrence for the auxiliary function coefficients: Note that the domain Ω ⊂ R d determines what constitutes a suitable choice of basis functions (which in principle need not be orthogonal polynomials or in fact polynomials at all) and that the method can thus be used in arbitrary dimensions as long as a basis suitable for computation is available on the desired domain.Canonical higher dimensional domains such as higher dimensional cuboids can be treated using tensor products of Jacobi polynomials while arbitrary dimensional ball domains can make use of higher dimensional generalizations of Zernike polynomials.For further examples we refer to the literature overview in Section 2.
To ensure that each step is a sparse linear solve, the left and right-hand side orthogonal polynomial bases must be chosen carefully such that any other linear operators appearing in the fractional partial differential equation are accounted for consistently, typically also requiring sparse conversion operators in the process.Due to the many possible combinations of linear operators one may encounter when dealing with fractional PDEs, we show the procedure of ensuring sparsity in various cases in action in the numerical experiments in Section 7.3.

5.
Memory cost of the history-free method.In this section we investigate how much memory the introduced static memory method requires.We focus only on the memory cost strictly associated with computing the Caputo fractional derivative as the exact cost of the method will otherwise naturally vary depending on the other present differential or integral operators.We count the minimal required values in storage for a solution loop in a diagram in Figure 1, where K is the length of the coefficient vectors in the orthogonal polynomial basis of choice and L is the number of quadrature points chosen in Eq. (4.7).The number of values required in storage is found to be L(2 + K) + 2K and we observe that this memory requirement is entirely independent of the number of time steps taken to reach the final desired time T = N ∆t.We also note that for most problems of interest one expects L N .In practice the method's efficiency and accuracy can benefit from keeping some additional repeatedly used static values in memory but this additional cost is also O(L) rather than O(N ).
Fig. 1: Diagram of memory cost associated with solving the Caputo derivative part of a fractional PDE using the method described in this paper, where K is the length of the coefficient vectors in the orthogonal polynomial basis of choice and L is the number of quadrature points chosen in Eq. (4.7).The number of values required in storage is L(2 + K) + 2K.
6. Error analysis.An error analysis of the non-recursive Yuan-Agrawal and Diethelm methods for ODEs was given by Diethelm himself in [20] -in this section we provide an error analysis for the recursive approach to time-fractional Caputo derivatives including resolving spatial dependence in a basis of orthogonal polynomials.We only discuss the error directly incurred from the computation of the Caputo derivative in this section, keeping in mind that for a particular fractional PDE the error will also be affected by what method we use to deal with the remaining differential or integral operators.For the sake of brevity we also only discuss Diethelm's quadrature rule and fractional orders 0 ≤ α ≤ 1 but note that analogous results for the Birk-Song quadrature rule and higher orders can be derived in the same way.
Furthermore, let φT,x (κ) := φf (κ, T, x) with φf defined in Eq. (3.2).If φT,x ∈ C 2L [0, T ] then the quadrature error incurred from Diethelm's method as described in Section 3 is given by for some ξ ∈ (a, b) where •, • is the inner product with respect to the appropriate orthogonality weight.To use this result we have to translate our compact notation into a sligtly more cumbersome one: Using Table 1 and the definitions in Section 3 we find that for Diethelm's method The stated error expression follows as an immediate corollary.
Definition 6.2.We denote by P K the canonical projection operator corresponding to truncation-to-degree-K, i.e.: Definition 6.3.We denote by err P,K (g(x)) the error incurred from truncating the polynomial expansion of a function g in the basis P(x) at degree K, i.e.: err P,K (g(x)) = |P(x)g − P(x)P K g|.
While for sufficiently nice functions one generally expects spectral convergence in the truncation degree when expanding functions in orthogonal polynomials, cf.[82,Chapter 21] and [11,Section 2.3], more generally the study of these truncation errors is an active field of research in its own right, see [38,87,88,95,91,89,86] and the references therein.We will make no attempt to fully cover the range of different error bounds for all the various different basis choices and domains our method can be applied with and instead settle for an abstract result in which the error caused by the polynomial approximation in space is separated from the time-fractional Caputo contribution.
We assume the quadrature-related constants A j and s j are obtained with negligible error and define the following non-negative functions: Then the error incurred from using the recursive degree-K-truncated spectral method as described in Section 4 to compute the time-fractional Caputo derivative of order 0 ≤ α ≤ 1 of a function f (t, x) at time T using Diethelm's quadrature rule and a uniform time grid of step size ∆t = T N is bounded by Proof.Using the triangle inequality we obtain Note that the first term, |P(x)f N α − P(x) L j=1 A j ψ N j |, now describes an error contribution at each spatial point x which is solely incurred due to the recursive timefractional Caputo derivative method.The second term is simply the approximation error of truncating the polynomial expansion of the function L j=1 A j ψ j (T, x) in P(x) which we leave abstract.To obtain a more explicit bound for the first term we use Lemma 6.1 to observe that the error added to the series approximation in (4.1) caused by perturbations δ j (x) in the auxiliary functions ψ j (t, x) is for some ξ ∈ [0, T ].From this we obtain the error bound We thus seek a bound on |δ j (x)| caused by the approximation in (4.5).Comparing (4.3) with the approximation in (4.5) we find that the δ j are the errors caused by integrating the derivative of the linear interpolation.If f (x) is in C q+1 [a, b] and p f,q (x) is its interpolating polynomial of degree ≤ q at q + 1 points x 0 , ..., x q ∈ [a, b], then the remainder term of the interpolation is given by [51, 3.3.11] for some ξ ∈ [a, b].For our case of taking derivatives of the linear interpolation in each time step we thus obtain that there exists a ξ ∈ [(n − 1)∆t, n∆t] such that Since for all τ ∈ [(n − 1)∆t, n∆t] we have | ∆t+2(τ −n∆t)

2
| ≤ ∆t 2 this yields the error bound In a single time step the error in the integral is thus bounded by resulting in the following error bound for δ j (x): Combining this with the above-derived error bounds yields the stated result.
Remark 6.5.We note that similarly to what was discussed in [69, Section 3.7], such error bounds for quadrature-type methods tend to not be useful for error estimations in actual applications as they include difficult to compute or estimate high order derivative terms.A more practical approach to estimating errors in an application is to check convergence with higher precision results.Nevertheless, the derived error bounds suggest a sensible convergence strategy: Prioritize convergence in L, while keeping L∆t constant, then adjust ∆t further if required once convergence in L has been achieved.We verify that this is a sensible convergence strategy in practice in the numerical experiments in the next section.
As mentioned before, it is straightforward to see from the proof of Proposition 6.4 that choosing higher order accuracy integration methods for the integrals in the recurrence instead of using (4.4) would also lead to a higher order in ∆t method for the Caputo derivative.
7. Numerical experiments.We present five numerical experiments in this section which respectively explore the following subjects: 1. Recursive nonclassical computation of time-fractional Caputo derivatives.2. Stability of the recurrence relationship for the auxiliary functions ψ j .
3. A toy model PDE with an analytic solution in one dimension.4. A time-fractional heat equation on a triangle.5.A wave equation dampened by a time-fractional term on a disk.When discussing error compared to reference solutions we will be considering the pointwise absolute and relative errors and indicate which is being considered.We furthermore use Birk-Song's quadrature rule as described in Table 1 throughout this section.As accuracy in the Gaussian quadrature related values A j and s j is critical for the method and these values may be precomputed for a given quadrature type without having to ever be re-computed (or one can use lookup tables), we initially use 128-bit floating point numbers and then project them down to 64-bit precision before starting the recurrence loop -all other values in the numerical experiments are generated and stored as 64-bit floating point numbers.
The implementation used for the numerical experiments in this section made use of the open source Julia [7] packages MultivariateOrthogonalPolynomials.jl [5], Fast-GaussQuadrature.jl [2] as well as ApproxFun.jl[1].
7.1.Experiment 1: Direct computation of Caputo derivatives.While constructed for the purpose of solving differential equations, the recursive non-classical approach can nevertheless straightforwardly be used to numerically approximate the Caputo derivative of a known function at a given point.In this section we compare some cases with known explicit Caputo derivatives with numerical approximations obtained via (4.6).This idea was also explored by Diethelm in [20] but his numerical experiments did not include error plots and were limited by small quadrature point counts, a problem which we are unencumbered thanks to [9,27,75] and implementations such as FastGaussQuadrature.jl [2], cf.[73].We make use of the following explicitly given Caputo derivatives for these numerical experiments, cf.[40,Sec. 3.1]: where E a,b (z) denotes the two-parameter Mittag-Leffler function [48,90], which is defined by (cf.[51, 10.46.3]): We furthermore have (cf.[21,Chapter 4]) which via (7.2) yields In Figures 2 and 3 we show error plots comparing recursively computed numerical approximations to the explicitly known solutions in (7.1) and (7.3) with varying time step sizes ∆t and quadrature nodes L respectively.Note that as suggested by the error bounds in Section 6 we observe linear improvement in accuracy in ∆t once sufficient convergence in L has been achieved.

Experiment 2: Stability of the auxiliary function recurrence.
A remaining concern with the proposed method may be that floating point arithmetic can cause recurrence relationships to be unstable.In this section, we thus provide some numerical experiments in which we compare full domain numerical integration in (4.2) to the recurrence approach in (4.8) to see how this error behaves in practice, using the explicitly known solutions of the previous numerical experiment as a template.Fig. 4: Maximal-in-j absolute errors at each time step between auxiliary functions ψ j after n time steps of indicated size ∆t computed via the recurrence and full domain quadrature for the problem in (7.3) with L = 30 quadrature points using the Birk-Song quadrature rule.For a high n endpoint experiment see Figure 5. Fig. 5: Absolute errors between auxiliary functions ψ j , with j on the x-axis, after 10 million time steps of indicated size ∆t computed via the recurrence and full domain approach for the problem in (7.1) with L = 30 quadrature points and α = 1 2 using the Birk-Song quadrature rule.For an error over time experiment see Figure 4.
Figures 4 and 5 show the error between the recurrence and full domain approaches to computing the auxiliary functions ψ j .Consistent with the error analysis in Section 6, we observe that if L is low, reducing ∆t does not improve the error.As suggested by the error bounds discussion in Section 6, one should thus increase L to convergence before excessively refining the step size ∆t.

Experiment 3:
A time-fractional PDE with known solutions.In this section we present numerical experiments for time-fractional PDEs with constructed analytic solutions.The following time-fractional equation of motion is an ODE arising in the massless one-dimensional Kelvin-Voigt model: where c > 0, k > 0 and the external force g(t) is given by This equation has the following known analytic solution, cf.[65,20,8] Equation (7.4) was part of a critique of non-classical methods due to Schmidt and Gaul [65] and was also addressed again by Diethelm in [20] as well as by Birk and Song in [8].With its known explicit solution and relevant historical role in early papers on nonclassical methods this equation has become a popular toy model for numerically evaluating the accuracy of fractional Caputo differential equation solvers.
In this section we use it to derive a fractional PDEs with known solutions to use as a toy model to test the PDE approach with before moving on to more involved problems in the following sections.Multiplying Equation (7.4) by e −x , we readily obtain satisfying f (0, x) = 0 and where we have defined g(t, x) := g(t)e x = 0, if t = 0, e x , if t > 0.
An explicit solution on R can be inferred via the equation's relationship with (7.4): Using Equation (4.7) one can then derive the following history-free nonclassical scheme for Equation (7.6): where the ψ n−1 j are computed recursively via (4.8).As discussed in Section 2, to ensure a banded derivative operator we have to choose the left and right-hand side bases correctly, cf.[51, 18.9].One way to accomplish this for Eq.(7.6) is: where C is the banded conversion operator from the Legendre polynomials P (0,0) (x) to the Jacobi polynomials P (1,1) (x) and f and ψ are expanded in P (0,0) (x).As usual for this type of spectral method, the boundary conditions at each time step can be enforced either by using suitable weighted bases or by appending rows corresponding to the evaluation operator to the top of the LHS as well as the boundary condition values to the top of the RHS, see e.g. the methods discussed in [74,37,29] and the references therein.We opt to use the latter option along with the analytic solution values as both the t = 0 initial and spatial boundary conditions in this numerical experiment.
In Figure 6 we show absolute and relative error contour plots for the problem in (7.Remark 7.1.We note that while in the numerical experiments in this section we let c and k be constants independent of t and x due to the availability of analytic solutions, the nonclassical method we propose in this paper readily accommodates spacetime dependent coefficients by e.g.replacing the scalar coefficient k(t, x) by a multiplication operator K n (X) at time n∆t.If k(t, x) is polynomial in x, this operator does not break the sparsity of the method since polynomial multiplication operators are banded, cf.[52].Well-behaved non-polynomial coefficient functions, while in theory resulting in dense multiplication operators, may in certain circumstances nevertheless decay rapidly off-band and can thus sometimes still be approximated by banded matrices.If k(t, x) = k(x) is independent of t the banded multiplication operator may be stored in memory for significant performance improvements.conditions can be enforced by resolving f (x, y) in the weighted Proriol polynomial basis The Laplacian ∆ then maps W (1,1,1) (x) to P (1,1,1) (x) and a sparse conversion operator C for this basis map is also available, see Figure 8 for sparsity pattern plots.We thus obtain the recursive non-classical method Fig. 8: Sparsity pattern (spy) plots for 100 × 100 blocks of the Laplacian ∆ and conversion operator C appearing in Equation (7.9).See [54] for how to compute their entries.
valid for the heat equation with zero Dirichlet boundary conditions on the triangle domain Ω T and k > 0. Note that the solution obtained via the above scheme is In Figure 9 we plot numerical solutions obtained for the time-fractional heat equation on the triangle with k = 1 500 with fractional order α = 3 5 and initial condition f 0 (x, y) = 100xy(1 − x − y) 2 e −10((x−0.6) 2 +(y−0.2) 2 ) .(7.10) Figure 10 shows the corresponding coefficient decay in the polynomial approximation which can be used as a proxy for whether the desired spatial convergence of a given application has been achieved, cf.[54].Stepping through 1000 time steps with L = 45 and K = 400 takes roughly 143 milliseconds on a Lenovo ThinkPad X1 Carbon laptop with a 12th Gen Intel i5-1250P CPU.7.5.Experiment 5: A wave equation dampened by a fractional term on the disk.We consider a problem closely associated with the motivating model used in [78,79] to describe medical ultrasound devices used on the human brain.As mentioned in the introduction, the equation of interest (cf.[79]) is posed on the three dimensional ball surrounded by multiple matching spherical shells of varying thickness, cf. the sphere head models described in e.g.[49].The presence of the term ∂ α ∂t α (∆f ) as well as the requirement to decompose the domain into matching segments add complications to this problem which we intend to address in a separate paper.In this section we discuss the similar but slightly simplified equation posed on a two-dimensional disk with zero Dirichlet conditions and given initial conditions f (0, x) and ∂ ∂t f (0, x).Note that this stand-in equation also describes power  law attenuation in viscous media -it is in fact a modified version of Szabo's wave equation, see [70,16] which has also been suggested for use in applications, cf.[41].
To our knowledge no non-trivial explicit solutions for this equation are known.
As the problem is posed on a disk domain, we use generalized Zernike polynomials Z (b) (r, θ).Zernike polynomials have a longstanding history [94,93] not just in mathematics research but also in natural science and engineering applications, especially in the field of optics [72,63,47,62,50], and as result a handful of different and conflicting conventions exist for them.The convention we follow for the generalized Zernike polynomials is that Z (b) (r, θ) are orthonormal polynomials on the unit disk with respect to the weight function w(r) = (1 − r 2 ) b , whose radial components are given by orthonormal Jacobi polynomials P(α,β) (2r 2 − 1), i.e.: 2−δm,0 2π cos(mθ) for m ≥ 0, sin(|m| θ) for m < 0.
The Zernike polynomials are traditionally given in terms of polar coordinates but they are polynomials in x and y and generally not polynomials in r and θ.For a discussion of sparse spectral methods using Zernike polynomials, see e.g.[85].For open source numerical implementations we refer to FastTransforms [3,4] as well as MultivariateOrthogonalPolynomials.jl [5].As in the triangle example, zero Dirichlet boundary conditions can be enforced by choosing an appropriate weighted basis to resolve the solution function at each time step f (n∆t, r, θ).Using a simple backward finite difference approximation for the second derivative we then find the following scheme for Equation (7.11): with approximate solution given by Higher order schemes may be obtained in the straightforward way.As in the triangle experiment, we plot a sparsity pattern plot for C in Figure (11) but omit the spy plot for the Laplacian ∆ on these bases since it is in fact simply diagonal when mapping from W (1) (r, θ) to Z (1) (r, θ).
Remark 7.2.The second derivative in t requires access to at least two previous points in time, one of which can be re-used in the Caputo part of the equation.As a result, while the minimum total memory requirement discussed in Section 5 still holds, the memory increase from adding a Caputo term to a wave equation is K less than said minimum since f n−1 is a vector of length K and need not be stored twice.Fig. 11: Sparsity pattern (spy) plot for 100 × 100 block of the conversion operator C appearing in Equation (7.12).See e.g.[85] for how to compute its entries.
We plot a numerical solution to Equation (7.11) with parameters τ = 1, c = 100, α = 1 2 and initial conditions on the disk in Figure 12.Four animations of solutions to this problem for various parameters and initial conditions, including the one pictured in Figure 12, have been made available via FigShare, see [32].
In Figure 13 we plot an example of an initial condition along with a circular array of 70 sensors and the data these sensors read out over time.State-of-the-art software such as the Matlab package k-wave, cf.[77,80,76,81], can use data from such sensor arrays to approximately reconstruct source images with natural applications in medical ultrasound as well as photo-acoustic imaging.This section serves as a proof of concept that a recursive sparse spectral element method may be applicable to such problems in the future -we discuss additional research required to reach that point in the next section.
8. Discussion & Future Research.In this paper we have suggested a nonclassical sparse spectral approach to solving PDEs involving Caputo fractional derivatives.Among the method's core strengths are its recursive and history-free design, allowing computations that would otherwise be extremely challenging, and the reliance on sparse spectral methods in space resulting in banded operators which can be efficiently solved in each time step.The generic, almost 'plug-and-play', nature of the approach means that it can easily be adapted to any domains on which sparse spectral methods are viable, i.e. all domains on which suitable sets of orthogonal polynomials and derivative operators are available.We demonstrated this by presenting numerical solutions to a time-fractional heat equation on the triangle and a time-fractional modification of the wave equation on the disk.As seen in the numerical experiments, the method does not require excessively many quadrature points to achieve good accuracy, with time step size ∆t becoming the primary error contributor once convergence in the number of quadrature points is achieved.This suggests that the approach outlined in this paper may in practice benefit significantly from replacing the various time stepping aspects of the method with higher order in time methods in a straightforward manner.In future research we intend to adapt this method to work for the full equation of interest 1 c 2 0 ∂ 2 ∂t 2 f − ∆f − τ ∂ α ∂t α ∆f = 0, which will require memory and computationally efficient extensions of this method for computing terms like ∂ α ∂t α (∆f (t, x)) as well as a discussion of how to join up different domains with matching boundary conditions -in this case combining a core ball cell with several layers of surrounding spherical shells of varying thickness.Applying such methods for the time reversed problem from finite sensor array readouts, as suggested with Figure 13 and the Matlab package k-wave, is of significant interest in applications.This functionality would require the use of theoretical results on the time-reversal of the discussed equations as well as an implementation of absorbing boundary conditions to prevent noise coming back from the domain boundary.To sensibly use absorbing boundary conditions with the present method, it is natural to e.g.use a spectral elements approach and enforce rapid decay in the outer-most element.We intend to discuss image reconstruction using a spectral element variant of this method in detail in a future paper.While the above points remain a challenge for now, the introduced method's minimal theoretical overhead when changing domains suggest it is a good candidate for such multi-domain problems.Advances in this direction should be compared with the current state-of-the-art methods for solving these equations, which involve replacing the temporal nonlocality with a spatial nonlocality as discussed in [79] (as implemented in k-wave).

2 Fig. 2 : 2 Fig. 3 :
Fig. 2: Relative errors between analytic and recursively computed Caputo derivatives of indicated f (t) with number of quadrature points L for various time step sizes ∆t.

Fig. 6 :Fig. 7 :
Fig. 6: Contour plots of the absolute and relative errors for the fractional PDE in Equation (7.6) with k = 10, c = 100 on the spacetime box [0, 1]×[−1, 1].The legend is logarithmic, indicating the order of magnitude of the errors.The numerical solution was computed with L = 50 quadrature points, approximation degree K = 40 and with stepsize ∆t = 2 −20 .See Figure 7 for a slice through t = 1.

Fig. 9 :
Fig. 9: (b), (c) and (d) show numerical solutions at different times t to the problem in Eq. (7.7) with α = 3 5 and k = 1 500 , starting from the initial condition in Eq. (7.10) which is shown in (a).The num. solutions are degree K = 400 approximations computed with L = 45 quadrature points and ∆t = 2 −20 .To make comparison easier the scale on all plots is kept consistent.

Fig. 10 :
Fig. 10: Semilogarithmic plot of absolute values of the k-th coefficients of the weighted Proriol polynomial approximations corresponding to the numerical solutions shown in Figure 9.

Fig. 12 :
Fig. 12: (b), (c) and (d) show numerical solutions to the fractionally dampened wave equation in Equation (7.11) after indicated amount of time steps starting with the initial conditions in (7.13) pictured in (a).The numerical parameters for the pictured simulation were (∆t, K, L) = (2 −20 , 500, 50).Animations for deeper time as well as other initial conditions are available on FigShare, see [32].

Fig. 13 :
Fig. 13: (b) shows the read-out as a function of time of the 70 sensors placed around the initial state plotted in (a) -the circular sensor array with radius r s = 1 2 is marked by white squares in (a).The underlying equation is of the type in (7.11) with α = 1 10 .Inputs of the form of (b) can be used by the Matlab package k-wave to approximately reconstruct the original source images.