Tensor calculus in spherical coordinates using Jacobi polynomials. Part-I: Mathematical analysis and derivations

This paper presents a method for accurate and efficient computations on scalar, vector and tensor fields in three-dimensional spherical polar coordinates. The method uses spin-weighted spherical harmonics in the angular directions and rescaled Jacobi polynomials in the radial direction. For the 2-sphere, spin-weighted harmonics allow calculations to be automated in a fashion as close to Fourier series as possible: derivative operators act as wavenumber multiplication on a set of spectral coefficients. After transforming the angular directions, a set of orthogonal tensor rotations puts the radially dependent spectral coefficients into individual spaces, each obeying a particular regularity condition at the origin. These regularity spaces have remarkably simple properties under standard vector-calculus operations, such as grad and div. We use a hierarchy of rescaled Jacobi polynomials as a basis on these regularity spaces. It is possible to select the Jacobi-polynomial parameters such that all relevant operators act in a minimally banded way. Altogether, the geometric structure allows for the accurate and efficient solution of general partial differential equations in the unit ball.


Introduction
This paper outlines a set of algorithms for arbitrary vector and tensor computations in spherical polar coordinates in the three-dimensional unit ball. A previous paper (Vasil et al. 2016 [53], from now on V16) presented a similar calculus for vector and tensor operations in cylindrical polar coordinates on the unit disk. The two-dimensional disk is one of the simplest examples of a well-behaved domain with a natural coordinate singularity. The unit ball has three separate coordinate singularities: disk-like singularities at the north and south pole, and a third at the centre of the domain.
Spherical geometry is important for a large number of two- and three-dimensional applications; see e.g., [1,46,50,43,42,48,29,17,22]. Astrophysical and planetary applications provide many obvious examples with stars and planets. In this paper, we first discuss the mathematical structure for accurate tensor calculations in spherical coordinates. A companion paper (Part-II) discusses some applied test problems that demonstrate effectiveness in relevant geophysical and astrophysical scenarios. These kinds of multi-scale problems often require high spatial resolution, but accurately representing functions in the vicinity of coordinate singularities can become numerically unstable in the presence of small-scale features. The most reliable way around this difficulty is to design a numerical method that explicitly conforms to the correct type of singular behaviour.
A considerable amount of past work is devoted to overcoming problems with coordinate singularities in spheres and disks. Boyd & Yu (2011) [3] provide an excellent review of the topic and compare several different methods. We also summarise the history and a number of the issues in V16. A Fourier series in polar coordinates demonstrates the issue most clearly. For example,

f(R, φ) = Σ_m f_m(R) e^{imφ},

where R is the cylindrical radius and φ the circular angle. If we assume f is infinitely differentiable in the vicinity of the origin, then

f_m(R) ∼ R^{|m|} F(R^2) as R → 0, (2)

where F is a well-behaved function of R^2 [40]. In V16 we called the radial and angular coordinates (r, θ). To avoid confusion with 3D polar coordinates, we relabel the 2D coordinates (r, θ) → (R, φ). From now on, 0 ≤ r ≤ 1 represents a 3D radius, 0 ≤ θ ≤ π represents a polar angle (co-latitude), and 0 ≤ φ ≤ 2π represents a circular angle (longitude). The circular angle is a common feature between the disk and the sphere.
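The R^{|m|} structure of the Fourier coefficients can be checked numerically. The sketch below is an illustration only, not part of the paper's method: it uses the classical modified-Bessel expansion exp(R cos φ) = Σ_m I_m(R) e^{imφ}, so the Fourier coefficients are f_m(R) = I_m(R) ∼ (R/2)^m/m! as R → 0, exactly the R^{|m|} times an analytic function of R^2.

```python
import numpy as np
from math import factorial
from scipy.special import iv

# Sample f(R, phi) = exp(R cos phi) on a circle of radius R and extract
# Fourier coefficients with the FFT; these equal the modified Bessel I_m(R).
nphi = 64
phi = 2 * np.pi * np.arange(nphi) / nphi

def fourier_coeff(R, m):
    return np.fft.ifft(np.exp(R * np.cos(phi)))[m].real  # real by symmetry

# The FFT recovers the exact modified-Bessel coefficients:
for m in (1, 2, 3, 4):
    assert np.isclose(fourier_coeff(0.5, m), iv(m, 0.5), rtol=1e-8)

# Leading-order scaling near the origin: f_m(R) / R^m -> 1 / (2^m m!),
# i.e. f_m(R) = R^{|m|} * (analytic, even function of R).
R = 1e-2
for m in (1, 2, 3):
    assert np.isclose(fourier_coeff(R, m) / R**m, 1 / (2**m * factorial(m)), rtol=1e-4)
```

Resolving such a coefficient with a radial basis that ignores the R^{|m|} prefactor requires delicate cancellation, which is the numerical difficulty discussed below.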
Representing a highly multi-scale function requires components with |m| ≫ 1. Expanding f_m(R) in a series of functions without explicitly considering the behaviour at the origin requires a large degree of cancellation to produce an accurate result. Several authors discuss the numerical challenges of resolving equation (2) when m ≫ 1 [2,3,22,27,30,47,56]. The two-dimensional disk is a prototype for this kind of general behaviour. The polar regions in two-dimensional spherical coordinates are locally identical to the disk, with colatitude replacing the cylindrical radius. Fortunately, the 2-sphere comes equipped with spherical harmonic functions. In this case

Y_{ℓ,m}(θ, φ) ∼ sin^{|m|}(θ) P(cos θ) e^{imφ}, (3)

where P(cos θ) is a particular polynomial of cos θ. As θ → 0, sin(θ) ∼ θ and cos(θ) ∼ 1 − θ^2/2; similar behaviours exist as θ → π. Equation (3) behaves exactly as equation (2) dictates it should: spherical harmonic functions naturally conform to the singular coordinate behaviour at both the north and south poles.
With spherical harmonic functions we can represent analytic functions in the angular direction without fear of polar conditions,

f(θ, φ) = Σ_{ℓ,m} f_{ℓ,m} Y_{ℓ,m}(θ, φ), with f_{ℓ,m} = ∫ f(θ, φ) Y_{ℓ,m}(θ, −φ) sin θ dθ dφ,

where Y_{ℓ,m}(θ, −φ) gives the complex conjugate of Y_{ℓ,m}(θ, φ).
An analogous condition exists at the centre of the ball,

f_{ℓ,m}(r) ∼ r^ℓ F(r^2) as r → 0, (6)

where F is a well-behaved function of r^2. Throughout this paper, ℓ represents the spherical harmonic degree. For components with ℓ ≫ 1, resolving behaviour at the centre of the sphere presents the same kind of numerical challenges as the centre of the disk.
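A familiar concrete instance of this behaviour (assuming scipy is available; this is an illustration, not the paper's basis) is the spherical Bessel function j_ℓ(r), a prototypical radial coefficient, which obeys j_ℓ(r) ∼ r^ℓ/(2ℓ+1)!! as r → 0:

```python
import numpy as np
from scipy.special import spherical_jn

def double_factorial(n):
    """Odd double factorial, e.g. 7!! = 105."""
    out = 1
    while n > 1:
        out *= n
        n -= 2
    return out

# Near r = 0: j_l(r) ~ r^l / (2l+1)!!, i.e. f_{l,m}(r) = r^l F(r^2)
# with F analytic; the relative correction is O(r^2).
r = 1e-2
for l in (2, 5, 10):
    lead = r**l / double_factorial(2 * l + 1)
    assert np.isclose(spherical_jn(l, r), lead, rtol=1e-4)
```

For ℓ ≫ 1 the coefficient is astronomically small near the origin (here j_10(0.01) ≈ 7e-31), which is why naive radial expansions suffer catastrophic cancellation.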
Scalar functions alone, like f in equation (6), are not sufficient for many important applications.
In the vicinity of coordinate singularities, vector and tensor fields behave slightly differently than scalar functions. In the case of vectors, the leading-order behaviour near the origin is at least one degree lower than for scalar functions. For example, for the radial component of the gradient,

e_r · ∇f ∼ ∂f_{ℓ,m}(r)/∂r ∼ r^{ℓ−1} as r → 0. (7)
The angular gradient terms also multiply by 1/r terms that produce the same effect as equation (7), with terms scaling as r ℓ−1 as compared to the r ℓ scaling of scalar functions (6).
Summary of results for the unit disk: In V16, we devised an efficient way to define and operate on functions, vectors and arbitrary tensors on the unit disk without problems resulting from the coordinate singularity at the origin. Understanding the principles involved with the disk provides an excellent framework for understanding the more complicated calculus in three-dimensional spheres. The main results regarding the disk are: • The Fourier transforms of vector-field components in the coordinate basis, v^R_m(R), v^φ_m(R), naturally decompose into the sum of two alternative functions of radius, v^−_m(R), v^+_m(R). These scale near the origin as

v^±_m(R) ∼ R^{|m|±1} V^±(R^2) as R → 0,

where V^± are independent well-behaved functions of R^2. Higher-order tensor fields behave similarly.
• Given the regularity structure of vector and tensor fields, we used modified covariant derivative operators of the form

∇_± = ∂_R ∓ m/R,

acting on individual Fourier components. These derivative operators raise and/or lower the regularity class of tensor components. For the gradient of a scalar, v^±(R) ∝ ∇_± f_m(R).
• Given the above geometrical properties, V16 represented general tensor field components as a sum of functions of the form

R^{|m|+µ} P(2R^2 − 1),

where µ is an integer that depends on the tensor rank, and P(z) is a polynomial (in the variable z ≡ 2R^2 − 1) carefully chosen to conform with the leading-order behaviour.
• It happens that Jacobi polynomials with parameters depending on m and the tensor rank match the geometric regularity conditions in a natural way. The raising and lowering gradient operators increment and decrement the parameters of the Jacobi polynomials to match the geometrical regularity conditions of tensor components. The difference between past work and V16 was that we allowed the polynomial parameters to change under differentiation.
• A past solution for treating vectors was to use rescaled components and derivative operators such as R v^R, R v^φ, R ∇_R, R ∇_φ [47]. While variables and operators of this type allow for sparse computations, they also have the effect of obscuring the underlying geometrical structure, which is somewhat of a nuisance. The main point of V16 was to exploit the underlying geometrical structure in polar coordinates. This allowed for straightforward construction of complicated vector equations. The resulting matrix systems are even sparser than previous approaches, and better conditioned than what results from an approach that does not take the geometry into consideration.
In the current paper, we are interested in arbitrary vector and tensor calculations in spherical polar coordinates. The same general ideas apply to the unit ball as the unit disk. The calculus of the unit 2-sphere is very similar to the unit disk. In that case, a complex-valued basis transformation produces vector components with a coherent spin-weight. These components naturally break into a basis of spin-weighted spherical harmonics. Several authors have used this approach on a number of different applications. Gelfand & Shapiro (1956) [14] originally discovered spin-weighted spherical harmonics in their studies of the Lorentz group. Newman & Penrose (1966) [34] applied the methods to classical problems in theoretical physics such as electromagnetism and gravitational radiation. Phinney & Burridge (1973) [41] proposed (what they called) generalised spherical harmonics as efficient analytical and computational tools in whole-earth geophysics. Our work relies on spin-weighted spherical harmonics as a necessary part of computations in three-dimensional spherical coordinates.
In §2 we review the most important features of tensor calculus on the unit 2-sphere, focusing on the essential building blocks for computations in the full 3-dimensional ball. Most of the analysis follows along the lines of [41]. We provide a few new results needed to make arbitrary numerical calculations; in particular, we discuss operators for multiplying by functions of cos(θ) and/or sin(θ). In addition, the calculus of spin-weighted spherical harmonic functions is not widely known in the fluid mechanics literature; it is worth reviewing for this reason alone.
In §3 we proceed where Phinney & Burridge (1973) [41] left off. In Appendix C of that work, the authors provided explicit formulae for the covariant differentiation of scalars and vectors in 3D. The formulae show the radial dependence of various spin-weighted components. The most obvious thing to say about the results is that they are complicated. The same level of organisation does not seem to appear with the radial dependence as it does with the angular components. We show that there is a way to organise calculations, but that the operators must act in spectral space. That is, there exist transformations that render vector calculus operations into sparse decoupled operations. But now the transformations depend on the spherical harmonic degree, ℓ. The crux in the three-dimensional calculations is that the individual (radially dependent) spherical harmonic coefficients decompose into a superposition of functions with different regularity near the origin.
In the case of scalar functions, we use the shorthand

f_{ℓ,m}(r) ∈ Reg(ℓ). (8)

This assumes that ℓ is a non-negative integer. We wish for equation (8) to be understood to mean exactly the same thing as equation (6). That is, we define the vector space of functions

Reg(ℓ) ≡ { r^ℓ F(r^2) },

where F(r^2) represents any even function of r that is analytic in a neighbourhood of r = 0. We can put any additional structure on the function F(r^2) that we need to make further computations make sense. We call the spaces Reg(ℓ) regularity classes of degree ℓ. It is important to note that Reg(ℓ+2) ⊂ Reg(ℓ), since r^{ℓ+2} F(r^2) = r^ℓ (r^2 F(r^2)). In the case of vectors, the spherical harmonic coefficients now have three (ℓ-dependent) components. Equation (7) implies that we need to consider Reg(ℓ − 1) at the very least. One of the main findings of the current paper is that vectors break into the direct sum of regularity classes,

v_{ℓ,m}(r) ∈ Reg(ℓ − 1) ⊕ Reg(ℓ) ⊕ Reg(ℓ + 1).

Continuing on from vectors, rank-2 tensor fields have nine (ℓ-dependent) components. These components reside in a direct sum of Reg(j) where ℓ − 2 ≤ j ≤ ℓ + 2. This only represents five separate spaces, so some of these spaces appear more than once to represent nine-component tensors. How exactly this happens follows directly from our derivations in §3.
For the case of the two-dimensional disk, we found in V16 that

v_m(R) ∈ Reg(|m| − 1) ⊕ Reg(|m| + 1); (12)

although this is not how it was presented at the time. However, writing vectors in this form highlights an important difference between the 2-disk and 3-ball. For the disk,

R v_m(R) ∈ Reg(|m|). (13)

That is, it is possible to rescale vectors in two dimensions to behave the same as scalars at the origin. This allows a wider range of computational methods for the disk, e.g., [47]. The main result of V16 is that although equation (13) holds, it is better to exploit the structure of equation (12). But the sphere does not allow the same kind of rescaling in radius. We need to build everything from scratch.
In §3, we show why a decomposition into regularity spaces emerges naturally from vector and tensor calculus. We find that the classic operators of vector calculus decompose into operators that map between the regularity spaces. But at the same time, we also find that the transformations between the components of vectors and tensors in coordinate bases and the regularity spaces are not entirely obvious.
There has been some related work on tensor spherical harmonics in the past. The most important predecessor to our work that we know of is James (1976) [19]; we refer to this work as J76 from now on. This remarkable paper works out a full theory of tensor spherical harmonics for general-order tensors. J76 provides recursion formulae for constructing higher-order tensors out of lower-rank tensors. These formulae are based on the Wigner 3-j coefficients. The paper also provides formulae for computing vector calculus operations (div, grad, curl, Laplacian). However, J76 made most of the derivations using spherical coordinates for the angle variables and Cartesian basis vectors for the tensorial behaviour. There is only a small discussion of the results using polar vectors. The paper states, the procedure of constructing tensors using polar vector components is "very tedious for [tensor rank] ≥ 2." And while geometrically elegant, the use of Wigner coefficients also hinders the constructions necessary for numerical computation.
In our work, we find a straightforward method for making the construction iteratively using polar vectors. Our formulae involve explicit algebraic relations, and make no use of Wigner coefficients. The Q ℓ coefficients that connect the angular and radial dependencies also manifestly satisfy orthogonality relations without any kind of spatial integration. This approach makes the pseudospectral approach very straightforward. First we project out the angular polar coordinate vectors, then we perform a series of individual spin-weighted spherical harmonic transforms. Then we project out a series of radially dependent functions in different regularity classes. Sparse differential operations can act on regularity classes, and the whole thing can go back to grid space for nonlinear multiplications.
In §4 we continue our development of numerical methods that work well given the geometric properties of spheres in three dimensions. For this section we use Jacobi polynomials as basis functions for Reg(ℓ). This follows almost the same pattern as V16. It happens that the operators mapping between regularity spaces have very natural action on classes of properly rescaled Jacobi polynomials. We find that it is tremendously helpful to make very specific choices regarding the Jacobi parameters.
In §5 we discuss how to use the results of §2-4 to construct linear-algebraic systems for solving common PDEs. A main theme of V16 was that letting the Jacobi parameters adjust allowed for representing differential operators with maximally sparse matrices. This allows for the simple and well-conditioned construction of numerical equations from arbitrary formulations of physical equations. This not only allows for highly accurate results, but also automates the process in a way that helps reduce human error. We also show how our basis sets map to other past works. Some authors use rescaled Jacobi polynomials in one form or another, but do not allow the parameters to change as derivatives act [3,27,30,47,56].
In §6 we provide conclusions for this Part-I of our two-part series. The companion paper, Part-II, picks up with a series of numerical examples showing the effectiveness of our mathematical methods. We do this with a series of numerical eigenvalue problems and full-scale nonlinear computations of numerical benchmark problems. The first set of tests shows the high accuracy in well-understood unit tests of different mathematical aspects. The second set of tests shows that everything can be combined relatively easily to make numerical solvers that might once have taken several person-years to create. We show that we can use our mathematical framework to create automated numerical solvers within the Dedalus framework.
2 Tensor calculus on the unit 2-sphere

In this section, we assume r = 1. For the three-dimensional part of the paper (§3), we assume 0 ≤ r ≤ 1. Restricting to the unit sphere in this section eliminates pesky 1/r factors in the various formulae.

The connection
The foremost difficulty with computations in spherical geometry is that the basis vectors change from point to point. The gradient on the 2-sphere is

∇ = e_θ ∂_θ + e_φ (1/sin θ) ∂_φ, (14)

where e_θ, e_φ represent orthonormal coordinate unit vectors on the surface of the sphere. Equation (14) is the complete story for scalar-valued functions. How the covariant derivative acts on a vector sets it apart from a re-scaled set of partial derivatives. The affine connection in the coordinate basis is

∇_θ e_θ = ∇_θ e_φ = 0, ∇_φ e_θ = cot θ e_φ, ∇_φ e_φ = − cot θ e_θ.

The cot θ factors reflect the singular nature of the basis vectors at the poles. We can define an alternative spinor basis that simplifies the affine connection,

e_± ≡ (e_θ ± i e_φ)/√2.

We refer to Appendix A for a number of different useful properties of this basis. Most important, the e_± basis diagonalises the action of the gradient such that

∇_φ e_± = ∓ i cot θ e_±.

The gradient in the new basis becomes

∇ = e_µ k_µ, (19)

where k_± are first-order angular operators that raise and lower spin-weight; we give their explicit action in §2.2. We use Einstein summation notation for the right-hand side of equation (19).
For each individual component of the gradient and basis elements,

∇_µ e_σ = (µσ/√2) cot θ e_σ, (21)

where µ, σ = ±1. So far, this analysis ignores the radial aspects, which we discuss in §4.
We use a modified notation to denote higher-rank tensors. For a multi-index σ = {σ_1, . . . , σ_r} with σ_i = ±1,

e(σ) ≡ e_{σ_1} ⊗ · · · ⊗ e_{σ_r}, with σ̄ ≡ σ_1 + · · · + σ_r. (22)

See Appendix A for more details regarding this notation. Acting on an arbitrary-rank tensor component, equation (21) becomes

∇_µ e(σ) = (µσ̄/√2) cot θ e(σ). (23)

The presence of the spin-weight, σ̄, is the only difference between the formula for single and multi-index components.

Spin-weighted Spherical Harmonics
Spin-weighted spherical harmonics generalise traditional spherical harmonics. Both sets of functions result from classical Jacobi polynomials with different parameters. Written in full form,

Y^s_{ℓ,m}(θ, φ) = C^s_{ℓ,m} e^{imφ} (sin(θ/2))^a (cos(θ/2))^b P^{(a,b)}_n(cos θ),

where C^s_{ℓ,m} is a normalisation constant. The function P^{(a,b)}_n(z) represents a Jacobi polynomial of degree n = ℓ − ℓ_0, with ℓ_0 ≡ max(|m|, |s|), and parameters a = |m + s|, b = |m − s|. We use the recursion properties of Jacobi polynomials to construct operators acting on this spherical basis.
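The reduction to Jacobi polynomials can be spot-checked for the scalar case s = 0, where a = b = |m| and the harmonics reduce to associated Legendre functions. The sketch below verifies the classical identity P_ℓ^m(x) = (−1)^m (ℓ+m)!/(ℓ! 2^m) (1−x^2)^{m/2} P^{(m,m)}_{ℓ−m}(x); normalisation and phase conventions here are scipy's, not necessarily the paper's.

```python
import numpy as np
from math import factorial
from scipy.special import lpmv, eval_jacobi

# Associated Legendre (with Condon-Shortley phase, as in scipy's lpmv)
# versus sin^{|m|} times a Jacobi polynomial with parameters (|m|, |m|).
x = np.linspace(-0.99, 0.99, 7)
for l in range(0, 6):
    for m in range(0, l + 1):
        lhs = lpmv(m, l, x)
        rhs = ((-1) ** m * factorial(l + m) / (factorial(l) * 2 ** m)
               * (1 - x ** 2) ** (m / 2) * eval_jacobi(l - m, m, m, x))
        assert np.allclose(lhs, rhs, rtol=1e-10, atol=1e-12)
```

The same pattern, with parameters |m ± s| instead of (|m|, |m|), underlies the spin-weighted generalisation.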
Traditional (scalar-valued) spherical harmonics correspond to s = 0. The covariant derivative shifts the spin-weight, s → s ± 1. Specifically,

k_± Y^s_{ℓ,m} = ± √((ℓ ∓ s)(ℓ ± s + 1)/2) Y^{s±1}_{ℓ,m}, (26)

where, acting on the e^{imφ} azimuthal component,

k_± = −(1/√2) (∂_θ ∓ (m + s cos θ)/sin θ). (27)

The cot θ term in equation (27) implies that the covariant derivative of the spin-weighted harmonics can become singular in some cases. However, putting equation (27) together with equation (23) gives

∇_± (Y^σ̄_{ℓ,m} e(σ)) = ± √((ℓ ∓ σ̄)(ℓ ± σ̄ + 1)/2) Y^{σ̄±1}_{ℓ,m} e(σ), (29)

where µ, σ_i = ±1. The combination of the spin basis and spin-weighted spherical harmonics yields a very simple action for the gradient operator. For the case σ̄ ≠ 0, both Y^σ̄_{ℓ,m} and e(σ) each individually produce singular gradients at the poles; the product of the two remains regular everywhere.
For an arbitrary tensor field T with spherical harmonic components T_σ, the gradient follows by applying equation (29) to each term. The components of ∇T carry the multi-index µσ = {µ, σ_1, . . . , σ_r}, with total spin-weight µ + σ̄; see Appendix A. The new spherical harmonic coefficients of ∇T are simple multiples of the original coefficients for T. For the spectral coefficients of a general multi-index tensor field,

(∇T)_{±σ} = ± √((ℓ ∓ σ̄)(ℓ ± σ̄ + 1)/2) T_σ.

This is one of the main points of the paper: • Using a combination of spin-weighted spherical harmonics and spinor basis vectors, differentiation on the sphere becomes just like Fourier series; i.e., purely diagonal wavenumber multiplication.
• The major difference in this case is that Fourier series uses the same basis to represent scalar functions and the components of vectors and tensors. The sphere requires a hierarchy of bases. The one-dimensional circle also uses a small hierarchy of bases, i.e., sin/cos.

2-sphere Laplacian
The intrinsic Laplacian on the sphere results from contracting two gradient operations,

∇^2 = k_+ k_− + k_− k_+,

where (for a single value of spin-weight, s),

∇^2 Y^s_{ℓ,m} = −(ℓ(ℓ+1) − s^2) Y^s_{ℓ,m}. (33)

Also note that

k_∓ k_± Y^s_{ℓ,m} = −((ℓ ∓ s)(ℓ ± s + 1)/2) Y^s_{ℓ,m}.

Therefore the commutator of derivatives is

[k_+, k_−] Y^s_{ℓ,m} = −s Y^s_{ℓ,m}.

Equation (33) gives a slightly different formula than results from taking the three-dimensional Laplacian and restricting it to the surface of the 2-sphere. In that case, additional terms result from contracting in the third dimension. Equation (33) defines what is often called the rough Laplacian, or connection Laplacian. In a curved geometry, it is possible to define several linear, second-order, elliptic differential operators with a reasonable claim to the title of Laplacian. The Weitzenböck identity ensures that any two such Laplacians differ by at most a scalar curvature term, i.e., a term with no derivatives; see e.g., [18]. In the case of the restricted three-dimensional Laplacian, the additional term depends on σ̄^2 and |σ|, where |σ| = r gives the tensor rank of e(σ); in the case of the simple vectors e_±, we have σ̄^2 = |σ| = 1. In applications, the most appropriate Laplacian depends on the details of the underlying physics.

Multiplication by trigonometric functions
Numerical computations for PDEs obviously require derivative operations. Some calculations also require multiplication by non-constant coefficients. On the surface of the 2-sphere, the action of sin θ and cos θ allows for multiplication by arbitrary well-behaved functions. Treating sin θ and cos θ as operators acting between bases of spin-weighted spherical harmonics allows for non-constant coefficient terms (e.g., Coriolis and shear) on the 2-sphere in time-evolution and/or eigenvalue problems.
Multiplication by a non-constant coefficient acts as an operator on the entire basis set of spin-weighted spherical harmonics. Defining the row vector of basis functions with fixed m and spin-weight s,

Y_s ≡ (Y^s_{ℓ_0,m}, Y^s_{ℓ_0+1,m}, . . .),

we have

cos θ Y_s = Y_s C_s, sin θ Y_s = Y_{s±1} S^±_s.

The operators C_s and S^±_s satisfy an operator version of the identity cos^2 θ + sin^2 θ = I_s, where I_s represents the identity acting on the spin-weighted bases.
The sine multiplication operators are related via the transpose, while the cosine operators are symmetric. That is,

S^∓_{s±1} = (S^±_s)^T, C_s^T = C_s.

The matrix C_s is the Jacobi matrix needed for the construction of the spin-weighted spherical harmonics themselves.
Explicit formulae for the action on each basis element appear in equations (46) & (47). The special case ℓ = m = 0 in equations (46) & (47) is handled by setting m = 0 and cancelling terms before taking ℓ = 0. Cosine multiplication preserves the spin-weight, while sine multiplication can either raise or lower the spin-weight depending on the situation. We include an example using the cosine operators in Paper-II.
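For the scalar case s = 0, the entries of the cosine Jacobi matrix reduce to the classical three-term recurrence for orthonormalised associated Legendre functions. The sketch below verifies that tridiagonal, symmetric structure numerically; the helper names Pnorm and c are illustrative, and the general-s entries of equations (46) & (47) are not reproduced here.

```python
import numpy as np
from math import factorial
from scipy.special import lpmv

def Pnorm(l, m, x):
    """Associated Legendre function, orthonormalised on [-1, 1]."""
    N = np.sqrt((2 * l + 1) / 2 * factorial(l - m) / factorial(l + m))
    return N * lpmv(m, l, x)

def c(l, m):
    """Off-diagonal entry of the (symmetric) cos(theta) Jacobi matrix, s = 0."""
    return np.sqrt(((l + 1) ** 2 - m ** 2) / ((2 * l + 1) * (2 * l + 3)))

# cos(theta) * P_l = c(l) P_{l+1} + c(l-1) P_{l-1}, with x = cos(theta):
m = 2
x = np.linspace(-0.9, 0.9, 5)
for l in range(m, m + 5):
    lhs = x * Pnorm(l, m, x)
    rhs = c(l, m) * Pnorm(l + 1, m, x)
    if l > m:
        rhs = rhs + c(l - 1, m) * Pnorm(l - 1, m, x)
    assert np.allclose(lhs, rhs, atol=1e-12)
```

Because multiplication by cos θ is tridiagonal in ℓ, non-constant coefficients built from it (e.g. Coriolis terms) remain sparse in spectral space.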

3 Tensor calculus of the 3-ball
The main purpose of this paper is to use, decompose and organise all scalar, vector, and tensor computations in the three-dimensional ball into a mathematical structure that is efficient for numerical computations. The previous section introduced how to do this for the angular components. This section lays out a similar framework for the radial direction.

Motivation: Scalar functions
For a scalar field, we can decompose the radial and angular coordinates into a function of the form

f = f_{ℓ,m}(r) Y_{ℓ,m}(θ, φ). (48)

The most general situation would involve a sum over m and ℓ, but equation (48) is enough to illustrate the main points. The Laplacian acts in the radial direction such that

∇^2 f = [ (1/r^2) ∂_r ( r^2 ∂_r ) − ℓ(ℓ+1)/r^2 ] f_{ℓ,m}(r) Y_{ℓ,m}(θ, φ). (49)

The second-order radial operator decomposes into the product of two ℓ-dependent first-order operators, D_±^ℓ, such that

(1/r^2) ∂_r ( r^2 ∂_r ) − ℓ(ℓ+1)/r^2 = D_∓^{ℓ±1} D_±^ℓ, (50)

where

D_+^ℓ ≡ ∂_r − ℓ/r, D_−^ℓ ≡ ∂_r + (ℓ+1)/r. (51)

Equations (50) & (51) are well known in many contexts. For example, the spherical Bessel functions j_ℓ(r) have

D_+^ℓ j_ℓ = −j_{ℓ+1}, D_−^ℓ j_ℓ = j_{ℓ−1}. (52)

These operators are the spherical versions of the m-dependent operators we discussed for the 2D disk. In the disk, we were able to use the two factors of the Laplacian to create components of the gradient operator. But the sphere presents us with a conundrum: there are two factors of the Laplacian, but three components of the gradient. Furthermore, the angular gradient acting on spherical harmonic functions does not produce the effect of multiplying by ℓ or ℓ + 1. This is in contrast to the disk, where the angular derivative acting on exponential functions does result in multiplication by m. How can we use the operators in equation (51) in a general tensor framework?
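The spherical Bessel relations admit a quick numerical check. The sketch below uses the standard Bessel recurrences, whose sign conventions may differ from those chosen in equation (51):

```python
import numpy as np
from scipy.special import spherical_jn

# (d/dr - l/r) j_l = -j_{l+1}  and  (d/dr + (l+1)/r) j_l = j_{l-1}
r = np.linspace(0.3, 2.0, 9)
for l in range(1, 6):
    jl = spherical_jn(l, r)
    djl = spherical_jn(l, r, derivative=True)
    assert np.allclose(djl - l / r * jl, -spherical_jn(l + 1, r), atol=1e-12)
    assert np.allclose(djl + (l + 1) / r * jl, spherical_jn(l - 1, r), atol=1e-12)
    # Composing the two factors (with l shifted by one in the second factor)
    # reproduces the radial Laplacian eigenrelation for j_l.
```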
The main reason that the operators in equation (51) are useful is the fact that

D_±^ℓ : Reg(ℓ) → Reg(ℓ ± 1).

We will see this more explicitly in §4 when considering computational strategies. The D_±^ℓ operations appear individually in vector and tensor computations, but getting to the point where these operators appear takes some effort compared to scalars.

The connection
In three dimensions we have an additional basis vector e_0 ≡ e_r. The gradient formulae for the spin basis now acquire additional 1/r connection terms that couple e_± to e_0 (equation (54)). Equation (54) reduces to equation (21) when restricted to the surface of the 2-sphere (e.g., r = 1).
A difficulty exists in three dimensions that is not present with the 2-sphere: no single rotation (complex-valued or otherwise) diagonalises equation (54) in a manner similar to equation (21). In the 2-sphere, the e_± basis diagonalises the connection coefficients. This is seen in equation (54), where the first term on the right-hand side is proportional to e_σ, which is the input on the left-hand side. To make sense of the other terms, it is helpful to consider the action of ∇_µ on all three basis vectors simultaneously; collecting the connection coefficients for each derivative direction µ = 0, ±1 defines three 3 × 3 matrices, J_0 and J_±. We see that J_0 is diagonal. This is what we mean when we say that the e_± basis diagonalises the connection in the angular directions. Also, clearly J_±† = J_∓. The question is, how do we know there is not a transformation that diagonalises all three matrices? This would simplify tensor calculus in three dimensions considerably. We can simultaneously diagonalise two matrices if and only if they commute. Checking the commutation relations shows that not only do the matrices fail to commute, they form the Lie algebra sl(2) ≃ sp(2). Transformations that simplify computations do exist, but they require additional mathematical ingredients. We need to work in spectral space to find what we are looking for.
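The obstruction can be illustrated with a standard spin-1 representation of sl(2); the normalisations below are the textbook ones and need not match the paper's equation (59):

```python
import numpy as np

# Spin-1 (3x3) generators, basis ordered (e_+, e_0, e_-).
J0 = np.diag([1.0, 0.0, -1.0])
Jp = np.sqrt(2.0) * np.diag([1.0, 1.0], k=1)  # raising
Jm = Jp.T                                     # lowering

def comm(A, B):
    return A @ B - B @ A

# The three matrices do not commute; instead they close into sl(2):
assert np.allclose(comm(J0, Jp), Jp)
assert np.allclose(comm(J0, Jm), -Jm)
assert np.allclose(comm(Jp, Jm), 2 * J0)
# Non-commuting matrices cannot be simultaneously diagonalised, so no single
# basis rotation can diagonalise the full three-dimensional connection.
```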

Motivation: Vector Laplacian.
We can represent 3D vector fields in terms of spin-weighted spherical harmonics in a manner similar to 2D vector fields and/or scalar fields. For example,

v = v_σ(r) Y^σ_{ℓ,m}(θ, φ) e_σ.

Like equation (48), the most general situation could involve a sum over m and ℓ, but this is enough to illustrate the main points. The Laplacian now acts in a more complicated way than for scalars; restricting to the surface of the 2-sphere coincides with equation (37). The vector Laplacian couples the different spin-weight components such that

[∇^2 v]_σ = [ ∂_r^2 + (2/r) ∂_r ] v_σ − (1/r^2) Λ(σ, µ) v_µ,

with the ℓ-dependent symmetric matrix Λ. The eigenvalues of Λ are

(ℓ−1)ℓ, ℓ(ℓ+1), (ℓ+1)(ℓ+2).

We see clearly that the Laplacian acts on vectors as if they were composed of three scalars with degrees ℓ − 1, ℓ, ℓ + 1. This is what we mean when we say that the components of vectors are in a direct sum of regularity classes. The eigenvectors of Λ are orthogonal; collecting them into the matrix Q_ℓ diagonalises the Laplacian for each ℓ. We use the notation B(s) of Eastwood & Tod (1982) to denote spaces of coherent spin-weight s; that is, Y^s_{ℓ,m} forms an orthonormal basis for B(s). The matrix Q_ℓ has the property that

Q_ℓ : Reg(ℓ−1) ⊕ Reg(ℓ) ⊕ Reg(ℓ+1) → B(−1) ⊕ B(0) ⊕ B(+1),

where the notation ⊕ in this case represents the direct sum of vector spaces. We therefore arrive at another main point of this paper.
• Finding simple and sparse operators in the radial direction requires an orthogonal rotation on the vector components. This rotation is spherical-harmonic-degree (ℓ) dependent.
Because Q ℓ depends on ℓ its action must occur after a spherical-harmonics transformation.
In defining the matrix Q_ℓ, it is useful to define the following notation:

ξ_−(ℓ) ≡ √(ℓ/(2ℓ+1)), ξ_+(ℓ) ≡ √((ℓ+1)/(2ℓ+1)).

These factors appear in many of the following formulae. They have the useful property that

ξ_−(ℓ)^2 + ξ_+(ℓ)^2 = 1.

With this notation, the entries of Q_ℓ in equation (68) take a simple algebraic form.

Notation: In working with spin-weight degrees of freedom and regularity degrees of freedom, we use particular notation to denote each type of index. We use Greek letters to denote spin indices, e.g., µ, ν, σ. We use non-italic Roman letters toward the beginning of the alphabet to denote regularity class relative to ℓ, e.g., a, b, c. Also, we use a function notation to represent the indices of the orthogonal matrices, e.g., Q_ℓ(σ, a) represents the σ-th, a-th element of Q_ℓ. We do this to avoid too many subscripts and to generalise the notation more easily to multi-index notation for higher-rank tensors. When we use multi-index notation, an over-bar, e.g., σ̄, represents a sum over multi-indices, as in equation (22) and Appendix A.

Vector calculus.
Now define a scalar field f and two separate vector fields v and u, expanded in regularity components as above, where we again suppress more-general summations over ℓ and m. Our primary vector calculus operators, grad, div, and curl, then act on the regularity components through the first-order operators D_±^ℓ. Composing these operators, e.g., taking the divergence of the gradient, gives exactly the same form as the diagonalised vector Laplacian from §3.2.2.
The divergence of the curl and the curl of the gradient both vanish.
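These identities can be confirmed symbolically in spherical coordinates. The sketch below is an independent check using sympy's vector module, not the paper's spectral machinery:

```python
import sympy as sp
from sympy.vector import CoordSys3D, Vector, gradient, divergence, curl

# Spherical coordinates (r, theta, phi); sympy supplies the Lame coefficients.
S = CoordSys3D('S', transformation='spherical',
               variable_names=('r', 'theta', 'phi'))

# curl(grad f) = 0 for a generic scalar field f(r, theta, phi)
f = sp.Function('f')(S.r, S.theta, S.phi)
cg = curl(gradient(f)).simplify()
assert cg == Vector.zero

# div(curl v) = 0 for a generic vector field v
v = (sp.Function('a')(S.r, S.theta, S.phi) * S.i
     + sp.Function('b')(S.r, S.theta, S.phi) * S.j
     + sp.Function('c')(S.r, S.theta, S.phi) * S.k)
dc = sp.simplify(divergence(curl(v)))
assert dc == 0
```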

General Tensor Calculus
We want to find a way to operate on general tensors in a similar way as we can for vectors and scalars. This is important in many physical applications. Stress tensors appear naturally in solid mechanics, fluid mechanics, and magnetohydrodynamics. General relativity requires 4th-order tensor fields. The polarisation field arises naturally as a 2nd-rank tensor in electromagnetism.
We first try to derive a simple set of transformations from the Laplacian in the same way we did for vector fields. This turns out to be informative, but does not completely determine everything we need. A systematic approach requires iteratively applying the gradient operator to a tensor field and cleaning up the result.

Motivation: Tensor Laplacian
For a single ℓ, m component of a 2-tensor,

T = T_{στ}(r) Y^{σ+τ}_{ℓ,m}(θ, φ) e({σ, τ}).

After separating out the angular components, the 2-tensor Laplacian becomes

[∇^2 T]_{στ} = [ ∂_r^2 + (2/r) ∂_r ] T_{στ} − (1/r^2) Λ(στ, µν) T_{µν}.

In this case, Λ is a 4th-rank operator (indexed with the multi-index στ, µν) that acts like a matrix on the 2nd-rank tensors. The operator is symmetric in the sense that Λ(στ, µν) = Λ(µν, στ). Similar to matrices, Λ has eigenvalues in the sense that

Λ(στ, µν) T_{µν} = (ℓ+a)(ℓ+a+1) T_{στ}, (88)

where −2 ≤ a ≤ 2. Equation (88) represents 3^2 = 9 separate linear equations. The eigenvalues fall into degenerate classes with multiplicities 1, 2, 3, 2, 1 for a = −2, . . . , +2. This implies that for the spherical-harmonic coefficients of rank-2 tensors,

T_{ℓ,m}(r) ∈ Reg(ℓ−2) ⊕ 2 Reg(ℓ−1) ⊕ 3 Reg(ℓ) ⊕ 2 Reg(ℓ+1) ⊕ Reg(ℓ+2).

For rank-r tensors, the number of eigenvalue degeneracies for each spin-weight, −r ≤ a ≤ r, follows a generalised Pascal's triangle. Unlike the traditional Pascal triangle, there are no simple closed-form expressions for the entries. The coefficients do satisfy a trinomial expansion,

(1 + x + x^2)^r = Σ_{q=0}^{2r} c(r, q) x^q, with q = a + r.

This is the generating function for the number of ways to select q = a + r objects from r objects where we can select any individual object up to two times.
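The degeneracy counts are easy to generate: convolving [1, 1, 1] with itself r times produces the coefficients of (1 + x + x^2)^r. A minimal sketch (trinomial_row is an illustrative helper, not the paper's notation):

```python
import numpy as np

def trinomial_row(r):
    """Coefficients of (1 + x + x^2)^r: the number of rank-r tensor
    components falling into each regularity class, a = -r, ..., r."""
    row = np.array([1])
    for _ in range(r):
        row = np.convolve(row, [1, 1, 1])
    return row

assert list(trinomial_row(1)) == [1, 1, 1]        # vectors: l-1, l, l+1
assert list(trinomial_row(2)) == [1, 2, 3, 2, 1]  # rank-2: 9 components
assert trinomial_row(4).sum() == 3 ** 4           # rank-4: 81 fields total
```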
In the case of vectors, the Laplacian acts on spaces of non-degenerate eigenvectors; i.e., ℓ, ℓ ± 1. For tensors, the picture is not as clear. We want to build up higher-order transformations that serve the same purpose as the Q ℓ matrices in equation (68). For this we need to define an efficient way to organise higher-order tensor indices.

Orthogonal rotations of general tensor components
For the general case, we use a multi-index notation similar to that for the basis elements; see Appendix A.
In equation (96), Jσ is one of the 3-dimensional representation matrices of sl(2) defined in equation (59). By definition Jσ0 = 1. The matrices I_r represent identity matrices for rank r; i.e., a 3^r × 3^r identity matrix. The symbols ⊕ and ⊗ represent the Kronecker sum and product respectively. Many modern numerical libraries enable the efficient computation of Kronecker products [52]. The derivative parameters are defined the same as for the 2-sphere, i.e., equation (28). The matrices in equation (96) also satisfy the sl(2) Lie algebra, which is useful for proving many relations involving the Q_ℓ transformations. Equation (94) defines a sequence of orthonormal (arbitrary-rank) tensors for each ℓ, where δ(σ, σ′) represents a multi-index version of the Kronecker-δ.
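The Kronecker-sum construction can be sketched as follows. The specific matrices of equation (59) are not reproduced here, so standard spin-1 generators stand in for them; this suffices to show that Kronecker sums inherit the sl(2) commutation relations at every tensor rank:

```python
import numpy as np

# Standard 3-dimensional (spin-1) representation of sl(2); the paper's
# equation (59) matrices may differ by convention, so treat these as
# illustrative stand-ins.
Jz = np.diag([1.0, 0.0, -1.0])
Jp = np.sqrt(2.0) * np.diag([1.0, 1.0], k=1)   # raising operator
Jm = Jp.T                                       # lowering operator

def kron_sum(A, B):
    """Kronecker sum A (+) B = A (x) I + I (x) B, the construction used
    to build rank-2 tensor operators from rank-1 ones."""
    return np.kron(A, np.eye(B.shape[0])) + np.kron(np.eye(A.shape[0]), B)

def comm(A, B):
    return A @ B - B @ A

# Rank-1 generators satisfy sl(2): [Jz, J+] = J+, [J+, J-] = 2 Jz
assert np.allclose(comm(Jz, Jp), Jp)
assert np.allclose(comm(Jp, Jm), 2 * Jz)

# The 9 x 9 rank-2 operators built from Kronecker sums satisfy the same
# Lie algebra, which is what lets the Q_ell construction work uniformly
# at every tensor rank.
Jz2, Jp2, Jm2 = kron_sum(Jz, Jz), kron_sum(Jp, Jp), kron_sum(Jm, Jm)
assert np.allclose(comm(Jz2, Jp2), Jp2)
assert np.allclose(comm(Jp2, Jm2), 2 * Jz2)
print("sl(2) relations hold at ranks 1 and 2")
```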
For rank-1, the individual matrix entries associated with equation (74) are given explicitly in equation (99). When convenient, we identify a multi-index containing a single element with that element itself, i.e., σ ↔ {σ}. Equation (99) shows that equation (94) reduces to equation (74) for rank-1.
This means that some cases need exceptions for low ℓ values depending on the tensor rank. For example, there is only one linearly independent ℓ = 0 vector, rather than the expected 3 components, i.e., v(r) = ∇Φ(r). This means that for rank-1 and ℓ = 0, Q 0 : Reg(1) → B(0) only. None of the other transformations/spaces exist. These special low-ℓ cases are straightforward to catch and avoid when creating the transformations.
Equation (94) has the advantage of not using Wigner 3-j coefficients. The formulae benefit from using explicit and elementary expressions at each stage. The formula is simple to program and stable for all values of ℓ and all tensor ranks, r. It runs efficiently within practical limits; e.g., even rank-4 tensors (comprising 81 individual fields in 3D) are sufficient for nearly any realistic physical application.

General Tensor Calculus
We can define an arbitrary-rank tensor field as in equation (101). Projection — We need to extract the radial coefficients in order to perform any derivative operations. See Appendix A, equation (194), for details on the meaning of σ*. Here dn ≡ sin θ dθ dφ, and Y^σ_{ℓ,m}(θ, −φ) implies complex conjugation. We can then operate on A^a_{ℓ,m}(r) in a number of ways, and re-form a series similar to equation (101) but using the appropriate bases.
Grad — We can represent the gradient of a general tensor, ∇T(r, θ, φ), in a similar form to equation (101), where the radial coefficients, B^{ab}_{ℓ,m}(r), are one rank higher than the input, A^b_{ℓ,m}(r), and equation (72) defines the ξ^a_{ℓ+b} coefficients.
This highlights one of the main points of this paper: • Equation (106) implies that the gradient acts almost identically for arbitrary-rank tensors as it does for vectors and scalars.
Equation (107) is complicated to prove in general, but it is straightforward to verify for any given specific rank. Equation (108) is a more straightforward fact than equation (107). Together, equations (107) & (108) imply the result that follows. The input in equation (110) is one rank higher than the output in equation (111); σ and a represent the additional indices. The radially dependent components relate via equation (112), where −a is the negative of the index "a". Recall that "a" represents a single index, while "b" is possibly a multi-index. Equation (112) is the tensor generalisation of equation (81).
Laplacian — If we consider the action of ∇ · ∇, then the input and output components relate componentwise, showing that the Laplacian is diagonalised for general tensor fields.
General radial operations using Jacobi polynomials
At this point, our discussion of the geometric properties of three-dimensional space is complete enough to build a set of accurate and efficient numerical methods. The remainder of the paper follows much of V16. In the case of the 2D disk, it took much less background to get to the point where numerical operations are straightforward to compute. In both V16 and the current paper the goal is to uncover a collection of spaces such as Reg(ℓ). These vector spaces are concrete enough to design "bespoke" computational algorithms.

Radial basis and differentiation
To show why the D±ℓ operators map between regularity spaces, we consider functions of the form f_ℓ(r) = r^ℓ F(z), where F(z) is any differentiable function of z = 2r² − 1. This specific convention makes z an even function of r with −1 ≤ z ≤ 1. This is the same choice as in V16, made to simplify calculations with Jacobi polynomials. Therefore, consider the action of the differential operators on an element from Reg(ℓ). Equations (115) & (116) are nothing more than the product and chain rules of differentiation. From the leading-order behaviour on the right-hand sides of equations (115) & (116) we see that the operators raise and lower the regularity index, which is a property shared with spherical Bessel functions.
Finding a good numerical method for the sphere is the same as finding a good basis for the regularity spaces. Jacobi polynomials are almost perfectly suited to the task. One reason is the pair of differentiation formulae in equations (119) & (120), valid for any parameters a, b > −1; see [53,35]. We see that equations (115) & (116) mirror this structure. The first Jacobi parameter can take any value a > −1. Equation (120) contrasts with other choices in the literature; e.g., [27] with b = ℓ − 1/2. The many properties of Jacobi polynomials allow many more useful relations than we will use here. The differential properties of the spin-weighted spherical harmonics in terms of k±ℓ,s are a direct consequence of equation (119) and a similar relation for (a, b) → (a − 1, b + 1). This is another main point of the paper: • Carefully selecting Jacobi-polynomial parameters provides big advantages in the efficiency of various differential operations.
Equation (120) allows us to represent f_ℓ(r) ∈ Reg(ℓ) as a series of Jacobi polynomials of type (α, ℓ + 1/2) and achieve simple differentiation formulae in the r variable. We therefore define the basis in equation (121). We assume α > −1, although it is possible to define negative-parameter extensions. The parameter n represents the degree of the Jacobi polynomials in terms of the z argument. These polynomials are orthonormal on the weighted unit ball such that

∫₀¹ Q^{α,ℓ}_n(r) Q^{α,ℓ}_{n′}(r) (1 − r²)^α r² dr = δ_{n,n′}.
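As a concrete check, the following sketch (our own, not the paper's code) builds the Q^{α,ℓ}_n basis from scipy's Jacobi polynomials and verifies the orthonormality relation numerically; the normalisation constants are computed by quadrature rather than from a closed form.

```python
import numpy as np
from scipy.special import eval_jacobi
from scipy.integrate import quad

alpha, ell = 0.0, 3

def Q(n, r, norm=1.0):
    # Q^{alpha,ell}_n(r) proportional to r^ell P_n^{(alpha, ell+1/2)}(2 r^2 - 1)
    return norm * r**ell * eval_jacobi(n, alpha, ell + 0.5, 2.0 * r**2 - 1.0)

def inner(f, g):
    # Weighted inner product on the unit ball: (1 - r^2)^alpha r^2 dr
    val, _ = quad(lambda r: f(r) * g(r) * (1.0 - r**2)**alpha * r**2, 0.0, 1.0)
    return val

# Normalise numerically (no closed-form Gamma-function constants needed).
norms = [1.0 / np.sqrt(inner(lambda r, n=n: Q(n, r), lambda r, n=n: Q(n, r)))
         for n in range(4)]

gram = np.array([[inner(lambda r, i=i: Q(i, r, norms[i]),
                        lambda r, j=j: Q(j, r, norms[j]))
                  for j in range(4)] for i in range(4)])
print(np.round(gram, 8))   # close to the 4 x 4 identity
```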
Matsushima & Marcus (1995) [30] defined a parameterised basis set that is equivalent to equation (121). Their work found sparse differentiation formulae for these polynomials. We find in V16 and in the current work that their matrices naturally factor into lower-bandwidth operators. We think the best term for the functions in equation (121) is "Generalised Spherical Zernike Polynomials". This name reflects the fact that Zernike discovered the α = 0 basis for the disk in the 1930s [56]. Also, the definition in equation (121) is to the polynomials in V16 as spherical Bessel functions are to integer-index Bessel functions in 2D.
The combination 2n + ℓ represents the total degree of the polynomial Q^{α,ℓ}_n(r) in the r variable, which determines the right-hand sides of both equations (123) & (124). This is another main point of the paper: • The scalar Laplacian factors naturally into two first-order operators. These act as pure raising and lowering operators on the different Jacobi-polynomial parameters.

Conversion & multiplication operations
Equations (115)-(124) dictate how the second Jacobi index should behave. But just as in V16, we need to address how the first Jacobi index increments as we apply D±ℓ; notice that α → α + 1 in both cases. The orthonormal basis in equation (121) allows us to promote Reg(ℓ) to a collection of Hilbert spaces, H^α(ℓ), defined for each α > −1. From equation (10), the regularity structure carries over. However, we also have a simple inequality, which implies that successive spaces are embedded in each other. We can now see the action of the differential operators more clearly: D±ℓ : H^α(ℓ) → H^{α+1}(ℓ ± 1). Equations (123) & (124) imply that this mapping is very sparse on the basis functions Q^{α,ℓ}_n(r). Because successive spaces are embedded in each other, there is a well-behaved mapping between α and α + 1. Fortunately, Jacobi polynomials possess algebraic recursion formulae that allow incrementing the parameters and multiplying by z-dependent factors [53,35]. After including normalisation factors, the Jacobi recursion operators become various multiplication operators acting on the Q^{α,ℓ}_n(r) functions:

Q^{α,ℓ}_n(r) = Q^{α+1,ℓ}_n(r) A_n + Q^{α+1,ℓ}_{n−1}(r) B_{n−1},
(1 − r²) Q^{α+1,ℓ}_n(r) = Q^{α,ℓ}_{n+1}(r) B_n + Q^{α,ℓ}_n(r) A_n,
r Q^{α,ℓ}_n(r) = Q^{α,ℓ+1}_n(r) C_n + Q^{α,ℓ+1}_{n−1}(r) D_{n−1},
r Q^{α,ℓ+1}_n(r) = Q^{α,ℓ}_{n+1}(r) D_n + Q^{α,ℓ}_n(r) C_n,

with the coefficients defined in equation (133). The α and ℓ dependence of the coefficients in equation (133) is the same for each element; only n changes. In all cases Q^{α,ℓ}_{−1}(r) = 0 by definition.
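The two-band structure of multiplication by r can be verified directly. The sketch below (ours, with numerically normalised basis elements) expands r Q^{α,ℓ}_3 in the ℓ + 1 basis and finds exactly two non-zero coefficients, as the third recursion above asserts:

```python
import numpy as np
from scipy.special import eval_jacobi
from scipy.integrate import quad

alpha, ell, n = 0.0, 2, 3

def q_raw(nn, ll, r):
    # Unnormalised Q^{alpha,ll}_nn(r) = r^ll P_nn^{(alpha, ll+1/2)}(2r^2 - 1)
    return r**ll * eval_jacobi(nn, alpha, ll + 0.5, 2.0 * r**2 - 1.0)

def weighted(f):
    # Integral against the ball weight (1 - r^2)^alpha r^2 dr
    val, _ = quad(lambda r: f(r) * (1.0 - r**2)**alpha * r**2, 0.0, 1.0)
    return val

def norm(nn, ll):
    return np.sqrt(weighted(lambda r: q_raw(nn, ll, r)**2))

# Expand r * Q^{alpha,2}_3 in the orthonormal ell = 3 basis.
g = lambda r: r * q_raw(n, ell, r) / norm(n, ell)
coeffs = np.array([weighted(lambda r: g(r) * q_raw(m, ell + 1, r)) / norm(m, ell + 1)
                   for m in range(6)])
print(np.round(coeffs, 8))   # only entries m = 2 and m = 3 are non-zero
```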
We expand functions in H^α(ℓ) in terms of the orthonormal basis, as in equation (134), and compute the spectral coefficients via the projection in equation (135). The coefficients, F_n, depend implicitly on α and ℓ, but we omit the labels for simplicity. For a finite number of modes in the sum in equation (134), we can compute equation (135) using Gaussian quadrature for Jacobi polynomials. We want a simple way to implement various operators on spectral-coefficient data. Therefore, we define a row vector of basis elements in equation (136), and use a tilde to denote the corresponding column vector, equation (137). Equation (137) gives the dual basis in the sense of equation (138), where δ(r − r′) is the Dirac-δ distribution, and I is the identity matrix acting on the n index. These definitions imply that the sum and projection in equations (134) & (135) take the compact forms of equation (139). We choose this row/column convention so that our state vector of spectral coefficients, F, is represented as a column vector.
With the above definitions, the algebraic and differential operators act on the basis elements as in equation (140), starting with f ∈ H^α(ℓ). In the last set of correspondences, the D±ℓ operators acting on functions represent the differential operators defined in equation (51), and D±α,ℓ represent sparse matrices defined via equations (123) & (124).
In equation (140), Cα,ℓ, Rα,ℓ, and D±α,ℓ all represent sparse matrices that act from the left on spectral coefficients. The matrix D−α,ℓ is diagonal, D+α,ℓ has a single band immediately above the diagonal, and each Cα,ℓ and Rα,ℓ matrix has entries along the diagonal and on the band immediately above. The various operators display an interesting duality: 1. Operators that are local in physical space, i.e., 1, r, 1 − r², couple two neighbouring modes in the n index.
2. Operators coupling nearby points in physical space, i.e., D ± ℓ , do not couple modes in the n index.
Operators that are local in physical space are non-local in coefficient n space, and the converse is also true.
Finally, for computing the solution to boundary-value problems we need the restriction operator acting at r = 1. This amounts to evaluating the basis at the boundary, which is explicit for Jacobi polynomials. After adjusting for the normalisation, the boundary values follow directly. For n ≫ 1, equation (145) gives the asymptotic behaviour of these values, which indicates how well a function in H^α(ℓ) must behave to have a meaningful restriction to the boundary. Equation (145) implies that restriction makes sense in the vast majority of circumstances encountered.
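The following sketch probes the boundary restriction numerically (our own check; equation (145) itself is not reproduced). It suggests boundary values that are nearly uniform in n for α = −1/2 and that grow, roughly like √n, for α = 0:

```python
import numpy as np
from scipy.special import eval_jacobi

def boundary_value(n, alpha, ell):
    # Orthonormalised Q^{alpha,ell}_n evaluated at r = 1.  The substitution
    # r = sin(t) turns the weight (1 - r^2)^alpha r^2 dr into
    # sin(t)^2 cos(t)^(2 alpha + 1) dt, which is smooth for alpha >= -1/2.
    q = lambda r: r**ell * eval_jacobi(n, alpha, ell + 0.5, 2.0 * r**2 - 1.0)
    x, w = np.polynomial.legendre.leggauss(400)
    t = 0.25 * np.pi * (x + 1.0)
    nrm2 = 0.25 * np.pi * np.sum(
        w * q(np.sin(t))**2 * np.sin(t)**2 * np.cos(t)**(2 * alpha + 1))
    return q(1.0) / np.sqrt(nrm2)

ell, results = 2, {}
for alpha in (0.0, -0.5):
    results[alpha] = np.array([boundary_value(n, alpha, ell) for n in (8, 16, 32)])
    print(alpha, np.round(results[alpha], 4))
# alpha = -1/2: boundary values are nearly uniform in n;
# alpha = 0: boundary values grow with n (roughly like sqrt(n)).
```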
As with V16, all the operators except C⊤α,ℓ are essential for constructing algebraic systems out of general linear PDEs. However, since 1 − r² = 0 at r = 1, the C⊤α,ℓ matrices are useful for creating basis sets that automatically satisfy Dirichlet boundary conditions. From these mappings, we can easily spot a few useful relationships between the conversion and multiplication matrices. All of these are manifestly positive definite, both because 0 ≤ 1 − r² ≤ 1 and because M⊤M is self-adjoint for any real matrix M.

Discussion
At this point we have derived methods for solving a large number of possible linear PDEs in spherical coordinates. The principles behind our methods are two-fold: 1. First we leverage the geometry as much as possible to decompose a high-dimensional problem into a (possibly large) collection of decoupled one-dimensional problems. This stage relies on properties specific to the sphere and ball in three dimensions. The methods are exact, and could be considered analytical as opposed to numerical or computational. The computational effectiveness of the exact methods relies on their expressiveness, which is to say the amount of useful information contained in an easily computable expression. We believe that the formulae provided in §1-4 meet these criteria. In the Part-II companion paper, we discuss in detail how we create the operators and data structures in easy-to-use code, and apply those operators to interesting PDE problems in the sphere.
2. After distilling the geometric content from three-dimensional polar-coordinate tensors, we then focus on the numerical task of solving equations within the regularity classes Reg(ℓ). This is a somewhat separate question from how we arrive at the classes in the first place. Functions with a particular behaviour near a coordinate singularity can be represented either with a specially designed basis, or with something more general purpose. In principle, virtually any numerical method employed in the solution of PDEs can solve equations on Reg(ℓ). Some of these methods do much better than others. In §4 we provide a class of methods (with a single free parameter) that conform specifically to the type of differential operators that commonly act in the sphere. These basis functions are not exact eigenfunctions of the given operators; in this sense they constitute a general-purpose method. On the other hand, explicitly considering geometry allows for a set of maximally sparse matrices that compute common operators accurately and quickly on general data. Our Jacobi-polynomial basis is physics-agnostic, but adheres to the geometry. We could, however, choose an even more general-purpose method. We make some comments on this below.

Matrix-system construction
There is a large literature discussing the advantages and disadvantages of different methods for solving equations of the form of equation (149), the spherical Bessel eigenvalue problem. The exact solutions to equation (149) are spherical Bessel functions f_n(r) ∝ j_ℓ(κ_n r) with j_ℓ(κ_n) = 0.
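For reference, the eigenvalues κ_n can be computed with standard library routines. A minimal sketch using scipy (not the paper's method) finds κ₁ for ℓ = 2:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import spherical_jn

ell = 2
# Scan for the first sign change of j_ell beyond x = ell, then refine
# with a bracketing root finder.
x = np.linspace(ell + 1.0, 20.0, 2000)
vals = spherical_jn(ell, x)
i = int(np.argmax(vals[:-1] * vals[1:] < 0))   # first sign change
kappa1 = brentq(lambda s: spherical_jn(ell, s), x[i], x[i + 1])
print(kappa1)   # ≈ 5.7635 for ell = 2
```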
As with trigonometric functions on the interval, Bessel functions make excellent test problems for orthogonal polynomials. The reason is that like sin / cos, Bessel functions are very much not finite polynomials. Highly accurate solutions are not a fait accompli. A method that does well on this kind of simple test problem is likely a good general-purpose method. The companion paper (Part-II) discusses the details of much more complicated numerical examples. For now, we only want to show a simple example for the purpose of demonstrating how to construct systems of linear equations from our method. This example also gives some context for how the same systems are constructed and solved in competing general-purpose methods.
The key question is: how should we turn equation (149) into a set of matrices that we can solve numerically?
First we pick a value of α > −1 for our primary basis set. The values α = 0 and α = −1/2 are both important special cases. The case α = 0 corresponds to the geometric weight in 3D, r² dr. The case α = −1/2 corresponds to r² dr/√(1 − r²). This second case has the disadvantage of not using the proper geometric measure. It is possible to circumvent this minor problem by making small adjustments to the quadrature weights when computing integration over the spherical volume. In the terminology of finite-element methods, we choose Q^{α,ℓ} as a basis of trial functions. After applying the left-hand side of equation (149), we arrive at equation (152). The problem with equation (152) is that the left-hand side is now in the α + 2 basis, while the right-hand side is still in the α basis. But we can convert the basis on the right up two levels.
We can now project equation (153) onto the Q^{α+2,ℓ} basis. In the language of finite-element methods, this is the basis of test functions. In general, the trial and test functions can lie in separate spaces. Many spectral methods produce dense matrices because the trial and test spaces are taken to be the same. This may seem like a simplification, but simplifying the basis set at the cost of sparsity in the matrices can incur a large computational cost.
We must also include boundary conditions, equation (154). Currently, equation (153) is upper-triangular in terms of the vector of unknowns, F. Upper-triangularity implies that the system is closed if we truncate with N equations for N variables, without considering the boundary conditions. The boundary-condition equation (154) makes N + 1 equations for N variables, and the system is over-constrained. We could solve the problem by truncating equation (153) with only N − 1 equations for N variables. But there is a more general method that accomplishes the same thing.
We employ a τ-method correction [23,39,2] in equation (153). That is, we include a right-hand-side correction of the form of equation (155), where P^ℓ_N(r) ∈ Reg(ℓ) is a specific but otherwise arbitrary polynomial. The variable τ is a single additional degree of freedom that allows imposing the boundary conditions. The choice of the specific P^ℓ_N(r) determines the specific truncation of the system. If we choose P^ℓ_N(r) = Q^{α+2,ℓ}_N(r), then the τ-correction is equivalent to dropping the last equation from the system and replacing it with the boundary conditions. But we can make more general choices that affect the accuracy of the solution. We explore the consequences of different τ-correction choices in Part-II.
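The τ-correction idea is easiest to see on a one-dimensional toy problem. The sketch below (our own, using a Chebyshev basis rather than the paper's spherical one) solves u′ + u = f with u(1) = 1 by overwriting the highest-degree equation with the boundary condition:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Toy tau-method: u'(x) + u(x) = f(x), u(1) = 1, with exact solution
# u = x^3 (so f = 3x^2 + x^3).  This is the "drop the last equation and
# replace it with the boundary condition" form of the tau-correction;
# it is not the paper's spherical system.
N = 8
D = np.zeros((N, N))
for j in range(N):
    e = np.zeros(N)
    e[j] = 1.0
    d = C.chebder(e)
    D[:len(d), j] = d            # derivative matrix on Chebyshev coefficients

A = D + np.eye(N)                # the operator u' + u in coefficient space
fc = np.zeros(N)
fc[:4] = C.poly2cheb([0.0, 0.0, 3.0, 1.0])   # f = 3x^2 + x^3

# Tau correction: overwrite the highest-degree equation with the boundary
# condition u(1) = sum_n u_n T_n(1) = sum_n u_n = 1.
A[-1, :] = 1.0
fc[-1] = 1.0
u = np.linalg.solve(A, fc)
print(np.round(u, 12))   # coefficients of x^3: [0, 0.75, 0, 0.25, 0, ...]
```

Because the exact solution is a low-degree polynomial, this particular truncation recovers it to machine precision; for non-polynomial solutions the choice of τ polynomial controls where the truncation error is absorbed.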
We can now project equation (154) onto the Q^{α+2,ℓ} basis. The entire system can then be written in block form, where Π^{α,ℓ}_N is the column vector of the τ-correction polynomial projected against the test basis.
With the choice P^ℓ_N(r) = Q^{α+2,ℓ}_N(r), this column is all 0s, except for a single 1 in the last entry. However, we could choose P^ℓ_N(r) = Q^{α,ℓ}_N(r), in which case Π^{α,ℓ}_N has only three entries (rather than one) and the system is still sparse. We can pick P^ℓ_N(r) to maximise accuracy (under some constraints), and then we can pick α to maximise the sparsity (under some other constraints). It is also possible, with a small number of matrix operations, to reduce the τ column to a single entry and thereby decouple it entirely from the rest of the system.

Automatic boundary conditions
There is another, quite elegant, method for implementing the boundary conditions, which we turn to now. We note that the adjoint of the conversion matrix C has the effect of multiplying by (1 − r²). We can therefore propose the ansatz of equation (161), where G is now an unknown column vector. Equation (161) implies equation (162), so that f(r = 1) = 0 automatically. Equations (161) & (162) imply that it is sometimes advantageous to use a non-orthogonal basis of trial functions. We then only need to apply the appropriate matrix on the right of the system and we are done, as in equation (163). Livermore (2010) [24] pointed out some notable algebraic properties of Jacobi polynomials, in particular their ability to easily satisfy Galerkin boundary conditions while maintaining some kind of orthogonality. We believe that equation (162) lies at the root of that ability. We could make small adjustments to the matrix in equation (161) if we want a boundary condition other than Dirichlet (see §5.2 of V16, and also [9,20,24,25,26]). The matrices in equation (163) have one band on the lower diagonal and two bands above; there are no dense rows. This approach is called the "Dirichlet basis" method, the "Galerkin basis" method, or "basis recombination", depending on the context.
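A minimal one-dimensional analogue of this recombination (ours, with Chebyshev polynomials standing in for the C⊤ construction) is the classic combination φ_n = T_n − T_{n+2}, which vanishes at both endpoints since T_n(±1) = (±1)^n:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# 1-D sketch of basis recombination: phi_n = T_n - T_{n+2} satisfies
# homogeneous Dirichlet conditions identically, playing the same role as
# the paper's (1 - r^2)-weighted trial functions built from C^T.
def phi(n, x):
    e = np.zeros(n + 3)
    e[n], e[n + 2] = 1.0, -1.0
    return C.chebval(x, e)

for n in range(5):
    assert abs(phi(n, 1.0)) < 1e-14 and abs(phi(n, -1.0)) < 1e-14
print("recombined basis vanishes at x = ±1")
```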
With a few more moving parts, it is possible to combine the τ-correction method with the automatic boundary conditions. Even if the τ column is dense, it is possible to row-precondition the system in a similar way as for the automatic boundary conditions, at the cost of one more matrix band. This leads us to highlight another important theme of our work: • The sparsity and the accuracy of a given system are two separate issues that are all-too-often intertwined. We can make a system both minimally banded and also choose an arbitrary (polynomial) space to absorb the truncation error.
• Sparsity is advantageous for the speed of computing solutions, numerical well-conditioning, and ease of automatic system construction.
• Slightly different τ -correction terms can produce large differences in the overall accuracy of a scheme, independent of the sparsity.
• A specific choice of τ-correction calls into existence an exact finite solution in terms of polynomials. At this point, any basis of polynomials will solve the problem; some of them with many more numerical operations required than others.

Comparison to other work
It is helpful to compare the basis sets and operators described in §4 to other tools that exist for problems in the full sphere. It is also useful to compare our methods to similar schemes using Chebyshev polynomials in domains without coordinate singularities (e.g., Cartesian, and cylindrical annulus).
Other than the generality in the choice of τ-correction, equations (154) & (155) follow the same approach that we took in V16. A number of other authors have taken a similar approach in one dimension with Chebyshev polynomials [57,4,2,15,8,20,33,36,54,51]. Equation (154) is the closest in methodology to the "ultraspherical method" of Olver and Townsend (2012) [36]. Our method also shares some similarities with the "quasi-inverse" method of Julien and Watson (2009) [20]. In the latter case, the "quasi-inverse" implied finding a pre-conditioning operator that renders the equations sparse. The pre-conditioning operator followed systematically from three-term recursion formulae for the derivative of any basis set. It turns out that Jacobi polynomials are the unique class of basis that satisfies the properties required for the quasi-inverse. We also now know that, for Jacobi polynomials and PDEs, the preconditioning matrices will always relate to a basis-conversion operator in some way.
We can truncate equation (154) in many different ways without using a τ-correction. However, all methods of truncation can be mapped into a τ-method with some kind of equation correction; see [7]. For many examples, truncating the system and replacing a row with equation (155) works quite well. However, it is not automatic that all truncations work equally well [5,6,13,31]. A full discussion of which method works best is an important topic that is outside the scope of our work here.

Jones-Worland Polynomials
In recent years, a number of authors have used a basis set that they term Jones-Worland polynomials [27,17,28,29]. This basis relates in a clear way to the basis sets that we use. The individual basis elements W are defined (up to normalisation) in terms of Jacobi polynomials, as in equation (164). This basis is similar to our α = −1/2 basis, but note the important difference that the second Jacobi parameter is ℓ − 1/2, rather than the ℓ + 1/2 of equation (120) that allows our basis to conform most efficiently to spherical geometry. The proponents of equation (164) say that the basis set has the advantage of being the closest analogue of Chebyshev polynomials, with as close-to-uniform oscillation as possible.
In some of our numerical tests in Part-II, we find some advantage from using α = −1/2 over α = 0, but not in every situation. When α = −1/2 is better, we find that almost all of the gains come from controlling behaviour near the outer boundary. Half-integer differences between values of Jacobi parameters are not accessible with sparse transformations; integer differences allow sparse conversion. We see from equation (145) that α = −1/2 does produce uniformity in the boundary values of the basis functions. However, integer differences in the geometrically motivated second parameter do not make much difference in the numerical properties of solutions.
There is a simple conversion between the polynomials defined in Livermore, Jones & Worland (2007) [27] and our definition of a Generalised Spherical Zernike basis set with α = −1/2. The best this basis can do in terms of sparsity would be to replace our operators in equation (154) as in equation (166). Previous work using this basis has always used a dense formulation instead; the inverse operators make the problem dense. Even in the sparse equation (166), there is no need to add an additional matrix band. If the Jones-Worland basis gives different numerical results for a specific problem, it is always possible to make small alterations to the last entries of a truncated matrix to render the two methods exactly equivalent. This could be done with a small change in a τ-correction polynomial. (Two notes: in the ∞-dimensional case of [7], the idea is to perform sparse orthogonal matrix operations to render part of the system upper-triangular before deciding the degree of truncation based on an adaptive error estimate; the method is infinite in the sense that it uses a greedy algorithm in the forward orthogonal-transformation stage. Also, it is possible to make the half-integer parameter conversion fast in a highly accurate, albeit approximate, way; this technique is useful for computing transforms between a spatial grid and spectral coefficients [45,44].) Figure 1 shows the relative error of the first solution to the spherical Bessel function problem using three different methods. The green and red lines use different respective τ columns, one of which contains four non-zero entries. The plots compute the relative error from solving for the first root of a spherical Bessel function. The different methods produce different slopes of the error curves, and small changes can produce large differences at low-to-moderate resolutions. We find this very important in Part-II when studying some of the finer details of magnetic dynamos.
Both methods from equation (169) do much better than Chebyshev methods at the same number of degrees of freedom.

The recursion formulae of Matsushima & Marcus
Matsushima & Marcus (1995) [30] found a basis with sparse differentiation operators. Their basis sets are equivalent to ours for specific parameter values. We show here how their operators relate to ours.

Matsushima & Marcus used operators of the form
It is possible to construct sparse matrix systems by multiplying through by an appropriate power of r, and then acting on a basis set. Matsushima & Marcus also found other related operations, e.g., multiplication by (1 − r²), but not by r individually. The operations are presented in terms of explicit recursion formulae for the polynomials within a regularity space. The most important distinction is that there are no operators that map between regularity spaces. This not only makes systems more difficult to construct automatically, but also significantly more dense.

Chebyshev solution
Several authors have employed Chebyshev polynomials as a basis for Reg(ℓ) in spherical problems; see [27,3,17,29,50]. It is possible to use Chebyshev polynomials and achieve reasonably good results. A Chebyshev scheme has the advantage of a fast transform; we discuss this more in the next section. The difficulty for Chebyshev methods is that it takes many degrees of freedom just to represent the leading-order r^ℓ in Reg(ℓ) for ℓ ≫ 1. Depending on the error requirements, it may not require fully ℓ degrees of freedom, but it will not be easy.
For example, it is possible to represent r¹⁰⁰ to within 1% point-wise absolute error with Chebyshev polynomials of degree less than approximately 25. But the truncation will not respect any of the analytic properties of the original function; e.g., r¹⁰⁰/r = r⁹⁹, and dr¹⁰⁰/dr = 100r⁹⁹. The Chebyshev truncation will give completely incorrect results under differentiation and division until the expansion degree reaches exactly 100. And even then, numerical imprecision can still lead to serious errors in the analytic properties.
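The 1% claim is easy to check numerically. The sketch below (ours) computes accurate Chebyshev coefficients of x¹⁰⁰ by Gauss-Chebyshev quadrature and measures the error of the degree-25 truncation:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Chebyshev coefficients of x^100 via Gauss-Chebyshev quadrature, which
# is exact for these polynomial integrands; then truncate at degree 25.
M = 400
x, w = C.chebgauss(M)
f = x**100
coeffs = np.array([(2.0 - (k == 0)) / np.pi *
                   np.sum(w * f * C.chebval(x, np.eye(26)[k]))
                   for k in range(26)])

xx = np.linspace(-1.0, 1.0, 2001)
err = np.max(np.abs(xx**100 - C.chebval(xx, coeffs)))
print(err)   # ≈ 0.012: about 1% absolute error, consistent with the claim
```

The worst-case error sits at the endpoints, where all the dropped (positive) coefficients add coherently.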
To solve the spherical Bessel problem with Chebyshev polynomials, we first define ϑ ≡ r d/dr and multiply through by r² to avoid the singularity at r = 0, giving equation (175). It is advantageous to take the reflection symmetry into account in the Chebyshev expansion, but not to include an r^ℓ prefactor, which leads to severe ill-conditioning [3]. We therefore expand as in equation (176) and apply the scaled derivative operator, ϑ. After posing the problem, sparse matrix construction requires the relationships in equation (178) between Chebyshev polynomials of the first and second kind. These are the same differentiation, conversion, and multiplication operations that we present in §4, only between Jacobi polynomials with parameters ±1/2. Equation (178) also forms the foundation of all sparse Chebyshev methods that have come before, most particularly the "quasi-inverse" method of Julien and Watson and the "ultraspherical" method of Olver and Townsend [20,36], both of which provided much inspiration for the methods in V16 and the current work.
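These first/second-kind relations are standard and easy to verify numerically, e.g.:

```python
import numpy as np
from scipy.special import eval_chebyt, eval_chebyu

# Check T'_n = n U_{n-1} (differentiation) and T_n = (U_n - U_{n-2})/2
# (conversion) at random interior points, using a central difference
# for the derivative.
rng = np.random.default_rng(0)
x = rng.uniform(-0.9, 0.9, 7)
h = 1e-6
for n in range(2, 8):
    dT = (eval_chebyt(n, x + h) - eval_chebyt(n, x - h)) / (2.0 * h)
    assert np.allclose(dT, n * eval_chebyu(n - 1, x), atol=1e-4)
    assert np.allclose(eval_chebyt(n, x),
                       0.5 * (eval_chebyu(n, x) - eval_chebyu(n - 2, x)))
print("first/second-kind relations verified")
```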
After using the properties of Chebyshev polynomials, the matrix version of equation (175) contains 9 matrix bands along the diagonal, and a dense boundary condition row. It is also possible to enforce the boundary conditions with basis recombination, and thereby increase the matrix bandwidth by two. The system is sparse, and straightforward to solve numerically.
The biggest drawback of the Chebyshev method is the difficulty of working with vector and tensor systems. Equation (175) already requires more special intervention than §5.1. Everything quickly becomes much more complicated for higher-rank systems. It is also not possible to forward differentiate (e.g., grad) an expansion of the form of equation (176), because this requires dividing by r. Townsend, Wilber & Wright [50,55] took steps to mitigate this problem by filtering the first few monomials after solving.
In many cases, a Chebyshev method is sufficient for reasonable results at modest resolution. But we can see from the error plots in Figure 1 that the results for more specialised methods can be orders of magnitude better.

Grid representation and spectral transforms
After finding the solution to a given system of equations in terms of spectral coefficients, it is natural to consider how to represent the solution on a spatial grid. Specifically, how can we compute the projections in equation (139)? The obvious answer is to use Gaussian quadrature. We implement this method for the calculation of nonlinear terms in Part-II, using methods similar to those outlined in [49]. Given practical and efficient transform methods for our bases, there are also some interesting new alternatives on the horizon.
We discuss the issue of spectral transforms at length in V16. An unfortunate aspect of general Jacobi polynomials is the lack of a fast spectral transform. Only the Chebyshev polynomials have an unambiguous O(N log N) transform. All other bases technically require O(N²) operations for the transforms. In V16, we demonstrated that the number of spectral modes must reach a couple of hundred before the asymptotically O(N log N) transform beats matrix-based transforms on modern SIMD hardware. Boyd and Yu (2011) [3] also discuss the advantages and disadvantages of a geometrically adapted method versus a Chebyshev-based method with a fast transform. They too stress that the advantages of a fast transform are not as great as sometimes believed. For the 3D ball, the timing and scaling results for matrix transforms are identical to those in the disk. However, Jacobi-derived bases in three dimensions allow for additional advantages over 2D.
Recall that we can convert each Jacobi index up one degree with a two-band matrix. This means that we can convert down one degree by inverting a two-band matrix, which requires O(N) operations per conversion. If we pick α = −1/2 as the primary basis, then we can convert all the way to P^{(−1/2,−1/2)}_n(z) ∝ T_n(z) in O(Nℓ) total operations. This is advantageous because we can use a fast transform on the Chebyshev basis.
The downside to converting to a Chebyshev basis is that typically $\ell_{\max} \sim N$. There are big savings for small-$\ell$ values, but not so much for the high-degree modes. It is also not clear a priori how numerically stable it is to convert down many degrees. But recent work by Slevinsky [44,45,46] shows great promise in this regard for fast and stable algorithms with general Jacobi polynomial bases. Slevinsky showed how to use a divide-and-conquer butterfly method to compute the fast conversion from arbitrary Jacobi parameters to parameter values $-1 < a, b \le 0$. For half-integer values of both parameters, it is then possible to use the Chebyshev transform after conversion. If the down-conversion does not land on $-1/2$, then Slevinsky [45] and others [16] recently demonstrated how to transform to the Chebyshev basis (from, e.g., Legendre) in a simple way based on asymptotics. While these methods are not exact, their accuracy can reach machine precision without compromising overall efficiency.
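For instance, once coefficients sit in the $P_n^{(-1/2,-1/2)} \propto T_n$ basis, the grid transform is a standard $O(N \log N)$ discrete cosine transform; a minimal sketch:

```python
import numpy as np
from scipy.fft import dct

# Chebyshev grid-to-coefficient transform via a type-II DCT.
N = 32
theta = np.pi * (np.arange(N) + 0.5) / N
x = np.cos(theta)                       # Chebyshev (Gauss) points

# Samples of T_3(x) + T_7(x)/4, since T_n(cos(theta)) = cos(n*theta)
f = np.cos(3 * theta) + 0.25 * np.cos(7 * theta)

a = dct(f, type=2) / N                  # Chebyshev coefficients
a[0] /= 2                               # standard DCT normalisation fix
assert np.isclose(a[3], 1.0) and np.isclose(a[7], 0.25)
```

The inverse transform is likewise a (type-III) DCT, so both directions run in $O(N \log N)$.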
Moreover, the results of Slevinsky and others allow for effective fast transforms for spin-weighted spherical harmonics. That said, the lack of fast spherical-harmonic transforms is not a serious impediment to the creation of high-performance codes for multi-scale physical applications. Several highly tuned numerical transform libraries have proven quite effective in advanced applications [48,42].

Conclusions
This paper derives a new algorithm for the accurate and efficient solution of partial differential equations involving scalars, vectors, and tensors in spherical polar coordinates. We represent variables in a spectral basis derived from the geometric properties of the sphere. The basis respects the tensorial nature of equations, and tensorial derivative operators (i.e., grad, div, curl, etc.) are all represented as maximally sparse matrices. Furthermore, the basis automatically satisfies the regularity conditions at the coordinate singularities of spherical polar coordinates: the north and south poles, and the centre of the sphere.
We derive the basis in two steps. First we consider the angular dependence (§2). Following the work of Phinney & Burridge (1973) [41], we use spin-weighted spherical harmonics. These are a generalisation of the regular spherical harmonics (which correspond to spin-weight 0), and can also represent vectors and tensors on a 2-sphere. To represent the angular variation of vectors and tensors, we first transform them, via a unitary transformation, into a basis of elements with definite spin-weight. A new result for computations on the 2-sphere is the derivation of operators that represent multiplication by $\cos\theta$ or $\sin\theta$, which is useful for solving equations whose coefficients have angular variation. Setting up a calculation on the surface of a sphere is then no more challenging than working with Fourier series. We have already deployed these methods to study 2D active-matter turbulence on the surface of a sphere [32].
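For concreteness, in the convention of Goldberg et al. (1967) (sign and normalisation conventions vary between references), the spin-raising and spin-lowering derivatives act diagonally on the spin-weighted harmonics:
\[
\eth\, {}_{s}Y_{\ell m} = \sqrt{(\ell - s)(\ell + s + 1)}\; {}_{s+1}Y_{\ell m},
\qquad
\bar{\eth}\, {}_{s}Y_{\ell m} = -\sqrt{(\ell + s)(\ell - s + 1)}\; {}_{s-1}Y_{\ell m},
\]
so angular derivatives reduce to multiplication of spectral coefficients by simple $\ell$- and $s$-dependent factors, in direct analogy with Fourier wavenumbers.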
The second step derives a radial basis. The difficulty is that different components of tensors may behave differently near $r = 0$. One key result of this paper is the derivation of an orthogonal transformation that takes a tensor represented with spin-weighted spherical harmonics to a new basis in which each element lies in a specific regularity space (§3). This orthogonal transformation depends on the spherical harmonic degree $\ell$. A simple recursion relation allows the easy construction of the required transformations via equation (94). This extends the work of James (1976) [19], which defined a different type of transformation based on Wigner 3-j coefficients and which was difficult to use for high-rank tensors.
A particular basis of weighted Jacobi polynomials represents each regularity space (§4). Sparse tensorial derivative operators (e.g., \textit{grad} and \textit{div}) map between different classes of weighted Jacobi polynomials. We derive a series of useful operators for the solution of partial differential equations, all based on the properties of Jacobi polynomials; this is another key result of this paper. These operators combine in a straightforward way to solve partial differential equations (§5).
We also demonstrate how to construct different truncations via the $\tau$ method in §5. We find that different residuals can lead to error properties that differ by many orders of magnitude. But we only demonstrate these differences for the simplest relevant eigenvalue problem. In Part-II of this series, we show how our methods behave in situ.
Part-II describes the implementation of our algorithms as an extension of the Dedalus code framework. To demonstrate whole-code capability, we solve several partial differential equations relevant to astrophysical and geophysical fluid dynamics, written in tensorial form. For example, switching from solving a hydrodynamics problem to a magnetohydrodynamics problem involves the addition of a few dozen lines of code. We demonstrate performance similar to other numerical methods designed specifically for magnetohydrodynamics, despite our much more general approach. The algorithms derived in this paper thus hold great promise for the solution of a wide range of problems involving partial differential equations in spherical polar coordinates.
As an overall theme, our work in this series of papers, and in V16, demonstrates how to construct very efficient and general numerical methods on domains with coordinate singularities. The approach of using the algebraic properties of Jacobi polynomials extends in a similar way to even more general domains. Working out the details can get complicated, but the end result follows the same pattern and produces maximally sparse operators. With other collaborators, we have already shown how Jacobi polynomials have analogous properties on triangles [37]. The corners of a triangle act in the same way as polar-coordinate singularities. It is necessary to link the Jacobi parameters to account for singular behaviour [11,21]. The same situation holds for squares and higher-dimensional polyhedra.

A Vector and tensor basis element notation
This appendix describes several aspects of tensor basis elements. Starting with the standard orthonormal coordinate basis vectors, we generate a spin-weighted set of basis vectors,
\[
e_0 \equiv e_r, \qquad e_{\pm} \equiv \frac{1}{\sqrt{2}} \left( e_\theta \mp i\, e_\phi \right).
\]
A unitary matrix $U$ transforms between the coordinate and spinor basis vectors; consistent with the definitions above (ordering the spinor elements as $(e_-, e_0, e_+)$),
\[
\begin{pmatrix} e_- \\ e_0 \\ e_+ \end{pmatrix}
= \frac{1}{\sqrt{2}}
\begin{pmatrix} 0 & 1 & i \\ \sqrt{2} & 0 & 0 \\ 0 & 1 & -i \end{pmatrix}
\begin{pmatrix} e_r \\ e_\theta \\ e_\phi \end{pmatrix}.
\]
To transform higher-order tensors (of rank $r$) between the coordinate and spinor bases, we take the Kronecker product of $U$ with itself $r$ times.
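A small numerical sketch of such a matrix $U$ (the row ordering $(e_-, e_0, e_+)$ and phases here are our assumption; only the definitions of $e_0$ and $e_\pm$ above are taken as given) confirms unitarity of $U$ and of its Kronecker square, which transforms rank-2 tensors:

```python
import numpy as np

s = 1 / np.sqrt(2)
# Rows give (e_-, e_0, e_+) in terms of (e_r, e_theta, e_phi),
# following e_0 = e_r and e_pm = (e_theta -/+ i e_phi)/sqrt(2).
U = np.array([[0, s,  1j * s],
              [1, 0,  0     ],
              [0, s, -1j * s]])

assert np.allclose(U @ U.conj().T, np.eye(3))

# Rank-2 tensors transform with the Kronecker product U (x) U,
# which inherits unitarity automatically.
U2 = np.kron(U, U)
assert np.allclose(U2 @ U2.conj().T, np.eye(9))
```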
The coordinate basis is orthonormal under the usual dot product:
\[
e_r \cdot e_r = e_\theta \cdot e_\theta = e_\phi \cdot e_\phi = 1,
\qquad
e_r \cdot e_\theta = e_r \cdot e_\phi = e_\theta \cdot e_\phi = 0. \tag{182}
\]
Therefore, for $\sigma, \tau = -1, 0, +1$,
\[
e_\sigma \cdot e_\tau =
\begin{cases}
1 & \text{if } \sigma = -\tau, \\
0 & \text{otherwise.}
\end{cases} \tag{183}
\]
This is the case for the standard dot product, which is not a proper inner product for the complex basis elements. We can take a couple of different approaches to defining dual (or contravariant) vectors. We could simply use the complex conjugate, $e_\sigma \cdot e_\tau^* = e_\sigma^* \cdot e_\tau = \delta_{\sigma,\tau}$. We could alternatively define the contravariant unit vector
\[
e^\sigma \equiv e_{-\sigma} = e_\sigma^*. \tag{184}
\]
Then one must take care with superscripts: $e^\sigma \cdot e_\tau = \delta^\sigma_\tau$.
The cross product for the coordinate basis implies
\[
e_0 \times e_\pm = \pm i\, e_\pm, \qquad e_+ \times e_- = i\, e_0.
\]
We use multi-index notation for tensor components, and a slightly different notation to denote a multi-index basis element, rather than an individual vector element. For the basis elements of a rank-$r$ tensor, we define
\[
e(\sigma) \equiv e_{\sigma_1} \cdots e_{\sigma_r}, \qquad \sigma_i \in \{-1, 0, +1\}.
\]
The sum $\bar{\sigma} \equiv \sigma_1 + \cdots + \sigma_r$ is called the spin-weight; this quantity arises repeatedly throughout the following analysis. It is also useful to define the rank via $|\sigma| \equiv r$. A general rank-$r$ tensor therefore has the form
\[
T = \sum_\sigma T_\sigma\, e(\sigma), \qquad \text{where } T_\sigma = T_{\sigma_1, \ldots, \sigma_r},
\]
and each element $\sigma_i = -1, 0, +1$.
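These identities are easy to verify numerically; the sketch below represents the spinor basis vectors as complex Cartesian triples and checks the dot- and cross-product relations (using the unconjugated dot product, as in the text):

```python
import numpy as np

# Spinor basis vectors as complex combinations of (e_r, e_theta, e_phi).
e_r, e_th, e_ph = np.eye(3)
e = {0: e_r.astype(complex),
     +1: (e_th - 1j * e_ph) / np.sqrt(2),
     -1: (e_th + 1j * e_ph) / np.sqrt(2)}

# Bilinear (unconjugated) dot product: e_sigma . e_tau = 1 iff sigma = -tau
for sg in (-1, 0, 1):
    for tu in (-1, 0, 1):
        assert np.isclose(np.dot(e[sg], e[tu]), 1.0 if sg == -tu else 0.0)

# Cross products: e_0 x e_pm = pm i e_pm, and e_+ x e_- = i e_0
assert np.allclose(np.cross(e[0], e[+1]),  1j * e[+1])
assert np.allclose(np.cross(e[0], e[-1]), -1j * e[-1])
assert np.allclose(np.cross(e[+1], e[-1]), 1j * e[0])
```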