Solution of the Schrödinger equation for quasi-one-dimensional materials using helical waves

We formulate and implement a spectral method for solving the Schrödinger equation, as it applies to quasi-one-dimensional materials and structures. This allows for computation of the electronic structure of important technological materials such as nanotubes (of arbitrary chirality), nanowires, nanoribbons, chiral nanoassemblies, nanosprings and nanocoils, in an accurate, efficient and systematic manner. Our work is motivated by the observation that one of the most successful methods for carrying out electronic structure calculations of bulk/crystalline systems — the plane-wave method — is a spectral method based on eigenfunction expansion. Our scheme avoids computationally onerous approximations involving periodic supercells often employed in conventional plane-wave calculations of quasi-one-dimensional materials, and also overcomes several limitations of other discretization strategies, e.g., those based on finite differences and atomic orbitals. The basis functions in our method — called helical waves (or twisted waves) — are eigenfunctions of the Laplacian with symmetry adapted boundary conditions, and are expressible in terms of plane waves and Bessel functions in helical coordinates. We describe the setup of fast transforms to carry out discretization of the governing equations using our basis set, and the use of matrix-free iterative diagonalization to obtain the electronic eigenstates. Miscellaneous computational details, including the choice of eigensolvers, use of a preconditioning scheme, evaluation of oscillatory radial integrals and the imposition of a kinetic energy cutoff are discussed. We have implemented these strategies into a computational package called HelicES (Helical Electronic Structure). We demonstrate the utility of our method in carrying out systematic electronic structure calculations of various quasi-one-dimensional materials through numerous examples involving nanotubes, nanoribbons and nanowires.
We also explore the convergence properties of our method, and assess its accuracy and computational efficiency by comparison against reference finite difference, transfer matrix method and plane-wave results. We anticipate that our method will find applications in computational nanomechanics and multiscale modeling, for carrying out transport calculations of interest to the field of semiconductor devices, and for the discovery of novel chiral phases of matter that are of relevance to the burgeoning quantum hardware industry.


Introduction
Low dimensional materials have been intensely investigated in the past few decades due to their remarkable electronic, optical, transport and mechanical characteristics [1,2]. The properties of these materials often provide sharp contrasts with the bulk phase, and have led to various technological applications, including, e.g., new kinds of sensors, actuators and energy harvesting devices [3][4][5][6][7][8]. Quasi-one-dimensional materials — which include nanotubes, nanoribbons, nanowires, nanocoils, as well as miscellaneous structures of biological origin [9,10] — are particularly interesting in this regard. This is due to the unique electronic properties that emerge as a result of the availability of a single extended spatial dimension in these structures [11][12][13][14], the possibility that they are associated with ferromagnetism, ferroelectricity, and superconductivity [15][16][17][18], and the fact that the behavior of these materials may be readily modulated via imposition of mechanical deformation modes such as torsion and/or stretching [19][20][21]. Quasi-one-dimensional materials have also been investigated as hardware components for computing platforms — both conventional [22,23] and quantum [24]. The applications of such materials in the latter case are connected to anomalous transport (the Chiral Induced Spin Selectivity effect [25]) and exotic electronic states [26] that can be observed in such systems.
Given the importance of quasi-one-dimensional materials, it is highly desirable to have available computational methods that can efficiently characterize the unique electronic properties of these systems. However, conventional electronic structure calculation methods — based, e.g., on plane-waves [27,28] — are generally inadequate in handling them. This is a result of the non-periodic symmetries in the atomic arrangements of such materials. As a result of these symmetries, the single particle Schrödinger equation associated with the electronic structure problem exhibits special invariances [29,30], which plane-waves, being intrinsically periodic, are unable to handle. For example, ground state plane-wave calculations of a twisted nanoribbon (see Fig. 1a) will usually involve making the system artificially periodic along the direction of the twist axis — thus resulting in a periodic supercell containing a very large number of atoms, as well as the inclusion of a substantial amount of vacuum padding in the directions orthogonal to the twist axis, so as to minimize interactions between periodic images. Together, these conditions can make such calculations extremely challenging even on high performance computing platforms, if not altogether impractical. There have been a few attempts to treat quasi-one-dimensional materials using Linear Combination of Atomic Orbitals (LCAO) based techniques [31][32][33][34][35][36]. However, such methods suffer from basis incompleteness and superposition errors [37][38][39], which can make it difficult to obtain systematically convergent and improvable results.
In view of these limitations of conventional methods, a series of recent contributions has explored the use of real space techniques to study quasi-one-dimensional materials and their natural deformation modes [20,30,40,41]. Specifically, this line of work incorporates the helical interaction potentials present in such systems using helical Bloch waves and employs higher order finite differences to discretize the single particle Schrödinger equation in helical coordinates. While this technique shows systematic convergence, and has enabled the exploration of various fascinating electromechanical properties, it also has a number of significant drawbacks. First, due to the curvilinearity of helical coordinates, the discretized Hamiltonian appearing in these calculations is necessarily non-Hermitian [42,43]. This complicates the process of numerical diagonalization and makes many of the standard iterative eigensolvers [44] unusable. Second, the discretized equations have a coordinate singularity along the system axis which restricts the use of these methods to tubular structures and prevents important nanomaterials such as nanowires and nanoribbons from being studied. The presence of the singularity also tends to ill-condition the discretized Hamiltonian, which further restricts the applicability of the method to systems in which the atoms lie far enough away from the system axis (e.g., larger diameter nanotubes). Finally, while the finite difference approach does allow for the simulation of materials with twist (intrinsic or applied), the sparsity pattern of the discretized Hamiltonian worsens upon inclusion of twist, making simulations of such systems significantly more burdensome.
In this work we formulate and implement a novel computational technique that remedies all of the above issues and allows one to carry out systematic numerical solutions of the Schrödinger equation, as it applies to quasi-one-dimensional materials and structures. The technique presented here can be thought of as an analog of the classical plane-wave method, and is similar in spirit to the spectral scheme for clusters presented in [45]. Like the classical plane-wave method, a single parameter (the kinetic energy cutoff) dictates the overall quality of solution of our numerical scheme. We present a derivation of the basis functions of our method — called helical waves (or twisted waves) — as eigenfunctions of the Laplacian under suitable boundary conditions. We describe how helical waves may be used to discretize the symmetry adapted Schrödinger equation for quasi-one-dimensional materials, and how matrix-free iterative techniques can be used for diagonalization. A key feature of our technique is the handling of convolution sums through the use of fast basis transforms, and we describe in detail how these transforms are formulated and implemented. We also discuss various other computational aspects, including the choice of eigensolvers and preconditioners, and the handling of oscillatory radial integrals that appear in our method. We have implemented these techniques into a MATLAB [46] package called HelicES (Helical Electronic Structure), which we use for carrying out demonstrative electronic structure calculations of various quasi-one-dimensional materials. We also present results related to the convergence and accuracy properties of our method, while using finite difference, transfer matrix and plane-wave methods for reference data.
We remark that our technique has connections with methods presented in earlier work concerning electronic structure calculations in cylindrical geometries [47][48][49][50][51], but is more general in that the use of helical waves automatically allows both chiral (i.e., twisted) and achiral (i.e., untwisted) structures to be naturally handled. Additionally, some of these earlier studies have employed the strategy of explicitly setting up the discretized Hamiltonian and then using direct diagonalization techniques, which scales in a significantly worse way (both in memory and computational time) compared to the transform based matrix-free strategies adopted by us. We also note in passing that the basis functions presented here appear to be scalar versions of twisted wave fields explored recently in the X-ray crystallography [52,53] and elastodynamics [54,55] literature.
The rest of this paper is organized as follows. In Section 2, we specify the class of systems of interest to this work, formalize the relevant computational problem, and describe our discretization strategy. Numerical techniques and algorithms are presented in Section 3, following which we present results in Section 4. We conclude in Section 5, where we also discuss the future outlook of this work. Miscellaneous derivations and computational details are presented in the Appendices.

Formulation
In what follows, e_X, e_Y, e_Z will denote the standard orthonormal basis of R³. Position vectors will be typically denoted using boldface lower case letters (e.g., p) and rotation matrices using boldface upper case (e.g., Q). The atomic unit system of m_e = 1, ℏ = 1, 1/(4πε₀) = 1 will be used throughout the paper, unless otherwise mentioned. Cartesian and cylindrical coordinates will be typically denoted as (x, y, z) and (r, ϑ, z) respectively. The × sign will be reserved for denoting dimensions of matrices (e.g., using M × N to denote the dimensions of a matrix with M rows and N columns), while * will be used to explicitly denote multiplication by or in between scalars, vectors and matrices.

Description of Physical System and Computational Problem
We consider a quasi-one-dimensional nanostructure of infinite extent aligned along e_Z (see Fig. 1). We assume the structure to be of limited extent along e_X and e_Y. Let the atoms of the structure have coordinates: Quasi-one-dimensional structures in their undeformed states, or while being subjected to natural deformation modes such as extension, compression or torsion, can often be described using helical (i.e., screw transformation) and cyclic symmetries [10,20,29,30]. Accordingly, we may identify a finite subset of atoms of the structure with coordinates: and a corresponding set of symmetry operations: such that: Here, the Υ_{ζ,μ} are symmetry operations of the structure — specifically, each Υ_{ζ,μ} is an isometry whose action on an arbitrary point x ∈ R³ (denoted as Υ_{ζ,μ} ∘ x) is to rotate it by the angle 2πζα + μΘ about e_Z, while simultaneously translating it by ζτ along the same axis. The natural number N is related to cyclic symmetries in the nanostructure about the axis e_Z, with Θ = 2π/N denoting the cyclic symmetry angle. The quantity τ is the pitch of the screw transformation part of Υ_{ζ,μ}, the parameter α takes values 0 ≤ α < 1, and β = 2πα/τ captures the rate of twist (imposed or intrinsic) in the structure. The case α = 0 usually represents achiral or untwisted structures (see Fig. 1). The electronic properties of a quasi-one-dimensional material under study can be investigated by calculating the spectrum of the single particle Schrödinger operator: ℌ = −(1/2)Δ + V(x), associated with the system. Determination of this spectrum in an efficient manner, especially for realistic quasi-one-dimensional nanomaterials, serves as the primary computational problem of interest in this work. Here, V(x) represents the "effective potential" as perceived by the electrons. The potential can be computed through self-consistent means (for example, as part of Density Functional Theory calculations [20,30]), or through the use of empirical pseudopotentials [56,57], as done here.
Due to the presence of global structural symmetries, the potential is expected to obey: V(Υ_{ζ,μ} ∘ x) = V(x) for every Υ_{ζ,μ} ∈ G. As a consequence of the quasi-one-dimensional nature of the system, and the above symmetry conditions, the eigenstates of the Hamiltonian can be characterized in terms of Helical Bloch waves [29,30]. Specifically, solutions of the Schrödinger equation: ℌ ψ_j(x; η, ν) = λ_j(η, ν) ψ_j(x; η, ν), can be labeled using band indices j ∈ N, and symmetry adapted quantum numbers η ∈ [−1/2, 1/2), ν ∈ {0, 1, 2, . . . , N − 1}. Moreover, these solutions obey the following condition for any symmetry operation Υ_{ζ,μ} ∈ G: ψ_j(Υ_{ζ,μ} ∘ x; η, ν) = e^{−i2π(ηζ + νμ/N)} ψ_j(x; η, ν). The above relation can be used to reduce the computational problem of determining the eigenstates of the Schrödinger operator over all of space, to a fundamental domain or symmetry-adapted unit cell. Since the structures considered here have limited spatial extent in the e_X − e_Y plane, so does the computational unit cell. We denote the maximum radial coordinate of the points in the computational domain as R. Then, this region of space (see Fig. 3) can be parametrized in cylindrical coordinates as: Due to the decay of the wavefunctions in the radial direction [58,59], it is often appropriate to enforce Dirichlet boundary conditions on the surface r = R, as done here. In practice, the value of R can be chosen so as to ensure a sufficient amount of vacuum exists between the structure under study and this lateral boundary surface [30,40].

The Helical Coordinate System and Transformation of Schrödinger's Equation
For computational purposes, it is useful to utilize a coordinate system that describes the computational domain D, and the quasi-one-dimensional system's symmetries, more naturally. To this end, we employ helical coordinates [29,60,61] in this work (Fig. 2). For a point p ∈ R³ with Cartesian coordinates (x_p, y_p, z_p), cylindrical coordinates (r_p, ϑ_p, z_p), and helical coordinates (θ1_p, θ2_p, r_p), the following relations hold: θ1_p = z_p/τ, θ2_p = ϑ_p/2π − α z_p/τ, with the radial coordinate r_p common to the cylindrical and helical systems. Regardless of the amount of twist or cyclic symmetries present in the system, the fundamental domain D (eq. 9) can be conveniently expressed as a cuboid in helical coordinates, i.e., D = {(θ1, θ2, r) : 0 ≤ θ1 < 1, 0 ≤ θ2 < 1/N, 0 ≤ r ≤ R}. Thus, it is easier to set up a computational mesh over the fundamental domain using helical coordinates. Moreover, the action of the symmetry operations Υ_{ζ,μ} ∈ G simply results in translations of the helical coordinates: if p ∈ R³ has helical coordinates (θ1_p, θ2_p, r_p), then Υ_{ζ,μ} ∘ p has helical coordinates (θ1_p + ζ, θ2_p + μ/N, r_p). In particular, this implies that a function that is group invariant may be represented over the computational domain by means of periodic boundary conditions along the θ1 and θ2 directions.
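The change of variables above can be made concrete in a few lines. The following sketch (in Python for brevity; HelicES itself is written in MATLAB) assumes the relations θ1 = z/τ and θ2 = ϑ/2π − αz/τ, which are consistent with the translation property quoted above, and checks numerically that the action of Υ_{ζ,μ} reduces to a translation of the helical coordinates; the function names and test values are purely illustrative.

```python
import numpy as np

def cart_to_helical(x, y, z, tau, alpha):
    # Assumed helical-coordinate relations (consistent with the
    # translation property in the text):
    #   theta1 = z / tau,  theta2 = vartheta/(2*pi) - alpha*z/tau
    r = np.hypot(x, y)
    vartheta = np.arctan2(y, x)
    return z / tau, vartheta / (2.0 * np.pi) - alpha * z / tau, r

def apply_Upsilon(x, y, z, tau, alpha, N, zeta, mu):
    # Upsilon_{zeta,mu}: rotation by 2*pi*zeta*alpha + mu*(2*pi/N)
    # about e_Z, combined with a translation by zeta*tau along e_Z.
    ang = 2.0 * np.pi * zeta * alpha + mu * 2.0 * np.pi / N
    c, s = np.cos(ang), np.sin(ang)
    return c * x - s * y, s * x + c * y, z + zeta * tau

# In helical coordinates the action should be the pure translation
# (theta1, theta2, r) -> (theta1 + zeta, theta2 + mu/N, r), up to an
# integer shift of theta2 from the 2*pi ambiguity of arctan2.
tau, alpha, N = 1.3, 0.27, 5
p = (0.4, -0.9, 0.7)
zeta, mu = 2, 3
t1, t2, r = cart_to_helical(*p, tau, alpha)
t1p, t2p, rp = cart_to_helical(*apply_Upsilon(*p, tau, alpha, N, zeta, mu),
                               tau, alpha)
```

The modulo-1 comparison in the check below accounts for the branch cut of the azimuthal angle.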
Substituting eq. 13 into the Schrödinger equation above (eq. 12), and after some algebra (Appendix A), we arrive at: This serves as the governing equation for the computational method in this work. It needs to be discretized and solved over the fundamental domain along with the enforcement of periodic boundary conditions in the θ1 and θ2 directions (eq. 14), and the imposition of wavefunction decay in the radial direction, i.e.: ψ(θ1, θ2, r = R) = 0. Note that due to eq. 6, the effective potential in helical coordinates, V(θ1, θ2, r), also obeys conditions of the form outlined in eq. 14, although it is generically not expected to obey decay conditions similar to eq. 16.

Basis Set and Discretization
We now discuss discretization of the governing equations using helical waves. The derivation of these basis functions as symmetry adapted eigenfunctions of the Laplacian is presented in Appendix B. In what follows, we will usually suppress the dependence of φ_j(θ1, θ2, r; η, ν) on the band index (j) for the sake of simplicity of notation. Denoting the basis functions as F_{m,n,k}(θ1, θ2, r) in helical coordinates, we write: φ(θ1, θ2, r) = Σ_{(m,n,k)∈Γ} φ̂_{m,n,k} F_{m,n,k}(θ1, θ2, r), with F_{m,n,k}(θ1, θ2, r) = c_{m,n,k} e^{i2π(mθ1 + nNθ2)} J_{nN}(b_{nN,k} r/R). Here, φ̂_{m,n,k} are the expansion coefficients, J_{nN}(·) denotes Bessel functions of the first kind of order nN, while b_{nN,k} denote the zeros of these Bessel functions. The basis function normalization constants c_{m,n,k} are: c_{m,n,k} = [R |J_{nN+1}(b_{nN,k})| √(πτ/N)]^{−1}. The set Γ denotes triplets of integers (m, n, k) such that m ∈ [−M_max, M_max], n ∈ [−N_max, N_max] and k ∈ [1, K_max]. The basis set size is L = (2M_max + 1) * (2N_max + 1) * K_max, i.e., it grows as O(M_max N_max K_max) in terms of the discretization sizes along the θ1, θ2, r directions. By design, the basis functions are orthonormal, i.e.: ⟨F_{m,n,k}, F_{m′,n′,k′}⟩_{L²(D)} = δ_{m,m′} δ_{n,n′} δ_{k,k′}, and they satisfy (see Appendix B): −Δ F_{m,n,k} = [(b_{nN,k}/R)² + 4π²(m − αnN)²/τ²] F_{m,n,k}. The above condition implies that the kinetic energy part of the single particle Schrödinger operator is diagonalized in this basis. Consistent with the literature, we will refer to the representation of a function in terms of its expansion coefficients (i.e., φ̂_{m,n,k} in the above) as its reciprocal space representation. Furthermore, we will refer to the representation of the function in terms of its values on a discrete set of grid points as its real space representation. If the basis functions are also available on these same grid points, the real and reciprocal space representations of the function can be connected via eq. 17. In Appendix C we demonstrate how the gradients of quantities expressed via eq. 17 may be evaluated.
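For readers who wish to experiment with the radial part of the basis, the following Python sketch (HelicES itself is MATLAB) constructs ξ_{n,k}(r) ∝ J_{nN}(b_{nN,k} r/R) from the Bessel zeros and verifies its orthonormality with respect to the radial measure r dr by quadrature. The normalization here covers only the radial factor (not the full three-dimensional constant c_{m,n,k}), and the parameter values are arbitrary.

```python
import numpy as np
from scipy.special import jn_zeros, jv

# Radial part of the helical waves: xi_{n,k}(r) ~ J_{nN}(b_{nN,k} r / R),
# where b_{nN,k} is the k-th positive zero of J_{nN}. The Dirichlet
# condition at r = R is built in, since J_{nN}(b_{nN,k}) = 0.
R, Ncyc, n = 6.0, 4, 1
order = n * Ncyc
zeros = jn_zeros(order, 5)                       # b_{nN,1..5}

def xi(k, r):
    # Fourier-Bessel normalization so that
    #   int_0^R xi_k(r) xi_l(r) r dr = delta_{kl}
    b = zeros[k - 1]
    return np.sqrt(2.0) / (R * np.abs(jv(order + 1, b))) * jv(order, b * r / R)

# Check orthonormality with Gauss-Legendre quadrature on [0, R].
nodes, weights = np.polynomial.legendre.leggauss(200)
r = 0.5 * R * (nodes + 1.0)
w = 0.5 * R * weights
gram = np.array([[np.sum(w * r * xi(k, r) * xi(l, r))
                  for l in (1, 2, 3)] for k in (1, 2, 3)])
```

The Gram matrix of the first three radial functions should be the identity to quadrature accuracy.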
For common systems of interest, the number of basis functions required for discretizing the governing equations can number in the tens or hundreds of thousands (see Section 4). Thus, it can become infeasible to explicitly store the discretized Hamiltonian. This scenario is also encountered in the classical plane-wave method for bulk systems [27,28], and can be addressed by working with the discretized Hamiltonian implicitly, and using iterative, matrix-free diagonalization techniques to compute the eigenstates [44,62]. For adopting such strategies, we need to be able to compute the action of the Hamiltonian on an arbitrary vector, such as the wavefunction, as represented in our basis set. To this end, we consider a vector φ ∈ Span{F_{m,n,k} : (m, n, k) ∈ Γ}, substitute eq. 17 into eq. 15, and use eq. 20, to arrive at: which further simplifies to: The constants a, b, c in the above are as follows: The action of the Hamiltonian on the vector φ is simply the left hand side of eq. 22 above. We observe that due to orthonormality of the basis set, the first four terms on the left hand side are easily handled in reciprocal space. Specifically, the second term is simply a scaling of the input vector φ with the factor a(α, τ, η, ν), while the other three terms can be evaluated as element-wise product operations (MATLAB operation .*). Thus, these terms can all be evaluated at a cost that scales linearly with the basis set size. The last term on the left hand side is associated with the action of the effective potential V(x) on the wavefunction vector. If the expansion coefficients of the potential are available as: then the expansion coefficients of V(x)φ(x) can be computed as: ⟨V Σ_{(m,n,k)∈Γ} φ̂_{m,n,k} F_{m,n,k}(θ1, θ2, r), F_{m′,n′,k′}(θ1, θ2, r)⟩_{L²(D)}. There are two problems with the above evaluation strategy. First, the time complexity of the procedure scales in a cubic manner with respect to the basis set size.
Moreover, if the coupling coefficients: ⟨V F_{m,n,k}, F_{m′,n′,k′}⟩_{L²(D)} are to be calculated and stored ahead of time for easier evaluation of eq. 25, the memory complexity of the procedure would also scale cubically with the basis set size. By making use of the fact that the coupling coefficients are non-zero only when the angular indices satisfy m + m̃ = m′ and n + ñ = n′ (with m̃, ñ denoting the angular indices of the potential's expansion), their evaluation, storage and application to eq. 25 can be somewhat simplified. Despite this, the overall complexity still continues to be cubic in the basis set size in both memory and time. Second, the potential V(x) is generally not expected to be equal to zero at r = R and may be slowly decaying due to long range electrostatic effects. Hence, it is inappropriate to express this quantity in terms of helical waves obeying Dirichlet boundary conditions. Both of the above issues can be remedied by adopting a pseudospectral evaluation strategy [45,[63][64][65][66], as we now describe. This relies on the observation that if V(x) and φ(x) are available in real space, as functions sampled at a common set of grid points, the product χ(x) = V(x)φ(x) can be evaluated with a cost proportional to the size of the grid. Thereafter, the function χ(x) can be directly expanded in terms of helical waves to yield: χ̂_{m,n,k} = ⟨χ, F_{m,n,k}⟩_{L²(D)}. Since χ(x) obeys Dirichlet boundary conditions and inherits all symmetries of the group G, its expansion using helical waves is appropriate. To put this strategy into practice however, we need access to fast basis transforms so that functions expressed in reciprocal space (i.e., as expansion coefficients) and real space (i.e., on the grid) may be readily interconverted. We describe the formulation and implementation of such transform routines in Sections 3.4.1 and 3.4.2. The overall computational cost of this strategy is the sum total of the costs of the forward and inverse transforms, and the cost of carrying out the real space product. Theoretically, the transforms described here scale in a manner that is slightly worse than linear in the basis set size.
However, as we show later, in practice they scale more favorably, in a sub-linear manner (see Fig. 4). Furthermore, the real space grid size is usually a constant multiple of the basis set size, leading to the overall cost of the pseudospectral method scaling in a manner that is close to the first power of this quantity (= O(M_max N_max K²_max)). The memory complexity is also reduced and scales as the basis set size itself, i.e., O(M_max N_max K_max).
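The essential structure of this matrix-free, pseudospectral Hamiltonian application can be illustrated in a reduced setting. The Python sketch below uses a one-dimensional periodic (plane-wave) analogue, with ordinary FFTs standing in for the helical-wave transforms: the kinetic term is applied diagonally in reciprocal space, the potential pointwise in real space, and the resulting operator is diagonalized iteratively without ever being stored. The grid size, cell length and potential well are hypothetical.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

Ng = 128                                     # grid / basis size
Lcell = 10.0                                 # hypothetical periodic cell
x = np.arange(Ng) * Lcell / Ng
g = 2.0 * np.pi * np.fft.fftfreq(Ng, d=Lcell / Ng)   # wavevectors
V = -5.0 * np.exp(-(x - Lcell / 2.0) ** 2)   # hypothetical potential well

def apply_H(c):
    # c holds reciprocal-space expansion coefficients.
    c = np.asarray(c, dtype=complex).ravel()
    kinetic = 0.5 * g ** 2 * c               # diagonal in reciprocal space
    phi = np.fft.ifft(c)                     # reciprocal -> real space
    return kinetic + np.fft.fft(V * phi)     # pointwise product, then back

H = LinearOperator((Ng, Ng), matvec=apply_H, dtype=complex)
evals = np.sort(eigsh(H, k=4, which='SA', return_eigenvectors=False))
```

Each matvec costs O(Ng log Ng), while forming and storing the dense Hamiltonian would cost O(Ng²); this is the trade that makes large helical-wave basis sets tractable.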
Finally, we discuss the evaluation of the fifth and the sixth terms on the left hand side of eq. 22. The fifth term satisfies the Dirichlet condition at r = R and invariance under G, since it is a finite linear combination of terms which individually obey these conditions. Thus, its expansion coefficients are: In the above, we have made use of the orthonormality of the complex exponentials in the θ1 and θ2 directions. We may rewrite eq. 29 as: with: Thus, if the quantities I(n′, k, k′) are known ahead of time, the coefficients in eq. 30 can be readily evaluated at a cost of O(M_max N_max K²_max), i.e., quite close to the overall basis set size, and similar in computational complexity to the evaluation of the potential term. Since I(n′, k, k′) is problem independent (e.g., it has no dependence on R, α, τ or the potential V(x)), we may evaluate and store it as a table for a large range of values of n′, k and k′. During program execution, this table of values may be loaded into memory, and each coefficient can be evaluated as a vector dot product (eq. 30), after accessing the necessary values of I(n′, k, k′). As for the evaluation of the I(n′, k, k′) values themselves, we may use the recurrence relation [67]: J_ν(x)/x = [J_{ν−1}(x) + J_{ν+1}(x)]/(2ν), to rid the integrand in eq. 31 of its denominator, and obtain a pair of oscillatory integrals. We may then compute these by using Gauss-Jacobi quadrature, as outlined in Appendix D.
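The exact integrands I(n′, k, k′) are not reproduced here, but the mechanics of Gauss-Jacobi quadrature for oscillatory integrals with algebraic endpoint factors can be sketched on a model problem; the integrand, weight exponents and node count below are illustrative choices, not the ones used in HelicES.

```python
import numpy as np
from scipy.special import roots_jacobi
from scipy.integrate import quad

# Model oscillatory integral with an algebraic endpoint factor:
#     I = int_0^1 sqrt(t) * cos(omega * t) dt,  omega = 40.
omega = 40.0

# Map t = (x + 1)/2 onto [-1, 1]; sqrt(t) becomes the Jacobi weight
# (1 + x)^(1/2) up to a constant, so use alpha = 0, beta = 1/2.
x, w = roots_jacobi(60, 0.0, 0.5)
I_gj = 2.0 ** (-1.5) * np.sum(w * np.cos(omega * (x + 1.0) / 2.0))

# Reference value from adaptive quadrature.
I_ref, _ = quad(lambda t: np.sqrt(t) * np.cos(omega * t), 0.0, 1.0, limit=200)
```

Because the algebraic factor is absorbed into the quadrature weight, a modest number of nodes resolves the remaining smooth oscillation to near machine precision.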
In an analogous manner, the sixth term on the left hand side of eq. 22 can be simplified to: With the quantities I(n′, k, k′) available, the above can be evaluated in a manner similar to the evaluation of the fifth term, at a comparable computational cost. The key difference is that instead of the vector {φ̂_{m,n,k}}_{(m,n,k)∈Γ}, we need to consider an alternate one with entries {i2πnN φ̂_{m,n,k}}_{(m,n,k)∈Γ}. However, this modified vector is already available as part of the evaluation of the fourth term on the left hand side of eq. 22, and therefore, it can be reused.

Numerical Implementation
We have implemented the above computational strategies into a MATLAB [46] package called HelicES (Helical Electronic Structure). To ensure efficiency, our code heavily relies on vectorization features of MATLAB. Various details of our implementation are as follows.

Wave function storage: reciprocal and real space considerations
For any quantity in reciprocal space, there are three indices m, n, k associated with each expansion coefficient, making the collection of coefficients a three-dimensional object of dimensions (2M_max + 1) × (2N_max + 1) × K_max, with L entries in total. However, it is easier for linear algebra operations to have these coefficients stacked up as a vector in C^L. To achieve this, we adopt the following mapping between (m, n, k) ∈ Γ and the linear index i ∈ {1, 2, . . . , L}: With this, a collection of N_s wavefunctions can be stored as a complex matrix of dimensions L × N_s. For the real space representation, the number of grid points to be chosen along each helical coordinate θ1, θ2 is dictated by the accuracy of the basis transforms (see Section 3.4). We choose to work with Fourier nodes along the θ1 and θ2 directions and denote the corresponding number of grid points as N_θ1 and N_θ2 respectively. Along the radial direction, we choose N_r Gauss-Jacobi nodes [68] over the interval [0, R]. This has the advantage that the coordinate singularity at the origin is automatically avoided. In order to accommodate non-linearities and to reduce aliasing errors [45,69], we typically choose N_θ1 = 4 * M_max + 1, N_θ2 = 4 * N_max + 1 and N_r = 4 * K_max. These choices generally allow transforms to be evaluated accurately up to machine precision. With this setup, quantities such as the wavefunction are available in real space over a three-dimensional grid (i.e., the tensor product grid resulting from the one-dimensional grids along the individual coordinate directions), and each grid point is indexed via i ∈ {1, 2, . . . , N_θ1}, j ∈ {1, 2, . . . , N_θ2} and k ∈ {1, 2, . . . , N_r}.
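Since eq. 35 fixes one particular linearization convention, the sketch below (Python, for brevity) shows a representative mapping with k running fastest, then n, then m, together with its inverse; the ordering chosen here is hypothetical and need not coincide with the one actually used in HelicES.

```python
# Hypothetical linearization of the triplets (m, n, k) in Gamma into
# i in {1, ..., L}, with k fastest, then n, then m (1-based, as in the text).
Mmax, Nmax, Kmax = 3, 2, 4
L = (2 * Mmax + 1) * (2 * Nmax + 1) * Kmax

def to_linear(m, n, k):
    return ((m + Mmax) * (2 * Nmax + 1) + (n + Nmax)) * Kmax + k

def from_linear(i):
    q, k0 = divmod(i - 1, Kmax)
    mq, nq = divmod(q, 2 * Nmax + 1)
    return mq - Mmax, nq - Nmax, k0 + 1

triplets = [(m, n, k) for m in range(-Mmax, Mmax + 1)
                      for n in range(-Nmax, Nmax + 1)
                      for k in range(1, Kmax + 1)]
idx = sorted(to_linear(m, n, k) for (m, n, k) in triplets)
```

The round-trip check below confirms the mapping is a bijection onto {1, ..., L}.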
For storage, we stack this three dimensional representation into a complex column vector of size N θ 1 * N θ 2 * N r , for which we use the following ordering: Since the memory requirement for storage of each wavefunction in real space is much higher than storing it in reciprocal space, we typically avoid storing real space versions of all N s wave functions simultaneously.

Imposition of kinetic energy cutoff
In conventional plane-wave calculations, it is common to specify a kinetic energy cutoff, i.e., a limit on the H^1 Sobolev norm of the plane-waves to be used for discretization [28,70]. Once a suitable periodic unit cell has been identified, this criterion automatically provides a recipe for calculating the number of plane-waves along each of the Cartesian axes, and in turn, the dimensions of the underlying real space grid to be used for Fast Fourier Transforms (FFTs). In a similar vein, we may wish to retain only helical waves with kinetic energies below a pre-specified cutoff in our calculation, since this has the advantage that the basis set limits M_max, N_max, and K_max get specified automatically in proportion to the computational domain's geometry parameters. At the gamma point (η = 0, ν = 0) for example, the kinetic energy cutoff criterion requires that all helical waves F_{m,n,k}, with m, n, k values satisfying: be included in our calculations. In our implementation, we first determine the largest absolute values of the integers m, n and the largest natural number k consistent with eq. 37. We set the basis set limits M_max, N_max, and K_max accordingly. The real space grids used for carrying out fast transforms (described below) are chosen based on these quantities. Within these (2M_max + 1) * (2N_max + 1) * K_max helical waves, however, not every combination of m, n, k would satisfy the kinetic energy criterion. To remedy this, we create a masking vector to exclusively retain helical waves which satisfy eq. 37, in various operations of interest (such as the Hamiltonian times wavefunction products). Based on the linear ordering for reciprocal space storage outlined in eq. 35, we may express the masking vector as: Element-wise multiplication of a given vector with the masking vector results in only the kinetic energy limited helical waves being retained in the calculation.
We implement the above strategy at each η, ν point (with the expression for the kinetic energy modified appropriately) to impose the kinetic energy cutoff in HelicES.
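The masking mechanics can be sketched as follows (Python for brevity). The per-basis kinetic energies used below are a hypothetical stand-in for eq. 37 (the true expression couples m and n through the twist parameter α); the point here is the construction and element-wise application of the masking vector itself.

```python
import numpy as np
from scipy.special import jn_zeros

Mmax, Nmax, Kmax, Ncyc = 4, 3, 6, 2
tau, R, Ecut = 1.5, 8.0, 30.0

# Bessel zeros b_{nN,k} for each angular index n.
zeros = {n: jn_zeros(abs(n) * Ncyc, Kmax) for n in range(-Nmax, Nmax + 1)}

# Hypothetical per-basis "kinetic energies" standing in for eq. 37,
# ordered with k fastest, then n, then m.
E = np.array([0.5 * ((2.0 * np.pi * m / tau) ** 2 + (zeros[n][k - 1] / R) ** 2)
              for m in range(-Mmax, Mmax + 1)
              for n in range(-Nmax, Nmax + 1)
              for k in range(1, Kmax + 1)])

mask = (E <= Ecut).astype(float)     # masking vector, one entry per basis fn
coeffs = np.random.default_rng(0).standard_normal(E.size)
limited = mask * coeffs              # element-wise product (MATLAB .*)
```

Entries above the cutoff are zeroed, while all others pass through unchanged, exactly as in the Hamiltonian-times-wavefunction products described above.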

η-space discretization and parallelization
As described earlier, to obtain the helical Bloch states (eq. 8), i.e., solutions to the single electron problem with a symmetry adapted potential (eq. 6), the single electron Hamiltonian has to be diagonalized for every η ∈ [−1/2, 1/2) and ν ∈ {0, 1, 2, . . . , N − 1}. To make this calculation feasible, we sample η over a discrete set of N_η points {η_b} ⊂ [−1/2, 1/2). The specific choice of the values η_b is based on the Monkhorst-Pack scheme [71]. This procedure is akin to "k-point sampling" in conventional periodic codes [27]. With this choice, the Hamiltonian needs to be diagonalized at N_K = N_η * N points, and integrals in η can be calculated via quadrature: Here, {w_b}_{b=1}^{N_η} are the Monkhorst-Pack quadrature weights and are uniformly equal to 1/N_η. Integrals of the above kind appear, for example, while computing the electronic band energy, or the electron density from helical Bloch states [20,30].
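A uniform, equal-weight sampling of η and the resulting quadrature rule can be sketched as follows. The node placement below is one common Monkhorst-Pack style convention, chosen here for illustration, and the integrand is a standard test case with the known value 1/√3.

```python
import numpy as np

def eta_points(n_eta):
    # Uniform, symmetric (Monkhorst-Pack style) grid in (-1/2, 1/2),
    # with equal weights 1/n_eta.
    b = np.arange(1, n_eta + 1)
    eta = (2.0 * b - n_eta - 1.0) / (2.0 * n_eta)
    w = np.full(n_eta, 1.0 / n_eta)
    return eta, w

# Example: integrate f(eta) = 1 / (2 + cos(2*pi*eta)) over one period.
# The exact value of this standard integral is 1/sqrt(3).
f = lambda e: 1.0 / (2.0 + np.cos(2.0 * np.pi * e))
eta, w = eta_points(12)
approx = np.sum(w * f(eta))
```

Because the integrand is smooth and periodic, this equal-weight rule converges geometrically in N_η, which is why modest η-grids suffice in practice.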
If the single electron Hamiltonian does not include magnetic fields — as is the case here — time reversal symmetry allows a further reduction in the number of η, ν points at which the Hamiltonian has to be diagonalized [40,72]. Specifically, it holds that for any η ∈ [−1/2, 1/2), the helical Bloch states and the associated electronic bands obey: and: Overall, this reduces the number of discrete points in reciprocal space by a factor of 2.
Since the diagonalization problem arising from distinct sets of η, ν values are independent of one another, they can be dealt with in an embarrassingly parallel manner. In our implementation, we have used MATLAB's Parallel Computing Toolbox (specifically, the parfor function) to carry out this parallelization.
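The embarrassingly parallel structure can be mimicked outside MATLAB as well. The sketch below uses a Python thread pool in place of parfor, with a hypothetical diagonalize() routine standing in for the per-(η, ν) eigenproblem, and checks that the parallel results match a serial run.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def diagonalize(args):
    # Hypothetical stand-in for the per-(eta, nu) eigenproblem: build a
    # small Hermitian matrix parameterized by (eta, nu) and diagonalize.
    idx, eta, nu = args
    n = 8
    rng = np.random.default_rng(idx)            # deterministic per point
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    H = A + A.conj().T + (2.0 * np.pi * eta + nu) * np.eye(n)
    return np.sort(np.linalg.eigvalsh(H))

etas = (-0.375, -0.125, 0.125, 0.375)
points = [(i, eta, nu) for i, (eta, nu) in
          enumerate((e, v) for e in etas for v in range(3))]

# The (eta, nu) points share no data, so they can be mapped independently.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(diagonalize, points))
serial = [diagonalize(p) for p in points]
```

No ordering or synchronization beyond the final gather is required, which is exactly what makes this loop a natural parfor target.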

Fast basis transforms
Since our strategy for carrying out Hamiltonian matrix-vector products involves fast basis transforms, we now elaborate on various aspects of the implementation of such operations within the HelicES code. To arrive at fast transforms, we exploit the separability of the basis functions into radial and θ1, θ2 dependence. This allows us to make use of quadrature along the radial direction, and subsequently, efficient two-dimensional fast Fourier transforms (FFTs) along the θ1, θ2 directions for each radial grid point, or for each radial basis function. Since the radial part of the basis functions consists of Bessel functions, we have also investigated the use of Hankel and discrete Bessel transforms [73][74][75]. However, we found that the quadrature approach adopted here resulted in better performance for the basis set sizes considered, consistent with some earlier studies [76].
In what follows, O_{M×N} is used to denote a zero matrix of size M × N. The typical real space grid point for sampling a function will be denoted as (θ1^i, θ2^j, r_k), with i ∈ {1, 2, . . . , N_θ1}, j ∈ {1, 2, . . . , N_θ2} and k ∈ {1, 2, . . . , N_r}. We will use the MATLAB commands ifft2 and fft2 to denote two-dimensional fast inverse and forward Fourier Transforms respectively [46]. Additionally, we will use the MATLAB commands ifftshift and fftshift to denote rearrangements of Fourier transform coefficients related to shifting of the zero frequency components to the matrix center [46]. Finally, we will use the MATLAB ':' (colon) operator notation to denote array/matrix indices with regular increments. Thus, i : k : j will denote indices starting at i, incremented by k and ending at j, i : j will denote the indices i, i + 1, i + 2, . . . , j − 1, j, and simply ':' will denote all indices along a particular matrix dimension.

Fast Inverse Basis Transform
Given the expansion coefficients {ĝ_{m,n,k}}_{(m,n,k)∈Γ}, a naive way of implementing the inverse basis transform would be to calculate each basis function F_{m,n,k} at every grid point (θ1^i, θ2^j, r^k), and to then evaluate the sum:

g(θ1^i, θ2^j, r^k) = Σ_{(m,n,k)∈Γ} ĝ_{m,n,k} F_{m,n,k}(θ1^i, θ2^j, r^k) . (42)

The computational complexity of this "naive inverse transform" is easily seen to be O(M_max² N_max² K_max²), and the constant involved in this estimate can be quite large, based on the discussion in Section 3.1. To remedy this situation, we express the basis functions as in Appendix B, i.e., F_{m,n,k}(θ1, θ2, r) = e^{i2π(mθ1 + nNθ2)} ξ_{n,k}(r), and rewrite eq. 42 as:

g(θ1, θ2, r) = Σ_{m=−M_max}^{M_max} Σ_{n=−N_max}^{N_max} e^{i2π(mθ1 + nNθ2)} ( Σ_{k=1}^{K_max} ĝ_{m,n,k} ξ_{n,k}(r) ) .

Since the quantity in parentheses is independent of the basis function index k once the sum over k has been carried out, we may denote it by G_{m,n}(r) and rewrite the above as:

g(θ1, θ2, r) = Σ_{m=−M_max}^{M_max} Σ_{n=−N_max}^{N_max} G_{m,n}(r) e^{i2π(mθ1 + nNθ2)} .

Thus, if the quantities G_{m,n}(r^k) are known, calculation of the inverse basis transform amounts to computing an inverse two-dimensional fast Fourier transform at each radial grid point r^k. Additionally, we observe that at each radial grid point r^k, G_{m,n}(r^k) can be expressed as a dot product between two K_max dimensional vectors, i.e., {ĝ_{m,n,k}}_{k=1}^{K_max} and {ξ_{n,k}(r^k)}_{k=1}^{K_max}. In fact, by grouping the evaluation of G_{m,n}(r^k) for different grid points together, the above operation may be expressed as the product of an N_r × K_max matrix with a K_max dimensional vector, which allows for the use of Level-2 BLAS [77] operations. If the radial parts of the basis functions (i.e., ξ_{n,k}(r^k)) are available ahead of time, the above steps provide a convenient recipe for computing the inverse basis transforms with computational complexity O(M_max N_max K_max (K_max + log(M_max) + log(N_max))), a significant improvement over the naive algorithm discussed earlier. We outline the overall steps of our implementation in Algorithms 1 and 2 below, and also illustrate some key aspects through Fig. 5.
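A compact NumPy sketch of this two-step procedure (radial contraction, then a per-radial-point inverse 2D FFT) is given below, checked against the naive mode sum. The radial functions ξ are stand-in random arrays here, rather than the actual Bessel-type functions of Appendix B, and the factor of N in the θ2 exponent is absorbed into the scaled θ2 grid.

```python
import numpy as np

rng = np.random.default_rng(0)
Mmax, Nmax, Kmax, Nr = 4, 3, 5, 6
n_t1, n_t2 = 2 * Mmax + 1, 2 * Nmax + 1       # angular grid sizes (one point per mode)
t1 = np.arange(n_t1) / n_t1                   # theta_1 grid on [0, 1)
t2 = np.arange(n_t2) / n_t2                   # N*theta_2 grid on [0, 1): factor N absorbed
m = np.arange(-Mmax, Mmax + 1)
n = np.arange(-Nmax, Nmax + 1)

g_hat = rng.standard_normal((n_t1, n_t2, Kmax)) + 0j   # coefficients g_hat[m, n, k]
xi = rng.standard_normal((n_t2, Nr, Kmax))             # stand-in radial functions xi_{n,k}(r)

# Step 1: radial contraction G_{m,n}(r) = sum_k g_hat[m,n,k] * xi_{n,k}(r)
# (a matrix-vector product for each n, i.e. Level-2 BLAS territory)
G = np.einsum('mnk,nrk->mnr', g_hat, xi)

# Step 2: one inverse 2D FFT over the (m, n) axes at every radial grid point
fast = np.fft.ifft2(np.fft.ifftshift(G, axes=(0, 1)), axes=(0, 1)) * (n_t1 * n_t2)

# Naive mode sum for verification (reuses G, so this checks the FFT step)
naive = np.zeros((n_t1, n_t2, Nr), dtype=complex)
for im, mm in enumerate(m):
    for jn, nn in enumerate(n):
        phase = np.exp(2j * np.pi * (mm * t1[:, None] + nn * t2[None, :]))
        naive += phase[:, :, None] * G[im, jn][None, None, :]
assert np.allclose(fast, naive)
```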
In Fig. 4 we compare the naive and fast inverse transforms as implemented in HelicES. The starting vectors {ĝ_{m,n,k}}_{(m,n,k)∈Γ} were chosen randomly for these tests. The results from the two methods always agreed with each other to machine precision, confirming the consistency of the implementations. However, consistent with the discussion above, the computational time for the naive transform is found to scale quadratically with the basis set size, while for the fast transform, the scaling is close to linear. The fact that the observed scaling of our fast transform implementation is actually sublinear is almost certainly related to our use of machine-optimized linear algebra and Fourier transform routines available within MATLAB.

Fast Forward Basis Transform
We now discuss the implementation of forward basis transforms within HelicES. Given a function g(θ1, θ2, r), the forward basis transform is defined as:

ĝ_{m,n,k} = ⟨g, F_{m,n,k}⟩_{L²(D)} .

With F_{m,n,k} and g both sampled on the real space grid, this can be approximated via quadrature as:

ĝ_{m,n,k} ≈ Σ_{i=1}^{N_θ1} Σ_{j=1}^{N_θ2} Σ_{k'=1}^{N_r} ω_θ1^i ω_θ2^j ω_r^{k'} g(θ1^i, θ2^j, r^{k'}) F̄_{m,n,k}(θ1^i, θ2^j, r^{k'}) .

Here, the quadrature weights along the θ1, θ2 directions are constants, i.e., ω_θ1^i = 1/N_θ1 and ω_θ2^j = 1/(N N_θ2), due to the use of Fourier nodes (i.e., the trapezoidal rule). The radial weights {ω_r^k}_{k=1}^{N_r} correspond to Gauss-Jacobi quadrature. We can see that, as in the case of the inverse transforms, a naive implementation of the above expression will lead to a computational complexity of O(M_max² N_max² K_max²). Instead, we evaluate the sums along the θ1, θ2 directions simultaneously at each radial grid point using 2D FFTs, and then perform quadrature in the radial direction. Thus, we compute:

ĝ_{m,n,k} ≈ Σ_{k'=1}^{N_r} ω_r^{k'} ξ_{n,k}(r^{k'}) [ Σ_{i=1}^{N_θ1} Σ_{j=1}^{N_θ2} ω_θ1^i ω_θ2^j g(θ1^i, θ2^j, r^{k'}) e^{−i2π(mθ1^i + nNθ2^j)} ] .

The radial quadratures in the above expression can be conveniently cast in terms of Level-2 BLAS [77] operations if the radial basis functions scaled by the corresponding quadrature weights (i.e., {ω_r^k ξ_{n,k}(r^k)}) are available as a matrix ahead of time. We outline the steps of our implementation in Algorithms 3 and 4 below, and illustrate key aspects in Fig. 6. Referring to Fig. 4, we see that, like the fast inverse basis transforms, our implementation of the fast forward basis transforms scales in a sublinear manner with respect to basis set size, although somewhat worse performance is expected theoretically. In contrast, a naive implementation of the forward transform scales quadratically with respect to basis set size, although both implementations of the transform always agree with each other to machine precision.
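The forward counterpart (angular 2D FFTs at each radial point, followed by radial quadrature against the weighted radial functions) can be sketched analogously; again, the radial functions and quadrature weights below are stand-ins rather than the actual Bessel functions and Gauss-Jacobi weights.

```python
import numpy as np

rng = np.random.default_rng(1)
Mmax, Nmax, Kmax, Nr = 3, 2, 4, 5
n1, n2 = 2 * Mmax + 1, 2 * Nmax + 1
m = np.arange(-Mmax, Mmax + 1)
n = np.arange(-Nmax, Nmax + 1)

g = rng.standard_normal((n1, n2, Nr)) + 0j    # samples g(theta1_i, theta2_j, r_k)
xi = rng.standard_normal((n2, Nr, Kmax))      # stand-in radial functions xi_{n,k}(r)
w_r = rng.random(Nr)                          # stand-in radial quadrature weights

# Step 1: 2D FFT over the angular directions at every radial point; fft2 + fftshift
# gives (1/(n1*n2)) sum_{i,j} g * exp(-i 2 pi (m t1_i + n t2_j)), modes centered
hat2 = np.fft.fftshift(np.fft.fft2(g, axes=(0, 1)), axes=(0, 1)) / (n1 * n2)

# Step 2: radial quadrature against the weighted radial functions (per-n BLAS-2)
fast = np.einsum('mnr,r,nrk->mnk', hat2, w_r, xi)

# Naive quadrature for verification
t1 = np.arange(n1) / n1
t2 = np.arange(n2) / n2
naive = np.zeros((n1, n2, Kmax), dtype=complex)
for im, mm in enumerate(m):
    for jn, nn in enumerate(n):
        phase = np.exp(-2j * np.pi * (mm * t1[:, None] + nn * t2[None, :])) / (n1 * n2)
        proj = np.einsum('ij,ijr->r', phase, g)    # angular sums at each radial point
        naive[im, jn, :] = (w_r * proj) @ xi[jn]   # radial quadrature, all k at once
assert np.allclose(fast, naive)
```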
In practice, the differences between the efficiencies of the fast and the naive transform implementations (both forward and inverse transforms) are not only apparent in terms of their respective scalings with respect to basis set size, but also the actual computational wall times. Indeed, we found that the fast transform implementations can be orders of magnitude faster as compared to the naive ones, even for relatively small basis set sizes. In Algorithm 5, we outline the steps of calculating the product of the Hamiltonian matrix with a wavefunction vector block by use of the fast transforms, as implemented in HelicES.

Eigensolvers and Preconditioning
As mentioned earlier, we make use of matrix-free iterative eigenvalue solvers for diagonalization of the discretized Hamiltonian. Within HelicES, we have investigated two different diagonalization strategies for this purpose. The first is based on the Krylov-Schur method as implemented in the MATLAB Eigs function [78-80]. The second is based on a MATLAB implementation [81] of the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) scheme [82-84]. LOBPCG requires the use of a preconditioner, for which we have adopted the Teter-Payne-Allan (TPA) recipe [79,80,85]. This preconditioner was originally developed in the context of plane-wave calculations of bulk systems, but has also been successfully applied to other spectral methods [45]. During LOBPCG iterations, use of the TPA preconditioner requires the calculation and application of a diagonal matrix K ∈ R^{L×L} to the residual vector. The entries of the matrix are:

K_ii = (27 + 18 g_i + 12 g_i² + 8 g_i³) / (27 + 18 g_i + 12 g_i² + 8 g_i³ + 16 g_i⁴) ,

with:

g_i = (kinetic energy of basis function i) / (kinetic energy of the residual vector) .
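A sketch of applying the TPA diagonal preconditioner to a residual vector is shown below, using the standard Teter-Payne-Allan rational function of g_i; the basis-function kinetic energies are stand-in values here rather than the actual Laplacian eigenvalues of the helical waves.

```python
import numpy as np

def tpa_apply(kinetic, residual):
    """Apply the diagonal TPA preconditioner K to a residual vector.

    kinetic  : kinetic energy of each basis function (stand-in values below)
    residual : residual vector expressed in the basis
    """
    # g_i = kinetic energy of basis function i / kinetic energy of the residual
    e_res = np.real(np.vdot(residual, kinetic * residual) / np.vdot(residual, residual))
    g = kinetic / e_res
    num = 27.0 + 18.0 * g + 12.0 * g**2 + 8.0 * g**3
    K = num / (num + 16.0 * g**4)   # K -> 1 for low-energy modes, K -> 0 for high-energy ones
    return K * residual

kin = np.linspace(0.1, 50.0, 64)    # stand-in basis-function kinetic energies
res = np.random.default_rng(2).standard_normal(64)
out = tpa_apply(kin, res)
```

Because K_ii decays like 1/(2 g_i) for large g_i, the high-kinetic-energy components of the residual are damped, which is what suppresses the iteration-count growth seen without preconditioning.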

As shown in Fig. 7, the preconditioner can have quite a dramatic effect on the convergence of the diagonalization procedure, especially as the basis set size (and therefore, the size of the discretized Hamiltonian) is increased. Thus, the benefits of the preconditioner are likely to become more apparent when larger systems and/or harder pseudopotentials are considered.

Figure 7: Effect of the TPA preconditioner [85] on LOBPCG [82] iterations for diagonalizing the discretized Hamiltonian in HelicES. An untwisted (6, 6) armchair carbon nanotube (Mayer pseudopotentials [86]) has been used, and the residual associated with the 2nd eigenvalue for η = 0, ν = 0 has been monitored. For clarity, the residual for every 10th iteration has been plotted. Without preconditioning, the number of iterations required to reach a given convergence threshold tends to increase dramatically as the basis set grows larger.
We found that use of LOBPCG along with the TPA preconditioner generally tends to require longer diagonalization wall times than Eigs along with an energy cutoff. Therefore, the latter strategy is adopted for most of the examples considered in the next section. Implementation of more efficient eigensolvers in HelicES, particularly ones that work well within self consistent field iterations [87-89], will be the subject of future work.
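As a Python analogue of this matrix-free usage (MATLAB's Eigs accepts a function handle for the product Hv), scipy's eigsh accepts a LinearOperator. The Hamiltonian action below is a toy diagonal-plus-coupling stand-in, not the fast-transform matrix-vector product of Algorithm 5.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

L = 200
kinetic = 0.5 * np.linspace(0.0, 30.0, L)**2   # stand-in diagonal kinetic term

def hamvec(x):
    # Matrix-free H*x: diagonal kinetic part plus a cheap symmetric (circulant) coupling
    return kinetic * x + 0.1 * np.roll(x, 1) + 0.1 * np.roll(x, -1)

H = LinearOperator((L, L), matvec=hamvec, dtype=np.float64)
vals = eigsh(H, k=6, which='SA', return_eigenvectors=False)  # six lowest eigenvalues
```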

Results
We now present results obtained using HelicES and investigate the convergence and accuracy properties of our implementation. All of our calculations have been carried out using smooth empirical pseudopotentials [86,90]. We have used the planewave code PETRA [91], as well as two separate MATLAB based finite difference codes to generate reference data for comparison purposes. Specifically, we have employed the helical symmetry adapted finite difference code Helical DFT [20,30] and the Cartesian grid finite difference code RSDFT [92]. The original versions of these finite difference codes were designed for self consistent field calculations, and were modified to work with the empirical pseudopotentials used in HelicES. We have also carried out comparisons of results obtained from HelicES against data obtained from the literature [86,90]. We have used the WebPlotDigitizer tool [93] for extracting data from published plots.

Computational Platform
All simulations involving HelicES were carried out using dedicated desktop workstations (Dell Precision 7920 Tower and iMac Pro) or on single nodes of the Hoffman2 cluster at UCLA's Institute for Digital Research and Education (IDRE). The Dell Precision workstation has an 18-core Intel Xeon Gold 5220 processor. Reference calculations using Helical DFT [30], RSDFT [92] and PETRA [91] employed the above platforms as well.

Convergence Studies
Using a twisted armchair carbon nanotube as an example system (Mayer pseudopotentials [86]), we first investigate the convergence properties of HelicES. Considering first the eigenvalues of the Hamiltonian at η = 0, ν = 0, we see in Fig. 8 that as the number of basis functions in HelicES is increased, there is rapid convergence to the reference values, regardless of which eigenvalue is considered. Consistent with earlier results for electronic structure calculations using spectral basis sets [45,70], the error curve for HelicES displays curvature on a log-log scale, indicative of super-polynomial convergence. In contrast, the finite difference method, also shown in the same figure, shows slower, polynomial convergence. This is consistent with earlier findings for finite difference electronic structure calculations using curvilinear coordinates [20,40]. Furthermore, when the energy cutoff criterion is engaged, HelicES appears to employ noticeably fewer degrees of freedom than the finite difference method (Helical DFT) in reaching the same levels of convergence.

The electronic features of quasi-one-dimensional systems can be characterized by one-dimensional band diagrams [20,30], and these can be readily calculated for systems of interest using HelicES. As the next step in our studies, we checked the convergence behavior of the code with regard to a few quantities associated with the overall features of the one-dimensional band diagram of the aforementioned armchair carbon nanotube system. These include the electronic band energy (which, for an insulating system, is simply twice the sum of all occupied state eigenvalues), the valence band maximum eigenvalue, the conduction band minimum eigenvalue and the band gap. As shown in Fig. 9, all these quantities, except for the band gap, show monotonic convergence to reference values.
We also note that convergence of the band gap is nearly monotonic until the curve enters regions of very high accuracy (O(10 −6 ) in the figure) and this behavior is likely related to the fact that the band gap is calculated as the difference of two quantities.

Figure 10: Convergence of the band energy and electron density of a (16, 16) armchair carbon nanotube with a twist parameter of α = 0.002. The reference values were taken from a calculation with 45 η-points.

Within HelicES, the electronic properties of quasi-one-dimensional systems are also expected to exhibit convergence with respect to the number of points used to discretize the η-space (Section 3.3). In Fig. 10, we explore the convergence behavior of the electron density (in terms of the L¹ norm per electron) and the electronic band energy for the aforementioned carbon nanotube system, as the number of η points in the calculation is increased. We see that both of the above quantities show excellent convergence. We note that the electron density can be calculated from the wavefunctions φ_j(θ1, θ2, r; η, ν) and the corresponding electronic occupation numbers ς_j(η, ν) as:

ρ(θ1, θ2, r) = Σ_{ν=0}^{N−1} ∫_{−1/2}^{1/2} Σ_j ς_j(η, ν) |φ_j(θ1, θ2, r; η, ν)|² dη .

This requires inverse basis transforms to be carried out on each wavefunction vector at the end of the diagonalization procedure. We also note from Figs. 9 and 10 that, for the Mayer pseudopotential employed in the above calculations, an energy cutoff of 16 Ha and 15 η-points are more than sufficient to reach chemical accuracy.
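The density accumulation over the (η, ν) grid can be sketched as follows, with stand-in wavefunctions and occupations on a small real-space grid and a uniform quadrature rule in η; the sum rule checked at the end assumes unit real-space quadrature weights.

```python
import numpy as np

rng = np.random.default_rng(3)
n_eta, n_nu, n_states, n_grid = 7, 3, 5, 50
phi = rng.standard_normal((n_eta, n_nu, n_states, n_grid)) + 0j  # stand-in wavefunctions
phi /= np.linalg.norm(phi, axis=-1, keepdims=True)               # normalize on the grid
occ = np.where(np.arange(n_states) < 3, 2.0, 0.0)                # three doubly occupied states

w_eta = np.full(n_eta, 1.0 / n_eta)   # uniform eta quadrature weights (periodic trapezoidal)
rho = np.einsum('e,s,ensg->g', w_eta, occ, np.abs(phi)**2)       # sum over eta, nu, states

# With unit real-space weights, the density integrates to the electron count,
# summed over the cyclic sectors (nu values)
assert np.isclose(rho.sum(), occ.sum() * n_nu)
```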

Accuracy Studies
While the discussion in the above serves to illustrate the systematic convergence properties of HelicES, it does not address the accuracy or correctness of the converged results produced by the code. Therefore, we now carry out a series of systematic tests and compare the results produced by HelicES against solutions produced by other methods, for a variety of systems.
Our first set of tests compares the results produced by HelicES against those computed through the Finite Difference Method (FDM). For these studies, the Mayer pseudopotential [86] was once again employed and the energy cutoff in HelicES was set at 16 Ha. Reference results using the FDM codes were generated using a mesh spacing of 0.2 Bohr, this being the finest mesh that could be uniformly employed for all systems of interest, within computational resource constraints. We first used the RSDFT code [92] for calculating the electronic structure of a variety of finite (cluster-like) systems. The bound state eigenvalues for these same systems, as calculated by HelicES are compared against RSDFT results in Table 1. We see that for these discretization parameters, the agreement between the codes with respect to individual eigenvalues is about 1.3 × 10 −4 Ha or better, suggesting excellent accuracy.
Next, we generated the electronic band diagram associated with a deformed quasi-one-dimensional system, namely, an armchair nanotube subjected to about β = 2.95° of twist per nanometer. Reference calculations were carried out using the Helical DFT code [20,30]. Both Helical DFT and HelicES were made to use 21 η-points and the Eigs eigensolver in MATLAB. As shown in Fig. 11, the band diagrams produced by the two codes are virtually identical, once again suggesting the excellent accuracy of HelicES. We also noted that while the two codes produced nearly identical results for the above discretization choices (e.g., the differences in band energy per atom and the band gap were O(10⁻⁵) Ha), the diagonalization wall time for HelicES was about a factor of 27 lower, and the memory footprint was also significantly smaller. These observations continue to hold when larger values of the energy cutoff are used in HelicES, with the diagonalization wall time of the code being about a factor of 8 lower than that of Helical DFT, even when an energy cutoff of 40 Ha is employed. Overall, these findings illustrate that HelicES adequately addresses many of the computational bottlenecks in existing methods for the study of electronic properties of quasi-one-dimensional systems, commensurate with its design goals.

Table 1: Accuracy of the HelicES code while studying finite systems (green arrow denotes the e_Z axis). Reference data was generated using RSDFT [92], a finite difference method (FDM) based MATLAB code. The last column shows the maximum differences in the eigenvalues computed using the two methods.

Figure 11: Comparison of band diagrams (energies in Ha) for a twisted (16, 16) armchair carbon nanotube with twist parameter of α = 0.002, generated using HelicES and the FDM based Helical DFT code [20,30]. The green shaded region in the structure on the right is the fundamental domain used in HelicES, while the green arrow denotes the e_Z axis.
Due to inherent design limitations, the aforementioned FDM codes are unable to simulate quasi-one-dimensional nanostructures which have atoms situated near or along the system axis (e.g., nanoribbons, nanowires or small diameter nanotubes). However, these systems can be conveniently dealt with by HelicES. To carry out accuracy tests for such systems, therefore, we compared the band structures calculated by HelicES against those generated through alternate electronic structure calculation techniques. The first of these is based on the transfer matrix method [94-96], often used in electromagnetics calculations. In Figs. 12 and 13 we see that the band structure calculated by HelicES is in nearly perfect agreement with results calculated using this technique in [86]. The systems considered here are carbon nanotubes with radii of about 0.3 to 0.4 nanometers. For the (5, 5) armchair nanotube, the position of the Dirac cone is correctly predicted to be at η = ±1/3. Additionally, for the (10, 0) zigzag nanotube, the band gap calculated by HelicES is 1.05 eV, which is very close to the value of 1.04 eV obtained in [86].

Figure 12: Comparison of band diagrams (energies in Ha) for a (5, 5) armchair carbon nanotube generated using HelicES and a transfer-matrix technique [86]. The dashed green box in the plot represents the region of the band diagram over which the reference data was available for comparison. The green shaded region in the structure on the right is the fundamental domain used in HelicES and the green arrow denotes the e_Z axis.

Figure 13: Comparison of band diagrams (energies in Ha) for a (10, 0) zigzag carbon nanotube generated using HelicES and a transfer-matrix technique [86]. The dashed green box in the plot represents the region of the band diagram over which the reference data was available for comparison. The green shaded region in the structure on the right is the fundamental domain used in HelicES and the green arrow denotes the e_Z axis.
Next, we used the Plane-wave Electronic TRAnsport (PETRA) code [90,91,97] for studying an armchair graphene nanoribbon, as well as a silicon nanowire oriented along the ⟨100⟩ direction. Both of these systems were treated using the empirical pseudopotentials developed in [90], and feature hydrogen passivation. Figs. 14 and 15 show that the overall agreement between HelicES and PETRA is excellent, although some minor variations at the edge of the highest energy band for the nanoribbon case may be observed. This is possibly due to the different boundary conditions employed by PETRA and HelicES in the directions orthogonal to the ribbon axis. We also note that the band gap for the silicon nanowire calculated by HelicES is 3.82 eV, which is very close to the value of 3.84 eV reported in [90].

Figure 14: Comparison of band diagrams (energies in Ha) for a hydrogen passivated armchair graphene nanoribbon generated using HelicES and a plane-wave technique [90,91]. The green shaded region in the structure on the right is the fundamental domain used in HelicES and the green arrow denotes the e_Z axis.

Figure 15: Comparison of band diagrams (energies in Ha) for a hydrogen passivated, ⟨100⟩-oriented silicon nanowire generated using HelicES and a plane-wave technique [90,91]. The dashed green box in the plot represents the region of the band diagram over which the reference data was available for comparison. The green shaded region in the structure in the middle is the fundamental domain used in HelicES, with the green arrow denoting the e_Z axis. The right image shows a top view of the structure (i.e., looking down along e_Z).

Application to the study of the electromechanical response of a nanotube
Finally, as a demonstration of the utility of the computational method developed here, we study the electromechanical response of a quasi-one-dimensional nanomaterial as it undergoes deformation. Specifically, we consider a carbon nanotube with a radius of about 1.0 nanometer (an armchair (16, 16) tube), and subject it to twisting. We start from the untwisted structure and increase the rate of applied twist, considering up to about β = 7.4° per nanometer in our simulations. Fig. 16 shows the variation of the band gap of the material with applied twist. For comparison purposes, results from full self consistent Kohn-Sham DFT calculations using ab initio Troullier-Martins pseudopotentials [98] and Local Density Approximation based exchange correlation [99,100] are also shown (obtained from [20]). It is well known that upon twisting, armchair nanotubes (which are generally metallic in untwisted form) show metal-to-semiconductor transitions, and that these changes manifest themselves as oscillatory behavior in the band gap [20,101-103]. We see from Fig. 16 that the results from HelicES reproduce this qualitative behavior correctly, but the actual response curve is quantitatively different from the first principles data. This is very likely due to the lack of atomic relaxation effects in HelicES, as well as the general failure of the Mayer pseudopotential in scenarios where the carbon atoms do not form a perfect honeycomb lattice, a consequence of the shearing distortions that arise from the applied twist in this case. Therefore, these results strongly suggest the need for building ab initio pseudopotentials and self consistent iterations into HelicES, which constitutes ongoing work [104].

Conclusions
In summary, we have presented a novel spectral method for efficiently solving the Schrödinger equation for quasi-one-dimensional materials and structures. The basis functions in our method -helical waves -are natural analogs of plane-waves, and allow systematically convergent electronic structure calculations of materials such as nanowires, nanoribbons and nanotubes to be carried out. We have discussed various mathematical, algorithmic and implementation oriented issues of our technique. We have also used our method to carry out a variety of demonstrative calculations and studied its accuracy and convergence behaviors.
We anticipate that the method presented here will find utility in the discovery and characterization of new forms of low dimensional matter. It is particularly well suited for coupling with specialized machine learning techniques [105] and for the multiscale modeling of low dimensional systems [106]. Building self-consistency into the method, so as to enable ab initio calculations (e.g., using Hartree-Fock or Kohn-Sham Density Functional Theory [107]), remains the scope of ongoing and future work. An important first step in this direction is the efficient solution of the associated electrostatics problem [108], towards which we have been making recent progress [104,109].

Appendix A. Derivation of the governing equation in helical coordinates
We are interested in solutions of the Schrödinger equation, i.e., (−½∆ + V(x)) ψ = λ ψ, as it applies to a quasi-one-dimensional structure. For a function ψ(θ1, θ2, r) expressed in helical coordinates, the Laplacian is given by [29,30]:

∆ψ = ∂²ψ/∂r² + (1/r) ∂ψ/∂r + (1/(4π²r²)) ∂²ψ/∂θ2² + (1/τ²) (∂/∂θ1 − α ∂/∂θ2)² ψ .

Considering the helical and cyclic symmetry adapted Bloch ansatz, ψ(θ1, θ2, r; η, ν) = e^{−i2π(ηθ1 + νθ2)} φ(θ1, θ2, r; η, ν), we first note:

∂ψ/∂θ1 = e^{−i2π(ηθ1 + νθ2)} (∂φ/∂θ1 − i2πη φ) , ∂ψ/∂θ2 = e^{−i2π(ηθ1 + νθ2)} (∂φ/∂θ2 − i2πν φ) .

Applying these relations repeatedly, the action of the Laplacian on ψ takes the form of the phase e^{−i2π(ηθ1 + νθ2)} multiplying a "shifted" operator acting on φ, in which ∂/∂θ1 is replaced by (∂/∂θ1 − i2πη) and ∂/∂θ2 by (∂/∂θ2 − i2πν). Since the phase e^{−i2π(ηθ1 + νθ2)} ≠ 0, canceling it from both sides of the Schrödinger equation in ψ leaves us with the following eigenvalue problem in φ:

−½ [ ∂²φ/∂r² + (1/r) ∂φ/∂r + (1/(4π²r²)) (∂/∂θ2 − i2πν)² φ + (1/τ²) ( (∂/∂θ1 − i2πη) − α (∂/∂θ2 − i2πν) )² φ ] + V φ = λ φ .

This equation needs to be discretized and solved over the fundamental domain, along with suitable boundary conditions in φ.
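The reduction above can be verified symbolically. The sketch below assumes the helical-coordinate Laplacian takes the form ∆f = f_rr + (1/r) f_r + (1/(4π²r²)) f_θ2θ2 + (1/τ²)(∂_θ1 − α ∂_θ2)² f (our reading of the operator in [29,30]); it applies this operator to the phase-times-φ ansatz and confirms that stripping the phase yields the shifted operator in φ.

```python
import sympy as sp

t1, t2, r = sp.symbols('theta1 theta2 r', real=True)
eta, nu, alpha, tau = sp.symbols('eta nu alpha tau', real=True)
phi = sp.Function('phi')(t1, t2, r)
phase = sp.exp(-2 * sp.pi * sp.I * (eta * t1 + nu * t2))

def lap(f):
    # Assumed helical-coordinate Laplacian
    return (sp.diff(f, r, 2) + sp.diff(f, r) / r
            + sp.diff(f, t2, 2) / (4 * sp.pi**2 * r**2)
            + (sp.diff(f, t1, 2) - 2 * alpha * sp.diff(f, t1, t2)
               + alpha**2 * sp.diff(f, t2, 2)) / tau**2)

def shifted_lap(f):
    # Same operator with d/dtheta1 -> d/dtheta1 - i 2 pi eta,
    # d/dtheta2 -> d/dtheta2 - i 2 pi nu
    D1 = lambda u: sp.diff(u, t1) - 2 * sp.pi * sp.I * eta * u
    D2 = lambda u: sp.diff(u, t2) - 2 * sp.pi * sp.I * nu * u
    return (sp.diff(f, r, 2) + sp.diff(f, r) / r
            + D2(D2(f)) / (4 * sp.pi**2 * r**2)
            + (D1(D1(f)) - 2 * alpha * D1(D2(f)) + alpha**2 * D2(D2(f))) / tau**2)

residual = sp.simplify(sp.expand(lap(phase * phi) - phase * shifted_lap(phi)))
assert residual == 0   # the phase commutes through exactly as claimed
```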

Appendix B. Derivation of the Basis Set
In analogy to the classical plane-wave method [27,28], the basis functions in our scheme are eigenfunctions of the Laplacian. However, instead of periodic boundary conditions obeyed by planewaves, we consider boundary conditions resulting from invariance under helical and cyclic symmetries. The calculation presented below is based on similar results in [29], while a vector version of this calculation appears in [52,53] in the context of x-ray diffraction patterns of twisted nanomaterials.
Let F(θ1, θ2, r) be a basis function expressed in helical coordinates. Then, invariance under helical and cyclic symmetries implies that this function must be periodic in θ1 with a period of 1, and also periodic in θ2 with a period of 1/N. Assuming F(θ1, θ2, r) is separable, we characterize the dependence of the function on θ1 and θ2 through Fourier modes (i.e., complex exponentials), and write:

F_{m,n,k}(θ1, θ2, r) = e^{i2π(mθ1 + nNθ2)} ξ(r) . (B.1)

Here ξ(r) is a purely radial function that possibly depends on m, n, k, and incorporates normalization constants. The Laplacian of the above function in helical coordinates is:

∆F_{m,n,k} = ξ''(r) e^{i2π(mθ1 + nNθ2)} + (1/r) ξ'(r) e^{i2π(mθ1 + nNθ2)} − (4π²(m − αnN)²/τ²) F_{m,n,k} − (n²N²/r²) F_{m,n,k} ,

which can be rewritten as:

∆F_{m,n,k} = e^{i2π(mθ1 + nNθ2)} ( ξ''(r) + (1/r) ξ'(r) ) − ( 4π²(m − αnN)²/τ² + n²N²/r² ) F_{m,n,k} .

Now, imposing the condition that F_{m,n,k} is an eigenfunction of the Laplacian, i.e.,

−∆F_{m,n,k} = λ⁰_{m,n,k} F_{m,n,k} , (B.4)

we get:

−e^{i2π(mθ1 + nNθ2)} ( ξ''(r) + (1/r) ξ'(r) ) + ( 4π²(m − αnN)²/τ² + n²N²/r² ) F_{m,n,k} = λ⁰_{m,n,k} F_{m,n,k} ,

which simplifies to:

ξ''(r) + (1/r) ξ'(r) + ( λ⁰_{m,n,k} − 4π²(m − αnN)²/τ² − n²N²/r² ) ξ(r) = 0 .

Denoting γ²_{m,n} = (4π²/τ²)(m − αnN)² and performing the change of variables:

s = r √(λ⁰_{m,n,k} − γ²_{m,n}) ,

we see that the above equation reduces to:

s² (d²ξ/ds²) + s (dξ/ds) + ( s² − n²N² ) ξ = 0 .

This is simply Bessel's equation [67,110] in ξ. Since nN is real, the general solution of this equation can be expressed in terms of ordinary Bessel functions of the first and second kind as:

ξ(r) = A J_{nN}( r √(λ⁰_{m,n,k} − γ²_{m,n}) ) + B Y_{nN}( r √(λ⁰_{m,n,k} − γ²_{m,n}) ) . (B.9)

To evaluate the constants A and B, we need to invoke boundary and normalization conditions. Since the wavefunctions are expected to be finite valued at the origin (r = 0), and Bessel functions of the second kind approach infinity near 0, we conclude that B = 0. Furthermore, since the wavefunctions obey Dirichlet boundary conditions on the lateral surface of the computational domain (r = R), so should the basis functions used to discretize them. Hence, we obtain:

J_{nN}( R √(λ⁰_{m,n,k} − γ²_{m,n}) ) = 0 .

This implies that R √(λ⁰_{m,n,k} − γ²_{m,n}) must be a root of the ordinary Bessel function of the first kind. Denoting the k-th root (k = 1, 2, . . .) of the Bessel function of order p as b^p_k, we see that:

b^{nN}_k = R √(λ⁰_{m,n,k} − γ²_{m,n}) , (B.11)

from which it follows that:

λ⁰_{m,n,k} = ( b^{nN}_k / R )² + γ²_{m,n} . (B.12)

Thus, we have:

ξ(r) = A J_{nN}( b^{nN}_k r / R ) .

Finally, to determine the constant A, we apply the orthonormality condition between two distinct basis functions F_{m,n,k} and F_{m',n',k'}:

⟨F_{m,n,k}, F_{m',n',k'}⟩_{L²(D)} = δ_{m,m'} δ_{n,n'} δ_{k,k'} . (B.14)

Due to the properties of complex exponentials and Bessel functions, this condition is readily satisfied for distinct basis functions (i.e., when any of m ≠ m', n ≠ n', k ≠ k' holds). For the case m = m', n = n', k = k', we arrive at:

|A|² (2πτ/N) ∫₀^R J²_{nN}( b^{nN}_k r / R ) r dr = 1 ,

i.e.,

|A|² (2πτ/N) (R²/2) J²_{nN+1}( b^{nN}_k ) = 1 .

Thus it follows that the normalization constant:

A = c_{n,k} = ( 1 / ( R |J_{nN+1}( b^{nN}_k )| ) ) √( N / (πτ) ) ,

and that:

ξ_{n,k}(r) = c_{n,k} J_{nN}( b^{nN}_k r / R ) .

Hence, the basis functions in our method have the form:

F_{m,n,k}(θ1, θ2, r) = c_{n,k} e^{i2π(mθ1 + nNθ2)} J_{nN}( b^{nN}_k r / R ) .

Note that if the computational domain were an annular cylinder (as employed in [20,30]), instead of the solid cylinder considered here, the boundary conditions on the radial part of the wavefunction would be expected to change. For Dirichlet boundary conditions applied to the inner and outer walls of such an annular cylinder (a setup often employed in simulations of large diameter nanotubes), Bessel functions of both kinds would be involved (i.e., A, B ≠ 0 in eq. B.9), and the zeros of cross products of Bessel functions [111] would be required.
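The radial structure derived above can be spot-checked numerically: for a fixed Bessel order p = nN, the functions J_p(b^p_k r/R) are orthogonal on [0, R] with weight r. The sketch below checks this with scipy, using only the radial part of the normalization constant (the angular and axial factors are omitted).

```python
import numpy as np
from scipy.special import jv, jn_zeros
from scipy.integrate import quad

R, p, K = 3.0, 4, 5          # domain radius, Bessel order p = nN, number of roots
b = jn_zeros(p, K)           # first K positive roots of J_p

def xi(k, r):
    # Radial part of the basis function, normalized so that
    # int_0^R xi_k(r)^2 r dr = 1 (radial normalization only)
    c = np.sqrt(2.0) / (R * abs(jv(p + 1, b[k])))
    return c * jv(p, b[k] * r / R)

# Orthonormality check: <xi_k1, xi_k2>_r = delta_{k1,k2}
for k1 in range(K):
    for k2 in range(K):
        val, _ = quad(lambda r: xi(k1, r) * xi(k2, r) * r, 0.0, R, limit=200)
        assert abs(val - (1.0 if k1 == k2 else 0.0)) < 1e-7
```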

Appendix C. Calculation of Gradients
In electronic structure calculations, it can sometimes become necessary to compute the derivative of a quantity expressed using a chosen basis set, or over a grid. For instance, evaluation of the Hellmann-Feynman forces [112,113] on the atoms of a system involves calculation of Cartesian gradients, if atomic pseudopotentials and pseudocharges are used to compute total energies [30,40,43]. In this section, we describe how such gradients may be computed for quantities expressed using helical waves.

Consider a quantity E(θ1, θ2, r) expressed in the helical wave basis as:

E(θ1, θ2, r) = Σ_{(m,n,k)∈Γ} Ê_{m,n,k} F_{m,n,k}(θ1, θ2, r) . (C.1)

Its derivatives along the θ1 and θ2 directions are:

E_θ1(θ1, θ2, r) = Σ_{(m,n,k)∈Γ} (i2πm) Ê_{m,n,k} F_{m,n,k}(θ1, θ2, r) , E_θ2(θ1, θ2, r) = Σ_{(m,n,k)∈Γ} (i2πnN) Ê_{m,n,k} F_{m,n,k}(θ1, θ2, r) . (C.4)
These expressions correspond to the inverse basis transforms of vectors with entries {(i2πm) Ê_{m,n,k}}_{(m,n,k)∈Γ} and {(i2πnN) Ê_{m,n,k}}_{(m,n,k)∈Γ} respectively, and so they may be readily computed. To calculate the radial derivative E_r, we first note the following identity [67]:

d/dx J_κ(x) = ½ ( J_{κ−1}(x) − J_{κ+1}(x) ) . (C.5)

This expression may be used for computing the radial derivative of all helical waves within the basis set. However, as n varies from −N_max to N_max, the order of the Bessel functions involved ranges from κ = −NN_max to κ = NN_max, and the above expression results in a Bessel function that lies beyond the range of the basis set. To remedy this, we may use the following alternate expression [67] for the κ = −NN_max case:

d/dx J_κ(x) = (κ/x) J_κ(x) − J_{κ+1}(x) . (C.6)

Combining eqs. C.5 and C.6 with eq. C.1, we see that the radial derivative E_r may be expressed as:

E_r(θ1, θ2, r) = Σ_{(m,n,k)∈Γ} Ê_{m,n,k} c_{n,k} e^{i2π(mθ1 + nNθ2)} B_{n,k}(r) , (C.7)

where the radial functions B_{n,k}(r) denote the derivatives of the radial parts of the basis functions:

B_{n,k}(r) = ( b^{nN}_k / 2R ) ( J_{nN−1}( b^{nN}_k r / R ) − J_{nN+1}( b^{nN}_k r / R ) ) for n > −N_max ,
B_{n,k}(r) = ( nN / r ) J_{nN}( b^{nN}_k r / R ) − ( b^{nN}_k / R ) J_{nN+1}( b^{nN}_k r / R ) for n = −N_max . (C.8)

With this, the radial derivative may be considered an inverse basis transform of the vector with entries {Ê_{m,n,k}}_{(m,n,k)∈Γ}, provided we use the functions B_{n,k}(r) along the radial direction. These functions may be computed ahead of time and stored, and Algorithm 1 (Section 3.4.1) may then be used for computing E_r(θ1, θ2, r). With the derivatives E_r(θ1, θ2, r), E_θ1(θ1, θ2, r) and E_θ2(θ1, θ2, r) available as values on a real space grid, we may use eq. C.3 to evaluate the Cartesian derivatives at each point on the same grid.
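Both Bessel derivative identities used here can be spot-checked against scipy's built-in Bessel derivative, including at negative orders:

```python
import numpy as np
from scipy.special import jv, jvp

x = np.linspace(0.5, 20.0, 200)
for kappa in (-7, -3, 0, 3, 7):
    d_exact = jvp(kappa, x)                                  # scipy's J'_kappa
    d_avg = 0.5 * (jv(kappa - 1, x) - jv(kappa + 1, x))      # identity of eq. C.5
    d_low = (kappa / x) * jv(kappa, x) - jv(kappa + 1, x)    # identity of eq. C.6
    assert np.allclose(d_exact, d_avg)
    assert np.allclose(d_exact, d_low)
```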
Instead of obtaining the derivatives as real space quantities as described above, it is also possible to obtain them directly in reciprocal space. The expansion coefficients of E_θ1 and E_θ2 are immediately seen to be:

(Ê_θ1)_{m',n',k'} = (i2πm') Ê_{m',n',k'} , (Ê_θ2)_{m',n',k'} = (i2πn'N) Ê_{m',n',k'} , (C.9)

and they may be evaluated at a cost that is proportional to the basis set size. By considering the inner product of eq. C.7 with the basis functions, the expansion coefficients of the radial derivative may be expressed as:

(Ê_r)_{m',n',k'} = ⟨E_r, F_{m',n',k'}⟩_{L²(D)} = (2/R) Σ_{k=1}^{K_max} Ê_{m',n',k} A(n', k, k') . (C.10)

The numbers A(n', k, k') can be expressed in terms of oscillatory radial integrals (eqs. C.11, C.12). We may precompute them using the techniques described in Appendix D and store them for later use. Note that such an expansion of the radial derivatives in the basis set implicitly requires these quantities to obey Dirichlet boundary conditions at r = R. However, this may not be satisfied in general. The real space expression outlined earlier (eqs. C.7, C.8) skirts this issue.

Appendix D. Evaluation of Oscillatory Radial Integrals
This work has multiple instances in which integrals with oscillatory integrands along the radial direction make an appearance (e.g., eq. 31 and eqs. C.11, C.12). A typical scenario is depicted in Fig. D.17. Techniques for the evaluation of such integrals have been extensively studied in the literature [114-117], and specialized methods for integrands involving Bessel functions are also available [118]. Instead of adopting these more elaborate methods, we choose to evaluate the oscillatory integrals in this work using the simpler procedure of employing a large number of Gauss-Jacobi quadrature [119] nodes and weights. Thus, denoting q = r/R, we write:

∫₀¹ f(q) q^σ dq ≈ Σ_{i=1}^{N_q} w_i f(q_i) .

The values of the weights w_i and the nodes q_i depend on the exponent σ, as well as the quadrature order N_q. The weights and nodes can be computed inexpensively [120] even when N_q is of the order of a few thousand. For the case of the integrals involved in the evaluation of I(n', k, k') via eq. 31, the number of oscillations of the integrand is approximately equal to k + k'. Thus, within a given basis set, the maximum number of oscillations is 2K_max. For all the examples considered in this work, K_max does not generally exceed 200, and we have found that choosing N_q to be a few thousand in such cases allows the integrals to be converged to O(10⁻¹⁴). To verify our calculations, we have also used Gauss-Kronrod quadrature [121,122] as implemented in MATLAB (the quadgk function). This allows for automatic adaptive placement of the integration nodes and monitoring of the quadrature error, and we verified that the latter was always O(10⁻¹³) or lower, even for the cases involving the most oscillatory integrands.
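The recipe above can be sketched with scipy: a high-order Gauss-Jacobi rule, mapped from (−1, 1) to (0, 1) so that its weight becomes q^σ, is applied to a product of Bessel functions and checked against adaptive Gauss-Kronrod integration (scipy's quad, the analogue of MATLAB's quadgk). The specific Bessel orders and roots below are illustrative.

```python
import numpy as np
from scipy.special import jv, jn_zeros, roots_jacobi
from scipy.integrate import quad

sigma = 1                                   # exponent in the weight q^sigma (here, r dr)
Nq = 2000                                   # quadrature order; "a few thousand" as in the text
x, w = roots_jacobi(Nq, 0.0, float(sigma))  # Gauss-Jacobi: weight (1+x)^sigma on (-1, 1)
q = 0.5 * (x + 1.0)                         # map nodes to (0, 1)

b = jn_zeros(2, 60)                         # positive roots of J_2
f = lambda t: jv(2, b[49] * t) * jv(2, b[59] * t)   # a highly oscillatory Bessel product

# int_0^1 f(q) q^sigma dq via the mapped Gauss-Jacobi rule
gj = 2.0**(-(sigma + 1)) * np.dot(w, f(q))

# Adaptive Gauss-Kronrod reference value
ref, err = quad(lambda t: f(t) * t**sigma, 0.0, 1.0,
                epsabs=1e-12, epsrel=1e-12, limit=2000)
assert abs(gj - ref) < 1e-9
```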