The Relativistic Schr\"odinger Equation through FFTW3: An Extension of quantumfdtd

In order to solve the time-independent three-dimensional Schr\"odinger equation, one can transform the time-dependent Schr\"odinger equation to imaginary time and use a parallelized iterative method to obtain the full three-dimensional eigenstates and eigenvalues on very large lattices. In the case of the non-relativistic Schr\"odinger equation, there exists a publicly available code called quantumfdtd which implements this algorithm. In this paper, we (a) extend the quantumfdtd code to include the case of the relativistic Schr\"odinger equation and (b) add two optimized FFT-based kinetic energy terms for non-relativistic cases. The new kinetic energy terms (two non-relativistic and one relativistic) are computed using the parallelized Fast Fourier Transform (FFT) algorithm provided by the FFTW library. The resulting quantumfdtd v3 code, which is publicly released with this paper, is backwards compatible with version 2, supporting explicit finite differences schemes in addition to the new FFT-based schemes. Finally, the original code has been extended so that it supports arbitrary external file-based potentials and the option to project out distinct parity eigenstates from the solutions. Herein, we provide details of the quantumfdtd v3 implementation, comparisons and tests of the three new kinetic energy terms, and code documentation.


Motivation
The three-dimensional Schrödinger equation has a closed-form analytical solution for only a very small class of systems (i.e., a free particle in a box, harmonic oscillator, or some central potentials). In all other cases, one has to rely on approximations, perturbation theory, or numerical solutions. In particular, most quark models used for phenomenological descriptions of QCD bound states are described by the threedimensional Schrödinger equation with a wide variety of potentials. Potential-based quark models have enjoyed a long history of success in describing below-threshold charmonium production and bottomonium spectra and have helped to establish confidence in QCD as the first-principles description of hadronic matter, see, e.g., Refs. [1,2,3]. In recent years, non-relativistic effective field theory methods have allowed for a first-principles approach to potential-based non-relativistic QCD (pNRQCD) [4,5]. Additionally, in order to describe quarkonium evolution in the quark-gluon plasma, such potential models have been extended to finite temperature [6,7,8] and non-equilibrium [9,10,11,12,13], in which case, the potentials are no longer real-valued or spherically symmetric. In the full non-equilibrium case, it is necessary to use a full three-dimensional Schrödinger equation solver that allows for complex potentials.
In the past, a parallelized solver called quantumfdtd has been made available for real-valued [14] and complex-valued [15] potentials. The two previous versions of quantumfdtd use local operators in a finitedifference time-domain (FDTD) scheme for solving the non-relativistic three-dimensional Schrödinger equation. However, if light quarks are involved, such as in heavy-light mesons, e.g., in B mesons, these have to be described as relativistic degrees of freedom. The light quark's relativistic dispersion results in a much larger average radius of the state, namely with the wave-function's falloff in the case of a linearly rising Cornell potential V (r) = −A/r + σ · r, like Ψ(r) ∼ exp(−M r) instead of Ψ(r) ∼ exp(−M r 3/2 ) (M being the reduced mass of the system) [16]. While spin-orbit and spin-spin corrections can be accounted for in terms of angular-momentum-dependent contributions beyond a simple central potential, at a fundamental level, the change to relativistic dispersion requires an implementation of a relativistic kinetic term in the Schrödinger equation. For this reason we extend the existing quantumfdtd parallelized 3D Schrödinger solver [14,15] to permit a relativistic kinetic term. These relativistic quark model calculations have a purpose beyond purely phenomenological calculations. The role of heavy-light mesons or heavy-heavy mesons (quarkonia) as an excellent laboratory of QCD (or the whole Standard Model) is well established. The confinement property of QCD, which implies that individual gluons or quarks cannot be isolated as asymptotic states, precludes their direct observation. Nevertheless, heavy-light mesons are bound states of QCD, where the dominant contribution to the meson mass can be understood as being due to its heavy-quark constituent, while any other contribution to its mass -whether from the mass of its light-quark constituent or from the dynamical gluon fields that mediate the binding -is small in comparison. These extra contributions can be understood quantitatively using an effective field theory approach that permits an extraordinarily precise extraction of quark masses when applied to sufficiently accurate lattice QCD results; see Ref. [17] for a review. Together with experimental data for their weak decay rates, the decay constants of heavy-light mesons provide powerful constraints to many elements of the CKM matrix. Violation of CKM unitarity would represent evidence of new physics. Heavy-light mesons are important probes in other, more direct searches for new physics, too. In particular, the large heavy-quark mass implies that heavy-light mesons may couple rather strongly to currents that mediate interactions with new physics at some high energy scale. The suppression of any new physics contributions is less pronounced for heavy-light systems, since the heavy-quark mass is much larger than typical hadronic scales. The matrix elements of such currents interacting with heavy-light mesons can be computed using lattice QCD. Thus, accurate lattice QCD results for heavy-light mesons can provide strong constraints on new physics.

Interface of relativistic quark models with lattice QCD
In order to extract the aforementioned information from lattice QCD simulations, hadron spectroscopy calculations are required; see Ref. [18] for a pedagogical introduction to lattice hadron spectroscopy aimed at non-expert readers 1 . Lattice spectroscopy with heavy quarks is challenging for many reasons. The most obvious one is that the heavy-quark mass M h is quite large compared to the inverse of the lattice spacing A. Hence, discretization artifacts in the heavy-quark sector are usually substantial. If these artifacts are parameterized in the most general form as a series in [log(AM h )] i (AM h ) j , rather high orders may be needed due to poor convergence caused by the largeness of the expansion parameter AM h . Fixing many coefficients then necessitates the availability of a wide range of heavy-quark masses and lattice spacings. Fine lattice spacings that would sustain a moderate size of AM h imply that a vast number of lattice sites in all directions are required to sustain a sufficient physical volume that prevents major finite size corrections due to correlations stretching across the lattice's boundaries. Due to the computational cost associated with such a large lattice, one may then have to deal with further issues that are not directly related to the heavy quarks, such as limited statistics, unphysical sea quark masses, or similar drawbacks that may be even more severe than in simulations that do not aim at incorporating heavy quark physics. There is a workaround for these heavy-quark mass discretization artifacts, which is in spirit close to the quark-model ideas. Namely, one may treat a heavy quark in the static limit. While the mass of such a static quark, which is due to the self-energy of the gluon field, is finite at finite lattice spacing, it diverges in the continuum limit. Then one may study the operators appearing in an expansion in the inverse heavy-quark mass, i.e., in the heavyquark effective theory (HQET) [19], or determine mass differences between states involving static as well as non-static quarks on the lattice (see Ref. [18]).
To make matters worse, heavy quarks in the valence sector give rise to further complications that are quite independent of the underlying ensemble. On the one hand, the large mass (or energy) E of a corresponding (ground state) meson implies a very rapid decay of any correlation functions C(t 1 −t 0 ) ∼ e −E(t 1 −t 0 ) involving heavy quarks, such that numerical errors due to limited statistics, limited precision in matrix inversions, or simply due to machine precision may become quite significant at large times. On the other hand, the splittings between different energy levels of heavy-quark bound states, e.g., the splitting E 1 − E 0 between the first excited state and the ground state, are much smaller than the mass of the ground state, E 0 . Multi-particle thresholds are usually not far away either. Taken in combination, these circumstances imply that one all too often has to analyze lattice correlation functions in a time window, where the excited-state contamination is non-negligible. In the direct searches for new physics on the lattice, one typically studies three-point functions and extracts QCD matrix elements for external currents. Excited-state contamination is particularly severe for these, as the current couples to the heavy quark at an intermediate time t, where t 0 < t < t 1 . In practice, it is usually not possible to simultaneously achieve (t 1 − t), (t − t 0 ) 1/(E 1 − E 0 ), which would be necessary to suppress the excited states to a satisfying degree.
Ground or excited state contributions are usually separated in the analysis of lattice correlators by utilizing a wide range of different hadron interpolating operators, which have different overlap factors for the various low-lying hadron (or multi-hadron) states, and then solving a generalized eigen-value problem by diagonalizing a matrix of correlation functions. This is necessary, since typical hadron interpolating operators in lattice QCD encode three-dimensional shapes that are very different from those typical for the low-lying hadrons, and thereby include major contributions from many excited states or even from a wide range of scattering states. The most common shapes of hadron interpolating operators -Point-type or Wall -type -are discussed later on. It has been clearly demonstrated in the past [16] using valence light quarks in the Wilson fermion formulation [20], heavy quarks in the static limit, and gauge ensembles with Wilson plaquette action [20] in the quenched approximation, i.e., without sea quarks, that heavy-light meson wavefunctions from relativistic quark models provide interpolating operators (hereafter: RQM -type) for heavylight mesons with excellent excited-state suppression in first-principle lattice QCD calculations. A more recent application uses gauge ensembles in (2+1+1)-flavor QCD with physical strange and charm quarks, and two degenerate light quarks at their physical or two unphysical values of the masses (corresponding to pion masses of m π ≈ 140, 220, or 300 MeV in the continuum limit) [21]. They have been generated by the MILC collaboration with one-loop Symanzik improved gauge action [22,23] and highly improved staggered quark action (HISQ) [24] for the sea. The valence heavy-quark is implemented in the static limit such that it is propagating forward in time in the first half of the lattice and backward in the second half of the lattice, while the valence "light quark" is implemented using HISQ action and the sea strange quark mass. This peculiar combination permits the usual symmetrization procedure for meson correlation functions (folding the backward propagating meson state on top of the forward propagating one at the midpoint in time). The static quark propagator is built up from bare gauge links, or from links after a single iteration of fourdimensional hypercubic (HYP) smearing with standard coefficients [25]. The latter reduces the gauge noise and the mass of the static quark in the lattice simulation. Due to the presence of the staggered antiquark field, the meson correlation function receives contributions from states of opposite parity. For the pseudoscalar interpolating operator used in the correlation function that is being considered, the parity partner has scalar quantum numbers. The correlation function takes the form of where PS stands for pseudoscalar and S stands for scalar. Both towers of pseudoscalar or scalar states involve a significant, but finite number of bound states, and an infinite number of scattering states in an infinite volume; in a finite volume, there is only a huge, but finite number of scattering states. In a lattice correlation function with finite amount of data in the time direction and a non-negligible statistical error, only a limited subset of these states can be resolved at all, and both towers get truncated in a lattice hadron spectroscopy analysis. While the alternating factor in the parity partner's contribution (and the fact of its admixture) is an artifact of the staggered quark discretization, the parity partner state itself is a genuine state of QCD. The masses of any individual states may be affected by different discretization artifacts. The amplitudes A X i or B X i (X = {PS,S}) are associated with the details of the hadron interpolating operators at source or sink, respectively, and their products, e.g., A X i B X * i are the overlap factors of the correlation function with specific hadrons. In each square bracket, the second term is due to the backward propagating contribution, and can be safely neglected for τ aN τ /2. After neglecting this backward propagating contribution, the two different parities can be separated using ratios of the correlation function spread over a 4-point stencil in time (see Ref. [26] for details), and reduced correlators of the two individual parities can be constructed. Obviously, for quark discretizations without doubling such procedures are not necessary.
Both the staggered quark field as well as the static quark field are single-component (or rather single spin component) fields, i.e., they cannot encode non-trivial spin-taste 2 structure in hadron interpolating operators that have the quark and antiquark at the same lattice site. Other spin-taste combinations could be accessed using point-split hadron interpolating operators with quark and antiquark at different sites. Those would give rise to unphysical mass differences caused by discretization artifacts associated with different staggered tastes, while physical mass differences associated with different orbital angular momenta and spins could be realized using specific (anti-)symmetrized geometries of the hadron interpolating operators, see Ref. [18].
In Fig. 1.2.a, we show the effective mass of the reduced negative parity correlator with a pseudoscalar static-strange hadron interpolating operator computed on the coarsest lattice of the MILC data set, i.e., 32 3 × 48 with a lattice spacing of A ≈ 0.15 fm and all sea quark masses at their physical values, after one step of HYP smearing is applied to the links making up the static quark propagator. The lattice simulations were performed using the MILC code [27]. In the following, we discuss the combinations of interpolating operators shown in the figure.
On the quark level, a Point-type operator (at source or sink) has non-zero overlap factors with quarkantiquark states for all allowed momenta at uniform weights. However, since low-lying, single hadron states have the largest overlap with quark-antiquark states, where both quark and antiquark have relatively small momenta, the Point operators exhibit large overlap with high energy states and comparatively poor overlap with the lowest hadron states of interest. Instead of using a Point source utilizing only a single site, randomcolor-wall sources realize statistically independent gaussian Z(2)-noise (in all three color components) on all sites at the same time. Hence, with (sufficiently large number of simultaneous) random color-wall sources for quark and antiquark, any combinations from different sites for quark and antiquark cancel eventually, and one projects the hadron state to vanishing spatial momentum, while the individual quark (or antiquark) may still cover the full range of allowed momenta. The Point sink is realized by a simple summation over all spatial sites at the same time (propagator endpoints), and thereby achieves the same projection at the sink. In Fig. 1.2.a, such source and sink operators are denoted as Point. On the quark level, a Wall -type operator (at source or sink) has non-zero overlap factors with quark-antiquark states only if their momenta are at a few specific values (the Fermi points of the reciprocal lattice). Since the quark and antiquark fields at different sites transform differently under local gauge transforms, one either has to guarantee gauge invariance by connecting them with Wilson lines or by fixing an appropriate gauge such that the Wilson lines can be neglected. Since Wilson lines would introduce path dependence, it is common to simply fix a Coulomb gauge before constructing Wall operators on the lattice. When implementing Wall operators for Effective mass plot for a heavy-strange meson correlation function (static limit for the heavy quark with 1 step of 4D hypercubic smearing [25], and the highly improved staggered quark discretization [24] for the strange quark) with different source and sink operators. The source operators are either Point-type (namely, random Z(2) noise sources with no momentum projection) or Wall -type (namely, corner-wall sources, with momenta pi = 0 or π/A). The sinks are either Point sinks, Wall sinks or relativistic quark model inspired wave-functions obtained with the code discussed here. Simulations were conducted in (2+1+1)-flavor QCD with physical sea and lattice spacing A ≈ 0.15 fm using MILC ensembles [21] and the publicly available MILC code [27]. The dotted line shows the average of the data obtained using the wave-function sink and the gray band shows the respective standard deviation. It is clearly visible how much excited state suppression can be gained by using wave-functions at the sink.
staggered quarks it is necessary to pay attention to the staggered taste structure. In particular, non-trivial separations between quark and antiquark must be decomposed into one part that is an integer multiple of a unit hypercube and another part within a unit hypercube. Non-trivial separations of the latter type (within a unit hypercube) would generate hadrons with non-trivial spin-taste structure. If a variety of such non-trivial separations is included at both source and sink, this would lead to a correlation function, whose spectrum contains many states with similar masses differing only by discretization artifacts of order A 2 , which are very challenging to disentangle numerically. In particular, a common Wall -type operator that does not pay heed to the taste structure is called an even-and-odd wall operator. If an even-and-odd wall operator is combined with any taste-filtering operator at the other end (such as Point-type) of the correlator, the latter filters out the unwanted tastes, whose remnants only show up as contributions to the gauge noise. However, in a symmetric correlator with even-and-odd wall at both source and sink, the additional states due to the unwanted staggered tastes would spoil any attempts to efficiently solve the generalized eigen-value problem. Thus, it is common practice to use so-called corner-wall sources that utilize only the lower corner of each spatial unit hypercube, i.e., (0, 0, 0), which project to spatial momenta where all three components are 0 or π/A. These corner-wall operators eliminate the contributions from the other staggered tastes in the valence sector. In this recent application corner-wall sinks -the naive analog of this prescription applied at the sink -have been used for the first time (to the authors' knowledge). It is irrelevant whether the Wall operator is applied at the sink to the quark or to the antiquark or to both, since it is followed by a summation over all spatial sites at the same time (propagator endpoints). On the contrary, an application of a Wall operator at the source yields different results whether it is applied either to the quark or to the antiquark or to both, since it is not followed by a summation over all spatial sites. The latter is the "correct", symmetric choice for the generalized eigen-value problem. In Fig. 1.2.a, such source and sink operators are denoted as Wall.
Finally, the last type of hadron interpolating operators considered here is the RQM -type. In this case, a real-valued bound state wave-function of the positive parity ground state in a relativistic quark model is directly read into a complex field from an ASCII table (in the format: x 1 ␣x 2 ␣x 3 ␣ (Ψ)␣ (Ψ)) and then convoluted with the antiquark propagator, before it is contracted with the quark propagator at the sink, and all spatial sites at any given sink time are summed over. For this reason, it does not matter, whether the wave-function is convoluted at the sink with the quark or with the antiquark. The requirement of ensuring gauge-invariance with Wilson lines or through gauge fixing carries over from the discussion of Wall operators. In principle, the same amount of attention with regard to staggered taste structure as in the case of Wall operators is in order here. As soon as the desired taste structure is filtered out at the source, the omission of isolating the desired taste structure at the sink does not lead to non-vanishing contributions from the other tastes (although they still contribute to the gauge noise). In the presented correlators, taste-filtering on the level of RQM sink operator has been ignored, which leads to the larger gauge noise in comparison to the correlators that are taste filtered at both source and sink. While it is not implemented for this data set, the RQM wave-functions could also be used directly as source operators. However, similar to the case with Wall operators at the source, one has to distinguish between applying the RQM operator to the "light" antiquark or to the "heavy" static quark (the latter being computationally advantageous, see Ref. [16]), since there is no spatial summation involved. Moreover, one has to take care of enforcing the appropriate taste filtering. Hence, the "correct", symmetric choice for the generalized eigen-value problem, would see a taste-filtered RQM operator at the source for either propagator contracted with a corner-wall operator at the source for the other propagator.
The RQM wave-functions are obtained by first computing the static quark-antiquark correlator on the same gauge ensemble, then using its effective mass at time t ≈ 0.5 fm as an external potential in the quantumfdtd v3 code, extending it for large distances with a Cornell-type fit and flattening it off as a constant at even larger distances to mimic the string breaking (see subsection 5.4). Using the same type of box geometry, i.e., 32 3 , a similar lattice spacing, i.e., A = 0.15 fm, and a "light" constituent quark mass of M = 300 MeV with the relativistic kinetic term (see subsection 2.1), the ground state wave-function is obtained using the quantumfdtd v3 code. While this wave-function decays rapidly, it is still large enough to remain sensitive to the (periodic) boundary conditions if a physical 3D volume similar to the one available in typical 4D QCD lattice calculations is employed. In order to prevent the operator of the RQM wave-function adding with the contributions across the boundaries, as one would encounter in these usually rather small 3D volumes, the wave-function's amplitude at larger distances is artificially suppressed to reduce the impact of finite size effects, before the wave-function is converted into a MILC-code readable file format and fed back into the lattice QCD simulation for convolution at the sink. An ad-hoc form-factor, FF, is applied to the wave-function for r > POTFLATR (see the definition in subsection 5.4 on page 38), that makes the potential flat for r > POTFLATR. The form-factor FF is defined as Furthermore, if FF < 10 −10 , the wave-function is set to 0. In Fig. 1.2.a, this type of sink operator is denoted as RQM. It is apparent from the figure, that the combination of a Wall at the source with an RQM at the sink eliminates the excited state contamination to an extraordinary extent. Whereas the excited state suppression is not as thorough with a Point at the source, it can still be of a great benefit when it comes to projecting out the ground state. It appears likely that the use of a combination of such Wall and/or RQM operators at the two endpoints of a three-point function with a Point-like operator for the external current and the corresponding two-point function by which it is divided, could make a major difference for the excited state suppression and the accuracy of the QCD matrix elements that could be achieved with the same underlying gauge ensembles.
In this context, the question of heavy-light tetraquarks also looms large. A three-dimensional relativistic Schrödinger equation solver may be able to deliver interpolating operators for first-principles lattice QCD studies of tetraquarks that are able to enhance overlap factors for certain geometries 3 (i.e., spherical symmetry in a light-diquark heavy-antidiquark system, or cylindrical symmetry in a light-diquark heavy-antiquark heavy-antiquark system) of tetraquarks, while effectively suppressing the excited states. Such an approach might make the resolution of the geometric structure of tetraquarks feasible and distinguish between a wide range of possible scenarios.

Overview and structure of the paper
In this work, we extend the original quantumfdtd code [14,15]. The new version, like the old version, allows one to compute the ground-state and low-lying excited state three-dimensional (3D) wave-functions and their corresponding energies for both real-and complex-valued potentials. We implement a new relativistic kinetic term and add two new non-relativistic ones, both of which are Fast Fourier Transform (FFT) based and implemented using the parallelized FFTW 3 library. In addition, the original code has been extended so that it supports arbitrary external potentials and the option to project out distinct parity eigen-states from the solutions. These options also work with the original implementation of the finite-difference time-domain (FDTD) solver for the non-relativistic Schrödinger equation. Because of the original code design, we use a 3-D Cartesian, homogeneous and isotropic lattice with N points in each dimension and lattice spacing A (in units of GeV −1 ). No adaptive engine has been implemented. The solver depends on the MPI library to distribute the lattice between the MPI processes. Finally, we note that by construction, the original FDTD solver for the non-relativistic Schrödinger equation used Dirichlet boundary conditions. For the new kinetic terms, based on the FFT, periodic boundary conditions are used. The potential can be centered either in the lattice volume at the off-grid point (N H , N H , N H ) (with N H = (N + 1)/2) or in the origin at the grid point (0, 0, 0). We use the original iterative solver (see Ref. [14] and section 2.2 where we review the method).
This paper is organized as follows: In section 2 we introduce the newly implemented kinetic terms (subsection 2.1), the iterative procedure (subsection 2.2), and the parity projection scripts (subsection 2.4). In section 3 we show and compare results obtained using the different kinetic terms. In section 4, the program requirements are listed and basic instructions for compilation and running are provided. In section 5, we show the basic configuration parameters of the program (subsection 5.1) and list the different initial conditions (subsection 5.2) that are implemented. We furthermore list the hard-coded potentials in subsection 5.3, and present a newly implemented option that allows the user to load potentials from external files (subsection 5.4). The new post-processing scripts, which include parity projection scripts for wave-functions, are explained in the subsection 5.5 and 5.6. We end with a conclusion in section 6.

The new relativistic kinetic term
The Schrödinger Hamiltonian can be generally 4 split into a kinetic piece, H K , and into a potential V ( r). The non-relativistic kinetic term is given by where p = (p 1 , p 2 , p 3 ) is the spatial three momentum. On the other hand, the relativistic term is given by if neither spin nor antiparticle degrees of freedom are given consideration.
In the following subsections we describe the newly implemented (relativistic) kinetic terms.
of freedom interacting with each other in the background of another potential, which would be needed if the two light quarks were not represented as a light diquark in the model. Nonetheless, the present code could be straightforwardly extended to enable computations with two light degrees of freedom that interact with each other and are symmetrized or anti-symmetrized in the overall wave-function according to their spin statistics. 4 In the absence of dissipation and other non-potential effects.
In particular, the FDTD kinetic term H K is based on Taylor expansions on the lattice sites and assumes continuity and differentiability. In momentum space this term corresponds to where Ap l is real-valued and satisfies −π < Ap l ≤ π. These coefficients are not constrained to a finite set of momenta (due to the absence of periodic boundary conditions). In the following we write k l ≡ Ap l for convenience. Note that there are no fundamental limitations for a modified FDTD kinetic term with periodic boundary conditions, although it would require a substantial addition of structure encoding the communication across this boundary in the present version of the code. The newly implemented kinetic terms depend on the FFT and, hence, use periodic boundary conditions, and have the momenta constrained to the eigenmodes of the finite box, i.e., for l = 1, 2, 3. We note that the momentum space lattice is the reciprocal of the position space lattice. The action of the new kinetic terms on a wave-function Ψ = Ψ( r) ( r = (Ax 1 , Ax 2 , Ax 3 ) being spatial coordinates) is given in terms of the Fast Fourier Transform (FFT) and Inverse Fast Fourier Transform (IFFT) operators (realized through the FFTW 3 library) by where the FFT and IFFT operations are defined via Due to the realization of the kinetic terms through the FFT, optimized communications across the internode boundaries, as done in the FDTD approach, are not required here.
The term H K is a discretized FFT implementation of the classical non-relativistic kinetic term H nr K , Eq. (4). For this reason, the discretization errors of Eq. (7), namely, are completely absent in H K . Note that although H K formally seems to avoid the breaking of the O(3) symmetry to the cubic group, this symmetry breaking is implicit in the FFT. The 2π periodicity and restriction to the interval of Eq. (8) is automatically realized in the FFT implementation. Using the Symanzik effective field theory [28,29] the discretization errors of H (0) K could be systematically reduced by including more extended stencils with adjusted coefficients, e.g., The choice c 0 = 4/3, c 1 = −1/12 restores the leading term to its continuum coefficient and cancels the second term in the series. Hence, there cannot be any local coordinate space representation of H K , since infinitely many stencils up to infinite extent would be required to completely remove the discretization errors.
The term H K is an FFT implementation of the classical non-relativistic kinetic term H (0) K for periodic boundary conditions. Due to the different boundary conditions there are differences between those terms that vanish in the infinite volume limit. However, the discretization errors of H (0) K and H (2) K are formally equivalent as they have the same coefficients in the Symanzik effective theory in the infinite volume limit.
The term H K is the newly implemented relativistic kinetic term, which realizes the most simple discretized version of Eq. (5). It is inspired by a relativistic quark model (RQM) [16] and can be used by means of the KINTERM parameter (see the definition in subsection 5.1 on page 32). There is no local or continuously differentiable coordinate space representation of the relativistic kinetic term, as is evident from the momentum dependent term under the square root. Using Symanzik effective theory [28,29] the discretization errors of H K could be systematically reduced by including more extended stencils with adjusted coefficients just as in the non-relativistic case. It is also straightforward to construct an relativistic analog to H K permits a quantitative study of finite volume errors while using a single volume, due to the different boundary conditions. On the other hand, comparing the results of H (1) K and H (2) K permits a quantitative study of discretization errors while using a single lattice spacing, due to different coefficients in the Symanzik effective theory. Nevertheless, the dependence of the low-energy spectrum on these differences is usually very small. On the contrary, the relativistic kinetic term encodes qualitatively different physics already in the low-energy spectrum.

The iterative procedure
The iterative procedure is unmodified with respect to quantumfdtd v2. That is, it is the very same procedure as described in section 3 of Ref. [14], Equation (17) is the result of a Wick rotation of the Schrödinger Equation, i t → τ . The evolution with τ is given by Eq. (2.5) of Ref. [14], where E i is the i-th eigen-energy of the Hamiltonian and {a 0 , a 1 , . . . } are the decomposition coefficients of the initial guess Ψ( r, 0) in the basis of eigen-vectors. Each wave-function that is generated at each time step of the iterative procedure of Eq. (17) constitutes a snapshot. Before snapshots are computed, the wave-function is normalized in the full lattice volume, V = (AN ) 3 , as r∈V |Ψ( r| 2 = 1. In order to compute excited states, overlaps between the last two wavefunction snapshots are used. These last two snapshots are temporarily kept in memory, and can also be saved to disk 5 . However, storing many of them consumes a lot of disk space. Instead of that, saving a projection of the wave-functions can be triggered. 6 Note that, at some point e −E i τ will become smaller than the machine precision. If we consider the decomposition of a wave-function in an eigen-energy basis, the evolution of each component Ψ i ( r, τ ) with τ is given by Eq. (18). Hence, it decays as ∼ e −E i τ , where E i is the eigen-energy associated with the respective eigen-state. The iterative procedure keeps r∈V |Ψ 2 ( r, τ )| = 1, for all τ by renormalizing the wave-function after a certain number of iteration steps. While the ground state component remains large, the excited state components decay exponentially compared to the ground state.
Hence, the characteristic time τ decay, i during which an excited state i vanishes during the iterative procedure is where is the computer-.
For the question whether two excited states Ψ i ( r) and Ψ j ( r) can be separated in the iterative (or overlap) procedure, their energy difference |E i − E j | has to be sufficiently large, or they have to be distinguishable in terms of another quantum number by which they can be separated. Hence, if all snapshots are taken at large values of τ , the only wave-function that can be recovered is the ground state and the overlap procedure will not be able to recover any excited states unless snapshots at much smaller times are included as well, as we will see in detail in the examples in the next section. The logic behind this iterative procedure is the assumption that at the latest available time any contamination from excited states is already negligible, such that the estimate given by this large-time wave-function is a good representation of the ground state. The next step in this line of reasoning is that the preceding snapshot contains additional non-negligible contributions of only one excited state. An estimate of this excited state can then be extracted once the projection to the estimate of the ground state has been removed. Finally, the iterative procedure assumes that the second excited state is non-negligible at the next earlier time snapshot, removes the projections to the estimates of all lower states, and takes the remaining wave-function as the estimate of the second excited state. In principle this logic could be iterated successively in order to get one excited state after the other, as long as their energies differ by a sufficient amount, such that there is only one additional state at each earlier snapshot. In practice, isolating more than two excited states is very often infeasible.
Unless the ground state is successfully isolated at the latest time, its estimate will be a linear combination of it with one or a few of the lowest excited states instead. Thus, the estimate of the excited states are in turn also linear combinations of the ground state and a similar or even larger set of excited states. This is the reason why none of the states is correctly identified at small simulation times. Only if the simulation runs long enough to successfully isolate the ground state, reliable results for any of the other states become possible, too. In order to get accurate results for the excited states, one snapshot has to be available in each time interval between the freezeout times of any two excited states. Since the code internally uses the two latest snapshots, better projections could be obtained during post-processing if the snapshot times are chosen in a manner particularly suitable for the energies of the system in question.
We improve the previous version of the code by adding the option of using a kinetic term H (i) K . In all cases, the values of A and B are taken to be the same as in [14], The time step ∆τ is an independent parameter that should be chosen as small as necessary to reduce the discretization errors of the time evolution and achieve good convergence, while being as large as permissible to minimize the computational expense. Both of these limitations depend on the combination of the kinetic term and the potential term.

Finite volume normalization
Close to the edges, the wave-functions are always affected by severe finite volume effects, albeit differently for Dirichlet or periodic boundary conditions. Namely, the same distance r realized by a separation along any one of the axes is much closer to the boundary than any realization in terms of an off-axis separation, e.g., r = 3n · A implemented via r = (3, 0, 0)n · A vs. r = (2, 2, 1)n · A. On the level of post-processing scripts 7 we use a parameter min_sep_edge to remove contributions from sites closest to the boundary from expectation values, and restrict wave-functions to an effective lattice volume V 2 ≡ ((1−min_sep_edge)AN ) 3 . This finite volume normalization in the effective lattice volume, V 2 , or box normalization for short, is different from the full volume normalization used within the quantumfdtd v3 code, cf., Sec. 2.2. Instead it is defined as We provide a script that normalizes the wave-function by enforcing r∈V 2 |Ψ( r| 2 = 1, within the effective lattice volume V 2 . As a very simple and intuitive measure for the sensitivity to finite volume effects we employ the average radii in this box normalization, ρ 2 , that are computed after normalizing the wavefunctions (obtained through the iterative procedure) in the effective lattice volume V 2 , Unless ρ 2 N/2 we have to expect significant distortions of the wave-function due to finite volume effects.

Parity projection
We include scripts for computing the positive, P + , negative, P − , and negative-around-an-axis, P − p k , parity projections of the wave-function, Here, we have chosen p k =ê 3 = (0, 0, 1), although the script acceptsê 1 andê 2 as well. The aim is to separate the positive parity s-waves from the negative-parity p-waves. For the p-wave states, the negative-around-anaxis, P − p k separate the states with different magnetic quantum number m ∈ {0, ±1}. The weight of a parity projection P is defined as Note that within the effective lattice volume V 2 , Ψ|P + |Ψ + Ψ|P − |Ψ = 1 because of box normalization (unitarity) and the fact that parity projections are orthonormal. In this sense, we talk about the weight of a parity projection P as being one particular projection. It is straightforward to extend these scripts for projection to the higher multipoles, if the need should arise.

Studied cases
We discuss the convergence of the iterative procedure (see the previous subsections 2.1 and 2.2) in the following. Furthermore, we study the performance of the overlap method for separating excited states (see subsection 2.2), as well as the behavior of the parity projections (see subsection 2.4).
We first use the Coulomb-like potential (see subsection 5.3) as our benchmark, setting POTENTIAL=2, as it is shown in the left panel of Fig. 3.1.a. This Coulomb potential diverges outside of the grid at (N H , N H , N H ) with N H = (N + 1)/2, i.e., in the center of the unit cell at (N/2, N/2, N/2). Alternatively, one could implement a Coulomb potential centered on the grid at (0, 0, 0) via POTENTIAL=102; the regularization of the divergence at its center gives rise to a discretization artefact resulting in suppressed contributions from 7 See subsections 5.5 and 5.6. Right: The harmonic oscillator potential used for the analysis in subsection 3.6. We use N = 256 points in each spatial direction, and the potential is centered in the lattice volume as well.
distances of the order of the lattice spacing, r ∼ A. Different paths bridging the same radial distance may belong to different representations of the cubic group, i.e., (3, 0, 0) or (2, 2, 1), which both correspond to r/A = 3. Hence, the derivatives along these different paths still exhibit a substantial roughness as functions or r/A, which, however, reduces rapidly for larger r/A. This is the dominant discretization artefact in all of those following examples. Discretization artefacts can be parameterized through a series in dimensionless products of the lattice spacing and any other parameter of non-trivial mass dimension. We set the lattice spacing A = 0.12 fm and use a reduced mass M = 0.3 GeV with no other dimensionful parameters; hence, the discretization artefacts can be understood in terms of factors of a mixed series in A · M or A/r, where the latter are relevant only if r is not much larger than A. Because each kinetic term is an even function of A, the discretization artefacts are restricted to even terms in A, too. Due to (A · M ) 2 ≈ 0.03, we expect mild finite mass discretization errors; discretization errors of the form (A/r) 2 are negligible in comparison for r/A 6. Along these lines we use the Coulomb-like initial conditions, setting INITCONDTYPE=2 (see subsection 5.2).
As a second example we use the harmonic oscillator potential together with the relativistic kinetic term, setting POTENTIAL=4 as it is shown in the right panel of Fig. 3.1.a. We keep the Coulomb-like initial conditions (INITCONDTYPE=2) but we reduce the lattice spacing to A = 0.006 fm while increasing the number of points in each direction to N = 256 and we also increase the mass to M = 30 GeV which results in (A · M ) 2 ≈ 0.81 which should result in reasonably small finite mass discretization errors 8 . Note that in contrast to the Coulomb potential, the harmonic oscillator potential has no divergence at its origin.
In all of the cases, we do not plot points that are closer than 0.15N to one of the edges and use the box normalization corresponding to min_sep_edge = 0.3 9 , in order to avoid zones where finite volume effects are prominent, see subsection 2.3. We compare -wherever possible -to the theory expectation obtained from a continuum, infinite volume wave-function integrated in the same effective lattice volume V 2 .
We plot the positive parity projection P + , Eq. (23), and the negative parity around-an-axis P − p k , Eq. (24). We do not plot a projected wave-function if its weight is less than 10 −5 . However, we need to stress that the overlapping procedure used for extracting ground and excited states out of the iterative procedure (see subsection 2.2) can also introduce numerical artefacts affecting excited states.
This section is organized as follows: We start with the Coulomb potential. In subsection 3.2, we study the convergence of the relativistic kinetic term H  K at an increased lattice volume using N = 128. In 8 In order to be able to compare to existing literature results we had to slightly change the potential and use a large mass. See the discussion in subsection 3.6 for more details. 9 See the definition of the load_wf function in subsection 5.6 on page 40)  K . In subsection 3.5, we study the parity fixing machinery of quantumfdtd, for both symmetrization and anti-symmetrization. We use the non-relativistic kinetic term H (0) K (FDTD) as an example. We then switch to the harmonic oscillator potential and study the relativistic kinetic term H (3) K in subsection 3.6. Finally, in subsection 3.7, we summarize the convergence conditions and numerical behavior of the program. Note that, for τ > 125 GeV −1 , the weights of the excited states fall below the computational accuracy, , of the computer. For a 64 bit architecture, ∼ 10 −16 . Below this threshold we expect numerical artefacts which can be seen in the top right plot of Fig. 3.2.a, where the exponential decay breaks for a weight of the minor component of the parity projection ∼ 10 −21 ∼ 10 −16 /N 3 (N = 64). See also, on the top left side of Fig. 3.2.a, that the energy of the first excited state decays to that of the ground state because the overlap method is no longer able to separate the first excited state from the ground state once numerical artefacts show up.

The relativistic kinetic term H
For large values of τ , the energy of the excited states decay to that of the ground state, whose energy eigen-value remains stable throughout the evolution after its correct value has been reached.
Note also, on the top left plot of Fig. 3.2.a, a first crucial point at τ ∼ (60 − 65) GeV −1 , where the energy  of the second excited state, E 2 , degenerates with the one of the first excited state, E 1 . This can also be seen in the relative weights of the parity projections of the second excited state shown in the bottom right plot of Fig. 3.2.a. Actually, we are seeing that the second excited state has fallen below the machine precision. Hence, the only remaining states are the ground state and the first excited state.
For high values of τ , but not high enough to hit the aforementioned cancellation issues, the parity projected wave-functions are stable as a function of τ . The negative parity projections around an axis, P − p k , of different separated states are compatible (they actually come from the lowest P − p k odd state). Additionally, there are 2 positive parity projections, P + , the ground state and the first excited state in the s-wave. This can be seen in Fig. 3.2.b where we plot the parity projected wave-functions for three different snapshot time steps.
We add an additional row corresponding to 10000 steps, i.e., τ ∼ 123 GeV −1 . The wave-functions do not change by eye which is also supported by the values of the average radii at that later time step.Their values lie well within the considered lattice volume such that we do not expect considerable finite volume effects. Note also that, for τ > 100 GeV −1 , these average radii become stable as a function of τ (see the table in the caption of Fig. 3.2.b). The average radii of the parity projections, P + Ψ 1 and P + Ψ 2 , are compatible. Similarly for the parity projections, P − p k Ψ 1 and P − p k Ψ 2 . That is, the overlapping procedure of subsection 2.2 can separate the first excited states of each parity, but cannot identify the higher states.  The eigen-energies obtained using the different kinetic terms are shown as a function of τ in Fig. 3.3.a. The lines that correspond to H (1) K and H (2) K , respectively, are pretty much coincident, since they are both non-relativistic and FFT-based implementations of the kinetic term. The only difference is a small lattice correction in the kinetic term (see Eqs. (9) and (10)). The H (0) K line, on the other hand, is based on the FDTD method for computing the (non-relativistic) kinetic term.

Comparison between the different non-relativistic kinetic terms
This method assumes Dirichlet boundary conditions (vanishing wave-function on the boundary) instead of periodic ones, so that disagreement with FFT-based methods is expected when finite volume effects are prominent, i.e., in states similarly large as the box. This qualitative difference shows up as a significant deviation of the H The difference is small and basically negligible for the ground state, that stabilizes at τ ∼ 50 GeV −1 and whose eigen-energy value, E 0 , shows no discrepancy between H K , respectively. The second excited state's energy, E 2 , decays to E 1 in all the cases, and it also shows the discrepancy between H K . Note that the 1s ground state of the Coulomb potential is the one most concentrated around r = 0, and, hence, the one that should show less finite volume effects. This implies that the differences between H where M is the reduced mass assuming unit charge and the Y (Ω) are the spherical harmonics. The continuum normalization is given by The corresponding average radii in the continuum are defined through integrals as Explicitly, they are given by However, for the sake of comparability, we use the box normalization as defined by Eq. (22). Although the 1s, 2p, and 2s states can be recognized, and the wave-functions that can be extracted via parity projection from different quantumfdtd states (see subsection 2.4) are compatible, there are some differences between the 3 kinetic terms, due to finite volume effects or due to discretization errors. A study of the former is performed at the end of the section, where we increase the lattice volume to N = 128.
For the FDTD kinetic term, H K , the most prominent finite volume effect is a faster than expected decay of the 2p and 2s wave-functions close to the borders of the lattice volume which can be seen in Fig. 3.3.b. This is expected because of the Dirichlet boundary conditions. The 1s wave-function decays exponentially, so that its theoretical value at the borders of the finite volume is also low and therefore, this wave-function is not as severely affected by the Dirichlet boundary conditions.
For the FFT-based kinetic terms, H K and H K , the most prominent discretization artefacts can be seen in Figs. 3.3.c, and 3.3.d and are located at values of r close to the lattice spacing A. It is very pronounced due to the hard UV cutoff of the high momentum modes in the FFT. This effect is not present for the 2p wave-function, that vanishes at the origin and effectively suppresses contributions from the hardest modes. But it is especially prominent for the 2s wave-function.
For low values of τ , higher states with quantum number l > 0 show up as broad lines on the plots. Such states decay exponentially as a function of τ , according to Eq. (18), so that they do not show up at larger τ . In particular, the negative parity-projected ground state, P − p k Ψ 0 , is dominated by the Ψ 2p contribution and K . The average radii in units of A are given by the positive parity-projected first excited state, P + Ψ 1 , is actually dominated by the true ground state Ψ 1s . Note that the ground state, Ψ th 1s , and the first excited state, Ψ th 2p , have P + and P − parities, respectively. This is why P − p k Ψ 0 and P + Ψ 1 are actually dominated by the first excited state and by the ground state, respectively. According to subsection 2.2, the overlap procedure allows for an extraction of the ground state and of the excited states. However, due to numerical inaccuracy, this separation is not perfect. Otherwise, we would have P − p k Ψ 0 = 0. For all of the higher states contributions with both parities persist, because both states are degenerate for a non-relativistic Coulomb problem (infinite volume and continuum).
The continuum average radii in units of A are given by ρ ∞ 1s = 8.21, ρ ∞ 2s = 32.83, and ρ ∞ 2p = 27.36. However, in the considered effective lattice volumes, the average radii of the continuum solutions of Eq. (26) in units of A are ρ 2 1s = 8.14, ρ 2 2s = 20.30, and ρ 2 2p = 19.06. The lower values are linked to finite volume effects, that make larger values of r being discarded. Since the radial probability distribution of the 2s has its maximum at larger radii than the radial probability distribution of the 2p, it is expected that the even parity excited states are a bit more sensitive to a periodic boundary condition than their odd parity counterparts. In fact, radii of large even parity states approach the continuum value from below for Dirichlet boundary condition and from above for periodic boundary condition. These box normalized theoretical average radii should be compared with those listed in the captions of Figs. 3.3.b to 3.3.d. The wave-functions shown  get closer to ρ 2 2p , and the ones of ρ 2 P + Ψ 1,2 get closer to ρ 2 2s . However, we note that ρ 2 2p ≈ 20.30 is pretty close to ρ 2 2s ≈ 19.06. Indeed, the only difference between these states is the angular quantum number l, while n = 1 is equal. This fact makes the comparisons harder. We find that the average radii of the wave-functions of the FFT-based kinetic terms H (1) K (Fig. 3.3.c) and H  (Fig. 3.3.b) shows -with the notable exception of the ground state -after stabilization, systematically smaller average radii than the FFT-based or continuum results due to the Dirichlet boundary conditions. This is especially prominent for the 2s and 2p states.
Looking at the tail of the wave-functions, the decay of the excited state wave-functions is faster in the FDTD case. This fast decay is explicitly enforced by the Dirichlet boundary conditions. For the FFT-based kinetic terms, periodic boundary conditions are used which do not enforce Ψ(boundary) ≡ 0, and thus the tails of the corresponding wave-functions are closer to the continuum ones as can be seen in Figs. 3.3.c and 3.3.d. However, this more realistic, yet slower decay implies an increased sensitivity to the boundary. The broad bands are due to the distinction of on-axis and off-axis separations, which effectively feel different sizes of the box, see Figs. 3.3.c, and 3.3.d. The 1s state is less affected by finite volume effects because its theoretical average radius ρ 2 1s ≈ 8.14 is well within the finite lattice volume. We now compare the FDTD kinetic term, H In Fig. 3.3.e we plot the energy eigen-values for the N = 64 and N = 128 simulations. The relative weights of the parity projections are shown in Fig. 3.3.f, and finally in Fig. 3.3.g we plot the wave-functions for two different values of τ .
For N = 128 the differences between H This trend continues when looking at the behavior of the relative weights of the parity projections in Fig. 3.3.f. In the right panels we see little to no difference between the two kinetic terms. The same holds true for the ground state even in the smaller volume (top left panel of Fig. 3.3.f). The parity projections of the  K . This behavior is not totally unexpected given that H K their sum ends up with a weight of p even ≈ p 00 +5p 2m ≈ 1/6, while the sum of the p-waves ends up with a weight of p odd ≈ 3p 1m ≈ 1/3. With H (2) K , the odd parity states are much more suppressed, and even parity dominates completely. Eventually, the second excited state freezes out, and the same pattern as in the first excited state emerges for τ > 125 GeV −1 in both volumes.
The continuum average radii in units of A using the box normalization are given by ρ 2 1s = 8.22, ρ 2 2s = 30.77, and ρ 2 2p = 26.18, respectively. As it happened for N = 64 above, the infinite volume continuum solutions of Eq. (26) in units of A have larger average radii: ρ ∞ 1s = 8.21, ρ ∞ 2s = 32.83, and ρ ∞ 2p = 27.36, respectively. However, note that the difference is not as prominent as in the N = 64 case. Actually, ρ = 31 falls well inside the lattice volume of N 3 = 128 3 , so that finite volume effects are not so drastic.   Fig. 3.3.b, that was attributed to finite volume effects and the Dirichlet boundary conditions, disappears when the volume is large enough to accommodate most of the wave-functions.
The ground state radius ρ 2 P + Ψ 0 is minimally larger in the bigger box, and insensitive to the boundary condition as in the smaller box. On the one hand, the even parity excited states computed with the FDTD (Dirichlet boundary conditions) have average radii in the box normalization, ρ 2 P + Ψ 1,2 , that are still systematically smaller than the corresponding theoretical value ρ 2 2s ; on the contrary, these state have larger radii when computed through the FFT (periodic boundary conditions). On the other hand, the odd parity excited states have average radii ρ 2 P − p k Ψ 1,2 that are marginally smaller than the theoretical value ρ 2 2p .
It can finally be seen in Fig. 3.3.g that, especially for the FFT-based kinetic term (H K , right plot), the behavior of the 2s and 2p states deviates from the respective continuum result close to the origin. The continuum result curves lie slightly higher than the numerically computed ones in that region. This points out the issue with the lattice implementation of the Coulomb potential close to its center as already mentioned and is expected due to the pole at the origin. Hence, we are facing discretization artifacts originating from the discretization of the potential. The FFT approach, seems slightly worse than the FDTD approach (H (0) K , left plot of Fig. 3.3.g) in handling this kind of effect. The singularity of the Coulomb potential at r = 0 makes the derivatives of the potential diverge close to r = 0, whereas the FFT sets a cut on their values due   to the finite volume in momentum space. The FDTD is based on finite differences, so that such a drastic cut in momentum space is not present in contrast to the former.

Comparison of the relativistic H
K and the non-relativistic H K kinetic terms at high mass In this subsection, we increase the reduced mass up to M = 3 GeV in order to simulate a heavy, nonrelativistic system. In all of the plots in Fig. 3.4.a, the results of the relativistic kinetic term, H The goal is to compare the numerical behavior of the relativistic and the non-relativistic kinetic terms in the regime of high mass, where they should become more similar. The non-relativistic energy solutions have to be shifted by M in order to allow for a comparison with the relativistic energy solutions. Due to (A · M ) 2 ≈ 3.3 we expect finite mass discretization errors. For this reason, this setup is not appropriate for production purposes, where one should reduce the lattice spacing by (approximately) the inverse factor from A = 0.12 fm to A 0.02 fm. The energies and parity projections of the wave-functions evolve faster with τ (see Fig. 3.4.a), such that the energy of the second excited state, E 2 , degenerates with E 1 at τ ∼ 25 GeV −1 and E 1 degenerates with E 0 at τ ∼ 70 GeV −1 . For τ > 60 − 65 GeV −1 , numerical cancellation effects start showing up (see parity projections in Fig. 3.4.a). The similarities between the results in the heavy mass limit, however, can be also seen in the top left panel of Fig. 3.4.a, where the decay curves are quite similar after taking into account an additive factor of 3 GeV on the energies due to the rest mass on the relativistic approach.
As expected, the wave-functions (see Fig. 3.4.b) of the relativistic and the non-relativistic cases are closer to one another due to the increased reduced mass. For comparison reasons we therefore show the nonrelativistic theory curves also for the relativistic kinetic term (left plots). However, the results are visibly non-smooth due to discretization artifacts. Comparing these results with the results from the previous subsections 3.2 and 3.3 shows that the increase of the reduced mass from M = 0.3 GeV to M = 3 GeV by an order of magnitude leads to the expected decrease of the characteristic average radius of the wave-functions. For the non-relativistic case, ρ ∞ 1s = 0.82 in units of A, which is below a single lattice spacing. For the 2s and 2p excited states, the continuum average radii are ρ ∞ 2s = 3.28 and ρ ∞ 2p = 2.74, respectively. The fact that the continuum average radii in units of A are close to 1 makes a precise analysis of this case more complicated because higher terms in the series in (A/r) 2 are substantial contributions to the discretization artefacts (see subsection 3.1).
In the right panels of Fig. 3.4.b, it can be seen that, although the wave-function computed via the nonrelativistic FFT-based kinetic term H (2) K are close to the continuum result, there are clear discrepancies that partially originate from the momentum discretization. Since the large mass compresses the states to smaller radii, i.e., r ∝ 1/M , even the discretization artefacts from the series in (A/r) 2 are enhanced by the large    mass. This is very prominent for the 1s and 2s states, that have non-vanishing values close to the origin. Anyhow, in Fig. 3.4.b it can be seen that the wave-functions computed via the relativistic kinetic term, H K , and the non-relativistic one, H (2) K , (both based on a FFT with periodic boundary conditions) are pretty close to one another. This is to be compared against the cases shown in the right panel of Fig. 3   3.5. The quantumfdtd internal parity fixing machinery In this subsection, we investigate the ability to specify the initial spatial symmetry of the wave-function using the INITSYMMETRY parameter (see the definition in subsection 5.2 on page 35). We set INITSYMMETRY=1 for the left plots in all figures, which corresponds to symmetrization along the z axis, and we set INITSYMMETRY=2 for the right plots in all figures, which corresponds to anti-symmetrization. We again use N = 64 and M = 0.3 GeV and the non-relativist kinetic term H (0) K . The corresponding energy solutions are shown in Fig. 3.5.a. The relative weights of the parity projections are shown in Fig. 3.5.b, and, finally, the wave-functions are shown in Fig. 3.5.c. There the dashed lines again represent the analytic solutions for the non-relativistic kinetic term and a Coulomb potential. The results for the latter should be compared with the figures of subsection 3.3 and with the FDTD based kinetic term, H (0) K , therein. That is, with Fig. 3.3.b.
In Fig. 3.5.b, it can be seen that the choice INITSYMMETRY=1 (left plots) removes the 2p z state completely. However, there is still a residual negative parity component, most likely due to higher lying states or due to 2p states that are symmetric around the z axis. On the other hand, INITSYMMETRY=2 (right plots in Fig. 3.5.b) removes all s states, leaving only 2p and higher lying states. In particular, there are 3d states, which are odd under P − p k for two directions while being even under parity. This manifests in the P − p k Ψ 1,2 and the P + Ψ 1 projections in the bottom right plot of Fig. 3.5.c remaining broad (the corresponding bottom left plot had this artefact removed that was present for smaller values of τ in the top plots). These states are not separated by the post-processing parity projections (see subsection 2.4). Hence, these positive parity wave-functions mix contributions from states with different geometry that show up as the broad bands in the Figure even for late time steps τ . Note that, on the right energy plot of Fig. 3.5.a, the ground state energy no longer corresponds to the ground state energy of the left plot. It now corresponds to the first excited state energy of the left plot which is the 2p state, since it is the lowest lying state compatible with the anti-symmetrization. The two excited states in the right-hand plot are corresponding to 3d and 3p. Although these large states would be degenerate in the continuum in an infinite volume, they both show very strong finite size effects that lift their degeneracy.
The continuum average radii in units of A are given by ρ ∞ 1s = 8.21, ρ ∞ 2s = 32.83, and ρ ∞ 2p = 27.36. As explained previously in subsection 3.3, the values are lower when computed using the box normalization, i.e., ρ 2 1s = 8.14, ρ 2 2s = 20.30, and ρ 2 2p = 19.06, respectively. At sufficiently large times τ , the average radii agree reasonably well with the theoretical expectations. This behavior is in line with what we have already seen in the previous subsections; especially in subsection 3.3. The possibility to exclude certain states from the beginning can be an advantage during demanding calculations and is also a non-trivial cross check.
Because of the excited state contamination with 3p and 3d state in the case of the anti-symmetric initial condition we also provide their average radii 10   states with main quantum number n = 3 become extremely compressed within this finite box. In particular, the wave-function at large radii aligned with any axis is forced to be small due to the Dirichlet boundary condition and continuity. Yet it may be much larger at similar radii realized in off-axis directions that are still far away from the boundary. The consequence of these boundary effects are the broad, wavy bands which become narrow only close to either the center or to the corners of the box.
Since the 3p-state is significantly more deformed by cutting away large parts of the contribution between its two nodes at large radii in the finite box, its box-normalized radius is even smaller than for the 2p-state. Nevertheless, the corresponding pattern of a function with two nodes at finite radii is clearly recognizable in both right panels of Fig. 3.5.b. The 3d-state, on the other hand, is rather localized with only a single node. The average radius of its box-normalized wave-function is surprisingly similar to the one obtained in the simulation. With regard to Fig. 3.5.b this may be a bit surprising, as its box-normalized radius still substantially undershoots the infinite volume radius.  We also note that the wave-function of the 3d-state is not projected by the post-processing scripts with the correct symmetries, since projection onto d-waves has not been implemented 11 . Hence, the remaining angular dependence is averaged over, which leads to a wave-function spreading over both signs in the smeared band of the positive parity state. Physical information should not be extracted from these states unless the volumes are larger by at least a factor 4 3 .

The relativistic kinetic term H
K and the harmonic oscillator In this subsection, the relativistic FFT-based kinetic term H K , Eq. (11), is studied together with a harmonic oscillator potential. Our goal is to compare to the existing analytic literature results of Ref. [30]. There, however, the harmonic oscillator potential comes without the factor of 1/2 which we also take into account for the simulation (compare to the standard form shown in Eq. (43)). Additionally, there, the reduced mass is chosen to be M = 30 GeV which we use as well. In order to combat large mass discretization effects we decrease the lattice spacing to A = 0.006 fm which results in (A · M ) 2 ≈ 0.81 which should result in reasonably small mass discretization effects. We increase the number of points in each direction to be N = 256 to sustain a large enough physical volume. Smaller masses are possible in principle which would remove the need for such small A and large N , however, the computational cost for the analytic solution grows very rapidly (see below). Such choices would, however, very likely increase the numerical behavior of the simulation. We plot the convergence of the energies as a function of τ (top left plot in Fig. 3.6.a) and the relative weights of the parity projections for the 3 lowest states versus τ (top right and bottom plots in Fig. 3.6.a), and finally we plot the wave-functions in Fig. 3.6.b where we also compare them to existing theoretical results for l = 0.
At small values of τ we can see some finite mass discretization artefacts, but for values of τ > 8 GeV −1 the values of all three states have stabilized and we do not see any decay of excited states. The ground state is dominated by positive parity as one would expect. For the first excited state, one would expect negative parity domination of the state, which we can see, but due to the slow convergence caused by the large mass, only at large τ . Finally, for the second excited state we -as expected -see positive parity at large τ . The above enables us to extract three distinct states with the expected parities and their corresponding energies reliably.
In contrast to the relativistic Coulomb potential, the relativistic harmonic oscillator does not have a divergence at the origin. Additionally, a closed form analytic solution in momentum space exists [30] for l = 0 states which reads 12 The momentum space wave-function, Ψ( p), is the Fourier transform of the position space solution 13 , Ψ( r), 12 In the following equations any dimensionful quantities are rescaled in order to be dimensionless; for details see Ref. [30]. An implementation of the recursive formula can be found at https://github.com/quantumfdtd/relativistic_harmonic_ oscillator. 13 See also the discussion in subsection 2.1 and the Eqs. (8) to (16). and the recursive coefficients read c 2n = 0 for n = 0, 1, 2, . . . , for n = 1, 2, 3, . . . .

Similar to
Ref. [30], the sum in Eq. (30) needs to be truncated at some valueÑ for which we take their value ofÑ = 44. The convergence of this momentum space solution gets worse with decreasing M , such that very high values ofÑ are required and numerical instabilities due to division of large numbers (due to the factorials) spoil the results. We show the results in position space as well as the momentum space results obtained via a FFT in Fig. 3.6.b. The latter is compared to the truncated analytic result of Eq. (30). In Fig. 3.6.b we show the position space solution for the largest τ value available in the left panel as a function of r/A. We can separate the states as discussed above but the relatively large contribution of the positive parity state to the first excited state is still visible. In the right panel we show the momentum space wave-functions, |y(p)|, according to Eq. 30 as a function of A · p obtained via a FFT, and compare them to the theoretical predictions following Ref. [30]. We only show the l = 0 states, i.e., the ground state and the second excited state that we could clearly identify in Fig. 3.6.a to be the ones with positive parity, because we can only compare those to the theoretical predictions.
To obtain the right plot of Fig. 3.6.b, the FFT over the parity-projected wave-functions (left plot of Fig. 3.6.b) is used. This is similar to the implementation of the kinetic terms H K and H K (see subsection 2.1). There, a FFT is used to turn from position to momentum space, where the kinetic term is evaluated, and then an IFFT is used to turn back into position space. Here, we use a FFT to turn position space into momentum space so that we can compare our results with the ones from Ref. [30].
Note that the FFT has to be applied over the whole original lattice of dimension N 3 , without any notion of effective volume V 2 , i.e., without taking into account min_sep_edge. Otherwise, we would be changing the normalization of the momentum coordinates. After the FFT has been computed, the relation between the lattice indices and the value of the corresponding momentum coordinates is given by the inversion of Eq. (8) and reads where the x i are in lattice units and satisfy which is motivated by the usual layout of FFT implementations. Ref. [30] plots the radial momentum space wave-function, Because only the l = 0 states are computed, these states have spherical symmetry. The normalization of Ref. [30] is such that y (0) = 1, because their computation amounts to a Taylor expansion around p = 0 that solves an ordinary differential equation (ODE) with initial conditions y(0) = 0, y (0) = 1. On the right plot of Fig. 3.6.b, we normalize these solutions (theory lines) such that, instead, where max(p) is the maximum p for which y(p) has been evaluated. For a comparison with the radial wave-functions, y(p), of Ref. [30] (normalized as in Eq. (37)), we have to modify the normalization of our results such that where (2π/N A) is the reciprocal lattice spacing. Then, these wave-functions, Ψ(p), are multiplied by a factor √ 4πp turning them into spherical coordinates. Finally, in order to get rid of complex phases due to the FFT, the absolute values of our results and of the theory predictions are plotted for comparison. With the results presented so far, we reproduce Fig. 1, and the values 14 of Tab. 1 of Ref. [30].

Convergence conditions
The convergence of the energies and wave-functions shown in Figs. 3.2.a to 3.5.c is pretty clean. Discretization errors that are predominantly due to the lattice implementation of the Coulomb potential are visible on the coarser lattices and excited states are affected by finite volume effects in small lattice volumes. The wave-functions generated by the quantumfdtd can be considered reliable, only after the following conditions are met: • Appropriate parity projection and normalization has been performed using the provided post-processing scripts for separating the wave-functions (see subsection 2.4).
• Using the snapshot facility of the code for multiple values of τ the wave-function has been shown to be stable within small changes of τ and the energy E s,i (τ ) should be in a plateau, i.e., flat or, at least, in a saddle point.
• A < M −1 , such that the typical radius of the ground state is substantially larger than the lattice spacing A, ρ 2 1. This condition is violated in subsection 3.4 in the example.
N , such that typical radii of the excited states are substantially smaller than half of the box size N .
• It is important to choose the temporal grid spacing EPS ∝ A 2 relative to the lattice spacing A (see the EPS parameter on page 33).  [30], where the first (radially) excited (l = 0) state corresponds to the second excited state in our numerical simulation (since our first excited state has odd parity). At τ = 10 GeV −1 we obtain numerical results for the ground state energy of (E num 0 −M ) ≈ 0.38645 GeV, and first or second excited state energies of (E num , respectively, which is agreement at 0.5‰ level for E0 or still at sub-percent level for the excited state.

Compilation and running
We require ${NCORES} MPI processes, with ${NCORES} being a divisor of N . data and data/snapshot: Folder for default output files and snapshot files. The file format is ASCII, and the output consists of (i) the potential (see the SAVEPOT parameter for formating options), (ii) wave-functions (see the DATAFOLD parameter for formating options), and, if the parameter DUMPSNAPS is set, (iii) snapshots (see the parameter DUMPSNAPS) and (iv) eigen-energies. The snapshots are saved in data/snapshot which must exist if one wants to store them, otherwise, the run will fail. In the cases (i), (ii), and (iii), the first columns of the ASCII files are the lattice coordinates x 1 , x 2 , and x 3 , being the Cartesian coordinates in lattice units. The fourth column is the distance to the origin in lattice units which is corrected in case the potential is centered on the lattice center. The eigen-energies are stored in the files decay.dat, ground_state.out, first_excited_state.out, and second_excited_state.out. The first file is an ASCII list containing: The other 3 files, ground_state.out, first_excited_state.out, and second_excited_state.out, contain the eigen-energies of the ground state (i = 0), the first (i = 1), and the second (i = 2) excited state, respectively. Their layouts are: In all the cases, E b,i stands for the bare eigen-energy and E i , for the actual eigen-energy (for an explanation see the definition of the parameter UPDATE in subsection 5.

scripts:
Examples of scripts for running the program and post-processing of the output data. This includes the Python 3 [36] module quantumfdtd.py, that offers a Python class (quantumfdtd) for the postanalysis inside the Pandas [37,38] framework as well as the scripts symmetrize.sh, for computing parity projections, and several scripts for extracting plots 20 .
The root directory contains a Makefile that allows one to build the program provided that all of the external libraries are installed. If the libraries are installed in non-standard locations, this Makefile should be modified accordingly.
The following commands, executed in the root folder of the provided tarball, are required for building and executing the program: Here, ${NCORES} is the number of cores to be used which should be a divisor of the number of the spatial grid points. The source code is available on GitHub: https://github.com/quantumfdtd/quantumfdtd_v3.

Parameters of the program
The parameters listed in the following subsections should be defined in the standard parameter file located at input/params.txt. The input directory should be located in the directory where the program is also run from. Note that many of the post-processing scripts require the corresponding input file in the standard location (input/params.txt) to work. The configuration file accepts full line comments, that are marked with an initial double backslash, \\, on the commented line.
For convenience, the program accepts command line parameters as well. They can be parsed as follows:

Basic configuration
This program is based on an iterative method for finding both the ground state energy and the two first excited state energies. The corresponding wave-functions are computed as well. Hence, most of the basic configuration flags that are required for running the program are related to the setup of the lattice (discretization) or to the iterative procedure. The following parameters are required for running the program:

NUM:
Number of spatial grid points in each direction. Note that they should be divisible by the number of computational nodes, i.e., NUM mod N MPI cores = 0.

MASS:
Reduced mass of the simulated system. Default units: GeV.

KINTERM:
Selection of the kinetic term (see subsection 2.1) to be used: • The non-relativistic kinetic term based on FDTD (0).
• The relativistic kinetic term also based on FFT (3). 20 postprocessing.sh for the potential, energies and wave-functions and postprocessing_sym.sh for the parity projections. In both cases, plots of the full 3D wave-functions snapshots are obtained if they have been computed.

POTENTIAL:
Potential to simulate. There are 6 hard-coded potentials described in subsection 5.3 and 4 different options to read the potential from file described in subsection 5.4.

EPS:
Temporal grid spacing. Should be proportional to A 2 with a proportionality constant that can be determined from empirical stability analysis.

UPDATE:
How many steps should be taken before checking the SNAPUPDATE, SNAPDUMP and STEPS variables, and computing observables 21 and printing them to command line. If the SAVEDECAY flag is enabled, observables are also saved to the file decay.dat, as an ASCII table: T , E b . Here, T stands for the total time (EPS×#steps) and E b is the bare ground state energy. Note that some hard-coded potentials split the actual physical potential V in two parts 22 : A bare (V b ) and a subtracted (V s ) potential, with the actual physical potential being V = V b − V s . This option is inherited from previous versions of the quantumfdtd code [14,15], where it was used to introduce additive scalar constants to the potential, that do not modify the dynamics of the Schrödinger equation and can thus be split off. V b is the one used inside the iterative procedure, for computing the wavefunctions Ψ i and energies E b . Afterwards, the subtracted physical value of the energy, E s , is computed from the subtracted potential V s via where the integral extends over all of the lattice positions. This is correct for the potentials V (r) in the particular case V (r) = V b (r) − V s , where the subtracted potential V s is a scalar constant. If V s (r) depends on the position, although the code will run, there is no guarantee that the solution is correct. In the general case, setting V b (r) = V (r) and V s (r) = 0 should be the preferred way. We have maintained this option for backwards compatibility with previous versions of the quantumfdtd code.

STEPS:
Maximum number of steps to take. This limit is enforced each time a number of steps given by the variable UPDATE is evaluated. Hence, the actual number of steps to take is exactly STEPS only if STEPS is divisible by UPDATE. Otherwise, it will be the smallest multiple of UPDATE bigger than STEPS. This is the actual number of steps to take if the TOLERANCE parameter is set to a negative value.

SNAPUPDATE:
How many steps should be taken before enforcing both normalization of the wave-functions ( r∈V |Ψ i ( r)| 2 = 1) and their required symmetries (see the flag INITCONDTYPE). It triggers the check of the tolerance condition (see TOLERANCE flag). It also triggers the storage of the 2 snapshots that are required for the extraction of the first and second excited states via the overlapping algorithm (see subsection 2.2). The post-analysis scripts based on the Python 3 module quantumfdtd.py (see subsections 5.5 and 5.6) require both, saving the full snapshots to disk (setting DUMPSNAPS=4), and, setting the STEPS variable to a multiple of SNAPUPDATE, i.e., ensuring that STEPS%SNAPUPDATE=0.

SNAPDUMP:
In case DUMPSNAPS (see below) is set to a non-null value, how many steps should be taken before storing a wave-function snapshot to disk.

TOLERANCE:
Convergence tolerance with respect to the ground state energy. The stop condition being 23 E b,i − E b,i−1 < TOLERANCE. It can be disabled by setting a negative value.

SAVEWAVEFNCS:
Save a full snapshot of the 3d wave-functions at the end of the run. Set to 1 for activation. Note that most of the post-processing scripts require the intermediate snapshots, too. 21 Ground state energy and the expectation values r 2 , x , y , and z . Note that here ρ 2 is with respect to the center of lattice, i.e., with respect to the point (x1, x2, x3) = (NH , NH , NH ) with NH = (N + 1)/2. 22 There is an option for using this splitting for externally provided potentials as well (see subsection 5.4). 23 E b,i − E b,i−1 stands for the difference between the current bare ground state energy and the one obtained when SNAPUPDATE was triggered previously.

SAVEDECAY:
Save a table of energy versus time. This is particularly useful for checking the convergence of the iterative procedure to the ground state, especially, when using the FFT-based kinetic terms.

SAVEPOT:
Name of the external file where the potential that is used will be saved to. The file format is an ASCII list containing: where ρ 2 is the distance to the center of the potential in lattice units, and x 1 , x 2 , and x 3 are the lattice coordinates. These x i are shifted by −1/2 if the potential is centered on the lattice volume.

DATAFOLD:
Name of the folder where the full output, wave-functions, and snapshots will be saved to. By default, is set to data. Note that many of the post-processing scripts require this standard directory data to work. The file format is an ASCII list containing: where ρ 2 is the distance to the center of the potential in lattice units, and x 1 , x 2 , and x 3 are the lattice coordinates. For the indices we have: i = 0 (ground state), 1 (first excited state), or 2 (second excited state). The coordinates x i are shifted by −1/2 if the potential is centered on the lattice volume.
• Store "snapshot slices" of the ground state (1). Only valid for potentials centered on (N H , N H , N H ) with N H = (N + 1)/2. Do not use when centering the potential on (0, 0, 0).
• Store full 3D wave-functions of the ground state (2).
• Store "snapshot slices" of the ground state, and of the first and second excited states (3). Only valid for potentials centered (N H , N H , N H ) with N H = (N + 1)/2. Do not use when centering the potential on (0, 0, 0).
• Stores full 3D wave-functions of the groundstate, and of the first and second excited states (4). The post-analysis scripts described in the subsections 5.5 and 5.6 require this value for DUMPSNAPS.
The full 3D wave-functions are stored in the same way as the ones triggered by DATAFOLD. The file format for the "slices" consists of three concatenated ASCII lists, separated by a row containing "&&", respectively. These lists are, respectively: The coordinates of V [x 1 , x 2 , x 3 ] are in lattice units, and are shifted by −1/2 if the potential is centered on (N H , N H , N H ) with N H = (N + 1)/2. These "slices" are a legacy function of quantumfdtd v2 that should not be used if the potential is centered in (0, 0, 0) as mentioned above because it has not been adapted. In both cases, the output is stored in the data/snapshot directory. This directory must exist prior to running the program. The file names are data/snapshot/wavefunction_step_i.dat, where step is the step associated to the snapshot (τ = EPS · step) and i is an index running over the computational nodes.

Initial conditions
The following parameters allow changing the initial state condition. The Coulomb-type initial conditions are intended to be a sensible initial guess for spherically symmetric confining central potentials.

INITCONDTYPE:
Initial condition to use. Valid values:

Hard-coded potentials
The program comes bundled with several hard-coded potentials, that were already present in the original code [14,12,11]. The user should be aware that some of these potentials may be unphysical depending on the boundary conditions. We stress that all coordinates (x 1 , x 2 , x 3 ) and corresponding radii (r,r) are dimensionless numbers.

0:
No potential (V ( r) = 0). This is equivalent to an infinitely deep 3d well due to boundary conditionsprovided that KINTERM = 0 is used.

1:
3d square well in the center of the lattice volume: where Θ(x) is the Heaviside Theta function.

6:
Cornell potential with string breaking: We take the string-breaking scale to be r sb = 5.5745 GeV −1 and definê r = Aρ , for Aρ < r sb , r sb , for Aρ ≥ r sb , where ρ = x 2 1 + x 2 2 + x 2 3 , and where MASS is the quantumfdtd reduced mass (see subsection 5.1) and SIGMA is an independent parameter specified in the input file.

>99:
If 100 is added to any of the previous potential codes, the corresponding potential will be loaded with its origin centered at the point ( with N H = (N + 1)/2. This is especially useful for periodic boundary conditions.

External potentials
The following values of POTENTIAL offer different options to read external potentials from files:

90:
Read external potential as a table containing separated by spaces.

91:
Read external potential as a table containing separated by spaces. Pay attention to the fact that the variable is ρ 2 and not ρ.
The following configuration flags are only valid when loading potentials from external files 24 : EXPOT: Name of the file containing the external potential. The file format depends on the selected kind of external potential as described above.

POTCRITR:
Defines a point from which on (in units of the lattice spacing), the potential will be a χ 2 adjustment to the following function: The The potentials are implemented in the function potential in the file build/src/potentials.cpp. The potential number is a global variable called POTENTIAL. There already is a loop in which new potentials could be easily implemented. The function potentialSub in the same source file implements the subtracted 25 potential.

Post-processing scripts
The following Shell and Python 3 scripts have been included for the post-processing of the wavefunctions. They can be found in the scripts folder. Note that these scripts rely on the full snapshots being written to disk (by setting DUMPSNAPS=4), and, setting the STEPS variable to a multiple of SNAPUPDATE, i.e., ensuring that STEPS%SNAPUPDATE=0. The only scripts that do not explicitly require the full snapshots are symmetrize_wf.py and normalize.py. The other ones are part of a processing chain that post-process said snapshots explicitly and thus require their existence.
Unless otherwise stated, they should be run from the parent folder where both the data and input folders are located: symmetrize.sh: Computes the projections and normalizations of all of the wave-functions, using the scripts symmetrize_ wf.py and normalize.py. It accepts direct output stored in the data folder or it accepts wave-functions which are stored in gz compressed tar archives named wave-function.tgz which allows for batch evaluation of results from many runs. Non-normalized output files named wave-function_%i_%p.dat and normalized ones named wave-function_%i_%p_norm.dat are generated. %i=0,1,2 denotes the ground, first and second excited states, respectively. %p stands for the projection of the wave-function, as follows: wave-function_%i_all.dat.gz: The wave-function as computed by quantumfdtd, but in a single file instead of one file per computational node. wave-function_%i_p.dat.gz: Positive parity projection P + , Eq. (23).
For all of the cases but the normalized projection P − p k , the output is a gzipped ASCII file containing: For the normalized projection P − p k , the output follows the format:

postprocess.sh:
Generates plots of the potential and the wave-functions.

postprocess_sym.sh:
Generates plots of the parity projected wave-functions obtained with symmetrize.sh.

symmetrize_wf.py:
Computes the positive and negative parity projections of the given wave-function. It is called as: ../scripts/symmetrize_wf.py [params.txt] wave-function_%i_%d.dat The optional parameter [params.txt] encode the path to the config file, from which the simulation parameters like N , A, or the centering of the lattice are taken. If not provided, the script expects to be called from the data folder and tries to load the config file from the relative path ../input/params.txt. %i should be substituted by the numbers 0, 1 and 2 for the ground, first and second excited states, respectively. %d should be left on the actual call. It means that the script will look for files where %d is a numerical index starting on 0. This procedure has been implemented because quantumfdtd splits the output data between as many files as computational nodes are used. This %d is the numerical index of these files. In practice, for the ground state wave-function, the call (from the data folder) is ../scripts/symmetrize_ wf.py wave-function_0_%d.dat normalize.py: Normalizes the wave-functions to unity over the lattice volume. It is expected to be run from inside the data folder, but it can be run from any folder where wave-functions are stored. It is called ../scripts/symmetrize_wf.py wave-function.dat out.dat.

Post-processing Python module
We provide a post-processing Python 3 module in scripts/quantumfdtd.py, whose class quantumfdtd aims at helping with the analysis via Python 3 [36], Pandas [37,38], NumPy [39,40,41], and Matplotlib [42]. It accepts as input the output of either symmetrize.sh or allpostprocess.sh from the previous subsection 5.5. Note that every script relying on the class functions load_wf, load_proj_components, or get_snaps requires the full snapshots to be present.
The quantumfdtd class functions are: __init__(self, case, config=None): Constructor. It requires a case parameter that is a parent folder where input and data folders of an individual run of quantumfdtd and allpostprocess.sh are stored. There is an optional parameter, config, to provide an absolute or relative path to a params.txt (config file). The function load_ energies, that loads the energies of the final wave-function (and the intermediate snapshots, if they are taken), is also called by the constructor.
load_case(self, case, config=None): In case the default constructor is called, this function accepts the same arguments as the constructor in order to perform a proper initialization of the class. It also allows to load a different case 26 without destroying and creating a new class.

load_energies(self):
This function is implicitly called by the constructor and by the load_case function, so that it does not need to be called explicitly. It loads the eigen-energies as a function of τ into the self.e_wf variable, as a Pandas DataFrame. It also loads the decay.dat table, that encodes the eigen-energy of the ground state with a higher frequency on τ , and returns it as a DataFrame via the self.e_dec variable.

load_potential(self):
This function loads the potential and returns it as a DataFrame with the columns x, y, z (integers, coordinates in lattice space in units of A), r (in units of A), re_v and im_v (real and imaginary part of the potential, in GeV).
load_wf(self, snap, state, figure, min_sep_edge=-1., normalization=True): This function loads a wave-function and returns it as a DataFrame. If snap=-1, the final wave-function is loaded. Otherwise, an intermediate 3D snapshot is loaded, whose STEP value is given by the snap parameter. Remember that τ = STEP · EPS. The argument state is an integer value from 0 to 2, corresponding to the ground state, or to the first or second excited state. The figure variable is m, p, pk or all, corresponding to the projection that is going to be loaded: negative parity projection P − , positive parity projection P + , negative-around-an-axis P − p k and non-projected wave-function, respectively. The min_sep_edge variable keeps points that are closer to the edge than (min_sep_edge × N/2) out of the loaded wave-function. Disabled if set to negative value (default). Finally, if Normalization is set to true, the loaded wave-function is normalized. This happens after the potential dropping of points due to the min_sep_edge flag.
fix_positive_wf(self, wf, lim=0.05): This function returns the wave-function, inverting its sign if the integration of its real values for the points r < A · N · lim is negative.

get_avg_r(self, wf):
This function returns the average radius, r , of the wave-function passed by the wf parameter.

load_proj_components(self):
This function adds additional rows to self.e_wf. In these rows, the weights of the parity projections of each excited state (see subsection 2.4) are stored. If snapshots are taken, this option is very useful for measuring the decay of the different excited states through the iterative procedure explained in subsection 2.2.

get_snaps(self):
This function returns an array of snapshots. For each snapshot, the value of STEP is stored on the array, where τ = STEP · EPS.

Conclusions
In this work, the quantumfdtd code version 2.1 [14,15] has been extended to include the case of a relativistic Schrödinger equation. This is accomplished via a FFT implementation for the RQM inspired kinetic energy operator and working with the iterative solver in imaginary time. Additionally, we have extended the program to have support for arbitrary external potentials, introduced as ASCII tables, and included several analysis scripts. One of them allows to extract parity projections of the extracted wavefunctions. This greatly improves the flexibility and usability of the original program and makes this version 3 valuable even if the original FDTD solver (for the non-relativistic Schrödinger equation) is being used.
We have seen in section 3 that the ground-state wave-function is stable for high values of the time evolution parameter τ . The output wave-functions are linear combinations of the first few excited states. The overlap method is an approximation to project out the first and second excited state from the full wave-function. The evolution Eq. (18) (together with a periodic internal normalization of the evolved wavefunction), makes the weight of the excited states in the wave-function decay exponentially. These weights will eventually decay below the machine precision, eliminating the information about excited states from the runs with the highest values of τ . The parity projection scripts are a useful tool for separating excited states when they are nearly degenerated in energy which is especially noticeable for the 2s and the 2p-states. For various center-symmetric Hamiltonians we have demonstrated that this degeneracy is quantitatively restored in large enough volumes. Wherever analytic theory predictions were available, they are reproduced quite well by the simulations. The possibility to exclude certain states from the beginning by means of choosing the initial symmetry can be an advantage during demanding calculations and is also a non-trivial cross check.
To sum up: the ground-state wave-function should be easy to recover in all of the situations, provided that the corresponding energy is in a plateau and the wave-function is stable with respect to τ . For increased precision, the positive parity projection can be used. The first opposite parity state, i.e., typically the 2pstate, can be separated by means of the negative-around-an-axis parity projection operator. The 2s-state can be also separated, but is more challenging due to the behavior of the tail after the first node on a periodic potential. A large lattice volume can help here. In the examples presented in this paper, we mainly focused on relatively small lattices (N ≤ 256) that are most relevant for interfacing with lattice gauge theory simulations in order to assess the numerical accuracy of the newly introduced relativistic and non-relativistic kinetic energy terms. In practice, due to the parallelized implementation of both the FDTD and FFT-based algorithms it is possible to use quantumfdtd on quite large lattices. For example, there have been past studies with lattices as large as N = 1024 [14].
Possible future improvements include: • Implementing an automatized algorithm in order to look for saddle points of E s,i (τ ) for i = 0, 1, 2.
• Implementing new projection operators on the post-processing stage (see subsection 2.4). For instance, separation of d-wave states. For larger overlaps this may involve including contributions from higher states in the initial wave-function (see subsection 5.2).
• Implementing a suitable binary format for the wave-functions.
• Allowing for variations of the coupling strength of each potential. This could be realized by making the coupling a parameter that can be defined like, e.g., the mass or the lattice spacing. A more involved way would be to, e.g., allow for a file of pairs of scale and coupling to be read in and thereby allow to implement a rudimentary running of couplings with, e.g., the scale µ ∼ 1/r. Finally, linking the program to existing software like RunDec [43,44,45], would allow for an exact running of the strong coupling in applications that would benefit from that. Anyhow, note that new analytic potentials can be easily implemented (see the end of subsection 5.3) and arbitrary external potentials can be loaded from files (see subsection 5.4).
• Implementing an implicit Crank-Nicolson method. This exists for diffusion equations and even for the real-time non-relativistic Schrödinger equation, and makes the evolution unconditionally stable regardless of the choice of time step. The main issue with this concerning the relativistic Schrödinger equation, however, is that the relativistic Schördinger equation is a non-linear partial differential equation, due to the non-linear kinetic term M 2 + i=1,2,3 p 2 i . This spoils the discretization of the FDTD method, requiring solving of non-linear equations inside the finite-difference time discretization. This numerical solution of a system of non-linear algebraic equations inside each step of the FDTD method is also likely to introduce further numerical instabilities.
• Permitting the (anti-)-symmetric solution for a two-particle system in a background potential with an interaction potential between the two particles would be of interest for tetraquark phenomenology. Evolving the wave-functions of two non-interacting particles in a background potential would be a straightforward, and a factor 2 more expensive. Implementing a nontrivial interaction between the two particles would require implementation of new MPI communication procedures in a rather straightforward manner, but would increase the cost of evaluating the potential by a factor 2+${NCORES} instead of 2 and create some additional communication overhead. In both cases two-particle (anti-) symmetrization could be easily realized during the postprocessing stage, or during the SNAPUPDATE stage, where geometric (anti-) symmetrization is performed if requested.
Such improvements are beyond the scope of the present work. However, it would be useful to consider them as a guidance for further improvements of the quantumfdtd program.