Cost-Efficient High-Resolution Linear Absorption Spectra through Extrapolating the Dipole Moment from Real-Time Time-Dependent Electronic-Structure Theory

We present a novel function fitting method for approximating the propagation of the time-dependent electric dipole moment from real-time electronic structure calculations. Real-time calculations of the electronic absorption spectrum require discrete Fourier transforms of the electric dipole moment. The spectral resolution is determined by the total propagation time, i.e., the trajectory length of the dipole moment, causing a high computational cost. Our developed method uses function fitting on shorter trajectories of the dipole moment, achieving arbitrary spectral resolution through extrapolation. Numerical testing shows that the fitting method can reproduce high-resolution spectra by using short dipole trajectories. The method converges with as little as 100 a.u. dipole trajectories for some systems, though the difficulty converging increases with the spectral density. We also introduce an error estimate of the fit, reliably assessing its convergence and hence the quality of the approximated spectrum.


Introduction
The rapid advancement of laser technology in the past decades allows us to probe matter on spatiotemporal scales that approach the characteristic time and length scales of the electron, opening the field of attosecond science. 1,2 This development has forced quantum chemists to shift their attention from the time-independent to the time-dependent Schrödinger and Dirac equations. [3][4][5] Numerical approaches to laser-driven electron dynamics are often labelled real-time methods to distinguish them from the response-theoretical methods to the time-dependent Schrödinger/Dirac equation, the latter solving the equations of motion perturbatively in the frequency domain. 6,7 Even without explicit reference to results derived from perturbation theory such as, e.g., Fermi's golden rule, it is still possible to extract linear and low-order nonlinear optical properties from nonperturbative real-time simulations, including electronic absorption spectra and time-resolved pump-probe absorption spectra that would be hard or impossible to compute using response theory-see Refs. [8][9][10][11][12] for recent examples.
In this work, we focus on electronic absorption spectra extracted from electron-dynamics simulations driven by a Dirac-delta impulse, which excites the molecule into all dipole-allowed excited states simultaneously. 8 Due to the nonperturbative nature of real-time methods, the resulting spectrum may contain nonlinear (e.g., two-photon) as well as linear absorption lines, 11 depending on the field strength of the broad-band laser pulse. In practice, the induced electric dipole moment is recorded in the course of the simulation and subsequently transformed to the frequency domain to yield the absorption cross section. When using the conventional discrete Fourier transform to process the signal, the spectral resolution is inversely proportional to the number of time steps N and the time-step length ∆t, as ∆ω = 2π/(N ∆t). Obtaining sufficient spectral resolution typically requires tens to hundreds of thousands of time steps since ∆t cannot be increased beyond a certain limit if rapid oscillations of the electron density are to be captured. Moreover, increasing ∆t reduces the accuracy and stability of the numerical integration scheme used to propagate the electronic degrees of freedom. As the computational effort in each time step requires multiple rebuilds of the Hamiltonian matrix, it is comparable to several iterations of a ground-state optimization within the chosen electronic-structure model. Hence, there is considerable interest in decreasing the number of time steps required to achieve sufficient spectral resolution.
Previous efforts to improve the spectral resolution have been made by estimating excitation energies through various signal processing techniques. 13-15 More recently, Bruner et al. 16 investigated the use of Padé approximants to interpolate the discrete Fourier transforms used for the absorption spectrum. These are all methods operating in the frequency domain, leaving no other validation options than comparison with a fully propagated spectrum.
The original periodic signal is typically damped using a decaying exponential function to reduce unwanted artefacts arising when the discrete Fourier transform is applied to oscillating functions in simulations with finite trajectory length. In the time domain, the number of time points can be increased by padding the damped signal with zeros, leading to finer spectra. However, this artificial extension of the trajectory length can only be applied on sufficiently damped signals.
In this work, we investigate a more sophisticated and powerful alternative: the extrap-olation of a short signal. The discrete Fourier transform of an extrapolated signal achieves increasingly higher spectral resolution as the extrapolation length increases. This requires the development of a stable and reliable method for time-series forecasting. The inherently harmonic character of the time-dependent wave function in the absence of an external field suggests that such forecasting of molecular properties should be possible. Importantly, the forecasted signal can be verified in the time domain by comparing it with relatively few additional time steps. To the best of our knowledge, no published work exists on improving the spectral resolution by such extrapolation of the time-dependent dipole moment.
The current success and popularity of machine learning is undeniable, including use cases in chemistry, 17-20 and one might be tempted to leverage artificial neural networks for forecasting the time-dependent electric-dipole moment. However, while artificial neural networks are powerful tools for pattern detection in large data sets, they struggle with precise and reliable extrapolations. 21,22 Although the universal approximation theorem 23 tells us that an excellent interpolation can be achieved, it does not guarantee a stable extrapolation. In order to achieve a stable extrapolation, over-fitting must be avoided by enforcing sufficient restrictions.
In this paper, we present a novel approach for obtaining high-resolution absorption spectra from real-time simulations of laser-driven electron dynamics by exploiting a priori knowledge of the form of the dipole function from quantum mechanics in a finite-dimensional Hilbert space. The form of the dipole function thus is motivated by the underlying physics, with unknown parameters to be determined by fitting a short dipole trajectory from a realtime simulation. The fitted function may be evaluated at any point in time, meaning that it can be extrapolated in the time domain to arbitrary future time. This further implies that we can achieve arbitrary spectral resolution. For sufficiently weak Dirac-delta impulse, the evaluation of absorption spectra based on these fitted functions may use analytical expressions for the linear response function. 6 Working in the time domain, a quantitative error measure of the fitted dipole function can be monitored during the course of the real-time simulation and used to evaluate convergence. This way, an unnecessarily long real-time propagation can be avoided by automatically terminating the propagation upon the convergence of the fit. The developed method is independent of the quantum mechanical model and is tested with several molecular systems using mainly real-time time-dependent density-functional theory (RT-TDDFT). 8,[24][25][26][27][28][29] Despite certain flaws arising mainly from the almost universally adopted adiabatic density-functional approximation, 3,30 RT-TDDFT is the far most widely used electronic-structure method for laser-driven electron dynamics. With computational costs comparable to (or below ) timedependent Hartree-Fock theory, 31 RT-TDDFT takes into account electron-correlation effects that would otherwise require advanced and computationally expensive wave-function theories. 4,5 To demonstrate the independence of the underlying electronic-structure theory, we also present results obtained from real-time time-dependent configuration interaction singles (RT-TDCIS) 32-34 theory.
We will start with a short presentation of the electric-dipole approximation within realtime simulations before introducing the proposed method for fitting the time-dependent electric-dipole moment. After briefly laying out the simulation details for the real-time simulation of a selection of systems, the results of the fitting method on these systems are presented and discussed. Finally, we reflect on the performance of the fitting method and discuss potential future improvements.

Theory
In this work, we employ the following conventions: Subscripts u, v denote Cartesian components, vectors are typed in boldface, and quantum-mechanical operators are denoted by a hat. Following the convention of response theory by Olsen and Jørgensen, 6 we define the Fourier transform and its inverse according tõ Atomic units are used throughout unless otherwise specified.

Real-time simulations of absorption spectra
Within the clamped-nucleus Born-Oppenheimer approximation, real-time simulations of electronic absorption spectra typically assume the electric-dipole approximation, where a molecule is subjected to a time-dependent spatially uniform electric field, F (t). The timedependent Hamiltonian readsĤ whereĤ 0 is the time-independent electronic Hamiltonian, and the interaction operator is given byV whereμ is the electric dipole operator. The linear polarization direction of the electric field is determined by the real unit vector u, such that the field aligns with one of the Cartesian axes. This implies the formV (t) = −μ u F (t), whereμ u its component along the polarization direction.
We assume that the electronic system is in the ground state |0 at time t < 0, and that the external field F (t) is only active between t = 0 and time t 0 ≥ 0. At time t 0 , the Hamiltonian reduces to the time-independent Hamiltonian such that Schrödinger's equation The time-dependent wave function in the absence of the external field oscillates around the solution at time t 0 , |Ψ(t 0 ) = n k n (t 0 ) |n , as given by where |n denotes a normalized eigenfunction of the unperturbed Hamiltonian,Ĥ 0 |n = E n |n . 35,36 This formulation is exact when the electronic continuum is excluded, e.g., by choosing a fixed, finite basis as commonly done in quantum chemistry. Actual simulations are not performed in the energy eigenbasis but in, e.g., a basis of Slater determinants, implying that the coefficients k n (t 0 ) are not known.
In order to obtain the electronic absorption spectrum averaged over all molecular orientations relative to the electric field, the time-dependent electric dipole moment µ u (t) = Ψ(t)|μ u |Ψ(t) is calculated in three separate simulations with the electric field polarized in each of the three Cartesian directions (u = x, y, z). The absorption cross-section is then obtained from the Fourier transform of the dipole moments,μ u (ω), as 37 where c is the speed of light. The resulting spectrum contains both linear (one-photon transitions between the ground and excited states) and nonlinear (multi-photon transitions between the ground and excited states, and one-and multi-photon transitions between excited states) absorptions, as recently stressed by Guandalini et al. 11 We note that only the induced dipole moment, that is the total dipole moment with the static ground-state part subtracted, contributes to the absorption cross section but, for notational convenience, we will only distinguish between that and the total dipole moment when it is strictly required.
Since the dipole moment is calculated on a finite discrete time grid, the Fourier transforms are replaced by discrete Fourier transforms, thus introducing artificial periodic boundary conditions. To avoid artefacts from these, the dipole moment is multiplied by a damping factor before the discrete Fourier transform, i.e., where we have used that the induced dipole moment vanishes for t < 0. The Fourier transform thus becomes a Laplace transform. The parameter γ ∈ R + can be interpreted as a common (inverse) lifetime of all excited states, giving rise to Lorentzian line shapes in the simulated absorption spectra. 38 The discrete Fourier transform, however, requires a very large, often prohibitive, number of time steps to achieve sufficient spectral resolution. In the following sections, we will describe an extrapolation technique aiming at high resolution with short simulation time.

The expected form of the electric dipole moment
Once the external field is turned off, the time-dependent electric dipole moment oscillates with the Bohr frequencies ω nm = E n − E m according to for time t ≥ t 0 . The function form of the approximated dipole momentμ u (t) ≈ µ u (t) will therefore be given bȳ where N u ω is the number of participating frequencies ω u i . If we can determine these frequencies and their corresponding real coefficients c u i and c u N u ω +i from a short dipole time series, we obtain a continuous dipole function and, hence, infinite spectral resolution.
As will be described in detail below, we estimate the participating frequencies using the poles of a Fourier-Padé approximant, while the linear coefficients are determined using linear regression in a subsequent step.

Estimating Bohr frequencies
In order to estimate the frequencies of the dipole moment, we will investigate the singular points of the Fourier-Padé approximant, originally introduced in real-time quantum simulations by Bruner et al. 16 In general, the Padé approximant is used to accelerate the convergence of a truncated power series. The discrete Fourier transform can be written as the power seriesμ where z depends on the frequency according to The diagonal Fourier-Padé approximates the Fourier transform using two polynomials P u (z) and Q u (z) of degree M = (N t − 1)/2, where the coefficients of the polynomials create a Toeplitz linear system. For details see Ref.
16. The Fourier-Padé poles, denoted z u p , are found by where the damping parameter γ is set to zero. The Bohr frequencies are positive and realvalued, while the frequencies corresponding to roots of Q u (z) will be complex. The number of roots of Q u (z), (which amounts to M roots), should also significantly exceed the number of Bohr frequencies. The potential frequencies are given by considering only roots Im(z u p ) > 0, as the complex conjugate root theorem states that complex roots will form conjugate pairs. These conjugate pairs yield duplicates of the real-valued frequencies.
Real-valued roots z u p yield purely complex frequencies, and are therefore also excluded. The potential frequencies ω u p discard the imaginary component and should represent extrema of the Fourier-Padé spectrum, not singular points like z u p .
Estimating the potential frequencies uses the Python NumPy 39 library to compute the eigenvalues of the companion matrix 40 of the polynomial Q u (z) to determine its roots. This method exhibits poor scaling with respect to the number of time points N t , representing a computational bottleneck of the dipole-momentμ u (t) fitting procedure. In real-time simulations using very small time steps, one may safely increase the step length on the dipole data used to create the Fourier-Padé approximant to alleviate the computational cost. As shown by Mattiat and Luber, 41 the convergence of the Fourier-Padé approximant is mostly impacted by the trajectory length N t ∆t, and not the time step itself. However, the discrete Fourier transform, and hence also the Fourier Padé, is periodic with a cycle length of 2π/∆t.
Peaks above π/∆t will fold back due to anti-symmetry and appear as negative duplicates polluting the spectrum. Therefore, it is crucial to keep the time step ∆t < π/ω max , where ω max is the largest significant frequency in the signal.
The potential frequencies ω u p must be classified as either an estimated frequency or a redundant root. The classification is based on the assumption that ln z u p /(i∆t) should have a significant imaginary component if ω u p is a redundant root, while it should lie close to the real axis if ω u p corresponds to an actual Bohr frequency. This further means that Q u (z(ω u p )) should be close to zero and that [M/M ] µu (ω u p ) should be large for estimated frequencies.
Hence, we create a two-dimensional representation r u p of the prospective frequencies ω u p given by where the unnormalized features are defined as The base-10 logarithm is used to manage the extreme scaling of both features, as prospective frequencies should cause Q u (z(ω u p )) to approach zero and hence be a nearly singular point of [M/M ] µu (ω u p ). The features are constructed such that estimated frequencies should be close to r u p = (0, 0), while redundant roots should be closer to r u p = (1, 1).
We use the K-means clustering algorithm (see e.g. 42,43), implementation from the Python SciKit-Learn 44 library, with K = 2 to classify prospective frequencies. The 2means clustering algorithm is a computationally inexpensive way to separate a set into two categories. The centroid for the cluster of potential frequencies should be closer to (0, 0), whereas the centroid for the redundancy cluster should be closer to (1, 1).

Determining the linear coefficients
Once the frequencies are estimated, the linear coefficients are determined using linear regression. The coefficients are optimized by minimizing the cost function, 45 Using the general form of the dipole moment in Eq. (9), the linear coefficients may be optimized using a simple least squares optimization. The only restraint on the optimization of these coefficients is that they are real. This fitting procedure is general for any type of external field, F (t). However, as shown by Hauge, 46 restricting the coefficients is crucial to avoid over-fitting the dipole moment.
The formμ u (t) in Eq. (10) is based on the full dipole moment, correct through all orders in perturbation theory and is independent of the electric field. In this work, we will use a Dirac delta-type impulse 8 of strength κ, which has an infinitely wide frequency distribution and thus generates the full absorption spectrum for the given polarization direction. This implies that t 0 = 0. Further, we assume that the electric field strength is sufficiently weak, such that we may regard the interaction operatorV (t) as a time-dependent perturbation and assume that the interaction only induces one-photon transitions from the ground state-i.e., a linear absorption spectrum. The electric dipole moment should then be of the form µ u (t) ≈ µ where the zeroth order dipole moment corresponds to the ground-state value, µ (0) u = µ u (t = 0). We will now investigate an analytical expression for the first-order correction to the dipole moment induced by a weak Dirac delta impulse.
We start with the exact expression for the linear response function, 6 where we have used the Fourier transform of the interaction operatorV (t), The linear response function and the first-order correction to the dipole moment are related By using well-known Laplace transforms, the first-order correction is identified as a linear combination of sine waves: We have also used that the first-order perturbation correction to the dipole moment should only include one-photon transitions. This further means that the approximated dipole moment, when using a weak Dirac delta impulse, should have the form where all sine coefficients are positive.
The coefficients ofμ u (t), approximating the dipole moment from the Dirac delta impulse, are optimized using the least absolute shrinkage and selection operator (LASSO) 47 method.
The coefficients are determined according to where λ is the shrinkage parameter restricting the magnitude of the coefficients, c u i . In contrast to the ordinary linear least-squares algorithm, the LASSO method is iterative and therefore somewhat less computationally efficient. In return, this makes it possible to enforce positive coefficients, as in the implementation by SciKit-Learn. 44 This makes the method less prone to over-fitting.

Molecular orbital decomposition
The electric dipole moment can be written as a sum of contributions from elementary molec- where i and a label occupied and virtual MOs, respectively. The components µ ia u (t) are then approximated separately. This MO decomposition can divide a dense spectrum into a series of sparser spectra and aid in the assignment of absorption lines. 8 Clustering the MO components into groups can be used to offset the increased memory consumption. 48 For the fitting method, creating clusters with well-separated frequencies could also reduce the accumulation of errors when summing the component fits. and their corresponding partial spectra may contain negative peaks. Only the full spectrum, i.e., the sum of the components, is guaranteed to contain positive peaks exclusively. The ordinary least squares method must therefore be used when optimizing the linear coefficients of the individual components, which may introduce additional errors due to over-fitting in each component.
Alternatively, the fitting algorithm may estimate the frequencies of each component separately and then optimize the linear coefficients for the full dipole moment. This way, the additional coefficient restrictions can be used in the optimization. In our experience, however, this produces a vast number of estimated frequencies leading to problems with over-fitting even when enforcing positive linear coefficients.

Convergence criterion
The goal of the fitting method is to accurately construct the functionμ(t) using the shortest The error of the fit is estimated using one minus the coefficient of determination, R 2 , i.e., where µ m u is the mean value of the induced dipole moment. The error measure is unitless and independent of the magnitude of the dipole moment. The fitting method can be run in parallel with real-time simulations, which are terminated once E u drops below a pre-defined threshold value. The computational cost of the fitting method is not insignificant, and we recommend that it is run once per time intervals of 50 − 100 a.u. when used to automatically terminate the real-time simulation.
The error measure E u cannot distinguish error contributions from different parts of the spectrum, preventing termination once the desired frequency region is converged. In order to focus on valence excitations in the low-frequency region, we apply a low-pass (smoothing) filter to remove frequencies above a cut-off frequency ω max from the dipole moment in the time domain. We used a Butterworth filter, implemented by SciPy, 49 which removes the high-energy part of the spectrum while leaving the lower-energy part almost unchanged. If bound core excitations are the main targets, a high-pass filter must be used instead.

Computational details
We test the dipole extrapolation scheme using RT-TDDFT simulations, supplemented by a few RT-TDCIS simulations to demonstrate its applicability to wave-function-based theo- Implementation of the fitting algorithm can be found at https://github.com/HyQD/ absorption-spectrum.

Results
All reference spectra in this paper are produced from low-pass filtered electric dipole moments with a trajectory length of 4000 a.u., such that the spectral resolution becomes ∆ω = 1.6 · 10 −3 a.u.. In this paper, the resolution of the fitted spectrumS(ω) is the same as its reference spectrum. This is to allow direct comparisons of the two spectra, though the resolution of S(ω) could be made arbitrarily fine. The Fourier transform of the approximated dipole momentμ u is calculated according tõ The half-life parameter was always set to γ = 0.5 · 10 −3 π, and the spectra were cut at an estimated ionization energy of 0.5 a.u. − HOMO , where the energy of the highest occupied molecular orbital HOMO of all systems are listed in the supplementary information.
Using the low-pass filter will leave the lower energy part of the absorption spectrum unaltered, while the higher energy part is removed and set to zero. Differences between filtered and unfiltered spectra are shown in the supplementary information. The low-pass filter does not give a clean cut-off at the cut-off frequency ω max but rather a gradual lowering of the peak intensity around ω max . The cut-off frequency should therefore be set somewhat higher than the desired range of the spectrum. We have used ω max = 4 a.u. for all systems.
When fitting the dipole moment, the available trajectory is from when the external field is turned off at t = 0 to time t = T u ver . The linear coefficients are determined on the time The error in the spectrum E S is calculated the same way as in the time domain, as given in Eq. (29).

Performance on a selection of systems
For each spatial direction, the convergence of the fit is tested every 50 a.u. in time of the trajectory length, starting from T min = 100 a.u.. The simulation is terminated when the fit has converged below a given threshold or when the trajectory length reaches T max = 1000 a.u..
The convergence criterion was set to E u < 10 −3 , a strict threshold corresponding to a nearperfect fit. Since the real-time calculations on a given system using three spatial directions of the external field are independent, the trajectory length needed for a converged fit might vary between the three simulations.
The required trajectory length of each spatial direction T u ver and their corresponding verification error E u as well as the error in the spectrum E S are listed in Tables 1 and 2.
Systems with relatively sparse spectra (CH 4 , CO 2 , H 2 O, LiH, and NH 3 ) also converged nicely with short dipole trajectories (T u ver ≤ 350 a.u.). As the spectral density increases, the fitting method struggles to converge. Systems like C 2 H 6 and CH 2 O only converged when using a double-zeta basis set, while the fitting of C 6 H 6 and CH 3 OH did not achieve errors below the low threshold.
The CH 2 O molecule with a double-zeta basis set converged for both real-time methods.
The spectra of the fit in both cases are nearly indistinguishable from their reference spectra.
A comparison between the approximated and reference spectrum from RT-TDDFT calculations is shown in Fig. 1 compared to its reference spectrum (E S = 3 · 10 −3 ). Its spectrum is shown in Fig. 2, and was the approximated spectrum with the most visible deviation from its reference spectrum among the converged systems. The approximated spectrum shows a deviation in a peak at ω ≈ 0.75 a.u., but the rest of the peaks correspond well to the reference spectrum.  The fitting method only partially converged for C 6 H 6 , as well as C 2 H 6 and CH 2 O with triple-zeta basis, meaning that error of the fit was below the set threshold in only one or two of the spatial directions. Still, the spectral error in all three cases is quite low. The result of the fitting of benzene is shown in Fig. 3, which had the largest spectral error (E S = 4 · 10 −2 ) of the three. The trajectory length needed for the fitting method to converge strongly depends on the spectral density. We observed a trend in that the fitting becomes increasingly difficult as the spectral density increases. Increasing either the number of electrons in the system or the size of the basis set will in general require longer real-time simulations before the fitting method converges. The trend with increasing basis set size is clearly seen from the fitting of CO 2 from RT-TDCIS calculations. The simulation using the cc-pVDZ basis set converges faster (T u ver = 100 a.u.) than when using the larger basis sets like the aug-cc-pVDZ basis set (T u ver ≤ 250 a.u.) or aug-cc-pVTZ basis set (T u ver ≤ 300 a.u.).
Only the fit of CH 3 OH did not get errors below the convergence threshold in any of the spatial directions. This was true for both the double and tripe-zeta basis (from RT-TDDFT calculations). The result using a triple-zeta basis set is shown in Fig. 4 and is the case with the highest error in the time domain, E u ∼ 10 −1 . There is a significant deviation from the reference spectrum, though the main features are intact. Of all systems in this paper, this gave the worst approximation to the reference spectrum. Despite this, the spectrumS(ω) still provides a reasonable coarse approximation.
These results are promising in all cases as the converged fit seems to reproduce its refer- The reference spectrum is in solid blue, while the yellow dashed line shows the spectrum of the fitted functions. The fitting error was E x = 3 · 10 −1 , E y = 8 · 10 −1 , and E y = 4 · 10 −1 .
ence spectrum reliably with only minor deviations in the peak intensities. The error of the fit E u also correlates with the spectral error, E S . This predictability is crucial if the convergence criterion is used to automatically terminate real-time simulations. Our results also indicate that the convergence criterion used in this study is stricter than necessary. A slight relaxation in the criterion might lead to faster convergence without significantly impacting the quality of the approximated spectrum.
For the estimated dipole moment, the frequencies and their corresponding linear coefficients are known. For a successful fit, one may therefore obtain the transition probability | 0|μ u |n | 2 of a transition with energy E n − E 0 directly from the linear coefficient, as | 0|μ u |n | 2 = B u n /(2κ). This could be used to calculate the oscillator strength and create stick spectra. However, estimated frequencies in different spatial directions but corresponding to the same transition will have a small error associated with the frequencies. In order to compute the oscillator strength, one would therefore have to assess which estimated frequencies across spatial directions correspond to the same transition.
The convergence of the dipole moment fitting depends primarily on the frequency estimation. When the fit does not converge, it follows that the Fourier-Padé approximant is not sufficiently converged to accurately capture the Bohr frequencies. The quality of the Fourier-Padé depends on the dipole trajectory length t N rather than the number of steps or step length. 41 However, there is no given final time t N ensuring convergence, the necessary trajectory length depends on the spectral density. High spectral density can cause the Fourier-Padé to fail, even for relatively long simulations. The fitting method introduces a measure of the error E u which does not rely on any reference spectrum. This introduces a more reliable way of estimating the error in the approximated spectrum.

Fitting using MO decomposition
We assessed the performance of the fitting procedure used to extrapolate the components of the dipole moment of C 6 H 6 decomposed to MO pairsμ ia u in the RT-TDDFT calculation.
Instead of creating a fitting function for each individual MO pair, which would increase the memory overhead, we clustered the componentsμ ia u into groups of ten. These groups are formed so that the overall sparsity of spectra obtained for each cluster is maintained. This is accomplished by spreading the individual constituents of the cluster across the energy range.
A fitted function of each cluster was then created from the sum over the MO pairs that the cluster contains. The total error is measured for the full dipole moment, µ u (t). Some of the MO pairs were omitted entirely, making the MO decomposition work as a low-pass filter.
When fitting components, the low-pass filter is therefore not needed.
Fitting the decomposed signal, however, did not improve the convergence compared to when the full dipole moment was used for extrapolation. Simulations for both directions µ x and µ y reached the max trajectory length (t = 1000 a.u.) without the fitting error going below the error threshold. The errors (E x = 1 · 10 −2 and E y = 1 · 10 −2 ) were only slightly lower compared to fitting without the MO decomposition. The last spatial direction µ z converged at T ver = 650 a.u. (E z = 4 · 10 −4 ), which is somewhat slower than without MO decomposition. The spectral error was E S = 9 · 10 −3 , which corresponds to a low spectral error.
Although the MO decomposition did not lead to accelerated convergence of the fitting method, we still observed improvements. For example the simulation with T u ver = 600 a.u.
has a lower error of the fit for the decomposed dipole moment (E x = 6 · 10 −2 , E y = 4 · 10 −2 and E z = 1 · 10 −3 ) for all spatial directions compared to the fit on the full dipole moment, (E x = E y = 3 · 10 −1 and E z = 2 · 10 −3 ). The decomposed fit in Fig. 6 (E S = 3 · 10 −2 ) is visibly improved compared to the fit using the full dipole moment in Fig   It is important to note that the scope of our testing of the fitting method using MO decomposition was limited. Previous success using the Fourier-Padé approximant in combination with the MO decomposition on RT-TDDFT data suggests that this in many cases is very effective. 16 Our system could be particularly unlucky in this regard, however, our study raises cause for caution regarding the use of the Fourier-Padé approximant. The Padé can struggle, even when using MO decomposition. The unknown amount of error introduced to the final spectrum by this procedure remains an open problem that the user should be aware of when analyzing spectra with the Fourier-Padé method with the MO decomposition.
The particular form of the components µ pq u is also very dependent on the quantum mechanical method used to compute the time-dependent wave function. Using MO decomposition on the electric dipole moment from RT-TDCCSD calculations leads to large overlaps in frequencies among different components. 46 The usefulness of such decomposition might vary between the different quantum mechanical frameworks.

Conclusion
We have developed a novel method for creating functions approximating the electric dipole moment from real-time calculations. The fitted functions for the dipole moment in the three spatial directions can then be used to produce absorption spectra with arbitrary high resolution. Real-time calculations of absorption spectra require the use of the discrete Fourier transforms, demanding long simulation times to obtain high spectral resolution. In our work, we have shown that the length of the real-time simulations, and hence the computational cost, can be greatly reduced by the developed fitting method.
We introduced a quantitative error measure to evaluate the convergence of the fit. For all systems in this work, a converged fit reliably reproduced the reference spectrum from long real-time calculations. In order to reduce the computational cost of calculating absorption spectra, the real-time calculations should be automatically terminated once the convergence criterion is reached.
In this work, we set the verification window to be 25% of the available dipole trajectory.
The critical step of the method is determining the frequencies, which always uses all available data. For the linear optimization, the verification window should include an entire period of the smallest frequency in the signal, as an insufficiently large verification window may lead to misleading error estimates. In future work, the verification window should depend on an estimate of the smallest frequency in the signal based on differences in the molecular orbital energies.
The fitting method converged with as little as 100 a.u. long trajectories in time for systems with sparse spectra. Convergence slows down as spectral density increases, even leading to failure of convergence in some cases. The current version of the fitting method shows encouraging results for smaller systems, although aspects of the method require further investigation.
Our testing of the fitting method using molecular orbital decomposition of a single system gave mixed results. The decomposition did not enable the fit to meet the convergence criterion, although we observed improvements in the approximated spectrum. This motivates the need for further investigations.
An apparent weakness of the current implementation is the way of estimating frequencies.    Each system's highest occupied molecular orbital energy is listed in Tables 11 and 12, for the calculations using RT-TDDFT and RT-TDCIS respectively.      Li 0.0000 0.0000 0.0000 H 0.0000 0.0000 1.6299 Table 9: Molecular geometry of H 2 .

Supporting Information Available
x H 0.0000 0.0000 0.0000 0.0000 0.0000 0.7408  aug-ucc-pVDZ −0.2807  The effects of the discarding high-energy molecular orbital transitions on the spectrum of C 6 H 6 using the aug-ucc-pVDZ basis set in a RT-TDDFT calculation is shown in Fig. 5.
Here, the spectrum of the molecular orbital components (dark green line) is compared to the full unfiltered spectrum (faded blue line).

Spectra of the approximated functions
The                         T x ver = 1000 a.u. T y ver = 1000 a.u. T z ver = 650 a.u. Figure 28: Spectrum of C 6 H 6 using the aug-ucc-pVDZ basis in a RT-TDDFT simulation, discarding high-energy molecular orbital components. The spectrum of the approximated dipole moment using molecular orbital decomposition is shown by the dashed orange line, while the solid dark green is the reference spectrum.           T x ver = 100 a.u. T y ver = 100 a.u. T z ver = 300 a.u.