A room temperature CO$_2$ line list with ab initio computed intensities

Atmospheric carbon dioxide concentrations are being closely monitored by remote sensing experiments which rely on knowing line intensities with an uncertainty of 0.5% or better. We report a theoretical study providing rotation-vibration line intensities substantially within the required accuracy based on the use of a highly accurate {\it ab initio} dipole moment surface (DMS). The theoretical model developed is used to compute CO$_2$ intensities with uncertainty estimates informed by cross comparing line lists calculated using pairs of potential energy surfaces (PES) and DMSs of similar high quality. This yields line sensitivities which are utilized in the reliability analysis of our results. The final outcome is compared to recent accurate measurements as well as the HITRAN2012 database. Transition frequencies are obtained from effective Hamiltonian calculations to produce a comprehensive line list covering all $^{12}$C$^{16}$O$_2$ transitions below 8000 cm$^{-1}$ and stronger than 10$^{-30}$ cm/molecule at $T=296$~K.


Introduction
The quantity of carbon dioxide (CO 2 ) in Earth's atmosphere is thought to have a key role in climate change and is therefore being closely monitored. Several agencies are flying experiments or whole missions, for example GOSAT [1], OCO-2 [2] and ASCENDS [3], to explicitly monitor the atmospheric CO 2 content. Similarly, international ground-based networks such as TCCON [4] and NDACC [5] are also dedicated to monitoring atmospheric CO 2 content. A major aim of this activity is to establish CO 2 concentrations at the parts per million (ppm) level or, preferably, better. These projects will aim not only to look at overall CO 2 concentration and its variation; it is of particular interest to pinpoint where CO 2 is being produced (sources) and where it is going (sinks). This activity is clearly vital to monitoring and hopefully controlling CO 2 and hence climate change [6].
All CO 2 remote sensing activities, both from the ground and space, rely on monitoring CO 2 vibration-rotation spectra and therefore are heavily dependent on laboratory spectroscopy for reliable parameters; it is only through these parameters that atmospheric spectroscopic measurements can be interpreted. These spectroscopic parameters are of three types: line centers, line profiles and line intensities. Line centers or positions are established to high accuracy in many laboratory high resolution spectroscopy studies and in general do not require significant improvement for studies of Earth's atmosphere. Line profiles are more difficult but significant progress on these has been made in recent years with, for example, the inclusion of line mixing in both the HITRAN database [7] and many retrieval models, and the move beyond Voigt profiles [8]. Here we focus on line intensities for the main isotopologue of carbon dioxide, 12 C 16 O 2 .
In the laboratory it is much harder to determine line intensities accurately than line frequencies. Typical accuracies for experimental line intensity data used in atmospheric models and retrievals are only 3 to 10 % [9,10,11,12] and, until recently, the best published measurements, e.g. Boudjaadar et al. [13], only provided accuracies in the 1 to 3 % range, still very significantly worse than the precision of 0.3 to 1 % required by modern remote sensing experiments [14,15,16].
Recently there have been a number of laboratory measurements aimed at measuring absolute CO 2 line intensities with the high accuracy needed for remote sensing [17,18,19,12,20,21,22]. With the exception of recent work by Devi et al. [21], these studies have all focussed on obtaining the highest possible accuracy for a few lines or even a single line. These investigations will be discussed further below. While they clearly do not provide the volume of data needed for remote sensing studies, they do provide benchmarks that can be used to assess calculated intensities such as those provided here.
Approximately 20 000 transitions of 12 C 16 O 2 have been measured experimentally; the experiments up to 2008 were reviewed by Perevalov et al. [10] and more recently by Tashkun et al. [23].
There have been a number of attempts to use theory to provide intensities for CO 2 . Wattson et al. [24,25] 27,28]. In particular, Huang et al. provide the most accurate currently available potential energy surface (PES) for the CO 2 system. A widely-used alternative theoretical approach is based on effective operators for the Hamiltonian and the spectroscopic dipole moment [29]. Currently, the effective Hamiltonian approach achieves one order of magnitude better accuracy for 12 C 16 O 2 frequencies than the best-available PES [26]. Within this framework, the calculation of intensities requires eigenfunctions of an effective Hamiltonian whose parameters were fitted to observed positions of rotation-vibration lines as well as dipole moment operators tuned to observed transition intensities. This approach has been used to create dedicated versions of the carbon dioxide spectroscopic databank (CDSD) for room-temperature [23] and high-temperature [30,31] applications.
Recently a number of studies have shown that it is possible to compute line intensities using dipoles from ab initio electronic structure calculations with an accuracy comparable to, or even better than, available measurements [20,32,33,34,35]. The intensity of a line depends on the transition line strength, which is obtained quantum-mechanically from the integral

$$S_{if} = \left| \sum_{\alpha} \langle i | \mu_{\alpha} | f \rangle \right|^2, \qquad (1)$$

where $|i\rangle$ and $|f\rangle$ are the initial and final state rovibrational wavefunctions of the molecule and $\mu_{\alpha}$ is a component of the dipole moment surface (DMS). The requirements for accurate linestrengths are therefore high quality nuclear motion wavefunctions and DMSs. Lodi and Tennyson [33] developed a procedure which provides estimated uncertainties on a transition-by-transition basis based on the evaluation of multiple line lists. They initially applied this procedure to water vapor spectra. Their data were used to replace all H 2 17 O and H 2 18 O intensities for water in the 2012 release of HITRAN [36]. These data have since been critically assessed and verified empirically for the 6450 to 9400 cm −1 region [37]. The present study combines the high accuracy ab initio DMS presented by Polyansky et al. [20] and the methodology of Lodi and Tennyson, which required some extension for the CO 2 problem. This is discussed in the following section.
The current release of HITRAN [36] takes its CO 2 line intensities substantially from two sources: the Fourier transform measurements of Toth et al. [38] and an unpublished version of CDSD [39]. The CDSD, whose intensities are accurate to about 2-20 % depending on the vibrational band, has recently been updated and released as CDSD-296 [23]. The uncertainty estimate is up to 20 % for many transitions and is probably rather conservative.

Methodology
The Lodi-Tennyson method [33] for validating linelists on a purely theoretical basis relies on accurate ab initio transition intensity calculations. It requires an accurate procedure for obtaining nuclear motion wavefunctions, together with the use of at least two DMSs and two PESs. These aspects are described below.

Ab initio surfaces
The first stage in the molecular linelist evaluation process involves computing energy levels and rotational-vibrational wavefunctions. Our approach utilizes an exact nuclear kinetic energy operator following the framework proposed by Tennyson and Sutcliffe [40,41,42,43] and implemented in the DVR3D suite [44]; the quality of the electronic PES provided is of primary importance. Energy levels and rotational-vibrational wavefunctions obtained in this way are then used in intensity calculations, which additionally require a DMS as input. The accuracy of the resulting line positions depends strongly on the quality of the PES, while line intensities depend on both the PES and the DMS. Therefore, in order to generate high accuracy line intensities, it is necessary to provide these two essential functions with the highest possible accuracy. The present state-of-the-art ab initio PESs are capable of reproducing experimental energy levels to 1 cm −1 accuracy, which remains insufficient for high resolution spectroscopy purposes. Hence empirical fitting of ab initio surfaces has become a standard procedure. This semi-empirical approach is much less successful in the case of DMSs, partly due to technical difficulties in obtaining accurate experimental data, suggesting that the use of ab initio DMSs is a better choice [45]. It is natural to ask how different PESs and DMSs affect energy levels and line intensities. Answering this, in turn, can shed some light on the reliability of line intensities provided by our theoretical scheme. Accordingly, the present study involves six independent runs of nuclear motion calculations using the inputs presented below.

Ames PES
As a primary choice we decided to use the semi-empirical Ames-1 PES [26].

A comparison of our ab initio PES with the Ames-1 PES shows a 1.5 cm −1 average discrepancy between the energy levels computed with the two surfaces for levels below 4000 cm −1 . Above this value some energy levels spoil this relatively good agreement to give a root-mean-square deviation (RMSD) of 6.2 cm −1 for states below 11 000 cm −1 , with 200 levels (0.5% of the total) unmatched. However, for a fully ab initio procedure this PES represents roughly the state-of-the-art for CO 2 . It was therefore used as part of the theoretical error estimation procedure.

Fitted PES
Higher quality can be achieved by refining our ab initio PES with Ames energy levels. This was done for levels with J = 0, 1 and 2. This fit resulted in an RMSD of 0.2 cm −1 for the corresponding low-J energy levels and around 1.4 cm −1 for states including all values of J (0-129) below 11 000 cm −1 , leaving only 30 levels above 10 000 cm −1 (0.1% of the total) unmatched.

Ames DMS
The Ames dipole moment surface 'DMS-N2' was based on 2531 CCSD(T)/aug-cc-pVQZ dipole vectors [27]. The linear least-squares fits were performed with a 30 000 cm −1 energy cutoff and a polynomial expansion up to 16th order with 969 coefficients, which gave an RMSD of 3.2 × 10 −6 a.u. and 8.0 × 10 −6 a.u. for the respective dipole vector components. Comparison with recent experiments [27] and CDSD data leads to the general conclusion that the Ames DMS, while reliable, still does not meet the requirements for remote sensing accuracy.

UCL DMS
Our dipole moment surface was calculated using the finite field method.
Both positive and negative electric field vector directions were considered for the x (perpendicular to the molecular long axis) and y (along the molecular long axis) components of the dipole moment, requiring four independent runs for each ab initio point. Finally, the dipole moment was computed as the first derivative of the electronic energy with respect to a weak uniform external electric field (3 × 10 −4 a.u.); a two-point numerical finite difference approximation was used.
Previous research suggests that in general the derivative method yields more reliable dipole moments than those obtained from simple expectation value evaluation [47]. Randomly distributed ab initio points were then fitted with a polynomial in symmetry-adapted bond-length and bond-angle coordinates.
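The finite-field step described above can be illustrated with a minimal Python sketch. It assumes the usual sign convention, in which the dipole component is minus the derivative of the energy with respect to the field, and a central two-point difference between the runs at positive and negative field. The function names and the toy energy model are hypothetical; they stand in for the electronic structure calculations and contain no real CO 2 data.

```python
def finite_field_dipole(energy_plus, energy_minus, field=3e-4):
    """Dipole component (a.u.) from energies computed at +field and -field (a.u.).

    Central two-point difference: mu = -dE/dF ~ -(E(+F) - E(-F)) / (2F).
    """
    return -(energy_plus - energy_minus) / (2.0 * field)


def toy_energy(field, e0=-188.0, mu=0.1, alpha=15.0):
    """Hypothetical model energy E(F) = E0 - mu*F - 0.5*alpha*F**2 (illustration only)."""
    return e0 - mu * field - 0.5 * alpha * field**2


F = 3e-4  # field strength quoted in the text, in a.u.
mu_x = finite_field_dipole(toy_energy(+F), toy_energy(-F), F)
print(mu_x)
```

In this toy model the quadratic (polarizability) term cancels between the two runs, so the central difference recovers the model dipole up to rounding. Repeating the two runs for each of the two dipole components gives the four runs per geometry mentioned above.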

Nuclear motion calculations
Nuclear-motion calculations were performed using the DVR3D suite [44].
Symmetrized Radau coordinates in a bisector embedding were used to represent the nuclear degrees of freedom. Rovibrational wavefunctions and energy levels were computed using an exact kinetic energy operator (in the Born-Oppenheimer approximation) with nuclear masses for carbon (11.996709 Da) and oxygen (15.990525 Da).
As a preliminary step in our procedure, the basis set parameters were optimized with respect to energy level convergence using the Ames-1 PES.
The final set of parameters for Morse-like basis functions [44,48], describ-

The final step involved running the DIPOLE program [44]. A uniform 10 −30 cm/molecule cutoff value is sufficient to cover most of the experimentally available data and also corresponds to the HITRAN2012 standard, facilitating further comparisons. The value of the partition function at 296 K, Q = 286.096, was taken from Huang et al. [28] and agrees closely with the value 286.095 obtained from the present calculation. For 12 C 16 O 2 , half of the possible energy levels do not exist due to nuclear spin statistics. Transition intensities in cm/molecule were calculated using

$$I_{if} = \frac{8\pi^3}{3hc}\, \omega_{if}\, \frac{g_i}{Q(T)} \left[ \exp\left(-\frac{E_i}{k_B T}\right) - \exp\left(-\frac{E_f}{k_B T}\right) \right] S_{if},$$

where ω if is the transition frequency between the i'th and f'th state, g i = (2J + 1) is the total degeneracy factor, Q(T ) is the partition function, E i and E f are the energies of the two states and S if represents the linestrength, see eq. (1), for the transition i to f.
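The conversion from linestrength to intensity can be sketched in Python as follows. This is a minimal illustration, assuming the standard absorption intensity expression in Gaussian units, a linestrength given in Debye squared, energies and frequencies in cm −1 , and an upper state energy equal to the lower state energy plus the transition frequency. The function name and the example transition are hypothetical and are not taken from the line list.

```python
import math

H = 6.62607015e-27   # Planck constant, erg s
C = 2.99792458e10    # speed of light, cm/s
C2 = 1.438776877     # second radiation constant hc/k_B, cm K
DEBYE2 = 1.0e-36     # 1 Debye^2 expressed in esu^2 cm^2

# 8 pi^3 / (3 h c), scaled so that the linestrength can be given in Debye^2
PREFACTOR = 8.0 * math.pi**3 / (3.0 * H * C) * DEBYE2


def line_intensity(omega, e_lower, g, s_if, q, temperature=296.0):
    """Absorption intensity in cm/molecule.

    omega   : transition frequency, cm^-1
    e_lower : lower state energy, cm^-1
    g       : degeneracy factor (2J + 1 in the text)
    s_if    : linestrength, Debye^2
    q       : partition function at `temperature`
    """
    e_upper = e_lower + omega  # assumption: absorption from the lower state
    boltzmann = (math.exp(-C2 * e_lower / temperature)
                 - math.exp(-C2 * e_upper / temperature))
    return PREFACTOR * omega * g * boltzmann * s_if / q


# Hypothetical transition for illustration; Q(296 K) = 286.096 as quoted in the text
example = line_intensity(omega=2349.0, e_lower=0.0, g=1, s_if=0.1, q=286.096)
print(example)
```

The second exponential accounts for stimulated emission; at 296 K it is negligible for transitions above roughly 2000 cm −1 but matters for low-frequency lines.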

Estimation of the intensity uncertainties
The dominant source of uncertainty in line intensities is given by the ab initio DMS. The accuracy of the UCL DMS was considered in detail by Polyansky et al. [20] who suggested that for the vast majority of transitions below 8000 cm −1 it should give intensities accurate to better than 0.5 %.
A characteristic of ab initio DMSs is that entire vibrational bands are reproduced with very similar accuracy. This is because, to a significant extent, ro-vibrational transitions in a molecule like CO 2 can be thought of as the product of a vibrational band intensity and a Hönl-London factor.
Although DVR3D does not explicitly use Hönl-London factors, the use of an exact nuclear motion kinetic energy operator ensures that these rotational motion effects are accounted for exactly. Here we therefore follow the Lodi-Tennyson strategy [33] but construct and evaluate six linelists utilizing the three different PESs and two different DMSs.
For each 'matched' line, the ratio of the strongest to the weakest transition intensity was calculated, yielding a scatter factor ρ. Figure 1 shows scatter factor statistics for the two sets of interest. We can clearly see that
The latter is just the case for the 1110i-00001 (i = 1, 2, 3) bands. They borrow intensity from the very strong asymmetric stretching fundamental via a second-order Coriolis interaction. This appears as a sharp peak around 2000 cm −1 (upper energy level) as depicted in Fig. 2. In this case, reproducing the line intensities with high accuracy requires very precise wavefunctions. We describe these lines as being associated with a J-localized instability.
In addition to bands for which the scatter factor has peaks concentrated around a certain energy region, we also encountered entire vibrational bands with ρ > 2.5, which we call 'sensitive'.

Results
Our final line list, given in the supplementary information, includes the ρ parameter, determined from (AA,AU,FA,FU), as one of the fields; ρ is set to −1.0 whenever it could not be extracted. For the most intense bands this automatic procedure was followed by manual matching and double-checking, see Table 1.
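The scatter factor for a single matched line can be sketched in Python as below. This is a minimal illustration under stated assumptions: the dictionary layout and function names are hypothetical, a line is treated as not extractable when it is missing from any of the four line lists, and the ρ > 2.5 threshold, which the text quotes for whole bands, is applied here to a single line purely for demonstration.

```python
LINE_LISTS = ("AA", "AU", "FA", "FU")


def scatter_factor(intensities):
    """Ratio of strongest to weakest intensity for one matched line.

    `intensities` maps line list labels to intensities in cm/molecule.
    Returns -1.0 when the ratio cannot be formed, i.e. when a label is
    missing or its value is None or non-positive (an assumed convention).
    """
    values = [intensities.get(label) for label in LINE_LISTS]
    if any(v is None or v <= 0.0 for v in values):
        return -1.0
    return max(values) / min(values)


def exceeds_threshold(rho, threshold=2.5):
    """True when a valid scatter factor is above the chosen threshold."""
    return rho > threshold


# Hypothetical intensities for one line, illustration only
line = {"AA": 1.0e-25, "AU": 1.1e-25, "FA": 0.9e-25, "FU": 1.0e-25}
rho = scatter_factor(line)
unmatched = scatter_factor({"AA": 1.0e-25, "AU": 1.1e-25})
print(rho, exceeds_threshold(rho), unmatched)
```

Because ρ is a ratio of the largest to the smallest value, any valid result is at least 1, so the sentinel −1.0 cannot be confused with a computed scatter factor.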

Scatter factors
In order to appreciate the landscape of scatter factor distributions, it is instructive to introduce scatter factor maps as a function of lower and upper energy level.

Ames-296
Huang et al. [28] published infrared line lists for 12 stable and 1 radioactive isotopologues of CO 2 . These linelists were calculated with Ames-1 PES [26] and DMS-N2 [27], or (A,A) in our notation above. We generated from their data a 12 C 16 O 2 line list for its natural abundance, T = 296 K and with an intensity cutoff of 10 −30 cm/molecule, which we refer to as Ames-296.

CDSD-296
The effective operator approach enables one to reproduce all published

intensities are supposed to be accurate to better than 2% (uncertainty code 7) or 5% (code 6). This reveals two issues with the current version of HITRAN:
a) The stated uncertainty estimates of all current entries are insufficiently accurate for remote sensing applications. Our previous study [20] already showed that for a number of important bands the actual accuracy of the intensities in HITRAN is much higher than suggested by their estimated uncertainties.
b) Line intensity accuracies are not uniform throughout the spectral region.
Our experience from studies on several molecules is that the ratio of observed to variational line intensities should be roughly constant for a given band unless there is an isolated resonance (see below). For CO 2 , comparing HITRAN intensities with our predictions, we would expect the same, but detailed analysis (cf. figure 11) shows that such jumps in accuracy cause artificial patterns in line intensities within a single vibrational band.
All HITRAN2012 entries taken from a pre-release version of CDSD have been tagged with uncertainty code 3 (20% or worse). However, this number does not reflect the actual uncertainties of the intensities. Most of the HITRAN intensities have uncertainties much better than 20%. More detailed information about the actual uncertainties can be found in the official release of CDSD [58]. The reader should use this work in order to get realistic information about the uncertainties of the line parameters.
Intensities of all assigned UCL lines relative to HITRAN2012 are depicted in figure 8. As expected, discrepancies between the two linelists grow as lines get weaker, which results in a funnel-like shape in the plot which is characteristic of such comparisons (e.g. [59]). The stability of the UCL lines, as indicated by their scatter factors, is also shown; as could be anticipated, stable lines predominate at higher intensities. A similar situation occurs for bands with HITRAN uncertainty code 6, see figure 10; here very good agreement is spoiled by the 01131 -01101 band.
The majority of line intensities (151 602) were taken from our "AU" calculations; we assign HITRAN uncertainty code 8 for stable bands with at least one transition stronger than 10 −23 cm/molecule and 7 for stable bands weaker than 10 −25 cm/molecule together with 8647 lines from intermediate bands.
Whenever our line intensity turned out to be unreliable (i.e. was either unstable with no additional tests confirming its high accuracy, or belonged to the 3ν 3 family of bands) it was replaced by the CDSD-296 value. This was the case for 10 080 (6%) lines.
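The replacement rule stated above can be written as a short Python sketch. It only encodes the logic given in the text (a line is unreliable if it is unstable without confirmation by additional tests, or if it belongs to the 3ν 3 family). The function names, argument names and source labels are hypothetical, and how the stability and confirmation flags are obtained is outside this sketch.

```python
def is_reliable(stable, confirmed_by_tests, in_3nu3_family):
    """Reliability of a computed intensity, following the rule in the text."""
    if in_3nu3_family:
        return False
    return stable or confirmed_by_tests


def choose_intensity(au_intensity, cdsd_intensity, reliable):
    """Return (intensity, source label) for one line of the final list."""
    if reliable:
        return au_intensity, "AU"
    return cdsd_intensity, "CDSD-296"


# Hypothetical lines for illustration: (AU, CDSD-296, stable, confirmed, 3nu3)
lines = [
    (1.00e-24, 1.02e-24, True, False, False),   # stable line
    (3.00e-27, 2.50e-27, False, False, False),  # unstable, unconfirmed
    (5.00e-26, 4.00e-26, True, False, True),    # 3nu3 family
]
final = [choose_intensity(au, cdsd, is_reliable(s, c, f))
         for au, cdsd, s, c, f in lines]
print(final)
```

Keeping the reliability test separate from the selection step makes it easy to tighten or relax the criterion without touching the merging of the two data sources.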

Conclusion
We present a new mixed ab initio-empirical linelist providing reliable intensities for 12 C 16 O 2 up to 8000 cm −1 . We believe this line list is more complete and the intensities more accurate than in HITRAN2012 [36]. A detailed analysis shows that our line intensities are generally accurate at the sub-percent level when compared to recent, high-accuracy measurements, consequently validating our approach; furthermore we find that intensity uncertainties stated in HITRAN2012 are probably too conservative. We believe these improved intensities should help to improve CO 2 monitoring in remote atmospheric sensing studies, and in other applications. Furthermore, this new line list fills in the small gaps in the HITRAN2012 list. Of course, for use in atmospheric conditions this line list needs to be supplemented by both line profile parameters and consideration of line-mixing [60].
One issue that we should raise concerns perpendicular transitions (those with ∆ℓ = ±1 and ±2). The majority of the perpendicular bands borrow intensity from the considerably stronger parallel (∆ℓ = 0) bands via Coriolis resonance or anharmonic plus ℓ-type interactions. To describe this process it is necessary to have very precise wavefunctions. So far, the very high accuracy of the line intensity calculations presented here is confirmed experimentally only for parallel bands. All weaker bands have been given a lower accuracy rating in our line list; nonetheless it would be very helpful to have some high accuracy experimental measurements of perpendicular bands to help validate our results independently.
Future work will focus on two aspects of the problem. First, it is apparent that our ab initio dipole moment surface is less accurate for transitions involving changes of 3 or more quanta in ν 3 . This problem will be the subject of future theoretical investigation which will also aim to extend our model to frequencies higher than 8000 cm −1 . Second, a major advantage of our methodology is that theoretical calculations can be used to give intensities for all isotopologues of CO 2 with essentially the same accuracy as for the main 12 C 16 O 2 isotopologue [61]. Line lists for isotopically substituted CO 2 will be published in the near future.