Vibration–rotation Transition Dipoles from First Principles

The use of ab initio methods to calculate line positions and associated transition intensities for the infra-red spectrum of small molecules has recently become common. The first principles calculation of transition dipoles, upon which the intensity is based, relies on three distinct steps: the quantum chemical calculation of the dipole moment surface at a grid of geometries, the accurate representation of this surface using an appropriate functional form and the wave functions used to represent the initial and final states, which in turn depend on the accuracy of the potential energy surface used to generate them. Each of these stages is discussed with a view to obtaining the highest possible accuracy. The prospect of computed transition intensities displacing measured ones as the primary source of such information is considered. A detailed knowledge of molecular spectra is important for a large variety of remote sensing and radiative transport applications , which is of course why there are extensive compilations of spectroscopic data [1–4]. These applications all rely on the use of laboratory data to provide the position and intensity of the molecular transitions involved. Laboratory measurements generally provide line positions with a very high level of precision, something that has recently taken another significant step forward with the development of frequency combs [5], see Galzerano et al. [6] for example. The accuracy obtained for measurements of line positions is rarely mirrored in the accuracy with which transition intensities are determined. Indeed, there are a variety of situations where measurements do not routinely provide absolute intensity data at all. These include many microwave spectra, most spectra of unstable species such as radicals and ions, and spectra recorded at high temperature under non-thermodynamic equilibrium conditions. For other species, water being a classic example, the provision of accurate line intensities is complicated by practical problems in determining the precise number density of the molecules in the line-of-sight [7]. There are very precise measurements of line intensities for water transitions in the infrared [8,9], but such measurements are performed using very specially developed methodologies which can then only be applied to a relatively few lines. Workhorse methods of populating databases, such as Fourier Transform Spectroscopy, usually only yields line intensities accurate to a few percent. Theoretically the situation is somewhat different. Only for systems with very few electrons is it possible to compute transition frequencies with an accuracy approaching experimental [10,11]. This is not true …


Introduction
A detailed knowledge of molecular spectra is important for a large variety of remote sensing and radiative transport applications, which is of course why there are extensive compilations of spectroscopic data [1][2][3][4]. These applications all rely on the use of laboratory data to provide the position and intensity of the molecular transitions involved. Laboratory measurements generally provide line positions with a very high level of precision, something that has recently taken another significant step forward with the development of frequency combs [5], see Galzerano et al. [6] for example.
The accuracy obtained for measurements of line positions is rarely mirrored in the accuracy with which transition intensities are determined. Indeed, there are a variety of situations where measurements do not routinely provide absolute intensity data at all. These include many microwave spectra, most spectra of unstable species such as radicals and ions, and spectra recorded at high temperature under non-thermodynamic equilibrium conditions. For other species, water being a classic example, the provision of accurate line intensities is complicated by practical problems in determining the precise number density of the molecules in the line-of-sight [7]. There are very precise measurements of line intensities for water transitions in the infrared [8,9], but such measurements are performed using very specially developed methodologies which can then only be applied to a relatively few lines. Workhorse methods of populating databases, such as Fourier Transform Spectroscopy, usually only yields line intensities accurate to a few percent.
Theoretically the situation is somewhat different. Only for systems with very few electrons is it possible to compute transition frequencies with an accuracy approaching experimental [10,11]. This is not true for transition intensities where there is increasing evidence that the careful and systematic application of ab initio procedures described below can lead to predicted transition intensities of high accuracy. If accuracy can be achieved in first principles calculations which is similar to that obtained experimentally, then theoretical procedures have a number of distinct advantages. They can be applied uniformly to several isotopologues; they can be used to study lines not easily observed under equilibrium conditions, such as flourescence spectra in comets [12,13]; and they can be extended to cover hot vibrational bands and highly-excited rotational states, both of which become increasingly important at elevated temperatures.
The ability to predict spectra at high temperatures is being extensively exploited in the ExoMol project [14]. This project is dedicated to producing a spectroscopic database for the analysis and modeling of the spectra of exoplanets and other hot atmospheres. As a rotation-vibration line list for a single polyatomic molecule may contain billions of transitions [15,16], the ExoMol project relies on developing and solving a robust theoretical model for each molecule considered. While these calculations are performed using a variety of empirical data to obtain the best possible estimate of each line position, it has so far used entirely ab initio procedures to consider transition probabilities as represented by the Einstein-A coefficient.
The transition dipole between two states can be written: where for a vibration-rotation transition, the initial and final states are represented by nuclear motion wave functions jii and jf i, and the sum runs over the components of the internal dipole moment vector. Here I will consider only transitions between fully specified vibration-rotation states and not the simpler but approximate vibrational band intensities, which require the use of an Eckartembedded coordinate system [17]. In the full vibration-rotation case, the dipole moment, l, is the instantaneous, dipole moment of the molecule at a given nuclear configuration in any coordinate system and the integration runs over all nuclear-motion degrees-of-freedom. This formulation implicitly assumes the Born-Oppenheimer approximation. Given the transition dipole, it is straightforward to derive other measures of a transition probability such as the Einstein A-coefficient or the transition intensity [18]. A tutorial on theoretical methods for computing molecular rotation-vibration spectra has been given by Lodi and Tennyson [19] and a comprehensive survey of available, ab initio dipole moment surfaces for molecules with between 3 and 5 atoms has recently been the presented by Yurchenko [20]. Here I consider only the ingredients required to obtain accurate predictions of transition dipoles and hence intensities. These fall into three parts: electronic structure calculations, representation of the dipole moment surfaces or curve (DMS or DMC) and nuclear motion wave functions. I will consider each of these in turn.

Ab initio procedures
Standard electronic structure packages, such as MOLPRO [21], offer the possibility of computing the instantaneous electronic dipole moment for any given geometry for the molecule under consideration. In practice this can be done using two distinct methods. Given the electronic wave function, the dipole can be computed as an expectation value of the dipole operator. Alternatively, it can be obtained from the derivative of the electronic energy when the system is placed in a uniform, static electric field. The Hellmann-Feynman theorem suggests that the two methods should be equivalent [22]. However, this theorem only holds in general for exact wave functions. In practice, differences can be significant with values found to differ by more than 0.1 D for calculations on water with large basis sets and sophisticated CI models [23].
Experience suggests that the perturbation method yields better results in practical calculations with approximate electronic wave functions. This can be understood in terms of the convergence of the energy in a calculation, which is second-order, against the first-order convergence of the wave function. In addition, use of the perturbation method means that the contribution to the dipole moment due to effects for which only the energy is calculated can readily be considered. A number of effects can be treated in this fashion including relativistic corrections, the Born-Oppenheimer (BO) diagonal correction and higher-order configuration interaction corrections such as the Davidson correction. The disadvantage of the perturbation theory method over the use of expectation values is that it requires several calculations at each geometry with different electric fields and is therefore computationally more expensive.
Conventional wisdom holds that accurate dipole moment calculations require the use of large diffuse basis sets since the contributions from diffuse portions of the wave function are emphasized by the dipole moment operator. In practice, experience, in particular supported by detailed studies on the water molecules, suggests that the dipoles moments are largely converged using basis sets of about 5-zeta quality [24,25]. This contrasts with the electronic energy, which is not satisfactorily converged even with a 7-zeta basis set [26]. Conversely there is strong evidence that for really accurate results it is necessary to extend standard treatments of the electron correlation problem [23] by, for example, using a larger than usual active space for the electron correlation problem. This, combined with the requirement to do several calculations at each geometry, discussed above, and the need for a finely space grid of dipole points, discussed below, can makes the calculation of high accuracy DMS for polyatomic molecules computationally expensive even for few-electron systems.
It is becoming increasingly standard to consider corrections to the dipole moment due to core correlation, relativistic effects and failure of the BO approximation. While the non-Born-Oppenheimer contribution is important for systems such as the HD molecule which have no permanent dipole within the BO approximation [27,28], it is generally less important for systems which already have a permanent dipole moment [29,30]. However core correlation and relativistic effects have both been found to make small but significant contributions to the DMS [24,25]. Interestingly, in the case of water, these two effects appear to essentially cancel [24]. The reasons for this remain unclear and the situation does not appear to apply to other similar molecules such as H 2 S [25]. However, it would seem that including only one of these corrections may actually give worse results than including neither. Table 1 compares the various contributions to the equilibrium dipole moment of water and H 2 S taken from ab initio studies performed in my group [24,25,31]. The aim in each case was to obtain this dipole with an accuracy of better than 1 %. For water aug-cc-pCV5Z and aug-cc-pCV6Z CCSD (T) calculations were extrapolated to the complete basis set limit but this only changes the value by 0.00005 D. Additional corrections to the model only give minor contributions: spin-orbit coupling is estimated to contribute about 0.000005 D [23], while the Born-Oppenheimer Diagonal Correction (BODC, also known as the adiabatic correction) at equilibrium was computed by Hobson et al. [30] and amounts to 0.002 D. As can be seen, the contribution to the dipole due to vibrational motion of the nuclei is also small. This leaves the treatment of the electron correlation problem as contributing the largest uncertainty, with the MRCI value of Partridge and Schwenke [34] lying 0.01 D higher that of Lodi et al. [23], who used a larger active space for the electrons.
For H 2 S the issues with computing a precise value for the equilibrium dipole moment are similar to those encountered in water. However for this system, the fundamental transitions are all rather weak; weaker than some of the combination bands for example. This can be associated with the behavior of the dipole moment surface which passes through zero for at geometries close to equilibrium, which gives rise to various intensity anomalies [35]. This behavior make the DMS very challenging to compute [25,36,37].

Smoothness
So far the discussion has considered the high-accuracy, singlegeometry determination of the dipole moment. This value, when obtained close to the equilibrium geometry, is important for determining the intensity of pure rotational transitions. However, the strength of vibrational excitations depend on the variation of the dipole moment as a function of molecule geometry. This means that any useful dipole moment surface must vary in a smooth and physical manner as a function of molecular geometry. Such behavior is far from guaranteed in standard ab initio procedures.
When considering smoothness it is worth considering how a DMS effects different transitions. A naive view of the transition dipoles linking vibtational states suggests [18] that this is related to DMS via the expression where for simplicity only transitions from the vibrational ground state, j0i, involving a single vibrational mode, represented by Q, are considered. This expression neglects a number of terms that contribute to overtone intensities such as the anharmnonicity of the underlying potential energy surface (PES) and the non-linearity of DMS, but it useful to illustrate issues with smoothness. Based on Eq. (2), pure rotational transitions, for which m ¼ 0, depend on the vibrationally averaged dipole moment h0jlj0i, and the fundamental transition depends on the first derivative of this moment. Overtone transitions depend on increasingly higher derivatives so that, for example, a transition involving 9 quanta of excitation as considered below would depend on the ninth derivative of the DMS. As in practice such calculations are performed numerically by evaluating integrals which to a very high degree cancel to zero, then it is clear that this places huge demands on the appropriate behavior of the DMS and, in particular, its smoothness.
There are a number of reasons why a computed dipole moment may not behave correctly as a function of molecular geometry. For example it is well-known that Hartree-Fock wave functions do not go correctly to dissociated atoms [22], in many cases with consequent effects on the dipole moments at large inter-nuclear separations. In practice all single-reference methods including coupled cluster wave functions, such as those used by the ''gold standard'' model CCSD (T), have similar problems. Use of bond-centered basis sets can similarly give problems with dipole moments when the bond in question becomes large [38].
Other methods, such as multi-configuration self consistent field (MC-SCF) or multi-reference configuration interaction (MRCI), use flexible configuration sets which allow wave functions to change as a function of geometry. However, these methods involve dividing the orbitals being used into different groups such as frozen or core, active and virtual orbitals. Problems with the smoothness of dipoles arise when orbitals cross between these groups as a function of geometry. Such orbital crossings can lead to significant features in the dipole functions which normally manifest themselves as continuous dipoles with discontinuous derivatives. In principle, this particular problem can be mitigated by rotating orbitals between the different groups. Similar problems can arise for systems with many electronic states near crossings, even when such crossings should be avoided by symmetry [39]. Numerical issues, such as lack of convergence at certain geometries, can also give rise to artificial structures in the DMS.
The result of these issues is that ab initio dipole moment functions need careful inspection for smoothness and correct asymptotic behavior. There are numerous examples of published curves that do not appear to vary smoothly with geometry [40][41][42][43] or behave correctly at large inter-nuclear separations. Fig. 1 gives an illustrative example taken from a recent study on the SiO molecule [44]. For SiO a number of independent ab initio calculations of the DMC give very similar results close to equilibrium geometries [41,[44][45][46]. However at extended bond lengths, these calculations give significant differences. In particular, calculations based at the CCSD (T) level of theory [44] behave unphysically at large separations, presumably due to problems with the dissociation limit. MRCI calculations avoid this problem, but those due to Chattopadhyaya et al. [41] show unphysical kinks, probably due to orbital crossings. The MRCI calculations of Barton et al. [44], however, appear to give a DMC which goes smoothly to zero as SiO dissociates, which is the theoretically correct behavior for a system which dissociates in neutral fragments.

Molecular ions
Molecular ions actually form a special class when it comes to constructing dipole moment functions. The transition intensities of molecular ions, particularly cations, have received considerably less attention than neutral molecules probably because they are usually prepared experimentally under conditions which make absolute intensity measurements problematic. Theoretically the situation is somewhat different. It is usual for standard electronic structure methods to perform somewhat better for cations than neutrals, because in these species the two-body electron -nuclear Coulomb interactions are enhanced in comparison to the more difficult to treat electron correlation effects. However in the case of dipoles there is an additional factor.
For most cations, and in particular symmetric ones, the largest contribution to the dipole moment is simply given by the separation between the center-of-charge and the center-of-mass. Thus, for example, within the Born-Oppenheimer approximation, the dipole moment of HD + at internuclear separation R is given by where M H and M D are the mass of atomic hydrogen and deuterium respectively. Assuming integer masses, this gives l ¼ R 6 a.u. for R in Bohr. Given the simplicity of calculating such terms, the more complicated contributions, which depend on the details of the wave function, become relatively less important.
The center-of-mass to center-of-charge relationship for ions means that these species have rather different intensity patterns in the ro-vibrational spectra than those well understood for neutrals. Thus, for example, it has long been known for H þ 3 that the transitions in the Einstein A coefficient for the first overtone of the bending mode is about the same value as that for the bending fundamental [47]. Similar behavior can be expected to be desplayed by higher clusters such as H þ 7 [48]. What has received less attention is the prediction that for systems such as the N þ 2 -He complex, the intensity of the complex stretching vibrational band should actually increase with the number of quanta of excitation [49]. Again, this is essentially a geometric effect which is straightforward to model.

More electrons
While our ability to compute very accurate dipoles for few-electron molecules is certainly improving, a note of caution is necessary. For this it is worth considering the conclusions of a recent study on ro-vibrational intensities, including overtone transitions, of several hydrogen halides by Li et al. [54]. They found that their calculated intensities showed an extreme sensitivity to the precise shape (and derivatives) of the DMC employed. So, for example, use of a seemingly high quality ab initio DMC for HCl [55] gave poor results although visible differences in the curves are small compared to Li et al.'s own [56] empirically-determined DMC. Conversely similar comparisons by Li et al. [54] for HF showed that the ab initio DMCs [55,57] worked well for this system. HF is a benchmark, 10electron system. It would seem that there remains more work to do on larger molecules, which in this context means those with moreelectron, to develop ab initio methods with the same predictive capacities found for few-electron systems.

Dipole moment curves and surfaces
Although in some cases, particularly for diatomics, it may be possible to compute the required dipole moment(s) at each grid point for which the wave function has been generated, this is not in general feasible. Thus it is usual to employ an appropriate function to smoothly interpolate the DMS between the points where actual computed points are available. As a starting point for such fits it is a good idea to use a functional form which enforces known analytic behavior [58] of the DMS, such as ensuring the dipole is analytically zero at geometries where this can be predicted by symmetry. For example, recent studies on the high-symmetry molecule methane made extensive use of symmetry [59][60][61].
The sensitivity to the precise form of the DMC, noted by Li et al. [54] and discussed above, illustrates why fitting dipole moment curves and surfaces is actually quite difficult. In particular computed intensities can show significant dependence on the quality of the fit with, for example, poor fits leading to an artificial increase in the strength of certain weak transitions. For diatomics it is relatively straightforward to avoid this situation by simple visual inspection of the curves and fits. For the DMS of polyatomic molecules this not always so easy.
Let me illustrate the situation by considering the DMS of the water molecule. In a benchmark paper Partridge and Schwenke (PS) [34] computed an ab initio PES for water of unprecedented accuracy. They then tuned this PES using spectroscopic data to produce a semi-empirical PES which yields water transitions over a wide range of frequencies with close to experimental accuracy. The line lists generated with this semi-empirical PES are still being extensively used. At the same time PS also computed a high accuracy DMS for which they produced an analytic fit. In subsequent work, Schwenke and Partridge [62] observed that use of this fit to calculate transitions intensities leads to the systematic overestimation of the intensities for bands involving large changes in quantum numbers, particularly those involving the bending mode. They identified this problem as being associated with unphysical oscillations in their DMS as a function of water bond angle. Fig. 2 gives two illustrative examples of the difficulty of representing the behavior of the dipole moment as a function of geometry. These difficulties occur because dipole moment functions actually show quite complicated dependence of the molecular geometry and are therefore difficult to fit with a limited number of data points. Schwenke and Partridge's solution to this problem was to generate a large number of artificial points to further constrain their dipole fit and remove, or at least mitigate, the effect of artificial oscillations in the fit. Lodi et al. [24], from whose work the data in Fig. 2 is taken, computed ab initio dipole moments at those points where the differences between their fit and simple interpolation gave the largest differences. Even then, they still found it necessary to add extra artificial points to reduce the effect of any unwanted features in their surface. From the perspective of functional forms, there are two possible solutions to this smoothness problem. One solution is to use very simple functions to represent the DMS. Such an approach was used by Gabriel et al. [63] who represented the DMS of water using a fourth-order polynomial with 24 and 13 parameters in directions of the bond angle bisector (the B axis) and the in-plane axis perpendicular to this (the A axis), respectively. This fit gives excellent results and stability but cannot correctly predict the intensities for transitions involving highly excited vibrational states. For water this is particularly true for bending modes as the shape of the DMC as a function of bond angle is quite subtle, as is also shown by Fig. 2.
The second solution is to use high-order functions to represent the DMS with a lot of data to constrain them. Thus, PS originally used a sixth-order fit and 84 parameters to fit their DMS. Schwenke and Partridge [62] concluded that this fit was insufficiently flexible to represent the true behavior of the DMS, particularly as a function of bond angle. They therefore refitted the DMS with terms extending up to eighteenth-order which required 823 parameters. Since calculating sufficient ab initio points to properly determine the full parameter set was considered too expensivee, Schwenke and Partridge added a large number of artificial points along their angular DMC which were determined by graphical interpolation. The resulting DMS is much better behaved and has been widely used (e.g. [64]).
The conclusion from these studies is that it is necessary to use a significantly larger number of ab initio grid points to fully characterize the DMS than are required to obtain a reliable PES. Calculations using large numbers of grid points are starting to be performed for molecules such as water [23], H þ 3 [65] and CO 2 [66]. I note that even with increased coverage of data points used to characterize the DMS, the issue of how to get stable results remains an important one. Recently Du et al. [67] reported cavity ring down measurements of absorption by water at near ultraviolet wavelengths. These spectra are broad band and thus do not give individual lines or assignments. However analogies with high-resolution, high-frequency spectra of water [68] suggests that the absorptions measured by Du et al. are due to transitions in which at least nine vibrational quanta change. As part of their work on developing a high accuracy DMS for water, Lodi et al. [23] performed calculations looking at the near ultraviolet absorption spectrum of water. They obtained results very similar to those of Du et al. but did not publish them because of uncertainty over the stability of their calculations with respect to representations of the DMS. In particular, it was difficult to demonstrate that the relatively strong intensities they obtained were not simply an artifact caused by residual oscillations in their fit of the DMS. Given the potential importance of these studies for determining the near ultraviolet absorption of sunlight by water vapor in the earh's atmosphere. more work is clearly needed on this problem given.

Nuclear motion calculations
The computation of transition dipoles relies on vibration-rotation wave functions. When considering how these wave functions are generated one needs to consider two issues: one is the underlying model used for the nuclear motion problem and the second in the accuracy with which this model can be solved. For small molecules variational, or discrete variable representation (DVR) methods which are only quasi-variational [69], can give highly accurate wave functions. Thus, for example, for triatomic systems there are comparisons between codes available which demonstrate this [70,71]. In general, any inaccuracy in the wave functions is thus almost entirely due to the choice of model and, in particular, the PES used.
The way the PES affects computed transition intensities needs rather careful consideration. In many cases, use of a reasonable PES will lead to wave functions which can be used to give transition dipoles of good accuracy but there are isolated exceptions. These occur when an otherwise weak transition steals intensity from a strong one because of mixing between the associated wave functions. This intensity stealing occurs when two states with the same rigorous quantum numbers (J, parity and symmetry) lie accidentally close in energy. Under these circumstances, the stronger transition to one of these states may become a little weaker but the transition to the other ''dark'' state can increase sharply in intensity, sometimes by many orders of magnitude. The degree of this mixing is very sensitive to the energy separation between these quasi-degenerate states which is, in turn, strongly dependent on the underlying PES.
Lodi and Tennyson [72] developed a method to test the results of calculations to identify those states whose intensities are hard to predict because of sensitivity due to resonance interactions. Their method involved calculating the transition intensities four times using two different PES and two different representations of the same DMS. The two PESs chosen were the then best available ab initio surface [73] and the best available spectroscopically determined one [74]. The two DMSs used were fits to the same dipole data but with a full and reduced functional form [23]. The scatter in the ratios of these computed intensities can then be used to flag up those states which are too sensitive to state mixing in the wave function and/or to the stability of the DMS fit to be reliably predicted using the available surfaces. Recent comparisons with detailed experimental studies have shown that this method of predicting which transition intensities cannot be computed reliably is highly effective [75,76].

Conclusions
The discussion above has focussed almost exclusively on high accuracy studies for diatomic and triatomic molecules. However, such work is starting to be performed for larger molecules and high accuracy calculated dipole moment surfaces are available for a number of systems including ammonia [77], phosphine [78], thioformaldehyde [79], methane [59][60][61] and even water clusters [80].
Recent times have seen tremendous progress in the use of ab initio electronic structure procedures to compute high accuracy dipole moments and hence vibration-rotation transition intensities. These are clearly important in the many circumstances when there are no available measured intensities. However, it is becoming apparent that such calculations are competitive in accuracy with measurements in many cases. Thus, as mentioned in the introduction, the ExoMol project is based almost entirely on the use of transition intensities computed using ab initio dipole moment surfaces.
In practice, other databases such as HITRAN [1] and HITEMP [81] are also beginning to make extensive use of ab initio line intensities. For example the recent release of HITRAN contained an extensive data set of such transitions for isotopically substituted water [72] as well as many for the main isotopologue, many transitions for ammonia [14] and a new molecule, SO 3 for which ab initio intensities [82] were used to normalize the measured relative intensities. A number of other species are being considered for a similar treatment. It can therefore be anticipated that the use of ab initio methods to produce reliable transition intensities will only continue to increase.

Acknowledgments
This work is supported by ERC Advanced Investigator Project 267219 and a Royal Society Wolfson Research Merit Award. I would like to thank the members of the ExoMol team for their collaboration and for use of their results in this article. In particular, I thank Clara Sousa-Silva for helpful comments on this manuscript, and Lorenzo Lodi and Oleg Polyansky for many helpful discussion on the issues discussed here.