Laboratory spectra of hot molecules: data needs for hot super-Earth exoplanets

The majority of stars are now thought to support exoplanets. Many of the exoplanets discovered thus far are categorized as rocky objects with an atmosphere. Most of these objects are, however, hot due to their short orbital periods. Models suggest that water is the dominant species in their atmospheres. The high temperatures are expected to turn these atmospheres into a (high pressure) steam bath containing remains of melted rock. The spectroscopy of these hot rocky objects will be very different from that of cooler objects or hot gas giants. Molecules suggested to be important for the spectroscopy of these objects are reviewed together with the current status of the corresponding spectroscopic data. Prospects for building a comprehensive database of line lists and cross sections applicable to atmospheric models of rocky super-Earths as part of the ExoMol project are discussed. The quantum-mechanical approaches used in line list production and their challenges are summarized.


Introduction
There are vast areas of the Universe thinly populated by molecules which are cold. However, there are also huge numbers of important astronomical bodies which support hot or highly-excited molecules. It is the spectroscopic demands of studying these hot regimes we focus on in this review. We will pay particular attention to the demands on laboratory spectroscopy of a recently identified class of exoplanets known as hot rocky super-Earths or, more colourfully, lava and magma planets. These planets orbit so close to their host stars that they have apparent temperatures such that their rocky surface should melt or even vaporize. Little is known about these planets at present: much of the information discussed below is derived from models rather than observation.
Of course hot and cold are relative terms; here we will take room temperature (T ∼ 300 K) as the norm which means, for example, that so-called cool stars, which typically have temperatures in the 2000-4000 K range, are definitely hot. Much of the cold interstellar medium is not thermalised and excitation, for example by energetic photons, can lead to highly excited molecules. This can be seen, for example, from maser emissions involving transitions between highly excited states, which are observed from a range of molecules in a variety of interstellar environments [1]. Similarly the comae of comets are inherently cold but when bathed in sunlight can be observed to emit from very high-lying energy levels [2,3,4].
Turning to exoplanets, at present it remains unclear even how to conclusively identify which planets of a few to ten Earth masses are actually rocky [5]. From density observations some of them appear to be rocky (silicate-rich), or with a fraction of ice/iron in the interior. Others suggest a structure and composition more similar to gas giants like Neptune. Density alone is not a reliable parameter to distinguish among the various cases. In addition, there is a class of ultra-short period (USP) exoplanets which are thought to be undergoing extreme evaporation of their atmospheres due to their close proximity to their host star [6,7,8,9]. These objects are undoubtedly hot but as yet there are no mass measurements for USP planets. Spectroscopic investigations of the atmospheres of super-Earths and related exoplanets hold out the best prospect of learning about these alien worlds. The prospects of observing the atmospheric composition of transiting planets around bright stars make us confident we will be in a much better position in a few years' time with the launch of the James Webb Space Telescope (JWST) and future dedicated exoplanet-characterization missions.
From the laboratory perspective, the observation of hot or highly excited molecules places immense demands on the spectroscopic data required to model or interpret these species. As discussed below, a comprehensive list of spectroscopic transitions, a line list, for a single molecule can contain significantly more than 10^10 lines. This volume of data points to theory as the main source of these line lists [10].
A line list consists of an extensive list of transition frequencies and transition probabilities, usually augmented by other properties such as lower state energies, degeneracy factors and partition functions to give the temperature dependence of the line and, ideally, pressure-broadening parameters to give the line shape. For radiative transport models of the atmospheres of hot bodies, completeness of the line list to give the opacity of the species is more important than high ("spectroscopic") accuracy for individual line positions. This is also true for retrievals of molecular abundances in exoplanets based on the use of transit spectroscopy which, thus far, has largely been performed using observations with fairly low resolving power (R < 3000) [11]. However, the situation is rather different with the high-dispersion spectroscopy developed by Snellen and co-workers [12,13,14,15,16], which is complementary to transit spectroscopy. This technique tracks the Doppler shifts of a large number of spectroscopic lines of a given species by cross-correlating them with reference laboratory data on the line positions. This exciting but challenging technique requires precise frequencies with R ≥ 100,000, as well as good spectroscopic coverage (hot transitions); available laboratory data are not always precise enough for this technique to work [17].
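To put the two resolving powers quoted above on a common footing, the short sketch below converts R into the velocity width of one resolution element (c/R) and into the corresponding tolerance on a line position in wavenumbers. It is a minimal illustration only: the function names are hypothetical and the choice of 10 000 cm^-1 (1 µm) as the example wavenumber is an assumption made for the example.

```python
C_KM_S = 299_792.458  # speed of light in km/s (exact by definition)


def velocity_resolution_km_s(resolving_power: float) -> float:
    """Velocity width of one resolution element, dv = c / R."""
    return C_KM_S / resolving_power


def wavenumber_tolerance_cm1(wavenumber_cm1: float, resolving_power: float) -> float:
    """Width of one resolution element in wavenumber, d(nu) = nu / R."""
    return wavenumber_cm1 / resolving_power


if __name__ == "__main__":
    # 10 000 cm^-1 corresponds to 1 micron; chosen only as an example.
    for R in (3_000, 100_000):
        dv = velocity_resolution_km_s(R)
        dnu = wavenumber_tolerance_cm1(10_000.0, R)
        print(f"R = {R:>7}: dv = {dv:8.2f} km/s, d(nu) at 1 um = {dnu:.3f} cm^-1")
```

At R = 100,000 a resolution element is about 3 km/s wide, so line positions at 1 µm need to be known to roughly 0.1 cm^-1 or better for the cross-correlation to line up, whereas at R = 3000 errors of a few cm^-1 are hidden within a single element.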
This review is organised as follows. First we summarise what is known about hot rocky super-Earth exoplanets. We then consider the laboratory techniques being used to provide spectroscopic data to probe the atmospheres of these bodies and others with similar temperatures. In the following section we summarise the spectroscopic data available, making recommendations for the best line list to use for studies of hot bodies. Molecules for which little data appears to be available are identified. Finally we consider other issues associated with spectroscopic characterization of lava planets and prospects for the future.
e [38,39], and around 3000 K in Kepler-10b [19]. Somewhat cooler but still hot rocky planets include temperatures of 700 K in Kepler-37b [40], 750 K in Kepler-62b [24], 580 K in Kepler-62c [24], and 400-500 K in GJ 1214b [41,42,43].
If the main constituent of these atmospheres is steam, it will heat the surface of a planet to (and above) the melting point of rock [44]. For example, the continental crust of a rocky super-Earth should melt at about 1200 K [45], while a bulk silicate Earth should melt at roughly 2000 K [46]. Gases, including silica and other rock-forming elements, are released from the rock as it heats up and melts, and are then dissolved in the steam [47]. The main greenhouse gases in the atmospheres of hot rocky super-Earths are steam (from vaporizing water and hydrated minerals) and carbon dioxide (from vaporizing carbonate rocks), which lead to the development of a massive steam atmosphere closely linked to a magma ocean at the planetary surface [48,49,50,44,51,52,47].
At temperatures up to 3000 K, and prior to significant volatile loss, the atmospheres of rocky super-Earths are thought to be dominated by H2O and CO2 for pressures above 1 bar [46]. These objects will necessarily have spectroscopic signatures which differ from those of cooler planets. At present the interpretation of such signatures is severely impacted by the lack of the corresponding spectroscopic data. For example, a recent analysis of the transit spectrum of 55 Cnc e [38] between 1.125 and 1.65 µm made a tentative detection of hydrogen cyanide (HCN) in the atmosphere but could not rule out the possibility that this signature is actually in part or fully due to acetylene (HCCH) because of the lack of suitable laboratory data on the hot spectrum of HCCH. The massive number of potential absorbers in the atmospheres of these hot objects also has a direct effect on the planetary albedo [50] as well as on the cooling and hence evolution of young hot objects; comprehensive data are also crucial to model these processes.
Atmospheric retrievals for hot Jupiter exoplanets such as HD 209458b, GJ 1214b and HD 189733b [53] show that transit observations can help to establish the bulk composition of a planet. However, it is only with good predictions of likely atmospheric composition allied to a comprehensive database of spectral signatures and proper radiative transfer treatment that the observed spectra can be deciphered. The completeness of the opacities plays a special role in such retrievals: missing or incomplete lab data when analysing transit data will lead to overestimates of the corresponding absorbing components.
The typical compositions of steam atmospheres have been considered by Schaefer et al. [46], with an example for low atmospheric pressure shown in Figure 1.
Figure 1: Atmospheric composition for a planet similar to CoRoT-7b. Starting compositions were taken for the continental crust (left) and the bulk silicate Earth (right) at 2500 K and 10^-2 bar. Reproduced with permission from Schaefer et al. [46].
At temperatures above about 1000 K, sulfur dioxide would enter the atmosphere, which leads the exoplanet's atmosphere to be like that of Venus, but with steam. SO2 is a spectroscopically important molecule that is generally not included in models of terrestrial exoplanet atmospheres [46]. In high concentrations (greater than a few ppm), more than one spectral fea-  [58]. Models of exoplanets suggest that NO and NO2, as well as a number of other species, are likely to be key products of lightning in a standard exoplanet atmosphere [59]. Further thermochemical and photochemical processing of the quenched CH4 and NH3 can lead to significant production of HCN (and in some cases C2H2). It has been suggested that HCN and NH3 will be important disequilibrium constituents on exoplanets with a broad range of temperatures which should not be ignored in observational analyses [60].
Ito et al. [61] suggested that SiO absorption dominates the UV and IR eclipse spectroscopy. Such observations have the potential to study lava planets even with clouds and lower-atmospheres [62].
Other abundant species that may contribute to the transmission spectrum include CO, OH, and NO at high temperatures. These molecules should be present in a planet with an O2-rich atmosphere and magma oceans, a composition recently suggested for the super-Earth GJ 1132b [65].
It is suggested that for atmospheres of hot rocky super-Earths with high temperature (>1800 K) and low pressure almost all rock is vaporized, while at high pressure (>100 bar) much of this material is in the condensed phase [46]. Most elements found in rocks are expected to be soluble in steam [47],  [68], which is well-known to be a source of major absorption from the near-infrared to the optical spectral regions of M dwarfs [69]. There have been several attempts to detect TiO in exoplanet atmospheres [70,70] and a recently reported detection [71]. Whether complex polyatomic molecules like Fe(OH)2, Ca(OH)2, CrO2F and P2O5 will survive at T > 1000 K is questionable. It should be noted that it is the lower pressure regimes that hold out the best prospects for analysis using transit spectroscopy, as high pressures will tend to result in opaque atmospheres.
Post-impact rocky planets are shown to have very similar atmospheric and therefore spectroscopic properties. According to estimated luminosities, the hottest post-giant-impact planets will be detectable with near-infrared coronagraphs on the planned 30 m class telescopes [28]. The 1-4 µm region will be most favorable for such observations, offering bright features and better contrast between the planet and a potential debris disk. The greenhouse absorbers in a rocky exoplanet atmosphere strongly influence its cooling properties. The very long cooling timescales (on the order of 10^5 to 10^6 yr) lead to the possibility of discovering tens of such planets in future surveys [28]. It has recently been suggested [72] that even gas giant planets may form visible massive, rocky exomoons as a result of giant impacts.
55 Cnc e is currently the most attractive candidate magma planet for observations [38,29]; its atmosphere is amenable to study using secondary-eclipse spectroscopy and high-dispersion spectroscopy observations.
It is thought that during its formation the atmosphere of the early Earth was dominated by steam which contained water-bearing minerals [27,48,49,50,44,51,52,47]. As Lupu et al. [28] pointed out, modern state-of-the-art radiative transfer calculations for runaway and near-runaway greenhouse atmospheres [49,50] are mainly based on the absorption of H2O and CO2, with a rather crude description of hot bands and neglecting other opacity sources.
It is important, however, that the line-by-line radiative transfer calculations of outgoing longwave radiation include greenhouse absorbers of a rocky exoplanet atmosphere affecting its cooling. Discussion of such data is given below.
It should be noted that clouds and hazes can lead to flat, featureless spectra of a super-Earth planet [73], preventing detection of some or all of the spectral features discussed above. As Morley et al. argued [73], it is however possible to distinguish between cloudy and hazy planets in emission: NaCl and sulfide clouds cause brighter albedos with ZnS known to have a
A summary of the molecules important for the spectroscopy of hot melting planets is given in Table 1. The following sections discuss in turn how suitable spectroscopic data can be assembled and the present availability of the data required for retrievals from the atmospheres of rocky super-Earths, which are essential for analysis of exoplanetary observations. Exactly these types of hot rocky objects will be the likely targets of NASA's JWST (due for launch in 2018) and other exoplanet transit observations. Models suggest that magma-planet clouds and lower-atmospheres can be observed using secondary-eclipse spectroscopy [19] and that a photon-limited JWST-class telescope should be able to detect SiO, Na and K in the atmosphere of 55 Cnc e with 10 hours of observations [61]. Furthermore, albedo measurements are possible at lower signal to noise; they may correspond to the albedo of clouds, or the albedo of the surface [74,75]. High-quality data are also needed for complementary high-dispersion spectroscopy [15,17,76] (see Fig. 2, where the technique is illustrated using Doppler shifted TiO lines). For example, TiO could not be detected in the optical transmission spectrum of HD 209458b due to the (arguably) poor quality of the TiO spectral data [76].
The above discussion concentrates on molecular species and infrared spectra. However, transit observations of atomic spectra at visible wavelengths, particularly those due to atomic hydrogen [77] and sodium [78], were actually the earliest spectroscopic studies of exoplanets. More recently, the Hubble Space Telescope has been used to perform transit spectroscopy of exoplanets in the ultraviolet, revealing the presence of both neutral Mg [79] and its ion Mg+ [80], as well as the possible detection of a variety of other atoms and atomic ions.

Methodology
The spectroscopic data required to perform atmospheric models and retrievals comprise line positions, partition functions, intensities, line profiles and the lower state energies E'', which together are usually referred to as 'line lists'.
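How these quantities combine can be illustrated with the standard expression for the absorption intensity of a single line in terms of its Einstein A coefficient, I = g' A / (8 π c ν̃^2) exp(-c2 E''/T) [1 - exp(-c2 ν̃/T)] / Q(T), where c2 = hc/k is the second radiation constant. The sketch below is a minimal implementation under stated assumptions: the function name is hypothetical, the line parameters describe an invented hot-band line, and the two partition function values are placeholders rather than data for any real molecule.

```python
import math

C_CM_S = 2.99792458e10  # speed of light, cm/s
C2 = 1.438776877        # second radiation constant hc/k, cm K


def line_intensity(a_fi, g_upper, e_lower, wavenumber, temperature, q):
    """Absorption line intensity in cm/molecule.

    a_fi        Einstein A coefficient (s^-1)
    g_upper     total degeneracy of the upper state
    e_lower     lower state energy E'' (cm^-1)
    wavenumber  transition wavenumber (cm^-1)
    q           partition function Q(T) at this temperature
    """
    boltzmann = math.exp(-C2 * e_lower / temperature)
    stimulated = 1.0 - math.exp(-C2 * wavenumber / temperature)
    prefactor = g_upper * a_fi / (8.0 * math.pi * C_CM_S * wavenumber**2)
    return prefactor * boltzmann * stimulated / q


if __name__ == "__main__":
    # Invented hot-band line: lower state 3000 cm^-1 above the ground state.
    line = dict(a_fi=10.0, g_upper=5, e_lower=3000.0, wavenumber=1000.0)
    # Placeholder partition functions; real values come with the line list.
    q_values = {300.0: 100.0, 2000.0: 5000.0}
    for temperature, q in q_values.items():
        s = line_intensity(temperature=temperature, q=q, **line)
        print(f"T = {temperature:6.0f} K: intensity = {s:.3e} cm/molecule")
```

For a line with a high lower state energy the Boltzmann factor dominates, so the line gains several orders of magnitude in strength between 300 K and 2000 K even though the partition function also grows.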
Given the volume of data required, construction of such line lists is far from straightforward. When considering how this is best done it is worth dividing the systems into three classes: 1. Diatomic molecules which do not contain a transition metal atom, which we will class as simple diatomics; 2. Transition metal containing diatomics such as TiO; 3. Polyatomic molecules.
of experimental data [81] or use of empirical energy levels and calculated, ab initio, dipole moments and hence transition intensities [82]. It is also possible to generate such line lists by direct solution of the nuclear motion Schrödinger equation [83,84] for a given potential energy curve and dipole moment function [85]. This means that while there are still simple diatomics for which line lists are needed, it should be possible to generate them in a reasonably straightforward fashion.
When the diatomic contains a transition metal, things are much less straightforward [86,87].
[Figure panel labels: Potential energy curves; Dipole moments; Spin-orbit coupling; Angular momentum]
systems. Furthermore, the many low-lying electronic states are often strongly coupled and interact, which makes it difficult to construct robust models of the experimental data. From a theoretical perspective, the construction of reliable potential energy curves and dipole moment functions remains difficult with currently available ab initio electronic structure methods [86,87]. The result is that even for important systems such as TiO [91], well-used line lists [92,93] are known to be inadequate [17].
For polyatomic molecules there have been some attempts to construct line lists directly from experiment, for example for ammonia [94,95] and methane [96,97]. However, this process is difficult and can suffer from problems with both completeness [98] and the correct inclusion of temperature dependence.
The main means of constructing line lists for these systems has therefore been variational nuclear motion calculations.
There are three groups who are systematically producing extensive theoretical line lists of key astronomical molecules. These are the NASA Ames group of Huang, Lee and Schwenke [99,100], the Reims group of Tyuterev, Nikitin and Rey who are running the TheoReTS project [101] and our own ExoMol project [102,103]. While there are differences in detail, the methodologies used by these three groups are broadly similar. Intercomparisons for molecules such as SO2, CO2 and CH4, discussed below, are generally characterized by good overall agreement between the line lists presented by different groups, with completeness and coverage being the main features that distinguish them. Thus, for example, both the TheoReTS and ExoMol groups pointed out that the 2012 edition of the HITRAN database [104] contained a spurious feature due to methane near 11 µm [105,106], which led to its removal in the 2016 release of HITRAN [107].
procedure is well established [10] in that for all but a small number of systems with very few electrons [109,110,111], the PES used is spectroscopically determined. That is, an initial high-accuracy ab initio PES is systematically adjusted until it reproduces observed spectra as accurately as possible. Conversely, all the evidence suggests that the use of a purely ab initio DMS gives better results than attempts to fit this empirically [112,113,114].
The PE, SO, EAM and (T)DM surfaces are usually interpolated by appropriate analytical representations to be used as an input for the nuclear motion program. The quality of the PES (as well as of the coupling curves) is improved a priori by refining the corresponding expansion parameters by comparison with laboratory high resolution spectroscopic data. This refinement, particularly of PESs, using spectroscopic data is now a well-developed procedure pursued by many groups. For example, the Ames group have provided a number of highly accurate PESs for small molecules based on very extensive refinement [115,116,117] starting from initial, high accuracy, ab initio electronic structure calculations. Our own preference is to constrain such fits to remain close to the original ab initio PES [118]; this has the benefit of forcing the surface to remain physically correct in regions not well-characterized experimentally. Such regions are often important for calculations of extensive, hot line lists. Further discussion of the methods used to refine PESs can be found in Ref. [10].
Our computational tools include the variational nuclear-motion programs Duo [84], DVR3D [119], and TROVE [120,121] which calculate the rovibrational energies, eigenfunctions, and transition dipoles for diatomic, triatomic and larger polyatomic molecules, respectively. These programs have proved capable of producing accurate spectra for high rotational excitations and thus for high-temperature applications. All these codes have been adapted to face the heavy demands of computing very large line lists [122] and are available as freeware.
Duo was recently developed especially for treating open-shell systems of astrophysical importance [90,123,124,125]. To our knowledge Duo is currently the only code capable of generating spectra for general diatomic molecules with an arbitrary number and complexity of couplings.
DVR3D [119] was used to produce line lists for several key triatomics, including H2S, SO2, H2O, CO2 and HCN [126,56,127,128,129,130,131,132]. DVR3D is capable of treating ro-vibrational states up to dissociation and above [133]. A new version appropriate for the calculation of fully rotationally resolved electronic spectra of triatomic species has just been developed and tested for the X-A band in SO2 [134].
TROVE is a general polyatomic code that has been used to generate line lists for hot NH3, PH3, H2CO, HOOH, SO3 and CH4 [135,136,137,138,139,140,141]. Intensities in TROVE are computed using the new code GAIN [142], which was written and adapted for graphics processing units (GPUs) to compute Einstein coefficients (or oscillator strengths) and integrated absorption coefficients for all individual rotation-vibration transitions at different temperatures. Given the huge number of transitions anticipated to be important at elevated temperatures, the use of GPUs provides a substantial advantage.
However TROVE requires special adaptation [143] to treat linear molecules such as the astronomically important acetylene (HCCH).
An alternative theoretical procedure has been used by Tashkun and Perevalov from Tomsk. Their methodology uses effective Hamiltonian fits to experimental data for both energy levels and transition dipoles. This group has provided high-temperature line lists for the linear CO2 [144] molecule and the NO2 [145] system. This methodology reproduces the positions of observed lines to much higher accuracy than the variational procedure but generally extrapolates less well for transitions involving states which are outside the range of those that have been observed in the laboratory. In particular, comparisons with high-resolution transmission measurements of CO2 at high temperatures for industrial applications suggest that indeed the CDSD-4000 CO2 line list loses accuracy at higher temperatures. We note that the Ames group have produced variational line lists for CO2 designed to be valid up to 1500 K [146] and 4000 K [147].
As mentioned above, a disadvantage of the use of variational nuclear motion calculations is that the transition frequencies are rarely predicted with spectroscopic accuracy. One method of rectifying this problem is by use of the MARVEL (measured active rotational-vibrational energy levels) procedure [148,149]. The MARVEL procedure inverts the measured transition frequencies to provide energy levels from which not only can the original transition frequencies be regenerated but all other transitions linking these states can also be obtained with experimental accuracy. However, the MARVEL procedure does not provide any information on levels which have yet to be observed experimentally. MARVEL datasets of energy levels are available for a range of astronomically important molecules including water [150,151], H3+ [152,153], NH3 [154,155], C2 [156], TiO [157] and HCCH [158]. In particular, the energy levels and transition frequencies from the analysis of TiO spectra should provide the high-resolution transition frequencies needed to allow the detection of TiO in exoplanets using high-dispersion spectroscopy, for which previously available laboratory data were not precise enough [17].
Indeed this analysis pointed to a number of issues with previous analyses of observed TiO spectra and to significant shifts in transition frequencies compared to those provided by the currently available line lists [92,93].
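The idea of inverting transition frequencies to energy levels can be sketched with a toy network of four levels. This is a simplified, unweighted least-squares version solved by repeated sweeps; the actual MARVEL procedure works with measurement uncertainties and far larger spectroscopic networks, and the function name, level labels and wavenumbers below are all invented for illustration.

```python
def invert_transitions(transitions, ground, sweeps=1000, tol=1e-10):
    """Estimate level energies from measured transition wavenumbers.

    transitions: list of (upper_label, lower_label, wavenumber) tuples.
    ground:      label of the level whose energy is fixed at zero.
    Each sweep sets every level to the mean of the values implied by its
    measured transitions (Gauss-Seidel on the least-squares equations).
    """
    links = {}
    for upper, lower, nu in transitions:
        links.setdefault(upper, []).append((lower, +nu))  # E_upper = E_lower + nu
        links.setdefault(lower, []).append((upper, -nu))  # E_lower = E_upper - nu
    energies = {level: 0.0 for level in links}
    for _ in range(sweeps):
        biggest_change = 0.0
        for level, neighbours in links.items():
            if level == ground:
                continue  # the reference level stays at zero
            new = sum(energies[other] + offset for other, offset in neighbours)
            new /= len(neighbours)
            biggest_change = max(biggest_change, abs(new - energies[level]))
            energies[level] = new
        if biggest_change < tol:
            break
    return energies


if __name__ == "__main__":
    # Invented measurements (cm^-1) with small inconsistencies between them.
    measured = [("b", "a", 100.02), ("c", "a", 250.00),
                ("d", "b", 299.98), ("d", "c", 150.02)]
    levels = invert_transitions(measured, ground="a")
    for label in sorted(levels):
        print(f"E({label}) = {levels[label]:.4f} cm^-1")
    # The c <- b transition was never measured but links two known levels.
    print(f"predicted c <- b: {levels['c'] - levels['b']:.4f} cm^-1")
```

The last line shows the point made in the text: a transition that was not measured is obtained from the inverted levels, while a level that appears in no measured transition cannot be recovered at all.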
The MARVEL energy levels can also be used to replace computed ones in line lists. This has already been done for several line lists [159,111,160]. This process is facilitated by the ExoMol data structure [161,103] which does not store transition frequencies but instead computes them from a states file containing all the energy levels. This allows changes of the energy levels at the end of the calculation or even some time later [162,163] should improved energy levels become available.
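A minimal sketch of why this layout makes such updates cheap is given below. It uses a much reduced version of the two-file idea, with a states table (identifier, energy, degeneracy, J) and a transitions table (upper identifier, lower identifier, Einstein A); real ExoMol files carry additional columns such as quantum numbers, and all numbers and function names here are invented for illustration.

```python
# Toy states table: ID, energy (cm^-1), total degeneracy, J.
STATES = """\
1     0.000000   1  0
2  1000.000000   3  1
3  2050.000000   5  2
"""

# Toy transitions table: upper ID, lower ID, Einstein A (s^-1).
TRANS = """\
2 1 1.5e+00
3 2 2.0e+00
"""


def read_states(text):
    """Parse the states table into {id: {"energy": ..., "g": ...}}."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        states[int(fields[0])] = {"energy": float(fields[1]), "g": int(fields[2])}
    return states


def read_transitions(text):
    """Parse the transitions table into (upper_id, lower_id, A) tuples."""
    rows = (line.split() for line in text.splitlines())
    return [(int(up), int(lo), float(a)) for up, lo, a in rows]


def wavenumbers(states, transitions):
    """Transition wavenumbers are derived from the energies, never stored."""
    return [states[up]["energy"] - states[lo]["energy"] for up, lo, _ in transitions]


if __name__ == "__main__":
    states = read_states(STATES)
    transitions = read_transitions(TRANS)
    print("computed energies: ", wavenumbers(states, transitions))
    # Replace one computed energy by an (invented) empirical value.
    states[3]["energy"] = 2049.5
    print("after substitution:", wavenumbers(states, transitions))
```

Only one number in the states table changes, yet every transition that starts or ends on that level picks up the corrected frequency; the transitions table itself is untouched.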
The polyatomic molecules discussed above are all closed shell species.
However the open shell species PO2, mentioned above, and CaOH are thought to be important for hot atmospheres [164]. There have been a number of variational nuclear motion calculations on the spectra of open shell triatomic systems [165,166,167,168,169,170], largely based on the use of Jensen's MORBID approach [171]. However, we are unaware of any extensive line lists being produced for such systems. The extended version of DVR3D [134] mentioned above should, in due course, be applicable to these problems.
[63]. However, the ab initio calculation of good dipole curves is always essential since these are not in general tuned to observation.
The ExoMol line lists are prepared so that they can easily be incorporated in radiative transfer codes [103]. For example, these data are directly incorporated into the UCL Tau-REx retrieval code [172,173,174], a radiative transfer model for transmission, emission and reflection spectroscopy from ultraviolet to infrared wavelengths, able to simulate gaseous and terrestrial exoplanets at any temperature and composition. Tau-REx uses the line lists from ExoMol, as well as HITEMP [175] and HITRAN [104], with clouds of different particle sizes and distributions, to model transmission, emission and reflection of the radiation from a parent star through the atmosphere of an orbiting planet. This allows estimates of the abundances of absorbing molecules in the atmosphere, by running the code for a variety of hypothesised compositions and comparing to any available observations. Tau-REx is mostly based on the opacities produced by ExoMol, with the ultimate goal of building a library of sophisticated atmospheres of exoplanets which will be made available to the open community together with the codes. These models will enable the interpretation of exoplanet spectra obtained with future new facilities from space [176,177] and the ground (VLT-SPHERE, E-ELT, JWST).
Of course there are a number of other models for exoplanets and similar objects which rely on spectroscopic data as part of their inputs. These include modelling codes such as NEMESIS [178], BART [179], CHIMERA [180] and a recent adaptation of the UK Met Office global circulation model (GCM) called ENDGame [181,182]. More general models such as VSTAR [183] are designed to be applied to spectra of planets, brown dwarfs and cool stars.
The well-used BT-Settl brown-dwarf model [184,185] can also be used for exoplanets. There are a variety of other brown dwarf [186] and cool star models [187,188,189]. These are largely concerned with hydrogen-rich atmospheres which are, of course, characteristic of hot Jupiter and hot Neptune exoplanets, brown dwarfs and stars.
Besides direct input to models, line lists are used to provide opacity functions [190,191,64,192,193] whose reliability is well-known to be limited by the availability of good underlying spectroscopic data [194]. Cooling functions for key molecules are also important for the description of atmospheric processes in hot rocky objects. These functions are straightforward to compute from a comprehensive line list [195]; this involves computation of integrated emissivities from all lines on a grid of temperatures typically ranging from 0 to 5000 K.
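The computation described above can be sketched as follows, using one common definition of the cooling function, W(T) = [1 / (4 π Q(T))] Σ A h c ν̃ g' exp(-c2 E'/T), where the sum runs over all emission lines, E' and g' refer to the upper state and Q(T) = Σ g exp(-c2 E/T) is the partition function. The three-level system, its Einstein coefficients and the function names are invented for illustration, and the temperature grid starts above 0 K because the expressions are undefined at T = 0.

```python
import math

H_ERG_S = 6.62607015e-27  # Planck constant, erg s
C_CM_S = 2.99792458e10    # speed of light, cm/s
C2 = 1.438776877          # second radiation constant hc/k, cm K


def partition_function(levels, temperature):
    """Q(T) = sum over levels of g * exp(-c2 * E / T)."""
    return sum(g * math.exp(-C2 * e / temperature) for e, g in levels)


def cooling_function(levels, lines, temperature):
    """Total emitted power per molecule per steradian (erg s^-1 sr^-1).

    levels: list of (energy in cm^-1, degeneracy); list index is the level ID.
    lines:  list of (upper ID, lower ID, Einstein A in s^-1).
    """
    total = 0.0
    for upper, lower, a_coeff in lines:
        e_up, g_up = levels[upper]
        e_lo, _ = levels[lower]
        photon_energy = H_ERG_S * C_CM_S * (e_up - e_lo)  # erg per photon
        total += a_coeff * photon_energy * g_up * math.exp(-C2 * e_up / temperature)
    return total / (4.0 * math.pi * partition_function(levels, temperature))


if __name__ == "__main__":
    # Invented three-level system, for illustration only.
    levels = [(0.0, 1), (1000.0, 3), (2050.0, 5)]
    lines = [(1, 0, 1.5), (2, 1, 2.0)]
    for temperature in range(500, 5001, 500):
        w = cooling_function(levels, lines, float(temperature))
        print(f"T = {temperature:5d} K: W = {w:.3e} erg s^-1 sr^-1")
```

With a real line list the same two sums simply run over millions of states and billions of transitions, which is why completeness of the list matters more here than the accuracy of any individual line position.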

Available spectroscopic data
Spectroscopic studies of the Earth's atmosphere are supported by extensive and constantly updated databases largely comprising experimental laboratory data [104,196]. Thus for Earth-like planets, by which we mean rocky exoplanets with an atmospheric temperature below 350 K, the HITRAN database [107] makes a good starting point. However, at higher temperatures datasets designed for room temperature studies rapidly become seriously incomplete [197], leading to both very significant loss of opacity and incorrect band shapes. The strong temperature dependence of the various molecular absorption spectra is illustrated in figures given throughout this review which compare simulated absorption spectra at 300 and 2000 K for key species.
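The reason room-temperature datasets become incomplete can be seen from a simple population estimate. For a single non-degenerate harmonic vibrational mode of wavenumber ν̃ the fraction of molecules in vibrationally excited states (v ≥ 1) is exp(-c2 ν̃/T), since the populations form a geometric series. The sketch below evaluates this for a hypothetical mode at 1000 cm^-1; the mode and the function name are assumptions made for the example and do not describe a particular molecule.

```python
import math

C2 = 1.438776877  # second radiation constant hc/k, cm K


def excited_fraction(mode_cm1, temperature):
    """Fraction of molecules with v >= 1 in a non-degenerate harmonic mode.

    Populations are P(v) = (1 - x) * x**v with x = exp(-c2 * nu / T),
    so the sum over v >= 1 is simply x.
    """
    return math.exp(-C2 * mode_cm1 / temperature)


if __name__ == "__main__":
    for temperature in (300.0, 2000.0):
        fraction = excited_fraction(1000.0, temperature)
        print(f"T = {temperature:6.0f} K: {100.0 * fraction:6.2f} % in excited states")
```

In this toy case under 1% of molecules are vibrationally excited at 300 K, against nearly half at 2000 K, so hot bands that a room-temperature compilation can safely omit carry a large share of the absorption in a hot atmosphere.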
HITRAN's sister database, HITEMP, was developed to address the problem of high temperature spectra. However the latest release of HITEMP [175] only contains data on five molecules, namely CO, NO, O2, CO2 and H2O. For all these species there are more recent hot line lists available which improve on the ones presented in HITEMP. These line lists are summarised in Table 2 below. Table 1 gives a summary of species suggested by the chemistry models as
Such data, when available, will be important for the interpretation of present and future exoplanet spectroscopic observations. Below we consider the status of spectroscopic data for key molecules in turn.
H2O: As discussed above, water is the key molecule in the atmospheres of rocky super-Earths. There are a number of published water line lists available for modelling hot objects [212,213,214,215,216,217,218,175]. Of these the most widely used are the Ames line list of Partridge and Schwenke [215], or variants based on it, and the BT2 line list [218], which provided the basis for water in the HITEMP database [175] and the widely-used BT-Settl brown dwarf model [219]. The Ames line list is more accurate than BT2 at infrared wavelengths but less complete, meaning that it is less good at modelling hotter objects. Recently Polyansky et al. have computed the POKAZaTEL line list [131], which is both more accurate and more complete than either of these. We recommend the use of this line list, which is illustrated in Fig. 7, in future studies.
up to 1500 K [146]. Recent work on CO2 has improved computed transition intensities to the point where they are as accurate as the measured ones [114,127]; this suggests that there is scope for further improvement in hot line lists for this system; some work in this direction has recently been undertaken by Huang et al. [147]. Fig. 8 illustrates the temperature-dependence of the CO2 absorption spectrum in the infrared.
CH4: Methane is an important system in carbon-rich atmospheres and the construction of hot methane line lists has been the subject of intense recent study by a number of groups both theoretically [221,222,223,224,225,226,227,228,197,106,105] and experimentally [96,97]. The most complete line lists currently available are our 10to10 line list [106], which is very extensive but only valid below 1500 K, and the Reims line list [105], which spans a reduced wavelength range but is complete up to 2000 K. In fact we extended 10to10 to higher temperatures some time ago but the result is a list of 34 billion lines which is unwieldy to use. We have therefore been working on data compaction techniques based on the use of either background, pressure-independent cross sections [97] or super-lines [101]. This line list will be released shortly [141]. Figure 9 illustrates the temperature-dependence of the methane absorption spectrum in the infrared. The strongest bands are at 3.7 and 7.7 µm.
suggesting that more work is required on the SO3 dipole moment.
NH 3 : Ammonia has a very prominent absorption feature at about 10 µm.
Extensive line lists for ammonia are available [231,135]. The BYTe line list [135], which was explicitly designed with the needs of exoplanet spectroscopy in mind, has been used to model spectra of brown dwarfs [183,232,233].
However, BYTe loses accuracy in the near infrared. Rather old room-temperature laboratory measurements of ammonia have recently been assigned [234,235]. These data, plus an improved ab initio treatment of the problem [236] and a MARVEL analysis leading to a set of accurate, empirical energy levels [154,155], will form the basis of a new line list which will both extend the range and improve on the accuracy of BYTe. Fig. 10 illustrates the absorption spectra of ammonia at T = 300 K and 2000 K. The strongest and most prominent feature is at 10 µm.
H 2 S: The main source of H 2 S emission on Earth is life [237]. It has, however, been ruled out as a potential biosignature in atmospheres of exoplanets [238]. H 2 S is also generated by volcanism. Fig. 11 illustrates the absorption spectra of H 2 S at T = 300 K and 2000 K based on the AYT2 line list [126].
using variational nuclear motion calculations [239,132]. Indeed the first of these line lists was the basis of a ground-breaking study by Jørgensen et al. [240], which showed that use of a comprehensive HCN line list in a model atmosphere of a 'cool' carbon star made a huge difference: extending the model of the atmosphere by a factor of 5, and lowering the gas pressure in the surface layers by one or two orders of magnitude. The line list created and used by Jørgensen and co-workers [239,240] only considered HCN. However, HCN is a classic isomerizing system and the HNC isomer should be thermally populated at temperatures above about 2000 K [241,242]. More recent line lists [132,162,210,163] consider both HCN and HNC together. All these line lists are based on the use of ab initio rather than spectroscopically-determined PESs, which can lead to significant errors in the predicted transition frequencies [243]. However, the most recent line list, due to Barber et al. [163], used very extensive sets of experimental energy levels obtained by Mellau for both hot HCN and hot HNC [244,245] to improve predicted frequencies to, essentially, experimental accuracy. This line list was used for the recent, tentative detection of HCN on super-Earth 55 Cancri e [38]. The line list of Barber et al. [163] is illustrated in Fig. 12.
CO: From a spectroscopic perspective, CO is the most important diatomic species in a whole range of hot atmospheres, from warm exoplanets to cool stars. Li et al. [202] recently produced comprehensive line lists for the nine main isotopologues of CO. Figure 13 illustrates the absorption spectrum of the main isotopologue, 12 C 16 O.
SiO: Figure 3 illustrates the absorption spectrum of the SiO molecule. SiO is well known in sunspots [246] and is thought likely to be an important constituent of the atmosphere of hot rocky super-Earths. An IR line list for SiO is available from ExoMol [63] and a less accurate UV line list is provided by Kurucz [64].
Line lists are available for both NaCl and KCl [206], see Fig. 15.
Figure 15: Infrared absorption spectra of NaCl (upper) and KCl (lower) at T = 300 K and 2000 K simulated using the ExoMol line list [206].
There are a number of systems which have been identified as likely to be present in the atmospheres of hot rocky super-Earths for which there are no available line lists. Indeed for most of these species, which include NaOH, KOH, SiO 2 , MgO, PO 2 , Mg(OH) 2 , SO, ZnS (see Table 1), there is little accurate spectroscopic data of any sort. Clearly these systems will be targets of future study.
Probably the most important polyatomic molecule, at least for exoplanet and cool star research, for which there is still not a comprehensive hot line list is acetylene (HCCH). Acetylene is a linear molecule for which variational calculations are possible [247,248] and an extensive effective Hamiltonian fit is available [249]. One would therefore expect such a line list to be provided shortly.

Other considerations
All the discussion above has concentrated very firmly on line spectra.
However there are a number of issues which need to be considered when simulating or interpreting exoplanet spectra [251]. A discussion of procedures for this is given in Chapter 5 of the recent book by Heng [250]. General codes, such as HELIOS [251,252] and our own ExoCross [253], are available for taking appropriate line lists and creating inputs suitable for radiative transfer codes.
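As an illustration of the kind of first step involved in turning a line list into radiative transfer input, the following sketch converts line data of the type stored in a line list (Einstein A coefficient, upper-state degeneracy, lower-state energy) into a temperature-dependent absorption line intensity using the standard expression for local thermodynamic equilibrium. The function names and the three-level toy system are assumptions made for illustration only; this is not the ExoCross or HELIOS implementation.

```python
import math

C_CM_PER_S = 2.99792458e10   # speed of light, cm/s
C2_CM_K = 1.438776877        # second radiation constant hc/k, cm K


def partition_function(levels, temperature):
    """Sum g_i * exp(-c2 * E_i / T) over (degeneracy, energy_cm1) pairs."""
    return sum(g * math.exp(-C2_CM_K * e / temperature) for g, e in levels)


def line_intensity(a_coeff, g_upper, e_lower, nu, temperature, q):
    """Absorption line intensity in cm/molecule, assuming LTE.

    a_coeff: Einstein A coefficient (1/s); g_upper: upper-state degeneracy;
    e_lower: lower-state energy (cm-1); nu: transition wavenumber (cm-1);
    q: partition function at the same temperature.
    """
    boltzmann = math.exp(-C2_CM_K * e_lower / temperature)
    stimulated = 1.0 - math.exp(-C2_CM_K * nu / temperature)
    prefactor = g_upper * a_coeff / (8.0 * math.pi * C_CM_PER_S * nu ** 2)
    return prefactor * boltzmann * stimulated / q


# Hypothetical three-level system: (degeneracy, energy in cm-1).
TOY_LEVELS = [(1, 0.0), (1, 3000.0), (1, 4000.0)]


def hot_band_intensity(temperature):
    """Intensity of the toy 3000 -> 4000 cm-1 hot-band line."""
    q = partition_function(TOY_LEVELS, temperature)
    return line_intensity(a_coeff=1.0, g_upper=1, e_lower=3000.0,
                          nu=1000.0, temperature=temperature, q=q)
```

Comparing `hot_band_intensity(300.0)` with `hot_band_intensity(2000.0)` shows why hot line lists must be so much larger than room-temperature ones: a transition starting from an excited level is negligible at 300 K but gains several orders of magnitude in intensity at 2000 K.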
The first issue to be considered is the shape of the individual spectral lines. Lines are Doppler broadened with temperature due to the thermal motion of the molecules and broadened by pressure due to collisional effects.
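The thermal contribution can be made concrete with a short sketch of the standard Doppler half-width at half-maximum, which scales with the transition wavenumber and with the square root of T/m. The function name and the example values (a water line near 3000 cm-1) are illustrative assumptions.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
C_M_PER_S = 2.99792458e8    # speed of light, m/s
AMU = 1.66053906660e-27     # atomic mass unit, kg


def doppler_hwhm(nu, temperature, mass_amu):
    """Doppler half-width at half-maximum, in the same units as nu."""
    mass = mass_amu * AMU
    return nu * math.sqrt(2.0 * math.log(2.0) * K_B * temperature
                          / (mass * C_M_PER_S ** 2))


# Illustrative values: main water isotopologue (about 18.0106 u), line at 3000 cm-1.
WIDTH_300K = doppler_hwhm(3000.0, 300.0, 18.0106)
WIDTH_2000K = doppler_hwhm(3000.0, 2000.0, 18.0106)
```

For these inputs the half-width is roughly 0.004 cm-1 at 300 K and grows by a factor of sqrt(2000/300), about 2.6, at 2000 K.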
While the total absorption by an optically thin line is conserved as a function of temperature and pressure, this is not true for optically thick lines. For these lines use of an appropriate line profile can have a dramatic effect [254,255].
The nature of primary transit spectra, where the starlight has a long pathlength through the limb of the exoplanet atmosphere, is good for maximizing sensitivity but also maximizes the likelihood of lines being saturated. This means that it is important to consider line profiles when constructing line lists for exoplanet studies.
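The difference between optically thin and saturated lines can be checked numerically. The sketch below integrates the absorbed fraction, 1 - exp(-tau), over a single Gaussian line whose integrated optical depth is held fixed while its width is doubled. The Gaussian profile, the function names and the numerical values are illustrative assumptions rather than a model of any particular atmosphere.

```python
import math


def absorbed_area(integrated_tau, width, n_points=8001, span=12.0):
    """Integrate 1 - exp(-tau(nu)) over a Gaussian line (trapezoidal rule).

    integrated_tau: integral of the optical depth over wavenumber (cm-1).
    width: Gaussian 1/e half-width of the line (cm-1).
    """
    peak_tau = integrated_tau / (width * math.sqrt(math.pi))
    x_max = span * width
    step = 2.0 * x_max / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        nu = -x_max + i * step
        tau = peak_tau * math.exp(-(nu / width) ** 2)
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * (1.0 - math.exp(-tau)) * step
    return total


def broadening_gain(integrated_tau, width=0.01):
    """Ratio of absorbed area when the width is doubled at fixed strength."""
    return (absorbed_area(integrated_tau, 2.0 * width)
            / absorbed_area(integrated_tau, width))
```

For a weak line, for example `broadening_gain(1e-4)`, the ratio stays close to one, so the profile hardly matters. For a strongly saturated line, for example `broadening_gain(10.0)`, doubling the width increases the absorbed area substantially, which is why the choice of line profile matters for long pathlengths.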
While it is straightforward to include the thermal effects via the Doppler profile, pressure effects in principle depend on the collision partners and the transition concerned. Furthermore, there has been comparatively little work on how pressure broadening behaves at high temperatures [256]. Studies are beginning to consider broadening appropriate to exoplanet atmospheres [257,258,259,260]. However, thus far these studies have concentrated almost exclusively on pressure effects in hot Jupiter exoplanets, which means that molecular hydrogen and helium have been the collision partners considered. The atmospheres of hot rocky super-Earths are likely to be heavy, meaning that pressure broadening will be important. Clearly there is work to be done developing appropriate pressure-broadening parameters for the atmospheres of these planets. We note, however, that line broadening parameters appropriate for studies of the atmosphere of Venus are starting to become available, largely on the basis of theory [261,262,263,264].
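The dependence on collision partner can be sketched with the commonly used power-law scaling of the Lorentzian half-width, in which each broadener contributes a reference width scaled by (T_ref/T)^n and by pressure, weighted by its mole fraction. The function name and all broadening parameters in the example mixture are hypothetical placeholders, not measured or recommended values.

```python
def lorentz_hwhm(broadeners, temperature, pressure, t_ref=296.0, p_ref=1.0):
    """Pressure-broadened Lorentzian half-width (cm-1) for a gas mixture.

    broadeners: iterable of (mole_fraction, gamma_ref, n) tuples, where
    gamma_ref is the half-width (cm-1) at t_ref and p_ref for that collision
    partner and n is its temperature exponent.
    temperature in K; pressure in the same units as p_ref.
    """
    return sum(
        x * gamma_ref * (t_ref / temperature) ** n * (pressure / p_ref)
        for x, gamma_ref, n in broadeners
    )


# Hypothetical steam-dominated mixture: (mole fraction, gamma_ref, n).
STEAM_RICH = [(0.9, 0.35, 0.6), (0.1, 0.10, 0.7)]
GAMMA_HOT = lorentz_hwhm(STEAM_RICH, temperature=2000.0, pressure=10.0)
```

The sketch makes explicit what data are needed: a reference width and a temperature exponent for every relevant collision partner, which is exactly the information that is still largely missing for the heavy, hot atmospheres discussed here.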
Besides broadening, it is also necessary to consider collision induced absorption (CIA) in regions where there are no spectral lines. On Earth it is known that the so-called water continuum makes an important contribution to atmospheric absorption [265]. Similarly, CIA by H 2 is well known to be important in hydrogen atmospheres [266]. CIA has also been detected involving K-H 2 collisions [267]. What CIA processes are important in lava planets is at present uncertain.
Finally it is well-known that the spectra of many (hot Jupiter) exoplanets are devoid of significant features, at least in the NIR [53,71]. It is thought that this is due to some mixture of clouds and aerosols, often described as hazes. Such features are likely to also form in the atmospheres of rocky exoplanets. It remains unclear precisely what effect these will have on the resulting observable spectra of the planet.

Conclusions
To conclude, the atmospheres of hot super-Earths are likely to be spectroscopically very different from those of other types of exoplanets, such as cold super-Earths or gas giants, due to both the elevated temperatures and the different atmospheric constituents. This means that a range of other species, apart from the usual H 2 O, CH 4 , CO 2 and CO, must also be taken into consideration. A particularly interesting molecule that is likely to feature in atmospheric retrievals is SO 2 . Detection of SO 2 could be used to differentiate super-Venus exoplanets from the broad class of super-Earths. A comprehensive line list for SO 2 is already available [56]. SiO, on the other hand, is a signature of a rocky object with potentially detectable IR and UV spectral features. Another interesting species is ZnS, which can be used to differentiate clouds and hazes. At present there is no comprehensive line list for ZnS to inform this procedure.
Models of hot super-Earths suggest that these exoplanets appear to resemble the early Earth in many of their properties. An extensive literature exists on the subject of the early Earth, which can be used as a basis for accurate prediction of the properties of the hot rocky exoplanets. Super-Earths also provide a potential testbed for atmospheric models of the early Earth which, of course, are not amenable to direct observational tests. Post-impact planets may also be very similar in chemistry and spectroscopy.
From different studies of the chemistry and spectroscopy of hot super-Earths we have identified a set of molecules suggested either as potential trace species or sources of opacities for these objects. The line lists for a significant number of these species are either missing or incomplete. Our plan is to systematically create line lists for these key missing molecules and include them in the ExoMol database.

Acknowledgments
We thank Giovanna Tinetti, Ingo Waldmann and the members of the ExoMol team for many fruitful discussions, and Laura Schaefer for providing a