On the Chemical Abundance of HR 8799 and the Planet c

Comparing chemical abundances of a planet and the host star reveals the origin and formation path. Stellar abundance is measured with high-resolution spectroscopy. Planet abundance, on the other hand, is usually inferred from low-resolution data. For directly imaged exoplanets, the data are available from a slew of high-contrast imaging/spectroscopy instruments. Here, we study the chemical abundance of HR 8799 and its planet c. We measure stellar abundance using LBT/PEPSI (R=120,000) and archival HARPS data: stellar [C/H], [O/H], and C/O are 0.11$\pm$0.12, 0.12$\pm$0.14, and 0.54$^{+0.12}_{-0.09}$, all consistent with solar values. We conduct atmospheric retrieval using newly obtained Subaru/CHARIS data together with archival Gemini/GPI and Keck/OSIRIS data. We model the planet spectrum with petitRADTRANS and conduct retrieval using PyMultiNest. Retrieved planetary abundance can vary by $\sim$0.5 dex, from sub-stellar to stellar C and O abundances. The variation depends on whether strong priors are chosen to ensure a reasonable planet mass. Moreover, comparison with previous works also reveals inconsistency in abundance measurements. We discuss potential issues that can cause the inconsistency, e.g., systematics in individual data sets and different assumptions in the physics and chemistry in retrieval. We conclude that no robust retrieval can be obtained unless the issues are fully resolved.

mass. Moreover, comparison with previous works also reveals inconsistency in abundance measurements. We discuss potential issues that can cause the inconsistency, e.g., systematics in individual data sets and different assumptions in the physics and chemistry in retrieval. We conclude that no robust retrieval can be obtained unless the issues are fully resolved.

INTRODUCTION
HR 8799 bcde remain one of only a few multi-planet systems that have been directly imaged (Marois et al. 2008(Marois et al. , 2010, the other two systems being PDS 70 b and c (Keppler et al. 2018;Haffert et al. 2019) and β Pic b (Lagrange et al. 2010) and c (Lagrange et al. 2019). The HR 8799 system has been extensively studied previously. The astrometry of the four planets are measured to a few milliarcsecond precision (e.g., Konopacky et al. 2016;Wertz et al. 2017). The atmospheres of planets in this system have been studied by multi-band photometry and low-resolution spectroscopy (e.g., Skemer et al. 2014;Bonnefoy et al. 2016;Zurlo et al. 2016;Lavie et al. 2017).
Planet c is the most amenable for currentgeneration high contrast instruments for its favorable separation and planet-star flux ratio. Planet d and e are so close to the host star so that speckle noise significantly compromises data quality. The planet-star flux ratio of planet b is lower than that of c despite a larger separation.
Consequently, rich data sets have been obtained for HR 8799 c, including integral field spectroscopy (IFS) data from Palomar/P1640 (Oppenheimer et al. 2013), Keck/OSIRIS (Konopacky et al. 2013), and Gemini-S/GPI (Ingraham et al. 2014;Greenbaum et al. 2018). Readers are referred to Bonnefoy et al. (2016); Zurlo et al. (2016) and subsequent references (Greenbaum et al. 2018;Wang et al. 2018a;Ruffio et al. 2019;Gravity Collaboration et al. 2019; Petit dit de la Roche et al. 2020) for a summary of observations of HR 8799 * 51 Pegasi b Fellow planets. VLT/SPHERE has a too small a field of view (FOV) for IFS data for HR 8799 c. At high spectral resolution, Wang et al. (2018a) obtained L-band Keck/NIRSPEC (R=15,000) data and detected water in the atmosphere of HR 8799 c. In the near-term future, it is expected that high quality data will come from JWST and new-generation high-contrast imaging and spectroscopy instruments (e.g., Mawet et al. 2017;Currie et al. 2018;Gravity Collaboration et al. 2019).
Spectral analysis and atmospheric retrieval have been applied to HR 8799 c. Multiple lines of evidence suggest that the atmosphere is cloudy and not in chemical equilibrium. Evidence from photometric data includes the suppressed J band flux and CH 4 narrow-band absorption in L band (Skemer et al. 2014). Spectral fitting and atmospheric retrieval also suggest a cloudy atmosphere and significant vertical mixing that causes chemical disequilibrium (Madhusudhan et al. 2011;Konopacky et al. 2016;Lavie et al. 2017). The C/O ratio has been constrained to inform the formation and accretion history of the planet (Konopacky et al. 2013;Lavie et al. 2017).
Despite much progress on abundance measurements, there are two outstanding questions. First, how well can we constrain the stellar abundance? It is the comparison between stellar and planetary abundances that reveals the pathway of planet formation. Previous works used stellar abundances from Sadakane (2006), which is based on data of spectral resolution of 42,000. Can the abundance measurements be improved and the associated uncertainties be properly accounted for?
Second, how can we access the robustness of atmospheric retrieval and its implications to planet formation? More specifically, how do we develop a retrieval framework that combines multiple data sets, how do we interpret the discrepancy between different data sets, and how do we properly set priors to ensure a physically and chemically sensible retrieval?
To improve planet atmospheric retrieval, we supplemented new IFS data to existing Gemini/GPI and Keck/OSIRIS data. Our new data is from the Coronagraphic High Angular Resolution Imaging Spectrograph (CHARIS) that simultaneously covers J, H, and K band (Peters et al. 2012).
The paper is organized as follows. §2 summarizes the observational data. §3 describes stellar abundance measurements and results. §4 describes the framework of planet atmosphere retrieval and results are given in §5. Discussion and summary are given in §6 and §7.

DATA
We used high-resolution spectra to determine stellar abundances. For planet atmospheric abundances, we used IFS data sets for HR 8799 c. While photometric data points are valuable and can be easily taken into account in our analysis, it is unclear how systematics in photometry would affect our analysis, e.g., whether the systematics is astrophysical, coming from the instrument, or from data reduction.
We supplemented new observational data to archival data. Below we provide details on the data that we used for stellar and planetary abundance measurements in two categories: new and archival data.  (Strassmeier et al. 2015) were made on November 14, 2019 UT with a 200 µm fiber (R = 120 000) in two spectral regions 4265 -4800Å and 5441 -6278Å (cross-dispersers 2 and 4) with 5 min integration time starting at 02:16:52 UT. The telescope was pointed to the star with binocular mode with PEPSI on one side (right DX) and LBTI on the other one (Left SX). The data were processed as described in Strassmeier et al. (2018) and yielded the maximal signal-to-noise ratio of 340 and 500 in two cross-dispersers.
2.1.2. Subaru/CHARIS Observation of HR 8799 c HR 8799 was observed on two consecutive nights, 2018 September 1 and 2018 September 2 (PIs: Jason Wang and James Graham), using the CHARIS integral field spectrogrpah (Groff et al. 2015 behind the SCExAO adaptive optics system (Jovanovic et al. 2015b). A full analysis of the data will be published in a following paper (Wang et al. in prep), but here we will briefly summarize the observations and reduction relevant for spectroscopy of HR 8799 c. With the broadband mode of CHARIS, we obtained R∼20 integral field spectrscopy of HR 8799 c from 1.1 to 2.4 µm, covering J, H, and K bands simultaneously in 22 spectral channels. In this work, we used data corresponding to 6.45 hours and 6.72 hours of open shutter time data on HR 8799 c from night 1 and night 2 respectively. The raw integral field spectroscopy data was turned into spectral data cubes using the CHARIS data reduction pipeline . Stellar PSF subtraction and spectral extraction was performed using the open-source package pyKLIP (Wang et al. 2015) that implements the forward modeling framework presented in (Pueyo 2016). Uncertainties and biases from the spectral extraction were estimated using simulated planets injected into the data at the same separation as HR 8799 c, but at different azimuthal positions. The data are flux calibrated using artificial satellite spots injected into each exposure (Jovanovic et al. 2015a). From binary calibrators taken during the night, we determined the satellite spots to have a flux ratio relative to the star of 2.64 × 10 −3 × (λ/1.55 µm) −2 , where λ is the wavelength of each spectral channel (Wang et al. in prep).
CHARIS data of two nights were combined in the following way. We first flux-calibrated the data for each night using GPI H and Kband IFS data. Then the average flux of the two nights was taken as the flux. Error bars are the larger of the following: (1) summed quadrature of error bars that were reported by CHARIS data reduction pipeline, and (2)  Data of HR 8799 using the HARPS spectrograph from the 3.6 m ESO telescope were taken in 2009 (PI: Chauvin, ESO runs Program ID: 083.C-0794), with a wavelength coverage of 3800−6900Å and spectral resolution of ∼115,000. The data were reduced by the standard HARPS pipeline. Eight 180s exposures were reduced and combined to yield a one dimensional spectrum for spectral analysis, which has a SNR (per pixel) of ∼300 at 5500Å.

Keck/OSIRIS data for HR 8799 c
OSIRIS K-band IFS data (R∼5000) were reported in Konopacky et al. (2013), in which absorption lines of molecular species such as H 2 O and CO were detected using a cross-correlation technique. The higher spectral resolution allowed for an investigation of atmospheric C/O and probed the formation and accretion history of HR 8799 c (Öberg et al. 2011;Madhusudhan 2012).

Gemini-S/GPI data for HR 8799 c
Newly analyzed GPI data from 1.5 µm to 2.4 µm (R∼45-80) were reported in Greenbaum et al. (2018), where they found significant discrepancy between the newly-analyzed and previously-analyzed data (Ingraham et al. 2014). The discrepancy was attributed to over subtraction in spectral extraction.
3. STELLAR ATMOSPHERIC PARAMETERS AND ABUNDANCES OF HR 8799

Stellar Parameters
We derived the stellar atmospheric parameters of HR 8799 from the PEPSI and HARPS spectra using (1) the spectral analysis package MOOG (Sneden 1973), (2) a line list consisting of 48 Fe I and 18 Fe II lines, and (3) ATLAS9 stellar atmospheric grid models. The equivalent widths (EWs) of the spectral lines were measured using TAME (Kang & Lee 2012), which used a Gaussian profile to fit the absorption lines. This EW technique is based on the iron ionisation equilibrium and excitation equilibrium. We require that (1) Fe I lines and Fe II lines give the same averaged [Fe/H] abundance, (2) no correlations between [Fe I/H] values and reduced EW values,and (3) no correlation between [Fe I/H] and excitation potential of the lines. The errors are estimated using the same method as that in Tabernero et al. (2019). We used [ ] to denote logarithmic abundance with respect to solar value (Asplund et al. 2009).
The initial guesses of the stellar parameters were taken from Sadakane (2006) and then iterated on by slightly changing each parameter until above requirements were met. The final adopted stellar atmospheric parameters are T eff = 7390 ± 80 K, log(g) = 4.35 ± 0.07, [Fe/H] = −0.52 ± 0.08 and = 3.1 ± 0.2 km s −1 , as shown in Table. 1.

C and O Abundances
We then proceeded to derive the abundances of C and O using the HARPS data. Adopting the stellar atmosphere models of Kurucz AT-LAS9 with the above derived atmospheric parameters, the abundances of C and O were derived using the abfind driver from MOOG, results of which are shown in Table. 2. The errors associated with abundance measurements were calculated by accounting for the atmospheric parameter errors presented in Table. 1 and the measurement errors of the EWs. The abundance of C was derived from C I 5052.2, 4771.7, 4932.0, and 5380.3Å features. The abundance of O was derived from the O I triplet feature at 6156´6158Å, with their line values taken from the NIST database.
We adopt the values from Asplund et al. The final adopted [C/H] and [O/H] were 0.11 ± 0.12 dex and 0.12 ± 0.14 dex respectively, where the solar abundance values were taken from Asplund et al. (2009). The ratio of C/O was 0.54 +0.12 −0.09 , where the error from uncertainty of stellar parameters was small because both C and O abundances change in the same direction as the effective temperature of the stellar model changes.
The abundances were estimated based on a LTE model. The non-LTE correction for C and O abundances are small (∼ 0.04 dex) for HR 8799 based on the estimate of Takeda et al. (1999) for late-A type stars.

Comparing to Previous Works
HR 8799 has been identified as a λ Bootis type star (Gray & Kaye 1999). A λ Bootis type star has solar surface abundances for volatile elements such as C, N, O and S, and sub-solar surface abundances of Fe-peak elements. The potential origins of this type of stars are discussed in §6.4, and one explanation could be related to the debris disk (Draper et al. 2016) and planets (Jura 2015) around HR 8799. HR 8799 is given the following stellar parameters: T eff = 7424 K, log(g) = 4.22, and [Fe/H] = −0.5, using a spectrum with a low spectral resolution (∆λ = 1.8Å, Gray et al. 2003).
By using high-resolution spectra from the Elodie spectrograph (R=42,000), Sadakane (2006) derived the stellar atmospheric parameters and chemical abundances of HR 8799. The results are consistent within uncertainties although we used a different set of line lists when deriving the atmospheric parameters. As for the C and O abundances, we found relatively lower values than those reported in Sadakane (2006). Their values are log C = 8.63 and log O = 8.88, in comparison to our values in Table 2: log C = 8.54 and log O = 8.81. The reason behind this is likely due to a different set of stellar atmospheric model are used. The C/O ratio is 0.54 +0.12 −0.09 from this work, comparing with 0.56 from Sadakane (2006).

ATMOSPHERE MODELING AND
RETRIEVAL FOR HR 8799 c

Modeling Planet Atmospheres
We used petitRADTRANS to model exoplanet atmospheres (Mollière et al. 2019). petitRADTRANS is a versatile package that can model transmission and thermal spectra at high (R=100,000) and low spectral (R=1000) resolution with opacities available for all major chemical species and their isotopic species. We modeled the thermal spectrum at low resolution, considering four molecular species including H 2 O, CO, CO 2 , and CH 4 . In addition, Rayleigh scattering due to H 2 and He, and collisional broadening due to H 2 -H 2 and H 2 -He, were considered.

Parameterization
The following input parameters were used for petitRADTRANS: planet radius, mass mixing ratio for H 2 O, CO, CO 2 , and CH 4 . We included the corresponding pressure at which an optically-thick global cloud forms. In addition, we used analytical pressure-temperature (P-T) profile (Parmentier & Guillot 2014;Parmentier et al. 2015). For directly-imaged exoplanets with low stellar irradiation levels, the P-T profile is iso-thermal at high altitudes where energy transport is radiative and adiabatic at depth (e.g., optical depth > 1 for convective atmospheres (Parmentier et al. 2015). Parameters to define a P-T profile were: surface gravity and internal temperature (t int ), which describes the heat flux from the planet interior. Potential correlated noise in these data sets was modeled using Gaussian process with three parameters: correlation length, correlation amplitude, and white noise amplitude.
To facilitate comparison with observations, we also calculate planet luminosity and effective temperature. Planet luminosity is calculated by integrating retrieved planet spectral energy distribution from 0.5 to 15 µm, where the bulk of planet emission is. Planet effective temperature (t eff ) is then inferred using the following equation: L = 4πR 2 P ·σt 4 eff , where R P is planet radius and σ is the Stephan-Boltzmann constant.

Chemistry
We calculated C/H with the following equation: where X is volume mixing ratio. The conversion from mass mixing ratio, which was used in model parameters, to volume mixing ratio is as follows: where mr i is mass mixing ratio and subscript i denote a molecular species, µ is molecular weight in atomic mass unit, and MMW is mean molecular weight, which is defined as: The mass mixing ratios for all considered species add up to unity. Molecular hydrogen and helium ratio is 3:1, which is based on the assumption of primordial composition. Similarly, O/H was calculated using the following equation: And C/O was calculated as:
Priors are listed in Table 3 and the likelihood function is exp[−(D − M) 2 /E 2 ], where D is data, M is model, and E is the error term.

RETRIEVAL RESULTS
We report our atmospheric retrieval results based on a combined data set that includes GPI, CHARIS, and OSIRIS data. As shown in the Appendix, retrieval based on individual GPI and CHARIS data returns reasonable agreement between modeled spectra and data points, so we use the GPI and CHARIS data as they are. In comparison, OSIRIS data have systematics that cannot be accounted for by our model. We first provide detailed treatment of the OSIRIS data before working on the combined data set. Subsequently, we will show that the choice of priors significantly affects the retrieval results, leaving a cautionary note on atmospheric retrieval.

Using Normalized OSIRIS Data
OSIRIS data have a higher spectral resolution (R=5,000) than the low-res mode of petitRADTRANS (R=1,000). We could in principle use the high-res mode of petitRADTRANS (R=100,000), but the computational time for the whole K band wavelength coverage prevents us from sampling the parameter space in a reasonable amount of time. We therefore decide to down-sample the OSIRIS data to R=1,000 to match the low-res mode of petitRADTRANS. While this procedure potentially degrades the original data, spectral features such as lines and band heads would be mostly preserved even at R=1,000.
As shown in §A.1.1, we cannot obtain a reasonable agreement between our modeled spectra and the absolute flux measured from OSIRIS. We suspect that OSIRIS data have some systematic error that alters its spectral shape, which results in the challenges in using OSIRIS absolute flux. Our suspicion is further supported by the agreement between GPI data and the fitted modeled spectra (A.2), which suggests that the modeled spectra can reasonably fit certain data sets with less contribution from systematic errors. If systematics only affects low-frequency part of the spectrum, e.g., overall spectral shape etc., then we can remove the low-frequency systematics with a high-pass filter and use the normalized flux for retrieval. The normalized flux was also used in Konopacky et al. (2013).

Weak Priors
Spectra based on retrieval posterior samples are shown in Fig. 2. Modeled spectra trace data points reasonably well except for J-band: models underpredict planet flux when compared to CHARIS data points.
As shown in Table 4, planet luminosity and effective temperature are within the range from Konopacky et al. (2013), i.e., 900-1300 K and −4.8 < log(L/L ) < −4.6. We note here the difference between t int and t eff . T eff is gen- erally used in grid model of spectra. T int is parameter used in P-T profile that describes the heat flux from the planet interior (see §4.2). A higher t int does not necessarily translate into a higher t eff . This is because there are other factors that regulate the emerging flux, e.g., cloud and molecular opacities.
The retrieved gravity log g is 3.97 and radius is 1.47 R Jupiter . The planet mass would be 8.11 M Jupiter . While it is consistent with the estimate from (Wang et al. 2018b), it is slightly higher than the previously allowed maximum 7 M Jupiter from a dynamical stability point of view (Marois et al. 2010;Fabrycky & Murray-Clay 2010).

Strong Priors
Given the aforementioned issue in inferred planet mass, we decide to put stronger priors to restrain the possible parameter space for posterior samples. We set the priors for internal temperature and planet radius to be Gaussian functions with negligible standard deviation. This effectively set the internal temperature and planet radius to be 1200 K and 1.2 R Jupiter . The t int of 1200 K is chosen to be more consistent with retrieved t int using GPI or CHARIS individual data set, which shows better agreement between models and data (see Table  §4 and §A).
At the retrieved log g = 3.96, the corresponding planet mass is 5.28 M Jupiter . This is a reasonable mass given the dynamical stability constraint (Fabrycky & Murray-Clay 2010). The comparison between the weak and strong prior cases highlights the limitation of retrieval: the most likely parameter space to fit the data may not be physically/chemically allowed.
Because of the different priors, the retrieved posterior distributions are significantly different for different parameters. Specifically, in order to maintain the same flux, the weak-prior case returns higher abundances and thus high opacity to offset the higher internal temperature (see Table 4).

Adding L-and M -band Photometric Data
Given the significantly different results between the weak-prior and the strong-prior cases, we explore if adding additional constraints would reconcile the difference. We include Land M -band photometric data that are complied by Bonnefoy et al. (2016) from previous works (Galicher et al. 2011;Skemer et al. 2012;Currie et al. 2014;Skemer et al. 2014). With weak priors, the retrieval results are shown in Fig. 4 and summarized in Table. 4. Dynamical stability is no longer a concern because the retrieved planet radius (1.05 R Jupiter ) and surface gravity (log g = 3.99) are consistent with a 4.4 jupiter-mass planet, an even lighter planet than the strong-prior case. As shown in the section below, adding L-and M -band photometric data represents a solution that is in between the weak-prior and the strong-prior cases. Our results, depending on choice of priors, imply different formation path for HR 8799 c (Table 2). The result with weak priors shows consistent C and O abundances with the star. The interpretation could be (1), formation through direct gravitational instability; (2), accretion of carbon-enriched planetesimals (e.g., tar) to ex-plain the slightly higher than stellar C abundance; and/or (3), core erosion after planet formation. The result with strong priors is concerning because there is no plausible formation scenario in which a planet has both sub-stellar C and O abundance and sub-stellar C/O according to Madhusudhan et al. (2017).
The data point corresponding to the case after including L-and M -band photometric data is located in between the weak-and strong-prior cases. In addition, it is consistent with the data point based on the result from Konopacky et al. (2013). The result shows a planet C/O that consistent with stellar C/O. Both C/H and O/H are sub-stellar. This data point is in line with a scenario of formation beyond water iceline and no core erosion. The water iceline is located at ∼3-10 AU from the host star assuming an optical thin disk (Lavie et al. 2017).

Comparing results with Previous Works
More intriguingly, results from previous works (Konopacky et al. 2013;Lavie et al. 2017) also show a significant difference. In summary, the four data points are roughly consistent with stellar C in Konopacky et al. (2013). The most likely explanation, based on Madhusudhan et al. (2017), is that the planet form between the H 2 O and CO 2 ice line, accreting gas that is deficient of H 2 O, and core erosion plays no role in enriching the planet atmosphere.  2017) conducted comprehensive study on atmospheric retrieval for the four planets in HR 8799 system. Their methodology shares some similarities to ours, e.g., using multiple data sets and using PyMultiNest for posterior sampling, although there are several key differences that will be discussed below. A comparison of our method and their approach would shed light on how to reconcile the difference and improve retrieval for future data sets.

Common Parameters
We first compare retrieved parameters that are common in our analyses (Fig. 6). For plotting clarity, we omit the result of adding L-and M -band data because the result is bracketed by the weak-and strong-prior cases.
H 2 O (major O carrier) and CO abundances (major C carrier) from Lavie et al. (2017) are higher, explaining the super-stellar C and O abundances in Fig. 5. Abundances for CO 2 and CH 4 are low in all analyses, and therefore do not contribute significantly in C and O abundance. We will explain in §6.2.1 why the abundances for CO 2 and CH 4 are detectable despite being relatively low.
Planet radius and surface gravity retrieved by Lavie et al. (2017) are ∼1.25 R Jupiter and 4.5, which implies the inferred planet mass is 20 M Jupiter , too massive from a dynamical stability point of view (Fabrycky & Murray-Clay 2010;Wang et al. 2018b). As we discussed in §5.2, strong priors need to be put on planet radius in order to satisfy the dynamical stability criterion.  Table  3 for definition of each parameter.

P-T Profile
The opacity retrieved in our work differs from that of Lavie et al. (2017) by at least one order of magnitude. However, we used different formula for P-T profile calculation. We used an analytical function from (Parmentier et al. 2015) and Lavie et al. (2017) used a reduced version of equation 126 in Heng et al. (2014). A more direct comparison would be the P-T profiles that are calculated by different methods. Fig. 7 shows P-T profiles using parameters that drawn from posterior distributions. Our results with weak priors shows highest temperature at low pressure levels but increases slowly toward higher pressures. This is due to the small retrieved opacity that causes a less sensi- tive temperature to pressure. The overall shape of P-T profile is similar between Lavie et al. (2017) and our results with strong priors, but with temperature offset. Our P-T profile is lower by ∼100 K than that from Lavie et al. (2017) although our retrieved t int is ∼200 K hotter than theirs.

Cloud
We assume global cloud that is parameterized by cloud top pressure, a cloud treatment from petitRADTRANS. At a first glance, the treatment is different from Lavie et al. (2017) where three parameters are used to describe the global cloud: particle size, extinction efficiency, and cloud mixing ratio. In the case of planet c, the result from Lavie et al. (2017) retrieved a cloud with refractory species (e.g., silicates) with particle size (∼30 µm) larger than observed wavelength (<2.5 µm), which suggests a wavelength-independent cloud opacity in the near infrared. This is consistent with our cloud treatment with single parameter. One caveat though is that it is not entirely clear how to quantitatively compare our one-parameter model to their three-parameter model. We retrieved a cloud-top pressure of 1-2 bar, which indicates the global cloud is quite deep compared to that on hot Jupiters. Konopacky et al. (2013) conducted pioneering work on measuring chemical abundances of HR 8799 c. Unlike our free retrieval approach, they used a forward modeling approach, in which physically and chemically motivated models are used to fit the data. The two approaches may result in the difference of abundance measurements as shown in Fig. 5. One outstanding issue is the non-detection of CH 4 and CO 2 in the original analysis in Konopacky et al. (2013). Below we offer some insights as to why CH 4 and CO 2 can be constrained by OSIRIS data alone. Their abundance can be further constrained by adding additional data as shown in Lavie et al. (2017) and this work.

Constraining CO 2 and CH 4 Abundances
We show here that the normalized OSIRIS data can constrain the abundance of all four species including previously unconstrained CO 2 and CH 4 Konopacky et al. (2013). We check likelihood function when varying only CO 2 or CH 4 value and find that indeed the likelihood is at maximum at peak values of CO 2 or CH 4 from posterior distributions.
We further investigate if the presence of CO 2 or CH 4 can be visually discerned at the level of log(mr CO 2 ) between -3 and -4 and log(mr CH 4 ) between -4 and -5. For reference the retrieved log(mr CO 2 ) and log(mr CH 4 ) is around ∼-3 and between -4.5 and -5.0 (see Table 4). Fig. 8 and Fig. 9 show the difference spectra at different mixing ratio of CO 2 or CH 4 . The difference spectra are created by subtracting two spectra: one with a specified mixing ratio and the other with a zero mixing ratio for a given species. The difference spectra, when compared to error bars (also shown in Fig. 8 and Fig. 9), would suggest if the data have the distinguishing power for a give mixing ratio. Indeed, Fig. 8 and Fig. 9 show that spectral features of CO 2 and CH 4 at mixing ratios of log(mr CO 2 ) between -3 and -4 and log(mr CH 4 ) between -4 and -5 are larger or comparable with error bars, and therefore can be distinguished by the data. We use petitRADTRANS to generate spectra by varying only CO 2 abundance. Then, we normalize the spectra and subtract the normalized spectrum with zero CO 2 abundance. This results in the plotted differential normalized flux. For comparison, error bars reported in Konopacky et al. (2013) are shown in grey and 1-σ errors that correspond to the width of CO 2 spectral features are plotted in black. When log(mr CO 2 ) is between -3 (orange) and -4 (green), which is the range of retrieved CO 2 abundance (Table 4), CO 2 spectral features are comparable to 1-σ errors, suggesting that it is possible to constrain CO 2 abundance with the OSIRIS data quality.
We caution that while we provide tentative evidence that CO 2 and CH 4 Abundances can be constrained by OSIRIS data alone, there could be other explanations. For example, including CO 2 and CH 4 in the retrieval could further increase the likelihood because changing CO 2 and CH 4 abundance may look like a change in the continuum shape at the OSIRIS S/N level, therefore these extra free parameters help to fit the data better. This argument may be supported by that GPI data alone can constrain CO 2 abundance ( §A.2). Furthermore, Figure 9. Same as Fig. 8 except for CH 4 . When log(mr CH 4 ) is between -4 (green) and -5 (red), which is the range of retrieved CH 4 abundance (Table 4), CH 4 spectral features are comparable to 1-σ errors, suggesting that it is possible to constrain CH 4 abundance with the OSIRIS data quality.
CO 2 abundance has never been constrained in previous retrieval works on brown dwarfs (Line et al. 2015;Madhusudhan et al. 2016;Line et al. 2017;Zalesky et al. 2019;Kitzmann et al. 2020). Indeed, CO 2 abundance is expected to be ∼4 orders of magnitude lower than CO for a brown dwarf with similar effective temperature (Sorahana & Yamamura 2012).

The Contribution of CO 2 and CH 4
Abundances to C/O The abundance of CH 4 , even in the constrained case, is at least two orders of magnitude lower than CO abundance. Therefore, CH 4 is not the major carbon carrier and does not significantly change the total carbon abundance in the C/O calculation.
Whether or not constraining CO 2 abundance could significantly change the measured C/O in the strong-prior case. The CO 2 abundance is ∼0.3 dex lower than CO and H 2 O abundances (Table 4). Omitting CO 2 would increase C/O by ∼25%, changing C/O from 0.39 to 0.48 (Table 2. This change is about 1.5-2.0 times the reported uncertainty.

Retrieval on Individual Data Set
We perform retrieval on individual data set from GPI, OSIRIS, and CHARIS. The results are provided in §A. Here we summarized the major findings and the lesson learned.
We propose two ways of comparing retrieval results. The first way is to compare C/O. We show that C/O ratios retrieved from individual data set are consistent except for CHARIS retrieval because CHARIS data do not constrain the major carbon carrier CO (Fig. 15). In principle, we can compare C/H and O/H, but the normalization process of the OSIRIS data makes it difficult to do so.
Secondly, comparing the likelihood distribution can check consistency of retrieval results. We show that GPI H and K band data return consistent retrieval results (Fig. 16). This demonstrates the internal consistency of GPI data. In contrast, we show that retrieval results between GPI and OSIRIS are inconsistent by comparing the distribution of likelihood values (Fig. 17). The most likely explanation is the different systematics that exist in individual data set.
To summarize, combining data sets that are taken using different instruments at different epochs comes with a caveat: astrophysical conditions and instrument-induced systematics may play a critical role in causing inconsistent retrieval results. For future observations, it is ideal to (1) obtain simultaneous data that cover wide wavelengths and (2) understand better systematics that is caused by different instruments.

λ Bootis Nature of HR 8799
Gray & Kaye (1999) and Sadakane (2006) have confirmed that HR 8799 is a mild λ Bootis star. Concerning the origin of the abundance anomalies detected in λ Bootis type stars, several scenarios have been proposed. One scenario suggests that at least a part of λ Bootis type stars can be explained by assuming a binary system consisting of two stars of similar spectral types (Andrievsky 1997;Faraggiana & Bonifacio 1999;Gerbaldi et al. 2003). Since Mathias et al. (2004) and Henry et al. (2005) concluded that HR 8799 is a single star based on radial velocity observations, the second scenario is ruled out.
Another widely accepted scenario invokes the process of selective accretion of circum-stellar or inter-stellar material (ISM, Venn & Lambert 1990;Turcotte & Charbonneau 1993;Gray & Corbally 2002) by a solar abundance star. For example, Su et al. (2009) studied the environment of HR 8799 using infrared imaging data from Spitzer and CO(3-2) map from the James Clerk Maxwell Telescope (JCMT). They detected a molecular cloud that is possibly associated with HR 8799, and suggested accretion from this cloud might explain the abundance anomalies of HR 8799. Jura (2015) argued that the accretion rates of ISM required for the occurrence of λ Bootis phenomena are unachievable, and instead suggested that accretion of winds from an additional hot Jupiter companion might account for the λ Bootis nature of HR 8799. This idea has support from observations that A-type stars are efficient at forming giant planets (Johnson et al. 2011;Murphy et al. 2016), and some of these giant planets are found in tight orbits around the star (Collier Cameron et al. 2010). However, detection of such an unknown planet around HR 8799 is difficult using current observing techniques. Draper et al. (2016) studied a sample of λ Bootis stars using Herschel Space Observatory, and found IR excesses around nearby λ Boo stars are caused by debris disks rather than ISM bow waves. Indeed, HR 8799 has a resolved debris disk based on Herschel data (Matthews et al. 2014). Since a higher influx of planetesimals and comets could provide enough volatile gas for accretion, the planet systems in HR 8799 could possibly trigger higher than usual dy-namic activities in the disk, which provide accretion material to the star and cause the λ Bootis nature of HR 8799. Currently this is the most probably scenario. However, further theoretical study on the transport of circum-stellar material and stellar mixing mechanisms are required to establish a relation between the stellar surface abundance anomalies and external accretion process.
Contrary to the above debris-disk theory, Moya et al. (2010) studied the λ Bootis nature of HR 8799 using a comprehensive seismology modeling. They found the observation data contradicts one of the main hypotheses for explaining the λ Bootis nature, namely the accretion/diffusion of gas by a star with solar abundance. However, their results are dependent on the inclination angles of HR 8799 and the correct mode identification, which are vital for the asterseismic analysis. Thus, more observational and theoretical investigations are still needed to pin down the origin of the λ Bootis nature of HR 8799. • A summary of the retrieval results can be found in Table 2 and 4 and Fig. 5. The implications are discussed in §5.5.
• When compared to previous results, we find the abundances varies by more than 1 dex, highlighting the uncertainty in atmospheric retrieval. We discuss potential issues that could lead to the differences in §6 and §A.
• The inconsistency in retrieval results could stem from different treatments in modeling planet atmospheres, e.g., P-T profile calculation and cloud modeling, or the different data sets that used in retrieval. We show in §A that systematics in individual data set, be it astrophysical or instrumental, can results in significantly different results. We therefore advocate for (1) providing details for modeling planet atmospheres to facilitate comparison with other investigations; (2) simultaneous observations that span a wide wavelength range to minimize the heterogeneous contribution from temporal astrophysical variations and/or systematics from using different data sets; and (3) a better understanding of systematic errors that are introduced by different instruments.

ACKNOWLEDGMENTS
We would like to thank Quinn Konopacky for providing Keck/OSIRIS data, Baptieste Lavie for providing the retrieval posterior samples for comparison, Jonathan Fortney for insightful discussions on atmospheric retrieval, Paul Molliere for the help in setting up and running petitRADTRANS. We thank the anonymous referee for his/her constructive comments and suggestions that greatly improve the manuscript. We thank the Heising-Simons Foundation for supporting the workshop on combining highresolution spectroscopy and high-contrast imaging for exoplanet characterization, where the idea originated on combining photometric data and spectral data of different resolutions. BM thank the support of CSST grant and NSFC grant U1931102. The data presented herein were obtained at the W. M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W. M. Keck Foundation. The authors wish to recognize and acknowledge the very significant cultural role and reverence that the summit of Mauna Kea has always had within the indigenous Hawaiian community. We are most fortunate to have the opportunity to conduct observations from this mountain.

A. RETRIEVAL RESULTS FOR INDIVIDUAL DATASET
In addition to analyzing the data set that combines OSIRIS, GPI, and CHARIS data, we analyze each component of the data set individually. The retrieval results and our notes are shown in this section and Table 4. We cannot obtain a reasonable agreement between our modeled spectra and the absolute flux measured from OSIRIS. This is evident from the residual plot in Fig. 10. The residual has a clear correlated pattern especially towards the longer wavelengths beyond 2.3 µm. We attribute the uncounted residual pattern to the overall spectral energy distribution and the longer wavelength end of the OSIRIS data. More specifically, the OSIRIS data is more rounded than the model and the OSIRIS flux between 2.3 and 2.4 µm increases whereas the model flux flattens in the same wavelength region. We suspect that OSIRIS data have some systematic error that alters its spectral shape, which results in the challenges in our retrieval using OSIRIS absolute flux. In principle, the systematics can be taken care of by the Gaussian process. However, the Gaussian process may not be the best tool to model the two aforementioned spectral morphological differences. Figure 11. Same as Fig. 10 but for normalized OSIRIS data with weak priors as shown in Table 3.

A.1.2. Normalized Flux
If systematics only affects low-frequency part of the spectrum, e.g., overall spectral shape etc., then we can remove the low-frequency systematics with a high-pass filter and use the normalized flux for retrieval.
After normalization, the continuum information is missing. This causes the negative correlation between κ IR vs. P cloud . The two parameters compensated each other: the same observed line depth can be achieved by either a high κ IR or a high P cloud (i.e., low or no cloud). However, the degeneracy of the two parameters cannot be broken by the normalized-flux data set. In addition, the retrieved mixing ratios are too high than physically allowed especially for CO and H 2 ), e.g., log(mr H 2 O ) = -0.81 and log(mr CO )= -0.61.
To alleviate the above issues, we fix the following values in retrieval because they are more physically plausible and constrained by other date sets whose continuum information is not lost due to the normalization. The parameters that we fixed are: logg = 3.95, R P = 0.90 R Jupiter , log(P cloud ) = -0.05, log(κ IR ) = -2.11, and t int = 1213 K. The retrieved mixing ratios are more reasonable when compared to retrieval results using other data sets. However, the log likelihood drops by ∼25, which is also evident from the slight more scatter when comparing the residual plots between Fig. 11 and Fig. 12. This highlights the limitation of retrieval: the most likely parameter space to fit the data may not be physically/chemically allowed.

A.2. Gemini-S/GPI
Spectra based on GPI retrieval results are shown in Fig. 13. Modeled spectra trace data points reasonably well in both H and K band. All parameters are well constrained except for CH 4 .

A.3. Subaru/CHARIS
Modeled spectra based on CHARIS retrieval are shown in Fig. 14. The modeled spectra generally agree with data points, but the number of data points are sparse due to the low spectral resolution.

B. COMPARING RESULTS FROM INDIVIDUAL DATESETS
In order to compare retrieval results from different data sets, we need to define a set of criteria for the comparison. by "compare", we mean to check the consistency of results.

B.1. C/O
Because we have normalized data from OSIRIS, which are insensitive to absolute molecular abundances, we only compare C/O that are retrieved (Fig. 15). GPI and OSIRIS results are generally consistent. However, the result from the normalized OSIRIS data with strong priors skews C/O to lower values, in a similar way that the C/O is lower in the strong-prior case for the combined data ( §5.3). C/O from the CHARIS data peak at zero because H 2 O is constrained but the major carbon carrier CO is unconstrained.

B.2. Likelihood
Comparing likelihood is another way of checking consistency between retrieval results. More specifically, Two likelihoods will be calculated and compared. The first likelihood is the likelihood of one retrieval as calculated from the posteriors. The second likelihood is the likelihood when plugging posteriors of another retrieval. If two results are consistent, then the two likelihoods should share similar statistics (e.g., median or mode) or follow the same distribution. Otherwise, the two retrieval results are inconsistent with each other.

B.2.1. GPI H vs. K band
We divide GPI data into two parts, H and K band, and conduct atmospheric retrieval for the two data sets. This is to check the internal consistency of GPI data. If retrieval results are similar between H and K band, i.e., retrieved parameters share similar posterior distributions and similar posterior values when plugging posteriors from H band retrieval and when plugging posteriors from K band retrieval. Fig. 16 shows the results. Top two panels show models with parameters drawn from posterior distributions from H and K band data. It is understandable that models using retrieval results of H (or K) band only agree well with H (or K) band data.
Retrieved parameters share similar values as shown on the lower left panel in Fig. 16. We use percentage error here with zero as zero percent and unity as 100%. The percentage error is calculated as the difference of median between H and K band results divided by the average of the medians. The error associated with each point is calculated as follows: we add in quadrature the average of reported upper and lower error bars for each band, and divide the value by the average the medians. The majority of parameters as retrieved from H and K band agree well with a few percent except for Pcloud, log(mr CO ), and log(mr CH 4 ). Pcloud has a value that is very close to zero, and therefore the large percentage error and the error bar. H band data do not provide a strong constraint on log(mr CO ) and log(mr CH 4 ) because of weak absorption of these two species and therefore the disagreement between H and K band results. Overall, H and K band results are consistent within a few percent.
The consistency between H and K band results is further supported by the comparison of posterior values (lower right panel in Fig. 16). Posterior value distribution from H-band retrieval is in good agreement with the posterior value distribution using retrieval results from K band (two-sample K-S test p = 0.15).

B.2.2. GPI vs. OSIRIS
Comparing GPI and OSIRIS retrieval results is more complicated because the two data sets are taken by different instruments and at different times. Astrophysical time variability may be one of the reasons that causes inconsistency, if any. Among parameters that are used in the retrieval process, P cloud may be variable over time. In addition, we used the same log g, R pl , and t int that are retrieved from GPI in OSIRIS retrieval. Therefore in the following likelihood comparison, we only plug in posteriors of chemical abundance from OSIRIS into GPI likelihood function. Fig. 17 shows that the likelihood distribution already looks significantly different with a K-S test p value of 2.9 × 10 −32 . The mode of the OSIRIS likelihood distribution is at the 1 percentile of the GPI likelihood distribution. The retrieval results are inconsistent. The inconsistency suggests that astrophysical time variability is not the sole explanation because the potential time variable has been excluded in the comparison. Instrument-or date-reduction-induced systematics should be responsible for the inconsistency as observed in the retrieval results. Figure 16. Top left: Modeled spectra drawn from posterior distribution (red, based on H band data) are compared with the GPI IFS data. Top right: Modeled spectra drawn from posterior distribution (red, based on K band data) are compared with the GPI IFS data. Bottom left: comparing retrieved parameters from the H-band only data and the K-band only data. The retrieved parameters are generally consistent with each other. The inconsistency in CO abundance is due to the lack of constraining power on CO abundance in H-band only data. Bottom right: the distribution of posterior values for K-only retrieval (blue) and the distribution of posterior values for H-only retrieval (orange). The two posterior distributions are indistinguishable (p=0.15 from a two-sample K-S test) suggests that the results are consistent between H-only retrieval and K-only retrieval. 7250 ± 150 4.30 ± 0.10 -0.55 ± 0.11 3.3 ± 0.3 PEPSI 7450 ± 100 4.40 ± 0.10 -0.48 ± 0.12 2.9 ± 0.2 HARPS 7390 ± 80 4.35 ± 0.07 -0.52 ± 0.08 3.1 ± 0.2 Combined · · · Log-uniform -10 0 CO Mixing Ratio (log(mr CO )) · · · Log-uniform -10 0 CO 2 Mixing Ratio (log(mr CO 2 )) · · · Log-uniform -10 0 CH 4 Mixing Ratio (log(mr CH 4 )) · · · Log-uniform -10 0 Effective temperature (