CoCCoA: Complex Chemistry in hot Cores with ALMA Selected oxygen-bearing species

Context. Complex organic molecules (COMs), especially the oxygen-bearing species, have been observed to be abundant in the gas phase toward low-mass and high-mass protostars. Deep line surveys with the Atacama Large Millimeter/submillimeter Array (ALMA), which has unprecedented resolution and sensitivity, have been carried out for only a limited number of well-known star-forming regions, and statistical studies of oxygen-bearing COMs (O-COMs) in a large sample of high-mass protostars using ALMA are still lacking.
Aims. We aim to determine the column density ratios of six O-COMs with respect to methanol (CH 3 OH) in a sample of 14 high-mass protostellar sources in order to investigate their origin through ice and/or gas-phase chemistry. The selected species are: acetaldehyde (CH 3 CHO), ethanol (C 2 H 5 OH), dimethyl ether (DME; CH 3 OCH 3 ), methyl formate (MF; CH 3 OCHO), glycolaldehyde (GA; CH 2 OHCHO), and ethylene glycol (EG; (CH 2 OH) 2 ).
Methods. We fit the spectra of 14 high-mass sources observed as part of the Complex Chemistry in hot Cores with ALMA (CoCCoA) survey and derived the column densities and excitation temperatures of the six selected O-COMs. We used the minor isotopologue of methanol, CH 18 3 OH, to infer the column density of the main isotopologue CH 3 OH, whose lines are generally optically thick. We compared our O-COM ratios with those of five low-mass protostars studied with ALMA from the available literature as well as with the results from experiments and simulations.
Results. Although the CoCCoA sources have different morphologies and brightness in their continuum and methanol emission, the O-COM ratios with respect to methanol have very similar values in the high-mass and low-mass samples. DME and MF have the highest and most constant ratios, varying within one order of magnitude, while the other four species have lower ratios and exhibit larger scatter of one to two orders of magnitude. The ratio between DME and MF is close to one, which agrees well with previous observational findings. Current simulations and experiments can reproduce most observational trends with a few exceptions. For example, they tend to overestimate the abundance of ethanol and GA with respect to methanol.
Conclusions. The constant column density ratios of the selected O-COMs among the low- and high-mass sources suggest that these species are formed in similar environments during star formation, probably on icy dust grains in the pre-stellar stages. Where deviations are found, hypotheses exist to explain the differences between observations, simulations, and experiments, such as the involvement of gas-phase chemistry and different emitting areas of molecules.

arXiv:2308.02688v1 [astro-ph.GA] 4 Aug 2023
However, the formation mechanisms of COMs are still under debate. The observed O-COMs were initially thought to exist exclusively in the gas phase in hot (T ≳ 100 K) environments around protostars (i.e., hot cores/corinos), primarily from thermal desorption of ices (Jørgensen et al. 2020). On the other hand, observations in the past decade started detecting them in cold (T ∼10 K) pre-stellar cores, albeit in low abundances (Bacmann et al. 2012; Jiménez-Serra et al. 2016; Soma et al. 2018; Scibelli et al. 2021). These detections indicate that COMs may already be formed on the surfaces of dust grains in the early pre-stellar stages before any ice desorption or subsequent gas-phase chemistry occurs. Simulations and experiments have found that solid-phase methanol can be formed in the ice mantles of dust grains. The formation begins with a series of CO hydrogenation steps under both energetic (e.g., UV radiation) and non-energetic (e.g., atom addition) conditions (Hiraoka et al. 1998; Shalabiea & Greenberg 1994; Watanabe & Kouchi 2002; Fuchs et al. 2009; Cuppen et al. 2009; Simons et al. 2020). Formation of bigger COMs on cold dust grains is also possible. Experiments by Fedoseev et al. (2015) show that GA and EG can be formed through surface hydrogenation of CO under cold dense cloud conditions (T = 12 K). A follow-up experimental study by Chuang et al. (2016) mixed CO, H 2 CO, and CH 3 OH ices at T = 15 K, and produced not only GA and EG, but also some MF in the solid phase. However, compared with observational results in the gas phase, MF was underproduced in their experiments while GA and EG were overproduced. The underproduction of MF was alleviated in Chuang et al. (2017) by introducing ultraviolet (UV) irradiation, which may photodissociate the initial ingredients in the mixed ice into reactive free radicals and therefore boost the formation of larger species. 
Based on these laboratory works, Simons et al. (2020) explored the dependence of the final grain mantle composition on the initial gas-phase composition and the dust temperature by chemical simulations. Their simulations were still able to produce MF, GA, and EG at temperatures as low as 8 K, but MF was again underestimated (without UV). A much higher abundance of MF is reproduced by Garrod et al. (2022), who introduced a set of non-diffusive mechanisms to their simulations (see detailed discussions in Sect. 5). These studies show that the abundances of some COMs like MF could depend on environmental conditions, and a debate still exists whether and to what extent gas-phase chemistry is involved in their formation (Balucani et al. 2015;Ceccarelli et al. 2022).
With the development of new observational techniques, it is now possible to make a more complete inventory of COMs in star-forming regions. The Protostellar Interferometric Line Survey (PILS, Jørgensen et al. 2016) using the Atacama Large Millimeter/submillimeter Array (ALMA) detected more than 20 COMs, including all the O-bearing species mentioned above plus ethanol (C 2 H 5 OH), and some of their isotopologues, in the low-mass protostellar binary IRAS 16293-2422 A and B (Jørgensen et al. 2018; Manigand et al. 2020). Recent ALMA surveys of a larger sample of sources further confirm the ubiquity of COMs in both low- and high-mass protostars (van Gelder et al. 2020; Yang et al. 2021; Nazari et al. 2022a). The next step in COM studies is therefore to investigate similarities and differences in COM abundances and their ratios among different types of sources to constrain their formation routes. Indeed, surveys have revealed an interesting consistency of O-COM ratios with respect to methanol among different sources. Coletta et al. (2020) summarized single-dish observations of DME and MF in various objects, including pre-stellar cores, star-forming regions, a protostellar shock, and galactic center cores. They find that DME and MF abundances are strongly correlated with a ratio of about 1. van Gelder et al. (2020) report constant ratios of O-COMs with respect to methanol in five low-mass protostars. For N-bearing species, Nazari et al. (2022a) also find rather constant ratios with respect to methyl cyanide (CH 3 CN) in more than 30 high-mass sources, though some species such as formamide (NH 2 CHO) show larger scatter in their ratios. The constant abundance ratios revealed by observations are interesting since different objects have different physical environments, which are expected to influence the chemical evolution and alter the ratios among COMs. 
These similarities suggest that COMs are mainly formed under similar physical conditions, probably on the surfaces of dust grains during the cold pre-stellar phase.
Previous observational studies mostly focused on a small number of well-known sources, and many of them did not have enough spatial resolution and sensitivity to overcome beam dilution and optical depth issues. So far, a few ALMA surveys with enhanced sensitivities and resolutions on O-COMs in low-mass protostellar regions have been published (van Gelder et al. 2020; Hsu et al. 2022), while results for the high-mass counterparts, especially from ALMA surveys that cover minor isotopologues of methanol, are still lacking. We emphasize the importance of detecting optically thin lines of minor isotopologues of methanol, which requires higher sensitivity, in order to better constrain the column density of methanol and hence the ratios with respect to other O-COMs. There are several ALMA studies on O-COMs in the high-mass counterparts (e.g., Csengeri et al. 2019; El-Abd et al. 2019; Mininni et al. 2020), but all of them are case studies that do not use minor isotopologues of methanol in the analyses, and/or focus only on a small set of O-COMs such as the C 2 H 4 O 2 isomers. High-mass protostellar regions are not only important to the investigation of COM chemistry under different physical conditions, but given the higher temperatures and the possibly enhanced UV radiation along outflow cavity walls, it is also expected that the gas-phase data will give a better representation of the solid-phase abundances, as icy COMs are expected to be mostly thermally desorbed from dust grains. It is also timely to present the results of ALMA observations on high-mass sources after the latest development of a comprehensive model on COM chemistry in hot cores by Garrod et al. (2022).
In this paper, we present the analysis of O-COMs observed in 14 high-mass star-forming regions in the Complex Chemistry in hot Cores with ALMA (CoCCoA) survey. The analysis is focused on six O-bearing COMs: C 2 H 5 OH, CH 3 CHO, DME, MF, GA, EG (including two conformers, anti and gauche, a-EG and g-EG), as well as two methanol isotopologues 13 CH 3 OH and CH 18 3 OH. Covering minor isotopologues of methanol is essential because the lines of the main isotopologue, 12 CH 16 3 OH, are likely to suffer from a high optical depth, in which case its column density needs to be inferred from optically thin lines of a minor isotopologue. C 2 H 5 OH and CH 3 CHO are added to this sample for their potential detections in ices (Schutte et al. 1999;Yang et al. 2022;McClure et al. 2023). MF, GA, and EG are selected to study their relative abundances, based on laboratory and modeling findings in Chuang et al. (2017) and Simons et al. (2020). DME is included because of its high abundance observed in star-forming regions.
Y. Chen et al.: CoCCoA: Complex Chemistry in hot Cores with ALMA
This work will provide a base to compare to future observations of icy COMs by the James Webb Space Telescope (JWST). To verify if COMs are formed in ices, a direct approach is to observe their vibrational absorption features in the infrared. However, this requires very high sensitivity and spectral resolution in the fingerprint wavelength range at ∼2-15 µm, which has only become feasible with the successful operation of JWST. So far, methanol is the only COM that has confirmed detections in interstellar ices, whereas several other solid-phase COMs such as C 2 H 5 OH and CH 3 CHO were only tentatively identified (see review by Boogert et al. 2015, and references therein). Recently, several JWST teams have started hunting for solid-phase COM features (Yang et al. 2022; McClure et al. 2023). Important laboratory measurements of COM infrared spectra are now available for JWST data analyses (Terwisscha van Scheltinga et al. 2018; Rachid et al. 2020, 2021, 2022; Hudson & Gerakines 2019; Hudson & Ferrante 2020; Hudson et al. 2021; Gerakines et al. 2022; Rocha et al. 2022). The ultimate goal is to bridge the gas and grain chemistries and directly relate gas-phase spectra as presented here to solid-phase infrared features, from which we can gain a better understanding about the formation history of COMs.

Overview
This paper presents early results from a first look at data from the Complex Chemistry in hot Cores with ALMA (CoCCoA) survey (PI: B. McGuire). A complete overview of the survey will be provided in a separate publication once all observations are completed. This will be followed by the public release of fully reduced spectral-line image cubes. When complete, CoCCoA will comprise a dataset targeting 23 high-mass star-forming regions at a spatial resolution of ∼0.3 ′′ using multiple configurations of the ALMA 12-m array as well as the ALMA 7-m array to capture both extended and compact emission. CoCCoA will ultimately include data from ALMA projects 2019.1.00246.S, 2019.2.00112.S, and 2022.1.00499.S.
The sources were chosen to be within ∼1-4 kpc, corresponding to a linear spatial resolution of ∼300 to 1200 au at the nominal 0.3 ′′ angular resolution. Other than this distance requirement, the only other criterion used for selection was that the sources have a prior literature report of "hot core chemistry," either described as such or evidenced by the presence of a rich array of emission lines of methanol and other COMs. The goal is to provide a diverse dataset to better sample the phase space of chemical complexity, and avoid biases toward the chemistry seen in the commonly observed extraordinary sources such as Sgr B2 (see, e.g., El-Abd et al. 2019).
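The quoted linear scales follow directly from the small-angle relation, since by definition of the parsec an angle of 1 ′′ at 1 pc subtends 1 au. A minimal sketch of this conversion is given below; the function name is illustrative and not part of any survey software.

```python
def linear_resolution_au(theta_arcsec: float, distance_pc: float) -> float:
    """Projected size in au subtended by theta_arcsec at distance_pc.

    Small-angle limit: 1 arcsec at 1 pc corresponds to 1 au by definition
    of the parsec, so the conversion is a simple product.
    """
    return theta_arcsec * distance_pc


# Nominal CoCCoA beam of 0.3 arcsec at the two ends of the distance range.
for d_kpc in (1.0, 4.0):
    size = linear_resolution_au(0.3, d_kpc * 1e3)
    print(f"{d_kpc:.0f} kpc: {size:.0f} au")
```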
The observations cover two spectral tunings per source in Band 6. The lower tuning covers 238.0-241.7 GHz while the upper tuning covers 258.0-261.7 GHz. All observations are taken with 0.488 MHz spectral resolution (twice the channel width due to online Hanning smoothing), corresponding to ∼0.6 km s −1 . These frequency ranges were chosen to maximize both the number of transitions and the range of upper-state rotational energy levels of key target molecules including those highlighted here. These first-look results use data from the upper tuning (258.0-261.7 GHz) for which observations have been completed for 14 total sources listed in Table 1.
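The velocity resolution quoted above can be checked with the Doppler relation Δv = c Δν/ν. The sketch below evaluates it at the band centers of the two tunings; taking the band center as the reference frequency is an assumption made here for illustration.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def velocity_resolution_kms(delta_nu_mhz: float, nu_ghz: float) -> float:
    """Velocity width (km/s) of a frequency interval delta_nu at frequency nu."""
    return C_KM_S * (delta_nu_mhz * 1e6) / (nu_ghz * 1e9)


# 0.488 MHz spectral resolution at the centers of the lower and upper tunings.
for label, (lo, hi) in {"lower": (238.0, 241.7), "upper": (258.0, 261.7)}.items():
    center = 0.5 * (lo + hi)
    dv = velocity_resolution_kms(0.488, center)
    print(f"{label} tuning ({center:.2f} GHz): {dv:.2f} km/s")
```

Both tunings give values between about 0.56 and 0.61 km s −1 , consistent with the rounded ∼0.6 km s −1 stated in the text.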

Data reduction
The 12m-array CoCCoA data presented here, from project 2019.1.00246.S, were observed between March 24, 2021 and April 2, 2021 using configuration C-5. The data were calibrated using the Cycle 8 version of the ALMA Pipeline (CASA version 6.2), including corrections for renormalization issues larger than 2%. Due to the copious line emission emanating from the massive protostars within each targeted cluster, the default parameters of the findContinuum.py procedure used by the pipeline to identify line-free channels did not yield optimal results (i.e., significant line contamination was present). Therefore, findContinuum.py (available in the extern directory of the pipeline distribution) was run manually with a few key parameters (primarily sigmaFindContinuum) adjusted in order to minimize the line contamination.
After optimization of the channels used for continuum subtraction, the line-free channels were used to image the continuum and perform an iterative self-calibration (Brogan et al. 2018). The self-calibration solutions were also applied to the continuum-subtracted line datasets. The imaging of both the continuum and data cubes employed the multiscale functionality of CASA's tclean with scales of 0, 5, and 15 (the cellsize was chosen to oversample the beam on the smallest axis by a factor of ∼5). The robust parameter employed is 0.5, which yields an angular resolution near 0.3 ′′ ; both the continuum and line images were subsequently convolved to exactly 0.3 ′′ . The maximum recoverable scale of these data is ∼2.5 ′′ .
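The imaging choices described above map onto a small set of tclean keywords. The sketch below only assembles such a keyword dictionary and does not import or run CASA. The helper name is hypothetical, the native beam minor axis of 0.3 ′′ used in the example is an assumed placeholder, and the full set of keywords actually used for the survey is not given in the text.

```python
def tclean_imaging_kwargs(beam_minor_arcsec: float, oversample: float = 5.0) -> dict:
    """Assemble tclean keywords matching the imaging description.

    beam_minor_arcsec is the expected synthesized beam minor axis; the cell
    size is chosen so that this axis is sampled by roughly `oversample` pixels.
    """
    cell = beam_minor_arcsec / oversample
    return {
        "deconvolver": "multiscale",
        "scales": [0, 5, 15],       # multiscale sizes, in units of pixels
        "weighting": "briggs",
        "robust": 0.5,
        "cell": f"{cell:.3f}arcsec",
    }


# Example with an assumed 0.3 arcsec minor axis (placeholder value).
kwargs = tclean_imaging_kwargs(0.3)
print(kwargs)
```

In a CASA session these keywords would be passed to tclean together with the measurement set, image name, and image size, which are omitted here because they are not specified above.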
Considering that the data in the lower tuning are still under reduction, only the upper tuning was used for further analyses. The quality of the ALMA pipeline data of the lower tuning is good enough to see how many COM lines are included in each tuning. Among the 14 sources, G19.88-0.53 is taken as an example for line identification, since its spectrum shows less line blending despite the large line intensities. This ensures a sufficiently large number of detected lines without suffering too much from blending issues. The upper tuning covers about 70% of the COM lines found in both tunings (not including CH 3 OH), while the lower one is more abundant in strong lines of CH 3 OH and CH 3 CN, which are not directly relevant to our project. Therefore, we consider our results to be robust using only the data in the upper tuning. Figure 1 shows the continuum maps and the integrated intensity (moment 0) maps of the CH 3 OH 19 3,17 -19 2,18 transition with upper energy E up = 490 K and Einstein A coefficient A ij = 9.27 × 10 −5 s −1 . This particular methanol line was chosen because it is unblended and has the lowest E up and the highest A ij in the upper tuning, which is expected to show the most extended methanol emission. Figure 2 presents an overview of the spectra in the upper tuning, with several representative line features of MF and DME indicated by shaded areas. The spectra are extracted from the peak pixel of the methanol intensity maps. The moment 0 maps of other selected O-COMs of one example source G19.88-0.53 are shown in Fig. A.1, which confirmed that their emission peaks at similar regions to the methanol emission. In most sources, the methanol emission peaks at the same location as the continuum. However, in some bright sources such as G34.30+0.20 (Fig. 1), the methanol emission shows a ring shape around the continuum peak, which is likely due to the high optical depth of dust (De Simone et al. 2020;van Gelder et al. 2022). 
In this case, we picked the brightest pixel on the ring to extract the spectra. For the two bright sources G34.41+0.24 and G23.21-0.37, the methanol emission peaks at the same location as the continuum, but the spectrum extracted from the central pixel has too much line blending, so we deliberately chose a pixel offset from the actual peak (∼9 pixels offset for G34.41 and ∼3 pixels offset for G23.21).
Fig. 1. Continuum maps (odd rows) and moment 0 maps of the CH 3 OH 19 3,17 -19 2,18 transition with E up = 490 K (even rows, blue-red) for the 14 high-mass CoCCoA sources. The white contours indicate the continuum emission at 3, 5, 10, 30, 50, and 100σ level (σ = 0.03 mJy beam −1 ). The peak pixels from which the spectra were extracted are marked by crosses in light green. The sources are sorted by peak flux of methanol emission. The beam size (0.33 ′′ ) is denoted by white ellipses in the lower right of each panel. The field of view for NGC 6334 I(N)-SM2 and G35.20-0.74 N is set to be twice as large as that of others in order to include more emission peaks.
After extracting all the spectra, we performed line identification and spectral fitting using the spectral analysis software CASSIS (Vastel et al. 2015). The spectroscopic data in CASSIS are taken from two databases: the Jet Propulsion Laboratory database (JPL; Pickett et al. 1998) and the Cologne Database for Molecular Spectroscopy (CDMS; Müller et al. 2001, 2005; Endres et al. 2016). Detailed references for each species can be found in Appendix A of van Gelder et al. (2020). Vibrational corrections were applied to GA and EG (more information can be found in Sect. 5 of Jørgensen et al. 2016). Some species only have data available in one database, while others are included in both databases. In the latter case, we used the database with smaller uncertainties in central frequency or that has been used more frequently in previous work. We first went through the detection inventory of the PILS survey (Jørgensen et al. 2016) and checked for each species whether all the transitions in the databases have corresponding line features in the observed spectra. The detection results are presented in Sect. 4.2.

Spectral analyses
In the next step, we chose the six O-COMs and the two methanol isotopologues ( 13 CH 3 OH and CH 18 3 OH) mentioned in Sect. 1 for detailed spectral fitting using CASSIS (http://cassis.irap.omp.eu/). The two conformers of EG (a-EG and g-EG) were fitted separately due to their different sets of transitions. The column density (N), excitation temperature (T ex ), and full width at half maximum (FWHM) of each molecule were fitted for each source. Here we assume one T ex for each species, that is, the populations of all levels can be characterized by a single T ex , which is often called "local thermodynamic equilibrium (LTE)" in radio astronomy. However, the LTE here does not necessarily refer to its strict definition that T ex approaches the kinetic temperature T kin under high-density conditions. In fact, COMs are likely to be subthermally excited (T ex < T kin ) in hot cores, while the observed lines can still be well characterized by one T ex . Examples provided in Fig. 6 of Johnstone et al. (2003) and Fig. 8 of Jørgensen et al. (2016) show that for the case of methanol, densities of 10 9 -10 10 cm −3 are needed for thermalization. Correspondingly, physical models of the envelopes of high-mass protostars (e.g., van der Tak et al. 2013) indicate densities of 10 7 -10 9 cm −3 at temperatures of 100-300 K on scales of a few hundred au. At such densities, the level populations of COMs may not yet be fully in LTE, but the fitting results in Sect. 4 show that the single-T ex assumption is reasonable and works well. We adopted two methods for fitting the spectra: χ 2 minimization, or visual inspection when the former is not applicable. For each source, the radial velocity v lsr and FWHM are determined for each species based on strong unblended lines. The uncertainties of v lsr and FWHM are smaller than 0.5 km s −1 . The difference of v lsr among the O-COMs is within 1 km s −1 in most cases. 
For sources where the lines are narrow and unblended, grid-fitting was used to determine the best-fit values as well as the uncertainties by calculating χ 2 in the parameter space and making the contour plot on the N-T ex plane (also see Sect. 3.1 in van Gelder et al. 2020). For each species, χ 2 was calculated from the difference between the LTE model and the observed spectrum around unblended lines. We started with a sparse grid with broad ranges of N, T ex , and FWHM and large intervals between the grid points. We then gradually narrowed down to smaller ranges and smaller intervals. Finally, we ended up with a fine grid around the best-fit grid point, from which we could make the contour plot and estimate the 2σ uncertainties. This grid-fitting method works well with weaker sources where most of the lines are unblended. However, for bright sources where the lines are very broad and blended, grid-fitting does not converge to a solution and the results need to be visually inspected.
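The coarse-to-fine grid search described above can be illustrated with a toy example. This is a minimal sketch and not the CASSIS implementation: it assumes optically thin LTE line intensities, a partition function scaling as T ex ^1.5, a fixed FWHM (so only N and T ex are varied), and synthetic noise-free lines whose E up values and strengths are invented for the demonstration.

```python
import numpy as np


def model(log_n, t_ex, e_up, strength):
    """Toy optically thin LTE line intensities in arbitrary units."""
    q = t_ex ** 1.5  # assumed partition function scaling
    return 10.0 ** log_n * strength * np.exp(-e_up / t_ex) / q


def chi2_grid(obs, sigma, e_up, strength, log_n_axis, t_axis):
    """chi^2 evaluated on every (log N, T_ex) grid point."""
    ln, tt = np.meshgrid(log_n_axis, t_axis, indexing="ij")
    pred = model(ln[..., None], tt[..., None], e_up, strength)
    return np.sum(((obs - pred) / sigma) ** 2, axis=-1)


def grid_fit(obs, sigma, e_up, strength, log_n_range=(14.0, 18.0),
             t_range=(50.0, 400.0), n_pts=41, n_refine=4):
    """Start from a sparse grid, then repeatedly zoom in on the best point."""
    (lo_n, hi_n), (lo_t, hi_t) = log_n_range, t_range
    for _ in range(n_refine):
        log_n_axis = np.linspace(lo_n, hi_n, n_pts)
        t_axis = np.linspace(lo_t, hi_t, n_pts)
        chi2 = chi2_grid(obs, sigma, e_up, strength, log_n_axis, t_axis)
        i, j = np.unravel_index(np.argmin(chi2), chi2.shape)
        best_n, best_t = log_n_axis[i], t_axis[j]
        # Next grid: +/- 2 current grid steps around the best point.
        dn = 2.0 * (log_n_axis[1] - log_n_axis[0])
        dt = 2.0 * (t_axis[1] - t_axis[0])
        lo_n, hi_n = best_n - dn, best_n + dn
        lo_t, hi_t = max(best_t - dt, 1.0), best_t + dt
    return best_n, best_t


# Synthetic "observation": five unblended lines (hypothetical values).
e_up = np.array([50.0, 120.0, 200.0, 300.0, 500.0])   # K
strength = np.array([1.0, 0.8, 1.2, 0.9, 1.1])        # relative line strengths
obs = model(16.0, 150.0, e_up, strength)               # truth: log N = 16, T_ex = 150 K
sigma = 0.1 * obs                                      # assumed 10% uncertainties

best_log_n, best_t = grid_fit(obs, sigma, e_up, strength)
print(f"log10 N = {best_log_n:.3f}, T_ex = {best_t:.1f} K")
```

The recovery of T ex in this toy case relies on the lines spanning a wide range of E up . Contours of χ 2 on the final fine grid would then provide the uncertainties on the N-T ex plane.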
When it came to fitting by visual inspection, we started with an initial guess of the parameters and adjusted them toward a better fit until no improvement could be made. This is more efficient and reliable for complex spectra with blended lines, since we can monitor the change intuitively and interactively. This method was also adopted by van Gelder et al. (2020) and Nazari et al. (2021, 2022a). Uncertainties were estimated by comparisons shown in Fig. A.2, in which the three panels correspond to the upper/lower limits and the best fit. We can see in panel (a) that with an underestimated T ex , the model tends to overestimate transitions with low E up while the high-E up lines are still fitted reasonably, and vice versa in panel (c).
Table notes (recovered column headers: a-(CH 2 OH) 2 (CDMS), g-(CH 2 OH) 2 (CDMS)). The spectroscopic databases used in the fitting. (c) Components 1 and 2 (some sources exhibit two components in their spectra). (d) T ex in square brackets means that it was fixed to that value during the fitting. In this case, the uncertainties of N were determined only for this particular T ex .
In some cases, grid-fitting did not work well for T ex even when there are enough unblended lines available for a certain species. This is because a robust estimation of T ex requires the unblended lines to cover a wide range of E up so that the models can be sensitive to temperature changes. For species such as CH 18 3 OH, GA, and g-EG, there are few unblended lines covering a wide range of E up . These lines are either too weak to be detected in faint sources, or severely blended by strong lines in bright sources. Under these circumstances, grid-fitting cannot constrain T ex and returns very large uncertainties. As a solution, T ex was fixed to that of MF, which has the most identified transitions and therefore the best constraint on T ex , and only N was fitted, along with its uncertainties. In addition, the T ex of g-EG is always set equal to that of a-EG. We consider the fitting results of N still representative since the difference would be within a factor of two if we change the fixed value of T ex by 20-50 K in the range of 100-250 K, which is a typical temperature range for hot cores (e.g., Fig. 2 in Ligterink et al. 2015). Similarly, if the FWHM cannot be constrained to better than 0.5 km s −1 uncertainty due to the blending of lines, it was also fixed to a value (e.g., if the possible range is 3.5-4.0 km s −1 , we use 3.8 km s −1 ).
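The statement that N changes by less than a factor of two when the fixed T ex is shifted can be checked with a simple estimate. For an optically thin line of fixed observed intensity, the inferred column density scales as N ∝ Q(T ex ) exp(E up /T ex ). The sketch below assumes Q ∝ T ex ^1.5 and ignores vibrational corrections; both are simplifications introduced here, not taken from the analysis above.

```python
import math


def n_scale_factor(t_new: float, t_ref: float, e_up: float, q_index: float = 1.5) -> float:
    """Ratio N(t_new) / N(t_ref) for a fixed optically thin line intensity.

    Assumes N is proportional to Q(T) * exp(E_up / T) with Q ~ T**q_index.
    """
    return (t_new / t_ref) ** q_index * math.exp(e_up / t_new - e_up / t_ref)


# Shift the fixed T_ex by 50 K across 100-250 K for E_up = 100-300 K,
# the range covered by most identified transitions.
factors = [
    n_scale_factor(t_ref + 50.0, t_ref, e_up)
    for t_ref in (100.0, 150.0, 200.0)
    for e_up in (100.0, 200.0, 300.0)
]
print(f"min factor = {min(factors):.2f}, max factor = {max(factors):.2f}")
```

Under these assumptions every combination stays between 0.5 and 2, in line with the factor-of-two statement.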
Additionally, two special cases were encountered during the fitting. The first is that the spectra of several sources (i.e., G19.01-0.03, G34.30+0.20, G34.41+0.24, and G35.20-0.74N) show double-peaked line profiles in some O-COMs, which cannot be fitted with only one component. These double-peaked features appear in nearly all the transitions, and thus are unlikely to be due to self-absorption. An example of two-component fitting is given by Fig. A.3, which shows that the spectra can be well fitted by two components with different v lsr and N, while T ex remains the same (see Table 3 for relevant sources and species). It is likely that these sources are not spatially resolved (e.g., maps of G19.01-0.03 show a perfect beam shape in Fig. 1), and there is more than one physical component contained in the beam. The second case is that in the two sources G23.21-0.37 and NGC 6334-38, the MF lines are found to be highly optically thick. All strong lines with A ij ≳ 10 −4 s −1 are saturated, and therefore do not show correct intensity ratios against weaker lines. In this case, the fitting was only based on weak lines with A ij < 10 −4 s −1 .
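The effect of saturation on line ratios can be illustrated with the standard radiative transfer factor (1 − e^−τ). The sketch below assumes two lines with the same T ex and background, whose optical depths differ by a fixed factor; the factor of 10 used in the example is hypothetical.

```python
import math


def intensity_ratio(tau_weak: float, tau_ratio: float) -> float:
    """Strong-to-weak line intensity ratio for a common T_ex.

    Each line intensity is proportional to 1 - exp(-tau); the strong line
    has an optical depth tau_ratio times that of the weak line.
    """
    strong = -math.expm1(-tau_ratio * tau_weak)
    weak = -math.expm1(-tau_weak)
    return strong / weak


# Optically thin limit: the ratio follows the optical depth ratio.
print(f"tau_weak = 1e-4: ratio = {intensity_ratio(1e-4, 10.0):.2f}")
# Once the strong line saturates, the ratio collapses toward unity.
print(f"tau_weak = 1.0:  ratio = {intensity_ratio(1.0, 10.0):.2f}")
```

This is why saturated strong lines cannot be fitted together with weak lines by a single optically thin scaling, and why only the weak lines were used for these two sources.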

Isotope ratio calibration for methanol
As mentioned in Sect. 3.1, the column density of CH 3 OH needs to be inferred from its minor isotopologues since the main isotopologue itself is likely to be optically thick. The isotope ratios of 16 O/ 18 O and 12 C/ 13 C can be calculated from the distance to the Galactic center (D GC ) using the equations in Wilson & Rood (1994) and Milam et al. (2005):
$$ (^{16}\mathrm{O}/^{18}\mathrm{O}) = (58.8 \pm 11.8)\,D_\mathrm{GC} + (37.1 \pm 82.6) \quad (1) $$
$$ (^{12}\mathrm{C}/^{13}\mathrm{C}) = (6.21 \pm 1.00)\,D_\mathrm{GC} + (18.71 \pm 7.37) \quad (2) $$
where D GC (in kpc) can be derived from the coordinates of the sources and their distances to Earth. The two values before and after D GC (with uncertainties) are the slope and intercept of these linear relationships, respectively. In the vicinity of the solar system, D GC is 8.05 kpc, which gives 16 O/ 18 O ∼ 510 and 12 C/ 13 C ∼ 69.
In the literature, 560 and 70 are commonly used for nearby low-mass sources. For the high-mass sources that are farther away from the solar system, isotope ratios can be calculated from D GC before applying them to infer the column density of 12 CH 16 3 OH (Table 1). If both 13 CH 3 OH and CH 18 3 OH are optically thin, we can expect their column density ratio to be 7-8, that is, the 16 O/ 18 O ratio divided by the 12 C/ 13 C ratio. However, our fitting results show a ratio of 2-5, which indicates that 13 CH 3 OH is also (marginally) optically thick. Therefore, only CH 18 3 OH is used to calculate the column density of methanol.
If the errors of both the slope and the intercept are considered in error propagation, the uncertainties of 16 O/ 18 O would be around 30% of the ratios themselves (e.g., if 16 O/ 18 O = 300, the error will be ∼90). However, a large portion of the total uncertainty comes from the intercept error (82.6), which is even two times larger than the intercept itself (37.1). Considering that only the slope in Eq. (1) contains the information of the trend between the 16 O/ 18 O ratio and D GC , and the intercept error only represents the scatter of the sources from which the equation was originally fitted, we did not include the intercept error in the error propagation. This yields a decrease in the uncertainties of 16 O/ 18 O from ∼30% to ∼18%. Results that include the intercept error are shown in Fig. A.8.
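The isotope ratios and the two error treatments discussed above can be reproduced with a short calculation. In this sketch, the 16 O/ 18 O intercept (37.1 ± 82.6) and the 12 C/ 13 C coefficients are taken from the text, while the 16 O/ 18 O slope of 58.8 ± 11.8 is the value given by Wilson & Rood (1994) as recalled here and should be checked against that paper. The uncertainty on D GC itself is ignored, which is an additional simplification.

```python
import math

# (slope, slope error, intercept, intercept error) of ratio = slope * D_GC + intercept
O16_O18 = (58.8, 11.8, 37.1, 82.6)   # slope assumed from Wilson & Rood (1994)
C12_C13 = (6.21, 1.00, 18.71, 7.37)


def isotope_ratio(d_gc_kpc, coeffs, include_intercept_error=False):
    """Return (ratio, uncertainty) at Galactocentric distance d_gc_kpc.

    By default only the slope error is propagated; the D_GC uncertainty
    is not included.
    """
    slope, slope_err, icpt, icpt_err = coeffs
    value = slope * d_gc_kpc + icpt
    variance = (slope_err * d_gc_kpc) ** 2
    if include_intercept_error:
        variance += icpt_err ** 2
    return value, math.sqrt(variance)


# Solar neighbourhood check (D_GC = 8.05 kpc).
o_ratio, _ = isotope_ratio(8.05, O16_O18)
c_ratio, _ = isotope_ratio(8.05, C12_C13)
print(f"16O/18O = {o_ratio:.0f}, 12C/13C = {c_ratio:.0f}, quotient = {o_ratio / c_ratio:.1f}")

# Relative uncertainty for a source with 16O/18O = 300.
d_gc = (300.0 - O16_O18[2]) / O16_O18[0]
_, err_slope_only = isotope_ratio(d_gc, O16_O18)
_, err_full = isotope_ratio(d_gc, O16_O18, include_intercept_error=True)
print(f"slope only: {err_slope_only / 300:.0%}, with intercept: {err_full / 300:.0%}")
```

The column density of the main isotopologue then follows by multiplying the fitted CH 18 3 OH column density by the 16 O/ 18 O ratio at the D GC of each source.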

Morphology
The continuum maps of the 14 CoCCoA sources show different morphologies (see odd rows in Fig. 1). Six out of 14 show a single peak in both continuum and methanol emission. Others have multiple peaks in continuum but only one or two peaks in methanol emission, usually corresponding to the brightest continuum peak(s). G345.5+1.5 is an exception in that its methanol peak is offset from the continuum peak, and the corresponding region in continuum does not show a peak feature. G35.20-0.74N is an interesting source where the four continuum peaks are located along a line and the methanol emission tends to follow the same alignment. Except for several sources that are not well resolved, most sources show extended weak continuum emission around the flux peaks, implying the existence of dusty envelopes or parent cores. More detailed studies on the morphology of CoCCoA sources will be presented in a future paper.

Spectra and detection
As expected from the selection of sources and frequency range, the extracted spectra are rich in COM lines for all the 14 sources in the upper tuning. An overview of the full spectra of each source is presented in Fig. 2. We can see that the spectral appearance of our sources is diverse in intensities and line widths. Bright sources such as G34.41+0.24 and G23.21-0.37 have very strong and broad lines (≳5 km s −1 ) that are severely blended and can only be fitted by visual inspection. There are also faint sources like IRAS 1851-1208 and NGC 6334 I(N)-SM2 where the lines are much weaker and narrower (<2 km s −1 ). Their spectra are clean enough for χ 2 fitting, but some less abundant species and some weak lines (with lower A ij ) may remain undetected. The spectra of the other sources have intermediate intensities and line widths (3-4 km s −1 ), which are easiest to fit.
The 14 sources provide a rich inventory of detections. More than ten O-bearing COMs and about five N-bearing species are detected in the upper tuning. These detections include the originally targeted species as well as other O- and N-bearing species, covering alcohols, aldehydes, esters, ethers, ketones, acids, and the simplest sugar (GA). We do not detect methoxymethanol (CH 3 OCH 2 OH) since our frequency range only covers its weak lines with A ij < 10 −5 s −1 and no corresponding line features are spotted in the spectra. Besides the COMs mentioned above, two simple O-bearing molecules, ketene (H 2 CCO) and formic acid (t-HCOOH), have 1 and 3 transitions covered in the upper tuning, respectively. The N-bearing species include HNCO, CH 3 CN, C 2 H 5 CN, and NH 2 CHO. Some abundant species such as HNCO and CH 3 CN also have their 13 C isotopologues detected. A hydrocarbon molecule, propyne (CH 3 CCH), is also detected but with all its transitions covered in the lower tuning. Simpler molecules such as SiO, SO, and SO 2 only have very few strong lines covered in the upper tuning and do not affect our analyses on O-COMs. Species that are detected in the CoCCoA sources but are not the focus of this paper will be studied in future work.
Figure 3 shows the best-fit model of G19.88-0.53 in a line-rich frequency range (259.4-260.6 GHz). A version for the full upper tuning is given by Fig. A.4, and Figs. A.5-A.7 show zoom-in panels for selected unblended lines of each species. The model contains the minor isotopologues of methanol and the six O-COMs that we focus on in this paper. More than 70% of the line features can be fitted quite well with uncertainties of < 30% assuming a single excitation temperature. 
Apart from several strong features that are attributed to the main isotopologue of methanol and some simple molecules (e.g., SO and SO 2 ), weaker features that are not well fitted by the model likely originate from other COMs that are not included in the model, such as acetone (CH 3 COCH 3 ), N-bearing COMs, and some minor isotopologues of the detected species. The identified transitions of the selected O-COMs are listed in Table B.1, where transitions that are above 3σ and not fully blended with other strong lines are considered as "identified." We can see that most of the identified transitions have an upper energy level of 100-300 K.

Column density and excitation temperature
The column densities and excitation temperatures of all sources are presented in Tables 2 and 3. With the uniform beam size of 0.33″, most sources have a methanol column density of 10^18-10^19 cm^-2, except the weakest source IRAS 18151-1208, which has N = 7 × 10^17 cm^-2. Among the six O-COMs that we focus on (excluding methanol isotopologues), DME and MF are the most abundant, with column densities about one order of magnitude higher (10^16-10^17 cm^-2) than those of the other four species. These two species always have a considerable number of strong lines available for fitting in the spectra of all sources (as shown by the shaded areas in Fig. 2). C2H5OH has somewhat fewer distinct transitions (unblended and with large A_ij), and it is not always abundant enough to produce strong lines. In some sources, the column density of C2H5OH is of the same order of magnitude as those of DME and MF, while in other sources the difference can reach up to two orders of magnitude. This large variation is also present for a-EG, which has many strong transitions covered in our data. In some sources, the number of detected lines of a-EG is as high as that of DME, while in others there are only a few obvious line features. The fitting of GA and g-EG is more difficult, since they only have two to three distinct transitions, which are often subject to blending issues. In summary, the average column densities of the six O-COMs rank as DME ∼ MF > C2H5OH > (a+g)EG > CH3CHO > GA. Since the absolute values of N are related to the physical environments of the parent sources, it is more useful to look at the relative abundances of O-COMs, that is, their column density ratios with respect to methanol (see Sect. 5 for details).
As for the excitation temperature, C2H5OH, MF, and a-EG tend to have a warm T_ex of ≳150 K, while CH3CHO and DME have a relatively lower T_ex of 100-130 K. This may be because different species have different emitting areas within the hot core; for example, some species emit from a slightly colder and more extended region, which is not well resolved in our sample (see Fig. A.1). Hot cores are known to have temperature gradients (e.g., van der Tak et al. 1999, 2000; Beltrán et al. 2018; Gieser et al. 2019), but these have only been probed on scales larger than the observing beams. Since our analyses are based on the spectra at the peak pixels, the temperature structure on larger scales is not expected to affect our results. The lower T_ex of CH3CHO and DME is also consistent with earlier single-dish findings for other high-mass sources (Bisschop et al. 2007; Isokoski et al. 2013) and the results for low-mass sources in van Gelder et al. (2020). As mentioned in Sect. 3.1, CH3^18OH, GA, and g-EG usually have fixed T_ex due to a lack of unblended lines covering a wide range of E_up.

Column density ratios
To further investigate the COM chemistry in our sample, we calculated the column density ratios of O-COMs with respect to methanol and summarize them in Fig. 4. The column densities of the two conformers of EG, a-EG and g-EG, are summed in the last subplot, although they were fitted separately. The uncertainty of the column density ratios was propagated from the uncertainties of the O-COM column densities and the ^16O/^18O ratio (calculated in Sect. 3.2) of each source. We compare our observational results for high-mass protostellar sources with the results for low-mass sources in the literature (Jørgensen et al. 2018; Manigand et al. 2020; van Gelder et al. 2020) and the simulation results of COM chemistry in Garrod et al. (2022). For consistency, this figure does not include results of other COM observations (e.g., Csengeri et al. 2019; Yang et al. 2021), since they did not use minor isotopologues to derive the column density of methanol as we did.
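The ratio calculation described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used for Fig. 4: it assumes that the methanol column density is obtained by scaling N(CH3^18OH) by the ^16O/^18O ratio, and that the uncertainties of the O-COM column density and of the isotope ratio are independent and combine in quadrature. The function names and all numerical inputs are hypothetical.

```python
import math


def methanol_column_density(n_18o_methanol, ratio_16o_18o):
    """Scale the CH3^18OH column density to the main isotopologue CH3OH."""
    return n_18o_methanol * ratio_16o_18o


def com_ratio_with_uncertainty(n_com, sigma_com, n_18o_methanol,
                               ratio_16o_18o, sigma_ratio_16o_18o):
    """Return N(COM)/N(CH3OH) and its 1-sigma uncertainty.

    Assumption: only the O-COM column density and the 16O/18O ratio carry
    uncertainty, and their relative errors add in quadrature.
    """
    n_methanol = methanol_column_density(n_18o_methanol, ratio_16o_18o)
    ratio = n_com / n_methanol
    relative_error = math.sqrt((sigma_com / n_com) ** 2
                               + (sigma_ratio_16o_18o / ratio_16o_18o) ** 2)
    return ratio, ratio * relative_error


# Hypothetical inputs (cm^-2 for column densities), chosen only for illustration.
example_ratio, example_sigma = com_ratio_with_uncertainty(
    n_com=5.0e16, sigma_com=1.0e16,
    n_18o_methanol=4.0e15,
    ratio_16o_18o=500.0, sigma_ratio_16o_18o=50.0,
)
print(f"N(COM)/N(CH3OH) = {example_ratio:.3f} +/- {example_sigma:.3f}")
```

With these made-up numbers the 20% error on the O-COM column density dominates over the 10% error on the isotope ratio, which shows why the two terms are kept separate.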
From Fig. 4 we can see that there is no obvious difference in the O-COM ratios between low-mass and high-mass sources. However, molecules in the two groups should have experienced different physical conditions, such as temperature and fluence of energetic particles (UV photons, X-rays, cosmic rays) before the hot cores/corinos formed. This implies that these species are likely formed under similar conditions, which points to a common and early pre-stellar stage before the star formation processes begin to differentiate.
The six selected O-COMs have different column density ratios with respect to methanol, varying by over two orders of magnitude between 0.01% and 10%. Among the six species, DME and MF have the highest and similar ratios with respect to methanol (2-3%). C2H5OH and EG have intermediate ratios but with a much larger scatter. CH3CHO and GA have the lowest ratios at around 0.1%. Figure 5 shows the spread factor, that is, the log10 of the standard deviation of the O-COM ratios in log scale, for our combined sample of low-mass and high-mass protostars. DME and MF clearly have smaller scatter than the other four species. The scatter may result from an observational effect, namely that different species have different gas-phase emitting areas depending on their sublimation temperatures from dust grains (Nazari et al. 2021, 2022a). The column densities that we derive from spectral fitting represent the abundances averaged over the observational beam. If the actual emitting area of a species is smaller than the area that we can resolve, then we suffer from beam dilution. This is likely the case in the CoCCoA observations, since the moment 0 maps of selected O-COMs given in Fig. A.1 show that the emission of these molecules is barely or not spatially resolved. The physical area covered by a given beam depends not only on the angular resolution (which is constant across the CoCCoA sample), but also on the distance to a source (which varies by a factor of 3-4 across the sample). Moreover, the size of the hot core, as defined by the radius where T = 100 K, depends on the square root of the luminosity of the source (Bisschop et al. 2007), that is, R(T = 100 K) ∝ √L. Therefore, there is a beam dilution factor between the observationally inferred column density and the actual one. This factor can differ from source to source and from species to species, and hence can lead to the scatter in COM ratios. 
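To illustrate how distance and luminosity enter the beam dilution argument above, the following sketch combines the √L scaling of the hot core radius with a Gaussian filling factor. Several ingredients are assumptions and not taken from this text: the normalisation of the radius relation (2.3 × 10^14 cm per √L_sun), the use of θ_s²/(θ_s² + θ_b²) as the dilution factor, and the example luminosity and distances. Only the square-root dependence and the 0.33″ beam come from the paper.

```python
import math

# Assumed normalisation of the R(T = 100 K) ~ sqrt(L) scaling, in cm per sqrt(Lsun).
# Only the square-root dependence is stated in the text; this value is an assumption.
R100_COEFF_CM = 2.3e14
AU_CM = 1.495978707e13  # one astronomical unit in cm


def hot_core_radius_au(luminosity_lsun):
    """Radius (au) at which T = 100 K for a source of the given luminosity."""
    return R100_COEFF_CM * math.sqrt(luminosity_lsun) / AU_CM


def angular_diameter_arcsec(radius_au, distance_pc):
    """Angular diameter in arcsec (1 au at 1 pc subtends 1 arcsec)."""
    return 2.0 * radius_au / distance_pc


def beam_dilution_factor(source_arcsec, beam_arcsec):
    """Filling factor of a Gaussian source in a Gaussian beam (assumed form)."""
    return source_arcsec ** 2 / (source_arcsec ** 2 + beam_arcsec ** 2)


BEAM_ARCSEC = 0.33  # uniform beam size quoted for the CoCCoA data

# Hypothetical sources: same luminosity, distances differing by a factor of 4.
for distance_pc in (1500.0, 6000.0):
    radius_au = hot_core_radius_au(1.0e4)
    theta = angular_diameter_arcsec(radius_au, distance_pc)
    factor = beam_dilution_factor(theta, BEAM_ARCSEC)
    print(f"d = {distance_pc:.0f} pc: theta = {theta:.2f} arcsec, dilution = {factor:.2f}")
```

In this picture the beam-averaged column density is the true one multiplied by the dilution factor, so a species with a smaller emitting region (a smaller `source_arcsec`) is suppressed more strongly in the same beam.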
These effects are discussed and quantified in more detail in Nazari et al. (2022b, 2023). More generally, it should be noted that surveys of large samples of low- and high-mass protostars have found that some fraction of sources do not show any methanol or COM emission (e.g., Yang et al. 2021; van Gelder et al. 2022, paper I). The reasons for this absence of COM emission are varied but include the possible presence of a disk, which lowers the overall temperature structure (Nazari et al. 2022b, paper II), as well as different evolutionary stages, such as the presence of an H II region (Nazari et al. 2023, modeling paper).
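The spread factor plotted in Fig. 5 can be sketched in a few lines. The definition in the text is compact, so this example adopts one possible reading as an assumption: the scatter is the sample standard deviation of log10(ratio), expressed in dex. The function name and the input ratios are hypothetical.

```python
import math
import statistics


def spread_factor_dex(ratios):
    """Sample standard deviation of log10(ratio), in dex.

    Assumed reading of the "1 sigma scatter in log scale"; the paper may
    use a slightly different convention (e.g., population standard deviation).
    """
    log_ratios = [math.log10(r) for r in ratios]
    return statistics.stdev(log_ratios)


# Hypothetical ratios with respect to methanol for two species.
tight_species = [0.02, 0.025, 0.03, 0.022]    # ratios clustered within a factor of ~1.5
scattered_species = [0.0005, 0.004, 0.02, 0.001]  # ratios spanning more than a decade

print(f"tight:     {spread_factor_dex(tight_species):.2f} dex")
print(f"scattered: {spread_factor_dex(scattered_species):.2f} dex")
```

Working in log space means that a spread of 1 dex corresponds to a factor of ten in the ratio, which matches the way the scatter is described in orders of magnitude.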

Observations versus simulations
The comparison with simulations is mainly based on the state-of-the-art models in Garrod et al. (2022), hereafter G22. The G22 models simulate chemistry coupled across three phases: gas phase, grain/ice surface, and bulk-ice mantle. The last two phases are collectively known as the solid phase. The evolution of hot cores is treated as two stages, a cold collapse stage followed by a static (fixed density) warm-up stage once the central protostar has formed. There are three warm-up timescales used in the models: 5 × 10^4 yr (fast), 2 × 10^5 yr (medium), and 1 × 10^6 yr (slow). A major update in the G22 models is the inclusion of non-diffusive chemistry on surfaces and in bulk ices, which is proposed to be important in interstellar ices based on laboratory work (Linnartz et al. 2015). G22 test the effect of different non-diffusive mechanisms along with other parameters in about 20 models (see Table 1 in G22). The "final" model includes all the discussed non-diffusive mechanisms and we use it as the fiducial model for further discussion. In Fig. 4, the horizontal dashed lines correspond to the COM ratios (w.r.t. methanol) in the solid phase at the end of the collapse stage, and the horizontal dotted lines correspond to the gas-phase ratios after the warm-up stage with the medium warm-up speed. In addition to Fig. 4, which shows the O-COM ratios w.r.t. methanol, we also present a number of ratios between two O-COMs in Fig. 6. In the following subsections, we compare our observational results for each species to the "final" model with the "medium" warm-up speed in G22.

CH3OCH3 (DME) and CH3OCHO (MF)
DME and MF have the most stable column density ratios in observations and the best match with the G22 simulations (Fig. 4a,b). Their high abundances are often underproduced by experiments and simulations (Chuang et al. 2016; Simons et al. 2020; Jin & Garrod 2020). However, with the inclusion of the new formation routes, G22 are able to reproduce our observational results on the gas-phase ratios of DME and MF with respect to methanol.
In the G22 models, more than 60% of the DME is formed in ices, through the reaction between methylene (CH2) and methanol via
CH2 + CH3OH → CH3OCH2, (4)
and
H + CH3OCH2 → CH3OCH3. (5)
Nearly 90% of CH3OCH2 radicals are formed by the combination of CH2 and CH3O, in which CH3O comes from the hydrogenation of formaldehyde (H2CO) or H abstraction from methanol. As a reactant in two reactions, CH2 directly affects the formation of DME ices. G22 set the activation energy barrier of the grain-surface reaction
C + H2 → CH2 (6)
to zero (Krasnokutski et al. 2016; Henning & Krasnokutski 2019), and added about 20 grain-surface CH2-related reactions to the network (see Table 4 in G22). This greatly enhances the efficiency of forming CH2 and its subsequent contribution to COM formation. Nevertheless, this assumption is not fully supported by the combined experimental/theoretical work of Lamberts et al. (2022), who argued that the reaction between C and H2 is unlikely to be fully barrierless on water ices. However, the effect on the full kinetic model may not be significant if reaction (6) has only a modest non-zero barrier. G22 also introduced a set of methylidyne (CH) reactions (Table 5 in G22) which can form CH2 and larger hydrocarbons barrierlessly. These reactions can compensate for the reduced CH2 formation when reaction (6) has a barrier. According to G22, the inclusion of CH and CH2 chemistry in the solid phase enhances the abundance of solid-phase DME by more than a factor of 2. This emphasizes the importance of including carbon hydrogenation in the chemical network.
Besides the bottom-up formation of DME from CH2, the photodissociation of CH4 may also provide important ingredients. Several experimental studies were able to produce DME from UV-irradiated ices of CH3OH and CH4 (Öberg et al. 2009; Paardekooper et al. 2016; Yocum et al. 2021). However, Fedoseev et al. (2015) and Chuang et al. (2016, 2017) did not observe DME formation in experiments where CH4 or H2CO was not included in the deposition, even when UV was introduced in Chuang et al. (2017).
Fig. 5: Spread factor (log10 of the 1σ scatter) for the ratios plotted in Fig. 4 of different O-COMs with respect to methanol.
According to G22, the remaining 40% of the DME is formed in the gas phase through two reactions:
CH3OH + CH3OH2+ → CH3OCH4+ + H2O,
CH3OCH4+ + NH3 → CH3OCH3 + NH4+.
In the first step, protonated DME is formed via the reaction between methanol and protonated methanol. The second step is the proton transfer to ammonia (Charnley et al. 1995; Rodgers & Charnley 2001; Taquet et al. 2016), where ammonia comes from ice sublimation, and therefore the reaction sequence would not be efficient under cold conditions (Skouteris et al. 2019).
MF in the solid phase forms on grain surfaces mainly through a non-diffusive reaction (Chuang et al. 2016). A small contribution is made by the newly introduced 3-body excited-formation (3-BEF) reactions in the bulk ice, whereby the hydrogenation of H2CO would produce excited CH3O that can overcome the reaction barrier (Jin & Garrod 2020). The CH3OCO radicals can then recombine with H atoms to form MF. Reactions (9) and (10) occur in the cold collapse stage and contribute about 70% of the total MF. In the gas phase, the newly added reaction (Balucani et al. 2015)
O + CH3OCH2 → CH3OCHO + H
becomes the main production route when T > 100 K. The gaseous CH3OCH2 radicals can be released from ice mantles, or converted from DME by OH abstraction (Shannon et al. 2014):
CH3OCH3 + OH → CH3OCH2 + H2O.
The formation of both DME and MF is strongly related to methanol and its precursor CH3O. Simulations with different input n(H)/n(CO) ratios by Simons et al. (2020) and experiments by Santos et al. (2022) proposed that the final step of methanol formation is dominated by
CH3O + H2CO → CH3OH + HCO.
Chuang et al. (2016) noticed that H2CO is a prerequisite for the formation of MF, that is, without the input of H2CO there would be no MF detected in the outcome, which was also the case in Fedoseev et al. (2015), where only H and CO were used. 
Since CH3O is necessary to form MF, it is inferred that the H abstraction from CH3OH yields primarily CH2OH, while CH3O mainly comes from the hydrogenation of H2CO. This may explain why DME and MF retain relatively stable abundance ratios with respect to methanol among a large sample of sources, since the formation of all three species tends to be strongly related to the same precursor, CH3O. However, it is not clear whether the DME/MF ratio of ∼1 (Fig. 6a) is a pure coincidence or reflects some chemical balance between the two COMs.

C2H5OH
The observed abundance ratio of ethanol is on average one order of magnitude lower than the simulation results in G22. The scatter is also relatively large compared to DME and MF. The G22 models give nearly the same abundance after the collapse stage and the warm-up stage. This is because in their models ethanol is formed almost entirely in the early cold collapse stage on dust grains. The hydrocarbon radicals that enter the dominant reactions, including reaction (14), are formed by diffusive reactions of atomic H with ethane (C2H6) and ethylene (C2H4). These routes may be supported by the experiments in Chuang et al. (2020) showing that ethanol can be formed through non-energetic processing of C2H2 ices. However, the efficiency of these reactions is not well constrained. The OH radicals in reaction (14) come from photodissociation of water ice by external UV radiation. The G22 models used the same initial visual extinction (A_V,init = 3) throughout the molecular cloud, while in reality, A_V is higher in the inner part of the cloud. This difference may result in an overestimation of the UV intensity in the cold collapse stage, and hence the overproduction of ethanol on dust grains.
Besides the production routes, there is also a recently proposed gas-phase destruction mechanism, the "ethanol tree" network (Skouteris et al. 2018), that is not included in the G22 models. This network starts with the H abstraction from ethanol by halogen atoms or OH radicals, leading to two reactive radicals: CH3CHOH and CH2CH2OH. These radicals are further converted into formic acid (HCOOH) and formaldehyde (H2CO) by reacting with O atoms, along with other minor products (see branching ratios in Fig. 1 of Skouteris et al.). As a result, the abundance of gas-phase ethanol decreases over time. Skouteris et al. (2018) predict that the ethanol/GA ratio falls from ∼200 to ∼10 in about 1000 years. In our observations, the C2H5OH/GA ratios are around 10 (see Fig. 6d). This suggests that the gas-phase destruction of ethanol in a later stage may play a role in explaining its overproduction in chemical models.

CH3CHO
In the G22 simulations, the abundance ratio of CH3CHO with respect to methanol in the solid phase after the collapse stage agrees well with our observational data. However, there are discrepancies among the warm-up stages with different timescales. The longer the warm-up stage is, the more CH3CHO is produced, indicating substantial gas-phase formation. In the G22 models, only 25% of the total amount of CH3CHO is formed in ices during the cold collapse stage (T ∼ 10 K) through the hydrogenation of ketene:
CH2CO + 2H → CH3CHO.
About 35% is produced in the early warm-up stage (T < 100 K) by reactions on the grain surface and in the bulk ice. Nearly 40% is formed in the gas phase through reaction (19), after the desorption from dust grains at T > 200 K in the warm-up stage. Vazart et al. (2020) give a brief overview of the recorded gas-phase formation routes of CH3CHO and explore some new reactions by theoretical computations. They confirm that reaction (19) is efficient in the temperature range of 7-300 K.
The observed ratios of CH3CHO only match the modeled values after the cold collapse stage and the fast warm-up stage (the latter of which is not shown in Fig. 4d), which implies that the formation of CH3CHO in the early stage may be more dominant than suggested by the G22 models. The experimental study by Fedoseev et al. (2022) proposed a similar formation route for CH3CHO, that is, the hydrogenation of ketene on cold (10 K) surfaces of mixed C, H, CO, and H2O ices. They proposed that H2CCO can be formed through hydrogenation of CCO radicals, which are the product of the barrierless association reaction between CO molecules and C atoms. Unfortunately, the intermediate products, CCO and HCCO, are very difficult to observe in experiments due to their high reactivity, and the relevant reaction rates remain undetermined. Chuang et al. (2020, 2021) also proposed formation routes from C2H2 ices to CH3CHO at T = 10 K under both non-energetic and energetic conditions. To better understand the difference between the chemical modeling results in G22 (especially the gas-phase ratios after the warm-up stage) and our observational results, we need to know more about the relative importance of CH3CHO formation in the solid and gas phases.
It is also interesting to note the relation between CH3CHO and the two more hydrogenated O-COMs, GA and C2H5OH. The ratios of CH3CHO/GA and C2H5OH/CH3CHO shown in Fig. 6(b) and (c) behave differently: CH3CHO/GA is ∼1 with a small scatter comparable to that of DME/MF, while C2H5OH/CH3CHO exhibits a larger scatter. These results are counterintuitive, since C2H5OH is suggested to be a direct hydrogenation product of CH3CHO (e.g., experiments by Fedoseev et al. 2022), while GA and CH3CHO are usually not present simultaneously in experiments. More investigation is needed to verify whether GA and CH3CHO are chemically linked.

CH2OHCHO (GA) and (CH2OH)2 (EG)
In contrast to MF, simulations and experiments usually produce more GA than observations (e.g., Chuang et al. 2016, 2017; Simons et al. 2020), as do the G22 models. Fedoseev et al. (2015, 2017) have confirmed experimentally that GA and EG can be formed in the solid phase through surface hydrogenation of CO molecules at T = 13 K. They suggested that GA and EG are the outcome of successive hydrogenation of glyoxal (HC(O)CHO), which is formed from two HCO radicals. In the G22 models, EG is mainly formed through the addition of two CH2OH radicals in the solid phase:
CH2OH + CH2OH → (CH2OH)2.
Most of the GA is formed through repetitive H abstraction from EG (reactions 21-22). About 60% of this conversion from EG to GA is finished in the cold collapse stage when T ≲ 10 K, while about 30% occurs in the middle of the warm-up stage when T ∼ 100-200 K. A small portion of solid-phase GA is formed via the route proposed by Chuang et al. (2016). However, despite the large scatter in the observed abundance ratios of GA and EG, it is clear that EG is overall more abundant than GA in observations, while the G22 models give the opposite result. This implies that the interconversion between EG and GA may not be well modeled in G22.
Another simulation work by Simons et al. (2020) computed their O-COM network with four H/CO input ratios (5-60%) at six low temperatures (8-20 K). They summarized the flux distribution of the network for the fiducial model with n(CO) = 10.0 cm^-3, n(H) = 2.5 cm^-3, and T = 10 K (see Fig. 8 of Simons et al.). They found that the hydrogenation of glyoxal is more important to the GA formation than the H abstraction from EG. Their results also show that the relative abundance of GA to EG is very sensitive to the n(H)/n(CO) ratio. The observed abundance ratios of GA over EG are ∼0.1-1 (Fig. 6g), which corresponds to n_initial(H)/n_initial(CO) > 0.25 in their models. Chuang et al. (2017) were able to reproduce a similar GA/EG ratio through the hydrogenation of ices with CO:CH3OH = 4:1 at T = 14 K, and also found it to be subject to the initial composition ratios. Besides pure hydrogenation, they tried introducing UV radiation, but the GA/EG ratio was not affected. The simulations and experiments mentioned above suggest that the formation of GA and EG is strongly regulated by the relative abundance of H atoms with respect to other ingredients such as CO. Parameters such as activation energy barriers and branching ratios of the reactions involved in GA and EG formation may also contribute to the difference between simulations and observations. The abundance ratio of GA with respect to methanol predicted by the G22 models is nearly one order of magnitude higher than our observational results. A possible reason for the overproduction by chemical models is that GA has a higher desorption temperature than methanol (see Fig. 2 in Fedoseev et al. 2015). As a result, GA is expected to desorb and emit from a smaller region than methanol (for quantification, see the toy model described in Appendix B of Nazari et al. 2021). If our spatial resolution is not high enough to resolve the actual emitting region of GA (as suggested by the moment 0 map in Fig. A.1), there will be a beam dilution effect leading to an underestimated column density ratio of GA compared to the actual abundance ratio. Interestingly, EG has an even higher desorption temperature than GA and is expected to suffer more from beam dilution, but it is not overproduced by the G22 models. This means that either EG is underproduced by the G22 models (if beam dilution indeed accounts for the underestimation of GA in observations), or other reasons are needed to explain the observed GA depletion.

The influence of energetic processes
As mentioned in Sect. 5.2.4, Chuang et al. (2017) show that ratios among MF, GA, and EG can be altered by UV radiation. The ratios of MF with respect to GA and EG both increase by nearly two orders of magnitude when introducing UV radiation to the experiments, while the ratio between GA and EG is not much affected. The second row of Fig. 6 shows the comparison among observations, simulations, and experiments (this work; Simons et al. 2020; Chuang et al. 2017, respectively). The UV intensity is not varied during the experiments, but the discrepancy between pure hydrogenation and UV irradiation implies a positive correlation between the UV irradiation and the relative abundance of MF. However, the observed ratios of MF/GA and MF/EG are still overall higher than the values produced by simulations and experiments.
In G22, one of the models tests the influence of cosmic-ray-induced ionization and UV-induced photodissociation, which shows an obvious enhancement of the O-COM abundances. The "final" model includes these energetic processes but with a fixed efficiency, as do the simulations in Simons et al. (2020). The G22 models are able to reproduce enough MF by introducing a new set of non-diffusive reactions, but the abundances of GA and EG are not very consistent with our observations. Experiments by Öberg et al. (2009) suggest that the final product composition after irradiating CH3OH ices with UV lamps depends more on the UV fluence and temperature than on the UV flux itself. A possible explanation is that the short timescale offsets the high UV flux in high-mass sources, so that the total UV fluence is of the same order as in low-mass sources. To figure out the influence of energetic processes on COM chemistry, especially in the solid phase, more experiments and simulations with varied parameters are needed. Observations of larger samples of protostellar objects with different masses and luminosities are also needed to provide more reliable statistics of COM ratios.

O-COMs from clouds to comets
Figure 7 summarizes the observed O-COM ratios with respect to methanol in different astronomical objects. The data of protostars are represented by the average ratios of the low-mass and high-mass sources discussed in this work, and the uncertainties correspond to the standard deviation. We compare them with the literature data of the outbursting protostar V883(FU) Ori (Lee et al. 2019b) and the protoplanetary disk around Oph IRS 48 (Brunken et al. 2022). Two comets, 67P/Churyumov-Gerasimenko (67P/C-G; Drozdovskaya et al. 2019) and 46P/Wirtanen (Biver et al. 2021), are also taken into account, which reflect the pristine chemical composition in our solar system. All the sources were observed by ALMA, except that the data of 67P/C-G were collected by the Rosetta Orbiter Spectrometer for Ion and Neutral Analysis (ROSINA), and the data of 46P/Wirtanen were taken by the Institut de Radio Astronomie Millimétrique (IRAM) 30-m telescope and the NOrthern Extended Millimeter Array (NOEMA). ROSINA is a mass spectrometer, which cannot distinguish among isomers with the same mass (e.g., C2H5OH and DME both have a mass of 46 atomic mass units, amu). Considering the data availability, we chose three groups of O-COMs for comparison:
amu = 44: CH3CHO, CH2CHOH (vinyl alcohol, VA), c-C2H4O (ethylene oxide, EO)
amu = 46: C2H5OH, DME
amu = 60: MF, GA, CH3COOH (acetic acid, AA)
EO, VA, and AA are mentioned only because of the potential degeneracy in the detections toward 67P/C-G; they were not searched for or detected in other sources, except for one low-mass protostellar object (IRAS 16293-2422 B) that we considered here. The O-COM ratios of IRS 48 may be overestimated since the column density of methanol was determined from lines that are likely to be optically thick (hence upper limit signs are used).
In general, protostellar sources have lower O-COM ratios than the other sources, especially for the group of amu = 44. For the other two groups, considering that the data of IRS 48 may be overestimated, the O-COM ratios are more comparable among the samples. However, the sample here is too small to draw any robust conclusion. A larger sample of sources at different evolutionary stages is needed to study the chemical inheritance throughout star formation.
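The mass degeneracy behind the three groups used for the comparison with 67P/C-G can be checked with a short sketch. It computes nominal (integer) masses from the elemental compositions of the species named in the text and sums column densities within a group before dividing by methanol, as described in the caption of Fig. 7. The function names and the example column densities are hypothetical.

```python
from collections import defaultdict

# Nominal mass numbers of the most abundant isotopes.
MASS_NUMBER = {"C": 12, "H": 1, "O": 16}

# Elemental composition of the O-COMs named in the text.
COMPOSITION = {
    "CH3CHO": {"C": 2, "H": 4, "O": 1},    # acetaldehyde
    "CH2CHOH": {"C": 2, "H": 4, "O": 1},   # vinyl alcohol (VA)
    "c-C2H4O": {"C": 2, "H": 4, "O": 1},   # ethylene oxide (EO)
    "C2H5OH": {"C": 2, "H": 6, "O": 1},    # ethanol
    "CH3OCH3": {"C": 2, "H": 6, "O": 1},   # dimethyl ether (DME)
    "CH3OCHO": {"C": 2, "H": 4, "O": 2},   # methyl formate (MF)
    "CH2OHCHO": {"C": 2, "H": 4, "O": 2},  # glycolaldehyde (GA)
    "CH3COOH": {"C": 2, "H": 4, "O": 2},   # acetic acid (AA)
}


def nominal_mass(species):
    """Nominal mass in amu from the elemental composition."""
    return sum(MASS_NUMBER[el] * n for el, n in COMPOSITION[species].items())


def group_by_mass():
    """Map each nominal mass to the species a mass spectrometer cannot separate."""
    groups = defaultdict(list)
    for species in COMPOSITION:
        groups[nominal_mass(species)].append(species)
    return dict(groups)


def summed_ratio(column_densities, members, n_methanol):
    """Sum the column densities of the detected group members, relative to methanol."""
    total = sum(column_densities.get(species, 0.0) for species in members)
    return total / n_methanol


groups = group_by_mass()

# Hypothetical column densities (cm^-2) for one source.
example_columns = {"C2H5OH": 2.0e16, "CH3OCH3": 6.0e16}
example_ratio_46 = summed_ratio(example_columns, groups[46], n_methanol=2.0e18)
print(groups)
print(f"amu = 46 group ratio: {example_ratio_46:.3f}")
```

Species that are missing from the input dictionary are treated as contributing nothing, which mirrors the situation where VA, EO, and AA are not detected separately in most sources.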

JWST ice observations
With JWST successfully operating, it is becoming increasingly feasible to detect COMs other than methanol in ices. The absorption features for the identification of O-COM ices are mainly located in the mid-infrared between 2 and 15 µm (Boogert et al. 2015; Rocha et al. 2022). CH3CHO and C2H5OH ices tentatively account for the spectroscopic features at 7.24 and 7.41 µm in ISO (Schutte et al. 1999), Spitzer/IRS (Öberg et al. 2011), and the latest JWST/MIRI observations (Yang et al. 2022; McClure et al. 2023). In addition, the feature around 11.3 µm in the JWST spectra may also have some contributions from CH3CHO, C2H5OH, and MF (Terwisscha van Scheltinga et al. 2018, 2021). However, more detailed spectral modeling is needed to confirm the detection of these COMs. Once JWST observations with larger samples and higher sensitivities are available, we will be able to estimate the ratios of CH3CHO and C2H5OH with respect to methanol in ices, based on the absorption features at 7.2 and 7.4 µm. By comparing their ice ratios to the gas-phase ones in the same sources, we can get an idea of whether and to what extent these species participate in gas-phase chemistry after they sublimate from the ice mantles of dust grains.
Fig. 7: Column density ratios of selected O-COMs with respect to methanol in different astronomical objects (see references in Sect. 5.4). Species are divided into three groups with different masses (in amu), and those in parentheses are only shown for the degenerate detections in the comet 67P/C-G by the ROSINA mass spectrometer; they have not been detected separately in other sources. The column densities of the species with the same amu are summed. The data of IRS 48 are shown as upper limits due to the possible underestimation of the column density of methanol.

Conclusions
We analyzed the spectra of 14 high-mass protostellar objects in the CoCCoA survey. We focused on six selected O-COMs: CH3CHO, C2H5OH, DME, MF, GA, and EG, and derived their column densities and excitation temperatures for the 14 sources. We also performed various comparisons between the observed O-COM ratios with respect to methanol and the results from previous simulations and experiments. We summarize our conclusions below:
1. The column density ratios of the six selected O-COMs with respect to methanol show no clear difference between the five low-mass sources studied previously and the 14 high-mass protostellar objects observed with ALMA, suggesting that these species are mainly formed under similar conditions. Current astrochemical simulations and experiments support the possibility of early formation of COM ices on dust grains in the pre-stellar stage, before the environments in low- and high-mass star-forming regions begin to diverge. However, gas-phase formation routes may also play an important role in shaping the COM ratios, which needs more investigation to pin down.
2. DME and MF show smaller scatter in their ratios with respect to methanol than CH3CHO, C2H5OH, GA, and EG. This may hint at some chemical links among DME, MF, and methanol, such as having the same precursor (e.g., CH3O) in their formation routes.
3. The ratios among pairs of O-COMs also show the same trends in the low-mass and the high-mass groups. In particular, the ratios of DME/MF and CH3CHO/GA are quite consistently around 1, while others show larger scatter.
4. Previous experiments show that the ratios of MF/GA and MF/EG can be significantly enhanced by UV irradiation, but the observed values are even higher than the laboratory ones. The ratio of GA/EG is not affected by UV in experiments and matches well with our observations.
5. Comparison with the state-of-the-art models shows consistency for some O-COMs such as DME and MF. The differences between models and observations may result from less constrained gas-phase chemistry (CH3CHO and C2H5OH) and the different emitting areas under limited spatial resolution (GA and EG). Chemical simulations and laboratory experiments are important for testing and exploring possible explanations.
6. The comparison of O-COM ratios among sources at different evolutionary stages may probe the chemical inheritance during star formation processes. However, observations toward larger samples are needed to enable statistical analyses.
ALMA line surveys toward large samples of star-forming regions are shedding light on the origin of COMs and the chemical evolution in the early stages of star formation. We look forward to linking our results to more ALMA observations of gas-phase COMs and the upcoming JWST mid-infrared data on solid-phase COMs to probe their formation history.
Figure A.2 shows what the fitting looks like with the best fit and the upper and lower limits of T_ex, taking MF and G19.88-0.53 as an example. Figure A.3 shows the spectral fitting with two components, taking G19.01-0.03 as an example. Figure A.4 shows the full spectral fitting results of selected O-COMs for the representative source G19.88-0.53. Figures A.5-A.7 show the zoom-in fitting of each O-COM. Figure A.8 shows the ratios of the six selected O-COMs with respect to methanol, which is the same as Fig. 4 but with the intercept error in Eq. (1) included.
See Table 2 for other sources and species that are fitted by two components.