Massive pre-main-sequence stars in M17 First and second overtone CO bandhead emission and the thermal infrared

Context. Recently much progress has been made in probing the embedded stages of massive star formation, pointing to formation scenarios that are reminiscent of a scaled-up version of low-mass star formation. However, the latest stages of massive-star formation have rarely been observed, as young massive stars are assumed to reveal their photospheres only when they are fully formed. Aims. Using first and second overtone CO bandhead emission and near-to mid-infrared photometry, we aim to characterize the remnant formation disks around five unique pre-main-sequence (PMS) stars with masses 6 − 12 M ⊙ that have constrained stellar parameters thanks to their detectable photospheres. We seek to understand this emission and the disks from which it originates in the context of the evolutionary stage of the studied sources. Methods. We used an analytic disk model, and adopted local thermodynamical equilibrium, to fit the CO bandhead and the dust emission, assumed to originate in different disk regions. For the first time, we modeled the second overtone emission, which helped us to put tighter constraints on the density of the CO gas. Furthermore, we fit continuum normalized bandheads, using models for stellar and dust continuum, and show the importance of this in constraining the emission region. We also included 13 CO in our models as an additional probe of the young nature of the studied objects. Results. We find that the CO emission originates in a narrow region close to the star ( < 1 AU) and under very similar disk conditions (temperatures and densities) for the different objects. This is consistent with previous modeling of this emission in a diverse range of young stellar objects and identifies CO emission as an indicator of the presence of a gaseous inner disk reaching close to the stellar surface. From constraining the location of the inner edge of the dust emission, we find that all but one of the objects have undisrupted inner dust disks. Conclusions. We discuss these results in the context of the positions of these PMS stars in the Hertzsprung-Russel diagram and the CO emission’s association with an early age and high accretion rates in (massive) young stellar objects. We conclude, considering their mass range and the fact that their photospheres are detected, that the M17 PMS stars are observed in a relatively early formation stage. They are therefore excellent candidates for longer wavelength studies to further constrain the end stages of massive star formation.


Introduction
The impact massive stars (M ≥ 8 M ) have on their host galaxies is disproportionate to their numbers: they provide strong mechanical and radiative feedback, and deposit new elements in their surroundings through powerful stellar winds and supernova ejecta which constitute the building blocks of planets and life.Understanding the way in which these stars form is therefore important, yet challenging.The formation events are rare, take place deep inside a dust obscured natal environment and unfold on such short timescales that the stars arrive on the main sequence prior to the dispersal of their natal cloud.Notwithstanding these complicating factors, both observational (e.g.Johnston et al. 2015;Ilee et al. 2016;Cesaroni et al. 2017;Beuther et al. 2018) and theoretical (e.g.Krumholz et al. 2009;Kuiper et al. 2010;Rosen et al. 2016;Meyer et al. 2018) evidence is mounting that disk mediated accretion is crucial to the process.
Recent studies that reveal disks, large-scale disk-like structures and/or outflows, probe the embedded stages of formation (e.g.Frost et al. 2019Frost et al. , 2021a;;Maud et al. 2018;Pomohaci et al. 2017).However, much remains unclear about how massive stars arrive on the main sequence, what their stellar properties are when they do, and how and when their accretion is halted.One of the ways to study the latest stages of formation is to observe newly formed massive stars in their natal environment.By making a census of the young populations in massive star forming regions and determining their stellar and circumstellar (e.g.multiplicity, disks) characteristics, one can constrain possible formation scenarios and provide robust initial conditions for population synthesis studies (e.g.Bik et al. 2006).Through one such study in the very young (≤ 1 Myr) Galactic cluster M17, Ramírez-Tannus et al. (2017, hereafter RT17) made the unique discovery of 6 PMS stars in the mass range ∼ 6 − 12 M , that have observable photospheres and at the same time show spectroscopic and photometric features typical for more embedded massive young stellar objects (MYSOs).part of the spectrum was of sufficient quality to allow quantitative spectroscopic modeling to determine stellar properties.Combining this with available photometry they confirmed the pre-main-sequence (PMS) nature of 6 objects (∼ 6 − 12 M ) in the sample, based on their position in the Hertzsprung-Russell diagram (HRD; see Section 5.2 and Figure 6).Two more objects (B289 and B215) were found to be close to, but not on, the zero age main sequence, displaying only mid-infrared (MIR) excess and lacking the other disk signatures listed below.These objects were characterized by RT17 as young stars whose PMS nature could not be solidly confirmed.The objects that were confirmed to be PMS stars possess features which point to the presence of circumstellar disks: (1) double peaked atomic emission lines, most notably H i, Ca ii and O i; (2) CO-bandhead emission; and (3) NIR to MIR excess.Of these, especially CO bandhead emission has been associated with high accretion rates and the earlier stages of formation in young stellar objects (YSOs; see Section 5.2 and Table 9 for references).
The ro-vibrational transitions of the CO molecule that result in the CO bandhead emission, have recurrently been used to study the innermost regions of circumstellar disks in YSOs of all masses.Especially for the higher mass YSOs, where these regions can rarely be resolved, they constitute one of the few commonly used diagnostics.A few studies focus on a single object (e.g.Gravity Collaboration et al. 2020;Fedriani et al. 2020), revealing the spatial distribution of the emission with interferometry or integral field units.Other studies have gathered samples of intermediate to high mass YSOs to study the disk properties in a more statistical manner (Bik & Thi 2004;Wheelwright et al. 2010;Ilee et al. 2013Ilee et al. , 2014)).In all these cases the emission has been modeled either as a flat disk or a ring of material in Keplerian rotation, with the general result that the emission likely originates from hot (2000 − 5000 K) and dense (N CO ∼ 10 20 − 10 22 cm −2 ) disk regions inside of the dust sublimation radius.This paper's focus is to constrain the properties of the circumstellar material for the five objects among the RT17 sample of PMS stars in M17 that show CO bandhead emission.As these objects also reveal NIR to MIR excess, we include this thermal emission in our analysis.Based on these features we aim to determine properties of the inner gaseous disk (< 1 AU) where the CO-emission likely originates, and the somewhat further regions (≥ 3 − 5 AU) where the bulk of the thermal NIR and MIR dust emission comes from.Our sample consists of relatively high mass (6 − 12 M ) PMS objects, that additionally have well constrained stellar properties.This allows us to address the question: can the disk properties be linked to stellar properties, such as mass and evolutionary stage?
Though we take a similar modeling approach to Bik & Thi (2004), Wheelwright et al. (2010), Ilee et al. (2013) and Ilee et al. (2014), we refine it in several ways (see section 3.4 for details).First, for the first time, we take into account the observed 2 nd overtone emission (∆v = 3, from 1.54 µm), as opposed to only the 1 st overtone (∆v = 2, from 2.29 µm).All the included (1 st and 2 nd overtone) bandheads are fit together, rather than individually.Second, by studying the dust emission with the available photometry we obtain robust continuum estimates, which we use to normalize the modeled bandheads.Third, we take into account 13 CO, which increases the quality of our fits and allows us to probe the isotopologue ratio 13 CO/ 12 CO as a further test of the young nature of our sample.
The paper is organized as follows.In Section 2 we present the data and the previous analysis performed by RT17.The disk model and the treatment of gas and dust are explained in Sec-tion 3. In this section we also explore the sensitivity of the predicted line spectrum to model parameters.In Section 4 we describe our fitting approach and the resulting disk parameters.Section 5 places these results in the context of previous literature and of the evolutionary state of the M17 PMS stars.Section 6 contains a brief summary and our main findings.

Data and previous analysis
The data used in this work consist of near-to mid-infrared photometry (1 − 11 µm) and the NIR part (1 − 2.4 µm) of spectra obtained with VLT/X-shooter.

Spectra
The X-shooter spectra (Vernet et al. 2011) of the 5 young stars studied in this paper were first analyzed by RT17, who derived stellar parameters from quantitative spectroscopy and optical (λ 1 µm) spectral energy distribution (SED) fitting.For a detailed listing of the spectroscopic observations used in this study we refer to that paper.Only for B275 we used a different spectrum, observed on 2019-06-06 for Program 0103.D-0099 (P.I.Ramírez-Tannus).In Table 1 we summarize the results from RT17 that we use directly in our analysis.
For the fits of the CO bandheads we only used the NIR part (1-2.4 µm, with spectral resolution R = 13000) of the spectra, which contain both the 1 st and the 2 nd overtone emission.We (re-)reduced these spectra using version 3.3.5 of the X-shooter pipeline (Modigliani et al. 2010), running under the ESO Reflex environment (Freudling et al. 2013) version 2.11.0.The standard settings of the pipeline resulted in flux-and wavelengthcalibration issues at the location of the 1 st overtone bandheads.At wavelengths beyond ∼ 2.29 µm, the flux showed wave-like fluctuations for some objects, hindering normalization.Additionally, in this range, wavelengths were shifted by ∼ 80 km s −1 .In our attempt to solve these problems, while at the same time obtaining an optimal telluric correction, we first reduced the objects without flux calibration and then used the molecfit tool (? Kausch et al. 2015), version 1.5.9, for a first telluric correction.Importantly, molecfit also provided a correction of the faulty wavelength calibration.We then divided the object spectra by their respective telluric standard stars that were reduced and molecfit corrected in the same way.This way both the instrument response function and the bulk of the residuals of the molecfit telluric correction are divided out.The continuum of the resulting spectra is the ratio of the continua of the science object and the telluric standard star.From this, it is in principle possible to obtain a flux calibrated spectrum by multiplying by the SED of the telluric standard star and applying some correction for slit losses.However, due to uncertainties involved in that procedure, we deemed a more reliable and consistent estimate of the continuum could be obtained from fitting SEDs to the photometric data points (Section 2.2).Therefore, we chose to normalize the spectra and compare the observations to models that are normalized to continuum estimates obtained from fitting SEDs, as described in Section 3.2.
Finally, we correct for the hydrogen absorption features of the telluric standard stars in the wavelength range 1.55 -1.8 µm where they interfere with the CO 2 nd overtone bandhead line fluxes.We do not correct for the stellar and disk (emission) fea-Name Sp.Type  tures of the science spectra in the same wavelength range, but exclude the most contaminated regions from our fits. 1

Photometry
The photometric points for our objects were taken from online catalogs and literature.We used photometry from the DENIS (Fouqué et al. 2000), 2MASS (Skrutskie et al. 2006), Spitzer GLIMPSE (Reach et al. 2005) and WISE (Wright et al. 2010;Jarrett et al. 2011) point source catalogs.For B275 and B331 we could add the extra points around 10 µm available from Nielbock et al. (2001) and Kassis et al. (2002).Though a few points beyond these wavelengths were available from the same papers, we did not include them in the final fits because they likely contain envelope emission, which was not considered in our models.All adopted photometric data per object are listed in Appendix A.

Keplerian disk model
The disk model that was used to fit the CO bandheads and continuum constitutes a flat disk with a Keplerian rotation profile, containing dust and/or CO gas. 2 The disk has a radial temperature and surface density profile for the dust and the gas, described by analytical power laws as follows: where T is the temperature in K, N H the hydrogen (H 2 ) surface density in cm −2 , and T i , (N H ) i their initial values at the inner radius R i .The hydrogen surface density is converted to a 12 CO gas density using the canonical CO/H 2 abundance of 10 −4 (Lacy et al. 1994) and to a dust particle density using a gas to dust mass ratio of 100 (Bohlin et al. 1978).For including 13 CO in the models, several abundances were tested (see   section 3.4).For the final results a standard interstellar abundance ratio N( 12 CO)/N( 13 CO) of 89 was used, taken from the HITRAN database 3 , which reproduces the isotopologue lines well.To avoid confusion, all densities are reported as hydrogen column density N H .

Independent treatment of gas and dust
In all models that were used for fitting, the gas and dust properties are treated independently of each other, i.e. each have their own values.The reason for this approach stems mainly from the fact that the CO emission originates from hot gas well within the dust sublimation radius, with temperatures well above the assumed dust sublimation temperature of 1500 K.We did test a fitting approach in which dust and gas were combined and coupled with a source function S tot λ as follows: 3 https://hitran.org/docs/iso-meta/with the total optical depth τ tot λ = τ CO λ + τ dust λ the sum of the CO line and dust optical depths, the CO and dust source functions given by Planck curves according to the same temperature (so in this case S CO λ = S dust λ ) and where all quantities are a function of wavelength.This revealed that the CO emission and the dust emission responsible for the NIR-MIR excess are representative of different regions in the disk, with little or negligible overlap.Where the two do overlap the main effect of importance is the optical thickness of the dust 'damping' the CO emission.However, since dust only exists at lower temperatures, where the CO emission is already significantly reduced, this effect is minimal.
Of course, even though the CO line and dust continuum emission likely do not originate from the same region, the dust continuum still has a significant influence on the strength of the observed bandheads: for the same absolute CO emission, a stronger continuum leads to a weaker bandhead signal relative to the continuum.Therefore, it remains important to determine and model the continuum.We opted to fit the SEDs and the CO bandheads separately with a dust-only and gas-dust disk respectively; where for the latter the dust parameters were set by the best fits obtained from the SEDs.For the fitting of the SEDs not only dust continuum, but also stellar continuum was taken into account.
We now elaborate on the specifics of the respective modeling and fitting approaches for the continuum and CO bandhead emission.

Continuum treatment
The continuum flux has a stellar component and one that arises from thermal emission of dust near the star.The dust contribution is computed using the described analytic disk model including only dust.For the dust 0.1 µm astronomical silicate was used with opacities from Laor & Draine (1993).The stellar and extinction parameters as determined previously by RT17 (T eff , log g and A V ; Table 1) are input in our calculations using the corresponding Kurucz models (Kurucz 1993;Castelli & Kurucz 2004) and reddening (Cardelli et al. 1989) the resulting SED after adding the disk.The near-and mid-infrared photometry points are then fit by means of a grid of disk models -stellar parameters remain fixed.
The outer radius of all the dust disk models was fixed to 500 AU, regardless of the temperature at this point.We do not expect to constrain an outer radius of the entire disk, because the photometric points available for our objects are only representative of the warm to hot dust (T 150 K) and emission from cooler dust does not change the models in the fitted wavelength ranges (1-12 µm).
For the inner radius R i we used a radiative equilibrium approach (following Lamers & Cassinelli 1999) to calculate the radius at which a certain dust temperature is reached as a function of the previously determined T eff and R (Table 1).This R i is therefore dependent on T i for each object.This approach results in five free parameters: T i , p, (N H ) i , q from Equations (1) and (2) and the inclination i.The grid of parameter values used for fitting is given in Table 2.Only for B331 a good fit could not be obtained with R i as function of T i and stellar parameters.We therefore fitted the SED of this object using a separate grid in which also R i is a free parameter and whose values are given in Table 3.

Combining dust and gas
In the models used to compute the normalized CO bandheads, the dust and CO gas are both included, but do not share the same temperatures and densities, as motivated above.Instead, the dust parameters were fixed by the parameters obtained from the SED fits, and the CO gas parameters were varied over in a grid of models.Only the inclination was shared between the dust and gas disk.
The outer radius of the gas disk model is taken to be the point where the temperature drops to 600 K. Below this temperature the gas does not substantially contribute to the bandhead emission anymore, because the excited vibrational states are not sufficiently populated.The outer radius for the dust disk is again 500 AU, as before (see Section 3.2).
Because the CO gas excitation temperatures are typically much higher than the dust temperatures and because the inner radii for the gas-disk are so close to the star (see Table 4) the gas and the dust do not necessarily overlap.In fact, they overlap only in the models where the CO gas temperature drops to 600 K beyond the dust sublimation radius (calculated as described in Section 3.2).In these cases the source function in the overlap region was calculated according to Equation (3), where the CO and dust source functions are given by Planck curves according to their respective temperatures.In all regions and cases where there is no overlap, i.e.only gas and no dust or only dust and no gas, the gas and dust emission are calculated separately and added.For the CO modeling the main significance of the dust disk is its contribution to the continuum, to which the bandheads are normalized.

The gas disk and parameter grid
The vibrational-rotational level populations of CO are assumed to be in LTE, with Einstein A coefficients and level transitions taken from Li et al. (2015).The line profiles of the rotational transitions are assumed to be Gaussian with a width v G .This parameter accounts for intrinsic width of the lines as well as thermal and micro-turbulent velocities.The intensities are calculated per ring at radius r, with a dependence on the ring temperature and density.They are then convolved with a Keplerian velocity profile per ring and corrected for inclination, before integrating over the radial extent of the disk.Finally, the model spectra are convolved with the instrument resolution of R = 11300 and normalized to the continuum fits presented in Section 4.1, adjusted only to match the inclination of the CO disk. 4s the masses of the stars are previously determined (Table 1; needed for Keplerian velocities), we have 7 free parameters for our model: T i , p, (N H ) i , q, R i from Equations ( 1) and (2), the inclination i and the Gaussian width v G .To fit the data we calculated a grid of models with parameter ranges given in Table 4.The parameter values for this final grid are not all regularly spaced and are based both on literature and on several other exploratory grids that were calculated to investigate the relevant parameter space for our objects.The inner gaseous disk radius R i is given in units of stellar radii (Table 1, RT17).Both the initial hydrogen column density (N H ) i and the R i values are evenly spaced on a log scale.The positive value (+1) for the density ex- ponent q was included, because of evidence for an inverse density profile of the innermost regions of the disk (Antonellini et al. 2020).

Correcting for pseudo-continuum
When calculating the final CO bandhead spectrum, each subsequent bandhead flux is added to the P-branch (∆J = −1) and higher R-branch (∆J = +1) rotational transitions of the previous one.This, when convolved with the instrument resolution, creates a kind of secondary continuum and influences the overall appearance of the bandhead spectrum.This is true for both overtones (see also Figure 1).However, when normalizing the observed spectra in the 2 nd overtone wavelength regions it is impossible to recognize such effects, especially because the overlapping line wings of the hydrogen Brackett series also influences the continuum there.To account for this pseudocontinuum effect, we introduced a correction of the bandhead continuum of the modeled 2 nd overtone bandheads before fitting.We did this by fitting a 3 rd degree spline through the minima just before the bandhead-onset wavelengths and dividing the modeled bandheads (already normalized to star and dust continuum) by this 'bandhead continuum'.The normalization of the data in the 1 st overtone region is much less problematic and the models there did not require similar corrections.

Model exploration
Several examples of CO bandhead fitting using analytical disk models are available in literature (e.g.Kraus et al. 2000;Bik & Thi 2004;Ilee et al. 2013).Similar to our approach, these works adopt an LTE treatment of the gas, Keplerian rotation and a 1D thin disk approximation.Here we list changes and improvements in our approach relative to these earlier efforts.First, we introduce an internally consistent treatment of the continuum and normalization.In previous work, the bandheads are usually continuum subtracted and normalized to the first bandhead.Instead, we estimate the continuum from SED fitting and use this to normalize our models.With this approach the information contained in the strength of the bandheads is retained.
Second, we fit the first 5 bandheads of the first vibrational overtone (v = 2-0, 3-1, 4-2, 5-3, 6-4) at the same time.We also predict the 6 th bandhead (v = 7-5), but because it is at the very edge of the observed spectral range, the signal to noise ratio is insufficient to include it in the fits.
Fourth, we include 13 CO.These lines are weak and their diagnostic value in constraining disk properties is limited.How-ever, including these lines allows to constrain the isotopologue ratio 13 CO/ 12 CO.
Finally, we probe higher temperatures than the commonly assumed dissociation temperature of 5000 K (Bosman et al. 2019); grid values run up to 8000 K.Such high temperatures should be taken with care as they may not be physical, since CO dissociation is not included in the models.Allowing for higher temperatures may mimic limitations in the LTE assumption, which can overestimate the temperature if non-collisional excitation mechanisms, such as UV pumping, contribute to setting the state of the gas.The maximum temperatures retrieved from our fits can then serve as an additional probe of the validity of the LTE assumption.
To facilitate a comparison with previous results of similar fitting efforts in literature we identify where and how these novel aspects in our approach might impact the parameter estimation.
To do this we explore the effects of the free parameters on the appearance of the 1 st and 2 nd overtone bandhead emission using Figure 1.This figure was made by over-plotting several models, sequentially varying only one parameter and keeping all others fixed at a value fitting to the B275 bandheads.The fitted model is plotted in black in each panel to give an impression of where the data for this object are.In the very last panel we show the effects of including 13 CO in varying abundances; we discuss the inclusion of 13 CO at the very end of this section.
The amount of emission in both the 1 st and 2 nd overtone bandheads is most sensitive to the temperature and the extent of the (hot) gaseous disk, as can been seen on the T i , p and R i plots in Figure 1.These parameters have very similar effects on the amount of flux of the bandheads.Since the temperature always drops going outward, a small initial radius will lead to a small area of the hottest gas, i.e. less flux.This can be compensated for by a shallower temperature drop (smaller |p|) resulting in a larger emitting surface.Note the strong emission in the T = 6000 K model, which is stronger than the commonly observed strength of these features.Again, CO dissociation is not included in our models.
Since the disk is assumed to be Keplerian, the velocities are set by the mass of the central star (kept constant for each object), the inner radius R i and the inclination i5 .These parameters, therefore, determine the onset wavelengths of the bandheads and influence their shape.High (more edge on) inclinations and smaller inner radii broaden the bandhead profile and shift the onset wavelength blue-ward.Both these parameters, however, also influence the total flux.
For the relative strength of the 1 st and 2 nd overtone bandheads optical depth effects play an important role.The lines of the 2 nd overtone are intrinsically weaker and therefore remain optically thin for higher column densities.For this reason line broadening, i.e. higher v G , leads to higher flux only for the 1 st overtone bandheads and not for those of the 2 nd overtone.The inclination has slightly less effect on the 2 nd overtone flux than on the 1 st overtone flux: the radiating surface of an optically thick inclined disk is reduced, but the 2 nd overtone remains optically thin for longer sight-lines through the disk.
Finally, increasing the initial column density (N H ) i , or decreasing |q|, beyond certain grid values has a larger impact on the 2 nd overtone flux, because these lines remain optically thin within the grid values.These changes to the 2 nd overtone flux are more detectable than the changes to the 1 st overtone within crucial ranges of (N H ) i .Therefore, including the 2 nd overtones in the fits especially helps towards constraining the column density.
We now discuss the effects of each of the four modeling improvements.

Continuum normalization
It is important to keep in mind that the stellar parameters and thermal dust continuum that are kept fixed for each star in the fitting process, do strongly influence the appearance of the bandheads -and therefore the fit results.A thorough discussion of these effects is given in Appendix B.Here we mention the most important effect: a higher continuum originating from the star and dust disk will lead to weaker bandheads for the same gas disk model.The most important constraint from including the bandhead strength relative to the continuum is on the ranges of values for p and v G .Shallow temperature gradients and high intrinsic line velocities lead to very high bandhead fluxes beyond observed ranges, that cannot be compensated for by other parameters.Thus, we limited the probed ranges in our grid accordingly (Table 4).

Fitting multiple 1 st overtone bandheads
The effect described earlier (in Section 3.3) that the bandhead series, if strong enough, creates a secondary continuum, can be clearly seen on Figure 1 for any parameter that influences the strength of the bandheads.The later bandheads (i.e.those with higher vibrational quantum numbers) are also more sensitive to temperature changes, likely due to excitation effects.These effects generally lead to better temperature constraints when fitting more bandheads together.

Including the 2 nd overtone bandheads
Following from our previous observations, the most important determinant of the detectability and (relative) strength of 2 nd overtone bandheads is the column density (profile) of the disk.We tested this by comparing fit results with or without including 2 nd overtones.The exponent q is poorly constrained in general and even its selective effect on the 2 nd overtone did not change fit results significantly.However, including the 2 nd overtone did yield significant changes for the initial column density (N H ) i .

Including 13 CO
CO bandhead emission is a signature not unique to YSOs -it is evidence for the presence of a (hot and dense gaseous) disk, but as such it is also observed to originate in the (decretion) disks of evolved B[e] stars (Kraus et al. 2000).In these stars the stellar surface layers can be enriched in 13 C, due to the evolution of the (massive) star and chemical mixing processes.Stellar winds then cause this anomalous isotopologue ratio to be carried into the circumstellar environment, where consequently the N( 12 CO)/N( 13 CO) abundance ratio drops and 13 CO bandhead signals are enhanced.Kraus et al. (2020) show that therefore, including 13 CO emission in the models, can provide a test to distinguish the evolved objects from the younger ones.
The objects in our sample are expected to be YSOs due to their young cluster environment.To assess the sensitivity of our results to the 13 CO feature, we computed models for gradually increasing 13 CO abundances.In the last panel of Figure 1 we show the model with only 12 CO and with ratios of 12 CO/ 13 CO down to 4 (the ∼ 22-fold enhancement of the abundance found by Kraus et al. (2020) in a B[e] supergiant).Even for the standard (interstellar) ratio 89, 13 CO leaves a detectable signal in the 1 st overtone, which becomes very prominent for the highest abundances.In Figure 2 we show the observed 1 st overtone bandheads of B163, the object for which the 13 CO feature is most pronounced, together with its best fit model and models with varying 13 CO abundance.When included with the standard abundance, the 13 CO feature clearly improves the fit; but increas-  (a) The inner radius R i is reported in square brackets next to the initial temperature T i as the distance from the star at which this temperature is reached. (b) Only for B331 R i is a free parameter varied over in a grid.Notes.The inner dusty disk is defined as the ring of dust where 99% of the modelled 1-12 µm dust emission comes from.Where not provided, units are the same as in table 5.
a The mass is the total gas and dust mass of the ring, under the assumption of a gas to dust mass ratio of 100.
Table 6: Properties of the inner dusty disk.
ing the abundance leads to progressively poorer results, limiting a possible enhancement of the abundance to ∼ 30% in this case.
We also find that including 13 CO does not change the other disk parameters that are fitted.These findings are similar in the other objects; save for B331, where the hydrogen Pfund line series interferes with the 13 CO bandhead signal and we cannot assess the impact of the isotopologue.We conclude that, although the inclusion of 13 CO is of limited diagnostic value to constrain disk parameters, its detected features, consistent with an interstellar abundance, do provide a confirmation of the young nature of our objects.Therefore, when presenting and plotting our results we use the models including 13 CO, except for B331.

Continuum fits
The aim of the SED fits is, first, to determine a reliable continuum for normalizing the modeled CO bandheads, and second, to obtain constraints on the inner parts of the dust disk.Figure 3 visualizes the best fitting model SEDs for the five objects in our study, overplotted with the available photometric points.The case of B331 is different and we discuss it separately.
For the objects B163, B243, B268 and B275 the available photometry could be fitted well with optically thin dust emission.One of the consequences of this is that the inclinations remain unconstrained.We used a reduced χ 2 statistic, in addition to an inspection by eye, to determine the range over which a reasonable fit can be obtained for each parameter.These ranges, along with the best fits are reported in Table 5.The tempera-ture T i and column density (N H ) i at the inner rim are reasonably well constrained, with T i ≈ 1500 K and (N H ) i ≈ 10 21 cm −2 for all four objects.The temperature and density exponents p and q are degenerate: a less extended emission region (higher |q|) can be compensated for with a slower temperature decline (lower |p|).The derived values of the exponents are different than for the gaseous disk (Section 4.2), with the values for p generally shallower and the values for q steeper than those obtained by fitting the CO bandheads.Apart from degeneracies, this may also reflect the difference between the disk regions probed, i.e. the gaseous part close to the star (see section 4.2) versus the inner parts of the dust disk.
With respect to the best fit model, B163 has an excess in the J (∼ 1.2 µm) and the H (∼ 1.7 µm) bands.This could be because the disk is not actually optically thin (see also Section 5.4) or due to refractory dust grains with sublimation temperatures in excess of 1500 K.The normalization of the 2 nd overtone bandheads is only slightly affected; maximally by a factor of 1.3.
The SED of B331 could not be fit with the grid used for the other objects.It is the only object which does not have a NIR dust excess within the X-shooter spectral range (the excess begins at λ ∼ 3.6 µm), signifying that the star is surrounded by cooler dust.Here too, the parameter that is best constrained is the dust temperature at the inner rim.However, when using the radiative equilibrium approach to calculate R i the cool dust disk would start as far as ∼ 140 AU, leading to poor fits.We opt to fit B331 with R i as a free parameter and arrive at a dust disk with R i ≈ 30 AU.For this object the modeled emission is optically thick, hence the best fit column density should be considered a lower limit.The inclination is again poorly constrained, this time because it is degenerate with the density decline |q|, i.e. extent of the disk.
We use these best fits and their parameters when modeling the dust for the CO bandhead models and for normalizing those models, as described in Section 3.3.It is more challenging to use the results to truly constrain the parameters of the inner dust disk.Ultimately, because the data are simply too scarce to determine all the parameters, we find it more realistic to describe the results in terms of an amount (in the optically thin case) or surface (in the optically thick case) of dust in a ring around the star at a certain temperature.Because our photometry points only reach up to 12 µm, they are only representative of dust emission from a limited part of the dust disk, i.e. the hotter, inner parts.To understand and compare the characteristics of these parts we define the inner dust disk as the part of the disk where 99% of the dust emission between 1-12 µm comes from.Using the best fitting model we can then provide an inner and outer radius and a surface area for this emission region, and determine the temperatures and column densities at those radii.For B331 we calculate the inclined disk surface, because of the emission being optically thick.Integrating the column density between inner and outer radii provides a total mass for the emission region.The results are reported in Table 6.The inner and outer rim column densities as well as the masses of the defined emission region are very similar for the first three objects, even though their outer temperatures (T out ) and their areas are rather different.In combination with identical inner rim temperatures (T i ), it follows that the best fitting models are not sensitive to the size of the emission region and its temperature structure, and that it is thus the hottest dust that dominates the emission.B275 clearly has more dust emission, which is reflected in a higher disk mass.
The dust emission from B331 is dominated by longer wavelength emission from cooler dust.The NIR emission seen in the other objects is absent (Figure 3).The fact that the inner rim temperature is so low suggests a dust-free inner cavity in the disk.A higher density and a slower density decline lead to a much higher disk mass than for the other objects.The caveat here is, however, that it is possible that the three longest wavelength points are contaminated with envelope emission.The even longer 20 µm and 37 µm points from Lim et al. (2020) could not be fit with our disk model, most likely due to the presence of envelope emission which we did not include in our models.

CO bandhead fits
The best fits resulting from the grid of models described in Section 3.3, are plotted for each object in Figure 4, together with the data and the fit regions.For B331 we fitted only two of the 1 st overtone bandheads, due to strong emission in the hydrogen Pfund line series.All other spectra are fitted including Notes.The parameters p, q and i were fixed to their maximum probability values.For the other parameters either the mean or maximum probability (max prob) value are quoted, along with the best fitting (bf) model parameters.
a B331 was fitted without including 13 CO.Notes.The CO emission region is defined as the part of the disk (bounded by R i and R out ) where the best fitting model CO emission originates.All radii are given in AU and column densities in cm −2 .
a The inner radius of the dust disk.
b Column density at R i .
c Extrapolated column density at R out .
d The mass M tot is the total gas mass of the CO emitting region, based on the assumed CO abundance CO/H 2 = 10 −4 .
Table 8: Properties of the CO emission region.
five 1 st overtone bandheads.The 2 nd overtone bandheads suffer from poor signal to noise due to a combination of hydrogen and other emission features and telluric contamination.We fitted only those bandheads that were more or less free from hydrogen emission.Even if they were not clearly detected, at least some 2 nd overtone bandhead regions were always included in each fit, as their absence also provides constraints.
Because of the degeneracy between different parameters and limits to the data quality and spectral resolution, it was nontrivial understand and determine the accuracy of fits to the data.The most straightforward approach of quoting the minimal χ 2 best fit values was insufficient as 'next best' fits sometimes produced very different parameter values.In addition, especially the power law exponents p and q were poorly constrained within the probed ranges.To get better insight into the parameter space and to determine errors, we used marginalized likelihood distributions for each parameter, as described in detail in Appendix C. One caveat with this approach is that, in general, the errors and mean depend on the parameter space chosen.In this case, the parameter space is large and limited to physically reasonable values, so that we can assume that likelihoods for parameter values outside the grid are (close to) zero.To better understand the temperature, density and velocity information, we fixed the p, q and v G parameters at their maximum likelihood values after fitting with the entire grid.We determined the best fits for the remain-ing parameters (T i , (N H ) i , R i , i) from a 'reduced' grid with the p, q and v G parameters fixed.
The results are summarized in Table 7.For T i , (N H ) i , R i and i we determined both the most probable and the mean values, with their respective uncertainties, based on the marginalized probability distributions.As explained in Appendix C, the best fit, the most probable and the mean values can be rather different, but for almost all parameters the best fit values fall close to and within the error bars of either the mean or the most probable value or both -we list the one that is closest to the best fit value.In the table the most probable value is indicated as 'max prob', the mean value as 'mean' and the best fit as 'bf'.Only for the inner radius R i of B243 does the best fit value in the reduced grid lie outside the error bars of the quoted mean.This is the object with the weakest bandheads and the poorest signal to noise.
In order to illustrate and compare these results in Table 7 a bit better we made a similar emission region analysis as for the inner dust disk (Table 6).In Table 8 we show the inner and outer radii of the part of the disk where, according to the best fitting model, the CO emission originates.By construction the outer radius R out is given by the point at which the gas temperature drops to 600 K.We also show where the dust disk begins (R dust ) and provide the column densities at R i and R dust .Finally, we provide the total gas mass of the CO emitting region (since there is no dust there), based on the assumed CO abundance.

Summary of the most important aspects of the results
Before we discuss our results we briefly bring together the different aspects that can help form a picture of the studied disks.Firstly, the SEDs are fit with a thin dusty disk, but the parameters remain poorly constrained apart from the temperature at the inner rim.Except for B331 all inner rim temperatures are at or very close to the assumed dust sublimation temperature, suggesting that these inner dust disks are undisrupted.B331 is an exception with cooler dust at larger radii, pointing to a perturbed inner dust disk (see Section 5.2 for further discussion).The SED-derived densities we will leave aside for reasons explained in Section 5.4.
The distribution of best-fit inclinations is consistent with random orientations.The Gaussian line widths are in the order of the thermal velocities at the CO temperatures, suggesting that turbulent or other stochastic motions are not significant in the emitting regions.
Though the column densities are similar for all objects, we see that they are slightly lower (by a factor of ∼ 3 − 8) where the 2 nd overtone is not observed (see objects B243, B331 in Figure 4).This is consistent with the 2 nd overtone being observed only when column densities are high enough.The fact that the emission originates from a narrow ring means that the density change over larger parts of the disk and hence q will be poorly constrained (also apparent on Figure 1).Nonetheless, the most likely value for q was found to be identical for all objects.The last column in Table 8 shows that the total masses of the CO emitting region are surprisingly similar.This largely relates to, but is not fully explained by, the similarity in the derived densities.
In general we note that, given the accessible parameter space, the overall similarity in the emitting regions is striking.To illustrate this we show in Figure 5 the range of temperatures and column densities that are in principle consistent with the normalized strength of the observed bandheads.Especially the range in column densities spans orders of magnitude, while the fit results are similar within one order of magnitude.We return to this observation in the discussion.

Comparison with previous CO bandhead modeling results
We first compare the inner gas disk parameters determined from the CO bandhead emission with those resulting from similar modeling efforts in literature -particularly those that pertain to YSOs in the higher mass ranges.There are mainly two approaches when modeling CO bandhead emission towards YSOs.The first one assumes a thin disk model with free parameters similar to ours (e.g.Kraus et al. 2000;Wheelwright et al. 2010;Ilee et al. 2013Ilee et al. , 2014)).The second approach differs from the first mainly in that it does not fit a temperature or density structure, but models an isothermal ring at one density and v sin i, where the velocity can be converted to a distance assuming an inclination and Keplerian rotation (e.g.Bik & Thi 2004;Koutoulaki et al. 2019;Fedriani et al. 2020;Gravity Collaboration et al. 2020).Both approaches have been used to successfully fit 1 st overtone emission.We do note that all previous studies normalize the bandheads to the peak of the first bandhead (except Kraus et al. 2000;Wheelwright et al. 2010), thereby ignoring the strength of the emission.Furthermore, in our study, for the first time, the 2 nd overtone CO bandhead emission is reported and fitted, leading to better constraints on the column density.With these differences in approach in mind, we summarize the similarities and differences in the results.
Our derived CO column densities are all of order N CO ∼ 10 21 cm s −2 , which is in excellent agreement with literature values that are mostly within the range N CO ∼ 10 20 −10 22 cm s −2 .There is, however, often a larger spread in column densities when a sample of objects is studied (e.g.Ilee et al. 2013Ilee et al. , 2014)).While the morphology and strength of our bandhead profiles is equally diverse as in these studies, the determined densities within our sample are remarkably similar.
The spread in temperature values is similar to those found in previous studies.All inner radii in our sample are well within 1 AU, which is on the lower end of values derived for this parameter by others.Ilee et al. (2013Ilee et al. ( , 2014)), for instance, find in-ner radii up to 6 AU, but mostly around 1 − 2 AU.They also generally find rather shallow temperature exponents (p ∼ −0.6 on average), leading to (very) large outer radii (determined by where the temperature drops below 1000 K) ranging from a few to hundreds of AU.In our study, the strengths of the bandheads are very sensitive to R i and p (Figure 1), leading to (mostly) steep temperature exponents that mimic the effect of modeling a ring of emission close to the star.This justifies the assumptions underlying ring models.In the only study to date where the CO bandhead emission of a MYSO 6 is spatially resolved, the authors indeed find, by fitting the visibilities for each bandhead separately, that the CO emitting region must be relatively small ∆R/R ≤ 20%, where R = 0.58 +0.04 −0.04 AU is the ring radius (Gravity Collaboration et al. 2020).This radial location is in good agreement with our findings and the ring thickness is comparable to and even slightly narrower than our 'rings' of emission (Table 8).
Similarly to other studies confronting the disk model to a sample of systems (Ilee et al. 2013(Ilee et al. , 2014)), the density exponent q remains poorly constrained.This is not surprising since the emission region is narrow.This parameter is therefore also of little consequence to other results or statements about the disk (e.g. the total mass).
For the inclinations we find a spread in values, consistent with random orientations.Ilee et al. (2014) find a preference for higher (closer to edge-on) inclinations and suggest that this may be due to a geometric selection effect.This is interesting in the context of detection rates of CO bandhead emission (see Section 5.2).Within our small sample, we do not find evidence for such a selection effect, in agreement with the larger sample of Ilee et al. (2013).In fact, we predict (see Figure 1) that higher inclinations make the (optically thick) CO emission rather less detectable, due to a decrease in effective emitting surface.Caratti o Garatti et al. ( 2017) and Fedriani et al. (2020) report CO emission towards two deeply embedded MYSOs to be the reflection of light from the inner gaseous disk onto the perpendicular outflow cavity wall.In these cases the disk is almost fully edge on and the central source with its inner gas disk is too enshrouded for the CO emission to be detected directly.However, because the outflow cavity 'sees' the inner disk face-on, the observed CO emission is rather fitted with low velocities (or a face-on orientation).It is highly unlikely that the CO emission towards our objects is indirect considering that all of them have detected photospheres.Thus, we do not expect that our inclinations are 'artificially' low.But these observations can potentially help explain why detection rates of CO bandhead emission remain low overall: when looking at (embedded) MYSOs at high(er) inclinations one might not observe the innermost disk regions.
Finally, the intrinsic line width v G (also denoted as ∆v in literature) is often taken to be a measure of turbulent motions of the gas.In our case, the main constraint on this parameter comes from line fluxes of the optically thick 1 st overtone, that become very high for higher widths (> 5 km s −1 ) that exceed thermal velocities of the gas at the given temperatures (Section 3.4 and Figure 1).Studies that fit the bandheads normalized to the first bandhead peak lack the sensitivity to this effect of intrinsic line widths.Such studies commonly find higher values up to 10 − 20 km s −1 .
In summary, our results point to very similar conditions for the origin of the CO emission as previous studies and, if anything, place even stronger constraints.Despite the fact that our objects represent different masses and (likely) different evolutionary stages (see Section 5.3) than (M)YSOs investigated in 6 NGC2024 IRS2 the studies mentioned so far, it seems that the emission originates in disks that at least locally look very much alike.Even towards low-mass YSOs very similar conditions are derived, where the CO bandhead emission is known to signal a temperature inversion, i.e. a heated disk atmosphere by stellar irradiation, winds or magnetospheric accretion heating (e.g.Najita et al. 2007, see also Section 5.2).Thus, it is plausible to assume that in many if not all YSOs the emission originates in disk regions with similar conditions, if not (locally) similar disks.5.2.CO bandhead detection rates and the link with accretion and age CO 1 st overtone bandhead emission (in this section referred to as CO emission) has been observed in YSOs over a large mass range, as well as in Be and B[e] stars, which can be YSOs, but also objects of a more evolved nature such as B[e] supergiants (Lamers et al. 1998;Kraus et al. 2020) or classical Be stars (Cochetti et al. 2021).Here we focus on findings for YSOs.
We summarize the detection rates among YSOs of different masses and formation stages in Table 9.Detection rates, especially among low mass YSOs, appear to be higher for earlier formation stages.In general, CO emission towards lower luminosity YSOs is associated with properties that point to the earliest stages of formation and/or high accretion rates, i.e. deep embedding, energetic outflows, veiling and/or measured high accretion rates (Najita et al. 2007).In the intermediate to high mass YSOs detection rates are consistent with this trend.The lower rates among Herbig Ae/Be stars can perhaps be understood in terms of these objects representing a later stage in formation than most MYSOs, yet earlier than T Tauri's (see also Section 5.3).
Apart from high occurrence rates in objects undergoing an accretion burst (e.g.Contreras Peña et al. 2017), the link with accretion is supported by correlations of CO emission (strength) with other features that signal accretion, such as high veiling (Connelley & Greene 2010) and Br-γ luminosity (Ilee et al. 2014;Pomohaci et al. 2017). Caratti o Garatti et al. (2017) report on an accretion burst in a MYSO of ∼ 20 M where the CO emission is detected only during the burst.Despite all this evidence towards a link with accretion, actual reported accretion rates span a large range of values: from ∼ 10 −8 M yr −1 in the lowest mass objects (Koutoulaki et al. 2019) to ∼ 5 × 10 −3 M yr −1 in the highest mass objects (Caratti o Garatti et al. 2017), with Herbig Ae/Be stars in between with ∼ 10 −7 − 10 −6 M yr −1 (Ilee et al. 2014;Contreras Peña et al. 2017).On the other hand, a theoretical study by Ilee et al. (2018b) predicts CO emission to be most prominently observable with moderately high accretion rates of ∼ 10 −4 − 10 −5 M yr −1 .
Finally, it should be noted that CO emission detection rates remain relatively low over all mass ranges, also in accreting objects.In MYSOs it is definitely less common than other emission features, such as Br-γ emission at 2.16 µm, also in the K-band, and which is detected towards 70-90 % of objects in the quoted studies.Moreover, CO emission is not thought to be a direct probe of accretion, i.e. it does not originate in the accretion flow itself, where it is likely destroyed by UV light from the surface.As such it seems reasonable to follow Koutoulaki et al. (2019) in their conclusion that accretion seems to be a necessary but not sufficient condition for CO emission to be observed.The low detection rates could be due to geometric effects (as discussed in Section 5.1), veiling due to high continuum emission or simply varying conditions in disks around accreting objects.It remains remarkable, however, that such vastly different accretion rates around objects of a wide Table 9: Detection rates of CO 1 st overtone bandhead emission (CO emission) in YSOs of different masses and stages of formation.
range in masses could lead to emission that consistently points to very similar disk conditions (see Section 5.1).
Our own sample was taken from a total of 6 YSOs, or 8, if we include the two young stars with only MIR excess from RT17 (B289 and B215 on Figure 6; see also Section 1).It is surprising to find that 5 (63-83%) of these objects show CO bandhead emission.This raises the question as to the evolutionary state of these objects, which we address in the following section.

Evolutionary stage(s) of the intermediate to high mass
YSOs in M17 The mass range of the YSOs in our sample (6 − 12 M ) spans the upper mass ranges of Herbig Be stars and the lower mass ranges of MYSOs.To better understand the formation stage of the studied sample, we briefly discuss the most important characteristics of the two categories and make comparisons to objects in each category.
Massive YSOs are minimally understood to be objects that are in the process of forming a star (or more in a multiple system) that is at least 8 M when reaching the zero age main sequence (ZAMS).The widely used MYSO catalog Red MSX survey (RMS; Lumsden et al. 2013) selects objects based on their MIR brightness and high luminosities ( 10 4 L ), which are hence defining properties for studies in the near-and midinfrared (e.g.Ilee et al. 2013;Frost et al. 2019).Some of these studies explicitly exclude from their definition objects that have started to ionize their surroundings to form an H ii region (e.g.Wheelwright et al. 2010;Pomohaci et al. 2017).However, this specification is not included in studies in (sub)-mm to radio wavelengths (e.g.Beltrán & de Wit 2016;Maud et al. 2018;Johnston et al. 2015;Ilee et al. 2018a), that reveal large scale (∼ 10 2 − 10 4 AU) molecular outflows and/or disk-like structures.Takami et al. (2012) find that these mm-bright objects can be embedded at MIR wavelengths.Studies that explicitly include both wavelength regimes are rare (Hsieh et al. 2021), which makes it hard to compare objects in either categories.It appears that the term MYSO is used for objects with a range of masses and evolutionary stages, from the near molecular core phase up to objects for which (some) photospheric features are observed, covering thereby a wealth of objects that differ in their characteristics.However, we note that the overwhelming majority of studies into MYSOs pertains to objects that are highly embedded in gas-dust envelopes (A V 20 − 30) and have (estimated) masses of at least 10 − 15 M with no detectable photospheric features.
Of the M17 PMS stars, B331 with 12 M best fits the category MYSO, while also having a well detected photosphere.In a detailed study of a MYSO of ∼ 25 M , Frost et al. (2019) find evidence for a 60 AU inner clearing of the dust disk, while also reporting on the detection of CO bandhead emission.This combination of features resembles our findings for B331.Notably the authors refer to transition disks in low mass young stars which show dust clearing while still having a small gaseous disk where accretion could be ongoing (Wyatt et al. 2015).They tentatively suggest that the object of their study could represent such a stage in an MYSO and propose photo-evaporation or a companion as the most likely dust clearing mechanisms.These conclusions are corroborated in their follow-up study of a sample of 8 MYSOs, from which they find evidence for an evolutionary sequence, the later stages of which are indeed characterized by inner dust clearing by photo-evaporation (Frost et al. 2021a,b).According to this interpretation the disk of B331 is in a state of transition.For our other objects this suggests an earlier stage of formation where they have not yet started to clear their inner dust disk.This scenario is consistent with the masses and luminosities of the respective objects.The two young stars from RT17 close to the ZAMS, with masses of 20 M and 10 M (B289 and B215 on Figure 6), also support this picture as they lack all inner (gaseous or dust) disk signatures, but do show MIR excess similar to B331.
Recent studies that identify and characterize large samples of Herbig Ae/Be stars describe these objects as optically revealed PMS stars in the mass ranges 2 − 10 M , characterized by emission lines and NIR excess attributed to disks (Vioque et al. 2018(Vioque et al. , 2022;;Guzmán-Díaz et al. 2021).Because the M17 PMS stars fit this description well, we highlight the recent study of Vioque et al. (2022) (hereafter V22) for comparison.V22 analyze a sample of 128 Herbig Ae/Be stars with stellar masses up to ∼ 20 M .Based on optical spectra, GAIA parallaxes and photometry they derive stellar parameters and accretion rates.
The V22 Herbig stars with masses > 4 M , most of which are spectral type B, are shown on the HRD in Figure 6 together with the M17 PMS stars.In the figure we include B337, the one PMS star studied by RT17 without CO bandhead emission, but with otherwise very similar characteristics.We also include the two objects from that study (B289 and B215; in brown), that show no signatures of inner (gaseous or dust) disks in their Xshooter spectra.The more extincted M17 sources (A V ∼ 13) are shown in purple.From the HRD plot, it appears that the M17 PMS objects are all younger than the V22 sources.This is consistent with the derived extinctions for the respective samples; with A V = 2.0 − 6.7, and mean A V = 3.6 ± 1.1 for the V22 subsample (M * > 4 M ), and A V ∼ 7 − 13 for the M17 sources (RT17 and Table 1).Furthermore, V22 report photo-evaporation of inner dust disks for sources above 7 M based on a decline in NIR/MIR excess, similar to B289 and B215.In this context, the youth of the M17 PMS sources is again confirmed by the presence of inner dust and gaseous disks, with the most luminous and highest mass object (B331) showing signs of inner dust disk clearing (in line with the comparison to the MYSO from Frost et al. 2019).
V22 do not mention CO bandhead emission, as the necessary wavelength ranges were not included in their study.The main reference for the occurrence rate of CO bandhead emission amongst Herbig Ae/Be stars therefore remains Ilee et al. (2014).Of the 23 objects in their sample with masses 4 M , 13 − 17% exhibit CO bandhead emission (masses from Fairlamb et al. 2015).This is on the low side compared to detections towards MYSOs, which, as proposed earlier, could be because these objects are typically in a later stage of formation than most MYSOs.Though we did not perform a full census of PMS stars in M17, the presence of CO bandhead emission in 5 out of 6 them is yet another pointer towards young age.

Limitations and open questions
In this final section we discuss the limitations of this work and make suggestions for further research.
The column densities derived from the SED fitting (Table 5) are orders of magnitude lower than those derived from the CO bandhead fitting (Table 7).The explanation for this lies in the fact that the disk model used lacks vertical disk structure.This means that the relatively high dust temperatures associated to the NIR emission are likely valid only on the disk surface, which can hide a cold and dense mid-plane underneath.Thus the derived column densities from the dust are representative only of the top disk layers and it is highly unlikely that the thermal emission is truly optically thin throughout the vertical disk extent as the best fitting models suggest.Similar statements can be made for the CO bandhead modeling: we have derived the properties of the region in the disk where the emission originates, which, as we have concluded, is relatively small.Therefore, it is difficult to make statements about the disk as a whole.Longer wavelength data (MIR and (sub)mm) are needed to constrain the masses and sizes of these disks.This, in turn could give further information on the disk evolution around these PMS stars -for instance whether the disks are (also) eroded outside-inwards.
The question whether and how much these stars are accreting is also an open one.V22 estimate accretion rates of ∼ 10 −4 − 10 −6 M /yr for objects in the mass range 6 − 12 M , based on line luminosities and extrapolations of magnetospheric accretion models.Since the M17 PMS stars appear younger than the V22 sample and show CO bandhead emission, they could well be still accreting with rates on the higher end.However, since the similarity in disk properties from CO emission seems hard to reconcile with the disparity in reported accretion rates, the link with accretion remains ambiguous.
With regard to the possible accretion mechanisms, i.e. boundary layer (BL) vs. magnetospheric accretion, we note that although CO emission probes regions close to the star, for most of our objects it still originates from outside the corotation radius.With v sin i values taken from RT17 and using equation 2 from Ilee et al. (2014) we find corotation radii of ≤ 0.081, 0.19 and 0.14 AU for B243, B268 and B275 respectively.Though we have no constraint on v sin i values for B163 and B331, the CO emission in those objects also (mostly) originates outside these radii.Thus, only for B268 (and perhaps B163) the CO emitting region covers the corotation radius.This could be an indication that the disk extends close enough to the stellar surface for BL accretion to take place, but for most of our objects the CO emission cannot probe this, in agreement with Ilee et al. (2014).

Summary and conclusions
In this work we have studied the CO bandhead and thermal infrared emission from the disks around 5 intermediate to massive PMS stars in M17, with the aim to constrain their inner dust and gaseous disk properties and further our understanding of the end stages of massive star formation.To this end, we develop a LTE disk model that accounts for dust and CO overtone emission.For the first time we fit the 2 nd overtone.We fit normalized bandheads, using models for stellar and dust continuum.We compare and discuss the derived disk properties in the context of the previously derived PMS evolutionary states of the central stars (RT17) and CO bandhead emission detection (rates) among YSOs in the literature.We arrive at the following conclusions: 1. Taking into account the breadth of the parameter space and the diversity in line morphology, we arrive at surprisingly similar inner gaseous disk characteristics for the CO emitting region, which is consistent with results in the literature for YSOs of different masses.Therefore, it seems that this emission is typical of certain (more or less) specific physical conditions, i.e. high densities (N CO ∼ 10 21 cm −2 ) and temperatures (2000 − 5000 K). 2. The (overall low) detection rates reported in the literature suggest that CO overtone bandhead emission is preferentially observed towards objects with high(er) accretion rates and in earlier stages of their formation, across all mass ranges.The high detection rate among the M17 PMS stars, in combination with their position on the HR-diagram suggests that these objects are exceptionally young for their mass range.3. The previous conclusion is supported by the SED analysis, which points to undisrupted inner dust disks, except in the case of the most luminous object B331, for which the disk is likely in transition.4. Though it is likely that these objects are accreting, the CO emission is not suited to probe the rate and mechanism of accretion.Results in this work point to CO bandhead emission regions close to the star, but outside the co-rotation radius (save for one object) within which boundary layer vs. magnetospheric accretion can be probed (in agreement with Ilee et al. 2014).Analysis of hydrogen lines for these objects is more suited to probe accretion regions, as these lines originate even closer to the stellar surface (Backs et al. 2023). 5. Taking into account the strength of the bandheads as well as the inclusion of the 2 nd overtone emission provides more stringent constraints on the emission region and column densities, than previous studies lacking these diagnostics.
6.We clearly detect the 13 CO feature, but find no evidence for its enhancement in the spectra, consistent with the PMS nature of the M17 objects.Though of low diagnostic value to constrain disk parameters, we corroborate that 13 CO abundances can be useful in determining the evolutionary state of objects with CO overtone emission, as suggested by Kraus et al. (2020).7. The M17 intermediate to high mass YSOs appear unique in their combination of having both observable photospheres while also spectral and SED properties typical of younger, more enshrouded objects.Multi-wavelength analysis applied to the same objects is crucial for a better understanding of these objects in particular and the different stages of massive star formation in general.

Appendix A: Photometry tables
Tables A.1 to A.3 list the photometry points used for the fits.

Fig. 3 :
Fig.3: Best fit model SEDs and photometric data points.On each figure the data and results for B275 (in black) are plotted along with one other object (in blue) for comparison.Kurucz models representing the stellar spectrum are given in dashed lines; the combined stellar and disk model in solid lines.The scattered points mark the photometry.Also indicated are the wavelength ranges of the 2 nd and 1 st overtone bandheads.Note how B331 is both more luminous and also lacks NIR excess, with the dust emission starting around 4 to 5 µm.

Fig. 4 :Fig. 5 :
Fig.4: Best fits (in red) overplotted with data (in black) for all objects.The data points included in the fits are marked in orange.Note the different y-axis scales for the 1 st (right) and 2 nd (left) overtone.All fluxes are normalized.On the 2 nd overtone bandhead plots the hydrogen Brackett series (not marked) can also be seen, in absorption from the stellar photosphere and/or in emission from the disk.

Fig. 6 :
Fig. 6: Hertzsprung-Russell diagram displaying the MIST premain-sequence tracks (Dotter 2016) for stellar masses between 6 and 20 M , with the solid black line on the left denoting the ZAMS and the gray dashed line on the right indicating the birthline.The objects in this study, along with the one M17 PMS star from RT17 without CO bandhead emission (B337), are shown together with the > 4 M Herbig stars from the Vioque et al. (2022) sample.Also included are the two RT17 objects (B289 and B215) close to the ZAMS without inner dust or gaseous disk signatures, but with MIR excess.The M17 PMS stars appear to represent a younger stage of formation, consistent with the high detection rate of CO bandhead emission in their spectra and their high extinction.For comparison, the estimated CO detection rate representative for > 4 M Herbig stars is 13 − 17% (see text).
B163 and B331 were too embedded to allow for full modeling.Physical parameters were estimated by spectroscopic classification and estimating T eff and L from Kurucz calibration tables.
a ZAMS mass of best fit PMS track.b

Table 1 :
Stellar and extinction properties derived from quantitative spectroscopy and optical (λ 1µ) SED fitting.

Table 2 :
Parameter values in grid for SED fitting.

Table 3 :
Parameter values in grid for SED fit of B331.

Table 4 :
Parameter values in the grid for CO bandhead fitting.

Table 5 :
Parameter ranges for which reasonable SED fits could be obtained to the available photometry.

Table 7 :
Fit results for CO bandhead modeling (see text for details).