Testing a New Model of Embedded Protostellar Disks Against Observation: The Majority of Orion Class 0/I Disks Are Likely Warm, Massive, and Gravitationally Unstable

We formulate a parametrized model of embedded protostellar disks and test its ability to estimate disk properties by fitting dust-continuum observations. The main physical assumptions of our model are motivated by a recent theoretical study of protostellar disk formation; these assumptions include that the disk should be marginally gravitationally unstable, and that the dominant dust heating mechanism is internal accretion heating instead of external protostellar irradiation. These assumptions allow our model to reliably estimate the disk mass even when the observed emission is optically thick and to self-consistently determine disk (dust) temperature. Using our model to fit multi-wavelength observations of 156 disks in the VANDAM Orion survey, we find that the majority (57%) of this sample can be fit well by our model. Using our model, we produce new estimates of Orion protostellar disk properties. We find that these disks are generally warm and massive, with a typical star-to-disk mass ratio $M_{\rm d}/M_\star = \mathcal O(1)$ in Class 0/I. We also discuss why our estimates differ from those in previous studies and the implications of our results on disk evolution and fragmentation.


INTRODUCTION
Accretion disks in protostellar systems1 play a crucial role in the formation of stars and planets by regulating protostellar accretion and by setting the initial conditions of planet formation.However, our understanding of protostellar disks has been limited by several challenges in characterizing observationally the properties of protostellar disk populations in nearby star-forming regions and in theoretical studies of protostellar disk formation and evolution.
Our observational understanding of protostellar disk populations used to be limited mainly by the small number of observed protostellar disks (see reviews of early protostellar disk observations in Zhao et al. 2020 andTobin et al. 2020), in part due to the short timescale of protostellar evolution.The problem of sample size has been greatly alleviated by recent surveys (Segura-Cox et al. 2018;Williams et al. 2019;Tobin et al. 2020), which provide data for tens to hundreds of protostellar disks in young star-forming regions.However, it remains challenging to estimate reliably the disk properties from these data.In particular, it is difficult to obtain reliable estimates of disk masses from dust-continuum emission (which is what most large surveys measure).This process involves at least two major uncertainties: whether the disk is optically thin at the observed wavelength, and the temperature of the emitting dust grains.A common choice is to ignore these uncertainties by blindly assuming that the disk is optically thin and prescribing some arbitrary dust temperature (often ∼30 K).This leaves large uncertainties in the result; in particular, when the disk is optically thick, this approach can significantly underestimate the disk mass.One can also avoid making these arbitrary assumptions and fit observation with parametrized disk models (often coupled with radiative transfer) with more degrees of freedom (e.g., Sheehan & Eisner 2017;Sheehan et al. 2022).A major limitation of this approach, however, is that the model may be underconstrained by data.A sufficiently general model contains many (≳10) free parameters while currently available data sets often do not contain enough information to constrain these parameters reliably or test the physical assumptions involved in the model.
On the other hand, obtaining a clear theoretical understanding of protostellar disk formation is also very W. Xu challenging.Despite the long history of theoretical studies of this subject (see a review in Zhao et al. 2020), a clear and quantitative physical picture of disk evolution is still absent.This is mainly due to theoretical difficulties in understanding the interplay between the various physical mechanisms relevant for disk formation and to technical difficulties in achieving good resolution (and numerical convergence) while modeling all relevant physical ingredients in a realistic fashion (Xu & Kunz 2021a).However, recent studies by Xu & Kunz (2021a,b) offer some optimism in resolving this problem.Using a combination of simulation and analytic theory, they argued that a typical protostellar disk should be self-regulated by gravitational instability (GI) and stay marginally unstable, and that the thermal budget of the disk is determined by a simple balance between accretion heating and radiative cooling (see a more detailed summary in Section 2.1-2.2).Given these constraints, the disk profile can be determined (approximately) from just a few parameters (Xu & Kunz 2021b, see also Section 3).Yet one major caveat is that these results are based on simulations using a highly idealized initial condition (a non-turbulent pre-stellar core with uniform rotation and magnetization), and it is unclear how well the resulting physical picture can be generalized to real protostellar systems (cf.Section 8.3).
In this paper, we try to address the aforementioned challenges by fitting observations from a recent large survey of protostellar disks, the VANDAM Orion survey (Tobin et al. 2020, hereafter T20), using a model based on the physical picture outlined in Xu & Kunz (2021b, hereafter XK21b).This will allow us to test this physical picture and obtain new (and potentially more reliable) estimates on the properties of Orion protostellar disks.
This paper is organized as follows.We begin by describing our model in Sections 2-4, with a summary of our physical assumptions in Section 2, details of the model setup in Section 3, and details of how we fit the model to data in Section 4. We carefully test our model against observation in Section 5, and discuss the estimates of Orion protostellar disk properties coming from our model with a focus on the physical implications of our results in Section 6.We compare our model with others in the literature to explain why we find much higher disk mass and demonstrate that our model has better predictive power in Section 7. We conclude with some additional discussion in Section 8 and a brief summary in Section 9.

Gravitational self-regulation
The first assumption of our model is that the disk is gravitationally self-regulated and is in a marginally gravitationally unstable state (cf.Vorobyov & Basu 2007;Kratter & Lodato 2016).Here we sketch out a simple argument for this assumption; for more details, see Xu & Kunz (2021a) Section 5.2 and XK21b Section 4.
In an accreting protostellar system, material is accreted from the envelope2 onto the disk, and then from the disk onto the protostar.During Class 0/I, the accretion onto the disk due to envelope infall happens quickly (∼10 −5 M ⊙ /yr based on the typical duration of Class 0/I, Andre et al. 2000).Meanwhile, when the disk is gravitationally stable, the accretion rate through the disk onto the star is much lower; such accretion is mainly facilitated by angular-momentum removal from the disk via magnetic braking and magnetically-launched outflow, which are weak because of the strong ambipolar diffusion in the very weakly ionized disk (XK21b; cf.Masson et al. 2016;Zhao et al. 2018).Therefore, the disk mass tends to increase, eventually making it gravitationally unstable.Once the disk becomes unstable, gravitationally-excited perturbations tend to reduce the degree of instability as they can efficiently transport angular momentum outwards (which drives accretion onto the protostar) and heat up the disk. 3The balance between envelope infall and gravitationally-driven transport should then leave the disk in a marginally unstable state.In terms of the Toomre Q parameter, this expectation translates to having Here cs the density-weighted average sound speed, κ is the epicyclic frequency, and Σ is the surface density.This provides a simple constraint on the surface density profile of the disk.We stress that there are several important uncertainties related to this argument, which we summarize in Section 8.3.Given these uncertainties, the assumption of gravitational self-regulation should not be taken for granted.We perform several tests in Section 5 to check whether observational data favor this assumption.

Thermal budget and temperature profile
While gravitational self-regulation provides a constraint on the surface density profile, we also need to know the temperature profile of the disk in order to predict the dust thermal emission.
First, consider the thermal budget of the disk.We assume that the disk is mainly heated internally due to accretion, and ignore external heating due to stellar irradiation.This assumption follows the fact that the geometric thickness H/R of a gravitationally self-regulated disk tends to increase towards smaller radii and the disk inner edge shields the rest of the disk from direct protostellar irradiation (XK21b, also see the radial dependence of temperature in Appendix C).This assumption also implies that the dust temperature is approximately equal to the gas temperature; in this paper we do not distinguish between these two temperatures.
As a rough approximation, the heating rate should be comparable to the rate of gravitational energy release due to accretion (see proofs in Balbus et al. 1994 and XK21b).In approximate thermal equilibrium, this gives Here T eff is the effective temperature of the disk, g R is the radial gravity at the midplane, Ṁ (R) is the mass flux (accretion rate) through the disk at R, Ṁ is the rate of mass infall from the envelope onto the disk, and Ω is the orbital frequency.For simplicity, we assume g R = −GM (<R)/R 2 and Ω = −g R /R.We also assume that Ṁ (R) ∼ Ṁ , which is valid when the infall rate and the disk properties all evolve slowly at a timescale comparable to the lifetime of Class 0/I.
There are two caveats regarding this assumption.First, the envelope might heat the outer disk by scattering and reemitting the radiation from the protostardisk system (D'Alessio et al. 1997).It is difficult to quantify the importance of this mechanism without introducing many free parameters as it would be sensitive to the properties of the envelope (Natta 1993).Second, for a small subset of our sample the temperature in the outermost part of the disk can get below ∼ 10 K and become comparable to the background temperature of interstellar irradiation; in this case this ambient irradiation could become the dominant source of heating.Generally, these limitations would lead us to underestimate the temperature in the outer disk in some cases.
Next, we want to relate T eff to the vertical temperature profile at a given radius.This relation is determined mainly by the mechanisms of disk heating and vertical heat transport.For the former, we assume that the heating rate per unit mass is constant (at given R), which is a rough but reasonable approximation for an internally heated disk.For the latter, we assume that the vertical heat transport is dominated by radiative heat transport following XK21b.While perturbations in the disk (mainly gravitationally excited spirals) could induce some forced convection, XK21b demonstrated that such forced convection generally cannot be the dominant vertical heat transport mechanism and can be safely ignored (cf.Rafikov 2007).
Under the two assumptions discussed above, the vertical temperature profile can be well-approximated by a simple analytic form (Hubeny 1990, Eq 3.11) (3) Here τ R is the Rosseland optical depth at a given location and τ R/P,mid is the Rosseland/Planck optical depth at the midplane.This approximation assumes that the opacities vary slowly in depth and the heating rate per mass is approximately constant.

Comparison with assumptions in previous studies
There are two main differences between our model and the assumptions in previous observational estimates of protostellar disk properties.The first is that we do not assume the emission to be optically thin at the observed wavelength.While typical protostellar disks are likely optically thick at (and even above) mm wavelength (as suggested by their low spectral indices; Li et al. 2017;Galván-Madrid et al. 2018), most previous mass estimates assumed optically thin emission for lack of a better method to infer the mass of obscured dust at τ ≳ 1.Here we tackle this problem by using the physical constraints discussed above to relate the properties of the disk surface (at τ ≲ 1) to surface density and midplane temperature.
Another important difference concerns the assumed dust heating mechanism.When translating dust thermal emission to dust mass, most existing studies (e.g., T20; Sheehan & Eisner 2017;Sheehan et al. 2022) assume that dust temperature is set mainly by protostellar irradiation, while we assume that the dust temperature is set mainly by the gas in the disk, which is heated internally due to accretion during the main accretion phase.

Generating the disk profile
The physical assumptions given in Section 2.1 and 2.2 provide enough constraints to generate the radial surface W. Xu density profile and the radial and vertical temperature profile given three inputs: the mass of the protostar M ⋆ (as the inner boundary condition), the disk size R d (as the outer boundary), and the mass infall rate from the envelope Ṁ .Scripts for solving the disk profile are available online at https://github.com/wxu26/GIdisk2obs;here we sketch out a method of solution.
First we discuss how we solve the surface density and vertical temperature profile at a given radius for a given set of Ω, κ, and Ṁ .Here we treat "∼" and "≈" above as "=" when solving the disk profile and assume a constant Q = 1.5 in our fiducial model. 4We begin by mapping (τ R,mid , T eff ) to the vertical temperature profile and then to (Σ, cs ).For a given set of (τ R,mid , T eff ), we can solve τ P,mid by plugging Eq. 3 into the constraint The Rosseland and Planck mean opacities κ P , κ R are given in Section 3.3.Knowing τ P,mid , we can then compute the vertical temperature profile T (τ R ) with Eq. 3 and use T (τ R ) to compute Σ (using the relation dΣ = κ R dτ R ) and cs .We then (numerically) convert this map into a map from (Σ/c s , T eff ) to other local disk properties (Σ, τ P,mid , τ R,mid ).Since Eq. 1 and 2 directly determine Σ/c s and T eff , we can use this map to determine the local disk properties.
To generate a disk profile, we just need to update the radial profile of (Ω, κ) iteratively using the Σ profile calculated with current estimates of (Ω, κ) until the results converge.Here M ⋆ and R d are used as boundary conditions for this process.

Generating mock observation
Using the disk temperature profile T (R, τ R ) generated from our model, we can produce mock observations at a given wavelength (frequency).
We first compute the intensity I ν (R) at disk surface using the temperature profile T (R, τ R ).Since the disk is often optically thick and the scattering opacity is generally higher than absorption opacity at the observed wavelengths, we need to include scattering in our calculation (Zhu et al. 2019).For simplicity, we assume that the disk is geometrically thin; this reduces solving I ν at given R to a 1D problem.The intensity is given by (5) Here τ ν is the optical depth from disk surface (along vertical direction), τ ν,tot is the total optical depth, µ = cos I with I being the inclination of the disk, and S ν is the source function.We map τ R , which is used to specify the temperature profile, to τ ν using the relation Here κ ν,abs is the absorption opacity and κ eff ν,sca = (1 − g ν )κ ν,sca is the effective scattering opacity, where g ν is the forward-scattering parameter and the factor (1 − g ν ) accounts for anisotropic scattering (Ishimaru 1978).The source function S ν is given by where B ν (T ) is the Planck function, J ν = 1 4π I ν dΩ is the isotropic intensity, and ω ν = κ eff ν,sca /(κ ν,abs +κ eff ν,sca ) is the single-scattering albedo.Now we only need to solve for J ν .Under the Eddington approximation, the second moment of the radiative transfer equation becomes and we solve it (numerically) with the boundary condition of no incoming radiation field, which under the twostream approximation is given by dJ ν /dτ ν = ± √ 3J ν at top/bottom surface (Miyake & Nakagawa 1993).In the limit of high optical depth, including scattering generally reduces I ν by a factor of O( √ 1 − ω ν ) (Zhu et al. 2019).
We then use the I ν (R) profile to generate a mock observation image by orienting the disk using the position angle (PA) and inclination estimates from T20, which are based on the deconvolved shape of the bestfit gaussian profile of the 0.87 mm image, and convolve the image with a 2D gaussian beam whose widths and orientation are identical to the synthesized beam of the corresponding observation.This produces the intensity profile I ν in Jy per beam.

Dust opacity and disk truncation
Generating the disk profile and mock observation both require knowledge on the opacity of the disk, which is dominated by dust opacity.Here we assume a constant dust-to-gas mass ratio of 0.01 and a power-law grain-size distribution with dn/da ∝ a −3.5 (Mathis et al. 1977) having a maximum grain size a max = 1 mm and a minimum grain size a min → 0. (As long as the minimum and effective scattering opacity κ eff ν,sca before any dust sublimation (T < 150 K).We also label the dust opacity indices β between 0.87 mm and 9 mm (vertical black lines) for κ ν,abs .
grain size a min ≪ a max , the opacities are insensitive to the exact choice of a min .)The assumption of a constant dust-to-gas ratio should be reasonable, as the dust grains are well coupled to the gas (Appendix A).
Our choice of a max assumes that the dust-size distribution is more similar to that in protoplanetary disks than that in the ISM, which would have a max ≲ µm.This is because grain growth is expected to proceed quickly in the disk before approaching the equilibrium between coagulation and fragmentation (Birnstiel et al. 2011).Still, the exact value of a max remains highly uncertain, as a max can vary significantly across different disks and vary radially within a disk.To address this uncertainty, while we use a fiducial value of a max = 1 mm, we also produce models with a max = 10 µm, 100 µm, and 1 cm and compare the results across these models.
The dust opacities are computed using the DSHARP opacity package (Birnstiel et al. 2018).The assumptions for grain porosity and composition also follows that of Birnstiel et al. (2018), with zero-porosity grains composed of 20% water ice, 40% refractory organics, 7% troilite, and 33% silicates by mass.Since the temper- atures in the inner part of our model profiles are often several hundred K and above, we need to include the effect of dust sublimation.We adopt the sublimation temperatures from Pollack et al. (1994) and remove water ice, refractory organics, troilite, and silicates from our dust mixture (while adjusting the dust-to-gas ratio accordingly) at 150, 425, 680, and 1200 K, respectively.
Here for simplicity we have dropped the density dependence of water ice and silicate sublimation temperatures.
Beyond 1200 K, we assume that κ P , κ R → 0 as the remaining gas opacity there is much lower than the dust opacity below 1200 K.In Fig. 1 we summarize the opacities for dust models with different a max .Once disk material exceeds 1200 K, it can barely cool by radiation due to the drop in opacity and thermal equilibrium can only be achieved at a much higher temperature when gas opacity becomes sufficiently high.This results in a steep increase of temperature at R < R 1200K , where R 1200K is the radius where midplane temperature first reaches 1200 K.The high temperature at R < R 1200K would lead to well-coupled magnetic field because the thermal ionization of potassium increases disk ionization exponentially above ∼ 1000 K (Umebayashi 1983).In this regime, our model is no longer applicable, and the disk is likely to be regulated by MRI and magnetized outflow which efficiently transport angular momentum (with effective α ≳ 0.01) to keep the surface density at R ≲ R 1200K low (cf.Gammie 1996).For simplicity, we excise this region from our calculation and assume that the surface density and flux density are zero at R < R 1200K .

Choice of model parameters
We conclude this section by summarizing all parameters of our model and discussing the choice of free (and fixed) parameters.The disk profile can be solved with only three parameters: the mass of the star M ⋆ , the disk size R d , and the rate of mass infall (accretion) from envelope Ṁ (Section 3.1).One intuitive choice is to leave all three parameters as free parameters; this, however, will be problematic for unresolved disks, where obser-W.Xu vation only has two degrees of freedom corresponding to the integrated flux at the two observed wavelengths.(There is a little additional information as a finite disk size will always cause the profile to deviate from being exactly Gaussian; but such information is negligible when the disk diameter is significantly smaller than the beam size.)Therefore, in order to fit all observed systems, we want our model to have at most two free parameters.This is achieved by assuming a constant Ṁ /M ⋆ = 10 −5 yr −1 .The choice of this ratio is motivated by the typical lifetime of the main accretion phase (Class 0 and Class I).The mass dependence is meant to capture (to some extent) the possible difference in accretion rates between low-mass and high-mass systems, but not the variation of accretion rate in a single system as the protostar gains mass.This is of course a very crude approximation; it is unclear whether the duration of the main accretion phase has any mass dependence, and it does not correctly capture the temporal variation of accretion rate within the main accretion phase (where Ṁ should remain approximately constant or decrease as M ⋆ and the age of the system increase).We address the large systematic uncertainty in our assumed Ṁ by quantifying how a different choice of Ṁ /M ⋆ impacts the result in Section 4.3 and 5.1.
We also comment that there are several other ways for estimating Ṁ , but they each have their own limitations.First, one can simply choose a fixed Ṁ , but it would not be reasonable to assume that all systems -which span several orders of magnitude in mass -have identical accretion rates.Besides, we find that the best-fit model for some low-mass systems would have M ⋆ → 0 if we assume a fixed Ṁ ∼ 10 −5 M ⊙ yr −1 .Another possibility is to estimate Ṁ using the luminosity of the protostar L ⋆ .However, L ⋆ depends on the accretion rate onto the star Ṁ⋆ and can be highly variable (potentially due to the modulation of the disk), such that an instantaneous estimate of Ṁ⋆ is not necessarily a good approximation of the mass infall rate Ṁ (cf.Offner & McKee 2011;Zakri et al. 2022).Therefore, our choice of a constant Ṁ /M ⋆ is probably already the best we can do before observations reach better resolution (allowing us to fit Ṁ as a free parameter) or provide direct estimates of Ṁ (e.g., through envelope dynamics, which is already available for a small number of systems, e.g., Kristensen et al. 2012;Pineda et al. 2012).
In addition to our two free parameters M ⋆ , R d and our fixed Ṁ /M ⋆ , there are a few additional fixed parameters to the model, including an assumed constant Toomre Q of 1.5 (following the idea of gravitational self-regulation) and parameters for the fiducial dust model (Section 3.3).We summarize the fiducial model parameter choices in Table 1.In Sections 4.3, 5.1, and Appendix B we also discuss how different choices of these fixed parameters impact the agreement between model and observation and the estimated disk properties.

Sample selection
We use observations from the VANDAM Orion survey (T20; Tobin 2019a,b), a large survey of protostellar systems in the Orion molecular clouds with ALMA (0.87 mm) and VLA (9 mm) at ∼40 au resolution.Our sample consists of all systems that are detected at both wavelengths and have positive deconvolved major and minor axes at 0.87 mm (which are used for estimating inclination).This gives a total of 163 systems; among them, 98 are Class 0, 40 are Class I, 21 are Flat Spectrum, and 4 are unclassified.

Fitting
For each system, we vary the two free parameters of the model, M ⋆ and R d , to minimize the error between observed and model images (flux density).We fit the model with images as opposed to visibilities (in frequency space) because it is easier to characterize the systematic uncertainty of the model (due to the oversimplifications we made) in real space rather than in frequency space.
We evaluate the error between the observed flux density I obs and the model flux density I model using Here σ obs is the observation uncertainty, estimated with the RMS flux density of each field of observation, and σ log model captures the systematic uncertainty of the model due to the many oversimplified assumptions in our model.The errors in disk properties due to these oversimplifications are generally of order unity (cf.XK21b Fig. 16); therefore we choose σ log model = log(2)/2, such that ±1σ covers a factor of 2 difference in I obs /I model .χ 2 roughly corresponds to the minus log likelihood (per beam) of observing the given deviation between model and observation.
In order to find the best-fit model, we minimize Here S B,A/V = π 4 ln 2 b A/V,maj b A/V,min is the beam area (converted to au 2 to match dS), with b being the FWHM.Roughly speaking, ... S −1 B,A/V dS represents a summation over beams.The integral covers a square region around the protostar with width 800 au or 4× the disk radius estimate in T20, whichever is larger.l can be interpreted as the minus log likelihood of observing the given error.
We minimize l under the constraint that the model flux density at 0.87 mm (before blurring) needs to be ≥σ obs at the disk edge.This is because the fit becomes much less reliable when the flux is below detection limit, as the model tends to fit the envelope emission around the disk (which is not included in our model) by incorrectly increasing the disk size.When the fit is affected by this constraint, the best-fit disk size should be interpreted as an estimate for the lower limit of the true disk size.For our fiducial model, 31 systems are in this regime.

Quantifying systematic uncertainties
In order to avoid overfitting, we have reduced the number of free parameters in our model by making a series of oversimplifying and/or arbitrary assumptions.It is important to evaluate whether our estimated disk properties are sensitive to these assumptions and quantify the systematic uncertainties in our results.This is done by fitting our sample with different assumed model parameters, and using the measured dependence to estimate the systematic uncertainties in our results.The details of this process are documented in Appendix B. In summary, assuming one order-of-magnitude uncertainty in a max and Ṁ /M ⋆ and an order-unity uncertainty in Q, the uncertainties in key disk properties are generally of order unity and at most a factor of ∼ 3 (see Table 4).

Agreement between model and observation
In order to test our model against observation, we first check whether the model is consistent with a large fraction of systems in our sample.Note that we do not expect the model to be consistent with all systems, as the physical assumptions of our model corresponds to only one of several possible scenarios of disk formation (see Section 8.3).
We evaluate the agreement between the best-fit model and observation using the mean χ 2 inside each disk, Here S A/V is the area of the region where the ALMA/VLA mock observation is above the detection limit, and χ 2 A/V is the mean χ 2 for the ALMA/VLA mock observation within this region.In other words, χ 2 mean is an average of χ 2 A/V weighted by the (detectable) disk area in unit of beams.Fig. 2 provides a few examples of systems at different χ 2 mean .For χ 2 mean ≲ 1, the model agrees well with observation.Fig. 3 plots the distribution of χ 2 mean for our fiducial model; 57% of our sample can be fit reasonably well by the model (χ 2 mean ≤ 1).Fig. 4 shows the distribution of χ 2 mean if we vary the assumed dust-size distribution and accretion rate, which are both subject to relatively large systematic uncertainties.The agreement between model and observation is relatively insensitive to different choices of these parameters, while showing a slight preference for a maximum grain size a max ∼ mm over larger (1 cm) or smaller (≲100 µm) grains.
To give a more direct impression on how well the model fits observation, we also compare flux density (Fig. 5) and apparent disk size (Fig. 6) between model and observation.Our model is consistent with observations at both wavelengths, and there is no apparent systematic difference in any of these metrics.Especially, our model naturally reproduces the trend that the apparent disk size shrinks towards longer wavelength, which is also visible in Fig. 2.

Is the good agreement coincidental?
Good agreement between observation and model prediction alone cannot be a strong evidence for arguing that the model is a good description of real disks.It remains possible that the good agreement is just an overfit, where the model is too flexible (or the observables provide too few constraints) and predicts an unrealistic disk profile which happens to produce the right observables.While one cannot completely rule out this possibility, we argue that an overfit is unlikely as follows.
First, we note that our model has only two degrees of freedom (two free parameters, M ⋆ and R d ), while the observations often offer more constraints than that.For disks that are well resolved by ALMA, the model needs to fit not only the integrated flux at both wavelengths but also the disk size and the radial profile of the flux density.In Fig. 3 we see that the agreement between the model and well-resolved systems remains good, suggesting that the agreement between the model and observation is likely not because the model has too many free parameters.
Second, we test the predictability of our model by fitting it only with single-wavelength 0.87-mm (ALMA) observations and use it to predict the 9-mm flux.This is not a trivial task; the emission is generally optically thick at 0.87 mm, and most of the 9-mm emission come from dust that is not visible at 0.87 mm.Therefore, cor- mean ≲ 1); 57% of our sample are in this regime.The last panel shows an example where the model cannot fit the observation very well.For this particular system this is due to the presence of substructure (which at current resolution is uncommon in the sample).In general, the lack of a good fit could be due to a variety of reasons, such as disk being inconsistent with our model, envelope contamination, and binarity. of the total dust mass using the emission at the surface

Cumulative probability
Vary assumed Q of the disk and the assumed physical constraints in our model.In Fig. 7 we see that the model predicts the 9-mm flux without systematic error and reproduces the typical spectral index relatively well.

Does observation favor gravitationally self-regulated disks?
A central assumption of our model is that the disk is gravitationally self-regulated with a marginally unstable Toomre Q.However, as we commented earlier, this assumption should not be taken for granted (cf.Section 8.3).Here we try to use observation to constrain (at the population level) the typical Toomre Q parameter of our sample.
In Fig. 8 we compare the distribution of χ 2 mean for several different assumed values of Q.The agreement between observation and model is similarly good for Q = 1 − 10, but deteriorates as we further increase Q. Fig. 9 compares the distribution of disk spectral index for different Q values; it shows a preference for Q = 1.5 over larger Q values, with the caveat that the difference between Q = 1.5 and Q = 3 remains small.
In summary, these evidences show that order-unity Q is preferred at a population level.We can understand this preference by noticing that, at a higher Q and for the same (optically thick) 0.87-mm emission, the model would under-predict the surface density and 9-mm flux, typical systematic uncertainty Figure 10.Distribution of disk size R d and mass M d for our fiducial model.We only include systems with χ 2 mean ≤ 2; for χ 2 mean > 1, the opacity of the marker decreases as χ 2 mean increases.We also plot the 1σ uncertainty due to systematic uncertainties in the assumed model parameters for reference (cf.Appendix B).There is a positive correlation between disk size and mass.
making it difficult to reproduce the observation at both wavelengths.

DISK PROPERTIES AND PHYSICAL IMPLICATIONS
In this section we discuss the new estimates of the properties of Orion protostellar disks obtained with our fiducial model and discuss the physical implication of our results on disk evolution and fragmentation.

Summary of disk properties and comparison with previous studies
We begin with an overview of the estimated properties of our disks, which are summarized in Table 2 and Figs. 10 -12.Here we only focus on systems with χ 2 mean ≤ 2, as for systems with large χ 2 mean our model may not be able to provide reliable estimates.
Disk mass and surface density: Our sample has a mean disk mass of 0.68 M ⊙ and median disk mass 0.52 M ⊙ , which is often comparable to or more massive than the protostar mass; there also appears to be a correlation between disk mass and stellar mass, with slope ≈ 0.6 (Fig. 11).The surface density profile of our disks approximately scale as Σ ∝ r −2 (Fig. 12), therefore a sig- Note-Here we only include systems with χ 2 mean ≤ 2. The last row shows results from T20 (based on 0.87 mm observation) for the same sample.nificant portion of disk mass is concentrated in the inner part of the disk.These properties are all generic features of a gravitationally self-regulated disk, and agrees with previous semi-analytic calculations (e.g., Lin & Pringle 1990; Rafikov 2009 also see Appendix C).Also note that the tight correlation in Fig. 11 partly because we have assumed (for simplicity) consant Ṁ ∝ M ⋆ .In reality, the distribution would likely be more scattered.Our mass estimates are significantly higher than previous estimates for the same data set (T20, Sheehan et al. 2022).Such difference is mainly due to different model assumptions; we make a more detailed comparison be-tween these models and argue that our model features better predictive power in Section 7.
Our disks are also somewhat more massive than the disks in the large-scale hydrodynamics disk population synthesis by Bate (2018), which gives M d /M ⋆ ∼ 0.1 − 1.This difference could be related to insufficient resolution as commented in the resolution study in Bate (2018).XK21b also demonstrated that M d /M ⋆ can be O(1) even when magnetic fields have been included, as long as there is no excessive numerical dissipation (which might be common among earlier simulations due to their relatively low resolution in the innermost several 10 au).
Disk size: The disk size in our sample has a median of 77.7 au and shows a wide distribution, with a factor of ∼3 difference between the first and third quartile.Our disk sizes are often larger than those quoted in T20, but this is mainly because T20 is reporting the apparent disk size R 2σ ; there is no longer a systematic difference if we compare R 2σ of our model with the T20 results (Fig. 6).We also see a positive correlation between disk mass and disk size; similar correlation has been reported for older disks (Tripathi et al. 2017).
Disk temperature: The midplane temperature in our disks range from ≲10 K to 1200 K, and the mean temperature T mean are generally several 100 K.Such high temperature is in part because the disk is often optically thick (τ R,mid ≳ 1) and cannot cool efficiently.It is also related to the steep surface density scaling, which makes the hotter inner disk dominate the densityweighted averaging.The disk temperature decreases in radius with a relatively steep power-law slope ≈−1, and T mid generally varies by more than an order of magnitude across the disk.(As a result, T mean should not be interpreted as a single characteristic temperature of the disk.)Also note that the vertical temperature profile (Eq. 3) implies that the observed dust (from disk surface) can be significantly cooler than T mid , especially at 0.87 mm.Radial profiles of disks in our sample, estimated with our fiducial model.From top to bottom, we show surface density Σ, midplane temperature T mid , miplane Rosseland mean optical depth τ R,mid , and optical depth at the two observed wavelengths.We mark the disk edge with a blue dot in each curve.For the last two panels we also show the optical depth at disk edge for absorption opacity only.We only include systems with χ 2 mean ≤ 2; for χ 2 mean > 1, the opacity of the marker decreases as χ 2 mean increases.Both disk size and mass decrease towards later evolutionary stages.These trends are discussed in Section 6.2.

Implication on disk evolution
Now we discuss the evolution of disk size and mass by comparing their statistics across different evolutionary stages and against Class II disks in the literature.We also attempt to link the observed trends with theories of Class 0/I disk evolution.
Disk mass evolution: The estimated masses of Class 0 and Class I disks in our sample are largely similar, while those of Flat Spectrum disks are lower by more than a factor of 2.5 Physically, the decrease of disk mass towards later evolutionary stages is expected because as the envelope disperses, the accretion rate could drop exponentially (Fischer et al. 2017) and that leads to less accretion heating, lower disk temperature, and a lower disk mass required for gravitational self-regulation.In other words, as the accretion rate drops, a gravitationally selfregulated disk tends to decrease its mass to maintain marginal instability (and this is achieved by keeping the accretion rate from the disk to the star slightly above the accretion rate from the envelope to the disk).
One could also estimate the mass of a gravitationally self-regulated disk at the end of envelope dispersal as an initial condition of Class II evolution.At that point accretion heating is low enough that the disk temperature should become comparable to the ambient temperature of ∼ 10 K, and that produces a disk mass of order 0.1M ⊙ for a solar-mass star with a ∼ 100 au disk (cf.Xu & Kunz 2021a Section 5.3).This is still significantly higher than early observational estimates of Class II disk masses in young (≲ 3 Myr) star-forming regions (Taurus, Ophiuchus, Lupus, ONC), which are of order 3 × 10 −3 M ⊙ (Andrews et al. 2013;Ansdell et al. 2016;Tripathi et al. 2017;Eisner et al. 2018; see a summary in T20 Figs. 14,15).But the simple models applied in these studies could systematically underestimate the disk mass by 1-2 orders of magnitude according to recent studies adopting different (and probably more robust) methods of mass estimation (Booth et al. 2019;Powell et al. 2019;Anderson et al. 2022).Accounting for this bias, the typical disk mass at the beginning of Class II would be broadly consistent with gravitationally self-regulated evolution during Class 0/I.
Disk size evolution: The size of disks in our sample decreases after Class 0, consistent with the trend observed in T20.While there is no firm conclusion on what causes this trend, one possible explanation for the shrinking disk size is enhanced magnetic braking in the inner envelope at later times (XK21b).During the collapse of a magnetized core, ambipolar diffusion decouples magnetic flux from the gas before the gas is accreted by the protostar-disk system.Most of the magnetic flux that initially belongs to the material in the protostellardisk system piles up around it in a growing magnetically dominated region (a smoother version of the "magnetic wall", cf.Li & McKee 1996;Tassis & Mouschovias 2005).At later times, infalling material needs to pass through this magnetically dominated region and would lose most of its angular momentum before reaching the disk.In this case, while accretion is still increasing the total mass of the protostar-disk system (M ⋆ + M d ), it barely increases the angular-momentum budget of the disk.This could lead to a decrease in disk size, as demonstrated in the simulation of XK21b.
This trend of shrinking disk size should end as the envelope disperses and system transitions into Class II, and the disk size can start growing again if the disk remains mainly regulated by angular-momentum transport (e.g., gravitational or magneto-rotational instability) as opposed to angular-momentum removal (wind,  (Eisner et al. 2018), the typical disk size can differ by a factor of a few across different starforming regions, making it difficult to make a meaningful comparison between these populations and our sample of Orion protostellar disks.Summary: Comparing disk properties across different stages of protostellar evolution and with statistics of Class II disks in the literature, we find that the evolution is affected by several different (and often competing) mechanisms, resulting in non-monotonic evolution where the evolution of disk size and mass switches from growth to decay before finishing Class 0. Especially, the cooling of the disk during the dispersal of the envelope might play an important role in bridging protostellar disks to young protoplanetary disks, which are significantly less massive.

Fragmentation
One important possible outcome of GI is fragmentation, which can lead to the formation of (stellar and sub-stellar) companions and are thought to be related to the outbursts which are common among protostellar systems (Vorobyov & Basu 2006).Here we check whether the disks in our sample might be prone to fragmentation.
Fragmentation occurs when the overdensities produced in GI-driven perturbations cool rapidly enough to collapse under their own gravity before being disrupted by orbital shear and other perturbations.Traditionally, the condition of fragmentation is Ωτ cool < 1 − 3 (Gammie 2001).Here the cooling timescale τ cool is the ratio between disk internal energy U and cooling rate 2σT 4 eff , both of which can be computed from our model disk profile.
In Fig. 14 we plot the radial profile of Ωτ cool for our best-fit models.Ωτ cool decreases in radius, and 40-60% of the disks in our sample are prone to fragmentation beyond 50-100 au.This is broadly consistent with the analytic prediction of Clarke (2009) and the observation of a bimodal distribution of companion separation in protostellar systems (Tobin et al. 2016(Tobin et al. , 2022) ) where the peak at smaller separation (∼100 au) could be due to fragmentation.Note that such observation does not provide a very good constraint on whether fragmentation commonly occurs as the visibility and survival rate of fragments are still not well understood.
It is worth noting that our physical picture of a gravitationally self-regulated disk cannot be directly applied to a fragmenting disk.Still, it is possible that a similar kind of self-regulation exists whereby the disk selfregulates by switching between gravitationally stable and unstable (fragmenting) states (cf.Vorobyov et al. 2013;Vorobyov & Elbakyan 2019).In this case, whenever the surface density becomes high enough for disk to be unstable and fragment, this part of the disk can get rid of some mass by forming this fragment which will migrate away, and that (together with the associated angular-momentum transport and heating) can push the disk back towards a gravitationally stable state.This fragmentation self-regulation keeps the disk around a marginally stable state, with occasional fragmentation.Qualitatively, this is similar to the constraint of gravitational self-regulation used in our model.One caveat is that this argument has not been studied in sufficient detail in theory or simulation to allow reliable predictions, and there is still a lot of uncertainties regarding the interplay between fragmentation and disk evolution.

COMPARISON WITH PREVIOUS STUDIES
Our disk model produces disk masses that are significantly higher than those in T20 and Sheehan et al. (2022), as shown in Fig. 15.In this section we discuss why it is possible to get such drastically different estimates from the same observational constraints, and point out that the prediction of our model is more consistent with the constraints from multi-wavelength observation.Comparison between disk mass estimates from our fiducial model (fitted with images at both wavelengths) and models based on images at a single wavelength.These include the optically thin, isothermal models from T20 for each observed wavelength (blue and green dots), the parametrized disk+envelope model coupled with radiative transfer from Sheehan et al. ( 2022) (orange triangles), and our model fitted with 0.87 mm images only (grey circles).The difference can be attributed to the fact that single-wavelength observation leaves a degeneracy between mass (or optical depth) and temperature, which need to be resolved by the (highly different) physical assumptions of each model (Section 7).

Summary of models
We begin by summarizing the model used in these three studies.T20 estimated the disk mass by assuming isothermal, optically thin dust; hence the disk mass is directly porportional to the observed flux.The dust temperature was assigned following an empirical scaling law, T dust = 43 (L bol /L ⊙ ) 1/4 K. Since this estimate requires only single-wavelength flux, T20 obtained separate disk mass estimates for 0.87 mm and 9 mm; the two estimates differ by a factor of ∼ 5. 7  Sheehan et al. ( 2022) performed radiative transfer calculations on a generic, 17-parameter disk+envelope model to fit the SEDs (up to 0.87 mm) and 0.87 mm visibilities for a subset of the VANDAM Orion sample.The 9-mm data has not been used for fitting or vali-7 T20 also adopted a much higher 9-mm opacity, which comes from assuming a mm-cm dust opacity index of ≈ 1; such small dust opacity index requires the grains to be highly porous (Woitke et al. 2016) or have size much larger than this wavelength (i.e.≳ a few cm; cf.Draine 2006).The difference between T20's estimates at 0.87 mm and 9 mm would be more significant (and their 9 mm estimate would be closer to our mass estimate) if T20 adopted our dust model.
dation.The model assumes that the grains are mainly heated externally and ignores internal accretion heating.The resulting disks are optically thin at 0.87 mm with masses slightly lower than the T20 results.
Our model fits the 0.87-mm and 9-mm images simultaneously and assumes that the disk is marginally gravitationally unstable and internally heated; external heating by protostellar irradiation is ignored (Section 2).Our model has only two free parameters to ensure that the data have more degrees of freedom than the model, and fiducial values are assigned to other relevant parameters (Section 3.4).We generally find the disks to be massive and optically thick.We also tried to fit our model with only 0.87-mm images and obtained similar results (

Why can models get different results from the same observation?
Fundamentally, this is because the models in T20 and Sheehan et al. (2022) are both based on singlewavelength images, yet the disk properties are underconstrained by single-wavelength observation.Roughly speaking, the luminosity of the disk is given by where M τ <1 ν is the mass of the visible (τ ν < 1) portion of the disk (or τ ν √ 1 − ω ν if scattering is strong; cf.Zhu et al. 2019) and T τ <1 ν the typical temperature in this region.If we only know L ν (from F ν ), there is still a degeneracy between mass and temperature.
Estimating disk properties requires lifting this degeneracy (or constraining how much mass is invisible, when the disk is optically thick).Each model achieves this by using its own physical assumptions (which could also involve additional observational information such as L bol and SED) to provide additional constraints.In other words, the estimated disk properties would depend on both observational constraints and model assumptions.Therefore, it is not too surprising that models using significantly different physical assumptions produce different estimates of disk properties but all fit the observed data well.

Which model should I trust?
From a theoretical perspective, each of the three models have their limitations.For example, in terms of the dust heating mechanism, T20 and Sheehan et al. (2022) only consider protostellar heating while we only consider accretion heating; in reality both mechanisms might be important.As a result, it would be difficult to confidently tell a priori which of the models is a better approximate to reality.Meanwhile, the three models are all consistent with the observations they are based on; so we would need more stringent tests.
One ideal choice would be to test the model's predictive power, i.e. whether it can predict observables that may not be directly derived from the data fed to the model.This could help to rule out incorrect or overfitted models.The multi-wavelength observations of the VANDAM Orion survey offers an excellent opportunity to perform such test; we can check whether the model can use the 0.87-mm data to predict the 9-mm data (or the spectral index between 0.87 mm and 9 mm).The models in T20 and (Sheehan et al. 2022) both fail at this test; the produce optically thin disks with dust spectral index ≳ 1, which results in a spectral index ≳ 3 (much higher than the observed mean spectral indices of ∼2.2) and systematic under-prediction of the 9-mm flux.Meanwhile, our model demonstrates good predictive power and successfully predicts the 9-mm emission when fitted with only the 0.87-mm data (Fig. 7).
In summary, while it is difficult to judge models based on how reasonable their physical assumptions are, current constraints from multi-wavelength observations prefer our model over existing models in T20 and Sheehan et al. (2022).In this subsection we discuss why our argument that the majority of protostellar disks may be gravitationally unstable is not invalidated by the fact that dustcontinuum observations of most protostellar disks show no spiral structure.The most straightforward explanation is the lack of resolution, as the typical width of the spirals would be ∼ H and our current observation is nowhere close to resolving that.In addition to resolution, the detectability of spirals are also going to be limited by disk optical depth and/or detection sensitivity; here we discuss these issues in order to better plan and interpret future observations.
At sub-mm wavelengths, the high optical depth of the disk would decrease the visibility of gravitationally excited spiral structures.To provide an example, we reran the protostellar disk formation simulation in XK21b and computed the brightness temperature (which would be directly proportional to flux density) at 0.87 mm in Fig. 16. 8 The disk is gravitationally unstable and shows prominent spirals with order-unity amplitude in the column-density profile.However, since only the sur-  face of the disk is visible, the observed flux is not directly proportional to the column density.As demonstrated in Fig. 16, the large-scale spirals are barely visible and we only see less coherent, small-scale perturbations at the surface of the disk, which are likely associated with shocks in the cascade of turbulent perturbations driven by the large-scale spirals.Meanwhile, gravitationally unstable protoplanetary (Class II) disks generally suffer less from this problem, as they are cooler and have lower column density (e.g., Rowther et al. 2020).
The problem of high optical depth is less severe for wavelengths ≳ cm, as typical protostellar disks will be optically thin at these longer wavelengths (Fig. 12).However, at such long wavelengths, detection sensitivity becomes the bottleneck.For example, for the 9-mm observations in T20, only a small (and often unresolved) central portion of the disk is above the detection limit.In the future, observations with higher sensitivity and resolution at ∼ cm wavelength might be a direct probe for spiral structures in protostellar disks.8.2.Is the 9-mm flux contaminated by free-free emission?
In this paper we have assumed that flux at both observed wavelengths are dominated by dust thermal emission.While this assumption is reasonable at 0.87 mm, the 9-mm flux may be subject to contamination from free-free emission coming from ionized gas in the outflow.Tychoniec et al. (2018) studied radio emission in Perseus and concluded that ∼60% of the 9-mm flux there could be due to free-free emission, based on an extrapolation of (free-free dominated) fluxes at 4.1 cm and 6.4 cm.Here we evaluate whether our sample of Orion protostars also suffer from a high level of free-free contamination at 9 mm.
We begin with a rough estimate using results from Tychoniec et al. (2018), which fitted an empirical relation between L bol and 4.1-cm luminosity, and find that the typical spectral index of free-free emission (between 4.1 cm and 6.4 cm) is 0.3-0.4.Using this empirical re-lation and assuming a free-free spectral index of 0.4 between 9 mm and 4.1 cm, we can estimate the free-free luminosity at 9 mm from L bol .This estimated 9-mm freefree emission is generally lower than the observed 9-mm emission, with median F free−free 9mm /F 9mm ∼ 0.2 (Fig. 17, top panel).Therefore, the observed 9-mm flux should still be dominated by dust thermal emission.
However, this conclusion is subject to some uncertainty given the spread in the estimated 9-mm flux.It is also unclear whether the empirical scaling laws in Perseus would also be applicable to Orion.To address these caveats, we check another diagnostic which is less quantitative but more robust: the misalignment between the orientation (PA) of 0.87-mm and 9-mm emissions.When the source is resolved, the 9-mm emission should be aligned with the 0.87-mm emission if it is dominated by dust in the disk, and approximately perpendicular if it is dominated by free-free emission in the outflow.In the bottom panel of Fig. 17 we plot the distribution of the estimated misalignment; the distribution peaks at 0 • and does not show any peak around 90 • , consistent with disk-dominated 9-mm emission.One caveat, however, is that most systems in our sample are unresolved or barely resolved at 9 mm; for these systems there are no accurate estimates of orientation, which results in a large "base" of random estimated misalignment in the distribution, and we cannot directly confirm whether they have disk-dominated emission.This problem can be somewhat alleviated by looking at subsamples of low-inclination systems; for example, for systems with cos I ≤ 0.5 (0.25), the fraction of aligned systems (∆PA < 15 • ) increases to 47% (71%).
In summary, current data prefer disk-dominated emission at 9-mm, yet future observations at longer wavelengths and/or higher resolution are needed to more reliably constrain the level of free-free contamination.

Uncertainties in the physical picture and other scenarios of disk formation
The physical picture of protostellar disk formation can be diverse.While in this paper we demonstrate that the observed Class 0/I disks are consistent with gravitational self-regulation at the population level, this does not rule out other scenarios of disk formation.Here we discuss the main (theoretical) uncertainties in our assumed physical picture (gravitational self-regulation), and how they result in other possible scenarios of disk formation.
One major uncertainty in our argument in Section 2.1 is whether the disk is in a quasi-steady state where accretion from the envelope onto the disk is approximately balanced by angular-momentum transport within the disk.For example, if the initial condition of the pre-stellar core and subsequent envelope evolution are highly turbulent, the specific angular momentum of the material being accreted onto the disk might undergo large variations, and the disk evolution might be driven mainly by such variations as opposed to angularmomentum transport within the disk.In the most extreme case, the whole disk might just be formed by a small jet of high-angular-momentum material; this is one possible explanation for the small number of systems in the VANDAM Orion survey that show prominent rings (Sheehan et al. 2020).From a theoretical point of view, it is unclear whether the envelope should be highly turbulent and the accretion onto the disk highly variable.Some molecular-cloud-scale simulations support this possibility (e.g., Kuffmeier et al. 2017;Kuznetsova et al. 2020), with the caveat that the adoption of ideal MHD in these studies could overestimate the level of turbulence at and below core/envelope scale.
Another uncertainty in our argument is whether ambipolar diffusion inside the disk is strong enough to render magnetically driven angular-momentum transport/removal negligible.This could depend on both initial conditions and disk chemistry, which are subject to large uncertainties.Especially, the abundance of small grains plays an important role in determining the strength of non-ideal MHD effects (Zhao et al. 2018).If the magnetic field is better coupled to the gas, disk evolution (and accretion onto the protostar) could be driven mainly by magnetic braking and wind launching and the disk could be gravitationally stable and less turbulent.As a side note, it is also possible that this kind of magnetic regulation co-exists with gravitational selfregulation in some disks, especially during later times (e.g., Flat Spectrum) when the accretion rate drops.

SUMMARY
In this paper, we formulate a parametrized disk model that generate radial profiles and mock observations of embedded protostellar disks; this model can be used to infer disk properties from multi-wavelength dustcontinuum observations (Sections 2-4).
The central assumption of our model is that the disk is gravitationally self-regulated and marginally gravitationally unstable (due to the presence of an infalling envelope).This and other physical assumptions of our model are motivated by a recent theoretical study of protostellar disk formation (XK21b; Section 2).The adoption of these assumptions reduces the number of free parameters in our model without making arbitrary assumptions on dust temperature and disk optical depth.
Especially, our model can produce reliable disk mass estimates even when the disk is optically thick.
We find that this model fits relatively well to the majority of the protostellar disks in the VANDAM Orion survey (T20; Section 5).Moreover, the observations prefer our fiducial model with marginally unstable Toomre Q compared to models with larger Q values, suggesting that the assumption of gravitational self-regulation is likely valid at the population level.(Note that this does not rule out other scenarios of disk formation; cf.Section 8.3) Using our model, we produce new estimates of Orion protostellar disk properties (Section 6).Our main findings include: • Disks are significantly more massive than previously expected, with typical disk-to-star mass ratio M d /M ⋆ = O(1).The high optical depth at ≲ mm wavelengths could have caused a systematic underestimation of disk mass in previous studies (e.g., T20).
• Both disk mass and disk size decrease towards later stages of protostellar evolution.These trends might be associated with the decrease in accretion rate and the pile-up of magnetic flux in the inner envelope at later times.In general, the evolution of disk properties throughout its lifetime is determined by a competition of several different mechanisms and can be non-monotonic.
• Our model suggests that most disks in our sample are likely prone to fragmentation beyond 20-50 au, with the caveat that the applicability of our model to fragmenting disks cannot be guaranteed.
One limitation of our model is that its estimates are subject to large uncertainties, mainly due to the ordermagnitude systematic uncertainties in assumed model parameters (Section 4.3).While the qualitative trends reported in this paper, including the O(1) disk-to-star mass ratio, are robust against these uncertainties, the estimates for key disk properties are subject to uncertainties of a factor of a few.In the future, these uncertainties can be reduced if observations provide additional constraints on protostar mass, accretion rate, or grain size distribution.
Scripts for using our model to fit multi-wavelength dust-continuum images can be downloaded at https://github.com/wxu26/GIdisk2obs.
The repository also contains the estimated properties (including radial profiles) of individual systems in our sample.
We thank Matthew Kunz for insightful discussions and detailed comments on a draft version of this paper, Patrick Sheehan and Steven Stahler for discussions on comparing different methods of estimating disk properties, John Tobin for discussions on estimating the amplitude of free-free emission, and the anonymous referees for pointing out the importance of scattering opacity and improving the presentation of the paper.The simulation presented in Fig. 16 was performed on computational resources managed and supported by Princeton Research Computing, a consortium of groups including the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology's High Performance Computing Center and Visualization Laboratory at Princeton University.
Software: Astropy (Astropy Collaboration et al. 2013, 2018), DSHARP-opac (Birnstiel et al. 2018), Matplotlib (Hunter 2007)  Note-Here we assume σ(log amax) = log(10), σ(log Ṁ /M⋆) = log(10), and σ(log Q) = log(2)/2.We also assume that the errors in assumed model parameters are uncorrelated.These statistics are evaluated on the sub-sample with χ 2 mean ≤ 2 for our fiducial model.The uncertainties in disk property estimates are between a factor of exp(0.26)≈ 1.3 and exp(1.99)≈ 3. σ(log p i ), the uncertainty in log y is simply Here y j is the estimated value of y for the jth system and ⟨•⟩ j denotes an average in j.In order to estimate σ(log y) we first directly evaluate ∂ log y j /∂ log p i for each system by comparing y j fitted from different choices of p i , and then assign physically reasonable σ(log p i ) to obtain σ(log y).For our model, the assumed parameters are the max dust grain size a max , the ratio between accretion rate and protostar mass Ṁ /M ⋆ , and the Toomre Q parameter.The first two are highly uncertain, so we assume that their 1σ uncertainties are both a factor of 10, i.e. σ(log a max ) = σ(log Ṁ /M ⋆ ) = log(10).For Q, since a gravitationally self-regulated disk generally have Q = 1 − 2, we let this range correspond to ±1σ and choose σ(log Q) = log(2)/2.In Table 3 and 4 we summarize the evaluated ∂ log y/∂ log p i and the estimated uncertainties of log disk properties σ(log y).The resulting 1σ uncertainties in disk properties range from a factor of 2-7.Therefore, disk properties estimated with our current model should generally be considered as a rough estimate.In the future, these uncertainties can be significantly reduced if data (or theory) can constrain the grain size and the accretion rate (or protostellar mass) better.
Since different disk properties are affected by the same set of parameters, their systematic errors are generally correlated.To capture this correlation, we evaluate the covariance between the systematic error of two different disk properties (y, y ′ ) using cov(log y, log y ′ ) = Here we have assumed that M (<R) ∼ M ⋆ , which is a reasonable order-of-magnitude approximation since we never have M d /M ⋆ ≫ 1 in our sample.The scalings of Σ, T with respect to R are consistent with those observed in Fig. 12. (Here κ R/P should also vary in radius, but since Σ and T depend weakly on κ R/P , we can ignore this dependence.)Now we consider the scaling of disk mass.The disk mass is simply given by d = 2πΣR 2 d log R. (C7) From Eq. C5 we see that the ΣR 2 scales with M ⋆ at a slope between 5/7 = 0.71 and 5/9 = 0.56, which is consistent with the empirical scaling of M d ∝ M 0.6 ⋆ .ΣR 2 is much less sensitive to other parameters (including R), and that explains why M d shows such tight correlation with M ⋆ .Besides, the weak dependence of ΣR 2 on R means that the relatively steep correlation between M d and R d in Fig. 10 for larger disks is not just a generic feature of gravitationally self-regulated disks.

Figure 2 .
Figure 2. Examples of well-resolved systems with different χ 2 mean .Each row shows the observations at both wavelengths and compares the best-fit model with observation by taking cuts along the disk major axis (positive offset is in the direction of increasing RA).Shaded areas show 1σ observation and model uncertainties.The first two panels show examples where the model agrees well with the observations (χ 2mean ≲ 1); 57% of our sample are in this regime.The last panel shows an example where the model cannot fit the observation very well.For this particular system this is due to the presence of substructure (which at current resolution is uncommon in the sample).In general, the lack of a good fit could be due to a variety of reasons, such as disk being inconsistent with our model, envelope contamination, and binarity.

Figure 3 .Figure 4 .
Figure 3. Cumulative distribution of model error χ 2mean for our fiducial model.57% of our sample can be fit well with our model (χ 2 mean ≤ 1).The result is similar if we limit the sample to only well-resolved disks (T20 disk size estimate >50 au).

Figure 5 .
Figure 5. Top panel: Comparison between flux densities from our model and observation.The observed flux densities are quoted from T20. Bottom panel: the distribution of spectral indices.For most of the systems, our model is able to fit the flux at both wavelengths well.

Figure 6 .Figure 7 .
Figure6.Comparison between apparent disk sizes from our model and observation.Here the apparent disk size R2σ is 2σ of the deconvolved gaussian fit to the observed image or mock observation.Our model broadly agrees with observation at both wavelengths, and reproduces the trend that R2σ is systematically smaller at longer wavelength.

Figure 8 .
Figure 8. Cumulative distribution of model error χ 2 mean when we vary the assumed Toomre Q. Observation strongly prefers Q ≲ 10 over Q ∼ 100.

Figure 9 .
Figure 9. Distribution of disk spectral indices for observed systems (first panel), our fiducial model (second panel), and models assuming larger Q values (last three panels).At large Q, the model is no longer able to reproduce the observed spectral index distribution, even though observations at both wavelengths are used to fit the model.

Figure 11 .
Figure11.Same as Fig.10, for the distribution of stellar mass M⋆ and disk mass M d .Most systems have orderunity disk-to-star mass ratio.There appears to be a strong correlation between M d and M⋆, with slope ≈ 0.6; this is a generic property of gravitationally self-regulated disks (Section 6.1; cf.Appendix C).
Figure12.Radial profiles of disks in our sample, estimated with our fiducial model.From top to bottom, we show surface density Σ, midplane temperature T mid , miplane Rosseland mean optical depth τ R,mid , and optical depth at the two observed wavelengths.We mark the disk edge with a blue dot in each curve.For the last two panels we also show the optical depth at disk edge for absorption opacity only.We only include systems with χ 2 mean ≤ 2; for χ 2 mean > 1, the opacity of the marker decreases as χ 2 mean increases.

Figure 13 .
Figure 13.Cumulative distribution of disk size (top panel) and mass (bottom panel) for different evolutionary stages.Both disk size and mass decrease towards later evolutionary stages.These trends are discussed in Section 6.2.
Figure15.Comparison between disk mass estimates from our fiducial model (fitted with images at both wavelengths) and models based on images at a single wavelength.These include the optically thin, isothermal models from T20 for each observed wavelength (blue and green dots), the parametrized disk+envelope model coupled with radiative transfer from Sheehan et al. (2022) (orange triangles), and our model fitted with 0.87 mm images only (grey circles).The difference can be attributed to the fact that single-wavelength observation leaves a degeneracy between mass (or optical depth) and temperature, which need to be resolved by the (highly different) physical assumptions of each model (Section 7).
Fig 7; also see grey circles in Fig 15).
Visibility of GI-induced disk substructures

Figure 16 .
Figure 16.Column density (top panel) and brightness temperature at 0.87 mm (bottom panel; this is proportional to the flux density at this wavelength) from the snapshot of a simulation of protostellar disk formation.While the disk contains prominent large-scale spirals, they are barely visible at sub-mm wavelength due to high optical depth.

Figure 17 .
Figure 17.Top: contribution from free-free emission to 9-mm flux estimated with empirical scaling relations in Tychoniec et al. (2018).Bottom: the distribution of misalignment between orientations of 0.87-mm and 9-mm emission from PA estimates in T20, showing a peak for aligned emission and a large base of unresolved systems.Both results suggest that free-free emission in the outflow is unlikely to dominate the observed 9-mm flux.
used for plotting the 2D uncertainties in Figs. 10 and 11.C.SCALINGS OF RADIAL DISK PROFILE AND DISK MASSIn this appendix we discuss the origin of the scalings observed in Figs.11 and 12.We begin by considering the scalings of Σ and T mid .Approximately, our physical constraints scaling corresponds to gravitational selfregulation (Q ∼constant) and the second corresponds to the thermal budget.κ R,P are the average opacities at the given radius and T is the average temperature (which should be comparable to the midplane temperature) Top panel: Planck and Rosseland mean opacities κP, κR for dust models with different maximum grain size amax.(For amax < 10 µm, the results are nearly identical to amax = 10 µm.)Bottom panel: absorption opacity κ ν,abs

Table 1 .
Summary of fiducial model parameters

Table 2 .
Distributions of key disk properties

Table 4 .
Estimated 1σ uncertainties of log disk property due to each parameter