Demographics of Planetesimals Formed by the Streaming Instability

Rixin Li; Andrew N. Youdin; Jacob B. Simon

doi:10.3847/1538-4357/ab480d

1. Introduction

An indispensable step in planet formation is to build planetesimals—objects large than a kilometer and bound by self-gravity—in protoplanetary disks (Chiang & Youdin 2010; Johansen et al. 2014). One of the compelling pathways to planetesimal formation is the efficient concentration of solids by the streaming instability (SI) followed by their gravitational collapse (Youdin & Goodman 2005; Johansen et al. 2007).

The SI is an aerodynamic instability arising from the relative drift and the mutual drag forces between gas and solids (Youdin & Goodman 2005). The SI is one example of a broader class of drag instabilities in protoplanetary disks (Goodman & Pindor 2000; Lin & Youdin 2017; Squire & Hopkins 2018).

Strong particle clumping can be induced by the SI under the right conditions, that is, when the midplane dust-to-gas volume density ratio exceeds unity, which further depends on the particle size, dust-to-gas surface density ratio (often referred to as metallicity), and gas pressure gradient (Johansen & Youdin 2007; Bai & Stone 2010a, 2010b; Carrera et al. 2015; Yang et al. 2017). In a typical smooth disk with centimeter-sized pebbles, slightly supersolar metallicity is required to trigger the strong SI (Johansen et al. 2009). Higher metallicity may be reached by gas removal (e.g., via photoevaporation) or dust pile-up (e.g., at pressure bumps, snow lines), where smaller solids can also lead to strong particle clumping (Drążkowska et al. 2016; Carrera et al. 2017; Drążkowska & Alibert 2017; Schoonenberg & Ormel 2017). Previous studies have also shown that high-resolution SI simulations with self-gravity can produce a broad initial top-heavy mass distribution of planetesimals (Johansen et al. 2015; Simon et al. 2016, 2017).

Recently, Yang et al. (2018) suggested that a midplane dust-to-gas density ratio exceeding the order of unity may not be necessary for the strong SI when the radial diffusion of particles driven by turbulence near the midplane is weak. Lin (2019) finds that particle back-reaction onto the gas can lead to self-sustained dust sedimentation against the vertical shear instability turbulence at a modest supersolar metallicity, which may trigger the SI. Furthermore, Krapp et al. (2019) investigate the linear growth of the SI with multiple dust species for the first time, showing how a properly resolved particle size distribution affects the linear phase of the SI. Their study motivates more detailed simulations of the nonlinear phase of the multi-species SI as well, which would extend previous studies (Bai & Stone 2010a; Schaffer et al. 2018).

Observations of asteroids and trans-Neptunian objects support models in which planetesimals were born big (Morbidelli et al. 2009), with evidence of a drop-off in planetesimal numbers below ∼1–50 km, depending on the population (Delbo' et al. 2017; Singer et al. 2019). Nesvorný et al. (2019) find that SI simulations correctly predict the primarily prograde mutual inclinations of the abundant binaries in the cold classical Kuiper Belt (Grundy et al. 2019). Further studies on the demographics of planetesimals formed via the SI, and by other mechanisms, offer the promise of more detailed observational comparisons and tests.

Quantifying the mass distribution of planetesimals formed by the SI is of significant interest. Due to the high computational cost of SI simulations, a parameterized mass function can be used as the input for global studies of disk evolution and planet formation (Drążkowska & Dullemond 2014). Furthermore, the shape of the mass distribution offers insights to the physical processes of particle clumping and gravitational collapse.

Previous work has fit the mass distribution to a simple power law (Simon et al. 2016, 2017) or to a power law with an exponential cutoff or truncation (Schäfer et al. 2017; Abod et al. 2019). These works suggested that the initial planetesimal mass function might be near-universal. However, it is not trivial to determine the best parameterization of the initial planetesimal mass function in SI simulations. Moreover, it is not clear whether a single functional form can describe planetesimal formation with different physical conditions, i.e., simulation parameters.

Motivated by these issues, our goal in this paper is to better understand and constrain the broad initial mass distribution of planetesimals with robust statistical analyses. This work will fit many different parameterizations to simulated planetesimal mass distributions. To determine which models best describe the data, model selection techniques weigh the goodness of fit against a complexity penalty, intended to avoid the overfitting of data features that might be spurious. Since there is no universal agreement on complexity penalties either, we apply different model selection techniques, including a bootstrap method that we developed independently.

The paper is organized as follows. In Section 2, we begin with an overview of the numerical models and our simulations. Section 2.3 then introduce our newly developed clump-finding tool, Planetesimal Analyzer (PLAN). Section 3 lays out all the statistical models and our fitting procedure as well as the model selection criteria. In Section 4, we show the fitting results and the model selection results. Section 5 discusses the implications of our statistical understanding, and gives a summary and conclusions at the end.

2. Method

To simulate the formation of planetesimals, we use the ATHENA code with a similar setup to Simon et al. (2017). In Section 2.1, we briefly introduce the numerical methods employed in ATHENA for modeling the coupled dynamics of gas and particles, including the self-gravity of solids, in a protoplanetary disk (see Bai & Stone 2010c; Simon et al. 2016, for more details). Section 2.2 then summarizes the numerical setup and parameters used in our simulations. Section 2.3 explains how PLAN identifies and characterizes all the self-bound clumps in the output particle data.

2.1. Simulations of Planetesimal Formation

We use ATHENA to simulate a small three-dimensional vertically stratified patch of the protoplanetary disk with the local shearing box approximation (Hawley et al. 1995; Stone et al. 2008; Stone & Gardiner 2010). This approximation—which is justified by the small length scales of the SI compared to the radial position in the disk—maps the global disk geometry (R, ϕ, z') onto a local Cartesian coordinate system (x, y, z) (Goldreich & Lynden-Bell 1965). The local box is centered at a fiducial disk radius (R₀) in the midplane, where (x, y, z) ≡ (R − R₀, R₀ϕ, z'), where the Keplerian frequency and velocity are Ω₀ and v_K = Ω₀R₀, respectively.

In this non-inertial computational domain, ATHENA solves the equations of gas dynamics and the equation of motion for each particle (indexed by i):

$\begin{eqnarray}&&\displaystyle \frac{\partial {\rho }_{{\rm{g}}}}{\partial t}+{\rm{\nabla }}\cdot ({\rho }_{{\rm{g}}}{\boldsymbol{u}})=0,\end{eqnarray} \tag{ 1 }$

$\begin{eqnarray}&&\begin{array}{l}\displaystyle \frac{\partial ({\rho }_{{\rm{g}}}{\boldsymbol{u}})}{\partial t}+{\rm{\nabla }}\cdot ({\rho }_{{\rm{g}}}{\boldsymbol{u}}{\boldsymbol{u}}+P{\boldsymbol{I}})\\ \ =\ {\rho }_{{\rm{g}}}\left[2{\boldsymbol{u}}\times {{\boldsymbol{\Omega }}}_{0}+3{{\rm{\Omega }}}_{0}^{2}{\boldsymbol{x}}-{{\rm{\Omega }}}_{0}^{2}{\boldsymbol{z}}\right]+{\rho }_{{\rm{p}}}\displaystyle \frac{\bar{{\boldsymbol{v}}}-{\boldsymbol{u}}}{{t}_{\mathrm{stop}}},\end{array}\end{eqnarray} \tag{ 2 }$

$\begin{eqnarray}\begin{array}{rcl}\displaystyle \frac{d{{\boldsymbol{v}}}_{i}}{{dt}} & = & 2{{\boldsymbol{v}}}_{i}\times {{\boldsymbol{\Omega }}}_{0}+3{{\rm{\Omega }}}_{0}^{2}{{\boldsymbol{x}}}_{i}-{{\rm{\Omega }}}_{0}^{2}{{\boldsymbol{z}}}_{i}\\ & & -\ \displaystyle \frac{{{\boldsymbol{v}}}_{i}-{\boldsymbol{u}}}{{t}_{\mathrm{stop},i}}-{\rm{\nabla }}{{\rm{\Phi }}}_{\mathrm{sg}}-2\eta {v}_{{\rm{K}}}{{\rm{\Omega }}}_{0}\hat{x},\end{array}\end{eqnarray} \tag{ 3 }$

where ρ_g, ${\boldsymbol{u}}$ , and P are density, velocity, and pressure of gas, ${\boldsymbol{I}}$ is the identity matrix, ${{\boldsymbol{\Omega }}}_{0}={{\rm{\Omega }}}_{0}\hat{z}$ , ρ_p and $\bar{{\boldsymbol{v}}}$ are the average density and velocity of the particles in a hydrodynamic grid cell, t_stop is the dimensional stopping time, ${{\boldsymbol{v}}}_{i}$ is the velocity of the ith particle, Φ_sg is the potential field of the self-gravity of solids, and η denotes the relative difference between the gas orbital velocity and the Keplerian velocity due to the radial pressure gradient in the disk.

Our model calculates the Coriolis forces, radial and vertical tidal gravity, and the particle feedback exerted on the gas, as on the right side of Equation (2). The equation of state for the gas is assumed to be isothermal, $P={c}_{{\rm{s}}}^{2}{\rho }_{{\rm{g}}}$ , where the constant c_s is the isothermal sound speed. We neglect the self-gravity of the gas because the gas density fluctuations are relatively negligible.

For solids, ATHENA adopts the super-particle treatment, where each particle in our simulations statistically represents a large number of pebbles in terms of mass. The acceleration of each particle is governed by Equation (3) with the Coriolis and tidal forces (similar to those in Equation (2)), and also the gas drag as well as the force due to particle self-gravity. The gravitational potential field, Φ_sg, is obtained by solving Poisson's equation

$\begin{eqnarray}&&{{\rm{\nabla }}}^{2}{{\rm{\Phi }}}_{\mathrm{sg}}=4\pi G{\rho }_{{\rm{p}}}\end{eqnarray} \tag{ 4 }$

with the fast Fourier transform method of Simon et al. (2016), where G is the gravitational constant. The accuracy of particle self-gravity from such a method depends on the grid resolution. The last source term in Equation (3), $-2\eta {v}_{{\rm{K}}}{{\rm{\Omega }}}_{0}\hat{x}$ , is a constant radial force in ATHENA to implement the effective global radial pressure gradient under the restrictions of the local model.

The radial and azimuthal boundary conditions (BCs) for our model are the standard shearing-periodic BCs (Stone & Gardiner 2010). In the vertical direction, we use a modified outflow BC that extrapolates the gas density into the ghost zones exponentially and prohibits any gas inflow (Simon et al. 2011; Li et al. 2018). These vertical BCs maintain hydrostatic equilibrium and reduce artificial gas motions at vertical boundaries, which is beneficial for simulations in short boxes. Furthermore, the total gas mass is renormalized to compensate for the gas outflow at each time step so as to ensure mass conservation.

The physical behavior of our simulations is dominated by four key dimensionless parameters. The SI is characterized by the first three of them: the dimensionless particle stopping time

$\begin{eqnarray}&&{\tau }_{{\rm{s}}}={{\rm{\Omega }}}_{0}{t}_{\mathrm{stop}},\end{eqnarray} \tag{ 5 }$

which represents the ratio of a particle's aerodynamic (t_stop) and orbital (Ω₀⁻¹) timescales, increases with a particle's size and decreases with the local gas density; the surface density ratio between the solids (Σ_p) and the gas (Σ_g)

$\begin{eqnarray}&&Z=\displaystyle \frac{{{\rm{\Sigma }}}_{{\rm{p}}}}{{{\rm{\Sigma }}}_{{\rm{g}}}},\end{eqnarray} \tag{ 6 }$

which is sometimes called the total solid-to-gas mass ratio; and the global radial pressure gradient parameter

$\begin{eqnarray}&&{\rm{\Pi }}\equiv \displaystyle \frac{\eta {v}_{{\rm{K}}}}{{c}_{{\rm{s}}}}\equiv -\displaystyle \frac{1}{2}\displaystyle \frac{{c}_{{\rm{s}}}}{{v}_{{\rm{K}}}}\displaystyle \frac{\partial \mathrm{ln}P}{\partial \mathrm{ln}R},\end{eqnarray} \tag{ 7 }$

which accounts for the strength of the headwind on the particles. The fourth key parameter controls the relative strength of the particle self-gravity compared with the tidal shear:

$\begin{eqnarray}&&\tilde{G}\equiv \displaystyle \frac{4\pi G{\rho }_{0}}{{{\rm{\Omega }}}_{0}^{2}}=\displaystyle \frac{4}{\sqrt{2\pi }Q},\end{eqnarray} \tag{ 8 }$

where ρ₀ is the midplane gas density and Q is Toomre's Q (Toomre 1964).

The code units of our simulations are set to the natural units of the shearing box. The density unit and the time unit are ρ₀ and ${{\rm{\Omega }}}_{0}^{-1}$ , respectively. While ρ₀ and Ω₀ are code units, their allowed physical values and thus the choice of disk model are constrained by the parameter $\tilde{G}$ . The length unit is H = c_s/Ω₀, the vertical scale height of the gas.

2.2. Numerical Setup

All of our simulations are initiated with Gaussian vertical density profiles for both the gas and particles:

$\begin{eqnarray}\begin{array}{rcl}{\rho }_{{\rm{g}}} & = & {\rho }_{0}\exp \left(\displaystyle \frac{-{z}^{2}}{2{H}^{2}}\right),\\ {\rho }_{{\rm{p}}} & = & \displaystyle \frac{{{\rm{\Sigma }}}_{{\rm{p}}}}{\sqrt{2\pi }{H}_{{\rm{p}}}}\exp \left(\displaystyle \frac{-{z}^{2}}{2{H}_{{\rm{p}}}^{2}}\right),\end{array}\end{eqnarray} \tag{ 9 }$

where H_p is the particle scale height and is set to 0.02H in the beginning. The choice of such an initial particle scale height matches our previous work in Simon et al. (2017), which let particles naturally sediment to a pseudo-equilibrium scale height of the order of 0.01H (Yang & Johansen 2014; Li et al. 2018). Both gas and particles are then initialized with the Nakagawa–Sekiya–Hayashi equilibrium drift velocities (Nakagawa et al. 1986).

In this work, we fix Π = 0.05 and $\tilde{G}=0.05$ (Q ≃ 32), which are typical values in protoplanetary disks. Table 1 lists other physical and numerical parameters for both of our simulations. Run I has (τ_s, Z) = (2.0, 0.1) and a higher resolution in a smaller domain (one grid cell Δx is H/5120 wide, the highest resolution to date). Run II has (τ_s, Z) = (0.3, 0.02) and a lower resolution in a larger domain. The data of Run II are directly taken from Simon et al. (2017), and our Run I is a higher-resolution version of another simulation in Simon et al. (2017). The relation between particle size and τ_s depends on uncertain properties of the particles and gas disk. The range of τ_s adopted here may correspond to solids with any size from millimeter to decimeter. These physical parameters are known to produce strong particle clumping that triggers gravitational collapse. The resulting bound clumps are the subject of our statistical analyses below.

Table 1. Simulation Parameters

Run	Domain Size	Number of Cells	N_par^a	τ_s	Z	t₀^b	N_tot^c
	(L_X × L_Y × L_Z)H³	N_X × N_Y × N_Z				(Ω₀⁻¹)
I	0.1 × 0.1 × 0.2	512 × 512 × 1024	134,217,728	2.0	0.1	36.0	284
II	0.2 × 0.2 × 0.2	512 × 512 × 512	153,600,000	0.3	0.02	110.0	174

Notes. For all runs: the radial pressure gradient term is Π = 0.05 and the particle self-gravity strength is $\tilde{G}=0.05$ .

^aThe number of particles. For reference, 2²⁷ = 512³ = 134,217,728. ^bThe time when the particle self-gravity is switched on. ^cThe number of clumps identified by PLAN at the snapshot where we perform analyses and fitting in Section 4.

Download table as: ASCII Typeset image

Following previous studies, we start our simulations first without particle self-gravity. Only after the SI has fully developed and saturated do we switch on the self-gravity. This approach significantly reduces the computational expense and has little influence on the final properties of planetesimals (Simon et al. 2016; Abod et al. 2019). For convenience, we define a self-gravity time

$\begin{eqnarray}&&{t}_{\mathrm{sg}}=t-{t}_{0},\end{eqnarray} \tag{ 10 }$

where t is the simulation time and t₀ denotes the time when the self-gravity is turned on (see the next to last column in Table 1). Also, we present all planetesimal masses in units of the dimensional mass for a self-gravitating particle disk:

$\begin{eqnarray}&&{M}_{{\rm{G}}}=\pi {\left(\displaystyle \frac{{\lambda }_{{\rm{G}}}}{2}\right)}^{2}{{\rm{\Sigma }}}_{{\rm{p}}}=4{\pi }^{5}\displaystyle \frac{{G}^{2}{{\rm{\Sigma }}}_{{\rm{p}}}^{3}}{{{\rm{\Omega }}}_{0}^{4}}=\displaystyle \frac{\sqrt{2}}{2}{\pi }^{9/2}{Z}^{3}{\tilde{G}}^{2}({\rho }_{0}{H}^{3}),\end{eqnarray} \tag{ 11 }$

where λ_G is the critical unstable wavelength from the standard Toomre dispersion relation.

To translate M_G into a physical mass unit and planetesimal size, additional assumptions about the disk model are required. For instance, if we assume a fiducial disk radius of R₀ = 10 au in the modified minimum mass solar nebula (MMSN) model of Chiang & Youdin (2010, adapted from Hayashi 1981, hereafter the CY10 model), where Σ_g ∝ R^−3/2 and the temperature T ∝ R^−3/7, then our parameter $\tilde{G}=0.05$ implies that the gas mass in the CY10 model is about half the original MMSN values. For these parameters, the CY10 model has Π = 0.068, slightly higher than Π = 0.05 in our simulations. Nevertheless, a smaller Π value might arise from a weak pressure bump, a common substructure in protoplanetary disks (Pinilla & Youdin 2017; Dullemond et al. 2018). Moreover, Π values do not significantly affect the planetesimal mass distribution, as studied in Abod et al. (2019). The mass unit for Run I then equates to M_G = 1.82 × 10²³ g = 0.19 M_Ceres. With 1 g cm⁻³ as the mean density, the physical radius of a mass of 1 M_G is ≃350 km in this model. For Run II, the same gas disk model and location give M_G = 1.45 × 10²¹ g = 0.0015 M_Ceres, and a physical radius of ≃70 km.

Figure 1 shows a snapshot of the particle surface density from Run I at t_sg = 4/Ω₀, where solids already collapse into self-bound clumps. In the following section, we describe how PLAN finds these clumps in detail.

2.3. Clump-finding with Planetesimal Analyzer

To identify and further characterize the properties of planetesimals produced in our simulations, we develop a new clump-finding tool, PLAN (Li 2019a). It is designed to work with the 3D particle output of ATHENA and find self-bound clumps robustly and efficiently. PLAN is scalable to analyze billions of particles and many snapshots simultaneously because of its massively parallelized scheme written in C++ with OpenMP/MPI.

We now briefly present the workflow of PLAN. The approach is based on the dark matter halo finder HOP developed by Eisenstein & Hut (1998), which is able to quickly group physically related particles. PLAN first builds a memory-efficient linear Barnes–Hut tree representing all the particles in the Morton order (Barnes & Hut 1986). Each particle is then assigned a density computed from the nearest N_den particles (N_den = 64 by default). For particles with densities higher than a threshold, ${\delta }_{\mathrm{outer}}=8{\rho }_{0}/\tilde{G}$ , PLAN chains them up toward their densest neighbors recursively until a density peak is reached. The value of δ_outer is physically motivated to be slightly smaller than the Roche density ( $9{\rho }_{0}/\tilde{G}$ ) such that PLAN can quickly find most relevant particles. All the particle chains leading to the same density peak are combined into a group.

PLAN then merges those groups by examining their boundaries to construct a list of bound clumps. Based on the total kinematic and gravitational energies, deeply intersected groups are merged if bound. However, two particle groups with a saddle point less dense than δ_saddle = 2.5δ_outer remain separated (Eisenstein & Hut 1998). Next, PLAN goes through each group—or raw clump—to unbind any contamination (i.e., passing-by and not bound) particles and gather possibly unidentified member particles within its Hill sphere. After discarding those clumps (if any) with Hill radii (R_Hill) smaller than one hydrodynamic grid cell (Δx) or density peaks less than δ_peak = 3δ_outer, PLAN outputs the final list of clumps with their physical properties derived from particles.

Most clumps in our high-resolution simulations are highly concentrated, where particles often collapse into regions much smaller than the Hill radius (see Figure 1). For small clumps, these regions are comparable to one cell size. The particle-mesh method of calculating self-gravity does not resolve scales below Δx, which is a primary motivation for our high-resolution simulation and the reason why PLAN compares R_Hill to Δx. While this work was in progress, PLAN was already used in the analyses of Abod et al. (2019) and Nesvorný et al. (2019).

Our former clump-finding tool analyzed the surface density of solids in the (x, y) cells of the hydrodynamic grid. This technique identifies prominent clumps and calculates the clump masses as the column mass encapsulated within their projected Hill radii. Such a treatment is limited by the grid resolution and has difficulty detecting small planetesimals, especially when a massive clump is nearby. Consequently, this method tends to overestimate the clump mass by including the mass of surrounding small planetesimals as well as other solids that are vertically far away. PLAN overcomes those issues by diagnosing the particle data instead, as described above.

Figure 2 shows that the PLAN results agree with our previous results at large masses, though the previous analyses slightly overestimated masses, as explained above. The significant advantage of the PLAN analysis is that we identify gravitationally bound clumps at much lower masses than before, which improves our ability to statistically characterize the resulting mass distributions.

3. Statistical Modeling of the Mass Distribution

This section details our statistical methodology for analyzing the planetesimal mass distribution. We present a maximum likelihood estimator (MLE) for estimating the parameters of a given model and the uncertainty on those parameters in Section 3.1. Section 3.2 lists the models that we fit, which vary in complexity from two to five parameters. Finally, Section 3.3 describes our model selection criteria and how they apply a penalty to more complex models.

3.1. Maximum Likelihood Parameter Estimation

We assume that the masses, M, of planetesimals (strictly, protoplanetesimal clumps in the simulations) are drawn from a probability density function (PDF), ξ(M), parameterized by a vector ${\boldsymbol{\theta }}$ , where $\xi (M;{\boldsymbol{\theta }}){dM}$ represents the probability that a given clump forms in the mass interval M to M + dM.

In practice, it is easier to work with a logarithmic mass coordinate, x = ln(M/M_min), referenced to the minimum mass of the distribution, M_min. With this transformation, the functional form of the PDF is different and written as

$\begin{eqnarray}&&p(x;{\boldsymbol{\theta }})=\displaystyle \frac{1}{{N}_{\mathrm{tot}}}\displaystyle \frac{{dN}(x;{\boldsymbol{\theta }})}{{dx}}=C({\boldsymbol{\theta }})g(x;{\boldsymbol{\theta }}),\end{eqnarray} \tag{ 12 }$

where the first equality simply relates the PDF to the total number of bodies N_tot and the number in a given logarithmic mass interval, dN. In the second equality, the normalization factor $C({\boldsymbol{\theta }})$ is introduced for later convenience (see also Youdin 2011).

Accordingly, the cumulative distribution function (CDF) is

$\begin{eqnarray}&&{P}_{\gt }(x;{\boldsymbol{\theta }})={\int }_{x}^{+\infty }p(x^{\prime} ;{\boldsymbol{\theta }}){dx}^{\prime} =C({\boldsymbol{\theta }}){\int }_{x}^{+\infty }g(x^{\prime} ;{\boldsymbol{\theta }}){dx}^{\prime} ,\end{eqnarray} \tag{ 13 }$

which denotes the expected fraction of clumps with masses larger than M (= e^x M_min) in the distribution.⁵ The normalization of the CDF, ${P}_{\gt }(x;{\boldsymbol{\theta }}){| }_{x=0}=1$ , gives

$\begin{eqnarray}&&C({\boldsymbol{\theta }})=\displaystyle \frac{1}{{\int }_{0}^{+\infty }g(x;{\boldsymbol{\theta }}){dx}},\end{eqnarray} \tag{ 14 }$

which requires that $g(x;{\boldsymbol{\theta }})$ does not diverge as x → + $\infty$ .

To fit a model to the data, i.e., the simulated mass distribution of planetesimals, we consider the likelihood of the data given the model

$\begin{eqnarray}&&{ \mathcal L }({\boldsymbol{x}}| {\boldsymbol{\theta }})\equiv \displaystyle \prod _{i=1}^{{N}_{\mathrm{tot}}}p({x}_{i};{\boldsymbol{\theta }})=\displaystyle \prod _{i=1}^{{N}_{\mathrm{tot}}}C({\boldsymbol{\theta }})g({x}_{i};{\boldsymbol{\theta }}).\end{eqnarray} \tag{ 15 }$

It is usually easier to consider the log-likelihood

$\begin{eqnarray}&&\mathrm{ln}{ \mathcal L }({\boldsymbol{x}}| {\boldsymbol{\theta }})=\displaystyle \sum _{i=1}^{{N}_{\mathrm{tot}}}\mathrm{ln}p({x}_{i};{\boldsymbol{\theta }})={N}_{\mathrm{tot}}\mathrm{ln}C({\boldsymbol{\theta }})+\displaystyle \sum _{i=1}^{{N}_{\mathrm{tot}}}\mathrm{ln}g({x}_{i};{\boldsymbol{\theta }}).\end{eqnarray} \tag{ 16 }$

The MLE of ${\boldsymbol{\theta }}$ estimates the best-fit parameters, ${{\boldsymbol{\theta }}}_{\mathrm{MLE}}$ , by maximizing the log-likelihood (within certain physical bounds if necessary). For some simple log-likelihood functions, ${{\boldsymbol{\theta }}}_{\mathrm{MLE}}$ can be solved as the root(s) of

$\begin{eqnarray}&&\left\{\begin{array}{l}\displaystyle \frac{\partial \mathrm{ln}{ \mathcal L }({\boldsymbol{x}}| {\boldsymbol{\theta }})}{\partial {\theta }_{j}}=0,\\ \displaystyle \frac{{\partial }^{2}\mathrm{ln}{ \mathcal L }({\boldsymbol{x}}| {\boldsymbol{\theta }})}{\partial {\theta }_{j}^{2}}\lt 0,\end{array}\right.\end{eqnarray} \tag{ 17 }$

where θ_j means the jth parameter in ${\boldsymbol{\theta }}$ . However, constraints on the allowed values of parameters sometimes confound traditional root-finding methods.

In this work, we apply numerical techniques to maximize the likelihood of the trial PDF (see Section 3.2 for our choices). In practice, we first use the Python package emcee to explore the parameter space with a Markov chain Monte Carlo (MCMC) approach (Foreman-Mackey et al. 2013) to obtain an initial guess of parameters, ${{\boldsymbol{\theta }}}_{\mathrm{MCMC}}$ . We then use the minimize method⁶ in the scipy.optimize package (Jones et al. 2001) to find the most likely ${\boldsymbol{\theta }}$ that minimizes $-\mathrm{ln}{ \mathcal L }({\boldsymbol{x}}| {\boldsymbol{\theta }})$ .

To quantify the uncertainties of the best-fit parameters, ${{\boldsymbol{\theta }}}_{\mathrm{MLE}}$ , we adopt the nonparametric bootstrap method (Efron & Tibshirani 1994; Burnham & Anderson 2002). By repeatedly taking a random sample of size N_tot with replacement from the actual mass data, we first generate N_bs independent bootstrap samples. They serve as a proxy for a set of N_bs independent real samples from the same mass distribution, because taking extra data (from additional simulations) is too costly. The MLE is then employed to fit the model PDF to each bootstrap sample to obtain the best-fit parameters, ${{\boldsymbol{\theta }}}_{\mathrm{bs},k}$ (k = 1, ..., N_bs). Parameter uncertainties expected from real samples are estimated by calculating the distance between ${{\boldsymbol{\theta }}}_{\mathrm{MLE}}$ and the 84th and 16th percentiles of the distribution of ${{\boldsymbol{\theta }}}_{\mathrm{bs},k}$ , i.e., ${{\boldsymbol{\theta }}}_{\mathrm{bs},k}^{84 \% }$ and ${{\boldsymbol{\theta }}}_{\mathrm{bs},k}^{16 \% }$ , as

$\begin{eqnarray}\begin{array}{rcl}{\rm{\Delta }}{{\boldsymbol{\theta }}}_{\mathrm{bs}}^{+} & = & {{\boldsymbol{\theta }}}_{\mathrm{bs},k}^{84 \% }-{{\boldsymbol{\theta }}}_{\mathrm{MLE}},\\ {\rm{\Delta }}{{\boldsymbol{\theta }}}_{\mathrm{bs}}^{-} & = & {{\boldsymbol{\theta }}}_{\mathrm{MLE}}-{{\boldsymbol{\theta }}}_{\mathrm{bs},k}^{16 \% }.\end{array}\end{eqnarray} \tag{ 18 }$

Efron & Tibshirani (1994) have shown that this bootstrap method works reasonably well if N_bs is large (e.g., >1000). In this work, we fix N_bs = 10,000. Appendix A gives a model fitting example in detail.

Our MLE is similar to that used in Simon et al. (2016, 2017) and Abod et al. (2019) but we use bootstrapping to estimate parameter uncertainties. A different method of parameter estimation is used by Johansen et al. (2015), Schäfer et al. (2017), etc., who applied curve-fitting routines to the CDF. While maximizing the likelihood functions seems more fundamental to us, we make no claim that our method is actually superior. We conduct tests to recover the slope of a single power-law distribution from randomly generated data using these two methods. This test yields non-identical results, which confirms that the methods are not equivalent, but offers no evidence to clearly favor either method. A more rigorous investigation of this statistical issue could be warranted.

3.2. Statistical Models

We now describe the seven statistical models that are used to fit the mass distribution of planetesimals. To focus on the shapes of these models, this section only gives their basic functional forms. All the normalization coefficients and the full functional expressions are put in Appendix B. For convenience, we define K as the number of parameters in a model.

The first three models below are presented as CDFs. We simply convert them to their corresponding PDFs to apply our MLE. However, it is natural to expect, if the planetesimal masses arise probabilistically, that a continuous PDF would be a more physical description. The reason to consider the CDFs is that some appeared previously in the literature (the first two models) and one of our runs shows visual evidence for a kink in the CDF (the third model). Because this kink gives a discontinuity in the PDF, it is arguably unphysical, but in this work we only examine statistical robustness, because no comprehensive physical theory for the distribution of masses exists.

1.
Simply tapered power law. The concave downward profile of the CDFs of clump masses (see Figure 2) suggests a power-law distribution with exponential tapering (Abod et al. 2019):
$\begin{eqnarray}&&{P}_{\gt }(M;{\boldsymbol{\theta }})={c}_{1}{M}^{-\alpha }\ \exp \left(-\displaystyle \frac{M}{{M}_{\exp }}\right),\end{eqnarray} \tag{ 19 }$
where M_exp is the characteristic mass scale and c₁ is the renormalization coefficient (see Appendix B, same for coefficients below). This model has two free parameters (K = 2) and ${\boldsymbol{\theta }}=(\alpha ,{M}_{\exp })$ , with M_min ≤ M_exp ≤ M_max but no constraints on α.
2.
Variably tapered power law. In addition to the first model, this model frees the tapering power by adding one more free parameter β inside the exponential function (Schäfer et al. 2017),
$\begin{eqnarray}&&{P}_{\gt }(M;{\boldsymbol{\theta }})={c}_{2}{M}^{-\alpha }\ \exp \left[-{\left(\displaystyle \frac{M}{{M}_{\exp }}\right)}^{\beta }\right],\end{eqnarray} \tag{ 20 }$
where ${\boldsymbol{\theta }}=(\alpha ,\beta ,{M}_{\exp })$ and K = 3. This model requires that at least one of α and β is positive, and again M_min ≤ M_exp ≤ M_max.
3.
Broken cumulative power law. A broken cumulative power-law distribution connects two power-law segments with different slopes in the cumulative distribution. It also manifests different behaviors in different mass ranges:
$\begin{eqnarray}{P}_{\gt }(M;{\boldsymbol{\theta }})=\left\{\begin{array}{ll}{c}_{31}{M}^{-{\alpha }_{1}}\ & M\leqslant {M}_{\mathrm{br}}\\ {c}_{32}{M}^{-{\alpha }_{2}}\ & M\gt {M}_{\mathrm{br}},\end{array}\right.\end{eqnarray} \tag{ 21 }$
where M_br denotes the characteristic mass scale at which the slope breaks. This model has three free parameters (K = 3) and ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{M}_{\mathrm{br}})$ . There is no constraint on α₁, but α₂ needs to be positive, and M_min ≤ M_br ≤ M_max.
4.
Truncated power law. Schäfer et al. (2017) also tested a truncated power-law model,
$\begin{eqnarray}\xi (M;{\boldsymbol{\theta }})=\left\{\begin{array}{ll}{c}_{4}{M}^{-\alpha -1}\ & M\leqslant {M}_{\mathrm{tr}}\\ 0\ & M\gt {M}_{\mathrm{tr}},\end{array}\right.\end{eqnarray} \tag{ 22 }$
where an upper bound M_tr truncates the PDF (and CDF). In this model, ${\boldsymbol{\theta }}=(\alpha ,{M}_{\mathrm{tr}})$ , K = 2, α > 0, and M_tr ≥ M_max. Here the power-law exponent in PDF becomes −α − 1 because the exponent in the corresponding CDF is −α. Furthermore, it is easy to show that the PDF monotonically increases with increasing M_tr ≤ M_max, and hence $-\mathrm{ln}{ \mathcal L }({\boldsymbol{x}}| {\boldsymbol{\theta }})$ minimizes if and only if M_tr = M_max.
5.
Broken power law. Another compelling possibility is the broken power-law distribution.⁷ The corresponding PDF consists of two different power-law segments, leading to a smooth transition in the CDF near the breaking point:
$\begin{eqnarray}\xi (M;{\boldsymbol{\theta }})=\left\{\begin{array}{ll}{c}_{51}{M}^{-{\alpha }_{1}-1}\ & M\leqslant {M}_{\mathrm{br}}\\ {c}_{52}{M}^{-{\alpha }_{2}-1}\ & M\gt {M}_{\mathrm{br}}.\end{array}\right.\end{eqnarray} \tag{ 23 }$
This model has three free parameters (K = 3), ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{M}_{\mathrm{br}})$ , where α₁ has no limits, α₂ > 0, and M_min ≤ M_br ≤ M_max. When ${\alpha }_{2}\to +\infty$ , this model reverts to the truncated power-law model.
6.
Truncated broken power law. This model complicates the last model by adding a truncation to the PDF:
$\begin{eqnarray}\xi (M;{\boldsymbol{\theta }})=\left\{\begin{array}{ll}{c}_{61}{M}^{-{\alpha }_{1}-1}\ & M\leqslant {M}_{\mathrm{br}}\\ {c}_{62}{M}^{-{\alpha }_{2}-1}\ & \mathrm{otherwise}\\ 0\ & M\gt {M}_{\mathrm{tr}}.\end{array}\right.\end{eqnarray} \tag{ 24 }$
This model has four free parameters (K = 4) and ${\boldsymbol{\theta }}\,=({\alpha }_{1},{\alpha }_{2},{M}_{\mathrm{br}},{M}_{\mathrm{tr}})$ , where α₁ and α₂ have no limits, M_min ≤ M_br ≤ M_max ≤ M_tr. Similar to the truncated power-law model, the PDF monotonically decreases with M_tr, and $-\mathrm{ln}{ \mathcal L }({\boldsymbol{x}}| {\boldsymbol{\theta }})$ minimizes when M_tr = M_max.
7.
Three-segment power law. We take a step further to consider another broken power-law distribution but with three segments in the PDF,
$\begin{eqnarray}\xi (M;{\boldsymbol{\theta }})=\left\{\begin{array}{ll}{c}_{71}{M}^{-{\alpha }_{1}-1}\ & M\leqslant {M}_{\mathrm{br}1}\\ {c}_{72}{M}^{-{\alpha }_{2}-1}\ & \mathrm{otherwise}\\ {c}_{73}{M}^{-{\alpha }_{3}-1}\ & M\gt {M}_{\mathrm{br}2}.\end{array}\right.\end{eqnarray} \tag{ 25 }$
This model has five free parameters (K = 5) and ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{\alpha }_{3},{M}_{\mathrm{br}1},{M}_{\mathrm{br}2})$ . Both α₁ and α₂ have no boundaries, but α₃ > 0 and M_min ≤ M_br1 ≤ M_br2 ≤ M_max. When ${\alpha }_{3}\to +\infty$ , this model reverts to the truncated broken power-law model.

We choose these seven statistical models as candidates since they have been previously used to fit the planetesimal mass function or are commonly applied to fit top-heavy mass distributions. Other models are also certainly possible, but are beyond the scope of this paper. Note that all the models above are transformed to a PDF function of x (see Table 6) to be used in our MLE.

3.3. Model Selection Criteria

Out next goal is to select the statistical models that best represent simulation data. Models with more parameters (larger K) have the flexibility to provide closer fits to the data, i.e., higher likelihood values. Often, a well-chosen function with fewer parameters can provide a better fit than a different function with more parameters. The much larger concern is the opposite case, where a more complex model does not better represent reality but merely overfits statistical fluctuations in the data.

For the problem of planetesimal formation by the SI, this statistical concern is relevant. The high-mass tail of the planetesimal distribution is very significant, but with low numbers of high-mass clumps in any simulation, the risk of statistical fluctuations impacting model fitting is potentially high.

To address these issues, we first review two of the most commonly used model selection criteria and then introduce a selection criterion that we develop independently, motivated by the nonparametric bootstrap method.

3.3.1. Information Criteria

The most commonly used model selection criteria are (i) the Bayesian information criterion (BIC) (Kass & Raftery 1995)

$\begin{eqnarray}&&\mathrm{BIC}=K\mathrm{ln}({N}_{\mathrm{tot}})-2\mathrm{ln}{ \mathcal L },\end{eqnarray} \tag{ 26 }$

and (ii) the Akaike information criterion (AIC) (Akaike 1974)

$\begin{eqnarray}&&\mathrm{AIC}=2K-2\mathrm{ln}{ \mathcal L }.\end{eqnarray} \tag{ 27 }$

Both involve calculations of the log-likelihood $-2\mathrm{ln}{ \mathcal L }$ , which are affected by arbitrary constants and the sample size. Thus, the individual BIC/AIC values are not significant and the relative differences between models

$\begin{eqnarray}\begin{array}{rcl}{{\rm{\Delta }}}_{\mathrm{BIC}} & = & \mathrm{BIC}-{\mathrm{BIC}}_{\min },\\ {{\rm{\Delta }}}_{\mathrm{AIC}} & = & \mathrm{AIC}-{\mathrm{AIC}}_{\min }\end{array}\end{eqnarray} \tag{ 28 }$

are more important, where BIC_min/AIC_min is the minimum of the BIC/AIC values of all the model candidates. In this way, the preferred model naturally has Δ_BIC = 0 and other models have positive Δ_BIC (similar for AIC). To interpret Δ_BIC and Δ_AIC quantitatively in model selection, we follow the conventional categorical guidelines in Tables 2 and 3.

Table 2. Interpretation Guidelines for BIC

Δ_BIC	Evidence against the Preferred Model
0–2	Not worth more than a bare mention
2–6	Positive
6–10	Strong
>10	Very strong

Download table as: ASCII Typeset image

Table 3. Interpretation Guidelines for AIC

Δ_AIC	Level of Empirical Support for a Model
0–2	Substantial
2–4	Strong
4–7	Considerably less
>10	Essentially none

Download table as: ASCII Typeset image

Formally, the value of Δ_BIC represents the complexity-corrected likelihood ratio in a natural logarithmic scale, or the evidence provided by the data in favor of the preferred statistical model over another model (Kass & Raftery 1995). The value of Δ_AIC measures the Kullback–Leibler distance, or the information lost when a less preferred model is used to approximate the true distribution (Burnham & Anderson 2002). For further discussions on the differences between the BIC and the AIC, we refer the reader to Burnham & Anderson (2002, 2004).

These two criteria put different weights on the penalty on the number of parameters, K, which becomes quite significant for large N_tot and can lead to different results in model selection. It is difficult (for us) to determine which information criterion is more appropriate, or indeed if either is reliable. More complex and computationally costly methods exist to assess the complexity penalty based not simply on the number of free parameters and/or data points, but on the actual geometry of the model space (e.g., Fisher information approximation, Ly et al. 2017). However, these methods were beyond the scope of this work. Instead we describe an alternative model selection method below, which we compare to the conventional AIC/BIC methods.

3.3.2. Bootstrap Model Selection

Motivated by concerns about the applicability of standard model selection techniques (BIC and AIC, discussed above), we consider an alternative method where the complexity penalty is not given as a fixed, simple function of the number of parameters but instead is generated automatically by bootstrapping.

Inspired by the nonparametric bootstrap method for uncertainty estimation (described in Section 3.1), we again consider all the bootstrap samples as a good proxy for mass distributions from N_bs independent simulations, which in reality are too computationally expensive to be conducted. Through such a proxy, the median likelihood of all the bootstrap samples given the best-fit parameters can be used as a model selection criterion:

$\begin{eqnarray}&&\mathrm{BMS}=-2\times \mathrm{median}\,\left(\mathrm{ln}{ \mathcal L }({{\boldsymbol{x}}}_{k}| {{\boldsymbol{\theta }}}_{\mathrm{MLE}})\right).\end{eqnarray} \tag{ 29 }$

where BMS stands for bootstrap model selection, ${{\boldsymbol{x}}}_{k}$ is the kth bootstrap sample, and the factor of 2 is chosen for similarity to AIC/BIC. This empirical criterion represents the extent to which the best-fit parameters can explain/describe other samples drawn from the same mass distribution. Also, it naturally penalizes more complex models that tend to overfit data because they yield poorer fits to those bootstrap samples that deviate farther from the original data. In the following work, we therefore also consider

$\begin{eqnarray}&&{{\rm{\Delta }}}_{\mathrm{BMS}}=\mathrm{BMS}-{\mathrm{BMS}}_{\min },\end{eqnarray} \tag{ 30 }$

as one of our model selection metrics and follow similar conventional categorical guidelines. The comparison between BMS and the commonly used BIC/AIC is discussed in the following sections.

4. The Initial Planetesimal Mass Distribution

In this section, we analyze and compare our two high-resolution simulations (Runs I and II, see Table 1) that have different physical parameters (τ_s, Z). We fit seven statistical models (described in Section 3.2 and Appendix B) to the simulated mass distribution of planetesimals identified by PLAN (described in Section 2.3). In this section, we first present the fitting results and then the models preferred by model selection criteria. Section 4.3 describes an interesting turnover in the fitted mass distribution of Run I. Finally, Section 4.4 compares our results to recent studies.

4.1. Maximum Likelihood Fitting Results

Tables 4 and 5 list the best-fit likelihood and parameters for all the statistical models for Runs I and II, respectively. The mass distribution used in fitting is extracted at t_sg = 7.5/Ω₀ for Run I and at t_sg = 7.6/Ω₀⁸ for Run II. By this time hundreds of planetesimals have formed, and the time evolution of the mass distribution has slowed. As shown in the two tables, for each model, the best-fit parameters for the two simulations are statistically different and are outside the uncertainty ranges of each other, indicating that two different mass distributions are produced.

Table 4. Model Fitting Results for Run I (t_sg = 7.5/Ω)

Models	Best-fit Parameters	Mass Scales (M_G)	$-\mathrm{ln}{ \mathcal L }$	Δ_BMS	Δ_BIC	Δ_AIC
Simply tapered power law	α = ${0.208}_{-0.014}^{+0.011}$	${M}_{\exp }={0.1385}_{-0.0515}^{+0.0529}$	660.443	63.7	53.1	60.4
K = 2, ${\boldsymbol{\theta }}=(\alpha ,{x}_{\exp })$	${x}_{\exp }={8.905}_{-0.464}^{+0.323}$

Variably tapered power law	$\alpha ={0.036}_{-0.041}^{+0.041}$	${M}_{\exp }={0.0021}_{-0.0014}^{+0.0038}$	633.695	10.6	5.3	8.9
K = 3, ${\boldsymbol{\theta }}=(\alpha ,\beta ,{x}_{\exp })$	$\beta ={0.298}_{-0.040}^{+0.061}$
	${x}_{\exp }={4.734}_{-1.128}^{+1.022}$

Broken cumulative power law	${\alpha }_{1}={0.162}_{-0.019}^{+0.009}$	${M}_{\mathrm{br}}={0.0018}_{-0.0008}^{+0.0000}$	631.683	10.1	1.2	4.9
K = 3, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}})$	${\alpha }_{2}={0.589}_{-0.071}^{+0.042}$
	${x}_{{\rm{br}}}={4.550}_{-0.607}^{+0.003}$

Truncated power law	$\alpha ={0.140}_{-0.029}^{+0.012}$	${M}_{\mathrm{tr}}={0.9981}_{-0.0000}^{+0.0000}$	651.846	47.5	35.9	43.2
K = 2, ${\boldsymbol{\theta }}=(\alpha ,{x}_{\mathrm{tr}})$	${x}_{\mathrm{tr}}={10.880}_{-0.000}^{+0.000}$

Broken power law	${{\boldsymbol{\alpha }}}_{1}=-{\bf{0}}.{{\bf{079}}}_{-0.049}^{+0.043}$	${M}_{\mathrm{br}}={0.0052}_{-0.0013}^{+0.0020}$	631.946	7.2	1.8	5.4
K = 3, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}})$	${\alpha }_{2}={0.628}_{-0.055}^{+0.078}$
	${x}_{\mathrm{br}}={5.620}_{-0.285}^{+0.329}$

Truncated broken power law	${{\boldsymbol{\alpha }}}_{1}=-{\bf{0}}.{{\bf{102}}}_{-0.043}^{+0.058}$	${M}_{\mathrm{br}}={0.0030}_{-0.0004}^{+0.0020}$	628.241	0.0	0.0	0.0
K = 4, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}},{x}_{\mathrm{tr}})$	${\alpha }_{2}={0.468}_{-0.070}^{+0.082}$	${M}_{\mathrm{tr}}={0.9981}_{-0.0000}^{+0.0000}$
	${x}_{\mathrm{br}}={5.066}_{-0.144}^{+0.519}$
	${x}_{\mathrm{tr}}={10.880}_{-0.000}^{+0.000}$

Three-segment power law	${{\boldsymbol{\alpha }}}_{1}=-{\bf{0}}.{{\bf{102}}}_{-0.043}^{+0.057}$	${M}_{\mathrm{br}1}={0.0030}_{-0.0004}^{+0.0020}$	628.241	0.0	5.6	2.0
K = 5, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{\alpha }_{3},{x}_{\mathrm{br}1},{x}_{\mathrm{br}2})$	${\alpha }_{2}={0.468}_{-0.071}^{+0.081}$	${M}_{\mathrm{br}2}={0.9981}_{-0.4674}^{+0.0000}$
	${\alpha }_{3}=5.78{\rm{e}}{5}_{-3.9{\rm{e}}4}^{+6.07{\rm{e}}7}$ ^a
	${x}_{\mathrm{br}1}={5.066}_{-0.142}^{+0.511}$
	${x}_{\mathrm{br}2}={10.880}_{-0.632}^{+0.000}$

Note.

^aThe best-fit third slope, α₃, of the three-segment power-law model is extremely large because this model essentially reverts to the truncated broken power-law model (see also Section 3.2 and Figure 3).

Download table as: ASCII Typeset image

Table 5. Model Fitting Results for Run II (t_sg = 7.6/Ω)

Models	Best-fit Parameters	Mass Scales (M_G)	$-\mathrm{ln}{ \mathcal L }$	Δ_BMS	Δ_BIC	Δ_AIC
Simply tapered power law	$\alpha ={0.388}_{-0.039}^{+0.030}$	${M}_{\exp }={11.2531}_{-5.7787}^{+6.8113}$	311.520	16.1	10.2	13.3
K = 2, ${\boldsymbol{\theta }}=(\alpha ,{x}_{\exp })$	${x}_{\exp }={6.397}_{-0.721}^{+0.473}$

Variably tapered power law	$\alpha ={0.304}_{-0.060}^{+0.063}$	${M}_{\exp }={3.3255}_{-1.7965}^{+3.6627}$	309.274	11.6	10.8	10.8
K = 3, ${\boldsymbol{\theta }}=(\alpha ,\beta ,{x}_{\exp })$	$\beta ={0.527}_{-0.088}^{+0.233}$
	${x}_{\exp }={5.178}_{-0.777}^{+0.743}$

Broken cumulative power law	${\alpha }_{1}={0.348}_{-0.033}^{+0.035}$	${M}_{\mathrm{br}}={0.4719}_{-0.0298}^{+0.0169}$	303.864	1.4	0.0	0.0
K = 3, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}})$	${\alpha }_{2}={0.866}_{-0.080}^{+0.143}$
	${x}_{{\rm{br}}}={3.226}_{-0.065}^{+0.035}$

Truncated power law	$\alpha ={0.360}_{-0.051}^{+0.029}$	${M}_{\mathrm{tr}}={50.126}_{-0.000}^{+0.000}$	310.769	14.9	8.7	11.8
K = 2, ${\boldsymbol{\theta }}=(\alpha ,{x}_{\mathrm{tr}})$	${x}_{\mathrm{tr}}={7.891}_{-0.000}^{+0.000}$

Broken power law	${\alpha }_{1}={0.240}_{-0.068}^{+0.064}$	${M}_{\mathrm{br}}={1.5754}_{-0.2594}^{+0.5992}$	309.058	11.0	10.4	10.4
K = 3, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}})$	${\alpha }_{2}={0.895}_{-0.103}^{+0.265}$
	${x}_{\mathrm{br}}={4.431}_{-0.180}^{+0.322}$

Truncated broken power law	${\alpha }_{1}={0.249}_{-0.077}^{+0.065}$	${M}_{\mathrm{br}}={1.4241}_{-0.7103}^{+0.4254}$	307.718	8.4	12.9	9.7
K = 4, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}},{x}_{\mathrm{tr}})$	${\alpha }_{2}={0.729}_{-0.200}^{+0.152}$	${M}_{\mathrm{tr}}={50.1262}_{-0.0000}^{+0.0000}$
	${x}_{\mathrm{br}}={4.330}_{-0.691}^{+0.261}$
	${x}_{\mathrm{tr}}={7.891}_{-0.000}^{+0.000}$

Three-segment power law	${\alpha }_{1}={0.771}_{-0.159}^{+0.291}$	${M}_{\mathrm{br}1}={0.1166}_{-0.0314}^{+0.0362}$	303.711	0.0	10.0	3.7
K = 5, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{\alpha }_{3},{x}_{\mathrm{br}1},{x}_{\mathrm{br}2})$	${\alpha }_{2}=-{0.362}_{-0.303}^{+0.181}$	${M}_{\mathrm{br}2}={0.6216}_{-0.0474}^{+0.1359}$
	${\alpha }_{3}={0.833}_{-0.075}^{+0.164}$
	${x}_{\mathrm{br}1}={1.827}_{-0.314}^{+0.270}$
	${x}_{\mathrm{br}2}={3.501}_{-0.079}^{+0.198}$

Download table as: ASCII Typeset image

Figure 3 visualizes all the resulting model CDFs for both of our simulations. They produce a broad mass distribution spanning more than three orders of magnitude in mass. However, Runs I and II cover different mass regimes due to the different choices of physical parameters and hence the disk conditions, which may also require differently shaped distribution functions to describe. We emphasize that a physical understanding of these differences is elusive, hence our focus on statistics.

For Run I, we find the best-fit first-segment power-law slopes (−α₁) for the last three models in Table 4 become positive, which is of great interest for understanding which planetesimal masses dominate in number counts, and will be discussed in Section 4.3. In addition, the third-segment power-law slope (−α₃) for the three-segment power-law model is extremely steep. As mentioned in Section 3.2, when α₃ approaches $+\infty$ , this model reverts to the truncated broken power law, with the other four parameters being identical between these two models.

In this paper, we do not further consider the mass distributions in other snapshots and the possible variations with time, which belongs to future work. Simon et al. (2017) used a single power-law model to fit the mass distribution and found that the power-law index remains relatively constant in time after an initially rapid collapse. Schäfer et al. (2017) used the variably tapered power-law model and also found that the power-law index, characteristic mass scale, and tapering exponent all remain approximately constant in time for five orbital periods. Based on these results, we do not expect the mass distribution in our simulations to evolve rapidly, i.e., on dynamical timescales, after the snapshots.

4.2. Model Selection

Our model selection analyses are based on the information criteria described in Section 4.2, i.e., Δ_BMS, Δ_BIC, and Δ_AIC values, presented in Tables 4 and 5. We find that more complex models (K ≥ 3) are generally much preferred than the two simpler models (K = 2), regardless of the model selection criteria. In other words, the simply tapered power-law and the truncated power-law models are consistently less favored.

In the case of Run I, all the model selection criteria reach a consensus on choosing the K = 4 truncated broken power law as the preferred model. However, there is disagreement on the strength of preference, as we now explain. In this specific case, the K = 5 three-segment power-law model reverts to the preferred K = 4 model, because the high-mass power law is very steep (effectively a truncation) with the other four parameters identical. Since the BMS does not count parameters, it does not distinguish between these identically shaped distributions. However the AIC and BIC both apply a penalty for the fifth parameter, which is much more significant for the BIC. While all model selection metrics prefer the K = 4 truncated broken power law, the BIC method (only) finds that the evidence against a pair of K = 3 models—the broken cumulative power-law model (Δ_BMS = 1.2) and the broken power-law model (Δ_BMS = 1.8)—is not significant.

In the case of Run II, both the BIC and the AIC designate the broken cumulative power-law model as the preferred model, which only has a moderate complexity level among all the model candidates. However, the BMS prefers the three-segment power-law models slightly more but also substantially supports the broken cumulative power-law model. The overall preference for a broken CDF model may not be surprising given that the mass distribution shows a kink in the CDF (see Figure 3), but it is less physically intuitive since the planetesimal masses are expected to arise probabilistically and broken PDFs have been observed in the size–frequency distributions of asteroids and Kuiper Belt objects (KBOs) (e.g., Morbidelli et al. 2009; Fraser et al. 2010; Shankman et al. 2013; Singer et al. 2019). More work is needed to further understand whether our case is special or not.

Figure 3 allows a visual inspection of our model fitting and selection results. Not surprisingly, more complex models generally fit the mass distributions better. The effect of the logarithmic number axis is worth emphasizing. A small deviation from the (logarithmically plotted) CDF at the low-mass end is more statistically significant than a larger deviation at the high-mass end, where numbers, especially cumulative numbers, are much higher. The advantage of CDF plots is that no binning choices are required. While no data features are lost to binning, a disadvantage is the difficulty in interpreting slopes at the low-mass end (where the CDF is near unity).

Figure 4 provides an alternative view of binned planetesimal numbers (masses), which are compared to the PDFs of the best-fit models. The PDF of the broken cumulative power-law model—fit to Run II—has a discontinuous value at the CDF break. As noted earlier, a physical explanation for such a break does not exist. Overall, the selected models are excellent fits to the binned data, especially when accounting for the error bars, which represent the Poisson noise on the expected number (and mass) of bodies per bin. For Run II, the two preferred models seem to represent the mass distribution equally well.

4.3. A Turnover in the Mass Distribution

In SI simulations to date, the low-mass end of the PDF is described by a power law with α > 0, where dN/d ln(M) ∝ M^−α (most of these works, described below, use p ≡ α + 1). Such a slope cannot extend to arbitrarily low mass, because the total mass of planetesimals would diverge. Simulations with sufficiently high resolution should solve this problem, by revealing a low-mass tail with α < 0. Such a measurement would determine the mass of planetesimals formed by SI that initially dominate by number. We present here the first evidence of such a turnover.

In Figure 4, the mass frequency distribution of Run I presents a turnover below ∼0.003 M_G (roughly 100 km sized objects for the disk model in Section 2.2). The number of clumps drops with decreasing mass, except for the lowest mass bin (more on this below).

Our preferred statistical model confirms this visual evidence. The K = 4 truncated broken power-law model has a positive low-mass slope of −α₁ = 0.102 (as does the K = 5 three-segment power-law model that is identical in practice). Our bootstrap error estimates (see Table 4) confirm the significance of the positive slope. Moreover, the simpler K = 3 broken power-law model (while not the most preferred model) also has a positive low-mass slope (−α₁ = 0.079), which is flatter but agrees within statistical uncertainties.

The evidence for a mass turnover (i.e., a positive slope at low masses) in Run I is fairly compelling. However, Run I also has an increase in the number of bodies in the lowest mass bin in Figure 4. It is unclear whether this increase has a physical origin, though we suspect that insufficient resolution on the smallest scales is an issue. We note that the lower-resolution Run II shows an uptick in the planetesimal numbers in the two lowest mass bins.

Future studies with higher resolution are required to better resolve the low-mass planetesimals and better characterize the inevitable turnover in the gravitational collapse mass function at low masses.

By contrast, for Run II resolution is not sufficient for evidence of a low-mass turnover. However, one of the preferred statistical models (the K = 5 three-segment power-law model) reveals a positive slope at intermediate masses (−α₂ = 0.362). Whether this slope extends to the low-mass end and whether the uptick of the two lowest mass bins is numerical again require higher-resolution studies. Moreover, the binned mass distribution of Run II shows a much flatter slope in the high-mass regime than that of Run I, which is also shown by the CDFs in Figure 3. This comparison further demonstrates that the different physical conditions of our two simulations produce different mass distributions.

4.4. Comparison with Previous Studies

In this section, we compare our fitting results to two recent works on planetesimal mass distribution that considered models beyond a single power law.

Schäfer et al. (2017) ran a suite of simulations with different resolutions and box sizes, fixing the physical parameters, $({\tau }_{{\rm{s}}},Z,{\rm{\Pi }},\tilde{G})=(0.314,0.02,0.05,0.318)$ , i.e., similar to our Run II with weaker self-gravity. They considered a two-parameter truncated power-law model and a three-parameter variably tapered power-law model, and found that the latter provides a better fit. Our analysis did not favor either of these models, especially for the similar run II. For the tapering exponent (of the variably tapered power law), they fit β ≃ 0.3–0.4, similar to our results (0.298 and 0.527 for Runs I and II, respectively). Schäfer et al. (2017) explained that, with limited resolution (Δx = H/640 at best), their simulations did not produce enough planetesimals to constrain the shape of the CDF in the power-law part, and thus the values of α and M_exp. Their work used a different code, PENCIL, and also used a sink-particle algorithm to handle bound clumps. These differences are a useful check on numerical robustness.

Abod et al. (2019) used high-resolution (Δx = H/2560) simulations with $({\tau }_{{\rm{s}}},Z,\tilde{G})=(0.05,0.1,0.02)$ —i.e., smaller solids in a lower-mass disk than our runs—to study how the initial mass distribution of planetesimals depends on the pressure gradient, Π, with values from 0 to 0.1. They found that the planetesimal mass function depends at most weakly on Π. Abod et al. (2019) used a two-parameter simply tapered power-law model to fit the simulation data. Our analysis did not prefer this model, though it does have an advantage of simplicity. They fit the power-law exponent α ≈ 0.3 and the characteristic mass scale of ∼0.3 M_G when Π = 0.05. Our results give similar power-law slopes (α = 0.208 for Run I and 0.388 for Run II), and a similar characteristic mass scale (M_exp = 0.1385 M_G) for Run I. In our Run II fit, M_exp (= 11.2531 M_G) differs by a factor of ∼81, for reasons that are not yet clear.

Since there is always more than one difference in the physical conditions and only limited model candidates are considered, these comparisons are somewhat inconclusive. Though costly, more parameter studies are needed to understand how the initial planetesimal mass function varies with each of the four physical parameters $({\tau }_{{\rm{s}}},Z,{\rm{\Pi }},\tilde{G})$ and eventually how these parameter dependences couple.

5. Discussions and Conclusions

We investigate the mass distribution of planetesimals formed in high-resolution SI simulations. This mass distribution is of great astrophysical interest since it provides insights for observations of small bodies in the solar system (e.g., cold classical KBOs, Nesvorný et al. 2019, etc.) as well as for the modeling of subsequent protoplanet formation (e.g., Liu et al. 2019).

In this work, we conduct SI simulations including particle self-gravity with the highest resolution to date, which produce broad mass distributions of planetesimals. While such distributions are top-heavy in mass for all choices of numerical resolution, higher-resolution simulations probe the lower-mass tail that dominates planetesimal numbers. We also develop and apply a new clump-finding tool, PLAN (described in Section 2.3), to accurately identify self-bound clumps in our simulations and extract their mass distributions. PLAN was used in previous work (Abod et al. 2019; Nesvorný et al. 2019), but the details of the method are presented here.

We fit the mass distribution to statistical models with different parameterizations (described in Section 3.2). To determine and select the preferred model from simulation data is a difficult statistical art, especially when different model candidates have different numbers of parameters. Thus, this work considers a variety of model selection criteria: the commonly used BIC and AIC, as well as a bootstrap model selection method that we call BMS.

Based on our analyses, we find that

1.
Run I is best described by a four-parameter model, the truncated broken power law.
2.
For Run II (with smaller solids and a lower solid abundance than Run I) the preferred model depends on the selection criterion used. The AIC and BIC prefer a three-parameter broken cumulative power law. The BMS selects a five-parameter model, the three-segment power law.

The interpretations and conclusions are drawn as follows:

1.
The initial mass distribution of planetesimals formed by the SI is shown to be numerically robust for the high-mass regimes, and is most probably also robust at lower masses. Simulations with different numerical resolution (Run I and an equivalent run with lower resolution) show a similar mass distribution at the high-mass end. Higher resolution gives a correction at intermediate masses and an extension to lower masses.
2.
For different physical conditions, the initial mass distribution is not universal. While all cases produce top-heavy mass distributions with similar overall shapes, simulations with different physical parameters produce statistically different mass distributions. Fitting the same model to different runs often yields different best-fit parameters, e.g., power-law slopes and characteristic mass scales. Moreover, the preferred models for different runs have different functional forms. More work and more high-resolution simulations are needed to better understand the initial mass distribution.
3.
Our preferred models were not previously considered in the literature. We analyze the models that were used in previous studies, and find alternative models that rank higher by all model selection criteria. We make no claim to have found the optimal model, which we may not have considered and which may change as simulation data improve.
4.
We find evidence for a turnover in the mass frequency distribution at the low-mass end. This evidence is most prominent for Run I, where the PDF of the logarithmic masses transitions to a positive slope below M ∼ 0.003 M_G at roughly 2σ significance in the estimated slope. There is also statistical evidence for a turnover in the case of Run II, but only at intermediate masses. To better characterize the turnover of initial planetesimal mass distributions, higher-resolution simulations are required.
5.
The most complex model is not always selected as the preferred model. This result emphasizes the importance of applying complexity penalties for model selection.
6.
Different model selection criteria disagree on both the absolute and relative rankings of different models. It is often difficult to rigorously justify a single model selection criterion for a given (astrophysical) application. In the absence of this justification, we generally recommend that multiple selection criteria be considered to increase confidence in model selection analyses.

Nesvorný et al. (2019) recently found that the clumps formed by the SI possess excess angular momenta and are likely to form binaries or multiple systems. In that case, the mass distributions from our simulations may describe the mass function of binaries/systems, not individual objects. This finding introduces corrections to the overall mass distribution and also some uncertainties to our modeling, which are beyond the scope of this work. However, those corrections and uncertainties would be minor if all clumps eventually form equal-size binaries as proposed in Nesvorný et al. (2010).

We thank Kaitlin Kratter, Paola Pinilla, Philip Pinto, and Peter Behroozi for useful discussions. R.L. acknowledges support from NASA headquarters under the NASA Earth and Space Science Fellowship Program grant NNX16AP53H. A.N.Y. acknowledges partial support from NASA Astrophysics Theory Grant NNX17AK59G and from the NSF through grant AST-1616929.

Software: ATHENA (Stone et al. 2008; Bai & Stone 2010c), Matplotlib (Hunter 2007), Numpy & Scipy (Jones et al. 2001), emcee v3.0rc2 (Foreman-Mackey et al. 2013), PLAN (Li 2019a), Pyridoxine (Li 2019b).

Appendix A: A Model Fitting Example

In this appendix, we take the fitting of the variably tapered power-law model to the data from Run I as an example and describe it in detail. First, we assume a uniform prior distribution of parameters and use emcee to explore the posterior likelihood sampling (see Figure 5). In this example, our chains consist of 32 walkers and 6000 steps with 1000 burn-in steps, while the estimated autocorrelation time returned by emcee is only about 75 steps, is well below the number of burn-in steps. From this MCMC result, the best-fit parameters and the log-likelihood are

$\begin{eqnarray}\begin{array}{rcl}{{\boldsymbol{\theta }}}_{\mathrm{MCMC}} & = & (\alpha ,\beta ,{x}_{\exp })=(0.035,0.291,4.657),\\ -\mathrm{ln}{ \mathcal L } & = & 633.716.\end{array}\end{eqnarray} \tag{ 31 }$

**Figure 5.** The posterior likelihood sampling results of fitting the variably tapered power-law model (see Section 3.2 and Table 6) to the data from Run I using MCMC. The median values are shown by the vertical dashed lines.
Download figure:
Standard image High-resolution image

We now feed a set of initial guesses generated on a mesh grid centered on ${{\boldsymbol{\theta }}}_{\mathrm{MCMC}}$ to the minimize method provided by the scipy.optimize package. Various minimization algorithms are then employed for further minimization, including "Nelder–Mead," "Powell," "CG," "BFGS," "Newton-CG," "L-BFGS-B" and "TNC." The latter five algorithms compute the gradient vector, $\partial \mathrm{ln}{ \mathcal L }/\partial {\boldsymbol{\theta }}$ , to converge more quickly to the solution. We have implemented the methods to compute the gradient vector and Hessian matrix for all seven statistical models and make them available on GitHub.⁹ In this specific example, all algorithms but "L-BFGS-B" converge at the final best-fit parameters with an acceptable tiny gradient vector

$\begin{eqnarray}&&\begin{array}{l}{{\boldsymbol{\theta }}}_{\mathrm{MLE}}=(\alpha ,\beta ,{x}_{\exp })=(0.036,0.298,4.734),\\ {\left.\left(\displaystyle \frac{\partial \mathrm{ln}{ \mathcal L }}{\partial \alpha },\displaystyle \frac{\partial \mathrm{ln}{ \mathcal L }}{\partial \beta },\displaystyle \frac{\partial \mathrm{ln}{ \mathcal L }}{\partial {x}_{\exp }}\right)\right|}_{{\boldsymbol{\theta }}={{\boldsymbol{\theta }}}_{\mathrm{MLE}}}\\ \quad =\ (3.763e-11,1.265e-11,-1.812e-12),\\ {\left.\left(\displaystyle \frac{{\partial }^{2}\mathrm{ln}{ \mathcal L }}{\partial {\alpha }^{2}},\displaystyle \frac{{\partial }^{2}\mathrm{ln}{ \mathcal L }}{\partial {\beta }^{2}},\displaystyle \frac{{\partial }^{2}\mathrm{ln}{ \mathcal L }}{\partial {x}_{\exp }^{2}}\right)\right|}_{{\boldsymbol{\theta }}={{\boldsymbol{\theta }}}_{\mathrm{MLE}}}\\ \quad =\ (-6365.209,-3364.693,-18.445),\\ -\mathrm{ln}{ \mathcal L }=633.695,\end{array}\end{eqnarray} \tag{ 32 }$

where Equation (17) is satisfied with a good numerical precision.

In the next step, we use the nonparametric bootstrap method to estimate the uncertainties of ${{\boldsymbol{\theta }}}_{\mathrm{MLE}}$ . Figure 6 shows all the bootstrap samples in the left panel. We again apply our MLE to obtain the best-fit parameters, ${{\boldsymbol{\theta }}}_{\mathrm{bs},k}$ , for the kth bootstrap sample (k = 1, ..., N_bs). All these best fits are plotted in the right panel of Figure 6. The uncertainties are then calculated based on Equation (18):

$\begin{eqnarray}\begin{array}{rcl}\alpha & = & {0.036}_{-0.041}^{+0.041},\\ \beta & = & {0.298}_{-0.040}^{+0.061},\\ {x}_{\exp } & = & {4.734}_{-1.128}^{+1.022}.\end{array}\end{eqnarray} \tag{ 33 }$

Furthermore, the uncertainty of the characteristic mass scale, M_exp, can be derived as

$\begin{eqnarray}&&\begin{array}{c}\begin{array}{l}{M}_{\exp }+{\rm{\Delta }}{M}_{\exp }^{+}={M}_{min}{e}^{{x}_{\exp }+{\rm{\Delta }}{x}_{\exp }^{+}}\\ \,\Longrightarrow {\rm{\Delta }}{M}_{\exp }^{+}={M}_{\exp }\left({e}^{{\rm{\Delta }}{x}_{\exp }^{+}}-1\right),\\ {M}_{\exp }-{\rm{\Delta }}{M}_{\exp }^{-}={M}_{min}{e}^{{x}_{\exp }-{\rm{\Delta }}{x}_{\exp }^{-}}\\ \,\Longrightarrow {\rm{\Delta }}{M}_{\exp }^{-}={M}_{\exp }\left(1-{e}^{-{\rm{\Delta }}{x}_{\exp }^{-}}\right),\\ \therefore {M}_{\exp }={0.0021}_{-0.0014}^{+0.0038}.\end{array}\end{array}\end{eqnarray} \tag{ 34 }$

**Figure 6.** A demonstration of the nonparametric bootstrap method. Left: the simulated mass distribution data (red) and N_bs = 10,000 bootstrap samples (gray). Right: the best-fit variably tapered power-law model (cyan, ${{\boldsymbol{\theta }}}_{\mathrm{MLE}}$ ) to the data and the best-fit model to each bootstrap sample (gray, ${{\boldsymbol{\theta }}}_{\mathrm{bs},k}$ ).
Download figure:
Standard image High-resolution image

**Figure 6.** A demonstration of the nonparametric bootstrap method. Left: the simulated mass distribution data (red) and N_bs = 10,000 bootstrap samples (gray). Right: the best-fit variably tapered power-law model (cyan, ${{\boldsymbol{\theta }}}_{\mathrm{MLE}}$ ) to the data and the best-fit model to each bootstrap sample (gray, ${{\boldsymbol{\theta }}}_{\mathrm{bs},k}$ ).
Download figure:
Standard image High-resolution image

Appendix B: Model Coefficients and Full Functional Forms

In this section, we list all the renormalization coefficients for the statistical models in Section 3.2 and show their full functional forms in Table 6.

1.
Simply tapered power law
$\begin{eqnarray}&&{c}_{1}=\displaystyle \frac{1}{{M}_{\min }^{-\alpha }}\ \exp \left(\displaystyle \frac{{M}_{\min }}{{M}_{\exp }}\right).\end{eqnarray} \tag{ 35 }$
2.
Variably tapered power law
$\begin{eqnarray}&&{c}_{2}=\displaystyle \frac{1}{{M}_{\min }^{-\alpha }}\ \exp \left[{\left(\displaystyle \frac{{M}_{\min }}{{M}_{\exp }}\right)}^{\beta }\right].\end{eqnarray} \tag{ 36 }$
3.
Broken cumulative power law
$\begin{eqnarray}\begin{array}{rcl}{c}_{31} & = & \displaystyle \frac{1}{{M}_{\min }^{-{\alpha }_{1}}}\\ {c}_{32} & = & \displaystyle \frac{1}{{M}_{\min }^{-{\alpha }_{1}}{M}_{\mathrm{br}}^{{\alpha }_{1}-{\alpha }_{2}}}.\end{array}\end{eqnarray} \tag{ 37 }$
4.
Truncated power law
$\begin{eqnarray}&&{c}_{4}=\displaystyle \frac{\alpha }{{M}_{\min }^{-\alpha }-{M}_{\mathrm{tr}}^{-\alpha }}.\end{eqnarray} \tag{ 38 }$
5.
Broken power law
$\begin{eqnarray}\begin{array}{rcl}{c}_{51} & = & \displaystyle \frac{1}{{M}_{\min }^{-{\alpha }_{1}}}{\left[\displaystyle \frac{1}{{\alpha }_{1}}+\left(\displaystyle \frac{1}{{\alpha }_{2}}-\displaystyle \frac{1}{{\alpha }_{1}}\right){\left(\displaystyle \frac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\right]}^{-1}\\ {c}_{52} & = & {c}_{51}{M}_{\mathrm{br}}^{{\alpha }_{2}-{\alpha }_{1}}.\end{array}\end{eqnarray} \tag{ 39 }$
6.
Truncated broken power law
$\begin{eqnarray}\begin{array}{rcl}{c}_{61} & = & \displaystyle \frac{1}{{M}_{\min }^{-{\alpha }_{1}}}\left[\displaystyle \frac{1}{{\alpha }_{1}}+\left(\displaystyle \frac{1}{{\alpha }_{2}}-\displaystyle \frac{1}{{\alpha }_{1}}\right){\left(\displaystyle \frac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\right.\\ & & -\ {\left.\displaystyle \frac{1}{{\alpha }_{2}}{\left(\displaystyle \frac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)}^{{\alpha }_{2}-{\alpha }_{1}}{\left(\displaystyle \frac{{M}_{\mathrm{tr}}}{{M}_{\min }}\right)}^{-{\alpha }_{2}}\right]}^{-1}\\ {c}_{62} & = & {c}_{61}{M}_{\mathrm{br}}^{{\alpha }_{2}-{\alpha }_{1}}.\end{array}\end{eqnarray} \tag{ 40 }$
7.
Three-segment power law
$\begin{eqnarray}\begin{array}{rcl}{c}_{71} & = & \displaystyle \frac{1}{{M}_{\min }^{-{\alpha }_{1}}}\left[\displaystyle \frac{1}{{\alpha }_{1}}+\left(\displaystyle \frac{1}{{\alpha }_{2}}-\displaystyle \frac{1}{{\alpha }_{1}}\right){\left(\displaystyle \frac{{M}_{\mathrm{br}1}}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\right.\\ & & +\ {\left.\left(\displaystyle \frac{1}{{\alpha }_{3}}-\displaystyle \frac{1}{{\alpha }_{2}}\right){\left(\displaystyle \frac{{M}_{\mathrm{br}1}}{{M}_{\min }}\right)}^{{\alpha }_{2}-{\alpha }_{1}}{\left(\displaystyle \frac{{M}_{\mathrm{br}2}}{{M}_{\min }}\right)}^{-{\alpha }_{2}}\right]}^{-1}\\ {c}_{72} & = & {c}_{71}{M}_{\mathrm{br}1}^{{\alpha }_{2}-{\alpha }_{1}}\\ {c}_{73} & = & {c}_{71}{M}_{\mathrm{br}1}^{{\alpha }_{2}-{\alpha }_{1}}{M}_{\mathrm{br}2}^{{\alpha }_{3}-{\alpha }_{2}}.\end{array}\end{eqnarray} \tag{ 41 }$

Table 6. Mass Distribution Models

Name	Mass Distribution Function	Mass Scale	PDF in Likelihood Estimator
K: Number of Parameters	CDF[ ${P}_{\gt }(M)]$ or PDF[ξ(M)]	x ≡ ln(M/M_min)	$p(x;{\boldsymbol{\theta }})$

Simply tapered power law (Abod et al. 2019), K = 2, ${\boldsymbol{\theta }}=(\alpha ,{x}_{\exp })$	${P}_{\gt }(M)={\left(\tfrac{M}{{M}_{\min }}\right)}^{-\alpha }\ \exp \left[-\tfrac{M-{M}_{\min }}{{M}_{\exp }}\right]$	${x}_{\exp }\equiv \mathrm{ln}\left(\tfrac{{M}_{\exp }}{{M}_{\min }}\right)$	$\tfrac{\alpha +{e}^{-{x}_{\exp }}{e}^{x}}{\exp \left[\alpha x+{e}^{-{x}_{\exp }}({e}^{x}-1)\right]}$

Variably tapered power law (Schäfer et al. 2017), K = 3, ${\boldsymbol{\theta }}=(\alpha ,\beta ,{x}_{\exp })$	${P}_{\gt }(M)={\left(\tfrac{M}{{M}_{\min }}\right)}^{-\alpha }\ \exp \left[-\tfrac{{M}^{\beta }-{M}_{\min }^{\beta }}{{M}_{\exp }^{\beta }}\right]$	${x}_{\exp }\equiv \mathrm{ln}\left(\tfrac{{M}_{\exp }}{{M}_{\min }}\right)$	$\tfrac{\alpha +\beta {e}^{\beta ({x}_{\exp }+x)}}{\exp \left[\alpha x+{e}^{\beta {x}_{\exp }}({e}^{\beta x}-1)\right]}$

Broken cumulative power law, K = 3, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}})$	${P}_{\gt }(M)=\left\{\begin{array}{ll}{\left(\tfrac{M}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\ & M\leqslant {M}_{\mathrm{br}}\\ \tfrac{{M}^{-{\alpha }_{2}}}{{M}_{\mathrm{br}}^{{\alpha }_{1}-{\alpha }_{2}}{M}_{\min }^{-{\alpha }_{1}}}\ & M\gt {M}_{\mathrm{br}}\end{array}\right.$	${x}_{\mathrm{br}}\equiv \mathrm{ln}\left(\tfrac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)$	$\left\{\begin{array}{ll}{\alpha }_{1}{e}^{-{\alpha }_{1}x}\ & x\leqslant {x}_{\mathrm{br}}\\ {\alpha }_{2}{e}^{({\alpha }_{2}-{\alpha }_{1}){x}_{\mathrm{br}}-{\alpha }_{2}x}\ & x\gt {x}_{\mathrm{br}}\end{array}\right.$

Truncated power law (Schäfer et al. 2017), K = 2, ${\boldsymbol{\theta }}=(\alpha ,{x}_{\mathrm{tr}})$	$\xi (M)=\left\{\begin{array}{ll}\tfrac{\alpha }{M}\tfrac{{\left(M/{M}_{\min }\right)}^{-\alpha }}{1-{\left({M}_{\mathrm{tr}}/{M}_{\min }\right)}^{-\alpha }}\ & M\leqslant {M}_{\mathrm{tr}}\\ 0\ & M\gt {M}_{\mathrm{tr}}\end{array}\right.$	${x}_{\mathrm{tr}}\equiv \mathrm{ln}\left(\tfrac{{M}_{\mathrm{tr}}}{{M}_{\min }}\right)$	$\left\{\begin{array}{ll}\tfrac{\alpha {e}^{-\alpha x}}{1-{e}^{-\alpha {x}_{\mathrm{tr}}}}\ & x\leqslant {x}_{\mathrm{tr}}\\ 0\ & x\gt {x}_{\mathrm{tr}}\end{array}\right.$

Broken power law, K = 3, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}})$	$\begin{array}{l}\xi (M)=\left\{\begin{array}{ll}\tfrac{{C}_{0}}{M}{\left(\tfrac{M}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\ & M\leqslant {M}_{\mathrm{br}}\\ \tfrac{{C}_{0}}{M}\tfrac{{\left(M/{M}_{\min }\right)}^{-{\alpha }_{2}}}{{\left({M}_{\mathrm{br}}/{M}_{\min }\right)}^{{\alpha }_{1}-{\alpha }_{2}}}\ & M\gt {M}_{\mathrm{br}}\end{array}\right.\\ {C}_{0}={\left[\tfrac{1}{{\alpha }_{1}}+\left(\tfrac{1}{{\alpha }_{2}}-\tfrac{1}{{\alpha }_{1}}\right){\left(\tfrac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\right]}^{-1}\end{array}$	${x}_{\mathrm{br}}\equiv \mathrm{ln}\left(\tfrac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)$	$\begin{array}{l}\left\{\begin{array}{ll}{C}_{0}{e}^{-{\alpha }_{1}x}\ & x\leqslant {x}_{\mathrm{br}}\\ {C}_{0}{e}^{({\alpha }_{2}-{\alpha }_{1}){x}_{\mathrm{br}}-{\alpha }_{2}x}\ & x\gt {x}_{\mathrm{br}}\end{array}\right.\\ {C}_{0}={\left[\tfrac{1}{{\alpha }_{1}}+\left(\tfrac{1}{{\alpha }_{2}}-\tfrac{1}{{\alpha }_{1}}\right){e}^{-{\alpha }_{1}{x}_{\mathrm{br}}}\right]}^{-1}\end{array}$

Truncated broken power law, K = 4, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{x}_{\mathrm{br}},{x}_{\mathrm{tr}})$	$\begin{array}{l}\xi (M)=\left\{\begin{array}{ll}\tfrac{{C}_{1}}{M}{\left(\tfrac{M}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\ & M\leqslant {M}_{\mathrm{br}}\\ \tfrac{{C}_{1}}{M}\tfrac{{\left(M/{M}_{\min }\right)}^{-{\alpha }_{2}}}{{\left({M}_{\mathrm{br}}/{M}_{\min }\right)}^{{\alpha }_{1}-{\alpha }_{2}}}\ & \mathrm{otherwise}\\ 0\ & M\gt {M}_{\mathrm{tr}}\end{array}\right.\\ {C}_{1}=\left[\tfrac{1}{{\alpha }_{1}}+\left(\tfrac{1}{{\alpha }_{2}}-\tfrac{1}{{\alpha }_{1}}\right){\left(\tfrac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\right.\\ {\left.-\tfrac{1}{{\alpha }_{2}}{\left(\tfrac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)}^{{\alpha }_{2}-{\alpha }_{1}}{\left(\tfrac{{M}_{\mathrm{tr}}}{{M}_{\min }}\right)}^{-{\alpha }_{2}}\right]}^{-1}\end{array}$	$\begin{array}{l}{x}_{\mathrm{br}}\equiv \mathrm{ln}\left(\tfrac{{M}_{\mathrm{br}}}{{M}_{\min }}\right)\\ {x}_{\mathrm{tr}}\equiv \mathrm{ln}\left(\tfrac{{M}_{\mathrm{tr}}}{{M}_{\min }}\right)\end{array}$	$\begin{array}{l}\left\{\begin{array}{ll}{C}_{1}{e}^{-{\alpha }_{1}x}\ & x\leqslant {x}_{\mathrm{br}}\\ {C}_{1}{e}^{({\alpha }_{2}-{\alpha }_{1}){x}_{\mathrm{br}}-{\alpha }_{2}x}\ & \mathrm{otherwise}\\ 0\ & x\gt {x}_{\mathrm{tr}}\end{array}\right.\\ {C}_{1}=\left[\tfrac{1}{{\alpha }_{1}}+\left(\tfrac{1}{{\alpha }_{2}}-\tfrac{1}{{\alpha }_{1}}\right){e}^{-{\alpha }_{1}{x}_{\mathrm{br}}}\right.\\ {\left.-\tfrac{1}{{\alpha }_{2}}{e}^{({\alpha }_{2}-{\alpha }_{1}){x}_{\mathrm{br}}-{\alpha }_{2}{x}_{\mathrm{tr}}}\right]}^{-1}\end{array}$

Three-segment power law, K = 5, ${\boldsymbol{\theta }}=({\alpha }_{1},{\alpha }_{2},{\alpha }_{3},{x}_{\mathrm{br}1},{x}_{\mathrm{br}2})$	$\begin{array}{l}\xi (M)=\\ \left\{\begin{array}{ll}\tfrac{{C}_{2}}{M}{\left(\tfrac{M}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\ & M\leqslant {M}_{\mathrm{br}1}\\ \tfrac{{C}_{2}}{M}\tfrac{{\left(M/{M}_{\min }\right)}^{-{\alpha }_{2}}}{{\left({M}_{\mathrm{br}1}/{M}_{\min }\right)}^{{\alpha }_{1}-{\alpha }_{2}}}\ & \mathrm{otherwise}\\ \tfrac{{C}_{2}}{M}\tfrac{{\left(M/{M}_{\min }\right)}^{-{\alpha }_{3}}}{{\left(\tfrac{{M}_{\mathrm{br}1}}{{M}_{\min }}\right)}^{{\alpha }_{1}-{\alpha }_{2}}{\left(\tfrac{{M}_{\mathrm{br}2}}{{M}_{\min }}\right)}^{{\alpha }_{2}-{\alpha }_{3}}}\ & M\gt {M}_{\mathrm{br}2}\end{array}\right.\\ {C}_{2}=\left[\tfrac{1}{{\alpha }_{1}}+\left(\tfrac{1}{{\alpha }_{2}}-\tfrac{1}{{\alpha }_{1}}\right){\left(\tfrac{{M}_{\mathrm{br}1}}{{M}_{\min }}\right)}^{-{\alpha }_{1}}\right.\\ {\left.+\left(\tfrac{1}{{\alpha }_{3}}-\tfrac{1}{{\alpha }_{2}}\right){\left(\tfrac{{M}_{\mathrm{br}1}}{{M}_{\min }}\right)}^{{\alpha }_{2}-{\alpha }_{1}}{\left(\tfrac{{M}_{\mathrm{br}2}}{{M}_{\min }}\right)}^{-{\alpha }_{2}}\right]}^{-1}\end{array}$	$\begin{array}{l}{x}_{\mathrm{br}1}\equiv \mathrm{ln}\left(\tfrac{{M}_{\mathrm{br}1}}{{M}_{\min }}\right)\\ {x}_{\mathrm{br}2}\equiv \mathrm{ln}\left(\tfrac{{M}_{\mathrm{br}2}}{{M}_{\min }}\right)\end{array}$	$\begin{array}{l}\left\{\begin{array}{ll}{C}_{2}{e}^{-{\alpha }_{1}x}\ & x\leqslant {x}_{\mathrm{br}1}\\ {C}_{2}{e}^{({\alpha }_{2}-{\alpha }_{1}){x}_{\mathrm{br}1}-{\alpha }_{2}x}\ & \mathrm{otherwise}\\ {C}_{2}\tfrac{{e}^{-{\alpha }_{3}x}}{{e}^{({\alpha }_{1}-{\alpha }_{2}){x}_{\mathrm{br}1}+({\alpha }_{2}-{\alpha }_{3}){x}_{\mathrm{br}2}}}\ & x\gt {x}_{\mathrm{br}2}\end{array}\right.\\ {C}_{2}=\left[\tfrac{1}{{\alpha }_{1}}+\left(\tfrac{1}{{\alpha }_{2}}-\tfrac{1}{{\alpha }_{1}}\right){e}^{-{\alpha }_{1}{x}_{\mathrm{br}1}}\right.\\ {\left.+\left(\tfrac{1}{{\alpha }_{3}}-\tfrac{1}{{\alpha }_{2}}\right){e}^{({\alpha }_{2}-{\alpha }_{1}){x}_{\mathrm{br}1}-{\alpha }_{2}{x}_{\mathrm{br}2}}\right]}^{-1}\end{array}$