Requirements for gravitational collapse in planetesimal formation --- the impact of scales set by Kelvin-Helmholtz and nonlinear streaming instability

The formation of planetesimals is an unsolved problem in planet formation theory. A prominent scenario for overcoming dust growth barriers in dead zones is the gravitational collapse of locally over-dense regions, shown to robustly produce $\sim$100 km sized objects. Still, the conditions under which planetesimal formation occurs remain unclear. For collapse to proceed, the self-gravity of an overdensity must overcome stellar tidal disruption on large scales and turbulent diffusion on small scales. Here, we relate the scales of streaming and Kelvin-Helmholtz instability, which both regulate particle densities on the scales of gravitational collapse, directly to planetesimal formation. We support our analytic findings by performing 3D hydrodynamical simulations of streaming and Kelvin-Helmholtz instability and planetesimal formation. We find that the vertical extent of the particle mid-plane layer and the radial width of streaming instability filaments are set by the same characteristic length scale, thus governing the strength of turbulent diffusion on the scales of planetesimal formation. We present and successfully test a collapse criterion: $0.1 Q \beta \epsilon^{-1}Z^{-1} \lesssim 1$ and show that even for Solar metallicities, planetesimals can form in dead zones of sufficiently massive disks. For a given gas Toomre-parameter $Q$, pressure gradient $\beta$, metallicity $Z$ and local particle enhancement $\epsilon$, the collapse criterion also provides a range of unstable scales, instituting a promising path for studying initial planetesimal mass distributions. Streaming instability is not required for planetesimal collapse, but by increasing $\epsilon$, can evolve a system to instability.


INTRODUCTION
Protoplanetary disks around young stars are believed to mark the birthplace of planets. However, which disk processes govern planetesimal formation remains a central question. While µm-sized dust particles have been shown to efficiently grow (e.g., Brauer et al. 2008), further growth of m-sized pebbles is halted by fragmentation and rapid radial drift (e.g., Birnstiel et al. 2012; see Chiang & Youdin 2010 for a review). Thus, processes to form planetesimals -building blocks of planets defined as the smallest gravitationally bound objects -must ef-may be found in vortices (Raettig et al. 2015;Manger & Klahr 2018) and zonal flows or other particle trap structures caused by instabilities in the gas disk Pfeil & Klahr 2019) that convert a pebble flux to planetesimals Gerbig et al. 2019).
In recent years, the streaming instability (Youdin & Goodman 2005) has gained prominence as it was shown to be able to significantly increase local particle densities Bai & Stone 2010a;Yang & Johansen 2014;Carrera et al. 2015;Yang et al. 2017;) and thus bootstrap the formation of planetesimals with sizes of ∼ 100 km without the need for preexisting large-scale particle traps, as shown by e.g., Johansen et al. (2015); Simon et al. (2016); Schäfer et al. (2017); Nesvorny et al. (2019). While Sekiya & Onishi (2018) recently showed that the level of particle concentration by streaming instability is regulated by just two parameters, the conditions required for collapse and planetesimal formation in the presence (or absence) of streaming instability remain unclear.
To trigger local gravitational collapse, a particle clump must overcome both disruptive tidal shear and turbulent diffusion (Klahr & Schreiber 2015;Klahr & Schreiber 2019). The former requires sufficiently high particle concentrations and the latter sufficiently low turbulent particle velocities, both of which are regulated by streaming and Kelvin-Helmholtz instability.
Kelvin-Helmholtz instabilities (KHI) in the particle stream are driven by the vertical gradient in orbital velocity intrinsic to disks with frictionally coupled dust and gas subject to a radial pressure gradient (Weidenschilling 1980;Nakagawa et al. 1986;Sekiya 1998;Sekiya & Ishitsu 2000;Johansen et al. 2006). Kelvin-Helmholtz instability leads to vertical mixing and particle excitation and thus sets the vertical extent of the particle layer (Weidenschilling & Cuzzi 1993;Youdin & Shu 2002;Chiang 2008;Sekiya & Onishi 2018). Streaming instability likewise is a drag instability (Youdin & Goodman 2005;Jacquet et al. 2011;Umurhan et al. 2019). Here, frictionally coupled particles exert a feedback force onto an epicyclic gas wave, which for the fastest growing wave modes is in resonance with the pressure-gradientdependant dust streaming velocity (Squire & Hopkins 2018a;Zhuravlev 2019). The interplay of these two instabilities, albeit crucial for planetesimal formation, is not well studied numerically, in particular with selfgravity.
Thus, in this work, we employ the Pencil Code (Brandenburg & Dobler 2002;Brandenburg 2003) to conduct 3D fluid-dynamical simulations in the shearing-box approximation (Goldreich & Lynden-Bell 1965;Balbus & Hawley 1992;Brandenburg et al. 1995) to numerically investigate the scales of streaming and Kelvin-Helmholtz instability. In contrast to e.g. ; ; , we include vertical stellar gravity to capture particle settling and KHI-induced stirring. To verify analytic predictions for the vertical extent of the particle layer set by KHI Johansen et al. (2006); Chiang (2008), we perform a brief parameter study, focusing on the effects of metallicity and of the gas disk's radial pressure gradient, which is the energy source of both instabilities. By then introducing self-gravity to our simulations, we test an updated version of the diffusion-limited collapse criterion in Klahr & Schreiber (2015); Klahr & Schreiber (2019) and the conditions for planetesimal formation. This is in contrast to Shi & Chiang (2013), who studied graviational collapse in the absence of a radial pressure gradient, i.e. without SI and KHI effects. Finally, we put forth a more general collapse criterion that we derive by relating diffusion and local enhancement to the scales of streaming and Kelvin-Helmholtz instability.
The paper is structured as follows. In Sect. 2, we review and expand upon the diffusion-limited collapse criterion from Klahr & Schreiber (2015); Klahr & Schreiber (2019). Sect. 3 reviews the effect of Kelvin-Helmholtz and streaming instability on particle densities in protoplanetary disk dead zones. In Sect. 4, we discuss our numerical setup. Sects. 5 and 6 present our numerical results for vertical and radial scales of particle overdensities in the absence and presence of streaming instability, respectively. In Sect. 7 we include self-gravity to numerically verify the collapse criterion. Finally, our conclusion, limitations and an outlook are presented in Sect. 8.

COLLAPSE CRITERION FOR LOCAL PARTICLE OVER-DENSITIES
The diffusion-limited collapse criterion for planetesimals from Klahr & Schreiber (2015) displays surprising similarities to the Toomre criterion for gas disk stability (Toomre 1964). Klahr & Schreiber (2015); Klahr & Schreiber (2019) present a collapse length scale appropriate for marginally unstable systems at the Hill density (see Eq. (12)). We expand upon their result by recasting the interplay between diffusion, tidal shear, and self-gravity in a form familiar to the development of Toomre's Q criterion for collapse. In the process, we both account for systems that allow for multiple unstable modes and gain insight into the disk evolution processes that may lead up to planetesimal fragmentation, a subject that we return to in Sect. 8.3.
We consider a particle cloud with density ρ c , mass M c at distance from the star a, collapsing under its own gravity. A dust particle at distance r from the cloud center is assumed to in-fall with its terminal velocity v term , i.e., where we introduced gravitational constant G, orbital frequency Ω and Stokes number St, a dimensionless parameterization of the stopping time t stop defined as Here, t stop = mw/F D for a particle of mass m experiencing drag force F D as it moves at velocity w relative to the gas. For St < 1, particles are coupled to the gas, whereas particles with St > 1 decouple from the gas. As such, the Stokes number quantifies the aerodynamic behavior as encoded in particle size and gas density (see Epstein 1924, or e.g., Chiang & Youdin (2010) for a review). Integration of Eq.
(1) yields a radius evolution and contraction timescale on which the particle reaches the cloud center.

The diffusion limited collapse criterion
If the cloud is embedded in turbulent gas, and its particles are coupled to the gas i.e., St < 1, particles will be subject to turbulent diffusion. We quantify the strength of this diffusion with the dimensionless parameter δ, such that the diffusion coefficient is given by Here we introduced sound speed c s and gas pressure scale height Diffusion is assumed to be isotropic. We discuss and test this assumption in Appendix A. Following Fick's second law for diffusion, particles are diffused over the length scale r with the diffusion timescale (Youdin & Lithwick 2007) t diff = r 2 δc s H .
Thus, cloud collapse will be prevented by turbulent diffusion if t diff < t con , or Assuming the cloud extends to the vertical particle scale height H p , i.e., if H p ≈ r, vertical integration of ρ c yields the cloud particle column density Σ p,c , i.e., Thus, we can transform Eq. (7) to a diffusion-limited stability criterion If the radius r of a cloud with column density (mass per area) Σ p,c falls below L diff , collapse will be prevented by turbulent diffusion.

Cloud stability against tidal shear
Whether a particle is gravitationally bound to the cloud or disrupted by tidal shear originating from the central star may be assessed by comparing the cloud's self-gravitational force to the force of tidal gravity. Selfgravity dominates when where a is the separation between the cloud and the star. This criterion may be conveniently re-written using the Hill-radius The cloud will be stable against tidal shear if its density, ρ c , is larger then its local Hill density, i.e., (Klahr & Schreiber 2015) Using Eq. (8), the cloud radius r for a given cloud column density Σ p,c must not exceed for the cloud to be stable against tidal shear.

Cloud radii subject to gravitational instability
We combine the diffusion criterion in Eq. (9) and the tidal shear criterion in Eq. (13) to conclude that for local instability there must exist an r, such that L diff < r < L hill , i.e., δ St This is equivalent to The right hand side of Eq. (15) is very reminisent of the inverse Toomre Q parameter (Toomre 1964) where Σ g is the gas surface density. For Q < 1 the gas disk is subject to gravitational instability and fragmentation, since neither gas pressure nor tidal forces may prevent the gas disk's collapse (see e.g., Baehr et al. 2017). Next, we can define the total metallicity where Σ p and Σ g are spatially averaged particle and gas column densities respectively, i.e.
where ρ p and ρ g are particle and gas volume densities respectively. We express the local metallicity of the particle clump as The dimensionless quantity characterizes potential radial and azimuthal enhancement in particle column density due to, for example, streaming instability. We note that for the purposes of our paper, Z is defined by Eq. (17) at each location in the disk and thus does not necessarily correspond to the stellar [Fe/H] and could in fact vary temporarily and spatially due to dust drift and growth processes. We can transform Eq. (15) to Eq. (20) is a criterion for instability. It neatly highlights the ambivalent effect of streaming instability on planetesimal formation, of both enhancing local particle densities Chiang & Youdin 2010;Simon et al. 2016) thus increasing , while also providing the cloud with turbulent diffusion δ . Furthermore, high metallicities favor collapse, which is in line with e.g. Youdin & Shu (2002); Johansen et al. (2009). The critical value of Q p = 1, which is equivalent to ρ c = ρ H and L hill = L diff , corresponds to the diffusion-limited collapse criterion of Klahr & Schreiber (2015); Klahr & Schreiber (2019) and the attached marginally critical length scale, given by This can be derived by setting the three terms in Eq. (14) equal to each other. However, for Q p < 1, more modes become unstable leading to a range of unstable scales that are both larger and smaller than the critical length scale. This may lead to the initial planetesimal size distributions seen in e.g. Metallicity Z and dominating Stokes number St can be informed from dust-evolution models that contain growth, drift and fragmentation (e.g., Birnstiel et al. 2010Birnstiel et al. , 2012Lenz et al. 2019;Powell et al. 2019) and are therefore often treated as input parameters in planetesimal formation models, as is the Toomre Q, which parameterizes disk mass. On the contrary, the diffusion coefficient δ and metallicity enhancement are local properties which cannot be treated as input parameters. However, both can be related directly to the scales regulating local particle densities in planetesimal-forming disks. These scales are set by drag instabilities, in particular the particle Kelvin-Helmholtz instability (Sekiya 1998;Johansen et al. 2006;Chiang 2008) and the nonlinear saturation of streaming instability (Youdin & Goodman 2005;Squire & Hopkins 2018a;Yang & Johansen 2014). Thus, investigating how these two instabilities interplay and regulate local enhancement and diffusion is crucial for understanding planetesimal formation in the scenario of local gravitational collapse.

PARTICLE DENSITIES REGULATED BY DRAG INSTABILITIES
Protoplanetary disks contain gas and dust, both of which can be treated as fluids. The Euler equations of the two fluids are coupled via the drag force exerted by gas onto particles as well as its back-reaction by the particle flow onto gas. We define the position vector in the disk in cylindrical coordinates centered around the star, i.e. r = ar + φφ + zẑ. The set of hydrodynamic equations then reads (Youdin & Goodman 2005) Here, we denote the gas and particle velocities as u and v respectively. We have introduced gas pressure P , the local dust-to-gas ratio and the relative velocity between particles and gas While particles (in the absence of gas) orbit with Keplerian velocity radial stratification due to the pressure force in Eq. (23) causes the gas (in the absence of particles) to rotate with a sub-Keplerian velocity of where the pressure support parameter, η, for a disk with aspect ratio is given by Alternatively, the pressure gradient can be quantified by the dimensionless parameter which leads to the relations and ηa = 1 2 haβ = 1 2 βH.
Note that for a typical gas density profile exponent of around unity (e.g., Andrews et al. 2009), β ∼ h. A fiducial value is h = 0.1, leading to β = 0.1 and η = 0.05 · h = 0.005 (e.g., . We find β to be the more convenient representation for the pressure gradient in our work, since length measurements become a fraction of gas scale height H after Eq. (34), and are as such easily discernible. In the following sections, we discuss how the length scale 1 2 βH in Eq. (34) is characteristic for drag-instabilities, such as KHIs or SIs.
3.1. Vertical particle extent regulated by Kelvin-Helmholtz instability Eqns. (28) and (29) can only describe particle and gas orbital velocities correctly as long as the two are not significantly subject to frictional coupling. For strongly coupled particles with St < 1, the equilibrium azimuthal particle velocity v 0,y of the combined fluid is (e.g. Nakagawa et al. 1986;Sekiya 1998;Youdin & Shu 2002) We note that Eq. 35 depends on the local dust-to-gas ratio, µ, such that the azimuthal velocity limits to Eq. 29 when gas dominates and to Eq. 28 when dust dominates. The restriction St < 1 is valid in the context of the herein presented simulations and is also justified physically by dust evolution models like Birnstiel et al. (2010Birnstiel et al. ( , 2012. Since particles are not vertically supported by a pressure gradient, they settle towards the mid-plane, resulting in a strongly z-dependant dust-to-gas ratio µ(z). In the mid-plane, where µ(z) is maximal, Eq. (35) implies that both the particles and the gas will orbit at velocities close to Keplerian, whereas in layers above the midplane their orbital velocities are reduced (by ηv K in dustfree layers) . The resulting vertical gradient in orbital velocity leads to Kelvin-Helmholtz instabilities (KHIs), sometimes known as particle vertical shearing instabilities, that vertically mix and stir up particles (Sekiya 1998;Sekiya & Ishitsu 2000, 2001Youdin & Shu 2002;Chiang 2008). We will follow Chiang (2008) in their analytical consideration of the KHI.
Whether or not the dust layer is subject to KHI is commonly assessed with the Richardson number Ri, which compares buoyancy oscillations (for incompressible perturbations given by the Brunt-Väisälä frequency) to the rate of vertical shearing (Chandrasekhar 1961). It is given by where g z = −Ω 2 z is the vertical component of stellar gravity, and ρ = ρ g + ρ p . If the Richardson number falls below a critical value, i.e. if Ri < Ri crit = 1 4 (37) for Cartesian flows (Chandrasekhar 1961;Howard & Maslowe 1973;Li et al. 2003;Chiang 2008), KHIs trigger and dust parcels are overturned and mixed. The exact value of the critical Richardson number is problem dependant. Johansen et al. (2006) found a critical value of Ri crit ∼ 1 when numerically studying self-sustained Kelvin-Helmholtz instability in protoplanetary disks. As dust settles much closer to the mid-plane than gas ∂ρ/∂z ≈ ∂ρ p /∂z. Hence, the Richardson number in protoplanetary disks becomes (Chiang 2008) where we plugged in Eq. (35). For Ri = const, which is expected if settling times are much longer than KHI growth rates (Johansen et al. 2006), integration of Eq. (38) yields a vertical profile for the dust-to-gas ratio µ(z) dependant on the mid-plane dust-to-gas ratio µ 0 , i.e. (Chiang 2008) µ(z) = 1 (1 + µ 0 ) 2 + 1 Ri Solving µ(z) = 0 for z yields the maximum extent of the particle layer (Chiang 2008) Eq. (40) predicts the vertical height of the particle layer for a given mid-plane dust-to-gas ratio and constant Richardson number. For µ 0 1, z max limits to 0.5 √ RiβH. Particles are expected to settle until the mid-plane is dense enough and the critical Richardson number is reached. Mixing and vertical excitation are now energetically favorable and counteract further settling, and thus particles are lifted to z max . When particles are stirred up, the Richardson-number of the flow rises to sub-critical values again such that the Kelvin-Helmholtz instability can not grow further. This leads to a selfregulating steady-state where the flow Richardson number returns to the critical value and the maximum extent is again given by Eq. (40). As Eq. (40) assumes a constant Ri-flow, we treat the steady-state flow Richardson number as a parameter and investigate which value best fits our numerical setup.
Note, that this consideration does not account for the high-density mid-plane cusp found by Sekiya (1998); Youdin & Shu (2002); Gómez & Ostriker (2005) for super-solar metallicities. Indeed, for sufficiently high metallicities, Youdin & Shu (2002) showed that this overdense particle mid-plane can by massive enough to gravitationally fragment and form planetesimals (also see e.g., Chiang & Youdin 2010). These enhancements in metallicity may be found in the inner disk due to radial drift of particles (Youdin & Shu 2002), in particle traps such as pressure bumps (e.g., Rice et al. 2006;Pinilla et al. 2012;Taki et al. 2016) or as the disk dissipates due to photoevaporation (Gorti et al. 2015).
However, none of these processes are thought to generically operate and sufficiently increase metallicities across a broad range of disk environments.

Streaming instability
The streaming instability (Youdin & Goodman 2005), therefore, may offer a promising path of ubiquitously forming planetesimals (Simon et al. 2016;Nesvorny et al. 2019). This section briefly summarizes the nature of streaming instability, recent developments and its importance in the context of the collapse criterion presented in Eq. (20).
The streaming instability is a linear instability that occurs when the particles' back-reaction onto gas are taken into account (Youdin & Goodman 2005;Jacquet et al. 2011;Zhuravlev 2019). For marginally coupled particles and local dust-to-gas ratios exceeding unity, the SI displays rapid growth rates, thus strongly enhancing local particle densities (Carrera et al. 2015;Yang et al. 2017) and triggering planetesimal formation (Bai & Stone 2010a;Johansen et al. 2015;Simon et al. 2016;Schäfer et al. 2017;Nesvorny et al. 2019). Whether or not this critical dust-to-gas ratio can be achieved depends on the metallicity and pressure gradient (Sekiya & Onishi 2018), which determine the vertical particle height set by KHI. The functionality of the SI can be understood in the context of resonant drag instabilities, a class of instabilities where the relative dust-gas motion is in resonance with a pressure perturbation in the gas (Squire & Hopkins 2018b). As showed by Squire & Hopkins (2018a), for the SI, the equilibrium solution for the relative dust-gas streaming velocity provided by Nakagawa et al. (1986) is in resonance with an epicyclic perturbation in the gas.
Recently, the role of the linear SI in the context of planetesimal formation was cast into doubt by two additional considerations that were omitted in the stabil-ity analysis by Youdin & Goodman (2005) and likewise in most numerical work. First, the inclusion of external turbulence strongly suppressed SI growth rates, in particular on small scales (Umurhan et al. 2019). Secondly, Krapp et al. (2019) showed that the inclusion of multiple particle species also damps growth rates significantly. Still, in dead-zones, where turbulence is entirely self-induced by SI and KHI, and if the particle size distribution is sufficiently top-heavy and narrow as suggested by recent dust evolution models (e.g. Birnstiel et al. 2011;Okuzumi et al. 2012;Birnstiel et al. 2016), SI remains an important processes and continues to be a key aspect of global models for planetesimal formation such as Drążkowska et al. (2016).
Therefore, understanding the ambivalent role of SI for gravitational collapse is crucial. On the one hand, its linear phase and the subsequent onset of turbophoresis 1 can concentrate particles and increase local densities to Hill-density and beyond. On the other hand, the selfinduced turbulence intrinsic to the nonlinear saturation of SI also diffuses particles apart and thus limits collapse on small scales (Klahr & Schreiber 2015;Klahr & Schreiber 2019). In fact, the fastest linearly growing modes are always on the smallest scales (Youdin & Goodman 2005;Squire & Hopkins 2018a;Umurhan et al. 2019) and therefore typically diffusion-limited. As the complex interplay of these oppositional processes during 3D nonlinear SI remains to be studied in detail, we propose a straight-forward estimate for a characteristic scale of SI-induced over-densities.
3D simulations of nonlinear SI universally feature large-scale azimuthally elongated filaments as linear modes are sheared apart (see e.g., Johansen et al. 2012;Yang & Johansen 2014;Simon et al. 2016;Li et al. 2018;Abod et al. 2018, or Sect. 6). Previously, their typical separation (feeding zone) has been investigated numerically by Yang & Johansen (2014), and their radial extent has been connected to the operating scale of SI (Umurhan et al. 2019). In the following, we apply the Richardson criterion used to derive z max for the KHI in Eq. (40) to SI filaments to derive an analogous estimate for a characteristic radial scale.
To order of magnitude we may neglect Keplerian shear and the radial gradient in azimuthal velocity follows from a radially-dependent dust-to-gas ratio µ = µ(x) that by construction is present in SI filaments. Motivated by the findings of Squire & Hopkins (2018a), we take the restoring force per unit mass as g x = −Ω 2 x 1 Turbophoresis (Caporaloni et al. 1975) is the tendency of particles coupled to a fluid flow to accumulate in regions of low turbulence (also see e.g., Reeks 2014; Belan et al. 2014).
from radial epicyclic oscillations. Under the assumption of a constant Richardson number, we can exactly match the argument presented in Sect. 3.1 and arrive at a maximum radial extent x max for SI filaments given by We point out, that while the simplification µ 0 ≥ 1 is per definition true as otherwise the SI would not have produced an over-dense filament in the first place, the assumption of a constant Richardson-number is likely not correct due to the asymmetry that arises when Keplerian shear is taken into account. On the scales of SI-filaments, the linearized Keplerian shear velocity is ∼ ηv K , comparable to the velocity difference due to the particle density gradient, and adds to the relative dust-gas streaming velocity on one side of the filament and subtracts on the other. Also, as peak densities during the nonlinear SI often significantly exceed unity, it seems likely that an analogous structure to the highdensity mid-plane cusp found by Sekiya (1998) forms, which is neither considered in the vertical nor the radial Richardson criterion. For these reasons, we stress that this radial scale has to be taken as a rough estimate. Still, we show in Sect. 7 that having an analytic prediction for the radial scale, even if approximate, is very useful for retracing local diffusivity and enhancement to metallicity and pressure gradient, and thus for evaluating our collapse criterion in Eq. (2) without a detailed knowledge of the turbophoretic state of the nonlinear SI.

NUMERICAL SETUP
For our numerical studies, we employ the Pencil Code 2 , which uses sixth-order spatial derivatives and third-order Runge-Kutta integration in time to solve hydrodynamic equations on an Eulerian grid with Lagrangian particles (for details, also see Brandenburg & Dobler 2002;Brandenburg 2003). All presented simulations are conducted in a shearing-box approximation, i.e. a locally Cartesian coordiante frame that is corotating with Ω at distance a to the central star, where x, y a (for the derivation of shearing box approximation equations see e.g., Goldreich & Lynden-Bell 1965;Umurhan & Regev 2004). Here, the unperturbed azimuthal velocity in the local frame is given by quantifies the linearized radial shear.

Gas and particles
For all simulations, we consider a non-magnetized gas obeying an isothermal equation of state This is a good assumption for disk regions that are dominated by irradiation instead of accretion heating (Kratter & Murray-Clay 2011). Initially, the gas is evenly distributed in the radial and azimuthal directions, while vertically stratified due to the vertical component of stellar gravity causing mid-plane sedimentation. We achieve sub-Keplerian gas velocities by artificially imposing a radial pressure gradient to support the gas. This is quantified by β defined in Eq. (32), which is added to the gas Euler equation (see Eq. (46)). Boundary conditions are shear-periodic, periodic and periodic in x, y and z respectively. We defer to Li et al. (2018) for a discussion of the effect of vertical boundary conditions on streaming instability. We seed the box with Lagrangian super-particles following a Gaussian with a scale height of H p,0 . We test different values for H p,0 and discuss our choices for subsequent simulation runs in Appendix B.1. Each superparticle is characterized by its position x p = (x p , y p , z p ) and velocity v = (v x , v y , v z ) (measured with respect to u 0,y ), and represents a swarm of identical physical solids interacting with the gas as a group. The Triangular Shaped Cloud (TSC) scheme (Hockney & Eastwood 1988;) is applied to smooth out super-particle properties to the neighboring grid cells. Particle properties are characterized by the input parameters of particle Stokes number St from Eq. (2) and disk metallicity Z from Eq. (17). Note that the Pencil Code up-scales particle masses such that the total dustto-gas ratio in the boxes matches Z even if the vertical domain size does not include the entire gas disk.
We consider particles with a fixed St. For streaming instability for a mixture of different particle species, we defer to Schaffer et al. (2018).
We also note that while the particles may display significant clumping in our simulations, the gas is comparatively static throughout all simulations.

Evolution equations
In our Pencil Code setup, the gas obeys the continuity equation where f D (ρ g ) represents an artificial hyper-diffusivity. The gas velocity u relatitive to u 0,y is evolved via the equation of motion Here, the terms from left to right are: the velocity time derivative, advection due to velocity perturbation, advection due to shear flow, pressure gradient, artificially imposed centrifugal support due to global radial pressure gradient, the combination terms from linearized stellar gravity, centrifugal force, and Coriolis force, the back-reaction of the drag force exerted on gas by particles, and an artificial hyper-viscosity. Hyper-viscosity and diffusivity are used to stabilize the code and smooth out steep gradients while preserving the power at the larger scales (See Appendix B of Yang & Krumholz 2012).
Similarly, the evolution equations for the particles read and with the key difference being, that particles are not subject to pressure gradient.

Self-gravity
Self-gravity is implemented by solving the Poisson equation with the fast Fourier transform algorithm (Gammie 2001) containing gravitational softening. We consider the gravitational potential Φ(r) at arbitrary position r of both gas and particles. The relative strength of tidal shearing compared to self-gravity is parameterized with the Toomre Q in Eq. (16) (Toomre 1964).
This implies a value for the gravitational constant G in code units, which is then plugged into the right hand side of the Poisson equation, i.e.
Note that Q relates to the self-gravity parameterĜ used by Johansen et al. (2012); Simon et al. (2016); Schäfer et al. (2017) to parameterize self-gravity viâ Lastly we note, that in our simulations even for Q < 1, the gas does not collapse as the chosen domain size does not include the critical unstable wave length of 2πH.

Units
In our simulations we set the code units for angular frequency Ω = 1, isothermal sound speed c s = 1, and initial mid-plane gas density ρ g,0 = 1. While the Pencil Code is agnostic to the choice of unit, we deem it useful to express our results in units that have physical meaning. Thus we choose the orbital period, which is related to the dynamical time via 2πΩ −1 , the gas scale height H, and the initial mid-plane gas density ρ g,0 as time, length, and density units respectively. As the orbital frequency Ω is given by we can calculate scaling relations of our code units, i.e.
where we used the isothermal sound speed given by with Boltzmann constant k B and a mean molecular weight ofm = 2.33m p , where m p is the proton mass. While the density scales freely with ρ g,0 for simulations where self-gravity is disabled, Eq. (16) introduces a scaling relation once self-gravity is introduced. Thus, the density unit can be expressed via

Simulation runs
We conduct multiple simulations with various different parameter choices, displayed in Tab. 1. We follow Simon et al. (2016) and choose a domain size of L x = L y = L z = 0.4H, which allows us to sufficiently capture multiples of the characteristic scale βH. The majority of our simulations are based on a relatively small numerical grid with a size of 64×64×64, thus sacrificing resolution in favor of computational time. Additionally, we explore a higher resolution of 128×128×128 for selected simulation setups. We denote the number of grid cells in radial, azimuthal and vertical direction with N x , N y and N z respectively.
For these choices, the grid cell size is ∼ 0.006H and ∼ 0.003H respectively, which for β = 0.1 and typical viscosity values may not be enough to sufficiently resolve the fastest growing streaming instability wave length (Umurhan et al. 2019). However, the marginally unstable scale for local gravitational collapse r crit in Eq. For low resolution simulations, the grid is seeded with one super-particle per grid cell on average. High resolution simulations are seeded with 0.1 super-particles per grid cell to save on computation time. As particles are expected to settle to a layer of thickness ∼ 0.04H after Eq. (40), corresponding to about a tenth of the domain size, the mid-plane layer will nevertheless have about one super-particle per grid cell.

VERTICAL SCALES AND DIFFUSION COEFFICIENT
Our first task is verifying that z max from Eq. (40) indeed well describes the vertical extent of the particle layer. Further, we investigate how turbulent particle velocities relate to particle scale height H p , and therefore diffusion δ -a key parameter in our collapse criterion in Eq. (20) -and z max .
For this purpose, we choose St = 0.005 to slow down streaming instability growth rates (Carrera et al. 2015;Yang et al. 2017), and investigate the particle layer in the KHI-dominated case for four different values of metallicity. Particles are initialized with a scale height of z p,0 = 0.025H and settle towards the mid-plane with settling time Appendix B.1 shows that the initial particle scale height does not affect the vertical extent in the self-regulated state.
We conducted a series of settling simulations with different metallicities (see Tab. 1), the final snapshot of four of which is displayed in Fig.1. Note that for high metallicities of Z ≥ 0.1, we only performed low resolution simulations to save computation time (see Appendix B.2 for a discussion of limitations due to the numerical resolution).
As expected from Eq. (40), the vertical extent of the particle layer is Z-dependant. We identify three regimes. For low metallicities, the particles may settle very thin before entering the self-regulating regime (z max ≈ 0.01H, and z max ≈ 0.02H for Z = 2 · 10 −4 and Z = 2 · 10 −3 respectively). Intermediate metallicities display the largest vertical extent (z max ≈ 0.03H for Z = 0.02). For high metallicities the particle scaleheight visibly falls below the analytic expectation from Eq. (40) (z max ≈ 0.025H for Z = 0.1). There, we also recognize the high-density particle cusp around the midplane from Sekiya (1998); Youdin & Shu (2002); Gómez & Ostriker (2005). While the Kelvin-Helmholtz instability attempts to stir up particles to z max , the mid-plane particle layer is so massive that stronger eddies are required for efficient particle lifting. Thus, only the surface layers can be stirred up effectively. As our consideration in Sect. 3.1 does not well describe the vertical extent for this high-metallicity case, we will exclude this regime from further analysis in this section.

Flow Richardson number
To appropriately relate particle scale heights to z max , it is worthwhile to first check whether the assumption of a constant Richardson number is appropriate for our system. Due to azimuthal and radial symmetry of the settling simulations, particle positions are projected onto the vertical axis. We calculate the Richardson number   Fig. 2. The marker shape indicates if the system was able to reach a KHI-regulated steady state. Low metallicitiy simulations presumably have not fully settled yet and the correlation time is still decreasing. High metallicity simulations develop a high-density midplane cusp (Sekiya 1998;Gómez & Ostriker 2005;Johansen et al. 2006), and the concept of a steady-state self-regulated by KHI is not applicable (approximately indicated by the gray region). The bottom panel depicts zmax/e from Eq. (40) for Richardson numbers of Ri = 0.75 ± 0.5.

at height z after Eq. (38) via
This calculation is limited to our resolution. As such, ∆µ is the difference in dust-to-gas ratio of two vertically adjacent slices, ∆z = L z /N z , and z-positions of interest fulfill z ∈ n · L z /N z with N z > n ∈ N. To avoid statistical effects, we only calculate the Richardson number at a given z if the two adjacent slices contain more particles than some threshold value, which was chosen to be 500. The height dependant Richardson number after Eq. (57) for the final snapshot of high resolution settling simulations (N x,y,z = 128, St = 0.005) for three different metallicities Z = 0.0002, Z = 0.002, and Z = 0.02 is shown in the middle panel of Fig. 2. Its top panel depicts the number of particles per slice in z. Note, that although the mid-plane contains about three time as many particles for Z = 0.0002 compared to Z = 0.02, the mid-plane dust-to-gas ratio is smaller as individual super-particle masses are lower by two orders of magnitude.
The vertical profile of the flow Richardson number qualitatively matches results by e.g., Johansen et al. (2006); Bai & Stone (2010a). Our estimate of the Richardson number supports findings by Johansen et al. (2006) that the appropriate critical Richardson number is indeed greater than the literature value of 1/4. Thus, while we continue to treat the Richardson-number as a parameter, we deem the assumption of it being constant required to derive z max in Eq. (40) to be reasonable.

Diffusion, correlation time and particle scale height
Next, we relate the vertical particle extent z max to vertical diffusivity and show than root-mean-square velocities of particles behave accordingly. For a discussion of the radial diffusivity and our assumption of sphericallysymmetric diffusion, see Appendix A.
The dimensionless diffusion coefficent δ is commonly measured using the root-mean-square velocity, the vertical-component of which is given by (Carrera et al. 2015; with v z,i being the vertical component of the velocity of particle i. It relates to the dimensionless diffusion coefficient via (Johansen et al. 2006b;) Figure 5. Same as Fig.4 but for low resolution simulations. The red rectangle marks the low resolution fiducial run for which self-gravity was turned on. Note that this simulation would also fit between the two right most panels in the bottom row.
where τ corr is the dimensionless correlation time. For St > 1 particles, this correlation time is just given by the Stokes number St (Youdin & Lithwick 2007). However, in our case for well-coupled particles with St < 1, particle dispersion does not originate from random movement, but from gas turbulence. Thus, eddy turnover times will dictate correlation times (Youdin & Lithwick 2007). For Kolmogorov turbulence, where particle turbulent kinetic energy is dominated by the largest eddies, whose turnover time is just the orbital time, then τ corr = 1. In numerical simulations however, this is often not a good assumption. For example  found values of 0.1 < τ corr < 1 depending on their simulation setup. Hence, we deem it worthwhile to measure appropriate correlation times for our simulation setup. Equating settling time in Eq. (56) and diffusion time in Eq. (6) yields a well known expression for the particle scale height (Youdin & Lithwick 2007) In fact, Eq. (60) is the definition of δ z (and via our isotropy assumption also δ x ) in the context of our work. Plugging in Eq. (59) yields an expression for the dimensionless correlation time We determine v rms according to Eq. (58) and measure the particle scale height H p using (Carrera et al. 2015) where z i is the vertical position of particle i, and z its mean over all i (here z = 0). The dimensionless correlation times required to match root-mean-squared velocities of the three high-resolution settling simulations at the final snapshot to the respective particle scale heights are τ corr (Z = 0.0002) ≈ 2, τ corr (Z = 0.002) ≈ 0.3, τ corr (Z = 0.02) ≈ 0.1. This is in line with typical correlation times measured by e.g. .
The top panel of Fig. 3 plots the final snapshot correlation times vs metallicity for both low and high resolution simulations. The panel also distinguishes between simulations where the correlation time was able to reach a constant, non-evolving value within the simulation run time (steady state) and simulations where the correlation time was still evolving (no steady state), i.e. the particles are still in the settling phase and the flow has likely not yet reached the KHI-regulated equilibrium. The two simulations with super-solar metallicities of Z = 0.1, Z = 0.2 which develop the high-density mid-plane cusp are excluded from this classification.
Having verified that (vertical) root-mean-squared velocities correlate with the (vertical) diffusivity and thus the particle scale height as anticipated, we proceed with Eq. (60) and use H p to measure diffusivity.
The bottom panel of Fig. 3 plots the final snapshot particle scale-height vs metallicity Z. In order to overlay the analytic prediction from the self-regulated KHI steady state, we require a measurement for the midplane dust-to-gas ratio µ 0 . We find that the approximations µ 0 ≈ (H/H p )Z well matches our numerical result for µ 0 (also compare to Eq. 3 of Sekiya & Onishi 2018).
Moreover, to account for the fact that z max by construction is the maximum particle extent, we must divide by an order unity factor to yield a quantity that relates to the scale height. The bottom panel of Fig. 3 suggests, that z max /e is a good match for low and intermediate metallicities that do not possess a high-density mid-plane cusp.
Together with Eq. (60), Note again, that the first equality in Eq. (63) is just the definition of δ z in the context of our work. Relating δ z to root-mean-squared velocities requires knowledge of correlation times, which we will defer to future work. The second equality relates the measured particle scale height to its analytic prediction from Chiang (2008), that was reviewed in Sect. 3.1. We can combine Eq. (63) with Eq. (21) to find that, under the assumption of spherically symmetric diffusion (see Appendix A), the critical radius scale is just . Shown is the mean particle scale height at 40 orbits averaged over each x for high and low resolution simulations depicted in Fig.4 (orange) and Fig. 5 (green) as well as high resolution settling simulations in Fig. 1 (blue). Error bars show the standard deviation originating from the radial average. Like in the bottom panel of Fig. 3, we plot the analytic expectation for zmax/e for different values of Ri for comparison. The gray region approximately indicates metallicities for which we do not expect the analytic expression to match our numerical results.

SCALES OF STREAMING INSTABILITY FILAMENTS
Whereas the Kelvin-Helmholtz instability regulates vertical enhancements of the particle layer, the streaming instability does so in the radial direction by producing azimuthally elongated particle filaments (see e.g., Simon et al. 2016;Sekiya & Onishi 2018). In this section we investigate the scales of these filaments. ters the vertical extent of the particle layer. Then, we confirm our suspicion that the radial width of streaming instability filaments indeed never exceeds L max from Eq.(41). All simulations presented in this section are performed with St = 0.2, which roughly corresponds to the maximum particle size available in protoplanetary disks when considering drift and fragmentation limits Powell et al. 2019). Figures 4 and 5 show the final snapshot of high resolution (N x,y,z = 128) and low resolution (N x,y,z = 64) simulations for a range of β and Z respectively. 6.1. Vertical particle distribution in the presence of streaming instability The top panels of Figs. 4 and 5 display azimuthally integrated particle densities. The wave-like structure characteristic to diagonal streaming instability wave modes emerges (Compare to e.g., Li et al. 2018). As this feature needs to be accounted for when calculating the particle scale height, H p,j is calculated via Eq. (62) at each slice j in x (N x slices with radial extent L x /N x ) and then averaged, i.e.
The resulting final snapshot particle scale heights are depicted in Fig. (6) against pressure gradient β and Z. The high resolution settling simulations are plotted for comparison. The top panel of Fig. (6) shows an approximately linear scaling of H p in β, as expected from Eq. (40). For the trivial case of β = 0, the relative dust-gas velocity is zero and both KHI and SI lose their energy source such that razor-thin settling occurs. The bottom panel of Fig. (6) reproduces the bottom panel of Fig. 3. For low and intermediate metallicities z max describes H p well, even in presence of SI. For Z > 0.01, the high-density mid-plane cusp develops and the particle scale height decreases below the expected height. Thus, we expect the approximation in Eq. (63) to hold both in presence and in absence of streaming instability.

Radial scales and enhancement
The bottom panels of Figs 4 and 5 display vertically integrated particle densities. Over-dense streaming instability filaments develop for Z ≥ 0.02 and 0 < β ≤ 0.13, which is in line with e.g. Bai & Stone (2010b).
To quantify the width of a filament, we define the slice density as the vertically and azimuthally integrated density, i.e., The temporal evolution of the radial distribution of the slice density for the high resolution fiducial run is shown in Fig. 7. The slope of the filaments correlates with the local dust-to-gas ratio according to Nakagawa et al. (1986). We also identify radial epicyclic oscillations in particular at early times. The radial extent of the a filament ∆x can be defined as Here, Σ p,0 is the mean particle slice density per grid cell extent x cell = L x /N x . Note that we must limit the radial resolution of this procedure to the gas grid cell size to remain selfconsistent.
The top panel of Fig. 8 shows the radial extent of the most dense filament at the final snapshot vs pressure gradient β for those simulations that developed distinct streaming instability filaments with Z = 0.02, i.e. for three different Richardson numbers for simulations with Z = 0.02 that showed distinct SI-filaments (see main body for details). The extent ∆x/2 was measured once per orbit and then averaged. The error bar is the resulting standard deviation Bottom panel: temporal evolution of after Eq. (68) for the simulations FidRun_64 with Nx,y,z = 64, Z = 0.02, FidRun_128 with Nx,y,z = 128, Z = 0.02, and LowZ_128 with Nx,y,z = 128, Z = 0.02. We only plot times before self-gravity is turned on, which is at 40, 60, and 40 orbits respectively (see Sect. 7).
FidRun_128, Beta_128 with β = 0.05, FidRun_64, and Beta_64 and β ∈ {0.05, 0.07, 0.1, 0.13} (see also Tab. 1,and Figs. 4 and 5). The numerical data well match L max in Eq. (41). To the extent that the vertical and radial scale heights of over-dense particle filaments are set by the same scale, Eq. (60) implies that vertical and radial diffusivity must also be approximately equal and the approximation in Eq. (63) is valid for radial diffusivity as well.
The resulting dimensionless enhancement parameter defined in Eq. (19) is estimated which by construction is larger than unity. Here, we introduced a correction coefficient p ≥ 1. While the term in parenthesis assumes a azimuhthally isotropic filament, with radially uniformly distributed material, the correction exponent p accounts for potential additional azimuthal enhancement, as well as the fact that a radially uniform distribution tends to underestimate the local enhancement. Note, that while the latter effect could be corrected via a multiplicative factor, the former requires an exponential coefficient. As it is challenging to account for these corrections rigorously, we treat them with a single correction exponent and set it to best match our simulation results. This procedure should be regarded as a first step and could be refined in future assessments of the collapse criterion.
The bottom panel of Fig. 8 shows the evolution of for the three simulations, for which self-gravity is turned on in Section 7 at 40 orbits (FidRun_64, Z = 0.02 and LowZ_128, Z = 0.01) or 60 orbits (FidRun_128, Z = 0.02). We set p = 2.5 for the two fiducial runs with Z = 0.02. Our methodology of measuring enhancement in Eq. (68) is specifically designed for enhancements within azimuthally elongated filaments. As such, in setups where those are not distinctly visible, as is the case for Z = 0.01, our procedure does not work, since the projection onto the x-axis removes all information about density enhancements that are rotated in the x-y-plane. Thus, we use p = 4.5 for the low metallicity run with Z = 0.01, which best reproduces locally measured dustto-gas ratios, to still get comparable values and assess the collapse criterion (see Sect. 7).
As expected for Z = 0.01, the local enhancement never exceeds an order unity factor of ∼ 5. This is in line with Carrera et al. (2015); Yang et al. (2017) who concluded that streaming instability can not effectively concentrate particles for this metallicity. For Z = 0.02, the enhancement parameter fluctuates around 10 100. There is no qualitative difference between the two tested resolutions to be found. We attribute the peak in at 55 orbits for FidRun_128 to statistical fluctuations typical to the nonlinear phase of streaming instability.

SELF-GRAVITY AND PLANETESIMAL FORMATION
Next, we numerically test the collapse criterion for planetesimal formation discussed in Sect. 2 and intro- duce self-gravity to our simulation. Equation (20) defined Q p , a dimensionless quantity assessing the collapse of a particle cloud in analogy to Toomre-Q for the gas disk. For collapse, Q p must be less than one. Thus, after Eq. (20), for collapse, the Toomre-Q of the system must fall below a critical value of Q crit given by where we used Eq. (60) and assumed δ = δ x = δ z (see Appendix A). Note that we choose to test for a critcial Q, rather than a critical Q p as Q is a direct input parameter in the Pencil-Code via the gravitational constant G (see Sect. 4.3).
We verify the validity of this collapse criterion for three simulation setups: N x,y,z = 128, Z = 0.02, N x,y,z = 128, Z = 0.01, and N x,y,z = 64, Z = 0.02 by testing 4 different values for Q around a analytic estimation of Q crit . For all setups, β = 0.1 and St = 0.2. We calculate Q crit at the time self-gravity is activated by measuring H p and via Eq. (65) (see Fig. 6) and Eq. (68) (see bottom panel of Fig. 8) respectively. Similar to Simon et al. (2017); Schäfer et al. (2017);Nesvorny et al. (2019), self-gravity is turned on at an arbitrary time during the non-linear saturation of streaming instability. This method must be regarded as a qualitative proof of concept and not as a quantitative verification of the collapse criterion. As such, we discuss potential problems with this procedure in Sect. 8.3. Figure 9 shows the evolution of the maximum particle density for the three setups where self-gravity was included. The analytical prediction for Q crit at the time self-gravity is turned on is noted in the top of each panel. Before self-gravity is turned on, Q is not defined and all simulations behave identically.
For the high-resolution setups, the density increases significantly at the time self-gravity is turned on if Q Q crit , which is indicative of collapse and planetesimal formation (Compare to e.g., Simon et al. 2016;Klahr & Schreiber 2019). For the low resolution simulation, Q = 16 does not lead to collapse, which suggests that this simulation may not be converged (see Appendix B.2). If the critical radius is not quite resolved, lower Q-values are required to render larger scales unstable.
The final snapshot of the vertically integrated particle density is shown in Fig. 10. There, planetesimal candidates are highlighted with yellow circles using a simple peak finder algorithm that identifes local maxima in the vertically-integrated particle surface density. We confirmed that planetesimal candidates indeed are gravitational bound by comparing their dispersive kinetic energy with their gravitational binding energy. We expect Figure 10. Vertically integrated particle densities for self gravity simulations. Top, center and bottom row show simulations based on FidRun_128 with Nx,y,z = 128, Z = 0.02, FidRun_64 with Nx,y,z = 64, Z = 0.02, and LowZ_128 with Nx,y,z = 128, Z = 0.02 respectively. The respective snapshot at the time self-gravity is turned on is depicted in the left column. Panels on the right five columns show the final snapshot for different Q-values (Compare with Fig. 9). Yellow circles highlight planetesimal candidates found with a simple peak finder algorithm. For reference, the Toomre value of the Minimum Mass Solar Nebula (MMSN) is ∼ 30 (Weidenschilling 1977;Hayashi 1981).
the initial planetesimal mass M pls to scale with the cube of the size of the unstable region r, i.e. M pls ∝ ρr 3 . In general, the range of radii r subject to instability will to depend on the value of Q p in Eq. (20), but we expect the most unstable radius to be given by Eq. (21), which as we argue in this paper, is of order ηa -the size of both the vertical particle height set by KHI and the radial width of SI filaments. For a disk evolving gradually into an unstable configuration, we expect the density at initial collapse to be ρ ≈ ρ H , so that M pls scales as ρ H (ηa) 3 (also compare to Chiang et al. 2014). A detailed numerical investigation of the initial mass function in the context of the herein presented collapse criterion would require more sophisticated clump finder (see e.g. Li et al. 2019) and is subject to future work. Fig. 10 shows, that collapse can occur also in the absence of streaming-instability-induced clumping for a metallicity of Z = 0.01, if Q is super-critical (this is in contrast to e.g., Johansen et al. 2009). Likewise, even if streaming instability produces high local particle densities (for Z = 0.02), collapse does not occur if Q is too high. Due to Q ∝ ρ −1 g,0 , this implies that massive disks can form planetesimals in dead-zones without the prerequisite of streaming instability and that very lowmass disks cannot form planetesimals even if they are metal-rich.

Towards an universal criterion for planetesimal formation
We numerically showed, albeit only for a limited range of parameters, that local gravitational collapse to form a planetesimal requires the diffusion-and tidal-shearlimited collapse criterion presented in Eq. (20) to hold true. Applying the criterion to global disk models is not obvious as both and δ are local properties that are potentially dependent on resolution and included physics. However, in Sect. 5 we showed that the ratio δ/St can be related to L max via Eq. (63). Thus, the collapse criterion in Eq. (20) can be expressed as If Q p 1, i.e. for small pressure gradients and Qvalues, and for large metallicities and local enhance-ments, we expect collapse and planetesimal formation. This Toomre-like formulation of the collapse criterion highlights that cloud collapse and planetesimal formation do not fundamentally depend on the presence of streaming instability and interpreting streaming instability as a required catalyst for planetesimal formation is not appropriate. Instead, collapse is assessed by comparing disk mass (quantified by Q), local dust content Z and pressure gradient β. In fact, Sekiya & Onishi (2018) recently showed that combining pressure gradient and metallicity to a single parameter appropriately describes streaming instability clumping, so it is no surprise that this ratio also appears in our collapse criterion.
The criterion in Eq. (70) can also be utilized to assess if it is easier to collapse the gas disk or local particle clumps. If particle collapse is easier. For our fiducial pressure gradient value of β = 0.1, and for no local enhancements ( = 1), this would be the case for Z 0.01.

8.2.
Limitations and other procceses potentially affecting the particle layer By replacing diffusivity with particle scale height, one also eliminates the explicit dependency of the collapse criterion on Stokes number. We point out, that Stokes number still implicitly effects the collapse criterion by influencing the local enhancement due to streaming instability. Our work is most appropriate for scenarios of narrow, top-heavy particle size distribution. The validity of the collapse criterion for different Stokes numbers as well as for a particle size distribution (compare to e.g., Schaffer et al. 2018) which may strongly damp SI growth rates (Krapp et al. 2019) remains to be studied. Similarly, we defer a high-resolution test to future work (also see Appendix B.2).
Although the physical origin of KHI investigated in Chiang (2008) and in our work is vertical shear, we explicitly abstain from referencing to it as vertical shear instability. This term is commonly used for a gas-only instability active in protoplanetary disks that are able to cool sufficiently fast to allow the development of vertical gas perturbations (see e.g. Urpin & Brandenburg 1998;Nelson et al. 2013;Lin & Youdin 2015;Pfeil & Klahr 2019). Here, the vertical gradient in orbital frequency is not caused by dust-gas coupling, but instead by radial variations in temperature and entropy (Barker & Latter 2015), leading to vertically elongated modes (Arlt & Urpin 2004;Klahr et al. 2018). The vertical shear in-stability is explicitly not studied in our work, as we are exclusively considering locally-isothermal scenarios.
Further, thermally driven (subcritical) baroclinic instabilities (Klahr & Bodenheimer 2003;Petersen et al. 2007a,b;Klahr et al. 2018) and its linear phase the convective overstability (Klahr & Hubbard 2014;Lyra 2014;Latter 2016) are based on radially perturbed gas parcels carrying entropy and have previously been shown to affect the particle mid-plane layer. There, inward moving gas radiatively thermalizes with its environment, which leads to buoyancy oscillations that in turn transport entropy outward. This mechanism was shown to form vortices that can trap particles and thus significantly alter the particle distribution in the mid-plane layer (Manger & Klahr 2018;Klahr et al. 2018).
The gas vertical shear instability, the subcritical baroclinic instability, and the convective overstability, as well as others not yet mentioned, like the magnetorotational instability (see e.g. Balbus & Hawley 1991;Balbus & Hawley 1998;Davis et al. 2010;Bai 2011;Béthune et al. 2016), or zombie vortex instabilies (Marcus et al. 2015(Marcus et al. , 2016Umurhan et al. 2016;Lesur & Latter 2016) can introduce additional turbulence , which as shown by Umurhan et al. (2019) reduces growth rates of SI modes and thus may render particle clumping and planetesimal formation via SI-induced over-densities more challenging.
Thus, while the diffusion-limited collapse criterion from Klahr & Schreiber (2015); Klahr & Schreiber (2019) and Eq. (20) is generally applicable for gravitationally collapse of over-dense clumps, the herein presented numerical verfication thereof and specifically the expression in Eq. (70) is most appropriate for protoplanetary disk dead zones. In particular, our criteria apply when δ α, where the level of turbulence generated by other processes is given by the commonly-used parameter α (Shakura & Sunyaev 1973) such that the turbulent velocity dispersion of the gas is v turb = α 1/2 c s (e.g., Youdin & Lithwick 2007;Rosenthal et al. 2018). For the simulations in this paper, δ ∼ 10 −7 − 10 −5 depending on metallicity (see Figs. 2, 6) and Stokes number after Eq. (60).
Recently, the DSHARP survey (see e.g., Andrews et al. 2018) and likewise the Ophiuchus disk survey employing ALMA (Cieza et al. 2019) showed that rings are an abundant substructure in protoplanetary disks, which may hint at the existence of planets carving gaps in the gas disk, which leads to a pressure bump collecting the radially drifting dust particles (Pinilla et al. 2012). Recently, Stammler et al. (2019) were also able to explain the observed rings as a by-product of ongoing planetesimal formation. Regardless, it seems likely that pressure bumps constitute a common phenomena in protoplanetary disks (Dullemond et al. 2018), and one must therefore ask about the applicability of our collapse criterion therein. Pressure bumps lead to a local decrease in pressure gradient β. As a result, the relative dust-gas velocity decreases and KHI weakens leading to a thinner particle layer. As this effect is reflected in the proportionality of Q p in β in Eq. (70), we expect pressure bumps to favor planetesimal formation, which agrees with findings by Dittrich et al. (2013) who investigated planetesimal formation in MRI-induced zonal flows and pressure bumps.
Lastly, we acknowledge that the evolution of formed clumps has not been studied in our work. This includes planetesimal growth, mergers and accretion (e.g., Kokubo

Outlook and conclusion
Our results show that particle concentration by streaming instability and planetesimal formation through gravitational collapse are not equivalent. As such, our simulations were able to produce planetesimals for a metallicity of Z = 0.01 for Q 2, even though no prior streaming instability clumping occurred, and likewise did not fragment for high Q-values regardless of significant prior SI-induced particle concentration.
Instead, the diffusion limited collapse criterion in Eq. (70) implies that whether or not planetesimals form depends only on disk mass via the Toomre parameter Q, the ratio of pressure gradient and metallicity β/Z (compare to Sekiya & Onishi 2018), and the local enhancement in dust-content .
The herein presented Toomre-like formulation of the diffusion-limited collapse criterion also may be able to shed new light on the initial planetesimal mass function. Eq. (14) predicts a range of length scales subject to nonlinear gravitational instability, determined by the value of the right-hand side of Eq. (20) (or Eq. (70)). By studying the dispersion relation in more detail, one can attach growth rates to each unstable scale. which can then be used to analytically inform an initial planetesimal mass function.
In past numerical studies of the initial mass function, e.g. Simon et al. (2016); Schäfer et al. (2017);Nesvorny et al. (2019), the enhancement is the only one of the quantities determining cloud collapse that is evolving in time. Therefore, the range of unstable scales and also initial planetesimal masses in past numerical studies may have been predominantly affected by parameter choices instead of characteristic properties of streaming and Kelvin-Helmholtz instability. This interpretation could explain why Simon et al. (2016) formed planetesimals when self-gravity was active from the start -the collapse criterion in Eq. (20) was already fulfilled by the initial condition -and why they found that higher self-gravity parameters lead to more massive planetesimals: more scales are unstable for lower Q-values. Real disks however, are not expected to start off unstable to gravitational collapse, so one must ask how numerical simulations can inform initial planetesimal masses more physically.
Our collapse criterion in Eq. (70) offers a possible solution to this conundrum.
In principle, there are four ways a system can evolve into fulfilling the collapse criterion. However, Q and global pressure gradient β in the absence of pressure bumps (see Sect. 8.2) tend to be non-evolving. As such, softly turning on self-gravity as performed by Schäfer et al. (2017) does not necessarily reflect real conditions in protoplanetary disks, in particular since the strength of self-gravity does not explicitly affect SI growth rates. This is in contrast to the metallicity and Stokes number, which both highly influence streaming instability clumping (e.g., Carrera et al. 2015), but also inform the collapse criterion directly (for the metallicity) and implicitly via the enhancement parameter .
Thus, to prevent the initial condition from collapsing, a simulation could be set up with a temporally increasing metallicity and having self-gravity turned on from the start. Then, planetesimals will only be formed once the collapse criterion is fulfilled. For high mass disks, we expect the mid-plane to start collapsing before the metallicity is high enough for streaming instability to start (compare to our Z = 0.01 simulation). This is of particular interest in the light of recent results by Powell et al. (2019) suggesting that disks may tend to be more massive than previously appreciated.
On the other hand, for low mass disks, we expect the onset of streaming instability to occur before the collapse criterion is fulfilled. In this case, streaming instability can locally increase particle densities, such that collapse occurs as soon as is high enough. A similar effect can be achieved by gradually increasing the Stokes number of super-particles. Both approaches mimic drift and growth processes in the sense that the abundance of particles with Stokes numbers appropriate for streaming instability increases over time.
Regardless of if the entire mid-plane or only locally enhanced regions collapse, we expect that, unless the growth timescale of the perturbation is shorter than its collapse timescale, marginally unstable modes will collapse first (i.e., for the right-hand side of Eq. (20) equal to unity). This would prevent collapse of larger (or smaller) scales and dictate a single characteristic and dominant planetesimal size determined by the critical length scale in Eq. (21) (see Klahr & Schreiber 2015;Klahr & Schreiber 2019), and therefore potentially lead to a initial planetesimal mass function that is qualitatively different to the power law seen in e.g., Johansen et al. (2015); Simon et al. (2016).
To conclude, our analytical considerations and numerical simulations suggest that gravitational collapse of over-dense clouds does equally depend on metallicity, pressure gradient, Toomre Q and potential local enhancements. These quantities should therefore be evaluated together, when assessing planetesimal formation. Streaming and Kelvin-Helmholtz instability are crucial in regulating local over-densities. Our collapse crite-rion remains to be tested for higher resolutions, different Stokes numbers and pressure gradients and for a larger range of metallicities.

ACKNOWLEDGMENTS
KG thanks Orkan Umurhan, Paul Estrada, Til Birnstiel, Marco Vetter, Diana Powell, Mickey Rosenthal, John McCann, Eve Lee, Jake Simon and Andreas Schreiber for fruitful discussions. KG thanks UC Santa Cruz for hosting an extended visit. RMC acknowledges support from NSF CAREER grant number AST-1555385. Simulations were performed on the ISAAC cluster owned by the MPIA and hosted at the Max Planck Computing and Data Facility in Garching, Germany.

A. ON VERTICAL AND RADIAL DIFFUSIVITY DRIVEN BY STREAMING AND KELVIN-HELMHOLTZ INSTABILITY
Diffusion plays a crucial role in limiting graviational collapse of a particle cloud on small scales. Throughout this paper, we used the vertical particle scale height as a proxy for vertical diffusivity to evaluate the collapse criterion in Eq. (20) under the assumption of spherical symmetric diffusion. This procedure has the benefit of being easier to conduct compared to measurements of radial or azimuthal diffusivities, as neither collective drift nor Keplerian shear affect vertical particle velocities, and thus one can forgo the use of tracer particles (which were utilized by e.g., .
However, as seen in 3D simulations of SI-driven turbulence by  as well as 2D simulations by , who compared the streaming instability in x − z with its counterpart in x − y, radial and vertical diffusion driven by pure streaming instability tend to deviate by an order unity factor. Indeed, in our simulations, the radial component of the local root-mean-squared velocity of a local N-particle system measured via tends to be larger than the vertical component by up to one order of magnitude. This is in line with findings by Schreiber (2018), even though they do not include vertical gravity and hence neither Kelvin-Helmholtz instability. Still, non-spherically-symmetric diffusivities should lead to an asymmetric gravitational collapse. In particular, an over-dense cloud may collapse faster or first in the vertical direction, thus potentially affecting planetesimal properties. We defer the detailed analysis of non-spherically-symmetric cloud collapse, that is not only including vertical but also radial and azimuthal diffusivity, to future work. To verify that the vertical extent of the particle layer is independent of initial dust distribution, we performed particle settling tests with varying widths of the initial Gaussian distribution. For all simulations, we chose N x,y,z = 64, Z = 0.02, St = 0.01, and β = 0.1. We therefore expect a settling time of τ set ≈ 15 orbits (see Eq. (56)). The result of this test is shown in Fig. 11. If particles are initialized below the expected extent, which for our choice of parameters is z max ≈ 0.25H after Eq. (40), they are excited to the expected scale, whereas they settle to z max if they are initialized above this scale. We note, that the bottom left two panels in Fig. 11 show more turbulent features than the bottom right panel, because here, the flow was initialized with a sub-critical Richardson number and the Kelvin-Helmholtz instability had to become active sooner to stir particles up.
We conclude that the initial particle scale height does not affect the final particle scale height, and H p,0 can in principle be chosen arbitrarily. In practice, for simulations with St = 0.005, we initialize particles with a scale height close to the expected value in Eq. (40), in order to minimize computational time. For St = 0.2, the settling time in Eq. (56) is very short and we can choose a higher initial height (also see Tab. 1).

B.2. Resolution and numerical convergence
In this paper, we present simulations with a resolution of 64x64x64 and 128x128x128 (see Tab. 1). In particular the lower resolution is not sufficient to resolve the fastest growing streaming instability modes which are always on the smallest scales. Still, due to being numerically inexpensive it allowed us to conduct a comparatively extensive parameter study. Moreover, the diffusion-limited collapse criterion presented in Sect. 2 and by Klahr & Schreiber (2019) suggests that not the fastest growing SI wave-mode, but the critical radius in Eq. (21) needs to be resolved to accurately present gravitational collapse and planetesimal formation. The fact that the low resolution fiducial run including self gravity required a lower Q-value for fragmentation to occur than the corresponding high resolution simulation (see Figs. 9 and 10), suggests that the low resolution simulations were not in fact able to sufficiently resolve r crit . Since, we do not test our criterion for a resolution exceeding N x,y,z = 128, we can not be sure that our high resolution simulations are converged. Still, our estimate for the critical radius of r crit ≈ 0.006H covers two grid cells for N x,y,z = 128. It therefore stands those simulations are if not fully converged, close to numerical convergence.
We note that due to the strong spatial and temporal fluctuations in particle concentration inherit to the nonlinear saturation of streaming instability, numerical convergence is challenging to test by solely evaluating fragmentation. For example, the evolution of the maximum particle density shown in Fig. 9 before self-gravity is turned on is qualitatively similar for the two fiducial runs. Due these challenges and the large computational cost of performing a comprehensive convergence test, we leave this exercise for future work.