Spatial clustering of polydisperse inertial particles in turbulence: II. Comparing simulation with experiment

Particles that are heavy compared to the fluid in which they are embedded (inertial particles) tend to cluster in turbulent flow, with the degree of clustering depending on the particle Stokes number. The phenomenon is relevant to a variety of multiphase flows, including atmospheric clouds; in most realistic systems, particles have a continuous distribution of sizes and therefore the clustering of ‘polydisperse’ particle populations is of special relevance. In this part of the study, measurements of spatial correlations of particles in high-Reynolds-number turbulence are compared with the results of a direct numerical simulation of particle-laden turbulence. The experimentally derived radial distribution functions (RDFs) exhibit a pronounced scale break at approximately 10–30 times the Kolmogorov scale, with large-scale clustering arising from ‘scalar mixing’ of the droplet field, and smaller-scale clustering depending on the particle Stokes numbers. A procedure is outlined for isolating the RDF due to inertial clustering from that resulting from large-scale mixing. Reasonable agreement between the experiment and the direct numerical simulations (DNS) is obtained for St ≲ 0.3 when particle Stokes number distributions in the DNS match those existing in the experiments. The experimental RDFs are consistent with the flattening or saturation scale appearing for bidisperse particles, but as in the companion paper, also support the ‘saturation’ effect in the asymmetric response of the power-law slope. The evidence for a universal scale break, as observed in both the DNS and the experiments, suggests that the pre-factor in the theoretical expression for the RDF is inherently tied to the power-law exponent, and an empirical form for this is given. Finally, no strong influence of the turbulence Reynolds number was observed for the clustering phenomenon. 
The consistency between the carefully analyzed DNS and experiments, in terms of St dependence, dissipation-range scale break and saturation of clustering for polydisperse particles, provides an indirect confirmation of the diffusion-drift theory of Chun et al (2005 J. Fluid Mech. 536 219–51).


Introduction
This is the second of a two-part paper describing investigations of the spatial clustering of particles possessing finite, but small, inertia embedded in turbulent flow. This second part is focused on comparing direct numerical simulations (DNS) and experimental measurements of water droplets in a turbulent wind tunnel. The background and motivation for this work have been presented in Part I [1] and will not be repeated here. Inertial clustering is characterized by the particle Stokes number, defined as the ratio of the particle inertial response time τ_p to the Kolmogorov time scale τ_k [2]:

$$St \equiv \frac{\tau_p}{\tau_k} = \frac{1}{18}\,\frac{\rho_p}{\rho}\left(\frac{d}{r_k}\right)^{2}, \qquad (1)$$

where ρ_p is the particle mass density, ρ is the fluid mass density, d is the particle diameter and r_k is the Kolmogorov length scale. Most realistic systems in which inertial particle clustering could be of relevance possess a broad range of Stokes numbers, most commonly due to a range of particle diameters. These papers explore the dynamics of such 'polydisperse' particles, and Part II specifically presents the results of an experimental study of polydisperse particles in turbulence, and compares those results with DNS. Spatial clustering of particles is quantified in this study by the radial distribution function (RDF) g(r). The bidisperse RDF for particles of two distinct types is denoted by g_ij(r). For example, it can be defined as the probability of a particle of St_i being found at distance r from a particle of St_j, relative to the probability of the same event under a perfectly random spatial distribution (see Part I [1] for further details). For a polydisperse particle size distribution, it was shown in Part I that the generalized RDF that does not take into account particle type is given by

$$g(r) = \int\!\!\int p(St_i)\,p(St_j)\,g_{ij}(r)\,\mathrm{d}St_i\,\mathrm{d}St_j, \qquad (2)$$

where p(St)dSt is the fraction of particles with Stokes numbers between St and St + dSt. Equation (2) is at the heart of this work, allowing us to bridge between theoretical RDFs expressed for monodisperse and bidisperse particles, and realistic, polydisperse particles.
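The weighting of pair types by the Stokes number distribution can be illustrated with a discrete analogue. The following is a minimal sketch, assuming the distribution has been binned into n Stokes-number classes and that an RDF is available for every pair of classes; the function name `polydisperse_rdf` and the array layout are illustrative choices, not part of the original analysis.

```python
import numpy as np

def polydisperse_rdf(p, g_pair):
    """Discrete analogue of a probability-weighted sum over pair types.

    p      : (n,) fraction of particles in each Stokes-number bin
    g_pair : (n, n, m) array; g_pair[i, j] is the RDF between bins i and j
             at m separations (the diagonal holds the quasi-monodisperse RDFs)
    """
    p = np.asarray(p, dtype=float)
    g_pair = np.asarray(g_pair, dtype=float)
    p = p / p.sum()  # make sure the fractions sum to one
    # Each (i, j) pair type is weighted by the probability of drawing
    # one particle from bin i and one from bin j.
    return np.einsum("i,j,ijm->m", p, p, g_pair)
```

With two equally populated classes, the cross term g_12 carries half of the total weight, which is why weakly correlated cross pairs can noticeably lower the polydisperse RDF.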
Even early experimental measurements of particles in turbulence showed evidence of inertial clustering [3,4]. However, only recently has the precision of the measurement of particle positions become sufficient to allow quantification of the RDF [5][6][7]. Wood et al [5] found that their RDFs were considerably smaller than those from DNS at similar values of the particle Stokes number; they explained this discrepancy by the dimension reduction associated with their two-dimensional (2D) measurement (see section 4.1 for an explanation of this effect). Salazar et al [6] made three-dimensional (3D) measurements of a polydisperse particle field and found quantitative agreement between the measured RDF and the simulated RDF with a matched particle size distribution.
The present work is an extension of the high Reynolds number (R λ > 400) experimental study of Saw et al [7]. Here we strive to complement the previous work by analyzing monodisperse and bidisperse RDFs. The results are relevant for predicting the polydisperse RDF, as shown in Part I. Furthermore, the experiments can be interpreted in the context of the DNS results presented in Part I, thereby allowing for an indirect comparison with theory (especially that of Chun et al [8]).
According to [8], inertial clustering arises from a competition between an inward 'drift' that tends to bring particle pairs closer together and an outward 'diffusion' that causes particle pairs to separate. The theory results in a power-law RDF for a monodisperse particle size distribution at scales below the Kolmogorov length scale, and a similar form for bidisperse particles but with a saturation scale. The bidisperse RDF can be written as

$$g_{ij}(r) = c_0\left(\frac{r_k^{2}}{r^{2} + r_c^{2}}\right)^{c_1/2}, \qquad (3)$$

where g_ij(r) is the bidisperse RDF, r is the separation distance, r_k ≡ (ν³/ε)^{1/4} is the Kolmogorov length scale, ε is the dissipation rate, ν is the kinematic viscosity of the fluid, and c_0, c_1 and r_c are model coefficients. In the monodisperse limit (r_c → 0), equation (3) reduces to g(r) = c_0 (r_k/r)^{c_1}. Furthermore, the theory predicts c_1 ∝ St² for small St. Other assumptions that go into the theory, and that we attempt to match in the experiments described here, include: Stokes flow around a spherical particle (particle Reynolds number much less than unity); particle size much smaller than flow structures (d/r_k ≪ 1); negligible effect of gravitational settling (small gravitational settling parameter, S_g ≡ v_g/v_k, where v_g = τ_p g is the particle terminal falling speed and v_k is the Kolmogorov velocity scale); and negligible flow modification by the particles (dilute particle loading). In order to explore inertial clustering of polydisperse particles, we have seeded a turbulent wind tunnel flow with particles possessing a broad range of Stokes numbers. The Stokes number is determined essentially by the local turbulent energy dissipation rate and the particle diameter (particle and fluid densities remain fixed throughout the experiments). Measurements of the spatial distribution of particles are conditioned on particle Stokes number, allowing quasi-monodisperse and bidisperse RDFs, g(r) and g_12(r), respectively, to be determined.
The qualifying adjective 'quasi' is used because averaging over finite particle size bins is required (as in any experiment or application). A DNS with the corresponding particle Stokes number distribution has been performed, as described in Part I [1], and is compared with the measurements in this part.
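To make the role of the saturation scale concrete, the sketch below evaluates a bidisperse power law with a flattening scale r_c. It assumes the functional form g_ij(r) = c_0 [r_k²/(r² + r_c²)]^{c_1/2}, which reduces to the monodisperse expression g(r) = c_0 (r_k/r)^{c_1} quoted in the text when r_c → 0; the function name and default arguments are illustrative.

```python
import numpy as np

def bidisperse_rdf(r, c0, c1, rc, rk=1.0):
    """Power-law RDF with a saturation (flattening) scale rc.

    All lengths share the same units; rk defaults to 1 so that r and rc
    can be given directly in Kolmogorov units.
    """
    r = np.asarray(r, dtype=float)
    return c0 * (rk**2 / (r**2 + rc**2)) ** (c1 / 2.0)
```

For r much smaller than r_c the function approaches the constant c_0 (r_k/r_c)^{c_1}, which is the plateau expected for particles of unequal Stokes number.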
The paper is organized as follows. In sections 2 and 3, we provide a description of the experimental methodology and data analysis methods, respectively. This is followed by a comparison of the experimental and simulation results in section 4 (details of the DNS can be found in [1]). Where appropriate, we shall draw on the conclusions from Part I pertaining to the comparison of DNS and theory, which, taken together with the current findings, allow us to make an indirect comparison between theory and experiment, using the DNS data as the intermediary. In the same section (section 4.3), we also discuss specific aspects of the inertial clustering phenomena that draw on all of our findings, from both experiment and simulation. Finally, in section 5, we summarize the findings of both the present and the companion paper [1].

The experimental setup and methods
A schematic diagram of our experimental setup is shown in figure 1. The experiment was conducted in an open-circuit, suction wind tunnel in the DeFrees Hydraulics Laboratory at Cornell University. The wind tunnel is 20 m in length and 0.95 m × 1.00 m across (height × width), with an achievable mean velocity of 1–15 m s⁻¹. Air is drawn into the wind tunnel by a fan at the end of the tunnel (far right in the figure; not shown). Turbulence is generated via an active grid [9], downstream of which four spray nozzles produce water droplets (inertial particles) with a broad range of sizes (∼1–50 µm). The sprays produce small volume loadings (∼10⁻⁵) and small mass loadings (∼10⁻²), so only one-way coupling, from turbulence to droplets, is significant. Ample time is allowed for the development of the turbulence and also for interaction between droplets and turbulence before measurements of the droplet statistics and, indirectly, the turbulent flow are made with a phase Doppler interferometer (PDI), which records the arrival time, velocity and diameter of each detected droplet.

Figure 2. Energy spectra of turbulence at a downstream position of X = 5 m and a fan speed of 20 Hz. The cyan continuous line is the result from hotwire anemometry (HWA) and the red line with dots is the result from the PDI instrument. For the latter we used small droplets (<10 µm), first obtaining the longitudinal structure function of the droplet velocities, from which the velocity autocorrelation function can be simply derived. The spectrum results from the Fourier transform of the velocity autocorrelation. The spectrum was also averaged with exponentially increasing bins to reduce noise at high wavenumber κ (note that κ as used here does not include the typical factor of 2π). The green dotted line is defined as 0.5 ε^{2/3} κ^{−5/3}, where ε is obtained empirically from the HWA data. The ability to obtain a turbulence spectrum via the PDI that is so closely consistent with HWA is evidence that the instrument was well calibrated and that the small droplets were acting essentially as tracers.
The turbulence in the present setup is well characterized and was found to be sufficiently homogeneous and isotropic for the present study of small-scale turbulence. Typical turbulence energy spectra obtained in the present setup using both hotwire anemometry (HWA, without droplets) and PDI are shown in figure 2. The turbulence generated by the present (and similar) active grid setup has been carefully characterized and used in a variety of turbulence studies (e.g. [9][10][11]). In the experiments of Mydlarski et al [9], a similar but scaled-down system (about half the size) was extensively studied to determine the performance of the active grid. It was found that the turbulence generated was slightly less isotropic (∼10% or less) compared with that generated by passive grids. However, the anisotropy was restricted to the large-scale turbulent motion and thus the flow was concluded to be well suited for studies of fine-scale turbulence dynamics. It was also found that the velocity probability distribution function is slightly non-Gaussian (skewed) due to the high intensity and rapid decay of the turbulence. This was argued to be negligible since it is very small compared to the highly non-Gaussian statistics of velocity differences and derivatives in the small-scale fluid motions. In [10], a larger active grid, similar to the one in this study, was built and used in a wind tunnel of nearly the same cross-section as the present one. The generated turbulence was found to be qualitatively similar to that of the smaller system in [9] and closely followed all the trends. Finally, the turbulent flow in the present setup was also characterized and found to be in good agreement with the aforementioned systems [12, chapter 5]. In the work reported here, the generated turbulence shows flow statistics in agreement with standard turbulence theory, including a clear scaling range (inertial subrange) in all spectra.
Table 1 lists the flow parameters for the conditions that were studied in the wind tunnel. Each experiment is referred to by a name based on the distance downstream from the active grid (in meters) where measurements are taken and the speed of the fan (in Hz) that drives the wind tunnel. Because of the different mean speeds and different measurement locations along the tunnel axis (weakly decaying turbulence), the experiments differ in R λ and ε, and therefore have different r k . As a result, the St range studied here corresponds to different droplet diameter ranges, with representative examples shown in the table. A distribution of particle Stokes numbers obtained from the measured droplet size distribution and the turbulence parameters for experiment 5 m 20 Hz is shown in figure 3. The distribution is observed to be approximately log-normal, with a dominant mode between St of approximately 0.01-0.3. Further technical details of the present setup can be found in [13, chapter 4], and additional details of the PDI instrument are given in [14] and [13, chapter 2].
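The mapping from droplet diameter to Stokes number for a given dissipation rate can be sketched as follows. This is an illustration only: it assumes the Stokes-drag response time τ_p = ρ_p d²/(18 ρ ν), nominal properties for water droplets in air, and a placeholder dissipation rate that is not taken from table 1. The function names are hypothetical.

```python
import math

def kolmogorov_scales(epsilon, nu):
    """Return the Kolmogorov length, time and velocity scales."""
    r_k = (nu**3 / epsilon) ** 0.25
    tau_k = math.sqrt(nu / epsilon)
    v_k = (nu * epsilon) ** 0.25
    return r_k, tau_k, v_k

def stokes_number(d, epsilon, nu=1.5e-5, rho_p=1000.0, rho_f=1.2):
    """Stokes number of a droplet of diameter d (SI units throughout).

    nu, rho_p and rho_f are nominal values for air and water, assumed here.
    """
    r_k, _, _ = kolmogorov_scales(epsilon, nu)
    return (rho_p / rho_f) * (d / r_k) ** 2 / 18.0

if __name__ == "__main__":
    eps = 1.0  # m^2 s^-3, placeholder value, NOT a value from table 1
    for d_um in (5, 20, 50):
        print(d_um, "um ->", stokes_number(d_um * 1e-6, eps))
```

Because St scales as d² and as ε^{1/2}, a fixed St window corresponds to a different diameter window in each run, which is the behaviour described for the tabulated conditions.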

Data analysis
From the collected particle statistics we calculate the RDF (cf Part I [1]) of each experimental set, which allows us to study the dependence of clustering on flow and particle parameters (St, R_λ, etc). This section discusses the methods of obtaining RDFs experimentally, including the time-to-space transformation and the associated error analysis, subtleties involved in the estimation of the RDF and the separation of inertial clustering from the clustering resulting from mixing.

Calculation of the radial distribution function from particle arrival times
A PDI measures the arrival time, velocity and diameter of each detected particle. When particles share an average velocity that is large compared to the fluctuations, the series of arrival times can be transformed into a one-dimensional (1D) distribution of particles in space using Taylor's frozen turbulence assumption (e.g. [2,15]). Under this assumption, we can relate the time difference to a space difference as

$$\Delta x_i = U\,\Delta t_i, \qquad (4)$$

where U is the mean velocity of the droplets, Δx_i are the inter-particle distances and Δt_i are the inter-particle arrival times. A full discussion of errors, including those resulting from the use of the Taylor assumption, is given in appendix A. Given a linear sample of inter-particle distances, it is possible to estimate the 3D RDF by assuming isotropy and by correctly accounting for the 1D to 3D weighting due to the finite interrogation volume (Δr), as well as correcting for 'edge' effects at the beginning and end of the data series (see [16] and [13, chapter 3] for further details). The correct formulation, equation (5), is written in terms of the following quantities: N_p(r) is the total number of particle pairs separated by distance r ± Δr/2; N_in(r) is the number of interior particles (particles that are at least a distance r from the sample edges); N_ex(r) is the number of edge particles (particles closer than r to the sample edges); N is the total number of particles in the sample; and L is the total length of the sample. To see how equation (5) results from taking into account edges in 1D data, let us imagine a plot of particle positions along a horizontal line from left to right. Following the procedure outlined in [13, section 3.2.2], we treat the interior and edge particles separately, starting from the general estimator, equation (6), in which ψ(r) is the sum over 'the numbers of particles found at distance r from each trial particle'. In that estimator we have written L in place of V (the full sample volume) and 2Δr in place of δV_r (in 1D, an isotropic shell reduces to two symmetric bins).
We use all particles as trial particles and apply a full edge treatment in order to achieve maximum statistical convergence. Each interior trial particle provides us with two interrogation volumes Δr, one on its left and one on its right. However, for each edge particle, there is only one such Δr, since the data are truncated on the other side. Thus, when combined, we obtain a formula for g(r) much like equation (6) but with the total number of interrogation volumes 2N_t replaced by 2N_in + N_ex. This, coupled with the fact that ψ(r) becomes 2N_p(r) when all particles are used in the calculation, gives us equation (5).
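The counting argument above can be sketched in code. The following is an illustrative estimator, not the authors' implementation: it maps arrival times to positions with Taylor's hypothesis, counts pairs per separation bin, and normalizes by the number of available one-sided interrogation bins (two per interior particle, one per edge particle). The choice of (N − 1)/L as the mean line density, and the function name `rdf_1d`, are assumptions made here.

```python
import numpy as np

def rdf_1d(arrival_times, mean_velocity, r_edges):
    """Estimate a 1D RDF from particle arrival times.

    arrival_times : particle arrival times (s)
    mean_velocity : mean droplet velocity U (m/s)
    r_edges       : increasing, strictly positive separation bin edges (m)
    """
    t = np.sort(np.asarray(arrival_times, dtype=float))
    x = mean_velocity * (t - t[0])        # Taylor's frozen-turbulence mapping
    r_edges = np.asarray(r_edges, dtype=float)
    n = x.size
    length = x[-1]                        # sample length L (first to last particle)
    r_lo, r_hi = r_edges[:-1], r_edges[1:]
    r_mid = 0.5 * (r_lo + r_hi)
    dr = r_hi - r_lo
    g = np.empty(r_mid.size)
    for k in range(r_mid.size):
        # Unordered pairs whose separation falls in [r_lo, r_hi).
        n_pairs = np.sum(np.searchsorted(x, x + r_hi[k])
                         - np.searchsorted(x, x + r_lo[k]))
        # Interior particles are at least r from both ends of the sample.
        interior = (x >= r_mid[k]) & (x <= length - r_mid[k])
        n_in = np.count_nonzero(interior)
        n_ex = n - n_in
        # Expected count in one bin for a uniformly random sample (assumed density).
        expected_per_bin = (n - 1) / length * dr[k]
        # Each pair is seen twice (once from each member), hence 2 * n_pairs.
        g[k] = 2.0 * n_pairs / ((2 * n_in + n_ex) * expected_per_bin)
    return r_mid, g
```

For a spatially random (Poisson) sample the estimator should return values close to unity at all separations, which is a convenient sanity check before applying it to measured data.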

Correcting for large-scale spatial inhomogeneity of particle density
Large-scale spatial inhomogeneities resulting from the mixing of the localized sprays influence the RDF. Figure 4 shows droplet RDFs for St = 0.1 → 0.5 from various experimental runs (directly calculated from the data, as detailed above). Each RDF curve has a power-law-like region at the smallest scales (r/r_k ≲ 10), followed by a plateau region at r/r_k ≈ 50 → 100, and then a rapid fall-off region at still larger scales. The RDFs measured at the same downstream distance are Stokes similar (they coincide with each other). On the other hand, it is clear that the RDFs measured 3 m downstream of the grid (the triangles) lie consistently above their counterparts measured 5 m downstream (the circles). The presence of a 'shoulder region' (plateau at large scales) in the RDF curve is due to large-scale inhomogeneities in the spatial distribution of the droplets. As pointed out by Saw et al [7], this apparent 'clustering' results from the imperfect mixing of the droplets after injection, and is therefore unrelated to inertial clustering. The plateau in the RDF measured at 5 m is lower than that measured at 3 m, as would be expected if the droplets continue to mix as they travel downstream. These observations are consistent with the linkage between the RDF and the 1D scalar correlation function and power spectrum [18].
The fact that the RDFs measured at the same downstream distance still show Stokes similarity, despite the shoulder regions, implies that inertial clustering occurs at small scales that are well separated from the large-scale inhomogeneity, suggesting that the RDF is still intact, but its prefactor c 0 is multiplied by a factor greater than unity that is determined by the degree of the large-scale inhomogeneity. A more precise description of how this multiplicative relationship arises is provided in appendix B. We can therefore compensate for the large-scale inhomogeneity by dividing the RDF by this estimated factor. Figure 5 shows the same set of RDFs as shown in figure 4, but each is vertically shifted in logarithmic coordinates (multiplied by a constant factor) so that its values near r/r k = 80 are equal to 1. The slope of the RDFs in log-log coordinates shows clear Stokes similarity after this simple treatment.
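The vertical shift described above amounts to dividing each RDF by its plateau level. A minimal sketch follows, assuming the plateau factor is estimated as the geometric mean of g over a window around r/r_k = 80; the window limits and the function name are illustrative choices, not values prescribed by the text.

```python
import numpy as np

def remove_mixing_plateau(r_over_rk, g, plateau=(60.0, 100.0)):
    """Divide an RDF by its large-scale plateau level.

    Returns the rescaled RDF and the multiplicative factor that was removed.
    """
    r_over_rk = np.asarray(r_over_rk, dtype=float)
    g = np.asarray(g, dtype=float)
    mask = (r_over_rk >= plateau[0]) & (r_over_rk <= plateau[1])
    if not mask.any():
        raise ValueError("no RDF samples inside the plateau window")
    # Geometric mean, because the shift is applied in logarithmic coordinates.
    factor = np.exp(np.mean(np.log(g[mask])))
    return g / factor, factor
```

Since the operation is a constant multiplicative shift, it changes the prefactor of the small-scale power law but leaves its log-log slope untouched.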

The quasi-monodisperse case
Here we present a comparison between experimental and DNS results. The experimental data presented are from the 5 m 20 Hz run (cf table 1) with R_λ = 440, and the DNS data correspond to R_λ = 143 (the possible R_λ dependence is discussed in section 4.3). Figure 6 shows experimental and DNS results on the same logarithmic axes. The symbols with error bars are the experimental RDFs calculated from quasi-monodisperse particles (i.e. a finite distribution of St with a single mode). Recall that we cannot avoid considering particles with finite Stokes number ranges if we are to have acceptable statistical convergence of the RDF. The other lines are from the DNS and will be described shortly. Three sets of particles are shown here with different colors. Starting from the bottom: in blue are particles with St = 0.01 → 0.21; in green, St = 0.09 → 0.29; and in red, St = 0.25 → 0.45. In order to facilitate direct comparison of RDF slopes (i.e. clustering exponents) between the experiments and the DNS, the experimental RDFs are shifted vertically so that their plateau regions at 30 ≲ r/r_k ≲ 60 approximately coincide.
DNS results for the three Stokes number ranges are also shown in figure 6; for each Stokes number range, three curves are plotted in order to isolate the relative contribution of each correction that was made. The dotted lines correspond to the 3D RDF (g_3d(r)) of particles, but with a 'flat' distribution of Stokes numbers that overlaps with the experiment. No attempt was made to match the shape of the distribution. Clearly, these lines are significantly steeper than the measurements. (For clarity, we do not plot the monodisperse counterparts, but in light of Part I [1, figure 5] these results would be even steeper than the dotted lines.) The dashed lines in figure 6 show the 3D RDF corrected for the shape of the particle size distribution (shown in figure 3). The agreement is improved. Finally, we take into consideration the dimension reduction associated with making a 1D measurement of the intrinsically 3D RDF, following the approach of Holtzer and Collins [16]. This is done based on the dimensions of the phase-Doppler device's measurement volume, which are approximately r_k × r_k. The results are shown by the solid lines in figure 6. The final correction is largest at small separations, where the 1D RDF (g_1d(r)) is reduced considerably by the dimension reduction of the instrument sample volume; at the actual resolved scales of the measurement this correction is relatively small. It can be clearly seen that the agreement between DNS and the experiments improves with each correction.
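The origin of the dimension-reduction effect can be illustrated with a simple Monte Carlo average. This is a simplified sketch of the idea only and is not the formulation of [16]: it assumes a measurement volume with a square cross-section of side `width`, assumes that the two particles of a pair cross it at independent, uniformly distributed transverse positions, and averages an isotropic 3D RDF over the resulting true separations. All names are hypothetical.

```python
import numpy as np

def rdf_1d_from_3d(g3d, x, width, n_samples=200_000, seed=0):
    """Average an isotropic 3D RDF over unresolved transverse offsets.

    g3d   : callable returning the 3D RDF for an array of separations
    x     : streamwise separations at which the 1D RDF is wanted
    width : side of the (assumed square) measurement cross-section
    """
    rng = np.random.default_rng(seed)
    # Transverse offsets between two points placed uniformly in the cross-section.
    dy = rng.uniform(0.0, width, n_samples) - rng.uniform(0.0, width, n_samples)
    dz = rng.uniform(0.0, width, n_samples) - rng.uniform(0.0, width, n_samples)
    rho2 = dy**2 + dz**2
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # A pair seen at streamwise separation x is really sqrt(x^2 + rho^2) apart.
    return np.array([np.mean(g3d(np.sqrt(xi**2 + rho2))) for xi in x])
```

For separations much larger than the cross-section the average is essentially g_3d itself, while for separations smaller than the cross-section the 1D value falls well below the 3D one, in line with the behaviour described for the solid lines.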
We now focus on the corrected DNS and experiment over a somewhat larger range of St, plotted in figure 7. The most striking agreement between the two is that inertial clustering becomes pronounced at length scales below r/r_k ∼ 10–30, and increases with increasing particle inertia (St). A similar trend was found for exactly monodisperse particles (Part I) and is consistent with the diffusion-drift theory [8]. That inertial clustering starts at ∼10 r_k is consistent with the idea that it is a dissipative-scale phenomenon. There is considerable evidence that ∼10 r_k is the characteristic length scale of the strongest velocity gradients in turbulence (e.g. see [19]). The best quantitative agreement between the experiment and DNS is at the lower values of St. At large values of St, the experimental data are consistently below the DNS, especially the red curves. Note that the experimental error bars only capture statistical (sampling) errors, not bias errors. There are several other possible sources of error, including the sizing error of the PDI instrument, the difference in Reynolds number between the DNS and the experiment, and the fact that the effect of gravity was neglected in the DNS. The importance of gravitational settling can be evaluated through the settling parameter S_g = St (g/a_k), where a_k ≡ v_k/τ_k is the Kolmogorov acceleration scale, with clustering suppressed for large S_g [21,22]. For the experimental conditions relevant to figure 7, we estimate that even for the largest droplets the settling parameter is still less than unity (S_g = 1 occurs for St = 1.1). Quantitatively, we note that reducing the experimental value of ε by about 30% (equivalent to increasing r_k by about 9%, or τ_k by about 20%) results in nearly perfect agreement between the DNS and the experiment.
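The settling estimate can be reproduced in a few lines. The sketch below assumes the standard Kolmogorov scalings v_k = (νε)^{1/4} and τ_k = (ν/ε)^{1/2}, so that the Kolmogorov acceleration is v_k/τ_k = (ε³/ν)^{1/4}; the air viscosity is a nominal value and the function names are illustrative.

```python
G = 9.81  # gravitational acceleration, m s^-2

def kolmogorov_acceleration(epsilon, nu):
    """Kolmogorov acceleration scale v_k / tau_k = (epsilon^3 / nu)^(1/4)."""
    return (epsilon**3 / nu) ** 0.25

def settling_parameter(st, epsilon, nu=1.5e-5, g=G):
    """Settling parameter S_g = St * g / a_k (nu is a nominal air value)."""
    return st * g / kolmogorov_acceleration(epsilon, nu)

def stokes_at_unit_settling(epsilon, nu=1.5e-5, g=G):
    """Stokes number at which S_g reaches unity for a given dissipation rate."""
    return kolmogorov_acceleration(epsilon, nu) / g
```

Because the Kolmogorov acceleration grows as ε^{3/4}, the Stokes number at which settling becomes comparable to the Kolmogorov velocity rises with the dissipation rate, so more energetic runs tolerate larger droplets before gravity matters.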
We conclude that there is reasonable agreement between experiment and DNS, despite these deviations. Together with the agreement found in Part I between theory and DNS data (from exactly the same run used here), we see that there is a general consistency among the three descriptions of the clustering phenomena.

The quasi-bidisperse case
As shown in equation (2), the polydisperse RDF can be constructed from a weighted sum of mono- and bidisperse RDFs. In this section, we compare the DNS and theory for the quasi-bidisperse RDF. For the DNS, the general trend with respect to St_2 is similar to the bidisperse case described in Part I. As the difference between the Stokes numbers of the two particles increases, the RDF at small scales reaches a plateau, and there is a general reduction in the slope of the RDF at larger scales, signifying a weakening of the correlation. Furthermore, we observe that the reduction in the slope of the RDF relative to the monodisperse case is asymmetric with respect to the sign of St_2 − St_1, for a fixed St_1. This is related to the 'saturation' effect described in Part I. The experimental RDFs (top right) show similar trends, although the trend at larger St_2 is hardly discernible due to statistical noise. The agreement for the power-law exponents of these RDFs is very good when St_2 is small, as can be seen in the bottom panels. At larger St_2, the results remain close, but the experimental curves are consistently below the DNS. Here again, due to statistical noise, there is no clear evidence in the experiment of the RDF flattening at small r.
We conclude here that the experimental observations are consistent with DNS results for the quasi-bidisperse RDFs to within the statistical uncertainty of the data. Further, the experimental results clearly support the asymmetric dependence of the RDF slopes in the intermediate range (r_k ≲ r ≲ 10 r_k); that is, the saturation effect discussed in Part I, wherein the cross-correlation is limited by the least correlated particles.

General observations from comparing experiments and DNS
In this section we make some general observations about the experiments and DNS. The first is related to the spatial scale observed for the onset of inertial clustering. All RDFs from the simulations (Part I [1], figures 2, 5 and 7 for quasi-monodisperse, polydisperse and bidisperse distributions, respectively) and the experiments (this paper, figures 4 and 5) exhibit a scaling region that begins at approximately r/r_k ≈ 10–30. The fact that this holds regardless of the Reynolds number or particle size distribution suggests that it is a fundamental characteristic of the clustering mechanism. As was pointed out by Saw et al [7, figure 1], who noted a similar feature in their experiments, this is consistent with clustering continuing to the correlation scale of the velocity gradients, which is of the order of r/r_k ∼ 10.
The existence of a common length scale r_s beyond which inertial clustering ceases implies an even simpler form for the power-law equation (3):

$$g(r) = \left(\frac{r_s}{r}\right)^{c_1}, \qquad (7)$$

or equivalently c_0 = (r_s/r_k)^{c_1}. Empirically, by extrapolating the power-law slope of the RDF to g = 1, we find that this relationship holds for 10 ≲ r_s/r_k ≲ 30. We do note, however, that a gradual increase of c_0^{1/c_1} with St is observed, within the range of 10–30. This may be related to clustering of larger particles (St > 1) in the inertial subrange of turbulence.
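The extrapolation of the power-law slope to g = 1 can be sketched as a straight-line fit in log-log coordinates. The example below assumes the monodisperse form g(r) = c_0 (r_k/r)^{c_1} and a fit window of 1 to 10 Kolmogorov lengths; both the window and the function name are illustrative choices.

```python
import numpy as np

def fit_power_law(r_over_rk, g, fit_range=(1.0, 10.0)):
    """Fit g = c0 * (r/rk)**(-c1) and extrapolate to g = 1.

    Returns (c0, c1, rs_over_rk), where rs_over_rk = c0**(1/c1) is the
    separation at which the fitted power law crosses unity.
    """
    r_over_rk = np.asarray(r_over_rk, dtype=float)
    g = np.asarray(g, dtype=float)
    mask = (r_over_rk >= fit_range[0]) & (r_over_rk <= fit_range[1])
    # Straight-line fit of log g against log(r/rk).
    slope, intercept = np.polyfit(np.log(r_over_rk[mask]), np.log(g[mask]), 1)
    c1 = -slope
    c0 = np.exp(intercept)
    return c0, c1, c0 ** (1.0 / c1)
```

Note that the extrapolated crossing scale becomes poorly constrained as c_1 approaches zero, since a nearly flat RDF crosses unity at an almost arbitrary separation; this matters for the smallest Stokes numbers.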
Whether the scale r_s can be universally related to the properties of the turbulent flow is not clear, but we note that for the four experimental conditions in this study, the ratio of the Taylor microscale to the Kolmogorov scale is r_λ/r_k ≈ 50, suggesting that r_s could be related to r_λ as well. The speculation that the onset of inertial clustering is related to r_λ would seem to have some physical basis given the interpretation of r_λ as the characteristic radius of curvature of the velocity field at small separations, especially when taken together with the diffusion-drift model for inertial clustering, which depends essentially on a smooth but fluctuating velocity field [8]. Unfortunately, the range of Reynolds numbers in this study is insufficient to draw any conclusions regarding the scaling of r_s with r_k versus r_λ, given the very weak Reynolds number dependence of r_λ/r_k ∼ Re^{1/4}.

The second general observation from the combined DNS-experiment results concerns the dependence of inertial clustering on Reynolds number, which has important implications for extrapolating these results to geophysical and astrophysical systems [e.g., 23,24]. Collins and Keswani [25] analyzed DNS results of monodisperse particles at various Reynolds numbers up to R_λ = 152 and concluded that both the clustering exponent, c_1, and the pre-factor, c_0, approach saturated values as R_λ increases. On the other hand, Falkovich and Pumir [26] studied a similar range of Reynolds numbers with DNS and found that c_1 (denoted as α in their paper) increased linearly (or faster) with R_λ. This was interpreted as a consequence of increased intermittency in turbulent velocity gradients with Reynolds number. Our results, from both experiment and DNS, strongly support saturation of c_1 in the limit of large Reynolds number.
The good agreement between experiments (R λ = 440) and DNS (R λ ∼ 140) found in section 4.1 implies that any Reynolds dependence of c 1 is much weaker than the dependence on St. Earlier work using a similar experimental setup [7] presented the results from different sets of experiments covering R λ from 440 to 660 and showed strong St similarity in the steepness of the RDF (equivalent to c 1 ), which supports the same conclusion. Furthermore, since c 0 is apparently linked to c 1 , in light of equation (7), our results imply that c 0 and r s also saturate at high R λ .

Summary and general conclusions
This work is a systematic study of inertial clustering of particles in turbulence, using a computational and experimental approach. Due to the volume of material, it is presented in two companion papers. Part I [1] focuses on a detailed comparison of the results from DNS of particle-laden turbulence with theory. Part II (this paper) focuses on comparisons between experiments and DNS. Taken together, Parts I and II provide an indirect comparison of theory and experiment using the DNS data as an intermediary.
In this paper, we have presented wind tunnel measurements of the RDF that were made using a PDI. We have also discussed the procedure for isolating the RDF due to inertial clustering from that resulting from large-scale mixing (section 3.2). We then proceed to study inertial clustering by comparing the experimental results with DNS of particle-laden turbulence, and also indirectly with theory using the DNS data as an intermediary. The DNS were performed matching particle Stokes number distributions found in experiments in order to allow an unambiguous comparison.
Observations and conclusions that can be drawn from this paper, as well as those from the two papers taken together, can be summarized as follows: (i) The RDF, and thus inertial clustering, is found to show dynamical similarity with respect to particle Stokes number (cf section 4.1). The RDF is flat (c 1 = 0, no clustering) when St = 0 and becomes steeper as St increases from zero, for St 1. (ii) For non-zero St, the RDF exhibits a clear scale break, exhibiting enhanced clustering at length scales r < 10 r k to 30 r k (cf section 4.1 and Part I [1]). This is found to be true for all cases (perfectly monodisperse, quasi-monodisperse and quasi-bidisperse for both DNS and experiment). This finding supports the understanding that inertial clustering is driven by dissipation scale fluid motions and is consistent with turbulent vorticity reaching a maximum around the scales of 10-20 Kolmogorov lengths (r k ). (iii) Quantitative agreement was found between DNS and the perfectly monodisperse theory for St 0.3. Good quantitative agreement was also found (albeit minor deviations at large St) between DNS and experiment in the quasi-monodisperse study. Taken together, this implies that there is consistency between theory and experiment regarding the Stokes number scaling of monodisperse RDFs (see section 4.1 for details). (iv) When comparing the bidisperse RDFs we found that the theory was able to predict the trend of the flattening scale r c very well, but it did not account for the saturation effect found in the trend of c 1 (the power-law slope in the range (r k r 10 r k )). Experimental results reported here confirm this finding (section 4.2). In view of that, we have proposed a correction for the bidisperse c 1 based on the concept that the most weakly correlated particles dominate the RDF (Part I).
(v) DNS and experimental data for all St and over a range of Re showed consistently that inertial clustering becomes insignificant above a length scale of approximately 10-30 times $r_k$. This implies that the power-law expression for $g(r)$ can be written with a single parameter, $c_1$ (see the discussion around equation (7) for details). Alternatively, the observation leads to a relationship between the pre-exponential factor of the RDF, $c_0$, and the power-law exponent, $c_1$ (see section 4.3 for details).

(vi) In section 4.3 the experimental and DNS data were reviewed to conclude that inertial clustering becomes saturated once the Reynolds number is increased beyond a certain level ($R_\lambda \sim 140$).
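The single-parameter form mentioned in item (v) can be illustrated with a minimal sketch. It assumes that the power law is written as $g(r) = c_0 (r_k/r)^{c_1}$ and that clustering vanishes ($g = 1$) at a fixed cutoff scale $r_s$, which gives $c_0 = (r_s/r_k)^{c_1}$. Both the functional form and the continuity condition are assumptions made here for illustration; equation (7) and the empirical relation of section 4.3 are not reproduced. The function names are hypothetical.

```python
def prefactor_from_exponent(c1, rs_over_rk):
    """Pre-factor c0 implied by g(r_s) = 1 for the assumed form g(r) = c0 * (r_k / r)**c1."""
    return rs_over_rk ** c1


def single_parameter_rdf(r_over_rk, c1, rs_over_rk):
    """Assumed single-parameter RDF: (r_s / r)**c1 below the cutoff scale, 1 at and above it."""
    if r_over_rk <= 0:
        raise ValueError("separation must be positive")
    if r_over_rk >= rs_over_rk:
        return 1.0  # no inertial clustering beyond the cutoff scale
    return (rs_over_rk / r_over_rk) ** c1
```

For example, with an illustrative cutoff of $r_s = 16\,r_k$ (a value picked from within the 10-30 $r_k$ range, not taken from the data) and $c_1 = 0.5$, the sketch gives $c_0 = 4$.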
These results have several implications. Firstly, they provide considerable support for the diffusion-drift theory, and therefore for the underlying concept that particle inertia leads to slow decoupling of the particles from the fluid motions, resulting in spatial clustering within the dissipation range. Secondly, the results underscore the crucial importance of considering detailed experimental conditions when attempting a comparison with computational or theoretical results. Specifically, the shape of the particle Stokes number distribution (e.g. the size distribution for constant-density particles) must be accounted for in such comparisons and, by extension, in the application of theoretical results to practical problems. For example, this may explain some of the difficulty in measuring strong inertial clustering in transient environments containing relatively broad particle size distributions, such as atmospheric clouds (e.g. [24,27,28]). Only in a carefully controlled and well-characterized wind tunnel environment, in which particles could be counted for time periods of the order of hours, has it been possible to make unambiguous comparisons between theory and experiment. Thirdly, evidence is mounting that the Reynolds number dependence of inertial clustering, if present at all, is quite weak. This provides further support for the application of DNS and theory to practical problems, at least for mean conditions (one may speculate that fluctuations have a stronger $R_\lambda$ dependence). Fourthly, several questions are raised that merit further attention. The most significant in our opinion is the saturation effect, or asymmetric dependence of the RDF power-law slope $c_1$ on the two particle Stokes numbers in a bidisperse population.
The possible relationship between the power-law slope and the pre-factor ($c_1$ and $c_0$), or alternatively the single-parameter $c_1$ power law with a constant cutoff scale $r_s$, is also intriguing and is likely relevant to practical application of the theory.

Inter-droplet distances, $\Delta x$, are obtained from this time series via a time-to-space transformation, equivalent, in principle, to the Taylor frozen turbulence hypothesis: equation (4). After that, $g(r)$ is calculated using equation (5). Sources of error in estimating $g(r)$ include:

(i) Shot noise. Because $g(r)$ is derived from counting droplets within a particular interval (cf equation (5)), it is subject to random sampling statistics. These can be approximated as Poisson distributed (correlations are relatively weak). From this, one can derive the sampling uncertainty to be

$$\sigma_{g(r)} = \sqrt{\frac{g(r)}{N\, n_p\, \Delta r}}, \tag{A.1}$$

where $N$ is the total number of droplets detected in the whole experiment, $n_p$ is the globally averaged droplet density and $\Delta r$ is the width of the interrogation bin. A large sample is usually required to minimize this noise. Since this error is dominant and quantifiable, we present all experimental RDFs with error bars of one $\sigma$ at each point.
(ii) Finite sample volume and 1D sampling. The PDI has a measurement cross-section with a height of ≈230 µm and a width of roughly 2-3 times the slit aperture (selectable between 50, 100, 200, 500 and 1000 µm). As a consequence, the $\Delta x$ estimated via equation (4) tends to underestimate the true inter-particle distance. Proceeding from this line of reasoning, Holtzer and Collins [16] showed that the $g(r)$ obtained from 1D sampling of a droplet distribution could systematically deviate from the true (3D) $g(r)$ owing to averaging effects from the finite instrument sampling volume. The same study also showed that this deviation is negligible for $r$ greater than the sample volume dimensions.
In this work, we have reported $g(r)$ only at scales satisfying this condition. Furthermore, following the analysis of [16], we have estimated that at the scales reported in this paper the maximum decrement of any single point of $g(r)$ from its 3D counterpart is ≈0.4; most deviations in the $g(r)$ curves due to this spatial averaging are much smaller than this. Lastly, we note that we have not shown the corrected curves in this paper because the inversion described in [16] presupposes that the data follow a power law in $r$. We have chosen to keep the analysis as general as possible because the power-law form is itself one of the experimental questions.

(iii) Error from time-space transformation. The use of equation (4) essentially assumes that all droplets are advected downstream with a velocity equal to the mean flow velocity $U$. This cannot be exactly true because the droplets are embedded in a turbulent flow. The resulting estimates of inter-droplet distances, $\Delta x$, thus contain random errors of the order of $(u/U)\,\Delta x$, where $u$ is the rms velocity fluctuation. In our experiments, $u/U$ never exceeds 20%, thus resulting in errors of less than 20% in $\Delta x$.
Since we are ultimately interested in $g(r)$, which is derived from counting the number of droplet pairs separated by distances satisfying $r - \delta r/2 < \Delta x < r + \delta r/2$ (cf equation (5)), random errors in $\Delta x$ imply that some droplet pairs are wrongly assigned to inter-droplet distance bins. This is essentially equivalent to a smoothing of $g(r)$. However, since we report coarse-grained RDFs (with finite interrogation volumes, $\Delta r > 0.1\,r$), the probability of particle pairs being wrongly binned is suppressed, and therefore the low-pass filtering effect of this error is masked. Because possible fine features are lost to low-pass filtering or $r$-coarse graining, the conclusions reached in this study, such as the observations of Stokes similarity, are limited to the coarse trends of the $g(r)$ curves.

(iv) An additional error from the time-space transformation. A bias related to points (ii) and (iii) results from lateral motions of the droplets over the time required for a pair to be counted. Because equation (4) does not allow for any displacement of droplets in the plane normal to the mean flow direction, lateral droplet motion during the time interval between the detection of two droplets is not measured, and the true particle separation distance is therefore always underestimated. This bias is similar to that discussed in [16] in that the error results essentially from a collapse of three spatial dimensions into one. However, starting from an error of 20% in $\Delta x$, we estimate that the 'scale of averaging' is less than 4% of $r$, which implies that the resulting bias is negligible at small $r$. Even at $r \sim 10\,r_k$ (the largest separation distance reported in this study), this bias is less than half the magnitude of that resulting from the finite measurement volume. The combined effect is thus insignificant for the results presented in this paper.
(v) Bias from using finite interrogation volumes. As discussed in [13, chapter 3], this only affects $g(r)$ calculated from 2D and 3D data. In our case, the finite bin size $\Delta r$ causes averaging similar to that found with a finite difference scheme, which in principle could smooth over fine-scale features; however, the smooth nature of $g(r)$ and the dimensionality of the measurement make this inconsequential in our analysis.

(vi) Error in measurement of droplet arrival times. Particle arrival time accuracy was found to be better than 3.5 µs (details in [13, chapter 2]), which is just a few per cent of the smallest inter-droplet arrival time studied. This error has the same effect as the error discussed in item (iii) but is one order of magnitude smaller, so it is ignored here.
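The chain from arrival times to an RDF with shot-noise error bars (items (i) and (iii)) can be sketched as follows. This is an illustration under stated assumptions, not the processing code used for the measurements. The time-to-space transformation is taken to be $x = U t$, the pair counting is one-sided (downstream neighbours only, so the 1D shell has width equal to the bin width), and the uncertainty is read as $\sigma_g = \sqrt{g/(N n_p \Delta r)}$. Equations (4) and (5) themselves are not reproduced, and all function and argument names are hypothetical.

```python
import bisect
import math


def rdf_with_shot_noise(arrival_times, mean_speed, r_centres, bin_width):
    """Return a list of (r, g, sigma_g) from droplet arrival times at a fixed probe."""
    # Assumed form of the time-to-space transformation: x = U * t.
    positions = sorted(mean_speed * t for t in arrival_times)
    n_total = len(positions)
    record_length = positions[-1] - positions[0]
    n_p = n_total / record_length  # globally averaged 1D droplet density
    results = []
    for r in r_centres:
        pair_count = 0
        for x in positions:
            # Downstream neighbours with r - bin/2 <= separation < r + bin/2.
            lo = bisect.bisect_left(positions, x + r - bin_width / 2)
            hi = bisect.bisect_left(positions, x + r + bin_width / 2)
            pair_count += hi - lo
        norm = n_total * n_p * bin_width  # expected pair count for uncorrelated droplets
        g = pair_count / norm
        sigma = math.sqrt(g / norm)  # Poisson shot noise, as in equation (A.1)
        results.append((r, g, sigma))
    return results


def taylor_separation_error(separation, u_rms, mean_speed):
    """Order-of-magnitude random error (u/U) * separation from the frozen-flow assumption."""
    return (u_rms / mean_speed) * separation
```

With $u/U = 0.2$, `taylor_separation_error` returns 20% of the separation, which is the upper bound quoted in item (iii).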

Appendix B. Inertial clustering versus turbulent mixing
The RDF, as typically defined (e.g. [1, equation 1]), does not discriminate between the mechanisms of correlation between particle positions. In turbulence, particles may have spatial correlations caused by effects other than inertial clustering. An example is shown in figure 4. All of these correlations will contribute to the RDF calculated via the formulation (see [13, chapter 3] for a detailed derivation):

$$g(r) = \frac{\psi(r)/N_t}{n\, \delta V_r}, \tag{B.1}$$

where $N_t$ is the number of 'trial particles', $\psi(r)$ is the sum over 'the number of particles found at distance $r$ from each trial particle', $n$ is the globally averaged particle density and $\delta V_r$ is the volume of the infinitesimal shell (e.g. $4\pi r^2\,\delta r$ in 3D and simply $\delta r$ in 1D).

Particularly important to this work, apart from inertial clustering itself, is the inhomogeneity due to particle injection into an otherwise homogeneous and isotropic turbulent flow, again as illustrated in figure 4. Owing to the random stirring and mixing effect of turbulence, these density inhomogeneities are slowly smoothed. However, their residues blend into the observed RDF in addition to the signature of inertial clustering that is of interest. Here we shall show that, because of the distinct scales of the two processes (the residual turbulent mixing being at large scales and the inertial clustering being at dissipative scales), the effect of the mixing residues on the inertial clustering signature is simply a multiplicative factor on the RDF of the otherwise 'pure' signature of inertial clustering, and that this factor is nearly constant over the inertial-clustering scales when the two scales are well separated. This finding is utilized in our data analysis to isolate the influence of mixing from the signatures of inertial clustering (section 3.2).
Although the main intention here is to address the relation between mixing and inertial clustering, the mathematical result obtained may be extended to other systems satisfying similar conditions. We begin with some definitions and assumptions:

• We assume that the inhomogeneity of the particle density field is driven by two classes of processes with distinct scales and that they are mutually uncorrelated. Particularly for our case of interest, the inhomogeneous initiation of the particle field into turbulence should have no effect on how particles are clustered by small-scale turbulence. Thus, the role of the large-scale inhomogeneity is just to introduce a landscape of varying local densities within which the particles are clustered by the local turbulence.

• We define $m(\mathbf{r})$ as the local number density dictated only by the large-scale inhomogeneity. Precisely, this may be defined as the (idealized) density profile in a system containing only a large number of particles that are unaffected by the small-scale clustering (e.g. a field of passive scalars) while satisfying all initial conditions of the problem (such as the particle injection geometries). Experimentally, $m(\mathbf{r})$ can be approached by averaging the observed particle density over a scale much larger than the small-scale clustering yet much smaller than the large-scale inhomogeneity.

• We define $g_m(r)$ as the RDF that corresponds to the large-scale inhomogeneity and that can be computed directly from $m(\mathbf{r})$. When there is no such large-scale inhomogeneity, in other words when the 'large-scale averaged' particle density is constant everywhere, $g_m(r)$ would be equal to unity at all $r$.

• We define $g_n(r)$ as the 'normal' or directly measured RDF, as calculated from a field of particles using equation (B.1), that is, when a global particle density is used and any inhomogeneities are interpreted as particle spatial correlations.

• We use $g(r)$ to represent the 'pure' RDF signature of the small-scale clustering that is of interest. It would be recovered by $g_n(r)$ in the special case in which the large-scale particle density, $m(\mathbf{r})$, is constant everywhere. Otherwise, $g_n(r)$ is expected to be some composite of $g(r)$ and $g_m(r)$.

• We are interested in the mathematical relation among $g_n(r)$, $g_m(r)$ and $g(r)$. To that end, the 'limit of large numbers' will be assumed throughout the development to follow. This means that we will be considering the idealized situation where we have infinitely many particles (so that density is well defined) inside an infinitely large sample (so that all RDF estimators approach the true values).
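The experimental route to $m(\mathbf{r})$ described in the second item above, averaging the observed density over an intermediate scale, can be sketched in 1D as follows. This is a minimal illustration: the function name and the top-hat window are assumptions, and the window must be chosen by the user to lie between the clustering scale and the scale of the large-scale inhomogeneity.

```python
import bisect


def coarse_grained_density(sorted_points, x, window):
    """Top-hat estimate of the local 1D number density at x from sorted particle positions."""
    # Count particles with x - window/2 <= position < x + window/2.
    lo = bisect.bisect_left(sorted_points, x - window / 2)
    hi = bisect.bisect_left(sorted_points, x + window / 2)
    return (hi - lo) / window
```

A window that is too narrow lets small-scale clustering leak into the estimate of $m$, whereas one that is too wide smooths out the large-scale variation that the estimate is meant to capture.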
For the sake of simplicity, conceptual clarity and direct relevance to this work, we proceed to consider a scenario in which particles are distributed in one dimension. The formula for $g_n(r)$, the normally or directly calculated RDF (cf equation (B.1)) for the 1D case, can then be written as

$$g_n(r) = \frac{1}{N_t\, n\, \delta r} \sum_{i=1}^{N_t} S_i(r), \tag{B.2}$$

where $S_i(r)$ is the number of particles found at position $r \pm \delta r$ from the $i$th trial particle and $n$ is the sample's globally averaged particle density (in 1D). Note that, from equation (B.1), $\psi = \sum_{i=1}^{N_t} S_i(r)$. Next, we introduce in the summation the local (large-scale) particle density at position $r$ relative to each trial particle, $m_i(r)$:

$$g_n(r) = \frac{1}{N_t\, n\, \delta r} \sum_{i=1}^{N_t} \frac{S_i(r)}{m_i(r)}\, m_i(r) \tag{B.3}$$

$$\phantom{g_n(r)} = \frac{1}{N_t\, n\, \delta r} \sum_{j} \sum_{i=1}^{N_j} \frac{S_{ij}(r)}{m_j}\, m_j. \tag{B.4}$$

Here $N_j$ is the number of trial particles belonging to the $j$th group, namely those that see, at position $r$ relative to each, other particles with a local number density of value $m_j$. Note that $N_t = \sum_j N_j$. Similarly, $S_{ij}(r)$ is the number of particles found at position $r \pm \delta r$ from the $i$th trial particle, with the condition that only trial particles at distance $r$ from regions with local number density $m_j$ are selected. To be precise, equation (B.3) should distinguish between 'left' and 'right' $m(r)$ relative to the trial particle (recalling that $m(\mathbf{r})$ was originally defined with a vector argument). The result, equation (B.4), is the same, however, so we avoid the addition of cumbersome notation here.
Bringing everything that is independent of $i$ out of the summation over $i$, introducing $N_j$, and bringing $\delta r$ inside the summation over $j$:

$$g_n(r) = \frac{1}{N_t\, n} \sum_j N_j m_j \left[ \frac{\sum_{i=1}^{N_j} S_{ij}(r)}{N_j\, m_j\, \delta r} \right]. \tag{B.5}$$

Here we recognize that the quotient inside the brackets is an unbiased estimator of $g(r)$ (the small-scale clustering signature) for every $m_j$. This is because the trial particles under the summation over $i$ all see the same density $m_j$ at distance $r$ relative to them. Hence the quotient is a construct much like an RDF estimator averaged over many small yet statistically identical samples. We thus write it as $g_j^{\mathrm{Est}}(r)$ in what follows. Next, multiplying and dividing the right-hand side by $\sum_j N_j m_j$ and rearranging, we have

$$g_n(r) = \frac{\sum_j N_j m_j}{N_t\, n} \left( \frac{\sum_j N_j m_j\, g_j^{\mathrm{Est}}(r)}{\sum_j N_j m_j} \right). \tag{B.6}$$
We quickly see that the term inside the parentheses is a weighted average of $g_j^{\mathrm{Est}}(r)$ and, since we are taking the limit of large numbers, it will approach the true RDF signature of small-scale clustering, $g(r)$. Before continuing, it is worth noting as a corollary that this provides us with a means of recovering directly the small-scale clustering signature even when large-scale inhomogeneity is present, that is:

$$g(r) = \frac{\sum_j N_j m_j\, g_j^{\mathrm{Est}}(r)}{\sum_j N_j m_j} = \frac{\sum_j \sum_{i=1}^{N_j} S_{ij}(r)}{\delta r \sum_j N_j m_j}. \tag{B.7}$$

Turning back to equation (B.6), in the limit of large numbers, the quotient outside the parentheses becomes $g_m(r)$. This can be seen more clearly if we consider

$$\frac{\sum_j N_j m_j}{N_t\, n} = \frac{\sum_j N_j\, (m_j\, \delta r)}{N_t\, n\, \delta r}, \tag{B.8}$$

where the numerator reads: the sum over the 'expected number of neighbor particles found at $r$ from each trial particle' as dictated by the density field $m(\mathbf{r})$. The right-hand side thus has the same form as equation (B.2), with the counts $S_i(r)$ replaced by the values expected from $m(\mathbf{r})$ alone. Hence we have established the general relation

$$g_n(r) = g_m(r)\, g(r). \tag{B.9}$$

In other words, the RDF calculated using the global average density (equation (B.2)) is the product of the RDF that would result from large-scale inhomogeneity (e.g. inertial-range mixing) and that of the small scale (e.g. inertial particle clustering). Finally, if the scales of the two families of inhomogeneities are sufficiently separated, then $g_m(r)$ would be almost constant in the range of small $r$ where $g(r)$ is of interest. This is valid for the experimental data presented in section 3.2.
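The product relation (B.9) and the corollary (B.7) can be checked numerically on a synthetic 1D field. The sketch below is an illustration only, with hypothetical function names and parameters. It builds particles with no small-scale clustering ($g = 1$) but with a sinusoidally modulated large-scale density $m(x) \propto 1 + a\cos(2\pi x/\Lambda)$. Averaging $m(x)\,m(x+r)$ over whole periods gives $g_m(r) = 1 + (a^2/2)\cos(2\pi r/\Lambda)$ for such a field. Pairs are counted on one side only, so that the 1D shell has width $\delta r$, and the local density is assumed to be known rather than estimated.

```python
import bisect
import math
import random


def sample_modulated_points(n_candidates, length, amplitude, wavelength, seed=0):
    """Thin uniform candidates so that the density follows 1 + amplitude*cos(2*pi*x/wavelength)."""
    rng = random.Random(seed)
    k = 2.0 * math.pi / wavelength
    points = []
    for _ in range(n_candidates):
        x = rng.uniform(0.0, length)
        accept = (1.0 + amplitude * math.cos(k * x)) / (1.0 + amplitude)
        if rng.random() < accept:
            points.append(x)
    points.sort()
    return points


def rdf_estimates(points, length, r_centres, dr, local_density):
    """Return (g_n, g): global-density estimator (B.2) and local-density-weighted estimator (B.7)."""
    r_max = max(r_centres) + dr / 2
    # Use only trial particles whose whole search range lies inside the sample.
    trials = [x for x in points if x < length - r_max]
    n_global = len(points) / length
    g_n, g_pure = [], []
    for r in r_centres:
        pair_count = 0
        density_sum = 0.0
        for x in trials:
            lo = bisect.bisect_left(points, x + r - dr / 2)
            hi = bisect.bisect_left(points, x + r + dr / 2)
            pair_count += hi - lo
            density_sum += local_density(x + r)  # m seen at distance r from this trial particle
        g_n.append(pair_count / (len(trials) * n_global * dr))
        g_pure.append(pair_count / (dr * density_sum))
    return g_n, g_pure
```

For $a = 0.8$ and $r \ll \Lambda$, the global-density estimate should sit near $g_m \approx 1 + a^2/2 = 1.32$, while the locally weighted estimate should return $g \approx 1$, so that their ratio recovers $g_m$ as in equation (B.9).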