Improved Prediction of Settling Behaviour of Solid Particles through Machine Learning Analysis of Experimental Retention Time Data

The motion of particles through density-stratified interfaces is a common phenomenon in environmental and engineering applications. However, the mechanics of particle-stratification interactions in various combinations of particle and fluid properties are not well understood. This study presents a novel machine-learning (ML) approach to experimental data of inertial particles crossing a density-stratified interface. A simplified particle settling experiment was conducted to obtain a large number of particles and expand the parameter range, resulting in an unprecedented data set that has been shared as open data. Using ML, the study explores new correlations that collapse the data from this, and previous work Verso et al. (2019). The ``delay time,'' which is the time between the particle exiting the interfacial layer and reaching a steady-state velocity, is found to strongly depend on six dimensionless parameters formulated by ML feature selection. The data shows a correlation between the Reynolds and Froude numbers within the range of the experiments, and the best symbolic regression is based on the Froude number only. This experiment provides valuable insights into the behavior of inertial particles in stratified layers and highlights opportunities for future improvement in predicting their motion.


I. INTRODUCTION
Settling of inertial particles across layers of fluids of different densities appears in various engineering and environmental fluid mechanics problems [2][3][4].The settling velocity of particles can be estimated from the first principles only for cases of particles settling with low Reynolds (Re p = aV p /ν, a is particle diameter, ν fluid kinematic velocity) and low Stokes number (St = T p /T f , T p particle response time scale, T f flow response time scale).Furthermore, to be able to estimate the settling velocity, the particle must also settle through a fluid of homogeneous density (ρ = const) or weakly linearly stratified fluids (∂ρ(z)/∂z = const, where z is fluid depth).When inertial particles cross interfacial layers, which are fluid layers with sharp density changes, there may be a noticeable amount of lighter fluid that follows the particle into interfacial layer with different densities.This lighter fluid is sometimes referred to as a "caudal wake."The coupled dynamics of the particle motion and of its caudal wake with respect to the surrounding fluid, together with the flow due to particle motion, lead to additional resistance on the particle.We will denote this additional resistance force by F s as a single quantity, although there is an active discussion in the community about the origin, nature, and magnitude of various sources of the resistance force [1,2,[5][6][7][8].Modeling accurately the time it takes for an inertial particle at a higher Reynolds and Stokes number to move through a stratified fluid with sharp density changes (hereinafter called interfacial layers) would be beneficial, for example, in problems of marine snow aggregations [9,10], airborne or waterborne pollutant dispersion [9,11,12], and oxygen levels regulation in the ocean [13,14].
Estimating the force components for different parameter regimes is extremely challenging due to the complexity of the dynamics of all the components, including the particle, the fluid layers, and the wake.Instead of integrating the force model in time, we suggest modeling the settling time directly, combining the experimental and machine learning methods.

A. Problem definition
Let us consider the problem at hand schematically in Fig. 1.In panel a), we show the scheme of the physical process: a particle heavier than the fluid is settling from the top, lighter fluid layer, through the interfacial layer of thickness h, and into the heavier fluid layer at the bottom (i.e., ρ 1 < ρ 2 < ρ p ).In panel b), we plot the fluid density profile of the two homogeneous fluid layers at densities ρ 1 , ρ 2 , and a continuous smooth transition ρ(z) between the two homogeneous layers, called the stratified interfacial layer of thickness h (green curve).We also plot the curve corresponding to the typical particle velocity expected to vary from V 1 to V 2 as a smooth monotonic function (cyan curve).This curve describes the case of a sphere settling without any additional resistance force stemming from the stratification, F s = 0 [2].In some parameter regimes, the additional resistance is nonnegligible, and there is possibly a non-monotonic change of velocity with a local minimum near the edge of the interfacial layer [1,5,7,8] (blue curve).
In this study, we focus on the settling time estimate.Suppose the additional resistance force in the stratified interfacial layer is negligible.In that case, we will measure the theoretically predictable settling time of the particle, tV 1 −V 2 , defined as the time the particle moves from one homogeneous-density layer to another.Instead of settling time, we can determine the so-called retention time; the interval during which particle velocity changes from one terminal velocity value, V 1 to another terminal velocity value, V 2 , marked as This definition is more beneficial for several cases, such as cases when the particle is heavier than both fluids ρ p > ρ 2 , when the particle is lighter ρ p < ρ 1 and rises through the interfacial layer, as well as for the cases when particle temporarily changes its direction of motion and levitates [1,6].
For the monotonic, theoretically predictable case, the theoretically predictable settling time through the interfacial layer is equivalent to the retention time (i.e., tV Suppose there is an unknown additional resistance force.In that case, the retention time is longer because it also contains the interval during which particle settling velocity is lower than both the steady-state values or it levitates and changes the velocity sign.We define the difference between the expected settling time and observed retention time as the "delay time", marked as τ (see Fig. 1).In this study, we develop the method to predict τ based on particle and fluid properties, using a machine learning model trained on experimental data.

B. Existing data
Only a few studies address the problem of inertial particles crossing sharp interfacial stratified layers (h/a ∼ O(10), h is the interfacial layer thickness, and a is the particle The density of the fluid ρ(z) changes vertically throughout the medium (green), from the first layer ρ 1 to the second layer ρ 2 .The particle velocity V (t) is illustrated for the case of negligible (cyan) or significant (blue) stratification force.The settling velocity in each layer is named V 1 and V 2 , and the interfacial layer thickness is h.We mark three relevant time intervals: the theoretically predictable settling time tV 1 −V 2 , the retention time t V 1 −V 2 , and the delay time τ .diameter) between two miscible fluids of densities ρ 1 and ρ 2 .The key parameters are the "entrance" Reynolds number, defined with the particle size and the velocity and viscosity of the layer from which the particle enters the interfacial layer: Re 1 = V 1 a/ν > 10, and the corresponding entrance Froude number, F r 1 = V 1 /N a < 100, where N is calculated as defined below in Eq. (1).In Fig. 2, we summarize the existing results on the map of Re 1 , F r 1 .
The figure also demonstrates the limited number of studies in the literature that referred to this parameter range, in which Re 1 > 10, i.e., the particles have significant inertia, and at the same time, the stratification is relatively strong, i.e., F r 1 < 100.Outside of this parameter range, there are many more studies and comprehensive results.Most of these results are for cases of linear stratification and small particles at creeping flow regime, where Re 1 < 1 In the suggested parameter regime, the first experimental study of inertial particles settling through an interfacial layer of finite thickness is by Srdić-Mitrović et al. [7].The authors used a water-alcohol-brine system and particles in the ranges 1.5 < Re 1 < 15, 3 < F r 1 < 10.The authors attributed the particle slowdown to the additional drag force due to caudal fluid.In addition, they mentioned the plausible contribution of internal waves or modification of flow structure around the particle due to the density gradient [7].However, the authors studied only the time it took the spheres to reach minimum velocity and did [6], Srdić-Mitrović et al. [7] and this study, in terms of entrance Froude versus entrance Reynolds numbers.A -present study, B -Verso et al. [1], C - [6], and D - [7].Minima -refers to particles that exhibited a clear local minimum velocity, No minima -refers to particles without a local minimum, and Levitation -refers to particles that momentarily reverse their motion upwards, against gravity.
not investigate the phenomena in the denser bottom layer after the particles crossed the interface.Therefore, we do not have their estimate of the retention time (as it requires the tracking of the particle within the bottom layer), and thus cannot infer the particles' delay time τ .This prevents us from using their measurements to train or validate our model.
Abaid et al. [6] performed similar experiments with sharper interfaces in the ranges of observed different caudal wake effects and reported a "temporal levitation" phenomenon, where spheres momentarily reverse their motion upwards, against gravity.Abaid et al. [6] formulated a closure model as an additional virtual mass with its own degree of freedom, mimicking a caudal wake that moves upwards when the particle moves downwards.The authors did report prolonged time periods until the sphere regained steady state motion in the bottom layer, but did not study any time scales.Verso et al. [1] estimated the retention time based on the results of an experiment similar to Srdić-Mitrović et al. [7].The authors used a water-alcohol-brine system in ranges of 2 < Re 1 < 14, 0.6 < F r 1 < 4, and h/a ∼ 10.Verso et al. [1] did not observe the levitation phenomenon in their parameter range.However, they proposed a model for the additional resistance due to stratification and caudal fluid that helped to estimate the parameter range of Re, F r, and h/a in which the levitation is possible.In addition, Verso et al. [1] observed very prolonged t V 1 −V 2 timescales and developed their parametric model for the stratification force F s .The authors also show that the crossing time t V 1 −V 2 is inversely proportional to the particle Reynolds number in the bottom layer, Re 2 = V 2 a/ν 2 .Their parametric model also predicted the data from the literature [7].This is the only data that we could use for training, along with the data measured in our experiments.
Recently, Wang et al. [15] reported in their preprint a detailed experiment on the bouncing effect of particles, similar to the levitation reported by Abaid et al. [6].The authors measured and numerically simulated velocity fields and suggested a model of the resistance force.
Unfortunately, their data is not yet available for comparison.
The studies mentioned above are based on detailed measurements of a relatively low number of particles, 5−50, mainly due to the technically challenging experiments.The major challenges relate to control of the thickness and location of the interface without mixing the fluid layers, handling small spheres (a ∼ 10 ÷ 10 3 µm), tracking the particles with high spatial resolution, and in some cases enforcing refractive index matching.Although these experiments provide insight into the fluid mechanics and dynamics of the particles inside the interfacial layers, they do not create a statistically sufficient dataset to model the delay times.
We also suspect that there are more useful forms of Reynolds and Froude numbers using different velocity and length scale combinations.It appears that the present typical distinction of parameter regimes used in the literature in terms of Re 1 and F r 1 [2, 16-19] is incomplete.Examination of the map presented in Fig. 2 raises questions about its specificity regarding the effect of stratification resistance.We could expect a map on which it is more clear which particles experience different physics (feeling a significant caudal wake resistance) and which do not (i.e., crossing with quasi-steady-state velocity values).
In this study, we propose another approach: we deliberately simplify the experimental design, avoiding the difficulties associated with estimating the force component, F s , particularly the need for index refraction matching.Note that the correlation between Fs and index refraction stems from the need to precisely track the particles within the interfacial layer to measure Fs.This entails meticulously matching the refractive indexes of the top and bottom layers, ensuring minimal refractive differences throughout the particle's trajectory.By avoiding the need to match the refraction index of the top and bottom fluid layers, we can simplify the experimental design and significantly extend the parameter space to include previously unexplored regimes.
Furthermore, the simplified experimental design allowed us to obtain an unprecedented number of particle trajectories.We use this sufficiently big data set in the unexplored parameter regime with the custom-designed ML-based symbolic regression tool, SciMED [20].
We developed this tool to find symbolic regression correlations of τ , using hidden nondimensional parameters, that might have the potential to better explain the underlying physical mechanisms.This approach leads to an opportunity to find a new parametric predictive model for τ , using a "data-driven" methodology [20].
In Fig. 3, we plot our data together with the only data and existing correlation for the delay and retention times by Verso et al. [1].It is clear that we arrived at a different parameter regime for which previous correlations do not match.

MATERIALS AND METHODS
In this section, we describe in depth the materials, equipment, measures, and analysis techniques that were used, following the order of the scheme presented in Fig. 4. The unprecedented number of particle trajectories collected enables the use of machine learning methods.For that reason, we developed and applied a specific method that we abbreviated SciMED , described in detail in the recent publication [20].Our main interest at the end of the process is to find a new correlation that fits the available data of τ as a function of particle and fluid parameters.

Experimental Setup
We conducted experiments in a glass tank with a cross-section of 200 × 200 mm 2 and a depth of 300 mm, creating a top lighter layer of water (marked as layer 1) above a bottom heavier layer of water and Epsom salt (MgSO4) solution (marked as layer 2).We first fill the lighter fluid and pump the heavy fluid from a valve at the bottom of the tank, resembling the method used in Verso et al. [1,21].This results in an interfacial stratified layer between two fluid layers, created through molecular diffusion.The interfacial layer is growing very slowly with time (less than 10 mm per day), and in the present experiments, it is in the range h = 10a ÷ 100a.We varied the ratio of densities of the fluid layers ρ 2 /ρ 1 in the range of 1.040 to 1.200.For these low concentrations of Epsom salts, it is customary to ignore the small variation in fluid viscosity.Alike previous studies [7], we assume equal kinematic viscosity of both layers, The values of fluid density (ρ 1 , ρ 2 ) and Brunt-Vaisala frequency (N ) are shown in Table I, where in this study, N is calculated as follows:  Both the upper and lower fluid layers of the resulting medium are sufficiently deep for particles to reach terminal velocity.We used one type of high-quality commercially available spherical particles (Cospheric Inc., Santa Barbara, CA).These smooth spherical polystyrene particles have a density of 2.5 g cm −3 and a diameter range of 1000-1180 µm.We verified particle parameters using microscopy imaging and custom image processing code.We have found that the particles are close to perfectly spherical, with a diameter that is normally distributed within the prescribed range.
We released particles at the center of the tank, below the free surface, using a syringe pump and a long glass pipette filled with water and particles.The diameter of the pipette exit is slightly larger than the particle diameter, ensuring that only single particles can exit.We controlled the rate of the syringe pump and ensured that there was a sufficient time interval between the releases.Thus the majority of the settling particles crossed an undisturbed steady-state interface.Sometimes, however, we were unable to prevent particles from concentrating at the exit of the pipette and exiting with insufficient time intervals between them.In these cases, particles settle one after another with vertical distances of about 10 -100 diameters.The insufficient time between two consecutive particles means that some particles enter the interface at the same horizontal location, and the interface itself is distorted by the previous particle.
We repeated the process 17 times, with each experiment taking between 3 and 5.5 hours (from the moment of filling the tank until the moment of the last recorded measurement).
We filmed the motion of the settling particles using a digital high-speed camera (Photron Nova, 1024 × 1024 pixels) with a frame rate of 250 frames per second.

Trajectory tracking and data collection
We tracked the location of particles using the well-known particle tracking algorithm of OpenPTV [1,22].As our experiments were conducted with a relatively sharp interfacial layer, they have a significant drawback of substantial variation of refractive index in the interfacial layer.As a result, images of small particles from this region are significantly distorted, as demonstrated in zone (b) in Fig. 5. Similar to other studies [23], it is impossible to obtain particle center in this region with an uncertainty smaller than a particle radius.
Therefore, we could not measure particle velocity within the interfacial layer and focused only on measuring the information relevant to the delay time.Thus, we measured the time instant at which each particle starts slowing down, slightly before it enters the interfacial layer and the time instant at which it reaches the terminal velocity at the bottom layer, far below the layer of the strong refractive index gradient.
In addition, we use the particles to estimate the position of the interface.We mark the entrance and exit points for each trajectory based on particle velocity (outside the region of strong refractive index gradients).The entry point is marked at the location where V (z) = 0.99V 1 , and the exit point where either the particle reaches its minimum velocity, or if it doesn't experience a minimum, the terminal velocity V 2 , similar to Refs.[1,6,7,21].
We average these position values for groups of particles that were released in the same run and use this estimate as the approximate interface thickness ĥ.In other words, indicate that the particle is currently settling through a homogeneous fluid, while the smeared elliptical shape in the middle images implies that the particle is in the virtual interface layer ĥ.
ĥ is an average of individual interface thickness measurements, collected from trajectories of particles that were released in proximity to one another.Fig. 6 demonstrates the individual entrance and exit points (dashed lines) of three archetypes of particle trajectories that we measured, alongside the interface thickness ĥ calculated based on them (blue rectangle).To the best of our knowledge, this is the first time this method for estimating the position and thickness of the interface has been used.Note that it is as accurate as other definitions of interface thickness, such as sampling of fluids at fixed depths, intrusive measurements with a conductance probe, or imaging methods using dye or Schlieren optical methods [7,15].Our method avoids mixing and significantly prolongs the experimental run time, measuring thousands of trajectories.
As seen in Fig. 6, the three archetypes of particle trajectories are particles with a clear local minimum velocity (I), particles without a minimum (II), and particles that increase their velocity before entering the interfacial layer (III).Due to our inability to define particle position in zone (b), we "tracked" the particles in the sense that they are still linked to the same object, but there is no useful quantitative information in this region.Therefore, we masked the values tracked within the individually measured interface and used trajectories only to obtain V 1 , V 2 , the position of the minimum velocity, and the actual retention time we averaged the first and last 20-100 measurement points (at 250 fps), with the number of measurements to be averaged depending on the position of the interfacial layer entrance in the frame.For trajectories of archetype III, we adjusted the number of averaged measurement points to avoid the time with increased velocities.For trajectories of archetype I, where the trajectory fails to stabilize within the limits of the frame, we determined V 2 as the average of the last five measurement values of each trajectory.Moreover, the retention time is determined by the time interval between the moment the particle returns to a terminal velocity by reaching 0.99 ≤ V (t)/V 2 ≤ 1.01 and the time instant of departing from V 1 before entering the interfacial layer.
Next, similar to previous work [1], we estimate each particle diameter and density using the measurements of its terminal velocities in the top and bottom layers, and applying a force balance between the immersed weight, F W B = (ρ p −ρ f )-V g, and the drag Here ρ f is the density of the surrounding fluid (either ρ 1 or ρ 2 in the homogeneous layers), -V = πa 3 /6 is the particle's volume, A = πa 2 /4 is the projected surface area, and C D is the drag coefficient.The drag coefficient is estimated using the correlation [1,24]: We estimate the diameter and particle density using constrained optimization within the range provided by the manufacturer using the force balances in the upper and lower layers simultaneously, similar to Ref. [1].
In analogy to previous studies looking for correlations [1,6,7,25], we also used two average properties: the average density of the fluid medium ρ, and the average of terminal velocities V = 0.5(V 1 + V 2 ).From these properties, we are able to calculate the expected "crossing time" tV 1 −V 2 = ĥ/V that represents the time it takes the particle to cross ĥ assuming that F s is negligible.Finally, we calculated the delay times τ as the delta between the actual retention time t V 1 −V 2 and the expected crossing time tV 1 −V 2 (0 ≤ τ < ∞).

Data cleaning and preparation of non-dimensional dataset
Footage of 3,264 settling particles is freely available as open access data, divided into 16 experiment dates (one of the days had 2 experiments, thus 17 experiments in total).
We processed and extracted the dimensional properties and velocity trajectory of 2,039 particles from this database.The remaining 1,225 trajectories could not be tracked due to the technical limitations of the image processing and tracking algorithms.For example, due to the sharp refractive index gradient in zone (b) in Figs.5-6, some trajectories were misidentified as two separate trajectories; one starting from above the interface and ending within it, and the other starting somewhere within the interface and continuing until the end of the frame.We did not attempt to match fractions of trajectories together and deleted them from the dataset.Other examples of deleted trajectories are of particles that resulted in a calculated expected time ( tV 1 −V 2 ) that is greater than the measured retention time However, since we do not measure tV 1 −V 2 directly but estimate it using the averaged interfacial layer thickness ĥ, some particles appear as they have tV In such cases, the measured retention time is negative, and we discard these from the dataset used for the following stage.In summary, we have a dataset of 2,039 trajectories for which 0 ≤ τ < ∞.
The dimensional properties of all the particles in this dataset are presented in Table .II.
From this dataset, 18% of the analyzed particles do not experience a minimum velocity, and 50% exhibit a minimum that is significantly lower than V 2 (more than a 5% decrease from the terminal velocity in the bottom layer).This leaves 32% of particles with a minimum that is smaller than V 2 by 0.01% − 4.5%.Due to the large amount of uncertainty associated with the measurements, it is difficult to determine whether these particles truly experience a minimum or not.The dataset comprises of 9 dimensional properties of particles and fluids, i.e., particle diameter and density ρ p , a, the terminal velocities V 1 , V 2 , the delay time τ , the fluids density and viscosity ρ 1 , ρ 2 , ν, and the averaged interfacial layer thickness ĥ.Prior to symbolic regression of a correlation for τ , we initially create a dataset of nondimensional parameters.
According to the Buckingham theorem of dimensional analysis, the number of nondimensional parameters π i equals the number of relevant dimensional variables minus the number of dimensional units.In our case, the number of dimensional units is 3: either mass, length, and time, or force, length, and time.We estimated that we need six non-dimensional parameters to formulate a correlation for the delay time, listed in Table .III.
Second, we had to select the form of every dimensionless parameter.This is because every dimensionless parameter π i can be calculated in various options based on variables with the same dimensions (between 1 and 384 different options depending on the complexity of the dimensionless parameter).For instance, the Reynolds-like parameters can be defined using one of the two characteristic length scales (a, ĥ) and three different velocities ), resulting in six plausible options.Each of these options will lead to a different meaning of the parameter and different relative contributions of the physical properties in that parameter.Table .III includes the number of options for each parameter.
Third, we needed a non-dimensional time scale based on the delay time τ defined above.
For that, we used the expected crossing time tV 1 −V 2 for the normalization.This value can be obtained for the particle of a known size and density for a given interfacial layer.Therefore, it is useful for future prediction applications.The normalized delay time scale is then Theoretically, it can be any value in the range 0 ≤ τ < ∞, with infinite values corresponding to particles entrapped in the interfacial layer.

Selection of six non-dimensional parameters through a feature selection process
The first stage in symbolic regression (SR) was to choose the "best" option for each of the six π i parameters (e.g., Reynolds-like, Froude-like, etc.) out of a total of 412 different options presented (see No. of options column in Table III).This is a major component of the proposed methodology, called feature selection, and it is only available due to the large dataset of measurements.We implemented the process in SciMED , which selects six features (each feature being one of the options for each π i ) that are "most informative" to the dimensionless delay time."Most informative" means that an ML model for "black box" prediction, generated the most accurate predictions of τ using the data of the selected six features, compared to all other feature combinations.
During the feature selection process, SciMED incorporates the information about the category of each feature, meaning to which π i parameter it can be assigned.Then, it uses a genetic algorithm (GA) to generate tens of thousands of subsets of the original dataset, where each subset includes only six features, one from each π i .The subsets are evaluated and compared by training an AutoML model [20] that predicts τ based on the selected set of features.The score of each subset is determined by the accuracy of the prediction that was achieved.Finally, the subset leading to the most accurate result is passed as the dataset for the SR process.

Symbolic regression for the delay time
In this step, SciMED replaces the scientist, and instead of manually searching for the best fit, it suggests the optimal equation structure and parameters that best predict τ of all particles in the dataset.For that purpose, the Las Vegas -based SR component [20] searches for the optimal correlation.This step results in a list of three possible equations, ordered by their complexity (in terms of the number of operators and parameters in it).
Each equation consists of a numerical prefactor α and a constant term β that is optimized for our dataset of samples (i.e., τ = αf (π 1 , ..., π 6 ) + β).For cases of data collected from particles or fluid medium that differ from this experiment, such as the data by Verso et al.
[1], we run another optimizer to obtain the α and β numerical terms.Eventually, for the sake of generalization, we suggest one ML-selected correlation (i.e., equation resulting from the SR) that best fits both our data and the data from Verso et al. [1].

Feature selection
We report here the set of six π i that result from a feature selection process, comparing 43,000 variations.This set supposedly comprises the dimensionless numbers representing the dominant mechanisms determining the normalized delay time τ .
In Table III, we present a comparison between the most typical choice in the literature and the ML-selected form of each π i parameter.In the right column, we present the value range for each parameter according to the selected form and our measurements.
In Fig. 7, we compare τ = f (π i ) in the conventional (top row) and the selected (bottom row) form of four different π i parameters.The dimensionless parameters of length and velocity are not presented in this comparison as, in this case, there is no other plausible form to calculate them.As seen in the figure, the collapse of data of the π i parameters versus the normalized delay time leads to improved correlation, although weak for each dimensionless parameter separately.Furthermore, the parameters leading to the most explicit trends are the selected Re-like and F r-like ratios, which from here on now, will be named Re and F r.
In Fig. 8, we present our measurements (marked as A, for either particle that experiences a significant velocity minimum or those that do not) and that of Verso et al. [1] (marked as Length scales 1 ĥ/a ĥ/a 25 -50  B, varying from type P1 to P4), over the F r = f ( Re) map, similarly to Fig. 2. We did not plot the measurements of Abaid et al. [6] and Srdić-Mitrović et al. [7] that were presented in Fig. 2, as we do not have access to the data needed to calculate F r. Note that particle type P3 corresponds to the case that did not show minima (see Verso et al. [1]).This new map emphasizes for the first time that in terms of the new dimensionless parameters F r and Re, all the particle types are different, a fact that was not observable in the conventional F r 1 = f (Re 1 ) map [1].Note that for our measurements, there is also a notable though the small difference between the particles without and with significant minima.
In Fig. 9, we show the scatter of the predicted normalized delay time (using versus the actual measurements of τ .In each plot, a linear trend line (dashed line) is fitted to all predicted samples.It is clear that the trend line fitted to the predictions of f 1 (left) is closest to the expected y = x trend (solid line).It is also noteworthy that the results were obtained with particles measured in two separate sets of experiments (ours and that of [1]), using two different fluid layer combinations, five particle types, and 10 density jumps.
Based on them we suggest to use f 1 as the correlation for τ , which can be written as: N = g a and if we substitute N it reads:

DISCUSSION
In this study, we took a new approach to reveal the structure of empirical results.We utilized machine learning techniques to study the complex behavior of inertial particles crossing an interfacial stratified layer.We simplified the experimental setup to gain a substantial expansion of the previously unexplored parameter range and an unprecedented number of particle trajectory datasets.With this data, we aimed to find a correlation that covers our and previous results.All the experimental results are open to the community, and we hope that additional ideas and new correlations can be found through new approaches.
Consistently with previous results [1,[6][7][8], we also find that not all particles in this parameter range experience slowdown and minimal velocity.In our case, 18% of all the trajectories did not experience a minima.More research is needed to understand why this effect occurs, but possible factors include particle rotation, the timing of particle release, non-spherical shapes, or interface oscillation.New experiments would be necessary to better understand this effect.
We demonstrate that the normalized delay time (τ ) has a stronger dependence on the newly selected forms of dimensionless parameters rather than the conventional form, as shown in Fig. 7.We also inspect the selected forms of the dimensionless parameters and infer the possible meaning of the selection.Thus, Re is based on average velocity and not on the entrance velocity as it was suggested by previous works [1,7].The Reynolds number does depend on the particle size, and it represents the so-called particle Reynolds number.
The particle density is normalized to the averaged fluid density (which is the approximate density of the interface, assuming linear density gradient).It seems to be more reasonable to incorporate both densities instead of the conventional ρ 2 /ρ 1 .
We also learn that because there is a strong correlation between the Reynolds and Froude numbers (see the new map in Fig. 8), the final correlation does not include the Reynolds number explicitly.The most dominant effect in this problem is stratification, and it determines the retention and the delay times.Therefore, the most dominant dimensionless parameter here is the Froude-like parameter.Furthermore, the newly selected Froude number differs from the one typically suggested in the literature.It relates linearly to the interfacial layer thickness, h, and only as a square root of the particle diameter a −0.5 .In other words, we can infer that the dimensionless frequency, N should be defined with the particle diameter, while Froude number as F r = V / ĥ N .The better choice of this form for the Froude number is also supported by the better collapse of data than the rest of the π i parameters.We also compared it to two Froude number forms and see that it is significantly stronger related to the delay time, i.e. (τ = f ( F r) leads to R 2 = 0.19, while both F r 1 and F r 2 lead to an order of magnitude weaker correlation with R 2 = 0.01.
Undoubtedly, our experiment has limitations.Although it enabled the sampling of thousands of trajectories, it also led to increased uncertainty and lesser control over particle diameter, density, and less precise release timing.The main disadvantage is the lack of index refraction matching and the corresponding lack of detailed trajectories inside the interfacial layer.If such an experiment can be performed, it would be possible to implement SciMED for particle trajectories and compare the point-wise position and velocity of each particle versus the predicted equation of motion.We expect this approach to significantly improve our ability to predict the correct form of the stratification force and formulate a model in a more general form of detailed physical mechanisms.Additionally, we appreciate the idea of an anonymous reviewer to try to construct additional dimensionless features using a new length scale that instead of h will be of the type of l ρ = 2(∂ρ/∂z)(∂ 2 ρ/∂z 2 ).We encourage the readers to use SciMED's open-source code and the openly shared data to verify whether this analysis leads to a better prediction of the retention time scale.

FIG. 1 .
FIG. 1. Schematic description of the problem of a particle crossing a stratified interfacial layer.

1 FIG. 2 .
FIG. 2. Summary of the parameter range of the experimental results of Verso et al. [1], Abaid et al.

FIG. 5 .
FIG. 5. Digital snapshots of particle positions taken at different moments throughout its motion, moving from left to right sequentially.Spherical particle shapes in the left and right images

FIG. 6 .
FIG.6.Three typical plots that can be found in the dataset, showing the dimensional velocity of a particle versus the distance traveled from the top of the viewing.These particular trajectories are taken from the same group of particles that were released successively in the presence of a density ratio of ρ 2 /ρ 1 = 1.040.The dashed lines indicate the stratification region as determined by each particle's motion, while the colored area (zone b) indicates the interfacial layer determined by the average of the located entrance and exit points.Zones marked as a-c indicate the upper and lower fluid layers.

FIG. 7 .
FIG. 7. Normalized delay time τ versus four different π i parameters in their conventional form (top row) and the from selected by SciMED (bottom row).Results are shown with the coefficient of determination, R 2 , representing the variance captured by a linear regression between the X and Y axis of each plot (dashed line)

FIG. 8 .
FIG. 8. Parameter range of the experimental results of this study (A) and of four types of particles presented in Verso et al. [1] (B P1-P4), in terms of F r versus Re (i.e., the Re and F r-like parameters selected by SciMED ).

FIG. 9 .
FIG. 9. Predicted normalized delay time as calculated by f 1 −f 3 versus the actual measurements of τ .The scatter is the experimental results of this study (A) and of four types of particles presented in Verso et al. [1] (B P1-P4).

TABLE I .
Ranges of the properties the fluid layers varied in, across all experiments.

TABLE II .
Ranges of the properties of the particles and the dimensionless numbers of the top and bottom layers varied across all experiments.

TABLE III .
Comparison of the six non-dimensional parameters in the conventional form and the form selected by SciMED .For the length and velocity scales, there is only one reasonable selection.The ranges are for the selected form.

Table .
[1]presents the three types of equations suggested by the SR component of SciMED together with the numerical constants optimized to our dataset.The three typically used success metrics of coefficient of determination R 2 , mean absolute error (MAE), and mean squared error (MSE) presented were calculated from all samples of our study, together with those of Verso et al.[1].As seen in the table, all the suggested correlations resulted in similar scores.It is difficult to decide which correlation is "better," but this is also not the purpose of this study.In order to obtain a much better correlation, we need to obtain more measurements of very different types of particles, with different fluids, interfacial layer thickness, and density jumps.To this end, we obtained symbolic expressions representing the result better than the previous studies.Furthermore, we focus on the question of why these correlations are better than the previous ones and what underlying physical mechanisms these selections could highlight.

TABLE IV .
Three plausible correlations for the normalized delay time, as suggested by SciMED .
[1] R 2 , mean absolute error (MAE), and mean squared error (MSE) metric scores are reported based on measurements of this study and that of Verso et al.[1].