Application of Machine Learning to Predict the Acoustic Cavitation Threshold of Fluids

Abstract: The acoustic cavitation of fluids, as well as related physical and chemical phenomena, causes a variety of effects that are highly important in technological processes and medicine. Therefore, it is important to be able to control the conditions that allow cavitation to begin and progress. However, the accurate prediction of acoustic cavitation depends on a complex relationship between external influence parameters and fluid characteristics. This multiparameter problem restricts the development of successful theoretical models. As a result, it is critical to identify the most important parameters influencing the onset of the cavitation process. In this paper, the ultrasonic frequency, hydrostatic pressure, temperature, degassing, density, viscosity, volume, and surface tension of a fluid were investigated using machine learning to determine their significance in predicting acoustic cavitation strength. Three machine learning models based on support vector regression (SVR), ridge regression (RR), and random forest (RF) algorithms with different input parameters were trained. The results showed that the SVR algorithm performed better than the other two algorithms. The parameters affecting the active cavitation nuclei, namely hydrostatic pressure, ultrasound frequency, and outgassing degree, were found to be the most important input parameters influencing the prediction of the cavitation threshold. Other parameters have a minor impact when compared to the first three, and their role can be compensated for by alternative variables. The further development of the obtained results provides a new way to optimize and improve existing theoretical models.


Introduction
When tensile stresses occur in a fluid, voids form and develop. This effect is known as cavitation. Depending on the cause of tensile stresses, cavitation is classified into two types: acoustic cavitation and hydrodynamic cavitation. Hydrodynamic cavitation is a process in a flowing fluid accompanied by bubble implosion due to a decrease and subsequent increase in local pressure. Hydrodynamic cavitation has a few negative consequences, including turbine and propeller erosion [1]. Acoustic cavitation occurs when high-intensity acoustic waves pass through a fluid. Due to the technical capabilities for generating and controlling acoustic waves, acoustic cavitation is of special interest to researchers. The physical and chemical effects that are observed during acoustic cavitation are used in a wide range of applications, such as obtaining biofuels [2], the acceleration of chemical reactions [3], recycling of food waste [4], and many other applications [5][6][7].
The presence of solid, vapor, or gas micro-inhomogeneities in a fluid promotes both types of cavitation. These heterogeneities weaken the fluid locally. Therefore, they are also known as cavitation nuclei. When an acoustic wave of a certain amplitude is applied, the cavitation nuclei begin to expand in the negative pressure phase and then shrink in the rising pressure phase. In the case of stable cavitation, bubbles perform oscillatory movements during multiple periods of tension and compression phases. This type of cavitation is associated with small acoustic microflows and local heating. In the case of unstable cavitation, the bubbles collapse during one or several tension-compression cycles. Unstable acoustic cavitation is accompanied by hydrodynamic microshock waves and the release of large amounts of energy.
It is intuitively clear that cavitation is a multi-parameter process associated with fluid and acoustic impact characteristics at various scale levels. The acoustic threshold of cavitation, for example, is the amplitude of ultrasound at which bubbles in the fluid collapse. Hydrostatic pressure, fluid temperature, ultrasound frequency, degree of degassing of the fluid, and other parameters can all have an influence on the behavior of the threshold value [8,9]. It has been demonstrated in experimental works [10,11] that increasing the frequency of ultrasound raises the threshold of acoustic cavitation. The effect of fluid temperature on the acoustic cavitation threshold was investigated in [12]. The experimental work [13] investigates the acoustic cavitation threshold's dependence on the gases dissolved in fluid. The authors of the work [14] analyzed the acoustic strength of oil depending on different values of hydrodynamic pressure. It is also worth noting the investigation into the possibility of using the cavitation number to determine the threshold of acoustic cavitation by analogy with hydrodynamic cavitation. The cavitation number is a common dimensionless value that characterizes the potential of the flow to cavitate. Despite the fact that its definition for acoustic cavitation is not as obvious as for hydrodynamic cavitation, some studies show that it can be estimated taking into account the parameters of ultrasonic setups [15,16].
Theoretical models estimating threshold parameters of ultrasonic cavitation were developed to study an extensive amount of experimental data. The classical nucleation theory (CNT) [17] is one approach. To calculate the cavitation threshold, this approach uses the fact that the vapor-gas phase in a liquid is nucleated when the liquid's pressure drops below the pressure on the liquid-vapor curve. The cavitation threshold can also be evaluated using equations that describe the dynamics of a cavitation bubble in a fluid. The Blake criterion is based on this approach [18]. This criterion is often used to quantitatively estimate the amplitude threshold. To successfully apply this approach, it is necessary to know the size and distribution of cavitation nuclei in the fluid, which is difficult in most cases. In contrast to the previous two methods, the incubation time criterion of cavitation takes into account the frequency of ultrasound [19]. This approach draws an analogy between solid failure and fluid cavitation.
Despite the availability of theoretical methods to determine the acoustic strength of fluids, they do not always qualitatively and quantitatively describe experimental data. For example, [20] demonstrated that the Blake criterion qualitatively models fluid tensile strength only at frequencies less than 500 kHz. It is shown in [21] that the theoretical curve of the acoustic strength of heavy water as a function of temperature calculated within CNT does not match with the experimental data. In the case of the cavitation incubation time criterion, the question of the correct choice of the incubation time and its physical nature remains unanswered [20].
The fundamental dependence of a fluid's acoustic strength threshold on specific parameters remains unresolved, and theoretical models use their own set of parameters. However, conducting experiments with each parameter determined separately is quite a complicated task. In this regard, it is necessary to highlight the primary set of parameters that have the greatest influence on a fluid's cavitation strength.
In recent years, machine learning models have been widely used to solve complex engineering problems in various fields. For example, machine learning was used to process and interpret the experimental data [22,23]. In the work [22], a machine learning model was developed to detect cavitation and its intensity using acoustic signals. The work [23] considered the strength of rocks. Given these successful developments, the application of machine learning in the field of cavitation can improve the ability to predict the threshold of acoustic cavitation.
The purpose of this paper is to examine machine learning models for predicting fluid tensile strength under the action of acoustic waves. The models' sensitivity to input parameters was evaluated in order to identify the degree to which each parameter affects the threshold characteristic of cavitation. First, we collected experimental data from the literature on acoustic cavitation thresholds. Then, several machine learning models were built and trained on three-quarters of the available data, and the remaining data were used to test the quality of predictions made by the chosen models. Finally, the influence of the input parameters on the considered models is discussed.
The cavitation incubation time criterion in general form can be written as [19]: where $P(t)$ is the time profile of sound wave pressure, $P_c(T)$ is the static threshold, $T$ is the fluid temperature, $\alpha$ is the parameter that has a relationship to the viscosity of the fluid [34], $P_{st}$ is the hydrostatic pressure, $P_{ph}$ is the saturated vapor pressure of the fluid at temperature $T$, $A$ and $f$ are the amplitude and the frequency of acoustic oscillations, $\tau$ is the incubation time of cavitation, $\tau_0$ is the incubation time typical for a given spatial scale level (can be considered as a timescale), $k$ is the Boltzmann constant, and $W$ is the fraction of energy required to start cavitation in a representative volume of a given scale level.
According to this approach, the main parameters that influence cavitation are hydrostatic pressure, ultrasound frequency, fluid temperature, and fluid viscosity. The expression for the strength of a fluid $P_{th}$ on a bubble of radius $R_0$ can be represented by the Blake criterion [18]: where $\sigma$ is the surface tension. According to the classical nucleation theory, the threshold value of the acoustic pressure $P_{CNT}$ in volume $V$ during time $\xi$ can be determined by the following formula [17,35]: where $J_0$ is the value that can be approximated as [36]: where $n$ is the number density of the fluid, and $m$ is the mass of the molecule.
Based on the approaches described above, the following parameters were chosen to train machine learning models: hydrostatic pressure, surface tension, ultrasonic frequency, fluid density, fluid viscosity, and fluid volume. Another important parameter is the size distribution of cavitation nuclei in the fluid. Thus, the "fluid type" parameter was introduced because the average radius of cavitation nuclei decreases with fluid purification. This parameter indicates the degree of fluid preparation prior to the experiment: 0 represents untreated fluid, 1 represents distilled fluid, and 2 represents ultrapure fluid.
The Pearson correlation coefficient between the parameters was calculated to understand the basic correlations between the selected parameters (see Figure 1). The acoustic cavitation strength of a fluid is highly correlated with hydrostatic pressure and the fluid type parameter. Note that the negative correlation between temperature and fluid density can be explained by the fact that an increase in temperature causes a decrease in fluid density.
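As an illustration of the correlation analysis described above, the following minimal sketch computes a Pearson correlation matrix for a small table. The column names and numbers are hypothetical placeholders (the densities are approximate values for water), not the dataset used in this work.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical illustrative values, not the data analyzed in the paper.
toy = {
    "temperature_C": [10.0, 20.0, 30.0, 40.0, 50.0],
    "density_kg_m3": [999.7, 998.2, 995.7, 992.2, 988.0],
    "hydrostatic_kPa": [100.0, 300.0, 200.0, 500.0, 400.0],
}
corr = correlation_matrix(toy)
```

In this toy table, the temperature-density entry is strongly negative, which mirrors the physical explanation given in the text.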
Let us consider the basic principles of the machine learning methods that are then used to predict and analyze the threshold characteristics of cavitation.
The first model is ridge regression, a linear regression adapted to multicollinear data, which is one of the simplest models. Its results are simple to interpret, and it is appropriate for small amounts of data. Another model employs a random forest algorithm, which can capture more complex relationships between parameters with a reduced risk of overfitting. The support vector regression (SVR) model, which learns well but is more difficult to interpret, is also suitable for small amounts of data.
Ridge regression is an improved form of linear regression. It can be applied to multicollinear data and is also more robust to errors. Ridge regression constrains the linear regression coefficients by adding a penalty coefficient $\alpha$, and the problem is reduced to minimizing the expression [37]:

$$\min_{w} \; \|Xw - y\|_2^2 + \alpha \|w\|_2^2,$$

where $X$ is the regression matrix, $y$ is the vector of values (in our case, the vector of threshold values of acoustic cavitation), and $w$ is the weight vector. By varying the parameter $\alpha$, we can find the optimal solution.
Figure 2 shows how a random forest algorithm creates an ensemble of decision trees and averages the results of each decision tree. The algorithm implementation can be conventionally divided into several stages:
1. The formation of a repeated random subsample from the initial training dataset.
2. Creating a decision tree from randomly selected features. This means that, in the case of the problem under consideration, this tree may not take into account one or more parameters, for example, temperature, density, or some other parameter.
3. Construction of an ensemble of trees, each of which is built on its own subsample.
The prognostic result will be the average of all readings. The final outcome can be written as

$$y = \frac{1}{n} \sum_{i=1}^{n} y_i,$$

where $y_i$ is the prediction of the $i$th tree and $n$ is the number of trees [38]. The number of trees in the ensemble and the structure of the decision trees can be changed to optimize this method.
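The three stages above can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the implementation used in this work: each "tree" is a depth-1 regression stump on a single randomly chosen feature, and the function names and toy data are hypothetical.

```python
import random


def fit_stump(rows, targets, feature):
    """Depth-1 regression tree on one feature: pick the split with the lowest squared error."""
    best = None
    values = sorted({r[feature] for r in rows})
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= thr]
        right = [t for r, t in zip(rows, targets) if r[feature] > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # the subsample holds a single feature value: predict the mean
        mean = sum(targets) / len(targets)
        return lambda row: mean
    _, thr, lm, rm = best
    return lambda row: lm if row[feature] <= thr else rm


def fit_forest(rows, targets, n_trees=50, seed=0):
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # stage 1: bootstrap subsample
        feature = rng.randrange(len(rows[0]))           # stage 2: random feature choice
        trees.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature))
    return trees                                        # stage 3: the ensemble


def predict_forest(trees, row):
    """Average of the individual tree predictions, y = (1/n) * sum(y_i)."""
    return sum(tree(row) for tree in trees) / len(trees)


# Hypothetical toy data: (hydrostatic pressure in kPa, temperature in deg C) -> threshold in kPa.
rows = [(100.0, 20.0), (200.0, 40.0), (300.0, 20.0), (400.0, 40.0), (500.0, 20.0), (600.0, 40.0)]
targets = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0]
forest = fit_forest(rows, targets)
low = predict_forest(forest, (100.0, 20.0))
high = predict_forest(forest, (600.0, 40.0))
```

Because every stump predicts a mean of training targets, the averaged prediction always stays within the range of the training thresholds; this is one reason tree ensembles extrapolate poorly to sparsely sampled regions.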
The support vector machine (SVM) model was originally developed for data classification and later modified for regression analysis [39]. In classification, the purpose of SVM is to find the optimal hyperplane that best separates the data. Support vector regression (SVR) is an extension of SVM in which the algorithm is adapted for regression tasks. As the model is trained, the parameters of the hyperplane are continually adjusted so that the data points deviate from it as little as possible (Figure 3). In the case under consideration, the dataset from the table can be represented as $\{x_i, y_i\}$ $(i = 1, 2, \ldots, k)$, where $y_i$ is the acoustic strength of the fluid, the vector $x_i$ contains the other parameters from the table, and $k$ is the number of data points. Then, the value of acoustic cavitation strength can be written as:

$$y = w\,\varphi(x) + b,$$

where $\varphi$ is the nonlinear function defined implicitly through the choice of the kernel function, $b$ is the bias constant, and $w$ is the weight vector.
To find the optimal parameters, the expression $\frac{1}{2}\|w\|_2^2$ should be minimized under the following condition [40]:

$$|y_i - w\,\varphi(x_i) - b| \le \varepsilon,$$

where $\varepsilon$ is the threshold parameter. However, with this approach, some experimental points may make the problem impossible to solve. Slack variables $\xi_i$, $\xi_i^*$ are introduced to address this shortcoming. Then, the problem can be written as follows [41]:

$$\min_{w,\,b,\,\xi,\,\xi^*} \; \frac{1}{2}\|w\|_2^2 + c \sum_{i=1}^{k} (\xi_i + \xi_i^*) \quad \text{subject to} \quad y_i - w\,\varphi(x_i) - b \le \varepsilon + \xi_i, \quad w\,\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i,\ \xi_i^* \ge 0, \tag{8}$$

where $c$ is the penalty coefficient.
Using the method of Lagrange multipliers for problem (8), the optimal hyperplane equation is as follows [42]:

$$y = \sum_{i=1}^{k} (\alpha_i - \alpha_i^*)\, K(x_i, x) + b,$$

where $\alpha_i$ and $\alpha_i^*$ are Lagrange multipliers or weights, and $K(x_i, x)$ is the kernel function. The kernel function for the considered problem statement is as follows:

$$K(x_i, x) = \exp\left(-\gamma \|x_i - x\|^2\right).$$

As a result, two parameters should be chosen to optimize the method. The parameter $c$ is responsible for assigning a penalty for leaving the $\varepsilon$-band. The parameter $\gamma$ is responsible for the effect of one measurement on the result.
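To make the roles of the kernel, the band width, and the dual weights more concrete, the sketch below evaluates an SVR-style prediction for given support vectors. It assumes a radial basis function kernel with parameter gamma; the function names and the numerical values are hypothetical, and the training step (solving for the weights) is not shown.

```python
import math


def rbf_kernel(xi, x, gamma):
    """Radial basis function kernel K(xi, x) = exp(-gamma * ||xi - x||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel expansion y(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for support vector i.
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the band of half-width epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical values chosen only to exercise the functions.
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
dual_coefs = [2.0, -1.0]
prediction = svr_predict((0.0, 0.0), support_vectors, dual_coefs, bias=1.0, gamma=0.5)
narrow = rbf_kernel((0.0,), (1.0,), gamma=10.0)  # large gamma: influence decays quickly
wide = rbf_kernel((0.0,), (1.0,), gamma=0.1)     # small gamma: influence reaches further
```

Points whose loss is zero lie inside the band and need no slack variable, which corresponds to the points labelled 5 in Figure 3.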
The results of applying the considered machine learning algorithms for estimating the threshold of cavitation onset according to different ultrasound and fluid parameters are discussed in the following section.
Figure 3. A schematic diagram of the principle of the SVM algorithm. 1: optimal hyperplane $y = w\varphi(X) + b$; 2: $y = w\varphi(X) + b + \varepsilon$; 3: $y = w\varphi(X) + b - \varepsilon$; 4: points for which a slack variable is to be introduced; 5: experimental points for which the problem is solved without introducing a slack variable.

Results and Discussion
The machine learning models were trained on 75% of all the data, with the remaining 25% used to evaluate their performance. The predicted parameter was the fluid's acoustic cavitation threshold, and the other parameters (hydrostatic pressure, temperature, ultrasound frequency, fluid density, fluid type, fluid viscosity, and surface tension) were chosen as input parameters. The grid search method [43] was used to find the optimal parameters.
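The training protocol described above (a 75%/25% split followed by a grid search) can be sketched as follows. This is only an illustration under stated assumptions: a one-dimensional k-nearest-neighbour regressor stands in for the actual models, the grid search scores candidates by cross-validated RMSE on the training part, and all names and data are hypothetical.

```python
import math
import random


def train_test_split(rows, targets, test_fraction=0.25, seed=0):
    """Shuffle the indices and hold out test_fraction of the data for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(rows) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    pick = lambda seq, ids: [seq[i] for i in ids]
    return pick(rows, train), pick(targets, train), pick(rows, test), pick(targets, test)


def knn_predict(train_x, train_y, x, k):
    """Stand-in model: mean target of the k nearest training points (1-D inputs)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def grid_search(train_x, train_y, k_grid, n_folds=3):
    """Return the hyperparameter with the lowest mean RMSE over the validation folds."""
    scores = {}
    for k in k_grid:
        fold_scores = []
        for fold in range(n_folds):
            val = [i for i in range(len(train_x)) if i % n_folds == fold]
            fit = [i for i in range(len(train_x)) if i % n_folds != fold]
            fx, fy = [train_x[i] for i in fit], [train_y[i] for i in fit]
            preds = [knn_predict(fx, fy, train_x[i], k) for i in val]
            fold_scores.append(rmse([train_y[i] for i in val], preds))
        scores[k] = sum(fold_scores) / n_folds
    return min(scores, key=scores.get), scores


# Hypothetical toy data: a noiseless linear relation.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
train_x, train_y, test_x, test_y = train_test_split(xs, ys)
best_k, scores = grid_search(train_x, train_y, k_grid=[1, 2, 3, 5])
test_rmse = rmse(test_y, [knn_predict(train_x, train_y, x, best_k) for x in test_x])
```

The held-out test part is touched only once, after the hyperparameter has been selected, so that the reported error is not biased by the search itself.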

Model Comparison
Initially, the models' effectiveness was assessed by plotting the predicted acoustic cavitation threshold against the measured value. If the points follow the diagonal line very closely, the predicted acoustic strength is close to the measured one. As can be seen in Figures 4-6, the greater the density of points, the better the predictive ability of the models. Once the density of points decreases, as when $P_{th}$ > 30,000 kPa, the accuracy of the models decreases. This behavior can be explained by the fact that there are not enough experimental points in this region to accurately predict the fluid's acoustic cavitation strength.

Figure 6. Comparison between measured and predicted acoustic cavitation thresholds using the support vector regression method. 1: Training data; 2: test data; 3: best line.
For quantitative analysis, the metrics RMSE (root mean square error), MAE (mean absolute error), and MedAE (median absolute error) were calculated:

$$\mathrm{RMSE} = \sqrt{\frac{1}{M} \sum_{i=1}^{M} (y_i - \hat{y}_i)^2}, \qquad \mathrm{MAE} = \frac{1}{M} \sum_{i=1}^{M} |y_i - \hat{y}_i|, \qquad \mathrm{MedAE} = \operatorname{median}_{i} |y_i - \hat{y}_i|,$$

where $y_i$ and $\hat{y}_i$ are the measured and predicted threshold values, and $M$ is the number of points in the evaluated set.
The ridge regression has the highest error among all of the models considered (Figure 7). This may indicate that the relationship between the parameters is non-linear. RF and SVR have approximately the same values of the metrics.
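For reference, the three metrics can be computed with a few lines of standard-library Python. The threshold values below are hypothetical and serve only to show how the metrics differ.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def medae(y_true, y_pred):
    """Median absolute error."""
    return statistics.median(abs(a - b) for a, b in zip(y_true, y_pred))


# Hypothetical measured and predicted thresholds in kPa.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 400.0]
errors = (rmse(measured, predicted), mae(measured, predicted), medae(measured, predicted))
```

RMSE weights large deviations most heavily, whereas MedAE is insensitive to a few outliers, so comparing the three gives a rough picture of how the errors are distributed.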

Influence of Input Parameters
The SVR and RF models agree well with the experimental data, but in acoustic cavitation research it is not always possible to collect a large amount of data that includes each of the eight selected parameters. The absence of some parameters can have a significant impact on the predictive ability of the chosen models, while others may have no effect at all on the predicted acoustic cavitation threshold. To determine the importance of a parameter, the values of that parameter in the original data were randomly rearranged. The relationship between the parameter and the original value is broken in this way, and the decrease in the quality estimate indicates how much the original model depends on this parameter. This method can be used for any machine learning method if the data are tabulated.
Let $D$ be the original data table for the problem under consideration. Each column of $D$ corresponds to one of the parameters. These parameters include hydrostatic pressure, surface tension, ultrasonic frequency, fluid density, fluid viscosity, fluid volume, and fluid type. Each row of $D$ corresponds to an individual experiment or observation in which the values of these parameters were measured. Let us calculate the metric $s$ (RMSE, MAE, or MedAE) for the machine learning models trained on table $D$. Consider column $j$ of table $D$, which can be static strength, ultrasound frequency, or any other parameter. This column is shuffled at random $N$ times, which generates $N$ tables $D_{ji}$ $(i = 1, 2, \ldots, N)$. The metric $s_{ji}$ is then calculated for the considered models using $D_{ji}$ as the input. The importance $m_j$ of the $j$-th variable is obtained by comparing the metrics $s_{ji}$, averaged over the $N$ shuffles, with the baseline value $s$. The greater the value of $m_j$, the more the predicted value depends on the corresponding parameter $j$, whereas a value of $m_j$ close to zero indicates that the parameter has no influence on the predicted value.
Figure 8 demonstrates the results of calculating $m_j$ for each input parameter. The prediction of cavitation onset for both machine learning models is most sensitive to hydrostatic pressure: $m_j$ for hydrostatic pressure has the highest value of all input parameters. It is important to note that this fact has been experimentally confirmed [12,44]. Additionally, it should be noted that hydrostatic pressure is already included as a parameter in analytical models.
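The shuffling procedure described above can be sketched as follows. The sign convention is an assumption made for this illustration: importance is taken as the mean RMSE over the shuffled tables minus the baseline RMSE, so that larger values mean stronger dependence. The model, function names, and data are hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def permutation_importance(predict, rows, targets, column, n_repeats=20, seed=0):
    """Mean increase in RMSE after shuffling one column n_repeats times.

    Assumed convention: importance = mean(shuffled RMSE) - baseline RMSE.
    """
    rng = random.Random(seed)
    baseline = rmse(targets, [predict(r) for r in rows])
    shuffled_scores = []
    for _ in range(n_repeats):
        values = [r[column] for r in rows]
        rng.shuffle(values)  # break the link between this column and the target
        shuffled = [r[:column] + (v,) + r[column + 1:] for r, v in zip(rows, values)]
        shuffled_scores.append(rmse(targets, [predict(r) for r in shuffled]))
    return sum(shuffled_scores) / n_repeats - baseline


# Hypothetical fitted model that uses only the first column and ignores the second.
predict = lambda row: 3.0 * row[0]
rows = [(float(i), float(i % 3)) for i in range(12)]
targets = [3.0 * r[0] for r in rows]
importance_used = permutation_importance(predict, rows, targets, column=0)
importance_ignored = permutation_importance(predict, rows, targets, column=1)
```

A column that the model never uses receives an importance of exactly zero, because shuffling it leaves every prediction unchanged.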
The fluid type (pre-treatment) also had a significant impact on acoustic cavitation strength. As previously stated, the fluid type is related to the distribution of cavitation cavity sizes in the fluid. This influence has also been confirmed by various works, e.g., [8,45]. The size of cavitation nuclei is considered in the traditional approach, the Blake criterion (2), as well as indirectly in the cavitation incubation time criterion (1) [46].
The frequency of ultrasound comes in third place in terms of influence. Despite the fact that most studies are usually associated with only one frequency of ultrasound, the effect of frequency on the cavitation threshold has already been demonstrated in several early studies [45,47]. This may also be related to the size of the cavitation nuclei. When the frequency of ultrasound increases, the resonance radius of the bubble changes, which leads to cavitation, and the stretching time of cavitation cavities decreases.
Another distinguishing factor is the fluid volume. The greater the volume of fluid subjected to acoustic wave action, the more likely it is to contain cavitation cavities of sufficient radius [8].
Machine learning did not show a significant dependence of the cavitation onset threshold on fluid viscosity, although a correlation between these variables is shown, for example, in [25,34]. Nonetheless, the experimental results support the modelling. The effect of viscosity, for example, was studied in [28] by varying the temperature of water with different dissolved air contents. It was found that the cavitation threshold changed by only a few percent in the temperature range from 5 to 45 °C. According to the findings of [48], the cavitation threshold for liquids ranging in viscosity from 0.7 to 800 mPa·s increased less than threefold, with the threshold rising as viscosity increases. Another set of experimental data [27] shows that to increase the acoustic cavitation threshold by 30%, it is necessary to increase the viscosity of the fluid by about ten times. Based on the results of these experiments, it appears that fluid viscosity has little effect on the cavitation threshold. Note that the viscosity parameter in the database analyzed ranges from 890 to 1500 mPa·s, which is possibly why the models did not demonstrate considerable sensitivity to changes in this parameter. Similar reasoning may explain the slight effect of surface tension.
The model developed in this study could potentially be used to predict which parameters will cause the acoustic cavitation threshold to decrease or increase. This possibility could aid in the control of cavitation onset and suppression. However, further development of the database is required in order to train machine learning models for future use in determining the optimal set of parameters for specific conditions.
The database should be extended with many similar experiments to allow a more reliable and objective assessment of the effect of the input parameters on the acoustic cavitation threshold. Nonetheless, the following primary parameters may be identified based on the current results: hydrostatic pressure, fluid type, and ultrasonic frequency. It should be noted that all three parameters are related to the size of potential cavitation nuclei.
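One common way to rank input parameters for a trained regressor is permutation importance: the values of one input are shuffled and the resulting increase in prediction error is recorded. The sketch below illustrates the idea in plain Python; it is not necessarily the procedure used in this study. The predictor `toy_model`, its coefficients, and the synthetic dimensionless inputs are all hypothetical stand-ins.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between model predictions and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Mean increase in MSE when each input column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only column j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained regressor (invented coefficients)."""
    pressure, frequency, viscosity = row
    return 2.0 * pressure + 0.5 * frequency + 0.01 * viscosity


# Synthetic, dimensionless inputs in [0, 1]; not experimental data.
data_rng = random.Random(1)
rows = [tuple(data_rng.uniform(0.0, 1.0) for _ in range(3)) for _ in range(200)]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets)
for name, value in zip(("pressure", "frequency", "viscosity"), importances):
    print(f"{name:10s} {value:.5f}")
```

Because the toy predictor weights the first input most heavily, shuffling that column degrades the predictions the most, so it receives the largest importance.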

Future Development of the Database
Machine learning methods can be used to predict acoustic cavitation thresholds. However, it is important to choose the optimal method. In the case under consideration, this is the support vector regression model.
Notably, the hydrostatic pressure P_st varies much more widely in the data set than other parameters such as temperature and fluid viscosity. The strong influence of P_st on the acoustic strength limit may therefore be partly caused by the insufficient amount of experimental data covering a wide range of the other input parameters. The outcomes could change if the data set is expanded with more experimental points.
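The spread of each input column can be checked before training, for example with the coefficient of variation (standard deviation divided by the mean). The following Python sketch shows such a check; the records are invented for illustration and are not taken from the database, and the column names are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return statistics.pstdev(values) / abs(mean)


# Hypothetical records (invented values, arbitrary consistent units per column).
records = {
    "hydrostatic_pressure": [100, 500, 1000, 5000, 10000],
    "temperature": [20, 22, 25, 20, 30],
    "viscosity": [890, 950, 1000, 890, 1500],
}

spread = {name: coefficient_of_variation(vals) for name, vals in records.items()}
for name, cv in sorted(spread.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:22s} CV = {cv:.2f}")
```

A column with a small spread gives a model little opportunity to learn how the threshold responds to that parameter, so a low importance score for it should be interpreted with caution.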
To further investigate the effect of various parameters on the acoustic cavitation threshold, we created a database where researchers can submit their acoustic cavitation experiments (OSF, "Acoustic cavitation experiment", https://osf.io/vywx7). As this database expands, corrective modelling will be possible using machine learning algorithms. The increased variability in the experimental data as a result of the larger data set will allow for a more detailed study of the influence of each input parameter on the acoustic cavitation threshold.
Despite these shortcomings, the results obtained are already consistent with known findings about the cavitation strength of fluids. Further research into cavitation strength using machine learning can help improve the predictive ability of available theoretical models. This work can also be used to quickly process new experimental data.

Conclusions
The goal of this research was to explore the possibility of using machine learning models to predict the acoustic cavitation threshold of fluids depending on the initial conditions. The onset of cavitation was studied in relation to the ultrasound frequency, hydrostatic pressure, temperature, and degree of fluid degassing, as well as the fluid's viscosity, density, surface tension and volume. The database for training machine learning models was formed using experimental results from the literature. Three machine learning models based on ridge regression (RR), random forest (RF), and support vector regression (SVR) algorithms were considered.
The results show that machine learning models can provide a reliable prediction of acoustic cavitation thresholds. The SVR model performed best in predicting the onset of cavitation. Hydrostatic pressure was found to be the most important input parameter influencing the prediction of the cavitation threshold. The acoustic cavitation threshold is also affected by the ultrasound frequency and by the degree of fluid degassing (purity). The other parameters have no significant impact compared with these three.
The considered approach for predicting the acoustic strength of a fluid can be used to find the optimal modes of acoustic action on fluids in the required applications. However, expanding the diversity and volume of the experimental database is still required for better analysis and prediction of the cavitation threshold.

Conflicts of Interest:
The authors declare no conflict of interest.

A: Amplitude of acoustic oscillations
m_j: Importance of the j-th variable
D: Original data