Detonation Cell Size Prediction based on Artificial Neural Networks with Chemical Kinetics and Thermodynamic Parameters

Abstract


Introduction
Detonation is a supersonic, combustion-driven compression wave [1]. Due to the effects of instability, the propagation of a detonation wave generally exhibits a cellular pattern, whose width has proven to be an extremely useful length scale to characterize the sensitivity of an explosive mixture. Knowledge of the cell size permits other dynamic detonation parameters (e.g., critical initiation energy, detonability limit, critical tube diameter) to be estimated [2]. For this reason, there is a substantial amount of literature on the experimental measurement of cell sizes of different mixtures.
So far, no quantitative theory has been developed to predict cell size. Yet, from pure dimensional analysis, it should be related to a characteristic reaction zone length of the detonation structure. Hence, numerous attempts have been made to relate experimentally measured cell sizes to some characteristic chemical length scale Δ in the one-dimensional ideal ZND detonation structure. In general, a linear proportionality relationship between the cell size λ and the steady chemical induction length Δi has been proposed, i.e., λ = A×Δi, where A is a constant proportionality factor [3][4][5][6][7]. These results have been shown to capture qualitatively the effects of mixture composition, temperature, and pressure on cell size, provided that a suitable model is used to describe the factor A in the correlation [8,9]. In most cases, the factor A is simply determined by matching the induction length with one experimental data point for a particular combustible mixture (e.g., the value at stoichiometric composition), and the relationship is then extended to predict cell size over a limited range of initial conditions. However, cell sizes predicted by this technique are usually only valid for mixtures and conditions similar to those of the matching point.
Moreover, the factor A is not universal and varies significantly with mixture composition, especially for off-stoichiometric and diluted mixtures, and with initial conditions. Hence, cell sizes predicted in this way can differ from the experimentally measured values by several orders of magnitude.
In recent years, Machine Learning (ML) has become increasingly common in fluid dynamics for analyzing and interpreting large datasets [10]. ML techniques thus provide a good opportunity to develop a new strategy for detonation modelling. An example is the study reported in [11], where a feedforward Artificial Neural Network (ANN), combined with Proper Orthogonal Decomposition (POD) modal analysis to extract the features of the flow fields, is used to predict the wave configurations of cellular detonations. Apart from ANNs [12], our recent work [13] also uses a Convolutional Neural Network (CNN), trained with numerical simulation results, to construct the lead shock evolution from the reactive front and obtain a full cellular detonation surface. Equivalently, a CNN can also be trained and applied for wave mode identification in a Rotating Detonation Combustor (RDC) based on a single image [14]. Other deep learning methods have been applied to predict energetic material detonation performance [15] or explosive blast loads on engineering structures [16], and used in other studies of combustion phenomena [17,18]. In most cases, a robust evaluation of input representations and ML algorithms is needed.
For detonation cell sizes, various experimental measurements have been collected in the Caltech detonation database [19], thus providing an open dataset for ML. It is believed that ML algorithms can be applied to learn from the detonation database to make better predictions. In fact, a detonation cell size model based on a deep artificial neural network for three fuels, namely hydrogen, methane and propane, with air and oxygen as oxidizers, was developed previously by Malik et al. [20]. In their model, only the mixture condition and the thermochemical properties, i.e., the adiabatic flame temperature and fuel fraction, were used as input features for the neural network construction and training. Therefore, the characteristics of the detonation structure, e.g., characteristic lengths and reaction sensitivity, are not considered in their ANN model.
Recent advances on detonation instability suggest that the unstable dynamics of the detonation structure depend not only on the temperature sensitivity of the reaction, governed by the activation energy Ea, but also on the shape of the reaction zone, characterized by the lengths of the induction and main heat release layers. It is thus logical to expect that the cell size should also be a function of these detonation structure characteristics. In fact, based on this observation, Ng et al. [21] previously formulated a relevant non-dimensional stability parameter χ, given by the degree of temperature sensitivity in the induction zone εI multiplied by the ratio of the induction length ΔI to the reaction length ΔR:

χ = εI ΔI/ΔR = εI ΔI σ̇max/uCJ

where the reaction length is approximated by the inverse of the maximum thermicity (1/σ̇max) multiplied by the Chapman-Jouguet (CJ) particle velocity uCJ. The thermicity is given by:

σ̇ = Σi (W/Wi − hi/(cp T)) dYi/dt

where W is the mean molar mass of the mixture, cp is the mixture's specific heat at constant pressure, and Yi and hi are the mass fraction and the specific enthalpy of species i, respectively. The global activation energy in the induction process εI can be obtained from constant-volume explosion calculations. Assuming that the induction time τi has an Arrhenius form:

τi = k ρ^n exp(Ea/RT)

with ρ^n the density to the power n, the activation temperature εI = Ea/(RTs) can be determined from:

Ea/R = (ln τ1 − ln τ2)/(1/T1 − 1/T2)

where two constant-volume explosion simulations are run with conditions (T1, τ1) and (T2, τ2). Conditions for states one and two are obtained by considering the effect of a change in the shock velocity by ±1% of DCJ [22].
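As an illustration of these definitions, the following is a minimal Python sketch of how εI and χ can be evaluated, assuming the two post-shock states (T1, τ1) and (T2, τ2), the induction length, the maximum thermicity and the CJ particle velocity have already been obtained from the ZND and constant-volume explosion calculations; all numerical values are placeholders, not data from this study.

```python
import numpy as np

# Hypothetical post-shock states from two constant-volume explosion
# calculations, run at shock speeds of 1.01*D_CJ and 0.99*D_CJ.
T1, tau1 = 1520.0, 1.2e-6   # K, s  (illustrative values only)
T2, tau2 = 1480.0, 2.0e-6   # K, s
Ts = 1500.0                 # von Neumann (post-shock) temperature, K

# Arrhenius fit tau_i ~ rho^n * exp(Ea/(R*T)) gives, for the two states:
Ea_over_R = (np.log(tau1) - np.log(tau2)) / (1.0 / T1 - 1.0 / T2)
eps_I = Ea_over_R / Ts      # non-dimensional activation energy Ea/(R*Ts)

# Stability parameter chi = eps_I * Delta_I / Delta_R, with the reaction
# length approximated as Delta_R = u_CJ / sigma_dot_max.
Delta_I = 1.0e-3            # induction length from the ZND profile, m
sigma_dot_max = 2.0e6       # maximum thermicity, 1/s
u_CJ = 1000.0               # CJ particle velocity, m/s
chi = eps_I * Delta_I * sigma_dot_max / u_CJ
print(f"eps_I = {eps_I:.1f}, chi = {chi:.1f}")
```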
From its definition, the parameter χ includes essentially all the important elements controlling the instability, i.e., energetics, temperature sensitivity, and the induction and chemical energy release zone lengths. From a physical point of view, the role of these parameters reflects the scenario in which incoherence in the exothermicity can lead to gasdynamic instabilities in the reaction zone, resulting in different behaviors of the detonation front, equivalent to Meyer and Oppenheim's coherence concept [23]. With this parameter χ, Ng et al. [9] model the variation of the proportionality factor A and obtain an improved generic relationship λ = A×Δi correlating the cell sizes with induction zone lengths computed from detailed chemical kinetics, taking into account the effect of detonation instability, i.e.:

A(χ) = a0 + Σk=1..N (ak χ^k + bk χ^−k)

Using again the cell sizes from the Caltech database and a polynomial degree of N = 3, the coefficients ak and bk are obtained using a multi-variable least-squares regression [9]. This correlation is shown to provide good prediction over a wide range of mixture compositions and initial conditions.
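For illustration, a short sketch of how the resulting correlation λ = A(χ)×Δi could be evaluated; the coefficients below are placeholders, not the fitted values tabulated in [9].

```python
def proportionality_factor(chi, a, b):
    """A(chi) = a[0] + sum_{k=1..N} (a[k]*chi**k + b[k-1]*chi**(-k)),
    with the degree N inferred from len(b)."""
    return a[0] + sum(a[k] * chi**k + b[k - 1] * chi**(-k)
                      for k in range(1, len(b) + 1))

# Placeholder coefficients for an N = 3 fit; see Ng et al. [9] for the
# actual least-squares values.
a = [30.0, 8.0e-2, -1.0e-4, 5.0e-8]
b = [90.0, -130.0, 42.0]

chi, Delta_I = 20.0, 1.0e-3                 # from the ZND analysis, m
cell_size = proportionality_factor(chi, a, b) * Delta_I  # lambda = A*Delta_I
print(f"predicted cell size: {cell_size * 1e3:.1f} mm")
```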
Considering the importance of instability, which is related to the detonation structure, and the improved accuracy obtained by including chemical kinetics, and hence the stability parameter χ, in the cell size correlation, this paper presents a predictive model based on the ANN approach with both chemical kinetic and thermochemical parameters. In Sec. 2, we present the detailed methodology used to construct the ANN-based model. In Sec. 3, results obtained using ANNs with different input features are presented. The paper ends with conclusions in Sec. 4.

Methodology
In this study, an ANN was developed to predict the detonation cell size; it is more specifically classified as a Deep Neural Network (DNN), as it comprises multiple hidden layers. The development and optimization of the DNN were done using the Keras [24] and KerasTuner [25] frameworks, which allowed the determination of the optimal numbers of neurons and layers of the DNN that lead to the minimum loss.

Table 1: Mixture compositions and initial conditions for all cell size data considered in the correlation. (Original references are detailed in [19] or [26].)
For the DNN model, 388 data rows were used, combining experimental cell size data with chemical kinetics and thermodynamic data. The former are experimental cell sizes λ, sourced from the Caltech database [19] for different reactive mixtures and initial conditions; the latter are chemical kinetics parameters, calculated for the same mixtures and initial conditions from the steady one-dimensional Zel'dovich-von Neumann-Döring (ZND) model [5,26,27], using Konnov's detailed reaction mechanism [28] and the CHEMKIN II package [29]. Among surveyed detailed reaction mechanisms, the validation study and report by Schultz and Shepherd [22,30] has shown the adequacy of Konnov's mechanism for use in detonation simulation. In this work, as listed in Table 1, the reactive mixtures include CH4, C2H2, C2H6, C2H4, H2 and C3H8, oxidized with O2 and air, over a wide range of equivalence ratios, initial pressures and temperatures, and with different dilutions with Ar, H2O and N2. All these data can be divided into two main categories, the features and the target of the neural network. The target is the detonation cell size, which the network aims to predict accurately; the features are any combination of the remaining available input parameters used to predict the target once the model is created. Creating the DNN model requires a training and a testing process based on the available data, which is outlined in Fig. 1. This process aims to determine the optimal number of layers, the number of neurons per layer and the model fitting for a given set of input features, which is crucial to obtain the minimum loss for these inputs. At the same time, careful consideration was given to the computational cost, with a series of optimizations, frameworks and techniques employed to minimize it. The first stage of this process, after the data is imported, is data exploration, in order to determine whether there are any missing or apparently wrong data inputs, to check the range and distribution of the data, and to visualize the relationships between different features.
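The kinetic inputs in this work come from CHEMKIN II with Konnov's mechanism; as a rough illustration of the equivalent constant-volume explosion calculation, the following sketch uses the open-source Cantera package instead. The mechanism file name and the mixture state are hypothetical placeholders.

```python
import cantera as ct
import numpy as np

# A minimal sketch, assuming the post-shock (von Neumann) state is known.
# "konnov.yaml" is a placeholder; any Cantera-format mechanism containing
# the mixture species would do.
gas = ct.Solution("konnov.yaml")
gas.TPX = 1500.0, 30 * ct.one_atm, "H2:2, O2:1"   # illustrative VN state

reactor = ct.IdealGasReactor(gas)      # constant-volume explosion
sim = ct.ReactorNet([reactor])
times, temps = [], []
while sim.time < 1.0e-4:               # integrate to 100 microseconds
    sim.step()
    times.append(sim.time)
    temps.append(reactor.T)

# Take the induction time as the moment of maximum temperature gradient
# (one common definition; maximum thermicity is another choice).
dTdt = np.gradient(temps, times)
tau_ind = times[int(np.argmax(dTdt))]
print(f"induction time: {tau_ind:.3e} s")
```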
The last step allows us to determine which features are linearly correlated (i.e., y = a·x), so that one of them can be dropped from the training process, as it would not contribute to the improvement of the model. The initial data is then split randomly into three parts (training, validation and testing), representing respectively 60%, 20% and 20% of the initial data. The first two are used during the creation and optimization of the model (hyperparameter tuning and fitting), while the test data is used only once the model is created, in order to determine its accuracy, thus remaining impartial to the model creation process. Having a second set of data, specifically the validation data, during the creation and optimization of the neural network helps avoid overfitting the model to the training data. Although the validation data are not directly used in the fitting process, information from them is still passed to the model creation process, which makes another independent, impartial set, namely the testing data, necessary to determine the accuracy of the model. After this first step, the features of all three data parts are scaled using the min-max scaler available from Scikit-learn (https://scikit-learn.org/). It should be noted that the scaler is fitted only to the training data, in order to avoid any data leakage, and is then applied to the validation and testing data. This scaling step is necessary to avoid some features impacting the model more than others simply due to their larger magnitude.
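A minimal sketch of this splitting and scaling step, using Scikit-learn as described; the array shapes, feature names and random seed are illustrative, not the actual dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# X: feature matrix (e.g. induction length, max thermicity, M_CJ),
# y: measured cell sizes. Random stand-ins for the 388 real rows.
X, y = np.random.rand(388, 3), np.random.rand(388)

# 60/20/20 split: carve out 20% for testing, then 25% of the remaining
# 80% (i.e. 20% overall) for validation.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, random_state=42)

# Fit the scaler on the training data only (no leakage), then apply the
# same transform to the validation and testing sets.
scaler = MinMaxScaler().fit(X_train)
X_train, X_val, X_test = map(scaler.transform, (X_train, X_val, X_test))
```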
After the initial data processing, the structure of the model and the ranges of the hyperparameters, such as the number of layers, the number of nodes per layer and the learning rate, are determined. More specifically, the number of hidden layers is varied between 1 and 4, with a different maximum number of nodes for each layer (512, 256, 128, 64) and step sizes (16, 8, 4, 2) for the iteration process to follow. The default values for the batch size (32), which is the number of training points used in one forward and backward pass, and for the learning rate (1e-3) were determined from an initial sensitivity analysis to be adequate for this problem, with no significant improvement in the model's accuracy from modification of these parameters. The ReLU (Rectified Linear Unit) activation function is chosen for each layer and neuron. This function returns 0 for negative inputs and the input value for any positive input, mathematically expressed as f(x) = max(0, x). It was chosen because it has wide applicability with good accuracy, captures non-linearities well and does not require a lot of computational resources [31]. It was also found, for this problem specifically, to produce more accurate predictions than other available activation functions. For the loss function, which needs to be minimized, the average square relative error is chosen, i.e.:

Loss = (1/N) Σi=1..N ((λpred,i − λexp,i)/λexp,i)²

The reason for choosing this loss function lies in the range of the target data, which includes cell sizes from ~0.1 mm up to ~1500 mm. If an absolute type of loss were chosen instead, such as the most commonly used Mean Squared Error (MSE), the generated model would not be able to correctly predict small cell sizes, as their contribution to the cost function is smaller than that of the larger cell sizes. Choosing the square of the relative error, instead of its absolute value, has the advantage of penalizing larger errors, similarly to the effect of MSE compared to Mean Absolute Error (MAE).
Finally, for the DNN model, RMSprop (Root Mean Squared Propagation) is chosen as the optimizer, i.e., the algorithm that adjusts attributes of the neural network, such as the weights, in order to minimize the loss function.
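A sketch of the corresponding loss and model definition in Keras, assuming the 3-feature input described later; the fixed layer sizes here are placeholders, since the actual architecture is selected by the tuner described next.

```python
import tensorflow as tf
from tensorflow import keras

def mean_squared_relative_error(y_true, y_pred):
    # Average square relative error: relative differences keep ~0.1 mm
    # and ~1500 mm cells on a comparable footing in the loss.
    return tf.reduce_mean(tf.square((y_true - y_pred) / y_true))

# Example architecture only; the tuner picks the layer/neuron counts.
model = keras.Sequential([
    keras.Input(shape=(3,)),                    # 3 input features
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1),                      # predicted cell size
])
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss=mean_squared_relative_error)
```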
Once this is completed, the hyperparameter tuning process begins, in order to determine the optimal hyperparameters of the DNN. This is an iterative process during which different models are created based on the specified ranges of hyperparameters, and each is fitted to the training data for 150 epochs, with each epoch representing one forward and one backward pass of all training data, or until the model stops improving. The cost function for each combination is then calculated using the validation data, in order to determine, through this iterative process, the hyperparameter combinations that lead to the lowest loss values. It should be noted that the framework does not evaluate every possible combination, as in a grid search, but instead uses a Hyperband optimization process, which identifies the best hyperparameter values to be tested within the specified range.
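A sketch of this tuning stage with KerasTuner's Hyperband tuner; the exact mapping of the per-layer node limits to the search space is our reading of the text above, not code from the study, and the loss and data arrays come from the earlier sketches.

```python
import keras_tuner as kt
from tensorflow import keras

def build_model(hp):
    # Search space mirroring the text: 1 to 4 hidden layers, per-layer
    # maxima of 512/256/128/64 nodes and step sizes of 16/8/4/2.
    max_units = [512, 256, 128, 64]
    steps = [16, 8, 4, 2]
    model = keras.Sequential()
    for i in range(hp.Int("num_layers", 1, 4)):
        model.add(keras.layers.Dense(
            units=hp.Int(f"units_{i}", min_value=steps[i],
                         max_value=max_units[i], step=steps[i]),
            activation="relu"))
    model.add(keras.layers.Dense(1))
    model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
                  loss=mean_squared_relative_error)
    return model

# Hyperband narrows the search instead of exhaustively gridding it.
tuner = kt.Hyperband(build_model,
                     objective=kt.Objective("val_loss", "min"),
                     max_epochs=150)
tuner.search(X_train, y_train, validation_data=(X_val, y_val))
best_hps = tuner.get_best_hyperparameters(num_trials=3)  # top 3 kept
```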
Once this process is finished, the top three hyperparameter combinations with the lowest validation loss are determined. The models with these hyperparameter combinations are then each fitted to the training data to determine the parameters of the model. The fitting process now runs for 3000 epochs, a number much higher than before, which guarantees that the model is not under-fitted, meaning that the model, evaluated on the validation data, could not have been improved further had the fitting process continued. The opposite behavior, overfitting, is avoided by saving the parameters at each epoch and then choosing the parameters of the epoch with the lowest validation loss.
Overfitting essentially occurs when the model keeps improving its predictions on the training data with each iteration but increasingly worsens on the validation data. Both fitting stages can be seen in Fig. 2. Examples of monitoring the validation loss during training to achieve optimal training can be found in [32,33]. Finally, once the best of the three models is determined, it is evaluated using the test data, which, as mentioned, has remained impartial to the training process.
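A sketch of this fitting stage, using a Keras ModelCheckpoint callback to keep the weights of the epoch with the lowest validation loss; the file name is arbitrary, and the model and data arrays come from the earlier sketches.

```python
from tensorflow import keras

# Save the weights after every epoch, keeping only the best, so the
# 3000-epoch run can be rolled back to the epoch with the lowest
# validation loss (guarding against both under- and over-fitting).
checkpoint = keras.callbacks.ModelCheckpoint(
    "best_model.keras", monitor="val_loss", save_best_only=True)

history = model.fit(X_train, y_train, epochs=3000, batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[checkpoint])

# Reload the best epoch's weights; the custom loss must be registered.
best_model = keras.models.load_model(
    "best_model.keras",
    custom_objects={"mean_squared_relative_error":
                    mean_squared_relative_error})
```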

Basic 3-feature model
Through the hyperparameter tuning and fitting process described above, and a parametric study of different features, a 3-feature model was created to predict the detonation cell size. Compared to other generated ANNs, this one offers very good prediction accuracy while requiring a low number of features. The structure of this DNN model can be seen in Fig. 3 below. The model shows a good mean error of 16.14% for the training data, indicating a good fit of the model to the training data. A similar mean error is found between the validation and testing data, meaning that these two data sets can be considered equivalent, therefore validating the 60-20-20 splitting choice; smaller testing and validation data sets exhibited large error differences between them. As previously mentioned, the prediction accuracy of a DNN model is determined using only the testing data, for which the average absolute error is 22.34% and the maximum is 81%. Looking at the distribution of predictions compared to the actual data in Fig. 4, it can be seen that the model predicts lower cell sizes with higher accuracy than higher ones. This becomes clearer in the Bland-Altman plots [34] in Fig. 5, where the data shown combine the predicted cell sizes from the model and the actual (experimental) cell sizes: the x-axis shows the average of the two, and the y-axis their difference (actual - prediction).
From these plots, a higher prediction accuracy is observed in the cell size region 0-150 mm for all 3 data sets, with all but one data point falling within the lower and upper 95% confidence bounds. This behavior could be explained by the distribution of the available data points used during training, which lie mainly in the lower cell size region. Therefore, having more data corresponding to higher cell sizes means that the model's accuracy could potentially be further improved in that cell size region. The prediction accuracy of this model can be considered very good, especially once the inherent uncertainty of measuring the experimental cell size is taken into account, which would also explain the higher variations at large cell sizes, where the instability is more prominent and the experimental cell size is thus more difficult to determine. In other words, cell sizes in the larger range are usually related to conditions near the limits (e.g., low initial pressure, off-stoichiometric conditions, or physical boundary effects) where measurement data are limited. The cell patterns at these conditions are highly irregular, and a characteristic cell value is difficult to distinguish. The actual cell size could be of the order of the tube diameter and can thus be affected by the physical boundary condition. From the Bland-Altman plots, it can also be determined from the average difference (black line) that the model has a slight positive bias, meaning that it tends to slightly overpredict the cell size. To verify that the data randomization does not have a significant effect on the result, two more random divisions of the initial data were considered, with the corresponding error analyses given in Tables 3 & 4 and the graphs of predicted vs actual cell sizes for all data sets shown in Figs. 6 & 7. They showed a very close mean error for the testing data and a generally similar error distribution. This indicates that the data sample is sufficient and that the randomization eliminates selection bias; it also insures the process against accidental bias and shows that the initial data randomization has negligible effects on the results. Finally, one more study was performed for these 3 input features, but with the data used for training, validation and testing limited to cell sizes in the range 0-150 mm. This was done in order to determine the overall prediction improvement of the model if cell sizes larger than 150 mm were disregarded, as the amount of available data in that region is limited. The predictions of this reduced model for the 3 data sets can be seen in Table 5.
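For reference, a minimal sketch of how Bland-Altman plots such as those in Fig. 5 can be constructed; the 1.96-standard-deviation limits of agreement are the conventional 95% bounds, and the variable names follow the earlier sketches.

```python
import numpy as np
import matplotlib.pyplot as plt

def bland_altman(actual, predicted, ax):
    """Mean of the two measures on x, their difference
    (actual - predicted) on y, with mean bias and 95% limits."""
    mean = (actual + predicted) / 2.0
    diff = actual - predicted
    bias = diff.mean()
    loa = 1.96 * diff.std(ddof=1)     # 95% limits of agreement
    ax.scatter(mean, diff, s=12)
    ax.axhline(bias, color="k")                 # average difference
    ax.axhline(bias + loa, color="r", ls="--")  # upper 95% bound
    ax.axhline(bias - loa, color="r", ls="--")  # lower 95% bound
    ax.set_xlabel("mean of actual and predicted cell size [mm]")
    ax.set_ylabel("actual - predicted [mm]")

# Illustrative usage with the test set from the sketches above.
fig, ax = plt.subplots()
bland_altman(np.asarray(y_test), best_model.predict(X_test).ravel(), ax)
plt.show()
```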
It is interesting to reiterate that the above parametric study demonstrates that an ANN with a minimum of 3 features provides relatively good performance in predicting characteristic detonation cell sizes. Existing models developed for, or applicable to, a wide range of applications, such as those proposed by Gavrikov [8] and Ng et al. [9,26], can have a mean absolute percentage error of about 50%, compared to about 23-30% in this work.
The basic 3-feature model, which includes the induction length, thermicity and Mach number, appears to provide the minimum set of features describing the reaction zone and the detonation strength necessary for cell size prediction. Although additional features related to the mixture or initial conditions (i.e., P0, T0 and equivalence ratio) could lead to a slightly smaller mean error, the improvement is not significant. An interesting observation from these ANNs is that adding the global activation energy as a feature does not seem to increase the model accuracy. This could imply that the temperature sensitivity of the reaction zone does not necessarily govern the cell size scale, but rather affects the regularity of the cell patterns.

Concluding remarks
An accurate DNN model has been developed for detonation cell size prediction, using available experimental cell size values and computed kinetic data over a wide range of initial and thermodynamic conditions. By extension, this model could also be used to estimate other dynamic detonation parameters. The advantage of this model lies in its simplicity, requiring only three features, and it could be used with any reactive mixture, beyond those used during training.
The inputs are independent of the mixture's molecular composition, being mainly computed chemical kinetic features. Hence, the model has the potential to be further applied, or amended, to include higher hydrocarbon mixtures not considered in this work. One limitation is that its development relied on a single chemical kinetic mechanism, by Konnov [28], previously validated for detonation simulation. This may imply that similar accuracy might not be obtained when other mechanisms, particularly those tailored for a specific combustible mixture, are employed. The dependence of the developed ANNs on different chemical kinetic mechanisms will be examined for potential improvement in future work.
In this paper, the described development process guarantees that an optimal ANN is generated for each combination of input parameters, and that its prediction accuracy is correctly assessed using an independent set of data. An average prediction error of 22.34% was obtained for the 3-feature model, with better accuracy exhibited in the lower cell size region. Aside from the basic neural network configuration, others with different combinations and numbers of features were considered, indicating that at least 3 features are required to predict the cell size accurately.
Increasing the number of input parameters does not improve the prediction accuracy of the model.
Finally, taking into account the subjectiveness of the cell size measurement, the developed ANN model provides quantitatively accurate results.

Figure 1: Flowchart of the model creation process

Figure 2: Error variation for the training and validation data during the fitting process

Figure 3: Deep Neural Network structure

Figure 4: Model prediction vs original data for the training, validation and testing data

Figure 5: Bland-Altman plots of the actual vs predicted cell sizes using the training (a), validation (b) and testing (c) data sets. The black line is the average difference, and the red lines are the upper and lower limits of the 95% confidence interval for the average difference.

Figure 6: Model prediction vs original data for the training, validation and testing data, with random data division 2

Figure 7: Model prediction vs original data for the training, validation and testing data, with random data division 3

Table 2: Error analysis of the optimal model using the training, validation and testing data

Table 3: Error analysis for the training, validation and testing data, with random data division 2

Table 4: Error analysis for the training, validation and testing data, with random data division 3

Table 5: Error analysis for the training, validation and testing data for the reduced model

Table 6: Combination of features used for the DNN models, part I

Table 7: Error analysis of the testing data for DNNs created with different features, part I

Table 8: Combination of features used for the DNN models, part II

Table 9: Error analysis of the testing data for DNNs created with different features, part II

Starting from Tables 6 and 7, it can be seen that introducing additional features to the basic 3-feature model does not improve the prediction accuracy. It is therefore clear that a combination of any 3 features is necessary for a good or reasonably accurate prediction, or even 2, provided that they are a combination of Δi, MCJ and σ̇max. Using only one feature (DNNs 15 to 17) does not allow for an accurate prediction. It should be noted that when the parameters that compose the χ parameter are used as independent features, the model accuracy is much higher compared to using the χ parameter alone; the latter gives lower accuracy than the basic model, but could still provide a reasonable prediction accuracy of close to 29%. Removing any one of the features from the basic model (DNNs 12 to 14) resulted in similar prediction accuracy (29% to 32%).