Mathematical modeling and machine learning-based optimization for enhancing biofiltration efficiency of volatile organic compounds

Biofiltration is a pollution-control method that uses a bioreactor containing living material to capture and biologically degrade pollutants. In this paper, we investigate mathematical models of biofiltration for mixtures of volatile organic compounds (VOCs), for instance hydrophilic (methanol) and hydrophobic (α-pinene) compounds. The system of nonlinear diffusion equations describes the Michaelis-Menten kinetics of the enzymatic chemical reaction. These models represent the chemical oxidation in the gas phase and mass transfer at the air-biofilm interface. Furthermore, for the numerical study of the saturation of α-pinene and methanol in the biofilm and gas phases, we have developed an efficient supervised machine learning algorithm based on the architecture of Elman neural networks (ENN). Moreover, the Levenberg-Marquardt (LM) optimization paradigm is used to find the parameters (weights) of the neurons in the ENN architecture. The approximate solutions found by the ENN-LM technique for the saturation of methanol and α-pinene under variations in different physical parameters are compared with the numerical results computed by state-of-the-art techniques.
The graphical and statistical analysis of the performance indicators (absolute errors, mean absolute deviations, computational complexity, and mean square error) supports the claim that the results accurately describe the physical process and can be used for related problems arising in chemical engineering.

• The mathematical models for biofiltration of VOCs such as hydrophilic (methanol) and hydrophobic (α-pinene) compounds, along with first- and zero-order kinetics, are derived and analyzed by developing a novel intelligent computing design that integrates Elman neural networks with the Levenberg-Marquardt (LM) algorithm.
• The impact of shifts in a variety of parameters on the concentration (saturation) of methanol and α-pinene, alongside the removal ratio, has been explored.
• The results calculated by the proposed technique for various cases and scenarios are compared with the Adomian decomposition method, the Bernstein polynomial method, the Chebyshev wavelets method, and solutions obtained with the fourth-order Runge-Kutta technique.
• The design algorithm is executed multiple times to exhibit stability and accuracy by studying the solutions' absolute errors and mean absolute deviations in each run.

Mathematical formulation
The biophysical model of a single volatile organic compound presented by Van den Oever and Ottengraf in 1983 serves as the cornerstone for the mathematical model of biofiltration for mixtures of volatile organic compounds (VOCs), such as hydrophilic and hydrophobic VOCs. The process includes two main components: diffusion of the compounds through the biofilm and their degradation by microorganisms. A schematic representation of a single particle in the biofilter, coated by a homogeneous layer of biofilm and undergoing the simultaneous biodegradation of α-pinene and methanol, is given in Fig. 1. The experimental setup is shown in Fig. 2. The derivation is based on the following assumptions:
(a) The radial concentration gradient across the biofilter is neglected, and the airflow follows the plug-flow model.
(b) The biofilm grows over the exterior surface of the particle. Microorganisms proliferate on the surface of the pores rather than inside them, so there is no biodegradation in the pores.
(c) The biofilm completely covers the packing media and is very thin; therefore, a planar (rectangular) geometry can be used.
(d) α-pinene and methanol are the sole substrates that affect the biodegradation.
(e) There is no gas-phase boundary layer at the air-biofilm interface; hence, the mass-transfer resistance is neglected.
(f) Because the buildup of biomass is expected to be slow under steady-state conditions, the film density is assumed to remain constant for the duration of the experiment.
(g) The microbial communities degrading α-pinene and methanol are different.
Further details regarding the experimental setup can be found in [35].

Mass balance in the biofilm phase
The system of nonlinear differential equations that describes the elimination of methanol and α-pinene in the biofilm under steady-state conditions is given by Eqs. (1) and (2). Here, S_m and S_p are the saturations of methanol and α-pinene, respectively. K is the half-saturation constant, Y is the yield coefficient, µ_max denotes the maximum specific growth rate, and D_em and D_ep are the effective diffusion coefficients of methanol and α-pinene. x is the total population of microorganisms. α is the empirical constant that expresses the influence of methanol on α-pinene during biodegradation; it is defined in terms of C_m, the concentration (saturation) of methanol in the air phase, and K_i, the inhibition constant. The boundary conditions for Eqs. (1) and (2) involve m_m and m_p, which represent the mass of microorganisms associated with methanol and α-pinene, respectively, in the biofilm phase. These masses denote the quantity of microorganisms capable of metabolizing or interacting with the respective compounds. Dimensionless parameters are then defined to reduce Eqs. (1) and (2) to dimensionless mass balance equations with correspondingly dimensionless boundary conditions.

Mass balance in gas phase
The concentrations (saturations) of methanol and α-pinene in the air phase along the biofilter are governed by the nonlinear Eqs. (12) and (13). Here, C_p denotes the saturation of α-pinene and C_m the saturation of methanol in the gas (air) phase. U_g represents the gas velocity, A_s is the surface area, and h denotes the position along the height of the biofilter. Initial conditions are imposed on Eqs. (12) and (13), and dimensionless parameters are defined to reduce them to non-dimensional mass balance equations in the gas phase.
Substituting these dimensionless parameters into Eqs. (14) and (15) yields the dimensionless gas-phase mass balance equations and their associated conditions.

Unsaturated kinetics
In this section, we consider the case in which the saturation of the VOCs (methanol and α-pinene) is relatively low in the biofilm phase. In this case, S_m ≤ K_m and S_p ≤ K_p; therefore, Eqs. (1) and (2) reduce to first-order kinetics.

Saturated kinetics
The mathematical model for biofiltration is also considered for the case in which the saturation of methanol and α-pinene is comparatively high. In this case, S_m ≥ K_m and S_p ≥ K_p; therefore, Eqs. (1) and (2) reduce to zero-order kinetics.
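The two limiting regimes follow directly from the Michaelis-Menten rate term. As a sketch, write the rate term in the generic form r(S) = r_max S / (K + S), where the symbol r_max is introduced here only for illustration and is not the paper's notation:

```latex
r(S) = \frac{r_{\max}\, S}{K + S}
\;\approx\;
\begin{cases}
\dfrac{r_{\max}}{K}\, S, & S \ll K \quad \text{(unsaturated, first-order kinetics)},\\[2ex]
r_{\max}, & S \gg K \quad \text{(saturated, zero-order kinetics)}.
\end{cases}
```

In the first limit the rate is proportional to the saturation, so the diffusion equations become linear; in the second the rate is constant, so the right-hand side no longer depends on S.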

Elman neural networks
An artificial neural network (ANN), widely known as a neural network, is a computational model that mimics the behavior of a biological brain. It comprises one input layer, one or more hidden layers, and an output layer. These layers are interconnected, with the input layer receiving data and passing it to deeper layers, which in turn forward signals to the output layer through the activation function. The information transmitted between layers undergoes multiple transformations before reaching the units that make up the inner levels, which are all hidden [36][37][38]. This architecture enables each layer to serve as both an input and an output in solving complex problems. Figure 3 illustrates the framework and layout of an ANN. An Elman neural network (ENN) is a three-layer structure with an additional context layer. Figure 4 illustrates the connection of the hidden neurons to the context layer. In addition to the traditional three layers, the context layer receives its input from the output of the hidden layer and retains the values from the preceding time step [39,40].
We consider the architecture of the Elman neural network depicted in Fig. 4, in which the output, context, and external input weight matrices are denoted by w^{o,h}(t), w^{h,c}(t), and w^{h,i}(t), respectively. The n-dimensional input and output vectors are x^1(t) = [x^1_1(t), x^1_2(t), ..., x^1_n(t)]^T and y(t) = [y_1(t), y_2(t), ..., y_n(t)]^T. The number of hidden neurons is N; therefore, w^{h,i}(t) ∈ R^{N×n}, w^{h,c}(t) ∈ R^{N×N}, and w^{o,h}(t) ∈ R^{n×N}. The output vector of the hidden layer, c(t − 1) = [c_1(t − 1), c_2(t − 1), ..., c_N(t − 1)]^T, is fed back to the hidden layer as a second input vector, x^2(t) = [x^2_1(t), x^2_2(t), ..., x^2_N(t)]^T = c(t − 1), so the entire input vector is the concatenation of x^1(t) and x^2(t). The output vector is then determined as y_i(t) = f(a^o_i(t)), where f is the activation function and a^o_i(t) is the net input to the i-th output neuron.
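The recurrence described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `elman_step`, the tanh hidden activation, the identity output activation, and the random initial weights are all choices made here for the example.

```python
import math
import random


def identity(a):
    """Linear output activation (an assumption made for this sketch)."""
    return a


def elman_step(x, context, w_hi, w_hc, w_oh, f_hidden=math.tanh, f_out=identity):
    """Advance an Elman network by one time step.

    x       : external input vector x^1(t), length n
    context : previous hidden output c(t-1), length N
    w_hi    : N x n input-to-hidden weights
    w_hc    : N x N context-to-hidden weights
    w_oh    : n x N hidden-to-output weights
    Returns (y, hidden); `hidden` is the context for the next step.
    """
    hidden = []
    for j in range(len(w_hi)):
        # Net input of hidden neuron j: external input plus fed-back context.
        a = sum(w * xi for w, xi in zip(w_hi[j], x))
        a += sum(w * ci for w, ci in zip(w_hc[j], context))
        hidden.append(f_hidden(a))
    # Output neuron i applies f_out to its weighted sum of hidden outputs.
    y = [f_out(sum(w * h for w, h in zip(row, hidden))) for row in w_oh]
    return y, hidden


def random_matrix(rows, cols, rng):
    """Small random weights; hypothetical initialisation for the demo."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n, N = 1, 4  # one input/output, four hidden neurons (illustrative sizes)
w_hi = random_matrix(N, n, rng)
w_hc = random_matrix(N, N, rng)
w_oh = random_matrix(n, N, rng)

context = [0.0] * N  # the context layer starts empty
outputs = []
for x_t in [0.0, 0.5, 1.0]:
    y, context = elman_step([x_t], context, w_hi, w_hc, w_oh)
    outputs.append(y)
```

Because the hidden output is passed back in as `context`, the response to each input depends on the inputs that came before it, which is the role of the context layer.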

Optimization and training of neurons
This section describes the training required for the unknown weights of the neurons in the ENN framework. The weights in the ENN are adjusted using the optimization-based local search technique known as the Levenberg-Marquardt (LM) algorithm. The LM algorithm is a least-squares curve-fitting technique used for minimization problems, and it converges faster than other training algorithms [41]. Recent implementations and uses of the LM algorithm include the identification of Hammerstein nonlinear systems [42,43], the prediction of drag reduction in crude oil pipelines [44], the estimation of the state of charge in lithium-ion batteries [45,46], blind source separation by joint diagonalization [47], and parameter estimation in inverse heat transfer problems [48]. The realization of the design algorithm consists of two distinct stages. Initially, the Adams method is used to generate a dataset (target data) comprising 1001 points within the range [0,1] for the various cases of the mathematical models representing the saturation of methanol and α-pinene in the biofilm and air phases. This dataset is divided into three subsets: a training set, a validation set, and a test set, typically in proportions such as 70%, 15%, and 15%, respectively. In the second phase, the Elman neural network is configured with the initial parameter settings provided in Table 1. The parameters in Table 1 were chosen based on a combination of factors, including prior knowledge of similar neural network architectures used in related studies [49][50][51], the computational resources available, and practical considerations regarding model complexity and training time.
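The data partition described above can be sketched as follows. This is an illustrative example only: the helper name `split_dataset`, the random shuffling, and the fixed seed are assumptions, and the grid stands in for the input points at which the target data would be generated.

```python
import random


def split_dataset(points, train=0.70, val=0.15, seed=0):
    """Shuffle the points and split them into train/validation/test subsets.

    The test subset receives whatever remains after the first two splits.
    """
    indices = list(range(len(points)))
    random.Random(seed).shuffle(indices)
    n_train = int(train * len(points))
    n_val = int(val * len(points))

    def pick(idx):
        return [points[i] for i in idx]

    return (
        pick(indices[:n_train]),
        pick(indices[n_train:n_train + n_val]),
        pick(indices[n_train + n_val:]),
    )


# 1001 equally spaced points on [0, 1], matching the dataset size in the text.
grid = [i / 1000 for i in range(1001)]
train_set, val_set, test_set = split_dataset(grid)
```

With 1001 points, rounding down gives 700 training and 150 validation points, and the remaining 151 points form the test set.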
While an optimization strategy was not explicitly employed to define the number of layers and neurons in this specific instance, the selection process involved experimentation and iterative refinement. We conducted preliminary trials with different configurations of layers and neurons and assessed their performance using metrics such as the mean square error on validation datasets. The chosen configuration of 2 layers with 60 neurons was found to provide a balance between model complexity and predictive accuracy within the constraints of our study. Subsequently, the Levenberg-Marquardt (LM) learning algorithm is applied to optimize the weights of the ENN framework and validate the estimated solutions for the problem. An overview and workflow of the designed scheme are depicted in Fig. 5.
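To make the LM update concrete, the following sketch applies it to a toy two-parameter curve fit rather than to the ENN weights. Everything here is an assumption made for illustration: the exponential model, the synthetic data, the starting guess, and the factor-of-ten damping schedule. The step solves the damped normal equations (JᵀJ + λ diag(JᵀJ)) d = Jᵀr and accepts the step only if the squared error decreases.

```python
import math


def model(x, a, b):
    """Toy model a * exp(-b * x); stands in for the network output."""
    return a * math.exp(-b * x)


def sse(a, b, xs, ys):
    """Sum of squared residuals for parameters (a, b)."""
    return sum((y - model(x, a, b)) ** 2 for x, y in zip(xs, ys))


def lm_fit(xs, ys, a, b, lam=1e-2, iters=50):
    """Levenberg-Marquardt iterations for the two-parameter toy model."""
    for _ in range(iters):
        r = [y - model(x, a, b) for x, y in zip(xs, ys)]
        # Jacobian columns: derivatives of the model w.r.t. a and b.
        ja = [math.exp(-b * x) for x in xs]
        jb = [-a * x * math.exp(-b * x) for x in xs]
        h11 = sum(v * v for v in ja)
        h22 = sum(v * v for v in jb)
        h12 = sum(u * v for u, v in zip(ja, jb))
        g1 = sum(u * v for u, v in zip(ja, r))
        g2 = sum(u * v for u, v in zip(jb, r))
        # Damp the diagonal, then solve the 2 x 2 system by Cramer's rule.
        m11, m22 = h11 * (1.0 + lam), h22 * (1.0 + lam)
        det = m11 * m22 - h12 * h12
        da = (g1 * m22 - g2 * h12) / det
        db = (m11 * g2 - h12 * g1) / det
        if sse(a + da, b + db, xs, ys) < sse(a, b, xs, ys):
            a, b, lam = a + da, b + db, lam / 10.0  # accept, trust the model more
        else:
            lam *= 10.0  # reject, move toward gradient descent
    return a, b


xs = [i / 10 for i in range(11)]
ys = [model(x, 2.0, 1.5) for x in xs]  # noise-free synthetic targets
a_fit, b_fit = lm_fit(xs, ys, a=1.0, b=0.5)
```

Small λ makes the step behave like Gauss-Newton, which gives fast convergence near the minimum, while large λ shortens the step in the gradient direction, which is what makes the method robust far from the solution.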

Numerical experimentation and discussion
In this section, the ENN-LM design technique is applied to the mathematical model for biofiltration of methanol and α-pinene in the biofilm and air phases. The influence of changes in different parameters on the concentration profiles is investigated. The solutions estimated by the ENN-LM algorithm for the saturation of methanol and α-pinene in the biofilm phase are compared with the numerical values obtained by the Runge-Kutta technique (RK-4), the Bernstein polynomial method (BPM), the Chebyshev wavelets method (CWM) [8], and the Adomian decomposition method (ADM) [52], as shown in Tables 2 and 3. The results show that the solutions obtained from the design algorithm overlap the numerical results and have the smallest absolute error when compared with the other state-of-the-art methodologies. Figure 6 demonstrates the impact of various values of α and β on the dimensionless saturation profile. It can be inferred from Fig. 6a and b that the saturation of methanol during the biofiltration process increases with an increase in the initial saturation of methanol β. It is worth noting that for large values of β the saturation remains steady. The influence of variations in φ at a fixed value of α on the saturation of methanol is shown in Fig. 6c and d. It can be seen that an increase in the maximum growth rate of methanol biodegradation φ causes a decrease in the saturation, which finally levels off at higher values.
The effects of factors such as β_1 and φ_1 on the dimensionless saturation of α-pinene are shown in Fig. 7. It is evident from Fig. 7a and b that the saturation rises with an increase in β_1 at fixed values of dry cell density. An increase in saturation is also observed when the film density decreases.
In addition, the saturation profiles of methanol and α-pinene under variations in β, β_1, φ and φ_1 in the air phase are shown in Fig. 8. The saturation of methanol increases and gradually reaches a stable (constant) state when the film density or φ increases. Also, with a decrease in the half-saturation parameter, the saturation decreases. Fig. 8c and d show that the saturation of α-pinene reaches a constant level after initially rising with increases in the diffusion coefficient and the initial saturation parameter at fixed values of α. Further, the suggested technique is implemented to investigate how specific parameters affect the unsaturated and saturated kinetics of the saturation of methanol and α-pinene during biofiltration. Each parameter is changed while the others, which are taken to be constants, are given as D_em = D_ep = 0.004, 34 and K_m = K_p = 10. The saturation profiles obtained by the ENN-LM algorithm by varying different parameters are illustrated in Figs. 9 and 10. The approximate and numerical solutions overlap each other, which demonstrates the precision and robustness of the recommended design technique in approximating stiff nonlinear problems.

Table 2. Comparative analysis of the approximate solutions for the saturation of methanol in biofilm phase, acquired by the suggested ENN-LM scheme with RK-4, CWM [8], and ADM [52].
To analyze the proposed ENN-LM method comprehensively, a number of independent executions of the design scheme are carried out to generate a large dataset for statistical and numerical analysis of the solutions' stability, efficiency, and accuracy. In this regard, different performance indices are defined: the mean absolute deviation (MAD) of the solutions, Theil's inequality coefficient (TIC), and the error in Nash-Sutcliffe efficiency (ENSE). These performance metrics are formulated mathematically in terms of the reference and approximate solutions.

Figure 6. The potential impact of the variances in β and φ on the saturation profile of methanol in biofilm phase.
Here, M is the number of independent executions, and S*_m, S*_p and their approximate counterparts are the analytical and approximate solutions for the biofilm and gas phases, respectively.
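As an illustration, the three indices can be computed as below. The formulas used are the commonly adopted definitions of MAD, TIC, and ENSE (with ENSE taken as one minus the Nash-Sutcliffe efficiency); they are assumed here and may differ in detail from the paper's own equations. The sample values are made up for the demonstration.

```python
import math


def mad(ref, approx):
    """Mean absolute deviation between reference and approximate values."""
    return sum(abs(r - a) for r, a in zip(ref, approx)) / len(ref)


def tic(ref, approx):
    """Theil's inequality coefficient: RMSE scaled by the two RMS magnitudes."""
    n = len(ref)
    rmse = math.sqrt(sum((r - a) ** 2 for r, a in zip(ref, approx)) / n)
    rms_ref = math.sqrt(sum(r * r for r in ref) / n)
    rms_approx = math.sqrt(sum(a * a for a in approx) / n)
    return rmse / (rms_ref + rms_approx)


def ense(ref, approx):
    """Error in Nash-Sutcliffe efficiency, i.e. 1 - NSE (0 is a perfect fit)."""
    mean_ref = sum(ref) / len(ref)
    num = sum((r - a) ** 2 for r, a in zip(ref, approx))
    den = sum((r - mean_ref) ** 2 for r in ref)
    return num / den


# Hypothetical reference and approximate saturation values at four points.
reference = [1.0, 0.8, 0.6, 0.4]
approximation = [1.0, 0.8, 0.6, 0.5]

mad_value = mad(reference, approximation)
tic_value = tic(reference, approximation)
ense_value = ense(reference, approximation)
```

All three indices are zero for a perfect match, so values close to zero across repeated runs indicate a stable and accurate approximation.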
The objective (performance) values in terms of the mean square error are depicted in Fig. 11. It is worth noting that the values of MSE approach zero for the different cases of the biofilm and air phases during the biofiltration of VOCs. In addition, Table 4 displays the minimum and maximum values of the performance metrics, together with their standard deviations, over the multiple executions of the ENN-LM algorithm. It can be observed that the global (mean) values lie around 10^-5 to 10^-7, which reveals the accuracy and robustness of the proposed supervised design. Figure 12 illustrates the efficiency of the proposed approach in terms of execution time. The ENN-LM algorithm is considerably faster than the techniques from the recent literature considered there (ADM, BPM, and CWM).

Conclusion
In this article, we investigate mathematical models of biofiltration for mixtures of volatile organic compounds (VOCs), such as hydrophilic methanol and hydrophobic α-pinene, in the biofilm and air phases. The models are based on nonlinear diffusion equations. Furthermore, we have developed a machine learning-based computing technique to study the saturation profiles of methanol and α-pinene. Various parameters have been varied to study their effect on the saturation of the VOCs under saturated and unsaturated kinetics. The results demonstrate that methanol saturation during the biofiltration process increases with an increase in the initial saturation of methanol β, whereas an increase in the maximum growth rate of methanol biodegradation φ causes a decrease in the saturation. Also, the saturation of α-pinene increases with an increase in β_1 at fixed values of dry cell density. The results obtained by the proposed ENN-LM method are compared with the numerical solutions from the Runge-Kutta method (RK-4), the Bernstein polynomial method (BPM), the Chebyshev wavelets method (CWM), and the Adomian decomposition method (ADM), which illustrates the precision and effectiveness of the suggested method. The value of the design method is further supported by the results for mean square error, mean absolute deviations, and computational complexity.

Figure 3. Details of a neuron in an MLP network.

Figure 4. The internal organization of the Elman neural network.

Figure 5. The operational mechanism of the design algorithm carried out in a sequential fashion.

Figure 7. The effect of differences in β_1 and φ_1 on the saturation profile of α-pinene in biofilm phase.

Figure 8. The influence that changes in the various factors have on the saturation profiles of (a, b) methanol and (c, d) α-pinene during the biofiltration process in the air phase.

Figure 9. The impact of changes in the values of the various parameters on the saturation of methanol and α-pinene for unsaturated (first order) kinetics.

Figure 10. The consequence of deviations in the values of the various parameters on the saturation of methanol and α-pinene for saturated (zero order) kinetics.

Figure 11. Performance values in terms of mean square error obtained by the proposed algorithm during multiple executions.

Figure 12. Illustration of the time required by ADM, BPM, CWM and ENN-LM for computing an approximation of the saturation of methanol with φ = β = 10.

Table 1. Essential parameter settings for the implementation of the proposed ENN-LM scheme.

Table 4. Statistical analysis based on minimum, average results and deviations of the performance indicators for cases in a mathematical model of biofilm and air phases during the biofiltration of VOCs.