Dynamic Modeling Using Artificial Neural Network of Bacillus Velezensis Broth Cross-Flow Microfiltration Enhanced by Air-Sparging and Turbulence Promoter

Cross-flow microfiltration is a broadly accepted technique for separation of microbial biomass after the cultivation process. However, membrane fouling emerges as the main problem affecting permeate flux decline and separation process efficiency. Hydrodynamic methods, such as turbulence promoters and air sparging, were tested to improve permeate flux during microfiltration. In this study, a non-recurrent feed-forward artificial neural network (ANN) with one hidden layer was examined as a tool for microfiltration modeling using Bacillus velezensis cultivation broth as the feed mixture, while the Kenics static mixer and two-phase flow, as well as their combination, were used to improve permeate flux in microfiltration experiments. The results of this study have confirmed successful application of the ANN model for prediction of permeate flux during microfiltration of Bacillus velezensis cultivation broth with a coefficient of determination of 99.23% and absolute relative error less than 20% for over 95% of the predicted data. The optimal ANN topology was 5-13-1, trained by the Levenberg–Marquardt training algorithm and with hyperbolic sigmoid transfer function between the input and the hidden layer.


Introduction
Bacillus velezensis, a species of the genus Bacillus [1], has been extensively studied in recent years due to its remarkable biocontrol qualities against different plant pathogenic bacteria and fungi [2]. Besides an ability to produce large number of metabolites responsible for its antibacterial and antifungal activity, such as lipopeptides [3], enzymes [4] and volatile organic compounds [5], biomass of Bacillus velezensis could also be successfully utilized as a biocontrol agent [6,7]. Since multiplication of bacterial biomass is predominantly performed in a liquid culture, separation of bacterial biomass from the cultivation broth is of a great importance.
Membrane separation processes such as microfiltration are gaining interest as an alternative to conventional separation processes and recently it has become the mainstream separation techniques used for cell harvesting and broth clarification [8]. Sustainability issues dominating today's industry have contributed to the shift from chemical synthesis processes to biotechnology production in numerous branches of industry, so the membrane-based separations have become very promising unit operations [9]. pressure, initial permeate flux, shear rate, feed concentration). This research has revealed that one hidden layer was usually enough for accurate prediction of permeate flux, while an increase in the number of hidden neurons has led to overtraining of the neural network model. On the other hand, Aydiner et al. [38] have concluded that a neural network with too small number of neurons in the hidden layer will not approximate non-linear relations appropriately. A similar conclusion was drawn by Fu et al. [39], who have predicted adsorption of the bovine serum proteins into polyethylene membrane using ANN, indicating a necessity to appropriately define neural network architecture in order to use it as a reliable prediction tool. Furthermore, neural networks trained using the common Levenberg-Marquardt training algorithm, were successfully applied in modeling of microfiltration of different suspensions [34,[40][41][42][43][44][45][46][47]. Bayesian regularization was used as a training algorithm for neural network prediction of permeates flux during microfiltration of starch industry wastewater [31]. Liu et al. [20] have developed a neural network model for permeate flux prediction during microfiltration of calcium carbonate suspensions with the application of turbulence promoter as a method for permeate flux improvement. Neural networks were also applied for modeling of microfiltration of red plum juice [48] and sugar beet juice suspension [49]. ANN applied in modeling of microfiltration processes have usually used tangential, sigmoid or logistic transfer functions.
In contrast to the previous studies [15,31], the novelty of the approach adopted in this work is training of the single neural network that could be used as simulation tool for the different modes of microfiltration operation. The modes of the operation include the use of hydrodynamic methods (air-sparging and static mixer) for flux decline lessening. Considering the previous research in the field of ANN application for dynamic modeling of microfiltration [15,31], the aim of this study was to investigate the performance of non-recurrent feed-forward network with one hidden layer in prediction of permeate flux value during microfiltration of Bacillus velezensis cultivation broth. The main goal was determination of the optimal neural network architecture for dynamic permeate flux prediction under various conditions. The data for ANN modeling were collected by the microfiltration experiments in the system without hydrodynamic methods for permeate flux improvement and with application of the Kenics static mixer or two-phase flow as fouling mitigation methods. Also, the experimental results obtained for the combination of these two methods were included in the data sets used for selection of the optimal neural network topology.

Preparation of Bacillus Velezensis Cultivation Broth
Bacillus velezensis cultivation broth has been produced by cultivation of the producing microorganisms in the Woulff bottles (total volume 2 L, working volume was 2/3 of the total volume) using the cultivation medium which contained 10 g/L of glycerol, 3 g/L of yeast extract, 3 g/L of (NH 4 ) 2 SO 4 , 1 g/L of K 2 HPO 4 and 0.3 g/L of MgSO 4 ·7H 2 O. pH value of the cultivation medium was set to 7.0 ± 0.2 before sterilization by autoclaving (121 • C, 2.1 bar, 20 min). Inoculum was prepared using nutrient broth (HiMedia Laboratories, Karnataka, India) under the following conditions: temperature 28 • C, external mixing rate 150 rpm, spontaneous aeration, duration 48 h. Cultivation medium in the Woulff bottles was inoculated with 10% (v/v) of inoculum. Cultivation conditions were as follows: temperature 28 • C, external mixing rate 150 rpm, aeration rate 0.75 vvm (volume of air·volume of liquid −1 ·min −1 ), duration 96 h. The resulting cultivation broth was used as a feed mixture for the microfiltration experiments.

Microfiltration Experiments
Microfiltration experiments were carried out using the single channel ceramic membrane (Tami Industries, Nyons, France). The pore diameter was 200 nm, while the length and the inner diameter of the membrane were 250 mm and 7 mm, respectively. The effective membrane length was 230 mm, with filtration area of 0.0043 m 2 . Apparatus used in microfiltration experiments was described by Jokić et al. [16]. Recirculation of retentate and permeate in microfiltration experiments had assured constant feed conditions. The temperature was adjusted to 25 • C. Permeate flux (J P , L·m −2 ·h −1 ) value was calculated using the following equation (Equation (1)): where t (h) is the time required to collect certain volume of the permeate V P (10 mL), and A (m 2 ) is the effective filtration area. Filtration experiments were carried out in four varying modes. The first set of experiments was conducted by using plain membrane without application of any method for permeate flux improvement. In the second set, the static mixer was used as a turbulence promoter to enhance filtration flux. Air sparging was employed as a method for flux decline mitigation in the third set, while the fourth set of experiments was used to investigate the combination of static mixer and air sparging.
The turbulence promoter used in microfiltration experiments was Kenics static mixer, made of stainless steel, with diameter of 6 mm and length of 230 mm (corresponding to the active membrane length). In the experiments with air sparging air flow rate was set and maintained constant using the electronic flow rate regulator (EL FLOW ® F 201AV, Bronkhorst, Ruurlo, The Netherlands). Experimental variables and their values used in the Box-Behnken experimental plan (3 3 -three variables varied at three levels) for microfiltration experiments are listed in Table 1.

Data Compilation
For development of the ANN model for dynamic prediction of permeate flux behavior during microfiltration of Bacillus velezensis cultivation broth all experimental data were joint in one single dataset. The time-dependent permeate flux values comprised 1115 experimental data points for all microfiltration modes. Operational conditions including microfiltration time, superficial feed velocity, transmembrane pressure and superficial air velocity were taken as four of five input variables for the ANN model in this study (Table 1). Experiments with or without static mixer were conducted in the same operational parameters range, so to make a distinction in the compiled datasets an additional input (the presence of the Kenics static mixer) was selected. The mode of operation with the static mixer was designated the value 1, and in the case of microfiltration without the static mixer the value was 0 (Table 1). Furthermore, the differences between values of superficial feed velocity and superficial air velocity in the operation modes with and without static mixer arose from the reduced effective cross section of the membrane channel due to the presence of the static mixer. The normalization of data was performed using the procedure according to Equation (2) [50]: where J normal and J P are normalized permeate flux value and measured permeate flux value, respectively, J max and J min are maximal and minimal value of permeate flux in the series of experimental data, respectively, and ∆ U and ∆ L are upper and lower values of normalization limit, respectively (0.01 for each limit).

Artificial Neural Network Modelling
In order to obtain single neural network for simulation of all experiments including methods for improvement of permeate flux (turbulence promoter, air sparging and combination of the turbulence promoter and air sparging) during microfiltration of Bacillus velezensis cultivation broth, the neural network with one hidden layer was used. In this study, a non-recurrent feed-forward artificial neural network is used, as this type of neural networks is used in numerous studies that deal with ANN microfiltration modeling [27]. Optimal neural network architecture with the best prediction performance was selected among the four models of neural network. The models have combined two types of training algorithm-the Levenberg-Marquardt algorithm (trainlm) and Bayesian regularization (trainbr)-with two types of transfer functions between the input and the hidden layer: sigmoid logistic (logsig) and sigmoid hyperbolic (tansig). In this way, four ANN types were investigated ( Table 2). In all ANN models, the transfer function between the hidden and the output layer was linear function (puerlin) ( Table 2). Table 2. Neural network models investigated for microfiltration modeling.

ANN Type
Training Algorithm Transfer Function

Input-Hidden Layer Hidden-Output Layer
The Levenberg-Marquardt training learning algorithm is based on an improved method of error back-propagation and it uses an early training stop criterion in order to increase efficacy and training speed of the neural network. Training stop criteria are usually maximal number of epochs or the moment when the error decrease under the acceptable limit. Early training stop happens if generalization ability is not being improved during the training, or the model mean square error (MSE) starts to increase.
Bayesian regularization represents a modification of the Levenberg-Marquardt training algorithm which contributes to solving the problem of lowered generalization ability due to neural network model overtraining (reduced model accuracy when presenting a new unseen set of the data). Iterative procedure contributes to MSE lowering, but also to the reduced number of parameters which should be adjusted to achieve the lowest possible MSE, therefore this training algorithm minimizes the number of synaptic weights which should be adjusted during the neural network training. Hence, the accuracy of the Bayesian regularization is around five times higher compared to the Levenberg-Marquardt early-stop training algorithm [51].
The main goal of optimal neural network architecture is to select a simpler neural network, i.e., a neural network with minimal number of neurons in the hidden layer, which simultaneously provides satisfactory predictability of the network.
The normalized dataset of microfiltration experimental results was randomly divided into three groups, using the randperm algorithm in Matlab software (R2015b, MathWorks, Natick, MA, USA). The data used for ANN training consisted of 70% of all data, whilst 15% of data were used for validation and the rest 15% for model testing.
The criteria for the end of ANN training were maximal number of epochs 1500, minimal MSE 0 or minimal performance gradient 1·10 −10 . The MSE for each neural network model was calculated for increasing number of neurons in the hidden layer from one to 15 neurons using Equation (3): where n is the number of data, J exp,i is ith experimental flux value and J pred,i is ith flux value predicted by the neural network. Another criterion for optimal network selection was coefficient of determination (R 2 ), calculated by Equation (4). The neural networks were trained 30 times, and the average MSE and R 2 were calculated to avoid probabilistic weight selection influence: where J pred,avg is the average flux value predicted by the neural network.
Validation of the neural networks models was performed using the linear regression analysis -Pearson's correlation coefficient (r). The neural network model shows good correlation, i.e., good prediction capability, if the absolute value of the Pearson's coefficient, calculated by Equation (5), is |r| ≥ 0.8: where J exp is an arithmetic mean of the experimental flux values and J pred is an arithmetic mean of the flux values predicted by the neural network. Relative importance of the input variables effect to permeate flux, as the output variable, was calculated using Garson's equation (Equation (6) where n v and n h are number of the neurons in the input and the hidden layer, respectively, i j is the absolute value of connection weights between the input and the hidden layer neurons, and o j is the absolute value of connection weights between the hidden and the output layer neurons.

Effect of Learning Algorithm, Transfer Function and Number of Hidden Layer Neurons
A trial and error-based method was selected for defining the number of neurons in the hidden layer of the ANN. Figure 1 shows the variation of MSE and coefficient of determination versus the number of neurons in the hidden layer. The results are presented for the training set of data, and they clearly show that the outcome of increase in number of the hidden layer neurons is better predictability for all investigated networks. As can be seen, all network types investigated in this study had values of MSE and R 2 in a narrow range. This suggests that their predictive capacities were alike. Even for just a few hidden neurons high values of coefficients of determination were obtained. In the case of the neural network model with the Levenberg-Marquardt algorithm and sigmoid hyperbolic function (network type B) minimal value of MSE was achieved using the neural network with 13 neurons in the hidden layer. Furthermore, the increase of the coefficient of determination could also be observed with the increase in number of neurons in the hidden layer. MSE and R 2 values for this network were 2.69 × 10 −4 and 0.99498, respectively. In case of using the Levenberg-Marquardt algorithm and sigmoid logistic function (network type A), minimum MSE value of 2.60 × 10 −4 and maximum R 2 value of 0.99539 were achieved for a neural network with 15 neurons in the hidden layer.
In the case of the neural network model with the Bayesian regularization and hyperbolic sigmoid function (network type D), a decrease of MSE could be also noticed when the number of neurons in the hidden layer was increased (Figure 1a). A minimal value of MSE and maximal value of R 2 were achieved with 15 neurons in the hidden layer, 2.74 × 10 −4 and 0.99513, respectively. On the other hand, when using neural network model with Bayesian regularization and hyperbolic logistic function (network type C), minimal value of MSE (2.74 × 10 −4 ) and maximal value of R 2 (0.99515) were achieved with 14 neurons in the hidden layer.
Therefore, it could be concluded that the optimal number of hidden neurons for approximation of the microfiltration results during microfiltration of Bacillus velezensis cultivation broth was 13. The neural network model trained by the Levenberg-Marquardt training algorithm and with hyperbolic sigmoid transfer function (network type B: trainlm, tansig) has shown the best prediction capability due to high values of coefficient of determination and MSE, with the lowest number of hidden neurons. Hence, the chosen optimal neural network was type B with architecture 5-13-1.

Verification of the Neural Network Model
Prediction accuracy of the neural network model was checked using the Pearson's coefficient (Equation (5)) and the coefficient of determination on experimental versus ANN data. Parity plot (Figure 2) for the complete dataset shows that there is a very close agreement between the experimental and the ANN predicted data for the great majority of cases. The Pearson's coefficient value of 0.99611 suggested a good linear correlation between the experimental data and the data predicted by the neural network. Therefore, it could be concluded that the optimal number of hidden neurons for approximation of the microfiltration results during microfiltration of Bacillus velezensis cultivation broth was 13. The neural network model trained by the Levenberg-Marquardt training algorithm and with hyperbolic sigmoid transfer function (network type B: trainlm, tansig) has shown the best prediction capability due to high values of coefficient of determination and MSE, with the lowest number of hidden neurons. Hence, the chosen optimal neural network was type B with architecture 5-13-1.

Verification of the Neural Network Model
Prediction accuracy of the neural network model was checked using the Pearson's coefficient (Equation (5)) and the coefficient of determination on experimental versus ANN data. Parity plot (Figure 2) for the complete dataset shows that there is a very close agreement between the experimental and the ANN predicted data for the great majority of cases. The Pearson's coefficient value of 0.99611 suggested a good linear correlation between the experimental data and the data predicted by the neural network. The coefficient of determination (R 2 ) value of 0.99224 suggested that the linear regression equation for permeate flux could not explain less than 0.8% of the variations in the system. In other words, the majority of the data are close to the line which represents the ideal fitting of the experimental data (full line in the Figure 2, which represents ideal fitting by the linear model). This implies very good prediction consistency of the neural network model. Detailed estimation of neural network capability to predict permeate flux value during cross-flow microfiltration of Bacillus velezensis cultivation broth has been investigated using the analysis of absolute relative error ( Table  3). The neural network model was able to predict 85% of the data with error less than 10%. Furthermore, for 67% of the data the value of absolute relative error was less than 5%, while for only 6% of the data absolute relative error was higher than 20%. Considering that the neural network model for the experimental data for all microfiltration experiments has given predictions with absolute relative error less than 20% for 95% of the data, it could be concluded that ANN approach is suitable for prediction of permeate flux value during microfiltration of Bacillus velezensis cultivation broth.
Additional microfiltration experiments were carried out at transmembrane pressure of 0.2 bar, superficial feed velocity of 0.43 m•s −1 and for air-sparging experiments superficial air velocity was set at 0.2 m•s −1 . The experiments were undertaken with and without static mixer. The results of the simulation experiments are given in Figure 3. The coefficient of determination (R 2 ) value of 0.99224 suggested that the linear regression equation for permeate flux could not explain less than 0.8% of the variations in the system. In other words, the majority of the data are close to the line which represents the ideal fitting of the experimental data (full line in the Figure 2, which represents ideal fitting by the linear model). This implies very good prediction consistency of the neural network model. Detailed estimation of neural network capability to predict permeate flux value during cross-flow microfiltration of Bacillus velezensis cultivation broth has been investigated using the analysis of absolute relative error ( Table 3). The neural network model was able to predict 85% of the data with error less than 10%. Furthermore, for 67% of the data the value of absolute relative error was less than 5%, while for only 6% of the data absolute relative error was higher than 20%. Considering that the neural network model for the experimental data for all microfiltration experiments has given predictions with absolute relative error less than 20% for 95% of the data, it could be concluded that ANN approach is suitable for prediction of permeate flux value during microfiltration of Bacillus velezensis cultivation broth.
Additional microfiltration experiments were carried out at transmembrane pressure of 0.2 bar, superficial feed velocity of 0.43 m·s −1 and for air-sparging experiments superficial air velocity was set at 0.2 m·s −1 . The experiments were undertaken with and without static mixer. The results of the simulation experiments are given in Figure 3.
In the initial microfiltration phase permeate flux value decline could be observed in all experimental setups. The reason for this can be found in the formation of a cake layer on the membrane surface. The layer made up of deposited microbial cells and large molecules causes an increase of the specific resistance to permeate flow, and thus the decrease in flux values. In the initial microfiltration phase permeate flux value decline could be observed in all experimental setups. The reason for this can be found in the formation of a cake layer on the membrane surface. The layer made up of deposited microbial cells and large molecules causes an increase of the specific resistance to permeate flow, and thus the decrease in flux values.
As expected, the lowest flux values were obtained without any fouling mitigation method used, i.e., in an empty membrane channel without static mixer (NSM mode). Air-sparging (AS mode) increased permeate flux values by around 100%, due to instabilities in feed flow caused by the existence of two-phase flow in the membrane channel. The presence of the Kenics turbulence promoter (SM mode) resulted in a significant increase of flux value (around 300%) for selected experimental conditions depicted in Figure 3. The characteristic shape of the mixer that intensifies radial mixing in the membrane channel and reduces the cake thickness is a probable reason for this [13][14][15]. Combination of the static mixer and two-phase flow (AS + SM mode) resulted in the highest flux values, although the contribution of air-sparging is less compared to the static mixer.
Generalization capacity of the neural network model has been confirmed by a simulation of microfiltration experimental data, which had not been presented to the neural network in the phases of training, validation or testing. The experimental conditions of microfiltration variables in the presence of a turbulence promoter and with air-sparging (AS + SM mode) were set to 0.35, 0.6 and 0.3, for transmembrane pressure, superficial feed velocity and superficial air velocity, respectively. The results of the simulation experiment are given in Figure 4.
As can be seen in Figure 4, the selected ANN network is capable of satisfactorily simulating experimental data for the experiment. The majority of the predicted flux values falls in the range of 10% error. The most significant prediction error is noticed at the beginning of microfiltration operation. The reason for this can be found in the rapid flux decline at the beginning of Bacillus velezensis broth microfiltration. Cultivation broth is a complex mixture of various components that can cause severe membrane fouling in the initial phase of microfiltration [8,11,14,16]. This contributes to reduced ANN prediction capabilities in this region. A relatively small number of flux value experimental observations during the initial microfiltration period results in limited number of data provided for the network in the training stage. This limitation can be avoided by capturing more flux values in the initial microfiltration phase, thus improving predictability of the network at the beginning of the microfiltration period.
The results of a simulation of microfiltration experiments in all investigated modes indicate that a single neural network is capable of accurately predicting dynamic permeate flux values during microfiltration of Bacillus velezensis cultivation broth. As expected, the lowest flux values were obtained without any fouling mitigation method used, i.e., in an empty membrane channel without static mixer (NSM mode). Air-sparging (AS mode) increased permeate flux values by around 100%, due to instabilities in feed flow caused by the existence of two-phase flow in the membrane channel. The presence of the Kenics turbulence promoter (SM mode) resulted in a significant increase of flux value (around 300%) for selected experimental conditions depicted in Figure 3. The characteristic shape of the mixer that intensifies radial mixing in the membrane channel and reduces the cake thickness is a probable reason for this [13][14][15]. Combination of the static mixer and two-phase flow (AS + SM mode) resulted in the highest flux values, although the contribution of air-sparging is less compared to the static mixer.
Generalization capacity of the neural network model has been confirmed by a simulation of microfiltration experimental data, which had not been presented to the neural network in the phases of training, validation or testing. The experimental conditions of microfiltration variables in the presence of a turbulence promoter and with air-sparging (AS + SM mode) were set to 0.35, 0.6 and 0.3, for transmembrane pressure, superficial feed velocity and superficial air velocity, respectively. The results of the simulation experiment are given in Figure 4.
As can be seen in Figure 4, the selected ANN network is capable of satisfactorily simulating experimental data for the experiment. The majority of the predicted flux values falls in the range of 10% error. The most significant prediction error is noticed at the beginning of microfiltration operation. The reason for this can be found in the rapid flux decline at the beginning of Bacillus velezensis broth microfiltration. Cultivation broth is a complex mixture of various components that can cause severe membrane fouling in the initial phase of microfiltration [8,11,14,16]. This contributes to reduced ANN prediction capabilities in this region. A relatively small number of flux value experimental observations during the initial microfiltration period results in limited number of data provided for the network in the training stage. This limitation can be avoided by capturing more flux values in the initial microfiltration phase, thus improving predictability of the network at the beginning of the microfiltration period.
The results of a simulation of microfiltration experiments in all investigated modes indicate that a single neural network is capable of accurately predicting dynamic permeate flux values during microfiltration of Bacillus velezensis cultivation broth.

Relative Importance of the Input Variables
Although neural networks are considered to be black-box models, ignorant of real physical connections between experimental parameters, there are some possible ways to gain insights into the influence of experimental parameters on the microfiltration process. Considering weights and biases of the optimal trained neural network it is possible to assess the influence of the input parameters by the means of Garson equation (Equation (6)) [37]. Relative importance of the input variables is given in Table 4.
It could be concluded that filtration time has the most significant effect in determination of the permeate flux decline (50.30%). These results are in agreement with findings reported for dynamic modeling of Streptomyces hygroscopicus fermentation broth microfiltration [15], microfiltration of starch wastewater [31] and cross-flow microfiltration of a mixture that contains phosphate and fly ash [38].
The importance of superficial air and feed velocities were ranked second and fourth, respectively. The relative importance of superficial air velocity is around 4% higher compared to relative importance of superficial feed velocity, leading to the conclusion that increase of air velocity could contribute more to the change of permeate flux. Rod-shaped cells of Bacillus velezensis are oriented by the feed flow in the membrane channel [11,52]. As the feed flow velocity increases a turbulent flow regime is reached and influence of feed flow is less pronounced. On the other hand, when two-phase flow is applied, superficial velocities of the two-phase flow are increased and enhance turbulence due to movement of air bubbles, which change the cake layer structure. Consequently, superficial air velocity influences permeate flux more importantly compared to feed velocity.

Relative Importance of the Input Variables
Although neural networks are considered to be black-box models, ignorant of real physical connections between experimental parameters, there are some possible ways to gain insights into the influence of experimental parameters on the microfiltration process. Considering weights and biases of the optimal trained neural network it is possible to assess the influence of the input parameters by the means of Garson equation (Equation (6)) [37]. Relative importance of the input variables is given in Table 4. It could be concluded that filtration time has the most significant effect in determination of the permeate flux decline (50.30%). These results are in agreement with findings reported for dynamic modeling of Streptomyces hygroscopicus fermentation broth microfiltration [15], microfiltration of starch wastewater [31] and cross-flow microfiltration of a mixture that contains phosphate and fly ash [38].
The importance of superficial air and feed velocities were ranked second and fourth, respectively. The relative importance of superficial air velocity is around 4% higher compared to relative importance of superficial feed velocity, leading to the conclusion that increase of air velocity could contribute more to the change of permeate flux. Rod-shaped cells of Bacillus velezensis are oriented by the feed flow in the membrane channel [11,52]. As the feed flow velocity increases a turbulent flow regime is reached and influence of feed flow is less pronounced. On the other hand, when two-phase flow is applied, superficial velocities of the two-phase flow are increased and enhance turbulence due to movement of air bubbles, which change the cake layer structure. Consequently, superficial air velocity influences permeate flux more importantly compared to feed velocity.
The presence of the static mixer in the membrane channel is at the third place. Insertion of the turbulence promoter in the membrane channel significantly contributes to permeate flux increase, as shown in Figure 3. The Kenics static mixer causes the increase of feed velocity, hence the turbulence is caused by the unique flow pattern forming secondary flows and radial mixing which enable fluid movement closer to the membrane surface. This flow near the membrane surface increases permeate flux as it reduces the thickness of the cake layer and changes its brick like structure [16]. Furthermore, in the system with air sparging the presence of the turbulence promoter contributes to the significant change of a two-phase flow regimen by causing coalescence and bursting of the large air bubbles, as well as by making it possible for smaller air bubbles to flow near the membrane surface affecting breaking of the filtration cake structure, which lowers the permeate flow resistance leading to higher values of permeate flux and the steady state being reached faster.
The relative influence of transmembrane pressure has the lowest rank. It can be explained by the formation of a brick-like cake layer of the rod-shaped microbial cells. This type of cake structure becomes more compact with the increase of pressure, which in turn results in lower influence of transmembrane pressure on the permeation flux [11,14,52].

Conclusions
Considering the results presented in this study, it could be concluded that a single non-recurrent feed-forward ANN with architecture 5-13-1, trained by the Levenberg-Marquardt training algorithm with hyperbolic sigmoid transfer function between the input and the hidden layer, could be used as a reliable tool for modeling of permeate flux during microfiltration of Bacillus velezensis cultivation broth in various modes. The modes of the operation include the use of hydrodynamic methods (air-sparging and static mixer) for flux decline reduction. The proposed neural network model has predicted permeate flux values for all modes of microfiltration experiment. The accuracy of 99.23%, indicated by the coefficient of determination, shows absolute relative error less than 20% for over 95% of the predicted data. Furthermore, an analysis of relative importance of the input variables (microfiltration parameters) to permeate flux has revealed that the most noticeable effect has filtration time, followed by air linear velocity, presence of the Kenics static mixer, feed linear velocity and transmembrane pressure. The results of this study have confirmed a suitability of ANNs as a modeling tool for microfiltration, with an emphasis on the importance of adequate experimental data preparation, network architecture optimization, as well as on defining of appropriate neural network training parameters in order to obtain a highly accurate and reliable model.