Simulating and Comparing CO2/CH4 Separation Performance of Membrane–Zeolite Contactors by Cascade Neural Networks

Separating carbon dioxide (CO2) from gaseous streams released into the atmosphere is becoming critical due to its greenhouse effect. Membrane technology is one of the promising technologies for CO2 capture. SAPO-34 filler was incorporated in polymeric media to synthesize mixed matrix membrane (MMM) and enhance the CO2 separation performance of this process. Despite relatively extensive experimental studies, there are limited studies that cover the modeling aspects of CO2 capture by MMMs. This research applies a special type of machine learning modeling scenario, namely, cascade neural networks (CNN), to simulate as well as compare the CO2/CH4 selectivity of a wide range of MMMs containing SAPO-34 zeolite. A combination of trial-and-error analysis and statistical accuracy monitoring has been applied to fine-tune the CNN topology. It was found that the CNN with a 4-11-1 topology has the highest accuracy for the modeling of the considered task. The designed CNN model is able to precisely predict the CO2/CH4 selectivity of seven different MMMs in a broad range of filler concentrations, pressures, and temperatures. The model predicts 118 actual measurements of CO2/CH4 selectivity with an outstanding accuracy (i.e., AARD = 2.92%, MSE = 1.55, R = 0.9964).


Introduction
Developing carbon capture and sequestration technologies, deploying renewable sources of energy, and tightening regulations are the key strategies for achieving the Paris Agreement targets [1,2]. To this end, among renewable energies such as solar, wind, biogas, and biomass, biogas has already demonstrated an appealing potential to replace fossil fuels [3]. On the other hand, to meet pipeline safety and maintenance criteria, biogas requires upgrading by separating the bio-methane from contaminants such as carbon dioxide [4]. Accordingly, separating CO2 from CH4 is not only necessary for climate change mitigation, but also favorable for synthesizing value-added chemicals [5]. To date, several technologies, including electrochemical reduction [6], membrane [7], cryogenic [8], adsorption [9], and absorption [10], have been investigated for gas capture and sequestration. Selectivity is one of the key factors of membrane-based separation processes [11].
Among these strategies, absorption and cryogenic separation are the most mature, although absorption poses some serious environmental problems and cryogenics consume a [...]
It must be mentioned that, although it is possible to simulate the effect of all influential variables on the CO2/CH4 selectivity using the CNN model, the following considerations led us to consider only the effect of SAPO-34 dose, polymer type, temperature, and pressure on the selectivity:

1. It is preferable to develop the model from variables that are easy to obtain and always available.
2. Some of the potentially influential variables, including the MMM synthesis method and the selectivity measurement procedure, are not reported in some of the original articles. Therefore, we have not considered them as independent variables.
3. It is preferable to ignore variables that have only a minor impact on the selectivity.

Dependency of CO2/CH4 Selectivity on Involved Variables
The previous section clearly defined both dependent and independent variables involved in methane and carbon dioxide separation by polymer/SAPO-34 membranes. Here, Pearson's method is applied to reveal the relationship between each pair of dependent and independent variables. This method uses Equation (1) to identify the most probable pattern between a pair of variables (i.e., x and y) and extract their relationship [53]:
$$PC_{x,y} = \frac{\sum_{k=1}^{m} (x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{m} (x_k - \bar{x})^2 \, \sum_{k=1}^{m} (y_k - \bar{y})^2}} \quad (1)$$
Here, $PC_{x,y}$ is Pearson's coefficient between the independent (x) and dependent (y) variables, and $\bar{x}$ and $\bar{y}$ are their mean values. In addition, m stands for the number of available records, and k is the summation index running from 1 to m. The relationship type/strength is coded using a coefficient that varies between −1 and +1 [54]. The minimum and maximum coefficients show the strongest indirect and direct relationships, respectively. The physical meaning of other coefficients can be inferred from their closeness to either the maximum or the minimum bound. Figure 1 presents the Pearson's coefficients for the relationship of CO2/CH4 selectivity to the SAPO-34 dosage in MMMs, polymer type, temperature, and pressure. The selectivity directly relates to the first two independent variables, while it indirectly relates to the last two.
It is worth noting that the level of data scattering is a critical factor that often misleads relevancy tests into providing inaccurate results. The literature clearly states that Pearson's predictions are sometimes the opposite of those established by scientific facts [53]. Although Pearson's method suggested that selectivity has almost no relationship with pressure and temperature, it is well known in the field of membrane-based separation that these are important variables with a substantial impact on the operation.
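The Pearson coefficient used in the relevancy test can be illustrated with a minimal Python sketch. The function name `pearson_coefficient` and the numerical values are hypothetical and are not taken from the paper's dataset or from the authors' Matlab code. The second response depends entirely on the input yet is not linear in it, which is one way (besides data scattering) in which a linear relevancy test can understate a real dependence.

```python
import numpy as np

def pearson_coefficient(x, y):
    """Pearson's coefficient between two equally long 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Hypothetical values, NOT taken from the paper's dataset.
pressure = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
linear_response = 2.0 * pressure + 1.0   # perfectly linear in pressure
curved_response = (pressure - 3.0) ** 2  # depends on pressure, but not linearly

pc_linear = pearson_coefficient(pressure, linear_response)
pc_curved = pearson_coefficient(pressure, curved_response)
print(f"linear: {pc_linear:.3f}, curved: {pc_curved:.3f}")
```

For these inputs the linear response gives a coefficient of essentially +1, while the curved response gives a coefficient of essentially 0 despite being fully determined by the input.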

Cascade Neural Networks (CNN)
ANN is a non-linear modeling approach that was first developed by analogy with the way the human brain processes data [55]. Its flexibility, robustness, and accuracy quickly led to its adoption in a broad range of applications, from biomedicine to sustainable development [56,57]. It is possible to design a powerful ANN model that extracts a logical pattern between the considered determinative factors and the related dependent ones, regardless of the degree of complexity [58]. To this end, providing an acceptable experimental dataset is a primary step in developing a black box for the prediction of targets [59]. In this way, the architecture of ANNs is obtained with signal analysis among the input and output factors. It is worth noting that the successful development of the ANN approach demands order specifications in different layers related to neuron interactions. Accordingly, various types of ANN topologies have been introduced, such as multi-layer perceptron, cascade, radial basis function, and general regression [55]. Among these paradigms, the cascade neural network (CNN) is considered the most popular tool to simulate different phenomena [60,61]. The CNN approach directly connects the input layer to all the subsequent hidden/output layers [60]. Accordingly, the current study employs a CNN topology with three key layers: input, hidden, and output. The input layer holds the independent variables that have the most significant influence on the process; after some data analyses, these are sent to the hidden layer, where the major data processing and mathematical treatment are applied. The outcomes are then introduced to the output layer for final analysis.
The mathematical processing applied in each neuron is specified by Equation (2) [62]:
$$\mathrm{NON} = \sum_{r=1}^{M} \omega_r x_r + b \quad (2)$$
This equation states that the net output of a neuron (NON) is calculated from the entry signal (i.e., $[x_1, x_2, \ldots, x_M]^T$), the weight vector ($\omega_r$), and a bias (b). Additionally, a transfer function (f) receives the NON and calculates the outlet signal, i.e., f(NON). Although there are various types of transfer functions [57], the current study uses the hyperbolic tangent sigmoid (Equation (3)) and the logarithmic sigmoid (Equation (4)) in the hidden and output layers, respectively [63]:
$$f(\mathrm{NON}) = \frac{e^{\mathrm{NON}} - e^{-\mathrm{NON}}}{e^{\mathrm{NON}} + e^{-\mathrm{NON}}} \quad (3)$$
$$f(\mathrm{NON}) = \frac{1}{1 + e^{-\mathrm{NON}}} \quad (4)$$
The logarithmic sigmoid has been incorporated into the output layer so that the model benefits properly from the normalization process, which enhances the training rate and improves its quality. Indeed, all variables in our study have been normalized to the range [0, 1] before starting the model development phase, which matches the output range of the logarithmic sigmoid.
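The signal flow described above can be sketched in Python as a single forward pass through a 4-11-1 cascade network. This is an illustrative sketch under stated assumptions, not the authors' Matlab implementation: the weights are random and untrained, the input values are hypothetical, all names (`cascade_forward`, `W_io`, `denormalize`, and so on) are introduced here, and scaling the output back with the reported selectivity range of 1.38-66.99 assumes that range was used for normalization.

```python
import numpy as np

def tansig(n):
    # Hyperbolic tangent sigmoid, output in (-1, 1)
    return np.tanh(n)

def logsig(n):
    # Logarithmic sigmoid, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-n))

def cascade_forward(x, W_ih, b_h, W_ho, W_io, b_o):
    """Forward pass of a single-hidden-layer cascade network."""
    hidden = tansig(W_ih @ x + b_h)
    # Cascade link: the output neuron sees the hidden signals AND the raw inputs.
    return logsig(W_ho @ hidden + W_io @ x + b_o)

def denormalize(v, v_min, v_max):
    # Inverse of min-max scaling to [0, 1]
    return v * (v_max - v_min) + v_min

# Randomly initialised (untrained) parameters for a 4-11-1 topology.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 11, 1
W_ih = rng.normal(size=(n_hidden, n_in))   # input -> hidden weights
b_h = rng.normal(size=n_hidden)            # hidden biases
W_ho = rng.normal(size=(n_out, n_hidden))  # hidden -> output weights
W_io = rng.normal(size=(n_out, n_in))      # input -> output (cascade) weights
b_o = rng.normal(size=n_out)               # output bias

# Hypothetical normalised inputs: filler dose, polymer code, temperature, pressure.
x = np.array([0.20, 0.50, 0.10, 0.80])
y_norm = cascade_forward(x, W_ih, b_h, W_ho, W_io, b_o)
# Assumes the selectivity was scaled with the reported data range 1.38-66.99.
selectivity = denormalize(y_norm, 1.38, 66.99)
n_params = W_ih.size + b_h.size + W_ho.size + W_io.size + b_o.size
print(y_norm.shape, n_params)
```

With this layout the network has 71 adjustable parameters, of which the four cascade weights in `W_io` are the only difference from an ordinary multi-layer perceptron of the same size.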

Accuracy Measurement
This study relies on several accuracy criteria to measure the deviation between the actual and predicted carbon dioxide to methane selectivity (CMS) of MMMs. The correlation coefficient (R), absolute average relative deviation percent (AARD%), mean squared error (MSE), relative deviation percent (RD%), and residual error (RE) are defined by Equations (5)-(9) and are used in the current study [64], where m indicates the number of CMS samples and the act and pred superscripts denote the actual and predicted values of the CMS.
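The five criteria can be sketched in Python as follows. The definitions below are the commonly used forms and are assumptions rather than a transcription of Equations (5)-(9): in particular, the sign convention (actual minus predicted) and the use of the Pearson correlation between actual and predicted values for R are choices made here. The function name `accuracy_report` and the three sample values are hypothetical.

```python
import numpy as np

def accuracy_report(actual, predicted):
    """Common definitions of RE, RD%, AARD%, MSE and R (assumed forms)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    re = actual - predicted              # residual error per sample
    rd = 100.0 * re / actual             # relative deviation percent per sample
    return {
        "RE": re,
        "RD%": rd,
        "AARD%": float(np.mean(np.abs(rd))),
        "MSE": float(np.mean(re**2)),
        "R": float(np.corrcoef(actual, predicted)[0, 1]),
    }

# Hypothetical selectivity values, NOT taken from the paper's dataset.
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 19.0, 40.0]
report = accuracy_report(actual, predicted)
within_5 = int(np.sum(np.abs(report["RD%"]) <= 5.0))  # samples inside the ±5% band
print(report["AARD%"], report["MSE"], within_5)
```

For these three hypothetical samples the relative deviations are −10%, 5%, and 0%, so the AARD is 5% and two samples fall inside the ±5% band.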
It should be mentioned that the authors wrote several distinct codes in the Matlab environment, each for a specific purpose: (1) conducting the relevancy test, (2) constructing and testing the CNN model, (3) performing statistical analysis, and (4) creating graphs.

Results and Discussion
This section explains the process followed to determine the best topology of the CNN and evaluate its prediction accuracy.

Tuning the CNN Topology
Although it is possible to create a cascade neural network with an arbitrary number of neuronic layers, this study estimates the CO2/CH4 selectivity of MMMs using a single-hidden-layer CNN. All the designed CNNs include an input layer, one hidden layer, and an output layer. The independent variables constitute the input layer, so its size is fixed. In addition, the number of output neurons is dictated by the number of dependent variables (i.e., one). Therefore, it is only necessary to determine the best number of hidden neurons to fine-tune the CNN topology.
The number of hidden neurons is varied from one to twelve during a trial-and-error process, and ten models are developed for each number of hidden neurons. In summary, this study develops 120 CNNs and compares their accuracy to find the most accurate one. Figure 2 reports the results of the ranking analysis performed on the 120 developed CNNs. It can be seen that the eighth CNN model developed with eleven hidden neurons has the best performance and achieves the first-rank position. Therefore, the CNN model with eleven hidden neurons, a tangent sigmoid in the hidden layer, and a logarithm sigmoid in the output layer is selected as the final model to estimate the CO2/CH4 selectivity of MMMs.
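The trial-and-error search (twelve hidden-layer sizes, ten random initializations each, 120 models in total) can be sketched as below. Several points are assumptions made for illustration only: the data are synthetic, the networks are trained with plain batch gradient descent (the paper does not state the training algorithm here), and the models are ranked by testing MSE alone, whereas the paper ranks them on several statistical indices. All function names are hypothetical.

```python
import numpy as np

def init_params(n_in, n_hidden, rng):
    # Random starting point for a cascade network with one output neuron
    return {
        "W_ih": rng.normal(scale=0.5, size=(n_hidden, n_in)),
        "b_h": np.zeros(n_hidden),
        "W_ho": rng.normal(scale=0.5, size=(1, n_hidden)),
        "W_io": rng.normal(scale=0.5, size=(1, n_in)),  # cascade weights
        "b_o": np.zeros(1),
    }

def forward(p, X):
    H = np.tanh(X @ p["W_ih"].T + p["b_h"])             # tangent sigmoid
    net = H @ p["W_ho"].T + X @ p["W_io"].T + p["b_o"]  # hidden + cascade input
    return H, 1.0 / (1.0 + np.exp(-net))                # logarithmic sigmoid

def train(p, X, y, epochs=200, lr=0.5):
    # Batch gradient descent on the MSE (an assumed training algorithm)
    m = len(X)
    for _ in range(epochs):
        H, Y = forward(p, X)
        d_out = (Y - y) * Y * (1.0 - Y) * (2.0 / m)  # dMSE/d(net) at the output
        d_hid = (d_out @ p["W_ho"]) * (1.0 - H**2)   # back-propagated to hidden
        p["W_ho"] -= lr * d_out.T @ H
        p["W_io"] -= lr * d_out.T @ X
        p["b_o"] -= lr * d_out.sum(axis=0)
        p["W_ih"] -= lr * d_hid.T @ X
        p["b_h"] -= lr * d_hid.sum(axis=0)
    return p

def sweep(X_tr, y_tr, X_te, y_te, max_hidden=12, repeats=10, seed=0):
    # Train `repeats` networks per hidden-layer size and rank them by testing MSE
    rng = np.random.default_rng(seed)
    results = []
    for n_hidden in range(1, max_hidden + 1):
        for run in range(1, repeats + 1):
            p = train(init_params(X_tr.shape[1], n_hidden, rng), X_tr, y_tr)
            mse_te = float(np.mean((forward(p, X_te)[1] - y_te) ** 2))
            results.append((mse_te, n_hidden, run))
    return sorted(results)

# Synthetic stand-in for 118 normalised records with 4 inputs and 1 output.
rng = np.random.default_rng(42)
X = rng.random((118, 4))
y = 0.2 + 0.6 * X[:, [0]] * X[:, [1]] + 0.1 * np.sin(3.0 * X[:, [2]])

ranking = sweep(X[:100], y[:100], X[100:], y[100:])
best_mse, best_hidden, best_run = ranking[0]
print(len(ranking), best_hidden, best_run)
```

On synthetic data the winning size carries no physical meaning; the sketch only shows the bookkeeping behind selecting one model out of 120 candidates.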

Table 2 introduces the numerical values of the AARD%, MSE, and R indices for the CNN estimates of the 118 CO2/CH4 selectivity samples of polymer/SAPO-34 membranes. The training, testing, and all datasets have been estimated with excellent AARDs of 2.31%, 6.36%, and 2.93%, respectively. The structure-tuned CNN has also predicted these three datasets with MSEs of 0.51, 7.32, and 1.55, respectively. The closeness of the observed R values to unity in the training and testing steps, and in their combination, is another indication of the outstanding ability of the proposed CNN for simulating the considered separation task.
The key information related to the designed CNN is presented in Figure 3. The feedforward connections between the input/hidden and hidden/output layers, as well as the cascade connection between the input/output layers, are observable in this figure. Moreover, the four inputs are SAPO-34 dosage, polymer type, pressure, and temperature, while the CO2/CH4 selectivity of MMMs is the only output. This figure also shows two neuronic layers (i.e., hidden and output) with 11 and 1 neurons, respectively. The tangent and logarithm sigmoid transfer functions can be seen in the hidden and output layers, respectively.
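Because AARD% and MSE are both sample averages, the values reported for the whole dataset should equal the size-weighted averages of the training and testing values. The short check below uses the 100/18 training/testing split that the paper reports in its discussion of Figure 7; the helper name `pooled` is hypothetical.

```python
# Reported subset sizes and subset metrics
n_train, n_test = 100, 18
aard_train, aard_test = 2.31, 6.36
mse_train, mse_test = 0.51, 7.32

def pooled(v_train, v_test):
    # Size-weighted average of a per-sample mean over the two subsets
    return (n_train * v_train + n_test * v_test) / (n_train + n_test)

aard_all = round(pooled(aard_train, aard_test), 2)
mse_all = round(pooled(mse_train, mse_test), 2)
print(aard_all, mse_all)
```

The pooled values come out as 2.93% and 1.55, in agreement with the overall AARD and MSE quoted above. The same shortcut does not apply to R, which is not a per-sample average.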

CNN Performance Evaluation
The cross-plot of the CO2/CH4 selectivity predicted by the CNN versus the actual values for the training, testing, and all datasets is depicted in Figure 4. This figure shows that the designed CNN has successfully mapped the estimated selectivity samples onto their corresponding actual data in both the training and testing steps. The R values related to the estimation of the training and testing groups are 0.9988 and 0.9860, respectively. These values indicate that most of the predicted-actual symbols are located around the diagonal lines.
The relative deviation percent (i.e., Equation (8)) associated with estimating each experimental measurement of the actual CO2/CH4 selectivity of MMMs is shown in Figure 5. This figure shows that only four training and six testing CO2/CH4 selectivity samples have been estimated with an RD% higher than 5% or lower than −5%. It can be claimed that the proposed CNN model estimates 110 out of 118 actual CO2/CH4 selectivity samples with an excellent RD% in the range of −5% to 5%.
The residual error between the experimental and calculated CO2/CH4 selectivity samples (i.e., Equation (9)) for the training and testing phases, as well as for all the datasets, is illustrated in Figure 6. This figure shows that the CNN estimates almost all the experimental selectivity samples with an RE between −3 and +3. Only a few CO2/CH4 selectivity samples are estimated with an RE outside this narrow range.
Figure 7 shows the experimental values and CNN predictions for the training and testing samples of the CO2/CH4 selectivity of MMMs. Although the experimental CO2/CH4 selectivity measurements cover a relatively broad range (1.38-66.99) across different MMM compositions and operating conditions, the proposed CNN is able to precisely simulate the considered process. This figure also shows that 100 CO2/CH4 selectivity samples have been used in the training step and the remaining 18 samples in the testing step. It was previously reported that the proposed CNN estimates the training and testing datasets with excellent AARDs of 2.31% and 6.36%, respectively.

Investigating the Effect of Involved Features on the Selectivity
Figure 8 presents experimental records and CNN predictions for the CO2/CH4 selectivity of the polyurethane membrane and the polyurethane-SAPO-34 MMM versus pressure at 298 K. This figure confirms the acceptable agreement between the actual and predicted selectivity samples. It also shows that the CO2/CH4 selectivity of both membranes increases with increasing pressure and filler dose. Although Pearson's method correctly anticipated the filler effect on the selectivity, it provided a wrong result for the selectivity-pressure relationship. As mentioned before, such a wrong prediction of the relevancy test is often related to the high level of scattering in the experimental data.
It is worth noting that the relevancy test also anticipated a weak relationship between selectivity and pressure, whereas this analysis confirms that pressure is able to change the selectivity sharply.
The pressure effect on the CO2/CH4 selectivity of Pebax 1074-based MMMs containing 20% SAPO-34 at two temperature levels (298 and 308 K) is depicted in Figure 9. The increasing effect of pressure and the decreasing effect of temperature on the CO2/CH4 selectivity can be concluded from this figure. The relevancy test also confirmed an indirect relationship between selectivity and temperature.

It is worth noting that Pearson's method also suggested only a weak relationship between selectivity and temperature, whereas this analysis confirms that the temperature impact on the selectivity is strong.

Conclusions
This study aimed to estimate the CO2/CH4 selectivity of mixed matrix membranes composed of polymeric substrates with SAPO-34 zeolite as the filler. The cascade neural network was chosen to extract the relationship between the CO2/CH4 selectivity of the considered MMMs and the involved independent variables (i.e., filler dosage in polymeric matrices, polymer type, temperature, and pressure). Pearson's method indicated that an acceptable degree of relevancy exists between the selectivity and the influential variables. This method identified that the SAPO-34 dosage in the composite membrane has the strongest direct effect on the CO2/CH4 selectivity of the considered MMMs. The CNN topology has been fine-tuned using trial-and-error and statistical analyses. The CNN with 11 hidden nodes, which estimates the 118 actual selectivity samples with AARD = 2.92%, MSE = 1.55, and R = 0.9964, is identified as the most accurate model for the considered task. Moreover, several graphical analyses, including cross-plot, residual error, relative deviation percent, and predicted versus actual graphs, have further confirmed the outstanding performance of the proposed CNN for estimating the CO2/CH4 selectivity of the MMMs. Such a reliable model helps monitor the effect of MMM composition (polymer type and filler dosage) and operating conditions (pressure and temperature) on the potential CO2/CH4 selectivity.

Data Availability Statement: The study data analyzed in this article can be obtained by request from the corresponding author.