Data-Based Prediction and Stochastic Analysis of Entrained Flow Coal Gasification under Uncertainty

Entrained flow gasification is a commonly used method for conversion of coal into syngas. Stable and efficient operation of entrained flow coal gasification is always desired to reduce consumption of raw materials and utilities, and to achieve higher productivity. However, uncertainty in the process hinders stability and efficiency. In this work, a quantitative analysis of the effect of uncertainty on the conversion efficiency of entrained flow gasification is performed. A data-driven (ensemble) model of the process was developed to predict the conversion efficiency. Then sensitivity analysis methods, i.e., Sobol and the Fourier amplitude sensitivity test, were used to analyze the effect of each individual process variable on conversion efficiency. To analyze the collective impact of uncertainty in the process variables on conversion efficiency, a non-intrusive polynomial chaos expansion (PCE) method was used. The PCE predicts the probability distribution of the conversion efficiency. Reliability of the process was determined on the basis of the percentage of the probability distribution falling within control limits. For off-line reliability analysis, measured data were used to derive the control limits. For on-line reliability analysis, measured data are not available, so a just-in-time method, i.e., k–d tree, was used. The k–d tree searches for the nearest neighbor sample in a database of historical data to determine the control limits.


Introduction
Coal is one of the major energy sources being used for several centuries. According to British Petroleum (BP) statistical review of world energy, an estimated amount of 1,139,331 million tonnes of reserves exists in different parts of the world [1]. In spite of the concerns raised by environmental agencies and lower usability of coal as a fuel, the shortage of other energy resources makes the use of coal still valid. Entrained flow gasification is a commonly used method for conversion of coal into a green form of fuel, syngas. Extensive work on coal gasification has been reported in literature [2][3][4][5][6][7][8][9][10][11][12][13][14]. A stable and efficient operation of entrained flow coal gasification is a highly desired objective of the researchers to reduce its environmental impact and achieve higher conversion efficiency.
For realizing efficient process design, process modeling of the entrained flow gasification process has been the focus of research [2][3][4][5][8][9][10][11][12][13][14]. Ni et al., in 1995, developed a multivariable model for performance estimation of the entrained flow coal gasifier [8]. Liu et al., in 2000, investigated the effect of char structure and kinetics on the behavior of coal char in a pressurized gasifier [10]. Vicente et al., in 2003, developed a model to analyze turbulent dispersion and coal particle drag in a coal gasification process [11]. Chen et al., in 2012, investigated the effect of inlet flow pattern and coal to steam ratio on the conversion efficiency of the process [14]. Ilamathi et al., in 2013, used artificial neural networks (ANN) to predict unburnt carbon in a gasifier. In addition, a genetic algorithm (GA) was employed to optimize process conditions for realizing reduction in unconverted carbon in bottom ash [2]. Halama et al., in 2016, studied the behavior of two phase mixture, i.e., solid particles and gas phase, in char gasification by using computational fluid dynamics (CFD) techniques [3]. Mitianiec et al. (2017) used CFD simulation for an axial fluidized boiler to determine the thermal loads and toxic emissions during combustion of coal [4]. Wu et al., in 2017, developed a hybrid framework comprised of kinetic model and energy utilization diagram for prediction of reaction progress and exergy destruction in coal gasification [5].
Process uncertainty is a challenge in realizing stable and efficient operation of the coal gasification process. Some work is reported in the literature on analyzing uncertainty in coal gasification process models [9,12,13]. Chen et al., in 1999, used the multi-solids progress variables (MSPV) method to analyze uncertainty in product gas properties; the effect of various feed parameters, such as coal type, gas flow rates, and temperature, was studied [9]. Watanabe et al., in 2006, studied the fluctuations in the coal gasifier's performance by varying some input parameters, e.g., gasifier pressure, air ratio, and coal type [12]. Similarly, Huang et al., in 2007, performed a sensitivity analysis of coal type, bed temperature, static bed height, and various feed flow rates [13].
In the present work, a novel framework comprising data-based prediction, sensitivity analysis, uncertainty analysis, and reliability analysis is proposed. A data-driven model based on an ensemble technique was developed to predict the conversion efficiency of the entrained flow gasification process. For the development of the data-driven model, data were generated through the interfacing of MATLAB®-Excel®-Aspen®. In order to analyze the effect of each individual process variable on the conversion efficiency of the process, the Sobol and Fourier amplitude sensitivity tests were used. The two methods were used to cross-check their index-based ranking of input variables. Non-intrusive polynomial chaos expansion (PCE) was used to analyze the collective effect of uncertainties on the conversion efficiency. PCE is a stochastic approach that yields a predictive distribution of the conversion efficiency of the entrained flow coal gasification. The reliability of the process was determined on the basis of the percentage of the probability distribution falling within control limits. For off-line reliability analysis, measured data were used to derive the control limits. For on-line reliability analysis of the process, measured data are not available, so a just-in-time method, i.e., k-d tree, was used. The k-d tree searches for the nearest neighbor sample in a database of historical data to determine the control limits. Section 2 presents the process description of entrained flow coal gasification, followed by the modeling methods in Section 3. The proposed modeling framework is shown in Section 4, and the results and discussion are provided in Section 5. Section 6 concludes the work.

Process Description
A process flowsheet of an oxy-fired entrained flow coal gasification is shown in Figure 1. This is a schematic representation of an Aspen PLUS® based model derived from the work of Wen et al. [15]. The symbols used in Figure 1 are defined in Table 1. Major process units of the flowsheet were pyrolysis, separator, volatile combustion, and char gasification. The pyrolysis process had two reactors, i.e., R1 and R2, in series. In R1, a pressure drop occurred which was readjusted in R2 to complete the reaction [12].
The outlet of R2 was separated into S3 and S4 by the separator, SEP. The gas stream S4 entered the combustion reactor R4 along with an oxygen stream. The solid stream S3, on the other hand, went to reactor R3 and decomposed to form C, H₂, O₂, N₂, S, and ash. The mixer M mixed the streams S6 and S5 to form the feedstock for the gasifier R5. The gasifier converts char into carbon dioxide, carbon monoxide, methane, hydrogen, and hydrogen sulfide [12].
The RPlug and RYield reactors, built-in modules of Aspen PLUS®, were used for the gasification and pyrolysis processes, respectively. For the combustion and decomposition processes, another built-in Aspen PLUS® model, the RStoic reactor, was used. A list of reactions occurring in the pyrolysis, decomposition, combustion, and gasification processes is shown in Table 2 [15].
Table 1. Material streams and blocks symbols used in Figure 1.
Decomposition process: Char → C + H₂ + O₂ + N₂ + S + Ash
In this study, 14 process variables were used as input variables for the development of the modeling framework; see Tables 3 and 4.

Soft-Sensor Development
A boosting-type ensemble learning method was used to develop the soft-sensor for predicting the conversion efficiency of the entrained flow coal gasification process [16]. Ensemble learning methods have recently gained the attention of researchers due to their high prediction accuracy and robustness [17]. In the boosting method, several weak models are combined to form a single efficient and robust model, as demonstrated in Figure 2 (revised figure from [17]). In each round of model development, the data samples that are difficult to learn receive more focus through the weights allotted by the boosting mechanism. Least squares boosting (LSBoost), a boosting method for regression problems, was adopted in this study [18].
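The boosting idea can be illustrated with a minimal least-squares boosting sketch. It is not the soft-sensor used in this study: it assumes a single input, uses one-split regression stumps as weak learners, and runs on a toy quadratic data set. The function names (`fit_stump`, `lsboost_fit`, `lsboost_predict`) and the learning rate are hypothetical choices made for illustration. In this least-squares variant, each round fits the current residuals, so poorly predicted samples dominate the next weak learner.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares.

    Assumes xs contains at least two distinct values.
    """
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = 0.5 * (lo + hi)  # candidate split halfway between neighbours
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    return best[1], best[2], best[3]


def stump_predict(stump, x):
    thr, lmean, rmean = stump
    return lmean if x <= thr else rmean


def lsboost_fit(xs, ys, n_rounds=200, learning_rate=0.1):
    """Least-squares boosting: each weak learner fits the current residuals."""
    base = sum(ys) / len(ys)          # initial model: the mean of the targets
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def lsboost_predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


# Toy data (hypothetical): y = x^2 on a regular grid.
xs = [i / 10 for i in range(60)]
ys = [x * x for x in xs]
model = lsboost_fit(xs, ys)
base = sum(ys) / len(ys)
mse_base = sum((y - base) ** 2 for y in ys) / len(ys)
mse_boost = sum((y - lsboost_predict(model, x)) ** 2
                for x, y in zip(xs, ys)) / len(ys)
print("MSE of mean-only model:", mse_base)
print("MSE after boosting:   ", mse_boost)
```

The shrinkage factor (`learning_rate`) trades the number of rounds against the risk of overfitting; the study reports 500 trees after optimization, whereas the values here are arbitrary.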

Uncertainty Analysis
Uncertainty analysis is a method used for quantification of collective impact of uncertainty in multiple input variables of a process [19]. Ahmad et al. have done an extensive review on dimensions and analysis methods of uncertainty in process models [20]. Although several methods are available for uncertainty quantification, the current study used a sensitivity analysis method and a sample-based method. The sensitivity analysis method helped in determining the impact of individual variables on the process output while the sample-based method assessed the collective impact.

Sensitivity Analysis
For the sensitivity analysis, Sobol technique and Fourier amplitude sensitivity test (FAST) were used [21][22][23]. The two methods were used to cross-check their index-based ranking of input variables.
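Consider a model $y = f(x)$, where $y$ is a scalar output and the input factors $x_1, \ldots, x_p$ are independent random variables following known distributions. In the Sobol method, $f(x)$ is decomposed as follows:

$$f(x) = f_0 + \sum_{i=1}^{p} f_i(x_i) + \sum_{1 \le i < j \le p} f_{ij}(x_i, x_j) + \cdots + f_{1,\ldots,p}(x_1, \ldots, x_p) \qquad (1)$$

The total variance $v$ of the output is decomposed accordingly:

$$v = \sum_{i=1}^{p} v_i + \sum_{1 \le i < j \le p} v_{ij} + \cdots + v_{1,\ldots,p}$$

where $v_i, v_{ij}, \ldots, v_{1,\ldots,p}$ denote the variances of $f_i, f_{ij}, \ldots, f_{1,\ldots,p}$, respectively. The first-order Sobol sensitivity indices were derived as follows:

$$S_i = \frac{v_i}{v}$$

In FAST, the inputs are varied along a search curve parameterized by a scalar $s$, with an integer frequency $\omega_i$ assigned to factor $i$. Because all terms are mutually orthogonal, the model $y = f(x)$ can be expanded in a Fourier series [24]:

$$y = f(s) = \sum_{j=-\infty}^{\infty} \left[ A_j \cos(js) + B_j \sin(js) \right]$$

where $A_j$ and $B_j$ are the Fourier coefficients:

$$A_j = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(s) \cos(js)\, ds, \qquad B_j = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(s) \sin(js)\, ds$$

The variance caused by factor $i$, $v_i$, in the output was estimated as:

$$v_i = \sum_{k \in Z^0} \Lambda_{k\omega_i} = 2 \sum_{k=1}^{\infty} \Lambda_{k\omega_i}$$

where $Z^0 = Z - \{0\}$ represents the set of integer numbers except 0. The total variance is given by

$$\hat{v} = \sum_{j \in Z^0} \Lambda_j = 2 \sum_{j=1}^{\infty} \Lambda_j$$

where $\Lambda_j = A_j^2 + B_j^2$ is the spectrum of the Fourier series expansion.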
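As an illustration of how a first-order Sobol index $S_i = v_i / v$ can be estimated from samples, the sketch below uses a pick-freeze Monte Carlo estimator. It is an assumption-laden stand-in, not the MATLAB algorithm used in this study: the linear `model`, the standard normal inputs, and the sample size are hypothetical. For this toy model the exact indices are $a_i^2 / \sum_k a_k^2$, i.e., 16/21, 4/21, and 1/21, which gives a reference to compare against.

```python
import random


def model(x):
    """Toy stand-in for the process model (hypothetical): y = 4*x1 + 2*x2 + x3."""
    return 4.0 * x[0] + 2.0 * x[1] + 1.0 * x[2]


def sobol_first_order(f, n_inputs, n_samples, rng):
    """Estimate first-order Sobol indices with a pick-freeze estimator.

    Two independent sample matrices A and B are drawn. For factor i, column i
    of A is replaced by column i of B, so f(B) and f(A with column i from B)
    share only x_i. Their covariance estimates the partial variance v_i.
    """
    A = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)] for _ in range(n_samples)]
    B = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)] for _ in range(n_samples)]
    fA = [f(row) for row in A]
    fB = [f(row) for row in B]

    # Total variance v, estimated from all model evaluations.
    all_y = fA + fB
    mean = sum(all_y) / len(all_y)
    var = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)

    indices = []
    for i in range(n_inputs):
        acc = 0.0
        for a_row, b_row, ya, yb in zip(A, B, fA, fB):
            mixed = list(a_row)
            mixed[i] = b_row[i]           # take x_i from B, the rest from A
            acc += yb * (f(mixed) - ya)
        v_i = acc / n_samples             # partial variance of factor i
        indices.append(v_i / var)         # S_i = v_i / v
    return indices


S = sobol_first_order(model, n_inputs=3, n_samples=20000, rng=random.Random(0))
print("estimated first-order indices:", S)
print("exact indices for this toy model:", [16 / 21, 4 / 21, 1 / 21])
```

Ranking the inputs by `S` reproduces the kind of index-based ranking that the Sobol and FAST methods are used to cross-check.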

Sample-Based Method of Uncertainty Analysis
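A sample-based uncertainty analysis method, i.e., polynomial chaos expansion (PCE), was used for uncertainty analysis of the process. In the PCE-based method, a random variable $x$ is represented as a function $f(\cdot)$ of another random variable $\xi$ [25,26]:

$$x = f(\xi)$$

$x$ is decomposed as follows:

$$x = \sum_{i=0}^{\infty} \alpha_i \psi_i(\xi)$$

where the $\alpha_i$ are the deterministic components (mode strengths) and the $\psi_i$ are the stochastic components, which obey the following orthogonality condition:

$$\langle \psi_j, \psi_k \rangle = \int \psi_j(\xi)\, \psi_k(\xi)\, p_\xi(\xi)\, d\xi = 0 \quad \text{for } j \neq k$$

where $\langle \psi_j, \psi_k \rangle$ denotes the inner product of $\psi_j$ and $\psi_k$, and $p_\xi$ is the probability density function (PDF) of $\xi$.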
In this study, a non-intrusive (black box) method was used to determine the mode strengths [27,28].
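A minimal sketch of the non-intrusive idea follows, under stated assumptions: one standard normal germ $\xi$, probabilists' Hermite polynomials as the basis, and a toy function $\exp(0.3\,\xi)$ standing in for the black-box model. The mode strengths are obtained by projection, $\alpha_n = E[f(\xi)\,He_n(\xi)] / n!$, with the expectation evaluated by a simple trapezoidal rule. The function names and the integration settings are hypothetical and are not taken from the study.

```python
import math


def hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, x
    for k in range(1, n):
        prev, cur = cur, x * cur - k * prev
    return cur


def pce_coefficients(f, order, half_width=10.0, step=0.01):
    """Project f onto He_0..He_order; only evaluations of f are needed."""
    n_steps = int(round(2 * half_width / step))
    coeffs = []
    for n in range(order + 1):
        total = 0.0
        for k in range(n_steps + 1):
            xi = -half_width + k * step
            weight = 0.5 if k in (0, n_steps) else 1.0   # trapezoid end points
            pdf = math.exp(-0.5 * xi * xi) / math.sqrt(2.0 * math.pi)
            total += weight * f(xi) * hermite(n, xi) * pdf
        # <He_n, He_n> = n! under the standard normal density
        coeffs.append(total * step / math.factorial(n))
    return coeffs


def pce_mean_variance(coeffs):
    """Mean and variance of the output follow directly from the mode strengths."""
    mean = coeffs[0]
    var = sum(c * c * math.factorial(n) for n, c in enumerate(coeffs) if n > 0)
    return mean, var


def black_box(xi):
    """Toy stand-in for the process model (hypothetical)."""
    return math.exp(0.3 * xi)


coeffs = pce_coefficients(black_box, order=6)
mean, var = pce_mean_variance(coeffs)
exact_mean = math.exp(0.045)                   # E[exp(a*xi)] = exp(a^2 / 2)
exact_var = math.exp(0.18) - math.exp(0.09)    # exp(2*a^2) - exp(a^2)
print("PCE mean, variance:  ", mean, var)
print("exact mean, variance:", exact_mean, exact_var)
```

Once the mode strengths are known, samples of $\xi$ can be pushed through the truncated expansion to obtain a predictive distribution of the output without further calls to the underlying model.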

Reliability Analysis
Reliability of a process is very important for stable operation. In this work, a framework of reliability analysis is proposed which monitors the probability distribution of the process output, i.e., conversion efficiency, predicted through the PCE-based method. The percentage of the probability distribution within control limits is used to assess the reliability of process conditions. Off-line reliability analysis was performed by deriving the control limits from measured data of the process. For on-line reliability analysis, a just-in-time method, i.e., k-d tree, was used to derive the control limits. Because measured values of the process output are not available during on-line operation, the k-d tree algorithm searches for the nearest neighbor sample in a database of historical measured values [29]. The search for the nearest neighbor sample is carried out by recursively splitting the data along different dimensions until a child subset with only one sample is left. For a data set comprising (3,4), (2,5), (5,9), (6,5), (9,6), (10,7), and (10,3), the k-d tree-based search of the nearest neighbor samples for a query (5,7) is shown in Figure 3.
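The nearest neighbor search on the example data set can be sketched as follows. This is a common textbook variant of the k-d tree, assumed here for illustration: it splits at the median along alternating dimensions and stores one sample at every node, which may differ in detail from the construction shown in Figure 3. The function names are hypothetical.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively split the points at the median, alternating the split axis."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kdtree(pts[:mid], depth + 1),
        "right": build_kdtree(pts[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (point, distance) of the sample closest to the query."""
    if node is None:
        return best
    dist = math.dist(node["point"], query)
    if best is None or dist < best[1]:
        best = (node["point"], dist)
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[1]:
        best = nearest(far, query, best)
    return best


points = [(3, 4), (2, 5), (5, 9), (6, 5), (9, 6), (10, 7), (10, 3)]
tree = build_kdtree(points)
print(nearest(tree, (5, 7)))  # -> ((5, 9), 2.0)
```

In the on-line setting described above, `points` would hold historical input samples, and the measured output stored with the returned neighbor would supply the reference value for the control limits.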

Modeling and Analysis Framework
The modeling strategy adopted in this work is summarized as follows:
• Data generation: steady state values of the process model were altered through the MATLAB®-Excel®-Aspen® interfacing to generate data. The data generated through the interfacing are referred to as measured data. We generated 1000 data samples through the interfacing.
• Soft-sensor design: the data were used to develop the boosting-based soft-sensor. The number of trees (weak learners) was optimized. The optimized number of trees was 500.
• Sensitivity analysis: to analyze the sensitivity of the input variables, a MATLAB®-based algorithm of the Sobol and FAST techniques was used [22]. The two methods were used to cross-check their index-based ranking of input variables. In both sensitivity analysis methods, 200 samples were drawn for each input variable listed in Table 3 from a normal distribution with a standard deviation of 0.5% of the measured value. Both methods calculated first-order indices of the input variables. The most sensitive input variables were selected and their theoretical basis was analyzed.
• PCE-based uncertainty analysis: the ensemble model was incorporated in the PCE framework. Hermite polynomials at level 6 with the initial 20 terms were used in the PCE method. The framework was used to generate random variables for each of the input variables. The random variables for the inputs of the validation samples were fed to the ensemble (boosted) model to obtain the predictive distribution of the process output.
• Reliability analysis: reliability of the process was determined off-line through measured data. For on-line reliability analysis of the process, a just-in-time method, i.e., k-d tree, was used. The control limits were set at 1.5% of the nearest neighbor sample value. A threshold was set for the reliability of the predictive distribution of CO mass flow rate; the process was termed reliable if 60% of the predictive distribution fell within the control limits, and unreliable otherwise.
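The reliability check in the last step can be sketched as below. The 1.5% band and the 60% threshold come from the description above; treating the band as symmetric around the reference value (measured value off-line, nearest neighbor value on-line) is an assumption of this sketch, as are the function name and the toy Gaussian predictive distributions.

```python
import random


def reliability(pred_samples, reference, band=0.015, threshold=0.60):
    """Fraction of the predictive distribution inside the control limits.

    Assumption: the limits are reference * (1 - band) and reference * (1 + band).
    Returns (fraction_inside, is_reliable).
    """
    lower = reference * (1.0 - band)
    upper = reference * (1.0 + band)
    inside = sum(1 for s in pred_samples if lower <= s <= upper)
    fraction = inside / len(pred_samples)
    return fraction, fraction >= threshold


# Toy predictive distributions (hypothetical) around a reference value of 100.
rng = random.Random(0)
narrow = [rng.gauss(100.0, 1.0) for _ in range(5000)]   # about 1% spread
wide = [rng.gauss(100.0, 3.0) for _ in range(5000)]     # about 3% spread
print("narrow distribution:", reliability(narrow, 100.0))
print("wide distribution:  ", reliability(wide, 100.0))
```

A wider predictive distribution places less probability mass inside the limits, which is how larger input uncertainty shows up as an unreliable operating condition.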

Results and Discussion
Considering that CO is one of the desired product components of the coal gasification process, we took the mass flow rate of CO as representative of the conversion efficiency. A similar assumption has been reported in the literature, where the mass fraction of CO in the product stream is considered representative of the carbon element efficiency of the coal gasification process [31]. Of the data generated from the MATLAB®-Excel®-Aspen® interfacing, 90% was used for model development and 10% for validation. Deterministic prediction of the mass flow rate of CO through the soft-sensor is shown in Figures 4 and 5. The correlation coefficient between the target (validation samples) and the predicted values was 0.95.
Sensitivity analysis was performed using the Sobol and FAST methods. In the Sobol method, the top five most sensitive variables were oxygen flow rate, coal flow rate, steam flow rate, the pressure of the decomposer unit (R3), and pyrolysis temperature, as shown in Figure 6a. In the FAST-based sensitivity analysis, the top five sensitive variables were oxygen flow rate, coal flow rate, steam flow rate, steam temperature, and pyrolysis temperature, as shown in Figure 6b. The top three and the fifth ranked variables were common to both methods; the only difference in ranking was the variable ranked fourth. In addition to the FAST and Sobol-based sensitivity analysis, the ensemble model was also used to perform sensitivity analysis of the top six sensitive variables indicated in Figure 6. For a 5% variation in the steady state value of these variables, their effect on the mass flow rate of CO is shown in Figure 7.
The most sensitive variable was the O₂ flow rate. According to Huang et al., the CO production increases with an increase in air flow rate from eight to 12 normal cubic meters per hour. After nine normal cubic meters per hour, the CO formation decreases with an increase in oxygen flow rate; the decrease in the mass flow rate of CO is due to the formation of CO₂ in the presence of excess air [13].
The second most sensitive variable was the flow rate of coal. The production of CO is directly proportional to the flow rate of coal [13]; an increase in the coal flow rate results in a reduction of the coal bed temperature, a decrease in CO₂ formation, and an increase in CO formation.
The third most sensitive variable was the steam flow rate. With an increase in the flow rate of steam, the production of CO increases for steam to coal ratios from 0.2 to 0.6. Beyond this range, the CO mass flow rate decreases because the CO produced in the gasifier starts to react with the steam to produce CO₂ [32].
The fourth most sensitive variable in the Sobol sensitivity analysis was the R3 pressure; it was observed that the R3 pressure and the formation of CO were directly proportional because of the reaction kinetics, where the reaction moves in the forward direction with an increase in pressure [33]. The fourth most sensitive variable in the FAST analysis was steam temperature. An increase in steam temperature results in an increase in CO formation [13].
The fifth most sensitive variable, in the Sobol as well as the FAST-based sensitivity analysis, was the pyrolysis reactor (R2) temperature. In R2, CO coming from R1 is already present, so with a further increase in temperature from 1050 °C to 1060 °C, the CO converts to CO₂ [33].
The results of the non-intrusive PCE-based uncertainty analysis were used to quantify the effect of uncertainty in the input variables on the CO mass flow rate. For 0.5% uncertainty in the measured values of all fourteen input variables, an average of 2% uncertainty was observed in the PCE-based prediction of the CO mass flow rate, as shown in Figure 8. In order to compare the prediction accuracy of the PCE-based method with that of the ensemble model, the mean values of the predictive distributions were determined. The correlation between the actual values of the CO mass flow rate and the mean values of the predictive distributions was 0.90.
The results of the reliability analysis are shown in Figure 9a-d. In the off-line reliability analysis, 4% of the predictive distributions did not meet the reliability threshold, as shown in Figure 9c. The control limits derived from the k-d tree had a correlation coefficient of 0.90 with the control limits based on measured data (Figure 9d). When the percentage of the predictive distribution within the control limits falls below the threshold, the proposed method flags a control problem; it is inferred that the uncertainty in one or more input variables is larger than the allowable limit.

Conclusions
In this work, a novel framework comprising data-based prediction, sensitivity analysis, uncertainty analysis, and reliability analysis of the conversion efficiency of the entrained flow gasification process was developed. MATLAB®-Excel®-Aspen® interfacing was used to generate data for development of the modeling framework. The mass flow rate of carbon monoxide in the product stream was taken as representative of the conversion efficiency. The correlation coefficient between the predicted and the measured values of the CO mass flow rate was 0.95. The most sensitive variables were oxygen flow rate, coal flow rate, steam flow rate, the pressure of the decomposer unit, steam temperature, and pyrolysis temperature. For 0.5% uncertainty in the process variables, an uncertainty of 2% was noted in the conversion efficiency. Data-based control limits were used to evaluate the reliability of the predicted probability distribution.
The MATLAB®-Excel®-Aspen® interfacing was used to generate process data because real plant data were unavailable for this study. However, the ultimate aim of the proposed framework is on-line data-based sensing, sensitivity analysis, uncertainty analysis, and reliability analysis of a real-time coal gasification process. Hence, in future work, the Aspen-based process flowsheet can be replaced by an actual coal gasification plant.