Estimation of Coal’s Sorption Parameters Using Artificial Neural Networks

This article presents the results of research into the application of an artificial neural network (ANN) to determine coal’s sorption parameters, such as the maximal sorption capacity and the effective diffusion coefficient. Determining these parameters is currently time-consuming and requires specialized and expensive equipment. The work was conducted with feed-forward back-propagation networks (FNNs); it was aimed at estimating the values of the aforementioned parameters from information obtained through technical and densitometric analyses, as well as from knowledge of the petrographic composition of the examined coal samples. Analyses showed close agreement between the values of the sorption parameters obtained with the regression neural models and the values determined with the gravimetric method using a sorption analyzer (prediction error for the best match was 6.1% and 0.2% for the effective diffusion coefficient and maximal sorption capacity, respectively). The established determination coefficients (0.982, 0.999) and the values of the standard deviation ratios (below 0.1 in each case) confirmed the very high predictive capacity of the adopted neural models. The research showed the great potential of the proposed method to describe the sorption properties of coal as a material that is a natural sorbent for methane and carbon dioxide.


Introduction
Coal is a natural sorbent for gases such as carbon dioxide and methane. These gases, present in large amounts in coal mines, are associated with the occurrence of natural hazards. Methane is particularly noteworthy because of its presence in the strata and its release as a result of mining and geological processes. This hazard is related to the geology of the deposit, i.e., to the type of coal and the presence of cracks or fault zones [1][2][3][4]. The basic tool for evaluating the likelihood of the aforementioned threats in hard-coal seams is the analysis of coal's sorption parameters. To determine the properties of a coal-gas system under laboratory conditions, two parameters are primarily used: sorption capacity a and effective diffusion coefficient De. They complement in situ studies as far as the identification of methane and outburst hazards in mines is concerned. Sorption capacity describes the ability of coal beds to accumulate gas, and the effective diffusion coefficient determines the speed of gas release from coal beds. Among the methods used for gas-sorption measurements, gravimetric methods are important [5]. In these methods, the amount of sorbed gas is determined directly by measuring the increase in the mass of the investigated sorbent after the sorbate is introduced into the system at constant pressure and temperature. These methods have a number of advantages [6,7];

Materials and Methods
Coal used in the research was obtained from 24 coal beds in the Upper Silesian Coal Basin (Poland). Coal samples were comminuted after being transported to a laboratory. Subsequently, with the dry-sieving method, 4 grain fractions were distinguished for specific purposes. The 10-20 mm grain fraction was used for densitometric analyses, and the 0.50-1.00 mm grain fraction for the microscope petrographic analyses. The 0.125-0.160 mm grain fraction was for sorption measurements, and the grain fraction below 0.20 mm was used in technical analyses [23].
Granular samples (thick polished sections) for petrographic analyses were prepared in line with the guidelines specified in ISO standard 7404-2 (methods of preparing coal samples) [24]. Analyses were performed using the AXIOPLAN polarizing microscope (Zeiss, Oberkochen, Germany). The thick polished sections were analyzed in reflected white light, in oil immersion, at 500× magnification. Tests were performed at 1500 evenly distributed measurement points across the surface of the analyzed sample, in line with the guidelines specified in ISO standard 7404-3 (method of determining maceral group composition) [25]. Microscopic research was accompanied by measurements of the average reflectance (i.e., the ability to reflect light) of vitrinite (collotelinite), denoted as R0. Measurements were performed in line with the procedure described in ISO standard 7404-5 (method of microscopically determining the reflectance of vitrinite) [26].
The basic characteristics of the investigated coal samples were obtained from the results of the technical analyses. The gravimetric method was applied, following the procedures specified in ISO standards 562 and 1171 (determination of volatile matter and ash, respectively) for testing solid fuels and hard coal [27,28]. Technical-analysis parameters are expressed as the percentage loss of a sample's mass in relation to its original mass.
Densitometric analyses were performed with the helium- and quasi-liquid-pycnometry methods using the AccuPyc 1340 and GeoPyc 1360 analyzers, respectively, provided by Micromeritics (Atlanta, GA, USA). Measurements of the real and apparent density of the coal samples were used as a basis for determining the porosity of these samples.
Parameters that were determined as a result of the above-mentioned analyses constituted a source of basic information on the tested coal samples. Such analyses are routinely performed in coal laboratories and provide information such as the petrographic composition of the samples, moisture and ash content, and internal structure (porosity). These analyses are performed in a relatively short amount of time, and their execution does not require high costs. This is important from the point of view of the methodology proposed in this paper.
Sorption measurements were also conducted, and two parameters were established for each sample: the effective diffusion coefficient De from the Timofiejew equation [29], where R is the equivalent radius, d1 is the minimal grain diameter (lower sieve size, cm), d2 is the maximal grain diameter (upper sieve size, cm), and t1/2 is the sorption half-time (s); and the maximal sorption capacity am based on the sorption isotherm [30], where a is the amount of sorbed methane under a given equilibrium pressure p (m³ CH4/Mg), am is the maximal sorption capacity when p→∞ (m³ CH4/Mg), b is a constant characteristic of the coal-methane system (MPa⁻¹), and p is the free-gas pressure (in volume stage, MPa).
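The two relations named above are not reproduced in this text. As a hedged reconstruction only, assuming the forms in which the Timofiejew half-time relation, the equivalent radius of a sieve fraction, and the Langmuir-type isotherm are commonly written (the numerical factor 0.308 and the equivalent-radius expression are assumptions taken from that common usage, not from this text), they would read:

```latex
D_e = \frac{0.308\,R^{2}}{\pi^{2}\,t_{1/2}},
\qquad
R = \frac{1}{2}\sqrt[3]{\frac{2\,d_1^{2}\,d_2^{2}}{d_1 + d_2}},
\qquad
a = \frac{a_m\,b\,p}{1 + b\,p}
```

Two quick consistency checks support these forms: for d_1 = d_2 = d the equivalent radius reduces to d/2, and for p→∞ the isotherm gives a→a_m, in agreement with the definition of the maximal sorption capacity. With d_1 and d_2 in cm and t_{1/2} in s, D_e is obtained in cm²/s.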
Measurements were conducted using the gravimetric method, with the IGA-001 sorption analyzer (Intelligent Gravimetric Analyzer) manufactured by Hiden Isochema (Warrington, UK). The measurements involved tracking changes in sample mass caused by gas sorption/desorption as a function of time [31]. The procedure was performed under constant sorption pressure within the range of 0-1 MPa and under isothermal conditions, in the temperature range from 25 °C (298 K) to 55 °C (328 K) [23].
In our research, the following parameters were used [32,33].
The research additionally considered the temperature at which the sorption measurements were conducted (T, °C) and the depth of the coal-bed location (H, m). Parameters used in the analyses were normalized in the range [0, 1].
As a result, each analyzed coal sample was described by means of a 13-dimensional feature vector, which was applied at the input of a multilayer-perceptron (MLP) neural network with unidirectional information flow (Figure 1). That network was used for predicting the values of maximal sorption capacity am and effective diffusion coefficient De. We had 24 coal samples at our disposal, for which sorption measurements were conducted at 4 temperature levels: 25, 35, 45, and 55 °C. In this way, we obtained 96 input vectors of the network with the corresponding output values, which were subsequently used to train, validate, and test the neural models applied in the research. At this point, the time-consuming nature of sorption measurements, determined primarily by the size of the grains of the investigated coal material, should be noted. For the 0.125-0.160 mm grain fraction, the time needed to establish a complete sorption isotherm (including the time necessary to outgas the sample) at a given temperature averaged 72 h, which translates into some 10 months of work for all conducted sorption measurements, assuming the measurements were carried out without pause. The schematic diagram of the performed analyses is presented in Figure 2.
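The normalization of the input parameters to the range [0, 1] mentioned above can be illustrated with a minimal sketch. It assumes plain min-max scaling applied per feature, which the text does not state explicitly; the function name, the handling of constant columns, and the sample values (temperature in °C, depth in m) are hypothetical.

```python
def min_max_normalize(rows):
    """Scale every feature (column) of `rows` linearly into [0, 1].

    `rows` is a list of equal-length feature vectors. A constant column
    is mapped to 0.0 to avoid division by zero (an assumption of this sketch).
    """
    n_features = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_features)]
    highs = [max(r[j] for r in rows) for j in range(n_features)]
    scaled = []
    for r in rows:
        scaled.append([
            (r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_features)
        ])
    return scaled


# Hypothetical two-feature example: measurement temperature and bed depth.
samples = [[25.0, 600.0], [35.0, 750.0], [45.0, 900.0], [55.0, 1050.0]]
normalized = min_max_normalize(samples)
```

In the study itself, the same kind of scaling would be applied to all 13 features of the 96 input vectors (24 samples at 4 temperatures).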

Prediction Model
Choosing the right number of hidden layers plays an essential role in a neural network solving a given problem [34]. The most common solutions use one or two hidden layers of neurons. On the basis of previous experience [11,15] and the analysis of relevant scientific sources, we concluded that the optimal neural network for this task would be an MLP network with one hidden layer of neurons [35,36]. In this case, it is also important to properly select the number of neurons in the hidden layer and the neuron activation function.
From the available 96-element dataset, 68 elements were randomly selected for training the neural network. The remaining elements were arranged in 2 balanced datasets, validation and test (each consisting of 14 elements). The selection process was repeated 100 times, each time randomly. The validation set was used to evaluate the functioning of the network and served to detect symptoms of the network's overfitting (overlearning). The test set was used for the final evaluation of the neural model's functioning. Analyses were conducted using MATLAB v. 8.5 software (MathWorks, Natick, MA, USA).
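The repeated random partition described above can be sketched as follows. This is an illustration in Python rather than the MATLAB procedure actually used; the function name and the choice of seeds are assumptions made only to keep the example reproducible.

```python
import random


def split_indices(n_total=96, n_train=68, n_val=14, seed=None):
    """Randomly partition range(n_total) into train/validation/test index lists."""
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remaining 14 elements
    return train, val, test


# 100 independent random draws, as in the text; the seeds are arbitrary.
splits = [split_indices(seed=k) for k in range(100)]
```

Each of the 100 draws yields disjoint sets of 68, 14, and 14 elements, so every reported average is taken over 100 different train/validation/test partitions of the same 96 vectors.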
At the input of the network, a 13-dimensional feature vector was used: parameters from technical (Parameters 1-3), petrographic (Parameters 4-8), and densitometric (Parameters 9-11) analyses, measurement temperature (Parameter 12), and the depth of the coal-bed location (Parameter 13). The network's output was a single linear neuron, which allowed the network to produce an unbounded range of output values. The tests began with 4 neurons in the hidden layer. That number was established as the geometric mean of the number of inputs and outputs of the network. This rule made it possible to approximately determine the minimal number of neurons in the hidden layer.
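The geometric-mean rule for the starting size of the hidden layer amounts to a one-line calculation. The sketch below assumes the result is rounded up to a whole neuron; the text gives only the final value of 4, so the rounding direction and the function name are assumptions.

```python
import math


def initial_hidden_size(n_inputs, n_outputs):
    """Geometric mean of input and output counts, rounded up to a whole neuron."""
    return math.ceil(math.sqrt(n_inputs * n_outputs))


# 13 inputs and 1 output: sqrt(13) is about 3.61.
n_hidden = initial_hidden_size(13, 1)
```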
In order to determine the optimal size of the hidden layer of the model predicting the first of the considered sorption parameters, i.e., effective diffusion coefficient De, analyses were performed for various numbers of neurons in the hidden layer. Analyses also considered the impact of the selected neuron-activation function on network performance. Two activation functions widely used in this type of neural network were tested: logistic and hyperbolic tangent. To train the network, the Levenberg-Marquardt back-propagation algorithm was applied [37]. Results, expressed as average values obtained for 100 randomly selected learning sets, are shown in Table 1.
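To make the compared architectures concrete, the following sketch shows the forward pass of an MLP with one hidden layer and a single linear output neuron, using either the logistic or the hyperbolic tangent activation. The weights are random placeholders, and Levenberg-Marquardt training is not shown; all function and variable names are hypothetical.

```python
import math
import random


def logistic(z):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out, activation):
    """One hidden layer with `activation`, followed by a single linear output neuron."""
    hidden = [
        activation(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def random_network(n_inputs, n_hidden, rng):
    """Untrained placeholder weights, drawn uniformly from [-1, 1]."""
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = rng.uniform(-1, 1)
    return w_hidden, b_hidden, w_out, b_out


rng = random.Random(0)
x = [rng.random() for _ in range(13)]       # one normalized 13-feature input
net_logistic = random_network(13, 6, rng)   # layout of an MLP 13-6-1
net_tanh = random_network(13, 7, rng)       # layout of an MLP 13-7-1
y_logistic = mlp_forward(x, *net_logistic, activation=logistic)
y_tanh = mlp_forward(x, *net_tanh, activation=math.tanh)
```

Because the output neuron is linear, the prediction is not confined to the bounded range of the hidden activations.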
The adopted matching criterion for the proposed model is the average percentage prediction error returned by the neural model, which is computed from the prediction value and the observed (measured) value of fw at each test-set element Xi, with n denoting the number of elements in the test set. Analysis of Table 1 shows that the best results for estimating the value of the effective diffusion coefficient (the lowest value of the matching criterion) were obtained using a hidden layer with 6 logistic neurons (MLP 13-6-1).
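The formula for the matching criterion is not reproduced in this text. A minimal sketch, assuming it is the usual mean of absolute relative errors over the test set expressed in percent (the function name and the numerical values are made up for illustration):

```python
def mean_percentage_error(predicted, observed):
    """Mean absolute relative error over the test set, in percent."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) / abs(o) for p, o in zip(predicted, observed))
    return 100.0 * total / len(observed)


# Made-up values, for illustration only.
observed = [10.0, 20.0, 40.0]
predicted = [11.0, 19.0, 40.0]
error_percent = mean_percentage_error(predicted, observed)
```

With this criterion, a lower value means a closer match between the network output and the measured values, which is how the tables of results are read in the text.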
Corresponding research was conducted for the neural network predicting the value of maximal sorption capacity am. Results, expressed as the average values for 100 randomly selected learning sets, are presented in Table 2. For estimating the value of the maximal sorption capacity (Table 2), the best results were delivered by a network with 7 hyperbolic tangent neurons in the hidden layer (MLP 13-7-1).

Results and Discussion
On the basis of the analyses described in the previous section, we chose a neural model with a hidden layer of six logistic neurons (MLP 13-6-1) to predict the value of the effective diffusion coefficient. The efficiency of the neural model was examined using a test set of 14 examples that were not provided during the neural network's training process. For 100 random samplings of the learning set, the average prediction error for the parameter values returned by the network, as compared with the actual values determined using IGA-001 (Formula (1)), was 22.86% (Table 1). For the best match, the difference between the values returned by the neural model and the observed values of the investigated parameter was 6.13%. Values of the effective diffusion coefficient for that match are shown in Table 3. For the best match, correlations between the values provided by the neural network and the observed values were determined. The procedure was performed for the training, validation, and test sets (Figure 3).
As a result of the performed analyses, strong relationships were identified between the values of the diffusion coefficient predicted by the network and the measured values for the training, validation, and test sets (determination coefficients of 0.98-0.99). These results indicated the strong predictive ability of the investigated neural model. When assessing a regression model, one should also consider the ratio of the standard deviation of the prediction errors to the standard deviation of the dependent variable. For a very good regression model, this ratio assumes values below 0.1. The ratios were determined independently for each of the three datasets, and their values were 0.051, 0.054, and 0.082 for the training, validation, and test sets, respectively. The low values of the ratios and the high values of the determination coefficients confirmed the very good predictive capability of the neural network described in the research.
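The two quality measures used above can be computed as in the following sketch. It assumes that the determination coefficient is the squared Pearson correlation between predicted and observed values, which the text does not specify; the function names and the data are hypothetical.

```python
import statistics


def sd_ratio(predicted, observed):
    """Standard deviation of prediction errors divided by that of the observed values."""
    errors = [p - o for p, o in zip(predicted, observed)]
    return statistics.stdev(errors) / statistics.stdev(observed)


def determination_coefficient(predicted, observed):
    """Squared Pearson correlation between predicted and observed values."""
    mean_p = statistics.fmean(predicted)
    mean_o = statistics.fmean(observed)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)


# Made-up values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.9, 5.1]
ratio = sd_ratio(predicted, observed)
r_squared = determination_coefficient(predicted, observed)
```

A ratio below 0.1 together with a determination coefficient close to 1 corresponds to the situation described in the text as a very good regression model.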

In the case of predicting the other discussed parameter, i.e., maximal sorption capacity, a neural model with a hidden layer containing 7 hyperbolic tangent neurons (MLP 13-7-1) was applied. For 100 randomly selected learning sets, the average prediction error for the parameter values returned by the neural network, as compared with the actual values established using IGA-001 (Formula (2)), was 0.89%. For the best match, the difference between the values returned by the neural model and the observed values of the investigated parameter was 0.22%. Values of maximal sorption capacity for that match are presented in Table 4. For the best match, correlations between the values returned by the neural network and the observed values were determined. The procedure was performed for the training, validation, and test sets (Figure 4).
Table 4. Maximal sorption capacity measured using IGA-001 as compared with values returned by the neural network for the best match.
As a result of these analyses, strong relationships were obtained between the maximal-sorption-capacity values determined with the neural network and the measured values for the training, validation, and test sets (determination coefficients approximating 1). This indicated the strong predictive ability of the investigated neural model.
Just as in predicting values of the effective diffusion coefficient, the ratios of the standard deviation of the prediction errors to the standard deviation of the dependent variable were determined for each of the three datasets in order to evaluate the regression model. The low values of the determined ratios (0.011, 0.032, and 0.017 for the training, validation, and test sets, respectively), together with the high values of the determination coefficients, showed that the applied neural network was characterized by very strong predictive capability.

Conclusions
This article described research on using artificial neural networks to estimate the maximal sorption capacity and the effective diffusion coefficient of methane in coal samples. For the effective diffusion coefficient, the best estimation results were obtained with an MLP network with six logistic neurons in the hidden layer. With regard to maximal sorption capacity, the best results were returned by an MLP network with seven hyperbolic tangent neurons in the hidden layer. In each of the investigated cases, the values estimated by the neural network and the results obtained using the IGA-001 agreed to a substantial extent. With the adopted criterion, in the case of the best match, the relative prediction errors for the analyzed parameters were 6.1% and 0.2% for the effective diffusion coefficient and maximal sorption capacity, respectively. The high determination coefficients (0.982-0.999) obtained in both cases, along with the low values of the standard deviation ratios (below 0.1 in each case), confirmed that the adopted neural models possessed very good predictive capability. This points to substantial practical-application potential for the proposed methodology. Given the time-consuming nature of determining the analyzed parameters with a gravimetric device (conditioned, above all, by the size of the grains of the coal material), the proposed methodology could be an effective alternative to the traditional measurement method. It could thus be possible to evaluate gas parameters on the basis of other measurements, which would eliminate the necessity of multi-day measurements involving a costly gravimetric device.
This would make it possible to use the proposed method to describe the sorption properties of coal as a natural sorbent for methane, which could be used for the ongoing forecasting and evaluation of gas and rock outburst threats in hard-coal mines.