Study on the linkages between microstructure and permeability of porous media using pore network and BP neural network

Abstract
In applying porous media air bearings (PMABs), designing the pore microstructure of porous media to obtain the desired permeability is challenging. The key tasks in this design are to map the pore microstructure characteristics to permeability and to relate these characteristics to the manufacturing process. For this purpose, a framework is proposed to characterize pore microstructure with morphology descriptors and to predict permeability. 3D digital images of porous media are obtained using X-ray micro-computed tomography and various image construction techniques. The complex pore microstructure of porous media is represented with a pore network, and permeability is calculated based on this network. Sixteen pore microstructure morphology descriptors are initially calculated to characterize pore microstructure. A back-propagation neural network (BPNN) is built to learn the correlation between morphology descriptors and permeability. The Pearson correlation coefficient (PCC) and feature importance scores of the morphology descriptors are obtained from the dataset and the trained BPNN. The results demonstrate that the prediction performance of the BPNN is excellent. Six morphology descriptors (porosity, coordination number, average pore diameter, average throat diameter, average pore throat ratio, and average throat length) are retained to characterize pore microstructure. Finally, two types of pore microstructure are designed with the help of the knowledge obtained in this research.

Introduction
Porous media air bearings (PMABs) are often used in Original Equipment Manufacturer (OEM) precision machine applications such as metrology equipment, semiconductor wafer manufacturing machines, and precision machine tools because of their near-zero friction and wear. PMABs achieve good stiffness and load capacity when the permeability of the porous media inserted in them ranges from $3.3 \times 10^{-15}\ \mathrm{m^2}$ to $8.4 \times 10^{-14}\ \mathrm{m^2}$ [1,2]. It is well known that permeability is solely determined by the pore microstructure [3]. Many studies on PMABs have been reported [4][5][6][7][8][9]. However, these works mainly focus on macro-performance such as load capacity and stiffness, and they cannot provide sufficient knowledge for designing the pore microstructure of porous media. Therefore, it is worthwhile to determine the quantitative relationships between pore microstructure and permeability of porous media at the pore scale.
There are usually four methods for calculating permeability from the pore microstructure of porous media: pore-scale numerical simulation, pore network modeling (PNM), empirical formulas, and convolutional neural networks (CNN). The pore-scale numerical simulation approaches mainly include the Navier-Stokes equations (NSEs) and the Lattice Boltzmann equations (LBEs). NSEs or LBEs are solved on the pore microstructure geometry to obtain the flow rate [10][11][12][13][14]. The principle of PNM is to extract the pore network of the pore microstructure and then solve the transport equations on the pore network to determine the flow rate [15][16][17][18]. Darcy's law [19] is then employed to calculate permeability from the flow rate. The Kozeny-Carman equation is perhaps the most prominent empirical formula [20]; it calculates permeability from an empirical coefficient, the specific surface, and the porosity. CNN has achieved significant success in image classification [21,22]. Inspired by this success, CNN has been used to predict the permeability of porous media from images [23][24][25]. CNN has also been used to predict other material properties directly [26,27].
Generally, the pore-scale numerical simulation method provides a way to calculate the permeability value of porous media accurately. The solution of LBEs and NSEs is obtained using computational fluid dynamics (CFD) techniques. However, CFD requires a high computational cost, which limits the application of this method. The PNM can characterize the pore microstructure with morphology parameters and calculate the permeability, significantly reducing the computational cost with acceptable accuracy. CNN provides a novel way to predict permeability from pore microstructure images.
It should be noted that characterization is a comprehensive concept. All the methods mentioned above can characterize the pore microstructure in some way. However, this study aims to establish the relationship between pore microstructure characteristics and permeability and, furthermore, to link these characteristics with the porous media manufacturing process. The morphology descriptors, shown in table 1, are intuitive for characterizing pore microstructure with the help of X-ray micro-computed tomography technology. Notably, compared with the two-point correlation function, mapping these morphology descriptors to process steps is simple. For example, the pore diameter can be controlled by changing the size of the glue particles.
With the development of three-dimensional imaging technology, the pore microstructure of porous media can be seen on a large scale [33]. X-ray micro-computed tomography technology can obtain $1500^3$ voxels of porous media with a spatial resolution of 1 μm. An image contains rich information about the pore microstructure of porous media. The advances in imaging techniques for analyzing complex pore microstructure have revolutionized our ability to characterize various porous media systems [34,35]. The X-ray data of the pore microstructure can be converted to a three-dimensional matrix for further analysis.
Two problems need to be solved: first, how to characterize the pore microstructure of porous media; second, which mathematical model properly relates these characteristics to permeability. This study presents a research framework consisting of PNM and BPNN to characterize porous media effectively and predict permeability. The PNM is used to represent pore microstructure with morphology parameters. The BP neural network (BPNN) is built to map the morphology parameters to permeability. The Pearson correlation coefficient (PCC) is calculated to determine the linear relationship between pore microstructure and permeability. Feature importance scores are calculated to show which morphology parameters have a significant effect on predicting permeability.

Materials and methods
As mentioned above, two problems need to be solved: characterization and mapping. The pore network simplifies the complex geometry of the pore space into regular nodes and channels, so that porous media can be characterized with morphological parameters. BPNN makes it possible to find relations between variables from large amounts of data. The framework presented in this paper consists of the four parts shown in figure 1.

Building the dataset of porous media samples
Four methods, shown in figure 1(a), are used to obtain the dataset, including natural and synthetic porous media samples.

Extracting the pore networks of porous media samples
The watershed algorithm is used to obtain the pore network from images of porous media. The morphology parameters of the pore microstructure are calculated based on the pore network shown in figure 1(b).

Training the BPNN model
A multi-layer perceptron model is built. The model inputs are the morphology parameters, and the output is labeled as permeability (X17), as shown in figure 1(c).

Explaining the trained model
The correlation between the pore microstructure morphology parameters and permeability is analyzed. The performance of the BPNN is evaluated. The feature importance scores are calculated based on the trained model to investigate which morphology parameters play an important role in predicting permeability. The degree of linear correlation between morphology parameters and permeability is measured by the Pearson correlation coefficient (PCC), which provides a reference for choosing the morphology parameters for a desired permeability.

Dataset
A total of 441 natural porous graphite images are obtained using 3D X-ray imaging. However, these 441 samples are not sufficient to cover the morphological variability that can be seen in graphite, so they cannot provide reliable training for the BPNN. Therefore, three methods are employed to increase the number of images.

Boolean method
In this model, white noise is built with a size of $N^3$. A Gaussian blur filter is applied to the white noise, and the anisotropic Euclidean Distance Transform (EDT) is calculated. A peak-point method is applied to the EDT images to find the sphere centers, and the spheres are allowed to overlap. The density of sphere centers, or the sphere volume fraction, is used to generate a parametric model. The volume fractions of the spheres and of the embedding medium are denoted by $1-q$ and $q$, respectively. Two kinds of models are considered, in which the spheres are either 'solid' or 'pore'. In model (A), flow is entirely outside the spheres; in model (B), it is only inside the spheres.
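The two Boolean models can be illustrated with a minimal Python sketch. It is not the implementation used in this study: sphere centers are drawn uniformly at random instead of being located by the blur, EDT, and peak-point steps, and the function name `boolean_spheres`, the grid size, the diameter, and the number of spheres are all hypothetical choices.

```python
import numpy as np

def boolean_spheres(n=32, diameter=8, n_spheres=40, seed=0):
    """Return a boolean array that is True inside the union of overlapping spheres."""
    rng = np.random.default_rng(seed)
    # Assumption: uniformly random centers replace the EDT peak-point search.
    centers = rng.uniform(0, n, size=(n_spheres, 3))
    z, y, x = np.indices((n, n, n))
    inside = np.zeros((n, n, n), dtype=bool)
    r2 = (diameter / 2.0) ** 2
    for cz, cy, cx in centers:
        # Periodic distances, so that spheres wrap around the box faces.
        dz = np.minimum(np.abs(z - cz), n - np.abs(z - cz))
        dy = np.minimum(np.abs(y - cy), n - np.abs(y - cy))
        dx = np.minimum(np.abs(x - cx), n - np.abs(x - cx))
        inside |= dz**2 + dy**2 + dx**2 <= r2
    return inside

spheres = boolean_spheres()
pore_a = ~spheres   # model (A): spheres are solid, flow is outside the spheres
pore_b = spheres    # model (B): spheres are pore, flow is inside the spheres
q = pore_a.mean()   # volume fraction of the embedding medium
print(f"q = {q:.3f}, porosity (A) = {pore_a.mean():.3f}, porosity (B) = {pore_b.mean():.3f}")
```

Because the two models are complements of each other, their porosities sum to one, which gives a quick consistency check on any generated sample.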
The specific surface area $\gamma$ is calculated from the covariance function $C(h)$, which has a different formula for models (A) and (B); $D$ denotes the sphere diameter.

Gaussian random fields
Gaussian random fields are created according to Lang and Potthoff [36]. Let $G(x)$, $x \in \mathbb{R}^3$, be the Gaussian random field to be simulated, with a mean of zero and a covariance function $C(x, y)$. The covariance function can be written as:

$$C(x, y) = \int_{\mathbb{R}^3} e^{-2\pi i \langle p,\, x - y \rangle}\, \gamma(p)\, \mathrm{d}p$$

where $\gamma(p)$ is the spectral density of the Gaussian random field, and $\langle \cdot, \cdot \rangle$ is the inner product. A pore microstructure with length scale parameter $L$ and resolution $N^3$ is to be created. Let $\delta = L/N$ and $\alpha = 1.25$; FFT and FFT$^{-1}$ denote the forward and inverse three-dimensional Fourier transforms, respectively. The simulation proceeds as follows. An array $W$ is generated whose components are independent and follow a Gaussian distribution with a mean of zero and a standard deviation that scales with $\delta$ (white noise). The Fourier-space grid is defined for $p_1$, and likewise for $p_2$ and $p_3$. The spectral density $\gamma(p)$ is evaluated on this grid, and the field $U$ is obtained by applying FFT$^{-1}$ to the FFT of $W$ weighted according to $\gamma(p)$.
After that, the scalar fields are converted to two-phase media by thresholding. The threshold is chosen so that a specified porosity is obtained. Generating each porous media sample takes approximately 1 s in Python.
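The spectral construction and the thresholding step can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the code used in this study: the Gaussian-shaped spectral density, the unit-variance white noise, the square-root weighting, and the names `gaussian_random_field` and `threshold_to_porosity` are all assumptions made for the example.

```python
import numpy as np

def gaussian_random_field(n=32, length_scale=4.0, seed=0):
    """Filter white noise in Fourier space to obtain a correlated scalar field."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((n, n, n))      # white noise W (unit variance assumed)
    freq = np.fft.fftfreq(n)                    # grid frequencies in cycles per voxel
    p1, p2, p3 = np.meshgrid(freq, freq, freq, indexing="ij")
    p_sq = p1**2 + p2**2 + p3**2
    # Assumed Gaussian-shaped spectral density; the gamma(p) of the paper is not reproduced.
    gamma = np.exp(-2.0 * (np.pi * length_scale) ** 2 * p_sq)
    # Weight the spectrum of W by sqrt(gamma) and transform back to real space.
    return np.fft.ifftn(np.sqrt(gamma) * np.fft.fftn(white)).real

def threshold_to_porosity(field, porosity):
    """Return a boolean pore mask whose pore fraction matches the requested porosity."""
    level = np.quantile(field, porosity)
    return field < level                        # True marks pore voxels

field = gaussian_random_field()
pores = threshold_to_porosity(field, 0.3)
print(f"porosity = {pores.mean():.3f}")
```

Choosing the threshold as a quantile of the field values fixes the porosity directly, so no iterative search for the threshold is needed.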

Quartet structure generation set (QSGS)
QSGS is a method based on stochastic clustering theory to generate stochastic porous media [37]. Four major factors primarily regulate the formation of porous media. The implementation strategy is as follows:
(1) An array of zeros $M$ with a size of $N^3$ is generated.
(2) Solid seeds are randomly distributed in the array according to a distribution probability $C_d$, which is lower than the target porosity of the porous media. This is accomplished by assigning a random number to each element and then selecting the elements whose number is less than $C_d$ as solid seeds. The value of the seed elements in the array $M$ becomes one.
(3) The directional growth probability $P_i$ controls how these seeds expand into their neighboring elements. Each neighboring element of a solid seed is assigned a random number, which is then compared with $P_i$. Only the neighboring elements whose number is less than $P_i$ become part of the growing solid.
(4) Steps (2) and (3) are repeated until the target porosity is achieved in the array $M$.
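The growth procedure can be sketched as follows. This is a simplified illustration rather than the generator used in this study: seeding is done once and only the growth step is repeated, growth is restricted to the six face neighbors with a single probability `p_grow`, the boundaries are periodic, and all names and default values are hypothetical.

```python
import numpy as np

def qsgs(n=32, porosity=0.4, c_d=0.01, p_grow=0.1, seed=0, max_iter=10_000):
    """Grow a solid phase from random seeds until the pore fraction drops to `porosity`."""
    rng = np.random.default_rng(seed)
    solid = rng.random((n, n, n)) < c_d          # step (2): solid seeds with probability C_d
    target_solid = 1.0 - porosity
    shifts = [(axis, step) for axis in range(3) for step in (-1, 1)]
    for _ in range(max_iter):
        if solid.mean() >= target_solid:         # step (4): stop at the target porosity
            break
        grown = solid.copy()
        for axis, step in shifts:                # step (3): growth into face neighbors
            neighbor_of_solid = np.roll(solid, step, axis=axis)   # periodic shift
            accept = rng.random((n, n, n)) < p_grow
            grown |= neighbor_of_solid & accept
        solid = grown
    return solid

solid = qsgs()
porosity = 1.0 - solid.mean()
print(f"porosity = {porosity:.3f}")
```

Because a whole layer of voxels is added in each iteration, the final porosity falls slightly below the target; a smaller `p_grow` reduces this overshoot at the cost of more iterations.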

Pore network
A pore network is made up of nodes that represent individual pores in the pore space and links that connect the nodes of neighboring pores. Physical properties such as local permeability can be assigned to these links. The pore network is a simplified representation of the natural pore space. Different approaches using different definitions of a pore have been proposed. The definition proposed by Piovesan [16] is used in this study. The algorithm has four key steps: prefiltering the distance map, deleting peaks on saddles and plateaus, merging peaks that are too close, and assigning void voxels to the appropriate pores using a marker-based watershed. Figure 4 illustrates how the pore microstructure is divided into individual pore bodies and throats. The pore body with voxels labeled $i$ can be isolated to compute morphology properties. Figure 5 shows the PNMs obtained by applying the watershed algorithm to the pore microstructure of porous media. The spheres represent the position and size of the pores, while the cylinders represent the throats.
After obtaining the pore network, permeability can be calculated based on it. Figure 6 shows the calculation process from porous media image to permeability. Steady-state conditions and mass conservation at the nodes are assumed in order to model single-phase fluid flow. Flow along the x-axis is simulated by applying a pressure drop between the inlet and the outlet; the other sides are impermeable. Since the links are saturated with fluid, the flow rate $Q_{ij}$ from pore $i$ to pore $j$ is given by:

$$Q_{ij} = G_{ij}\,(P_i - P_j)$$

where $G_{ij}$ stands for the channel hydraulic conductance, and $P_i$ and $P_j$ are the pressure values in pores $i$ and $j$, respectively. The hydraulic conductance of a fluid-filled link is determined by applying Poiseuille's law under the condition of laminar flow in a cylindrical cross-section:

$$G_{ij} = \frac{\pi r^4}{8 \eta l}$$

where $r$, $l$, and $\eta$ denote the throat radius, the throat length, and the dynamic fluid viscosity, respectively.
The next step is to apply mass conservation to the pores, which for pore $i$ and an incompressible fluid is expressed as:

$$\sum_{j} Q_{ij} = 0 \tag{8}$$

From the mass conservation hypothesis, a linear system of equations is written:

$$G P = Q$$

where the pressure values at each pore are stored in the vector $P$. $Q$ represents the flow rate at each pore of the pore network, and the summation of the flow rates is equal to zero according to equation (8). The conductance matrix $G$ is a sparse, symmetric matrix whose non-diagonal elements are given by the cylindrical hydraulic conductance. The diagonal elements are computed as:

$$G_{ii} = -\sum_{j \neq i} G_{ij}$$

The volumetric flow rate $Q$ through the domain is calculated after the linear system is solved. Darcy's law is then used to determine the permeability, $K$:

$$K = \frac{Q\, \eta\, L}{A\, \Delta P}$$

where $L$ and $A$ are the sample length and cross-section area, respectively, $\eta$ is the dynamic fluid viscosity, set to 0.001 Pa s, and $\Delta P$ is the pressure difference applied to the sample.
The watershed algorithm is applied to Berea sandstone to verify the permeability prediction. Berea sandstone is a standard material in geoscientific studies and is well characterized in terms of permeability. Figures 6(a) and (b) show the pore microstructure of Berea sandstone and the pore network extracted by the watershed algorithm, respectively. The pressure distribution can be seen in figure 6(c). The experimental permeability of the Berea sandstone is $1.3 \times 10^{-12}\ \mathrm{m^2}$, and the value calculated from the pore network is $1.6 \times 10^{-12}\ \mathrm{m^2}$. The calculated value is close to the experimental value. A further function of the pore network is to characterize the porous media explicitly. Therefore, the sixteen morphology parameters shown in table 1 are produced with a statistical method.
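The flow calculation described above can be illustrated on a toy network. The sketch below assembles the conductance matrix from Poiseuille's law, fixes the pressure at the inlet and outlet pores, solves the linear system, and applies Darcy's law. It is a minimal sketch, not the solver used in this study: a dense matrix replaces the sparse one, and the function name `network_permeability`, the three-pore chain, and all numerical values are hypothetical.

```python
import numpy as np

def network_permeability(n_pores, throats, inlet, outlet, p_in, p_out,
                         sample_length, area, viscosity=1.0e-3):
    """Solve for the pore pressures and return (permeability, pressures).

    throats: iterable of (i, j, radius, length) tuples in SI units.
    inlet, outlet: sets of pore indices held at p_in and p_out.
    """
    def conductance(radius, length):
        # Poiseuille conductance of a cylindrical throat
        return np.pi * radius**4 / (8.0 * viscosity * length)

    G = np.zeros((n_pores, n_pores))
    for i, j, r, l in throats:
        g = conductance(r, l)
        G[i, j] += g
        G[j, i] += g
        G[i, i] -= g                     # diagonal is minus the sum of the row
        G[j, j] -= g
    rhs = np.zeros(n_pores)              # zero net flow at interior pores
    for nodes, pressure in ((inlet, p_in), (outlet, p_out)):
        for k in nodes:                  # replace boundary rows by P_k = pressure
            G[k, :] = 0.0
            G[k, k] = 1.0
            rhs[k] = pressure
    P = np.linalg.solve(G, rhs)

    flow = 0.0                           # total flow leaving the inlet pores
    for i, j, r, l in throats:
        if (i in inlet) != (j in inlet):
            upstream, downstream = (i, j) if i in inlet else (j, i)
            flow += conductance(r, l) * (P[upstream] - P[downstream])
    K = flow * viscosity * sample_length / (area * (p_in - p_out))   # Darcy's law
    return K, P

# Hypothetical chain of three pores joined by two identical throats.
throats = [(0, 1, 1.0e-6, 1.0e-5), (1, 2, 1.0e-6, 1.0e-5)]
K, P = network_permeability(3, throats, inlet={0}, outlet={2},
                            p_in=2.0, p_out=1.0,
                            sample_length=3.0e-5, area=1.0e-10)
print(K, P)
```

For two identical throats in series the pressure in the middle pore is the mean of the boundary pressures, which makes this small case easy to verify by hand.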

BP neural network
BPNN is a mathematical model inspired by biological neural networks. The BPNN can learn and store a large number of mapping relationships from an input-output dataset, and the mathematical equation that describes these relationships does not need to be identified in advance. In this paper, the inputs are the morphology descriptors (X1-X16) and the output is permeability (X17). The relations between the inputs and the output are very complicated and have no explicit mathematical equation. BPNN has the potential to find relationships hidden in a big dataset.
There are sixteen neurons in the input layer and one neuron in the output layer. The key point is to determine the number of hidden layers and the number of neurons in each hidden layer. Generally, there are no exact rules for designing a BPNN except for some basic guidelines: (1) If the dataset is linearly separable, there is no need to introduce a hidden layer. (2) If the dataset has few dimensions or features, it is less complex, and a BPNN with 1 to 2 hidden layers is sufficient; 3 to 5 hidden layers can be used for a complex dataset. (3) The number of hidden neurons should be between the size of the input layer and the size of the output layer. Based on these rules, two hidden layers with sixteen neurons each are initially selected. The performance of the BPNN is compared with linear regression, decision tree, random forest, gradient boosting, and support vector machine [38][39][40][41].
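The architecture described above (sixteen inputs, two hidden layers of sixteen neurons, one output) can be written as a small back-propagation network in NumPy. The sketch only illustrates the structure and the training loop: the tanh activation, the learning rate, the squared-error loss, full-batch gradient descent, and the synthetic stand-in data are assumptions, not settings reported in this study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 samples, 16 descriptors (X1-X16), one target (X17).
X = rng.standard_normal((200, 16))
y = np.tanh(X[:, :1] - 0.5 * X[:, 1:2])          # arbitrary smooth target

sizes = [16, 16, 16, 1]                           # input, two hidden layers, output
W = [rng.standard_normal((a, b)) * np.sqrt(1.0 / a)
     for a, b in zip(sizes[:-1], sizes[1:])]
B = [np.zeros(b) for b in sizes[1:]]

def forward(x):
    """Return the activations of every layer; the output layer is linear."""
    acts = [x]
    for k, (w, b) in enumerate(zip(W, B)):
        z = acts[-1] @ w + b
        acts.append(z if k == len(W) - 1 else np.tanh(z))
    return acts

def train_step(x, t, lr=0.02):
    """One full-batch gradient-descent step on half the mean squared error."""
    acts = forward(x)
    delta = (acts[-1] - t) / len(x)               # loss gradient at the output
    for k in reversed(range(len(W))):
        grad_w = acts[k].T @ delta
        grad_b = delta.sum(axis=0)
        if k > 0:
            # Back-propagate through layer k and the tanh of layer k - 1.
            delta = (delta @ W[k].T) * (1.0 - acts[k] ** 2)
        W[k] -= lr * grad_w
        B[k] -= lr * grad_b
    return 0.5 * np.mean((acts[-1] - t) ** 2)

losses = [train_step(X, y) for _ in range(500)]
pred = forward(X)[-1]
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}, R^2 on training data = {r2:.3f}")
```

In practice the descriptors and the permeability would be scaled before training and the coefficient of determination would be reported on held-out samples rather than on the training data.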

Matrix
The Pearson correlation coefficient (PCC) is a statistic that measures the linear correlation between two variables. It is calculated as the covariance of the two variables divided by the product of their standard deviations. The value of PCC ranges from −1 to 1 because it is essentially a normalized measure of the covariance. When applied to a sample, PCC is commonly denoted by $r_{xy}$ and may be referred to as the sample correlation coefficient or the sample Pearson correlation coefficient. Given paired data $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, $r_{xy}$ is defined as:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $n$ is the sample size and $\bar{x}$ and $\bar{y}$ are the sample means.
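The sample coefficient is straightforward to compute directly from its definition as the covariance divided by the product of the standard deviations. The short Python sketch below does this with the standard library only; the function name `pearson` and the four data points are illustrative and are not taken from the dataset of this study.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

# Made-up values: one variable rises with the descriptor, the other falls.
descriptor = [1.0, 2.0, 3.0, 4.0]
rising = [2.0, 4.0, 6.0, 8.0]
falling = [8.0, 6.0, 4.0, 2.0]
print(pearson(descriptor, rising), pearson(descriptor, falling))
```

A perfectly linear increasing relation gives a coefficient of 1 and a perfectly linear decreasing relation gives −1, which are the two limits of the range stated above.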

Results and discussion
Prediction performance
Table 2 presents the coefficient of determination $R^2$ for the proposed BPNN and the surveyed methods. All the models give high values of $R^2$. Linear regression gives an $R^2$ of 0.867, and the $R^2$ of the BPNN is 0.964. This indicates that the mathematical model used captures the relationship between input and output well.

Feature importance
Feature importance (FI) is a category of feature selection that assigns scores to the input features of a predictive model, indicating the importance of each feature when making a prediction. Figure 8(b) shows the importance scores of all morphology descriptors. Some morphology descriptors have a high degree of linear correlation with each other. Therefore, it is necessary to select important features while taking the PCC into account. Only one parameter of each group of highly linearly related parameters is retained. For example, X1, X2, and X3 have high mutual PCC values; therefore, X1 is selected. On the other hand, the morphology parameters that describe maximum values, such as X4, X6, X8, X12, and X14, are ignored because such values occur with low probability. Finally, X16, X3, X5, X7, X15, and X13 are selected to represent the pore microstructure of porous media, and the ranks of PCC and FI can be seen in table 3.
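The idea of keeping only one descriptor from each group of highly correlated descriptors can be sketched as a greedy filter. The following Python example is an assumption-laden illustration, not the procedure used in this study: the correlation threshold of 0.9, the rule of keeping the descriptor with the higher importance score, the function name `prune_correlated`, and the synthetic data are all hypothetical.

```python
import numpy as np

def prune_correlated(X, importance, threshold=0.9):
    """Keep features in order of importance, skipping any that is highly
    correlated (|PCC| >= threshold) with a feature that was already kept."""
    corr = np.corrcoef(X, rowvar=False)          # feature-by-feature PCC matrix
    order = np.argsort(importance)[::-1]         # most important first
    kept = []
    for j in order:
        if all(abs(corr[j, k]) < threshold for k in kept):
            kept.append(int(j))
    return kept

rng = np.random.default_rng(0)
a = rng.standard_normal(100)
c = rng.standard_normal(100)
# Column 1 is almost a copy of column 0; column 2 is independent of both.
X_demo = np.column_stack([a, 2.0 * a + 0.01 * rng.standard_normal(100), c])
kept = prune_correlated(X_demo, importance=[0.5, 0.3, 0.2])
print(kept)
```

In this example the second column is dropped because it duplicates the information in the first, while the independent third column is retained despite its lower importance score.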

Design of pore network
Two types of pore networks are designed. The first type has an irregular pore and throat distribution with a permeability of $1.1 \times 10^{-15}\ \mathrm{m^2}$, as shown in figure 9. The pore network is the representative elementary volume (REV) with a size of 0.4×0.4×0.4 mm$^3$. It is concluded that X16, X3, X5, X7, X15, and X13 are suitable for representing the pore network. The distributions of these parameters are shown in figures 9(b)-(f). The equivalent pore diameter follows a Gaussian distribution with σ=2, μ=8, and the pore throat ratio follows a Gaussian distribution with σ=2, μ=2.5, where the Gaussian probability density is:

$$y = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}} \tag{13}$$

The second type has a regular pore and throat distribution with a permeability of $4 \times 10^{-16}\ \mathrm{m^2}$, as shown in figure 10(a). The pore network is the REV with a size of 10×10×10 mm$^3$. The coordination number is shown in figure 10(b). The other important morphology parameters are shown in table 4. In the future, 3D printing technology may be used to manufacture porous graphite materials with controlled permeability.

Conclusions
Mining big data reveals the relationship between pore microstructure and permeability for porous media manufacturing and provides a design method for porous media. This method reduces the need for trial and error and thus leads to faster development of porous media with tailored permeability. In this paper, several well-developed methods are employed with pore microstructure morphology descriptors to predict permeability accurately. The framework can easily be applied to other permeability-like transport properties of porous media by replacing the permeability label. In particular, the three construction methods (GRF, QSGS, and Boolean) can also be used to generate different realizations of porous media for investigating a diversity of transport properties, such as diffusivity and electrical and thermal conductivity. In total, 5240 images of porous media are obtained. Sixteen morphology descriptors are extracted from the pore network as input variables for BPNN training. The permeability, treated as the output of the BPNN, is calculated based on the PNM. The $R^2$ between actual and predicted permeability reaches 0.964, indicating that the proposed BPNN is suitable and efficient. The investigation of feature importance scores demonstrates that porosity, coordination number, average pore diameter, average throat diameter, average throat length, and average pore throat ratio are suitable for characterizing pore microstructure. Therefore, a designer should focus on linking these parameters with the specific manufacturing processes in practical applications. The investigation of the Pearson correlation coefficient demonstrates that average throat length is negatively correlated with permeability, whereas porosity, average pore diameter, and average throat diameter are strongly positively correlated with permeability. In the same way, the designer should map these parameters to the process steps and obtain tailored permeability by adjusting these steps.