Pore Pressure Prediction Using Artificial Neural Network Based On Logging Data

Pore pressure is a critical parameter in designing drilling operations. Inaccurate pore pressure data can cause problems, and even incidents, in drilling operations. Pore pressure data can be obtained by direct measurement or estimated using indirect methods such as empirical models. In the oil and gas industry, direct measurements are usually taken only at certain depths because of their relatively high cost. Hence, empirical models are commonly used to fill in the gaps. However, most empirical models depend strongly on the specific basin or formation type. Furthermore, predicting pore pressure accurately with empirical models requires a good understanding of how to determine the Normal Compaction Trendline. The proposed approach aims to find a more straightforward yet accurate method to predict pore pressure. An Artificial Neural Network model is used as an alternative method for pore pressure prediction based on logging data such as gamma-ray, density, and sonic logs, and the results show promising accuracy.


INTRODUCTION
Pore pressure is a critical parameter in designing drilling operations. Pore pressure affects the design of the drilling mud, casing, cement, and tubing [1]. Errors in determining the pore pressure can cause drilling problems such as lost circulation, kicks, and blowouts. These problems can result in increased drilling costs and even fatalities.
A pore pressure equal to the hydrostatic pressure of a fluid column extending from the surface to a given depth in the subsurface is called the normal pore pressure. Normal pore pressure can therefore be calculated from the density of the formation fluid and the depth of the observed point. Pore pressures higher than normal are called abnormal pore pressures, while pore pressures below normal are called subnormal pore pressures. Abnormal pore pressure is found in almost all sedimentary basins throughout the world and is caused by compaction effects, diagenesis, density differences, and fluid migration [3].
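As a minimal sketch of the hydrostatic calculation described above, the snippet below evaluates P = ρ·g·h. The fluid density and depth are illustrative values chosen for the example, not data from this study, and the function name is hypothetical.

```python
# Normal (hydrostatic) pore pressure from fluid density and true vertical depth.
STANDARD_GRAVITY = 9.80665  # m/s^2
PA_PER_PSI = 6894.757       # pascals per psi


def normal_pore_pressure_pa(fluid_density_kg_m3: float, tvd_m: float) -> float:
    """Return hydrostatic pressure in pascals: P = rho * g * h."""
    return fluid_density_kg_m3 * STANDARD_GRAVITY * tvd_m


# Assumed example values (not from the study): 1030 kg/m^3 brine at 2000 m TVD.
p_pa = normal_pore_pressure_pa(1030.0, 2000.0)
p_psi = p_pa / PA_PER_PSI
print(f"{p_pa / 1e6:.2f} MPa ({p_psi:.0f} psi)")
```

A measured pore pressure noticeably above this value at the same depth would indicate abnormal pressure; one below it would indicate subnormal pressure.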
Pore pressure can be determined using direct or indirect measurements. Direct measurement of pore pressure can be done using RFT (Repeat Formation Tester), DST (Drill Stem Test), or RCI (Reservoir Characterization Instrument) [4]. However, these methods are limited to certain depths because of the costs incurred. Besides, RFT and DST are only used in formations with high permeability, whereas abnormal pressures usually occur in formations that have low permeability, such as shale [5]. Indirect measurement can be done using empirical models. Empirical models that can be used to calculate formation pore pressure from well logging data include Hotmann and Johnson [6], Eaton [7], and Bowers [8].
The Hotmann and Johnson model cannot be used in areas where abnormal pressure (overpressure) is caused by compaction resulting from overburden stress [6]. The Eaton model is the most widely used qualitative method [9]. However, it is limited to young sedimentary basins, where the cause of overpressure is undercompaction [10]. The Bowers method overestimates pore pressure when used in shallow, unconsolidated formations [11].
As explained above, most empirical models have limitations. Empirical models are based on data from one basin, so their results differ from one basin to another, and they must be justified and modified before being used in another basin. Using empirical models also requires an analysis of the NCT (Normal Compaction Trendline). The accuracy of porosity-based methods for determining pore pressure depends on how the Normal Compaction Trendline is determined [12]. If resistivity data are used, an understanding of resistivity compaction trends and of the effects of lithology, formation fluid salinity, and temperature on the resistivity response to effective stress is also needed [13]. Given these limitations, a more straightforward yet accurate method to predict pore pressure is necessary.
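To illustrate why the Normal Compaction Trendline matters, the sketch below uses the commonly quoted sonic form of the Eaton relation, Pp = S - (S - Pn)·(Δt_n/Δt_o)^3, where Δt_n is the transit time read from the NCT. The gradients and transit times are made-up illustrative numbers, the exponent of 3.0 is the usual textbook default rather than a value taken from this study, and the function name is hypothetical.

```python
def eaton_sonic(overburden, hydrostatic, dt_normal, dt_observed, exponent=3.0):
    """Eaton sonic relation: Pp = S - (S - Pn) * (dt_n / dt_o) ** exponent.

    All pressures may be given as gradients (e.g. psi/ft) as long as the
    units are consistent. dt_normal is the transit time on the NCT.
    """
    return overburden - (overburden - hydrostatic) * (dt_normal / dt_observed) ** exponent


# Illustrative values only (not from the study).
S, PN = 1.0, 0.465   # overburden and hydrostatic gradients, psi/ft
DT_OBSERVED = 100.0  # measured sonic transit time, us/ft

pp_nct_80 = eaton_sonic(S, PN, dt_normal=80.0, dt_observed=DT_OBSERVED)
pp_nct_85 = eaton_sonic(S, PN, dt_normal=85.0, dt_observed=DT_OBSERVED)
print(f"NCT = 80 us/ft -> {pp_nct_80:.3f} psi/ft")
print(f"NCT = 85 us/ft -> {pp_nct_85:.3f} psi/ft")
```

Shifting the NCT reading by only 5 µs/ft changes the predicted gradient, which is the sensitivity that motivates a method that does not require an NCT.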
An Artificial Neural Network (ANN) is one method that can find the relationship between input parameters and output parameters without requiring an explicit empirical correlation. An Artificial Neural Network can be defined as an information-processing system that has characteristics similar to those of the biological nervous system [13]. The network consists of connected input and output units, with a weight associated with each connection [14]. The network is trained and adjusted until the output matches the target. The schematic of an ANN can be seen in Figure 1 [15].
With their ability to tolerate noisy data and classify new patterns, Artificial Neural Networks can be used as prediction, classification, data association, and data conceptualization tools [16]. Artificial Neural Networks have been widely used in the oil and gas industry, especially to solve complicated problems where conventional modeling is not a practical option [17].
This study aims to create an Artificial Neural Network model as an alternative method to predict pore pressure. Logging data from several wells were used to develop the model.

Figure 1. Schematic of Artificial Neural Network [15]

METHODOLOGY
The methodology of this research can be seen in Figure 2. Alyuda NeuroIntelligence software was used in this research.

Collecting and Analyzing Data
A sufficient amount of data is needed to build an artificial neural network. Logging data, daily drilling reports, lithology, and RFT data were collected from several wells. Logging data such as gamma-ray, density, and sonic logs were collected and became the input data. The pore pressure data were calculated using the Eaton equation and adjusted based on drilling events, mud density, and RFT data. The pore pressure data were used as the output data. The data were then analyzed, and outliers, typos, and missing values were eliminated from the dataset. The analyzed data were divided into three sets: a training set, a validation set, and a test set. The training set is used for neural network training, the validation set is used to tune network parameters other than the weights, and the test set is used to test the performance of the neural network model. The Simple Random Sampling (SRS) method was used to split the dataset. With SRS, the data are selected randomly with a uniform distribution [18]. This method has the advantage of introducing little bias into the model's performance [19].
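The split described above can be sketched as a shuffle followed by slicing. This is a minimal illustration under stated assumptions: the fractions are example defaults, the seed and function name are hypothetical, and the actual split in this study was performed inside Alyuda NeuroIntelligence.

```python
import random


def split_indices(n_records, train_frac=0.70, val_frac=0.15, seed=0):
    """Simple random sampling: shuffle record indices uniformly, then slice."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # every ordering is equally likely
    n_train = round(n_records * train_frac)
    n_val = round(n_records * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because each record lands in exactly one subset, the test set stays unseen during training and tuning.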

Pre-processing Data
Before the data are fed to the Artificial Neural Network, they need to be normalized or scaled because the ANN has a limited range of operating values and because the range of values differs between parameters. Each input parameter was scaled into the range of -1 to 1 using Equation 1 and Equation 2.

$$x_p = SR_{min} + (x - x_{min}) \cdot SF \qquad (1)$$

$$SF = \frac{SR_{max} - SR_{min}}{x_{max} - x_{min}} \qquad (2)$$

where $x_p$ is the preprocessed value, $SR_{min}$ is the lower scaling range limit, $x$ is the actual value of the numeric column, $x_{min}$ is the minimum actual value of the column, $SF$ is the scaling factor, $SR_{max}$ is the upper scaling range limit, and $x_{max}$ is the maximum actual value of the column.
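The scaling step can be sketched as follows, assuming the usual min-max form implied by the symbol definitions for Equations 1 and 2: x_p = SR_min + (x - x_min)·SF, with SF = (SR_max - SR_min)/(x_max - x_min). The function name and the gamma-ray values are illustrative, not taken from the study.

```python
def scale_column(values, sr_min=-1.0, sr_max=1.0):
    """Min-max scale one numeric column into [sr_min, sr_max]."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("column is constant; scaling factor is undefined")
    sf = (sr_max - sr_min) / (x_max - x_min)  # scaling factor
    return [sr_min + (x - x_min) * sf for x in values]


# Made-up gamma-ray readings in API units.
gamma_ray = [20.0, 65.0, 110.0]
scaled = scale_column(gamma_ray)
print(scaled)
```

The column minimum maps to the lower limit and the maximum to the upper limit, so every input parameter ends up on a comparable scale regardless of its original units.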

Developing ANN Model
The architecture of the ANN model used in this research consists of three layers: an input layer, a hidden layer, and an output layer. The number of neurons in the hidden layer was selected by iteration. The number of neurons that gave the highest correlation and the lowest error was chosen for the architecture of the ANN model.
Training can be defined as the process of tuning the weights and biases using the input data. Each iteration of the training process consists of calculating the predicted output (feedforward) and tuning the weights and biases (backpropagation) [20]. The model used the sigmoid function as the activation function in the forward-propagation process. This function has a sigmoid curve and is calculated using Equation 3 and Equation 4, where W is the weight, x is the input layer, b is the bias, and z is the input for the sigmoid function.
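The feedforward step can be sketched with the standard forms z = W·x + b and σ(z) = 1/(1 + e^(-z)). The layer sizes (4 inputs, 10 hidden neurons, 1 output) follow the architecture reported in the Results; the random initial weights, the sample input, the helper names, and the use of a sigmoid on the output neuron are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic activation: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, inputs):
    """Compute sigmoid(W x + b) for every neuron in one layer."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(sigmoid(z))
    return outputs


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
w_hidden, b_hidden = init_layer(10, 4, rng)  # 4 inputs -> 10 hidden neurons
w_out, b_out = init_layer(1, 10, rng)        # 10 hidden -> 1 output

x = [0.2, -0.4, 0.1, 0.7]  # made-up scaled TVD, GR, density, sonic values
hidden = layer_forward(w_hidden, b_hidden, x)
prediction = layer_forward(w_out, b_out, hidden)[0]
print(prediction)
```

With untrained weights the output is meaningless; training adjusts the weights and biases so that this forward pass reproduces the target values.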
The backpropagation algorithm was used since it is the general-purpose training algorithm of choice. Its advantages are that it is simple and easy to program, has no parameters to tune apart from the number of inputs, and is a standard method that generally works well [21].
Training is conducted to find the set of weights and biases that minimizes the loss function. The sum-of-squares error was used as the loss function. It can be calculated using Equation 5, where n is the number of data points, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted output.

$$E = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \qquad (5)$$
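A minimal sketch of how minimizing a sum-of-squares loss works is given below for a single sigmoid neuron trained by batch gradient descent. It is a deliberately reduced stand-in for the full network: the data, the learning rate of 0.1, the iteration count, and the function names are illustrative assumptions, and the gradient shown is the chain-rule result for this one-neuron case only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sum_of_squares_error(actual, predicted):
    """E = sum over all records of (actual - predicted)^2."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))


def train_step(w, b, xs, ys, learning_rate):
    """One gradient-descent update for y_hat = sigmoid(w * x + b)."""
    grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        y_hat = sigmoid(w * x + b)
        # Chain rule: dE/dz = -2 (y - y_hat) * y_hat * (1 - y_hat)
        delta = -2.0 * (y - y_hat) * y_hat * (1.0 - y_hat)
        grad_w += delta * x
        grad_b += delta
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Made-up scaled inputs and targets.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [0.1, 0.2, 0.4, 0.7, 0.9]

w, b = 0.0, 0.0
initial_error = sum_of_squares_error(ys, [sigmoid(w * x + b) for x in xs])
for _ in range(200):
    w, b = train_step(w, b, xs, ys, learning_rate=0.1)
final_error = sum_of_squares_error(ys, [sigmoid(w * x + b) for x in xs])
print(initial_error, final_error)
```

Backpropagation applies the same chain-rule idea layer by layer so that every weight in the hidden and output layers receives its own gradient.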

Testing ANN Model
The ANN model was tested by predicting a new dataset that had never been introduced to the model. The performance of the ANN model was checked by comparing the predicted pore pressure with the actual pore pressure. The coefficient of determination (R²) was used to assess the correlation between the predicted results and the actual data. It was calculated using Equation 6, where n is the number of data points, $m_i$ is the predicted value at step i, $p_i$ is the observed value at step i, $\bar{p}$ is the mean of the observed values, and $\sigma$ is the standard deviation of the observed values.
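As a hedged illustration of the test metric, the sketch below uses the standard definition R² = 1 - SS_res/SS_tot, where SS_tot equals n times the population variance of the observed values. This is assumed to be equivalent to Equation 6 but has not been checked against the software's exact formula; the data and function name are made up.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((p - m) ** 2 for p, m in zip(observed, predicted))
    ss_tot = sum((p - mean_obs) ** 2 for p in observed)  # n * variance
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted pore pressures (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2 = r_squared(observed, predicted)
print(round(r2, 4))
```

A value of 1 means the predictions match the observations exactly, and values near 0 mean the model does no better than predicting the mean.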

RESULTS AND DISCUSSION
Three wells were selected in an onshore field. All three were exploration wells and were drilled directionally. The general stratigraphy of this field can be seen in Table 1. A total of 1434 records were extracted. Data from logging tools, namely True Vertical Depth (TVD), Gamma Ray (GR), Density, and Acoustic Log, were collected and became the input data. The pore pressure data were selected as the output data.
Based on the data analysis, 13 records were deleted as outliers, 995 records were selected for the training set (70%), 213 records were selected for the validation set (15%), and 213 records were selected for the test set (15%). After the data were selected and classified, each parameter was scaled into the range of -1 to 1. The statistics of the data can be seen in Table 2 (Statistics of the ANN Data).
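The record counts above can be checked with a few lines of arithmetic. The rounding rule used here is an assumption; the source reports only the resulting counts.

```python
total_records = 1434
outliers_removed = 13
usable = total_records - outliers_removed  # records left after cleaning

n_train = round(0.70 * usable)             # 70% for training
n_val = round(0.15 * usable)               # 15% for validation
n_test = usable - n_train - n_val          # remainder for testing

print(usable, n_train, n_val, n_test)
```

The three subsets add up to the cleaned total, so no record is dropped or counted twice.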
The architecture of the ANN model consists of an input layer, one hidden layer, and an output layer. One hidden layer was selected to make the network fast and efficient. The optimum number of neurons in the hidden layer was selected by iteration. The five best architectures can be seen in Table 3. Ten neurons in the hidden layer were chosen because this architecture gave the best model, with the highest correlation of all (R = 0.974989). The architecture of the model can be seen in Figure 3. As can be seen in Figure 3, the ANN model consists of an input layer with 4 nodes, 1 hidden layer with 10 neurons, and 1 output layer. The model was then trained using the backpropagation algorithm. The Back Propagation Coefficient was 1.75 and the learning rate was 0.1. Training iterations continued until the lowest error was achieved. After 487 iterations, the absolute errors for training and validation were 0.19555 and 0.19893, respectively (Figure 4).
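To give a sense of the size of the selected architecture, the sketch below counts its trainable parameters. It assumes fully connected layers with one bias per neuron, which is the usual arrangement but is not stated explicitly in the source; the function name is hypothetical.

```python
def count_parameters(layer_sizes):
    """Weights plus biases for a fully connected feedforward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + one bias per neuron
    return total


# 4 inputs (TVD, GR, density, sonic), 10 hidden neurons, 1 output.
n_params = count_parameters([4, 10, 1])
print(n_params)
```

Under these assumptions the network is small relative to the 995 training records, which is consistent with the similar training and validation errors reported above.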
After the ANN model had been developed, it was tested by predicting pore pressure using new data from the test set. The result of this test (the ANN model output) was compared with the actual pore pressure data. This comparison can be seen in Figure 5, where the ANN output shows a good correlation with the actual data. Figure 6 shows the cross plot comparing the actual values and the ANN output. The coefficient of determination (R²) between the actual data and the ANN output is 0.927.

CONCLUSION
An Artificial Neural Network model has been developed using True Vertical Depth (TVD), Gamma-Ray, Density, and Sonic logs as the input parameters and pore pressure as the output parameter.
The model consists of one input layer with four nodes, one hidden layer with 10 nodes, and one output layer with one node. The model uses the sigmoid function as the activation function and the sum of squares as the loss function.
The test results of this model show a good correlation, with an R² of 0.927 between the actual data and the ANN output. Based on these results, ANN is a promising alternative method for predicting pore pressure in Field X.