Application of neural networks in petroleum reservoir lithology and saturation prediction

The Kloštar oil field is situated in the northern part of the Sava Depression within the Croatian part of the Pannonian Basin. The major petroleum reserves are confined to Miocene sandstones that comprise two production units: the Lower Pontian I sandstone series and the Upper Pannonian II sandstone series. We used well logs from two wells through these sandstones as input data in the neural network analysis; spontaneous potential (SP) and resistivity logs (R16 and R64) served as the input for network training. The first analysis was the prediction of lithology, which was defined as either sandstone or marl. These two rock types were assigned categorical values of 1 and 0, respectively, which were then used in the numerical analysis. The neural network was also used to predict hydrocarbon saturation in selected wells, for which the input dataset was extended with depth and categorical lithology. The prediction results were excellent: both the training and prediction datasets showed little disagreement between the true and predicted values. At present, this study represents the best and most useful application of neural networks in the Croatian part of the Pannonian Basin.


INTRODUCTION
We used data from the Kloštar oil field in Croatia to test the application of neural networks as part of a project with a Croatian oil company (INA). The only other geomathematical tools used on this oil field were several interpolation methods used for porosity mapping (BALIĆ et al., 2008). The field is located approximately 35 km east of the Croatian capital of Zagreb (Fig. 1). This particular field was selected because it is part of a joint research project between the Faculty of Mining, Geology and Petroleum Engineering, and a Croatian oil company. We used neural network analysis to predict reservoir lithology of the I and II sandstone series as either marl or sandstone, as well as the hydrocarbon saturation of the sandstone intervals. These intervals are generally represented by clastic, brackish to freshwater deposits that are characteristic of the Upper Pannonian and Lower Pontian succession throughout the Croatian part of the Pannonian Basin (LUČIĆ et al., 2001).

PETROLEUM GEOLOGY SETTINGS
The Dinaric oriented (NW-SE) Križ structure (including the Kloštar field) is located at the most northwestern part of Moslavačka Gora Mountain. Despite abundant well data, the boundaries of the stratigraphic units are commonly not precisely defined, mostly because of a lack of palaeontological samples and complex tectonics that produced many tectonic blocks. At favorable locations, stratigraphic boundaries are determined from available well data, including cores, mud chips, and logs. Five units are defined on the basis of these boundaries and are described below in stratigraphic order.
The Palaeozoic igneous-metamorphic basement (VRAGOVIĆ & MAYER, 1980) is a buried hill, formed by radial tectonic movements and denudation processes that occurred before the Miocene epoch. The rocks are weathered and fractured. In structurally favourable places, hydrocarbon accumulations are confirmed.
The Middle Miocene (Badenian, Sarmatian) unit unconformably overlies the Palaeozoic igneous-metamorphic complex. The basal part is characterized by coarse-grained conglomerates, conglomeratic sandstones, and sandstones often intercalated with shale. It is overlain by dark-gray, sandy, bituminized marls, partially intercalated with light-gray, fine-grained sandstones. Miocene beds form economic hydrocarbon reservoirs at the southern and eastern parts of the Kloštar structure.
Upper Miocene (Pannonian, Pontian) strata are well documented throughout the field area. Lower Pannonian strata conformably overlie Sarmatian bituminized marls. Upper Pannonian sediments are represented by predominantly brown or dark-gray calcareous marls and sandstones, located in the southwestern part, and partially saturated by hydrocarbons. They are defined as the II sandstone series. Lower Pontian sediments are represented by dark-gray, massive marls and sandy marls to the south and east of the field area. Sandstones, mostly arenites with minor proportions of marl or clay, are dominant in the northwest and are partially saturated with hydrocarbons. The Lower Pontian sandstones are also called the I sandstone series.
The Upper Pontian succession is monotonous, consisting of soft sandy or clayey sediments, and the proportion of sand increases upward.
The Pliocene deposits, i.e., Dacian and Romanian, are also locally known as the Paludina beds. These sediments are characterized by the alternation of clays and medium- and coarse-grained sands.
Quaternary deposits consist predominantly of yellowish sandy clays with abundant lime concretions. The average thickness ranges between 10 and 15 m.
The Kloštar structure is a faulted anticline with Dinaric strike (NW-SE). Its geological history has been documented by investigations of the structural setting and tectonic evolution of the western part of the Sava Depression (VELIĆ, 1979, 1980, 1983). The field structure was formed in the Middle Miocene, when intensive Badenian and Sarmatian uplifting events resulted in the formation of a NW-SE oriented anticline measuring 7×2 km. Later, in the Late Miocene, this structure was differentiated into two smaller parts: the northern, which was uplifted during the Pontian, and the southern, which was only activated during the Upper Pontian. The present structural shape was tectonically created during the Pliocene and Quaternary, when the main phase of hydrocarbon migration probably occurred.

ARTIFICIAL NEURAL NETWORKS
A neuron is a basic element of a network that is mathematically presented as a point in space toward which signals are transmitted from surrounding neurons (Fig. 2).
The contribution of a signal to the activity of a neuron is determined by a weight factor multiplied by the corresponding input signal. The total input signal is the sum of all products of weight factors and their corresponding input signals, given by
$$U_i = \sum_{j=1}^{n} w_{ij} X_j$$
where X_j is the j-th input signal, w_{ij} is its weight factor, and n represents the number of inputs for neuron i. If the total input signal has a value greater than the sensitivity threshold of a neuron, then the neuron will have an output of maximum intensity. Otherwise, the neuron is inactive and has no output. The value of the output is given by
$$Y_i = F(U_i + t_i)$$
where F represents the activation function and t_i, the targeted output value of neuron i. A more detailed description of neural network basics and methods can be found in MCCULLOCH & PITTS (1943), ROSENBLATT (1958) and ANDERSON & ROSENFELD (1988).
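The weighted summation and the threshold rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function names, the input signals, the weights, and the threshold value are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def total_input(inputs, weights):
    """Total input signal of one neuron: the sum of weight * input over all inputs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs))


def threshold_output(u, threshold):
    """All-or-nothing response: 1 if the total input exceeds the threshold, else 0."""
    return 1 if u > threshold else 0


# Hypothetical signals and weights (assumed values, not taken from the well logs).
signals = [0.5, 1.0, 0.25]
weights = [0.5, 0.25, 1.0]

u = total_input(signals, weights)        # 0.25 + 0.25 + 0.25
y = threshold_output(u, threshold=0.5)   # the neuron fires because u > 0.5
```

In practice a smooth activation function usually replaces the hard threshold so that the error can be differentiated during training.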
The basic architecture of a neural network consists of neurons divided into layers. A neural network must have at least three layers: the input layer, the hidden layer, and the output layer. The input layer accepts signals or, in our case, input variables. These data are transferred to the hidden layer, where they are processed by the activation functions of its neurons. Data that prove significant in the hidden layer neurons are sent to the output layer as the resulting data for the predicted variable. The number of hidden layers may be exactly one or more than one, depending on the type of neural network. For example, the multilayer perceptron (MLP) neural network is designed to allow more than one hidden layer and to perform better with two or three hidden layers than with one. In contrast, the radial basis function (RBF) neural network can have only one hidden layer, but the number of neurons within this layer is much larger than the number of neurons in the single hidden layer of an MLP.
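The layered flow of signals can be illustrated with a short Python sketch. It assumes a sigmoid activation for the MLP neurons and a Gaussian response for an RBF hidden neuron; the text does not state which functions were used, so both are assumptions, as are the layer sizes and weight values, which are hypothetical.

```python
import math


def sigmoid(z):
    """Smooth activation function (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """One MLP layer: every neuron applies the activation to its weighted sum."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(inputs, layers):
    """Pass the signal through each (weights, biases) layer in turn."""
    signal = inputs
    for weights, biases in layers:
        signal = dense_layer(signal, weights, biases)
    return signal


def rbf_unit(inputs, centre, width):
    """One RBF hidden neuron: Gaussian response to the distance from its centre."""
    dist_sq = sum((x - c) ** 2 for x, c in zip(inputs, centre))
    return math.exp(-dist_sq / (2.0 * width ** 2))


# Hypothetical 3-2-1 network: three inputs, two hidden neurons, one output neuron.
layers = [
    ([[0.1, -0.2, 0.3], [0.4, 0.0, -0.1]], [0.0, 0.0]),  # input -> hidden
    ([[1.0, -1.0]], [0.0]),                              # hidden -> output
]
output = mlp_forward([1.0, 2.0, 3.0], layers)
```

The contrast between the two network types shows in the hidden neurons: an MLP neuron responds to a weighted sum of its inputs, whereas an RBF neuron responds to how close the input lies to its centre, which is one reason an RBF layer typically needs many neurons to cover the input space.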
For this analysis, we used the two aforementioned types of neural networks: the supervised-learning multilayer perceptron and the radial basis function neural network. The MLP network is based on a back propagation algorithm, which calculates the error surface gradient in each step of the analysis. In the following step, the weight factors are adjusted according to the previously calculated error surface gradient so that the error decreases, and a new error surface is calculated. The network can also use two-phase learning with a second learning algorithm such as conjugate gradient descent (GORSE et al., 1997), quasi-Newton (BISHOP, 1995), Levenberg-Marquardt (LEVENBERG, 1944; MARQUARDT, 1963), quick propagation (FAHLMAN, 1988) or delta-bar-delta (JACOBS, 1988), all of which work on principles similar to those of the back propagation algorithm but with a somewhat different approach. The greatest advantage of these algorithms over back propagation is that they are significantly faster, although the standard back propagation algorithm sometimes gives the best results. For more information in Croatian on these learning algorithms, please refer to the geostatistical dictionary of MALVIĆ et al. (2008).
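The gradient step at the heart of back propagation can be sketched as follows. This is an illustrative pure-Python version for a network with one hidden layer, assuming sigmoid activations and a squared-error measure; it is not the STATISTICA implementation, and the toy dataset, learning rate, and number of epochs are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return hidden activations and the output; the last weight in each row is a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w_hidden]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
    return hidden, out


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, n_hidden=3, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, t in data:
            hidden, out = forward(x, w_hidden, w_out)
            # Error gradient at the output neuron (squared error, sigmoid activation).
            delta_out = (out - t) * out * (1.0 - out)
            # Propagate the gradient back to every hidden neuron.
            delta_hidden = [delta_out * w_out[j] * h * (1.0 - h)
                            for j, h in enumerate(hidden)]
            # Move each weight a small step against its gradient.
            for j, h in enumerate(hidden):
                w_out[j] -= rate * delta_out * h
            w_out[-1] -= rate * delta_out
            for j, row in enumerate(w_hidden):
                for k, v in enumerate(x):
                    row[k] -= rate * delta_hidden[j] * v
                row[-1] -= rate * delta_hidden[j]
    return w_hidden, w_out


# Toy, linearly separable dataset (hypothetical; not well log data).
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
initial_error = mean_squared_error(data, *train(data, epochs=0))
trained_error = mean_squared_error(data, *train(data))
```

Comparing `initial_error` with `trained_error` shows the error decreasing as the weights follow the gradient. The faster second-phase algorithms listed above differ mainly in how they choose the direction and size of each step.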
The MLP is successfully applied in both classification and prediction problems (RUMELHART et al., 1986), and is the neural network most often used in solving geological problems. The RBF network is also commonly used, but it is applied more successfully and more frequently to classification problems than to prediction problems.
Neural networks have been successfully applied in petroleum geology problems such as determining reservoir properties (e.g., lithology and porosity) from well logs (BHATT, 2002).

DATA ANALYSIS
We used well log data from two wells as input data for the neural network analysis. The data consist of the most basic well logs because measurements were taken in 1956 (well Klo-A) and 1957 (well Klo-B). Although these logs were taken some 50 years ago, we can successfully use them for interpretation (Fig. 3). Available data for analysis included resistivity (R16 and R64) and spontaneous potential (SP) logs. The resolution of the data was 10 cases (measurements) per metre of well log.
We performed two types of analysis: first we predicted the lithology, and then the hydrocarbon saturation. In each analysis, the neural network was first trained on a specific interval of well log data (supervised learning), and afterward we used the trained neural network to predict the value of the desired parameters for the intervals on which the neural network was not trained.
All neural network analyses were performed using StatSoft STATISTICA 7.1.

Lithology prediction
To predict lithology, we manually determined the lithological component by distinguishing layers of marl and sandstone from well logs (BASSIOUNI, 1994) in wells Klo-A and Klo-B. The neural network was trained on the first set of data, which includes intervals that correspond to the I sandstone series, and the prediction was made on the intervals that correspond to the II sandstone series, and vice versa. Resistivity (R16, R64) and SP logs were used as input data for training the neural network. We defined lithology as a categorical variable (1 for sandstone, 0 for marl) and used the manually determined lithology as the training target.
Results given in the tables represent the success of the neural network training, expressed as a training error and a selection error. The program in which the analyses were made automatically divides the training dataset into two parts, the sizes of which are user defined. The first part is used for training the neural network; the second, for testing the neural network's ability to predict cases. Holding back part of the data in this way reduces the risk of overtraining. The training procedure is stopped when user-defined conditions have been met (a final number of iterations or a desired amount of error) or when the program decides that further training will no longer yield better results. Results of neural network training for the prediction of lithology are given in Table 1 and expressed as two error values in percentages. The training error corresponds to the part of the dataset that was used for training the neural network. The selection error describes the neural network's success in predicting values for data on which it was not trained. Two neural networks were trained for each well, one MLP and one RBF network. Here the prediction could only be done on the deeper or shallower parts of the well log in the same well; cross-prediction or 2D neural network analysis did not yield satisfactory results because of the different values of SP logs in the two wells.
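The division of the training dataset into a training part and a selection part, and the error computed on each, can be sketched in Python as below. This is a hedged illustration of the idea only: the split fraction, the function names, the synthetic cases, and the stand-in classifier (a simple SP cut-off rather than a neural network) are all assumptions and do not reproduce the STATISTICA procedure or the measured logs.

```python
import random


def split_cases(cases, selection_fraction=0.25, seed=0):
    """Shuffle the cases and hold back a user-defined fraction for selection."""
    shuffled = cases[:]
    random.Random(seed).shuffle(shuffled)
    n_selection = int(round(len(shuffled) * selection_fraction))
    return shuffled[n_selection:], shuffled[:n_selection]


def error_rate(predict, cases):
    """Fraction of cases whose predicted category differs from the known one."""
    wrong = sum(1 for features, label in cases if predict(features) != label)
    return wrong / len(cases)


# Synthetic cases: (SP, R16, R64) with lithology 1 = sandstone, 0 = marl.
# The numbers are invented for illustration and are not log measurements.
cases = [
    ((-40.0, 12.0, 15.0), 1), ((-35.0, 10.0, 14.0), 1),
    ((-5.0, 3.0, 3.5), 0), ((-2.0, 2.5, 3.0), 0),
    ((-30.0, 9.0, 11.0), 1), ((-8.0, 3.2, 3.9), 0),
    ((-12.0, 8.0, 9.0), 1), ((-3.0, 2.8, 3.1), 0),
]


def sp_cutoff(features):
    """Stand-in classifier (hypothetical): call a case sandstone if SP < -20."""
    return 1 if features[0] < -20.0 else 0


training, selection = split_cases(cases)
overall_error = error_rate(sp_cutoff, cases)   # one of the eight cases is misclassified
```

With a trained network in place of `sp_cutoff`, `error_rate(model, training)` and `error_rate(model, selection)` would correspond to the training and selection errors discussed in the text.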
Table 1 shows that both neural networks were successfully trained on the corresponding interval. Table 1 also shows an anomaly: the MLP network had a high training error but a low selection error. Initially one might presume that the RBF network, with its significantly lower training error, is more successful in predicting the unknown data interval, but this is not the case. The MLP network had slightly better results than the RBF, so we conclude that the selection error is a much more reliable indicator of the success of a neural network than the training error. Thus, when neural network training is finished, the best network parameters are those with the smallest selection error. The relationship between manually determined lithology and lithology obtained from neural network analysis is shown in Figs. 4 and 5.

Hydrocarbon saturation prediction
In contrast to lithology prediction, hydrocarbon saturation prediction uses cross-prediction. The neural network is trained on one well log interval, much larger than in the lithology case. Training was done on one well, Klo-A, and prediction was performed on another well, Klo-B. As input data we used resistivity logs (R16, R64), SP logs, the corresponding well depth and lithology, and the hydrocarbon saturation value. Hydrocarbon saturation was manually determined from the resistivity log R64. Including well depth improved network performance here, whereas in lithology prediction it had little or no effect. Hydrocarbon saturation, like lithology, was defined as a categorical value: "1" stands for positive hydrocarbon accumulation and "0" for negative. For this analysis, only the MLP neural network was used because the RBF network was characterized by high selection and training errors.
Neural network parameters are shown in Table 2. Relationships between manually determined and neural network predicted hydrocarbon saturation are shown in Fig. 6.
a Neural network type and properties correspond to the type of network and number of neurons per layer, where the first and last numbers represent the properties of the input and output layer. Values between these represent the number and properties of the hidden layers.
b Error value ranges from 0 to 1, where 0 represents 100% success of prediction, i.e., no error.
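The way the cross-prediction dataset is assembled can be sketched in Python. The record layout, field names, and helper functions below are hypothetical, and the sample values are synthetic placeholders rather than measurements from Klo-A or Klo-B; the sketch only shows how the five inputs (R16, R64, SP, depth, lithology) and the categorical saturation target fit together, and how agreement between true and predicted values could be scored.

```python
from typing import NamedTuple


class LogCase(NamedTuple):
    depth_m: float
    r16: float
    r64: float
    sp: float
    lithology: int    # 1 = sandstone, 0 = marl
    saturation: int   # 1 = hydrocarbons present, 0 = absent


def to_inputs(case):
    """Input vector for one case: resistivity logs, SP, depth, and lithology."""
    return [case.r16, case.r64, case.sp, case.depth_m, float(case.lithology)]


def build_dataset(cases):
    """Split cases into network inputs and the categorical saturation targets."""
    return [to_inputs(c) for c in cases], [c.saturation for c in cases]


def disagreement(true_values, predicted_values):
    """Fraction of cases where the predicted category differs from the true one."""
    if not true_values or len(true_values) != len(predicted_values):
        raise ValueError("value lists must be non-empty and of equal length")
    pairs = zip(true_values, predicted_values)
    return sum(1 for t, p in pairs if t != p) / len(true_values)


# Synthetic placeholder cases at 0.1 m spacing (10 cases per metre).
training_well = [LogCase(900.0, 4.0, 6.0, -35.0, 1, 1),
                 LogCase(900.1, 2.0, 2.2, -5.0, 0, 0)]
prediction_well = [LogCase(905.0, 3.8, 5.5, -30.0, 1, 1),
                   LogCase(905.1, 2.1, 2.3, -4.0, 0, 0)]

train_inputs, train_targets = build_dataset(training_well)
test_inputs, test_targets = build_dataset(prediction_well)
```

A network trained on `train_inputs` and `train_targets` would then be applied to `test_inputs`, and `disagreement(test_targets, predictions)` would summarize how closely the cross-prediction matches the manually determined saturation.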

DISCUSSION
Generally, for all neural network analyses, the more input cases and input variables that are used, the more successful the results and the better the prediction will be.
Prediction of the lithology proved reliable only when the extrapolation of data was within one well interval. In this analysis the most significant input was the SP log. With the input of resistivity logs alone, correspondence between true and predicted values was not satisfactory. It should also be kept in mind that the input logs were recorded in 1956 and 1957.
A problem that appeared in this study, and which prevented cross-prediction of lithology, was the difference in SP log values between wells Klo-A and Klo-B. The mud resistivity (Rm) was 72 Ωm at a mud temperature of 13°C for Klo-A and 625 Ωm at a mud temperature of 1°C for Klo-B. This problem led to unsuccessful prediction in the shallower part of the well log data interval, where the electrical properties of the mud in wells Klo-A and Klo-B were significantly different, and to successful prediction in the deeper part of the well log data interval (>850 m of depth), where values were similar in both corresponding well log data intervals. The problem probably occurred because of the different electrical properties of the mud, influenced by different temperatures and by the composition of the mud itself. In the deeper segments, the temperature and electrical properties of the mud in both wells evened out, and therefore cross-prediction on deeper intervals was possible. One solution to this problem for shallower intervals could be to introduce lithology descriptive variables that are not as dependent on mud properties as the SP log. For example, gamma ray logs as well as other well logs that define reservoir properties, such as compensated neutron and density logs, could be used to obtain better neural network performance.

CONCLUSIONS
In this study, we trained several artificial neural networks with the task of predicting the lithology of Upper Pannonian sediments (II sandstone series) and Lower Pontian deposits (I sandstone series), as well as hydrocarbon saturation within these beds. Sandstone facies are adequate media for statistical and neural network analysis. Our analysis of sandstone reservoirs of the Kloštar field by neural tools yielded the following results:
When determining the lithological component in wells Klo-A and Klo-B with RBF and MLP neural networks, we achieved excellent correspondence between true and predicted values.
Prediction of hydrocarbon saturation in well Klo-B with a neural network trained in well Klo-A gave excellent correspondence between true and predicted values.
Our results show the great potential of neural networks' application in petroleum geology research, where they could be used to quickly acquire results from well logs, to obtain vertical and lateral correlation of such logs, and to solve other petroleum geology problems.
Five stratigraphic units are defined on the basis of the stratigraphic boundaries: Palaeozoic, Middle Miocene, Upper Miocene, Pliocene, and Quaternary sediments. The basement of the Tertiary system represents the core of the Križ structure. It includes extrusive igneous and metamorphic rocks, including granites of Palaeozoic age (VRAGOVIĆ & MAYER, 1980).

Figure 1: Geographic position of the Kloštar field in Croatia.
Neural networks have also been applied to well-log correlation (LUTHI & BRYANT, 1997). In the Croatian part of the Pannonian Basin, only a few such petroleum geology research projects have been performed. In these studies, clastic facies were determined from well logs (MALVIĆ, 2006) and porosity was predicted based on well and seismic data (MALVIĆ & PRSKALO, 2007).

Figure 3: Part of the Klo-B well log representing the interval for the I sandstone series.

Figure 4: Comparison of MLP neural network predicted (dotted line) and manually determined data (solid line) in lithology prediction analysis for well Klo-A. The diagram's abscissa represents vertical depth, and the ordinate represents the value of lithology expressed as either marl (0) or sandstone (1).

Figure 5: Comparison of RBF neural network predicted (dotted line) and manually determined data (solid line) in lithology prediction analysis for well Klo-B. The diagram's abscissa represents vertical depth, and the ordinate represents the value of lithology expressed as either marl (0) or sandstone (1).

Figure 6: Comparison of MLP neural network predicted (dotted line) and manually determined data (solid line) in hydrocarbon saturation analysis in wells Klo-A and Klo-B. The diagram's ordinate represents hydrocarbon saturation (0 or 1), whereas the abscissa represents the corresponding depth.

Table 1: Neural network parameters for lithology prediction
a Neural network type and properties correspond to the type of network and number of neurons per layer, where the first and last numbers represent the properties of the input and output layer. Values between these represent the number and properties of the hidden layers.
b Error value ranges from 0 to 1, where 0 represents 100% success of prediction, i.e., no error.