Using Artificial Neural Networks for Analyzing Efficiency of Advanced Recovery Methods

Analysis of advanced recovery methods (ARM) is the most important task in developing an oil deposit. Today, the main part of the geological and technical measures (GTM) is the treatment of bottom-hole zones, particularly in carbonate formations. Calculating their effect is difficult because there are no mathematical models that account for the factors of a real deposit that influence well productivity, for example, precipitation of organic salts, asphalt, resin, and paraffin deposits, and the complex structure of the reservoir. Therefore, a new approach to calculating the efficiency of advanced recovery methods is required, one that makes it possible to take into account the properties of a real oil formation [1]. In general, the analysis of the efficiency of advanced recovery methods may be divided into the following stages: 1) calculation of the increase in the well productivity coefficient after using ARM; 2) calculation of water content and determination of the further dynamics of oil recovery. In the first stage, an instant boost in the well productivity coefficient occurs. It can be estimated using various models of radial inflow into the well. In this case, however, some parameters of a real well remain unaccounted for, which leads to significant errors [2].

Other methods of determining the coefficient of the instant boost in well productivity are based on statistical laws. These include regression equations, ranking, factor analysis, etc. This approach differs from the previous one in that it makes it possible to account for factors present in a real deposit. The accuracy of statistical models depends on the sample used. In case of low dispersion of the initial data, the error in predictive calculations may reach 100% or more [3].

METHODS
If the data for calculating the efficiency of an advanced recovery method are significantly heterogeneous, it is appropriate to use the mathematical apparatus of artificial neural networks (ANN). Neural networks are an area of applied mathematics that has recently been widely used in various technical fields. An ANN consists of a set of neurons interconnected in a certain way.
A neuron (nerve cell) consists of the so-called soma (cell body) and two types of tree-like external branches: dendrites and an axon. The cell body contains a nucleus, which holds the information about the neuron's properties, and plasma, which produces the materials needed by the neuron. The neuron receives pulses from other neurons via the dendrites and transmits the signals generated by the cell body along the axon, which branches into fibers at its end. There are synapses at the ends of these fibers.
A synapse is a functional junction between two neurons (an axon fiber of one neuron and a dendrite of the other). When a pulse reaches the synapse end, chemical substances called neurotransmitters are released. The neurotransmitters pass through the synaptic gap and, depending on the type of the synapse, excite or inhibit the ability of the receiving neuron to generate electric pulses. Since the efficiency of a synapse is determined by the signals passing through it, synapses learn, depending on the activity of the processes in which they participate. The neurons interact with each other using short series of pulses. The message is transmitted by pulse-frequency modulation.
Experimental research shows that biological neurons are structurally much more complex than the simplified description embodied in existing artificial neurons, which are the elements of modern ANNs. Since neurophysiology keeps expanding scientists' understanding of how neurons act, and computing technology is constantly improving, network developers have ample room for improving models of the biological brain.
ANNs are electronic models of the neuron structure of the brain, which mainly learns from its own experience. The natural analog shows that many problems that cannot be solved with traditional computers may be efficiently solved with the help of neural networks.
The human brain has many qualities absent in modern computers. The most important are: a) adaptability; b) ability to learn and generalize; c) distributed representation of information and parallel computing; d) low power consumption; and e) tolerance to errors. An instrument built on the principles of biological neurons has the characteristics listed above, which is a significant achievement in the data processing industry.
Figure 1 schematically demonstrates a mathematical model of a neuron, which consists of a summer and an activation function.
The task of analyzing the efficiency of advanced recovery methods is a prediction task; the classical multi-layer Rosenblatt perceptron is usually used for solving this class of tasks [2-4]. The architecture of a multi-layer perceptron consists of series-connected layers, where the inputs of the neurons in each layer are connected to all neurons in the previous layer, and the outputs to all neurons in the next one. The inputs through which an artificial neuron receives signals are called "synapses". Each neuron calculates a non-linear transformation (according to its own activation function) of a linear combination of the signals from the previous layer. Multi-layer neural networks learn with the help of the error backpropagation algorithm, which is a method of gradient descent in the space of weights that minimizes the total network error [2, 5].
One of the criteria of model reliability (accuracy) is the mean square error. During the training of neural networks, the mean square error is minimized with the help of the cross-validation procedure. It involves dividing the data set into two subsets (training and test), training on the training subset, and validating the ability to forecast on the test subset (i.e., on data that did not participate in the training). When the errors obtained in these calculations (the so-called "training error" and "forecast error") are comparable in magnitude, it can be concluded that the calculation performed on this ANN model is stable. This procedure serves as the criterion for choosing the parameters of the neural network: the smaller the difference between the errors on the test and training sets, the more reliable the results of the neural network calculation.
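The split-and-compare procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure implemented in the software used in this study: the function names, the relative `tolerance` used to decide whether two errors are "comparable", and the synthetic data are all assumptions made for the example.

```python
import random

def holdout_split(samples, test_fraction=0.1, seed=0):
    """Shuffle the samples and split them into training and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def errors_comparable(train_mse, test_mse, tolerance=0.5):
    """True when the two errors differ by at most `tolerance` (relative).

    The tolerance is a hypothetical threshold chosen for this sketch.
    """
    larger = max(train_mse, test_mse)
    if larger == 0:
        return True
    return abs(train_mse - test_mse) / larger <= tolerance

# Hypothetical data: 50 (input, target) pairs standing in for well records.
data = [(x, 2.0 * x + 1.0) for x in range(50)]
train, test = holdout_split(data, test_fraction=0.1)
```

With 50 records and a test fraction of 0.1, the split yields 45 training and 5 test records; a large gap between the two errors would suggest that the network parameters should be revised.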
The following parameters should be defined for synthesis of a neural network model: topology; model input/output; input parameter normalization method; and activation function [6].

Topology
V. I. Arnold (1957) and A. N. Kolmogorov proved a theorem stating that any continuous function of n variables can be represented using addition and superposition of continuous functions of a single variable. The significance of this theorem is that it shows the fundamental possibility of accurately representing arbitrarily complex functional dependencies with structures made of simple elements, such as artificial neurons. More recently, several authors (Hornik, Stinchcombe, and White (1989), Cybenko (1989), Funahashi (1989)) proved that any continuous function of many variables of the form F(x_1, x_2, …, x_n) can be approximated with any desired accuracy using a simple three-layer ANN with a sufficient number of neurons in the second (hidden) layer and appropriately chosen synaptic coefficients. The well-known Arnold-Kolmogorov-Hecht-Nielsen theorem was used for calculating the optimum number of neurons. The required number of neurons in the hidden layers of the perceptron is defined according to a formula that follows from this theorem, in which N_y is the dimension of the output signal, Q is the number of elements in the set of training examples, N_w is the required number of synaptic connections, and N_x is the dimension of the input signal [7].
Using the formula, and having estimated the required number of synaptic connections N_w, one can calculate the required number of neurons in the hidden layers. The number of neurons in the hidden layer of a two-layer perceptron may be calculated according to the following formula:
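Neither expression is reproduced in the text as it stands. For orientation only, the form in which this consequence is usually quoted in the neural network literature is sketched below. It is an assumption consistent with the symbols N_y, Q, N_w, and N_x defined above, not a reproduction of the authors' own formulas, and N (the number of hidden neurons) is a symbol introduced here.

```latex
\frac{N_y Q}{1 + \log_2 Q} \;\le\; N_w \;\le\;
N_y \left( \frac{Q}{N_x} + 1 \right) \left( N_x + N_y + 1 \right) + N_y,
\qquad
N = \frac{N_w}{N_x + N_y}
```

As a rough check under these assumptions, Q = 45 training examples (90% of 50 wells), N_y = 1, and a hypothetical N_x = 5 give approximately 7 ≤ N_w ≤ 71, i.e. between about 2 and 11 hidden neurons.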

Model input
Input parameters (the input vector) were chosen to characterize the properties of the oil formation as fully as possible. Such properties include:
1. Physico-chemical properties of the fluids that saturate the reservoir bed (viscosity of oil and water, content of paraffins in the oil).
2. Geometrical and structural properties of the formation. These include parameters that characterize the size of the formation (the area of the oil deposit, effective oil-saturated thickness), anisotropy along the strike (sandiness, roughness), and parameters that describe the capacitive characteristics of the formation (porosity, oil saturation).
3. Filtration properties of the formation and of the drained liquid (displacement coefficient) [8].

Model output
At the output, only one parameter is obtained, namely the coefficient of productivity; therefore, the output layer of the ANN contains only one neuron.

Input parameter normalization method
Since all calculations made with artificial neural networks imply certain rules for presenting synaptic coefficients and values of activation functions, the input values cannot be supplied to the ANN input arbitrarily. The input data should be normalized so that they remain in the range 0 ... 1 [2]. To do so, the simplest method of normalization is used, in which each component x_i of the vector is rescaled as (x_i - min x_i) / (max x_i - min x_i), where max x_i and min x_i are the maximum and minimum values of that component over the whole training sample. The same formula is used, in reverse, for converting output vectors back into true values [9].
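A minimal sketch of this scaling and of its inverse is given below. It assumes min-max scaling to the range 0 ... 1, as described above; the function names and the sample values are hypothetical.

```python
def fit_min_max(column):
    """Return the minimum and maximum of one input component over the training sample."""
    return min(column), max(column)

def normalize(value, lo, hi):
    """Map a raw value into the range 0..1 using the training minimum and maximum."""
    if hi == lo:
        # Constant component: no spread to scale by (choice made for this sketch).
        return 0.0
    return (value - lo) / (hi - lo)

def denormalize(scaled, lo, hi):
    """Inverse transform: recover the true value from a normalized one."""
    return scaled * (hi - lo) + lo

# Hypothetical values of one input component (for example, oil viscosity).
viscosity = [2.0, 4.0, 10.0]
lo, hi = fit_min_max(viscosity)
scaled = [normalize(v, lo, hi) for v in viscosity]
```

The minimum and maximum are taken from the training sample only, so that test records are scaled with the same constants as the records the network was trained on.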
Activation function
There are many activation functions: sigmoid, threshold, hyperbolic tangent. The most frequently used one is the sigmoid ƒ(w^T x), where ƒ is the activation function, w^T is the transposed vector of weight coefficients, and x is the vector of input variables of the model [10].
The neuron input receives the vector (x_1, x_2, x_3, …, x_n), which is multiplied by the transposed vector of weight coefficients (w_1, w_2, w_3, …, w_n); the value of the activation function, which depends on the result of multiplying the vector of weights by the input vector, is obtained at the output of the neuron.
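The computation performed by a single neuron can be illustrated as follows. The sketch assumes the standard logistic form of the sigmoid, 1 / (1 + exp(-s)), and adds an optional `bias` term (a threshold level); both are assumptions of this example rather than details taken from the study.

```python
import math

def sigmoid(s):
    """Logistic sigmoid, written to avoid overflow for large negative arguments."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def neuron_output(weights, inputs, bias=0.0):
    """Weighted sum of the inputs (the summer) passed through the activation function."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(s)
```

With all weights equal to zero the weighted sum is zero and the neuron returns 0.5, the midpoint of the sigmoid; large positive or negative sums push the output toward 1 or 0.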
Before training, a test subset is set aside from the general set of training examples; the prediction properties of the trained neural network will be assessed on it. In this way, the size and the nature of the training sample are established.
The solution of the task of analyzing the efficiency of advanced recovery methods is a set of values w_1, w_2, w_3, …, w_n that, for any input vector (x_1, x_2, x_3, …, x_n), yields at the network output the ARM efficiency value y_out. The vector (w_1, w_2, w_3, …, w_n) is obtained by training the network [11-13].
For training the network, a sample of N pairs (x_i, y_i) is formed, where y_i is the actual value of the predicted parameter [14-16].
Further, the vector x_i is supplied to the input of the neural network, and the network output y_out is calculated and compared to the actual value y_i; depending on the magnitude of the error, the vector of weights (w_1, w_2, w_3, …, w_n) is adjusted. The weights are adjusted until the error between the actual and predicted values falls below a certain threshold level.
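The adjust-until-threshold loop described above can be sketched for a single sigmoid neuron. This is an illustration under stated assumptions, not the training routine used in the study: the update is the ordinary gradient step for the squared error, and the learning rate, error threshold, epoch limit, and the two training pairs are hypothetical.

```python
import math

def sigmoid(s):
    """Logistic sigmoid, written to avoid overflow for large negative arguments."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def predict(weights, bias, x):
    """Network output y_out for one input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_neuron(samples, learning_rate=0.5, error_threshold=0.01, max_epochs=10000):
    """Adjust the weights until the mean squared error drops below the threshold."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        total_error = 0.0
        for x, y_actual in samples:
            y_out = predict(weights, bias, x)
            error = y_actual - y_out
            total_error += error ** 2
            # Gradient of the squared error through the sigmoid.
            delta = error * y_out * (1.0 - y_out)
            weights = [w + learning_rate * delta * xi for w, xi in zip(weights, x)]
            bias += learning_rate * delta
        if total_error / len(samples) < error_threshold:
            break
    return weights, bias

# Hypothetical training pairs (x_i, y_i), already scaled to 0..1.
samples = [([0.0], 0.2), ([1.0], 0.8)]
weights, bias = train_neuron(samples, error_threshold=0.001)
```

The `max_epochs` limit is a safeguard: if the threshold cannot be reached, the loop stops anyway and returns the last weights.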
The mathematical model of a neuron can be represented as follows: ... (2)
Formula (2) is very similar to the equation of regression analysis: ... (3)
where the solution y_out also depends on the vector (w_1, w_2, w_3, …, w_n) [13]. To evaluate the applicability of an ANN for predicting the instantaneous productivity growth coefficient after applying ARM, it is necessary to compare it with regression analysis. To do so, a two-layer neural network (Fig. 2) will be used.

Fig. 2. A two-layer neural network (hidden layer and output layer)
The procedure of creating an ANN includes defining the number of neurons in each layer of the network. As a rule, the number of neurons in a hidden layer should lie between the number of input and output variables of the model, and should not exceed twice the number of input variables. For assessing the results of training, the following criteria of calculation accuracy are used in the proposed ANN model:
a) The absolute deviation (error) is calculated as the difference between the actual and predicted values of the parameter; it is the assessment of the absolute error of measurement.
b) The relative error is the ratio of the absolute error of the parameter calculation to the actual value of this parameter.
c) The mean square error: all individual absolute errors are squared and summed, the sum is divided by the total number of errors, and the square root is extracted. The resulting number characterizes the total error; this characteristic is used for assessing the results of training and testing artificial neural networks.
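The three criteria can be written out directly. The sketch below follows the definitions given above; the function names and the sample values are hypothetical.

```python
import math

def absolute_errors(actual, predicted):
    """Difference between the actual and predicted value for each record."""
    return [a - p for a, p in zip(actual, predicted)]

def relative_errors(actual, predicted):
    """Absolute error divided by the actual value (actual values must be non-zero)."""
    return [(a - p) / a for a, p in zip(actual, predicted)]

def root_mean_square_error(actual, predicted):
    """Square, sum, divide by the number of errors, then take the square root."""
    errors = absolute_errors(actual, predicted)
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# Hypothetical actual and predicted productivity coefficients.
actual = [2.0, 4.0]
predicted = [1.0, 6.0]
rmse = root_mean_square_error(actual, predicted)
```

The relative error is often reported as a percentage by multiplying it by 100.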

RESULTS
The ANN model for analyzing the efficiency of ARM was created in the Statistica program suite. After the software package was started, the networks were trained and tested. After the calculation, the program displayed the best networks, which could be used for viewing and analyzing the obtained results. From these, the networks with the best scores on the training, checking, and testing sequences were chosen. The model was trained on 90% of the initial sample, and the remaining 10% was used for testing the quality of model training. The sample was formed from the technological indicators of 50 wells. For training an ANN with 8 neurons in the hidden layer and a sigmoid activation function, the algorithm of the multi-layer Rosenblatt perceptron was used. The change in the expected productivity coefficient of a production well after hydraulic fracturing of the formation was used as input data.
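For readers without access to Statistica, the configuration described above (one hidden layer of 8 sigmoid neurons, a single output, 90% of 50 records for training) can be imitated with a small pure-Python sketch. It is not the authors' model: the class name, the backpropagation step, the learning rate, the number of epochs, and above all the synthetic data are assumptions made for illustration.

```python
import math
import random

def sigmoid(s):
    """Logistic sigmoid, written to avoid overflow for large negative arguments."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

class TwoLayerPerceptron:
    """One hidden sigmoid layer and a single sigmoid output neuron."""

    def __init__(self, n_inputs, n_hidden=8, seed=0):
        rng = random.Random(seed)
        # Small random initial weights; the last entry of each row is a bias.
        self.hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                       for _ in range(n_hidden)]
        self.output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
             for row in self.hidden]
        y = sigmoid(sum(w * v for w, v in zip(self.output, h)) + self.output[-1])
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, rate=0.5):
        """One backpropagation update for a single training pair."""
        h, y = self.forward(x)
        # Output delta: derivative of the squared error through the sigmoid.
        d_out = (target - y) * y * (1.0 - y)
        # Hidden deltas are computed with the output weights before they change.
        d_hid = [d_out * self.output[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
        for j in range(len(h)):
            self.output[j] += rate * d_out * h[j]
        self.output[-1] += rate * d_out
        for j, row in enumerate(self.hidden):
            for i in range(len(x)):
                row[i] += rate * d_hid[j] * x[i]
            row[-1] += rate * d_hid[j]

def mse(net, samples):
    """Mean squared error of the network on a list of (input, target) pairs."""
    return sum((t - net.predict(x)) ** 2 for x, t in samples) / len(samples)

# Synthetic stand-in for 50 well records, already scaled to 0..1.
rng = random.Random(1)
wells = []
for _ in range(50):
    x = [rng.random(), rng.random()]
    wells.append((x, 0.2 + 0.6 * x[0] * x[1]))
train, test = wells[:45], wells[45:]  # 90% / 10% split

net = TwoLayerPerceptron(n_inputs=2, n_hidden=8)
mse_before = mse(net, train)
for _ in range(500):
    for x, t in train:
        net.train_step(x, t)
mse_after = mse(net, train)
test_mse = mse(net, test)
```

Comparing `mse_after` with `test_mse` is the same training-versus-forecast check that is used to judge the stability of the model.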
Hydraulic fracturing is the process in which the pressure of a liquid acts directly on the rock of the formation until it fractures and a crack appears. Prolonged exposure to the pressure of the liquid widens the crack downwards from the point of fracturing and expands natural cracks.
After the formation is fractured by the liquid pressure, the crack expands and connects with the system of natural cracks that had not been opened by the well, and with the areas of increased permeability; thus, the area of the formation drained by the well is expanded. Granular material (proppant) is forced into the cracks formed by the fracturing, which helps to keep the crack open after the excess pressure is removed.
Hydraulic fracturing increases the flow rate of production wells and the intake capacity of injection wells many times, due to decreasing hydraulic resistance in the critical zone and increasing filtering capacity of the well; the final oil recovery increases because poorly drained areas and interstratified beds are brought into the development.
Hydraulic fracturing with the formation of long cracks increases not only the permeability of the near-wellbore zone but also the coverage of the formation by the treatment, brings additional oil reserves into development, and increases the oil extraction coefficient. It may also reduce the current water content in the extracted products.
The highest efficiency of hydraulic fracturing may be achieved when it is designed as an element of the development system, with regard to the well placement pattern and with an assessment of well interaction in various combinations of production and injection wells. The effect of hydraulic fracturing manifests itself non-uniformly in the operation of individual wells; therefore, it is necessary to consider not only the increase in production of each well after hydraulic fracturing, but also the influence of the mutual location of wells, the specific distribution of formation heterogeneity, the energy potential of the facility, etc. Such an analysis is only possible on the basis of three-dimensional mathematical modeling of the development of the formation, or of the facility as a whole, using an adequate geological and development model that identifies the features of the geological heterogeneity of the facility. With the help of a computer model of the development process, supplemented by an ANN for hydraulic fracturing, it is possible to assess the practicability of hydraulic fracturing of injection wells, the influence of hydraulic fracturing on oil extraction and on the rate of depletion of the formation reserves, to identify the need for repeated treatments, etc.
In the course of industrial implementation of hydraulic fracturing, a project document must be compiled beforehand, in which the technology of hydraulic fracturing is justified and reconciled with the overall system of deposit development. When performing hydraulic fracturing, it is necessary to carry out a complex of field studies at the first-priority wells to determine the location, direction, and dimensionless conductivity of the fractures, which makes it possible to adjust the technology to the peculiarities of each particular facility. Systematic designer supervision of the implementation of hydraulic fracturing is required, which makes it possible to make prompt decisions for increasing its efficiency.
In the Republic of Bashkortostan, where the studied field is located, hydraulic fracturing has come into use only relatively recently. For this reason, this method of increasing the oil extraction rate has not been well studied to date. This points to the relevance of this research and makes possible the further use of artificial neural networks for analyzing the efficiency of methods for increasing oil extraction at enterprises in this region.
It is necessary to calculate, using models, the coefficient of well production rate and the coefficient of increase in well production rate after hydraulic fracturing. The calculations are based on the information obtained for the wells of the Solonets deposit [17].
The initial sample is formed on the basis of the actual values of the productivity coefficient. Direct use of the results of hydraulic fracturing makes it possible to speed up the formation of the database and removes the need to interpret porosity and permeability [18, 19].

Fig. 3. Comparison of actual and estimated productivity coefficients
Figure 3 shows that the ANN model can provide a more accurate efficiency analysis for the performed ARM, since its error in calculating the productivity coefficient is less than that of the model based on the regression equation. The analysis of the above results shows that the built ANN is characterized by a high quality of training.
The error for artificial neural networks was 8.4%, which is an allowable value for practical calculations.
For the analysis of the technological efficiency of the measure, it is necessary to consider changes in the most valuable parameters of well operation: fluid rate, oil production rate, and water content in the extracted product before and after the measure. Using an ANN, let us forecast the daily fluid rate for the Solonets deposit and compare the actual data with the results obtained with the ANN. The ANN model for analyzing the efficiency of hydraulic fracturing was created in the Statistica program suite. The sample was formed from the technological indicators of the first three wells.
Table 2 shows the high accuracy of prediction with the use of an ANN. The average deviation of the values predicted by the ANN from the actual data is not more than 5%. This indicates that the predictions of the two-layer perceptron are close to the actual values. Much better results may be obtained with the additional use of the method of principal components (MPC) for processing the initial data. The method of principal components makes it possible to reduce the number of indicators required for an efficient representation of the data, keeping only the vectors with the greatest dispersion. After this transformation, the accuracy of the prediction models is expected to increase significantly.
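The idea of keeping only the direction of greatest dispersion can be illustrated with a short sketch that finds the first principal component by power iteration on the covariance matrix. Power iteration is chosen here only because it needs no external libraries; the function names and the tiny data set are hypothetical, and the study does not state how the MPC would be computed.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the column means and the unit vector of greatest dispersion."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            # Start vector orthogonal to all dispersion: stop (limitation of this sketch).
            break
        v = [x / norm for x in w]
    return means, v

def project(row, means, component):
    """Coordinate of one record along the principal component."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))

# Hypothetical records with two perfectly correlated indicators.
rows = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
means, pc1 = first_principal_component(rows)
scores = [project(r, means, pc1) for r in rows]
```

In this example the two indicators carry the same information, so a single score per record replaces both of them as network input.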

DISCUSSION
In recent years, attempts have been made to develop models for predicting the efficiency of geological and technical measures with the use of an ANN. The main disadvantage of the developed models is that they are not widely used, which is due to the specific geological and physical properties of various deposits and to the fact that the main characteristics of the formation are not used as key parameters. To understand the reason for the lower prediction error of an ANN, it is necessary to consider its principle of operation.
In order to build a multi-layer perceptron, it is necessary to choose its parameters.Most often, choosing the values of weights and thresholds requires training, i.e., changing weight coefficients and threshold levels one by one.
The algorithm for solving the prediction task is as follows:
1. Determine the meaning of the input vector components. The input vector should contain the formalized condition of the task, i.e., all the information necessary for getting the answer.
2. Select the output vector y so that its components contain the full solution to the problem in question.
3. Choose the type of non-linearity in the neurons (activation function). It is desirable to take into account the specifics of the problem, since a good choice will reduce the training time.
4. Choose the number of layers and the number of neurons in each layer.
5. Set the range for changes of inputs, outputs, weights, and threshold levels, taking into consideration the range of values of the chosen activation function.
6. Assign initial values to the weight coefficients, threshold levels, and additional parameters (for example, the steepness of the activation function, if it is adjusted during training). The initial values should not be too high, so that the neurons are not saturated (on the horizontal section of the activation function); otherwise the training will be very slow. The initial values should not be too low either, so that the output of the majority of neurons is not equal to zero; otherwise the training will also be slow.
7. Perform training, i.e., choose the parameters of the network so that the problem is solved in the best way. Upon completion of training, the network is ready to solve problems of the type for which it has been trained.
8. Send the conditions of the problem in the form of vector x to the input of the network. Calculate the output vector y, which will be the formalized solution to the problem.
The network consists of an arbitrary number of neuron layers. The neurons in each layer are connected with the neurons in the previous and subsequent layers according to the "everyone with everyone" principle. The first layer (left) is called the sensor, or input, layer; the inner layers are called hidden, or associative; the last layer (rightmost, consisting of a single neuron in the figure) is the output, or result, layer. The number of neurons in the layers can be arbitrary. Usually, all hidden layers contain an equal number of neurons [20].
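Steps 4, 6, and 8 and the "everyone with everyone" layout can be sketched as a forward pass through an arbitrary number of fully connected layers. The sketch assumes sigmoid neurons throughout and a small uniform range for the initial weights; the function names, the `scale` value, and the layer sizes are hypothetical.

```python
import math
import random

def sigmoid(s):
    """Logistic sigmoid, written to avoid overflow for large negative arguments."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def init_layers(sizes, scale=0.5, seed=0):
    """Create weights for sizes = [n_inputs, hidden sizes..., n_outputs].

    Each neuron gets one weight per neuron of the previous layer plus a
    threshold (bias) stored as the last entry. Initial values are small
    so that the neurons do not start out saturated.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        layers.append([[rng.uniform(-scale, scale) for _ in range(n_in + 1)]
                       for _ in range(n_out)])
    return layers

def forward(layers, x):
    """Pass the input vector x through every layer and return the output vector y."""
    signal = list(x)
    for layer in layers:
        signal = [sigmoid(sum(w * v for w, v in zip(row, signal)) + row[-1])
                  for row in layer]
    return signal

# Hypothetical topology: 3 inputs, two hidden layers of 4 neurons, 1 output.
layers = init_layers([3, 4, 4, 1])
y = forward(layers, [0.1, 0.2, 0.3])
```

Before training (step 7) the output is meaningless; the sketch only shows how the signal travels from the input layer to the output layer.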
Inside a cluster, the uniformity of the initial data is higher than in the whole sample; therefore, the accuracy of prediction is higher than that of the regression equation, which is built on the whole amount of data.
Analysis of operation efficiency is not limited to determining the instantaneous boost in well productivity.
Very often, within several months after the geological and technical measures, the well flow rate decreases, which is caused by a decrease in the well productivity coefficient. This is attributed to "PTA contamination, precipitation of salts, paraffins", etc. This is why it is very important to correctly predict further changes in the productivity coefficient after ARM [21-24]. For calculating the productivity coefficient, an ANN with multiple outputs is the most appropriate tool.
Figure 7. Changes in the productivity coefficient within several months after hydraulic fracturing
In this case, each output of the ANN gives the well productivity coefficient K_well.prod. in the corresponding time slot after the geological and technical measures. Figure 7 shows the comparison of the predicted ARM efficiency and the actual calculation.

CONCLUSION
From our study it follows that the proposed model of an artificial neural network for predicting the results of geological and technical measures is the most efficient one [25, 26]. The error of the calculation obtained with the mathematical model of the multi-layer perceptron was 8.4%.
In the course of creating the ANN model, the optimal parameters of the network for training on a data array about the strata of the Solonets deposit have been defined.
The main advantage of the artificial neural network model is the possibility to update the results of calculations when necessary, for example, in the future, when more detailed information on some geological parameters becomes available from additional research, discovery of new oil deposits, seismic surveys, refinement of the size and borders of the oil-saturated formation after drilling, etc. Performing this kind of update and refining the final ANN model is a less time- and labor-consuming task, and makes it possible to improve the quality of the predicted assessment of the final productivity coefficient. From this, it follows that an ANN model can be used for calculating the productivity coefficient in case of missing data, which is rather useful in planning geological and exploration works and programs for deployment and

Fig. 1. Diagram of an ANN consisting of a single neuron

Table 1. Technological efficiency of hydraulic fracturing

Table 2. Results of calculating liquid flow-rate at the wells of the Solonets deposit