Prediction of gas-lift performance using neural network analysis

Integrated gas lift system optimization plays an indispensable role in production and economics by maximizing the revenue from a gas lift field. This requires optimizing the gas lift parameters and finding the best tuning of the completion and surface production systems to keep pace with dynamic reservoir changes, while saving gas quantities and compression costs. Accordingly, a comprehensive study was carried out to measure the capability of Artificial Neural Networks (ANN) and Machine Learning (ML) in the optimization of gas lift parameters. The results show the power of two different neural network (NN) mechanisms, the Radial Basis Function (RBF) and the Back Propagation Function (BPF), to predict the three most important factors of the process (optimal gas injection rate, bottom hole pressure and flow rate), and compare the findings with conventional methods. In addition, this work provides three functional equations that can be applied to field data with no artificial intelligence (AI) expertise or software knowledge. This effort provides industrial insight into the role of data-driven computational models in the production recognition scheme, not only to validate well tests but also to reduce the uncertainties in production optimization. The work was completed by an economic analysis that illustrates the potential benefits of implementing irregular gas lift mechanisms in the field, so that the study rests on both technical and economic grounds.


Literature review
The inception of ANN dates to 1943, when Warren McCulloch and Walter Pitts [1] constructed the first computational model for a Neural Network (NN), based on mathematical algorithms called threshold logic, which was used for binary cases. This model paved the way for research to divide into two approaches: the first relied on biological processes, while the other relied on the application of NN to AI. Several studies have discussed the applications of ANN in the oil and gas industry. Elgibaly et al. (1998) [2][3] employed ANN to calculate optimal hydrate inhibition policies; Ghahfarokhi et al. (2018) used ANN to predict gas production; Khan et al. (2018) [4] used ANN to predict the optimum production rate; Luo et al. (2018) used ANN to optimize production in several fields; Nande (2018) used ANNs to reach the lowest error in predicting closure pressure for hydraulic fracturing analysis; Tariq (2018) [5] used ANNs to predict flowing bottom hole pressure; Bahaa (2018) [6] used ANN to forecast the fluid rate and bottom hole flowing pressure for gas-lifted oil wells; and Radhi (2020) [7] applied numerical simulation to optimize gas lift quantity using a genetic algorithm for a Middle East oil field.
In the field of mathematical models there are several functions for neural networks; this paper focuses on only two types, the backpropagation function network and the radial basis function network. Continuous and timely lift optimization on a well depends on knowledge of the operational conditions of the well and its associated reservoir parameters. As the target is to maximize the economic production value from a well at a given point in time, the operational expenditure can be inversely associated with production maximization. Generating set-point recommendations to optimize lift performance at timely intervals under this setup can be a significant challenge. Commonly, the gas lift injection rate set-point is decided by operators based on estimates from historical experience. The main target of gas lift optimization is to increase the current production from the well by establishing the relationship between the gas injection rate and the oil production rate. The main factor describing this relationship is the marginal increment in oil production rate per unit change in the gas lift injection rate. Too low a quantity of injected lift gas can result in production rate loss due to insufficient reduction in the gravitational head of the produced fluids. On the contrary, excessive gas injection rates lead to high frictional head and wellhead pressure, which put additional backpressure on the downhole formation and cause a significant loss in production (Radhi 2020) [8][9]. The major parameters controlling this balance include, but are not limited to: reservoir formation pressure or SBHP, Pi or inflow performance, WHP, depth of injection point, produced gas oil ratio (GOR), water cut %, API gravity, tubing size, and tubing roughness (such as friction factor). Some of these factors are not continuously measurable, and in many situations an estimate is provided based on old measurements or the properties of nearby wells.
The variation in well behavior further complicates the problem. In addition, the variation in production values, GOR, and WC % over a short evaluation span may arise because the daily production rates of an individual well are not physically measured but mathematically allocated. The trend in production values and associated factors might result from the natural decline of the well or from interventions such as a frac hit, workover or re-stimulation. Given all of this, an injection rate that is optimum at a given point in time may not be so at a later stage. This challenge leads to attempts to understand the underlying state and represent the well using physics-based models (Khan, 2020) [10]. In the present study, data from a total of 205 gas-lifted wells in more than 5 different fields were selected for training and testing the proposed model; the wells are located in the Western Desert, the Eastern Desert and an offshore field in Egypt. For each of these wells, a data set was gathered and used to predict three important output variables with conventional commercial software. These predictions were then compared with two different neural network models (radial basis and back propagation) built in MATLAB R2018a to predict the optimal values of gas injection quantity, bottom hole flowing pressure and well productivity. The ultimate targets of this approach are to provide a new way to speed up the calculations; to dispense with tedious and time-consuming software training; and to cut the cost of engineering software supply.

Data gathering, input variable quality selection and data used
The ANN adopted in this work depends primarily on actual well test data obtained from the test separator and from measuring devices installed on the mobile test package, flow lines and gas lift lines. In addition, because ANN can deal with limited, faulty or uncertain data, which is considered a proven advantage of ANN over conventional analytical methods, the following data were also used to run different models and reach the optimum results with satisfactory accuracy: downhole data obtained from static and flowing surveys using downhole memory gauges, production logging tools (PLT) run on electric line, and reservoir rock and fluid properties obtained from PVT and core laboratory analyses. The input data set was randomly divided into 70% for training, 15% for validation, and 15% for the primary test. Training data are used to improve the network according to their error. Validation data are used to evaluate network generalization and to stop training when generalization stops improving. Test data do not affect training, so they provide an independent measure of network performance during and after training. To effectively model the unique behavior of gas-lift wells, data such as the gas injection rate and depth of gas injection would have been required. None of the records of the gas-lift wells, in the compilation of more than 180 wells, provided these data. Owing to the insufficiency of comprehensive gas-lift well records for training and testing, a neural network that would predict the temperature profile in gas-lift wells was not developed. All data for gas and gas-lift wells were discarded. Furthermore, because of the presence of outliers and anomalies, the database was reduced to 50 wells from different fields. The data ranges, including minimum and maximum values of the input parameters assigned in the different ANN models, are listed in Table 1. Based on the oil API gravity, twenty-two wells were removed because of extremely low oil gravities ranging from 10 to 19 °API. In addition, to make the oil viscosities reported for the databases compatible, the viscosities in the database were neglected, because most of the wells have very close API values except for 3 or 4 wells. Moreover, after testing the importance of each input, it was noticed that API contributes less than the other inputs.
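The random 70/15/15 division described above can be illustrated with a minimal Python sketch. The function name `split_indices`, the fixed seed and the sample count of 2900 are illustrative assumptions; the original work performed the division inside MATLAB, and its exact routine is not stated here.

```python
import random

def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly partition sample indices into train/validation/test lists.

    Hypothetical helper: the fractions follow the text, everything else
    (name, seed, rounding rule) is an assumption for illustration.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains (about 15%)
    return train, val, test

# Illustrative sample count only.
train_idx, val_idx, test_idx = split_indices(2900)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the three lists are slices of one shuffled index list, no record can appear in more than one subset, which is what keeps the test data independent of training.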

The following criteria were used in selection of variables
A spread of values for the parameter must exist in the databases; this permits the NN to approximate the function more easily. The parameter must not depend only on other input variables. A parameter may depend on other input variables, but it must also depend on some parameter that is not an input variable. In this way, the variable supplies information about the well that is not already provided by the other variables. All the variables met these criteria except for water specific gravity, water flow rate, and dead oil viscosity. The water specific gravity was removed because the values reported for most of the wells were the same. Although the water flow rate varied for many of the wells, it was retained because it is an important parameter in describing the hydrodynamics of the system. Dead oil viscosity depends on oil °API gravity and temperature. However, since the dead oil viscosity for each well was determined at 100 °F, it depended on a single variable, the oil °API gravity. To observe the effect of each parameter on the predictive power of the network, two networks were developed: a forward and backward propagation network and a radial basis function network.

Selection of training samples and normalization of data
In this paper, more than 2900 real production test records of 16 elements were gathered, checked for inconsistencies and processed. Sixteen elements were chosen as inputs and 3 elements were selected as outputs, representing well flow rate (BPD), gas lift rate (MMSCFD) and bottom hole flowing pressure (PSIA). Before the data sets were fed to the networks, the values of the input variables were normalized by dividing each value by the maximum absolute value of the data, that is, $x_{\mathrm{norm}} = x / \max|x|$. Using this equation, every field in the data sets was normalized, reducing the range of the values from 0 to −13,000 to between 0 and +1. Note that the limits are the same as those of the output of the hidden layer tan-sigmoid transfer function.
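As a minimal sketch of the normalization step described above (division by the maximum absolute value), the following Python snippet scales one input column. The function names and the wellhead pressure values are hypothetical and serve only to illustrate the arithmetic.

```python
def normalize_by_max_abs(values):
    """Scale a list of numbers by its maximum absolute value."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0.0 for _ in values]  # all-zero column: nothing to scale
    return [v / max_abs for v in values]

def denormalize(scaled, max_abs):
    """Map scaled values back to engineering units."""
    return [s * max_abs for s in scaled]

# Hypothetical wellhead pressures in psia, for illustration only.
whp_psia = [250.0, 500.0, 1000.0]
scaled = normalize_by_max_abs(whp_psia)
print(scaled)  # [0.25, 0.5, 1.0]
```

The same scale factor has to be stored per variable so that network outputs can be converted back to engineering units with `denormalize`.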

Network training and validation
In this study, two different algorithms were applied to compare their performance. The first is the backpropagation learning algorithm, chosen because it was desirable to adjust the weights and biases based on the error signal arising from the output layer. The goal of this algorithm was to reduce the mean-squared error of the network outputs for all the input sets to a global minimum value. At the end of every epoch, the mean-squared error was calculated. Momentum was also added to the backpropagation algorithm to reduce the possibility of the learning process getting stuck in a local minimum. To further increase the speed of convergence of the mean-squared error to a global minimum, the Levenberg-Marquardt method was used to update the weights and biases for the forward and backward propagation network, and the Radbas function was used for the radial basis function NN. Traditionally, training is halted when the network error reaches the specified mean-squared-error performance goal or when the network has converged. To mitigate network over-training, the early stopping method was applied. This required monitoring the mean-squared error of both the training data set and the validation data set.
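The early stopping rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the training routine used in the study: the function name, the `patience` parameter and the validation error values are all hypothetical.

```python
def early_stopping_epoch(val_errors, patience=6):
    """Return (best_epoch, stop_epoch) for a sequence of validation MSEs.

    Training is treated as stopped once the validation error has failed to
    improve for `patience` consecutive epochs; the weights from best_epoch
    would be the ones to keep. `patience` is an assumed setting.
    """
    best_error = float("inf")
    best_epoch = -1
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_error, best_epoch, fails = err, epoch, 0
        else:
            fails += 1
            if fails >= patience:
                return best_epoch, epoch
    # Patience never exhausted: training ran to the last epoch.
    return best_epoch, len(val_errors) - 1

# Hypothetical validation MSE per epoch.
val_mse = [0.50, 0.30, 0.20, 0.21, 0.22, 0.25, 0.19, 0.20, 0.21, 0.22]
best, stopped = early_stopping_epoch(val_mse, patience=3)
print(best, stopped)  # 2 5
```

With a patience of 3, the run stops at epoch 5 and keeps epoch 2, even though epoch 6 would have been better; a larger patience trades longer training for a lower chance of stopping too early.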

Testing of models
The accuracy of the proposed neural network models (backward propagation and Radbas) was tested with different data from other fields by comparing three predicted parameters (fluid rate, gas injection and bottom hole flowing pressure profile) with actual test data from the mobile test package and with the output generated by the commercial software.

ANN proposed architecture design and training
To find the optimum network design, trial-and-error attempts were undertaken, starting with one hidden layer and going to two hidden layers. In the first trials, the number of hidden units was set approximately equal to twice the number of inputs. Hidden units were then gradually added. The maximum number of hidden units rarely needs to exceed four times the number of inputs. Each architecture was retrained at least 3 times (up to 10 times is recommended) with different initial weight randomizations, and only the best one was saved for comparison with other architectures. The optimum number of nodes in the hidden layer depends on the intricacy of the input and output mapping, the amount of noise in the data and the amount of training data available. If the number of nodes in the hidden layer is less than the optimum, the back-propagation algorithm will fail to reach a minimum during training. Conversely, too many nodes will cause the network to overfit the training data, resulting in poor generalization performance. The developed neural network contained 16 input variables, one hidden layer with 50 neurons and 3 output variables. The Log-Sigmoid function (logsig) was used as the transfer function in the hidden layer.

Applied workflow to obtain the optimum gas-lift rates
1-In the applied field, each well has its own data (WHP, FLP, FLT, Pr, WC%, GOR, injected volume, injected rate, etc.). A total of 16 inputs were used to build a model in the commercial software "PIPESIM" to predict the target data (bottom hole flowing pressure and flow rate at the optimum gas injection rate). These targets are essential for reallocating and optimizing the limited gas among the 52 wells in the field used for testing; the model was then applied to different fields for accuracy testing.
2-Two different ANN models (Radbas and feed-forward back propagation) were built and compared with the actual data and the commercial software output, to show the power of the computational program over the conventional software in solving problems that are hard to model analytically.
3-Usable empirical equations were developed for calculating the three desired outputs; they can be applied without AI software or expertise.
4-An economic analysis model was generated to calculate the commercial benefits of implementing the gas lift optimization technique in the field, and to be applied in several fields.

First approach: FB propagation with one hidden layer
Normalized data were used with one hidden layer of 50 neurons and the Log-Sigmoid activation function; the results are shown in Figures 1 and 2.

Third approach: Radial basis neural network
Normalized data were used with a spread parameter equal to one to reach zero error, one hidden layer of 50 neurons and the Radbas activation function; the results are shown in Figures 7 and 8.
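To make the role of the spread parameter concrete, the sketch below evaluates the hidden layer of a radial basis network for one input vector. It assumes the Gaussian form radbas(n) = exp(−n²) and the common convention that the neuron bias equals sqrt(ln 2)/spread, so that a neuron outputs 0.5 when the input lies one spread away from its center. The centers and the function names are hypothetical; the trained centers of the study are not reproduced here.

```python
import math

def radbas(n):
    """Gaussian radial basis transfer function: exp(-n^2)."""
    return math.exp(-n * n)

def rbf_hidden_layer(x, centers, spread=1.0):
    """Hidden-layer activations of an RBF network for one input vector x.

    Assumed convention: bias = sqrt(-ln 0.5) / spread (about 0.8326 / spread),
    which gives an activation of 0.5 at a distance of one spread.
    """
    bias = math.sqrt(-math.log(0.5)) / spread
    activations = []
    for c in centers:
        # Euclidean distance between the input and this neuron's center.
        dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
        activations.append(radbas(bias * dist))
    return activations

# Two hypothetical centers in a 2-input space (normalized units).
centers = [[0.0, 0.0], [1.0, 0.0]]
acts = rbf_hidden_layer([0.0, 0.0], centers, spread=1.0)
print(acts)
```

An input sitting exactly on a center fully activates that neuron, while a neuron one spread away responds at half strength; a larger spread makes each neuron respond over a wider region of the input space and so gives a smoother fit.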

Testing model performance with different artificial neural network models
Based on the trials carried out to reach the optimum ANN model, 60 points from different fields were used to validate and confirm the efficiency of the model in predicting the optimum well rate (BFPD), the optimum gas injection rate (MMSCF) and the bottom hole flowing pressure (psia); see Tables 3 and 4. Figures 14 and 15 show a close match between the predicted and actual results for the three targets.

Results and discussion
The extensive comparison between the commercial software Pipesim and the different ANN models, shown in Figures 10, 11 and 12, clearly indicates that the ANN models match the actual data well. In contrast, the Pipesim results are misaligned with the best-fit curve (the actual data curve). When applying Pipesim, different correlations were tried, and the Duns and Ros (modified) correlation was finally selected because its flow regime map is extended by the work of Gould et al. [11]. This includes a new transition region between bubble and slug flow, and an additional froth flow region at high flow rates. The holdup is considered as no-slip for froth flow and is interpolated over the bubble-slug transition; the other holdup relationships are as for the standard Duns and Ros [12]. Friction is calculated by the method proposed by Kleyweg. This uses a monophasic friction factor rather than a two-phase one, but involves the use of an average fluid velocity. This is claimed by Kleyweg to be a better method. Duns and Ros Modified gives the highest pressure drops in the slug flow regime for oil wells. As shown in Table 2, the forward and backward propagation models with two hidden layers produce the most accurate results, as they have the lowest APRE, AAPRE, MSE, RMSE and SD and the highest R² and R, which means the error of this method is the closest to zero. In contrast, the Pipesim models produce the least accurate results, as they have the highest APRE, AAPRE, MSE, RMSE and SD and the lowest R² and R, which means the error of this method is the furthest from zero.
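For readers who want to reproduce this kind of comparison, the following Python sketch computes several of the statistics named above. The definitions are assumptions based on common usage (APRE and AAPRE as the mean and mean absolute percent relative error, with the error taken as actual minus predicted); the paper's exact formulas are not reproduced in this section, and SD is omitted because its definition varies between authors. The sample values are hypothetical.

```python
import math

def error_metrics(actual, predicted):
    """Compute APRE, AAPRE, MSE, RMSE and R^2 under assumed definitions."""
    n = len(actual)
    # Percent relative error per point; requires non-zero actual values.
    rel = [(a - p) / a * 100.0 for a, p in zip(actual, predicted)]
    apre = sum(rel) / n                    # signed: shows systematic bias
    aapre = sum(abs(r) for r in rel) / n   # unsigned: shows typical size
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"APRE": apre, "AAPRE": aapre, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical well rates in BFPD, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
metrics = error_metrics(actual, predicted)
print(metrics)
```

APRE can be close to zero even for a poor model when over- and under-predictions cancel, which is why it is usually read together with AAPRE.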

Relative importance of input variables in the developed ANN models
The proposed study provides the contribution of all inputs by using the input and output weights of the hidden layers, a method presented by Garson (1991) and later applied by Goh (1995) [13][14] in different contexts. This process essentially requires extracting the input and output weights of the hidden layers. Garson's method was chosen because, among seven different methods applied to assess the relative importance of the variables, it gave more accurate results than the others and matched the results of MLR in terms of partial regression. In this paper, the relative importance of the sixteen inputs on the three target outputs was calculated from the weight matrix of the proposed optimized network using Garson's modified equation. The equation is as follows: (1) where I j represents the relative importance of the j th input on the output, N i refers to the number of input neurons, N h refers to the number of hidden neurons in the hidden layers, W ih are the connection weights between the input and the first hidden layers, and W ho are the connection weights between the second hidden layer and the output layer; subscripts k, n and m refer to input, hidden and output neurons, respectively. The table below shows the weights produced between the artificial neurons of the neural network model used in this study.
Figure 13. Workflow for weights produced between artificial neurons of the NN model [15].
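A minimal Python sketch of Garson's weight-partitioning idea, in its commonly cited form for a single hidden layer and a single output, is given below. It is an assumption that this form corresponds to the modified equation used in the study: each input's share of the absolute weights entering a hidden neuron is multiplied by the absolute weight leaving that neuron, summed over hidden neurons, and normalized over all inputs. The weights in the example are hypothetical.

```python
def garson_importance(w_ih, w_ho):
    """Relative importance of each input for one output (Garson-style).

    w_ih[k][m]: weight from input k to hidden neuron m.
    w_ho[m]:    weight from hidden neuron m to the output.
    Returns fractions that sum to 1.
    """
    n_inputs = len(w_ih)
    n_hidden = len(w_ho)
    contrib = [0.0] * n_inputs
    for m in range(n_hidden):
        col_sum = sum(abs(w_ih[k][m]) for k in range(n_inputs))
        if col_sum == 0.0:
            continue  # this hidden neuron receives no signal
        for k in range(n_inputs):
            # Share of input k in neuron m, scaled by the neuron's output weight.
            contrib[k] += abs(w_ih[k][m]) / col_sum * abs(w_ho[m])
    total = sum(contrib)
    return [c / total for c in contrib]

# Hypothetical 2-input, 2-hidden-neuron network.
w_ih = [[1.0, -2.0],
        [3.0, 2.0]]
w_ho = [0.5, -1.0]
importance = garson_importance(w_ih, w_ho)
print(importance)  # approximately [0.4167, 0.5833]
```

Because only absolute values are used, the result ranks how strongly each input influences the output but says nothing about the direction of that influence.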

Figure 14. Input data importance to the target.
Table 6 lists the relative importance of the various input parameters for the oil rate output and the gas lift rate output. In general, water cut has the greatest impact on oil rate and gas lift rate prediction, followed by net pay thickness (or producing interval). Most of the parameters have almost equal importance, in the average range of 4% to 8%. Separator pressure and flow line pressure have the least impact on the output parameters in the developed ANN model.
The results of the Garson calculation demonstrate the following points:
- Water cut has the greatest impact on the targets: well rate, optimum gas injection rate and bottom hole flowing pressure. As water cut increases in a well, the total pressure gradient in the well increases because of the higher liquid density (water is heavier than oil). This decreases the well rate and requires a higher gas lift rate to restore the oil production rate obtained with the previous gas injection value.

[Table residue: relative importance (IMP) columns GL IMP, Gross Fluid IMP, Pwf IMP, AVG]
- Well depth and downhole temperature, along with well permeability, come in second place in relative importance. As the well depth increases, the gas injection required to lift the liquid increases, and the downhole temperature has a demonstrated impact on well fluid mobility. Furthermore, permeability shows about the same importance as well depth and downhole temperature, around 7%; permeability values ranging between 200 and 800 md have a great impact on well fluid rate (9% relative importance).
- The rest of the parameters, related to wellbore and reservoir fluid properties, have almost the same impact on the desired targets: well rate, optimum gas injection and bottom hole flowing pressure.
- The least impact comes from separator pressure and flow line pressure, which have a minor influence on the targets and could therefore be neglected in ANN models.

Development of simplified formula for calculating well flow rate using ANN
In addition to the prediction of the well flow rate through ANN in MATLAB, a usable empirical model can be applied without AI programs or technical expertise. This equation is generated from a group of biases and weights that relate the layers from the input to the output. The weights linking the input and hidden layers are denoted w1, whereas those linking the hidden and output layers are denoted w2; the hidden and output layer biases are b1 and b2, respectively. The novel empirical correlation developed by this AI technique to estimate any of the three outputs in gas lift oil wells is given by the equation below: (2) where the left-hand side is the output from the second layer, a vector containing the three main outputs (gross fluid, produced gas and Pwf). The term w2 represents the weights of the second layer, a matrix with 3 rows corresponding to the number of outputs and 50 columns corresponding to the number of neurons. Similarly, w1 is a matrix with 50 rows and 16 columns, corresponding to the number of neurons and inputs respectively; this term refers to the weights of the first layer. The term P is the input vector, and b1 and b2 are the biases of the first and second layers. First, the dataset is organized in a column vector of 16 values, denoted in the equation by the term "P". Since the network has two weight layers, subscripts 1 and 2 denote them. In the first hidden layer, the number of neurons is 50, and each neuron is connected to all input data by the weights. The weights are therefore organized in a matrix with 50 rows, corresponding to the number of neurons, and 16 columns, corresponding to the number of inputs. The bias is organized in a column vector of 50 values, one per neuron; each is a numeric constant used to adjust the output value.
The value of each neuron in this layer is the sum of the products of the weights and the input data, plus the bias. The output of this calculation is a vector of 50 values for the 50 neurons. The well flow rate formula is accurate only within the range of the variable data used to derive it. For this reason, many researchers have tried to derive a unique production estimation formula for a particular reservoir or layer. In this paper, a global well rate formula is generated and validated with a set of unfiltered measured data points. Besides the inherent errors in measurement devices (pressure gauges, flow rate meters) and in the calculation procedures for liquids, human error is one of the important sources of error in field operation. Examination of the above-mentioned formula reveals that a −20% error in the reported input data would cause an error of about ±14% in the estimated flow rate. Also, an error of just 9% in choke size diameter would result in more than 7% error in rate.
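The calculation described above (weighted sum plus bias in the hidden layer, followed by the output layer) can be sketched in Python as follows. The Log-Sigmoid hidden transfer function follows the architecture section; the linear output layer, the function names and the placeholder weights are assumptions, since the trained weights and the exact form of Eq. (2) are not reproduced here.

```python
import math

def logsig(n):
    """Log-sigmoid transfer function, written to avoid overflow."""
    if n >= 0:
        return 1.0 / (1.0 + math.exp(-n))
    e = math.exp(n)
    return e / (1.0 + e)

def ann_forward(p, w1, b1, w2, b2):
    """Forward pass: hidden = logsig(w1 @ p + b1), output = w2 @ hidden + b2.

    p:  input vector (16 normalized values in the study)
    w1: hidden-layer weights, one row per hidden neuron (50 x 16)
    b1: hidden-layer biases (50 values)
    w2: output-layer weights, one row per output (3 x 50)
    b2: output-layer biases (3 values)
    The linear output layer is an assumption.
    """
    hidden = [logsig(sum(w * x for w, x in zip(row, p)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]

# Placeholder weights with the dimensions stated in the text (not trained values).
n_in, n_hid, n_out = 16, 50, 3
p = [0.5] * n_in
w1 = [[0.0] * n_in for _ in range(n_hid)]
b1 = [0.0] * n_hid
w2 = [[0.02] * n_hid for _ in range(n_out)]
b2 = [0.0] * n_out
outputs = ann_forward(p, w1, b1, w2, b2)
print(outputs)  # each output is about 0.5 with these placeholder weights
```

With real trained weights, the inputs would first be normalized in the same way as the training data, and the three outputs would be converted back to engineering units afterwards.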

Economic analysis
An economic study was carried out to evaluate the commercial results of implementing the gas lift optimization technique in the field. This analysis is performed using the equation adopted from Huh et al. (2010) and Nakashima and Camponogara (2006) [16], as shown below. The analysis is based on some assumptions, namely that the oil and gas prices are $55/STB and $5,500/MMscf, respectively.
The calculated daily production of oil, water and gas, the gas injection rate, and the net profit for each well are presented in Figures 15 and 16 for naturally flowing wells and gas lift wells, respectively. The daily net profit with and without gas lift and the overall gain in daily oil production rate for 60 wells are compared in Figure 17. As can be seen in the figures, the gas lift technique can substantially increase the daily oil production rate of every well in the field, and thus the net profit, compared with naturally producing wells.
Figure 17. Comparison of daily net profit with and without gas lift and oil production gain for 60 wells.

Comparison summary with previous work
Several studies have compared the power of artificial neural networks with empirical correlations for gas lift parameters; the table below summarizes the results reached in those studies. The results of this paper compare favorably with those of the other papers in terms of AAPE, MSE and regression, owing to the large actual dataset used, the different normalization technique applied to reach the optimum input data range, and the different neural network functional models.

Conclusions and recommendations
This study primarily explores the feasibility of implementing an ANN-based optimization technique in numerical modeling to optimize gas lift wells on a daily basis in a large field with a complex network system. Accordingly, the ANN technique is used to optimize the allocation of the continuous gas lift injection rate for 60 wells. The principles of ANN and the mathematical model, including the workflow for performing the simulation studies, are comprehensively discussed in this paper. Sensitivity studies and a sample economic analysis were also performed to gain insight into the benefit of implementing gas lift techniques for depletion drive reservoirs, especially in the event of increasing water cut and very low reservoir pressure.
The ANN technique appears to be efficient and able to model a large number of wells producing concurrently in a network system, optimally allocating the gas injection rate to maximize the oil production rate while maintaining the optimum bottom hole flowing pressure.
The gas lift technique is found to be more beneficial for wells with relatively higher water cut.
In the event of reservoir pressure depletion, gas lift appears to be beneficial not only for improving well production performance but also for extending the field life cycle, by allowing the well to continue production even at a very low flowing wellbore pressure (P wf ) at its given minimum well head pressure. In this work, various training algorithms for BP and FP networks and RBF networks were selected to test the prediction of three important parameters in the well life cycle. Based on the study carried out, the following conclusions were reached:
1-The first approach was a feed forward and backward propagation network, tested with different numbers of neurons, one or two hidden layers and different training processes (steepest descent, Levenberg-Marquardt and Bayesian) to get the best results. The second approach was a radial basis network. Each system has advantages and disadvantages compared with the other, and in both cases the results were quite satisfactory.
2-RBFs can be trained much faster than perceptrons and achieved a lower mean square error with a high number of neurons.
3-After applying test data from the different fields, the fastest training and the best testing error were achieved with the BP and FP neural network, as shown by the statistical analysis against the Pipesim models, with accuracy up to 91%.
4-Stability is achieved by the RBF network and by the network trained with the Levenberg-Marquardt algorithm, in contrast to the networks trained with the Bayesian and steepest descent algorithms.
5-The experimental results indicate a strong match between model predictions and observed values, since the MSE is 0.0012. When the performance results are compared, it was concluded that the RBFNN-based model is a more reliable predictor, with an MSE of 0.003 and an ARPE of 8.2. Therefore, the smallest MSE value indicates a credible method in terms of accuracy, while the RBF findings indicate the best proposed model for analyzing the output.
6-It has been shown that a minimum number of inputs is required to achieve an accurate model, and any further decrease in the number of inputs would increase the MSE and reduce the regression value, which affects the whole model performance.
7-Using the Garson algorithm, a sensitivity analysis was performed, and the input data with the highest impact on the model output, together with the way they are distributed, were determined. With this process, the trial-and-error steps in the design process can be reduced and the most important parameters can be identified.
8-A comprehensive comparison with earlier studies showed that this work presents more reliable results, owing to the use of a large actual dataset, pre-processing of the data with different normalization methods, the application of two different neural network functions, and the generation of a new empirical equation.
9-A new empirical model is proposed for estimating the three outputs; it can be applied to any dataset provided that the input parameters are within the range of the model. This does not require coding expertise or the use of complicated software.
10-The sample economic analysis demonstrates that the gas lift technique can substantially increase the daily oil production rate of every well in the field, and thus the net profit, compared with naturally producing wells.