Initial-Productivity Prediction Method of Oil Wells for Low-Permeability Reservoirs Based on PSO-ELM Algorithm

Abstract: Conventional numerical solutions and empirical formulae for predicting the initial productivity of oil wells in low-permeability reservoirs are limited to specific reservoirs and relatively simple scenarios. Moreover, they consider only a few influencing factors and rely on idealized application models. A productivity prediction method based on machine learning algorithms is established to overcome the limited applicability and incomplete coverage of traditional mathematical models for productivity prediction. A comprehensive analysis was conducted on the JY extra-low-permeability oilfield, considering its geological structure and the various factors that may affect its extraction and production. The study collected 13 factors that influence the initial productivity of 181 wells. The Spearman correlation coefficient, the ReliefF feature selection algorithm, and the random forest selection algorithm were used in combination to rank the importance of these factors, and seven main controlling factors were screened out. The particle swarm optimization–extreme learning machine (PSO-ELM) algorithm was adopted to construct the initial-productivity model. The main controlling factors and the known initial productivity of 127 wells were used to train the model, which was then used to verify the initial productivity of the remaining 54 wells. For the PSO-ELM model, the root-mean-square error (RMSE) is 0.035 and the coefficient of determination (R²) is 0.905. The PSO-ELM algorithm therefore achieves high accuracy and fast computation in predicting the initial productivity. This approach provides new insights into initial-productivity prediction and contributes to the efficient production of low-permeability reservoirs.


Introduction
The initial-productivity prediction of low-permeability reservoirs is an important fundamental task in the initial stage of reservoir exploration and development. This work provides the basis for development dynamics analysis, well optimization strategy planning, and reserve estimation. Recently, many scholars have proposed different methods to predict initial productivity, including mathematical models [1][2][3], numerical simulation methods [4], and drill-stem testing [5,6].
The accurate prediction of well productivity plays a pivotal role in enhancing the oil recovery of reservoirs. All data-driven productivity prediction models revolve around feature selection and the forecasting model. Firstly, we consider feature selection. There are many factors affecting the initial productivity. Most researchers studying the main controlling factors primarily focus on geological factors and dynamic development factors [7][8][9][10][11]. Wenli Ma et al. proposed the Pearson maximum information coefficient correlation synthesis analysis method to identify 13 main control factors for the initial shale gas productivity [12]. Hao Chen et al. used a combination of Pearson's coefficient, Spearman's coefficient, and Kendall's coefficient methods to optimize the main control factors [13]. Rastogi et al. used the SelectKBest feature selection, tree regression, Pearson's correlation coefficient, recursive feature elimination, and correlation feature selection algorithms to select the main control factors for the impact of hydraulic fracturing chemicals on unconventional reservoirs' productivity [14]. Zhao Wang et al. established the XGBoost linear regression prediction model to predict the initial productivity and evaluate the main controlling factors [15]. The above scholars used feature selection algorithms, screening and ranking the influencing factors to finally determine the main controlling factors affecting productivity [16][17][18][19]. However, the majority of current studies on the main control factors solely emphasize geological factors and dynamic development factors, overlooking the influence of the non-numerical variables among engineering factors on productivity. Previous research has primarily focused on using algorithms to analyze the correlation between input and output data, but the results often lack practical application explanations.
In contrast, the author of this study utilized a combination of factor selection algorithms, reservoir engineering knowledge, and production experience to screen for the main controlling factors. This approach resulted in selected factors that are more aligned with practical applications, rather than relying solely on a single algorithm for selection.
Multiple machine learning algorithms have been applied to build productivity prediction models, such as neural networks, support vector machines, random forests, and Bayesian networks. Yintao Dong et al. constructed the XGBoost algorithm without physical constraints to reduce the relative error and achieve a highly accurate initial productivity [20]. Yapeng Tian et al. used a genetic algorithm to optimize the weights and thresholds of a neural network to improve the accuracy of predicting the initial productivity of shale gas [21]. Hao Chen et al. applied the support vector machine to predict the initial productivity of horizontal wells' volumetric fracturing in tight reservoirs [13]. Hui-Hai Liu incorporated physics into an ML model predicting well productivity. On the basis of LSTM and DNN neural networks, DongXiao Hu developed a novel fitting function-neural network synergistic dynamic productivity prediction model for shale gas wells [22]. An analysis of the aforementioned prediction model studies makes it evident that the majority of scholars have employed machine learning algorithms for data mining in order to predict the initial productivity of reservoirs. Additionally, they have utilized diverse optimization algorithms to enhance the accuracy of these productivity models [23][24][25][26][27]. In contrast, the author proposes an initial-productivity prediction model that is aligned with production in the research area. The model has been compared with various other models and has shown high prediction accuracy, fast running speed, and strong robustness.
In summary, in order to predict the productivity of low-permeability reservoirs, a comprehensive approach combining the Spearman correlation coefficient, random forest, and ReliefF feature selection algorithms is employed. This method ranks 13 influencing factors covering three aspects: geological factors, engineering factors, and dynamic development factors. Seven main controlling factors are identified by combining reservoir engineering theory with the importance ranking. There is a complex nonlinear relationship between the seven main controlling factors and the productivity of the well. To better predict the initial production capacity of oil wells, the extreme learning machine algorithm is introduced, as it handles nonlinear prediction regression problems well. However, since the initial values of the extreme learning machine are generated randomly, the prediction accuracy is affected. To reduce this error, the author uses the particle swarm optimization algorithm to optimize the input weights and thresholds of the extreme learning machine. This model aims to facilitate the practical application of initial-productivity assessment, providing valuable insights for reservoir evaluation. In this paper, the author introduces an innovative approach by employing the PSO-ELM algorithm for predicting the initial productivity of low-permeability reservoirs. The main works of this research are as follows: Firstly, Section 2 describes the selection of characteristic factors. Secondly, Section 3 introduces the main theory of the main controlling factors' selection. Thirdly, Section 4 states the main theory of establishing the initial-productivity prediction model. Finally, Section 5 concludes the work of the paper. This initial-productivity prediction model was implemented in MATLAB (Matrix Laboratory).

Selection of Characteristic Factors
This study focuses on identifying the key factors that affect the initial productivity of low-permeability reservoirs. The factors are categorized into three aspects: geological factors, engineering factors, and dynamic development factors. The selection of these factors is based on the original data collected from actual oilfields, and a combination of production experience and reservoir engineering knowledge is used to screen and identify the most significant factors. Five geological factors are selected as indicators of low-permeability reservoirs: porosity, permeability of the oil formation, initial oil saturation, coefficient of variation of stratigraphic permeability, and formation permeability grade difference. Additionally, four engineering factors are considered: reservoir shot thickness, fracturing fluid sand content ratio, fracturing fluid discharge, and reservoir modification method. Finally, four dynamic development factors are selected: water content, production pressure difference, pumping depth, and dynamic fluid surface depth.
Based on the initial-productivity data collected from 181 wells in the low-permeability oilfield, specifically focusing on the first 60 days, the aforementioned 13 characteristic factors were carefully selected and collated. This resulted in a comprehensive data set of 181 instances, which serves as a valuable foundation for conducting the initial-productivity analysis. Table 1 displays the distribution range of the base data used for the prediction model. To ensure consistency, non-numerical variables in the engineering factors were transformed using the label-encoding numbering process, resulting in corresponding numerical variables. This allowed us to obtain the values for each individual well.
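As an illustration of the label-encoding step for the non-numerical engineering factors (such as the reservoir modification method), the following minimal Python sketch shows how categories can be mapped to integer codes. The category names are invented for the example, and the paper's actual implementation was in MATLAB.

```python
# Minimal label-encoding sketch for a non-numerical engineering factor.
# The category names below are illustrative, not the paper's actual labels.
def label_encode(values):
    """Map each distinct category to an integer code (order of first appearance)."""
    codes, mapping = [], {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
        codes.append(mapping[v])
    return codes, mapping

methods = ["single-stage fracturing", "multi-stage fracturing",
           "single-stage fracturing", "acidizing"]
codes, mapping = label_encode(methods)
print(codes)  # [0, 1, 0, 2]
```

Each well's categorical value is thereby replaced by a numerical code that the feature selection algorithms and the prediction model can consume directly.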

Selection and Analysis of Main Control Factors
This study utilizes a combined feature selection algorithm to identify the main controlling factors that affect low-permeability reservoirs. Using the main controlling factors and removing factors irrelevant to productivity avoids over-fitting and reduces the quantitative calculation burden. The 13 characteristic factors mentioned above are used as input-layer data for the initial-productivity forecast. Finally, a reasonable ranking was carried out based on the correlation between each characteristic factor and the initial productivity.

Spearman's Correlation Coefficient Method
The Spearman correlation coefficient (SCC) is used as a method of estimating the correlation between two variables [28][29][30]. The correlation between the variables is reflected through the difference of the corresponding series of two pairs of grades. The closer the Spearman correlation coefficient is to +1 or −1, the stronger the correlation is between the two variables.
The Spearman correlation coefficient is calculated as

ρ = 1 − 6 Σ_{i=1..n} d_i² / (n(n² − 1))

where d_i is the difference between the ranks of x_i and y_i, the values of the characteristic factors x and y for the i-th sample, respectively; and n is the number of samples.
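As a concrete illustration, the coefficient can be computed from ranks as in the sketch below (Python rather than the paper's MATLAB; ties are handled with average ranks, and the simple d²-based formula is exact only in the absence of ties):

```python
def rank(data):
    """Average ranks (1-based); tied values share the mean rank."""
    order = sorted(range(len(data)), key=lambda i: data[i])
    ranks = [0.0] * len(data)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and data[order[j + 1]] == data[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """rho = 1 - 6*sum(d_i^2) / (n*(n^2 - 1)), d_i = rank difference."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

A perfectly monotone increasing pair gives +1 and a perfectly reversed pair gives −1, matching the interpretation above.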

ReliefF Feature Selection Algorithm
The ReliefF feature selection algorithm utilizes the features of the samples for learning and training. It begins by randomly selecting one sample from the training data set D of the oil well. The distance between this sample and the other samples is used to determine the weight of each feature factor. The algorithm then continuously searches for the nearest-neighbor samples to update the weight of the feature factor [31][32][33][34]. Finally, the features with the highest weights are selected as the main control factors.
Consider the set of samples S = {S_1, S_2, ..., S_m}; each sample contains p features, s_i = (s_i1, s_i2, ..., s_ip), with 1 ≤ i ≤ m. The values of the features are nominal or numerical. If the features of a sample R are nominal, the label-encoding numbering process is applied to obtain a numerical type; if they are numerical, the formula is used directly. The difference between two samples s_i and s_j (1 ≤ i ≠ j ≤ m) on a feature t (1 ≤ t ≤ p) chosen randomly in the training set D is defined as

diff(t, s_i, s_j) = |s_it − s_jt| / (max_t − min_t)

where max_t and min_t are the maximum and minimum values of feature t, respectively. A sample s_i is randomly selected from the sample set D and taken as the centre. Then, k nearest-neighbor samples are selected: k near hits from the samples of the same class and k near misses from the samples of the other classes. The weight W(A) of feature A is then updated as

W(A) = W(A) − Σ_{j=1..k} diff(A, s_i, H_j) / (mk) + Σ_{j=1..k} diff(A, s_i, M_j) / (mk)

where the number of sampling iterations is m; the number of nearest-neighbor samples is k; H_j represents the j-th nearest sample of the same class (near hit); and M_j represents the j-th nearest sample of a different class (near miss).
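A simplified Relief-style update along these lines can be sketched as follows. This illustrative Python version assumes two classes and k = 1 (a single nearest hit and nearest miss per iteration), whereas the full ReliefF averages over k neighbors and multiple classes:

```python
import random

def diff(t, a, b, span):
    """Normalized feature difference |a_t - b_t| / (max_t - min_t)."""
    return abs(a[t] - b[t]) / span[t] if span[t] else 0.0

def relief(samples, labels, n_iter=50, seed=0):
    """Simplified Relief: two classes, k = 1 nearest hit/miss."""
    rng = random.Random(seed)
    p = len(samples[0])
    span = [max(s[t] for s in samples) - min(s[t] for s in samples)
            for t in range(p)]
    w = [0.0] * p
    dist = lambda a, b: sum(diff(t, a, b, span) for t in range(p))
    for _ in range(n_iter):
        i = rng.randrange(len(samples))
        r = samples[i]
        hits   = [s for j, s in enumerate(samples) if labels[j] == labels[i] and j != i]
        misses = [s for j, s in enumerate(samples) if labels[j] != labels[i]]
        h = min(hits, key=lambda s: dist(r, s))    # nearest hit
        m = min(misses, key=lambda s: dist(r, s))  # nearest miss
        for t in range(p):
            # reward features that separate classes, penalize those that do not
            w[t] += (diff(t, r, m, span) - diff(t, r, h, span)) / n_iter
    return w
```

A feature that discriminates between the two classes accumulates a larger weight than an uninformative one, which is exactly the ranking signal used above.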

Random Forest Selection Algorithm
A random forest is created by combining multiple decision trees in a random manner [35][36][37][38]. The regression results of the decision trees are then used to make predictions. The algorithm determines the relative importance of the characteristic factors in predicting the target variable by calculating the out-of-bag error rate.
The importance of feature X_i is calculated as

I(X_i) = (1/N) Σ_{i=1..N} (errOOB_2i − errOOB_1i)

where errOOB_2i is the out-of-bag data error of the i-th decision tree after adding random noise to the feature, errOOB_1i is the corresponding out-of-bag data error of the i-th decision tree before the noise is added, and N is the number of decision trees selected to calculate the out-of-bag data error.
The value of the out-of-bag data error is an indicator of the importance of a feature. If the accuracy of the out-of-bag data decreases significantly after adding random noise (errOOB 2 increases significantly), this suggests that the feature has a significant impact on the prediction results of the sample, thereby indicating a relatively high level of importance.
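The errOOB_2 − errOOB_1 idea can be illustrated with a permutation test: shuffle one feature column (the "random noise"), re-measure the error, and take the increase as that feature's importance. The Python sketch below applies this to a held-out set rather than true out-of-bag samples, and the one-feature model used in the example is purely hypothetical:

```python
import random

def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

def permutation_importance(model, X, y, seed=0):
    """Importance of feature t = error after shuffling column t minus the
    baseline error (the errOOB2 - errOOB1 idea, applied to a held-out set
    rather than true out-of-bag samples)."""
    rng = random.Random(seed)
    base = mse([model(row) for row in X], y)           # errOOB1 analogue
    scores = []
    for t in range(len(X[0])):
        col = [row[t] for row in X]
        rng.shuffle(col)                               # inject "random noise"
        Xp = [row[:t] + [col[i]] + row[t + 1:] for i, row in enumerate(X)]
        scores.append(mse([model(r) for r in Xp], y) - base)  # errOOB2 - errOOB1
    return scores
```

A feature the model actually relies on produces a large error increase when shuffled; an irrelevant feature produces essentially none.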

Analysis and Determination of the Main Control Factors
This study employed a combination of Spearman's correlation coefficient, the ReliefF feature selection algorithm, and the random forest selection algorithm to identify the main controlling factors among thirteen feature factors, including geological factors, engineering factors, and dynamic development factors. The calculated weights were comprehensively ranked and presented in Figures 1-3. The results show that seven factors were selected as the main controlling factors, as presented in Table 2.

Our study utilizes a combination feature selection algorithm to identify seven main controlling factors, as shown in Figures 1-3. The RF algorithm removes factors to calculate the corresponding out-of-bag data error, which differs from the other two methods. Although there are some numerical differences in the importance ranks produced by the three methods, the results of their importance evaluations converge.
From this result, the importance rank of the geological features is: porosity > permeability > stratigraphic permeability grade difference > coefficient of variation of stratigraphic permeability > initial oil saturation (Figure 1). The ranking of the engineering factors is: sand ratio in fracturing fluid > fracturing fluid discharge > reservoir shot-open thickness > reservoir modification method (Figure 2). In terms of dynamic development factors, the importance ranking is: pump depth > production differential pressure > depth of dynamic fluid level > water content (Figure 3).
The study identified seven main controlling factors. Firstly, there are geological factors such as porosity, permeability of the oil formation, and the stratigraphic permeability grade difference. Engineering factors such as the sand ratio in fracturing fluid and fracturing fluid discharge were also found to be significant. Additionally, dynamic development factors such as production differential pressure and pump depth were identified as important considerations.

Overview of the Algorithm
The ELM algorithm is a type of single-hidden-layer feedforward neural network (SLFN) that exhibits high operational efficiency, accuracy, and strong generalization performance with few training parameters. In contrast to traditional neural networks, the algorithm randomly determines the weight vector W and threshold matrix b for the hidden layer, with only the number of neurons in the hidden layer being specified [39][40][41][42][43]. During the execution of the algorithm, there is no need to adjust the values of W and b. The predicted target value can be approximated by the excitation function g(x), which is infinitely differentiable on any interval. Figure 4 shows the network structure of the ELM model.

Mathematical Models
The input matrix X corresponds to the n neurons in the input layer of the ELM algorithm, X = [x_i1, x_i2, ..., x_in]^T ∈ R^n. The output matrix corresponds to the m neurons in the output layer, T = [t_i1, t_i2, ..., t_im]^T ∈ R^m. There are l neurons in the hidden layer. With the activation function g(x), the network is modeled as

Σ_{i=1..l} β_i g(W_i · x_j + b_i) = t_j,  j = 1, 2, ..., N

where W_i = [W_i1, W_i2, ..., W_in]^T is the input weight vector connecting the input nodes to the i-th hidden-layer neuron, b_i is the threshold of the i-th hidden-layer neuron, and β_i = [β_i1, β_i2, ..., β_im]^T is the output weight vector connecting the i-th hidden-layer neuron to the output nodes.
H is the output matrix of the ELM hidden layer and can be expressed as

H = [g(W_i · x_j + b_i)],  j = 1, ..., N, i = 1, ..., l

so that the network can be written compactly as Hβ = T', where T' is the transpose of the matrix T.
The target output of single-hidden-layer neural network learning is zero error; that is, the network output approaches the test samples infinitely closely, which can be expressed as

Σ_{j=1..N} ||y_j − t_j|| = 0

where y_j is the network output for the j-th sample. We train the single-hidden-layer neural network with a large number of sample data. When the activation function g(x) is infinitely differentiable, the weights W_i and thresholds b_i from the input layer to the hidden layer are fixed at their random values, and training reduces to solving for β. The error model of the ELM can be obtained as

||Hβ − T'|| < ε

where ε is the error value of the ELM algorithm. When the error is less than the preset error value, the output weight β can be calculated as β = H*T'. According to the least-squares criterion, H* is the Moore-Penrose generalized inverse of the hidden-layer output matrix H.
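Assuming a sigmoid excitation function, the closed-form training step β = H*T' can be sketched in Python with NumPy, where np.linalg.pinv computes the Moore-Penrose generalized inverse. This is an illustrative sketch, not the paper's MATLAB implementation:

```python
import numpy as np

def elm_train(X, T, l, seed=0):
    """ELM training sketch: W and b are drawn at random and never tuned;
    only the output weights beta are solved, via beta = pinv(H) @ T."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W = rng.uniform(-1, 1, (l, n))               # random input weights
    b = rng.uniform(-1, 1, l)                    # random hidden thresholds
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))     # sigmoid activation g(x)
    beta = np.linalg.pinv(H) @ T                 # Moore-Penrose solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return H @ beta
```

Because β is obtained in one linear-algebra step instead of iterative back-propagation, training is very fast, which is the efficiency advantage claimed for the ELM above.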

Overview of the Algorithm
The particle swarm optimization algorithm simulates the foraging behavior of a flock of birds, where individuals within the group share and exchange information to continuously iterate and search for the optimal particle. This search process involves two attributes: the particle's velocity and position [44,45].

Mathematical Models for Particle Swarm Optimization Algorithms
Suppose there are n particles in a D-dimensional space. The position of the i-th particle (1 ≤ i ≤ n) can be represented as X_i = [X_i1, X_i2, ..., X_iD]^T and its velocity as V_i = [V_i1, V_i2, ..., V_iD]^T. The root-mean-square error (RMSE) is a measure of the deviation between the initial-productivity prediction and the actual tested initial productivity. It is used as the fitness function to calculate the fitness value of each particle. A smaller RMSE value indicates a smaller deviation of the initial-productivity prediction model and a higher prediction accuracy; therefore, the particle with the smallest RMSE value is considered the best. The current best value of the i-th particle, the individual extreme value, can be expressed as P_best_i = [P_best_i1, P_best_i2, ..., P_best_iD]^T. The current best value of the population, the global extreme value, can be expressed as G_best_i = [G_best_i1, G_best_i2, ..., G_best_iD]^T.
ELM training outputs the root-mean-square error (RMSE) as the fitness value of PSO.
RMSE = sqrt((1/n) Σ_{i=1..n} (y_real,i − y_i)²)

where y_real,i and y_i are the desired sample output and the actual predicted value of the model, respectively. As the fitness values of the PSO are continuously calculated, the two extreme values P_best_i and G_best_i are updated. Each particle's velocity and position are continuously updated through Equations (10) and (11), as follows:

V_i^(k+1) = w V_i^k + C_1 r_1 (P_best_i^k − X_i^k) + C_2 r_2 (G_best_i^k − X_i^k)  (10)

X_i^(k+1) = X_i^k + V_i^(k+1)  (11)

where V_i^k and X_i^k are the velocity and position of the i-th particle at iteration k, and P_best_i^k and G_best_i^k are the individual and global extreme values at iteration k.
w represents the inertia weight balancing the individual-extreme-value search ability and the global search ability; C_1 and C_2 are learning factors that reflect the importance of the individual and global extreme values; r_1 and r_2 are random numbers between 0 and 1. The basic flow of the particle swarm algorithm is as follows: In the first step, the initial parameters are set, including the population size, dimensionality, initial speed and position of each particle, the number of iterations, and the error tolerance.
In the second step, the fitness function is evaluated for the optimization problem, and the current fitness value of each particle is determined.
In the third step, each particle's position and velocity are adjusted based on its own memory and experience.
In the fourth step, the termination condition is checked. The algorithm ends when the number of iterations reaches the maximum or the optimal value is found; otherwise, execution continues from the second step.
The whole process is represented in Figure 5.
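The four steps above can be sketched as a minimal PSO loop in Python (the default parameter values here are illustrative; the paper's actual settings are given in the case study):

```python
import random

def pso(f, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO sketch following the four steps above:
    initialize, evaluate fitness, update velocity/position, terminate."""
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                 # individual best positions
    pf = [f(x) for x in X]                # individual best fitness values
    gi = pf.index(min(pf))
    g, gf = P[gi][:], pf[gi]              # global best position and fitness
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (P[i][d] - X[i][d])
                           + c2 * r2 * (g[d] - X[i][d]))
                X[i][d] += V[i][d]        # Equations (10) and (11)
            fx = f(X[i])
            if fx < pf[i]:                # update individual extreme value
                pf[i], P[i] = fx, X[i][:]
                if fx < gf:               # update global extreme value
                    gf, g = fx, X[i][:]
    return g, gf
```

On a simple convex test function such as the sphere function, the swarm converges to the minimum within the iteration budget.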

Particle Swarm Optimization-Extreme Learning Machine Algorithm
In the ELM algorithm, the initial weights (W) and hidden-layer biases (b) are randomly generated. To find the optimal values for W and b, the particle swarm optimization (PSO) algorithm is employed. The PSO algorithm continuously adjusts the parameter values to reduce the mean squared error (MSE). This approach results in the construction of an optimal PSO-ELM initial-productivity prediction model [46].
The PSO-ELM initial-productivity prediction model is used in this paper. The specific PSO-ELM algorithm process is as follows: Firstly, the input layers consist of data on the main control factors that affect the initial productivity of the low-permeability reservoir. The data are divided into training and detection data and pre-processed accordingly. Secondly, the relevant parameters of the particle swarm are set to initialize the input weights W and the hidden-layer thresholds b of the ELM algorithm; a trial-and-error approach or a more systematic method such as grid search can be used. The values of W and b depend on the specific problem and data set, so it is important to experiment with different values to find the optimal combination that yields the best performance. Thirdly, the mean squared error (MSE) is calculated from the predicted values and the actual test-data values and taken as the fitness value for PSO. Fourthly, the positions and velocities of the particles are continuously updated to obtain their optimal fitness values; by calculating the fitness value of each particle, the optimal input weights W and hidden-layer thresholds b are determined while ensuring that the MSE remains within the allowed range. Fifthly, the optimal W and b are substituted into the ELM algorithm to achieve accurate prediction.
The flow of the PSO-ELM algorithm is shown in Figure 6.
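The coupling between the two algorithms can be illustrated by the fitness function alone: a particle encodes the flattened (W, b), the ELM output weights are solved by pseudoinverse on the training data, and the validation error is returned for the swarm to minimize. This Python/NumPy sketch is illustrative (the function name is invented, and RMSE is used as the fitness here, consistent with the earlier description):

```python
import numpy as np

def pso_elm_fitness(particle, X_tr, T_tr, X_val, T_val, l):
    """Decode a PSO particle into ELM input weights W and thresholds b,
    solve the output weights by pseudoinverse on the training data, and
    return the validation RMSE as the fitness the swarm minimizes."""
    n = X_tr.shape[1]
    W = particle[: l * n].reshape(l, n)          # first l*n entries -> W
    b = particle[l * n:]                         # remaining l entries -> b
    sig = lambda Z: 1.0 / (1.0 + np.exp(-Z))     # sigmoid activation
    H_tr = sig(X_tr @ W.T + b)
    beta = np.linalg.pinv(H_tr) @ T_tr           # closed-form ELM step
    pred = sig(X_val @ W.T + b) @ beta
    return float(np.sqrt(np.mean((pred - T_val) ** 2)))
```

A PSO routine would then search over particles of length l·n + l, and the best particle found yields the W and b substituted back into the final ELM model.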

Research Area
The proposed PSO-ELM algorithm model is utilized to predict the initial productivity of wells in the JY oilfield. This oilfield is situated in the western part of the middle region of the northern Shaanxi slope in the Ordos Basin. The Chang 8 reservoir of the Triassic Yanchang Formation is the primary oil-bearing formation in this area. The porosity of the formation ranges from 8% to 25% with an average of 11.3%, while the permeability ranges from 0.01 to 20 mD with an average value of 0.64 mD. The research area pertains to a reservoir that is typical of the low-porosity and low-permeability type. The current initial development of the reservoir is characterized by the low productivity of individual wells, early emergence of water, insufficient formation energy, and rapid decline in well productivity.

Construction of the Initial-Productivity Model and Evaluation Analysis
This study utilized the PSO-ELM algorithm to construct a model for predicting initial productivity. The model was trained using data from 127 wells in the JY low-permeability field and seven main control factors. To verify the accuracy of the model, the initial productivity of 54 wells was simulated.
In the optimization process of the PSO algorithm, the initial parameters are set as follows: learning factors c_1 = 1.45 and c_2 = 1.64; minimum inertia factor w_min = 0.1 and maximum w_max = 0.8. The population size is 30, the maximum number of iterations is 100, and the error threshold is 10^−6. The PSO optimization stabilized at the 42nd iteration, where the fitness value levelled off. The curve of PSO optimization is shown in Figure 7.
In the construction process of the ELM algorithm model, the number of hidden-layer nodes is set at 8. The optimized W and b are, respectively:
A PSO-ELM model was constructed using 127 groups of training data. The model evaluation showed that the RMSE is 0.0145, the MAE is 0.854, and R 2 is 0.911. The model's learning effect is very close to the training-set data, making it a reliable tool for prediction.
Two prediction models, namely, PSO-ELM and ELM, were constructed using the test-set data and prediction-set data to estimate initial productivity. The predicted values of both models are compared with the actual values in Figure 8.
As shown in Figure 8, the initial productivity predicted by PSO-ELM is closer to the zero-error line than that predicted by the ELM model. The ELM model's W and b are optimized by particle swarm optimization, which brings the values predicted by the PSO-ELM algorithm closer to the actual values; that is, the algorithm has a higher accuracy. The PSO-ELM algorithm's error is smaller and its running time is more than five seconds shorter than that of the unoptimized ELM model. However, when the predicted initial-productivity value of the two models is greater than 30, the forecast deviates from the actual value and the predicted value is lower than the actual value; when the predicted initial-productivity value is around 15, the predicted value is higher than the actual value. This is because, in both models, the two main controlling factors, fracturing fluid displacement and production pressure difference, have a greater influence on the prediction weight. At the same time, the optimized ELM model has better adaptability: because W and b are constantly adjusted, the model output is more consistent with the actual values.

Comparison of Different Forecasting Models
We select three commonly used performance measures, root-mean-square error (RMSE), mean absolute error (MAE), and coefficient of determination (R 2 ), to evaluate the forecasting performance of the models. They are calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

where y_i is the actual initial productivity of the i-th well; ŷ_i is the predicted initial productivity of the i-th well; ȳ is the average of the initial productivity; and n is the number of wells. The closer the values of RMSE and MAE are to 0 and the closer the value of R 2 is to 1, the better the prediction results.
To evaluate the effectiveness of the PSO-ELM algorithm, this study employs three further prediction models: random forest (RF), back propagation neural network (BP), and recurrent neural network (RNN). The initial productivity predicted by these models is compared using cross-sectional data from 54 wells. Figure 9 illustrates the comparison of the predicted and test-set data for each model.
Figure 9. Comparison of predicted and test-set data for each model.
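The three measures above can be computed directly from the actual and predicted series. A minimal NumPy sketch with illustrative values (not data from the paper):

```python
import numpy as np

def rmse(y, y_hat):
    """Root-mean-square error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - y_hat)))

def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative actual and predicted initial productivities for four wells.
y = np.array([10.0, 20.0, 30.0, 40.0])
y_hat = np.array([12.0, 18.0, 33.0, 37.0])
# mae -> 2.5, rmse -> sqrt(6.5) ~ 2.55, r2 -> 0.948
```

Lower RMSE and MAE and an R 2 nearer 1 indicate a better fit, which is how Table 3 ranks the competing models.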
In Figure 9, it is evident that the overall predicted values of the four prediction models are low when the actual value is greater than 30. The PSO-ELM, RNN, and BP models show small prediction errors, while the RF model shows a large prediction error. This is because the RF prediction is determined by the votes of multiple random decision trees, which gives the model good tolerance of noisy data but reduces the accuracy of predicting individual data points. When the actual value is high, the two main controlling factors in the input layer, fracturing fluid displacement and production pressure difference, have a larger influence on the predicted value. However, the actual data of these two main controlling factors are not particularly large within the same class, resulting in some error between the predicted and actual values.
When the actual productivity is less than seven, all four prediction models output a value that is too high. The RF model has the largest prediction error among the four models, which aligns with the analysis above. Considering the weight of the predicted value influenced by the main controlling factors in the input layer, the fracturing-fluid sand content ratio and the porosity have the greatest impact; the values of these two main control factors in this set of data are too large within the same category, resulting in an over-large prediction. To sum up, there are two reasons for the large errors at the points analysed above: (1) the model requires more in-depth learning on data pertaining to special points in order to predict them accurately, but the training set used in this paper contains only a very small amount of such data; (2) to improve the accuracy of the model, more engineering factors need to be considered; the current model is based on only seven main control factors, which is insufficient for achieving a high-accuracy prediction over the entire range of data. In summary, in both situations there are some errors in the prediction results of the four models; however, the error of the PSO-ELM model is smaller, which shows that the model has good robustness and adaptability.
The test-data evaluation results of each model are shown in Table 3.
Table 3. Results of the evaluation of each model for predicting initial productivity.
Obviously, the PSO-ELM algorithm has the smallest error (RMSE and MAE) and the highest accuracy (R 2 ) among the models, and its running speed is faster. As the amount of sample data increases, the advantage of its running speed becomes more and more obvious. This suggests that the PSO-ELM algorithm can handle the high-complexity characteristics of initial-productivity forecasting more effectively than RF, BP, and RNN, and is more suitable as a forecasting method for the dynamic analysis of oilfield initial productivity.

Discussion
The prediction model has good applications in the petroleum industry. It also has a wide range of applications in other industries, including the prediction of initial production capacity and the peak of natural gas production [47], as well as predicting the head of aquifers [48,49], carbon emissions [50,51], and electricity, electric power, and electric load [52,53]. This data-driven forecasting model has a significant impact on the industry's forecasting research.
In this paper, building on the advantage of the ELM algorithm, which runs quickly owing to its single hidden layer, the PSO-ELM model was developed by integrating PSO optimization techniques, which improves the accuracy of the whole prediction model. Although the comparison algorithms considered in this paper are limited, the evaluation results show that the advantages of this prediction model are prominent, consistent with the results of the initial-productivity prediction for the JY oilfield.
The model currently performs well on other wells in the JY oilfield. However, it has not been applied to other oilfields with different geological factors, engineering factors, and dynamic development factors, where the accuracy of the prediction model may be reduced. A change of input data source may make the current parameters unsuitable for the new prediction task, so the parameters of the prediction model need to be adjusted to meet the initial-productivity prediction requirements of each oilfield.

Conclusions
This study has shown that the prediction of initial productivity is extremely important for the development process of low-permeability oilfields. The accuracy and precision of the model have been verified by the test data. Therefore, the initial-productivity forecasting model can guide the fundamental task in the initial stage of reservoir exploration and development.
The machine learning model solves the problems of poor adaptability and the limited consideration of influencing factors in traditional mathematical models.
(1) This paper proposes a combination feature selection algorithm that utilizes the correlation between characteristic factors and initial productivity to provide a reasonable