Forecast of E-Commerce Transactions Trend Using Integration of Enhanced Whale Optimization Algorithm and Support Vector Machine

E-commerce has become a crucial business model on the Internet around the world. Therefore, forecasting its transaction trend can provide important information for market planning and development in advance. For this purpose, an integrated model of the enhanced whale optimization algorithm (EWOA) with the support vector machine (SVM) is proposed in this study for forecasting the E-commerce transaction trend. First, the global optimization ability of the whale optimization algorithm (WOA) is enhanced by a search updating strategy. Second, multiple factors that may affect the E-commerce transaction trend are analyzed and determined using the gray correlation mechanism. Third, the EWOA algorithm is employed to optimize the SVM random parameters. Finally, the EWOA-SVM model is established for forecasting the E-commerce transaction trend. Tests on two representative cases confirm that the EWOA-SVM model is superior to other existing methods in terms of convergence speed and prediction accuracy.


Introduction
The digital economy has grown much faster in recent years than before, and E-commerce transactions have become a core part of the digital economy in the global market [1,2]. The rapid development of E-commerce has changed the logistics, manufacturing, and traditional retail industries, among others. Transaction volume is usually regarded as a crucial indicator for assessing the level of E-commerce development. As a result, forecasting the E-commerce transaction trend is indispensable for providing a quantitative basis for long-term planning and strategy formulation by enterprises and governments [3,4].
At present, prediction methods for the E-commerce transaction trend focus on machine learning models, regression models, and combination models [5,6]. Machine learning models include neural networks, the support vector machine (SVM), the extreme learning machine (ELM), etc. [7]. They predict the E-commerce transaction trend through the mapping relationship between influencing factors and transaction volume, but they are sensitive to parameter selection [8]. The regression models mainly refer to the moving average (MA) model, the autoregressive (AR) model, the autoregressive moving average (ARMA) model, and nonlinear regression models. They may work well on stationary series, but they are not suitable for nonstationary time series. Combination models, on the other hand, can provide competitive results and outperform single models, but their computational cost is relatively higher [9,10].
Zhang et al. [11] applied the ELM model to forecast E-commerce transactions and proposed an improved optimization algorithm to optimize the random model parameters. However, only a few basic influencing factors were considered, so the model may not be applicable in real circumstances. Ji et al. [12] combined XGBoost and ARIMA models to predict the size of E-commerce transactions and obtained better results than the individual XGBoost and ARIMA models, but the complexity and computational cost may pose a difficulty for further applications. Alternatively, Chen et al. [13] combined clustering technology and a machine learning model to forecast the transaction size. Clustering was used to divide the training samples, and the machine learning model was then trained on each group. The random parameters that may influence machine learning performance were not well considered, resulting in unstable prediction results. Di Pillo et al. [14] employed SVM to predict the sales scale. Compared with linear regression models, the SVM model has a stronger nonlinear mapping ability. However, it is sensitive to random model parameters. Mao et al. [15] predicted China's E-commerce online transaction volume by combining the ARMA model and the gray model, and the forecasting result was satisfactory.
The SVM model is suitable for small-sample prediction, and it has strong generalization ability and few random parameters. Li et al. [16] proposed an improved dragonfly algorithm to optimize SVM's random parameters in short-term wind power prediction. Liu et al. [17] applied SVM to forecast the remaining life of lithium-ion batteries and used the chicken swarm optimization algorithm to solve for the random parameters. Pham et al. [18] employed the SVM model for rainfall prediction and achieved high prediction accuracy. Additionally, Hossain and Muhammad [19] applied SVM to emotion classification in an emotion recognition system. Huang and Wang [20] used the SVM model for pattern classification, where a genetic algorithm was applied to optimize the model's random parameters and improve classification accuracy.
For SVM to be used for forecasting the E-commerce transaction trend, two main problems need to be resolved. The first is to reduce the influence of the random parameters of the SVM model, which is an optimization problem. The other is to choose the crucial factors affecting the E-commerce transaction trend. Consequently, this study proposes an integrated model that combines the enhanced whale optimization algorithm (EWOA) with the SVM model on the basis of multiple-factor analysis and machine learning. In this approach, the EWOA algorithm is used to optimize the random parameters of the SVM model. The modeling process is introduced in Section 2. Section 3 analyzes multiple influencing factors and determines the critical ones for E-commerce transactions. Section 4 validates the proposed model through two cases. The conclusions and future work are presented in Section 5.

Models of E-Commerce Transaction Trend Forecast
2.1. SVM Model. SVM, a widely studied machine learning model, has a simple structure, few adjustable parameters, and strong generalization ability. It is often used in pattern recognition, disease diagnosis, regression prediction, and other fields [21]. The samples are mapped to the high-dimensional space $R^N$ through a mapping function $\varphi(x)$. Given a sample set $\{(x_i, y_i) \mid x_i \in R^N, y_i \in R, i = 1, 2, \ldots, n\}$, the hyperplane g between input x and output y is established in the high-dimensional space as follows [22]:
where g(x) represents the output function, w is the weight, and l indicates the offset. g(x) is transformed into a constrained optimization problem through the principle of structural risk minimization. Taking into account the errors, the slack variable is introduced into the objective function. The constrained minimization is expressed as follows [23]:
where $\zeta_i$ and $\zeta_i^*$ are slack variables, ρ is a penalty coefficient, and τ is the error. The optimization problem is transformed by the Lagrange multiplier method, the derivative with respect to each variable is taken, and finally the dual form of the optimization problem is obtained [24,25].
where $\hat{\alpha}_i$ and $\alpha_i$ are Lagrange multipliers and $\kappa(x_i, x_j)$ is the kernel function. The radial basis function (RBF) kernel is used in this study.
where δ (δ > 0) is the kernel parameter.

Computational Intelligence and Neuroscience
Finally, the SVM regression function is defined as follows:
Generally, in the SVM model, the penalty coefficient ρ and the kernel coefficient δ are random parameters, which bring uncertainty to the prediction results when the data are complex. To solve this problem, these parameters need to be optimized. For this reason, a whale optimization algorithm based on a search updating strategy is proposed to optimize the SVM parameters and improve the prediction accuracy for the E-commerce transaction trend.
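The display equations referred to in this subsection are not reproduced in the text. As a reading aid, the block below writes out the standard ε-insensitive support vector regression formulation using the symbols defined above (w, l, ρ, τ, ζ, α, κ, δ). These are assumed forms rather than the authors' exact equations: in particular, the $1/(2\delta^2)$ scaling of the RBF kernel and the assignment of $\hat{\alpha}_i$ to the upper constraint are assumptions.

```latex
% Assumed standard forms, written with the symbols of this section.
\begin{align}
g(x) &= w^{\top}\varphi(x) + l \\[4pt]
\min_{w,\,l,\,\zeta,\,\zeta^{*}}\ & \frac{1}{2}\lVert w\rVert^{2}
      + \rho\sum_{i=1}^{n}\left(\zeta_i + \zeta_i^{*}\right) \notag\\
\text{s.t.}\ \ & y_i - w^{\top}\varphi(x_i) - l \le \tau + \zeta_i, \notag\\
      & w^{\top}\varphi(x_i) + l - y_i \le \tau + \zeta_i^{*}, \notag\\
      & \zeta_i \ge 0,\ \ \zeta_i^{*} \ge 0,\quad i = 1,\dots,n \\[4pt]
\max_{\alpha,\,\hat{\alpha}}\ & -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}
      (\hat{\alpha}_i - \alpha_i)(\hat{\alpha}_j - \alpha_j)\,\kappa(x_i, x_j)
      - \tau\sum_{i=1}^{n}(\hat{\alpha}_i + \alpha_i)
      + \sum_{i=1}^{n} y_i(\hat{\alpha}_i - \alpha_i) \notag\\
\text{s.t.}\ \ & \sum_{i=1}^{n}(\hat{\alpha}_i - \alpha_i) = 0,\qquad
      0 \le \alpha_i,\ \hat{\alpha}_i \le \rho \\[4pt]
\kappa(x_i, x_j) &= \exp\!\left(-\frac{\lVert x_i - x_j\rVert^{2}}{2\delta^{2}}\right) \\[4pt]
g(x) &= \sum_{i=1}^{n}(\hat{\alpha}_i - \alpha_i)\,\kappa(x_i, x) + l
\end{align}
```

Under this reading, ρ bounds the multipliers and therefore controls how strongly errors larger than τ are penalized, while δ sets the width of the kernel. These are the two quantities the optimizer has to choose.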

WOA Algorithm.
To date, a variety of intelligent algorithms have been developed and applied, such as the PSO algorithm [26], the crow search algorithm (CSA) [27], and a series of hybrid swarm algorithms [28-33]. Each algorithm uses a different method or strategy to fit specific purposes or applications. For example, Zapata et al. [34] developed a hybrid swarm algorithm for the collective construction of 3D structures, and Precup [35] proposed slime mould algorithm-based tuning of cost-effective fuzzy controllers for servo systems. Alternatively, WOA, which has a strong optimization ability, is a new swarm intelligence optimization algorithm [36]. It simulates the predatory behavior of whales in nature, including foraging, encircling, bubble hunting, and food searching [37]. During the foraging phase, information is exchanged between individuals in the whale group, and the food location is determined through this communication. Usually, the initial optimal target position is used as the food position, and the whale approaches the food by updating its position. The whale location update strategy is described as follows [38,39]:
where m is the current iteration number; A and C are coefficient matrices; B represents the distance between the whale and the food; x is the whale position; and $\hat{x}$ represents the optimal position in the whale group. The coefficient matrices A and C in equations (6) and (7) are calculated as follows:
where u is a random number between 0 and 1, and a decreases linearly from 2 to 0 over the iterations. The whales adopt encircling and spiraling behaviors in the predation stage. To realize the shrinking encirclement, A decays as a decreases from 2 to 0. Once the food location is locked, the prey is attacked through a spiral movement. At this time, the location search updating strategy of whales is as follows [40,41]:
where b (b = 1) is a constant that defines the spiral shape; o is a random number in the interval [−1, 1]; and $\hat{B}$ represents the distance between the whale and the locked food.
Assume that the whale chooses between shrinking encirclement and spiral attack with a probability of 50% each. The position updating strategy is then expressed as follows:
where p is a random number in the interval [0, 1].
In addition, whales can search for food randomly by updating A. The whale searches for food in a larger area when |A| > 1 and in a smaller area when |A| < 1.
where $x_{rand}$ represents a random position vector.
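The three behaviors described above (shrinking encirclement, spiral attack, and random search) can be sketched in a few lines. The following is a minimal illustration of the widely used baseline WOA update, not the authors' MATLAB code. The function name `woa_minimize`, the bound clipping, and drawing A and C as one scalar per whale (rather than as coefficient matrices) are assumptions made to keep the sketch short.

```python
import math
import random


def woa_minimize(f, dim, lower, upper, pop_size=30, max_iter=500, b=1.0, seed=0):
    """Minimize f over [lower, upper]^dim with a basic whale optimization loop."""
    rng = random.Random(seed)

    def clip(v):
        # Assumed boundary handling: clamp each coordinate to the search range.
        return min(max(v, lower), upper)

    whales = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(pop_size)]
    best = min(whales, key=f)[:]            # best position found so far
    best_val = f(best)

    for m in range(max_iter):
        a = 2.0 * (1.0 - m / max_iter)      # a decreases linearly from 2 to 0
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a  # A = 2*a*u - a, scalar per whale (assumed)
            C = 2.0 * rng.random()          # C = 2*u
            p = rng.random()
            if p < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore around a random whale.
                target = best if abs(A) < 1.0 else whales[rng.randrange(pop_size)]
                new = [t - A * abs(C * t - xj) for t, xj in zip(target, x)]
            else:
                # Spiral attack around the best position, with o drawn from [-1, 1].
                o = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * o) * math.cos(2.0 * math.pi * o)
                new = [abs(t - xj) * spiral + t for t, xj in zip(best, x)]
            whales[i] = [clip(v) for v in new]
        candidate = min(whales, key=f)
        if f(candidate) < best_val:
            best, best_val = candidate[:], f(candidate)
    return best, best_val
```

For example, `woa_minimize(lambda x: sum(v * v for v in x), dim=3, lower=-10.0, upper=10.0)` searches for the minimum of a three-dimensional sphere function. Only the population size and the iteration budget need to be chosen by the user; the remaining coefficients are drawn at random.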
In the WOA algorithm, most of the parameters are random, and only the maximum number of iterations and the population size need to be set, which is one of the advantages of the algorithm.

EWOA Algorithm.
Wolpert and Macready [42] stated that, according to the "no free lunch" theorem, no optimization algorithm can solve all optimization problems. This means that different optimization algorithms may obtain different solutions for the same problem, so developing new algorithms may achieve better results. The traditional WOA algorithm may suffer from some disadvantages even though it has stronger optimization capability than particle swarm optimization (PSO), differential evolution, and similar algorithms. For example, its coefficient a decreases linearly to achieve shrinking encirclement, but a linear decrease does not convincingly reflect the dynamic changes during the iteration process. The population diversity is also limited in the later iterations, so the search can become trapped in a local minimum. The EWOA model is developed to resolve these problems. First, the dynamic attenuation coefficient $d_a$ is introduced to simulate the dynamic changes in the shrinking and encircling behaviors of whales during the iterative process. The mathematical model of the dynamic attenuation coefficient is defined as follows:
where $m_{max}$ represents the maximum number of iterations and m is the current iteration number. The value of the dynamic attenuation coefficient over the iterations is depicted in Figure 1. It can be seen that the value of $d_a$ declines faster in the initial stage of the iteration, which enables the food position to be locked quickly. In the later iterations, it decays more slowly, which strengthens the algorithm's local exploration ability.
To address the weakening of population diversity in the later iterations, an area search updating strategy is proposed. During the optimization process, the whales migrate to other regions to search for food according to the regional update frequency M. This preserves population diversity to a large extent, thus improving the algorithm's optimization ability. The whale's updated position is as follows:
where randn(0, $\sigma^2$) follows a Gaussian distribution. Similar to the WOA algorithm, most parameters in the EWOA algorithm are random, but the maximum number of iterations, the population size, and the migration frequency need to be set in advance. The flowchart of the EWOA algorithm for searching the global optimum is shown in Figure 2.
As shown in Figure 2, the EWOA optimization process includes the following steps:
(1) Initialize the EWOA algorithm parameters.
(2) Determine whether to implement the area search updating strategy.
(3) If the area search updating strategy is implemented, update the location according to equation (14); otherwise, update the location according to equations (10) and (12) [36].
(4) Update the optimal location of the whale population [37].
(5) Determine whether to terminate the iteration. If the iteration is terminated, the optimization is complete; otherwise, return to step (2).
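The five steps above describe a control flow that can be written independently of the specific update equations. The sketch below shows only that flow. The two update rules are passed in as callables because equations (10), (12), and (14) are not reproduced in this text, and the trigger "run the area search on every M-th iteration" is an assumption about how the update frequency M is applied. The name `ewoa_loop` and its signature are hypothetical.

```python
def ewoa_loop(population, fitness, standard_update, area_update, M, max_iter):
    """Run the five-step EWOA control flow with caller-supplied update rules.

    standard_update(x, best, m) and area_update(x, best, m) each return a new
    position for whale x; both are placeholders for the paper's equations.
    """
    # Step (1): the caller provides the initialized population and parameters.
    best = min(population, key=fitness)
    for m in range(1, max_iter + 1):
        # Step (2): assumed trigger, area search on every M-th iteration.
        use_area_search = (m % M == 0)
        # Step (3): apply the selected position update to every whale.
        update = area_update if use_area_search else standard_update
        population = [update(x, best, m) for x in population]
        # Step (4): keep the best position seen so far.
        candidate = min(population, key=fitness)
        if fitness(candidate) < fitness(best):
            best = candidate
        # Step (5): the loop terminates after max_iter iterations.
    return best
```

Keeping the update rules as parameters makes it straightforward to compare WOA and EWOA under the same loop: passing the same function for both callables recovers a plain WOA-style iteration.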

Convergence Analysis.
Five standard test functions are used to analyze the convergence efficiency of the model. $f_1$, $f_2$, and $f_3$ are unimodal functions, where the local extremum is the global optimum, while $f_4$ and $f_5$ are multimodal functions. The variable ranges of the functions are listed in Table 1. In addition to the EWOA algorithm, the PSO algorithm [26], the crow search algorithm (CSA) [27], and the classic WOA algorithm were chosen for comparison. CSA is a recent swarm intelligence optimization algorithm with good convergence performance, PSO is a classic optimization algorithm that is commonly used as a baseline, and the traditional WOA algorithm is included so that its convergence results can be compared directly with those of the EWOA algorithm. The algorithms' parameters are set as shown in Table 2.
In PSO, $w_{max}$ and $w_{min}$ are the maximum and minimum values of the inertia weight, respectively, and C1 and C2 are the learning coefficients. In CSA, AP is the awareness probability and FL is the flight length. b defines the spiral shape in both the WOA and EWOA algorithms, and M represents the search update frequency in EWOA. The population size is 30, and the maximum number of iterations is 500. The algorithms are tested on a unified platform, and each optimization algorithm is run 30 times on each test function. The average, best, and worst convergence values for every test function are summarized in Table 3.
As can be seen, the EWOA algorithm presents better search outcomes than the others. The results obtained on the unimodal functions $f_1$, $f_2$, and $f_3$ are consistently better than those on the multimodal functions $f_4$ and $f_5$. Furthermore, the PSO and CSA algorithms showed poor optimization results on the multimodal functions, where the worst value reaches 28.85 for PSO. For all unimodal functions, the WOA algorithm achieves a very low optimal value close to 0, while the EWOA algorithm reaches the optimal value, i.e., 0.
The fitness function used to evaluate the convergence process is defined as follows:
where n is the number of training samples; $P_{train,i}$ represents the E-commerce transaction training value; and $P^*_{train,i}$ indicates the E-commerce transaction prediction value. The iterative convergence curves (log(Fitness)) of the WOA and EWOA algorithms on the test functions are shown in Figure 3. The EWOA algorithm converges considerably faster than the WOA algorithm on all test functions, requiring far fewer iterations.

Table 1 columns: Equations; Variable ranges; Global optima; dim. Note: "dim" denotes the test dimension.
At the transaction level, express delivery business volume (A5; unit: 100 million) and express delivery business revenue (A6; unit: 100 million yuan) have a crucial impact. The express delivery business plays an important role in E-commerce sales, and express delivery business revenue reflects the level of E-commerce transactions in the express delivery industry. At the economic development level, gross domestic product (GDP), a macroeconomic factor, is considered a key indicator of economic activity and reflects the state of E-commerce development. Therefore, GDP (A7; unit: trillion yuan) is used to evaluate E-commerce transactions in this study.
The statistics on China's E-commerce transaction volume and the influencing factors from 2005 to 2019 are presented in Table 4. E-commerce transaction volume (T) increases every year, from 1.29 trillion in 2005 to 34.81 trillion in 2019. Similarly, the influencing factors A1-A7 show a growing trend, with A5 and A6 increasing much more than the others. Gray correlation is employed to analyze the correlation degree between the influencing factors and the E-commerce transaction volume. Initially, the E-commerce transaction volume and the influencing factors are made dimensionless to reduce the differences between their numerical scales. Then, the correlation coefficient is calculated. The E-commerce transaction volume is denoted as the reference sequence $x_0(t)$, and the influencing factors are denoted as the comparison sequences $x_i(t)$. The difference $\Delta_i(k)$ between the reference sequence $x_0(t)$ and the comparison sequence $x_i(t)$ at time k is as follows:
The correlation coefficient $C_i(k)$ between the i-th comparison sequence and the reference sequence is as follows:
where $\Delta_i(k)$ represents the absolute difference between the two sequences at time k, $\Delta_{min}$ is the minimum absolute difference of the comparison sequences, and $\Delta_{max}$ is the maximum absolute difference of the comparison sequences. The correlation degree ($cr_i$) between the i-th comparison sequence and the reference sequence is defined in the following equation:
where n denotes the number of data points in the sequence. The correlation degrees between the E-commerce transaction volume and the influencing factors are presented in Table 5.
Table 5 shows that express delivery business revenue (A6) has the highest correlation degree, 0.91; the correlation degrees of Internet penetration rate (A1), number of CN domain names (A3), and number of Internet users (A4) exceed 0.8; and those of website number (A2), express delivery business volume (A5), and GDP (A7) are below 0.8. Accordingly, the data for A1, A3, A4, and A6 are used as the input variables for the forecasting models.
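The factor-selection procedure described above can be sketched as follows, assuming the common form of gray relational analysis. Three choices in this sketch are assumptions, since the text does not specify them: the dimensionless step divides each series by its mean, the distinguishing coefficient `rho` is 0.5, and the minimum and maximum differences are taken over all factors and time points. The function name `gray_correlation_degrees` is hypothetical.

```python
def gray_correlation_degrees(reference, factors, rho=0.5):
    """Return the gray correlation degree of each factor series to the reference.

    reference: list of values (e.g. yearly transaction volume).
    factors:   dict mapping a factor name to a list of the same length.
    rho:       distinguishing coefficient (assumed value, not from the text).
    """
    def normalize(series):
        # Assumed dimensionless step: divide each value by the series mean.
        mean = sum(series) / len(series)
        return [v / mean for v in series]

    x0 = normalize(reference)
    # Absolute differences between the reference and each comparison sequence.
    diffs = {name: [abs(a - b) for a, b in zip(x0, normalize(series))]
             for name, series in factors.items()}
    all_diffs = [d for ds in diffs.values() for d in ds]
    d_min, d_max = min(all_diffs), max(all_diffs)
    if d_max == 0.0:
        # Every factor coincides with the reference after normalization.
        return {name: 1.0 for name in factors}

    degrees = {}
    for name, ds in diffs.items():
        # Correlation coefficient at each time point, then its average.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        degrees[name] = sum(coeffs) / len(coeffs)
    return degrees
```

With degrees computed this way, the selection rule used in the text amounts to keeping the factors whose degree exceeds 0.8, for example `[name for name, d in degrees.items() if d > 0.8]`.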

E-Commerce Transaction Prediction Using the EWOA-SVM Model.
Based on the integration of the EWOA and SVM models, the proposed EWOA-SVM model is established to forecast the E-commerce transaction trend. The architecture of the prediction process is depicted in Figure 4 and proceeds as follows:
(1) Analyze the impact of multiple influencing factors on E-commerce transaction volume.
(2) Calculate the correlation degree between the influencing factors and E-commerce transactions through gray correlation using equation (18).
... (15) [41]
(10) Output the optimal parameters of SVM after the training process is complete.
(11) Verify the EWOA-SVM model using the test set.
(12) Employ the trained SVM to predict E-commerce transaction trends.
(13) Evaluate the forecast results of E-commerce transactions.
All algorithms in this study were implemented in MATLAB, and the code of the core programs and the datasets can be freely accessed at https://drive.google.com/drive/folders/1OPlt_W_u8XHrvT_PW-wOwUBZtUL2mind?usp=sharing. The root mean square error (rmse) [46] and the fitting coefficient $r^2$ [47] are used to evaluate model performance, where rmse represents the prediction error and $r^2$ indicates how well the predicted values follow the changing trend of the real values. The closer the fluctuating trend of the predicted values is to the real one, the closer $r^2$ is to 1.
where num is the number of testing samples; $P_{test,i}$ is the test value of the E-commerce transaction; $P^*_{test,i}$ represents the predicted value of the E-commerce transaction; and $\bar{P}_{test}$ is the average test value of the E-commerce transaction.
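The two evaluation metrics can be computed as in the sketch below. It assumes the usual definitions: rmse as the square root of the mean squared difference between test and predicted values, and $r^2$ as one minus the ratio of the residual sum of squares to the total sum of squares around the average test value. The second definition is an assumption inferred from the symbols listed above, and the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between test values and predictions."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Fitting coefficient, assumed here to be the coefficient of determination."""
    mean = sum(actual) / len(actual)                       # average test value
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A perfect forecast gives an rmse of 0 and an $r^2$ of 1. Under this definition $r^2$ can become negative when the predictions are worse than simply using the average test value, which is worth keeping in mind when the test set is small.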

Case 1.
Two cases were used to test the effectiveness of the proposed EWOA-SVM model, with the SVM and WOA-SVM models included for comparison. The SVM model is selected to analyze the influence of random parameters on the prediction results, and the WOA-SVM model is chosen to compare the optimization capability of the WOA algorithm with that of the EWOA algorithm. In Case 1, the E-commerce transaction data from 2005 to 2014 were chosen as the training set, and the data from 2015 to 2019 were used as the test set. The training convergence curves of the WOA-SVM and EWOA-SVM models are presented in Figure 5. The EWOA algorithm converges significantly faster than the WOA algorithm. Moreover, the fitness value of the EWOA algorithm is clearly smaller than that of the WOA algorithm during the iteration process. The test results of the SVM, WOA-SVM, and EWOA-SVM models are presented in Figure 6, and the specific prediction values are listed in Table 6. The predictions of the WOA-SVM and EWOA-SVM models fit the actual E-commerce transaction curve well. In contrast, the SVM model shows a relatively higher error, particularly in 2015.

Case 2.
In Case 2, the E-commerce transaction data from 2005 to 2010 were selected as the training set, and the data from 2011 to 2019 were used as the test set. The training convergence curves of the WOA-SVM and EWOA-SVM models are presented in Figure 7. The EWOA algorithm converges faster than the WOA algorithm. Besides, the fitness value of the EWOA algorithm is considerably smaller than that of the WOA algorithm during the iteration process. The prediction results of the SVM, WOA-SVM, and EWOA-SVM models are shown in Figure 8. In general, the prediction accuracy of the WOA-SVM and EWOA-SVM models is higher than that of the SVM model, and their predictions fit the actual E-commerce transaction trend well. Nevertheless, the SVM predictions are satisfactory except for 2014 to 2015, where they deviate more from the actual values. The detailed prediction outcomes are listed in Table 7.
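The two cases differ only in where the yearly series is cut into training and test data. A small sketch of that chronological split is given below. The helper name `split_by_year` and the record layout (year first) are assumptions for illustration, and no values from Table 4 are used.

```python
def split_by_year(records, last_train_year):
    """Split (year, ...) records chronologically into training and test sets."""
    train = [r for r in records if r[0] <= last_train_year]
    test = [r for r in records if r[0] > last_train_year]
    return train, test


# Placeholder records: one entry per year from 2005 to 2019, data omitted.
records = [(year,) for year in range(2005, 2020)]
case1_train, case1_test = split_by_year(records, 2014)  # Case 1: 2005-2014 vs 2015-2019
case2_train, case2_test = split_by_year(records, 2010)  # Case 2: 2005-2010 vs 2011-2019
```

Case 1 therefore trains on 10 yearly samples and tests on 5, while Case 2 trains on 6 and tests on 9, so Case 2 is the harder setting.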

Evaluation of Test Results.
The SVM, WOA-SVM, and EWOA-SVM models were used to predict the trend of E-commerce transactions in Cases 1 and 2, and their prediction results are evaluated in this section. For Cases 1 and 2, the relative error (Re%) results from the SVM, WOA-SVM, and EWOA-SVM models are given in Table 8. The prediction error for 2013 was relatively large, but the remaining errors were less than 20%. The overall prediction performance of the WOA-SVM and EWOA-SVM models was better than that of the SVM model. The rmse and $r^2$ values of the SVM, WOA-SVM, and EWOA-SVM models are listed in Table 9. In Case 1, the rmse values of the WOA-SVM and EWOA-SVM models are below 0.6, and the minimum rmse, 0.51, is obtained by the EWOA-SVM model. The rmse of the EWOA-SVM model is 13.56% and 47.42% smaller than those of the WOA-SVM and SVM models, respectively. However, the SVM model gives the best fit, with an $r^2$ value of up to 99%. In Case 2, the rmse values of all three models increase significantly compared with Case 1. The EWOA-SVM model has the minimum rmse, 1.26, which is 14.86% and 17.64% smaller than those of the WOA-SVM and SVM models, respectively. Additionally, the EWOA-SVM model reaches the highest $r^2$ value, 98.42%, among all models.
... the impetus of information technology. As E-commerce has the characteristics of wide transaction coverage, low cost, fast information circulation, and highly coordinated workflows, it has become a new engine for economic development. Accordingly, the E-commerce transaction trend is becoming an important indicator of the level of business or economic activity. To this end, this study proposes the EWOA-SVM model to predict the trend of E-commerce transactions, which provides a theoretical and effective tool for E-commerce development.
In real applications, a precise prediction of the E-commerce transaction trend can provide a decision-making basis for governments or enterprises to formulate development policies or plans for future business or industrial investments. The model proposed in this study can identify the crucial factors that are highly correlated with E-commerce transactions and construct correlation indexes for the E-commerce transaction trend. Importantly, it can be applied by logistics enterprises, Internet enterprises, and other information network companies in their business activities.

Conclusions
In this study, the model training data were collected from the E-commerce transaction volume and the influencing factors A1-A7 between 2005 and 2019. This sufficient data support and the robust EWOA network structure can effectively alleviate the overfitting problem. The evaluation results of each model, together with the discussion, provide further evidence on this issue. The main contributions of this paper are summarized as follows:
(1) A dynamic search coefficient and a search updating strategy are combined to overcome the WOA algorithm's limitations. Accordingly, the EWOA algorithm can reach the global optimum, i.e., 0, for multimodal functions, indicating a strong ability to escape from local minima.
In future work, additional factors influencing the E-commerce transaction trend may be considered for practical circumstances. Besides, the generalization ability for predicting various kinds of data may be improved further.

Data Availability
The data used to support the findings of this study are included within the article.