A Hybrid Prediction Method for Stock Price Using LSTM and Ensemble EMD

The stock market is a chaotic, complex, and dynamic financial market. Predicting future stock prices is a pressing and controversial research problem, and researchers continue to propose new analysis and prediction methods. In this paper, we propose a hybrid method for predicting future stock prices using LSTM and ensemble EMD. We use ensemble EMD to decompose the complex original stock price time series into several subsequences that are smoother, more regular, and more stable than the original series. We then use the LSTM method to train on and predict each subsequence. Finally, we obtain the predicted values of the original stock price time series by fusing the predicted values of the subsequences. In the experiment, we selected five datasets to test the performance of the method. Comparison with four other prediction methods shows that the predicted values of our method are more accurate. The proposed hybrid prediction method is effective and accurate in future stock price prediction. Hence, it has practical application and reference value.


Introduction
According to the statistics of China Securities Depository and Clearing Corporation, as of March 2020, there were 163.3 million securities investors in China. Stock price forecasting is a difficult and meaningful task for financial institutions and private investors. To reduce investment risk and obtain stable returns, many scholars have put forward a large number of prediction models [1][2][3][4][5][6][7][8][9]. The rapid development of big data technology, especially the application of machine learning and deep learning in the financial field, has had a profound impact on investors. Research directions include low-frequency data and high-frequency data [10]. Previous research is mainly divided into two kinds of methods: fundamental analysis and technical analysis [11].
On the one hand, in technical analysis, people widely use mathematical statistical techniques to analyze historical stock price trends and predict near-term stock prices. In recent years, many researchers have applied a variety of machine learning algorithms to analyze and predict stock prices, such as neural networks, multicore learning [12], stepwise regression analysis [13], and deep learning [14,15]. Although many algorithms have achieved good results in certain respects, machine learning still involves many parameter configuration and data selection problems, which remain an important area of research. On the other hand, in fundamental analysis [16][17][18], people mainly use natural language processing to analyze a company's financial news and financial statements to predict the future stock price trend. The long short-term memory (LSTM) network is a very good method for dealing with time series, and stock price data are time-series data. Therefore, many researchers use LSTM [19][20][21][22][23][24][25][26] to analyze and predict stock prices. Many studies have analyzed the correlation between time-series data [27][28][29][30], and the results show that LSTM has advantages for time-series data. In [31], researchers used LSTM to predict the coding unit split, and the experimental results demonstrated the efficiency of LSTM. Time-series trend research is also a new form of time-series prediction, so LSTM is a natural choice.
Empirical mode decomposition (EMD) is usually applied to nonstationary and nonlinear signals. EMD adaptively decomposes a nonlinear signal into several intrinsic mode functions (IMFs). EMD can effectively suppress continuous noise, such as Gaussian noise [32]. However, EMD cannot suppress intermittent noise and mixed noise. Ensemble empirical mode decomposition (EEMD) can solve the problem of noise-mode mixing [33]. In the EEMD algorithm, a group of white noise series is first added to the original signal, and each noisy copy is then decomposed into several IMFs. The average of the corresponding IMF sets is regarded as the final result. EEMD separates the noise in different IMFs from the original signal components [34], thus eliminating the noise-mode mixing phenomenon. In recent years, the application of EEMD has attracted the attention of many researchers and scholars [27, 35-44]. To address the difficulty that noise causes for interference term retrieval in practical applications, Zhang et al. [35] proposed a technique based on EEMD and EMD to achieve automatic interference term retrieval from spectral domain low-coherence interferometry. With EEMD, the proposed algorithm keeps the relative error of coupling strength below 2%. To address the problem that Gaussian and non-Gaussian noise seriously hinder the detection of rolling bearing defects by traditional methods, Jiang et al. [36] proposed a new rolling bearing inspection method that combines bispectrum analysis with improved ensemble EMD. By using ensemble empirical mode decomposition, this method performs better at reducing multiple background noises and detects defects in rolling bearings more effectively. To address the influence of the authenticity of the partial discharge (PD) signal on the accuracy of transformer insulation evaluation, Wang et al. [37] proposed a method to suppress white noise in PD signals based on ensemble EMD combined with high-order statistics.
This method uses EEMD decomposition and then thresholds and reconstructs each IMF to suppress the white noise in each component. To address the problem that most existing measurement methods focus only on mathematical values and are affected by measurement errors, interference, and uncertainty, Wei et al. [38] proposed a new time-history comparison method for vehicle safety analysis based on ensemble empirical mode decomposition. This method uses EEMD decomposition so that the trend signal reflects the overall change and is not affected by high-frequency interference. To address the difficult problem of wind speed prediction, Yang and Yang [39] proposed a hybrid BRR-EEMD short-term prediction method for wind speed based on EMD and Bayesian ridge regression (BRR).
This hybrid method applies Bayesian ridge regression to each subsequence decomposed by the EEMD and obtains good results in wind speed prediction. To find potential arbitrage opportunities when the returns of stock index futures contracts continue to deviate from fair prices in irrational and inefficient markets, Sun and Sheng [40] proposed a time-series analysis method based on ensemble EMD.
This method uses EEMD to analyze the stock futures basis sequence and extracts a monotonically decreasing trend from the sequence to discover business opportunities. To address the problem that a single method cannot predict complex and nonlinear stock prices well, Al-Hnaity and Abbod [41] proposed a hybrid ensemble model based on ensemble empirical mode decomposition and a backpropagation neural network to predict the closing price of the stock index. The researchers [41] proposed five hybrid prediction models: ensemble EMD-NN, ensemble EMD-Bagging-NN, ensemble EMD-Crossvalidation-NN, ensemble EMD-CV-Bagging-NN, and ensemble EMD-NN.
The experimental results show that the ensemble EMD-CV-Bagging-NN, ensemble EMD-Crossvalidation-NN, and ensemble EMD-Bagging-NN models all perform a grade higher than the ensemble EMD-NN model and significantly better than the single neural network model. The typical forecasting scheme forecasts directly from the time-series data and does not first process the time-series data itself. How to combine existing forecasting methods with time-series decomposition to improve the forecasting effect has therefore become a challenge. The methods above use EEMD to decompose the time-series data and thereby improve the performance of the algorithm. How to effectively decompose complex and nonlinear stock time-series data for prediction has puzzled many researchers. Because stock time series are uncertain and nonlinear, the deviation of a single method in predicting stock prices is generally relatively large, and the hybrid methods mentioned above do improve on single algorithms significantly. Therefore, it is reasonable to conjecture that a hybrid method generally obtains better prediction results than a single specific method. In addition, the EEMD method decomposes the original complex time series into several relatively stable subsequences. By effectively combining several current forecasting methods, the forecasting results on relatively stable subsequences should in theory be better. Combining the features of the EEMD method, which improves on empirical mode decomposition, with the LSTM machine learning algorithm, this paper proposes a hybrid LSTM-EEMD method for stock index price prediction. The rest of this paper is organized as follows. Related terminology, such as EMD, EEMD, and BRR, is presented in Section 2, while Section 3 briefly introduces the flowchart and the structure of our proposed hybrid LSTM-EEMD method.
In Section 4, we describe the experiment data collection, experiment data preprocessing, and modeling processing. In Section 5, we describe the experimental results of our proposed hybrid LSTM-EEMD method and analyze the results of the simulation experiment for prediction. Finally, the conclusion of this paper and some future works are described in Section 8.

Related Works
Over the years, many studies in the financial field have focused on the problem of stock price prediction. These studies mainly follow three research directions: (1) machine learning methods; (2) time-series analysis methods; and (3) hybrid methods. Below, we first briefly introduce related terminology. Then, we introduce the LSTM and EEMD used in this study. Although various factors temporarily change the stock price, in essence these factors are reflected in the stock price and do not change its long-term trend. Therefore, stock prices can be predicted simply from historical data. Many studies use a single analysis method to predict stock market trends, but the results are not good. It is necessary to consider a variety of factors or to combine a variety of techniques into a hybrid model to further explore the prediction of stock prices. In-depth systematic research is required to answer the following research questions:
RQ1: Which factors or combinations of factors most affect the trend of a stock?
RQ2: What combination of analysis techniques is most suitable for stock trend prediction?
RQ3: Do we need deep learning methods to mine the data in order to better discover the internal relationship between the stock market and the influencing factors?
RQ4: Does the predictability of the analysis model depend on specific characteristics of the company, such as its domain, shareholder background, and policies?
RQ5: In the context of stock market forecasting, have we developed effective forecasting methods?
RQ6: Should we focus on the specific nature of the stock price itself rather than on general relationship problems, and can price analysis driven by influencing factors be more effective?

LSTM.
The long short-term memory neural network is generally called LSTM. The LSTM was proposed by Hochreiter and Schmidhuber [27] in 1997. It is a special type of recurrent neural network (RNN). The biggest feature of the LSTM, which was later improved and promoted by Alex Graves, is that it can capture long-term dependencies in the data. LSTM has been widely used in many fields and has achieved considerable success on many problems. Because the LSTM can remember long-term information, its design avoids the long-term dependence problem. Currently, the LSTM is a very popular time-series forecasting model. Below, we first introduce the RNN, followed by the LSTM neural network.

Recurrent Neural Network.
When we deal with problems that unfold along a timeline, such as speech recognition, sequence data, machine translation, and natural language processing, traditional neural networks are powerless. The RNN was proposed specifically to solve these problems. For example, text processing must consider the correlation between the contexts of the text, and weather prediction must consider the weather of consecutive days and the relationship between the weather of the current day and that of past days. The RNN has the form of a chain of repeating neural network modules. In the standard RNN, this repeated module has only a very simple structure, such as a tanh layer. The simple structure of the recurrent neural network is shown in Figure 1.
The RNN is designed to solve nonlinear problems with a time dimension. The internal connections of an RNN generally pass data forward only, whereas a bidirectional RNN passes data in both the forward and backward directions. The RNN has a feedback mechanism, so it can easily update the weights or residual values of the previous step. This feedback mechanism is well suited to time-series forecasting: the RNN extracts rules from historical data and then uses those rules to predict the time series. Figure 1 shows the simple structure of the RNN, and Figure 2 shows the unfolded diagram of its basic structure. The part to the left of the arrow labeled "unfold" in Figure 2 is the basic structure of the RNN; the part to the right is a three-step unfolding of that structure. At time t, the input x_t is fed into module h of the RNN, and y_t is the output of module h. In a general multilayer neural network, as is well known, each layer has its own parameters. The RNN instead shares its parameters across all unfolded layers: two input parameters, W_hx and W_hh, and one output parameter, W_yh, as shown in Figure 2. Looking at Figure 2, the operation at each step appears identical, but the input x_t and the output y_t differ from step to step. Because of this sharing, the number of parameters of the RNN to be trained is significantly reduced. After multilevel unfolding, the RNN becomes a multilayer neural network. Looking closely at Figure 2, W_hh between h_{t-1} and h_t is the same as W_hh between h_t and h_{t+1}; what differs from step to step is the hidden state to which it is applied, and therefore its effect. The same holds for W_hx and W_yh.
Although each layer of the RNN has output and input modules, the output and input modules of some layers can be omitted in specific application scenarios. For example, in language translation, we only need the overall output after the last language symbol has been input and do not need the output after each individual symbol. The main feature of the RNN is that it is self-expanding, with multiple hidden layers.
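The unfolded recurrence described above can be illustrated with a minimal sketch. It assumes a scalar input and hidden state, a tanh hidden layer, a linear output, and no bias terms; these simplifications, and the function name `rnn_forward`, are assumptions for illustration only. The weight names follow Figure 2.

```python
import math

def rnn_forward(xs, W_hx, W_hh, W_yh, h0=0.0):
    """Unfold a scalar RNN over the inputs xs, reusing the same three weights."""
    h = h0
    hidden_states, outputs = [], []
    for x_t in xs:
        # The same W_hx and W_hh are applied at every time step;
        # only the input x_t and the previous hidden state change.
        h = math.tanh(W_hx * x_t + W_hh * h)
        hidden_states.append(h)
        outputs.append(W_yh * h)  # y_t, the output at time t
    return hidden_states, outputs

# Three-step unfolding, as on the right side of Figure 2.
hs, ys = rnn_forward([1.0, 0.5, -0.5], W_hx=0.8, W_hh=0.5, W_yh=2.0)
```

Only three weights are trained regardless of how many steps are unfolded, which is the parameter reduction mentioned in the text.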
As is well known, RNN models are prone to vanishing gradients during training. Once the gradient vanishes, the weights effectively stop updating and training can no longer make progress, which paralyzes the RNN. Therefore, simple RNNs are not suitable for long-term prediction. The LSTM is designed to avoid or reduce the vanishing gradient problem that a simple RNN encounters when dealing with time series with long-term correlations. On top of the simple RNN, the LSTM adds input gates, output gates, and forget gates. In Figure 3, all three gates are denoted by σ; they effectively prevent the gradient from vanishing. Therefore, the LSTM can handle long-term dependence problems. Memory cells are designed to store important state information of the LSTM. In addition, each gate generally has an activation function, which performs nonlinear transformations or trade-offs on the data. In general, the forget gate f_t filters out some of the state information. Equations (1)-(6) of the LSTM neural network describe the forget gate, input gate, candidate memory cell, memory cell, output gate, and output, respectively.
Figure 1: The simple structure of the RNN.
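For reference, the six quantities listed above are commonly written as follows. This is the standard LSTM formulation, given here as an assumption; the weight matrices W_f, W_i, W_C, W_o, the biases b_f, b_i, b_C, b_o, and the concatenation [h_{t-1}, x_t] are conventional notation and may differ from the symbols used in Figure 3.

```latex
\begin{align}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{1}\\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{2}\\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{3}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4}\\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{5}\\
h_t &= o_t \odot \tanh(C_t) \tag{6}
\end{align}
```

Here σ is the logistic sigmoid and ⊙ is element-wise multiplication. The additive update of C_t in equation (4) is what lets information, and gradients, persist across many time steps.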

EMD.
The empirical mode decomposition (EMD), proposed by Huang et al. in 1998 [42], is a widely used adaptive time-frequency analysis method and an effective decomposition method for time-series data. Because it works from the local features of the time series, the EMD method extracts the required components from the data and has obtained very good results in applications. Hence, many scholars have applied the EMD method successfully in many fields. EMD decomposition presupposes the following three assumptions [43]: …
The following briefly introduces the decomposition process:
(1) Suppose there is a signal s(i), shown as the black line in …
(4) Calculate the IMF f(t) by …, where r(t) is regarded as the residual signal.
(5) Repeat the four steps above N times until the stop conditions are met, obtaining N IMFs that satisfy …
(1) The signal at the next level must contain at least two extreme values: one minimum and one maximum.

EEMD.
Classical EMD suffers from mode mixing when decomposing complex vibration signals. To solve this problem, Wu and Huang [44] proposed the ensemble empirical mode decomposition (EEMD) method in 2009. EEMD is commonly used for nonstationary signal decomposition, but it differs significantly from the wavelet transform (WT) and the fast Fourier transform (FFT). Without the need for basis functions, the EEMD method can decompose any complex signal into a number of intrinsic mode functions (IMFs). The decomposed IMF components contain the local features of the signal at different scales. EEMD can decompose nonstationary data into multiple stable subsequences and then use the Hilbert transform to obtain the time-frequency spectrum, which has important physical significance. Compared with FFT and WT decomposition, EEMD is intuitive, direct, a posteriori, and adaptive. The EEMD method is adaptive because it works from the local features of the time-series signal. The following briefly introduces the process of decomposing data with EEMD.
Assume that EEMD decomposes the sequence X. Following the steps of EEMD decomposition, n subsequences are obtained: n − 1 IMFs and one remaining subsequence R_n, which is sometimes called the residual subsequence. The n − 1 IMFs are the n − 1 components … The detailed steps of using EEMD to decompose the sequence are as follows:
(1) Suppose there is a signal s(i), shown as the black line in …
Finally, equation (12) expresses the original sequence in terms of the n subsequences decomposed by the EEMD:
$$X = \sum_{i=1}^{n-1} \mathrm{IMF}_i + R_n, \tag{12}$$
where the number n of subsequences depends on the complexity of the original sequence. Figure 5 shows the sin time series represented by equation (13) and its IMF diagram obtained by the EEMD, where t = 0, f, 2f, ..., 1000f and f = 0.001.
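The ensemble idea behind EEMD (add white noise to the signal, decompose each noisy copy, and average the corresponding components) can be sketched as below. This is a minimal illustration, not the authors' implementation: `toy_decompose` is a hypothetical stand-in for a real EMD sifting routine, and the sketch assumes every trial yields the same number of components, which a real EMD does not guarantee. The parameters `trials`, `noise_std`, and `seed` are also assumptions.

```python
import math
import random

def toy_decompose(signal):
    """Hypothetical stand-in for EMD: a fast part plus a smooth residual."""
    n = len(signal)
    smooth = []
    for i in range(n):
        window = signal[max(0, i - 2): min(n, i + 3)]
        smooth.append(sum(window) / len(window))
    fast = [s - m for s, m in zip(signal, smooth)]
    return [fast, smooth]  # the two components sum back to the input

def eemd(signal, decompose, trials=200, noise_std=0.1, seed=0):
    """Average the components obtained from many noise-perturbed copies."""
    rng = random.Random(seed)
    totals = None
    for _ in range(trials):
        # Add a fresh white-noise series to the original signal.
        noisy = [s + rng.gauss(0.0, noise_std) for s in signal]
        components = decompose(noisy)
        if totals is None:
            totals = [[0.0] * len(signal) for _ in components]
        for total, component in zip(totals, components):
            for i, value in enumerate(component):
                total[i] += value
    # The ensemble mean of each component is taken as the result.
    return [[value / trials for value in total] for total in totals]

signal = [math.sin(0.2 * i) for i in range(50)]
components = eemd(signal, toy_decompose)
```

Because the added noise averages toward zero over many trials, the averaged components still sum approximately to the original signal, in the spirit of equation (12).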

Methodology
This section introduces in detail the principle of LSTM-EEMD, our hybrid stock price prediction method based on LSTM and EEMD. These principles are the theoretical foundation of our forecasting method. We first introduce the flowchart, the basic structure, and the process of the LSTM-EEMD hybrid stock index prediction method, which is based on ensemble empirical mode decomposition and the long short-term memory neural network. The proposed hybrid LSTM-EEMD prediction method first uses the EEMD to decompose the stock index sequence into a few simple, stable subsequences. Then, each subsequence is predicted by the LSTM method. Finally, the LSTM-EEMD obtains the final prediction result for the original stock index sequence by fusing the LSTM prediction results of all the subsequences. Figure 6 shows the structure and process of the LSTM-EEMD method, which comprises three modules: the EEMD decomposition module, the LSTM prediction module, and the fusion module. The proposed LSTM-EEMD prediction method includes three stages and eight steps, as shown in Figure 7. The three stages are data input, model prediction, and model evaluation. The model evaluation stage and the data input stage each include 4 steps, and the model prediction stage includes 3 steps. The hybrid LSTM-EEMD prediction method is introduced in detail as follows: (1) The simulation data are generated, and the real-world stock index time-series data are collected. Then, the original stock index time-series data are preprocessed so that their format satisfies the requirements of EEMD decomposition. Finally, the input data X of the LSTM-EEMD hybrid prediction method is formed. (2) The input data X is decomposed into a few sequences by the EEMD method.
If n subsequences are obtained, then there are one residual subsequence R_n and n − 1 IMF subsequences. These n subsequences are denoted R_n, IMF_1, IMF_2, IMF_3, ..., IMF_{n−1}, respectively. (3) We train n LSTM models, one for each of the n independent subsequences. These n independent LSTM models are named LSTM_k (k = 1, 2, ..., n − 1, n), respectively. We use the n LSTM models to predict the n independent subsequences and obtain n prediction values of the stock index time series. These n prediction values are named SubP_k (k = 1, 2, ..., n − 1, n), respectively. (4) The fusion function is the core of the hybrid method. There are many fusion functions, such as the sum, weighted sum, and weighted product, all of which merge several results into a final result. In this paper, the proposed hybrid stock prediction method uses the weighted sum as the fusion function: the weighted results of all subsequences are accumulated to form the final prediction result for the original stock index data. The weights can be preset according to the actual application. In this paper, we use the same weight for every subsequence, and that weight is 1. (5) Finally, we compare the predicted values with the actual values of the stock index time series and calculate the RMSE, MAE, and R^2. We use these three evaluation criteria to evaluate the LSTM-EEMD hybrid prediction method and to judge its pros and cons. Figure 7 shows the prediction process and data flowchart of the proposed LSTM-EEMD method. The prediction process in Figure 7 has 3 stages: data input, model prediction, and model evaluation. The data input stage is divided into 4 steps: collect data, preprocess data, decompose data by the EEMD, and generate n sequences. There are n LSTM models in the model prediction stage.
The input data of the LSTM models are the n sequences generated in the previous stage.
These LSTM models separately predict the n sequences to obtain n prediction results. The model evaluation stage is also divided into 4 steps. The first step is to fuse the n predicted values with weights. In this paper, we choose weighted addition as the fusion function, which sums the n prediction values with given weights. The output of the weighted addition fusion function is the prediction result for the stock index time-series data. Finally, we calculate the R^2, MAE, and RMSE of the prediction results before evaluating the proposed hybrid prediction model. The quality of the proposed hybrid prediction model can be evaluated directly from the values of R^2, MAE, and RMSE.
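The decompose, predict, and fuse flow can be sketched as follows. This is a minimal illustration under stated assumptions: `decompose` stands for an EEMD routine and each entry of `predictors` for a trained LSTM_k, both passed in as plain callables; the function names `fuse` and `hybrid_predict` are hypothetical. The default weights of 1 follow the text.

```python
def fuse(sub_predictions, weights=None):
    """Weighted sum of the per-subsequence predictions, element by element."""
    if weights is None:
        weights = [1.0] * len(sub_predictions)  # equal weights of 1, as in the text
    length = len(sub_predictions[0])
    return [
        sum(w * p[t] for w, p in zip(weights, sub_predictions))
        for t in range(length)
    ]

def hybrid_predict(series, decompose, predictors):
    """Decompose the series, predict each subsequence, then fuse the results."""
    subsequences = decompose(series)
    # One predictor per subsequence, mirroring LSTM_k and SubP_k.
    sub_predictions = [predict(sub) for predict, sub in zip(predictors, subsequences)]
    return fuse(sub_predictions)

# Toy usage: split the series into two halves and "predict" each with identity.
halves = lambda s: [[v / 2 for v in s], [v / 2 for v in s]]
result = hybrid_predict([2.0, 4.0], halves, [lambda s: s, lambda s: s])
```

With all weights equal to 1, the fusion reduces to a plain sum, which matches the reconstruction of a sequence from its IMFs and residual.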

Experiment Data
This section introduces the experiment data used in this paper. To better evaluate the prediction effect of the method, we selected two types of experiment data. The first type is artificially simulated data generated automatically by computer, which we use to verify the correctness and effectiveness of our method. The second type is real-world stock index data. Only real-world data can truly test the quality of the method, and passing such a test is the most fundamental requirement for applying our proposed method in real-world fields.

Artificially Simulated Experimental Data.
We use artificially simulated experiment data to verify the effectiveness and correctness of our method. To obtain effective and accurate experimental results, the simulated data should be long enough. Hence, in the experiment, we set the length of the artificially simulated data to 10,000. The simulated data are generated by computer according to the sin function of equation (13).

Real-World Experimental Data.
To study the effect of the proposed prediction method empirically, we collected real-world stock index data from Yahoo Finance as experiment data. To obtain more objective experimental results, we chose 4 stock indices from … (1) Data collection.

Data Preprocessing.
To obtain good experimental results, we do our best to handle all incorrect data in the experiment data. Having more experimental data is, of course, also an important condition for obtaining good results. Incorrect data mainly include records with zero trading volume and records with exactly the same values for two or more consecutive days. After removing the incorrect and noisy records from the original data, we show the trend of the closing price of the four major stock indexes in Figure 8.
In the experiment, we first standardize the experimental data. Let X_i = {x_i(t)} be the ith stock index time series, where x_i(t) is its value at time t, t = 1, 2, 3, ..., T, and i = 1, 2, 3, ..., N. We define the logarithmic daily return as G_i(t) = log(x_i(t)) − log(x_i(t − 1)). We define the standardized daily return as R_i(t) = (G_i(t) − ⟨G_i(t)⟩)/δ, where ⟨G_i(t)⟩ is the mean of the logarithmic daily return G_i(t) and δ is its standard deviation. Here, the mean of n values is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
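As a worked illustration of the two formulas above, the following sketch computes the logarithmic daily return G(t) and its standardized form R(t) for a short price list. The use of the population standard deviation for δ is an assumption, since the text does not say which estimator is used; the function names and the sample prices are hypothetical.

```python
import math
import statistics

def log_returns(prices):
    """G(t) = log(x(t)) - log(x(t - 1)) for t = 1, ..., T - 1."""
    return [math.log(prices[t]) - math.log(prices[t - 1]) for t in range(1, len(prices))]

def standardize(returns):
    """R(t) = (G(t) - mean(G)) / delta."""
    mean = statistics.mean(returns)
    delta = statistics.pstdev(returns)  # assumption: population standard deviation
    return [(g - mean) / delta for g in returns]

prices = [100.0, 110.0, 99.0, 120.0]  # made-up closing prices
g = log_returns(prices)
r = standardize(g)
```

After standardization the returns have zero mean and unit standard deviation, which puts indices with very different price levels on a common scale.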

Experiment Results of Other Methods
To compare our method with other forecasting methods, we studied and analyzed the results of three other forecasting methods on the same data. Table 1 shows the experiment results of five prediction methods. Since the LSTM is introduced above, it is not repeated here. First, SVR, BARDR, and KNR are briefly introduced. Then, the experimental results of these three methods are analyzed in detail.

SVR.
SVR is an abbreviation of Support Vector Regression, a widely used regression method. According to the manual of the libsvm toolbox, the loss and penalty functions control the training process. The libsvm toolbox is a support vector machine software package, mainly used for SVM pattern recognition and regression. In the experiment, the SVR used a linear kernel, which was implemented with liblinear instead of libsvm. As a result, the SVR should scale better to large numbers of samples, and it can choose among a variety of penalty functions and loss functions. SVR has 10 parameters, and the settings of these parameters are shown in Table 2.

BARDR.
BARDR is short for Bayesian ARD Regression, which is used to fit the weights of the regression model. This method assumes that the weights of the regression model follow a Gaussian distribution. To better fit the regression model weights, the BARDR uses the ARD prior technique. The parameters alpha and lambda of BARDR are the precision of the noise distribution and the precisions of the weight distributions, respectively. BARDR has 12 parameters. Table 3 shows the settings of these parameters.

KNR.
KNR is an abbreviation of K-nearest Neighbors Regression. The K-nearest Neighbors Regression model is a nonparametric model. It simply uses the target values of the K nearest training samples to decide the regression value of the sample to be tested. That is, it predicts the regression value based on the similarity of the samples. Table 4 shows the 8 parameters of KNR and their settings.
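The K-nearest neighbors idea can be shown in a few lines. This is a minimal sketch, not the configuration listed in Table 4: it assumes one-dimensional features, absolute distance, and an unweighted average of the K nearest targets, and the function name `knn_regress` is hypothetical.

```python
def knn_regress(train_x, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training samples."""
    # Rank training samples by their distance to the query point.
    ranked = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

# The three samples closest to 2.1 are x = 2, 3, and 1.
prediction = knn_regress([1.0, 2.0, 3.0, 10.0], [1.0, 2.0, 3.0, 10.0], 2.1, k=3)
```

Because the prediction is an average of stored targets, the method has no trained weights, and it cannot predict values outside the range of targets seen in training.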
The experimental results of the other four methods and our two proposed methods on the five sequences sin, SP500, HSI, DAX, and ASX are shown in Table 1. In the experiment, we preprocessed the five sequences for all of these methods. Table 5 shows the length of the experimental data. Table 1 reports the values of R^2 (R squared), MAE (mean absolute error), and RMSE (root mean square error), computed from the actual values and the predicted values. The smaller the MAE or RMSE, the better the method performs, whereas the larger the R^2, the better the prediction effect of the method.
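For concreteness, the three criteria can be computed as in the sketch below. The text does not spell out the formulas, so the standard definitions are assumed; the function names and the sample values are hypothetical.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
scores = (mae(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
```

MAE and RMSE are in the units of the data and are 0 for a perfect prediction, while R^2 is unitless and equals 1 for a perfect prediction.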
For comparison, the top two results for each sequence are shown in bold in Table 1. Among the four traditional methods (SVR, BARDR, KNR, and LSTM), BARDR and LSTM are the best at predicting the artificial sin sequence. BARDR and LSTM also give the best results on the SP500, HSI, and DAX real time-series data, whereas BARDR and SVR give the best results on the ASX real time-series data. Therefore, among the four traditional methods, BARDR and LSTM are the best overall for the five sequences. Looking carefully at Table 1, we find that all methods except KNR predict the sin sequence very well. The R^2 values of the SVR, BARDR, LSTM, LSTM-EMD, and LSTM-EEMD methods are all greater than 0.96. BARDR, LSTM, and LSTM-EMD have the best prediction effect, with R^2 values all greater than 0.99. Among them, BARDR is the best, with an R^2 value greater than 0.9999. These values show that the methods, especially BARDR, predict well on time series that change regularly and stably. Figure 9 shows the prediction results of one of the six methods. To keep the plot clear and legible, Figure 9 displays only the last 90 prediction results.
Observing the SVR results in Table 1, we found that this method predicts the DAX, ASX, and sin series better than the SP500 and HSI series. Figure 10 shows the SVR prediction results on ASX, where the prediction effect is good. Because the sin sequence changes with good regularity and stability, the SVR method appears more suitable for predicting sequences with good regularity and stability. It can therefore be speculated that the DAX and ASX time series also have good regularity and stability. However, when the SVR method predicts the SP500 and HSI sequences, the prediction effect is very poor. This suggests that the changes in the SP500 and HSI sequences have poor regularity and stability.
Observing the KNR results in Table 1, we found that this method performs similarly to the SVR method. The prediction results on the DAX, ASX, and sin sequences are better than those on the SP500 and HSI sequences. For SP500 and HSI in particular, the prediction effect is poor. Stock time series are strongly affected by human activity, so their changes are complicated, irregular, and unstable. It can be inferred that KNR is not suitable for predicting stock sequences or, more generally, sequences with unstable and irregular changes.
Observing the BARDR and LSTM results in Table 1, we found that both methods perform relatively well. They not only predict the DAX, ASX, and sin time series well but also predict the SP500 and HSI sequences well. In particular, their predictions for the SP500 and HSI sequences are much better than those of the KNR and SVR methods. Although the changes in stock sequences are irregular, complex, and unstable, the prediction effects of BARDR and LSTM remain very good. It can be concluded that BARDR and LSTM can predict stock time series and are therefore more suitable for predicting sequences with unstable and irregular changes. Figure 11 shows the BARDR prediction results for the DAX time series, where the prediction effect is good.
Table 1 also shows that BARDR and LSTM both have good prediction effects on the five time series used in the experiment. Comparing their results, it is easy to see that BARDR performs better than LSTM: LSTM is slightly better than BARDR on the SP500 time series but worse on the other four. Although BARDR predicts better than LSTM, its running time in our experiments is several times longer. Moreover, although LSTM predicts worse than BARDR in our experiments, its running time is short and it uses very few training parameters. In future work, we may be able to improve the performance of our proposed methods by increasing the number of neurons and the number of training iterations of the LSTM. Based on an analysis of its principles, we expect that the prediction effect of the LSTM will exceed that of BARDR when the number of neurons and the number of training iterations are reasonably increased.

Experiments
To better evaluate the prediction effect of the proposed method, we selected two types of experimental data. The first type is artificially simulated data generated by computer. The second type is real-world stock index data. Testing the model on real-world data is the most fundamental requirement for applying our proposed method in practice. This section analyzes the experimental results on both types of data in detail.

Analysis of Experimental Results Based on Artificial Simulation Data.
We use artificially simulated experimental data to verify the effectiveness and correctness of our method. To obtain effective and accurate experimental results, the artificially generated data should be long enough. Hence, in the experiment, we set the length of the artificially simulated data to 10,000, as shown in Table 5.
Before analyzing the experimental results on real-world stock index data, we first use the proposed LSTM-EMD and LSTM-EEMD methods to predict the sin simulation sequence. This simulation experiment can verify the effectiveness and correctness of the two proposed methods.
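A sin simulation sequence of this length can be prepared for one-step-ahead prediction roughly as follows. This is only a sketch: the sampling step, the window length, and the 80/20 train/test split are hypothetical choices made for illustration and are not stated in this section.

```python
import math

def make_sin_series(length=10_000, step=0.01):
    """Generate sin(i * step) for i = 0 .. length - 1 (step is an assumption)."""
    return [math.sin(i * step) for i in range(length)]

def make_windows(series, window=10):
    """Turn a series into (input window, next value) pairs for supervised learning."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets

series = make_sin_series()
inputs, targets = make_windows(series, window=10)

# Chronological split (assumed 80/20): the model never sees future values.
split = len(inputs) * 8 // 10
train_x, train_y = inputs[:split], targets[:split]
test_x, test_y = inputs[split:], targets[split:]

print(len(series), len(train_x), len(test_x))
```

The split is chronological rather than random so that every test window lies after all training windows, which is the usual precaution for time-series forecasting.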
Observing the sin data column of Table 1, we found that the three indicators of the LSTM, LSTM-EMD, and LSTM-EEMD methods are basically equivalent under the same number of neurons and iterations. This indicates that the three methods have similar prediction effects on the sin simulation time series. Among them, the LSTM-EMD method has the best prediction effect, indicating that the method we proposed is effective and can improve the prediction effect. The sin data column in Table 1 also shows that the results of the LSTM are slightly better than those of the LSTM-EEMD, but the gap is very small and can almost be ignored. In-depth analysis suggests that this is because the sin time series is itself very regular. After noise is added and EEMD decomposition is applied, the decomposed sub-time series become more numerous and more complicated, so the prediction effect is slightly worse. In summary, the two proposed prediction methods, LSTM-EMD and LSTM-EEMD, have a better comprehensive effect than the original LSTM method, indicating that they are correct and effective.

Analysis of Experiment Results Based on Real Data.
To verify the practicability of the proposed LSTM-EMD and LSTM-EEMD prediction methods, we use them to predict four real-world stock index time series from the financial market: the DAX, ASX, SP500, and HSI. A stock index time series is generated in human society and reflects social phenomena. Stock index time series in the financial market are strongly affected by human factors, so they are complex and unstable. These four time series are very representative. They come from four different countries or regions and can better reflect the changes in different countries in the world financial market.
By comparing the three evaluation indicators RMSE, MAE, and R² of the LSTM, LSTM-EMD, and LSTM-EEMD prediction methods in Table 1, we found that the prediction results of the LSTM-EEMD method on the four sequences are better than those of the LSTM-EMD method.
The experimental results therefore show that the LSTM-EEMD prediction method is better than the LSTM-EMD prediction method. Figure 12 shows the prediction results of the LSTM-EMD and LSTM-EEMD methods on the HSI time series.
Table 1 shows that the results of the proposed LSTM-EMD and LSTM-EEMD methods on the HSI, DAX, and ASX sequences are much better than those of the traditional LSTM method. We think that there are two main reasons for these good experimental results. On the one hand, the proposed method decomposes the HSI, DAX, and ASX time series by EMD or EEMD, so that each complex original time series is decomposed into multiple more regular and stable sub-time series. These sub-time series are easier to predict than the complex original time series. Therefore, predicting the sub-time series is more accurate than directly predicting the complex original time series. On the other hand, although the changes in the HSI, DAX, and ASX time series are complicated, they can be decomposed into more regular and stable time series. To show clearly how these time series change under EMD or EEMD decomposition, Figures 13 and 14 show all components of the EMD and EEMD decompositions of the DAX time series, respectively. Figure 8 shows the original DAX time series, and the subgraphs in Figures 13 and 14 are its EMD and EEMD decomposition components. It is easy to see that the lower subgraphs are gentler, more regular, more stable, and more predictable, and therefore have better prediction effects.
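The decompose, predict, and fuse idea described above can be sketched in a few lines. This is not the implementation used in the experiments. The `decompose` function below is a simple moving-average stand-in for EMD or EEMD and performs no sifting. The `predict_next` function is a linear-extrapolation stand-in for a trained LSTM. Fusion by summation is an assumption, motivated by the fact that EMD components add up to the original series. All names and the toy series are hypothetical.

```python
import math

def moving_average(series, width):
    """Centered moving average; the window is truncated at both edges."""
    half = width // 2
    out = []
    for i in range(len(series)):
        window = series[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def decompose(series, widths=(5, 25)):
    """Additive stand-in for EMD/EEMD (NOT real sifting).

    Each pass peels off the fast detail (current - trend) and keeps the
    smoother trend, so the components sum back to the original series.
    """
    components = []
    current = list(series)
    for width in widths:
        trend = moving_average(current, width)
        components.append([c - t for c, t in zip(current, trend)])
        current = trend
    components.append(current)  # final residue (slowest trend)
    return components

def predict_next(component):
    """Stand-in for a per-component LSTM: linear extrapolation of the last two points."""
    return 2 * component[-1] - component[-2]

def hybrid_forecast(series):
    """Predict each component separately, then fuse by summation (assumed)."""
    return sum(predict_next(c) for c in decompose(series))

# Toy series (hypothetical): an oscillation on top of a slow upward drift.
series = [math.sin(0.1 * i) + 0.01 * i for i in range(200)]
components = decompose(series)
print(len(components), hybrid_forecast(series))
```

Because the stand-in predictor is linear, the fused forecast here coincides with applying the same predictor directly to the original series. Any benefit of decomposition can only appear when each component is fitted by its own nonlinear model, as with one LSTM per sub-time series.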
Careful observation of the results of the LSTM, LSTM-EMD, and LSTM-EEMD methods on the HSI, DAX, and ASX sequences in Table 1 shows that the proposed LSTM-EEMD method has a better experimental effect than the proposed LSTM-EMD method, which in turn has a better experimental effect than the traditional LSTM method. The three indicators used in the experiment (RMSE, MAE, and R²) all exhibit this relationship. The smaller the RMSE or MAE of a method, the better its prediction effect. Conversely, the larger the R² of a method, the better its prediction effect.
In this paper, we proposed a hybrid prediction method based on EMD or EEMD. The experimental results show that, in most instances, the fused prediction results are superior to those obtained by traditional methods that directly predict the complex original time series. Although the experiments in this paper compared the two hybrid methods based on EMD or EEMD with four classical methods (SVR, BARDR, KNR, and LSTM), the two hybrid methods have different effects on different time series, and each method shows different advantages and disadvantages on different time series. In practical applications, users can choose a method according to the actual situation and apply it to a specific field. If the results obtained by the chosen method do not meet expectations, another method can be chosen.

Conclusion and Future Work
We proposed a hybrid short-term prediction method, LSTM-EMD or LSTM-EEMD, based on LSTM and the EMD or EEMD decomposition method. The method follows a divide-and-conquer strategy for complex problems. Combining the advantages of EMD, EEMD, and LSTM, we use the EMD or EEMD method to decompose the complex sequence into multiple relatively stable and gentle subsequences and then use the LSTM neural network to train on and predict each sub-time series. The prediction process is simple and requires only two steps. First, we use the LSTM to predict the value of each sub-time series. Then, the prediction results of the sub-time series are fused to form the prediction result for the complex original time series. In the experiment, we selected five data sets to fully test the performance of the method. The comparison with the other four prediction methods shows that the predicted values of the proposed method have higher accuracy. The hybrid prediction method we proposed is effective and accurate in future stock price prediction. Hence, the hybrid prediction method has practical application and reference value. However, there are some shortcomings. The proposed method has some unexpected effects on the experimental results for time series with very orderly changes. The research and application of time-series analysis and prediction methods have a long history and are developing rapidly, but the prediction effect of traditional methods still fails to meet the requirements of real applications in certain fields. Improving the prediction effect is the most direct research goal. Taking the results of this paper as a starting point, much work remains for future research. In the future, we will combine the EMD or EEMD method with other methods, or use the LSTM method in combination with wavelet decomposition or VMD.

Data Availability
Data are fully available without restriction. The original experimental data can be downloaded from Yahoo Finance for free (http://finance.yahoo.com).

Conflicts of Interest
The authors declare that they have no conflicts of interest.