Hybrid data decomposition-based deep learning for Bitcoin prediction and algorithm trading

In recent years, Bitcoin has received substantial attention as a potentially high-earning investment. However, its volatile price movement poses great financial risks. Therefore, accurately predicting and capturing changing trends in the Bitcoin market is of substantial importance to investors and policy makers. However, empirical work on Bitcoin forecasting and trading support systems is at an early stage. To fill this void, this study proposes a novel data decomposition-based hybrid bidirectional deep-learning model for forecasting the daily price change in the Bitcoin market and conducting algorithmic trading on the market. Two primary steps are involved in our methodology framework, namely, data decomposition for inner factors extraction and bidirectional deep learning for forecasting the Bitcoin price. Results demonstrate that the proposed model outperforms other benchmark models, including econometric models, machine-learning models, and deep-learning models. Furthermore, the proposed model achieved higher investment returns than all benchmark models and the buy-and-hold strategy in a trading simulation. The robustness of the model is verified through multiple forecasting periods and testing intervals.

losses are much greater. Therefore, accurately predicting and capturing the changing trends in the Bitcoin market are of great importance to investors and policy makers.
The rise of Bitcoin and its underlying blockchain technology have attracted significant attention from scholars. In recent years, numerous studies have analyzed them from different perspectives, such as their adoption across different industry sectors (Chen et al. 2016;Easley et al. 2019;Janssen et al. 2020;Mu et al. 2019), their impact on firms (Cheng et al. 2019), and the relationship between Bitcoin transactions and illegal activities (Gandal et al. 2018). For example, Leng et al. (2019) proposed a blockchain-driven model to manage the cyber-credit of social manufacturing among various makers. As an example of investigating the association between Bitcoin transactions and criminal activities, Foley et al. (2019) analyzed a large number of cryptocurrency transactions and found that approximately one-quarter of all Bitcoin transactions were involved in illegal activity. In terms of investigating the relationship between Bitcoin and energy emissions, Mora et al. (2018) concluded that the emissions created from the mining of Bitcoin could increase the global temperature by 2 °C. However, despite the scholarly attention, empirical works on Bitcoin pricing forecasting models remain at a relatively early stage.
Some natural questions arise regarding Bitcoin price prediction: Is the Bitcoin price predictable? What factors affect its market price? Potential factors affecting Bitcoin price movements discussed by previous studies can be classified into two groups. The first group comprises internal factors, which include features such as Bitcoin market volume, Bitcoin transaction patterns, and hash rate. For example, Chen et al. (2019) reported that the Bitcoin network, Bitcoin trading, investor attention, and gold price can effectively influence the Bitcoin price. Jang and Lee (2018) indicate that factors such as the hash rate, difficulty, and block size can cause Bitcoin price movement. Catania et al. (2019) applied a set of cryptopredictors to study the predictability of a cryptocurrency time series. The second group involves external factors, which include features such as external commodities, investor sentiment, and exchange rates of other currencies (Baur and Dimpfl 2019; Liu 2019; Yu 2019). In this respect, Kristoufek (2013) utilized Google trends and the Wikipedia index to investigate the relationship between search queries and Bitcoin volatility. Li and Wang (2017) found that as Bitcoin evolves, the Bitcoin exchange rate relates more to economic fundamentals and less to technology factors. Ji et al. (2019) suggested that Bitcoin is integrated within energy, metals, and other commodity markets. Moreover, Selmi et al. (2018) assessed the role of Bitcoin as a hedge, safe haven, and diversifier against extreme oil price movements. In essence, these findings indicate that the Bitcoin price is predictable given appropriate factors and prediction models.
Several types of models have been adopted by previous studies to forecast the Bitcoin market price. Econometric models, such as generalized autoregressive conditional heteroskedasticity (GARCH), vector autoregressive (VAR), and Grey Lotka-Volterra (GLVM), were first introduced to investigate the determinants of Bitcoin returns (Zhu et al. 2017; Jalali and Heidari 2020). Dyhrberg (2016) investigated asset capabilities of Bitcoin with the GARCH model, which showed that Bitcoin is similar to some major commodities, such as gold and stock. Katsiampa (2017) introduced the AR-CGARCH model to describe the volatility and price returns of Bitcoin. However, the studies described above are mostly of an explanatory nature, as they do not focus on predictive capacity.
The econometric models also have a major drawback in that the models all assume the time series are linear and stationary, which is hardly satisfied by the volatile and nonstationary nature of the Bitcoin market (Yu et al. 2008). As a result, they are less effective in predicting the Bitcoin market price.
Some studies have adopted machine-learning methods to develop prediction models for the financial market, which attempt to capture the nonlinear characteristics of financial time series (Thakkar and Chaudhari 2020b, 2021a; Thakkar and Lohiya 2021). Kristjanpoller and Minutolo (2018) propose a framework integrating GARCH and ANN to forecast the price volatility of Bitcoin. Peng et al. (2018) use support vector regression (SVR) to predict the volatility of cryptocurrencies. More recently, a few studies have adopted deep-learning models to forecast the financial market price as they have shown superior performance over their shallow counterparts (Ramadhani et al. 2018; Thakkar and Chaudhari 2020c; Thakkar and Chaudhari 2021b; Chaudhari and Thakkar 2021). For example, Altan et al. (2019) utilized the long short-term memory (LSTM) neural network to identify nonlinear properties of the Bitcoin price time series. Atsalakis et al. (2019) developed a novel neuro-fuzzy technique with artificial neural networks, which demonstrated improved prediction accuracy and trading results compared to traditional artificial neural networks. Ji et al. (2019) compare the prediction accuracy of several deep-learning methods, including deep neural networks, convolutional neural networks, and LSTM, and find varied prediction performances among different models. Thakkar and Chaudhari (2020a) use integrated neural networks to improve the directional accuracy of the predicted stock trend.
Since the Bitcoin price is an extremely volatile and non-stationary time series, prediction accuracy may suffer as a result. In recent years, another type of ensemble learning approach based on the concept of "divide and conquer" has been proposed to improve prediction accuracy on non-stationary time series. This type of approach decomposes the original time series into different cycle factors. The decomposed factors are estimated individually and then integrated to generate the final prediction output. Currently, empirical mode decomposition (EMD) is the predominant method used to decompose non-stationary time-series data into intrinsic mode functions (IMF). For example, Yu et al. (2015) adopt a hybrid approach of complementary ensemble empirical mode decomposition (CEEMD) and extended extreme learning machine to forecast crude oil prices. Wen et al. (2017) use CEEMD and combined SVM and ANN to forecast gold prices. Santhosh et al. (2019) combine EEMD and Deep Boltzmann Machines to forecast wind energy. However, the prediction errors from the individual decomposed modes tend to accumulate, which could negatively affect the forecasting results of the prediction model. Moreover, a mode-mixing problem may occur in the process of EMD, which can produce oscillations with similar scales in different IMFs (Colominas et al. 2014).
Based on the previous studies discussed above, this paper proposes a novel data decomposition-based hybrid bidirectional deep-learning model to forecast the Bitcoin market price. First, a non-recursive signal decomposition method, variational mode decomposition (VMD), is introduced to decompose historical Bitcoin price data into various intrinsic modes. In comparison to the widely adopted EMD method, VMD can effectively avoid the mode-mixing problem (Dragomiretskiy and Zosso 2014). Second, a bidirectional long short-term memory neural network (BiLSTM) is employed as the deep-learning prediction model. The proposed deep-learning model is able to extract a two-way sequential relationship in the time series (Ullah et al. 2017). To assess the prediction performance of the proposed model, several types of prediction models, such as econometric models, machine-learning models, and deep-learning models, are used as benchmarks. The results indicate that the proposed decomposition-based bidirectional deep-learning model can effectively improve predictability. In addition, the results reveal that although data decomposition improves the overall predictive ability of the model, not all decomposed factors contribute equally to the improvement. To further test the practicality of the model, algorithmic trading is conducted based on the prediction results and the performance is assessed against the buy-and-hold strategy. The results indicate that the proposed VMD-LMH-BiLSTM model generates higher returns compared to the other measured strategies.
The methodology and empirical results of our study shed new light on developing a reliable Bitcoin forecasting and trading decision support system based on large-scale online datasets and data-driven approaches. Furthermore, the empirical results indicate that the proposed model is best suited to volatile, non-stationary financial time series (such as Bitcoin prices), for which prediction accuracy may otherwise suffer because of high volatility. The "divide and conquer" concept behind our proposed approach decomposes the original time series into different cycle factors and then integrates them to generate the final predictive output, which can effectively improve prediction accuracy on non-stationary time series.
The intended contribution of this paper lies in proposing a synthetic framework for Bitcoin price prediction that incorporates influential macroeconomic and investor-behavior factors, based on data decomposition and deep-learning approaches. In particular, we incorporate Google trends in order to utilize the hidden and effective information of irrational behaviors in the Bitcoin market to improve the forecasting results. We empirically confirm the effectiveness of our proposed hybrid deep-learning model for Bitcoin price forecasting. Our proposed model outperforms several benchmark econometric models, machine-learning models, deep-learning models, and hybrid learning models. In addition, we extend the practical implications of our paper by conducting algorithmic trading based on the forecasting model.
The remainder of this paper is organized as follows: "Methodological framework" section presents the methodological framework of this paper, including VMD and bidirectional LSTM neural networks. "Empirical study" section presents the empirical study on the Bitcoin market and the performance results, as well as the robustness tests of our proposed model. "Conclusion" section concludes and provides plans for future works.

Methodological framework
This section presents the proposed VMD-LMH-BiLSTM methodology framework for Bitcoin market price forecasting and algorithmic trading as shown in Fig. 1.
In the proposed approach, two main steps are involved, i.e., data-decomposition and deep-learning forecasting.
Step 1: Data decomposition. An effective VMD decomposition technique is utilized to decompose the original time-series data of the Bitcoin market price $X_t$ into $K$ simple and stationary sub-series of different frequencies, which correspond to the different inner factors of the data.
Step 2: Deep-learning forecasting. A bidirectional LSTM deep-learning model is employed as the forecasting tool to generate the prediction result for the Bitcoin market price. The forecasting performance is evaluated by comparing the proposed model with various benchmark models and through robustness tests. Meanwhile, the economic performance of the model is evaluated by algorithmic trading on the Bitcoin market.
"Variational mode decomposition" and "Bidirectional LSTM" sections provide a detailed description into the corresponding techniques of VMD and bidirectional LSTM, respectively.

Variational mode decomposition
VMD is an entirely non-recursive signal decomposition technique proposed by Dragomiretskiy and Zosso (2014). Based on Wiener filtering and the Hilbert transform (Wang and Markert 2015), it decomposes the original input signal $f(t)$ into a series of quasi-orthogonal, band-limited discrete sub-signals $u_k$ that are mostly centered tightly around their respective center frequencies $\omega_k$ (Li et al. 2021). In essence, VMD is a variational optimization problem that seeks to minimize the sum of the bandwidths of the modes. The optimization procedure is as follows (Zhang et al. 2017):
Step 1: Calculate the Hilbert transform of each mode $u_k$ and transform it into its respective one-sided frequency spectrum;
Step 2: Shift the frequency spectrum of each mode $u_k$ to a narrow baseband by multiplying it by an exponential function tuned to the corresponding estimated center frequency;
Step 3: Obtain the bandwidth of each mode $u_k$ from the $H^1$ Gaussian smoothness of the demodulated signal.
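The iterative minimization process can be expressed in the following form:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) \otimes u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \tag{1}$$

where $\{u_k\}$ and $\{\omega_k\}$ are the modes and their respective center frequencies, $K$ denotes the number of decomposed sub-signals, $\delta(t)$ denotes the Dirac delta function, $\otimes$ denotes the convolution operator, and $f(t)$ represents the original input signal.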
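To obtain the optimal solution of the constrained optimization problem in Eq. (1), a quadratic penalty term with weight $\alpha$ and a Lagrangian multiplier $\lambda(t)$ are introduced for finite convergence and constraint enforcement purposes. Thus, the augmented Lagrangian function $L$ can be obtained as follows:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) \otimes u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$

The Lagrangian function is shifted from the time domain to the frequency domain and the corresponding extreme values are calculated. The modes $u_k$ and their respective center frequencies $\omega_k$ are updated as follows:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}$$

where a hat denotes the Fourier transform and $n$ is the iteration index. The optimal solution is then obtained using the alternating direction method of multipliers, and the original input signal $f(t)$ is decomposed into $K$ sub-signal modes.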
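The frequency-domain updates described above can be illustrated with a short NumPy sketch. This is a simplified illustration under stated assumptions, not the implementation used in this study: it works directly on the one-sided spectrum of a real signal without the mirror extension used in reference implementations, and the function name `vmd_sketch`, the default `alpha=2000.0`, the step size `tau`, the tolerance, and the synthetic two-tone test signal are all hypothetical choices.

```python
import numpy as np

def vmd_sketch(signal, n_modes, alpha=2000.0, tau=0.0, tol=1e-9, max_iter=500):
    """Simplified VMD on the one-sided spectrum of a real signal (no mirror padding)."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    freqs = np.fft.rfftfreq(n)                      # normalized frequencies in [0, 0.5]
    f_hat = np.fft.rfft(signal)
    u_hat = np.zeros((n_modes, freqs.size), dtype=complex)
    omega = 0.5 * np.arange(n_modes) / n_modes      # assumed uniform initial center frequencies
    lam_hat = np.zeros(freqs.size, dtype=complex)   # Lagrangian multiplier in the frequency domain

    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(n_modes):
            # Wiener-filter update of mode k around its current center frequency
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam_hat / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Center frequency = power-weighted mean frequency of the updated mode
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / (np.sum(power) + 1e-12)
        # Dual ascent on the reconstruction constraint (tau = 0 disables it)
        lam_hat = lam_hat + tau * (f_hat - u_hat.sum(axis=0))
        change = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + 1e-12)
        if change < tol:
            break

    order = np.argsort(omega)                       # sort modes from low to high frequency
    modes = np.fft.irfft(u_hat[order], n=n, axis=1)
    return modes, omega[order]

# Synthetic two-tone signal (hypothetical test data, not Bitcoin prices)
t = np.arange(500)
demo = np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 0.2 * t)
modes, centers = vmd_sketch(demo, n_modes=2)
```

On this synthetic input the two recovered center frequencies settle near 0.05 and 0.2 cycles per sample, and the modes sum back to the input. Real price series are not periodic, so a practical implementation would add boundary handling such as mirror extension.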

Bidirectional LSTM
Proposed by Schuster and Paliwal (1997), the bidirectional recurrent neural network is a recurrent neural network (RNN) that utilizes both forward and backward information in the data. In this paper, the traditional RNN cells are replaced by LSTM cells. An LSTM cell consists of an input gate $i_t$, a forget gate $f_t$, an output gate $o_t$, as well as a memory cell block $C_t$. The forget gate $f_t$ and input gate $i_t$ are defined as follows:

$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)$$

$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right)$$

A tanh layer is then utilized to generate a new memory cell block $\tilde{C}_t$. The existing memory cell block $C_t$ is then updated, and the output gate $o_t$ as well as the hidden state $h_t$ are generated:

$$\tilde{C}_t = \tanh\left(W_C [h_{t-1}, x_t] + b_C\right)$$

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$

$$o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right)$$

$$h_t = o_t * \tanh(C_t)$$

where $x_t$ denotes the input at time $t$, $[h_{t-1}, x_t]$ is the concatenation of the previous hidden state and the current input, $\sigma$ represents the sigmoid function, and $*$ is the element-wise multiplication. $W$ and $b$ are the respective weight matrices and bias vectors.

Li et al. Financial Innovation (2022) 8:31
As illustrated in Fig. 2, the BiLSTM contains two hidden layers, where one of the layers processes information in the forward direction and the other layer processes information in the reverse direction. The two hidden layers are connected to one output layer so that the BiLSTM neural network can learn the information from two different data directions. Time-series data contains a two-way sequential relationship, since the current state is not only the reflection of historical information but also the basis of the future state. The BiLSTM is therefore better suited to complex real-world data and can make more accurate predictions.
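The gate computations and the two-direction structure can be sketched in NumPy as follows. This is a minimal illustration with randomly initialized weights, not the trained network from this study; the stacked weight layout (forget, input, candidate, output), the helper names, and the dimensions `n_in = 3` and `hidden = 4` are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps concat([h_prev, x_t]) to four stacked gate pre-activations."""
    hidden = h_prev.size
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0:hidden])                    # forget gate
    i_t = sigmoid(z[hidden:2 * hidden])           # input gate
    c_tilde = np.tanh(z[2 * hidden:3 * hidden])   # candidate memory cell
    o_t = sigmoid(z[3 * hidden:4 * hidden])       # output gate
    c_t = f_t * c_prev + i_t * c_tilde            # updated memory cell
    h_t = o_t * np.tanh(c_t)                      # hidden state
    return h_t, c_t

def run_lstm(xs, W, b, hidden):
    """Run the cell over a sequence, starting from zero states; returns all hidden states."""
    h, c = np.zeros(hidden), np.zeros(hidden)
    outputs = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, b)
        outputs.append(h)
    return np.stack(outputs)

def bilstm(xs, params_fwd, params_bwd, hidden):
    """Concatenate forward states with backward states re-aligned to the original time order."""
    fwd = run_lstm(xs, *params_fwd, hidden)
    bwd = run_lstm(xs[::-1], *params_bwd, hidden)[::-1]
    return np.concatenate([fwd, bwd], axis=1)

def init_params(rng, n_in, hidden):
    # Small random weights and zero biases (illustrative initialization only)
    return rng.normal(scale=0.1, size=(4 * hidden, hidden + n_in)), np.zeros(4 * hidden)

rng = np.random.default_rng(0)
T, n_in, hidden = 25, 3, 4                        # hypothetical window length and sizes
xs = rng.normal(size=(T, n_in))
params_fwd = init_params(rng, n_in, hidden)
params_bwd = init_params(rng, n_in, hidden)
states = bilstm(xs, params_fwd, params_bwd, hidden)   # shape (T, 2 * hidden)
```

Each row of `states` holds the forward hidden state, which has seen inputs up to that time step, next to the backward hidden state, which has seen inputs from that time step to the end of the window. A dense output layer would then map these concatenated states to the forecast.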

Empirical study
In this study, the proposed VMD-LMH-BiLSTM model is used to predict the Bitcoin market price and conduct algorithmic trading based on the predictions. In order to verify the effectiveness of the proposed model, historical Bitcoin market price time series is used as the sample data. To potentially improve the forecasting performance as suggested by previous works (Ramadhani et al. 2018; Gyamerah 2020), we have also included several internal market factors, external factors, as well as macroeconomic factors that could influence the Bitcoin price movement as input features in the forecasting model. In addition, several benchmark models are formulated for forecasting performance comparison.
"Experiment design" section provides a detailed description about the experimental design. "Empirical results" section presents the results and verifies that the proposed model is robust across different market conditions and forecasting horizons.

Data description
This study aims to forecast the price fluctuations in the Bitcoin market and conduct algorithmic trading based on the prediction results. The data used in this study consist of the historical daily Bitcoin market closing price time series obtained from Quandl (www.quandl.com). The raw data covers the period from April 29, 2013 to January 1, 2021, with a total of 2805 observations. A graphical representation of the data is given in Fig. 3. Table 1 presents the common descriptive statistics for the daily Bitcoin market price, together with the results of the augmented Dickey-Fuller (ADF) test. The ADF test fails to reject the null hypothesis of a unit root, which indicates that the data is non-stationary. Table 2 presents the macrofundamental factors that are used as inputs to our model. The data are divided into two sets: a training set and a testing set. The first 90% of the data are used to train the prediction model and the remaining 10% are used to evaluate the model performance. Overall, the training set consists of 2525 observations from April 29, 2013 to March 27, 2020. The testing set contains 280 observations from March 28, 2020 to January 1, 2021.
To eliminate the differences in variable dimensions, the data is adjusted and normalized using the 0-1 normalization as shown below:

$$x_t' = \frac{x_t - \min x_t}{\max x_t - \min x_t}$$

where $x_t'$ is the normalized value, $x_t$ denotes the true value of the time series at time $t$, and $\max x_t$ and $\min x_t$ are the maximum and minimum true values of the time series, respectively.
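As a small worked example of the 0-1 normalization, the following sketch scales a series to the unit interval and maps scaled values back to the original units. The function names and the four sample prices are hypothetical and serve only to illustrate the formula.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1]; also return the extremes needed to undo the scaling."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return [(x - lo) / (hi - lo) for x in series], lo, hi

def denormalize(scaled, lo, hi):
    """Map scaled values (e.g. model outputs) back to the original price units."""
    return [s * (hi - lo) + lo for s in scaled]

prices = [100.0, 150.0, 125.0, 200.0]   # hypothetical prices, not actual Bitcoin data
scaled, lo, hi = min_max_normalize(prices)
# scaled is [0.0, 0.5, 0.25, 1.0]
restored = denormalize(scaled, lo, hi)
```

Keeping `lo` and `hi` allows forecasts produced on the normalized scale to be converted back to prices before error measures are computed.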
In this paper, a sliding-window approach is adopted in the prediction process. The window length, $N$, represents the data lag-order utilized for the prediction model. For example, a window length of $N = 3$ means that the model takes the input data from time $t - 2$ to $t$ to forecast the daily market price at time $t + 1$. To determine the sliding-window length in this study, a grid search is conducted with the search range of [1, 100] using 10% of the training set as validation. Figure 4 shows the prediction results of the proposed VMD-LMH-BiLSTM model with the window length of $N = 1, 5, 10, 25, 50, 100$. It was found that when $N = 25$, the model yielded the best prediction performance. As a result, the sliding-window length for the model is set to $N = 25$.
The prediction model proposed in this study consists of five layers: an input layer, a forward hidden layer, a backward hidden layer, an output layer, and a fully connected layer. The input layers, hidden layers, and output layers are set to the same dimensions as those of the input data. The fully connected layer consists of one node, which corresponds to the predicted value. The model utilizes the Adam optimizer with a learning rate of 0.01, with tanh selected as the activation function. To ensure that the model does not overfit the training dataset, a rolling forecasting process with the rolling window set to 90 days is adopted, as shown in Fig. 5 (Yu 2019). In addition, multistep-ahead predictions are also generated to test and compare the robustness of the model.
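The sliding-window construction and the chronological split can be sketched as follows. This is an illustrative sketch rather than the study's code: the helper names are hypothetical, and taking the test size as the integer part of 10% of the sample is an assumption chosen because it matches the reported 2525/280 split.

```python
def make_windows(series, window):
    """Pair each run of `window` consecutive values with the value that follows it."""
    inputs, targets = [], []
    for end in range(window, len(series)):
        inputs.append(series[end - window:end])   # values at times t-window+1 .. t
        targets.append(series[end])               # value at time t+1
    return inputs, targets

def chronological_split(items, test_fraction=0.1):
    """Hold out the most recent observations for testing, without shuffling."""
    n_test = int(len(items) * test_fraction)
    return items[:len(items) - n_test], items[len(items) - n_test:]

# Window length N = 3 on a toy series
inputs, targets = make_windows([1, 2, 3, 4, 5], window=3)
# inputs is [[1, 2, 3], [2, 3, 4]] and targets is [4, 5]

# A series with as many observations as the sample described above
train, test = chronological_split(list(range(2805)))
```

With 2805 observations the split yields 2525 training and 280 testing points. Splitting by time rather than at random keeps future observations out of the training data.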

Fig. 5 Rolling Forecast Process
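The prediction accuracy of the models is measured by the mean squared error (MSE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE):

$$MSE = \frac{1}{N} \sum_{t=1}^{N} \left( \hat{x}_t - x_t \right)^2, \qquad RMSE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( \hat{x}_t - x_t \right)^2}$$

$$MAPE = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{\hat{x}_t - x_t}{x_t} \right|, \qquad MAE = \frac{1}{N} \sum_{t=1}^{N} \left| \hat{x}_t - x_t \right|$$

where $\hat{x}_t$ and $x_t$ $(t = 1, 2, \ldots, N)$ are the predicted value and the actual value at time $t$, and $N$ represents the total number of data points in the testing set. Moreover, directional accuracy (DA) is introduced to assess the market trend predictive ability of the model (Yu et al. 2008). The larger the DA, the better the market trend predictive ability of the model:

$$DA = \frac{1}{N} \sum_{t=1}^{N} a_t \times 100\%$$

where $a_t = 1$ if $(x_{t+1} - x_t)(\hat{x}_{t+1} - x_t) \geq 0$, and $a_t = 0$ otherwise.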
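The evaluation criteria can be computed with a few lines of Python. The sketch below uses the standard definitions of MSE, RMSE, MAPE, and MAE, and a common form of directional accuracy that compares the sign of the predicted move from the last actual value with the sign of the realized move. Returning MAPE and DA as fractions rather than percentages, and the sample numbers, are assumptions made for illustration.

```python
import math

def forecast_metrics(actual, predicted):
    """Standard point-forecast error measures over paired observations."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n   # as a fraction, not percent
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAPE": mape, "MAE": mae}

def directional_accuracy(actual, predicted):
    """Share of steps where the predicted and realized moves from actual[t] agree in sign."""
    hits = 0
    for t in range(len(actual) - 1):
        realized = actual[t + 1] - actual[t]
        forecast = predicted[t + 1] - actual[t]
        hits += realized * forecast >= 0
    return hits / (len(actual) - 1)

actual = [100.0, 110.0, 105.0, 120.0]      # hypothetical values
predicted = [101.0, 108.0, 112.0, 118.0]   # hypothetical forecasts
metrics = forecast_metrics(actual, predicted)
da = directional_accuracy(actual, predicted)
# metrics["MSE"] is 14.5, metrics["MAE"] is 3.0, and da is 2/3
```

In this toy example the third forecast predicts a rise where the price fell, so two of the three directional calls are correct even though the absolute errors are small.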

Benchmark models
The benchmarking procedure consists of two steps. First, five single benchmark models, namely autoregressive integrated moving average (ARIMA), linear regression (LR), SVR, LSTM, and bidirectional LSTM (BiLSTM), are developed for comparison with the predictive ability of the proposed VMD-LMH-BiLSTM model. These models use the original Bitcoin market price and relevant market factor time series as input features without data decomposition. By comparing the prediction performance of the proposed VMD-LMH-BiLSTM model with benchmark models built on the different forecasting techniques used in previous literature, such as traditional econometric models, machine-learning models, and deep-learning models, we can comprehensively assess the effectiveness of the signal decomposition technique in improving Bitcoin market forecasting performance.
Second, in order to assess the effectiveness of different decomposed inner factors in improving the forecasting performance of the model, hybrid one- and two-characteristic models are formulated by importing factors of different frequencies into the proposed model.
The different inner factors extracted through data decomposition are classified as low-frequency, medium-frequency, and high-frequency modes based on their respective periodicity. They are selected respectively as input features to construct the corresponding hybrid one-characteristic models: VMD-L-BiLSTM, VMD-M-BiLSTM, and VMD-H-BiLSTM. For hybrid two-characteristic models, modes from two different frequency groups are imported as input features, which results in three models: VMD-LM-BiLSTM, VMD-LH-BiLSTM, and VMD-MH-BiLSTM. For each benchmark model, the parameters and lag-order are kept consistent with those of the VMD-LMH-BiLSTM model.
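The family of hybrid models amounts to choosing subsets of the three frequency groups. The short sketch below enumerates the model names and selects the matching input modes; the helper names are hypothetical, and the mode labels in `modes_by_group` are placeholders for illustration.

```python
from itertools import combinations

GROUPS = "LMH"   # low-, medium-, and high-frequency groups

def hybrid_model_names(size):
    """Names of the hybrid models that use `size` of the three frequency groups."""
    return [f"VMD-{''.join(combo)}-BiLSTM" for combo in combinations(GROUPS, size)]

def select_modes(modes_by_group, groups):
    """Collect the decomposed modes belonging to the requested frequency groups."""
    return [mode for g in groups for mode in modes_by_group[g]]

one_characteristic = hybrid_model_names(1)
two_characteristic = hybrid_model_names(2)
full_model = hybrid_model_names(3)

# Placeholder mode labels for illustration
modes_by_group = {"L": ["M1"], "M": ["M2", "M3"], "H": ["M4", "M5", "M6"]}
mh_inputs = select_modes(modes_by_group, "MH")
```

Enumerating the subsets this way reproduces the three one-characteristic and three two-characteristic model names, with the full three-group combination corresponding to the proposed model.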

Data decomposition
After selecting relevant market features, the historical Bitcoin market price data is decomposed via VMD. Combined with the guidelines of Zhu et al. (2019) and our pre-experimented forecasting performances, we decompose the closing price of the asset into 11 sub-signal modes of various frequencies.
To investigate the characteristics of the decomposed modes, the fast Fourier transform is conducted to detect the main cyclic patterns within each mode by transforming the time series into frequency domains and identifying the maximum spectral density (Welch 1967;Wang et al. 2014). Each decomposed mode is labeled from M1 to M11, respectively, with M1 having the lowest frequency and M11 having the highest. The decomposed modes contain different inner factors hidden in the original signal that have various effects on the price movement in the Bitcoin market.
Based on the detected cyclicity as shown in Fig. 6, each mode is classified into one of three groups: low frequency, medium frequency, and high frequency (Zhu et al. 2019; Li et al. 2021). In terms of the Bitcoin market price, M1 has a cycle of approximately three years, which is significantly longer than those of the other decomposed modes. Thus, it is classified as the low-frequency mode, which captures the long-term trend in Bitcoin market prices from 2013 to 2019. Modes M2-M4 have cycles between three months and eight months and are considered the medium-frequency modes. The medium-frequency modes may represent price shocks brought on by economic and political events related to Bitcoin and other cryptocurrencies. Modes M5-M11 are regarded as high-frequency modes with relatively short cycles (4-21 days), which may reflect short-term fluctuations in the market, such as investor speculation.
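The cycle detection and grouping can be sketched with NumPy's FFT. This is an illustration under stated assumptions rather than the study's procedure: it reads the dominant period off a plain periodogram peak instead of a Welch estimate, and the cut-offs `high_max=30.0` and `medium_max=365.0` days are hypothetical thresholds chosen only to be compatible with the cycle ranges quoted above.

```python
import numpy as np

def dominant_period(mode, sampling_interval_days=1.0):
    """Period (in days) of the strongest spectral peak of a decomposed mode."""
    mode = np.asarray(mode, dtype=float)
    spectrum = np.abs(np.fft.rfft(mode - mode.mean())) ** 2
    freqs = np.fft.rfftfreq(mode.size, d=sampling_interval_days)
    peak = np.argmax(spectrum[1:]) + 1            # skip the zero-frequency bin
    return 1.0 / freqs[peak]

def frequency_group(period_days, high_max=30.0, medium_max=365.0):
    """Assign a mode to L, M, or H using assumed period thresholds."""
    if period_days <= high_max:
        return "H"
    if period_days <= medium_max:
        return "M"
    return "L"

# Synthetic modes with known periods (hypothetical data)
t = np.arange(2800)
fast = np.sin(2 * np.pi * t / 14)      # 14-day cycle
medium = np.sin(2 * np.pi * t / 140)   # 140-day cycle
slow = np.sin(2 * np.pi * t / 1400)    # 1400-day cycle
groups = [frequency_group(dominant_period(m)) for m in (fast, medium, slow)]
# groups is ["H", "M", "L"]
```

For noisy real modes an averaged spectral estimate would give a more stable peak than the single periodogram used here.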

Forecasting performance evaluation with single benchmark models
As can be seen from Table 3, the five single benchmark models and the proposed VMD-LMH-BiLSTM model displayed significantly different performances. Among the single models, the BiLSTM model performed better than the traditional LSTM model. Taking the one-step-ahead prediction as an example, the MSE, RMSE, MAPE, and MAE criteria decrease by 18.18%, 10.21%, 9.45%, and 8.94%, respectively. This indicates that the bidirectional structure is superior to the traditional monodirectional neural network structure. By adopting a bidirectional structure, the model is able to extract more information from the time series, thus yielding better performance. When comparing the proposed model with the single models, the proposed VMD-LMH-BiLSTM model displayed far superior fitting performance, performing better across all criteria (MSE, RMSE, MAPE, and MAE).
In terms of directional accuracy (DA), the one-step-ahead DA values for the LSTM, BiLSTM, ARIMA, LR, and SVR models are 52.3%, 53.9%, 54.3%, 52.7%, and 56.3%, respectively. The directional accuracies achieved by these single models are all below 60%, which indicates that, regardless of their predictive accuracy, they are unable to effectively predict the Bitcoin market trend. By comparison, the proposed VMD-LMH-BiLSTM model achieved a DA value of 81.7%, which demonstrates effective market trend predictability.
As presented in Table 3, the forecasting performance results show that, compared to the single benchmark models, the proposed decomposition-based hybrid model significantly improves Bitcoin market prediction accuracy and DA. The improvement can mainly be attributed to data decomposition, which effectively decomposes the complex historical Bitcoin market price time series into inner factors of different frequencies. These hidden inner factors can reveal the patterns and information that exist in the original time series, which enhances the model's predictive accuracy.

Forecasting performance evaluation with hybrid benchmark models
Although the decomposed inner factors can significantly enhance the market prediction accuracy of the model, they have different cycles that range from several days to several years. The differences in period lengths may indicate that for short-term price predictions in the Bitcoin market, not all decomposed factors contribute equally to the improved predictive ability of the model. Therefore, we further assess the effectiveness of different decomposed inner factors in improving the forecasting performance of the model by comparing the performance of the proposed VMD-LMH-BiLSTM model with that of the hybrid one- and two-characteristic models.
The proposed VMD-LMH-BiLSTM model and the six hybrid benchmark models are formulated and utilized to forecast the daily price of the Bitcoin market. The one-step prediction result comparisons and prediction accuracy results are presented in Table 4. According to the comparison results, the proposed VMD-LMH-BiLSTM model not only achieved the highest prediction accuracy (measured by MSE, RMSE, MAPE, and MAE) but also obtained the highest DA across different forecasting horizons (1-step, 2-step, and 3-step). It performed significantly better than the benchmark models, including the single models, the hybrid one-characteristic models, and the hybrid two-characteristic models. This superior performance shows that the proposed model effectively captures the different inner factors that exist in the Bitcoin market and thus significantly enhances the final prediction accuracy. Among the three hybrid one-characteristic models, the VMD-H-BiLSTM model with high-frequency inner factors achieves the best prediction accuracy. When medium-frequency modes are used, the VMD-M-BiLSTM experiences a considerable decline in its forecasting performance. Finally, when the low-frequency factors are used instead, the VMD-L-BiLSTM achieves the worst performance, with a significant increase across all error criteria (MSE, RMSE, MAPE, and MAE). These results indicate that the inner factors of different frequencies decomposed from the original Bitcoin market time series have various effects on the prediction model.
A similar pattern can be observed in the hybrid two-characteristic models. When the high-frequency modes are removed from the proposed VMD-LMH-BiLSTM model, the prediction performance of the resulting VMD-LM-BiLSTM model deteriorates significantly, with the RMSE, MAPE, and MAE increasing by 4.93, 5.08, and 5.19 times, respectively. When the medium-frequency inner factors are removed, the RMSE of the VMD-LH-BiLSTM model doubles from 0.007 to 0.014. By comparison, with the low-frequency modes removed, the VMD-MH-BiLSTM model achieved better prediction performance than the other two hybrid two-characteristic models and obtained the smallest MSE, RMSE, MAPE, and MAE values among them. These results further indicate that although the short-term Bitcoin price prediction performance is affected by all decomposed inner factors, factors of different frequencies have varying effects on the Bitcoin market prediction performance. In particular, the high-frequency modes contribute the most to improving the Bitcoin market price predictions in the proposed model, whereas the low-frequency inner factors have the least effect on improving prediction performance.
Examining the directional accuracies of the benchmark models, it is clear that the VMD-L-BiLSTM hybrid one-characteristic model obtained an accuracy below 60%, which indicates that it is unable to effectively predict the Bitcoin market trend. In addition, the hybrid two-characteristic VMD-MH-BiLSTM model achieved the highest DA out of all the benchmark models. This further shows that the low-frequency modes have trivial effects on short-term market movements, while high-frequency modes have essential effects.
The inner factors of different frequencies may each contain hidden information of varying economic significance (Wang et al. 2014). Specifically, the low-frequency inner factor approximately captures the long-term trend in Bitcoin market prices from 2013 to 2019. Since Bitcoin is recognized as an investment option and a trading commodity, its long-term trend may have been largely influenced by the economic cycle and reflect changes in the global economy (Dyhrberg 2016;Ji et al. 2019).
The medium-frequency modes, which have periods ranging from one to eight months, represent economic and political events related to Bitcoin and other cryptocurrencies.
These events, such as international regulatory policies, are important and influential factors on Bitcoin market price volatility over the medium-term. For example, the Bitcoin market prices crashed in early January, which corresponds to the time when multiple governments such as China, South Korea, and the United States announced tightened regulations on Bitcoin trading (Cumming et al. 2019). Over time, the medium-frequency component reverts to the mean, which indicates that the Bitcoin market has absorbed the influences of these events. As a result, the Bitcoin market price eventually returned to its long-term trend.
The high-frequency inner factors have the shortest periods of all three components, ranging from 4 to 21 days. These factors may reflect short-term fluctuations such as investor speculation in the market. These random disturbances exhibit mean-reversion characteristics with very short cycles, which indicates that their influences are quickly dissipated in the market and rarely sustained over time. However, these high-frequency inner factors may have relatively greater effects on short-term fluctuations in the Bitcoin market, which makes them more meaningful for short-term price forecasting.
In general, the proposed VMD-LMH-BiLSTM model displays higher prediction accuracy than all the benchmark models, including the single models, the hybrid one-characteristic models, and the hybrid two-characteristic models. This indicates that by decomposing the original Bitcoin market price time series into different inner factors, the model can effectively extract the hidden information that exists in the data and significantly improve the prediction performance. The proposed model also obtained the highest DA out of all the constructed models. This shows that the proposed method can effectively capture the Bitcoin market movement trend, making it a practical and promising technique for predicting the Bitcoin market price. Although the prediction performance of the proposed VMD-LMH-BiLSTM model is greatly enhanced by the combined effect of all the decomposed inner factors, the inner factors of different frequencies have varying effects on model prediction results. In particular, the high-frequency modes mostly contain the short-term random fluctuations that exist in the market. As a result, they contribute most to the improvement of short-term Bitcoin price prediction accuracy.

Robustness check
To further evaluate the prediction performance and verify the robustness of the proposed approach, we employ the following two-step approach. First, we construct benchmark models using various state-of-the-art forecasting techniques to conduct a horizontal performance comparison. The benchmarks cover four categories of approach (econometric, traditional machine-learning, deep-learning, and hybrid), represented by the ARIMA, LR, unidirectional LSTM, and EMD LSTM models, respectively, all of which have been used to predict the Bitcoin price in previous works (Altan et al. 2019;Wirawan et al. 2019;Cohen 2020). All four benchmark models utilize the same decomposition and sentiment input features as the proposed approach. Second, since the trading rule of Bitcoin is T + 0 plus 24 h, we also apply our proposed approach to higher frequency price data and compare its forecasting performance against the four benchmark models horizontally.
The horizontal comparison of daily logarithmic return forecasts between the proposed approach and the other benchmark forecasting approaches is shown in Table 5. The proposed approach significantly outperforms the other four benchmark forecasting models in terms of RMSE, MAPE and MAE across all three forecasting horizons. In addition, the proposed approach improves the DA of the forecast. Thus, the results show that our proposed approach effectively improves the market trend predictive ability of the model. Furthermore, since the trading rule of Bitcoin is T + 0 plus 24 h, we also apply our proposed approach to higher frequency Bitcoin price data to evaluate its performance. For this test, we use the minute trading Bitcoin price data from March 28, 2020 to January 1, 2021 as our testing interval. The entire dataset consists of 6048 observations. As with the daily frequency Bitcoin data, we use the first 90 percent of the data to train the prediction model and the remaining 10 percent to evaluate the model performance. For comparison, we use the four previously constructed state-of-the-art benchmark models, namely the ARIMA model, LR model, unidirectional LSTM model, and EMD LSTM model. To ensure consistency, the benchmark models utilize the same decomposition and sentiment input features as the proposed approach.
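The evaluation protocol described above (a chronological 90/10 split followed by RMSE, MAPE, MAE and DA on the held-out portion) can be illustrated with a minimal sketch. The function names are hypothetical, and the DA convention used here (the sign of the predicted move relative to the previous actual value is compared with the sign of the realized move) is an assumption rather than the exact definition used in this study.

```python
import math


def chronological_split(series, train_fraction=0.9):
    """Split a time-ordered sequence without shuffling (earlier part trains)."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction.

    Points with a zero actual value are skipped (an assumption of this sketch),
    because the percentage error is undefined there.
    """
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return sum(terms) / len(terms)


def directional_accuracy(actual, predicted):
    """Share of steps where predicted and realized moves have the same sign."""
    hits = 0
    for t in range(1, len(actual)):
        realized_move = actual[t] - actual[t - 1]
        predicted_move = predicted[t] - actual[t - 1]
        if realized_move * predicted_move > 0:
            hits += 1
    return hits / (len(actual) - 1)
```

Splitting by position rather than at random keeps every test observation later in time than every training observation, which avoids leaking future information into the fitted model.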
The horizontal comparison of logarithmic return forecasts between the proposed approach and the other benchmark forecasting approaches on the higher frequency data is shown in Table 6. When our proposed approach is applied to a higher frequency Bitcoin price dataset, it still significantly outperforms the other four benchmark forecasting models in terms of RMSE, MAPE, MAE and DA across all three forecasting horizons. As a result, the validity of our proposed approach is further verified. The results show that our proposed approach effectively improves the Bitcoin market predictive ability of the model. To further verify the superiority of our proposed approach, we conduct the Diebold-Mariano test between the proposed approach and all the benchmark models for one-, two-, and three-step-ahead forecasts during the testing period. The results tabulated in Tables 7 and 8 indicate that the performance of the proposed approach is significantly better than that of all the benchmark models for all forecasting steps during the testing period. Specifically, the outperformance is significant at the 1% significance level for the non-decomposition LSTM model, SVR model, LR model and ARIMA model; for the hybrid one- and two-characteristic decomposition models, the outperformance is significant at the 10% significance level.
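For readers unfamiliar with the Diebold-Mariano test, the following is a minimal sketch of the statistic under stated assumptions: squared-error loss, a long-run variance estimated from the first h - 1 autocovariances of the loss differential for an h-step-ahead forecast, and an asymptotic standard normal reference distribution. The function name and argument names are hypothetical, and the loss function and any small-sample correction used in this study may differ.

```python
import math


def diebold_mariano(actual, forecast_a, forecast_b, horizon=1):
    """Return (DM statistic, two-sided p-value) under squared-error loss.

    A negative statistic means forecast_a has the smaller average loss.
    """
    # Loss differential d_t = e_a(t)^2 - e_b(t)^2
    d = [(y - fa) ** 2 - (y - fb) ** 2
         for y, fa, fb in zip(actual, forecast_a, forecast_b)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        # Sample autocovariance of the loss differential at lag k
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Long-run variance with a rectangular window of horizon - 1 lags
    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")

    stat = d_bar / math.sqrt(long_run_var / n)
    # Two-sided p-value from the standard normal CDF
    normal_cdf = 0.5 * (1 + math.erf(abs(stat) / math.sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return stat, p_value
```

Passing the proposed model's forecasts as `forecast_a` and a benchmark's forecasts as `forecast_b`, a significantly negative statistic would indicate that the proposed model has the lower expected loss.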

Trading results comparisons
To determine the practicality of using the proposed model as a decision support tool in real-world Bitcoin trading, algorithmic trading is conducted based on the predicted Bitcoin market price. The forecasting model produces a buy signal if the predicted Bitcoin market price for the next day is higher than the current Bitcoin market price. If the predicted price for the next day is lower than the current price, the forecasting model generates a sell signal. Otherwise, the model produces a hold signal. The model then uses the generated signals to conduct algorithmic trading. Specifically, if a buy signal is generated, all the available capital is used to purchase Bitcoin at that specific time. On the other hand, if a sell signal is generated, all the purchased Bitcoins are sold at that specific time. In this study, the initial investment capital is set to $100,000. To be potentially useful as a trading decision support system, the forecasting model must have a DA higher than 50%, the level that could be obtained by chance. In this paper, we set the DA threshold at 60%, which means the prediction model must have a DA above 60% to be considered able to capture the market trends effectively. As shown in Tables 3 and 4, the proposed VMD-LMH-BiLSTM model achieved a DA of 81.7%, which is higher than that of all the benchmark models. In addition, since the DAs of the single models (LSTM, BiLSTM, ARIMA, LR, SVR) and the VMD-L-BiLSTM model are all below 60%, they are considered incapable of predicting the market trends and thus are not used in the trading comparisons.
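The trading rule described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the sketch assumes no transaction costs, fractional Bitcoin purchases, execution at the same price used to form the signal, and a final position valued at the last observed price. None of these details is specified in the text.

```python
def generate_signal(price_today, predicted_price_tomorrow):
    """Map a next-day price forecast to a buy, sell or hold signal."""
    if predicted_price_tomorrow > price_today:
        return "buy"
    if predicted_price_tomorrow < price_today:
        return "sell"
    return "hold"


def simulate_trading(prices, predicted_next_prices, initial_capital=100_000.0):
    """Run the all-in / all-out strategy and return the final portfolio value.

    prices[t] is the price on day t; predicted_next_prices[t] is the forecast
    for day t + 1 made on day t.
    """
    cash = initial_capital
    coins = 0.0
    for price, forecast in zip(prices, predicted_next_prices):
        signal = generate_signal(price, forecast)
        if signal == "buy" and cash > 0:
            # Use all available capital to purchase Bitcoin
            coins = cash / price
            cash = 0.0
        elif signal == "sell" and coins > 0:
            # Sell all purchased Bitcoin
            cash = coins * price
            coins = 0.0
        # "hold", or a signal that cannot be acted on, leaves the position unchanged
    # Value any remaining Bitcoin at the last observed price (assumption)
    return cash + coins * prices[-1]
```

A repeated buy signal while fully invested, or a repeated sell signal while holding only cash, has no effect in this sketch, which matches the all-or-nothing position sizing described in the text.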
Assuming that the efficient market hypothesis holds, it is impossible to consistently generate superior trading strategies in comparison to the market (Fama 1970). Thus, the buy-and-hold strategy is included as a baseline against which to compare the trading performance of the proposed prediction model. Under the buy-and-hold strategy, Bitcoin is purchased on the first day of the trading interval and sold on the last day. In this paper, the annualized return (AR) is used as the performance measure to compare the trading strategies, which is calculated as follows:
$$AR = \frac{\text{Total Capital}}{\text{Initial Capital}} \tag{17}$$
The out-of-sample testing interval consists of 280 trading days from March 28, 2020 to January 1, 2021. As illustrated in Fig. 3, the Bitcoin market price experienced significant fluctuations during the testing interval. The out-of-sample testing period is split into two different intervals: the "Up" interval consisting of 93 days from March 28, 2020 to July 11, 2020, and the "Down" interval consisting of 159 days from July 12, 2020 to January 1, 2021. Figure 7 and Tables 7 and 8 present the annualized returns obtained by all trading strategies in the out-of-sample "Up" interval, "Down" interval, as well as the overall interval. The results clearly demonstrate that the proposed VMD-LMH-BiLSTM prediction model outperforms the naïve buy-and-hold strategy, as well as the other benchmark models, across all testing periods. Specifically, during the "Up" interval where the Bitcoin market price soared quickly, an investor trading on the Bitcoin market using signals generated by the proposed VMD-LMH-BiLSTM model achieves an annualized return of 383.06% after 93 trading days. This is a 58.67% increase compared to the buy-and-hold strategy. During the "Down" interval where the Bitcoin market experienced significant losses, the proposed model is able to withstand the negative market impacts and consistently generate profits.
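The comparison against buy-and-hold can be illustrated with a short sketch. Equation (17) defines AR from the ratio of total capital to initial capital; how that ratio is scaled to a yearly figure is not spelled out in the surrounding text, so the sketch below assumes geometric compounding over a 365-day year (Bitcoin trades every calendar day). Both the convention and the function names are assumptions and may not match the computation used in this study.

```python
def annualized_return(total_capital, initial_capital, days_held, days_per_year=365):
    """Annualize a holding-period return by geometric compounding (assumed convention)."""
    growth = total_capital / initial_capital
    return growth ** (days_per_year / days_held) - 1.0


def buy_and_hold_value(prices, initial_capital=100_000.0):
    """Final value when buying on the first day and selling on the last day."""
    return initial_capital / prices[0] * prices[-1]
```

Under this convention, a strategy that turns $100,000 into $110,000 over exactly 365 days has an annualized return of 10%, and the same gain over a shorter interval annualizes to a larger figure.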
The trading results above clearly show that the proposed VMD-LMH-BiLSTM model is able to generate accurate buy and sell signals based on the predicted Bitcoin price. More importantly, the proposed model is able to reduce the negative impacts of bear market conditions and steadily generate profits.
Overall, the superior trading performance displayed by the proposed VMD-LMH-BiLSTM model indicates it is not only an effective prediction model, but also a potentially useful trading support system.

Conclusion
Although Bitcoin has attracted significant attention from investors and policy makers, empirical work on Bitcoin price forecasting models is at an early stage. This paper fills the gap by proposing the VMD-LMH-BiLSTM model, a novel, bidirectional, deep-learning method combined with data decomposition techniques, to forecast the Bitcoin market price. The prediction performance of the proposed model is assessed against several benchmark models, including the single models, the hybrid one-characteristic models, and the hybrid two-characteristic models. In our study, by decomposing the original Bitcoin price time series into different inner factors, the proposed model is able to effectively capture the hidden patterns of different frequencies that exist in the time series. In addition, by adopting a bidirectional neural network structure, the proposed model is able to effectively capture the two-way, sequential relationship within the time series.
According to our empirical results, the proposed VMD-LMH-BiLSTM model outperformed all the benchmark models in terms of prediction accuracy across multiple forecasting periods. Moreover, the proposed model also displays superior trading performance when compared to other benchmarks such as the buy-and-hold strategy.
In particular, our model shows strong consistency in generating profits under volatile market conditions. It effectively reduces the negative impacts of bear markets, which is especially important for avoiding losses in the highly volatile Bitcoin market. Overall, the superior performances demonstrated by the proposed VMD-LMH-BiLSTM model, in terms of prediction accuracy and trading results, indicate that it is not only an effective prediction model but also a potentially useful trading support system.
In addition, this study investigates the effects of different decomposed frequency modes on the prediction performance of the model. The results show that the inner factors of different frequencies have varying effects on model prediction results. In particular, the high-frequency modes contain mostly the short-term random fluctuations that exist in the market. As a result, they contribute most to the improvement of short-term Bitcoin price prediction accuracy. By investigating the effects of different decomposed frequency modes on the prediction result, this study further reveals the potential factors that affect Bitcoin market movement.
To conclude, this paper extends the Bitcoin literature by serving as a first attempt toward developing a reliable forecasting and trading decision support system using a novel data decomposition-based hybrid bidirectional deep-learning method.
Nevertheless, this study has some limitations. Features from external financial environments should be exploited to investigate their effects on the Bitcoin price prediction performance of the proposed model. Moreover, future attempts should be made to develop a more user-friendly decision support system for investors.