1 Introduction

The Shanghai–Hong Kong (SH-HK) Stock Connect program which started on November 17, 2014 links Shanghai and Hong Kong stock exchanges by enabling investors in one market to trade shares on the other market through their local brokers and clearing houses. Then, on December 5, 2016, the Shenzhen–Hong Kong (SZ-HK) Stock Connect program was formally launched, establishing an interconnection mechanism between the Shenzhen and Hong Kong stock markets. Although the opening of SH-HK Stock Connect and SZ-HK Stock Connect programs is conducive to promoting the development of the stock market and economic progress, it also brings some new challenges. Yang et al. (2020) and Zhangbo and Rihong (2021) confirmed the opening of the Stock Connect programs increases the stock market risk and brings about market instability. Besides, some studies suggest that the asymmetric characteristics of risk spillovers between the two markets are considerable both before and after the Stock Connect programs (Bai and Chow 2017; Huo and Ahmed 2017; Cao and Zhou 2019; Ma et al. 2019; Yang et al. 2020). Therefore, it is important to study the financial risk under the background of the Stock Connect program.

Note that, financial risk has sustained detrimental impacts on the economy via depressed economic growth and rising unemployment (Guiso 2012; Rewilak 2018). In particular, financial risk in one country may spread to other countries. Therefore, a wide range of studies have been conducted by the researchers on early warning system of financial risk. For example, Candelon et al. (2014) identified currency risk periods using the KLR modified pressure index, and predicted currency risk using dynamic logit model. Dawood et al. (2017) identified debt risk on the basis of expert opinions and employed binary logit model to predict risk. Antunes et al. (2018) evaluated bank risk on the basis of expert judgments and anticipated banking risk using dynamic panel probit models. Wang et al. (2020) assessed stock market risk using switching ARCH and used LSTM network to produce daily predictions that alert investors to market upheaval. Fu et al. (2020) identified a stock market risk by the CMAX ratio and predicted stock market risk via logit models. At present, the majority of research concentrates on three distinct categories of financial risk: currency risk (Berg and Pattillo 1999a, b; Kumar et al. 2003; Bussiere and Fratzscher 2006; Sevim et al. 2014; Candelon et al. 2014), bank risk (Barrell et al. 2010; Nitschka 2011; Chaudron and de Haan 2014; Lang and Schmidt 2016; Geršl and Jašová 2018; Antunes et al. 2018), and debt risk (Ciarlone and Trebeschi 2005; Fuertes and Kalotychou 2006; Manasse and Roubini 2009; Dawood et al. 2017; Liu et al. 2021). It is worth noting that, except above three types of financial risk, stock market risk which is produced by asset values deviating from their regular levels also has a significant influence on the economic development. Therefore, in this paper, the research on stock market risk early warning systems is conducted.

As we all know, two issues in the stock market risk early warning system must be handled sensitively, that is the recognition of stock market risk and the construction of prediction mechanism. At present, stock market risk is generally classified into two groups: extreme risk and non-extreme risk (e.g., Wang et al. 2020; Fu et al. 2020; Lu et al. 2021). However, multi-class classification is more common and important since the real-world instances may belong to multiple labels. In addition, extensive studies have established that Value at Risk (VaR) could measure stock market risk (Yang et al. 2019, 2020, 2021). Specially, VaR can divide the stock market risk into multiple categories (Chen et al. 2014). To the best of knowledge, there are relatively few research on using VaR to measure stock market risk and dividing the stock market risk into multiple categories.

In the construction of prediction mechanism, more and more researchers tended to improve the ability to predict finance crisis by utilizing advanced machine learning models. Previously, Multilayer perceptron (MLP) is one of the most popular models. Tsai (2014) combined MLP and other classifier ensembles to predict four different types of bankruptcy. Ozturk et al. (2016) explored the prediction performance of several machine learning models such as MLP in predicting sovereign credit ratings. Note that the inputs of MLP are considered to be independent of each other, however there is a sequential relationship between the financial time series data. Recurrent Neural Network (RNN) and its variant LSTM and GRU are able to overcome this shortage by using the internal memory units, and has been introduced for risk warning. Altan et al. (2019) proposed a novel hybrid forecasting model based on LSTM model for digital currency time series. Ouyang et al. (2021) developed LSTM model optimized by attention to study the systemic risk early warning of China. To impressively improve the performance of machine learning models including the classification accuracy and the generalization ability, ensemble learning methods have been generally approved as useful tools. Du et al. (2020), utilized XGBoost to predict financial distress and obtained better results. Moreover, there are also a few studies based on reinforcement learning models (Catullo et al. 2015). Until now, machine learning models are very popular, but it is difficult to determine which model can achieve the best results.

The indicators selection is another very important work in the construction of prediction mechanism. Consequently, a lot of research works have been made on the selection of indicators in the early warning system of stock market risk. For example, Żbikowski (2015) selected several technical indicators as inputs and achieved better rate of return and maximum drawdown. Long et al. (2019) took stock market basic indicators including open price, close price, highest price, lowest price, volume and amount as inputs and obtained satisfactory results. Wang et al. (2020) further considered macroeconomic indicators on the basis of basic indicators to improve the accuracy of predictions. Fu et al. (2020) took investor sentiment indicators into account to further improve the forecasting performance. Huang et al. (2020) took indicators including basic indicators, technical indicators and overseas return rate indicators as inputs, and found that overseas return rate indicators have significant effect on predictions. Lu et al. (2021) considered the basic indicators of stock market and the macroeconomic indicators simultaneously and obtained better predictions. However, few scholars simultaneously consider basic indicators, technical indicators, overseas return rate indicators, and macroeconomic indicators to design a stock market risk early warning system. Specially, the opening of SH-HK Stock Connect and SZ-HK Stock Connect programs has accelerated the integration of China’s stock market. Therefore, under the background of the Stock Connect programs, considering the indicators of Shanghai, Hong Kong and Shenzhen stock markets may further improve the accuracy of risk warnings.

To fill in the gaps discussed above, this paper develops a China’s stock market risk early warning system. The findings of this study have three major contributions. First, we use VaR to identify China’s stock market risk in which stock market risk is divided into four levels, namely high risk, medium risk, low risk, and lowest risk. Second, we build a comprehensive indicator system to improve the forecasting performance, in which basic indicators, technical indicators, overseas return rate indicators, and macroeconomic indicators are considered simultaneously. Finally, we analyze the validity of the indicators and the impact of the Stock Connect programs on China’s stock market risk early warning system.

Fig. 1
figure 1

Framework for the proposed early warning system

The rest of this paper is organized in this way. Section 2 provides data and develops a China’s stock market risk early warning system. Section 3 contains the experimental findings and discussion. Finally, the conclusion is established in Sect. 4.

2 Data and methodology

This research proposes an early warning system to forecast the risk associated with China’s stock market. Figure 1 depicts the process structure of China’s stock market risk early warning system. Firstly, the value at risk method is used to classify the stock market risk into high risk, medium risk, low risk, and lowest risk. Secondly, some characteristic indicators, i.e., stock market indicators, overseas return rate indicators and macroeconomic indicators, are selected to build a comprehensive indicator system. Finally, the early warning model is constructed to predict the stock market risk.

2.1 Data

Considering that the Shanghai–Hong Kong Shenzhen 500 Index (SHS500 index) covering stocks in Shanghai Stock Market, Shenzhen Stock Market and Hang Seng Stock Market and has a good market representation, we take the SHS500 index as the research object. Additionally, in order to cover entire rise and fall of China’s stock market, we collect the data of SHS500 index from January 4, 2005, to December 28, 2020. The distribution characteristics of SHS500 index close price are shown in Table 1.

Table 1 Distribution characteristics of SHS500 index close price

The realized return at the specified time t is calculated as \(R_{t} = ln(P_{t}/P_{t-1})\) where \(P_{t}\) is the closing price at the end of day t. Figure 2 presents the daily returns of SHS500 index. It can be seen that the periods of the huge daily return volatility are 2008, 2015 to 2016 and 2020 respectively, which are consistent with the real situation of financial markets. Table 2 presents the descriptive statistics of SHS500 index return series. The kurtosis is more than 3, suggesting that we cannot assume that the return distribution is normally distributed. In this paper, multiple tests are used to verify the statistical properties of SHS500 index as follows: (1) The Jarque–Bera test is utilized to test whether the sample data corresponds to normal distribution. The results show that at the 1% significance level, the returns on the SHS500 index reject the null hypothesis of normal distribution. This indicates that the SHS500 index returns follow the normal distribution. (2) The Ljung–Box Q test is used, which is a statistical test of serial autocorrelations. The results show that serial autocorrelations appear in the returns of the SHS500 index. (3) The Lagrange multiplier test is used to test volatility clustering, which verifies that there is vital evidence of volatility clustering in the SHS500 index returns. (4) The stationarity test is implemented based on the augmented Dickey–Fuller test. The results show that the SHS500 index returns are stable, indicating that they may be used in further econometric study.

Fig. 2
figure 2

Daily returns of the SHS500 index

Table 2 Descriptive statistics of SHS500 index return series

2.2 Measuring stock market risk by VaR

VaR is a widely used risk metric that refers to the greatest loss that an asset or portfolio may sustain over a specified length of time under market volatility and a certain confidence level (Rockafellar et al. 2000). The traditional definition of VaR is a reference to the downside VaR which is denoted by \(V_{t}\mathrm{(down)}\). For a given time series of returns \(Y_{t}\), at the confidence level of \((100-\alpha )\%\), the downside VaR is written as:

$$\begin{aligned} P(Y_{t}<-V_{t}\mathrm{(down)}|I_{t-1})=\alpha , \end{aligned}$$
(1)

where \(I_{t-1}={Y_{t-1}, Y_{t-2},...}\) is the information set available at time \(t-1\). Mathematically, the downside VaR is the left \(\alpha -quantile\) of the returns.

In accordance with the descriptive statistics contained in Table 2, the characteristics of volatility clustering, fat tails, skewness, and non-normality can be seen in the SHS500 index daily return distributions. GARCH models will be employed in this paper in order to capture the clustering of volatility occurrences. It is commonly established that the discrepancies of GARCH-t(1,1) model between the probability levels and the sample coverages are very large (Predescu et al. 2011; Orhan and Köksal 2012). Therefore, the GARCH-t(1,1) model are used in this paper. In addition, VaR on 3 quantiles (0.1, 0.3, 0.5) is selected as the critical value of the warning interval. The stock market risk are defined as Eq. (2). The stock market risk is divided into four levels: high risk, medium risk, low risk, and lowest risk, which is shown in Fig. 3.

$$\begin{aligned} Risk_{t}=\left\{ \begin{array}{rcl} 0,&{} &{} {VaR_{t}\mathrm{(down)}>{VaR_{0.5}},}\\ 1,&{} &{} {VaR_{0.3}<{VaR_{t}\mathrm{(down)}}\le {VaR_{0.5}},}\\ 2,&{} &{} {VaR_{0.1}<{VaR_{t}\mathrm{(down)}}\le {VaR_{0.3}},}\\ 3,&{} &{} {VaR_{t}\mathrm{(down)}\le {VaR_{0.1}}.} \end{array}\right. \end{aligned}$$
(2)
Fig. 3
figure 3

Risk level based on VaR model

Table 3 Stock market critical events

As risk classification is a highly subjective subject that is highly dependent on an individual’s perception of risk, we verify reasonability of the stock market risk classification method from two aspects: statistical test and comparison with stock market risk levels and actual states. By statistical test, the K-S test statistics reject the original hypothesis at the level of 1%, indicating that VaR can significantly distinguish different risk level. Table 3 make an intuitive attempt to determine whether the risk classifier’s stock market risk level matches critical events in China’s stock market. Figure 4 judges whether the log return is accordant with the stock market risk level obtained by the risk classifier. During the period 2005 to 2020, there are three significant occurrences that have caused turbulence on the stock market—the 2008 global financial crisis that produced the SSEC index to fall by 65%, the 2015 and 2016 Chinese stock market wobbles that led to a 30% drop in the value of A-shares in Shanghai stock exchange, and the 2020 COVID-19 that caused a negative impact in China’s stock market. According to Table 3, China’s stock market risk measured by VaR are consistent with the stock market critical events, with the exception of one rare occurrence during the indicated instability phase in 2011–2012, which could not be explained by the chronology. In Fig. 4, the log return and risk level plots both emphasize turbulent occurrences recognized by the risk classifier. From the above discussions, it is concluded that the stock market risk classification method using VaR is effective.

Fig. 4
figure 4

Return (upper panel) and the corresponding risk level (lower panel)

2.3 Variable selection

In the previous studies (Żbikowski 2015; Long et al. 2019; Huang et al. 2020), only basic indicators, technical indicators, and overseas return rate indicators are simultaneously considered. However, there is a large volume of published studies describing that the macroeconomic indicators play a major role in the stock market (Huang et al. 2005; Pilinkus et al. 2010; Wang et al. 2020; Zhou and Li 2019; Lu et al. 2021). Therefore, except for the above-mentioned three indicators, the macroeconomic indicators are also considered, such as the Gross Domestic Product (GDP), M1, M2, Fixed asset, Consumer Confidence Index (CCI), Consumer Price Index (CPI), and Foreign Exchange Rates. Specially, besides the basic indicators of SHS500 index, we also consider the basic indicators of SSEC, SZCZ, and HSI. Among them, macroeconomic indicators are collected from Choice database (choice.eastmoney.com), which are seen on a monthly basis. The indicators except macroeconomic indicators are gathered from the Wind database (www.wind.com.cn), which are seen on a daily basis. As in Huang et al. (2020), conventional indicators are established, which includes basic indicators of SHS500 index, technical indicators of SHS500 index, and overseas return rate indicators (see Table 4). The other indicators, including macroeconomic indicators, and basic indicators of SSEC, SZCZ, and HSI, are presented in Table 5. It should be emphasized that we use the T test and the K-S test to choose the variables that can demonstrate a statistically significant difference between the lowest risk, low risk, medium risk, and high risk groups of participants. We get final variables indicated by * in Tables 4 and 5.

Table 4 Conventional indicators
Table 5 Other indicators

2.4 Building prediction model

In order to avoid the randomness of predictions obtained by single model, we use four machine learning models to predict China’s stock market risk, including LSTM, GRU, MLP, and XGBoost. Specifically, LSTM and GRU are two improved versions of RNN, which can handle the time series data better. MLP is a traditional ANN model with multilayers. XGBoost is the most widely used ensemble learning algorithm, which excels in solving complex nonlinear relational problems. More details are given as follows.

2.4.1 LSTM

LSTM was recommend by Hochreiter and Schmidhuber (1997). LSTM is a type of recurrent neural network (RNN) that is special. By meticulously structuring the gate structure, it overcomes the gradient disappearance and explosion problems associated with conventional RNNs and can efficiently learn long-term reliance. Therefore, in dealing with the prediction and classification of time series, LSTM with memory function shows a strong advantage.

First, the information removed from the loop body is determined by the forgetting gate \(f_{t}\). \(\sigma \) is the Sigmoid function. The current input vector is denoted by the symbol \(x_{t}\). \(h_{t-1}\) represents the hidden vector that came before it. Weight and bias of the forgetting gate \(f_{t}\) are represented by the variables \(W_{f}\) and \(b_{f}\). Second, the information transmitted in the loop body is determined by the input gate \(i_{t}\). Weight and bias of the input gate \(i_{t}\) are represented by the variables \(W_{i}\) and \(b_{i}\). The candidate state \(\widetilde{C_{t}}\) is calculated by the current input \(x_{t}\) and the preceding concealed state \(h_{t-1}\), which are both positive integers. The current state of LSTM loop body is expressed as \(C_{t}\). After calculating the new state \(C_{t}\), the LSTM loop body generates the output of the current time step. The process is completed through the output gate \(o_{t}\). It is decided by the output gate \(o_{t}\) and the most recent LSTM cycle body state \(C_{t}\) that the financial output of the hidden layer of the current time step \(h_{t}\) will be generated.

$$\begin{aligned} f_{t}= & {} \sigma (W_{f}\cdot [h_{t-1}, x_{t}] + b_{f}), \end{aligned}$$
(3)
$$\begin{aligned} i_{t}= & {} \sigma (W_{i}\cdot [h_{t-1}, x_{t}] + b_{i}), \end{aligned}$$
(4)
$$\begin{aligned} \widetilde{C_{t}}= & {} tanh(W_{C}\cdot [h_{t-1}, x_{t}] + b_{C}), \end{aligned}$$
(5)
$$\begin{aligned} C_{t}= & {} f_{t}C_{t-1}+i_{t}\widetilde{C_{t}}, \end{aligned}$$
(6)
$$\begin{aligned} o_{t}= & {} \sigma (W_{o}\cdot [h_{t-1}, x_{t}] + b_{o}), \end{aligned}$$
(7)
$$\begin{aligned} h_{t}= & {} o_{t}tanh(C_{t}). \end{aligned}$$
(8)

2.4.2 GRU

GRU (Zhao et al. 2018) is a variant based on LSTM structure. There are two gates in the unit structure, which are referred to as the update gate and the reset gate. The update gate is applied to limit the influence of previous time’s state information on the current time. The more state information that is reserved, the larger the update gate becomes. The reset gate is used to disregard a portion of the state information that was received at the prior time. The more state information that is ignored, the smaller the reset gate becomes.

It is worth noting that at time t, the GRU hidden layer state \(h_{t}\) is a linear interpolation between the prior state \(h_{t-1}\) and the candidate state \(\widetilde{h_{t}}\). \(x_{t}\) is the current input vector. Update gate \(z_{t}\) can be considered as a combination of forgetting gate and input gate in LSTM. It is determined by the update gate \(z_{t}\) how much state information from the previous moment can be transferred into the present moment. The closer \(z_{t}\) is to one, the more information from the preceding moment is utilized by the current state of affairs. The candidate state is represented by \(\widetilde{h_{t}}\). The reset gate is represented by the symbol \(r_{t}\). \(\odot \) denotes the Hadamard product. Increasing the distance between \(r_{t}\) and 0 decreases the proportion of output state that was in the prior time. Therefore, a typical RNN will be generated as a result of setting the reset gate value to 1 and updating it with the value of 0 in the update gate setting.

$$\begin{aligned} z_{t}= & {} \sigma (W_{z}\cdot [h_{t-1}, x_{t}]), \end{aligned}$$
(9)
$$\begin{aligned} r_{t}= & {} \sigma (W_{r}\cdot [h_{t-1}, x_{t}]), \end{aligned}$$
(10)
$$\begin{aligned} \widetilde{h_{t}}= & {} \textrm{tan}h(W\cdot [r_{t}\odot {h_{t-1}}, x_{t}]), \end{aligned}$$
(11)
$$\begin{aligned} h_{t}= & {} (1-z_{t})\odot {h_{t-1}}+z_{t}\odot {\widetilde{h_{t}}}. \end{aligned}$$
(12)

2.4.3 MLP

MLP (Tang et al. 2015) includes input layer, hidden layer and output layer. MLP’s layers are fully linked, which indicates that each neuron in the upper layer is linked to every neuron in the lower layer. The simplest MLP is a three-layer structure.

In MLP, the input layer is located at the bottom of the hierarchy, the hidden layer is located in the middle, and the output layer is located at the very top. To demonstrate the idea of MLP, we will use the simplest possible example. The simplest MLP procedure may be broken down into two parts. The first stage involves the transfer of information from the input layer to the hidden layer, and the second step involves the transfer of information from the hidden layer to the output layer. To begin, we suppose that the vector of the input layer is \(x_{t}\), and the output of the hidden layer is \(X_{1}\). In the output of the hidden layer, the weight and bias of the first step are represented by the variables \(W_{1}\) and \(b_{1}\), respectively. Sigmoid function, tanh function, and ReLU function are all examples of functions f that are often employed. We may then deduce, by transferring information from the hidden to the output layers, that the output of the output layer is \(X_{2}\). The weight and bias of the second step are represented by the variables \(W_{2}\) and \(b_{2}\) in the output of the output layer, respectively. It is possible to think of the process from hidden to output layers as a multi category logical regression, which is why the softmax function is used in this case.

$$\begin{aligned} X_{1}= & {} f(W_{1}x_{t}+b_{1}), \end{aligned}$$
(13)
$$\begin{aligned} X_{2}= & {} \textrm{softmax}(W_{2}X_{1}+b_{2}). \end{aligned}$$
(14)

2.4.4 XGBoost

XGBoost (Chen and Guestrin 2016) is a boosting technique that is based on trees. Compared with the traditional gradient lifting decision tree algorithm,XGBoost algorithm innovatively uses the second derivative information of the loss function, which makes xgboost algorithm converge faster, ensures higher solution efficiency, and increases the scalability. Another advantage of XGBoost algorithm is that it uses the column sampling method of random forest algorithm for reference, and further reduces the amount of calculation and over fitting. At present, the reason why XGBoost algorithm is widely used is not only that the trained model has good performance, fast speed, and can carry out some large-scale calculation of data, but also that it is capable of solving the classification problem and dealing effectively with the classification difficulty.

The principle of XGBoost algorithm is as follows. Suppose there is a data set D.

$$\begin{aligned} D = {(x_{t},y_{t})} (t = 1, 2,\ldots , n), \end{aligned}$$
(15)

\(x_{t}\) is the attribute set of the t sample, \(y_{t}\) is the class of the t sample. There will be \(\widetilde{y_{l}}\) which means predictive results for the l tree.

$$\begin{aligned} \widetilde{y_{l}} = \sum _{k=1}^Kf_k(x_{t}), \end{aligned}$$
(16)

\(f_k(x_{t})\) is the forecast results of the k tree. Then there will be loss and \(\Omega (f_{k})\).

$$\begin{aligned} \textrm{loss}= & {} \sum _{i}l(\widetilde{y_{l}},y_{t})+\sum _{k}\Omega (f_{k}), \end{aligned}$$
(17)
$$\begin{aligned} \Omega (f_{k})= & {} T +\frac{1}{2}\lambda \Vert \omega \Vert ^{2}. \end{aligned}$$
(18)

\(\widetilde{y_{l}}\) is the predictive value of model. \(y_{t}\) is the actual value of sample. The number of trees is represented by the symbol K. The model of the k tree is denoted by the symbol \(f_{k}\). The number of leaf nodes in the tree is represented by the variable T. The score at each leaf node is denoted by the symbol \(\omega \). The hyperparameter is denoted by the symbol \(\lambda \).

From the expression of XGBoost algorithm, it can be seen that the model is a set of iterative residual trees. Every iteration will add one tree, and each tree will learn the residual of the previous \((n-1)\) trees. Finally, the model is formed by the linear combination of K trees.

Table 6 Confusion matrix for stock market risk early warning system

3 Empirical study

This section conducts experiments to evaluate the performance of our stock market risk early warning system. The hyperparameters of the machine learning models are optimized based on the grid scheduling optimization algorithm. In Sect. 3.1, we present the performance criteria for model evaluation. In Sect. 3.2, we test the validity of the indicators. In Sect. 3.3, we assess the impact of the Stock Connect programs on China’s stock market risk early warning system.

3.1 Evaluation metrics

In this paper, we use accuracy, recall, precision, F1, and the receiver operating curve (ROC) to evaluate the predictive performance of model. Accuracy is the proportion of properly identified samples to the overall sample count, reflecting the overall performance of each model. Recall and precision are two metrics that indicate a model’s capacity to identify the specific stock market risk level. In general, the greater the accuracy, recall, and precision of model, the better the model’s overall performance. The F1 value represents the harmonic mean of precision and recall. The ROC curve plots the FPR (x-axis) versus the TPR (y-axis) for each model and is considered to be one of the most traditional performance measurements. ROC can reflect the ability of models to identify the stock market risk under a certain threshold. To provide a quantitative representation of the pictorial data included in the ROC, the Area Under Curve (AUC) statistic determines the entire area covered by the ROC. The bigger the AUC value, the more accurate the model. Table 6 provides the confusion matrix, and the evaluation metrics are defined as follows:

$$\begin{aligned} \textrm{Accuracy}= & {} \frac{\sum \limits _{i=0}\limits ^{3}{N_{ii}}}{\sum \limits _{i=0}\limits ^{3}\sum \limits _{j=0}\limits ^{3}{N_{ij}}}, \end{aligned}$$
(19)
$$\begin{aligned} \textrm{Recall}_{i}= & {} \frac{TP_{i}}{TP_{i}+FN_{i}}, i=0,1,2,3, \end{aligned}$$
(20)
$$\begin{aligned} \textrm{Precision}_{i}= & {} \frac{TP_{i}}{TP_{i}+FP_{i}}, i=0,1,2,3, \end{aligned}$$
(21)
$$\begin{aligned} F1_{i}= & {} 2\frac{\textrm{Precision}_{i}*\textrm{Recall}_{i}}{\textrm{Precision}_{i}+\textrm{Recall}_{i}}, i=0,1,2,3, \end{aligned}$$
(22)
$$\begin{aligned} TPR_{i}= & {} \frac{TP_{i}}{TP_{i}+FN_{i}}, i=0,1,2,3, \end{aligned}$$
(23)
$$\begin{aligned} FPR_{i}= & {} \frac{FP_{i}}{FP_{i}+TN_{i}}, i=0,1,2,3, \end{aligned}$$
(24)
Table 7 Performance of methods with and without the other indicators
Fig. 5
figure 5

ROC curves without and with the other indicators

Fig. 6
figure 6

Aggregated relative importance

where \(TP_{i}\) corresponds to the true prediction that the stock market risk level is i, \(TN_{i}\) represents the true prediction that the stock market risk level is not i, \(FP_{i}\) amounts to the false prediction that the stock market risk level is i, and \(FN_{i}\) is the true prediction that the stock market risk level is not i. \(TPR_{i}\), which indicators that the stock market risk level is i, is referred to as the ratio of correctly anticipated stock market risk signals to the total number of actually occurring stock market risk signals. , which indicators that the stock market risk level is i, is regarded as the ratio of incorrectly anticipated stock market risk signals to the total number of actually occurring stock market risk signals.

3.2 Analysis of the validity of the indicators

In this part, we compare the predictions of different models, and examine the validity of the indicators. We build two predication methods respectively based on data sets with and without the other indicators. Accuracy, recall, precision, specifically, F1 of two prediction methods are given in Table 7. If the performance of the model improves after adding other indicators, the results are marked on bold.

Table 8 Training time comparison (in seconds)

Table 7 shows that different models perform differently on different evaluation indicators. For example, when other indicators are used, MLP performs best on accuracy, but performs worst on Recall_3. When other indicators are used, LSTM performs best on accuracy, but performs worse than GRU on some other evaluation indicators. Additionally, these models have different ability to identify different categories. In the most cases, the models are better at identifying the lowest and high risk than the low and medium risk.

To investigate the validity of the indicators, we compare the predictions obtained by two different prediction methods. As Table 7 suggests, for LSTM, GRU, MLP, and XGBoost models, the prediction accuracies increase by 8.18% after adding the macroeconomic indicators and the basic indicators of SSEC, SZCZ, and HSI. What is more, recall, precision, and F1 of each classification have improved to a certain extent after adding the other indicators in most cases. The four models demonstrate an outstanding forecasting power after adding the other indicators. Figure 5 further presents the test-set ROC curves before and after adding the other indicators. In particular, Fig. 5a, c, e, g shows the ROC curves and AUC values using the conventional indicators, and Fig. 5b, d, f, h shows the ROC curves and AUC values using both the conventional indicators and the other indicators. It is obvious that the AUC values increase after adding the other indicators. All results indicate the other indicators are beneficial to predict China’s stock market risk.

In order to measure the computational complexity of the proposed early warning system, we count each model’s running time under using both the conventional indicators and the other indicators (see Table 8). The numerical results illustrate that the running time of XGBoost is faster than the running time of other models.

To further discuss the validity of the indicators, taking XGBoost model as an example, the Tree SHAP method (Ribeiro et al. 2016; Meng et al. 2021) is used. Tree SHAP calculates the contribution of each indicator to the model output (Tree SHAP values). After that, the contributions of the indicators are sorted on the basis of the mean (|Tree SHAP|) over all samples. In this paper, the contributions of the top 13 indicators before and after adding the other indicators are shown in Fig. 6. Purple represents the contribution to predicting the lowest risk of China’s stock market, green indicates the contribution to predicting the low risk of China’s stock market, pink represents the contribution to predicting the medium risk of China’s stock market, and blue represents the contribution to predicting the high risk of China’s stock market.

In Fig. 6, the x-axis represents the average magnitude change in prediction outcomes following the removal of an indicator from the model. The indicators are arranged in descending order based on the absolute total of the effect magnitudes on the model. From Fig. 6a, we find that OBV_H30455 and MACD_H30455 have higher importance in the predictions of China’s stock market risk. From Fig. 6b, we can see that MACD_H30455 and CPI_CN are more important to predictions. Specially, it can be observed that among the top 13 feature indicators in the importance ranking of indicators, except MACD_H30455 and OBV_H30455, other 11 indicators belong to the other indicators. All above results reflect that the other indicators are valid. Our findings are consistent with the expected results. The reason may be that: (1) There is mutual influence between macroeconomic indicators and stock markets, which could able to improve the accuracy of predictions. (2) The linkage of the Shanghai–Shenzhen–Hong Kong Stock Market has increased after the opening of the Shanghai–Hong Kong Stock Connect and the Shenzhen–Hong Kong Stock Connect. Therefore, basic indicators of SSEC, SZCZ, and HSI are conducive to predict China’s stock market risk. To sum up, it is necessary to consider the other indicators, namely the macroeconomic indicators and the basic indicators of SSEC, SZCZ, and HSI, to forecast the China’s stock market risk.

3.3 Impact of the stock connect programs

In this part, we divide the sample into three time periods to examine the influence of the Stock Connect programs on China’s stock market risk early warning system according to the launches of SH-HK Stock Connect program on November 17, 2014 and SZ-HK Connect program on December 5, 2016. To facilitate understanding, we refer to the time preceding the SH-HK Stock Connect program as period 1, the period between the SH-HK Stock Connect program and the SZ-HK Stock Connect program as period 2, and the period following the SZ-HK Stock Connect program as period 3. Then, using three samples, we construct three prediction models, each of which is then evaluated. The evaluation results are summarized in Table 9 and the ROC curves are shown in Figs. 7, 8, 9, 10.

Table 9 Performance of model in three periods
Fig. 7
figure 7

ROC curves by LSTM in three periods

Fig. 8
figure 8

ROC curves by GRU in three periods

Fig. 9
figure 9

ROC curves by MLP in three periods

Fig. 10
figure 10

ROC curves by XGBoost in three periods

It is observed that:

(1) The forecasting performance in period 2 is superior to that in period 1 with accuracy, recall, precision, F1, and ROC curves in most cases. For example, when the LSTM model is used to predict China’s stock market risk, the accuracy ranges from 88.4 to 89.1%, the recall rate of high risk ranges from 83.1 to 94.3%, the precision of high risk ranges from 90.7 to 100%, the F1 score of high risk ranges from 86.7 to 97.1%, the AUC value of high risk ranges from 0.99 to 1. It might be that following the implementation of the SH-HK Stock Connect program, the return spillover impact from Shanghai to Hong Kong is both swifter and more powerful than the return spillover effect from Hong Kong to Shanghai, which leads some indicators in mainland China begin to have a positive impact on the China’s stock market risk early warning system. Therefore, the forecasting performance has improved after the HS-HK Stock Connect program.

(2) The forecasting performance in period 3 is inferior to that in period 2 in most cases. For example, when the LSTM model is used to predict China’s stock market risk, the accuracy ranges from 89.1 to 85.4%, the recall rate of medium risk ranges from 78 to 71.8%, the precision of high risk ranges from 100 to 87.5%, the F1 score of high risk ranges from 97.1 to 93.3%, the AUC value of medium risk ranges from 0.98 to 0.95. It might be that the communication between mainland China stock markets and Hong Kong stock market has changed from a one-way transmission structure into a two-way transmission structure after the SZ-HK Stock Connect program, which leads more and more foreign investments participate in the mainland China stock markets. Thereby the forecasting performance has reduced after the SZ-HK Stock Connect program.

To further interpret the effect of the Stock Connect programs on China’s stock market risk early warning system, we rank the indicators’s contributions in three periods. The contributions of the top 13 indicator are shown in Fig. 11.

Fig. 11
figure 11

Aggregated relative importance in three periods

By comparing Fig. 11a, b, it can be found that the importance of China’s macroeconomic indicators improve after the SH-HK Stock Connect program. It might be that the SH-HK Stock Connect program makes a contribution to increase the importance of the mainland China stock market. Thus, the indicators related to mainland China dominated those related to Hong Kong after the SH-HK Stock Connect program. Comparing Fig. 11b, c, we find that the importance ranking of the indicators related to Hong Kong increase to a certain extent. It might be that after the SZ-HK Stock Connect program, the communication between mainland China stock market and Hong Kong stock market has changed from a one-way transmission structure into a two-way transmission structure. Therefore, the indicators related to Hong Kong are becoming more and more important after the SH-HK Stock Connect Program.

4 Conclusions

In this study, an early warning system is used to anticipate the risk of China’s stock market in the context of the Stock Connect programs, which are now in place. In the proposed system, we first employ VaR to define stock market risk in which stock market risk is divided into multiple categories. Then, we construct indicators using the conventional indicators and other indicators which includes macroeconomic indicators and basic indicators of SSEC, SZCZ, and HSI. After then, we select four machine learning models to predict China’s stock market risk. Lastly, we verify the validity of indicators and the impact of the Stock Connect programs on predicting China’s stock market risk. The main results and contributions of this study are summarized as follows:

  1. (1)

    Macroeconomic indicators and basic indicators of SSEC, SZCZ, and HSI have an important influence on predicting China’s stock market risk, and the performance of stock market risk early warning system has been significantly improved when the conventional indicators and the other indicators are both considered.

  2. (2)

    Macroeconomic indicators have higher importance in predicting China’s stock market risk.

  3. (3)

    The prediction models have better performance in predicting China’s stock market risk after the SH-HK Stock Connect program, while the predictive performance decreased after the SZ-HK Stock Connect program.

  4. (4)

    The indicators related to Hong Kong become very important after the SZ-HK Stock Connect program.