An attention-based LSTM model for long-term runoff forecasting and factor recognition

With advances in artificial intelligence, machine learning-based models such as long short-term memory (LSTM) models have shown much promise in forecasting long-term runoff by mapping pathways between large-scale climate patterns and catchment runoff responses without considering physical processes. The recognition of key input factors strongly affects the performance of such models. However, there is no consensus on which recognition algorithm is the most suitable. To address this issue, an LSTM model combined with two attention mechanisms, one in the input layer and one in the hidden layer, namely AT-LSTM, is proposed for long-term runoff forecasting at Yichang and Pingshan stations in China. The added attention mechanisms automatically assign weights to 130 climate phenomenon indexes, avoiding the use of subjectively chosen recognition algorithms. Results show that the AT-LSTM model outperforms the Pearson’s correlation-based LSTM model in terms of four evaluation metrics for monthly runoff forecasting. Further, an indirect runoff prediction method is set up for comparison; it verifies that the AT-LSTM model also performs effectively in precipitation and potential evapotranspiration forecasting, and shows that indirect runoff prediction is inferior to using the AT-LSTM model to establish a direct link between climate factors and runoff. Finally, four key factors related to runoff are identified by the attention mechanism and their impacts on runoff are analyzed on intra- and inter-annual scales. The proposed AT-LSTM model can effectively improve the accuracy of long-term forecasting and identify the dynamic influence of input factors.


Introduction
Accurate runoff forecasting, especially long-term forecasting, plays an important role in water resources management (Yang et al 2018, Tan et al 2018, Fang et al 2019). However, runoff has spatiotemporal variability and high uncertainty owing to anthropogenic forcing of the land and atmospheric processes (Milly et al 2015, Deb et al 2019, Xie et al 2021). It has become increasingly difficult to accurately capture the dynamic processes of long-term runoff time series. Long-term runoff forecasting models developed in the past few decades can be broadly divided into two categories: physically based models and data-based models. Physically based models attempt to simulate the complex and nonlinear physical hydrological process (e.g. General Circulation Model, National Centers for Environmental Prediction) to forecast climate variables (typically precipitation and temperature), and the streamflow is then calculated through the projection of climate variables using a rainfall-runoff model (Arnell and Gosling 2016, Leng et al 2016). However, the simulated climate variables, especially precipitation, usually have some bias at the regional scale (Langenbrunner and Neelin 2013, Mehran et al 2014), which limits the widespread use of the method.
Thus, current long-term runoff forecasting is mostly based on data-based teleconnection approaches, with the establishment of a statistical model representing the relationship between large-scale climate patterns and catchment runoff. As an example, Chiew and McMahon (2002) presented an overview of global El Niño-Southern Oscillation (ENSO)-streamflow teleconnection and suggested that the ENSO-streamflow relationship can be used to successfully forecast streamflow. Zubair (2003) demonstrated the viability of using ENSO-based predictors for predictions of the January-to-September or April-to-September streamflow of the Kelani River. Zhang et al (2007) and Jiang et al (2006) found that ENSO episodes are in good teleconnection with floods and droughts in the Yangtze catchment.
The key is that the statistical model is representative of the link between the runoff and the identified large-scale climate patterns. The use of machine learning (ML)-based statistical models, such as methods adopting an artificial neural network (Humphrey et al 2016), support vector machine (Huang et al 2014), adaptive neuro-fuzzy inference system (Ashrafi et al 2017), or long short-term memory (LSTM) neural network (Yuan et al 2018, Xu et al 2021), for long-term hydrological forecasting has received attention because of their good performance. ML models require high-quality data before the forecasting models are established, and it is thus necessary to adopt recognition algorithms to identify key factors from a massive number of factors related to prediction. As an example, global sensitivity analysis was adopted to calculate sensitivity for up to 24 factors that affect runoff in the Nenjiang River Basin (Li et al 2012). Liao et al (2020) used the maximal information coefficient to select elements and then used gradient-boosting regression trees for forecasting. The correlation coefficients of the input variables and runoff were estimated by cross-correlation analysis, and the streamflow series were then regressed by an artificial neural network and support vector regression (Wang et al 2020b). The quality of the recognition algorithm largely determines the performance of the forecasting models. However, the choice of recognition algorithm is subjective and there is no consensus on which algorithm is the best. Most of these screening algorithms are based on linear correlation or on refinements of it, whereas the long-term hydrological process is dynamic and highly nonlinear. Using only linearly related elements for prediction will lead to unstable or biased forecast results.
The proper approach is to consider as many factors as possible, but the performance of ML models decreases when the input data are mixed with a large amount of invalid, inaccurate, or low-quality data (Bi et al 2020). Moreover, the weights identified by linear methods are static, which is inconsistent with the fact that the factors affecting runoff change dynamically from one period to another.
The attention mechanism has recently become a hot topic in neural network research. The attention mechanism is similar to the human selective visual attention mechanism, which focuses on target information within the global information and suppresses other, useless information (Bahdanau et al 2014). Mnih et al (2014) first used the attention mechanism with the recurrent neural network (RNN) model for image classification, and Bahdanau et al (2014) then applied the attention mechanism in the field of natural language processing, addressing entailment (Rocktäschel et al 2015), sentence summarization (Rush et al 2015), and reading comprehension (Hermann et al 2015). Furthermore, the weights extracted by the attention mechanism can be used to analyze which input elements are more relevant to the object time series, avoiding the drawbacks of ML as a black-box model. However, the attention mechanism has seldom been applied in the field of hydrology, especially in long-term runoff forecasting.
The present paper develops an LSTM model combined with two attention mechanisms for monthly runoff prediction at Yichang and Pingshan stations. We (a) automatically recognize key factors instead of using a subjective recognition algorithm, (b) verify the performance of the model by comparing the model with traditional models, and (c) analyze the relationship between the identified large-scale climate patterns and runoff.
The remainder of the paper is organized as follows. Relevant information about the study area and climate phenomenon data is presented in section 2. The attention mechanism and proposed model are described in detail in section 3. Results and a discussion are presented in section 4. Conclusions are drawn from the results of the study in section 5.

Study area
The study area (see figure 1) is the upper reaches of China's Yangtze River. The Yichang station and Pingshan station were selected for the present model application. The former is the control station of the upper Yangtze River and the latter is the control station of the Jinsha River. The monthly runoff series spanned a period from January 1961 to December 2009, covering 588 months, and the dataset was divided into a calibration dataset (1961-1997) and a validation dataset (1998-2009). The precipitation data (P) were recorded by 73 hydrological stations in the upper reaches of the Yangtze River and converted into areal rainfall adopting the Thiessen polygon method (Thiessen and Alfred 1911). Similarly, rainfall data from 27 weather stations in the Jinsha River Basin controlled by the Pingshan station were converted to areal rainfall adopting the Thiessen polygon method. The potential evapotranspiration (PET) data were from the Climatic Research Unit grid point dataset. The timing of the calibration period and validation period was consistent with that of the runoff data. More statistical information about the input data is summarized in section S1 of supplementary information.

Climate phenomenon indexes
The monthly climate phenomenon index dataset was obtained from the National Climate Center of the China Meteorological Administration. The dataset includes 88 indexes of atmospheric circulation, 26 indexes of the sea surface temperature, and 16 other indexes. Indexes with a missing ratio of more than 30% were excluded, and missing values in indexes with a missing ratio of less than 30% were filled by linear interpolation between adjacent data. The period of the dataset was from 1961 to 2009, which is consistent with the runoff data.
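The screening and gap-filling step described above can be sketched as follows. This is a minimal NumPy illustration rather than the authors' code: the function name `screen_and_fill` is hypothetical, an index with exactly 30% missing values is assumed to be kept (the text does not specify this boundary case), and gaps at the start or end of a series are assumed to be held at the nearest valid value.

```python
import numpy as np

def screen_and_fill(indexes, max_missing=0.30):
    """Drop indexes with too many gaps and linearly interpolate the rest.

    indexes: array of shape (T, M), one column per climate index,
             with NaN marking missing months.
    Returns (filled, kept), where `kept` holds the retained column numbers.
    """
    indexes = np.asarray(indexes, dtype=float)
    missing_ratio = np.isnan(indexes).mean(axis=0)
    # Assumption: a column at exactly the threshold is retained.
    kept = np.flatnonzero(missing_ratio <= max_missing)
    filled = indexes[:, kept].copy()
    t = np.arange(filled.shape[0])
    for j in range(filled.shape[1]):
        col = filled[:, j]
        ok = ~np.isnan(col)
        # np.interp interpolates linearly between valid neighbours and
        # holds the first/last valid value constant beyond the ends.
        filled[:, j] = np.interp(t, t[ok], col[ok])
    return filled, kept

# Tiny hypothetical example: 4 months, 3 indexes.
demo = np.array([[1.0, np.nan, 5.0],
                 [np.nan, np.nan, 6.0],
                 [3.0, np.nan, np.nan],
                 [4.0, 2.0, 8.0]])
filled, kept = screen_and_fill(demo)
print(kept)    # column 1 (75% missing) is dropped
print(filled)
```

In this toy input the middle column is excluded and the single gaps in the other two columns are filled from their neighbours.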

Methodology
In this section, an LSTM model combined with two attention mechanisms (AT-LSTM) is described in detail. The AT-LSTM model is compared in runoff prediction with the plain LSTM model and with an LSTM model that uses a conventional recognition algorithm. Furthermore, an indirect runoff prediction comparison method is established to verify the practicability of the model in forecasting regional climate variables (PET and P), as shown in figure 2.

LSTM model and attention mechanism
The LSTM model is a special kind of RNN capable of learning long-term dependencies (Hochreiter and Schmidhuber 1997, Gers et al 2000). The repeating module of the LSTM model contains four interacting layers instead of only a single neural network layer like the traditional RNN model, achieving better prediction results than the RNN model in the simulation of long-sequence data. Although the LSTM model overcomes the difficulty of learning long-term dependencies, it requires recognition algorithms in advance to identify key input factors and provide high-quality data. The attention mechanism can help handle this issue. In deep neural networks, attention mechanisms add a degree of interpretability to the internal representations of the network by selectively focusing on specific information, which allows features from different time steps to be extracted.

AT-LSTM model
In this study, two attention mechanisms are added to the LSTM model to obtain the AT-LSTM model for monthly runoff forecasting using climate phenomenon indexes. The model structure is shown in figure 3. Different from the traditional attention mechanism that is added only to the hidden layer, the attention mechanism here is added to both the input layer and hidden layer of the LSTM model. The attention mechanism added to the hidden layer is used to dynamically allocate weights and thus avoid degradation of the model performance as the time step increases. In the input layer, for each multi-dimensional climate input $x_{n,m}$ (where $n$ and $m$ respectively represent the length of the time step and the input size, with the time step being the lag time and the input size depending on the type of climate data), the attention mechanism assigns weights to different types of climate data so that key climate data can be recognized on the basis of the weight value. Detailed information on the AT-LSTM model structure and parameter settings is summarized in sections S2 and S3 of supplementary information.
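To illustrate the idea of attention in both the input layer and the hidden layer, the following NumPy sketch runs one forward pass of an LSTM with a softmax weighting over input features and a second softmax weighting over hidden states. It is not the authors' architecture, whose details are in the supplementary information: the scoring functions, the parameter names (`W_in`, `W_att`, `v_att`, and so on), the lag of 12 months, the hidden size of 16, and the random untrained weights are all assumptions made for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(m, hidden, rng, scale=0.1):
    """Random (untrained) parameters; names are hypothetical."""
    return {
        "W_in": rng.normal(0, scale, (m, m)), "b_in": np.zeros(m),
        "W": rng.normal(0, scale, (4 * hidden, m + hidden)),
        "b": np.zeros(4 * hidden),
        "W_att": rng.normal(0, scale, (hidden, hidden)),
        "v_att": rng.normal(0, scale, hidden),
        "w_out": rng.normal(0, scale, hidden), "b_out": 0.0,
    }

def at_lstm_forward(x, p):
    """x: (n, m) array, n lagged time steps of m climate indexes."""
    n, m = x.shape
    hidden = p["w_out"].shape[0]
    # Input attention: one weight per index at each time step.
    alpha = softmax(x @ p["W_in"].T + p["b_in"], axis=1)    # (n, m)
    x_weighted = alpha * x
    h, c = np.zeros(hidden), np.zeros(hidden)
    states = []
    for t in range(n):
        z = p["W"] @ np.concatenate([x_weighted[t], h]) + p["b"]
        i, f, o, g = np.split(z, 4)           # the four interacting layers
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    H = np.stack(states)                                    # (n, hidden)
    # Hidden-layer attention: one weight per time step.
    beta = softmax(np.tanh(H @ p["W_att"].T) @ p["v_att"])  # (n,)
    context = beta @ H
    y = float(p["w_out"] @ context + p["b_out"])
    return y, alpha, beta

rng = np.random.default_rng(0)
x = rng.normal(size=(12, 130))       # hypothetical: 12-month lag, 130 indexes
params = init_params(130, 16, rng)
y, alpha, beta = at_lstm_forward(x, params)
print(alpha.shape, beta.shape)
```

In a sketch of this kind, `alpha` plays the role of the per-index weights that are later inspected to rank climate factors, while `beta` weights the contribution of each lagged time step.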

Direct runoff prediction method
The direct runoff prediction method means that the statistical model directly establishes the relationship between climate phenomenon indexes and runoff to predict runoff. In this study, in addition to the proposed AT-LSTM model as a statistical model, two comparison models are established to verify the performance of the AT-LSTM model. One is the single LSTM model without a recognition algorithm. The other is the LSTM model combined with a traditional recognition algorithm based on Pearson correlation coefficients (PCCs-LSTM model). First, the Pearson correlation

Indirect runoff prediction method
In addition to the direct establishment of the relationship between climate phenomenon indexes and runoff, it has been suggested that there is a strong association between regional climate variables (e.g. precipitation) and large-scale climate patterns (Cai et al 2011, Gao et al 2021). Therefore, an indirect prediction comparison method is established. First, the statistical models (LSTM model and AT-LSTM model) predict the regional rainfall and PET using the same climate phenomenon indexes, and the predicted climate variables are then input into the rainfall-runoff models to simulate runoff. The rainfall-runoff models here are a traditional physical model (i.e. the two-parameter monthly water balance (TWB) model) and a statistical model (i.e. the LSTM model). The calibration and validation periods are the same as for the direct runoff forecasting model. The flow chart of the application of the AT-LSTM model in the direct and indirect models is shown in figure 4.
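To make the second stage of the indirect chain concrete, the sketch below routes forecast precipitation and PET through a two-parameter monthly water balance model. The source does not give the TWB equations, so the formulation used here (an evapotranspiration coefficient `c`, a storage capacity `sc`, and tanh-based evapotranspiration and runoff terms) is the form in which the model is commonly written and should be read as an assumption. The input series and parameter values are hypothetical.

```python
import numpy as np

def twb_runoff(precip, pet, c, sc, s0=0.0):
    """Two-parameter monthly water balance model (assumed formulation).

    precip, pet: monthly precipitation and potential evapotranspiration (mm)
    c:  evapotranspiration coefficient (dimensionless)
    sc: catchment storage capacity (mm)
    s0: initial soil water storage (mm)
    Returns simulated monthly runoff (mm).
    """
    precip = np.asarray(precip, dtype=float)
    pet = np.asarray(pet, dtype=float)
    q = np.empty_like(precip)
    s = s0
    for t in range(precip.size):
        # Actual evapotranspiration, limited by available precipitation.
        e = c * pet[t] * np.tanh(precip[t] / pet[t]) if pet[t] > 0 else 0.0
        # Water available for runoff after evapotranspiration.
        water = max(s + precip[t] - e, 0.0)
        q[t] = water * np.tanh(water / sc)
        s = water - q[t]              # storage carried to the next month
    return q

# Hypothetical forecasts from the first stage (e.g. an LSTM or AT-LSTM model).
precip_hat = [20.0, 60.0, 150.0, 200.0, 90.0, 30.0]
pet_hat = [30.0, 50.0, 80.0, 100.0, 70.0, 40.0]
q = twb_runoff(precip_hat, pet_hat, c=0.9, sc=800.0)
print(np.round(q, 2))
```

Because the runoff stage consumes the output of the forecasting stage, any error in the predicted precipitation or PET propagates into the simulated runoff.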

Evaluation metrics
Four mathematical metrics are selected to quantify the performance of the model in predicting runoff, i.e. the Nash-Sutcliffe efficiency (NSE) (Krause et al 2005), bias (Gupta et al 1999), Pearson's correlation coefficient (R), and mean absolute relative error (MARE), which are computed using equations (1)-(4), where $N$ is the length of the runoff time series, $Q_s^i$ and $Q_o^i$ respectively represent the simulated and observed runoff series, and $\bar{Q}_s$ and $\bar{Q}_o$ are respectively the means of the simulated and observed runoff.
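For reference, these four metrics are commonly written as below, using the symbols defined above. This is a sketch of the standard definitions rather than a reproduction of the authors' equations (1)-(4); in particular, expressing bias and MARE as percentages is an assumption based on the percentage values reported in the results.

```latex
\begin{align}
\mathrm{NSE}  &= 1 - \frac{\sum_{i=1}^{N} \left(Q_o^i - Q_s^i\right)^2}
                          {\sum_{i=1}^{N} \left(Q_o^i - \bar{Q}_o\right)^2} \\
\mathrm{Bias} &= \frac{\sum_{i=1}^{N} \left(Q_s^i - Q_o^i\right)}
                      {\sum_{i=1}^{N} Q_o^i} \times 100\% \\
R             &= \frac{\sum_{i=1}^{N} \left(Q_s^i - \bar{Q}_s\right)\left(Q_o^i - \bar{Q}_o\right)}
                      {\sqrt{\sum_{i=1}^{N} \left(Q_s^i - \bar{Q}_s\right)^2}\,
                       \sqrt{\sum_{i=1}^{N} \left(Q_o^i - \bar{Q}_o\right)^2}} \\
\mathrm{MARE} &= \frac{1}{N} \sum_{i=1}^{N} \frac{\left|Q_s^i - Q_o^i\right|}{Q_o^i} \times 100\%
\end{align}
```

An NSE or R of 1 and a bias or MARE of 0 indicate a perfect match between simulated and observed runoff.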
The root-mean-square error (RMSE) and coefficient of determination ($R^2$) are selected to quantify the performance of the model in predicting precipitation and PET, as follows:
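The evaluation metrics named in this section can be computed in a few lines. The following is a minimal NumPy sketch, not the authors' code: the function names are hypothetical, bias and MARE are assumed to be expressed as percentages, and $R^2$ is assumed to be the squared Pearson correlation, since the source equations are not shown.

```python
import numpy as np

def runoff_metrics(sim, obs):
    """NSE, bias (%), Pearson R and MARE (%) for a runoff series."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    nse = 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)
    bias = 100.0 * np.sum(sim - obs) / np.sum(obs)
    r = np.corrcoef(sim, obs)[0, 1]
    mare = 100.0 * np.mean(np.abs(sim - obs) / obs)   # requires obs > 0
    return {"NSE": nse, "Bias": bias, "R": r, "MARE": mare}

def climate_metrics(sim, obs):
    """RMSE and R2 for precipitation or PET series."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    rmse = np.sqrt(np.mean((sim - obs) ** 2))
    # Assumption: R2 taken as the squared Pearson correlation.
    r2 = np.corrcoef(sim, obs)[0, 1] ** 2
    return {"RMSE": rmse, "R2": r2}

# Hypothetical observed and simulated values.
obs = [100.0, 200.0, 300.0, 400.0]
sim = [110.0, 190.0, 310.0, 390.0]
rm = runoff_metrics(sim, obs)
cm = climate_metrics(sim, obs)
print(rm)
print(cm)
```

Because MARE divides by the observed value, it is only meaningful for strictly positive series, which holds for monthly runoff at the two stations.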

Performance of the AT-LSTM model
Tables 1 and 2 present the performances of the LSTM, PCCs-LSTM, and AT-LSTM models in monthly runoff forecasting with a lead time of 1 month during the training and validation periods. For the two stations, the performance of the AT-LSTM model is better than the performances of the LSTM and PCCs-LSTM models in terms of all four evaluation criteria in both the training and validation periods. Compared with the PCCs-LSTM model, the AT-LSTM model improves the NSE by 0.019 and R by 0.011 at the Yichang station. The benefit of using the AT-LSTM model is even greater for the Pingshan station. The AT-LSTM model has a bias of 2.08% and −0.22% at Yichang and Pingshan stations respectively, which is the smallest bias among the models. These results indicate that the attention mechanism increases the forecasting precision compared with a conventional recognition algorithm like PCCs. This may be due to the complexity of the factors affecting long-term runoff; i.e. conventional analysis of correlation between a single factor and a complex sequence is not representative (Wang et al 2020a).
The AT-LSTM model also outperforms the LSTM model, though the benefit is less pronounced than in the comparison with the PCCs-LSTM model. At the Yichang (Pingshan) station, the NSE and R are improved by 0.018 (0.031) and 0.008 (0.018), and the MARE drops by 0.95% (1.57%). This demonstrates that the added attention mechanisms effectively screen the key climate factors affecting runoff and improve the model performance. ML-based hydrological models often simulate high flows poorly because high flows occur infrequently, even though high flows are important in practical applications (Wu et al 2009).
Compared with the LSTM and PCCs-LSTM models, the AT-LSTM model also achieves better forecasts of high flows. The MARE of the ten highest monthly runoff values during the validation period decreases from 19.69% (20.23%) with the LSTM model and 14.98% (18.54%) with the PCCs-LSTM model to 12.01% (15.22%) with the AT-LSTM model at the Yichang (Pingshan) station. The improvement is larger relative to the LSTM model, for which the MARE decreases by 7.68 and 5.01 percentage points at the Yichang and Pingshan stations, respectively. Results presented in figures 5 and 6 also illustrate this point. The scatter points of the LSTM and PCCs-LSTM models are widely dispersed, especially near the high-flow points on the upper right, indicating much lower forecasting precision.
Table 3, table 4, figure 7 and figure 8 show that the AT-LSTM model also performs well in forecasting regional climate variables. At the Yichang (Pingshan) station, the AT-LSTM model improves $R^2$ by 0.009 (0.020), and the RMSE drops by 0.87 (1.28) in rainfall forecasting. The benefit in forecasting PET is less pronounced, where the AT-LSTM model improves $R^2$ by 0.006 (0.011), and the RMSE drops by 0.35 (0.58) at the Yichang (Pingshan) station.

Comparisons of runoff forecasting performance with the indirect prediction model
The forecasted rainfall and PET are then input into the rainfall-runoff model to simulate runoff. It is obvious that more accurate climate variables lead to more accurate runoff predictions. Although good results are obtained in rainfall and PET prediction, and these results are even better than the results of runoff prediction, the indirect runoff prediction is inferior to the prediction by the AT-LSTM model, which establishes a direct link between large-scale climate patterns and runoff. A possible explanation is that errors in the climate variable prediction model and the rainfall-runoff model accumulate, such that the prediction is not as good as that of a model that establishes a direct link. At the same time, the results show that the direct prediction of climate variables by the AT-LSTM model is better than the prediction of runoff. This may be due to a stronger teleconnection between regional climate variables and large-scale climate patterns. Compared with rainfall and PET, the runoff series fluctuates more and is affected by changes in the underlying surface and human activity, making highly accurate prediction more difficult.

Weight analysis of climate phenomenon indexes
In addition to improving the prediction accuracy, another significant advantage of the proposed AT-LSTM model is that the identified weights are time-varying, while the weights identified by other methods are mostly fixed. In fact, the influence of climate variables on runoff changes dynamically from one period to another, which is better reflected by the proposed AT-LSTM model. Taking the Yichang station as an example, the sum of the weights of the top four indexes exceeds 99%. The East Asian Trough Intensity Index (EATI) accounts for 53.51%, the Tibet Plateau Region 2 Index (TPR-2) accounts for 23.56%, the Northern Hemisphere Polar Vortex Central Intensity Index (NHPVCI) accounts for 15.98%, and the Atlantic Subtropical High Area Index (ASHAI) accounts for 6.13% of the weighting. Figure 9 compares variations of weights for the four climate factors with runoff. Overall, the weights of the four factors change over time but show significant cyclical changes between flood and non-flood seasons and among wet, normal, and dry years. The weights of the EATI and ASHAI have trends opposite to that of the runoff. For example, in the flood season, the weights of the EATI and ASHAI fall close to zero, but in the non-flood season, especially in January and February with the lowest runoff, the weights become largest. In contrast, TPR-2 and the NHPVCI have the same change trend as runoff. During the flood season, their weights both increase from close to zero in the non-flood season to a maximum.
Furthermore, in order to analyze the relationship between the four climate variables and runoff on intra- and inter-annual scales, the average monthly and annual weights of the four climate variables were calculated, as shown in figures 10 and 11. It can be seen that there are significant differences in the roles played by the four climate variables in different seasons. The weights of the EATI and ASHAI show the opposite trend to runoff, and the changing trend of the NHPVCI is consistent with the runoff, while TPR-2 has a two-month lag. From January to April, the runoff is mainly affected by the EATI, and the other three factors are negligible. With the increase of flow, the weight of the EATI gradually decreases and its influence becomes smaller, while the weights of the NHPVCI and TPR-2 increase and begin to play a dominant role. TPR-2 and the NHPVCI play important roles in the flood season while the EATI and ASHAI play important roles in the non-flood season.
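The averaging of time-varying weights by calendar month and by year can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name is hypothetical, the weight series is assumed to start in January and cover whole years, and the weights used here are random placeholders rather than the weights identified by the model.

```python
import numpy as np

def mean_weights_by_month_and_year(weights, start_year=1961):
    """Average monthly attention weights on intra- and inter-annual scales.

    weights: array of shape (T, K), one row per month and one column per
             climate factor; T must be a multiple of 12, starting in January.
    Returns (monthly, annual, years) with shapes (12, K), (T/12, K), (T/12,).
    """
    weights = np.asarray(weights, dtype=float)
    T, K = weights.shape
    if T % 12 != 0:
        raise ValueError("the series must cover whole calendar years")
    cube = weights.reshape(T // 12, 12, K)    # (year, month, factor)
    monthly = cube.mean(axis=0)               # mean over years, per month
    annual = cube.mean(axis=1)                # mean over months, per year
    years = np.arange(start_year, start_year + T // 12)
    return monthly, annual, years

# Placeholder weights for four factors over 588 months (1961 to 2009).
rng = np.random.default_rng(0)
w = rng.dirichlet(np.ones(4), size=588)       # each row sums to 1
monthly, annual, years = mean_weights_by_month_and_year(w)
print(monthly.shape, annual.shape, years[0], years[-1])
```

The `monthly` array corresponds to an intra-annual profile for each factor, and the `annual` array to an inter-annual series that can be compared with annual runoff.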
As shown in figure 11, on the inter-annual scale, the annual weights of the EATI and TPR-2 show trends opposite to that of the annual runoff, increasing significantly during periods of runoff decline. In the wet year of 1998 (when the Yangtze River Basin suffered a huge flood), the weights of the EATI and TPR-2 fall to their lowest levels, indicating that both mainly affect the runoff in dry years. The weight of the NHPVCI is positively correlated with runoff: it is largest in the wet year (1998) and even fluctuates in step with runoff during the fluctuating period from 2006 to 2009. The weight of the ASHAI is not significantly correlated with the annual runoff and shows a negative correlation only in some periods. Further, the analysis exploring the impact of these four indicators on the extremes under different scenarios is summarized in section S5 of supplementary information.
The identified factors are consistent with the conclusions of previous studies. The four climate phenomenon index factors identified have a high correlation with runoff in the Yangtze River Basin. For example, the East Asian Trough is a trough of low pressure formed by the westerly belt at mid-high latitudes of the Northern Hemisphere; its location and strength are closely linked with rainfall and streamflow in the Yangtze River Basin (Wen et al 2000, 2015). As the highest plateau in the world, the Tibetan Plateau has a strong dynamic and thermal influence on the atmospheric circulation over East Asia (Wang et al 2011, 2019). Inter-annual variations in drought and flooding in the Yangtze River Basin are highly correlated with the upstream surface latent heat flux over the Tibetan Plateau, especially during the Meiyu (Dong et al 2019). Studies of summertime precipitation in China have found that the magnitude of the NHPVCI is related to drought and flooding in the Yangtze River Basin in spring and winter (Huang et al 2004). Zhang et al (2017) summarized that the ASHAI affects the East Asian monsoon circulation and thus the occurrence of extreme precipitation in the Yangtze River Basin.

Conclusion
The key to improving long-term runoff forecasts is to establish a statistical model that describes the relationship between forecast factors and runoff based on identifying effective forecast information. In this study, we proposed a new LSTM model combined with two attention mechanisms for long-term runoff prediction. Unlike conventional recognition algorithms, the added attention mechanisms automatically recognize the key input factors and the identified weights are time-varying, which is more consistent with the fact that the factors affecting runoff change dynamically from one period to another. The model has achieved good results in the long-term prediction of runoff, precipitation, and PET at Yichang and Pingshan stations. Finally, the four identified key climate factors affecting runoff were further analyzed. The main conclusions are as follows:
(a) Compared with the traditional LSTM and PCCs-LSTM models, the proposed AT-LSTM model achieved the best performance in runoff forecasting at both Yichang and Pingshan stations, especially for high flows. The attention mechanism added to the input layer automatically identifies the key factors relevant to the runoff and avoids the use of conventional subjective recognition algorithms.
(b) The AT-LSTM model also performs effectively in rainfall and PET forecasting compared with the LSTM model. The predicted meteorological variables are substituted into the rainfall-runoff model to verify that the direct model outperforms the indirect model, providing a valuable reference for the practical application of the AT-LSTM model in long-term runoff prediction.
(c) The EATI, TPR-2, NHPVCI, and ASHAI are the four identified factors most relevant to monthly runoff at Yichang station, which is consistent with the conclusions of previous studies. Furthermore, the dynamic effects of the four factors on runoff are analyzed on intra- and inter-annual scales. The added attention mechanisms effectively identify reasonable and relevant climate indexes.
The AT-LSTM model proposed in this study can not only screen out key input factors, but also identify the dynamic effects of the factors on runoff and improve the performance of long-term forecasting. Future studies should consider the non-stationarity of climate phenomenon indexes and the uncertainty of the model. Additionally, limited by the scale of the data, the present study only analyzed the monthly runoff prediction at Yichang and Pingshan stations. If appropriate data are available, the performance of the model can be further tested on other time scales and in other basins.

Data availability statement
The climate phenomenon indexes are openly available at http://cmdp.ncc-cma.net/cn/index.html. The potential evapotranspiration data are openly available at https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.03/. The runoff and precipitation data are available on request from the corresponding author. The code used to support the findings of this study is available from the corresponding author upon request.
The data generated and/or analysed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.