The effect of time series length on runoff characteristics analysis

The length of a time series is an important source of uncertainty in runoff characteristics analysis, especially in the analysis of trend and period. This paper investigates tendency and period using 107 years of runoff observation data (1909 to 2015) from the Monticello hydrological station on the Upper Sangamon River, USA. Data series of 30, 40, 50 and so on up to 100 years, and of the full 107 years, are selected; their trend and period are tested and then compared with each other. The results show that, as the time series length increases, the tendency and period tend to stabilise, and the annual runoff begins to show a significant increasing trend once the time series is longer than 80 years. The period tends to a dynamically stable value of 22 years when the time series is longer than 70 years. When the series is longer than 70 years, the results of runoff characteristics analysis are much more reliable.


Introduction
In recent years, the runoff of most rivers has changed to a great extent due to the impact of climate change and human activities [1]. Hence, runoff characteristics have been widely analysed to provide a basis for the rational development and utilization of water resources [2]. Over the last few decades, research on trend and period analysis of hydrological series has matured considerably. Sang et al [3] analysed the deterministic period using empirical mode decomposition (EMD) and maximum entropy spectral analysis (MESA), which can efficiently avoid the impact of noise on trend and period analysis. Burn and Hag Elnur [4] utilized the Mann-Kendall non-parametric test to detect trends and found that more trends are observed than are expected to occur by chance. Hamed [5] improved the Mann-Kendall test to study the significant trends in annual maximum flow of a group of 57 worldwide total annual runoff time series. Dixon et al [6] used the non-parametric Mann-Kendall test with bootstrap resampling to calculate annual and seasonal trends and concluded that a marked east-west gradient of streamflow trend emerged in the mountainous west. Labat [7] reviewed the concepts of wavelet analysis and its recent advances, which are of great importance for identifying period components. Gaucherel [8] exploited the potential of the continuous wavelet transform on flow curves to detect new periodicities or time annual features. Sang et al [9] developed a discrete wavelet spectrum (DWS) approach to identify non-monotonic trend patterns in hydrological time series. Another study [10] established an Autoregressive Integrated Moving Average (ARIMA) model to analyse the influence of time series length on the results of monthly runoff forecasts. However, up to now, few studies have investigated the effect of time series length on the analysis of runoff characteristics.
Therefore, in this paper, we make a preliminary assessment of the optimum time series length using 107 years of runoff observation data (1909 to 2015) from the Monticello hydrological station on the Upper Sangamon River, USA. The Upper Sangamon River Basin (USRB) has been affected since the 19th century by human activities, including the construction of railways and drainage systems, the excavation of drains and the laying of underground drainage pipes. In the midstream of the USRB there is a man-made lake, Lake Decatur, where a reservoir lies. Because water supply data for the reservoir are lacking, the water level gauge located upstream of Lake Decatur was selected for the runoff characteristics analysis [12].

Study area and data
The runoff data are obtained from the USGS. They are divided into 9 groups that all start in 1909 and whose lengths increase in steps of 10 years. In other words, we take the runoff observations of the first 30, 40, 50 and so on up to 100 years, together with the full 107-year record; their tendency and period are then analysed separately and compared.
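The grouping scheme (a fixed start year of 1909, lengths growing in steps of 10 years, and the full 107-year record as the last group) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_groups` and its parameters are hypothetical, and the input is assumed to be a list of annual runoff values ordered by year.

```python
def build_groups(values, first_len=30, step=10):
    """Split an annual series into nested sub-series sharing the same start year.

    `values` is assumed to be ordered by year (1909 first in the paper's case).
    Returns a dict mapping sub-series length (in years) to the sub-series.
    """
    n = len(values)
    # Lengths 30, 40, ... that fit inside the record.
    lengths = list(range(first_len, n + 1, step))
    # Always include the full record as the last group (107 years in the paper).
    if not lengths or lengths[-1] != n:
        lengths.append(n)
    return {m: values[:m] for m in lengths}


if __name__ == "__main__":
    # Placeholder values only; real data would come from the USGS record.
    dummy = [float(i) for i in range(107)]
    for length, series in build_groups(dummy).items():
        print(length, 1909, 1909 + length - 1)
```

With a 107-value input this yields nine groups, matching the number of groups stated in the text.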

Methods
In this paper, three methods are applied to analyse the runoff characteristics: Correlation Coefficient Analysis (CCA) and the Linear Regression Trend Test (LRTT) for tendency analysis, and Wavelet Analysis (WA) for period analysis.

Correlation coefficient analysis (CCA) and linear regression trend test (LRTT)
It is assumed that there is a linear trend component in a hydrologic sequence $x_t$:

$$x_t = T_t + \eta_t = a + bt + \eta_t \qquad (1)$$

where $T_t$ represents the trend component; $\eta_t$ represents the other components; $a$ and $b$ are the parameters of the linear equation of the trend component; and $t$ is time. The parameters are expressed as follows [13]:

$$b = \frac{\sum_{t=1}^{n}(t-\bar{t})(x_t-\bar{x})}{\sum_{t=1}^{n}(t-\bar{t})^2} \qquad (2)$$

$$a = \bar{x} - b\,\bar{t} \qquad (3)$$

$$S_b^2 = \frac{S^2}{\sum_{t=1}^{n}(t-\bar{t})^2} \qquad (4)$$

$$S^2 = \frac{1}{n-2}\sum_{t=1}^{n} q_t^2, \qquad q_t = x_t - a - bt \qquad (5)$$

where $S_b$ represents the standard deviation of $b$; $\sum_{t=1}^{n} q_t^2$ represents the sum of squared deviations; $\bar{x}$ and $\bar{t}$ represent the means of $x_t$ and $t$, respectively; and $n$ represents the sample size. $r$ is the correlation coefficient between $x_t$ and $t$, and $T$ is the test statistic. They are calculated as follows:

$$r = \frac{\sum_{t=1}^{n}(t-\bar{t})(x_t-\bar{x})}{\sqrt{\sum_{t=1}^{n}(t-\bar{t})^2 \sum_{t=1}^{n}(x_t-\bar{x})^2}} \qquad (6)$$

$$T = \frac{b}{S_b} \qquad (7)$$

In this paper, trend analysis takes two steps. First, $r$ is calculated and compared with the critical value from the correlation coefficient checklist to identify whether the relationship between $x_t$ and $t$ is significant. In other words, we want to find out whether $x_t$ depends linearly on $t$; if the relationship is not significant, there is no tendency. Second, $T$ is calculated using equation (7). At the outset, $b$ is assumed to be equal to 0, in which case $T$ follows a t-distribution with $D = n - 2$ degrees of freedom. Given a significance level $\alpha$, the critical value $t_{\alpha/2}$ can be looked up in the t-distribution checklist.
Then $T$ is compared with $t_{\alpha/2}$: if $|T| > t_{\alpha/2}$, the initial assumption $b = 0$ is rejected, that is to say, the trend component is significant; otherwise, the trend component can be ignored.
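The two-step check described above (correlation coefficient first, then the t-type statistic for the slope) can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions rather than the authors' implementation: it uses the ordinary least-squares expressions for $b$, $a$, $S_b$, $r$ and $T$, takes $t = 1, \dots, n$ as the time index, and expects the critical value to be supplied by the reader from a t-table. The names `linear_trend_test` and `is_significant` are hypothetical.

```python
import math


def linear_trend_test(x):
    """Return slope b, intercept a, correlation r and statistic T = b / S_b.

    Assumes t = 1..n and a series that is neither constant nor perfectly
    linear (those degenerate cases would divide by zero and are not handled).
    """
    n = len(x)
    t = range(1, n + 1)
    t_mean = sum(t) / n
    x_mean = sum(x) / n
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_tx = sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x))

    b = s_tx / s_tt                      # slope of the trend line
    a = x_mean - b * t_mean              # intercept
    # Residual sum of squares around the fitted line.
    rss = sum((xi - a - b * ti) ** 2 for ti, xi in zip(t, x))
    s_b = math.sqrt(rss / (n - 2) / s_tt)  # standard deviation of b
    r = s_tx / math.sqrt(s_tt * s_xx)    # correlation between x_t and t
    return {"a": a, "b": b, "r": r, "T": b / s_b, "dof": n - 2}


def is_significant(T, t_critical):
    """Two-sided decision: reject b = 0 when |T| exceeds the critical value."""
    return abs(T) > t_critical


if __name__ == "__main__":
    result = linear_trend_test([1.0, 3.0, 2.0, 5.0, 4.0])
    print(result)
    # 3.182 is the tabulated two-sided 5% critical value for 3 degrees of freedom.
    print(is_significant(result["T"], 3.182))
```

For a simple linear fit the two steps are closely linked, since $T = r\sqrt{n-2}/\sqrt{1-r^2}$, so the correlation check and the slope test lead to the same decision at a given significance level.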

Wavelet analysis (WA)
WA, called the 'mathematical microscope' [14], is an important breakthrough in the development of the Fourier transform (FT). Compared with FT, it can identify very weak periodic signals and has a powerful local time-frequency resolution for describing local signal characteristics [15]. In this paper, WA is used to test the period component of time series of different lengths, following the formulation in [11].

Tendency analysis
Through CCA, the correlation coefficient of each group is calculated and compared with the critical values $r_{0.05}$ and $r_{0.01}$ taken from the correlation coefficient checklist; the results are shown in table 1 and figure 2. The correlation coefficient lies between $r_{0.05}$ and $r_{0.01}$ when n is more than 80 years. In other words, when the length of the hydrological sequence exceeds 80 years, the sequence changes systematically with time and the trend component begins to appear. We then adopt LRTT to test the significance of the tendency for the hydrological sequences longer than 80 years.
The statistic value T is calculated using equation (7) and listed in table 2 along with the critical value $t_{\alpha/2}$, where $\alpha$ is set to 0.05. When n is greater than 80 years, T is always larger than $t_{\alpha/2}$, so the increasing tendency is significant. Hence, when the time series is longer than 80 years, the trend becomes significant.

Period analysis
The wavelet variance curves are shown in figure 3. The curves of all groups are similar in shape: each has two or three peaks, and the first, second and third peaks represent the first, second and third main periods, respectively [16]. The results are shown in table 3 and figure 4. Figure 3 and table 3 show that the first main period is generally about half the length of the time series, so it is very likely a pseudo period. The first and second main periods are approximately multiples of the third main period, and therefore the third main period can be ignored. Examining the second main period, we conclude that the period of the USRB tends to a dynamically stable value when the length of the hydrological sequence is at least 80 years. We therefore recommend that the hydrological sequence be at least 80 years long when analysing the period component.
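Reading main periods off the peaks of a wavelet variance curve can be illustrated with a small standard-library sketch. Several choices here are assumptions, since the text does not state them: a complex Morlet mother wavelet with centre frequency 6, the transform $W(a,b) = a^{-1/2}\sum_t x_t\,\psi^*((t-b)/a)$ evaluated by direct summation, the wavelet variance taken as the mean of $|W(a,b)|^2$ over $b$, and the usual Morlet conversion from scale to equivalent Fourier period. All function names are hypothetical.

```python
import cmath
import math

OMEGA0 = 6.0  # assumed Morlet centre frequency (not given in the text)


def morlet(u):
    """Complex Morlet mother wavelet at dimensionless time u."""
    return math.pi ** -0.25 * cmath.exp(1j * OMEGA0 * u) * math.exp(-0.5 * u * u)


def wavelet_variance(x, scales):
    """Var(a): mean over positions b of |W(a, b)|^2, for each scale a."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]  # remove the mean before transforming
    variance = []
    for a in scales:
        # Conjugated wavelet values for every possible offset t - b.
        kernel = {d: morlet(d / a).conjugate() for d in range(-(n - 1), n)}
        total = 0.0
        for b in range(n):
            w = sum(z[t] * kernel[t - b] for t in range(n)) / math.sqrt(a)
            total += abs(w) ** 2
        variance.append(total / n)
    return variance


def main_scales(scales, variance):
    """Scales at local maxima of the variance curve, strongest peak first."""
    peaks = [i for i in range(1, len(variance) - 1)
             if variance[i - 1] < variance[i] > variance[i + 1]]
    peaks.sort(key=lambda i: variance[i], reverse=True)
    return [scales[i] for i in peaks]


def scale_to_period(a):
    """Equivalent Fourier period of a Morlet scale (same time unit as the data)."""
    return 4 * math.pi * a / (OMEGA0 + math.sqrt(2 + OMEGA0 ** 2))


if __name__ == "__main__":
    # Synthetic series with a known 20-step cycle, used only as a sanity check.
    series = [math.sin(2 * math.pi * t / 20) for t in range(120)]
    scales = list(range(2, 41))
    peaks = main_scales(scales, wavelet_variance(series, scales))
    print([round(scale_to_period(a), 1) for a in peaks])
```

Because wavelets at large scales extend beyond the ends of a short record, peaks at scales comparable to half the series length are strongly affected by edge effects, which is consistent with treating the first main period as a likely pseudo period.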

Conclusion
Most runoff observation records for watersheds in China span only about 60 years, which limits research on runoff characteristics analysis. This paper discusses trend and period components using 107 years of runoff observation data (1909 to 2015) from the Monticello hydrological station on the Upper Sangamon River, USA, divided into 9 groups.
The results show that, in the USRB, a significant increasing tendency can be detected when the series is longer than 80 years. Based on WA, the wavelet variance curves are used to identify the periods of the different groups, and we find that the series should be at least 80 years long when analysing the period component in order to obtain a better result. Therefore, it is recommended that the hydrological series should not be shorter than 80 years when analysing runoff characteristics. However, future runoff may be completely different, so this recommendation should be applied with the actual situation in mind. This study can provide a reference for future research.