A new Xin’anjiang and Sacramento combined rainfall-runoff model and its application

The Xin’anjiang model and the Sacramento model are two widely used short-term watershed rainfall-runoff forecasting models, each with its own model structure, strengths, weaknesses, and applicability. This paper introduces a weight factor to integrate the two models into a combined model, and uses the cyclic coordinate method to calibrate the weight factor and the parameters of the two models to explore the possible complementarity between them. With application to the Yuxiakou watershed in Qingjiang River, it is verified that the cyclic coordinate method, although simple, can converge rapidly to a satisfactory calibration accuracy, mostly after two iterations. The case study results also show that the new combined rainfall-runoff model improves the forecast precision by 4.3% in the testing period and fits the runoff process better than the Xin’anjiang model, which in turn performs better than the Sacramento model.


INTRODUCTION
Stream flow, a major part of the water cycle, plays an essential role in managing and planning water resources systems (Hansen & Hallam 1991), and the lead time between rainfall and the runoff it causes in river networks offers an opportunity to reduce the uncertainty of stream flows by forecasting the runoff from observed rainfall (Anderson et al. 2002). Many efforts have been made in previous works to address this engineering problem by modeling the relationship between rainfall and runoff (Todini 1988; Aytek et al. 2008; Asadi et al. 2019). Among these rainfall-runoff models, the physically based models are best used when precise data are available and the physical properties of the hydrological processes are accurately understood, and they are applied on fine scales owing to computational time, while conceptual models have gained popularity in the modeling community because they are easy to use and calibrate (Sitterson et al. 2018). The Xin'anjiang and Sacramento models are two lumped conceptual hydrological models extensively studied and applied in forecasting runoff from rainfall (Jaiswal et al. 2020).
Professor Zhao is well known as the founder of the Xin'anjiang rainfall-runoff model based on the conception of hillslope hydrology (Ren-Jun 1992), which has since been widely applied in China (Bai et al. 2016). The model has been improved in its structure and simulation procedure by previous works, including the following: Jayawardena & Zhou (2000) modified the spatial distribution curve of soil moisture storage capacity from the traditional single parabolic to a general double parabolic in order to differentiate the wet from the dry condition; Meng et al. (2018) included both the infiltration and saturation excess runoff mechanisms, coupled to a two-source potential evapotranspiration model (TSPE), to simulate the hydrological process; Yao et al. (2014) coupled the Xin'anjiang model with the geomorphologic instantaneous unit hydrograph to achieve better flood predictions in ungauged catchments; and Liao et al. (2016) employed the antecedent precipitation to correct the real-time forecast. The sensitivity of forecasting accuracy to the parameters was investigated by, for example, Zhang et al. (2012), who used the generalized likelihood uncertainty estimation to investigate the sensitivity of parameters of the Xin'anjiang model, and Song et al. (2013), who presented a two-step statistical evaluation framework using global techniques to reduce the computational cost in sensitivity analysis of the Xin'anjiang model. Parameter calibration has attracted the interest of many researchers, with methods including the genetic algorithm (Cheng et al. 2006), the shuffled complex evolution (Zhijia et al. 2013), and the multi-objective artificial bee colony algorithm (Huo & Liu 2019), to mention a few. Previous applications show that the Xin'anjiang rainfall-runoff model is more applicable to humid and semi-humid regions than to other regions.
The Sacramento model was originally developed for the United States National Weather Service and State of California, Department of Water Resources by Burnash et al. (1973), and then widely applied worldwide. Efforts were also made to improve its performance by structurally modifying the model in previous works. Koren et al. (2014), for example, presented two physically based modifications in order to take into account the effects of freezing and thawing soil on the runoff generation process. The parameters in the model can be calibrated with various optimization methods, including the shuffled complex evolution (Sorooshian et al. 1993), the direct search algorithm that Hendrickson et al. (1988) showed as being more robust than the Newton-type algorithm, which is more susceptible to poor conditioning of the response surface, as well as the multi-objective particle swarm optimization in a previous work (Ahirwar et al. 2018) that took into account two criteria, root mean square error and bias, to calibrate parameters. The Sacramento model has been used in humid, arid, and semi-arid areas with good results.
To date, due to differences in the modeling concepts and simulation procedure, such as water source division, runoff yield and confluence, different hydrological models will have large differences in forecast accuracy even when applied to the same watershed for the same forecast period (Mathevet et al. 2020). It is difficult for a single hydrological model to cover the various factors that may affect the forecast results, as it often strengthens individual modules and ignores some factors, which limits the ability of the model to adapt to different engineering backgrounds, as shown by Gill et al. (2006). Shamseldin et al. (1997) presented a combined forecasting method, which was helpful in improving the accuracy of the model forecast since it might amplify the simulation effects of the superior modules in the hydrological models while, at the same time, reducing the impact of each model's own defects on the forecast results. Xiong et al. (2001) showed that better runoff estimates can be obtained by combining the outputs of different models. Bates & Granger (2001) compared four methods of combining the simulation results from different models to produce better combination forecasts. These previous works all focused on processing the results obtained independently by different models so as to produce a better runoff ensemble.
This work introduces a weight factor to combine the Xin'anjiang with the Sacramento model, which, instead of only processing the simulation results obtained separately by the two models, optimizes the weight factor and involved parameters all together. Also, to the best of our knowledge, the complementarity between the Xin'anjiang and Sacramento models in forecasting runoffs has never been investigated in previous works.

Combine Xin'anjiang with Sacramento
The runoff from the combined model is simply defined as the weighted sum of the runoffs from the well-known Xin'anjiang and Sacramento models, expressed as:

$$Q_t^{\mathrm{COM}} = \lambda \, Q_t^{\mathrm{XAJ}}(I, S^{\mathrm{XAJ}}, \mu^{\mathrm{XAJ}}) + (1 - \lambda) \, Q_t^{\mathrm{SAC}}(I, S^{\mathrm{SAC}}, \mu^{\mathrm{SAC}})$$

where $\lambda$ is the weight factor between 0 and 1 introduced to combine the Xin'anjiang with the Sacramento model. 'COM', 'XAJ', and 'SAC' are tags used to indicate the combined, Xin'anjiang, and Sacramento models, respectively; $Q_t$ represents the runoff at time $t$; $I$, $S$, and $\mu$ represent vectors of inputs that mainly include rainfalls and evaporation capacities, initial states, and parameters, respectively. The schematic diagrams involving the inputs, states, and parameters, as well as the mathematical relationships between them, can be found in previous works for both the Xin'anjiang model (Ren-Jun 1992) and the Sacramento model (Burnash et al. 1973). The structure of this combined model allows the inclusion of more individual models.
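The weighting can be illustrated with a minimal Python sketch. It assumes the weight factor multiplies the Xin'anjiang runoff, in line with the calibrated weight reported later being described as favoring the Xin'anjiang model; the function name `combined_runoff` and the runoff values are hypothetical and only serve the illustration.

```python
import numpy as np

def combined_runoff(q_xaj, q_sac, lam):
    """Weighted combination of two simulated runoff series.

    q_xaj, q_sac: runoff simulated by the Xin'anjiang and Sacramento models
    lam: weight factor in [0, 1], assumed here to apply to the Xin'anjiang series
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("weight factor must lie in [0, 1]")
    q_xaj = np.asarray(q_xaj, dtype=float)
    q_sac = np.asarray(q_sac, dtype=float)
    if q_xaj.shape != q_sac.shape:
        raise ValueError("runoff series must have the same length")
    return lam * q_xaj + (1.0 - lam) * q_sac

# Made-up runoff values in m^3/s, for illustration only
q_xaj = [100.0, 250.0, 180.0]
q_sac = [120.0, 300.0, 150.0]
q_com = combined_runoff(q_xaj, q_sac, lam=0.874)
```

Setting `lam` to 1 or 0 returns one of the two individual series unchanged, so each single model is a special case of the combination.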

Initial states and parameters to be calibrated
The parameters that need to be calibrated include the weight factor as well as all the parameters involved in both the Xin'anjiang and Sacramento models. The calibration is inevitably based upon certain performance criteria to be quantified by model simulation, which depend on, among others, the initial states, including, for example, the initial soil moisture storage and runoff in the river. It is very difficult, if not impossible, to observe or measure any of these initial states, which, when artificially set to certain values, may have negative impacts on model calibration. For example, the initial moisture storages, when far above their real values, are very likely to result in false runoffs much greater than the real ones, especially during the early stage of simulation or prediction. Table 1 gives the parameters to be calibrated for the Xin'anjiang model, as well as the eight initial states, which will also be regarded as parameters to be calibrated so as to improve the accuracy of the model in simulating the rainfall-runoff process. Table 2 gives all the 12 initial soil moisture and river flow states, as well as 23 parameters in the Sacramento model. The initial states will also be calibrated together with the parameters. Most of the parameters are conceptually defined in the tables and can be determined by field measurement. This, however, would be a very laborious and unnecessary task that yields only an initial estimate of the parameters; the parameters can instead be easily optimized by starting from initial values kept within the empirical ranges given in the tables, which also serve as a reference for future research. Usually, the simulated and observed runoffs are compared to evaluate the accuracy of a forecast model. To evaluate the calibration accuracy, the present work employs the well-known Nash-Sutcliffe efficiency (NSE): the closer its value is to 1, the better the calibration.
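The NSE is expressed as:

$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n} \left( Q_t^{\mathrm{COM}} - Q_t \right)^2}{\sum_{t=1}^{n} \left( Q_t - \bar{Q} \right)^2}$$

where $n$ is the number of time intervals during the simulation period; $Q_t^{\mathrm{COM}}$ is the stream flow at time $t$ simulated by the combined model; $Q_t$ is the stream flow observed at time $t$; and $\bar{Q}$ is the average of $Q_t$ over the simulation period.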
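As a sketch of how this criterion might be evaluated in practice, the following Python function computes the standard Nash-Sutcliffe efficiency from a simulated and an observed series. The function name is hypothetical, and the guard for a constant observed series is an added assumption rather than part of the original method.

```python
import numpy as np

def nash_sutcliffe_efficiency(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared simulation
    error to the variance of the observations around their mean."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if simulated.shape != observed.shape:
        raise ValueError("series must have the same length")
    residual = np.sum((simulated - observed) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    if spread == 0.0:
        raise ValueError("NSE is undefined when the observed flow is constant")
    return 1.0 - residual / spread
```

A perfect simulation gives 1, while a simulation that always returns the observed mean gives 0, so values below 0 indicate a simulation that is worse than the mean as a predictor.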

The cyclic coordinate method to calibrate parameters and initial states
Combining more than one rainfall-runoff model has been practiced in previous works, most of which, however, calibrate parameters independently for each model and then process the results from all individual models to produce better forecast accuracy. The present work calibrates the parameters and the initial states of the two models, as well as the weight factor to combine the models all together by using the cyclic coordinate method (CCM), which is popular with practitioners owing to its simplicity.
The CCM is an optimization algorithm that successively optimizes along coordinate directions to find the optimum of a function. This procedure, although very simple, is very effective in finding a Karush-Kuhn-Tucker (KKT) solution for an unconstrained optimization or one with only box constraints. The calibration problem in this work is a nonlinear optimization in which each parameter or initial state is constrained only by an upper and a lower bound, that is, by box constraints.
Let a solution of parameters and initial states to the calibration problem be:

$$X = \left( \lambda, S^{\mathrm{XAJ}}, \mu^{\mathrm{XAJ}}, S^{\mathrm{SAC}}, \mu^{\mathrm{SAC}} \right)$$

and start with an initial solution denoted as $X^{(0)}$. The procedure of the CCM is illustrated in Figure 1. For an optimization with box constraints on decision variables $X \in \mathbb{R}^n$, the CCM starts with the initial solution $X^{(0)}$ at the beginning of the first cycle, searches along the first coordinate direction to derive the optimal solution $X_1^{(1)}$ on that direction, then searches along the second coordinate direction to obtain the optimum $X_2^{(1)}$ on that direction, and continues in this way, starting a new cycle from the latest solution every time the last coordinate has been rotated. The procedure repeats until reaching the convergence

$$\left\lVert X_n^{(k)} - X_1^{(k)} \right\rVert \le \varepsilon$$

for a prescribed tolerance $\varepsilon$, which also determines the optimal parameters and initial states as:

$$X^{*} = X_n^{(k)}$$
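The cycle described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the source does not name the one-dimensional search used along each coordinate, so a golden-section search is assumed here; the stopping test compares the solutions at the start and end of a cycle; and `toy_score` is a hypothetical stand-in for the NSE of a rainfall-runoff simulation.

```python
import numpy as np

def golden_section_max(f, lo, hi, tol=1e-6):
    """Maximize a one-dimensional function on [lo, hi] (assumes unimodality)."""
    inv_phi = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:   # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:         # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def cyclic_coordinate_method(objective, x0, lower, upper, eps=1e-4, max_cycles=50):
    """Maximize `objective` under box constraints, one coordinate at a time."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    x = np.clip(np.asarray(x0, dtype=float), lower, upper)
    for cycle in range(1, max_cycles + 1):
        x_start = x.copy()
        for i in range(x.size):
            def along_axis(value, i=i):
                trial = x.copy()
                trial[i] = value
                return objective(trial)
            best = golden_section_max(along_axis, lower[i], upper[i])
            # Accept the line-search result only if it improves the objective
            if along_axis(best) > objective(x):
                x[i] = best
        # Stop once a full cycle no longer moves the solution appreciably
        if np.linalg.norm(x - x_start) <= eps:
            break
    return x, cycle

def toy_score(x):
    # Hypothetical smooth score with its maximum at (0.3, -1.0)
    return 1.0 - (x[0] - 0.3) ** 2 - 2.0 * (x[1] + 1.0) ** 2

x_best, cycles = cyclic_coordinate_method(
    toy_score, x0=[0.9, 2.0], lower=[0.0, -3.0], upper=[1.0, 3.0]
)
```

Because each coordinate search stays inside its own bounds, the box constraints are respected without any extra handling. A real calibration objective is generally not unimodal along a coordinate, which is one reason for restarting from many initial solutions.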

The watershed background
The Qingjiang River Basin is located at the eastern end of the Yunnan-Guizhou Plateau, China. The terrain is high in the west and low in the east; the mainstream has a total length of 423 km, and the total drainage area is about 16,700 km². The basin belongs to a subtropical monsoon climate zone, with an average annual relative humidity of 70-80% and an average annual rainfall of 1,460 mm, which is unevenly distributed throughout the year, with more than 75% of the annual rainfall concentrated in the rainy season from April to September. The basin is rich in hydropower resources, and its large drop in water head and small submerged area make it easy to develop. A three-level cascade of hydropower stations, 'Shuibuya-Geheyan-Gaobazhou', has been built in the middle and lower reaches of the mainstream. The map of Qingjiang River Basin and its sub-basins is shown in Figure 2.
The hydrological data available for the Yuxiakou Basin of Qingjiang River include rainfall, evaporation, and runoff data from 1990 to 1991. Data recorded at longer time intervals are linearly interpolated so that all the data are at 6-hour intervals. Thus, each year can be divided into 1,460 6-hour intervals, with those in 1990 used for training the models and those in 1991 for testing them. It is worth noting that this work does not take into account the impacts of the Gaobazhou and Shuibuya hydro-plants on the stream flow since they were constructed later, in 2001 and 2009, respectively. The initial values set for the parameters and initial states will affect the calibration results. Here, the experiments are done by starting the CCM from different initial solutions, taken one by one from 1,600 sets of values randomly generated within reasonable ranges for the parameters and initial states of the model. Figure 3 illustrates the convergence of the NSE in the Xin'anjiang model for part of the experiment results derived by starting the CCM at different initial values of the parameters and initial states. The NSE, although quite different at the starting points, mostly rises to around 0.8 within about two cycles of coordinate rotation, indicating that the CCM converges fast and performs well in parameter calibration. Similar conclusions can be drawn when it is applied to the Sacramento model.
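The interpolation step can be sketched with NumPy's `numpy.interp`. The function name `to_six_hour_series` and the sample values are hypothetical; the sketch only assumes that observation times are given in hours and are strictly increasing.

```python
import numpy as np

def to_six_hour_series(obs_hours, obs_values, step=6.0):
    """Linearly interpolate observations onto a regular grid of `step` hours.

    obs_hours must be increasing, as required by numpy.interp.
    """
    obs_hours = np.asarray(obs_hours, dtype=float)
    obs_values = np.asarray(obs_values, dtype=float)
    n_steps = int((obs_hours[-1] - obs_hours[0]) // step)
    grid = obs_hours[0] + step * np.arange(n_steps + 1)
    return grid, np.interp(grid, obs_hours, obs_values)

# Made-up daily flow observations (m^3/s) at hours 0, 24, and 48
hours = [0.0, 24.0, 48.0]
flows = [100.0, 140.0, 120.0]
grid, flows_6h = to_six_hour_series(hours, flows)

# A non-leap year of 365 days holds 365 * 24 / 6 six-hour intervals
intervals_per_year = 365 * 24 // 6
```

The last line reproduces the count of 1,460 intervals per year quoted in the text.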

Results in the training period
As mentioned above, the observed data in 1990 are used for training the model. To explore more local optima, 1,600 sets of initial values for the parameters are randomly generated for each model separately, serving as the starting solutions of the CCM. When applied to all three models, namely the Sacramento, the Xin'anjiang, and the present combined model, the best estimations give a low, a high, and the highest simulation accuracy, respectively, with NSE = 0.763 for the Sacramento, NSE = 0.875 for the Xin'anjiang, and NSE = 0.916 for the new combined model in the training period, as summarized in Table 3. The weight factor to combine the Xin'anjiang with the Sacramento model is calibrated to its best value, equal to 0.874, which means the Xin'anjiang model carries more weight in the present combined model. Figure 4 illustrates the comparison of stream flows between the Xin'anjiang and Sacramento models, simulated at the best estimated parameters. The results show that the runoff in the Yuxiakou Basin of Qingjiang River is mainly concentrated during the 400-900th periods, and the rest of the periods have relatively small runoff, with the average observed flow over the 1,400-1,600th periods being only 53.6 m³/s. In the first 400 time periods, the runoff trends of the Xin'anjiang and Sacramento models are basically consistent with the observed values, but the simulated runoffs are all smaller than the observed ones. The 400-900th periods, with frequent rainfalls, are in the wet season, when the Sacramento model simulates a higher peak flow in a flood event than the observed value, and its runoffs during the recession limb after the flood peak lag behind both the observed values and the values simulated by the Xin'anjiang model. The Xin'anjiang model gives a better fit between the simulated runoffs and their observed values, with the peaks and valleys basically coinciding with each other.
The 900-1,400th periods are in the dry season in the basin, with the average runoff being only 103 m³/s. The simulated value of the Sacramento model is larger than the observed value, while that of the Xin'anjiang model is slightly smaller than the observed value, showing a higher simulation accuracy of the Xin'anjiang model than the Sacramento model.
[...] models, respectively, as summarized in Table 4. The combined model improves the NSE by 4.3% compared with the Xin'anjiang model, the better one when individually applied. The NSE of the combined forecast model is greater than that of either model when forecasting separately, indicating a higher forecast accuracy than the two single models when forecasting the runoffs in 1991 of the Yuxiakou watershed, Qingjiang River. The weight factor can be seen as the contribution ratio to the runoff simulated by the combined model. Seen this way, the contribution from the Sacramento model is only 12.6%, which is small and reduces the influence of that model's own defects on the forecast results to a certain extent.
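The percentages quoted for the weight factor and the training period can be checked from the reported values with a short calculation. The sketch below assumes that the improvement is measured relative to the NSE of the Xin'anjiang model; the function name `relative_improvement` is hypothetical.

```python
def relative_improvement(nse_new, nse_ref):
    """Improvement of one NSE over a reference NSE, as a fraction of the reference."""
    return (nse_new - nse_ref) / nse_ref

# Training-period NSE values reported in Table 3
nse_xaj_train = 0.875
nse_com_train = 0.916
train_gain = relative_improvement(nse_com_train, nse_xaj_train)  # about 0.047

# Share of the Sacramento runoff when the Xin'anjiang weight is 0.874
sacramento_share = 1.0 - 0.874  # 0.126, i.e. 12.6%
```

Under that assumption the training-period gain comes out at roughly 4.7%, and the Sacramento share at 12.6%.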
The comparison between the simulated and observed runoffs during the testing period is given in Figure 5 for both the Xin'anjiang and Sacramento models. The forecast results show that in the first 400 periods, when it is relatively dry, the forecast values of the two models are lower than the observed values, with a certain forecast error. During the 400-900th wet period, three major floods occurred in the Qingjiang Basin. The average error of the Xin'anjiang model for the peak discharge simulation is 4.87%, while the peak discharge forecast values of the Sacramento model are all greater than the observed values, with the average error reaching 27.89%. This large deviation indicates that the Xin'anjiang model is more accurate in forecasting the flood peak during the wet season. In the recession limb, the Sacramento model is more accurate, while the Xin'anjiang model overestimates the runoff, indicating the existence of complementarity between the two models. In the 900-1,400th periods of the dry season, the forecast difference between the Xin'anjiang model and the Sacramento model is not significant.
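One way such an average peak error could be computed is sketched below. The source does not give the exact definition, so the sketch assumes the mean of the absolute relative errors over the flood events; the function name and the peak values are hypothetical and are not the data of this study.

```python
import numpy as np

def mean_relative_peak_error(simulated_peaks, observed_peaks):
    """Mean absolute relative error of simulated flood peaks (assumed definition)."""
    simulated_peaks = np.asarray(simulated_peaks, dtype=float)
    observed_peaks = np.asarray(observed_peaks, dtype=float)
    if np.any(observed_peaks <= 0.0):
        raise ValueError("observed peaks must be positive")
    return float(np.mean(np.abs(simulated_peaks - observed_peaks) / observed_peaks))

# Made-up peak discharges (m^3/s) for three flood events
observed = [2000.0, 3500.0, 2800.0]
simulated = [2100.0, 3325.0, 2940.0]
error = mean_relative_peak_error(simulated, observed)
```

Each of the three made-up events is off by 5% of its observed peak, so the averaged error in this example is 5%.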
The comparison between the runoff simulated by the combined model and the observed runoff during the testing period is shown in Figure 6. The combined model is better than either of the two models in fitting both the flood peak flow and the recession limb, with simulated runoffs closer to the observed values. The results show that the combined forecast model can take advantage of both the Xin'anjiang and Sacramento models to dynamically modify the forecast value when forecasting runoff, so as to improve the forecast accuracy and enhance the adaptability of the model. For the first 400 periods in the dry season, the forecast values of the two models are smaller than the observed runoffs when the two models are separately applied, which makes the runoffs forecast by the combined model smaller than the observed values too.

DISCUSSION
The combined model is a general one that contains each of its component models as a special case, since setting the weight factor to either end of its range recovers the Xin'anjiang or the Sacramento model. At its calibrated optimum it therefore theoretically performs no worse than either component model on the calibration data. Physically, however, some mechanisms remain open to be explored, including, but not limited to, the long range dependence (LRD) (Dimitriadis et al. 2021) that the present combined model may have enhanced so as to decrease the uncertainty when translating rainfall to stream flow, as well as the surface flow, interflow, and base flow, which may be more connective (Keesstra et al. 2018) because combining the Xin'anjiang with the Sacramento model draws on more diverse runoff yields. The sensitivity of the results to parameters is yet to be analyzed to give a more comprehensive assessment of the combined model by using, for example, different sensitivity indexes (Ballinas-González et al. 2020).
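The relation between the combined model and its component models on calibration data can be illustrated with a simplified Python sketch. Unlike the joint calibration used in this work, it holds the two simulated series fixed and optimizes only the weight, for which the squared error is a convex quadratic with a closed-form minimizer; all names and values are hypothetical.

```python
import numpy as np

def nse(simulated, observed):
    """Nash-Sutcliffe efficiency of a simulated series."""
    return 1.0 - np.sum((simulated - observed) ** 2) / np.sum(
        (observed - observed.mean()) ** 2
    )

def best_weight(q_a, q_b, observed):
    """Weight in [0, 1] on q_a minimizing the squared error of the combination."""
    diff = q_a - q_b
    denom = np.dot(diff, diff)
    if denom == 0.0:
        return 1.0  # identical series: any weight gives the same result
    lam = np.dot(observed - q_b, diff) / denom
    # The error is convex in lam, so clipping yields the optimum on [0, 1]
    return float(np.clip(lam, 0.0, 1.0))

# Made-up series whose errors have opposite signs
observed = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
sign = np.array([1.0, -1.0, 1.0, -1.0, 1.0])
q_a = observed + 0.5 * sign   # first model
q_b = observed - 1.0 * sign   # second model

lam = best_weight(q_a, q_b, observed)
q_com = lam * q_a + (1.0 - lam) * q_b
```

Since weights of 1 and 0 reproduce the two single models, the optimized combination cannot score below the better of them on the data used for fitting. The same argument does not extend to an independent testing period.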

CONCLUSIONS
This work introduces a weight factor to combine the Xin'anjiang with the Sacramento model to explore the complementarity between the two models. Instead of only combining the results derived separately from the two models to produce a better forecast, this work optimizes the weight factor, the parameters, and the initial states of the two models all together by the cyclic coordinate method. Case studies in the Yuxiakou catchment of Qingjiang River in China show that the CCM converges very fast, with the Nash-Sutcliffe efficiency, though quite different at the starting points, mostly rising to around 0.8 within about two cycles of coordinate rotation. The application also reveals that the combined model performs better than either the Xin'anjiang or the Sacramento model alone, improving the NSE by 4.7% in the training period and 4.3% in the testing period compared with the Xin'anjiang model, which has higher NSEs than the Sacramento model.

DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.