Uncertainty of Flood Forecasting Based on Radar Rainfall Data Assimilation

1 State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China
2 Hubei Collaborative Innovation Center for Water Resources Security, Wuhan University, Wuhan 430072, China
3 College of Tourism Culture and Geographical Science, Huanggang Normal University, Huanggang 438000, China
4 Australian Rivers Institute, Griffith University, Nathan, QLD 4111, Australia


Introduction
Flooding is a hazard with potentially serious consequences: loss of life, displacement of communities, and economic costs. The risks can be reduced by issuing warnings and implementing planned responses to avoid the threat. This relies on access to accurate real-time forecasts of flood peak magnitude, timing, and duration at the time-scales of nowcasting (∼2 hours) and short-range forecasting (∼1-2 days). The time required for rainfall to be converted into runoff and conveyed downstream through channels is long enough that forecasting of flash floods can use recently observed rainfall data rather than less accurate forecast rainfall. Thus, accurate rainfall data is the key input to flood forecasting [1]. Such data is usually obtained from an automatic network of rain gauges, set up according to established standards to continuously measure rainfall volume at short intervals with acceptable accuracy. For practical reasons, the density of gauges in the network will often be too low to properly characterize the spatial and temporal distribution of rainfall throughout a catchment. Failure to capture the intensity of the main storm cell can lead to underestimation of flood peak magnitude and to errors in its predicted timing. Weather radar offers an alternative method of capturing rainfall data that has the potential to overcome this limitation of a rain gauge network or to enhance the information from the network. Radar echoes provide real-time, high spatial resolution information on cloud extent and thickness, evolution and direction of travel of storm cells, and precipitation distribution, but with a degree of uncertainty that is higher than desirable for reliable flood forecasting. The accuracy of radar-derived rainfall data can be improved by calibration with observed data using data assimilation techniques [2][3][4]. Data assimilation refers to the process of incorporating observed data into the model state of a numerical model of a system and is employed in situations when the number of available observations of the system is much smaller than the number required to specify the model state. Postassimilation rainfall data can be input to distributed hydrological models to more accurately predict flood hydrograph characteristics [5][6][7][8][9].
There has been considerable progress in practical application of weather radar for estimation of surface rainfall. For example, the UK Met Office uses a network of 13 weather radars that provides 2 km × 2 km gridded radar echo data at 5-minute intervals, as well as an extrapolation based on radar data every 30 minutes, produced and accumulated in real time. Precipitation forecasts are provided every 6 hours [10]. The US National Weather Service has operated a national river forecasting system since the early 1990s that combines weather forecasts with flood prediction models to disseminate flood warnings [11]. The precipitation input data are derived from satellite, ground radar, and rainfall gauges. China has established a network of 172 new-generation Doppler weather radars that transmits data in real time every 6 minutes and provides composite weather radar data. This has helped to enhance the monitoring of sudden, potentially calamitous weather and improved the warning services for rainstorm disasters [12].
The performance of conceptual hydrological models depends to a large extent on the quality of the rainfall data but also on the selection of parameter values and model structure. Ideally, parameters are optimized using historical flood events, but model performance also relies to some extent on expert judgment and could be limited by the structure of the model, which is a simplification of a complex hydrological process [13][14][15]. Uncertainty associated with model structure and parameters has received attention [16][17][18][19][20], as has uncertainty associated with the quality of the rainfall data input [21,22]. Error in remotely sensed precipitation data will be transmitted, and possibly amplified, through the hydrological simulation process. This paper examines the nature of uncertainty in flood forecasting due to transmission of errors associated with assimilated radar-based rainfall data.
In this paper the data assimilation methods of average calibration (AVG), optimum interpolation (OPT), Kalman filter (KLM), statistical weight integration (SWI), and variational calibration (VAR) were applied to radar rainfall data transformed using the classified Z-R (radar reflectivity-rainfall rate) relationship, and the results were compared with precipitation data observed by a rain gauge network. The five sets of assimilated radar-based rainfall data and the uncalibrated (UC) classified Z-R relationship rainfall data were then used as input to the Xinanjiang hydrological model to predict flood hydrographs. This paper provides critical descriptive analysis of the main assimilation methods and the effect of the errors on uncertainty of flood forecasting using new data sets from the White Lotus River catchment, China. A novel aspect of this paper is quantification of the influence of the propagation of rainfall input error on simulated flood event peak flow magnitude, flood event volume, and flood peak timing. This was done by perturbing radar rainfall data using the Breeding of Growing Modes (BGM) method.

Study Area
The White Lotus River, located in Hubei Province, China, has a catchment area of 1797 km². Its headwaters are in the Dabie Mountains and it forms a midcatchment tributary of the Xishui River, which flows into the lower Yangtze River (Figure 1). The catchment lies within the subtropical monsoon climate zone, with cold dry winters, hot wet summers, and distinct climatic seasonality. Mean annual temperature is 16.7 °C, and annual average precipitation is 1366 mm, with the majority falling from June to August. In the wet season, rainstorms occur with a high frequency and intensity, and the short convergence time typically results in flood hydrographs with rapid rates of rise and fall.

Data and Method
3.1. Data Sets. The White Lotus catchment has 19 rain gauges and one river hydrological station at the outlet (Figure 1). This study used hourly recorded rainfall data from the years 2005 and 2006. The radar data were sourced from the CINRAD/SA Doppler weather radar located in Wuhan for the flood seasons of 2005 and 2006. The study area lies at a range of 150 to 250 km from the radar. Generally, the 0.5° elevation [23] reflectivity factor plan position indicator (PPI) was used to develop the Z-R relationship transformation.
Radar-estimated rainfall was quantified on a 1 km × 1 km grid at a 1-hour time step. Eight rainstorms were chosen for analysis (Table 1). The total sampled period for these events was 222 hours.

Average Calibration (AVG).
AVG is a simple and precise method for calculating regional rainfall volume [24,25] whereby $n$ rain gauges are used to calculate the mean calibration factor:

$$\bar{F} = \frac{1}{n}\sum_{i=1}^{n}\frac{G_i}{R_i},$$

where $G_i$ is the observed rainfall at station $i$ and $R_i$ is the radar-estimated rainfall at that station. AVG is applied by multiplying the radar rainfall field $R$ by $\bar{F}$ to give the regional rainfall distribution. Where the spatial pattern of rainfall is inconsistent over the area of interest, a number of regional AVG factors could be calculated.
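The averaging step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the calibration factor is taken as the mean of the per-gauge ratio of gauge rainfall to radar rainfall, gauges with zero radar rainfall are skipped, and the function names and sample values are hypothetical.

```python
def mean_calibration_factor(gauge_mm, radar_mm):
    """Mean of the gauge/radar rainfall ratio over gauges with radar rainfall > 0."""
    ratios = [g / r for g, r in zip(gauge_mm, radar_mm) if r > 0]
    if not ratios:
        raise ValueError("no gauge has positive radar rainfall")
    return sum(ratios) / len(ratios)


def calibrate_field(radar_field, factor):
    """Multiply every grid cell of the radar rainfall field by the factor."""
    return [[factor * value for value in row] for row in radar_field]


# Hypothetical hourly values (mm) at three gauges and a 2 x 2 radar grid.
gauge_mm = [4.0, 6.0, 9.0]
radar_mm = [2.0, 4.0, 6.0]
factor = mean_calibration_factor(gauge_mm, radar_mm)
calibrated = calibrate_field([[1.0, 2.0], [3.0, 4.0]], factor)
```

Because a single factor scales the whole field, the spatial pattern of the radar estimate is left unchanged; only its overall magnitude is adjusted.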

Kalman Filter (KLM).
In radar-based precipitation measurement, there exist not only systematic errors produced by unstable radar performance but also random errors due to the unstable relation between $Z$ and $R$ and the influence of the wind field. These random errors are called noise. KLM aims to find the optimum estimate of the calibration factor $F$ from noisy data. The basic theory of KLM [26,27] is that if $x_1$ and $x_2$, two independent estimates of the variable $x$, can be obtained by different equations, the relation between $x$, $x_1$, and $x_2$ is

$$x = w\,x_1 + (1 - w)\,x_2,$$

where $w$ is a weighting factor. The criterion for determining $w$ is minimizing the variance of the calibrated $x$. KLM is used to determine the weight, with the input being observed data and the output being the optimum estimate of $F$. The KLM algorithm is a real-time recursive filter that uses the present input data and the previously calculated state, so the accuracy of estimation improves continuously as the number of observations increases.
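A minimal sketch of the two ideas in this subsection follows: the minimum-variance weight for combining two independent estimates, and one recursive update of a scalar calibration factor. It assumes a random-walk model for the factor; the noise variances `q` and `r` and the sample observations are hypothetical and are not taken from the study.

```python
def min_variance_weight(var1, var2):
    """Weight on x1 that minimizes Var[w*x1 + (1 - w)*x2] for independent x1, x2."""
    return var2 / (var1 + var2)


def kalman_update(x_prev, p_prev, z, q, r):
    """One scalar Kalman step for a random-walk state.

    x_prev, p_prev: previous estimate of the factor and its error variance
    z: newly observed gauge/radar factor
    q, r: assumed process and observation noise variances
    """
    p_pred = p_prev + q                 # predicted variance (state carried forward)
    gain = p_pred / (p_pred + r)        # weight given to the new observation
    x_new = x_prev + gain * (z - x_prev)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Recursive estimation from a short series of noisy observed factors.
x, p = 1.0, 1.0
for z in [1.4, 1.6, 1.5, 1.5]:
    x, p = kalman_update(x, p, z, q=0.01, r=0.25)
```

The estimate variance `p` shrinks with each observation, which is the sense in which the accuracy improves as observations accumulate.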

Optimum Interpolation (OPT).
The OPT method uses radar rainfall data derived from echoes as the first-guess background meteorological field and then corrects this using a linear combination of the differences between this first guess and observed data from the rain gauges. The weighting function is determined by minimizing the expected square error of the analysis [28,29]. An OPT is generally expressed as follows:

$$A_a(r_k) = A_g(r_k) + \sum_{i=1}^{n} W_i\,\bigl[A_o(r_i) - A_g(r_i)\bigr],$$

where $A_a(r_k)$ and $A_g(r_k)$ are the analysis and first-guess values at the grid cell $k$ to be interpolated, $A_o(r_i)$ and $A_g(r_i)$ are the observed and first-guess values at observation point $i$, and $W_i$ denotes the weight at observation point $i$. The optimum weights are computed by supposing that the errors are unbiased and uncorrelated, and they are expressed in terms of $\mu$, the first-guess error correlation coefficient between two arbitrary grid points.
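The correction step can be illustrated with a small pure-Python sketch. Several choices here are assumptions rather than details from the study: the weights are obtained from the common textbook system in which the correlation matrix between gauges times the weight vector equals the correlations between the target cell and the gauges (observation error neglected), and the correlation is modelled as an exponential function of distance with a hypothetical length scale.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def corr(p, q, length_km=10.0):
    """Assumed first-guess error correlation: exponential decay with distance."""
    return math.exp(-math.dist(p, q) / length_km)


def oi_analysis(cell, guess_at_cell, gauges, observed, guess_at_gauges):
    """First guess at the cell plus a weighted sum of gauge-minus-guess differences."""
    mu = [[corr(a, b) for b in gauges] for a in gauges]
    rhs = [corr(cell, g) for g in gauges]
    weights = solve(mu, rhs)
    increment = sum(w * (o - f) for w, o, f in zip(weights, observed, guess_at_gauges))
    return guess_at_cell + increment, weights


gauges = [(0.0, 0.0), (8.0, 0.0)]
analysis, weights = oi_analysis((0.0, 0.0), 3.0, gauges, [5.0, 4.0], [3.0, 3.5])
```

When the target cell coincides with a gauge, as in the example, all the weight falls on that gauge and the analysis reproduces its observation.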

Variational Calibration (VAR).
The VAR method [30,31] uses radar-based and gauge-measured precipitation at each observation point to obtain a calibration factor $C_R(k)$ ($k = 1, 2, \ldots, n$) and then obtains a first estimate $\tilde{C}_R(i, j)$ in every grid cell $(i, j)$ by interpolation. Next, an optimum calibration factor field $C_R(i, j)$ that minimizes the sum of squared errors between $C_R(i, j)$ and $\tilde{C}_R(i, j)$ is sought. Finally, the analyzed precipitation $P_a(i, j)$ in grid cell $(i, j)$ is obtained by applying $C_R(i, j)$ to the radar-based value $P_r(i, j)$ in that cell.

Statistical Weight Integration (SWI).
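The SWI method averages $m$ estimates of the precipitation intensity distribution, derived by $m$ methods, with weights that depend on the reciprocal of their error statistics.
The error is $e_{ki} = y_{ki} - o_i$, where $y_{ki}$ is the output value of the $k$th method in grid cell $i$ and $o_i$ is the value measured by the automatic rain gauge in the same grid cell. The error variance of the $k$th method over the $n$ gauged grid cells is

$$\sigma_k^2 = \frac{1}{n}\sum_{i=1}^{n}\left(e_{ki} - \bar{e}_k\right)^2,$$

where $\bar{e}_k$ is the mean error of that method. The reciprocal of the error variance of each method is usually used as its weight in the integration analysis. The result after integrating is

$$\hat{y}_i = \frac{\sum_{k=1}^{m} y_{ki}/\sigma_k^2}{\sum_{k=1}^{m} 1/\sigma_k^2},$$

where $\hat{y}_i$ is the estimated result for the $i$th grid cell after integration.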
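The weighting scheme can be sketched as follows. This is an illustrative reading of the description above, assuming the population variance (division by the number of gauges) and weights normalized to sum to one; the function names and sample values are hypothetical.

```python
def inverse_variance_weights(estimates, observed):
    """Normalized weights proportional to 1 / error variance of each method.

    estimates: one list per method, giving its values at the gauged cells
    observed: gauge values at the same cells
    """
    raw = []
    for est in estimates:
        errors = [y - o for y, o in zip(est, observed)]
        mean_error = sum(errors) / len(errors)
        variance = sum((e - mean_error) ** 2 for e in errors) / len(errors)
        if variance == 0.0:
            raise ValueError("zero error variance; weight is undefined")
        raw.append(1.0 / variance)
    total = sum(raw)
    return [w / total for w in raw]


def integrate(cell_values, weights):
    """Weighted combination of the methods' values at a single grid cell."""
    return sum(w * v for w, v in zip(weights, cell_values))


observed = [2.0, 4.0, 6.0]
method_a = [3.0, 4.0, 5.0]   # errors 1, 0, -1
method_b = [4.0, 4.0, 8.0]   # errors 2, 0, 2
weights = inverse_variance_weights([method_a, method_b], observed)
combined = integrate([5.0, 7.0], weights)
```

Note that a variance taken about the mean error rewards consistency rather than lack of bias, so a method with a constant offset would still receive a large weight under this reading.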

Evaluation Criteria.
The absolute error (AE) and relative error (RE) were used as criteria to evaluate the five assimilation methods. The Nash-Sutcliffe coefficient (NS), peak relative error (PRE), and peak time difference (PTD) were used to evaluate the results of hydrological modelling:

$$\mathrm{NS} = 1 - \frac{\sum_{t=1}^{n}\left(\hat{Q}_t - Q_t\right)^2}{\sum_{t=1}^{n}\left(Q_t - \bar{Q}\right)^2}, \qquad \mathrm{PRE} = \frac{\hat{q} - q}{q} \times 100\%, \qquad \mathrm{PTD} = \hat{t} - t,$$

where $\hat{Q}_t$ is the predicted streamflow at time step $t$, $Q_t$ is the observation at time step $t$, $\bar{Q}$ is the mean of the observations, $n$ is the total number of observations, $\hat{q}$ is the predicted peak flow, $q$ is the measured peak flow, $\hat{t}$ is the predicted peak time, and $t$ is the observed peak time.
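The three hydrograph statistics are straightforward to compute. The sketch below assumes a signed peak relative error expressed in percent and a peak time difference measured in time steps (simulated minus observed); the sample hydrographs are hypothetical.

```python
def nash_sutcliffe(simulated, observed):
    """1 minus the ratio of squared simulation error to observed variance."""
    mean_obs = sum(observed) / len(observed)
    num = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def peak_relative_error(simulated, observed):
    """Signed error of the simulated peak flow, in percent of the observed peak."""
    return (max(simulated) - max(observed)) / max(observed) * 100.0


def peak_time_difference(simulated, observed):
    """Simulated peak time step minus observed peak time step."""
    return simulated.index(max(simulated)) - observed.index(max(observed))


observed = [10.0, 30.0, 80.0, 50.0, 20.0]
simulated = [12.0, 28.0, 70.0, 60.0, 25.0]
ns = nash_sutcliffe(simulated, observed)
pre = peak_relative_error(simulated, observed)
ptd = peak_time_difference(simulated, observed)
```

A positive peak time difference under this sign convention means the simulated peak lags the observed one.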

Precipitation-Runoff Model.
The Xinanjiang model developed by Zhao [32] has been applied extensively and successfully to predict runoff from precipitation in humid and semihumid regions [33]. This model conceptualizes that runoff is not produced until the soil moisture content of the aeration zone reaches field capacity, and thereafter runoff equals the rainfall excess without further loss. It consists of three submodels: a three-layer evapotranspiration submodel, a runoff generation submodel, and a runoff routing submodel. Many studies have described the structure of the model in detail [32][33][34][35]. The inputs to the model are daily areal precipitation and pan evaporation, with observed daily discharge used for calibration. The outputs are the modelled daily discharge at the catchment outlet and the estimated actual evapotranspiration of the catchment [36]. The water balance equation of the model can be expressed as follows:

$$R = \begin{cases} P_e - W_m + W + W_m\left(1 - \dfrac{P_e + A}{W_{mm}}\right)^{1+B}, & \text{if } P_e + A < W_{mm}, \\[2ex] P_e - \left(W_m - W\right), & \text{if } P_e + A \ge W_{mm}, \end{cases}$$

where $R$ is the runoff, $P_e$ is the effective precipitation, which equals the precipitation minus evaporation, $W$ is the initial water storage within the catchment and $A$ is the ordinate value corresponding to $W$, $W_{mm}$ is the maximum storage capacity of the catchment, $W_m$ is the areal mean tension water capacity, which is composed of the capacity of each soil layer and characterizes drought conditions, and $B$ is a parameter that represents the heterogeneity of the water storage capacity of the catchment.
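The saturation-excess runoff calculation can be sketched as below. The sketch uses the commonly published form of the Xinanjiang storage capacity curve; the relations for the maximum point capacity, (1 + B) times the areal mean capacity, and for the curve ordinate matching the current storage are standard in the literature on the model but are not stated in the text above, so they should be read as assumptions. Parameter values are hypothetical.

```python
def xinanjiang_runoff(pe, w, wm, b):
    """Runoff (mm) generated by effective precipitation pe (mm).

    w: current areal mean tension water storage (mm)
    wm: areal mean tension water capacity (mm)
    b: exponent of the storage capacity distribution curve
    """
    if pe <= 0.0:
        return 0.0
    wmm = (1.0 + b) * wm                                   # assumed: maximum point capacity
    a = wmm * (1.0 - (1.0 - w / wm) ** (1.0 / (1.0 + b)))  # assumed: ordinate for storage w
    if pe + a < wmm:
        # Only part of the catchment is saturated.
        return pe - (wm - w) + wm * (1.0 - (pe + a) / wmm) ** (1.0 + b)
    # The whole catchment is saturated: all excess becomes runoff.
    return pe - (wm - w)


runoff_saturated = xinanjiang_runoff(pe=20.0, w=120.0, wm=120.0, b=0.3)
runoff_dry = xinanjiang_runoff(pe=10.0, w=0.0, wm=120.0, b=0.3)
runoff_half = xinanjiang_runoff(pe=10.0, w=60.0, wm=120.0, b=0.3)
```

With a saturated catchment all effective precipitation becomes runoff, while a dry catchment yields only a small fraction from the cells with the lowest storage capacity.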
To reasonably describe the evapotranspiration process, the soil profile is divided into three layers: the upper soil layer, the lower soil layer, and the deepest layer. If the water content in the upper layer is sufficient, evaporation equals the evaporation capacity of the upper layer. If not, all upper water is evaporated and the residual evaporation is sourced from the lower layer. If the water content in the lower layer is insufficient to meet the residual evaporation capacity, the water in the deepest layer will be evaporated.
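The cascade described in this paragraph can be written as a literal sketch. It is a simplification: operational versions of the Xinanjiang model typically scale lower-layer evaporation by the relative storage and use a coefficient for the deepest layer, neither of which is modelled here. The function name and values are hypothetical.

```python
def three_layer_evaporation(ep, wu, wl, wd):
    """Split evaporation demand ep (mm) across upper, lower, and deepest layers.

    wu, wl, wd: water stored in the upper, lower, and deepest layers (mm).
    Each layer supplies what it can; the remainder passes to the next layer.
    """
    eu = min(ep, wu)
    el = min(ep - eu, wl)
    ed = min(ep - eu - el, wd)
    return eu, el, ed


upper_only = three_layer_evaporation(5.0, 10.0, 20.0, 30.0)
upper_and_lower = three_layer_evaporation(5.0, 2.0, 20.0, 30.0)
all_layers = three_layer_evaporation(5.0, 2.0, 1.0, 30.0)
```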
The generated runoff is divided into surface flow, interflow, and groundwater using steady infiltration. The total runoff can be routed by a linear system before arriving at the outlet of the catchment [37]. Flow routing uses the Muskingum or piecewise continuous algorithm. Taking into consideration the uneven distribution of rainfall and the nature of the underlying surface, the catchment is divided into a set of subcatchments using an appropriate method, such as Thiessen polygons. Finally, the total catchment runoff is obtained after the hydrographs at the outlets of each subcatchment are simulated and flood routing is applied [35].
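As an illustration of the routing step, the classical Muskingum scheme for a single reach can be sketched as follows. The storage constant `k`, weighting factor `x`, time step, and inflow series are hypothetical and are not the values used in the study.

```python
def muskingum(inflow, k, x, dt, outflow0=None):
    """Route an inflow hydrograph through one reach with the Muskingum method.

    k: storage time constant, x: weighting factor (0 to 0.5), dt: time step,
    with k and dt in the same time units.
    """
    denom = k - k * x + 0.5 * dt
    c0 = (0.5 * dt - k * x) / denom
    c1 = (0.5 * dt + k * x) / denom
    c2 = (k - k * x - 0.5 * dt) / denom   # c0 + c1 + c2 equals 1
    out = [inflow[0] if outflow0 is None else outflow0]
    for prev_in, cur_in in zip(inflow, inflow[1:]):
        out.append(c0 * cur_in + c1 * prev_in + c2 * out[-1])
    return out


inflow = [10.0, 30.0, 80.0, 50.0, 20.0, 10.0, 10.0, 10.0]
routed = muskingum(inflow, k=2.0, x=0.2, dt=1.0)
steady = muskingum([10.0] * 5, k=2.0, x=0.2, dt=1.0)
```

The routed hydrograph has a lower and later peak than the inflow, which is the attenuation and lag that channel storage produces.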

Uncertainty Methods.
Most research into uncertainty of hydrograph forecasting has focused on model parameters and structure rather than quality of input data. Toth and Kalnay [38] recommended perturbation of the initial condition with perturbation fields that are representative of errors present in the analysis and then using these perturbations as the input to the model in the analysis of input uncertainty. The main methods used to generate perturbations that reflect the initial uncertainty are Monte Carlo, Singular Vector, and Breeding of Growing Modes (BGM). Monte Carlo is a statistical method, whereby the smaller the sample number the less powerful the statistics. However, the sample number is often very limited, and randomly chosen initial values do not perfectly match the dynamical model. Although close to actual weather at the initial time, the predicted results often deviate rapidly from the actual atmospheric condition due to the adjustment of the model itself [39]. The Singular Vector method is better able to deal with many nonquantitative hypotheses in the data assimilation, increase the number of ensemble members, and more easily capture the errors generated by analysis. It can also determine the direction of the fastest-growing disturbance and has better dispersion. Disadvantages of this method are that the components of errors that do not grow are ignored and the calculations are tedious [40,41].
The BGM proposed by Toth and Kalnay [38,42] was designed to model how growing errors are bred and maintained in a conventional analysis cycle through successive use of short-range forecasts, with the bred modes offering an estimate of possible growing error fields in the analysis. The growth rate obtained by this method is better than that of the Monte Carlo method and even higher than the growth rate of lagged average forecasting. The specific perturbation is generated as

$$e = \alpha \cdot \mathrm{Rmse} \cdot \mathrm{Rand},$$

where $e$ is the random perturbation field and $\alpha$ is the perturbation amplitude coefficient. So that the disturbance amplitude is not too large, $\alpha$ is set equal to 1. Rmse is the root mean square error between the measured rainfall and the rainfall estimated from radar data assimilation. Rand is a random function that is uniformly distributed between −1 and 1, where the equation [43] for producing a random number $r$ that is uniformly distributed in $[a, b]$ is

$$r = a + (b - a)\,\xi,$$

where $\xi$ is a uniformly distributed random number that belongs to $[0, 1]$.
Using the measured data of eight storm events from 19 rain gauge stations and radar estimates calibrated by the KLM method, 100 sets of random perturbation fields were produced for each event by the BGM of initial disturbance. Then 200 sets of disturbed rainfall for each event, formed by adding the perturbation fields to and subtracting them from the rainfall data, were used as the input to the Xinanjiang hydrological model. For the eight storm events, this resulted in 1600 predicted hydrographs. The influence of the propagation of rainfall input error on runoff simulation was quantified using the three previously described goodness-of-fit statistics.
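The perturbation and pairing step can be sketched as follows. This shows only the random perturbation and the plus/minus pairing described above, not a full breeding cycle with repeated forecasting and rescaling. Clipping negative rainfall to zero, the fixed seed, and the sample values are assumptions made for the illustration.

```python
import random


def uniform(a, b, rng):
    """Uniform number in [a, b] built from a uniform number in [0, 1)."""
    return a + (b - a) * rng.random()


def perturbation_field(n_cells, rmse, rng, alpha=1.0):
    """Field of alpha * rmse * Rand, with Rand uniform between -1 and 1."""
    return [alpha * rmse * uniform(-1.0, 1.0, rng) for _ in range(n_cells)]


def paired_members(rain, n_fields, rmse, seed=0):
    """Two ensemble members per field: rain + field and rain - field."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_fields):
        field = perturbation_field(len(rain), rmse, rng)
        # Assumption: rainfall cannot be negative, so clip at zero.
        members.append([max(0.0, r + e) for r, e in zip(rain, field)])
        members.append([max(0.0, r - e) for r, e in zip(rain, field)])
    return members


rain = [5.0, 8.0, 12.0]            # hypothetical assimilated rainfall (mm)
members = paired_members(rain, n_fields=100, rmse=2.0)
```

With 100 fields this gives 200 rainfall inputs per event, so eight events yield the 1600 model runs mentioned above.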

Evaluation of Methods of Assimilation of Radar Rainfall Data.
A comparison of uncalibrated and calibrated results of five kinds of assimilation methods for eight storm events (Table 2) revealed that, among the methods, the SWI method produced results closest to the observed data, followed by the VAR method, with a mean absolute error smaller than that of the SWI method. The next best method was the OPT method, followed by the KLM method and, finally, the AVG method. The average of the results across the eight storm events (Table 2) allows a rapid comparison of the relative performances of the five assimilation methods. For AVG, the accuracy of regional total precipitation estimation improved after calibration, and the accuracy would improve further with more rain gauges. One problem with the AVG method is that it does not account for the shifting location of the focus of intense rainfall during storm events. AVG cannot assign greater weight to points independently identified to be of particular interest, and it does not consider the representativeness of the data derived by the automatic rain gauges and radar echoes. These problems explain the relatively low performance of the AVG method.
Variable terrain, synoptic conditions, and precipitation intensity impart a high degree of spatial and temporal variability to radar rainfall data that presents a challenge for data assimilation. The results indicated that the KLM method of data assimilation, with its strong theoretical basis and consideration of the distribution of error in determining the calibration factor, produced a better result than the AVG method. The KLM method can improve accuracy by reducing the relative deviation and degree of dispersion between the rain gauge and radar rainfall data and eliminating the intrusion of nonrainfall echoes from rainfall estimation. This method calibrates the radar-estimated rainfall distribution field in the time domain rather than correcting the spatial structure error of precipitation, which means it is more suitable for the calibration of stable stratiform cloud precipitation and in places with a low density of rain gauges. These may be the key factors that limit the calibration accuracy of the KLM method. The spatial structure and distribution of rain clouds are complicated and changeable, with mixed convective precipitation being a common phenomenon. Under these conditions, the OPT and VAR methods, which have similar results (Table 2), outperform the KLM method. This relative performance difference would also apply in places with a high density of rain gauges. While each precipitation calibration model has its advantages depending on the conditions, SWI combines the advantages of the other four methods and gives an appropriate weighting after the calculation. On this basis the SWI method would be expected to be superior to the other data assimilation methods tested here [44], and this was supported by the results (Table 2).
Scatterplots of hourly gauged rainfall and uncalibrated and calibrated radar-based rainfall (Figure 2) illustrated the good agreement between the rainfall distribution obtained by the SWI method and gauged rainfall. The VAR and OPT assimilation methods also performed well. The KLM and AVG methods produced noticeably poorer results (Figure 2).

Hydrological Model for Flood Hydrograph Prediction.
The Xinanjiang hydrological model was applied in the White Lotus River catchment taking into consideration temporal and spatial variation of rainfall and the nonlinear effects of watershed topography and river channel characteristics on the runoff concentration and dispersed inflow and nonlinear convergence. Model parameters were calibrated by the method of Rosenbrock [45]. The three indexes of Nash-Sutcliffe coefficient, peak relative error, and peak time difference were used to evaluate the flood forecasting results. Goodness-of-fit statistics (Table 3) and simulated hydrographs (Figure 3) indicated that the methods of SWI, OPT, and VAR were superior in producing rainfall input data for storm hydrograph modelling. In particular, the SWI simulated hydrographs were in almost perfect agreement with the observed data. The rainfall data generated by the methods of KLM, AVG, and UC led to unsatisfactory hydrograph prediction.

Uncertainty Analysis of Hydrological Forecasting.
The relative error of predicted results increased after assimilated rainfall data were randomly disturbed compared with the undisturbed peak flow, flood volume, and peak time difference. The peak flow and flood volume were increased by 5% and 8%, respectively (Figures 4 and 5), which means that the model input errors increased when they were processed through the hydrological model. Also, most of the predicted peak times lagged the observed times (Figure 6).
The error of precipitation input data in hydrological modelling is a significant source of error in predicted peak flow, flood volume, and peak time. Therefore, the spread of precipitation uncertainty has a great influence on runoff prediction. In China, practical applications usually adopt qualitative rainfall prediction and assumed rainfall input, and this has implications for the forecast period and accuracy of flood forecasting. At present, neither the accuracy nor the resolution of rainfall forecasts provided by numerical weather prediction (NWP) meets the requirements of quantitative flood forecasting, and this situation may continue into the foreseeable future. Weather radar and satellite remote sensing (SRS) can play an important role in quantitative precipitation forecasting. Use of precipitation forecasts obtained by SRS as the input of hydrological models to simulate and predict runoff warrants further research attention. This could be a key step towards improving the accuracy of flood forecasting. It will be necessary to develop methods for combining and assimilating the SRS data and observed data from rain gauges for producing better precipitation forecasts. SRS has sufficiently high spatial and temporal resolution to effectively characterize the instantaneous distribution of precipitation throughout a catchment, while rain gauges can provide high-precision single-point observations.
The generation of floods is a complex and dynamic process. Modelling this process can be improved through the use of rainfall input data with high spatial and temporal resolution. How to best make use of rainfall information available from multiple sources, including gauge networks, radar, and satellite monitoring, is one of the key problems to be solved in hydrological modelling and forecasting.

Conclusions
Because of the high level of risk posed by floods, especially flash floods, it is critical that warning procedures make use of accurate nowcasting and short-range runoff forecasting capability. Precipitation data is the most important input to hydrological models used for forecasting hydrographs. While rainfall data is usually sourced from a network of gauges, an alternative is weather radar, which offers superior spatial and temporal resolution but at the cost of low precision and accuracy. This paper investigated methods of precipitation data assimilation with the objective of improving the quality of radar-based rainfall estimates as a means of achieving more reliable flood forecasts. Using data from a network of rain gauges in the White Lotus River catchment in Hubei Province, China, and data from the weather radar based in nearby Wuhan, five established methods of assimilating radar data estimated by the classified Z-R relationship were applied. The assimilated data was compared with the observed rainfall data and then used in a hydrological model to predict historical flood hydrographs. Finally, BGM was used for the perturbation of radar data to evaluate the effect of the rainfall data input error on runoff simulation. The main conclusions arising from this work were as follows:
(1) Statistical weight integration, variational calibration, and optimum interpolation were superior methods of data assimilation, while Kalman filter and average calibration methods gave a less satisfactory result.
(2) For modelling storm event hydrographs, rainfall data generated by the statistical weight integration, variational calibration, and optimum interpolation methods resulted in hydrographs that were the closest fit to the gauged data. Rainfall data generated by the average calibration and Kalman filter methods of data assimilation resulted in unsatisfactory hydrograph prediction.
(3) The random errors generated by model input had a tendency to increase during the process of running the hydrological model. Also, most of the predicted flood peak times were lagged.
(4) As hydrometeorological ensemble forecasts become more widely available and data acquisition methods develop, there will be an ongoing need for evaluation of uncertainty in input data.
(5) This paper provides critical descriptive analysis of the main assimilation methods and the effect of the errors on uncertainty of flood forecasting. It demonstrates that weather radar has potential to improve the precision of hydrological forecasting, which has implications for improved flood control and disaster mitigation. The influence of the propagation of rainfall input error on three simulated factors of the hydrological forecast (peak flow, flood volume, and peak timing) is quantified.

Figure 3: Comparison of simulated hydrographs for eight storm events using rainfall input data derived from five assimilation methods and uncalibrated radar data. The measured hydrograph is also shown.

Table 1: Timing of the eight rainstorms chosen for the study.

Table 2: Comparison of calibration results among different methods. UC is uncalibrated, KLM is Kalman filter, AVG is average calibration, OPT is optimum interpolation, VAR is variational calibration, and SWI is statistical weight integration.

Table 3: Comparison of flood hydrograph prediction using different rainfall data assimilation methods. MPF is measured peak flow and FPF is forecast peak flow.