Evaluation of radar-gauge merging methods for quantitative precipitation estimates

Accurate quantitative precipitation estimates are of crucial importance for hydrological studies and applications. When spatial precipitation fields are required, rain gauge measurements are often combined with weather radar observations. In this paper, we evaluate several radar-gauge merging methods with various degrees of complexity: from mean field bias correction to geostatistical merging techniques. The study area is the Walloon region of Belgium, which is mostly located in the Meuse catchment. Observations from a C-band Doppler radar and a dense rain gauge network are used to estimate daily rainfall accumulations over this area. The relative performance of the different merging methods are assessed through a comparison against daily measurements from an independent gauge network. A 4-year verification is performed using several statistical quality parameters. It appears that the geostatistical merging methods perform best with the mean absolute error decreasing by 40% with respect to the original data. A mean field bias correction still achieves a reduction of 25%. A seasonal analysis shows that the benefit of using radar observations is particularly significant during summer. The effect of the network density on the performance of the methods is also investigated. For this purpose, a simple approach to remove gauges from a network is proposed. The analysis reveals that the sensitivity is relatively high for the geostatistical methods but rather small for the simple methods. The geostatistical merging methods give the best results for all tested network densities and their relative benefit increases with the network density. Correspondence to: E. Goudenhoofdt (edouard.goudenhoofdt@oma.be)


Introduction
Interest in quantitative estimation of rainfall based on weather radar has increased during the last years.Indeed new applications have risen in the field of distributed hydrological modelling or numerical weather prediction which require accurate precipitation estimates at high spatial resolution.
Weather radar is a remote sensing instrument that measures the reflectivity of precipitation at a given altitude.Those measurements can be used to estimate precipitation at ground level.Several sources of errors affect the accuracy of this estimation (e.g., Wilson and Brandes, 1979;Joss and Waldvogel, 1990;Germann et al., 2006;Ciach et al., 2007).The measure of reflectivity itself can suffer from electronic miscalibration, contamination by non-meteorological echoes or range effect (attenuation, increase of the sample volume due to beam broadening).When retrieving the rainfall estimation at ground level, additional uncertainties arise.Those are due to the non-uniform vertical profile of reflectivity (VPR) and the conversion of radar reflectivity into rain rates (Z-R relationship).Nevertheless, a weather radar provides precipitation estimation at high spatial and temporal resolution over a large area.A network of rain gauges can provide more accurate point-wise measurements but the spatial representativity is limited.The two observation systems are generally seen as complementary and it is interesting to combine them.
Merging radar and gauge observations has been an intense topic of research since the beginning of the operational use of weather radars in the 70's.A review of gauge adjustment methods and operational use in Europe can be found in a COST 717 report (Gjertsen et al., 2003).More complex methods have been proposed such E. Goudenhoofdt and L. Delobbe: Evaluation of radar-gauge merging methods for QPE as co-kriging (Krajewski, 1987;Sun et al., 2000), statistical objective analysis (Pereira et al., 1998) or Kalman filtering approach (e.g., Todini, 2001;Seo and Breidenbach, 2002;Chumchean et al., 2006).Some of those methods are very time consuming and are not well suited in an operational context.Note that those merging methods must be seen as a final step in the processing of radar data.All kinds of corrections should be applied first to improve the radar-based precipitation estimates like ground echo elimination, VPR correction or attenuation correction (e.g., Germann et al., 2006;Tabary, 2007;Uijlenhoet and Berne, 2008).
The aim of this study is to perform a long-term verification of different existing merging methods.Several methods of various degrees of complexity have been implemented and tested.All selected methods are appropriate for operational use.Verification of the merging methods faces the problem that the real precipitation field is unknown.A traditional approach is to compare precipitation estimates with rain gauges.Cross-validation (i.e.removing a gauge from the adjustment network to use it for verification) is a possible method but the drawback is that the network used for adjustment varies.In this study an independent verification network is used, more suitable to analyse the performance of the methods.Since the time sampling of this network is 24 h, the merging is made on daily accumulations.Several statistics are then computed to evaluate and compare the different methods.Similar long-term verification has been performed in recent studies but limited to one (e.g., Borga et al., 2002;Holleman, 2007) or a few methods (e.g., Cole and Moore, 2008;Heistermann et al., 2008;Salek and Novak, 2008).In Schuurmans et al. (2007), three different geostatistical methods have been compared based on 74 selected rainfall events.
The impact of the gauge network density on the merging methods performance has been little assessed in past studies (Sokol, 2003;Chumchean et al., 2006).One of the contributions of this paper is to determine the best method for a given network density.
The characteristics of the radar and the rain gauge networks can be found in Sect. 2. The different methods used for merging are described in Sect. 3 and the results of a 4year verification against rain gauges are presented in Sect. 4. In Sect.5, a sensitivity analysis to the density of the network used for merging is carried out.

Radar and gauge observations
The Royal Meteorological Institute of Belgium (RMI) operates a single-polarisation C-band weather radar.It is located in Wideumont at 592 m above sea level.The radar observations are routinely used at RMI for operational shortterm precipitation forecast, detection of severe thunderstorms (Delobbe and Holleman, 2006) and a posteriori analysis in the case of severe weather events.The use of the Wideumont radar observations for hydrological studies and applications is also an important field of research and development (e.g., Berne et al., 2005;Delobbe et al., 2006;Leclercq et al., 2008).
The radar performs a 5-elevation scan every 5 min with reflectivity measurements up to 240 km.The beam width is 1 degree.The resolution of the radar polar data is 250 m in range and 1 degree in azimuth.A time-domain Doppler filtering is applied for ground clutter removal.An additional treatment, based on a static clutter map, is applied to the volume reflectivity file to eliminate residual permanent ground clutter caused by some surrounding hills.A pseudo-CAPPI at 1500 m above sea level is extracted from the 5-elevation scan (0.3, 0.9, 1.8, 3.3 and 6 • ) on a Cartesian grid with a resolution of 600×600 m 2 .The height of the pseudo-CAPPI is chosen to limit the effect of ground clutter.Reflectivity factors are then converted into precipitation rates using the Marshall-Palmer relation Z=aR b with a=200 and b=1.6.The 5 min images are integrated in time to produce a 24 h rainfall accumulation starting at 08:00 LT.
The hydrological service of the Walloon region (SPW) operates a dense (1 gauge per 135 km 2 ) and integrated network of 90 telemetric rain gauges.Most of them are tipping bucket systems providing hourly rainfall accumulations.The collected data are used for hydrological modelling and directly sent to RMI.The rain gauges are controlled on site every three months and in a specialised workshop every year.Every day, a quality control of the data is performed by RMI using a comparison with neighbouring stations.Radar data are also used in this quality control for the elimination of outliers.
RMI maintains also a climatological network including 270 stations with daily measurements of precipitation accumulation between 8 and 8 local time (LT).These stations are manual and the data are generally available with a significant delay.The data undergo a drastic quality control.This network is routinely used for the long-term verification of radar precipitation estimates.It will be used here to evaluate the radar-gauge merging methods.
Since the estimation of precipitation can be very inaccurate at large distance from the radar, a maximum range of 120 km is used.The SPW network, used for adjustment, is then reduced to 74 gauges.Several stations of the RMI network are not always available during the 4-year verification period.Those stations are removed to ensure that the same network is used for the whole period.The remaining verification network includes 110 gauges.The positions of the radar and the two rain gauge networks can be seen in Fig. 1.The topography of the area of interest is shown in Fig. 2.

Description of the methods
Various methods combining radar and rain gauge data have been implemented to obtain the best estimation of precipitation.Several methods require the comparison between a rain gauge measurement G and a corresponding radar value R. The spatial sampling issue is of crucial importance when radar areal estimates are compared or combined with gauge point measurements (Villarini et al., 2008).In our study, the average over 9 radar pixels around the gauge location is used as the corresponding radar precipitation estimation.This allows limiting the effect of wind drift which can be very significant (Lack and Fox, 2007).Spatial sampling error increases when the radar estimate is based on a larger number of pixels especially in convective situation.However this effect decreases with increasing accumulation time and is therefore relatively limited at a daily time scale.Besides, the use of a larger radar estimation area allows reducing the temporal sampling error.
Only the pairs for which both R and G exceed 1 mm are considered as valid.A day is valid if there are at least 10 valid pairs.The methods are applied on a square domain containing the Walloon region.It means that some areas fall outside the network convex hull (i.e. the boundary of the minimal convex set containing all the gauges).On those areas, adjusted values must be seen as extrapolation.The uncertainties associated with those values are then higher.

Mean field bias correction (MFB)
The assumption here is that the radar estimates are affected by a uniform multiplicative error.This error can be due for example to a bad electronic calibration or an erroneous coefficient a in the Z-R relationship.The adjustment factor is estimated as a mean field bias: where N is the number of valid radar-gauge pairs, G i and R i are the gauge and radar values associated with gauge i.

Range-dependent adjustment (RDA)
This method assumes that the R/G ratio is a function of the distance from the radar.Range dependences are essentially produced by the increasing height of the measurements, beam broadening and attenuation effects.The range dependent adjustment is mainly based on the BALTEX adjustment method (Michelson et al., 2000).The relation between R/G expressed in log-scale and range is approximated by a second order polynomial whose coefficients are determined using a least squares fit. log where r is the distance from the radar.The range dependent multiplicative factor C RDA is derived from the polynomial fit.

Static local bias correction and range dependent adjustment (SRD)
The static local bias correction aims at correcting for visibility effects.The correction is calculated from a one-year data set using the climatological gauge network.The 24 h radar accumulations are first adjusted by a mean field bias correction.Then, for each gauge location, the averaged residual bias over 1 year is estimated.Finally a spatial interpolation based on kriging is performed to obtain the correction field.
To simulate an operational context, the correction calculated over a given year is used for the next year.The fields obtained for 2004, 2005 (see Fig. 3), 2006 and 2007 are very similar.This correction is applied before a range dependent adjustment (Slb+RDa=SRD).

Brandes spatial adjustment (BRA)
This spatial method was proposed by Brandes (1975).A correction factor is calculated at each rain gauge site.All the factors are then interpolated on the whole radar field.This method follows the Barnes objective analysis scheme based  on a negative exponential to produce the calibration field: where d i is the distance between the grid point and the gauge i.The parameter k controls the degree of smoothing in the Brandes method.It is assumed constant over the whole domain.The parameter k is computed as a function of the mean density δ of the network, given by the number of gauges divided by the total area.A simple inverse relation has been chosen: The factor 2 was adjusted to get an optimal k for the full network.The optimal k was estimated by trial and error based on the verification for the year 2006.The same relation between k and δ is used for the reduced networks (see Sect. 5).

Ordinary kriging (KRI)
A geostatistical method like ordinary kriging deals with the spatial interpolation of a random field from observations at several locations.A general description is presented in Goovaerts (1997).This method requires the definition of a variogram characterising the spatial variability of the precipitation field.The estimation U 0 at a specific location is a linear combination of the gauge values G i : The weights λ i are computed to obtain the best linear unbiased estimator assuming a constant unknown mean across the field.This involves solving a linear equation system whose size is equal to the number of gauges.
In this study, only the 20 nearest gauges are used.This allows reducing the computational cost with little loss of accuracy.The model variogram, assumed isotropic, is a first order linear function of the distance.More complex climatological variograms (i.e.Gaussian, exponential and spherical) have been tested but no significant improvements of the performance were observed.Those results are consistent with the study of Haberlandt (2007).The KRI method, based only on rain gauges, is used to evaluate the added value of radar observations in the other methods.

Kriging with radar-based error correction (KRE)
This method referenced as "conditional merging" in Sinclair and Pegram (2005) uses the radar field to estimate the error associated with the ordinary kriging method based on rain gauges and to correct it.First, radar values at each gauge site are used to produce a radar-based kriging field.This field is then subtracted from the original radar field to obtain an error field.Finally, the error field is added to the gaugebased kriging field.The KRE method is relatively simple and computationally efficient.

Kriging with external drift (KED)
This method is a non-stationary geostatistical method that uses the radar as auxiliary information.A general description is given in Wackernagel (2003).It follows the same scheme as the ordinary kriging except that the mean of the estimated precipitation field is now considered as a linear function of Hydrol.Earth Syst.Sci., 13,[195][196][197][198][199][200][201][202][203]2009 www the radar field.Additional constraints are then added to this scheme: where R i is the radar value at gauge location i, λ i the corresponding weight and R 0 the radar value at the estimation location.The weights are given by solving the augmented system of linear equations.The variogram is also assumed linear and isotropic.This is the most complex and time consuming method.Note that an automatic method to compute a variogram model has been proposed recently by Velasco-Forero et al. (2008).

Methodology
The performance of the radar-gauge merging methods has been evaluated by comparing the adjusted 24-h precipitation accumulations R to the measurements of the climatological gauge network G.The testing period extends from 2005 to 2008, which includes 612 valid days.The gauge data used for the adjustment and for the verification are independent.Unfortunately the two networks have several locations in common or very close.The gauges of the RMI network situated at a distance less than 2 km from a gauge of the SPW network are then removed.The remaining verification network includes 75 gauges.Several quality parameters are found in the literature.The Root Mean Square Error: is the most common parameter used in verification studies.However, the Mean Absolute Error: is less sensitive to large errors and it is used here as first quality parameter.All pairs of gauge radar values are taken into account for these parameters.
A standard for objective judgement of radar performance is proposed in Germann et al. (2006).The mean bias, the error distribution and the scatter as defined in that paper are also used in the present study.The mean bias (MB) is the total precipitation as seen by the radar divided by the total precipitation measured by the gauges.The error distribution is the cumulative contribution to total rainfall as a function of the R-G ratio expressed in dB.The scatter is half the distance between the 16% and 84% percentiles of the error distribution.It is a robust measure of the spread of the multiplicative error, insensitive to outliers.The standard deviation of the multiplicative error (STD) and the root mean square factor: have also been calculated.Only pairs with both adjusted and gauge values larger than 1 mm for all the methods are taken into account.This ensures that the same data set is used for comparison between the different methods.

Results
The verification methodology has been first applied for the four years separately the results between the four datasets.As illustrated in Fig. 4, the relative performance of the different methods is similar for the four years.Nevertheless, the ordinary kriging method (KRI), using only rain gauges, exhibits some variability between the years.The 4 years taken as a whole will now be considered for the rest of the evaluation.
As shown in Fig. 5, the Mean Absolute Error (MAE) of all methods is significantly reduced compared to the MAE of the original radar data (ORI).A simple mean field bias (MFB) correction reduces the error by about 25%.Using the range dependent adjustment (RDA) allows a small additional improvement.A further improvement is obtained when a static local bias (SRD) correction is applied.The performance of the latter method is close to the Brandes one (BRA), which is also a spatial method.The ordinary kriging method (KRI), only based on rain gauge data, shows a result close to the RDA method.This good result is due to the high density of the rain gauge network (see Sect. 5).The two geostatistical methods using both radar and rain gauge observations (KRE, KED) perform best for this quality parameter.When the KED method is used, the error decreases by almost 40% with respect to the original data.
Figure 6 shows the error distribution for the different methods.The vertical line divides the R/G ratios (in dB) in underestimation (left) and overestimation (right).A perfect match should give a step function, with a mean bias and a scatter equal to zero.The original radar data (ORI) reveal a significant underestimation with a mean bias of −1.2 dB.The mean field bias correction (MFB) succeeds in balancing the error distribution.The method combining a range-dependent adjustment and a static local bias correction (SRD) slightly reduces the spread of the error while the most sophisticated geostatistical method (KED) further tightens the error distribution.The ordinary kriging (KRI) shows a small underestimation.The results for the scatter (Fig. 7) are similar to the results for the MAE.It is worth pointing out that the relative performance of Brandes compared to the other methods is slightly better.Actually this method can sometimes lead to large errors that are taken into account for the MAE but not for the scatter.This figure also shows that methods with a daily spatial correction feature based on radar and gauges (BRA, KRE, KED) perform significantly better.
The values of the different statistics for all the methods can be found in Table 1.It appears that the ranking of the methods is very similar for the different scores.

Seasonal variation
The spatial pattern of 24 h rainfall accumulation varies significantly along a year, from widespread precipitation during stratiform events in winter to very local precipitation cells during convective events in summer.Therefore the accuracy of radar precipitation estimates and the spatial representativity of gauge measurements depend on the season.
Figure 8 shows the value of the MAE (normalised by the MAE of the radar) with the data set sorted by month.The ranking of the methods slightly varies along the year.As expected, the estimation from the gauges only (KRI) is relatively inaccurate in the summer.It is outperformed by the mean field bias correction in this period.In the winter, the ordinary kriging (KRI) is better and very close to the kriging with external drift (KED), which is the best method for all months.This analysis points out that the additional information given by the radar is especially valuable during summer, when convective events prevail.

Range-dependence
The performance of the different algorithms as a function of range is analysed up to 120 km from the radar.The study area is divided into 6 range intervals of 20 km as shown in Fig. 9.The gap between the performance of the radar and that of the gauges is significant at short distance (<20 km) due to the bright band effect.This is also the case at long distance (>100 km) due to the decreasing accuracy of radar estimates.The small difference between KRI and KRE or KED at those ranges shows that the radar added value is very limited.The positive effect of the range dependent adjustment when compared to the mean field bias appears at distances further than 80 km.KED is the best method for all distances.Seasonal variation of the range dependence may exists and a preliminary analysis of this effect has been performed.A significant variability between the years has been found and no robust conclusions could be reached.

Effect of the network density
The effect of gauge density on the performance of the different merging methods has been analysed.This is useful to select the most appropriate method for a given network density or to determine the minimum network density needed to achieve a given level of performance.None of the methods takes directly into account the density of the network except the Brandes method where an inverse relation is used to determine the smoothing factor k (see Sect. 3).

Removing gauges
As the region seen by the radar is characterised by low climatological variations, the assumption that the probability of precipitation is the same everywhere is valid.Consequently, a perfect network should be made of a regular grid of points considering a rectangular area.Actually the position of the gauges depends on practical constraints and specific interest on catchment.The spatial distribution of gauges in a real network is then less uniform.It is obvious that gauges cannot be randomly removed from the network.Indeed, when the furthest gauge is removed, the coverage area decreases.Furthermore, a gauge that belongs to the convex hull (see Sect. 3) cannot be removed without decreasing the study area.A rain gauge must be removed from the network in such a way that the spatial distribution of the remaining gauges is as uniform as possible.A simple approach is proposed here, based on the distance between gauges.For each gauge, the sum of the inverse of the distance to the four nearest gauges is computed.Then the gauge with the maximum value (that is too close to its neighbours) is removed.The effect of the algorithm can be seen in Fig. 10 which shows the reduced networks of 50 and 20 gauges.Note that the convex hull is relatively well preserved while the number of gauges decreases.

Global statistics
A long term verification is performed with decreasing network densities.For the sake of consistency, the valid days for adjustment (see Sect. 3) at the lower density are taken as the common verification dataset for all densities.
Figure 11 shows that a mean field bias correction is not very sensitive to the gauge density and the performance remains acceptable even for a low density network.The performance of the range dependent adjustment, involving a second order polynomial fit, slightly increases with density but only for low densities.As expected, the ordinary kriging (KRI) is the most sensitive method to this parameter.Indeed, the error significantly grows when the density decreases.The MAEs of the Brandes (BRA)  methods (KED, KRE) follow the same tendency but with a lower sensitivity.KED is the best method for all network densities.However, for the lowest tested density (1 gauge per 500 km 2 ), the static local bias followed by a range dependent adjustment (SRD) performs as well as the most sophisticated methods (KRE and KED).Similar results have been obtained with the other quality parameters.

Conclusions
Various methods combining rainfall estimations from a Cband weather radar and an automatic rain gauge network have been implemented.A 4-year verification up to 120 km range was carried out against an independent gauge network of daily measurements.Several statistics have been computed to evaluate the performance of the radar-gauge merging methods.The effect of the network density has also been tested.
The results point out that simple methods like mean field bias correction can significantly reduce the error of the radar estimation.Nevertheless, there is a clear benefit of using a spatial correction factor.Based upon our verification study, the best method is the kriging with external drift which makes use of the radar as secondary information to improve the spatial interpolation of gauges values.The kriging with radar-based error correction shows very similar performance while the computational cost is reduced.A seasonal verification shows interesting results.In the winter, when stratiform widespread precipitation prevails, the ordinary kriging based on gauges performs as well as the best radar-gauge merging method.In the summer, when convective events occur, the added value of radar observation is very clear.The sensitivity analysis to the gauge network density shows that the geostatistical merging methods perform best for all tested densities.Furthermore, their relative benefit increases with the density.A method combining a static local bias correction and a range dependent adjustment is less sensitive to the gauge density.
For the lowest tested network density (1 gauge per 500 km 2 ), this method is as efficient as the most sophisticated merging methods.

Fig. 5 .
Fig. 5. Mean absolute error of all methods based on a 4-year verification.

Fig. 6 .
Fig.6.Error distribution based on a 4-year verification.The scatter score for one method is half the distance between the 2 intersections of the curve with the 2 red lines.

Fig. 7 .
Fig. 7. Scatter score for all methods based on a 4-year verification.

Fig. 8 .
Fig. 8. Mean Absolute Error of all methods computed for each month and normalised by the Mean Absolute Error of the original radar data.

Fig. 9 .
Fig. 9. Effect of the distance from the radar on the Mean Absolute Error of all methods normalised by the Mean Absolute Error of the original radar data.

Fig. 10 .
Fig. 10.Gauge network of decreasing densities obtained by an algorithm for removing gauges.

Table 1 .
Several statistics (see Sect. 4.1) for the 4-year verification of radar-gauge merging methods.