A Comparative Analysis of TRMM–Rain Gauge Data Merging Techniques at the Daily Time Scale for Distributed Rainfall–Runoff Modeling Applications

This study compares two nonparametric rainfall data merging methods—the mean bias correction and double-kernel smoothing—with two geostatistical methods—kriging with external drift and Bayesian combination—for optimizing the hydrometeorological performance of a satellite-based precipitation product over a mesoscale tropical Andean watershed in Peru. The analysis is conducted using 11 years of daily time series from the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) research product (also TRMM 3B42) and 173 rain gauges from the national weather station network. The results are assessed using 1) a cross-validation procedure and 2) a catchment water balance analysis and hy- drological modeling. It is found that the double-kernel smoothing method delivered the most consistent im-provementovertheoriginalsatelliteproductinboththecross-validationandhydrologicalevaluation.Themean bias correctionalsoimprovedhydrologicalperformance scores, particularly at the subbasinscale where the rain gauge density is higher. Given the spatial heterogeneity of the climate, the size of the modeled catchment, and the sparsity of data, it is concluded that nonparametric merging methods can perform as well as or better than more complex geostatistical methods, whose assumptions may not hold under the studied conditions. Based on theseresults,a systematicapproachtotheselectionofa satellite–raingaugedatamergingtechniqueisproposed that is based on data characteristics. Finally, the underperformance of an ordinary kriging interpolation of the rain gauge data, compared to TMPA and other merged products, supports the use of satellite-based products over gridded rain gauge products that utilize sparse data for hydrological modeling at large scales.


Introduction
Hydrological studies rely on the quality of rainfall estimates to produce meaningful modeling output. Rain gauges can deliver accurate point measurements, but their poor ability to describe the spatial structure of rainfall can be a major limitation when precipitation fields are required, for example, in distributed hydrological modeling applications. This problem is more severe in tropical regions because of high rainfall variability and scarce data conditions.
Satellite-based estimates such as the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA; also TRMM 3B42) product are becoming increasingly attractive as an alternative source of forcing data in data-sparse regions, although their application can be limited by quantitative inaccuracies (Zulkafli et al. 2014a;Anagnostou et al. 2010;Tian et al. 2007). Information from a large number of rain gauges is already assimilated as part of global/regional satellite algorithms; nevertheless, the rain gauge databases sourced by these procedures can exclude more extensive national networks where data accessibility is restrictive, as often is the case in developing countries. In these regions, the global precipitation product may be found to be unsatisfactory and requiring local adjustment (e.g., Lavado-Casimiro et al. 2009).
Satellite algorithms are known to internally perform gauge correction, for example, using a mean-field bias correction (MBC) and/or inverse-error-weighted averaging methods (Huffman et al. 1997;Grimes et al. 1999). Postanalysis merging methods have also been used on the products of these algorithms, for example, the spatial adjustment of TMPA using interpolations by inverse distance weighting (Lavado-Casimiro et al. 2009), double-kernel smoothing (DS; Li and Shao 2010), and the nearest neighbor method (Vila et al. 2009); correction through regression analysis between the satelliteand rain gauge-based precipitation at various temporal scales, for example, climatologies in Almazroui (2011) and monthly in Yin et al. (2008); and correction using probability distributions (Anagnostou et al. 1999). Geostatistical methods have also been used such as the kriging with external drift (KED) to combine gauge and 10-day (dekad) IR-based precipitation data from Meteosat (Grimes et al. 1999) and the cokriging approach to interpolate daily rain gauge data with the GPCP multisatellite precipitation estimates as covariates (Kottek and Rubel 2008). More recently, Heidinger et al. (2012) performed a wavelet analysis on the signals from daily rain gauge and TMPA time series and reconstructed a combined product by overlaying shortterm noise from the gauge signal on the long-term trends in the TMPA signal.
The methods applied to satellite data have origins in the radar research community, for example, MBC [references in Gjertsen et al. (2004)], spatial adjustment (Moore et al. 1989;Wood et al. 2000), KED (Haberlandt 2007;Erdin 2009;Schiemann et al. 2011;Goudenhoofdt and Delobbe 2009), and cokriging (Krajewski 1987). A particularly promising technique that has not been largely applied to satellite-based rainfall estimation is the Bayesian combination (BC) method proposed by Todini (2001). This method quantifies the estimation uncertainties of precipitation data measured by different sensors (e.g., satellite, radar, and ground rain gauges), then combines these data such that the overall estimation uncertainty formulated in terms of estimation error covariances is minimized. Promising results from this method have been obtained from both numerical examples (Mazzetti and Todini 2004) and case studies in a small-scale densely gauged urban catchment (Wang et al. 2013).
Despite the wealth of merging algorithms at our disposal, there is not a clear guideline as to which method would be optimal given the data characteristics (density, spatial bias, temporal resolution, etc.). In this study, we compare two nonparametric rainfall data merging methods-MBC and DS-with two geostatistical methods-KED and BC (untested for satellite applications)-over a mesoscale tropical Andean catchment in Peru. The analysis is conducted using 11 years of available data from the TMPA, version 7, research product and 173 rain gauges in the national weather station network of Peru. The results are assessed through a cross-validation procedure and supplemented by a hydrological evaluation using the catchment water balance and a hydrological model. The aim is to highlight the strengths and weaknesses of each of the methods and provide some guidelines for a preliminary selection of methods in the context of datasparse hydrological applications.

a. Study area
This study focuses on a mesoscale river basin that spans the Peruvian Andes and the Amazon floodplains (Fig. 1). The Peruvian Amazon (locally Marañón) River basin extends from latitude 18 to 118S and from longitude 808 to 748W, covering about 360 000 km 2 . The altitude of the basin ranges from well above 4000 m MSL in the western boundary in the Andes to the Amazonian floodplains to the east.
The downstream limit of the hydrological basin is defined by the gauging station at San Regis (4.48S, 748W). Upstream, the river is gauged at three other stations: Borja, Santiago, and Chazuta. Farther downstream, the Marañón River merges with the Ucayali River to form the Amazon River just before reaching the city of Iquitos. The basin covers the Pacaya-Samiria National Reserve, which is the largest floodable forest reserve in Peru and is an area of high ecological significance.
A complex climatic pattern is present in the region because of the interaction between various synoptic meteorological processes and the orographic effect of the Andes. In general, rainfall is particularly high in regions close to the equator (more than 2000 mm yr 21 ) because of the passage of the intertropical convergence zone (ITCZ) and decreases toward the tropics (Espinoza Villar et al. 2009). Additionally, humid Atlantic trade winds flow east-west across the vast rain forest and form an intense orographic belt along the 2154 eastern Andes as they encounter the first slopes (Espinoza Villar et al. 2009;Garreaud et al. 2009;Bookhagen and Strecker 2008). They also deviate southward, transporting moist air toward the subtropics. In the highlands, the Andean landscape causes shielding and moisture intensification resulting in various microclimates (Buytaert et al. 2006a;Rollenbeck and Bendix 2011).

b. Precipitation data
The satellite-based product is obtained from the TMPA product, in its latest available version 7, released in November 2012. The TMPA product combines observations from multiple satellites, including TRMM and other geostationary satellites (Huffman and Bolvin 2013). The performance of this dataset in this study area has been previously assessed by Zulkafli et al. (2014a). Precipitation estimates from 1998 to 2008 were obtained with a 3-h temporal resolution and 0.258 3 0.258 (approximately 27.8 km 3 27.8 km) spatial resolution. Values were subsequently aggregated to the daily scale in order to match the temporal resolution of the rain gauge records.
The rain gauge dataset is the historical rainfall time series obtained from the Peruvian National Meteorological and Hydrological Service [Servicio Nacional de Meteorología e Hidrología del Perú (SENAMHI)]. This includes daily rainfall amounts in millimeters for 11 years between January 1998 and December 2008 from 173 recording stations located within the satellite domain. Their locations are shown in Fig. 1. The area covered by the rain gauges is about 1 100 000 km 2 , which translates to an average network density of less than one rain gauge per 5000 km 2 . The gauges are particularly clustered around towns and along rivers, providing higher densities in regions of easier access. The highlands are also better represented compared to the lowlands, as the average rain gauge altitude of 1560 m MSL indicates (Table 1).

1) MEAN BIAS CORRECTION
This method corrects the satellite-based product by the total multiplicative bias between the rain gauge estimates and the collocated satellite values, thus FIG. 1. The Marañón basin and three subbasins. The map includes the regional topography, the main river network, the discharge stations, and the rain gauges.
OCTOBER 2015 N E R I N I E T A L . assuming a uniform bias over the spatial domain. For each daily event (i.e., each time step), a correction factor B is thus calculated using the following equation: where N is the number of available gauges inside the satellite domain, and Z G (x j ) and Z S (x j ) are the gauge and satellite daily rainfall values corresponding to gauged location j.

2) DOUBLE-KERNEL SMOOTHING
The nonparametric double-kernel smoothing is a technique developed by Li and Shao (2010) specifically for data-sparse applications. The main idea of this method is the interpolation of the point residuals « S using the kernel density function and the adjustment of the satellite field by the estimated residual field. The point residual at the given gauged location j 5 1, . . . , N is defined to be A first-level interpolation is performed to generate a full set of residuals (pseudoresiduals) « SS on the same grid as the satellite data. At the given gridpoint location i 5 1, . . . , M, the pseudoresidual is defined to be: where kÁk is the Euclidean norm and L is the Kernel function defined as a Gaussian kernel following Li and Shao (2010): The variable H is the position of the points, and the bandwidth b is determined using Silverman's rule of thumb (Silverman 1998 where n is the number of samples and s is the standard deviation of samples. A second-level interpolation is applied on both the residuals and pseudoresiduals to estimate the final error field « DS : and the merged product Z DS at point k is calculated by subtracting to the satellite estimate Z S k the corresponding error « DS k : The kernel smoothing (interpolation) of the residuals does not rely on the stationary assumption, as is the case for geostatistical methods. The formulation is such that the product of the merging will converge toward the rain gauge estimates with decreased distance toward the ground observations.

3) ORDINARY KRIGING
Among kriging estimation methods, the ordinary kriging provides the best unbiased linear estimates of point values or block averages (where ''best'' means minimum variance). The basic assumption of the OK is that data values can be characterized using a stationary random variable with an unknown mean (Müller 2007). Here, the OK is used to produce the rainfall estimates at each satellite grid location through the interpolation of discrete (point or grid averaged) rain gauge measurements. These spatially interpolated rain gauge estimates, along with the original satellite product, then serve as the baseline for comparison with the merged products.
Ordinary kriging uses the (semi) variogram to characterize the spatial association of data values at different locations. The definition of the (isotropic) semivariogram is given as follows: where g is the semivariance of the random variable Z between two points with distance h. Note that g does not depend on the position x, but it is only a function of distance h, thus assuming stationarity of the variance of the differences separated by the same distance. In this case, we also assume the isotropy of the model, since g is not function of the direction. The variogram can be empirically estimated from available observations by computing the semivariance g for several classes of distances between gauges. This is known as experimental or empirical variogram. Once the experimental variogram is estimated, it is possible to fit a theoretical variogram model to it. In the literature there are a variety of different models: see de Marsily (1986) for a complete list. In this study, the exponential model was found to produce the best fit: where a denotes the range, the distance after which the variogram reaches its asymptotic value, the sill v, so that the correlation between farther apart gauges is approximately equal to zero. A nugget effect can be added to Eq. (9) to represent any residual variance not explained by distance at an infinitesimally small separation, due to measurement and epistemic errors.
The unknown precipitation value Z* at location x 0 is estimated as a linear combination of the precipitation values Z G (x j ) at known points x j , weighted by the semivariogram function l(h): The only additional constraint is that the weights have to add up to unity: This condition is necessary to guarantee an unbiased estimator (de Marsily 1986). Equation (10), subjected to Eq. (11), is solved by minimizing the variance of the estimation errors var(Z G * 2 Z G ), which are assumed to have a Gaussian distribution. The resulting OK equation is thus the following: where m is the Lagrange multiplier used to fulfill the unbiased condition [Eq. (11)]. Equation (12) can be rewritten in the matrix form A 3 x 5 b, so that the unknown x represents the set of weights in Eq. (10), A is the semivariogram matrix with a term for each pairs of gauges, and b is the semivariogram vector between all gauges and the prediction point x 0 : where g ij denotes g(x i , x j ), the semivariogram value between two known points i and j.
The way the equation is set up is such that the solution may include negative weights, which can distort the final estimated values. Hence, an a posteriori correction to the weights is performed according to the algorithm proposed by Deutsch (1996).

4) KRIGING WITH EXTERNAL DRIFT
KED is an extension of the OK whereby nonstationarity is assumed and represented by the drift term, which in our case is the satellite-based estimates.
The method implemented is as described in Hewitt (2012). KED requires the satellite value at the prediction location x 0 to be equal to the weighted average of the satellite values Z S at gauged locations x j : This additional condition is included in the error minimization equation [Eq. (12)] to produce the following: In matrix format, this equation is where Z S,i denotes Z S (x i ), the satellite-based estimation at location i. For the interpolation, Z S (x i ) and Z S (x j ) are used solely for their covariance for weighting values of Z G (x i ); the actual values of Z S (x i ) do not feed into the final estimates.

5) BAYESIAN COMBINATION
The theory underpinning the Bayesian combination method is described in Todini (2001) and PROGEA Srl (2009) and summarized here. This method first uses the block kriging method to generate interpolated rain gauge rainfall estimates at each satellite grid location and to estimate the associated estimation error covariance (i.e., estimation uncertainty).
After kriging interpolation, the Kalman filter (Kalman 1960) is then employed to merge these estimates with the coincidental satellite rainfall product by comparing the uncertainty of these two data sources. Here the merged product becomes the a posteriori estimate y 00 , which is defined as where m « is the mean of the satellite-rain gauge bias [i.e., E(Z G 2 Z S )] and K is the Kalman gain. The Kalman gain represents the relative measure of uncertainty and is defined as follows: The uncertainty associated with the satellite estimates V « S is represented by the covariance matrix of the satellite error field (i.e., « 5 Z G 2 Z S ), while the uncertainty associated with the rain gauge rainfall estimates V « G is the error variance produced in the kriging interpolation. The variable K increases (decreases) as V « S is larger (smaller) compared to V « G , and consequently, the filter will weigh more in favor of the rain gauge (satellite) estimates in the combination. The Bayesian combination method was initially implemented using the commercial software RAIN-MUSIC (PROGEA Srl 2009). As the software is not open source, the default settings of the method cannot be changed, for example, the theoretical variogram model for the kriging interpolation has to be Gaussian. A MATLAB script was thus implemented to execute the Bayesian combination with several changes: 1) Instead of the Gaussian variogram model, the exponential variogram model was found to provide a better fit to the experimental variogram for the kriging interpolation. 2) Instead of using the block kriging method, the OK method was used to interpolate grid-averaged rain gauge data. The main difference between the ordinary and block kriging methods lies in the way variograms are calculated from data. The former calculates the variance between two points, while the latter calculates the variance between point and block (which is the average of the point-to-point variances).
To distinguish the two different implementations, the RAINMUSIC Bayesian combination will be subscripted as BC R , whereas the replicate (MATLAB) model is annotated as BC.

d. Meteorological evaluation
Since we do not know the true rainfall value at a given grid point, we employed collocated rain gauge observations as a first approximation in order to assess the accuracy of the various merged precipitation products. The reader should nonetheless be aware of the scale mismatch that exists between rain gauge point measurements and 0.258 satellite cells (approximately 773 km 2 in TMPA). However, in our context of a comparative evaluation, the pixel-to-point evaluation is implemented in the same fashion across all four merging methods and therefore is not expected to affect the relative performance.
The collocation of a rain gauge to a satellite grid cell was performed using nearest neighbor approximation, such that rain gauges are associated to the centers of the nearest grid cells. If multiple gauges are present over the same cell, their estimates are averaged to produce a presumably more representative value.
The satellite-gauge data merging was tested in a 10fold cross-validation scheme. The cross validation was performed at the daily time step for 141 days of rain, during which more than 75% of the rain gauges recorded a rain event.
The goodness-of-fit between the merged products and the corresponding pixel-average rain gauge values was evaluated in terms of the mean absolute error MAE, mean error ME, root-mean-square error RMSE, Nash-Sutcliffe efficiency NSE, and percent bias (relative bias in percent).
An analysis of error components such as false alarms and missed events is not included. This would require a detailed discussion of each of the merging techniques on those performance indices, which is beyond the scope of this study. From a hydrological modeling perspective, this should not represent an issue given the large scale of the studied basins. Both false alarms and missed events are typically dominated by low-intensity or very localized events (covering part of the pixel but missed by the rain gauge or vice versa) which have only a limited contribution to streamflow. If they indeed have a hydrological impact (e.g., on the water balance and flow magnitude), then this should be captured in the performance metrics included in the analysis. However, for smaller-scale applications, such error components might have an impact. The reader is referred to Maggioni et al. (2013) for a related analysis.

e. Hydrological evaluation
A first-level hydrological assessment of the data quality was conducted using the catchment water balance with the runoff ratio RR defined as the ratio of the precipitation that contributes to runoff: The RR values calculated using the different outputs from the merging were compared to known values from the literature (Campling et al. 2002;Buytaert et al. 2006b;Rollenbeck and Anhuf 2007). The merged precipitation products were subsequently used to drive a hydrological model. The Joint UK Land Environment Simulator (JULES; Best et al. 2011), which is a physics-based community model derived from the Met Office land surface scheme, was parameterized over 2040 grid boxes of 0.1258 3 0.1258 (approximately 14 km 3 14 km) in the catchment using best-available land cover and soil data to generate runoff. The runoff was then routed offline along the river network using a delay function to simulate streamflows at various points in the river basin. The river network was developed using hydrographic data (90-m resolution) from the Hydrological Data and Maps Based on Shuttle Elevation Derivatives at Multiple Scales (HydroSHEDS; Lehner et al. 2008). Model optimization was only used to obtain the routing parameters (flood wave celerities); any inconsistent model compensation of the precipitation errors across the merging products is not expected. For further description of the model, the reader is referred to Zulkafli et al. (2013).
The impact of merged precipitation forcing data on streamflow simulations is assessed by comparison with observed daily discharges. The evaluation is done at four streamflow gauging stations, identified in Fig. 1, over an 11-yr period (1998-2008). Streamflow data were obtained through Geodynamical, Hydrological, and Biogeochemical Control of Erosion/Alteration and Material Transport in the Amazon Basin (HYBAM) from SENAMHI and the Instituto Nacional de Meteorología e Hidrología, Ecuador (INAMHI), monitoring networks. The goodnessof-fit was evaluated in terms of the percent bias (relative bias in percent), correlation, RMSE, and NSE at the daily scale.

f. Software
The merging algorithms in this study were originally written in MATLAB. Subsequently, these have been rewritten in R and made publicly available on GitHub (Zulkafli et al. 2014b).

a. Spatial variability of precipitation
A spatial analysis of the mean annual precipitation across the merged products is presented in Fig. 2. The figure shows MBC and DS retaining a high degree of information from TMPA, whereas KED and BC R resemble more closely the OK field. These patterns from MBC, DS, and KED are reasonable, given that the MBC and DS methods consider the entire satellite rainfall field information in the adjustment process, while KED only uses the satellite information at gauging locations as an external drift term. However, the pattern produced by BC R is inconsistent with expectations and thus calls for further analysis.
The BC method, in theory, is expected to provide an estimate in between the TMPA and gauge estimates depending on the degree of reliability in either data. However, the resulting BC R field suggests that a significantly high weight was given to the block kriged estimates (i.e., measurements), which could be attributed to the (estimation) error covariance of the block kriged rain gauge estimates being much smaller than that of the TMPA rainfall estimates. Through the comparison of the experimental variograms and the associated theoretical variogram models fitted by the RAINMUSIC OCTOBER 2015 N E R I N I E T A L .
software (not shown here), it was found that the theoretical variograms consistently underestimated the sill and/or overestimated the range, and as such, underestimated the error covariance of block kriging estimates. This relation between theoretical variogram fitting and (estimation) error covariance can be demonstrated through a sensitivity analysis (see Fig. 3). It can be observed that the faulty estimation of variogram parameters could result in extensive areas of low error variance (Fig. 3b), and consequently, the preferential use of gauge estimates versus satellite estimates in ungauged areas of the field (Fig. 3c). The reason for this poor fit by RAINMUSIC could be due to the use of point-to-block variogram estimation, which may oversmooth the spatial variability of rain gauge measurements over the satellite's coarse grids. Caution is therefore required for satellite applications of this method, as the spatial resolution of existing satellite data is still limited. The replicate BC algorithm, which used an OK instead of block kriging for the rain gauge interpolation, yields substantially better results, as are evidenced by a better retention of the a priori spatial fields by the merged product (Fig. 2g). Based on this, the BC R estimates are excluded from further analysis. Meanwhile, MBC and DS produced slightly varying results, and this can be better explained using a map of the residuals between TMPA and the merged products (Fig. 4). The OK residual field (Fig. 4a) highlights that TMPA estimates are closer to the OK values in the mountains than they are in the lowlands. MBC (Fig. 4b) averages the bias over the entire spatial domain, effectively reducing the intensity of the gauge correction to the north and the east, as is shown by the distinctly smooth residual field. On the other hand, with DS, the kernel effect can be observed (Fig. 4c). Strong positive residuals are observed in the northwest and likewise negative residuals in the northeast that are an extension of the residuals calculated from the nearest group of gauges.

b. Cross validation against reference rain gauge estimates
The summary of performance scores (Table 2) indicated improvements in the merged product when compared to TMPA and OK data. Values of MAE, RMSE, ME, and NSE were found to improve in all of the methods, and the KED provided the best scores, followed by DS and BC.
The spatial distribution of these scores across the study area reveals some insights into the relative performances between the merging methods. Figure 5 shows the percent bias, which is the ME normalized by the station average daily rainfall. MBC retains much of TMPA's negative bias whereas DS improves along the Andes. This suggests that the latter has a better capability for local correction, which is highly important for such a complex environment as the Andes. To a lesser extent, both methods also achieve correction in the lowlands, where the original TMPA is expected to already perform well. Given OK's tendency to converge to rain gauge measurements and that the meteorological evaluation takes rain gauge values as reference, OK performs well in general but in particular in the lowlands, and KED follows the same spatial pattern. This is also consistent with the expectation that spatial correlation is higher in the lowlands than it is in the mountains. Moreover, OK and KED perform better in highly sampled regions, which is discernible in Fig. 6 in terms of higher modeling efficiency values at the centers of rain gauge clusters. Finally, this figure further suggests that the best merging products overall are achieved FIG. 4. Annual mean residual between merged products (including OK) and TMPA (1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008). The black line delimits the San Regis basin and the silver lines are country borders. The black dots indicate the locations of gauges used in the data merging. The Andean range follows the western boundary of the basin.
using OK, KED, and DS. This is in spite of OK and KED's poor representation of the spatial patterns of the annual mean precipitation over the entire merging field. This highlights another limitation of the meteorological cross-validation framework when applied to large areas with an uneven rain gauge density distribution.
c. Hydrological evaluation: Impact on the water balance and hydrological modeling at basin and subbasin scales Table 3 presents a hydrological evaluation of the merging algorithms and shows several merging algorithms having a positive impact on the water balance. The most prominent improvements are seen with the DS, MBC, and (to a lesser degree) the KED methods in the upper Andean basins Borja and Santiago, where the runoff ratios are closer to the reference runoff ratio value of 0.7. In San Regis, the DS and MBC are similarly successful in reducing the water balance errors whereas KED increases them. In Chazuta, improvements are gained with the BC and DS methods, although the highest improvement in this subbasin is achieved with MBC S , which is an additional analysis that was run by excluding the rain gauges located outside the subbasin. In contrast, OK exacerbates the errors in all basins compared to TMPA.
The hydrological modeling results are summarized in terms of the performance scores across the basins. Figure 7 shows RMSE and relative bias reductions as well as NSE increases (compared to TMPA) by MBC and DS, and additionally by BC in Chazuta subbasin. In contrast, OK, KED, and BC (except in Chazuta) show generally poorer scores for all four indices-percent bias, RMSE, correlation, and NSE-compared to TMPA. In Borja and Santiago subbasins, the DS method also resulted in above zero model efficiency. Over the entire modeled time series, the NSE improvements range from 0.1 to 0.5, with the higher increases achieved at Santiago and Borja.

Discussion and conclusions
The results of this case study suggest that the relative performances of the rainfall merging methods are strongly influenced by the rain gauge density over the domain of analysis. Overall, the DS method followed by the MBC method delivered the most consistent improvement over the satellite product in both crossvalidation and hydrological verification performance scores. Although the geostatistical methods, that is, KED and BC, did not result in a good hydrological performance, they nonetheless performed well in the cross validation and showed the potential to produce valuable merged rainfall estimates if employed in sufficiently gauged areas. Finally, the underperformance of methods such as OK and KED in the hydrological evaluation, which either fully or heavily rely on rain gauge precipitation, supports the use of satellite-based products over gridded rain gauge products that utilize sparse data for hydrological modeling at large scales.
Guided by these observations, we propose in Fig. 8 a guideline for the selection of a merging method given FIG. 6. The NSE between merging product and rain gauge time series at all cross-validation points. data characteristics and constraint and discuss this in the following.
The rain gauge network density in the studied basin should be considered when choosing the optimal merging method. In relatively well gauged basins (less than 250 km 2 per gauge), the interpolation of gauge data only is believed to be sufficiently reliable for water resources applications (World Meteorological Organization 1994;Chappell et al. 2013), and simpler kriging methods such OK are regarded as accurate interpolation techniques for average daily rainfall (Collischonn et al. 2008;Buytaert et al. 2006a). However, large-scale applications might violate the assumption of stationarity (a constant mean) in OK because of the nature of the spatial variability of precipitation. KED provides a solution to this by introducing the external drift as observed by the satellite measurements.
In moderately gauged basins (between 250 and 1500 km 2 per gauge), merging methods based on geostatistics such as BC or KED provide valuable rainfall estimates, as demonstrated in the Chazuta subbasin. However, geostatistical methods may still be affected by a number of practical limitations and fundamental assumptions. For example, the Gaussian distribution assumption made during the variance minimization in the weights estimation does not particularly hold for applications at fine temporal scale, as daily precipitation data are typically positively skewed, unlike monthly accumulations, which are more normally distributed. Ways around this problem are widely discussed in the literature, for example, through data transformation methods (e.g., Müller 2007).
In poorly gauged basins (more than 1500 km 2 per gauge), geostatistical merging methods have delivered unsatisfactory results because of a strong reliance on rain gauge observations over valuable satellite information in ungauged areas. A possible cause could be the assumption of a known semivariogram common to all kriging estimators. Since the true semivariogram is unknown in reality, the experimental semivariogram is used instead in order to estimate the semivariogram parameters from the data. As far as rainfall fields are concerned, nonzero precipitation events tend to have a spatial footprint that negatively correlates to the intensity of the event (e.g., small-scale convective vs large-scale stratiform precipitation). This should be captured in the experimental semivariogram, but in poorly gauged basins, distances between rain gauges are often too large and spatial correlation can be overestimated. Instead, nonparametric merging methods DS and MBC have proved to be more robust to data scarcity. The success of the DS method reflects the spatial heterogeneity of the climate and the size of the model field and supports the idea that distant gauges should have very little bearing on and should be given minimal weight in the estimation of unknown points simply based on their distance alone, particularly in very complex landscapes such as the Andean Amazon. A local correction is best achieved by DS that substantially weights surrounding observed residuals by distance. MBC is attractive for its simplicity, but the assumption of a uniform multiplicative bias is a very narrow assumption given the complexity of the climate and the size of the domain. This is reflected by the improvements observed by applying the method to a smaller scale (with MBC S in Chazuta subbasin), which in essence is a move toward a more local bias correction. Extended methods such as range-dependent mean bias correction have also been implemented in radar applications (e.g., Amitai et al. 2002) that can be adopted. The range threshold may further be made adaptable to the dominant precipitation processes (large-scale versus convective) in the area of the rain gauges.
This decision flowchart is intended to be a preliminary, nondefinitive guide to the selection of an optimal method given the data characteristics. As it was developed based on experiences from a region characterized with highly heterogeneous topography and precipitation processes, we expect this flowchart to hold for many other regions, at least for humid tropical and (sub) tropical mountain regions. Alternatively, an ensemble modeling approach can be considered that utilizes all of the merging inputs, whereby the model parameters can be weighted using Bayesian model averaging based on each model's reproduction of observed streamflow (Jiang et al. 2012;Strauch et al. 2012).