A Comparative Performance Analysis of TRMM 3B42 (TMPA) Versions 6 and 7 for Hydrological Applications over Andean–Amazon River Basins

The Tropical Rainfall Measuring Mission 3B42 precipitation estimates are widely used in tropical regions for hydrometeorological research. Recently, version 7 of the product was released. Major revisions to the algorithm involve the radar reflectivity–rainfall rate relationship, surface clutter detection over high terrain, a new reference database for the passive microwave algorithm, and a higher-quality gauge analysis product for monthly bias correction. To assess the impacts of the improved algorithm, the authors compare the version 7 and the older version 6 products with data from 263 rain gauges in and around the northern Peruvian Andes. The region covers humid tropical rain forest, tropical mountains, and arid-to-humid coastal plains. The authors find that the version 7 product has a significantly lower bias and an improved representation of the rainfall distribution. They further evaluated the performance of the version 6 and 7 products as forcing data for hydrological modeling by comparing the simulated and observed daily streamflow in nine nested Amazon River basins. The authors find that the improvement in the precipitation estimation algorithm translates to an increase in the model Nash–Sutcliffe efficiency and a reduction in the relative bias between the observed and simulated flows by 30%–95%.


Introduction
The Tropical Rainfall Measuring Mission (TRMM) produces global estimates of precipitation based on remote observations. The product of the 3B42 algorithm [hereafter referred to as the TRMM Multisatellite Precipitation Analysis (TMPA)], which is high in spatial (0.25°) and temporal (3 h) resolution, is a widely used forcing dataset for hydrometeorological applications such as hydrological modeling, especially in data-sparse regions (e.g., Awadallah and Awadallah 2013; Li et al. 2012; Khan et al. 2011; Wagner et al. 2009; Asante et al. 2008; Su et al. 2008).
There is consensus among studies using TMPA in and near tropical mountain regions (e.g., Ward et al. 2011; Scheel et al. 2011; Condom et al. 2011; Dinku et al. 2010; Nair et al. 2009; Bookhagen and Strecker 2008) about the limitation of the data, in particular, the poor quantification of high-precipitation events, which are the prevalent form occurring in regions highly influenced by the intertropical convergence zone (ITCZ). As TMPA combines remote observations such as TRMM precipitation radar (TPR), passive microwave (PMW), and infrared (IR) from multiple low-Earth-orbiting and geostationary satellites and ground observations (Huffman et al. 2007), various explanations for the estimation uncertainty are possible.
For example, the TMPA algorithm relies heavily on cloud-top (IR) temperatures from TRMM's onboard instruments, as well as from other participating geostationary satellites in between TRMM satellite overpasses, as proxy measurements of rain ["colder clouds precipitate more" (Huffman et al. 2010, p. 10)]. It has been argued that in tropical mountain regions, the temperatures of orographic clouds well exceed the rain/no-rain threshold imposed in the algorithm, which can cause an underestimation of precipitation (Dinku et al. 2010). Indeed, estimates solely based on IR measurements, such as Precipitation Estimation from Remote Sensing Information Using Artificial Neural Networks (PERSIANN; Hsu et al. 1997), have been found to underperform other satellite precipitation products in mountainous environments (Thiemig et al. 2012; Ward et al. 2011). Estimation using PMW observations has a stronger physical basis but remains problematic with warm rain clouds deficient in ice particles (Huffman et al. 2010; Dinku et al. 2010). The PMW sensor may also be insensitive at the scale of measurement, leaving very localized heavy rainfall cells undetected (Thiemig et al. 2012). Additionally, TMPA's poor estimation of extremes has been attributed to the optimization of the TPR's reflectivity-rainfall rate (Z-R) relationship over moderate precipitation rates, given their higher occurrence (Thiemig et al. 2012). Notwithstanding these limitations, it has also been shown with the TRMM 2A25 product (TPR-based estimates that feed into the 3B42 algorithm) that clear precipitation gradients can be observed over larger temporal scales over the Andes (Nesbitt and Anders 2009).
The TMPA version 6 algorithm is described in Huffman et al. (2007), while changes in the version 7 algorithm at various processing levels are described in Huffman et al. (2010) and Huffman and Bolvin (2013) and are summarized here. They include the new Goddard profiling algorithm (GPROF) 2010 for PMW-based estimation that references TRMM's available records of storm profiles, PMW brightness temperatures, and precipitation rates, replacing a reference database constructed using a cloud model in version 6. Additionally, TMPA version 7 incorporates more observation datasets at different detection ranges than does version 6, notably, the 10-km resolution IR data to replace the Global Precipitation Climatology Centre (GPCC) histograms used in the early part of the time series (1997-2000) and the full time series of Microwave Humidity Sounder (MHS) and Special Sensor Microwave Imager/Sounder (SSM/IS) observations. A single-calibration reprocessed Advanced Microwave Sounding Unit-B (AMSU-B) dataset from the National Oceanic and Atmospheric Administration (NOAA) satellite also replaces the prior version, for which two different calibration periods were used, thus removing some of the internal inconsistency present in TMPA version 6 (Huffman et al. 2007). Furthermore, the algorithm implements a final step gauge bias correction at the monthly scale, and while the GPCC monitoring product (version 2.0) and the NOAA Climate Prediction Center's Climate Anomaly Monitoring System (CAMS) data product were previously used in the TMPA version 6 algorithm, the version 7 algorithm uses a new full data reanalysis (version 6.0) from GPCC that 1) interpolates anomalies instead of amounts and 2) incorporates a denser rain gauge network.
Over mountain regions, global and region-specific improvements were implemented in the TPR estimation, as detailed in a technical document (TRMM Precipitation Radar Team 2011) and summarized here. In version 6, the algorithm was found to mistake the high level of surface clutter over the mountains for rain echo. It also mislocates surface echoes because of 1) inaccurate elevation data and 2) concealment by strong signals from heavy rainfall. The version 7 algorithm renews its elevation map for the Andes and Himalayas using data from the Shuttle Radar Topography Mission with 30-arc-s spacing (SRTM30) and introduces a repeat search algorithm for the surface echo that should improve its detection and thus the determination of clutter-free rain regions in the storm profile. This is expected to improve the quantification of light rain. Global changes such as the Z-R relationship based on a nonspherical rain drop distribution, an increase of 0.5 dB to stratiform precipitation to compensate for heavy rain attenuation, and allowance for small convective storm cells favor higher estimations of heavy rainfall rates.
Few studies have looked into the performance of the TMPA version 7 precipitation product. Kirstetter et al. (2013), using data from TRMM 2A25 (TPR analysis), show that in the contiguous United States, bias against ground observations is reduced and correlation is improved. The same product provides an increase in total and convective rainfall over Asia south of 15°S (Shiratsu et al. 2011). In a benchmarking exercise against radar observations in Japan, Nakagawa et al. (2011) saw no change in correlation but saw improved bias. Meanwhile, Hobouchian et al. (2012) found increases in the probability of detection and equitable threat score as well as high extreme bias reduction from version 6 to version 7 of TMPA in South American regions south of 20°S. These findings are encouraging for tropical mountain regions, where there is a growing body of modeling work using TMPA, but often with some level of postprocessing required to improve the water balance (e.g., Lavado-Casimiro et al. 2009; Arias-Hidalgo et al. 2013; Zulkafli et al. 2013). TMPA version 7 data will be increasingly used in modeling studies (e.g., Espinoza et al. 2013), necessitating a full exploration of the implications of the TMPA algorithm revisions on reducing data uncertainty. Therefore, the objective of this paper is to analyze if, how, and where TMPA version 7 is superior to version 6 in the Peruvian Andes region from a hydrological perspective. As the region covers some of the major climates and gradients found in the tropics, the findings will have a high potential for extrapolation to many other tropical regions relying on remote estimates of rainfall.

a. Study area
The study domain is located in north Peru and southeast Ecuador between 11°S and 1°N and between 80° and 70°W (Figs. 1a,b). The area covers humid tropical rain forest, tropical mountains, and arid-to-humid coastal plains.
The region's climate has been discussed by various authors (Espinoza Villar et al. 2009; Garreaud et al. 2009; Casimiro et al. 2012; Buytaert et al. 2006; Kvist and Nebel 2001). The climate and seasonality (see Fig. 1c) are controlled by large-scale meteorological phenomena such as the ITCZ and the South American monsoon system (SAMS; Marengo et al. 2012) that cause predominantly wet austral summers [December-February (DJF)]. In the austral winter [June-August (JJA)], the ITCZ band remains north of 5°N but continues to cause some deep convection and rain in the northern parts of the Amazon basin (Espinoza Villar et al. 2009). Additionally, the Amazon regions experience large-scale stratiform precipitation throughout much of the year from exposure to the humid tropical Atlantic easterly winds.
On the Pacific coast south of the Ecuador-Peruvian border, the von Humboldt oceanic current causes a cooler, drier climate regime throughout the year. The humid Pacific coast areas in Ecuador are less subject to this atmospheric cooling and experience a wetter summer because of the predominance of the ITCZ (Fig. 1c). Over the Andes, the climate is complex and is primarily controlled by orography, windward/leeward effects, and the formation of local microclimates. The climate is wetter on the east slopes (Amazon) than on the west slopes because of the same climate drivers that affect the lowland regions.
In our analysis, we subdivided the area into six climate regions: Pacific coast, north and south; the Andes, west and east slopes; Amazon sub-Andes; and Amazon lowland (summarized in Table 1). We define the Andes as the regions above 1500 m, and the Amazon sub-Andes as the eastern Andean slopes located at altitudes of 1300 ± 200 m, which is a belt of high orographic precipitation (above 3500 mm yr⁻¹) illustrated in a previous study of Andean transects by Bookhagen and Strecker (2008).

b. Precipitation data
TMPA versions 6 and 7 for the time domain 1998-2009 were obtained from the NASA archive (ftp://disc2.nascom.nasa.gov/ftp/data/s4pa//TRMM_L3/) and aggregated to daily, monthly, seasonal, and annual values. Out of 1920 pixels (0.25° × 0.25°) in the study domain, 144 are collocated with the ground observation stations. The number of collocated pairs is tabulated in Table 1.
Historical rain records (years 1998-2009) were obtained from the national weather station networks of Peru (Servicio Nacional de Meteorología e Hidrología) and Ecuador (Instituto Nacional de Meteorología e Hidrología). The records consist of daily time series from 184 gauges in Peru and monthly time series from 79 gauges in Ecuador.
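The pixel count and the gauge-to-pixel collocation described above can be sketched as follows. This is a minimal illustration, not the authors' processing chain: it assumes the pixel edges align with the stated domain bounds, that the 3-hourly fields are rain rates in mm h⁻¹, and the gauge coordinates and function names (`pixel_index`, `daily_totals`) are hypothetical.

```python
import numpy as np

# Domain bounds from the text: 11°S to 1°N and 80°W to 70°W, 0.25° pixels.
LAT_MIN, LAT_MAX = -11.0, 1.0
LON_MIN, LON_MAX = -80.0, -70.0
RES = 0.25

n_rows = int(round((LAT_MAX - LAT_MIN) / RES))  # 48 pixel rows
n_cols = int(round((LON_MAX - LON_MIN) / RES))  # 40 pixel columns
n_pixels = n_rows * n_cols                      # 48 * 40 = 1920, as in the text


def pixel_index(lat, lon):
    """Return (row, col) of the pixel containing each gauge (assumed grid alignment)."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    row = np.floor((lat - LAT_MIN) / RES).astype(int)
    col = np.floor((lon - LON_MIN) / RES).astype(int)
    # Gauges exactly on the northern or eastern edge go to the last pixel.
    return np.clip(row, 0, n_rows - 1), np.clip(col, 0, n_cols - 1)


def daily_totals(rate_3h):
    """Aggregate 3-hourly rates (assumed mm/h, time on axis 0) to daily totals (mm)."""
    rate_3h = np.asarray(rate_3h, dtype=float)
    n_days = rate_3h.shape[0] // 8  # eight 3-h steps per day; incomplete days dropped
    blocks = rate_3h[: n_days * 8].reshape(n_days, 8, *rate_3h.shape[1:])
    return blocks.sum(axis=1) * 3.0  # each step covers 3 hours


# Hypothetical gauge coordinates, for illustration only.
gauge_lat = [-5.10, -5.20, -0.30]
gauge_lon = [-78.60, -78.55, -78.40]
rows, cols = pixel_index(gauge_lat, gauge_lon)
collocated = set(zip(rows.tolist(), cols.tolist()))
print(n_pixels, len(collocated))
```

In this toy case the first two gauges fall in the same pixel, which is why the number of collocated pixels can be smaller than the number of gauges.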

c. Precipitation analysis
The intercomparison was performed in terms of 1) the mean annual rainfall (mm yr⁻¹), 2) the mean annual relative bias [Eq. (1)], and 3) the mean seasonal bias [mm day⁻¹; Eq. (2)] at each ground observation location. For each region, we also averaged the time series of all paired observations and inspected the bias at the monthly scale:
(1)
and
$$\mathrm{BIAS} = \sum_{t=1}^{T} \left( P_{\mathrm{TMPA},t} - P_{\mathrm{GAUGE},t} \right). \qquad (2)$$
We further analyzed TMPA's skill at estimating various precipitation event types by comparing their distributions of daily rainfall rates to those recorded by the rain gauges. In presenting our results, we adopted the following precipitation classification criteria (mm day⁻¹): zero rain, 0-0.2; light rain, 0.2-1.0; moderate rain, 1.0-5.0; heavy rain, 5.0-15; very heavy rain, 15-50; and extremely heavy rain, above 50. We computed the probability of occurrence of each precipitation type from the entire time series for each satellite-gauge pair. For each region and precipitation class, the statistics are summarized in a boxplot to represent all data pairs, and the probability distributions are compared between the rain gauge, TMPA version 6, and TMPA version 7 datasets.
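The bias measures and the class-occurrence probabilities can be sketched for a single satellite-gauge pair as below. This is an illustrative sketch under stated assumptions: `bias` follows Eq. (2) as a sum of paired differences, while `relative_bias` assumes the common definition (accumulated difference divided by accumulated gauge rainfall, in percent), which may differ from the form of Eq. (1). Class intervals are assumed closed on the left, and all function names and sample values are hypothetical.

```python
import numpy as np

# Upper bounds (mm/day) separating the six classes listed in the text.
INNER_EDGES = np.array([0.2, 1.0, 5.0, 15.0, 50.0])
CLASS_NAMES = ["zero", "light", "moderate", "heavy", "very heavy", "extremely heavy"]


def _paired(p_tmpa, p_gauge):
    """Keep only time steps where both series have a value."""
    a = np.asarray(p_tmpa, dtype=float)
    b = np.asarray(p_gauge, dtype=float)
    ok = ~(np.isnan(a) | np.isnan(b))
    return a[ok], b[ok]


def bias(p_tmpa, p_gauge):
    """Sum over t of (P_TMPA,t - P_GAUGE,t), following Eq. (2)."""
    a, b = _paired(p_tmpa, p_gauge)
    return float(np.sum(a - b))


def relative_bias(p_tmpa, p_gauge):
    """ASSUMED form: 100 * sum(P_TMPA - P_GAUGE) / sum(P_GAUGE), in percent."""
    a, b = _paired(p_tmpa, p_gauge)
    return float(100.0 * np.sum(a - b) / np.sum(b))


def class_probabilities(daily_rain):
    """Probability of occurrence of each rainfall class in a daily series."""
    x = np.asarray(daily_rain, dtype=float)
    x = x[~np.isnan(x)]
    # side="right" puts a value equal to an edge into the higher class (assumption).
    idx = np.searchsorted(INNER_EDGES, x, side="right")
    counts = np.bincount(idx, minlength=len(CLASS_NAMES))
    return dict(zip(CLASS_NAMES, counts / counts.sum()))


# Hypothetical daily values (mm/day) for one pixel-to-point pair.
gauge = [0.0, 0.0, 0.5, 3.0, 10.0, 20.0, 60.0, 0.1]
tmpa = [0.0, 0.3, 0.0, 2.0, 6.0, 12.0, 30.0, 0.0]
print(bias(tmpa, gauge), relative_bias(tmpa, gauge))
print(class_probabilities(gauge))
print(class_probabilities(tmpa))
```

Repeating `class_probabilities` over all pairs in a region would give the data points that the boxplots summarize for each class.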

d. Hydrological analysis
To gauge the impact on hydrological performance, the water balance was evaluated at multiple nested hydrological basins tributary to the Amazon River by calculating the long-term average runoff ratio [RR; Eq. (3)] [...]
Additionally, both TMPA versions were evaluated in terms of the output of a hydrological model constructed for the basins. Detailed model development has been [...] JULES requires near-surface meteorological data as input, which it uses to solve fully coupled energy, water, and carbon balance equations, producing a continuous output of ET and runoff (surface and subsurface). This runoff is then fed into a delay function routing model to produce streamflows that are compared to observations. Daily streamflow data were provided by the Geodynamical, hydrological and biogeochemical control of erosion/alteration and material transport in the Amazon basin (HYBAM) project from nine stations in Ecuador and Peru (Table 2). Information from global and local maps is used to describe the land surface properties, and the simulations (1998-2008) were performed with few perturbations to the original model parameters. The performance scores such as Nash-Sutcliffe efficiency (NSE) and the relative bias between the simulated runoff and the observed daily streamflows were tabulated and compared between the TMPA versions.
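The two performance scores can be sketched as below. The NSE follows its standard definition; the relative flow bias is assumed here to be the accumulated simulated-minus-observed flow divided by the accumulated observed flow, in percent. The function names and the sample series are hypothetical.

```python
import numpy as np


def _paired(q_sim, q_obs):
    """Keep only days where both simulated and observed flows are available."""
    s = np.asarray(q_sim, dtype=float)
    o = np.asarray(q_obs, dtype=float)
    ok = ~(np.isnan(s) | np.isnan(o))
    return s[ok], o[ok]


def nse(q_sim, q_obs):
    """Nash-Sutcliffe efficiency: 1 - sum((sim - obs)^2) / sum((obs - mean(obs))^2)."""
    s, o = _paired(q_sim, q_obs)
    return float(1.0 - np.sum((s - o) ** 2) / np.sum((o - o.mean()) ** 2))


def relative_flow_bias(q_sim, q_obs):
    """ASSUMED form: 100 * sum(sim - obs) / sum(obs), in percent."""
    s, o = _paired(q_sim, q_obs)
    return float(100.0 * np.sum(s - o) / np.sum(o))


# Hypothetical daily flows (m^3/s): the simulation is uniformly 20% too low.
q_obs = np.array([1.0, 2.0, 3.0, 4.0])
q_sim = 0.8 * q_obs
print(nse(q_sim, q_obs), relative_flow_bias(q_sim, q_obs))
```

An NSE of 1 indicates a perfect match, while values at or below 0 mean the simulation is no better than using the mean observed flow as the prediction.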

Results and discussion
a. Mean annual, seasonal, and monthly bias
Figures 2a-c show the mean and the relative change of the mean annual precipitation in TMPA versions 6 and 7. A clear spatial trend is observed: there is a substantial increase in the total precipitation amounts from version 6 to version 7 along the Andes and the Pacific coast in the north that results in corresponding reductions in the negative bias against rain gauge observations (Figs. 2d,e). Figure 2f shows that, with the exception of a few gauge locations on the Pacific coast in Peru, the direction of change in the relative bias is positive. This observation agrees with an increase in gauge densities in these areas between the different datasets used in versions 6 and 7 and suggests a large role for the bias correction within the algorithm. In spite of this, TMPA version 7 continues to underestimate precipitation overall, except in the northern Andean regions down to the Ecuador-Peruvian border, where it now overestimates compared to the rain gauges.
A seasonal analysis showed that the main reduction of the negative bias from version 6 to version 7 occurs along the Andean range and in the coastal region in Ecuador during the wet season (DJF and MAM) (Fig. 3). TMPA version 7 also tends to cause some overestimations over the Andes (west and east slopes) in the north, and these overestimations persist during the drier seasons (JJA and SON). Changes between versions 6 and 7 over the lowland Amazon and the Amazon sub-Andes regions are relatively small with no apparent seasonal trend, which may be explained by the low seasonality in their climate. Altogether, there is evidence of an increase in wet season deep convective heavy precipitation amounts and an increase (to the point of overestimation) of the dry season light rain, and this is further confirmed in the time series analysis of the monthly bias between TMPA and gauge estimates. Figure 4 shows that TMPA versions 6 and 7 monthly biases against gauge data are highly correlated and that the direction of change is positive throughout most of the time series. As the biases in version 6 tend to be negative, this resulted in biases shifting toward zero in version 7, and in some cases, such as the Pacific lowland in the south, toward positive biases. A strong seasonality in the negative bias reduction (highest in DJF) is observed in the coastal regions (north and south) and the west Andes, which are the regions with the strongest seasonalities.
TABLE 2. Streamflow stations and water balance summary. The numbers refer to Fig. 1. The mean observed discharge Q_obs (m³ s⁻¹) is calculated using all available data. Runoff ratio is given for versions 6 (RR V6) and 7 (RR V7), and the corresponding evapotranspiration (mm yr⁻¹) is calculated from the water balance equation assuming zero long-term change in storage for versions 6 (ET V6) and 7 (ET V7).
A few exceptions are the prominent positive biases with version 7 in the sub-Andes between 2002 and 2006, and in the Pacific lowlands in the south, during the same time period and in 2007. These are drier summer periods associated with El Niño episodes of drought, as these regions experience increased dry air subsidence from intensified convection over the Pacific Ocean.

b. Precipitation rates distribution
Figures 5a-e provide further insight into the shifts in the daily rainfall distributions estimated in versions 6 and 7. In version 6, the TMPA distributions are more strongly skewed toward light-to-moderate intensity precipitation compared to the gauge distributions across all regions. This observation concurs with the reported underestimation of extreme high precipitation by TMPA version 6 in the literature. The version 7 product effectively shows a shift in the distribution toward higher-intensity precipitation and an increase in the internal variability across the range of precipitation rates. Consequently, there is a reduction in the bias between TMPA and rain gauge distributions over the Andes and sub-Andes, particularly for heavy and very heavy precipitation, where the medians of the distributions align closer than previously. The underestimation, nevertheless, persists to some extent, and light-to-moderate rain continues to be overestimated most severely in the west slopes of the Andes.
We recognize that TMPA's underestimation of high extremes may simply be a reflection of the nature of their data as a spatial average when compared to point rain gauge data. However, TMPA also shows an overestimation of zero-rain days, whereas, by their nature, spatial averages should record fewer no-rain days than point estimates. This may be caused by the low sampling frequency and consequently missed short-duration precipitation events between satellite measurements. The overestimation of dry days is considerably reduced in version 7 and may have to do with the refinement to the surface reflectivities routine in the TPR algorithm that improves the determination of rain signals from clutter, as well as the recalibration of the TPR's Z-R relationship toward a general increase in the precipitation rates.

c. Impact on the water balance and hydrological simulation
The impact of the TMPA algorithm change on the water balance in several hydrological basins tributary to the Amazon basin (Fig. 2) is presented in terms of runoff ratios (Table 2). TMPA version 6 typically generates physically unrealistic runoff ratios above 1, highlighting the consistent regional underestimation of precipitation. Version 7 generates substantially reduced runoff ratios, with values closer to those expected for humid tropical basins, even in the small Andean basin of Paute. Some unrealistically high runoff ratios remain in basins with a high areal runoff, such as Santiago, San Sebastian, and Nueva Loja located in southeastern Ecuador, which reflect the prevailing underestimation of heavy rain in version 7 TMPA, as discussed in section 3b. The increase in precipitation amounts also results in ET estimates closer to MODIS-based estimates averaged for each basin (Table 2, last column) and literature values of ET (600, 1200, and 1300 mm yr⁻¹ median values for the Andes and tropical montane and lowland rain forests, respectively; see Zulkafli et al. 2013, and references therein).
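Why a runoff ratio above 1 is physically unrealistic can be seen from a simple long-term balance. The relations below are a sketch that assumes Eq. (3) takes the conventional form RR = Q/P, with P the basin-average precipitation, Q the observed discharge expressed as a depth over the basin, and ΔS the change in storage; these symbols are introduced here for illustration.

```latex
\begin{align}
P  &= Q + ET + \Delta S, \qquad \Delta S \approx 0 \ \text{(long term)} \\
RR &= \frac{Q}{P} \\
ET &= P - Q = P\,(1 - RR)
\end{align}
```

Under these assumptions, RR > 1 would imply ET < 0, which is why runoff ratios above 1 are read as a sign of underestimated precipitation, and why a higher P in version 7 raises the water-balance ET.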
The improvement in the water balance translates directly into hydrological modeling performance, as seen in Fig. 6 and Table 3. Simulations driven by TMPA version 7 produce a closer estimate of daily streamflows to the observed time series and result in an increase in the modeling efficiency (NSE score) in all nine basins. At San Regis, which is the largest basin analyzed, the relative bias between simulated and observed flows decreased from -37.8% to -2.0%, which is a reduction of 95%. Here, the averaged precipitation bias reduction from -35% to -10% parallels the reduction in the simulated discharge. In Chazuta, where there is a good coverage of rain gauges across the basin, we performed an additional simulation using rain gauge data interpolated with kriging to serve as a benchmark and found it to underperform (NSE of -0.19, bias of -30.0%) the simulation forced by TMPA version 7 (NSE of 0.43, bias of -18.7%). This implies a high potential skill of TMPA version 7 in ungauged catchments, a sentiment echoed by Xue et al. (2013) based on their hydrological evaluation of TMPA version 7 against version 6 and ground observations in Bhutan. Improvements at varying degrees were observed elsewhere, most notably in the humid north Andean basins of Paute, Nuevo Rocafuerte, and Francisco de Orellana, which suggests the role of an improved high precipitation estimation. Nevertheless, the hydrographs also show that the variations in the peaks are still poorly modeled, except in the larger basins. This reflects the continued underestimation of extremes by TMPA, as well as the limitations of the hydrological model in representing surface runoff generation processes in mountain environments. In spite of this, our work has demonstrated that the forcing uncertainty is significantly reduced in TMPA version 7. This enables further work to focus on developing more accurate process representations for the tropical Andes.
FIG. 4. The average monthly bias in TMPA versions 6 and 7 vs gauge by climate region.

Conclusions
The TMPA versions 6 and 7 intercomparison work completed over six climate regions in the tropical Andes-Amazon showed an overall increase in precipitation, especially in the Pacific lowlands (north) and the Andes. Our results corroborate the findings of the few existing validation studies on TMPA version 7 that show better agreement with gauge data compared to version 6. Our closer inspection of the bias distributions indicated that the primary improvement is in the reduction of the negative bias of the wet season's high extremes. We infer that the positive outcome is attributable to a combination of the changes in the algorithm that improve heavier rain quantification, and we hypothesize that 1) a higher number of rain gauges used during bias correction, 2) the TPR recalibration toward higher precipitation rates, and 3) an improved GPROF 2010 algorithm for the PMW-based precipitation estimates play a large role. The hydrological performance of TMPA with version 7 increased considerably over nine hydrological basins in the region, increasing our confidence in the use of TMPA as forcing data for modeling applications to complement ground observations in tropical mountain regions where they are usually scarce or inaccessible. This applies not only to hydrological studies but also to other modeling applications that benefit from the use of precipitation as driving data.
We recognize several pathways for further evaluation. First, by analyzing a composite, final product, we restrict our ability to directly attribute the improvements to TMPA version 7 to the different steps of the TMPA algorithm. The logical next step is therefore to evaluate multiple precipitation products from the various levels of the TMPA processing individually, which will enable us to identify and inform the main contributors to the overall uncertainty. For example, one could compare the TMPA's research product to the real-time product and quantify the added value of a regional gauge correction of the satellite product. Second, from a water resources standpoint where the main interest is in the means and extremes, it is sensible to look at TMPA's representation of entire distributions of precipitation rates compared to those of gauge data, as we have presented in our analysis. However, for operational applications such as forecasting, early warning, or risk analysis, further performance indices, such as false alarm ratios, missed volumes, and the probability of detection, should be considered. In this context, a direct pixel-to-point satellite-gauge comparison will have to accommodate the fundamental challenge of resolving the mismatch in the temporal and spatial support of the data products in both occurrence and amounts, that is, the timing of the precipitation event versus that of a satellite retrieval and the spatial integration of satellite estimates that smooths extremes. Aggregating point rain gauge data to the satellite pixel using a simple averaging or more complex geostatistical interpolation methods, or conversely, downscaling satellite data to finer-resolution estimates using geophysical predictors such as elevation [as has been shown in Fang et al. (2013)], should be implemented before a reasonable point-to-pixel comparison can be made.
FIG. 5. (a)-(e) Precipitation rate distributions in TMPA version 6 vs 7, gauge, and TMPA vs gauge, according to precipitation types. The precipitation type is characterized based on precipitation intensities (mm day⁻¹): zero rain, 0-0.2; light rain, 0.2-1.0; moderate rain, 1.0-5.0; heavy rain, 5.0-15; very heavy rain, 15-50; and extremely heavy rain, above 50. The boxplots in each interval represent the variability between the data points, which are probability of occurrence for each pixel-to-point pair. The boxes extend from the first to the third quartiles of the data points, and the whiskers extend to the highest value within 1.5 times the interquartile range. The dots represent values outside this range.
Finally, conclusions from our analysis of a set of data from a specific region and the potential for extrapolation should ideally be further corroborated using cross validation with rain gauge data from other regions. This extended analysis can also explore the data performance at different spatial and temporal scales.