Experiences in using the TMPA-3B42R satellite data to complement rain gauge measurements in the Ecuadorian coastal foothills

At present, new technologies are becoming available to extend the coverage of conventional meteorological datasets. An example is the TMPA-3B42R dataset (research, v6). The usefulness of this satellite rainfall product has been investigated in the hydrological modeling of the Vinces River catchment (Ecuadorian lowlands). The initial TMPA-3B42R information exhibited some features of the precipitation spatial pattern (e.g., decreasing southwards and westwards). However, it showed a marked bias compared to the ground-based rainfall values. Several time scales (annual, seasonal, monthly, etc.) were considered for bias correction. High correlations between the TMPA-3B42R and the rain gauge data were still found for the monthly resolution, and accordingly a bias correction at that level was performed. Bias correction factors were calculated and, adopting a simple procedure, spatially distributed to enhance the satellite data. By means of rain gauge hyetographs, the bias-corrected monthly TMPA-3B42R data were disaggregated to daily resolution. These synthetic time series were inserted in a hydrological model to complement the available rain gauge data and to assess the model performance. The results were quite comparable with those using only the rain gauge data. Although the model outcomes did not improve remarkably, the contribution of this experimental methodology was that, despite a high bias, the satellite rainfall data could still be corrected for use in rainfall-runoff modeling at catchment and daily level. In the absence of rain gauge data, the approach may have the potential to provide useful data at scales larger than the present modeling resolution (e.g., monthly/basin).


Introduction
At present, remote sensing has become immensely useful to improve our understanding of the spatiotemporal variation of rainfall, particularly for data-scarce regions. In this regard, the Tropical Rainfall Measuring Mission (TRMM) (Simpson et al., 1988; Kummerow et al., 1998), an initiative of the US National Aeronautics and Space Administration (NASA) and the Japanese Aerospace Exploration Agency (JAXA), is instrumental in shaping the research related to the use of satellite-based rainfall products in hydrological studies (http://trmm.gsfc.nasa.gov/). The TRMM system has been operational since November 1997, and it has released products since 1998. As its name indicates, the mission covers only the tropical zone, i.e., between the latitudes 50° N and 50° S. The current spatial resolution is 0.25°.
A large number of publications have reported worldwide experiences in the use of TRMM Multi-satellite Precipitation Analysis (TMPA) products (Nicholson, 2005; Hughes, 2006; Wong and Chiu, 2008; Buarque et al., 2011; Rollenbeck and Bendix, 2011), particularly the 3B42 research type, version 6 (Huffman et al., 2007). In this regard, two lines of research can be distinguished. The first one has focused on comparing the TRMM with the rain gauge data, either to study the spatial and temporal variability or to test the validity of the TRMM products. The second line of research has investigated the potential use of that satellite information as an independent data source or as a complement to rain gauge data for hydrological studies.

M. Arias-Hidalgo et al.: Experiences in using the TMPA-3B42R satellite data
There are important results related to the first category. For instance, in the arid environments of southern Africa, Nicholson (2005) and Hughes (2006) reported that the TRMM data overestimated the rain gauge data in every comparison based on a monthly scale. Other interesting cases are the ones reported by Bell and Kundu (2003). These researchers also compared on a monthly basis and recognized that even across densely gauged networks there were large differences between ground data and TRMM data. A number of studies have reported that a comparison on an annual timescale yields low biases, but with finer temporal scales those biases showed an increasing trend. At daily or weekly time resolution, bias values around 50 % have been reported (Wilheit, 1988; Olson et al., 1996; Huffman et al., 2010). Other examples of comparisons performed in different places have been reported as well, such as Hong Kong (Wong and Chiu, 2008), the Brazilian part of the Amazon River basin (Buarque et al., 2011), Indonesia (Vernimmen et al., 2012) and countries with poorly gauged areas such as Ghana (Endreny and Imbeah, 2009).
The location of the selected study area seems to strongly influence the comparison process. Publications whose case studies deal with oceanic environments or flat areas (e.g., Amazon Basin) report a very good match between the data from rain gauges mounted on buoys and the TRMM values (Adler et al., 2000; Bowman, 2005). In studies in locations with higher altitudes and particularly in the foothills of mountainous regions (e.g., the Andes), there were notable differences between the two sources of data (Tian and Peters-Lidard, 2010). In this regard, the TRMM data might show lower values than the gauge rainfall (Dinku et al., 2010; Javanmard et al., 2010). The spatial resolution of the satellite-borne data possibly plays a minor role when attempting to mimic ground precipitation patterns in mountain range foothill areas. On the other hand, sampling error (both spatial and temporal) should be the first-order cause of underestimation in mountainous areas, which involve complicated convection mechanics and thus high uncertainty in the spatial distribution of rainfall (Bendix et al., 2009). Consequently, this pattern is usually hard to capture with satellite snapshots of limited swath range. For the present study, a total of 9 TMPA pixels, with 3 cells along the x-direction (longitude), seems adequate to detect variations from west to east (i.e., the lowlands, foothills and Andes region). Precisely these hilly areas are frequently the most neglected by the national weather agencies in terms of ground data availability. To tackle this problem, TMPA and satellite data in general may contribute to a better understanding of the spatial and temporal patterns of precipitation, in particular if space-borne and gauge data complement each other (Rollenbeck and Bendix, 2011). This possibility still needs to be investigated in areas with large spatial variability of rainfall.
A second group of researchers have gone beyond data comparisons. They have used the satellite products as a new input for rainfall-runoff models and then compared the simulation results with those using only rain gauge data. Noteworthy examples are the models developed in California (Guetter et al., 1996; Yilmaz et al., 2005), where flow simulation and soil water estimates were undertaken at a meso-scale basin using the GOES (Geostationary Operational Environmental Satellite) data. Although the outcomes of those simulations showed some biases when compared with an existing hydrological model, the authors were able to demonstrate a procedure for combining multiple data sources. Possible bias sources may have been the following: (i) the quality level of the GOES atmospheric correction algorithms at that time; and (ii) the fact that the precipitation estimations (Yilmaz et al., 2005) were developed for ungauged basins, hence involving a large uncertainty. In the Tapajós River (Amazon Basin, Brazil), spatial rainfall and daily comparisons of different data sources for hydrological simulations have been investigated: firstly, using only rain gauge observations; and secondly, integrating these point measurements with the TRMM data (Collischonn et al., 2008). These comparisons gave support to large-scale rainfall-runoff and further hydrodynamic simulations (Paiva et al., 2011). Bell and Kundu (2003) have shown that finding an optimal space-time correlation between rain gauge data and TMPA needs a model, and for that purpose a spectral model of rain rate covariance is often used. For comparing data averaged over a month, Bell and Kundu (2003), with the help of a spectral model, have shown that the error is very high when the diameter of the averaging area is small (∼ 20 km). As the averaging area increases, the error decreases exponentially to a minimum (close to the 10 % level) for a diameter of 90 km with satellite overpasses every 3 h. On the other hand, as the frequency of satellite overpasses decreases to daily visits, the optimal diameter grows to 200 km, with a corresponding increased error of about 25 %. They also concluded that for 3-hourly satellite visits to a gauge site it is possible to bring the error of monthly average values within 10 % with just a few gauges.
The literature suggests the promising possibility of complementing the rainfall data from rain gauges with data from the TMPA-3B42 (research version) in hydrological studies of data-scarce regions such as the Vinces catchment in the Guayas River basin in Ecuador. Prior to achieving this goal, a second bias correction might be necessary because the global validation spots provided by NASA (http://trmm-fc.gsfc.nasa.gov/trmmgv/data/data.html) are usually too far away and thus might not be sufficiently representative for the present study area. Therefore, this article proposes the use of local rainfall ground stations as anchor points to re-correct the TMPA-3B42R research values (v6). In addition, this paper presents a simple procedure to combine the ground measurements with the 3B42 data for hydrological simulation using an existing rainfall-runoff model of the Vinces Basin.

The Vinces River catchment
The Guayas River basin (GRB, 34 000 km²) is located within the Ecuadorian coastal region (Fig. 1). It is one of the most important areas in Ecuador in terms of economic production. Three main activities take place within the basin, viz. (i) urban/industrial development, (ii) agriculture and (iii) aquaculture (Southgate and Whitaker, 1994; Falconi-Benitez, 2000). More than 68 % of the national crop production originates from this watershed (Borbor-Cordova et al., 2006). The Vinces River catchment is located in the central part of the Guayas River basin. Up to the Quevedo at Quevedo station, the catchment area is 3400 km² (Fig. 1a). Elevations range from 60 m (at the outlet) up to 4080 m along the Andean foothills, particularly the northeastern part. Annual rainfall (derived from rain gauge values) typically varies from around 1000 mm in the southwestern side to more than 3500 mm in the northeastern zone close to the Andes (Arias-Hidalgo et al., 2012). The mean historical flow at the upper catchment's outlet is 220 m³ s⁻¹. In general, two seasons are distinguished across the Ecuadorian lowlands: the wet (rainy) season (mid-December to May) and the dry period (the rest of the year), characterized by a common absence of rainfall.

The hydrological model
A simulation study was carried out to compute the streamflow contribution from the upper to the lower Vinces catchment as part of a broader study involving a wetland-catchment analysis framework (Arias-Hidalgo et al., 2012). The Lulu and San Pablo rivers, main tributaries of the Vinces, have crucial importance since they may mitigate the effects the Baba dam project may exert on the lower course of the river, specifically a significant flow diversion to the Daule Peripa dam (Fig. 1). As such, the main target of that study was to calculate the hydrographs at the confluence of the Lulu and San Pablo with the Vinces River as well as at the catchment's outlet. To that end, the aforementioned catchment was divided into 6 subbasins (Fig. 1b), of which four had streamflow gauges available for calibration. The HEC-HMS tool (Sharffenberg and Fleming, 2010) was used to compute the catchment runoff. The two aforementioned tributaries are expected to reduce the potential water shortage caused along the Vinces River by the Baba dam by up to 60 % (Efficacitas, 2006; Arias-Hidalgo, 2012).
In general, spatial data are very scarce across the Guayas River basin. This scarcity involves a low number of weather stations, a low density of available meteorological measurements, few calibration points and long gaps throughout the daily time series. Because of these limitations, the model was built using simple approaches that require a low number of variables. The precipitation loss was modeled using the deficit and constant loss method (Skaags and Khaleel, 1982; USDA, 1986). This method is simple and is suitable when limited data do not allow the use of a more rigorous method that models the evolution of soil moisture (e.g., the Soil Moisture Accounting model). The deficit and constant loss method accounted for the sum of surface storage, canopy interception, infiltration, evapotranspiration, and soil moisture. This composite index was estimated for each subbasin based on its soil type. The classification of soil was adopted from the soil classification used in the curve number method (i.e., soil types A, B, C and D).
The delineation of the catchment and the computation of physical features such as areas and distances were carried out using HEC-GEOHMS, a GIS-based tool (Fleming and Doan, 2009). The impervious area of each sub-catchment, information typically used in computing precipitation loss, was determined using the land use map of the catchment. In the meteorological model, the gauge weights for individual rain gauges were determined based on the Thiessen polygon method. The direct runoff was computed using the SCS unit hydrograph (SCS, 1972). The baseflow was computed using the recession method. Initial values of lag time and initial baseflow were estimated by analyzing some typical hydrographs of the catchment. Model parameters were calibrated using the univariate gradient as the optimization method and the minimization of the sum of squared residuals as the objective function (see Tables A1, A2 and A3). The model was set up and calibrated for the years 2004 and 2005 (normal years, not showing any extreme pattern typical of the El Niño phenomenon) against discharge observations at the Quevedo at Quevedo, Baba, Pilaló and Toachi stations (dots in Fig. 1b). The average daily Nash-Sutcliffe coefficient (NSC) (Nash and Sutcliffe, 1970) was around 0.75 (a summary of the NSC numbers for some subbasins is shown in Table A4).
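Since the Nash-Sutcliffe coefficient is the main performance measure used throughout this paper, a minimal sketch of its standard definition (Nash and Sutcliffe, 1970) may help the reader. The function name and the input checks are illustrative assumptions; this is not the HEC-HMS implementation.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe coefficient: 1 - (sum of squared residuals) /
    (sum of squared deviations of the observations from their mean)."""
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_residuals = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0.0:
        raise ValueError("observed series has zero variance; NSC is undefined")
    return 1.0 - squared_residuals / spread
```

A value of 1 indicates a perfect match, while a value of 0 means the simulation is no better than using the mean of the observations.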
In the present research, 3-hourly satellite visits and monthly averaged satellite rainfall values are considered. Therefore, as per Bell and Kundu (2003), a zone with a diameter of 90 km would have been optimal. The Vinces Basin has an area of about 3400 km², which can be approximated as a circular area with a diameter of 66 km. As per the spectral model of Bell and Kundu (2003), the error level for this comparison may be around 11 to 12 %, which can perhaps be considered quite low. It should be noted that the subbasins are small for this study (Table A1); however, their presence does not mean that a comparison was made at such scales. The hydrological processes of the entire Vinces catchment were calibrated and validated in an integrated manner and not at a sub-basin scale.
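The 66 km figure follows from treating the catchment as a circle of equal area. A short sketch of that arithmetic is given below; the function name is hypothetical.

```python
import math


def equivalent_diameter_km(area_km2: float) -> float:
    """Diameter (km) of a circle whose area equals area_km2."""
    if area_km2 < 0.0:
        raise ValueError("area must be non-negative")
    return 2.0 * math.sqrt(area_km2 / math.pi)
```

For an area of 3400 km² this gives roughly 65.8 km, which rounds to the 66 km quoted in the text.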
Annual rainfalls from rain gauges and the TMPA-3B42R data, averaged over the time span, were computed at their respective measurement points. Adopting inverse distance weighting (IDW) for interpolation, an average spatial distribution of annual rainfall is shown in Fig. 2a and b. The ground-based map indicated an increasing pattern principally towards the north. Such a trend was also somewhat captured by the TRMM-based map, although its magnitude was 50-65 % smaller than the rain gauge representation. However, the northeast of the catchment shows the lowest density of ground-based measuring stations. This fact corroborates the concerns about possible high uncertainties that may be associated with rainfall estimation across foothill areas (Paiva et al., 2011).
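The inverse distance weighting step can be sketched as follows. The distance exponent is not stated in the text, so the default power of 2 is an assumption, as are the function name and the use of planar coordinates.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def idw(target: Point, stations: Sequence[Point], values: Sequence[float],
        power: float = 2.0) -> float:
    """Inverse-distance-weighted estimate at `target` from point values.

    `power` is the distance exponent (assumed here; not given in the paper).
    """
    if len(stations) == 0 or len(stations) != len(values):
        raise ValueError("need matching, non-empty stations and values")
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(stations, values):
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

The same routine could serve both directions used in the study: estimating satellite values at rain gauge locations and distributing gauge-based quantities to the TMPA grid centers.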
Several time scales were considered for bias correction: annual, seasonal, monthly, etc. In this regard, high correlations between the TMPA-3B42R and the rain gauge data were still found for the monthly resolution (to be detailed shortly). A monthly bias correction for the TMPA-3B42R research data has been adopted previously by other researchers as well (Bell and Kundu, 2003; Hughes, 2006; Huffman et al., 2007; Rollenbeck and Bendix, 2011; Vernimmen et al., 2012). For that objective, Huffman et al. (2007) used specific global stations. Finally, beyond monthly resolution and in general at rain gauge locations using daily timescales, most researchers found low correlations between the rain gauge and TMPA-3B42R rainfall data (R² often less than 0.30).
Most likely, the rain gauge locations and the TMPA grid cell centers do not coincide. As a consequence, the average monthly satellite-based data at the grid cells had to be estimated at the rain gauge locations (the inverse distance weighting method was used once again). Thus, the average monthly precipitation values for the study period (1999–2006) measured at each rain gauge location were compared against their TMPA interpolated counterparts. The following equation expresses a relationship between the rain gauge and the uncorrected TMPA monthly values, at a certain location $i$:

$$TP_{i,m} = K_{i,m} \cdot \mathrm{TRMM}_{i,m} \tag{1}$$

where $K_{i,m}$ is the bias factor at the rain gauge location $i$, $\mathrm{TRMM}_{i,m}$ is the uncorrected monthly rainfall (mm month⁻¹), obtained from the satellite data and estimated at the rain gauge location $i$ during the month $m$, and $TP_{i,m}$ is the total rainfall at rain gauge $i$ during the month $m$ (mm month⁻¹), from ground observations. An example of this correlation can be seen for the "Puerto Ila" station (Fig. 3). Table 1 shows the extended results of this annual comparison using the monthly bias corrector (Eq. 1). In general, a high correlation was observed at the monthly scale (R² = 0.81 on average). In that regard, Fig. 4 illustrates a graphical comparison between the rain gauge observations and the uncorrected and corrected TMPA data. In order to assess the validity of the bias correction, the relative bias and the root mean square error (RMSE) were calculated as follows:

$$\mathrm{Bias} = \frac{P_{\mathrm{TRMM}} - P_{\mathrm{Ground}}}{P_{\mathrm{Ground}}} \times 100\,\% \tag{2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{m=1}^{n} \left( P_{\mathrm{TRMM},i,m} - TP_{i,m} \right)^{2}} \tag{3}$$

where $P_{\mathrm{Ground}}$ is the annual rainfall from ground observations (mm year⁻¹), $P_{\mathrm{TRMM}}$ is the uncorrected or corrected annual rainfall derived from the satellite data (mm year⁻¹), $P_{\mathrm{TRMM},i,m}$ is the monthly rainfall for month $m$ at rain gauge location $i$, for either uncorrected or corrected TMPA information (mm month⁻¹), and $n$ is the number of months compared.

As a further step, the bias adjustment coefficient ($K$ in Eq. 1) was spatially distributed across the Vinces upper catchment, resulting in a distributed map of correctors (Fig. 5). As before, the approach followed inverse distance weighting, based on the correctors estimated at each rain gauge location. As could have been expected from the differences in annual averages, bias correctors between 2.7 and 3.2 constituted a representative interval for most of the catchment domain, with the exception of those ground stations situated in the uppermost portions of the catchment (close to the water divide). The corresponding bias correction coefficients were estimated for every TMPA grid center, and thus the correction was performed using the following expression:

$$\mathrm{TRMM}_{\mathrm{corr},j,m} = K_{j,m} \cdot \mathrm{TRMM}_{j,m} \tag{4}$$

where $K_{j,m}$ is the monthly bias factor, estimated at the TMPA grid center $j$, $\mathrm{TRMM}_{j,m}$ is the uncorrected TMPA monthly rainfall at the grid center $j$ during month $m$ (mm month⁻¹), and $\mathrm{TRMM}_{\mathrm{corr},j,m}$ is the corrected TMPA monthly rainfall at the grid center $j$ during month $m$ (mm month⁻¹). Thus, seven TMPA-3B42R points were added to the rain gauge input network (circles in Fig. 5): two in the lowlands and five in the highlands.

Because the rainfall-runoff model was built using a daily time step (as in the rain gauge dataset), the satellite-corrected monthly values were disaggregated to that time resolution for each new information spot. To achieve this, empirical factors ($f_i$) were derived from the rain gauge time series as follows:

$$f_{i,d,m} = \frac{P_{i,d,m}}{TP_{i,m}} \tag{5}$$

where $f_{i,d,m}$ is the temporal disaggregation coefficient at the rain gauge $i$ for the day $d$ of month $m$, $P_{i,d,m}$ is the total rainfall at rain gauge location $i$ on the day $d$ of month $m$ (from ground observations, mm day⁻¹), and $TP_{i,m}$ is the total rainfall at rain gauge location $i$ during the month $m$, from ground observations (mm month⁻¹), as explained in Eq. (1). The $f_{i,d,m}$ ratios were then applied back to the corrected TMPA-3B42R monthly values to estimate the daily series (day $d$, month $m$) at the satellite grid centers. There, the procedure took the factors from the nearest ground location. The final expression is as follows:

$$\mathrm{TRMM}_{\mathrm{corr},j,d} = f_{i,d,m} \cdot \mathrm{TRMM}_{\mathrm{corr},j,m} \tag{6}$$

where $\mathrm{TRMM}_{\mathrm{corr},j,d}$ is the disaggregated, corrected daily TMPA rainfall at grid center $j$ (mm day⁻¹) and $i$ is the rain gauge nearest to grid center $j$. Finally, in order to illustrate the validity of this simple procedure, an example was taken from the location of the Puerto Ila gauge station, as shown in Fig. 6. At the daily timescale, the correlation at this spot was high enough (R² = 0.88) given the empirical approach and the large initial bias.
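The monthly bias correction and the daily disaggregation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the bias factor is taken as the ratio of gauge to satellite monthly totals (Eq. 1 rearranged), and the handling of a month with no gauge rainfall (all factors set to zero) is an assumption, since the text does not say how such months were treated.

```python
from typing import List, Sequence


def bias_factor(gauge_monthly: float, trmm_monthly: float) -> float:
    """Bias factor K at a gauge location: gauge total / satellite total."""
    if trmm_monthly <= 0.0:
        raise ValueError("satellite monthly rainfall must be positive")
    return gauge_monthly / trmm_monthly


def correct_monthly(trmm_monthly: float, k: float) -> float:
    """Bias-corrected monthly satellite rainfall at a grid center."""
    return k * trmm_monthly


def disaggregation_factors(gauge_daily: Sequence[float]) -> List[float]:
    """Daily fractions of the gauge monthly total (one value per day)."""
    total = sum(gauge_daily)
    if total == 0.0:
        # Assumption: a dry month at the gauge yields a dry month at the grid.
        return [0.0] * len(gauge_daily)
    return [p / total for p in gauge_daily]


def disaggregate(corrected_monthly: float, factors: Sequence[float]) -> List[float]:
    """Spread a corrected monthly value over days using the gauge fractions."""
    return [f * corrected_monthly for f in factors]
```

As a toy example with made-up numbers, a gauge recording 100 mm against a satellite estimate of 40 mm gives K = 2.5; a grid value of 50 mm is corrected to 125 mm, which is then split over the days in proportion to the daily hyetograph of the nearest gauge. In the study, K at each grid center comes from interpolating the gauge-based factors rather than from a single gauge.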

Performance of complementary TMPA-3B42R data for the HMS model
The HEC-HMS model of the Vinces River catchment was run for the year 2006 (a normal year), at first with the data from rain gauges exclusively, and afterwards using the TMPA-3B42R and rain gauges together. For the first simulation, Fig. 7 shows an example of hydrograph comparison between observed and computed values at the Quevedo at Quevedo streamflow station. It was observed that although some of the observed peaks were not accurately matched by the simulation, the trend and some other peaks were very well represented. The model computed several flow peaks in this period as a response to their corresponding precipitation events, such as the peaks in May, October and November 2006. Yet the differences with the observed data in May and November are noteworthy. According to what has been experienced during the data collection campaigns, there might be some concern about the reliability of the discharge observations (stations without proper maintenance), particularly during the dry season. This has not been the case with the rainfall observations. Disregarding some mismatches during May and November, the Nash-Sutcliffe coefficients were considered acceptable given the simplicity of the model (Table A4).
In order to assess the usefulness of combining precipitation data from rain gauges and the TMPA-3B42R as an alternative data source for rainfall-runoff modeling, a new hydrological simulation was executed for the Vinces upper catchment. The time series at the rain gauge stations were not modified, but the generated (synthetic) daily time series at the TMPA-3B42R centers (Fig. 5) were considered as additional information in computing the average areal rainfall for each sub-basin.
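The average areal rainfall of a sub-basin is a weighted mean of the point series that influence it. A minimal sketch is given below, assuming that the synthetic TMPA points are treated exactly like additional gauges with their own weights; the station names and weight values in the usage note are purely hypothetical and do not come from Table A3.

```python
from typing import Dict


def areal_rainfall(daily_by_station: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Weighted mean daily rainfall (mm) over a sub-basin.

    Weights are normalized here, so they need not sum exactly to 1.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0.0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(w * daily_by_station[name] for name, w in weights.items())
    return weighted / total_weight
```

For instance, a gauge reading of 10 mm with weight 0.75 and a synthetic TMPA value of 20 mm with weight 0.25 yield an areal rainfall of 12.5 mm for that day.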
The results of the hydrological simulation using average areal rainfall from both sources are shown in Fig. 7. During some peaks throughout the rainy season of 2006 (e.g., 8 February, 5 March, and others), the newly fed model showed higher streamflow values compared to the ground-based data simulation. This may imply an improvement in the model performance (8-13 %) for the series trend and even for some peaks (e.g., the peak of 16 March), because the rain gauge model in general underestimated the discharge observations. However, for other peaks the new model caused a larger positive bias of around 18 %. Globally, the Nash-Sutcliffe coefficient for the wet period remained almost the same, changing from 0.83 to 0.81 (Table A4).
For the dry season the new simulation did not show any remarkable improvement compared to the model results that used rainfall data only from rain gauges (NSC changed from 0.70 to 0.53 for the Vinces outlet station; Table A4). For other periods the NSC at the catchment outlet increased slightly from 0.98 to 0.99 (not shown in Table A4). On an overall yearly basis, the Nash-Sutcliffe coefficient slightly decreased from 0.81 to 0.76 (Table A4).
Finally, mass balance error values have been included in the analysis, not only for the Quevedo at Quevedo station but also for the other three control locations: Baba dam, Toachi and Pilaló (Table A4). In addition, annual as well as seasonal results are presented. There were improvements in mass balance error (and NSC) when comparing results from the model using only rain gauge data and the combined simulation. Such enhancements were observed at the Baba gauge (annual, −32.2 to −15.7 %; wet, −32.3 to −17 %; and dry season, −32.2 to −14.8 %) and at the Pilaló gauge (annual, −41.5 to −15.2 %; wet, 17.7 to 8.2 %; and dry, −58.3 to −31.7 %), as well as at the Quevedo at Quevedo station (rainy season, 17.4 to −1.5 %). However, as for the simulations using only the TMPA values, the overall outcomes for the whole Vinces catchment were not as good as for the combined approach (e.g., NSC numbers below 0.50). This finding indicates that, for a daily resolution, the TMPA-3B42R research values may still depend on the ground data not only for re-correction but also for complementing the modeling input data.
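The text does not give the formula behind the mass balance error. A common choice is the percentage difference between simulated and observed flow volumes, sketched below; both the formula and the sign convention (negative when the model underestimates) are assumptions, as is the function name.

```python
from typing import Sequence


def mass_balance_error_pct(observed: Sequence[float],
                           simulated: Sequence[float]) -> float:
    """Percent volume error: 100 * (sum(sim) - sum(obs)) / sum(obs).

    Assumed definition; negative values mean the model underestimates volume.
    """
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    total_observed = sum(observed)
    if total_observed == 0.0:
        raise ValueError("observed volume is zero; error is undefined")
    return 100.0 * (sum(simulated) - total_observed) / total_observed
```

Under this convention, a shift from a strongly negative value towards zero, as reported at the Baba and Pilaló gauges, corresponds to a smaller volume deficit in the simulation.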

Concluding remarks and further research possibilities
New remote sensing technologies provide multiple options to complement the conventionally obtained spatial rainfall data. In this paper, the use of the TMPA-3B42R data to complement precipitation data from rain gauges for the scarcely gauged Vinces upper catchment in Ecuador was explored.
The spatial distribution of the annual rainfall data from TMPA showed some similarity to the spatial pattern obtained from rain gauge data. Despite this initial visual correlation, the raw satellite data showed high negative bias at monthly time resolution. Bias correction factors were computed and, adopting a straightforward procedure, were spatially distributed and then used to improve the TMPA-3B42R data.
The procedure showed an easy yet effective way of re-correcting the bias of the TMPA-3B42R data at the catchment scale, using local calibration points (rainfall ground stations) instead of the global validating ground spots utilized by NASA. The interpolation stages may be strengthened using a comparison of performances between IDW, co-kriging and kriging with external drift (incorporating elevation as a second variable/drift respectively). By means of the hyetographs constructed from rain gauges, the bias-corrected monthly TMPA-3B42R data were disaggregated into a daily resolution. The temporal disaggregation technique was, albeit simple, sufficient to generate synthetic daily series that were quite comparable with the temporal data coming from the measuring stations.
At first, a hydrological model across the upper Vinces catchment was built using only rainfall ground data as an input variable. Results at several locations (e.g., at the Baba, Toachi and Pilaló catchment outlets and at the Quevedo at Quevedo river station) were compared against the river discharge observations and found to be reasonably acceptable. In general, the differences between simulated and observed runoff mainly happened in May and November (during the dry season), probably partly as a consequence of localized stormy events in the Andes, suspicious discharge measurements or the several strong assumptions adopted throughout the model construction.
The corrected TMPA-3B42R data were employed with the rain gauge observations as complementary data sources for the rainfall-runoff representation. This new simulation showed outcomes very comparable with those using only rain gauge information. In spite of a slight decline in the Nash-Sutcliffe number (possibly caused by the simplicity of the empirical temporal disaggregation procedure and to some extent by the spatial resolution of the TMPA-3B42R grid), the mass balance error showed a general recovery when incorporating the satellite-based data into the rainfall-runoff simulation. In general, although the new model's results did not improve remarkably compared with the rain gauge simulation, the validity of this experimental approach should be seen as the development of an alternative source of rainfall information. At a daily timescale, this new and re-corrected source (combined with rain gauge data) can provide reliable rainfall estimates and can help in predicting the hydrology at the catchment scale. Thus, the TMPA-3B42R research information contributed to an enlarged spatial characterization of the scarcely gauged (catchment) area, in this case, the Andean foothills region. Furthermore, the TMPA-3B42R (without rain gauge values) may have the potential to provide useful data at scales larger than the present modeling resolution (e.g., monthly/basin).
An ultimate research opportunity would be to analyze spatial rainfall patterns by combining the TMPA satellite and the ground measurement data during El Niño and La Niña

Table A1. Surface water variables for the Vinces River model in HEC-HMS.

Table A2. Baseflow parameters for the Vinces River model in HEC-HMS.

Table A3. Gage weights for San Pablo-Quevedo subbasin in the Vinces upper catchment model.