Climatology-based regional modelling of potential vegetation and average annual long-term runoff for Mesoamerica

. Mean annual cycles of runoff, evapotranspiration, leaf area index (LAI) and potential vegetation were modelled for Mesoamerica using the SVAT model MAPSS with different climatology datasets. We calibrated and validated the model after building a comprehensive database of regional runoff, climate, soils and LAI. The performance of several gridded precipitation climatology datasets (CRU, FCLIM, WorldClim, TRMM, WindPPT and TCMF) was evaluated and FCLIM produced the most realistic runoff. Annual runoff was successfully predicted ( R 2 =0.84) for a set of 138 catchments, with a low runoff bias (12%) that might originate from an underestimation of the precipitation over cloud forests. The residuals were larger in small catchments but remained homogeneous across elevation, precipitation, and land-use gradients. Assuming a uniform distribution of parameters around literature values, and using a Monte Carlo-type approach, we estimated an average model uncertainty of 42% of the annual runoff. The MAPSS model was most sensitive to the parameterization of stomatal conductance. Monthly runoff seasonality was mimicked “fairly” in 78% of the catchments. Predicted LAI was consistent with MODIS collection 5 and GLOBCARBON remotely sensed global products. The simulated evapotranspiration:runoff ratio increased exponentially for low precipitation areas, high-lighting the importance of accurately modelling evapotran-spiration below 1500 mm of annual rainfall with the help of SVAT models such as MAPSS. We propose the ﬁrst high-Correspondence


Introduction
Mesoamerica has a population of around 60 million people living in its 1 million square kilometres.The region comprises of eight countries or part countries (Southern Mexico, Guatemala, Belize, El Salvador, Honduras, Nicaragua, Costa Rica and Panama) with a high diversity of human development and environmental conditions.It is also an integrated region due to shared economic development plans (the Puebla-Panama Plan1 ), conservation goals (the Mesoamerican Biological Corridor2 ), and international catchments, some of which have potential use conflicts (Wolf et al., 2003).Water issues drive many aspects of human wellbeing and national development.For instance, most Central American countries rely heavily on hydroelectric energy and on irrigated agriculture (Siebert and Döll, 2001;Kaimowitz, 2005).
Given the importance of understanding hydrological regimes and river runoff for better water management and hydro-power planning, there is a demand for quantitative knowledge on regional hydrological resources and water budgets (Griesinger and Gladwell, 1993;Nijssen et al., 2001).Many studies have analysed runoff (Zadroga, 1981; P. Imbach et al.: Regional modelling of vegetation and long-term runoff for Mesoamerica Abbot, 1917;Niedzialek and Ogden, 2005;Thattai et al., 2003) and groundwater (Calderón Palma and Bentley, 2007;Genereux and Jordan, 2006) on a local scale in Mesoamerica.Cloud forests have received special attention (Cavelier et al., 1996(Cavelier et al., , 1997;;Clark et al., 1998;Holder, 2003Holder, , 2004)).Nevertheless, there is a need for understanding the hydrological regimes and vegetation with a regional scope and fine resolution (typically 1 km 2 ).Global runoff simulations (Fekete et al., 2002) poorly represent the Mesoamerican region.Continental scale simulations conducted for other regions (Gordon et al., 2004;Vörösmarty et al., 1989) have an unsuitable spatial resolution (0.5 degree) for the complex topography and climate of Mesoamerica.
Stochastic hydrological modelling has been the preferred approach for its high-precision simulation of water budgets within catchments.However, this empirical approach requires a long-term series of climate and runoff data for calibration and it is not generic.Alternatively, choosing a process-based modelling approach allows for scaling-up from catchments to regions even in areas where runoff data are missing.Process models can also be forced by climate or land-use change scenarios to assess future impacts of climate change on water availability.Neilson (1995) developed a soil-vegetation-atmosphere transfer (SVAT) model with outputs such as water balance partition, runoff, evapotranspiration, leaf area index and potential vegetation cover.This model, called MAPSS (Mapped Atmosphere Plant Soil System), has been validated on a large scale for the USA (Neilson, 1995;Bishop et al., 1998) and provided a good trade-off between the accuracy of the outputs (vegetation and hydrology) and the amount of input and validation data required.MAPSS static biogeography approach allowed modelling components of both hydrology and vegetation for a region where high resolution climate time series are not available and runoff data for validation is not available or easily accessible.The main assumptions of MAPSS are as follows.(1) Potential vegetation cover can be simulated solely on the basis of climate and soil (texture, depth and rock content) data.Land use does not need to be forced into the model, which is of a considerable advantage in areas where detailed land use maps are missing.(2) The resulting water balance partitioning is a fairly good proxy for the actual runoff for most basins.(3) Water storage can be neglected on an annual basis.(4) Evapotranspiration, which is estimated through explicit ecophysiological modelling, is a key component of the water balance partitioning, particularly for dry areas.
To our knowledge, no high resolution (1 km 2 ) model-tomap runoff, evapotranspiration and vegetation in Mesoamerica exists.Thus, the goals of this study were: 1. to calibrate and validate the MAPSS model long-term average runoff outputs using a set of representative Mesoamerican catchments; 2. to evaluate the uncertainty of simulated runoff, the model sensitivity to its parameters and the model-data misfit (residuals) distribution; 3. to validate the modelled long-term average leaf area index and potential vegetation distribution using remotely sensed data (MODIS and GLOBCARBON); 4. to map long-term averages of regional runoff, evapotranspiration, leaf area index and potential vegetation for Mesoamerica at a working resolution of 1 km 2 .

Region description
The study area spans continental land within 6.5-22 • N and 76.5 -99 • W. This one million square kilometre area covers Southern Mexico to the north and the seven Central American countries down to Panama to the south.This region has a highly complex biophysical environment; Hastenrath (1967) describes it as structurally rich in coastlines and plains, with high mountains and plateaus exerting a large influence on climate.The main topographic feature is a mountain range that reaches over 4000 m a.s.l. and runs close to the Pacific coast with few interruptions (Fig. 1a).
The amplitude of the annual cycle of surface temperature is smaller in the tropics than at temperate latitudes.By contrast, precipitation is highly variable.Seasonal precipitation is determined by the Inter-Tropical Convergence Zone (ITCZ), which brings convective rains.Winds coming from the Caribbean interact with mountains and coastlines, further increasing seasonality (Nieuwolt, 1977).The result is a high variability of precipitation over short distances, with humid windward mountains and coastal areas, and dry leeward valleys (Fig. 1b).Therefore, convective rains dominate the Atlantic watersheds and orographic rains have a higher contribution in the Pacific watersheds (Shultz, 2002;Guswa et al., 2007).
The rainy season lasts from May to October (Hastenrath, 1967).The distribution of precipitation is bimodal, with two maxima during June and September-October, and a distinctive relative minimum in between called the "mid-summer drought" (Magaña et al., 1999).Runoff follows precipitation inputs because most rivers in the region are rain-fed.The longest and largest rivers are on the Atlantic side (Griesinger and Gladwell, 1993).
The vegetation of the Pacific watersheds and northern part of the Yucatán Peninsula is that of a tropical climate with summer rain (Schultz, 2002).The vegetation of the Atlantic watersheds is tropical with year-round rain.In pristine Pacific areas, there are savanna grasslands with tree densities dependent on available moisture.In the Atlantic areas, evergreen forests dominate.Anthropogenic influence has reduced natural vegetation to 58% of the total area.Rainfall, Hydrol.Earth Syst.Sci., 14, 1801Sci., 14, -1817Sci., 14, , 2010 www.hydrol-earth-syst-sci.net/14/1801/2010/ vegetation and high radiation, much of it being diffuse due to high cloud cover, lead to high annual evapotranspiration rates over 1000 mm (Shultz, 2002).

MAPSS model description
MAPSS simulates potential vegetation cover and leaf area index (LAI), given light and water constraints based on the climatology and soil type at each site.In the following paragraphs, we present a general overview of the model based on Neilson (1995) excluding snow processes which are not relevant to our study area.We will focus on how the MAPSS model estimates evapotranspiration, runoff, leaf area index (LAI) and vegetation types.These four quantities are the model outputs presented in this paper.
The MAPSS model determines the long-term climax vegetation of a site, assuming that vegetation in equilibrium with climate will maximize the use of available water.The climax vegetation is characterised by the leaf area index of trees, shrubs and grasses supported.Leaf area index is related to canopy transpiration rates that affect the amount of soil water.The model estimates the LAI equilibrium value through successive iterations in which grasses and woody vegetation compete for light and water, depending on soil available water, until finding the point at which most of the water is used and LAI is maximized.If wilting points are passed, the LAI is decreased and LAI is increased if soil water is not consumed to near matrix potential.Grass growth is limited by the shade of trees and the equilibrium LAI, therefore, is a mixture of maximum potential LAI for grasses, shrubs and trees that will consume all soil water above wilting point during an average (climatology) dry season.The type of vegetation (trees, shrubs and grasses) is called "life forms" in the following.
Precipitated water is intercepted and evapourated by the canopy while the throughfall is partitioned in surface runoff and infiltration depending on soil water content.The soil is divided into three layers and infiltration is separated into saturated and unsaturated percolation.Transpiration by woody (trees and shrubs) and grass vegetation is a function of stomatal conductance and LAI.Stomatal conductance is modelled to decrease with soil water potential (depending on soil types) and with increasing potential evapotranspiration (PET), and is used to estimate canopy conductance based on life forms LAI.
The model simulates "climax" vegetation in equilibrium with climate and its local water balance, excluding interannual variability.Therefore, the model calculates the water budget of each grid point forced by a monthly climatology of precipitation, temperature, wind speed and vapour pressure deficit.Values for elevation, soil depth and texture are prescribed at each grid point.Given these processes, MAPPS produces a climatology of the partitioning of rainfall in each grid point into monthly runoff and evapotranspiration.A modelled average-year represents a multi-year average of available climate data (see period for each variable in Table 1).In the following paragraphs, we present an overview of the modelled processes.
Canopy interception loss (Int c ) is a function of the number of rain events (N p ) in each month and of leaf area index (LAI) (Eqs. 1 and 2).Rain events are linearly related to monthly precipitation (P t ) while canopy interception loss is constrained by a potential maximum under closed canopy (Int m ) and determined according to actual LAI: (1) Where k e is a constant (units of number of events per mm of monthly precipitation) and N pmax is a maximum number of rain events predetermined for summer frontal and winter stratiform precipitation (Eq.1).Actual LAI from grasses (LAI G ) and woody vegetation ( LAI W ) are separately accounted for (Eq.2).
After interception loss, throughfall precipitation is available for infiltration and surface runoff (superficial and macropore flow).Surface runoff depends on the amount of water in the top-soil (above matrix potential) and of soil diffusivity as a function of soil texture and wetness.Water that does not runoff is infiltrated and its transfer is regulated by saturated and unsaturated percolation processes according to Darcy's Law (Hillel, 1982).Field capacity, saturated water holding Hydrol.Earth Syst.Sci., 14, 1801-1817, 2010 www.hydrol-earth-syst-sci.net/14/1801/2010/ capacity and wilting points are estimated based on the thickness of soil layers and texture.Water infiltrates from one layer to another only if the water content of the deeper layer is lower than its saturated water holding capacity.Saturated and unsaturated percolation flows depend on the difference between current soil water content and water content at field capacity and wilting point, respectively.Both processes are standardized by the water content in a saturated soil and calibrated by constants for each soil layer separately.Water movement in the soil is further explained by Bachelet et al. (1998).
The ratio of actual to potential evapotranspiration (PET) is life form specific.This ratio increases exponentially with LAI.PET is calculated using climate and an aerodynamic turbulent transfer model adapted from Brutsaert (1982).The main components of the model estimate the transfer of momentum, heat and water vapour (sensible and latent energy fluxes) at the surface for each pixel based on air density, surface roughness length (function of life form), air and surface temperature and humidity, and wind speed.Wind speed affects canopy resistance, whose turbulent interaction is scaled by the roughness length of each life form.The heat and mass transfer coefficients are originally from Brutsaert (1982) and were calibrated to monthly averaged temperature and humidity gradients (Neilson, 1995), since the model does not explicitly simulate the diurnal cycles which are normally the dominant features in turbulence.
Actual transpiration (AT i ) increases with canopy conductance (Co ci ), which depends on LAI and vegetation specific stomatal conductance: (3) The coefficient k att accounts for the attenuation of light, wind and humidity throughout the canopy (Neilson, 1995).Soil water potential depends on soil texture and water filled pores, according to Saxton et al. (1986).The sensitivity of stomatal conductance to PET is controlled by a parameter, the wilting point, and maximum stomatal conductance.Grasses only have access to water from the top soil layer.Woody vegetation (including trees and shrubs) can access the top and intermediate layers, whereas the deepest layer is used for base-flow having no roots.Light competition between woody vegetation and grasses is based on an inverse linear relationship, in which LAI of trees increases and that of grasses decreases up to a threshold where grasses are eliminated and the canopy is closed: Where the maximum LAI allowed under light competition (LAI glite ) depends on the LAI of woody vegetation (LAI t ), the upper limit of grass LAI (LAI gmax ) and the lowest woody LAI affecting light competition (LAI t0 ).Below LAI t0 grass LAI is not light limited and above LAI tfull grasses cannot be supported.
Finally, vegetation physiognomy is hierarchically classified with rules based on life forms' LAI (grasses, shrubs and trees), leaf form (broadleaf or microphyllous), and phenology (evergreen or deciduous) of woody vegetation combined with their thermal zones (subtropical and tropical in this study, based on a mean annual temperature threshold related to frost presence).Classification rules for forest types are provided in Neilson (1995).
To summarize, soil types define the available water that regulates transpiration rates, which in turn modifies the water balance (including runoff and evapotranspiration) and the values of possible LAI of different life forms.MAPSS assumes that the annual water storage change in soil and aquifers is close to zero, which is true for the most part on an annual basis or in catchments characterised by a high superficial runoff-to-infiltration ratio.Runoff is estimated as follows: Where, R is runoff, P is precipitation, E is evapotranspiration, I is interception loss, and s is the water storage in soils and aquifers.
A detailed model description, on which we based this section, including model equations and default parameters, is given by Neilson (1995).

Model set up and input data
We implemented MAPSS combining different data sources for the climate variables required (Table 1).The temperature forcing is a monthly climatology (for the period  from the global WorldClim dataset (Hijmans et al., 2005).We tested several precipitation climatologies (see section on precipitation for a detailed description of each dataset) but the FCLIM climatology is used unless otherwise stated.FCLIM is a climatology developed specifically for the Mesoamerican region based on data from the period 1960-2000.Wind speed data and mean vapour pressure were estimated from CRU CL2.0 data (New et al., 2002(New et al., ), a 1961(New et al., -1990 global climatology at 10 minutes (vapour pressure was estimated following the dataset guidelines based on relative humidity and saturation vapour pressure data).Our implementation was made from the resolution of the temperature forcing data (1 km 2 ).In cases where input data had a coarser resolution (i.e.wind speed data) it was re-sampled to 1 km 2 (see Table 1 for data description).

Precipitation
Meteorological forcing data have a strong influence on the model's performance and uncertainty (Linde et al., 2008).This is particularly true in the topographically complex Mesoamerican region.Uncertainties in model parameters as well as in the climate input data can have an influence on the model hydrological output (Arnell, 1999) 1).These precipitation datasets were: -CRU CL 2.0: the coarsest dataset used, based on interpolation of weather station data using latitude, longitude and elevation as co-predictors (New et al., 2002) -WorldClim: developed from weather station data, interpolated at high resolution and accounting for elevation (Hijmans et al., 2005) -FCLIM: developed specifically for Central America by interpolation of weather station, distance to southern coastline, elevation and precipitation data from remotesensing sources by the Climate Hazard Group at the University of California in Santa Barbara -Wind PPT (modelling wind-driven precipitation): modelled with the TRMM dataset using wind speed and direction, as well as terrain conditions (slope, aspect and topographic exposure) (Mulligan, 2006) -TRMM: developed from two remote-sensing sources (a passive microwave radiometer and a scanning radar) to estimate rainfall (Mulligan, 2006) -TCMF: a 10% increase factor was applied to precipitation in the FCLIM dataset over areas covered by cloud forests (from a map developed by Mulligan and Burke, 2005).As clouds go through forests in these areas, water is intercepted by the vegetation and adds to the total amount of water available for runoff production.This intercepted water is not regularly captured by rain gauges.The increase in value was arbitrarily selected from a range of interception values of 6-35% of total rainfall (Bruijnzell, 2005).
There was no regional dataset of precipitation observations independent of those used to generate the precipitation maps available, so it was not possible to validate the datasets used in this study.

Runoff observation at catchment scale
A new runoff dataset was created from data with different levels of temporal resolution (daily, monthly and annual) and different series length collected from several institutions across the Mesoamerican region.For some catchments, the years used to estimate the runoff average were not available, but only the number of years averaged (see Table 1).The catchment boundaries of each runoff station were delineated using the SRTM (Shuttle Radar Topography Mission) 90 m digital elevation model (Jarvis et al., 2008).Stream flow data in cubic metres per second were converted to depth values in mm by normalizing the flow with the catchment area above the measurement point.
We selected 138 catchments (from a total of 466 with available runoff data) across the region (Fig. 1a and b) using the following criteria: (1) Retain only catchments with an annual runoff-toprecipitation ratio less than unity, thus, excluding catchments where either precipitation interpolation or runoff data are miscalculated.In these cases, long-term runoff average is larger than the precipitation climatology, and we could exclude errors of larger-than-real runoff records, or lower-than-real precipitation interpolations in the climate forcing.The first case could be a problem of reporting one year runoff average as a long-term record and the second, a deficiency in the coverage of weather station data used to interpolate the map of precipitation climatology; (2) Exclude catchments with water bodies larger than 1% of the catchment area that could represent regulated catchments.This simple criterion excludes catchments with potential reservoir and flow regulation projects for irrigation or hydro-electricity production that modify runoff, and are outside the scope of the study.The only systematic method to filter out these managed catchments was to estimate the size of the water bodies from a land cover map (see land cover map details in the modelling vegetation and LAI section).The assumption behind this criterion is that such projects require large projects/reservoirs that appear on a land cover map; (3) Retain catchments with available runoff data series longer than 15 years (Gerten et al., 2004) to minimize the effects of sub-sampling inter-annual variability (Hartshorn, 2002;Aguilar et al., 2005).We applied this criterion to have a closer match between the length of the period averaged in the input (climate) and validation (runoff, LAI) data and, thus, reducing mismatches due to inter-annual variability.
We refer to this dataset as the Long Time Series Average dataset (LTSA).Another runoff dataset was constructed without criterion iii), leaving 251 catchments.This larger dataset is called the Time Series Average dataset (TSA).TSA results are presented separately since they are based on a larger dataset for calibration and validation, but could be biased by some catchments characterised by short dry or wet periods.
The area of each LTSA catchment ranged from less than 100 km 2 to 15 378 km 2 (Fig. 2a).Since the difference between potential and actual vegetation may affect model performance, we corroborated that these catchments represent the full range of natural vegetation cover (Fig. 2b).Selected catchments are representative of the study area in terms of their precipitation and mean elevation (Fig. 1a and b), except for areas with annual precipitation smaller than 500 mm (as the lowest annual precipitation in the catchment dataset is 578 mm) which cover 6% of the region.
Hydrol.Earth Syst.Sci., 14, 1801-1817, 2010 www.hydrol-earth-syst-sci.net/14/1801/2010/The runoff data were collected for a time period overlapping the time period of climate data (see data description in Table 1) and any bias due to a mismatch between time periods of input and validation data was ignored.In addition, no significant trends in mean precipitation have been observed in Central America between 1960 and 2005.However, in the same period mean temperature has been observed to increase (Aguilar et al., 2005) but its impacts on runoff have not been analysed for Mesoamerica.
On a monthly basis, the storage term s of Eq. ( 14) can represent a substantial fraction of the total water budget in some catchments.These catchments are associated with a high coefficient of variation in their monthly R:P ratio (CV-R:P).Thus, we performed a monthly analysis of the modeldata comparison only in catchments with a CV-R:P <0.5.This threshold was selected to minimize the effect of high s variability, while keeping at least half of the catchments for analysis.Using this criterion, a monthly model-data comparison is possible for 63 catchments in the LTSA dataset (94 in the TSA dataset).

Model calibration and validation
Calibration and validation of MAPSS was performed with annual runoff data using a split-sample test (Klemeš, 1986;Xu, 1999;Xu and Singh, 2004), i.e. by randomly selecting half of the catchments for calibration and the other half for validation.This test is a common approach for splitting data into calibration and validation sets, either spatially or temporally (Motovilov et al., 1999;Wooldridge and Kalma, 2001;Donker 2001;Guo et al., 2002;Xu and Singh, 2004;Linde et al., 2008).This calibration and validation method was selected due to the diversity of biophysical conditions present in our runoff dataset.
Before model calibration, we examined results from the uncalibrated MAPSS parameterization to look for runoff under or over prediction and bias due to non-potential vegetation in the catchments.The potential forest vegetation (modelled by MAPSS) has higher transpiration rates than actual vegetation that also includes pastures and croplands that can modify runoff (Neilson, 1995;Haddeland et al., 2007;Gordon et al., 2005).Since we found no bias due to the percentage of natural cover in the catchment, we assumed that our modelling approach was not sensitive to differences in evapotranspiration rates between natural and current vegetation cover.We could then make adjustments to obtain a regression curve with a slope close to 1 and an intercept of 0. We calibrated the model by manually adjusting parameters controlling transpiration and soil layer thickness until modelled runoff matched the observations.Parameters selected for manual adjustment (Table 2) included: (1) Total soil layer thickness: we increased this parameter relative to its MAPSS default value in our manual calibration procedure, to account for deep rooting (Schenk and Jackson 2002;Ichii et al., 2007) (2) Stomatal conductance: this parameter was also increased to reduce runoff and match the data (Ray Drapek and Ron Neilson, personal communication) (3) The wilting points of trees and grassy vegetation: were decreased to match runoff data.

Sensitivity tests
We performed three sensitivity tests.The first test used FCLIM precipitation with MAPSS original parameters.The second test was based upon FCLIM with a calibrated MAPSS version (see Sect. 2.3.4Model calibration and validation).The third was based upon FCLIM with a compilation of national soils maps (NS) to evaluate the effect of highresolution soils data (no data were available for Nicaragua and Belize).The NS soil parameters (texture and soil depth) compilation was made by digitizing country-wide soil maps and analysing their technical documentation to estimate soil texture and depths to bedrock (see Table 1 for references).Gaps in information were filled with data from a global soils map (FAO, 2003).

Model performance and efficiency
Several indices were used to compare observed against modelled annual and monthly runoff values for each catchment.
The LTSA dataset contains 128 catchments with monthly data, for which model performances were estimated using the Nash-Sutcliffe efficiency coefficient (NS, Eq. 6) (Nash and Sutcliffe, 1970) and Kendall's ranked correlation coefficient (τ , Eq. 7) (Guo et al., 2002;Boone et al., 2004;Gordon et al., 2004;Quintana Seguí et al., 2009) as follows: Where Q oi and Q mi are the observed and modelled runoff values at time step i, respectively, and Q o is the average observed value.Performance NS or τ WB Very Good > 0.9 < 5% Good 0.8-0.95-10% Fair 0.8-0.5 10-25% Poor < 0.5 > 25% Where n c and n d are the number of concordant and discordant pairs, and the denominator is the total number of possible pairings.NS and τ coefficients were rated according to Moriassi et al. (2007).The NS efficiency assesses the match between modelled and observed monthly values.A value of 1 indicates a perfect match, while a value of 0 means the model is as poor a predictor as the mean of the observed data.Negative values indicate the mean of observed values performs better than the model (Table 3).Kendall's coefficient is calculated with ranked monthly values, to assess how the seasonal variation is mimicked by the model.The coefficient ranges from 1, indicating a perfect agreement between the two rankings, to −1, indicating a perfect disagreement (one ranking is the opposite of the other).

Performance of vegetation and LAI modelling
A crucial part of how MAPSS determines runoff is the relationship between actual transpiration and LAI, since it not only determines water available for runoff, but the potential vegetation type that can be supported on site.For this purpose, two observed LAI datasets were chosen to assess model output performance (Table 1): EOS-Terra-MODIS (Yang et al., 2006a) and the GLOBCARBON-ESA European Remote Sensing (ERS-2) ENVISAT-SPOT sensors (Plummer et al., 2006) (referred to as MODIS-LAI and GLOBCARBON-LAI, respectively).
Both LAI datasets were used to test whether the model was simulating runoff under realistic conditions of vegetation leaf area, as this can be a relevant factor affecting spatial and temporal variability of runoff at large scales (Peel et al., 2004).Both LAI products rely on actual, not potential, land-cover maps: MODIS-LAI is based on an eight-biome map derived from MODIS data (Friedl et al., 2002) and GLOBCARBON-LAI on the Global Land Cover 2000 (GLC2000) map from SPOT-VEGETATION satellite (EC-JRC, 2003) (with approximate resolutions of 8 and 1 km, respectively).Comparisons were made only in pixels where each land-cover map matched the ecosystem type on the Central America Ecosystem map (leaving 17% of the total area for comparison) (World Bank and CCAD, 2001) over pristine ecosystem classes (i.e.excluding agricultural areas).Only pristine Hydrol.Earth Syst.Sci., 14, 1801Sci., 14, -1817Sci., 14, , 2010 www.hydrol-earth-syst-sci.net/14/1801/2010/ classes on the ecosystems map were used because MAPSS simulates potential vegetation and its LAI is not comparable with the LAI of agricultural lands.We used this map as a reference because there are discrepancies in forest types for areas in the region between the land cover maps for the LAI products and the Central American Ecosystems.Misclassification of forest types can lead to important differences in LAI values from remote-sensing sources.In addition, this map is based on extensive field work and high-resolution imagery (28.5 m pixel) from Landsat TM.Misclassifications could be due, in part, because within the studied region there is only one validation point for the GLC2000 land cover underlying the GLOBCARBON-LAI product (Mayaux et al., 2006), and none for the land-cover map underlying the MODIS-LAI (MODIS Land Team, 2009).Additionally, both landcover maps have a much lower resolution (1 km 2 ), due to its global nature, than Landsat TM.A comparison of the landcover sources used in the two LAI products shows the best agreement on the Atlantic side of Mesoamerica and larger differences on the Pacific side, Southern Mexico and southern Panama (Giri et al., 2005).

Uncertainty analysis
Analysis of uncertainty from model parameters was performed on the basis of Zaehle et al. (2005) and using Sim-Lab 2.2.1 software 3 .The uncertainty analysis is based on the probability distribution functions (PDFs) of model parameters and their effect on model output.The PDFs for 61 parameters of model components controlling rainfall interception, evapotranspiration and soil site conditions were built on the basis of a literature review of field studies.To be conservative in the uncertainty assessment, a uniform distribution was assumed for all parameters within the range of values found in the literature.A 30% variance was assumed for conceptual parameters that are used to simplify complex processes and are not measurable in experiments (Zaehle et al., 2005; see Table 1 in Supplemental material).
The space of parameter PDFs was explored using a Monte Carlo-type approach, the Latin-Hypercube sampling (LHS) method, to build a stratified sample of random sets of parameter values.The LHS method has the advantage of building a stratified representation of all parameters with a reduced variance due to additive effects of parameters on model output.The parameter sample consisted of 610 parameter combinations to be tested in model runs.Some runs had parameter combinations outside model boundaries, leading to crash-runs, leaving a total of 456 runs, well within the recommended number of between 3/2 (92 runs) and 10 times (610 runs) the number of parameters (EC-JRC, 2009).

Performance of calibrated model assessed by statistical tests
Calibration and validation of total annual runoff for LTSA catchments showed good overall agreement across the whole range of runoff values (1-4774 mm) and an underestimation of modelled runoff of around 12% (Fig. 3).In turn, calibration and validation results for the TSA dataset were also satisfactory, but modelled annual runoff was underestimated by approximately 20% (data not shown, N = 251, Slope = 0.81, Intercept = 36 mm and R 2 = 0.78).A similar trend was obtained when MAPSS simulated runoff for the USA, in this case because the approach was sensitive to evapotranspiration differences between actual and potential vegetation (Neilson, 1995).Vörösmarty et al. (1989) found a similar trend when coupling water balance and water transport models for a large-scale application in South America.In our case, we suspect that underestimation might come from the precipitation forcings used (see Sect. 3.4 Residuals distribution and uncertainty analysis).
We split the dataset into two classes by rainfall category, above and below 2000 mm of annual rainfall, which roughly divided the data points evenly (i.e.half above and half below 2000 mm) and in summer and year-round rainfall areas (see vegetation types under Sect.2.1 Region description).We then found that model performance was poorer in dry areas (71 catchments with an R 2 of 0.22, slope of 0.32, and intercept of 143.7 mm) than in wetter areas (67 catchments with an R 2 of 0.60, slope of 0.67, and intercept of 525 mm).Similar poorer performance in dry catchments was found by Gordon et al. (2004) when evaluating six terrestrial ecosystem models in the USA, although the runoff range they analysed is much smaller than that for Central America.Poorer model performance is probably due to the effect of greater uncertainties in precipitation, including rainfall frequency and local heterogeneity (rainstorms) of runoff in drier regions due to nonlinearity of the runoff generation process (Fekete et al., 2002).It is, thus, possible that in dry areas, the observed runoff is dominated by few daily events of intense precipitation that cannot be captured in our working monthly time steps.
The monthly model performance, according to classes for NS and τ statistical criteria (Table 3), was fair or better for 46% and 78% of catchments, respectively (Table 4).Annual runoff was modelled fairly or better for 48% of the catchments (Table 4).In general, our model performance was similar to that of other studies (Artinyan et al., 2008;Linde et al., 2008), but slightly less than that found over France with the SIM model (Quintana Seguí et al., 2009, 61% fair or better).Given the very large number of small catchments, complex orography and climate uncertainties found in Mesoamerica, we considered our model performance satisfactory.

Performance of the model assessed by comparison with a "poor man" model
We compared the results of MAPSS with those of a "poor man" model, where the runoff is modelled to be proportional to annual rainfall only, that is runoff = alpha × rainfall.We tested all potential of alpha, and the performance was always poorer than that of MAPSS, irrespective of the statistical criteria used.This test shows that useful information is contained in the MAPSS parameterization that improves the simulation of runoff in Central America, even though the model is based on potential vegetation and runs on a monthly time step.

Modelled versus observed LAI distribution
Figure 4a shows the modelled LAI by MAPSS, and the difference between modelled LAI and observed from MODIS and GLOBCARBON (Fig. 4b and c, respectively).Agricultural areas and other disturbed land-cover types were excluded from the LAI comparison.Over naturally vegetated areas, the comparison was made over pixels where both the land-cover map used to generate MODIS and GLOBCAR-BON LAI products matched the vegetation type on the Central American ecosystems map (used as the reference for the vegetation type, as explained in the section on performance of vegetation and LAI modelling).Both criteria define the excluded areas category in Fig. 4b and c.The general feature is an under-prediction of LAI in the northern part of the Mesoamerica region and an overprediction in the south.Discrepancies appeared when comparing MODIS and GLOBCARBON LAI (Fig. 4b and c).This could be due to misclassifications of land cover, particularly between classes with different architecture and foliage optics (Myneni et al., 2002;Yang et al., 2006a).Atmospheric and cloud conditions are also problematic over tropical areas, where MODIS-LAI values are calculated by the simpler NDVI (Normalized Difference Vegetation Index) based backup algorithm for bad atmospheric conditions (Myneni et al., 2002).Yang et al. (2006b), for example, found that broadleaf forests (which cover most of our study area) can be underestimated by as much as 3.4 LAI units.Accordingly, we found that areas with under-prediction are dominated by the main algorithm (used with good atmospheric conditions) and those with over-prediction by the backup algorithm, except for small areas in southern Panama.

Residuals distribution and uncertainty analysis
There was no systematic trend in the residuals of runoff as a function of annual precipitation, elevation or percentage of potential vegetation cover (Fig. 5a, b, and c, respectively).However, larger residuals were found for small catchments (<1000 ha or 10 pixels), probably due to larger uncertainty in the delineation of each catchment.
The model systematically underestimated runoff by around 12% (Fig. 3).We did not detect a positive trend in the residuals with decreasing potential vegetation cover (Fig. 5c).Consequently, it appears that the 12% underprediction in annual runoff cannot be attributed to the potential vegetation cover assumed by MAPSS (Neilson, 1995).We explored the possibility of missing rainfall due to cloud forest horizontal interception (Bruijnzell, 2005;Holder, 2004;Zadroga 1981) and results showed an improvement when this type of forest is accounted for (see Sect. 2.3.4Sensitivity tests).Nevertheless, in Fig. 3 we included an estimation of changes in modelled runoff (using Eq. 1) due to a ±10% difference in evapotranspiration to account for the mismatch between current and potential vegetation.
Figure 6 shows annual runoff results for each catchment from the uncertainty analysis, along the annual precipitation range.For each catchment, we show the range of values modelled by 456 parameter combinations from the set of parameter samples built with the LHS method.The average range of modelled annual runoff values within one standard deviation is within 36% of the total modelled range Hydrol.Earth Syst.Sci., 14, 1801Sci., 14, -1817Sci., 14, , 2010 www.hydrol-earth-syst-sci.net/14/1801/2010/ and equals 42% of the observed annual runoff.No apparent trend in uncertainty was found along the precipitation range, suggesting a constant effect of parameters uncertainty along the dry-to-wet gradient.Our uncertainty assessment could be used in future studies of the region for comparison of the suitability of different model structures and the benefits of adding data (Beven, 1993), by quantifying and comparing changes in the prediction uncertainties.Furthermore, it could help quantify the uncertainty of the impacts of climate change in the runoff of Mesoamerica when added to uncertainties from future climate scenarios.

Seasonal bias
Figure 7 shows the seasonality of precipitation and runoff for two selected catchments with different storage terms ( s in Eq. 14).In Fig. 7a, monthly modelled runoff mimics the observed time course, which corresponds to a situation with small s.In Fig. 7b, the modelled time course crosses the observed curve twice, indicating a period of excess water accumulation in the basin between July and November and a period of excess discharge later on.Zadroga (1981) found a similar situation in Costa Rica by analysing runoff and weather station data across seven watersheds.Similar results were also found by Heyman and Kjerfve (1999) in Belize, probably due to the release of water from limestone aquifers.In Nicaragua, Calderón Palma and Bentley (2007) identified shallow local recharge-discharge systems and a deep system that recharges in higher mountains and discharges in the central and lower plains.Moreover, using isotopes in Costa Rica, Guswa et al. (2007) showed that orographic precipitation (wind-driven precipitation and fog interception) contributed to the dry season base flow and the delayed contribution of the rainy season precipitation to dry season streamflow.MAPSS showed good performance on an annual time scale.Given that MAPSS does not simulate the groundwater storage processes controlling the s seasonal variation, it remains valid on a monthly time scale only for the selected catchments where the storage term is not significant.MAPSS monthly performance analysis falls to 26% (fair) when considering all catchments, but increases to 46% after excluding those with a significant storage term (see Sect. 2.3.5 Model performance and efficiency).Since spatial data on groundwater recharge for the region is not yet available, the area of applicability of monthly and seasonal MAPSS runoff outputs remains a subject for further research.

Sensitivity to different precipitation input datasets
The modelled standard deviations were lower than observations (A) and correlation coefficients were similar for all precipitation forcing datasets (Fig. 8).The TRMM (G) and Wind PPT (H) had good correlations (0.84 and 0.85, respectively), but lower regression slope among the datasets (0.66  2000and 2009, and GLOBCARBON between 1998and 2007. .and 0.64), probably due to extreme precipitation variations (Fekete et al., 2002), with differences of more than 1000 mm over large areas compared to other datasets (data not shown).
www.hydrol-earth-syst-sci.net/14/1801/2010/ Hydrol.Earth Syst.Sci., 14, 1801-1817, 2010   On the basis of the uncertainty analysis runs, we also assessed the model sensitivity to its parameters.We estimated the Ranked Partial Correlation Coefficient (RPCC) for model parameters based on average annual runoff of the study area in each run of the uncertainty analysis (Zaehle et al., 2005).MAPSS was most sensitive to the parameter that sets the ceiling for maximum stomatal conductance for all vegetation types (RPCC = 0.47).

Regional mapping
After the model calibration and validation steps, we simulated runoff across the entire region and analysed model outputs.Modelled runoff and evapotranspiration maps show a mean annual runoff and evapotranspiration of 552 and 1200 mm, respectively, with highest values distributed mostly in the southern part of the region and in mountain areas in the north Pacific side (Fig. 9a and b, respectively).We explored the water balance partitioning along the annual precipitation gradient.Figure 10 shows the relationship between the evapotranspiration-to-runoff ratio (E/R) and precipitation classes, each E/R value being an average for 100 mm annual precipitation classes.Below the 1500 mm annual precipitation threshold, evapotranspiration becomes a key component of the water balance, justifying the need of a SVAT model such as MAPSS for reliable modelling of the annual water balance.
Potential vegetation types are also simulated by the model and correspond to forest types that appear along available humidity across the year from evergreen forests to dry tropical savanna (Fig. 11).

Conclusions
We calibrated and validated the SVAT hydrological model MAPSS (Neilson, 1995) for the Mesoamerican region at 1 km resolution, after building a new database of observed runoff of 466 catchments.Herein, we presented a regionally calibrated version of MAPSS and output maps of runoff, evapotranspiration, leaf area index (LAI) and potential vegetation.
Runoff prediction performed similarly to other large-scale studies.A general underestimation of runoff has been attributed by Neilson (1995) in temperate conditions to the fact that MAPSS simulates potential vegetation.However, our residual analyses did not find the same cause.We suspect    that large horizontal interception of precipitation could play an important role in tropical mountain areas.
MAPSS simulation of monthly runoff was consistent only in catchments where the storage term ( s) was not significant, as this component is not simulated by the model.Availability of spatial information to estimate s is a crucial limitation to improve monthly performance of the model.
Accounting for wind and cloud forests in precipitation estimates improved results, indicating the importance of horizontal precipitation interception for runoff generation in our study area.
Modelled LAI was consistent with remotely sensed observations (MODIS and GLOBCARBON), except in humid areas of Mesoamerica, where high LAI has been measured directly in the field and cloud cover is frequent.In these areas, remotely sensed LAI is known to have lower quality estimates, although modelling errors cannot be discarded as part of the issue.
It is important to use a SVAT model to explicitly model actual evapotranspiration, especially in drier areas with below 1500 mm of annual precipitation, where evapotranspitation represents a very large fraction of the water balance.
Future steps with our calibrated MAPSS version will focus on simulating the impacts of climate change on water balance and vegetation of the Mesoamerican region.

Fig. 1 .
Fig. 1.Maps of the digital elevation model (a) and annual precipitation (b) in the Mesoamerican region and in Long Time Series Average (LTSA) and Time Series Average (TSA) catchments.

Fig. 2 .
Fig. 2. Percentage distributions of Long Term Series Average (LTSA) catchments according to size of catchment (a) and percent of catchment under natural vegetation cover (b).

Fig. 3 .Fig. 3 .
Fig. 3. Observed versus modeled annual runoff of catchments used for calibration Fig. 3. Observed versus modelled annual runoff of catchments used for calibration (N =69, black dots), validation (N=69, gray dots), and all (N =138).Each dot represents observed average annual runoff for one catchment.The error bar represents the effect on modelled runoff of a ±10% change in catchment evapotranspiration to account for the mismatch between current and potential vegetation cover simulated by MAPSS.

1812P.Fig. 5 . 976 Fig. 6 .
Fig. 5. Residual distributions of annual runoff according to catchment annual precipitation (a), average elevation (b), percentage of natural vegetation cover (c), and size (d).Each dot corresponds to the difference between MAPSS-modelled and observed annual runoff for each catchment. 981

Fig. 6 .
Fig. 6.MAPSS model runoff uncertainty obtained by a Monte Carlo-type approach (Latin Hypercube Sampling).For each catchment, the black line shows the whole range of predicted values and the box ranges within one standard deviation from the mean.Catchments were ordered according to annual precipitation.(Horizontal axis does not show all catchment values).

Fig. 7 . 990 Fig. 8 . 994 Fig. 8 .
Fig. 7. Two examples of contrasting seasonal catchment behaviour, according to the water storage term from Eq. (14): (a) a catchment without significant storage term (San Juan, Panama), and (b) a catchment (Los Cañones, Panama) with recharge during the rainy season (July to November) and discharge later on.Rainfall (gray straight line), observed (black straight line) and modelled (black dashed line) monthly runoff.

Table 1 .
Data sources for model input, calibration and validation.

Table 2 .
Modified parameter values from the original MAPSS configuration for calibration.

Table 3 .
Categories of model performance with the NS (monthly match), τ (monthly match of ranked values), and WB (bias in annual runoff) statistical measures.

Table 4 .
Modelled runoff qualified by percentage of catchments in each performance category for the LTSA dataset and, in parenthesis, the TSA dataset.