Improving estimates of water resources availability over North Tropical South America: comparison of two satellite precipitation merging schemes

Low-density precipitation measurements impair the ability of hydrological models to estimate surface water resources accurately. Remote sensing techniques and climate models can help to improve the estimation of the space-time rainfall variability. However, they alone are not good enough to be used in surface models built to support water management. In this research, we test the improvement of rainfall field estimation by using hydrological modelling based on the premise that a higher hydrological performance generally implies that precipitation is more consistent with streamflow observations and evaporation estimates in the basin. The SWAT model was forced with two satellite and rain gauge blending techniques and with the traditional IDW deterministic interpolation method from stations. The three simulated streamflows were compared separately against observed records. We do not only focus the comparison on one hydrological performance metric but also conduct a deeper evaluation using several hydrological signatures and statistics. We included the bias, the temporal correlation, the relation of general variability, and an analysis of the Flow Duration Curves (we found that low and medium segments were estimated correctly, whereas the high segments were underestimated). We conclude that either combination technique has its advantages over the other and that both outperform the performance achieved by the IDW in most of the defined criteria, with an overall 10% improvement and with individual streamflow gauge performance enhancement up to 50%.

Mejoras en estimativos de oferta del recurso hídrico en el norte de Suramérica tropical: comparación de dos procedimientos de mezcla con lluvia satelital

Introduction
There is an increasing need to acquire hydrological information in small basins or ungauged river reach for water resources assessment and water-related policy implementation.Hydrological modelling has been used as an alternative in determining water resources when streamflow and level measurements are scarce (Silberstein, 2006).Estimated precipitation (Ppt) fields are used in distributed hydrological and land surface models, and their influence is significant in model performance (Arnaud et al., 2002).However, precipitation gauges in areas with complex spatio-temporal behaviour are lacking, which can compromise models' ability to represent the system appropriately.Moreover, several knowledge areas require distributed rainfall estimates to understand hydro-bio-geochemical natural processes and connections.Climate classification indices (Beck et al., 2018;Rodríguez et al., 2020), and environmental conditions monitoring need accurate rainfall fields to perform a proper policy implementation.Evaluation of the uncertainty in hydrological modelling due to rainfall is another kind of study (Muñoz et al., 2015).
Rain gauge networks have been the traditional instrument for measuring rainfall or precipitation.The World Meteorological Organization (WMO) guidelines advise on their characteristics, yet networks need to be continuously adjusted to quantify rainfall variability at different temporal and spatial scales (Blöschl & Sivapalan, 1995).Deterministic or geostatistical methods are used to distribute the measured values (Grimes & Pardo-Igúzquiza, 2010).However, some places are almost inaccessible or financial resources are low enough to deter the expansion of the network, which could also impair its maintenance; as a result, rainfall records have decreased worldwide (García et al., 2016).Satellite (BARRETT, 1970), atmospheric reanalysis (Baatz et al., 2021), and multi-source merged products (Beck, Van Dijk, et al., 2017), are alternatives that can help tackle the lack of point measurements, albeit they often suffer from bias and temporal errors (Dinku et al., 2010).Those errors must be corrected for applying those distributed Ppt estimates in local hydrological modelling and forecasting (Serrat-Capdevila et al., 2014).
Rain gauges are generally considered the ground truth, although they might suffer from undercatch.Hence, correcting the alternative measurements with rain gauges can improve the accuracy of the distributed estimates (fields not produced initially with those specific rain gauges).The correction can also be done with downscaling methods based on physiographic variables like Digital Elevation Models (Long et al., 2016), vegetation indices (Immerzeel et al., 2009), or a combination of them (Ceccherini et al., 2015).The direct merging with rain stations has relied on: bias correction through coefficient multiplication or addition (Vila et al., 2009), a direct or linear combination of coefficients found with inverse variance weightings (Woldemeskel et al., 2013), spectral analysis combination (Heidinger et al., 2012), conditional merging based on geostatistics (Ehret et al., 2008), Bayesian conditioning (Todini, 2001), filtering functions (Li & Shao, 2010) and recently, Random Forest Machine Learning (Baez-Villanueva et al., 2020).
Basin-specific studies that compare data merging schemes are scarce (Chua et al., 2022;Nerini et al., 2015;Ur Rahman et al., 2020).There is a plethora of methods for merging satellite precipitation and rain gauges to improve hydrological modelling performance, as described above.Some studies are focused on comparing radar-based and station blending rather than satellite-based estimates (Goudenhoofdt & Delobbe, 2009;Habib et al., 2012;Nanding et al., 2015).In general, other studies are more focused on the evaluation of the available global satellite or reanalysis data (Beck, Vergopolan, et al., 2017;Sun et al., 2018), against local data and far few use hydrological modelling to recently intercompare the improvement of the merging techniques as far as the authors know.
The objective of this study is to evaluate two precipitation merging schemes and their respective improvement in local hydrological modelling performance in a recent time period (1980 to 2019).Rather than compare global precipitation products that may be created with rain gauges, we aimed to create local or regional products and analyse their performance in hydrological estimation.We focused on blending satellite-based and rain gauge estimates, as those are the most common precipitation datasets available worldwide and compared them with a traditional interpolation method.

Case of study and data
The Middle Magdalena Valley is an important bio-economical and energy production region in Colombia where many stakeholders use the water resource.Oil palm plantations are one of the region's most extensive land covers, but other agricultural and livestock developments are also present and use water.Three hydrological watersheds comprise the right margin of the Middle Magdalena Valley, represented in hydrological management subzones according to the Instituto de Hidrología, Meteorología y Estudios Ambientales (IDEAM, 2013).Sogamoso River Basin is an interbasin whose model considered streamflow gauges as inputs (see Fig. 1).They were hydrologically modelled to fulfil the research objectives.

Hydrometeorological data
Point measurements are mainly managed by the IDEAM, but recently, they are also gathered by regional environmental authorities.The IDEAM freely and publicly deliver the data through a web portal (IDEAM, 2023) and the regional authorities by direct request; time series of meteorological variables provided by regional authorities were analysed and discarded since their records were only available after 2013.Relative humidity, solar radiation and wind speed records have many missing values and hence were not considered; we used the precipitation, minimum and maximum temperatures time series managed by IDEAM.Anomalous data were checked and retrieved.We analysed streamflow records, also looking for anomalous data and homogeneity in the series (Castro & Carvajal Escobar, 2010); some gauges were discarded due to the high number of missing values.Data statistics are listed in Table 1.
Improving estimates of water resources availability over North Tropical South America: comparison of two satellite precipitation merging schemes

Alternative hydrometeorological data to bolster hydrological modeling
We used the Climate Hazard Group Infrared Precipitation with Stations (CHIRPS) to enhance precipitation representation.CHIRPS is a multisource rainfall product that considers infrared cold cloud duration measurements and ground correction with some rain gauge data (Funk et al., 2015).It has a 0.05° spatial resolution at a daily time scale in the period 1981-present and has been used in several hydrological studies, even in Colombia (Kaune et al., 2019).This was the product used with the rain stations and the merging algorithms for the comparisons made in this study.Another adapted version of CHIRPS that uses rain gauges for bias correction at 10-day periods has also been used in Colombia for estimating the general water budget (IDEAM, 2019), called CHIRPS-IRE by IDEAM.
Due to the lack of appropriate measurement of some meteorological variables that influence evapotranspiration, atmospheric reanalysis data had to be collected to supplement the deficiency.Reanalysis are numerical systems that incorporate mainly simulations of physical systems and assimilation of satellite, gauge or radiosonde observations (Baatz et al., 2021).We chose to use the Climate Forecast System Reanalysis (CFSR) (Saha et al., 2010) because it has already been used and recommended to be used with the Soil & Water Assessment Tool (SWAT) model, with good results (Dile & Srinivasan, 2014;Tan et al., 2021); its data can be acquired from the SWAT webpage of the Texas A&M University (Texas A&M University, 2022).

Methodology
We evaluated the congruence of rainfall fields with streamflow records and the performance of the hydrological model in the region.This is based on suggestions made by other researchers stating that a higher hydrological performance (consistency with streamflow observations and evaporation estimates in the basin), generally implies that the evaluated precipitation field is more accurate than the one with a lower streamflow performance (Beck, Vergopolan, et al., 2017).However, we did not only focus on one hydrological performance metric and hence intended to conduct a deeper evaluation (see section 3.3).In this section, we describe the merging and benchmark rainfall fields, the hydrological modelling platform that we calibrate and validate for each rainfall field, and the evaluation of its robustness (see Fig. 2).

Precipitation merging schemes and interpolation benchmark
The purpose of the rainfall fields is to force the hydrological model.Due to the subbasins' size, the field's resolution was established at cells of 1 km.
After the spatial interpolation or merging, the time series in each 1 km cell were aggregated using each of the subbasins' boundaries to create the forcing for the SWAT model (see section 3.2); the aggregation used the mean value of the cells within a subbasin boundary.

Inverse Distance Weighting (IDW)
IDW is one of the most used interpolation methods in environmental sciences (Webster & Oliver, 2007), and specifically to interpolate rainfall fields.It has been highlighted that it produces as good as or the best estimates of rainfall fields even compared with geostatistical methods that work better in very high-density networks (Duque-Gardeazabal & Rodríguez, 2023;López López et al., 2018;Vargas et al., 2011).Therefore, using in situ data, we used it as the benchmark field for the hydrological modelling.We fix the exponent of distance weighting in the value of two because it is the most used exponent in rainfall interpolation and other environmental variables.This interpolation method was applied to rain gauges and temperature data, yet a linear regression against ground altitude was considered for temperature in each time step.

Random Forest Merging Precipitation (RFMEP)
RF-MEP is a procedure that combines precipitation information from ground-based stations with precipitation products and topographic information (Baez-Villanueva et al., 2020).It considers that: i) ground-based measurements are reliable at the point level, ii) precipitation products from satellites or reanalysis are biased but contain interesting information on spatial distribution, and iii) the combination of precipitation products and ground-based data can generate a better representation of the spatial distribution of precipitation.
This procedure has been proposed to improve the spatial representation of precipitation in areas with a scarcity of in situ measured information through an ensemble learning method called Random Forest (RF) (Breiman, 2001).A RF regression is applied in each time step (e.g., daily, monthly), which is trained with ground-based observation of the same time step.Due to the fitting necessary for training the forest, the method needs enhanced computation resources and parallel processing is available for taking advantage of them (Zambrano-Bigiarini et al., 2020); yet it is still computationally demanding.Even with parallel processing, we needed to create the rainfall field first at 0.05° (~5 km) and afterwards bilinearly rescale it to 1 km.

Double Smoothing Merging (DS)
This merging scheme is based on data assimilation techniques and does not rely on second stationary assumptions made by geostatistical methods (e.g., kriging and its derivatives) (Li & Shao, 2010).Its main characteristic is the interpolation of residuals obtained by subtracting the rain gauge values from a background field (i.e., satellite or reanalysis), which makes it suitable for sparse design rain gauge networks.After the interpolation of residuals, they are again subtracted from the background field to perform the merging; this makes it flexible and simple for implementation.
The interpolation is performed with kernel density estimation functions, of which the Gaussian kernel is the most used due to its infinite support and simplicity (one parameter).Its parameter has been studied regarding its change against the density of rain gauges (Duque-Gardeazábal et al., 2018); therefore, this was the function used in this study.The influence of the parameter in hydrological simulation performance has also been studied (Duque-Gardeazabal & Rodríguez, 2023).Based on that research, we fixed the parameter's value to 17.5 km.

Soil and Water Assessment Tool (SWAT)
SWAT is software for continuous hydrological modelling consisting of several modules representing physical process-based formulations (Arnold et al., 1998(Arnold et al., , 2012)).It is spatially semi-distributed, runs at daily time scale, and considers a subdivision in Hydrological Response Units (HRUs) for enhanced surface-climate interactions representation in homogeneous areas.That makes it suitable for eco-hydrological studies (impact of land management, sediments, agriculture, etc.), climate change and scenario-based assessments.It has also been applied in tropical conditions and in Colombia (Caro Camargo & Velandia Tarazona, 2019; Uribe et al., 2018).
The main physical law in SWAT is the water balance, which is represented in several compartments associated with the hydrological cycle.For potential evaporation, it can consider a simplified energy balance using the Penman-Monteith equation (the chosen one) or the simplified Prestley-Taylor equation; it can also use the empirical Hargreaves equation.It considers a plant growth model that afterwards affects the water balance in the soil profile; this allows the tool to appropriately simulate watersheds with several land covers.Moreover, it simulates other processes such as surface runoff, infiltration using CN curve (Mosquera et al., 2022), percolation, lateral flow, ponds, routing through channels, among others (Neitsch et al., 2011).
To build the models, we used the ASTER Global DEM from NASA (NASA/METI/AIST/Japan Spacesystems & U.S./Japan ASTER Science Team, 2019) to create the subbasin boundaries and the slope classification to define HRUs.Regarding the soil properties, we reclassified the soil information gathered by the Agustin Codazzi Institute (Post et al., 2000;Saxton & Rawls, 2006) that is available in the regional soil studies (mainly Santander, Norte de Santander and Cesar) (IGAC, 2015).Land cover/use was reclassified to SWAT's classification from Colombia's Corine Land Cover maps (IDEAM, 2015).The models were calibrated and recalibrated for each rainfall field (details of calibration periods for each of the three watersheds are in the Supplementary material).
Calibration of parameters was performed using the SWAT-CUP application (K.C. Abbaspour et al., 2015), and using a split-parameter structure (Francés et al., 2007).The split-parameter structure consists of calibration factors that multiply the pre-assigned parameters based on soil and land cover properties (often called relative calibration factors).The latter ensures that the model conserves the spatial variability of parameters.
We first perform semi-manual trials using the limited parallel processing in SWAT-CUP; the latter is to find a region of the parameter space that achieves good performance but follows what we believe is the near-real physical behaviour of the catchments.Second, the Particle Swarm Optimization (PSO) (Kennedy & Eberhart, 1995) calibration algorithm was set to search for other nearby regions of the parameter space and refine the finally chosen parameters.To achieve that aim, at least 60 particles with at least 30 evolution steps were used (in some streamflow gauges, we used 100 particles with 40 evolutions).The objective function selected to guide the algorithm was the Kling-Gupta Efficiency (KGE) (Gupta et al., 2009), calculated with simulated and observed streamflow.After the completion of the algorithm, the tested sets of parameters were filtered to choose one set with high performance but with credible values.

Robustness of hydrological representation
Model performance can be estimated with traditional metrics such as the Nash-Sutcliffe Efficiency (NSE) (Nash & Sutcliffe, 1970) or the KGE to determine the general behaviour of the model.However, the representation of the whole hydrological system needs to be assessed to have confidence in its modelled estimates.Hydrological signatures evaluation is needed to perform what some call the Diagnostic Model Evaluation (Gupta et al., 2009), a diagnosis which is needed to improve the deficiencies of simulated fluxes.This must be done on both daily and monthly scale as the model represents processes at daily time scale and its use for water resources management needs to represent monthly scale accordingly.
Although the general calibration of the model parameters was conducted with the KGE, we also analysed the metric sub-components represented in three signatures (Gupta et al., 2009).The percent bias (PBias) serves as indication of water balance general agreement (precipitation, evaporation, and runoff); the determination coefficient (R 2 ), assesses the temporal representation of streamflow and indirectly of precipitation; and the relative variability of simulated and observed streamflows (alpha).That allowed us to identify whether the rainfall fields were temporally similar to the actual field.Moreover, the representation of model precipitation transformation into streamflow was assessed by comparing Flow Duration Curve (FDC) percentiles; the latter was performed visually and numerically through the Root Mean Square Error (RMSE, or difference RMSD) of FDC percentiles.A previous comparison of the three rainfall fields was also performed.

Precipitation Fields Comparison
We performed a comparative analysis between the rainfall fields without assuming which of them is the true field.Those differences can be spatially initially analysed with the long-term annual rainfall values maps, depicted in Fig. 3.The zones with similar values between the three maps are in the high area of the Lebrija and Sogamoso basin.In contrast, the north of the Lebrija and the west of the Sogamoso basin have higher values in the RFMEP and even higher in the DS compared with the IDW.The most noticeable difference is in the south, in the highlands of the Opón basin, where the values of the DS field are quite higher than the other two forcings (the RFMEP also has higher values than the IDW); that zone almost does not have rain gauges (see Fig. 1).Considering that we did not calibrate the bandwidth value of the DS (as Duque-Gardeazabal & Rodríguez (2023) did), it is clear that the method has a heavy influence from the background field.
The differences in the temporal dimension are also important.Therefore, we compare some features in each subbasin's time series.Fig. 4 depicts the metrics calculated between daily forcing's Ppt time series for each subbasin (each black dot is a subbasin).It shows their distribution through violin plots with quartiles as horizontal lines and the 5 % and 95 % percentiles.R 2 plot shows that all fields are temporally similar (most of subbasins coefficients are higher than 0.9), but RFMEP and IDW are the most (the mean value and the distribution is higher than the other two comparisons).Regarding the Percent Bias (PBias), the field with the higher volume is the DS which is on average 12.41% higher compared to the IDW and 12.38% compared with RFMEP; RFMEP and IDW are quite similar in volume (with a distribution between -15% and 10%), being the RFMEP 1.2% lower on average.In all volume comparisons, there are subbasins where one forcing is lower than the other and vice versa.The relation of temporal variabilities (alpha) shows that most subbasins with the RFMEP and IDW rainfall fields have very similar temporal variability (near 1.0), and the variability of the DS is higher than IDW and RFMEP variability.As a consequence of all this behaviour, the RMSE or Difference (RMSD) is lower between the RFMEP and IDW than between the DS and IDW; similar values are reported when comparing RFMEP and DS, yet in some subbasins the DS is more similar to IDW.

Streamflow Evaluation
Once the model was configured and calibrated, its performance was evaluated.The final values of the parameters for each forcing and each part of the models are shown in tables S2, S3 and S4 of the supplementary material (SM); it can be seen that the relative percentage of change of the soil parameters is low and the groundwater parameters are within physical range.As a result, Fig. 5 displays the daily streamflow performance measured by the NSE coefficient classified in the calibration and validation period for each forcing input in the models (the distribution of results is shown on the right side).Overall, the use of merged precipitation products increases the hydrological performance, and in certain cases, there is a considerable improvement (see gauges 23197680, 23197400 and 23197270, whose improvements were 27%, 56%, 52%, respectively, with RFMEP).The distribution of RFMEP values in the validation period is higher but not as significant as for those specific gauges since some gauges reported a reduction of performance (most of them already had values lower than 0.5).The monthly performance is shown in the SM; according to D. N. Moriasi et al. (2007)), the 50% of streamflow gauges can be classified as satisfactory, 25% as good or very good (three as very good with the RFMEP or the DS forcing and one with IDW).
The conjunction of three hydrological signatures expressed through the KGE is shown in Fig. 6 (daily performance can be seen in the SM).From these results, we can see that KGE informs about the general representation of the observed streamflow, but it is necessary to evaluate the other hydrological signatures that comprise it to achieve a diagnostic evaluation of the model.Gauges 23187040 and 23197130 reported low values, but we needed to identify the reason for those results (in the validation period, gauge 23197400 has few records and hence does not display a proper evaluation).
Regarding the evaluation of the temporal variation between simulated and observed streamflow, Fig. 7 shows the daily coefficient of determination for the different forcings.Some simulations were significantly improved by using the RFMEP or the DS rainfall fields instead of the IDW (gauge 23197680 which already had a satisfactory correlation and was improved to good category).That evidenced the contribution of temporal information by CHIRPS and the RFMEP method for creating a proper field.Interestingly, the DS deteriorated some performances in gauges with good results and improved others, causing that overall performance to be similar to the IDW.Monthly results are displayed in the supplementary material (SM).Another metric analysed was bias error, Fig. 9 depicts its results which can be related to the general water balance (this metric gives the same results for daily and monthly scale due to its calculation).Gauges 23147020, 23147040, 23187040 and 23197130 have a higher bias (mainly negative) than the other streamflow gauges.Still, almost all values are within plus or less 25%, which classifies them as at least satisfactory results, according to D. N. Moriasi et al. (2007).Those basins do not have many rain gauges inside their limits, which could influence the model's performance.Although all forcings form a distribution, RFMEP concentrates more gauges near 0% in the calibration periods, yet its average is further than the DS and IDW due to an extreme value.
An analysis of the general signatures in the streamflow was conducted with the FDC and is depicted in Fig. 10 for three selected gauges.Graph a) reveals a streamflow gauge where low, medium and high discharge values are similar; b) shows a general good agreement with measurements except in the high flows; whereas c) displays a gauge where only medium flows are well represented by simulations.That behaviour repeats in other gauges where the model was evaluated, with a tendency to misrepresent high flows while having similar results for medium and low flows.These results might stem from the use of the KGE metric for broadly calibrating the model; other researchers have stated the challenge of calibrating a model that satisfactorily reproduces high, medium and low phases (Pfannerstill et al., 2014) and specific calibration methodologies are designed to achieve precise objectives (K.Abbaspour et al., 2017).
The FDCs shown in Fig. 10 were analysed through the RMSE.RMSE corresponds to the error between the daily simulated and observed FDC for each forcing (calculated using the 100 percentiles as point entries).In Fig. 11, the percentiles were calculated using all the available data to ensure a proper statistical estimation of values.The streamflow produced with the IDW appears to slightly better represent the percentiles of streamflow, yet that produced by the merged forcings represents better those stations with lower values of annual streamflow (e.g.23197680, 23197700, 23197270).In those streamflow gauges with high annual values (e.g.Sogamoso River or La Colorada River), the differences between simulated and observed streamflow are higher, and the IDW tends to better represent the percentiles in some cases nearly followed by the RFMEP.Although gauges over the Oro and Lebrija rivers have a considerable annual flow (23197290 & 23197370), the difference in the FDC percentiles is quite small and lower in those coming from merging than in the IDW.The distribution of the RFMEP values is slightly lower under the median than in the other two forcings.

Conclusions
In this research, we have tested the ability and influence of different rainfall fields to properly represent surface water availability variation over time.The physical representation of hydrological components in the SWAT tool was key to ensure a fair comparison between different forcing inputs.The evaluation was performed using several streamflow signatures and the FDC to retrieve insights from various hydrological phenomena rather than just a general metric performance.
Overall, the two merging techniques slightly increase the estimation of streamflow.The mean performance improved by 10%, with higher or lower improvements depending on the evaluated streamflow gauge, most of all in the temporal and magnitude of the variability.Considering individual streamflow gauges, the enhancement in NSE was up to 50%.Regarding the bias, it is unclear that merging techniques concentrate the values around 0%.The main conclusion of this study is that, whenever possible, hydrological models should use several rainfall fields to achieve the best modelling agreement with observations, being the merging of sources a tool worth trying.
Regarding the FDC and the streamflow signatures associated with it.The calibration of the continuous hydrological model was not specifically focused on representing the high segment of the FDC; hence, the differences in that segment can heavily influence the value of RMSE used to compare the simulated and observed FDC.As the main objective of model building was to reproduce the general water budget and variability, it is understandable that some hydrological signatures have big differences in some streamflow gauges, and specific methodologies can tackle particular objectives.
From the comparison of merging methods and the IDW, we retrieve a special conclusion in the south of the Opon River basin.Considering the lack of rain gauges in that zone, the streamflow evaluation is a good alternative for evaluating the rainfall fields estimated by merging and the interpolation methods.Some of the metrics in the 23187020 gauges are better with the merged rainfall fields, and in some cases, the difference is substantial.Another important improvement is the better relation of streamflow variabilities, which in the case of the DS concentrates the distribution towards the value of 1. Regarding the bias, the DS reports a slightly better agreement than the RFMEP, but RFMEP has a better temporal correlation.
It is clear that the DS merging method is highly influenced by the background field (CHIRPS).The change in the background field due to the bandwidth value apparently did not affect too much those areas with low rain gauge density, generating that the background field had an even heavier weight in the final value of the field.An increase in bandwidth value could reduce the general bias, but it will also reduce the high spatial variability that one expects in daily rainfall fields.
The small difference in performance in the validation with respect to the calibration period when comparing forcing might be explained by the fact that more rainfall records are available in the validation period.This induces a short improvement of the rainfall fields by using the CHIRPS background field in that period of time.The improvement is higher in the calibration period since there are fewer rain gauge records.
Future work could include using other background fields to improve the estimates (e.g.MSWEP, PERSIANN, etc).Moreover, other meteorological variables can benefit from satellite-based estimations, for instance, temperature.
A call for open access to local hydrometeorological data is another conclusion of this study and a reminder that rain gauges are what we consider the ground truth.While the plethora of rain gauge time series are not shared with international organisations that create satellite or reanalysis-based global products, local or regional merged datasets can improve hydrological studies estimates, such as the one exposed here.However, global or continental products are highly useful for transboundary watershed studies; those could be better when considering more local data.

Figure 1 .
Figure 1.Location of watersheds, their hydrological network, and in situ hydrometeorological data

Figure 3 .
Figure 3. Mean annual precipitation fields for the period 1983-2020 generated with IDW, RFMEP and DS.

Figure 4 .
Figure 4. Metrics of comparison between the precipitation fields.Black dots represent the value for each modelling subbasin, and the colour dots indicate the mean of those values.Horizontal lines inside the violins depict the data's 5, 25, 50, 75 and 95% percentiles.

Figure 5 .Figure 6 .Figure 7 .
Figure 5. Daily NSE values for each streamflow gauge and the distribution of their values, for calibration and validation period

Figure 8 .
Figure 8. Relation between simulated and measured streamflow variability at monthly time scale and the distribution of their values, for calibration and validation period

Figure 9 .Figure 11 .
Figure 9. Percent bias between simulated and measured streamflows and the distribution of their values, for calibration and validation period

Table 1 .
Available point measurement in the region