Interactive comment on “ Simulating stream flow over data sparse areas – an application of internet based data

large amount of data used. Figure 3 shows that, on a daily basis, the error for all low and normal flows exceeds 100 % of the actual value at each time step. These low and normal flows make up roughly 50 to 70 % of the record, judging from the figure (dry or summer seasons). The high flows are also inaccurate in almost all the graphs. Although the simulation may follow the overall shape if one visualizes a kind of moving average, it is hard to see what use these daily values have, certainly not in flood management: they are of no use for the time to peak, nor for quantifying the peak value itself. Relying on them could lead to very dangerous mistakes. The authors should clarify their comments about the results being promising by explaining how they see this information being used at the daily scale. I also think it is important to assess the problem as a modular system, looking first at low flows and then at high flows.


Introduction
Water resources management is crucial in many countries and gains added importance near border regions between countries. Worldwide, international river basins that span the political boundaries of two or more countries cover about 45 % of the land surface, host about 40 % of the world's population and account for approximately 60 % of global river flow (Wolf et al., 2005). Because governmental policies differ, conflicts arise over the sharing of water resources, more so when the upstream country holds the advantage in using them. The United Nations Educational, Scientific and Cultural Organization (UNESCO) has established a program known as "From Potential Conflict to Cooperation Potential" (PCCP) to facilitate multi-level and interdisciplinary dialogues between countries in order to foster peace, cooperation and development related to the management of shared water resources (UNESCO, 2011).
As one such endeavor, the Mekong River Commission (MRC), formed in 1995, agreed on joint management of the shared water resources among four countries in the Lower Mekong Basin (LMB), namely Thailand, Cambodia, Lao PDR and Vietnam, to coordinate the development of the river's economic potential. However, the activities of the MRC depend heavily on water use in upstream countries such as China and Myanmar. Lu and Siew (2006) showed that the series of dams constructed in China has affected water discharge and sediment flux in the Lower Mekong River over recent decades. Water resource management becomes problematic when no data are available because of poorly managed observing stations, lack of technology and resources, war or financial limitations. Data availability is also an issue when countries have no formal agreement and data sharing between them is difficult. For example, assessing the availability of water resources in Northern Vietnam requires data for the upstream region in southern China, because the water quantity in the downstream Vietnamese region depends on the flow from the upstream Chinese part; these upstream data are not available. This is a clear case of the trans-boundary problem cited earlier. Hence, this paper describes an approach to resolving trans-boundary data requirements in water resources management by employing a hydrological model, the Soil and Water Assessment Tool (SWAT), driven by data available from the internet.
Many research studies that focus on basin hydrology have used the SWAT model to simulate runoff (Easton et al., 2010; Ouessar et al., 2009; Mengistu and Sorteberg, 2011; Pohlert et al., 2007; Cau and Paniconi, 2007; Stehr et al., 2010). Some SWAT-based studies have been carried out in the Southeast Asian (SEA) region. Victor (2009) discussed the potential application of the SWAT model for countries in SEA. The Mekong River Commission (MRC) also used the SWAT model in its Decision Support Framework (DSF) for LMB planning (John, 2008). Nguyen et al. (2010) studied the Vietnamese part of the Da River catchment, using local data to simulate soil erosion. However, SWAT is a physically based model that requires spatial data such as topography, land use and soil maps, together with meteorological data (precipitation and temperature), all of which are difficult to obtain from the local authorities of many countries. The use of internet based data in the SWAT model was first introduced by Van Griensven et al. (2007) for 3 international river basins, the Kagera (Rwanda, Burundi, Uganda and Tanzania), the Blue Nile (Ethiopia) and the Ganges (India, Bangladesh, Nepal), using monthly weather data to quantify the monthly river flow. Later, Rouholahnejad and Abbaspour (2010) [...] Climatology Network version 2) are used in this study. Model sensitivity analysis, calibration and validation are presented. Finally, simulation results are analyzed in order to highlight that data scarcity problems over such trans-boundary regions can be largely resolved using this approach.
2 Model and study region

Study region
The Da River (known as the "Black" River in English or rivière Noire in French) originates in Yunnan Province in China, flows downstream through mountainous regions, crosses the China-Vietnam border and joins the Red River in Vietnam as a tributary. The Da River has a total catchment area of about 53 000 km², of which 48 % lies in China, 2 % in Lao PDR and 50 % in Vietnam (Fig. 1a). The river is 1010 km long, of which 570 km lie in Vietnam, and about 1.3 million people live in the region. The Da River Basin is formed by the high mountain ranges of the region (the Hoang Lien Son mountainous area). Total precipitation ranges between 1000 and 2300 mm per annum. Precipitation is highest over the central part of the catchment, owing to the Hoang Lien Son mountain chain, and decreases towards the upper and lower catchment (Fig. 1b). The Da River has a high annual average discharge of 1770 m³ s⁻¹. During the main flood season (June to October), the discharge accounts for around 78 % of the annual total. [...] forest to cropland, mostly because people burn forest to clear land for agriculture.
The land use data collected from Global Land Cover 2000 are recent and show that about 70 % of the area is cropland whilst another 25 % is forest (Fig. 1c). However, due to limited data availability, we assume that the land cover in the year 2000 is the same as in the computation period (1971-1982). There are two main soil types in this region, Ferric (Af) and Orthic (Ao) Acrisols, which together cover almost 90 % of the area (Fig. 1d). The Da River ranks first in hydropower generation within the Vietnamese hydropower system. The need for modeling arises from two main questions: how to manage the water coming from China, and what the impact will be if dams constructed in China affect the quantity of water flowing into Vietnam.

SWAT model
SWAT [...] (Monteith, 1965). Which method is preferred depends on the inputs it requires. While the Hargreaves method requires only maximum, minimum and average air temperature, the Priestley-Taylor method needs solar radiation, air temperature and relative humidity. The Penman-Monteith method requires the same inputs as Priestley-Taylor plus wind speed. Due to limitations in the available meteorological data, the Hargreaves method is applied in this study. In the SWAT model, the land area in a sub-basin is divided into Hydrological Response Units (HRUs). An HRU is the smallest spatial unit, obtained by overlaying the land use and soil maps so that each unit has a unique combination of land use and soil type. All processes such as surface runoff, PET, lateral flow, percolation, soil erosion, and nitrogen and phosphorus processes are usually computed for each HRU (Arnold and Fohrer, 2005).
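To make the low data demand of the Hargreaves option concrete, the sketch below evaluates the temperature-based form commonly given in the SWAT theoretical documentation, λ·E0 = 0.0023·H0·(Tmx − Tmn)^0.5·(Tav + 17.8). It is only an illustration, not the code used in the study: the function name `hargreaves_pet` is hypothetical, the extraterrestrial radiation `h0` is assumed to be supplied by the caller (SWAT derives it from latitude and day of year), and the latent heat relation is the usual linear approximation.

```python
import math


def hargreaves_pet(h0, t_max, t_min, t_av):
    """Potential evapotranspiration (mm/day) from temperature data only.

    h0    : extraterrestrial radiation in MJ m^-2 day^-1 (assumed given)
    t_max : daily maximum air temperature in deg C
    t_min : daily minimum air temperature in deg C
    t_av  : daily mean air temperature in deg C
    """
    if t_max < t_min:
        raise ValueError("t_max must not be smaller than t_min")
    # Latent heat of vaporization in MJ/kg (linear approximation in t_av).
    latent_heat = 2.501 - 2.361e-3 * t_av
    # Hargreaves relation: lambda * E0 = 0.0023 * H0 * sqrt(range) * (t_av + 17.8)
    energy_term = 0.0023 * h0 * math.sqrt(t_max - t_min) * (t_av + 17.8)
    return energy_term / latent_heat


if __name__ == "__main__":
    # Illustrative warm-season day; the numbers are made up for the example.
    print(round(hargreaves_pet(h0=30.0, t_max=30.0, t_min=20.0, t_av=25.0), 2))
```

Only the three temperature series enter as time-varying inputs, which is why this option suits a setting where radiation, humidity and wind data are unavailable.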

Data
The main spatial data used in this study are taken from several different sources on the internet. The respective sources are cited in the references. The Digital Elevation Model (DEM) was taken from the NASA (National Aeronautics and Space Administration) SRTM 3 arc second product (approx. 90 m), for which digital elevation data were obtained on a near-global scale to generate the most complete high-resolution digital topographic database of Earth (Farr et al., 2000). The land use map was taken from the Global Land Cover 2000 [...] of 10 km with soil properties for 2 layers (layer 1: 0-30 cm, layer 2: 30-100 cm depth).
The cropped region for the Da River Basin includes 6 soil types, as shown in Fig. 1d. Rainfall data were taken from the gridded rainfall product APHRODITE (Yatagai et al., 2009), which has a spatial resolution of 0.25° (~25 km) on a daily time scale. This high resolution gridded precipitation product is available for a long period, from 1957 to 2007, over Monsoon Asia, the Middle East and Russia; the study region is a subset of Monsoon Asia. The suitability of using internet based gridded data in place of station data has been discussed by Vu et al. (2011). Note that the exact rainfall station locations over the Chinese region are not known; station points there are therefore placed close to the center of each sub-catchment delineated in SWAT, whereas the rainfall station locations in Vietnam are known. Daily average, maximum and minimum air temperatures were obtained from the modified Global Historical Climatology Network version 2 (referred to in this paper as GHCN2, for simplicity). This dataset provides historical daily data over global land areas from 1950-2008 with a spatial resolution of 0.5°. It is a composite of climate records from numerous sources that were merged and then subjected to a suite of quality assurance reviews. More information can be found in Adam and Lettenmaier (2003). Daily river discharge was the only dataset obtained from a local authority in Vietnam (Institute of Meteorology, Hydrology and Environment, IMHEN), for two gauging stations from 1971-1990: one downstream, at Hoa Binh, and the other upstream, at Lai Chau. The latter is located within Vietnam, 120 km from the Vietnam-China border, and can be used to verify the discharge coming from the upper part in China. The observed discharge data are used in this paper to calibrate the model, which is driven by the internet based input data described above. A detailed list of input variables and sources is given in Table 1. The model setup and results are described in the following sections.


Experiment methodology and results
The trans-boundary region consists of two parts: the upper part belongs to China and the lower part to Vietnam. As mentioned earlier, the input spatial data (DEM, land use and soil map) described above are cropped to the study region and used as the input datasets for the SWAT model. The whole Da River catchment was divided into 23 sub-catchments, with the Lai Chau discharge station at the outlet of sub-catchment 13 and Hoa Binh at that of sub-catchment 23. Daily precipitation for the whole study period was bilinearly interpolated to 16 stations within the catchment (see Fig. 2). The SWAT model takes measured rainfall from gauged stations as input and then uses a rainfall distribution code (skewed distribution or mixed exponential distribution) to generate precipitation values over the whole catchment (Neitsch et al., 2004). Hence, an interpolation method is required to compute station values (at a particular point) from the gridded observation data. Among the available interpolation methods (piecewise constant, linear, polynomial and spline interpolation), linear interpolation is widely used because of its simplicity and convenience. Bilinear interpolation extends linear interpolation to functions of two variables on a regular grid, and we therefore use it to extract the precipitation value at each station point from the grid. The same approach is applied to the air temperature at the meteorological stations.
[...] upstream station, where the stream flow originates from runoff over the Chinese region of the catchment. Figure 2 shows the locations of the 2 discharge gauging stations (Lai Chau upstream and Hoa Binh downstream), the rainfall stations and the trans-boundary nature of the catchment under discussion. For clarity, the dotted region downstream is the catchment framed by the 2 control stations Lai Chau and Hoa Binh, which lies within Vietnam, and the striped region upstream is the catchment controlled by the Lai Chau station (in Vietnam), which measures stream flow from the Chinese part.
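The bilinear extraction of a station value from a regular grid, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' processing code: the function name `bilinear`, the argument layout (`field[j, i]` indexed by latitude then longitude) and the example coordinates are all hypothetical, and the grid axes are assumed to be strictly increasing.

```python
import numpy as np


def bilinear(lons, lats, field, lon, lat):
    """Interpolate field[j, i] (lat index j, lon index i) to the point (lon, lat)."""
    lons = np.asarray(lons, dtype=float)
    lats = np.asarray(lats, dtype=float)
    field = np.asarray(field, dtype=float)
    if not (lons[0] <= lon <= lons[-1] and lats[0] <= lat <= lats[-1]):
        raise ValueError("target point lies outside the grid")
    # Lower-left corner of the enclosing cell, clipped so i + 1 and j + 1 exist.
    i = min(int(np.searchsorted(lons, lon, side="right")) - 1, len(lons) - 2)
    j = min(int(np.searchsorted(lats, lat, side="right")) - 1, len(lats) - 2)
    # Fractional position of the target inside the cell (0..1 in each direction).
    tx = (lon - lons[i]) / (lons[i + 1] - lons[i])
    ty = (lat - lats[j]) / (lats[j + 1] - lats[j])
    # Linear interpolation along longitude on both cell edges, then along latitude.
    south = (1.0 - tx) * field[j, i] + tx * field[j, i + 1]
    north = (1.0 - tx) * field[j + 1, i] + tx * field[j + 1, i + 1]
    return (1.0 - ty) * south + ty * north


if __name__ == "__main__":
    # A 0.25 degree grid patch with a synthetic field, for illustration only.
    lons = [102.0, 102.25, 102.5]
    lats = [21.0, 21.25, 21.5]
    field = [[x + 2.0 * y for x in lons] for y in lats]
    print(bilinear(lons, lats, field, 102.1, 21.3))
```

For a daily series, the same weights can be reused for every time step, since they depend only on the station position relative to the grid.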
Sensitivity analysis is a method for assessing how sensitive the model output is to each model parameter. In SWAT, there are 26 parameters sensitive to water flow, 6 parameters sensitive to sediment transport and a further 9 parameters sensitive to water quality. The sensitivity analysis built into the SWAT model uses the Latin Hypercube One-factor-At-a-Time method (LH-OAT), which draws on the strength of Latin Hypercube sampling (McKay et al., 1979; McKay, 1988). This has also been highlighted by Van Griensven et al. (2006). The first column of Table 3 ranks the 10 SWAT parameters to which the model output is most sensitive. Auto-calibration using ParaSol is applied to these most sensitive parameters to find the parameter ranges that yield the best agreement with the observed discharge at the gauging station. ParaSol is an optimization and statistical method for the assessment of parameter uncertainty; it can be classified as global and efficient, and it is able to deal with multiple objectives (Van Griensven and Meixner, 2006). The Shuffled Complex Evolution method (SCE-UA), an algorithm that optimizes model parameters (Duan et al., 1992), is used in this study. This methodology has also been discussed by Stehr et al. (2010).
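The idea behind LH-OAT can be illustrated with a short sketch: Latin Hypercube points spread the base parameter sets over the whole parameter space, and at each point one parameter at a time is perturbed to measure its partial effect. The code below is a simplified variant written for illustration, not the routine implemented in SWAT; the function name `lh_oat`, the perturbation by a fraction of the parameter range, the default settings and the toy model are all assumptions.

```python
import numpy as np


def lh_oat(model, bounds, n_points=10, fraction=0.05, seed=0):
    """Return (mean partial effect per parameter, ranking from most to least sensitive)."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    low = bounds[:, 0]
    span = bounds[:, 1] - bounds[:, 0]
    n_par = len(bounds)
    # Latin Hypercube: each parameter range is cut into n_points strata and every
    # stratum is used exactly once, in an independently shuffled order.
    strata = np.array([rng.permutation(n_points) for _ in range(n_par)]).T
    unit = (strata + rng.random((n_points, n_par))) / n_points
    effects = np.zeros(n_par)
    for u in unit:
        base = low + u * span
        m0 = model(base)
        for k in range(n_par):
            # One-factor-At-a-Time: perturb parameter k only (this sketch allows
            # the perturbed value to step slightly beyond the upper bound).
            perturbed = base.copy()
            perturbed[k] += fraction * span[k]
            m1 = model(perturbed)
            mean_out = 0.5 * (m0 + m1)
            if mean_out != 0.0:
                # Relative change in output (percent) per unit relative perturbation.
                effects[k] += abs(100.0 * (m1 - m0) / mean_out / fraction)
    effects /= n_points
    ranking = [int(k) for k in np.argsort(-effects, kind="stable")]
    return effects, ranking


if __name__ == "__main__":
    # Toy model: output depends strongly on p[0], weakly on p[1], not at all on p[2].
    def toy(p):
        return 10.0 * p[0] + 1.0 * p[1] + 5.0

    effects, ranking = lh_oat(toy, bounds=[(0, 1), (0, 1), (0, 1)])
    print(ranking)
```

The cost is n_points × (number of parameters + 1) model runs, which is what makes the scheme affordable for a model with several dozen parameters.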
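The Nash-Sutcliffe Efficiency (NSE) (Nash and Sutcliffe, 1970) [...] and it varies from negative infinity to 1 (perfect match). The NSE is considered to be the most appropriate relative error or goodness-of-fit measure available owing to its straightforward physical interpretation (Legates and McCabe, 1999).
$$\mathrm{CC} = \frac{\sum_{i=1}^{N}\left(o_i-\bar{o}\right)\left(s_i-\bar{s}\right)}{\sqrt{\sum_{i=1}^{N}\left(o_i-\bar{o}\right)^{2}}\,\sqrt{\sum_{i=1}^{N}\left(s_i-\bar{s}\right)^{2}}} \tag{1}$$
$$\mathrm{NSE} = 1-\frac{\sum_{i=1}^{N}\left(o_i-s_i\right)^{2}}{\sum_{i=1}^{N}\left(o_i-\bar{o}\right)^{2}} \tag{2}$$
where o and s are the observed and simulated discharge datasets, respectively, an overbar denotes the mean over the N time steps, and R² is the square of CC.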
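A short numerical sketch of the two indices may help in reading the results that follow. It uses the standard definitions of NSE and of R² as the squared correlation coefficient; the function names and the sample series are hypothetical and are not taken from the study.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation coefficient."""
    cc = np.corrcoef(obs, sim)[0, 1]
    return cc ** 2


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]
    biased = [2.0, 3.0, 4.0, 5.0, 6.0]  # same shape, constant offset of +1
    print(nse(observed, biased), r_squared(observed, biased))
```

In this made-up case the simulated series has the right shape but a constant offset: R² remains essentially 1 while NSE drops to 0.5. R² measures only co-variation, whereas NSE also penalizes systematic over- or under-prediction, which is one reason to report both.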
Results for the daily time scale calibration at the Hoa Binh station for the period 1971-1982 show that the NSE and R² for the calibration part are quite promising, with values of 0.90 and 0.91, respectively (Fig. 3a). This is taken as an indicator of very good performance, because these values were obtained at the daily time scale, where flows are usually far more variable in space and time than at the monthly time scale.
Results for validation at the Hoa Binh station for the period 1983-1989 show NSE and R² values of 0.88 and 0.90, respectively (Fig. 3b), and the verification results for the Lai Chau station show NSE and R² values of 0.83 and 0.86, respectively (Fig. 3c).
A summary of the above results is given in Table 4. The very promising indices for the verification at the Lai Chau gauging station imply that, even where no data are available for the upstream region, inputs from internet sources are good enough substitutes for station data. This is an important finding, especially for trans-boundary areas and for developing countries such as Vietnam and others in the Southeast Asia region, where lack of station data is very common. These results also imply that the approach can be extended to climate change studies, in which high resolution climate models generate important climate variables such as rainfall and temperature that can then be used for hydrological modeling. As such, these internet data sources and available gridded [...]

[...] Floods are severe because very high rainfall is concentrated over steep topography and narrow valleys, leading to very high flood peaks of about 5000 m³ s⁻¹. The basin therefore has high hydropower potential. Two huge dams have been built on the main river: Hoa Binh (installed capacity 1920 MW, in operation since 1989) in the downstream part and Son La (installed capacity 2400 MW, in operation since 2006) in the upstream part, the latter being currently the largest dam in Southeast Asia (Hydroelectric power plants, 2011). Land cover in this region is changing rapidly from [...]
SWAT is a river basin scale model, developed by the United States Department of Agriculture (USDA) Agricultural Research Service (ARS) in the early 1990s. It is designed to work for large river basins over long periods of time. Its purpose is to quantify the impact of land management practices on water, sediment and agricultural chemical yields under varying soil, land use and management conditions. Detailed information and several related publications are available at http://swatmodel.tamu.edu. SWAT version 2005 with an ArcGIS user interface is used in this paper. There are two methods for estimating surface runoff in the SWAT model: the Green & Ampt infiltration method (Green and Ampt, 1911) and the Soil Conservation Service (SCS) curve number procedure (SCS Handbook, 1972); the latter was selected for the model simulation. The retention parameter is central to the SCS method and is defined by the Curve Number (CN), which is a sensitive function of the soil's permeability, land use and antecedent soil water conditions. Potential evapotranspiration (PET) may be defined as the evapotranspiration from a large vegetation-covered land surface having adequate moisture at all times. The SWAT model offers three options for estimating PET: Hargreaves (Hargreaves et al., 1985), Priestley-Taylor (Priestley and Taylor, 1972) and Penman-Monteith [...]
[...] Products, having a spatial resolution of 1 km with 22 distinguished land use classes. The land use for the Asia region as a whole was downloaded and cropped to the study region. Table 2 shows the land use/land cover types and corresponding area percentages in the Da River Basin. The land use classes are grouped into 6 categories, of which Agricultural Land-Row Crops and Forest-Evergreen cover the majority, with area percentages of 62.7 % and 37 %, respectively. The soil map was taken from the FAO-UNESCO Digital Soil Map of the World (FAO, 2003) at 1 : 5 000 000 scale, in geographic projection (latitude-longitude), intersected with a template containing water-related features (coastlines, lakes, glaciers and double-lined rivers). There are around 23 soil types with a spatial resolution [...]
The experimental study consists of three parts: (1) Set up the SWAT model for the Da River Basin using the internet based data and calibrate it using APHRODITE daily rainfall and GHCN2 temperature for 11 yr from 1971-1982, with the first year as a warm-up period. The observed daily data from the Hoa Binh gauging station were used as the benchmark against which SWAT performance was compared. (2) Validate the SWAT model for the same region for the 7 yr from 1983-1989, again using the Hoa Binh station as the benchmark. (3) Verify the model using the upstream station at Lai Chau for the calibration period 1971-1982. This step was intended to show that the model performs well not only for the downstream station but also for the [...]
[...] and the Coefficient of Determination R² (Krause et al., 2005) are used as the benchmarking indices for the simulated runoff. R² is the square of the correlation coefficient (CC) from Eq. (1) and NSE is calculated from Eq. (2). R² ranges from 0 to 1, where 1 is the best match. The NSE shows the skill of the estimates relative to a reference [...]
[...] observations can be used to validate and evaluate the variables generated by climate models. Such climate model derived estimates can then be used to quantify future stream flow changes under climate change. Just as bilinear interpolation is used here, the climate model derived variables can be bilinearly interpolated to the station locations and then used in the SWAT model to study hydrological responses.

4 Conclusions
This study simulates the daily stream flow of the Da River in Vietnam over an 11 yr period between 1971-1982 using the SWAT model across a trans-boundary region between China and Vietnam. The chosen period ensures that no dam had yet been built on the main river, which allows the natural flow of the river to be reproduced. Due to the lack of spatial and weather data from the Chinese regions, internet based data have been used in the modeling. Daily gridded rainfall is also used as input to the SWAT model, whose results are compared against observed gauging station data in Vietnam. Three scenarios have been run: calibration and validation for the Hoa Binh station and verification for the Lai Chau station. The NSE and R² indices are very promising, with values higher than 0.85 in most cases, indicating very good performance of the SWAT model. The results of this study also indicate that internet based data are applicable to hydrological modeling of large scale watersheds, especially for regions where spatial and temporal data are scarce or sensitive, as in the trans-boundary problem discussed here. This approach also has implications for climate change applications, where daily rainfall and temperature could be obtained from high resolution regional climate models for present-day and future climates, from which the hydrological responses may be ascertained.

Figure 1: (a) Da River basin location and DEM (b) Total annual rainfall distribution map (c) Land use map (d) Soil map


[...] the main contribution of this paper is that it uses spatial data on a daily time scale together with meteorological data available from the internet. In particular, the use of daily time scale data is deemed more robust than modeling approaches which use monthly time scales for such hydrological modeling, and this is the first study of its kind over the Da River trans-boundary region. Globally available gridded observation data downloaded from the internet include topography (from SRTM [...]
[...] applied this approach over the Black Sea Catchment, which contains 19 European and Asian countries, to quantify hydrological components of water resources (surface runoff, deep aquifer recharge, soil [...]

Table 4. Model evaluation statistical indices for daily discharge for 3 simulation scenarios.