IPH-Hydro Tools : a GIS coupled tool for watershed topology acquisition in an open-source environment

A delimitação de bacias hidrográficas, geração da rede de drenagem e determinação de características hidráulicas de um rio de interesse são partes importantes de estudos na área de hidrologia. Atualmente muitas dessas informações são obtidas com o processamento de modelos digitais de elevação (MDEs) em softwares comerciais de SIG, como o ArcGIS e o IDRISI. Por outro lado, pacotes de SIG para uso livre, ou seja, gratuitos e de código aberto, têm aumentado significativamente nos últimos anos, e as vantagens desses pacotes incluem ampla distribuição e customização, desenvolvimento continuado pela comunidade de usuários e atendimento a necessidades específicas. Este trabalho apresenta o pacote livre (open-source) denominado IPH-Hydro Tools, um conjunto de ferramentas acoplado ao software livre MapWindow GIS criado para facilitar a aquisição de informações topológicas em bacias hidrográficas, bem como realização de etapas de pré-processamento em modelos hidrológicos a exemplo do MGB-IPH. Para avaliar a aplicabilidade e o desempenho da ferramenta desenvolvida foram realizados testes específicos, através da comparação dos resultados do IPH-Hydro Tools em relação a outros pacotes de SIG (ArcGIS, IDRISI, WhiteBox) disponíveis para esta finalidade. O IPH-Hydro Tools apresentou qualidade de rede de drenagem geralmente superior aos demais pacotes e menor tempo de processamento necessário para delimitação de bacias, apesar de algumas limitações como incompatibilidade em relação a matrizes muito grandes e dificuldade na representação da rede de drenagem em áreas extensas de mesma cota, a exemplo de reservatórios e rios muito largos.


INTRODUCTION
Watershed delineation, drainage network generation and determination of river hydraulic characteristics are key factors in hydrological studies. Up to three decades ago, these activities have been traditionally performed manually from topographic maps and aerial photographs. Nowadays, with the emergence of new techniques and computational advances, these procedures are widely performed using Geographic Information Systems (GIS) (BUARQUE et al., 2009;BURROUGH;MCDONNEL, 1998;FAN et al., 2013;CIRILO, 2001;MIRANDA, 2005;COLLISCHONN, 2008).
Within a GIS framework, the characteristics of a watershed can be derived by processing a Digital Elevation Model (DEM). A DEM is a grid-based representation of the terrain, where each cell or node, stores the elevation value according to its geographical location (BUARQUE et al., 2009;FAN et al., 2013;O'CALLAGHAN;MARK, 1984;TARBOTON, 1997;ZEILHOFER, 2001). In most cases, digital elevation models can be obtained from the interpolation of digital topographic maps (BURROUGH; MCDONELL, 1998;MARTZ;GARBRECHT, 1999;PIRES et al., 2005;ZEILHOFER, 2001), from aerial surveys, e.g. the Shuttle Radar Topography Mission -SRTM - (FARR et al., 1997)

and the Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation
Model -ASTER GDEM - (TACHIKAWA et al., 2011), as well as from the airborne laser scanning, for instance the Light Detection And Ranging -LIDAR - (LIU, 2008).
In practical terms, GIS commercial softwares like ArcGIS ® and IDRISI ® are widely used for DEM processing in the context of hydrological studies (BUARQUE et al., 2009), whose advantages include robustness, support warranty, and constant update for the computational tools. On the other hand, open source GIS tools have significantly increased over the last years (SHEKHAR; STEINIGER;BOCHER, 2009), since they are free to use and more flexible in terms of both programming language and customization to fit any specific needs. Several agencies and institutes have developed hydrology-based applications using open source GIS tools, and some examples include TauDEM (TARBOTON, 2005) for topographic analyses, and also EPA-BASINS (KITTLE et al., 2006), SWAT (GEORGE;LEON, 2007) and MGB-IPH (FAN;COLLISCHONN, 2014) to provide a graphical user interface for hydrological models, in this case all coupled to the MapWindow GIS® software (AMES et al., 2008).
In this context of new hydrology-based open source packages, this work aims to introduce and investigate the results of the IPH-Hydro Tools. This framework is a set of tools developed for DEM processing within a GIS environment, allowing for the extraction of drainage networks and topological characteristics for watersheds as well as the information needed to MGB-IPH hydrological model preprocessing COLLISCHONN, 2014;TUCCI, 2001).
This text is organized in the following order: firstly, the IPH-Hydro Tools and its main components are briefly described, being highlighted the algorithms with greater complexity during DEM processing. Secondly, some tests were carried out through a verification of the results in watersheds/basins with different characteristics and a comparison with other available GIS platforms. Then, the applicability of the developed tool is discussed, with remarks to its main advantages and limitations.

DESCRIPTION OF IPH-HYDRO TOOLS
IPH-Hydro Tools is a package of tools developed in VB.NET language that works as a plugin of MapWindow ® (AMES et al., 2008), an open source GIS software with a great number of functionalities that generally fulfill the basic needs of users. These plugins consist in compiled codes -Dynamic Link Library (DLL) -programmed in .NET framework (Visual Basic or C#), being added to the program through a simple installation process (AMES, 2006;COLLISCHONN, 2014).
The procedures available in the IPH-Hydro Tools (Table 1) are based on the ArcHydro Tools (MAIDMENT, 2002), which is originally linked to the ArcGIS and refer to the basic steps for generating information for hydrological models, including extraction of drainage networks and watershed delineation. Regarding to its functional features, IPH-Hydro Tools make it possible to work with data stored in an ASCII Grid format, widely used in geoprocessing since it is an easy to read, software-independent extension. Besides, the data can also be stored in an ASCII binary format, entitled IPH Raster Grid (IRST). Among the main products that can be generated using this package are: the flow directions based on terrain slope (Figure 1a), drainage network derived from a minimum threshold area (Figure 1b), watershed delineation to the point of interest ( Figure 1c) and the watershed subdivision in small catchments, which are characterized by the area between two confluences or between a confluence and a headwater ( Figure 1d).
Extracting topologic information from IPH-Hydro Tools involves a DEM processing in several steps, which starts with removing the depressions from the terrain dataset. These topographical depressions are given by an area of one or more contiguous cells with lower elevation resulting in areas without outlet definition (BUARQUE et al., 2009;ZANDBERGEN, 2006), which can occur as a natural aspect of the terrain like endorheic basins and lakes, or as an artificial depression being considered spurious (HESSE, 2008;LIU, 2006). Most of the depressions in the terrain grid are usually classified as spurious, and they are derived from interpolation issues during DEM generation, truncation of interpolated values and limited resolution of the grid (ARNOLD, 2010;JENSON;DOMIN-GUE, 1988;MARTZ;GARBRECHT, 1999).
Depression removal, among other steps related to DEM processing, is the procedure that presents greater complexity in terms of hydrological purposes (BUARQUE et al., 2009). Since the 80's, several computational methods have been proposed to ensure the continuity of the flow in downstream direction (e. g. HOU et al., 2011;JENSON;DOMINGUE, 1988;JONES, 2002;MAGALHÃES et al., 2012;MARTZ;GARBRECHT, 1999;PLANCHON;DARBOUX, 2001;GRATIN, 1994;LIU, 2006). Despite the ongoing efforts, there is not a single correct solution to depressions removal, as specific algorithms may be more or less effective according to the area processed (ARNOLD, 2010;TARBOTON, 1997). In this context, recent studies have focused in performance improvement over the existing solutions (e.g. BARNES; LEHMAN; MULLA, 2014a,b;GOMES et al., 2012;METZ;MITASOVA;HARMON, 2011), taking into account some computational issues like Input-Output (I/O) operations and processing speed.
Removing depressions, assigning flow directions and defining an area threshold of the accumulated cells are the basic procedures needed to extracting the drainage network (O'CALLAGHAN;MARK, 1984). Despite the latter is an important procedure to outline channel locations, the other ones can affect the topology (i.e. the connectivity) of the river network and its related information, which includes length and slope of the reaches as well as the watershed area up to any location of the drainage.
In the next section, the primary method for assigning the flow directions and removing the depressions implemented in the IPH-Hydro Tools (MHS, or Modified Heuristic Search) is described, as well as the procedure for calculating the accumulated flow and extraction of drainage network based on the work of Haverkort and Janssen (2012). The other tools available in this package were not discussed herein because they do not bring any significant change in the extraction of topologic information, and additional details can be found on the IPH-Hydro Tools user manual (the reader is referred to the software availability, at the end of this paper).  Watershed Delineation Defines the watershed up to one or more predefined outlets.
Catchment Delineation Defines catchments that drain to each segment of the drainage network.
Drainage Line Converts the drainage network into a shapefile vector (line).

Watershed Polygon
Converts the watersheds in a shapefile vector (polygon).
Depth-Area-Volume Provide depth-area and depth-volume curves up to a certain height determined at any point in the drainage network. Extract Raster by Raster Allows the extraction of cells from a raster to a specific mask.
Conversion IRST-ASCII Allows the conversion of an IRST file to an ASCII format.

Process All Steps
Provides a batch processing environment to automatically generate the files above. Hydrologic Response Units Allows the generation of a raster file, combining land use and soil type for MGB-IPH model application.

Algorithm to derive flow directions and remove topographical depressions
The procedure for removing depressions in the IPH -Hydro Tools can be performed from two different approaches. The first one is the Priority First Search (PFS) algorithm, described by Sedgewick (1992) and adapted by Jones (2002), while the second, and most important approach herein, is a variation of the algorithm proposed by Hou et al. (2011), which applies functions based on heuristic information. In the latter method, the tendency of DEM runoff in the surrounding area of the depression is used in order to avoid a blind search, and least-cost paths (JONES, 2002;SEDGEWICK, 1992) are used to minimize differences between elevation values in adjacent cells, along the trajectory needed for solving the problem (i.e., finding for a depression outlet). In addition to the heuristic and least-cost path approach, the algorithm also combines pit filling and breaching techniques (MARTZ; GARBRECHT, 1999) to define the hydrologically corrected DEM and to derive flow directions. One of the advantages of this procedure is to reduce the number of parallel drainage networks, a common practice in GIS softwares like ArcGIS (BUARQUE et al., 2009).
At first, the elevation for each DEM grid cell is read and a data grid (organized in lines and rows) is made from that information. This structure allows the identification of the flow direction in each cell using the deterministic eight-node (D8) method (O'CALLAGHAN;MARK, 1984;MARKS;DOZIER;FREW, 1984), where the slope of each cell, except of the ones located on the edge of the grid, is computed for its eight neighbor nodes. Once the greater positive slope is identified, the central cell receives a specific coding which identifies the downstream cell, according to Figure 2. The value of the slope is obtained using the following equation: where E 0 is the elevation in the central node; E i is the elevation of the i cell analyzed; and Ø (1) is the relative distance between the centre of the cells, being attributed a value of 1 to neighbor cells 1, 4, 16 and 64, and a value of √2 to neighbor cells 2, 8, 32 e 128. For performance improvement in terms of computational cost, some modifications were introduced to the original form of D8 method. If the value of the highest slope is negative, a single depression is identified in the central node. The elevation value of the central node must be increased (pit filling) to match the neighboring cell with the lowest elevation, and the single depression is then turned into a flat area ( Figure  3). The coding used to indicate the absence of flow direction is a null value, and the position of the modified cell is stored to be processed in a next step. This procedure is repeated until the entire DEM grid is processed.
If there are two or more cells with the same maximum slope value (and positive), the flow direction is given to the one with the smallest rank, which starts on the left neighbor of the analyzed node (rank 1) and increases counterclockwise. If there are two or more cells with maximum slope value equal to null, the flow direction is given in the same as before, provided that the downstream cell does not present a defined flow. However, if all neighboring cells have already a defined flow direction, the central cell is identified as a depression.

Figure 3 -Cells with a single depression transformed into a flat area, with the elevation increased to the lowest neighbor
After increasing the elevation of single depressions and determination of flow directions, all cells that have been assigned to a null value are classified as "depressions" and proceed to a step of searching for the outlet location. The algorithm is based on the work of Hou et al. (2011) using three data vectors in the form of priority queues: • Open List: vector that stores all "candidate" cells to be part of the least-cost path; • Closed List: vector that stores all cells selected from the candidates (open list); • Array: vector that stores all cells identified as a depression.
Each cell of these vectors has attributes given by x and y coordinates referring to the position in the data matrix, by the value of the associate objective function -or the heuristic function -and by the relative coordinates of the source cell along the least-cost path. Initially, the depressions identified in the previous step are stored in the Array vector with the respective x and y coordinates of its location. Then, the first cell is removed from Array and stored in the Closed List, indicating the point in which the analysis should start. The neighboring cells of the initial depression are then added in the Open List, and are characterized as the first candidates of the path to be traced. At the same time, cells added in the Open List get the relative position of the central source cell, so that it is possible (1) to trace back to the starting point from any given cell. The criteria for selecting the next cell from the group of candidates, when searching for the outlet point, is based on the minimum value resulting from a heuristic associated function f(n). The value of this function is obtained from the combination of two other functions, as shown in the following equation: ( 2) where g (n) is the real cost to leave the starting cell and reach the current neighboring cell; h (n) is the estimated least-cost path through the current neighboring cell, given by the arithmetic mean of pixels located "downstream" of the analyzed neighboring cell; w is a weight factor, being w > 1.
In other words, the function g (n) represents the elevation difference between the analyzed cell and the depression cell, while the function h (n) represents the runoff tendency in the adjacent region of the analyzed cell, i.e., in the opposite direction from the depression or a source cell. The introduction of the weight factor w assumes that the path should preferably pass through the neighbor cell with the lowest elevation difference, while the tendency of DEM runoff should be mainly used in low relief areas when two or more neighboring cells have the same height difference in relation to the analyzed cell. In order to calculate the functions g (n) and h (n), the following equations are used: where Es is the elevation of the cell representing the depression (starting point); E i is the elevation of the i cell analyzed; S i is the vector set containing all the cells in an external 3 x 3 window, E k is the elevation of the k cell in the set S i . The value of i S refers to the number of cells in a 3 x 3 window, or 9 units. Figures 4 and 5 show the delineation of the 3 x 3 window (in red) located on the edge of a neighboring cell (in blue), respectively, according to its position -diagonal or lateral. For each one of the eight neighboring cells (in grey), a 3 x 3 window is delineated and the function h (n) is computed by the arithmetic mean of the nodes. The function g (n) of each cell, multiplied by the weight factor, is then summed to the respectively function h(n) to compose the final values of the heuristic function f (n).
The minimum value of the heuristic function among all candidate cells defines the priority cell, which means that a path through this node should have the least cost. This cell is then moved from the Open List to the Closed List while its neighboring cells are added in Open List, provided that they are neither in Open nor in Closed vectors. Similarly, recent cells that were added to the group of candidates (Open list) get the relative position of the source node.
The above procedure is continuously performed until the outlet criteria is satisfied, i.e., when the elevation of the selected cell is lower than the elevation of the depression cell, or when the selected cell is located on the edge of the DEM. Tracing back from the outlet to the source cell of depression, the flow directions are then attributed to the cells along the path and the elevations are adjusted to create a linear gradient, which is given by the equation below: where E o is the elevation of the outlet cell; E s is the elevation of the depression cell (starting node); E i ' is the corrected elevation of the i cell; L o-i is the number of cells from the outlet to the i cell; L total is the total number of cells from the outlet to the starting node.
After the definition of flow directions along the path, all the cells are excluded from both Open and Closed List vectors, and the next depression stored in the vector Array is moved to Closed List. The process is completed when all the depressions have been solved and the flow directions assigned to each cell.

Performing flow accumulation
In order to determine the accumulated flow in the IPH-Hydro Tools, a method described in the work of Haverkort and Janssen (2012) was applied. The algorithm is simplistic but clever enough to run without I/O operations, just following the steps below: 1   (i) For each grid cell it is initially assigned one accumulated unit, and the first cell is then selected for processing; (ii) This cell is marked as "checked" and is accumulated downstream following the flow direction grid (i.e., adding one unit of accumulated flow to the next cell).; (iii) If are there any unchecked neighboring cell pointing (draining) to the analyzed node, the process of accumulating the flow in downstream direction waits; (iv) The next cell of the grid is selected, and the step (ii) is started again. The step (iii) is only continued after all the neighbor pointing cells are marked as "checked"; (v) The algorithm continues until all the cells in the grid are processed, and the flow accumulation grid allows the extraction of the drainage network from a predefined threshold.

METHODOLOGY APLLIED FOR ASSESSING IPH-HYDRO TOOLS
In order to assess the applicability and the performance of the developed tool, DEMs with different topographic characteristics and grid sizes were selected for several processing tests, and the results of IPH-Hydro Tools were compared to other GIS tools that use different algorithms to remove depressions and derive flow directions. The GIS softwares applied in this work are: the 32-bit version of IDRISI, with the PFS algorithm (SEDGEWICK, 1992;JONES, 2002) for removing depressions; the ArcGIS (ArcHydro Tools), which uses the pit filling method proposed by Jenson and Domingue (1988); and the open source WhiteBox GAT software (LINDSAY, 2014), using the algorithm proposed by Planchon and Darboux (2001). A description of the performance tests is presented below: 1)Processing grids of different sizes: It aims to determine the maximum number of cells in a DEM to be processed by IPH-Hydro Tools. Therefore, several areas with different numbers of rows and columns have been selected, and for each DEM the IPH-Hydro Tools and the aforementioned software packages were run. Although dealing with a great amount of data in digital elevation models is increasingly necessary for hydrology purposes, aspects involving external hard disk storage while processing the DEM (ARGE et al., 2003;GOMES et al., 2012) were not taken into account in this version of IPH-Hydro Tools, thus only the internal memory capacity of the computer is used.
2) Open and Closed List minimum sizes: Once the size of the Open and Closed List vectors need to be specified by the user, it was attempted to estimate the minimum values of these parameters in order to minimize the memory usage, as well as to verify a possible influence on the processing time.
The assessment was performed applying the MHS method several times in a same DEM, with the size of Closed List ranging from a purposely high value to an insufficient (low) one. As the size for the Open List must be always higher than the Closed List, i.e., the number of candidate cells must be always higher than the selected cells, for all the cases, it was defined the first value equal to twice the second. Moreover, the weight factor "w" was kept constant at 2 during the performance of the tests.
3)Processing time: In order to test the performance of the IPH-Hydro Tools in terms of computational efficiency, the processing times associated to each one of the steps performed were aggregated. The procedures performed were: Depression Removal, Flow Direction, Flow Accumulation, Stream Definition, Stream Segmentation, Watershed Delineation and Catchment Delineation. Also, the IPH-Hydro Tools was run in two different forms: (1) using only AS-CII-type files as input and output data, and (2) using only IRST binary files. 4)Quality of the drainage network: In order to assess the quality of the river networks, the extracted drainage from the IPH-Hydro Tools was compared to the ones resulted by the other GIS packages as well as to the "true" drainage (after vectorization), using the methodology described by Buarque et al. (2009). This method suggests that the area formed between the true drainage network (derived from satellite imagery, for example) and the extracted one from a DEM, divided by the river length, represents the error of the latter. Moreover, the area between the true drainage centerline and the riverbanks should not be taken into account, as the network generated within the channel region is often considered to be correct.
The results were also compared to a reference shapefile (river network in a vector format) provided by the National Water Agency of Brazil (ANA) in the scale of 1: 1.000.000, which is available for download at HidroWeb website (http://hidroweb. ana.gov.br/). It was an additional comparison but without any purpose to verify the quality of the reference, since this drainage network was obtained from a rough scale and made available by ANA for widespread use.
A desktop computer (Intel Core i7-2600K 3.4 GHz, 16GB RAM) was used to run all the GIS packages, and the 64bit version of MapWindow ® GIS was adopted in the case of IPH-Hydro Tools.

Areas selected for testing
As shown on Figure 6, the locations selected for the present study were the Purus, Taquari-Antas, Prata, São Francisco and Uruguai basins, and also a tributary of Itajaí River. Some of these basins were selected with the purpose of testing regions with low relief (as the Purus River, in Amazonia) as well as mountainous areas, characterized by well-defined valleys (as the Taquari River, in Southern Brazil). In addition, Prata and São Francisco basins were chosen to test the performance when working with a large number of cells, which also includes a watershed with a high resolution DEM (tributary of Itajaí River).
Regarding to the data used, the digital elevation model for Itajaí watershed was taken from an aerophotogrammetric survey of the state of Santa Catarina, in the scale of 1:10.000. In all other cases, DEMs were obtained from the 3 arc-sec (90 m) SRTM composition, available from the Consortium for Spatial Information of the Consultative Group for International Agricultural Research (CGIAR-CSI) database.

Processing different grid sizes
A key parameter when working with DEM files in geoprocessing refers to the maximum grid size that can be processed, which directly influences the processing time. Table  2 shows the information about each one of the DEMs selected for this study, including its covered area, spatial resolution and  number of rows and columns. When applying the IPH-Hydro Tools, all the tested grid sizes were handled with the exception of that one covering the São Francisco basin (with a DEM of approximately 14.600 columns by 19.600 rows), which was not successful due to limitations in terms of internal memory of the computer.  However, this problem was solved by reducing the number of rows to 17.200, and this grid size was established here as the upper limit for the IPH-Hydro Tools in computers with similar configuration.
Regarding to other packages tested, IDRISI was not able to process DEMs above the order of 7.000 by 10.000 cells. Nevertheless, both ArcGIS and WhiteBox GAT successfully handled all of the terrain datasets, with no limitations in terms of the maximum grid size. Table 3 presents the minimum sizes obtained for the Open and Closed Lists. Although each DEM processing had resulted in a different value for these parameters, no alterations in the computational time were observed when values of 1.000.000 for the Open List and 500.000 for the Closed List were applied in a same DEM. These values are also suggested for applying the tool in other terrain datasets, in order to avoid problems concerning insufficient values for these vectors when removing the depressions. Furthermore, there was no topologic difference between the extracted drainages in a single DEM, if using the minimum or maximum values obtained for Open and Closed List.

Processing time
Despite all the current advances in computer sciences for calculating and analyzing data, processing time is still an important factor for the development of new tools for DEM processing. However, it is worth mentioning that each one of the packages herein tested uses a specific programming language for its algorithms, and they work differently in some basic functions, for instance, when loading the grid file for visualization purposes. Thus, the purpose of this test is to provide a reference for the computational cost when processing grids with different sizes and topographic characteristics, rather than a detailed discussion (in numbers) of processing times associated to the GIS packages. Table 4 shows the total aggregate processing times for the packages tested, which includes all the intermediate steps from removing depressions to delineating watersheds for each segment of the drainage network (catchments). However, results about the time required to run these steps on IDRISI were not presented, as it was possible to process only the Taquari-Antas basin, in this case due to issues related to the DEM grid size. Nonetheless, the quality of the drainage network provided by this software is further assessed in this work.
Based on the results, it can be seen that the total number of cells in each grid has a large influence on the total processing time. For all steps after determining the flow directions, the processing time is proportional to the number of grid cells, but it does not necessarily occur for removing depressions. This initial step is greatly affected by the existence of areas with low relief, where it is usually more difficult to identify preferred paths for the drainage network because of the small variation of elevation values. It can be observed by comparing grids with similar sizes, for example the tributary of the Itajaí River and the Purus River, where it was necessary more processing time for the latter due to a large relatively flat area on the lower part of the basin.
In general, processing all the steps for the generation of catchments using the IPH-Hydro Tools was faster than using other packages. Although the average processing time for removing depressions was higher compared to ArcGIS (ArcHydro Tools), for instance, the flow accumulation was performed much faster in the developed tool. Moreover, the processing time when using IRST binary files was reduced, since reading and writing input and output files present a greater computational efficiency compared to ASCII.

Quality of the drainage network
One of the conditions for a DEM-based drainage network to be considered good is when it is located inside the river channel. The spatial resolution plays an important role in this context, but since most of the available DEMs have a relatively low resolution (including SRTM 90m DEM used here), inconsistencies between the extracted and the true drainage are quite common, besides the uncertainties in representing the terrain due to data interpolation errors. Two different river segments were selected to assess the quality of the extracted drainage network (Figure 7), one in the Taquari river (106,5 km), in the Taquari-Antas basin -Southern Brazil, and another one in the Iquiri river (95 km), a tributary of the Purus river in Amazonia. These segments were selected because of their topographic differences. The Taquari River has plenty of valleys and hillsides, while the Iquiri River is characterized by a smaller width and is located in a relatively low relief region with abandoned meanders, thus is not easy to accurately extract the drainage network for this area.
The images used to delineate the true drainage network were georeferred from Landsat Satellite, with 30 m of spatial resolution. Moreover, following the method proposed by Buarque et al. (2009), buffers of 150-and 40-meter length were applied respectively for the Taquari and the Iquiri river centerlines (based on widths) to subtract the area within the channel, as a drainage passing through this region is assumed to be correct. Figures 8 and 9 (Iquiri River) and Figures 10 and 11 (Taquari River) show the delimitation of the areas between drainage networks produced by IPH-Hydro Tools and ArcGIS in relation to the true drainage network, and presents only a part of the selected segments for visualization purposes. Furthermore, Tables 5 and 6 present the comparison parameters found after the application of different algorithms for drainage extraction, for the Iquiri and Taquari rivers respectively. These parameters are defined by the area between curves (including the application of the buffer), the ratio between the area error/ river unit length (km) and the river segment length.
In a comparison to the drainage network generated by ArcGIS, it can be noticed that the IPH-Hydro Tools resulted in smaller areas between curves in the case of Iquiri River, thus representing a better agreement to the true drainage. The methods that provided the best results were the MHS and PFS available in IPH-Hydro Tools and IDRISI (only the PFS) with errors close to 0.06 km²/km, while a lower quality was obtained using ArcGIS and WhiteBox GAT with errors around 0.21 km²/ km. Besides, when assessing the length of the drainage network, it was observed that MHS and PFS presented results closer to the real value; however, the length obtained by both algorithms was underestimated in about 15 km (16%). The other methods tested resulted in an underestimation above 37%, which also represented a performance lower than the quality of the vector drainage available by the National Water Agency.
Especially in areas with low relief, both ArcGIS and WhiteBox GAT presented several rectilinear reaches that were often parallel to the main river, with low representation of the meanders. These problems were less evidenced in the drainage network generated by IPH-Hydro Tools and IDRISI, which can be explained by the use of breaching techniques for removing the depressions.    Assessing the results for the Taquari river, the ratio between area error and river unit length was relatively low for all the methods tested (less than 0.03 km²/km), and the small differences between the extracted drainages are related to the well-defined valleys in that region. PFS (IPH-Hydro Tools and IDRISI) and MHS algorithms had a better performance with similar errors of about 0.0035 km²/km, while for ArcGIS and WhiteBox GAT these errors were around 0.009 and 0.02 km²/ km, respectively. Nevertheless, the resulting drainage for all the methods was closer to the real one in Taquari River when compared to the National Water Agency data, which had an error around 0.12 km²/km.
Regarding to the segment length, the results were slightly different. The algorithms implemented in ArcGIS and WhiteBox GAT presented values close to the real ones, overestimated in only 2 km (around 2%). The PFS method, implemented in both IDRISI and IPH-Hydro Tools, presented an increase of 8 km (7.5%) approximately, while for MHS the increase was 6 km (5.5%). Compared to the pit filling algorithms, the overestimate for both PFS and MHS methods is expected, as they are characterized by generating a smaller number of rectilinear reaches. It is worth mentioning that, as these algorithms are based in least-cost paths, the resulting drainage network tends to contour the riverbank in situations like large reservoirs and wide rivers (considering the DEM spatial resolution), once the elevations surrounding the analyzed cell must be heterogeneous in order       to provide enough information for the problem solving. This heterogeneity is not frequent in areas with a great number of adjacent cells with the same elevation (large flat areas), thus forcing the drainage to follow the riverbank border and increasing the length of the rivers. Another fact that must be highlighted is that part of the differences found when comparing the drainage networks may be caused by other factors, according to Paz and Collischonn (2008) and Paz et al. (2008). For instance, the length of the network measured by satellite -which allows the visualization of a river segment -is considered real, but the obtained value depends on the scale of digitizing, the interpretation itself and the georeferred image used. Possible inconsistencies in these steps may question the validity of the tracing and the length of the reaches. However, since the image digitizing is in many times the best option available, it is often adopted for drainage networks assessment. Moreover, another fact that must be remarked is the quality of the DEM used for obtaining the drainage network, as in cases where the width of the river is smaller than the DEM horizontal resolution, as in the Iquiri River, for example, there is a tendency to underestimate the length by the lack of representativeness in the meanders.

CONCLUDING REMARKS
Hydrology sciences are making use of a great number of available GIS datasets. In particular, DEM processing is essential to provide input information to hydrological models, which includes, among other steps, the topology definition, watershed delineation and determination of river hydraulic characteristics. Open source GIS tools, at the same time, are becoming increasingly widespread in the scientific community, since they are free to use and are related to both code programming flexibility and customization to meet specific needs.
In this context, this paper presented a new hydrology -based open source package called IPH-Hydro Tools. One of the main motivations of its development was the possibility to process a DEM within the same platform, i.e MapWindow GIS, in which the MGB-IPH graphical user interface is functional (FAN; COLLISCHONN, 2014). Therefore, the MGB-IPH hydrological model can be applied in the MapWindow GIS from the early steps of preprocessing to the final simulation. Likewise, additional tools can be also implemented independently by users, as well as the optimization of any existing procedure herein described.
In order to assess the applicability and performance of the IPH-Hydro Tools, several tests were carried out through a comparison to other available GIS packages. Results showed that the quality of the drainage networks produced by IPH-Hydro Tools using the MHS method, which was based on the heuristic approach presented by Hou et al. (2011), were better than those implemented in ArcGIS and Whitebox GAT, with lesser time needed to conduct all the steps for generating catchments. Some disadvantages, like the parallel drainages generated by the algorithm of Jenson and Domingue (1988), were also minimized by using the tools developed in the present work.
When compared to IDRISI, which also uses least-cost paths and breaching methods for removing the depressions, the IPH-Hydro Tools was similar in terms of drainage quality but showed some advantages as it processed larger grids. Furthermore, it is pointed out that the generation of catchments between two river confluences on IDRISI is not straightforward as in the developed tool, an important procedure for spatial discretization of hydrological models like the MGB-IPH.
Among the main limitations of the IPH-Hydro-Tools it can be mentioned the incompatibility with very large grids, which is an important aspect for the use of DEM models with higher spatial resolution. This limitation is due to the only use of the internal memory, which can be solved with more advanced techniques related to I/O operations during the processing. In addition, some shortcomings are underlined when representing the drainage over large flat areas, which is likely to occur in situations like reservoirs and large rivers. Regarding the structure of the methods implemented in IPH-Hydro Tools for removing depressions and deriving flow directions, these tend to generate a drainage closer to the riverbank and not the channel centerline when processing large areas of same elevation, for example in the Purus River. Therefore, there is a need to couple specific algorithms to handle these situations, as the flat carving technique (SOILLE; VOGT; COLOMBO, 2003), already successfully used together with the PFS (ROSIM et al., 2013).

SOFTWARE AVAILABILITY
The 64-bit IPH-Hydro Tools is available for downloading on the webpage of the Large-Scale Hydrology Research Group (HGE-IPH), from the Federal University of Rio Grande do Sul at http://www.ufrgs.br/hge/modelos-e-outros-produtos/iph-hydro-tools/. Besides the IPH-Hydro Tools plugin, the source-code and supporting data are also available, as well as tutorials of use.