Advances in Water Resources Development and evaluation of a framework for global ﬂood hazard mapping

Nowadays, the development of high-resolution ﬂood hazard models have become feasible at continental and global scale, and their application in developing countries and data-scarce regions can be extremely helpful to increase preparedness of population and reduce catastrophic impacts. The present work describes the development of a novel procedure for global ﬂood hazard mapping, based on the most recent advances in large scale ﬂood modelling. We derive a long-term dataset of daily river discharges from the hydrological simulations of the Global Flood Awareness System (GloFAS). Streamﬂow data is downscaled on a high resolution river network and processed to provide the input for local ﬂood inundation simulations, performed with a two-dimensional hydrodynamic model. All ﬂood-prone areas identiﬁed along the river network are then merged to create continental ﬂood hazard maps for different return periods at 30 (cid:3) (cid:3) resolution. We evaluate the performance of our methodology in several river basins across the globe by comparing simulated ﬂood maps with both oﬃcial hazard maps and a mosaic of ﬂooded areas detected from satellite images. The evaluation procedure also includes compar-isons with the results of other large scale ﬂood models. We further investigate the sensitivity of the ﬂood modelling framework to several parameters and modelling approaches and identify strengths, limitations and possible improvements of the methodology.


Introduction
River floods are recognized as one of the major causes of economic damages and loss of human lives worldwide ( European Commission, 2007 ;UNISDR and CRED, 2015 ). Over the period 1980-2013, flood losses exceeded $1 trillion globally, and resulted in ca. 220,0 0 0 fatalities ( Re, 2014 ). Moreover, the impact of floods in next decades could dramatically increase due to the ongoing socio-economic and climatic changes ( UNISDR 2009;Jongman et al., 2012 ).
The catastrophic impacts of river floods can be reduced thanks to mathematical models for predicting and mapping flood hazard and risk ( European commission, 2007 ). Flood hazard maps (showing the probability and magnitude of flood events over an area) and flood risk assessment maps (showing potential consequences of a flood event in terms of affected population and assets, and expected economic damages) can increase preparedness and improve land use planning and management in flood prone areas. On * Corresponding author.
E-mail address: francesco.dottori@jrc.ec.europa.eu (F. Dottori). the other hand, reliable and fast flood forecasting tools are crucial to develop effective emergency response strategies and to prevent and reduce impacts.
Until recent years, flood mapping and forecasting tools were available only in few areas of the globe, given their high demand of resources and data for development and maintenance. However, the situation is radically changing nowadays: thanks to the constant increase of computational power and precision of remotely sensed datasets, the application of large-scale, high-resolution (i.e. 1 km or less) flood models have become feasible ( Wood 2011 ), and different studies at continental and global scale have been proposed in the literature. Pappenberger et al. (2012) coupled a land surface rainfall-runoff model with a river routing algorithm, to produce floodplain and flood flow across the river network, based on a 30 years meteorological forcing data. Annual maxima were used to derive peak flow return periods on a 25 × 25 km grid, which was then reprojected onto a 1 × 1 km grid to derive flood maps of higher resolution. Winsemius et al. (2013) proposed a similar framework for global flood risk assessment, using a cascade of climate forcing datasets, hydrological and hydraulic modelling, extreme value statistics, derivation of inundation maps, and flood impact modelling.
More recently, large-scale flood modelling approaches have begun to couple the hydrological modelling component with large-scale flood inundation models, in order to provide more accurate flood mapping. Yamazaki et al. (2013) applied a new river routing model to improve the global flood hazard maps derived by Pappenberger et al. (2012) , while Hibarayashi et al (2013) applied the same model to produce global flood risk estimations for the end of this century based on the outputs of 11 climate models. Schumann et al. (2013) used ensemble forecast data to force a hydrologic model and produce boundary daily flow conditions for the 2-D hydrodynamic model LISFLOOD-FP ( Bates et al., 2010;Neal et al., 2012 ). The modeling system was calibrated and successfully applied over a reach of the Zambezi River. Recently, a similar flood modelling framework has been applied by Sampson et al. (2015) to produce global flood hazard maps at 3arcseconds resolution. The hydrological input in their work was given by a regionalised flood frequency analysis based on a global dataset of stream gauging stations. The resulting dataset provided the necessary input for a modified version of the LISFLOOD-FP model, which was used to derive global hazard maps for several return periods. Alfieri et al. (2014) derived a flood hazard map for Europe using the long-term streamflow simulation (23 years) developed for the European Flood Awareness System (EFAS). Streamflow data were downscaled and used as input for the LISFLOOD-FP model to compute local flood maps, which were then merged into a pan-European flood hazard map.
Another global flood model has been developed for the Global Assessment Report on Disaster Risk Reduction (GAR). The methodology used in the latest version for 2015  makes use of a global database of stream-flow data, elaborated with downscaling techniques and statistical regional analyses to compute extreme discharges at global scale. A global hydrologic model was used to improve the estimations and produce input discharges for a simplified hydraulic flood model, which interpolates flood levels considering stream flow in hydraulic cross-sections and local morphology, taking into account hydraulic connectivity and back water effects. Flood hazard maps were produced for return periods from 25 to 10 0 0 years.
Large scale flood modelling techniques can also be a key development for global flood forecasting systems. Currently, forecasts are mostly given in terms of hydrological parameters such as river discharge, but there is a demand from end users (e.g. emergency services, local institutions) for risk based forecasts, for example in terms of flood-prone areas, assets, and population. Several global flood forecasting and warning systems are now being developed towards this goal. For instance, the Global Flood Monitoring System (GFMS) uses satellite precipitation data in combination with coupled land surface and river routing models to provide nearreal time flood detection and flood inundation mapping .
Here, we present a flood hazard mapping methodology based on the hydrological information produced by the Global Flood Awareness System (GloFAS; Alfieri et al., 2013 ). GloFAS is a probabilistic flood early warning system running at global scale, developed in a collaboration between the Joint Research Centre (JRC) and the European Centre for Medium Range Forecasts (ECMWF). The system is running since 2011 for research purposes, and since May 2015 GloFAS forecasts are displayed in real time on a dedicated web platform ( http://globalfloods.eu/ ). Streamflow data available from long term GloFAS simulations is downscaled on a higher resolution river network and processed to provide the input for the flood simulations following the approach proposed by Alfieri et al. (2014) . For each river basin considered, the drainage network is divided in river sections where local simulations are run in parallel. Simulations are performed with a twodimensional hydrodynamic model, designed to ensure an accurate representation of flow processes in the river network and flood prone areas. The local flood maps produced are merged together to produce continental maps for different return periods at 30 arc seconds resolution, including areas at latitudes above 60 degrees.
The developed global maps are tested in several river basins located in different continents, in order to have a comprehensive assessment of the modelling framework. Where possible, we use official flood hazard maps as a reference, and we compare the performance of the produced maps against the results obtained by other global models ( Alfieri et al., 2013;Winsemius et al, 2016;Sampson et al., 2015 ) Where official maps are not available, we produce basin-scale flood extent maps derived for the period 20 0 0-2013, and we compare these maps with a mosaic of flooded areas detected from satellite images for the same reference period. Such evaluation strategy allows for investigating the performances of the modelling framework in areas where no previous evaluations of global flood hazard models were done.
In addition, we investigate the sensitivity of the modelling framework by providing a quantitative evaluation of the influence of various parameters, such as the hydrological input and the grid resolution. The performance of the modelling framework emerging from these analyses are discussed to identify strengths, limitations and possible improvements. Finally, future applications and developments of the methodology are presented.

Data and methods
The proposed framework is based on a cascade of modelling components with an increasing level of detail. The hydrological input is provided by the streamflow climatology of GloFAS simulations, based on meteorological reanalysis datasets ( Section 2.1 ). Streamflow data is processed at global scale and then downscaled at higher resolution in the main river basins and hydrological regions of the globe ( Section 2.2 ). Flood simulations are performed in parallel with a hydrodynamic model using the downscaled information and incorporating river network geometry derived from high resolution terrain data ( Section 2.3 ). A general scheme of the proposed methodology is provided in Fig. 1 . Note that two different hydrological datasets are used within the flood mapping methodology, to produce two different sets of flood maps. The analysis of extreme discharge values is used for deriving flood hazard maps for different return periods, as described in this section, while the analysis of discharge maxima for the period 20 0 0-2013 is used to generate flood extent maps which are applied for testing the procedure (see Section 3 for more details).

Hydrological datasets
Hydrological simulations in GloFAS are performed by coupling two distributed global models. The land surface model HTESSEL ( Balsamo et al., 2009 estimates the surface water and energy fluxes and the temporal evolution of soil temperature, moisture content and snowpack conditions in response to atmospheric forcing. Surface and sub-surface runoff from HTESSEL are then used as input to LISFLOOD Global, which simulates the groundwater and routing processes and produces streamflow simulations along the stream network of large global rivers at 0.1 degree grids ( Alfieri et al., 2013 ).
The long term streamflow simulations of GloFAS are based on the global atmospheric reanalysis dataset ERA-Interim ( Dee et al., 2011;Balsamo et al., 2015 ), and cover the period from 1980 to 2013, at 0.1 degrees resolution (approximately 11 km at the Equator). For running GloFAS simulations, the ERA-Interim dataset has been bias-corrected using the Global Precipitation Climatology Project (GPCP) ( Huffman et al., 2009;Balsamo et al., 2015 ). Maps of daily annual maxima of discharge are extracted for each grid element of the GloFAS river network, and fitted with a Gumbel extreme value distribution ( Gumbel, 1941 ) to estimate discharge maps for any return period. For the mapping procedure described here, discharge maps for 10, 25, 50, 100, 250, 500 and 10 0 0 year floods have been derived. The river network considered includes those rivers with an upstream drainage area larger than 50 0 0 km 2 . This choice is due to the coarse resolution of the ERA-Interim climatology, which is not able to correctly resolve localized precipitation patterns at small scale ( Thiemig et al., 2012;Alfieri et al., 2013 ). Due to these assumptions, the total area not included in the analysis is 27% of the emerged land at global scale, excluding the Antarctic continent.

Processing of hydrological input
Mapping the flood hazard at continental and global scale is a challenging task. Especially in major world rivers, this requires a modelling framework designed to simulate flow routing along the river network over lengths of hundreds of kilometers. At the same time, simulations should account for multiple flooding processes, potentially involving floodplains with a width of hundreds of kilometers, including complex channel-floodplain flow interactions and the presence of dyke systems, dams and reservoirs. Thus, carrying out such simulations requires a number of simplifying assumptions to correctly handle domain decomposition, hydrological inputs and other modelling issues. The procedure herein proposed is built on the methodology by Alfieri et al. (2014) , which provided a feasible and effective solution to the mentioned issues. The basic assumption is that large scale flood hazard maps can be consistently derived from a mosaic of small scale simulations of local flood processes, covering all the considered river network. The streamflow information (daily and extreme discharges) described in Section 2.1 is first downscaled to a high-resolution river network at 30 arc seconds resolution (approximately 1 km at the Equator). For river basins located below 60 °N of latitude, the river network has been derived from hydrologically corrected Digital Elevation Model (DEM) and Drain Direction (DD) raster maps developed by HydroSHEDS ( Lehner et al., 2006 ). For areas at higher latitudes where HydroSHEDS information is not available, the GTOPO30 DEM, available at the long term archive of the US Geological Survey, has been used ( http://www.usgs.gov ).
The downscaling procedure is based on the identification of so called "flood points" regularly spaced along the high-resolution river network, where flood simulations are executed. Points are identified starting from the river basin outlet and moving upstream, until the threshold value of minimum upstream area of 50 0 0 km 2 is reached. The distance between flood points, computed along the 30 river network, has been set to 10 km, comparable to the grid resolution of discharge maps. Flood points are then linked to a section of the 0.1 °river network, in order to assign to each point a discharge hydrograph. Where the coarse and high resolution river networks do not overlap, flood points are linked with the closest 0.1 °pixel in the upstream direction. Fig. 2 shows a conceptual scheme of the two river networks.
The downscaled discharge climatology (daily and extreme discharges) is then processed to derive synthetic flood hydrograph for each flood point, following the approach proposed by Alfieri et al. (2014) . At each grid point in the 0.1 °river network, the daily discharge climatology is used to compute a Flow Duration Curve (FDC) for each year of data. The FDC is obtained by sorting in decreasing order all the daily discharges, thus providing annual maximum values Q D for any duration i between 1 and 365 days. Annual maximum values are then averaged over the entire period of data, and used to calculate the ratios ε i : where Q Di is the average maximum discharge for i -th duration and Q P is the corresponding value for its peak flow (i.e. Q D = 1 day).
Design flood hydrographs are derived using daily time steps. The peak value is given by the peak discharge for the selected Tyear return period Q T , while the other values Qi are derived multiplying Q T by the ratio ε i . In this work, the position of the hydrograph peak Q T is always assumed to be in the centre of the hydrograph, while the other values Qi are sorted alternatively to produce a triangular hydrograph shape, as shown in Fig. 3 . In order to account for the simplified representation of river channels in SRTM DEM, flood hydrograph discharges are reduced by subtracting the estimated average daily discharge Qm .
The total duration of the hydrograph is given by the local value of the time of concentration Tc , therefore all the durations > Tc are discarded from the final hydrograph. Tc is computed along each point of the river network with the empirical formula proposed by Giandotti (1934) : Where A is the drainage basin area (km 2 ); L is the length of the longest drainage path to the basin outlet (km); h m is the mean elevation of the basin (m); and h 0 is the elevation of the basin outlet (m).The Giandotti formula provided consistent values of concentration time, ranging between 1 day and more than 40 days in the longest rivers (Nile, Amazon, Congo). A schematic view of the method is shown in Fig. 3.

Flood inundation modelling
The simulation of flooding processes at large scale requires a modelling tool able to reproduce all the relevant flow processes with an adequate degree of detail, while being not computationally demanding ( Hunter et al., 2007 ). In particular, previous works showed the importance of simulating diffusion processes in floodplains ( Yamazaki et al., 2011 ) and to account for the geometry of the river network channels ( Neal et al., 2012 ). Two-dimensional, reduced complexity hydraulic models offer a good solution to these issues, as demonstrated in several recent works where the LISFLOOD-FP model was applied ( Neal et al., 2012Alfieri et al., 2014;Sampson et al., 2015 ).
In our approach, flood simulations are performed with the 2D hydraulic model CA2D ( Dottori and Todini, 2011 ). Here we give a brief description, focusing on the modifications made to adapt the model structure to run large scale simulations.
The model is based on a cell-centred finite volume scheme, similar to more conceptual methods like the macroscopic cellular automata approach. The model structure allows for great flexibility, as both structured and unstructured grids can be used. The integration in time is performed through a first-order Euler explicit scheme: for each time step, volume exchanges between grid cells are computed first, by integrating the momentum equation along For the hydraulic simulations presented in this paper, the semiinertial formulation of the CA2D model is used, as it can simulate gradually-varied flow processes over large areas with high computational efficiency ( Dottori and Todini, 2011;Dottori, 2012 ).
To ensure a good reproduction of flow routing in the river network, a sub-grid parameterization scheme has been incorporated in the model, following the approach proposed by Neal et al. (2012) . Grid cells included in the river network are composed of a channel and a flood plain fraction, each one with different elevation and area. Connections between river network cells are schematized with a composite flow section (channel and floodplain), with separate bed slope and roughness values. Flow is first computed in the channel subsection, while flow in the flood plain sub section occurs when the channel water depth in the cell is higher than channel bank elevation.
For river basins below 60 of latitude, channel elevations are assigned from the lowest elevation value coming from cells of the 3 SRTM DEM included in the 30 river network cell ( Lehner et al., 2006 ). The averaging procedure allows for smoothing the random errors which characterizes the 3 SRTM DEM, although systematic errors might still be present ( Yamazaki et al., 2012 ). It is important to note that STRM elevation over rivers is not measuring bed bottom, because radar signals are reflected by water surface. Therefore, we assume that the channel elevation derived from the SRTM is referred to the average water level in the river, following the approach adopted by Alfieri et al. (2014) as mentioned in Section 2.2 . Also, note that elevation values for both channels and floodplains are taken from the void-filled versions of the SRTM.
Channel width values are taken from the Global Width Database, which is also based on SRTM elevation data ( Yamazaki et al., 2014 ), integrated with geometric functions applied for Glo-FAS ( Alfieri et al., 2013 ) where the dataset by Yamazaki et al. has missing values. In addition, a minimum width value of 100 m has been set. For river basins partially or entirely above 60 °of latitude the sub-grid treatment has not been applied, because the available high-resolution terrain datasets such as ASTER DEM were found not to provide acceptable results.
To further improve the reproduction of river network at coarse resolution, the model computation grid adopts an 8-direction link network, also called Moore neighbourhood rule ( Parsons and Fonstand, 2007 ). Fig. 4 shows the conceptual scheme of this grid structure. Each grid cell is connected with the adjacent cells also in the diagonal directions, which requires some modifications with respect to the standard 4 direction grid, as described by Dottori (2012) . While the continuity equation is not modified, the integration of the momentum equation is performed by considering each cell as a regular octagon. The effective flow width of each connection (B) is given by the octagon faces, while the length for diagonal links x is increased with respect to horizontal and vertical links. Dottori (2012) investigated the accuracy of this model grid structure in several test cases. The results showed that, while the 8direction grid may produce limited inaccuracies in numerical cases, in most of the cases the overall results did not differ to standard 4 direction approach. On the other hand, simulations of real test case showed that the 8-direction grid may be preferable to reproduce flow processes in channels of size comparable to grid resolution. In addition, the 8-direction link network is particularly convenient for the modelling framework here described, as flow directions in the DEM can be more easily followed within the model grid.
All the hydraulic model simulations have been performed at 30 resolution ( ∼ 1 km). The same resolution has been adopted in recent studies for large scale flood modelling ( Neal et al., 2012 ), and it has been deemed adequate to the precision of available datasets. Roughness value for the hydraulic simulations are derived from land use information provided by the Global Land Cover 20 0 0 map at 30 resolution ( Bartholomé and Belward, 2005 ). Given the global scale of the application, a calibration of roughness values was not possible, and the values have been selected based on recent large scale applications of reduced complexity 2D models. Thus, the range of values goes from 0.2 m 1/3 s for forest areas, to 0.04 m 1/3 s for river channels.
Also, a global flood hazard mapping framework requires to consider the presence of internal conditions such as lakes, lagoons and reservoirs, and the boundary condition given by the sea level. In all the simulations performed, the sea level is given by the local "zero" of the DEM, although this overlooks possible local sea level rises due to storm surges or tides. Regarding internal water bodies, the reference water level is again given by the local DEM value. On this point, it should be noted that large flood discharges can alter the water level of the water body, especially in reservoirs used for flood control. However, a more detailed schematization was not possible since reservoir management rules are generally not available.
Before processing the elevation data for the simulations, the HydroSHEDS DEM has been corrected using the global vegetation height dataset developed by Simard et al., (2011) . Considering the findings of previous research works regarding the influence of vegetation on SRTM elevation data ( Baugh et al., 2013 ), we elaborated the original dataset by Simard et al. before correcting the DEM. We considered the 50% of vegetation height value, as the SRTM elevation data is in between actual ground elevation and vegetation canopy top. Then, we performed a 3 × 3 moving average on values, to smooth locally high values, and finally we removed the correction in areas where the influence of vegetation cover was not significant. As for the roughness parameter, a formal calibration was not undertaken given the need of applying the correction at global scale. No correction was applied in urban areas, since at 30 resolution the influence on elevation data was found to be not relevant.
The flood modelling approach followed in this paper requires a simulation for each flood point in the river network, as described in Section 2.2 . The domain size for each local simulation is assigned through a trial-and-error process, considering the upstream basin area, the local flood plain width and the downstream flood plain configuration. Such a procedure is used to optimize the run time of simulations and prevent discontinuities in the final flood map or the exclusion of areas potentially at risk of flooding. Accordingly, domain size ranges from a minimum value of 100 km 2 to over 50,0 0 0 km 2 in large wetland areas like the Sudd swamps in the White Nile basin, and the Pantanal in the Parana basin. To avoid distortions of the modelling grid due to projection of global maps, all the simulations use structured grids, where grid cell areas and length of connections are corrected based on local geographical coordinates.
After all the local flood simulations are executed, the maps are merged together, taking the maximum depth value where more maps overlap.

Evaluation of the methodology
The evaluation and validation of global flood hazard models is strongly limited by the scarce availability of reference flood maps. Pappenberger et al. (2012) compared the global flood maps produced in their work against previous global maps based on statistical estimation of peak discharges. Winsemius et al. (2013) used extreme discharge values for the evaluation of their modelling framework. Rudari et al. (2015) performed a qualitative assessment of simulated flood hazard maps against satellite flood footprints retrieved for a number of large flood events from different sources (DFO archive, UNOSAT flood portal). Alfieri et al. (2013), Sampson et al. (2015) and Winsemius et al. (2016) performed more detailed evaluations against official high-resolution flood hazard maps in a number of rivers in Canada, Great Britain and Germany. So far, no detailed analyses could be performed in the major rivers of Africa, Asia and South America, given the absence of official flood hazard maps, whereas an evaluation on these areas might offer new, valuable information on the performance of global flood models.
Given this framework, here we propose a validation of the flood hazard mapping methodology using both official flood hazard maps and satellite-derived flood maps. The aim of the comparison is to investigate where the methodology is providing good results and where not, and understand the possible explanations behind both good and poor performances.
The performance of simulated flood maps against reference maps is evaluated using a number of indexes proposed in literature ( Bates and De Roo, 20 0 0;Alfieri et al., 2014;Sampson et al., 2015 ). The hit ratio evaluates the agreement of simulated maps with observations and it is defined as: where F m ∩ F o is the area correctly predicted as flooded by the model, and Fo indicates the total observed flooded area. The formulation of the hit ratio does not penalize overprediction, which can be instead quantified using the false alarm ratio: where Fm / Fo is the area wrongly predicted as flooded by the model. Finally, a more comprehensive measure of the agreement between simulations and observations is given by the critical success index, defined as: where F m ∪ F o is the union of observed and simulated flooded areas.
The assessment is carried out in several areas in Europe, Africa, South America and Asia, depending on the availability of reference maps for the evaluation procedure ( Fig. 5 ).   Sampson et al. (2015) . For the Po river basin, we consider the official hazard map for the return period of 500 years, as the levee systems in the main river reaches have a design return period estimated in 200 years, and therefore would prevent flood events with a lower magnitude. For the Thames and Severn rivers the return period of the official hazard maps is 100 years and flood defences are not considered. For the Elbe River the return period is again 100 years but defences are considered.

Comparison against satellite-derived flood maps
In a second phase, we derive a set of flood extent maps using the dataset of flood footprints developed by the Dartmouth Flood Observatory (DFO), integrated with the flood database of the United Nations Operational Satellite Applications Programme (UN-OSAT).
The DFO dataset mostly consist of images from sensors Terra-MODIS and Aqua-MODIS at 250 m spatial resolution ( Brakenridge and Anderson, 2006 ). The images acquired at near-global scale twice each day are accumulated over 14 days to remove nearly all cloud obscuration. Although the DFO database starts from 1998, until 2011 imagery analysis was manually made and only major flood events were recorded, while since 2011 the database is updated daily. The data is freely available at global scale at the DFO website ( http://floodobservatory.colorado.edu/ ). The UNOSAT database contains flood extent maps for several flood events, derived from a variety of satellite sensors. The data is freely available for download at the UNOSAT website ( http://floods.unosat.org/ geoportal/catalog/main/home.page ).
Flood extent maps detected from MODIS imagery have been used in several research works in the last decade, and it was found that that their accuracy is comparable to flood footprints from higher-resolution sensors like LandSat TM/ETM + ( Sakamoto et al., 2007;Huang et al., 2014 ) Envisat-ASAR  and RADARSAT ( Islam et al., 2010 ). On the other hand, we think that the accessibility, frequency and near-global coverage of MODIS images in the DFO database make them suitable for the task of evaluating large scale flood hazard maps.
Since the flood hazard maps produced with the proposed methodology are not designed to reproduce single flood events, we used the available satellite observations to derive cumulated maps of all the observed flood footprint areas for the period 20 0 0-2013. This period has been selected because it is consistently covered both by the ERA-Interim climatology and by the DFO-UNOSAT databases.
The analysis is carried out for several large river basins in Africa (Niger), South America (Tocantins) and Asia (Indus, Irrawaddy, Ganges-Brahmaputra, and Mekong), as shown in Fig. 5 . The selection of these basins was made based on the availability of consistent coverage of flood maps at basin scale, considering the limitations of MODIS imagery. Basins at high latitudes were excluded because the imagery processing is hindered by the interference of snow and ice cover. Mountain areas with steep slopes can also interfere with the signal given by water surface, therefore we excluded from the analysis the areas above an elevation of 500 m. Note that this choice does not hamper the validity of the comparison because the great majority of flooded areas in the selected basins are located at a lower elevations. Also, optical sensors cannot penetrate dense vegetation cover, such as rainforests. For instance, MODIS maps significantly underestimate flood extent in the Amazon River basin, in comparison to observations from Synthetic Aperture Radar (SAR) imagery ( Baugh et al., 2013;Schumann and Moller, 2015 ). Finally, the merged MODIS flood maps show a significant scattering in some areas, and local inaccuracies due to projection and georeferencing errors. Therefore a manual correction has been performed to delete false flood pixels in areas far from the main river network and to correct relevant errors.
It is important to note that the MODIS derived flood maps may underestimate flooded areas for a number of reasons. Because image acquisition was not systematic before 2011, minor flood events might have not been recorded. In addition, flood extent detection can be hindered by cloud cover, although this can be compensated by the cumulative approach used by the DFO ( Brakenridge and Anderson, 2006;Westerhoff et al., 2013;Brakenridge et al., 2013 ).
To further improve the data coverage and quality, we integrated the cumulated MODIS maps with flood extents derived from several satellite sensors available at the UNOSAT portal. In particular, for the Indus and Mekong river basins several flood events are recorded, with different maps available. Interestingly, for the Indus and Ganges-Brahmaputra basins the comparison of flood extent derived from MODIS sensors and other satellite sensors for the same flood events did not show relevant differences. For the Indus River, the large data coverage of the UNOSAT database also allowed to assess the cumulated flood maps from DFO data, and again we did not find relevant differences or errors. However, we cannot exclude that the applied procedure of summing up a large database of images might produce significant errors in other cases.
Contrary to official flood hazard maps, the maps of cumulated flood extent for the period 20 0 0-2013 cannot be linked to a single return period of occurrence, and this requires to modify the hydrological input of the methodology to produce simulated maps comparable to observations. This has been achieved by producing a global flood map derived from the GloFAS discharge climatology for the time span 20 0 0-2013. As can be seen in the scheme in Fig. 1 , the only modification to the standard methodology is that synthetic hydrographs are derived by taking the maxima of discharge in each point of the river network, instead of the peak discharges for a specific return period. It should be noted that in this case no hypothesis about the return period of the hydrological data is assumed. Since the simulation is referred to reanalysis data, ideally the simulated discharges should have the same return period of the "real" discharges actually occurred in the same period, which may be variable in the river network. Therefore, this approach offers the possibility to test all the components of the modelling approach excluding the extreme value analysis.

Results
The mapping procedure has been used in each continent to produce flood hazard maps with return periods of 10, 25, 50, 100, 250, 500 and 1000 years. The flood hazard map for Africa for 100y return period is shown in Fig. 6 . The maps for the other continents for the same return period are available online on the GloFAS website ( http://globalfloods.eu ).

Evaluation of results
Tables 1 and 2 report the value of the performance indices in the test river basins, together with the results for other global flood models where they are available. When comparing the performances, it is important to keep in mind the differences in modelling resolution and river network extent of each modelling framework. As already mentioned, the GloFAS maps have a 30 resolution and are defined only for river channels with an upstream catchment area > 50 0 0 km 2 . For the model results presented by Sampson et al. (2015) and Alfieri et al. (2014) , we show here the scores computed considering only the basin areas > 500 km 2 (see Sampson et al. for more details). For both models, spatial resolution is 3 . For the modelling framework presented by Winsemius et al. (2016) , the resolution is 30 and rivers included have generally an upstream area above 10,0 0 0 km 2 . For all the models and area, the comparison is carried out by resampling the official hazard maps to a 3 resolution.
Figs. 7 and 8 illustrate the comparison of simulated flood maps against official flood hazard maps, while Figs. 9 and 10 show the simulated and satellite-derived flood extent in some selected areas, computed for the reference period 20 0 0-2013. The intersection of reference and simulated flood extent, that is, the areas correctly predicted by the methodology, is also shown.
Here we provide a short description of the results, while a more detailed discussion is undertaken in Section 5 .
As can be seen, the model provides fairly good results in the Po river basin ( Fig. 7 ), especially when taking into account only the area actually simulated in GloFAS. Results are less satisfactory in the Severn and Thames ( Fig. 8 ), but still comparable to the other models, with lower scores compared to the modelling framework of Sampson et al., and results comparable to models by Winsemius et al. and Alfieri et al. Poor results are produced in the Elbe area, although it should be remembered that the flood maps in this area are computed taking into account the existing flood protection structures.
A more detailed discussion on the models' performances is presented in Section 5 , here we underline two points that should be considered. First, most of the rivers considered in the test have confined floodplains with a limited extension compared to world's large rivers, therefore flood extent might be relatively easy to predict for a global model ( Hunter et al., 2007 ). Second, the scores can be influenced by the coarser modelling resolution, with respect to the other methods considered (90 m for the methods by Sampson et al. and Alfieri et al.,500 m for Winsemius et al).
The comparison against satellite-derived maps allows to investigate the model performances in different climatological and hydrological contexts. As can be seen from the ratio between satellite and GloFAS maps in Table 2 , there is a large overestimation in the river basins of Indus, Tocantins and Niger, as indicated also by the large values of the false alarm ratio. The overestimation is mainly concentrated in the delta areas and in inland wetland areas, such as the Niger inner delta ( Fig. 9 ). While for the Niger sea delta this can be due the dense vegetation cover, large overestimation occurs also in semi-arid and wetland areas not included in the present analysis, such as the Pantanal (Paraguay River basin) and the Sudd swamps (White Nile River basin) (see Section 4 for a detailed discussion). On the other hand, the values of the hit ratio for Indus, Niger and Tocantins indicate that most of the MODIS observations are correctly included in the simulated flood extent.
In other river basins like the Ganges and Brahmaputra, the total flood extent is similar both in simulated and observed flood maps, however the skill of the produced flood maps is relatively low.
In other areas, the observed flood extent seems to be better reproduced by the methodology. The performance is fairly good in the river basins of Irrawaddy ( Fig. 10 ) and Mekong, but also in large reaches of the rivers Niger ( Fig. 9 ) and Tocantins. As already mentioned, this might be due to the favourable morphology of the river floodplain.      Regarding the Irrawaddy basin, it should be noted that MODIS maps include also the flooded large coastal areas due to the NAR-GIS cyclone in 2008. Most of these areas are actually not reproduced by the methodology, however, since a separation of areas flooded due to storm surge and to river flooding was found to be not feasible, we opted for not modifying the map.

Sensitivity analysis
The developed modelling framework depends on several factors, which can influence the accuracy of the flood hazard maps produced. Since a formal calibration process would not be feasible given the global scale of the application, we performed a quantitative sensitivity analysis taking into account a number of relevant parameters and modelling assumptions.

Roughness
Roughness values are generally considered as an important calibration parameter for inundation models. To evaluate the effect of this parameter on flood extent, we performed two additional simulations in the Orinoco River basin using the discharge dataset for the 100 years return period and two different sets of roughness values. These sets of values have been derived by decreasing by 50% (R2) and increasing by 50% (R3) the reference set of values (R1). Note that the Orinoco has been chosen as test site because there are large unconfined floodplain areas.
Results in terms of relative change of flooded area are reported in Table 3 . As can be seen, large variations in roughness parameters seem to have a limited influence on flood extent, suggesting that the overall sensitivity of the flood modelling framework to this parameter is relatively low. Similar differences are also observed considering water levels.

Hydrological input
The influence of the streamflow climatology on the results has been evaluated by considering the variation of total flood extent according to the return period, which is shown in Fig. 11 for all the continents. The lower value of the curves, corresponding to permanent water bodies (lakes, reservoirs and large rivers), has been derived from a corrected version of the Global Lakes and Wetlands Database (available at http://www.worldwildlife.org/pages/ global-lakes-and-wetlands-database ).
As can be seen, the differences in flood extent are relatively small, with limited increases in flooded area for higher return periods. Higher variability is observed for South America, while North America and Europer have relatively flat curves. The limited sensitivity to inflow discharge might depend on the absence of flood defences and wave attenuation in the mapping approach, as we discuss in Section 4 . However, the representation of extreme values in discharge climatology can be a more influent factor. To investigate this issue, we performed a comparison of the peak discharges for different return periods and river reaches, derived from in situ observations and GloFAS. The comparison is done for a number of catchments where long enough time series (e.g. 15 years of continuous daily data) were available, based on the work by Hirpa et al. (2016a) . Available data are mostly from North America, with few data from Europe and Australia, and a list of representative gauge stations are listed in Table 4 with results shown in Table 5.
As can be seen from the results, there is a tendency to underestimation, although in the majority of rivers the difference is ± 25%. Underestimation in GloFAS seems to occur especially in locations at higher latitudes. It should be noted that many rivers are regulated by dams and reservoirs, which may alter the distribution of extreme discharges.
A further modelling assumption that can influence the results is the shape of the synthetic hydrographs ( Section 2.2 ), as it determines the total flood volume and therefore flood extent and depth. A systematic comparison at global scale of observed and simulated hydrograph would not be feasible because of time required and because observations are not available in most of the rivers. Therefore, we compared a number of simulated hydrographs with observed hydrographs, in rivers where observations are available. Fig. 12 shows two examples of this comparison for the Rivers Irrawaddy and Brahmaputra, where GloFAS synthetic hydrographs computed for the 10 year return period are compared against flow events with comparable peak discharge. As can be seen, the hydrograph shapes in both cases are in good agreement, although some characteristics like consecutive peaks of discharge cannot be modelled.

Vegetation cover
DEM correction given by vegetation cover can also be considered as a "calibration" parameter. In order to assess the influence  11. Flood extent computed for all the continents for return periods from 10 to 500 years, expressed as the percentage of continental area flooded.

Table 5
Comparison of extreme discharges derived from GloFAS simulations and observed time series for the return periods of 10 years (RL10) and 100 yeras (RL100). of vegetation removal on flood maps, simulations using both uncorrected and corrected DEM have been carried out. The comparison showed that vegetation cover does play a role in areas with dense and uniform vegetation cover, like tropical rainforests (Amazon and Congo River basins, Indonesia, New Guinea) and high latitude forests in Canada, Scandinavia and Russia. For the whole Amazon River basin, vegetation removal results in an increase in flooded area of 27.5%. In the Congo basin, the increase in flooded area is 9.3%, while in other basins the differences are less significant. However, it is difficult to evaluate at global scale the actual improvement given by vegetation removal, given the mentioned limitations of MODIS flood maps in areas with dense vegetation cover (Section 3.1.2). This prevents the possibility of a calibration of the chosen parameters as done for other local scale studies ( Baugh et al., 2013 ).

Resolution
The modelling approach applied in this research adopts a relatively coarse modelling resolution (30 ) with respect to the maximum possible precision of flood mapping, which is theoretically given by the 3 SRTM DEM. While flood maps can be downscaled at higher resolution to increase their precision as done in previous works (e.g. Winsemius et al., 2013;Sampson et al., 2015 ), it is important to evaluate the influence of the chosen resolution on the results. To this end, the flood hazard map at 100y for Europe has been compared with a higher resolution flood map, developed by Alfieri et al. (2015) at 100 m resolution using the discharge climatology of the European Flood Awareness System (EFAS). Since the European-scale map includes reaches with upstream area above 500 km2, which are not modelled at global scale, we excluded from the comparison all the flooded areas located farther than 25 km from the global flood map.
The results of the analysis are reported in Table 5 , using the same indices applied for the comparison against satellite flood maps ( Section 3.1 ). In this case the EFAS flood maps are used as reference. The analysis has been done separately for Scandinavia and the rest of Europe, because of the different terrain datasets available. Fig. 13 reports a visual comparison for the area of lower Danube course.

Table 6
Comparison of European flood hazard maps computed from GloFAS simulations (current work) and EFAS simulations .  Fig. 13 shows a good agreement between global-and Europeanscale maps for the Danube River basin. Despite the different modelling resolution and the denser river network considered in the European-scale map, especially along the major river network the differences are not significant, and overprediction is limited to few areas, as shown by the scores in Table 6. On the other hand, agreement is lower for Scandinavia, due to the less accurate terrain dataset used ( Section 2.2 ).

Discussion
The results of the sensitivity analysis and flood maps evaluation need to be discussed considering the assumptions and limi-tations of the methodology, along with the limitations of available datasets.

Hydrological dataset
The reliability of the GloFAS discharge climatology depends on the limitations of the meteorological forcing and the hydrological modelling structure. In particular, the ability of ERA-Interim climatological dataset to reproduce extreme precipitation events is a crucial element for the performance of the flood modelling framework. ERA-Interim rainfall data have generally been assessed in terms of climatological variables such as monthly or daily average precipitation ( Dee et al., 2011: Szczypta et al., 2011. Balsamo et al. (2015) performed validation exercises against various GRDC stations, though the comparison was not specifically focused on extremes. Thiemig et al. (2012Thiemig et al. ( , 2013 tested the precipitation data in four African river basins, including extreme values. The analysis highlighted limitations in semi-arid areas due to the dataset low resolution and the consequent inability to reproduce some small and medium scale climate features. This might generate underestimations of extreme discharges in rivers with limited upstream drainage area and in semi-arid areas, and could be an explanation of the results shown for GloFAS extreme discharges in Section 4.2.2 . Regarding the reliability of the current GloFAS setup, it should be considered that, even if the modelling framework has not yet been calibrated, all the components of GloFAS have undergone evaluation tests. As already described, runoff data taken from ERA Interim Land underwent a dedicated validation performed by Balsamo et al. (2015) , while GloFAS discharges were evaluated by Alfieri et al., (2013) and in the present work. In addition, one should note that the Lisflood-Global calibration parameters have a physical meaning, hence current values have been chosen in a realistic range ( Alfieri et al., 2013 ). The calibration process for the river routing part (Lisflood-Global) is currently ongoing and the improvements in the overall performance are under evaluation ( Hirpa et al., 2016b ).
In addition, other limitations of the current model setup need to be considered. First, lakes and reservoirs are not yet included. While this is likely to alter flow routing in several rivers in normal flow conditions, the effects on discharge extremes are more complex to evaluate. This analysis is currently ongoing and the next GloFAS setup will include a large dataset of the most important lakes and reservoirs ( Zajac et al., 2016 ). Also, flow routing along rivers in GloFAS is computed by routing all the discharge downstream, apart from evaporation losses. Therefore, possible peak wave attenuation during high flows due to water storage in flooded areas is not accounted for. This issue is present also in the flood modelling framework, as in each flood point the not-attenuated hydrological input is used. This means that during high flows in unbanked rivers along floodplains, discharges might be overestimated, resulting in an overestimation of flood extent. Alfieri et al. (2013) carried out an extensive comparison of Glo-FAS discharges against observations in several rivers. The analysis showed good performances in rivers like the Amazon, Mississippi and Indus ( Fig. 14 ), while for rivers in semi-arid areas (e.g. the Niger) a general overestimation of discharges was observed, probably due to the underestimation of evaporation and infiltration losses ( Fig. 15 ). Although infiltration processes do represent a relevant component of the water balance in semi-arid regions, their implementation in GloFAS hydrological simulations is currently still limited, despite the ongoing improvements in the modelling structures. It should be noted that this model limitation may partly compensate for local underestimations of peak discharges due to the climatology. Additional research and the availability of new observations would be necessary to further investigate this point.
Finally, a further source of uncertainty derives from the extrapolation of peak discharges for high return periods, given to the limited length of the reference climatological period.

Terrain datasets
Terrain datasets are a major source of uncertainty in any flood model ( Dottori et al., 2013 ) and especially in large scale models, given the well-known limitations of existing global datasets regarding in particular the reproduction of river network ( Sampson et al., 2015 ). While river channels width comparable to grid resolution are generally represented by the SRTM dataset, the real channel depth is not detected. Even though this issue has been addressed in the modelling framework ( Section 2.3 ), underestimation of channel conveyance might still occur. Besides the inaccuracy in representing the river network, the quality of the 30 DEM might be a further cause of local inaccuracies in flood hazard maps where smaller scale features are important in flood propagation.
A further limitation is the absence of information on flood defence and control structures at global scale, although ongoing research effort s are st arting to fill this gap ( Scussolini et al., 2015 ). Dykes and other floodplain embankments (e.g. motorways, railways) are generally not reproduced by SRTM, which means that overbank flow often results in floodplain inundation. On this point, it should be noted that many river reaches especially in Africa and South America do not have flood control structures and therefore large flooding actually occurs even for relatively low return periods. In these areas the proposed methodology is likely to provide better results, as can be seen for instance in the middle reach of the Niger River and in the Irrawaddy River ( Figs. 6 and 9 ). However, the good results might also depend on a favourable morphology of local river floodplain ( Hunter et al., 2007 ). Comparisons with local flood depth values or with other models should be carried out to further investigate the actual reliability of the methodology. Another important issue to note, especially in developed countries, is that flood events are mostly caused by the failure of levee systems, which are difficult to predict and incorporate into large scale flood hazard studies ( Falter et al., 2014;Thieken et al., 2014 ).

Flood modelling scheme
The proposed scheme based on the parallel simulations of local flood processes contains a number of approximations, which are partly due to the limitations of the GloFAS river routing scheme. First, peak wave reduction due to water storage in flooded areas is not accounted for, as mentioned in Section 5.2 . In addition, the modelling framework does not include water courses with an upstream area below 50 0 0 km 2 , which means that some flood prone areas are excluded from the flood hazard maps, as shown by the comparison with European-scale maps in Section 4.2.4 . However, the proposed procedure can easily be modified to incorporate minor rivers in the flood maps, should more detailed datasets and models be available.
The reproduction of river channels is another potential issue, as the simple approach applied might result in local wrong estimations in channel conveyance. The use of more sophisticated algorithms to derive channel geometry from terrain datasets might improve the results, although these methods require a calibration to local conditions ( Yamazaki et al., 2012 ). The design of a suitable method for global-scale calibration of river channels is a point that will be addressed in future research work. Apart from that the methodology does not explicitly account for secondary river branches and multiple channel reaches, which play an important role in flooding dynamics in wetlands ( Neal et al., 2012 ). These limitations might influence the distribution of flooding volumes in flood prone areas and are probably a cause of the overestimation observed in the Ganges-Brahmaputra and Mekong deltas, in the Niger inner delta ( Fig. 6 ) and in other large deltas of rivers like the Nile and the Mississippi. On the other hand, the good agreement with the European-scale flood maps described in Section 3.2.4 suggests that the global methodology can provide good results for rivers with not braided channels and significant wetland areas, provided that the hydrological input is reliable.
Flooding of coastal areas might be increased by storm surge events which are not included in the modelling framework, leading to underestimation of flood extent (as discussed for the Irrawaddy River delta in Section 3.1). Flood events caused by ice jams in rivers located at high latitudes (e.g. Canada, Russia) are also not accounted.
Finally, the approach for producing flood hazard maps, based on domain decomposition and independent flood simulations, would require verification. A possible method to evaluate the accuracy would be to compare the results against a reference flood simulation in a given study area (e.g. Niger inner delta), where the simulation domain coincides with the study area itself. This could be addressed in future research works.

Conclusions and next developments
In this work, we presented a modelling framework for mapping flood hazard at global scale, which is designed by taking into account the most recent advances in large scale flood modelling. The framework is fully integrated in the Global Flood Awareness System (GloFAS), and can be seen as a first step to integrate the Glo-FAS forecasting system with flood impact assessment.
An extensive evaluation of the quality of the maps produced has been performed using official hazard maps, flood maps pro-duced by other large scale models and flood extent data retrieved from different datasets of satellite imagery. To the authors' knowledge, this is the first attempt to carry out a large scale evaluation of a global flood hazard mapping methodology for river basins in several continents. Given the limitations in quality and quantity of observed datasets, the results here presented could be seen as a first step to investigate the validity of the proposed approach, and more research will be necessary for a comprehensive global scale evaluation. Nevertheless, such an approach allowed for testing the model performance in different climatological and morphological contexts, and provided useful indications on the potential and limitations of the modelling framework. In addition, a comprehensive sensitivity analysis of the flood modelling approach has been carried out, to identify the most relevant sources of uncertainty. In European rivers, the model results were generally comparable to previous applications of global models by Alfieri et al (2014), Sampson et al. (2015) and Winsemius et al. (2016) . In larger river basins in Africa, Asia and South America the evaluation produced more variable results, with better performances in rivers like the Mekong and the Irrawaddy, and poor in semi-arid areas like the Niger.
These findings confirm that the current version of modelling procedure have limitations, such as the low accuracy of hydrological data in semi-arid areas and the need of refining the mapping methodology to reproduce complex hydraulic contexts such as large wetland areas and rivers with multiple channels. On one hand, the results provided valuable indications on the way to improve the flood mapping methodology, while the foreseen improvements of GloFAS, including the global scale calibration of the river routing and the inclusion of lakes and reservoirs, are likely to improve the hydrological dataset. On the other hand, the relatively good results obtained in a number of tests suggest that the methodology can be an effective tool for large scale flood hazard mapping.
In order to gain a better perspective of these performances, the ongoing intercomparison exercises among global flood models will provide crucial information, allowing to investigate the reliability of the existing modelling approaches in the regions of the globe ( Trigg et al., under review ). On this point, we believe that more efforts should be aimed at building reliable large scale datasets for model assessment in different regions of the globe. The evaluation procedure adopted here, based on official, simulated and satellitederived maps might be used for the assessment of other global flood hazard models, taking advantage of the increasing availability and quality of remote sensing datasets, such as the foreseen Surface Water Ocean Topography (SWOT) satellite mission ( Alsdorf et al., 2007 ). Significant improvements will be possible with the development of new global datasets. Currently, the most relevant constraint is given by the limited quality of the terrain elevation data Sampson et al., 2015 ), and by the absence of global datasets of river depth and flood defences. The situation is likely to improve thanks to the release of new global databases such as the TanDEM-X DEM, the public version of the 1 arcsecond SRTM and a first global methodology for the assessment of flood defences ( Scussolini et al., 2015 ). However, we expect that a substantial improvement in the accuracy of large scale flood hazard maps (especially for flood hazard in urban areas) will only be possible with the availability of terrain datasets designed for flood modelling ( Schumann et al., 2014 ) and reliable global datasets of river geometry.
Finally, the integration of our flood hazard mapping methodology within GloFAS, in a flood monitoring system running seamless at global scale offers interesting perspectives for forthcoming research. Thanks to the constant development and application of its hydrological components, GloFAS provides a robust framework to the proposed methodology, allowing for a rapid improvement of the flood hazard maps each time that significant advances to GloFAS models and datasets are introduced. In addition, the structure designed for the flood maps has the potential to be applied for operational flood hazard and risk mapping within GloFAS. An application at European scale has already been successfully tested , and it is foreseen to test its implementation within GloFAS.