Benchmarking flexible meshes and regular grids for large-scale fluvial inundation modelling

Damage resulting from flood events is increasing worldwide, requiring the implementation of mitigation and adaptation measures. To facilitate their implementation, it is essential to correctly model flood hazard at large scale, yet at fine spatial resolution. Flexible meshes are an efficient means of reducing the computational load of models compared to uniform regular grids. Yet, thus far they have been applied only in bespoke small-scale studies requiring a high level of a priori grid preparation. To better understand possible advantages as well as shortcomings of their application for large-scale riverine inundation simulations, three different flexible meshes were derived from Height Above Nearest Drainage (HAND) data and compared with regular grids under identical spatially explicit hydrologic forcing by using GLOFRIM, a framework for integrated hydrologic-hydrodynamic inundation modelling. By means of GLOFRIM, output from the global hydrologic model PCR-GLOBWB was passed to the hydrodynamic model Delft3D Flexible Mesh. Results show that applying flexible meshes can be beneficial depending on the envisaged purpose. For discharge simulations, similar model accuracy was obtained for flexible meshes and regular grids, with the former generally having shorter run times. For inundation extent simulations, however, the coarser gridding of flexible meshes in upstream areas results in poorer performance if assessed by contingency maps. Moreover, while the ratio between minimum and maximum spatial resolution of flexible meshes has limited impact on discharge simulations, water level estimates may be more strongly influenced by the application of larger grid cells. As this study presents only a small set of possible realizations, additional research is needed to unravel how the data and methods used as well as the choices for discretizations influence model performance.
Generally, the application and particularly the discretization process of flexible meshes involve more options, bringing more responsibilities for the user. Once an a priori decision is made on the model purpose, flexible meshes can be a valuable addition to modelling approaches where short run times are essential, facilitating large-scale flood simulations, ensemble modelling or operational flood forecasting.


Introduction
In recent years, losses due to riverine inundations increased strongly: between 1980 and 2013, they exceeded $1 trillion of direct economic losses and more than 220,000 fatalities (Munich Re, 2013). This development can be attributed to the growth of both population and asset values in floodplains (Ceola et al., 2014; Winsemius et al., 2016) as well as changes in river regimes (Jongman et al., 2012; Munich Re, 2010; Visser et al., 2012; Winsemius et al., 2016). Despite inherent uncertainties, several studies indicate that flood risk will increase in the future (Hirabayashi et al., 2013; Jongman et al., 2014; Winsemius et al., 2016).
To capture the driving climate-flood interactions and processes worldwide, it is beneficial to apply global hydrologic models (GHMs) to guarantee seamless large-scale inundation modelling across basins and borders. Besides modelling flood hazard at such a scale, information should be provided at a spatial resolution sufficiently fine to be "locally relevant" (Bierkens et al., 2015), facilitating stakeholder involvement (Beven et al., 2015). However, the finest spatial resolution achieved for GHMs is currently 10 km x 10 km at the Equator (Bierkens, 2015).
One way to improve the applicability of GHMs would be by simulating lateral floodplain flow and channel-floodplain interactions at a finer scale. Moving to a finer scale is, however, not straightforward as the current debate about "hyper resolution" shows (Beven et al., 2015;Bierkens et al., 2015;Wood et al., 2011).
In contrast to GHMs, hydrodynamic models can run at a finer spatial resolution, for instance 1 km globally (Sampson et al., 2015) or 30 m for the Continental United States (Wing et al., 2017). A downside of hydrodynamic models, however, is that they often use observed discharge as model forcing or employ synthesized flood waves, hence not accounting for all relevant hydrological processes. Consequently, the spatial correlation of large-scale flood events as well as the impact of climate change on flood hazard and risk can be simulated only with concessions.
One way to circumvent the problems associated with the coarse spatial resolution of GHMs and the data dependency of hydrodynamic models is hydrologic-hydrodynamic model coupling. On smaller scales, this was already achieved (Felder et al., 2018; Kim et al., 2012; Viero et al., 2014), and in a more recent study Hoch et al. (2017a) coupled large-scale hydrologic and hydrodynamic models. Besides spatial extent, the latter approach distinguishes itself from others in that it employs the Basic Model Interface (BMI; Peckham et al., 2013), providing a flexible coupling design that avoids changes to, and entanglement of, model code. By means of this interface, output from PCR-GLOBWB (PCR; Sutanudjaja et al., 2017) forced the hydrodynamic model Delft3D Flexible Mesh (DFM; Deltares, 2018a; Kernkamp et al., 2011). While discharge simulations improved, the extent to which the chosen flexible mesh impacted results remained unclear, calling for additional research on the use of flexible meshes for large-scale inundation modelling.
Despite the large number of studies employing flexible meshes for fluvial flooding (Castro Gama et al., 2013; Kim et al., 2014; Kumar et al., 2009; Sanders et al., 2010; Schubert et al., 2008), none explicitly assesses the role of different mesh configurations, let alone for large-scale applications. While for these bespoke studies an efficient mesh was usually created first, such fine-tuning is too time-consuming for large-scale inundation modelling, which may encompass several larger catchments. What is needed instead are fast approaches to generate flexible meshes over large areas covering a wide variety of topological properties. To make maximum use of the potential of flexible meshes, a resolution sufficiently fine to provide locally relevant results has to be determined a priori. How various degrees of mesh refinement impact large-scale inundation modelling results has hardly been researched so far, and thus additional insight is needed.
In contrast to flexible meshes, there is a multitude of studies investigating various aspects of regular grid refinement. For instance, it was found that spatial resolution impacts the accuracy of inundation estimates (Savage et al., 2016a), water depth estimates and floodplain drainage flow (Savage et al., 2016b), and channel flow through near channel storage effects (Horritt and Bates, 2001a). Hardy et al. (1999) concluded that grid resolution impacts simulated discharge linearly and water depth in a less structured way due to the impact of the geographical surrounding of each observation location. Comparable results were obtained by Fewtrell et al. (2008) in a small urban environment.
In this study, we add considerations for large-scale (potentially even global-scale) flood hazard models using a fast set-up of flexible meshes. We present a first benchmark and sensitivity analysis to advance our understanding of how model accuracy scales with flexible mesh discretization in large-scale studies. Eventually, we want to better understand a) how different configurations of flexible meshes influence model accuracy, b) how results differ between flexible meshes and regular grids, and c) what lessons can be learned for future applications. The analysis was performed by employing GLOFRIM, a globally applicable framework for integrated hydrologic-hydrodynamic modelling (Hoch et al., 2017b). GLOFRIM is an openly accessible, modular, and extensible tool facilitating model coupling, currently allowing for spatially coupling PCR with DFM or LISFLOOD-FP (LFP; Bates et al., 2010). Decisive for applying GLOFRIM were the requirement to guarantee identical spatially varying model forcing for all discretizations and the need to include all river reaches and floodplains in the analysis.
Three different 1-D/2-D flexible meshes of the lower Elbe basin (Figure 1) were forced with identical output from PCR at 30 arc-minutes spatial resolution. All flexible meshes were created based on the Height Above Nearest Drainage (HAND) method (Rennó et al., 2008). We decided to use HAND as it requires only little input data and is fast in computing topographical gradients with respect to the channel network. To benchmark model results of flexibly gridded meshes, we also applied GLOFRIM to three regular grids. All model results were then validated against observed discharge values as well as benchmarked with respect to their simulated water levels, water volume, run time, and inundation extents.

PCR-GLOBWB
The global hydrologic model PCR-GLOBWB (PCR; Sutanudjaja et al., 2017) distinguishes between two vertically stacked soil layers, an underlying groundwater layer, and a surface canopy layer. Water can be exchanged vertically, and excess surface water can be routed horizontally along a local drainage direction network, employing the kinematic wave approximation. The model was forced with Climatic Research Unit (CRU) precipitation and temperature data (Harris et al., 2014), and potential evaporation was computed using the Penman-Monteith equation. Data sets were downscaled to daily fields for the period from 1957 to 2010 using ECMWF ("European Centre for Medium-Range Weather Forecasts") re-analysis products (ERA40/ERAI; Kållberg et al., 2005) as outlined in van Beek (2008). PCR furthermore takes into consideration irrigation water demand and industrial and domestic water abstraction based on reported water demand (FAO, 2017). To use the best possible hydrologic forcing for the hydrodynamic model, we applied a regional optimization scheme to find the parameterization yielding the most accurate discharge estimations at Neu-Darchau (Figure 2). Further explanation regarding the optimization technique can be found in Hoch et al. (2017a). Manning's surface roughness coefficients of 0.04 s m^-1/3 and 0.07 s m^-1/3 were used for river channel and floodplain, respectively.
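To illustrate what the two roughness coefficients imply for flow velocity, the following minimal sketch evaluates Manning's equation, v = (1/n) R^(2/3) S^(1/2), for both values. The hydraulic radius and slope are hypothetical illustration values and are not taken from the model set-up.

```python
def manning_velocity(n, hydraulic_radius_m, slope):
    """Mean flow velocity (m/s) from Manning's equation in SI units."""
    return (1.0 / n) * hydraulic_radius_m ** (2.0 / 3.0) * slope ** 0.5


N_CHANNEL = 0.04     # s m^-1/3, channel value stated in the text
N_FLOODPLAIN = 0.07  # s m^-1/3, floodplain value stated in the text

# Hypothetical geometry, chosen only for illustration.
radius_m = 2.0
slope = 1e-4

v_channel = manning_velocity(N_CHANNEL, radius_m, slope)
v_floodplain = manning_velocity(N_FLOODPLAIN, radius_m, slope)
ratio = v_channel / v_floodplain

print(f"channel: {v_channel:.3f} m/s, floodplain: {v_floodplain:.3f} m/s")
print(f"velocity ratio channel/floodplain: {ratio:.2f}")
```

For identical hydraulic radius and slope, the velocity ratio equals the inverse ratio of the coefficients, 0.07/0.04 = 1.75, independent of the assumed geometry.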

Delft3D Flexible Mesh
Delft3D Flexible Mesh (DFM; Kernkamp et al., 2011) allows its user to discretize the 2-D model domain either with a flexible mesh, applying different geometrical shapes at various resolutions, or with a regular grid, using the same spatial resolution over the entire domain. The application of a flexible mesh with DFM is both mass and momentum conservative as a) the continuity equation is formulated in a conservative way and b) requirements of orthogonality must be met for stable model runs. For instance, triangles must be acute, that is, none of the internal angles may be larger than 90°. For further information on the use as well as technical descriptions of DFM, we refer to the user manual and technical reference manual (Deltares, 2018a, 2018b). In contrast to PCR, DFM solves the full shallow water equations and thus can capture important flood triggering processes such as backwater effects (Moussa and Bocquillon, 1996). To maintain comparability, DFM also employs Manning's surface roughness coefficients of 0.04 s m^-1/3 and 0.07 s m^-1/3 for 1-D channels and 2-D floodplain flow, respectively.
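The angle requirement on triangular cells mentioned above can be checked with a few lines of geometry. The sketch below is a stand-alone illustration and not part of DFM or its grid generator; the function names and the tolerance are assumptions introduced here.

```python
import math


def max_internal_angle_deg(p1, p2, p3):
    """Largest internal angle (degrees) of the triangle p1-p2-p3."""

    def angle_at(a, b, c):
        # Angle at vertex a between the edges a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        norm = math.hypot(*v1) * math.hypot(*v2)
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

    return max(angle_at(p1, p2, p3), angle_at(p2, p3, p1), angle_at(p3, p1, p2))


def satisfies_angle_criterion(p1, p2, p3, limit_deg=90.0, tol=1e-9):
    """True if no internal angle exceeds limit_deg (within tolerance)."""
    return max_internal_angle_deg(p1, p2, p3) <= limit_deg + tol


equilateral = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))
flat = ((0.0, 0.0), (4.0, 0.0), (2.0, 0.5))  # obtuse angle at the apex

print(satisfies_angle_criterion(*equilateral))
print(satisfies_angle_criterion(*flat))
```

Degenerate triangles with coinciding vertices are not handled and would raise a division error.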

GLOFRIM
GLOFRIM is a globally applicable framework for integrated hydrologic-hydrodynamic modelling (Hoch et al., 2017b). With GLOFRIM, it is possible to perform spatially explicit coupling between hydrologic and hydrodynamic models on a time step basis. With the current version of GLOFRIM, PCR can be coupled to either the DFM model used here or LISFLOOD-FP (Bates et al., 2010). Applying GLOFRIM has two major advantages: first, identical model forcing is provided through PCR output, guaranteeing reproducibility and comparability; and second, setting up a coupled hydrologic-hydrodynamic model is greatly facilitated due to the pre-defined workflow. GLOFRIM is built upon the "Basic Model Interface" (BMI; Peckham et al., 2013). In contrast to other model coupling studies employing internal coupling (Morita and Yen, 2002), using the non-invasive BMI allows continuing separate development of the models and avoids the entanglement of model code. By means of the BMI it is possible to retrieve, manipulate, and place model data during model execution. Hence, spatial coupling can be achieved by overlaying grids from two models, and assigning hydrodynamic to hydrologic cells on a grid-to-grid basis. Consequently, the hydrodynamic model is forced with output from PCR by exchanging runoff and discharge volumes between corresponding PCR and DFM cells. For further information regarding GLOFRIM, we refer to Hoch et al. (2017b).

Hydrodynamic discretizations
Six different DFM discretizations of the lower Elbe basin were designed (Figure 2): three with spatially varying grid size and three with uniform grid size. For the flexible meshes, we designed these set-ups: "F1" used a length of 1600 m for its coarsest resolution whereas "F2" used 3200 m. Both employed 400 m for their smallest cells. Comparable to F1, "F3" also used 1600 m for its coarsest resolution but used only 800 m for its finest cells, in order to evaluate the effects of both the largest and finest cell lengths.
For the regular grids, we set up discretizations with 400 m ("R1"), 800 m ("R2"), and 1600 m spatial resolution ("R3") to be compared with the flexible meshes. Additional descriptive statistics can be found in Table 1. None of the hydrodynamic discretizations were calibrated as the impact of model parameters scales with spatial resolution (Fewtrell et al., 2008). To derive the model discretization for DFM, we used HydroSHEDS surface elevation and drainage network data at 15 arc-seconds (Lehner et al., 2008) to apply the "Height Above Nearest Drainage" algorithm (HAND; Rennó et al., 2008).
We opted for HAND as it provides a tool for fast grid generation in terms of both data requirements and execution time and is thus well suited for large-scale applications. Besides, HAND was applied for other inundation modelling studies (Nobre et al., 2016;Speckhann et al., 2018) and has only a user-defined upstream area threshold as possible source of uncertainty.
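As a rough illustration of the HAND concept, the sketch below follows each cell's flow path downstream until it reaches the drainage network and takes the elevation difference. It is a simplified toy version operating on flat lists, not the implementation used in this study; the transect values, the upstream-area threshold, and all names are hypothetical.

```python
def compute_hand(elevation, downstream, is_drainage):
    """Height above nearest drainage for cells stored as flat lists.

    elevation[i]   : surface elevation of cell i (m)
    downstream[i]  : index of the cell that cell i drains into (-1 for an outlet)
    is_drainage[i] : True if cell i belongs to the drainage network
    Assumes the flow directions contain no cycles.
    """
    hand = [None] * len(elevation)
    for start in range(len(elevation)):
        cell = start
        # Follow the flow path until a drainage cell or an outlet is reached.
        while not is_drainage[cell] and downstream[cell] != -1:
            cell = downstream[cell]
        if is_drainage[cell]:
            hand[start] = elevation[start] - elevation[cell]
    return hand


# Hypothetical valley transect: both hillslopes drain towards cell 2.
elevation = [12.0, 9.0, 5.0, 8.0, 11.0]
downstream = [1, 2, -1, 2, 3]
upstream_area = [1.0, 2.0, 5.0, 2.0, 1.0]  # arbitrary units

# The drainage network is defined by the user-chosen upstream-area threshold.
AREA_THRESHOLD = 4.0
is_drainage = [area >= AREA_THRESHOLD for area in upstream_area]

hand = compute_hand(elevation, downstream, is_drainage)
print(hand)
```

Lowering the hypothetical threshold would add cells to the drainage network and reduce HAND values on the hillslopes, which is why the threshold is the main source of uncertainty in the method.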
Various levels of grid refinement were achieved by using different initial grid sizes as well as varying values for both minimum grid cell size and maximum model time step.
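The six set-ups described above can be summarised compactly. The sketch below only encodes the cell lengths given in the text and derives the ratio between coarsest and finest cell length; the area ratio assumes square cells, which is a simplification for flexible meshes that may also contain triangles.

```python
# Cell lengths in metres as stated in the text: (finest, coarsest).
DISCRETIZATIONS = {
    "F1": (400, 1600),
    "F2": (400, 3200),
    "F3": (800, 1600),
    "R1": (400, 400),
    "R2": (800, 800),
    "R3": (1600, 1600),
}


def coarsening_ratio(name):
    """Ratio of coarsest to finest cell length of a discretization."""
    finest, coarsest = DISCRETIZATIONS[name]
    return coarsest / finest


def area_ratio(name):
    """Number of finest cells covered by one coarsest cell (square cells assumed)."""
    return coarsening_ratio(name) ** 2


for name in DISCRETIZATIONS:
    print(f"{name}: length ratio {coarsening_ratio(name):.0f}, "
          f"area ratio {area_ratio(name):.0f}")
```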
As 2-D floodplain elevation values we employed canopy-removed surface elevation data (Baugh et al., 2013; O'Loughlin et al., 2016) and hydraulically smoothed them to account for the vertical measurement errors inherent in remotely sensed elevation data (Yamazaki et al., 2012, 2017) before assigning them to the 2-D part of the grids. We based both the network and river width information of the 1-D channels on the "Global Width Database for Large Rivers" (GWD-LR; Yamazaki et al., 2014), while river depth information was derived by applying the equations of Leopold and Maddock (1953). Bathymetric information was stored at cross-sections with a spacing of around 10 km and subsequently interpolated between cross-sections along the river network. For both flexible meshes and regular grids, the 1-D channel discretization remained unaltered to guarantee consistency between model runs.
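The derivation of channel depth and its interpolation between cross-sections can be sketched as follows. The power-law form depth = c Q^f follows the hydraulic geometry concept of Leopold and Maddock (1953), with 0.4 being the often-quoted average downstream exponent for depth; the coefficient, the discharge values, and the node spacing are hypothetical and do not reproduce the parameters used in this study.

```python
import numpy as np


def channel_depth(discharge_m3s, coeff=0.3, exponent=0.4):
    """Power-law hydraulic geometry, depth = coeff * Q**exponent.

    Both parameters are illustrative assumptions, not the study's values.
    """
    return coeff * np.asarray(discharge_m3s, dtype=float) ** exponent


# Hypothetical cross-sections every 10 km along a 30 km reach.
chainage_km = np.array([0.0, 10.0, 20.0, 30.0])
discharge_m3s = np.array([300.0, 450.0, 600.0, 700.0])
depth_at_sections = channel_depth(discharge_m3s)

# Linear interpolation of depth to 1-D nodes between the cross-sections.
node_chainage_km = np.arange(0.0, 30.1, 2.5)
node_depth = np.interp(node_chainage_km, chainage_km, depth_at_sections)

for x, d in zip(node_chainage_km, node_depth):
    print(f"{x:5.1f} km: depth {d:.2f} m")
```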
We did not account for dikes and other man-made structures because reliable global data are lacking for large-scale applications, and implementing them would therefore introduce additional uncertainty to model results. For the same reason, we refrained from using sub-grid elevation data or spatially heterogeneous surface roughness values, which would typically be done for catchment-scale studies.

Assessment of model results
All test cases were run for the period 01 January 2002 [...] To obtain an impression of how simulated water levels differ throughout the basin, they were compared qualitatively at six observation stations covering the up-, mid-, and downstream parts of the basin (Figure 2).
Inundation extent was benchmarked for all discretizations similar to the approach and reasoning of Fewtrell et al. (2008). Thereby, the hit rate H, the false alarm ratio F, and the critical success index C were determined for each inundation map with respect to the map with the highest spatial resolution, R1. H, F, and C were computed with the subsequent equations where N_R1 and N_comp indicate the number of inundated cells in the benchmark map obtained with R1 and in the map of the discretization to be compared, respectively.
All parameters can vary between 0 and 1. While H=1 shows that all inundated cells in the benchmark data are also inundated in the comparison data, F=1 indicates that the inundated cells in the comparison are entirely false alarms with respect to the benchmark. The critical success index C, in turn, should be 1 for perfect agreement, thereby penalizing both under- and overprediction.
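The three scores can be sketched in code as follows, assuming the commonly used contingency-table definitions, which match the verbal description above: H relates hits to all benchmark-wet cells, F relates false alarms to all comparison-wet cells, and C relates hits to the union of wet cells. The sketch further assumes that both maps have already been resampled to a common grid; the example arrays are hypothetical.

```python
import numpy as np


def contingency_scores(reference_wet, comparison_wet):
    """Hit rate H, false alarm ratio F, and critical success index C.

    Both inputs are boolean-like arrays of identical shape in which True
    (or 1) marks an inundated cell.
    """
    ref = np.asarray(reference_wet, dtype=bool)
    comp = np.asarray(comparison_wet, dtype=bool)

    hits = np.count_nonzero(ref & comp)           # wet in both maps
    misses = np.count_nonzero(ref & ~comp)        # wet in benchmark only
    false_alarms = np.count_nonzero(~ref & comp)  # wet in comparison only

    hit_rate = hits / (hits + misses)
    false_alarm_ratio = false_alarms / (hits + false_alarms)
    critical_success = hits / (hits + misses + false_alarms)
    return hit_rate, false_alarm_ratio, critical_success


# Hypothetical 2 x 3 maps: R1 as benchmark, a coarser run as comparison.
benchmark = [[1, 1, 0], [1, 0, 0]]
comparison = [[1, 1, 1], [0, 0, 0]]

H, F, C = contingency_scores(benchmark, comparison)
print(f"H = {H:.2f}, F = {F:.2f}, C = {C:.2f}")
```

Maps without any inundated cells would lead to a division by zero and need to be handled separately.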
Unfortunately, it was not possible to validate simulated inundation extent against observations due to the lack of embankment height information and the resulting overestimation of simulated inundation extent.
Simulated discharge, water levels, and inundation extent were put into perspective by assessing simulated water volumes, which function as a proxy for overbank water storage. In addition, run times are reported to evaluate the computational efficiency of the different grids.

Simulated discharge
Three observations can be made across all six discretizations regardless of the gridding scheme. First, computed discharge exceeds observations for regular flow regimes, but underpredicts discharge for peak flow conditions (Figure 3). Further investigation revealed that this is mostly due to the discharge overpredicted by PCR, which thus already determines the potential accuracy of the coupled output. Second, the magnitude of exceedance increases downstream, as expressed by the increase in RMSE (Table 2a). We postulate that this larger bias is caused, at least partly, by the absence of hydrological processes in the hydrodynamic model, such as groundwater infiltration or evaporation. Last, the absence of dikes influences the shape of all simulated hydrographs. Without dikes, simulated discharge is smoother due to less flow constriction, dampening and lagging peak discharge in particular. While these aspects do affect model accuracy, all discretizations are affected equally and hence further benchmarking is not hampered.
We assess the influence of the applied gridding technique first. Comparing the KGEs of the discretizations with 400 m (F1, F2, and R1) and those with 800 m finest spatial resolution (F3 and R2) reveals that the application of a regular grid improves model skill only insignificantly compared to a flexible mesh discretization if the same finest spatial resolution is applied (Table 2a). Additionally, results obtained for the regular grid runs indicate that further coarsening of the grid from 800 m to 1600 m impacts discharge results less drastically than coarsening from 400 m to 800 m, especially with respect to peak discharge computations. Evaluating the impact of spatial resolution on discharge estimates, we find at ND that simulated discharge deviates only slightly between spatial resolutions (Figure 3a and Table 2a). For flexible meshes, F1 and F2 show near-identical discharge results while F3 yields lower estimates. Similarly, F1 and F2 yield comparable discharge results at TM and TR. The near-identical results of F1 and F2 at all three stations suggest that the choice of the finest spatial resolution within a flexible mesh strongly determines the accuracy of discharge simulations while the coarsest resolution is less influential. At these farther upstream stations, however, the deviation of F3 from F1 and F2 as well as of R3 from R1 and R2 is larger than at ND (Figure 3c-f). Since discharge at ND hardly differs between discretizations, the overall discharge volumes passing TM and TR should also be comparable to exclude any water balance errors. As this is not the case here, we re-ran all discretizations with cross-sections covering the entire floodplain width to exclude uncaptured floodplain flow as a cause.
Comparing discharge obtained from channel flow (Figure 3) with the full floodplain discharge (Figure 4) suggests that with coarser cells a larger fraction of total downstream flow travels via the 2-D floodplain cells, most likely due to the reduced number of 2-D cells available to accommodate floodplain flow.
Since model skill is near-identical at each station across set-ups (Table 2a), the new results provide insight into flood wave propagation: at the most upstream station TR, PCR discharge and DFM discharge correlate very strongly (r=0.94), but with increasing downstream distance the agreement between discharge simulated without and with GLOFRIM decreases, to r=0.75 at TM and r=0.57 at ND. This underpins the assumption made above that not accounting for open water evaporation and groundwater infiltration in hydrodynamic models can lead to a reduction of model accuracy. In the subsequent section, their potential influence is analysed in more depth.
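For reference, the skill measures used in this section can be computed as sketched below. The Kling-Gupta efficiency is written in its widely used original form, KGE = 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2), with alpha the ratio of standard deviations and beta the ratio of means; whether this study used exactly this variant is an assumption. The discharge series are hypothetical.

```python
import numpy as np


def pearson_r(sim, obs):
    """Linear correlation coefficient between two series."""
    return float(np.corrcoef(sim, obs)[0, 1])


def rmse(sim, obs):
    """Root mean square error in the units of the input series."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def kge(sim, obs):
    """Kling-Gupta efficiency (original formulation); 1 is a perfect score."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    r = pearson_r(sim, obs)
    alpha = sim.std() / obs.std()    # variability ratio
    beta = sim.mean() / obs.mean()   # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))


# Hypothetical daily discharge (m^3 s^-1); the simulation overestimates by 10%.
observed = np.array([300.0, 450.0, 800.0, 600.0, 400.0])
simulated = 1.1 * observed

print(f"r = {pearson_r(simulated, observed):.2f}")
print(f"RMSE = {rmse(simulated, observed):.1f} m^3 s^-1")
print(f"KGE = {kge(simulated, observed):.3f}")
```

In this example the correlation is perfect, yet KGE is reduced by the bias and variability terms, which shows why r alone does not describe model skill.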

Figure 4: Simulated and observed discharge at three observation stations throughout the basin for cross-sections covering the entire floodplain width, separately plotted for all flexible meshes (left) and regular grids (right): Neu-Darchau (a and b), Tangermuende (c and d), and Torgau (e and f)
Since, as mentioned above, discharge peaks are not well simulated by the hydrodynamic model, we performed a peak-above-threshold analysis to assess performance for peak flows separately (Table 2b). As threshold we used the long-term mean discharge per station as reported by the BfG: 705 m^3 s^-1 for ND, 562 m^3 s^-1 for TM, and 340 m^3 s^-1 for TR. Results suggest that for peak flow conditions, the chosen discretization approach as well as the absence of dikes and other flood wave containing measures impact model accuracy strongly and pose a limitation to using the discretizations applied here in an operational setting. Besides, results corroborate that capturing floodplain flow for coarser discretizations is even more important for peak flow conditions. These findings are, however, in line with expectations as we used only global data sets and thus applicability for local bespoke studies may be reduced (Ward et al., 2015).
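A peak-above-threshold evaluation of this kind can be sketched as follows. The thresholds are the station values given above; applying the threshold to the observed series and scoring the subset with RMSE are assumptions made for this illustration, and the discharge series are hypothetical.

```python
import numpy as np

# Long-term mean discharge per station (m^3 s^-1), as stated in the text.
THRESHOLDS_M3S = {"ND": 705.0, "TM": 562.0, "TR": 340.0}


def peak_subset(sim, obs, threshold):
    """Return simulated and observed values for time steps where obs > threshold."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    mask = obs > threshold
    return sim[mask], obs[mask]


def rmse(sim, obs):
    """Root mean square error in the units of the input series."""
    return float(np.sqrt(np.mean((np.asarray(sim) - np.asarray(obs)) ** 2)))


# Hypothetical series for station ND.
observed = np.array([300.0, 400.0, 800.0, 900.0, 600.0])
simulated = np.array([350.0, 450.0, 700.0, 850.0, 650.0])

sim_peak, obs_peak = peak_subset(simulated, observed, THRESHOLDS_M3S["ND"])
rmse_all = rmse(simulated, observed)
rmse_peak = rmse(sim_peak, obs_peak)

print(f"time steps above threshold: {len(obs_peak)}")
print(f"RMSE all flows: {rmse_all:.1f}, RMSE peak flows: {rmse_peak:.1f}")
```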

Water Volume
From Figure 5, three groups can be distinguished: the water volume of R3, which grossly exceeds all other simulated volumes; an intermediate group consisting of F1, R2, and F3; and the group of F2 and R1 containing the least water volume in the system. Overall, results show that the water volume stored in the system increases significantly when moving to a coarser discretization. The aggregation rate expressed as slope of the linear fit ranges between 11·10^-5 m^3 d^-1 and 25·10^-5 m^3 d^-1 for R1 and R3, respectively. Such accumulation may potentially lead to overestimation of simulated water levels and discharge (Table 4a).
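The aggregation rate as slope of a linear fit can be obtained with a first-degree polynomial fit. The sketch below uses a synthetic volume series with a known trend; all numbers are hypothetical and serve only to show the calculation.

```python
import numpy as np


def accumulation_rate(days, volume_m3):
    """Slope (m^3 per day) of a least-squares linear fit to stored volume."""
    slope, _intercept = np.polyfit(days, volume_m3, deg=1)
    return float(slope)


# Synthetic example: constant accumulation of 2.0e5 m^3 d^-1 on top of 1.0e7 m^3.
days = np.arange(0, 100)
volume_m3 = 2.0e5 * days + 1.0e7

rate = accumulation_rate(days, volume_m3)
print(f"accumulation rate: {rate:.3e} m^3 d^-1")
```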
One possible cause for the increase in water volume storage may be the absence of feedback loops between hydrodynamics and hydrology as discussed above. Another reason for the accumulation may be the absence of small 1-D channels, hampering the drainage of floodplains, as well as the coarse 2-D elevation information obstructing important floodplain-channel flows (Neal et al., 2012).
We conducted a first-order assessment of whether estimates of spatio-temporal averages of both potential evaporation and groundwater infiltration rates could absorb the accumulated volumes. By multiplying the average potential evaporation as used in the PCR forcing (0.0019 m d^-1) with inundation area (Table 4b), a potential volumetric evaporation between 1.37·10^11 m^3 (for R1) and 2.06·10^11 m^3 (for R3) over the entire model period is obtained, both greatly exceeding the total accumulated water volume for any discretization. The average infiltration capacity, expressed as the ksat value, is at 0.15 m d^-1 even higher than potential evaporation. Since both values exceed the actually accumulated water volume, we cannot exclude the absence of hydrologic processes as a cause for the aggregation.
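The first-order estimate above is a product of rate, inundated area, and duration. The sketch below reproduces that arithmetic with the two rates from the text; the inundated area and the number of days are hypothetical placeholders, since the actual values are given in Table 4b and by the simulation period.

```python
POTENTIAL_EVAPORATION_M_PER_DAY = 0.0019  # average used in the PCR forcing (from the text)
KSAT_M_PER_DAY = 0.15                     # average infiltration capacity (from the text)


def potential_loss_volume(rate_m_per_day, inundated_area_m2, n_days):
    """Upper-bound volume (m^3) a constant loss rate could remove from an area."""
    return rate_m_per_day * inundated_area_m2 * n_days


# Hypothetical values for illustration only.
inundated_area_m2 = 5.0e9
n_days = 3650

evaporation_volume = potential_loss_volume(
    POTENTIAL_EVAPORATION_M_PER_DAY, inundated_area_m2, n_days)
infiltration_volume = potential_loss_volume(
    KSAT_M_PER_DAY, inundated_area_m2, n_days)

print(f"potential evaporation volume:  {evaporation_volume:.3e} m^3")
print(f"potential infiltration volume: {infiltration_volume:.3e} m^3")
```

Because the estimate assumes that the full area is inundated for the whole period and that losses occur at the potential rate, it should be read as an upper bound.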
A clear answer as to whether this or the hindered dewatering of floodplains reported by Neal et al. (2012) was the main driver cannot, however, be provided unambiguously.

Water level
Results show (Figure 6) that the chosen spatial resolution impacts simulated water levels at all stations, regardless of whether flexible meshes or regular grids are applied. Even though there are locally marked deviations, coarser spatial resolutions result in higher water levels at most of the stations. The main trend of higher water levels with coarser resolution is consistent with larger flood volumes during inundation.
The results can be explained by coarser spatial resolutions reducing connectivity as well as the representation of both floodplain flow and floodplain-channel processes, which may result in locally higher water levels (Altenau et al., 2017; Horritt et al., 2006; Neal et al., 2012). Besides, coarser spatial resolutions reduce dynamics and, especially at upstream stations, do not capture all inundation events. Even though there is no linear relation between coarsening of grid size and change in surface elevation at all six measuring points, elevation values at observation stations tend to increase as the spatial resolution coarsens, potentially limiting the magnitude of water level fluctuations (Table 3). This increase of elevation with spatial coarsening is due to spatial averaging of input elevation values (Savage et al., 2016a).
Also, studies report a non-linear connection between model results and bulk flow effects at coarser resolution as well as varying feedback loops at different resolutions due to surrounding cells (Fewtrell et al., 2008; Hardy et al., 1999). An unambiguous answer as to which is the driving factor is unfortunately not possible due to the system's complexity.
Water levels of flexible meshes and regular grids generally compare well per station. The closest fit between flexible meshes and regular grids was found for the upstream stations Loc3b and Loc3c as well as the most downstream station Loc1, where F1, F2, and R1 as well as F3 and R2, respectively, exhibit near-identical results.

Inundation extent
We benchmarked inundation extent at the end of the simulations of all discretizations with R1 as reference map and computed contingency maps for visualization of the hit rate H, false alarm ratio F, and critical success index C (Figure 7).

Figure 7: Contingency maps of inundation extent of all discretizations (test data) benchmarked against R1 (reference data)
It should be noted that, as for the discharge results, the absence of dikes and other man-made structures in our discretizations results in overestimations of inundation extent and thus we refrained from performing an actual validation against observed inundation extent. Besides, other factors potentially affecting inundation extent, such as urban areas, could not be included due to a lack of data. Also, including sub-grid elevation data and spatially varying surface roughness values may have positively influenced the inundation extent obtained.
We find that not only the overall spatial resolution, but also the gridding approach greatly impacts the agreement of inundation extent at the finest level (Table 4b): although F1 has the same finest grid size as R1, they agree only to 74%. This suggests that accuracy of flexible meshes is reduced in those areas where a coarser spatial resolution is employed, which is mostly in upstream areas. This underlines the above-made suggestion that the coarsest grid size has a marked impact on simulated inundation extent. Besides, it seems that for a certain range of coarser discretizations it is inconsequential which cell size or gridding technique is opted for as H, F, and C are within close limits.
Coarser resolution models tend to predict larger inundation extent not only on floodplains, but also for areas farther away from the channels. This again can be related to a lack of 2-D return flows with coarsened spatial resolution or missing hydrological feedback, as shown above and in previous studies. Besides, similar studies for regular grids also report an increase in inundation extent for coarser resolutions (Hardy et al., 1999) which, in turn, is linked to a reduction of contingency and representativeness (Altenau et al., 2017;Horritt and Bates, 2001a;Savage et al., 2016a). Altenau et al. (2017) also concluded that the critical success index drops for coarser spatial resolutions due to averaging of channel and floodplain properties.
We must nevertheless acknowledge that if mesh generation approaches other than HAND were applied, for example GIS-based approaches (Kumar et al., 2009), results may differ as mesh size properties are to some extent configuration dependent.

Run time
We find that especially for finer resolutions the differences in run time increase significantly, which is in line with expectations and was also found in comparable studies, although those considered only regular grids (Altenau et al., 2017; Savage et al., 2016a). Besides very similar performance in discharge computations, run times are almost identical for R2 and F3. This is because only 2-D cells adjacent to rivers will be inundated and thus run time does not depend on the overall number of 2-D cells but is mainly governed by the number of 2-D cells inundated. Even though F2 and F1 have the same finest spatial resolution, run times differ markedly, with the latter having a run time that is longer by a factor of 1.73. The longest run time was, as expected, measured for R1, which is 12% longer than F1. While the difference between regular grids and flexible meshes is as expected, results show that major gains can be obtained by doubling the coarsest spatial resolution of the flexible mesh, for instance from 1600 m (F1) to 3200 m (F2), provided the potential reduction of upstream model accuracy is acceptable.
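The two relative figures given above can be chained to express run times relative to the fastest of the three set-ups mentioned. The sketch below only combines the factor 1.73 and the 12% stated in the text; absolute run times are not used.

```python
F1_OVER_F2 = 1.73  # F1 runs a factor 1.73 longer than F2 (from the text)
R1_OVER_F1 = 1.12  # R1 runs 12% longer than F1 (from the text)

# Run times expressed relative to F2.
relative_run_time = {
    "F2": 1.0,
    "F1": F1_OVER_F2,
    "R1": F1_OVER_F2 * R1_OVER_F1,
}

# Fraction of R1's run time saved when using F2 instead.
saving_f2_vs_r1 = 1.0 - relative_run_time["F2"] / relative_run_time["R1"]

for name, value in relative_run_time.items():
    print(f"{name}: {value:.2f} x run time of F2")
print(f"saving of F2 relative to R1: {saving_f2_vs_r1:.0%}")
```

Under these two figures, R1 takes roughly 1.94 times as long as F2, so F2 saves close to half of the run time of the finest regular grid.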

Conclusion and Recommendations
To foster our understanding of differences between flexible meshes and regular grids as well as to better understand both advantages and shortcomings of using flexible meshes for large-scale inundation modelling, we compared six hydrodynamic discretizations of the lower Elbe basin. To facilitate the fast generation of meshes, we used the Height Above Nearest Drainage (HAND) algorithm. Comparability between runs was ensured by employing the GLOFRIM framework, allowing for identical spatially varying and explicit forcing of hydrodynamic models with hydrologic output.
We conclude that the spatial resolution of the hydrodynamic model discretization influences model skill in simulating discharge, local water levels, and agreement between inundation maps, which is consistent with previous studies, although these were performed with different models and on other scales (Altenau et al., 2017; Fewtrell et al., 2008; Hardy et al., 1999; Horritt and Bates, 2001a; Savage et al., 2016a, 2016b). Even though the findings are configuration dependent and we test only a sample of all possible discretizations, those similarities across scales and applications both support the robustness of our results and indicate that these links are model independent and thus of a more universal nature.
For discharge simulations, the finest spatial resolution in the grid determines the accuracy. Furthermore, there are no significant differences between the application of a flexible or a regular mesh if comparable in resolution. This means that for discharge simulations less detailed discretizations are acceptable for areas farther away from the river system, if the finest spatial resolution is sufficiently fine to capture both channel and floodplain flow processes. This is crucial since with coarser spatial resolution a higher fraction of overall flow is conveyed via the 2-D part. To some extent adding more 1-D channels could alleviate this, but on basis of our results we find the spatial resolution the larger impediment.
Because the study presented here employs only large-scale data sets for a catchment-scale analysis, a peak-over-threshold assessment shows that such approaches should be critically examined before they are used for detailed flood hazard and risk assessments: not considering structures such as dikes and drivers such as spatially varying roughness coefficients may reduce the accuracy of discharge simulations.
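For readers unfamiliar with the technique, a peak-over-threshold selection can be sketched as follows: all time steps exceeding a threshold are grouped into events, and the maximum of each event is retained. The function name, the `min_gap` separation criterion, and the example values are assumptions for illustration; the threshold and independence criterion applied in this study are not restated here.

```python
def peaks_over_threshold(series, threshold, min_gap=1):
    """Return (index, value) of the largest value in each exceedance event.

    Exceedances separated by more than `min_gap` time steps are treated as
    separate events (a simple, hypothetical independence criterion).
    """
    peaks = []
    current = None       # (index, value) of the highest point in the open event
    last_exceed = None   # index of the most recent exceedance
    for i, q in enumerate(series):
        if q > threshold:
            # Close the open event if the gap since the last exceedance is too long.
            if current is not None and i - last_exceed > min_gap:
                peaks.append(current)
                current = None
            if current is None or q > current[1]:
                current = (i, q)
            last_exceed = i
    if current is not None:
        peaks.append(current)
    return peaks


# Made-up discharge values, for illustration only.
discharge = [1, 5, 7, 4, 1, 1, 1, 6, 9, 8, 1]
events = peaks_over_threshold(discharge, threshold=3, min_gap=2)
```

Applying the same selection to simulated and observed discharge allows peak magnitudes and timings to be compared event by event.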
In contrast to discharge, the assessment of inundation extent revealed stronger deviations between the gridding techniques: generally, a uniform and fine spatial resolution outperforms any coarser or flexible grid. This is mostly due to the progressive coarsening of mesh size in upstream areas, which leads to larger simulated inundation extent once bankfull discharge capacity is exceeded.
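Agreement between two inundation maps is commonly summarized with scores derived from a contingency table. The sketch below assumes both maps have been resampled to a common grid and flattened to sequences of wet/dry flags; the function name and the choice of scores (hit rate, false alarm ratio, critical success index) are assumptions for illustration and are not necessarily the measures reported in this study.

```python
def contingency_scores(simulated, benchmark):
    """Compare two wet/dry maps given as equally long sequences of truthy flags."""
    if len(simulated) != len(benchmark):
        raise ValueError("maps must cover the same cells")
    hits = misses = false_alarms = 0
    for sim_wet, ref_wet in zip(simulated, benchmark):
        if sim_wet and ref_wet:
            hits += 1            # wet in both maps
        elif ref_wet:
            misses += 1          # wet in benchmark only
        elif sim_wet:
            false_alarms += 1    # wet in simulation only

    def ratio(numerator, denominator):
        # Undefined scores (no relevant cells) are returned as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "hit_rate": ratio(hits, hits + misses),
        "false_alarm_ratio": ratio(false_alarms, hits + false_alarms),
        "critical_success_index": ratio(hits, hits + misses + false_alarms),
    }


# Made-up flags, for illustration only.
scores = contingency_scores([1, 1, 1, 0, 0, 1], [1, 1, 0, 1, 0, 0])
```

A mesh that overestimates inundation extent in coarsely resolved upstream areas would show up here as a higher false alarm ratio and a lower critical success index.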
To better understand to what extent the application of HAND influenced simulated discharge and the model's skill in simulating peak discharge situations, we recommend testing other mesh refinement approaches.
Results suggest that applying a coarser spatial resolution enhances the accumulation and flow of water on the floodplains. Comparing the accumulated volumes with the potential reduction due to evaporation or groundwater infiltration showed that these processes cannot be neglected. As most hydrodynamic models do not simulate evaporation or groundwater infiltration, accumulated water remains on the floodplain unless it returns to the channel. Consequently, future work should focus on establishing feedback processes between inundated floodplains and hydrologic processes.
These findings imply that there is a threshold resolution that limits the usefulness of mesh refinement: flow and inundation processes can be represented sufficiently well only if a certain minimum fineness of resolution is met. While this was already found to be true for regular grids (Horritt and Bates, 2001), this study illustrates similar patterns for flexible meshes.
As the relation between this resolution and model accuracy will most likely differ depending on basin and river dimensions as well as on the grid generation technique, we recommend further research to establish a relation between basin properties and mesh design. Such knowledge would be invaluable for any large-scale hydrodynamic study, as optimizing grid size could yield substantial time savings.
As a guideline for future applications, we define three major aspects to consider when applying a flexible mesh for large-scale inundation studies:
• using HAND to generate large-scale flexible meshes is a fast and low-level approach for large-scale applications where more detailed topographical features can be neglected
• results for this test case underline the importance of including smaller topographic features for bespoke and detailed catchment-scale flood hazard and risk assessments
• local observations, such as river discharge and floodplain water levels, are less sensitive to coarse-resolution flexible meshes in upstream areas
• a minimum fine spatial resolution must be met for floodplain areas to reduce the volume conveyed via floodplains and to facilitate return flows into the channel
• domain-wide output, such as inundation extent, profits from the application of uniform fine-resolution regular grids
As this study is the first of its kind to compare flexible meshes with regular grids using methods and data for large-scale applications, the number of flexible meshes used was limited. To further increase our understanding of the limits of their applicability, we recommend a study focussing solely on the impact of mesh variations, but with a wider range of discretizations.
To conclude, we see potential in the application of flexible meshes for future "hyper resolution" large-scale inundation studies, but their use also brings more responsibilities. Users of flexible mesh models need to put additional emphasis on the creation of the hydrodynamic discretization, as it is the coarsest spatial resolution that may become the bottleneck of accuracy. While we used the HAND algorithm for fast and semi-automated mesh creation, we recommend testing other mesh generation approaches as well.
Generally, we think that mesh designs should be based on a number of considerations. For instance, for discharge simulation larger ratios between largest and smallest cell size are admissible, whereas for inundation extent computations this ratio should be minimized. Also, the context of the simulation needs to be considered: are detailed estimates required or are short run times essential? Does one need high accuracy output for the entire domain or just a small part of it? Once the user has a clear idea of the study objectives, the application of a flexible mesh can indeed serve as a timesaving alternative to regular grids, proving potentially useful for large-scale, operational, or ensemble modelling studies where results need to be computed in brief time.

Acknowledgments
This study was financed by the EIT Climate-KIC programme under project title "Global high-resolution database of current and future river flood hazard to support planning, adaption and reinsurance". Furthermore, we kindly thank the German Waterway and Shipping Administration (Wasser- und Schifffahrtsverwaltung des Bundes; WSV) for providing the discharge measurements used for model validation. We also thank Edwin Sutanudjaja for support with PCR-GLOBWB as well as Herman Kernkamp and Arthur van Dam for advice on Delft3D Flexible Mesh. We also thank two anonymous authors for their critical remarks on a previous version of this article.