GPU‐Accelerated Urban Flood Modeling Using a Nonuniform Structured Grid and a Super Grid Scale River Channel

New remote sensing technologies, and the meter‐scale geospatial data they create, now allow for detailed urban landscape characterization, thereby advancing grid‐based hydrodynamic models. However, using a uniform fine grid over urban catchments generally result in dense grids and can lead to prohibitive computational costs. Moreover, an inability to see below the water surface and measure river bathymetry in most terrain remote sensing can severely impact local‐scale river hydraulics calculations given the significant volume of water conveyed by the channel. This paper introduces a super grid channel model which allow river channels with any width above that of the grid resolution to be simulated in 1D manner. As an extension of a previous subgrid model, this integration facilitates a seamless transition between subgrid and super grid channels, accommodating situations where channel width may surpass or fall below the grid resolution. The key contribution is the integration of the novel 1D channel representation with a nonuniform structured 2D floodplain hydrodynamic model and then coding this for application on GPU. Compared with the previous pure 2D nonuniform structured approaches, the new model presents an efficient compromise for riverine urban flooding where we are less concerned about fine‐scale details of in‐channel flow. Three tests reveal that the proposed model maintains accuracy but with significantly reduced computational cost. By leveraging GPU architectures, a ∼10× speedup compared to CPU computations is achieved, and a typical 6‐day urban flooding problem (domain size 1.42 km2) at 1 m resolution can be achieved within 10 hr on a single 8 GB GPU.


Introduction
Characterized by diverse underlying surface compositions and complex topography, modeling the hydrodynamic process is challenging in urban catchments with highly complicated overland flow paths and complex interactions between river and floodplain flow.The transport of surface flow along roads and other flow paths is a phenomenon of shallow water (i.e., typically <20 cm deep) that can move at relatively high velocities (de Almeida et al., 2018).Small perturbations in the water elevation can produce significant changes to the surface flood patterns, thereby impacting flood risk assessment due to the concentrated population densities and high economic value there.Adequate representation of these features and of their nonlinear impact on physical dynamics in computational models necessitates a meter-scale grid resolution that is commensurate with flow features (Comer et al., 2017).This necessity is especially crucial for grid-based models because of their strong scale dependence (Cao et al., 2020).
The growing availability of new terrestrial and remote survey techniques, coupled with advancements in computing capabilities, facilitates the development of high resolution urban flood models.These models exhibit the capacity to resolve individual buildings, with an enhanced resolution in characterizing channel and floodplain hydraulics that is consistent with known processes (Bates et al., 2005;Buttinger-Kreuzhuber et al., 2022;Jung et al., 2012).In particular, increases in the abundance of data collected through Light Detection and Ranging imagery (LiDAR) provide city-scale Digital Elevation Models (DEMs) of 1 m resolution or finer, with current vertical errors of approximately 10-20 cm (Chen et al., 2012;Dottori et al., 2013;Fewtrell, Duncan, et al., 2011).However, applying finely resolved topography data in flood modeling generally results in dense grids (Echeverribar et al., 2019) and can lead to prohibitive runtime costs.Simulation run time increases by approximately an order of magnitude for a halving of resolution (Bates et al., 2010).A common practice is therefore to restrict the simulation domain to a small area that only covers part of an urban catchment or to use lower resolution at the expense of accuracy.However, simulations on localized domains will inevitably introduce unnatural domain boundaries and cut off the dynamic links between flood flows inside and outside of these areas, leading to unreliable predictions (Xing et al., 2018).On the other hand, including the whole urban catchment results in a model setup that requires a large computational domain of which the area of particular interest comprises only a small percentage (Comer et al., 2017).
An effective strategy for minimizing runtime costs while addressing local small-scale effects across a vast domain is Adaptive Mesh Refinement (AMR).This approach entails conducting shallow water simulations on a nonuniform grid, allocating finer resolution only where higher accuracy is essential (Caviedes-Voullième & Kesserwani, 2015;Harten & Hyman, 1983;Hou et al., 2018;Kesserwani & Liang, 2012;Liang, 2011;Liang et al., 2008).Although unstructured triangular grids are commonly utilized for flood modeling due to their robustness, ease of generation and flexibility in conforming to complex domain geometries (Sridharan et al., 2021;Zhang et al., 2018), grid connectivity poses challenges in terms of complex programming and slower operation (Bates, 2022).In contrast, block-structured grids offer an easy approach to discretizing the domain by employing cell blocks with uniform resolution, while permitting variable resolutions for different blocks of cells (Donat et al., 2014;Saetra et al., 2014).The data arrangement utilizes plain matrix memorization, optimizing regularity in computer memory accesses for high GPU parallel computing throughput (Vacondio et al., 2017).However, communication and coordination between blocks often introduce computational overhead, due to increased complexity in managing the inter-block interfaces (Liang & Borthwick, 2009).Alternatively, the hierarchical quadtree grid is generated through recursive domain decomposition, where local refinement is achieved by dividing a cell into four cells, and so on (Müller & Stiriba, 2009;Popinet, 2011).A single mesh unit is generated without any necessity to separate the computation into different individual mesh components as required by a block-structured grid.Crucial grid details, like neighbour references and parent-child relationships, are stored in a hierarchical data structure, facilitating easy retrieval for grid adjustments (Hou et al., 2018;Liang, 2011).Yet, the memory arrangement and the cost of accessing graph-like data structures, as opposed to plain matrix memorization, cannot exploit the computational capability of GPUs, given that optimal GPU throughput is achieved through regularity in computer memory accesses (Chowdhury et al., 2023;Vacondio et al., 2017).It is noteworthy that the core of AMR lies in designing refinement criteria based on interpolation error estimates of some key quantity (e.g., water surface gradient, velocity).These criteria can be derived from either a priori knowledge of the solution or posteriori error estimates (Li, 2010).The computational performance and accuracy depend on the chosen threshold values or seeding points (Hu et al., 2018).However, mathematical rigorous a priori or a posteriori error estimates for nonlinear systems of Shallow Water Equations (SWEs) are generally unavailable (Gerhard & Müller, 2014).In this paper, multiresolution analysis of underlying topography based on wavelet theory is implemented to construct a hierarchical quadtree grid (Caviedes-Voullième et al., 2020;Gerhard et al., 2015;Kesserwani & Sharifian, 2023;Sharifian et al., 2023), which facilitates error control as a key aspect as opposed to traditional methods for AMR (Gerhard et al., 2015;Kesserwani et al., 2019).
Moreover, the absence of river bathymetry from LiDAR data can significantly impact local scale hydraulic calculations (Grimaldi et al., 2018), given the significant volume of water conveyed by the channel, even during floods.In fact, no technology exists for obtaining accurate and detailed measurements of the entire bottom surface for all types of rivers in a flood study (National Research Council, 2009), and bathymetric LiDAR and photogrammetry are limited to shallow and clear river reaches only.Though the GPS-equipped boat-mounted surveying techniques which utilize echo sounders are more convenient, a minimum water depth that the boat can access is required, and measurements are only taken along the path of the boat (Dey, 2016).Given the challenges of field surveys and remote sensing, conceptual models that can estimate bathymetry from readily available input parameters have become quite popular.These methods infer river bed elevation by either assuming a uniform flow status (Miguez-Macho & Fan, 2012;Sampson et al., 2015), employing downstream hydraulic geometry theory (Leopold & Maddock, 1953;J. Neal et al., 2012), or applying gradually varying flow theory (J.Neal et al., 2021).By establishing relationships among river discharge, depth, and water surface width, these methods provide a means to approximate river bed elevations.With the availability of water surface elevation and width pairs from the SWOT mission (Altenau et al., 2021;Durand et al., 2023), the river depth under the minimum sensed water surface elevation can potentially be estimated by referring to relationships inferred amongst water surface elevation, width and/or discharge (Durand et al., 2023).An approximated bathymetry profile all along the channel allows for a more consistent representation of the river channel and floodplain interaction that better aligns with the 2D floodplain model.This bathymetry could be incorporated into the DEM and a purely 2D model, however it is often cost-prohibitive to model river hydraulics with a purely 2D model in meter-scale urban flood modeling due to the unnecessary grid refinement in the channel region (Ivanov et al., 2021).
A super grid scale representation of river channels is therefore proposed, leveraging a hybrid 1D-2D modeling strategy while permitting physically consistent representation of the river-floodplain system.The super grid channel (SGC) model simplifies the river channel that may span several continuous grids at each cross-section in a 1D manner, leaving only the necessary river centreline grids for flow connectivity while masking all other channel grids.River channel conveyance capacity is estimated with the separately configured bathymetry parameters (e.g., river width, bed elevation, Manning's coefficient), regardless of the floodplain grid resolution.This is combined with a previously developed subgrid channel approach which allows river channels with any width below that of the grid resolution to be simulated on a relatively coarse grid (J.Neal et al., 2012;Rong et al., 2023).As an extension of the subgrid channel approach, this new integration allows a seamless transition between subgrid and super grid channels, depending on the river channel width and the underlying mesh resolution.Consequently, it is applicable in both large-scale and local-scale flood modeling, accommodating situations where channel width may surpass or fall below the grid resolution.Moreover, the channel discretization is decoupled from the overlying 2D grid, for which a locally adapted grid more appropriate for floodplain flow simulation can then be selected.
How to make good use of valuable topographic details while keeping a balance between model complexity, efficiency and reliability becomes a key issue in urban flood modeling (Savage et al., 2016).Improved hardware or purely data-oriented strategies such as parallel computing are not sufficient to overcome the arising difficulties in urban flood modeling (Dahmen et al., 2001).Opportunities exist to improve urban flood models by tailoring adaptive solution schemes and river channel models to meter-scale LiDAR data (Schubert et al., 2008).Thus, the key contribution in this paper is the integration of a novel 1D channel representation with a nonuniform structured 2D floodplain hydrodynamic model and then coding this for application on a GPU.Previous nonuniform structured approaches in hydrodynamics (Chowdhury et al., 2023;Kesserwani & Sharifian, 2023;Sharifian et al., 2023) have been, to the best of our knowledge, pure 2D only, but this is computationally inefficient for representing riverine urban flooding where we are not concerned with fine-scale details of in-channel flow.Instead, for practical flood simulations hybrid 1D-2D models are preferred, but to date such models have used uniform 2D grids for the floodplain.Uniquely, our approach combines the best elements of both these approaches in a new model.The ultimate goal is to reduce the computational costs of urban flood modeling with regard to both computational time and memory requirements but still maintain the accuracy of a reference scheme (at least over the domain of interest).
The paper is structured as follows: Section 2 details the optimized procedure of nonuniform mesh generation, and the implementation of the SGC, as well as the incorporation of these aspects on GPU parallel calculation.Section 3 validates the model performance in river hydraulics calculation, floodplain inundation, and the water exchange between the river and floodplain.An idealized test case is incorporated to assess the model performance in handling the transition between the subgrid and SGC approach.Limitations of the method will be in Sections 4 and 5 will draw conclusions regarding the model performance.

Methodology
This section details the quadtree mesh generation procedure and the incorporation of the SGC with nonuniform structured grids.The extension of the hybrid finite difference-volume numerical solution scheme for the SWEs to the nonuniform mesh is expanded upon.The main distinctions between the subgrid river and SGC are outlined, including their transition.Lastly, the storage of the graph-like hierarchical grid in memory is presented, along with insights into optimizing the utilization of GPU capabilities.

Wavelet-Based Multiresolution AMR for Urban Flood Modeling
Wavelet-based multiresolution AMR is implemented to construct a hierarchical quadtree grid on which the local inertial formulation of SWEs is solved (Caviedes-Voullième et al., 2020;Gerhard et al., 2015;Kesserwani & Sharifian, 2023;Sharifian et al., 2023).Depending on the characteristics of the flow field (e.g., gradually propagating urban floods, rapidly propagating flows like dam break flow), either a static adaptation or a dynamic one can be applied.The static grid adaptation is placed in user-defined regions of interest at the beginning of a simulation and held fixed during the model run.In contrast, dynamic adaptations are utilized to monitor the evolution of flow features of interest over time, with time-varying high-resolution meshes (Kesserwani & Sharifian, 2023).Accurate dynamically adaptive solutions necessarily demand a multiresolution analysis process at every time step for accurate prediction of the mesh concentration corresponding to the flow status.This incurs severe overhead costs even for parallel implementation, especially for real-world simulations where the flooding inundates large previously dry areas.Moreover, the propagation of flow in urban areas is strongly influenced by small-scale terrain features that are fixed in position.The concentration of infrastructure in urban regions represents prior knowledge that should guide strategies for mesh refinement.Instead of tracking the flow evolution, much more emphasis should be paid to the impact of local small-scale features on flood inundation (Rong et al., 2020;Vacondio et al., 2017;Yu & Lane, 2006).Considering the typically gradually varied flow status in the urban environment and the overhead costs for dynamic mesh generation, in this paper a static refinements strategy is applied for hierarchical quadtree mesh construction.The rigorous filter formulae of wavelet-based dynamic adaptivity are no longer applicable in the context of static mesh refinement.Therefore, only extrinsic wellbalanced solution limits can be reconstructed across the heterogeneously sized adjacent elements based on heuristic averaging on a graded grid, as in conventional adaptive grid refinement method (Kesserwani & Sharifian, 2023).
We briefly summarize the steps to generate a locally adapted grid that corresponds to the underlying topography features.While most AMR methods create a nonuniform grid by selectively refining a baseline grid at the coarsest resolution, the wavelet-based multiresolution AMR approach diverges from this by initiating with a baseline uniform grid at the finest resolution and coarsens it (Chowdhury et al., 2023;Sharifian et al., 2023).Like data compression in image processing, wavelet-based AMR can be achieved in three steps, namely multiscale decomposition, hard thresholding, and grid adaptation:

Multiscale Decomposition
Given fine resolution DEM data at the reference level z fine , a hierarchy of nested grids at descending resolutions can be constructed by successively applying low-and high-pass filters to every 2 × 2 child element z fine [i] ,i = 0,1,2,3, according to Equations 1-4.As shown in step 1 in Figure 1, the low pass filter HH 0,1,2,3 allows only low-frequency components (the average over four child elements) to pass through while attenuating or filtering out high-frequency components (details).The high pass filter (GA 0,1,2,3 , GB 0,1,2,3 , and GC 0,1,2,3 ) instead, is used to get the details from the four child quadtree elements, highlighting the difference between these four child elements.Specifically, starting from the reference grid at level n = L (where L is the reference level and 2 L ≥ max(rows, columns)), a coarser grid at level n 1 can be generated using a cell-average process with these two filters.Each 2 × 2 child elements are aggregated to one coarse grid z coarse and three encoded detail coefficients , and d coarse

D
. These detail coefficients encapsulate the encoded differences between the child quadtree and parent scaling basis along the horizontal, vertical, and diagonal directions respectively.z coarse indicates that the coarse-grid average can be represented by a linear combination of the corresponding fine-grid averages.A hierarchy of nested grids at descending resolution from level L to 0 can be constructed by successively applying these two filters to every 2 × 2 child element over any two subsequent levels.Utilizing the averaged information at the coarser level along with these detail coefficients at each level, we can reconstruct the fine resolution grid entirely, employing the reverse process as outlined. (1) (2) (3) Here z is the topography elevation, the superscript fine refers to the topography variables in the finer resolution level, and the subscripts 0-3 index the 2 × 2 child elements.The HH 0,1,2,3 are low-pass filter matrices and GA 0,1,2,3 , GB 0,1,2,3 and GC 0,1,2,3 are high-pass filter matrices.For the Haar wavelet, HH i = 1 4 ,i = 0,1,2,3, and Equation 1 is designed to calculate an average from the 2 × 2 child elements.diagonal directions respectively.Here we omitted these 0 elements in these low-and high-pass filter matrices for a clearer explanation.However, it is important to note that alternative wavelets can also be employed to reconstruct the parent elevation and compute their differences, potentially resulting in more complex filter matrices.Detailed expressions of these filters are available in Kesserwani and Sharifian (2020).

Hard Thresholding
By recursively performing the multiscale decomposition, z fine at reference resolution level L is now compressed into a sequence of details living on the hierarchy of uniform grids at subsequently lower resolution levels (L 1, …, 1, 0).As shown in step 2 in Figure 1, the crucial point in the adaptation process is the threshold selection by which to decide whether to coarsen the grid locally or not.Since the details may become small for a locally smooth region, the idea is to perform data compression on the vector of details using hard thresholding.We discard these detail coefficients whose normalized magnitudes (d coarse ) all fall below a leveldependent threshold value ε l = 2 (l L) ε, according to Equation 5.Here ε is a user-defined acceptable tolerance while l indicates the specific level of the detail coefficients.DEM L i,j indicates the maximum elevation at reference level L. For example, supposing a maximum elevation at level L is 100 m, an ε of 10 3 means that the maximum acceptable error caused by coarsening the DEM is 0.1 m.Small detail coefficients indicate a locally smooth behavior which can be well handled by interpolation while large coefficients require more computational effort on a fine resolution grid.The ideal ε would be to balance the discretization error of the reference scheme on the uniform finest grid with the perturbation error introduced by the hard thresholding.The optimal range for ε is expected to be somewhere between 10 6 to 10 4 within the scope of modeling shallow water flows but is rather context specific.
The mesh generation controlled by a single threshold ε over the whole domain provides a feasible solution for AMR.However, there is a risk of degrading the representation of terrain features or unnecessarily over-specifying the mesh in rural areas using a single threshold (de Almeida et al., 2018; Kesserwani & Sharifian, 2023).We argue that the wavelet-based AMR should keep fine resolution for urban areas, channels, or any other regions of interest, especially for the static grid adaptation that is generated at the beginning of a simulation and held fixed during the model run (Jablonowski et al., 2009).A mask file showing the regions of interest is additionally provided where the finest resolution is kept.The reason why we keep the finest resolution instead of a relatively coarser one (i.e., permitting a maximum of two levels coarser) is to avoid excessive differences between grid spacing and resolution.More access to regularly sized neighbors helps maximize the modeling efficiency.Otherwise, nonuniform grids which feature more depth and variability in resolution result in a high number of differently sized elements.This can lead to computational overhead at those elements which are surrounded by too many neighbors with different sizes (Sharifian et al., 2023).

Grid Adaption
As shown in step 2 in Figure 1, level-wise identification of significant details from fine to coarse level is conducted to check whether the detail coefficient is higher than the threshold ε l , or the cell is within the regions of interests as provided by a user-defined mask file.Specifically, two binary mask files are employed to indicate the significant details where the finest resolution is kept.The first mask file delineates the regions of interest, while the second one indicates the extent of the river channel.The river channel mask excludes the masked grids (besides these river centerline grids) from calculation, resolving the issue arising from the double consideration of the river channel conveyance capacity (Section 2.3).Any grid marked as significant on a finer resolution will be recorded even though significance may be lost at coarser levels.Any one significant parent element marks the significance of its 2 × 2 child elements.The hierarchy of details in fact forms a quadtree structure, where the locally adapted grid corresponds to the "leaf elements."Here the "leaf elements" are such that their details on the tree have either stopped being significant or belong to the finest grid.As shown in step 2 in Figure 1, the colorshaded grids are the identified leaf elements.By means of the remaining significant details, a locally adapted hierarchical quadtree structure is acquired whose complexity is substantially reduced.The computational efficiency for the numerical solution is proportional to the rate of data compression, for a prescribed level of tolerance ε.Overall accuracies are captured within a given tolerance ε while extracting the inherent complexity of the problem with as few degrees of freedom as possible.A depth-first traversal algorithm from coarse levels to fine is then performed to collect all the leaf elements, thereby forming the nonuniform grid (Chowdhury et al., 2023;Sedgewick & Wayne, 2011).

Implementation of Efficient Inertial Formulation of the SWEs Over Nonuniform Grid
An efficient local inertial formulation of the SWEs is adopted for urban flood modeling, which simplifies the full SWEs by neglecting the convective acceleration (Bates et al., 2010), as shown in Equations 6 and 7. A hybrid finite difference-volume numerical scheme over a compact computational stencil can be employed to numerically solve the equation.This scheme evaluates continuous discharges at the interfaces between adjacent grid elements and achieves an element-wise update of averaged water levels.In most cases, the control equations can achieve a good approximation to the SWEs in an efficient manner for gradually varying, subcritical flow (de Almeida et al., 2012;Shaw et al., 2021).Equation 7with formulae decoupled in the x and y directions can be employed directly for the calculation of flood propagation over the floodplain.A derivation with an explicit inclusion of the river hydraulics radius is applied for river flow calculation to simulate relatively small, narrow, and deep channels (J.Neal et al., 2012).
Where h is the water depth, q is the discharge per unit width, z is the bed elevation, g is the acceleration due to gravity, n is the Manning friction coefficient, x/y denotes the horizontal/vertical coordinate, and t is the time.
After the AMR, the uniform fine resolution topography is compressed into a multi-dimensional nonuniform structured grid.Such a grid maintains the structured property of a uniform grid so that the neighbor information is directly provided by simple mathematical relationships at the reference level.As shown in Figure 2 details are retained to assemble the shallow water solution scheme on a nonuniform grid.This scheme leverages locally constant solutions to ensure flux conservation across faces between arbitrarily coarse-and fine-scale elements.The governing equations (Equation 8) are directly solved on the whole computational domain in a way identical to that on a uniform grid, and no interpolation or approximation is needed (Sharifian et al., 2023).
where the superscript t is the time step, and the subscript E marks an individual grid.Here, to exactly maintain equilibrium solutions over the non-uniform grid, we limit the cell width at the shared interface to the smaller one of these two cell widths.This restriction implies that only the shared proportion of the cell can be utilized for momentum exchange between arbitrarily sized grid elements.The distance between the center of two adjacent cells is therefore calculated as 0.5(∆x + ∆x E1 ).When the discharge matrix between each pair of adjacent grids is updated, the continuity equation can be applied for the water depth calculation, which has the form of Equation 9.
Here Equation 9shows the water depth update in the grid (i, j) in Figure 2, where m indicates the number of neighboring elements in the North (N), West (W), South (S), and East (E) directions.S m represents one specific neighboring element in the South direction, where m = 1…m S .The neighboring information is directly provided by simple mathematical relationships.The code systematically traverses all nonuniform grids to gather neighboring information, organizing it into a 1D array for subsequent flux calculations and water depth updates (Section 2.4).For each grid, all adjacent grids that exchange discharge across their common interface are responsible for the water depth update.This exchange represents the mass entering or leaving one unit grid.A positive value means the mass is entering a cell while negative values show that mass is leaving.Only after the mesh generation can we count how many surrounding elements exist.Therefore, the core of the nonuniform calculation is to index the multi-dimensional adapted grids and find all neighbors as detailed in Section 2.4.

Super Grid River Channel Representation
A SGC model is proposed that leverages a hybrid 1D-2D modeling strategy while permitting a physically consistent representation of the river-floodplain system.As shown in Figure 3, the idea for a super grid representation is to simplify the river channel that may span several continuous grids at each cross-section in a 1D manner.Only the necessary river centerline grids are retained for flow connectivity, while masking all other channel grids.The simplified river channel contains the independently defined bathymetry information (river bed elevation, width, and bank elevation) for the river hydraulics calculation, regardless of the grid resolution.Before conducting flood inundation calculations, explicit links are established between the simplified river centerline grids and the floodplain.This linkage facilitates direct volume exchange when the bankfull height is surpassed, as the channel is spatially separated from the floodplain by the river channel mask (masked river grids are denoted with an "×" in Figure 3).This is achieved by connecting the floodplain grids that are adjacent to rivers (marked with a "+" in Figure 3) to the closet river centerline grids (blue shaded channel grids).Once bankfull depth is reached, the volume exchange between the channel and floodplain should commence through these explicit links, according to Equation 10.In contrast, in the subgrid model the river channel shares the same volume with the immediately overlying floodplain grids once bankfull depth exceeds.No explicit volume exchange exists between river channel and floodplain.
Here the subscript e shows the volume exchange V through the explicit links between the river channel and floodplain.n is Manning's coefficient, and h t e is the depth between cells through which water can flow and can be calculated in Equation 11. S t e is the water surface slope between the river channel grids and floodplain and can be calculated in Equation 12.
Figure 3. Differences between the subgrid channel and the super grid channel (SGC) model.The subgrid model allows river channels with any width below that of the grid resolution to be simulated on a relatively coarse grid.The super grid river channel is used to simplify the channel that may span several continuous grids at each cross-section in a 1D manner, leaving only the necessary river centerline grids for flow connectivity.The independently defined 1D river channel is parameterized in the same way in both schemes (where river bed elevation, width, and bank elevation are approximated for bathymetry representation).This allows a seamless transition between the SGC and the subgrid channel schemes.
Water Resources Research

10.1029/2023WR036128
Where subscript c represents the river channel while f means floodplain.∆x e is the distance between the river channel extent and the floodplain.As the SGC serves as a representative model of the 2D river channel directly connected to the floodplain, it is important to note that ∆x e remains consistent with the fine grid length, to accommodate the fine-resolution representation required for the adjacent floodplain grid cells.
Unlike the subgrid channel model (J.Neal et al., 2012), which permits the simulation of river channels with widths below the grid resolution on relatively coarse grids, the SGC representation is tailored for meter-scale flood modeling.In this approach, river channels are depicted with several continuous grids at each crosssection, and the river hydraulics are no longer a subgrid scale process anymore.The fundamental concept of both the subgrid and super grid river channels is to represent the river channel in the simplest possible 1D manner, to minimize the complexity of its parameterization.The combination of a 1D channel and a 2D floodplain offers the benefits of capturing 2D processing on the floodplain whilst minimizing the computational costs and below water line data requirements in the river channel.The super grid river channel is parameterized in the same ways as the subgrid channel (J.Neal et al., 2012), where river bed elevation, width, bank elevation and Manning's friction are defined for bathymetry representation.A wide number of ways, including using ground survey, simple power laws, or a bathymetry inversion from water height, can be applied to parameterize the channel (J.Neal et al., 2021).Ideally the estimated river channel can route the flow downstream with an equivalent conveyance capacity at bankfull to a fine resolution 2D channel, but with a significantly reduced number of grids.Both the super grid and the subgrid river channel are implemented in LISFLOOD-FP.This integration, extending the subgrid modeling approach (J.Neal et al., 2012;Rong et al., 2023), facilitates a smooth transition between subgrid and super grid channels, making it possible for both large-scale and local-scale flood modeling.Such a hybrid 1D-2D approach has significant advantages for inundation prediction where we are less concerned about the detail of in-channel flows and it can be the most effective compromise in many situations (Apel et al., 2009).
Equation 13 uses the hydraulic radius R of the channel to mimic relatively small, narrow, and deep channels rather than h in Equation 8or Equation 10 which is intended for the shallow water flow.This is a more generic scenario for in-channel flow movements.It is worth noting that Equation 10 with simplified wetted perimeter is for the 1D-2D model interface calculation, while Equation 13 is for the 1D river flow calculation.Flows in both the channel and floodplain are coupled and solved using the same inertial SWEs approximation, guaranteeing a stable discharge exchange between the river channel and floodplain.
Q t c is the in-channel flow discharge at time t, ∆t is the time step.A t c is the wetted cross section area (Equation 14), R t c is the wetted perimeter (Equation 16, supposing a rectangular river channel but can be changed).S t c is the surface slope.n c is the Manning's parameter in the channel.
Where w c is the minimum channel width between two river cells, and the h t c is the depth between two river cells through which water can flow and can be calculated in the same way as Equation 11.
The time step is tied to the finest discretization level according to the Courant-Friedrichs-Lewy (CFL) condition (Equation 17).A parameter α is introduced to reduce the time step, as the assumption of small amplitude in calculating the wave celerity is not always valid and because of the inclusion of friction term in the model.h max is the maximum water depth from the nonuniform grids, ∆x is the grid size at the reference level.The time step ∆t imposes strict requirements on flow dynamic update to guarantee the stability of the solution scheme.This stability holds true irrespective of the grid level and aligns with the CFL conditions.

Index the Nonuniform Grid With Space-Filling Morton Curve for GPU Architecture
The locality of the solution scheme (Equations 6-17) for the SWEs allows for element-wise updates to flow variables using only data from neighboring elements to evaluate numerical fluxes across interfaces.Here the neighboring elements refer to a grid at a given level that has continuous neighbors of the same or different levels/ sizes, connected by a common interface.However, the complex topology of the nonuniform mesh poses challenges for traversing the grids to update water depth and retrieving information from neighboring elements for discharge exchange calculation.To address these issues, the space-filling Z-curve (also known as a Morton curve) is implemented as a simple and consistent approach to ordering and accessing data on the nonuniform grid.This approach minimizes memory access and improves cache locality (Kesserwani & Sharifian, 2023).
Step 3 in Figure 4 illustrates the construction of the Z-curve, achieved by connecting the ends of the curve to the start, thereby creating a continuous loop that encompasses the entire space/domain.The curve preserves the spatial locality of data by mapping neighboring elements in the nonuniform mesh to neighboring positions on the curve.This mapping allows the multi-dimensional grid to be represented as a 1D continuous curve with a linear and unique index array, facilitating the assignment of the exact number of threads for parallel computation.The Morton code, detailing the position of a grid along the curve, is created by taking the binary representation of each coordinate (row, column) in a regular grid and interleaving their bits to produce a single binary number (interpreted as an integer).A reverse interleaving process can be applied to convert the code back to its corresponding grid coordinate, facilitating the retrieval of the coordinates of a grid from its index, and vice versa.
Element-wise updates to the water depth in a nonuniform grid need to access all its geometrical neighbors during a grid traversal.A neighbor-searching algorithm is implemented to locate the surrounding elements of an interior grid on the nonuniform mesh, by referring to the relative position in the non-compacted matrix.As shown in step 3 in Figure 4, the algorithm relies on reversing the Morton code into the coordinate vector (row, column) on the reference regular grid and then locating the surrounding elements with a simple adjustment of the coordinate, for example, (row ± 1, column ± 1).A 1D array indexed with the cumulative counting of the neighboring elements can be recorded for the storage of the neighboring index, facilitating the retrieval of flow information from neighboring elements.Following the procedure to search the neighboring elements, we can exactly know how many pairs of neighboring elements exist and allocate the appropriate resources for these discharge exchange tasks.Once the discharge across the interface is calculated, it is stored in a manner similar to the neighboring array, allowing direct access to the calculated discharge for efficient water depth updates.

GPU Model Architecture
Although the locally adapted grids may lead to a significant reduction in the computational demand compared with the uniform grid, this alone may not be sufficient for efficient meter-scale urban flood modeling on CPU architecture.To further enhance computational efficiency, parallelization techniques are essential to reduce the computational time to an affordable order of magnitude.The mapping of nonuniform grid elements to a 1D array, along with their respective neighboring elements, enables efficient traversal of the nonuniform mesh.This arrangement facilitates the retrieval of all necessary neighboring information for computational purposes, a benefit derived from the locality of the solution scheme.Moreover, we only need the local solution status (i.e., h, z), from neighboring elements and the calculated discharge from the last time step (i.e., Q) for the calculation.This storage structure allows individual threads direct access to the calculated discharge for efficient water depth updates.Consequently, each thread can access the necessary information directly from the stored arrays, contributing to a streamlined and parallelized computation process.All these features favor a parallel computation of the calculation process.
The SGC model with the nonuniform structured grid is implemented in a CUDA/C++ code that exploits the potential for parallel computation offered by a GPU.For discharge calculation, each thread, the basic work unit of a GPU, corresponds to a common interface shared by two adjacent grids.As we know how many pairs of neighboring elements exist, we can allocate exactly the number of threads to parallelize the tasks.For elementwise water depth updates, each thread corresponds to a grid element used to discretize the physical domain, assembling information from neighboring elements for mass balance update.The framework of the solution process is shown in step 4 in Figure 4.In theory, transferring the nonuniform mesh back to the CPU could facilitate task decomposition and resource allocation for multiple GPUs.However, for simplicity, calculations are currently conducted on a single GPU.It is essential to highlight that the conventional matrix memorization, reliant on regular access patterns typical in uniform grids, is not applicable to the adaptive method.Therefore, the efficiency of the adaptive method relies on achieving significant data compression.This compression must be significant enough to offset the additional computational costs associated with the adaptive approach.Moreover, the gain in computational efficiency on a GPU is subject to the specific characteristics and demands of the given problem and is investigated in the following section.

Numerical Tests
The performance of the SGC solvers on the nonuniform structured mesh is explored alongside their uniform counterparts, while analyzing their grid-coarsening ability, and reliability in handling river channel conveyance capacity.To reproduce small-scale yet critical topographic features in riverine urban flood modeling, an urban mask file showing the extent of urban regions is processed using a 10 m land cover map (Marston et al., 2022).Fine grid resolution is maintained over the user-defined regions of interest.A freshwater layer is also processed to maintain flow connection with fine resolution grid, allowing us to quantify the impact of river channel simplification on the flood distribution.To assess the impact of the implemented treatments (masked regions of interest and SGC simplification) on model performance, four adaptive models were established.Initially, a 2D model was set up where mesh generation was governed by a range of acceptable error ε from 10 6 to 10 4 .Subsequently, each scenario incorporated additional components: masked regions of interest, the SGC model, and their combined effect, in flood modeling.By comparing the results of these four adaptive models (the 2D model, masked 2D model, SGC model, and masked SGC model) to the reference 2D model at full resolution (full resolution 2D), we can assess how each treatment impacts the overall performance in flood modeling.Here, all model performance is evaluated within the same framework of LISFLOOD-FP.Model system uncertainties can be compensated for by using the same code structure, thus making direct comparisons between the different techniques easier to achieve.While alternative approaches, such as HEC-RAS and MIKE, are applicable for 1D-2D Water Resources Research 10.1029/2023WR036128 flood modeling, adopting different model structures introduces inevitable uncertainties which makes the comparison difficult.To compensate for the potential errors in these flood models, field survey data is collected and utilized to improve the reliability of the evaluation.
The objective of the benchmarking was to evaluate how close the simulated results are to a reference full resolution simulation and in-situ measurements if available.Taking the full resolution 2D model as the reference, the flood inundation extent from four different adaptive solvers is quantitatively validated by the hit rate (H), false alarm rate (F) and critical success index (CSI) metrics.Given that flooded pixels are denoted as "1" and nonflooded pixels as "0," we define the hit rate (a) as instances where both observed (obs) and simulated (sim) values are "1.A desirable model performance should be with a large H and CSI and a smaller F value, for example, where both H and CSI are larger than 0.9 while F is less than 0.1.The CPU-based solvers were run on a 2.7 GHz Intel(R) Xeon(R) Gold 6226 using 4 CPU cores, while the GPU-based solvers were run on an NVIDIA Tesla P100 on the high performance computing (HPC) at University of Bristol.

Carlisle 2005 Urban Flooding-Performance Evaluation of the 1D-2D Adaptive Solver
The Carlisle flooding event, characterized by gradually propagating flood, was investigated to validate the SGC solver over a nonuniform discretization of the computational domain.Over a 36-hr period preceding the flooding, up to 175 mm of rain fell over the Eden catchment, in which the city of Carlisle sits.Overflow from the River Eden and backwater flow impacts along the Rivers Caldew and Petteril aggravated the situation, leading to significant flood damage.Three inflow hydrographs with a time interval of 15 min at the upstream points of Rivers Eden, Petteril, and Caldew with a maximum flow rate of 1,200 m 3 s 1 are provided to run the model (J.C. Neal et al., 2009), and the simulation time is set to 45 hr to allow the water to come to rest and pond in depressions.
A free outflow boundary condition is established at the downstream end of the River Eden, while the remaining boundaries are set to closed.The study area is shown in Figure 5, covering about 14.5 km 2 of the city of Carlisle, with the surveyed maximum water surface elevations at 183 wrack marks and the maximum flood inundation extent (light-blue shaded area within the purple dash line).Further details can be found in J. C. Neal et al. (2009).These wrack marks will be compared against the maximum free-surface elevations simulated by the SGC solvers.
The optimized Manning's parameter is 0.04 m 1/3 s for the channel and 0.06 m 1/3 s for the floodplain, using a calibration process with those surveyed data.In addition, water depth time series are recorded at eight sampling points, including at the three gauges of Sheepmount (P1), Botcherby Bridge (P2), and Denton Holme (P3) for which in-situ water depth measurements are available to further support the validation of the solvers.
5 m resolution LiDAR terrain data with an interpolated river channel profile from measured cross-section data was collected for flood modeling (Horritt et al., 2010).The procedure below is designed to closely replicate the behavior of the 2D channel in 1D so that the models can be benchmarked against each other.Procedures to get the independently defined super grid river channel are as follows: 1. River channel extraction from a steady flow simulation calculated by LISFLOOD-FP.With the objective to get the bankfull river extent, steady state inflow hydrographs for the three tributaries are applied, and the model run until a steady flow status is achieved such that the inflow rate is equal to the outflow from the domain.A total of 274 steady flow runs were computed at different discharges, and we then chose one inflow scenario that closely matched bankfull status, that is, the greatest discharge where there was no obvious expansion of the shallow water surface onto the floodplain.The steady state simulation was used to define the river extent and create the channel mask.2. River channel width estimation from the river extent binary image using the RivWidth package.A river centerline image can be acquired with a morphological operation that extracts a binary image skeleton.Then the RivWidth package was used to estimate channel width for each grid in the 1D river centerline, where river width is estimated along orthogonal direction as the Euclidian distance between the centers of the two pixels in the river extent image at which the orthogonal intersects the bank (Yang et al., 2020).3. River channel bed elevation optimisation.An initial guess of the riverbed elevation is taken as the same as the LiDAR data where the 1D river centerline lies.The riverbed elevation is then adjusted using the gradually varied flow solver (J.Neal et al., 2021), with the objective of narrowing the difference between the steady river flow extent calculated from the 2D model and the SGC model.With the optimized river bathymetry, the river channel conveyance capacity can be maintained, and all other channel grids under mask are deactivated, making the calculation much more efficient.
The performance of the four adaptive models and the full resolution 2D model were evaluated by comparing their maximum free surface elevation predictions against the measured data at 183 surveyed water level and wrack mark points.Table 1 shows the Root Mean Squared Error (RMSE) and mean error generated by the differences in water surface elevation between the simulated and observed data at these measurement points.The masked SGC model was found to be the most accurate model, with the smallest RMSE of 0.332 m and a mean error of 0.017 m when ε ≥ 10 5 .The masked 2D model exhibited the closest error distribution to that of the masked SGC model, with an RMSE of 0.347 m and a mean error of 0.012 m.These results compare well with the errors reported in other studies (Horritt et al., 2010;J. C. Neal et al., 2009).In contrast, the SGC model resulted in a higher RMSE error of 0.432 m and a mean error of 0.101 m, and its histogram errors were centered at 0.1 m, indicating an underprediction tendency for the free-surface elevation.Similarly, the 2D model resulted in RMSEs of 0.847 and 0.545 m and mean errors of 0.473 and 0.151 m under ε = 10 4 and ε = 10 5 , respectively.These, together with the fact that their histogram errors were centered around 0.4 and 0.2 m, suggested that the 2D model, on their grids involving nonuniform resolutions, tended to significantly overpredict the free-surface elevation compared to the measurements due to the poor mesh quality.When it comes to ε = 10 6 , all these four models yield a good approximation to the full resolution 2D model, and the masked 2D model slightly outperforms the other three adaptive solvers with the smallest RMSE of 0.294 m.Overall, the results indicate that the masked SGC and masked 2D models offer the most accurate predictions for this test, while the 2D and SGC models tend to be less accurate.Particularly when the underlying discretization is relatively coarse, the masked SGC model can significantly improve the overall model reliability.As expected, the full resolution 2D model demonstrates the highest agreement with the measured water surface elevations, yielding the smallest RMSE of 0.29 m.As a result, the estimated flood inundation extent from the full resolution 2D model is used as the reference for quantitively evaluating the performance of these four models.
Figure 6 shows the maximum water depth distribution by these four solvers, under different thresholds ranging from 10 4 to 10 6 .All these solvers run efficiently for the Carlisle test, and the results are all acquired within 30 min considering that there are ∼581 k grids at fine resolution over a simulation time of 45 hr.Taking the full resolution 2D model as the reference, the flood extent predicted by all these four solvers under ε = 10 6 yields a good result that is in line with the maximum free surface elevation validation, with the lowest F of 0.01 and the highest CSI of 0.98.However, the efficiency under ε = 10 6 is not increased much with a dense grid over the domain considering that the compression rate (the ratio of nonuniform grids to the total number of grids in fine resolution) is 28.98%, making the 2D nonuniform solution less compelling.
The predicted inundation extent is severely impacted by a larger ε for the 2D model, with a false alarm up to 0.34 under ε = 10 4 .The coarser grid size typically varies within a range of approximately 3-7 levels coarser, given the DEM resolution at the reference level is 5 m.With a coarser representation of the underlying topography on which the solution scheme is applied, the overall predictions are impacted due to the accumulation of the numerical diffusion, especially for this long-duration simulation over a large and rough area.As the narrow proportion of River Petteril cannot be identified from the 10 m land cover data, the masked 2D model cannot keep fine resolution through this river channel.With a poor representation of the river channels, the degraded  Water Resources Research 10.1029/2023WR036128 conveyance capacity would elevate the overall water levels in the river system.As a result, the predicted inundation extent around the River Petteril is slightly overpredicted, with an increased false alarm of 0.08 and a mismatched inundation extent of 0.91 for CSI.Though the river channel conveyance capacity can be guaranteed with the SGC, the simplified inundation process due to the coarser representation of the critical small-scale topography features leads to an unsatisfactory model performance of the SGC model, most notably over the inflow point near the main tributary.This is reflected in their second lowest CSI of 0.73, resulting from the simplified floodplain inundation process.When modeling water level and flood extent for gradually propagating floods across urban and rural areas over long periods, the combination of the masked regions of interest and the simplified river channel results in the best agreement with the reference inundation extent even under ε = 10 4 .Slight underpredictions can be observed in limited urban areas close to the banks of the Rivers Caldew and Petteril.In contrast, while a 2D model can achieve comparable performance with ε = 10 6 , this comes at the expense of a sharp increase in the number of grids.This leads to unnecessary mesh refinement over dry cells, which in turn elevates computational demands.
The water level time series recorded at six sampling points are compared in Figure 7, where the location of these sampling points is illustrated in Figure 5.The 2D model under ε = 10 4 yields water level predictions that have the worst agreement with the reference solution, irrespective of the locations of the sampling points.This model significantly overpredicts water depths at all the sampling points, with a compression rate of 1.23%.As the smallscale topographic features which have a nonlinear effect on wave motions in the urban area, grid coarsening or upscaling smears critical elevation information or degrades the shape of terrain features (connectivity of roads and pathways, conveyance capacity of channel, water storage capacity etc.), resulting in overprediction of inundation distributions and the timing of inundation due to the simplified wetting process.As we apply a threshold of 10 6 for mesh generation, the corresponding compression rate of the 2D model is 28.98%.The recorded water level time series from the four models show good agreement with the reference solution.No obvious improvement in the model performance in terms of the water level time series is acquired with the additional masked file and the simplified river channel, though the compression rate reaches to 46.44% in the masked SGC model.The reason why a masked model did not yield a better accuracy is that 26.45% of the study area is masked as the urban area, and most of these areas are dry cells during the model runs, resulting in unnecessary efforts in mesh refinement.
Even without these masked regions of interest, the mesh generation controlled by the threshold of 10 6 can keep a necessary topographic detail that important for wave motions.The results illustrate that modeling accuracy is not necessarily increased by using a dense grid over the whole computational domain (Savage et al., 2016), and a nonuniform mesh can be a powerful alternative for efficient flood modeling.Overall model performance in terms of model accuracy and efficiency can be found in Figure S1 in Supporting Information S1.
The importance of the masked SGC model should be highlighted under ε = 10 4 .The loose constraint on mesh generation results in an overall coarse grid resolution but still meets given accuracy requirements, while the masked urban area and channels further improve the mesh quality over regions of interest.Compared with the masked 2D model, the river channel conveyance capacity can be guaranteed with the independently defined river bathymetry, and the timing of inundation can be predicted as the full resolution model, as illustrated in sampling point P6.Due to the simplified inundation process in the masked 2D model and the smeared critical topographic features in the SGC model, the timing of inundation is either delayed or accelerated.The combination of these treatments offers flexibility to control the mesh generation, highlighting the regions that deserves much more emphasis in an efficient manner without increasing the overall number of grids.

Cockermouth 2015 Urban Flooding-Performance Evaluation of the 1D-2D Adaptive Approach
The case study involves a 6-day fluvial flooding scenario over a 1.42 km 2 urban area in Cockermouth town, UK.
The study area represents a typical urban area with a large proportion of built-up area as can be seen from Figure 8.The Cockermouth town is located at the confluence of River Derwent and its tributary, River Cocker.A record-breaking storm falling on the already saturated catchment led to a water level increase of the River Derwent that caused river waters to inundate the town of Cockermouth on 5 and 6 December 2015.Overtopping of the defences occurred in the River Cocker and Derwent with the highest flow ever recorded between 390 and 450 m 3 /s in River Derwent on the 6 December.Calibrated time series of the volumetric flow at 15 min intervals were available for the River Cocker and River Derwent, respectively (Muthusamy et al., 2021), and a free outflow Figure 9 compares the generated nonuniform grids for underlying topography representation over the domain in the 2D and masked SGC model.For the 2D model, the choice of ε = 10 4 leads to an overly simplified grid, with a compression rate of 0.42%.While an overly refined grid for the surrounding rural areas is acquired under ε = 10 6 with a compression rate of 16.92%, which is a similar situation as ε = 10 5 for the 2D model.One potential issue with the 2D model mesh generation is the absence of the river channel banks or defences, particularly at the upstream of River Derwent.The floodplain inundation commences at the start of the simulation, resulting in the false prediction of the timing of inundation.Comparatively, the masked SGC model led to a more sensible resolution selection as the finest 1 m resolution is only present around the urban features, while resolutions up to three levels coarser (i.e., 8 m), are selected over the surrounding rural regions.Furthermore, even coarser resolutions ranging between 16 and 32 m are appropriately selected for the smooth floodplain near the river channel under ε = 10 5 .The resulting compression rate is calculated to be 14.04%.Though the river channel is covered with fine resolution, in fact these grids would be simplified to 1D river channel for flow connection, and the surrounding river channel grids would be deactivated, incurring no computational burden on these dense river channel grids.Therefore, a compression rate of 10.23% is actually achieved during the calculation, where the deactivated channel grids are responsible for the compression rate reduction from 14.04% to 10.23%.Compared to the 2D model, the masked SGC model maintains a flood inundation computation that aligns with known processes.Overbank flow only occurs after exceeding the river conveyance capacity. Figure 10 shows the maximum water depth distribution over the urban area calculated by the masked SGC solver under ε = 10 5 .As the modeled flood extent from full resolution 2D model shows a good agreement of the outer boundary of the EA inundation extent, and not all properties within the EA extent were flooded and no further information was available to identify the areas within the extent that were not flooded, the 2D model is used as the reference solution for evaluating the masked SGC model performance.Remarkably, we can achieve a consistent flood inundation extent comparable to the full resolution 2D model that in turn agrees well with the surveyed extent, using only 1/10 of the computational resources.A CSI value up to 0.91 is acquired, suggesting a very close accuracy in the flood extent predicted by the masked SGC solver.However, it should be noted that false alarms do occur due to the coarsening of the floodplain mesh near the upstream and downstream sections of the main tributary, resulting in an enlarged flooding extent in those areas.
The evaluation of the accuracy of the masked SGC models and the 2D models in terms of water depth hydrograph is shown in Figure 11, with reference to the measured hydrographs from gauge points P1 and P2.P1 is located on the River Cocker, and P2 is on the River Derwent in Figure 8.Both the rising and falling limb of the measured hydrograph is captured from the masked SGC and 2D solvers with a negligible difference that likely occurs from the accumulation of numerical diffusion due to the coarser resolutions on the nonuniform grids.A goodness-of-fit can be acquired with only a 40 cm underprediction of the peak water depth.The RMSE between the measured and predicted water levels from the masked SGC model is 0.299 m (P1) and 0.277 m (P2) respectively, and 0.292 and 0.232 m for the 2D model.The slight disparity in RMSE between the masked SGC and the 2D model can possibly be attributed to uncertainties in estimating the bathymetry.Due to the flat floodplain adjacent to the River Derwent, a slightly wider river channel estimation results in an underestimation of the water surface elevation.This discrepancy may also explain why the full-resolution SGC model only yields a marginal improvement in the predicted water depth.In terms of speed-up, we can get the inundation results for a 6-day flood event within 10 hr under ε = 10 5 for the masked SGC model, which is ∼10× faster than the full resolution 2D model on CPU.A comparable efficiency on a single GPU can be achieved as the OpenMP-accelerated subgrid channel solver on HPC, with a desirable modeling accuracy (H > 0.90, F < 0.10, CSI > 0.90).It is crucial to acknowledge that computational efficiency gains can vary based on the device being utilized and its conditions, including device load and resource availability within HPC environments.In addition, this gain in speed-up can get compromised depending on the mesh quality.For example, compared with the 2D model under ε = 10 6 , the speed-up in the masked SGC model under ε = 10 5 is insignificant (∼1.7), though less memory is utilized for storage of these nonuniform grids.This little gain in speed-up could be explained by contrasting the depth, variability and distribution of the resolutions selected on the non-uniform grids.It can be noted that the nonuniform grids in this test are featured by the dominance of three resolutions (1 m for the masked urban regions, 8 m resolution for the rural regions, and 16-32 m resolution for the floodplain).The nonuniform grids featured by more depth and variability in resolution result in a high number of differently sized elements, causing an overhead in the calculation at those elements surrounded by too many neighboring elements with different sizes.
More information regarding the model performance evaluation, especially on the seamless transition between the subgrid and SGC model can be found in Supporting Information S1.

Discussion
A SGC model with the nonuniform grid discretization of the floodplain is proposed in this paper to take full advantage of the meter-scale topography data for efficient urban flood modeling.Though modeling accuracy is not necessarily increased using high precision data over the computational domain (de Almeida et al., 2018;Dottori et al., 2013;Savage et al., 2016;Yu & Lane, 2006), a fine resolution data is always preferred and recognized as the most important factor influencing the water surface elevations and flooding extent (National Research Council, 2009;Zheng et al., 2018).However, it is important to note that high precision data alone is insufficient to ensure high modeling accuracy (de Almeida et al., 2018), and using uniform meter-scale data can significantly increase computational demands.This perspective often stems from limited computational resources but can also vary depending on the specific circumstances.To strike a balance between modeling efficiency and accuracy, nonuniform adaptive mesh generation based on multiresolution analysis is employed in this paper.By incorporating locally adaptive mesh discretization over the domain, modeling accuracy can be guaranteed over regions of interest, while computational efficiency is maintained by utilizing coarser coverage in rural areas where flow processes are changing much less rapidly.This approach accounts for the given accuracy requirements and optimizes the allocation of computational resources, but it is worth noting that the quality of the validation data although very good at our test sites is a limiting factor on the evaluation.
One of the challenges of using a nonuniform grid in flood modeling is that the time step is constrained by the CFL condition on the reference level for stability, which is ultimately determined by the smallest grid cells over the domain.This constraint means that the time step remains at the same magnitude as the full resolution model, limiting the modeling efficiency.An alternative solution is to implement a local time-stepping technique, which allows for different grid cells to have individualized time steps based on their size (Cohen et al., 2003;Kesserwani & Liang, 2015;Qin et al., 2019).The time step is adapted to the specific requirements of each grid cell, allowing for more precise simulations in areas where it matters most while optimizing computational resources in other regions.However, it is worth noting that implementing a local time-stepping technique in the context of GPU calculation may not provide significant efficiency gains, as the calculation time is in fact determined by the slowest threads considering that there are thousands of threads available.Incorporating additional processes to account for the local time-stepping approach can make the calculation process more complex without yielding substantial efficiency improvements on GPU.Therefore, the implementation of local time-stepping in GPU calculations should be carefully evaluated to ensure that the benefits outweigh the added complexity and potential limitations in computational efficiency.For simplicity and stability, the CFL conditions holds critically by demanding all grid cells evolve at the same time step as the smallest grid cells in the model.By imposing a uniform time step across all grid cells, potential instabilities that may arise from cells evolving at different rates or the need for sub-cycling in certain regions can be avoided.This simplification promotes stability and makes the simulation easier to manage.
The capacity to refine both in space and time is crucial when dealing with simulations that involve massively disparate length scales.While static nonuniform mesh generation demonstrates its efficacy in addressing gradually varied urban flood modeling and ensures modeling accuracy through optimized mesh resolution, it is noteworthy that the rigorous filter formulae of the wavelet-based dynamic adaptivity become inapplicable in the context of static mesh refinement (Kesserwani & Sharifian, 2023).From the perspective of error control, dynamic adaptivity would be a promising solution where the mesh refinement is gradually achieved with the evolution of the flood inundation and the solution accuracy is maintained within a given accuracy while also optimizing computational resources (Caviedes-Voullième et al., 2020;Gerhard et al., 2015;Kesserwani & Sharifian, 2020).
It is also essential to carefully consider the trade-off between the benefits of refinement and the associated computational costs when dealing with complex systems that exhibit significant variations in length scales, as it can be a huge burden for dynamic mesh refinement at each time step.We will further assess the dynamic adaptivity in future work.
One limitation of the nonuniform SGC solver is that it is currently implemented to run on a single GPU.This restriction means that the computational capabilities of the SGC solver are limited by the resources of a single GPU.The maximum coverage of urban areas that can be effectively simulated using the nonuniform SGC solver is constrained by the available GPU memory, and now the ability is typically limited to ∼20 km 2 at 1 m resolution for a 16 GB memory.One possible approach is to distribute the computational workload across multiple GPUs.By utilizing multiple GPUs, the solver can leverage additional computational power and potentially increase the coverage area that can be effectively simulated.However, implementing multi-GPU solutions can introduce additional complexities and may require significant modifications to the algorithms and code structure.The performance of a parallelized code crucially depends on the load-balancing and the inter-processor communication on a distributed memory architecture.We also have to pay the cost of inter-processor communications, while neighbors from the geometrical domain may belong to different processors.

Conclusions
The proposed SGC 1D river model integrated with a nonuniform overlying 2D grid presents a novel approach to incorporating surveyed/estimated bathymetry into urban flood modeling, eliminating the need for extensive interpolation or modification of the DEM.The independently defined SGC allows river channels with any width above that of the grid resolution to be simulated in 1D manner, reducing the computational costs, and minimizing data requirements below the waterline in the river channel.A set of structured tests ranging from an idealized scenario to real-world flooding cases validated the efficacy of the SGC model.The results demonstrate that this model offers a more straightforward representation of river channel width and depth based on available observations combined with an efficient approach to floodplain discretization which concentrates resolution in areas where it is most required.Notably, it achieves comparable performance to full resolution 2D models but with a significantly reduced number of grids.Also, the 1D river channel representation releases the constraint on mesh refinements in full resolution 2D models that need to represent channel bathymetry.The focus is primarily on accurately representing the river channel itself rather than refining the entire river channel mesh as in traditional Water Resources Research 10.1029/2023WR036128 2D approaches where mesh refinements are often required to accurately represent the complex flow patterns and variations in water depth.
As an extension of the subgrid modeling approach, this integration allows us to seamlessly transition between sub-grid and super grid channels, depending on the local DEM resolution used for the simulation.In particular, the bathymetry can be estimated from commonly available data (water surface profile, gauge discharge) in a way similar to the subgrid modeling approach (J.Neal et al., 2012Neal et al., , 2021)).In high resolution urban flood modeling, where the river bathymetry is often unobserved in most terrain remote sensing, the proposed SGC model proves particularly valuable by incorporating estimated bathymetry in flood modeling.The subgrid channel model allows river channels with any width below that of the grid resolution to be simulated on a relatively coarse grid.On the contrary, when the river channel width exceeds the grid resolution, the super grid approach with a mask file is activated for river flow simulation.Consequently, the method will hopefully find wide applicability in both largescale and local-scale flood modeling, accommodating situations where channel width may surpass or fall below the grid resolution.It is important to note that while the water depth profile generated by the integrated subgrid and SGC model may exhibit slight differences compared to the full resolution 2D model (Text S3.3 in Supporting Information S1), these variances primarily stem from the simplified physical flow dynamics inherent in the 1D channel solver and are small compared to other uncertainties (Di Baldassarre & Montanari, 2009;Fewtrell, Neal, et al., 2011;McMillan et al., 2012;Wing et al., 2021).In contrast, a fully 2D model fully resolves flow dynamics using a larger number of grids.Despite these minor differences, the performance of the model presented in this paper remains acceptable, especially considering the high compression rate achieved.This approach offers a compromise between computational efficiency and modeling accuracy, making it well-suited for practical applications in flood modeling across diverse scales.
The integration of optimized adaptive 2D nonuniform structured grids with independently defined 1D river channels represents a significant advance in reducing the computational demands of urban flood modeling.This approach, guided by a mask file delineating regions of interest where local small-scale effects are crucial, achieves a substantial reduction in the total number of grids.Importantly, this grid maintains the structural properties of a uniform grid, enabling the direct solution of governing equations across the entire computational domain without the need for interpolation.Coupled with GPU parallel calculation and the efficient local inertial formulation of SWEs, the hybrid 1D-2D hydrodynamics model significantly reduces computational time and memory requirements without compromising the accuracy of the model, at least over domains of interest.A compression rate of up to 30% can be achieved where the overall model accuracy is maintained.A ∼10× speed-up can be achieved compared with the CPU architecture, a 2-day flooding inundation can be calculated within 30 min for Carlisle and 6-day flood event within 10 hr for the Cockermouth test on a Tesla P100 GPU.This acceleration in computational speed positions the model as a highly efficient and practical tool for urban flood modeling, offering substantial benefits in both time and resource management.
2-4 are designed to emphasize the difference in elevation among each 2 × 2 child elements, along the horizontal, vertical, and

Figure 1 .
Figure 1.Procedures to generate the locally adapted grid based on multiresolution analysis of the underlying topography data.A hierarchy of nested grids at descending resolution can be constructed by successively applying low-and high-pass filters.Significant details are those whose normalized value is greater than a user defined threshold ε, or within a user-defined region of interests.The significant details form a tree structure, where the locally adapted grid corresponds to the leaves of the tree.

Figure 2 .
Figure 2. The hybrid finite difference-volume solution scheme stencil for the local inertial formulation of shallow water equations on a nonuniform grid.

Figure 4 .
Figure 4. Indexing the nonuniform structured grid with the Morton code and the GPU implementation of the calculation processes.The space-filling Z-curve is used to order and access data on the nonuniform grid.The neighbor-searching algorithm relies on converting the Morton code to the coordinate vector (row, column) on the reference regular grid and locates the surrounding elements with simple mathematical relationships.
" The false alarm (b) occurs when obs = 0 and sim = 1, and the misses (c) occur when obs = 1 and sim = 0. Correct negatives are instances where both observed and simulated values are "0."H = a/(a + c) measures the extent to which the simulated flood extent aligns with the reference one, with H = 1 indicating full coverage and H = 0 otherwise.F = b/(b + d) quantifies the portion of the simulated flood extent that falls outside the reference one, with F = 0 denoting that no portion of the simulated flood extent lies outside the reference, whereas F = 1 indicates the entire simulated flood extent is outside the reference.The CSI = a/(a + b + c) combines H and F to measure how much of the sum of the simulated and the reference flood extents is covered by the simulated flood extent.CSI = 1 indicates full coverage and CSI = 0 no coverage.

Figure 5 .
Figure 5. Surveyed flood inundation extent and 183 water wrecks in the city of Carlisle.The time-series water level at six sampling points (P1-P6) is recorded for the performance validation of these adaptive solvers.

Figure 6 .
Figure 6.Comparison of maximum flood extents predicted by four solvers: 2D, masked 2D, super grid channel (SGC), and masked SGC model, against the full resolution 2D model.The 2D model utilizes user-defined threshold for mesh generation, while the masked 2D model incorporates masked regions of interest for additional mesh refinement.The SGC model incorporates a super grid representation of the river channel, and the masked SGC model further includes mesh refinement over the floodplain as the masked 2D model.The evaluation was conducted using hit rate (H), false alarm (F), and critical success index values computed against the reference full resolution 2D predictions.

Figure 7 .
Figure 7. Water level time series recorded at six sampling points by four models under ε = 10 4 and ε = 10 6 .

Figure 8 .
Figure 8. Environment agency surveyed maximum flood inundation extent in Cockermouth town, UK.

Figure 9 .
Figure 9. Nonuniform mesh in the 2D model and masked super grid channel model with varying threshold values of ε for mesh generation.

Figure 10 .
Figure 10.Maximum water depth distribution over the urban area and the river channel from the masked super grid channel model under ε = 10 5 .

Figure 11 .
Figure 11.Water level time series recorded at two sampling points by 2D model and masked super grid channel model under optimized mesh quality.

Table 1
The Model Performance Was Evaluated Using Root Mean Squared Error (RMSE) and Mean Error at 183 Surveyed Water Surface Elevation Points Youtong Rong was supported by the China-Scholarship-Council (CSC)-University of Bristol Joint Ph.D. Scholarships Program (No. 202006250032).The work of Paul Bates and Jeffrey Neal was supported by UK Natural Environment Research Council Grants NE/V017756/1 and NE/ X014134/1.