Towards developing comparable optical and SAR remote sensing inundation mapping with hydrodynamic modelling

ABSTRACT Inundation mapping is an essential part of environmental monitoring, flood disaster management and risk mitigation. There are many earth observation sensors that can provide spatial information about inundation extent at different times. However, these extents need to be comparable to provide an accurate and consistent estimate of a flood’s progression. Monitoring inundation extent around the peak of a flood event is important because it captures the hazardous period during a large flood event, and it identifies connections between off-stream waterholes and flood waters for environmental monitoring. This paper presents results from a study comparing near-coincident flood inundation maps derived from optical (Landsat, MODIS, Sentinel-2 and VIIRS) and Synthetic Aperture Radar (SAR) (Sentinel-1 and NovaSAR-1) remote sensing imagery. We also compare all inundation maps derived from remote sensing data with near-coincident hydrodynamic (HD) modelling. The study was conducted for the Fitzroy floodplain in Western Australia, a large, complex and remotely located river basin. The results show that optical remote sensing data have average F1 scores ranging from 0.57 to 0.65 when compared to the HD model results, with Landsat and MODIS performing best. Sentinel-1 and NovaSAR-1 SAR have a poor agreement (average F1 score of 0.31 and 0.35 respectively) when compared to the HD model results, particularly within scattered vegetation adjacent to the river channel, with better results in open water on the floodplain and in the river. If comparisons are made only during the peak flood stage, the F1 scores improve for the optical data (0.61 up to 0.8). Comparisons of the remote sensing inundation maps show that the optical data are suitable for interchangeable mapping during large flood events.


Introduction
The location, duration and timing of inundation are essential information for river basin management from an engineering, ecological and environmental perspective. Climate change and an increasing population place pressure on water availability (Leblanc et al. 2012) which in turn leads to stress on water environments (such as wetlands and floodplains) and their connectivity to the river systems (Karim et al. 2016;Arthington et al. 2020). Furthermore, identifying flood-affected areas is important for supporting emergency services, as well as assessing flood risk (FEMA 2003;WMO 2009). The maximum extent of a flood event can be considered as the most important stage. This is because it determines all areas affected by inundation during a large, hazardous flood event, and it can be used to identify connections between off-stream waterholes and flood events for monitoring vulnerable wetland habitats.
Ground observations can provide valuable information about the extent of surface water, but they are not always available, especially during flood events. Furthermore, it can be difficult to obtain detailed and accurate spatial information on water extent using gauging stations alone (Peña-Arancibia et al. 2015). Satellite-based remote sensing technology is an affordable and easily accessible means of capturing near real-time inundation information with reasonable spatial resolution (Schumann et al. 2018;Oddo and Bolten 2019;Sakai et al. 2015). This can be valuable for flood monitoring at a regional scale, a requirement that many freely available sensors meet.
Common optical remote sensing instruments that can be used for mapping surface water include Landsat (Fisher, Flood, and Danaher 2016), MODerate resolution Imaging Spectroradiometer (MODIS) (Sims et al. 2018) and Sentinel-2 (Phiri et al. 2020). The Suomi National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite (S-NPP VIIRS) can also be used for mapping surface water, and as a surrogate for the MODIS instrument (Huang et al. 2015), which is approaching its end of life. Cloud cover is often associated with flood events, particularly around the peak of a flood, so Synthetic Aperture Radar (SAR) sensors can help fill these gaps (Oddo and Bolten 2019). Sentinel-1 (A, and B until 2021) data are freely available (Torres et al. 2012) and have been acquired routinely since 2016, every 6-12 days across much of the globe. Other SAR instruments are available, but currently need to be tasked to capture a flood. NovaSAR-1 is an S-band instrument, launched in 2019, in which the CSIRO has a 10% share, allowing it to task imagery over Australia. The CSIRO prioritizes the acquisition of NovaSAR-1 data during hazardous events, and as such has captured many of the recent large floods across eastern Australia. The NASA-ISRO SAR (NiSAR) Mission (NASA 2023), scheduled for launch in 2024, will include an S-band, so NovaSAR-1 provides an opportunity to further test this SAR band for inundation mapping. Preliminary results show that it can detect flooded wetlands. Relevant characteristics for these sensors are shown in Table 1.
Table 1. Characteristics of sensors used in this analysis. 'Spatial resolution' contains the value used in this analysis (some sensors have a range of spatial resolutions).
Many methods have been applied to map surface water extent using optical remote sensing data (Brakenridge and Anderson 2006;Feng et al. 2015;Pekel et al. 2016;Tulbure et al. 2016). Spectral indices are often used for mapping surface water because they are simple and repeatable (McFeeters 1996;Xu 2006;Feyisa et al. 2014;Fisher, Flood, and Danaher 2016;Rad, Kreitler, and Sadegh 2021). Spectral indices that include a visible and Short-Wave InfraRed (SWIR) band perform well for flood detection (Boschetti et al. 2014). The daily frequency of the MODIS sensors makes it suitable for identifying medium-to-large floods at a regional scale, with the potential of capturing fast moving events provided the sky is clear (Soulard et al. 2022). Different approaches have been applied to mapping surface water using MODIS spectral information (Khandelwal et al. 2017;Pekel et al. 2014;Guerschman et al. 2011). The VIIRS sensor has characteristics similar to MODIS and is also able to identify surface water (Huang et al. 2015). Landsat and Sentinel-2 data are also useful for mapping surface water as they can identify narrow or small water features compared to instruments such as MODIS (Soulard et al. 2022;Li et al. 2021;Sekertekin, Cicekli, and Arslan 2018). SAR backscatter tends to be very low over open water, allowing for its discrimination from land (Ovakoglou et al. 2021;Bioresita et al. 2019). There are many methods for mapping surface water in a SAR image with some incorporating thresholding based on backscatter (Martinis, Twele, and Voigt 2009;Zhang et al. 2020;Bioresita et al. 2018) or different input bands (Li and Wang 2015), as well as other classification methods (Yuan et al. 2019;Shen et al. 2022).
Capturing a flood's progression can be difficult when only one sensor is used due to the extended time between acquisitions, as well as cloud cover for optical instruments. Combining all available remote sensing data can help build a more complete picture of water movement during flood events. One of the challenges is ensuring the different sensors are mapping comparable extents. While direct comparisons can be made when two different sensors capture a flood event at the same (or similar) time, this does not occur often. However, hydrodynamic (HD) modelling provides fine temporal details (and spatial details if a fine-resolution DEM is available) about a flood's movement (Dutta et al. 2007;Karim et al. 2016). Although it requires large computing resources and hence can only be applied to small areas, it generally provides an accurate and consistent estimate of flood extent through time.
This study provides a detailed comparison of all the available remote sensing water maps with HD model outputs during a selection of flood events within the Fitzroy catchment in Western Australia. It also compares estimates of flood extent as derived from all available remote sensing imagery when acquired at a similar time. The producer's accuracy, user's accuracy and F1 scores are used to assess the similarity in extent and conclude which data sources are compatible. Further comparisons are assessed for the critical time close to the peak of a flood event.

Study site
The study is based on the Fitzroy catchment in Western Australia. It covers an area of approximately 93,830 km² and consists of three major river systems: the Fitzroy River, Margaret River and Christmas Creek (Figure 1). The Fitzroy River traverses 735 km through elevated ranges (>450 m above average sea level) before discharging into the Timor Sea. The Fitzroy is one of the largest rivers in Australia and carries a mean annual flow (MAF) of 8000 gigalitres at Fitzroy Crossing (Petheram et al. 2012). It has vast floodplains on both sides of the river in its mid-to-lower reaches. The topography of the floodplain is complex, having several braided channels of varying width and depth along the main channel of the river. The Margaret River and Christmas Creek are two major tributaries that have significant effects on the floodplain flow regime and inundation. At Fitzroy Crossing, the river can swell to extend 15 km across the floodplain during floods. The floodplains of the Fitzroy catchment are underlain by alluvial aquifers, which are recharged during flood events (Karim et al. 2018).
The Fitzroy catchment experiences a semi-arid climate with a mean annual rainfall of 550 mm. The rainfall is characterized by a highly distinctive wet season (November to April) and dry season (May to October). Rivers and creeks in the Fitzroy catchment are mostly ephemeral except the Fitzroy River downstream of Fitzroy Crossing where flow varies between perennial and intermittent. Intense seasonal rains from December to April create localized flooding and inundate large areas across the Fitzroy and Margaret rivers. A major source of flood water is the inflow from headwater catchments upstream of Dimond Gorge and Mt Krauss, while about 15% of flood water is generated from local rainfall. Floods in the Fitzroy catchment mostly occur in the period from January to March (88%) with the highest in February (38%). Since 1980, there have been 23 floods ranging from an annual exceedance probability (AEP) of 1-50 years.

Topography
Land surface elevation, often supplied as a raster digital elevation model (DEM), is an essential input to the HD model. Topography governs direction and velocity of flow between modelling grids. We have used two sets of topography data: 1-m resolution LiDAR (Light Detection And Ranging) data were used for the rivers and riparian zones and 30-m resolution SRTM (Shuttle Radar Topography Mission) data for the remainder of the floodplain. The modelling domain covered an area of 3.6 × 10⁴ km² and the area covered by LiDAR is approximately 0.6 × 10⁴ km² (16% of the total area) (Figure 1).

Hydraulic roughness
Land cover which provides resistance to the propagating flood wave is an essential input to HD modelling. We have used several sources of land cover data to estimate hydraulic roughness such as satellite imagery, aerial photos, topographic mapping and Google Earth imagery. At first, a 30 m resolution gridded land cover map was produced by categorizing land cover into bare soil, river, wetland, riparian vegetation, savannas, shrubs and tropical woodlands. Each land cover grid was then replaced with a corresponding hydraulic roughness coefficient (n) using recommended values in published literature (e.g. Duan et al. 2018;LWA 2009;Prior et al. 2021). Roughness coefficients were adjusted during model calibration to produce the best match between observed and modelled water level and discharge along the rivers and across floodplains.

Flow
Streamflow and water level are two important inputs to the HD model configuration and calibration. These data were obtained from the Department of Water of the Government of Western Australia. Daily streamflow at Christmas Creek, Dimond Gorge and Mt Krauss were used to specify inflow from upstream catchments, and the only downstream boundary was specified using daily water levels at Willare (Figure 1). Water levels at Fitzroy Barrage, Fitzroy Crossing, Looma and Noonkanbah gauging sites were used to calibrate model parameters (e.g. roughness coefficient and eddy viscosity). Streamflow data were used to identify relatively recent flood events in the Fitzroy floodplain that were suitable for HD modelling and captured by earth observation sensors. Based on the availability of satellite data, we have selected five floods to evaluate modelled inundation. The magnitude of the selected floods varies from 1 in 5 to 1 in 25 years exceedance probability (Table 2).

Earth observation data
The two MODIS sensors (onboard the TERRA and AQUA platforms since 2000 and 2002, respectively) acquire daytime images over Australia in the morning (TERRA) and afternoon (AQUA). The spatial resolution of MODIS surface reflectance data ranges from 250 m to 1000 m, depending on wavelength (Salomonson, Barnes, and Masuoka 2006). Daily images of the surface reflectance from only the TERRA MODIS sensor (MOD09GA) were used here. This is because of failure of detectors that produce Band 6 on the AQUA sensor, resulting in stripes in the data (Gladkova et al. 2012). Daily MODIS surface reflectance (MOD09GA) was downloaded from NASA's Earth Data Search website (NASA 2022). Landsat data have a spatial resolution of 30 m pixels (Wulder et al. 2019). Landsat images are available every 16 days from a single sensor. This becomes more frequent along the edge where the swaths overlap, or with two sensors in orbit, but reduces due to cloud cover and missing data. The archive of Landsat data (Landsat 5 Thematic Mapper, Landsat 7 Enhanced Thematic Mapper, and Landsat 8 Operational Land Imager) is available from Digital Earth Australia (DEA). The DEA provides consistent pre-processing of Landsat data (including atmospheric correction and Bidirectional Reflectance Distribution Function (BRDF) correction) to an analysis-ready level for the Australian continent (Dhu et al. 2017).
The Sentinel-2 sensors are part of the Copernicus Programme by the European Space Agency (ESA). These sensors provide a fine spatial resolution (10-20 m), with observations every 5-10 days. Sentinel-2A data are available from 2015, with more frequent coverage following the launch of Sentinel-2B in 2017 (Phiri et al. 2020). Sentinel-2 data are also provided as BRDF-corrected surface reflectance from DEA.
The S-NPP VIIRS instrument (which will now be called VIIRS) was launched in 2011 and has similar characteristics to MODIS. VIIRS provides daily surface reflectance at a similar spatial resolution to MODIS (375 m pixels for the reflectance bands used in this study), with an afternoon daytime overpass time (Xiong et al. 2016). Daily VIIRS (VNP09GA) were downloaded from NASA's Earth Data Search website (NASA 2022).
Sentinel-1 consists of two SAR sensors, Sentinel-1A (launched in 2014) and Sentinel-1B (launched in 2016) (Potin et al. 2016). Their common Interferometric Wide Swath mode provides radar backscatter measurements at a pixel size of 10 m, with a spatial resolution of 20 × 22 m, and is available every 6-12 days (although acquisition frequency has been reduced to 12 days following issues with the Sentinel-1B instrument; ESA 2022). Sentinel-1 data were downloaded from the Sentinel Australasia Regional Access Hub (Australian Government 2022).
NovaSAR-1 is an S-band (3.2 GHz; wavelength 9.4 cm) SAR satellite designed and built by Surrey Satellite Technology Ltd, UK. Although data are not routinely acquired, the CSIRO is able to task imagery over Australia (Held et al. 2019). All NovaSAR-1 data are freely available from the CSIRO data portal (CSIRO 2022) and will soon be released in an analysis-ready format, following the Committee on Earth Observation Satellites Analysis Ready Data for Land (CARD4L) recommendation of normalized radar backscatter (CARD4L 2022).
All remote sensing imagery for Landsat, MODIS, Sentinel-2, VIIRS, Sentinel-1 and NovaSAR-1 within the flood dates (Table 2) was downloaded for the Fitzroy River HD model domain. Optical imagery with a large proportion of cloud cover was removed from further analysis; however, some cloud cover was allowed if the remaining pixels provided a clear view of inundation somewhere within the floodplain. The number of remote sensing scenes used in this study is shown in Table 3. All imagery was resampled to a pixel size of 20 m (the finest spatial resolution of the remote sensing data), allowing for a direct pixel-to-pixel comparison between remote sensing and HD model inundation maps.

Hydraulic modelling
The HD modelling was conducted using a two-dimensional flexible mesh flood inundation model, commonly known as MIKE21 FM (DHI 2016). The modelling domain was divided into four zones using different sizes of mesh, with the smallest mesh for rivers and the largest mesh for areas not usually subjected to inundation (Figure 1). The model consisted of approximately 1.62 million triangular mesh elements. Daily discharge was specified at the three inflow boundaries (Dimond Gorge, Mt Krauss and Christmas Creek), and water level was specified at the downstream boundary at Willare (Figure 1).
Model boundaries were specified at a daily time step, and hydrodynamic simulations of flow and water depth were undertaken at 5-s intervals. Irrespective of magnitude, each flood event was run for 5 days of warmup and 35 days of simulation to ensure complete flood recession. CSIRO's high-performance computing facilities were used to run the models. Each machine consists of 3 GPU (graphics processing unit) cards and 16 CPU (central processing unit) cores. For the current setup, each model run takes about one and a half days of computer time for a 40-day flood event. To reduce data volume, model outputs were recorded at six-hourly time steps (6am, midday, 6pm and midnight), which was deemed sufficient for the inundation analysis.

Water maps from optical data
Two of the most commonly used and validated methods of mapping surface water within Australia are the modified Normalized Difference Water Index (mNDWI) by Xu (2006) and a water index by Fisher, Flood, and Danaher (2016) (which will be called FWI). Ticehurst, Teng, and Sengupta (2022) utilized these indices, along with the Tasseled Cap Wetness index (Dunn et al. 2019), applying them within the water environments where they perform best. This method was used here for the Fitzroy floodplain, which, given its environment, uses the maximum of mNDWI and FWI (Max_fwi_ndwi), where NIR is the near-infrared band, and SWIR1 and SWIR2 are the Short-Wave InfraRed bands around the 1500-1600 nm and 2100-2200 nm wavelengths, respectively. Since Max_fwi_ndwi uses the combination of the FWI and mNDWI indices, it allows for the identification of different coloured water (Fisher, Flood, and Danaher 2016). Max_fwi_ndwi was used to produce water maps from all the optical images (Landsat, MODIS, Sentinel-2 and VIIRS) to allow a direct comparison.
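As a minimal sketch of the index combination described above, the following shows mNDWI in its standard normalized-difference form (green and SWIR1 reflectance) and an element-wise maximum with an FWI value. The FWI value is taken as a precomputed input, and it is assumed here that both indices are already on a comparable scale. The function names and the water threshold of 0.0 are hypothetical and are not taken from the study; a real workflow would apply the same logic to whole raster arrays.

```python
from typing import Optional


def mndwi(green: float, swir1: float) -> Optional[float]:
    """Modified NDWI (Xu 2006): (green - SWIR1) / (green + SWIR1).

    Returns None when the denominator is zero (treated as a null pixel).
    """
    total = green + swir1
    if total == 0:
        return None
    return (green - swir1) / total


def max_fwi_ndwi(fwi: Optional[float], mndwi_value: Optional[float]) -> Optional[float]:
    """Maximum of the two indices for one pixel.

    Assumption: fwi is supplied already scaled to be comparable with mNDWI.
    A pixel is null (None) if either index is null.
    """
    if fwi is None or mndwi_value is None:
        return None
    return max(fwi, mndwi_value)


def is_water(index_value: Optional[float], threshold: float = 0.0) -> bool:
    """Classify a pixel as water when the index exceeds a threshold.

    The default threshold is a placeholder, not a value from the study.
    """
    return index_value is not None and index_value > threshold


# Example pixel: surface reflectance values are illustrative only.
pixel_index = max_fwi_ndwi(fwi=0.2, mndwi_value=mndwi(green=0.10, swir1=0.02))
print(is_water(pixel_index))
```

Taking the maximum means a pixel is flagged as water if either index responds, which is one way to accommodate water of different colours.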

Water maps from Synthetic Aperture Radar data
The Sentinel-1 data were first terrain corrected and processed to orthorectified normalized radar backscatter using the European Space Agency SNAP toolbox. NovaSAR-1 was provided as terrain corrected and orthorectified normalized radar backscatter (processed using GAMMA software). The backscatter from both the Sentinel-1 and NovaSAR-1 images was converted from linear to decibels (dB) and a 3 × 3 Refined Lee filter applied to reduce speckle effects.
Applying a single threshold to a co-polarized band of SAR backscatter (in this case VV) is a common method for identifying surface water extent due to the low backscatter over open surface water (Bioresita et al. 2019). It requires a strong bimodal distribution of the pixel values, separating the low backscatter of surface water from land. This was achieved by selecting a subset within each SAR image that contained a good representation of both inundated and dry pixels (Bioresita et al. 2018 suggests at least 10% surface water) such that the two modes are visible in the histogram (Islam and Meng 2022). For SAR images covering the lower floodplain, this subset was located between Looma and Fitzroy Barrage (Figure 1), and for those further upstream this subset was located at Fitzroy Crossing. Otsu's threshold method (Otsu 1979) was then used to automatically select a backscatter threshold. Values below this threshold are classified as water across the whole SAR image.
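The linear-to-decibel conversion and the automatic threshold selection can be illustrated with a small, dependency-free sketch. It implements Otsu's between-class variance criterion on a histogram of backscatter values from the chosen subset. The bin count, the function names and the sample values are assumptions for illustration; speckle filtering is not shown, and a production workflow would normally use an image-processing library on the full raster.

```python
import math
from typing import List, Optional


def to_db(linear: float) -> Optional[float]:
    """Convert linear backscatter to decibels; non-positive values become null."""
    if linear <= 0:
        return None
    return 10.0 * math.log10(linear)


def otsu_threshold(values: List[float], bins: int = 256) -> float:
    """Return the threshold that maximizes between-class variance (Otsu 1979)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centres = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centres, hist))

    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centres[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, lo + (i + 1) * width
    return best_t


# Illustrative bimodal subset in dB: low values (water) and higher values (land).
subset_db = [-22.0] * 50 + [-21.0] * 50 + [-9.0] * 50 + [-8.0] * 50
threshold = otsu_threshold(subset_db)
water = [v < threshold for v in subset_db]  # values below the threshold are water
print(threshold, sum(water))
```

The threshold is derived from the subset only and then applied to every pixel of the SAR image, which is why the subset needs both modes to be well represented.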
Low backscatter due to speckle and steep terrain (Islam and Meng 2022), as well as smooth ground (O'Grady, Leblanc, and Gillieson 2013) result in dry pixels mis-classified as water. This can be improved by masking regions of steep terrain and removing small clumps of pixels classified as water (Islam and Meng 2022). Hence, two further steps were applied: a sieve filter (available in QGIS software) was used to remove small clumps of classified water pixels less than 10 in size (since these are not usually part of a large, inundated area); and a floodplain mask was applied using MrVBF (Gallant and Dowling 2003) allowing pixels on steeper terrain (with MrVBF<5) to be removed. The MrVBF algorithm uses the SRTM DEM to categorize topography based on the degree of 'valley bottom flatness', resulting in an index ranging from 0 (steep slopes) to 8 (flat valley).
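The two clean-up steps can be sketched as follows. This is not the QGIS sieve tool itself: it is a plain-Python stand-in that removes connected clumps of water pixels smaller than a minimum size and then drops pixels whose MrVBF value is below 5. Four-neighbour connectivity is an assumption (the source does not state the setting used), and the function names are hypothetical.

```python
from collections import deque
from typing import List


def sieve_water(mask: List[List[bool]], min_size: int = 10) -> List[List[bool]]:
    """Remove 4-connected clumps of water pixels smaller than min_size."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search to collect one connected clump.
            clump, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                clump.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(clump) < min_size:
                for y, x in clump:
                    out[y][x] = False
    return out


def apply_floodplain_mask(mask: List[List[bool]], mrvbf: List[List[int]],
                          min_flatness: int = 5) -> List[List[bool]]:
    """Keep water pixels only where MrVBF >= min_flatness (flatter terrain)."""
    return [[w and f >= min_flatness for w, f in zip(mrow, frow)]
            for mrow, frow in zip(mask, mrvbf)]


# Tiny illustrative grid; min_size=3 is used only so the example stays small.
T, F = True, False
raw = [[T, T, F, F],
       [T, F, F, T],
       [F, F, F, F]]
flatness = [[6, 2, 8, 8],
            [5, 5, 5, 5],
            [0, 0, 0, 0]]
cleaned = apply_floodplain_mask(sieve_water(raw, min_size=3), flatness)
print(cleaned)
```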

Image assessment
All inundation maps for the images within the flood dates were directly compared to the HD model maps for the same date (and similar matching time). Given that the HD model was calculated every 6 hours (6am, midday, 6pm, midnight), the midday HD model extent was used for the optical images (which were acquired around 10-10.30am). For Sentinel-1 (with a local acquisition time ~7am) the 6am HD model extents were used, and for NovaSAR-1 (acquired ~9.30pm) it was midnight. All remote sensing images acquired on the same date were also compared, despite their different acquisition times.
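The pairing of each image with a six-hourly HD output is consistent with choosing the closest output time, which can be sketched as below. The function name is hypothetical, the example times are the approximate local acquisition times quoted above, and it is an assumption that a late-evening acquisition is matched to the midnight that follows it.

```python
from datetime import datetime, timedelta


def nearest_hd_output(acquired: datetime, step_hours: int = 6) -> datetime:
    """Return the model output time (multiples of step_hours) closest to acquired.

    Exact ties follow Python's round(), which rounds halves to the even step.
    """
    midnight = acquired.replace(hour=0, minute=0, second=0, microsecond=0)
    hours = (acquired - midnight).total_seconds() / 3600.0
    steps = round(hours / step_hours)
    return midnight + timedelta(hours=steps * step_hours)


# Approximate local acquisition times from the text, on an example date.
optical = nearest_hd_output(datetime(2020, 3, 6, 10, 15))     # midday
sentinel1 = nearest_hd_output(datetime(2020, 3, 6, 7, 0))     # 6am
novasar = nearest_hd_output(datetime(2020, 3, 6, 21, 30))     # following midnight
print(optical, sentinel1, novasar)
```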
All comparisons were made using the user's accuracy (UA; Equation 4), producer's accuracy (PA; Equation 5) and F1 score (Equation 6). All null pixels from the remote sensing data were omitted from the analysis.
\[ \mathrm{UA} = \frac{TP}{TP + FP} \tag{4} \]
\[ \mathrm{PA} = \frac{TP}{TP + FN} \tag{5} \]
\[ F1 = \frac{2 \times \mathrm{UA} \times \mathrm{PA}}{\mathrm{UA} + \mathrm{PA}} \tag{6} \]
where TP (true positive) is the number of water pixels in agreement, FP (false positive) is the number of dry pixels incorrectly mapped as water, and FN (false negative) is the number of water pixels incorrectly mapped as dry (Maxwell and Warner 2020).
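The UA, PA and F1 calculations can be sketched for a pair of water maps as follows, using the standard definitions in terms of TP, FP and FN. Maps are represented here as flat sequences of True (water), False (dry) or None (null); the function name and the example values are illustrative, and skipping pixels that are null in either map is an assumption.

```python
import math
from typing import Optional, Sequence, Tuple


def accuracy_scores(mapped: Sequence[Optional[bool]],
                    truth: Sequence[Optional[bool]]) -> Tuple[float, float, float]:
    """Return (UA, PA, F1) for a mapped water extent against a 'true' extent."""
    tp = fp = fn = 0
    for m, t in zip(mapped, truth):
        if m is None or t is None:
            continue  # null pixels are omitted from the comparison
        if m and t:
            tp += 1
        elif m:
            fp += 1   # mapped as water, 'true' map is dry
        elif t:
            fn += 1   # mapped as dry, 'true' map is water
    ua = tp / (tp + fp) if (tp + fp) else math.nan
    pa = tp / (tp + fn) if (tp + fn) else math.nan
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else math.nan
    return ua, pa, f1


# Illustrative pixels only: TP = 3, FP = 1, FN = 2, one null pixel skipped.
remote_sensing = [True, True, True, True, False, False, None]
hd_model = [True, True, True, False, True, True, True]
ua, pa, f1 = accuracy_scores(remote_sensing, hd_model)
print(round(ua, 2), round(pa, 2), round(f1, 2))
```

Here PA is lower than UA, which corresponds to the 'true' map containing more water than the mapped extent.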
For the remote sensing image–HD model comparisons, the HD water extents were considered the 'true' pixels. For the remote sensing–remote sensing image comparisons, one dataset was nominated as 'true' pixels to enable calculation of PA and UA: MODIS was considered the 'true' pixels when compared to the other datasets; VIIRS was considered the 'true' pixels when compared to Sentinel-2, Sentinel-1 and NovaSAR-1; and Sentinel-2 was considered the 'true' pixels when compared to NovaSAR-1. Selection of the remote sensing data considered as 'true' is arbitrary since it was just to allow additional comparisons, in relation to omission and commission errors, of the water extents.
Overall, for the optical remote sensing data, the best agreements with the HD model occur when the flood is at its largest. The lowest agreements happen near the end of the flood events.

Comparison of remote sensing data with hydrodynamic model output
The SAR–HD comparisons were mostly of poor agreement (F1 scores ranging from 0.19 to 0.38), with the best agreement occurring for Sentinel-1 (on 9 March 2020; Figure 2t). These poor agreements do not appear to be related to the size of the flood event; however, the individual comparisons in Figure 2 show that areas of agreement are distributed throughout the HD model domain in the larger bodies of water.
For each sensor, the mean PA and UA, and mean, minimum and maximum F1 scores were calculated when compared to the HD model extent (Table 4). The Landsat water maps have the highest agreement to the HD model extents with a mean F1 score of 0.65. Half of the image pairs show good agreement with F1 scores above 0.76. The mean PA (0.59) is lower than the mean UA (0.76), indicating the HD model is identifying more water than the Landsat water maps. The MODIS and VIIRS show moderate agreement with a mean F1 score of 0.61 and 0.58, respectively. For the MODIS–HD image pairs, 13 of the 38 had substantial agreement with F1 scores above 0.7. However, Landsat, MODIS and VIIRS all show good agreement with the HD model for some of the individual images, with the best agreements occurring when the flood is at its largest.
For Sentinel-2, the mean F1 score is 0.57, and the mean PA is the same as the mean UA (0.58), indicating an even amount of over-mapping and under-mapping of water compared to the HD model. Both the SAR sensors show poor agreement with the HD model water extents with a mean F1 score of 0.31 for Sentinel-1 and 0.35 for the NovaSAR-1 image. The mean PA is much lower than the mean UA for both Sentinel-1 and NovaSAR-1, indicating that the SAR water maps identify much less water than the HD model. The agreement is best in the open floodplain areas and in the deep river channel.
Figure 3 summarizes the spatial agreement across the floodplain of the remote sensing images with the HD model for the common dates. It shows the proportion of times where: the remote sensing identifies water in the same pixels as the HD model (left column); the HD model identifies water and the remote sensing does not (middle column); and the remote sensing identifies water and the HD model does not (right column). The best agreement for the Landsat–HD (Figure 3a) and MODIS–HD (Figure 3d) image pairs occurs in the lower half of the floodplain (along the eastern side). The best agreement for the SAR–HD image pairs occurs along the lowest section where the water spreads along the floodplain [...] (Figure 3l,o respectively).
The HD model was used to identify the date of maximum extent of a flood event. To allow for the fact that remote sensing data are unlikely to capture the maximum extent, the dates within 80% of the flood peak are used. If only this stage of the flood is considered, the F1 scores improve somewhat for the optical data (Table 5). The Landsat–HD mean F1 score increases from 0.65 to 0.80. The MODIS–HD mean F1 score increases from 0.61 to 0.70, and the Sentinel-2–HD mean F1 score also increases from 0.57 to 0.72. The mean F1 score for the VIIRS–HD image pairs increases marginally from 0.58 to 0.61. However, this smaller improvement from the VIIRS–HD image pairs, compared to the MODIS–HD image pairs, may be more related to the size of the flood events available for assessing VIIRS (since the 2017 and 2020 events are much smaller than those of 2002 and 2011). There is no change for the NovaSAR-1–HD image pair, and the Sentinel-1–HD image pair increases slightly (0.29 to 0.35) but still remains with a low mean F1 score.
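The selection of dates close to the flood peak can be sketched as below. The source does not spell out how 'within 80% of the flood peak' was computed, so this assumes it means dates on which the HD-modelled inundated area is at least 80% of the event maximum. The function name and the area values are hypothetical and purely illustrative.

```python
from typing import Dict, List


def near_peak_dates(extent_by_date: Dict[str, float], fraction: float = 0.8) -> List[str]:
    """Return dates whose modelled extent is at least `fraction` of the maximum."""
    peak = max(extent_by_date.values())
    return [date for date, area in extent_by_date.items() if area >= fraction * peak]


# Hypothetical HD-modelled inundated area (km^2) for one event.
modelled_area = {
    "day 1": 400.0,
    "day 2": 900.0,
    "day 3": 1000.0,  # event maximum
    "day 4": 790.0,
}
print(near_peak_dates(modelled_area))
```

Only remote sensing images acquired on the returned dates would then enter the near-peak comparison.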

Comparison of remote sensing data with other coincident remote sensing data
Individual remote sensing images acquired on the same day from different sensors were spatially compared and their F1 scores calculated. Figure 4 shows the spatial comparison of a selection of remote sensing water maps with the MODIS water maps (since these were the most numerous of the sensors) to demonstrate the differences seen between the sensors. For the MODIS–VIIRS image pairs, the agreement was not related to the size of inundation extent since the best results occurred not only when the flood was at its peak (with an F1 score of 0.81 on 19 February 2017; Figure 4f) but also at the very end of a flood event (with an F1 score of 0.83 on 11 March 2020; Figure 4l). The lowest agreement had an F1 score of 0.75 (on 7 March 2020; Figure 4i) and showed that MODIS identified more water than VIIRS. This may be partly related to the different acquisition times during the flood's recession (MODIS ~10am, VIIRS ~1.30pm).
For the MODIS–Sentinel-2 image pairs, the highest F1 score was 0.76 (on 6 March 2020; Figure 4c) and the lowest F1 score (0.02) showed extremely poor agreement (on 8 March 2020; Figure 4d). This was mostly due to the difference in spatial resolution of the sensors. This image pair only covers the upstream section of the river when the inundation extent was low, so MODIS was not detecting water in its larger pixels.
For the MODIS–SAR image pairs, there was low-to-moderate agreement for all image pairs. The best agreement was found for NovaSAR-1 with an F1 score of 0.46 (on 6 March 2020; Figure 4n). The lowest F1 score was 0.32 for Sentinel-1 (on 17 February 2017; Figure 4m). The SAR data agree with MODIS for the larger bodies of water rather than the narrow river channels. For the NovaSAR-1–Sentinel-2 image pair (Figure 5a; noting there were no image pairs from Sentinel-1), there was moderate agreement for the one date (F1 score of 0.46). Both NovaSAR-1 and Sentinel-2 have a similar spatial resolution; however, NovaSAR-1 is not capturing as much water as was identified in Sentinel-2. To further illustrate this, flood extent (as derived from the Sentinel-2 image using Max_fwi_ndwi) is outlined in blue in Figure 5b,c,d. Figure 5b uses the Red, Green and SWIR-2 band to show flooded water, as well as shallow flooded vegetation (which can be seen by the speckled texture in the water). The NovaSAR-1 water map (Figure 5c) is identifying water where flooded vegetation is not visible, including the flooded river channel running along the bottom of the image. The NovaSAR-1 VV backscatter image (Figure 5d) shows the backscatter is greater in the areas where there is scattered, partially submerged vegetation, compared to inundated areas void of vegetation.
There were 29 image pairs acquired on the same day from different remote sensing instruments. The number of image pairs, as well as the mean PA, mean UA, and mean, minimum and maximum F1 scores, are shown in Table 6. There were no image pairs for Landsat–Sentinel-2 or Sentinel-1–Sentinel-2. The MODIS–Landsat and MODIS–VIIRS image pairs had substantial agreement (with mean F1 scores of 0.76 and 0.80 respectively). For the MODIS–VIIRS image pairs, there was strong agreement for most dates, with seven of the ten having an F1 score above 0.8 and the remaining three an F1 score above 0.75. The MODIS–Sentinel-2 and VIIRS–Sentinel-2 image pairs had moderate agreement (mean F1 scores of 0.57 and 0.56, respectively). However, for both the MODIS–Sentinel-2 and VIIRS–Sentinel-2 image pairs, there was substantial agreement for three out of the four dates, with F1 scores ranging from 0.72 to 0.77 and from 0.68 to 0.75, respectively. For the VIIRS–SAR image pairs there was moderate agreement for two out of the four dates: one from NovaSAR-1 and one from Sentinel-1 (F1 scores of 0.45 and 0.41 respectively). The other two images, both Sentinel-1, showed poor agreement with VIIRS (F1 scores below 0.32).
Figure 4. Spatial comparison of Landsat (LS), Sentinel-2 (S2), VIIRS (VI), Sentinel-1 (S1) and NovaSAR-1 (N1) with MODIS water extent for common flood dates (F1 score is shown in brackets next to date, RS = remote sensing data being compared to MODIS).
The mean UA for the MODIS–Landsat and MODIS–Sentinel-2 image pairs is lower than the mean PA, indicating that MODIS is under-mapping water more than the other sensor (Table 6). The same result is also seen in the VIIRS–Sentinel-2 comparison. This is largely due to the different spatial resolution of the sensors and the fine spatial inundation extent of the receding flood water. The MODIS–VIIRS image pairs have similar mean UA (0.80) and mean PA (0.81), meaning they are mapping similar extents. Similar to the MODIS–SAR image pairs, the VIIRS–SAR image pairs have a low to moderate mean F1 score (lower than 0.48). All the image pairs containing SAR have a much lower mean PA than mean UA, indicating that SAR is under-mapping water extent compared to the other sensors.
Figure 6 summarizes the spatial agreement of the MODIS water maps with the other remote sensing water maps (Landsat, SAR, Sentinel-2 and VIIRS) for the common dates. It shows the proportion of times where: MODIS identifies water in the same pixels as the other remote sensing water map (left column); MODIS identifies water and the other remote sensing water map does not (middle column); and the other remote sensing water map identifies water and MODIS does not (right column). The MODIS–Landsat image pairs show the highest agreement within the lower half of the floodplain on the eastern side, downstream from Noonkanbah (red dot in Figure 6a), while the MODIS–SAR image pairs show agreement around the larger bodies of water within the lower floodplain (Figure 6d). The MODIS–Sentinel-2 and MODIS–VIIRS image pairs show agreement distributed across the whole floodplain (Figure 6g,j respectively). The MODIS–SAR image pairs show the largest difference, where MODIS identifies a lot more water than SAR (Figure 6e), while SAR is not identifying much more water than MODIS (Figure 6f). Landsat and Sentinel-2 show more water than MODIS across the whole floodplain (Figure 6c,i), although this difference is low and mostly related to the lower spatial resolution of MODIS. VIIRS and MODIS show small differences across the whole floodplain (Figure 6k,l), which is possibly related to the different observation times (MODIS in the morning and VIIRS in the afternoon).
Figure 6. Spatial comparison of proportion of agreement ('MOD water, RS water') and disagreement ('MOD water, RS dry' and 'RS water, MOD dry') for Landsat, SAR (Sentinel-1 and NovaSAR-1), Sentinel-2 and VIIRS with the MODIS water extent as calculated from all common flood dates (RS = all remote sensing data except MODIS, to which they are compared). The red dot in Figure 6a shows the location of Noonkanbah.
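As an illustration of how the pairwise statistics and the agreement proportions in this section can be derived, the following minimal sketch computes PA, UA and F1 for two binary water maps, and the per-pixel proportion of dates falling in each agreement class. It is not the processing chain used in the study: the function names, the toy grids, the choice of the first map as the reference, and the use of the number of common dates as the denominator are all assumptions made for the example.

```python
def accuracy_scores(reference_map, other_map):
    """Return (PA, UA, F1) for two equally sized binary water maps (1 = water).

    Assumption: reference_map plays the role of the reference layer and
    other_map is the layer being assessed against it.
    """
    tp = fp = fn = 0
    for ref_row, other_row in zip(reference_map, other_map):
        for ref_px, other_px in zip(ref_row, other_row):
            if ref_px and other_px:
                tp += 1  # water in both maps
            elif other_px:
                fp += 1  # water only in the assessed map (commission)
            elif ref_px:
                fn += 1  # water only in the reference map (omission)
    pa = tp / (tp + fn) if tp + fn else 0.0  # producer's accuracy
    ua = tp / (tp + fp) if tp + fp else 0.0  # user's accuracy
    f1 = 2 * pa * ua / (pa + ua) if pa + ua else 0.0
    return pa, ua, f1


def agreement_proportions(reference_maps, other_maps):
    """Per-pixel share of dates in each agreement class, as three grids."""
    n_dates = len(reference_maps)
    n_rows, n_cols = len(reference_maps[0]), len(reference_maps[0][0])
    classes = ("both_water", "reference_only", "other_only")
    counts = {k: [[0] * n_cols for _ in range(n_rows)] for k in classes}
    for ref, other in zip(reference_maps, other_maps):
        for r in range(n_rows):
            for c in range(n_cols):
                if ref[r][c] and other[r][c]:
                    counts["both_water"][r][c] += 1
                elif ref[r][c]:
                    counts["reference_only"][r][c] += 1
                elif other[r][c]:
                    counts["other_only"][r][c] += 1
    # Convert counts to proportions of the common dates (assumed denominator).
    return {k: [[n / n_dates for n in row] for row in grid]
            for k, grid in counts.items()}


# Toy grids (hypothetical): the assessed map shows extra water at the margins.
reference_map = [[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0]]
other_map = [[1, 1, 1, 0],
             [1, 1, 1, 0],
             [0, 1, 0, 0]]
pa, ua, f1 = accuracy_scores(reference_map, other_map)
print(round(pa, 2), round(ua, 2), round(f1, 2))  # 1.0 0.57 0.73
```

In this toy case UA is lower than PA because the assessed map contains water that the reference map lacks, which mirrors the interpretation given above for the MODIS–Landsat and MODIS–Sentinel-2 pairs. Plain nested lists are used only to keep the sketch free of dependencies.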

Discussion
The main strength of hydrodynamic modelling is that it is directly linked with hydrology and can produce inundation information at a desired spatial (e.g. 5 m) and temporal (e.g. sub-daily) resolution with high accuracy. The main limitation of hydrodynamic modelling is that it requires large input data sets. Moreover, the configuration, simulation and calibration of the model are time-consuming. To our knowledge, comparison of a range of remote sensing-derived inundation extents with an HD model is a novel approach, and it enables accuracy assessments to be made even when remote sensing observations are not coincident.
Agreement between the optical imagery and the HD model is better during the flood peak than during the recession, when the HD model identifies more water. Many factors influence the accuracy of the remote sensing and HD model image pairs, including the timing of image acquisition in relation to the stage of the flood event, as well as the area covered by the imagery (since the comparisons tend to be better in the lower floodplain where inundation is more extensive).
The lower spatial resolution of MODIS and VIIRS means they cannot detect the finer river channel, and sections of braided river, compared to the HD model and the higher-resolution remote sensing imagery. These lower spatial resolution sensors cannot detect changes in inundation extent along narrow rivers or small or irregular-shaped waterbodies (Wang et al. 2022), further explaining why results improved in optical imagery during flood peak times. While the VIIRS results, when compared to the HD model, were poorer than for MODIS, the common dates of the VIIRS and MODIS showed similar F1 scores with the HD model. Furthermore, the MODIS–VIIRS image pairs showed substantial agreement with a mean F1 score of 0.80, hence VIIRS may also be a suitable sensor for large flood events. Huang et al. (2015) also concluded that MODIS and VIIRS provided similar extents for a large lake, validating this with Landsat 8.
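The effect of pixel size on narrow water features can be illustrated with a small sketch that aggregates a fine binary water map to a coarser grid. This is only a schematic stand-in for a coarse-resolution sensor, not a model of MODIS or VIIRS and not part of the study's method; the function name, the block factor and the 50% water-fraction rule are hypothetical choices made for the example.

```python
def aggregate_water_map(fine_map, factor, min_fraction=0.5):
    """Aggregate a binary water map (1 = water) into factor x factor blocks.

    Assumption: a coarse pixel is labelled water when at least
    min_fraction of the fine pixels inside it are water.
    """
    n_rows = len(fine_map) // factor
    n_cols = len(fine_map[0]) // factor
    coarse_map = []
    for r in range(n_rows):
        coarse_row = []
        for c in range(n_cols):
            wet = sum(
                fine_map[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            )
            fraction = wet / (factor * factor)
            coarse_row.append(1 if fraction >= min_fraction else 0)
        coarse_map.append(coarse_row)
    return coarse_map


# Toy scene (hypothetical): a one-pixel-wide channel on the left,
# a fully inundated floodplain block on the right.
fine_map = [
    [0, 1, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 1, 1],
]
print(aggregate_water_map(fine_map, factor=4))  # [[0, 1]]
```

The channel occupies only a quarter of its coarse pixel and is lost, while the extensive floodplain water is retained, which is consistent with the better agreement observed around the flood peak.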
The SAR data are mapping less water than the other remote sensing datasets and the hydrodynamic model. One reason for this is that SAR backscatter increases in turbulent water, which may be due to vegetation scattered within the shallow inundated areas, making it difficult to distinguish from other surface features. In cases where inundation occurs under reasonably dense vegetation, resulting in a strong increase in backscattering due to double-bounce scattering, these areas can be identified using a different threshold (Cazals et al. 2016). The situation in this study is more complicated, as illustrated in Figure 5, where an increase in backscatter in the inundated areas is not indicative of the strong double-bounce response that enables flooded vegetation to be easily identified from its surrounds (Rosenqvist et al. 2007). The small size of the flood events in 2017 and 2020, when SAR data were available, resulted in shallower inundation depth and hence more influence on backscatter from low vegetation. Souza et al. (2022) also found that optical imagery (Landsat-8 and Sentinel-2) mapped water extent better than SAR in areas where water features, such as inlet branches, were more complex. Levin and Phinn (2022) compared PlanetScope and Capella-Space X-band imagery captured at the peak of a flood event in Brisbane, Australia, finding that the SAR identified less flood water due to vegetation and buildings. Despite these findings, SAR sensors have demonstrated their ability to map smooth open surface water.
While NovaSAR-1 is not a global dataset, applying the S-band SAR to inundation mapping contributes to the very limited body of research (such as Bhardwaj, Saini, and Chatterjee 2021; Liang et al. 2016) in preparation for the NISAR (L-band and S-band) mission scheduled for 2024 (NASA 2023). While we could not directly compare the Sentinel-1 and NovaSAR-1 inundation extents since observations were not coincident, comparison with the HD model enabled their accuracy to be assessed. Results showed that both SARs were able to identify the larger open bodies of water and flooded river channels, but not shallow inundation mixed with scattered vegetation. Similar results were reported by Ramsey, Rangoonwala, and Bannister (2013), who found mapping accuracies with L- and C-bands that were lower for shallow water depths compared to deep water. Manavalan, Rao, and Krishna Mohan (2017) reported that inundation extent in deeply flooded agricultural areas was greater from L-band compared to C-band, since flooded vegetation can appear smoother at the longer wavelength; however, our study was unable to detect any noticeable difference between S-band and C-band.
Apart from the NovaSAR-1 data, all the remote sensing data used here are freely available and cover most of the globe. Given the simplicity of the methods used to map surface water in the optical remote sensing imagery, they can be applied and have been validated across a diverse range of environments (Max fwi_ndwi in southeast Australia, Ticehurst, Teng, and Sengupta 2022; FWI in eastern Australia, Fisher, Flood, and Danaher 2016; mNDWI in the Yangtze River Basin, Li et al. 2013). The SAR backscatter threshold method has already been utilized in many different water mapping applications (e.g. Kavats, Khramov, and Sergieieva 2022; Brown, Hambidge, and Brownett 2016), after adapting the threshold to individual sites. The topography mask applied to the SAR water extents was extracted from MrVBF, which is only available for Australia; however, the SRTM data used to derive it are a global product. Furthermore, other topography information can be applied depending on the environment and available data (e.g. Wang et al. 2022; Islam and Meng 2022). While comparisons of water extent derived from different remote sensing data have been reported for large lakes (e.g. MODIS, VIIRS and Landsat, Huang et al. 2015; Sentinel-1, Landsat-8 and Sentinel-2, Souza et al. 2022) and floods (PlanetScope, Capella-Space X-band and VIIRS night-time brightness imagery, Levin and Phinn 2022), we are not aware of other comparisons made with such a large range of sensors for flood events.
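A backscatter threshold combined with a topography mask can be sketched as follows. This is a generic illustration under stated assumptions rather than the implementation used here: the function name is hypothetical, the threshold of -18 dB is a placeholder that would need adapting to each site and sensor, and the mask is assumed to be a simple true/false grid marking low, flat terrain where surface water is plausible (for example, one derived from a valley-bottom flatness index).

```python
def sar_water_map(backscatter_db, valley_mask, threshold_db=-18.0):
    """Label a pixel as water (1) when backscatter is below the threshold
    AND the topography mask allows water at that location.

    Assumptions: backscatter_db holds calibrated backscatter in dB,
    valley_mask holds booleans, and threshold_db is a site-specific
    placeholder value, not one taken from this study.
    """
    water_map = []
    for db_row, mask_row in zip(backscatter_db, valley_mask):
        water_row = []
        for db, allowed in zip(db_row, mask_row):
            is_dark = db < threshold_db  # smooth open water scatters little energy back
            water_row.append(1 if (is_dark and allowed) else 0)
        water_map.append(water_row)
    return water_map


# Toy grids (hypothetical values).
backscatter_db = [[-22.0, -20.0, -9.0],
                  [-21.0, -12.0, -8.0]]
valley_mask = [[True, False, True],
               [True, True, True]]
print(sar_water_map(backscatter_db, valley_mask))  # [[1, 0, 0], [1, 0, 0]]
```

The second pixel of the first row is dark enough to pass the threshold but is excluded by the mask, which is how a topography layer can suppress dark non-water surfaces such as terrain shadow. Shallow water with emergent vegetation, which tends to return more energy than open water, would not pass a threshold of this kind.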
The same method and index threshold have been used to map surface water for all the optical remote sensing imagery. While the agreement between the optical remote sensing water maps is generally good, the thresholds could be fine-tuned to produce surface water maps with an even stronger agreement. For the SAR data, further analysis on the identification of shallow flood water within scattered vegetation would help improve these comparisons, and methods such as image differencing using the baseline method (Brown, Hambidge, and Brownett 2016), provided a pre-flood image is available, may improve results.
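The idea of differencing a flood image against a pre-flood image can be sketched generically as below. This is not the specific baseline procedure of Brown, Hambidge, and Brownett (2016), whose details are not reproduced here; the function name, the class codes and the 3 dB change threshold are hypothetical, and both images are assumed to be co-registered backscatter grids in dB.

```python
def backscatter_change_classes(baseline_db, flood_db, min_change_db=3.0):
    """Classify per-pixel backscatter change between two co-registered images.

    Returns 1 where backscatter dropped by at least min_change_db
    (a candidate for new open water), 2 where it rose by at least
    min_change_db (a candidate for flooded vegetation or other change),
    and 0 otherwise. The threshold is a placeholder assumption.
    """
    classes = []
    for base_row, flood_row in zip(baseline_db, flood_db):
        class_row = []
        for base, flood in zip(base_row, flood_row):
            change = flood - base  # negative values mean a darker flood image
            if change <= -min_change_db:
                class_row.append(1)
            elif change >= min_change_db:
                class_row.append(2)
            else:
                class_row.append(0)
        classes.append(class_row)
    return classes


# Toy values (hypothetical): one darkened, one unchanged, one brightened pixel.
baseline_db = [[-10.0, -11.0, -12.0]]
flood_db = [[-19.0, -10.5, -7.0]]
print(backscatter_change_classes(baseline_db, flood_db))  # [[1, 0, 2]]
```

Keeping increases and decreases as separate classes reflects the observation above that inundation within scattered vegetation raised backscatter in this study area; any interpretation of either class would still need site-specific validation.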

Conclusions
In this study, we have gathered, processed and evaluated several sets of near-coincident flood inundation maps derived from optical sensors (Landsat, MODIS, Sentinel-2 and VIIRS), and SAR sensors (Sentinel-1 and NovaSAR-1). Inundation maps derived from satellite imagery were evaluated against an independent data set obtained from hydrodynamic modelling. Among the satellite data, the optical remote sensing data produced a moderate to substantial agreement when compared to the HD model results, with Landsat and MODIS performing the best with mean F1 scores of 0.65 and 0.61, respectively. In contrast, Sentinel-1 and NovaSAR-1 SAR data produced relatively poor agreements with HD model results (mean F1 scores of 0.31 and 0.35 respectively) and the optical remote sensing data (mean F1 scores ranging from 0.34 to 0.46), particularly within scattered vegetation adjacent to the river channel, with better results obtained in open water on the floodplain and in the river channel. This study also showed that Landsat, MODIS, and Sentinel-2 all compare well to the HD model around the peak periods of a flood event with mean F1 scores of 0.80, 0.70 and 0.72, respectively. The results imply that these three sensors can be used interchangeably to identify inundation extent around the peak of a flood event with reasonable confidence, while the VIIRS sensor may also be a suitable sensor for large flood events given it has a high mean F1 score (0.80) when compared to MODIS. This study demonstrates the advantages of using multiple remote sensing technologies for identifying flood events.

Data availability statement
The data that support the findings of this study are available from the corresponding author, CT, upon reasonable request.