A Satellite-Based Monitoring System for Quantifying Surface Water and Mesic Vegetation Dynamics in a Semi-Arid Region

Semi-arid and arid systems cover one third of the earth’s land surface, and are becoming increasingly drier, but existing datasets do not capture all of the types of water resources that sustain these systems. In semi-arid environments, small surface water bodies and areas of mesic vegetation (wetlands, wet meadows, riparian zones) function as critical water resources. However, the most commonly-used maps of water resources are derived from the Landsat time series or single date aerial photographs, and are too coarse either spatially or temporally to effectively monitor water resource dynamics. In this study, we produced a Sentinel Fusion (SF) water resources product for a semi-arid mountainous region of the western United States, which includes monthly maps of both a) surface water and b) mesic vegetation at 10 m spatial resolution using freely available Earth observation data on an open access platform. We applied random forest classifiers to optical data from the Sentinel-2 time series, synthetic aperture radar (SAR) data from the Sentinel-1 time series, and topographic variables. We compared our SF product with three commonly used and publicly available datasets in the western U.S. We found that our surface water class contained fewer omission errors than a leading global surface water product in (94 % producer’s accuracy (PA) vs 84 %) and comparable user’s accuracy (UA) (91 % vs 97 %) with commission errors occurring largely in mixed water pixels. Our mesic vegetation class had up to 43 % higher PAs compared to the National Wetlands Inventory (NWI) estimates and up to 78 % higher UAs over the Sage Grouse Initiative mesic resources maps during the most critical part of the water year. We found that while inclusion of SAR data from the C-band Sentinel-1 sensor consistently improved estimates of water resources in each of the last four months of the 2021 water year when compared to optical-only + topographic variables, only in September did those improvements lie outside of the 95 % confidence interval. With nine times finer spatial resolution and more frequent image collection, our SF maps characterize intra-annual dynamics of smaller water bodies (<30 m wide) and mesic vegetation integral to ecosystem functioning in semi-arid systems compared to leading Landsatderived products. Further, our workflow is easily reproducible using freely available data on an open access platform, and can be adopted to help guide land use decisions related to water resources by farmers, ranchers, and conservationists in semi-arid environments.


Introduction
Uncertainty surrounding water availability is increasing due to climate change and rising demand for water due to population growth, with two-thirds of the global population experiencing freshwater scarcity at least one month per year (Mekonnen and Hoekstra, 2016). Rising temperatures and changes in precipitation patterns (Pepin and Lundquist, 2008;Xia et al., 2017) have led to extended periods of drought (Morote et al., 2019) often resulting in devastating ecological and economic outcomes (Hoekstra et al., 2012;Yu et al., 2019). Warming climates lead to rain on snow events (McCabe et al., 2007) and earlier arrival of streamflows due to faster melting reduces late season water availability (Maurer et al., 2007). An improved understanding of how water availability is changing over time and space is a crucial step towards managing water resources effectively.
Satellite imagery has revolutionized our ability to monitor water on the landscape and how it changes over time (Feng et al., 2016;Pekel et al., 2016;Yamazaki et al., 2015). Datasets derived from Landsat such as the Global Surface Water Explorer (GSWE) (Pekel et al., 2016) and the Global Land Analysis and Discovery (GLAD) laboratory global surface water dynamics maps (Pickens et al., 2020) exemplify the power of cloud computing and time series analysis. Both of these open-access datasets categorize pixels to describe both inter-and intra-annual changes in surface water integral to understanding our changing world and the associated effects on surface water. However, both products are inadequate for detecting bodies of water less than 30 m wide that are characteristic of semi-arid systems, and do not measure mesic vegetation indicative water availability. Another Landsat surface water product, the Dynamic Surface Water Extent (DSWE) (Jones, 2019(Jones, , 2015, developed by the United States Geological Survey, relies on spectral mixture analysis (SMA), a technique that describes each pixel as proportions of possible land cover classes (water, vegetation, soil). The DSWE then produces maps where pixels are categorized as open water (high and medium confidence) and partial surface water (conservative and aggressive estimates) (Jones, 2019). These 30 m data enable a better understanding of surface water dynamics that can inform management of water resources over longer time periods (1984-present). While these products are well suited to describe synoptic patterns of surface water extents and inundation in wetlands over lengthy time periods, they do not differentiate mesic vegetation of high ecological importance in semiarid systems and other types of vegetation indicative of a lower functioning system.
More than one third of the earth's land surface is considered to be arid or semi-arid (Wickens, 1998). These biomes differ from other climatic zones in that mesic vegetation such as wet meadows, riparian zones, and wetlands are indicative of water availability. Mesic vegetation is often referred to as "green groceries" by land managers, utilized as a food source by wildlife and livestock, and only available when water is present. Furthermore, mesic vegetation is more predominant on the landscape and more visible with satellite data. Including mesic vegetation in a water monitoring product is critical to fully understanding the dynamics of water changes on the landscape. However, surface water and mesic vegetation are typically not both included in water resource inventories. Several products focus on mapping mesic vegetation integral to semi-arid land functions, but typically omit surface water and do not describe intra-annual variation. The National Wetlands Inventory (NWI) represents a leading water resources inventory in the United States delineated from aerial images in varying seasons and years, and thus, does not capture intra-or inter-annual dynamics. The Intermountain West Joint Venture (IWJV) has made substantial advances in mesic vegetation mapping and focuses on water resources in the American West as they pertain to greater sage grouse and migratory waterfowl in the semi-arid Mountain West and deliver maps through non-profit agencies like the Sage Grouse Initiative (SGI) (Donnelly et al., 2020;Donnelly et al., 2019, Donnelly et al., 2016. This group analyzes available images in the Landsat time series, but focuses on inter-annual variability in terms of habitats for species of concern. The most recent IWJV studies use SMA and thus describe proportions of water resources per pixel, but the previous products are delivered as 30 m classifications. The IWJV maps, while increasingly more robust with regards to water resources for migratory fowl, are not designed to describe intra-annual water resource dynamics necessary for monitoring water resources throughout the water year. The Sentinel constellation presents a way to improve water monitoring in semi-arid systems with freely available data. Optical data collected from sensors aboard the Sentinel-2 satellites have been shown to effectively map water resources at higher resolutions (Du et al., 2016;Kaplan and Avdan, 2017;Pena-Regueiro et al., 2020). The Dynamic World (Brown et al., 2022) is an open-access tool that provides a land cover classification based on Sentinel optical imagery for a userspecified period of interest. The workflow uses all available Sentinel-2 images within a given time-period to map nine basic land cover classes, but lack the thematic resolution necessary for effective characterization of mesic resources for semi-arid environments. For example, the Dynamic World water-specific classes are a) Water, and b) Flooded vegetation, which do not include mesic vegetation that is unflooded (e.g. riparian zones or mesic meadows). Concurrently, a synthetic aperture radar (SAR) aboard the Sentinel-1 constellation collects data with an active sensor unaffected by varying light and cloud conditions and has shown to be an effective alternative for mapping water resources, particularly in cloudy environments (Huang et al., 2018;Mayer et al., 2021;Soman and Indu, 2022). Researchers are increasingly using active and passive datasets in tandem, and have found this approach useful for detection of not only surface water, but wetlands as well (Behnamian et al., 2017;Markert et al., 2018;Merchant et al., 2019). However, Sentinel data fusion techniques for mapping water resources have yet to be implemented in semi-arid regions with highly varying topography.
With the advance of cloud computing and new satellite technologies, there is an unprecedented opportunity to improve water mapping in semi-arid systems. Until the recent introduction of Google Earth Engine (GEE) and other cloud computing platforms, steps for preprocessing, classification, and rigorous accuracy assessment over large swaths of space and time necessary for fully understanding literal ebbs and flows was logistically unachievable (Woodcock et al., 2020). Large time-series datasets like Landsat and Sentinel are now fully and freely available online. Access to these archives through GEE makes analysis of these large datasets more feasible than ever (Gorelick et al., 2017;Pekel et al., 2016;Pickens et al., 2020). Newer open access data sources with higher spatial and temporal resolutions such as the Sentinel time series allow for greater detail in time series analyses (Rapinel et al., 2019), but no workflows or products have been developed specifically for monitoring water resources in aggregate using these data.
The overarching goal of this study was to implement a workflow for automated mapping of water resources specifically for semi-arid systems for the duration of the complete Sentinel-2A and − 2B time series (2017present). We developed our workflow in a semi-arid region of the western U.S. with a wide range of topographic, vegetation, and land use conditions with three main objectives: 1) Map both water and mesic vegetation in a single product and quantify intra-annual variability of surface water and mesic vegetation in a water limited setting. 2) Quantify the value of SAR data when fused with optical and topographic variables for estimating abundance of water resources. 3) Demonstrate the value of increased spatial and temporal resolution through direct comparisons with the publicly available estimates of surface water and mesic vegetation produced by Landsat products (Donnelly et al., 2016;Pickens et al., 2020).

Study area
Much of the western United States is characterized by mountainous semi-arid systems that rely on snowpack for water delivery in the spring and summer months. In these semi-arid mountain systems, uncertainty surrounding surface water, riparian zones, wet meadows and wetlands (hereafter water resources) is growing in response to intensifying land uses, population growth and climate-induced environmental changes (Abatzoglou et al., 2017). Within the western U.S., the High Divide is an ideal study area for investigation of water resource variability due to varying land uses, complex topography, and ubiquity of need for adaptations to increase resilience to climatic uncertainty (Fig. 1). The 30 year normal precipitation and snowmelt received in the High Divide is ~ 543 mm, reaching its peak in June at ~ 69 mm and its lowest in August just below 30 mm (Supplemental Information (SI), Appendix 1 (A1), Figure S1), with most precipitation falling as snow in winter months and released as the days become warmer. Minimum (maximum) normal temperatures range from − 11 (-1) in December to 8 (26) degrees Celsius in July (Daly and Bryant, 2013) Figures S2, S3). This region of eastern Idaho and western Montana depends heavily on surface water and mesic vegetation for sustaining human and wildlife populations. Coexistence with the wildlife that migrate through the High Divide between relatively intact ecosystems (Greater Yellowstone, Crown of the Continent, Salmon-Selway) is paramount for conservation and ecotourism. Greater understanding of the extent and dynamics of surface water and mesic vegetation will enable better conservation implementation and insights into the implications of a changing climate on water availability in this multifaceted mosaic of land cover and land use.
Changes in land use in the High Divide coupled with population growth results in exceedingly high water demands for direct consumption and agriculture . Exurban development is more likely to occur at the expense of farmland than any other land use (Ahmed and Jackson-Smith, 2019) leading to agricultural intensification. Intensifying agricultural practices often means increasing wateruse efficiency, which can result in decreased returns of water to the hydrological system through groundwater and surface water exchange (Van Kirk et al., 2019). Beyond agricultural impacts, wetland areas are drained and channels straightened to accommodate exurban development, altering local hydrology (Johnston, 1994). These losses collectively increase the efficiency of water movement through the system and reduce areas of high ecological importance to salmonids, sage grouse, and other species at the center of conservation discourse (Davee et al., 2019;Donnelly et al., 2016;Gibson and Olden, 2014).
Management efforts recognize the trade-off associated with increased hydraulic efficiency and aim to restore incised channels and concomitant ecosystem functions (Pollock et al., 2014). These efforts support a burgeoning ecotourism industry that relies on healthy ecosystems for activities such as angling, hunting, and wildlife tourism. Conservation efforts aimed to protect iconic salmonid species that rely on deepwater refugia (Bouwes et al., 2016), greater sage grouse, an umbrella species that relies on areas of mesic vegetation for nesting and rearing young (Donnelly et al., 2016), and amphibians that require wetland habitats (Arkle and Pilliod, 2015) all focus on slowing water movement through the system, increasing soil water storage, and maximizing water resource availability during summer months when water is most scarce.

Optical satellite imagery
To include as many observations as possible, we used top of atmosphere level 1-C Sentinel-2 data, as water and land are separable in these data without risking the inclusion of possible bias or loss of information from conversion to surface reflectance or normalization (Pekel et al., 2016;Pickens et al., 2020). These images include four bands (blue, green, red, near infrared (NIR)) delivered at 10 m resolution, and six bands delivered at 20 m resolution (three red edge bands, a narrow NIR  Table S5) false color composites (NIR, red, green) of archetype sites (reservoir (C), spring-fed (D), riparian (E), small-stream (F)). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) band, and two short-wave infrared bands (SWIR)) (Table S1). We also calculated common normalized difference spectral indices useful for detection of productive vegetation and surface water including the Normalized Difference Vegetation Index (NDVI) (for all red edge and NIR bands) (Tucker, 1979) for detection of photosynthetically active mesic vegetation, Normalized Difference Water Index (NDWI) (McFeeters, 1996) and Modified Normalized Difference Water Index (MNDWI) (Xu, 2006) for surface water. We restricted our analysis to images that contained less than 50 % clouds, screened by using image metadata. We applied the s2cloudless dataset to mask any clouds that were present in remaining images (Zupanc, 2020). We also masked areas within our study region that contain lava flows, as these dark regions are often a source of confusion for some water classifiers (Pekel et al., 2016) (Fig. 2).

Synthetic aperture radar satellite imagery
We used the ground range detected (GRD) C-band synthetic aperture radar (SAR) data collected by the Sentinel-1 platform. GRD data are radiometrically and topographically corrected using the Sentinel-1 Toolbox (European Space Agency, 2020) and delivered as an analysis ready product. We used GRD images derived from VV and VH polarizations where available. VV and VH polarizations are useful for measuring properties of soil moisture (Kornelsen and Coulibaly, 2013) and vegetation density (Patel et al., 2006) respectively. HH and HV polarizations are also collected by Sentinel-1, but mainly in polar regions for sea-ice detection and are not applicable for this work. We used images exclusively from ascending orbits of the Sentinel-1 satellite over the High Divide. There were too many lengthy data gaps in the descending dataset as parts of our study area are only partially covered with descending scenes during eclipse season (May -September) as not foreseen by the present order geometry (Copernicus, personal communication) (A1, Figure S4). Missing acquisitions in the ascending paths during the eclipse season prior to 2019 also left considerable gaps in the time series and thus, we could not produce a continuous time series classification prior to 2019 for all areas in our study area.
Researchers often choose to smooth SAR data with a spatial filter due N.E. Kolarik et al. Ecological Indicators 147 (2023) 109965 to inherent noise associated with SAR backscatter (Behnamian et al., 2017;Hird et al., 2017;Mahdianpari et al., 2019). However, as we were interested in maintaining the spatial integrity of these data, we employed a space for time substitution. This substitution implies that instead of a spatial filter, we used mean values of available images during each month to smooth the data with the goal of filtering out noise without sacrificing spatial resolution. Spatial filtering, while often employed, is acknowledged to underestimate extents of water bodies near edges (Behnamian et al., 2017) and could result in total omission of the small tributaries targeted in this study.

Topographic variables
Topography is a first order determinant of hydrological processes and we incorporate these data to account for temporally invariant processes that contribute to water allocation. For example, slope dictates direction of flow and duration of water presence and aspect is a determinant of soil moisture, and thus indicative of areas with higher potential for surface water and mesic vegetation (Western et al., 1999). We produced values for aspect and slope through built-in GEE functions using the 1 / 3 arc second USGS National Elevation Dataset (NED). We also calculated a topographic wetness index (TWI) (equation 1) where α is the upslope contributing area and tanβ is the local grade known to be a good predictor of wetlands and mesic vegetation (Hird et al., 2017). We used the flow accumulation tool with a flow direction raster in ArcMap 10.4.1 to calculate α (similar tools exist in open-source GIS platforms such as QGIS) and produced a TWI map via the GEE cloud computing platform. TWI = ln(α/tanβ) (1). We also included a wetland probability layer from (Hansen et al., 2022). In that study they created a global wetland map derived from Landsat metrics as well as hydrologic metrics following the methods of Bwangoy et al., (2010) and Margono et al., (2014). The reflectance metrics were statistical subsets of Landsat observations. The hydrologic metrics were derived from the elevation collected by SRTM and of greatest importance were layers representing the relative elevation of each pixel within various catchment sizes. This layer represents the modeled probability of wetland formations at 30 m.

Training data
We used five distinct classes to distinguish water resources from other areas: water, shadow, snow/ice, upland, and mesic vegetation. We manually delineated polygons for each class to encompass all types of each class in 29 randomly selected images. For example, the surface water class included rivers, lakes, retention ponds, reservoirs, etc. as they were available in the training images. Similarly, the mesic vegetation class included sampled areas of wooded riparian zones, irrigated agriculture, seasonal wetlands, and wet meadows. We iteratively added training examples of initially problematic areas in the images (i.e. shallow water, small channels) until these areas were adequately represented. Our final training area set consisted of more than 1.3 million pixels in the sample pool. We extracted point values from the image stacks using 29 Sentinel-2 images randomly selected from April-October of 2017-2020 (A1 , Table S2), the associated monthly means of the SAR backscatter, and the topographic variables. We used these months to extend beyond the typical growing season with the intent of reducing the classifier's sensitivity to anomalous conditions.

Validation data
We used an equal allocation stratified random sampling design to validate our classifier for each of the growing season months (June -September) of 2021, a year that we did not include in the training sample pool. Our rationale for equal allocation stratification was to increase sampling rates for classes of interest that are relatively rare in the study region (e.g. surface water, mesic vegetation) to increase precision in the accuracy assessment compared to a simple random sampling design (Stehman and Foody, 2019). This approach also ensures each pixel with data during our period of interest has a non-zero selection probability, and should provide a reasonable dataset to serve as reference data for accuracy assessment (Olofsson et al., 2014). We defined the strata in a hierarchical approach using the maps we produced with all variables included. Each stratum corresponds to a map class and was defined as all pixels with presence of that class in any of the four monthly maps by first identifying surface water pixels, and then mesic pixels within the remaining set, then shadow and snow accordingly. All remaining pixels with valid data were assigned to the upland land stratum. We then randomly selected 100 pixels per stratum, resulting in a total sample size of 500 (Table 1). We used the same 500 pixels for each month, randomly distributed throughout the entire study area, and recorded the land cover class observed in each, resulting in 2,000 total validation observations (500/month). As suggested in Stehman and Foody (2019), the number of reference pixels we chose resulted in sufficiently narrow confidence intervals for overall and user's accuracies to be useful for mapping and monitoring surface water and mesic vegetation extents. For each of the 500 pixels, the lead author used expert knowledge and familiarity with the study area to create a monthly reference label of surface water, mesic vegetation, shadow, snow, or other land via visual inspection of a monthly composite assembled from Sentinel-2 images (examples shown in A1 Fig. S5, S6). In our study area, these land covers have high separability in Sentinel imagery, in particular in the NIR, red, and green bands, and thus are highly distinguishable using a false color composite and in-scene context. In rare instances where a pixel could not be clearly identified in the monthly composite, we chose to label the nearest identifiable pixel.

Classification
We accessed all data (Sentinel optical and SAR time series, and NED) and produced a Random Forest (RF) classifier in GEE. RF is a nonparametric machine learning approach routinely used for land cover classification, and well suited for multi-class problems where other similar classifiers (i.e. support vector machine) are intrinsically-two-class (Breiman, 2001;Nguyen et al., 2020;Pal, 2005). Further, RFs have been successfully implemented for wetland classification, making RF an obvious choice (Berhane et al., 2018;Hird et al., 2017). From the training samples collected across the 29 images throughout the time series, we randomly sampled 40,000 pixels to represent each class and train the RF classifier. The rationale behind this equally allocated sampling design is to boost sampling rates of underrepresented classes (surface water and mesic vegetation) to reduce omission bias while capturing samples randomly over the spatial and temporal domains of our dataset (Jin et al., 2014). We then classified all image stacks and produced monthly maps using the mode pixel value for each month.
We tuned the number of variables per split parameter for the RF using the tuneRF function in the randomForest package in R v3.6.3 (Liaw and Wiener, 2002). Regardless of the number of variables for the model we tuned, the optimal value was consistently equal to the square root of the number of features in the model, consistent with studies using RF for remote sensing applications (Belgiu and Drȃguţ, 2016), and the default parameter in GEE (A1, Figure S7). We also tested for the optimal number of trees in the RF (50, 200, 500), which did not reveal meaningful differences (A1 , Table S3A and B). The final RF we implemented used 50 trees as there were only small improvements in classification results with a higher number of trees and computational times were considerably lower with a smaller forest. We isolated classes of interest (surface water, mesic vegetation) in RF models to determine important variables for separability of each class. Using this approach, we distinguish which variables (A1 , Table S1) are most useful for each respective class through variable importance plots and describe the relative importance of each data source as the mean decrease in Gini importance (Han et al., 2016).

Accuracy assessment
To assess the accuracy of the SF classifier, we evaluated the monthly maps of the entire study region for June through September of 2021 with a probability-based sample of reference data (see section 3.1.5). The sample locations were selected using a stratified random sampling design as described in section 3.1.5 to reduce the uncertainty in classes of interest. We calculated overall accuracy (OA), producer's accuracy (PA), and user's accuracy (UA) as well as estimated areas using equations 1-8 and equation 10 as found in Olofsson et al. (2014). We used area adjusted producer's accuracies (PA) to estimate the omission bias of the classifiers for a given class (sensitivity), and user's accuracies (UA) to estimate commission biases (specificity), as well as associate variance estimators to calculate 95 % confidence intervals.

Comparisons with surface water and mesic vegetation datasets
To provide an objective comparison of the accuracy of our SF product to the accuracy of other leading datasets, we conducted an additional accuracy assessment with the following steps. First, we chose four archetype sites to represent common water resource types in our study area. Second, we manually delineated areas of water (mesic) and land (other) of PlanetScope imagery to represent a high-resolution independent dataset for comparison with other products. Third, we chose three leading datasets meant to describe spatial extents and locations of water resources. Finally, we conducted formal accuracy assessments between all the products and PlanetScope classifications.

Archetype sites
We chose four sites to serve as representatives of typical water resources in the western U.S. to conduct comparisons with existing water mapping products; three for surface water monitoring, and a fourth to focus on mesic vegetation extents in a small-stream system. First, reservoirs are commonly built to store water from early season snowmelt to be delivered throughout the growing season. We used Mackay reservoir in central Idaho as our reservoir site (Fig. 1C). Second, we focus on an anomalous spring-fed area named Thousand Springs Creek near Dickey, Idaho (i.e. spring-fed site; Fig. 1D). Small water sources such as Thousand Springs Creek are integral to ecological functioning. For example, as wildlife migrate throughout the region they actively seek these water resources. Third, we investigated a portion of the Big Lost River drainage (i.e. riparian site; Fig. 1E) upstream of the Mackay reservoir. Relatively small riparian systems such as the Big Lost are typical in drainages throughout the High Divide and the Mountain West and represent important refugia for wildlife, forage for livestock, and water sources more generally.
Fourth, for mesic vegetation monitoring, we chose to focus on a site where the stream width was typically smaller than the width of a Sentinel pixel (10-m) but the riparian zone with mesic vegetation can be used as a proxy for water presence and abundance. We chose to use the confluence of Hailey, Sheep, and Baugh Creeks east of Ketchum, ID (i.e. small-stream site; Fig. 1F), a site where some low-tech stream restoration projects are ongoing, typical of those occurring elsewhere in the High Divide in need of consistent monitoring efforts (Silverman et al., 2019).

PlanetScope reference data
We used monthly maps described in section 3.2 to produce binary water/land (or mesic/other) maps and make quantitative discrete class comparisons between products. For reference datasets, we trained RF classifiers using cloud-free, single date PlanetScope images at our archetype sites to quantitatively compare our Sentinel Fusion (SF) outputs with the leading products, with the rationale that water resources are relatively stable at the monthly time step (Pickens et al., 2020). We iteratively added training polygons until we considered our reference maps to be as high quality as if we had digitized these relatively small areas by hand. We then resampled these reference maps to the Sentinel grid, and used a 50 % threshold for assigning pixels to water and land classes (or mesic and other) prior to comparison. At archetypical surface water sites (reservoir, spring-fed, riparian sites) we conducted our comparisons in July 2019 (wet year) and July 2021 (dry year). For mesic vegetation (small-stream site), we conducted comparisons using representative PlanetScope images for each of the last three months of the water year (July, August, September) in 2019 and 2021 (Tables S5 and S6) to highlight the dynamic nature of mesic vegetation extents and the intra-annual information gained using the SF product.

Comparison with leading datasets
We compared our SF product with three leading publicly available water monitoring products (Table 2). Though these products are not derived from the same source imagery, our goal is to compare our results with leading products used for monitoring and assessment of water resources and their management.
Surface water: We compared the SF outputs in July 2019 and July 2021 to those of the Global Land Analysis and Discovery (GLAD) laboratory Global Surface Water Dynamics product (Pickens et al., 2020) at the reservoir, spring-fed, and riparian sites. These months able us to assess accuracy both within the training period (2019) and beyond (2021), as well as wet (2019) vs dry (2021) years. With the goal of mapping surface water dynamics for the entire globe from 2000 to present, the GLAD product represents a leading surface water product for capturing temporal changes. We used a threshold of 50 % to create binary monthly maps directly comparable to SF monthly maps.
Mesic vegetation: We conducted the mesic vegetation map comparisons at the small-stream site in the last three months of the 2019 and 2021 water years (July, August, and September). We compared the SF mesic vegetation extents to the National Wetlands Inventory (NWI). Though highly spatially detailed (minimum mapping unit of 0.2 ha (Subcommittee, FGDC Wetlands, 2009)), the NWI contains no time series information or any indicator of dynamism. We also used the IWJV Sage Grouse Initiative (SGI) mesic resources dataset for comparison of mesic vegetation extents. The SGI polygons are aggregated from Landsat images (30 m pixels) and are described each year from 1984 to 2017 using the annual maximum NDVI value (Donnelly et al., 2016) (accessed  here: https://map.sagegrouseinitiative.com/ecosystem/mesic-res ources). While the SGI dataset describes inter-annual variation in mesic vegetation quality for pre-determined areas, it does not capture intra-annual variability, or any resources beyond the static polygons initially determined as mesic resources.

Results
We applied our classifier to 310 image stacks in June, 305 in July, 272 in August, and 293 in September to produce monthly maps in our study area in 2021. The most important variable for the surface water class was the wetland probability layer, followed by the VV polarization from ascending C-band Sentinel-1 SAR data, and slope (Fig. 3A). Slope was the most important variable for distinguishing mesic vegetation from all other land covers, followed closely by the wetland probability layer (Fig. 3B).
Our area estimations for each class demonstrate the rarity of water resources in our study area (Table 3), with surface water ranging from the lowest areal coverage of 1,991 km 2 (±48 km 2 ) at the end of the water year in September (0.70 % of the study area) and the peak of 3,184 km 2 (±1,588 km 2 ) in July (1 % of the study area) (Table 3). Mesic vegetation steadily decreased throughout the growing season, with the peak of 57,558 km 2 (±5,736 km 2 ) estimated in June (20 % of the study area) reduced to 22,593 km 2 (±3,562 km 2 ) in September (8 % of study area). These estimates suggest that all other cover types are estimated as approximately 3.7 times as abundant as water resources in aggregate (surface water and mesic classes) in June, 4.8 times as abundant in July, 7.5 times as abundant in August, and 10.6 times so in September, indicating a continued decline of water resources from late spring to early fall.
We also directly compared our estimates to the GLAD surface water dynamics product in 2021, using a 50 % threshold to make binary maps from the GLAD monthly product. We then used a pixel counting approach to produce the GLAD estimate. Table 4 shows that using the SF approach results in higher estimates of surface water in each month.

Table 2
Sentinel Fusion (bold) and comparison product details for Sentinel Fusion (SF), Global Land Analysis and Discovery dynamic surface water (GLAD), National Wetlands Inventory (NWI) and Sage Grouse Initiative mesic resources maps (SGI).

Table 3
Area estimations and area adjusted accuracy metrics with 95% confidence intervals for the full Sentinel Fusion (SF) classifier, and the classifier with the SAR variables omitted. We omit the snow and shadow classes here due to low sample sizes in late and early months respectively that lead to unreliable area estimates and accuracy metrics.

Quantify the value of SAR data when fused with optical and topographic variables for estimating abundance of water resources
When including SAR with the optical and topographic covariates, we found negligible commission differences for all classes, in particular the surface water and mesic vegetation classes of interest, but greater omission of surface water. The classifier that omits SAR data has high UAs for surface water ranging from 99 % (±2 %) to 100 % (±0 %), which are all within 0.01 % of the respective UAs from the maps including SAR data ( Table 3). The mesic vegetation class monthly UAs are lower, but still high, ranging from 84 % (±8 %) to 92 % (±9 %) and are all within 3.6 % of those with SAR included. For the water class, excluding SAR results in a PAs ranging from 58 % (±34 %) to 80 % (±32 %), with monthly decreases of 17 %, 1 %, 12 %, and 20 %, respectively. These differences were often only due to the omission of one pixel (A2, Tables S1-S8), but amplified due to the relatively low map proportions of the water class.

Impacts of increased spatial and temporal resolution
In this section, we report the results of the product comparisons for four archetype sites with leading products. We have included detailed results for 2021 (dry year) in the main text as this serves as our test period, while results for 2019 (wet year, during the training period) can be found in the supplement.

Surface water
The SF product had higher PA than the GLAD product at the four archetypal sites for July 2021 ( Fig. 4; A2 Tables S9-S12). The riparian site, in particular, demonstrates how the higher spatial resolution of the SF product is better suited for monitoring surface water in water limited environments where small water bodies are critical to landscape function. At the riparian site (Big Lost River) (Fig. 5E, F), the SF product omitted 72 % of water pixels (28 % PA; Fig. 4; A2 Table S11), but the GLAD product failed to detect any water pixels (0 % PA, Fig. 4; A2 Table S11). The SF product detected water in stream at a much higher rate, despite the small width of the channel at the riparian site ( Fig. 5E and F). In Fig. 5A, the difference maps reveal relatively similar omission and commission rates the reservoir site where the PA of the surface water class is 88 % compared to 97 % and UA of 97 % and 91 % for GLAD and SF, respectively ( Fig. 4; A2 Table S9). Omission rates at the spring-fed site are slightly lower in the SF output, with 66 % PA compared to 59 % using the GLAD product. GLAD does overestimate water at this site at a higher rate, with a UA of 68 % versus 96 % using the SF workflow (Figs. 4, 5C, D; A2 Table S11). We see similar patterns in 2019 (wet year; A2, Figure S1, Tables S13-S16), with the caveat that we found higher rates of commission and omission using the GLAD dataset, resulting in lower accuracy metrics when compared to the SF maps.

Mesic vegetation
At the small stream site in 2021, we observed highest overall agreement with the reference maps for all three months (July, August, September) using the SF product (98 %, 97 %, 97 % OAs; Fig. 6; A2 Table S17), striking a balance between omission and commission errors. The NWI had the lowest PAs among all products for July and August at the small-stream site (41 %, 47 % PAs; Fig. 6; A2 Table S19), with SF showing a slightly lower PA (51 %) than the NWI (55 %) in September. Conversely, the SGI polygons consistently overestimated the mesic vegetation extents (29 %, 24 %, 17 % UAs; Fig. 6; A2 Table S18). Fig. 7 helps to highlight the spatial locations of underestimations (omission errors) of mesic vegetation using the NWI, and the overestimations (commission errors) using the SGI dataset.
We highlight the way we can use the SF workflow to monitor mesic vegetation dynamics throughout the growing season in Fig. 7. While static datasets like the NWI and SGI remain constant, the SF maps show Table 4 Monthly surface water area estimates (km 2 ) for SF and GLAD in 2021.

Fig. 4.
Overall accuracy (OA), producer's accuracy (PA), and user's accuracy (UA) of the SF surface water and GLAD datasets at each of the three surface water archetype sites and the aggregate accuracy metrics for July 2021. Tabular results can be viewed in Tables A2 S9-S12.

Fig. 5.
Three examples (first, second, and third archetype sites) of differences between the Sentinel Fusion (SF) surface water and GLAD monthly products in July 2021. We subtracted GLAD monthly estimations from SF monthly maps, so yellow tones indicate higher rates of detection of surface water by GLAD and pink tones SF. Panels A and B are the reservoir site (Mackay Reservoir), panels C and D the spring-fed site (Thousand Springs Creek), and panels E and F the riparian site (Big Lost River). For reference, we included PlanetScope images for each site (A1 , Table S5) beneath the difference layers. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) varying mesic vegetation extents with the spring pulse early in the season, stabilization with baseflow in summer months, and subsequent senescence at the end of the growing season as indicated by the false color PlanetScope composites in Fig. 7. Our quantitative comparisons show the consistency of SF to communicate accurate intra-seasonal dynamics compared to the other products ( Fig. 6; A2 Tables S17-S19).
Where the NWI omits a lot of pixels from the mesic class early in the summer and gets increasingly more accurate as the dry season progresses (41 %, 47 %, 55 % PAs; Fig. 6; A2 Table S19), SF shows a higher omission rate as the season progresses, but PA remains higher than the NWI in July and August, but slightly lower in September (84 %, 59 %, 51 %; Fig. 6; A2 Table S17). On the other hand, SF maps consistently show highest UAs throughout the end of the water year (93 %, 96 %, 95 %; Fig. 6; A2 Table S17), where SGI maps continue to commit more pixels to the mesic class as the summer gets drier (29 %, 24 %, 17 % UAs; Fig. 6; A2 Table S18). We see a similar pattern in 2019, where SF omits fewer pixels than the NWI, and commits fewer than SGI in all three months (A2 Fig. S2; A2 Tables S20-S22). These comparisons show two important advances of the SF product over existing mesic vegetation monitoring products. First, existing mesic vegetation monitoring maps are static and thus do not capture intraannual changes, but in semi-arid environments rapid seasonal changes are characteristic, and a critical part of monitoring. Second, the NWI generally substantially underestimates mesic vegetation, whereas the SGI product substantially overestimates mesic vegetation (Fig. 7). The SF product represents a suitable balance between commission and omission errors. Further, products that measure only surface water would totally omit this resource area and would be inadequate for mapping critical sustaining resources in this system.

Intra-annual variability of surface water and mesic vegetation extents
The workflow we document in this study using open access data in GEE is robust enough to effectively estimate water resource abundance in aggregate throughout the growing season in a semi-arid system. Our area estimations indicate that water resources are relatively scarce across the landscape, which aligns with our knowledge of these semiarid systems. Further, our area estimations validate the importance of mesic vegetation and suggest that it should be considered in the monitoring of water availability in semi-arid mountain systems, as in aggregate these slowly wane as the water year progresses (Petersen et al., 2012;Tulbure et al., 2016). The exclusion of mesic vegetation and monitoring of only surface water using data at the 10 m scale, results in a different pattern of water availability, where June appears to have less water than in July. We suspect that in June, water is allocated not in main channels low in the valleys, but rather in smaller channels and in the soil, remaining undetected at the 10 m scale. Using mesic vegetation abundance and health as a proxy for water availability seems to be a reasonable approach to monitoring this dynamic. Furthermore, our findings show that mesic vegetation extents exhibit temporal dynamism and are not bound to specific spatial regions on an intra-annual basis as is represented by the products to which we compared the SF results, which are limited by either a single date in time (NWI) or a single annual value (SGI) and do not capture these intra-annual dynamics that dictate habitat suitability and forage availability. As efforts throughout the West focus on restoring incised channels and re-wetting valley bottoms (Bouwes et al., 2016;Silverman et al., 2019), both spatial and temporal dynamics of these resources should be monitored within the growing season when demands for water resources are at their highest.
While we initially intended to develop a product to be useful at the reach scale for monitoring surface water and mesic vegetation in single valley bottoms in restoration settings, we found that our approach helps in synoptic assessments as well. The currently available 30 m global surface water product we explored returns much lower estimates of surface water due to omission errors of even medium-sized rivers such as the Big Lost River. These are resource areas that sustain semi-arid systems and their omission can alter landscape level assessments.

Quantify the value of SAR data when fused with optical and topographic variables for estimating abundance of water resources
We observe consistent, although slight improvements in mapping surface water when including SAR data in our list of covariates in the RF model in June, August, and September. However, in all instances, these improvements are due to a single validation pixel committed to another class using the classifier without SAR variables. The relative scarcity of the water class exacerbates these differences with the area adjustment formulae and we are reluctant to make strong claims regarding the importance of using SAR based on our results, but these data can be incorporated rather easily and show no signs of decreasing the quality of the model. We recommend including these for future work, particularly given that SAR data are particularly well suited for bolstering optical times series fraught with cloudy scenes (Markert et al., 2018;Pham-Duc et al., 2017), obfuscated by wildfire smoke (Ban et al., 2020), and during periods of inundation (Canisius et al., 2019). While we found the VV band from the Sentinel-1 SAR data to be the second strongest predictor of surface water based on MDG, we suspect that considerable amounts of correlation among the Sentinel-2 predictors leads to identifiability problems where importance is then spread among all correlated predictors (Strobl et al., 2008).
Generally, the SAR data do not show to contribute to improvements of estimates of mesic vegetation in our study area. This result could be  Table S5). From right to left, top to bottom are monthly modes of the SF product from June (top left) to September (bottom right). In each classified map, we show SF outputs for mesic vegetation (green), surface water (blue), upland areas (brown), and shadow (black). We also show the NWI polygons (pink outline) and SGI polygons (blue hatch) for comparison. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) attributed to the way that we defined this class, including multiple vegetation types with starkly different physical properties. For example, we expect backscattering mechanisms would vary greatly from riparian trees such as Populus sp. (cottonwoods) or Salix sp. (willows) to wet meadows to irrigated agriculture; all considered to be important mesic vegetation in the Mountain West (Donnelly et al., 2016;Krosby et al., 2018;Macfarlane et al., 2017). We also expect that SAR products delivered at a similar spatial resolution from sensors with longer wavelengths such as L-band data from the upcoming NISAR mission will be able to penetrate upper canopies and be better suited to detect differences in soil and canopy moisture, as well as quantify seasonal inundation extents left undetected with C-band data from Sentinel-1 (Bwangoy et al., 2010;Merchant et al., 2019).

Impacts of increased spatial and temporal resolution
We found that the higher spatial and temporal resolution of Sentinel imagery enable more accurate and detailed inventories of water resources compared to existing datasets derived from only Landsat imagery. In terms of spatial resolution, we show that the Sentinel data with nine times as many pixels are better suited than Landsat data for detecting smaller water bodies with widths smaller than 30 m that are characteristic of semi-arid waterscapes. This study highlights the scalar mismatch of Landsat data relative to surface water bodies in semi-arid systems, and shows they are often inadequate for detecting changes in the dynamics of water resources in our study area where even small changes in surface water availability can have large impacts on land functions (Hansen et al., 2002;Hillis et al., 2020;Huntsinger, 2016).
Another compelling finding was the importance of higher temporal resolution in estimating water resources. We suspect that more frequent image collection in the Sentinel-2 time series leads to improved monthly estimates, translating to higher temporal precision in monitoring efforts. The bulk of commission errors we observed in the SF product were found at the transition zone of the reservoir site in 2021, where these are likely mixed pixels of water and reservoir substrate, making even the interpretation of PlanetScope imagery difficult. Erroneously classified pixels as found in the GLAD maps for 2019 comparisons can be avoided using the Sentinel time series. Erroneous observations simply hold less weight when incorporating more scenes into a monthly product, leading to maps with fewer errors that are more applicable for monitoring efforts on intra-annual scales. With 2019 being an overall wetter year, likely causes of misclassifications of the Landsat product are turbid water or undetected haze (Arst et al., 2003;Pekel et al., 2016), and an overall absence of clear observations in the few opportunities afforded by the relatively coarse 8-16 day revisit time, resulting in both underestimates and overestimates of water resources at the monthly scale shown in our archetype site comparisons. These misclassifications indicate the importance of the higher number of observations in the SF product. In this instance, the SF approach benefits from the shorter revisit time of Sentinel 2a and 2b, where the GLAD product, using Landsat, may only have one observation per month deemed uncontaminated by cloud or haze. We expect this result to be increasingly important as contaminated observations due to frequent wildfire events could lead to low numbers of observations in the driest months of the year when monitoring water resources is most important (Abatzoglou and Williams, 2016;Ban et al., 2020;Burke et al., 2021). While we focus entirely on Sentinel data to deliver maps at 10 m spatial resolution, the freely available Harmonized Landsat and Sentinel-2 (HLS) (Claverie et al., 2018;Tulbure et al., 2022) dataset could also be considered for monitoring in contexts where 30 m spatial resolution is considered sufficient and the focus is on surface water monitoring in larger water bodies.

Global vs Regional perspectives
Global products, while useful for global scale analyses, are often too general to be applied in highly nuanced regions (Wyborn and Evans, 2021). We demonstrate that our regionally parameterized machine learning approach is better suited to capture surface water dynamics in a semi-arid system than are leading global products. With so much spectral variability among water resources (Pekel et al., 2016), heterogenous atmospheric loading, and other regional considerations such as varying topography (Feyisa et al., 2014) we find that a regional approach is important for monitoring water resource dynamics in water limited environments where even small errors can have profound implications for management.

Caveats and limitations
Our SF product shows improvement over Landsat products for capturing dynamics of surface water and mesic vegetation in our study area, but this workflow will need training and testing data specific to another study are if adopted by a manager outside of the High Divide. Further, those that need to monitor water resources at even finer spatial and temporal scales could explore other Earth observation (EO) data sources that are not currently open-access. Bhushan et al. (2021) show that high resolution terrain models can be extracted from PlanetScope images to accompany the four band spectral products. These images are costly to acquire, however, and require additional expertise and software beyond that of GEE. Alternatively, monitoring agencies could collect their own high-resolution data from unoccupied aerial systems (drones) at the expense of their own time and labor (Carbonneau et al., 2020). Processing and analyzing these ultra high-resolution alternatives require considerable computational resources and technical expertise that could potentially act as a barrier to effective monitoring, particularly over broad spatial and temporal scales. We urge managers considering adoption of EO monitoring datasets to clearly define the goals of their monitoring program before implementing any EO product into their decision-making process (Beier et al., 2017;Tulloch et al., 2015). Understanding the tradeoffs in decision making when selecting the necessary data and methods of analysis are important when trying to develop any monitoring protocol, and EO are no different. However, we encourage users to harness the power of these freely available data and consider incorporating them into their calculus to make sound environmental decisions.
We acknowledge that we are likely overestimating highly functioning mesic areas as a result of the predominance of irrigated agriculture exhibiting characteristics of and subsequently classified as mesic vegetation. We chose not to mask agriculture with the recognition that some irrigated areas and crops provide ecosystem functions and refugia for wildlife; these functions, however, vary among crop and resource types (Donnelly et al. 2016). We are also aware that the accuracy of the NED is variable across space due to the variable quality of the elevation models used to produce it (Gesch et al., 2014). As is the case, the accuracy of important topographical metrics may be lacking, but could improve along with the quality of open-access elevation data. We further acknowledge that the scope of this study is limited to utilizing only backscatter coefficients to describe surface properties with SAR data. Mayer et al. (2021) use five SAR indices in a deep learning approach to map surface water with SAR data only. When we tested these indices in our RF model, however, we observed only slight increases in accuracy metrics for water and mesic classes, with no meaningful qualitative improvements. Further, variable importance plots revealed that all five indices were ranked lowest among all variables included in the classifier for distinguishing surface water and mesic vegetation from other land covers. These results indicate that these indices increase complexity of the analysis without substantial benefit, so we opted to omit them altogether.

Conclusion
The workflow we present represents considerable improvement upon maps that are currently available for monitoring surface water and mesic vegetation dynamics in our semi-arid study area. We found that by aggregating surface water and mesic vegetation resources, we are better able to get a pulse on intra-annual dynamics of water resources in semiarid regions sustained by small surface water bodies. Using 10-m Sentinel data, we found that our monthly estimates were less likely to omit water resource pixels than leading Landsat products due to higher spatial resolution. The higher temporal resolution of Sentinel data can also reduce both omission and commission errors caused by bad observations and low observation counts using Landsat products at the monthly time step. We observed small, but consistent improvements when including C-band SAR data with optical data, and recommend consideration of this data fusion approach for mapping water resources in any environment. Using a regionally specific approach, the workflow we have outlined in this study uses freely available data on an openaccess platform and can be adopted by anyone seeking to develop water resources maps tailored to their study area.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
Data will be made available on request.