Assessment of Surface Fractional Water Impacts on SMAP Soil Moisture Retrieval

Fractional water (FW) correction of satellite microwave brightness temperature (Tb) observations is a prerequisite for accurate soil moisture (SM) mapping over mixed land and water areas. Here, we evaluated the FW impacts on NASA Soil Moisture Active Passive (SMAP) L-band (1.4 GHz) SM retrievals using two water masks including (a) the NASA Terra Moderate Resolution Imaging Spectroradiometer (MODIS) Land Water Mask version 6 (MOD44W) multi-year (2015–2019) water record and (b) the Ocean Discipline Processing System (ODPS) water mask previously used for SMAP global operational Tb and SM processing. The MOD44W and ODPS data were first compared with the European Commission's Joint Research Centre (JRC) Landsat-based water record. MOD44W showed major improvements in land/water classifications relative to the ODPS, with producer accuracy increasing from 50.02% to 95.02%, and user accuracy from 53.93% to 91.73% for water pixels. For assessing the FW impacts on SM retrievals, the same single channel V-polarization (SCA-V) algorithm was applied to SMAP Tb datasets corrected using ODPS and MOD44W water masks separately. MOD44W showed overall greater FW values (mean increase of 0.006) relative to the ODPS, leading to relatively drier SM retrievals (mean decrease: −0.012 m3/m3). Additional comparisons with globally distributed SM measurements confirmed consistently lower SM retrieval biases (mean decrease 0.04 m3/m3) and higher correlations (mean increase 0.06) of the MOD44W-based results relative to those based on the ODPS. Our results revealed non-negligible SM retrieval uncertainty introduced from the underlying ancillary FW data for areas with substantial water presence (e.g. FW>0.01).

(SMAP) mission can be degraded by the presence of surface water bodies within a sensor footprint [1]. The presence of even a small areal fraction of standing water can lead to a large difference in brightness temperature (Tb) observations relative to pure land conditions due to the large difference between land and water emissivities [2]. Accordingly, water correction for the Tb observations is a prerequisite for ensuring accurate retrievals of higher order land surface parameters, including soil moisture. For example, about ±0.04 m3/m3 uncertainty in L-band volumetric soil moisture (SM) retrievals can be caused by 0.02 surface fractional water cover (FW) within bare soil regions [3]. The accuracy of SMAP freeze/thaw retrievals was also found to decrease with increasing FW [4].
The characterization of land surface water bodies mainly relies on measurements from microwave and optical-infrared (IR) satellite remote sensing [5]. Global FW data updated daily/subdaily are generally derived from satellite microwave radiometer observations [6], [7]. The global FW products generated from the advanced microwave scanning radiometers (AMSR-E and AMSR2) have been used in flood/drought monitoring and satellite river gauging [7], [8]. In addition, a SMAP-based FW dataset was developed and showed high correspondence with river discharge measurements and alternative water maps derived from optical-IR remote sensing [9]. However, the dynamic FW datasets have not been used for SMAP Tb water corrections since it is still challenging to distinguish between standing water and wet/saturated soil, whose Tb signals are similar in magnitude [9]. In addition, the SMAP FW data available at 36-km resolution cannot be directly applied to the SMAP operational data process workflow, which requires high resolution (e.g., 1-km) water/land classifications for antenna-pattern based Tb processing. The AMSR-E/2 sensors have different sampling times, frequencies (e.g., 18, 23, and 89 GHz) and observation geometry relative to SMAP, which potentially introduces additional uncertainties for SMAP Tb correction.
Traditional satellite optical-IR sensors for land cover and open water mapping include the Sentinel-2 Multispectral Instrument, Landsat Thematic Mapper/Operational Land Imager, and Terra/Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) sensors, which enable accurate delineation of open water at spatial resolutions from 10 to 250 m under clear-sky conditions [10], [11], [12], [13], [14]. Recent developments from growing CubeSat constellations (e.g., Planet Dove/SuperDove/Skysat satellites) provide further enhancements in surface water monitoring and hydrological assessments at submeter to meter resolutions and with subdaily sampling frequency [15]. Despite major data loss due to persistent cloud cover and suboptimal solar illumination [16], water inundation records composited from clear-sky optical-IR observations enable effective monitoring of seasonal and interannual changes in global surface water cover [17].
Fig. 1. Locations of high-latitude (red rectangle) and Amazonia (black rectangle) study regions for comparing SMAP water mask datasets, and the spatial distribution of 734 stations (red dots) used for assessing the impacts of standing water on SMAP soil moisture retrievals.
For SMAP operational data production, static water masks are used to account for water body impacts on the Tb observations over adjacent land areas within a grid cell [18]; these masks do not account for seasonal FW dynamics but facilitate operational data processing. The original baseline global water mask used for operational processing of SMAP Level 1-3 Tb and SM records was obtained from the Ocean Discipline Processing System (ODPS), which was assembled from two vector maps delineating coastlines and inland water bodies [19]. For SMAP processing, the ODPS data were used for deriving binary land/water classifications over a 1-km resolution global Equal-Area Scalable Earth Grid, Version 2.0 (EASE-Grid 2.0) projection format [20]. The 1-km water/land classifications were then used for antenna pattern-based Tb correction of FW contamination over land-dominant grid cells. For the latest SMAP data release (R19), the SMAP Tb and SM products use an updated water mask derived from the NASA Terra MODIS Land Water Mask version 6 (MOD44W v6) multiyear (2015-2019) record [11]. Compared with the ODPS (circa 1997) and more recent observations from the commercial CubeSat constellations (e.g., limited availability of Planet SuperDove imagery circa 2021; [21]), the MOD44W v6 record overlaps with the SMAP operational period and provides consistent global annual coverage and favorable accuracy. This study assessed the differences between the ODPS and MOD44W v6 water masks and the associated impacts on SMAP SM retrievals.

A. Study Region
This study involves the following. 1) Assessments of the ODPS and MOD44W v6 water masks, and their impacts on SMAP SM retrievals over the global domain, excluding ocean and permanent snow/ice areas (latitude: -60° to 90°; longitude: -180° to 180°). 2) Detailed intercomparisons of the water masks over two selected regions in the high latitudes and Amazonia, respectively. 3) Evaluation of impacts of standing water on the SMAP retrievals over 734 ground SM stations (see Fig. 1). The northern hemisphere high-latitude region (latitude: 58.2° to 67.7°; longitude: -126.3° to -112.4°; Fig. 1) is characterized by an abundance of surface water bodies including the largest lakes (the Great Bear and the Great Slave lakes) in the Northwest Territories of Canada, and numerous small lakes. The region was selected to examine the difference between the ODPS and MOD44W v6 water masks in quantifying lake distributions. The Amazonia study region (latitude: -5.4° to 0.5°; longitude: -71.0° to -52.1°; Fig. 1) consists of a major portion of the Amazon River and its tributaries. The region was used to examine the capability of water masks in delineating river features with various channel widths and shapes.

B. Datasets
Three groups of datasets were used in this study, including 1) the global water masks; 2) SMAP Tb and SM products; and 3) ancillary data for assessing the water masks and their impacts on SM retrievals.
1) Water Masks: The ODPS water mask was derived using the World Vector Shoreline (WVS) database for distinguishing coastal boundaries and the World Data Bank (WDB) for delineating inland water bodies [19]. The WVS is a standard US Defense Mapping Agency product for describing shorelines, international boundaries, and country names of the world [22]. The ODPS water mask is a rasterized version of the WVS and WDB, and has a spatial resolution of approximately 0.9 km at the equator. The ODPS was originally prepared to provide land/water masks for processing Sea-viewing Wide Field-of-view Sensor (SeaWiFS) data, and represents global surface water conditions around the period (∼1997) when SeaWiFS was launched.
A global water map derived from the MOD44W v6 record is used as the ancillary water mask for operational processing of the latest (R19) SMAP Level 1-3 Tb and SM product release; replacing the older ODPS water mask used in earlier product versions. The MOD44W v6 dataset was derived using a decision tree classifier trained with MODIS data and incorporating a series of masks to address known issues caused by terrain shadows, burn scars, cloudiness, or ice cover in oceans [11]. Compared with the prior MOD44W v5 product, which has a documented 98% producer's accuracy and 79% user's accuracy [10], the next generation MOD44W v6 dataset provides additional improvements in representing smaller water bodies, which led to an overall increase in surface water areas and mapping accuracy [11]. The MOD44W v6 water maps used to update the SMAP R19 water mask spanned the period from 2000 to 2019 and are therefore more representative of the post-2015 SMAP operational period than the ODPS (circa 1997) water mask.
For assessing the accuracy of the ODPS and MOD44W v6 water masks, the European Commission's Joint Research Centre (JRC) surface water dataset was used as an independent benchmark. The JRC dataset contains 30-m resolution maps derived using Landsat observations [17]. Each pixel was individually classified into water/nonwater using an expert system and the results were collated into a monthly history for each month between March 1984 and January 2022. The JRC Monthly Water History v1.4 data set was accessed from Google Earth Engine (GEE) for this study [23], [24].
2) SMAP Tb and SM Products: Two sets of SMAP Tb products were used in this study, including the official SMAP Enhanced L1C Radiometer Half-Orbit 9 km EASE-Grid Brightness Temperatures, Version 3 (SPL1CTB_E) product processed using the ODPS water mask [25], and the SMAP Enhanced L1B Radiometer 9-km Tb version T18420 (SPL1BTB_E) product corrected using MOD44W v6 water mask. The SPL1BTB_E version T18420 data were processed by the SMAP team and archived in the NASA Jet Propulsion Laboratory Offline Algorithm Staging and Input System.
The SMAP L3 Radiometer Global Daily 9 km EASE-Grid Soil Moisture (SPL3SMP_E) product was also adopted for facilitating the assessment of water mask impacts. The SPL3SMP_E products include soil moisture estimates from the V-polarization single channel algorithm (SCA-V), H-polarization single channel algorithm (SCA-H) and dual channel algorithm (DCA), and ancillary inputs for the algorithms [26]. The assessment of water mask impacts was based on SCA-V algorithm in this study, which has been shown to outperform SCA-H and has similar performance with the DCA [27]. Accordingly, the ancillary inputs for SCA-V including scattering albedo, vegetation optical thickness, surface roughness and soil texture were extracted from the SPL3SMP_E product set. The temporal coverage of the SMAP Tb and SM products used for this investigation extends from January 1 to December 31, 2017. Considering overall similar accuracy of SMAP ascending and descending products [28], only the ascending or 6:00 PM data sets were used for this study.
3) Ancillary Datasets: The European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 Land (ERA5-Land) dataset provides a complete and consistent view of the evolution of land variables by combining model data with observations for the period from 1981 to three months from real-time [29]. The monthly averages of ERA5-Land 2-m air temperature data with a spatial resolution 11 132 m were used for excluding possible frozen surface conditions potentially affecting Landsat water/land classifications. The data set was generated by the Copernicus Climate Change Service [30] and accessed from GEE [31].
The international soil moisture network (ISMN) data record represents the most comprehensive database of in situ SM station measurements established and maintained by international cooperation [32], [33]. The ISMN database has been widely used in validating and improving global satellite products [34]. The SM measurements for the upper-layer soil (0 to ∼5 cm) from 734 ISMN ground stations (see Fig. 1) were used as an independent benchmark for assessing the surface water impacts on SMAP SM retrievals. For facilitating the correlation analysis between satellite and ground-based SM data, the selected stations were required to have at least 30 observations coinciding with SMAP ascending overpass during the one-year study period.

C. Data Processing
The 1-km binary water/land classifications are used as the ancillary water masks in SMAP operational data processing for performing water corrections over 3, 9, and 36-km EASE-GRID 2.0 grid cells [18]. Here only the 9-km grid cells were used for exemplifying the water mask impacts on SMAP SM products.
For evaluating different water masks within the context of the SMAP operational workflow, the MOD44W v6 and JRC Landsat water maps were aggregated over 1-km EASE-GRID cells similar to the ODPS dataset. A MOD44W v6 projection conversion was first performed to obtain the latitude and longitude of each 250-m pixel from the original MODIS sinusoidal projection coordinates; the geographical coordinates were then converted to the column and row numbers of the global 1-km EASE-GRID 2.0 projection. The water fraction of each 1-km grid cell was calculated as the ratio between the number of water pixels and the total number of pixels within the grid cell. A given 1-km grid cell was classified as water if it had FW > 0.5 for three or more years during the 2015-2019 period; otherwise, it was classified as land. Similar aggregation operations were performed for the JRC monthly water record using GEE to obtain the FW data of the 1-km grid cells representing the study period. To reduce uncertainties associated with snow or ice in the Landsat water maps, only the monthly water records with ERA5-Land monthly mean temperatures above 5°C were included in generating the Landsat 1-km water/land classification maps.
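The aggregation and multiyear classification rule described above can be sketched as follows. This is a minimal illustration, not the operational SMAP code: the function names, the pixel values, and the assumption of sixteen 250-m pixels per 1-km cell are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def water_fraction(pixels):
    """Fraction of water pixels (1 = water, 0 = land) inside one 1-km cell."""
    return sum(pixels) / len(pixels)


def classify_cell(annual_fw, fw_threshold=0.5, min_years=3):
    """Return 1 (water) if FW > threshold in at least min_years years, else 0 (land)."""
    wet_years = sum(1 for fw in annual_fw if fw > fw_threshold)
    return 1 if wet_years >= min_years else 0


# Hypothetical 1-km cell with sixteen 250-m pixels per year (illustrative values).
yearly_pixels = {
    2015: [1] * 10 + [0] * 6,   # FW = 0.625
    2016: [1] * 9 + [0] * 7,    # FW = 0.5625
    2017: [1] * 8 + [0] * 8,    # FW = 0.5, not counted because FW > 0.5 is required
    2018: [1] * 12 + [0] * 4,   # FW = 0.75
    2019: [1] * 4 + [0] * 12,   # FW = 0.25
}
annual_fw = [water_fraction(p) for p in yearly_pixels.values()]
cell_class = classify_cell(annual_fw)  # three qualifying years, so the cell is water
```

The strict inequality follows the FW > 0.5 wording of the text; a year with exactly half of its pixels flagged as water does not count toward the three-year requirement.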
The 1-km ODPS- or MOD44W-based water masks were used to calculate FW for 9-km grid cells [18]. For generating the enhanced resolution (9-km) SMAP Tb data, the Backus-Gilbert (BG) optimal interpolation technique was first used to obtain the 9-km antenna temperature (TA) data from the overlapping SMAP radiometer footprints [35]. The interpolated TA data were then processed into the 9-km Tb data after correction/calibration procedures [28]. The same BG interpolation technique was also applied to the generation of FW data for the global 9-km grid cells, which were used for correcting water contamination in the Tb data [35]. The water correction was performed for land-dominated 9-km grid cells with FW ≤ 0.5 following [18], [35]

$$T_{b,p}^{\mathrm{land}} = \frac{T_{b,p}^{u} - f\,T_{b,p}^{\mathrm{water}}}{1 - f}, \qquad f = \sum_{i=1}^{n} a_i f_i \tag{1}$$

where $T_{b,p}^{u}$ is the uncorrected Tb, subscript $p$ represents horizontal or vertical polarization, $f$ is the interpolated and antenna-gain-weighted FW, $a_i$ is the BG coefficient used in interpolating the TA data, $f_i$ represents the antenna-gain-weighted FW derived using 1-km water masks for the original SMAP measurements, $n$ is set to a constant value (6) optimizing between accuracy and latency in SMAP data processing, $T_{b,p}^{\mathrm{land}}$ is the corrected land Tb, and $T_{b,p}^{\mathrm{water}}$ is the estimated water Tb as detailed in [35]. Based on the same operational workflow, the FW and water-corrected Tb data for global 9-km grid cells were generated using the respective ODPS and MOD44W v6 1-km water masks, and stored with the SPL1CTB_E v3 and SPL1BTB_E vT18420 products, respectively.
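As a rough illustration of the water correction, the sketch below assumes the usual linear mixing of land and water contributions within a grid cell, which is consistent with the variable definitions given in the text. The function names and all numerical values (BG coefficients, footprint FW values, brightness temperatures) are hypothetical and are not taken from the SMAP processing system.

```python
def interpolated_fw(bg_coeffs, footprint_fw):
    """f = sum_i a_i * f_i over the n footprints used in the BG interpolation."""
    if len(bg_coeffs) != len(footprint_fw):
        raise ValueError("need one FW value per BG coefficient")
    return sum(a * f for a, f in zip(bg_coeffs, footprint_fw))


def correct_tb(tb_uncorrected, fw, tb_water):
    """Solve Tb_u = (1 - f) * Tb_land + f * Tb_water for Tb_land."""
    if not 0.0 <= fw <= 0.5:
        raise ValueError("correction applies only to land-dominated cells (FW <= 0.5)")
    return (tb_uncorrected - fw * tb_water) / (1.0 - fw)


# Illustrative inputs only: n = 6 footprints with assumed coefficients summing to 1.
a = [0.30, 0.25, 0.20, 0.10, 0.10, 0.05]
f_i = [0.02, 0.04, 0.00, 0.10, 0.06, 0.00]
f = interpolated_fw(a, f_i)

tb_u, tb_water = 250.0, 150.0          # kelvin, assumed values
tb_land = correct_tb(tb_u, f, tb_water)
```

Because water is radiometrically colder than land at L-band, removing the water contribution raises the corrected land Tb above the uncorrected value, and a larger FW produces a larger increase.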

D. Data Analysis
Three-tier comparisons were made to evaluate differences in the 1-km binary water/land maps derived from the ODPS and MOD44W records, and SMAP SM retrievals derived using the respective global water masks.
The first-tier analysis focused on intercomparisons among ODPS, MOD44W v6 and JRC Landsat water masks. Qualitative assessments were performed over selected regions in Amazonia and the northern high latitudes for examining the performance of the water masks in delineating water bodies with a variety of sizes and shapes. Quantitative assessment was then performed for the global land domain by comparing ODPS and MOD44W v6 1-km water/land classifications with the Landsat-based results. The accuracy assessment metrics include producer accuracy, user accuracy, and overall accuracy. The producer accuracy of water pixels was defined as N ww /(N ww + N wl ), where N ww and N wl are the respective number of pixels correctly identified as water (ww) or belonging to water but classified as land (wl). The user accuracy of water pixels was defined as N ww / (N ww + N lw ), where N lw is the number of pixels belonging to land but classified as water. The overall classification accuracy was defined as (N ww + N ll )/(N ww + N ll + N lw + N wl ), where N ll is the number of pixels correctly identified as land.
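The three accuracy metrics defined above translate directly into code. The sketch below uses hypothetical pixel counts purely to show the arithmetic; the function name and the counts are not from the study.

```python
def water_accuracy(n_ww, n_wl, n_lw, n_ll):
    """Producer, user, and overall accuracy from the four confusion-matrix counts.

    n_ww: water correctly classified as water
    n_wl: water misclassified as land
    n_lw: land misclassified as water
    n_ll: land correctly classified as land
    """
    producer = n_ww / (n_ww + n_wl)
    user = n_ww / (n_ww + n_lw)
    overall = (n_ww + n_ll) / (n_ww + n_ll + n_lw + n_wl)
    return producer, user, overall


# Hypothetical counts for 1000 pixels compared against the Landsat-based reference.
producer, user, overall = water_accuracy(n_ww=90, n_wl=10, n_lw=30, n_ll=870)
```

Because land pixels greatly outnumber water pixels, overall accuracy can remain high even when the water-class producer and user accuracies are poor, which is why the water-specific metrics are reported separately.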
The second-tier analysis aimed to address water mask impacts on the SM retrievals. Two sets of antenna-gain-weighted FW and water-corrected Tb datasets over 9-km EASE-GRID cells were generated using the ODPS and MOD44W v6 water masks, respectively (see Section II-C). The SCA-V algorithm was then applied to the SMAP Tb data using the same ancillary inputs obtained from the SMAP SPL3SMP_E products (see Section II-B). The SCA-V was the prior baseline algorithm used for generating the SMAP SM products [28]. The algorithm is based on the zero-order radiative transfer or tau-omega model, and corrects vegetation and surface roughness impacts on SM retrievals using ancillary data describing the surface physical temperature (assuming the soil temperature T_s is approximately equal to the canopy temperature T_c), the vegetation optical thickness, defined as the product of the b parameter and vegetation water content, the scattering albedo (ω), and the soil surface roughness (h). In this model, r_p is the reflectivity of the rough soil surface, r_p_smooth is the smooth-surface reflectivity, and θ is the incidence angle. Soil moisture is finally estimated from r_p_smooth using soil dielectric models. Due to the use of the same algorithm and ancillary inputs, any resulting SM differences stem from the choice of water masks (ODPS or MOD44W v6) used in the Tb water correction. The MOD44W v6 based water mask used for the SMAP Level 1-3 Tb and SM product release (R19) is a static dataset similar to the prior baseline. For assessing the SM retrieval uncertainties associated with the interannual variations of surface water, the standard deviation (σ) of FW for each 9-km grid cell over the globe was calculated using the annual MOD44W v6 data from 2015 to 2019. Another set of SM data was then derived using the SCA-V algorithm and the MOD44W-based FW data perturbed by random noise within the ±σ range for each grid cell. The absolute differences between the original and perturbed SM retrievals were then used to quantify the impacts of FW interannual changes on the SM retrievals.
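To make the retrieval chain more concrete, the following sketch inverts a commonly used form of the tau-omega model for the smooth-surface reflectivity. It is an illustration under stated assumptions, not the SMAP operational implementation: the roughness relation r_p = r_p_smooth · exp(−h cos²θ) is one widely used parameterization and is assumed here, the soil dielectric step is omitted, and all function names and numerical inputs are hypothetical.

```python
import math


def forward_tb(r_rough, temp, tau, omega, theta_deg):
    """Zero-order tau-omega Tb with soil and canopy at the same temperature."""
    gamma = math.exp(-tau / math.cos(math.radians(theta_deg)))  # canopy transmissivity
    soil = temp * (1.0 - r_rough) * gamma
    canopy = temp * (1.0 - omega) * (1.0 - gamma) * (1.0 + r_rough * gamma)
    return soil + canopy


def invert_smooth_reflectivity(tb, temp, tau, omega, h, theta_deg):
    """Invert the model for r_rough, then remove roughness (assumed exp(-h cos^2) form)."""
    theta = math.radians(theta_deg)
    gamma = math.exp(-tau / math.cos(theta))
    veg = (1.0 - omega) * (1.0 - gamma)
    r_rough = (tb / temp - gamma - veg) / (gamma * (veg - 1.0))
    return r_rough * math.exp(h * math.cos(theta) ** 2)


# Illustrative inputs: 40 degree incidence angle, assumed ancillary values.
params = dict(temp=290.0, tau=0.1, omega=0.05, theta_deg=40.0)
tb_ref = forward_tb(0.2, **params)
r_ref = invert_smooth_reflectivity(tb_ref, h=0.1, **params)
r_cold = invert_smooth_reflectivity(tb_ref - 2.0, h=0.1, **params)  # 2 K colder Tb
```

In this sketch a colder Tb yields a higher smooth-surface reflectivity, which a dielectric model would map to wetter soil. This is the mechanism by which an underestimated FW, and hence an incomplete water correction, tends to produce wetter SM retrievals.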
For the tier 1 and tier 2 analyses targeting the spatial distributions of water mask differences and their impacts on SM retrievals globally, one week of SMAP data (July 21-27, 2017) with full global coverage were used. For a more comprehensive assessment, comparisons were made between SMAP estimates and in situ measurements from ISMN global sites over a full annual cycle for the selected year 2017. The FW data of the 9-km grid cells overlying the ISMN sites were first calculated using the ODPS and MOD44W v6 water masks, respectively. The sites were then divided into six groups characterized by different levels of absolute FW differences between the ODPS and MOD44W data sets. Accordingly, there are 608, 60, 32, 11, 12 and 11 sites whose absolute FW differences are in the ranges of (0.00, 0.01), (0.01, 0.02), (0.02, 0.03), (0.03, 0.04), (0.04, 0.05), and (0.05, 1.0), respectively. Two sets of SM datasets were then obtained using the SCA-V algorithm similar to the tier 2 analysis, and compared with the in situ SM measurements. Considering the different spatial representativeness of SMAP retrievals and point-scale site measurements, only the relative performance changes of the two SM datasets were examined for the six FW station groups. The performance metrics included absolute bias and correlation coefficient (R), which were calculated for each station and averaged for each group.
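The station grouping and per-group metrics can be sketched as below. Several details are assumptions made for illustration: the bin edges are treated as inclusive on the upper side, absolute bias is taken as the absolute value of the mean difference between satellite and in situ SM, and the function names are hypothetical.

```python
import math

# Upper edges of the six absolute-FW-difference groups described in the text.
BIN_EDGES = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 1.0]


def fw_group(fw_odps, fw_mod44w):
    """Index (0-5) of the group containing |FW_MOD44W - FW_ODPS|."""
    diff = abs(fw_mod44w - fw_odps)
    for k in range(len(BIN_EDGES) - 1):
        if diff <= BIN_EDGES[k + 1]:
            return k
    raise ValueError("FW difference outside the [0, 1] range")


def abs_bias(sat, insitu):
    """Absolute value of the mean satellite-minus-in-situ difference (assumed definition)."""
    return abs(sum(s - g for s, g in zip(sat, insitu)) / len(sat))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_means(records):
    """records: (group, abs_bias, r) per station; returns {group: (mean_bias, mean_r)}."""
    sums = {}
    for g, b, r in records:
        n, sb, sr = sums.get(g, (0, 0.0, 0.0))
        sums[g] = (n + 1, sb + b, sr + r)
    return {g: (sb / n, sr / n) for g, (n, sb, sr) in sums.items()}
```

Running the same functions once on the ODPS-based and once on the MOD44W-based SM series gives the per-group changes in bias and correlation that the comparison relies on.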

A. Differences in 1-km Water Masks
Similar spatial distributions of the main channel and numerous tributaries of the Amazon River were delineated by Landsat [see Fig. 2 water maps showed the characteristic regional abundance of small water bodies, and delineated the boundaries of large lakes in a consistent and smooth manner; while the ODPS result [see Fig. 3(c)] failed to detect the majority of smaller lakes and showed an unrealistic jagged delineation of large lake boundaries.
Quantitative assessment showed that the ODPS data had similarly high classification accuracy (∼99%) to the MOD44W v6 results for 1-km land pixels, but with a major accuracy decline (to ∼50%) for water pixels (see Table I). Statistics over different latitudinal zones further showed that the MOD44W-based water map has consistently higher classification accuracy than the ODPS over different regions (see Table II). For the ODPS results, major accuracy drops were found over the northern (accuracy 96.76%) and southern (accuracy 92.34%) high latitudes, which are much lower than the global mean accuracy of 98.07% (see Tables I and II).

B. Impacts of Water Masks on Soil Moisture Retrieval
The FW difference map [MOD44W v6 minus ODPS; see Fig. 4(a)] showed overall greater water cover from MOD44W v6 relative to ODPS over the 9-km grid cells, which is consistent with the regional evaluations (see Figs. 2 and 3). For land-dominant (FW ≤ 50%) grid cells, the FW increase and root mean square difference (RMSD) of MOD44W v6 relative to the ODPS results are about 0.64% and 2.07%, respectively. The updated MOD44W water mask is more representative of water conditions during the SMAP era (post-2015) and appears to provide an improved delineation of water bodies relative to the ODPS data [see Fig. 4(a)], which allows for better characterization of areas with substantial surface water heterogeneity, including coastlines, river floodplains, and reservoirs.
The larger FW estimates from MOD44W v6 led to overall higher water-corrected Tb values [mean increase 0.130 K; Table III]. For grid cells characterized by higher FW levels, the FW differences between the two water maps are more evident, resulting in larger mean differences and RMSDs in the SM retrievals (see Table III). Therefore, the MOD44W-based water mask is expected to improve SMAP SM performance for grid cells with substantial FW cover, including areas along major river and lake systems (e.g., the Amazon and Mississippi Rivers, and the Yangtze River in Central China) or in areas with abundant smaller water bodies (e.g., the northern high latitudes). However, for a majority (76.85%) of the land grid cells with small surface water presence (e.g., FW ≤ 0.01), the impacts of the different water masks on SM retrievals were minimal with RMSDs less than 0.01 m3/m3 (see Table III).
The FW interannual variations showed similar surface water distributions among the years from 2015 to 2019, with standard deviation σ close to 0.0 for most land areas and generally lower than 0.06 over the global grid cells (see Fig. 5). Major FW interannual changes occurred over large, frequently flooded rivers such as the Yangtze River, the Mekong River in Southeast Asia, the Brahmaputra River flowing through India and Bangladesh, and the Amazon River. Relatively large FW dynamics were also identified over the northern high latitudes (see Fig. 5), possibly due to the rapid climate change and major disturbance events in the region [36]. Overall, the resulting absolute differences between the original and perturbed soil moisture retrievals showed negligible impacts (SM differences ≤ 0.0026 m3/m3) over the global domain as well as the grid cells with different FW levels (see Table IV).

C. Assessment Using In Situ Soil Moisture Measurements
The correlation and bias of SMAP SM retrievals relative to ISMN in-situ measurements were summarized for the station groups defined by different levels of absolute FW differences between the MOD44W v6 and ODPS water masks (see Section II-C). Except for the first group whose absolute FW difference is minimal (from 0 to 0.01) and with negligible corresponding accuracy difference, the MOD44W-based SM retrievals had consistently higher correlations and lower biases than the ODPS-based results (see Fig. 6). The performance difference generally increased with the absolute FW difference.
In particular, when the absolute FW difference reached 0.04 or higher, the performance difference was largest, with substantially smaller biases (< 0.07 m3/m3) and larger correlations (> 0.11) of the MOD44W-based results than those based on ODPS data when compared with the in situ measurements (see Fig. 6). Overall, the comparisons confirmed consistently lower SM retrieval biases (mean decrease 0.04 m3/m3) and higher correlations (mean increase 0.06) of the MOD44W-based results relative to those based on the ODPS for the five station groups with absolute FW difference higher than 0.01.

A. Water Mask Selection
This study aimed at documenting the influence of an updated global water mask used to produce the latest (R19) SMAP Level 1-3 Tb and SM operational product release. The updated water mask is derived from the MOD44W v6 water body record that is more representative of the SMAP era (2015-present) than the older (circa 1997) ODPS water mask used in previous SMAP product versions. The JRC Landsat-based water masks, used to assess MOD44W and ODPS performance, have monthly temporal resolution and 30-m spatial resolution. However, it is still challenging to derive monthly or seasonal water masks for SMAP operational production due to possible missing data issues caused by persistent cloud cover or suboptimal solar illumination conditions, particularly over the tropics and northern high latitudes. For example, for the selected Amazon subregion, there are likely no water/land classifications over substantial areas for the rainy austral summer (e.g., about 22.7% of the region has no data for the DJF months of 2018-2019). Therefore, the Landsat-based water masks were only used as an independent benchmark for evaluating the relative accuracy of the ODPS and MOD44W v6 surface water data, and a temporally static water mask continues to be used for SMAP operational processing. Another issue related to the processing of Landsat water masks of this study (see Section II.C) is that the impacts of ice may not be entirely eliminated, as the relatively coarse resolution (∼11 km) of the ERA-5 temperature data may not fully account for the heterogeneous surface conditions of small rivers and lakes during the shoulder seasons.
The climatology dataset derived from the MOD44W v6 2015-2019 annual water maps showed substantially higher consistency with the JRC Landsat record relative to the ODPS data in delineating water body boundaries and classifying land/water pixels. Large surface water differences between the MOD44W v6 and ODPS data were mainly found over the northern high latitudes [see Table III; Figs. 3 and 4(a)], and major river systems (e.g., the Amazon River, the Ob River in Eurasia, and the Mekong River in Southeast Asia) [see Figs. 2 and 4(a)]. The major accuracy drop and missing small water features from the ODPS water mask in the high latitudes and along river floodplains are likely caused by the relatively low spatial resolution of the original ODPS data (approximately 0.9 km) relative to MOD44W (250 m). In addition, compared with the ODPS data set circa 1997, the MOD44W v6 record was updated annually from 2000 to 2019 and is more capable of capturing major changes in global surface water conditions caused by anthropogenic and natural factors such as reservoir construction/operation, surface water diversions, planetary climate oscillations (e.g., the 2016-2017 El Niño-Southern Oscillation or ENSO event), climate extremes such as drought and flood events, and long-term climate trends. In sum, the MOD44W v6 data enable global surface water descriptions over the SMAP era with higher accuracy than the ODPS data.
A major challenge of water correction for SMAP Tb data comes from the dynamic nature of surface water, which is affected by a variety of factors such as short-term precipitation, seasonal freeze-thaw events, and multiyear climate variability [37], [38]. The ideal water masks for supporting SMAP data processing would be routinely acquired at the SMAP observation time and updated subdaily with minimum latency. Because microwave sensitivity to water and penetration ability are frequency dependent, water extent derived from frequencies similar to the SMAP L-band would likely further improve the water mask accuracy, particularly for detecting water under emergent vegetation or in flooded forests. The current study addressed static water masks derived from satellite optical-IR observations, whereas the influence of transient surface water dynamics, including daily/sub-daily FW variations, on the SMAP retrievals requires further study. Satellite optical-IR sensors can underestimate the water extent over seasonal wetlands or inundated areas, where the surface water signal may be obscured by emergent vegetation. For example, the global area of permanent and transitory inland water cover estimated from Landsat is less than 43% of the estimate derived from satellite microwave-based datasets [39, Table II]. The residual surface water signals not accounted for by optical-IR products may lower the satellite microwave Tb observations and bias other microwave-based land parameter retrievals such as vegetation optical depth and above ground biomass [40] and soil moisture [3]. Planned next generation satellite missions such as the L-band NASA-ISRO Synthetic Aperture Radar will provide new capabilities for global surface water mapping [41], and may allow for dynamic water masks to be generated and updated every 6-12 days, potentially enabling further SMAP product enhancements.

B. Impacts on SMAP Soil Moisture Retrieval
For the 9-km grid cells, the ODPS-based FW data tended to be underestimated compared to the MOD44W v6 results [see Fig. 4(a)]. Water emissivity is generally substantially lower than surrounding land areas; therefore the residual water signals caused by incomplete water correction using the ODPS water mask (see Table III) lead to cold biases in the Tb retrievals (1), and thus underestimation of bare soil emissivity and overestimation of soil moisture for FW affected grid cells (see Table III).
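The sign of this bias can be checked with a short derivation. It assumes that the observed Tb is a linear mixture of land and water contributions; the symbols f_true (actual water fraction) and f-hat (water fraction from the mask) are introduced here for illustration only, and the numbers in the last line are hypothetical.

```latex
\begin{align}
T_b^{u} &= (1 - f_{\mathrm{true}})\,T_b^{\mathrm{land}} + f_{\mathrm{true}}\,T_b^{\mathrm{water}}
  && \text{(observed mixture)} \\
\hat{T}_b^{\mathrm{land}} &= \frac{T_b^{u} - \hat{f}\,T_b^{\mathrm{water}}}{1 - \hat{f}}
  && \text{(correction using mask FW } \hat{f}\text{)} \\
\hat{T}_b^{\mathrm{land}} - T_b^{\mathrm{land}}
  &= -\,\frac{(f_{\mathrm{true}} - \hat{f})\,\bigl(T_b^{\mathrm{land}} - T_b^{\mathrm{water}}\bigr)}{1 - \hat{f}}
  && \text{(residual error)} \\
  &= -\,\frac{(0.05 - 0.03)\times 100\ \mathrm{K}}{1 - 0.03} \approx -2.1\ \mathrm{K}
  && \text{(illustrative values)}
\end{align}
```

When the mask underestimates the water fraction and land is warmer than water, the residual is negative, which corresponds to the cold bias described above.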
Accordingly, drier SM conditions are generally expected in the MOD44W-based retrievals relative to the ODPS results [see Fig. 4(b), Table III]. The FW differences and the associated impacts on SM retrievals generally increase with FW cover (see Table III). Therefore, major accuracy improvements in SMAP SM products are expected over regions close to major rivers and lakes, or with abundant small water bodies. In contrast, the choice of water mask has minimal impact (RMSD < 0.001 m3/m3) on SM retrieval accuracy for grid cells with minimal (<1%) FW cover, which include the majority (76.85%) of global land grid cells.
The MOD44W v6 surface water IAV map (see Fig. 5) detected major water body changes over the 2015-2019 SMAP record.
The global statistics indicate overall negligible impacts of FW IAV on the SM retrievals (see Table IV). However, noticeable retrieval uncertainties are expected for individual grid cells with large FW changes such as those caused by major flood or drought events. Therefore, the FW IAV information and the difference between satellite optical and microwave-based datasets (see Section IV-A) may provide ancillary grid cell level information on the relative quality (QC) of higher order SMAP soil moisture and freeze/thaw products.

C. Evaluations Using In Situ Measurements
The operational SMAP SM products have been widely validated using measurements from in situ networks, including core validation sites having spatial representativeness similar to SMAP observations [28], [42]. However, most of these regional networks were designed for validating satellite retrievals over particular land cover types (e.g., cropland, grassland, and forests) [42] and are generally located away from major water bodies. The SMAP performance is expected to remain similar for the core validation networks due to their minimal water presence and negligible impacts from the water mask update. Therefore, the water mask impacts on the SM retrievals were assessed using the ISMN stations available for 2017, which are spatially distributed across the globe and include a range of FW levels within their overlying 9-km grid cells (see Fig. 1; Section III-C). Considering the spatial scale differences between the 9-km SMAP product and in situ SM measurements, only the relative changes of the assessment metrics between the ODPS- and MOD44W-based SM results were analyzed in combination with the associated grid cell FW differences. As expected, the use of a more accurate water mask led to lower biases in SM retrievals (see Fig. 6). In addition, the SM dynamics over 2017 derived from the ODPS-based Tb datasets were likely contaminated by residual water signals, and thus tended to have lower correlations with in situ measurements relative to the MOD44W v6 results (see Fig. 6). The performance enhancement is generally more evident as the FW discrepancy between ODPS and MOD44W v6 data increases (see Fig. 6). The water mask update likely improves SMAP SM performance in terms of lower retrieval biases and higher sensitivity to soil wetness changes for water-contaminated grid cells.
Considering the impacts of FW on SMAP retrievals, further comparisons of the SMAP algorithms including DCA, SCA-V, and SCA-H using in situ SM networks distributed over grid cells with substantial FW cover would be necessary for a more robust and comprehensive assessment of the satellite SM products.
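The relative-change comparison used in this section can be sketched as follows. This is a minimal illustration, assuming that the assessment metrics are the mean bias and the Pearson correlation computed per station; the helper names (`bias`, `pearson_r`, `metric_changes`) and the short time series are hypothetical and synthetic, not data from this study.

```python
import math


def bias(retrieved, observed):
    """Mean difference (retrieved minus in situ), in m3/m3."""
    return sum(r - o for r, o in zip(retrieved, observed)) / len(observed)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def metric_changes(sm_old, sm_new, sm_insitu):
    """Change in |bias| and in correlation when moving from old to new mask."""
    return {
        "abs_bias_change": abs(bias(sm_new, sm_insitu)) - abs(bias(sm_old, sm_insitu)),
        "r_change": pearson_r(sm_new, sm_insitu) - pearson_r(sm_old, sm_insitu),
    }


# Synthetic series for one hypothetical station (m3/m3).
sm_insitu = [0.10, 0.15, 0.25, 0.20, 0.12]
sm_old = [0.20, 0.21, 0.26, 0.27, 0.19]   # wetter, damped dynamics
sm_new = [0.12, 0.17, 0.27, 0.22, 0.14]   # smaller offset, same dynamics

changes = metric_changes(sm_old, sm_new, sm_insitu)
print(changes)
```

A negative `abs_bias_change` together with a positive `r_change` would correspond to the improvement pattern reported above; analyzing only these differences, rather than the absolute metrics, limits the influence of the scale mismatch between point measurements and 9-km grid cells.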

V. CONCLUSION
A new global 1-km EASE-Grid 2.0 water mask was developed for SMAP operations using an updated MOD44W v6 multiyear (2015-2019) water record. Compared with the prior ODPS-derived water mask, the MOD44W v6 data were found to be more suitable for supporting SMAP Tb data processing due to long-term coverage overlapping with the SMAP operational record, continuing NASA support and updates for the MOD44W record, and higher consistency with independent JRC Landsat water maps (see Table I).
Assessments of the resulting FW impacts on SMAP SM retrievals were performed using the same SCA-V algorithm but different SMAP Tb inputs corrected by the respective ODPS and MOD44W v6 water masks. The updated mask shows generally more surface water cover over land-dominated grid cells (FW ≤ 0.5) relative to the prior water mask, which leads to relatively drier soil moisture retrievals (mean decrease: −0.012 m3/m3, RMSD: 0.051 m3/m3). The SM retrieval uncertainties associated with the MOD44W FW IAV are negligible (RMSD < 0.003 m3/m3) over the global domain. The benefit of the updated water mask is greater in areas with substantial surface water heterogeneity, including coastlines, river floodplains, and reservoirs. The benefit is also greater in the Northern Hemisphere high latitudes, where the updated water mask is more effective in resolving the abundance of smaller water bodies in boreal and tundra wetlands. However, the majority (∼77%) of SMAP grid cells show minimal FW coverage (FW ≤ 0.01), where associated impacts on soil moisture retrievals are negligible (RMSD: 0.009 m3/m3).
Comparisons with globally distributed SM measurements further showed consistently lower SM retrieval biases and higher correlations of the MOD44W-based results relative to those based on the ODPS. Overall, the accuracy enhancement (∼90% improvement) of the MOD44W v6 water classifications relative to the prior ODPS baseline is expected to provide more accurate water corrections on Tb data over areas with mixed land and water cover, and to improve the accuracy of higher order SMAP products such as soil moisture and freeze/thaw status. Similar to the SMAP-based study, the updated water masks can be applied to other space-borne microwave sensors such as AMSR-E/2 and the FY-3 (FengYun-3) Microwave Radiation Imager for improving water corrections and land surface parameter retrievals.

ACKNOWLEDGMENT
The authors would like to thank Dr. M. Carroll from the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center for kindly providing the MOD44W v6 data for the 2016-2019 period. The MOD44W v6 data for 2015 were downloaded from the Land Processes Distributed Active Archive Center (LP DAAC) (lpdaac.usgs.gov/products/mod44wv006/). The 1-km ODPS water mask and 9-km SMAP SPL1BTB_E Tb data corrected using MOD44W v6 were provided by the SMAP science team. The SMAP SPL1CTB_E and SPL3SMP_E data were downloaded from the National Snow and Ice Data Center (NSIDC) Distributed Active Archive Center (DAAC), located in Boulder, CO, USA. The Landsat water masks were provided by JRC and processed for this study using GEE. The ERA5 temperature reanalysis data were generated using Copernicus Climate Change Service information and processed using GEE. The in situ soil moisture datasets were provided by the ISMN. A contribution to this work was made at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with NASA.