Retrieving Freeze/Thaw Surface State From CYGNSS Measurements

Freeze/Thaw (F/T) surface state retrieval is important to further understand hydrological patterns and climate change. This article investigates the use of Earth-reflected Global Positioning System (GPS) L-band signals as collected by the National Aeronautics and Space Administration NASA’s Cyclone Global Navigation Satellite System (CYGNSS) mission for F/T surface state retrieval over a target area in South America, covering the Andes Mountains and the Argentinian Pampas. In the study, CYGNSS responsiveness to changes in surface permittivity is leveraged to detect transitions of F/T surface state, at an improved spatio-temporal sampling as compared to traditional Remote Sensing missions. A Seasonal-Threshold Algorithm (STA) is developed and validated using surface temperature data as provided by the European Centre for Medium-Range Weather Forecast (ECMWF) ERA5-Land numerical reanalysis model. Then, the monthly evolution of CYGNSS-derived F/T surface state maps is evaluated and an inter-comparison with the Soil Moisture Active Passive (SMAP) F/T data product is performed.


I. INTRODUCTION
T HE spatio-temporal variability of freeze/thaw (F/T) surface state is high over the cryosphere. It has a strong influence on climate, biogeochemical processes, and seasonal surface energy exchange [1], and it determines vegetation net primary production and Net Ecosystem CO 2 Exchange (NEE) with the atmosphere [2]. Additionally, it is important to highlight that more than one-third of the Earth's land surface is covered by seasonal or permanent soil frost. Many agricultural, engineering, and environmental issues are affected by the F/T surface state [3].
Detecting highly dynamic F/T transitions at large scales requires spaceborne Remote Sensing observations. Landscape F/T processes are significantly heterogeneous [4]. Synthetic Aperture Radar (SAR) missions generate high spatial resolution (∼100 m) but low temporal resolution (∼7 to 14 days) F/T products, which prevents monitoring F/T transitions with the required spatio-temporal resolution. On the other hand, Manuscript  microwave radiometry missions provide high temporal resolution (∼3 days) but low spatial resolution (∼25 km), which prevents observing the landscape processes that directly affect F/T surface state [5]- [7]. Currently, there is no way to simultaneously monitor the F/T surface state globally, at appropriate spatial and temporal scales. Brightness temperature, backscatter, and forward scatter vary due to changes in the surface permittivity when water changes its state (liquid or solid), but also they are a function of a wide range of surface and instrument properties. The change in the surface permittivity due to F/T surface state transitions and how this phenomenon impacts observed surface reflectivity form the cornerstone of the formulation presented in this study. The surface permittivity changes when the temperature goes from below 0 • C to above 0 • C. This process leads to changes in the surface reflectivity. This is the fundamental element of F/T surface state change detection. The permittivity of thawed soil is much higher than that of frozen soil.
Spaceborne microwave radiometry and SAR missionsderived measurements at several frequencies have been used to detect melt onset [8], [9] and landscape F/T state [10]- [15]. In particular, microwave observations at L-band have demonstrated the ability to detect F/T surface state because of the following reasons: 1) a strong sensitivity to surface permittivity, which is influenced by the phase of water and 2) a higher penetration depth through the vegetation and into soil surface than at higher frequencies (starting at C-band) from sensors such as, e.g., the Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) [16]. Groundbased L-band radiometers have been used to detect F/T surface state [17]- [19] and several algorithms have been adapted to the Soil Moisture and Ocean Salinity (SMOS) mission [20]. High-resolution L-band data from Phased Array type L-band Synthetic Aperture Radar (PALSAR) have been used to detect local-scale variability in F/T surface state transitions [21], [22]. The NASA (Satélite de Aplicaciones Científicas/D) SAC/ D-Aquarius mission pioneered large-scale L-band radar response to F/T surface state with coarse spatial (∼100 km) and temporal (weekly) resolutions [23]. After the failure of the Soil Moisture Active Passive (SMAP) radar, the science team developed a new F/T surface state product using L-band radiometer measurements [6].
More recently, some pioneering studies have shown a promising sensitivity to F/T surface state transitions over high latitude regions, including SMAP GNSS-R mode [34], [39] and TechDemoSat-1 [40]; and also CYGNSS [41]. CYGNSS [30] was originally proposed for ocean surface winds speed estimation over tropical cyclones using reflected L1 Global Positioning System (GPS) signals at left-handed circular polarization (LHCP). CYGNSS enables observations of the Earth's surface along 32 tracks simultaneously. The unique sampling properties of CYGNSS could enable an improved understanding of F/T surface state dynamics. In this work, CYGNSS data are used to investigate the potential capability of GNSS-R to determine the F/T surface state over the Andes Mountains. The generated theoretical and experimental capabilities developed here could be applied to high inclination GNSS-R missions (Spire CubeSats series and HydroGNSS), in order to better resolve spatial patterns and temporal dynamics with an improved spatio-temporal sampling as compared to traditional Remote Sensing missions. The F/T changes of seasonally frozen ground are an important indicator of climate change and contributor to global methane distributions.

II. FREEZE/THAW DETECTION WITH GNSS-R: THEORETICAL BACKGROUND
The CYGNSS-derived surface reflectivity is directly related to the square modulus of the cross-polarization Fresnel reflection coefficient R cross . This can be theoretically expressed as follows: where and are the vertical V-pol and horizontal H-pol Fresnel reflection coefficients, respectively. ε is the complex relative permittivity of the reflecting medium, and θ i is the incidence angle. For angles larger than θ i ∼ 50 • , there is a significant reduction in the reflection coefficient, while it remains roughly constant in the range θ i = [0, 50] • [42]. The complex relative permittivity ε describes the electrical properties of the reflecting medium. It can be analytically computed as follows [43]: where ε is the dielectric constant, which is related to the speed of propagation, and ε is the so-called dielectric loss factor that is related to the attenuation rate of electromagnetic energy flow. It is highlighted that ε depends on the electromagnetic wavelength of the incident signal as well as on the internal components of the reflecting medium and their relative distribution.
The permittivity of a mixed medium of soil and water ε s can be formulated as follows [41], [44], [45]: where ρ b is the bulk density, ρ s is the solid density, and ε α s and ε α fw are the permittivity of the solid matter and pure water, respectively. α is the shape factor, and m v is the total moisture.
In the case of frozen soil, the phase state of the water changes. The ice composition can be added to the permittivity model. The final permittivity of frozen soil can be expressed as follows [41], [46]: where V is the volume content of different components, and the subscripts s, a, fw, bw, and i are solid soil, air, free water, bound water, and ice, respectively. The transition from thawed to frozen soil surface significantly decreases the permittivity. Consequently, R cross decreases, which in turn reduces the surface reflectivity. This phenomenon is the fundamental basis of F/T surface state detection in this study.

A. Target Area
CYGNSS provides an unprecedented spatio-temporal sampling of the Earth's surface over tropical latitudes. The selected target area (Lat = [−35, −28] • , Lon = [−71, −67] • ) includes a region of the Andes (Fig. 1) because permafrost takes place extensively over these high-altitude mountains [47]. In the northern Andes, permafrost can be detected above ∼5000 m. This lower limit gradually decreases down to ∼1500 m in the Southern regions of Chile and Argentina. The total area affected by permafrost is ∼30 000 km 2 .

B. CYGNSS
CYGNSS Level 1 (L1) data from the version 3.0 (v3.0) Science Data Record [48], [49], available at the Physical Oceanography Distributed Active Archive Center (PODAAC) [50], are used in this article. L1 daily files are provided in NetCDF format, each one with the required calibrated measurements and observables for each nominal specular point, so as to enable a wide variety of scientific studies over both land and ocean surfaces. Current v3.0 products incorporate real-time monitoring of transmitted GPS power [48], [49]. This provides an improvement as compared to v2.1, which assumes that transmitted GPS power is constant.

C. ECMWF ERA5-Land
European Centre for Medium-Range Weather Forecast (ECMWF) ERA5-Land describes a wide set of land variables over several decades with an improved resolution as compared to ERA5. It provides an accurate description of the past climate using synergistically physical models with globalscale measurements [51]. In this study, ERA5-Land reanalysis SMC and soil temperature data with hourly frequency are selected as the reference, considering the temperature in the first layer of the soil (0-7 cm) within the ECMWF Integrated Forecasting System (IFS). The surface is set at 0 cm in the IFS, the temperature is set at the middle of each layer, and heat transfer is estimated at the interfaces between them. The first layer of the soil is selected because both active and passive L-band observations (e.g., SMAP, SMOS, and CYGNSS) have sensitivity to the top ∼5 cm of soil [52].

D. SMAP
SMAP Enhanced L3 Radiometer Global and Northern Hemisphere Daily 9 km Equal-Area Scalable Earth (EASE)-Grid F/T State product is selected for this study. It provides a daily classification of F/T surface state derived from the SMAP L-band radiometer, over the global 9 km EASE 2.0 grid with a spatial resolution of ∼20 km × 20 km [53]. To do so, the Backus-Gilbert technique is applied. This methodology enables the use of additional microwave radiometry data that are not available for the original product because the foundation measurements of brightness temperature are oversampled in the along-track direction.

A. Reflectivity Estimation
Analog Delay Doppler Maps (DDMs) are selected (power_analog), along with the required metadata, for estimation of the observables and interpretation of the results.
Power_analog is the true power that would have been measured by an ideal (analog) power sensor. Power_digital is the power measured by the actual 2 bit sensor, which includes quantization effects. Power_analog (power after radiometric calibration) has been corrected for quantization effects. One year of data is considered from January to December 2019 to develop and apply the retrieval algorithm. The equivalent "CYGNSS overall quality flag" over land surfaces is used to filter out the data, improving the quality of the observables (see the Appendix for more information).
The delay bin resolution of the original 17 × 11 bins DDMs is 0.2552 GPS Coarse Acquisition (C/A) code chips, while the Doppler bin resolution is 500 Hz. After resampling, an interpolation [54] of the obtained 1700 × 1100 bins DDMs is applied to improve the accuracy, before performing the estimation of the reflectivity . To do so, a spline method is used.
Earth-reflected delay waveforms WF r,analog are derived from the improved analog DDMs |Y r (τ, f )| 2 at zero Doppler frequency as where τ is the delay of the signal from the transmitter to the receiver, and f is the Doppler shift of the electromagnetic reflected signal. In this study, it is assumed that the received power is described by the reflection model given by the Friis transmission formula [55], [56]. is obtained as it follows [57]: where WF r,analog−Peak is the peak value of the analog delay waveform, N is the DDM noise floor, P t is the transmitted power, G t and G r are the transmitter and the receiver antennas gains, R t and R r are the ranges from the transmitter and the receiver to the nominal specular point, respectively. G t P t is the transmitter equivalent isotropically radiated power (EIRP), which is provided in the metadata files along with the other variables, except WF r,analog−Peak and N, which are calculated from the DDMs. Over land surfaces, coherent and incoherent scattering terms contribute to the reflected power in a general scenario [58]- [67]. Some pioneering works have also shown the impact of topography and vegetation [67]- [70]. Since it is not practical to switch between reflectivity and Normalized Bistatic Radar Cross Section (NBRCS) with every DDM, a choice needs to be made to use one or the other.
The noise power floor is calculated as the mean value of the region of the DDM where there is no signal [71]. The delay separation between that region and the peak of DDM is at least 0.75 C/A code chips. The estimation of is not considered if this distance is below 0.75 C/A code chips.

B. Reference Freeze/Thaw Surface State
The assumed F/T surface state at a particular location in a particular month is derived from the population of hourly ERA5-Land temperatures, T i , in that month. If the average of all values is <−1 • C, the state is defined as frozen. If the average is >+1 • C, the state is defined as thaw. The confidence in the state definition is derived from the statistical distribution of hourly samples in the month according to the following schedule: 1) "Highest" freeze confidence: All hourly samples T i < −1 • C; 2) "High" freeze confidence: All hourly samples T i < 1 • C; 3) "Moderate" freeze confidence: Less than 10% of hourly samples T i > 1 • C; 4) "Low" freeze confidence: At least 10% of hourly samples T i > 1 • C; 5) "Highest" thaw confidence: All hourly samples T i > 1 • C; 6) "High" thaw confidence: All hourly samples T i > −1 • C; 7) "Moderate" thaw confidence: Less than 10% of hourly samples T i < −1 • C; 8) "Low" thaw confidence: At least 10% of hourly samples Examples of freeze state in Winter (June, July, and August) and thaw state in Summer (January, February, and March) with different confidence levels are shown in Figs. 2 and 3, respectively, for several months. In Winter, the "highest" confidence freeze state covers almost all the Andes. On the other hand, the "highest" confidence thaw state covers almost all the Pampas in Summer. The "moderate" confidence freeze and thaw states clearly cover the boundaries between the "high" and the "low" levels. Finally, the "low" freeze and thaw levels cover the "highest" thaw and freeze levels, respectively.
The schedule so defined provides the required scenario to properly evaluate the ability of CYGNSS for F/T retrieval, accounting for the different degrees of confidence on the F/T surface state reference. The "low" confidence levels can be clearly discarded because they cover regions with the "highest" confidence for the opposite surface state. On the other hand, "moderate" levels appear in the frontier, so in principle, these areas are linked with a quite uncertain state. The performances of the "highest" and "high" levels are quantitatively analyzed based on the behavior of the retrieval algorithm hereafter defined.

C. Retrieval Algorithm
The F/T retrieval approach is based on the Seasonal-Threshold Algorithm (STA). This algorithm evaluates the relationship between the time series of CYGNSS-derived reflectivity (Fig. 4) and seasonal reference frozen and thawed states. For measurement at time t, the seasonal scale factor (t) is defined as follows [6]: where (t) is the reflectivity measurement estimated at time t, and fr and th are reflectivity measurements corresponding to frozen and thawed reference states, respectively. Different F/T surface states correspond to different observation times. GNSS-R Earth's surface sampling properties are pseudorandom. A specific gridding strategy is thus required. In this work, it is defined using CYGNSS's sampling properties and it is applied also to data from ERA5-Land and SMAP.
The STA is evaluated for different sizes of the latitude/ longitude grid (0.1 • ×0.1 • , 0.05 • ×0.05 • , and 0.001 • ×0.001 • ), different numbers of measurements per pixel (5, 10, and 20), and several temporal windows (1, 2, and 3 months). Finally, a 0.05 • × 0.05 • grid is selected, and data are averaged using a moving window of 0.1 • at steps of 0.05 • . This selection provides a trade-off between the spatial resolution and the available number of measurements per pixel. The associated spatial resolution is ∼10 km at equatorial latitudes. th [ Fig. 4(a)] is found by averaging the five largest values occurring in Summer, while fr [ Fig. 4(b)] is found by averaging the five smallest values occurring in Winter. Overall, the standard deviation (SD) in the computation of both reference values is below ∼1 dB [ Fig. 4(e) and (f)]. The variability of the maximum values [ Fig. 4(e)] is larger over areas with a higher topographic roughness index [72] [ Fig. 4(h)] because of the larger temperature variability over high-altitude areas. On the other hand, the variability of the   Fig. 4(f)] shows some few tracks with higher SD, probably because of the impact of non-geophysical effects [73], [74]. The computation of both reference values does not depend on external reference datasets. The whole Winter and Summer sessions are the time periods over which fr and th are computed. Finally, it is worth pointing out there is no impact of the SMC in the results [ Fig. 5(a) and (b)].
The F/T surface state signal at sub-boreal latitudes has not such a strong signature as compared to, e.g., Artic areas, which of course have a much colder and longer Winter season [75], [76]. The STA could improve detection capability relative to the ERA5-Land and SMAP datasets when the difference between fr and th is greater. Thus, only the smallest and largest five data samples are used to set the references. The use of a larger number of measurements to compute fr and th could be reasonable, e.g., Artic areas because of the colder and larger Winter period. It is recommended to test this with data from future high-inclination GNSS-R missions [76], [77].
Two threshold levels T fr and T th are then defined such that the surface state is frozen if (t) > T fr , while the surface state is thawed if (t) < T th . The threshold levels T fr and T th can be varied parametrically to define the receiver operating characteristic (ROC) freeze and thaw curves, using the F/T surface state derived from ERA5-Land temperatures as the reference. A ROC curve shows the capability of diagnosis of a binary classifier as a function of the selected system threshold. Fig. 6 shows the ROC curves for the "highest" and "high" freeze and thaw confidence levels for July and November. It is found that the sensitivity is improved for the "highest" confidence case, so this is the confidence level used hereafter. This improvement is higher in July because of the stronger temperature gradient in the target area as compared to November. Optimum operating points of the ROC freeze and thaw curves are selected as the inflection points where the slope of the curve transitions from >1 to <1. This is the point beyond which more false than true positives are detected.
The retrieval algorithm is validated for July [ Fig. 7(a)-(c)] and November [ Fig. 7(d)-(f)]. The reference F/T surface state shown is the "highest" confidence case, and the observed F/T state also corresponds to this case. Frozen areas are depicted in blue, while thawed areas are in red [ Fig. 7(a), (b), (d), and (e)]. Fig. 7(c) and (f) shows the difference between reference and observed maps. Each pixel is associated with one of the following states.
2) Freeze missed detection ⇒ reference shows F but retrieval does not (cyan). 3) Thaw detected ⇒ both maps show T (red). 4) Thaw missed detection ⇒ reference shows T but retrieval does not (magenta). The freeze and thaw detections cover most of the "highest" confidence areas, both in July and November. The freeze missed detection is rather low [July ∼7% and November ∼1%], however, the thaw missed detection pixels are non-negligible [July ∼15% and November ∼29%], (Table I). They are identified in the boundary with areas of "moderate" and "low" confidence. Overall, the agreement between the CYGNSS and ERA5-Land F/T surface state maps is high.

A. Yearly Evolution of CYGNSS F/T Detection Capabilities
The retrieval algorithm was validated in Section IV using two specific months, which are representative of different climatological conditions in the target area. On the other hand, the overall main objective in this section is to evaluate the yearly evolution of F/T surface state detection using ERA5-Land F/T reference maps (Fig. 8), CYGNSS F/T observation maps (Fig. 9), and CYGNSS/ERA5-Land difference F/T maps ( Fig. 10 and Table I). All these maps correspond to the "highest" confidence level. Fig. 8 provides a graphical description of the monthly reference F/T surface state in the first layer of the soil (0-7 cm) along the year, which is useful for the interpretation of the results. During Summer, the "highest" confidence level areas appear only for thaw conditions. In Autumn, there is a decreasing extension of the thawed areas from April to June. This is consistent with the expected transition toward the coldest period of the year, in the Southern hemisphere. In Winter, Fig. 8. "Highest" confidence F/T state reference maps for each month derived from ERA5-Land soil temperatures. Blue-freeze/red-thaw. Fig. 9. "Highest" confidence F/T state-observed maps for each month, derived from CYGNSS. Blue-freeze/red-thaw. the frozen surface is extended along almost all the Andes (Fig. 1). Some areas remain thawed over the Pampas, which provides an adequate scenario to evaluate the STA algorithm. Finally, the extension of frozen areas gradually decreases in Spring, with a remaining frozen area in November. On the other hand, the extension of thawed areas increases with gradually warmer temperatures as we are closer to Summer. Overall, this target area offers a rich scenario with a wide variety of climates and transitions from frozen to thawed and vice-versa. Fig. 9 shows monthly CYGNSS observed F/T surface state maps along the year. Over the Andes, the extension of frozen Fig. 10. Difference between CYGNSS observed and ERA5-Land reference F/T maps for each month, using the "highest" confidence level of the reference freeze and thaw state. Freeze detected ⇒ both maps show F (blue). Freeze missed detection ⇒ reference shows F but retrieval does not (cyan). Thaw detected ⇒ both maps show T (red). Thaw missed detection ⇒ reference shows T but retrieval does not (magenta). soil clearly increases from June to August, and it gradually decreases from September to November. On the other hand, the extension of the thawed soil is clearly larger in Summer, rather residual in Winter, and it is transitional in Spring and Summer. Fig. 10 provides the difference between the F/T maps observed by CYGNSS and those derived from ERA5-Land. In June, the freeze missed detection is ∼27%. ERA5-Land map [ Fig. 8(f)] shows that the surface is frozen over the Andes; however, the STA algorithm is identifying the surface as mostly thawed [ Fig. 9(f)]. In July, August, and September, the situation is quite different. Both reference and observed F/T maps [Figs. 8(g)-(i) and 9(g)-(i)] are rather similar. This is probably because the ERA5-Land temperature is not so accurate in June, since this is a transitional month. A similar situation is found in October [Figs. 8(j) and 9(j)] and November [Figs. 8(k) and 9(k)], but for the thaw missed detection case. The thaw missed detection during October and November is high and roughly similar in both months, ∼33% and 29%, respectively. There is no evidence of the influence of SMC in this observation (Figs. 5 and 10). In the boundary with the "moderate" and "low" confidence thawed areas, CYGNSS detects the soil as frozen while ERA5-Land maps show that the surface temperature is above 0 • C. It is assumed that the ERA5-Land temperatures are not so accurate during these periods because the differences appear over the transitional months (Spring and Autumn). Future activities should include ground truth stations to elucidate what is the truth. However, access to most of these regions is quite complicated because they are high-altitude mountain terrain.
Finally, it is worth commenting that this study uses low incidence angles in the range θ i = [0, 50] • . In this range, the impact of the incidence angle on the reflection coefficient can be assumed to be negligible [42]. Additionally, data are averaged month-by-month at grid cells ∼0.05 • × 0.05 • . This strategy helps to homogenize data for the study, and to minimize the potential impact of the local terrain slopes.

B. Comparison Between CYGNSS and SMAP F/T Surface State Maps
An intercomparison of CYGNSS observed F/T surface state maps with the SMAP-radiometer F/T product is provided over the "highest" confidence areas for two representative months: February and August. The overall objective is to evaluate the performance of the new CYGNSS F/T capabilities with the more classical L-band microwave radiometry approach, which may provide a higher performance because of the lower impact of surface roughness. Fig. 11 shows a monthly-averaged F/T product derived from the SMAP baseline seasonal threshold algorithm applied to the normalized polarization ratio (NPR) of radiometer measurements [53]. This monthly product reports the state as frozen if any SMAP sample within the month is frozen. In other words, Fig. 11(a) and (d) correspond to "peak-detection" maps. Thawed areas are depicted in red and frozen areas in blue, similar to the previous F/T maps. Added to the figure are the corresponding maps produced by ERA5-Land (from Fig. 8) and CYGNSS (from Fig. 9). The surface is totally thawed in February, and frozen areas are properly identified in August over the Andes. This evolution generally agrees with the CYGNSS and the ERA5-Land maps.

C. Time Series Analysis
Two representative target areas are selected over the Andes and the Pampas for a time series analysis along 12 consecutive months from January to December 2019 (Fig. 12). Both target areas correspond to the "highest" confidence level. The objective is to further evaluate the behavior of the F/T detection capability of the algorithm over time. To do so, the scale factor (t) is selected, which is the main observable used in the retrieval algorithm.
Over the Andes [ Fig. 12(a)], (t) shows a significant increment from May (t) ∼ 0.2 to August (t) ∼ 1.2 due to the arrival of the Winter. In Spring, it decreases down to (t) ∼ 0.8, as is expected because of the higher surface temperatures. In Summer, it shows lower values, which correspond to a totally thawed surface. In the Pampas [ Fig. 12(b)] on the other hand, (t) remains fairly low throughout the year. This is consistent with the generally thawed state of the land surface at the location in the Pampas considered.

VI. CONCLUSION
This article describes a CYGNSS-based F/T surface state specifically designed STA, which is developed and validated over a target area in South America, covering the Andes Mountains and the Argentinian Pampas. Then, the capability to evaluate the monthly evolution of F/T surface state extension is studied, showing an overall good agreement between STA F/T maps with those derived from ECMWF ERA5-Land surface reanalysis data. Finally, an intercomparison with the SMAP radiometer-based F/T product also shows consistency with CYGNSS F/T maps, over this target area. In the future, the STA could be applied over polar regions using data from new GNSS-R high-inclination satellites. The higher spatiotemporal sampling of GNSS-R as compared to more traditional Remote Sensing techniques could open new insights in monitoring highly dynamic F/T surfaces processes.

APPENDIX
The equivalent "CYGNSS overall quality flag" over land surfaces is used to filter out the data, improving the quality of the observables. If any one of the following flags are set, then poor_overall_quality is set: