Feasibility of tundra vegetation height retrieval from Sentinel-1 and Sentinel-2 data

The quantification of vegetation height for the circumpolar Arctic tundra biome is of interest for a wide range of applications, including biomass and habitat studies as well as permafrost modelling in the context of climate change. To date, only indices from multispectral data have been used in these environments to address biomass and vegetation changes over time. The retrieval of vegetation height itself has not been attempted so far over larger areas. Synthetic Aperture Radar (SAR) holds promise for canopy modeling over large extents, but the high variability of near-surface soil moisture during the snow-free season is a major challenge for application of SAR in tundra for such a purpose. We hypothesized that tundra vegetation height can be derived from multispectral indices as well as from C-band SAR data acquired in winter (close to zero liquid water content). To test our hypothesis, we used C-band SAR data from Sentinel-1 and multi-spectral data from Sentinel-2. Results show that vegetation height can be derived with an RMSE of 44 cm from Normalized Difference Vegetation Index (NDVI) and 54 cm from Tasseled Cap Wetness index (TC). Retrieval from C-band SAR shows similar performance, but C-VV is more suitable than C-HH to derive vegetation height (RMSEs of 48 and 56 cm respectively). An exponential relationship with in situ height was evident for all tested parameters (NDVI, TC, C-VV and C-HH), suggesting that the C-band SAR and multi-spectral approaches possess similar capabilities including tundra biomass retrieval. Errors might occur in specific settings as a result of high surface roughness, high photosynthetic activity in wetlands or high snow density. We therefore introduce a method for combined use of Sentinel-1 and Sentinel-2 to address the ambiguities related to Arctic wetlands and barren rockfields. Snow-related deviations occur within tundra fire scars in permafrost areas in the case of C-VV use.
The impact decreases with age of the fire scar, following permafrost and vegetation recovery. The evaluation of masked C-VV retrievals across different regions, tundra types and sources (in situ and circumpolar vegetation community classification from satellite data) suggests pan-Arctic applicability to map current conditions for heights up to 160 cm. The presented methodology will allow for new applications and provide advanced insight into changing environmental conditions in the


Introduction
Land cover information is in high demand for the Arctic (Raynolds et al., 2019). Quantitative estimates of vegetation height are fundamentally important for a wide range of applications in high latitude tundra environments, because canopy height is a key biophysical control on and proxy for environmental conditions. For example, shrub height influences snow trapping and snow melt (Selkowitz, 2010; Selkowitz and Stehman, 2011; Marsh et al., 2010), which impacts ground thermal conditions (Schimel et al., 2004; Hollesen et al., 2015; Cable et al., 2016; Frost et al., 2017; Wilcox et al., 2019). It further modifies turbulent fluxes (Endrizzi and Marsh, 2010), reflects shrub biomass and conditions wildlife habitats (Zhou et al., 2017). Shrub cover change over time in the tundra-taiga transition zone is a frequent topic of investigation in climate change studies (Tape et al., 2006; Frost et al., 2014). Changes in tundra vegetation height can also result from local- and landscape-scale disturbances such as grazing (e.g. Olofsson, 2006; Olofsson et al., 2009; Vowles et al., 2017), wildfires (Racine et al., 2004; Jones et al., 2013a) and landslides (Khitun et al., 2015). At present, landcover classifications are often used to bracket canopy heights. For example, maps which represent shrub physiognomy (dwarf species or higher) together with wetness patterns are often produced for up-scaling of carbon pools and fluxes (e.g. Schneider et al., 2009; Hugelius et al., 2010). Such classes are also used for the identification of plant functional types (PFTs, e.g. Gould et al. (2003); Macander et al. (2017)). However, quantitative models of canopy height offer many advantages for characterizing current conditions, monitoring future changes, and informing earth-system models. Continuous-field height estimates for the entire Arctic are still lacking (Raynolds et al., 2019).
Remote sensing approaches which provide actual height information such as lidar show good results for tundra vegetation (Greaves et al., 2015, 2016b) but are so far only locally applicable as they rely on terrestrial and airborne acquisitions. A methodology applicable across the entire Arctic is therefore needed, allowing for circumpolar modeling of tundra canopy height. The normalized difference vegetation index (NDVI) has been shown to be applicable to derive tundra vegetation biomass in previous studies (Goswami et al., 2015; Hogrefe et al., 2017; Berner et al., 2018). A simple exponential relationship was empirically determined in all cases. Results from an in situ study by Berner et al. (2018) indicate that shrub biomass (as is frequently derived from NDVI) correlates with canopy height in tundra. This suggests that NDVI can be used to infer canopy height, but the potential of NDVI for mapping of larger areas has so far not been explored for this purpose. Occurrence of shrubs also coincides with areas of high wetness (e.g. on the central Yamal Peninsula; Khitun et al., 2015). Wetness indices from multi-spectral data such as the Tasseled Cap Wetness (TC) index may therefore also be applicable, but have not yet been investigated in this context. Good spatial coverage for the Arctic can be achieved with Synthetic Aperture Radar (SAR), but this technique has so far only been applied over other environments to retrieve plant functional traits, e.g. forest biomass (Santoro et al., 2011) or crop height (Erten et al., 2016). Such data contain information on surface roughness, moisture as well as vegetation structure. These factors have been investigated for tundra environments in a range of studies focusing on wetlands, soils and vegetation distribution (e.g. Bartsch et al., 2009; Widhalm et al., 2015; Duguay et al., 2015; Bartsch et al., 2016b; Widhalm et al., 2017).
Depending on wavelength, backscatter increases with the amount of leaves and branches (volume scattering) but also with increasing roughness and soil water content (Ulaby et al., 1982). The application of SAR in tundra for vegetation structure retrieval during the snow-free season requires the separation of the information from soil moisture. This specifically applies to C-band, with a wavelength of 5.6 cm, which is expected to represent both near-surface soil moisture and volume scattering when shrubs are present. The signal partially penetrates the canopy, but also interacts with leaves and thicker branches (with respect to the wavelength). An alternative could be the use of winter acquisitions, when liquid water content of soils and vegetation can be expected to be close to zero. Although deciduous plants lack leaves in winter, interaction with branches and stems is expected to represent vegetation physiognomy with C-band. Results for C-band in HH polarization by Widhalm et al. (2018) indicate that not only the relationship between backscatter and local incidence angle reflects variability in volume scattering and surface roughness over tundra during winter, but also the normalized backscatter level itself. C-HH winter backscatter from ENVISAT Advanced SAR (ASAR) has also been shown to have spatial patterns (e.g., latitudinal gradient) similar to NDVI in tundra environments (Bartsch et al., 2016b).
We therefore hypothesize that winter C-band backscatter as well as multi-spectral indices such as NDVI and TC from images acquired during the peak growing season can be used for vegetation height retrieval in tundra. Whereas NDVI utilizes the red and near infrared information only (e.g. Sentinel-2 bands 4 and 8), the Tasseled Cap Wetness index also considers blue, green and short-wavelength infrared information (Crist (1985); e.g. Sentinel-2 bands 2, 3, 4, 8, 11 and 12).
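The two multi-spectral indices can be illustrated with a minimal per-pixel sketch. The NDVI formula is the standard one. The Tasseled Cap Wetness coefficients are an assumption: they are the Landsat TM reflectance-factor values of Crist (1985), applied unchanged to the Sentinel-2 bands named above, since the text does not list the coefficients actually used. The function names and the dictionary layout are hypothetical.

```python
# Per-pixel NDVI and Tasseled Cap Wetness from Sentinel-2 reflectances (0..1).
# ASSUMPTION: the wetness coefficients are the Landsat TM reflectance-factor
# values of Crist (1985), mapped one-to-one onto the Sentinel-2 bands named
# in the text; the coefficients actually used in the study are not given.
TCW_COEFFS = {
    "B2": 0.0315,    # blue
    "B3": 0.2021,    # green
    "B4": 0.3102,    # red
    "B8": 0.1594,    # near infrared
    "B11": -0.6806,  # short-wavelength infrared 1
    "B12": -0.6109,  # short-wavelength infrared 2
}


def ndvi(red, nir):
    """Normalized difference of near-infrared (band 8) and red (band 4)."""
    total = nir + red
    if total == 0:
        return float("nan")  # undefined for a pixel without any signal
    return (nir - red) / total


def tasseled_cap_wetness(bands):
    """Weighted sum of the six bands; `bands` maps band name to reflectance."""
    return sum(coeff * bands[name] for name, coeff in TCW_COEFFS.items())


# Hypothetical reflectances for a shrub-covered pixel.
pixel = {"B2": 0.03, "B3": 0.06, "B4": 0.05, "B8": 0.35, "B11": 0.20, "B12": 0.10}
pixel_ndvi = ndvi(pixel["B4"], pixel["B8"])
pixel_tcw = tasseled_cap_wetness(pixel)
```

For the hypothetical pixel above, the NDVI is 0.75 and the wetness value is negative, because the two short-wavelength infrared bands carry the largest (negative) weights.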
Here we test both optical and SAR data obtained from freely available multi-spectral as well as C-band data from the Copernicus Sentinel program over a 1500 km long transect in Western Siberia (extent defined based on Bartsch et al. (2014)). This includes Sentinel-1 with its C-band SAR as well as Sentinel-2, a multi-spectral sensor. Both C-HH and C-VV winter backscatter have been analysed in order to assess polarization impact and spatial resolution (Sentinel-1 Extra-Wide swath mode versus Interferometric Wide swath mode). A further aspect is the lower spatial coverage of Sentinel-1 EW (HH) compared to IW (VV) over land. In situ data covering a range of vegetation heights have been collected for validation and calibration in Western Siberia, Russia. Additional sources from across the Arctic are included in the assessment and the discussion of uncertainties and errors. Table 1 and Figs. 1 and 2 give an overview of the available datasets for this study. All measurements have been made in July and/or August.

Vegetation surveys
Data from two recent vegetation surveys from across North-Western Siberia, Russia were used for calibration and validation of the vegetation height retrieval (2017 northern part and 2018 southern part; referred to as primary dataset in the following). All occurring vascular plant types are included in the estimates of mean height and vegetation coverage. Data from both surveys have been combined and eventually separated into two distinct sets for calibration and validation. Several long term monitoring sites exist within this region for which further vegetation height information is available (Vaskiny Dachi, Mordy Yakha, Khalevto and Bovanenkovo). These data, together with data from three further regions in North America (Alaskan North Slope and Yukon-Kuskokwim Delta, US, and Mackenzie Delta, Canada), have been used for assessing the applicability of the approach in different Arctic environments and at sites other than those covered in the main analysis region. Vegetation surveys from the additional sites differ from the primary survey and among each other regarding the observed parameters. In some cases the surveys were limited to shrub species, excluding other vegetation types such as herbaceous vegetation (Vaskiny Dachi, Mordy Yakha, Khalevto and Bovanenkovo, Mackenzie, partially Gydan).

Primary vegetation surveys in Western Siberia, Russia
Geobotanical expeditions to Yamal, Gydan and Tazovskiy peninsulas and the Polar Urals were carried out by the Russian Academy of Science (RAS) in July and August 2017. The main goal of the survey was to make a geobotanical map (1:200 000) using field data and satellite images for the tundra zone of the Yamalo-Nenets district. The survey followed the standard protocols of the Alaska Arctic Vegetation Archive (AVA-AK) (Walker et al., 2016). Mean vegetation height and/or coverage measurements are available for more than one thousand sites from 2017 (see Table 1). The information was collected along transects in a tundra environment which represents the northern areas with barren areas and cryptogams as well as the transition to areas from low to tall shrubs. Data were collected for grids of different vegetation communities. Four groups collected data on Gydan (G), one on Tazovskiy (T) and four on Yamal (Y). One of the surveys was located on the northern tip of the Yamal peninsula, where shrub coverage is low. It has therefore not been considered for this study. Not all groups collected overall vegetation height. Maximum shrub height was recorded for the most northern surveys on the Gydan peninsula (G1 and G2). Further available measurements from several of the surveys (coverage, low shrubs, tall shrubs) are used for the determination of thresholds required for masking in this study.
A survey dedicated to this study was carried out in conjunction with a further similar RAS survey in August 2018 (referred to as 'GlobPermafrost' in Table 1). The focus was on tundra in proximity to the taiga-tundra transition zone, which also includes alder-dominated communities (Fig. 3). Data were collected approximately 20 km west of Kharp on the northeastern slopes of the Urals.

Fig. 1. Overview of calibration and validation sites (circles), differing extent of Sentinel-1 data over the GlobPermafrost West Siberian transect (IW - Interferometric Wide swath, EW - Extra-Wide swath), and extent of the Circumpolar Arctic Vegetation Map (CAVM, Walker et al., 2002a). NACP - North American Carbon Project. For detail on surveys in Western Siberia (blue outline), see Fig. 2.

Further in situ monitoring sites and surveys
Two long-term monitoring sites are located on central Yamal, Russia: the Vaskiny Dachi research station and the Mordy Yakha experimental site. The Vaskiny Dachi (VD) research station (70°20′ N, 68°51′ E) is situated in the central Yamal Peninsula in a system of highly dissected alluvial-lacustrine-marine plains and terraces. Information on vegetation was collected in August 2014. 36 points include maximum shrub height measurements (Widhalm et al., 2017).
The Mordy Yakha experimental site (70°11′ N, 68°31′ E) is situated in the central Yamal peninsula, Russia, about 20 km south-west of Vaskiny Dachi and has similar climate, substrate and vegetation composition. A sample grid (5 km × 5 km) is placed over a 40 m high ridge, which is a typical tundra landscape with hillocks and ridges with dwarf shrub moss tundra interspersed with concave valleys comprising dense willow (Salix spp.) vegetation on slope tails and at valley bottoms with marshlands and lakes (Fig. 3). The locations of the 272 plots were predefined and arranged in transects. Each plot had a size of 15 m² (radius = 2.18 m). The distance between transects was 300 m and the distance between plots on each transect was 200 m. If present, the height of the Betula nana and/or Salix lanata specimen closest to the centre of each plot was measured.
Khalevto and Bovanenkovo are also located on the Yamal peninsula, Russia. The Khalevto site is located about 10 km south-west of Mordy Yakha (70°07′ N, 68°21′ E). This site differs from Vaskiny Dachi and Mordy Yakha because of dominant sand deposits. Willows are absent from hill tops and slopes; they only occur in river valleys and depressions. Dwarf shrubs are dominant with lichens. The locations of the 132 field plots were randomly selected. Field plot size is 20 m × 20 m; the main species coverage was estimated and the average height of shrubs (Salix lanata and Betula nana) was measured. The Bovanenkovo site is located about 15 km west of Vaskiny Dachi (70°21′ N, 68°36′ E). Bovanenkovo is a large gas field with a lot of anthropogenic disturbance (Kumpula et al., 2011). The locations of the 92 field plots were randomly selected. Field plot size is 20 m × 20 m; the main species coverage was estimated and the average height of shrubs was measured.
Vegetation height measurements were taken in August of 2018 near Trail Valley Creek Research Station, located 45 km north of Inuvik, Northwest Territories, Canada. This site is at the northern edge of the taiga-tundra transition zone, with a landscape characterized by distinct alder-dominated (Alnus alnobetula) and birch-dominated (Betula glandulosa) shrub patches, surrounded by tundra dominated by Sphagnum moss, lichen and sparse birch shrubs (Wilcox et al., 2019; Zwieback et al., 2019). The survey site is separated into several plots of sizes between 1000 m² and 9000 m². Plots were designed to encompass distinct patches of dominant vegetation cover types of alder, birch, and open tundra. Two of the plots are dominated by alder shrubs, two others by birch shrubs, and two others by lichen and Sphagnum moss tundra. Vegetation height measurements were made randomly at several locations (12-26) within each plot depending on plot size.
The NACP (North American Carbon Project) Dalton survey originates from a field campaign in 2010 and 2011 which was part of a NASA-funded research project. A statistical survey of shrub structure characteristics was carried out for 26 sites on the North Slope of Alaska by Durchesne et al. (2016a). The sites spanned from north of the Brooks Range to the Arctic Coastal Plain. Field data were collected at each 250 m × 250 m site using the belt transect method. The field estimates data set used here includes mean crown radius, mean shrub height, total number of shrubs, and fractional cover. While most sites featured shrubs less than 0.40 m tall, at some sites shrubs reached average heights above 1.5 m (Durchesne et al., 2016b). Mean shrub height is used for this study.
Western Alaska's Yukon-Kuskokwim Delta (YKD; US) represents one of the warmest, southernmost parts of the Arctic tundra biome. Much of the region consists of wet, graminoid-dominated meadows, but shrubs are common along active river channels, permafrost plateaus, and upland tundra. We measured shrub canopy heights in two study areas on the YKD: (1) riparian shrublands on the modern Yukon River Delta with tall, canopy-forming shrubs (primarily willow and alder) (August 2018), and (2) upland tundra on the central YKD with low-growing shrubs (primarily dwarf birch and ericaceous shrubs) in several tundra fire scars dating from 1971 to 2015 and in unburned tundra (July 2017). Shrub canopy heights were measured at 1-m intervals along transects within circular plots (30-m radius). Average vegetation heights were then calculated for all sample points that lie within a 20 m × 20 m area at the plot center. All available records with average height up to 3 m were included in the analyses; however, riparian shrublands on the Yukon River Delta frequently exceeded 3 m height.

Table 1. Vegetation surveys used for calibration and validation (*; in the text also referred to as primary vegetation survey) as well as for further assessment. All height measures are in cm. RAS - Russian Academy of Science, Y - Yamal, T - Tazovskiy, G - Gydan, NACP - North American Carbon Program, (t) - partially near boundary to tundra-taiga transition zone.

A. Bartsch, et al. Remote Sensing of Environment 237 (2020) 111515

Teshekpuk airborne survey
The Teshekpuk Lake study site is heavily affected by permafrost degradation and aggradation associated with thermokarst lake dynamics. Roughly 20% of the area is covered by thermokarst lakes, 60% by drained thermokarst lake basins of various ages, and 20% by erosional remnant upland topography (Jones and Arp, 2015). The most common vegetation communities in the area consist of wet and moist sedge meadow tundra and dwarf shrub graminoid tundra (Markon and Derksen, 1994). Several airborne survey photographs are available for the Teshekpuk Lake study site. Oblique photographs have been taken from altitudes of several hundred meters on an annual basis since 2007. These data allow for the remote assessment of terrain and vegetation characteristics in areas that are otherwise inaccessible. We use the aerial photographs from Teshekpuk Lake to identify patches of wetland graminoids and to analyze low Arctic wetland areas (species include Arctophila fulva, Dupontia fisheri, Eriophorum scheuchzeri, Eriophorum angustifolium, and Carex aquatilis) with a characteristically high photosynthesis signal, similar to shrubs.

Landcover maps
Global landcover maps lack thematic content for tundra (Bartsch et al., 2016a). Some landcover maps which distinguish tundra-specific vegetation communities are, however, available from Advanced Very High Resolution Radiometer (AVHRR) and Landsat. Suitable Landsat-based maps are only available regionally. The map by Virtanen et al. (2004) covers the Usa Basin, which is located west of the Urals and extends from tundra into taiga. It contains 21 different classes. We utilize it for the determination of the normalization function required for the preprocessing of Sentinel-1 data. The Circumpolar Arctic Vegetation Map (CAVM; Walker et al., 2002a) is based on AVHRR and is used in this study to assess the obtained vegetation height ranges, as its classes can be to some extent associated with vegetation physiognomy. The CAVM is based on a false colour infrared image of 1993 and 1995 AVHRR data. The original 1:7.5 million scale CAVM is a GIS database which provides the first and to date only detailed circumpolar vegetation map of the Arctic tundra. It was derived by manual photo interpretation and infers vegetation information from expert knowledge of plant communities in relation to climate, parent material and topographic factors (Walker et al., 2002a). A raster version of the vegetation community classes has been recently published by Raynolds et al. (2019). Nine out of the 15 vegetation community units occur over the West Siberian transect. They are used to assess the results of the height retrieval. This dataset also includes boundaries of bioclimatic subzones (see Fig. 2).
The forest classes of the global Climate Change Initiative (CCI) Landcover have been used to assess the performance in the tundra-taiga transition zone. The CCI Landcover project provides consistent global land cover maps at a spatial resolution of 300 m for the years 1992-2015. Because it uses the UN-LCCS (United Nations-Land Cover Classification System), the legend is compatible with the plant functional types used in climate models. The hierarchical classification allows adjustment to the amount of available information, providing a global as well as a more detailed regional legend (Santoro et al., 2016).

Satellite data
Data of the Copernicus Sentinel-1 as well as Sentinel-2 missions have been used. Sentinel-1 satellites have offered C-band SAR measurements since autumn 2014. The Arctic coastal regions are covered by data acquired in EW as well as IW mode (Torres et al., 2012; Potin et al., 2014). The swaths cover an area of 400 km and 250 km width, respectively. Polarizations are HH and HV with a spatial resolution of 20 m × 40 m for EW mode and VV and VH with a spatial resolution of 5 m × 20 m for IW mode. Incidence angles range from 19° to 47° in EW mode and 29° to 46° in IW mode. EW is usually acquired in the proximity of the coast, due to demand by sea ice services. It covers the northern part of the West Siberian analysis transect only (Fig. 1). The repeat intervals of IW and EW are theoretically comparably high due to overlapping orbits in the polar regions. However, the number of actually available acquisitions in the higher resolution IW mode is lower than in EW due to e.g. demand by sea ice services.
Sentinel-2 is a multi-spectral satellite mission. The first satellite was launched in 2015. Its spectral resolution allows the retrieval of a wide range of common indices related to vegetation and the underlying soil. Vaskiny Dachi on central Yamal is for example covered at least once a week, but cloud cover is very frequent. First images became available in summer 2016. Data are provided by ESA solely in Level 1C, which corresponds to orthorectified images of top of atmosphere reflectance. In March 2017 a second Sentinel-2 satellite was launched, leading to a combined revisit time of five days for any point on Earth. We use data from both satellites and all July and August scenes available until 2018 in order to obtain complete and cloud-free mosaics. Multi-spectral indices as well as biomass are expected to be comparably stable over these two months (Hogrefe et al., 2017).

General workflow
The investigated parameters are Sentinel-1 C-HH and C-VV backscatter and Sentinel-2 derived NDVI and Tasseled Cap Wetness. Both Sentinel-1 and Sentinel-2 require pre-processing steps before a relationship with vegetation height can be assessed. For Sentinel-1 this includes orthorectification and polarization-specific normalization, as the incidence angle varies across the available acquisitions. An efficient approach applicable over frozen tundra for C-HH is available (Widhalm et al., 2018) but has not yet been tested for C-VV. Its applicability needs to be assessed and the parameters determined in a first step. SAR-specific noise is addressed through multi-temporal averaging.
Sentinel-2 data as provided through the Sentinel data hub do not require orthorectification but are affected by clouds which are not fully captured in the masks available with the products. Several acquisitions therefore need to be combined to identify errors in the cloud masks of individual acquisitions.
A further issue is the presence of ambiguities with respect to vegetation height in Sentinel-1 as well as Sentinel-2. In several cases, ambiguities are limited to only one of the data types (radar or optical). This enables the identification of such areas, but thresholds need to be determined to build applicable masks. Specifically, the in situ vegetation coverage records are used for this purpose. The selected Sentinel-1 and -2 indices have been used together as input for a Principal Component Analysis to further explore their representativeness for vegetation height and the added value of their combination.
Eventually, records were assessed for their relationship with vegetation height, retrieval functions empirically determined, applied and the accuracy assessed.

Processed areas
In a first step, a 1500 km transect stretching from Bely Island in the north to Surgut in the south, and from the Ural mountains in the west to Gydan peninsula in the east has been processed (Fig. 1, IW extent). It overlaps with the transect of the primary in situ records (Fig. 2). This area also covers different permafrost types and is therefore of interest for permafrost modelling (Bartsch et al., 2014). Recent changes in climate have been reported for this region (e.g. Babkina et al., 2019). The approach has then been transferred to additional sites where in situ information is available (see Table 1 and Fig. 1) for further assessment. Sentinel-1 acquisitions have also been pre-processed over the Usa basin for the determination of normalization parameters. This basin is located west of the Northern Urals, Russia.

Sentinel-1 data selection and pre-processing
In order to exclude effects of soil moisture and phenology on C-band backscatter, only data under frozen conditions were used. Previous studies showed that December is applicable for Arctic C-band studies which require frozen conditions (Widhalm et al., 2015, 2018; Bartsch et al., 2016b). Only volume scattering and surface roughness are preserved in the signal, which is expected to relate to landcover type and vegetation physiognomy.
A total of 535 Sentinel-1 images acquired in December 2016 and 2017 were selected over all the study regions (215 over the Western Siberian region). We used Ground Range Detected (GRD) images, which consist of focused SAR data that has been detected, multi-looked and projected to ground range using an Earth ellipsoid model. IW data of GRD high resolution is provided resampled to 10 m pixel spacing/nominal resolution and EW data to 25 m pixel spacing/nominal resolution in GRD high resolution mode (Potin, 2013). We applied border noise removal, based on the 'bi-directional all-samples method' of Ali et al. (2018), calibration, thermal noise removal and orthorectification using the digital elevation model (DEM) GETASSE30 (Global Earth Topography And Sea Surface Elevation at 30 arc second resolution). Widhalm et al. (2018) demonstrated the applicability of this DEM for normalization in high latitudes in low to moderate terrain. These steps were carried out with the SNAP (Sentinel Application Platform) toolbox provided by the European Space Agency, and $\sigma^0$ was derived. As part of the orthorectification step, local incidence angle (θ) maps were derived for normalization. The function describing the relationship between incidence angle and radar backscatter is largely driven by volume scattering in case of vegetation coverage (Ulaby et al., 1982; Frison and Mougin, 1996; Menges et al., 2001). The lower the dependency, the higher the volume scattering and expected biomass. This has also been confirmed for selected tundra vegetation classes under frozen conditions in case of Sentinel-1 EW HH (Widhalm et al., 2018). In addition, tundra frozen C-HH backscatter ($\sigma^0$) variation is directly proportional to the slope if described by a linear model. This is expected to allow the direct use of C-HH backscatter for applications which benefit from variations in volume scattering, such as our study of vegetation height retrieval.
The backscatter-incidence angle relationship varies by landcover type and needs to be determined for each location to allow combination of scenes over space and time. It can be derived for each location if sufficient scenes are available over time (e.g. Widhalm et al., 2015). This is, however, very processing intensive and requires sufficient representative acquisitions. Widhalm et al. (2018) suggest a simplified method which is applicable to frozen tundra environments but has so far not been tested for C-VV. A linear dependency is assumed, which limits the validity to an incidence angle range of about 20°-45°. It is therefore not applicable in mountain areas. Most high latitude tundra is, however, of moderate terrain, including all our study areas. We transferred this approach to VV for this study. A landcover map which includes a range of vegetation types reflecting vegetation physiognomy was used for this purpose (Usa basin, Russia, Virtanen et al., 2004; see also Bartsch et al. (2016b) and Widhalm et al. (2018)). Data from multiple orbits, representing a range of incidence angles for each location, were considered. A function was fit to each landcover-specific sample to describe the relationship of the local incidence angle with $\sigma^0$. The relationship between the slope values k for all the different landcover classes and $\sigma^0$ at 30° (as frequently used in C-band SAR applications, e.g. Bartsch et al., 2008, 2009, 2012; Reschke et al., 2012; Trofaier et al., 2013) was derived in order to obtain the normalization function. VV and HH backscatter ($\sigma^0_{VV}$ and $\sigma^0_{HH}$) were normalized using the following equations to obtain the pixel-specific slope k of the normalization function and eventually $\sigma^0_{VV}(30)$ and $\sigma^0_{HH}(30)$, respectively: [Eqs. (1) and (2)] with θ being the local incidence angle and a and b the polarization-specific parameters.
$a_{HH}$ and $b_{HH}$ are available from Widhalm et al. (2018) (8.618 and 5.978 respectively). The parameters $a_{VV}$ and $b_{VV}$ were determined in this study.
The normalization approach introduced by Widhalm et al. (2018) theoretically requires only one acquisition per location. The normalized backscatter was, however, averaged over time in our study in order to reduce noise. Depending on availability in December 2016 and 2017, on average 5 to 20 IW measurements per pixel were used.
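The linear model underlying the normalization can be sketched as follows. This is a minimal illustration of fitting a slope k to backscatter versus local incidence angle for one sample and shifting each observation to the 30° reference angle; it does not reproduce the simplified per-pixel derivation of k from the parameters a and b. The function names are hypothetical, and averaging in dB rather than in linear power units is an assumption, since the text does not state which was used.

```python
def fit_line(angles_deg, sigma0_db):
    """Least-squares fit of sigma0 = k * theta + d; returns (k, d)."""
    n = len(angles_deg)
    mean_x = sum(angles_deg) / n
    mean_y = sum(sigma0_db) / n
    sxx = sum((x - mean_x) ** 2 for x in angles_deg)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(angles_deg, sigma0_db))
    k = sxy / sxx
    return k, mean_y - k * mean_x


def normalize_to_reference(sigma0_db, theta_deg, k, ref_deg=30.0):
    """Shift one observation along the fitted line to the reference angle."""
    return sigma0_db - k * (theta_deg - ref_deg)


def temporal_mean(values_db):
    """ASSUMPTION: the multi-temporal average is taken directly in dB."""
    return sum(values_db) / len(values_db)


# Hypothetical December observations of one location from four orbits.
angles = [25.0, 30.0, 35.0, 40.0]        # local incidence angle in degrees
sigma0 = [-11.5, -12.0, -12.5, -13.0]    # backscatter in dB

k, d = fit_line(angles, sigma0)
normalized = [normalize_to_reference(s, t, k) for s, t in zip(sigma0, angles)]
sigma0_30 = temporal_mean(normalized)
```

In this synthetic case the fitted slope is -0.1 dB per degree and every observation normalizes to -12 dB, so the temporal mean is also -12 dB. With real data the normalized values scatter around the mean, which is why averaging reduces noise.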

Sentinel-2 data selection and pre-processing
Sentinel-2 data were first checked for clouds using the masks which are provided with the datasets as well as visual inspection. Fmask cloud masking based on the algorithm by Zhu et al. (2015) was then applied. All available cloud-free data of summer 2016, 2017 and 2018 have been combined in order to obtain complete coverage (approximately 1800 tiles over the West Siberian transect, and 700 tiles over the additional sites). Data only include acquisitions from July and August, when biomass and NDVI are expected to be comparably stable (Hogrefe et al., 2017). Tasseled Cap Wetness index (referred to as TC in tables and figures) and NDVI were calculated and averaged for each pixel over the available scenes. Cloud shadows and specific types of thin clouds are, however, not removed with Fmask. An additional cloud masking based on the standard deviation of the Tasseled Cap Wetness time series was therefore applied in order to identify resulting artifacts. Areas with a standard deviation > 0.04 are marked and processed separately. For these locations the highest Tasseled Cap Wetness value and the lowest NDVI value have been excluded from the respective average calculation. A water mask was derived from the maximum extent of the Fmask-derived water class and applied.
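The additional screening step can be sketched per pixel as below. The 0.04 threshold and the rule of dropping the highest wetness value and the lowest NDVI value come from the text; the function name, the use of the population standard deviation, and the requirement of at least two observations are assumptions made for illustration.

```python
from statistics import mean, pstdev

STD_THRESHOLD = 0.04  # threshold on the TC wetness time series (from the text)


def screened_means(tc_series, ndvi_series, std_threshold=STD_THRESHOLD):
    """Return (mean TC wetness, mean NDVI) for one pixel.

    If the wetness time series varies more than the threshold, the highest
    wetness value and the lowest NDVI value are dropped before averaging.
    ASSUMPTION: population standard deviation; the text does not specify.
    """
    tc = list(tc_series)
    nd = list(ndvi_series)
    if len(tc) > 1 and len(nd) > 1 and pstdev(tc) > std_threshold:
        tc.remove(max(tc))  # residual thin cloud raises the wetness index
        nd.remove(min(nd))  # residual cloud or shadow lowers NDVI
    return mean(tc), mean(nd)


# Hypothetical pixel with one scene affected by an undetected thin cloud.
cloudy = screened_means([-0.10, -0.11, -0.09, 0.05], [0.70, 0.72, 0.71, 0.20])
# Hypothetical pixel with a stable time series; nothing is removed.
stable = screened_means([-0.10, -0.11, -0.09], [0.70, 0.72, 0.71])
```

Both hypothetical pixels end up with the same averages once the contaminated scene is excluded from the first one. The two extremes are removed independently, so they do not need to come from the same acquisition date.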

Sentinel-1 and Sentinel-2 combination for treatment of ambiguities
Both C-band backscatter and multi-spectral indices contain ambiguities. High backscatter and high NDVI, respectively, do not in all cases represent high vegetation in tundra. This can be addressed by combination of Sentinel-1 and Sentinel-2 in two specific cases where an ambiguity is only present in one of the data types. An additional preprocessing step has therefore been introduced which identifies sites of high discrepancies between Sentinel-1 and Sentinel-2.
In areas free of vegetation and soils (rocks and boulders of several centimeters to meters in size), multiple scattering dominates the C-band response. This can lead to backscatter similar to or higher than that from areas with high vegetation. Such areas are, however, characterized by low NDVI values. A threshold needs to be determined in order to exclude affected pixels from vegetation height retrieval based on the SAR data. Several in situ records from the GlobPermafrost survey in the Northern Urals are available for such sites and have been utilized for this purpose.
The opposite case, with high NDVI but close to zero shrub coverage, occurs for Arctic wetlands. Information from Sentinel-2 indices is ambiguous when very wet sites are covered with moist and wet sedge meadow tundra. This is for example expected for the Teshekpuk region in Northern Alaska where aerial surveys are available. Scattering at C-band is assumed to be low in these areas under frozen conditions (Widhalm et al., 2015). A threshold needs to be determined to mask these areas before vegetation height retrieval from multi-spectral indices, i.e. a minimum backscatter to be expected for shrubs. Records from the RAS survey distinguish low and tall shrubs (see Table 1) and have been used for this purpose.

Determination of the vegetation height retrieval function
In situ records from the 2017 botanical surveys from group RAS Y1-3 and the 2018 GlobPermafrost survey (Table 1) were used for the determination of an empirical relationship for both Sentinel-1 and Sentinel-2 records. The data points represent a gradient of vegetation communities and two bioclimatic zones (see Fig. 2). The records were separated into calibration and validation sets by selecting every second sample of the dataset sorted by vegetation height. This produces two sets of statistically similar samples (average for calibration samples is 68.5 cm and for the validation samples 66.9 cm; Kolmogorov-Smirnov test statistic 0.0119 with p-value 1.0). The vegetation height information was averaged for bins of 10 cm (height ranges) in order to account for variations in representativeness of the in situ data (point measurement versus spatial resolution and placing of pixel) as well as uncertainties in the field estimates, which are unknown and expected to differ between the datasets (different persons mapping across the sites).
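The separation into calibration and validation samples and the subsequent averaging over height bins can be sketched as below. The function names, the assignment of the first sorted sample to the calibration set, the bin edges at multiples of 10 cm and the averaging of the satellite values within each bin are assumptions made for illustration; the sample values are invented.

```python
import numpy as np

def split_alternating(height_cm, value):
    """Sort by height and send every second sample to validation."""
    height_cm = np.asarray(height_cm, dtype=float)
    value = np.asarray(value, dtype=float)
    order = np.argsort(height_cm, kind="stable")
    cal, val = order[0::2], order[1::2]
    return (height_cm[cal], value[cal]), (height_cm[val], value[val])

def bin_means(height_cm, value, width_cm=10.0):
    """Average height and satellite value within fixed-width height bins.

    Returns one row (mean height, mean value) per occupied bin.
    """
    height_cm = np.asarray(height_cm, dtype=float)
    value = np.asarray(value, dtype=float)
    bin_id = np.floor(height_cm / width_cm).astype(int)
    rows = [(height_cm[bin_id == b].mean(), value[bin_id == b].mean())
            for b in np.unique(bin_id)]
    return np.array(rows)

# Invented example: in situ height (cm) and normalized VV backscatter (dB).
height = np.array([5.0, 35.0, 12.0, 18.0, 31.0, 50.0])
sigma0 = np.array([-18.0, -15.0, -17.0, -16.5, -15.5, -13.0])

(cal_h, cal_s), (val_h, val_s) = split_alternating(height, sigma0)
binned = bin_means(height, sigma0)
```

Because neighbouring samples in the sorted list alternate between the two sets, both sets cover the same height range, which is what makes their distributions statistically similar.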

Assessment of height retrieval
Basic statistics have been derived to evaluate the retrieval. These include Pearson correlation and RMSE. To assess uncertainties with respect to in situ measurements, the statistics have been derived for all individual measurements as well as for 10 cm bin averages, as for the calibration of the vegetation height retrieval.
The validation dataset of the primary in situ records (Y1-3 and GlobPermafrost) was in addition separated for this purpose since measurements were made by different persons and in different areas (RAS Y1-3 (2017) on Yamal and the GlobPermafrost survey (2018) in the Urals). Also in this case, both the original records and the 10 cm bin averages were considered for the assessment.
Separate assessments were also made for the other sites where at least 36 data points were available. These are Mordy Yakha, Vaskiny Datchi, and Khalevtvo & Bovanenkovo (see Table 1). All others were only included in the overall assessment. This includes the Mackenzie plots. The records from this site exceed the 36 data points threshold but cannot be associated with specific coordinates (random distribution within defined polygons). The satellite derived information was therefore averaged for all pixels which overlap with the polygon of the known plot extent. All in situ measurements made within a plot were averaged for the comparison. This resulted in six data points at this location.
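The two statistics can be computed as in the following sketch. The function names and the sample values are illustrative only; the squared correlation is included because the tables report the correlation as R².

```python
import numpy as np

def rmse(estimated, observed):
    """Root mean square error in the unit of the input (here cm)."""
    estimated = np.asarray(estimated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((estimated - observed) ** 2)))

def pearson_r(estimated, observed):
    """Pearson correlation coefficient between two samples."""
    return float(np.corrcoef(estimated, observed)[0, 1])

# Invented heights in cm: in situ versus satellite derived.
in_situ = np.array([10.0, 50.0, 100.0, 150.0])
retrieved = np.array([20.0, 40.0, 110.0, 140.0])

error_cm = rmse(retrieved, in_situ)
r = pearson_r(retrieved, in_situ)
r_squared = r ** 2
```

The same two functions apply unchanged to the individual records and to the 10 cm bin averages.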
Vegetation height statistics were derived for CAVM vegetation community classes for the West Siberian analysis area (Fig. 2) as some of these community classes can be associated with certain height ranges (Raynolds et al., 2019).
Finally, the performance along the tundra-taiga transition zone was also assessed visually using the forest class boundary from the CCI landcover classification.
4. Results

[...] 2018)) as well as values determined for VV. The relationship was similar (comparable rate of increase of backscatter with slope) but VV values were approximately 1 dB lower. The parameters a_VV and b_VV for the retrieval of the slope for the normalization (Eq. (1)) were 10.155 and 6.233, respectively. Fig. 5b shows examples for points of barren rockfields from the Urals survey in 2018. According to these measurements, an NDVI threshold of 0.4 can be applied and affected pixels in Sentinel-1 excluded from the vegetation height retrieval.

Thresholds for treatment of ambiguities
Ambiguities associated with high Arctic wetlands relevant for Sentinel-2 were for example found around Teshekpuk Lake in Alaska and Bely Island in Russia. Fig. 6 shows Tasseled Cap Wetness, NDVI and backscatter for the two locations. Indices were partially high, similar to tall shrub areas. This was for example the case for partially or recently drained lakes (which may flood seasonally after snowmelt) on Bely Island as well as north of Teshekpuk Lake in Alaska. These areas are characterized by grasses and sedges, as shown in Fig. 5a. 95% of all tall shrub samples showed backscatter larger than −15.4 dB in VV. Fig. 5b provides information on NDVI of selected sites without shrubs in the Teshekpuk area (from airborne surveys in 2010-2017) compared to values in areas with varying shrub coverage and two height categories over the West Siberian transect. Sentinel-1 observations showed no indication of volume scattering (high backscatter) at these sites (Fig. 7b). This demonstrates that not only areas with an NDVI < 0.4 need to be masked out but also areas with low backscatter values (< −15 dB for HH and < −15.4 dB for VV).
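The two thresholds can be combined into a single validity mask, as in the sketch below. The threshold values are those reported above (NDVI 0.4, −15 dB for HH, −15.4 dB for VV). The function name, the dictionary layout and the treatment of missing values as invalid are assumptions for illustration.

```python
import numpy as np

NDVI_MIN = 0.4                               # below: barren rockfields
SIGMA0_MIN_DB = {"HH": -15.0, "VV": -15.4}   # below: wetlands without shrubs

def retrieval_mask(ndvi, sigma0_30_db, polarization="VV"):
    """Return True where vegetation height retrieval is considered valid.

    ndvi: summer NDVI average from Sentinel-2.
    sigma0_30_db: normalized winter backscatter from Sentinel-1 in dB.
    """
    ndvi = np.asarray(ndvi, dtype=float)
    sigma0 = np.asarray(sigma0_30_db, dtype=float)
    finite = np.isfinite(ndvi) & np.isfinite(sigma0)
    with np.errstate(invalid="ignore"):
        barren = ndvi < NDVI_MIN
        low_backscatter = sigma0 < SIGMA0_MIN_DB[polarization]
    return finite & ~barren & ~low_backscatter

# Invented pixels: rockfield, sedge wetland, shrub tundra, missing data.
ndvi = np.array([0.2, 0.7, 0.7, np.nan])
sigma0_vv = np.array([-10.0, -18.0, -12.0, -12.0])

valid = retrieval_mask(ndvi, sigma0_vv, "VV")
```

Only the third pixel passes both checks: the first has high backscatter but low NDVI, the second high NDVI but low backscatter.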

Relationship between satellite derived parameters and vegetation height
The variation of the selected parameters within vegetation height bins (10 cm) differs between the parameters (see Fig. 8). The spread of C-VV values is partially lower than for all other parameters. A sensitivity to height cannot be documented for heights less than 30 cm in all cases.

[Caption fragment, likely Fig. 4: ... (1) and (2)) and σ0 at 30° local incidence angle for the most common landcover classes from Virtanen et al. (2004). Percentage corresponds to fraction within the analysis area. Grey numbers represent Extra-Wide swath mode (EW, HH polarization; source: Widhalm et al. (2018)) and black numbers Interferometric Wide swath mode (IW, VV polarization).]

[Caption fragment, likely Fig. 5: ... Fig. 6), b) vegetation coverage versus NDVI for low and tall shrubs, no shrubs (as in (a)) and non-vegetated areas (from GlobPermafrost survey).]
The relationship between averaged Sentinel-1 backscatter and Sentinel-2 indices with vegetation height was non-linear in all cases (see Fig. 9). Although the Tasseled Cap Wetness fit of the calibration function had the highest R², the RMSE was higher than for NDVI and VV polarization (Table 2). Overall, VV polarization showed the best performance for the calibration dataset. The results of the Principal Component Analysis confirm the representation of vegetation height in the analysed indices (see Fig. 10). The first component explains 89% of the variance across the four indices and its values show similar agreement with height as Tasseled Cap Wetness and VV data (Table 2).
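How an exponential retrieval function can be calibrated is sketched below. The functional form height = A · exp(B · x) and the fit by linear regression on the logarithm of height are assumptions made for illustration; the fitted function and fitting procedure of this study are not reproduced here. The data are synthetic and noise-free.

```python
import numpy as np

def fit_exponential(x, height_cm):
    """Fit height = A * exp(B * x) by least squares on log(height).

    Samples with non-positive height are skipped because the logarithm
    is undefined for them.
    """
    x = np.asarray(x, dtype=float)
    height_cm = np.asarray(height_cm, dtype=float)
    keep = height_cm > 0
    slope, intercept = np.polyfit(x[keep], np.log(height_cm[keep]), 1)
    return float(np.exp(intercept)), float(slope)

def predict_height(x, a, b):
    """Apply the fitted function to satellite derived values."""
    return a * np.exp(b * np.asarray(x, dtype=float))

# Synthetic bin averages: normalized VV backscatter (dB) and height (cm),
# generated from hypothetical coefficients A = 5000 and B = 0.3.
sigma0_bins = np.linspace(-18.0, -11.0, 8)
height_bins = 5000.0 * np.exp(0.3 * sigma0_bins)

a_fit, b_fit = fit_exponential(sigma0_bins, height_bins)
height_est = predict_height(sigma0_bins, a_fit, b_fit)
```

A fit in logarithmic space weights relative rather than absolute height errors; a direct non-linear least squares fit would weight tall vegetation more strongly, so the choice affects the reported RMSE.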

Assessment
Validation results confirmed the lower performance of the fitted function to HH at most sites. Opposite behaviour was, however, found in one case. A large discrepancy between HH and VV was observed for upland tundra sites on the YKD, many of which were located within a 2015 tundra fire scar (Tables 3 and 4). The overestimation by VV was on average 200 cm at YKD, with on average 120 cm for burns from 1971, 216 cm for burns from 1985 and 284 cm for burns from 2005 (Fig. 11). This was less pronounced for HH, with on average 65 cm. No differences between burn activity years occurred. These areas were included for regional assessment to exemplify the issue, but excluded for the overall statistics.
Although the validation based on the primary dataset showed high Pearson correlations, the RMSE was comparably high when separated by survey (RAS Y1-3 versus GlobPermafrost; Tables 3 and 4). The comparison to the in situ measurements demonstrates the general ability to provide vegetation height information from Sentinel-1 as well as Sentinel-2 (Fig. 12). It also revealed performance differences between the indices. Sentinel-1 HH backscatter based retrieval overestimates vegetation height, which seems to be pronounced for heights below 100 cm. Sentinel-1 VV retrievals do, however, perform similarly to multi-spectral indices. The Pearson correlation for the exponential relationship determined using the calibration data indicates a good fit for C-VV as well as TC (see Table 2). The comparison to the validation datasets does, however, reveal lower performance of TC (see RMSE results in Table 4). This is especially the case for data collected close to the tundra-taiga transition zone.
The comparison of the results for masked C-VV retrievals with the additional in situ observations as well as landcover information further confirms its suitability. Comparison with landcover (forest classes from CCI Landcover map) does show general agreement of the 160 cm upper boundary with the forest boundary in the tundra-taiga transition zone (Fig. 13). The differences to in situ data, however, indicate a tendency of underestimation by the satellite data (Fig. 14). This applies to both calibration and validation records (Fig. 14b). The underestimation was on average 11 cm and 23 cm, respectively. This can be partially attributed to the type of measurement (shrub height only versus vegetation height) for the additional validation records.
For example, the retrievals from VV backscatter are higher at Mordy Yakha and Khaletvo & Bovanenkovo sites as in situ data do not represent average vegetation height, but shrubs only (Fig. 14b). Furthermore, measurements at Mordy Yakha represent areas smaller than the satellite data resolution. The heterogeneous landscape on central Yamal cannot be fully represented at the Sentinel-1 and Sentinel-2 resolution with this type of sampling scheme. The variation in overall vegetation height and range of height in these areas was also rather small (see Table 1), which impacts the chosen assessment statistics. RMSE values were comparably large for most central Yamal sites (see Table 4). Pearson correlation was higher for Vaskiny Datchi (Table 3), as sites have been selected with respect to their homogeneity.
The lowest vegetation heights obtained from C-VV correspond to vegetation community areas with cryptogam tundra and wetlands without shrubs within the overlap area of the West Siberian transect and the CAVM (Fig. 15). Averages agree with the height descriptions of communities with dwarf and low shrubs by Raynolds et al. (2019) (Table 5). Vegetation height is overestimated for prostrate dwarf shrubs.

5. Discussion

General applicability of tested parameters
Vegetation height is reflected in all analysed bands and indices to varying degrees. For example, there are broad similarities in the type of relationship (exponential) as well as the limitations. The latter specifically applies to low heights where the sensitivity is low. The tested approach is also comparably simple. Although the use of advanced multivariate analyses may lead to incremental improvements to model performance, the exponential relationship between the Sentinel metrics and vegetation height poses limitations that are unlikely to be overcome. The investigation of further bands, indices or even other sensors which provide further frequencies and polarization combinations may be required to improve the overall accuracy of the retrieval across the complete range of vegetation heights present in Arctic tundra. Widhalm et al. (2017) for example show that several classes of tundra shrub height ranges below 60 cm can be characterized with X-band SAR data.
The similarity of the height-relationship between Tasseled Cap Wetness and NDVI underlines the linkage of soil wetness to vegetation height in tundra. The comparably low agreement of TC with in situ data close to the tundra-taiga transition zone indicates limited importance of this environmental factor for vegetation growth in this region.
The good performance of C-VV may be of benefit for long-term assessment of vegetation changes in the Arctic as the limitation of optical data to cloud free conditions is a major constraint. In addition, only few records are available from before 2000 for medium resolution multi-spectral data (30 m and better; Nitze and Grosse, 2016). C-band SAR has, however, been acquired since 1992 with ERS (European Remote Sensing Satellite), specifically in VV polarization.
The comparison to vegetation maps (CAVM and CCI Landcover) further supports the hypothesis that vegetation height can be derived from SAR data. The CAVM vegetation community class of low shrub tundra included, as expected, the largest heights and also exemplifies the typical heterogeneity of the tundra (Fig. 15). The CAVM vegetation units are based on 1 km resolution data (Walker et al., 2002a;Raynolds et al., 2019). This results in occurrence of bare or low vegetation areas in all categories and a large spread of the second and third quartile. The comparison with the expected height range in some classes with prostrate shrubs (Table 5) suggests overestimation for low heights (approximately less than 30 cm). Also the other analysed parameters do not show sufficient sensitivity at low heights (see Fig. 8). The statistics for the samples in the lower bin ranges are similar to each other. For SAR this might be due to the impact of micro-topography (e.g. presence of tussocks) and in case of multi-spectral data the presence of mosses. The use of the exponential function therefore results in an overestimation for low heights.
The observed close relationship between C-VV frozen backscatter and in situ vegetation height measured during summer implies limited impact of winter-summer height differences on the retrieval. Leaves of tundra vascular plant species are comparably small and woody parts remain in winter for the most common shrub species.
The different degree of variation of parameter values within certain height classes among the different parameters may yield potential for further landcover related applications. The combination of Sentinel-1 and Sentinel-2 may lead to an advanced characterization of tundra vegetation classes. Vegetation height retrieval as targeted in our study benefits from combination of the two data types for error treatment (masking) as demonstrated in case of open rockfields and Arctic wetlands (Figs. 5 and 7).

Vegetation height and biomass
The non-linear relationship found between NDVI and vegetation height is similar to the relationship between NDVI and tundra biomass described in the literature. An exponential relationship of tundra phytomass with NDVI (summer maximum from Landsat) was e.g. reported by Walker et al. (2003). This is plausible as a linear relationship between in situ measured tundra shrub height and biomass has also been documented by Berner et al. (2015, 2018) for varying shrub species. The similarity of the relationship of C-band SAR backscatter at VV polarization as well as TC with vegetation height to the results for NDVI indicates that retrieval of tundra biomass may also be feasible with such data. It has been shown in previous studies that C-band SAR data (HH) are suitable to obtain biomass in taiga regions (e.g. Santoro et al., 2011). These retrievals did, however, not consider shrublands due to the focus on forests. Considered values started at approximately 25 m³/ha growing stock volume. Tundra phytomass shows much lower values. Up to 2 kg/m² was for example measured in Walker et al. (2003) as well as Ukraintseva and Leibman (2000). In case of forest species, such a mass would equal approximately 20 m³/ha (based on conversion according to Smith et al., 2003).
Our results show a range of more than 4 dB for 50 cm to 300 cm height in case of HH as well as VV (Fig. 9). The maximum backscatter for HH was in the order of −11 dB. The maximum HH backscatter in Santoro et al. (2011) was in the order of −10 dB for a growing stock volume of forest of about 300 m³/ha at a site in Siberia. Ground backscatter was estimated at −12.5 dB. This would correspond to about 100 cm vegetation height in our analyses. This difference could be a result of the statistical approach used for the ground backscatter retrieval in Santoro et al. (2011), variations between frozen and unfrozen conditions or the difference in normalization method. The overall sensitivity (difference between minimum and maximum backscatter) to biomass is, however, lower for taiga than for tundra (2.5 dB and 4 dB respectively). First, we only used frozen conditions which makes the contribution from soil moisture negligible and results in lower backscatter, and second a roughness contribution was not subtracted in our study. It was only masked in extreme cases (see example in Fig. 7). Santoro et al. (2011) used backscatter statistics to distinguish dense forest levels and ground levels. The ground information was then subtracted to exclude surface roughness effects. This is based on the assumption that the roughness in clearings is similar to the roughness below the trees. In tundra, vegetation distribution, however, to some extent reflects soil conditions and thus deviations in roughness. The approach of Santoro et al. (2011) is therefore not applicable.
Our analyses underline that roughness needs to be taken into account for the vegetation height retrieval from SAR data in tundra (Fig. 7). The magnitude of backscatter from vegetation free barren rockfields can be higher than for tall shrubs. A masking based on NDVI is suggested to account for this effect, since such areas are typically covered by little vegetation. The determined lower performance of HH compared to VV may result from the expected larger penetration depth of HH. Roughness is therefore expected to have a larger impact at HH than at VV polarization. Methods which account for roughness directly may improve the retrieval. More advanced height retrieval methods which are for example applied in case of crop mapping include interferometric, combined interferometric and polarimetric SAR analyses or the use of models which provide a simplified description of plant architecture (Erten et al., 2016). The applicability is, however, limited across the Arctic since SAR data are only sporadically acquired in fully polarimetric mode. Especially X-band data which are often used for characterizing low vegetation are of limited availability compared to C-band data across the entire Arctic domain. Several recent studies have, however, tested polarimetric C- and X-band SAR data to characterize tundra landcover and demonstrated the potential of such data for tundra shrub identification (Banks et al., 2014; Ullmann et al., 2014; Duguay et al., 2016).

[Caption fragment: ... Fig. 5a) and non-vegetated areas (as shown in (a), from GlobPermafrost survey).]

In situ records
The derived functions indicate a limitation regarding applicable vegetation height. The saturation level can, however, not be determined as vegetation height measurements are limited to 3 m in this study due to the focus on tundra vegetation. The number of available measurements was also comparably low for tall vegetation (see Fig. 8); only two surveys used in the validation included the transition and riparian zone, respectively. We therefore suggest limiting the application of this approach to a maximum vegetation height of 160 cm.
Our various surveys may also represent different uncertainties in vegetation height measurements. This also applies to the different groups of the RAS survey although they followed the same protocol, as vegetation height is not completely physically measured over the plot area but estimated. The calibration might therefore have been influenced, but the GlobPermafrost 2018 survey was comparably similar in performance if separated (Table 3 and Fig. 14a). The GlobPermafrost 2018 survey did, however, show higher RMSE in case of HH and TC. This may be a result of different persons estimating the height, but also of the fact that the 2018 survey was made in tundra close to the tundra-taiga transition zone, whereas the 2017 survey was made at higher latitudes with lower overall heights (see Table 1). Several sites in the transition zone with alder species of up to 4 m height and scattered larch with comparably thick stems were included (Fig. 3). This may have resulted in different scattering behaviour in case of SAR compared to higher latitude tundra. The difference in dependence of vegetation growth on environmental conditions might have impacted the performance of TC along the transition zone. Soil moisture (as expected to be represented in the Tasseled Cap Wetness index) as driving factor may play a smaller role than in northern tundra.
Table 2
Statistics for the exponential fits for the vegetation height information from the Russian Academy of Sciences & GlobPermafrost calibration dataset (in cm, height range 0-300 cm) with satellite derived parameters (σ0 from Sentinel-1, Normalized Difference Vegetation Index (NDVI), Tasseled Cap Wetness Index (TC) from Sentinel-2) and Principal Component 1 (input HH, VV, TC and NDVI) based on averaged (bin width 10 cm in situ height) and original data.

Table 3
Pearson correlation (R²) for estimated vegetation height from satellite derived parameters (σ0 from Sentinel-1, Normalized Difference Vegetation Index (NDVI) and Tasseled Cap Wetness Index (TC) from Sentinel-2) versus in situ measurements (height range 0-300 cm, see Table 1).

As uncertainties in the primary in situ vegetation survey were unknown, we introduced an averaging over 10 cm bins before determination of the retrieval function. The analysis of the data in bins and at the same time inclusion of several surveys (all RAS sites and GlobPermafrost records) also overcomes some of these sampling problems (limitation in height range, heterogeneity etc.). The difference between the R² for the raw (original) validation data and the averages (bins) as well as the results from different types of surveys in Yamal (Tables 3 and 4 and Fig. 14b) underline the problem of uncertainties with point data collection and the need for a more suitable strategy for in situ data collection. First, larger plots need to be surveyed and second, estimates need to consider large height variations over short distances. The combination with aerial surveys using Lidar information may provide more reliable validation data. The pronounced underestimation for surveys which only provide shrub height (Fig. 14b), however, supports the hypothesis that C-band SAR winter backscatter reflects overall vegetation height. Variations in height between the years of the primary vegetation surveys and within the satellite averaging period are expected to be less than the 10 cm bin width used to account for uncertainties in the in situ observations. The availability of data regarding year of mapping differed among the additional sites, contrary to the primary validation and calibration dataset. They ranged from 2005 to 2017. Shrub growth [...] (Hudson and Henry, 2009) and about 0.1 cm per year mean canopy height increase for a range of sites across the Arctic for 1980-2010 (Elmendorf et al., 2012). Shrub growth itself (aging) is also comparably slow. Myers-Smith et al. (2011) determined 0.11-0.16 cm vertical growth per year for different Salix species. Larger changes are only expected in case of disturbances such as forest fires or landslides (e.g. Khitun et al., 2015). Some structure changes are also reported from the transition zones. Frost et al. (2014) report a change in NDVI over an 18 year period. All used measurements from before 2017 were, however, collected at higher latitudes, in colder bioclimate subzones, outside of the transition zone, where rapid changes in shrub height were unlikely to occur. The circumarctic vegetation map used for evaluation also dates back more than 20 years. As it represents general vegetation communities it was nevertheless assumed applicable for comparison using overall statistics (i.e. boxplots by communities, see Fig. 15).
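The argument that height changes between the in situ surveys and the satellite acquisitions stay below the 10 cm bin width can be checked with simple arithmetic. The sketch uses the growth rates quoted above (0.11-0.16 cm per year, Myers-Smith et al., 2011), the earliest survey year (2005) and, as an assumption, 2018 as the latest year of the satellite averaging period. Constant growth without disturbance is assumed.

```python
# Vertical growth rates for Salix species quoted in the text (cm per year).
GROWTH_MIN_CM_PER_YEAR = 0.11
GROWTH_MAX_CM_PER_YEAR = 0.16
BIN_WIDTH_CM = 10.0

def height_change_cm(survey_year, satellite_year, rate_cm_per_year):
    """Height change accumulated between survey and satellite acquisition."""
    return (satellite_year - survey_year) * rate_cm_per_year

# Largest time gap: earliest additional survey versus latest satellite data
# (2018 is an assumption based on the Sentinel-2 summer acquisitions).
change_min = height_change_cm(2005, 2018, GROWTH_MIN_CM_PER_YEAR)
change_max = height_change_cm(2005, 2018, GROWTH_MAX_CM_PER_YEAR)
within_bin = change_max < BIN_WIDTH_CM
```

Even for the largest time gap, the accumulated growth stays in the order of 2 cm and thus well below the bin width.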
Uncertainties are expected to be especially high for in situ measurements of higher height values. Anomalous behavior could be observed for such records when compared to the selected parameters. The values in the 120 cm bin of the calibration dataset showed for example an unusually large spread in all cases (see Fig. 8). An underestimation of vegetation height occurs for several in situ records of the validation dataset in case of all four tested parameters (Fig. 12, 150-200 cm compared to estimations below 50 cm). This indicates issues with the spatial representativeness of the in situ measurements. In the observed cases of the validation dataset, data points were located at the boundary of a landcover type, in proximity to seasonally inundated areas. Such wet areas are often associated with patches of comparably high shrubs (Dvornikov et al., 2016), but do not cover large areas.

[Caption fragment: ... Tables 4 and 3, respectively.]

SAR specific issues
It can be expected that certain characteristics of SAR data and their pre-processing contribute to uncertainties in the retrieval.
A radiometric accuracy of 1 dB has been determined as mission requirement for Sentinel-1 (Torres et al., 2012). Recent studies, however, show that the actual performance is in the order of 0.3 dB for point targets (Schmidt et al., 2018). This may result in variations of up to 12 cm considering an observed range of approximately 4 dB over 160 cm vegetation height. This value is, however, lower than the determined RMSE in all cases.
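The conversion from radiometric accuracy to height uncertainty stated above follows from a linear scaling of the observed backscatter range, as the following sketch shows. The linear scaling is a first-order simplification (the retrieval function itself is exponential), and the function name is hypothetical.

```python
def height_uncertainty_cm(radiometric_accuracy_db,
                          backscatter_range_db=4.0,
                          height_range_cm=160.0):
    """Scale a backscatter uncertainty (dB) to a height uncertainty (cm).

    Uses the observed range of approximately 4 dB over 160 cm vegetation
    height, i.e. about 40 cm per dB under a linear approximation.
    """
    return radiometric_accuracy_db * height_range_cm / backscatter_range_db

# Actual point target performance (Schmidt et al., 2018): 0.3 dB.
observed_cm = height_uncertainty_cm(0.3)

# Mission requirement (Torres et al., 2012): 1 dB.
requirement_cm = height_uncertainty_cm(1.0)
```

With the reported performance of 0.3 dB this gives the 12 cm quoted above, whereas the 1 dB mission requirement alone would correspond to about 40 cm.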
The speckled nature of SAR images, whereby an interference caused by numerous scatterers within each resolution cell induces a noise-like effect, might also have had an impact on the deviations. The impact is, however, expected to be small due to the use of multiple acquisitions and multi-temporal averaging. Widhalm et al. (2018) show that the simplified normalization is not applicable in steep terrain and that the comparably coarse resolution DEM GETASSE30 may introduce artifacts in mountains. It is therefore expected that vegetation height retrieval would be of lower accuracy in these areas. This could not be quantified as all in situ data have been collected in flat to moderate terrain, reflecting the dominant landscape type for the Arctic. Large parts of potentially affected regions (e.g. Ural mountains, see Fig. 13) were, however, eventually masked in our scheme, as most mountain areas in high latitudes are characterized by sparse vegetation.
The relationship of the slope of the normalization function with backscatter was similar for HH and VV (just offset), but differences could be observed with respect to landcover class (Fig. 4, class number 10). Shrub tundra heath with a lichen dominated lower layer as described in Virtanen et al. (2004) consists of patches of heath with extensive lichen covered areas between them. The deviation of the backscatter properties of this class in VV may result from spatial resolution differences. HH (EW mode, 30 m) is available at coarser resolution than VV (IW mode, 10 m) and is similar to Landsat (30 m) which was used to derive the landcover classification (Virtanen et al., 2004). The classification is therefore expected to be not fully appropriate in case of VV. A higher spatial resolution landcover classification, based on for example Sentinel-2, may be of benefit for determination of the normalization function. The Pearson correlation of the fitted linear functions, which serve as input for the slope versus backscatter analyses, decreases with volume scattering (Widhalm et al., 2018) as the sensitivity to variation in backscatter with local incidence angle is decreasing at the same time. This might result in larger uncertainties in height, but the surface types under investigation still showed a clear dependence on the incidence angle (slope below −0.1, Fig. 4) due to limited volume scattering in tundra.
Thresholds could be determined to address ambiguities in SAR backscatter which lead to errors. This has been demonstrated for open rock fields, and a similar pattern (high backscatter and low NDVI) can be expected for built-up areas. Comparably high backscatter also occurs over lakes in winter (e.g. Duguay et al., 2013). Such areas should be masked before height retrieval (e.g. based on Sentinel-2 as in our study).
The pre-processing step for treatment of ambiguities did not capture high backscatter caused by features other than vegetation in case of recently burned areas. Tundra fires are common in some parts of the Arctic, and are of high interest due to the potential for rapid shrub expansion in fire scars (Racine et al., 2004; Jones et al., 2013). Multispectral indices are better suited than SAR for vegetation height retrieval in affected areas according to our results. For example, pronounced overestimation of canopy heights occurred for the Yukon-Kuskokwim Delta burned area sites when using SAR (C-VV), including burns from 1971 (Fig. 11). The dramatic decreases in estimated canopy height with time after fire probably arise from a technical artifact rather than actual post-fire ecosystem dynamics. The overestimation results from winter backscatter which was higher than typical for this environment in December. Backscatter under frozen conditions is conditioned by roughness and volume scattering within vegetation (woody parts) and snow (depending on grain size). Furthermore, early winter backscatter variations in this environment are driven by the dielectric constant (change in liquid water content) and changing volume scattering in lake ice and snow (e.g. Duguay et al., 2002; Bergstedt et al., 2018). Higher backscatter after forest fires is typical for unfrozen conditions in permafrost areas (e.g. Reschke et al., 2012) due to an increase in near surface soil water content (Liljedahl et al., 2007), but these areas are expected to appear as 'dry' (low backscatter) when frozen. Liljedahl et al. (2007), however, report a freeze-up delayed by up to 30 days (compared to pre-fire conditions) and presence of liquid water in the soil at 10 cm depth in early December based on in situ measurements. The latter may contribute to the higher backscatter. A common related phenomenon for snow in high latitudes which needs to be considered is depth hoar at its base. Its presence causes comparably high backscatter return at C-band (e.g. West, 2000; Pivot, 2012). Depth hoar depends on water-vapor flux, which might be influenced by the specific moisture conditions which result from burn activities. Pruitt (1959) reported that caribou avoid areas of recent fire activity in winter as snow has higher density in these areas. The pronounced difference between HH and VV (little impact on HH) also suggests an effect associated with change in volume scattering.

Conclusions
The strength of the presented approach lies in its transferability and spatial detail across large geographic extents. It allows for circumpolar retrieval, which has not been achieved with other methods so far. Consistent records are essential, for example, for permafrost-related applications (snow redistribution). These currently rely on land cover information that lacks the thematic content needed to infer height classes and is too coarse in spatial resolution to represent the heterogeneity of tundra environments. Results are therefore expected to allow advancement despite an RMSE on the order of 50 cm for the investigated height range (0-300 cm).
The suitability of the proposed approach, which combines SAR and multispectral vegetation indices, is supported by the availability of C-band SAR, especially through the acquisition strategy of Sentinel-1 (Torres et al., 2012). C-band SAR in VV polarization performs similarly to Sentinel-2 derived Tasseled Cap Wetness, but shows regionally better results. This applies specifically to areas close to the tundra-taiga transition zone, where temperature plays a larger role than moisture availability for shrub growth.
As we hypothesized, vegetation height was reflected in all analysed bands and indices, although to varying degrees. The utilization of further bands, frequencies, polarization combinations or indices, as well as advanced multivariate analyses and machine learning techniques, may provide further improvement. An approach with higher performance at low vegetation heights would be particularly useful.
A. Bartsch, et al. Remote Sensing of Environment 237 (2020) 111515
derived from multi-spectral data. An exponential relationship was described before for growing stock volume in forest (C-band HH) and for phytomass in tundra (for NDVI), but so far not for C-band or Tasseled Cap Wetness in tundra. The similarity of the relationship between C-VV and NDVI with vegetation height suggests that C-VV can also be used to derive tundra above-ground biomass. Our analyses demonstrate limitations of the use of NDVI (especially in Arctic wetlands) and show that the Tasseled Cap Wetness multi-spectral index and C-band data acquired in VV polarization also possess similar capabilities for mapping tundra vegetation height. Further analysis is, however, needed to quantify the suitability of such data for actual biomass retrieval in tundra. Trends from multi-spectral indices have received considerable attention for permafrost regions in the past (e.g. Nitze and Grosse, 2016; Lara et al., 2018). The demonstrated relationship of the multispectral indices with vegetation height may allow for enhanced interpretation of long-term trends in tundra. C-band SAR may provide an additional data source for these types of studies, as preceding missions such as ERS-1, ERS-2, Radarsat-1 and ENVISAT ASAR also acquired data over the Arctic. Results indicate that specifically Sentinel-1 VV can be used to derive vegetation height up to 160 cm. Specific settings (e.g. aquatic vegetation, sites dominated by barren rockfields) do, however, require a combined use with NDVI. It is therefore suggested to use Sentinel-1 VV data in a first step and to apply a masking scheme based on NDVI to the VV-based height retrievals in order to account for the above-mentioned limitations.
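The suggested two-step use of Sentinel-1 VV and NDVI can be illustrated with a minimal sketch. It is not the masking scheme applied in this study. The functional form (height as an exponential function of VV backscatter in dB), the coefficients `A_CM` and `B_PER_DB`, and the threshold `NDVI_BARREN_MAX` are all hypothetical placeholders. Only one illustrative rule is shown (low NDVI flags barren surfaces); the treatment of aquatic vegetation would need its own rule.

```python
import math
from typing import List, Optional

# Hypothetical values for illustration only; not taken from the study.
A_CM = 500.0            # assumed scale of the exponential model (cm)
B_PER_DB = 0.25         # assumed growth rate per dB of VV backscatter
NDVI_BARREN_MAX = 0.2   # assumed NDVI below which a pixel is treated as barren


def height_from_vv(vv_db: float) -> float:
    """Step 1: assumed exponential model, height = A * exp(B * sigma0_VV)."""
    return A_CM * math.exp(B_PER_DB * vv_db)


def mask_with_ndvi(heights: List[float], ndvi: List[float]) -> List[Optional[float]]:
    """Step 2: discard VV-based heights where NDVI marks the pixel as barren."""
    return [h if n >= NDVI_BARREN_MAX else None for h, n in zip(heights, ndvi)]


# Three synthetic pixels: two vegetated, one rough but barren (high VV, low NDVI).
vv = [-20.0, -12.0, -8.0]
ndvi = [0.6, 0.7, 0.05]
heights = mask_with_ndvi([height_from_vv(v) for v in vv], ndvi)
print(heights)
```

In this synthetic case the third pixel has the highest backscatter and would receive the largest VV-based height, but its low NDVI causes the retrieval to be discarded.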
Retrievals from radar information over recent fire scars also need to be treated with care, as changes in volume scattering in snow (and/or the presence of liquid water in the upper soil in early winter) may affect the relationship of backscatter with vegetation height. More in-depth analysis of this issue is needed to clarify the interrelationship between burned soils, snow structure and backscatter response.
Calibration as well as validation is affected by the type of in situ data collection. Botanical surveys following specific protocols, as carried out for large-scale vegetation community mapping in Western Siberia, provided the calibration data. Validation data from further sites additionally include surveys of different nature, purpose and height ranges, which is reflected in the results. The joint use of these datasets does, however, indicate the applicability of Sentinel-1 and Sentinel-2 data for vegetation height mapping across the Arctic. The use of Lidar data for validation and calibration may, however, provide more precise insight into retrieval errors.
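The separation of calibration and validation data can be sketched as follows, assuming an exponential model of the form h = a * exp(b * x), where x is a predictor such as NDVI. The fitting procedure (ordinary least squares on the logarithm of height) and all numbers are assumptions made for illustration; they are synthetic and do not reproduce the data or the fitting method of this study. Note that a fit in log space minimizes relative rather than absolute height errors.

```python
import math
from typing import List, Tuple


def fit_exponential(x: List[float], h: List[float]) -> Tuple[float, float]:
    """Least-squares fit of ln(h) = ln(a) + b * x; requires all h > 0."""
    n = len(x)
    y = [math.log(v) for v in h]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def rmse(predicted: List[float], observed: List[float]) -> float:
    """Root mean square error between predicted and observed heights."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


# Synthetic, noise-free calibration set generated from h = 2 * exp(5 * x).
cal_x = [0.3, 0.4, 0.5, 0.6, 0.7]
cal_h = [2.0 * math.exp(5.0 * x) for x in cal_x]
a, b = fit_exponential(cal_x, cal_h)

# Synthetic validation set from other "sites", offset by +10 cm and -10 cm.
val_x = [0.35, 0.65]
val_h = [2.0 * math.exp(5.0 * 0.35) + 10.0, 2.0 * math.exp(5.0 * 0.65) - 10.0]
val_rmse = rmse([a * math.exp(b * x) for x in val_x], val_h)
print(a, b, val_rmse)
```

Because the validation heights come from a different source than the calibration heights, the validation RMSE reflects both model error and differences between survey types, which is the effect described above.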

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.