Improving the snowpack monitoring in the mountainous areas of Sweden from space: a machine learning approach

Under a warming climate, an improved understanding of the water stored in snowpacks is becoming increasingly important for hydropower planning, flood risk assessment and water resource management. Due to inaccessibility and a lack of ground measurement networks, accurate quantification of snow water storage in mountainous terrains still remains a major challenge. Remote sensing can provide dynamic observations with extensive spatial coverage, and has proved a useful means to characterize snow water equivalent (SWE) at a large scale. However, current SWE products show very low quality in the mountainous areas due to very coarse spatial resolution, complex terrain, large spatial heterogeneity and deep snow. With more high-quality satellite data becoming available from the development of satellite sensors and platforms, it provides more opportunities for better estimation of snow conditions. Meanwhile, machine learning provides an important technique for handling the big data offered from remote sensing. Using the Överuman Catchment in Northern Sweden as a case study, this paper explores the potentials of machine learning for improving the estimation of mountain snow water storage using satellite observations, topographic factors, land cover information and ground SWE measurements from the spatially distributed snow survey. The results show that significantly improved SWE estimation close to the peak of snow accumulation can be achieved in the catchment using the random forest regression. This study demonstrates the potentials of machine learning for better understanding the snow water storage in mountainous areas.


Introduction
Snow is an important component of the Earth's system for regulating climatic processes (Sturm et al 2017), through its strong feedbacks related to albedo (Groisman et al 1994, Déry andBrown 2007), and plays an important role in the water cycle. Climate change is rapidly altering snow conditions in the world (Peng et al 2013, Chen et al 2015, Pulliainen et al 2020, and spatially distributed and temporally dynamic observations of snow conditions are highly valuable for water resource management, hydropower planning, flood risk assessment and climate studies (Bormann et al 2018). Globally, snowpacks and glaciers provide water for more than one-sixth of world population (Barnett et al 2005), and a better understanding of the snow amount and its distribution is therefore of large interest for the society.
Due to inaccessibility and complex topography, there is a severe lack of ground observations in remote mountainous areas, making the characterization of snow water equivalent (SWE) in mountainous terrains one of the most challenging topics in snow hydrology (Dozier et al 2016). In areas where snowmelt is the primary source for hydropower, knowledge of snow water storage can be of crucial importance for both economic (e.g. hydropower production) and environmental purposes (e.g. flood risk assessment). In Sweden, hydropower produces about 40% of national electric energy (SCB 2019). One of the largest losses for potential hydropower is the spilling water during melt periods due to inaccurate quantification of the snow storage upstream power dams. For example, in 2015, the spilling water resulted in an economic loss of roughly 1-6 million Euros in Umeälven (Gustafsson 2016), the third largest river in terms of hydropower production in Sweden.
Manual snow surveys by snow depth probings and bulk density measurements have been long used by the hydropower industry and national hydrological services. In recent decades, ground-penetrating radar (GPR) measurements have been developed and used for measuring the snow thickness over long distances (Marchand et al 2001, 2003, Lundberg et al 2010, Sundström et al 2012, Gacitúa et al 2013, Holbrook et al 2016. However, those measurements are very labor-consuming, and usually limited to specific locations. Therefore, those point/line measurements may not be representative of the snow properties beyond the locations where they are collected, especially in mountainous landscapes with large spatial heterogeneity.
Remote sensing offers great opportunities for dynamic and large-scale snow monitoring (Saberi et al 2020). Currently, the most commonly used approach to obtain large-scale SWE information is passive microwave (PM) sensors, which normally have very high temporal resolution (e.g. daily) but quite low spatial resolution (e.g. tens of kilometers). The SWE retrieval from PM sensors relies on the change of microwave radiation from Earth's surface due to snow presence (Dietz et al 2012). This inverse relationship between microwave radiation (often expressed as brightness temperature, T b ) and snow mass has been used by many PM systems for large-scale snow depth and SWE monitoring since the 1970s (Pulliainen and Hallikainen 2001, Pulliainen 2006, Kelly 2009, Derksen et al 2010, Takala et al 2011, Frei et al 2012, Hancock et al 2013, Wang et al 2019.
However, these PM derived SWE products have poor spatial resolutions, also often assuming constant values for certain snow properties (e.g. snow density). In general, the SWE retrievals from PM sensors are problematic for deep snow due to T b saturation when snow depth is above a certain threshold (e.g. 8-80 cm in Chang et al (1987)). In addition to saturating effects, another main problem is the mismatch between the spatial resolution of PM sensors and the dominating spatial scale of snow variability. In mountainous regions, the characterization of SWE from PM sensors is especially challenging due to a lack of ground truth, deep snow, coarse spatial resolution of sensors, and large spatial heterogeneity of snow distribution (Dozier et al 2016, Cai et al 2017. More information on factors that impact the snow distribution, such as topography and vegetation (Dong et al 2005, Stähli and Gustafsson 2006, Veitinger et al 2014, should be incorporated to improve the mountain SWE retrieval from satellites.
Machine learning techniques are able to reproduce nonlinear effects and interactions among variables and are also robust to overfitting (Hastie et al 2009, Alpaydin 2020. Recently, there has been a growing trend to use machine learning techniques in remote sensing applications (Holloway and Mengersen 2018), and several recent snow works have started to explore the potentials of machine learning for improving SWE estimation from PM sensors (Bair et al 2018, Xiao et al 2018, Kwon et al 2019, Revuelto et al 2020, Yang et al 2020. However, works using machine learning for estimating the snow water storage in Sweden, especially in mountainous areas, are totally missing. This study focused on the Överuman Catchment of Northern Sweden. Based on continuous snow depth measurements at the weather station in the catchment (2012-2019), the capabilities of multi-frequency microwave T b for describing snow depth were explored and the effects of snowmelt on the performances were investigated. Then, based on 2019 GPR snow surveys, a machine learning algorithm was developed to estimate the SWE distribution around the peak of snow accumulation in the catchment, by integrating satellite observations, topography and forest cover information. The performance of the machine learning algorithm for SWE estimation was also compared to that of a traditional multiple linear regression algorithm.

Study area
Our study area, the Överuman Catchment, is located in Northern Sweden along the border between Sweden and Norway (latitude: 65.9-66.2 • N, longitude: 14.4-15.4 • E). The Överuman Lake is the head water source for Umeälven River. This catchment covers an area of 652 km 2 , with the elevation ranging from 524 to 1575 m.a.s.l. The land cover is dominated by mountain tundra terrain (56%) and forest (31%, mostly birch). In this catchment, very limited ground measurements exist. As shown in figure 1, there is only one snow depth measuring station (Mjölkbäcken) located within the catchment. To complement ground measurements, typically one field campaign close to the peak of snow accumulation is carried out every year to obtain snow information based on GPR (along snow transects) and manual measurements (at control points along snow transects).

Data
AMSR2 onboard the Japanese GCOM-W1 satellite (launched in 2012) provides measurements of microwave radiation for multiple frequencies (6.9/7.3/10.65/18.7/23.8/36.5/89.0 GHz) and both polarizations (horizontal and vertical). The data used here is the AMSR2 Level 3 daily brightness temperature (T b ) data product at 0.1 degree, which was generated by Japan Aerospace Exploration Agency (JAXA). More detailed information on AMSR2 L3 product can be found in JAXA (2016). For the Mjölkbäcken station, the AMSR2 data for all frequencies (descending orbit) during the 2012-2019 snow seasons were used, and the satellite signal most sensitive to snow amount in this area was selected by exploring the capabilities of all single-frequency T b and all combinations of dual-frequency T b difference (∆T b ), which is detailed in section 3.2.1. For the catchment-wide SWE mapping during 2019 field campaign period, the spatial image of the snow-sensitive frequency as identified based on Mjölkbäcken was used.

• DEM
The DEM used in this study is the 2 m spatial resolution (grid2+ product) provided by the Swedish mapping, cadastral and land registration authority (Lantmäteriet). It is obtained by a linear interpolation of a triangulated irregular network based on airborne lidar surveys of the terrain surface. The grid2+ product uses the projected coordinate system (SWEREF99TM) with the national altitude system (RH2000).
• Forest cover Information about the forest cover is provided by Lantmäteriet in vector format. This product does not distinguish between different forest types, and was converted into a 2 m resolution binary raster map. The binary forest cover map was further converted into a forest cover fraction layer as the model input for SWE estimation.

Ground snow measurements
Snow depth data collected at the Mjölkbäcken station were measured at roughly bi-weekly intervals by the Swedish Meteorological and Hydrological Institute (SMHI), and the snow depth measurements during the 2012-2019 snow seasons were used in this study.
In addition to station-based measurements, one field campaign in the Överuman Catchment was carried out in 26-28 March 2019. The following ground snow measurements were performed: (a) GPR snow depth measurements along six designed snow transects (b) Manual snow depth and bulk density measurements at approximately 1 km intervals along the transects. Detailed information on GPR measurements, data interpretation and data quality can be found in the supplementary material (available online at stacks.iop.org/ERL/16/084007/mmedia).

Meteorological observations
In addition to snow depth, daily precipitation is the only meteorological parameter recorded at the Mjölkbäcken station. In absence of air temperature records at this station, the daily maximum temperature from the nearest weather station close by the Hemavan airport (data also available from SMHI) was used to calculate the number of melt days in Mjölkbäcken.

Brightness temperature (T b ) for describing snow depth
AMSR2 dataset provides T b for multiple frequencies.
Despite certain frequencies are more commonly used for depicting snow amount (e.g. T b at 36.5 GHz), the sensitivity of different AMSR2 frequencies might vary across space in relation to local snow conditions. In this work, we explored the capabilities of all singlechannel T b and dual-channel T b difference (∆T b ) of any two frequencies for describing snow depth based on the 2012-2019 snow seasons, aiming to identify the AMSR2 signal most informative to regional snow water information in the catchment. The criteria used to evaluate the strength is the Pearson's correlation coefficient (r), where n is the number of observations, x indicates ∆T b (or T b ), and y indicates snow depth.

Spatial aggregation and downscaling
The spatial data covering the catchment (e.g. DEM, forest cover, satellite T b ) have different resolutions. To keep consistency and also consider the dominant scale of spatial variability for snow distribution, all spatial data were rescaled to 500 m resolution in this study. Forest cover was aggregated from 2 m binary data into 500 m fractional data. DEM was aggregated into 500 m resolution and also used for deriving the terrain parameters (slope and aspect). Point-based GPR derived SWE along the snow transects was gridded to 500 m resolution to match the same spatial coverage as DEM and forest cover, using the mean values of GPR measurement points located within each pixel. Coarse-resolution AMSR2 T b was also resampled from the original 0.1 degree to 500 m resolution using the nearest-neighbor interpolation, similar to Wang et al (2019). Those rescaled spatial data were further used for the spatial mapping of SWE in the catchment.

Statistical and machine learning models
In this study, both a machine learning (random forest regression) and a traditional statistical (multiple linear regression) model were used.
• Random forest regression Random forest regression is an ensemble machine learning technique based on multiple decision trees to get more accurate and stable predictions (Breiman 2001, Segal 2004, Hastie et al 2009. Compared with other machine learning methods, random forest regression is selected because of its high interpretability and ability in reducing overfitting, measuring feature importance, and dealing with smaller sample sizes. Key parameters for setting up a random forest regression model include predictor variables, target variables and sample train-test split. In this study, a SWE estimation model from random forest regression was developed using the GPR measurements during the 2019 field campaign as ground truth samples. For model development, the samples were randomly divided into training and validation datasets (80% for training and 20% for validation). In the developed model, predictors include both static (e.g. elevation, slope, aspect, forest cover fraction) and dynamic (satellite T b ) variables at 500 m resolution; the target variable is 500 m resolution SWE aggregated from GPR point measurements.

• Multiple linear regression
Multiple linear regression is the statistical technique to predict the value of dependent variable using several predictor variables. In this study, in addition to the random forest regression, SWE at any point in space (y i ) is also estimated as a linear function of predictors x 1−p : where, i = n observations, y i = SWE, x p = the same predictors (both static and dynamic variables) as used in the random forest regression, and β 0 = intercept, β p = coefficients for each predictor, ϵ = residuals, which are solved by least square method.

Diagram of the methodology
For illustration, the diagram of the methodology is shown in figure 2.

Time-series AMSR2 T b for characterizing snow depth in Mjölkbäcken
Continuous snow depth measurements from the Mjölkbäcken station were used to explore the strength of time-series AMSR2 T b for estimating snow depth. As mentioned earlier, around 37 GHz is one of the mostly commonly used frequencies for depicting snow mass. In this section, AMSR2 T b at 36.5 GHz was used, and the time series of AMSR2 T b values (36.5 GHz, vertical) for the grid where Mjölkbäcken is located and the snow depth in Mjölkbäcken are plotted during 2012-2019 (figure 3).
The comparisons between snow depth and T b showed a very strong annual cycle, with a decline of   figure 4(a), T b and snow depth during the past seven snow seasons showed a quite scattered relationship, especially for the later months of the snow seasons with significant snowmelt, such as April and May ( figure 4(b)). However, it was noticed that occasional snowmelt might occur even earlier in the snow season. To quantitatively explore the impact of snowmelt on the performance, instead of simply splitting the snow season into accumulation and melt periods, we used the daily maximum temperature from the nearest weather station to calculate the number of melt days in-between two snow depth measurements (as snow depth was mostly measured bi-weekly about every 15 days, an equal split of the melt days number was used for describing three melting categories) and then explored the relationships for each snowmelt category (figure 4(c)). As seen from figure 4(d), the performance of T b for characterizing snow depth in the catchment is significantly better in periods with limited snowmelt.

Capability of multi-frequency AMSR2 T b for characterizing snow depth
Section 4.1 explored the ability of T b at 36.5 GHz (vertical) for characterizing the snow depth in the catchment. However, the sensitivity of multi-frequency AMSR2 T b to snow depth might vary with local snow conditions. In this section, the correlation between snow depth and each AMSR2 T b (or ∆T b ) was investigated to explore the capabilities of all single-channel T b and dual-channel ∆T b of any two frequencies for characterizing snow depth (figure 5).
As in figure 5, the different correlations between snow depth and ∆T b (or T b ) clearly show the varying capabilities of different AMSR2 frequencies for describing snow depth in the catchment. Consistent with section 4.1, when snowmelt is limited (figure 5, lower), higher correlations between snow depth and ∆T b (or T b ) were achieved for all combinations, indicating its better ability for describing snow depth during the melt-limited period in Mjölkbäcken. Among all frequencies, higher-frequencies at 18.7 GHz, 23.8 GHz, 36.5 GHz are better correlated with snow depth, with the best correlation for 36.5 GHz (very similar but slightly better correlation for the horizontal polarization). While combing two frequencies together, the performances of ∆T b did improve for most cases as compared to T b itself, although the improvements are almost negligible. For model simplicity, only the single frequency at 36.5 GHz (horizontal) was further used for estimating the spatial distribution of SWE around the snow peak season in section 4.4.

Snow depth and SWE distribution in the Överuman Catchment from GPR
At the end of March 2019, the Överuman Catchment was covered by deep snow with large variation, with the snow depth ranging from 0 to about 8 m (coefficient of variation (CV): 75.9%) and SWE ranging from 0 to about 3784 mm (CV: 89.6%) (table 1). Of Figure 5. Correlation between ∆T b (or T b ) and snow depth at different band combinations for all snow observations (upper) and snow observations with limited melt (lower, snowmelt days in-between snow depth measurements <5) in Mjölkbäcken. Notes: the value in each grid indicates the correlation between snow depth and ∆T b of two frequencies at x-axis and y-axis (at the diagonal, the value indicates the correlation between snow depth and T b itself for the specific single frequency).  all six snow transects, Transect 2 showed the highest average snow depth/SWE, followed by Transect 8, 6, 1, 4, 5 for snow depth and Transect 8, 6, 4, 1, 5 for SWE. The largest variation for snow depth/SWE occurred for Transect 6, followed by Transect 5, 8, 4, 2, 1, and the deepest snow was discovered in Transect 6. To show the spatial distribution of snow depth and SWE, the GPR measurements along the snow transects (aggregated to 500 m spatial resolution) are displayed in figure 6.

AMSR2 for characterizing the spatial distribution of around-peak SWE in the Överuman Catchment
In this section, a random forest regression model for SWE estimation was developed using the snow sensitive satellite signal (AMSR2 T b at 36.5 GHz) as derived in section 4.2 and ancillary parameters (elevation, slope, aspect, forest cover). The developed random forest regression model was then applied to the entire catchment for per-pixel SWE estimation. For comparison, a multiple linear regression model using the same input parameters was also developed in parallel for spatial SWE estimation. As mentioned in section 3.1.2, the GPR measurements were carried out in 26-28 March 2019 with the most measurements done on 27 March, and therefore 27 March was selected to represent the snow conditions during the campaign period. The spatial distribution of SWE in the catchment during the 2019 field campaign period and the relationship between predicted and measured SWE for the validation dataset are shown in figures 7 and 8. As from figure 8, the random forest regression (upper: R 2 0.63, Root Mean Square Error (RMSE) 164.92 mm) shows much improved SWE estimation as compared to the traditional multiple linear regression (lower: R 2 0.37, RMSE 216.57 mm), in terms of both R 2 and RMSE (much higher R 2 and lower RMSE). In addition, the random forest regression can describe SWE at a much larger range (figure 7), which is more consistent with the ground truth based on the field campaign (figure 6) and demonstrates its better ability for deep snow characterization.

Conclusions and implications
This paper explored the potentials of AMSR2 T b for characterizing the snow depth/SWE in the data-sparse mountainous regions of Northern Sweden. The results have demonstrated that, despite challenges, AMSR2 T b provides important information on snow accumulation in the Överuman Catchment. The main conclusions of this work are as follows: • AMSR2 T b time series at different frequencies show varying ability of characterizing snow amount in the catchment. Better correlations are found at higher frequencies of 18.7 GHz, 23.8 GHz, 36.5 GHz, with the best correlation at 36.5 GHz.
The ∆T b of two frequencies does not provide too much add-on values as compared to single frequency in this case.
• The ability of T b for characterizing snow depth is significantly improved when there is limited snowmelt. • By integrating AMSR2 T b , topographic factors and forest cover information, the random forest regression shows significantly improved estimation of SWE as compared to the traditional multiple regression, especially for depicting deep snow. • The machine learning algorithm, such as the random forest regression in this study, shows promising results for improving the SWE estimation in terms of both spatial resolution and accuracy, providing great potentials for regional applications, especially in areas with large spatial heterogeneity (e.g. mountain areas).
This study has reaffirmed the influences of topography on snow distribution, which have been studied in many previous works (Trujillo et  . It was also clearly noticed from those studies that the topographic influences demonstrate large spatial and temporal variability related to local snow conditions, and there is not a universal pattern regarding what topographic factors are relevant and how those factors contribute to the snow distribution. This has posed significant challenges to investigate the mountain snow processes based on knowledge from existing studies. In this work, we use a data-driven machine learning algorithm (random forest regression) to reproduce the nonlinear interactions among variables governing the snow distribution and automatically find out the otherwise hidden factors for predicting the snow distribution. In addition, as compared to traditional statistical SWE estimation, the random forest regression also shows improved prediction ability, making it possible for developing a robust and generalized SWE estimation algorithm across space and time. This offers a new perspective for applications at a large scale and in complex environments such as mountains. Despite significant potentials for improving SWE characterization, machine learning for mountain snow applications is still at its very infancy, mostly limited to exploratory studies in Alps and Pyrenees (López-Moreno et al 2017, Revuelto et al 2020) and in North America (Mccreight et al 2012), and missing in Nordic regions such as Sweden. This work intends to bring researchers' attention on using machine learning approach for improving the mountain snow characterization in Nordic countries.
However, there still exist uncertainties and challenges. Due to an insufficient ground measurement network in the study area, the manual and GPR snow measurements during the field campaign were used as the primary ground truth. We should note that these measurements were done based on a set of measured points/transects, making it difficult to fully describe the spatial heterogeneity in the complex mountain terrains. To complement those ground measurements, the use of unmanned aerial vehicle for mapping snow depth (de Michele et al 2016, Harder et al 2016, Avanzi et al 2018 can provide larger spatial coverage and better reveal the spatial heterogeneity, thus helping determine the optimal resolution and optimal density of snow surveys for mountain snow characterization. Currently, the field campaign is carried out close to the peak of snow accumulation, without continuous monitoring during the entire snow season. The installation of automatic weather stations and repeated aerial drone surveys for continuous monitoring will provide more perspectives on the evolution of snow conditions, help reveal hidden factors potentially impacting the snow distribution, and thus offer inputs for building a more robust model. Another challenge is related with the number of training samples used for machine learning. Despite random forest regressions do not require so many samples as compared to other machine learning algorithms, there is still a limited number of input samples currently existing in the Överuman Catchment and in mountain environments in general, and continued monitoring is warranted.
In this work, we use the Överuman Catchment as a pilot study, which can provide guidelines for applications in other catchments. Future efforts can be extended to other catchments to establish a distributed observation network across Sweden and thus help derive a good estimate of national snow water storage for efficient and sustainable hydropower production. With more ground and satellite observations becoming available in the future, this will also provide opportunities for exploring other machine learning and deep learning algorithms.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.