Temporal patterns of phytoplankton phenology across high latitude lakes unveiled by long-term time series of satellite data

Monitoring temporal changes in phytoplankton dynamics in high latitude lakes is particularly timely for un- derstanding the impacts of warming on aquatic ecosystems. In this study, we analyzed 33-years of high resolution (30m) Landsat (LT) data for reconstructing seasonal patterns of chlorophyll a (chl a ) concentration in four lakes across Finland, between 60°N and 64°N. Chl a models based on LT spectral bands were calibrated using 17-years (2000–2016) of field measurements collected across the four lakes. These models were then applied for estimating chl a using the entire LT-5 and 7 archives. Approximately 630 images, from 1984 to 2017, were analyzed for each lake. The chl a seasonal patterns were characterized using phenology metrics, and the time-series of LT-based chl a estimates were used for identifying temporal shifts in the seasonal patterns of chl a concentration. Our results showed an increase in the length of phytoplankton growth season in three of the lakes. The highest increase was observed in Lake Köyliönjärvi, where the length of growth season has increased by 28days from the baseline period of 1984–1994 to 2007–2017. The increase in the length of season was mainly attributed to an earlier start of phytoplankton blooms. We further analyzed surface temperature (T s ) and precipitation data to verify if climatic factors could explain the shifts in the seasonal patterns of chl a . We found no direct relationship between T s and chl a seasonal patterns. Similarly, the phenological metrics of Ts, in particular length of season, did not show significant temporal trends. On the other hand, we identify potential links be- tween changes in precipitation patterns and the increase in the phytoplankton season length. We verified a significant increase in the rainfall contribution to the total precipitation during the autumn and winter, ac- companied by a decline in snowfall volumes. This could indicate an increasing runoff volume during the beginning of spring, contributing to an earlier onset of the phytoplankton blooms, although further assessments are needed to analyze historical streamflow values and nearby land cover data. Likewise, additional studies are needed to better understand why chl a patterns in some lakes seem to be more resilient than in others.


Introduction
Monitoring phytoplankton dynamics in inland waters is critical for understanding environmental changes in aquatic ecosystems. At the base of the food web, phytoplankton provides energy for the entire aquatic trophic system by fixing carbon through photosynthesis (Wetzel, 2001). Phytoplankton further play a fundamental role in the biogeochemical cycles with high ecological importance (Behrenfeld and Boss, 2014). Similarly, changes in phytoplankton biomass can lead to profound environmental impacts. Most freshwater systems on the planet experience undesirable increases in planktonic and benthic biomass, mostly caused by anthropogenic eutrophication (Smith et al., 2006;Vincent et al., 2004). Such changes may lead to reduced dissolved oxygen, increased fish mortality, and potential health risks due to the release of toxins (Oliver et al., 2017;Padedda et al., 2017;Rabalais et al., 2009).
In general, satellite data offers an online data possibility, and relatively cheaper way to assess changes than in-situ estimates. Moreover, it https://doi.org/10.1016/j.rse.2018.12.006 Received 26 June 2018; Received in revised form 7 November 2018; Accepted 4 December 2018 offers a possibility to understand the temporal patterns of phytoplankton blooms, which is particularly interesting for assessing the ecological status of aquatic ecosystems. Information on phytoplankton phenology can provide scientific support for policies related to water resources management and environmental impacts mitigation (Anderson, 2009;Mitrovic et al., 2011). Furthermore, understanding the phytoplankton seasonal patterns helps identifying drivers of environmental changes in aquatic ecosystems. The timing of the onset of phytoplankton blooms is mainly driven by the physicochemical conditions of the water column, such as thermal stratification and water column mixing conditions, variation in solar radiation and the extent of the possible ice cover (Adrian et al., 1999;Bleiker and Schanz, 1989;Vehmaa and Salonen, 2009). Phytoplankton blooms have been shown to be closely associated with climatic variables, the surrounding land use and nutrients flow (Gittings et al., 2018). Nonetheless, currently the temporal and spatial dynamics of phytoplankton blooms across inland waters has not yet been comprehensively studied.
Due to the phenomenon of Arctic amplification, climate change is happening faster at high-latitudes (Cohen et al., 2014;Yamanouchi, 2011). Studies indicate that the Arctic region has warmed more than twice as fast as the global average (Screen and Simmonds, 2010). High latitude lakes are thus particularly interesting for studying environmental changes in aquatic ecosystems. Furthermore, in northern countries, such as Finland, we can observe lakes with vast range of trophic states distributed along a large latitudinal gradient.
However, the impacts of climate change on the temporal dynamics of phytoplankton in lakes are not well understood and poorly monitored. Although several studies using time-series analysis of Earth observation data have been undertaken in the marine environment (Blondeau-Patissier et al., 2014;Ji et al., 2013;Racault et al., 2012), much less has been done over inland waters (e.g. Thackeray et al., 2008;Palmer et al., 2015). Many factors contribute for the lack of multi-temporal assessments of phytoplankton dynamics over lakes, including scarce information on the spatial variability of phytoplankton concentration and limited time series of field data (Oliver et al., 2017).
Remote sensing (RS) can provide a powerful tool for improving phytoplankton monitoring, given its ability to collect data over large areas and the existence of multi-decadal imagery archives. Several satellite sensors can collect data at spectral wavelengths that allow estimating chlorophyll a (chl a) concentrations in aquatic ecosystems. All phytoplankton species contain chl a as the key photosynthetic pigment, which is used as the most common proxy for monitoring phytoplankton. For instance, ocean colour satellite instruments, such as the NASA's Seaviewing Wide Field-of-view Sensor (SeaWiFS, 1997(SeaWiFS, -2010 and Moderate Resolution Imaging Spectroradiometer (MODIS, 1999-), onboard Aqua satellite, collect(ed) data at wavelengths spanning from 0.44 to 0.67 μm, supporting widely used algorithms for chl a estimates particularly in ocean water (Bailey and Werdell, 2006;Hu et al., 2012;Werdell and Bailey, 2005). More recently, in 2016, the European Space Agency launched the Ocean and Land Colour Instrument (OLCI) onboard the Sentinel-3A satellite. The OLCI collects data in 21 spectral bands, from 0.4 to 1.02 μm, allowing the development of products such as estimates of phytoplankton biomass and total suspended matter concentrations.
Nonetheless, the application of the above mentioned tools in lakes remains challenging due to the coarse spatial resolution of the longerrun missions (e.g. MODIS: 500-1000 m, MERIS 300 m) and/or the short time-series of the new generation of satellites (e.g. Sentinel-3A, Sentinel 2A and B). In northern countries (e.g. Finland, Sweden, and Norway), the complex shapes and often small size of lakes require the use of high spatial resolution sensors. However, high spatial resolution often comes at the expense of coarser temporal resolutions, requiring the acquisition of images throughout longer periods to reliably capture the seasonal patterns of chl a. In this context, the sensors onboard the Landsat (LT) satellites family presents an interesting alternative. The LT sensors provide a long and continuous time series, running from 1972 (LT-1) to the present date (LT-8). Since the launch of LT-4, in 1982, the program provides high spatial resolution data at 30 m.
Although LT sensors have relatively low spectral resolution, several studies have been successful in assessing chl a concentration using LT imagery. Vincent et al. (2004) applied LT data for mapping cyanobacterial blooms in Lake Erie. Using linear models, they were able to estimate phycocyanin concentration with accuracy up to 77%. Allan et al. (2011) used LT data for estimating chl a concentrations in central North Island lakes of New Zealand, reporting classification accuracies up to 95%. Also applying LT data, Isenstein et al. (2014) used multivariate models to retrieve chl a concentrations, reporting that models accounted for 72 to 83% of the variability in chl a observations.
Although recent studies have used LT time-series for assessing phytoplankton long-term trends in lakes (Ho et al., 2017;Tan et al., 2017), it is still broadly unknown how the temporal dynamics of lake's chl a (i.e. seasonality, phenology) have changed over time. Similarly, the temporal trends of chl a concentration in high latitude lakes, where climate change is reportedly occurring faster, remain poorly understood.
Some challenges still exist in using LT imagery for assessing chl a seasonal dynamics. The low temporal resolution of LT (e.g. 16 days for LT-5 and -7) makes it difficult for obtaining cloud free observations in some seasons of the year. In some cases it can be difficult to precisely capture the temporal characteristics of chl a blooms. In addition to the above-mentioned technical bottlenecks, the natural variability and patchy distribution of phytoplankton communities further present a challenge in detecting patterns in chl a concentration. However, as the first LT satellite was launched in 1972, we now have available an extensive archive of imagery, which combined, can provide a dense timeseries for observing seasonal patterns. Yet, the high spatial resolution of LT imagery results in relatively large file sizes, and analyzing over 30 years of data is computationally intensive. Until now this has been an important bottleneck in using high resolution RS data for time-series assessments. The dissemination of high performance computational tools for geospatial analysis, such as Google Earth Engine (Gorelick et al., 2017), has largely facilitated the usage of so called "big-data" in environmental assessments, leading to new and unprecedented opportunities for assessing long time-series of satellite data (Pekel et al., 2016).
In this study we use the extensive LT imagery archive and high performance computational tools for addressing the following objectives: a) to characterize seasonal patterns of phytoplankton phenology at high latitude Finnish lakes; b) to identify shifts or trends in the temporal dynamics of phytoplankton phenology during the past 30 years, and c) to examine the role of climatic variation on phytoplankton seasonal patterns.

Study areas and field data
We carried out analysis in four Finnish lakes: Haapajärvi, Kuortaneenjärvi, Köyliönjärvi and Tuusulanjärvi (Fig. 1). The lakes are spread across a latitudinal gradient varying from 60°to 64°north. All studied lakes are medium sized (5-25 km 2 ) with moderate to high concentrations of chromophoric dissolved organic matter, typical to boreal lakes. Haapajärvi is a shallow humic-rich lake with a maximum depth of 8.5 m. Kuortaneenjärvi is humic-rich with an average depth of 3.3 m and a maximum depth of 16 m. Köyliönjärvi is an eutrophic nutrient-rich lake with an average depth of 2.6 m and a maximum depth of 12.8 m. Tuusulanjärvi is a hypereutrophic lake, with a mean depth of 3.2 m and one main basin reaching 10 m depth. Tuusulanjärvi is naturally eutrophic, and with the predominant clay soils in the catchment the water is greyish-brown in colour.
Here we use 17-years (2000-2016) of chl a field measurements collected in each of the four lakes (see Fig. 1 for samples location). The field measurements (in-situ knowledge) were mostly collected between May and October, without a fixed interval. The number of field measurements varied between lakes, being 42 for Lake Haapajärvi, 68 for Lake Kuortaneenjärvi, 89 for Lake Köyliönjärvi and 255 for Lake Tuusulanjärvi.
All the lakes belong to the intensively monitored lakes of the national routine monitoring network of Finland. Chl a concentrations were determined from water samples taken by a tube sampler from 0 to 2 m. The concentration of chl a was measured in laboratory with a spectrophotometer after extraction with hot ethanol (ISO10260, 1992).

Landsat imagery
All imagery from the LT-5 and 7 archives, until December 2017, were analyzed in this study. All pre-processing procedures, as well as data extraction, were carried out using the high performance cloud computing tools offered by Google Earth Engine (Gorelick et al., 2017). The LT-5 collection contains images collected from Jan 1, 1984 to May 5, 2012, while LT-7 imagery were obtained from Jan 1, 1999 -December 30, 2017. Approximately 630 images were analyzed for each lake. For both, LT-5 and 7, we used the USGS Surface Reflectance Tier 2 product. This dataset contains surface reflectance imagery from the LT-5 Thematic Mapper (TM) and LT-7 Enhanced Thematic Mapper Plus (ETM+) sensors. The images contain 4 visible and near-infrared bands and 2 short-wave infrared bands. In this study, we used bands 1 (blue, 0.45-0.52 μm), 2 (green, 0.52-0.60 μm), 3 (red, 0.63-0.69 μm), 4 (nearinfrared, 0.77-0.90 μm) and 5 (shortwave infrared 1.55-1.75 μm). All bands were processed to orthorectified surface reflectance. The imagery of both, LT-5 and LT-7, have spatial resolution of 30 m, radiometric resolution of 8-bits, as well as consistent spectral resolution in the bands used for this study.
The images were atmospherically corrected using the LEDAPS (Landsat Ecosystem Disturbance Adaptive Processing System) method, providing radiometrically consistent surface reflectance data (Masek et al., 2012). We carried out quality control assessment in every image using cloud, shadow, water and snow masks produced using CFMASK (Foga et al., 2017;Zhu et al., 2015), as well as a per-pixel saturation mask. However, the CFMASK has difficulties to operate over bright targets such as snow/ice, and optically thin clouds have a higher probability of being omitted by the algorithm. Hence, to further eliminate pixels containing artifacts associated to cloud or ice, we removed outliers in the surface reflectance values extracted for each lake, using the 25th and 75th percentiles as threshold.
To extract the surface reflectance values at each site, we averaged all (quality controlled) pixels inside a 500 m radius buffer around the location where the in-situ samples were collected. To certify that only open water pixels were contained inside the 500 m buffer, we use the high-resolution Global Surface Water Data (Pekel et al., 2016) to mask out any land surface inside the buffers.

Chlorophyll a modeling
Estimates of chl a concentration based on LT imagery were obtained using linear multi-variate models. Individual models were created for each lake. The explanatory variables used for the models were the surface reflectance values from LT bands 1 to 5, and two spectral indices designed specifically for assessing chl a concentration in aquatic ecosystems. The first index is the normalized difference chlorophyll index (NDCI), developed by (Mishra and Mishra, 2012). The NDCI is calculated as follow: where ρ is the surface reflectance for the respective spectral band. It is important to note that the NDCI was initially developed using MERIS spectral bands (i.e. 660-670 nm and 704-714 nm), which differ in location and width from those of LT bands (i.e. 630-690 nm and 770-900 nm). The second spectral index, based on the red, blue and green reflectance bands, was defined as follow (Brivio et al., 2001): Although the BRG index was initially developed to be used in ocean waters, it has been successfully applied by Brivio et al. (2001) in Lake Garda, Italy, achieving an R 2 = 0.818. The same index led to satisfactory results for other small water bodies as in the Malilangwe Reservoir, Zimbabwe with R 2 = 0.81 (Dalu et al., 2015).
Given that LT bands were not originally designed for assessing chl a in inland waters, most lake studies applying LT data used empirical or semi-empirical approaches (e.g. Vincent et al., 2004;Allan et al., 2011;Isenstein et al., 2014). Generally, there is no single solution available for defining the explanatory variables of the empirical models. For instance, Tan et al. (2017) have tested simple linear models based on four different spectral indexes (band arithmetic and band ratios) and a multi-linear regression using LT bands 1, 2 and 4. They found that the best accuracies were achieved using the multi-linear regression, with an R2 of up to 0.78. Isenstein et al. (2014) also compared different combination of bands and indexes, and found the best results applying a multi-linear regression model composed of two band ratio (B2/B1 and B3/B1). Hence, given the lack of a common solution in the literature, we chose to test all the five LT bands and two indexes described in Eqs.
(1) and (2). For each lake, we tested the significance and explanatory power of the variables.
The models were firstly tested using a 5-fold cross validation, and later all samples were used to generate the final model used for prediction. The models were tested and developed using the in situ data, which were merged with the LT data using an interval of ± 8 days. That is, for each in-situ sample, we searched for LT images obtained 8 days before or after the sample acquisition. A previous study has shown that, in comparison with smaller time-windows, the 8 days interval increases the chances of matching a LT observation with the field sample, without significantly affecting the model performance (Tan et al., 2017). Our preliminary assessments have confirmed these results, showing that a time-windows below 8 days reduces significantly the number of matching observations, consequently decreasing the model performance (see Fig. S2, supplementary material). In case more than one image was available during this time-window, we considered the average of the surface reflectance values.

Climate variables
Although a full assessment of the drivers of chl a phenology goes beyond the scope of this study, we carry out preliminary assessments on key climate variables over the studied lakes. We used remotely sensed estimates of daytime and nighttime land surface temperature obtained by the Moderate resolution Imaging Spectroradiometer (MODIS), onboard the TERRA satellite. The MODIS land surface temperature has been successfully applied to estimate lake surface water temperature (Ts) (Wan et al., 2017). The product used in our study was the MOD11A2, collection 6, which offers daytime and night-time Lake Surface Temperature (LST) data stored on a 1 km sinusoidal grid as the average values of clear-sky LSTs during an 8-day period (Wang et al., 2008). MODIS Ts represents the radiometric temperature related to the thermal infrared radiation emitted from the lake surface observed by an instantaneous MODIS observation. The daytime Ts corresponds to measurements acquired around 10:30 am, while night-time Ts observations are acquired around 10:30 pm (local solar time). Analyzing both, daytime and night-time Ts, can provide information on the intradaily variability, which is relevant to assess, for instance, the periods when the lakes surface is more likely to be frozen. MODIS land surface temperature products have been validated over a broad range of representative conditions and extensively tested using comparisons with in situ values and radiance-based validation (Wan et al., 2017;Wang et al., 2008). We certified that only good quality data were used by applying filters based on the MOD11A2 quality control layer and, therefore, excluding pixels contaminated by clouds or with unreliable/ unquantifiable quality. All MOD11A2 data from March 5, 2000 to December 30, 2017 were analyzed.
Furthermore, we analyzed rainfall [kg m −2 s −1 ], snowfall [kg m −2 s −1 ] and total precipitation [kg m −2 s −1 ] data obtained from the Global Land Data Assimilation System (GLDAS), version 2.1. This dataset ingests satellite and ground-based observational data products and, using advanced land surface modeling and data assimilation techniques, it generates optimal fields of land surface states and fluxes (Rodell et al., 2004). The spatial resolution is 0.25°and the original temporal resolution is 3 h. However, to make this dataset consistent with the MODIS Ts data, we aggregate all GLDAS data into 8-day averages, using the same dates as the MODIS composites. Here, we analyzed all GLDAS data from January 1, 2000 to December 30, 2017.

Time-series analysis
This study focused on assessing chl a intra-annual patterns between April and October (warm months), given that during the remaining months (cold months) the lakes are often frozen and LT imagery have high frequency of cloud coverage. As demonstrated in Fig. S3 (supplementary material), the availability of images is significantly reduced during the cold months. Because of the low temporal resolution of the LT sensors, obtaining a dense enough time-series to reconstruct the chl a seasonal patterns requires aggregating several years of data. In this study, we used a 10-year sliding window, with 1 year incrementing steps. In other words, for the interval between years n and n + 9, we aggregated all available LT imagery to reconstruct the average chl a seasonal pattern during this period, while the next interval would comprise the data between years n + 1 and n + 10. This approach results in a temporal series with irregular steps between observations (i.e. the date of the observations are defined based on the individual observations of each year inside the sliding window), but with a higher density, allowing a more robust description of the chl a phenological patterns.
The analysis of the seasonal patterns followed a similar approach as proposed in Forkel et al. (2015), which was initially developed to assess land surface phenology and trends. The approach consisted in four main steps: i) filling of permanent gaps in the time-series (i.e. cold months), ii) time-series smoothing and interpolation, iii) detection of phenology metrics and iv) identification of temporal trends in the phenology metrics.
The first step consisted in filling the values from the cold months with a baseline value. The baseline value for each lake was defined as the minimum chl a concentration observed in the in-situ time-series. In the second step, we used the Local Polynomial Regression Fitting (LOESS) method to perform a time-series smoothing. This procedure is necessary for removing high-frequency noise and optimize the calculation of the phenological metrics. Simple linear interpolation was used to fill eventual data gaps.
In the third step, we used the resulting smoothed time-series for extracting three chl a phenology metrics: start of chl a season (SOS), end of chl a season (EOS) and length of season (LOS). We also calculated the position of chl a peak (POP) and position of chl a trough (POT), but given the large uncertainties in these two variables, we do not analyze them thoroughly in this study. The POP can be affected by remaining high frequency noise, while the POT is affected by the artificial temporal limits imposed to define the warm months and cold months. The SOS and EOS were calculated using 50% thresholds on the seasonal chl a curve (White et al., 1997), which is based on the definition of SOS and EOS as the mid-points of spring chl a bloom (equivalent to greening in land vegetation phenology) and autumn senescence, respectively (Forkel et al., 2015). Finally, after applying the phenology metrics for each 10-year window period, we evaluated if significant temporal trends could be observed. The significance of the trends were assessed using the modified version of the Mann-Kendall trend test (Mann, 1945), as proposed by Hamed and Ramachandra Rao (1998). This modified version of the Mann-Kendall trend test reduces the chances of false positives by accounting for serial correlation, often present in time-series data due to sub-sequent observations. The magnitude of the trends were assessed using the Sen's slope (Sen, 1968), which is less vulnerable to errors in comparison with least squares estimator of a regression coefficient β, as well as less sensitive to nonnormality of the parent distribution and outliers.
A similar procedure for calculating phenology metrics was applied to the Ts time-series, aiming to evaluate if any trends observed in the chl a temporal dynamics could be explained by surface temperature changes. However, in the case of Ts, the metrics were calculated for each year, from 2001 to 2017, given that the higher temporal resolution of the MODIS sensor allows a solid delineation of the Ts intra-annual variation. The Mann-Kendall trend test was also applied to evaluate trends in rainfall, snowfall and total precipitation, between 2000 and 2017. However, the seasonality metrics were not calculated for the precipitation variables, given the lack of consistent seasonal patterns in this region (as will be demonstrated later in the results).

Results
The LT models for estimating chl a, fitted individually for each lake, showed satisfactory performance. The 5-fold cross validation of the models showed Mean Absolute Errors (MAE) of 2.39 μg l −1 (Haapajärvi, n = 30), 1.43 μg l −1 (Kuortaneenjärvi, n = 47), 5.61 μg l −1 (Köyliönjärvi, n = 61) and 2.76 μg l −1 (Tuusulanjärvi, n = 121). All final models used for the predictions were statistically significant (p < 0.01), with R 2 varying from 0.37 to 0.63. The relative importance and statistical significance of each explanatory variable varied between lakes. The BRG index and the blue reflectance did not show significant influence in any of the models (p > 0.05), which is explained by the high chromophoric dissolved organic matter (CDOM) absorption, which shadows the optical signature of phytoplankton in the blue spectrum. Please refer to the supplementary material for detailed statistical summary of all models. Fig. 2 shows the seasonal pattern of chl a estimated using the LT model, in comparison with the observations for the same period. The red lines show the interpolation of in situ chl a concentrations observed from 2000 to 2016, without a regular temporal resolution. All models E.E. Maeda et al. Remote Sensing of Environment 221 (2019) 609-620 were able to capture the magnitude and seasonal variation of chl a. A perfect fit between the field samples and the satellite estimates is not expected, given the large range of fluctuations and uncertainties inherent to each acquisition level. The uncertainties and variation mainly stem from a better detection accuracy of the chl a measurements from water samples compared to satellite detected ones, differences in the spatial scale of detection, patchy distribution of phytoplankton assemblages in lakes, and the variation in the optical properties of the lakes. Furthermore, the number of satellite observations were considerably higher in comparison with field samples for the same period, which leads to slightly smoother curves for the modeled chl a dataset. For instance, for Lake Köyliönjärvi, there were 222 satellite observations between 2000 and 2016, while only 89 in situ observations during the same period. In some cases (e.g. Haapajärvi and Kuortaneenjärvi), there were no chl a field observations available before mid-May, which allowed the modeled chl a values to cover a larger period of the year. Considering the in situ values, average chl a concentration were the highest in lake Köyliönjärvi (53.8 μg l −1 ), followed by Haapajärvi (41.1 μg l −1 ), Tuusulanjärvi (29.1 μg l −1 ), and Kuortaneenjärvi (μg l −1 ). Likewise, the largest seasonal range was observed in Lake Köyliönjärvi, where we observed a 58 μg l −1 variation between the lowest and highest average chl a concentration. In comparison, the seasonal range in the lakes Kuortaneenjärvi and Tuusulanjärvi were much lower (20 and 25 μg l −1 respectively).
The results of the seasonal chl a curve fitting using the LOESS method, as well as the average phenological metrics for each lake, are presented in Fig. 3. These results show estimates for the POP, POT, SOS, EOS and LOS. Nonetheless, we hereafter focus on assessing the LOS, SOS and EOS, as these metrics can be robustly estimated in our given conditions.
The average (1984-2017) SOS was observed between June and beginning of July (DOY 154-189), with small variations between lakes. The average EOS took place around September, also with slight variations between lakes. The mean LOS in all lakes was 85 days. The smallest LOS was observed in Lake Tuusulanjärvi (83 days), and the larger in Lake Köyliönjärvi (90 days).
The analysis of the temporal trends of the chl a phenological metrics is presented in Fig. 4 and Table 1. We observed significant (p < 0.01) increasing trends in the LOS across Lakes Kuortaneenjärvi, Köyliönjärvi and Tuusulanjärvi. Comparing the average LOS from 1984 to 1994 with the period of 2007-2017, we observed an increase in LOS of 26.5 days in Kuortaneenjärvi, 28.5 days in Köyliönjärvi, and 14 days in Tuusulanjärvi (Table 1). The increasing trend in LOS was mostly explained by a decreasing trend in the SOS, while no significant trends were seen in the EOS. In Lake Kuortaneenjärvi, the SOS during the period 1984-1994 was in average observed around 16 of June (DOY 167), while during the period 2007-2017, it has shifted to 14 of May (DOY 134), that is, 32 days earlier. In Lake Köyliönjärvi, the SOS has decreased 23 days during the same period (from June 29 to June 6).
In Lake Haapajärvi, we report a decreasing, but not significant trend in the LOS. Such decrease was mostly caused by an increase in the SOS. Nonetheless, although the trend in EOS was significant considering the entire time-series, we observe that the EOS increase took place between 1984 and 2004, after which, the EOS remained stable.
To better understand the factors driving the changes in the chl a seasonal patterns observed in Fig. 5, we further examine the Ts and precipitation over the studied lakes. The seasonal patterns of Ts, between 2001 and 2017, are presented in Fig. 5 and Table 2. Both daytime and nighttime Ts showed a clear and consistent seasonal pattern, with relatively small inter-annual variability. On average, the warm season started in April and ended in October in all four lakes. The maximum Ts was generally observed during July, for both daytime and nighttime Ts. The peak of Ts did not coincide with the peak of chl a concentration in any lake, as other factors are likely to drive the timing of chl a peak (as will be discussed later).
We did not observe statistically significant trends in average Ts, or in any of the phenological metrics. The time-series of the length of warm season is presented in Fig. 6. The standard deviation of the length of warm season between 2001 and 2017 varied from 9 to 15 days in the four lakes. Hence, the seasonal patterns were shown to be consistent and stable during this period and, therefore, cannot be attributed as a cause for the changes in the chl a LOS previously reported. The intra-annual precipitation patterns for the four lakes are illustrated in Fig. 7. Although the total precipitation did not present clear seasonality, some temporal patterns could be observed in the rainfall and snowfall components. In average (2000-2017) the snowfall rate tended to zero after April, increasing again after October. The rainfall seasonality showed inverse pattern, with average peak occurring around July. Nonetheless, rainfall showed very high inter-annual variability, and its seasonal dynamics cannot be robustly characterized.
These results show that the beginning of the phytoplankton bloom generally takes place after the snowfall contribution to the total precipitation has vanished (Fig. 7). This is explained by higher runoff volumes at the end of winter, when melting snow and rainfall carry sediments and nutrients to the lakes, contributing for increasing productivity. Phytoplankton blooms end usually in September with decreasing temperature and light. Once again, multiple other factors are likely to contribute to the temporal dynamics of blooms.
The temporal trends in precipitation from 2000 to 2017 are presented in Fig. 8 and Table 3. We report a strong consistent decrease in snow rates in all lakes during winter months (Oct-Dec and Jan-Mar). With the decrease in snow rates, we see an increase in the relative rainfall contribution to the total precipitation, although rainfall did not show significant trends during winter. On the other hand, changes in rainfall were observed mostly during spring (Apr-Jun), with significant increasing rates in lake Köyliönjärvi.

Discussion
Our results indicate a tendency for increasing length of phytoplankton growth season in high latitude lakes in Finland. The magnitude of these temporal trends was, however, not consistent between lakes. To better understand these discrepancies, further studies are necessary to clarify the biophysical drivers of environmental changes across these lakes and the sources of temporal variation in their optical properties. Currently there are only few studies on the temporal trends of phytoplankton blooms in lakes. A study using long-term (1955-2003) physical, chemical and biological data from the North Basin of Windermere, UK, reported an advancing trend on spring peak biomass of two diatom taxa, Asterionella formosa and Cyclotella spp (Thackeray et al., 2008). They concluded that phytoplankton phenological shifts can be caused by local processes, as well as by climate change. While changes in Cyclotella spp. seasonal patterns were a result of earlier thermal stratification, the advancement of the Asterionella formosa spring peak was linked with both progressive nutrient enrichment and lake warming (Thackeray et al., 2008). Our results did not show evidences of a linkage between lake surface temperature and long-term trends of phytoplankton phenology. However, these results cannot discard an underlining process related to temporal changes in the thermal stratification of lakes. Moreover, challenges persist for separating climate driven changes in primary production dynamics from other anthropogenic forcing, such as changes in land-use and nutrient runoff (Moss, 2011).
Shifts in phytoplankton phenology associated with changes in the thermal stratification of lakes were reported by Winder and Schindler (2004), who investigated the effects of climatic and biotic drivers on lake processes using a historical dataset of 40 years from Lake Washington, USA. They reported that, in 2002, phytoplankton spring bloom occurred about 19 days earlier than it did in 1962. These changes were tightly linked to an increase in the thermal stratification period (by 25 days over the last 40 years), which was mainly caused by an earlier spring stratification (Winder and Schindler, 2004). In this Fig. 3. Time-series of chl a smoothing (solid black lines) and phenology metrics calculated using the average for the entire baseline period . Gray lines show the average seasonal variation of chl a, for the same baseline period, without smoothing.
E.E. Maeda et al. Remote Sensing of Environment 221 (2019) 609-620 aspect, our results confirm previous evidences indicating that shifts in lakes' phytoplankton phenology are being mostly driven by an early spring onset. A study over New England shelf, between 2003 and 2016, has shown that phytoplankton blooms in this area now occur 20 days earlier than at the start of observations (Hunter-Cevera et al., 2016). They concluded that earlier springtime warming stimulates cell division earlier each year. Nonetheless, the drivers of phytoplankton dynamics over coastal waters can differ significantly from those in high latitude lakes. An earlier onset of phytoplankton blooms and longer growing season may potentially result in higher biomass (De Senerpont Domis et al., 2013). Variation in the timing of phytoplankton blooms in lakes affects the species composition by favoring certain taxa (Sommer and Lengfellner, 2008) and further has an impact on competition within the phytoplankton community as well as on trophic interactions with other organisms. Combined with the increased nutrient loading from    1984-1994 2007-2017 1984-1994 2007-2017 1984-1994 2007-2017  anthropogenic sources, a longer phytoplankton growth season may further have an impact on the recreational values provided by lakes, with possible implications of toxic phytoplankton species on human health (Lam et al., 1995). Our results also present evidence that, at higher latitudes, the earlier spring onset is likely to suffer influence from changes in precipitation patterns. In particular, we observed a decrease in snow rates during winter, and consequent increase in the relative rainfall contribution to the total precipitation. We also show significant increase in rainfall during spring. These changes in the precipitation pattern can impact the hydrological regime in the basins draining to the lakes, strongly affecting the timing and magnitude of nutrients discharge. In fact, Thackeray et al. (2008) have shown that, over the North Basin of Windermere, UK, nutrient enrichment explained more variation in phytoplankton phenology than water temperature. Hence, more detailed studies will help to assess the complexity of the interacting factors driving phytoplankton phenology in inland waters, including land use change and geomorphological characteristics of the basins.
The discrepant results between lakes observed in our study are likely associated with different basin characteristics, such as surrounding land cover, temporal variation in nutrient runoff, and different phytoplankton assemblages. Natural bodies of water present high temporal variation in their chemical characteristics due to the runoff they receive from both anthropogenic and natural sources, as well as the temperature stratification and water mixing. While the examined lakes present different optical characteristics that challenge modeling chl a dynamics accurately, our approach provides a new insight into in examining phytoplankton phenologies in boreal lakes that could not be captured by snapshot water sampling by monitoring programmes.

Model performance and uncertainties
Studies aiming to estimate phytoplankton/Chl a concentration in lakes using LT imagery have reported a large range of model performances, with R 2 varying from 0.30 to 0.95 (Allan et al., 2011;Isenstein et al., 2014;Tan et al., 2017;Vincent et al., 2004). Some of these accuracies are higher than those obtained in our study (0.37 to 0.63), particularly considering our results for mesotrophic lakes (Tuusulanjärvi, Kuortaneenjärvi). Nonetheless, we highlight that there are fundamental differences in the design of our study. More importantly, in some of the previous studies reporting high model accuracies, models were developed to assess the spatial variability of chl a within lakes, but did not account for temporal variability (Allan et al., 2011;Vincent et al., 2004). On the other hand, the goal of our models was to estimate chl a temporal variability (i.e. how chl-a concentrations varied throughout time in a same location). This latter task is technically more challenging, given that the spectral differences between sampling dates can be quite subtle, particularly in oligotrophic and mesotrophic lakes. A recent study by Tan et al. (2017), in which LT models were created to   (2019)  The columns from left to right represent, rainfall, snowfall, total precipitation and rainfall contribution to total precipitation, respectively.

Table 3
Sen's slope estimate of the precipitation time-series presented in Fig. 8. ***p < 0.01; **p < 0.05; *p < 0. assess phytoplankton temporal variability, has reported R 2 from 0.39 to 0.7, which is consistent with our results. Including a temporal dimension in the analysis impedes making absolute estimates of the chl a concentration due to natural variation in chl a per cell. The fundamental assumption for the measurement of chl a by satellites is that signal per unit chl a is constant. However, the chlorophyll a content by cell varies between species (Strickland, 1968), light intensity (Vincent, 1980(Vincent, , 1979, and the physiological state of the cells (Babin et al., 1996). Another important factor is the patchy distribution of phytoplankton assemblages in natural waters. In boreal lakes, cyanobacterial blooms are frequent. These blooms are constituted by filamentous or colonial cyanobacteria, which form dense aggregates of different size and shape, mostly located in surface water. During large cyanobacterial blooms during summer, the mismatch of satellite observed chl a and field samples can be substantial.
We therefore emphasize that our approach may not be applicable to lakes with low chl a concentration. In oligotrophic lakes, the seasonal changes in phytoplankton biomass are considerably smaller, so that the spectral and radiometric resolution of LT5 and 7 are likely to be too coarse to capture this temporal variability. In such cases, newer high resolution sensors as the Sentinel-2 multi-spectral instrument (MSI), with 13 spectral bands and 12-bits radiometric resolution, is likely to provide more promising results in the longer run, when a more dense time-series archive becomes available. MSI includes the bands 0.646-0.684 μm and 0.695-0.714 μm, which are close to the MERIS bands 0.660-0.670 μm and 0.704-0.714 μm used in the original NDCI index (Mishra and Mishra, 2012). These MERIS bands have been shown to be optimal for the estimation of Chl a by band ratio algorithms in lakes in Finland (e.g. Kallio et al., 2005;Kallio, 2012). Chl a was estimated with good accuracy from Sentinel 2 data in Estonian lakes (Toming et al., 2016) with trophic status ranging mainly from mesotrophic to eutrophic. The algorithm was based on the height of the 705 nm reflectance peak. The new OLI sensor (Operational Land Imager), onboard LT-8, also offers important improvements in comparison with its predecessors. The LT-8 OLI sensor includes additional bands that allow a more confident assessment of data quality, and the radiometric resolution has been improved to 12-bits (in comparison to 8-bits in LT-5 and 7).
As pointed out by Oliver et al. (2017), assessments on environmental changes over lakes are often limited by temporal and spatial availability of observation data. The capability of remote sensing to overcome such bottlenecks is limited by sensors' resolution or the length of time-series. As exemplified in our study, solving these problems requires dealing with high computation costs, and a careful management of uncertanties, such as cloud contamination and the low spectral signal over fresh water bodies. Here, we used the high performance cloud computing tools offered by Google Earth Engine (Gorelick et al., 2017) to process and analyze over 30-years of high resolution satellite data. The processing involved using imagery converted from top-of-atmosphere irradiance to surface relectance using stated-of-the-art atmospheric correction (Masek et al., 2012), as well as cloud, cloud shadow, and snow/ice masking in each image, using algorithms based on decision trees (Foga et al., 2017;Zhu et al., 2015).

Conclusions
In this study we evaluated 33-years of satellite data for characterizing the phytoplankton phenology and long-term trends across four lakes in Finland. Our approach could successfully characterize the average chl a seasonal patterns in all lakes, providing a novel baseline for evaluating environmental changes. We present evidences of increasing length in phytoplankton bloom seasons in high latitude Finnish lakes, mostly caused by an earlier onset of phytoplankton growth. We report an increase up to 28 days in the length of chl a season (Lake Köyliönjärvi) over the past three decades. Nonetheless, the magnitude of changes in the chl a seasonal patterns varied between lakes, with one of the lakes showing no significant changes. The observed changes in chl a temporal patterns are unlikely to be explained by changes in surface temperature (Ts), as we could not detect significant trends on average Ts, or in the Ts phenological metrics from 2001 to 2017. However, changes in the thermal stratification of the lakes cannot be discarded. We also point out for important shifts in the precipitation patterns over the past decades that could potentially drive the observed changes in chl a seasonal patterns. In particular, our results show a significant decrease in snowfall rates during winter, with a consequent increase in the rainfall contribution to total precipitation. Finally, we suggest that further studies will decrease the uncertainties related to the biophysical factors driving the temporal patters in chl a, allowing a robust explanation for the shifts in chl a seasonality reported here. Particularly, it is crucial for climate variables to be accounted in combination with a comprehensive assessment of the hydrological characteristics of the drainage basins contributing to the lakes' influx.