Drone data reveal heterogeneity in tundra greenness and phenology not captured by satellites

Data across scales are required to monitor ecosystem responses to rapid warming in the Arctic and to interpret tundra greening trends. Here, we tested the correspondence among satellite- and drone-derived seasonal change in tundra greenness to identify optimal spatial scales for vegetation monitoring on Qikiqtaruk—Herschel Island in the Yukon Territory, Canada. We combined time-series of the Normalised Difference Vegetation Index (NDVI) from multispectral drone imagery and satellite data (Sentinel-2, Landsat 8 and MODIS) with ground-based observations for two growing seasons (2016 and 2017). We found high cross-season correspondence in plot mean greenness (drone-satellite Spearman’s ρ 0.67–0.87) and pixel-by-pixel greenness (drone-satellite R 2 0.58–0.69) for eight one-hectare plots, with drones capturing lower NDVI values relative to the satellites. We identified a plateau in the spatial variation of tundra greenness at distances of around half a metre in the plots, suggesting that these grain sizes are optimal for monitoring such variation in the two most common vegetation types on the island. We further observed a notable loss of seasonal variation in the spatial heterogeneity of landscape greenness (46.2%–63.9%) when aggregating from ultra-fine-grain drone pixels (approx. 0.05 m) to the size of medium-grain satellite pixels (10–30 m). Finally, seasonal changes in drone-derived greenness were highly correlated with measurements of leaf-growth in the ground-validation plots (mean Spearman’s ρ 0.70). These findings indicate that multispectral drone measurements can capture temporal plant growth dynamics across tundra landscapes. Overall, our results demonstrate that novel technologies such as drone platforms and compact multispectral sensors allow us to study ecological systems at previously inaccessible scales and fill gaps in our understanding of tundra ecosystem processes. Capturing fine-scale variation across tundra landscapes will improve predictions of the ecological impacts and climate feedbacks of environmental change in the Arctic.


Introduction
Identifying the scales at which ecological processes operate is a fundamental, yet often neglected element of ecological research [1][2][3]. Cross-scale ecological information can inform our understanding of the causes and consequences of global change [2]. In tundra ecosystems, vegetation responses triggered by rapid Arctic warming could influence ecosystem functions through altered carbon and nutrient cycles with potential feedbacks to the global climate system [4][5][6][7][8]. Yet, challenging logistics have limited the extent of field-based observations in Arctic ecosystems [9][10][11]. The grain sizes of global-extent satellite products (tens of meters to kilometres) are too coarse to capture the fine-scale dynamics of tundra plants

The scale discrepancy problem in Arctic greening
A major problem in linking satellite-derived trends of spectral greenness and phenology to in situ observations of ecological processes in the tundra is the discrepancy in observational scales [13,29,61,72,74]. Satellite datasets with long-term records are limited by their moderate-to coarse-grain sizes, ranging from 30 m (Landsat) to 250 m (MODIS) and 8 km (AVHRR-GIMMS3g). In situ ecological monitoring in the Arctic is logistically challenging and therefore restricted in extent to a limited number of sites and often metre-squared plots [10,75]. Only a few studies have linked on-the-ground vegetation or phenology change to satellite trends in NDVI in Arctic tundra [13,14,47,48,53,[76][77][78]. However, drones equipped with compact sensors now allow for the collection of ultra-fine-grain multispectral imagery at landscape extents that can potentially bridge the scale-gap between satellite and ground-based observations [14,[79][80][81][82].

Novel drone data to study variation in greenness
Here, we set out to test whether drones can be used to identify the key ecological scales for studying tundra greenness on Qikiqtaruk in the Canadian Arctic by bridging the scale gap between satellite and in situ data. First, we tested whether satellite-and dronederived measures of mean landscape-scale greenness (NDVI) agree across two growing seasons while controlling for the potentially confounding effects of topography and land cover. Second, we identified the key spatial scales for ecological variation in landscape greenness within the two most common vegetation types at our study site using variograms. Third, we tested how the magnitude of seasonal variation in tundra greenness scales across grain sizes from fineresolution drone imagery to medium-grain satellite imagery. Finally, we assessed whether drone-derived NDVI corresponds with on-the-ground measures of within growing season change in plant growth based on methods frequently used by long-term, field-based monitoring networks. Thus, in our analysis, we validated satellite-derived landscape estimates of vegetation greenness with ultra-fine-grain drone data and described spatial and temporal variation in tundra productivity at landscape extents (1-100 ha) with grain sizes that were previously not accessible.

Site description: Qikiqtaruk-Herschel Island
Qikiqtaruk (69.57 N, 138.91 W) is located in the Beaufort Sea along the coastline of the North Slope of the Yukon Territory, Canada. The vegetation is the moist acidic shrub tundra [83] characteristically found in the Western Arctic regions of North America, which has experienced strong spectral greening in recent decades [13]. The two most common plant communities on the island are the tussock sedge ('Herschel') and Dryas-vetch ('Komakuk') vegetation types [84,85]. The tussock sedge vegetation is dominated by the name-giving tussock sedge Eriophorum vaginatum L. with varying cover of Salix pulchra Cham. The top-soils of the island are underlain by ice-rich permafrost and undergo frequent disturbance [85]. The Dryas-vetch vegetation is particularly found on ground disturbed by soil creep and is characterised by the near ubiquitous presence of Dryas integrifolia Vahl., the willow Salix arctica Phall., various grass species including Arctagrostis latifolia. (R.Br.) Griseb. and forb species [86]. The relative abundances of these species are shown in (figure S1, which is available online at (https://stacks.iop.org/ERL/15/125002/mmedia)). Though the two vegetation types are specific to the region, these plant communities would group with tundra types S1, W2 and G3/4 of the Circumpolar Arctic Vegetation Map [87].
We established four study areas on the east end of the island, each with two co-located one-hectare plots in the two key vegetation types (figure 1, table S1). We selected plots with homogenous terrain and land cover to represent the two key vegetation types and to control for the potentially confounding effects of terrain and cover heterogeneity. The island harbours small herds of caribou (100s of individuals) and muskox (3-35 individuals in recent years) of fluctuating size, as well as cyclic populations of voles and lemmings [88]. We estimate the overall impact of herbivory on the vegetation in our study plots to be low particularly in 2016 and 2017 when there were few muskox on the island.

Multispectral drone time-series
We analysed a total of 62 drone surveys from 21 dates; see table S2 for a breakdown by one-hectare monitoring plots. We collected multispectral drone imagery using Parrot Sequoia (Paris, France) compact multispectral sensors mounted on multi-rotor drone platforms in June to August in 2016 and 2017. We used three different drone platforms: a Tarot 680 Pro hexacopter with camera sensor stabilisation in 2016, and a 3DR Iris + and a DJI Phantom 4 Pro without sensor stabilisation in 2017. Surveys were flown using parallel flight lines (a lawn-mower flight pattern) at an altitude of ca. 50 m, giving ground-sampling distances of 0.04 m to 0.06 m. Images were acquired with 75% front-and side-lap as close as possible to solar noon (mean absolute difference to solar noon 2.16 h, maximum 6-7 h). See table S2 and the methods section of the supplementary materials for further details on the drone surveys, including additional information on radiometric calibration, as well as temporal and spatial coverage. Drone data and code to carry out all analyses are available via a GitHub repository [91].
We processed the Sequoia imagery using Pix4D Mapper v4.0.21 (Lausanne, Switzerland) with the agMultispectral template and the 'merge map tiles' option set to true to generate co-registered singleband surface reflectance maps. Radiometric calibration was carried out in Pix4D Mapper using pre-or post-flight imagery of calibrated reflectance panels; in 2016 we used a MicaSense (Seattle, USA) panel and in 2017 a SphereOptics (Herrsching, Germany) Zenith Lite panel. We measured panel reflectance pre-and post-season and used the mean values for radiometric calibration. We also calibrated for sensor properties and sun irradiance measured by the incident light sensor. We used four to six ground control points per survey that were precisely geolocated with a GNSS system to spatially constrain the reconstructions in Pix4D Mapper with an estimated accuracy of 1-2 pixels between bands and 2-6 pixels between surveys [81]. We calculated the Sequoia NDVI as the normalised difference between the near-infrared (770 nm-810 nm) and red (640 nm-680 nm) bands of sensor.

Satellite time-series
Satellite time-series were obtained from three different satellite sensors: 1) the Moderate Resolution Imaging Spectroradiometer (MODIS) on the USGS Terra satellite, 2) the Multispectral Instrument (MSI) on Sentinel-2 A & B and 3) the Operational Land Imager (OLI) on Landsat 8.
We obtained MODIS NDVI values for the time period from the 1st May to the 30th of September in 2016 and 2017 for all 250 m MODIS pixels that contained the survey plots. NDVI values were retrieved from the 16-day MOD13Q1 v6 Terra product [92] using the Google Earth Engine [93]. We discarded all values with a 'Summary QA' score of −1 (no data) or 3 (cloudy). table S3 lists the resulting MODISpixel-date pairs. The MODIS NDVI is calculated as the normalised difference between bands 1 (841 nm-876 nm) and band 2 (620 nm-670 nm).
For the Sentinel-2 time-series, we gathered all Sentinel-2 MSI L1C scenes containing the tile covering Qikiqtaruk (T07WET) that were available on the Copernicus Open Access Hub (https://scihub.copernicus.eu/) for the same time period as the MODIS pixels. We processed all scenes to L2A using Sen2Cor 2.4.0 [94], retained all bands with 10 m resolution (2, 3, 4 and 8), applied the cloud mask and generated a true-colour image. We inspected all scenes visually and discarded all imagery with cloud contamination over the study area (78% of Figure 1. Drone-data captured the temporal variation in satellite data across vegetation communities, areas and years. This figure showcases variation in mean landscape greenness (NDVI) across the eight one-hectare sampling plots on Qikiqtaruk as derived from drone orthomosaics and the MODIS Vegetation Index (MOD13Q1.v006 Terra), Landsat 8 Level 2 and Sentinel-2 Level-2 A products. Vertical dotted grey lines represent the average snow-melt at long-term monitoring plots close to Area 3-Hawk Valley for the given year [88]. Dashed grey lines represent simple quadratic phenology curves (NDVI ∼ a x 2 + b x + c, where x is the day of year, a the quadratic coefficient, b the linear coefficient and c the y-axis intercept) fitted to all data points pooled across sensors. The lower central panel demonstrates the close correspondence between seven-day mean values from drone and satellite NDVI, albeit with a positive offset for all satellite sensors. For this panel, drone NDVI values were spatially aggregated by mean to the one-hectare plots and temporally aggregated by mean in consecutive seven-day blocks starting on the first of May in both growing seasons (2016 and 2017) where data was available. Matching seven-day block pairs between drone and satellite platforms were then identified and plotted as shown. Spearman's rank correlation as well as mean differences (offsets) in NDVI amongst all platform combinations can be found in tables S12 and S13 respectively. The grey dashed line in this panel represents the one-to-one line. Map sources: North America [89,90] in latitude and longitude on the WGS84 reference ellipsoid and Qikiqtaruk, Copernicus Sentinel-2 true colour image july 2017 in UTM 7 N based on the WGS84 reference ellipsoid.
scenes for 2016 and 74% of scenes for 2017). The resulting set contained nine cloud-free Sentinel-2 L2A scenes of the study area from 2016 and 15 scenes from 2017 (table S4). Finally, the Sentinel NDVI was calculated as the normalised difference between band 8 (784.5 nm-899.5 nm) and band 4 (650 nm-680 nm).
Landsat 8 OLI Level-2 (surface reflectance) timeseries were obtained using the USGS EarthExplorer website (https://earthexplorer.usgs.gov/) by querying the search engine for all scenes that covered the study site during the same time-period as the MODIS pixels (n = 94). The automatically generated cloud masks were of insufficient quality, so we manually inspected all scenes and retained only the scenes cloud-free over the study site (n = 7 for 2016, n = 8 for 2017, table S5). The Landsat 8 NDVI was then calculated as the normalised difference between band 5 (845-885 nm) and band 4 (630-680 nm). The study plots were not designed with a Landsat 8 analysis in mind and did not naturally coincide with the Landsat grid. We therefore calculated subsequent one-hectare plot NDVI averages as a weighted mean, where each pixel was weighted by the proportion of the plot area covered by the pixel.

Ground-based plant phenology measurements
We carried out ground-based phenology monitoring in eight 2 m × 2 m plots (table S6), one adjacent to each one-hectare plot (mean distance = 23 m, max distance = 52 m). We placed the ground-based monitoring plots adjacent to the drone-based survey plots to minimise the effects of ecological disturbance and trampling in the drone survey plots caused by the repeat visits necessary for the ground-based monitoring. Within these plots, we monitored six individual plants from the most common species: E. vaginatum, D. integrifolia, S. pulchra and A. latifolia in tussock sedge tundra; D. integrifolia, S. arctica and A. latifolia in Dryas-vetch tundra. On each survey date, we measured the length of the longest leaf on each individual to the nearest millimetre. This approach is widely used in field-based phenology monitoring protocols [95], and will allow for NDVI to be directly related to phenological changes in plant traits. We conducted the ground-based surveys in tandem with the dronebased surveys where logistical possible, resulting in a dataset of 52 drone and ground survey pairs spread over 20 different dates. The majority of ground-based phenology surveys were carried out on the same day as the drone surveys (mean difference = 0.3 d, maximum difference = 3 d, table S7).

Cross-sensor correspondence
To test cross-sensor correspondence, we first had to scale all datasets to a shared spatial grain and time-window. To achieve this, we first plotted the spatial mean NDVI for all one-hectare plots, timepoints and available sensors (MODIS = single pixel, Landsat 8 = weighted mean) across both growing seasons (2016 and 2017). We then divided the two growing seasons into 22 consecutive sevenday blocks starting on the 1st of May each year. Next, we calculated the temporal mean of the spatial mean NDVI for each seven-day block for all plot and sensor combinations where data was available. We then identified all matching seven-day block and study plot combinations for each drone-satellite and satellite-satellite combination. We then tested cross-sensor correspondence by calculating Spearman's rank correlation and mean sensor-to-sensor difference in the plot means across the whole data set.
Additionally, we matched all drone and Sentinel-2 scenes, as well as all drone and Landsat 8 scenes that were less than two days apart. We resampled the red and near-infrared drone bands to the relevant Sentinel-2/Landsat 8 grids and calculated the NDVI. We restricted the analysis to Landsat 8 pixels fully contained within the study plots and reprojected the drone data from UTM 7 N to UTM 8 N using a bilinear reprojection where the Landsat 8 scenes were provided in this projection. Finally, we tested the predictive relationship between the resampled drone and satellite NDVI pixel-pairs for a random subsample of Sentinel pixels (10% of total, n = 700) and all available Landsat 8 pixels (n = 198) with Bayesian linear models (tables S8 and S9 for Sentinel-2, S10 and S11 for Landsat 8) using the MCMCglmm v.2.29 package [96].
We used the 'resample' function of the R raster package v. 3.0-12 [97] for resampling from finer to coarser resolutions. The function first aggregates the smaller grid to the largest clean divisor of the larger grid using the mean and then, if required, resamples to the larger grid using bilinear interpolation. We also tested an alternative resampling approach by first resampling to a common resolution and grid of 0.5 m and then aggregating by mean, but found no qualitative differences in our results ( figure S2). Further details about software and package versions used for raster manipulations and visualisations can be found in the supplementary materials.

Spatial autocorrelation
To assess the spatial autocorrelation of variation in tundra greenness within the eight plots, we sampled variograms and fitted variogram models using the gstat v. 2.0-5 package [98,99]. First, we pre-thinned the acquired drone-data by randomly sampling 5% of the ca. 4 million pixels of each orthomosaic. We then sampled variograms for all plots at the peak of the 2017 season (26 and 28 July) and fitted variogram models, letting the gstat algorithm select the best fit amongst spherical, exponential and Matern models. The only exception was Area 3 for which the closest available complete set of flights was on the 18 July 2017. To test conformity of the variograms across the season, we repeated the analysis for the surveys from the 26 June and 9 August 2017 for Area 1 and 2. No change in the variogram patterns were observed across the 2017 season and we therefore assume that our analysis is representative for the 2016 season also. All variograms were sampled with a bin width of 0.15 m from 0 to 15 m and a bin width of 3 m from 0 to 45 m.

Grain size and phenology
We tested the influence of grain-size on observations of tundra greenness phenology by fitting simplified growing season curves to the raster stacks for each plot and season. We first resampled the drone bands for all time-points to grids with grain sizes of 0.5, 1, 5, 10, 20 and 33.33 m. We then calculated the NDVI and fitted simple quadratic models to each pixel in the growing season stacks (y = ax 2 + bx + c, where x is the day of year and y the pixel NDVI, a the quadratic coefficient, b the linear coefficient and c the constant term). We found a strong negative correlation between the quadratic and linear coefficients of the models (figure S6), and therefore selected only the quadratic coefficient for further analysis. Additional details on model choice and analysis can be found in the method section of the supplementary materials.

Ground validation
To test the correspondence between our groundbased phenology measurements and the drone observations, we derived time-series of the plot mean standardised longest leaf length (hereafter mean longest leaf length) for all species (using a z-scorecentred data with a standard deviation of 1) and the drone-greenness for each 2 m × 2 m ground-based monitoring plot. See supplementary methods for details on how the leaf measurements were standardised. The drone-based plot mean NDVI values were then matched with the plot mean longest leaf length values from the closest ground-based survey date (table S7). We then calculated the Spearman's rank correlation between mean NDVI and mean longest leaf length for each plot and season. We replicated the analysis using Sentinel-2 data where available (see supplementary materials). Finally, we also conducted a by-species version of the analysis using the byspecies mean of the absolute longest leaf length for each 2 m × 2 m plot rather than the mean based on the standardised longest leaf lengths.

Landscape greenness corresponded among sensors
Landscape greenness corresponded well among drone, Sentinel-2, Landsat 8 and MODIS across both the 2016 and 2017 growing seasons. Growing season curves of the mean NDVI for the one-hectare plots were similar (figure 1) and the plots' temporal (seven-day) mean NDVI values were highly correlated across sensors (Spearman's ρ > 0.59-0.98, table S12). However, we observed a positive offset between the drone and satellite seven-day mean NDVI values for the plots. This offset ranged between 0.027 (Landsat 8) and 0.073 (Sentinel-2B) absolute NDVI and was consistently positive across satellites (table S13). The Landsat 8 offset of 0.027 fell within the range of the estimated error (±0.03) associated with the drone-derived mean NDVI for the study plots determined in a previous study using the same survey method [81].
Resampled drone pixels (10 m and 30 m) and the corresponding spatially co-located Sentinel-2 and Landsat 8 pixels were highly correlated (marginal R 2 = 0.69 and marginal R 2 = 0.58 respectively, see figure 2 and tables S8 and S10). We found that vegetation type, the time-difference between satellite scene and drone data acquisition, and the specific Sentinel platform (Sentinel-2A/Sentinel-2B) influenced the relationship between Sentinel-2 pixel NDVI and dronederived NDVI (marginal R 2 = 0.87 see table S9). While the Sentinel platform (Sentinel-2A/Sentinel-2B) had the strongest impact on the intercept and the slope of the linear model, vegetation type and time-difference mainly influenced the slope, with time-difference having the smallest effect on slope and intercept overall (table S9). In contrast, we only detected a statistically meaningful effect for the timedifference between satellite and drone scene acquisition in the Landsat 8-drone pixel model (marginal R 2 = 0.70); vegetation type did not have a statistically meaningful influence on this relationship (table S11).

Spatial variation in landscape greenness peaked at approx. 0.5 m
Spatial variability in the NDVI values associated with distance peaked at ranges below 0.5 meter (mean range 0.44 m) during the peak-season of 2017 (26-28 July). Little additional autocorrelation structure in the NDVI was found between pixel pairs for distances of up to 45 m (figure 3). This pattern was consistent across vegetation types in seven out of our eight plots (figures 3, S3 and S4). The only exception is the Dryas-vetch plot in Area 3, which showed the same patterns for distances below 10 m, but thereafter spatial variation steadily increased (figure S4). Peak variability (sill) in NDVI decreased as the growing season progressed (figure S5), and varied with vegetation type (figures 3, S3 and S4). Unexplained variability (nugget) was consistently low across all Areas (figures 3, S3 and S4).

Seasonal-variation was lost when aggregating to medium grain sizes
We observed a notable loss in the amount of seasonal variation in tundra greenness when aggregating . Fine-scale variation representing key ecological heterogeneity in tundra phenology was lost when aggregating from ultra-fine-grain drone to medium-grain satellite pixel sizes. We observed a logarithmic decay in variation (standard deviation) in the quadratic coefficient of simple growing season curves fitted to the eight vegetation plots in the 2017 season when aggregating the drone data across grain sizes (a). To provide an example of the underlying raw data, we visualised the pixel-by-pixel curves fitted to the time-series of pixels from the dryas-vetch tundra plot in Area 2 for a subset of three grain sizes (b). Here, each point represents a pixel NDVI value at a given day of year and grain size (indicated by colour). The transparent lines represent the simple quadratic curves fitted to each pixel across the time-series, again the colour of the line indicates the pixel's associated grain-size. See also figure S8, which shows a random sample of nine curves for all grain sizes from the same study plot. Furthermore, to provide an example of the spatial distribution of the quadratic coefficient and how it changes across grain sizes, we plotted the respective rasters for Area 2 dryas-vetch tundra in panel (c). Similar patterns are found across all areas (a). grain sizes from ultra-fine-grain drone to mediumgrain satellite data. The loss was particularly pronounced at grain-sizes above 10 m-the grain size of Sentinel-2 MSI pixels (46.2%-63.9%) (figure 4). The variation in the quadratic coefficient of the simple growing season curves (figures 4(b) and S6) decayed logarithmically with grain size (figure 4(a)), while no change occurred in the mean tendency of the coefficient (figure S7). The quadratic and linear coefficients of the growing season curves were strongly correlated (Spearman's ρ = −0.999), thus the same pattern holds true for the linear component of the curve fit ( figure S6).

Drone-derived spectral greenness correlated well with leaf measurements
Drone-derived spectral greenness correlated well (mean ρ = 0.70) with ground-based measurements of phenology for graminoids and deciduous plants across the growing season ( figure 5). The Spearman's correlation coefficient of the plot mean longest leaf length and the mean drone-derived NDVI (mean ρ = 0.70, table S14 and figure 5) matched the byspecies analysis based on absolute leaf lengths in the ground-based phenology plots (mean ρ = 0.68, table S15 and figure S9). The graminoids and deciduous shrub species followed this mean tendency well across all time-series, while the partially-evergreen D. integrifolia showed mixed responses between plots and years (mean ρ = 0.22, figure S9). The dronebased time-series of greenness of the 2 m × 2 m ground-phenology plots highlight fine-scale differences in phenology such as the continuous greening of tussocks that was visible at the tussock sedge tundra plot in Area 2 ( figure 5(c)). Sentinel-2 greenness of the ground-monitoring plots showed slightly weaker correlations (mean ρ = 0.58, figure S10) with the mean longest leaf length, but for this analysis no time-series of sufficient length were available for 2016 and peak-season observations in 2017 were limited.

Discussion
Our analysis of time-series of landscape greenness on Qikiqtaruk across scales highlights four main findings: 1) Measures of mean tendency in landscape greenness were consistent across sensors, but drone-derived NDVI values were lower than those from Sentinel-2, Landsat 8 and MODIS products (figures 1 and 2). 2) The majority of variation in landscape greenness was contained at scales of around half-a-metre, and is thus not captured by medium-grain satellites such as Sentinel-2 (figure 3).
3) When aggregating growing season curves from ultra-fine-grain drone to medium-grain satellite pixel sizes, a notable amount (46.2%-63.9%) of variation in greenness phenology was lost (figure 4). 4) Dronebased measures of landscape greenness correlated well with ground-based measurements of leaf length (figure 5). Taken together, our results highlight that drone platforms and compact multispectral sensors can capture key ecological processes such as vegetation phenology and enable us to bridge the existing scale gap between satellite and ground-based monitoring in tundra ecosystems.
The correspondence between drone and satellitederived NDVI has yet to be comprehensively tested across Arctic sites [13,14]. Siewert and Olofson [14] also demonstrate cross-sensor agreement between drone-and satellite-derived NDVI from Arctic Sweden. While similar or higher levels of cross-sensor agreement have been observed in other natural and agricultural systems [14,100,101], some non-Arctic studies showed mixed or poor agreement [102][103][104].
Continued efforts in replicating these studies at different sites and systems are much needed to comprehensively evaluate cross-sensor correspondence in Arctic tundra systems and beyond.
We observed a small but consistent offset between drone-and satellite-derived NDVI that warrants further investigation. A similar offset has been detected in rice fields in Italy [103] and with spectroradiometer readings in ecologically similar tundra in Alaska [77], but see Siewert and Olofson [14] for a lack of offset in the more heterogeneous tundra of Arctic Sweden. Both technical and ecological factors could explain the offset. We were not able to conduct spectroradiometer readings coinciding with our drone surveys for on-the-ground comparisons. Technical reasons for the observed offset may include: atmospheric effects, differences in viewing geometries, sensor properties (e.g. band widths) and signal processing between drones and satellites (e.g. radiometric calibration), but also among different drone studies. Ecologically, the variation in land cover (especially the presence/absence of nonvegetative surfaces) or topography within a landscape may influence the correspondence between measures of vegetation greenness across scales due to nonlinearities in how different patch sizes and cover types are aggregated when measured with the NDVI [12,105]. The high homogeneity of the survey plots on Qikiqtaruk likely contributes to the strong correlation between drone-and satellite-derived NDVI that we have observed. Yet, in our drone data fine-grain patterns of higher and lower NDVI within the landscape were evident, including bare-ground patches and areas of more productive vegetation in wetter parts of the landscape (figures 1-3). Non-linearities in the scaling of these patches could contribute to the offset between satellite and drone NDVI that we observed on Qikiqtaruk. Further research is needed to evaluate cross-sensor and cross-scale correspondence in NDVI and other vegetation indices across Arctic tundra systems.
We found that a plateau of spatial variation in tundra greenness occurred around 0.5 m, approximately the same width as biological and environmental patterning at this site. The E. vaginatum sedges that dominate the tussock sedge vegetation type typically have diameters of ∼0.1-0.5 m (figure 3(b)) [106]. The tussock sedge vegetation type is underlain by ice-wedge polygons that when thawed create bands of wetter or drier plant communities with widths of ∼0.5 m-3.0 m [107]. Dryas-vetch vegetation is often found on gentle sloping uplands where active layer disturbances such as cryoturbation and solifluction create characteristic bare-ground patches perpendicular to the slope [85] with dimensions of ∼0.3 m-0.5 m width and ∼0.3-1.0 m length ( figure 3(b)). We expect that spatial variation would increase with distances beyond the one-hectare extents of our plots as more topographic diverse terrain is encountered and vegetation type transitions are reached. Topography is a key proxy for many processes that structure heterogeneity in tundra vegetation [108][109][110] and the plots were selected for little topographic variation to allow us to isolate specific effects of land cover on scaling of greenness patterns from topography. The plot with the highest elevational range (Area 3-Dryas-vetch tundra: 8.7 m) showed a small but steady increase in spatial variation in distance classes above 10 m ( figure  S4). Our findings illustrate that on Qikiqtaruk, grain sizes of 0.5 m or less are required to capture key spatial variation in vegetation greenness.
In our study, ecological information was lost when upscaling from ultra-fine-grain (∼0.05 m) drone to moderate grain (∼10-30 m) satellite resolutions. Even the most recent generation of freelyavailable multispectral satellite products can be limited in their ability to capture fine-grain ecological processes of tundra vegetation change [13]. Information transfer during upscaling leads to the loss of more information in tundra ecosystems compared to other biomes [14,111] as land cover and vegetation structure are fragmented at finer scales [112]. However, exactly how spatial aggregation influences the loss in observed ecological variability across the diversity of Arctic landscapes remains poorly quantified [11]. Yet, this variability is critical to understanding climate-driven changes in vegetation phenology [35,36,88], plant-pollinator interactions [113], and trophic interactions [114]. With finegrain observations, we were able to detect a subtle decrease in the magnitude of the spatial variability in landscape-level phenology as the growing season progressed (figure S5), while aggregation to moderate satellite grains obscured both the magnitude and timing of phenological heterogeneity ( figure 4). Thus, time-series of fine-grain remotely-sensed observations will be critical for answering key research questions about tundra ecosystem functioning in a warming Arctic [115].
Our results indicate that drone-based greenness time-series captured variation in leaf-growth of deciduous tundra plant species at the plot level. We demonstrate how drones can be used to measure variation in tundra plant phenology of metre-scale patches at landscape extents. Drones have been successfully used to monitor phenology of individual plants (trees) in temperate forest ecosystems [116][117][118], and our ability to detect sub-decimeter variability in our study indicates that individual plantlevel phenology monitoring with drones could also be carried out in the tundra. Future studies that quantify plant growth or phenology events such as leaf emergence and flowering across the landscape could provide key information on resource availability for plant-consumer interactions [113,114]. Our findings also highlight known limitations of NDVI to track phenology in evergreens or other non-deciduous taxa (D. integrifolia, figure S9), suggesting that tests of alternative vegetation indexplant growth relationships [118] are needed to capture variation in plant metabolic activity of tundra evergreen and moss species within the growing season. Combining drone-based time-series with observations from phenocams, satellite and ground-based study plots has the potential to revolutionise our understanding of landscape-scale phenology [13] by moving beyond the previously small samples of individuals monitored in the Arctic tundra [36,37,39,119].
The collection of multispectral drone timeseries in Arctic ecosystems has limitations and challenges. Recent studies have discussed challenges with radiometric consistency and repeatability when using compact multispectral drone sensors [81,120,121]. Due to logistical constraints, we were not able to always conduct surveys under optimal conditions due to sun angle or cloud cover, nor as frequently as planned due to wind or precipitation (table S2), which likely introduced bias and/or noise into our drone data (e.g. figure 4(b)). Site access limitations meant that we could not capture spring and autumn on Qikiqtaruk. As an early-generation multispectral drone sensor, the Parrot Sequoia was tailored for deriving the NDVI, which despite being the legacy workhorse of tundra remote-sensing has limitations [11,13]. In particular, NDVI can be confounded by moisture and surface water [11,73,122], complicating interpretation in wet tundra, particularly at fine-grain sizes. However, the rapid technological development of drones and sensors, as well as further consolidation and standardisation of methods [123], will allow for pan-Arctic syntheses of finegrain data to resolve the uncertainty and complexity of Arctic greening patterns trends [13,14,81] (see also the High Latitude Drone Ecology Network-(https://arcticdrones.org/)).
Our study demonstrates that drones can fill the scale-gap between satellite and field studies of terrestrial Arctic vegetation change. Rather than investigating and explaining patterns at scales pre-defined by satellite datasets or field-based networks, researchers can use drones to identify scale-domains that are most closely associated with the ecological processes of interest. Field ecologists can now combine scaling theory provided by the remote sensing community [74,[124][125][126][127] with observations at scales and temporal intervals that allow for the testing of hypotheses about the mechanisms that drive landscapelevel ecological change. Drone imagery will also allow the remote sensing community to track the effects of sub-pixel heterogeneity on satellite products down to the grain of individual plants and communities [14], which have been long studied by field-based monitoring networks, like the International Tundra Experiment [75]. Only by improving our understanding of how ecologically important information is captured across grain sizes can we reduce uncertainties in the medium-and coarse-grain satellite observation that feed into Earth system models and shape their predictions [4,8]. Fine-scale remote sensing from drones and aircraft therefore provide key tools for disentangling the drivers behind the greening of the Arctic [14,79,115].

Conclusions
Novel remote-sensing technologies such as drones now allow us to study ecological variation in landscapes continuously across scales. Fine-grain ecological observations are of particular importance where variation in plant growth happens at very small spatial scales such as in tundra ecosystems [13,71]. The peak in spatial variation we found at distances of ∼0.5 m in the plots on Qikiqtaruk demonstrates the grain size at which phenological information within the plant communities is best captured at this site. We show that key ecological information is lost when observing the tundra at even decimeter or coarser scales, such as those of medium grain satellites (∼10-30 m). Despite the methodological challenges of collecting multispectral drone imagery in remote environments [81], our time-series of vegetation greenness correlated well with ground-based measurements of leaf growth in the validation plots. Drones now enable studies that fill the scale gaps between satellite and ground-based observations, and therefore improve our ability to identify the key drivers of vegetation change and project climate change impacts and feedbacks in the tundra biome.
Finally, we would like to thank the Inuvialuit people for the opportunity to conduct research in the Inuvialuit Settlement Region.

Author contributions
JJA and IMS conceived the study with input from JTK and AMC. JJA carried out data processing and analysis. JJA and IMS led the drone and ground-validation field work in 2016. AMC led the drone field surveys in 2017 with input from JTK. GD led the ground-validation for 2017 with input from JTK. JJA, IMS and JTK wrote the manuscript with input from AMC and GD. IMS supervised and acquired funding for the research.

Data availability statement
The data and code that support the findings of this study are openly available at the following URL/DOI: (https://github.com/jakobjassmann/qhi_phen_ts).