Mapping common and glossy buckthorns (Frangula alnus and Rhamnus cathartica) using multi-date satellite imagery WorldView-3, GeoEye-1 and SPOT-7

ABSTRACT Buckthorns (glossy buckthorn, Frangula alnus, and common buckthorn, Rhamnus cathartica) represent a threat to biodiversity. Their high competitiveness leads to the replacement of native species and the inhibition of forest regeneration. Early detection strategies are therefore necessary to limit the impacts of invasive alien plant species, and remote sensing is one of the techniques for early invasion detection. Few studies have used phenological remote sensing approaches to map buckthorn distribution from medium spatial resolution images. Those studies highlighted the difficulty of detecting buckthorns at low densities and in the understory using this category of images. The main objective of this study was to develop an approach using multi-date very high spatial resolution satellite imagery to map buckthorns at low densities and in the understory in the Quebec City area. Three machine learning classifiers (Support Vector Machines, Random Forest and Extreme Gradient Boosting) were applied to WorldView-3, GeoEye-1 and SPOT-7 satellite imagery. The Random Forest classifier performed best (Kappa = 0.72), while the Kappa coefficients of the SVM and XGBoost classifiers were 0.69 and 0.66, respectively. However, buckthorn distribution in the understory was identified as the main limitation of this approach, and LiDAR data could be used to improve buckthorn mapping in similar environments.


Introduction
Human activities and climatic variability are the main drivers for the introduction and spread of invasive alien plant species (IAPS) (Kumar Rai and Singh 2020; Langmaier and Lapin 2020; Paz-Kagan et al. 2019; Guido and Pillar 2017; Early et al. 2016). Their introduction, intentional or accidental, as well as their spread after installation, can lead to negative ecological, health and economic impacts. These impacts can include the replacement of native species, the costs of eradication interventions, devaluation of invaded properties and spontaneous or long-term health issues such as photodermatitis from contact with the sap of some IAPS such as giant hogweed (Heracleum mantegazzianum) (Lavoie, Guay, and Joerin 2014; Vilà et al. 2010; Mack et al. 2000).
Glossy buckthorn (Frangula alnus) and common buckthorn (Rhamnus cathartica) are native to Europe and were introduced in North America in the early nineteenth century. Initially used as windbreaks (Boettcher, Gautam, and Cook 2021; Becker, Zmijewski, and Crail 2013; Heneghan et al. 2006), both species are now considered invasive (Lavoie, Guay, and Joerin 2014). These IAPS colonize forest edges, understory, riverbanks and open environments (e.g. wastelands), the last of which promote very dense colonies (Lavoie, Guay, and Joerin 2014; Heneghan et al. 2006; Frapier, Eckert, and Lee 2004). These IAPS are also characterized by a particular phenology as their leaves appear very early in the spring and remain green late into the fall (Labonté et al. 2020; Becker, Zmijewski, and Crail 2013; Knight et al. 2007; Archibold, Brooks, and Delanoy 1997). Heneghan et al. (2006) also found that buckthorns alter the chemical properties of the soil as a result of altered moisture and accelerated organic matter transformation processes (e.g. nitrogen and carbon mineralization). These changes in organic matter cycling can subsequently cause an increase in pH and a decrease in soil nutrient availability for other organisms (Knight et al. 2007). The main impacts of these IAPS in invaded areas are the loss of biodiversity, the facilitation of the establishment of new invasive species and the inhibition of forest regeneration (Lavoie, Guay, and Joerin 2014; Knight et al. 2007; Frapier, Eckert, and Lee 2004).
Buckthorns are more competitive than native species and replace them by reducing their regeneration. As an example, Frapier, Eckert, and Lee (2004) showed that sites with high buckthorn cover (>90%) had fewer new pines (0.11 seedlings/m²) than sites without buckthorns (0.40 seedlings/m²) in a coniferous forest (Pinus sp.) of New Hampshire (USA). Replacement of native species is also facilitated by the high germination rate of buckthorns, which can reach 85% (Archibold, Brooks, and Delanoy 1997). Moreover, their photosynthesis rate can be up to twice that of native species, which allows them to grow very rapidly in comparison (Kalkman, Simonton, and Dornbos 2019; Harrington, Brown, and Reich 1989). This high competitiveness (Boettcher, Gautam, and Cook 2021; Kalkman, Simonton, and Dornbos 2019) necessitates early detection strategies so that eradication interventions can be promptly performed. Positive identifications can be made by in situ inventories, which are time consuming and expensive, especially when conducted in areas with limited access (Lawrence, Wood, and Sheley 2006). Remote sensing has the potential to reduce these constraints and improve detection of IAPS over large areas.
Only two studies that used remote sensing to map buckthorn distribution were found in the available literature. Labonté et al. (2020) used a phenological approach based on a series of six multi-date (April, June, August, September, October and November) Landsat-8 OLI images to map understory buckthorns in Richmond and Cookshire (Quebec, Canada) forests dominated by broadleaf deciduous and coniferous trees. An overall accuracy of 69% was obtained and this low performance was attributed to low levels of buckthorn cover and spectral mixing (Labonté et al. 2020). Becker, Zmijewski, and Crail (2013) applied the same approach to a series of 49 Landsat 7 ETM+ and Landsat 5 TM images (January to December between 2001 and 2011) in Ohio and Michigan (USA) and obtained an overall accuracy of 88%. This high accuracy can be attributed to the study environment (oak openings) and high buckthorn density. Labonté et al. (2020) showed that the classification error decreased as buckthorn cover increased in the studied plots (plot size = 30 × 30 m), because spectral mixing within the pixel is reduced. For example, this error was 20% when buckthorn cover was 75% or higher, but rose to 31% when buckthorn cover was 25% or higher. These studies highlight the limits of detecting low-density and understory buckthorns (i.e. low percentage cover in a pixel) using medium spatial resolution satellite imagery. Although the use of multi-date very high (VHR) and high spatial resolution (HR) images could improve classification accuracy (Labonté et al. 2020; Becker, Zmijewski, and Crail 2013), there are no studies to date that have incorporated these data to map low-density and understory buckthorns.
The main objective of this study is to explore and compare the use of multi-date HR and VHR satellite imagery in combination with machine learning classifiers to detect understory buckthorns at low density.

Study area
The study area is a 4 km long riparian zone located along the Beauport River in Quebec City (46°52′02″ N, 71°12′35″ W) (Figure 1). The canopy of the area is dominated by broadleaf deciduous trees: boxelder maple (Acer negundo), red maple (Acer rubrum), American elm (Ulmus americana), black willow (Salix nigra) and white birch (Betula papyrifera). A few patches of evergreen trees are present, mainly coniferous, including blue spruce (Picea pungens) and balsam fir (Abies balsamea). In this area, buckthorns are isolated, do not form homogeneous patches and occur predominantly in the understory. We therefore chose this riparian zone as our study area for (1) the presence of many buckthorn individuals and (2) their spatial distribution (i.e. low density and understory), allowing us to test the detection efficiency of the VHR and HR images.

Reference data
The collection of reference data was conducted during two periods: in the summer (26 June to 26 August 2020) and in late autumn (15-21 October 2021), when buckthorns are easily identifiable after most broadleaf deciduous trees have lost their leaves (Figure 2). Three classes of reference data were collected: buckthorn, evergreen, and broadleaf deciduous species. Two high-precision (≈1 m) GNSS (Global Navigation Satellite Systems) receivers, a Pro 6 H (Trimble, California) and an Arrow Gold (EOS positioning system, Quebec), were used to locate and record the crown centre position of individuals of the three classes.
The broadleaf deciduous class corresponds to the dominant individual trees in the study area and samples were randomly located. Buckthorn and evergreen species were less numerous, and all accessible individuals were located. Buckthorns that were less than 4 metres tall or had a small crown area were not used, as they are completely in the understory and could not be detected by remote sensing. The total number of points collected for each class was 54, 50, and 108 for buckthorn, evergreen and deciduous species, respectively.
A digital terrain model (Varin, Allostry, and Chalghaf 2020) at one-meter spatial resolution was used for orthorectification of the pansharpened images, and an orthomosaic of the WV-3 image was generated using the Bundle colour balancing method (PCI Geomatics 2018) from fourteen tiles.
Three masks were applied to remove unvegetated areas, shadows and vegetation less than 4 m tall. A first mask was created using the WV-3 image to remove water, bare soil and buildings by applying a normalized difference vegetation index (NDVI) threshold (NDVI < 0.37) based on the NDVI distribution of the field reference data. To remove shadows, a second mask was applied using a shadow index (SI) threshold (Zhou et al. 2018) (SI ≤ 13.13) based on SI analysis of shadow objects. A third mask was applied to remove vegetation below 4 metres in height using a digital height model at one-meter spatial resolution (Varin, Allostry, and Chalghaf 2020).
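The three masks can be illustrated with a minimal per-pixel sketch. Only the three thresholds come from the text; the function names, the dictionary layout and the sample values are hypothetical, and the shadow index is treated as a precomputed input because its formula (Zhou et al. 2018) is not given here. The sketch also assumes that pixels with SI at or below the threshold are the shadowed ones.

```python
# Thresholds reported in the text; everything else is illustrative.
NDVI_MIN = 0.37      # pixels with NDVI < 0.37 are treated as unvegetated
SI_MAX = 13.13       # assumption: pixels with SI <= 13.13 are shadows
HEIGHT_MIN_M = 4.0   # vegetation below 4 m is removed


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def keep_pixel(nir, red, shadow_index, height_m):
    """Return True if the pixel survives all three masks."""
    if ndvi(nir, red) < NDVI_MIN:      # mask 1: water, bare soil, buildings
        return False
    if shadow_index <= SI_MAX:         # mask 2: shadows
        return False
    if height_m < HEIGHT_MIN_M:        # mask 3: vegetation under 4 m
        return False
    return True


def apply_masks(pixels):
    """Apply the three masks to a list of pixel records."""
    return [keep_pixel(**p) for p in pixels]


# Hypothetical pixel values, for illustration only.
pixels = [
    {"nir": 0.45, "red": 0.05, "shadow_index": 40.0, "height_m": 12.0},  # sunlit tree
    {"nir": 0.10, "red": 0.09, "shadow_index": 40.0, "height_m": 0.0},   # bare soil
    {"nir": 0.30, "red": 0.05, "shadow_index": 10.0, "height_m": 9.0},   # shadowed tree
    {"nir": 0.40, "red": 0.05, "shadow_index": 35.0, "height_m": 2.5},   # low shrub
]
mask = apply_masks(pixels)
```

In this toy example only the first pixel is retained; each of the other three is rejected by a different mask.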

Segmentation
The GeoEye-1 image was used for segmentation because of its acquisition period (i.e. fall), which maximizes the distinction between buckthorns and other species (i.e. buckthorn leaves remain green late in the fall), and because its spatial resolution is better than that of the SPOT-7 image taken during the same season. The multi-resolution algorithm was selected for segmentation due to its performance in several recent image segmentation studies (Lourenço et al. 2021; Chen, Chen, and Jing 2021; Jombo, Adam, and Odindi 2021).
The segmentation parameter values (i.e. scale, colour, and compactness) were determined using an iterative trial-and-error process combined with the estimation of scale parameter (ESP2) method (Drăguţ et al. 2014; Fernandes et al. 2014; Müllerová, Pergl, and Pyšek 2013; Jones et al. 2011; Drăguţ, Tiede, and Levick 2010). Scale, colour, and compactness were set to 5, 0.9 and 0.5, respectively. The weights of the blue and green bands were each set to 1, while those of the red and near-infrared bands were set to 2. Spectral, textural, and geometric arithmetic features were extracted from the objects resulting from the segmentation (Table 2). A total of 150 features for WV-3 and 77 for each of the SPOT-7 and GeoEye-1 images were calculated, for a total of 304. The features were centred and scaled, and a correlation analysis was performed to eliminate correlated features, defined as pairs with a Pearson coefficient greater than or equal to 0.85 (Varin, Chalghaf, and Joanisse 2020). The Jeffries-Matusita (JM) separability distance was used to select the more relevant feature of each correlated pair.
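The correlation filter described above can be sketched as follows. This is an illustrative greedy variant, not the study's implementation: features are ranked by a JM distance computed between two classes under a univariate Gaussian assumption, and a feature is kept only if its absolute Pearson correlation with every feature already kept is below 0.85. The feature names, the sample values and the two-class setup are hypothetical; how the study aggregated JM over its three classes is not stated.

```python
import math

CORRELATION_LIMIT = 0.85  # threshold reported in the text


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def jm_distance(a, b):
    """Jeffries-Matusita distance between two classes for one feature,
    assuming each class is normally distributed (range 0 to 2)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    # Bhattacharyya distance for two univariate Gaussians.
    bhatt = (0.25 * (ma - mb) ** 2 / (va + vb)
             + 0.5 * math.log((va + vb) / (2 * math.sqrt(va * vb))))
    return 2 * (1 - math.exp(-bhatt))


def select_features(features, labels, class_a, class_b):
    """Keep the most separable feature of each correlated group."""
    scores = {}
    for name, values in features.items():
        a = [v for v, lab in zip(values, labels) if lab == class_a]
        b = [v for v, lab in zip(values, labels) if lab == class_b]
        scores[name] = jm_distance(a, b)
    kept = []
    for name in sorted(scores, key=scores.get, reverse=True):
        if all(abs(pearson(features[name], features[k])) < CORRELATION_LIMIT
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical object features, for illustration only.
features = {
    "ndvi_mean": [0.80, 0.82, 0.78, 0.60, 0.58, 0.62],
    "nir_mean": [0.81, 0.83, 0.79, 0.66, 0.57, 0.65],  # correlated with ndvi_mean
    "texture": [0.30, 0.10, 0.20, 0.12, 0.31, 0.22],
}
labels = ["buckthorn"] * 3 + ["other"] * 3
selected = select_features(features, labels, "buckthorn", "other")
```

Here `nir_mean` is dropped because it is strongly correlated with `ndvi_mean` and has the lower JM distance, while the weakly correlated `texture` feature is retained. Centring and scaling do not change Pearson coefficients, so the sketch omits that step.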

Classification approach and optimization of machine learning classifiers
Object-based classification was performed following a multi-date approach by combining the three images and the BTBR (Bi-Temporal Band Ratio) (WV-3 + SPOT-7 + GeoEye-1 + BTBR). Object-based classification consists of first grouping homogeneous neighbouring pixels into objects, which are then classified (Chen, Zhao, and Powers 2014). This approach has the advantage of using spectral, geometric, textural, and topologic features in the classification process (Hantson, Kooistra, and Slim 2012). It also avoids the salt-and-pepper noise frequently observed in images at very high spatial resolutions (Hirayama et al. 2019; Chen, Zhao, and Powers 2014; Jones et al. 2011). The BTBR (equation 1) developed by Dorigo et al. (2012) and used for multi-date classification was calculated between WV-3 and each of the other two images (BTBR1: WV-3 and GeoEye-1, BTBR2: WV-3 and SPOT-7), where R and G indicate reflectance values in the red and green bands, respectively. The suffixes indicate the image acquisition period (growing season (on) and senescence period (off)). Random Forest (RF) (Breiman 2001), Support Vector Machines (SVM) (Guenther and Schonlau 2016) and Extreme Gradient Boosting (XGBoost) (Zarei, Hasanlou, and Mahdianpari 2021; Samat et al. 2020) classifiers were used for classification and compared. The selection of relevant discriminant features for each classifier was performed using the recursive feature elimination selection method (Yang et al. 2019; Guyon, Weston, and Barnhill 2002; Ambroise and McLachlan 2002) and is detailed in Nininahazwe, Varin, and Théau (2022). The caret library in R (R Core Team 2021) was used to optimize the number of random features used at each node (mtry) and the number of decision trees (ntree) for the RF classifier. The radial basis function showed good performance in previous studies (Jombo et al. 2020; Qian et al. 2015; Huang, Davis, and Townshend 2002) and was therefore used in this study as the SVM kernel function. The SVM's cost and sigma parameters, as well as the XGBoost parameters such as the learning rate, were automatically optimized using the caret library (R Core Team 2021). The optimal values selected are presented in Appendix 1.

Classification accuracy assessment
The segments were overlaid on the reference data, and 70% of the segments of each class were used to train the models while the remaining 30% were used for validation. Cohen's Kappa coefficient (Congalton 1991; Cohen 1960), overall accuracy, and the F1-score (i.e. the harmonic mean of user and producer accuracy) (Costa et al. 2021; Yang 2001) were used to evaluate the accuracy of the models. The output of the models is a membership probability between 0 and 1 for each class. The sum of the class probabilities is equal to 1. Each object was assigned to the class with the highest probability.
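The accuracy measures named above can be sketched from a confusion matrix as follows. The function names and the toy labels are hypothetical and are not the study's data; the formulas are the standard definitions of overall accuracy, Cohen's Kappa and the F1-score. The sketch does not guard against a class that is never predicted, which would cause a division by zero in the F1-score.

```python
def argmax_class(probabilities):
    """Assign the class with the highest membership probability."""
    return max(probabilities, key=probabilities.get)


def confusion_matrix(reference, predicted, classes):
    """Rows are reference classes, columns are predicted classes."""
    matrix = {r: {p: 0 for p in classes} for r in classes}
    for r, p in zip(reference, predicted):
        matrix[r][p] += 1
    return matrix


def overall_accuracy(matrix):
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[c][c] for c in matrix)
    return correct / total


def cohen_kappa(matrix):
    """Agreement corrected for the agreement expected by chance."""
    total = sum(sum(row.values()) for row in matrix.values())
    observed = overall_accuracy(matrix)
    expected = sum(
        sum(matrix[c].values()) * sum(matrix[r][c] for r in matrix)
        for c in matrix
    ) / total ** 2
    return (observed - expected) / (1 - expected)


def f1_score(matrix, cls):
    """Harmonic mean of producer and user accuracy for one class."""
    true_positive = matrix[cls][cls]
    producer = true_positive / sum(matrix[cls].values())          # 1 - omission
    user = true_positive / sum(matrix[r][cls] for r in matrix)    # 1 - commission
    return 2 * producer * user / (producer + user)


# Toy validation set (hypothetical), three classes as in the study.
classes = ["buckthorn", "evergreen", "deciduous"]
reference = ["buckthorn"] * 4 + ["evergreen"] * 3 + ["deciduous"] * 5
predicted = (["buckthorn", "buckthorn", "deciduous", "deciduous"]
             + ["evergreen"] * 3
             + ["deciduous"] * 4 + ["buckthorn"])

matrix = confusion_matrix(reference, predicted, classes)
oa = overall_accuracy(matrix)
kappa = cohen_kappa(matrix)
f1_buckthorn = f1_score(matrix, "buckthorn")
label = argmax_class({"buckthorn": 0.2, "evergreen": 0.1, "deciduous": 0.7})
```

For this toy set, 9 of the 12 objects are correctly classified, and the buckthorn class has a producer accuracy of 2/4 and a user accuracy of 2/3.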

Results and discussion
The multi-date classification performed in this study shows that the RF classifier performs better (Kappa = 0.72) compared to the other classifiers (Kappa = 0.69 (SVM); 0.66 (XGBoost)) (Table 3).
For individual class performance, buckthorns are less well detected than the other classes regardless of the classifier used. In particular, the optimal classifier (RF) reaches an F1-score of 0.62 for buckthorns, compared to 0.93 and 0.88 for broadleaf deciduous and evergreen species, respectively (Table 3). This low F1-score value results from omission errors (43%, i.e. 1 − producer accuracy) and commission errors (33%, i.e. 1 − user accuracy). The omission errors indicate the difficulty of the classifier to predict some buckthorns, while the commission errors show that some absences (e.g. broadleaf deciduous or evergreen species) were erroneously classified as buckthorns. The spatial distribution of buckthorns in the study area probably accounts for these errors. Although individuals less than 4 m tall were eliminated from the analyses, buckthorns are always found below the canopy of other dominant and taller broadleaf deciduous trees, so branches and trunks of other overhanging trees would contribute strongly to the pixel reflectance (Labonté et al. 2020). Broadleaf deciduous trees (e.g. black willow) were also observed with green leaves late in the fall, which could contribute to misclassification.
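As a quick consistency check, the buckthorn F1-score can be recovered from the omission and commission errors quoted above, taking producer accuracy as 1 minus the omission error and user accuracy as 1 minus the commission error. The symbols PA and UA are shorthand introduced here for illustration, not notation from the study.

```latex
\begin{align*}
PA &= 1 - 0.43 = 0.57, \qquad UA = 1 - 0.33 = 0.67,\\
F_1 &= \frac{2 \cdot PA \cdot UA}{PA + UA}
     = \frac{2 \times 0.57 \times 0.67}{0.57 + 0.67}
     = \frac{0.7638}{1.24} \approx 0.62.
\end{align*}
```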
Our results produced higher levels of accuracy (overall accuracy = 83%) compared to studies conducted in similar conditions (Labonté et al. 2020) (overall accuracy = 69%). The relatively low performance of the study conducted by Labonté et al. (2020) was related to the spectral mixing occurring in the pixels of the medium spatial resolution images they used (Landsat 8 OLI). In that case, buckthorn individuals were smaller than the pixels (Labonté et al. 2020). The VHR images used in our study therefore appear to have reduced the effects of spectral mixing.
On the other hand, our results (OA: 83%, Kappa: 0.72) are comparable to those found by Becker, Zmijewski, and Crail (2013), who detected buckthorns in a more favourable context (i.e. open environment and high densities) than ours (OA: 88%, Kappa: 0.73). Considering the detection challenges (i.e. mapping low-density and understory buckthorns) in our study area, our multi-date classification approach is performing well, although improvements are needed to produce a more accurate map that can be used directly by managers. In addition, the fifteen features used by the optimal classifier highlight the relevance of the multi-date classification approach. The features from the WV-3 and GeoEye-1 images contributed in similar proportions (6/15 for WV-3 and 7/15 for GeoEye-1), in addition to the BTBR calculated between these two images (BTBR1). However, the contribution of the features calculated from SPOT-7 is less significant (1/15). This could be due to the low spatial resolution of the multispectral SPOT-7 bands (6 m, before pansharpening) compared to those from GeoEye-1 (2 m) and WV-3 (1.24 m), hence the advantage of using VHR images in the development of buckthorn mapping methods. Our study also highlights the low relevance of adding lower resolution images to VHR images acquired during the same period.
In environments similar to our study area, the omission and commission errors could be reduced using other data sources, such as high-density point clouds from LiDAR (Light Detection and Ranging) acquired in the fall season. This technology has not yet been applied to buckthorn detection, although it has been used successfully in previous vegetation mapping studies (Guo et al. 2022; Jombo, Adam, and Tesfamichael 2022; Budei et al. 2018; Dalponte, Bruzzone, and Gianelle 2008; Asner et al. 2008). In particular, high density or multiple-return LiDAR would allow for the derivation of several structural features (e.g. crown shape and area), identification of crowns at different heights (Shi et al. 2018), and the extraction of several radiometric features derived from backscattered signal intensity (Shi et al. 2018; Ørka, Naesset, and Bollandsås 2009; Dalponte et al. 2009; Dalponte, Bruzzone, and Gianelle 2008; Asner et al. 2008). These features could be combined with optical images to improve remote sensing mapping approaches for buckthorns in understory and at low density. Features such as height could allow for a more accurate removal of unsuitable vegetation strata using a canopy height model (Asner et al. 2008), while radiometric features could be used for discrimination between buckthorns and native species (Shi et al. 2018; Ørka, Naesset, and Bollandsås 2009; Dalponte, Bruzzone, and Gianelle 2008).

Conclusion
The mapping performance for low-density understory buckthorns using multi-date VHR and HR images is satisfactory in comparison with similar previous studies. However, the use of this multi-date approach in an operational context (e.g. target intervention areas) would require improvements to reduce the errors and thereby target the problematic areas more precisely. Future studies could incorporate LiDAR data as well as multispectral images at centimetre resolution (e.g. drone images) to improve buckthorn detection in understory.

Figure 1. Study area and collected sample locations. The background is a true colour composite (red, green, blue) of a GeoEye-1 satellite image acquired on 5 November 2020.

Table 2. Features used for classification.
NIR: Near-infrared, RE: Red-Edge. *Because of limited space, only used bands are presented (see references for detailed definition).

Table 3. Performance measures of classifiers. Values in bold indicate the maximum performance values.