Assessing a novel modelling approach with high resolution UAV imagery for monitoring health status in priority riparian forests

Black alder (Alnus glutinosa) forests are in severe decline across their area of distribution due to a disease caused by the soil-borne pathogenic Phytophthora alni species complex (class Oomycetes), “alder Phytopththora”. Mapping of the different types of damages caused by the disease is challenging in high density ecosystems in which spectral variability is high due to canopy heterogeneity. Data obtained by unmanned aerial vehicles (UAVs) may be particularly useful for such tasks due to the high resolution, flexibility of acquisition and cost efficiency of this type of data. In this study, A. glutinosa decline was assessed by considering four categories of tree health status in the field: asymptomatic, dead and defoliation above and below a 50% threshold. A combination of multispectral Parrot Sequoia and UAV unmanned aerial vehicles -red green blue (RGB) data were analysed using classical random forest (RF) and a simple and robust three-step logistic modelling approaches to identify the most important forest health indicators while adhering to the principle of parsimony. A total of 34 remote sensing variables were considered, including a set of vegetation indices, texture features from the normalized difference vegetation index (NDVI) and a digital surface model (DSM), topographic and digital aerial photogrammetry-derived structural data from the DSM at crown level. The four categories identified by the RF yielded an overall accuracy of 67%, while aggregation of the legend to three classes (asymptomatic, defoliated, dead) and to two classes (alive, dead) improved the overall accuracy to 72% and 91% respectively. On the other hand, the confusion matrix, computed from the three logistic models by using the leave-out cross-validation method yielded overall accuracies of 75%, 80% and 94% for four-, three- and two-level classifications, respectively. The study findings provide forest managers with an alternative robust classification method for the rapid, effective assessment of areas affected and non-affected by the disease, thus enabling them to identify hotspots for conservation and plan control and restoration measures aimed at preserving black alder forests.


Introduction
Degradation of forests due to the rapid spread of damaging pests and pathogens, is already threatening forest functioning and services provision, challenging the Sustainable Development Goals (SDG) and urgently demanding innovative and cost-effective tools for their long term monitoring and for upscaling restoration efforts (European Commission 2020). In the case of forest ecosystems that are prone to biotic and abiotic disturbances, detailed information about tree decay and mortality can be obtained by quantifying different levels of tree defoliation to assess health status (Cardil et al. 2017Navarro-Cerrillo et al. 2019). However, terrestrial surveys are often hampered by difficult access, the time and cost of near-term and spatially-explicit data acquisition over large areas, or by the sensitivity of areas with high conservation value, such as riparian and floodplain forests (Díaz-Varela et al. 2015;Rodríguez-González et al. 2017). Hence, forest management increasingly relies on the use of conventional remote sensing (RS) technology (Pádua et al. 2017).
Forests growing on riparian, periodically flooded or waterlogged soils, are highly valuable, but threatened ecosystems due to long lasting human impacts (Nilsson et al. 2005). Among riparian ecosystems, alder woodlands are considered of priority for conservation in European Habitats Directive 92/43/CEE. Alders are unusual among European trees in that they fix nitrogen (Huss-Danell 1997) supporting the riparian ecosystem and contributing to biodiversity. Despite their high ecological value, alder-dominated forests are threatened due to the combination of long-lasting human impact on fluvial systems and emerging abiotic (i.e. climatic) and biotic (i.e. pests and diseases) global changes. Phytophthora-induced alder decline was first reported in the 1990s in the UK, but spread across Europe to become one of the most devastating epidemics of common trees in several countries (Jung and Blaschke 2004). It has more recently been reported in Spain (Solla et al. 2010) and Portugal (Kanoun-Boulé et al. 2016). Research efforts have been increased in an attempt to understand the causes of the decline (Jung et al. 2016), the environmental factors driving the spread (Aguayo et al. 2014) and also the effects on ecosystems (Bjelke et al. 2016). However, there remain substantial knowledge gaps regarding the most effective ways to prevent infection, to reduce the intensity and extension of alder forest decline and to mitigate the effects on ecosystems. Serious implications are also expected for the future establishment of forest stands due to widespread infections in nursery stocks (Jung et al. 2016) and in other forest species, given the wider host range of other Phytophthora spp. Mapping, assessing and quantifying these effects is therefore of the most importance in relation to understanding the disease progress and for developing effective forest management plans.
Focusing on halting the spread of Phytophthora is essential, as prevention is the primary means of defence against the disease. However, the spatial and temporal dynamics of the disturbance caused by the disease are not yet fully understood (Aguayo et al. 2014). Earth observation satellite data and aerial flight surveys are often used in forestry applications and are appropriate for some research objectives (Immitzer and Atzberger 2014;Immitzer et al. 2016;Fassnacht et al. 2016;Michez et al. 2016;Senf et al. 2017;Zarco-Tejada et al. 2018). Nonetheless, their applicability is limited by the low resolution of freely available satellite imagery as well as high costs associated with aircraft-based surveys (Immitzer et al. 2016). The low resolution of remote sensing is especially problematical for riparian ecosystems, which exhibit a predominantly linear dendritic configuration along hydrographic networks (Huylenbroeck et al. 2020). Compared to satellite and airplane-based remote sensing, applications based on unmanned aerial vehicles (UAVs), also called remotely piloted aircraft systems (RPAS) or simply "drones", have higher spatio-temporal resolution and greater flexibility in regard to selecting payloads and flight plan specifications for an appropriate spatiotemporal resolution (Guerra-Hernández et al. 2017;Díaz-Varela et al. 2018).
Developments regarding the automation of small UAV-based flight and sensor data acquisition and also the processing of large collections of individual aerial images into seamless orthomosaics have boosted the potential of UAV systems for accurate and effective forest monitoring (Goodbody et al. 2017;Guerra-Hernández et al. 2017, 2018Torresan et al. 2017). Furthermore, these systems enable calculation of classificationenhancing raster layers such as spectral band ratios, vegetation indices, texture metrics and structural variables. Calibrated consumer-grade cameras and more powerful multispectral and hyperspectral instruments have been successfully used for mapping individual forest stands (Näsi et al. 2015(Näsi et al. , 2018, for crown reconstruction and individual tree phenotyping (Díaz-Varela et al. 2015;Guerra-Hernández et al. 2017), identification of tree species (Laliberte et al. 2010;Ahmed et al. 2017) and health status (Lehmann et al. 2015;Lisein et al. 2015;Nevalainen et al. 2017;Dash et al. 2017;Senf et al. 2017;Safonova et al. 2019).
Despite the potential advantages of UAVs, few studies have made use of these systems to detect biotic damage in forests. A review by Senf et al. (2017) revealed that studies related to the remote sensing of insect pest infested has focused on conifer bark beetles and defoliators of deciduous trees. The North American mountain pine beetle (Dendroctonus ponderosae Hopkins) and the European spruce bark beetle (Ips typographus) have been particularly widely studied among other bark beetles, especially in the last few years. Lehmann et al. (2015) were some of the first researchers to use UAV to detect levels of pest infestation in forest ecosystems. These researchers used multispectral imagery and object-based image analysis to detect infestation in oak stands. Näsi et al. (2015) reported the first study that investigated bark beetle infested at tree-level by means of UAV photogrammetry and hyperspectral imaging. These researchers showed that different health status stages (i.e. asymptomatic, infested and dead trees) could be identified by computer vision technologies based on hyperspectral UAV imaging at the individual tree level; an overall accuracy of 76% was achieved when using three categories based on colour (asymptomatic, infested, dead). Minařík and Langhammer (2016) reported preliminary findings obtained with a UAV equipped with a multispectral sensor for mapping bark beetle (Ips typographus) infestations. Dash et al. (2017) used a UAV with a multispectral sensor to classify a disease outbreak in mature Pinus radiata D. Don. More recently, Klouček et al. (2019) and Safonova et al. (2019) studied bark beetle infestations on trees by using low-cost and customized UAV sensors to detect stages of bark beetle attack in spruce (Picea abies) and fir (Abies sibirica Ledeb) forests in the Czech republic and Russia, respectively. However, very few studies aimed at the fine-scale assessment of forests severely affected by diseases caused by pathogens of the genus Phytophthora have used remote sensing techniques (Cerrillo et al. 2005;Medcalf et al. 2011;Michez et al. 2016;Barnes et al. 2017).
Several studies have reported that texture analysis can improve the accuracy of health status classification in forests (see for example Coburn and Roberts 2004;Franklin et al. 2000). However, much work remains to be done in relation to the efficacy of texture analysis in studies of tree decay and mortality (Moskal and Franklin 2004). In this regard, textural and structural variables derived from digital surface models (DSMs) require further development and testing in forestry and ecological mapping applications. In addition, there is a recent trend to use UAV technology combined with object-based classification methods such as object-based image analysis (OBIA) and Random Forest (RF), which perform well for analysing complex high spatial resolution data, including multispectral, and textural data (Blaschke 2010;Duro et al. 2012;Hossain and Chen 2019;Otsu et al. 2019). Although these object-based image classification methods using RF are generally considered to be more accurate than other methods Franklin and Ahmed 2017;Otsu et al. 2019), RF models have practical limitations, notably the space-complexity of RF estimation from large amounts of data and the difficulty in transferring the models at landscape level from different data sets. Considering some difficulties encountered with these complex and computationally resourcedemanding models, other simpler, robust approaches are urgently required to enhance the potential of highresolution imagery data such as textural and structural data . However, to the best of our knowledge, no studies have yet applied such simple, robust logistics models in combination with very high spatial resolution UAS imagery, to detect biotic decay in forest applications.
The transferability of these models will inspire further research and applications involving a combination of methods applying the results to different areas with similar characteristics or allowing the scaling up of the predictive models on multispectral by integrating satellite remote sensing information in the assessments over large spatial scales (Dash et al. 2018). In this respect, calibration and parametrization of these models are crucial as this represents a starting point for quantifying and analysing the variation spatial-temporal of deterioration of tree health using multitemporal data in this type of ecosystem.
The present study combines the use of multispectral imaging and a red, green, blue (RGB) sensor from UAV for: i) assessing the separability of health status categories in relation to infection by Phytophthora in Alnus glutinosa trees based on comparison of spectral responses and vegetation indices and ii) developing and comparing models to classify the health status from UAV-derived variables using a three-step logistic approach and Random Forest.

Study area
The study area is located in a deciduous floodplain forest covering an area of approximately 41.71 ha (Fig. 1b), within a Protected Area (Natura 2000 SCI Rio Lima PTCON0020; Ramsar Site 1613 "Lagoas de Bertiandos e São Pedro d'Arcos") in the Municipality of Ponte de Lima (NW Portugal). The wetland forest is located along the downstream alluvial floodplain of Estorãos, a stream with a drainage area of 54.4 km 2 and which is a tributary of the Lima river. The elevation in the alluvial floodplain ranges between 4 and 10 m a.s.l. (meter above sea level), the average annual rainfall in the region is 1385.9 mm, and the average annual temperature, 14°C. Land use in the floodplain is a combination of forest alternated with pastures managed at different levels and also abandoned areas where the forest has recovered naturally. The major native forest species are Alnus glutinosa, Salix atrocinerea and Quercus robur, with the occasional presence of Fraxinus angustifolia. Old plantations of showing enlarged views of high resolution crown orthophotomosaics (ground sample distance (GSD), 5 cm) consisting of four categories of forest health status: d asymptomatic, e less than 50% defoliation; f more than 50% defoliation, and g dead Fig. 2 Health status categories in Alnus glutinosa trees considered in this study: healthy (asymptomatic), less than 50% crown defoliation, more than 50% crown defoliation, and dead (pictures, from left to right) Eucalyptus camaldulensis, Acacia melanoxylon and Populus spp. are also present, representing ca. 1% of the forest area.

Field data
In the field, 81 black alder trees (Alnus glutinosa) with distinctive crowns were surveyed and located using a hand-held Ashtech Mobile Mapper 100 submetric GPS unit (Fig. 1c). These 81 trees were representative of the four health status classes considered: healthy (asymptomatic) trees (A, n = 30); trees with less than 50% crown defoliation (B, n = 12); trees with more than 50% crown defoliation (C, n = 14); and dead trees (D, n = 25) (Fig. 2). Apart from tree defoliation percentage, the survey included a detailed forest inventory of the following variables (measured in all trees): height, diameter at breast height, number of dead and alive stems, and occurrence of other health status symptoms such as presence of canker in the stems. The crown of each tree sampled in the field was manually delineated based on visual interpretation of the UAV orthomosaic. Field data were used as reference for the tree health status predictive modelling based on UAV imagery as described in the section "Data analysis".

Remote sensing data acquisition and pre-processing
The airborne data collection campaigns were conducted on 27 September, 2018. Weather conditions were characterized by calm winds and clear atmospheric conditions at the flight time. RGB and Multispectral flights were done closer to solar noon. The RGB data were collected between 12.31 and 12.53 pm to minimize the effect of shading. An RGB S.O.D.A. 10.2 (20 MP) camera (senseFly Co, Cheseaux-Lausanne, Switzerland) was mounted, with nadir view, on a fixed-wing UAV (Sense-Fly eBee) (Fig. 3). The camera, equipped with a 12.75 mm × 8.5 mm sensor and 5472 × 3648 pixel detector, was used in manual mode, and exposure settings (ISO 150 and shutter speed of 1/1000) were established before each take-off according to the atmospheric conditions. This provided a mean~5 cm•pixel − 1 resolution for a mean altitude of 170 m above ground level. eMotion V. 3.2.4 flight planning and monitoring software was used to determine the main flight parameters. A longitudinal and lateral overlapping of 85% and flight line spacing of 25 m were used to collect the images. A total of 570 images were used to generate orthomosaics and DSMs in further analyses. Two-block flights were required to capture the entire forest study area (the orthomosaic covered an area of approximately 175 ha). For further details of sensor and flight parameters see the steps outlined in Guerra-Hernández et al. (2018).
In addition, 447 discrete-band multispectral images were obtained with a 1.2-megapixel Parrot Sequoia camera (focal length 3.98 mm), which captures~50 nm wide bands in the green (B1~550 nm), red (B26 60 nm), and near-infrared (B4~790 nm) regions, and a~10 nm wide band in the red-edge (B3~735 nm) region. Ancillary data for radiometric correction and orthorectification (irradiance, sensor coordinates and position) were captured synchronically for each multispectral image. The Parrot Sequoia also captures simultaneous true-colour imagery with a 16-megapixel sensor (RGB Sequoia, not used in the current study). Imagery acquisition was also conducted on 27 September, 2018 between 11.00 am and 11.45 am, before the RGB S.O.D.A. flight, using the same previously described fixed wing UAV. Multispectral Sequoia images were acquired for an mean altitude of 140 m above ground level, with 80% longitudinal overlap and 80% lateral overlap, yielding a ground sample distance (GSD) of 10 cm.

Photogrammetric processing of UAV data
The UAV images were processed using Agisoft Photoscan version 1.4.5 (Agisoft 2018), and three products were generated for the study area: 1) RGB and multispectral reflectance orthomosaics with pixel size of 5 cm × 5 cm; 2) a point cloud with a density of approximately 143 points•m − 2 and 3) a DSM with a pixel size of 10 cm × 10 cm. The most important parameters used for photogrammetric reconstruction in Photoscan are summarized in Table 1. Images were initially geotagged on the basis of data from the aircraft's on-board global positioning system (GPS), and angular offsets were determined on the basis of data obtained from the on-board inertial measurement unit. During processing, sets of images were georeferenced to 9 ground control points (GCPs) geolocated throughout the study area by using a survey-grade GPS (centimetre-level accuracy). The number of GCPs used in the image sets depended on area coverage, resulting in root mean square error (RMSE) values of X = 0.65 cm, Y = 0.183 cm and Z = 0.183 cm and total 0.240 cm.
Multispectral imagery was radiometrically calibrated to absolute ground reflectance on the basis of incident irradiance measures for each capture and reference reflectance from a calibration target (AIRINOV Co., Paris, France; reference reflectance values green: 17%, red: 21.3%, red edge: 26.2%, NIR: 36.2%) recorded with the Parrot Sequoia camera on the ground prior to each flight.
Geospatial products derived from RGB sensor included orthomosaics, point clouds and digital surface models, while those derived from multispectral imagery included four orthomosaic reflectance bands and the normalized difference vegetation index (NDVI). Multispectral orthomosaics were subsequently used to elaborate other vegetation indices than NDVI as described in section "Feature layer generation and data extraction". Some examples of the derived products are shown in Fig. 4a-d.

Feature layer generation and data extraction
We extracted 34 candidate explanatory variables from the UAV data at crown scale, including both spectral and 3D structural features of the tree canopy. We hypothesized that these variables would potentially discriminate between the different categories of crown defoliation.
We used the set of selected spectral indices (Table 2) to extract the values for tree crowns with different levels of defoliation. The aim was to analyse the informative value of different spectral indices to detect the different stages of defoliation based on four spectral bands. Spectral indices were selected from previous research, on the basis of the spectral bands available in the multispectral data set and evidence of correlation with plant physiological stress and leaf cover. The NDVI is a widely used indices that has been demonstrated to be correlated with physiological stress and photosynthetic activity in a different range of environments (Lausch et al. 2016(Lausch et al. , 2017(Lausch et al. , 2018. The green NDVI (GNDVI), the Normalized Green-Red Vegetation Index (NGRVI) and the red-edge NDVI (RENDVI) were specifically included to examine the sensitivity of indices including the red-edge, green and red bands (c.f. Table 2 for each index specification) and to compare these directly to an index based on near-infrared band. The non-linear index (NLI) was selected because some previous research findings suggest that the relationship between spectral indices and some tree biophysical characteristics are non-linear (Goel and Qin 1994;Dash et al. 2017). Therefore, the NLI may be a good indicator of various stages of physiological stress as it can be transformed to express the non-linear relationships in a linear manner. The optimized soil adjusted vegetation index (OSAVI) (Huete 1988), which was originally designed to correct soil contamination using vegetation signals, may detect subtle differences in crowns with relatively low green vegetation coverage. A Raster stack of spectral band was generated in RStudio to obtain the spectral indices using raster package (R Core Team 2020).
The selection of variables obtained directly from UAV-DSM (Digital Surface Model) can also be useful for describing forest canopy structures. Furthermore, the use of these variables avoids the need to normalize the point cloud, which prevents introducing random errors and further propagating in the Canopy Height Models (CHMs) (Giannetti et al. 2018). Such errors may be especially serious in complex scenarios as our study area, where errors of Digital Elevation Models (DEMs) obtained from UAV photogrammetric data (Fig. 5a) or even Airborne Laser scanning (ALS) data (e.g. due to reduced penetration of pulses to the ground and greater uncertainty of the ground classification of ALS data) might propagate errors in the normalization of DSM ( Fig. 5a and b). The DSM variables were computed using the raw DSM values (i.e. in m above sea level) of all pixels within the area corresponding to each crown and including the mean (TPI mean , TRI mean , ROUGHNESS mean ) and standard deviation (SLOPE sd , TPI sd , TRI sd , ROUGHNESS sd ) of the following topographic variables, calculated using the terrain function in the R package raster (Hijmans et al. 2015): slope, topographical position index (TPI), terrain roughness index (TRI) and roughness (ROUGHNESS). Eight neighboring cells were used to compute slope for any cell. These variables were used to describe the horizontal variations in tree canopy height and possibly different crown patterns associated with different levels of defoliation at crown scale.
To provide further description of the horizontal variations of the canopy surface, textural variables were calculated using the DSM with the same parameters as previously described for spectral textural variables in the NDVI. The mean (dsm_GLCM_mean, dsm_GLCM_variance, dsm_GLCM_homogeneity, dsm_GLCM_contrast, dsm_GLCM_dissimilarity, dsm_GLCM_entropy, dsm_ GLCM_secondmoment) were then computed using all pixels within the crown. A window size of 3 × 3 pixels and a 45-degree shift was used in addition to NDVI textures.
Training / validation data extraction at crown level was based on the identification and manual delineation on the RGB orthomosaics of the trees surveyed in the field and stored as polygon vectors. Pixel values of the candidate explanatory variables were retrieved for each tree crown polygon using the extract function embedded in the raster package in R. The extracted values were used to calculate the mean and standard deviation of the  (Tables 2 and 3). The derived statistics were included in a database and used as the input data for further analysis using RF (Random Forest) classifiers and logistic models.

Data analysis
Data analysis was conducted in two steps as described below: i) an initial step of exploratory analysis based on assessing the separability of damage classes based on comparison of spectral responses and vegetation indices and ii) a second step of modelling, where two different approaches of classifications were implemented, one based on random forest and other based on a cascade of logistic models. Results of these two alternative classifications were used to compare their performance as health status predictors. Field data were used as reference for training and validation purposes.
In addition, throughout the modelling stage, we assessed for each of the classification methods (i.e. random forest and logistic models respectively) the performance of three different levels of complexity (generalization or grouping) of classes, namely: the original four classes discriminated in the field (A, B, C, D), three classes by merging the two classes of alive trees with defoliation (A, B + C, D) and two classes discriminating alive vs dead trees (A + B + C, D).

Exploratory analysis: reflectance/VI thresholding
Boxplot from reflectance value of the spectral bands were empirically analysed at crown level by using the four health status categories. On the other hand, oneway ANOVA was then used to determine the differences between the means of the spectral characteristics of the four classes considered based on the indices values calculated from the images. These previous analyses were done to see possibilities for discrimination and possible overlaps of spectral responses between the target classes.

Tree health status modelling by random forest
The tree health status was modelled using a random forest (RF) model implemented in the randomForest package of R (Liaw and Wiener 2002). RF models were fitted using the four health status categories -classes of defoliation-considered in the field tree health assessment as response variables and the variables computed from both sensors as candidate predictor variables. RF is a machine learning method which relies on the generation of classification trees and on the aggregation of their results. It uses an input feature vector, which is classified with all trees in the forest. This results in a class label for the training sample in the terminal node where it is finally located. Iterating this procedure over all trees yields the random forest prediction. All trees are trained with the same features but with different training sets, which are generated from the original training set. The regression trees are generated through bagging (Breiman 1996), so that trees are randomly selected through bootstrapping of the response and dependent variables. In the forest building process, when bootstrap sample set is drawn by sampling with replacement for each tree, about 1/3 of original instances are left out (Adelabu et al. 2015). A randomly selected subset of the variables at each node are used to find the best split. The number of classification trees was set to 500, as the error rates were reaching the asymptote for this value. Seven variables were tested at each split as this was found to be the most suitable number based on a trial and error approach.
To assess image classification accuracy for each of the three levels of generalization of health status classes, we computed the corresponding confusion matrices, producer, user and overall accuracies and Kappa indices (Cohen 1968) by comparing the predicted and reference class on the 81 selected trees that were surveyed in the field. Roughness is the difference between the maximum and the minimum value of a cell and its 8 surrounding cells Texture dsm_GLCM_mean (window size of 3 × 3 pixels and a 45 degree shift) "mean", "variance", "homogeneity", "contrast", "dissimilarity", "entropy", "second_moment", "correlation" A measure of the explanatory power of the input variables known as the Variable Importance (VI) was obtained by random permutation and reassessment of the misclassification rate for each variable (Breiman 2001).
Finally, RF classifications were also conducted and checked by excluding the textural variables derived from the NDVI and DSM and topographic variables related to DSM from SfM point clouds.

Tree health status modelling by logistic regression
Although most cumulative distribution functions may perform well in modelling the probability of occurrence in classification problems, the logistic function is the most widely used (e.g. Vanclay 1994; Monserud and Sterba 1999) because it is mathematically flexible and has a meaningful interpretation (Hosmer Jr et al. 2013). The logistic regression predicts a probability of an occurrence ranging continuously between 0 and 1.
Two different approaches can be used in logistic regression to model the probability of belonging to a specific category: the first is based on the use of a single multinomial logistic regression and the second is based on the use of a set of binary logistic regressions. As we wish to compare probability estimation models with different number of classes (4 classes: A, B, C and D; or 3 classes: A, B + C and D) a set of binary logistic regressions was used; in addition, this method does not require assumption of the hypothesis of independence from irrelevant alternatives that it is necessary to assume in multinomial logistic regression (Paramesh 1973).
The logistic model is formulated as where π is the probability of occurrence of event (decay class selected), x 1 to x p are independent variables, β 0 is the intercept, and β 1 to β p are parameters to be estimated or regression coefficients.
Different classification sequences of the categories considered were tested using binary logistic regression models. The best results were obtained by first fitting a model to estimate the probability of belonging to class A (asymptomatic trees) versus non-class A, then a model to estimate the probability of belonging to class D (dead trees) versus defoliated trees (class B or C) and finally a model to estimate the probability of belonging to class B (less than 50% defoliation) versus C (more than 50% defoliation).
The independent variables were selected using the stepwise approach of the LOGISTIC procedure of SAS/ STAT® (Institute SAS 2004). Collinearity between regressors was prevented by checking the variance inflation factor (VIF). In this study, regressors with VIF above 10 were disregarded (Belsley et al. 2005). The performance of the models was evaluated in terms of percentage of concordant pairs, generalized coefficient of determination (R 2 ; Cox and Snell 1989) and adjusted coefficient of determination (adj. R 2 ; Nagelkerke 1991). Once each logistic regression model was fitted, the threshold for deciding whether to classify a tree as an event or nonevent (class A versus no A or class D versus B + C or class B versus C, depending on the model) was obtained by maximizing both the sensitivity and specificity of the model. Finally, the confusion matrix was obtained using the previously selected thresholds.
Leave-one-out cross-validation was performed for each binary logistic regression model using the glm function of the R statistical software (Dobson and Barnett 2008;Marschner et al. 2018) and the confusion matrix for cross-validation was obtained using the previously obtained set of thresholds. As in the random forest classification, confusion matrices, accuracies and kappa indices were computed comparing the predicted vs. reference values for the set of 81 trees and considering the three classification schemes assessment.

Reflectance/vegetation indices thresholding
Analysis of SEQUOIA-sensor spectral bands was conducted to determine the spectral profiles of the asymptomatic, defoliated (2 classes) and dead trees (Fig. 6). The spectral differences between asymptomatic (class A) and defoliated trees (classes B and C) were distinguishable both in the visible and near-infrared and were higher in the red-edge and near-infrared (NIR) bands (B3 and B4, respectively) than in the visible bands (B1 and B2). Spectra of the dead trees (class D) differed from the normal canopy reflectance spectra; the spectra were brighter at the visible wavelengths and darker at the NIR wavelengths. Only minor differences were observed in the spectra of defoliated and asymptomatic trees; compared to the asymptomatic trees, the defoliated trees had a higher reflectivity in the red and green portions of the spectrum and lower reflectivity in the NIR part of the spectrum. The results suggest that the red band is more valuable for detecting defoliation induced by the Phytophthora alni species complex.
The differences of the vegetation indices values can be visually assessed in Fig. 7. The box plots show overlapping in the vegetation indices values between the classes and different behaviours depending on the particular classes and indices. The overlapping is particularly clear for the classes B and A with NGRVI, NLI and OSAVI. Class C also showed a higher level of dispersion for all the tested indices and only a low level of overlapping with other classes for the NGRVI index.
ANOVA was then used to determine the difference between the means of the spectral characteristics of the four classes considered (Table 4). The confidence interval for this statistic was 95%. The ANOVA revealed the existence of significant differences between the NDVI values for the different classes (ANOVA, p < 0.001). A pairwise multiple comparison of means showed that NDVI was robust enough to separate the dead from the living trees (class D vs. classes A + B + C in Table 4) and also to separate most of the pairs of classes. However, the results of the Tukey's test indicated that differences in NDVI between asymptomatic trees and trees with less than 50% crown defoliation (A vs. B) were not significant in this parametric test as well as the other indices. As NDVI, the Tukey's test also indicated significant differences between classes C and D for NGRVI, NLI and OSAVI indices.

Random forest classification
The results for the overall accuracy was 67% when using the four classes considered (Table 5). The accuracies for each class revealed a high level of confusion between the two categories of defoliation (B and C).
The RF classification for the generalized three-class scheme (A = Asymptomatic, B + C = defoliated and C = dead trees) yielded an increase of 5% in terms of the overall accuracy (72%). As expected, the fusion of the classes of partially defoliated trees in a single class increased the accuracies of this class to around 57% for user accuracy and 46% for producer's accuracy (Table 6).
Finally, good separation was obtained when only two classes of trees were considered (alive and dead) ( Table 7). The data set provided the best results with an overall accuracy of 91% and also increased per class accuracies to values higher than 80%. The overall Kappa value indicated good agreement between reference and predicted values, close to the interval of almost perfect agreement (0.81-1.00) according to the scale proposed by Landis and Koch (1977).
Regarding the different choices of variables tested, the exclusion of image texture decreased by 7%, 10% and 4% for four classes, three classes and two classes, respectively (Table 8). After removing the DSM-derived variables, the classification results showed that the overall classification accuracy decreased by 7%, 11% and 5% for  (Table 8).

Logistic regression
Three logistic models were adjusted in a three-step approach to discriminate and classify the 4 categories of defoliation in the study (Estimated parameters, standard errors (SE), z values statistics and p-values for the models are presented in Additional file 1). The first logistic model (logit1) predicts the probability of the tree belonging to class A (asymptomatic trees) as shown in Table 9. Hence, the model distinguishes between class A and the other trees (B and C, defoliated trees and D dead trees). The model included only 2 variables as significant predictors, the GNDVI and dsm_ Glcm_dissimilarity as a texture feature from the DSM.
The fitted model is expressed as The percentage of concordant pairs was 91.21%. The probability threshold for discriminating between belonging to group A or not was established to be 0.37. Therefore, trees for which probabilities higher than 0.37 were observed, belong to category A. The logistic model estimated sign (positive) indicates that an increase in GNDVI will increase the likelihood of the tree of belonging to class A. On the other hand, lower variability in canopy surface height from DSM texture (i.e. lower values of the GLCM dissimilarity parameter) increases the likelihood of the tree belonging to class A. The second logistic model logit2 discriminates between the remaining classes D (dead trees) and defoliated trees (B and C) ( Table 10). The model included as predictive variables the ndvi_Glcm_variance related to texture from NDVI and dsm_Glcm_variance as a texture measure related to canopy height from DSM.
The fitted model is expressed as: The percentage of concordant pairs was 95.69%. According to Hosmer Jr et al. (2013), the area under the ROC curve (0.9569) indicated excellent discrimination of the model. A cut-off point discriminating between belonging to group D (dead trees) or group B + C (defoliated trees) was established at 0.50. In other words, at probabilities higher than 0.50 the tree belongs to class D (once it has been ruled out that the tree belongs to class A, asymptomatic trees). The model indicates that an increase in the variability of canopy height from DSM and higher variability in contrast from NDVI texture increases the likelihood of tree belonging to the class D (dead).
Finally, the third logit3 model discriminated between classes B and C, after elimination of alive and dead trees (classes A and D, respectively) (   The percentage of concordant pairs was 85.71%, and the cut-off point for determining whether trees belong to class B or C was established at 0.41. Therefore, trees for probabilities higher than 0.40 were observed belong to class B. The confusion matrix computed for all the logistic models using cross validation showed an overall accuracy of 75%, 80% and 94% for 4, 3 and 2 classes, respectively (Tables 9, 10 and 11).

Discussion
Rapidly increasing disease-induced forest disturbance is a global threat to forest sustainability, and accurate and cost-efficient detection of stand and tree conditions for timely forest management is therefore needed (Lausch et al. 2016(Lausch et al. , 2017(Lausch et al. , 2018. The emergence of new sensors and UAS provides an opportunity to augment traditional field-based approaches by combining remotely-sensed  Fig. 8 The most important variables measured in term of Mean Decrease in Gini considering four (a), three (b) and two classes (c) from RF approach data products to produce enhanced information about forest condition (Pádua et al. 2017). Lightweight sensors mounted on UAV platforms has proven to be particularly suitable for forestry applications (Guimarães et al. 2020) but might suffer from shortcoming for wide-scale applications, due to their relative low yields. Previous works (Matese et al. 2015;Manfreda et al. 2018) pointed out that UAV is a costeffective solution for areas equal or less than 20 ha. In our case we developed and tested the sensitivity of multispectral and RBG images obtained from UAVs to detect disease-induced defoliation in a natural floodplain forest in northern Portugal on an area of around 200 ha, flown in a single day, pointing out that acceptable yields might be reached with lightweight UAVs on significantly larger areas. Furthermore, such yields are still far lower than satellite remote sensing, but like in our study case, the use of UAV showed advantages for the detection of anomalies at tree scale, useful for an early stage detection of disease outbreak. In fact, it was argued that even high-resolution satellite imagery requires clusters of at least three or four mature trees for detecting stress, making them more suitable for large-scale detection of advanced stages of damage (Dash et al. 2018).
The proposed approaches showed a high potential for detection of disease outbreaks in a spatial explicit way over small-medium size forest stands at detailed scale based on a limited field sample. The method might also be extended to other scenarios, either requiring a new collection of training data or, in the case of the logistic model approach, even considering the same regression coefficients and therefore avoiding such data collection, as discussed in the section "Logistic regression". The method might also be extended to the multitemporal monitoring of the disease evolution when applied on multitemporal collections. Of course, the extrapolation to other geographical settings or to a multitemporal analyses relies on the availability of datasets with comparable resolutions (radiometric, spatial, spectral) and good quality standards to avoid artefacts due to data differences. In fact, even complex radiometric corrections of multispectral images based on reflectance calibration panels and synchronically irradiance records might not address conveniently the complexity of radiometric normalization and some artefacts might remain (Manfreda et al. 2018). In any case such potential problems in radiometric correction and homogeneity are by far compensated by the sensitivity of UAV data to minor differences of vegetation cover characteristics, attributable to their acquisition flexibility, spatial resolution and specificity of spectral bands recorded (Dash et al. 2018;Manfreda et al. 2018;Guimarães et al. 2020).

Reflectance/VI thresholding
Although the significance of the NIR band has already been demonstrated (Minařík and Langhammer 2016;Abdullah et al. 2018;Stoyanova et al. 2018;Klouček et al. 2019), detection of the pre-visual stage of this disease attack ("green attack") remains challenging, especially using inexpensive cameras. Asymptomatic vegetation typically takes up much of the visible light that falls on it and reflects much of the nearby infrared Table 9 Image classification accuracy by group in four classes, where A = Asymptomatic trees, B = trees defoliated less than 50%, C = trees defoliated more than 50% and D = dead trees, PA = producer's accuracy, UA = user's accuracy, value shown in bold = overall accuracy. Classification and reference (field check) frequencies are arranged in rows and columns respectively   area. According to this general principle, our study confirmed that in comparison to asymptomatic trees, defoliated trees have a higher reflectance in the green and red intervals of the spectrum and lower in the NIR (c.f. Fig. 5), which corresponds with the findings of Näsi et al. (2015) at tree level and those of Abdullah et al. (2018) at leaf level. Hence, in the visible region, the mean reflectance was higher in the defoliated crown than in the asymptomatic canopy, in conjunction with the absence of leaves and also with pigment and leaf structure degradation. In any case, important overlaps between spectral signatures were detected, pointing out limitations and the need of further elaboration of the reflectance data.
The concentration of pigments such as total chlorophyll is the main factor for determining leaf spectral variation and absorption peaks (Carter and Knapp 2001). The defoliated trees had significantly lower chlorophyll concentrations than the asymptomatic ones, resulting in lower absorption and higher scattering. There was a larger difference between defoliated and asymptomatic crowns in the near-infrared (~790 nm) region, as well as in the~10 nm wide band in the red-edge (~735 nm), as the asymptomatic crowns had higher reflectance compared to the average defoliated crown spectra as an evident effect of pigment signal decrease due to the reduction of green biomass. Hence, in healthy plants, reflectance is generally greater in the NIR region than in diseased and stressed plants, and reflectance can therefore be used to distinguish one from the other. The position and amplitude of the red-edge transition between the high red absorption and the high NIR reflection by vegetation have also shown a good correlation with pigment concentration and canopy structure features such as leaf area index (Filella and Penuelas 1994;Ju et al. 2010). One of the key features of remote pigment assessment is the capacity to separate the signals related to individual pigments and structural changes in tree canopies. These assessment methods must account for overlapping signals, especially when differentiating between chlorophyll and other pigment groups (Hernández-Clemente et al. 2014).
Considering these general principles, several vegetation indices combining visible, red-edge and NIR spectral bands were chosen for study as we aimed to evaluate the joint effect of changes in pigments and leaf/canopy structure as an indicator of defoliations caused by the disease. In all the cases, the separation between the asymptomatic and the early stages of defoliation proved to be problematic based only in the comparison of vegetation indices values. The NDVI proved to be sufficiently robust to separate the dead from live trees (class D vs. classes A + B+ C in Table 4) and also between most of the pairs of classes. Although many indices have been used in remote sensing research in the last 40 years (Bannari et al. 1995), the NDVI has been extensively used to discriminate healthy from senescent foliage in forest stands affected by insects and disease attack (Spruce et al. 2011;Lottering and Mutanga 2016;Dash et al. 2017). NLI and OSAVI computed with red and NIR bands (like the NDVI) indicated significant differences between the two classes of defoliated crowns. Our findings showed that the combination of red-edge and NIR region (RENDVI) was less capable than NDVI of detecting defoliation classes induced by alder Phytophthora. NGRVI calculated from the visible green and red bands also revealed significant differences between the two defoliated categories and was selected as a significant variable in the third logistic model to classify the defoliated trees. In fact, as pointed out by Motohka et al. (2010), this index is particularly suited to detecting subtle changes in canopy pigmentation in the range of yellow colours in the autumn stage of deciduous forests than other indices based on red-edge or NIR wavelengths. Our findings suggest that the index also might detect such changes, in this case associated with different levels of biotic defoliation.

Random forest
An RF model was chosen to predict defoliation levels, as such models have been demonstrated to be suitable for predicting tree health status (Adelabu et al. 2015). As the previous studies (Näsi et al. 2015) distinguished well between asymptomatic and dead trees (with an overall accuracy of 91%), we focused our attention on distinguishing between asymptomatic, dead and two categories of defoliated trees. Categorization into two classes (dead vs. alive) provided good results, also with an overall accuracy of 91% and Cohen's kappa of 0.82. However, separation between the two classes of partial defoliation to produce a total of four health status categories yielded poorer results, with an overall accuracy of 67% and Cohen's kappa of 0.52 pointing out the complexity of the discrimination between classes of partial defoliation. Our results regarding overall accuracy (72%) using the RF approach with 3 classes (asymptomatic, defoliated, dead) complement those reported by Klouček et al. (2019), who evaluated UAV-based detection on mosaics based on images acquired during a single mission. In the corresponding acquisition time (at a different latitude), these researchers recorded a 78% overall accuracy. Our results with 3 classes are also similar to those of Näsi et al. (2015), who reported an overall accuracy of 76% based on hyperspectral UAV imaging at the individual tree level when using three classes (asymptomatic, defoliated, dead) in urban spruce forests.
Comparable applications of airborne image texture analysis in forest health studies using UAV-derived data are scarce (Moskal and Franklin 2004). The texture and crown surface height information denoted the shape of the defoliated crowns, adding a spatial domain to the spectral domain. Texture processing generally improved the classification accuracy in the order of 5 ± 10%, which is slightly lower than the accuracy obtained by (Moskal and Franklin 2004) for aspen (Populus tremula) defoliation (increase of 15%) from multispectral imagery and also using NDVI texture. The present study demonstrated that the severity of black alder crown defoliation can be detected with multispectral and RGB UAV image data, specifically by the incorporation of imagery spatial component captured by image texture from NDVI and DSM.
In contrast to previous findings based on satellite data applications for early detection of physiological stress (Dash et al. 2018), indices based on red-edge bands were not prominent in the discrimination of defoliation levels. In turn, GNDVI and textural variables from NDVI were the most important predictors of the first and second model. RENDVI was less important predictor of the categories tested using the RF approach for four and three classes, respectively. The greater importance of NDVI may be attributable to the broader coverage of the NIR from this sensor (50 nm wide band in the red-edge (7 90 nm) than the red edge region covered by the UAV mounted sensor, which is limited to a narrower spectral window (10 nm wide band in the red-edge (~735 nm)). The considerably narrow band and the location of the red-edge band in this particular multispectral sensor may result in failure to detect spectral differences associated with pigment/leaf structure differences reflected at a given point of the red to NIR transition not recorded by the sensor.

Logistic regression
A robust and parsimonious first logistic model (logit1) was developed by using only two variables to distinguish asymptomatic trees from defoliated and dead trees. The model indicates that an increase in the mean value of GNDVI from the crown and lower dissimilarity from DSM increases the likelihood of tree belonging to class A (asymptomatic trees). The GNDVI is attributed to the chlorophyll content of the leaves (Shanahan et al. 2001). The chlorophyll content reflects the physiological state of vegetation as it decreases in stressed plants and can therefore be used as a measurement of plant health. In our case, the GNDVI also succeeded in the detection of levels defoliation in trees in concordance with the chlorophyll contents. The dsm_Glcm_dissimilarity as a texture predictor from the DSM feature is a measure of the amount of local variations from the crown surface height. Therefore, a large amount of local variation present in defoliated trees incremented the probability of not belonging to the class A (i.e. indicating defoliation). Regarding the second logistic approach (logit2) to discriminating between the group D (dead trees) and groups B and C (defoliated trees), our results in terms of percentage of concordant pairs displayed a greater fitting capacity (i.e. model efficiency with higher value of concordance -0.95) than in other studies developed for tree mortality using biometric and topographic variables at tree level (Botequim et al. 2017) and stand level (González et al. 2004). The significant variables selected suggest that both NDVI and DSM textures also affect the capacity of the models to explain the probability of belonging to the dead class, which is consistent with the importance of the variables indicated by the RF model. These variables were used to describe the variations in tree canopy height surface and spectral behaviour between pixels and possibly different crown patterns associated with different levels of defoliation at crown scale. The DSM showed that spatial variations in surface height reflect the changes over the different crown conditions. Hence, in the case of DSM_Glcm_variance, the positive sign in the parameter of the model indicates that greater variance from DSM increases the probability of belonging to the dead class. This effect may be explained as variability in canopy height increased within the dead crowns and consequently increased the probability of the tree dying, while a smoother surface due to the leaf coverage indicated crowns with a certain level of leaf coverage. This finding is consistent with the biological meaning of the first model including a textural significant variable from the DSM. In addition, higher variability in contrast of the NDVI texture increased the likelihood of a tree belonging to the dead class and indicated a significant impact on the spectral response in a given dead crown. In the present case, ndvi_Glcm_contrast was also positively correlated with the probability of mortality as the contrast of NDVI between pixels was higher in death crowns probably due the effects of shadows between naked branches. Thus, forest managers may wish to consider textural variables from the DSM and NDVI, in order to detect black alder crown mortality.
Finally, the third model (logit3), which aimed to discriminate between categories B and C (defoliated trees) achieved a value of 85.71% in terms of concordance of pair value. In this case the model basically relied on the NGRVI. The NGRVI was based on green and red bands and showed a lower dispersion for all the tested indices and only low overlaps with other classes for the defoliated class C (c.f. Fig. 7). Our findings for NGRVI confirmed that the green and red bands had higher discriminatory power than other tested predictors for classifying the defoliated trees, in accordance with the results of earlier studies focused on plant stress (Carter 1993;Carter and Knapp 2001;Zarco-Tejada and Sepulcre-Cantó 2007) and confirming that pigment concentration signal retrieved by vegetation indices in the visual spectrum successfully discriminated between intermediate levels of crown defoliation.
This research estimated a set of models through two alternative approaches to predicting classes of defoliation that represent a step forward for managing broadleaved forest stands under risk conditions in areas affected by the disease, and identifying remotely sensed features that enable reliable estimation of the level of defoliation. Our study highlighted the simplicity and replicability of a logistic method that suggests combining the best variables for a series of classifications to extract the relevant information on different vegetation features categories. This novel approach showed advantages in the overall performance of the method and in the transparency and replicability relative to the RF approach. In fact, the absolute values of the binary regression model might be directly extrapolated to datasets other forest under similar conditions, avoiding a new collection of reference data. This line of research appears particularly useful for designing prescriptions that could enable forest managers to reduce the ecological and economic effects of disease-induced decline particularly in ecosystems where disease is recurrent.

Conclusions
The aim of this study was to develop and test innovative methodological approaches for assessing spatial and qualitative aspects of forest disturbance at high spatial resolution from close-range remote sensing data obtained with inexpensive sensors. Regarding the performance of the methods tested for classification crown defoliation, a high level of overlapping between classes was observed for an effective discrimination between classes using reflectance/Vegetation indices thresholding. On the other hand, we observed differences between the RF and logistic models depending on the classification scheme considered. Both methods performed well in discriminating between live and dead trees, but there were notable differences in more complex classification schemes of three and four classes, in which the logit models outperformed the RF approach (by up to 8 percentage points) in overall accuracy. With the aim of developing a simple and robust monitoring tool for forest managers, we used a three-step logistic analytical method that achieved the highest overall accuracy of 75% by considering 4 classes scheme. The findings suggest that simple, robust logistic models could be applied to enable forest technicians to obtain timely information and monitor forest defoliation at the operational level as an alternative classification method to the more complex random forests approach. The method could also lead to savings through model transfer and reductions in data acquisition costs (i.e. by using only RGB and multispectral data). In future studies, the robustness of the best performing models for differentiating specific crown defoliation levels should be tested in different study areas and at different times in order to contribute to forest health monitoring at the operational level. The survey method based on a combination of high-resolution multispectral and RGB imaging will be of great practical value for forest health management as it is capable of indicating the potential severity of black alder crown defoliation at a given time. This approach provides a reliable and cost-effective tool to support the monitoring and assessment of changes in alder floodplain forests, and can be used as an aid to design conservation and restoration plans, transferable to other rivers and catchments across the whole distribution where these ecosystems naturally occur.

Availability of data and materials
The research data and materials are available from the corresponding author on reasonable request.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
We express our consent to publish the paper in the journal.