Discriminating Xylella fastidiosa from Verticillium dahliae infections in olive trees using thermal- and hyperspectral-based plant traits

large-scale detection methods. Airborne hyperspectral and thermal imagery have been successfully used to detect both Xf and Vd infection symptoms independently, i.e., when only one of the two diseases is present. Nevertheless, the discrimination of Vd from Xf infections in contexts where both pathogens are present has not been addressed to date. This study proposes a three-stage machine learning algorithm to distinguish Vd infections from Xf infections, using a series of datasets from 27 olive orchards affected by Xf and Vd outbreaks in Italy and Spain between 2011 and 2017. Plant traits were derived from airborne hyperspectral and thermal imagery, including physiological indices from radiative transfer model inversion, Solar-induced Fluorescence emission (SIF @760), the Crop Water Stress Index (CWSI), and a selection of narrow-band hyperspectral indices. Several distinct spectral traits successfully discriminated Xf from Vd infections. The three-stage method generated a false-positive rate of 9%, an overall accuracy (OA) of 98%, and a kappa coefficient (κ) of 0.7 when identifying Vd infections using a mixed Vd + Xf dataset. When identifying Xf infections, the false-positive rate was 4%, the OA was 92%, and κ was 0.8. These results indicate that hyperspectral and thermal traits can be used to discriminate between Xf and Vd infections, which are caused by two xylem-limited pathogens that trigger similar visual symptoms.


Introduction
Olive trees are cultivated on over 11.5 million ha of land worldwide, with most of the production occurring in the Mediterranean Basin (Rodríguez-Cohard et al., 2020). Regions such as the USA, Asia, Oceania (FAO, 2008), and, more recently, other areas in the southern hemisphere also produce olives (Torres et al., 2017). Over 100 pests and pathogens infect olive trees (Olea europaea), reducing yields and increasing total production costs (Fernández-Escobar et al., 2013). Two particularly destructive pathogens are the soil-borne fungus Verticillium dahliae (Vd) Kleb (Tsror, 2011) and the gram-negative bacterium Xylella fastidiosa (Xf) (EFSA, 2019).
Vd causes Verticillium wilt (VW), the main soil-borne fungal disease threatening olive production worldwide (Jiménez-Díaz et al., 2012). First described in Italy in 1946, VW is now present in many Mediterranean countries, including Spain, Italy, Greece, France, Malta, Turkey, Morocco, Jordan, Algeria, Israel, Iran, and Tunisia. VW has also been reported in Argentina, the USA, Australia, and Peru, with more than 80,000 ha affected globally (Jiménez-Díaz et al., 2012; López-Escudero and Mercado-Blanco, 2011). There are two main pathotypes of Vd: defoliating (D) and non-defoliating (ND). The D pathotype is more virulent than the ND pathotype and can be lethal for infected trees. The first report of the D pathotype in Spain was in an intensive cotton field, but it has since spread across olive orchards in Andalusia, where it is now the predominant Vd pathotype, causing significant yield losses and tree mortality (Navas-Cortés et al., 2008).
Xf is one of the most devastating plant pathogens of olive and is categorized as a quarantine plant pathogenic bacterium in the European Union (Schneider et al., 2020). Xf infections cause olive quick decline syndrome, characterized by leaf scorch and desiccation of twigs and branches. Xf-infected olive trees were first reported in California in 2003 (Wong et al., 2003) and then in Apulia, the main olive-growing area of Italy, in 2013 (Saponari et al., 2013). Since then, Xf infections have been identified in several European countries, infecting about 100 host plant species across Tuscany (Italy), Corsica and the PACA region (France), Alicante and the Balearic Islands (Spain), and Northern Portugal (EFSA, 2020). Countries in the Mediterranean basin in Europe (EFSA, 2019) and in the Middle East and Africa, including Lebanon, Turkey, Morocco, and Tunisia, are the most vulnerable to Xf outbreaks (Frem et al., 2020).
Both Vd and Xf are xylem-invading pathogens that colonize the host through vascular tissue and eventually block water flow through the xylem. Vd enters through the root system, reaches the xylem, and moves within the vascular stream. Symptoms appear when the fungus emerges, where it blocks vessels and colonizes neighboring tissues (Klosterman et al., 2009). Xf infections gradually block the xylem, leading to a reduction of sap flow due to bacterial growth and plant physiological responses (Sicard et al., 2018). Both xylem-limited pathogens trigger similar symptoms in the canopies of infected plants, including foliar discoloration, wilting of apical shoots, dieback of twigs and branches, and general decline (Carlucci et al., 2013). Symptoms caused by the two pathogens may also be confounded with generic water stress responses, due to the xylem-limiting nature of the infection processes (Hopkins, 1989; Klosterman et al., 2009).
Vd and Xf infections cannot be discriminated visually in situ due to their similar effects on host trees. Accurate identification requires laboratory testing using molecular diagnostics that are laborious and expensive (Gramaje et al., 2013). Consequently, there is a need for more cost-effective approaches to detect and discriminate infected plants at large spatial scales and within reasonable time frames. Such methods could enable eradication of incipient Xf outbreaks (Almeida, 2016) and reduce the serious damage caused by Vd worldwide. Remote sensing techniques have great potential for efficiently detecting early disease symptoms in olive trees (Martinelli et al., 2015). Calderón et al. (2013) successfully used hyperspectral and thermal sensors to detect Vd infections in olive trees, finding that the thermal Crop Water Stress Index (CWSI) (Idso et al., 1981), Solar-induced Chlorophyll Fluorescence (SIF) (Plascyk, 1975), and the blue-region spectral indices BGI1 (Zarco-Tejada et al., 2005), BRI1 (Zarco-Tejada et al., 2012), and B (Calderón et al., 2013) were the most sensitive for detecting Vd-induced symptoms at the early stages of infection. Calderón et al. (2013) also established that several spectral indices were useful for assessing disease severity at advanced stages of disease development (i.e., for damage mapping after symptom emergence). The indicators of advanced disease severity included the photochemical reflectance index (PRI) (Gamon et al., 1992) and variants of PRI, such as PRI 515 (Hernández-Clemente et al., 2011), the Transformed Chlorophyll Absorption Reflectance Index (TCARI) (Haboudane et al., 2002), and the Healthy Index (HI) (Mahlein et al., 2013). A subsequent study by the same authors (Calderón et al., 2015), conducted over a large area comprising infected orchards differing in soil and crop management characteristics, showed that the canopy-air temperature difference (Tc-Ta); SIF; PRI 515; HI; structural indices such as the Renormalized Difference Vegetation Index (RDVI), the Modified Triangular Vegetation Index (MTVI 1), the Modified Simple Ratio (MSR), the Optimized Soil-Adjusted Vegetation Index (OSAVI), and the Enhanced Vegetation Index (EVI); the chlorophyll indices TCARI, Gitelson & Merzlyak (GM 1), and Pigment Specific Simple Ratio (PSSR b); and the carotenoid index (R 515 /R 570) were the most sensitive spectral indicators for detecting Vd infections at both early and advanced stages of the infection. They reported an overall accuracy (OA) of 79.2% and a kappa coefficient (κ) of 0.49 when detecting Vd-infected olive trees across all severity levels. The authors of both studies (Calderón et al., 2013; Calderón et al., 2015) suggested that Tc-Ta and SIF were the most important indicators for detecting Vd infection symptoms at the early stages of disease development, both at the orchard scale and over larger areas. They reasoned that these thermal and fluorescence indicators of early infection were more robust than spectral reflectance indices to variability in the orchard management strategies typically encountered when monitoring large areas.
The first study to evaluate early detection of Xf symptoms using hyperspectral remote sensing imagery (Zarco-Tejada et al., 2018) found that a combination of spectral plant traits retrieved from radiative transfer model (RTM) inversion, narrow-band hyperspectral indices, and thermal traits successfully identified Xf-infected olive trees in Apulia, Italy. The authors demonstrated that the blue-region Normalized Phaeophytinization Index (NPQI) (Barnes et al., 1992; Peñuelas et al., 1995), the thermal CWSI, anthocyanin pigment content (Anth), and SIF were the most sensitive spectral-based plant traits for successfully discriminating asymptomatic from symptomatic Xf-infected trees. They achieved an OA of 80% and a κ of 0.61. Later, Poblete et al. (2020) compared the sensitivity of hyperspectral and multispectral bandsets for detecting Xf infections in an operational context, assessing the contribution of CWSI and SIF. They found that the thermal CWSI was crucial for detecting Xf and demonstrated that large-scale Xf detection is feasible with multispectral and thermal cameras, provided that more than six narrow bands (i.e., 10 nm bandwidths) centered at specific spectral regions are captured, including (in order of importance) 400, 423, 601, 667, 713, 534, 478, 769, 760, and 735 nm. The study demonstrated that the blue spectral region (i.e., bands within the 400-450 nm spectral range used to derive the NPQI index) and the thermal region (used to derive CWSI) were the most critical spectral regions to detect Xf symptoms in olive. Using six 10-nm bandwidth spectral bands coupled with CWSI, they obtained an OA of 74% and a κ of 0.36. After CWSI and NPQI, the most sensitive indices were those related to carotenoid and xanthophyll pigment dynamics (DC axbc); the carotenoid reflectance index (CRI 700M); PRI derivatives such as PRI M1, PRI M4, PRI n, and PRI × CI; and chlorophyll indices such as Vogelmann (VOG 2) and TCARI normalized by the OSAVI index in the form TCARI/OSAVI.
Although both Xf- and Vd-induced symptoms have been successfully detected independently using hyperspectral and thermal remote sensing techniques, discriminating Xf- from Vd-induced symptoms when both pathogens are present has not been attempted to date. Both Xf and Vd are vascular pathogens that cause similar visual symptoms and may therefore be confounded when monitoring large outbreaks. Xf- and Vd-induced symptoms may also be confused with plant responses to abiotic stress, e.g., nutrient or water limitation. As both Vd and Xf continue to spread globally, detecting and discriminating these infections via remote sensing is increasingly important.
Machine learning techniques may be able to distinguish subtle physiological differences between pathogen-induced symptoms, particularly when combined with hyperspectral and thermal image data. In this study, we propose a multi-stage method to distinguish Vd and Xf infections in olive trees. We combined airborne hyperspectral datasets collected from Xf- (Zarco-Tejada et al., 2018; Poblete et al., 2020) and Vd-infected olive orchards (Calderón et al., 2013; 2015) to assess this method. The remainder of this manuscript is organized as follows: in Section 2, we describe the study areas in detail. We then describe airborne hyperspectral and thermal imagery collection, field assessments, and how Vd and Xf datasets were combined. At the end of Section 2, we outline the multi-stage machine learning methodology for identifying the diseases using the combined datasets. In Section 3, we present and discuss our results. Finally, we highlight the main conclusions in Section 4.

Study sites and Xf and Vd dataset description
Vd data were derived from 11 olive orchards in Southern Spain that were monitored in 2011 and 2013 (a full description of the dataset can be found in Calderón et al. (2013) and Calderón et al. (2015)). The presence of Vd in these orchards was confirmed using a Vd-specific PCR assay (Mercado-Blanco et al., 2003). The D pathotype of Vd was prevalent in all the monitored olive orchards. Orchards were distributed across two study sites. The first study site was located in Castro del Rio (Cordoba, Spain) and contained commercial olive orchards of cv. Picual planted at a spacing of 6 × 4 m (Calderón et al., 2013). Visual assessment of the disease incidence (DI) and severity (DS) was performed on a scale of 0-4 based on foliar symptoms and percentage of incidence (Fig. 1A-D). A total of 1,878 olive trees were assessed, among which 1,569 were asymptomatic and 283 were symptomatic (77% DS = 1, 16% DS = 2, 4% DS = 3, and 3% DS = 4). The second Vd-infected study site was located in Écija (Seville, Spain). Orchards comprised cv. Picual and cv. Hojiblanca, planted at densities of between 123 and 357 trees per ha (Calderón et al., 2015). Visual assessments of DI and DS were performed for 5,223 olive trees; 5,040 trees were asymptomatic and 183 were symptomatic, of which 61% had DS = 1, 22% had DS = 2, 12% had DS = 3, and 5% had DS = 4. The area of the Vd-infected olive orchards used in this study ranged between 1.69 and 7.28 ha each.
The Vd dataset was used in Calderón et al. (2013; 2015), and the Xf dataset was used in Zarco-Tejada et al. (2018) and Poblete et al. (2020) to assess the spectral plant traits sensitive to each disease. In this study, the Xf and Vd datasets were combined, thereby generating mixed Xf + Vd datasets containing asymptomatic, Vd-infected, and Xf-infected trees.
Using this mixed Xf + Vd dataset, specificity and sensitivity tests were conducted to assess the accuracy when discriminating Xf from Vd symptoms. The sensitivity test was performed by applying Xf and Vd discrimination models on datasets containing asymptomatic/symptomatic trees from the non-target disease (the Xf model tested on the Vd dataset and vice versa). The specificity test assessed the performance of each discrimination model using data derived from symptomatic and non-symptomatic trees from the non-target disease (95%) and symptomatic trees infected by the target infection (5%). More details regarding the proposed methodology and the construction of the datasets used to perform the sensitivity and specificity tests can be found in Section 2.3. To account for non-disease-induced water stress in Xf and Vd detection, hyperspectral and thermal datasets acquired in pathogen-free orchards were used as reference.
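The 95%/5% composition of the specificity test set can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the authors' code: the function name `build_specificity_set`, the tree identifiers, and the label convention (1 for target-disease symptomatic trees, 0 for all non-target trees) are hypothetical.

```python
import random


def build_specificity_set(non_target_trees, target_symptomatic,
                          target_fraction=0.05, seed=0):
    """Mix non-target trees with a small share of target-disease trees.

    Returns a shuffled list of (tree_id, label) pairs, where label 1 marks
    trees symptomatic for the model's target disease and label 0 marks
    every tree taken from the non-target dataset.
    """
    rng = random.Random(seed)
    n_non_target = len(non_target_trees)
    # Number of target trees needed so they form `target_fraction` of the mix.
    n_target = round(n_non_target * target_fraction / (1.0 - target_fraction))
    if n_target > len(target_symptomatic):
        raise ValueError("not enough target-disease symptomatic trees")
    sampled = rng.sample(list(target_symptomatic), n_target)
    mixed = [(t, 0) for t in non_target_trees] + [(t, 1) for t in sampled]
    rng.shuffle(mixed)
    return mixed
```

For example, 950 trees from the Vd dataset combined with 50 Xf-symptomatic trees would give a 1,000-tree test set for the Xf model in which the target infection accounts for 5% of observations.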

Hyperspectral and thermal airborne campaigns
Airborne campaigns were performed over every study site to acquire high-resolution thermal and hyperspectral imagery (Fig. 2A,B and C,D for Vd- and Xf-infected study sites, respectively). Flights were concurrent with field data collection. A FLIR Systems SC655c (USA) thermal camera with an uncooled microbolometer-based detector was used, sensitive to the spectral range 7.5-14 µm, with a resolution of 640 × 480 pixels and a focal length of 24.6 mm f/1.0. Thermal vicarious calibration was performed by measuring soil temperature as described in Calderón et al. (2013).
A Headwall Photonics (Fitchburg, MA, USA) VNIR linear hyperspectral imager was used to collect hyperspectral imagery. The imager is sensitive to the spectral range 400-885 nm and collects 260 bands with 6.4-nm full-width at half maximum (FWHM). The sensor has an 8-mm focal length that allows an angular field of view of 50°. Radiometric calibration was performed using a CSTM-USS-2000C integrating sphere (LabSphere, North Sutton, NH, USA). The Simple Model of Atmospheric Radiative Transfer of Sunshine (SMARTS) was used for atmospheric correction (Gueymard, 2001). A portable weather station (Transmitter PTU30, Vaisala, Helsinki, Finland) and a MICROTOPS II sunphotometer (Solar LIGHT Co., Philadelphia, PA, USA) were used to measure meteorological parameters and aerosol optical properties at the time of hyperspectral and thermal acquisitions. The calibrated and atmospherically corrected hyperspectral data were ortho-rectified using IMU and [...]
Pure tree-crown temperature and reflectance were obtained by applying Niblack's thresholding method (Niblack, 1986) and Sauvola's binarization technique (Sauvola and Pietikäinen, 2000) to isolate tree crowns from the background and remove soil effects and within-crown shadows. The pure tree-crown temperature was used to calculate the CWSI (Idso et al., 1981) and to assess changes in the transpiration rates linked to Xf- (Zarco-Tejada et al., 2018) and Vd-induced (Bruno et al., 2020; Calderón et al., 2013) physiological changes. Pure tree-crown radiance and reflectance were used to i) quantify sun-induced chlorophyll fluorescence at 760 nm (SIF @760) using the O 2 -A in-filling Fraunhofer line depth (FLD) method (Plascyk, 1975); ii) calculate Narrow Band Hyperspectral Indices (NBHIs) (Table S1 of the Supplementary Material); and iii) retrieve leaf biochemical constituents and canopy structural parameters using the PRO4SAIL RTM that couples the PROSPECT-D (Féret et al., 2017) and 4SAIL (Verhoef et al., 2007) models.
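As a rough illustration of the two crown-level quantities named above, the sketch below implements the empirical CWSI and the standard two-band FLD retrieval. It is a minimal sketch under assumptions: the baseline coefficients `a` and `b`, the constant `upper_limit`, and all numeric inputs are hypothetical placeholders, and the upper limit is treated as a constant for simplicity. It is not the processing chain used in the study.

```python
def cwsi(tc, ta, vpd, a, b, upper_limit):
    """Empirical Crop Water Stress Index from canopy and air temperature.

    tc, ta       canopy and air temperature (same units)
    vpd          vapour pressure deficit
    a, b         intercept and slope of the non-water-stressed baseline
                 (Tc - Ta as a linear function of VPD); hypothetical values
    upper_limit  Tc - Ta of a non-transpiring canopy (assumed constant here)
    """
    dt = tc - ta
    lower_limit = a + b * vpd  # expected Tc - Ta for a well-watered canopy
    return (dt - lower_limit) / (upper_limit - lower_limit)


def sif_fld(e_in, e_out, l_in, l_out):
    """Fraunhofer line depth (FLD) estimate of fluorescence.

    e_in, e_out  incoming irradiance inside / outside the O2-A absorption band
    l_in, l_out  upwelling radiance inside / outside the band
    Assumes reflectance and fluorescence are equal at both wavelengths.
    """
    return (e_out * l_in - e_in * l_out) / (e_out - e_in)
```

The FLD expression follows from writing the radiance in each band as reflected irradiance plus fluorescence and eliminating the reflectance term, which is why the two bands must be spectrally close.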

Statistical analyses and machine learning algorithms to discriminate between Vd-and Xf-infected olive trees
We first used analysis of variance (ANOVA) to assess whether simulated plant traits, spectral indices, and thermal traits could discriminate between asymptomatic and symptomatic trees for each disease. The ANOVA test was applied to each disease dataset, using each remote sensing trait as a response variable. Indicators were considered to significantly differentiate infected trees from non-infected trees if p-values were less than 0.05.
A three-stage approach was followed to discriminate Vd-asymptomatic (DS = 0) from Vd-symptomatic (DS ≥ 1) and Xf-asymptomatic (DS = 0) from Xf-symptomatic (DS ≥ 1) olive trees (Fig. 3). The three-stage process consisted of i) detection, ii) reclassification, and iii) discrimination. Predictors used in the analysis included i) plant traits obtained by inverting the PRO4SAIL RTM, ii) SIF @760, iii) CWSI, and iv) NBHIs. RTM-based traits, SIF @760, and CWSI were included as model predictors since they directly relate to plant physiological status. Hyperspectral indices (the NBHI pool) were included after a dimension reduction step based on a Variance Inflation Factor (VIF) analysis (James et al., 2013). NBHIs with a VIF > 5 were considered collinear and were not included in later models (Akinwande et al., 2015). This step was important because the NBHIs include bands that are spectrally adjacent and thus redundant (Bhardwaj and Patra, 2018). The reduced, non-collinear set of NBHIs contained only those indices that improved the classification process.
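The VIF-based reduction step can be sketched as follows. The VIF of an index is 1 / (1 - R²), where R² comes from regressing that index on all the others. The text only states that indices with VIF > 5 were excluded; the iterative removal of the single worst index used here is an assumption made for illustration, and the function and index names are hypothetical.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of every column of X (samples x features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other indices
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        # VIF = 1 / (1 - R^2) = SS_tot / SS_res
        out[j] = np.inf if ss_res <= 0.0 else ss_tot / ss_res
    return out


def drop_collinear(X, names, threshold=5.0):
    """Repeatedly drop the index with the largest VIF until all are <= threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        scores = vif(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names
```

Removing one index at a time and recomputing matters because dropping a single member of a collinear pair usually brings the VIF of its partner back below the threshold.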
(i) Detection stage (Stage I). Random forests (RFs; Breiman, 2001) were trained on each disease dataset using a random split of 70% of the data, with balanced symptomatic/asymptomatic observations (TR). Models were evaluated using the remaining 30% of data (testing dataset, TS). Models were fit using all the variables at each node split and m = 500 trees per model. This process was iterated 100 times for each dataset (Xf, Vd, and abiotic stress), and each predictor's importance was calculated using the out-of-bag (OOB) permutation method (Thomas et al., 2021).
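One way to obtain a balanced 70/30 split is sketched below. The text does not say how the balance was achieved, so the downsampling of the majority class is an assumption, and `balanced_split` is a hypothetical helper rather than the authors' procedure.

```python
import random


def balanced_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) with equal class counts in each part.

    labels: sequence of 0 (asymptomatic) / 1 (symptomatic).
    The majority class is randomly downsampled to the minority class size
    before splitting (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_per_class = min(len(pos), len(neg))
    pos = rng.sample(pos, n_per_class)
    neg = rng.sample(neg, n_per_class)
    n_train = int(round(train_fraction * n_per_class))
    train = pos[:n_train] + neg[:n_train]
    test = pos[n_train:] + neg[n_train:]
    return train, test
```

Calling the function with 100 different seeds would reproduce the repeated random splitting described above, with importance scores averaged over the iterations.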
Next, a feature-weighted random forest (FWRF) classification model was built for each disease using the feature importance scores for each plant trait from the initial permutation step. The FWRF algorithm was implemented in R based on the methodology proposed by Liu and Zhao (2017). The method samples distinct features in proportion to their variable importance scores and thus reduces, but does not completely ignore, the contributions of less informative variables (Chen and Hao, 2017).
Overall, RF algorithms require the definition of two main parameters to create a classification model: the number of trees (m) and the number of variables used at each node (n). Several studies have left m at its default value of m = 500 (Collins et al., 2020; Huo et al., 2021; Ghosh et al., 2014), since it has been suggested that m has a negligible effect on accuracy compared with n (Belgiu and Drăguţ, 2016). For classification purposes, the number of variables randomly selected at each split in the tree-building process is normally set to the square root of the total number of features. However, it has been suggested that setting n to the total number of features can give better results, albeit at a higher computational cost (Belgiu and Drăguţ, 2016; Ghosh et al., 2014). Because in our study the total number of variables had already been reduced by the VIF analysis and the importance of each variable was incorporated into the models, all the variables were included at each node split, leaving n as the total number of features for each model and m at its default (m = 500). Separate feature-weighted models were obtained for the Xf and Vd infection datasets, each with different input feature weights.
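The idea of sampling distinct features in proportion to their importance scores can be illustrated with weighted sampling without replacement. This is a sketch of the sampling principle only, not the R implementation used in the study; the function name and the importance values in the usage note are hypothetical.

```python
import random


def sample_features(importances, k, rng):
    """Draw k distinct feature names with probability proportional to importance.

    importances: dict mapping feature name -> importance score.
    Scores are floored at a small positive value so that weakly informative
    features keep a non-zero chance of selection.
    """
    remaining = dict(importances)
    chosen = []
    for _ in range(min(k, len(remaining))):
        names = list(remaining)
        weights = [max(remaining[name], 1e-9) for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # ensures the drawn features are distinct
    return chosen
```

With hypothetical scores such as `{"CWSI": 0.5, "NPQI": 0.3, "Anth": 0.15, "noise": 0.05}`, repeated draws select CWSI most often while the low-scoring feature still appears occasionally, which matches the stated aim of reducing rather than eliminating the contribution of less informative variables.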
(ii) Reclassification stage (Stage II). Using the FWRF models obtained in Stage I, classification probabilities were obtained for each tree (Malley et al., 2012). Those trees that showed a probability exceeding the mean (µ) ± standard deviation (σ) limits of the probability distribution of all the trees classified were considered uncertain and were included in the clustering process described below. The traits selected for inclusion in the reclassification stage were identified as being important for predicting disease status but not for predicting abiotic stress. Only traits with disease-model importance scores at least double those of the [...] recognize clusters of unusual shapes since it performs the clustering process in a projected space in which the transformation matrix is calculated from the eigenvectors of the Gaussian similarity matrix (Tan et al., 2021). Spectral clustering has been used in remote sensing studies for classifying hyperspectral, high-resolution, and synthetic aperture radar data (Tasdemir et al., 2015; Zhao et al., 2019; Xia et al., 2015; Zhang et al., 2008). In this study, the affinity matrix was computed with a radial basis kernel, with two classes to be identified (asymptomatic versus symptomatic trees). Parameters were obtained by a heuristic procedure at every point in the dataset.
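A two-class spectral clustering with a radial basis affinity can be sketched in a few lines of NumPy. This is a simplified illustration, not the implementation used in the study: it uses a fixed, hypothetical kernel width `gamma` instead of a heuristic estimate, and it splits the trees by the sign of the second eigenvector of the normalized Laplacian rather than running k-means in the projected space.

```python
import numpy as np


def spectral_bipartition(X, gamma=0.5):
    """Split the rows of X into two clusters using an RBF affinity matrix."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between observations.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-gamma * sq_dists)  # Gaussian (RBF) similarity
    np.fill_diagonal(W, 0.0)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^(-1/2) W D^(-1/2)
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Second eigenvector, rescaled to the random-walk Laplacian eigenvector.
    fiedler = eigvecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)
```

The cluster labels 0 and 1 are arbitrary, so in practice they would still need to be mapped to asymptomatic and symptomatic classes, for example by comparing mean trait values per cluster.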
Fig. 7. Comparison of asymptomatic (0) and symptomatic (1) trees with either Verticillium dahliae or Xylella fastidiosa infections. Groups with the same infection but different letters (a and b) are significantly different according to ANOVA (p < 0.05). The black horizontal line in each box represents the median, and the top and bottom lines represent the 75th and 25th percentiles, respectively. The whiskers are the upper and lower limits based on the interquartile ranges (IQRs; Q ± 1.5 × IQR). The outliers (asterisks) are the values outside the upper and lower limits.
To evaluate the accuracy of the detection and discrimination between infections, specificity and sensitivity analyses were performed. As described by Trevethan (2017), specificity is understood as the probability of correctly identifying the true negatives (TN) obtained in the classification while avoiding false positives (FP). By contrast, sensitivity is understood as the probability of correctly identifying true positives (TP) while avoiding false negatives (FN). To assess model sensitivity, each infection model was applied to its counterpart dataset (Xf model on Vd data and vice versa). Validation datasets contained a random sample of 30% of the data, balanced between symptomatic and asymptomatic trees (Fig. 4A). A similar scheme was used to assess specificity, except that 5% of each test dataset also contained symptomatic trees of the model's target disease (Xf model validated on a Vd dataset with 5% Xf symptomatic trees and vice versa) (Fig. 4B). To assess any improvement in accuracy due to the three-stage approach, specificity and sensitivity tests were performed after combinations of stages, including i) Stage I; ii) Stages I and II; iii) Stages I and III; and iv) Stages I, II, and III. False-positive (FP) and true-negative (TN) rates, as well as OA and κ, were obtained and compared at each consecutive stage. In addition, errors of omission and commission were calculated considering the FP, FN, and TP rates, as shown in Anders et al. (2021).
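The accuracy measures named above can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions (omission error as FN / (TP + FN), commission error as FP / (TP + FP)), which are assumed here to match those of the cited references; the counts in the usage note are invented for illustration.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy measures for a two-class confusion matrix (counts)."""
    n = tp + fp + fn + tn
    oa = (tp + tn) / n  # overall accuracy
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return {
        "overall_accuracy": oa,
        "kappa": kappa,
        "sensitivity": tp / (tp + fn),       # true-positive rate
        "specificity": tn / (tn + fp),       # true-negative rate
        "fp_rate": fp / (fp + tn),
        "omission_error": fn / (tp + fn),    # missed positives
        "commission_error": fp / (tp + fp),  # false alarms among detections
    }
```

As a worked example with made-up counts, TP = 40, FP = 10, FN = 5, and TN = 45 give an OA of 0.85, a chance agreement of 0.5, and therefore κ = (0.85 - 0.5) / (1 - 0.5) = 0.7. This also shows why a high OA can coexist with a low κ when one class dominates the dataset.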

Results and discussion
Visual assessments in the field confirmed that Vd and Xf infections of olive trees are difficult to distinguish due to their similarity in symptoms (Fig. 1). This is consistent with previous studies that describe similar visual symptoms for both infections (Carlucci et al., 2013). Differences in pure tree-crown reflectance between asymptomatic (DS = 0) and symptomatic (DS ≥ 1) trees were evident for both types of infections, particularly in the blue spectral region (400-495 nm) and within the ranges 510-520 nm and 540-560 nm, as well as in the red edge region at 685-760 nm (Fig. 5).
The blue region is associated with indices previously used to identify infections, including the blue ratio index (B) for Vd detection (Calderón et al., 2013) and the BF 1-5 and NPQI for Xf-induced symptom detection (Zarco-Tejada et al., 2018). The VNIR region is associated with plant pigment absorption by C x+c (Chappelle et al., 1992; Merzlyak et al., 1999), Anth (Ustin et al., 2009), and C a+b (Gitelson & Merzlyak, 1996). These pigment levels were retrieved in this study by RTM inversion, in contrast to previous work by Calderón et al. (2013; 2015), which only used derived reflectance indices for Vd detection.
Plant traits that showed highly statistically significant differences (p < 0.0001) for either Vd or Xf infection were considered as candidates for discriminating Vd and Xf infections. Distributions of remote sensing indices and RTM-inverted parameters for asymptomatic and symptomatic trees of each infection type are shown in Fig. 7.
Predictor importance scores from RFs (Stage I) (Fig. 8) were consistent with those reported by Zarco-Tejada et al. (2018) and Poblete et al. (2020) in detecting Xf infections. Zarco-Tejada et al. (2018) found that NPQI, CWSI, Anth, C x+c, and SIF @760 contributed most to discriminating between asymptomatic and symptomatic trees, while this study found that CWSI was most important, followed by NPQI, Anth, SIF @760, PRI n, and CRI 700M. The slight differences in importance ranks for these features could be explained by the additional spectral traits used in our analyses. For example, C x+c and CRI 700M, which are related to the carotenoid pigment content, were included in this study and found to be important for Xf detection. The importance of CWSI identified in this study is consistent with the results of Poblete et al. (2020), who suggested that adding CWSI to multispectral-based indices was crucial for detecting Xf infection. However, they also found that PRI × CI and PRI n were at least as important. In this study, PRI n was found to be more important than PRI × CI; thus, only PRI n was considered.
The most important plant traits for detecting Vd-infected trees included CWSI, Anth, NPQI, B, and LIDF. As expected, most of these traits were similar to those selected for detecting Xf infection. Nevertheless, the B index (R 490 /R 450) was identified as an essential indicator of Vd infection but not of Xf infection. This is consistent with the results reported by Calderón et al. (2013), in which B was also found to be sensitive to Vd infections in olive trees. In their study, the leaf inclination distribution function (LIDF) was not calculated and therefore not evaluated. Nevertheless, the importance of LIDF in this study is consistent with the results reported by Calderón et al. (2015), which concluded that structural indices were sensitive to the detection of Vd-infected trees at all stages of infection. In previous work detecting Vd infections, [...] CWSI, SIF @760, and PRI n were the most important spectral plant traits suitable for detecting both diseases. However, CWSI (Calderón et al., 2013; Zarco-Tejada et al., 2018; Pineda et al., 2021), SIF @760 (Zarco-Tejada et al., 2018; Mohammed et al., 2019), and PRI n (Zarco-Tejada et al., 2013) have been previously shown to be highly correlated with water-induced stress in other contexts. For this reason, the reclassification stage (Stage II) was designed to reduce the uncertainty caused by traits affected by both sources of stress, i.e., Vd- and Xf-induced stress in addition to the water stress present in the orchards under study. Traits most important for detecting non-pathogenic water stress included CWSI, PRI n, and C x+c (Fig. 9). NPQI was not affected by water stress but was one of the most important indicators for Xf and Vd infection. Anth and SIF @760 were important for detecting water-stressed trees but less important for detecting Xf and Vd infection. Due to the somewhat orthogonal distribution of importance scores among the infection and the water-stress models, the indicators NPQI, Anth, and SIF @760 were ultimately selected as distinct inputs for the reclassification stage (Stage II).
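The two blue-region indices discussed above are simple band ratios that can be computed from a crown reflectance spectrum. In this sketch the B index follows the definition given in the text (R 490 /R 450), whereas the NPQI expression (R415 - R435) / (R415 + R435) is the form commonly given in the literature and is an assumption here, since this section does not state it. The nearest-band lookup and the reflectance values in the usage note are hypothetical.

```python
def band_reflectance(wavelengths, reflectance, target_nm):
    """Reflectance of the band whose centre is closest to target_nm."""
    idx = min(range(len(wavelengths)),
              key=lambda i: abs(wavelengths[i] - target_nm))
    return reflectance[idx]


def blue_index_b(wavelengths, reflectance):
    """Blue ratio index B = R490 / R450, as defined in the text."""
    r490 = band_reflectance(wavelengths, reflectance, 490)
    r450 = band_reflectance(wavelengths, reflectance, 450)
    return r490 / r450


def npqi(wavelengths, reflectance):
    """NPQI = (R415 - R435) / (R415 + R435); assumed standard formulation."""
    r415 = band_reflectance(wavelengths, reflectance, 415)
    r435 = band_reflectance(wavelengths, reflectance, 435)
    return (r415 - r435) / (r415 + r435)
```

For a toy spectrum with band centres at 415, 435, 450, and 490 nm and reflectances of 0.040, 0.044, 0.050, and 0.060, the functions return B = 1.2 and NPQI of about -0.048.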
Spectral plant traits used in the final discrimination stage (Stage III) included B, LIDF, and C x+c for discriminating Vd from Xf infections. PRI n, BF 1, CUR, and CRI 700M were the most important traits for discriminating Xf from Vd infections.
The results of the sensitivity analysis are shown in Fig. 10. OAs based on Stage I (detection) alone were acceptable for detecting Vd and Xf (0.8 and 0.6, respectively), but κ coefficients were extremely low (0.11 and 0.14, respectively). Thus, neither model was able to accurately detect non-infected trees. The reclassification stage (Stage II) increased OA and κ for both diseases, with an overall increase of the TP rate. Nevertheless, the FP rate did not decrease, suggesting that there was no improvement in the ability to distinguish between pathogen infections. Analyses including both the detection (Stage I) and discrimination (Stage III) steps showed reduced FP rates compared to single alternatives. However, the TP rate did not increase compared to the results obtained with only Stage I.
Including the discrimination stage (Stage III) after the reclassification stage (Stage II) reduced the FP rate, but the TP rate did not drastically increase compared to using only Stages I and II. Therefore, the discrimination stage (Stage III) improved the ability to differentiate both diseases. When Stage II was omitted, both models generated less accurate predictions of infection status due to the confounding effects caused by water stress. When applying the three stages consecutively (Stage I + II + III), errors of omission were reduced to 2.9% in discriminating Vd from Xf infections and to 6.5% when discriminating Xf from Vd infections. Commission errors were 10% and 11% when discriminating Vd and Xf infections, respectively.
The specificity analysis (Fig. 11) revealed that FP rates declined when applying the three stages consecutively.For the Vd model, the FP rate decreased from 0.22 to 0.09 and then to 0.05 when applying Stages I, II, and III consecutively.Similarly, FP rates for the Xf model decreased from 0.39 to 0.25 and then to 0.04 when the three stages were applied.
These results represent significant progress compared to previous studies focused on the detection of single infection symptoms with hyperspectral datasets either in controlled conditions (Rumpf et al., 2010; Tian et al., 2021; Zhou et al., 2019) or at the field scale (Zarco-Tejada et al., 2018; López-López et al., 2016). Nevertheless, few studies have attempted to differentiate pathogens that cause similar symptoms. Gold et al. (2020) used partial least squares discriminant analysis (PLS-DA) with hyperspectral data to distinguish between Phytophthora infestans and Alternaria solani fungal infections. Both of these diseases cause similar necrotic leaf symptoms in potatoes (Solanum tuberosum). They suggested that under controlled conditions, the shortwave infrared (SWIR) spectrum best enabled differentiation between the infections. Fallon et al. (2020) used PLS-DA to detect oak wilt produced by the fungus Bretziella fagacearum. As a xylem-limited pathogen, oak wilt triggers symptoms similar to those of drought stress. Other diseases such as bur oak blight, caused by Tubakia iowensis, produce similar symptoms that could also be mistaken for oak wilt. They suggested that, under drought conditions, the differentiation between infected and uninfected leaves was possible using wavelengths in the near-infrared and SWIR spectra for two different oak species. They found that wavelengths in the range 820-1320 nm were able to differentiate between oak wilt and both drought and T. iowensis. They also observed small differences in the visible spectra when comparing oak wilt-infected leaves with drought-affected leaves. Our study, conducted at the tree-crown scale, showed that the major differences between infected trees and water-stressed trees were most strongly observed in the SIF @760, NPQI, and Anth indicators.
Despite the results of Fallon et al. (2020) and Gold et al. (2020), successful remote sensing-based discrimination of tree pathologies from abiotic stresses that trigger similar symptoms has not been previously reported. This study successfully differentiated two xylem-limited pathogens that cause similar symptoms and proposed a method to reduce confounding effects associated with drought stress. Analysis was performed on a large spatial scale and under field conditions, using sensitive spectral indicators related to physiological plant parameters. We demonstrated that, by implementing a three-stage classification model, it was possible to discriminate between Xf and Vd infections in a combined dataset where both infections were co-occurring, yielding detection accuracies of 92% or higher.
Further work will focus on the suitability of these methods across different regions and scales, accounting for the large structural and background variability characteristic of different cultivars. As shown by Tian et al. (2021), infection detection accuracies decrease during the earlier stages of infection. Therefore, the proposed multiple-stage algorithm should also be extended to detect and differentiate vascular diseases at early stages of infection, when abiotic confounding factors are relatively more influential. Improving the accuracy of disease detection across geographic contexts will facilitate the eradication of emerging and re-emerging pathogens worldwide.

Conclusions
Hyperspectral and thermal remote sensing indicators have been previously used to detect vascular pathogens such as Vd and Xf in olive trees. The most sensitive indicators previously identified include spectral traits from the inversion of radiative transfer models (Anth, C a+b, C x+c, LIDF), solar-induced fluorescence (SIF @760), thermal-based CWSI, and specific narrow-band hyperspectral indices (NBHI) such as NPQI. However, both Xf and Vd infections induce similar disease symptoms, which can be confounded in spectral datasets and detection models. Here, using a combined Vd and Xf dataset, we demonstrate that a three-stage machine learning algorithm successfully discriminated between infections. The spectral traits that differentiated Vd-infected trees from those affected by Xf were the blue index B, the structural parameter LIDF, and the carotenoid pigment content C x+c. Sensitivity tests revealed that OA increased from 80% to 98% when applying the three stages consecutively and that κ increased from 0.11 to 0.7. The traits found to discriminate Xf from Vd were the normalized PRI index PRI n, the blue index BF 1, the fluorescence curvature reflectance-based index CUR, and the chlorophyll index CRI 700M; with the proposed three-stage machine learning algorithm, these traits improved OA from 60% to 92% and κ from 0.16 to 0.8. The specificity obtained by the proposed method was high, reducing FP rates to 0.09 and 0.04 when using the cross-infected dataset to detect Vd and Xf infections, respectively. The three-stage machine learning algorithm can be used to differentiate olive tree infections, accurately identifying the source (Vd or Xf) despite the high similarity between the symptoms triggered by both xylem-invading pathogens.
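For readers less familiar with the two accuracy metrics reported above, the following minimal sketch computes OA and Cohen's κ from a confusion matrix. The counts are hypothetical and deliberately imbalanced; they are not the confusion matrices of this study, and the function name is an assumption. The example illustrates a general property of the metrics: because κ discounts agreement expected by chance, it can be noticeably lower than OA when one class dominates.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute OA and Cohen's kappa.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal (this is OA).
    observed = sum(matrix[i][i] for i in range(n_classes)) / total

    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)
        for i in range(n_classes)
    ) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical, imbalanced two-class counts (rows: true class, columns: predicted).
# Class 0 = not infected by the target pathogen, class 1 = infected.
confusion = [
    [950, 10],
    [10, 30],
]

oa, kappa = overall_accuracy_and_kappa(confusion)
print(f"OA = {oa:.2f}, kappa = {kappa:.2f}")
```

With these toy counts, 980 of 1,000 trees are classified correctly, yet chance agreement alone is already about 0.92, so κ is considerably lower than OA.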

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
a tts used for the Verticillium dahliae study sites. b tts used for the Xylella fastidiosa study sites.

Fig. 5. Pure tree-crown airborne reflectance values for olive trees at different symptom development stages for infection by Verticillium dahliae (1,878 and 5,040 olive trees infected with Vd were assessed during 2011 and 2013, respectively) and Xylella fastidiosa (7,296 olive trees infected with Xf were assessed during 2017 and 2018). Infection statuses ranged from mild (DS = 1) to severe (DS = 4). (A) VNIR region (400-800 nm), (B) 400-495 nm (blue region of the spectrum), (C) 510-510 nm, (D) 540-560 nm, and (E) 685-769 nm. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 8. Normalized variable importance scores for detecting Verticillium dahliae- and Xylella fastidiosa-infected olive trees. Scores were obtained with the out-of-bag predictor permutation method using remotely sensed thermal and spectral traits.

Fig. 9. Normalized importance scores for detecting Verticillium dahliae-infected, Xylella fastidiosa-infected, and water-stressed olive trees. Scores were generated for hyperspectral and thermal traits with the out-of-bag predictor permutation method.

Fig. 10. Sensitivity tests for detecting Verticillium dahliae- and Xylella fastidiosa-infected olive trees. Accuracies of different combinations of methods are compared (Stage I, detection stage; Stage II, reclassification stage; and Stage III, discrimination stage). OA, overall accuracy; κ, kappa coefficient.

Fig. 11. Specificity test for discriminating Verticillium dahliae- and Xylella fastidiosa-infected olive trees across combinations of the detection, reclassification, and discrimination stages.

Table 1
Values and ranges used for model inversion and look-up table generation for the PRO4SAIL (PROSPECT-D + 4SAIL) radiative transfer model.