Cucumber powdery mildew detection using hyperspectral data

Abstract: This study aimed to understand the spectral changes induced by Podosphaera xanthii, the causal agent of powdery mildew, in cucumber leaves from the moment of inoculation until visible symptoms are apparent. A principal component analysis (PCA) was applied to the spectra to assess the spectral separability between healthy and infected leaves. A spectral ratio between infected and healthy leaf spectra was used to determine the best wavelengths for detecting the disease. Additionally, the spectra were used to compute two spectral variables [i.e., the red-well point (RWP) and the red-edge inflexion point (REP)]. A linear support vector machine (SVM) classifier was applied to the principal component scores, the RWP, or the REP to assess how well these features can separate the infected leaves from the healthy ones. The PCA showed that good separability could be achieved from 4 days post-inoculation (DPI). The trend of the RWP and REP wavelengths with DPI was best fitted by a linear model, which had a higher adjusted R² for the infected leaves than for the healthy leaves. The SVM trained with the scores of the first five principal components achieved an overall accuracy of 95% at 4 DPI (i.e., two days before the visible symptoms). With the RWP and REP parameters, the SVM accuracy increased with the number of days post-inoculation, reaching 89% and 86%, respectively, when symptoms became visible at 6 DPI. Further research must consider a higher number of samples and more temporal repetitions of the experiment.


Introduction
Greenhouse crops are the most significant and fastest-growing segment of Canadian horticulture. The greenhouse area has increased 21% over the last five years and 48% over the last decade [Agriculture and Agri-Food Canada (AAFC) 2020]. The province of Ontario has the highest greenhouse crop production, with 69% of the total production in Canada (AAFC 2020). As with other crops, fungal diseases can affect greenhouse crops and be a significant production-limiting factor (Khater et al. 2017). Powdery mildew due to Podosphaera xanthii is a fungal disease affecting greenhouse cucumber (Cucumis sativus L.) production. The pathogen most likely survives from season to season in the asexual state on living cucurbits, spreading by wind-blown spores. Conidia may also survive in the greenhouse for short periods, infecting new cucumber crops, particularly when the new crop overlaps or follows too soon after removing the old crop (Pérez-García et al. 2009). The pathogen can cause yield losses of between 30% and 50% in greenhouse cucumber production (Hafez et al. 2020). P. xanthii forms haustoria to interact with the cell walls of the leaves, petioles, and stems, establishing a close connection beneath the host cells (Nishizawa et al. 2016; Eskandari and Sharifnabi 2019). The biotrophic pathogen does not kill the host cells to obtain nutrients (Spanu and Panstruga 2017); however, it induces leaf chlorophyll degradation and internal structural damage of colonized cells. Chlorophyll variations can be detected in the visible spectral domain (400-700 nm) (Mutanga and Skidmore 2007), while the internal leaf structure changes can be detected in the near-infrared spectral domain (700-1300 nm) (Knipling 1970; Delalieux et al. 2007).
Another interesting spectral domain is the red-edge region, a transition zone located between 660 and 780 nm (Horler et al. 1983). The transition zone is between the maximal chlorophyll absorbance in the red wavelength and the strong near-infrared reflectance due to leaf mesophyll scattering. This region has two variables, the red-well point (RWP) and the red-edge inflection point (REP) (Pu et al. 2003). RWP is the wavelength in the red region, between 660 and 680 nm, corresponding to the minimum reflectance due to maximum chlorophyll absorption. REP is the wavelength corresponding to the inflexion point of the spectral curve between the red and near-infrared spectral domains. At REP, the reflectance slope in the red-edge region is maximal (Darvishzadeh et al. 2009). Shifts of the REP to longer or shorter wavelengths have already been related to chemical and morphological plant status (Peng et al. 2011). Therefore, changes in the reflectance of the red-edge region and its associated parameters, RWP and REP, can be related to powdery mildew in cucumber leaves.
Previous studies on the detection of cucumber powdery mildew inside greenhouses were based on RGB images when the symptoms were already visible (Wspanialy and Moussa 2016; Zhang et al. 2017, 2019). Other studies used hyperspectral data; hyperspectral imagery combined with chlorophyll fluorescence and thermograms was used by Berdugo et al. (2014) to study cucumber leaves infected with powdery mildew. Atanassova et al. (2019) examined cucumber powdery mildew at the leaf level using vegetation indices computed with data acquired with a spectrometer having a spectral range between 450 and 1100 nm. Other cucumber diseases were also detected using remotely sensed data. Cucumber downy mildew (Pseudoperonospora cubensis) was detected in hyperspectral imagery in the range between 400 and 1100 nm by Tian and Zhang (2012). Remote sensing was also used to detect other types of damage in cucumber. For example, Cen et al. (2016) applied hyperspectral imagery acquired between 400 and 675 nm in reflectance mode and from 675 to 1000 nm in transmittance mode to study chilling damage in cucumber fruits. Diseases affecting other cucurbit species were also studied using remote sensing. Kalischuk et al. (2019) applied unmanned aerial vehicle (UAV) multispectral imagery for mapping and scouting gummy stem blight (Stagonosporopsis cucurbitacearum) in watermelon (Citrullus lanatus) fields, and hyperspectral UAV imagery was used by Abdulridha et al. (2020) to detect powdery mildew in squash (Cucurbita pepo).
This paper is the first part of a research project aiming to detect powdery mildew on cucumber leaves. This paper has the specific objective to understand the changes induced in hyperspectral data by powdery mildew in cucumber leaves from the moment of inoculation until visible symptoms are apparent. A principal component analysis (PCA) was applied to the spectra to assess the spectral separability between healthy and infected leaves. A spectral ratio between infected and healthy leaf spectra was used to determine the best wavelengths for detecting the disease. Additionally, the spectra were used to compute two spectral variables (i.e., RWP and REP). A linear support vector machine (SVM) classifier was applied to the principal component scores, RWP, or REP, to assess how well these features can separate the infected leaves from the healthy ones. The second part of the study (Fernández et al. 2021) will test the use of simulated multispectral band reflectance of the Micasense RedEdge camera and associated vegetation indices to detect powdery mildew on cucumber leaves because the main objective of the study is to test the use of a Micasense RedEdge camera for detecting cucumber powdery mildew.

Experiment
The study used data acquired during a controlled experiment in December 2019 in a walk-in growth chamber located at the Biotron facilities of the University of Western Ontario (London, Ontario). The chamber had a constant relative humidity of 70%, a photoperiod of 12 h, with an air temperature of 23°C, followed by a dark period of 12 h with an air temperature of 20°C. Cucumber seeds cultivar Straight Eight (William Dan Seeds Ltd) were planted individually at 3 mm depth in 0.5 L pots filled with a volume of 0.475 L of the multipurpose Pro-Mix LP15 substrate. The plants were regularly well-watered. A dose of 10 mg per plant of MiracleGro All Purpose (NPK 24-8-16, with micronutrients) commercial soluble fertilizer was applied at 100% plant emergence and then every seven days after plant emergence until the end of the experiment. Ten plants were used and were split into two groups: healthy (n = 5) and infected (n = 5). The healthy plants were placed in a separate chamber to avoid cross-contamination.
Before inoculation, the leaves of each plant were marked on the petiole base with a small white tape, which was numbered from the bottom to the top leaf. Due to the large surface of the cucumber leaves, a region of interest (ROI) of 12 cm² was drawn on each leaf. These ROIs allow collecting spectral data precisely on the same inoculated area over time. The infected plants were inoculated as follows. Cucumber leaves infected with P. xanthii were collected one day before inoculation from Great Lakes Greenhouses Inc., located in Leamington, Ontario, Canada. Samples were wrapped in a paper towel to absorb humidity, bagged into a Ziploc® bag, and transported to A&L Laboratories Inc., where they were kept in a cold chamber at 4°C until the inoculation. They were cut into small pieces, and the infected sections were used to contact each ROI marked on the leaves of the plants from the infected group five times. The ROIs of the control group were also physically contacted five times using non-infected sections of cucumber leaves.

Spectral data
Radiances between 325 and 1075 nm at 1.5 nm sampling intervals were measured at the leaf level from four hours before inoculation (0 DPI) until seven days post-inoculation (7 DPI), with an ASD FieldSpec® HandHeld 2 spectroradiometer (ASD Inc., Boulder, Colorado, USA) having a 25° field of view (FOV) bare fibre optic cable. No data were collected at one day post-inoculation (1 DPI) to allow the pathogen to establish properly and to reduce the risk of air dispersion of non-germinated spores and contamination of the healthy plants. The spectra were acquired and calibrated using the RS³ version 6.4 software from ASD (ASD Inc., Boulder, Colorado, USA). Leaf reflectance spectra were acquired over a total of 183 healthy and 201 infected ROIs. They were acquired at the centre of each marked ROI, including the midrib region when present, with the optic fibre coupled to an ASD high-intensity probe equipped with a light source. A leaf clip, having a white Goretex reference (99% reflectance) and a black reference, was attached to the probe. The probe was calibrated every 10 min and set to output one mean reflectance value from 10 ASD spectral measurements per reading.

Spectral data processing
All the data was processed using MATLAB R2020b (MathWorks, Inc., Natick, Massachusetts, USA) following the flowchart of Fig. 1. In the analysis, we did not consider the spectra obtained from the following cases: (i) ROIs where there were scars due to the friction during inoculation, and (ii) yellow bottom leaves due to natural senescence. The resulting data set contained 71 spectra from healthy leaves and 57 spectra from infected leaves. All the spectra were then cropped to the 400-900 nm range and converted to reflectance spectra using the ViewSpec Pro version 6.2.0 software (ASD Inc., Boulder, Colorado, USA). These reflectance spectra were then subjected to a Savitzky-Golay filtering method to reduce instrumental noise (Savitzky and Golay 1964). It was followed by a multiplicative scatter correction (MSC) to remove additive and multiplicative scattering effects (Isaksson and Naes 1988).
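As an illustration of the scatter-correction step, the following minimal Python sketch regresses each spectrum on the mean spectrum and removes the fitted offset and gain. The study itself used MATLAB; the function name `msc`, the use of the mean spectrum as reference, and the synthetic spectra are assumptions made for this example only. Savitzky-Golay smoothing would be applied beforehand and is omitted here.

```python
import numpy as np


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    spectra: array of shape (n_spectra, n_wavelengths).
    Returns an array of the same shape.
    """
    spectra = np.asarray(spectra, dtype=float)
    reference = spectra.mean(axis=0)  # assumed reference: mean spectrum
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        # Least-squares fit: spectrum ~ intercept + slope * reference
        slope, intercept = np.polyfit(reference, spectrum, 1)
        # Remove the additive (intercept) and multiplicative (slope) effects
        corrected[i] = (spectrum - intercept) / slope
    return corrected


# Synthetic example: one underlying shape seen with different offsets and gains
wavelengths = np.arange(400, 901)  # 400-900 nm at 1 nm steps (illustrative)
base = 0.3 + 0.2 * np.sin(wavelengths / 50.0)
raw = np.stack([0.02 + 1.1 * base, -0.01 + 0.9 * base, 0.00 + 1.0 * base])
corrected = msc(raw)
```

Because the three synthetic spectra differ only by an offset and a gain, they collapse onto the same curve after correction, which is the behaviour MSC is meant to produce.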
To determine the DPI at which a spectral difference between healthy and infected leaves can be observed, the mean reflectance spectra were plotted as a function of the DPI. As in Fernández et al. (2020a), they were then used to compute the following spectral ratio (SR) at each wavelength between the spectra acquired over infected and healthy leaves or plants, which was then plotted as a function of DPI:
$$SR_\lambda = \frac{Ri_\lambda}{Rh_\lambda} \qquad (1)$$
where $SR_\lambda$ is the spectral ratio at wavelength $\lambda$, $Ri_\lambda$ is the mean reflectance at wavelength $\lambda$ for the spectra acquired from the infected leaves, and $Rh_\lambda$ is the mean reflectance at wavelength $\lambda$ for the spectra acquired from the healthy leaves.
The SR plots have peaks corresponding to wavelengths with the highest spectral difference between healthy and infected leaves or plants.
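The ratio and its peak can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' MATLAB code: it assumes the ratio is the mean infected reflectance divided by the mean healthy reflectance, as the definitions of Ri and Rh suggest, and it locates the largest deviation from 1 on made-up spectra.

```python
import numpy as np


def spectral_ratio(infected, healthy):
    """Ratio of the mean infected spectrum to the mean healthy spectrum.

    Both inputs have shape (n_spectra, n_wavelengths).
    """
    mean_infected = np.mean(infected, axis=0)
    mean_healthy = np.mean(healthy, axis=0)
    return mean_infected / mean_healthy


# Synthetic data: flat healthy spectra, infected spectra with a bump at 570 nm
wavelengths = np.arange(400, 901)
healthy = np.full((4, wavelengths.size), 0.5)
bump = 0.1 * np.exp(-((wavelengths - 570) / 10.0) ** 2)
infected = np.tile(0.5 + bump, (3, 1))

sr = spectral_ratio(infected, healthy)
# Wavelength where the ratio departs most from 1 (no difference)
peak_wavelength = wavelengths[np.argmax(np.abs(sr - 1.0))]
```

A ratio of 1 means the two mean spectra are identical at that wavelength, so the deviation from 1 is a natural measure of the spectral difference.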
Given that spectral data have high correlation and multicollinearity, following several studies on crop disease detection (Wang et al. 2012; Tian and Zhang 2012; Gulhane and Kolekar 2014), the spectra were subjected to a PCA. PCA is a multivariate statistical technique for transforming data based on eigenvalue analysis. A PCA on hyperspectral data transforms the data to a new set of uncorrelated variables that are linear combinations of the original variables, and it reduces the dimensionality of the data set while preserving most of the variance (Kumar et al. 2014; Martinez and Cho 2015). We applied PCA over the truncated reflectance dataset from 400 to 900 nm. Then, the scores of the first two principal components (PCs) were plotted for each day post-inoculation to observe the separability of the healthy and infected spectra as an effect of the disease development (Everis et al. 2001; Sankaran et al. 2010).
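For readers who want to reproduce this kind of analysis, a PCA can be sketched with a singular value decomposition of the mean-centred spectra. The Python code below is a minimal sketch on synthetic data; the function name `pca` and the two latent spectral shapes are assumptions made for the example, not part of the study.

```python
import numpy as np


def pca(spectra, n_components=5):
    """PCA by SVD of the mean-centred data.

    Returns (scores, loadings, explained_variance_ratio), each truncated
    to the first n_components.
    """
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each wavelength
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)  # fraction of total variance per PC
    scores = U[:, :n_components] * s[:n_components]
    return scores, Vt[:n_components], explained[:n_components]


# Synthetic spectra built from two latent shapes, so two PCs should suffice
rng = np.random.default_rng(0)
wavelengths = np.arange(400, 901)
shape_a = np.exp(-((wavelengths - 550) / 40.0) ** 2)
shape_b = 1.0 / (1.0 + np.exp(-(wavelengths - 715) / 10.0))
weights = rng.normal(size=(30, 2))
spectra = 0.3 + 0.05 * (weights @ np.vstack([shape_a, shape_b]))

scores, loadings, explained = pca(spectra, n_components=5)
```

Plotting `scores[:, 0]` against `scores[:, 1]`, coloured by leaf status, gives the kind of separability plot described above.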
As in Fernández et al. (2020b), the first-order derivative of each spectrum was calculated to compute the RWP and REP. Both the reflectance and the first-order derivative spectra were cropped to the 660-780 nm spectral range. For each red-edge parameter (RWP and REP wavelengths), we created a single vector containing all RWP or REP values over all days of evaluation. The vectors containing the RWP or REP wavelengths of the healthy (infected) cases were 497 × 1 (399 × 1). Linear regression models were fitted over each vector to explain the RWP or REP trend as a function of the DPI using the fitlm function of MATLAB R2020b. Modelling over each vector provides the residuals needed to check the normality assumption properly. The one-sample Kolmogorov-Smirnov test was applied to the residuals of each fitted model using the kstest function of MATLAB R2020b. The normality check was followed by an analysis of the variance homogeneity of the residuals using the F-test (vartest2 function of MATLAB R2020b). Finally, to assess the trend of each red-edge parameter, we plotted the mean value of the healthy (infected) RWP or REP as a function of the DPI.
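The two red-edge parameters and the linear trend can be sketched as follows in Python. This is an illustrative sketch under stated assumptions, not the authors' MATLAB implementation: RWP is taken as the wavelength of minimum reflectance in 660-680 nm and REP as the wavelength of maximum first derivative in 660-780 nm, following the definitions given in the Introduction; the function names and the synthetic spectrum are hypothetical.

```python
import numpy as np


def red_edge_parameters(wavelengths, reflectance):
    """Return (RWP, REP) in nm for one reflectance spectrum."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    derivative = np.gradient(reflectance, wavelengths)  # first derivative
    well = (wavelengths >= 660) & (wavelengths <= 680)
    edge = (wavelengths >= 660) & (wavelengths <= 780)
    rwp = wavelengths[well][np.argmin(reflectance[well])]  # red minimum
    rep = wavelengths[edge][np.argmax(derivative[edge])]   # steepest slope
    return rwp, rep


def linear_trend(dpi, values):
    """Least-squares line through (dpi, values): slope, intercept, R^2."""
    dpi = np.asarray(dpi, dtype=float)
    values = np.asarray(values, dtype=float)
    slope, intercept = np.polyfit(dpi, values, 1)
    residuals = values - (slope * dpi + intercept)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((values - values.mean()) ** 2)
    return slope, intercept, r2


# Synthetic leaf-like spectrum: red absorption well at 670 nm, red edge at 720 nm
wl = np.arange(400, 901)
spectrum = (0.05 + 0.45 / (1.0 + np.exp(-(wl - 720) / 8.0))
            - 0.03 * np.exp(-((wl - 670) / 8.0) ** 2))
rwp, rep = red_edge_parameters(wl, spectrum)

# Made-up REP values drifting to shorter wavelengths over the sampled days
dpi = np.array([0, 2, 3, 4, 5, 6, 7])
slope, intercept, r2 = linear_trend(dpi, 720.0 - 0.5 * dpi)
```

The residuals from `linear_trend` are what the normality and variance-homogeneity tests would then be applied to.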
In order to test whether each spectral feature allows sorting healthy and infected leaf ROIs, we applied an SVM algorithm embedded in the MATLAB Statistics and Machine Learning Toolbox™. The SVM was applied to (i) the first five principal components, (ii) the RWP wavelengths, and (iii) the REP wavelengths. The SVM algorithm was calibrated using 10-fold cross-validation to avoid overfitting. For each spectral feature, the SVM outputs a confusion matrix for each DPI, which was used to compute the sensitivity, precision, F1 score, and overall accuracy (Table 1).
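As a worked example of these statistics, the Python sketch below applies the standard definitions to the counts reported in the Results for the RWP at 6 DPI (64 of 71 healthy and 50 of 57 infected spectra classified correctly). Treating "infected" as the positive class is an assumption of this example, and Table 1 is not reproduced here, so the formulas are the usual textbook ones rather than a transcription of the table.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, precision, F1 score and overall accuracy from counts."""
    sensitivity = tp / (tp + fn)   # share of infected spectra that were found
    precision = tp / (tp + fp)     # share of "infected" calls that were right
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# RWP at 6 DPI: 50 of 57 infected and 64 of 71 healthy classified correctly
metrics = classification_metrics(tp=50, fn=7, tn=64, fp=7)
print({name: round(value, 3) for name, value in metrics.items()})
```

The overall accuracy is (50 + 64) / 128, about 0.89, which agrees with the 89% reported for the RWP at 6 DPI.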

Mean leaf reflectance and PCA
The mean leaf reflectance spectra for the healthy and infected leaves are presented in Fig. 2. The results indicate that across the whole spectral range, between 400 and 900 nm, spectral changes between healthy and infected leaves are not observed until 3 DPI (Figs. 2a-2c). At 4 DPI, the reflectance of the infected leaves is lower than that of the healthy leaves in the 400-450 nm spectral region. Between 520 and 520 nm, the reflectance of the infected leaves is higher than that of the healthy leaves (Fig. 2d). These results are also observed at 5 DPI (Fig. 2e). However, during the whole experiment period of 7 days, the magnitude of the changes between healthy and infected leaves across the whole spectral range (400-900 nm) is small, especially in the near infrared (NIR) region (780-900 nm), where no visible differences are apparent (Figs. 2a-2g).
The total variance explained by the first five PCs changes with the DPI because of the powdery mildew development, but it is always near or above 97% of the total variance of the leaf reflectance spectra (Table 2). The first two PCs explained most of the variance and were therefore used to visualize the discrimination between healthy and infected observations. For PC1, the explained variance dropped from 84.53% at 0 DPI to 83.51% at 3 DPI, increased to 86.91% at 4 DPI, decreased slightly to 86.81% at 5 DPI, reached its maximum value of the experiment at 6 DPI (88.31%), and then dropped to 83.38% at 7 DPI (Table 2). For PC2, the explained variance showed a minor reduction from 6.42% at 0 DPI to 6.24% at 3 DPI (Table 2). Then, the variance increased to 7.58% at 4 DPI and decreased to 5.71% at 5 DPI (Table 2). It increased again to reach its maximum value (10.16%) at 7 DPI (Table 2). Figure 3 presents the variation of the scores for the first and second principal components (PC1 and PC2, respectively) as a function of the DPI. It allows assessing the separability of the spectra of healthy and infected leaves. Both spectra are similar, and no separability can be seen on the day of inoculation (Fig. 3a) and at 2 DPI (Fig. 3b). From 3 DPI (Fig. 3c), some infected leaves have spectra that begin to separate from the centre of the ellipse of their distribution. At 4 DPI (Fig. 3d), when the spectra of most healthy leaves are closely grouped, the spectra of the infected leaves become distant from the spectra of the healthy leaves. However, it is the opposite at 5 DPI (Fig. 3e). At 6 and 7 DPI, the spectra show greater separation (Figs. 3f and 3g). When the variance explained by PC2 is over 6.5%, as at 4 DPI, the distribution of the spectra of the infected leaves is distant from that of the spectra of the healthy leaves.

Spectral ratio
The SR between healthy and infected cases for each DPI is presented in Fig. 4. The curve peaks of these plots correspond to the wavelengths where the spectral difference between healthy and infected leaves is the largest. As expected, the spectra of the infected leaves are very similar to the spectra of the healthy leaves on the day of inoculation, leading to a spectral ratio close to 1. Changes in the spectral ratio were observed from 2 DPI (Figs. 4b-4g). The peaks become more pronounced as the time after inoculation increases and are located mainly between 450 and 700 nm. We did not consider the spectral region between 400 and 430 nm during this evaluation because of apparent instrumental noise (Figs. 4a-4g). When symptoms were visible for the first time, at 6 DPI, peaks could be easily located at 444, 570, 667, and 703 nm (Fig. 4f and Table 3). Between 6 and 7 DPI, there was no significant variation in the peak positions, which were at 445, 570, 665, and 704 nm at 7 DPI (Table 3). The NIR region (800-900 nm) presented almost no variations in the computed spectral ratio.

RWP and REP
A linear regression model provided the best fit for the variation of the mean RWP and REP wavelength positions as a function of the number of days post-inoculation (Fig. 5). In both cases, the linear model was not significant (P > 0.05) for the healthy leaves (adjusted R² = 0.49 for RWP and R² = 0.18 for REP). For the infected leaves, it was significant at P ≤ 0.01 for RWP (R² = 0.88) and at P ≤ 0.001 for REP (R² = 0.91).
We applied a linear SVM classifier to either the first five PCs that explained more than 97% of the total variance, RWP, or REP. With the PC scores, we obtained an inconsistent trend for the spectra classification. At 0 DPI, the overall accuracy was around 65%; it then increased to 86% at 2 DPI before dropping drastically to 77% at 3 DPI (Table 4). A large increase was observed from 3 to 4 DPI, reaching an overall accuracy of 95% (Table 4). Then again, the accuracy dropped by 10% to reach a value of 85% at 5 DPI (Table 4). When signs of the disease were visible at 6 DPI, the accuracy increased to 97%, with a slight reduction of 1% at 7 DPI (Table 4).
The best overall classification accuracy was obtained using RWP across all DPIs (Table 4). The RWP accuracy increased steadily with the DPI and was higher than that of REP. The corresponding confusion matrix for the RWP classification is presented in Table 5. At 0 DPI, the overall classification accuracy using RWP was 55% (Table 4). The overall accuracy increased to 57.81% at 2 DPI and showed a larger increase to 72% at 3 DPI (Table 4). At 2 DPI, the corresponding confusion matrix showed that the number of correctly classified healthy and infected leaves was 60 (84%) and 14 (24%), respectively (Table 5). At 3 DPI, the number of correctly classified samples dropped to 59 (83%) in the case of healthy leaves and increased to 34 (59%) in the case of infected leaves (Table 5). At 4 DPI and 5 DPI, the overall accuracy increased to near 78% (Table 4). At 4 DPI, the number of correctly classified healthy and infected observations was 58 (81%) and 42 (73%), respectively (Table 5). This number showed slight variations at 5 DPI, to 55 (77%) and 45 (78%) for the healthy and infected leaves, respectively (Table 5). At 6 DPI, when symptoms were visible, the overall accuracy was 89%. It reached its maximum value at 7 DPI with 94% (Table 4). The related confusion matrix at 6 DPI showed that 90% (64) of the healthy and 88% (50) of the infected leaves were classified correctly (Table 5).
Fig. 4. Variation, as a function of the number of days post-inoculation, of the spectral ratio computed using the mean spectra acquired from healthy and infected leaves, relative to the healthy leaf spectra according to eq. 1.
With REP, the SVM classification presented an overall classification accuracy of around 55% at 0 DPI and 2 DPI (Table 4). At 3 DPI, the overall classification accuracy was 61% (Table 4). From 4 DPI, the REP accuracy increased continuously from 71% to 86% at 6 DPI (Table 4). At 7 DPI, the overall classification accuracy with REP dropped slightly to 83% (Table 4).

Discussion
This study analyzed the changes in reflectance spectra in the visible-NIR regions induced by P. xanthii on cucumber leaves. In particular, we applied PCA to the spectra and computed two spectral parameters, RWP and REP. Only a few studies assessed these features for detecting biotrophic diseases such as cucumber powdery mildew with RWP, REP, or PCA score; hence we compared our results with studies on necrotic diseases, such as late blight (Phytophthora infestans) on potatoes.
In our experiment, the signs and symptoms of the disease were visible at 6 DPI, while Berdugo et al. (2014) observed signs at 4 DPI. Another difference is that Berdugo et al. (2014) performed their evaluations every four days after inoculation, whereas our study considered daily evaluations (except at 1 DPI, to allow the pathogen to establish and to reduce the cross-contamination risk).
The spectral changes in the 400-900 nm range were assessed using several data analysis methods. A graphical comparison of the mean reflectance spectra of the healthy and infected leaves as a function of the DPI showed that the green reflectance of the infected leaves is higher than that of the healthy leaves. Higher reflectance values between 550 and 680 nm for cucumber leaves infected with P. xanthii, relative to healthy leaves, were also reported by Berdugo et al. (2014). Atanassova et al. (2019) reported that the most considerable spectral differences between healthy leaves and those infected with P. xanthii were found between 540 and 680 nm, corresponding to the wavelengths between the green and red spectral regions. Mahlein et al. (2013) also reported higher reflectance values in the green and red-edge spectral domains for sugar beet (Beta vulgaris var. saccharifera) leaves infected with Erysiphe betae (powdery mildew). In the red-edge reflectance, we observed a slight blue spectral shift. Gitelson et al. (1996) attributed the blue shift of the red-edge reflectance to chlorophyll loss and plant stress.
As in Fernández et al. (2020a) and Xie et al. (2017), we applied a PCA to reduce the high number of reflectance data to a new uncorrelated dataset containing 5 PCs explaining near or above 97% of the total variance of the data set. The total variance explained by these 5 PCs was similar to the one reported by Fernández et al. (2020a) (96%), who studied late blight on potato leaves. The variance explained by the first 5 PCs was also of the same magnitude as the total variance of 98.88% observed at 5 DPI with the first 3 PCs by Xie et al. (2017), who studied grey mould (Botrytis cinerea) on tomato leaves. We reached an overall accuracy of 95% at 4 DPI (i.e., two days before visible symptoms). This accuracy is higher than that reported by Abdulridha et al. (2020) (82%), who applied a radial basis function to hyperspectral imagery to discriminate healthy squash leaves from those infected with P. xanthii before the symptoms were visible. When the symptoms were apparent (at 6 DPI), our accuracy increased to 97%. This accuracy was higher than the one of Tian and Zhang (2012) (90%), who classified a limited number of hyperspectral images (ten on healthy leaves and ten on infected leaves) acquired over healthy leaves and leaves infected with P. cubensis (cucumber downy mildew). However, our accuracy was lower than that of Abdulridha et al. (2020) (99%), who used a radial basis function to discriminate healthy squash leaves from those infected with P. xanthii at a late disease development stage. Our accuracies were also lower than those reported by Cen et al. (2016) (100%), who applied an SVM algorithm to discriminate chilling-injured and nondamaged cucumber fruits. We hypothesize that the fluctuations in classification accuracy across DPI when using PCA might be related to some noise in the 400-440 nm region. We can observe that the spectral ratio dropped drastically from 3 to 4 DPI (Figs. 4c and 4d) in this region and then increased at 5 DPI (Fig. 4e). Those changes could be attributable to instrumental noise. Therefore, the computation of a PCA over the whole spectral range (400-900 nm) might be affected by instrumental noise in some spectral regions, i.e., below 450 nm in our case. Another possible explanation could be a change in chlorophyll-a concentration due to the pathogen infestation, affecting the chlorophyll a/b ratio (Tanaka and Tanaka 2011). Finally, the low number of observations might significantly influence the overall accuracies even when few samples are misclassified.
Note: Data determined from the spectral ratio computed using the mean spectra acquired over healthy and infected leaves, relative to the healthy leaf spectra.
In order to determine the best wavelengths to detect powdery mildew, we computed spectral ratios between infected and healthy spectra, as in Fernández et al. (2020a). When powdery mildew signs and symptoms were visible for the first time (at 6 DPI), the wavelengths we defined as being the best to detect the disease were the blue (444 nm) and green (570 nm) ones. They are close to those reported by Xie et al. (2015) for late blight disease detection (i.e., visible symptoms) on tomato leaves (442 and 573 nm, respectively). However, Xie et al. (2015) also reported 508, 696, and 715 nm as suitable wavelengths, which are not close to our wavelengths. None of our wavelengths were close to those used by Xie et al. (2017) to detect Botrytis cinerea infecting tomato leaves. They were also different from those reported by Fernández et al. (2020a) (488, 556, 681, and 709 nm) for late blight disease detection in potato leaves when symptoms were visible for the first time at 3 DPI. At 5 and 6 DPI, the 703 nm wavelength matches the wavelength reported by Xie and He (2016), who tested spectral and image textural features to detect Alternaria solani (early blight) on eggplant (Solanum melongena) leaves. The authors also reported wavelengths located at 408, 535, and 624 nm. These discrepancies might be explained by anatomical differences between plant species (Fernández et al. 2020a). Also, it must be noted that Fernández et al. (2020a), Xie et al. (2017), Xie et al. (2015), and Xie and He (2016) studied necrotic pathogens, while P. xanthii on cucumber is a biotrophic pathogen.
A linear SVM was applied to the first 5 PC scores, the RWP wavelengths, or the REP wavelengths to sort the leaf spectra according to the leaf status (healthy or infected) as a function of the DPI. Using five PCs, the lowest overall classification accuracy for healthy and infected leaves was around 65% at 0 DPI. The highest overall accuracy was around 97% at 6 DPI when the symptoms were visible. Our overall classification accuracy was higher than that of Fernández et al. (2020a), who reported an overall accuracy of around 89% at 3 DPI, when late blight symptoms were visible for the first time on potato leaves, using a partial least squares discriminant analysis (PLS-DA) applied to leaf reflectance from 400 to 900 nm. However, our accuracies were lower than those (100%) reported at 4 DPI, when disease signs were observed, by Berdugo et al. (2014), who applied a stepwise discriminant analysis to discriminate healthy cucumber leaves from those infected with P. xanthii using the following features: effective quantum yield, SPAD values, maximum temperature difference (MTD), NDVI, and the anthocyanin reflectance index (ARI).
The leaf spectra we measured were used to compute two main parameters of the red and red-edge regions, the RWP and REP wavelengths. For both parameters, the best fit was achieved using a linear model. For the REP, our results do not agree with those of Fernández et al. (2020b), who reported that an exponential model best fitted the REP changes in the case of potato late blight, which is due to a hemibiotrophic pathogen that necrotizes infected tissue. The present study examined variations induced by powdery mildew, which is due to a biotrophic pathogen that interacts with the host without producing necrosis. The slope of the linear model was higher for the REP model than for the RWP model; hence the REP parameter is more sensitive to the disease than the RWP. For the infected leaves, the R² values were 0.88 with RWP and 0.91 with REP. The one for RWP was higher than the one reported by Fernández et al. (2020b) (0.83), but the one for REP was lower than the one reported by Fernández et al. (2020b) (0.99). When the symptoms were visible for the first time, our overall classification accuracies with the RWP and REP wavelength positions (89% and 86%, respectively) were higher than those (65% with RWP and 59% with REP) reported by Fernández et al. (2020b), who applied a linear SVM classifier to classify healthy potato leaves and those infected with P. infestans.
Early disease detection is a challenging topic for which a definitive solution has not yet been found. One main limitation for early disease detection is related to the conditions under which experiments are performed. As in the current study, most of the literature results on disease detection have been obtained under controlled conditions with optimal illumination and plants having excellent growing conditions. There is, therefore, the need for further work to up-scale the laboratory measurements to actual crop conditions. Gold et al. (2020) already reported that using leaf-level spectroscopy for disease scouting is unfeasible. Khaled et al. (2018) suggested that it is necessary to consider the effect of environmental light when measuring spectral data in the field. Another limitation when studying plant-pathogen interactions is the spectral range of the available sensors (Khaled et al. 2018). Usually, the spectral range for image acquisition is focused on the visible-NIR range (400-1000 nm) because the majority of commercial cameras operate in this range. Indeed, short-wave infrared sensors are up to five times more expensive than visible and NIR sensors (Mishra et al. 2020). Finally, a universal method will be difficult to develop as the specific wavelengths required to detect specific diseases depend on the nature of the pathogen. Three different types of pathogens can be found in nature: biotrophic, hemibiotrophic, or necrotrophic. In the case of biotrophic pathogens such as P. xanthii (this study) and hemibiotrophic pathogens such as P. infestans, the useful wavelengths for disease detection change with the disease progress (Fernández et al. 2020b, Gold et al. 2020). For a necrotrophic pathogen such as Alternaria solani on potato leaves, the best wavelength was determined to be in the short-wave infrared region (Gold et al. 2020).
Table 5. Confusion matrix and related statistics when a linear support vector machine (SVM) classifier is applied to the red-well point (RWP) wavelength to classify 71 healthy and 57 infected spectral samples as a function of the number of days post-inoculation (DPI).

Conclusions
In this study, we determined which spectral variables could be used to detect powdery mildew on cucumber leaves, as well as the timing of onset following disease inoculation. Using principal component analysis over the whole visible spectra range, between 400 and 900 nm, allowed us to reduce the number of variables from 501 wavelengths to 5 components explaining above 97% of the total variance. However, the classification based on principal components is affected by instrumental noise observed in the 400-440 nm region. Regarding the two red and red-edge parameters (RWP and REP) we studied, REP was more sensitive to the disease than RWP. However, when a linear SVM is applied to REP and RWP to classify healthy and infected leaves, higher overall classification accuracy was achieved earlier with RWP than with REP.
Our results were obtained on Straight Eight cultivar cucumber plants. Further work is needed to test the method on other cucumber cultivars. While the results of this study are quite promising, they were acquired on a limited number of plants; therefore, there is a need to test the method over a large number of plants. Our results were obtained in a walk-in chamber with a controlled environment using point measurements. Future studies should address testing the method in actual greenhouse conditions, and on hyperspectral imagery acquired over those greenhouse conditions.