Development of a system based on aerial images for morphological pattern classification using support vector machines

Oil palm cultivation is one of the major agricultural activities in Colombia. Production performance depends on good plantation practices, mainly the management of phytosanitary conditions. Bud rot is the disease with the greatest impact in Colombia. It is most commonly detected by routine visual inspection of each palm, which is costly and inefficient. For this reason, the aim of this study is to develop a classification algorithm based on binary support vector machines for the detection of bud rot. The model was obtained from 798 aerial images acquired by unmanned aerial vehicles. Each image was labeled by an expert palm grower according to the presence or absence of the disease. The images were described by 531 morphological features obtained by concatenating uniform local binary pattern vectors. Bootstrapping was used to balance the classes, yielding 507 observations per class. To evaluate the performance of the classifier, an 8-fold Monte Carlo cross-validation was implemented by randomly splitting the data set into training (80%), validation (10%), and test (10%) sets with balanced classes. Finally, the model achieved a sensitivity and a specificity above 96.0% on the test set. This indicates that the model could automate bud rot detection with high reliability and make recognition more efficient, thanks to the fusion of machine learning techniques with the phenomena of optical physics.


Introduction
Oil palm cultivation is one of the main agricultural activities in countries such as Indonesia, Malaysia, Thailand, and Colombia. These are the largest producers, accounting for 85% of world palm oil production. Palm oil has become the most consumed vegetable oil worldwide. Its main use is the generation of electricity and fuel. In recent years, these crops have been seriously affected by pests and diseases. In the oil palm industry of Southeast Asia, Basal Stem Rot (BSR) has been the most notorious and damaging systemic disease. In Latin America, especially in Colombia, the localized disease Bud Rot (BR) has been the most devastating. It has reduced production by between 8% and 29% in the affected regions, with losses of up to USD 2.4 million and of more than 8000 direct and indirect jobs in the last 10 years [1]. It is present in 56% of the plantations in Colombia.
Currently, BR is detected by trained personnel through routine inspection, analyzing the morphological characteristics of each palm, especially the young arrow [2]. This technique is expensive and inefficient because of the large extent of the plantations, the height of the palms, and the subtlety of the morphological changes they present. It prevents the disease from being detected early, which promotes its spread [3]. As an alternative, studies have been carried out with satellite images, some of them on the detection of diseases such as BSR using multispectral imaging from Quickbird [4] and WorldView-3 [5] in Indonesia. However, their spatial resolution, their temporal availability, and the meteorological conditions at the time of capture have limited their success in disease detection [6]. In recent years, attention has shifted to the use of UAVs. Besides reducing time and costs [7], UAVs provide a smaller ground sampling distance (GSD) than satellite images [8]. They have been widely used in automatic palm counting [9,10], land cover classification [11], BSR disease detection [12], and BR disease detection [13].
This study proposes a classification algorithm based on binary support vector machines (SVM) for the detection of BR between severity levels two and five, with the best hyperparameters selected through a design of experiments (DoE). This allows detection at lower degrees of severity than in previous studies [13]. For this purpose, aerial images are acquired by unmanned aerial vehicles (UAV) in the visible light spectrum, which provides a smaller GSD (finer spatial resolution) and less dependence on atmospheric conditions. BR detection thus becomes faster, more effective, and cheaper than with traditional methods. It lets growers treat the disease appropriately and take the necessary steps to prevent its spread, reducing production losses and improving income for growers and their region of impact.

Materials and methods
Figure 1 shows the methodology proposed for the detection of palms with bud rot. It begins by creating the database from the images acquired using UAVs and processed into orthomosaics. These images were labeled by an expert palm grower into two classes: healthy (negative) and BR palms (positive). Then, the bootstrapping method was used to balance the classes. Feature extraction was performed on the pre-processed images using the uniform local binary pattern (ULBP) descriptor. Finally, the predictive SVM model was obtained using an 8-fold Monte Carlo cross-validation, and its hyperparameters were tuned using a DoE.

Database generation
An area of 10 ha with 6-year-old palms was selected to acquire the images in a region strongly affected by BR in the Magdalena Medio, Colombia. The images were acquired in the visible light spectrum using the Phantom 3 Standard and Phantom 4 UAVs from the manufacturer DJI®. Figure 2(a) shows some of the 10500 images acquired.
The images of each flight were processed in the Pix4Dmapper® photogrammetry software to create 25 orthomosaics. An example of an orthomosaic is presented in Figure 2(b). All images of oil palms, centered on the bud, were extracted. Each palm image sample was 800 x 800 x 3 pixels in RGB color mode. These samples were labeled by an expert palm grower into two classes: healthy (negative) and BR presence on oil palm (positive). Each label was corroborated by the traditional method of inspecting the young arrow directly on the palm. This yielded 507 negative samples and 201 positive samples. The bootstrapping method was used to balance the positive class to 507 samples. Finally, a pre-processing stage was applied to each sample to reduce computational costs and improve the feature extraction process. It was carried out in MATLAB® R2019b (The MathWorks, MA, USA) using built-in functions from the statistics and machine learning toolboxes. Each sample was subsampled with d = 2, the color mode was transformed from RGB to grayscale, and a 1.
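The class-balancing step can be illustrated with a minimal sketch. It assumes that bootstrapping here means drawing the 507 positive samples with replacement from the 201 labeled ones; the text does not say whether the original 201 were all kept, and the sample identifiers and the function name `bootstrap_balance` are hypothetical.

```python
import random


def bootstrap_balance(minority, target_size, seed=0):
    """Draw `target_size` items from `minority` with replacement."""
    rng = random.Random(seed)
    return [rng.choice(minority) for _ in range(target_size)]


# Hypothetical identifiers standing in for the labeled palm images.
positives = [f"br_{i:03d}" for i in range(201)]
negatives = [f"healthy_{i:03d}" for i in range(507)]

# Assumption: the positive class is resampled up to the size of the negative class.
balanced_positives = bootstrap_balance(positives, len(negatives))
print(len(balanced_positives), len(negatives))
```

Because the draw is with replacement, some positive samples appear several times in the balanced class.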

Feature extraction
Negative samples, as shown in Figure 2(c), and positive samples, as shown in Figure 2(d), were described by the ULBP descriptor. Because this descriptor is translation invariant, it does not record where in the image a pattern occurs; therefore, each sample was divided into nine equal parts to retain coarse spatial information. Each part is described by 59 features. Finally, the nine vectors were concatenated, giving 9 x 59 = 531 morphological features.
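The feature count can be checked with a small pure-Python sketch of the descriptor. It assumes 8 neighbours at radius 1 without interpolation, a `>=` comparison against the centre pixel, and histograms normalised to sum to one; none of these settings is stated in the text, and the function names and the synthetic image are illustrative only.

```python
def transitions(code):
    """Count 0/1 transitions in the circular 8-bit pattern `code`."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))


# Uniform patterns have at most two transitions; each one gets its own bin.
UNIFORM = [c for c in range(256) if transitions(c) <= 2]
BIN_OF = {c: i for i, c in enumerate(UNIFORM)}
N_BINS = len(UNIFORM) + 1  # one extra shared bin for all non-uniform patterns

# Offsets of the 8 neighbours at radius 1, in circular order.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def ulbp_histogram(img):
    """Normalised uniform-LBP histogram of a grayscale image (list of rows)."""
    hist = [0] * N_BINS
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            hist[BIN_OF.get(code, N_BINS - 1)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def describe(img, grid=3):
    """Concatenate the ULBP histograms of a grid x grid partition of `img`."""
    h, w = len(img) // grid, len(img[0]) // grid
    features = []
    for i in range(grid):
        for j in range(grid):
            block = [row[j * w:(j + 1) * w] for row in img[i * h:(i + 1) * h]]
            features.extend(ulbp_histogram(block))
    return features


# Synthetic 30 x 30 grayscale image, used only to exercise the code.
image = [[(7 * r + 13 * c) % 256 for c in range(30)] for r in range(30)]
print(N_BINS, len(describe(image)))  # 59 531
```

With 8 neighbours there are 58 uniform patterns, so each part yields 58 + 1 = 59 bins, and the nine parts give the 531 features reported.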

Binary support vector machines model
The generated database was randomly split into training (80%), validation (10%), and test (10%) sets with balanced classes. An 8-fold Monte Carlo cross-validation was implemented for training and validating the binary SVM classification model. The best kernel and its best hyperparameters were selected on the basis of two exploratory DoEs.
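The splitting scheme can be sketched as follows. The sketch assumes that every one of the 8 repetitions redraws all three sets independently and that the validation and test sizes are rounded per class, with the training set taking the remainder; the text does not specify either detail, and `stratified_split` is a hypothetical helper.

```python
import random


def stratified_split(labels, val_frac=0.1, test_frac=0.1, rng=None):
    """Randomly split sample indices into train/validation/test, class by class."""
    rng = rng or random.Random()
    train, val, test = [], [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_val = round(val_frac * len(idx))
        n_test = round(test_frac * len(idx))
        val.extend(idx[:n_val])
        test.extend(idx[n_val:n_val + n_test])
        train.extend(idx[n_val + n_test:])  # remainder, about 80%
    return train, val, test


# 507 negative (0) and 507 positive (1) samples after class balancing.
labels = [0] * 507 + [1] * 507

rng = random.Random(42)
splits = [stratified_split(labels, rng=rng) for _ in range(8)]  # 8 Monte Carlo repetitions

train, val, test = splits[0]
print(len(train), len(val), len(test))  # 810 102 102
```

Unlike k-fold cross-validation, the repetitions are drawn independently, so a sample may land in the validation or test set of several repetitions.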
A DoE with 336 treatments was implemented for the polynomial kernel, see Equation (1), taking 3 values of n, 8 values of c, and 14 values of α. For the radial basis function kernel, see Equation (2), a $31^1$ DoE with 31 treatments was implemented, in which σ was varied between 0 and 30. Finally, for each type of kernel, the best DoE treatment was selected and a new $11^1$ DoE was designed around it. For the DoE, assume that $X \in \mathbb{R}^2$ and that $X_1$ and $X_2$ are random vectors referring to morphological features.
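Polynomial kernel:
$$K(X_1, X_2) = \left(\alpha\, X_1^{\top} X_2 + c\right)^{n} \qquad (1)$$
where α is the scale, c is the offset, and n is the polynomial order.
Radial basis function kernel:
$$K(X_1, X_2) = \exp\left(-\frac{\lVert X_1 - X_2 \rVert^{2}}{2\sigma^{2}}\right) \qquad (2)$$
where σ is the standard deviation.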
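As an illustration of the two kernels, the sketch below assumes their common textbook forms, (α X1·X2 + c)^n for the polynomial kernel and exp(-||X1 - X2||² / (2σ²)) for the radial basis function kernel. The function names and the example vectors are hypothetical, and the exact parameterization used by the software in the study may differ.

```python
import math


def polynomial_kernel(x1, x2, alpha, c, n):
    """(alpha * <x1, x2> + c) ** n, with scale alpha, offset c and order n."""
    dot = sum(a * b for a, b in zip(x1, x2))
    return (alpha * dot + c) ** n


def rbf_kernel(x1, x2, sigma):
    """exp(-||x1 - x2||^2 / (2 * sigma^2)); requires sigma > 0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical two-dimensional feature vectors.
x1, x2 = [0.2, 0.4], [0.1, 0.3]
print(polynomial_kernel(x1, x2, alpha=2.3, c=0.0, n=3))
print(rbf_kernel(x1, x2, sigma=1.0))
```

In this form the radial basis function kernel is undefined for σ = 0, so the sketch does not cover that end of the range explored in the DoE.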

Polynomial kernel
The best treatment of the DoE for the polynomial kernel function in the SVM model was obtained with order 3, zero offset, and a scale of 2.5, see Equation (1). This treatment presented an error rate of 6.36%, a sensitivity of 95.39%, and a specificity of 91.89%. It was also found that the offset is not a significant factor in the performance of the classifier. From this treatment, a zoom-in $11^1$ experimental design was built around the scale of 2.5. The result of this new DoE is presented in Table 1. This design kept the polynomial order at 3 and the offset at zero. Treatment four in Table 1, with a scale of 2.3, presents the best performance, that is, the lowest error rate: 6.25%, with a sensitivity of 95.39% and a specificity of 92.11%.
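The size of the two polynomial-kernel designs can be sketched as below. Only the level counts (3, 8, and 14), the centre of 2.5, and the 11 zoom-in levels come from the text; the actual level values are not reported, so every value in the lists is a hypothetical placeholder. The zoom-in step of 0.1 is likewise an assumption, chosen because it places a scale of 2.3 at treatment four.

```python
from itertools import product

# Hypothetical factor levels: only the counts (3, 8, 14) come from the text.
orders = [2, 3, 4]                          # n, polynomial order
offsets = list(range(8))                    # c, offset
scales = [0.5 * k for k in range(1, 15)]    # alpha, scale

# Full factorial design: one treatment per combination of levels.
treatments = list(product(orders, offsets, scales))
print(len(treatments))  # 336

# Zoom-in design: 11 scale levels around 2.5, assuming a step of 0.1.
zoom_scales = [round(2.0 + 0.1 * k, 1) for k in range(11)]
print(zoom_scales[3])  # 2.3
```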

Radial basis function kernel
The best treatment of the DoE of the radial basis function kernel see Equation (2)

Testing model
Comparing the two kernels, the polynomial kernel had a 4.6% higher sensitivity than the radial basis function kernel. For this reason, it was selected to obtain the performance metrics with the test set. The SVM model with the polynomial kernel function was evaluated with the hyperparameters that obtained the highest performance in the experimental design: order three, a scale of 2.3, and zero offset. The model achieved a sensitivity of 98.04%, a specificity of 96.08%, and an error rate of 2.94%.
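The three reported metrics follow from a confusion matrix, as the sketch below shows. The counts are an assumption, since the text does not report them: they suppose a test set of 51 palms per class (about 10% of the 507 per class) and were chosen because they reproduce the reported percentages.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and error rate from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # diseased palms correctly detected
    specificity = tn / (tn + fp)   # healthy palms correctly recognised
    error_rate = (fn + fp) / (tp + fn + tn + fp)
    return sensitivity, specificity, error_rate


# Assumed counts (not reported in the text): 51 test palms per class.
sens, spec, err = binary_metrics(tp=50, fn=1, tn=49, fp=2)
print(f"{sens:.2%} {spec:.2%} {err:.2%}")  # 98.04% 96.08% 2.94%
```

With balanced classes, the error rate equals one minus the mean of sensitivity and specificity, which also holds for the validation figures reported for the polynomial kernel.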

Discussion
Physics and its various areas have contributed both to the progress of research and to its applications in agronomy, especially in the oil palm sector. The search for more efficient and more profitable ways to manage the crop, such as its phytosanitary management, has strengthened the implementation of precision agriculture with UAVs. Their operation can be described with basic physics, such as the definitions of momentum, work, kinetic and potential energy, and power, together with Newton's laws of motion. This makes it possible to replace the current routine inspection techniques with a faster, more efficient, and more economical one that can cover large crop areas. These UAVs, such as the Phantom 3 Standard and the Phantom 4, carry complementary metal-oxide-semiconductor (CMOS) sensors that capture electromagnetic waves with wavelengths between 400 nm and 700 nm. These wavelengths are suitable for the detection of BR [13].
The polynomial binary SVM classifier developed in the present study for the detection of BR in oil palms showed a sensitivity of 98.04%, a specificity of 96.08%, and an error rate of 2.94%. This performance is higher than that obtained with the LR classifier for the detection of BR at high severity levels (4 and 5) [13]. In addition, it allowed detection from severity level two, compared with level four in the previous study, considering that the images in both studies were acquired at the same height and overlap.
The use of UAVs for the detection of BR in oil palms had a better performance than that obtained in BSR detection studies with satellite images. It had an accuracy 6.06% higher than that obtained by Santoso [4] with the RF classifier and 42.96% higher than that obtained by the same author [5] using an SVM model.
In future work, enlarging the database will allow the model to be improved so that it can be applied to a greater variety of plantations. In addition, the hyperparameters could be optimized through genetic algorithms. Likewise, image processing techniques or clustering-based machine learning (ML) models could be incorporated to segment the palm from the image background. This would allow the model to be applied directly and automatically to the orthomosaics, without learning irrelevant information. Finally, a convolutional neural network (CNN) could be used. This would provide a larger receptive field over the images and minimize the bias introduced by feature selection.

Conclusions
The polynomial kernel function with order 3 and a scale of 2.3 obtained a BR detection capacity for diseased palms 4.6% higher than the radial basis function kernel. The DoE of 336 treatments made it possible to identify that the offset has no significant effect on the performance of the model, and to identify the values of the hyperparameters, which were then adjusted with the final $11^1$ DoE.
The polynomial binary SVM classifier obtained a performance higher than 96% for the detection of BR. This indicates that acquiring aerial images with UAVs, combined with a technique that seeks to maximize the distance in the feature space between the separating hyperplane and the training data closest to it, can automate BR detection in oil palm with high reliability.
Areas of physics such as optics, quantum physics, waves, and particles have contributed both to the progress of research and to its applications. The properties of particle physics underlie the reception of signals and images from UAVs and their subsequent processing. As a result, the time needed for BR detection is reduced and recognition becomes more efficient. All this is thanks to the fusion of ML techniques (which identify complex patterns in millions of pixels, allowing image textures to be extracted and segmented into different classes) with the phenomena of optical physics.