Classification of Fusarium wilt disease in banana leaves using a support vector machine

Fusarium wilt is an important disease of many types of bananas and one of the most common causes of banana plant destruction in tropical and subtropical regions. Fusarium wilt is caused by the fungus Fusarium oxysporum f.sp. cubense (FOC). As a soil-inhabiting, soil-transmitted fungus that invades the host and causes wilt by colonizing the xylem vessels, FOC must penetrate through the roots of the host plant, so control efforts need to protect the root system and induce host resistance there. The banana plant disease detection system proposed in this research consists of two phases. In the first phase, the learning process, images of healthy banana leaves and of leaves affected by fusarium disease are used; each image file measures 640x480 pixels and was captured with a digital microscope at the PT. GGP Lampung plantation. Next is the classification process. The method used for pattern recognition in this study is the support vector machine (SVM). In its basic form, SVM can only classify data into two classes (binary classification). To apply it to the problem of classifying healthy banana leaves and leaves affected by mild, moderate, and severe fusarium disease, a multiclass extension of SVM is required.


INTRODUCTION
Indonesia ranks third among global banana producers but exports only 0.31% of its total national production. The largest share of banana exports (70-80%) comes from Latin American and Caribbean countries. The most popular commercial banana is the Cavendish clone, which was developed in the 1960s as a resistant variety to replace the Gros Michel (Ambon Kuning) clone after it was destroyed in Latin American banana plantations by the soil-borne pathogenic fungus Fusarium oxysporum f.sp. cubense race 1 (FocRas1). About 25 years ago, the resistance of Cavendish was broken by Foc Tropical Race 4 (FocTR4), which can also attack other banana types of high economic value such as Barangan, Mas, and others. FocTR4 has since spread to various centers of world banana production, and no effective way to control it has been found. If the spread of FocTR4 is not controlled in Indonesia, it will hinder the development of the banana commodity, which has very large potential as a source of foreign exchange for the country.
Detection of plant diseases using image processing techniques has become a major research topic, and machine learning has made a major contribution to it. An automatic detection method for Black Sigatoka disease in banana plants was proposed by Sandip P. Bhamare et al. [5]. The researchers used manually input images of banana plants affected by Black Sigatoka, applied the background subtraction method for object extraction, and used regional and local feature extraction methods to identify infected plants.
Another study on automatic detection of banana plant diseases was proposed by Basavaraj Tigadi et al. [6]. Color features and histogram features are extracted from the acquired image, and the extracted features are then passed to an Artificial Neural Network (ANN) to classify healthy and abnormal leaves. Jihen Amara et al. [7] proposed a system based on convolutional neural networks (CNNs) for the identification and classification of banana plant diseases. The approach is based on the LeNet architecture [8] and requires only minimal image preprocessing. The authors claim that the designed system learns visual features directly by comparing images of affected plant leaves with images of unaffected ones, and that it achieves 98% accuracy. Godliver Owomugisha et al. [9] proposed a Python-based system to detect banana plant diseases, evaluated with the k-fold cross-validation method. Their article compares different classifiers for detecting banana plant diseases; the tree classifier [10] performed best at identifying banana bacterial wilt (BBW) and banana Black Sigatoka (BBS). Bhavini J et al. [11] used texture recognition techniques to detect apple fruit diseases such as apple spot and apple rot. In their system, K-Means clustering with the Euclidean distance is used to find the infected area of the fruit image, which is converted to RGB. Color, shape, and texture features are extracted, and feature-level fusion is performed to integrate more than two features.
Gravity: Jurnal Ilmiah Penelitian dan Pembelajaran Fisika, 8 (1), 2022, 59
In this research, the machine learning process uses 120 images of healthy banana leaves and of leaves affected by Fusarium disease, each a file of 640x480 pixels, captured with a digital microscope at PT. GGP Lampung. The next stage is feature extraction, which obtains the features used for classification. The method used for pattern recognition in this study is the support vector machine (SVM).

RESEARCH METHODS
The steps of the proposed method for classifying banana plants infected with fusarium disease are shown in Figure 1. In the learning process, the images taken with a digital microscope at the plantation of PT. GGP Lampung are files of 640x480 pixels each, obtained by manual segmentation. Normalization steps such as cropping and resizing, which equalize the image dimensions, are done manually so that each sample becomes a single image. The next stage is feature extraction, which obtains the features used for classification. The features are the mean, standard deviation, kurtosis, skewness, and entropy of the color histogram, grayscale histogram, and saturation level histogram. Next is the classification process. The method used for pattern recognition in this study is the support vector machine (SVM) one against all (OAA). In the training process, the hyperplane variables obtained for each classifier are stored and later used by that classifier in the testing process. In other words, training is the process of finding the support vectors, alpha, and bias from the training input data, which takes the form of feature vectors from banana plant images in four classes: healthy banana leaves, light fusarium, medium fusarium, and heavy fusarium. In the testing process, the banana plant images from the same four classes are data that were not included in the training process. If the class produced by the test classification is the same as the actual class of the data, the recognition is declared correct.

Preprocessing
To build a color histogram from an image of a banana plant infected with fusarium, the image is first normalized. The image is mapped to a size of 640x480 pixels. The purpose of normalization is to reduce the resolution of the image, which helps the image recognition process and also increases the recognition accuracy. After color normalization, the red, green, and blue values are used to construct the color histogram. The images of banana plants infected with fusarium use the RGB color model. The gray level of each pixel is computed from its RGB color components, and the result of this grayscale conversion is an 8-bit gray level. The distribution of the pixel values of the grayscale image is entered into the grayscale histogram. The saturation level histogram captures color intensity in terms of saturation. The saturation component is computed from the RGB color model of the fusarium-infected banana plant image, and the saturation value of each pixel is used to build the histogram of its distribution.
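The preprocessing described above can be sketched in Python as follows. This is only an illustration under stated assumptions, because the conversion formulas are not reproduced in the text: chromaticity normalization r = R/(R+G+B), the BT.601 luma weights for the gray level, and the HSI-style saturation S = 1 - 3 min(R,G,B)/(R+G+B) are assumed here, and the function names and the 256-bin choice are hypothetical.

```python
def normalize_rgb(pixel):
    """Chromaticity normalization (assumed form): each channel divided by R+G+B."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return (0.0, 0.0, 0.0)  # black pixel: avoid division by zero
    return (r / total, g / total, b / total)


def gray_level(pixel):
    """8-bit gray level using BT.601 luma weights (an assumption)."""
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def saturation(pixel):
    """HSI-style saturation in [0, 1] (an assumption)."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return 0.0
    return 1.0 - 3.0 * min(r, g, b) / total


def histogram(values, bins, lo, hi):
    """Count values in `bins` equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)  # clamp v == hi into the last bin
        counts[index] += 1
    return counts


def image_histograms(pixels, bins=256):
    """Return the grayscale, saturation, and per-channel normalized color histograms."""
    gray = histogram([gray_level(p) for p in pixels], bins, 0, 256)
    sat = histogram([saturation(p) for p in pixels], bins, 0.0, 1.0)
    normalized = [normalize_rgb(p) for p in pixels]
    color = [histogram([n[c] for n in normalized], bins, 0.0, 1.0) for c in range(3)]
    return gray, sat, color


# Tiny hypothetical "image": pure red, mid gray, white, black.
pixels = [(255, 0, 0), (128, 128, 128), (255, 255, 255), (0, 0, 0)]
gray_hist, sat_hist, color_hist = image_histograms(pixels)
```

A real 640x480 image would be passed as a flat list of its 307,200 RGB tuples; the four-pixel list here only keeps the example easy to follow.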

Feature Extraction
The values of the color, grayscale, and saturation level histograms could be used directly as input vectors, but to reduce the computational cost they are represented by the mean, standard deviation, kurtosis, and skewness of the histogram distribution. The entropy value is obtained from the co-occurrence matrix, which describes how often pairs of pixels with given intensities occur in the image. From it, the randomness (entropy) of the intensity distribution is computed as

$$H = -\sum_{i_1}\sum_{i_2} p(i_1, i_2)\,\log p(i_1, i_2)$$

where $p(i_1, i_2)$ is the co-occurrence matrix of the fusarium-infected banana plant image.
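A minimal Python sketch of these five features is given below. It rests on several assumptions that the text does not fix: the moments are taken over bin indices weighted by the normalized histogram, the co-occurrence matrix counts horizontally adjacent pixel pairs only, and the entropy uses a base-2 logarithm. The function names are hypothetical.

```python
import math


def histogram_moments(counts):
    """Mean, standard deviation, skewness, and kurtosis of the bin indices,
    weighted by the normalized histogram (an assumed interpretation)."""
    total = sum(counts)
    probs = [c / total for c in counts]
    mean = sum(i * p for i, p in enumerate(probs))
    var = sum((i - mean) ** 2 * p for i, p in enumerate(probs))
    std = math.sqrt(var)
    if std == 0:
        # Degenerate case (all mass in one bin): higher moments are undefined, report 0.
        return mean, 0.0, 0.0, 0.0
    skew = sum((i - mean) ** 3 * p for i, p in enumerate(probs)) / std ** 3
    kurt = sum((i - mean) ** 4 * p for i, p in enumerate(probs)) / std ** 4
    return mean, std, skew, kurt


def cooccurrence_entropy(gray, levels):
    """Entropy of the co-occurrence matrix of horizontally adjacent pixels.
    `gray` is a list of rows of integer gray levels in range(levels)."""
    counts = [[0] * levels for _ in range(levels)]
    pairs = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
            pairs += 1
    entropy = 0.0
    for line in counts:
        for c in line:
            if c:  # skip empty cells, since 0 * log(0) is taken as 0
                p = c / pairs
                entropy -= p * math.log2(p)
    return entropy


# Hypothetical inputs: a 3-bin histogram and a 2x3 image with 2 gray levels.
mean, std, skew, kurt = histogram_moments([1, 2, 1])
entropy = cooccurrence_entropy([[0, 0, 1], [1, 1, 0]], levels=2)
```

Applying `histogram_moments` and `cooccurrence_entropy` to each of the color, grayscale, and saturation inputs would yield the feature vector that is passed to the classifier.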

Classification
After preprocessing and feature extraction, banana leaf images are classified as healthy or diseased using a classifier. The classification of banana leaf images consists of two main steps: training and testing. In this study, the k-fold cross-validation method was used to obtain the training and testing datasets. To analyze the performance of the proposed SVM classifier, statistical parameters such as accuracy, sensitivity, and specificity were used. The Support Vector Machine (SVM) was developed by Boser, Guyon, and Vapnik and was first presented in 1992 at the Annual Workshop on Computational Learning Theory. The basic concept of SVM is a harmonious combination of computational theories that had existed for decades, such as the margin hyperplane (Duda & Hart in 1973, Cover in 1965, Vapnik in 1964, etc.) and the kernel, introduced by Aronszajn in 1950, along with other supporting concepts. Until 1992, however, there had been no attempt to assemble these components.
In contrast to the neural network strategy, which seeks a separating hyperplane between classes, SVM tries to find the best hyperplane in the input space. The basic principle of SVM is a linear classifier, which was later extended to non-linear problems by incorporating the kernel trick in a high-dimensional feature space. This development stimulated research interest in the field of pattern recognition to investigate the potential of SVM both theoretically and in applications. SVM has since been successfully applied to real-world problems and generally provides a better solution than conventional methods such as artificial neural networks [4].
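The k-fold split and the three performance measures named at the start of this section can be sketched in Python as follows. The value k = 5 is an assumption, since the number of folds is not stated, and the function names and the small label lists are hypothetical. Sensitivity is computed as TP/(TP+FN) and specificity as TN/(TN+FP), with one class treated as positive at a time.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k (train, test) pairs.
    Fold i takes every k-th index starting at i, so data sorted by class
    ends up spread roughly evenly over the folds."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((sorted(train), test))
    return splits


def binary_metrics(actual, predicted, positive):
    """Accuracy, sensitivity, and specificity with `positive` as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, sensitivity, specificity


# 120 images as in this study; k = 5 is an assumed fold count.
splits = k_fold_indices(120, 5)

# Hypothetical labels for four test images.
actual = ["healthy", "healthy", "light", "heavy"]
predicted = ["healthy", "light", "light", "heavy"]
accuracy, sensitivity, specificity = binary_metrics(actual, predicted, "healthy")
```

With 120 images and five folds, each round would train on 96 images and test on the remaining 24.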
The SVM concept can be explained simply as an attempt to find the best hyperplane that separates two classes in the input space. A hyperplane in a d-dimensional vector space is an affine subspace of dimension d-1 that divides the vector space into two parts, each of which corresponds to a different class [4]. Figure 1 shows some patterns that are members of two classes: +1 and -1. Patterns belonging to class -1 are symbolized by red squares, while patterns in class +1 are symbolized by yellow circles. The classification problem can be stated as finding a line (hyperplane) that separates the two groups. Various alternative dividing lines (discrimination boundaries) are shown in Figure 3(a). Figure 3 shows several patterns that are members of two classes, class 1 and class 2, where the patterns in class 1 are symbolized by red boxes and the patterns in class 2 by green circles. Figure 5(a) shows that there are several alternative discrimination boundaries that separate the patterns of the two classes. Figure 5(b) shows that the two classes can be separated by a pair of parallel bounding planes. A pattern that lies on one of these bounding planes is called a support vector. The first bounding plane delimits the first class, which gives

$$\vec{w}\cdot\vec{x}_i + b \le -1 \quad \text{for } y_i = -1$$

while the second bounding plane delimits the second class, which gives

$$\vec{w}\cdot\vec{x}_i + b \ge +1 \quad \text{for } y_i = +1$$

where $\vec{w}$ is the weight vector and $b$ is the bias. The distance between the training vector $\vec{x}_i$ and the hyperplane is called the margin. Multiplying $b$ and $\vec{w}$ by a constant multiplies the resulting margin value by the same constant. Equation (15) is a scaling constraint that can be met by rescaling $b$ and $\vec{w}$. In addition, because maximizing $1/\lVert\vec{w}\rVert$ is equivalent to minimizing $\lVert\vec{w}\rVert^2$, the search for the best hyperplane becomes a constrained optimization problem, which is solved with the Lagrangian

$$L(\vec{w}, b, \alpha) = \frac{1}{2}\lVert\vec{w}\rVert^2 - \sum_{i=1}^{l}\alpha_i\left(y_i(\vec{w}\cdot\vec{x}_i + b) - 1\right)$$

where $\alpha_i$ are the Lagrange multipliers, which are zero or positive ($\alpha_i \ge 0$). The optimal value of equation (19) can be calculated by minimizing $L$ with respect to $\vec{w}$ and $b$ and maximizing $L$ with respect to $\alpha_i$. Using the property that the gradient of $L$ is zero at the optimal point, equation (22) can be rewritten as a maximization problem that contains only $\alpha_i$, as in equation (23):

$$\max_{\alpha}\ \sum_{i=1}^{l}\alpha_i - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j y_i y_j\,\vec{x}_i\cdot\vec{x}_j \quad \text{subject to} \quad \alpha_i \ge 0,\ \sum_{i=1}^{l}\alpha_i y_i = 0 \qquad (23)$$

From these calculations the values of $\alpha_i$ are obtained, and the data points whose $\alpha_i$ is positive are the support vectors.
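To make the role of the Lagrange multipliers concrete, the following Python sketch recovers the weight vector, the bias, and the support vectors from multipliers that are already known. It assumes the standard hard-margin conditions, namely that setting the gradient of L to zero gives w = Σ α_i y_i x_i and that every support vector satisfies y_i(w·x_i + b) = 1. The function name and the three training points are hypothetical, and the multipliers are supplied by hand rather than computed by a quadratic programming solver.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def hyperplane_from_alphas(xs, ys, alphas, tol=1e-9):
    """Recover w, b, and the support-vector indices from solved multipliers."""
    dim = len(xs[0])
    # Zero gradient with respect to w: w = sum_i alpha_i * y_i * x_i
    w = [sum(a * y * x[d] for a, y, x in zip(alphas, ys, xs)) for d in range(dim)]
    # Support vectors are the points whose multiplier is strictly positive.
    support = [i for i, a in enumerate(alphas) if a > tol]
    # On a bounding plane y_i (w.x_i + b) = 1 and y_i is +1 or -1, so b = y_i - w.x_i.
    # Averaging over all support vectors reduces rounding error.
    b = sum(ys[i] - dot(w, xs[i]) for i in support) / len(support)
    return w, b, support


# Hypothetical toy data: two points on the bounding planes, one further away.
xs = [(1.0, 1.0), (-1.0, -1.0), (3.0, 3.0)]
ys = [1, -1, 1]
alphas = [0.25, 0.25, 0.0]  # assumed to come from a quadratic programming solver

w, b, support = hyperplane_from_alphas(xs, ys, alphas)
```

In this toy case the third point has a zero multiplier, so it does not influence the hyperplane, which is the sense in which only the support vectors determine the classifier.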

The one against one SVM method is one way to implement multiclass SVM using the second approach. The number of binary classification models built with this method is given by equation (24):

$$\frac{k(k-1)}{2} \qquad (24)$$

where k is the number of classes. At the training stage, each classification model is trained using training data from two classes. At the testing stage, there are several ways to test the data once all k(k-1)/2 classification models have been built. One of them is the voting method (Hsu, 2002). An example of the one against one SVM method is shown in Table 1 and Figure 2.
Table 1. Example of one against one SVM with four classes

Class i    Class j    Decision function
Class 1    Class 2    f12(x) = (w12)x + b12
Class 1    Class 3    f13(x) = (w13)x + b13
Class 1    Class 4    f14(x) = (w14)x + b14
Class 2    Class 3    f23(x) = (w23)x + b23
Class 2    Class 4    f24(x) = (w24)x + b24
Class 3    Class 4    f34(x) = (w34)x + b34

In the end, the class of data x is the class that receives the highest number of votes. If two classes have the same number of votes, the class with the smaller index is declared the class of the data being tested.

The one against all method builds k binary SVMs, where k is the number of classes (Hsu et al., 2002). The i-th SVM is trained with all samples of class i labeled positive and all other samples labeled negative. Given l training data $(\vec{x}_1, y_1), \ldots, (\vec{x}_l, y_l)$, where $\vec{x}_j \in R^n$, $j = 1, \ldots, l$, and $y_j \in \{1, \ldots, k\}$ is the class of $\vec{x}_j$, the i-th SVM solves the following problem:

$$\min_{\vec{w}^i,\, b^i,\, \xi^i}\ \frac{1}{2}(\vec{w}^i)^T\vec{w}^i + C\sum_{j=1}^{l}\xi_j^i$$

$$(\vec{w}^i)^T\phi(\vec{x}_j) + b^i \ge 1 - \xi_j^i \quad \text{if } y_j = i$$

$$(\vec{w}^i)^T\phi(\vec{x}_j) + b^i \le -1 + \xi_j^i \quad \text{if } y_j \ne i$$

$$\xi_j^i \ge 0, \quad j = 1, \ldots, l$$

where $\phi$ maps the data to the feature space (the identity for a linear kernel), $\xi_j^i$ are slack variables, and $C$ is the penalty parameter. The class of data x is the class whose decision function gives the highest value. The minimization problem in equation 22 is solved using quadratic programming.
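The two multiclass decision rules described above can be sketched in Python as follows, using linear decision functions of the form f(x) = w·x + b. The binary models are not trained here: their weights are hypothetical values chosen so that four classes lie along a single feature axis. The sign convention (a non-negative f_ij(x) is a vote for class i) and the function names are also assumptions.

```python
from itertools import combinations


def decision(w, b, x):
    """Linear decision function f(x) = w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_one_against_one(models, x, k):
    """`models` maps (i, j) with i < j to (w, b).
    f_ij(x) >= 0 is counted as a vote for class i, otherwise for class j."""
    votes = {c: 0 for c in range(1, k + 1)}
    for i, j in combinations(range(1, k + 1), 2):
        w, b = models[(i, j)]
        winner = i if decision(w, b, x) >= 0 else j
        votes[winner] += 1
    # max() keeps the first maximal item, so a tie goes to the smaller class index.
    return max(sorted(votes), key=lambda c: votes[c])


def predict_one_against_all(models, x):
    """`models` maps class i to (w, b); the class with the largest f_i(x) wins."""
    return max(sorted(models), key=lambda c: decision(*models[c], x))


k = 4
# Hypothetical pairwise models: the boundary between classes i and j is at (i + j) / 2.
oao_models = {(i, j): ([-1.0], (i + j) / 2) for i, j in combinations(range(1, k + 1), 2)}
# Hypothetical one against all models: f_i(x) = i * x - i^2 / 2 peaks for the nearest class.
oaa_models = {i: ([float(i)], -i * i / 2) for i in range(1, k + 1)}

x = [2.9]
oao_class = predict_one_against_one(oao_models, x, k)
oaa_class = predict_one_against_all(oaa_models, x)
```

For four classes the one against one rule evaluates k(k-1)/2 = 6 binary models per sample, while the one against all rule evaluates only k = 4.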

RESULTS AND DISCUSSION
This section describes and evaluates the effectiveness of the methods used for the classification of banana plants infected with fusarium. The SVM classification method was tested on 120 images of banana plants infected with fusarium, each 50x50 pixels in size. Each class consists of 30 images. The trial has two stages: training and testing. The training stage obtains the coordinates of the support vectors, the weights, the bias, and the distances of the support vectors, while the testing stage uses data other than the training data to obtain classification results, so that the level of accuracy can be determined.

A. Experiment of the Effect of Histogram Feature Range on Classification Results Using Support Vector Machine One Against All
Figure 8 shows that the lowest average accuracy is 78.07% for class I (healthy banana leaves), 70.02% for class II (light fusarium), 72% for class III (medium fusarium), and 72.78% for class IV (heavy fusarium). The highest average accuracy for class I (healthy banana leaves) is 90.833%, obtained when using 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and 15 feature ranges. For class II (light fusarium) it is 76.88%, obtained when using the number of feature ranges 2, 3, 4,

CONCLUSIONS
In this study, a classification system for banana plants infected with fusarium has been developed, using the mean, standard deviation, kurtosis, skewness, and entropy of the color histogram, grayscale histogram, and saturation level histogram of the banana leaf image as features and the support vector machine as the classifier. In experiments on 120 banana leaf images, in which the linear-kernel support vector machine one against one method was compared with the one against all method, the accuracy of the linear-kernel one against all method was 90.833% for class I (healthy banana leaves), 76.88% for class II (mild fusarium), 77.5% for class III (moderate fusarium), and 95% for class IV (severe fusarium).

Figure 1. Data Retrieval of Banana Leaf Image

Figure 6. One against one SVM classification method
From Figure 6, if the data $x_i$ is entered into the function obtained from the training stage in equation 25:

$$f(x) = (w_{ij})^T x + b \qquad (25)$$

and the result places x in class i, then class i gets one vote. The data $x_i$ is then tested on all classification models obtained from the training phase.

Figure 7. Graph of average accuracy per class for SVM one against all