CLASSIFICATION OF DISEASES FOR RICE PLANT BASED ON NAIVE BAYES CLASSIFIER WITH A COMBINATION OF PROMETHEE

ANAMISA, RACHMAD, YUSUF, JAUHARI, ERDIANSA, HARIYAWAN

Rice is the staple food crop for most of the population of Indonesia, and demand for rice increases every year. In recent years, however, productivity has been low because of prolonged dry seasons. The production of quality rice is also affected by pests and diseases that attack rice plants and that farmers have difficulty handling. Most researchers have applied intelligence-based measurements, but the accuracy obtained has not reached its maximum. This research therefore proposes a new approach that combines the Naive Bayes and PROMETHEE methods to determine diseases and pests in rice, with the aim of reducing the risk of errors and shortening decision-making time. The contribution of this research is to address these problems by computing the prior probability, conditional probability, and posterior probability with Naive Bayes and ranking the results with PROMETHEE, in order to obtain the highest accuracy. The approach thus supports rapid decision making based on learning data. Sample data from the Agriculture Office of Lamongan Regency, East Java, show that 38 symptoms are associated with 13 types of diseases and pests of rice plants; the learning process achieved 73.91% accuracy with k = 3. The system was tested on 180 records of pest and disease disturbances, with the data divided using k = 4. The results indicate that the combined Naive Bayes and PROMETHEE method is able to give better results.


INTRODUCTION
Rice is one of the staple foods for most Indonesians. Food demand increases in direct proportion to population growth [1]. The Government of Indonesia is therefore trying to increase the production of quality rice in large quantities. However, production is constrained by several factors, including the high rainfall that rice plants require, because rainfall determines the availability and adequacy of water during the rice growth phase [2]. Java is one of the largest rice-producing areas in Indonesia [3]. However, East Java, especially the Lamongan area, has experienced a decline in rice production. This decline is attributed to uncertain weather and long droughts that cause rice farmers to experience crop failure. Several previous studies have stated that crop failure is caused not only by the weather but also by the number of rice plants attacked by pests and diseases [4]. Pests and diseases are plant-disturbing organisms that damage rice plants if they are not handled appropriately [5]. Farmers therefore need a system that can identify, at an early stage, rice plants that have been attacked by diseases or pests so that the attack can be dealt with immediately.

The development of information technology is increasingly fast and sophisticated. This has influenced the development of web applications, whose use is increasingly recognized by the wider community [6]. Web applications help present information quickly and efficiently over the internet. Meanwhile, farmers in Lamongan obtain only minimal information about disease attacks on rice. They still sometimes guess which diseases or pests are attacking their rice plants, so handling is slow. Therefore, this study designs a web-based system for determining rice diseases to help rice farmers find the right solution based on the symptoms they input at a given point in time. In this design, an expert system provides alternative solutions for diagnosing diseases or pests in rice plants, both actively and dynamically. An expert system is a computer-based system that captures knowledge in the form of a knowledge base and a rule base and solves problems by presenting expert knowledge [7] [8].
Along with the development of research in the field of classification systems, many researchers have proposed approaches to classify and prioritize alternative types of diseases in rice crops. Some researchers limited their work to detection only, while others cover classification and the priority level of handling.
However, very little effort has been made to combine a classification model with a ranking of the dominant attacks of rice diseases. For example, a diagnosis system for rice plants that used forward and backward chaining with 9 rule bases and 25 symptoms produced a 50% degree of certainty [9]. One of the most popular published methods for classification is Naïve Bayes (NB), which has been used to classify and detect diseases and pests in plants. Naïve Bayes assigns an object to the class with the maximum probability. Research in 2014 on 12 datasets of eye diseases using the Naïve Bayes method resulted in a 70% agreement between the expert system diagnosis and the human expert [10].
Meanwhile, research conducted in 2019 to diagnose rice stress at a regional scale from satellite imagery using a Bayesian method produced an accuracy of 70.57% [11]. The Naive Bayes method is a supervised learning method that uses simple statistical probability to produce decisions based on learning data [12] [13]. Its advantages are that it is simple, fast, and highly accurate. The Preference Ranking Organization Method for Enrichment Evaluation (PROMETHEE) is a Multi Criteria Decision Making (MCDM) technique that is very easy to apply compared to other methods, and it is used in this research to determine the priority of attacks on rice plants. PROMETHEE is one of the methods for determining priority in multicriteria analysis [14]. It assigns weights to each criterion and produces the best priority output, which makes it easier for decision makers to give better recommendations [15]. The method's strength lies in ranking alternatives using preference functions and different weights. It can therefore be used to choose the best alternative where there are many criteria, by analyzing the scope of the criteria and their weights [16].
Therefore, this research designs a system that uses the Naive Bayes classifier to determine diseases in rice plants and PROMETHEE to determine the priority of the diseases that attack them. The result informs the decisions farmers can take to handle the affected rice and can break the chain of spread, minimizing harvest failure. The result depends on several processes in this method, namely the prior probability, conditional probability, posterior probability, classification, and the priority preference of the PROMETHEE method. At the classification stage, the disease category label with the highest probability value is produced, and the criteria are analyzed across the compared alternatives so that the alternatives can be ranked on these values. High probability values support rapid decision making based on learning data. Most researchers have ignored these important processes, as mentioned before [10][11] [13]; even where an intelligence-based measurement has been applied, the accuracy obtained has not reached its maximum. Developing PROMETHEE as an application-based decision support system is preferable because it produces precise and accurate calculations [17].
In this case, we propose a model to improve the accuracy and the information published on the web system as an alternative solution to the priority process for classifying rice plant diseases. The problem is therefore how to build a decision-making model that uses PROMETHEE to recommend handling solutions from the results of classifying the diseases that attack rice plants, based on the NB probability values obtained from the learning data. In this study, diseases in rice plants were classified based on 38 symptoms and 13 types of diseases. Building on this classification, we present an innovative approach for effective and dynamic crop management that achieves classification performance comparable to standard optimization procedures on the selected disease and symptom data.

A. Literature Review
This section reviews related work on methods for determining diseases in plants. Machine learning-based disease recognition problems can be broken down into two domains, namely detection and classification [18]. Some researchers limit their work to detection only, while others cover determination, classification, and ranking. However, very little effort has been made on machine learning-based recognition of food crop diseases that focuses specifically on rice. Many methods have been developed to improve accuracy, such as Simple Additive Weighting (SAW), K-Nearest Neighbor (K-NN), and the Analytic Hierarchy Process (AHP). However, not all methods are suitable for quickly classifying diseases by comparing alternatives against criteria while also improving accuracy, and an inappropriate choice of classification method leads to mismatched results. The authors of [7][18] developed an identification system to detect plant diseases based on dialogue and Multi-Criteria Decision-Making techniques, namely a hybrid of AHP and Sensitive SAW, where missing attribute probabilities approach 40%. Furthermore, [19] discussed a system to identify various bacterial or fungal diseases of capsicum using image processing techniques. That study used K-NN to classify 62 images of healthy or diseased paprika and to select the correct treatment for the disease, which improved crop production. In the experiment, 64% of the images were identified and classified as diseased and 48% were recognized and classified as healthy.
Additionally, [20] examined the detection of infectious animal diseases in France using Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers. The experiment evaluated 545 documents on African swine fever (ASF) and found that NB classified the ASF samples slightly better than SVM. Subsequently, [21] proposed a probabilistic programming technique to detect plant disease using a Bayesian deep learning method and measured misclassification using uncertainty; the experiment reported outstanding performance. Beyond the contribution of the NB method, PROMETHEE has also been used to select exemplary teacher candidates, producing higher accuracy than the two methods carried out separately [22]. The PROMETHEE method has the advantage of being able to make comparisons between the individual elements that are used for decision making among several alternatives. The solution obtained takes the form of a ranking by leaving flow, entering flow, and net flow. In this method, all declared parameters have a significant effect [23].

B. Research Method
Statistical reasoning can be used to solve the problem of uncertainty [24]. In this study, to classify diseases in rice, the learning data were processed using the Naive Bayes classifier, based on the experts' learning process, to produce the right grouping solution.
The PROMETHEE method is then used to determine the order (priority) of the diseases that attack rice plants, based on the classification results. An overview of the disease detection system for rice using the NB and PROMETHEE methods is shown in Figure 1.
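The Naive Bayes stage of this pipeline can be illustrated with a minimal sketch. It assumes each learning record is a set of observed symptom codes with a disease label, models each symptom as present or absent (a Bernoulli model), and applies Laplace smoothing; the smoothing, the function names, and the toy records are illustrative assumptions and are not taken from the paper's dataset.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(records):
    """records: list of (set_of_symptom_codes, disease_label) pairs."""
    class_counts = Counter(label for _, label in records)
    symptom_counts = defaultdict(Counter)  # label -> symptom -> count
    vocabulary = set()
    for symptoms, label in records:
        vocabulary.update(symptoms)
        for s in symptoms:
            symptom_counts[label][s] += 1
    vocabulary = sorted(vocabulary)
    total = len(records)
    # Prior probability P(class)
    priors = {c: n / total for c, n in class_counts.items()}
    # Conditional probability P(symptom present | class),
    # with Laplace smoothing (an assumption of this sketch)
    likelihoods = {
        c: {s: (symptom_counts[c][s] + 1) / (class_counts[c] + 2)
            for s in vocabulary}
        for c in class_counts
    }
    return priors, likelihoods, vocabulary


def posterior(observed, priors, likelihoods, vocabulary):
    """Return the normalized posterior P(class | observed symptoms)."""
    log_scores = {}
    for c, prior in priors.items():
        log_p = math.log(prior)
        for s in vocabulary:
            p = likelihoods[c][s]
            log_p += math.log(p if s in observed else 1.0 - p)
        log_scores[c] = log_p
    # Normalize in a numerically stable way
    top = max(log_scores.values())
    unnormalized = {c: math.exp(v - top) for c, v in log_scores.items()}
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}


# Hypothetical toy records (not the Lamongan data)
records = [
    ({"S12", "S17"}, "D13"),
    ({"S12", "S23", "S26"}, "D13"),
    ({"S14", "S15"}, "D1"),
    ({"S13", "S15"}, "D1"),
]
priors, likelihoods, vocabulary = train_naive_bayes(records)
post = posterior({"S12", "S23"}, priors, likelihoods, vocabulary)
predicted = max(post, key=post.get)
print(predicted, post)
```

The class with the highest posterior is taken as the classification result; the posterior values themselves could then serve as inputs to the ranking stage.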

A. Naive Bayes Method
The Naive Bayes method is a probabilistic classification technique based on Bayes' theorem that assumes there is no dependence between attributes in the classification process [25] [26]. Naive Bayes can be trained efficiently in a supervised learning setting. Its advantage is that it requires only a small amount of training data to estimate the parameters, such as the variance of each variable, required for classification. Because the variables are assumed to be independent, only the variance of each variable in each class must be determined, not the entire covariance matrix [27]. In other words, Naive Bayes states that the presence or absence of a feature in a class is unrelated to the other features in the same class. The stages of the Naive Bayes method used to solve these problems are as follows [28]: computing the prior probability, the conditional probability, and the posterior probability, and then assigning the class with the highest posterior probability.

PROMETHEE is used to determine and produce a decision from several alternatives by combining the data into one and assigning weight values obtained through previous assessments. PROMETHEE uses a weighted assessment for each criterion and produces the best priority output, which makes it easier for decision makers to give better recommendations [29]. The steps for calculating the PROMETHEE method [30], which yield the decision on the priority of the diseases attacking rice plants so that solutions and handling recommendations can be obtained immediately, are as follows:

(b) Define the criteria. The value of f is the real value of a criterion, as given in equation (6):

$$f : K \to \mathbb{R} \tag{6}$$

Here $K$ denotes the set of alternatives. For each alternative $a \in K$, $f(a)$ is the evaluation of that alternative on the criterion. When two alternatives $a, b \in K$ are compared, the preference between them must be determined. The preference rules are:
• P(a,b) = 0 means that a and b are indifferent, or that there is no preference for a over b.
• P(a,b) ~ 0 means that the preference for a over b is weak.
• P(a,b) ~ 1 means that the preference for a over b is strong.
• P(a,b) = 1 means that the preference for a over b is absolute.

(c) Determine the type of assessment for each criterion, namely minimum or maximum.

(d) Define the preference type. For each criterion, the most suitable preference type is chosen based on the data and the decision matrix, as given in equation (7). There are six preference types (Usual, Quasi, Linear, Level, Quasi Linear, and Gaussian).

The decision matrix is then normalized.
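The six preference types named in step (d) can be sketched as functions of the difference d = f(a) - f(b) for a criterion to be maximized. The forms below follow the commonly cited PROMETHEE definitions; the parameter names q (indifference threshold), p (strict preference threshold), and sigma (Gaussian spread) are assumptions of this sketch and are not taken from the paper's equation (7).

```python
import math


def usual(d):
    """Type 1 (Usual): any positive difference gives absolute preference."""
    return 1.0 if d > 0 else 0.0


def quasi(d, q):
    """Type 2 (Quasi): indifferent up to threshold q, then absolute."""
    return 1.0 if d > q else 0.0


def linear(d, p):
    """Type 3 (Linear): preference grows linearly until threshold p."""
    if d <= 0:
        return 0.0
    return min(d / p, 1.0)


def level(d, q, p):
    """Type 4 (Level): 0 up to q, 0.5 between q and p, 1 above p."""
    if d <= q:
        return 0.0
    return 0.5 if d <= p else 1.0


def quasi_linear(d, q, p):
    """Type 5 (Quasi Linear): 0 up to q, linear between q and p, 1 above p."""
    if d <= q:
        return 0.0
    return min((d - q) / (p - q), 1.0)


def gaussian(d, sigma):
    """Type 6 (Gaussian): smooth growth controlled by sigma."""
    if d <= 0:
        return 0.0
    return 1.0 - math.exp(-(d * d) / (2.0 * sigma * sigma))


print(usual(0.2), linear(0.5, 1.0), level(0.3, 0.2, 0.5))
```

Each function returns a value in [0, 1], matching the preference rules listed above: 0 for indifference, values near 0 for weak preference, values near 1 for strong preference, and 1 for absolute preference.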

C. K-Fold Cross Validation
Training with the NB and PROMETHEE methods is evaluated using k-fold cross validation. In this method, the training data are evaluated over k subsets.
The k-fold method divides the data into k subsets and then iterates k times, each time using one subset as test data and the remaining subsets as training data. K-fold cross validation is used to reduce bias in the data. The preprocessed data were cross validated by dividing them into training data and test data for the classification process. The test model was run 3 times, and each data subset had the opportunity to serve as testing data or training data, as shown in Figure 2. The next method used is z-score normalization, which is based on the mean and standard deviation of the data. This method is very useful when the actual minimum and maximum values of the data are not known. The normalized value is $z = (x - \mu)/\sigma$, where $x$ is the original value, $\mu$ is the mean, and $\sigma$ is the standard deviation of the data.
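As a rough illustration of the two procedures described above, the following sketch splits record indices into k folds and applies z-score normalization to a list of values. It assumes contiguous, unshuffled folds and the population standard deviation; both choices, and the function names, are assumptions of this sketch rather than details stated in the paper.

```python
import statistics


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over n records."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def z_score(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("z-score is undefined when all values are equal")
    return [(v - mean) / std for v in values]


# 180 records divided with k = 4 gives four test folds of 45 records each
folds = list(k_fold_indices(180, 4))
print([len(test) for _, test in folds])

normalized = z_score([1.0, 2.0, 3.0])
print(normalized)
```

Every record appears in exactly one test fold, so each subset serves once as testing data and otherwise as training data.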

A. Data Input
Data is a representation of a fact, modeled in pictures, words, and numbers. Data serves as a unit of representation that can be remembered, recorded, and processed into information [32]. There are two types of data: primary data and secondary data.
Primary data were obtained through direct interviews with an agricultural expert [33]. The secondary data in this study comprised six rice varieties (IR36, IR64, Way Apo Buru, Ciherang, Cibogo, and Lusi), 13 types of diseases, and 38 types of symptoms from the Agriculture and Forestry Service of Lamongan Regency in 2020. The symptoms are listed in Table 1 and the diseases in Table 2.

Table 1. List of symptoms (partial)
Code   Symptom
-      The leaves become short
S12    The item becomes empty or contains no ...
S13    Attack the rice leaf stalk
S14    Attacking midrib that forms tillers
S15    The amount of grain decreased
S31    Grow many but small seedlings
S32    Less perfect plant growth
S33    The leaves turn yellow and brown

Table 2. List of diseases (partial)
Code   Disease            Symptoms
D13    Leaf Brown Spots   S12, S17, S23, S26

Following the stages for detecting diseases and pests in rice plants with the NB and PROMETHEE methods, a dataset of 180 records was entered and used in the NB and PROMETHEE calculations.

B. Testing With NB and PROMETHEE Methods
After determining the alternatives and criteria, the next step was to determine the likelihood and prior probability values based on scoring by the expert, as shown in Table 3. The posterior value was then determined, as shown in Table 4; this is the final stage of the NB method. After the posterior value was calculated, the data used for the PROMETHEE process were normalized, as shown in Table 5. The differences between each alternative and the others were then evaluated, as shown in Table 6. Next, the preference function was calculated, as in Table 7, and the aggregate preference function was calculated by taking into account the criteria weights, which sum to 1, as shown in Table 8. Finally, the leaving flow, entering flow, and net flow were determined in order to rank each rice disease, as shown in Table 9.
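The PROMETHEE steps described above (pairwise differences, preference function, weighted aggregation, and flows) can be sketched as follows. The sketch assumes all criteria are to be maximized and uses the Usual preference function for every criterion; the scores, weights, and disease codes are hypothetical values chosen for illustration and are not taken from Tables 5 to 9.

```python
def usual_preference(d):
    """Usual preference type: absolute preference for any positive difference."""
    return 1.0 if d > 0 else 0.0


def promethee_flows(scores, weights, preference=usual_preference):
    """scores: dict name -> list of criterion values; weights must sum to 1."""
    names = list(scores)
    n = len(names)
    # Aggregate preference index pi(a, b) = sum_j w_j * P_j(f_j(a) - f_j(b))
    pi = {a: {} for a in names}
    for a in names:
        for b in names:
            if a == b:
                continue
            pi[a][b] = sum(
                w * preference(xa - xb)
                for w, xa, xb in zip(weights, scores[a], scores[b])
            )
    # Leaving flow: how strongly a is preferred over the others
    leaving = {a: sum(pi[a].values()) / (n - 1) for a in names}
    # Entering flow: how strongly the others are preferred over a
    entering = {
        a: sum(pi[b][a] for b in names if b != a) / (n - 1) for a in names
    }
    # Net flow: leaving minus entering, used for the final ranking
    net = {a: leaving[a] - entering[a] for a in names}
    return leaving, entering, net


# Hypothetical normalized scores for three diseases on two criteria
scores = {"D1": [0.8, 0.3], "D2": [0.5, 0.6], "D3": [0.2, 0.1]}
weights = [0.6, 0.4]

leaving, entering, net = promethee_flows(scores, weights)
ranking = sorted(net, key=net.get, reverse=True)
print(ranking, net)
```

Alternatives are ranked by descending net flow, so the disease with the largest net flow receives the highest handling priority in this sketch.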
The last trial measured the accuracy for each disease class in rice plants using k-fold cross validation with k = 4. Based on the confusion matrix in Table 10, the Blast class has the highest accuracy, 73.91%, and the Leaf blight crackle class has the lowest, 43.48%. This low accuracy arises because some data from the Leaf blight crackle class are similar to data from the Striped Leaves class. Furthermore, the Tungro and Dwarf classes have accuracies below 60%, namely 43.48% and 52.17%, respectively.
The confusion matrix was used to determine the accuracy of each disease class in rice plants. The test results are plotted as the system accuracy for each value of the k parameter in Figure 3.
Across the tested values of k, the highest accuracy was obtained at k = 3; with the parameter set to 3, validation yields 73.91% for the Blast class.
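A per-class accuracy of the kind reported above can be read from a confusion matrix as in the sketch below. It assumes that rows are actual classes, columns are predicted classes, and that the accuracy of a class is the share of its actual records that were predicted correctly; this definition, the class names, and the counts are assumptions of this sketch and do not reproduce Table 10.

```python
def per_class_accuracy(confusion, labels):
    """confusion[i][j]: number of records of actual class i predicted as class j."""
    result = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        # Correct predictions for the class divided by its actual records
        result[label] = confusion[i][i] / row_total if row_total else 0.0
    return result


# Hypothetical 3-class confusion matrix
labels = ["A", "B", "C"]
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 5, 5],
]

accuracy = per_class_accuracy(confusion, labels)
for label, value in accuracy.items():
    print(f"{label}: {value:.2%}")
```

In this toy matrix, half of the class C records are predicted as class B, which mirrors how similar classes can pull down each other's accuracy.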

CONCLUSIONS
Based on the results of this research, in which the Naive Bayes and PROMETHEE methods were used to classify rice plant diseases, the highest accuracy was 73.91% on a dataset of 180 records with 38 symptoms and 13 types of diseases. It can therefore be concluded that, in terms of accuracy, the NB model is better suited to the classification process and the PROMETHEE method is able to carry out the ranking process, so that decisions can be made to help rice farmers with handling solutions and reduce crop failure.