Machine Learning Strategies for Predicting Crop Diseases

The prevalence of crop diseases is a major hindrance to successful crop production. Machine Learning (ML) strategies can identify these diseases faster and more accurately than any manual approach. Agronomy plays a key role in anticipating crop diseases at an early stage. With the advent of computer vision, plants can be classified as diseased or healthy by extracting structural characteristics of a leaf using various image processing techniques. A Support Vector Machine (SVM) classifier is used to distinguish between diseased and healthy leaves in publicly available datasets. The SVM method exhibited high fitting and predictive precision. The paper is organized into steps: identifying the features, extracting them using a computer vision technique known as Scale Invariant Feature Transform (SIFT), and training and testing the model. Overall, crop diseases are predicted on a larger scale by balancing speed and accuracy using computer vision and machine learning strategies.


Introduction
The Indian economy relies immensely upon the efficiency of farming, its productivity and sustainability.
The key challenge for farmers these days is protecting crops from pests and diseases, so predicting crop diseases plays a key role in the field of horticulture. Different modern technologies have emerged to minimize post-harvest processing, strengthen agricultural sustainability and maximize productivity. Automated bots and drones are deployed on farms to monitor and tend the crop. Digital apps are designed to identify diseases affecting the farm by measuring the NPK (
[2] The authors proposed various image processing techniques for classifying plants using pattern recognition and digital image processing.
[3] The authors described how colour features are converted from RGB to HSI. Seven invariant moments are taken as shape parameters, and an SVM classifier is used for detecting disease in wheat plants.
[4] The authors proposed different strategies for pest detection and identification on tomato plants, using computer vision and machine learning techniques for detection.
[5] The authors proposed modifications to the SIFT descriptor to improve its robustness against spectral variations. The approach is based on the fact that edges remain well preserved in multispectral imaging, which gives better image matching results. It is implemented by boosting the contribution of local edges in the SIFT descriptor construction process.

Framework Proposed
To predict whether a plant is diseased or healthy, this paper follows a framework. The architecture of the proposed system is shown in Fig1. The design flow proceeds through a sequence of steps.
The SIFT algorithm uses the Difference of Gaussians (DoG), which is an approximation of the Laplacian of Gaussian (LoG). The DoG is obtained as the difference between two Gaussian blurrings of an image at two different scales, $\sigma$ and $k\sigma$. This process is carried out for different octaves of the image in the Gaussian pyramid, as shown in Fig5. Once the DoG images are found, they are searched for local extrema over scale and space.

Fig5. Image of Gaussian Pyramid
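
The DoG step described above can be sketched in a few lines. This is a minimal illustration for a single scale pair, not the paper's implementation: the function names, the 3-sigma kernel truncation, and the default values `sigma=1.6` and `k=sqrt(2)` are assumptions chosen for the example, and the octave loop and extrema search are omitted.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel, truncated at about 3 sigma (assumed cutoff)."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Blur a 2-D grayscale image with a separable Gaussian filter."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(np.asarray(image, dtype=float), radius, mode="edge")
    # Filter along rows, then along columns; "valid" trims the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def difference_of_gaussians(image, sigma=1.6, k=2 ** 0.5):
    """DoG response: blur at scale k*sigma minus blur at scale sigma."""
    return gaussian_blur(image, k * sigma) - gaussian_blur(image, sigma)

# Toy input: a single bright pixel in the middle of a dark image.
impulse = np.zeros((21, 21))
impulse[10, 10] = 1.0
dog = difference_of_gaussians(impulse)
print("strongest DoG response at:", np.unravel_index(np.argmin(dog), dog.shape))
```

In a full SIFT pipeline this subtraction is repeated for several scales within each octave, and each pixel of a DoG image is then compared with its neighbours in space and in the adjacent scales to find candidate key points.
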
Fig6 shows the key points extracted using the SIFT algorithm for a healthy leaf.

Fig6. Key points identified for a healthy leaf
Fig7 shows the key points extracted using the SIFT algorithm for a diseased leaf.

Fig7. Key points identified for a diseased leaf
Next comes the Histogram of Oriented Gradients (HoG). HoG is a feature descriptor used for object detection in computer vision and image processing. The technique counts occurrences of gradient orientations in localized portions of an image [5].
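
To make the idea of counting gradient orientations concrete, the following is a simplified sketch of the per-cell histogram at the heart of HoG. The cell size of 8 pixels, the 9 unsigned orientation bins, and the function names are assumptions for illustration. Gradients are computed per cell and a single global L2 normalization replaces the overlapping block normalization of the full descriptor, so this is not a drop-in replacement for a library implementation.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient orientations (0 to 180 degrees) for one
    cell, with each pixel's vote weighted by its gradient magnitude."""
    cell = np.asarray(cell, dtype=float)
    gy, gx = np.gradient(cell)  # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0),
                           weights=magnitude)
    return hist

def hog_descriptor(image, cell_size=8, n_bins=9):
    """Concatenate the cell histograms of an image into one normalized vector."""
    image = np.asarray(image, dtype=float)
    height, width = image.shape
    histograms = [
        cell_orientation_histogram(
            image[r:r + cell_size, c:c + cell_size], n_bins)
        for r in range(0, height - cell_size + 1, cell_size)
        for c in range(0, width - cell_size + 1, cell_size)
    ]
    vector = np.concatenate(histograms)
    return vector / (np.linalg.norm(vector) + 1e-12)  # avoid division by zero

# Toy input: intensity increases from left to right, so every gradient
# points along the x axis.
ramp = np.tile(np.arange(16, dtype=float), (16, 1))
print("descriptor length:", hog_descriptor(ramp).shape[0])
```
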

Fig10. Feature matching based on Euclidean distance
(v) Prediction using SVM: In the proposed work, the classifier used is the Support Vector Machine (SVM), a supervised machine learning algorithm used for classification and regression. The SVM classifier is trained on the features extracted from the leaf images and is then tested on new images to predict the desired results. The input leaf images for the system are grape black rot, peach bacterial spot, tomato healthy, grape leaf blight, pepper-bell bacterial spot, tomato bacterial spot and tomato late blight. The result is a prediction of whether a crop is healthy or diseased.
Two types of SVM kernels are used for classification: the linear kernel and the Radial Basis Function (RBF) kernel. The linear kernel is commonly used when the dataset is linearly separable, that is, when the classes can be divided by a single line (a hyperplane in higher dimensions). The RBF kernel is used for non-linear classification of the dataset; it relies on what is known as the kernel trick.
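
The difference between the two kernels can be illustrated on a tiny dataset that is not linearly separable. The sketch below does not train an SVM: it uses a kernel perceptron as a deliberately simple stand-in, so that only the kernel changes between the two runs. The XOR-style data, the `gamma=1.0` setting, and all function names are assumptions for illustration and are not taken from the paper's experiments.

```python
import numpy as np

def linear_kernel(X, Y):
    """K(x, y) = x . y"""
    return X @ Y.T

def rbf_kernel(X, Y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    sq_dist = ((X ** 2).sum(axis=1)[:, None]
               + (Y ** 2).sum(axis=1)[None, :]
               - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def train_kernel_perceptron(K, y, epochs=20):
    """Kernel perceptron (stand-in for an SVM solver): alpha[i] counts how
    often training sample i was misclassified during training."""
    alpha = np.zeros(len(y))
    for _ in range(epochs):
        for i in range(len(y)):
            if y[i] * np.dot(alpha * y, K[:, i]) <= 0:
                alpha[i] += 1.0
    return alpha

def training_accuracy(K, y, alpha):
    """Fraction of training samples whose predicted sign matches the label."""
    predictions = np.sign((alpha * y) @ K)
    return float(np.mean(predictions == y))

# XOR-style labels: no single straight line separates the two classes.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, 1.0, 1.0, -1.0])

K_linear = linear_kernel(X, X)
K_rbf = rbf_kernel(X, X)
linear_acc = training_accuracy(K_linear, y, train_kernel_perceptron(K_linear, y))
rbf_acc = training_accuracy(K_rbf, y, train_kernel_perceptron(K_rbf, y))
print("linear kernel accuracy:", linear_acc)
print("RBF kernel accuracy:", rbf_acc)
```

The classifier never computes an explicit high-dimensional feature vector; it only evaluates the kernel between pairs of samples, which is the kernel trick. An SVM applies the same idea but chooses the sample weights by maximizing the margin.
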

Results and Discussion
In this experimental study, 100 samples were used for training and another 100 samples for testing. The attributes are normalized to ensure that they all have the same scale, using Min-Max scaling. The formula is as follows:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the attribute. The classification for the training data is shown in Fig11.
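
As a small worked example of Min-Max scaling, the sketch below maps each attribute to the range [0, 1] by subtracting its minimum and dividing by its range. The function names and the toy values are assumptions for illustration. The minimum and range are taken from the training samples and reused for the test samples, which is a common convention and not something the paper states.

```python
import numpy as np

def fit_min_max(X_train):
    """Return the per-attribute minimum and range of the training data."""
    X_train = np.asarray(X_train, dtype=float)
    col_min = X_train.min(axis=0)
    col_range = X_train.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # constant attributes are mapped to 0
    return col_min, col_range

def apply_min_max(X, col_min, col_range):
    """Scale X with x' = (x - min) / (max - min), using the fitted statistics."""
    return (np.asarray(X, dtype=float) - col_min) / col_range

# Toy data: two attributes on very different scales.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[2.5, 15.0]]

col_min, col_range = fit_min_max(train)
print(apply_min_max(train, col_min, col_range))
print(apply_min_max(test, col_min, col_range))
```

Test values outside the training range fall outside [0, 1], which is expected when the statistics come from the training set only.
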

Conclusion
In this work, computer vision and supervised machine learning algorithms are used to predict and classify whether a crop is healthy or not. The computer vision technique used is SIFT. SVM, a supervised machine learning algorithm for classification and regression, is applied with two different kernels, linear and RBF. The dataset consists of leaf images obtained from a publicly available dataset known as the PlantVillage dataset. The results show that the RBF kernel provides more accurate results than the linear kernel when the dataset is not linearly separable. Optimum results are obtained with very few computations.