A Robust Pest Identification System using Morphological Analysis in Neural Networks

ABSTRACT


Introduction
Agriculture is a basic necessity for human survival. Progress in agriculture is intertwined with the economic progress of society, in which farmers play a key role by providing capital and labor. Innovations and applications of technology affect large sections of rural farming societies and bring them into the mainstream of development. In past decades the government has launched many schemes to improve the livelihoods of people engaged in this sector. As per the 2011 census, 24.6% of the population is involved in agriculture [1]. The production of food grains for a colossal population of 1.2 billion people requires extensive investment in the form of pesticides, fertilizers, and labor. The use of pesticides is essential for the survival of the rural economy, as the yield obtained is often equivalent to a quarter of the total GDP [2]. As rice is the major crop, covering 63% of the total area under cultivation, we consider it as the use case in this paper. The indiscriminate use of pesticides causes an extremely high rate of cancer among the people who consume the produce and among the farmers who apply these pesticides [3]. Multiple surveys conducted by the government over the 5 years from 2011 to 2015 have shed light on the fact that at least 1.5% to 3% of all the food grown is poisonous and unfit for consumption [4]. The lack of awareness about the harmful effects of pesticides is alarming, to say the least.
A number of pest detection mechanisms that employ classifiers to detect pests in the field have been proposed [5] [6] [7]. The use of machine learning techniques for classification reduces the time taken and helps in providing a prompt response. The existing classification techniques provide high accuracy in detecting insects when the training and testing sets have similar orientations, but the accuracy drops when the insects are partially visible or are in a different orientation. In this paper, we propose a pest identification system that identifies pests from the field irrespective of their orientation by means of classification. Sensors such as (mobile) cameras or stick-based thermal sensors provide the image, and a pre-trained classifier identifies the insect as friendly or harmful. The proposed approach attempts to mitigate the limitations of classifying binary images of insects that are either partially visible or in a different orientation. Thus, in spite of using economical resources for capturing images, this technique provides accurate classification of pests. The paper is structured as follows. Section 2 discusses the Background and Related Work, encompassing the conceptual workings of a pest detection system as well as the most popular classifiers, followed by the Problem Definition, System Model, and Experimental Set-up. Finally, the paper is concluded.

Background and related work
The usage of pesticides varies from region to region and also depends upon the crop. In India, the maximum amount of pesticide is used in the production of cotton, followed by that of rice [8]. Rice is the food grain with the highest pesticide usage. This excessive usage of pesticides causes contamination of soil, grain, and groundwater. A survey conducted in 2014-15 found the Maximum Residue Level (MRL) in rice to be high [9]. This puts farmers in a desperate economic situation where they are unable to recover their production costs, driving them to depression and suicide. It is of utmost importance to know the type of pests attacking a particular crop. Hence we propose a conceptual model of a pest identification system, as shown in Fig. 1.

Figure 1. Conceptual Framework of a Pest Identification System
The model consists of the following:

Image collection
The images of pests are captured from the field using economical resources such as sensor-based cameras and stored in a central repository, where they are labeled. The process can be described as a set of images I being collected and matched to a set of labels L such that there exists a many-to-one mapping between them.

Image pre-processing
The collected set of images I is obtained from sensors having different aperture settings and under different weather or lighting conditions, i.e., every element I_j in set I has a different dimension, intensity, signal-to-noise ratio, etc. Thus, image pre-processing includes removing noise from the image, resizing the image, and improving its overall quality.

Extracting the features
Feature extraction involves identifying key points or descriptors in the obtained images, i.e., creating a set of features F such that each element I_i of I has a set F_i = {f_1, f_2, ..., f_n} of extracted features. Features in an image can be broadly classified into global and local features [5]. Global features describe the entire image or a significant portion of it and include eigenspaces, color histograms, and receptive field histograms. Local features describe specific areas of an image and are more robust to occlusion than global features. They include edges, corners, entropy, and curvature. They are also quite immune to background clutter and to changes in viewpoint.

Classification
Classification forms the cornerstone upon which the whole pest identification system relies. It is responsible for identifying the insects and provides a list of them. The accuracy and complexity of the classifier play a key role in the identification of pests. Some of the prevalent classifiers used in pest identification systems are as follows:

Neural networks
Artificial neural networks (ANN) are versatile classifiers with applications in multiple fields [13]. The advantage ANNs have over other classifiers is their ability to extract features from an object on their own, i.e., they do not require additional support for feature extraction. The most popular version of ANN used for image processing is the Convolutional Neural Network (CNN) [15]. A CNN is a deep feed-forward neural network, which processes images using multilayer perceptrons.
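The two building blocks of a CNN mentioned in this paper, convolution and max pooling, can be illustrated on a small binary image. The following is a minimal sketch in plain Python, not the network used in the experiments; the function names, the 4×4 image, and the 2×2 kernel values are hypothetical and chosen only for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and return the feature map.

    As in most CNN libraries, this is a cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(w - kw + 1)]
            for r in range(h - kh + 1)]


def max_pool(fmap, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, w - size + 1, size)]
            for r in range(0, h - size + 1, size)]


# Hypothetical binary (morphological) image containing one vertical stroke.
image = [[0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0]]
# Hypothetical kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)
pooled = max_pool(feature_map, size=2)
```

In a trained CNN the kernel values are learned rather than fixed by hand, and many kernels are applied in each layer.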

Support vector machines (SVM)
SVM is a constructive learning model based on the statistical learning paradigm. It is a supervised learning technique that constructs a hyperplane or a set of hyperplanes to classify an unknown data point. The SVM is a versatile tool used for classification of various types of data such as text and images. It is also used for other tasks such as regression and clustering [12].
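How a separating hyperplane classifies an unknown data point can be sketched in a few lines. The example below assumes a linear SVM that has already been trained; the weight vector, bias, and feature vectors are hypothetical values, and the training step that would find them is not shown.

```python
def svm_predict(weights, bias, x):
    """Return +1 or -1 depending on which side of the hyperplane w.x + b = 0 the point lies."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Hypothetical hyperplane and feature vectors, for illustration only.
weights = [1.0, -1.0]
bias = 0.0
label_a = svm_predict(weights, bias, [2.0, 1.0])   # score = 1.0
label_b = svm_predict(weights, bias, [1.0, 3.0])   # score = -2.0
```

The two labels could stand for the harmful and friendly classes; which sign maps to which class is a convention fixed at training time.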

Naive bayes
It is a classifier based on Bayesian probability. The classifier assumes that every feature is independent of the others and makes an independent contribution to the outcome [14]. The algorithm utilizes Bayes' theorem (1). Other techniques include Random forests, PCA, and Logistic regression [10] [11].
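The equation referred to as (1) is not reproduced in this text. For reference, the standard form of Bayes' theorem under the independence assumption of Naive Bayes is given below; the symbols c (a class label) and f_1, ..., f_n (the extracted features) are notation assumed here and may differ from the original equation.

```latex
P(c \mid f_1, \ldots, f_n) = \frac{P(c)\,\prod_{i=1}^{n} P(f_i \mid c)}{P(f_1, \ldots, f_n)}
```

The predicted class is the one that maximizes the numerator, since the denominator is the same for every class.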

Decision system
The decision system applies some intelligence to the results of the classifier. It considers the classified images and decides what information to pass on to farmers regarding the pests and their respective pesticides.
Many researchers have applied various methodologies over the years to mitigate the problem of pests in fields, orchards, etc. Wen et al. have proposed a system to detect pests using local features such as morphology and contours [5]. Their work sheds light on the fact that the orientation as well as the lighting in an image plays a key role in the detection of an object. Swain et al. have discussed a novel algorithm to detect weeds in fields by utilizing their shape [6]. The shape was extracted using binary operations. Liu et al. have proposed a method of detecting and counting insects in a wheat field by separating out the background [7]. Sujartha et al. have proposed a weed-detecting robot based on a fuzzy real-time classifier [16]. The robot applies morphological operations to extract the texture of the leaf to identify the weeds present in the field. Wspanialy et al. have proposed an approach to detect mildew in plants by removing the background and augmenting the light [17]. Johannes et al. have explored the possibility of detecting plant disease on mobile devices by applying segmentation operations to detect the hot-spots for identification of the disease [18].

Problem definition
Wen et al. have introduced a method to address the misclassifications that occur due to the effect of light on the image [5]. This method emphasizes that the global features present in the image are much more vulnerable to the effects of light than the local features. In this method, the RGB images are transformed into morphological images to reduce the misclassification of insects due to light. A morphological image consists of only 0s and 1s, i.e., the intensity of light at a particular pixel is discarded. This mitigation of the effects of light enables the classifier to detect insects with higher accuracy, but leaves room for improvement in the detection of pests that are in a different orientation or partially visible due to foliage. Table 1 represents the aforementioned scenario. Table 2 lists the mathematical notations used in the problem.
Table 2. List of notations
A pest detection system can be formally depicted as a logical expression in which the images of pests are subjected to feature extraction and are then classified. The resulting label set is coupled with an existing decision system to provide the required recommendation to the user. Our objective is to find the correct pair of C_l and F to improve the accuracy of the attained label set L.

System model
The input to the proposed model is an input vector I containing a set of images. Each image is collected via various sensors present in the field. To initialize the system, the images are labeled manually. A tuple <L_label, I_feature> is generated, where L_label represents the set of labels for the images and I_feature represents the features of interest extracted from each image. This tuple is used to train the system. The post-training detection of labels from the input vector is important, as it determines the accuracy of the system. A low accuracy of the system can lead to massive crop losses and in turn can prove disastrous to

Image pre-processing
The pre-processing of the image set I includes resizing the images, improving their contrast, and removing noise from them. The images are resized using a MATLAB toolbox function based on the nearest-neighbor algorithm with anti-aliasing enabled. Overall sharpness and contrast are increased to enhance the features of the images. Noise results from errors that occur during the image acquisition process. A Gaussian noise filter removes the noise distributed around the image using the probability density function shown in (2).
To negate the effects of lighting on the extraction of the skeleton, we transform the noiseless RGB image matrix to grayscale. The above processes are formally summarized as ∀ j ∀ i {G(R_i(N(I_j)), i)} → IR, where the image I_j is cleansed of errors using the noise filter N. The resulting image is resized to the desired size by R_i with anti-aliasing to prevent further addition of noise. The resulting images are then transformed to grayscale using G(Image, i). The resultant set IR contains images of the required specification.
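The composition of noise removal N, resizing R, and grayscale conversion G can be sketched as follows. This is a minimal plain-Python illustration, not the MATLAB implementation used in the paper. Several details are assumptions made for the sketch: the 3×3 kernel with sigma 1.0, the border handling, the grayscale weights (the common BT.601 luminance weights), and the absence of anti-aliasing in the resize step. All function names are hypothetical.

```python
import math


def gaussian_kernel(size=3, sigma=1.0):
    """Return a normalized size x size Gaussian kernel (size must be odd)."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def denoise(channel, kernel):
    """N: smooth one channel with the kernel, clamping indices at the borders."""
    h, w = len(channel), len(channel[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    acc += channel[rr][cc] * kernel[dr + half][dc + half]
            out[r][c] = acc
    return out


def resize_nearest(channel, new_h, new_w):
    """R: nearest-neighbor resize (this sketch applies no anti-aliasing)."""
    h, w = len(channel), len(channel[0])
    return [[channel[min(h - 1, int(r * h / new_h))][min(w - 1, int(c * w / new_w))]
             for c in range(new_w)]
            for r in range(new_h)]


def to_gray(red, green, blue):
    """G: weighted sum of the three channels (BT.601 weights, an assumption)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b
             for r, g, b in zip(row_r, row_g, row_b)]
            for row_r, row_g, row_b in zip(red, green, blue)]


def preprocess(rgb, new_h, new_w):
    """Apply N, then R, then G to an image given as (red, green, blue) channels."""
    kernel = gaussian_kernel()
    channels = [resize_nearest(denoise(ch, kernel), new_h, new_w) for ch in rgb]
    return to_gray(*channels)
```

Calling preprocess on every image I_j yields the set IR. Smoothing before downscaling also limits the aliasing that a plain nearest-neighbor resize would otherwise introduce.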

Feature extraction
Feature extraction is the process of creating a subset of feature vectors from the image. The grayscale matrix obtained earlier is further transformed to a binary image based on a threshold determined globally. The detection of edges is done by the surround suppression approach proposed by Grigorescu et al. [21]. Once the potential edges are detected, we apply the binary operation of skeletonization to the image. The resulting image retains only the desired features, including the outer skeleton as well as the contours of the insect. The feature extraction process can be formally expressed as ∀ i {S(E(B(IR_i, Threshold_G)))} → F_i, where each pre-processed image in IR is processed by the binary operator B(image, threshold) with a threshold determined globally based on the matrix elements. The resultant matrix is subjected to edge detection E(Image) and then skeletonized by S(Image) to obtain the required feature. The feature is extracted in the form of a vector for the SVM and Naive Bayes; for the CNN, the image itself is used.
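The binarization B and skeletonization S steps can be sketched in plain Python. This is an illustration under stated assumptions rather than the implementation used in the paper: the global threshold is taken to be the mean pixel value, skeletonization uses the well-known Zhang-Suen thinning algorithm, and the surround-suppressed edge detection step E is omitted entirely. The function names are hypothetical.

```python
def binarize(gray, threshold=None):
    """B: 1 where a pixel exceeds the threshold; the default threshold is the global mean."""
    if threshold is None:
        pixels = [v for row in gray for v in row]
        threshold = sum(pixels) / len(pixels)
    return [[1 if v > threshold else 0 for v in row] for row in gray]


def _neighbours(img, r, c):
    """Return the 8 neighbours P2..P9, clockwise starting from the pixel above (r, c)."""
    return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
            img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]


def skeletonize(binary):
    """S: Zhang-Suen thinning, run on a copy padded with a zero border."""
    h, w = len(binary), len(binary[0])
    img = [[0] * (w + 2)] + [[0] + list(row) + [0] for row in binary] + [[0] * (w + 2)]
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, h + 1):
                for c in range(1, w + 1):
                    if img[r][c] != 1:
                        continue
                    p = _neighbours(img, r, c)
                    filled = sum(p)
                    # Number of 0 -> 1 transitions in the circular sequence P2..P9, P2.
                    transitions = sum(1 for i in range(8)
                                      if p[i] == 0 and p[(i + 1) % 8] == 1)
                    if step == 0:
                        cond = p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0
                    else:
                        cond = p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0
                    if 2 <= filled <= 6 and transitions == 1 and cond:
                        to_clear.append((r, c))
            # Clear all marked pixels at once, as the algorithm requires.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return [row[1:-1] for row in img[1:-1]]
```

For the SVM and Naive Bayes classifiers, the resulting skeleton image would be flattened row by row into a single feature vector.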

Classifier
As mentioned in Section 2, among the three most popular approaches, SVM and Naive Bayes are nonparametric and much faster than Neural Networks. In contrast, Neural Networks are parametric and much more accurate than the SVM, though they take considerably longer to train. To find the best classifier for the proposed architecture, we compare these three classifiers on the dataset described in Section 4.1.

Experimental Set-up
Various image classification algorithms are available. The accuracy of classification depends primarily upon the feature selection criterion as well as upon the classifier used. The experiment here evaluates the proposed feature selection technique alongside the most popular classifiers for the pest identification system. The experimental set-up is divided into two phases:
1. Comparing the performance of the classifiers using known feature detection and extraction techniques involving SUSAN edge detection against the proposed surround-suppressed Canny edge detection [5] [21].
2. Comparing the performance of the classifiers on the morphological skeletons extracted from the dataset, i.e., using the proposed approach to detect edges before skeletonization.
The dataset used in this section was discussed in Section 4.1. It is divided into four sets, each containing equal numbers of pests and predators, as shown in Table 3. Each set is further segregated based on the visibility of the bug, i.e., the training sets contain images in which the bugs are clearly visible or easily identified, and the testing sets contain different views of the bugs as well as images in which they are partially visible, as shown in Fig. 3. The dataset is pre-processed to the requirements of the classifiers for both phases.
The first phase of the experiment deals with the skeletal images created using the SUSAN edge detection algorithm [5]. The transformed images are fed directly to the CNN. For SVM and Naive Bayes the images are flattened, i.e., their features are extracted and converted to vectors, which are used for training and testing. The second phase of the experiment involves the conversion of images to binary form followed by the application of surround-suppressed Canny edge detection [21]. The obtained images are further transformed to skeletons by applying binary operations, i.e., we use the feature extraction techniques described in Section 4.3. These are fed to the classifiers and their accuracy is evaluated.
The SVM and Naive Bayes classifiers are created using linear and Gaussian classification models, respectively. The CNN is initialized with two convolution layers of 32×32 and a max pooling layer of 2×2. It is further connected to 128×128 fully connected nodes, each having weights w_i. A sigmoid function is used to collect the results and send them out. The whole neural network was run for 300 epochs, i.e., the number of iterations to achieve convergence was 300.
The results obtained in the above phases are evaluated using the parameters of a confusion matrix. The matrix is illustrated in Table 5. It is used to calculate parameters such as accuracy, precision, recall, and F-score [22]. The accuracy comparison of the classifiers is shown in Fig. 4 and Fig. 5. The evaluation of the proposed detection approach against the known detection approach is depicted in Fig. 6. It is inferred from Fig. 6 that the proposed detection approach provides higher accuracy than the known detection approach based on the classification accuracy of the popular classifiers (SVM, Naive Bayes, and CNN).
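The four evaluation parameters derived from a confusion matrix can be computed as in the sketch below, which uses their standard definitions. The function name and the example counts are hypothetical and are not results from this paper; the positive class is assumed to be the harmful pest.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F-score from confusion-matrix counts.

    tp/fp/fn/tn are the true positive, false positive, false negative,
    and true negative counts. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
```

With these example counts the accuracy is 0.85 and the precision is 0.8.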

Discussion
The aim of this experiment is to reduce the misclassification of insects due to partial visibility or differing orientations. The system also mitigates the misclassification caused by the interference of light, camera aperture, background, etc. The proposed work provides insight into how the feature vectors play a key role in enhancing the accuracy of the classifiers. The paper promotes the use of local features such as skeletons or morphological images to achieve better classification accuracy, as shown in Table 8.
The proposed architecture provides more accurate results than the existing approaches, as the images are subjected to both morphological edge detection and skeletonization. The edge detection approach used here is Canny with surround suppression, which provides better feature detection than SUSAN [21]. Edge detection is critical in the proposed approach, as it provides the contours as well as the reference points to the classifier for comparison. The classifier uses these contours and reference points to classify whether the insect is harmful or not. This use of contours allows the classifiers to converge upon the insect even though it may be partially visible. In addition to being accurate, the architecture using CNN is fast. The speed-up in convolution is due to the fact that morphological images contain only 0s and 1s as their pixel values, which in turn reduces the number of comparisons needed to create the feature maps. The implementation of this approach using the CNN provides higher accuracy than the other popular classifiers.

Conclusion
The effectiveness of a pest identification system depends entirely on the accuracy of the classifier, which in turn depends upon the quality of the features extracted from the images. In this paper we have proposed a new pest identification system utilizing the morphology of the insect. We have further applied CNN as the classifier so as to improve the accuracy. It produced better and faster results than the other popular classifiers. The idea behind this paper is that an image of an insect taken from the field may not always show the insect in the same position as in the images on which the classifier was trained. We have tested our implementation with commonly found insects in their most common orientations. However, future research is needed so that insects having the same morphological features can be detected with more accuracy.

Figure 3. Left to Right: Training and testing image of a mole cricket

Figure 4. Accuracy comparison of classifiers using the known feature detection approach

Figs. 4 and 5 illustrate that the CNN outperforms the other two popular classifiers using both the known and the proposed feature detection techniques. The results are further substantiated by Tables 6 and 7, where the precision, recall, and F-score are displayed.

Figure 6. Accuracy comparison of the known and the proposed detection approach in different classifiers (left to right: SVM, Naive Bayes, and CNN)

Table 1. Reduction in the accuracy of classifiers owing to the change in orientation of the insects

Table 3 contains information on the types of pests present in the dataset.

Table 3. Pests used in the dataset

Table 6. Comparison of classifiers using the known feature detection model on morphology

Figure 5. Accuracy comparison of classifiers using the proposed feature detection approach

Table 7. Comparison of classifiers using the proposed feature detection model on morphology

Table 8. Accuracy of the classifiers using the proposed feature extraction techniques