Transfer Learning Techniques to Classify Nematodes Species

Phytoparasitic nematodes severely damage crops all over the world, causing enormous financial losses. Some researchers estimate that less than 0.01 % of nematode species have been discovered so far. Because most nematodes share similar physical traits, they are difficult to classify using traditional techniques. In the past, the only way to identify nematodes was through their morphological traits, including body length, the arrangement of their reproductive organs, and other physical characteristics. This method is exceedingly labor- and skill-intensive, and its classification depends entirely on human ability and costly equipment. In recent years, deep learning (DL)-based techniques have greatly improved classification accuracy. In this study, the DL models InceptionV3 and VGG16 were used to classify five nematode species: Acrobeles, Acrobeloides, Aphelenchoides, Amplimerlinius, and Discolimus. The dataset, which consists of 1500 digital photos of nematodes, was expanded to 5000 images using data augmentation techniques such as flipping, shearing, and zooming. Two pre-trained CNN models, InceptionV3 and VGG16, were fine-tuned to classify these species and achieved accuracies of 98.02 % and 95.87 %, respectively.


Introduction
Nematodes, sometimes referred to as roundworms, are invertebrates belonging to the phylum Nematoda. Their bodies are translucent, cylindrical, and unsegmented. Nematodes are the most abundant and varied animals on Earth, with up to a million different species [1]. They live either as free-living organisms or as parasites. Parasitic species are found in plants and animals, while free-living species are found in soil, deserts, freshwater, and beneath the Earth's crust, where they mostly feed on dead organisms, algae, fungi, and bacteria. Many nematodes are harmful to humans, plants, and other living things. These species cause several severe human diseases, including trichuriasis [2], hookworm infection [3], angiostrongyliasis [4], helminthiasis [5], and onchocerciasis [6]. According to research conducted thus far, less than 0.01 % of these species have been discovered [7]. Although nematodes recycle nutrients and help manage pests, they can occasionally be dangerous to plants. Most soil nematodes are important for nitrogen cycling in the natural environment. Certain nematodes are reportedly essential for veterinary and medical studies [8]. Consequently, accurate identification is essential for understanding nematode diversity and developing effective control and management strategies.
Nematodes are assigned to distinct species based on their size. They occur naturally and are quite difficult to recognize visually, and their many physical similarities make them challenging to classify. To study their biological, genetic, and physiological features, nematodes are cultured and viewed under a microscope [9]. In the past, nematodes could only be identified by their body length, reproductive structure, mouth and tail sections, and other anatomical characteristics. The stylet, a piercing mouthpart, sets nematodes apart from one another. Because the visual differences between closely related species are subtle and trained taxonomists are scarce, this approach frequently leads to inaccurate categorization, which is unsatisfactory, especially when a large sample size is involved [10]. Traditional procedures are also expensive and time-consuming. Morphological identification relies on matching observed patterns against drawings in a standard taxonomic key. Experts identify nematode species using morphological and DNA-based techniques [11]. These procedures are therefore intricate, labor-intensive, and entirely reliant on expensive equipment and human skill.
Artificial intelligence (AI) methods can address this issue by identifying nematodes from their microscopic images. They make identification quicker and simpler, saving time and labor. Machine learning (ML) approaches have been widely used in numerous areas, such as speech recognition [12], healthcare [13], business forecasting [14], agriculture [15], and others. In the past few years, deep learning (DL), a subfield of ML, has made noticeable progress and increased the accuracy of results. DL has already been widely adopted in microscopic image identification [16], as well as in object segmentation and classification [17], pattern recognition [18], autonomous cars [19], cell segmentation, tissue segmentation [20], etc. Several CNN architectures, including ResNet, Inception, Xception, and VGG16, have been created specifically for image classification.
In this work, we present modified versions of the InceptionV3 and VGG16 models that classify images of five nematode species (Acrobeles, Acrobeloides, Aphelenchoides, Amplimerlinius, and Discolimus) with high accuracy. The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 outlines the materials and the proposed method. Section 4 presents the results and discussion. Section 5 concludes the paper and outlines the scope of future work.

Related Study
Researchers have used several DL algorithms to classify nematode species automatically from images. This section reviews the studies most pertinent to this work. Automatic classification of nematode species from images involves four steps: (I) collecting the images, (II) preprocessing, (III) extracting and evaluating features, and (IV) classifying the images.
In [21], a deep learning-based method was used to classify 3,063 microscopic images from the NemaDataset, covering the five phytonematode species that cause the most serious damage to soybean crops. Thirteen CNN models representing the state of the art in object identification and classification research were assessed on the NemaDataset and compared with NemaNet, a newly created CNN model. Trained from scratch, the model reached an accuracy of 96.99 %, with 98.03 % on the best evaluation fold. With transfer learning, the average accuracy was 98.88 %, and the best evaluation fold reached 99.34 %.
Entomopathogenic nematodes (EPNs) are soil-dwelling parasitic nematodes that infect insects with bacteria that cause disease in the insects. They are commonly used for the biological control of agricultural insect pests and have been investigated as a possible substitute for chemical pesticides, which can contaminate the environment. Because easy methods exist for applying them with traditional sprayers, they are among the best substitutes for pesticides. The study in [23] covers three EPN species: Steinernema feltiae, Heterorhabditis bacteriophora, and Steinernema carpocapsae. It applies transfer learning to currently available state-of-the-art model architectures, drawing on the thirteen CNN architectures available in the Keras deep learning library with or without pre-trained weights. The model's mean validation accuracy was 88.28 % on the juvenile nematode dataset and 69.45 % on the adult nematode dataset.
Microscopic images of Acrobeles and Acrobeloides nematodes were used in [24] to demonstrate the classification of plant parasitic nematodes. The dataset comprises 277 images, which were further augmented using methods such as shearing and zooming. The species were classified using InceptionV3, a deep learning model. The authors report training and testing accuracies of 99 % and 90 %, respectively. In [25], a CNN was used to classify images of two quarantine nematode species, Globodera pallida and Globodera rostochiensis. The proposed CNN model achieved an accuracy of 71 %.
In this experimental study, the InceptionV3 and VGG16 models were used to automatically extract features from and classify digital microscopic images of five nematode species: Acrobeles, Acrobeloides, Aphelenchoides, Amplimerlinius, and Discolimus. The CNNs were developed using Python, the TensorFlow framework, and the Keras API.

Dataset and preprocessing
This work uses the state-of-the-art "I-Nema" dataset, which includes five species of plant parasitic nematodes (PPNs) with the most significant damage relevance for crops: Acrobeles, Acrobeloides, Aphelenchoides, Amplimerlinius, and Discolimus. Data augmentation techniques such as flipping, shearing, and zooming were applied to increase the volume of the training data artificially. The final dataset comprises 5000 images, with 1000 images for each nematode species. It was then split in an 80:20 ratio, with 3750 images in the training set and 1250 images in the test set. Some sample images from the dataset are shown in Fig. 2.
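As a quick arithmetic check on the split described above, the following minimal sketch maps a split ratio to image counts. The function names are illustrative and not part of the original work. An exact 80:20 split of 5000 images would give 4000 and 1000 images, whereas the reported counts of 3750 and 1250 correspond to a 75:25 split.

```python
def split_counts(total, train_percent):
    """Return (train, test) sizes for an integer training percentage."""
    train = total * train_percent // 100
    return train, total - train


def train_percent_of(train, test):
    """Return the training share, in percent, implied by two set sizes."""
    return 100 * train / (train + test)


# 5 species x 1000 images each, as stated in the text.
total_images = 5 * 1000

# Counts produced by an exact 80:20 split.
exact_80_20 = split_counts(total_images, 80)

# Training share implied by the reported 3750 / 1250 counts.
reported_share = train_percent_of(3750, 1250)

print(exact_80_20)     # (4000, 1000)
print(reported_share)  # 75.0
```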

Transfer learning
Training a CNN from scratch requires substantial computational power, a large dataset, and a long training time. Transfer learning can be used to address this problem. Nematode species are therefore classified here by applying the InceptionV3 and VGG16 models. The criteria used for their selection are explained in detail below.

Inception V3
Inception V3 is a CNN-based classification network [25]. It is 42 layers deep and built from inception modules, each of which concatenates the outputs of 1 × 1, 3 × 3, and 5 × 5 convolutions. This design reduces the number of parameters while speeding up training. Inception V3 is the third version of the Inception architecture, whose first version is known as GoogLeNet. The following are some of the advantages of Inception V3.
• Auxiliary classifiers are employed to address the vanishing gradient issue in extremely deep networks.
• The grid size is reduced.
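One way Inception-style networks reduce their parameter count is factorization: the Inception V3 design replaces a single 5 × 5 convolution with two stacked 3 × 3 convolutions that cover the same receptive field. The sketch below illustrates the saving with plain arithmetic. The channel count of 64 is an arbitrary assumption chosen for illustration, and biases are ignored.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weight count of a square convolution, ignoring biases."""
    return kernel * kernel * in_channels * out_channels


# Hypothetical channel width, used only for this illustration.
channels = 64

# One 5x5 convolution versus two stacked 3x3 convolutions.
single_5x5 = conv_params(5, channels, channels)
stacked_3x3 = 2 * conv_params(3, channels, channels)

# Fraction of the 5x5 weights that the factorized form still needs.
ratio = stacked_3x3 / single_5x5

print(single_5x5, stacked_3x3, ratio)
```

With equal input and output widths the ratio is 18/25 regardless of the channel count, a saving of 28 % in weights.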

VGG16
VGG16 is a simple and popular convolutional neural network architecture developed by the Visual Geometry Group (VGG) and evaluated on ImageNet, a large visual database used to develop visual object recognition software. Simonyan and Zisserman of the University of Oxford proposed it in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition". It has sixteen weight layers: thirteen convolutional layers and three fully connected layers. Because VGG16 is freely available online, it is often used out of the box for a wide range of applications. Deep learning models are widely used for prediction. However, they have certain drawbacks, such as overfitting, misclassification, and incorrect predictions for low-quality microscopic images.
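The layer count of the standard VGG16 can be checked with a short sketch. The configuration list below follows the published VGG16 layout (numbers are output channels of 3 × 3 convolutions, "M" marks max pooling). The variable and function names are illustrative, and the sketch describes the standard architecture rather than the modified model used in this work.

```python
# Standard VGG16 feature extractor: ints are conv output channels, "M" is max pooling.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]

# The standard classifier head has three fully connected layers.
NUM_FC_LAYERS = 3


def count_conv_layers(config):
    """Count convolutional layers, skipping pooling markers."""
    return sum(1 for item in config if item != "M")


def conv_parameter_count(config, in_channels=3, kernel=3):
    """Total weights and biases of all convolutional layers."""
    total = 0
    for item in config:
        if item == "M":
            continue
        total += kernel * kernel * in_channels * item + item
        in_channels = item
    return total


num_conv = count_conv_layers(VGG16_CONFIG)
num_weight_layers = num_conv + NUM_FC_LAYERS
conv_params_total = conv_parameter_count(VGG16_CONFIG)

print(num_conv, num_weight_layers, conv_params_total)
```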
Modified versions of Inception V3 and VGG16 are proposed for classifying nematodes.

Proposed technique
Fig. 1 displays the flowchart of the proposed procedure. First, the input image is preprocessed. Preprocessing entails scaling each image to 299 by 299 pixels and using data augmentation to increase the image count. Next, the pre-trained CNN models, VGG16 and InceptionV3, are tested. The modified VGG16 is made up of five convolutional blocks. Each of the first two blocks contains two convolutional layers with a ReLU activation function and max pooling, and each of the remaining three blocks contains three convolutional layers with a ReLU activation function and max pooling. These blocks are followed by adaptive average pooling and two further blocks, each with a linear layer, a ReLU activation function, and a dropout layer. Finally, a linear layer predicts the species class. The model was fine-tuned for 50 epochs. The Adam optimizer (Adaptive Moment Estimation) is used to minimize the loss function, and the cross-entropy loss function is used to train the selected model.
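The Adam update mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration of the standard update rule, not the training code of this work. The learning rate of 0.001 matches the value reported in the results, while the beta and epsilon values are the commonly used defaults and are assumptions here, as are the function and variable names.

```python
import math


def adam_step(params, grads, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return the new parameter list.

    `state` holds the step counter "t" and the moment estimates "m" and "v".
    """
    state["t"] += 1
    t = state["t"]
    updated = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction for the zero-initialized averages.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        updated.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return updated


# Hypothetical parameters and gradients for a single step.
params = [1.0, -2.0]
grads = [0.5, -4.0]
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}
new_params = adam_step(params, grads, state)
print(new_params)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly the learning rate regardless of the gradient's magnitude.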

Results and Discussion
In this work, two CNN models, InceptionV3 and VGG16, are trained via transfer learning to classify images of five distinct nematode species. Features learned from the ImageNet dataset are used for feature extraction. These features are then passed to the classification layers, which consist of fully connected and softmax layers. All models retain the fully connected layer size. The softmax layer produces five probabilities because our task is a five-class problem. Overfitting is one of the main issues when using transfer learning with small datasets. To prevent it, dropout with a rate of 0.5 has been inserted before the fully connected layers. Both models were trained with the Adam optimizer for 50 epochs at a learning rate of 0.001. The activation and loss functions employed are ReLU and categorical cross-entropy, respectively. The CNN models were built in Python using the Keras and TensorFlow libraries. The performance of the proposed system was analyzed by randomly dividing the dataset into two groups: 80 % for training and 20 % for testing. The training set consists of 3750 images and the test set of 1250 images. Augmentation techniques such as flipping, shearing, and zooming have been applied to the training data to provide the CNN architecture with varied visual inputs [27].
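To illustrate how a five-way softmax output and the categorical cross-entropy loss fit together, here is a minimal sketch in plain Python. The logits are made-up values, the function names are illustrative, and the class order follows the index mapping 0 to 4 used for the confusion matrices.

```python
import math

CLASS_NAMES = ["Acrobeles", "Acrobeloides", "Aphelenchoides",
               "Amplimerlinius", "Discolimus"]


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index):
    """Loss for a one-hot target: negative log-probability of the true class."""
    return -math.log(probs[true_index])


# Hypothetical network outputs for one image whose true class is index 0.
logits = [2.0, 0.5, 0.1, -1.0, 0.3]
probs = softmax(logits)
predicted = CLASS_NAMES[probs.index(max(probs))]
loss = categorical_cross_entropy(probs, true_index=0)

print(predicted, loss)
```

A model that outputs equal probabilities for all five classes would have a loss of ln 5, which is a useful baseline when reading training curves.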
Four metrics have been used to evaluate model performance: accuracy, precision, recall, and F1-score. Accuracy gives the percentage of correctly classified samples. Recall measures the proportion of actual positive samples that the model identifies correctly. Precision gives the proportion of samples predicted as positive that are actually positive. The F1-score combines precision and recall. These performance measures have been calculated from the confusion matrix.
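The four metrics can be derived from a confusion matrix as in the following minimal sketch. The 3 × 3 matrix is a made-up toy example, not data from this study, and the sketch assumes that rows are true classes and columns are predicted classes. The function name is illustrative.

```python
def metrics_from_confusion(cm):
    """Return (accuracy, per-class metrics) for a square confusion matrix.

    Assumes cm[true][predicted] holds sample counts.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                          # true k, predicted otherwise
        fp = sum(cm[r][k] for r in range(n)) - tp     # predicted k, true otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    return correct / total, per_class


# Toy confusion matrix with three classes (hypothetical counts).
toy_cm = [[8, 1, 1],
          [2, 7, 1],
          [0, 0, 10]]

accuracy, per_class = metrics_from_confusion(toy_cm)
print(accuracy)
for k, m in enumerate(per_class):
    print(k, m)
```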
The confusion matrices for the VGG16 and InceptionV3 models are displayed in Fig. 3. They show the number of correctly and incorrectly classified instances based on each model's predictions. Acrobeles, Acrobeloides, Aphelenchoides, Amplimerlinius, and Discolimus are represented by 0, 1, 2, 3, and 4, respectively, in Fig. 3.
Table 1 displays the experimental results achieved with both models. With an accuracy of 98.02 %, the InceptionV3 model outperformed the VGG16 model. To evaluate the effectiveness of the proposed approach for nematode classification, the proposed models were also compared, in terms of accuracy, with other DL approaches applied to the same dataset. Table 2 presents these comparison results. AlexNet achieved less than 50.7 % accuracy, whereas VGG16 and VGG19 achieved 65-70 % accuracy. ResNet34, ResNet50, and ResNet101 obtained accuracies of 70-80 %, still below the proposed models, which achieved the highest accuracies of 98 % for InceptionV3 and 95 % for VGG16. Thus, the experimental evaluation demonstrates the effectiveness of the proposed approach.

Conclusion
In this paper, five nematode species, Acrobeles, Acrobeloides, Aphelenchoides, Amplimerlinius, and Discolimus, are automatically classified using transfer learning with InceptionV3 and VGG16. This work used 1500 microscopic images of five PPN species from a public dataset named "I-Nema". Through the use of data augmentation techniques, including flipping, shearing, and zooming, the number of images was increased to 5000. Two state-of-the-art DL models, VGG16 and InceptionV3, were fine-tuned to classify these species, and the two pre-trained CNN models were compared. The InceptionV3 model outperformed the VGG16 model in terms of classification results. The accuracy of the VGG16 model and the InceptionV3 model is 95.87 % and 98.02 %, respectively. These findings lead to the conclusion that DL has enormous potential for classification. Our proposed method produced better results than other recent approaches. However, the study only considers five nematode species, and the dataset used was quite small. We will need to improve its performance in our next work. Concatenating or combining deep learning models may improve classification outcomes. In the future, we plan to classify additional species and increase the dataset's size. An experiment on our own primary dataset will also be carried out. We hope our research and benchmark will be instructive to researchers from different fields in their future work.

Fig. 1. Proposed methodology.
The InceptionV3 image classification model achieves an accuracy above 78.1 % on the "ImageNet" dataset. The fundamental components of the model include convolutions, concatenations, dropout, average pooling, max pooling, and fully connected layers. The model makes frequent use of batch normalization on the activation inputs. Softmax is used in the loss calculation. Our modified InceptionV3 begins with three BasicConv2d blocks, each combining convolutional layers with batch normalization. Next come three A modules, four B modules, and two C modules, followed by average pooling, dropout, a linear layer, ReLU, a dropout layer, and a final linear layer.
Fig. 4 displays the training and validation accuracy curves. The curves indicate no significant overfitting because the training accuracy, although higher, remains close to the validation accuracy.

Table 1. Classification results of InceptionV3 and VGG16.

Table 2. Comparison of the proposed model with existing work provided in the literature on the same dataset for nematode classification [22].