A Combination of Data Augmentation Techniques for Mango Leaf Diseases Classification

Demba Faye α, Idy Diop σ, Nalla Mbaye ρ & Doudou Dione Ѡ

Abstract- Mango is one of the most traded fruits in the world. However, mango production suffers from several pests and diseases, which reduce the yield and quality of mangoes and their price in local and international markets. Several solutions for the automatic diagnosis of these pests and diseases have been proposed by researchers in the last decade. These solutions are based on Machine Learning (ML) and Deep Learning (DL) algorithms. In recent years, Convolutional Neural Networks (CNNs) have achieved impressive results in image classification and are considered the leading methods for this task. However, one of the most significant issues facing mango pest and disease classification solutions is the lack of large labeled datasets. Data augmentation is one of the solutions that has been successfully reported in the literature. This paper studies six data augmentation techniques, namely blur, contrast, flip, noise, zoom and affine transformation, in order to determine, on the one hand, the impact of each technique on the performance of a ResNet50 CNN trained on an initial small dataset and, on the other hand, the combination of techniques that gives the best performance to the DL network. Results show that the best combination for classifying mango leaf diseases is 'Contrast & Flip & Affine transformation', which gives the model a training accuracy of 98.54% and a testing accuracy of 97.80% with an F1-score above 0.9.

I. Introduction
Mango (Mangifera indica L.) is a lucrative fruit widely cultivated in tropical countries. It belongs to the family Anacardiaceae. Its overall consumption in 2017 was estimated at 50.65 million metric tons [1]. In 2021 it was, in terms of quantities exported, the third most traded tropical fruit after pineapple and avocado [2]. Mango is highly appreciated for its richness in nutrients (vitamins A, B, C, K, ...), flavorful pulp and alluring aroma [3,4]. The fruit brings substantial economic benefits to exporting countries and mango growers. However, mango production suffers severely from pests and diseases, which reduce both quality and quantity and, in turn, affect the price of mango on the international market.
In the last decade, researchers have proposed several solutions for the automatic diagnosis of these pests and diseases. These solutions were first based on image processing (IP) and machine learning (ML) techniques and, in the last five years, on deep learning (DL) algorithms. DL-based solutions have achieved state-of-the-art performance on ImageNet and other benchmark datasets [5]. In recent years, Convolutional Neural Networks (CNNs) have achieved impressive results in image classification and are considered the leading methods for object detection in computer vision [5,6].
However, one of the biggest issues facing mango pest and disease identification solutions is the lack of large labeled datasets [7,8,9,10]. Limited training data inhibits the performance of DL-based models, which need large amounts of data to train well, avoid overfitting and improve their generalization ability. Overfitting happens when the training accuracy is markedly higher than the accuracy on the validation/test set. The generalizability of a model is measured by the difference between its performance on training data (known data) and on test data (unknown data): the smaller the difference, the better the model generalizes. Data augmentation is one of the solutions that has been successfully reported in the literature [1]. This remedy for overfitting generates a more comprehensive dataset that minimizes the distance between the training and validation sets.
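As a small illustration of the overfitting criterion described above, the gap between training and test accuracy can be computed directly. This is only a sketch: the function names and the 5-point tolerance are illustrative assumptions, not values taken from the paper. The accuracies used are the ones the paper reports for the un-augmented dataset and for the final augmented dataset.

```python
def generalization_gap(train_acc: float, test_acc: float) -> float:
    """Return training accuracy minus test accuracy, in percentage points."""
    return train_acc - test_acc


def looks_overfitted(train_acc: float, test_acc: float, tolerance: float = 5.0) -> bool:
    """Flag a model whose training accuracy exceeds its test accuracy by more
    than `tolerance` points. The default tolerance is an illustrative choice."""
    return generalization_gap(train_acc, test_acc) > tolerance


# Accuracies reported in the paper for the initial (un-augmented) dataset.
initial_gap = generalization_gap(87.18, 39.34)
# Accuracies reported for 'Contrast & Flip & Affine transformation'.
final_gap = generalization_gap(98.54, 97.80)
```

A gap of several tens of points, as in the first case, is the situation the data augmentation process is meant to correct.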
A data augmentation process based on image manipulation is presented in this paper to improve the quality of a small dataset of mango leaves presented in [1]. The specific contributions of the paper include:
• Generate a dataset for every data augmentation strategy except affine transformation. The DL model is trained on each generated dataset to determine the impact of each data augmentation technique on the performance of the model.
• Generate multiple datasets from pairwise sequential combinations of data augmentation techniques, namely blur, contrast, flip, noise and zoom, to identify the combinations that give the best performance to the DL model.
• Apply the affine transformation technique to the previous best combinations to determine the final combination that best classifies diseased mango leaves.
The rest of the paper is organized as follows: Section 2 reviews the related literature; Section 3 describes the data acquisition, the data augmentation techniques and the CNN model used; Section 4 presents and discusses the results of the data augmentation techniques. The last section concludes the paper and outlines the authors' future work.

II. Related Works
The literature review presented in this paper covers only data augmentation strategies used for mango pest or disease classification and for quality grading of mango or other fruits.
Shorten et al. [11] presented a survey of image data augmentation algorithms such as color space augmentations, geometric transformations, mixing images, kernel filters, random erasing, adversarial training, feature space augmentation, generative adversarial networks (GANs), meta-learning and neural style transfer. They also discussed the application of GAN-based augmentation methods and other characteristics of data augmentation such as curriculum learning, test-time augmentation, resolution impact and final dataset size. Dandavate et al. [12] applied data augmentation techniques, namely rotation, scaling and image translation, to a fruit dataset to avoid overfitting and obtain better performance with their simple CNN model. Agastya et al. [13] used VGG-16 and VGG-19 for automatic batik classification. By applying random rotation within a given range of degrees, scaling and shearing, they improved the accuracy of their models by up to 10%. Bargoti et al. [14] presented a fruit (mango, apple and almond) detection system using Faster R-CNN. They used image flipping and scaling to improve the performance of their model, achieving an F1-score above 0.9 for mangoes and apples. Wu et al. [15] investigated several deep learning-based methods for mango quality grading and found VGG-16 to be the best model for this task. During training, the authors randomly applied, at each epoch, data augmentation strategies such as horizontal or vertical image flipping, rotation, brightness, contrast and zoom in/out. Zang et al. [16] developed a fruit category identification system using a 13-layer CNN and three data augmentation strategies, namely noise injection, image rotation and gamma correction. The final overall accuracy obtained is 94.94%, at least 5 percentage points higher than state-of-the-art approaches. Supekar et al. [17] built a mango grading system based on ripeness, size, shape and defects. They used K-means clustering for defect segmentation and Random Forest classifiers. To avoid overfitting with an initial training dataset of 69 images, the authors applied image rotation by angles of 90, 180 and 270 degrees. The final training dataset consists of 522 images, which allows their model to reach an overall accuracy of 88.88%.

III. Methodology and Model

a) Data acquisition
The dataset used in this paper is part of the 'MangoLeafBD' dataset produced by Ahmed et al. [18] and downloadable from the 'Mendeley Data' platform (https://data.mendeley.com/datasets/hxsnvwty3r).
The MangoLeafBD dataset contains eight classes, seven of which correspond to mango leaf diseases and one of which contains healthy leaves.
In this paper, four diseases, namely Anthracnose, Gall Midge, Powdery Mildew and Sooty Mold, are treated, as they are among the mango leaf diseases most studied by researchers during the last five years [19] (Fig. 1 and Fig. 2). The dataset used contains four classes corresponding respectively to these diseases and a class of healthy leaves. There are 500 RGB leaf images of 240x320 pixels in each class, making a total of 2,500. Images are in JPG format.

b) Data augmentation
Data augmentation is a powerful solution against overfitting. It allows a model with a small dataset to become robust and generalizable. There are two categories of data augmentation: the first is based on image manipulations and the second on DL (generative adversarial networks (GANs), feature space augmentations, adversarial training, Neural Style Transfer, Meta Learning Data Augmentation) [11].
This research focuses on the first category because i) the second is generally used to generate synthetic images from a fairly large dataset, and ii) mango leaf images taken under real-world conditions suffer mainly from temperature variation, shadowing, overlapping leaves and the presence of multiple objects. The first category allows us to generate images that reflect these conditions.

This paper deals with the following techniques:
• Noise injection
Image noise is a random disturbance in the brightness and color of an image. Noise injection is an effective way to avoid overfitting and improve the test performance of a machine learning model [13]. There are several ways to add noise to an image (e.g. Gaussian noise, salt-and-pepper noise, speckle noise). Here, Gaussian noise is applied with the mean parameter fixed to 0 and the sigma parameter fixed to 0.05.
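The following NumPy sketch shows one way Gaussian noise with mean 0 and sigma 0.05 could be injected. It assumes that sigma is expressed on a [0, 1] intensity scale, which the paper does not state; the function name and the synthetic stand-in image are hypothetical.

```python
import numpy as np


def add_gaussian_noise(image, mean=0.0, sigma=0.05, seed=None):
    """Add Gaussian noise to a uint8 image and return a uint8 image.

    Assumption: sigma is given on a [0, 1] intensity scale, so the image
    is scaled to [0, 1] before the noise is added.
    """
    rng = np.random.default_rng(seed)
    scaled = image.astype(np.float64) / 255.0
    noisy = scaled + rng.normal(mean, sigma, size=scaled.shape)
    # Clip back to the valid range before converting to 8-bit.
    return (np.clip(noisy, 0.0, 1.0) * 255.0).round().astype(np.uint8)


# Synthetic stand-in for a 240x320 RGB leaf image.
leaf = np.full((240, 320, 3), 128, dtype=np.uint8)
noisy_leaf = add_gaussian_noise(leaf, seed=0)
```

Because the noise has zero mean, the average brightness of the image is essentially unchanged while individual pixels are perturbed.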

• Blur
Blurring an image means making it less sharp. Photographic blur occurs when the subject or scene moves relative to the camera, or vice versa. To simulate this, Gaussian blur was applied with a kernel size of (5,15).
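A Gaussian blur with a (5,15) kernel can be sketched in NumPy as a separable filter. Two assumptions are made here: the kernel size is read as (width, height), following the OpenCV convention, and sigma is derived from the kernel size with the rule OpenCV documents for a non-positive sigma. The result is not guaranteed to be bit-identical to OpenCV's output, and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel_1d(size):
    """Normalized 1-D Gaussian kernel; sigma is derived from the size."""
    sigma = 0.3 * ((size - 1) * 0.5 - 1) + 0.8
    offsets = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image, ksize=(5, 15)):
    """Blur a uint8 image with a separable Gaussian filter.

    ksize is (width, height); both values must be odd.
    """
    kw, kh = ksize
    h, w = image.shape[:2]
    kx, ky = gaussian_kernel_1d(kw), gaussian_kernel_1d(kh)
    pad = [(kh // 2, kh // 2), (kw // 2, kw // 2)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image.astype(np.float64), pad, mode="reflect")
    # Horizontal pass (along the width), then vertical pass (along the height).
    rows = sum(kx[i] * padded[:, i:i + w] for i in range(kw))
    out = sum(ky[j] * rows[j:j + h] for j in range(kh))
    return np.clip(out.round(), 0, 255).astype(np.uint8)


# A single bright pixel makes the anisotropic spread visible.
impulse = np.zeros((31, 31), dtype=np.uint8)
impulse[15, 15] = 255
blurred = gaussian_blur(impulse)
```

With this reading of the kernel size, the blur extends further vertically than horizontally, which resembles motion blur along one direction.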

• Contrast and Brightness
The contrast and brightness adjustment changes the appearance of an image. Brightness controls the overall lightness of the image, and contrast adjusts the difference between the darkest and lightest colors.
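A common way to adjust contrast and brightness is the linear transform out = alpha * pixel + beta, where alpha scales contrast and beta shifts brightness. The sketch below uses that form; the paper does not report its parameter values, so alpha = 1.5 and beta = 20 are purely illustrative, as is the function name.

```python
import numpy as np


def adjust_contrast_brightness(image, alpha=1.5, beta=20.0):
    """Apply out = alpha * pixel + beta and saturate to the uint8 range.

    alpha > 1 increases contrast; beta > 0 increases brightness.
    Both default values are illustrative, not taken from the paper.
    """
    out = alpha * image.astype(np.float64) + beta
    return np.clip(out, 0, 255).round().astype(np.uint8)


# Three sample intensities: dark, mid-tone and bright.
pixels = np.array([[0, 100, 200]], dtype=np.uint8)
adjusted = adjust_contrast_brightness(pixels)
```

Values that exceed 255 after the transform are saturated, which is why the clipping step comes before the conversion back to 8-bit.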

• Zoom
Zooming an image means enlarging it so that the details in the picture become more visible and clear. Each image is zoomed three times from the center, using zoom parameters {3; 5; 7}.
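One possible reading of a center zoom is to crop a central window and resize it back to the original size. The sketch below takes that reading and assumes that the zoom parameters {3; 5; 7} are magnification factors, which the paper does not specify. Nearest-neighbour resampling is used only to keep the example dependency-free; the function name is hypothetical.

```python
import numpy as np


def center_zoom(image, factor):
    """Zoom into the image center by `factor`, keeping the output size."""
    h, w = image.shape[:2]
    crop_h = max(1, round(h / factor))
    crop_w = max(1, round(w / factor))
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    crop = image[top:top + crop_h, left:left + crop_w]
    # Nearest-neighbour upsampling of the crop back to (h, w).
    rows = np.arange(h) * crop_h // h
    cols = np.arange(w) * crop_w // w
    return crop[rows][:, cols]


# Stand-in 240x320 image whose pixel values encode their position.
image = np.arange(240 * 320).reshape(240, 320)
zoomed = [center_zoom(image, factor) for factor in (3, 5, 7)]
```

Each original image yields three zoomed images, one per factor, all with the same shape as the input.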

• Image flipping
To flip or mirror an image means to reverse it horizontally (horizontal flip) or vertically (vertical flip). The flip function generates an image in which the left side becomes the right side or the top becomes the bottom. The images are flipped vertically and horizontally using flip parameters 0 and 1 respectively.
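The flip parameters 0 and 1 match the convention of OpenCV's `cv2.flip`, where a flip code of 0 flips around the horizontal axis (vertical flip) and a positive code flips around the vertical axis (horizontal flip). The NumPy sketch below mirrors that convention with array slicing; the function name is hypothetical.

```python
import numpy as np


def flip(image, flip_code):
    """Flip an image: code 0 is a vertical flip, code 1 a horizontal flip."""
    if flip_code == 0:
        return image[::-1]       # top becomes bottom
    if flip_code == 1:
        return image[:, ::-1]    # left becomes right
    raise ValueError("flip_code must be 0 or 1")


tiny = np.array([[1, 2],
                 [3, 4]])
vertical = flip(tiny, 0)
horizontal = flip(tiny, 1)
```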

• Affine transformation
An affine transformation is, in general, a combination of translations, rotations, shears and dilations [12]. It is used to simulate images captured from different camera projections and positions. Affine transformation is performed using an input matrix (In) of size 2x3 and an output matrix (Out) of the same size. The input matrix holds three points in the input image and the output matrix holds their corresponding locations in the output image. In the training dataset, twenty additional images are randomly generated for each image. The generated images that contain no part of a mango leaf are then removed. Fig. 3 shows an example of a diseased mango leaf (anthracnose) to which all these data augmentation techniques are applied.
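Three point correspondences determine a 2x3 affine matrix uniquely, provided the source points are not collinear. The sketch below recovers that matrix by solving a small linear system with NumPy; in OpenCV the same matrix is what `cv2.getAffineTransform` returns, and `cv2.warpAffine` applies it to an image, although the paper does not name the calls it used. The point values and function names are hypothetical.

```python
import numpy as np


def affine_from_points(src, dst):
    """Return the 2x3 affine matrix M that maps three src points to dst.

    src and dst are sequences of three (x, y) points. The source points
    must not be collinear, otherwise the linear system is singular.
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    # Each row [x, y, 1] @ M.T must equal the matching [u, v] in dst.
    a = np.hstack([src, np.ones((3, 1))])
    return np.linalg.solve(a, dst).T


def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=np.float64)
    return points @ matrix[:, :2].T + matrix[:, 2]


# Hypothetical point pairs describing a pure translation by (10, 20) pixels.
src_pts = [(0, 0), (100, 0), (0, 100)]
dst_pts = [(10, 20), (110, 20), (10, 120)]
M = affine_from_points(src_pts, dst_pts)
```

Drawing the destination points at random around the source points is one way to obtain the twenty random variants per image, though the sampling scheme used in the paper is not described.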

The data augmentation process (Fig. 4) is carried out as follows:
First step: For each of the above-mentioned data augmentation strategies (except affine transformation), a new dataset for training and validation is generated (Fig. 3, Table 2). Images of the original dataset are added to the generated one. This makes it possible to determine the impact of each data augmentation strategy on the overall performance of the model.
Second step: Each strategy (except affine transformation) is combined sequentially with each of the four others to generate new datasets (Table 2).
Final step: Affine transformation is applied to the combinations that give the best performance to the DL model (Table 3).
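The three steps can be summarized as an enumeration of the datasets to generate. The sketch below assumes that a pair is counted once regardless of the order in which its two techniques are applied, so the five base techniques give ten pairs; the function and variable names are hypothetical.

```python
from itertools import combinations

BASE_TECHNIQUES = ["blur", "contrast", "flip", "noise", "zoom"]


def plan_datasets(techniques):
    """List the datasets of the first step (original plus one technique)
    and of the second step (every unordered pair of techniques)."""
    singles = [("original", technique) for technique in techniques]
    pairs = list(combinations(techniques, 2))
    return singles, pairs


def add_affine(best_pairs):
    """Final step: append affine transformation to the retained pairs."""
    return [pair + ("affine",) for pair in best_pairs]


singles, pairs = plan_datasets(BASE_TECHNIQUES)
# The two pairs retained in the paper after the second step.
finalists = add_affine([("contrast", "flip"), ("flip", "zoom")])
```

Restricting the affine step to the best pairs keeps the number of training runs small compared with testing affine transformation on all ten pairs.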
The augmentation techniques are implemented using the Python Open Source Computer Vision Library (OpenCV).

Split        Number of images in each of the ten generated datasets
Train        8000    6400    4800    8000    9600    8000    11200   6400    9600    8000
Validation   2000    1600    1200    2000    2400    2000    2800    1600    2400    2000
Test         500     500     500     500     500     500     500     500     500     500
Total        10500   8500    6500    10500   12500   10500   14500   8500    12500   10500

[Fig. 3: Original image, Noised image, Blurred image]

c) CNN model
... [20]. ResNet won first place at the ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2015). To preserve knowledge, reduce losses and boost performance during the training phase, ResNet introduced residual connections between layers. A residual connection in a layer means that the output of the layer is the convolution of its input plus its input [21]. ResNet50 is used in this research. It consists of 50 layers, as shown in Fig. 5.
The model is adapted by replacing the 1000 nodes of the softmax output layer with 5 nodes, one per class (the four treated mango leaf diseases plus healthy leaves).
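The idea of a residual connection, output = convolution(input) + input, can be illustrated on a one-dimensional signal. This is a toy sketch, not the ResNet50 block: real residual blocks stack several 2-D convolutions with normalization and activations. The function name and the kernels are hypothetical.

```python
import numpy as np


def residual_layer(x, kernel):
    """Return conv(x, kernel) + x for a 1-D signal (toy residual connection)."""
    x = np.asarray(x, dtype=np.float64)
    transformed = np.convolve(x, kernel, mode="same")
    return transformed + x  # the skip connection adds the unchanged input


signal = np.array([1.0, 2.0, 3.0, 4.0])
# With an all-zero kernel the layer falls back to the identity mapping.
passthrough = residual_layer(signal, np.zeros(3))
# With an identity kernel the convolution returns x, so the output is 2x.
doubled = residual_layer(signal, np.array([0.0, 1.0, 0.0]))
```

The first case shows why residual connections help preserve information: even when the convolution contributes nothing, the input still reaches the next layer unchanged.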

d) Implementation details
The data augmentation process and the ResNet50 model are implemented using the OpenCV and Keras libraries respectively. The training parameters include the Adam optimizer with a learning rate of 0.001, binary cross-entropy as the loss function, and 8 epochs.
The model is trained on a server with an NVIDIA GPU and 32 GB of RAM.

IV. Results and Discussion
The initial small dataset is split as follows: 64% for training, 16% for validation and 20% for testing. After randomly splitting the dataset, we have 1,600 images for training, 400 images for validation and 500 images for testing. Results show that the training accuracy (87.18%) is far greater than the testing accuracy (39.34%), so the model overfitted, as shown in Fig. 6. Since the dataset is not large enough to train the DL model robustly, the data augmentation process is carried out. This task concerns only the training and validation data [22]; the test data remain at 500 images.
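The split sizes follow directly from the percentages. The sketch below reproduces the arithmetic and shows one way to perform a seeded random split; the function names and the seed are illustrative assumptions, and the paper does not describe how its random split was implemented.

```python
import random


def split_counts(total, train_pct=64, val_pct=16):
    """Return (train, validation, test) sizes; test takes the remainder."""
    train = total * train_pct // 100
    val = total * val_pct // 100
    return train, val, total - train - val


def random_split(items, seed=0):
    """Shuffle the items with a fixed seed and cut them 64/16/20."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train, n_val, _ = split_counts(len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


counts = split_counts(2500)
# Integer indices stand in for the 2,500 image files.
train_ids, val_ids, test_ids = random_split(range(2500))
```

Keeping the test subset fixed before any augmentation ensures that augmented copies of a test image never leak into the training data.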
In the first step, after the training phase, results show that the DL model overfits on all datasets except 'Original & Contrast', which gives a training and testing accuracy of 90.56% and 86.23% respectively (Table 3, Fig. 7).
In the second step, training the model on the combined datasets yielded the results in Table 4 (Fig. 8, Fig. 9).
To summarize the results presented above: in the first step, the model overfitted on the generated datasets, except for the 'Original & Contrast' dataset, which resulted in an accuracy of 86.23%. Among the data augmentation strategies blur, contrast, flip, noise and zoom, the best combinations for classifying mango leaf diseases are 'Contrast & Flip' and 'Flip & Zoom', according to the results of the second step. These two combinations yielded accuracies of 91.39% and 90.59% respectively. In the final step, applying the 'Affine Transformation' strategy to the datasets generated by these two combinations revealed that the best combination for mango leaf disease classification is 'Contrast & Flip & Affine Transformation', since it yielded an accuracy of 97.80%.

V. Conclusion and Future Works
This paper presented three contributions. The first determined the impact of the data augmentation techniques blur, contrast, flip, noise and zoom on mango leaf disease classification. The second identified the combinations of these techniques that give the best performance to the deep learning model. The last revealed that applying the 'affine transformation' technique to the combination 'Contrast & Flip' gives the best performance to the ResNet50 CNN, with an accuracy of 97.80%.
This solution can be used to improve the performance of DL models for image classification with small datasets.
Our future work is to propose a dataset of mango leaf diseases with images captured in mango orchards of a Sahelian country such as Senegal. We expect that applying this combination as a data augmentation technique to that dataset will yield strong results in mango leaf disease classification using a deep learning model such as ResNet50. This model will then be deployed in mobile and web applications to allow mango growers to diagnose diseases in their crops without expert intervention.