Classifying Java plum (Eugenia jambolana) Leaf for Tobacco Cigarette Wrapper using Convolutional Neural Network

With the increasing prices of cigarettes and cigars, more and more smokers are looking for an alternative. One way to smoke without traditional cigarettes, cigars, or electronic cigarettes is to roll tobacco in Java plum leaves. Selecting which leaves to use as cigarette covers is difficult because characteristics such as texture, color, dryness, and shape must be considered. This study aims to assist or replace human experts in classifying Java plum leaves by using a Convolutional Neural Network classifier. TensorFlow Inception V3 is retrained to classify Java plum leaves into three categories, and with 1470 sample images, the model achieved a validation accuracy of 91.2%. The system needs more sample photos to reach a higher accuracy rate.


Introduction
Java plum, known in the Philippines as duhat or lomboy, is a tropical tree that is common in the Southeast Asian region. The tree was introduced to other parts of the world, including some parts of America, after the colonization era. It is believed that the fruit of the Java plum treats various diseases, including diabetes (Ayyanar & Subash-Babu, 2012). It is also known to contain chemopreventive bioactive compounds, such as anthocyanins, that combat cancer (Charepalli et al., 2016). Hailed as one of the miracle fruit trees in the region because of these claimed health benefits, the tree can grow up to 30 meters in height but is considered slow growing. This research aims to automate the selection, or grading, of Java plum leaves so that selection is quicker and requires fewer human experts. A new system is needed to replace or aid the workers in the selection process because the human experts in this industry are usually above 60 years old. The growth of this industry is limited by the availability of human experts who can do this job accurately, so this kind of automation would enable this type of business to grow. Models trained with different values of one hyperparameter, the learning rate of the convolutional neural network, were compared to determine the best model.
In the Philippines and some neighboring countries, one use of dried Java plum leaves is as cigarette wrappers, replacing the traditional paper rolls. Currently, no large companies manufacture cigarettes that use the leaves of this tree as the wrapper. Cigars and tobacco-filled Java plum rolls have the same structure because both are wrapped with leaves. The only difference is that cigars tend to be far more expensive than almost everything else that smokers use, including paper-wrapped cigarettes. The risk of lung cancer from traditional cigarettes and from cigar-type products, including the Java plum roll, is the same (Boffetta et al., 1999).
The leaves of the Java plum tree are free and can be found almost anywhere, which is why many locals use them for smoking. A few people still sell these leaves for this purpose, but the price is usually low. Choosing the best leaf for a tobacco roll cover is a difficult task. It requires experts who have been doing this kind of selection, or leaf grading, for decades, and these people are usually in their 60s and 70s. Younger people who have only recently been introduced to this kind of work are not very accurate in classifying the leaves.
Convolutional Neural Networks (CNNs) have been shown to classify images with high accuracy.
Among the numerous neural network models, the CNN has been demonstrated to be among the best in image classification (Guo et al., 2017). This type of model removes the need for manual feature extraction, which is why CNNs are mainly used for image classification (Hedjazi et al., 2017). Several studies have employed CNNs for leaf classification and reported high accuracy rates. One such study used a dual-path CNN to classify leaf images with higher accuracy than vanilla CNN classifiers (Shah et al., 2017).

Methodology
A small business that specializes in manual leaf classification was the main source of the classified and labeled leaves. The business owner, who was interviewed for this study, also sold cigar-like products wrapped in Java plum leaves.
In their simple production line, the leaves were already classified and placed in separate baskets and sacks, which made it easier to take pictures of each class. According to the business owner, there are three classes of leaves: Class A, which is perfect and ready to use; Class B, which needs to be dried and flattened; and Class C, which is rejected. Class C is easy to identify with the naked eye. The leaf samples were laid out on a table. All the collected and grouped leaves were photographed using a mobile phone camera (Xiaomi Mi Mix 2s) with 16 megapixels. Although the images were captured at full resolution, all of them were resized to 256 pixels wide by 256 pixels high. The photos were grouped and labeled as Class A, B, or C. Class A samples were leaves that were ready for wrapping tobacco. Class B leaves were sometimes rejected after processing such as flattening or drying. Class C leaves were usually thrown out; these leaves usually had large holes or broken edges.
A Convolutional Neural Network is a variation of the Multi-Layer Perceptron that is usually used for image classification or recognition (Shah et al., 2017). CNNs have been proven to yield high accuracy rates in image classification, which was the main reason this study used this type of neural network. Instead of building a CNN model from scratch, this study used the TensorFlow Inception V3 model. This pre-built CNN model is suitable for this research because retraining it takes only a few hours while yielding high accuracy rates (Xia et al., 2017). The existing model had been trained for more than a week on the 1000 classes of ImageNet. During retraining, only the final layers were retrained, because these final layers can learn to distinguish new classes beyond the existing 1000 ImageNet classes. The retrained CNN model had only three possible outputs: Class A, B, and C. The Inception model has predetermined convolution filter sizes (a combination of 5x5 and 3x3 filters). Other predetermined parameters are the max pooling size and the number of layers in the model. The cross-entropy function was used to measure the distance between the expected output and the calculated output. Based on this distance, the weights were updated to obtain a more accurate classification. The weights were updated using stochastic gradient descent (SGD) together with backpropagation. All of these processes were handled by the Inception V3 model.
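The idea of retraining only the final layer can be illustrated with a minimal pure-Python sketch. It assumes that the frozen Inception V3 layers have already turned an image into a fixed feature vector, so only a small softmax layer is updated. The function names, the 4-element feature vector, the zero initial weights, and the learning rate of 0.1 are illustrative assumptions; the actual retraining in this study was handled by the Inception V3 tooling, not by this code.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(expected, calculated):
    """Distance between the expected and the calculated class probabilities."""
    eps = 1e-12  # avoids log(0)
    return -sum(r * math.log(c + eps) for r, c in zip(expected, calculated))


def sgd_step(weights, features, expected, learning_rate):
    """One SGD update of a bias-free final layer.

    weights[k][j] connects feature j to class k. Returns the updated
    weights and the loss measured before the update.
    """
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(logits)
    loss = cross_entropy(expected, probs)
    # For softmax followed by cross-entropy, the gradient with respect to
    # the score of class k is (probs[k] - expected[k]).
    new_weights = [
        [w - learning_rate * (p - r) * x for w, x in zip(row, features)]
        for row, p, r in zip(weights, probs, expected)
    ]
    return new_weights, loss


if __name__ == "__main__":
    weights = [[0.0] * 4 for _ in range(3)]  # three classes: A, B, C
    features = [0.5, -1.0, 2.0, 1.0]         # hypothetical frozen-layer output
    expected = [1.0, 0.0, 0.0]               # one-hot label for Class A
    for step in range(3):
        weights, loss = sgd_step(weights, features, expected, 0.1)
        print(step, loss)
```

Repeating the step on the same sample lowers the loss each time, which is the behavior the cross-entropy and SGD description relies on; a real run would average the gradient over a batch of images instead of a single one.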
Equation (1) is the cross-entropy loss function used to calculate the distance between the expected output and the calculated output. R is the actual probability, while C is the calculated probability.
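For reference, the standard form of the cross-entropy loss over the three leaf classes can be written with the symbols defined above. The name L for the loss and the class index i are notational assumptions, not taken from the original equation.

```latex
\begin{equation}
L(R, C) = -\sum_{i=1}^{3} R_i \log C_i
\end{equation}
```

Because R is one-hot for a labeled leaf image, the sum reduces to the negative log of the probability that the model assigns to the correct class.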
Equation (2) shows the standard stochastic gradient descent algorithm used to update the weights of the filters, using the parameter λ and the target M(λ).
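For reference, the usual stochastic gradient descent update can be written with the parameter λ and the target M(λ) named above. The learning rate symbol η and the step index t are notational assumptions, and the gradient is understood to be estimated on one batch of training images.

```latex
\begin{equation}
\lambda_{t+1} = \lambda_t - \eta \, \nabla_{\lambda} M(\lambda_t)
\end{equation}
```

The learning rates compared in this study correspond to different values of η in this update.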
Because the model used was a CNN, the photos did not undergo any preprocessing other than resizing. Another reason for not preprocessing the leaf photos was that they were all taken under the same lighting conditions. There were about 520 photos of Class A leaves, 450 photos of Class B leaves, and 500 photos of Class C leaves. Of the 1470 images, 80% were randomly selected as training data, while the remaining 20% were set aside as validation data. Training involved only the training images. Training ran for 1000 steps, each with a batch size of 100. The learning rates used were 0.00001, 0.000001, and 0.0000001. A laptop with an Intel Core i7-770HQ CPU and an Nvidia GTX 1050 GPU was used for training. The accuracy results of the three models were compared to determine which CNN model would be used for leaf classification.
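The random 80-20 split can be sketched as follows, using the approximate class counts reported above. The file names, the function name, and the fixed random seed are hypothetical and serve only to make the example reproducible; they do not describe how the study stored or selected its images.

```python
import random

# Approximate per-class photo counts reported in the text.
CLASS_COUNTS = {"A": 520, "B": 450, "C": 500}


def split_dataset(class_counts, train_fraction=0.8, seed=0):
    """Randomly split labeled samples into training and validation sets."""
    # Hypothetical file names; each sample is a (file name, label) pair.
    samples = [
        (f"class_{label}_{index:04d}.jpg", label)
        for label, count in class_counts.items()
        for index in range(count)
    ]
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(samples)
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


if __name__ == "__main__":
    training, validation = split_dataset(CLASS_COUNTS)
    print(len(training), len(validation))
```

With 1470 images, this yields 1176 training images and 294 validation images.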

Results and Discussions
A total of 1470 Java plum leaf photos were used to train and validate the Convolutional Neural Network.
Three separate tests were run to determine the best CNN model to use. They also showed that adjusting a hyperparameter, specifically the learning rate, affects the accuracy of the neural network model.
To further verify the results of the model, K-fold cross-validation was used. For this test, instead of randomly selecting 80% of the sample images as training data and 20% as validation data, the 80-20 splits were selected manually. The number of folds was 5, which gives the 80-20 setup: in each test, four folds are used for training and one fold is used for validation.
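A minimal sketch of how 5 folds produce the 80-20 setup is given here. The function name and the use of contiguous index ranges are assumptions; the study selected its folds manually, and in practice the image list should be arranged so that every fold contains a mix of the three classes.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (training, validation) index lists for each of the k folds."""
    fold_size, remainder = divmod(n_samples, k)
    folds = []
    start = 0
    for fold_number in range(k):
        # Spread any remainder over the first folds so sizes differ by at most 1.
        size = fold_size + (1 if fold_number < remainder else 0)
        folds.append(list(range(start, start + size)))
        start += size
    for held_out in range(k):
        validation = folds[held_out]
        training = [
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        ]
        yield training, validation


if __name__ == "__main__":
    for training, validation in k_fold_indices(1470, k=5):
        print(len(training), len(validation))
```

For 1470 images, every fold holds 294 images, so each test trains on 1176 images and validates on 294, and each image is used for validation exactly once.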
For all three models, the K-fold cross-validation tests gave almost the same accuracy results for both training and validation. This further supported the claim that the models did not overfit the samples used for training.
In the three tests performed, the validation accuracy was slightly lower than the training accuracy because the model had only seen the training images. The validation images were not fed to the network during training. They therefore resemble real-world images, because the model had never seen or been trained on them.
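The difference between training and validation accuracy can be made explicit with a short sketch that uses the values reported in the captions of Tables 1 to 3. The dictionary layout and the function names are assumptions made for illustration.

```python
# Accuracies (percent) and training times (seconds) as reported in Tables 1 to 3,
# keyed by learning rate.
REPORTED = {
    0.00001: {"train": 91.5, "validation": 87.1, "seconds": 210},
    0.000001: {"train": 95.8, "validation": 87.8, "seconds": 205},
    0.0000001: {"train": 97.8, "validation": 91.2, "seconds": 214},
}


def accuracy_gaps(results):
    """Return {learning rate: training accuracy minus validation accuracy}."""
    return {
        rate: round(values["train"] - values["validation"], 1)
        for rate, values in results.items()
    }


def best_learning_rate(results):
    """Pick the learning rate with the highest validation accuracy."""
    return max(results, key=lambda rate: results[rate]["validation"])


if __name__ == "__main__":
    for rate, gap in accuracy_gaps(REPORTED).items():
        print(rate, gap)
    print("best:", best_learning_rate(REPORTED))
```

Selecting by validation accuracy rather than training accuracy matches the reasoning above, since only the validation images stand in for unseen, real-world leaves.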
Based on the results gathered, the most suitable learning rate for the CNN in leaf classification was 0.0000001. Even though all three learning rates had almost the same validation data accuracies, the fact that the learning rate 0.0000001

Future Work
To make this study applicable to the real world, the neural network can be deployed in a mobile application so that Java plum leaves can be classified using a smartphone's camera.
It is also possible to port this neural network to a computer attached to an industrial machine that performs the classification.

Figure 1. Sample Java plum leaves from the three classes.

Figure 2. Generalized schematic diagram of the CNN model. An input image of 28x28 pixels is fed to the Inception V3 model, which outputs one of three classes.
Figure 3.

Figure 4. Using the learning rate of 0.00001, the CNN model shows steady growth in both the training data accuracy and the validation data accuracy.

Figure 5. Using the learning rate of 0.000001, the CNN model shows steady growth in both the training data accuracy and the validation data accuracy.

Figure 6. There is almost no difference in the time it took to train the three CNN models with the different learning rates.

Table 1. Using the learning rate of 0.00001, the model achieved a training data accuracy of 91.5% and a validation data accuracy of 87.1%. Training this model took 210 seconds.

Table 2. Using the learning rate of 0.000001, the model achieved a training data accuracy of 95.8% and a validation data accuracy of 87.8%. Training this model took 205 seconds.

Table 3. Using the learning rate of 0.0000001, the model achieved a training data accuracy of 97.8% and a validation data accuracy of 91.2%. Training this model took 214 seconds.

Table 4. Results of the K-fold cross-validation using the learning rate 0.00001. The average training accuracy is 91.75%, and the average validation accuracy is 87.08%.

Table 5. Results of the K-fold cross-validation using the learning rate 0.000001. The average training accuracy is 95.19%, and the average validation accuracy is 87.43%.

Table 6. Results of the K-fold cross-validation using the learning rate 0.0000001. The average training accuracy is 97.11%, and the average validation accuracy is 90.96%.

These kinds of neural networks are typically trained with at least a thousand images in each class. Another possible reason is that the classification of the gathered leaves was not accurate in the first place.

Conclusion
Classifying Java plum leaves into three categories using a CNN can yield a 91.22% accuracy rate. A total of 1470 leaf images were used, and the neural network yielded a satisfactory classification rate. This study has shown that Java plum leaf classification is possible and practical using Convolutional Neural Networks. The neural network model is also practical to use because the population of experts who can classify leaves accurately is aging. Even though this neural network model has an accuracy of less than 100%, it can help individuals and enterprises involved in similar classification tasks.