Development of Paddy Rice Seed Classification Process using Machine Learning Techniques for Automatic Grading Machine

Speed and accuracy are key requirements for increasing agricultural productivity and for long-term economic growth, competitiveness, and sustainability. Traditional manual paddy rice seed classification is costly and unreliable because human decisions in identifying objects and issues are inconsistent, subjective, and slow. Machine vision technology provides an alternative for automated processes that is nondestructive, cost-effective, fast, and accurate. In this work, we presented a study that used machine vision technology to classify 14 Oryza sativa rice varieties. More than 3,500 seed samples were used for each cultivar, close to 50,000 seeds in total. There were three main processes: preprocessing, feature extraction, and rice variety classification. The first process started with a seed orientation method that aligned the seed bodies in the same direction. Next, a quality screening method was applied to detect seed samples with unusual physical appearance. Physical information, including shape, color, and texture properties, was then extracted as the data representation for classification. Four statistical machine learning methods (LR, LDA, k-NN, and SVM) and five pretrained deep learning models (VGG16, VGG19, Xception, InceptionV3, and InceptionResNetV2) were applied to compare classification performance. In our study, the rice dataset was classified in both subgroups and a collective group to study ambiguous relationships among the varieties. Among the statistical methods, the best accuracy was obtained from SVM, at 90.61%, 82.71%, and 83.9% for subgroup 1, subgroup 2, and the collective group, respectively, while the best accuracy among the deep learning techniques was 95.15%, from the InceptionResNetV2 model. In addition, we showed that data quality measures, namely seed orientation and quality screening, improved the overall performance of the system.
Our study demonstrated a practical design of rice classification using machine vision technology.


Introduction
Oryza sativa, known as Asian rice, is a popular rice species grown in many countries all over the world. It is divided into two subspecies according to climate conditions: indica and japonica. Rice in Thailand is of the indica subspecies, which is adapted to the humid areas of tropical Asia (India, southern China, Vietnam, Thailand, Myanmar, etc.) [1,2]. The world needs a hundred million tons of rice annually. Thailand is the world's second-largest exporter of rice, and approximately 40% of Thais who work in agriculture are rice farmers. Thailand's varied climatic conditions have produced a genetic diversity of rice varieties throughout the country. Thailand has more than 17,000 rice cultivars, which are under the responsibility of the governmental Rice Department. One of the most important tasks of the agency is to control rice quality. Seed contamination causes many problems, such as variety impurity, rice mutation, or cross-breeding, which may result in poor-quality production. Traditionally, contamination in breeding seeds has been examined by rice experts. Paddy seeds are quite small, and the differences between types are sometimes ambiguous. The experts use personal skill to judge morphological structure, shape, texture, and color in many parts of the seed before making a decision. In the examination, they classify a specific type of rice seed from a specific locality. First, they put seed samples that are supposed to be of the same type on a table and examine them with tools such as a large magnifying glass, a light source, and forceps. Then, they try to find and remove seeds with different physical characteristics, which are contaminating seeds of other types. Because of human limitations, the process requires many seed inspectors and takes a long time: it is difficult for the human eye to find small differences in one seed among many seed samples.
Over the past decade, computer vision has been widely used in various domains. Several methods in the field have shifted from statistical methods to deep learning methods because the latter offer greater accuracy for tasks such as object detection and image recognition. Deep learning helps computer scientists develop solutions in various fields rapidly. It can automatically learn features from the given data, whereas traditional machine learning methods need features to be engineered one at a time. It can also handle variability and deviations in data that are very similar. However, deep learning technology is rather complicated. It has a large network structure and requires a large amount of training data, long training times, and high-performance computing resources. In this work, we experimented with classification methods for rice cultivars and compared the classification performance of traditional machine vision methods and deep learning methods.

Related Works
Machine vision in agricultural applications related to rice quality inspection and grading has been reviewed and summarized [3][4][5]. There are various works in the field of rice quality inspection. Rice quality can be measured on both milled rice and paddy seeds, depending on the purpose. The following works presented quality inspection methods for milled rice grains or polished rice. HerathRavi and De Mel [6] analyzed four rice varieties that had quite different shapes and colors. Some previous works [7][8][9] were aimed at grains mixed with other defects.
They studied various grain defects, such as broken, chalky, and damaged seeds, and improper elements. The defective grains needed to be detected and classified to estimate the purity of the rice grain. Wah et al. [9] proposed an image processing technique with a k-NN classifier and evaluated three classes (30 images for each class). Other research [10][11][12][13] placed emphasis on detecting the chalkiness that appeared in the grain. A chalky grain is a grain whose kernel is partially opaque or milky white. The degree of chalkiness is one of the important indicators in the evaluation process. Rice grains with a high degree of chalkiness tend to break during milling, which affects their taste.
The works in references [8,14,15] focused on classifying milled rice quality between head rice and broken rice. A head rice seed has a length equal to or over three-quarters of the average, which is longer than broken rice. The quantity of the two rice types is one of the criteria for measuring milled rice quality. The study in [14] aimed to help employees distinguish between head rice and broken rice grains to evaluate different rice standards. Yao et al. [8] processed 500 head rice grains and 500 cracked rice grains of five rice kernel varieties. Zareiforoush et al. [15] applied four statistical classification techniques to four different classes of milled rice to classify the degree of milling and the length of rice grains.
The articles above focused on milled rice quality examination. The studies described next focused on paddy seed examination, the same examination target as in our work. They tried to identify the differences between rice varieties using object classification technology. The difficulty of this task depends on the complexity of the object's shape and the number of object types to be classified. Having many types of objects increases the chance of ambiguity between types. Most research studied between 3 and 6 rice varieties, while only the work by Kuo et al. [16] studied up to 30.
Anami et al. [17] were the only researchers who proposed a rough assessment of rice quality instead of single-grain classification. They attempted to classify the level of adulteration from images of mixed bulk paddy samples, with adulteration levels varying between 10% and 30%. Watanachaturaporn [18] adopted a symbolic regression algorithm to find analytic expressions that separate Khao Dawk Mali 105 (KDML105) rice from three similar rice varieties. That work studied a total of 800 images. Kuo et al. [16] proposed a sparse-representation-based classification for distinguishing over thirty rice grain varieties. Their process required analyzing a high-resolution image through a powerful optical microscope at high magnification. They could examine the appearance of a sample in greater detail, using feature traits of both the grain body and its parts (such as the husk, sterile lemmas, and brush). Their experiments evaluated 50 images of each variety and achieved 89.1% overall accuracy. However, the microscope is large, cumbersome, and expensive equipment. It also requires careful sample preparation before the rice grains are placed in the microscope. Archana et al. [19] suggested methods to extract new angular features, namely horizontal-vertical and front-rear angles, for classifying four paddy varieties. The fused features increased accuracy from 95.2% to 97.6% when evaluating 164 images of paddy seeds with a back-propagation network classifier.
Several studies [20][21][22] proposed rice seed classification techniques that analyze information from a hyperspectral imaging system. Hyperspectral imaging provides a wide range of the electromagnetic spectrum with detailed spatial relations between particular spectra. It is suited to analyzing the surface of a material. Additionally, many of these studies paid attention not only to traditional classification techniques but also to deep learning techniques (CNN). Vu et al. [22] used hyperspectral image data from a near-infrared camera to classify 6 common rice seed varieties and evaluated 108 seeds of each variety, 648 seeds in total, using SVM and Random Forest classifiers. They found that combining spectral and shape-based properties derived from the rice seeds could enhance accuracy to 84%, compared to 74% when using only visual features. Lin et al. [23] compared two techniques, CNN and traditional methods, for distinguishing rice grains of three different shapes (medium, round, and long grain). They studied 5,554 images for calibration and 1,845 images for validation. The experiments adjusted training parameters such as batch size and epochs in the CNN method. They also carried out an experiment using traditional statistical methods, which achieved classification accuracy ranging from 89% to 92%. The CNN method gave 95.5% classification accuracy, which was higher than the traditional methods. Chatnuntawech et al. [20] exploited the synergy between hyperspectral imaging and CNN. Their experiments were conducted on 2 sets of data, consisting of 232 samples from six types of milled rice and 414 samples from four types of paddy rice. The proposed method's accuracy was 86.3%. By comparison, 79.8% was obtained from the SVM technique on the paddy seed dataset, while the accuracy on the other set was slightly different.
Qiu et al. [21] identified 4 varieties of rice seeds using hyperspectral imaging with three machine learning methods, namely, k-NN, SVM, and CNN. The experiment covered two different spectral ranges, and the number of training samples was varied. A hyperspectral camera was adopted for rice variety classification in many studies. However, the instrument is costly and complex. Moreover, it requires a fast computer, sensitive detectors, and large data storage capacity.
From the above literature survey, most research identified paddy rice seed varieties from only a few varieties. Furthermore, the studies used only a few dozen or a few hundred images of each rice variety. The limited number of samples might cause data bias, and insufficient variation may prevent the trained model from generalizing well enough for practical use.

Proposed Methodology
In this work, we presented a technical study of paddy seed quality inspection that evaluated 14 varieties of Thai paddy rice, as shown in Figure 1. Popular rice varieties with economic potential were chosen, with support from the Thailand Rice Department. We analyzed more than 3,500 images of each variety from various planting sources. This study was intended as a basis for a prototype rice grading machine [24], which is currently under development. The hardware consisted of a tray for conveying seeds, a photographic part, a contaminated-seed detector, and a contaminated-seed elimination part. Many grain samples were therefore collected to cover the diversity of each rice variety, so that the potential and the limitations of the classification efficiency could be assessed for each variety. We plan to improve the technique in future efforts.
Our rice variety classification process consisted of the following steps: object orientation to align seed images in the same direction, image screening for outlier (irregular or abnormal) seeds and tilted seeds, feature extraction for retrieving physical seed properties, and rice variety classification. The system overview is presented in Figure 2. Classification performance was evaluated by comparing traditional machine learning and deep learning techniques. We investigated the performance for each rice variety in both subgroups and collective groups. We also presented preliminary results on data quality aspects such as quality screening and seed orientation. A flatbed scanner was selected because it could acquire a large amount of sample data in one shot, at a reasonable price and with acceptable image quality.
The paddy seed samples (Figure 1) were provided by the Thailand Rice Department. They were obtained from various provinces in Thailand to cover the different characteristics that vary with the growing environment and area. The samples were prepared by experts from the Rice Department, who selected only complete seeds.

4.2. Training Image Acquisition. Each training image of a rice seed was acquired from a scanner with a special box tray, which roughly separated the rice seed samples and usually kept each seed aligned horizontally, as shown in Figure 3(a). This mattered because a seed that was not laid horizontally (Figure 3(b)) did not show all of its features properly. Seeds that were not laid horizontally were considered tilted. Examples of tilted seeds are shown in Figure 3(c). The obtained images were rechecked to remove images that contained more than one seed and images in which the seed was not properly aligned horizontally. The object region in each input image was extracted by background subtraction using the Otsu thresholding method. Then, an ellipse was fitted to the coordinates of the object contour, and the approximate object size was calculated from the major-axis and minor-axis values of the ellipse. If the object size varied greatly from the average size, the object in the image was judged not to be properly aligned in the horizontal orientation.
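The thresholding and size check described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an 8-bit grayscale image in which the seed is brighter than the background, computes the Otsu threshold directly with NumPy, and uses a hypothetical relative tolerance `tol` for the size test, since the text does not state how large a deviation counted as "varied greatly". The ellipse axes are assumed to come from a separate ellipse-fitting step.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the "dark" class
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean gray level
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))

def is_size_outlier(major, minor, mean_major, mean_minor, tol=0.25):
    """Flag an object whose fitted-ellipse axes deviate from the average.

    `tol` is a hypothetical relative tolerance, not a value from the paper.
    """
    return (abs(major - mean_major) / mean_major > tol
            or abs(minor - mean_minor) / mean_minor > tol)

# Toy image: dark background (20) with a bright rectangular "seed" (200).
gray = np.full((10, 10), 20, dtype=np.uint8)
gray[3:7, 2:8] = 200
threshold = otsu_threshold(gray)
object_mask = gray > threshold   # True inside the object region
```

In practice the mask would be passed to a contour finder and an ellipse fit, and the resulting axis lengths compared against the dataset averages with `is_size_outlier`.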

Preprocessing.
After a single-seed image was obtained, preprocessing prepared a quality seed image in two parts: seed alignment and seed quality screening.
4.3.1. Seed Orientation/Alignment. This process examined the seed body and rotated it to the horizontal axis direction, so that the head-and-tail directions of all seeds were aligned the same way. Aligning the seeds in the same direction simplified feature extraction and data analysis. This process was necessary because the grain might move or be misaligned during scanning. The procedure is described as follows. The seed image was processed based on the coordinates of the object contour; therefore, the image needed to be rotated and flipped so that it appeared as shown schematically in Figure 3(a). The seed head pointed to the left, and the tail pointed to the top-right. After the alignment was adjusted, shape features that expressed the head and the tail were easily extracted by the method described in Section 4.4.
The distances between each object contour point and the object centroid were calculated. The head point and the tail point were defined as tip points: the point furthest from the centroid and the point locally furthest from the centroid on the opposite side. A common physical shape of a rice seed is shown in Figure 4. The shape around the head normally had a relatively symmetrical corner. In contrast, the shape around the tail had an asymmetrical corner and might have two small corners because the lemma partly hangs over the palea. As a result, the head tip point could be determined by calculating the object area around each tip point, shown by the red triangle in Figure 4, and comparing the sizes. After the head point was determined, the rice image could be rotated so that the head point was on the left of the image and the line between the head point and the centroid was parallel to the X-axis, as in the image shown in Figure 3(a).
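The tip-point search and the rotation angle can be sketched as below. This is an illustrative simplification under stated assumptions: the centroid is approximated by the mean of the contour points, "the opposite side" is taken to mean contour points whose direction from the centroid has a negative dot product with the first tip's direction, and the function names are hypothetical. Deciding which tip is the head (by comparing the areas around the tips) is not implemented here.

```python
import math

def tip_points(contour):
    """Return (centroid, first_tip, second_tip) for a list of (x, y) points."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)

    def dist(p):
        return math.hypot(p[0] - cx, p[1] - cy)

    first = max(contour, key=dist)            # point furthest from the centroid
    ux, uy = first[0] - cx, first[1] - cy
    # Points on the opposite side of the centroid from the first tip.
    opposite = [p for p in contour
                if (p[0] - cx) * ux + (p[1] - cy) * uy < 0]
    second = max(opposite, key=dist)          # furthest point on that side
    return (cx, cy), first, second

def rotation_to_left(centroid, head):
    """Angle in degrees that rotates the centroid-to-head vector onto the
    negative X direction (coordinate-system sense, not screen sense)."""
    angle = math.atan2(head[1] - centroid[1], head[0] - centroid[0])
    return math.degrees(math.pi - angle)

# Toy contour: an elongated shape with tips at (-12, 0) and (10, 0).
contour = [(-12, 0), (-6, 3), (0, 4), (6, 3), (10, 0), (6, -3), (0, -4), (-6, -3)]
centroid, tip_a, tip_b = tip_points(contour)
```

If the head is already on the left, `rotation_to_left` returns 0; if it is on the right, it returns 180.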

Seed Quality Screening.
It was important that the input images were of high quality because we dealt with many images and a large sample collection. Delivering inappropriate data to the system for analysis had to be avoided. Two types of seed samples had to be discarded during data preparation. The first type, the outlier seed, was caused by the raw material itself, while the other type, the tilted seed, resulted from an error in the seed scanning procedure.
(1) Outlier seeds were rice seeds that had different shapes from the standard one, e.g., a very long tail, a large crack, or smudged skin. Samples of outlier seeds are shown in Figure 5.
(2) Tilted seeds were rice seeds that had an oval shape when viewed cross-sectionally, so they might tilt up when laid on the flat surface of the scanner. Examples of tilted seeds are shown in Figures 3(b) and 3(c).
The seed quality screening process used to examine the two cases is described as follows.
We extracted features (shape, color, and texture, presented in Section 4.4) from each sample and used them as input data for DBSCAN, one of the most popular clustering techniques, to detect outliers in the sample data. DBSCAN is a density-based clustering algorithm that can detect abnormalities in multidimensional data. The algorithm identifies high-density regions and separates them from low-density regions using two parameters, eps (the radius of the neighborhood region) and MinPts (the minimum number of points). A point was assigned to a clustered region only if it had more than MinPts neighbors within the radius eps. Otherwise, a point that did not satisfy the condition was defined as an outlier point.
In the tilted case, the seed shapes had more distinctive characteristics than the diverse shapes of outlier seeds. Most tilted seeds were more symmetrical in shape along the length of the body than seeds laid horizontally. Here, the SVM technique with our shape features was applied to tackle this problem. A classification model was created to distinguish tilted seeds from horizontal seeds.
4.4.1. Shape. We used basic physical shape features as described in [25]. The extracted values are listed below. Circularity C was calculated from equation (1), where A is the object area and P is the object perimeter.
Roundness R and compactness Co were calculated from equations (2) and (3), where $D_{max}$ is the object maximum diameter.
The area factor $F_A$ was calculated from equation (4), where $A_{hull}$ is the object convex hull area, shown as the area surrounded by a red line in Figure 6(a).
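The equations themselves are not reproduced in this text, so the following sketch uses commonly cited definitions of these four descriptors as an assumption: circularity $4\pi A / P^2$, roundness $4A / (\pi D_{max}^2)$, compactness $\sqrt{4A/\pi} / D_{max}$, and area factor $A / A_{hull}$. The formulas in [25] and in the paper's equations (1) to (4) may differ in normalization; the function name is hypothetical.

```python
import math

def shape_features(area, perimeter, d_max, hull_area):
    """Basic shape descriptors under assumed, commonly used definitions.

    These formulas are NOT taken from the paper; they are standard textbook
    forms that each equal 1 for a perfect disk.
    """
    return {
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "roundness": 4.0 * area / (math.pi * d_max ** 2),
        "compactness": math.sqrt(4.0 * area / math.pi) / d_max,
        "area_factor": area / hull_area,
    }

# Sanity check on an ideal disk of radius 5, where every descriptor is 1.
r = 5.0
disk = shape_features(area=math.pi * r ** 2,
                      perimeter=2.0 * math.pi * r,
                      d_max=2.0 * r,
                      hull_area=math.pi * r ** 2)
```

An elongated seed would score well below 1 on circularity and roundness, which is what makes these values useful for separating slender from plump varieties.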
Slope factors were calculated from contour pixel coordinates. The object was divided into N equal parts. From the experiment, N was determined to be 9. The point on the left of the object represented the head point $P_{head}$, and the point on the right of the object represented the tail point $P_{tail}$, shown by pink points in Figure 6(b). We defined $S_{upper,n}$ as the nth slope factor on the upper side, $S_{lower,n}$ as the nth slope factor on the lower side, $P_{upper,n}$ as the nth dividing point on the contour on the upper side (blue points), and $P_{lower,n}$ as the nth dividing point on the contour on the lower side (green points).
The slope factors were defined for n = 2, 3, …, N − 1. Figure 6(c) illustrates the tail slope factors $S_{tail}$. $Q_{mid,n}$ is the nth point dividing the object into $N_{tail}$ equal parts on the object middle line, shown as the green line. Examples of $Q_{mid,n}$ are shown as red points. The object middle line, $L_{mid}$, was calculated from the thinned object area. $Q_{upper,n}$ (blue point) and $Q_{lower,n}$ (green point) were calculated in the same way as $Q_{mid,n}$, using the object upper contour and lower contour instead of $L_{mid}$. $N_{tail}$ was experimentally determined to be 21. For m = 1, 2, …, M, with M experimentally determined to be 7, and pos ∈ {mid, upper, lower}, $S_{tail,pos,m}$ and $S_{tail,avg}$ were calculated as in equations (7) and (8).
We also used a histogram of the object contour shape over the 180 degrees around the tail tip as shape factors. An example is shown in Figure 7. From this histogram, we calculated a frequency value for hair around the tail tip and used it as another feature.

Color.
We used the RGB color space and calculated the color features below from the color value of each pixel in the object area, where color ∈ {R, G, B}, $p_{color}(x, y)$ is the pixel value of that color in the object at coordinate (x, y), and N is the number of pixels in the object. We used the following features for each color, including the minimum (Min) and the average (Avg) pixel values. We also used color histograms in the RGB color space, which represented the color distribution on the object.
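A minimal sketch of per-channel color features is given below. It covers only the minimum, the average, and a normalized histogram for each channel; the number of histogram bins (`bins=8`) and the function name are assumptions, and the image is assumed to be an 8-bit array in R, G, B channel order with a boolean object mask.

```python
import numpy as np

def color_features(image, mask, bins=8):
    """Min, average, and normalized histogram per RGB channel inside `mask`.

    `bins` is a hypothetical choice; the paper does not state the bin count.
    """
    pixels = image[mask]                      # shape (N, 3): object pixels only
    features = {}
    for index, name in enumerate("RGB"):
        channel = pixels[:, index]
        hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
        features[name + "_min"] = int(channel.min())
        features[name + "_avg"] = float(channel.mean())
        features[name + "_hist"] = hist / hist.sum()
    return features

# Toy image: a 2 x 2 object on a black background.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[1:3, 1:3] = [100, 150, 200]
image[1, 1] = [50, 150, 200]
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
features = color_features(image, mask)
```

Restricting the statistics to the masked object keeps the scanner background from dominating the histogram.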

Texture.
We applied LBP (local binary pattern) [26] to the gray-scale object picture to extract texture features, which represented the intensity difference between each pixel and its neighboring pixels.
LBP values were tolerant of brightness differences in the picture. Finally, the computed LBP histogram was used as a feature vector.
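The basic idea can be sketched with a plain 3 × 3 LBP operator. This is an assumption about the variant: the text does not say which LBP configuration (radius, number of neighbors, uniform patterns) was used, so the code below is the simplest 8-neighbor form, with a hypothetical function name.

```python
import numpy as np

def lbp_histogram(gray):
    """Normalized 256-bin histogram of basic 3x3 LBP codes (border excluded)."""
    g = gray.astype(np.int32)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Eight neighbors, visited clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit where the neighbor is at least as bright as the center.
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
patch = rng.integers(0, 200, size=(16, 16))
hist_original = lbp_histogram(patch)
hist_brighter = lbp_histogram(patch + 30)   # uniform brightness shift
```

Because each code depends only on comparisons between a pixel and its neighbors, adding the same offset to every pixel leaves the histogram unchanged, which illustrates the brightness tolerance mentioned above.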
Another set of texture features was calculated from the GLCM (Gray-Level Cooccurrence Matrix). The normalized GLCM matrix P was computed as $p(i,j) = A_{i,j} / \sum_{i,j} A_{i,j}$, where $A_{i,j}$ is the number of occurrences in which the center pixel has value i and its neighbor has value j. From the members p(i, j) of P, we calculated contrast, correlation, homogeneity, entropy, and dissimilarity.
4.5. Classification. Both classical machine learning and deep learning techniques were applied to evaluate the efficiency of rice variety classification.
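The five GLCM statistics can be sketched as follows. The paper's own equations are not reproduced in this text, so the formulas below are the commonly used definitions and should be read as assumptions (homogeneity in particular has more than one convention). The neighbor offset (one pixel to the right), the base-2 logarithm for entropy, and the function names are also illustrative choices.

```python
import numpy as np

def glcm(gray, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for the neighbor at offset (dx, dy)."""
    counts = np.zeros((levels, levels), dtype=float)
    h, w = gray.shape
    for y in range(h - dy):
        for x in range(w - dx):
            counts[gray[y, x], gray[y + dy, x + dx]] += 1
    return counts / counts.sum()

def glcm_features(P):
    """Contrast, dissimilarity, homogeneity, entropy, correlation (assumed forms)."""
    i, j = np.indices(P.shape)
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * P).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * P).sum())
    nonzero = P[P > 0]
    spread = sd_i * sd_j
    return {
        "contrast": ((i - j) ** 2 * P).sum(),
        "dissimilarity": (np.abs(i - j) * P).sum(),
        "homogeneity": (P / (1.0 + (i - j) ** 2)).sum(),
        "entropy": -(nonzero * np.log2(nonzero)).sum(),
        # Correlation is undefined for a perfectly flat image.
        "correlation": (((i - mu_i) * (j - mu_j) * P).sum() / spread
                        if spread > 0 else float("nan")),
    }

# Toy two-level image: horizontal pairs (0,0), (0,1), (1,1) occur equally often.
gray = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1]])
texture = glcm_features(glcm(gray, levels=2))
```

For real seed images, the gray levels are usually quantized to a small number of levels first so that the matrix stays dense enough to give stable statistics.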

Statistical Classification Method.
In the machine learning approach, the features described in Section 4.4 were extracted from pixel values, position and orientation, color, texture, and shape. Four classifiers were evaluated: LDA, LR, k-NN, and SVM. In the process, Principal Component Analysis (PCA), a well-studied algorithm for dimensionality reduction, was applied to the extracted features to reduce redundant and irrelevant features while limiting information loss. PCA projects the original data onto a new subspace of fewer dimensions while preserving the most important information in the original data.
To estimate the performance of each method, the average accuracy was estimated by k-fold cross-validation. It was applied to give our model an opportunity to train on multiple train-test splits. The dataset was randomly divided into k sets of approximately equal size. The evaluation was repeated k times, each time using k − 1 sets for training and the remaining set for testing.
Logistic regression (LR) [27] is one of the most popular algorithms for classification problems. The technique is a predictive analysis algorithm based on the concept of probability. It maps predicted values through a prediction function, the sigmoid function, to return a probability between 0 and 1.
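The k-fold procedure described above can be sketched in a few lines. This is an illustrative sketch, not the authors' code (which used Scikit-learn); the function name and the fixed random seed are assumptions.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # random division of the dataset
    folds = [indices[i::k] for i in range(k)]  # k sets of near-equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(n_samples=10, k=3))
```

Each sample is used for testing exactly once, and the reported accuracy is the mean of the k per-fold accuracies.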
Linear discriminant analysis (LDA) [28] is a generalization of Fisher's linear discriminant, a method used in statistics, pattern recognition, and machine learning. LDA is a linear transformation technique that projects data onto a lower-dimensional subspace that maximizes the separation between multiple classes. It uses Bayes' theorem to estimate the probability. LDA assumes a normal distribution for each class, a class-specific mean, and a common variance.
The k-nearest neighbors algorithm (k-NN) [29] has been widely used in classification problems because it is a simple, effective, and nonparametric method. With all samples plotted in a feature space and k set to a fixed number, k-NN uses the k nearest neighbors to decide which class a new, unlabeled sample point belongs to. It calculates the distances between an unknown sample and the samples in the predefined training set. Therefore, k-NN needs the value of k and a distance measure to be defined.
The support-vector machine (SVM) [29] classifier is one of the most efficient machine learning algorithms and is widely used for pattern recognition. The algorithm finds the optimal hyperplane that maximizes the margin between two classes. SVM can efficiently perform nonlinear classification by using a kernel function to transform the data into a high-dimensional feature space. In this work, we used the popular RBF (radial basis function) kernel because it performs well on a large variety of problems. Two parameters had to be determined in the SVM model, C and gamma. To get the best model, the optimum values of C and gamma were determined by grid search with cross-validation.
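A grid search over C and gamma with cross-validation can be sketched with Scikit-learn, the library named in the experimental setup. Everything specific here is an assumption for illustration: the synthetic data stands in for the seed feature vectors, the feature scaling step is an addition not described in the paper, and the number of PCA components and the grid values are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the extracted shape/color/texture feature vectors.
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)

pipeline = Pipeline([
    ("scale", StandardScaler()),     # assumption: scaling is not mentioned in the paper
    ("pca", PCA(n_components=10)),   # hypothetical number of components
    ("svm", SVC(kernel="rbf")),
])

# Hypothetical grid; the paper does not list the values it searched.
param_grid = {
    "svm__C": [0.1, 1, 10, 100],
    "svm__gamma": [0.001, 0.01, 0.1],
}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)
best_params = search.best_params_
best_score = search.best_score_      # mean cross-validated accuracy
```

Putting PCA inside the pipeline means it is refitted on each training fold, so no information from the held-out fold leaks into the projection.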

Deep Learning.
Recently, convolutional neural networks (CNNs), a form of deep learning [30], have become a popular and effective method for image analysis. Deep learning can learn high-level features directly from input image data through the many hidden layers of the network architecture. A CNN relies on a huge number of parameters, which need to be tuned to achieve an optimum solution. In this paper, we used convolutional neural networks, which combine image convolution with deep neural networks, as classification algorithms. We used networks that had been fully trained on ImageNet, a large open image dataset, because of their ability in image classification. The ImageNet [31] project is a large visual database designed for visual object recognition software research. A model trained on ImageNet can classify images among 1,000 classes of real-world images. This makes such models powerful and widely used in recent computer vision research. We studied five popular models pretrained on ImageNet data: VGG16, VGG19 [32], Xception [33], InceptionV3 [34], and InceptionResNetV2 [35]. The details of the networks are described below.
VGG has a uniform architecture. VGG16 and VGG19 consist of 13 and 16 convolutional layers, respectively, and end with three fully connected layers. Both networks use only small 3 × 3 filters with stride 1, followed by multiple nonlinear layers. With many layers, the network can learn more complicated features from images. However, it has a very large number of weight parameters.
The Inception network was an important milestone in the development of CNN classifiers. Inception V1-V4 and InceptionResNet are popular models with some differences from each other. The inception module acts as a multilevel feature extractor by computing 1 × 1, 3 × 3, and 5 × 5 convolutions within the same module of the network. InceptionResNet is a hybrid that combines the Inception architecture with the residual connections of ResNet (Residual Neural Network).
Xception [33] is an extension of the Inception architecture that replaces the standard Inception modules with depthwise separable convolutions. Because it does not need to perform convolution across all channels, the model has fewer connections and is lighter than the models described above.
Deep learning performance could be improved by increasing the amount of training data. Augmentation was a technique to increase the size of a training dataset. It provided various image functions such as shift, flip, rotation, zoom, rescale, and brightness. However, our research data had already been rotated and flipped into the same alignment. Therefore, we did not use flipping and rotation in the augmentation process.
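An augmentation step restricted to shift and brightness, with no flipping or rotation, can be sketched with NumPy as below. This is not the authors' pipeline (which used Keras); the function names, the maximum shift, and the brightness range are hypothetical values chosen for illustration.

```python
import numpy as np

def shift_image(image, dx, dy, fill=0):
    """Translate an image by (dx, dy) pixels, padding exposed areas with `fill`."""
    out = np.full_like(image, fill)
    h, w = image.shape[:2]
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def adjust_brightness(image, factor):
    """Scale pixel intensities and clip to the 8-bit range."""
    return np.clip(image.astype(float) * factor, 0, 255).astype(np.uint8)

def augment(image, rng, max_shift=2, brightness=(0.8, 1.2)):
    """Random shift and brightness change; orientation is left untouched."""
    dx = int(rng.integers(-max_shift, max_shift + 1))
    dy = int(rng.integers(-max_shift, max_shift + 1))
    factor = rng.uniform(*brightness)
    return adjust_brightness(shift_image(image, dx, dy), factor)

image = np.zeros((3, 3), dtype=np.uint8)
image[1, 1] = 200
shifted = shift_image(image, dx=1, dy=0)       # bright pixel moves one column right
brightened = adjust_brightness(image, 1.5)     # 200 * 1.5 clips to 255
augmented = augment(image, np.random.default_rng(0))
```

Leaving out flips and rotations preserves the head-left alignment produced by the orientation step, so the network never sees a seed in an orientation that the preprocessing would not produce.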

Results and Discussion
Our work was developed on Windows OS, running on an Intel Core i7 3.2 GHz CPU with 16 GB of RAM. It was implemented with OpenCV in the C++ programming language, and with the Scikit-learn and Keras Python libraries for machine learning and deep learning. For the deep learning techniques, the entire training process was performed on Linux OS using an NVIDIA Tesla K80 graphics card with CUDA cores and 24 GB of GDDR5 memory for faster training. Rice seed samples were put in a special tray box and scanned on a flatbed scanner at 600 DPI resolution. An average seed occupied approximately 500 × 200 pixels. In the experiment, we divided the rice dataset into subgroups to analyze the ambiguity among rice varieties and the potential of the classification techniques. The rice varieties were grouped by planting area, because varieties grown in the same area had an opportunity to be mixed together. The dataset was organized into three groups. The first group (GrpI) consisted of CNT1, KDML105, PTT, RD15, RD33, RD51, and RD6. The second group (GrpII) consisted of PSL2, RD31, RD41, RD47, RD49, RD57, and SPR1. The last group (GrpAll) combined the first two groups, a total of 14 varieties. In each class, more than 3,500 seeds were photographed, and close to 50,000 images in total were analyzed. This results section is divided into three parts as follows.

Preprocessing
5.1.1. Seed Orientation. The seed orientation method described in Section 4.3.1 was evaluated on 20,000 sample images drawn randomly from all varieties and achieved an accuracy of 98.32%. The misdetections fell into two cases: (1) seeds that had a rather narrow tail with a shape similar to the head and (2) seeds that had a uniformly wide body from the head to the tail. The errors were consistent with the assumption behind the proposed method, which relied on comparing the asymmetry of the head and tail shapes. Generally, the head of the seed should have higher symmetry than the tail. Therefore, if the seed body had a uniform shape at both ends, it was difficult to analyze. Such seeds required additional features, such as the position of the sterile glumes on the seed body or the curvature around the tail, to improve efficiency.

Outlier and Tilted Seed Screening. After all of the seed images were rotated to the same orientation, images in which the seed had an irregular (outlier) or tilted appearance were discarded from the dataset. For outlier seeds, we applied the DBSCAN technique to all shape, color, and texture features. The two parameters of the algorithm (eps and MinPts) were adjusted. In the experiment, the eps and MinPts values were varied in the ranges from 0.2 to 10 and from 4 to 20, respectively. An optimum range was found at eps 0.6-1.2 and MinPts 10-15, which could separate only half of the total number of outliers. Most of the separated seeds had a distinctly different size or shape, such as seeds that were longer than normal, seeds with a long tail tip, and seeds with large cracks or smudges. However, these types of seeds made up only a small proportion, 1.6%, because the database had been prescreened for seed quality control. Such seeds inevitably occur naturally. In the future, if enough of these seeds are available, they will be studied further.
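The outlier detection step can be illustrated with a compact DBSCAN written in plain Python. This is a teaching sketch, not the authors' implementation: it counts a point as part of its own neighborhood and uses "at least `min_pts`" as the core-point test (the convention used by common libraries, which may differ slightly from the wording above), and the toy data and parameter values are unrelated to the eps and MinPts ranges reported for the seed features.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster index, or -1 for an outlier."""
    NOISE = -1
    labels = [None] * len(points)          # None means "not visited yet"

    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nearby = neighbors(i)
        if len(nearby) < min_pts:
            labels[i] = NOISE               # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nearby if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster         # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby_j = neighbors(j)
            if len(nearby_j) >= min_pts:    # j is a core point: keep expanding
                queue.extend(nearby_j)
    return labels

# Toy data: a dense 4 x 4 grid of points plus one isolated point.
points = [(x * 0.5, y * 0.5) for x in range(4) for y in range(4)] + [(10.0, 10.0)]
labels = dbscan(points, eps=0.6, min_pts=4)
```

Points labeled -1 are the candidates for removal; in the seed dataset, the feature vectors would normally be scaled first so that eps is meaningful across shape, color, and texture dimensions.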
Tilted seeds had a more symmetrical shape along the body length than seeds that lay horizontally. Here, we created a classification model to distinguish tilted seeds from horizontal seeds, using shape features and the SVM technique. A total of 10,000 seeds sampled from all classes were divided in the ratio of 70 : 30 into training and test sets. The resulting model reached an accuracy of 96.98%. It was used to reexamine the images and removed the tilted seeds, which amounted to about 10% of the dataset. We further tested this by placing 500 samples on a curved tray and counting the tilted seeds manually. Only 2% of the seeds in this small sample were tilted. Therefore, using a curved container in the design of the grading machine would help to reduce this problem.
We also studied the effect of the discarded seeds on the classification performance for the rice varieties. The performance before and after removing these seeds was compared using the SVM method on the GrpI and GrpII datasets. In the dataset, these seeds were mixed into each rice variety in different proportions. There were 714 outlier seeds in total, of which 53.4% were in GrpI and 46.6% in GrpII. In addition, there were 4,265 tilted seeds, of which 37.3% were in GrpI and 62.7% in GrpII. In GrpI, the accuracy before and after filtering these seeds from the dataset was 87.90% and 89.2%, while in GrpII, it was 80.99% and 84.31%, respectively. We found that keeping these seeds in the dataset decreased the average accuracy in both groups by 2-3%. The three most affected varieties were RD57, RD49, and PSL2 in GrpII, whose accuracy was reduced by 4-7%. Although to the human eye these seeds appeared unsuitable for classification, their physical features still retained differences that were unique to each variety. The average accuracy was therefore reduced by only a few percent, less than the proportion of discarded seeds, which was up to 10%.

Classification Results from Statistical Techniques.
After the seed object was aligned in the horizontal direction, its physical characteristics were extracted using the method described in Section 4.4. Principal Component Analysis (PCA) was then applied to the extracted features to remove unimportant feature dimensions. Because some data were discarded in the seed screening process, the number of samples differed between classes. Therefore, we reduced the number of samples in every class to that of the smallest class to provide balanced information. The samples were randomly selected in equal numbers, 2,900 per class, and divided into a training set and a validation set in proportions of 80% and 20%, respectively. Model performance in this step was assessed using 5-fold cross-validation and the confusion matrix.
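The preparation steps above (balancing to the smallest class, PCA, an 80/20 split, and 5-fold cross-validation) can be sketched with scikit-learn. The data here are synthetic, the helper `balance_classes` is hypothetical, and LDA is used only as an example classifier; how the study combined the 80/20 split with cross-validation is not stated, so running cross-validation on the training part is our assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

def balance_classes(X, y, rng):
    """Randomly keep as many samples per class as the smallest class has."""
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

rng = np.random.default_rng(0)
# Synthetic, unbalanced 3-class data standing in for the seed features.
sizes = (120, 100, 80)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(n, 20))
               for c, n in enumerate(sizes)])
y = np.repeat(np.arange(len(sizes)), sizes)

X_bal, y_bal = balance_classes(X, y, rng)
X_train, X_val, y_train, y_val = train_test_split(
    X_bal, y_bal, test_size=0.2, stratify=y_bal, random_state=0
)

# PCA sits inside the pipeline so that it is refit on each training fold.
model = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
val_accuracy = model.fit(X_train, y_train).score(X_val, y_val)
```

Placing PCA inside the pipeline keeps the validation folds out of the projection fit; fitting PCA once on all data is a simpler alternative but lets information from the held-out samples leak into the features.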

For classification with k-NN, we used the Euclidean distance as the distance function and optimized the parameter k over the range 5 to 45 in steps of 5 [5, 10, 15, ..., 45]. We found that k = 15 gave the best accuracy in our experiment. For the SVM classification method, the grid search ranges were gamma = [0.0001, 0.001, 0.01, 0.1, 1, 10, 100] and C = [0.001, 0.01, 0.1, 1, 10, 100, 1000]. The highest accuracy was achieved at gamma = 0.001 and C = 10, so we conducted the experiments with these parameters. Table 1 compares the performance of the four classification techniques for various numbers of PCA dimensions. Across all 3 groups of rice varieties, the accuracy of every classification method tended to increase with the number of PCA dimensions and changed little once that number reached 350. The LR and LDA methods started at a lower accuracy than the SVM method but increased more clearly than SVM as the number of PCA dimensions increased. The accuracy of the k-NN method changed by only 2-3% even as the number of PCA dimensions increased. GrpI was classified more accurately than GrpII, while the accuracy for GrpAll, which contained all 14 rice varieties, was close to that of GrpII. The SVM method achieved a 2-3% higher accuracy than the LR and LDA methods and 17-25% higher than the k-NN method. Among all studied methods, the best accuracy was obtained by the SVM technique with 600 PCA dimensions: 90.61%, 82.71%, and 83.9% in GrpI, GrpII, and GrpAll, respectively.
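The two parameter searches can be sketched with scikit-learn's `GridSearchCV`, using the k, gamma, and C grids listed above. The data are synthetic, and the RBF kernel is an assumption on our part, since the kernel is not named in this section.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic 3-class data standing in for the PCA-reduced seed features.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8, n_classes=3, random_state=0
)

# k-NN: Euclidean distance, k in [5, 10, ..., 45].
knn_grid = {"n_neighbors": list(range(5, 50, 5))}
knn_search = GridSearchCV(KNeighborsClassifier(metric="euclidean"), knn_grid, cv=5)
knn_search.fit(X, y)

# SVM: the gamma and C grids from the text (kernel="rbf" is an assumption).
svm_grid = {
    "gamma": [0.0001, 0.001, 0.01, 0.1, 1, 10, 100],
    "C": [0.001, 0.01, 0.1, 1, 10, 100, 1000],
}
svm_search = GridSearchCV(SVC(kernel="rbf"), svm_grid, cv=5)
svm_search.fit(X, y)

best_k = knn_search.best_params_["n_neighbors"]
best_svm = svm_search.best_params_
```

On real data, the selected values would be read from `best_k` and `best_svm`; the values found on this synthetic set have no bearing on the reported optimum of k = 15, gamma = 0.001, and C = 10.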
The confusion matrices for each group, obtained with the SVM method, are shown in Figure 8(a)-8(c). In Figure 8(a), the variety with the highest accuracy in GrpI was RD6 at 99.14%, followed by two pairs of varieties, (CNT1 and PTT) and (RD33 and RD51), with accuracy rates of 96.2-96.55% and 88.98-88.28%, respectively. RD15 was confused with RD33 and RD51 at a rate of 6%. In Figure 8(b), the 3 best-identified varieties were RD47, RD49, and RD41 at 91.74%, 90.53%, and 89.29% accuracy, respectively, while RD31 was the worst classified (66.38%). The three varieties RD31, RD57, and SPR1 were the most ambiguous, with confusion rates in the range of 9.0-16.03%. In Figure 8(c), the accuracy remained at 83%, close to that of GrpII, although GrpAll contained all 14 varieties. This experiment showed that GrpI and GrpII had some dependency on each other, with relatively low ambiguity in the range of 1-4%. Four varieties, RD31, RD15, RD57, and SPR1, had a low accuracy rate and were the most ambiguous of the 14 rice varieties in the study. Their accuracy ranged between 68.33% and 77.55%. From the confusion matrix, we could see that RD31 was confused with RD57 and SPR1, with false-positive prediction rates of 10.67% and 14.63%, respectively.
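Per-variety accuracy and the most ambiguous pair can be read from a confusion matrix as in the following sketch. The 3 × 3 matrix and the class names are made up for illustration and are not taken from Figure 8; rows are assumed to hold the true class and columns the predicted class.

```python
import numpy as np

def per_class_accuracy(cm):
    """Fraction of each true class (row) that was predicted correctly."""
    cm = np.asarray(cm, dtype=float)
    return np.diag(cm) / cm.sum(axis=1)

def most_confused(cm, names):
    """Return (true class, predicted class, rate) for the largest off-diagonal rate."""
    cm = np.asarray(cm, dtype=float)
    rates = cm / cm.sum(axis=1, keepdims=True)   # row-normalized confusion rates
    np.fill_diagonal(rates, 0.0)                 # ignore correct predictions
    i, j = np.unravel_index(np.argmax(rates), rates.shape)
    return names[i], names[j], float(rates[i, j])

# Toy counts for three hypothetical varieties A, B, and C.
names = ["A", "B", "C"]
cm = [[90, 6, 4],
      [10, 80, 10],
      [5, 15, 80]]

accuracy = per_class_accuracy(cm)        # 0.9, 0.8, 0.8
true_cls, pred_cls, rate = most_confused(cm, names)   # C is most often taken for B
```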

Classification Results from Deep Learning Methods.
In this section, the classification performance for the rice varieties was evaluated using deep learning techniques, which had a complex network structure and many parameters. Training a classification model with these techniques took several days, in contrast to the statistical techniques, which needed only a few hours. Therefore, because of processing time limitations, this study reports performance only for GrpAll. We adopted five network architectures with weights pretrained on ImageNet, namely, VGG16, VGG19, Xception, InceptionV3, and InceptionResNetV2. The image size was about 250 × 250 pixels, and the number of epochs was 200, which gave enough training time to show the performance trend over the entire training dataset. We also defined the number of frozen layers in each model. Freezing a layer prevents its weights from being modified during training. If no layers were frozen, the weights in all network layers were updated, and training took longer. In addition, we studied data quality factors that affected the accuracy and the training time of the model, such as the orientation of the seeds and the image size used for training.
We trained the deep learning models starting from weights pretrained on the ImageNet dataset. From the 2,900 samples of each rice variety, we used 2,320 (80%) images per class to build each model: 2,030 (70%) images for training and 290 (10%) images for validation. We then tested the accuracy of each model on the test dataset, which contained 580 (20%) images per class. For each network architecture, we evaluated the model with the weights obtained at the 200th epoch, whose accuracy is denoted Acc_200, and the model with the weights that gave the lowest validation loss, whose accuracy is denoted Acc_l. The experimental results are shown in Table 2, and the validation accuracy during training is shown in Figure 9.
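The split sizes and the two reported accuracies can be illustrated with a short sketch. The per-epoch numbers below are invented for the example, and the helper names `split_counts` and `pick_checkpoints` are hypothetical; the sketch only shows how the accuracy at the final epoch and the accuracy at the epoch with the lowest validation loss would be selected from a training history.

```python
def split_counts(n_per_class, fractions=(0.7, 0.1, 0.2)):
    """Number of training, validation, and test images per class."""
    return tuple(round(n_per_class * f) for f in fractions)

def pick_checkpoints(val_loss, test_acc):
    """Select the test accuracy at the last epoch and at the lowest validation loss."""
    last = len(val_loss) - 1
    best = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        "acc_last": test_acc[last],          # counterpart of the final-epoch accuracy
        "acc_min_val_loss": test_acc[best],  # counterpart of the lowest-loss accuracy
        "best_epoch": best + 1,              # epochs are counted from 1
    }

n_train, n_val, n_test = split_counts(2900)   # (2030, 290, 580)

# Made-up 5-epoch history, for illustration only.
val_loss = [0.90, 0.50, 0.30, 0.35, 0.40]
test_acc = [0.60, 0.80, 0.90, 0.89, 0.88]
picked = pick_checkpoints(val_loss, test_acc)
```

In this toy history the validation loss is lowest at epoch 3, so the lowest-loss checkpoint differs from the final one, which is the situation in which reporting both accuracies is informative.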
In Table 2, the best classification accuracy, almost 95%, was obtained from InceptionResNetV2 and Xception, both trained without freezing any layers, while the remaining models performed worse, in the range of 85-90%. In terms of training time, InceptionResNetV2 and Xception took 1.5-2.75 times longer than the other models over the same number of epochs. However, the accuracy of these two models increased rapidly from epoch 10 and started to stabilize at around epoch 50, as shown in Figure 9. Figure 10 shows the confusion matrix of the classification results from InceptionResNetV2. Most rice varieties could be distinguished well from each other. The 3 varieties with the lowest accuracy were SPR1, RD31, and RD57, at 85.17%, 88.97%, and 90.69%, respectively. These three varieties were the same group that was ambiguous with the SVM method, with ambiguity rates between 21-8.62%.
To study the effect of the training image size, Table 3 shows how the performance changed with the image size. We selected the VGG19 model for this experiment because it gave good performance in a short training time. Each reduction in image size shortened the training time by a factor of about 1.3. The performance (Acc_l) showed almost no difference when the image size changed from 250 to 200 pixels, but it was worse by 3.61%, 4.14%, and 10.7% when the image size dropped to 150, 100, and 50 pixels, respectively. In conclusion, reducing the image size allowed faster training. However, the image size should not be lower than 150 pixels because smaller sizes decreased the accuracy significantly.
We also evaluated the effectiveness of our seed orientation process by comparing the performance of two models. The first model was trained on our dataset after the seed orientation process had been applied, and the second, a baseline model, was trained on the same dataset without the seed orientation process. The augmentation for the first model included shifting, zooming, rescaling, and brightness adjustment, and the augmentation for the second model included shifting, zooming, rescaling, brightness adjustment, flipping, and rotating. We ran the test with the InceptionResNetV2 model using a short training time of 50 epochs. A dataset of 1,000 samples per class was used in this evaluation, divided into proportions of 80%, 10%, and 10% for training, validation, and testing, respectively. We found that our seed orientation technique improved the accuracy by 1.3% compared with the nonoriented baseline, a slight improvement.

Conclusions
In this paper, we developed and tested a quality inspection method to identify 14 rice cultivars from a database of nearly fifty thousand seeds received from many planting areas. We implemented a method to handle a large number of seeds, preparing data of appropriate quality before it was used as input to machine learning techniques, in order to improve classification ability. The seed orientation improved the classification accuracy in the deep learning experiment by 1.3%, and the seed screening improved the classification accuracy of the statistical methods by 2-3%. For the classification of rice varieties, we used up to 2,900 data samples per variety for training and testing the models. Several machine learning methods were evaluated and compared to find the one with the best performance. In the experiment using statistical methods, SVM performed best, with an accuracy of 83.9% when the number of PCA dimensions was set to 600. In the experiment using deep learning methods, the InceptionResNetV2 model with the weights that gave the lowest validation loss performed best, with a classification accuracy of 95.14%. The results showed that the deep learning method performed up to 11.24% better than the traditional method. Based on these results, we will further improve our recently developed seed quality inspection machine to make it more efficient.

Data Availability
The data that support the findings of this study are from the Rice Department, Ministry of Agriculture and Cooperatives.

Conflicts of Interest
The authors declare that they have no conflicts of interest.