Classification system for rain fed wheat grain cultivars using artificial neural network

Artificial neural network (ANN) models have found wide application in prediction, classification, system modeling and image processing. Image analysis based on texture, morphology and color features of grains is essential for various applications in the wheat grain industry and in cultivation. To classify rain fed wheat cultivars using an artificial neural network with different numbers of neurons in the hidden layers, this study was carried out at Islamic Azad University, Shahr-e-Rey Branch, during 2010 on 6 main rain fed wheat cultivars grown in different environments of Iran. First, data on 6 color features, 11 morphological features and 4 shape factors were extracted; these candidate features were then fed to a multilayer perceptron (MLP) neural network. The topological structure of this MLP model consisted of 21 neurons in the input layer, 6 neurons (Sardari, Sardari 39, Zardak, Azar 2, ABR1 and Ohadi) in the output layer and two hidden layers with different numbers of neurons (21-30-10-6, 21-30-20-6 and 21-30-30-6). The average classification accuracy for rain fed wheat grain cultivars was 86.48%, and it increased to 87.22% after feature selection with the UTA algorithm in the 21-30-20-6 structure. The results indicate that the combination of ANN, image analysis and the optimum model architecture 21-30-20-6 had excellent potential for cultivar classification.


INTRODUCTION
Wheat is one of the major staple foods all over the world because of its agronomical adaptability and the ability of its flour to be made into various food materials. In the case of crops such as wheat, where end use depends on use of a specific variety, identification of that variety is crucial. Variety identification is also important for plant breeders and geneticists. The morphological characters of grains are heritable in nature (Harper et al., 1970) and play an important role in variety identification (Shouche et al., 2001).
External features describe the boundary information.
The boundary co-ordinates of the object can be used to extract morphological features (Jayas et al., 2000). Morphological features like roundness, elongation, compactness, etc., are widely used in automatic grading, sorting, detection and quality inspection of products in the food industry (Jayas et al., 2000). Three features are commonly used to measure object size in food quality evaluation: area, perimeter, and length and width. The most basic and convenient measure of size is the area. The perimeter of an object is particularly useful for discriminating between objects with simple and complex shapes. Area and perimeter measurements are easily computed during the extraction of an object from a segmented image (Sun and Du, 2004).
Various grading systems using different morphological features for the classification of different cereal grains and varieties have been reported in the literature (Barker et al., 1992a, b, c, d; Majumdar and Jayas, 2000; Myers and Edsall, 1989; Sapirstein and Bushuk, 1989; Sapirstein et al., 1987; Symons and Fulcher, 1988a, b; Zapotoczny et al., 2008). Huang et al. (2004) proposed an identification method based on Bayes decision theory to classify rice varieties using color and shape features with 88.3% accuracy. Majumdar and Jayas (2000) developed classification models by combining two or three feature sets (morphological, color and textural) to classify individual kernels of Canada western red spring (CWRS) wheat, Canada western amber durum (CWAD) wheat, barley, oat and rye.
Image analysis based on texture, morphology and color features of grains is essential for various applications in the grain industry, including discrimination of wheat classes, assessment of grain quality and detection of insect infestation (Tahir et al., 2007).
Several researchers have worked on the development of machine vision systems for class and variety identification of grains (Neuman et al., 1987, 1989a, b; Manickavasagan et al., 2008). Zayas et al. (1986) classified three classes of wheat from the USA (hard red winter, soft red winter and hard red spring) and their varieties, using kernel length, width, length ratio, tangent, sine and arc length of parabolic segment, with 77 to 85% accuracy. Classification of five Australian wheat varieties using size and shape features attained 44 to 96% accuracy (Myers and Edsall, 1989).
Artificial neural networks (ANNs) are a mathematical tool that tries to represent low-level intelligence in natural organisms; they form a flexible structure capable of making a non-linear mapping between input and output spaces (Rumelhart et al., 1986).
Because this method can be trained with numerical sample data consisting only of inputs and corresponding outputs, it holds promise for solving problems in agriculture, especially grain identification. The inputs to the ANN can be given in terms of data obtained from digital images, which provide quantitative estimates of the morphological features of grain and offer scope to bring objectivity to the process of identification.
So, in this study, the main aim was to develop a digital imaging system and an ANN capable of measuring the geometric and shape related parameters needed to distinguish between rain fed wheat grain cultivars.

MATERIALS AND METHODS
To identify rain fed wheat grain (Triticum aestivum L.) cultivars using an artificial neural network, and to investigate different numbers of neurons in the hidden layers before and after applying the UTA algorithm, this study was carried out at Islamic Azad University, Shahr-e-Rey Branch, during 2010 on 6 wheat cultivars (Sardari, Sardari 39, Zardak, Azar 2, ABR1 and Ohadi), which were grown in different environments of Iran to produce variation in grain shape and size covering the range encountered in reality (Figure 1).
Finally, after training of the neural network, the more effective features were selected by the UTA algorithm (Utans et al., 1995). By trying different numbers of neurons in each hidden layer, 21-30-20-6 was identified as the optimum model architecture. The overall system architecture is shown in Figure 2.

Image acquisition
Digital image analysis offers an objective and quantitative method for the estimation of morphological parameters. This process uses digital images to measure the size of individual grains and to extract features and shape related information mathematically from the images.
A Panasonic camera (Model SDR-H90) with a zoom lens of 1.5 to 105 mm focal length was used to take the images of the wheat grain samples. The image format was 24 bit color JPEG with a resolution of 360×640 pixels. The camera was mounted over the illumination chamber on a stand which provided easy vertical movement.
The distance between the camera and each grain sample was 27 cm. In order to reduce the influence of surrounding light, a black illumination chamber was placed between the samples and the lens, and 90 images of each variety were taken. The acquired images of the rain fed wheat grain cultivars are shown in Figure 1.

Feature extraction
In this study, color features, morphological features and shape factors of individual wheat grains were extracted using MATLAB version 7.8.

Color feature extraction
An RGB image, sometimes referred to as a truecolor image, is stored as an m-by-n-by-3 data array that defines red, green and blue color components for each individual pixel.
MATLAB and the image processing toolbox software do not support the HSI color space (hue saturation intensity). Therefore, we used the HSV color space, which is very similar to HSI. From the red (R), green (G) and blue (B) color bands of an image, hue (H), saturation (S) and value (V) were calculated from Max and Min, the largest and smallest of the R, G and B values of each pixel. The mean value of R (Rm), the mean value of G (Gm), the mean value of B (Bm), the mean value of H (Hm), the mean value of S (Sm) and the mean value of V (Vm) were calculated in an image (Image Processing Toolbox, 2007).
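The six color features can be illustrated with a short Python sketch. It uses the standard hexcone RGB-to-HSV conversion, which is assumed (not confirmed by the text) to match the equations the authors applied in MATLAB. The function names, the scaling of all channels to [0, 1] and the plain arithmetic mean of hue are assumptions made for illustration only.

```python
def rgb_to_hsv(r, g, b):
    """Convert one pixel (r, g, b in [0, 1]) to (h, s, v), each in [0, 1].

    Standard hexcone formulas; assumed, not taken from the paper.
    """
    mx, mn = max(r, g, b), min(r, g, b)
    delta = mx - mn
    v = mx                                  # value is the largest band
    s = 0.0 if mx == 0 else delta / mx      # saturation
    if delta == 0:
        h = 0.0                             # hue is undefined for gray; use 0
    elif mx == r:
        h = ((g - b) / delta) % 6
    elif mx == g:
        h = (b - r) / delta + 2
    else:
        h = (r - g) / delta + 4
    return h / 6, s, v


def mean_color_features(pixels):
    """Return (Rm, Gm, Bm, Hm, Sm, Vm) for a list of 8-bit (R, G, B) pixels.

    Assumption: the means are taken over whatever pixels are passed in;
    the paper does not say whether background pixels were excluded.
    """
    n = len(pixels)
    rgb = [(r / 255, g / 255, b / 255) for r, g, b in pixels]
    hsv = [rgb_to_hsv(*p) for p in rgb]
    means_rgb = [sum(p[i] for p in rgb) / n for i in range(3)]
    means_hsv = [sum(p[i] for p in hsv) / n for i in range(3)]
    return tuple(means_rgb + means_hsv)


# Two-pixel toy "image": one pure red and one pure green pixel.
features = mean_color_features([(255, 0, 0), (0, 255, 0)])
```

Hue is an angular quantity, so a plain mean is only a rough summary when hues straddle the red wrap-around point; the sketch keeps the plain mean because the text describes Hm simply as the mean value of H.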

Morphological feature extraction
The following morphological features were extracted from labeled images of individual grains of the rain fed wheat cultivars. Geometry related features including area, perimeter and major and minor axis lengths were measured from the binary images (Paliwal et al., 2001; Zhao-Yan et al., 2005).
Area (A): the number of pixels contained within the boundary of the region.
Perimeter (P): the length of the boundary of the region.
Minor axis length: the length of the longest line that can be drawn through the object perpendicular to the major axis.
Convex area (C): the number of pixels in the smallest convex polygon that can contain the wheat grain region.
Solidity (S): the proportion of the pixels in the convex hull that are also in the grain region.
Extent (Ex): the proportion of the pixels in the bounding box that are also in the grain region.
Roundness (R) and compactness (CO): the compactness provides a measure of the object's roundness.
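As a rough illustration of how such pixel-count features follow from a binary image, the Python sketch below computes the area, the extent and the equivalent diameter (the diameter of a circle with the same area as the grain region, one of the 11 morphological features used in this study). The function name and the list-of-lists mask format are hypothetical, and the mask is assumed to contain exactly one grain region; this is not the MATLAB routine used by the authors.

```python
import math


def region_features(mask):
    """Compute simple size features of the single region in a binary mask.

    mask: list of rows, each a list of 0/1 values (1 = grain pixel).
    Assumes exactly one labeled region is present.
    """
    coords = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not coords:
        raise ValueError("mask contains no foreground pixels")

    area = len(coords)                          # pixel count of the region
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    bbox_height = max(rows) - min(rows) + 1
    bbox_width = max(cols) - min(cols) + 1

    return {
        "area": area,
        # share of bounding-box pixels that belong to the region
        "extent": area / (bbox_height * bbox_width),
        # diameter of a circle whose area equals the region's area
        "equivalent_diameter": math.sqrt(4 * area / math.pi),
    }


toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
toy_features = region_features(toy_mask)
```

Perimeter, convex area and solidity need a boundary-tracing or convex-hull step and are left out to keep the sketch short.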

Shape features
From the values of axis length and area, shape factors were derived (Symons and Fulcher, 1988a). The feature vector was assembled from the features described above and fed to an artificial neural network for classification, in this case a multilayer perceptron (MLP).

MLP neural network
An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture. The objective of the neural network is to transform the inputs into meaningful outputs.
A multilayer perceptron (MLP) network consists of an input layer, one or more hidden layers and an output layer. Each layer consists of multiple neurons. An artificial neuron is the smallest unit that constitutes the artificial neural network (Kantardzic, 2003).
A typical multilayer perceptron neural network architecture is shown in Figure 3. The dimension of the input vector was reduced by using a feature selection algorithm.
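To make the layered structure concrete, the following Python sketch builds a 21-30-20-6 network and runs one forward pass. It is only a minimal illustration under stated assumptions: the sigmoid hidden activations, the softmax output, the random untrained weights and all function names are choices made here and are not taken from the study.

```python
import math
import random

SIZES = [21, 30, 20, 6]   # input, two hidden layers, output
CULTIVARS = ["Sardari", "Sardari 39", "Zardak", "Azar 2", "ABR1", "Ohadi"]


def init_layer(n_in, n_out, rng):
    """Random weights (n_out x n_in) and zero biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(z):
    m = max(z)                              # subtract max for stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, x):
    """Propagate feature vector x through all layers."""
    a = x
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * ai for w, ai in zip(row, a)) + b
             for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        a = softmax(z) if is_output else [sigmoid(v) for v in z]
    return a


rng = random.Random(0)
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(SIZES, SIZES[1:])]
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

probs = forward(layers, [0.5] * 21)         # dummy 21-feature input
predicted = CULTIVARS[probs.index(max(probs))]
```

With these layer sizes the network has 21×30 + 30 + 30×20 + 20 + 20×6 + 6 = 1406 trainable parameters. Because the weights are untrained, the predicted cultivar is arbitrary; only the shapes and the data flow are meaningful.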

Feature selection
Feature selection is the problem of choosing, from a larger set of candidate features, a subset that is ideally necessary to perform the classification task. There are several ways to determine the best subset of features. UTA is a simple method based on a trained artificial neural network. In this method, the average of one feature over all instances is calculated. Then, the selected feature is replaced by the calculated mean value in all input vectors. Then, the trained network is tested with the new features and data matrix according to Utans et al. (1995). The comparison error was defined in our strategy as follows:
$$E = (\mathrm{FP}(\mathrm{new}) + \mathrm{FN}(\mathrm{new})) - (\mathrm{FP}(\mathrm{old}) + \mathrm{FN}(\mathrm{old})) \qquad (14)$$
where FP(old) is the number of false positives and FN(old) is the number of false negatives using all the features, and FP(new) and FN(new) are the corresponding values when one of the features is replaced by its mean value.
Three cases arise: (1) an input is considered more relevant if E is positive and higher than for the other features; (2) an input is ineffective if E is zero; (3) an input is not only ineffective but also noisy, and should be removed from the input vector, if E is negative.
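The procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy threshold classifier standing in for the trained network, and the way FP and FN are totaled over classes (each misclassified sample counts as one false negative for its true class and one false positive for the predicted class) are assumptions.

```python
def fp_plus_fn(y_true, y_pred):
    """Total FP + FN summed over all classes.

    Each misclassified sample contributes one FN (true class) and one FP
    (predicted class), hence 2 per error. This totaling is an assumption.
    """
    return sum(2 for t, p in zip(y_true, y_pred) if t != p)


def uta_feature_errors(predict, X, y):
    """Return E for every feature: error with the feature replaced by its
    mean, minus the error with all original features."""
    old = fp_plus_fn(y, [predict(row) for row in X])
    errors = []
    for j in range(len(X[0])):
        mean_j = sum(row[j] for row in X) / len(X)
        X_new = [row[:j] + [mean_j] + row[j + 1:] for row in X]
        new = fp_plus_fn(y, [predict(row) for row in X_new])
        errors.append(new - old)
    return errors


def interpret(e):
    """Map a feature error E to the three cases in the text."""
    if e > 0:
        return "relevant"
    if e == 0:
        return "ineffective"
    return "noisy"


def toy_predict(row):
    """Stand-in for a trained network: uses feature 0 only."""
    return 1 if row[0] > 0.5 else 0


X = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
y = [0, 0, 1, 1]
E = uta_feature_errors(toy_predict, X, y)
labels = [interpret(e) for e in E]
```

In this toy case, replacing feature 0 by its mean destroys the classifier's only source of information, so its E is positive, while feature 1 is never used and gets E equal to zero.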

RESULTS AND DISCUSSION
Identification of rain fed wheat grain cultivars was tested on wheat images containing samples of 6 cultivars. There were 90 images for each cultivar. The image format was 24 bit color JPEG and the image size was 360×640 pixels. The proposed method was implemented on a Pentium V personal computer with 1 GB RAM and a 1.80 GHz CPU.
There were 360 training samples and 180 test samples across all rain fed wheat cultivars. Six color features (Rm, Gm, Bm, Hm, Sm and Vm) and 11 morphological features (area, perimeter, major axis length, minor axis length, aspect ratio, equivalent diameter, convex area, solidity, extent, roundness and compactness) were extracted from the grain cultivar images; features such as area, perimeter, and major and minor axis lengths were computed on the binary image using MATLAB 7.8 software. Four shape factors (SF1, SF2, SF3 and SF4) were derived from these main geometric features.
We applied an MLP neural network with 2 hidden layers. The input layer of the ANN had 21 neurons because the data sets contained 21 parameters, and the output layer of the ANN had 6 neurons. Many features were highly correlated with others, so that if one of them was selected, the rest would not contribute significantly to the classification model.
In order to determine the features giving the highest accuracy, the UTA algorithm was applied and the total feature error (T) was evaluated. In the case of the 21-30-10-6 structure, 5 effective features, Hm (24), SF2 (22), Sm (14), area (8) and convex area (6), were selected (Table 2) because they had positive and higher feature errors (Utans et al., 1995).

Conclusion
The development and use of digital image analysis based on texture, morphology and color features for grain identification depends on the capability to classify different cultivars of a given species accurately. Suitable model building aims to produce a robust ANN model that can accurately map inputs to outputs. A good ANN model mainly depends on the choice of an optimum neural network architecture and of internal network parameters such as the number of neurons in each hidden layer. An MLP neural network was presented for classifying 6 rain fed wheat cultivars. A total of 540 wheat grains were investigated and 21 features were extracted from each grain. In this study, after evaluating the influence of the number of neurons in the 2 hidden layers on the average accuracy of rain fed wheat grain identification, it was found that the 21-30-20-6 structure shows the higher accuracy both before applying the UTA algorithm (86.48%) and after it (87.22%). Many features were highly correlated with others, so that if one of them was selected, the rest would not contribute significantly to the classification model. The 21-30-20-6 structure with feature selection was found to be the best model; Sm, Hm, area, convex area and SF2 were selected from among the 21 original inputs. The highest accuracy for grain identification was obtained for ABR1 (91.66%) and the lowest for Sardari (79.44%). The largest difference between accuracies before and after feature selection was obtained for Zardak (5.56%) in the 21-30-20-6 structure. We observed that feature selection had a positive effect on the classification of the Zardak and Sardari cultivars in all three experimental structures.
Equivalent diameter (Eq): the diameter of a circle with the same area as the wheat grain region.

Table 1. Average accuracy before UTA algorithm.

Table 5. Average accuracy after UTA algorithm.

Table 6. Difference of accuracies before and after UTA.