Analysis of Maize Crop Leaf using Multivariate Image Analysis for Identifying Soil Deficiency

Image-processing-based identification of soil nutrient deficiency has become an active area of research. Changes in leaf color are used to identify deficiencies of soil nutrients such as nitrogen (N), phosphorus (P) and potassium (K) by digital color image analysis. This study focuses on the analysis of maize crop leaf images using multivariate image analysis. In the proposed approach, the input RGB image is first converted to HSV, because RGB is well suited to color generation whereas HSV is better suited to color perception. Green pixels are then masked and removed using a specific threshold value, with histogram equalization applied for enhancement. The masking is performed by a customized filter that exclusively filters out the green color of the leaf. After filtering, only the deficient part of the leaf is considered, and a histogram is generated for it. Next, Multivariate Image Analysis (MIA) using Independent Component Analysis (ICA) extracts a reference eigenspace from a matrix built by unfolding the color data of the deficient part. Test images are likewise unfolded and projected onto the reference eigenspace; the result is a score matrix that is used to compute nutrient deficiency based on the $T^2$ statistic. In addition, a multi-resolution scheme based on down-scaling is used to speed up the process. Finally, based on the training samples, the soil deficiency is identified from the color of the maize crop leaf.


INTRODUCTION
Due to the increasing costs of crop production and the progressing environmental pollution by agrochemicals, mineral fertilizers should be applied more efficiently. This concerns primarily N, because over-application of this element leads to low N recovery efficiency and to a risk of nitrate pollution of ground waters. The diagnostics of disease symptoms in plants, including those resulting from nutrient deficiencies, require quick, reliable and precise instrumental techniques that make it possible to recognize the symptoms of physiological disorders before responses to stress factors can be observed visually. The majority of these disorders affect the composition and proportions of pigments in leaf tissues (Bacci et al., 1998).
Since there is a correlation between the chemical composition of leaf tissues and reflectance in the visible spectrum, there are good reasons to use digital color image analysis in the early diagnostics of physiological changes caused by various stress factors (Bacci et al., 1998). The main advantages of this technique include prompt non-invasive assessment, the possibility of performing measurements under varied conditions and comparability of results (Wiwart, 1999; Brosnan and Sun, 2002). New methods for estimating the nutrient requirements of crops, based on image analysis, may become a reliable diagnostic tool for broadly understood agricultural practice, including mineral fertilization management. This solution relates to the concept of precision agriculture, promoted over the last decade, according to which the rates of mineral fertilizers are to meet the nutrient demand of crops estimated precisely on a local basis (Zhang et al., 2002).
Numerous studies involving a rapid estimation of the N requirements of crops have been carried out with the use of a chlorophyll meter. Singh et al. (2002) used this instrument in wheat and rye and reported that mineral fertilization rates may be reduced by 12.5-25% with no risk of yield decrease and that N fertilizers may be applied at the stage of the highest nutrient demand. These authors also demonstrated that comparable results may be obtained with the use of leaf color charts, where the color of leaves is compared with standardized color charts. Netto et al. (2005) studied the color of robusta coffee (Coffea canephora Pierre ex Froehner) leaves and noted a significant linear dependence between the N content of leaves and readings on a chlorophyll meter. Carter and Knapp (2001) analyzed leaf spectral reflectance, transmittance and absorbance under conditions of physiological stress in five plant species and reported that the greatest differences occurred at wavelengths around 700 nm. The most significant changes in reflectance concerned the yellow-green color, which was ascribed to a stress-induced decrease in the chlorophyll content of leaves. However, in some cases these changes were not specific to particular stress factors, implying the need to continue research into changes in the color of leaves in plants exposed to stressors. Jia et al. (2004b) used aerial photographs taken at a low altitude (300-450 m) to estimate the level of N fertilization of winter wheat (Triticum aestivum L.) plantations in different regions of north China. The analysis of these photographs showed significant inverse relationships between greenness intensity, canopy total N and SPAD readings at booting and flowering, thus allowing a precise determination of N fertilization rates. Cartelat et al. (2005) determined the concentrations of chlorophyll and polyphenols in wheat leaves as indicators of N accumulation by plants. The amount of chlorophyll was determined with a Minolta SPAD-502, whereas the amount of polyphenols was determined with Dualex, a device that measures UV absorbance through the leaf epidermis. These authors proposed the chlorophyll/polyphenol quotient as an indicator of N accumulation by plants. This indicator can also be applied successfully in precision agriculture.
The objective of this study was to determine changes in the color of the leaf in order to identify deficiencies of nutrients such as N, P, K and Mg. A number of supervised and unsupervised approaches for defect identification in plants are available in the literature. However, most of the existing supervised techniques do not offer accurate results under non-linear problem conditions.
Thus, this research work adopts an unsupervised approach to leaf analysis for determining nutrient deficiency in the soil. The approach exploits the potential of a simple, relatively inexpensive and commonly available technique, based on a color image analysis system, for the early diagnostics of nutrient deficiency symptoms in the maize crop. Multivariate image analysis is applied in this study for accurate identification of soil deficiency.

METHODOLOGY
Importance of unsupervised image analysis and MIA: This research study uses an unsupervised image analysis approach for the accurate analysis of the maize crop leaf. An unsupervised technique is required rather than a supervised one because evaluation techniques that need user assistance, such as subjective evaluation and supervised evaluation, are infeasible in real-time applications of this type. Unsupervised evaluation facilitates the objective comparison of both different segmentation approaches and different parameterizations of a single technique, without requiring human visual comparison or comparison with a manually segmented or pre-processed reference image. Moreover, unsupervised approaches produce results for individual images and for images whose features may not be known until evaluation time. Unsupervised approaches are essential to real-time image analysis and can furthermore facilitate self-tuning of algorithm parameters based on evaluation results (Zhang et al., 2008).
Therefore, an efficient unsupervised Multivariate Image Analysis (MIA) approach is used in this research work for maize crop leaf analysis. The main advantage of multivariate image analysis is that it has simpler formulations and computation, i.e., lower computational complexity than other unsupervised approaches.
Multivariate Analysis (MVA) methods are increasingly used in surface spectroscopies to aid the analyst in interpreting the vast amount of information resulting from these multidimensional data set acquisitions. The main aim of MIA techniques is to extract significant information from an image data set while minimizing the dimensionality of the data (Artyushkova and Fulghum, 2002).
Proposed MIA approach for maize crop leaf analysis: Figure 1 shows the basic procedure of the proposed unsupervised image analysis algorithm. Initially, images of maize crop leaves are obtained using a digital camera; image-processing techniques are then applied to the acquired images to extract the features needed for further analysis.

HSV color transformation:
The HSV color space is fundamentally different from the widely known RGB color space, since it separates the intensity (luminance) from the color information (chromaticity). Of the two chromaticity axes, a difference in the Hue of a pixel is found to be visually more prominent than a difference in its Saturation. For each pixel, either its Hue or its Intensity is chosen as the dominant feature, based on its Saturation.
In situations where color description plays an integral role, the HSV color model is often preferred over the RGB model. The HSV model describes colors similarly to how the human eye tends to perceive color. RGB defines color in terms of a combination of primary colors, whereas HSV describes color using more familiar comparisons such as color, vibrancy and brightness.
Fig. 1: Overall methodology
The significance of HSV over RGB has been clearly illustrated by Sural et al. (2007), who showed that the approximation made by RGB features blurs the distinction between two visually separable colors by changing the brightness. The HSV-based approximation, in contrast, can determine the intensity and shade variations near the edges of an object, thereby sharpening the boundaries and retaining the color information of each pixel. This makes HSV-based features very useful in image analysis, and the present approach therefore uses the HSV color transformation.
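Initially, the RGB images of maize crop leaves are acquired. Then, the RGB images are converted into the Hue Saturation Value (HSV) color space representation. Hue is a color attribute that describes pure color as perceived by an observer. Saturation refers to the relative purity, or the amount of white light added to the hue, and Value is the amplitude of light. Considering a pixel with components $(R, G, B)$ in RGB color space, each normalized to $[0, 1]$, and writing $C_{max} = \max(R, G, B)$, $C_{min} = \min(R, G, B)$ and $\Delta = C_{max} - C_{min}$, the standard conversion takes the form:
$$V = C_{max}, \qquad S = \begin{cases} \Delta / C_{max}, & C_{max} \neq 0 \\ 0, & C_{max} = 0 \end{cases}$$
$$H = \begin{cases} 60^\circ \times \left( \dfrac{G - B}{\Delta} \bmod 6 \right), & C_{max} = R \\ 60^\circ \times \left( \dfrac{B - R}{\Delta} + 2 \right), & C_{max} = G \\ 60^\circ \times \left( \dfrac{R - G}{\Delta} + 4 \right), & C_{max} = B \end{cases}$$
with $H$ taken as 0 when $\Delta = 0$. After the transformation, the Hue component is taken for further analysis. Saturation and Value are dropped since they do not give extra information. Figure 2 shows the H, S and V components.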
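The conversion and green-masking steps can be sketched as follows. This is a minimal illustration using Python's standard `colorsys` module, not the authors' MATLAB implementation; the hue band `GREEN_HUE_MIN`..`GREEN_HUE_MAX` and the helper names are hypothetical, since the threshold actually used is not stated.

```python
import colorsys

# Hypothetical hue band (degrees) treated as "green"; the paper does not
# state the threshold it uses, so these values are assumptions.
GREEN_HUE_MIN = 70.0
GREEN_HUE_MAX = 160.0

def rgb_to_hue_degrees(r, g, b):
    """Convert an 8-bit RGB pixel to its hue in degrees, in [0, 360)."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0

def mask_green(pixels):
    """Return the hue of each pixel, or None where the pixel is green."""
    masked = []
    for r, g, b in pixels:
        hue = rgb_to_hue_degrees(r, g, b)
        masked.append(None if GREEN_HUE_MIN <= hue <= GREEN_HUE_MAX else hue)
    return masked

# One healthy green pixel followed by a yellowish and a brownish pixel.
leaf = [(40, 160, 50), (200, 180, 40), (120, 70, 30)]
print(mask_green(leaf))
```

Only the non-green hues survive the mask, which mirrors the idea of keeping just the deficient part of the leaf for the later stages.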

Histogram equalization for image enhancement:
Histogram Equalization (HE) is a contrast-adjustment technique that uses the image histogram. It generates a gray-level mapping that changes the histogram of an image, redistributing the pixel values so that the result is as close as possible to a desired histogram; in HE the transformation function is determined automatically so that the output image has an approximately uniform histogram. HE allows areas of lower local contrast to gain higher contrast. The method usually increases the global contrast of an image, especially when the usable data of the image are represented by close contrast values. Through this adjustment, the intensities are better distributed over the histogram, which HE accomplishes by effectively spreading out the most frequent intensity values (Cheng and Shi, 2004).
Consider a discrete grayscale image $\{x\}$ and let $n_i$ be the number of occurrences of gray level $i$. The probability of an occurrence of a pixel of level $i$ in the image is:
$$p_x(i) = p(x = i) = \frac{n_i}{n}, \qquad 0 \le i < L$$
where $L$ is the total number of gray levels in the image, $n$ is the total number of pixels in the image and $p_x(i)$ is the image's histogram for pixel value $i$, normalized to $[0, 1]$. The cumulative distribution function corresponding to $p_x$ is defined as:
$$cdf_x(i) = \sum_{j=0}^{i} p_x(j)$$
The goal is a transformation $y = T(x)$ that produces a new image $\{y\}$ with a flat histogram, i.e., with a linearized cumulative distribution function:
$$cdf_y(i) = iK$$
for some constant $K$. The properties of the CDF allow such a transform; it is defined as:
$$y = T(x) = cdf_x(x)$$
The function $T$ maps the levels into the range $[0, 1]$. The above describes histogram equalization on a grayscale image. It can also be used on color images by applying the same method separately to the Red, Green and Blue components of the RGB color values, but doing so may change the color balance of the image. If the image is first converted to another color space, HSL/HSV in particular, the algorithm can be applied to the luminance or value channel without changing the hue and saturation of the image. The color intensities are then spread uniformly, leaving hue and saturation unchanged (Rothe and Kshirsagar, 2012; Wiwart et al., 2009).
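As a worked illustration of the cumulative-histogram mapping, the following sketch equalizes a short list of gray levels in plain Python. The function name `equalize` and the final rescaling to integer levels 0..255 are assumptions made for the example.

```python
def equalize(levels, num_levels=256):
    """Map each gray level through the normalized cumulative histogram."""
    n = len(levels)

    # Histogram: number of occurrences of each gray level.
    hist = [0] * num_levels
    for value in levels:
        hist[value] += 1

    # Cumulative distribution function, normalized to [0, 1].
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running / n)

    # The transform is the CDF itself; it is rescaled here to integer
    # levels in [0, num_levels - 1] (an assumption for display purposes).
    return [round(cdf[value] * (num_levels - 1)) for value in levels]

values = [52, 52, 53, 54, 54, 54, 55, 58]
print(equalize(values))
```

The eight input levels occupy a narrow band (52 to 58); after the mapping they are spread over most of the available range, which is the contrast gain described above.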

Multivariate image analysis:
The MIA approach follows the flow-diagram presented in Fig. 2. In both stages, training and test, the first step is to unfold the color and spatial information of the image pixels to configure a matrix of raw data. In this matrix, each row is composed of the RGB values of one pixel and its vicinity. A pixel's vicinity is set through a neighbourhood window. Figure 3 shows the unfolded RGB raw data of an image using a square window of size 3×3. For the i-th pixel, its R value is written first, then the R value of the top-left neighbour and then those of the remaining neighbours in clockwise order. Each row of the matrix is completed by following the same procedure with the G and B channels. Apart from squares, other window shapes, such as hexagons or crosses, are also suitable. Nevertheless, the most common shape, a W×W square window, is used here, where W takes odd values, e.g., 3×3, 5×5, etc. Other unfolding orders are also possible, since ICA does not depend on the order of the columns (data variables). However, the same unfolding order must always be used in the training and test stages for the method to function correctly (Lopez-Garcia et al., 2010).
Fig. 3: Flow-diagram of the MIA approach
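The unfolding order described above (centre pixel first, then the top-left neighbour and the rest clockwise, repeated for R, G and B) can be sketched in plain Python. The function name `unfold` and the choice to skip border pixels are assumptions for illustration; the handling of image borders is not specified in the text.

```python
# Offsets (row, column): centre first, then the top-left neighbour and
# the remaining neighbours in clockwise order.
OFFSETS_3X3 = [(0, 0), (-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]

def unfold(image):
    """Unfold an RGB image (list of rows of (r, g, b) tuples) into a raw
    data matrix with one row of 27 values per interior pixel."""
    height, width = len(image), len(image[0])
    rows = []
    for n in range(1, height - 1):        # border pixels skipped (assumption)
        for m in range(1, width - 1):
            row = []
            for channel in range(3):      # all R values, then G, then B
                for dn, dm in OFFSETS_3X3:
                    row.append(image[n + dn][m + dm][channel])
            rows.append(row)
    return rows

# Tiny 3x4 image whose R value encodes the pixel position (10*row + col).
image = [[(10 * r + c, 100 + 10 * r + c, 200 + 10 * r + c) for c in range(4)]
         for r in range(3)]
matrix = unfold(image)
print(len(matrix), len(matrix[0]))
print(matrix[0][:9])
```

Because the same offset table is used for every image, the training and test matrices automatically share one unfolding order.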
A training image free of defects is used to compile a reference eigenspace, which is then used to build the reference model of locations (pixels) belonging to defect-free areas and to perform defect detection in test images. Let $I$ be the training image. For a 3×3 window, the feature vector of the pixel at row $n$ and column $m$ is:
$$x_i = \big[\, r(n,m),\ r(n-1,m-1),\ r(n-1,m),\ \ldots,\ r(n,m-1),\ g(n,m),\ \ldots,\ g(n,m-1),\ b(n,m),\ \ldots,\ b(n,m-1) \,\big] \qquad (1)$$
Equation (1) represents the color-spatial feature of each pixel in $I$. Let $X = \{x_i \in \mathbb{R}^K,\ i = 1, 2, \ldots, q\}$ be the set of $q$ vectors from the pixels of the training image, where $K$ is the number of pixels in the neighbourhood window multiplied by the number of colour channels. Let $\bar{x} = \frac{1}{q}\sum_{i=1}^{q} x_i$ be the mean vector of $X$. An eigenspace is obtained by applying Independent Component Analysis (ICA) to the mean-centred color-spatial feature matrix $X$. The eigenvectors $E = [e_1, e_2, \ldots, e_L]$, $e_j \in \mathbb{R}^K$, are extracted using Singular Value Decomposition, where $L$ is the number of selected principal components. Now, by projecting the training image onto the reference eigenspace, a score matrix $A$ is computed:
$$x(t) = A\, s(t)$$
where $A$ is an unknown matrix called the mixing matrix, and $x(t)$ and $s(t)$ are the vectors representing the observed images and the source images, respectively. The objective is to recover the original image, $s_i(t)$, from only the observed vector $x_i(t)$. The estimates of the sources are obtained by first obtaining the "unmixing matrix" $W$, where $W = A^{-1}$. This enables an estimate $u$ of the independent sources to be obtained: $u = W x(t)$.
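The relationship between mixing and unmixing can be checked on a toy case. In the sketch below the 2×2 mixing matrix is fixed by hand, purely as an assumption for illustration; in practice the mixing matrix is unknown and the unmixing matrix has to be estimated from the observed data by an ICA algorithm. The helper names `invert_2x2` and `mat_vec` are hypothetical.

```python
def invert_2x2(matrix):
    """Inverse of a 2x2 matrix given as [[p, q], [r, s]]."""
    (p, q), (r, s) = matrix
    det = p * s - q * r
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[s / det, -q / det], [-r / det, p / det]]

def mat_vec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

A = [[2.0, 1.0], [1.0, 1.0]]   # hypothetical mixing matrix (known here)
s = [3.0, 4.0]                 # source values
x = mat_vec(A, s)              # observed mixture: x = A s
W = invert_2x2(A)              # unmixing matrix: W = A^-1
u = mat_vec(W, x)              # recovered sources: u = W x
print(x, u)
```

With an exact inverse the recovered vector equals the sources; an estimated unmixing matrix would recover them only up to scaling and ordering.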
$T^2$ values of pixels are then computed from the score matrix:
$$T_i^2 = \sum_{l=1}^{L} \frac{t_{il}^2}{s_l^2}$$
where $t_{il}$ is the score value of the i-th pixel in the l-th principal component (l-th eigenvalue) with variance $s_l^2$.
A behaviour model of normal pixels belonging to defect-free areas is created by computing the $T^2$ statistic for every pixel in the training image.
$T_i^2$ is, in fact, the Mahalanobis distance of the projection of the pixel neighbourhood onto the eigenspace with respect to the centre of gravity of the model (the mean) and represents a measure of the variation of each pixel inside the model. In order to obtain a threshold level of the $T^2$ variable defining the normal behaviour of pixels, a cumulative histogram is computed from the $T^2$ values of the training image. The threshold is then determined by choosing an extreme percentile of the histogram, commonly 90 or 95%. Any pixel with a $T^2$ value greater than the threshold is considered to belong to a defective area. One or more training images can be used to obtain the reference eigenspace and compute the cumulative histogram (Lopez-Garcia et al., 2010).
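The thresholding step can be sketched as follows, assuming the score matrices are already available as lists of rows. The function names, the sample-variance estimate and the nearest-rank percentile rule are assumptions made for the example, not details taken from the paper.

```python
import math

def component_variances(train_scores):
    """Sample variance of each score column of the training image."""
    q = len(train_scores)
    variances = []
    for column in zip(*train_scores):
        mean = sum(column) / q
        variances.append(sum((t - mean) ** 2 for t in column) / (q - 1))
    return variances

def t2(scores, variances):
    """T^2 per pixel: sum over components of squared score / variance."""
    return [sum(t * t / v for t, v in zip(row, variances)) for row in scores]

def percentile_threshold(values, pct):
    """Nearest-rank percentile of the training T^2 values (assumed rule)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Toy scores: five training pixels and two test pixels, two components.
train_scores = [(-2.0, 1.0), (-1.0, -1.0), (0.0, 0.0), (1.0, 1.0), (2.0, -1.0)]
test_scores = [(4.0, 2.0), (0.5, 0.5)]

variances = component_variances(train_scores)
threshold = percentile_threshold(t2(train_scores, variances), 90)
defective = [value > threshold for value in t2(test_scores, variances)]
print(threshold, defective)
```

The variances come from the training scores only, so test pixels are judged against the model of normal behaviour rather than against themselves.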

Multi-resolution and post-processing:
Multi-resolution stage: The MIA process is integrated with a multi-resolution scheme and a post-processing stage. Multi-resolution is introduced to capture defects, and parts of defects, of different sizes at minimum computational cost. Here, the size of the vicinity window is fixed at the minimum of 3×3 and the method is applied to the test sample at several scales (lower than or equal to 1.0). In this way, bigger defects are captured at lower scales, where the 3×3 window covers a larger area than at the original scale. Lower scales lead to smaller matrices of unfolded data, which accelerates the process because the major computational cost of the method is concentrated in the projection of the data matrix onto the reference eigenspace (a matrix multiplication). Reducing the size of the matrices therefore reduces the computational cost significantly. The final map of defects is built by combining the defect maps computed at each scale and resizing the result to the original size of the samples (scale 1.0) (Lopez-Garcia et al., 2010).
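One possible way to merge defect maps from two scales is sketched below. The nearest-neighbour upscaling, the logical-OR combination rule and the function names are assumptions for illustration; the text does not specify how the per-scale maps are combined.

```python
def upscale_nearest(mask, factor):
    """Repeat every cell of a boolean mask `factor` times in both directions."""
    out = []
    for row in mask:
        wide = [cell for cell in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def combine_maps(full_res_mask, half_res_mask):
    """Logical OR of the scale-1.0 map and the upscaled scale-0.5 map."""
    upscaled = upscale_nearest(half_res_mask, 2)
    return [[a or b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(full_res_mask, upscaled)]

# A small defect seen only at scale 1.0 and a larger one seen at scale 0.5.
full_res = [[False, False, False, True],
            [False, False, False, False],
            [False, False, False, False],
            [False, False, False, False]]
half_res = [[False, False],
            [True, False]]

combined = combine_maps(full_res, half_res)
for row in combined:
    print(row)
```

A pixel is marked defective if any scale flags it, so small and large defects end up in a single map at the original resolution.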
In the original method, capturing defects of different sizes requires different window sizes (e.g., 3×3, 5×5, 7×7, etc.) and joining the resulting maps of defects. This approach implies high computing costs, since the size of the rows in the unfolded matrices grows quadratically with the window size. The size of a row in the matrix of unfolded data is $C \times N^2$ when a neighbourhood window of N×N locations and C different color channels are used. Thus, for RGB and a 3×3 window the row size is 27, for a 5×5 window it is 75, for a 7×7 window it is 147, etc. Consequently, the computational cost of handling the matrix of unfolded data increases quadratically with N.
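The quoted row sizes can be reproduced with a one-line calculation, assuming the row length is the number of channels times the number of window locations. The function name `row_size` is hypothetical.

```python
def row_size(window_side, channels=3):
    """Length of one unfolded row for a square window and C color channels."""
    return channels * window_side * window_side

for side in (3, 5, 7):
    print(side, row_size(side))
```

For RGB this gives 27, 75 and 147 values per row, i.e. growth with the square of the window side.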
Post-processing stage: Post-processing is performed through simple morphological operations. Morphological operations affect the form, structure or shape of an object. They are used in pre- or post-processing (filtering, thinning and pruning). Pruning eliminates small parasitic branches of the object (Lin, 2008). The global scheme of the method, including multi-resolution and post-processing, is shown in Fig. 3.
The results of the testing and training parts for phosphorus and nitrogen, and the output for nitrogen deficiency, are shown in Fig. 9 to 11. Table 1 compares the accuracy of the existing PCA approach and the proposed ICA approach. Accuracy is calculated as the proportion of correctly detected images. The PCA algorithm correctly detected 26 of the 30 images, whereas the ICA algorithm correctly detected 27 of 30, giving an accuracy of 90% and better classification performance than PCA. Figure 12 illustrates the comparison of accuracy; it can be seen that the proposed ICA algorithm classifies better than PCA.

CONCLUSION
The main objective of this study is to determine the possibility of using digital color image analysis to evaluate the symptoms of nitrogen (N), phosphorus (P) and potassium (K) deficiencies in the maize crop. The results indicate that the significant color changes in the leaf are mainly due to deficiencies of N, P and K. This research study focused on determining soil nutrient deficiency based on color changes in the leaf. An efficient unsupervised Multivariate Image Analysis (MIA) approach is used to identify deficiencies of N, P and K in the soil. The approach first applies the HSV color transformation, since HSV is a good color descriptor. Green pixels are masked and removed through customized filtering with a pre-computed threshold level. Histogram equalization is then applied to enhance the deficient part of the image. Next, Multivariate Image Analysis is applied, in which the eigenvectors and the $T^2$ histogram values are used to identify the nutrient deficiency. Within MIA, this approach uses ICA instead of PCA to extract a reference eigenspace. The experimental evaluation of the approach was carried out in MATLAB 2010. The results are observed to be significant when compared with the traditional approaches.

Table 1: Comparison of accuracy