Method of Fruit Image Segmentation by Improved K-Means

K-means is a widely used clustering algorithm that is efficient and simple to implement. Taking K-means clustering as its starting point, this study describes an improvement to the algorithm and discusses its application to fruit image segmentation.


INTRODUCTION
In recent years, with the continued development of computer technology, color images have come into increasingly wide use, and color image segmentation has become a major research focus. To meet the specific requirements of users and of particular problems, scholars around the world have proposed many classic segmentation algorithms. So far, color image segmentation methods can be grouped into the following categories: the threshold method, the feature space clustering method, the region-based method, the edge detection method, the fuzzy method, the neural network method and so on. In previous studies, image segmentation based on the K-means clustering algorithm and its improved variants has received wide attention (Zhou et al., 2013). K-means is an unsupervised, dynamic clustering method with a certain adaptability, but its clustering result is easily influenced by the initial cluster centers.

The basic theory of clustering algorithm of K-Means:
The K-means clustering algorithm is a technique based on mean values. Taking K as a parameter, it divides the sample point set $s = \{x_1, x_2, \ldots, x_n\}$ into K classes $\{c_1, c_2, \ldots, c_k\}$. Assuming that the cluster centers are $z_p$ $(p = 1, 2, \ldots, k)$, the clustering error E can be defined as follows:

$$E = \sum_{i=1}^{k} \sum_{x_j \in c_i} (x_j - z_i)^2 \qquad (1)$$

K-means tries to reduce the value of E iteratively, so that each cluster is as compact as possible and as far apart as possible from the other clusters. The steps of the algorithm are as follows:

Step 1: From the sample points $s = \{x_1, x_2, \ldots, x_n\}$, K initial cluster centers $z_1, z_2, \ldots, z_k$ are selected at random.

Step 2: The sample points are assigned to clusters according to the centers $z_1, z_2, \ldots, z_k$, which gives K classes $\{c_1, c_2, \ldots, c_k\}$. The class $c_i$ is determined as follows: for any $x_j \in s$, if $(x_j - z_i)^2 \le (x_j - z_p)^2$ for all $p \ne i$, $p = 1, 2, \ldots, k$, then $x_j \in c_i$.

Step 3: The cluster centers are adjusted to obtain the new class centers, namely

$$z_i = \frac{1}{n_i} \sum_{x_j \in c_i} x_j$$

where $n_i$ is the number of samples contained in $c_i$.

Step 4: The error function E is calculated according to formula (1). Steps 2 to 4 are repeated until the value of E no longer changes noticeably or the cluster memberships no longer change.

The algorithm is shown in Fig. 1.
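The four steps of the classical algorithm can be sketched in plain Python as follows. This is a minimal illustration and not the implementation used in the study: the function name `kmeans`, the stopping tolerance `tol`, the iteration cap `max_iter` and the fixed random seed are all assumptions introduced here.

```python
import random

def kmeans(points, k, max_iter=100, tol=1e-9, seed=0):
    """Classical K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # step 1: random initial centers
    prev_error = float("inf")
    for _ in range(max_iter):
        # step 2: assign every sample to its nearest center
        clusters = [[] for _ in range(k)]
        labels = []
        for x in points:
            d = [sum((a - b) ** 2 for a, b in zip(x, z)) for z in centers]
            i = d.index(min(d))
            clusters[i].append(x)
            labels.append(i)
        # step 3: move each center to the mean of its members
        for i, members in enumerate(clusters):
            if members:                      # an empty cluster keeps its old center
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
        # step 4: evaluate E (formula (1)) and stop once it no longer changes
        error = sum(
            sum((a - b) ** 2 for a, b in zip(x, centers[i]))
            for x, i in zip(points, labels)
        )
        if abs(prev_error - error) <= tol:
            break
        prev_error = error
    return centers, labels, error
```

Because the initial centers are drawn at random, different seeds can lead to different final clusters on less well separated data, which is the sensitivity that motivates the improvement discussed next.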

The improvement of clustering algorithm of K-Means:
An important step in the K-means clustering algorithm is determining the value of K and selecting the initial cluster centers. The traditional K-means algorithm generally selects the number of clusters and their centers at random (Chen et al., 2013). There is a strong correlation between the clustering result and the location of the samples; moreover, the performance of the clustering is closely related to the selection of the initial cluster centers. If the K samples are not chosen reasonably, the computational complexity increases and the clustering process is misled, so that a reasonable clustering result cannot be obtained. In this study, K-means is used for the initial segmentation of the fruit image. This step relies only on pixel color information, so that pixels of similar color are placed in the same class; it does not give an accurate segmentation of the fruit image (Song et al., 2013). After the initial segmentation, regions are merged according to the color and spatial information of the fruit image to obtain the final segmentation result. As for the number of categories, a fixed number of clusters meets the requirements of the initial segmentation and also gives high computational efficiency.
This study first divides the fruit image into blocks of 16×16 pixels. A sub-block of this size occupies only a small proportion of the fruit image, so the color change within each block should not be too large, which is convenient for cluster analysis. Each small block is assigned 3 classes, so the value of K is 3; that is, each block is described by 3 characteristic values. The selection of the cluster centers is another important factor that affects the clustering result. Because the fruit image is divided into many small blocks and there is little color change within a sub-block, the average pixel value can be chosen as the initial cluster center for the analysis of the fruit image. If the current representative object is replaced by a non-representative object, the cost function computes the resulting change in absolute error. The total cost of the exchange is the sum of the costs incurred by all non-representative objects (Cai et al., 2008). If the total cost is negative, E decreases and the cluster center can be replaced by the randomly chosen point; if the total cost is positive, the current cluster center is acceptable and nothing changes in this iteration.
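The block partition and the exchange test described above can be sketched as follows. The sketch assumes a single-channel image stored as a list of rows, and it reads the exchange cost in the usual k-medoids sense (the change in total absolute error when one center is swapped for a candidate); both are assumptions made for illustration. The names `block_means`, `absolute_error` and `swap_cost` are hypothetical and do not come from the study.

```python
def block_means(image, block=16):
    """Mean value of each block x block tile of a 2-D list of intensities."""
    h, w = len(image), len(image[0])
    means = {}
    for top in range(0, h, block):
        for left in range(0, w, block):
            tile = [image[r][c]
                    for r in range(top, min(top + block, h))
                    for c in range(left, min(left + block, w))]
            # key is the (row, column) index of the tile
            means[(top // block, left // block)] = sum(tile) / len(tile)
    return means

def absolute_error(points, centers):
    """Sum over all points of the distance to the nearest center."""
    return sum(min(abs(x - z) for z in centers) for x in points)

def swap_cost(points, centers, old, candidate):
    """Total cost of replacing the center `old` with `candidate`.

    A negative value means the exchange lowers the error and is accepted;
    a non-negative value means the current center is kept.
    """
    swapped = [candidate if z == old else z for z in centers]
    return absolute_error(points, swapped) - absolute_error(points, centers)
```

For example, with points `[1, 2, 3, 10, 11, 12]` and centers `[1, 10]`, exchanging the center 1 for the candidate 2 has a negative cost and would be accepted, whereas exchanging 10 for 3 has a large positive cost and would be rejected.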

The improved clustering algorithm of K-Means:
Following the results of Ohta and colleagues, this study takes the component $I_1$ as the one-dimensional feature of each color image pixel and uses $I_1$ in place of the gray value in K-means image segmentation. Let T denote the number of pixels; then $I_1^p \in \{0, 1, \ldots, L_{max} - 1\}$ $(p = 1, 2, \ldots, T)$, where $I_1^p$ is the gray value of pixel p in the set $A_i(j)$. $L_{max}$ is the number of brightness levels of $I_1$, $\lambda_i$ is the mean value of class i after j iterations and $A_i(j)$ is the set of pixels of class i after j iterations. See Table 1.
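A minimal sketch of this one-dimensional feature is given below. The text does not spell out the formula for the component, so the sketch assumes the standard definition of the first Ohta feature, $I_1 = (R + G + B)/3$, and integer 8-bit channels (so $L_{max} = 256$). The function names `ohta_i1` and `update_class_means` are hypothetical.

```python
def ohta_i1(rgb_image):
    """I1 = (R + G + B) / 3 for every pixel, floored to an integer level.

    Assumes each pixel is an (R, G, B) tuple of integers in 0..255.
    """
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]

def update_class_means(i1_values, means):
    """One K-means iteration on a flat list of I1 values.

    Builds the sets A_i from the current means, then returns the new
    class means lambda_i together with the per-pixel class labels.
    """
    labels = [min(range(len(means)), key=lambda i: abs(v - means[i]))
              for v in i1_values]
    new_means = []
    for i, old in enumerate(means):
        members = [v for v, lab in zip(i1_values, labels) if lab == i]
        # an empty class keeps its previous mean
        new_means.append(sum(members) / len(members) if members else old)
    return new_means, labels
```

Working on a single integer component rather than three color channels means that each distance is one subtraction, which is where the reduction in computation comes from.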
The K-means algorithm produces an initial segmentation of the fruit image. Because the image is divided into many small blocks in advance, many small regions end up scattered inside the target area, which is the result of over-segmentation. Many of these regions are still similar to one another, and regions without obvious differences should be merged into a larger region (Wang et al., 2012). The final segmentation result is obtained by applying the following steps to the initial clustering segmentation. When regions are merged, the size of the minimum region is critical for segmentation. The measurement of regional distance is an important criterion for region merging, and the distance measure directly determines the final segmentation result. Therefore, this study mainly uses adjacent color regions for region merging. Experiments show that a threshold T of 0.16% of the number of pixels in the segmented fruit image gives a better segmentation effect. See Table 2.

[Table rows: fast fuzzy K-means algorithm; improved K-means algorithm. Cell values not recoverable.]
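One possible reading of this merging step is sketched below. The text gives the threshold (0.16% of the pixel count) but not the exact merge rule, so the rule used here is an assumption: every region smaller than the threshold is merged into the 4-connected neighboring region whose mean value is closest. The sketch also assumes that each label denotes one connected region and that `values` holds one intensity per pixel. The function name `merge_small_regions` is hypothetical.

```python
def merge_small_regions(labels, values, fraction=0.0016):
    """Merge every region smaller than fraction * pixel_count into its
    most similar 4-connected neighbor (assumed merge rule)."""
    h, w = len(labels), len(labels[0])
    labels = [row[:] for row in labels]      # do not modify the input
    threshold = fraction * h * w             # T = 0.16% of the pixel count
    while True:
        size, total, neighbors = {}, {}, {}
        for r in range(h):
            for c in range(w):
                a = labels[r][c]
                size[a] = size.get(a, 0) + 1
                total[a] = total.get(a, 0) + values[r][c]
                neighbors.setdefault(a, set())
                # look down and right; adjacency is recorded both ways
                for rr, cc in ((r + 1, c), (r, c + 1)):
                    if rr < h and cc < w and labels[rr][cc] != a:
                        b = labels[rr][cc]
                        neighbors[a].add(b)
                        neighbors.setdefault(b, set()).add(a)
        small = [a for a in size if size[a] < threshold and neighbors[a]]
        if not small:
            return labels
        a = min(small, key=lambda s: size[s])            # smallest region first
        mean_a = total[a] / size[a]
        b = min(neighbors[a], key=lambda s: abs(total[s] / size[s] - mean_a))
        for row in labels:
            for c in range(w):
                if row[c] == a:
                    row[c] = b
```

Each pass removes one label, so the loop ends after at most as many passes as there are regions. Recomputing the statistics on every pass keeps the sketch short at the cost of speed; an incremental update would be preferable for large images.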

CONCLUSION
K-means clustering is an unsupervised dynamic algorithm. Its result is influenced by the number of cluster centers and by the choice of the initial cluster centers, and also by the geometric distribution of the samples. In view of these problems, this study proposes an improved K-means clustering algorithm that uses rough set theory to determine the initial number of classes and the class centers. Following the results of Ohta and colleagues, the first component, which effectively represents the color characteristics, is selected as the one-dimensional feature of each color image pixel and replaces the gray value used in classical K-means image segmentation, which greatly reduces the amount of computation. At the same time, the characteristic distance is adopted as the distance measure for K-means clustering, which improves the applicability and accuracy of the algorithm.

Fig. 1: The flow chart of the classical K-means algorithm

Table 1: The number of clusters and the processing time when different image segmentation algorithms are selected