Segmentation of Mushroom and Cap Width Measurement Using Modified K-Means Clustering Algorithm

Mushrooms are among the most commonly consumed foods. Image processing is an effective way to examine visual features and determine the size of a mushroom. We developed software that segments a mushroom in a picture and measures its cap width. The K-Means clustering method, one of the most successful clustering methods, is used for the process. In our study we customized the algorithm to get the best result and tested it. In the system, the mushroom picture is first filtered, its histogram is equalized, and then segmentation is performed. Results showed that the customized algorithm performed better segmentation than the classical K-Means algorithm. Tests performed on the designed software showed that segmentation on complex background pictures is performed with high accuracy, and 20 mushroom caps are measured with 2.281 % average relative error.


Introduction
Due to their high nutritive content, mushrooms are among the most commonly consumed foods. Image processing techniques can be used for classification, quality control and determining the size of a mushroom.
In the system, the image is first preprocessed, and after preprocessing segmentation is performed with the k-means clustering algorithm. In image processing it is important to select these steps correctly, and applying them successfully is essential to achieve an appropriate result. Methods such as neural networks, support vector machines and genetic algorithms can also be used for the segmentation process. However, segmentation can be performed in a simple and effective way using k-means clustering. In the literature, Hong Yao [1] successfully applied segmentation to a fish picture. Yong Zhang [2] applied segmentation using PSO and PCM. Zhiqiang Lao [3] segmented white matter lesions using a support vector machine.

Working Principle of the System
The system contains 3 steps, as can be seen in Fig. 2. The first step is preprocessing, the second step is segmentation and the last step is showing the results.

Preprocessing Step
At the beginning of this step the mushroom picture is obtained; then filtering and histogram equalization are applied to the image.

1) Filtering Process
First, the obtained mushroom picture is converted into greyscale format. After that, a 3 × 3 average filter is applied to the picture. The average filter provides a smooth transition among the pixels and thereby suppresses noise. Eq. (1) shows the average filter [4]:

$$f(x, y) = \frac{1}{mn} \sum_{(s,t) \in S_{xy}} \text{gray\_pixel}(s, t) \tag{1}$$

In the equation, m and n set the size of the region S_xy, centered at (x, y), over which the filter is applied, f(x, y) is the filtered intensity and gray_pixel(s, t) is the intensity level at point (s, t). Since 3 × 3 filtering is used in the process, Eq. (2) shows the same equation for 3 × 3 filtering:

$$f(x, y) = \frac{1}{9} \sum_{s=x-1}^{x+1} \sum_{t=y-1}^{y+1} \text{gray\_pixel}(s, t) \tag{2}$$

Figure 3(a) shows the unprocessed mushroom picture and Fig. 3(b) shows the filtered mushroom picture.
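The 3 × 3 average filter described above can be illustrated with a minimal sketch. It assumes the greyscale image is a list of rows of intensity values and that border pixels are left unchanged; the function name `average_filter_3x3` and the border handling are assumptions made for illustration, not details taken from the software.

```python
def average_filter_3x3(gray):
    """Smooth a greyscale image (list of rows) with a 3 x 3 mean filter.

    Border pixels are copied unchanged (an assumption of this sketch).
    """
    rows, cols = len(gray), len(gray[0])
    out = [row[:] for row in gray]  # start from a copy of the input
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            total = 0
            # Sum the 9 intensities in the neighbourhood centred at (x, y).
            for s in range(x - 1, x + 2):
                for t in range(y - 1, y + 2):
                    total += gray[s][t]
            out[x][y] = total / 9
    return out


if __name__ == "__main__":
    noisy = [[0, 0, 0],
             [0, 9, 0],
             [0, 0, 0]]
    print(average_filter_3x3(noisy))
```

A single bright pixel is spread over its neighbourhood, which is the smoothing effect the filter is applied for.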

2) Histogram Equalization
After filtering, histogram equalization is applied to the image. Histogram equalization redistributes the contrast of the image using the image's histogram. An image can be represented as a data array in the form of:

$$X = \{X(i, j)\}, \quad X(i, j) \in \{X_0, X_1, \ldots, X_{L-1}\} \tag{3}$$

In this data array, every component takes one of L intensity levels. In the image plane, X(i, j) represents the intensity of the pixel at location (i, j), and X_k is the kth intensity level. Equation (4) is used to obtain the probability distribution function (PDF) of the image [5]:

$$p(X_k) = \frac{n_k}{n}, \quad k = 0, 1, \ldots, L-1 \tag{4}$$

where n is the total number of pixels in the input image and n_k is the number of occurrences of X_k in the image X. To obtain a better contrast image Eq. (5) is used:

$$s_k = (L - 1) \sum_{j=0}^{k} p(X_j) \tag{5}$$

where L is the total number of possible grey levels (256 for 8-bit depth) and s_k is the grey conversion value for a better contrast image.
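As an illustration of the equalization step, the sketch below builds the histogram, accumulates it into a cumulative distribution and maps each level k to a new level s_k. It assumes the common form s_k = (L - 1) · (cumulative PDF up to k) with L = 256 levels for 8-bit images; the function name `equalize_histogram` and the rounding are assumptions of this sketch.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a greyscale image given as a list of rows of ints."""
    n = sum(len(row) for row in image)  # total number of pixels
    counts = [0] * levels               # n_k: occurrences of each level
    for row in image:
        for value in row:
            counts[value] += 1

    mapping = []
    cumulative = 0                      # running sum of n_k
    for k in range(levels):
        cumulative += counts[k]
        # s_k = (L - 1) * cumulative probability, rounded to an integer level
        mapping.append(round((levels - 1) * cumulative / n))

    return [[mapping[value] for value in row] for row in image]


if __name__ == "__main__":
    dark = [[0, 0, 1, 1, 1]]
    print(equalize_histogram(dark))
```

In the usage example a very dark image that only uses levels 0 and 1 is stretched so that its brightest level maps to 255.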

Segmentation
This section explains the segmentation process.

1) K-Means Method
K-Means is one of the most widely used unsupervised learning methods. It assigns every data point to exactly one cluster, which provides an efficient clustering mechanism. The K-Means algorithm groups n data points into C clusters. The goal is to end with a high level of similarity within each cluster and a low level of similarity between clusters [1], [6].
The squared error criterion E is widely used to measure the distance of cluster members to their cluster center. For the most successful clustering, the E value is expected to be small. Equation (6) gives the sum of the squares of the distances of members to their cluster center:

$$E = \sum_{k=1}^{C} \sum_{x \in S_k} d(x, c_k)^2 \tag{6}$$

where S_k is the set of points assigned to cluster k and c_k is its center. At the end of clustering, N points are divided into C clusters. For the distance calculation, the Euclidean distance given in Eq. (7) is commonly used [6]:

$$d(x, c_k) = \sqrt{\sum_{i} (x_i - c_{k,i})^2} \tag{7}$$
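The squared error criterion and the Euclidean distance can be made concrete with a short sketch. It assumes points and centers are tuples of coordinates and that `labels[i]` holds the index of the cluster assigned to point i; the function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def squared_error(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(
        euclidean_distance(point, centers[label]) ** 2
        for point, label in zip(points, labels)
    )


if __name__ == "__main__":
    points = [(0, 0), (2, 0), (10, 0)]
    centers = [(1, 0), (10, 0)]
    labels = [0, 0, 1]
    print(squared_error(points, centers, labels))
```

A smaller E means the members sit closer to their centers, which is why a low value indicates a more successful clustering.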

2) Color Segmentation with Modified K-Means Algorithm
Figure 5 shows the GUI structure used for mushroom segmentation and finding the cap width. The GUI shown in Fig. 5 displays the mushroom image to be processed, the grayscale k-means analysis results, the color k-means analysis results and the k-means clustering image. Also from the GUI, the edges of the segment of interest are determined and the cap width of the segmented mushroom is calculated.
Algorithm 1 provides the software algorithm used for Gray K-Means segmentation. This stage is composed of 9 steps:
• Preprocessing: At this step the previously described filtering and histogram equalization processes are performed. Lines 1-4 of the algorithm correspond to these processes.
• Determination of the cluster centers: In the study, the number of clusters is set to 3. Accordingly, 3 intensity levels chosen from the HisteqMushroom image are set as the cluster starting points. Line 5 of the algorithm shows this process; c1, c2 and c3 hold the center values.
• Calculating distance from the cluster centers: At this step the distance between each pixel in the image and the cluster centers c1, c2 and c3 is calculated. The code of the distance function is given in Algorithm-2. The function calculates the distances and keeps them in the distance1, distance2 and distance3 variables.
• Calculation of cluster data sums and producing the K-Means cluster map: Steps 11 through 30 of Algorithm-1 reflect these steps. At the end of the process the graykmeans variable holds the gray level k-means clustering map. c1_sum, c2_sum and c3_sum hold the data sums of clusters c1, c2 and c3, and the c1count, c2count and c3count variables contain the respective cluster's number of members. Coordinates that lie inside cluster 1 are painted black, those inside cluster 2 are painted gray and those inside cluster 3 are painted white.
• Stopping iteration: The graykmean variable and the comp_kmean variable are set equal in line 10. If comp_kmean and graykmean are equal, iteration is stopped. Lines 31, 32 and 33 reflect this process.
• Calculating the cluster centers: Lines 35, 36 and 37 reflect this process. Dividing c1_sum, c2_sum and c3_sum by c1count, c2count and c3count respectively gives the new cluster centers (c1, c2 and c3). Figure 6(b) shows the grayscale segmentation map image that is stored in the grayscale variable after running Algorithm-1.
• Obtaining the color segmentation map: The grayscale segmentation map obtained in Algorithm-1 is converted into a color segmentation map by Algorithm-3. Algorithm-3 turns black colored coordinates into red and the corresponding data are moved to cluster 1. Similarly, gray colored coordinates are turned into blue and [...]
• Calculating cap width: The Find Hat Width button controls the action. The graykmean variable is used in the cap width finding algorithm. When the button is clicked, the image is scanned from the top row to the bottom using the graykmean value and the gray level color information, and the cap width is calculated in pixels. The pixel value is then multiplied by the preset calibration parameter (CP) to find the cap width in cm. For the software to measure the cap width, the image must contain only one mushroom. On the test image, the cap width of the mushroom seen in the cluster 3 image is calculated as 15.32 cm (Fig. 9).
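The cap width step can be sketched as follows. The text only states that the image is scanned from the top row to the bottom and that the pixel count is multiplied by the calibration parameter CP, so the rule used here (the widest horizontal extent of the mushroom cluster over all rows) is an assumption, as are the function name `cap_width_cm` and its parameters.

```python
def cap_width_cm(cluster_map, mushroom_label, cp_cm_per_pixel):
    """Estimate cap width from a cluster map containing a single mushroom.

    cluster_map     : list of rows of cluster labels
    mushroom_label  : label of the cluster that holds the mushroom
    cp_cm_per_pixel : calibration parameter (cm per pixel), assumed known
    """
    widest = 0
    for row in cluster_map:  # scan from the top row to the bottom
        columns = [j for j, label in enumerate(row) if label == mushroom_label]
        if columns:
            # Extent between the leftmost and rightmost mushroom pixel.
            widest = max(widest, columns[-1] - columns[0] + 1)
    return widest * cp_cm_per_pixel


if __name__ == "__main__":
    cluster_map = [
        [1, 1, 3, 3, 1],
        [1, 3, 3, 3, 3],
        [1, 1, 3, 1, 1],
    ]
    print(cap_width_cm(cluster_map, mushroom_label=3, cp_cm_per_pixel=0.5))
```

Because the widest row is taken over the whole image, a second mushroom in the frame would distort the result, which is consistent with the single-mushroom requirement stated above.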

Showing the Results
The developed software shows the analysis results in separate image boxes. The calculated cap width value is provided in the textbox.
Algorithm 1 Gray K-Means segmentation algorithm.
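The listing of Algorithm 1 is not reproduced here, so the following is only a minimal sketch of the iteration described in the segmentation steps: assign each pixel to the nearest center, accumulate per-cluster sums and counts, stop when the cluster map no longer changes, and otherwise recompute each center as sum divided by count. The function name `gray_kmeans`, the `max_iter` safeguard and the handling of empty clusters are assumptions. Labels 0, 1 and 2 stand in for the black, gray and white painting of clusters 1, 2 and 3.

```python
def gray_kmeans(image, centers, max_iter=100):
    """Cluster greyscale intensities around the given starting centers.

    Returns (cluster_map, centers), where cluster_map holds a label per pixel.
    """
    centers = list(centers)
    previous_map = None
    cluster_map = []
    for _ in range(max_iter):
        sums = [0.0] * len(centers)   # per-cluster intensity sums
        counts = [0] * len(centers)   # per-cluster member counts
        cluster_map = []
        for row in image:
            map_row = []
            for value in row:
                # In one dimension the Euclidean distance is |value - center|.
                distances = [abs(value - c) for c in centers]
                k = distances.index(min(distances))
                sums[k] += value
                counts[k] += 1
                map_row.append(k)
            cluster_map.append(map_row)
        if cluster_map == previous_map:  # map unchanged: stop iterating
            break
        previous_map = cluster_map
        # New center = cluster sum / cluster count (keep old center if empty).
        centers = [
            sums[k] / counts[k] if counts[k] else centers[k]
            for k in range(len(centers))
        ]
    return cluster_map, centers


if __name__ == "__main__":
    image = [[10, 12, 100], [102, 200, 205]]
    print(gray_kmeans(image, centers=[0, 128, 255]))
```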

Experimental Results
Several tests were performed with the software whose GUI structure is shown in Fig. 5. In the first phase segmentation performance tests are carried out, and in the second phase cap width measurement performance tests are carried out.

Segmentation Performance Test
At this phase the segmentation performance of the designed system is tested. Five different mushroom images are used for the tests. Test results are provided in Fig. 10. The first row of Fig. 10 shows the images to be processed. The second row contains the grayscale k-means maps, and the third row contains the color k-means maps of the corresponding images. The segmented mushroom images can be observed in the fourth row. The fifth row contains the segmented mushroom images with their edges determined. Results show that the designed system successfully segments the provided images. System performance does not change on complex background images.

Mushroom Cap width Measurement Performance
In this step cap width measurement is performed. Our software measured the cap width of the mushroom as 8.48 cm, whereas the real cap width of the mushroom is 8.66 cm. To find the measurement error in this result, the Absolute Error (∆ae) and Relative Error (re%) defined in Eq. (8) and Eq. (9) are used. The error analysis resulted in the following error values: ∆ae = 0.18 cm, re% = 2.12.
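Eq. (8) and Eq. (9) are not reproduced here, so the definitions in the sketch below are assumptions: the absolute error is taken as the magnitude of the difference between the real and measured widths, and the relative error as that difference divided by a reference value, in percent. With the figures quoted above, using the measured width as the reference gives 2.12, while using the real width gives about 2.08.

```python
def absolute_error(measured, real):
    """Magnitude of the difference between real and measured values."""
    return abs(real - measured)


def relative_error_percent(measured, real, reference):
    """Absolute error as a percentage of the chosen reference value."""
    return absolute_error(measured, real) / reference * 100


if __name__ == "__main__":
    measured, real = 8.48, 8.66  # cap widths in cm, from the text
    print(round(absolute_error(measured, real), 2))                          # 0.18
    print(round(relative_error_percent(measured, real, reference=measured), 2))  # 2.12
    print(round(relative_error_percent(measured, real, reference=real), 2))      # 2.08
```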

Conclusion
In this study we developed GUI-based software for K-Means image segmentation. The raw input image is filtered and histogram equalized at the beginning of the process. Segmentation is then performed on the processed image with the k-means method. To improve segmentation performance, the k-means algorithm is modified. Test results show that the designed software successfully performs segmentation even on complex background images. The analysis also shows that the histogram and noise reduction processes play an important role in a successful segmentation. The software can also measure the cap width of a mushroom to provide information about the size of the mushroom in the image.
In Fig. 1, a comparison of the classical color k-means algorithm and the proposed color k-means algorithm is provided. The algorithms are tested on 3 different mushroom types and, as can be seen from the figure, the proposed algorithm resulted in more successful segmentation. The figure also shows that segmenting an image without preprocessing using the classical color k-means algorithm causes some problems.

Fig. 1: Comparison of classical color K-Means and proposed color K-Means algorithms.

Figure 4(a) shows the filtered image, Fig. 4(b) shows its histogram, Fig. 4(c) shows the histogram equalized image and Fig. 4(d) shows the histogram of the equalized image. As can be seen from the figures, after histogram equalization the intensities are distributed more evenly across the pixels.

Fig. 5: GUI structure of the software that segments mushroom and calculates cap width.

Fig. 7: Color segmentation map and cluster images.

4: cluster1(i,j,:) = MushroomImage(i,j,:);

During the test 19 different mushrooms of different sizes and types are used. Figure 11 provides the results in ascending order of the m_real value. Statistical analysis is performed on the ∆ae and re% values found during the tests and the results are provided in Tab. 1. As the results show, ∆ae and re% are very low.

Tab. 1: Statistical results.