Edge-Based Color Image Segmentation Using Particle Motion in a Vector Image Field Derived from Local Color Distance Images

This paper presents an edge-based color image segmentation approach derived from the method of particle motion in a vector image field, which could previously be applied only to monochrome images. Rather than using an edge vector field derived from a gradient vector field and a normal compressive vector field derived from a Laplacian-gradient vector field, two novel orthogonal vector fields are computed directly from a color image, one parallel and the other orthogonal to the edges. These are then used in the model to force a particle to move along the object edges. The normal compressive vector field is created from the collection of center-to-centroid vectors of local color distance images. The edge vector field is then derived from the normal compressive vector field so as to obtain a vector field analogous to a Hamiltonian gradient vector field. Using the PASCAL Visual Object Classes Challenge 2012 (VOC2012) dataset and the Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500), the benchmark scores of the proposed method are compared with those of the traditional particle motion in a vector image field (PMVIF), watershed, simple linear iterative clustering (SLIC), K-means, mean shift, and J-value segmentation (JSEG) methods. The proposed method yields a better Rand index (RI), global consistency error (GCE), normalized variation of information (NVI), boundary displacement error (BDE), and Dice coefficients, as well as faster computation and stronger noise resistance.


Introduction
In digital image processing, image segmentation, which reduces the amount of unnecessary data while preserving the important information needed for analysis, plays an important role in image analysis. In general, image segmentation gathers pixels displaying similar characteristics within the same areas and converts them into regions. Image segmentation methods can be divided into two main groups: machine learning image segmentation and classical image segmentation. First, machine learning image segmentation is a method by which a program can learn and segment an object by itself, without further adjustment of the program. There are three types of machine learning approaches: supervised, unsupervised, and reinforcement learning methods. Supervised methods use a training dataset containing ground truth data to train artificial neural networks to map between input images and segmented results (see the survey [1]). However, the training process is computationally intensive, and the construction of the ground truth, which requires manual labeling by experts, is labor-intensive. Additionally, when a new object class is added, the whole training dataset must be thoroughly reconstructed and the time-consuming training process must be repeated. In contrast, unsupervised methods do not require a dataset for training. Instead, the result of each iteration is recursively fed back to the program to adjust its parameters. This type of approach, such as K-means [2,3], mean shift [4,5], and JSEG [6,7], is often more effective and more tolerant of unusual or unpredictable situations. However, unsupervised methods are usually time-consuming due to the iterative processes embedded in them. Finally, the reinforcement learning method uses reward and punishment techniques based on environmental analysis to drive the agent toward the target. This method requires a large number of iterations to train the agent to obtain rewards [8][9][10].
Second, classical image segmentation is a low-level image processing approach that tries to extract information without knowing the ground truth. Although machine learning image segmentation is nowadays the state of the art [11], classical image segmentation is still necessary in cases where no ground truth images exist or where there is a time constraint. Classical image segmentation also helps to create ground truth data in training datasets for machine learning techniques. Classical image segmentation techniques comprise thresholding-based, edge-based, region-based, and graph-based techniques. Thresholding-based techniques are divided into three types [12]: global thresholding, local thresholding, and adaptive thresholding. First, global thresholding weighs the distribution of intensity in the histogram to determine a threshold for separating objects from the background [13][14][15]. Second, local thresholding is used when a single threshold is not feasible, as for images with uneven illumination or shadows; in such cases, a threshold must be selected for each sub-image [16]. Third, adaptive thresholding computes a threshold for each pixel from the intensities of its neighbors within a window [17,18]. Edge-based techniques, such as zero-crossing [19], Active Canny [20], PMVIF [21][22][23], EdgeFlow [24], and PointFlow [25], extract object boundaries in the image by creating contours around the objects. Region-based techniques, such as watershed [26,27], are based on the principle of grouping pixels with similar properties into the same regions or objects. Finally, graph-based techniques, such as graph cuts [28,29], normalized cuts [30,31], Superpixel [32,33], and SLIC [34,35], group pixels according to graph theory. The methods mentioned above have various strengths and weaknesses in terms of segmentation accuracy, processing time, flexibility, ease of use, and robustness to noise.
For example, some algorithms can be applied only to grayscale images, while others are available only in the RGB color space. Some methods are not suitable for real-time use or require the adjustment of too many parameters.
This paper introduces an edge-based classical image segmentation algorithm for color images using particle motion in a vector image field derived from local color distance images (PMLCD). It is developed from the PMVIF algorithm, which is known to have a fast computation time and to yield closed boundaries but can be applied only to grayscale images. In the PMVIF algorithm, two vector fields, namely, the normal compressive vector field and the edge vector field, derived from derivatives of the grayscale image, are used to force a particle to move along object edges, resulting in closed particle trajectories that resemble the object boundaries. In order to extend this principle to the color image segmentation task, new formulae for computing the normal compressive vector field and the edge vector field, derived from local color distance images, are introduced. The proposed method can be used not only with color images but also with multichannel images such as hyperspectral images.
The rest of the paper is organized as follows: Section 2 describes the principle of the PMVIF algorithm; Section 3 describes the developed color image segmentation using particle motion in a vector image field derived from local color distance images; Section 4 presents the experimental validations and benchmarking of the proposed algorithm; finally, conclusions are drawn in Section 5.

Background to Particle Motion in a Vector Image Field
This section describes the principle of the traditional boundary extraction algorithm based on particle motion in a vector image field (PMVIF), an edge-based classical image segmentation approach. In general, in an N-dimensional space, a boundary can be explicitly represented by a manifold of dimension N−1 interfacing between regions of different attributes; for example, a closed curve in a two-dimensional space. However, in a discretized image, where a set of pixels or voxels is the only class that can exist, explicit representations of region boundaries, such as a curve or a surface, are difficult to encode. In this case, a normal compressive vector field [21][22][23], in which all vectors are normal to the nearest interface and point toward it, providing information about the direction to the nearest boundary, is more suitable as an implicit boundary representation. Nevertheless, the normal compressive vector field itself only provides information about the location of the boundary and offers no clue about the direction for tracking edges. In order to locate and track a boundary simultaneously, another vector field containing vectors parallel to the edges, namely an edge vector field, must be combined with the normal compressive vector field. The concept of using two such orthogonal vector fields for boundary extraction in a grayscale image was introduced in the PMVIF algorithm, where the gradient-Laplacian vector field, used as the normal compressive vector field, and the Hamiltonian gradient vector field, used as the edge vector field, are given as follows:

n = c ∇²P ∇P = c ∇²P ((∂P/∂x) î + (∂P/∂y) ĵ), (1)

e = −(∂P/∂y) î + (∂P/∂x) ĵ, (2)

where c is a normalization factor, and î, ĵ are unit vectors in the x and y directions, respectively. In general, the partial derivatives in Equations (1) and (2) can be approximated using difference operators such as Sobel operators. Figure 1 illustrates examples of the gradient ∇P, the Laplacian ∇²P, the edge vector field e, and the gradient-Laplacian vector field n.
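As a minimal sketch of Equations (1) and (2) in NumPy, the two driving fields can be computed as follows; the central-difference gradient, the wrap-around 5-point Laplacian, the function name `pmvif_fields`, and the (H, W, 2) field layout with the x component first are illustrative assumptions, not part of the original method:

```python
import numpy as np

def pmvif_fields(P):
    """Driving fields of PMVIF for a grayscale image P of shape (H, W).

    Returns (n, e) as (H, W, 2) arrays holding (x, y) components:
    n: gradient-Laplacian (normal compressive) field, cf. Equation (1);
    e: Hamiltonian gradient (edge) field, cf. Equation (2).
    """
    P = P.astype(float)
    Py, Px = np.gradient(P)                       # dP/dy (rows), dP/dx (cols)
    lap = (np.roll(P, 1, 0) + np.roll(P, -1, 0) +
           np.roll(P, 1, 1) + np.roll(P, -1, 1) - 4.0 * P)  # 5-point Laplacian (wraps at borders)
    n = np.stack([lap * Px, lap * Py], axis=-1)   # Laplacian times gradient
    c = np.linalg.norm(n, axis=-1).max()
    if c > 0:
        n /= c                                    # normalize so max |n| = 1
    e = np.stack([-Py, Px], axis=-1)              # 90-degree rotation of the gradient
    return n, e
```

On a linear intensity ramp the Laplacian vanishes in the interior, so n is zero there while e is a constant field parallel to the (vertical) level sets, as expected.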
In order to extract object boundaries, sequences of boundary points are obtained from the trajectories of a particle driven by the combined force field α e + β n, computed as follows:

P_{k+1} = P_k + α e_k + β n_k, (3)

where P_k is the kth particle position vector; e_k is the edge vector, interpolated at the kth particle position; n_k is the normal compressive vector, interpolated at the kth particle position; α is a tangential stepping factor, with α > 0 for a particle moving in a clockwise direction and α < 0 for a particle moving in a counter-clockwise direction; and β, β > 0, is a normal stepping factor allowing the trajectory to converge to a boundary line. Figure 2 demonstrates a combined vector field α e + β n, with α = 0.5 and β = 0.5, and a boundary extraction result obtained from a particle trajectory according to Equation (3), as applied to the image in Figure 1. The PMVIF works well in extracting the boundaries of regions with constant intensity in grayscale images, providing subpixel resolution results. Nevertheless, the limitation of the PMVIF method is that the edge and normal compressive vector fields are derived from partial derivative operations that can only be applied to a scalar, or intensity, image. In the case of color or multispectral images, in which each pixel is a vector, there is no exact definition of the gradient and Laplacian operators, which prevents the application of the PMVIF method to color images. To overcome this limitation, a new scheme to generate a normal compressive vector field and an edge vector field for vector images is required.
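The particle update of Equation (3) can be sketched as follows, assuming bilinear interpolation of the vector fields at sub-pixel positions; the helper names and the convention that fields store (x, y) components while arrays are indexed [row, column] are illustrative assumptions:

```python
import numpy as np

def bilinear(F, p):
    """Bilinearly interpolate a vector field F of shape (H, W, 2) at sub-pixel p = (x, y)."""
    x, y = p
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x0, y - y0
    return ((1 - fx) * (1 - fy) * F[y0, x0] + fx * (1 - fy) * F[y0, x0 + 1] +
            (1 - fx) * fy * F[y0 + 1, x0] + fx * fy * F[y0 + 1, x0 + 1])

def trace(e, n, p0, alpha=0.5, beta=0.2, steps=100):
    """Particle trajectory from Equation (3): p_{k+1} = p_k + alpha*e_k + beta*n_k."""
    path = [np.asarray(p0, dtype=float)]
    for _ in range(steps):
        p = path[-1]
        path.append(p + alpha * bilinear(e, p) + beta * bilinear(n, p))
    return np.array(path)
```

In the full method the loop would also stop when the particle returns to its starting point or meets a previously extracted path; this sketch simply runs a fixed number of steps.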

Methodology
The PMVIF algorithm requires both normal compressive and edge vector fields as particle driving forces. Because the gradient is defined only for scalar images, the original PMVIF method can be applied only to intensity images. This paper presents the PMLCD method for finding the normal compressive and edge vector fields of color images using the center-to-centroid vectors of local color distance images, as described below.

Image Moments
For a discrete image I(x, y), the two-dimensional moment of order (p, q) [36] is defined as

m_pq = Σ_x Σ_y x^p y^q I(x, y). (4)

Analogous to a center of gravity in classical mechanics, the centroid (x̄, ȳ) of an image I(x, y) can be calculated as follows:

x̄ = m_10/m_00, ȳ = m_01/m_00. (5)

A displacement between the center and the centroid of an image indicates an unbalanced distribution of pixel intensity in the spatial domain.
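As a quick illustration of Equations (4) and (5), the centroid can be computed with NumPy as follows (the function name is ours):

```python
import numpy as np

def centroid(I):
    """Centroid (x_bar, y_bar) of image I via raw moments m_pq = sum x^p y^q I(x, y)."""
    y, x = np.mgrid[0:I.shape[0], 0:I.shape[1]]   # row = y, column = x
    m00 = I.sum()
    return (x * I).sum() / m00, (y * I).sum() / m00
```

For an image with a single nonzero pixel, the centroid coincides with that pixel's (x, y) position.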

Local Color Distance Images
In general, image segmentation can be viewed as a process that determines in which region each pixel should be located. For a multispectral image I, one feature widely used to determine whether or not pixels should belong to the same region is the color distance between two pixels, defined as

d((x, y), (i, j)) = sqrt( Σ_n (I_n(x, y) − I_n(i, j))² ), (6)

where I_n(x, y) and I_n(i, j) are the nth color components of pixels (x, y) and (i, j), respectively. From the data classification perspective, the color distance functions as a dissimilarity measure between two pixels. Using the concept of a moving window, a local color distance image (LCD) of the pixels surrounding pixel (i, j) can be computed as

LCD(x − i, y − j) = d((x, y), (i, j)), (x, y) ∈ N(i, j), (7)

where N(i, j) is the neighborhood of the center pixel (i, j). Each pixel in LCD(x − i, y − j) represents the color distance between a neighboring pixel (x, y) and the center pixel (i, j). Figure 3a illustrates examples of RGB local color distance images (i)-(v), obtained using a circular moving window computed at various places in a simple two-object image. As seen in cases (i) and (v), if the circular window lies entirely inside one region, the local color distance image contains all zero pixels. Conversely, if the circular window is located at the border between two regions, the obtained local color distance image comprises pixels with large values packed to one side of the image, as shown in cases (ii)-(iv) of Figure 3a. As a result, the centroid CT of the local color distance image, computed using Equation (5), is shifted from the center C toward the high color distance area belonging to the adjacent region. Thus, for a local color distance image located in the proximity of a boundary, the vector from the center C to the centroid CT points in the direction of the nearest boundary, regardless of which side of the boundary the center lies on; see, for example, cases (iii) and (iv) in Figure 3a.
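Equations (6) and (7) can be sketched as follows, assuming a circular window of radius R, with (i, j) interpreted as (column, row) and zeros outside the circle; all names are illustrative:

```python
import numpy as np

def local_color_distance(I, i, j, R):
    """Local color distance image around center pixel (i, j) of color image I.

    Returns a (2R+1, 2R+1) patch whose pixels are the Euclidean color distances
    to the center pixel, masked to a circular window of radius R (zeros outside).
    """
    patch = I[j - R:j + R + 1, i - R:i + R + 1].astype(float)  # (i, j) = (column, row)
    d = np.sqrt(((patch - I[j, i].astype(float)) ** 2).sum(axis=-1))  # Equation (6)
    yy, xx = np.mgrid[-R:R + 1, -R:R + 1]
    d[xx ** 2 + yy ** 2 > R ** 2] = 0.0                        # circular neighborhood N(i, j)
    return d
```

For a window placed at the border of a two-color image, the nonzero distances pile up on the side belonging to the other region, which is exactly the asymmetry the centroid shift exploits.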

The Normal Compressive Vector Field
By gathering the center-to-centroid (C-to-CT) vectors of the local color distance images obtained at all valid positions in an original image, a normal compressive vector field n can be computed as

n(i, j) = C [ (x̄_(i,j) − i) î + (ȳ_(i,j) − j) ĵ ], (8)

where C is a normalization factor making max |n(i, j)| = 1, and (x̄_(i,j), ȳ_(i,j)) is the centroid, computed using Equation (5), of LCD(x − i, y − j) computed using Equation (7). Figure 3b demonstrates the n of the image in Figure 3a. By combining Equations (5)-(7), n(i, j) can be directly computed as

n(i, j) = (C / Σ_{(x,y)∈N(i,j)} LCD(x − i, y − j)) [ Σ_{(x,y)∈N(i,j)} (x − i) LCD(x − i, y − j) î + Σ_{(x,y)∈N(i,j)} (y − j) LCD(x − i, y − j) ĵ ]. (9)

It is worth noting that, in this vector field, the phenomenon whereby a vector on one side of a boundary always points in the direction opposite to a vector on the other side is called the normal compressive property. In the PMVIF technique, this property causes a particle to cling to the object boundary. The difference between the n of Equation (1) and that of Equation (9) is that the vectors obtained from Equation (1) are smaller than those from Equation (9), as shown in Figure 4. Because Equation (9) uses the LCD principle, whereas Equation (1) operates on a grayscale image in which the intensities of all bands are collapsed together before the gradient is taken, resulting in smaller vectors, Equation (9) is more suitable for color images.
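Gathering the center-to-centroid vectors at every valid pixel, as in Equations (8) and (9), might look like the following naive sketch (a direct double loop for clarity; a practical implementation would vectorize this, and the coordinate and naming conventions are our own assumptions):

```python
import numpy as np

def normal_field(I, R):
    """Normal compressive field n (cf. Equations (8)/(9)): center-to-centroid
    vectors of the local color distance image at every valid pixel."""
    H, W = I.shape[:2]
    n = np.zeros((H, W, 2))
    yy, xx = np.mgrid[-R:R + 1, -R:R + 1]
    mask = (xx ** 2 + yy ** 2 <= R ** 2).astype(float)   # circular window
    for j in range(R, H - R):
        for i in range(R, W - R):
            patch = I[j - R:j + R + 1, i - R:i + R + 1].astype(float)
            d = np.sqrt(((patch - I[j, i]) ** 2).sum(axis=-1)) * mask  # LCD patch
            m00 = d.sum()
            if m00 > 0:
                # centroid offset (x_bar - i, y_bar - j) in window coordinates
                n[j, i] = [(xx * d).sum() / m00, (yy * d).sum() / m00]
    c = np.linalg.norm(n, axis=-1).max()
    return n / c if c > 0 else n
```

On a two-region test image, vectors on opposite sides of the boundary point toward each other, which is the normal compressive property described above.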

The Edge Vector Field
The edge vector field in the original PMVIF method, used to drive a particle in a direction parallel to object edges in a grayscale image, is derived from a Hamiltonian gradient vector field. However, such a vector field cannot be generated for vector images, such as color images, where each pixel is represented by a color vector. In order to create a vector field analogous to the edge vector field, a vector-to-scalar conversion scheme must first be applied to the color image, with the uniqueness condition that different colors, normally represented by vectors, are represented by different scalar values. A linearization technique that converts a color image into a scalar auxiliary image, based on the number base system, is proposed in this paper as follows:

Aux(x, y) = m^(n−1) I_n(x, y) + m^(n−2) I_{n−1}(x, y) + … + m² I_3(x, y) + m I_2(x, y) + I_1(x, y), (10)

where m is the maximum intensity level of each color component. The auxiliary image is created to determine whether a neighboring pixel (x, y) has the same color as the center pixel (i, j). Thus, the difference between Aux(x, y) and Aux(i, j) alone is sufficient to determine whether pixels (x, y) and (i, j) have the same color.
To obtain a gradient-like vector field, the vectors outside objects in the normal compressive vector field, as demonstrated in Figure 3b, must be reverted, while the vectors inside objects retain the same direction. Thus, Equation (9) is modified by multiplying the local color distance by the sign of the difference between auxiliary image pixels as follows:

G(i, j) = (C / Σ_{(x,y)∈N(i,j)} LCD(x − i, y − j)) [ Σ_{(x,y)∈N(i,j)} (x − i) sign(Aux(x, y) − Aux(i, j)) LCD(x − i, y − j) î + Σ_{(x,y)∈N(i,j)} (y − j) sign(Aux(x, y) − Aux(i, j)) LCD(x − i, y − j) ĵ ], (11)

where C is a normalization factor so that max |G(i, j)| = 1 and sign(·) denotes the signum function. As a result, the normal compressive property of n in Equation (9), i.e., that a vector on one side of a boundary always points in a direction opposite to a vector on the other side, as shown in Figure 3b, is transformed into a gradient-like property of G, where the vectors on both sides of objects always point in the same direction, as shown in Figure 5a. Next, by rotating all vectors in G by 90°, an edge vector field similar to a Hamiltonian gradient vector field is obtained as

e(i, j) = −G_y(i, j) î + G_x(i, j) ĵ, (12)

as shown in Figure 5b. Notice that the vectors of e in the proximity of boundaries are always larger than those in areas farther away. Therefore, the magnitude of e can be used as a measure for localizing object edges.
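A sketch of Equations (10)-(12) follows, combining the auxiliary-image linearization with the signed LCD centroids and the 90° rotation. Note that this toy version weighs the first channel highest in the base-m expansion, which reverses the channel order of Equation (10) but preserves the uniqueness property; all function names and conventions are illustrative:

```python
import numpy as np

def aux_image(I, m=256):
    """Base-m linearization of the color channels into one scalar per pixel:
    distinct colors map to distinct integers."""
    A = np.zeros(I.shape[:2], dtype=np.int64)
    for ch in range(I.shape[2]):
        A = A * m + I[..., ch].astype(np.int64)
    return A

def edge_field(I, R, m=256):
    """Signed-LCD gradient-like field G (cf. Equation (11)) rotated by 90
    degrees to give the edge field e (cf. Equation (12))."""
    A = aux_image(I, m)
    H, W = I.shape[:2]
    e = np.zeros((H, W, 2))
    yy, xx = np.mgrid[-R:R + 1, -R:R + 1]
    mask = (xx ** 2 + yy ** 2 <= R ** 2).astype(float)
    for j in range(R, H - R):
        for i in range(R, W - R):
            patch = I[j - R:j + R + 1, i - R:i + R + 1].astype(float)
            d = np.sqrt(((patch - I[j, i]) ** 2).sum(axis=-1)) * mask   # LCD patch
            s = np.sign(A[j - R:j + R + 1, i - R:i + R + 1] - A[j, i])  # sign of Aux difference
            m00 = d.sum()
            if m00 > 0:
                gx = (xx * s * d).sum() / m00
                gy = (yy * s * d).sum() / m00
                e[j, i] = [-gy, gx]              # rotate G by 90 degrees
    c = np.linalg.norm(e, axis=-1).max()
    return e / c if c > 0 else e
```

On a two-region test image with a vertical boundary, the resulting e vectors on both sides of the boundary point in the same direction, parallel to the edge.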

Particle Motion in a Vector Image Field Derived from Local Color Distance Images
The proposed boundary extraction algorithm is based on particle motion in a vector image field derived from local color distance images (PMLCD): the normal compressive vector field is calculated using Equation (9), the edge vector field is calculated using Equation (12), and particle trajectories are then obtained using Equation (3). The object boundaries are subsequently extracted from the collection of these trajectories. The remaining steps of the PMLCD method are the same as those of the PMVIF method.

Appropriate PMLCD Parameter Setting
The PMLCD method has three parameters: T_|e|, α, and β. The T_|e| parameter sets the threshold of |e| used to determine the starting points of the particle, while α is the strength with which the particle moves in the direction parallel to the object edges, and β is the strength with which the particle attaches to the object edges. These parameters are difficult to adjust manually. This article, therefore, suggests methods for setting all three parameters, as follows. T_|e| is determined using Otsu's threshold method [14]. α and β are related through Equations (13) and (14), where V_c is the mean of the normalized variances of the color image channels, V_e is the normalized variance of |e|, V_n is the normalized variance of |n|, and γ is the ratio of α to β. The variance V of each channel, with the channel reshaped into a vector A of N scalar observations with mean μ, is defined in the standard way as V = (1/(N − 1)) Σ_i |A_i − μ|².
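Since T_|e| is obtained from Otsu's method, a compact histogram-based sketch of that thresholding step may be helpful (an illustration of the standard algorithm, not the authors' code; the function name and bin count are assumptions):

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Otsu's threshold: pick the histogram bin center that maximizes the
    between-class variance of a two-class split."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)                 # probability of the lower class
    mu = np.cumsum(p * centers)       # cumulative mean
    mu_t = mu[-1]                     # global mean
    sigma_b = np.zeros_like(w0)
    valid = (w0 > 0) & (w0 < 1)
    sigma_b[valid] = (mu_t * w0[valid] - mu[valid]) ** 2 / (w0[valid] * (1 - w0[valid]))
    return centers[np.argmax(sigma_b)]
```

Applied to the magnitudes |e|, the returned value would serve as the starting-point threshold T_|e| described above.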

Overall Boundary Extraction Method
The overall segmentation algorithm for color images, as illustrated in Figure 6, is described here. First, the image is smoothed with a Gaussian low-pass filter to remove noise. The normal compressive vector field n and the edge vector field e are then calculated using Equations (9) and (12), respectively, with a circular moving window of radius R. Local maximum points of |e| greater than a threshold are chosen as candidates for the starting points of the boundary extraction process, where the suitable threshold value is determined by applying Otsu's method to |e|. Commencing at each starting point, under the influence of the compressing edge vector field, a particle is forced to move along object edges according to Equation (3), in both the clockwise (α > 0) and counter-clockwise (α < 0) directions, with a subpixel step size, until it reaches a starting point or a previously extracted path. Boundaries are then collected from all obtained particle trajectories, and a complete edge map is achieved by quantizing the extracted boundaries. Finally, these boundaries are labeled with a fast region growing algorithm to produce the color image segmentation result. Figure 7 illustrates image segmentation results obtained using both PMVIF and PMLCD, evaluated on the same grayscale image and on the original color image. The parameters used in all cases were T_|e| = 0.08, α = 0.5, and β = 0.2 (for both PMVIF and PMLCD), with an LCD radius of 1 (for PMLCD). As seen in Figure 7b,c, PMLCD can be applied to both grayscale and color images. In addition, compared to the results evaluated on the grayscale image, PMLCD evaluated on the original color image provided the best result, with the fewest false contours.

Experimental Results and Discussion
This section presents the experimental results of color image segmentation, obtained using MATLAB 2019b on an Intel Core i7-4710HQ CPU with the VOC2012 dataset [37] and the BSDS500 dataset [38], measuring the performance of PMLCD against unsupervised machine learning methods, including K-means [2,3], mean shift [4,5], and JSEG [6,7], and classical methods, including grayscale PMVIF [21][22][23], grayscale watershed [26,27], and SLIC [34,35]. The benchmarks used in this paper are the Rand index (RI) [39], global consistency error (GCE) [39], normalized variation of information (NVI) [40], boundary displacement error (BDE) [39], Dice coefficient [39], computation time, and noise tolerance. Figure 9 shows the experimental color image segmentation results obtained from all methods using image #2007_000063 from VOC2012. Figure 10 shows the similarities between the object chosen from the ground truth image (a dog) and the corresponding segmented regions obtained from all methods in Figure 9. Figure 11 shows the results of the same experiment for images randomly selected from VOC2012 and BSDS500. As shown in Table 1, the parameters of all methods were adjusted for each image to achieve high RI and Dice coefficients. The average benchmarking results show that the method with the highest average RI is PMLCD, at 0.78 (0.11). The methods with the lowest average GCE are PMLCD, at 0.13 (0.05), and watershed, at 0.13 (0.08). The methods with the lowest average NVI are JSEG, at 0.12 (0.01), and PMLCD, at 0.12 (0.04). The method with the lowest average BDE is SLIC, at 11.82 (4.12). The method with the fastest average computation time is watershed, at 0.06 (0.01) seconds. The method with the highest average Dice coefficient is PMLCD, at 0.93 (0.03). In brief, the PMLCD method yields the best average values for four measures: RI, GCE, NVI, and the Dice coefficient.
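For reference, the Dice coefficient used in the benchmarks reduces to a few lines for binary masks (an illustration of the standard definition, not the benchmark implementation from [39]):

```python
import numpy as np

def dice(seg, gt):
    """Dice coefficient between two binary masks: 2|A intersect B| / (|A| + |B|)."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    inter = np.logical_and(seg, gt).sum()
    return 2.0 * inter / (seg.sum() + gt.sum())
```

It equals 1.0 for identical masks and 0.0 for disjoint ones, so higher values indicate better overlap between a segmented region and its ground truth object.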
Figure 12 shows graphs of the computation times used to segment image #3096 from BSDS500, interpolated to various image sizes. As seen, the watershed, PMVIF, and PMLCD methods are the fastest, in that order, but PMLCD is the only true color image segmentation method among them. Figure 13 demonstrates the segmentation of a noisy color image, #2007_001289 from VOC2012 with additive white Gaussian noise (signal-to-noise ratio (SNR) of 0 dB, σ_noise = 0.21), obtained using the PMLCD algorithm with the following parameters: LCD radius = 3, T_|e| = 0.27 derived from Otsu's method, and γ = −0.14, resulting in α = 0.34 and β = 0.66 obtained from Equations (13) and (14), respectively. The result gives the following benchmarks: RI = 0.91, GCE = 0.01, NVI = 0.01, BDE = 90.23, and computation time = 0.72 s. Figure 14 shows the SNR-performance graph of the PMLCD applied to this image, reflecting the high noise tolerance of the PMLCD method.

Conclusions
The PMLCD color image segmentation algorithm is developed from the traditional method of particle motion in a vector image field (PMVIF), which uses two mutually orthogonal vector fields, namely a normal compressive vector field and an edge vector field, to force a particle to travel along object boundaries. Unlike the formulae used in the original PMVIF method, the normal compressive vector field is derived from the center-to-centroid vectors of local color distance images, whereas a gradient-like vector field is derived from center-to-centroid vectors of local color distance images in which each pixel is multiplied by the sign of the difference between auxiliary image pixels. An edge vector field is then obtained by rotating each vector in the gradient-like vector field by 90° to achieve a Hamiltonian gradient-like field. In addition, for ease of use, a method for setting the parameters related to particle movement, including T_|e|, α, and β, is introduced. Experimental results show that the proposed method yields promising results, with better RI, GCE, NVI, and Dice measures as well as fast computation and good noise resistance. Since the proposed algorithm is based on color distance measurement, which can be applied to both scalar and vector images, it outperforms grayscale-based methods, especially in regions whose edge information cannot be visualized in the grayscale image domain. Moreover, the method is useful not only for segmenting color images but also for all types of color spaces and vector images, including multispectral and hyperspectral images.