Clustering fuzzy objects using ant colony optimization

Article history: Received June 2, 2013; received in revised format September 7, 2013; accepted September 7, 2013; available online September 9, 2013.

This paper deals with the problem of grouping a set of objects into clusters. The objective is to minimize the sum of squared distances between objects and centroids. This problem is important because of its applications in different areas. In prior literature on this problem, attributes of objects have often been assumed to be crisp numbers. However, since in many realistic situations object attributes may be vague and should better be represented by fuzzy numbers, we are interested in the generalization of the minimum sum-of-squares clustering problem with the attributes being fuzzy numbers. Specifically, we consider the case where an object attribute is a triangular fuzzy number. The problem is first formulated as a fuzzy nonlinear binary integer programming problem based on a newly proposed dissimilarity measure, and then solved by developing and demonstrating a problem-specific ant colony optimization algorithm. The proposed algorithm is evaluated by computational experiments.

© 2013 Growing Science Ltd. All rights reserved.


Introduction
Clustering involves partitioning a set of objects into clusters in such a way that the objects belonging to the same cluster must be as similar as possible, while those belonging to different clusters must be as dissimilar as possible. Cluster analysis has found applications in different areas including image segmentation, information retrieval, marketing, analysis of chemical compounds, etc. Considering the crispness or fuzziness of classes as well as attributes of objects, clustering models can be categorized as follows (D'Urso & Giordani, 2006):
• Crisp clustering of crisp objects
• Crisp clustering of fuzzy objects
• Fuzzy clustering of crisp objects
• Fuzzy clustering of fuzzy objects
In crisp clustering, also known as hard clustering, each object belongs to exactly one cluster, while in fuzzy clustering an object has a degree of membership in each cluster, i.e., the clusters are allowed to overlap. In both crisp and fuzzy clustering, object attributes may be represented by crisp or fuzzy numbers.
Most studies conducted on clustering problems have assumed that object attributes are fixed and deterministic (crisp clustering of crisp objects, in particular). However, in many real-world situations, due to the imprecision or uncertainty of data sources, the attributes should better be represented by fuzzy numbers. Consequently, dealing with clustering of fuzzy objects can provide a great deal of applications and advantages.
The fuzzy c-means algorithm (Bezdek, 1981) and its variations such as the Gustafson-Kessel algorithm (Gustafson & Kessel, 1979) are the most popular fuzzy clustering techniques. Metaheuristic algorithms have also been applied to solve fuzzy clustering problems (see, e.g., Al-sultan & Fedjki, 1997; Kanade & Hall, 2004). Moreover, some researchers have paid attention to fuzzy data. Hathaway et al. (1996) have proposed fuzzy c-means clustering for trapezoidal fuzzy numbers. A fuzzy c-numbers clustering procedure for LR-type fuzzy numbers has been proposed by Yang and Ko (1996), and extended to conical fuzzy vectors by Yang and Liu (1999). Yang et al. (2004) have suggested fuzzy clustering algorithms for symbolic and fuzzy data. The so-called alternative fuzzy c-numbers clustering algorithm for LR-type fuzzy numbers has been proposed by Hung and Yang (2005) based on an exponential-type distance measure. D'Urso and Giordani (2006) have proposed a fuzzy c-means clustering model based on a weighted dissimilarity measure for comparing pairs of symmetric fuzzy data. Hung et al. (2010) have suggested a clustering procedure, which is robust to initial values and cluster number, by modifying the similarity-based clustering method proposed by Yang and Wu (2004) to handle LR-type fuzzy numbers. Recently, Jafari et al. (2013) have investigated the performance of two fuzzy clustering methods for clustering in cellular manufacturing.
This paper deals with the problem of crisp clustering of fuzzy objects. We consider the case where each object attribute is a triangular fuzzy number (TFN). In order to introduce a dissimilarity measure between fuzzy data, the (squared) Euclidean distance is generalized to TFNs. The problem is formulated as a fuzzy nonlinear binary integer programming problem with the objective of minimizing the sum of squared distances between objects and centroids. To solve the problem efficiently, an ant colony optimization (ACO) algorithm is then proposed.
The rest of the paper is organized as follows. In the next section, the problem is introduced and formulated. The proposed ACO algorithm is described in Section 3, followed by Section 4 providing computational results. Finally, Section 5 concludes the paper.

Problem definition
The problem of crisp clustering of fuzzy objects can be formulated, in general, as a problem of partitioning a finite set of $N$ objects into a given number $K$ of disjoint clusters. Each object is represented as an $R$-dimensional vector of fuzzy sets, where each dimension stands for a single attribute.
Let $w_{ij}$ be the association weight variable of object $i$ with cluster $j$, which is defined as

$$w_{ij} = \begin{cases} 1, & \text{if object } i \text{ is allocated to cluster } j, \\ 0, & \text{otherwise,} \end{cases} \qquad i = 1,\dots,N,\; j = 1,\dots,K.$$

Assuming that the objective is to minimize the sum of squared error, which is the most frequently used criterion in non-hierarchical (i.e., partitional) clustering (Jain et al., 1999), the problem of crisp clustering of fuzzy objects can be formulated as the following fuzzy nonlinear binary integer programming problem:

$$\begin{aligned} \min\ & \tilde{Z} = \sum_{i=1}^{N} \sum_{j=1}^{K} w_{ij}\, \tilde{D}_{ij}^{2} \\ \text{s.t.}\ & \sum_{j=1}^{K} w_{ij} = 1, \quad i = 1,\dots,N, \\ & \sum_{i=1}^{N} w_{ij} \ge 1, \quad j = 1,\dots,K, \\ & w_{ij} \in \{0, 1\}, \quad i = 1,\dots,N,\; j = 1,\dots,K, \end{aligned} \tag{1}$$

where $\tilde{D}_{ij}$ denotes a (fuzzy) distance between object $i$ and the center of cluster $j$, due to the fact that each cluster is identified by its center (or centroid). Clearly, each cluster center is an $R$-dimensional vector of fuzzy sets as well. It is noted that the first set of constraints ensures that each object belongs to only one cluster, while the second set of constraints ensures that at least one object is assigned to each cluster.
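The two sets of constraints can be checked mechanically for any candidate assignment. The following is a minimal sketch, assuming the assignment is stored as an $N \times K$ 0/1 matrix; the function names `is_feasible` and `labels_to_w` are hypothetical and serve only as an illustration.

```python
import numpy as np

def labels_to_w(labels, n_clusters):
    """Build the N x K 0/1 matrix w from a list of cluster indices."""
    w = np.zeros((len(labels), n_clusters), dtype=int)
    w[np.arange(len(labels)), labels] = 1
    return w

def is_feasible(w):
    """Check the binary, one-cluster-per-object and no-empty-cluster constraints."""
    binary = np.isin(w, (0, 1)).all()
    one_cluster_per_object = (w.sum(axis=1) == 1).all()
    no_empty_cluster = (w.sum(axis=0) >= 1).all()
    return bool(binary and one_cluster_per_object and no_empty_cluster)
```

For instance, assigning three objects to clusters `[0, 1, 1]` with $K = 2$ is feasible, whereas `[0, 0, 0]` leaves the second cluster empty and violates the second set of constraints.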
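In the problem considered in this paper, it is assumed that TFNs are used to embody the imprecision and uncertainty of data sources. For a TFN, a particular case of fuzzy sets, the decision maker only needs to estimate three values for an object attribute: the most plausible, pessimistic and optimistic values. Let $\tilde{A}_{il}$ be the TFN representing the value of the $l$th attribute of object $i$. $\tilde{A}_{il}$ is denoted by the triplet $(a_{il}^{1}, a_{il}^{2}, a_{il}^{3})$, where $a_{il}^{2}$ is the most plausible value, and its membership function is

$$\mu_{\tilde{A}_{il}}(x) = \begin{cases} \dfrac{x - a_{il}^{1}}{a_{il}^{2} - a_{il}^{1}}, & a_{il}^{1} \le x \le a_{il}^{2}, \\[2mm] \dfrac{a_{il}^{3} - x}{a_{il}^{3} - a_{il}^{2}}, & a_{il}^{2} \le x \le a_{il}^{3}, \\[2mm] 0, & \text{otherwise.} \end{cases}$$

The $l$th attribute value of the center of cluster $j$ is denoted by $\tilde{M}_{jl}$, which can be obtained by averaging the $l$th attribute values of all objects belonging to the cluster as follows:

$$\tilde{M}_{jl} = \frac{\sum_{i=1}^{N} w_{ij}\, \tilde{A}_{il}}{\sum_{i=1}^{N} w_{ij}}, \qquad j = 1,\dots,K,\; l = 1,\dots,R. \tag{2}$$

Since, as is well known, the multiplication/division of a TFN by a scalar as well as the addition/subtraction of two or more TFNs is also a TFN (for more discussion on this type of fuzzy numbers, the reader is referred to Kaufmann & Gupta, 1991), $\tilde{M}_{jl}$ is a TFN $(m_{jl}^{1}, m_{jl}^{2}, m_{jl}^{3})$ with

$$m_{jl}^{t} = \frac{\sum_{i=1}^{N} w_{ij}\, a_{il}^{t}}{\sum_{i=1}^{N} w_{ij}}, \qquad t = 1, 2, 3. \tag{3}$$

As seen, like each of the objects, each cluster center is represented as an $R$-dimensional vector of TFNs.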
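The component-wise averaging of the cluster centers can be illustrated as follows. This is a minimal sketch, assuming the fuzzy objects are stored as an array of shape $(N, R, 3)$ whose last axis holds the three values of each TFN; the function name `cluster_centers` and the array layout are assumptions made for illustration.

```python
import numpy as np

def cluster_centers(a, labels, n_clusters):
    """Average the TFN attributes of the members of each cluster.

    a      : array of shape (N, R, 3), one TFN per object attribute
    labels : length-N sequence of cluster indices in 0..K-1
    returns: array of shape (K, R, 3), one TFN per center attribute
    """
    a = np.asarray(a, dtype=float)
    labels = np.asarray(labels)
    centers = np.empty((n_clusters,) + a.shape[1:])
    for j in range(n_clusters):
        members = a[labels == j]
        if len(members) == 0:
            raise ValueError(f"cluster {j} is empty")
        # Averaging TFNs component by component yields a TFN again.
        centers[j] = members.mean(axis=0)
    return centers
```

Note that a singleton cluster has a center equal to its only member, which is the situation addressed later by the modified objective function.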

Dissimilarity measure
This subsection describes how the distance between object $i$ and the center of cluster $j$, i.e., $\tilde{D}_{ij}$ given in model (1), is measured. In the literature, several measures of distance, dissimilarity and similarity between fuzzy data have been suggested (see, e.g., Pappis & Karacapilidis, 1993; Bloch, 1999; Szmidt & Kacprzyk, 2000; Kim & Kim, 2004; Yong et al., 2004; D'Urso & Giordani, 2006). However, in order to measure the distance between a pair of multidimensional vectors of TFNs, the traditional Euclidean distance is adopted here.
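By generalizing the squared Euclidean distance to TFNs, $\tilde{D}_{ij}^{2}$, referred to as the dissimilarity between object $i$ and the center of cluster $j$, can then be calculated as follows:

$$\tilde{D}_{ij}^{2} = \sum_{l=1}^{R} \tilde{d}_{ijl}^{\,2}, \tag{4}$$

where

$$\tilde{d}_{ijl} = \tilde{A}_{il} - \tilde{M}_{jl}. \tag{5}$$

It is clear that $\tilde{d}_{ijl}$ defined in Eq. (5) is a TFN as well. Therefore, $\tilde{d}_{ijl}$ is denoted as $(d_{ijl}^{1}, d_{ijl}^{2}, d_{ijl}^{3})$, where

$$d_{ijl}^{t} = a_{il}^{t} - m_{jl}^{3-(t-1)}, \qquad t = 1, 2, 3,\; i = 1,\dots,N,\; j = 1,\dots,K,\; l = 1,\dots,R. \tag{6}$$

Unfortunately, however, since $\tilde{d}_{ijl}^{\,2}$ obtained on the basis of the extension principle is not a TFN, for simplicity it is approximated as a TFN according to Definition 2 (Eqs. (7)-(9)), which distinguishes the cases where $\tilde{d}_{ijl}$ is positive, negative, or neither positive nor negative.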
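The computation of the dissimilarity can be sketched in code. The fuzzy subtraction follows Eq. (6); the TFN approximation of the square, however, is an assumption made here for illustration (a common endpoint-based rule with three cases), since the exact expressions of the approximation are not reproduced in this text. The function names are hypothetical.

```python
import numpy as np

def tfn_sub(a, m):
    """Fuzzy subtraction along the last axis: d^t = a^t - m^(4-t), as in Eq. (6)."""
    a = np.asarray(a, dtype=float)
    m = np.asarray(m, dtype=float)
    return a - m[..., ::-1]

def tfn_square_approx(d):
    """ASSUMED TFN approximation of the square of a TFN (illustration only)."""
    d = np.asarray(d, dtype=float)
    lo, mid, hi = d[..., 0], d[..., 1], d[..., 2]
    sq = d ** 2
    # Case "neither positive nor negative": the support contains zero.
    straddle = np.stack(
        [np.zeros_like(mid), mid ** 2, np.maximum(lo ** 2, hi ** 2)], axis=-1
    )
    positive = (lo >= 0)[..., None]   # keep the order of the squared values
    negative = (hi <= 0)[..., None]   # reverse the order of the squared values
    return np.where(positive, sq, np.where(negative, sq[..., ::-1], straddle))

def dissimilarity(obj, center):
    """Sum over the R attributes, as in Eq. (4); obj and center have shape (R, 3)."""
    return tfn_square_approx(tfn_sub(obj, center)).sum(axis=0)
```

Under this assumed approximation, `dissimilarity([[1, 2, 3]], [[5, 6, 7]])` and the call with swapped arguments both give the TFN (4, 16, 36), while the dissimilarity of (1, 2, 3) to itself is (0, 0, 4) rather than zero, because the subtraction of two equal TFNs is not zero.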

Some remarks
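Theorem 1 The proposed dissimilarity measure is a symmetric function.
Proof Taking into account Eqs. (4) and (5), to show that $\tilde{D}_{ij}^{2}$ is a symmetric function, it suffices to show that

$$(\tilde{A}_{il} - \tilde{M}_{jl})^{2} = (\tilde{M}_{jl} - \tilde{A}_{il})^{2}. \tag{10}$$

Let us consider the case where $\tilde{A}_{il} - \tilde{M}_{jl}$ is positive. Then its square is approximated by Eq. (7). In this case, $\tilde{M}_{jl} - \tilde{A}_{il} = (-d_{ijl}^{3}, -d_{ijl}^{2}, -d_{ijl}^{1})$ is negative and its square is approximated by Eq. (8), which gives the same TFN. As seen, Eq. (10) holds. In the case where $\tilde{M}_{jl} - \tilde{A}_{il}$ is positive, in a similar way we can easily show that Eq. (10) holds. Furthermore, if $\tilde{A}_{il} - \tilde{M}_{jl}$ is neither positive nor negative, $\tilde{M}_{jl} - \tilde{A}_{il}$ is neither positive nor negative as well. Then, from Eq. (9), both squares are approximated by the same TFN. Again, Eq. (10) holds, and the proof is complete. ∎

Furthermore, from Definition 2 it follows that $\tilde{d}_{ijl}^{\,2}$ is always approximated by a positive TFN. Taking into account Eq. (4), we then have the following corollary.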

Corollary 2
The proposed dissimilarity measure is positive (i.e., a positive TFN).
Theorem 1 and Corollary 2 show two essential properties of a distance measure. However, there is another important issue to be considered. When a cluster contains just one object, its center clearly coincides with that object (this is also shown by Eq. (3)) and consequently, the distance between the object and the cluster center should be zero. In other words, such a cluster should not make any contribution to the objective function. Because the subtraction of two equal TFNs is not zero (see Eq. (5) and Eq. (6)), it follows from Eq. (4) that singleton clusters would have an undesirable effect on the objective function if it were not revised. Hence, the objective function of model (1) is modified as given in Eq. (11), where $y_j$ is a binary variable defined by Eq. (12). Considering Eq. (11) and Eq. (12), the problem of crisp clustering of fuzzy objects can then be formulated as model (13), which does not require the additional variables $y_j$. Since $\tilde{D}_{ij}^{2}$ is a TFN, the objective function of model (13) is obviously the sum of some TFNs. We then have the following corollary.

Corollary 3
The objective function of model (13) becomes a TFN.
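The effect of excluding singleton clusters from the objective function can be sketched as follows. This is an illustration under stated assumptions rather than the formulation of model (13) itself: the objects are stored as an $(N, R, 3)$ array, and `dissim` is a hypothetical callable that returns the TFN dissimilarity between an object and a center as three values.

```python
import numpy as np

def objective(a, labels, n_clusters, dissim):
    """Sum the TFN dissimilarities of all objects to their cluster centers,
    skipping clusters with fewer than two objects."""
    a = np.asarray(a, dtype=float)
    labels = np.asarray(labels)
    total = np.zeros(3)
    for j in range(n_clusters):
        members = a[labels == j]
        if len(members) < 2:
            # A singleton cluster must not contribute to the objective.
            continue
        center = members.mean(axis=0)
        for obj in members:
            total += dissim(obj, center)
    return total
```

Because each term is a TFN and TFNs are added component by component, the returned triple is again a TFN, in line with Corollary 3.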

Theorem 2 The traditional minimum sum-of-squares clustering problem (with crisp object attributes) is a particular case of the problem of crisp clustering of fuzzy objects stated in model (13).
Proof Consider the case where the uncertainty of data sources is neglected by the decision maker. In this situation, each object attribute is undoubtedly set equal to its most plausible value, that is, the value of the $l$th attribute of object $i$ is set to $a_{il}^{2}$. It is then easy to show, considering Eqs. (2)-(9), that the proposed dissimilarity measure is reduced to the traditional squared Euclidean distance and consequently, the problem stated in model (13) is reduced to the traditional minimum sum-of-squares clustering problem. In other words, the latter problem is a particular case of the former one. ∎

From Theorem 2, it follows that the complexity of the problem under consideration is at least of the same order as that of the traditional problem. Since it is known that the traditional problem is NP-hard when the number of clusters exceeds 3 (Brucker, 1978), the problem of crisp clustering of fuzzy objects stated in model (13) is NP-hard as well.

Proposed ant colony algorithm
To solve the problem under consideration, an ant colony algorithm is developed. ACO algorithms, first introduced by Dorigo (1992), are population-based, cooperative search procedures derived from the behavior of real ants. Without using visual cues, real ants exploiting pheromones as a communication medium are able to find the shortest path from the nest to a food source. After representing a combinatorial optimization problem by a graph, an ACO algorithm makes use of simple agents, called artificial ants, to move across the graph and iteratively construct solutions. That is, an artificial ant builds a complete solution by starting with a null one and iteratively adding solution components. Moreover, artificial ants deposit pheromones on their path, and the generation of solutions is then guided by the pheromone trails. ACO algorithms have thus far been applied to many hard optimization problems, such as reliability optimization (Ahmadizar & Soltanpanah, 2011) and scheduling (Ahmadizar & Hosseini, 2012) problems. For further details on ACO algorithms, interested readers may refer to Dorigo & Stutzle (2004).

Solution construction
To apply an ACO algorithm to the problem of crisp clustering of fuzzy objects stated in model (13), the problem is represented by a graph with two types of nodes. The first set of nodes contains one element for each object and the other contains one element for each cluster. Each node in the first set is connected to each node in the second set by an edge, indicating that each object can be assigned to each cluster. To construct a solution, an artificial ant starts from the first object and chooses (moves to) one of the clusters by applying a transition rule. In other words, the object is assigned to the chosen cluster. Then, the ant iteratively moves to the next object and chooses a cluster. Clearly, each ant may move to a node corresponding to a cluster several times.
Let $\tau_{ij}$ be the pheromone trail between object $i$ and cluster $j$, i.e., the pheromone trail associated with edge $(i, j)$ of the given graph. $\tau_{ij}$ shows the desirability of assigning object $i$ to cluster $j$. The pheromone trails are regularly modified at run-time and form a kind of adaptive memory of previously found solutions. As mentioned, while constructing a solution, an object is assigned to a cluster by an ant according to a transition rule, the so-called pseudo-random proportional rule (Dorigo & Gambardella, 1997), as follows: with probability $q_0$, an ant $v$ chooses for object $i$ the cluster $j$ for which the pheromone trail is maximum, that is, $j = \arg\max_{j'} (\tau_{ij'})$, while with probability $1 - q_0$, the ant chooses a cluster $j$ according to the probability distribution given in Eq. (14), which is based on the pheromone trails only. As seen, $q_0$ (a parameter between 0 and 1) determines the relative importance of exploitation versus exploration. Moreover, it is noteworthy that heuristic information is not employed in the proposed approach. Heuristic information, unlike the pheromone trails, represents a priori information about the problem instance definition provided by a source different from the artificial ants. The reason for not employing it is that, as objects are assigned to a cluster, the cluster center given in Eq. (2) relocates frequently and hence the heuristic information may not be introduced appropriately.
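The transition rule can be sketched as follows. Since no heuristic information is used, the sketch assumes that the exploration step samples a cluster with probability proportional to its pheromone trail, i.e., $\tau_{ij} / \sum_{k} \tau_{ik}$; this form, as well as the function names, is an assumption made for illustration.

```python
import numpy as np

def choose_cluster(tau_i, q0, rng):
    """Pseudo-random proportional rule for one object.

    tau_i : pheromone trails of the object over the K clusters
    q0    : probability of exploitation
    rng   : a numpy random Generator
    """
    tau_i = np.asarray(tau_i, dtype=float)
    if rng.random() < q0:
        # Exploitation: take the cluster with the largest pheromone trail.
        return int(np.argmax(tau_i))
    # Biased exploration: ASSUMED pheromone-proportional distribution.
    p = tau_i / tau_i.sum()
    return int(rng.choice(len(tau_i), p=p))

def construct_solution(tau, q0, rng):
    """Assign every object (row of tau) to a cluster, one object at a time."""
    return np.array([choose_cluster(row, q0, rng) for row in tau])
```

With $q_0$ close to 1, as in the calibrated setting reported in the computational experiments, almost every decision is an exploitation step, and exploration occurs only occasionally.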

Repairing infeasible solutions
From the solution construction mechanism, it follows immediately that a generated solution may be infeasible. The first set of constraints is guaranteed during the construction process, i.e., each object is assigned to only one cluster, but it is possible that no object is assigned to some of the clusters (producing empty clusters, that is, violating the second set of constraints). To repair an infeasible solution constructed by an ant, a straightforward procedure based on a neighborhood search is therefore developed, in which the infeasible solution is always replaced by a feasible one as follows:
Step 1. Determine the empty clusters.
Step 2. For each empty cluster $j$, do the following (in increasing order of $j$):
2.1. Among the objects whose cluster contains at least two objects, randomly select one.
2.2. Reassign the selected object to cluster $j$.
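The two steps above can be sketched as follows, assuming that a solution is stored as a vector of cluster indices and that $N \ge K$, so that a donor object always exists; the function name `repair` is hypothetical.

```python
import numpy as np

def repair(labels, n_clusters, rng):
    """Fill every empty cluster with a randomly chosen object taken from
    a cluster that currently holds at least two objects."""
    labels = np.array(labels, copy=True)
    counts = np.bincount(labels, minlength=n_clusters)
    # Step 1: determine the empty clusters (visited in increasing order of j).
    for j in np.flatnonzero(counts == 0):
        # Step 2.1: candidate objects belong to clusters with >= 2 objects.
        donors = np.flatnonzero(counts[labels] >= 2)
        i = rng.choice(donors)
        # Step 2.2: reassign the selected object to cluster j.
        counts[labels[i]] -= 1
        labels[i] = j
        counts[j] += 1
    return labels
```

Since objects are only taken from clusters holding at least two objects, the repair never empties another cluster, and the returned solution is always feasible.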

Updating of the pheromone trails
In the beginning, each pheromone trail is set equal to a fixed value $\tau_0 = 0.1$ and then, at run-time, the pheromone trails are regularly modified according to a global updating rule. This rule is proposed to increase the pheromone values associated with better solutions in order to make the search more directed.
Once all ants have constructed their solutions (and after repairing infeasible solutions), each pheromone trail belonging to the solution generated by ant $v$ (for each ant in the colony) is updated according to Eq. (15), in which $\rho$, a parameter between 0 and 1, is the pheromone trail evaporation rate and $z_v$ is a defuzzified value of the objective function for the solution of ant $v$. Then, each pheromone trail belonging to the best solution obtained so far is updated according to Eq. (16), in which $z_{best}$ is a defuzzified value of the objective function for the best solution obtained up to now and $B$ is a positive parameter determining the relative importance of this solution. It should be noted here that the value of the objective function for each (feasible) solution is defuzzified not only to apply the above updating rule but also to compare a newly generated solution with the best one generated so far. Several ranking methods for defuzzification/comparison of fuzzy sets are available in the literature (see, e.g., Chang & Lee, 1994; Chu & Tsao, 2002; Abbasbandy & Hajjari, 2009). In this study, however, the overall existence ranking index proposed by Chang and Lee (1994) is adopted to defuzzify $\tilde{Z}$, which is a TFN (as stated in Corollary 3) denoted by $(z^{1}, z^{2}, z^{3})$. The defuzzified value (with the pure weighting; for more discussion on the various weightings, the reader is referred to Chang & Lee, 1994) is then defined by Eq. (17).
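A global update of this kind can be sketched as follows. The exact deposit amounts are not reproduced in this text, so the sketch assumes an evaporation-plus-deposit form with a deposit of $\rho / z_v$ for each ant and $\rho B / z_{best}$ for the best solution; these amounts and the function name `global_update` are assumptions made for illustration only.

```python
import numpy as np

def global_update(tau, solutions, costs, best_labels, best_cost, rho, B):
    """Update the pheromone trails used by each ant, then reinforce the
    trails of the best solution found so far.

    tau       : (N, K) pheromone matrix
    solutions : list of length-N label vectors, one per ant
    costs     : defuzzified objective values of the ants (positive)
    """
    tau = np.array(tau, dtype=float, copy=True)
    rows = np.arange(tau.shape[0])
    for labels, z in zip(solutions, costs):
        # ASSUMED deposit: a smaller cost leads to a larger reinforcement.
        tau[rows, labels] = (1 - rho) * tau[rows, labels] + rho / z
    # ASSUMED deposit for the best-so-far solution, weighted by B.
    tau[rows, best_labels] = (1 - rho) * tau[rows, best_labels] + rho * B / best_cost
    return tau
```

Only the trails on edges used by a solution are modified; all other trails keep their current values.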

General structure of the algorithm
In the following, the general structure of the ACO algorithm proposed to solve the problem under consideration is presented.
Step 1. Initialize the pheromone trails and set the parameters.
Step 2. While the termination condition is not met, do the following:
2.1. For each ant in the colony, do:
a. By repeatedly applying the transition rule, construct a complete solution;
b. If the solution is infeasible, replace it by a feasible one by applying the repairing mechanism;
c. Calculate the objective function value, and then defuzzify it by means of the defuzzification method;
d. In case of an improved solution, update the best solution generated so far.
2.2. Modify the pheromone trails according to the global updating rule.
Step 3. Return the best solution generated.
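The steps above can be put together in a compact sketch. It is an illustration under several assumptions and not the implementation evaluated in the experiments: `cost` is a hypothetical callable returning a positive defuzzified objective value for a label vector, exploration is assumed to be pheromone-proportional, and the pheromone deposits $\rho / z_v$ and $\rho B / z_{best}$ are assumed forms. The usage example replaces the fuzzy objective by a crisp sum of squared errors on six one-dimensional points.

```python
import numpy as np

def aco_cluster(n_objects, n_clusters, cost, n_ants=20, n_iter=100,
                q0=0.99, rho=0.1, B=10.0, tau0=0.1, seed=0):
    rng = np.random.default_rng(seed)
    tau = np.full((n_objects, n_clusters), tau0)       # Step 1
    rows = np.arange(n_objects)
    best_labels, best_cost = None, np.inf
    for _ in range(n_iter):                            # Step 2
        colony = []
        for _ant in range(n_ants):
            # 2.1a: pseudo-random proportional rule, object by object
            greedy = rng.random(n_objects) < q0
            p = tau / tau.sum(axis=1, keepdims=True)
            sampled = np.array([rng.choice(n_clusters, p=row) for row in p])
            labels = np.where(greedy, tau.argmax(axis=1), sampled)
            # 2.1b: repair empty clusters (requires n_objects >= n_clusters)
            counts = np.bincount(labels, minlength=n_clusters)
            for j in np.flatnonzero(counts == 0):
                i = rng.choice(np.flatnonzero(counts[labels] >= 2))
                counts[labels[i]] -= 1
                labels[i] = j
                counts[j] += 1
            # 2.1c-d: evaluate and keep the best solution so far
            z = cost(labels)
            colony.append((labels, z))
            if z < best_cost:
                best_labels, best_cost = labels.copy(), z
        # 2.2: global update with ASSUMED deposit amounts
        for labels, z in colony:
            tau[rows, labels] = (1 - rho) * tau[rows, labels] + rho / z
        tau[rows, best_labels] = ((1 - rho) * tau[rows, best_labels]
                                  + rho * B / best_cost)
    return best_labels, best_cost                      # Step 3

# Usage with a crisp stand-in objective (sum of squared errors).
x = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])

def sse(labels):
    return sum(((x[labels == j] - x[labels == j].mean()) ** 2).sum()
               for j in np.unique(labels))

best_labels, best_cost = aco_cluster(len(x), 2, sse, n_ants=5, n_iter=20)
```

Passing the objective as a callable keeps the loop independent of how the fuzzy objective is computed and defuzzified.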

Computational experiments
To show the performance of the proposed ACO algorithm, a fuzzified version of a well-known standard clustering test dataset, namely Fisher's Iris dataset containing 150 objects with 4 attributes (Fisher, 1936), is used. To fuzzify this dataset, the object attributes are assumed to be TFNs. For simplicity, the symmetrical triangular possibility distribution is then applied to build the fuzzy object attributes. The most plausible value of each object attribute is first set to be equal to its value in the original dataset and then, the corresponding most pessimistic and optimistic values are, respectively, assumed to be 80% and 120% of the most plausible value. Eight different numbers of clusters are considered: from $K = 3$ to $K = 10$, providing eight problem instances.
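The fuzzification described above can be sketched as follows, assuming positive crisp attribute values stored as an $(N, R)$ array; the function name `fuzzify` and the `spread` parameter are hypothetical.

```python
import numpy as np

def fuzzify(x, spread=0.2):
    """Turn crisp attributes of shape (N, R) into symmetric TFNs of shape
    (N, R, 3): (80%, 100%, 120%) of each value for spread = 0.2."""
    x = np.asarray(x, dtype=float)
    return np.stack([(1 - spread) * x, x, (1 + spread) * x], axis=-1)
```

For example, an attribute value of 5.0 becomes the TFN (4.0, 5.0, 6.0).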
The algorithm has been coded in Visual C++ 6.0 under the Microsoft Windows XP operating system, running on a Pentium IV, 2.6 GHz PC with 2 GB memory. The proposed ACO algorithm has some numeric parameters that could impact its performance. In order to calibrate these parameters, the Taguchi method, which is an experimental design methodology, is employed. Table 1 shows the input data, i.e., the factors and their levels, for the Taguchi method. Since the objective function of the problem under consideration is classified in the smaller-the-better type, the signal-to-noise (S/N) ratio of the minimization objectives, calculated by the formula of Phadke (1989) with the defuzzified value of the objective function utilized as the objective, is a suitable measure. It is noted that the terms 'signal' and 'noise' indicate the desirable value (response variable) and the undesirable value (standard deviation), respectively, and the purpose is to maximize the S/N ratio. Among the standard table of orthogonal arrays, the $L_9(3^4)$ pattern presented in Table 2 is selected as the fittest design fulfilling the necessary requirements.
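For reference, the smaller-the-better S/N ratio in its standard Taguchi form, $-10 \log_{10}$ of the mean squared response, can be computed as in the following sketch. Whether the study averages over several responses or uses a single response per trial is not stated in this text, so the averaging and the function name are assumptions.

```python
import math

def sn_smaller_the_better(values):
    """Standard smaller-the-better S/N ratio: -10 * log10(mean of squares)."""
    values = list(values)
    return -10 * math.log10(sum(v * v for v in values) / len(values))
```

A smaller objective value gives a larger S/N ratio, which is why the ratio is maximized when calibrating the parameters.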

Table 2
The orthogonal array $L_9(3^4)$ (recovered column headers: Trial; Number of (ants, iterations))

Finally, Table 3 summarizes the results, that is, the mean S/N ratios obtained at each level of the factors; the best levels of the factors are indicated in bold. Accordingly, the numeric parameters of the proposed ACO algorithm are set as follows: 20 ants in the colony, $q_0 = 0.99$, $\rho = 0.1$ and $B = 10$. In addition, the algorithm terminates when the total number of iterations in Step 2 reaches 5000.

From Table 4, as the best and average objective function values (particularly, the defuzzified values) are very close to each other for each number of clusters, it can be concluded that the proposed ACO algorithm is robust. Moreover, in view of the fact that the CPU time needed by the algorithm for each problem instance has never been more than 39 seconds, the algorithm can be regarded as fast. Finally, it is noteworthy that the best results (over the ten runs) concerning the most plausible objective function value for the eight problem instances have been 78.945, 57.632, 46.666, 39.061, 35.713, 35.674, 33.289 and 28.917, respectively. It is obvious that the most plausible value of the objective function depends only on the most plausible values of the object attributes (that is, the values in the original dataset). The above results can then be compared with the optimal objective function values for the original non-fuzzy dataset, which for the eight numbers of clusters are 78.851, 57.228, 46.446, 39.040, 34.298, 29.989, 27.786 and 25.834, respectively (see Hansen et al., 2005). The relative gap is below 1% for $K = 3$ to $K = 6$ and between about 4% and 20% for $K = 7$ to $K = 10$, so it can be concluded that the proposed algorithm is efficient, particularly for the smaller numbers of clusters. Of course, recall that the algorithm aims to minimize the defuzzified value of the objective function. In other words, if the algorithm were instead to minimize the most plausible value of the objective function, it would be possible to attain even better results than those reported above.
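The comparison with the optimal values of the original non-fuzzy dataset can be made explicit by computing the relative gap for each number of clusters. The following sketch uses only the two lists of values reported above; the variable names are hypothetical.

```python
# Best most plausible objective values of the proposed algorithm (K = 3..10)
aco = [78.945, 57.632, 46.666, 39.061, 35.713, 35.674, 33.289, 28.917]
# Optimal values for the original crisp dataset (Hansen et al., 2005)
opt = [78.851, 57.228, 46.446, 39.040, 34.298, 29.989, 27.786, 25.834]

# Relative gap in percent for each number of clusters
gaps = {k: 100 * (a - o) / o for k, a, o in zip(range(3, 11), aco, opt)}
for k, g in gaps.items():
    print(f"K={k}: {g:.2f}%")
```

The gaps remain below 1% up to $K = 6$ and are largest for $K = 8$ and $K = 9$.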

Conclusions
This paper deals with the problem of crisp clustering of fuzzy objects. Specifically, we consider the case where triangular fuzzy numbers are used to embody the imprecision and uncertainty of data sources.
The squared Euclidean distance is adopted to introduce a dissimilarity measure between fuzzy data. The problem is then formulated as a fuzzy nonlinear binary integer programming problem with the objective of minimizing the sum of squared distances between objects and centroids. In view of the NP-hardness of the problem, a simply structured ant colony optimization algorithm is proposed to solve it. An artificial ant constructs a solution by iteratively applying a pseudo-random proportional rule based on the pheromone trails. If the constructed solution is infeasible, it is then replaced by a feasible solution by means of a straightforward repairing mechanism. To make the search more directed, the pheromone trails are dynamically modified according to a global updating rule. Moreover, the parameters of the algorithm are calibrated via the Taguchi method. Computational results show that the proposed algorithm is robust, fast and efficient.

Table 1
Factors and factor levels

Table 3
Results of the Taguchi method

Furthermore, the computational results for the problem instances are shown in Table 4, which gives, for each number of clusters, the average and best objective function values achieved by the algorithm over ten independent runs.

Table 4
Average and best results for the fuzzified version of Fisher's Iris dataset