Particle swarm optimization with selective particle regeneration for data clustering
Research highlights
► We present a novel algorithm based on particle swarm optimization.
► The algorithm contains a particle regeneration operation.
► We apply this algorithm to data clustering problems.
► The proposed algorithm performs very well in the conducted numerical experiment.
Introduction
In recent years, meta-heuristic algorithms have been applied to a variety of complex problems in order to obtain quality solutions within acceptable computation time. Proposed by Kennedy and Eberhart (1995), particle swarm optimization (PSO) has drawn the attention of many researchers. This algorithm simulates the social behavior of animals such as birds and fish in nature. Individuals in a flock of birds or a school of fish exchange previous experience and adjust accordingly so that they can move toward the objective. PSO adopts this concept in searching for optimal solutions.
PSO has been widely applied in many research areas and real-world engineering fields. Examples include task assignment and scheduling (Liu et al., 2008; Sha & Hsu, 2008), data clustering (Kao, Zahara, & Kao, 2008), power flow analysis (Acharjee & Goswami, 2009), pattern recognition (Lin, Wang, & Lee, 2009), roundness measurement (Sun, 2009), demand forecasting (Alper, 2008; Gao et al., 2006), financial decisions (Yannis, Magdalene, Michael, & Constantin, 2009), product plans (Wang, Che, & Wu, 2010) and layout design (Onut et al., 2008; Zeng et al., 2007). It has been demonstrated that PSO performs well in many optimization problems. However, the algorithm does not always perform well. Convergence may be slow when solving complex problems, and the search can occasionally be trapped in local optima. Many attempts have been made to improve the algorithm's efficiency and robustness.
One of the recent efforts to improve PSO is the selective regeneration particle swarm optimization (SRPSO) proposed by Tsai and Kao (2009), where the basic concept of the algorithm was introduced and the algorithm was applied to multimodal functions for a preliminary evaluation of efficiency. This paper follows their work with a two-fold goal. Firstly, it illustrates how the operation of selective particle regeneration is designed and incorporated into PSO. The intuition and detailed procedure of the algorithm are shown, and examples are provided to demonstrate the effect of the designed operation. Secondly, SRPSO is applied to data clustering. Its performance is evaluated and compared to other methods.
Data clustering is the classification of similar objects into different groups, or more precisely, the partitioning of a dataset into subsets so that the data in each subset share some common traits. The clustering problem has been shown to be NP-hard (Adib, 2005). By clustering, sparse and dense regions can be identified, and one can therefore discover overall distribution patterns and interesting correlations among data attributes. There are two main types of clustering algorithm: hierarchical and partitional. The hierarchical approach groups data through repeated cluster separation or agglomeration. The partitional approach attempts to directly decompose data into disjoint clusters based on an objective function, such as minimizing the distance between data points and cluster centers. This paper shows how to apply SRPSO to data clustering problems. K-means, a common method for data clustering problems, is also incorporated into SRPSO for performance improvement.
Many improved or hybridized PSO variants have been proposed in the past several years. Fan, Liang, and Zahara (2004) developed the hybrid Nelder–Mead (NM)-particle swarm optimization algorithm based on the NM simplex search method and PSO. Wang, Qiu, and Bai (2005) developed a hybrid technique that combines particle swarm optimization with the nonlinear simplex search method (HNM-PSO) for multimodal function optimization. A hybrid of the genetic algorithm and particle swarm optimization (GA-PSO) was developed by Kao and Zahara (2007) and applied to continuous multimodal functions. Coelho (2008) presented a quantum-behaved PSO (QPSO) using a chaotic mutation operator and applied QPSO to a well-studied continuous optimization problem in mechanical engineering design. Ling et al. (2008) proposed a new hybrid particle swarm optimization method that incorporates a wavelet-theory-based mutation operation, using wavelet theory to help PSO explore the solution space more effectively for a better solution.
To improve heuristic algorithm performance for clustering, several methods have been proposed. Among them, K-means is one of the most widely used clustering techniques. The term K-means was first used by MacQueen (1967). This technique groups data vectors into a predefined number of clusters using the Euclidean distance as the similarity measure. To improve efficiency, many researchers have combined heuristic algorithms with K-means to solve the data clustering problem. Bandyopadhyay, Maulik, and Malay (2001) presented an efficient partitional clustering technique, called SAKM-clustering, that integrates the power of simulated annealing for obtaining a minimum energy configuration with the searching capability of the K-means algorithm. Bandyopadhyay and Maulik (2002) developed an efficient genetic algorithm-based clustering technique called KGA-clustering. Its superiority over the K-means algorithm and another genetic algorithm-based clustering method was extensively demonstrated on several artificial and real-life data sets. A real-life application of KGA-clustering to classifying the pixels of a satellite image of a part of the city of Mumbai was also provided. Kao, Tsai, and Wang (2007) developed an improved particle swarm optimization algorithm for data clustering. A bouncing mechanism was designed such that when particles reach the boundary of the search space, they are bounced back and given proper directions. Two reflex schemes were implemented on PSO to improve the efficiency. Kao et al. (2008) applied a hybrid of NM-PSO and K-means (K-NM-PSO) to the data clustering problem. The K-NM-PSO algorithm was tested on nine data sets, and its performance was compared with those of PSO, NM-PSO, K-PSO, GA, KGA and K-means clustering. The results show that K-NM-PSO is both robust and suitable for handling data clustering.
The rest of the paper is organized as follows. The original particle swarm optimization and SRPSO are introduced in Sections 2 and 3, respectively. Section 4 presents the considered methods for data clustering, including K-means, PSO, SRPSO and KSRPSO. The experiment settings and results are provided in Section 5. Finally, the conclusion is presented in the last section.
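The partitional objective described here, minimizing the distance between data points and cluster centers, can be sketched in a few lines. The following is an illustrative Python sketch only, not the paper's implementation: the function name clustering_cost and the toy data are assumptions, and the cost is taken to be the sum of Euclidean distances from each point to its nearest center.

```python
import math

def clustering_cost(data, centers):
    """Return (total distance to nearest center, label of each point).

    Assumed objective: the sum of Euclidean distances between every data
    point and the closest cluster center.
    """
    labels = []
    total = 0.0
    for point in data:
        dists = [math.dist(point, c) for c in centers]
        nearest = min(range(len(centers)), key=dists.__getitem__)
        labels.append(nearest)
        total += dists[nearest]
    return total, labels

# Toy data (hypothetical): two pairs of points, one center between each pair.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
cost, labels = clustering_cost(data, centers)
```

A candidate set of centers with a lower cost corresponds to a tighter partition, which is what a search method such as PSO would try to find.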
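As a point of reference for the hybrids surveyed above, the basic K-means iteration (assign each data vector to the nearest center by Euclidean distance, then move each center to the mean of its members) can be sketched as follows. This is a minimal textbook-style sketch under stated assumptions: the function name kmeans, the random choice of initial centers from the data, and the toy data set are illustrative and are not taken from any of the cited papers.

```python
import math
import random

def kmeans(data, k, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct data points as starting centers.
    centers = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
```

Because the result depends on the initial centers, K-means can settle in a local optimum on harder data, which is the motivation for pairing it with a global search heuristic.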
Particle swarm optimization
Particle swarm optimization (PSO) is inspired by the social behavior observed in flocks of birds and schools of fish. In nature, there is a leader who leads the bird or fish group to move, as illustrated in Fig. 1. Most members of the group follow the leader. In PSO, a potential solution to the considered problem is represented by a particle, similar to the individuals in the bird and fish group. Each particle travels in the solution space and attempts to move toward a better solution by
Selective regenerated particle swarm optimization
The proposed selective regenerated particle swarm optimization (SRPSO) aims at improving the original PSO. Two new features are designed. First, a suggestion on the setting of the cognition and social parameters, c1 and c2, is proposed to accelerate convergence. Second, a selective particle regeneration mechanism is introduced to keep the search from being trapped in local optima.
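To make the two ideas concrete, the sketch below shows a standard PSO velocity and position update with unequal c1 and c2, plus a simple restart step for some particles. It is only an illustration under loud assumptions: this snippet does not give the authors' parameter values or their regeneration rule, so the defaults (w, c1, c2), the function name pso_with_regeneration, and the rule "every regen_every iterations, re-initialize the worst regen_fraction of particles at random" are all hypothetical stand-ins.

```python
import random

def pso_with_regeneration(f, dim, bounds, n_particles=20, iters=200,
                          w=0.7, c1=0.5, c2=2.5,
                          regen_every=20, regen_fraction=0.2, seed=0):
    """Minimize f over a box; all default values are illustrative assumptions."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_position():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    pos = [random_position() for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(1, iters + 1):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard PSO update: inertia + cognition (c1) + social (c2).
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        if t % regen_every == 0:
            # ASSUMED regeneration rule (not the authors' procedure):
            # restart the particles with the worst personal bests.
            worst = sorted(range(n_particles),
                           key=pbest_val.__getitem__, reverse=True)
            for i in worst[:int(regen_fraction * n_particles)]:
                pos[i] = random_position()
                vel[i] = [0.0] * dim
                pbest[i], pbest_val[i] = pos[i][:], f(pos[i])
                if pbest_val[i] < gbest_val:
                    gbest, gbest_val = pbest[i][:], pbest_val[i]
    return gbest, gbest_val

def sphere(x):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best, best_val = pso_with_regeneration(sphere, dim=2, bounds=(-5.0, 5.0))
```

The global best is stored separately from the swarm, so restarting particles in this sketch never discards the best solution found so far; the restarted particles only add fresh exploration.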
Clustering technique
In this section, we first provide a formal statement of the clustering problem and describe a widely used clustering technique, K-means. To solve the clustering problem more effectively, a hybrid of K-means and SRPSO (KSRPSO) is then proposed.
Experiments and results
To evaluate how the proposed new features enhance PSO, SRPSO and KSRPSO are applied to data clustering problems. In this study, the tested algorithms were coded in Matlab 2007a and run on a computer equipped with a 1.7 GHz AMD CPU and 1024 MB of memory.
Conclusion
In this research, the novel selective regeneration particle swarm optimization (SRPSO) is presented, including the design concept as well as the detailed procedure. In this algorithm, a suggestion on parameter setting and the mechanism of selective particle regeneration are proposed. The suggested unbalanced setting of c1 and c2 accelerates the convergence of the algorithm, while the particle regeneration operation enables the search to escape from local optima and explore other areas for better
References (25)
- et al. An evolutionary technique based on K-means algorithm for optimal clustering in R^N. Information Sciences (2002).
- A quantum particle swarm optimizer with chaotic mutation operator. Chaos, Solitons and Fractals (2008).
- et al. A hybridized approach to data clustering. Expert Systems with Applications (2008).
- et al. An effective hybrid PSO-based algorithm for flow shop scheduling with limited buffers. Computers and Operations Research (2008).
- et al. A particle swarm optimization algorithm for the multiple-level warehouse layout design problem. Computers and Industrial Engineering (2008).
- et al. New particle swarm optimization for the open shop scheduling problem. Computers and Operations Research (2008).
- Applying particle swarm optimization algorithm to roundness measurement. Expert Systems with Applications (2009).
- et al. Using analytic hierarchy process and particle swarm optimization algorithm for evaluating product plans. Expert Systems with Applications (2010).
- et al. Expert algorithm based on adaptive particle swarm optimization for power flow analysis. Expert Systems with Applications (2009).
- NP-hardness of the cluster minimization problem revisited. Journal of Physics A: Mathematical and General (2005).
- Improvement of energy demand forecasts using swarm intelligence. The case of Turkey with projections to 2025. Energy Policy.
- Clustering using simulated annealing with probabilistic redistribution. International Journal of Pattern Recognition and Artificial Intelligence.