Particle swarm optimization with selective particle regeneration for data clustering

https://doi.org/10.1016/j.eswa.2010.11.082

Abstract

This paper presents selective regeneration particle swarm optimization (SRPSO), a novel algorithm based on particle swarm optimization (PSO). It contains two new features: an unbalanced parameter setting and a particle regeneration operation. The unbalanced parameter setting enables fast convergence of the algorithm, and the particle regeneration operation allows the search to escape from local optima and explore for better solutions. The algorithm is applied to data clustering problems for performance evaluation, and a hybrid algorithm (KSRPSO) combining the K-means clustering method with SRPSO is developed. In the numerical experiments, SRPSO and KSRPSO are compared to the original PSO algorithm, K-means, and methods proposed in other studies. The results demonstrate that SRPSO and KSRPSO are efficient, accurate, and robust methods for data clustering problems.

Research highlights

► We present a novel algorithm developed based on particle swarm optimization. ► The algorithm contains particle regeneration operation. ► We apply this algorithm to data clustering problems. ► The proposed algorithm performs very well in the conducted numerical experiment.

Introduction

In recent years, meta-heuristic algorithms have been applied to a variety of complex problems in order to obtain quality solutions within acceptable computation time. Proposed by Kennedy and Eberhart (1995), particle swarm optimization (PSO) has drawn the attention of many researchers. This algorithm simulates the social behavior of animals such as birds and fish in nature. Individuals in a flock of birds or a school of fish exchange previous experience and adjust accordingly so that they can move toward the objective. PSO adopts this concept in searching for optimal solutions.

PSO has been widely applied in many research areas and real-world engineering fields. Examples include task assignment and scheduling (Liu et al., 2008; Sha and Hsu, 2008), data clustering (Kao, Zahara, & Kao, 2008), power flow analysis (Acharjee & Goswami, 2009), pattern recognition (Lin, Wang, & Lee, 2009), roundness measurement (Sun, 2009), demand forecasting (Alper, 2008; Gao et al., 2006), financial decisions (Yannis, Magdalene, Michael, & Constantin, 2009), product plans (Wang, Che, & Wu, 2010) and layout design (Onut et al., 2008; Zeng et al., 2007). PSO has been demonstrated to perform well in many optimization problems. However, the algorithm does not always perform well: convergence may be slow when solving complex problems, and the search can occasionally be trapped in local optima. Many attempts have been made to improve the algorithm's efficiency and robustness.

One of the recent efforts to improve PSO is the selective regeneration particle swarm optimization (SRPSO) proposed by Tsai and Kao (2009), where the basic concept of the algorithm was introduced and the algorithm was applied to multimodal functions for a preliminary evaluation of efficiency. This paper follows their work with a two-fold goal. First, it illustrates how the selective particle regeneration operation is designed and incorporated into PSO. The intuition and detailed procedure of the algorithm are shown, and examples are provided to demonstrate the effect of the designed operation. Second, SRPSO is applied to data clustering. Its performance is evaluated and compared to other methods.

Data clustering is the classification of similar objects into different groups, or more precisely, the partitioning of a dataset into subsets, so that the data in each subset share some common traits. The clustering problem has been shown to be NP-hard (Adib, 2005). By clustering, sparse and dense regions can be identified, and one can therefore discover overall distribution patterns and interesting correlations among data attributes. There are two main types of clustering algorithm: hierarchical and partitional. Hierarchical clustering groups data through repeated cluster separation or agglomeration. Partitional clustering attempts to directly decompose data into disjoint clusters based on an objective function, such as minimizing the distance between data points and cluster centers. This paper shows how to apply SRPSO to data clustering problems. K-means, a common method for data clustering problems, is also incorporated into SRPSO for performance improvement.
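The partitional objective mentioned above can be made concrete with a short sketch. It assumes the cost is the sum of Euclidean distances from each data point to its nearest cluster center; the exact objective used in the paper is not stated in this excerpt, and the function name `clustering_cost` and the toy data are hypothetical.

```python
import math

def clustering_cost(points, centers):
    """Sum of Euclidean distances from each point to its nearest center.

    Assumed objective for illustration; lower values mean tighter clusters.
    """
    return sum(min(math.dist(p, c) for c in centers) for p in points)

# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_centers = [(0.0, 1.0), (10.0, 1.0)]   # one center per group
poor_centers = [(5.0, 1.0), (5.0, 1.0)]    # both centers between the groups

print(clustering_cost(points, good_centers))
print(clustering_cost(points, poor_centers))
```

A search method such as PSO would treat the list of centers as the decision variables and try to drive this cost down.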

Many improved or hybridized PSO variants have been proposed in the past several years. Fan, Liang, and Zahara (2004) developed the hybrid Nelder–Mead (NM)-particle swarm optimization algorithm based on the NM simplex search method and PSO. Wang, Qiu, and Bai (2005) developed a hybrid technique that combines the particle swarm optimization algorithm with the nonlinear simplex search method (HNM-PSO) for multimodal function optimization. Kao and Zahara (2007) developed a hybrid of the genetic algorithm and particle swarm optimization (GA-PSO) and applied it to continuous multimodal functions. Coelho (2008) presented a quantum-behaved PSO (QPSO) using a chaotic mutation operator and applied it to a well-studied continuous optimization problem in mechanical engineering design. Ling et al. (2008) proposed a new hybrid particle swarm optimization method that incorporates a wavelet-theory-based mutation operation, which helps PSO explore the solution space more effectively in search of a better solution.

Several methods have been proposed to improve the performance of heuristic algorithms for clustering. Among clustering techniques, K-means is one of the most widely used. The term K-means was first used by MacQueen (1967). This technique groups data vectors into a predefined number of clusters using the Euclidean distance as the similarity measure. To improve efficiency, many researchers have combined heuristic algorithms with K-means to solve the data clustering problem. Bandyopadhyay, Maulik, and Malay (2001) presented an efficient partitional clustering technique, called SAKM-clustering, that integrates the power of simulated annealing for obtaining a minimum energy configuration with the searching capability of the K-means algorithm. Bandyopadhyay and Maulik (2002) developed an efficient genetic algorithm-based clustering technique called KGA-clustering. Its superiority over the K-means algorithm and another genetic algorithm-based clustering method was extensively demonstrated on several artificial and real-life data sets. A real-life application of KGA-clustering to classifying the pixels of a satellite image of a part of the city of Mumbai was also provided. Kao, Tsai, and Wang (2007) developed an improved particle swarm optimization algorithm for data clustering. A bouncing mechanism was designed such that when particles reach the boundary of the search space, they are bounced back and given proper directions. Two reflex schemes were implemented on PSO to improve the efficiency. Kao et al. (2008) applied a hybrid of NM-PSO and K-means (K-NM-PSO) to the data clustering problem. The K-NM-PSO algorithm was tested on nine data sets, and its performance was compared with those of PSO, NM-PSO, K-PSO, GA, KGA and K-means clustering. The results show that K-NM-PSO is both robust and suitable for handling data clustering.

The rest of the paper is organized as follows. The original particle swarm optimization and SRPSO are introduced in Sections 2 and 3, respectively. Section 4 presents the considered methods for data clustering, including K-means, PSO, SRPSO and KSRPSO. The experiment setting and results are provided in Section 5. Finally, the conclusion is presented in the last section.

Particle swarm optimization

Particle swarm optimization (PSO) is inspired by the social behavior observed in flocks of birds and schools of fish. In nature, there is a leader who leads the bird or fish group to move, as illustrated in Fig. 1. Most members of the group follow the leader. In PSO, a potential solution to the considered problem is represented by a particle, similar to the individuals in the bird and fish group. Each particle travels in the solution space and attempts to move toward a better solution by

Selective regenerated particle swarm optimization

The proposed selective regenerated particle swarm optimization (SRPSO) aims at improving the original PSO. Two new features are designed. First, a setting of the cognition and social parameters, c1 and c2, is suggested to accelerate convergence. Second, a selective particle regeneration mechanism is introduced to keep the search from being trapped in local optima.
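To show where c1 and c2 enter the search, the following is a minimal sketch of one commonly used PSO update, not the paper's exact procedure. The inertia weight `w`, the function names `pso_step` and `sphere`, the test function, and the unequal values `c1=0.5, c2=2.5` are all assumptions made for illustration; the setting the paper actually recommends and its regeneration rule are not given in this excerpt and are not modeled here.

```python
import random

def pso_step(positions, velocities, pbest, gbest,
             w=0.7, c1=2.0, c2=2.0, rng=random):
    """Apply one velocity and position update to every particle, in place."""
    for x, v, p in zip(positions, velocities, pbest):
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            cognition = c1 * r1 * (p[d] - x[d])    # pull toward the particle's own best
            social = c2 * r2 * (gbest[d] - x[d])   # pull toward the swarm's best
            v[d] = w * v[d] + cognition + social
            x[d] += v[d]

def sphere(x):
    """Hypothetical test objective: squared distance from the origin."""
    return sum(xi * xi for xi in x)

rng = random.Random(0)
positions = [[rng.uniform(-5.0, 5.0) for _ in range(2)] for _ in range(10)]
velocities = [[0.0, 0.0] for _ in positions]
pbest = [list(x) for x in positions]
gbest = list(min(pbest, key=sphere))
initial_best = sphere(gbest)

for _ in range(50):
    # Unequal c1 and c2 stand in for an "unbalanced" setting (assumed values).
    pso_step(positions, velocities, pbest, gbest, c1=0.5, c2=2.5, rng=rng)
    for i, x in enumerate(positions):
        if sphere(x) < sphere(pbest[i]):
            pbest[i] = list(x)
    gbest = list(min(pbest + [gbest], key=sphere))

print(initial_best, sphere(gbest))
```

With a larger social coefficient, particles are pulled more strongly toward the swarm's best position, which tends to speed up convergence but also makes it easier for the whole swarm to settle in one region. That trade-off is the kind of situation a regeneration mechanism is meant to address.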

Clustering technique

In this section, we first provide a formal statement of the clustering problem and describe the widely used clustering technique, K-means. To solve the clustering problem more effectively, a hybrid of K-means and SRPSO (KSRPSO) is then proposed.
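For readers unfamiliar with K-means, a minimal sketch of the standard iteration (assign each point to its nearest center, then move each center to the mean of its members) is given below. It covers plain K-means only; how KSRPSO couples K-means with SRPSO is not described in this excerpt. The function names `assign` and `kmeans`, the random initialization from data points, and the toy data are assumptions.

```python
import math
import random

def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]

def kmeans(points, k, iters=100, rng=random):
    """Plain K-means; returns (centers, labels)."""
    # Assumed initialization: k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the center to the coordinate-wise mean of its members.
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers, assign(points, centers)

data = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
centers, labels = kmeans(data, 2, rng=random.Random(0))
print(centers, labels)
```

The result of K-means depends on the initial centers, and on less well-separated data it can stop at a poor local optimum. This sensitivity is one motivation for pairing it with a global search method.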

Experiments and results

To evaluate how the proposed new features enhance PSO, SRPSO and KSRPSO are applied to data clustering problems. In this study, the tested algorithms were coded in Matlab 2007a and run on a computer equipped with a 1.7 GHz AMD CPU and 1024 MB of memory.

Conclusion

In this research, the novel selective regeneration particle swarm optimization (SRPSO) is presented, including the design concept as well as the detailed procedure. In this algorithm, a suggestion on parameter setting and the mechanism of selective particle regeneration are proposed. The suggested unbalanced setting of c1 and c2 accelerates the convergence of the algorithm, while the particle regeneration operation enables the search to escape from local optima and explore other areas for better

References (25)

  • Ü. Alper. Improvement of energy demand forecasts using swarm intelligence. The case of Turkey with projections to 2025. Energy Policy (2008)
  • S. Bandyopadhyay et al. Clustering using simulated annealing with probabilistic redistribution. International Journal of Pattern Recognition and Artificial Intelligence (2001)

Cited by (74)

    • Data clustering using hybrid water cycle algorithm and a local pattern search method

      2021, Advances in Engineering Software
      Citation Excerpt :

      Nature-inspired metaheuristic algorithms attempt to find the best (feasible) solution out of all possible solutions for an optimization problem and have been widely used for solving clustering problems due to their capability of discovering global solutions [23]. Examples of such metaheuristics are genetic algorithm (GA) [24,25], particle swarm optimization (PSO) [26-28], artificial bee colony (ABC) [29], and so forth. In optimization-based clustering, an appropriate balance between exploitation and exploration processes is expected for an efficient clustering.

    • Chaotic particle swarm optimization with sigmoid-based acceleration coefficients for numerical function optimization

      2019, Swarm and Evolutionary Computation
      Citation Excerpt :

      Particle swarm optimization (PSO) is an efficient population-based stochastic search technique that is based on the metaphors of social interaction and communication (e.g., bird flocking and fish schooling) [19]. Due to the advantages of swarm intelligence, intrinsic parallelism, few parameters, easy implementation and inexpensive computation, PSO has attracted much research attention in the field of evolutionary computation and gained wide applications in many areas in the last two decades [1,7,9,10,16,18,42,43,49]. However, like other nature-inspired evolutionary algorithms [36], PSO also encounters the issues of premature convergence and entrapment into local optimum.

    • Continuous greedy randomized adaptive search procedure for data clustering

      2018, Applied Soft Computing Journal
      Citation Excerpt :

      Nevertheless, the initialization phase of the method may easily yield local optimal solutions [10]. To overcome this limitation, a number of algorithms based on well-known metaheuristics have been proposed in the last decades for this type of clustering problems (e.g., simulated annealing (SA) [49], tabu search (TS) [3,37], genetic algorithm (GA) [41,32,39,36], ant colony optimization (ACO) [50], artificial bee colony (ABC) [31,34] and particle swarm optimization (PSO) [14,51,2,53]). Other less known metaheuristic based approaches were developed in [24,52,44].

    • A Novel History-driven Artificial Bee Colony Algorithm for Data Clustering

      2018, Applied Soft Computing Journal
      Citation Excerpt :

      The results show that the hybrid method is superior to both K-means and PSO algorithms. In [43], a new clustering approach based on PSO algorithm is presented in which balanced parameter setting is used to improve the convergence speed of PSO algorithm. In this work, selective particle regeneration strategy is proposed to deal with the local optima problem of PSO algorithm.

    • Adaptive Multi-Updating Strategy Based Particle Swarm Optimization

      2023, Intelligent Automation and Soft Computing