1 Introduction

Particle swarm optimization (PSO), proposed by Kennedy and Eberhart (1995), is an efficient swarm intelligence technique for global optimization. The underlying concept of PSO is to simulate the flocking behavior of birds. In PSO, the movement of each particle is guided by its own best-known position, denoted as pBest, as well as the best-known position of the entire swarm, denoted as gBest, to search the space. PSO has been successfully applied in various fields (Gong et al. 2012; Ishibuchi and Salam 2013; Lu et al. 2021; Ji et al. 2021; Wang et al. 2021a). However, it may have difficulty dealing with optimization problems that involve a large number of optima or high-dimensional search spaces, a setting termed large-scale global optimization (LSGO) (Yang and Pedersen 1997). This difficulty is mainly attributed to the premature convergence that often occurs in traditional PSO (Chen et al. 2013).

To address premature convergence, many variants of PSO have been proposed (Shi and Eberhart 1999; Zhan et al. 2009; Suganthan 1999; Kennedy and Mendes 2002; Juang 2004; Shelokar et al. 2007; Liang and Suganthan 2005; Li and Yao 2012; Cheng and Jin 2015a; Xu et al. 2021). One approach is to adaptively control the learning parameters of PSO. For example, Shi and Eberhart (1999) linearly decreased the inertia weight from 0.9 to 0.4 during evolution to balance exploration and exploitation. Zhan et al. (2009) designed a real-time evolutionary state estimation procedure to identify the evolutionary state of the algorithm and automatically control the parameters of PSO based on the identified state. Another approach is to enhance swarm diversity by introducing topological structures into the swarm. For instance, in Suganthan (1999), the authors employed a neighborhood topology to enhance the diversity of the swarm. Kennedy and Mendes (2002) proposed two topologies (i.e., a ring topology and a von Neumann topology) to enhance the diversity of the evolutionary search. Further, since different search techniques possess different strengths, it is natural to hybridize PSO with other search methods, including other evolutionary algorithms (Juang 2004; Shelokar et al. 2007) and local searches (Liang and Suganthan 2005), to improve its performance. Additionally, since exchanging information between different sub-swarms can enhance swarm diversity, a few multi-swarm based PSO methods have also been proposed. For example, Liang and Suganthan (2005) dynamically re-grouped the swarm into multiple sub-swarms and exchanged information among them to maintain diversity. Li and Yao (2012) presented a cooperative coevolutionary algorithm for LSGO. Apart from the above approaches, designing viable learning strategies to enhance the diversity of PSO has also gained attention. In this direction, for example, Cheng and Jin (2015a, b) devised several variants of PSO in which particles learn from predominant particles in the swarm. Such a predominant-particle learning strategy is able to enhance the diversity of the evolutionary search. Although the above PSO variants improve the performance of traditional PSO, their capability to deliver a well-balanced evolutionary search remains limited.

In this paper, we propose an attention-based particle swarm optimizer (APSO) for LSGO. In cognitive science, human beings tend to selectively focus on some of the visible information while ignoring the rest (Itti et al. 1998; Corbetta and Shulman 2002) due to the bottleneck of information processing. This mechanism is often referred to as the attention mechanism and has been successfully employed in deep learning. It suggests that selectively paying attention to part of the information while ignoring the rest can improve the efficiency of information processing. During the evolution of PSO, different particles in the swarm generally have different potentials for searching the space and should be treated differently at different stages of evolution. In this spirit, an attention-based particle sampling (APS) strategy, which adaptively activates a subset of particles to participate in evolution based on their fitness values and the stage of evolution, has been developed and incorporated into the proposed method. By paying attention to low-quality particles at the early stage of evolution while gradually switching to high-quality particles at the later stage, the APS supports a well-balanced evolutionary search at the swarm level. Moreover, an attention-based particle learning (APL) strategy has also been devised to guide the learning of particles. It works by randomly selecting three particles from a predominant sub-swarm, which is activated by the attention mechanism, to participate in particle learning. By utilizing particles from this predominant sub-swarm for learning, the APL enhances the balance of the evolutionary search at the particle level. Extensive experiments have been conducted on the CEC'2010 (Tang et al. 2010) and CEC'2013 (Li et al. 2013) large-scale benchmark sets to evaluate the significance of the devised strategies and to compare the proposed method with related algorithms. The results show that our method is effective for LSGO and outperforms related methods.

The rest of this paper is organized as follows. Section 2 gives a brief review of PSO and its variants, as well as evolutionary algorithms for LSGO. Section 3 presents the details of the proposed method. Then, extensive experiments are conducted in Sect. 4 to verify the performance of the proposed method. Finally, conclusions are given in Sect. 5.

2 Related work

2.1 PSO and its variants

The PSO algorithm starts with a swarm of Np particles, each representing a candidate solution. To deal with an optimization problem of dimension D, each particle i is defined by a position vector Xi = {xi,1, xi,2, …, xi,D} associated with a velocity vector Vi = {vi,1, vi,2, …, vi,D}. During evolution, each particle's own historical best position is recorded as pBesti = {pBesti,1, pBesti,2, …, pBesti,D}, i = 1, 2, …, Np, and the best one among these pBests is denoted as gBest.

During particle learning, each particle i updates its velocity and position according to its pBesti and the gBest as follows:

$${v}_{i,j}=w \cdot {v}_{i,j}+{c}_{1}\cdot {r}_{1,j}\cdot \left({pBest}_{i,j}-{x}_{i,j}\right)+{c}_{2}\cdot {r}_{2,j}\cdot \left({gBest}_{j}-{x}_{i,j}\right)$$
(1)
$${x}_{i,j}={x}_{i,j}+{v}_{i,j},$$
(2)

where w represents the inertia weight, r1,j and r2,j are random values uniformly drawn from [0, 1], c1 and c2 are acceleration coefficients, and j denotes the jth dimension of the problem. After particle learning, if the current position Xi is better than pBesti, then pBesti is replaced with Xi.
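
To make the update rules concrete, the following minimal Python sketch implements Eqs. (1) and (2) for a whole swarm. It is an illustration only: the parameter values (w = 0.7, c1 = c2 = 2.0) and the sphere objective in the toy usage are common textbook choices, not settings taken from this paper.

```python
import random

def pso_step(X, V, pbest, gbest, w=0.7, c1=2.0, c2=2.0):
    """One velocity and position update for every particle, per Eqs. (1)-(2)."""
    for i in range(len(X)):
        for j in range(len(X[i])):
            r1, r2 = random.random(), random.random()
            V[i][j] = (w * V[i][j]
                       + c1 * r1 * (pbest[i][j] - X[i][j])
                       + c2 * r2 * (gbest[j] - X[i][j]))   # Eq. (1)
            X[i][j] += V[i][j]                              # Eq. (2)

# Toy usage: 5 particles in 3 dimensions on the sphere function (minimization).
X = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(5)]
V = [[0.0] * 3 for _ in range(5)]
pbest = [row[:] for row in X]
gbest = min(pbest, key=lambda p: sum(x * x for x in p))
pso_step(X, V, pbest, gbest)
```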

PSO has received much attention due to its efficiency and simplicity. Many studies have been devoted to improving its performance, deriving numerous variants of PSO (Zhan et al. 2009; Kennedy and Mendes 2002; Liang and Suganthan 2005; Liang et al. 2006; Mendes et al. 2004). Among these variants, one approach focuses on adaptively controlling the parameters of PSO. For example, in Shi and Eberhart (1998), the authors improved the performance of PSO by linearly decreasing the inertia weight w during evolution according to the rule:

$$w={w}_{max}-\left({w}_{max}-{w}_{min}\right) \cdot \frac{g}{G},$$
(3)

where g represents the current generation number and G is a predefined maximum number of generations. In Shi and Eberhart (2001), a fuzzy adaptive rule was proposed to set the value of w, while a random setting of w was investigated in Eberhart and Shi (2001) for dynamic system optimization. In addition to the inertia weight, the acceleration coefficients c1 and c2 are also important to the performance of PSO. In Ratnaweera et al. (2004), the authors linearly varied the acceleration coefficients to balance the global and local search capabilities of PSO. Zhan et al. (2009) automatically controlled the acceleration coefficients based on the state of the algorithm, identified by a real-time evolutionary state estimation procedure.
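
As a quick illustration of Eq. (3), the sketch below computes the linearly decreasing weight; the default bounds of 0.9 and 0.4 follow the values quoted in Sect. 1.

```python
def inertia_weight(g, G, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight, Eq. (3)."""
    return w_max - (w_max - w_min) * g / G

# Halfway through a run of G = 1000 generations, w has dropped to 0.65.
assert abs(inertia_weight(500, 1000) - 0.65) < 1e-12
```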

The second approach enhances traditional PSO by introducing topological structures into the swarm, thereby improving swarm diversity and alleviating premature convergence (Suganthan 1999; Kennedy 1999). For instance, Kennedy and Mendes (2002) proposed two topological structures (i.e., the ring topology and the von Neumann topology) to organize the swarm. In this scheme, the velocity update rule of a particle is defined as:

$${v}_{i,j}=w\cdot {v}_{i,j}+{c}_{1}\cdot {r}_{1,j}\cdot \left({pBest}_{i,j}-{x}_{i,j}\right)+{c}_{2}\cdot {r}_{2,j}\cdot \left({lBest}_{j}-{x}_{i,j}\right),$$
(4)

where lBestj denotes the jth dimension of the best pBest among the ith particle's neighbors. In Mendes et al. (2004), the authors proposed a fully informed PSO, in which each particle is updated based on the historical best positions of all its neighbors. In Liang et al. (2006), the authors devised a comprehensive learning PSO, which updates the velocity of a particle as:

$${v}_{i,j}=w\cdot {v}_{i,j}+c\cdot {r}_{j}\cdot \left({pBest}_{fi\left(j\right),j}-{x}_{i,j}\right),$$
(5)

where fi(j) indicates whose pBest particle i follows on the jth dimension, determined as the winner (i.e., the fitter) of two particles randomly selected from the swarm.
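
The per-dimension tournament behind fi(j) can be sketched as follows. This is a simplified illustration assuming a minimization problem; the full comprehensive learning scheme additionally lets a particle follow its own pBest on some dimensions with a learning probability, which is omitted here, and the function name is ours.

```python
import random

def clpso_exemplars(pbest_fitness, D):
    """For one particle, choose an exemplar index per dimension as the winner
    of a tournament between two randomly selected particles (cf. Eq. (5))."""
    Np = len(pbest_fitness)
    fi = []
    for j in range(D):
        a, b = random.sample(range(Np), 2)
        fi.append(a if pbest_fitness[a] < pbest_fitness[b] else b)  # lower is better
    return fi
```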

The third approach improves PSO by incorporating other search operators or techniques. For example, Angeline (1998) introduced a tournament selection operation into PSO. Chen et al. (2007) and Andrews (2006) integrated the crossover and mutation operators, respectively, from the genetic algorithm (GA) into PSO. Juang (2004) integrated a GA into PSO for designing a recurrent artificial neural network. Apart from GA, differential evolution (Zhang and Xie 2003), ant colony optimization (Shelokar et al. 2007; Song and Miao 2021) and local search (Liang and Suganthan 2005) have also been adopted to design PSO variants.

Since exchanging information between different swarms can enhance swarm diversity, a few multi-swarm based PSO methods have been proposed. For instance, in Liang and Suganthan (2005), the authors dynamically re-grouped the swarm into multiple sub-swarms to achieve information exchange. Ye et al. (2017) devised a multi-swarm based PSO in which the particles in each sub-swarm are further classified into ordinary and communication particles, which focus on exploitation and exploration, respectively. Zhang and Ding (2011) employed four sub-swarms for evolution, which exchange information to maintain cooperation and guide their own evolution. Niu et al. (2007) presented a PSO with a master–slave model of swarms, consisting of one master swarm and several slave swarms. The master swarm updates the states of its particles based on both its own experience and that of the most successful particles in the slave swarms.

2.2 Evolutionary algorithms for LSGO

Traditional evolutionary algorithms (EAs) are generally suitable for low-dimensional optimization problems but can perform poorly on high-dimensional ones. To address this issue, many extensions of EAs have been proposed; they can be roughly divided into two categories: decomposition based and non-decomposition based methods. Decomposition-based methods are usually built on the cooperative coevolution (CC) framework (Potter 1997) and are referred to as cooperative coevolutionary algorithms (CCEAs). These methods adopt a divide-and-conquer strategy to decompose a high-dimensional problem into multiple low-dimensional sub-problems and solve them separately. The best solutions of the sub-problems are then merged into a global vector called the "context vector". For example, in Bergh and Engelbrecht (2004), the authors devised two CC based PSO algorithms, named CCPSO-Sk and CCPSO-Hk. CCPSO-Sk is a CC-based standard PSO, while CCPSO-Hk is a hybrid of CCPSO-Sk and standard PSO. In these methods, the correlation between variables is considered when dividing the decision variables into K groups for optimization. Since the decomposition of variables is vital, many CCEAs focus mainly on the decomposition strategy. In Yang et al. (2008), the authors introduced a random grouping scheme, which randomly divides all variables into K groups at each cycle. To address the setting of an appropriate group size K, a multilevel CC framework was further devised by Yang et al. (2008). In this method, at the beginning of each cycle, a decomposer is selected from a pool based on the performance of the different decomposers, and at the end of each cycle the performance record of the selected decomposer is updated. Li and Yao (2012) adopted a simple approach to deal with the group size K: if the selected decomposer performs well, it continues to be employed; otherwise, a new one is randomly selected from the pool. Recently, Omidvar et al. (2014) proposed a grouping strategy that can uncover the underlying interaction structure of the variables and form groups such that the interdependence between groups is kept to a minimum. Similar decomposition strategies can also be found in Omidvar et al. (2014, 2017).
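
The random grouping scheme of Yang et al. (2008) described above can be sketched in a few lines. This illustration assumes D is divisible by K; a fresh partition is drawn at every co-evolutionary cycle.

```python
import random

def random_grouping(D, K):
    """Randomly partition D decision variables into K equal-sized groups."""
    indices = list(range(D))
    random.shuffle(indices)
    size = D // K
    return [indices[g * size:(g + 1) * size] for g in range(K)]

groups = random_grouping(1000, 10)  # ten 100-dimensional sub-problems
```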

Although CCEAs are promising for LSGO, they are generally expensive, especially when facing a large number of sub-problems that have to be optimized individually. To alleviate this issue, non-decomposition based methods have also been developed. These methods are generally based on lBest particle learning (Liang and Suganthan 2005) or inter-particle learning schemes (Cheng and Jin 2015b; Yang et al. 2016, 2018) to enhance swarm diversity. In Liang and Suganthan (2005), the authors divided the swarm into multiple sub-swarms, each of which evolves individually based on the lBest learning strategy proposed in Kennedy and Mendes (2002). The resulting scheme enhances the diversity of the swarm, thus delivering much better performance than the standard PSO, which adopts pBest and gBest for particle learning. The performance of this method, however, is limited on LSGO. In Cheng and Jin (2015a, b), the authors developed a competitive learning strategy for PSO, resulting in a method called the competitive swarm optimizer (CSO). In this method, two particles randomly taken from the swarm compete: the winner Xw is passed directly to the next generation, while the loser Xl learns from Xw as follows:

$${v}_{l,j}={r}_{1,j}\cdot {v}_{l,j}+{r}_{2,j}\cdot \left({x}_{w,j}-{x}_{l,j}\right)+\varphi \cdot {r}_{3,j}\cdot \left({\overline{x} }_{j}-{x}_{l,j}\right),$$
(6)

where r1, r2 and r3 are random values in [0, 1], φ is a parameter controlling the learning rate and \(\overline{x}\) denotes the mean position of the swarm. In Cheng and Jin (2015a, b), the authors also presented a PSO with a social learning strategy (SL-PSO). In this strategy, particle Xi learns from a particle Xk that is randomly selected among the swarm members with higher fitness than Xi, updating its velocity as:

$${v}_{i,j}={r}_{1,j}\cdot {v}_{i,j}+{r}_{2,j}\cdot \left({x}_{k,j}-{x}_{i,j}\right)+\varphi \cdot {r}_{3,j}\cdot \left({\overline{x} }_{j}-{x}_{i,j}\right).$$
(7)
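
A minimal sketch of one CSO competition, following Eq. (6), is given below. Minimization is assumed, and the value φ = 0.1 is illustrative rather than a setting reported for CSO.

```python
import random

def cso_compete(X, V, fitness, mean_x, phi=0.1):
    """One pairwise competition: the loser learns from the winner and from
    the swarm mean position, per Eq. (6); the winner is left unchanged."""
    i, k = random.sample(range(len(X)), 2)
    w, l = (i, k) if fitness[i] < fitness[k] else (k, i)  # lower is better
    for j in range(len(X[l])):
        r1, r2, r3 = random.random(), random.random(), random.random()
        V[l][j] = (r1 * V[l][j]
                   + r2 * (X[w][j] - X[l][j])
                   + phi * r3 * (mean_x[j] - X[l][j]))
        X[l][j] += V[l][j]
```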

In Yang et al. (2016, 2018), the authors devised two swarm optimizers with segment-based predominant learning (SPLSO) and level-based learning (LLSO), respectively. In SPLSO, the variables of particles are randomly partitioned into multiple segments and different segments learn from different predominant particles, while in LLSO, particles are grouped into levels based on their fitness and each particle learns from two predominant particles of higher levels. The above inter-particle learning strategies improve swarm diversity, and the resulting algorithms are able to outperform CCEAs and lBest-learning based PSO. However, these algorithms do not consider the fact that, during evolution, different particles in the swarm generally have different potentials for searching the space and thus should be treated differently in terms of being selected for evolution and learning. This could limit their performance.

Apart from PSO, other EAs and their variants have also been developed for LSGO. For instance, Molina et al. (2010) combined a local search with a steady-state GA, resulting in an algorithm termed MA-SW-Chains, to deal with LSGO. LaTorre et al. (2012) developed a multiple offspring sampling framework, which hybridizes multiple algorithms to deal with LSGO. Maucec et al. (2018) incorporated three evolution strategies into differential evolution for LSGO. Yildiz and Topal (2019) proposed a micro differential evolution algorithm with a directional local search operator, using a small population size, to solve LSGO. A good review of EAs for LSGO can be found in LaTorre et al. (2015).

3 Proposed method

In this section, we propose an attention-based particle swarm optimizer (APSO) for LSGO. In the proposed algorithm, an attention-based particle sampling (APS) strategy is designed and employed to dynamically activate an appropriate sub-swarm to participate in evolution at each generation. The activated sub-swarm then goes through an attention-based particle learning (APL) strategy, which randomly selects three predominant particles from a sub-swarm activated by the attention mechanism to guide the learning of each particle. The evolution proceeds until the termination condition is met. Algorithm 1 shows the procedure of the proposed algorithm. The details of the APS and APL strategies are given in the following sections.

Algorithm 1 The procedure of the proposed APSO

3.1 Attention-based particle sampling

To tackle LSGO, a swarm optimizer is required to preserve sufficient swarm diversity during evolution, so that the solution space can be well explored and local optima avoided. At the same time, the optimizer should efficiently exploit promising areas of the solution space to locate the optimal or near-optimal solution of the problem. These two requirements generally conflict with each other (Cheng and Jin 2015a; Campos et al. 2014), and a good optimizer should have well-balanced exploration and exploitation capabilities to appropriately search the solution space.

To support a well-balanced PSO evolution, we first devise an attention-based particle sampling strategy, inspired by the attention mechanism. In cognitive science, to make rational use of limited visual information processing resources, humans usually focus selectively on a portion of the available information while ignoring the rest, thereby processing information efficiently (Itti et al. 1998; Corbetta and Shulman 2002). Similarly, during the evolution of PSO, different particles are usually in different evolutionary states and have different potentials for exploring and exploiting the search space. Paying different degrees of attention to different particles can lead the swarm to different levels of exploration and exploitation during evolution. Motivated by this mechanism and rationale, an APS strategy has been developed and incorporated into the proposed method; it adaptively activates an appropriate sub-swarm at each stage of evolution, encouraging exploration at the early stage and exploitation at the later stage.

Specifically, the proposed APS strategy works as follows. At each generation, we first sort the particles of the swarm in descending order of fitness (i.e., from the best particle to the worst). The sorted particles are then ranked from 1 to Np. Subsequently, for each particle i, we calculate its activation value pi as:

$${p}_{i}={p}_{i,ini}+\left({p}_{i,fin}-{p}_{i,ini}\right)\times \left(FEs/{FE}_{max}\right),$$
(8)

where FEmax denotes the maximum number of function evaluations and FEs represents the number of function evaluations consumed so far. Here, the initial activation value pi,ini is defined as pi,ini = rank(i)/Np, while the final activation value pi,fin is computed as pi,fin = 1 − pi,ini. Based on the calculated activation value of each particle, a random value between 0 and 1 is then generated. If the activation value of the particle is larger than this random value, the particle is sampled as a member of the sub-swarm for evolution. The procedure of the APS is shown in step 2 of Algorithm 1.
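
The sampling step just described can be sketched as follows, assuming a minimization problem so that sorting by objective value in ascending order places the best particle at rank 1 (the descending-fitness order described above); the function name is ours, not from the paper's implementation.

```python
import random

def aps_sample(fitness, FEs, FE_max):
    """Attention-based particle sampling: return indices of the activated
    sub-swarm. `fitness` holds objective values (lower is better)."""
    Np = len(fitness)
    order = sorted(range(Np), key=lambda i: fitness[i])  # best first, rank 1..Np
    progress = FEs / FE_max
    activated = []
    for rank, i in enumerate(order, start=1):
        p_ini = rank / Np
        p_fin = 1.0 - p_ini
        p = p_ini + (p_fin - p_ini) * progress           # Eq. (8)
        if p > random.random():
            activated.append(i)
    return activated
```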

According to the above procedure, particles with larger rank values (i.e., lower fitness) have higher probabilities of being activated at the early phase of evolution. As particles with lower fitness are generally scattered around the space and far away from potential optima, activating these particles encourages the search to explore the space and locate promising areas. During the later stage of evolution, particles with smaller rank values (i.e., higher fitness) have higher probabilities of being activated. Since particles with higher fitness are generally located near potential optima, activating these particles encourages the search to exploit the space and accurately identify the optima. Consequently, the proposed APS supports a balanced evolutionary search at the swarm level.

3.2 Attention-based particle learning

After obtaining a sub-swarm, supP, by employing APS at each generation, each particle in supP goes through attention-based particle learning, which works as follows. For each particle m in supP, a predominant sub-swarm, PS, is first selected from the swarm based on the attention mechanism. Specifically, for each particle in the swarm that is better than particle m, a random value between 0 and 1 is generated; if this value is less than the particle's activation value, the particle is inserted into PS. After obtaining the predominant sub-swarm, three exemplars are randomly selected from PS to update particle m on each dimension of the problem. The competition among them determines the best, medium and worst particles, denoted as e1, e2 and e3, respectively. Based on the three selected exemplars, particle m is updated on the jth dimension of the problem according to the following rules:

$${v}_{m,j}={r}_{1,j}\cdot {v}_{m,j}+{r}_{2,j}\cdot \left({x}_{{e}_{1},j}-{x}_{m,j}\right)+{\varphi }_{1}\cdot {r}_{3,j}\cdot \left({x}_{{e}_{2},j}-{x}_{m,j}\right)+{\varphi }_{2}\cdot {r}_{4,j}\cdot \left({x}_{{e}_{3},j}-{x}_{m,j}\right),$$
(9)
$${x}_{m,j}={x}_{m,j}+{v}_{m,j},$$
(10)

where φ1 and φ2 denote learning rates and j denotes the jth dimension of the problem. It should be noted that the three predominant exemplars are independently selected from PS on each dimension of the problem, which further preserves the diversity of the swarm. In case the size of the predominant sub-swarm PS is less than three, the number required to perform particle learning, no learning is carried out for that particle. The procedure of the APL is shown in step 3 of Algorithm 1.
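
The APL update for a single particle m can be sketched as follows, assuming a minimization problem and the settings φ1 = 0.4 and φ2 = 0.1 of Sect. 4.1; the helper names are ours, not from the paper's implementation.

```python
import random

def apl_update(m, X, V, fitness, FEs, FE_max, phi1=0.4, phi2=0.1):
    """Attention-based particle learning for particle m, per Eqs. (8)-(10)."""
    Np, D = len(X), len(X[m])
    order = sorted(range(Np), key=lambda i: fitness[i])      # best first
    rank = {i: r for r, i in enumerate(order, start=1)}
    progress = FEs / FE_max
    def activation(i):                                       # Eq. (8)
        p_ini = rank[i] / Np
        return p_ini + (1.0 - 2.0 * p_ini) * progress
    # Predominant sub-swarm PS: particles better than m that pass the
    # attention-based sampling test.
    PS = [i for i in range(Np)
          if fitness[i] < fitness[m] and random.random() < activation(i)]
    if len(PS) < 3:
        return                                               # skip learning
    for j in range(D):
        # Re-draw and rank three exemplars independently on each dimension.
        e1, e2, e3 = sorted(random.sample(PS, 3), key=lambda i: fitness[i])
        r1, r2, r3, r4 = (random.random() for _ in range(4))
        V[m][j] = (r1 * V[m][j]
                   + r2 * (X[e1][j] - X[m][j])
                   + phi1 * r3 * (X[e2][j] - X[m][j])
                   + phi2 * r4 * (X[e3][j] - X[m][j]))       # Eq. (9)
        X[m][j] += V[m][j]                                   # Eq. (10)
```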

Based on the above procedure, for each particle subjected to learning, the exemplars are selected from a predominant sub-swarm PS sampled by the attention mechanism. At the early phase of evolution, particles with worse fitness have higher probabilities of being chosen as candidate exemplars, while at the later phase, particles with better fitness have higher probabilities of being selected as members of PS. Consequently, exemplars selected for particle learning at the beginning of evolution generally have low fitness, encouraging exploration, whereas exemplars selected at the later stage typically have high fitness, promoting exploitation. The APL can therefore be used to achieve a balanced evolutionary search at the particle level.

4 Experiments

In this section, we carry out a series of experiments to evaluate the proposed method and compare it with related algorithms. All algorithms are coded in C++ and tested on a workstation with an Intel(R) Core™ i7-3630QM CPU at 2.40 GHz running the Windows™ 7 operating system. Unless otherwise stated, the results are averaged over 30 independent runs.

4.1 Benchmark functions and parameter settings

Before presenting the experimental results, we first describe the benchmark functions used in the experiments as well as the parameter configuration of the proposed algorithm. The benchmark function sets are from CEC'2010 (Tang et al. 2010) and CEC'2013 (Li et al. 2013), which contain 20 and 15 functions, respectively; their characteristics can be found in (Tang et al. 2010; Li et al. 2013). The experiments are carried out on the functions with dimension D = 1000. The value of FEmax, used as the termination condition, is set to 3000 × D. These configurations are consistent with previous studies on the two benchmark sets, so published results can be cited and fairly compared with our method. The swarm size of our method and its variants is set to 2 × (100 + D/10), owing to the employment of the APS strategy. The control parameters φ1 and φ2 in APL are configured as 0.4 and 0.1, respectively.

4.2 Exploring the proposed method

First, the significance of the APS and APL strategies in the proposed algorithm is examined. To verify their usefulness, we compare the proposed algorithm, APSO, with two variants: APSO without APL (denoted as APSO_1) and APSO without both APS and APL (denoted as APSO_2). In both variants, the social learning strategy from SL-PSO (Cheng and Jin 2015b) is used for particle learning. Tables 1 and 2 show the results of the three algorithms on the CEC'2010 and CEC'2013 functions, respectively.

Table 1 Comparing results of APSO and its two variants on 1000-D CEC'2010 functions in terms of mean fitness value
Table 2 Comparing results of APSO and its two variants on 1000-D CEC’2013 functions in terms of mean fitness value

Comparing the results of APSO and APSO_1 shows that APL greatly enhances the performance of the proposed algorithm. With APL incorporated, APSO locates better solutions than APSO_1 on most of the functions, except F1, F6, F7, F11 and F16 from CEC'2010 and F5 and F11 from CEC'2013. Comparing APSO_1 with APSO_2 shows that the APS strategy also helps: APSO_1 delivers better solutions than APSO_2 on most functions of CEC'2010 and CEC'2013. From the above results, it can be concluded that both the APS and APL strategies help improve the search performance, thus effectively identifying potential optima in the solution space.

4.3 Comparisons with related algorithms

Next, we compare the performance of our algorithm with related methods on the CEC'2010 and CEC'2013 LSGO function sets. The methods for comparison include: (1) variants of PSO: SL-PSO (Cheng and Jin 2015a), CSO (Cheng and Jin 2015b), DSPLSO (Yang et al. 2016) and DLLSO (Yang et al. 2018); (2) CCEAs: DECC-DG (Omidvar et al. 2014), MLCC (Yang et al. 2008a), DECC-G (Yang et al. 2008b) and CCPSO2 (Li and Yao 2012); and (3) the CEC'2010 LSGO winner: MA-SW-Chains (Molina et al. 2010). These methods were run on the two benchmark sets with the same dimension D (i.e., D = 1000) and the same FEmax (i.e., 3000 × D) as the termination condition, so their results can be fairly compared with ours.

The results of the above algorithms on the CEC'2010 and CEC'2013 function sets are shown in Tables 3 and 4, respectively. Two-tailed t-tests have been conducted at a significance level of α = 0.05 and are reported in Tables 3 and 4 to statistically assess the performance differences between our algorithm and each of the other methods. The bolded t-test values in the tables indicate that our algorithm is significantly better than the corresponding method. The numbers of wins, losses and ties (denoted as w/l/t) of our method against each counterpart are summarized in the last row of the tables. Based on the results, APSO outperforms all the compared methods. Compared with the four PSO variants, out of 35 functions, APSO delivers significantly better solutions than SL-PSO, CSO, DSPLSO and DLLSO on 32, 26, 25 and 20 functions, respectively. Similar results hold against the four CCEAs: out of 35 functions, APSO achieves significantly better solutions than DECC-DG, MLCC, DECC-G and CCPSO2 on 28, 26, 31 and 28 functions, respectively. Compared with MA-SW-Chains, the winner of the CEC'2010 LSGO competition, APSO delivers significantly better solutions on 26 functions. These comparisons make clear that our proposed algorithm performs best among the ten methods considered. The superiority of APSO is mainly due to the incorporation of the APS and APL strategies, which achieve a balanced evolutionary search at the swarm and particle levels, respectively, thus properly searching the space of LSGO problems.

Table 3 Comparing results of various algorithms on 1000-D CEC’2010 functions
Table 4 Comparing results of various algorithms on 1000-D CEC’2013 functions

5 Conclusions

In this paper, we have presented a PSO with attention-based particle sampling and learning for LSGO. In the proposed method, the attention-based particle sampling strategy, which pays attention to low-quality particles at the early stage of evolution while gradually switching to high-quality particles at the later stage, is devised to achieve a balanced evolutionary search at the swarm level. The attention-based particle learning strategy, which utilizes particles from the predominant sub-swarm activated by the attention mechanism for learning, is designed to improve the efficiency of particle learning as well as the balance of evolution at the particle level, thus appropriately searching the space. The performance of the proposed method has been evaluated via a series of experiments. The results show that the proposed mechanisms greatly improve PSO for addressing LSGO and that the resulting algorithm outperforms related algorithms.

The proposed method can be extended in a few directions. First, it is desirable to devise other attention-based activation schemes to activate an appropriate sub-swarm for evolution. For instance, apart from the fitness of the particles, their distances to the local best particle in a given topology could also be considered when designing the activation function. Second, different exemplar selection schemes with various numbers of exemplars could be studied to evaluate the impact of attention-based particle learning on the performance of evolution. Additionally, it would be interesting to extend the proposed algorithm to real and complex optimization problems, for instance, data clustering (Sheng et al. 2020; Wang et al. 2021b), state estimation of networks (Hu et al. 2021a; Jia 2021; Zhao et al. 2021; Zou et al. 2021), sensor filtering and fusion (Geng et al. 2021; Mao et al. 2021), and fault detection in network systems (Ju et al. 2021; Hu et al. 2021b).