Improvement of the Firework Algorithm for Classification Problems

Abstract: Classification, which has attracted the attention of numerous researchers, is one of the primary problems in machine learning. Many evolutionary algorithms (EAs) have been applied to it to exploit their global search ability, and although many scientists have attempted to tackle the problem in recent years, several shortcomings remain. To address the classification problem, this paper introduces a new optimization classification model that can be applied with the majority of evolutionary computing (EC) techniques. The fireworks algorithm (FWA) is one such EC method; although it is proficient at solving complex optimization problems, its performance is not satisfactory when applied directly to classification. In this paper, we first propose an optimization classification model for the classification problem and then solve the model with the FWA. Finally, to investigate the performance of our model, we select four datasets for the experiments; the results indicate that an improved FWA (IFWA) using this model can increase classification accuracy.


Introduction
Classification is a typical problem in data mining and machine learning that assigns categories to a collection of data in order to support more accurate prediction and analysis, and it has attracted many researchers' attention [1,2]. In the past, many kinds of classification methods have been proposed, such as the support vector machine (SVM) [3], decision tree (DT) [4,5], k-nearest neighbor (KNN) [6], artificial neural network (ANN) [7], and naive Bayesian classification (NBC) [8]. However, many of them exhibit shortcomings because they are easily trapped in local optima that they mistake for the global optimum. Evolutionary computing (EC) makes it possible to search for the optimal parameters of classifiers [9]. Among the completed studies, for example, Prof. Holland [10] proposed the genetic algorithm (GA) and Dr. Russell C. Eberhart proposed particle swarm optimization (PSO), both of which achieved good classification performance. Xue et al. further proposed a novel classification model that allows classification problems to be solved straightforwardly by evolutionary algorithms (EAs); in their work, the model was solved with the fireworks algorithm (FWA) and performed well [11]. It can be concluded that this novel classification model can solve the related problems feasibly.
In addition, many researchers have focused on self-adaptive mechanisms for developing EC techniques, and plenty of effective self-adaptive EC techniques have recently appeared; self-adaptive methods incorporated into EC tend to perform better. For instance, Fan et al. [12] introduced an adaptive mutation strategy and control parameters based on differential evolution (SSCPDE) to optimize operating conditions. Banitalebi et al. [13] put forward a self-adaptive binary DE algorithm (SabDE), which obtained good results on benchmark problems. The classification model proposed by Xue et al. can be used to solve classification problems with EC techniques directly and simply. In this paper, the fireworks algorithm (FWA) is used to solve classification problems. The self-adaptive fireworks algorithm (SaFWA) [14] employs a self-adaptive mechanism with four candidate solution generation strategies to extend the range of solutions; however, it does not exploit additional information in the swarm. In existing work on this problem, researchers have usually targeted modifying the parameters or operators of the classifiers; for example, Yu et al. introduced the FWA with a DE mutation operator (FWA-DM) [15]. Much effort has been made, but accuracy is still lacking when these methods are applied to classification problems. To address this, we propose an improved fireworks algorithm (IFWA) in this paper. The aim of our research is to investigate the performance of the IFWA on classification problems and to propose an efficient method to enhance classifier performance. First, we introduce the optimization classification model. Second, since the FWA is one of the EA methods and has shortcomings, we introduce our IFWA and apply it to the supervised classification problem through this model. Finally, we compare the FWA with the IFWA.
To further investigate the performance of the model, four datasets were employed in experiments.
A. Contribution of this paper
After a careful analysis of the firework explosion, we noticed that sparks that cannot approach the local search space contribute nothing to the optimization process while consuming considerable resources. Thus, the proposed method uses a new strategy for generating individuals: it updates the positions of the sparks after a firework explodes and carries the sparks with good fitness values into the next iteration.

B. Organization of this paper
The organization of the paper is as follows: Section 2 gives an outline of the FWA; Section 3 introduces the optimization classification model; Sections 4 and 5 describe the experimental design and the analysis of the results, respectively; finally, the last section concludes the paper.

Related Work
FWA is a typical evolutionary algorithm for solving complex optimization problems, put forward by Tan et al. [16]. The algorithm proceeds as follows:
Step 1: N fireworks are randomly initialized, and the fitness of each individual is evaluated to determine the explosion amplitudes and the numbers of sparks.
Step 2: Apply the explosion and mutation operators according to the fitness values.
Step 3: Fireworks with better fitness produce more sparks within a smaller range; conversely, fireworks with worse fitness generate fewer sparks over a larger range. The explosion amplitude and the number of sparks are the two key factors of the explosion operator. The number of sparks is given by Eq. (1):

$$s_i = m \cdot \frac{y_{\max} - f(x_i) + \varepsilon}{\sum_{j=1}^{N}\left(y_{\max} - f(x_j)\right) + \varepsilon} \quad (1)$$

where $s_i$ is the number of sparks for the $i$-th individual, $m$ is a constant controlling the total number of sparks, and $y_{\max}$ is the fitness of the worst individual. $x_i$ represents an individual and $f(x_i)$ represents the fitness of $x_i$, while $\varepsilon$ denotes the smallest machine constant, which prevents the denominator from being 0. The number of sparks is bounded as in Eq. (2):

$$\hat{s}_i = \begin{cases} \mathrm{round}(a \cdot m), & \text{if } s_i < a \cdot m \\ \mathrm{round}(b \cdot m), & \text{if } s_i > b \cdot m \\ \mathrm{round}(s_i), & \text{otherwise} \end{cases} \quad (2)$$

where $a$ and $b$ are two constants and $\hat{s}_i$ is the bounded number of sparks. The explosion amplitude is given by Eq. (3):

$$A_i = \hat{A} \cdot \frac{f(x_i) - y_{\min} + \varepsilon}{\sum_{j=1}^{N}\left(f(x_j) - y_{\min}\right) + \varepsilon} \quad (3)$$

where $\hat{A}$ is a constant controlling the maximum explosion amplitude and $y_{\min}$ is the fitness of the best individual among the $N$ individuals.
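Under the usual minimization convention, the spark-number and amplitude rules above can be sketched as follows. The function names, demo fitness values, and parameter defaults (m = 50, a = 0.04, b = 0.8, Â = 40) are illustrative choices, not values taken from this paper.

```python
import numpy as np

def spark_numbers(fitness, m=50, a=0.04, b=0.8, eps=np.finfo(float).eps):
    """Eqs. (1)-(2): fireworks with better (lower) fitness get more sparks."""
    y_max = fitness.max()  # fitness of the worst individual
    s = m * (y_max - fitness + eps) / (np.sum(y_max - fitness) + eps)
    # Eq. (2): bound the spark count to [a*m, b*m] and round
    return np.round(np.clip(s, a * m, b * m)).astype(int)

def explosion_amplitudes(fitness, A_hat=40.0, eps=np.finfo(float).eps):
    """Eq. (3): better fireworks explode within a smaller amplitude."""
    y_min = fitness.min()  # fitness of the best individual
    return A_hat * (fitness - y_min + eps) / (np.sum(fitness - y_min) + eps)

fitness = np.array([1.0, 2.0, 4.0, 8.0])  # lower is better
print(spark_numbers(fitness))        # best firework gets the most sparks
print(explosion_amplitudes(fitness)) # best firework gets the smallest range
```

Note that the best firework receives a near-zero amplitude, which is one motivation cited in the literature for refining the explosion operator.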
For generating sparks, a new method [17] is adopted in this paper, in which $N(0,1)$ and $C(0,1)$ are random numbers drawn from the Gaussian distribution and the Cauchy distribution, respectively.
In the update rule, $mean_{id}^{t}$ has the same meaning as in [18]; $t$ represents the $t$-th iteration of the evolutionary process; $i \in \{1, \dots, ps\}$ indexes the current particle, where $ps$ is the population size; $d \in \{1, \dots, D\}$ denotes the $d$-th dimension of the search space, where $D$ is the dimensionality of the search space; $x_{id}^{t}$ denotes the $d$-th dimension of the current position of particle $i$; $v_{id}^{t} \in [-v_{\max}, v_{\max}]$ is the velocity of the $i$-th particle in the current iteration; $p_{id}$ is the $d$-th dimension of the personal best solution of the individual; and the remaining two symbols are the position vectors of two randomly chosen particles.
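The exact update rule of [17] is not reproduced in the text above. Purely as an illustration, the following hypothetical sketch shows one way Gaussian and Cauchy perturbations might be mixed when generating sparks; the 50/50 mixing rule, the amplitude scaling, and all names are assumptions, not the method of [17].

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_sparks(x, amplitude, n_sparks, bounds):
    """Hypothetical spark generation mixing N(0,1) and C(0,1) steps.
    Each spark is the firework x displaced by amplitude-scaled noise
    drawn from either the Gaussian or the Cauchy distribution."""
    lo, hi = bounds
    sparks = []
    for _ in range(n_sparks):
        if rng.random() < 0.5:
            step = rng.standard_normal(x.shape)   # N(0, 1): small local moves
        else:
            step = rng.standard_cauchy(x.shape)   # C(0, 1): heavy-tailed jumps
        # Mapping rule: keep sparks inside the feasible region
        sparks.append(np.clip(x + amplitude * step, lo, hi))
    return np.array(sparks)

sparks = generate_sparks(np.zeros(3), amplitude=0.5, n_sparks=4, bounds=(-5, 5))
print(sparks.shape)  # (4, 3)
```

The Cauchy branch produces occasional long jumps, which is the usual rationale for combining the two distributions.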
Once the explosion operator, mutation operator, and mapping rules have been applied, some of the generated sparks must be selected for the next iteration. A distance-based strategy is adopted in the FWA: the best spark is always carried into the next iteration, and the remaining (N-1) individuals are chosen according to their distances, so as to maintain the diversity of the population.
The Euclidean distance is employed in this step, where $d(x_i, x_j) = \|x_i - x_j\|$ represents the Euclidean distance between two individuals $x_i$ and $x_j$. The total distance for individual $x_i$ is

$$R(x_i) = \sum_{j \in K} d(x_i, x_j) \quad (5)$$

where $R(x_i)$ is the sum of the distances between $x_i$ and all the other individuals, and $j \in K$ indexes the locations in $K$, the set containing all the sparks generated by the two operators mentioned before. The roulette strategy is then used to choose the individuals for the following iteration; the probability $p(x_i)$ of choosing individual $x_i$ is

$$p(x_i) = \frac{R(x_i)}{\sum_{j \in K} R(x_j)} \quad (6)$$

where $R(x_i)$ represents the total distance between $x_i$ and the other locations.
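The selection step (best individual kept, the rest drawn by distance-based roulette) can be sketched as follows; the function name and the toy sphere fitness are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def select_next_generation(candidates, fitness, N):
    """Keep the best individual, then pick N-1 more by roulette on
    summed Euclidean distances: individuals in crowded regions get a
    lower survival probability, preserving population diversity."""
    best = int(np.argmin(fitness))  # best spark always survives
    # R(x_i) = sum over j of ||x_i - x_j||
    diff = candidates[:, None, :] - candidates[None, :, :]
    R = np.linalg.norm(diff, axis=2).sum(axis=1)
    # Roulette wheel over the remaining pool, p(x_i) = R(x_i) / sum R
    pool = np.array([i for i in range(len(candidates)) if i != best])
    p = R[pool] / R[pool].sum()
    rest = rng.choice(pool, size=N - 1, replace=False, p=p)
    return candidates[np.concatenate(([best], rest))]

cands = rng.normal(size=(10, 2))
fit = (cands ** 2).sum(axis=1)  # sphere fitness, lower is better
survivors = select_next_generation(cands, fit, N=4)
print(survivors.shape)  # (4, 2)
```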

Optimization Classification Model
Given a dataset D, 70% of which is used as the training set T, the examples of T can be described as follows:

$$T = \{(x_1, y_1), (x_2, y_2), \dots, (x_p, y_p)\} \quad (7)$$

where $(x_i, y_i)$ is the $i$-th example, $x_i = (x_{i1}, x_{i2}, \dots, x_{in})$ is the $i$-th sample, and $y_i \in C = \{1, 2, \dots, k\}$ ($i = 1, 2, \dots, p$) denotes the label of the $i$-th sample.
From Eq. (7), it can be noticed that the label of $x_i$ could be predicted if Eq. (8) has a solution:

$$w_1 x_{i1} + w_2 x_{i2} + \dots + w_n x_{in} = y_i, \quad i = 1, 2, \dots, p \quad (8)$$

Obviously, such problems can be solved effectively by EA techniques. Thus, Eq. (8) can be written in the matrix form of Eq. (9):

$$W \cdot X = Y \quad (9)$$

It is important to note that the number of examples is usually far greater than the number of dimensions, so Eq. (8) is with high probability an inconsistent system of equations. However, for a classification problem, a rough solution of the system is sufficient. From $y_i$, it is possible to predict the label of $x_i$ if the following condition holds:

$$y_i - \delta \leq w_1 x_{i1} + w_2 x_{i2} + \dots + w_n x_{in} \leq y_i + \delta$$

where $\delta$ is a small threshold, and EAs can solve this problem by using an objective function of this type, Eq. (10). The model is feasible under the condition that Eq. (10) is satisfied.
There are many ways to set the lower and upper boundaries of $w_j$, $j = 1, 2, \dots, n$; one of them is:
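To make the model concrete, here is a minimal sketch of one plausible objective function for an EA to minimize: the number of training examples whose prediction W·x falls outside [y - δ, y + δ]. The counting form of the objective, the toy data, and all names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def model_fitness(W, X, y, delta=0.5):
    """One plausible objective for the model: count training examples
    whose prediction W.x violates y - delta <= W.x <= y + delta.
    An EA can minimize this count directly over the weight vector W."""
    pred = X @ W
    return np.sum(np.abs(pred - y) > delta)

# Hypothetical toy training set: 2 features, labels in {1, 2}
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 2.0], [0.1, 1.9]])
y = np.array([1, 1, 2, 2])
W_good = np.array([1.0, 1.0])        # W.x reproduces the labels here
print(model_fitness(W_good, X, y))   # 0 violations
```

Any EA individual is simply a candidate W within the chosen boundaries, and this count is its fitness.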

Description of the Datasets
The datasets utilized in the experiment are shown in Tab. 1. Four datasets were selected for the experiment to check the performance of the designed model: Iono2, Sonar, Wine, and Iris.
Each dataset has a particular structure, defined by its numbers of examples, labels, and features. As mentioned in the second part of the paper, 70% of the data is used for the training set and 30% for the test set. Tab. 1 shows the structure of the data used.

Parameters Settings of IFWA
As stated above, four datasets were used, and the IFWA was employed to optimize our model on each dataset by searching for W.
Once W is found, for each example $(x_i, y_i)$ in the whole dataset D we check whether $-0.5 \leq (W \cdot x_i - y_i) < +0.5$; if the condition holds, the label is deemed to be predicted correctly, and the classification accuracy is computed as the fraction of such examples.
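The accuracy rule above can be coded directly; the toy W, X, and y below are illustrative, not from the paper's datasets.

```python
import numpy as np

def classification_accuracy(W, X, y):
    """A label is deemed correctly predicted when
    -0.5 <= W.x - y < +0.5, i.e. W.x rounds to the true label."""
    residual = X @ W - y
    correct = (residual >= -0.5) & (residual < 0.5)
    return correct.mean()

X = np.array([[1.0, 0.2], [0.4, 1.1], [2.0, 0.1]])
y = np.array([1, 1, 2])
W = np.array([1.0, 0.5])
print(classification_accuracy(W, X, y))  # 1.0: all residuals within [-0.5, 0.5)
```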

Experimental Results and Discussion
As shown in Tab. 3, four values are reported for each dataset: max, min, mean, and std. "Max" and "min" denote the maximum and minimum values of the classification accuracy, while "mean" and "std" represent the mean and standard deviation obtained over 30 trials. The experiments show that the performance on each dataset (Iono2, Sonar, Wine, and Iris) is good and promising, with high classification accuracy: the mean accuracy on Iris is 0.9569, and the minimum mean over all the datasets is 0.8504. Note also that the std values are low in some cases, and the spread between max and min is likewise small.

Conclusion
Research in this area has intensified and the solutions keep improving. In this paper, our IFWA has been applied to four datasets to find good solutions efficiently. The experiments show that an EA employing our new optimization classification model can solve classification problems effectively, and the results continue to improve as the IFWA is refined for classification problems. Future work will focus especially on the optimization of our model: the model will be revised to make it simpler and more efficient, and more datasets will be used to test the performance of the IFWA.
Funding Statement: This work was partially supported by the Science and Technology Program of the Ministry of Housing and Urban-Rural Development (2019-K-142) and the Entrepreneurial Team of Sponge City (2017R02002).

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.