A Novel Unresolved Peaks Analysis Algorithm for ME Signal Detection Based on Improved SMA

Microchip electrophoresis (ME) is an ion detection system with low cost and portability, which is suitable for online analysis of environmental samples. However, the unresolved peaks in the detection signal of complex samples seriously affect the measurement accuracy of sample concentration. In this article, an efficient unresolved peaks analysis algorithm is proposed, which is based on the sigmoidal membership function, Lévy flight, and slime mould algorithm (SLSMA). First, the hyperbolic tangent function in the original slime mould algorithm (SMA) is replaced by the sigmoidal membership function to enhance the global optimization capability. Second, we use the Lévy flight sequences to further enhance the convergence speed of the SMA algorithm. Then, the performance of SLSMA is tested using synthetic peaks with different resolutions and noise levels. Finally, ME peaks are used to further validate the application of the proposed algorithm. The results show that the proposed algorithm has higher computational efficiency and can be used for the analysis of ME peaks.

To analyze unresolved peaks, many methods have been proposed, including indirect hard modeling [5], partial least squares [6], reinforcement learning [7], machine learning [8], expectation-conditional maximization [9], sum of Gaussians [10], and a signal shape-based method [11]. These methods were used for the analysis of spectra [5], voltammetry [6], differential scanning calorimetry [7], [8], X-ray photoelectron spectroscopy [9], eddy current [10], and electrocardiogram [11] signals. Despite their high accuracy, these methods are difficult to apply to ME peaks because they were developed for specific signals. In addition, the PeakFit software is often used to manually analyze peak-shaped signals, such as those in chromatography [12] and spectroscopy [13].
It is worth noting that swarm intelligence algorithms have been increasingly used for the analysis of unresolved peaks. Selamat et al. [14] used a particle swarm optimization (PSO) algorithm for the detection of chewing peaks. Gao et al. [15] used a PSO algorithm to separate the unresolved peaks of ion mobility spectrometry. Li et al. [16] proposed a peak fitting algorithm based on a particle swarm algorithm and limit learning machine for analyzing measured data from high-energy physics experiments. To separate the unresolved peaks of magnetic eddy current signals, Xiong et al. [17] proposed an unresolved peaks separation algorithm based on the genetic algorithm (GA). Recently, an improved whale optimization algorithm [18] has been applied to the analysis of ME signals. These studies show that the advantages of swarm intelligence algorithms in nonlinear optimization contribute to the analysis of unresolved peaks.
Recently, the slime mould algorithm (SMA) [19] has shown a strong global optimization capability in solving specific engineering problems. However, SMA may also suffer from slow convergence when dealing with complex problems [20]. Although the analysis of unresolved peaks has been shown to be a separable nonlinear least squares problem [18], [21], more complex unresolved peaks involve a larger number of nonlinear parameters to be optimized, which is a great challenge for swarm intelligence algorithms. In addition, to our knowledge, no studies have used SMA to analyze unresolved peaks. Therefore, the motivation of this work is to improve SMA and apply it to the analysis of unresolved ME peaks.
In this article, we propose an unresolved peaks analysis algorithm based on the sigmoidal membership function and Lévy flight. The main contributions of this article are as follows.
1) A linearly mapped sigmoidal membership function is used to improve the global optimization capability of SMA.
2) The Lévy flight is used to improve the convergence speed of the SMA algorithm.
3) An algorithm based on the sigmoidal membership function, Lévy flight, and SMA (SLSMA) is proposed. The performance of the proposed algorithm is tested using the synthetic peaks and ME peaks.
4) The proposed SLSMA has a 28.4% and 12.2% reduction in fitting error and calculation time, respectively, compared to that of SMA.
The rest of this article is organized as follows. In Section II, the mathematical models of the SMA algorithm and the Lévy flight are introduced. Section III describes the unresolved peaks problem, experimental data and computing environment, improved SMA based on modified sigmoidal membership function, improved SMA based on Lévy flight, and the proposed method. In Section IV, synthetic and ME peaks are used to verify the performance of the proposed algorithm. The conclusion is given in Section V.

A. Mathematical Model of SMA
According to the smell in the air, the slime mould can approach the food. The process of approaching food is modeled by (1), where $X_i$ is the location of the ith slime mould, k is the current iteration, $X_b$ is the location of the best individual, $V_c$ is a vector of uniformly distributed random numbers in [−1, 1] whose values tend to 0 as the iterations increase, W is the weight vector, $X_A$ and $X_B$ are the location vectors of two randomly selected individuals, r is a uniformly distributed random number in the interval [0, 1], and p is given by (2), in which tanh is the hyperbolic tangent function, F(i) is the fitness of $X_i$, and BF is the best fitness found so far. In (1), $V_b$ is defined by (3) and (4), where maxIter represents the maximum number of iterations.
W in (1) is given by (5), where S is the sequence sorted in ascending order of fitness, bF and wF denote the best fitness and worst fitness obtained in the current iteration, respectively, and n is the size of the population.
In the process of wrapping food, the location of the slime mould is updated according to (6), where rand is a uniformly distributed random number in the interval [0, 1].
In (1) and (6), W imitates the oscillation frequency of the slime mould to improve the optimization capability, while $V_b$ and $V_c$ synergistically imitate the oscillations of the slime mould to improve global optimization.
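For reference, the update rules labeled (1)-(6) in this subsection can be written out as below. This is a reconstruction that assumes they follow the standard SMA formulation of [19]. The search bounds UB and LB, the re-initialization probability z, the "first half of the ranked population" branch condition in (5), and the unspecified base of the logarithm are taken from that formulation and are not defined in the text above.

```latex
\begin{align}
X_i(k+1) &=
  \begin{cases}
    X_b(k) + V_b \cdot \bigl(W \cdot X_A(k) - X_B(k)\bigr), & r < p \\
    V_c \cdot X_i(k), & r \ge p
  \end{cases} \tag{1} \\
p &= \tanh \bigl\lvert F(i) - BF \bigr\rvert \tag{2} \\
V_b &\in [-a,\, a] \tag{3} \\
a &= \operatorname{arctanh}\!\left(1 - \frac{k}{\mathrm{maxIter}}\right) \tag{4} \\
W\bigl(S(i)\bigr) &=
  \begin{cases}
    1 + r \cdot \log\!\left(\dfrac{bF - F(S(i))}{bF - wF} + 1\right), & i \le n/2 \\[2ex]
    1 - r \cdot \log\!\left(\dfrac{bF - F(S(i))}{bF - wF} + 1\right), & \text{otherwise}
  \end{cases} \tag{5} \\
X_i(k+1) &=
  \begin{cases}
    \mathrm{rand} \cdot (UB - LB) + LB, & \mathrm{rand} < z \\
    X_b(k) + V_b \cdot \bigl(W \cdot X_A(k) - X_B(k)\bigr), & r < p \\
    V_c \cdot X_i(k), & r \ge p
  \end{cases} \tag{6}
\end{align}
```

Under this reading, (6) extends (1) with a small-probability random restart inside the search bounds, which is why p appears in both (1) and (6).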

B. Lévy Flight
Recently, the Lévy flight has been used to improve many swarm intelligence algorithms [22], [23], [24]. The Lévy flight (Lf) distribution [23] is given by (7), in which β is the control parameter and u and v are normally distributed random variables defined in (8), where $\sigma_u$ is given by (9). Unlike the uniform distribution, the Lévy flight is a random walk combining high-frequency short hops and low-frequency long hops, as shown in Fig. 1(a)-(c). In Fig. 1(a), the long hops of the Lévy flight are too many and the jump range is too large. On the contrary, in Fig. 1(c), the jump range of the Lévy flight is too small. These situations are not conducive to improving the convergence speed of SMA. In this study, we choose the case with fewer long hops and a moderate jump range, i.e., β equals 1.5, as shown in Fig. 1.
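A minimal sketch of how such Lévy steps are commonly drawn is given below. It assumes Mantegna's algorithm, in which a step is u / |v|^(1/β) with u ~ N(0, σ_u²) and v ~ N(0, 1); the unit variance of v and the function names are assumptions of this sketch, and any additional step scaling used in the paper is not reproduced.

```python
import math
import random


def levy_sigma_u(beta: float) -> float:
    """Standard deviation of the numerator variable u in Mantegna's algorithm."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta: float, rng: random.Random) -> float:
    """Draw one Levy-flight step u / |v|**(1/beta)."""
    u = rng.gauss(0.0, levy_sigma_u(beta))
    v = rng.gauss(0.0, 1.0)  # sigma_v = 1 is an assumption of this sketch
    return u / abs(v) ** (1.0 / beta)


# Draw a short sequence with the control parameter used in this study.
rng = random.Random(0)
steps = [levy_step(1.5, rng) for _ in range(1000)]
```

Most draws are short hops, with occasional long hops whenever |v| happens to be close to zero; a smaller β makes the long hops more frequent.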

A. Description of the Unresolved Peaks Problem
In the ME system, the detection signal of a sample can be expressed as the superposition of multiple component peaks with a baseline that contains noise. In addition, each ME peak is a function of time. Therefore, in this work, we represent the ME signal as (10), where t is time, bl is the baseline of the ME signal, N is the number of ME peaks, and $y_i$ is the ith ME peak.
To reduce the interference of the baseline, a baseline correction is required before further analysis of the peak signal. After baseline correction, the bl term in (10) is removed and the remaining signal is given by (11), where s indicates the signal after baseline correction. Based on the shape of the ME component peaks, we chose the Gaussian peak [18] to represent the ME peak. Hence, the peak fitting model is given by (12), where h, p, and w are the height, position, and width of the component peak, respectively. Assuming that the length of the ME signal is L, the fitting error is given by (13), which is also the fitness used by each swarm intelligence algorithm. In (13), h is the linear parameter, while p and w are nonlinear parameters. If (13) is fit directly using the SMA algorithm, the coding length of the individual will be 3N. Motivated by the literature [18], [21], we treat the fitting of unresolved peaks as a separable nonlinear least squares problem. When fitting the ME signal using the SMA algorithm, the nonlinear parameters p and w are first optimized using SMA, and then the linear parameters h are optimized using linear least squares. Using this strategy, the coding length of individuals is reduced to 2N.
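The separable strategy can be sketched as follows: for a candidate individual that encodes only the positions and widths (2N values), the heights are obtained by linear least squares, and the root-mean-square residual is returned as the fitness. The Gaussian form h·exp(−(t − p)²/(2w²)), the normal-equation solver, and all function names are assumptions of this sketch; the exact width convention of (12) may differ.

```python
import math


def gaussian(t, p, w):
    """Unit-height Gaussian basis function (width convention is assumed)."""
    return math.exp(-((t - p) ** 2) / (2.0 * w ** 2))


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_heights_and_error(t, s, positions, widths):
    """Given nonlinear parameters (p, w), solve the heights h by linear
    least squares and return (h, RMSE)."""
    basis = [[gaussian(tj, p, w) for tj in t] for p, w in zip(positions, widths)]
    n = len(basis)
    # Normal equations (G^T G) h = G^T s, with one basis column per peak.
    gtg = [[sum(x * y for x, y in zip(basis[i], basis[k])) for k in range(n)]
           for i in range(n)]
    gts = [sum(x * y for x, y in zip(basis[i], s)) for i in range(n)]
    h = solve_linear(gtg, gts)
    resid = [sj - sum(h[i] * basis[i][j] for i in range(n))
             for j, sj in enumerate(s)]
    rmse = math.sqrt(sum(e * e for e in resid) / len(s))
    return h, rmse


# Two overlapping peaks with known heights 1.0 and 0.6 (illustrative values).
t = [0.05 * j for j in range(201)]
s = [1.0 * gaussian(tj, 4.0, 0.5) + 0.6 * gaussian(tj, 5.0, 0.5) for tj in t]
h, rmse = fit_heights_and_error(t, s, [4.0, 5.0], [0.5, 0.5])
```

Because the heights are recovered inside the fitness evaluation, the optimizer only searches over p and w, which is the reduction from 3N to 2N described above.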

B. Experimental Data and Computing Environment
Synthetic and ME signals are used to test the performance of the algorithms involved in this study. The synthetic signal is a superposition of three Gaussian peaks [18] with different resolutions (Table I) and noise levels. Noise is added to the synthetic peaks using the awgn function of MATLAB, giving signal-to-noise ratios (SNRs) of 9, 12, 15, 18, 21, and 24 dB. For the ME signals, all reagents were purchased from Macklin (Shanghai, China) and are of analytical grade; the details of the stock solutions are shown in Table II. The electrophoresis experiments are set up in the same way as in [25].
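The construction of the synthetic data can be illustrated with a short sketch. The peak parameters below are hypothetical placeholders (the values of Table I are not reproduced here), and the noise routine assumes that the signal power is measured from the data before the noise level is set, which corresponds to only one of the modes of MATLAB's awgn.

```python
import math
import random


def gaussian_peak(t, h, p, w):
    """Gaussian peak with height h, position p, and width w (assumed form)."""
    return h * math.exp(-((t - p) ** 2) / (2.0 * w ** 2))


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that measured signal power divided by
    expected noise power equals snr_db (in dB)."""
    power = sum(x * x for x in signal) / len(signal)
    noise_std = math.sqrt(power / 10.0 ** (snr_db / 10.0))
    return [x + rng.gauss(0.0, noise_std) for x in signal]


# Hypothetical (height, position, width) triples, not those of Table I.
peaks = [(1.0, 4.0, 0.4), (0.8, 5.0, 0.4), (0.6, 6.0, 0.4)]
t = [0.005 * j for j in range(2001)]
clean = [sum(gaussian_peak(tj, *pk) for pk in peaks) for tj in t]

rng = random.Random(1)
noisy = {snr: add_awgn(clean, snr, rng) for snr in (9, 12, 15, 18, 21, 24)}
```

Each entry of `noisy` is one noise level of the same underlying three-peak signal, so differences between fits at different SNRs reflect the noise alone.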
For fairness of comparison, we set the population size of these seven algorithms to 300 and the maximum number of iterations to 200. Other parameters are set as in the original algorithms. For each set of data, each algorithm is run 100 times. All swarm intelligence algorithms are implemented in MATLAB R2019a and run on the same hardware (256 GB of memory and an eight-core, 1.80 GHz Intel Xeon Silver 4108 CPU).

C. Improved SMA Based on Modified Sigmoidal Membership Function
In the original SMA algorithm, p in (6) plays a key role in the balance of exploration and exploitation. However, the value of p in (2) is calculated using the tanh function, which has a fixed distribution and cannot be adjusted according to the problem to be optimized.
Based on an analysis of the distributions of the tanh function and the sigmoidal membership function (sigmf), we modified the sigmf function using a linear mapping. Using the mapped sigmf function, we rewrite (2) as (14), where s1 is the parameter that adjusts the sigmoidal membership function and s2 is 0. Fig. 1(d) shows the effect of s1 on the distribution of sigmf: the curve becomes steeper as s1 increases and, when s1 equals 2.0, coincides exactly with that of tanh. The mapped sigmf function therefore contains tanh as a special case, and its shape can be tuned to the problem being optimized.
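The statement that the curves coincide at s1 = 2.0 pins down the linear mapping: since tanh(x) = 2/(1 + exp(−2x)) − 1, the mapped function must be 2·sigmf(x; s1, s2) − 1. The sketch below checks this numerically; the function names are illustrative, and the form of (14) is inferred from this identity rather than copied from the paper.

```python
import math


def sigmf(x, s1, s2):
    """Sigmoidal membership function 1 / (1 + exp(-s1 * (x - s2)))."""
    return 1.0 / (1.0 + math.exp(-s1 * (x - s2)))


def p_sigmf(fitness, best_fitness, s1, s2=0.0):
    """Linearly mapped sigmf, 2 * sigmf - 1, applied to |F(i) - BF|."""
    return 2.0 * sigmf(abs(fitness - best_fitness), s1, s2) - 1.0


# With s1 = 2 and s2 = 0 the mapped sigmf reproduces tanh.
xs = [0.1 * j for j in range(51)]
max_gap = max(abs(p_sigmf(x, 0.0, 2.0) - math.tanh(x)) for x in xs)
```

Larger values of s1 push p toward 1 more quickly as the fitness gap grows, which changes how often the update branch that moves toward the best individual is selected.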
To compare the global optimization capabilities of the tanh-based SMA and the sigmf-based SMA, we focus on the optimal fitting errors at the end of the 200th iteration. Table III compares the fitting errors of these two SMA algorithms for the synthetic peaks. Although the fitting errors of the two algorithms are close to each other, the overall fitting error of the sigmf-based SMA algorithm is smaller. Table IV shows the fitting errors of these two algorithms for the ME peaks. From Table IV, the fitting errors of the sigmf-based SMA algorithm are smaller when fitting the ME peaks. Comparing the fitting errors of the synthetic peaks (Table III) and the ME peaks (Table IV) shows that the global optimization capability of the sigmf-based SMA algorithm is stronger than that of the tanh-based SMA. In Fig. 2, the computation times of the two algorithms are compared. In most cases, the computation time of the original tanh-based SMA is shorter.

D. Improved SMA Based on Lévy Flight
As a common method to improve swarm intelligence algorithms, the use of Lévy flight helps to improve the convergence speed [22] of the algorithm. In this study, we replace the random individual $X_A$ with the current individual $X_i$ and $V_b$ with Lévy flight, and we rewrite (6) as (15).
To compare the convergence speed of the Rand-based SMA and the Lévy-based SMA, we focus on the number of calls to the objective function when the convergence condition is satisfied. The results of these two SMA algorithms for the synthetic peaks are shown in Fig. 3. As can be seen from Fig. 3, the Lévy-based SMA algorithm requires fewer calls to the objective function. Table V shows the number of calls to the objective function of these two algorithms for the ME peaks. From Table V, the mean number of calls of the Lévy-based SMA algorithm is smaller than that of the Rand-based SMA. The results for the synthetic peaks (Fig. 3) and the ME peaks (Table V) show that the convergence speed of the Lévy-based SMA is faster than that of the Rand-based SMA.
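If (6) is taken to have the standard three-branch form of [19] (an assumption, since UB, LB, and z are not defined in the text), the two substitutions described above lead to the following form of (15). Whether the Lévy step Lf is additionally scaled is not stated, so no scaling factor is shown.

```latex
\begin{equation}
X_i(k+1) =
  \begin{cases}
    \mathrm{rand} \cdot (UB - LB) + LB, & \mathrm{rand} < z \\
    X_b(k) + Lf \cdot \bigl(W \cdot X_i(k) - X_B(k)\bigr), & r < p \\
    V_c \cdot X_i(k), & r \ge p
  \end{cases} \tag{15}
\end{equation}
```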

E. Description of the Proposed Algorithm
By analyzing the original SMA, sigmf-based SMA, and Lévy-based SMA above, we draw the following conclusions: 1) compared with the original SMA, the sigmf-based SMA has a stronger global optimization capability but a longer computation time and 2) the Lévy-based SMA has a faster convergence speed but a weaker global optimization capability.
Motivated by these findings, an improved SMA algorithm called sigmoidal membership function and Lévy flight-based SMA (SLSMA) is proposed in this study. The pseudocode for the SLSMA algorithm is presented in Algorithm 1.
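To make the combination concrete, a compact sketch of an SMA loop with both modifications is given below. It is not a transcription of Algorithm 1: the restart probability z = 0.03, the base-10 logarithm in the weight, clipping to the bounds, the small default population, and all names are assumptions of this sketch, and the toy sphere objective stands in for the peak-fitting error.

```python
import math
import random


def levy(beta, rng):
    """One Levy step via Mantegna's algorithm (sigma_v = 1 assumed)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    return rng.gauss(0, sigma_u) / abs(rng.gauss(0, 1)) ** (1 / beta)


def slsma(objective, dim, lb, ub, n=30, max_iter=100,
          s1=2.0, beta=1.5, z=0.03, seed=0):
    """Minimize objective over [lb, ub]^dim; returns (best_x, best_f)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n)]
    best_x, best_f = None, math.inf
    for k in range(1, max_iter + 1):
        fit = [objective(x) for x in pop]
        order = sorted(range(n), key=lambda i: fit[i])  # ascending fitness
        b_f, w_f = fit[order[0]], fit[order[-1]]
        if b_f < best_f:
            best_f, best_x = b_f, pop[order[0]][:]
        # Weight W: better-ranked half gets 1 + term, the rest 1 - term.
        span = (w_f - b_f) + 1e-12
        weight = [[0.0] * dim for _ in range(n)]
        for rank, i in enumerate(order):
            for d in range(dim):
                term = rng.random() * math.log10((fit[i] - b_f) / span + 1)
                weight[i][d] = 1 + term if rank < n // 2 else 1 - term
        vc_bound = 1 - k / max_iter  # V_c range shrinks to 0
        new_pop = []
        for i in range(n):
            if rng.random() < z:  # random restart inside the bounds
                new_pop.append([rng.uniform(lb, ub) for _ in range(dim)])
                continue
            # Mapped sigmf (2 * sigmf - 1) replaces tanh when computing p.
            p = 2 / (1 + math.exp(-s1 * abs(fit[i] - best_f))) - 1
            x_new = []
            for d in range(dim):
                if rng.random() < p:
                    other = pop[rng.randrange(n)][d]  # random X_B
                    # Levy step replaces V_b; current X_i replaces X_A.
                    val = best_x[d] + levy(beta, rng) * (
                        weight[i][d] * pop[i][d] - other)
                else:
                    val = rng.uniform(-vc_bound, vc_bound) * pop[i][d]
                x_new.append(min(max(val, lb), ub))
            new_pop.append(x_new)
        pop = new_pop
    return best_x, best_f


def sphere(x):
    """Toy objective used only to exercise the loop."""
    return sum(v * v for v in x)


best_x, best_f = slsma(sphere, dim=2, lb=-5.0, ub=5.0)
```

For peak fitting, the objective would take an individual of length 2N (positions and widths), solve the heights by linear least squares, and return the fitting error.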

IV. RESULTS AND DISCUSSION
In this section, the proposed SLSMA is compared with PSO [14], GA [17], and the tent-mapped whale optimization algorithm (TWOA) [18], which are state-of-the-art peak analysis algorithms. To further verify the effectiveness of the proposed SLSMA algorithm, we have also implemented other swarm intelligence algorithms proposed in recent years for comparison, including the marine predators algorithm (MPA) [26], Harris Hawks optimization (HHO) [27], the arithmetic optimization algorithm (AOA) [28], and the butterfly optimization algorithm (BOA) [29]. To validate the proposed algorithm, we compare and discuss the performance of the different algorithms in terms of the fitting error, the number of convergence iterations, the calculation time, and the peak position error. For the synthetic peaks, the Wilcoxon signed-rank test [30] was used to further compare the fitting errors of the different algorithms.
Fig. 5 shows the convergence curves for the fitting of the synthetic peaks by different algorithms. The fitness of the optimal individual in each iteration, i.e., the best fitness, is recorded in the convergence curve. From Fig. 5, it can be seen that at the end of the 200th iteration, the best fitness obtained from GA is the largest, while that of SLSMA is one of the smallest. Furthermore, the SLSMA algorithm converges in the fewest iterations among the compared algorithms. Compared with the SMA algorithm, the proposed SLSMA algorithm converges faster, indicating that combining the sigmoidal membership function with Lévy flight is effective.
Fig. 6 compares the fitting errors of different algorithms for the synthesized peaks. Since this article uses the root-mean-square error as the fitting error, the fitting error should decrease as the noise level decreases, i.e., as the SNR increases, for each set of synthetic peaks. From Fig. 6, the fitting errors of SLSMA, SMA, PSO, and MPA are smaller and all decrease as the SNR increases [...] Fig. 6(a), GA and BOA in Fig. 6(c), AOA and BOA in Fig. 6(d), GA and AOA in Fig. 6(e), and TWOA in Fig. 6(f). In addition, the fitting error of the HHO algorithm is closer to that of the proposed SLSMA algorithm than that of the GA, TWOA, AOA, and BOA algorithms. It is worth noting that the fitting errors of SLSMA, SMA, PSO, and MPA are close to each other for all cases in Fig. 6. To analyze the differences between the errors of these four algorithms, we also performed Wilcoxon signed-rank tests between the proposed SLSMA algorithm and the other swarm intelligence algorithms, as shown in Table VI. Combining the p-values and h-values in Table VI, it can be seen that there is no significant difference in the fitting errors of the SLSMA, SMA, PSO, and MPA algorithms. From Fig. 6, the fitting errors of the SLSMA algorithm are significantly smaller than those of the GA, TWOA, AOA, and BOA algorithms. The smaller fitting error indicates that the proposed SLSMA algorithm has a strong global optimization capability.
Fig. 7 compares the calculation time of the different algorithms. From Fig. 7, it can be seen that SLSMA has the shortest calculation time, while HHO has the longest. For each set of synthetic peaks, the noise level has almost no effect on the calculation time. The calculation time of the SMA algorithm is close to that of AOA. For these six synthetic peaks, the average calculation time of the proposed SLSMA algorithm is 3.13 s. In decreasing order of average calculation time, the algorithms rank as HHO, BOA, MPA, GA, AOA, SMA, TWOA, PSO, and SLSMA. Hence, the calculation time of the proposed SLSMA algorithm is shorter than that of the other algorithms. The results for SMA, PSO, and SLSMA show that the improvements made in this study are effective.
Fig. 8 shows that the number of calls to the objective function decreases sequentially for HHO, BOA, MPA, GA, AOA, SMA, TWOA, PSO, and SLSMA. Therefore, the convergence speed of the proposed SLSMA algorithm is faster than that of the other algorithms. In particular, the convergence speed of the original SMA algorithm is slower than that of the PSO algorithm, which is not the case for the proposed SLSMA algorithm. The results illustrate that, for the unresolved peaks problem, the improvement of SMA in this study leads to faster convergence.
Fig. 9 compares the fitting performance of different algorithms for the ME peaks. Samples 1-3 correspond to sample concentrations of 10, 20, and 30 mm/L, respectively. Fig. 9(a) compares the fitting errors of the different algorithms. From Fig. 9(a), the fitting error of the proposed SLSMA algorithm is the smallest for all samples. From Fig. 9(b), for all samples, the calculation time of the proposed SLSMA algorithm is the shortest and that of the HHO algorithm is the longest. Fig. 9(c) shows that the proposed SLSMA algorithm requires the fewest calls to the objective function. The distribution in Fig. 9(c) is consistent with that in Fig. 9(b). The fitting results of the proposed SLSMA algorithm for Sample 2 are shown in Fig. 9(d).

B. Results in the Analysis of ME Peaks
Table VII compares the fitting performance of SLSMA with that of commonly used peak analysis methods, namely, the machine learning method and PeakFit. Fitting performance is measured by the root-mean-square error and the goodness of fit ($R^2$). From Table VII, the PeakFit software has the smallest fitting error and the largest $R^2$ value for Samples 2 and 3. In contrast, the machine learning method has the smallest fitting error for Sample 1 but the largest fitting error for Sample 2. The average $R^2$ value of SLSMA lies between those of the PeakFit software and the machine learning method.
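The two metrics can be computed as in the sketch below. It assumes the usual definitions, i.e., the root of the mean squared residual and the coefficient of determination 1 − SS_res/SS_tot; the paper does not state its exact definition of $R^2$, and the data values are illustrative only.

```python
import math


def rmse(observed, fitted):
    """Root-mean-square error between a signal and its fitted model."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / n)


def r_squared(observed, fitted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not data from Table VII.
observed = [0.0, 1.0, 2.0, 3.0]
fitted = [0.1, 0.9, 2.1, 2.9]
err = rmse(observed, fitted)
r2 = r_squared(observed, fitted)
```

A smaller RMSE and an $R^2$ closer to 1 both indicate a better fit, but RMSE depends on the signal scale while $R^2$ is normalized by the variance of the observed signal.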

C. Discussion
To analyze the unresolved peaks problem more efficiently, the SLSMA algorithm is proposed in this article. In this study, a sigmoidal membership function is used to improve the global optimization ability and Lévy flight is used to speed up the convergence. From the fitting results of the synthetic peaks, it can be seen that the proposed SLSMA algorithm has the best fitting performance in terms of fitting error (Fig. 6), calculation time (Fig. 7), and number of calls to the objective function (Fig. 8). Furthermore, the fitting results of the ME peaks show that the proposed SLSMA algorithm has the best fitting performance compared to the recently proposed swarm intelligence algorithms (Fig. 9). Overall, according to the ranking of the fitting performance, i.e., SLSMA > PSO > SMA, it is clear that the improvements made in this study improve the ability of the SMA algorithm to analyze unresolved peaks.
To illustrate the practicability, we compared the SLSMA algorithm with the commonly used peak analysis methods. In Table VII, both the machine learning method and PeakFit are gradient-based methods, so they have a more stable fitting performance for specific ME peaks. However, the PeakFit software requires redundant peaks to be removed manually, and the machine learning approach may fall into local optima (results of Sample 2 in Table VII). In contrast, the proposed SLSMA is a swarm intelligence algorithm, which does not depend on the gradient, and its convergence is less stable than that of the gradient methods. In the future, combining SLSMA with a gradient algorithm for the unresolved peaks problem will be an interesting direction.

V. CONCLUSION
To analyze the unresolved ME peaks, an improved SMA algorithm is proposed in this study. The sigmoidal membership function and Lévy flight are used to improve the global optimization capability and convergence speed of the SMA algorithm, respectively. First, we perform a linear mapping of the sigmoidal membership function and use it to replace the hyperbolic tangent function in the original SMA. Second, the uniformly distributed random variable in the original SMA algorithm is replaced by Lévy flight. Finally, the synthetic peaks containing three Gaussian peaks and different noise levels and the ME peaks are used to verify the effectiveness of the proposed algorithm. The results show that the proposed SLSMA has a 28.4% and 12.2% reduction in fitting error and calculation time, respectively, compared to that of SMA. The proposed SLSMA algorithm outperforms the recently proposed swarm intelligence algorithms and has a similar fitting performance to the commonly used peak analysis methods. The proposed algorithm significantly improves the measurement accuracy of ME unresolved peaks. In the future, we will further investigate the peak analysis method based on the swarm intelligence algorithm and gradient algorithm.