A Hybrid Chatter Detection Method Based on WPD, SSA, and SVM-PSO

As a kind of self-excited vibration, chatter is extremely common in end milling, especially in high-speed cutting processes. It degrades the machining accuracy of products and decreases the processing efficiency of machine tools. It is therefore crucial to develop an effective condition monitoring system that extracts the chatter features before chatter vibration grows. In this paper, a hybrid chatter detection method (HCDM) is proposed for chatter feature extraction and classification in end milling. Firstly, wavelet packet decomposition is employed to decompose cutting vibration signals into a series of wavelet coefficients, and the signals of each frequency band are reconstructed. Secondly, fast Fourier transform and singular spectrum analysis are used to obtain the chatter features. Furthermore, the support vector machine model is optimized by particle swarm optimization to recognize the cutting states in end milling. Finally, cutting experiments on 300 M steel under different machining conditions are conducted, and the results indicate that the proposed HCDM can distinguish the stable, transition, and chatter states accurately and rapidly in end milling.


Introduction
As one of the common self-excited vibrations, cutting chatter has been shown to be the major factor limiting the machining precision and efficiency of parts in end milling; the mechanism of chatter is shown in Figure 1. To ensure stable machining in end milling, a stability lobe diagram is frequently used to help choose the optimal cutting parameters without losing productivity [1][2][3][4][5]. The diagram indicates the relationship between the cutting states and the processing parameters, and it requires dynamic measurements of the machine tool as well as the cutting forces. The measurement results change whenever the holder-tool system is changed, so for a large number of holder-tool systems it is impractical to carry out the dynamic measurements. Moreover, cutting forces are affected by the tool and workpiece materials and by the process parameters, so it is almost impossible to obtain the cutting forces under all machining conditions. In addition, choosing the cutting parameters is difficult because stable parameters often conflict with productivity [6][7][8][9]. Thus, it is meaningful to develop a condition monitoring system that detects and suppresses chatter in end milling ahead of time.
As the monitoring and identification techniques are highly developed, chatter detection and suppression has been investigated by many researchers.
There are three crucial steps in chatter detection: sensor selection, feature extraction, and feature classification. In sensor selection, signals such as acceleration [10][11][12][13], cutting force [14][15][16], acoustic emission [17][18][19], drive motor current [20], torque [21], and sound [22] have been used to detect chatter. Compared with cutting force signals, vibration signals are more sensitive for chatter detection [10]. Besides, a cutting force sensor needs to be fixed between the fixture and the work table, which is not very convenient in actual machining processes. The sound signal has a lower signal-to-noise ratio in a noisy processing environment, so advanced filtering is needed to achieve good results. In addition, some researchers [10,13,23] have obtained good prediction results with vibration signals. So, the vibration signal is employed in this study.
Cutting chatter is determined by many working conditions, such as the cutting tool, the workpiece material, the machining parameters, and other related factors. So, it is very important to extract dimensionless chatter features that remain useful under all cutting conditions. Generally, effective feature extraction improves the final classification results. Time domain [24], frequency domain [25,26], and time-frequency domain [23,27,28] feature extraction methods can be used for processing the acceleration signal. Because of their success, time-frequency domain signal processing methods have become very popular in feature extraction. For example, in the Hilbert-Huang transform, empirical mode decomposition (EMD) is employed to obtain a series of intrinsic mode functions, and then Hilbert spectral analysis is used to capture the instantaneous frequency array [29]. Compared with the short-time Fourier transform (STFT), EMD requires little manual intervention. Nevertheless, because EMD lacks a rigorous theoretical foundation, the problems of mode mixing and end effects still remain [13]. STFT, in turn, treats a long nonstationary signal as the superposition of many short stationary signals selected by a defined window function [30]. Its application area is restricted because the window function keeps the same shape, size, and magnification throughout. The performance of STFT therefore depends mostly on the defined window function, and high calculation precision cannot be achieved in the low-frequency and high-frequency segments of the vibration signal at the same time. If FFT is used to obtain statistical information that contains all frequency information, information redundancy will occur, and it will be more difficult to train the chatter detection model.
As an alternative, the wavelet packet transform decomposes the signal into a decomposition tree of frequency bands and calculates the energy of each wavelet packet [23]. The wavelet basis function and the decomposition level can be defined manually. Therefore, it can overcome the calculation precision problem of STFT and the mode mixing problem of EMD. In addition, information entropy is the quantitative expression of the uncertainty of the system states. If the signal energy is concentrated on a few principal components, the information entropy will be small, and vice versa. Compared to the stable state and the chatter state, the transition state includes more components, so information entropy is a useful chatter feature in the chatter detection system. In this paper, wavelet packet decomposition (WPD) and singular spectrum analysis (SSA) are employed to acquire the dimensionless chatter features of the acceleration signals in end milling. Different milling chatters show different characteristics in the frequency domain, so the energy ratios of the frequency bands are used to mine feature information in this study.
After the chatter features are extracted, classification methods are employed to recognize the cutting states, such as the hidden Markov model [31,32], the artificial neural network [33,34], and the support vector machine (SVM) [23,35,36]. Compared with other classification techniques, SVM has become a powerful tool for classifying nonlinear data. But, as with other classification algorithms, overfitting and multiple local minima still exist. To mitigate these problems, grid searching is usually chosen to find the optimal parameters of SVM, and it has proved effective in the parameter choice of classification algorithms. However, grid searching has some disadvantages: (1) if the grid is fine enough, grid searching can achieve a high accuracy, but it costs too much time in the parameter adjustment, especially when more than two parameters need to be adjusted; (2) if the grid is not fine enough, it is difficult to find the best parameters of SVM. Besides, the setting of the threshold value is closely related to the working conditions, and it is very difficult to determine. Therefore, particle swarm optimization (PSO) was chosen to optimize the SVM parameters for recognizing cutting states in end milling.
In this paper, a hybrid chatter detection method (HCDM) based on the WPD, SSA, and SVM-PSO model is proposed to recognize the stable, transition, and chatter states in end milling. First, WPD and SSA decompose the vibration signals under different cutting conditions and overcome the calculation precision problem of STFT and the mode mixing problem of EMD. Second, the energy and entropy features are both extracted according to the characteristics of vibration signals in the stable, transition, and chatter states. Finally, the SVM model is adopted to classify the extracted features, so that chatter can be predicted ahead of time. SVM has been employed in chatter recognition by some researchers, and good results have been obtained. However, the parameters of the SVM kernel function are usually determined based on personal experience, and it is not easy to choose them. Therefore, in this study, the recognition rate is defined as the objective function, and PSO is used to optimize the parameters of the SVM kernel function. This is one of the main contributions of this paper. The proposed methodology, experimental setup, data processing, and analysis are presented in the following sections.

Proposed Methodology
The scheme of the proposed hybrid chatter detection method is shown in Figure 2. WPD is used to obtain the clean reconstructed signals in the chatter-emerging frequency band. Then, the energy ratio and the singular spectrum entropy ratio of the reconstructed vibration signal in the chatter-emerging frequency band are constructed as two chatter features. Finally, the SVM-PSO model is used for feature classification to obtain the final recognition results.

Feature Extraction
WPD.
The wavelet transform extracts useful information from the acquired signals jointly in time and frequency. Compared with other algorithms, its main advantage is that it ensures high calculation precision in the low-frequency and high-frequency segments simultaneously. In the definition of the continuous wavelet transform (CWT) [37], f(t) is the signal to be processed, $\psi_{u,s}(t)$ is the "wavelet," $\psi(t)$ is the "mother wavelet," u is the continuous scale parameter, and s is the location parameter. Correspondingly, in the definition of the discrete wavelet transform (DWT) [38], j and k are integers and u and s are constants. An acquired signal f(t) is reestablished from its wavelet coefficients [16], where $c_{j_0,k}$ is the approximation coefficient at scale $j_0$ and $d_{j,k}$ is the detail coefficient at scale j. To enhance the recognition accuracy, wavelet packet decomposition is employed to determine the chatter-emerging frequency band, and the signal in this frequency band is reconstructed by equation (3).
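For reference, one common textbook form of these three relations is sketched below. It is written with u as the scale and s as the location, following the naming in the text (many references use the letters the other way round). The scaling function $\varphi$ and the convention that a larger j means a coarser scale are assumptions introduced here, not taken from the source.

```latex
\begin{align}
\psi_{u,s}(t) &= \frac{1}{\sqrt{u}}\,\psi\!\left(\frac{t-s}{u}\right), \qquad
W_f(u,s) = \int_{-\infty}^{+\infty} f(t)\,\psi_{u,s}^{*}(t)\,dt \\
\psi_{j,k}(t) &= u^{-j/2}\,\psi\!\left(u^{-j}t - k s\right), \qquad j,k \in \mathbb{Z} \\
f(t) &= \sum_{k} c_{j_0,k}\,\varphi_{j_0,k}(t) + \sum_{j \le j_0}\sum_{k} d_{j,k}\,\psi_{j,k}(t)
\end{align}
```

In the third relation, $\varphi_{j_0,k}$ is built from $\varphi$ in the same way that $\psi_{j,k}$ is built from $\psi$.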

SSA.
Singular spectrum analysis (SSA) is adopted to decompose the acquired acceleration signals into a series of segments with simpler components. SSA needs no restrictive distributional or structural assumptions, because it does not rely on any mathematical model or statistical assumptions about the signal to accomplish the calculation [39,40]. The calculation process of SSA is described in this section. For a given time series $y_1, y_2, \ldots, y_N$ of length N and a window length L, the trajectory matrix Y is constructed as follows:

$$Y = \begin{bmatrix} y_1 & y_2 & \cdots & y_K \\ y_2 & y_3 & \cdots & y_{K+1} \\ \vdots & \vdots & & \vdots \\ y_L & y_{L+1} & \cdots & y_N \end{bmatrix}, \qquad K = N - L + 1.$$

Note that the trajectory matrix Y is a Hankel matrix, in which the elements on each anti-diagonal are equal.
Singular value decomposition is performed on the trajectory matrix Y, and Y is expressed as follows:

$$Y = U S V^{T},$$

where $U = [u_1, u_2, \ldots, u_L] \in R^{L \times L}$, $S \in R^{L \times K}$ is a diagonal matrix whose diagonal elements $s_1, s_2, \ldots, s_m$ are the singular values of the matrix Y, $V \in R^{K \times K}$, and $m = \min(L, K)$. In this paper, SSA is adopted to decompose the signal reconstructed by WPD, and the singular spectrum entropy is obtained as a chatter feature.
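The construction of the trajectory matrix can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation; the function name `trajectory_matrix` and the toy series are hypothetical.

```python
def trajectory_matrix(series, window):
    """Build the L x K trajectory matrix of SSA, with K = N - L + 1."""
    n = len(series)
    if not 1 <= window <= n:
        raise ValueError("window length must satisfy 1 <= L <= N")
    k = n - window + 1
    # Row i holds series[i], series[i + 1], ..., series[i + K - 1],
    # so every anti-diagonal carries a single sample value (Hankel structure).
    return [[series[i + j] for j in range(k)] for i in range(window)]


# Toy series of length N = 6 with window length L = 3, hence K = 4.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = trajectory_matrix(series, window=3)
for row in Y:
    print(row)
```

The singular values of `Y` could then be obtained with a numerical library, for example `numpy.linalg.svd(Y, compute_uv=False)`.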

Feature Classification.
SVM can classify small data samples. It solves an optimization problem in which the margin between the closest training data and the optimal hyperplane is maximized. This hyperplane can then be used to classify the test samples. It can be described by the set of points x satisfying

$$w \cdot x + b = 0, \tag{6}$$

where w is the normal vector of the hyperplane and b is its offset. The dashed lines on which the support vectors lie are defined by

$$w \cdot x + b = \pm 1. \tag{7}$$

The margin of the optimal hyperplane can then be denoted as 2/‖w‖, and the two hyperplanes in equation (7) bound this margin.

Figure 2: The scheme of the chatter detection system (sampling signal input; signal processing with WPD and SSA; feature extraction based on energy and entropy characteristics; SVM-PSO model with training and pattern recognition against a reference pattern; output results).

Shock and Vibration
Therefore, the optimal hyperplane from equation (6) can be found by maximizing the distance 2/‖w‖ or, equivalently, by minimizing ‖w‖² subject to the constraints

$$y_i (w \cdot x_i + b) \ge 1, \qquad i = 1, 2, \ldots, n,$$

where $x_i$ is the i-th training sample and $y_i \in \{-1, +1\}$ is its class label. In some cases, the training data are not separable by a linear hyperplane. The SVM can then be extended to nonlinear classification with the help of kernel functions. The kernel functions are shown in Table 1.
Recently, SVM has become an effective tool for pattern classification and regression problems [35], and it is especially useful for classifying nonlinear data. Grid searching is often used to enhance the classification precision in the multiparameter adjustment of the SVM model. However, grid searching has some disadvantages: (1) if the grid is fine enough, grid searching can achieve a high accuracy, but it costs too much time in the parameter adjustment, especially when more than two parameters need to be adjusted; (2) if the grid is not fine enough, it is difficult to find the best parameters of SVM.
To overcome this trade-off, the SVM-PSO model is proposed, in which grid searching is replaced by PSO. PSO is easy to program because, unlike GA, it needs no encoding or decoding processes [13]. In the PSO algorithm, three terms affect the velocity of a particle: the inertial momentum term, the cognitive term, and the social term. The global best position is denoted as $P_g = (p_{g1}, p_{g2}, \ldots, p_{gm})$, while the velocity of the i-th particle is $V_i = (v_{i1}, v_{i2}, \ldots, v_{im})$. The new velocity and position of each particle are calculated by the following equations:

$$v_{id}^{t+1} = w\,v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd} - x_{id}^{t}\right), \qquad x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1},$$

where $x_{id}$ and $p_{id}$ are the d-th components of the current position and the personal best position of the i-th particle, w is the inertia weight related to the exploration and exploitation ability of the swarm, $c_1$ and $c_2$ are the acceleration coefficients, and $r_1$ and $r_2$ are random numbers in [0, 1].
To balance exploration and exploitation during the search iterations, the acceleration coefficients and the inertia weight can be adjusted as follows:

$$c_1 = c_{1i} + (c_{1f} - c_{1i})\,\frac{iter}{iter_{max}}, \qquad c_2 = c_{2i} + (c_{2f} - c_{2i})\,\frac{iter}{iter_{max}}, \qquad w = w_{max} - (w_{max} - w_{min})\,\frac{iter}{iter_{max}},$$

where $c_{1i}$, $c_{1f}$ and $c_{2i}$, $c_{2f}$ are the initial and final acceleration coefficients, $w_{max}$ and $w_{min}$ denote the maximum and minimum inertia weight, $iter_{max}$ is the maximum iteration number, and $iter$ is the current iteration number. By taking the prediction results as the fitness function, PSO optimizes the input parameters and results of the SVM model. The flowchart of the optimization process is shown in Figure 3.
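The update rules above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' Matlab implementation: the coefficient schedules are taken to be linear in the iteration number, the default coefficient values are arbitrary, and the hypothetical objective `toy_error` stands in for the SVM error rate (one minus the recognition rate) as a function of two kernel parameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, iter_max=60,
                 c1=(2.5, 0.5), c2=(0.5, 2.5), w_range=(0.9, 0.4), seed=0):
    """Minimize `objective` over box `bounds` with time-varying PSO.

    c1, c2 are (initial, final) acceleration coefficients and
    w_range is (w_max, w_min); all three vary linearly (an assumption).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for it in range(iter_max):
        frac = it / iter_max
        c1_t = c1[0] + (c1[1] - c1[0]) * frac
        c2_t = c2[0] + (c2[1] - c2[0]) * frac
        w_t = w_range[0] - (w_range[0] - w_range[1]) * frac
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia term + cognitive term + social term
                vel[i][d] = (w_t * vel[i][d]
                             + c1_t * r1 * (pbest[i][d] - pos[i][d])
                             + c2_t * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # keep the particle inside the search interval
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(params):
    """Hypothetical stand-in for the SVM error rate at kernel parameters (c, d)."""
    c, d = params
    return (c - 0.3) ** 2 + (d - 2.0) ** 2


best, best_val = pso_minimize(toy_error, [(0.0, 1.0), (0.0, 5.0)])
print(best, best_val)
```

In an actual detector, `toy_error` would be replaced by a function that trains the SVM with the candidate parameters and returns its classification error on held-out samples.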

Experimental Setup
To verify the effectiveness of the proposed HCDM, cutting experiments on 300 M steel were conducted on a five-axis high-speed milling center, Huron K2X5. The tool is a carbide milling cutter with four teeth. Its diameter and length are 10 mm and 100 mm, respectively. An accelerometer (356A16, PCB, USA) was fixed on the spindle, and a data acquisition system (LMS SCADAS, Belgium) was used to acquire the acceleration signals in end milling. The experimental setup and the machined surfaces of parts (chatter and stable states) are shown in Figure 4.
The final aim of this study is to develop a detection algorithm for cutting chatter in actual milling processes, so cutting experiments with different machining parameters need to be conducted. In the experiment, 300 M steel was chosen as the workpiece material because its high hardness makes chatter occur easily. In addition, to simulate actual milling processes, different cutting parameters were varied in this experiment, namely the spindle speed, axial depth of cut, radial depth of cut, and feed rate. The cutting parameters are listed in Table 2.

Signal Analysis of Milling Processes.
The vibration signals obtained are shown in Figure 5, which shows that cutting starts at about 0.5 s and ends at about 7.4 s. The amplitude of the background vibration noise before and after cutting was about 1 m/s². The fast Fourier transform (FFT) of the background noise is shown in Figure 6. The main frequency peaks near 125.44 Hz and 251.65 Hz are the fundamental and second-harmonic vibrations induced by the spindle rotation at 7500 r/min (7500/60 = 125 Hz), respectively. The amplitude in the stable cutting state is larger than that of the signal before cutting. Compared with the stable cutting state, the vibration signal in the chatter cutting state has a much larger amplitude, and a nonlinear vibration phenomenon appears. The transition state is a comparatively smooth process between the stable state and the chatter state.
Leaving aside the periods before and after cutting, the cutting states of the machining process can be classified into the stable state, the transition state, and the chatter state. If the signal shows a continuous and smooth cutting process, it is considered to be in the stable state. When the vibration amplitude suddenly increases sharply, the signal belongs to the chatter state. If the vibration amplitude does not increase sharply but a frequency shift emerges, the signal is classified as the transition state. The objective of pattern recognition is to recognize the cutting states based on the chatter features of the sampled signal. Sixty groups of vibration signals were obtained under different machining conditions. Among them, the first 45 groups were used as the training samples, and the other signals were used to test the detection system. According to the classification rules above, these signals are classified into three patterns, and each pattern has 20 signals. Then, the chatter features of each signal were obtained by using WPD and SSA.
To understand the characteristics of vibration signals in end milling comprehensively, FFT was applied to the vibration signal in the different states. The amplitude spectrum of the vibration signals is shown in Figure 7. It shows that the frequency spectrum of the vibration signal in the stable state has a low amplitude and an even distribution, and the amplitude at high frequencies around 2000 Hz is slightly larger. Compared to the stable state, the energy distribution of the vibration signal shifts from high frequency to low frequency in the transition state. The amplitude of the acceleration signal between 1000 Hz and 1500 Hz increases drastically, and the peak frequency of 1260 Hz is the chatter frequency of the machine tool.

Feature Extraction Based on Wavelet Packet.
One of the obvious characteristics of the chatter-emerging process is that the energy distribution of the vibration signal shifts from high frequency to low frequency in the frequency domain. Therefore, the energy or the energy ratio of the vibration signal in the chatter-emerging frequency band can be constructed as a chatter feature to detect the cutting states. Unlike the energy, the energy ratio is a dimensionless chatter feature between 0 and 1. This is beneficial to feature classification, and the feature can easily be carried over to other cutting processes with different machine tools, cutting tools, workpieces, and machining parameters. Therefore, the energy ratio is chosen as one of the chatter features to detect the cutting states in end milling.
To obtain this chatter feature, three-level WPD was employed to acquire the reconstructed signals in the chatter state, as shown in Figure 8. Then, FFT was applied to the reconstructed signal in each frequency band. The amplitude of the acceleration signal in the frequency bands $d_4$ (960 Hz-1280 Hz) and $d_5$ (1280 Hz-1600 Hz) is larger than that in the other frequency bands, so the chatter frequency lies in the frequency bands $d_4$ and $d_5$. According to the amplitude spectrum of the acceleration signal, the energy ratio of the acceleration signal in each frequency band can be calculated as follows:

$$C_j = \frac{\sum_{k=1}^{m} a_{jk}^{2}}{\sum_{i=1}^{8} \sum_{k=1}^{m} a_{ik}^{2}},$$

where m is the number of frequency points in each frequency band (m = 500), $C_j$ is the energy ratio of the j-th frequency band, and $a_{jk}$ is the amplitude of the k-th frequency point in the j-th frequency band.
Considering the boundary error of WPD, the sum of the energy ratios of $d_4$ and $d_5$ is constructed as the hybrid chatter feature:

$$T_1 = C_4 + C_5,$$

where $C_4$ is the energy ratio of the frequency band $d_4$ (960 Hz-1280 Hz) and $C_5$ is the energy ratio of the frequency band $d_5$ (1280 Hz-1600 Hz). $T_1$ is a feature of the cutting acceleration signal in the frequency domain; it reflects the mechanism of chatter generation and has the advantage of responding early. To validate $T_1$, WPD was employed to obtain the energy ratio of the chatter-emerging frequency band, and the variation trend of $T_1$ during chatter generation is shown in Figure 9.
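The band layout and the feature $T_1$ can be illustrated with a short sketch. It assumes that band energy is the sum of squared spectral amplitudes, that the eight level-3 bands are ordered by frequency, and that the sampling rate is 5120 Hz (a value inferred here from the 320 Hz band width, not stated in the text). All function names and the toy amplitudes are hypothetical.

```python
def wpd_band_edges(fs, level):
    """Frequency range (Hz) of each WPD band, assuming frequency ordering."""
    n_bands = 2 ** level
    width = fs / 2 / n_bands
    return [(i * width, (i + 1) * width) for i in range(n_bands)]


def energy_ratios(band_amplitudes):
    """Energy ratio of each band: band energy divided by total energy."""
    energies = [sum(a * a for a in band) for band in band_amplitudes]
    total = sum(energies)
    if total == 0:
        raise ValueError("signal has zero energy")
    return [e / total for e in energies]


def chatter_feature_t1(band_amplitudes):
    """T1 = C4 + C5, with bands numbered from 1 (d1 is the lowest band)."""
    c = energy_ratios(band_amplitudes)
    return c[3] + c[4]


edges = wpd_band_edges(fs=5120, level=3)
print(edges[3], edges[4])  # bands d4 and d5

# Toy spectrum: two points per band, with d4 and d5 three times larger.
bands = [[1, 1], [1, 1], [1, 1], [3, 3], [3, 3], [1, 1], [1, 1], [1, 1]]
t1 = chatter_feature_t1(bands)
print(t1)
```

With the toy spectrum, bands d4 and d5 each hold 18 of the 48 units of energy, so `t1` evaluates to 0.75.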
It can be observed from Figure 9 that the chatter feature $T_1$ stays at a low level overall in the stable cutting state, with limited fluctuations in some regions. In the transition state, $T_1$ can detect the chatter 0.5 s ahead of time compared to the acceleration signal in the time domain, which provides sufficient time for signal processing and chatter suppression. In the chatter cutting state, $T_1$ tends to a stable high-level value.

Feature Extraction Based on SSA.
Determining the chatter features is of utmost importance for a successful chatter detection system. Fundamentally, the chatter index should reflect two kinds of chatter characteristics: the vibration amplitude in the time domain and the energy transfer in the frequency domain. The singular spectrum entropy expresses the uncertainty of the signal energy based on singular spectrum analysis. If the signal energy is concentrated on a few principal components, the singular spectrum entropy will be small, and vice versa.
In the stable state, the acquired signal is simple, so the singular spectrum entropy is small. Similarly, in the chatter state, the signal is dominated by the chatter frequency and the doubling frequency, so the singular spectrum entropy is also small. However, in the transition state, the increase in the vibration amplitude and the frequency transfer emerge, but they do not dominate in the signal acquired. erefore, the signal in the transition state is more complicated than that in the other two states, and the singular spectrum entropy will increase obviously. In this study, the singular spectrum entropy is adopted as a chatter index for chatter detection.
According to the singular spectrum analysis of the reconstructed signal, the singular values obtained are as follows:

$$s = \{s_1, s_2, \ldots, s_m\}. \tag{13}$$

The singular spectrum of the sampling time series is as follows:

$$p = \{p_1, p_2, \ldots, p_m\},$$

where $p_i = s_i^2 / \sum_{j=1}^{m} s_j^2$ is the contribution ratio of the i-th component to the total energy. The singular spectrum entropy of the reconstructed signal can be expressed as follows:

$$H = -\sum_{i=1}^{m} p_i \log p_i.$$

The singular spectrum entropy ratio of the i-th frequency band is defined as follows:

$$R_i = \frac{H_i}{\sum_{j=1}^{8} H_j},$$

where $H_i$ is the singular spectrum entropy of the reconstructed signal in the i-th frequency band. Considering that the chatter frequency lies in the frequency bands $d_4$ and $d_5$, the sum of the singular spectrum entropy ratios of these two bands is taken as the other chatter feature $T_2$:

$$T_2 = R_4 + R_5.$$

To validate $T_2$, SSA was applied to the reconstructed acceleration signal to obtain the singular spectrum entropy ratio of the chatter-emerging frequency band, and the variation trend of $T_2$ during chatter generation is shown in Figure 10. Figure 10 shows that the chatter feature $T_2$ stays at about 17% and changes little in the stable cutting state. Then $T_2$ increases markedly and stays near 25%, and it can detect the chatter 0.3 s ahead of time compared to the acceleration signal. When the amplitude of the acceleration signal increases, $T_2$ decreases by 5%; this is the chatter cutting stage. The analysis results indicate that the singular spectrum entropy ratio changes clearly between cutting states, so it is a useful chatter feature for detecting chatter.
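The entropy feature can be sketched as follows, assuming the natural logarithm (the base cancels in the ratio) and eight frequency bands from three-level WPD. The function names and the toy singular values are hypothetical and only illustrate the arithmetic.

```python
import math


def singular_spectrum_entropy(singular_values):
    """Shannon entropy of the energy shares p_i = s_i^2 / sum_j s_j^2."""
    energies = [s * s for s in singular_values]
    total = sum(energies)
    # Zero-energy components contribute nothing (p log p -> 0 as p -> 0).
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log(p) for p in probs)


def entropy_ratios(band_entropies):
    """Share of each band in the summed entropy of all bands."""
    total = sum(band_entropies)
    return [h / total for h in band_entropies]


def chatter_feature_t2(band_singular_values):
    """T2 = R4 + R5, with bands numbered from 1."""
    h = [singular_spectrum_entropy(s) for s in band_singular_values]
    r = entropy_ratios(h)
    return r[3] + r[4]


# Energy spread evenly over four components gives the maximum entropy ln(4);
# energy concentrated on one component gives zero entropy.
h_flat = singular_spectrum_entropy([1.0, 1.0, 1.0, 1.0])
h_peak = singular_spectrum_entropy([5.0, 0.0, 0.0])
print(h_flat, h_peak)

# Eight bands with identical singular spectra share the entropy equally.
t2_uniform = chatter_feature_t2([[1.0, 1.0]] * 8)
print(t2_uniform)
```

When all eight bands are equally complex, two bands carry 2/8 of the total entropy, which gives a baseline against which a rise or fall of $T_2$ can be read.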

SVM-PSO Setting and Optimization.
The target is to recognize the cutting states based on the chatter features of the acquired acceleration signal. Sixty groups of vibration signals were obtained under different spindle speeds, cutting depths, and feed rates. Among them, the first 40 groups were used as the training samples, and the other signals were used for testing. According to the classification rules above, these signals are classified into three patterns, and each pattern has 20 groups of signals. Then, the chatter index of each signal was calculated based on WPD and SSA.
SVM is employed for the feature classification. The PSO-improved SVM can achieve high accuracy rapidly. It is implemented iteratively in four steps:
Step 1: feature data input
Step 2: data preprocessing: normalization and regularization
Step 3: analysis and choice of kernel functions
Step 4: parameter adjustment of SVM based on PSO
The proposed chatter features $T_1$ and $T_2$ based on WPD and SSA are used as the inputs of SVM-PSO, so the number n of inputs is 2. The outputs of SVM-PSO were set as [0 0 1], [0 1 0], and [1 0 0], corresponding to the stable, transition, and chatter states, respectively. To find a suitable kernel function, Matlab 2018a was employed to calculate the two-feature classification model based on SVM. The computer CPU is an Intel(R) Core(TM) i7-6500 CPU @ 2.50 GHz, and the RAM is 8.0 GB. Before parameter adjustment, the recognition accuracy and calculation time of the four kernel functions are shown in Table 3.
Table 3 indicates that the accuracy of the kernel function "Poly" is much higher than that of the other three kernel functions, and "Poly" consumes relatively little time. Consequently, "Poly" was chosen as the kernel function of the SVM model. The kernel function "Poly" is defined as follows:

$$K(x, y) = \left(c\,x^{T} y + r\right)^{d},$$

where r is a constant, x is the feature data, y is the labeled data, and c and d are the two parameters to be adjusted. Generally, these two parameters have a great impact on the recognition accuracy; they are set in the intervals [0, 1] and [0, 5], respectively. However, the optimal parameters are hard to find. As a powerful mathematical tool, PSO was chosen to optimize the two parameters. The initial values of c and d were set as 0.5 and 2. The number of particles was 50. The parameter setting of the PSO process is shown in Table 4.
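As a small illustration of how the two tuned parameters enter the kernel, the sketch below evaluates a polynomial kernel of the form $(c\,x^T y + r)^d$ for two feature vectors. The function name, the default r = 1, and the sample vectors are assumptions made for illustration; in some libraries the coefficient c is called gamma.

```python
def poly_kernel(x, y, c, d, r=1.0):
    """Polynomial kernel (c * <x, y> + r) ** d for two feature vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (c * dot + r) ** d


# Two hypothetical feature vectors (T1, T2), evaluated at the
# initial parameter values c = 0.5 and d = 2 quoted in the text.
k_value = poly_kernel([0.2, 0.5], [0.4, 0.1], c=0.5, d=2)
print(k_value)
```

Here the dot product is 0.13, so the kernel value is $(0.5 \cdot 0.13 + 1)^2 = 1.134225$, up to floating-point rounding.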

Performance Analysis.
To improve the generality of the proposed HCDM, the Kennard and Stone algorithm was used to choose training and testing samples [41,42]. To avoid accidental errors, the same work was done ten times in each working condition. And, one of the choices is illustrated in Table 5.
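The sample selection can be sketched as follows. This is a minimal plain-Python version of the usual Kennard and Stone procedure (start from the two mutually farthest samples, then repeatedly add the sample whose nearest selected neighbor is farthest away), written under the assumption of Euclidean distance on the feature vectors. The function name and the toy points are hypothetical.

```python
import math


def kennard_stone(samples, n_select):
    """Return (selected, remaining) sample indices chosen by Kennard-Stone."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]

    # Seed the training set with the two samples that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda pair: dist[pair[0]][pair[1]])
    selected = [i0, j0]
    remaining = [i for i in range(n) if i not in (i0, j0)]

    # Add the sample whose distance to its closest selected sample is largest.
    while len(selected) < n_select:
        nxt = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Five toy feature vectors (T1, T2); three go to training, two to testing.
points = [(0.0, 0.0), (1.0, 0.0), (0.1, 0.0), (0.5, 0.0), (0.9, 0.0)]
train_idx, test_idx = kennard_stone(points, n_select=3)
print(train_idx, test_idx)
```

For the toy points, the two extremes are picked first and the midpoint third, so the training set covers the feature range evenly while the near-duplicates are left for testing.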

Comparison with SVM-GS.
To verify the proposed HCDM, grid searching (GS) was used as the optimization method in place of PSO. The initial values and grid size are set as shown in Table 6. The classification accuracy and the average calculation time of SVM, SVM-PSO, and SVM-GS are shown in Figure 11. SVM has the lowest classification accuracy and the shortest calculation time. The classification accuracy of SVM-PSO is the same as that of SVM-GS, but the calculation time of SVM-GS is about three times as long as that of SVM-PSO. Therefore, compared with SVM and SVM-GS, SVM-PSO has the better overall performance in classification accuracy and calculation time.

Comparison with the Single Chatter Feature.
Feature extraction has a great impact on the classification precision of cutting states. So WPD, SSA, and WPD-SSA were used to extract the chatter features $T_1$, $T_2$, and their combination, respectively. The classification results of SVM-PSO are shown in Figure 12. Because the classification models with a single feature need little calculation time, they are well suited to the real-time monitoring of cutting states. However, compared with the classification model using two features, their classification accuracy is much lower. In addition, with the help of the optimization algorithm, the coupling of two features that individually have low classification accuracy can achieve a good performance.

Comparison with the Number of Training Sample.
To examine the impact of the number of training samples on the classification results, the number of training samples was set as 30, 35, 40, 45, 50, and 55, respectively. The same calculation process was carried out for each training set, and the classification results are shown in Figure 13. The calculation time increases sharply with the number of training samples. Generally, as the number of training samples increases, the classification accuracy increases steadily. However, when the number of training samples is 55, the classification accuracy is much lower and more dispersed, because the small test set (only 5 test samples) introduces great randomness.

Comparison with Other Classification Methods.
To evaluate the classification algorithm, K nearest neighbors (KNN), backpropagation neural network (BPNN), decision trees (DT), and the naive Bayesian model (NBM) were also applied to the feature classification. The parameter settings of the different methods are shown in Table 7, and the classification results are shown in Figure 14. Generally, the classification accuracy of SVM-PSO is much higher than that of the other classification algorithms, and its calculation time is also much higher. The primary cause is that the input parameters of the other classification algorithms are determined by engineering experience rather than by optimization techniques.

Conclusion
In this paper, a hybrid chatter detection method based on WPD, SSA, and SVM-PSO in end milling is proposed. Based on an in-depth analysis of the chatter-emerging frequency band of vibration signals, the energy ratio and the singular spectrum entropy ratio are extracted as two chatter features. To improve the precision and efficiency of the proposed HCDM, PSO is chosen to optimize the input parameters of SVM. The effectiveness is preliminarily verified by a set of cutting experiments on 300 M steel under different working conditions. The proposed approach can recognize the stable, transition, and chatter states more accurately than the other traditional approaches. Now that cutting chatter can be detected in end milling, chatter suppression will be studied in the future. According to the impacts of cutting parameters on cutting chatter, a chatter suppression method based on adjusting the spindle speed will be researched.
Data Availability
The acceleration signal and Excel data used to support the findings of this study have not been made available because data sharing is not allowed. The data in this paper concern the machining of 300 M steel, which is used in aircraft landing gear, and technologies related to the aviation industry are not open to all researchers. So, they are not provided.

Conflicts of Interest
The authors declare that they have no conflicts of interest.