Vibration-Based Signal Analysis for Shearer Cutting Status Recognition Based on Local Mean Decomposition and Fuzzy C-Means Clustering

Abstract: In order to accurately acquire the shearer cutting status, this paper proposes a pattern recognition method based on local mean decomposition (LMD), time-frequency statistical analysis, an improved Laplacian score (LS), and the fuzzy C-means (FCM) clustering algorithm. LMD was employed to preprocess the vibration signals of the shearer cutting the coal seam, and several product functions (PFs) were obtained. Following this, 14 time-frequency statistical parameters of the original signal and of the optimal PF were extracted. Additionally, the improved LS algorithm was designed to ensure the accurate estimation of features, so that a new feature vector could be selected. Subsequently, the obtained eigenvector matrix was fed into the FCM algorithm to be clustered, for optimal clustering performance. Experimental examples were provided to verify the effectiveness of the methodology, and the results indicated that the proposed algorithm could be applied to recognize the different categories of shearer cutting status.


Introduction
Recently, intelligent mining technology for the fully mechanized coal mining face has increasingly attracted the attention of coal mining managers. As a key component of this fully mechanized equipment, the shearer plays the most important role in the whole process of coal mining. The precise identification of the cutting status of the shearer is directly related to the level of intelligence, and has a great impact on the safety of coal miners and on mining efficiency [1]. Therefore, there is an urgent need to study a new pattern recognition method for accurately recognizing the shearer cutting status [2].
Vibration-based signal analysis techniques have been widely used in the field of pattern recognition and fault diagnosis. The vibration behavior of a shearer's cutting part changes when the shearer cuts coal seams with different geological conditions. However, due to the extremely harsh mining environment, the vibration signals exhibit strongly nonlinear, non-Gaussian, and non-stationary characteristics, and need to be preprocessed in order to obtain accurate state features. Many traditional methods, such as the Fourier transform, the wavelet transform, and the Wigner-Ville distribution [3][4][5][6], have been developed for signal analysis, but these methods have their own drawbacks, chiefly that they are not self-adaptive.
Taking this into consideration, Huang proposed a self-adaptive time-frequency decomposition algorithm in 1998, named the empirical mode decomposition (EMD), which has since been used in
(1) All of the local extrema $n_i$ and the corresponding times $t_{n_i}$ of the original signal $x(t)$ are determined, and the mean value $m_i$ of two successive extrema $n_i$ and $n_{i+1}$ is calculated as follows:
$$m_i = \frac{n_i + n_{i+1}}{2}$$
All mean values $m_i$ are connected by straight lines between the corresponding times $t_{n_i}$ and $t_{n_{i+1}}$, to generate the local mean segments. Then, the local mean segments are smoothed by a moving averaging method, to form a continuous local mean function $m_{11}(t)$.
(2) The local amplitude $a_i$ can be given as follows:
$$a_i = \frac{|n_i - n_{i+1}|}{2}$$
The moving averaging method is also used to smooth the local amplitude segments, in order to derive the envelope estimate function $a_{11}(t)$.
(3) The local mean function $m_{11}(t)$ is subtracted from the original signal $x(t)$, and the remnant signal, denoted by $h_{11}(t)$, can be given as follows:
$$h_{11}(t) = x(t) - m_{11}(t)$$
(4) The signal $h_{11}(t)$ is demodulated by the envelope estimate function $a_{11}(t)$, and the result, denoted by $s_{11}(t)$, can be calculated as follows:
$$s_{11}(t) = \frac{h_{11}(t)}{a_{11}(t)}$$
The envelope estimate function $a_{12}(t)$ of $s_{11}(t)$ is calculated. If $a_{12}(t)$ is not equal to one, $s_{11}(t)$ is not a purely frequency-modulated signal, and the above procedure should be repeated for $s_{11}(t)$ until a purely frequency-modulated signal $s_{1n}(t)$ is obtained, that is, until the envelope estimate function $a_{1(n+1)}(t)$ of $s_{1n}(t)$ is equal to one. Therefore,
$$\begin{cases} h_{11}(t) = x(t) - m_{11}(t) \\ h_{12}(t) = s_{11}(t) - m_{12}(t) \\ \quad\vdots \\ h_{1n}(t) = s_{1(n-1)}(t) - m_{1n}(t) \end{cases}$$
where
$$\begin{cases} s_{11}(t) = h_{11}(t)/a_{11}(t) \\ s_{12}(t) = h_{12}(t)/a_{12}(t) \\ \quad\vdots \\ s_{1n}(t) = h_{1n}(t)/a_{1n}(t) \end{cases}$$
(5) An envelope signal $a_1(t)$ can be derived as the product of all of the envelope estimate functions obtained during the iterative process described above:
$$a_1(t) = a_{11}(t)\,a_{12}(t) \cdots a_{1n}(t) = \prod_{q=1}^{n} a_{1q}(t)$$
(6) The first product function $PF_1$ of the original signal can be generated as the product of the envelope signal $a_1(t)$ and the purely frequency-modulated signal $s_{1n}(t)$:
$$PF_1(t) = a_1(t)\,s_{1n}(t)$$
(7) $PF_1$ is then subtracted from the original signal $x(t)$, generating a new signal $u_1(t)$. The whole process is repeated $k$ times, until $u_k(t)$ is a constant or monotonic function:
$$\begin{cases} u_1(t) = x(t) - PF_1(t) \\ u_2(t) = u_1(t) - PF_2(t) \\ \quad\vdots \\ u_k(t) = u_{k-1}(t) - PF_k(t) \end{cases}$$
Finally, the original signal $x(t)$ is decomposed into $k$ PFs and a residual $u_k(t)$, and $x(t)$ can be reconstructed as follows:
$$x(t) = \sum_{p=1}^{k} PF_p(t) + u_k(t)$$
From the above algorithm, it can be seen that the LMD method is an adaptive signal decomposition method, based on the local extrema information of the signal itself.
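As an illustration of steps (1) to (4), the following Python sketch performs a single pass that produces the local mean, the envelope estimate, the remnant signal and the demodulated signal. It is a minimal sketch rather than a full LMD implementation: the function names, the piecewise-constant segments, the edge handling, and the choice of the smoothing window (the longest distance between successive extrema) are all assumptions made for this example.

```python
import numpy as np

def local_extrema(x):
    """Indices of the interior local maxima and minima of a 1-D signal."""
    d = np.diff(x)
    # A sign change in the first difference marks an extremum.
    return np.where(d[:-1] * d[1:] < 0)[0] + 1

def moving_average(y, width):
    """Centered moving average with edge padding (width is forced odd)."""
    width = max(3, int(width) | 1)
    pad = width // 2
    padded = np.pad(y, pad, mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def lmd_first_pass(x):
    """One pass of steps (1)-(4); returns m11, a11, h11, s11."""
    idx = local_extrema(x)
    n = x[idx]
    m_i = (n[:-1] + n[1:]) / 2.0        # means of successive extrema
    a_i = np.abs(n[:-1] - n[1:]) / 2.0  # local amplitudes
    m = np.empty_like(x, dtype=float)
    a = np.empty_like(x, dtype=float)
    # Hold each value constant between successive extrema ...
    for k in range(len(m_i)):
        m[idx[k]:idx[k + 1]] = m_i[k]
        a[idx[k]:idx[k + 1]] = a_i[k]
    # ... and extend the first and last segments to the signal ends.
    m[:idx[0]], a[:idx[0]] = m_i[0], a_i[0]
    m[idx[-1]:], a[idx[-1]:] = m_i[-1], a_i[-1]
    width = int(np.max(np.diff(idx)))   # assumed smoothing window
    m11 = moving_average(m, width)
    a11 = moving_average(a, width)
    h11 = x - m11                       # remnant signal, step (3)
    s11 = h11 / np.maximum(a11, 1e-12)  # demodulated signal, step (4)
    return m11, a11, h11, s11
```

A complete decomposition would repeat this pass on s11 until the envelope is close to one, form the PF as in steps (5) and (6), and then restart on the residual as in step (7).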

Fuzzy C-Means Clustering Algorithm
An FCM clustering algorithm is an unsupervised dynamic clustering method which can divide a sample collection $X = \{x_1, x_2, \cdots, x_n\}$ into $c$ categories ($2 \le c \le n$). The membership of the sample point $x_i$ in the $j$th ($1 \le j \le c$) class is defined as $u_{ij}$, and $U = \{u_{ij}\}$ is the membership matrix, which denotes the fuzzy clustering result. The membership matrix $U$ possesses the following properties:
$$\sum_{j=1}^{c} u_{ij} = 1, \qquad u_{ij} \in [0, 1], \qquad 0 < \sum_{i=1}^{n} u_{ij} < n \tag{11}$$
The FCM clustering algorithm minimizes the objective function $J_{fcm}$ under the constraint conditions (11):
$$J_{fcm}(U, C) = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m}\, d_{ij}^{2}$$
where $m$ denotes the fuzzy weighting exponent, $c_j$ is the $j$th clustering center, and $d_{ij}$ denotes the Euclidean distance between the $i$th sample and the $j$th clustering center, which can be described as
$$d_{ij} = \| x_i - c_j \|$$
The main iterative process can be expressed as follows:
(1) Provide the number of clustering categories $c$, the fuzzy weighting exponent $m$, the iteration stop threshold $\varepsilon$, and the maximum number of iterations $T_{max}$. Then, initialize the membership matrix $U^{(t)}$ and set the iteration number $t = 0$.
(2) The clustering center $c_j$ can be calculated as follows:
$$c_j = \frac{\sum_{i=1}^{n} u_{ij}^{m}\, x_i}{\sum_{i=1}^{n} u_{ij}^{m}}$$
(3) The membership matrix $U^{(t+1)}$ can be updated as:
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{ik} \right)^{2/(m-1)}}$$
(4) If $\| U^{(t+1)} - U^{(t)} \| \le \varepsilon$, then the iteration process terminates. Otherwise, set $t = t + 1$ and return to step (2).
(5) Finally, an optimal membership matrix $U^*$ and clustering center $C^*$ can be obtained.
(6) The principle of selecting the nearest is adopted to recognize the types of unlabeled objects.
The Hamming near-degree $H$ between the unlabeled object $A$ and each clustering center $c_j$ is used to describe the similarity of the two fuzzy subsets. Suppose the number of variables in the sample is $r$; the mathematical formula of $H$ can then be given as follows:
$$H(A, c_j) = 1 - \frac{1}{r} \sum_{k=1}^{r} \left| A_k - c_{jk} \right|$$
where $A_k$ and $c_{jk}$ denote the $k$th variables of $A$ and $c_j$, respectively. A larger $H(A, c_j)$ signifies that the two fuzzy subsets are more similar, and the unlabeled object $A$ is assigned to the category $j$ for which $H(A, c_j)$ is largest.
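The iteration and the near-degree rule above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the random initialization, the default values of m, eps and t_max, the Frobenius norm in the stopping test, and the function names are all choices made for this example, and the near-degree presumes that the features have been scaled to [0, 1].

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, t_max=100, seed=0):
    """Fuzzy C-means on X (n samples x r features); returns (U, C)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)              # each row sums to one
    for _ in range(t_max):
        W = U ** m
        C = (W.T @ X) / W.sum(axis=0)[:, None]     # clustering centers c_j
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                   # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)  # membership update
        converged = np.linalg.norm(U_new - U) <= eps
        U = U_new
        if converged:
            break
    return U, C

def hamming_near_degree(a, center):
    """H(A, c_j) = 1 - mean(|A_k - c_jk|) over the r variables."""
    return 1.0 - np.mean(np.abs(a - center))

def classify(a, C):
    """Assign an unlabeled sample to the center with the largest H."""
    return int(np.argmax([hamming_near_degree(a, cj) for cj in C]))
```

In use, `fcm` would be run on the training feature vectors, and each testing vector would then be passed to `classify` together with the returned centers.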

The Proposed Pattern Recognition Method
In this section, a novel pattern recognition method, based on LMD and FCM, is presented. Firstly, the LMD method is used to preprocess the measured vibration signals, producing a set of PF components. Then, the optimal PF is selected according to the Kullback-Leibler divergence values between each PF and the original signal. Following this, the features of the original signal and the optimal PF are extracted, using the time-frequency statistical parameters. Finally, an improved Laplacian score algorithm is proposed in order to rank the extracted features, and appropriate features are selected. The new feature vectors obtained are the inputs of the FCM, which clusters the different cutting statuses of the shearer. The flowchart of the proposed method is shown in Figure 1.
Appl. Sci. 2017, 7, 164

The Optimal PF Component Selection Based on Kullback-Leibler Divergence
After LMD decomposition, several PF components are obtained. However, it is unnecessary to conduct feature extraction for all PFs. In order to choose an optimal PF component (OPF), which contains the most feature information, the Kullback-Leibler divergence (KLD) is taken as the criterion to measure the relevancy between each PF and the original signal. The main steps of OPF selection with KLD can be described as follows:
(1) Suppose $p(x)$ and $q(x)$ are the probability density functions of the original signal $x(t)$ and $PF_i(t)$, respectively. $p(x)$ can be defined as follows:
$$p(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\!\left( \frac{x_i - x}{h} \right)$$
where $N$ denotes the sampling number of $x(t)$, $h$ is called the window width or smoothing parameter, and $K(\cdot)$ denotes a kernel function, commonly expressed as:
$$K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}$$
Likewise, the probability density function $q(x)$ can be obtained.
(2) The Kullback-Leibler distance between $x(t)$ and the $i$th PF can be defined as:
$$\delta(p, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}$$
(3) The KLD between $x(t)$ and the $i$th PF can be calculated as:
$$D_i = D(p, q) = \delta(p, q) + \delta(q, p)$$
The normalized KLD values can be obtained by Equation (20), where $k$ is the number of PFs:
$$\lambda_i = \frac{D_i}{\sum_{l=1}^{k} D_l} \tag{20}$$
The smaller the KLD value is, the closer the correlation between the PF and original signal is, and vice versa.
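The selection procedure can be sketched in Python as follows, assuming a Gaussian kernel, a shared evaluation grid for both densities, and a normal-reference rule of thumb for the window width h. None of these choices is specified in the text, and the function names are hypothetical.

```python
import numpy as np

def kde(samples, grid, h):
    """Gaussian kernel density estimate of `samples` evaluated on `grid`."""
    u = (grid[:, None] - samples[None, :]) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return k.sum(axis=1) / (len(samples) * h)

def symmetric_kld(x, pf, n_grid=200, h=None):
    """delta(p, q) + delta(q, p) between the densities of x and pf."""
    lo, hi = min(x.min(), pf.min()), max(x.max(), pf.max())
    grid = np.linspace(lo, hi, n_grid)
    if h is None:
        # Normal-reference rule of thumb; an assumed choice of window width.
        h = 1.06 * np.std(x) * len(x) ** (-0.2)
    tiny = 1e-12                           # guards against log(0)
    p = kde(x, grid, h) + tiny
    q = kde(pf, grid, h) + tiny
    p, q = p / p.sum(), q / q.sum()        # discretize to probabilities
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def select_opf(x, pfs):
    """Return the index of the PF with the smallest normalized KLD."""
    d = np.array([symmetric_kld(x, pf) for pf in pfs])
    d_norm = d / d.sum()                   # normalized KLD values
    return int(np.argmin(d_norm)), d_norm
```

Because the normalization divides every value by the same sum, it does not change which PF is selected; it only makes the values comparable across signals.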

Feature Extraction Based on Time-Frequency Statistical Parameters
When the shearer drum cuts different types of coal seams, the time domain and frequency domain properties of the collected vibration signal change. Hence, time domain and frequency domain feature parameters are used together, in order to acquire more feature information. In this paper, seven time domain feature parameters ($p_1$ to $p_7$) and seven frequency domain feature parameters ($p_8$ to $p_{14}$) are used to extract the features of the original signal. Likewise, the 14 features ($p_{15}$ to $p_{28}$) of the optimal PF component are computed, giving a total of 28 features. In the formulas for calculating $p_1$ to $p_{14}$, $y_j$ denotes the frequency spectrum of the signal $x(i)$, $K$ is the number of spectrum lines, and $f_j$ denotes the frequency value of the $j$th spectrum line. Among these 14 features, the time domain feature parameters comprise the mean, variance, root mean square, skewness, kurtosis, impulsion index, and kurtosis index. For the frequency domain feature parameters, $p_8$ reflects the vibration energy in the frequency domain, $p_9$ and $p_{10}$ represent the change in the position of the dominant frequency, and $p_{11}$ and $p_{14}$ characterize the degree of dispersion or concentration of the spectrum.
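To make the idea of time-frequency statistical parameters concrete, the Python sketch below computes a few representative quantities of each kind. It is only an illustration: the exact definitions of $p_1$ to $p_{14}$ are not reproduced here, so the selection of statistics, their names, and the amplitude-spectrum normalization are assumptions.

```python
import numpy as np

def time_features(x):
    """Mean, variance, RMS, skewness and kurtosis of a 1-D signal."""
    mean = np.mean(x)
    std = np.std(x)
    return {
        "mean": mean,
        "variance": std ** 2,
        "rms": np.sqrt(np.mean(x ** 2)),
        "skewness": np.mean((x - mean) ** 3) / std ** 3,
        "kurtosis": np.mean((x - mean) ** 4) / std ** 4,
    }

def frequency_features(x, fs):
    """Spectrum statistics; y[j] is the amplitude at frequency f[j]."""
    y = np.abs(np.fft.rfft(x)) / len(x)
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)
    centroid = np.sum(f * y) / np.sum(y)
    return {
        "mean_amplitude": np.mean(y),   # overall spectral level
        "centroid": centroid,           # position of the main frequency band
        "spread": np.sqrt(np.sum((f - centroid) ** 2 * y) / np.sum(y)),
    }
```

Applying both functions to the original signal and to the optimal PF, and concatenating the results, would give one feature vector per sample in the same spirit as the 28 features described above.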

Feature Ranking and Selection Based on Improved Laplacian Score Algorithm
Although the extracted features can be used to identify the working status of equipment from a variety of aspects, they have different sensitivities to different working conditions. In addition, too many inputs for the FCM will consume more clustering time and reduce the clustering accuracy. Therefore, it is necessary to remove the insensitive features and refine the feature vectors. In this paper, an improved Laplacian score (LS) algorithm is introduced in order to rank the extracted features. Following this, some optimal features are selected according to their importance and locality-preserving power.
Suppose that the experiment comprises $M$ samples, and that each sample has $T$ features. In this study, $L_r$ denotes the Laplacian score of the $r$th feature, and $f_{ir}$ denotes the $r$th feature of the $i$th sample, where $i = 1, 2, \ldots, M$ and $r = 1, 2, \ldots, T$. The feature vectors of the $M$ samples are used to construct a nearest neighbor graph $G$, and the weighting matrix $S$ is then calculated as the similarity matrix of graph $G$. The Laplacian score of the $r$th feature can be computed as follows [33]:
$$L_r = \frac{\tilde{f}_r^{T} L \tilde{f}_r}{\tilde{f}_r^{T} D \tilde{f}_r}$$
where $f_r = (f_{1r}, f_{2r}, \cdots, f_{Mr})^T$, $D = \mathrm{diag}(SI)$, $I = (1, 1, \cdots, 1)^T$, $L = D - S$ is the Laplacian matrix of graph $G$, and $\tilde{f}_r = f_r - \dfrac{f_r^{T} D I}{I^{T} D I}\, I$. After calculating the LS value of each feature, the features can be ranked from a low score to a high score. However, the Laplacian score algorithm depends too much on the local structure information of adjacent samples. The features selected on the basis of LS possess a stronger representational capacity for the similarity of adjacent samples, and a weaker distinguishing capability for global samples. The Fisher score (FS) algorithm is commonly employed to measure the degree of separation between samples. In this work, FS is coupled with the LS algorithm to form the Laplacian-Fisher score (LFS), which is used to select the optimal features from all 28 features.
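A compact Python sketch of the Laplacian score computation is given below. The k-nearest-neighbor graph, the heat-kernel weights and the parameters k and t are assumptions made for this example, since the text only states that S is the similarity matrix of the graph G.

```python
import numpy as np

def laplacian_scores(F, k=5, t=1.0):
    """Laplacian score of each column of F (M samples x T features)."""
    M = F.shape[0]
    d2 = np.sum((F[:, None, :] - F[None, :, :]) ** 2, axis=2)
    # k-nearest-neighbor graph G (self excluded), made symmetric.
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]
    A = np.zeros((M, M), dtype=bool)
    A[np.repeat(np.arange(M), k), nn.ravel()] = True
    A |= A.T
    S = np.where(A, np.exp(-d2 / t), 0.0)   # heat-kernel weights (assumed)
    deg = S.sum(axis=1)                     # diagonal of D
    L = np.diag(deg) - S                    # graph Laplacian L = D - S
    scores = np.empty(F.shape[1])
    for r in range(F.shape[1]):
        f = F[:, r]
        f_t = f - (f @ deg) / deg.sum()     # remove the D-weighted mean
        scores[r] = (f_t @ L @ f_t) / (f_t @ (deg * f_t))
    return scores
```

Features with smaller scores vary little between neighboring samples relative to their overall variance, which is why the ranking runs from low to high. A constant feature makes the denominator zero and would need to be filtered out beforehand.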
The Fisher score of the $r$th feature can be written as follows:
$$F_r = \frac{\sum_{t_i=1}^{c} M_{t_i} \left( \mu_{r t_i} - \mu_r \right)^{2}}{\sum_{t_i=1}^{c} M_{t_i}\, \sigma_{r t_i}^{2}}$$
where $t_i$ denotes the category of the $i$th sample, $M_{t_i}$ denotes the number of samples of the $t_i$th category, $\mu_{r t_i}$ and $\sigma_{r t_i}$ represent the mean and standard deviation of the $r$th feature within that category, respectively, and $\mu_r$ is the mean of the $r$th feature over all samples. The feature with the largest Fisher score has the strongest distinguishing capability for global samples. In order to favor features with a smaller LS and a larger FS, a weighting factor $\alpha \in [0, 1]$ is introduced into the improved LS algorithm to calculate the LFS. The improved LS method degenerates to the basic LS method for $\alpha = 1$, and becomes the basic FS method for $\alpha = 0$. In order to illustrate the superiority of the improved LS method, LFS, LS, and FS are compared. Two data sets, Glass and Wall-Following Robot Navigation (WFRN) [34], are selected from the UCI database for the simulations, to estimate the ranking performance of the three methods. The specific sample information is provided in Table 1.
Table 1. Statistical information of the two data sets.

Data Set   Number of Training Samples   Number of Testing Samples   Number of Categories   Number of Features
Glass      150                          50                          6                      9
WFRN       500                          100                         4                      24

The weighting factor α is set to 0.5, 1, and 0, corresponding to the LFS, LS, and FS feature selection methods, respectively, which are used to rank the features of the two data sets. The FCM clustering algorithm is used as the classifier, and the classification results with different numbers of features are shown in Figure 2. In Figure 2, it can be clearly seen that the classification accuracies are significantly influenced by the feature ranking technique. Compared with FS and LS, the LFS method has a superior ranking ability for most numbers of selected features, which suggests that it is better suited to shearer cutting status recognition. The reason for this is that LFS considers both locality-preserving power and global distinguishing capability when evaluating the importance of a feature. Hence, the LFS method is utilized to rank the 28 features of the vibration signals and to select an important feature vector for training the FCM clustering model.
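The Fisher score and the role of the weighting factor can be sketched in Python as follows. The `fisher_scores` function follows the standard between-class over within-class definition. The `rank_features` function is a hypothetical combination chosen only so that α = 1 reproduces the LS ordering and α = 0 reproduces the FS ordering; it is an assumption for illustration and should not be read as the LFS formula used in this work.

```python
import numpy as np

def fisher_scores(F, labels):
    """Fisher score of each column of F given integer class labels."""
    mu = F.mean(axis=0)
    num = np.zeros(F.shape[1])
    den = np.zeros(F.shape[1])
    for cls in np.unique(labels):
        Fc = F[labels == cls]
        num += len(Fc) * (Fc.mean(axis=0) - mu) ** 2  # between-class scatter
        den += len(Fc) * Fc.var(axis=0)               # within-class scatter
    return num / den

def rank_features(ls, fs, alpha):
    """Rank features, best first, by a HYPOTHETICAL combined score.

    Both scores are min-max scaled; a small LS and a large FS are rewarded.
    This combination is an assumption, not the LFS formula of the paper.
    """
    def scale(v):
        return (v - v.min()) / (v.max() - v.min())
    combined = alpha * scale(ls) - (1.0 - alpha) * scale(fs)
    return np.argsort(combined)
```

Sweeping `alpha` over several values and evaluating the classifier on the leading features of each ranking mirrors the comparison of LS, FS and LFS described above.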

Experimental Validation
In order to verify the effectiveness of the proposed method, the vibration signals under different shearer cutting statuses were analyzed. The experimental data were collected from a self-designed experimental system for shearer cutting coal, as shown in Figure 3. The coal seam was composed of three parts, distinguished by the Protodyakonov hardness coefficient f. The first part was the coal seam with f = 2, the second was the coal seam with f = 3, and the last part was the coal seam with gangues. Different cutting statuses mean that the shearer was cutting different types of coal seam. Thus, the shearer had four main cutting statuses: the idling pattern, cutting the coal seams with f = 2 and f = 3, and cutting the coal seam with gangues, which were represented by the symbols F1, F2, F3, and F4, respectively.

In the proposed method, the vibration signals are first decomposed using the LMD method. The vibration signals under the different cutting conditions could be decomposed into a sum of PF components. To save space, only the decomposition results of the vibration signal of the shearer cutting the coal seam with f = 2 are shown in Figure 5, as a representative example. According to the flowchart of the proposed recognition system, the Kullback-Leibler divergence (KLD) was then used to measure the relevancy between each PF and the original signal. The PF component with the lowest KLD value was selected as the OPF component. Taking the LMD decomposition results in Figure 5 as an example, the normalized KLD values of the PFs were calculated as 0.0018, 0.0125, 0.1454, 0.3578, and 0.4825, respectively. Therefore, the first PF component possessed the lowest KLD value (0.0018), and was selected as the OPF component for further analysis.
The 14 time-frequency statistical parameters were utilized to extract the features of the original signal and the OPF component, giving a total of 28 features. In the experiment, 140 samples, with 35 samples under each cutting status, were randomly selected for training the FCM clustering model, and the rest of the samples were used to test its clustering performance. Furthermore, in order to determine a more appropriate weighting factor α, the improved LS algorithm was then used to rank the 28 extracted features under different weighting factors. The weighting factor was set to 0, 0.2, 0.4, 0.6, 0.8, and 1, and the ranking results for each α, based on the LFS values, are shown in Table 2. From this table, it can be observed that the features are ordered differently under the various weighting factors.

α     Feature ranking
0.6   23, 4, 8, 22, 11, 28, 27, 2, 17, 10, 3, 24, 5, 6, 15, 7, 1, 12, 16, 19, 21, 14, 13, 20, 18, 25, 26, 9
0.8   23, 15, 18, 5, 6, 20, 2, 17, 21, 7, 24, 19, 9, 13, 26, 12, 22, 4, 10, 1, 11, 14, 3, 8, 27, 25, 28, 16
1     23, 26, 12, 2, 17, 9, 13, 1, 3, 15, 21, 5, 6, 14, 28, 22, 4, 24, 19, 25, 16, 11, 27, 7, 18, 20, 10, 8

For each feature sequence in Table 2, the FCM clustering algorithm was used to cluster the training samples with different feature subsets, and the testing samples were classified according to the principle of selecting the nearest. The highest classification accuracy and the corresponding dimension of the feature subset under each weighting factor are shown in Figure 6. It can be observed in Figure 6 that the dimension of the feature vectors affects the clustering centers and the classification results of the FCM clustering model. The FCM clustering model produced the highest classification accuracy (98.33 percent) with the lowest dimension of the optimal feature subset (7) for a weighting factor of α = 0.4.
The simulation results indicated that the LFS method possessed a superior ranking ability compared with the other two methods, and that the proposed pattern recognition method was applicable to shearer cutting status recognition.
To verify the necessity of preprocessing the shearer cutting vibration signals with the LMD method, two further simulations were performed. In the first, the original signal was decomposed by the EMD method, and the optimal intrinsic mode function (IMF) component was determined according to the KLD values. Then, the 14 statistical parameters were used to extract the features of the original signal and the optimal IMF. In the second, the original signal was not preprocessed with any decomposition method, and the 14 statistical parameters were used directly to extract the features of the original signal. The LFS method with α = 0.4 was employed to select the optimal feature subset for both simulations. Following the same process as in the proposed methodology, the classification results of the testing samples were produced, and can be seen in Figures 7 and 8. From Figure 7, three samples were misclassified by FCM clustering using the EMD method, and the classification accuracy only reached 95 percent, which was lower than that of the LMD method. As shown in Figure 8, when the original signal was not preprocessed by any decomposition method, four samples were misclassified by FCM clustering, and the classification accuracy only reached 93.33 percent, which was lower than that of the other two methods. Therefore, the comparison results demonstrate the necessity of preprocessing the original vibration signals with the LMD method before extracting the characteristics. The reason for this is that the LMD method can suppress the interference noise and highlight the feature information hidden in the original signals.
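As a quick consistency check of these figures, assuming each accuracy is the fraction of the 60 testing samples that were classified correctly (which would imply a single misclassified sample for the LMD-based method):

$$\frac{60 - 1}{60} \approx 98.33\%, \qquad \frac{60 - 3}{60} = 95\%, \qquad \frac{60 - 4}{60} \approx 93.33\%$$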
To verify the necessity of preprocessing the shearer cutting vibration signals with the LMD method, two simulations were performed. During the first, the original signal was decomposed by an EMD method, and the optimal IMF component was determined, according to the KLD values. Then, the 14 statistical parameters were used to extract the features of the original signal and the optimal IMF. In another simulation, the original signal was not preprocessed using any decomposition methods, and the 14 statistical parameters were directly used to extract the features of the original signal. The LFS method with α = 0.4 was employed to select the optimal feature subset for the two simulations. Through the same process that was mentioned in the proposed methodology section, the classification results of the testing samples were produced, and can be seen in Figures 7 and 8. From Figure 7, three samples were misclassified by FCM clustering using the EMD method, and the classification accuracy only reached 95 percent, which was lower than the LMD method. As shown in Figure 8, when the original signal was not preprocessed by any decomposing methods, four samples were misclassified by FCM clustering, and the classification accuracy only reached 93.33 percent, which is lower than the other two methods. Therefore, the comparison results demonstrate the necessity to preprocess the original vibration signals with the LMD method, before extracting the characteristics. The reason for this is that the LMD method can restrain the interference noise and highlight the feature information hidden in the original signals.   Furthermore, in order to illustrate the significance of ranking the features based on the LFS method, seven features were randomly selected from the 28 features of the original and OPF components, and were then fed into the FCM, in order to distinguish the various shearer cutting conditions. 
The simulation conditions remained the same as those mentioned above, and the clustering results of the 60 testing samples can be seen in Figure 9. It can be observed from Figure 9 that the FCM clustering model can distinguish between the idling pattern and the other cutting conditions, but possesses a weaker clustering ability for identifying various cutting statuses. This is because the random selection of features did not contain enough feature information to form appropriate clustering centers for the FCM clustering model. In order to illustrate the potential application of FCM in shearer cutting status recognition, a comparative study between the present work and other common methods, is shown in Table 3. The common methods include two clustering methods of K-means and a self-organizing map (SOM), and two artificial intelligence techniques of a BP neural network (BPNN) and support vector machine (SVM). Obviously, BPNN, SVM, and FCM classifiers, had a better recognition ability than the other two clustering methods, and the FCM clustering classifier obtained the highest recognition accuracy out of the five methods. The boundaries of the different shearer cutting statuses were usually ambiguous and could not be strictly distinguished. An FCM clustering algorithm has the theoretical basis of fuzzy mathematics and can reveal better classification performance in the recognition of shearer cutting statuses. Furthermore, in order to illustrate the significance of ranking the features based on the LFS method, seven features were randomly selected from the 28 features of the original and OPF components, and were then fed into the FCM, in order to distinguish the various shearer cutting conditions. The simulation conditions remained the same as those mentioned above, and the clustering results of the 60 testing samples can be seen in Figure 9. 
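The fuzzy membership idea behind the FCM classifier discussed above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard FCM update rules, not the implementation used in the experiments: the function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, the tolerance, the random initialization, and the synthetic data are all assumptions introduced here.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Standard fuzzy C-means (illustrative sketch; all defaults are assumed).

    X has shape (n_samples, n_features). Returns (centers, U), where
    U[i, j] is the membership of sample i in cluster j.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(max_iter):
        # Cluster centers: membership-weighted means of the samples.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]

        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero

        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U


def accuracy(n_misclassified, n_total):
    """Classification accuracy in percent."""
    return 100.0 * (n_total - n_misclassified) / n_total


if __name__ == "__main__":
    # Synthetic stand-in for a selected feature matrix (not the paper's data).
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 0.5, size=(20, 2))
                   for c in ([0, 0], [10, 0], [0, 10])])
    centers, U = fuzzy_c_means(X, n_clusters=3)
    hard_labels = U.argmax(axis=1)  # defuzzify by maximum membership
    print(centers.shape, hard_labels[:5])
    print(round(accuracy(3, 60), 2), round(accuracy(4, 60), 2))
```

Each row of `U` holds graded memberships rather than a hard assignment, which is what lets FCM represent samples lying near the ambiguous boundaries between cutting statuses. The `accuracy` helper simply reproduces the arithmetic behind the reported figures (3 and 4 misclassified samples out of 60).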

Conclusions
This paper presents a novel pattern recognition algorithm for shearer cutting status, based on LMD and FCM. In the proposed method, LMD is employed to decompose the vibration signal of the shearer into a set of PF components, and the KLD of each PF is calculated to select the OPF component. Then, 14 time-frequency statistical parameters are used to extract features from the original signal and the OPF. To mitigate the curse of dimensionality arising from the resulting 28 features, an improved LS method is presented, which ranks the features and automatically selects the optimal ones by analyzing their importance and locality-preserving power. The FCM clustering model is then introduced to classify the shearer cutting statuses. Finally, experiments and comparisons are presented, which show that the proposed method is feasible and achieves superior performance in identifying the different cutting categories and working conditions of a shearer.
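The feature-ranking idea summarized above can be illustrated with the classical Laplacian score, in which a lower score indicates a feature that better preserves the local neighborhood structure of the data. The sketch below implements only that standard score; it does not reproduce the improved LS variant or its α parameter. The function name `laplacian_score`, the neighborhood size `k`, the heat-kernel width `t`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def laplacian_score(X, k=5, t=1.0):
    """Classical Laplacian score per feature (lower means more important).

    X has shape (n_samples, n_features). k and t are assumed defaults.
    """
    n_samples, n_features = X.shape

    # Pairwise squared Euclidean distances between samples.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)

    # k nearest neighbours of each sample, excluding the sample itself.
    sq_no_self = sq.copy()
    np.fill_diagonal(sq_no_self, np.inf)
    idx = np.argsort(sq_no_self, axis=1)[:, :k]

    # Symmetric kNN adjacency with heat-kernel weights.
    A = np.zeros((n_samples, n_samples), dtype=bool)
    A[np.repeat(np.arange(n_samples), k), idx.ravel()] = True
    A |= A.T
    S = np.where(A, np.exp(-sq / t), 0.0)

    d = S.sum(axis=1)        # node degrees (diagonal of D)
    L = np.diag(d) - S       # graph Laplacian

    scores = np.empty(n_features)
    for r in range(n_features):
        f = X[:, r]
        f_tilde = f - (f @ d) / d.sum()      # remove degree-weighted mean
        numerator = f_tilde @ L @ f_tilde    # local variation on the graph
        denominator = np.sum(d * f_tilde ** 2)  # degree-weighted variance
        scores[r] = numerator / denominator
    return scores


if __name__ == "__main__":
    # Synthetic example: feature 0 separates two groups, feature 1 is noise.
    rng = np.random.default_rng(0)
    labels = np.repeat([0, 1], 30)
    X = np.column_stack([10.0 * labels + rng.normal(0, 0.1, 60),
                         rng.normal(0, 1.0, 60)])
    scores = laplacian_score(X)
    ranking = np.argsort(scores)  # most important feature first
    print(scores, ranking)
```

Sorting the scores in ascending order gives a ranking from which the leading features can be kept as the reduced feature vector passed to the clusterer.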