Improvement of Adaptive GAs and Back Propagation ANNs Performance in Condition Diagnosis of Multiple Bearing System Using Grey Relational Analysis

Condition diagnosis of multiple bearing systems is an important requirement in industry, because bearings are used in many types of equipment and their failure can result in total breakdown. The condition of a bearing is commonly reflected in its vibration signal data. Diagnosing multiple bearings involves many vibration signals and, consequently, the extraction of many features to obtain a precise diagnosis. However, extracting a large number of features increases the complexity of the diagnosis system. Therefore, in this paper, we present a diagnosis method that hybridizes adaptive genetic algorithms (AGAs), back propagation neural networks (BPNNs), and grey relational analysis (GRA) to diagnose the condition of multiple bearing systems. AGAs are used in the diagnosis algorithm to determine the best initial weights of the BPNNs in order to improve the diagnosis accuracy. In addition, GRA is applied to select the dominant features from the vibration signal data, so that a good diagnosis of the multiple bearing system is obtained with fewer extracted features. The experimental results show that AGAs-BPNNs with GRA achieves higher diagnosis accuracy in shorter processing time than AGAs-BPNNs without GRA.


Introduction
A bearing is a device widely used in industry to minimize friction on the rotating parts of a machine by providing smooth metal balls or rollers and smooth inner and outer metal surfaces for them to roll against. Unfortunately, bearings have a high percentage of failures compared with other machine components [1]. According to a previous study, bearings account for 40-50% of the causes of machine failure [2]. Therefore, a precise condition diagnosis of the bearing system is essential to detect defects before they lead to failures.
A precise condition diagnosis can be achieved by good condition monitoring. The condition of a bearing is reflected in its vibration signal data, which are captured by accelerometers that record the condition of the bearing continuously. Vibration signal data are commonly used for bearing condition diagnosis because of their intrinsic advantage in revealing bearing failure [3][4][5]. Vibration signals under different conditions show different patterns [6], as can be seen in Figure 1. Figure 1 shows that the vibration signal of a normal bearing has a different pattern from that of a faulty bearing: the signal of the faulty bearing has a higher amplitude. However, the vibration signal of a multiple bearing system with one faulty bearing may not be visually distinguishable from that of a system in which all bearings are normal. Therefore, it is important to have a technique that can accurately diagnose the system condition from the continuously monitored vibration signals.
It is important to extract features from the vibration signal data, since the data captured from a mechanical system such as a bearing are complex in nature and some of the useful information is corrupted [5,7]. In order to capture the diagnostic information in the vibration signal data, it is appropriate to compute as many features as possible.
In this paper, we extract ten features from the vibration signal data of the bearing system: standard deviation, skewness, kurtosis, maximum peak value, absolute mean value, root mean square value, crest factor, shape factor, impulse factor, and clearance factor [8]. These features are effective and practical for condition diagnosis because of their relative sensitivity to early faults and their robustness to various loads and speeds [9]. However, the choice of features is often arbitrary, which leads to a situation where some features contain the same information and others contain no useful information at all [8]. Therefore, a technique to determine the dominant features for multiple bearing diagnosis is important. One of the simplest ways to find the dominant features is to try combinations of features repeatedly until the desired result is achieved. However, this approach consumes much time and memory. Grey relational analysis (GRA) is one of the methods commonly used to find dominant features. It is used as a feature selection method to remove irrelevant and redundant factors that affect the results [10]. The essential property of GRA is that it can describe the relation among features [11]. GRA can be employed to explain the complicated interrelationships among data when the trends of their development are either homogeneous or heterogeneous [12]. In this study, we used GRA to determine the dominant features that contain useful information for multiple bearing condition diagnosis. The dominant features from GRA are used as the input of the condition diagnosis algorithm. Diagnosis algorithms have been proposed by many researchers, some of them using individual metaheuristic techniques such as genetic algorithms (GAs), fuzzy methods, and neural networks (NNs) [13][14][15][16][17].
However, individual metaheuristic techniques suffer from their own drawbacks, which can be overcome by forming a hybrid approach that combines the advantages of each technique [18] to improve the performance of condition diagnosis. Wulandhari et al. [19][20][21] improved condition diagnosis for specific types of fault in multiple bearings using a hybrid of genetic algorithms and back propagation neural networks (GAs-BPNNs). In this paper, we propose an improvement of GAs-BPNNs that applies adaptive methods to the GAs (AGAs-BPNNs) and uses GRA to identify and select the dominant features for the AGAs-BPNNs algorithm, in order to obtain a precise condition diagnosis for multiple bearing systems.

Bearing Vibration Signal Data
In this paper, we use the vibration signal data from the Case Western Reserve University Bearing Data Center [22]. The vibration signal data were captured from a two-bearing system, which consists of a Drive End bearing (DE) and a Fan End bearing (FE), with various combinations of bearing conditions. The specifications of the bearings are given in Table 1. To capture the vibration signal data, three accelerometers were attached to the bearings and the baseline (BA), as shown in Figure 2 and Table 2.
From Table 2, we can see that each condition has three streams of data, captured by the three accelerometers; thus, each feature is extracted from three accelerometers. Based on the available data, only seven condition classes of the bearings can be specified as the output of the diagnosis. In this paper, we expanded the condition classes from seven to sixteen by combining and mixing the available data. For the class FE-IRF and DE-IRF, for instance, the BA data were obtained from the average of the BA accelerometer data in the FE-IRF and DE-IRF conditions, the FE data were obtained from the FE accelerometer in the FE-IRF condition, and the DE data were obtained from the DE accelerometer in the DE-IRF condition. The expansion of the condition classes was done to obtain a more specific condition diagnosis for each bearing, so that action can be taken on each bearing specifically. The advantage of this expansion is that the conditions of the DE and FE bearings can be identified simultaneously, whereas in the seven-class case only the condition of one of them can be identified. The sixteen classes of bearing conditions are presented in Table 3.
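The mixing of streams described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (a dict with the keys "BA", "FE", and "DE"), the function name `combine_streams`, the sample values, and the choice of an element-wise average of equal-length BA streams are all hypothetical, since the text does not specify how the average is taken.

```python
def combine_streams(fe_fault_record, de_fault_record):
    """Build one mixed two-bearing class from two single-fault records.

    Each record is a dict holding equal-length lists under the keys
    "BA", "FE", and "DE" (hypothetical layout, one list per accelerometer).
    """
    # BA: element-wise average of the two baseline streams (assumption).
    ba = [(a + b) / 2.0
          for a, b in zip(fe_fault_record["BA"], de_fault_record["BA"])]
    return {
        "BA": ba,
        # FE stream comes from the record with the fan-end fault.
        "FE": list(fe_fault_record["FE"]),
        # DE stream comes from the record with the drive-end fault.
        "DE": list(de_fault_record["DE"]),
    }


# Made-up numbers, only to show the call.
fe_irf = {"BA": [0.2, 0.4], "FE": [1.5, -1.2], "DE": [0.1, 0.0]}
de_irf = {"BA": [0.4, 0.0], "FE": [0.0, 0.1], "DE": [2.0, -1.8]}
mixed = combine_streams(fe_irf, de_irf)
```

The resulting record has the same three-stream shape as the original seven classes, so the same feature extraction can be applied to it unchanged.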
The classes of multiple bearing conditions are characterized by the ten features extracted from the data. The value of each feature lies within an interval given by the lower and upper bounds obtained from the extracted data. The intervals of the feature values are presented in Table 4.
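For orientation, the ten features can be computed from one vibration record as in the sketch below. The formulas are the common textbook definitions of these time-domain statistics and are an assumption here, since the paper refers to [8] for them; the function name `time_domain_features` and the use of the population (divide by n) form of the moments are likewise illustrative choices.

```python
import math


def time_domain_features(signal):
    """Return ten time-domain features of one vibration record.

    Assumes a non-constant signal so that no denominator is zero.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    # Population standard deviation (divide by n): an assumption.
    std = math.sqrt(sum(c * c for c in centered) / n)
    abs_vals = [abs(x) for x in signal]
    peak = max(abs_vals)
    abs_mean = sum(abs_vals) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    # Square of the mean square-root amplitude, used by the clearance factor.
    sqrt_amp = (sum(math.sqrt(a) for a in abs_vals) / n) ** 2
    return {
        "standard_deviation": std,
        "skewness": sum(c ** 3 for c in centered) / (n * std ** 3),
        "kurtosis": sum(c ** 4 for c in centered) / (n * std ** 4),
        "maximum_peak": peak,
        "absolute_mean": abs_mean,
        "root_mean_square": rms,
        "crest_factor": peak / rms,
        "shape_factor": rms / abs_mean,
        "impulse_factor": peak / abs_mean,
        "clearance_factor": peak / sqrt_amp,
    }


square_wave = time_domain_features([1.0, -1.0, 1.0, -1.0])
spike = time_domain_features([0.0, 0.0, 0.0, 4.0])
```

Applying the function to each of the BA, DE, and FE streams of a sample gives the three values per feature that the diagnosis uses.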

The Proposed Algorithm
This paper proposes a hybrid method of GRA, AGAs, and BPNNs to diagnose the condition of multiple bearing systems. This hybridization applies GRA to determine the dominant features by analyzing the relation between each feature and its ideal values. The algorithm starts by initializing the features extracted from the vibration signal data; it then calculates the grey relational coefficient (GRC), followed by the grey relational grade (GRG), both of which belong to the GRA process. The results of the GRA are the dominant features, which are used as the input of the AGAs-BPNNs. The framework of the GRA-AGAs-BPNNs is shown in Figure 3.

Grey Relational Analysis (GRA).
A system for which the relevant information is completely known is called a white system, while a system for which the relevant information is completely unknown is called a black system. Any system between these limits is a grey system, which has poor and limited information [23]. In multiple bearing systems, the information about the condition of the bearings is not completely revealed by the features extracted from the vibration signal data. This uncertainty can be handled using GRA, which was proposed by Deng [24] in 1982. GRA uses a mathematical method to analyze the correlation between the reference series, which holds the ideal values of the features, and the alternative series [25]. It first normalizes the extracted features and then translates the performance of all alternatives into a comparability sequence with the ideal values, a step called grey relational generating [26]. This is followed by the calculation of the grey relational coefficient between all comparability sequences and the reference sequence. Finally, the grey relational grade between the reference sequence and every comparability sequence is calculated from the grey relational coefficients. The features with the highest grey relational grade have the dominant influence on the condition diagnosis. The procedure of GRA is shown in Figure 4.
The normalized features $x^*$ are obtained by the normalization given in [27]. Next, grey relational generating is conducted by determining the reference, or ideal, values of the extracted features. Let $x_j$ be the extracted features, with $j = 1, 2, \ldots, 10$, and let $c_k$ be the condition of the bearing system, with $k = 1, 2, \ldots, 16$. We used 240 samples and 16 classes of bearing conditions, where each class $c_k$ consists of 15 samples. The ideal value of the extracted features, say $\bar{x}_{kj}$, is the average of the $j$th feature over the samples in the $k$th condition, which can be written as follows:
\[ \bar{x}_{kj} = \frac{1}{15} \sum_{i \in c_k} x^*_{ij}. \tag{3} \]
For example, $\bar{x}_{11}$ is the ideal value for the first condition class (the class of FE and DE Normal) and the feature $x_1$. To obtain the next step, namely, the grey relational coefficient, we need to determine the comparability sequences of the alternative values and the ideal values. We define the ideal sequence
\[ x^0_{ij} = \bar{x}_{kj} \quad \text{for every sample } i \text{ in class } c_k, \tag{4} \]
where $k = 1, \ldots, 16$; then, the comparability of the alternative and ideal values can be calculated as
\[ \Delta_{ij} = \left| x^*_{ij} - x^0_{ij} \right|, \tag{5} \]
where $\Delta_{ij}$ is the comparability of the alternative and ideal values, $x^*_{ij}$ is the alternative value, that is, the normalized feature extracted from the vibration signal data, and $x^0_{ij}$ is the ideal value, which is defined by the condition class. Based on (4) and (5), the next step of the GRA procedure, namely, the calculation of the grey relational coefficient between the comparability sequences and the ideal sequences, is written as follows:
\[ \mathrm{GRC}_{ij} = \frac{\min_i \min_j \Delta_{ij} + \zeta \max_i \max_j \Delta_{ij}}{\Delta_{ij} + \zeta \max_i \max_j \Delta_{ij}} \tag{6} \]
for $i = 1, 2, \ldots, 240$ and $j = 1, 2, \ldots, 10$, where $\mathrm{GRC}_{ij}$ is the grey relational coefficient of the $i$th sample and the $j$th feature and $\zeta$ is the distinguishing coefficient, which is defined in the range $0 \le \zeta \le 1$. Then, the grey relational grade (GRG) is determined by averaging the GRC of each feature over the samples:
\[ \mathrm{GRG}^*_j = \frac{1}{n} \sum_{i=1}^{n} \mathrm{GRC}_{ij}, \tag{7} \]
where $n$ is the number of samples. Equation (7) is used to find the GRG for each of the accelerometers BA, DE, and FE. The final grey relational grade of the feature $x_j$, say $\mathrm{GRG}_j$, is the average of the $\mathrm{GRG}^*_j$ values from BA, DE, and FE. The results of the experiment using GRA are presented in Section 4.
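The GRA steps described above (class-wise ideal values, comparability values, grey relational coefficients, and grades) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the features are already normalized, that the ideal value is the class mean of the normalized feature, and that the distinguishing coefficient is strictly positive. The names `grey_relational_grades`, `average_over_sensors`, and `zeta` are introduced here for illustration.

```python
def grey_relational_grades(features, labels, zeta=0.5):
    """Return one grey relational grade per feature column.

    features: list of rows of normalized feature values (one row per sample)
    labels:   condition class of each row
    zeta:     distinguishing coefficient, assumed to satisfy 0 < zeta <= 1
    """
    n_samples, n_features = len(features), len(features[0])

    # Ideal value per (class, feature): mean of the feature within the class.
    sums, counts = {}, {}
    for row, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * n_features)
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    ideal = {k: [s / counts[k] for s in acc] for k, acc in sums.items()}

    # Comparability: absolute distance of each sample to its class ideal.
    delta = [[abs(v - ideal[lab][j]) for j, v in enumerate(row)]
             for row, lab in zip(features, labels)]
    d_min = min(min(r) for r in delta)
    d_max = max(max(r) for r in delta)
    if d_max == 0.0:
        # Every sample equals its ideal value; all coefficients are 1.
        return [1.0] * n_features

    # Grey relational coefficient per sample and feature.
    grc = [[(d_min + zeta * d_max) / (d + zeta * d_max) for d in r]
           for r in delta]
    # Grey relational grade: mean coefficient over the samples.
    return [sum(r[j] for r in grc) / n_samples for j in range(n_features)]


def average_over_sensors(grades_by_sensor):
    """Average the per-feature grades of several accelerometers."""
    n = len(grades_by_sensor)
    return [sum(column) / n for column in zip(*grades_by_sensor)]


# Toy data: feature 0 is constant within each class, feature 1 is not.
toy = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
toy_labels = [0, 0, 1, 1]
grades = grey_relational_grades(toy, toy_labels, zeta=0.5)
ranking = sorted(range(len(grades)), key=lambda j: grades[j], reverse=True)
```

In the toy data, the feature that stays close to its class ideal receives the higher grade, which is the sense in which a high GRG marks a dominant feature.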

The Proposed GRA-AGAs-BPNNs Algorithm.
The proposed GRA-AGAs-BPNNs algorithm is a hybrid algorithm that combines the advantages of GRA, GAs, and BPNNs into one algorithm for the condition diagnosis of multiple bearing systems. The dominant features from GRA are used in the AGAs-BPNNs algorithm to classify the condition effectively. Adaptive operator probability techniques in the GAs are proposed to obtain better initial weights for BPNN training.

Computational Intelligence and Neuroscience
The adaptive technique is applied to maintain the diversity of the population by varying the probabilities of crossover ($p_c$) and mutation ($p_m$), as in, for example, [28][29][30][31]. The algorithm for the proposed AGAs-BPNNs is as follows.
(1) Let $(x_i, t_i)$ be the $i$th input and target pair of the problem to be solved by the BPNN, with $i = 1, 2, \ldots, n_{in}$, where $n_{in}$ is the number of paired data.
(2) Let $n_{pop}$, $n_{chro}$, $p_{c0}$, $p_{m0}$, and $iter_{max}$ be the number of populations, number of chromosomes, initial crossover probability, initial mutation probability, and maximum number of iterations, respectively. Initialize $p_{c0}$, $p_{m0}$, $r_c$, and $r_m$, where $r_c$ are random vectors of numbers generated in the range [0, 1] with size $1 \times n_{chro}/2$ and $r_m$ are random vectors of numbers generated in the range [0, 1] with size $1 \times n_{chro}$. Set $g = 0$.
In the fitness evaluation, $E(i, g)$ is the mean square error (MSE) of the $i$th chromosome in the population $g$. It is calculated based on the selected BPNN architecture as follows:
\[ E(i, g) = \frac{1}{n_{in}} \sum_{j=1}^{n_{in}} \left( t_{ij} - o_{ij} \right)^2, \]
where $t_{ij}$ is the target of the $j$th input in the $i$th chromosome and $o_{ij}$ is the output of the $j$th input in the $i$th chromosome of the population $g$ based on the selected BPNN architecture.
(7) Generate the mating pool by selecting the best chromosomes using the roulette wheel selection method.
(9) Calculate the crossover probability of the $k$th pair of parents in the population $g$ following [32]. Here, $f(p_1)$ and $f(p_2)$ are the fitness values of parent 1 and parent 2, respectively; $f_{max}(g)$ is the maximum fitness value of the population $g$; and $\bar{f}(g)$ is the average fitness value of the population $g$.
(10) Calculate the mutation probability of the $i$th chromosome in the population $g$ following [32], where $f(i, g)$ is the fitness value of the $i$th chromosome in the population $g$.
(11) Generate the next population by applying the crossover and mutation mechanisms with the probabilities from steps (9) and (10).
(12) If the population converges or $g$ is equal to $iter_{max}$, then the best chromosome is obtained and used as the initial weights for BPNN learning. Otherwise, go to step (6).
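To make the adaptive steps more concrete, the sketch below shows roulette wheel selection together with one widely used fitness-based rule for adapting the crossover and mutation probabilities. It is a sketch under stated assumptions and not necessarily the rule of [32]: the probability is scaled down linearly for individuals whose fitness is above the population average, and left at its upper value otherwise. The function names, the parameters `pc_high` and `pm_high`, and the use of the initial probabilities 0.6 and 0.01 as upper values are assumptions made for illustration.

```python
import random


def adaptive_crossover_probability(f_p1, f_p2, f_max, f_avg, pc_high=0.6):
    """Crossover probability for one pair of parents (assumed rule)."""
    f_best = max(f_p1, f_p2)
    if f_best < f_avg or f_max == f_avg:
        # Below-average pairs (or a flat population) keep the upper value.
        return pc_high
    # Fitter pairs are disrupted less often.
    return pc_high * (f_max - f_best) / (f_max - f_avg)


def adaptive_mutation_probability(f_i, f_max, f_avg, pm_high=0.01):
    """Mutation probability for one chromosome (assumed rule)."""
    if f_i < f_avg or f_max == f_avg:
        return pm_high
    return pm_high * (f_max - f_i) / (f_max - f_avg)


def roulette_select(population, fitness, rng):
    """Pick one chromosome with probability proportional to its fitness.

    Assumes non-negative fitness values with a positive sum.
    """
    pick = rng.uniform(0.0, sum(fitness))
    running = 0.0
    for chromosome, fit in zip(population, fitness):
        running += fit
        if running > pick:
            return chromosome
    return population[-1]  # guard against floating-point round-off


rng = random.Random(0)
population = ["a", "b", "c"]
fitness = [0.2, 1.0, 0.3]
f_max, f_avg = max(fitness), sum(fitness) / len(fitness)
parent_1 = roulette_select(population, fitness, rng)
parent_2 = roulette_select(population, fitness, rng)
p_c = adaptive_crossover_probability(fitness[0], fitness[1], f_max, f_avg)
```

Under this rule the best chromosome of a population receives a probability of zero for both operators, so it is passed on unchanged, while below-average chromosomes are recombined and mutated at the full rate, which is one way of keeping the population diverse.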

Experimental Evaluation and Discussion
In this section, we describe and discuss the experimental results of GRA-AGAs-BPNNs in classifying the condition of multiple bearing systems. For the GRA process, we tried several values of the distinguishing coefficient $\zeta$, namely, 0.3, 0.5, 0.6, and 1. Based on (7), the GRG value for each $\zeta$ was calculated. The sequence of the extracted features based on the GRG value is given in Table 5. Table 5 shows that when the distinguishing coefficient is closer to 0, the GRG of the features has a wider range than when it is closer to 1. For $\zeta = 0.3$, the GRG range is around 0.545, while for $\zeta = 1$ it is around 0.320, as shown in Figure 5. From Table 5, we can see, however, that although the GRG values vary with $\zeta$, the sequence of the extracted features is the same. The root mean square value is the feature with the highest GRG value.
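The wider range for small values of the distinguishing coefficient can be made plausible from the coefficient formula itself. The short derivation below is a sketch that uses the standard form of the grey relational coefficient with distinguishing coefficient $\zeta$; the symbols $\Delta_{\min}$ and $\Delta_{\max}$ for the smallest and largest comparability values are shorthand introduced here. It bounds the spread of the individual coefficients, not the reported GRG ranges, which are averages of coefficients and are therefore narrower.

\[
\mathrm{GRC}(\Delta) = \frac{\Delta_{\min} + \zeta \Delta_{\max}}{\Delta + \zeta \Delta_{\max}}, \qquad \Delta_{\min} \le \Delta \le \Delta_{\max},
\]
\[
\mathrm{GRC}(\Delta_{\min}) = 1, \qquad \mathrm{GRC}(\Delta_{\max}) = \frac{\Delta_{\min} + \zeta \Delta_{\max}}{(1 + \zeta)\,\Delta_{\max}},
\]
\[
\mathrm{GRC}(\Delta_{\min}) - \mathrm{GRC}(\Delta_{\max}) = \frac{\Delta_{\max} - \Delta_{\min}}{(1 + \zeta)\,\Delta_{\max}}.
\]

The spread decreases as $\zeta$ increases. In the special case $\Delta_{\min} = 0$ it equals $1/(1 + \zeta)$, that is, about 0.77 for $\zeta = 0.3$ and 0.5 for $\zeta = 1$, while the ordering of the coefficients, and hence of the features, is unaffected because the coefficient is a decreasing function of $\Delta$ for every $\zeta > 0$.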
In this paper, we conducted experiments with AGAs-BPNNs using the one, three, five, and seven most dominant features according to the GRG values in Table 5. For the AGAs, we set the parameters as follows: 100 chromosomes in each population, an initial crossover probability of 0.6, and an initial mutation probability of 0.01. For the BPNNs, we use three hidden layers and write the topology as $n_i$-$h_1$-$h_2$-$h_3$-$n_o$, with $n_i$ input neurons, $h_1$ neurons in the first hidden layer, $h_2$ neurons in the second hidden layer, $h_3$ neurons in the third hidden layer, and $n_o$ output neurons. As stated in Section 2, the vibration signal data are recorded from three accelerometers; thus, the features are extracted from three accelerometers, which makes the number of input neurons of the BPNNs equal to three times the number of selected features.
We repeated the GRA-AGAs-BPNNs experiments ten times for each combination of one, three, five, and seven dominant features. We also conducted experiments using the features with the lowest GRG and a combination of the features with the highest and the lowest GRG in AGAs-BPNNs to diagnose the condition of multiple bearing systems. These experiments were carried out to see the influence of the selected feature combination on the performance of the condition diagnosis algorithm. The performance of the algorithm is characterized by its accuracy in classifying the condition of the bearings. The classification accuracy is computed by
\[ \text{classification accuracy} = \frac{\text{total true predicted class}}{\text{total output}} \times 100\%. \]
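The accuracy measure and the input-layer size can be written as two small helper functions. This is a minimal sketch; the function names and the example labels are hypothetical and only illustrate the ratio of correctly predicted classes to the total number of outputs.

```python
def classification_accuracy(predicted, actual):
    """Percentage of samples whose predicted class equals the true class."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


def input_neurons(n_selected_features, n_accelerometers=3):
    """Each selected feature is extracted once per accelerometer."""
    return n_accelerometers * n_selected_features


# Made-up class labels for four samples; three of them are predicted correctly.
accuracy = classification_accuracy([1, 2, 3, 4], [1, 2, 16, 4])
sizes = [input_neurons(k) for k in (1, 3, 5, 7, 10)]
```

With five, seven, and ten features the helper gives 15, 21, and 30 input neurons.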
The experiments were executed on a computer with an Intel Core 2 Quad Q8200 processor, 2.33 GHz and 1.96 GHz, and 3.46 GB of RAM. The results of GRA-AGAs-BPNNs are given in Table 6. Table 6 compares the classification accuracy of GRA-AGAs-BPNNs with that of the AGAs-BPNNs algorithm without GRA (which involves all ten features). We can see that GRA-AGAs-BPNNs using five and seven dominant features, with topologies 15-15-15-15-16 and 21-21-21-21-16, respectively, has higher accuracy than AGAs-BPNNs with topology 30-30-30-30-16, without GRA and using ten features. From the experimental results, we find that root mean square value, standard deviation, absolute mean value, skewness, and maximum peak value can give a good diagnosis of multiple bearing conditions. We notice that, by applying GRA in the AGAs-BPNNs algorithm, we can achieve higher classification accuracy in shorter time. GRA is capable of determining which features contribute dominantly to the condition diagnosis of the multiple bearing system.

Conclusion
In this paper, we introduced a new hybrid technique of grey relational analysis (GRA), adaptive operator probabilities in genetic algorithms (AGAs), and back propagation neural networks (BPNNs), called GRA-AGAs-BPNNs, for condition diagnosis of multiple bearing systems. We used GRA to determine the dominant features, which contain useful information about the condition of multiple bearings. GRA determines that root mean square value, standard deviation, absolute mean value, skewness, and maximum peak value can give a good diagnosis of the multiple bearing condition. The features from GRA are used in AGAs-BPNNs to diagnose the condition effectively. We exploited the strong optimization capability of genetic algorithms, further improved here by varying the mutation and crossover probabilities, to search for the best initial weights for the BPNNs, and the strong classification capability of BPNNs to diagnose the condition of a multiple bearing system. The AGAs strengthen the BPNNs so that they achieve higher classification accuracy in shorter CPU time than the standard BPNN or the hybrid GAs-BPNNs.
Experimental results showed that GRA is capable of improving the classification accuracy of AGAs-BPNNs in shorter time. The accuracy reaches 100%, 94.6%, and 92.5% for training, validation, and testing, respectively, with 7 dominant features and 99.4%, 97.1%, and 96.7% for training, validation, and testing, respectively, with 5 dominant features. The accuracy is increased and the processing time is reduced compared with AGAs-BPNNs without GRA, as shown in Table 7. This result benefits condition diagnosis in real cases, which require a precise and quick process to diagnose the condition of multiple bearings in order to avoid total breakdown.