Fault diagnosis of helical gearbox using acoustic signal and wavelets

Efficient power transmission is essential in machines, and gears are an appropriate means of achieving it. Faults in gears result in loss of energy and money. Monitoring and fault diagnosis are carried out by analysing the acoustic and vibration signals, which are generally regarded as unwanted by-products. This study proposes the use of machine learning algorithms for condition monitoring of a helical gearbox using the sound signals produced by the gearbox. Artificial faults were created and the resulting signals were captured by a microphone. An extensive study was carried out using different wavelet transforms for feature extraction from the acoustic signals, followed by wavelet selection and feature selection using the J48 decision tree; feature classification was performed using the K star algorithm. A classification accuracy of 100% was obtained in the study.


Introduction
Machines are the source of production of all the vital components that are required. Gears supply and transmit the power for these machines, which makes them an essential part. Gears have cut teeth, or cogs, which mesh with each other to transmit torque. The gear teeth carry the entire load, and every tooth contributes to its transmission. It is therefore essential to ensure proper maintenance of the gear teeth through a real-time condition monitoring system, as the failure of even a single tooth can cause severe damage, reducing transmission efficiency or disrupting transmission altogether. Gear failures can be caused by several factors such as incorrect design or installation, acid corrosion and poor lubrication. The effects of a faulty gearbox can be catastrophic, since a broken tooth may disrupt the rotary motion of the gear and damage the machinery. It is therefore essential to reduce the risk of gear failures. Because the gears are enclosed inside the gearbox, their condition can be monitored by examining the sound or vibration signals coming out of the box. The aim of this study is to reduce the risk of gear failures by real-time monitoring of acoustic signals.
Acoustic signals were chosen for the experiment because the microphones used to record them are less expensive than accelerometers. Artificial fault conditions were created in the gears to obtain acoustic signals for the different fault conditions. This was done so that faults could be detected at an early stage, avoiding mishaps and resulting in better efficiency and transmission.
The processes involved were feature extraction, wavelet selection, feature selection and feature classification. Statistical features [1], histogram features [2] and wavelet features are the features usually used. The FFT [5] (Fast Fourier Transform) is a very useful tool because it significantly reduces the computation time of the Fourier transform. However, it was not suitable for this study because it is poorly suited to non-stationary signals. Although the STFT [6] [18] (Short Time Fourier Transform) proved to be better than the FFT for non-stationary signals, the wavelet transform was preferred because it provides useful information for fault identification and also provides multiresolution analysis, as performed by M. Lang et al. The discrete wavelet transform [3] [4] was preferred over the continuous wavelet transform because it processes data faster while still providing accurate data. With further analysis, Daubechies 5 was found to produce the best feature set, which was used for selection and classification.
The most popular techniques for feature selection are artificial neural networks [7] [8], fuzzy logic [9], decision trees [10] [11] [19], principal component analysis and genetic algorithms. The J48 decision tree [10] [11] [19] was used for wavelet selection and feature selection because it is compact, simple and computationally fast.
Commonly used classifiers for feature classification are the support vector machine (SVM) [12] [20], proximal support vector machine (PSVM), artificial neural network (ANN) [13], Naïve Bayes (NB), Bayes Net, fuzzy classifiers and decision tree (DT) [14] [15]. SVM-based models suffer from higher training time and computational complexity as the number of patterns increases. Although the artificial neural network classifier gives good results, training it is complex and time consuming. For a real-time system, the classifier should take the least possible time while giving high classification accuracy. J48 decision tree was thus used.
Artificial faults were created in the gears by chipping off 20%, 40%, 60% and 80% of a gear tooth, by removing one entire gear tooth (100%), and by removing one entire tooth together with half of the next tooth (150%), all at the full load capacity of the machine. Wavelet features were extracted using the 'db5' wavelet, and the J48 decision tree was used for feature selection prior to classification. Simulations were performed and results were obtained.

Experimental studies
The experimental setup, shown in Fig. 1, consists of a 5 HP two-stage helical gearbox [17] [21] driven by a 5.5 HP three-phase induction motor at 1200 rpm. A DC generator is driven by the mechanical output of the gearbox. A load must be provided to the DC generator for it to function, so a resistor bank is connected to the generator and the power is dissipated in it. A resistor bank is preferred over a dynamometer because torque fluctuations may cause additional torsional vibrations, resulting in inaccurate acoustic signals. The generator and resistor bank arrangement thus avoids additional vibrations in the test rig. The gearbox, generator and motor are mounted on stiff I-beams which are anchored to a large concrete block. A B&K 4332 accelerometer is used to measure the vertical vibration signals generated on the bearing housing of the 16-tooth pinion. The gear mesh frequency is calculated as 320 Hz, with harmonics at its multiples. Data sets were collected with the helical gear working under normal conditions and with artificial faults created by chipping off 20%, 40%, 60%, 80% and 100% of a tooth, and by removing one entire gear tooth and half of the next gear tooth (150%). The signals were limited to 3 kHz using a low-pass filter and were sampled at 8 kHz. A B&K Type 2626 charge amplifier was used to condition the microphone output.
A pinion is connected to the DC generator to generate 2 kW of power, which is dissipated in the resistor bank. Consequently, the actual load on the gearbox is only 2.6 HP, which is 52% of its rated 5 HP. Load utilisation varies from 50% to 100% in the industrial environment. Tyre couplings are installed between the electrical machines and the gearbox so that backlash is restricted to the gears. The experimental setup is shown in Fig. 1.
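The 320 Hz gear mesh frequency quoted above can be checked with a short calculation. The sketch below assumes that the 16-tooth pinion sits on the shaft turning at the 1200 rpm motor speed; the function names are illustrative and not taken from the study. It also checks that the 3 kHz low-pass cut-off lies below the Nyquist frequency for 8 kHz sampling.

```python
def gear_mesh_frequency(shaft_rpm, teeth):
    """Gear mesh frequency in Hz: shaft revolutions per second times tooth count."""
    return shaft_rpm / 60.0 * teeth


def below_nyquist(cutoff_hz, sample_rate_hz):
    """True if the filter cut-off lies below half the sampling rate."""
    return cutoff_hz < sample_rate_hz / 2.0


# Assumption: the 16-tooth pinion rotates at the 1200 rpm motor speed.
gmf = gear_mesh_frequency(1200, 16)

# Harmonics of the mesh frequency that survive the 3 kHz low-pass filter.
harmonics = [k * gmf for k in range(1, int(3000 // gmf) + 1)]

print(gmf)                        # mesh frequency in Hz
print(harmonics)                  # multiples of the mesh frequency up to 3 kHz
print(below_nyquist(3000, 8000))  # cut-off versus Nyquist frequency
```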

Feature extraction
The acoustic signal recorded from the gearbox by the microphone was used to perform the fault diagnosis. The acoustic time-domain signal is converted into time-frequency-domain data using the Discrete Wavelet Transform (DWT) through wavelet decomposition. Each decomposition step splits the signal into a trend and details. The trend obtained is then decomposed again into a next-level trend and details. The same process is repeated over several levels of trends to give multiple levels of details. For this study, a signal length of 2048 ($2^{11}$) was chosen, so the signals can be decomposed into 11 levels. At each level, the detail coefficients were used to compute the energy content, where n is the number of detail coefficients at that level.
The features were then defined as the energy content at each level. The feature vector is defined as $V = \{V_1, V_2, V_3, \ldots, V_m\}$, where m is the number such that the length of the signal is $2^m$, and $V_1, V_2, V_3, \ldots$ are the energy contents at the respective levels. The following discrete wavelet transformations were used in this study.
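The level-by-level decomposition and energy features can be sketched as follows. This is a minimal illustration rather than the procedure used in the study. It substitutes the Haar wavelet for the wavelets actually tested, because Haar needs no filter table. It also assumes that the energy at a level is the plain sum of the squared detail coefficients. The test signal is synthetic and the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One Haar DWT level: return (trend, detail), each half the input length."""
    s = 1.0 / math.sqrt(2.0)
    trend = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return trend, detail


def detail_energies(signal):
    """Energy of the detail coefficients at every level: [V1, V2, ..., Vm]."""
    length = len(signal)
    if length < 2 or length & (length - 1):
        raise ValueError("signal length must be a power of two")
    energies = []
    trend = list(signal)
    while len(trend) > 1:
        # Decompose the current trend into the next-level trend and details.
        trend, detail = haar_step(trend)
        # Assumed energy definition: sum of squared detail coefficients.
        energies.append(sum(d * d for d in detail))
    return energies


# Synthetic stand-in for one 2048-sample record sampled at 8 kHz (not measured data).
signal = [math.sin(2 * math.pi * 320 * t / 8000) for t in range(2048)]
features = detail_energies(signal)
print(len(features))  # number of levels, i.e. m for a signal of length 2**m
```

With a library such as PyWavelets the same loop could be run with 'db5' in place of Haar, although the number of usable levels then depends on the filter length and the boundary handling.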

Wavelet selection
The time-domain signals were processed using 54 different discrete wavelet transforms from seven wavelet families. The features extracted with each wavelet transform were passed to the J48 algorithm to find the one giving the maximum classification accuracy. The features extracted using Daubechies 5 gave the maximum classification accuracy among all the DWTs mentioned above. Daubechies 5 was thus selected for subsequent operations. The Daubechies wavelets, represented as 'dbn', are a family of orthogonal wavelets characterised by the highest number of vanishing moments (n) for a given support width of 2n-1. Of all the possible solutions of the moment and orthogonality conditions, the one whose scaling filter has extremal phase is selected.

Feature selection
The J48 classifier is a simple decision tree used for classification. It creates a binary tree. The decision tree approach is most useful in classification problems. With this technique, a tree is constructed to model the classification process. Once the tree is built, it is applied to each tuple in the database and yields a classification for that tuple.
While building a tree, J48 ignores missing values; that is, the value for an item can be predicted from what is known about the attribute values of the other records. The basic idea is to divide the data into ranges based on the attribute values for that item found in the training sample. J48 allows classification via either decision trees or rules generated from them.
J48 uses entropy-based information gain as the selection criterion. In information theory, entropy is a measure of the uncertainty in a random variable. The information gain is the expected reduction in entropy caused by partitioning the examples according to a given feature. It measures the capability of a given attribute to separate the training examples according to the target function. The features selected are V1, V2, V3, V4, V5 and V6, as this combination gave the maximum classification accuracy.
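The entropy and information gain described above can be sketched for a single real-valued feature split at a threshold. The feature values, labels and threshold below are made up for illustration, and the function names are hypothetical. C4.5, on which J48 is based, additionally normalises the gain by the split information (gain ratio), which is omitted here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Expected reduction in entropy from splitting on values <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing gains nothing
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - remainder


# Illustrative data only: one energy feature and two conditions.
values = [0.1, 0.2, 0.8, 0.9]
labels = ["GOOD", "GOOD", "FAULT", "FAULT"]

print(information_gain(values, labels, 0.5))   # split that separates the classes
print(information_gain(values, labels, 0.15))  # a less informative split
```

A feature whose best split gives a higher gain is placed nearer the root of the tree, which is why the position of a feature in the tree indicates its relative importance.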

Feature classification
The K* algorithm [16] is an instance-based learner which uses entropy as a distance measure. Its benefit is that it provides a consistent approach to the handling of real-valued attributes, symbolic attributes and missing values. K* is a simple, instance-based classifier, similar to K-Nearest Neighbour (K-NN), in which a new data instance x is assigned to the class that occurs most frequently amongst the k nearest data points $y_j$, where j = 1, 2, …, k. In K*, the entropic distance is used to retrieve the most similar instances from the data set:
$K^*(y_i, x) = -\ln P^*(y_i, x)$
where $P^*$ is the probability of all transformational paths from instance x to y. It can be useful to understand this as the probability that x will arrive at y via a random walk in the feature space. The percent blending ratio parameter, which is analogous to the K-NN "sphere of influence", was optimised prior to assessment with other machine learning methods.
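The entropic distance and the resulting class decision can be illustrated with a heavily simplified sketch. It assumes real-valued attributes only and models the transformation probability of each attribute with an exponential kernel of one fixed width. The parameter `scale` is a hypothetical stand-in for the per-instance scale that K* derives from the blend parameter. The function names and training data are made up, and this is not the Weka implementation.

```python
import math
from collections import defaultdict


def transform_probability(x, y, scale=1.0):
    """Simplified P*(y, x): product of exponential kernels, one per attribute."""
    p = 1.0
    for a, b in zip(x, y):
        p *= math.exp(-abs(a - b) / scale) / (2.0 * scale)
    return p


def entropic_distance(x, y, scale=1.0):
    """K*(y, x) = -ln P*(y, x)."""
    return -math.log(transform_probability(x, y, scale))


def classify(train, query, scale=1.0):
    """Sum P* over the training instances of each class and pick the largest."""
    class_prob = defaultdict(float)
    for features, label in train:
        class_prob[label] += transform_probability(query, features, scale)
    total = sum(class_prob.values())
    probs = {label: p / total for label, p in class_prob.items()}
    return max(probs, key=probs.get), probs


# Illustrative two-feature training set (not data from the study).
train = [((0.0, 0.0), "GOOD"), ((0.1, 0.0), "GOOD"), ((1.0, 1.0), "FAULT")]
label, probs = classify(train, (0.05, 0.0))
print(label, probs)
```

Unlike K-NN, every training instance contributes to the decision, weighted by its transformation probability. A smaller `scale` concentrates the weight on the closest instances, which loosely mirrors the role of the blend parameter.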

Results and discussions
The acoustic signals from the gearbox under good and faulty conditions (20%, 40%, 60%, 80%, 100% and 150%) were analysed, followed by feature extraction, selection and classification using the discrete wavelet transform and the K star algorithm. Using the Daubechies 5 wavelet transform, 11 features (V1 to V11) were extracted from the acoustic signals. The features which contribute to the classification were selected using the J48 decision tree. The maximum classification accuracy was obtained when the features V1, V2, V3, V4, V5 and V6 were selected for training and testing. The remaining features were ignored as they reduced the classifier's accuracy.

Effect of number of features
(i) The decision tree gave a preview of the relative importance of the features extracted using 'db5' wavelets. The feature at the top of the tree, i.e. V2, makes the maximum contribution, and the contribution reduces down the tree.
(ii) The information gain, i.e. the reduction in entropy, gave a measure of the discriminating capability of each feature in the given data set, and the useful features were selected on this basis.
(iii) The classification accuracy with the J48 algorithm is lower with fewer features and increases as the number of features increases, reaching a maximum at 6 features, after which it remains constant at 97.619%.

Feature classification using K star algorithm
(i) Features V1, V2, V3, V4, V5 and V6, which contributed to the classification, were selected from the J48 decision tree for training and testing.
(ii) As the number of instances from the data set of 60 was increased, the classification accuracy was initially 100%, remained constant over a range of values of the global blend, and then diminished as the value of the global blend decreased. The reduction in accuracy may be due to an increase in the complexity of the problem or to unnecessary confusion when the number of features increases.
(iii) The summary of the stratified cross-validation is given in Table 2. The confusion matrix gives an insight into the details of misclassification. In the confusion matrix, the first row corresponds to the total number of data points for the "GOOD" condition of the gearbox, and the first column of the first row corresponds to those correctly classified as "GOOD".
(iv) The second row of the confusion matrix represents the total of 60 data points for the 20% fault condition, and its first column represents the data points misclassified as "GOOD". The cell in the second row and second column represents the number of data points in the 20% fault condition that were correctly classified as the 20% fault condition.
(v) As shown in the confusion matrix, all the data points were classified correctly, giving 100% accuracy. The True Positive (TP) rate is the fraction of instances of a class that were correctly classified as that class, and it should be 1 in the ideal case. For the "GOOD" class, all the instances were classified correctly and the TP rate is 1. The classes with artificial faults of 20%, 40%, 100% and 150% also show a TP rate of 1, which implies that all their instances were classified correctly. The False Positive (FP) rate is the fraction of instances not belonging to a class that were incorrectly classified as belonging to it, and it should be 0 for perfect classification. From Table 2, it can be seen that the FP rate is zero. Precision is the fraction of retrieved instances that are relevant, while recall is the fraction of relevant instances that are retrieved; both should ideally be 1. In the present study both conform to the ideal condition.
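The per-class measures discussed above can be computed directly from a confusion matrix whose rows are actual classes and whose columns are predicted classes. The sketch below uses a made-up three-class matrix with a few deliberate misclassifications so that the measures differ from their ideal values. It is not the matrix from Table 2, and the function name is hypothetical.

```python
def per_class_metrics(matrix, class_names):
    """TP rate (recall), FP rate and precision for each class.

    matrix[i][j] is the number of instances of actual class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for i, name in enumerate(class_names):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                 # class i predicted as something else
        fp = sum(row[i] for row in matrix) - tp  # other classes predicted as class i
        tn = total - tp - fn - fp
        metrics[name] = {
            "tp_rate": tp / (tp + fn) if tp + fn else 0.0,
            "fp_rate": fp / (fp + tn) if fp + tn else 0.0,
            "precision": tp / (tp + fp) if tp + fp else 0.0,
        }
    return metrics


# Illustrative counts only (10 instances per class).
names = ["GOOD", "20%", "40%"]
matrix = [
    [9, 1, 0],   # one GOOD instance misclassified as 20%
    [0, 10, 0],
    [0, 2, 8],   # two 40% instances misclassified as 20%
]

for name, m in per_class_metrics(matrix, names).items():
    print(name, m)
```

For a purely diagonal matrix, as reported in the study, every class has a TP rate and precision of 1 and an FP rate of 0.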

Conclusion
The gearbox fault conditions were classified by extracting discrete wavelet features from the acoustic signals. The K star algorithm, which was used for classification, gave 100% accuracy. The proposed system, which uses the J48 decision tree for selection together with the K star algorithm for classification, is extremely effective and can be used for real-time condition monitoring of a gearbox with minimal expenditure on the microphone and other components. The system is practical to implement as it is cost effective and reduces the risk of gearbox failure. It also enables prediction of faults in the gearbox. Further developments can be made to increase the speed of computation, as the K star algorithm is time consuming.