Utilising the Wavelet Transform in Condition-Based Maintenance: A Review with Applications

Condition monitoring of machinery can be defined as the continuous or periodic measurement and interpretation of data in order to indicate the condition of a machine and determine the need for maintenance. Condition monitoring is thus primarily concerned with the diagnosis of faults and failures and aims at fault detection that is both accurate and as early as possible. It is oriented towards an unscheduled preventive maintenance plan with continuous monitoring of the machinery, as opposed to scheduled periodic maintenance. The possibility of failure cannot, of course, be eliminated, but confident early diagnosis of incipient failures is extremely useful for avoiding machinery breakdown and thus ensuring a more cost-effective overall operation with reduced equipment down-time. Industrial safety is also enhanced, as catastrophic events are avoided when a maintenance-for-cause plan is followed.


Introduction
When faults occur in machines, phenomena such as excessive vibration and/or noise, increased temperatures and an increased wear rate are observed. The concept is to monitor these dynamic phenomena, continuously or periodically, using one or more sensors. One of the earliest approaches was sound emission monitoring. An expert human ear played the role of the sensor in the early applications; a sophisticated microphone can play the same role today. The most classic approach, still widely used today, is vibration monitoring with a few or several accelerometers placed on the machine. The principle is that when damage occurs, the signature of the vibration response changes in the frequency domain, giving a qualitative indication that a fault exists. The Acoustic Emission (AE) technique, known for its sensitivity in the high-frequency domain of micro-damage evolution, has found important applications in gearboxes and bearings, as Section 4 presents. Other monitoring techniques include oil condition monitoring (oil debris, oil conductivity, humidity, etc.), monitoring of current and voltage transients in electric motors, and temperature measurements/thermography. More than 80% of the applications presented in Section 4 involve vibration monitoring, with AE finding more and more applications over the last 15 years and current/voltage measurements always being an option in electric machines. Monitoring generally results in a large number of complex signals, with valuable diagnostic information hidden under noise or other irrelevant sources. Over the years, and in parallel with several breakthroughs in the signal processing field, engineers and researchers realized

Wavelet transforms

Continuous Wavelet Transform (CWT)
A wavelet is a wave-like oscillation that, instead of oscillating forever like a harmonic wave, drops rather quickly to zero. The continuous wavelet transform breaks up a continuous function f(t) into shifted and scaled versions of the mother wavelet ψ. It can be defined as the convolution of the input data sequence with a set of functions generated by the mother wavelet:

$$CWT(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} f(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt$$

with the inverse transform being expressed as:

$$f(t) = \frac{1}{C_{\psi}} \int_{-\infty}^{+\infty} \int_{0}^{+\infty} CWT(a,b)\, \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right) \frac{da\, db}{a^{2}}$$

where a represents scale (or pseudo-frequency), b represents the time shift of the mother wavelet ψ, ψ* is the complex conjugate of ψ and C_ψ is a constant that depends only on ψ. The WT's superior time-localization properties result from the finite support of the mother wavelet: as b increases, the analysis wavelet scans the length of the input signal, and a increases or decreases in response to changes in the signal's local time and frequency content. Finite support implies that the effect of each term in the wavelet representation is purely localized. This sets the WT apart from the Fourier Transform, where the effect of adding higher-frequency sine waves is spread throughout the time axis. The CWT can be applied with higher resolution to extract information with higher redundancy; that is, a very narrow range of scales can be used to pull details from a particular frequency band.
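As an illustration of the definition above, the following minimal sketch evaluates a single CWT coefficient by direct numerical integration. It assumes a uniformly sampled signal and a real-valued Mexican hat mother wavelet (so the complex conjugate has no effect); the function names `mexican_hat` and `cwt_coefficient` are hypothetical and not taken from any library.

```python
import math

def mexican_hat(t):
    """Unit-energy Mexican hat wavelet (an assumed choice of mother wavelet)."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-t * t / 2.0)

def cwt_coefficient(signal, dt, a, b, wavelet=mexican_hat):
    """Approximate (1/sqrt(a)) * integral f(t) psi((t - b) / a) dt by a Riemann sum.

    signal : samples f(n * dt), n = 0, 1, ...
    a      : scale (a > 0)
    b      : time shift, in the same units as dt
    """
    total = 0.0
    for n, value in enumerate(signal):
        total += value * wavelet((n * dt - b) / a)
    return total * dt / math.sqrt(a)
```

Scanning b over the record length and a over a set of scales yields the full scalogram. The coefficient is largest in magnitude where the scaled and shifted wavelet best matches the local shape of the signal.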

Discrete Wavelet Transform (DWT)
Quite remarkably, it turns out that instead of using all possible scales, only dyadic scales need to be used, without any loss of information. Mathematically, this procedure is described by the discrete wavelet transform (DWT), which is expressed as:

$$DW(j,k) = \frac{1}{\sqrt{2^{j}}} \int_{-\infty}^{+\infty} f(t)\, \psi^{*}\!\left(\frac{t - k\,2^{j}}{2^{j}}\right) dt$$

where DW(j, k) are the wavelet transform coefficients given by a two-dimensional matrix, j is the scale that represents the frequency domain aspects of the signal and k represents the time shift of the mother wavelet. f(t) is the signal that is analyzed and ψ the mother wavelet used for the analysis (ψ* is the complex conjugate of ψ). The inverse discrete wavelet transform can be expressed as:

$$f(t) = c \sum_{j} \sum_{k} DW(j,k)\, \psi_{j,k}(t), \qquad \psi_{j,k}(t) = \frac{1}{\sqrt{2^{j}}}\, \psi\!\left(\frac{t - k\,2^{j}}{2^{j}}\right)$$

where c is a constant depending only on ψ. In practice, the DWT is realized by Mallat's algorithm, also known as the sub-band coding algorithm (Mallat, 1989). The DWT of a signal x is calculated by passing it through a series of filters. First the samples are passed through a low-pass filter with impulse response h, resulting in a convolution of the two:

$$y[n] = (x * h)[n] = \sum_{k=-\infty}^{+\infty} x[k]\, h[n-k]$$

The signal is also decomposed simultaneously using a high-pass filter g. The output of the high-pass filter gives the detail coefficients and the output of the low-pass filter gives the approximation coefficients. The two filters h and g are not chosen arbitrarily; they are related to each other and together are known as a quadrature mirror filter pair. Since half the frequencies of the signal have now been removed, half the samples can be discarded according to Nyquist's rule. The filter outputs are therefore sub-sampled by 2. This decomposition has halved the time resolution, since only half of each filter output characterizes the signal. However, each output has half the frequency band of the input, so the frequency resolution has been doubled. The approximation is then itself split into a second-level approximation and detail, and the process can be repeated as many times as the user desires, resulting in N levels of decomposition.
The number of decomposition levels N is related to the sampling frequency f_s of the signal being analyzed. In order to get an approximation signal containing frequencies below a frequency f, the number of decomposition levels that has to be considered is given by (Antonino-Daviu et al., 2007):

$$N = \mathrm{int}\!\left(\frac{\log(f_s / f)}{\log 2}\right)$$
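The filter-and-subsample procedure described above can be sketched in a few lines. This is a minimal illustration only: it assumes the Haar filter pair (the simplest quadrature mirror pair), a signal whose length is divisible by 2^N, and the level-count rule N = int(log2(f_s / f)); all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def dwt_step(x):
    """One Mallat step with Haar filters: filter, then keep every second sample."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]  # low-pass h
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]  # high-pass g
    return approx, detail

def idwt_step(approx, detail):
    """Invert one Haar step by recombining each approximation/detail pair."""
    x = []
    for a, d in zip(approx, detail):
        x += [(a + d) / SQRT2, (a - d) / SQRT2]
    return x

def dwt(x, levels):
    """Split the approximation repeatedly; details[0] is the finest level."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = dwt_step(approx)
        details.append(d)
    return approx, details

def idwt(approx, details):
    """Rebuild the signal from the coarsest approximation and all details."""
    for d in reversed(details):
        approx = idwt_step(approx, d)
    return approx

def levels_for_cutoff(fs, f):
    """Levels needed so that the approximation holds only frequencies below f."""
    return int(math.log2(fs / f))
```

For example, with a sampling frequency of 5000 Hz and a frequency of interest of 50 Hz, `levels_for_cutoff(5000, 50)` gives 6 levels, whose approximation covers roughly 0 to 39 Hz.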

Wavelet Packet Transform (WPT)
Whereas the DWT breaks up only the approximations, the WPT decomposes both approximations and details. At the first resolution, j = 1, the signal is decomposed into two packets: A and D. The packet A represents the lower-frequency component of the signal, while the packet D represents the higher-frequency component. Then, at the second resolution, j = 2, each packet is further decomposed into two sub-packets, forming AA, AD, DA and DD. This decomposition process continues, and at each subsequent resolution the number of packets doubles while the number of data points in each packet is halved. The wavelet packets contain the information of the signal in different time windows at different resolutions. Each packet corresponds to a specific frequency band.
Both the WPT and the DWT operate within the framework of multi-resolution analysis (MRA). Unlike the DWT, though, all WPT packets at a given level have the same frequency bandwidth. Fig. 2 depicts the WPT decomposition tree, with A and D corresponding to approximation and detail respectively.
The WPT can thus be seen as a generalization of the wavelet transform. The wavelet packet function is also a time-scale function, which can be described as:

$$W^{n}_{j,k}(t) = 2^{-j/2}\, W^{n}\!\left(2^{-j} t - k\right)$$

where the integers j and k are the scale and translation indices.

Fig. 2. WPT decomposition tree
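The index n is the modulation (or oscillation) parameter. The first two wavelet packets are the scaling function φ(t) and the mother wavelet function ψ(t):

$$W^{0}(t) = \varphi(t), \qquad W^{1}(t) = \psi(t)$$

The higher-order packets (n = 2, 3, …) are defined by the following recursive relationships, in which the packets of index 2n and 2n+1 are generated from the packet of index n:

$$W^{2n}(t) = \sqrt{2} \sum_{k} h(k)\, W^{n}(2t - k), \qquad W^{2n+1}(t) = \sqrt{2} \sum_{k} g(k)\, W^{n}(2t - k)$$

where h(k) and g(k) are the quadrature mirror filters associated with the predefined scaling function and mother wavelet function. The wavelet packet coefficients w^n_{j,k} are calculated as:

$$w^{n}_{j,k} = \langle f, W^{n}_{j,k} \rangle = \int_{-\infty}^{+\infty} f(t)\, W^{n}_{j,k}(t)\, dt$$

The frequency interval of each node is given by ((n-1) 2^{-j-1} S_f, n 2^{-j-1} S_f], where S_f is the sampling frequency, j the scale index and n = 1, 2, …, 2^j the node number within level j (n = 1, 2, …, 16 at the fourth level).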
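The two properties of the WPT discussed in this section, the doubling of packets at every level and the equal bandwidth of all nodes at a given level, can be sketched as follows. The sketch assumes Haar filters, a signal length divisible by 2^levels, and the node band ((n-1) 2^{-j-1} S_f, n 2^{-j-1} S_f]; the function names are hypothetical. Note that the simple tree below returns packets in filter-bank order (AA, AD, DA, DD), whereas the band formula assumes nodes numbered in order of increasing frequency.

```python
import math

def packet_band(fs, j, n):
    """Frequency band (low, high] of node n (1-based) at level j, for sampling rate fs."""
    if not 1 <= n <= 2 ** j:
        raise ValueError("node number must lie in 1..2**j")
    width = fs / 2 ** (j + 1)          # every node at level j has the same bandwidth
    return (n - 1) * width, n * width

def haar_split(x):
    """Split one packet into its approximation (A) and detail (D) halves."""
    s = math.sqrt(2.0)
    a = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    d = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return a, d

def wp_decompose(x, levels):
    """Decompose approximations AND details at every level (filter-bank order)."""
    packets = [list(x)]
    for _ in range(levels):
        packets = [half for p in packets for half in haar_split(p)]
    return packets
```

With a 1000 Hz sampling rate, for instance, each of the 16 nodes at level 4 spans 31.25 Hz, and together they cover 0 to 500 Hz.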

Dual Tree Complex Wavelet (DTCWT)
The dual-tree complex wavelet transform (DTCWT) is a relatively recent enhancement of the DWT (Kingsbury, 1998), with important additional properties: reduced aliasing effects, near shift-invariance and directional selectivity (useful in two and higher dimensions). Frequency aliasing is caused by the overlap of opposing-frequency pass-bands of the wavelet filters. Because analytic filters are adopted, the band-pass filter responses of the DTCWT have nearly all of their pass-bands on only one side of zero frequency. Thus, the DTCWT may exhibit greatly reduced aliasing effects. Incidentally, this property of analytic filters is also the main reason why the DTCWT achieves shift-invariance.
In the dual-tree implementation of decomposition and reconstruction, two parallel DWTs with different low-pass and high-pass filters in each scale are used, as can be seen in Fig. 3. The two DWTs use two different sets of filters, each satisfying the perfect reconstruction condition. Let ψ_h(t) and ψ_g(t) denote the real-valued wavelets used in the two trees of the dual-tree transform. Then a complex-valued wavelet ψ_C(t) can be obtained as:

$$\psi_{C}(t) = \psi_{h}(t) + i\, \psi_{g}(t)$$

Thus, the two real wavelets constitute a complex analytic wavelet ψ_C(t), which is supported only on the positive half of the frequency axis. Fig. 3 shows the frequency response of the DTCWT basis and DWT basis functions. It can be seen that, in contrast to the transfer functions of a real DWT, all the basis functions shown are analytic, except for those corresponding to the scaling coefficients and the first-stage wavelet coefficients.

Fig. 3. Decomposition and reconstruction stages of DTCWT
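Since the DTCWT is composed of two parallel wavelet transforms, according to wavelet theory the wavelet coefficients d_l^{Re}(k) and scaling coefficients c_J^{Re}(k) of the upper tree (tree Re) can be computed via inner products (Wang et al., 2010):

$$d_{l}^{Re}(k) = 2^{-l/2} \int_{-\infty}^{+\infty} x(t)\, \psi_{h}\!\left(2^{-l} t - k\right) dt, \qquad l = 1, \dots, J$$

$$c_{J}^{Re}(k) = 2^{-J/2} \int_{-\infty}^{+\infty} x(t)\, \varphi_{h}\!\left(2^{-J} t - k\right) dt$$

where l is the scale factor and J is the maximum scale. Similarly, the coefficients d_l^{Im}(k) and c_J^{Im}(k) of the lower tree (tree Im) can be computed if ψ_h(t) and φ_h(t) are replaced by ψ_g(t) and φ_g(t), respectively. The wavelet and scaling coefficients of the DTCWT can then be expressed by combining the outputs of the two trees as follows:

$$d_{l}^{C}(k) = d_{l}^{Re}(k) + i\, d_{l}^{Im}(k), \qquad c_{J}^{C}(k) = c_{J}^{Re}(k) + i\, c_{J}^{Im}(k)$$

Furthermore, when the other coefficients are set to zero, the scaling or wavelet coefficients can be individually reconstructed using the following equations:

$$d_{l}(t) = \frac{2^{-l/2}}{2} \left[ \sum_{n} d_{l}^{Re}(n)\, \psi_{h}\!\left(2^{-l} t - n\right) + \sum_{m} d_{l}^{Im}(m)\, \psi_{g}\!\left(2^{-l} t - m\right) \right]$$

$$c_{J}(t) = \frac{2^{-J/2}}{2} \left[ \sum_{n} c_{J}^{Re}(n)\, \varphi_{h}\!\left(2^{-J} t - n\right) + \sum_{m} c_{J}^{Im}(m)\, \varphi_{g}\!\left(2^{-J} t - m\right) \right]$$

The components d_l(t) and c_J(t) are real and have the same length as the original signal x(t), unlike the complex coefficients d_l^{C}(k) and c_J^{C}(k). Specifically, for the tree Re, the decomposed scaling coefficients (approximation) and wavelet coefficients (details), as well as the inverse transform between two consecutive resolution levels l and l+1, can be derived by:

$$c_{l+1}^{Re}(k) = \sum_{m} h_{0}(m - 2k)\, c_{l}^{Re}(m), \qquad d_{l+1}^{Re}(k) = \sum_{m} h_{1}(m - 2k)\, c_{l}^{Re}(m)$$

$$c_{l}^{Re}(k) = \sum_{m} \left[ h_{0}(k - 2m)\, c_{l+1}^{Re}(m) + h_{1}(k - 2m)\, d_{l+1}^{Re}(m) \right]$$

where h_0 and h_1 denote the low-pass and high-pass filters of the tree Re. Similarly, the corresponding relations for the tree Im, with low-pass and high-pass filters g_0 and g_1, can be obtained by:

$$c_{l+1}^{Im}(k) = \sum_{m} g_{0}(m - 2k)\, c_{l}^{Im}(m), \qquad d_{l+1}^{Im}(k) = \sum_{m} g_{1}(m - 2k)\, c_{l}^{Im}(m)$$

$$c_{l}^{Im}(k) = \sum_{m} \left[ g_{0}(k - 2m)\, c_{l+1}^{Im}(m) + g_{1}(k - 2m)\, d_{l+1}^{Im}(m) \right]$$

Note that a complex transform implemented in this way is no longer critically sampled, because two independent wavelet transforms are required. On the other hand, the DTCWT can be implemented using existing DWT software, and its computational cost remains low (only twice that of the basic DWT). In addition, the transform is naturally parallelized for efficient hardware implementation.
Figs. 4 and 5 show the decomposition, with the DWT and the DTCWT respectively, of an artificial signal containing four fundamental frequencies. In the DWT decomposition, the highlighted frequencies do not actually exist, as the FFT of the original signal confirms. In contrast, artificial peaks do not appear in the DTCWT decomposition, as Fig. 5 clearly shows, demonstrating the reduced frequency aliasing of the DTCWT. A peak highlighted in detail 3 is real, though it should appear only in detail 2.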

The Second Generation Wavelet Transform (SGWT)
The classical wavelet techniques (CWT, DWT, WPT) all depend on the selection of a mother wavelet from a library of previously designed wavelet functions, an issue that is discussed in more detail in Section 3. Unfortunately, the standard wavelet functions are independent of the given signal. To address this limitation, the Second Generation Wavelet Transform (SGWT), a new wavelet construction method using the lifting scheme, was developed by (Sweldens, 1998). It is actually an alternative implementation of the classical DWT. The main feature of the SGWT is that it provides an entirely spatial domain interpretation of the transform, as opposed to the traditional frequency domain based constructions. Compared with the classical wavelet transform, the lifting scheme possesses several advantages, including the possibility of adaptive design, in-place calculations, handling of irregular samples and integer-to-integer wavelet transforms. The lifting scheme is highly flexible: the transform can be designed according to the properties of the given signal, while the structure of the scheme ensures that the resulting transform is always invertible. It makes good use of similarities between the high-pass and low-pass filters to speed up the calculation, so that the implementation of the second generation wavelet transform is faster than that of the first generation wavelet transforms. Additionally, the multi-resolution analysis property is preserved. Consequently, applications of the SGWT scheme in condition monitoring and fault diagnosis of mechanical equipment have been increasing over the last few years (see Section 4). A basic decomposition of the SGWT consists of three main steps (Sweldens, 1998): split, predict, and update. In the split step, an approximate signal a_l at level l is split into even samples and odd samples (Zhou et al., 2010):

$$a_{l+1}(i) = a_{l}(2i), \qquad d_{l+1}(i) = a_{l}(2i+1)$$

In the prediction step, a prediction operator P is designed and applied on a_{l+1} to predict d_{l+1}. The resultant prediction error d_{l+1} is regarded as the detail coefficients of a_l:

$$d_{l+1}(i) = d_{l+1}(i) - \sum_{r=-M/2+1}^{M/2} p_{r}\, a_{l+1}(i + r)$$

where p_r are the coefficients of P and M is the length of p_r.
In the update step, a designed update operator U is applied on d_{l+1}. Adding the result to the even samples, the resultant a_{l+1} is regarded as the approximate coefficients of a_l:

$$a_{l+1}(i) = a_{l+1}(i) + \sum_{j=-N/2+1}^{N/2} u_{j}\, d_{l+1}(i + j - 1)$$

where u_j are the coefficients of U and N is the length of u_j. Iterating the above three steps on the output a generates the detail and approximation coefficients at different levels.
The reconstruction stage of the SGWT is the reverse of the decomposition stage and consists of an inverse update step, an inverse prediction step and a merging step.
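The split, predict and update steps, and their reversal, can be sketched as below. This is only an illustration under stated assumptions: the predictor averages the two neighbouring even samples (p = [0.5, 0.5], M = 2), the update operator uses u = [0.25, 0.25] (N = 2), the signal has even length, and indices are clamped at the boundaries. None of these choices come from the cited works, and the function names are hypothetical.

```python
P = [0.5, 0.5]    # assumed prediction coefficients p_r (M = 2)
U = [0.25, 0.25]  # assumed update coefficients u_j (N = 2)

def _clamp(i, n):
    """Keep an index inside 0..n-1 (simple edge replication)."""
    return min(max(i, 0), n - 1)

def _predict(even):
    n = len(even)
    return [P[0] * even[i] + P[1] * even[_clamp(i + 1, n)] for i in range(n)]

def _update(detail):
    n = len(detail)
    return [U[0] * detail[_clamp(i - 1, n)] + U[1] * detail[i] for i in range(n)]

def sgwt_forward(a):
    """One lifting level: split, predict (detail), update (approximation)."""
    even, odd = a[0::2], a[1::2]                                # split
    detail = [o - p for o, p in zip(odd, _predict(even))]        # prediction error
    approx = [e + u for e, u in zip(even, _update(detail))]      # updated evens
    return approx, detail

def sgwt_inverse(approx, detail):
    """Reverse order: inverse update, inverse predict, merge."""
    even = [s - u for s, u in zip(approx, _update(detail))]
    odd = [d + p for d, p in zip(detail, _predict(even))]
    merged = [0.0] * (len(even) + len(odd))
    merged[0::2], merged[1::2] = even, odd
    return merged
```

Because each step only adds a quantity that the inverse can recompute and subtract, the transform is invertible whatever P and U are chosen. This is the property that makes adaptive, signal-dependent designs possible.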

Second Generation Wavelet Packet Transform (SGWPT)
The time-frequency resolution of the SGWT varies with the decomposition level. It gives good time resolution and poor frequency resolution in the high-frequency sub-bands, and good frequency resolution and poor time resolution in the low-frequency sub-bands. In order to obtain a higher resolution in the high-frequency sub-bands, the SGWPT has been constructed, in which the detail coefficients at each level are further decomposed to obtain their approximation and detail components. The decomposition and reconstruction stages of the SGWPT are described below.
In the decomposition stage, X_{l,k} is split into even samples X_{l,ke} and odd samples X_{l,ko}, where X_{l,k} represents the coefficients of the kth node at level l. The sub-band coefficients at level l+1 are then calculated as:

$$X_{l+1,2k} = X_{l,ko} - P(X_{l,ke}), \qquad X_{l+1,2k-1} = X_{l,ke} + U(X_{l+1,2k}), \qquad k = 1, \dots, 2^{l}$$

In the reconstruction stage, the sub-band coefficients to be reconstructed are retained, and all other sub-band coefficients are set to zero. Finally, the reconstructed results are obtained by the following formulas:

$$X_{l,ke} = X_{l+1,2k-1} - U(X_{l+1,2k}), \qquad X_{l,ko} = X_{l+1,2k} + P(X_{l,ke})$$

$$X_{l,k}(2i) = X_{l,ke}(i), \qquad X_{l,k}(2i+1) = X_{l,ko}(i)$$

Choosing the best wavelet basis
Using the classical WT (DWT, CWT or WPT) raises the unresolved issue of mother wavelet selection. Different types of wavelets have different time-frequency structures, and thus it is always an issue how to choose the best wavelet function for extracting fault features from a given signal. An "inappropriate" wavelet will reduce the accuracy of the fault detection. There is a plethora of options among the various wavelet families (some with an infinite number of members) and specific wavelets. Haar, Daubechies (db), Symlets, Coiflets, Gaussian, Morlet, complex Morlet, Mexican hat, biorthogonal, reverse biorthogonal, Meyer, harmonic wavelets, discrete approximation of Meyer, complex Gaussian, Shannon, and frequency B-spline are among the most well-established wavelets.
In principle, the wavelet decomposition achieves a better result if the wavelet basis is "similar" to the signal under analysis. The wavelet coefficients reflect the similarity between a local segment of the signal and the corresponding wavelet basis: the bigger the coefficient, the more similar the two are. Different wavelet bases lead to quite different signal analysis results. Currently there are still no generic theoretical guidelines on how to select the optimum wavelet basis, or how to select the corresponding shape parameter and scale level for a particular application. In many cases the selection is done by trial and error. In the literature there are some interesting approaches that attempt to address this issue. (Kankar et al., 2011) presented a methodology for rolling element bearing fault diagnosis using the continuous wavelet transform (CWT). Six different base wavelets were considered, of which three were real valued and the other three complex valued. Out of these six wavelets, the base wavelet was selected according to wavelet selection criteria and then used to extract statistical features from the wavelet coefficients of raw vibration signals.
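The energy content of the wavelet coefficients at the nth scale is given by:

$$E(n) = \sum_{i=1}^{m} \left| C_{n,i} \right|^{2}$$

where m is the number of wavelet coefficients and C_{n,i} the ith wavelet coefficient at the nth scale.
The total energy is given by:

$$E_{total} = \sum_{n} \sum_{i=1}^{m} \left| C_{n,i} \right|^{2} = \sum_{n} E(n)$$

whereas the Energy to Shannon Entropy ratio is given by:

$$\zeta(n) = \frac{E(n)}{S_{entropy}(n)}$$

where the entropy of the signal wavelet coefficients is defined as:

$$S_{entropy}(n) = -\sum_{i=1}^{m} p_{i} \log_{2} p_{i}$$

and p_i = |C_{n,i}|^2 / E(n) is the energy distribution of the wavelet coefficients, with Σ p_i = 1.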
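The Energy to Shannon Entropy ratio criterion can be sketched as follows. The sketch assumes that the wavelet coefficients of one scale are already available as a plain list, that p_i is the share of the scale energy carried by coefficient i, and that the logarithm is taken to base 2; the function names are hypothetical.

```python
import math

def scale_energy(coeffs):
    """Energy of the wavelet coefficients at one scale."""
    return sum(abs(c) ** 2 for c in coeffs)

def shannon_entropy(coeffs):
    """Shannon entropy of the normalized energy distribution p_i."""
    energy = scale_energy(coeffs)
    entropy = 0.0
    for c in coeffs:
        p = abs(c) ** 2 / energy
        if p > 0.0:                      # terms with p = 0 contribute nothing
            entropy -= p * math.log2(p)
    return entropy

def energy_to_entropy_ratio(coeffs):
    """Large when the energy is high and concentrated in few coefficients."""
    entropy = shannon_entropy(coeffs)
    return math.inf if entropy == 0.0 else scale_energy(coeffs) / entropy
```

A candidate wavelet that concentrates the signal energy in a few large coefficients produces a low entropy and hence a high ratio, which is why the criterion favours wavelets that resemble the signal.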
To find the most suitable mother wavelet, (Rafiee and Tse, 2009), in probably the most thorough investigation of mother wavelet choice, studied 324 candidate mother wavelet functions from various families including Haar, Daubechies (db), Symlet, Coiflet, Gaussian, Morlet, complex Morlet, Mexican hat, bio-orthogonal, reverse bio-orthogonal, Meyer, discrete approximation of Meyer, complex Gaussian, Shannon, and frequency B-spline. The most similar mother wavelet for analyzing the gear vibration signal was selected based on the following procedure. Raw vibration signals were recorded and synchronized.
The feature vector was composed of the variance of the CWT coefficients for each of the 2^4 scales, calculated for each of the 50 segmented signals in each gearbox condition. The average of the feature vector over the 50 segmented signals was computed for each gearbox condition. The variance of these averages across the four gearbox conditions was then determined for each scale (2^4 elements). The five highest values of the resulting vector were selected as the feature because the larger the variance, the greater the ability to properly classify faults. The sum of these five elements, called "SUMVAR" for simplicity, was compared with the values obtained from the other 323 candidate mother wavelets (a total of 324 mother wavelets). The wavelet with the highest SUMVAR was selected as the function most similar to the vibration signals. In a similar work, (Rafiee et al., 2010), following a similar procedure, found that "Daubechies 44" ("db44") has the most similar shape across both gear and bearing vibration signals. Results also suggested that although "db44" is the most similar mother wavelet function for the studied vibration signals, it is not the proper function for all wavelet-based processing. The research verified that the Morlet wavelet has better similarity to both vibration signals in comparison with many other functions such as Daubechies (1-43), Coiflet, Symlet, complex Morlet, Gaussian, complex Gaussian, and Meyer for both experimental set-ups (i.e. gear testing and bearing testing). The drawback of the db44 function is that high-order db functions take more CPU time than most others. In another work, (Rafiee et al., 2009) utilized genetic algorithms (GAs) to optimize the selection of the mother wavelet function (among several members of the Daubechies family), the number of decomposition levels of the wavelet packet transform (WPT) and the number of neurons in the hidden layers of the ANN used for the fault classification, resulting in a high-speed, effective two-layer ANN with a small-sized structure. "db11", level 4 and 14 neurons were selected as the best values for the Daubechies order, the decomposition level and the number of nodes in the hidden layer, respectively. In (Gketsis et al., 2009) the optimum wavelet choice criterion is the maximization of the cross-correlation between the signal of interest and the wavelet. In an application of condition monitoring in electrical machines, they tested several wavelet functions, namely Haar, Daubechies 2, 4, 8, Symlet 2, 3, 4, 8 and Coiflet 3, and concluded that "db2" was the best choice. (Saravanan and Ramachandran, 2009) found that among the 15 members of the Daubechies family examined, "db1" and "db5" gave the maximum classification efficiency of an expert system (Decision Tree), at around 98.7%.
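The SUMVAR selection procedure described above can be sketched as follows. The sketch assumes that the per-scale variances of the CWT coefficients have already been computed for every segment, and it uses the population variance across conditions (the source does not say which variance estimator was used). The data layout and the names `sumvar` and `most_similar` are hypothetical.

```python
from statistics import mean, pvariance

def sumvar(features, top=5):
    """SUMVAR of one candidate wavelet.

    features[condition] is a list of segments; each segment is a list holding
    the variance of the CWT coefficients at every scale.
    """
    # Average the feature vector over the segments of each condition.
    averaged = [[mean(col) for col in zip(*segments)]
                for segments in features.values()]
    # Variance of those averages across the conditions, scale by scale.
    spread = [pvariance(col) for col in zip(*averaged)]
    # Sum of the largest `top` values.
    return sum(sorted(spread, reverse=True)[:top])

def most_similar(candidates, top=5):
    """Name of the candidate wavelet with the highest SUMVAR."""
    return max(candidates, key=lambda name: sumvar(candidates[name], top))
```

Here `candidates` would map each mother wavelet name to its own `features` dictionary, so that `most_similar(candidates)` returns the wavelet whose features separate the machine conditions most strongly.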
Other researchers prefer more qualitative explanations. (Xu and Li, 2008) argue that among the common wavelet bases (Morlet, Haar, Shannon, Symlets, Coiflets, Daubechies, etc.), the most popular is the Daubechies wavelet, as it has the shortest compactly supported scaling function of all orthogonal wavelets for a given number of vanishing moments. Moreover, it gives the best overall performance with respect to both the mean squared error between the reconstructed and the original signal and the maximization of the SNR improvement. Therefore, the Daubechies wavelet was applied in that study and the others were used for comparison. (Jazebi et al., 2011) state that one specific mother wavelet is best suited for a particular application. In their work, the mother wavelet type and decomposition level were chosen based on experience and trial and error. The research involves detecting and analyzing low-amplitude, short-duration, fast-decaying, oscillating current signals, for which the Daubechies mother wavelet seems to be an appropriate choice. In comparison with the Haar wavelet, Daubechies wavelets are better suited for feature extraction due to their low-pass and high-pass filters. In addition, because of its inherent orthogonality, the Daubechies wavelet satisfies Parseval's theorem, unlike, according to the authors, biorthogonal wavelets such as the Coiflet and Meyer wavelets. The db4 mother wavelet at level d4 was chosen because the maximum energy localization in details (1-4) was obtained using these parameters.
(Antonino-Daviu et al., 2007) argue that the Daubechies family is well suited to the application of the DWT in condition monitoring due to its interesting inherent properties. An important fact they observed when using the Daubechies family was the overlap between the frequency bands (frequency aliasing) associated with the DWT decomposition of their signals. This is due to the non-ideal filtering performed by the wavelet filters, which causes signal components that lie within a certain frequency band but close to its limits to overlap partially with the adjacent band. When a high-order Daubechies wavelet is used for signal decomposition, this effect is less intense than with a low-order one. In other words, high-order wavelets behave as more ideal filters. Maximization of statistical features such as kurtosis or crest factor can be used as a criterion for the choice of mother wavelet within a family or among various families. In an unpublished study by the authors, an investigation of the optimum parameters for the most effective de-noising with the DWT was conducted. The analysis of a representative AE signal from seeded defects in bearings shows, in Fig. 9, how the statistical parameters change with the choice of wavelet among the first 10 members of the Daubechies family. The wavelet that maximizes kurtosis, crest factor and crest value is chosen as optimum, "db2" in this case.
Fig. 9. Kurtosis, crest value and factor features of de-noised AE signal with various "db" wavelets in a DWT de-noising scheme
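The statistical criteria used in this selection (kurtosis, crest value and crest factor) are straightforward to compute. The sketch below assumes the non-normalized (Pearson) kurtosis, for which a Gaussian signal gives a value of 3, and takes the crest value as the peak absolute amplitude; the function names, and the choice to rank candidates by kurtosis alone, are illustrative assumptions.

```python
import math

def kurtosis(x):
    """Fourth central moment divided by the squared variance."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / m2 ** 2

def crest_value(x):
    """Peak absolute amplitude of the signal."""
    return max(abs(v) for v in x)

def crest_factor(x):
    """Peak amplitude divided by the RMS value."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return crest_value(x) / rms

def best_by_kurtosis(denoised):
    """Given {wavelet name: de-noised signal}, return the name with highest kurtosis."""
    return max(denoised, key=lambda name: kurtosis(denoised[name]))
```

Impulsive signals, such as the bursts produced by a bearing defect, give high kurtosis and crest factor, so a wavelet that preserves the impulses while removing noise scores highest.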

Wavelet-based de-noising
Wavelet-based de-noising is a very interesting and important application of wavelets in the processing of condition monitoring signals. It is widely adopted because it is well suited to extracting hidden diagnostic information and enhancing the impulsive components of complex, non-stationary signals with strong background noise. Wavelet thresholding is based on the idea that the energy of the signal is concentrated in a few wavelet coefficients, while the energy of the noise is spread over all the resulting wavelet coefficients. Similarity between the mother wavelet and the signal to be analyzed plays a very important role, since it allows the signal to be concentrated in fewer coefficients; the choice of mother wavelet is therefore critical to the efficiency of the de-noising task. The foundations of wavelet-based de-noising were laid by (Donoho, 1995). Let x(t) be the discrete signal acquired during condition monitoring. The signal series consists of impulses and noise, so x(t) can be expressed as x(t) = p(t) + n(t), where p(t) denotes the impulses to be determined and n(t) denotes independent, identically distributed Gaussian noise with zero mean and standard deviation σ. In principle, the wavelet threshold de-noising procedure has the following steps:
1. Transform the signal x(t) into the wavelet domain to obtain the wavelet coefficients.
2. Apply a threshold to the wavelet coefficients.
3. Reconstruct the de-noised signal from the thresholded coefficients with the inverse wavelet transform.
The second step is probably the most critical and has a considerable impact on the effectiveness of the procedure. Many thresholding techniques and many different thresholds have been proposed in the literature. Hard thresholding sets to zero any coefficient whose absolute value is less than or equal to the threshold λ:

$$\hat{w} = \begin{cases} w, & |w| > \lambda \\ 0, & |w| \le \lambda \end{cases}$$

Hard thresholding is the simplest approach but tends to miss useful parts of the signal. In soft thresholding, the threshold is additionally subtracted from the magnitude of any coefficient that exceeds it:

$$\hat{w} = \begin{cases} \mathrm{sign}(w)\,(|w| - \lambda), & |w| > \lambda \\ 0, & |w| \le \lambda \end{cases}$$

A widely used choice is the universal threshold:

$$\lambda = \sigma \sqrt{2 \ln N}$$

where σ is the standard deviation of the noise and N is the number of data samples in the measured signal. The true value of the noise standard deviation σ is generally unknown. It is often estimated by σ = MAD/0.6745, where MAD refers to the median absolute value of the finest-scale wavelet coefficients. The combination of the soft thresholding policy and the universal threshold is also referred to as "VisuShrink". It ensures a noise-free reconstruction, but the threshold is often set too high. (Donoho and Johnstone, 1994) introduced the "minimax" threshold, an enhancement of the universal threshold. The "minimax" threshold level can be much lower than the universal threshold level for small-to-moderate sample sizes. The "SureShrink" or "rigrsure" approach relies on the minimization of Stein's unbiased estimator of risk (Donoho and Johnstone, 1995). It yields better results when the wavelet representation is not very sparse, whereas the universal and "minimax" thresholds are more effective for detecting sparse impulses. All the above methods assume that the noise properties are known, which is rarely the case in industrial applications. The maximum likelihood estimation de-noising method is suitable for non-Gaussian noise. A specific threshold rule based on the maximum likelihood estimation method incorporates a priori information on the impulse probability density function. The probability density function of the impulse to be identified must be known in advance, though. The so-called "sparse code shrinkage" method, proposed by (Hyvarinen, 1999), can be utilized for wavelet coefficient shrinkage.
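The thresholding rules discussed in this section can be sketched as follows. The sketch assumes that the wavelet coefficients are available as plain lists, takes the universal threshold in its usual form σ·sqrt(2 ln N), and estimates σ from the finest-scale detail coefficients as MAD/0.6745; the function names are hypothetical.

```python
import math
from statistics import median

def hard_threshold(coeffs, lam):
    """Zero every coefficient whose magnitude does not exceed lam."""
    return [c if abs(c) > lam else 0.0 for c in coeffs]

def soft_threshold(coeffs, lam):
    """Zero small coefficients and shrink the remaining magnitudes by lam."""
    return [math.copysign(abs(c) - lam, c) if abs(c) > lam else 0.0
            for c in coeffs]

def noise_sigma(finest_detail):
    """Robust noise estimate: median absolute finest-scale coefficient / 0.6745."""
    return median(abs(c) for c in finest_detail) / 0.6745

def universal_threshold(sigma, n):
    """Universal threshold for a signal of n samples."""
    return sigma * math.sqrt(2.0 * math.log(n))
```

Applying `soft_threshold` to every detail level with `universal_threshold(noise_sigma(finest), n)` corresponds to the "VisuShrink" combination described above. Replacing it with `hard_threshold` keeps the surviving coefficients at full amplitude.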
The DTCWT can substantially enhance the performance of conventional DWT-based noise reduction methodologies due to its near shift-invariance and reduced frequency aliasing. (Wang et al., 2010) proposed a scheme based on the "NeighCoeff" shrinkage rule (Cai and Silverman, 2001). "NeighCoeff" uses a lower threshold than "VisuShrink" and is reported to outperform the other shrinkage methods. De-noising using the DTCWT and "NeighCoeff" shrinkage is implemented in the following stages:
1. Transform the data x into the wavelet domain via the DTCWT (or, in general, any other wavelet transform).
2. At each resolution level j, group the noisy wavelet coefficients into disjoint blocks b_i^j of length L_0 = log(n)/2; then extend each block b_i^j by an amount L_1 = max(1, L_0/2) in each direction to form overlapping larger blocks B_i^j of length L = L_0 + 2L_1.
3. Within each block b_i^j, process each noisy wavelet coefficient with the "NeighCoeff" shrinkage rule.
4. Calculate the de-noised signal using the inverse wavelet transform.
In Fig. 10 various de-noising algorithms are applied to an AE signal from a bearing with a seeded defect. The original signal is depicted in a). In b) the method of spectral kurtosis (Randall and Antoni, 2011) is used. Spectral kurtosis is not a wavelet-based technique; it relies on locating the frequency band in which kurtosis is maximized and then band-pass filtering the signal in that band. In c) the DTCWT is applied in combination with "NeighCoeff" thresholding, whilst in d) a parametric procedure was used by the authors to determine the DWT parameters (wavelet type, number of levels, threshold type, soft or hard application of the threshold) that maximize the kurtosis and crest factor of the signal. DTCWT- and DWT-based de-noising proved the most efficient in terms of the resulting signal kurtosis.

Gearboxes
Fault symptoms of running gearboxes must be detected as early as possible to avoid serious accidents.An efficient monitoring plan is needed for any industry because it can optimize the resources management and improve the plant economy, by reducing unnecessary costs Where c ij is the element of matrix [C] mxn , − + j− is the Euclidean distance between element c ij and c 11 , that is corresponding to the geometry length between the point (i,j) and reference point (1,1) in the scalogram.In (Fan and Zuo, 2006) a new fault detection method that combines Hilbert transform and wavelet packet transform was proposed.The wavelet packet node energy method is used as feature.WPT at the 4 th decomposition level using "db10" wavelet was utilized.Their results showed that the proposed method is effective to extract modulating signal and help to detect the early gear fault.(Sanz et al., 2007) proposed a method which combines the capability of DWT to treat transient vibration signals with the ability of auto-associative neural networks (AANNs) for feature extraction."db6" and 3 levels of decomposition were chosen for real application vibration data from a pump rotor gearset.The detail coefficient vectors of the DWT were taken as input parameters of the AANN.An advantage of the proposed method is that DWT is performed directly on the raw vibration signals not on time-synchronous averaged signals.(Rafiee et al., 2007) presented a new procedure which experimentally recognized gears and bearings faults of a typical gearbox system using a multi-layer perceptron ANN.The feature vector was populated by the standard deviation of wavelet packet coefficients after WPT on the recorded vibration signals."db4" wavelet and 4 levels of decomposition were used.The gear conditions were considered to be normal gearbox, slight-and mediumworn, broken-teeth gears faults and a general bearing fault.(He et al., 2007) proposed a novel non-linear feature extraction scheme from the time-domain features 
with wavelet packet preprocessing and frequency-domain features of the vibration signals using the kernel principal component analysis (KPCA) to characterize various gearbox conditions. Experimental analysis on a fatigue test of an automobile transmission gearbox has shown that the KPCA features outperformed PCA features in terms of clustering capability, and that both KPCA-based subspace methods can be effectively applied to gearbox condition monitoring. The time-domain statistical features with wavelet packet preprocessing and the frequency-domain statistical features proved more effective than the conventional time-domain features without WPT preprocessing for extracting the KPCA features. (Li et al., 2007) used the Haar wavelet CWT (HCWT) to diagnose three types of machine faults. To assess its effectiveness, the diagnosis information obtained by HCWT was compared with that obtained by the Morlet wavelet CWT (MCWT), which is more popular in machine diagnosis. Their results demonstrate that the Haar wavelet is also a feasible wavelet in machine fault diagnosis and that HCWT can provide more abundant graphic features for diagnosis than MCWT. (Miao and Makis, 2007) introduced a new feature extraction approach based on wavelet modulus maxima and proposed a Hidden Markov Model (HMM) based two-stage machine condition classification system. The modulus maxima distribution was utilized as the input observation sequence of the system. An adaptive algorithm was proposed and validated by three sets of real gearbox vibration data to classify two conditions: normal and failure. In addition, in condition classification (stage 2), three HMM models were set up to classify three different machine conditions, namely, adjacent tooth failure, distributed tooth failure and normal condition. The validation results showed an excellent performance of the proposed classification system. (Saravanan et al., 2008) investigated the effectiveness of wavelet-based features for fault diagnosis in a bevel gearbox using support
vector machines (SVM) and proximal support vector machines (PSVM). The statistical feature vectors obtained from the Morlet wavelet coefficients after CWT at sixty-four scales were classified using the J48 algorithm, and the predominant features were fed as input for training and testing SVM and PSVM. The coefficients of the Morlet wavelet were used for feature extraction from the time-domain vibration signals. Various statistical features like kurtosis, standard deviation, maximum value, etc., calculated from the wavelet coefficients, formed the feature sets. It was concluded that PSVM has an edge over SVM in the classification efficiency of various fault conditions. (Li et al., 2008) presented a new signal-adapted lifting scheme for rotating machinery fault diagnosis, which allows the construction of a wavelet directly from the statistics of a given signal. The prediction operator based on genetic algorithms was designed to maximize the kurtosis of the detail signal produced by the lifting scheme, and the update operator was designed to minimize a reconstruction error. The signal-adapted lifting scheme was applied to analyze bearing and gearbox vibration signals. The conventional diagnosis techniques and a non-adaptive lifting scheme were also used to analyze the same signals for comparison. The results demonstrated that the signal-adapted lifting scheme was more effective in extracting inherent fault features from complex vibration signals. (Kar and Mohanty, 2008) conducted an experimental investigation of fault diagnosis in a multistage gearbox under transient loads. The signals studied were vibration measurements, recorded from an accelerometer fitted at the tail-end bearing of the gearbox, as well as the current transients monitored at the induction motor. Three defective cases and three transient load conditions were investigated. DWT (with "db8") and a corrected multi-resolution Fourier transform (MFT) were applied to process the vibration and current transients. A statistical feature
extraction technique was proposed in search of a trend in the detection of defects. A condition monitoring scheme was devised that can facilitate the monitoring of vibration and current transients in the gearbox in the simultaneous presence of transient loads and defects. (Jafarizadeh et al., 2008) suggested a new noise canceling method, based on a time-averaging method for asynchronous input and CWT with the complex Morlet wavelet. The complex Morlet wavelet depends on non-fixed parameters. For the feature extraction from time-domain vibration signals, the optimum values of the Morlet wavelet parameters should be estimated. Wavelet entropy was used towards this optimization. Then CWT was applied and 3-D scalograms were utilized for damage detection. The proposed method was successfully implemented on a simulated signal and on a real test rig of a Yamaha motorcycle gearbox. (Loutas et al., 2009) reported on the condition monitoring of a lab-scale, single-stage gearbox with cracked gears using different non-destructive inspection methodologies and advanced signal processing of the acquired waveforms. Acoustic emission (AE) and vibration measurements were utilized for this purpose. Emphasis was given to the signal processing of the acquired vibration and acoustic emission signals in order to extract conventional as well as novel parameters/features of potential diagnostic value from the monitored waveforms. Wavelet-based parameters/features were proposed utilizing the DWT and the "db10" wavelet. The evolution of selected parameters/features versus test time was provided and evaluated, and the parameters with the most interesting diagnostic behavior were highlighted. The differences in the parameter evolution of each NDT technique were discussed and the superiority of AE over vibration recordings for the early diagnosis of natural wear in gear systems was concluded. In (Saravanan and Ramachandran, 2009) the coefficients of Morlet wavelet were used
for feature extraction. CWT and sixty-four scales were chosen to extract the Morlet wavelet coefficients of the vibration signals. A group of statistical features like kurtosis, standard deviation, maximum value, etc., widely used in fault diagnostics, were extracted from the wavelet coefficients of the time-domain signals. For the selection of the best features, a decision tree using the J48 algorithm was used. The selected features were fed as input to SVM for classification. (Xian and Zeng, 2009) developed a new intelligent method for the fault diagnosis of rotating machinery based on wavelet packet analysis (WPA) and hybrid support vector machines (hybrid SVM). The faulty vibration signals obtained from a gearbox were decomposed by WPA via the Dmeyer wavelet. Shannon entropy was calculated from the coefficients at each subspace of the WPA decomposition and formed the feature vectors that trained/tested the hybrid SVM for estimating the fault type. (Belsak and Flasker, 2009) studied the influence of a fatigue gear crack in a single-stage gear unit on the recorded vibrations. They applied the sparse code shrinkage method to de-noise vibration signals from a faulty gearbox. They discriminated between healthy and cracked gears using scalograms of the resulting CWT coefficients. The Gabor wavelet was adopted in their work. (Wu and Chan, 2009) utilized the sound emission from a multi-stage gearbox towards gear fault diagnostics. The continuous wavelet transform with the Morlet mother wavelet, combined with a feature selection of the energy spectrum, was proposed for analyzing fault signals and feature extraction. Two artificial neural network (ANN) approaches, i.e.
the probability neural network and the conventional back-propagation network, were compared in the recognition of six faulty states and one healthy state. (Saravanan and Ramachandran, 2009) recorded vibration signals from a spur bevel gearbox in different lubrication, loading and gear state conditions. They used various members of the Daubechies family (db1-db15) for statistical feature extraction. The J48 Decision Tree was used for two purposes: feature selection and classification of the faulty signals. (Rafiee and Tse, 2009) processed vibration signals from a gearbox with three different fault conditions (slight-worn, medium-worn, and broken-tooth) of a spur gear. CWT was used with packet decomposition through the scales. After synchronizing the raw vibration signals, the CWT and autocorrelation function were applied to the synchronized signals and generated continuous wavelet coefficients of synchronized vibration signals (CWC-SVS). They found that a simple sinusoidal summation function can approximate the waveforms generated by autocorrelation of CWC-SVS for normal gearboxes as well as for defective gears with satisfactory performance. The function achieved a proper approximation even though the waveforms differed from one condition to another, as they possess different frequency contents of the vibration signals. (Rafiee et al., 2009) presented an optimized gear fault identification system using genetic algorithms (GAs) to investigate the type of gear failures of a complex gearbox system using artificial neural networks (ANNs). Slightly-worn, medium-worn, and broken-tooth conditions of a spur gear of the gearbox system were selected as the fault types. GAs were exploited to optimize the selection of the mother wavelet function (among several members of the Daubechies family), the number of decomposition levels of the wavelet packet transform (WPT) as well as the number of neurons in the ANN hidden layers, resulting in a high-speed, effective two-layer ANN with a small-sized structure. "db11", level 4 and
14 neurons were selected as the best values for Daubechies order, decomposition level, and the number of nodes in the hidden layer, respectively. (Singh and Al Kazzaz, et al., 2009) studied the effect of a dry bearing fault on multi-sensor measurements (three line-to-line voltages, three currents, two vibration signals, four temperatures and one speed signal) in induction machines. Different families of WT were introduced and implemented with vibration signals covering the dry bearing fault in an induction machine. The results of testing various popular types of the WT showed different degrees of success in relating the band to the machine condition. It was concluded that the fluctuation in the RMS value of the first and second decomposition levels was larger in the case of the Mexican hat wavelet, and it was thus proposed for investigating the random vibration of all machines in case of a dry bearing fault. It was also concluded that WT can be used effectively to specify one machine fault at a time, while it cannot treat multiple faults simultaneously. Instead, the combined use of wavelet and Fourier transforms proved an effective tool for extracting important information about the machine condition. An intelligent diagnostic methodology for fault gear identification and classification based on vibration signals using DWT and an adaptive neuro-fuzzy inference system (ANFIS) is presented in (Wu et al., 2009). After the vibration signal acquisition, a 4-level decomposition via the DWT followed, resulting in four high-frequency details (D1-D4) and one low-frequency approximation (A4). Three Daubechies wavelets (db4, db8 and db20) were utilized for the decomposition. The energy distribution of the five sub-bands was calculated and used to train two different ANNs for the successful fault identification. No major differences were observed in the ANN recognition rates with regard to the different mother wavelets utilized in the DWT. (Wu and Hsu, 2009) described a development of the fault gear identification system
using the vibration signal with discrete wavelet transform and fuzzy-logic inference for a gear-set experimental platform. The extraction method of the feature vector is based on DWT decomposition followed by level energy calculation. The recognition rate of the classification task using the coefficients of three different Daubechies wavelets ("db4", "db8" and "db20") under various working conditions did not show significant discrepancies. The fault recognition rates were in general over 96%.
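The DWT level-energy features used in the two studies above can be illustrated with a minimal sketch. It assumes the orthonormal Haar ("db1") wavelet so that it needs no external library, whereas the cited works used "db4", "db8" and "db20"; the function names and the synthetic test signal are hypothetical and are not taken from those papers.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar (db1) DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def dwt_energy_distribution(signal, levels=4):
    """Relative energies of the sub-bands [D1, ..., D_levels, A_levels]."""
    if len(signal) == 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a positive multiple of 2**levels")
    approx = list(signal)
    energies = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))   # energy of detail D_k
    energies.append(sum(a * a for a in approx))       # energy of approximation A_levels
    total = sum(energies)
    if total == 0.0:
        return [0.0] * len(energies)
    return [e / total for e in energies]

# Hypothetical test signal: a sinusoid with one impulse every 64 samples.
signal = [math.sin(2.0 * math.pi * k / 32.0) + (1.0 if k % 64 == 0 else 0.0)
          for k in range(256)]
features = dwt_energy_distribution(signal, levels=4)  # five-element feature vector
```

Because the Haar transform is orthonormal, the five relative energies sum to one, so the vector can be passed directly to a classifier as a normalised feature set.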
A diagnostic methodology for artificial defects in a single-stage gearbox operating under various load levels and different defect states was proposed by (Loutas et al., 2010), based on vibration recordings as well as advanced signal analysis techniques. Two different wavelet-based signal processing methodologies, using the DWT as well as the CWT, were utilized for the analysis of the recorded vibration signals, and useful diagnostic information was extracted from them.
DWT was applied with "db10" and 10-level decomposition, whilst CWT was applied with the Morlet wavelet (bandwidth parameter and wavelet center frequency set at 1 and 1.5, respectively). Averaging across all scales was utilized instead of time synchronous averaging, giving very characteristic scalograms for each artificial defect case. A novel method incorporating customized (i.e., signal-based) multiwavelet lifting schemes with sliding window de-noising was proposed in (Yuan et al., 2010). On the basis of Hermite spline interpolation, various vector prediction and update operators with the desirable properties of biorthogonality, symmetry, short support and vanishing moments were constructed. The minimum entropy principle was recommended to determine the optimal vector prediction and update operators in the lifting scheme, by means of measuring the sparsity. Due to the periodic characteristics of gearbox vibration signals, sliding window de-noising, which favors retaining as much valuable information as possible, was employed to extract and identify the fault features in gearbox signals. Experimental validations, including simulation experiments, gear fault diagnosis and normal gear detection, proved the effectiveness of the multiwavelet lifting schemes as compared to various conventional wavelets. In (Saravanan and Ramachandran, 2010) the vibration signals monitored at a bevel gearbox in various operating and fault conditions were processed with DWT. Wavelet features were extracted for all the wavelet coefficients and for all the signals using the Daubechies wavelets "db1" to "db15". An ID3 Decision Tree was used for feature selection and artificial neural networks were employed for the classification of various faults of the gearbox. The feature selection of various discrete wavelets was carried out and the wavelet having the highest average efficiency of fault classification was chosen as the most appropriate. In (Rafiee et al., 2010) vibration signals recorded from two experimental
set-ups were processed for gear and bearing conditions. Four statistical features were selected: standard deviation, variance, kurtosis, and fourth central moment of continuous wavelet coefficients of synchronized vibration signals (CWC-SVS). An automatic feature extraction algorithm was introduced for gear and bearing defects. It was also shown that the fourth central moment of CWC-SVS is a proper feature for both bearing and gear failure diagnosis. Standard deviation and variance of CWC-SVS gave a more appropriate outcome for bearings than for gears. Kurtosis of CWC-SVS showed acceptable performance for gears only. (Wang et al., 2010) proposed a technique to provide accurate diagnosis of gearboxes under fluctuating load conditions. The residual vibration signal, i.e. the difference of the time synchronously averaged signal from the average tooth-meshing vibration, was analyzed as source data due to its lower sensitivity to the alternating load condition. The complex Morlet continuous wavelet transform was used for the vibration signal processing. A fault growth parameter (FGP) was introduced, based on the continuous wavelet transform amplitudes over all transform scales. FGP actually measures the relative CWT amplitude change. This parameter proved insensitive to varying load and can correctly indicate early gear faults. Other features such as kurtosis, mean, variance, form factor and crest factor, both of the residual signal and of the mean amplitude of the continuous wavelet transform waveform, were also checked and proved to be influenced by the changing load. The effectiveness of the proposed fault indicator was demonstrated using a full lifetime vibration data history obtained under sinusoidally varying load.
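The four statistical features listed above for the CWC-SVS can be computed from any vector of wavelet coefficients. The following is a minimal sketch, assuming population (biased) moments and non-excess kurtosis defined as the fourth central moment divided by the squared variance; these conventions and the function name are assumptions rather than details given in the cited work, and the CWT and synchronization steps are not shown.

```python
import math

def coefficient_statistics(coeffs):
    """Standard deviation, variance, kurtosis and fourth central moment
    of a coefficient vector, using population (1/n) moments."""
    n = len(coeffs)
    if n == 0:
        raise ValueError("need at least one coefficient")
    mean = sum(coeffs) / n
    variance = sum((c - mean) ** 2 for c in coeffs) / n
    fourth = sum((c - mean) ** 4 for c in coeffs) / n
    # Kurtosis is undefined for a constant vector (zero variance).
    kurtosis = fourth / variance ** 2 if variance > 0.0 else float("nan")
    return {
        "std": math.sqrt(variance),
        "variance": variance,
        "kurtosis": kurtosis,
        "fourth_central_moment": fourth,
    }

# Hypothetical coefficient vector with one dominant, impulse-like value.
stats = coefficient_statistics([0.0, 0.0, 0.0, 4.0])
```

An impulse-like coefficient vector raises both the fourth central moment and the kurtosis relative to a vector of evenly spread values, which is the property such features exploit for localized faults.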
To overcome the shift-variance deficiency of the classical DWT, a novel fault diagnosis method based on the redundant second generation wavelet packet transform was proposed in (Zhou et al., 2010). Initially, the redundant second generation wavelet packet transform (RSGWPT) was constructed on the basis of the second generation wavelet transform and the redundant lifting scheme. Then, the vibration signals were decomposed by RSGWPT and the faulty features were extracted from the resultant wavelet packet coefficients. In the end, the extracted fault features were given as input to classifiers for identification/classification. The proposed method was applied to the fault diagnosis of a gearbox and of gasoline engine valve trains. Test results indicated that a better classification performance can be obtained with the proposed fault diagnosis method than with the conventional second generation wavelet packet transform method. (Wang et al., 2010) employed the dual-tree complex wavelet transform (DTCWT) for the de-noising of vibration signals from gearbox and bearing monitoring. They compared the de-noising via DTCWT with other wavelet-based techniques (DWT and second generation wavelet transform (SGWT)) as well as with the fast kurtogram. The results were evaluated through the kurtosis calculated for each signal after the de-noising. The NeighCoeff shrinkage scheme was applied in all wavelet-based cases. De-noised results of signals collected from a gearbox with a tooth crack showed that the DTCWT-based de-noising approach yielded more promising results than the SGWT- and DWT-based methods, and that it can effectively remove the noise while retaining as much valuable information as possible. In the case of multiple feature detection, diagnosis results of rolling element bearings with combined faults and of actual industrial equipment confirmed that the proposed DTCWT-based method is powerful and consistently outperformed the widely used SGWT and fast kurtogram. (Loutas et al.,
2011a) conducted multi-hour tests on healthy gears in a single-stage gearbox. Three on-line monitoring techniques were implemented in the tests. Vibration and acoustic emission recordings, in combination with data coming from oil debris monitoring (ODM) of the lubricating oil, were utilized in order to assess the condition of the gears. A plethora of parameters/features were extracted from the acquired waveforms via conventional (in time and frequency domain) and non-conventional (wavelet-based) signal processing techniques. DWT was utilized to process vibration and AE signals with the "db10" mother wavelet and 10 levels of decomposition. The energy and entropy of the wavelet levels were used as features. Data fusion was accomplished by integrating the most representative of the extracted features from all three measurement technologies in a single data matrix. Principal component analysis (PCA) was utilized to reduce the dimensionality of the data matrix, whereas independent component analysis (ICA) was further applied to identify the independent components among the data and correlate them to different damage modes of the gearbox. (Miao and Makis, 2011) presented an on-line fault classification system with an adaptive model re-estimation algorithm. The machinery condition is identified by selecting the HMM which maximizes the probability of a given observation sequence. The proper selection of the observation sequence is a key step in the development of an HMM-based classification system. In that work, the classification system was validated using observation sequences based on the wavelet modulus maxima distribution obtained from real vibration signals, which had been proved effective in fault detection in previous research. (Li et al., 2011) utilized the Hermitian wavelet to diagnose the gear localized crack fault. The complex Hermitian wavelet is constructed based on the first and the second derivatives of the Gaussian function to detect signal singularities. The
Fourier spectrum of the Hermitian wavelet is real; therefore, the Hermitian wavelet does not affect the phase of a signal in the complex domain. This gives a desirable ability to extract the singularity characteristic of a signal precisely. The proposed method is based on the Hermitian wavelet amplitude and phase maps of the time-domain vibration signals. Hermitian wavelet amplitude and phase maps were used to evaluate healthy and cracked gears.
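The statement that the Fourier spectrum of the Hermitian wavelet is real can be checked numerically. The sketch below assumes one common form of the wavelet, with the (negated) second Gaussian derivative as the even real part and the first Gaussian derivative, up to sign, as the odd imaginary part; the exact normalization and signs used by (Li et al., 2011) are not given here, so this form and all function names are assumptions.

```python
import cmath
import math

def hermitian_wavelet(t):
    """Assumed form: psi(t) = (1 - t^2) exp(-t^2/2) + i t exp(-t^2/2).
    The real part is even and the imaginary part is odd in t."""
    g = math.exp(-t * t / 2.0)
    return complex((1.0 - t * t) * g, t * g)

def fourier_transform(omega, half_points=400, dt=0.02):
    """Riemann-sum approximation of the integral of psi(t) exp(-i omega t) dt
    on a grid that is symmetric about t = 0."""
    total = 0j
    for k in range(-half_points, half_points + 1):
        t = k * dt
        total += hermitian_wavelet(t) * cmath.exp(-1j * omega * t)
    return total * dt

spectrum = [fourier_transform(w) for w in (0.5, 1.0, 2.0, 4.0)]
max_imag = max(abs(z.imag) for z in spectrum)  # vanishes up to rounding error
```

The imaginary parts cancel because psi(-t) equals the complex conjugate of psi(t). For this assumed form the transform evaluates analytically to (omega^2 + omega) sqrt(2 pi) exp(-omega^2 / 2), which is real for every omega.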

Bearings
The fault diagnosis of rolling element bearings is very important for improving mechanical system reliability and performance in rotating machinery, as bearing failures are among the most frequent causes of breakdowns in rotating machinery. When a localized fault occurs in a bearing, periodic or non-periodic impulses appear in the time domain of the vibration signal, and the corresponding bearing characteristic frequencies (BCFs) and their harmonics emerge in the frequency domain. However, in the early stage of bearing failures, the BCFs usually carry very little energy and are often suppressed/hidden by noise and higher-level macro-structural vibrations. Consequently, an effective signal processing method is of utmost importance for the de-noising of the acquired vibration or acoustic emission signals or for the extraction of damage-sensitive features during the condition monitoring of bearings. Wavelet-based techniques meet this challenge in a variety of applications presented in the following. (Purushotham et al., 2005) applied the DWT towards the detection of localized bearing defects. The vibration signals were decomposed up to 4 levels using the "db2" mother wavelet. The complex cepstral coefficients for wavelet-transformed time windows at Mel-frequency scales constituted the features that trained Hidden Markov Models for the fault detection and classification.
In (Yan and Gao, 2005) the Discrete Harmonic Wavelet Packet Transform (DHWPT) was used to decompose the vibration signals measured from a bearing test bed into a number of frequency sub-bands. Given the harmonic wavelet packet coefficients of a vibration signal x(t), the energy feature in each sub-band was calculated. The key features were then used as inputs to neural network classifiers for assessing the system's health status. Compared to the conventional approach, where statistical parameters from raw vibration signals are used, the presented approach enables higher signal-to-noise ratios and, consequently, more effective and intelligent use of the available sensor information, leading to more accurate system health evaluation. (Qiu et al., 2006) assessed the performance of wavelet decomposition-based de-noising versus wavelet filter-based de-noising methods on signals from mechanical defects. The comparison revealed that the wavelet filter is more suitable and reliable for detecting a weak signature of mechanical impulse-like defect signals, whereas the wavelet decomposition de-noising method can achieve satisfactory results on smooth signal detection. In order to select optimal parameters for the wavelet filter, a two-step optimization process was proposed.
Minimal Shannon entropy was used to optimize the Morlet wavelet shape factor. A periodicity detection method based on singular value decomposition (SVD) was then used to choose the appropriate scale for the wavelet transform. The experimental results verified the effectiveness of the proposed method.
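The first step of this optimization, choosing the Morlet shape factor by minimal Shannon entropy, can be sketched as follows. The kernel form exp(-beta^2 t^2 / 2) cos(2 pi f0 t), the candidate grid, the synthetic signal and all function names are assumptions made for illustration; they are not taken from (Qiu et al., 2006), and the SVD-based scale selection is not shown.

```python
import math
import random

def shannon_entropy(coeffs):
    """Shannon entropy (natural log) of the normalised squared coefficients."""
    energies = [c * c for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energies if e > 0.0)

def morlet_kernel(beta, f0, fs, half_width):
    """Real Morlet-type wavelet with shape factor beta, sampled at rate fs."""
    return [math.exp(-((beta * k / fs) ** 2) / 2.0) * math.cos(2.0 * math.pi * f0 * k / fs)
            for k in range(-half_width, half_width + 1)]

def wavelet_filter(signal, kernel):
    """Correlate the signal with the kernel, zero-padding at the edges."""
    h, n = len(kernel) // 2, len(signal)
    return [sum(signal[i + j - h] * w for j, w in enumerate(kernel) if 0 <= i + j - h < n)
            for i in range(n)]

def select_shape_factor(signal, betas, f0, fs, half_width=64):
    """Return the candidate beta whose filtered output has minimal entropy."""
    scores = {b: shannon_entropy(wavelet_filter(signal, morlet_kernel(b, f0, fs, half_width)))
              for b in betas}
    return min(scores, key=scores.get), scores

# Hypothetical signal: decaying 100 Hz bursts every 128 samples plus seeded noise.
rng = random.Random(0)
fs, f0 = 1000.0, 100.0
signal = [math.exp(-0.05 * (k % 128)) * math.sin(2.0 * math.pi * f0 * k / fs)
          + 0.2 * rng.gauss(0.0, 1.0) for k in range(512)]
best_beta, scores = select_shape_factor(signal, [50.0, 100.0, 200.0, 400.0], f0, fs)
```

A lower entropy means the filtered energy is concentrated in fewer samples, which is the sparsity criterion behind the selection; a single spike gives zero entropy and a flat output gives the maximum, log(n).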
(Abbasion et al., 2007) studied the condition of an electric motor with two rolling bearings (one next to the output shaft and the other next to the fan) with one normal state and three faulty states each. De-noising via the CWT (Meyer wavelet) was conducted and support vector machines (SVMs) were used for the fault classification task. Results showed 100% accuracy in fault detection. (Ocak et al., 2007) developed a new scheme based on wavelet packet decomposition and hidden Markov modeling (HMM) for the condition monitoring of bearing faults. In this scheme, vibration signals were decomposed into wavelet packets and the node energies of the 3-level decomposition tree were used as features. Based on the features extracted from normal bearing vibration signals, an HMM was trained to model the normal bearing operating condition. The probabilities of this HMM were then used to track the condition of the bearing. In (Zarei and Poshtan, 2007) WPT was used to process stator current signals in order to detect defective bearings in induction motors. The discrete Meyer wavelet was used to decompose the recorded signals in three levels. The defect frequency region was determined, and the coefficient energies in the related nodes were calculated. In comparison with the healthy condition, the energy was found to increase in the nodes related to the defect frequency regions; therefore it was used as a diagnostic parameter. (Hu et al., 2007) introduced a methodology for fault diagnosis based on the improved wavelet package transform (IWPT), a distance evaluation technique and a support vector machines (SVMs) ensemble. Their method consists of three stages. Firstly, by investigating the features of impact faults in vibration signals, a biorthogonal wavelet with impact property is constructed via the lifting scheme, and the IWPT is carried out for feature extraction from the raw vibration signals. Then, the faulty features can be detected by envelope spectrum analysis of the wavelet package
coefficients of the most salient frequency band. Secondly, with the distance evaluation technique, the optimal features are selected from the statistical characteristics of the raw signals and wavelet package coefficients, and from the energy characteristics of the decomposition frequency bands. Finally, the optimal features are input into the SVMs in order to identify the different abnormal cases. The proposed method was applied to the fault diagnosis of rolling element bearings, and testing results showed that the SVMs ensemble can reliably separate different fault conditions and identify the severity of incipient faults. (Lei et al., 2009) suggested a method relying on the wavelet packet transform (WPT) and empirical mode decomposition (EMD) to preprocess vibration signals and extract fault characteristic information from them. Each of the raw vibration signals was decomposed with "db10" WPT at level 3. From a plethora of features extracted at each sub-band, the most relevant ones were selected via distance evaluation techniques and forwarded into a radial basis function (RBF) network to automatically identify different faults (inner race, outer race, roller) in rolling element bearings. A novel health index based on wavelet decomposition, called the frequency spectrum growth index (FSGI), was presented in (Wang et al., 2009) to detect the health condition of gears. The "db9" mother wavelet was chosen for signal decomposition and the maximum wavelet decomposition level was 4.
In order to evaluate the performance of the proposed FSGI index, various wavelets at various decomposition levels were tested. The results obtained proved that FSGI is insensitive to the selection of wavelet type and decomposition level. Three sets of vibration data from a mechanical diagnostics test bed were collected and analyzed in order to validate the method. An anti-aliasing lifting scheme was applied by (Bao et al., 2009) to analyze vibration signals measured from faulty ball bearings, and testing results confirmed that the proposed method is effective for extracting weak fault features from a complex background. The simple lifting scheme (or 2nd generation wavelet transform) was altered by discarding the split and merge operations and modifying the prediction and update operators accordingly, significantly improving the frequency aliasing issue. Testing results showed that the anti-aliasing lifting scheme performs better than the lifting scheme and the redundant lifting scheme in terms of increasing the accuracy of classification algorithms (ANNs or SVMs) on faulty bearing signals. (Yuan et al., 2009) introduced a new method based on adaptive multi-wavelets via two-scale similarity transforms (TSTs). TSTs are simple methods to construct new biorthogonal multi-wavelets with the properties of symmetry, short support and vanishing moments. Based on the kurtosis maximization principle, adaptive multi-wavelets were designed to match the transient faults in rotating machinery. Genetic algorithms (GAs) were applied to select the optimal multi-wavelets and the method was used to successfully diagnose bearing outer-race faults. (Zhu et al., 2009) introduced a new method that combines the CWT (through the Morlet wavelet) and the Kolmogorov-Smirnov test to detect transients contained in the vibration signals from a gearbox as well as from faulty bearings. CWT initially decomposed the time-domain vibration signals into a two-dimensional time-scale plane. By removing the Gaussian noise coefficients
at all scales in the time-scale plane and then applying the inverse CWT to the noise-reduced wavelet coefficients, the signal transients in the time domain were evaluated, thus aiding the difficult task of effective and reliable fault identification. A new robust method relying on the improved wavelet packet decomposition (IWPD) and support vector data description (SVDD) was proposed in (Pan et al., 2009). Node energies of IWPD were used to compose feature vectors. Based on feature vectors extracted from normal signals, an SVDD model fitting a tight hypersphere around them is trained, with the general distance of test data to this hypersphere being used as the health index. IWPD is based on the second generation wavelet transform (SGWT) realized by the lifting scheme. SVDD is an excellent method of one-class classification, with the advantages of robustness and high computational efficiency. A methodology developed on the combination of these two methods for bearing performance degradation proved effective and reliable when applied to vibration signals from a bearing accelerated life test. (Feng et al., 2009) introduced the normalized wavelet packet quantifiers as a new feature set for the detection and diagnosis of localized bearing defects and contamination faults. The "Wavelet packets relative energy" measures the normalized energy of the wavelet packet node; the "Total wavelet packets entropy" measures how the normalized energies of the wavelet packet nodes are distributed in the frequency domain; the "Wavelet packets node entropy" describes the uncertainty of the normalized coefficients of the wavelet packet node. Unlike the conventional feature extraction methods, which use the amplitude of wavelet coefficients, these new features were derived from probability distributions and are more robust for diagnostic applications. Acoustic Emission signals from faulty bearings of rotating machines were recorded and the new features were calculated via WPT and Daubechies mother wavelets ("db1-db10"). Their study showed that both localized defects and advanced contamination faults can be successfully detected and diagnosed if the appropriate feature is chosen. The Bayesian classifier was also used to quantitatively analyze and evaluate the performance of the proposed features. They also showed that reducing the Daubechies wavelet order or the length of the signal segment generally increases the classification rate. (Hao and Chu, 2009) presented a novel morphological undecimated wavelet (MUDW) decomposition scheme for fault diagnostics of rolling element bearings. The MUDW scheme was developed based on the morphological wavelet (MW) theory and was applied for both the extraction of impulse components and de-noising. The efficiency of the MUDW was assessed using simulated data as well as monitored vibration signals from a bearing test rig. (Hong and Liang, 2009) presented a new version of the Lempel-Ziv complexity as a bearing fault (single point) severity measure based on the continuous wavelet transform (CWT). The CWT (realized with the Morlet wavelet) was used to identify the best scale where the fault resides and to eliminate the interference of noise and irrelevant signal components as much as possible.
Next, the Lempel-Ziv complexity values were calculated for both the envelope and the high-frequency carrier signal obtained from the wavelet coefficients at the best scale level. As the noise and other unrelated signal components have been removed, the Lempel-Ziv complexity value is contributed mostly by the bearing system and hence can be reliably used as a bearing fault measure. The applications to bearing inner- and outer-race fault signals demonstrated that the proposed methodology can effectively measure the severity of both inner- and outer-race faults. (Xian, 2010) presented a combined discrete wavelet transform (DWT) and support vector machine (SVM) technique for mechanical failure classification of spherical roller bearings in a high-performance hydraulic injection molding machine. The proposed technique preprocesses the mechanical failure vibration signal samples using the discrete wavelet transform with the 'db2' mother wavelet at the fourth level of decomposition. The energies of the approximation and the details were calculated and populated the feature vectors used to train the support vector machine that was built for classifying the mechanical failure types of the spherical roller bearings. In (Yan and Gao, 2010) the generalized harmonic wavelet transform (HWT) was used to enhance the signal-to-noise ratio for effective machine defect identification in rolling bearings that contained different types of structural defects. In the harmonic wavelet transform a series of sub-frequency band wavelet coefficients is constructed by choosing different harmonic wavelet parameter pairs. The energy and entropy associated with each sub-frequency band are then calculated. The filtered signal is obtained by choosing the wavelet coefficients whose corresponding sub-frequency band has the highest energy-to-entropy ratio. Experimental studies using rolling bearings that contain different types of structural defects 
have confirmed that the new technique achieves a high signal-to-noise ratio for effective machine defect identification. (Su et al., 2010) developed a new autocorrelation enhancement algorithm comprising two elements: autocorrelation and an extended Shannon function. This method does not require the selection of a threshold, can be implemented automatically, and is realized in several stages. First, to eliminate the frequencies associated with interfering vibrations, the vibration signal is filtered with a band-pass filter determined by a Morlet wavelet whose parameters are optimized by a genetic algorithm.
Then, the envelope of the autocorrelation function of the filtered signal is calculated. Finally, the enhanced autocorrelation envelope power spectrum is obtained. The method was applied to a simulated signal and to real bearing vibration signals under various conditions, namely normal, inner-race fault and outer-race fault. Only a few single spectral lines remain in the enhanced autocorrelation envelope power spectrum. The spectral line with the largest amplitude corresponds to the bearing fault frequency for a defective bearing, while it corresponds to the shaft rotational frequency for a normal bearing. (Huang et al., 2010) utilized the lifting-based second generation wavelet packet transform to process vibration signals from a rolling element bearing test. The wavelet packet energy was calculated from the coefficients at the n-th node of the wavelet packet. This corresponds to the energy of the coefficients in a certain frequency band. Normalization is applied to minimize possible bias due to the different ranges of the wavelet packet energies. The fuzzy c-means method was used to assess the bearing performance and classify the faulty and the healthy recordings. In (Pan et al., 2010) a new method based on lifting wavelet packet decomposition and fuzzy c-means for bearing performance degradation assessment was proposed. Vibration signals recorded during run-in tests up to bearing failure were processed with the lifting wavelet packet. Feature vectors composed of node energies were constructed and fed into a fuzzy c-means expert system for the classification of healthy, degraded and failed bearings. (He et al., 2010) proposed a hybrid method which combines a Morlet wavelet filter and sparse code shrinkage (SCS) to extract the impulsive features buried in the vibration signal. Initially, the parameters of the Morlet wavelet filter (center frequency and bandwidth) are optimized by differential evolution (DE) in order to eliminate the interfering vibrations and obtain the 
fault characteristic signal. Then, to further enhance the impulsive features and suppress residual noise, SCS, which is a soft-thresholding method based on maximum likelihood estimation (MLE), is applied to the filtered signal. The results of simulated experiments and real bearing vibration signals verify the effectiveness of the proposed method in extracting impulsive features from noisy signals in condition monitoring. (Chiementin et al., 2010) studied the effect of wavelet de-noising and other techniques on acoustic emission signals from faulty bearings. They applied the DWT and attempted to optimize the selection of the various parameters involved in a wavelet-based de-noising scheme.
They assessed the different de-noising techniques and concluded that the wavelet approach enhanced the signal kurtosis and crest factor more than the other techniques.
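The wavelet packet relative energy and total entropy quantifiers described earlier in this section can be illustrated with a minimal sketch. It is not the implementation of (Feng et al., 2009): it assumes the Haar ("db1") wavelet only, a hand-written wavelet packet split, the natural logarithm, and a synthetic test signal; the function names are hypothetical.

```python
import math

def haar_step(x):
    """One orthonormal Haar (db1) analysis step on an even-length list."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def wavelet_packet_nodes(x, level):
    """Split every node at every level; nodes come out in natural (filter-bank) order."""
    nodes = [list(x)]
    for _ in range(level):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes

def relative_energies(nodes):
    """Normalized node energies p_j = E_j / sum(E); they sum to 1."""
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies]

def total_entropy(p):
    """Shannon entropy of the normalized node energies (natural logarithm)."""
    return -sum(pj * math.log(pj) for pj in p if pj > 0.0)

# Assumed test signal: a tone plus periodic impulses.
# The length must be divisible by 2**level for this simple split.
n = 256
signal = [math.sin(2 * math.pi * 8 * k / n) + (1.0 if k % 64 == 0 else 0.0)
          for k in range(n)]
nodes = wavelet_packet_nodes(signal, level=3)
p = relative_energies(nodes)
H = total_entropy(p)
print([round(pj, 4) for pj in p], round(H, 4))
```

Because the relative energies form a probability distribution, the entropy is bounded between 0 (all energy in one node) and log(8) for the eight nodes of a three-level tree, which is what makes such features comparable across recordings of different amplitude.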

Motors
Electrical and hydraulic motors as well as internal combustion engines are the dominant applications in the related literature. (Chen et al., 2006) (Daviu et al., 2007) employed wavelet analysis on the stator startup currents in order to detect the presence of dynamic eccentricities in an induction motor. For this purpose, the DWT was applied to the monitored stator startup current signals. The approximation and details were obtained after the DWT decomposition via the "db44" wavelet and 8 levels of analysis. The relative increment in the level energy of the wavelet coefficients was used as a quantitative indicator of the degree of severity of the fault. In (Chen et al., 2007) a novel method to process the vibration signals was presented for the fault diagnosis of water hydraulic motors. De-noising was initially conducted by thresholding in the wavelet domain and inversely transforming the de-noised wavelet coefficients. Feature extraction based on the second-generation wavelet transform of the vibration signals followed. The statistical probability distributions of the mean, variance and the second-order statistical moment of the scaling coefficients at the first, second and third scales were calculated and used to classify the different piston conditions. (Chendong et al., 2007) proposed a new sliding window feature extraction method based on the lifting scheme for extracting transient impacts from signals. A sliding window, designed according to the revolution cycle of the rotating machinery, is applied to process the detail signals. By extracting modulus maxima from these windows, fault features and their locations in the original signals were revealed.
An incipient impact fault caused by axis misalignment, mass imbalance and a broken-bush fault have been successfully detected by using the proposed approach. In (Peng et al., 2007) the wavelet transform modulus maxima (WTMM) method was used to calculate the Lipschitz exponents of the vibration signals with different faults. The Lipschitz exponent can give a quantitative description of the signal's singularity. The proposed singularity-based parameters proved to be a set of excellent diagnostic features, which could separate the four kinds of faults very well. The results showed that, as the fault severity increased, the vibration signals' singularities and singularity ranges increased as well, and therefore one could evaluate the fault severity by measuring the vibration signals' singularities and singularity ranges. Then the kurtosis of the de-noised signal was calculated and finally the KS test was used to classify the kurtosis statistical probability distribution (SPD) under seven different piston conditions. Thus the piston condition in the water hydraulic motor was successfully assessed. (Widodo and Yang, 2008) introduced an intelligent system for fault detection and classification of induction motors using wavelet support vector machines (W-SVMs). W-SVMs were built by constructing the kernel function from wavelets. Transient current signals were monitored in various damage conditions of the induction motor. The acquired signals were preprocessed through the DWT ("db5", 5 levels) and various statistical features were extracted. Principal component analysis (PCA) and kernel PCA were utilized to reduce the dimension of the features and to extract the features useful for the classification process. Finally, the classification process for diagnosing the faults was carried out using W-SVMs and conventional SVMs based on one-against-all multi-class classification. (Wu and Liu, 2009) proposed a fault diagnosis system for internal combustion engines using wavelet packet transform (WPT) and 
artificial neural network (ANN) techniques on monitored sound emission signals. In the preprocessing phase, the WPT coefficients are used: their entropy is calculated and treated as the input to the ANN in order to distinguish the various fault conditions. "db4", "db8" and "db20" from the Daubechies family were used as mother wavelets, with none of them showing a clear advantage in ANN performance. (Lin et al., 2010) utilized vibration measurements to distinguish effectively between aligned and misaligned motors. The proposed method calculates the difference between the multiscale entropy (MSE) of the original vibration signal and that of the signal after it is de-noised by the wavelet transform. This study presents a novel use of the multiscale entropy technique by comparing the difference of the sample entropy of a signal before and after the signal is de-noised using the wavelet transform. De-noising was performed using the Daubechies wavelet transform, which was implemented with the Matlab wavelet function with the following parameter settings: threshold type "rigrsure"; number of decomposition levels 4; mother wavelet "db4". (Cusido et al., 2010) monitored motor current for fault diagnosis in induction machines. The power detail density (PDD) function resulting from a wavelet transformation has proven to be one of the best methods for motor fault estimation under variable load. The power detail density was calculated as the squares of the coefficients of one detail. (Wang and Jiang, 2010) utilized an adaptive wavelet de-noising scheme combining the advantages of both hard and soft thresholding, to de-noise vibration signals from the aircraft engine rotor experimental test rig by block to light rub-impact rotational plate.
After the de-noising procedure, the correlation dimension of the vibration signal is computed, and is used as the characteristic feature for identifying the fault deterioration grade.
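For reference, the two classical shrinkage rules that such adaptive schemes combine can be written down in a few lines. The sketch below shows hard and soft thresholding plus one simple compromise rule; the compromise function and its parameter `alpha` are illustrative assumptions and are not the threshold function of (Wang and Jiang, 2010), which is not specified here.

```python
import math

def hard_threshold(c, t):
    """Keep a coefficient unchanged if its magnitude exceeds t, else set it to zero."""
    return c if abs(c) > t else 0.0

def soft_threshold(c, t):
    """Zero small coefficients and shrink the remaining ones towards zero by t."""
    if abs(c) <= t:
        return 0.0
    return math.copysign(abs(c) - t, c)

def blended_threshold(c, t, alpha=0.5):
    """Hypothetical compromise: shrink by alpha*t (alpha=0 gives hard, alpha=1 gives soft)."""
    if abs(c) <= t:
        return 0.0
    return math.copysign(abs(c) - alpha * t, c)

coefficients = [3.0, -0.4, 0.9, -2.5]
print([hard_threshold(c, 1.0) for c in coefficients])
print([soft_threshold(c, 1.0) for c in coefficients])
print([blended_threshold(c, 1.0) for c in coefficients])
```

Hard thresholding preserves the amplitude of the retained coefficients but is discontinuous at the threshold, whereas soft thresholding is continuous but biases large coefficients; an intermediate rule trades one property against the other.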
(Ece and Basaran, 2011) applied wavelet packet decomposition (WPD) to supply-side current signals for the condition monitoring of induction motors with adjustable speed and load levels. In this work, the acquired data, sampled at 20 kHz, are analyzed using an 11-level WPD. In this way, the coefficients of three nodes at the 11th level corresponding to 43.Hz that cover the region of both side-bands as well as the 50 Hz fundamental, are obtained. Using the coefficients of each resulting node, 5 statistical features (i.e. mean, variance, standard deviation, skewness, and kurtosis) are calculated, resulting in 15-element feature vectors. (Konar and Chattopadhyay, 2011) employed a hybrid CWT-Support Vector Machine approach (CWT-SVM) to analyze the frame vibrations of healthy and faulty induction motors during start-up. Various mother wavelets were utilized in the implementation of the CWT. The 'Morlet' and 'db10' wavelets were found to be the best choices and were used throughout the study. Three statistical features (i.e. root mean square (RMS), crest and kurtosis values) were calculated from the CWT coefficients for each loading condition and formed the input to the SVM used to classify between healthy and faulty states. In (Anami et al., 2011), a methodology to determine the health condition of motorcycles, based on the discrete wavelet transform (DWT) of sound measurements, is proposed. The 1-D central contour moments and invariant contour moments of the approximation coefficients of the DWT form the feature vectors corresponding to various health states. The sound samples are subjected to wavelet decomposition using Daubechies 'db4' wavelets. The decomposition into approximation and detail coefficients is carried out for the first 14 levels. The feature vector comprises four 1D central contour moments (l2;l3; l4 and l5) and their four invariants (F1; F2; F3 and F4) computed on the approximation coefficients of a wavelet sub-band. A dynamic time warping (DTW) classifier along with the Euclidean distance measure is
successfully used for the classification of the feature vectors.
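The relation between sampling rate, decomposition level and node bandwidth in a wavelet packet decomposition, as used in the 11-level WPD of 20 kHz current signals discussed above, can be checked with a short calculation. The sketch assumes ideal (brick-wall) filters and frequency-ordered nodes; the helper name `wp_node_band` is hypothetical, and the selection of the node containing 50 Hz plus its two neighbours is an illustration rather than the node choice reported by the authors.

```python
def wp_node_band(fs, level, index):
    """Frequency band (Hz) of node `index` at `level`, assuming frequency-ordered nodes."""
    width = fs / 2.0 / (2 ** level)
    return index * width, (index + 1) * width

fs = 20_000.0                       # sampling frequency in Hz (from the text)
level = 11                          # decomposition depth (from the text)
width = fs / 2.0 / 2 ** level       # bandwidth of each of the 2**11 terminal nodes
centre = int(50.0 // width)         # frequency-ordered node containing the 50 Hz fundamental
bands = [wp_node_band(fs, level, i) for i in (centre - 1, centre, centre + 1)]

print(width)    # bandwidth per node in Hz
print(bands)    # bands of the node containing 50 Hz and of its two neighbours
```

With five statistical features per node, three such nodes give the 15-element feature vector mentioned in the text.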

Tool wear
Tool condition monitoring is a very interesting industrial application. (Velayudham et al., 2005) used the wavelet packet transform to study the condition of the drill during drilling of a glass/phenolic composite under acoustic emission (AE) monitoring. The energy of the wavelet packet was taken as the criterion for the selection of feature packets. The AE signals were decomposed into four levels, that is, split into 16 wavelet packets. Each wavelet packet corresponds to a frequency band, ranging from 0-156.25 to 2343.75-2500 kHz. Of the 16 resulting packets, it is necessary to select those (the feature packets) that contain useful information. Based on the energy in each packet, those with the maximum energy were selected. The monitoring index extracted from the wavelet coefficients of the highest-energy packets could reliably detect the condition of the tool. (Shao et al., 2011) utilized a modified blind source separation (BSS) technique to separate source signals in the milling process. A single-channel BSS method based on the wavelet transform and independent component analysis (ICA) was developed, and source signals related to the milling cutter and the spindle were separated from a single-channel power signal. The experiments with different tool conditions illustrate that the separation strategy is robust and promising for cutting process monitoring. In (Liao et al., 2007) a wavelet-based methodology for grinding wheel condition monitoring based on acoustic emission (AE) signals was presented. Features were then extracted from each raw AE signal segment using the DWT via "db1" and 12 levels of analysis. An adaptive genetic clustering algorithm was finally applied to the extracted features in order to distinguish between different states of the grinding wheel condition. (Li et al., 2005) utilized the DWT to recognize the tool wear states in automatic machining processes. The wavelet coefficients d(j, k) of the cutting force signals were calculated after the application of the DWT. 
The d(5,k) coefficients proved sensitive and able to identify the different tool wear states and different cutting conditions. (Velayudham et al., 2005) used the WPT in order to characterize the acoustic emission signals released from a glass/phenolic polymeric composite during drilling. In their work, the energy of the wavelet packets was taken as the criterion for the selection of feature packets, with those of higher energy assumed to contain the characteristic features of the signal. The results showed that the monitoring indices selected from the wavelet packet coefficients were capable of detecting the drill condition effectively. (Borghetti et al., 2006) proposed a methodology based on the continuous wavelet transform (CWT) for the analysis of voltage transients due to line faults, and discussed its application to fault location in power distribution systems. The analysis showed that a correlation exists between typical frequencies of the CWT-transformed signals and specific paths in the network covered by the traveling waves originated by the fault. (Belotti et al., 2006) presented a diagnostic tool, based on the DWT, for the detection of wheel-flat defects of a test train at different speeds. The DWT was applied to the rail acceleration signals via the "db4" wavelet and a 10-level decomposition. The results, achieved after an exhaustive experimental campaign, validated the effectiveness of the diagnostic tool. (Xu and Li, 2007) utilized oil spectrometric data from air compressors. In the first stage, de-noising of the original signals through the WPT ("db4", 3 levels) and the "rigrsure" thresholding strategy was conducted. Then decomposition of the de-noised signal through the DWT (with "db1") followed. The variances of the approximation coefficients and detail coefficients at level 1 were calculated. In the last stage the improved three-line method was adopted to ascertain decisive criteria for the wear condition. The ability of the proposed method to classify and recognize wear patterns was 
verified. (Monsef and Lotfifard, 2007) presented a novel approach for differential protection of power transformers. The DWT ("db9", 7 levels) and an adaptive network-based fuzzy inference system (ANFIS) were utilized to discriminate internal faults from inrush currents. The proposed method was designed based on the differences between the amplitudes of the wavelet transform coefficients in a specific frequency band generated by faults and by inrush currents. The ability of the new method was demonstrated by simulating various cases on a typical power system. The algorithm was also tested off-line using data collected from a prototype laboratory three-phase power transformer. The test results confirm the effectiveness and reliability of the proposed algorithm. (Dong and He, 2007) proposed a methodology for the condition monitoring of hydraulic pumps. The collected vibration signals were processed using the wavelet packet transform with the "db10" wavelet and five decomposition levels. The wavelet coefficients obtained by the wavelet packet decomposition were used as the inputs to hidden Markov and hidden semi-Markov models for the classification of the various fault signals. The performance of the two methods was assessed, with the hidden semi-Markov models achieving higher classification rates. (Carneiro et al., 2008) presented an approach for incipient fault detection of motor-operated valves (MOVs) using the DWT with the "db4" wavelet and six decomposition levels. The motor power signature was acquired through three-phase current and voltage measurements at the motor control center. The results demonstrated the effectiveness of the DWT-based methodology for incipient fault detection of motor-operated valves. In the two cases considered, the technique was able to detect incipient faults. (Gketsis et al., 2009) applied Wavelet Transform (WT) analysis along with Artificial Neural Networks (ANN) for the diagnosis of winding faults in electrical machines. After an optimum wavelet selection procedure they utilized
"db2" for the decomposition via DWT of the admittance, current and voltage curves. The level 7 detail (D7) is utilized for feature extraction. The Fourier Transform is employed to derive measures of the amplitude and displacement (shift) of the D7 details. Motor-operated valves are used in almost all nuclear power plant fluid systems. The purpose of motor-operated valves (MOVs) is to control the fluid flow in a system by opening, closing, or partially obstructing the passage through the valve. The readiness of nuclear power plants depends strongly on the operational readiness of valves, especially MOVs. They are applied extensively in control and safety-related systems. (Tang et al., 2010) employed the continuous wavelet transform (CWT) to filter noise from raw vibration signals from gearboxes in wind turbines, and an auto terms window (ATW) function was used to suppress the cross terms in the Wigner-Ville Distribution. In the CWT de-noising process, the Morlet wavelet (similar to the mechanical impulse signal) is chosen to perform the CWT on the raw vibration signals. The appropriate scale parameter for the CWT is optimized by the cross validation method (CVM). (Niu and Yang, 2010) proposed an intelligent condition monitoring and prognostics system in a condition-based maintenance architecture based on a data-fusion strategy. They collected vibration signals from a whole test on a methane compressor and extracted trend features. The features were then normalized and sent into a neural network for feature-level fusion. Next, data de-noising was achieved by smoothing with a moving average, and then wavelet decomposition was applied ('db5', 5 levels of decomposition) to reduce the fluctuation and pick out the trend information. In (Eristi et al., 2010) a novel scheme composed of feature extraction and feature selection procedures for obtaining robust and adequate features of power system disturbances was presented. Firstly, features were obtained by applying different extraction techniques to the wavelet coefficients of 
all decomposition levels of the disturbance signal utilizing the DWT and the 'db4' wavelet. Then, by using the sequential forward selection (SFS) technique, robust and adequate features were selected from the feature set resulting from the first stage. The detail coefficients and approximation coefficients were not directly used as the classifier inputs. Reduction of the feature vector dimension was conducted first. In this study, the mean, standard deviation, skewness, kurtosis, RMS, form factor, crest factor, energy, Shannon entropy, log-energy entropy and interquartile range of the ten-level coefficients were used as features. Finally, the classification of the power system disturbances using support vector machines (SVMs) was achieved.
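The eleven statistical features listed above can be computed from the coefficients of a single decomposition level as in the following sketch. The exact definitions used in the cited study are not given in the text, so the ones below are assumptions: population moments, non-excess kurtosis, form factor as RMS over mean absolute value, and the two entropies in the non-normalized forms common in wavelet toolboxes. The function name and the sample coefficients are hypothetical.

```python
import math
import statistics

def coefficient_features(c):
    """Eleven descriptive features of one coefficient vector (non-constant, non-zero input)."""
    n = len(c)
    mean = sum(c) / n
    m2 = sum((v - mean) ** 2 for v in c) / n          # population variance
    m3 = sum((v - mean) ** 3 for v in c) / n
    m4 = sum((v - mean) ** 4 for v in c) / n
    rms = math.sqrt(sum(v * v for v in c) / n)
    mean_abs = sum(abs(v) for v in c) / n
    q1, _, q3 = statistics.quantiles(c, n=4)          # quartiles (default 'exclusive' method)
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "rms": rms,
        "form_factor": rms / mean_abs,
        "crest_factor": max(abs(v) for v in c) / rms,
        "energy": sum(v * v for v in c),
        # Entropy definitions below are assumed; zero coefficients are skipped.
        "shannon_entropy": -sum(v * v * math.log(v * v) for v in c if v != 0.0),
        "log_energy_entropy": sum(math.log(v * v) for v in c if v != 0.0),
        "iqr": q3 - q1,
    }

# Hypothetical detail coefficients of one level.
detail = [0.2, -1.5, 0.7, 3.1, -0.4, 0.9, -2.2, 0.1]
for name, value in coefficient_features(detail).items():
    print(f"{name}: {value:.4f}")
```

Applying the function to every decomposition level and concatenating the results gives a feature set of the kind that a selection step such as SFS would then prune.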

Other applications
(Jiang et al., 2011) introduced a new de-noising method based on the adaptive Morlet wavelet and singular value decomposition (SVD) for feature extraction from vibration signals of a wind turbine gearbox. A modified Shannon wavelet entropy was utilized to optimize the central frequency and bandwidth parameters of the Morlet wavelet so as to achieve an optimal match with the impulsive components. The proposed method was applied to extract the outer-race fault in a rolling bearing and to the fault diagnosis of a planetary gearbox in a wind turbine. The results show that the proposed method based on the adaptive Morlet wavelet and SVD performed much better than Donoho's "soft-thresholding de-noising", the de-noising method based on CWT and SVD, and the de-noising method based on the Morlet wavelet. Thus, it provides an effective tool for fault diagnosis, able to extract the fault features submerged in the background noise.

Conclusions
Tremendous progress has been made over the last 15 years in the evolution of WT theory as well as in its applications in engineering, and especially in condition monitoring. WT gave a real boost to the processing of engineering signals, opening a wide field full of options. WT is now more mature than ever, constituting one of the most powerful weapons in the signal analyst's arsenal. In this review, classical as well as second generation wavelet transforms were presented. The issue of mother wavelet choice and a variety of applications in wavelet-based condition monitoring were discussed. Some concepts beyond the state of the art in WT were finally discussed. Despite the rapid evolution of WT there are still unresolved theoretical issues, such as the optimum mother wavelet choice, the number of decomposition levels in DWT, WPT and SGWT, and the number of analyzing scales in CWT. A solution from mathematicians is expected in the future. In the engineering field, and especially in condition monitoring, WT is expected to support (directly or indirectly) the developments in the fast-evolving field of forecasting and prognostics. Wavelet-based utilization of schemes such as Hidden Markov Models, Particle Filters, Remaining Useful Life PDF, trend extrapolation etc. is expected to dominate the condition monitoring literature in the coming years.

Fig. 6. Decomposition and reconstruction of the signal with SGWT

The operators P and U are built by means of the interpolating subdivision method (ISM) [16]. Choosing different P and U is equivalent to choosing different biorthogonal wavelet filters. Fig. 6 depicts the structure of SGWT. The computational costs of the forward and inverse transform are exactly the same.
The decomposition and reconstruction stages of SGWPT are shown in Figs. 7 and 8.
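The split, predict and update structure of the lifting scheme can be illustrated with a minimal sketch. It assumes the simplest possible operators (P predicts each odd sample by its even neighbour, U adds half of the detail), which correspond to an unnormalized Haar transform; the ISM-based operators P and U referred to above are longer filters and are not reproduced here. Function names are hypothetical.

```python
def lifting_forward(x):
    """One lifting step on an even-length list: split, predict, update."""
    even, odd = x[0::2], x[1::2]                            # split
    detail = [o - e for o, e in zip(odd, even)]             # predict: d = odd - P(even)
    approx = [e + d / 2.0 for e, d in zip(even, detail)]    # update:  a = even + U(d)
    return approx, detail

def lifting_inverse(approx, detail):
    """Undo the step by running the same operations in reverse order with flipped signs."""
    even = [a - d / 2.0 for a, d in zip(approx, detail)]    # undo update
    odd = [d + e for d, e in zip(detail, even)]             # undo predict
    x = [0.0] * (2 * len(even))
    x[0::2] = even                                          # merge
    x[1::2] = odd
    return x

x = [2.0, 4.0, 6.0, 8.0, 5.0, 1.0]
approx, detail = lifting_forward(x)
print(approx, detail)
print(lifting_inverse(approx, detail))
```

The inverse uses exactly the same predict and update operations as the forward step, only reversed and with opposite signs, which is why the two directions have the same computational cost and why reconstruction is exact for any choice of P and U.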

Fig. 7. Decomposition step of SGWPT

1. Transform the signal x(t) to the time-scale plane by means of a wavelet transform. The wavelet coefficients on various scales are obtained.
2. Assess the threshold t and, in accordance with the established rules, shrink the wavelet coefficients.
3. Use the shrunken coefficients to carry out the inverse wavelet transform. The series recovered is the estimation of impulse p(t).
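The three steps above can be sketched end to end as follows. The sketch makes several assumptions that are not taken from the text: an orthonormal Haar DWT stands in for the wavelet transform, soft thresholding is used as the shrinkage rule, and the threshold t defaults to the universal threshold with a median-based noise estimate. All function names and the test signal are hypothetical.

```python
import math
import random
import statistics

SQRT2 = math.sqrt(2.0)

def haar_dwt(x, levels):
    """Step 1: multi-level Haar DWT; len(x) must be divisible by 2**levels."""
    approx, details = list(x), []
    for _ in range(levels):
        a = [(approx[i] + approx[i + 1]) / SQRT2 for i in range(0, len(approx), 2)]
        d = [(approx[i] - approx[i + 1]) / SQRT2 for i in range(0, len(approx), 2)]
        details.append(d)          # details[0] is the finest scale
        approx = a
    return approx, details

def haar_idwt(approx, details):
    """Step 3: inverse transform, rebuilding from the coarsest scale to the finest."""
    x = list(approx)
    for d in reversed(details):
        nxt = []
        for a_k, d_k in zip(x, d):
            nxt.append((a_k + d_k) / SQRT2)
            nxt.append((a_k - d_k) / SQRT2)
        x = nxt
    return x

def soft(c, t):
    """Soft shrinkage of one coefficient."""
    return 0.0 if abs(c) <= t else math.copysign(abs(c) - t, c)

def universal_threshold(finest_detail, n):
    """Assumed rule: t = sigma * sqrt(2 ln n), sigma estimated from the finest details."""
    sigma = statistics.median(abs(c) for c in finest_detail) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))

def denoise(x, levels, t=None):
    approx, details = haar_dwt(x, levels)                     # step 1
    if t is None:
        t = universal_threshold(details[0], len(x))           # step 2: assess t
    shrunk = [[soft(c, t) for c in d] for d in details]       # step 2: shrink
    return haar_idwt(approx, shrunk)                          # step 3

# Hypothetical signal: periodic impulses p(t) buried in Gaussian noise.
rng = random.Random(0)
clean = [1.0 if k % 16 == 0 else 0.0 for k in range(64)]
noisy = [c + 0.05 * rng.gauss(0.0, 1.0) for c in clean]
estimate = denoise(noisy, levels=3)
print([round(v, 2) for v in estimate[:8]])
```

Only the detail coefficients are shrunk here; whether the approximation coefficients are also thresholded is a design choice that varies between the schemes reviewed.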

Fig. 10. Effect of various de-noising schemes on an AE signal from a defective bearing: a) original signal, b) de-noised signal via the spectral kurtosis technique, c) de-noised signal via DTCWT, d) de-noised signal via DWT

worked on fault diagnosis of water hydraulic motors. A modelling of the monitored vibration signals based on the adaptive wavelet transform (AWT) was proposed. The model-based method by AWT was applied for de-noising and feature extraction. Scalograms acquired through the CWT revealed the characteristic signal's energy in the time-scale domain and were used as feature values for fault diagnosis of the water hydraulic motor. (Wu and Chen, 2006) presented a fault signal diagnosis technique for internal combustion engines based on the CWT. The Morlet wavelet was used because in many mechanical dynamic signals impulses are the symptoms of faults, and the Morlet wavelet is very similar to an impulse component. Different faults have shown different scalograms. A characteristic analysis and experimental comparison of the vibration signal and the acoustic emission signal with the proposed algorithm were also presented in their work.

(Wu and Liu, 2008) utilized, instead of the WPT, a DWT technique combined with energy-spectrum feature selection and fault classification using ANNs for analyzing fault signals of internal combustion engines. The features of the sound emission signals at different resolution levels were extracted by multi-resolution analysis and Parseval's theorem. (Niu et al., 2008) applied multi-level wavelet decomposition to transient stator current signals for fault diagnosis of induction motors. After signal preprocessing using smoothing-subtracting and wavelet transform techniques, features were extracted from each level of the detail components of the decomposed signals using the DWT and the "db10" mother wavelet. In total, 21 features are acquired from each sensor, consisting of time domain (10 features), frequency domain (three features) and regression estimation (eight features) quantities. In total, two 70• 3•21 feature sets are calculated from seven types of signals collected by three current probes at each wavelet decomposition level. The two calculated feature sets formed the training and test sets respectively, and were used as the input to four different classifiers for pattern recognition, with quite satisfactory results. (Chen et al., 2008) proposed a methodology based on Wavelet Packet Analysis (WPA) and the Kolmogorov-Smirnov (KS) test to analyze monitored vibration signals from a water hydraulic motor and assess the fault degradation of its pistons. The fault detection procedure applied is summarized in the following. First, the time-domain vibration signals were decomposed through the WPT in two levels. The soft-thresholding technique was applied to the wavelet and approximation coefficients to obtain the de-noised coefficients. The de-noised vibration signal, with improved signal-to-noise ratio (SNR), was obtained by reconstructing the de-noised coefficients in the multi-level decomposition of the vibration signal.
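The Kolmogorov-Smirnov comparison used in the (Chen et al., 2008) methodology can be illustrated with a two-sample sketch. It computes the KS statistic, i.e. the largest distance between two empirical distribution functions, and compares it with the usual large-sample critical value. How the authors applied the test in detail is not described here, so the kurtosis samples, the significance level and the function names are assumptions made for illustration.

```python
import math

def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(v) - F_b(v)| over all observed values v."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:      # advance past all samples <= v
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))

# Hypothetical kurtosis values of de-noised signals for two piston conditions.
kurtosis_reference = [3.0, 3.1, 2.9, 3.2, 3.0, 3.1, 2.8, 3.0]
kurtosis_test = [3.9, 4.2, 4.0, 4.4, 3.8, 4.1, 4.3, 4.0]

d = ks_statistic(kurtosis_reference, kurtosis_test)
different = d > ks_critical(len(kurtosis_reference), len(kurtosis_test))
print(d, different)
```

A statistic above the critical value indicates that the two kurtosis distributions differ at the chosen significance level, which is the basis for separating one condition from another.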
and increasing the level of safety. A great percentage of breakdowns in industrial processes as well as in rotorcraft transportation (helicopters etc.) are caused by gearbox-related failures. Fault symptoms usually begin at early stages, rather long before a destructive failure, making the use of effective condition monitoring schemes very attractive. Many high-quality investigations can be found in the recent literature. (YanPing et al., 2006) explored the statistical characteristics of the continuous wavelet transform scalogram of vibration signals from rotating machinery. Two features, wavelet grey moment (WGM) and first-order wavelet grey moment vector (WGMV), were proposed for condition monitoring of rotating machinery. Wavelet grey moments are defined as: