Study on Optimal Selection of Wavelet Vanishing Moments for ECG Denoising

The frequency characteristics of wavelets and the vanishing moments of wavelet filters are both important parameters of wavelets. Clarifying the relationship between them provides a theoretical basis for selecting the best wavelet. In this paper, the frequency characteristics of wavelets were analyzed by mathematical modeling, the mathematical relationship between wavelet frequency characteristics and vanishing moments was clarified, the optimal wavelet basis function was selected hierarchically according to the amplitude-frequency characteristics of the ECG signal, and an accurate notch filter was realized according to the frequency characteristics of the noise. The experimental results showed that optimal orthogonal wavelet analysis of ECG signals with different frequency characteristics made the high-frequency energy distribution sparser, and that the proposed method effectively preserved the singularities of the signal and reduced signal distortion.

how to select and use an appropriate wavelet basis function, and how to filter noise with a specific frequency characteristic during the wavelet transform [6].

Wavelet vanishing moments and optimal wavelet basis selection
Wavelet transforms are applied to ECG signals for several purposes: denoising, feature detection, and data compression. Whatever the purpose, the goal is that, after wavelet decomposition of the ECG signal, the signal energy is concentrated as much as possible in the low-pass components and the high-pass components are as sparse as possible [7]. According to the current literature, if the computational complexity of the wavelet transform is not considered, it is generally thought that a wavelet with a higher vanishing moment produces a better result [8]. Other articles contend that the sparsity of the high-pass component depends on how closely the shape of the wavelet scaling function resembles the shape of the signal [9,10]. This paper studies this issue in depth and offers an answer.
Wavelet vanishing moment and filter amplitude-frequency characteristics. For [condition missing in source], the above formula can be expressed as: [equation missing in source]
The Fourier transform is the Laplace transform evaluated on the imaginary axis $s = j\Omega$, which maps to the unit circle in the Z-plane. The Z-transform of the sequence H on the unit circle is therefore equal to the ideal Fourier transform of the sampled signal [10,11]. Let the digital frequency $\omega$ be the parameter of the unit circle in the Z-plane; $\omega$ is the angle in the Z-plane, denoted as $z = e^{j\omega}$. For the orthogonal wavelet, perfect reconstruction satisfies: [equation missing in source] Therefore: [equation missing in source] Namely: [equation missing in source]
Theorem 1: The order of the vanishing moment of an orthogonal wavelet is proportional to its corresponding filter order.
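For reference, the standard textbook relations for an orthogonal low-pass filter with $N$ vanishing moments can be sketched as follows. The normalization $H(0) = 1$ and the symbols $Q$ (residual trigonometric polynomial) and $G$ (high-pass response) are assumptions introduced here and may differ from the paper's own numbered equations.

```latex
% Assumed normalization: H(0) = 1. Q is the residual factor left after
% pulling out the N zeros at omega = pi; G is the high-pass response.
\begin{align}
z &= e^{j\omega}, \\
H(\omega) &= \left(\frac{1 + e^{-j\omega}}{2}\right)^{N} Q(e^{-j\omega}), \\
|H(\omega)|^{2} &= \cos^{2N}\!\left(\frac{\omega}{2}\right) \left|Q(e^{-j\omega})\right|^{2}, \\
|H(\omega)|^{2} + |H(\omega + \pi)|^{2} &= 1, \\
|G(\omega)|^{2} &= |H(\omega + \pi)|^{2} = 1 - |H(\omega)|^{2}.
\end{align}
```

In the Daubechies family, the shortest filter with $N$ vanishing moments has $2N$ taps, which is consistent with the proportionality stated in Theorem 1.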
[expression missing in source], take the first derivative; the resulting rate of change with [term missing in source] is equal to 1, so the filter slope at the edge of the transition band is: [equation missing in source]. This shows that the larger the vanishing moment, the steeper the slope at the edge of the transition band of the corresponding filter, and the higher the order of the filter. Figure 1 shows the amplitude-frequency characteristics of orthogonal wavelet high-pass filters with vanishing moments of 2, 4, 6, 8, 10, and 12.
Scientific Reports | 7: 4564 | DOI:10.1038/s41598-017-04837-9
Selection of wavelet for ECG signal processing. It can be deduced from the above that the low- and high-frequency energies after the wavelet transform depend on both the frequency characteristics of the signal and the vanishing moments of the wavelet. For a signal whose frequency content is stable, there must be a wavelet with vanishing moment N for which the high-frequency energy after the wavelet transform is lowest [12,13]. Therefore, the following theorem can be introduced:
Theorem 2: Let the signal be periodic with stable frequency characteristics. For a family of wavelets W with vanishing moments of order $N_1$ to $N_2$, there exists a unique wavelet $W_k$ (with vanishing moment $k$, $N_1 \le k \le N_2$) such that the high-frequency energy after the wavelet transform is minimized.
Proof: According to the Mallat algorithm, let $x_j(k)$ and $d_j(k)$ be the discrete approximation and detail coefficients in multiresolution analysis, where $h_0(k)$ and $h_1(k)$ are two filters that satisfy the orthonormal two-scale difference equation. Then $x_j(k)$ and $d_j(k)$ satisfy the following recursive relations: [equations missing in source] Then, in accordance with the Fourier convolution theorem: [equation missing in source] Also, according to the Parseval theorem, the following equation holds: [equation missing in source] Therefore, [expression missing in source]; applying the necessary condition for an extremum of the function, we get: [equation missing in source]
In general, signal processing imposes a minimum requirement on the order of the vanishing moments of the filter [14]. The reason is that the lower the vanishing moments, the more signal detail is lost during processing and the greater the distortion. Given a lower limit on the vanishing moments, Theorem 2 implies that, for a signal with given frequency characteristics and a given set of wavelets, there must be an optimal wavelet that minimizes the energy contained in the processed high-pass component. This facilitates the subsequent filtering and signal compression. The Fourier transform of a real-time ECG signal is shown in Fig. 2a, and the corresponding frequency characteristics can be clearly seen in Fig. 2b.
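As a hedged sketch of the standard relations that a proof of this kind relies on (not a recovery of the paper's own equations), the Mallat recursion and the Parseval energy identity can be written as below. The index convention (level $j+1$ coarser than level $j$), the auxiliary sequence $u_j$, and the use of $\Gamma$-point circular sums are assumptions made here.

```latex
% Assumptions: level j+1 is coarser than level j; X_j(k) and H_1(k) are the
% Gamma-point DFTs of x_j and of the zero-padded filter h_1; sums are circular.
\begin{align}
x_{j+1}(k) &= \sum_{m} h_0(m - 2k)\, x_j(m), \\
d_{j+1}(k) &= \sum_{m} h_1(m - 2k)\, x_j(m), \\
u_j(n) &= \sum_{m} h_1(m - n)\, x_j(m), \qquad d_{j+1}(k) = u_j(2k), \\
\sum_{n=0}^{\Gamma-1} |u_j(n)|^{2} &= \frac{1}{\Gamma} \sum_{k=0}^{\Gamma-1} |X_j(k)|^{2}\, |H_1(k)|^{2}.
\end{align}
```

The last line expresses the high-pass energy before decimation as a spectrum-weighted sum, so comparing candidate wavelets amounts to comparing how much of $|X_j(k)|^2$ falls under each $|H_1(k)|^2$.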
The real-time ECG signals are decomposed by fourth-, sixth-, eighth- and tenth-order orthogonal filters, and the corresponding high-pass components are shown in Figs 3b,d and 4b,d. As shown in Table 1, the energy of the decomposed high-pass signal is smallest when the wavelet vanishing moment is N = 6. The orthogonal wavelet with vanishing moment N = 6 is therefore the best choice for the first decomposition layer.
According to Theorem 2, an optimal wavelet should be selected before the ECG signal is filtered. A complete candidate set of wavelets should be prepared, ranging from a lowest vanishing moment (for example, order 4) up to the highest order that the fast wavelet transform can support (for example, order 24). If the vanishing moment is too low, the accuracy of signal processing suffers; if it is too high, the calculation becomes too complex [15]. The high-pass filter of each wavelet is extended by zero-padding to length Γ and then Fourier-transformed. A typical 2-cycle ECG signal (length Γ) is taken as a sample and is also Fourier-transformed. According to Formula (12), the total energy of the high-pass component is obtained for each candidate vanishing moment, and the wavelet with the smallest energy is selected as the optimal choice.
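The selection procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: Formula (12) is not reproduced here, so the energy is assumed to be the Parseval sum of $|X(k)|^2 |G(k)|^2$ before decimation; the names `highpass_energy`, `select_wavelet` and `CANDIDATES` are hypothetical; and only the Haar and db2 filters are included because they have simple closed forms. Higher-order coefficients would in practice come from a wavelet library, for example the `dec_hi` attribute of a PyWavelets `Wavelet` object.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) DFT; slow, but keeps this sketch dependency-free."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def highpass_energy(signal, dec_hi):
    """Energy of the undecimated high-pass output, evaluated in the frequency domain."""
    gamma = len(signal)
    # Zero-pad the high-pass filter to the signal length (length gamma).
    padded = list(dec_hi) + [0.0] * (gamma - len(dec_hi))
    X, G = dft(signal), dft(padded)
    # Convolution theorem + Parseval: sum|y|^2 = (1/gamma) * sum |X(k)|^2 |G(k)|^2.
    return sum(abs(xk) ** 2 * abs(gk) ** 2 for xk, gk in zip(X, G)) / gamma


def select_wavelet(signal, candidates):
    """Return the candidate name with the smallest high-pass energy, plus all energies."""
    energies = {name: highpass_energy(signal, g) for name, g in candidates.items()}
    return min(energies, key=energies.get), energies


# Hypothetical candidate set: only the two filters with simple closed forms.
S2, S3 = math.sqrt(2), math.sqrt(3)
DB2_LO = [(1 + S3) / (4 * S2), (3 + S3) / (4 * S2), (3 - S3) / (4 * S2), (1 - S3) / (4 * S2)]
CANDIDATES = {
    "haar": [1 / S2, -1 / S2],
    # Quadrature mirror of the low-pass filter: g[k] = (-1)^k * h[L-1-k].
    "db2": [(-1) ** k * DB2_LO[3 - k] for k in range(4)],
}

# Synthetic stand-in for a 2-cycle sample (a real ECG record would be used in practice).
sample = [math.sin(2 * math.pi * 2 * n / 32) for n in range(32)]
best, energies = select_wavelet(sample, CANDIDATES)
print(best, energies)
```

For a real record, `sample` would be replaced by two cycles of the ECG signal and `CANDIDATES` by the full set of orders under consideration.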

Hierarchical processing in ECG signal wavelet transform
A filter with a different vanishing moment is selected according to the frequency characteristics of the signal in each layer. After one layer of wavelet decomposition, the low-pass component enters a new layer of wavelet transform. Should this new decomposition reuse the original filter or a better-matched one? After downsampling by two, the sampling frequency is halved, and so is the frequency range of the signal. However, since the processed signal contains only the low-pass portion of the original signal, the spectral distribution differs significantly between layers j + 1 and j. To keep the energy in the high-frequency components as low as possible, an optimally matched filter should be reselected for each layer. Figure 5b shows the frequency characteristics of the original signal, and Fig. 5d shows those of the low-frequency signal after wavelet decomposition. As can be seen from Fig. 5, there is a clear difference between the original and filtered frequency distributions. In this paper, the decomposed low-pass signal is decomposed again using filters with vanishing moments of 4, 6, 8, and 10. The resulting high-pass components are shown in Fig. 6a, b, c and d, respectively, and they differ significantly. At N = 4, the total energy of the decomposed high-pass component is the smallest (the specific data are shown in Table 2). This indicates that wavelets with different vanishing moments should be chosen for the different levels of the wavelet transform in order to optimize the energy distribution. In general, different filter banks should be used for the first and second layers, and the filters for the remaining layers can be chosen as needed.
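The layer-by-layer reselection described above can be sketched as follows. This is an illustrative outline under assumptions, not the paper's code: it uses a periodic (circular) one-level DWT, picks at each layer the candidate whose detail coefficients carry the least energy, and offers only the Haar and db2 low-pass filters as a hypothetical candidate set. The names `dwt_level`, `hierarchical_decompose` and `FILTERS` are introduced here for illustration.

```python
import math


def qmf(lo):
    """High-pass filter as the quadrature mirror of an orthonormal low-pass filter."""
    n = len(lo)
    return [(-1) ** k * lo[n - 1 - k] for k in range(n)]


def dwt_level(x, lo):
    """One level of periodic DWT: filter, then keep every second output sample."""
    hi, n = qmf(lo), len(x)
    approx = [sum(lo[m] * x[(2 * k + m) % n] for m in range(len(lo))) for k in range(n // 2)]
    detail = [sum(hi[m] * x[(2 * k + m) % n] for m in range(len(hi))) for k in range(n // 2)]
    return approx, detail


def energy(v):
    """Sum of squared coefficients."""
    return sum(c * c for c in v)


def hierarchical_decompose(signal, filters, levels):
    """Decompose layer by layer, reselecting the best-matched filter at each layer."""
    approx, details, chosen = list(signal), [], []
    for _ in range(levels):
        # Try every candidate on the current approximation; keep the sparsest detail.
        trials = {name: dwt_level(approx, lo) for name, lo in filters.items()}
        best = min(trials, key=lambda name: energy(trials[name][1]))
        approx, detail = trials[best]
        details.append(detail)
        chosen.append(best)
    return approx, details, chosen


# Hypothetical candidate set (closed-form orthonormal low-pass filters only).
S2, S3 = math.sqrt(2), math.sqrt(3)
FILTERS = {
    "haar": [1 / S2, 1 / S2],
    "db2": [(1 + S3) / (4 * S2), (3 + S3) / (4 * S2), (3 - S3) / (4 * S2), (1 - S3) / (4 * S2)],
}

# Synthetic two-tone test signal standing in for an ECG record.
sig = [math.sin(2 * math.pi * 3 * n / 64) + 0.5 * math.sin(2 * math.pi * 11 * n / 64)
       for n in range(64)]
approx, details, chosen = hierarchical_decompose(sig, FILTERS, 2)
print(chosen)
```

Because each candidate filter bank is orthonormal, the total energy is preserved at every layer, so lowering the detail energy is the same as concentrating more energy in the approximation.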
Wavelet transform to remove power frequency interference. One of the main sources of noise is 50 Hz power-frequency interference, which is generally removed prior to further signal analysis and processing. If the Fourier transform is used for the signal processing, this problem can easily be addressed by setting the corresponding frequency component to 0. However, how to use the wavelet transform for frequency notching has not been discussed in the literature. The general practice is to filter by thresholding the high-frequency components, but the desired accuracy is difficult to achieve [16,17].
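The Fourier-domain approach mentioned above, zeroing the frequency component that carries the interference, can be illustrated with a short sketch. The sampling rate of 500 Hz, the record length, the synthetic test signal, and the function name `fourier_notch` are all assumptions made for this example.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) DFT / inverse DFT (dependency-free sketch)."""
    n, sign = len(x), (1 if inverse else -1)
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def fourier_notch(signal, fs, f_notch):
    """Zero the DFT bin nearest f_notch and its mirror bin, then transform back."""
    n = len(signal)
    spectrum = dft(signal)
    k = round(f_notch * n / fs)
    spectrum[k] = 0
    spectrum[(n - k) % n] = 0  # the mirror bin must go too, so the output stays real
    return [v.real for v in dft(spectrum, inverse=True)]


fs, n = 500, 100  # assumed sampling rate (Hz) and record length (samples)
clean = [math.sin(2 * math.pi * 5 * t / fs) for t in range(n)]
noisy = [c + 0.3 * math.sin(2 * math.pi * 50 * t / fs) for t, c in enumerate(clean)]
filtered = fourier_notch(noisy, fs, 50)
print(max(abs(a - b) for a, b in zip(filtered, clean)))
```

The example is idealized: the 50 Hz tone falls exactly on one DFT bin. With real recordings the interference leaks into neighbouring bins, so a small band of bins would need to be zeroed.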
If the wavelet transform is applied to a signal $f(x)$ with sampling frequency $P_0$, the high-pass component contains the frequency components in $[P_0/4, P_0/2]$. These two end points also mark the transition area from low frequency to high frequency. From the Fourier convolution theorem, the effect of $G(\omega)$ and $H(\omega)$ on the signal is equivalent to that of a transfer function in filter-circuit analysis. In the following, we further analyze the suppression factor of the signal at these two critical points: [equation missing in source] So, when an appropriate vanishing moment $N$ is taken, the suppression factor at the critical point can be made arbitrarily small, so that the gain of the signal outside this point is close to zero, which is theoretically equivalent to truncation.
Table 1. Total energy of the high frequency system after wavelet DBN decomposition.
Let the sampling frequency of the signal be $P_0$, and let the signal contain noise at frequency $P_s$, with $P_s \le P_0/2$. Suppose that the frequency band tolerated in denoising through notching is $[P_s - \Delta P, P_s + \Delta P]$. If the band of the high-pass component is wide and this part of the signal is removed by setting it to zero, useful frequencies will undoubtedly be removed as well; if instead the component is cut off according to a threshold, the noise will not be filtered completely. Both conditions cause signal distortion. The high-frequency component can, however, be decomposed further by the wavelet, so that its band is progressively narrowed and gradually approaches $[P_s - \Delta P, P_s + \Delta P]$. The higher the order of the filter used for the notch processing, the smaller the overlapping region of the high-pass and low-pass frequencies, the steeper the filter frequency curve, and the more concentrated the energy [18].
To simplify the calculation, this paper uses the normalized frequency as the unit. The normalized frequency $R$ of the actual frequency $P_t$ in the frequency interval $[P_1, P_2]$ is defined as follows:
$$R = \frac{P_t - P_1}{P_2 - P_1}$$
Any frequency range $[P_1, P_2]$ can thus be expressed as the real interval [0, 1] after normalization. According to Shannon's sampling theorem, the sampling frequency should be no less than twice the maximum frequency in the analog signal spectrum. The normalized frequency $R$ therefore lies in [0, 0.5] when the sampled signal is filtered directly, and in [0, 1] for the second and subsequent layers of filtering.
Therefore, secondary filtering is needed to implement band-pass denoising: the high-pass component from the first filtering is decomposed again by the wavelet; the low-pass component of this secondary decomposition then covers the band [0.25, 0.40] and the high-pass component covers the band [0.375, 0.500]. This high-pass component is set to 0 to complete the filtering. Afterwards, the two parts are recombined into the detail signal of the first layer, as shown in Fig. 9. The final filtering effect is shown in Fig. 8b. If ΔP = 0.100, the notching can be implemented in a very narrow band, thus preserving some of the key details of the high-frequency signal.
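As a small worked example of the normalized frequency, the sketch below maps a noise frequency onto [0, 1] and finds which wavelet detail band would contain it. It assumes the linear normalization $R = (P_t - P_1)/(P_2 - P_1)$ and ideal half-band filters, so that the detail band at level $j$ is $[0.5^{j+1}, 0.5^{j}]$ in units of the sampling frequency; real filters have overlapping transition bands, as the text notes. The sampling frequency of 500 Hz and both function names are assumptions for illustration.

```python
def normalized_frequency(p_t, p_1, p_2):
    """Map a frequency p_t in [p_1, p_2] linearly onto [0, 1]."""
    return (p_t - p_1) / (p_2 - p_1)


def detail_level(r, max_level=10):
    """First DWT level whose ideal detail band [0.5**(j+1), 0.5**j] contains r."""
    if not 0 < r <= 0.5:
        raise ValueError("r must lie in (0, 0.5] for a directly sampled signal")
    for level in range(1, max_level + 1):
        low, high = 0.5 ** (level + 1), 0.5 ** level
        if low <= r <= high:
            return level, (low, high)
    return None  # r is below the lowest band examined


p0 = 500.0  # assumed sampling frequency in Hz
r = normalized_frequency(50.0, 0.0, p0)  # 50 Hz interference relative to [0, P0]
print(r, detail_level(r))
```

Under these assumptions the notch would be applied inside the returned detail band, decomposing that band further if it is wider than the tolerated range around the noise frequency.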

Experimental result
Based on the above discussion, the two key steps of the wavelet-based denoising algorithm proposed in this paper are as follows. First, the amplitude-frequency characteristics of the ECG signal are analyzed to determine the order of the wavelet vanishing moment, and on this basis the optimal wavelet basis function is selected for each level of wavelet decomposition. Second, the noise frequency characteristics are analyzed to determine the level and coefficients of the band-pass filter, so that the noise is removed at a fixed frequency during wavelet decomposition.
Ten sampling records (the first 10 minutes of each record) were selected from the ECG-ID database for a comparison experiment. The adaptive genetic algorithm based on EEMD (Genetic EEMD), the adaptive threshold denoising algorithm based on the discrete wavelet transform (Threshold DWT) and the wavelet denoising algorithm optimized in this paper (Proposed DWT) were applied for filtering. The filtering effect was evaluated in terms of filtering time, denoising effect, signal loss and so on. To quantify the loss of useful signal energy after denoising, the mean squared error (MSE) was used to measure the difference between the denoised signal and the original signal. The smaller the MSE, the smaller the signal loss and the better the signal is restored. The NSR (Noise Suppression Ratio) is defined to reflect the noise suppression effect; the smaller the NSR, the better the denoising effect [16]. From the experimental results shown in Table 3, the Proposed DWT algorithm is significantly superior to the other two representative denoising algorithms in terms of MSE, NSR and TIME. In denoising effect, Proposed DWT has an especially clear advantage over Threshold DWT. In CPU time, the optimized algorithm takes about half the time of Genetic EEMD.
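For concreteness, the MSE between an original and a denoised record can be computed as in the sketch below. The function name `mse` and the sample values are illustrative assumptions; the NSR is not sketched because its formula is not given in the text.

```python
def mse(original, denoised):
    """Mean squared error between two equally long sample sequences."""
    if len(original) != len(denoised):
        raise ValueError("signals must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, denoised)) / len(original)


# Illustrative values only, not data from the experiment.
original = [0.0, 0.5, 1.0, 0.5, 0.0]
denoised = [0.0, 0.4, 1.1, 0.5, 0.0]
print(mse(original, denoised))
```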
Figure 10 illustrates the comparison of the results after denoising of the first 10 seconds of data recorded by Person_01/rec_1.

Conclusion
Although the wavelet transform is a classic method of processing ECG signals, there is still much confusion about how to apply it. For example, provided that the wavelet support width can be tolerated, is it true that the larger the order of the vanishing moment, the more concentrated the energy when the signal is decomposed by the wavelet? How can precise filtering of the ECG signal be realized by wavelet transform while keeping the singular points of the signal? In this paper, a quantitative analysis was performed to study the correlation between the vanishing moment and the frequency characteristics of the wavelet, an optimal wavelet basis function was selected based on the amplitude-frequency characteristics of ECG signals, and wavelet bases of different orders were used to deal with different wavelet spaces. We also found that accurate band-pass filtering could be realized during the wavelet transform according to the frequency characteristics of the noise, which effectively avoided damage to the signal during denoising. Experimental results showed that the proposed wavelet transform with optimized parameters had a remarkable effect on ECG signal denoising and has strong practical significance, including in the ECG monitoring watch developed by the author.
Table 3. Comparison of the effect of three different denoising algorithms.
Figure 10. Comparison of denoising results with different denoising algorithms. A is the original ECG, B is the ECG containing noise, C is the ECG processed by Threshold DWT, D is the ECG processed by Genetic EEMD, and E is the ECG processed by the algorithm in this paper.