Sparse Representation Based on Tunable Q-Factor Wavelet Transform for Whale Click and Whistle Extraction

Whale vocalizations may combine several elements, including whistles, clicks, and creaks, which can overlap in both time and frequency; this makes conventional signal separation techniques difficult to apply. Unlike conventional techniques that rely on frequency bands, such as the wavelet transform (WT) and empirical mode decomposition (EMD), the tunable Q-factor wavelet transform (TQWT) can separate a signal into components with different structures according to their oscillatory behavior and can suppress in-band noise using the basis pursuit method. Exploiting the oscillatory character of whistles and the transient, impulsive character of clicks, we propose a novel signal separation method for whale whistle and click extraction. The proposed method consists of two steps: first, TQWT is used to construct the dictionary for sparse representation; second, the whale click and whistle components are reconstructed using the basis pursuit denoising (BPD) algorithm. The proposed method is compared with a popular signal decomposition technique, the EMD method. The experimental results show that the proposed method separates click and whistle signals better than the EMD algorithm.


Introduction
Whale sounds combine several elements (whistles, regular clicks, and rapid-click buzzes (creaks)) in the same vocalization [1]. In general, whales use frequency-modulated pure tones (whistles) to communicate with each other. Meanwhile, they emit transient impulses (clicks) to echolocate targets and explore the environment. In addition, they produce creaks when they are in danger or in an emergency. However, most existing methods in whale signal processing analyze only isolated whistle or click signals, such as in [2][3][4][5][6][7][8][9][10].
Rather than extracting useful information from multicomponent signals for further processing, some methods directly extract contours from whale whistles using image processing techniques, as described in [2][3][4][5]. For the automated analysis of whale clicks, various methods based on statistical computation of the click spectrogram through different transforms have been presented in [6][7][8][9][10], but they assume that only clicks are present. To better understand whale communication and echolocation patterns, we need to extract whistles and clicks separately from the composite signal. Many methods have been presented to decompose multicomponent signals, such as blind source separation [11], dual-tree complex wavelet transform [12], wavelet denoising, empirical mode decomposition (EMD) [13], ensemble empirical mode decomposition (EEMD), multiwavelet packet [14], and independent component analysis (ICA) [15][16][17][18]. The above methods decompose the signal in the frequency domain. However, the components of whale sounds may occupy the same frequency band and overlap in the frequency domain. Thus, the above methods cannot exactly extract each component from the whale sounds.
Unlike frequency-based methods, a sparse signal representation method using the tunable Q-factor wavelet transform (TQWT) can decompose a multicomponent signal according to its oscillatory behavior, where the degree of oscillation is characterized by the Q factor. Sparse signal representation has already been applied to signal separation: an approach using biorthogonal RADWTs for signal separation was presented in [19], and a sparse-representation method was proposed to mitigate wind turbine clutter (WTC) in weather radar data [20]. Since whistle and click signals have oscillatory and transient impulsive characteristics, respectively, we propose a sparse signal representation method with TQWT that extracts click components with a low Q factor and whistle components with a high Q factor. The method is verified on an oceanic audio recording, and the results show effective extraction of click and whistle components from whale vocalization. The main contributions of this paper are (1) a new method that extracts clicks and whistles effectively and efficiently from multicomponent sound, (2) the selection of appropriate algorithm parameters for practical application, and (3) the observation that the proposed method can also be applied to other mammals to extract components of interest from a composite signal.
The rest of the paper is organized as follows. Section 2 introduces the TQWT algorithm and the basis pursuit denoising method. Section 3 compares the proposed approach with the EMD algorithm on the extraction of click and whistle components from a recorded composite whale sound signal. Section 4 concludes the paper.

Methodology
The proposed approach consists of two parts: TQWT and BPD. TQWT offers great flexibility for representing signals with different temporal and spectral characteristics, achieved by tuning the transform parameters. TQWT is a discrete-time wavelet transform (DWT) [21] that provides a suitable tool for analyzing the oscillatory and nonoscillatory components of a signal. BPD is then applied to obtain a sparse representation of each component, each of which has its own oscillatory behavior. The TQWT, its oscillation characteristic, and the transform operation are briefly introduced in the following section.

TQWT.
Our aim is to separate click and whistle components from the composite signal. In terms of oscillatory behavior, click components consist of instantaneous pulses with low oscillation, whereas whistle components consist of multiple harmonics with high oscillation. To deal with such signals, Selesnick et al. developed the tunable Q-factor wavelet transform (TQWT), in which the Q factor is flexibly tunable [22]. TQWT can decompose a signal into high-oscillatory, low-oscillatory, and residual components according to the chosen high and low Q factors [23]. The Q factor, which reflects the oscillatory properties of a signal, is defined as follows [24]:

$$Q = \frac{f_c}{BW}, \tag{1}$$

where $f_c$ is the center frequency and $BW$ is the bandwidth. As shown in Figures 1(a)-1(d), a signal with a higher Q oscillates more strongly in the time domain and is more concentrated in the frequency domain, and vice versa [24]. Hence, the difference between low-Q and high-Q wavelet functions captures the notion of signal oscillation, which is exploited for the component extraction problem discussed in this paper.
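The TQWT algorithm, which decomposes an N-point discrete-time signal into J-level subbands, is illustrated in Figure 2. TQWT employs two-channel analysis and synthesis filter banks. The analysis filter bank is applied iteratively to its low-pass channel, with low-pass and high-pass scaling at each level; α and β are the corresponding scaling parameters [25]. The synthesis filter banks follow the same structure. At each level, the two-channel filter bank consists of a low-pass filter and a high-pass filter, whose frequency responses are defined as follows [23]:

$$
H_0(\omega) =
\begin{cases}
1, & |\omega| \le (1-\beta)\pi, \\
\theta\!\left(\dfrac{\omega + (\beta-1)\pi}{\alpha+\beta-1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi, \\
0, & \alpha\pi \le |\omega| \le \pi,
\end{cases}
\qquad
H_1(\omega) =
\begin{cases}
0, & |\omega| \le (1-\beta)\pi, \\
\theta\!\left(\dfrac{\alpha\pi - \omega}{\alpha+\beta-1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi, \\
1, & \alpha\pi \le |\omega| \le \pi,
\end{cases}
\tag{2}
$$

where $\theta(\omega) = \tfrac{1}{2}(1+\cos\omega)\sqrt{2-\cos\omega}$ for $|\omega| \le \pi$, and the parameters must satisfy $0 < \beta \le 1$, $0 < \alpha < 1$, and $\alpha + \beta > 1$.

The most significant parameters of the TQWT algorithm are the Q factor, the redundancy factor r, and the decomposition level J.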
The Q factor describes the degree of signal oscillation. For a high Q factor, the wavelets have more oscillatory cycles, which makes them suitable for extracting oscillatory components. For a low Q factor, the wavelets are essentially nonoscillatory, which makes them suitable for extracting transient components.
The redundancy factor r controls the overlap between the frequency responses of adjacent wavelets. For a fixed Q value, increasing r increases both this overlap and the computational cost. Note that r must be greater than 1, and r ≥ 3 is recommended for perfect reconstruction and sparsity.
The decomposition level J affects the frequency coverage of the wavelets. A larger J makes the wavelets cover a wider frequency range, extending down toward 0 Hz. J should therefore be chosen as large as possible so that the lowest frequencies are included. The maximum number of decomposition levels $J_{\max}$ is

$$J_{\max} = \left\lfloor \frac{\log\big(N/(4(Q+1))\big)}{\log\big((Q+1)/(Q+1-2/r)\big)} \right\rfloor, \tag{3}$$

where N is the length of the input signal x(n).
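The role of the three parameters can be made concrete with a short numerical sketch. It assumes the standard TQWT relations β = 2/(Q + 1) and α = 1 − β/r and the Daubechies-type transition function θ(ω) = ½(1 + cos ω)√(2 − cos ω); none of these are spelled out in the text above, so they should be read as assumptions drawn from the general TQWT literature. The function names (`tqwt_scaling`, `tqwt_jmax`, `tqwt_filters`) are hypothetical helpers, not part of any library.

```python
import numpy as np

def tqwt_scaling(Q, r):
    """Scaling parameters from Q and r (assumed relations: beta = 2/(Q+1), alpha = 1 - beta/r)."""
    beta = 2.0 / (Q + 1.0)
    alpha = 1.0 - beta / r
    return alpha, beta

def tqwt_jmax(Q, r, N):
    """Maximum number of levels: floor(log(beta*N/8) / log(1/alpha))."""
    alpha, beta = tqwt_scaling(Q, r)
    return int(np.floor(np.log(beta * N / 8.0) / np.log(1.0 / alpha)))

def theta(w):
    """Transition function (assumed Daubechies type): 0.5*(1 + cos w)*sqrt(2 - cos w)."""
    return 0.5 * (1.0 + np.cos(w)) * np.sqrt(2.0 - np.cos(w))

def tqwt_filters(w, alpha, beta):
    """Low-pass H0 and high-pass H1 frequency responses evaluated at |w| <= pi."""
    w = np.abs(np.asarray(w, dtype=float))
    H0 = np.zeros_like(w)           # stop band of H0 (|w| >= alpha*pi) stays 0
    H1 = np.ones_like(w)            # pass band of H1 (|w| >= alpha*pi) stays 1
    low = w <= (1.0 - beta) * np.pi
    trans = (w > (1.0 - beta) * np.pi) & (w < alpha * np.pi)
    H0[low] = 1.0
    H1[low] = 0.0
    H0[trans] = theta((w[trans] + (beta - 1.0) * np.pi) / (alpha + beta - 1.0))
    H1[trans] = theta((alpha * np.pi - w[trans]) / (alpha + beta - 1.0))
    return H0, H1

# Low-Q example: check that the two channels are power complementary.
alpha, beta = tqwt_scaling(Q=3, r=3)
w = np.linspace(0.0, np.pi, 512)
H0, H1 = tqwt_filters(w, alpha, beta)
power_sum = H0 ** 2 + H1 ** 2

# High-Q example with an 8192-sample signal.
J_high_q = tqwt_jmax(Q=320, r=3, N=8192)
print(J_high_q)
```

Under these assumptions, the filter pair satisfies |H0|² + |H1|² = 1 at every frequency, which is the property that permits perfect reconstruction, and the high-Q setting Q = 320, r = 3 with N = 8192 gives a maximum level of 891.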

Shock and Vibration
In practical applications, we calculate the value of $J_{\max}$ according to Equation (3). The computational cost of the TQWT algorithm is $O(3rN\log_2(4N/(Q+1)))$ [23], and we now discuss how it changes with the parameters. In general, the computational complexity increases with Q and J; however, when $J = J_{\max}$, increasing Q reduces the computational cost. Therefore, even when a sufficiently large Q value must be selected to match the strongly oscillatory signal and J is set to $J_{\max}$, the computational cost remains moderate.
In essence, the TQWT algorithm provides an overcomplete basis for estimating the high- and low-resonance components. Three parameters, Q, r, and J, must be selected to establish the basis. Guidelines for selecting these parameters are given in Table 1 [26].

Basis Pursuit Denoising (BPD) with TQWT.
Sparsity can be exploited in signal processing problems. Consider an observed signal y(n) that has been corrupted by additive noise:

$$y(n) = x(n) + i(n), \tag{4}$$

where x(n) is the useful signal and i(n) is the additive noise. The problem is to estimate x(n), which is assumed to have a sparse representation, from the observed signal y(n). The BPD technique [27] provides a suitable framework for optimizing the sparse representation of x(n). To obtain a sparse representation of x(n) with respect to TQWT, we construct the objective function based on the BPD technique as

$$\arg\min_{\omega}\; \|y - x\|_2^2 + \eta\,\|\lambda \odot \omega\|_1 \quad \text{such that } x = \mathrm{TQWT}^{-1}(\omega), \tag{5}$$

where $\|\cdot\|_1$ and $\|\cdot\|_2$ denote the $l_1$ and $l_2$ norms, respectively. The vector $\omega = [\omega^{(1)}, \ldots, \omega^{(J+1)}]$ collects the wavelet coefficients computed by TQWT, $\omega^{(j)}$ denotes the wavelet coefficients of subband j, η is a regularization parameter, ⊙ denotes the Hadamard (elementwise) multiplication, and $\lambda = (\lambda_1, \ldots, \lambda_{J+1})$ is the compensation vector.
The main idea of the split augmented Lagrangian shrinkage algorithm (SALSA) is summarized in Algorithm 1, where μ is the penalty parameter, which affects the convergence speed, and P is the number of iterations. For brevity, the details of the SALSA algorithm are omitted here and can be found in [31].
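The following is a minimal sketch of a SALSA-style iteration for basis pursuit denoising, given only to illustrate the roles of μ, η, and P. It is not the implementation used in this paper. In place of TQWT it uses a small stand-in dictionary, the concatenation of the identity and an orthonormal DCT scaled by 1/√2, which is a Parseval frame (A Aᵀ = I), the property the closed-form update relies on. The objective ‖y − Aw‖² + η‖λ ⊙ w‖₁ with λ set to all ones, the helper names (`soft`, `dct_matrix`, `bpd_salsa`, `objective`), and the synthetic test signal are all assumptions made for this illustration.

```python
import numpy as np

def soft(x, T):
    """Soft thresholding in the two-step form y = max(|x| - T, 0); y = y*x/(y + T), for T > 0."""
    y = np.maximum(np.abs(x) - T, 0.0)
    return y * x / (y + T)

def dct_matrix(N):
    """Orthonormal DCT-II matrix C (rows are basis vectors), so C @ C.T = I."""
    n = np.arange(N)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C[0, :] /= np.sqrt(2.0)
    return C

def bpd_salsa(y, A, eta, mu, P, lam=None):
    """Approximately minimise ||y - A w||_2^2 + eta*||lam * w||_1, assuming A @ A.T = I."""
    lam = np.ones(A.shape[1]) if lam is None else lam
    T = eta * lam / (2.0 * mu)          # per-coefficient threshold
    w = A.T @ y                         # start from the analysis coefficients
    d = np.zeros_like(w)                # scaled dual variable
    for _ in range(P):
        u = soft(w + d, T)              # l1 (shrinkage) step
        v = u - d
        w = v + A.T @ (y - A @ v) / (mu + 1.0)   # least-squares step (closed form for a Parseval frame)
        d = d - (u - w)                 # dual update
    return w

def objective(y, A, w, eta):
    """BPD objective with lam = 1."""
    return np.sum((y - A @ w) ** 2) + eta * np.sum(np.abs(w))

# Synthetic test signal: one impulse (click-like) plus one DCT atom (tone-like).
N = 64
C = dct_matrix(N)
A = np.hstack([np.eye(N), C.T]) / np.sqrt(2.0)   # stand-in Parseval frame, NOT a TQWT
y = np.zeros(N)
y[10] = 3.0
y += 2.0 * C[5]

w0 = A.T @ y
w_hat = bpd_salsa(y, A, eta=0.1, mu=1.0, P=100)
print(objective(y, A, w0, 0.1), objective(y, A, w_hat, 0.1))
```

In this sketch, a larger μ lowers the threshold applied at each iteration, and P simply bounds the number of iterations. Replacing `A` and `A.T` with the inverse and forward TQWT would give the setting described in the text, provided the transform is implemented as a Parseval frame.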

Experimental Results and Analysis
In the experiment, we used a real oceanic audio recording of a beluga whale that contains click and whistle components. This recording was acquired at the Oceanographic Valencia using a computer with a Roland (Edirol) FA-101 sound acquisition system, a Bruel and Kjaer 8103 hydrophone, and a Bruel and Kjaer 2692 Nexus amplifier, as described in [1]. The waveform, time-frequency signature, and spectrum of the raw signal are plotted in Figures 3, 4(a), and 5(a), respectively. The test signal in Figure 3 contains 8192 samples at a sampling frequency of $f_s = 96$ kHz. The waveform clearly contains several transient impulses, which are the click components. Meanwhile, the inset of Figure 3 at t = 0.0601 to 0.0611 s shows oscillatory components. Both click and whistle components are clearly visible in the spectrum in Figure 5(a): clicks are broadband components spread over the whole frequency range, whereas whistles are impulsive components distributed from 8 kHz to 26 kHz. Figure 4(a) shows that the whistle and click components intersect in the time-frequency signature: whistles appear as spectral lines along the frequency axis, and clicks appear as discrete events along the time axis.
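The contrast between the two component types can be reproduced on a toy signal. The sketch below uses the same sample count and sampling frequency as the recording (8192 samples at 96 kHz), but everything else is hypothetical and chosen only for illustration: a single 12 kHz tone stands in for a whistle and a single unit-sample impulse stands in for a click. It is not the beluga recording.

```python
import numpy as np

fs = 96_000            # sampling frequency in Hz (as in the recording)
N = 8192               # number of samples (as in the recording)
n = np.arange(N)

# Hypothetical components: a steady 12 kHz tone and one impulse at sample 3000.
f_whistle = 12_000.0
whistle = 0.5 * np.sin(2 * np.pi * f_whistle * n / fs)
click = np.zeros(N)
click[3000] = 5.0
x = whistle + click

# Frequency view: the tone concentrates in one bin, the impulse spreads over all bins.
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(N, d=1.0 / fs)
peak_freq = freqs[np.argmax(spectrum)]
click_spectrum = np.abs(np.fft.rfft(click))

# Time view: non-overlapping 256-sample frames; the impulse stands out in one frame.
frame_len = 256
frames = x.reshape(-1, frame_len)
frame_energy = np.sum(frames ** 2, axis=1)
click_frame = int(np.argmax(frame_energy))
click_time = click_frame * frame_len / fs   # start time of that frame in seconds

print(peak_freq, click_frame, click_time)
```

The tone is localized in frequency but spread over all frames, while the impulse is localized in one frame but has a flat spectrum. This is why the two overlap in the frequency band of the whistle and cannot be separated by band selection alone.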
(7) end for
(8) end procedure
(9) procedure soft thresholding y = soft(x, T)
(10) y = max(|x| − T, 0)
(11) y = yx/(y + T)
(12) end procedure
ALGORITHM 1: SALSA algorithm.

We conclude that the click extraction performs well and that the proposed method is also effective at reducing noise.

Case Study of Whale Whistle Extraction.
Similarly, considering the oscillatory nature of the whistle signal, we set Q = 320, r = 3, and J = 891. To obtain a sparse representation of the whistle component with respect to TQWT, the BPD approach based on the SALSA algorithm is used with parameters μ = 2, η = 0.045, and P = 100. The time-frequency signature of the whistle components extracted by the proposed method is shown in Figure 4(b). The whistle components in Figures 4(a) and 4(b) appear at the same locations and have the shape of horizontal lines. We compare the whistle and click separation achieved by the proposed method and by the EMD algorithm in Figures 5(b) and 5(c). The spectra of the whistle and click components separated by the proposed method closely match the corresponding parts of the raw-signal spectrum. Furthermore, we conclude that the proposed method separates the whistle and click components from the raw signal effectively and also reduces noise to some extent.
In this paper, the EMD algorithm was selected for comparison because of its strength in analyzing nonlinear and nonstationary signals. The EMD method separates a signal into different intrinsic mode functions (IMFs). As shown in Figure 5(b), the spectra of the first three IMFs are plotted; most of the harmonics, which represent the whistle components, can be extracted by the EMD algorithm. However, the click components cannot be extracted by the EMD method. In addition, noise components with relatively high amplitude are distributed over the whole frequency domain. In conclusion, the proposed method separates the whistle and click components from the raw signal better than EMD and suppresses noise more strongly.

Conclusion
In this paper, we have proposed a sparse representation method based on TQWT to extract the click and whistle components from a raw whale signal according to the oscillatory behavior of these two components. Because click and whistle signals differ in oscillatory intensity, the TQWT algorithm is run with a high Q factor to extract the whistle component and with a low Q factor to extract the click component. In addition, BPD is used as an alternative to LTI filters for noise reduction. The proposed method and the EMD algorithm are applied to the same real whale vocalization. Our proposed method appears to be superior to EMD for extracting the precise frequency variation of the whistle. In conclusion, the proposed method performs well in extracting the click and whistle components and in removing noise.

Figure 2: The filter banks of TQWT. (a) The analysis filter banks. (b) The synthesis filter banks.

For the basis pursuit problem, we used the SALSA algorithm with parameters μ = 2, η = 0.06, and P = 100. After applying the two above-mentioned methods, we obtain the sparse representation of the extracted click components. The time-frequency signature of the extracted click components is shown in Figure 4(c). Compared with Figure 4(a), the click components in Figure 4(c) appear at the same positions in the time domain, and their shapes are also the same. In addition, Figure 4(c) is much cleaner than Figure 4(a), because the BPD algorithm eliminates the noise. However, we have marked weak and strong waveforms of the click signal with an ellipse and a rectangle, respectively, as seen in Figure 6. Comparing the two subfigures in Figure 6, the weak click has low energy, comparable to that of the harmonics, so it cannot be extracted effectively. Further, to verify the effectiveness of the click extraction, the spectra and the waveforms of the raw and extracted signals are drawn in Figures 5(b) and 6, respectively.

Figure 6: The waveforms of the raw signal and the extracted whistle and click components obtained by the proposed method.