On Demodulation, Ridge Detection, and Synchrosqueezing for Multicomponent Signals

In this paper, we present a novel technique for the retrieval of the modes of a multicomponent signal using a time-frequency (TF) representation of the signal. Our approach is based on a novel ridge extraction method that takes into account the fact that the TF representation is both discrete in time and frequency, followed by a demodulation procedure. Numerical results show the benefits of the proposed approach for mode reconstruction in comparison to similar techniques that do not make use of demodulation. Furthermore, numerical investigations show that the proposed approach sharpens the TF representation on which it is built.


I. INTRODUCTION
There are many physical systems which generate complex signals that are often modeled as a sum of amplitude- and frequency-modulated (AM-FM) waves. These signals are generally referred to as multicomponent signals (MCSs). In many situations, it is desirable to decompose these MCSs into their individual components. As a result of their computational simplicity and efficiency, linear time-frequency (TF) transforms such as the short-time Fourier transform (STFT) and the continuous wavelet transform (CWT) have received considerable attention to this end over the last 40 years. The STFT of an MCS determines ridges in the TF plane which, once detected, allow for the reconstruction of the different components based on the TF representation evaluated on these ridges [1]. More recently, it has been shown in [2], [3] that local frequency integration can improve the robustness to noise of the reconstruction. An alternative approach to the reconstruction of the modes, based on the information computed on the ridge, is known as the synchrosqueezing transform (SST), originally introduced in [4] and theoretically studied in [5]. In essence, the technique consists of enhancing the time-scale (TS) representation, given by a wavelet transform, by reassigning the coefficients using an estimate of the instantaneous frequency (IF). Such a technique is easily transferable to the TF representation given by the STFT [6], [7]. However, in their original formulations, these techniques are not well suited for signals made of modes that are not purely harmonic. An extension of SST, known as the second-order synchrosqueezing transform, was therefore recently proposed to better deal with this case [8].
In the present paper, we introduce a novel demodulation technique, built on the second-order synchrosqueezing transform, that leads to an even sharper representation along with better mode reconstruction results. A previous attempt, which involved demodulating the signal before applying SST, was presented in [9]. In that paper, the demodulation was based on the computation of the phase of the analytic signal associated with the MCS. However, it is well known that this phase cannot be related to the instantaneous frequency (IF) of the modes which the MCS consists of. Indeed, demodulating an MCS requires estimates of the IFs of the modes. In relation to this issue, in [10], an estimate of the IF of a mono-component signal was computed using local frequency extrema of the spectrogram. An iterative procedure was proposed to accurately estimate the IFs of the modes, but mode reconstruction was not discussed.
In this paper, we first introduce a novel ridge estimation technique, based on the second-order SST method introduced in [8], followed by demodulation and finally mode reconstruction. The benefits of such a procedure are an improvement in the sharpness of the TF representation obtained and in the mode reconstruction performance. The paper is structured as follows. After this brief introduction, we recall the basics of STFT-based SST (Section II). Then, we focus on the practical implementation of ridge estimation in Section III, and move on to the definition of the demodulation procedure followed by the reconstruction algorithm based on the demodulated signal (Section IV). Numerical examples showing the relevance of the proposed approach on both simulated and real data conclude the paper.

II. BACKGROUND TO FOURIER SYNCHROSQUEEZING TRANSFORM
Prior to starting, it is useful to define our notation, and remind readers of the basic elements of STFT based SST (FSST) and of the approach to FSST which is better adapted to deal with modulated modes.

A. Basic Definitions and Notation
Let $f$ be a function in $L^1(\mathbb{R})$, the space of integrable functions. We denote by $\hat f$ the Fourier transform of $f$, defined using the following normalization:

$$\hat f(\eta) = \int_{\mathbb{R}} f(t)\, e^{-2i\pi\eta t}\, dt. \tag{1}$$

Taking a window $g$ in the Schwartz class, the space of smooth functions with fast decaying derivatives of any order, the (modified) STFT of $f$ is defined by

$$V_f^g(\eta,t) = \int_{\mathbb{R}} f(\tau)\, g^*(\tau - t)\, e^{-2i\pi\eta(\tau - t)}\, d\tau, \tag{2}$$

where $g^*(t)$ is the complex conjugate of $g(t)$.

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
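In what follows, we investigate the retrieval of the components $f_k$ of an MCS $f$ defined by:

$$f(t) = \sum_{k=1}^{K} f_k(t), \quad \text{with } f_k(t) = A_k(t)\, e^{2i\pi\phi_k(t)}, \tag{3}$$

where $A_k(t) > 0$ and $\phi_k'(t) > 0$. The signal $f$ is completely defined by its so-called ideal TF representation as follows:

$$TI_f(\eta,t) = \sum_{k=1}^{K} A_k(t)\, \delta\big(\eta - \phi_k'(t)\big). \tag{4}$$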

B. The Basics of FSST
The aim of STFT-based SST (FSST) [6], [7] is to retrieve the ideal TF representation of $f$ from its STFT, based on an estimate of the IF at time $t$ and frequency $\eta$:

$$\hat\omega_f(\eta,t) = \eta - \frac{1}{2\pi}\,\mathrm{Im}\left\{\frac{V_f^{g'}(\eta,t)}{V_f^{g}(\eta,t)}\right\}, \tag{5}$$

where $\mathrm{Im}\{X\}$ denotes the imaginary part of the complex number $X$ and $V_f^{g'}$ is the STFT computed with the window derivative $g'$. The principle of FSST is to reassign the complex coefficients $V_f^g(\eta,t)$ according to the map $(\eta,t) \mapsto (\hat\omega_f(\eta,t), t)$, by means of the synchrosqueezing operator:

$$T_f^g(\omega,t) = \frac{1}{g^*(0)} \int_{\mathbb{R}} V_f^g(\eta,t)\, \delta\big(\omega - \hat\omega_f(\eta,t)\big)\, d\eta, \tag{6}$$

where $\delta$ is the Dirac distribution. Knowing $\phi_k'$, the $k$th mode can then be retrieved by considering:

$$f_k(t) \approx \int_{\{\omega,\ |\omega - \phi_k'(t)| < d\}} T_f^g(\omega,t)\, d\omega. \tag{7}$$

Essentially, FSST reassigns the information in the TF plane and then makes use of this sharpened representation to recover the modes. Previous theoretical investigations [6], [8] have highlighted a set of signals on which the performance of FSST can be evaluated. These are defined as follows:

Definition II.1: Let $\varepsilon > 0$. We define the set $B_{\Delta,\varepsilon}$ of MCSs where
• for all $k$, $f_k$ satisfies:
• the $f_k$s are separated with resolution $\Delta$, i.e. for all $k \in \{1, \dots, K-1\}$ and for all $t$,

The synchrosqueezing operator with threshold $\gamma > 0$ and accuracy parameter $\lambda > 0$ is then defined, using a function $\rho \in \mathcal{D}(\mathbb{R})$, the space of compactly supported smooth functions, such that $\int_{\mathbb{R}} \rho(x)\, dx = 1$, as:

This definition allows us to state the main approximation result of FSST [6]: and $g$ be a window in the Schwartz class, such that $\hat g$ is compactly supported in
• For all $k \in \{1, \dots, K\}$ and all pairs $(\eta,t) \in Z_k$, s.t.
• For all $k \in \{1, \dots, K\}$ there exists a constant $C$ s.t. for all $t \in \mathbb{R}$,

A detailed proof is available in [6].
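As an illustration of the IF estimate used by FSST, the following is a minimal sketch, assuming the standard expression $\hat\omega_f(\eta,t) = \eta - \frac{1}{2\pi}\mathrm{Im}\{V_f^{g'}(\eta,t)/V_f^g(\eta,t)\}$ for the modified STFT and a real Gaussian window. The function names (`stft_at`, `if_estimate`), the sampling rate and the window width are hypothetical choices made for the example, and the integrals are approximated by plain Riemann sums.

```python
import cmath
import math

def gauss(u, sigma):
    """Gaussian window g(u)."""
    return math.exp(-u * u / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def dgauss(u, sigma):
    """Derivative g'(u) of the Gaussian window."""
    return -u / sigma ** 2 * gauss(u, sigma)

def stft_at(f, window, eta, t, fs, half_width):
    """Riemann-sum approximation of the modified STFT at one point (eta, t):
    V(eta, t) = int f(t + u) window(u) exp(-2i*pi*eta*u) du  (real window)."""
    total = 0j
    for m in range(-half_width, half_width + 1):
        u = m / fs
        total += f(t + u) * window(u) * cmath.exp(-2j * math.pi * eta * u)
    return total / fs

def if_estimate(f, eta, t, fs=256.0, sigma=0.05):
    """IF estimate eta - Im{V^{g'} / V^g} / (2*pi) at the point (eta, t)."""
    half_width = int(6 * sigma * fs)          # window truncated at about 6 sigma
    v_g = stft_at(f, lambda u: gauss(u, sigma), eta, t, fs, half_width)
    v_dg = stft_at(f, lambda u: dgauss(u, sigma), eta, t, fs, half_width)
    return eta - (v_dg / v_g).imag / (2 * math.pi)

# Purely harmonic mode at 40 Hz, analyzed 3 Hz away from its ridge.
tone = lambda t: cmath.exp(2j * math.pi * 40.0 * t)
estimate = if_estimate(tone, eta=43.0, t=0.5)
```

For a purely harmonic mode the estimate returns the mode frequency even when the analysis frequency $\eta$ is off the ridge, which is what makes the reassignment sharpen the representation. For a modulated mode this first-order estimate is biased.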

C. Second-Order Synchrosqueezing
One of the limitations of FSST is that it does not allow for the reconstruction of modes subject to significant frequency modulation. A method was recently reported to deal with mode modulation in the FSST context via second-order synchrosqueezing (VSST) [8], [11]. This technique uses a second-order approximation of the phase of the modes in the definition of the synchrosqueezing operator. VSST is based on a new complex estimate of the second-order derivative of the phase of $f$, which is computable by means of five different STFTs. Introducing $\tilde\omega_f(\eta,t) = \eta - \frac{1}{2i\pi}\frac{V_f^{g'}(\eta,t)}{V_f^g(\eta,t)}$ then enables the definition of a new IF estimate $\hat\omega_f^{(2)}(\eta,t)$ [11], in which $\mathrm{Re}\{X\}$ denotes the real part of $X$. It is worth noting here that $\hat\omega_f(\eta,t) = \mathrm{Re}\{\tilde\omega_f(\eta,t)\}$. It can easily be shown that, when $f(t)$ is a linear chirp with Gaussian amplitude, $\hat\omega_f^{(2)}(\eta,t) = \phi'(t)$. For a more general mode with Gaussian amplitude, when the IF is estimated by $\hat\omega_f^{(2)}(\eta,t)$, the estimation error involves only derivatives of the phase whose orders are larger than 3.
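VSST then consists of replacing $\hat\omega_f$ by $\hat\omega_f^{(2)}$ in FSST:

$$TV_f^g(\omega,t) = \frac{1}{g^*(0)} \int_{\mathbb{R}} V_f^g(\eta,t)\, \delta\big(\omega - \hat\omega_f^{(2)}(\eta,t)\big)\, d\eta.$$

The reconstruction of the mode $f_k$ is subsequently performed by means of the following formula:

$$f_k(t) \approx \int_{\{\omega,\ |\omega - \phi_k'(t)| < d\}} TV_f^g(\omega,t)\, d\omega. \tag{16}$$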

D. On the Computation of the STFT and Synchrosqueezed Transforms
In many practical situations, the signal $f$ is of finite length, typically defined on the interval $[0,T]$, and discretized into the samples $f\left(\frac{nT}{N}\right)$, $n = 0, \dots, N-1$. In what follows and without loss of generality, $N$ is assumed to be a power of 2 to ease the presentation. Assuming $g$ is supported on $\left[-\frac{LT}{N}, \frac{LT}{N}\right]$, with $L < N/2$, the STFT is then computed as follows:

from which we infer that:

where the last sum is computed by means of an FFT. It is common to extend, for each $q$, the sequence $(S(q,n))_n$ into a sequence of size $N_f > N$ by adding $N_f - N$ zeros to it. This operation is known as zero-padding. By doing so, one obtains an increased frequency resolution in the TF grid, but not an increased time resolution. Since the STFT is approximated by means of an FFT, only the first half of the frequency set is meaningful. That is, $V_f^g$ is approximated on the TF grid

$$\left\{0, \frac{T}{N}, \dots, \frac{(N-1)T}{N}\right\} \times \left\{0, \frac{N}{N_f T}, \dots, \frac{(N_f/2 - 1)N}{N_f T}\right\}.$$

It is worth noting here that the synchrosqueezed TF representations $T_f^g$ or $TV_f^g$ correspond to reassigned versions of the STFT on the discrete time-frequency grid defined above. The computation of $T_f^g$ from $V_f^g$ can then be carried out as explained in Algorithm 1 (putting $t_q = \frac{qT}{N}$) [8]. The same algorithm is applied to get $TV_f^g$ from $V_f^g$, replacing $\hat\omega_f$ by $\hat\omega_f^{(2)}$. The role of zero-padding is further investigated in the sequel.
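The reassignment step on the discrete grid can be sketched as follows. This is only an illustration of the principle, not a transcription of Algorithm 1: we assume that, at a fixed time index, each STFT coefficient whose magnitude exceeds the threshold $\gamma$ is moved to the frequency bin closest to its IF estimate. The function name `synchrosqueeze_column` and its arguments are hypothetical, and the normalization by $g^*(0)$ and by the frequency step is omitted.

```python
def synchrosqueeze_column(v_col, omega_col, freq_step, gamma=1e-3):
    """Reassign one time slice of the STFT onto the same frequency grid.

    v_col[p]     : STFT coefficient at frequency bin p (fixed time index)
    omega_col[p] : IF estimate (same unit as freq_step) attached to bin p
    freq_step    : spacing of the frequency grid
    """
    n_bins = len(v_col)
    t_col = [0j] * n_bins
    for v, omega in zip(v_col, omega_col):
        if abs(v) <= gamma:
            continue                      # coefficient below threshold: not reassigned
        k = round(omega / freq_step)      # nearest bin on the discrete grid
        if 0 <= k < n_bins:
            t_col[k] += v
    return t_col

# Three coefficients spread around bin 2 all point to (nearly) the same IF.
v_col = [0j, 1 + 0j, 2 + 0j, 1 + 0j, 1e-4 + 0j]
omega_col = [0.0, 2.1, 1.9, 2.0, 7.0]
squeezed = synchrosqueeze_column(v_col, omega_col, freq_step=1.0)
```

In this toy slice the three significant coefficients are gathered on a single bin, while the coefficient below the threshold is discarded. With the second-order estimate in place of the first-order one, the same routine would give a slice of the second-order transform.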

III. RIDGE ESTIMATION
Any mode reconstruction technique based on the synchrosqueezing transform requires an estimate of the ridges $(t, \phi_k'(t))$ (mode reconstruction then being based on either formula (7) or (16), depending on the type of TF representation used). In that context, we introduce a classical ridge detector that is usually applied to the spectrogram, and then investigate whether performing the ridge detection on the reassigned transform is profitable. The influence of the different parameters on the accuracy of ridge estimation, in both noiseless and noisy contexts, is also studied.

A. Algorithm for Ridge Extraction
To compute an estimate of the ridge $(t, \phi_k'(t))$, assuming knowledge of the number of modes $K$, we can use the same algorithm as described in [5] or [12], which was originally proposed in [13]. This computes a local minimum of the functional (20), where $TF_f$ is one of the TF representations given by $V_f^g$, $T_f^g$ or $TV_f^g$. However, as presented, equation (20) does not offer any algorithmic means to compute the ridges. Inspired by the above minimization problem, we derive Algorithm 2 for that purpose.
To improve the robustness of the procedure, several random initializations are required, leading to the detection of many different ridge sets (ψ k ) k =1,••• ,K , and the one retained as the output corresponds to the one maximizing Note that, at the end of the procedure, the estimated ridges need to be resorted according to increasing IF.
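A possible greedy ridge tracker in the spirit of the procedure described above is sketched below. It is an assumption-laden illustration rather than Algorithm 2 itself: starting from the maximum of the TF magnitude at an initial time index, the ridge is propagated forward and backward, and at each step the bin maximizing the squared magnitude minus penalties on the first and second differences of the bin index is retained. The function name `extract_ridge` and the penalty weights `lam` and `beta` (standing for the regularization parameters $\lambda$ and $\beta$) are hypothetical. Random restarts and the handling of several modes are left out.

```python
def extract_ridge(tf_mag, q0, lam=0.0, beta=0.0):
    """Greedy single-ridge extraction.

    tf_mag[q][k] : magnitude of the TF representation at time q, frequency bin k
    q0           : time index used for initialization
    lam, beta    : penalties on first and second differences of the ridge
    """
    n_t, n_k = len(tf_mag), len(tf_mag[0])
    ridge = [0] * n_t
    ridge[q0] = max(range(n_k), key=lambda k: tf_mag[q0][k])

    def best_bin(q, prev, prev2):
        def score(k):
            d1 = k - prev                 # first difference of the ridge
            d2 = k - 2 * prev + prev2     # second difference of the ridge
            return tf_mag[q][k] ** 2 - lam * d1 ** 2 - beta * d2 ** 2
        return max(range(n_k), key=score)

    for q in range(q0 + 1, n_t):          # forward pass
        prev = ridge[q - 1]
        prev2 = ridge[q - 2] if q - 2 >= q0 else prev
        ridge[q] = best_bin(q, prev, prev2)
    for q in range(q0 - 1, -1, -1):       # backward pass
        prev = ridge[q + 1]
        prev2 = ridge[q + 2] if q + 2 <= q0 else prev
        ridge[q] = best_bin(q, prev, prev2)
    return ridge

# Toy TF magnitude: a ridge along k = q + 2 and one spurious, stronger peak.
tf_mag = [[0.0] * 16 for _ in range(8)]
for q in range(8):
    tf_mag[q][q + 2] = 1.0
tf_mag[5][15] = 1.2
ridge_plain = extract_ridge(tf_mag, q0=3)            # no regularization
ridge_reg = extract_ridge(tf_mag, q0=3, lam=0.1)     # penalized frequency jumps
```

In this toy case the unregularized detector jumps to the spurious peak at time index 5, whereas the penalized one stays on the ridge. With `lam` and `beta` set to zero, the detector reduces to a plain maximum search at each time index.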

B. Influence of Zero-Padding on Ridge Estimation
To start the discussion on the influence of zero-padding on ridge estimation, we recall, for the case of a mono-component signal, the following estimate of $\phi'(t_n)$:

$$\arg\max_{\eta}\, |V_f^g(\eta, t_n)|, \tag{21}$$

which was studied in [14] for a noisy version of the signal, the additive noise being Gaussian and white with variance $\sigma^2$. Selecting and assuming $g$ is the Gaussian window, $g(x) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{x^2}{2\sigma^2}}$, it was proven in [14] that:

These results are interesting, but they do not consider the fact that $|V_f^g(\eta,t_n)|$ is only available on a discrete frequency grid, since it is computed using an FFT. More precisely, (21) actually corresponds to the ridge detector we would like to study (when $\lambda$ and $\beta$ are null), assuming a continuous frequency representation. To illustrate the impact of the discrete grid associated with the frequency resolution, remembering that, as already noted, TF representations are evaluated at the frequencies $\frac{kN}{N_f T}$, $k = 0, \dots, N_f/2 - 1$, we investigate the quality of the IF estimate depending on this discretization, i.e. on the choice of $N_f$. This can be quantified by measuring the mean square error (MSE) between the estimated ridge and the ground truth when the frequency resolution varies. To have a better understanding of what is at work in this ridge detection, we investigate not only the influence of zero-padding but also that of the noise level. Since the study of a linear chirp is somewhat limiting, we extend the analysis to three different types of mono-component signals whose STFTs are displayed in Fig. 1, first row (they correspond to a linear chirp, a polynomial chirp and a mode with sinusoidal phase).
The results, displayed in Fig. 1(d), show that, in a noise-free context, when STFT or VSST is used for ridge detection, the MSEs are the same for the linear chirp. This corresponds to the fact that, with VSST, the coefficients are reassigned to a maximum of the STFT (this method being based on an exact IF estimate for linear chirps). For the other two signals, the detector based on STFT behaves slightly better than the one based on VSST, but not significantly so. In contrast, since FSST is based on an inaccurate IF estimate (even for the linear chirp), the results in terms of ridge estimation are significantly worse when FSST is used as the TF representation. For this reason, we do not consider it in the simulations which follow. Finally, we remark that, in the noise-free context, for the linear and polynomial chirps of Fig. 1(a) and (b), the MSE obtained when using STFT or VSST decreases when the frequency resolution across the sampling grid is increased. However, this is no longer true for the signal of Fig. 1(c). In such a case, since the signal modulation is significant, there is no staircase effect even at a low frequency resolution such as $N_f = N$. The conclusion of this study is that the frequency resolution, for the purpose of ridge estimation, has to be tuned depending on the signal modulation: a small modulation requires a higher frequency resolution.

Now, we would like to understand what happens in noisy situations. We therefore perform the ridge detection on the linear and polynomial chirps, and also on the mode with sinusoidal phase, but with an SNR equal to 5, 0 or −5 dB. The results are depicted in Fig. 1(d) to (f) (for the latter type of signal, and whatever the TF representation used, the ridge detector does not perform well at −5 dB, therefore the results are not depicted). It is clear from Fig. 1(d) and (e) that, while a finer frequency resolution, associated with a larger $N_f$, leads to a more accurate IF estimate in the noise-free case, $N_f$ has a much smaller impact on the quality of the estimation in a noisy context. Furthermore, the quality of the estimate provided by applying the ridge detector to VSST rather than to STFT is always better: the ridge detection operates on a much sharper TF representation, which appears to be less sensitive to noise. Finally, we note that, from these simulations, $N_f = 8N$ is a good choice of frequency resolution for ridge detection purposes.
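The staircase effect discussed above can be reproduced on a small noise-free example. The sketch below is a simplified experiment under our own assumptions, not the simulation of Fig. 1: the ridge is taken as the maximum of the STFT magnitude at each time index, the zero-padded transform is evaluated with a direct DFT (instead of an FFT) to stay self-contained, and the chirp, window and sizes are arbitrary illustrative choices.

```python
import cmath
import math

def gaussian_window(half, sigma):
    """Gaussian window sampled on 2*half + 1 points (sigma in samples)."""
    return [math.exp(-m * m / (2.0 * sigma ** 2)) for m in range(-half, half + 1)]

def ridge_frequency(x, q, win, n_f, fs):
    """Frequency (Hz) maximizing the zero-padded DFT magnitude at time index q."""
    half = len(win) // 2
    best_k, best_mag = 0, -1.0
    for k in range(n_f // 2):             # only the first half of the bins is used
        acc = 0j
        for m in range(-half, half + 1):
            acc += x[q + m] * win[m + half] * cmath.exp(-2j * math.pi * k * m / n_f)
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k * fs / n_f

def ridge_mse(n_f, n=64, fs=64.0):
    """MSE between the detected ridge and the true IF of a linear chirp."""
    # Linear chirp with phase 8 t + 5 t^2, i.e. IF 8 + 10 t (in Hz).
    x = [cmath.exp(2j * math.pi * (8.0 * i / fs + 5.0 * (i / fs) ** 2)) for i in range(n)]
    win = gaussian_window(16, 5.0)
    errors = []
    for q in range(16, n - 16):           # interior samples only
        true_if = 8.0 + 10.0 * q / fs
        errors.append((ridge_frequency(x, q, win, n_f, fs) - true_if) ** 2)
    return sum(errors) / len(errors)

mse_coarse = ridge_mse(64)    # N_f = N
mse_fine = ridge_mse(512)     # N_f = 8N
```

With $N_f = N$ the detected ridge is quantized to a coarse frequency grid, so the error is dominated by the grid spacing. Increasing $N_f$ to $8N$ reduces the MSE accordingly, which is consistent with the noise-free behavior reported for the linear chirp.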

C. Influence of Regularization Parameters
Taking into account the study carried out in the previous section, the ridge detector applied to either STFT or VSST leads to good results when no regularization is used, even though, as illustrated in Fig. 1 (second row), performing ridge detection on VSST rather than STFT is always better in noisy situations.
We now study the behavior of the ridge detector applied to STFT or VSST when the regularization terms vary, in both the noise-free and noisy cases. To do so, we consider the same linear chirp as previously, in the noise-free, 0 dB and −5 dB cases. We remark that the ridge detector is much more sensitive to the regularization parameters when applied to STFT rather than VSST (see Fig. 2): the reassignment technique enables a more robust ridge detection even at a high noise level, because it corresponds to a sharper TF representation. Finally, note that the regularization parameters do not offer any improvement in terms of the accuracy of the ridge estimation, which argues against using them (the simulations shown in Fig. 2 were carried out for $N_f = 8N$, but the same results could be derived for any reasonable value of $N_f$). It is important to note here that the same conclusions would hold if the simulations were carried out on the polynomial chirp or on the mode with sinusoidal phase, as soon as the algorithm detects the ridge.

IV. DEMODULATION ALGORITHM AND MODE RECONSTRUCTION
Once a ridge is detected using an appropriate N f to avoid the staircase effect mentioned above, we compute a demodulation operator for each mode which is going to be subsequently used to extract the corresponding demodulated mode.Inverting the demodulation operator, we will finally obtain the desired mode.The modes are extracted in a sequential fashion, i.e. one at a time, a commonly used technique often referred to as the peeling method in the literature [15], [16].
It is worth noting here that, in most cases, and in contrast to our approach, when demodulation problems are considered, it is assumed that a phase function $v(t)$ is known, and this is then used to compute the so-called short-time generalized Fourier transform (STGFT). This kind of approach has also been used in [17], [18], and ridge detection can be viewed as a way to estimate this phase function. Attempts have also been made to estimate the ridges using parametric models [10]. As will be explained later, our approach is fully non-parametric.

A. Definition of Demodulation Operator
Based on the ridge estimate defined above, we introduce the demodulation algorithm for a mono-component signal $f(t) = A(t)e^{2i\pi\phi(t)}$, for which we assume the IF estimate $\psi(t)$ is computed. For the case of a linear chirp, i.e. $\phi(t) = at + bt^2$, $\psi(t)$ approximates $a + 2bt$. So, by multiplying $f(t)$ by $e^{-2i\pi(\psi(t)t/2)}$, and if the IF estimation is accurate, we should obtain a demodulated signal $f_D$ with constant frequency $a/2$, since $\phi(t) - \psi(t)t/2 = at/2$. However, this demodulation procedure is only well suited to a linear chirp, because it removes only second-order terms. Therefore, to demodulate a more general mode $f(t) = A(t)e^{2i\pi\phi(t)}$, the demodulation operator $e^{-2i\pi\left(\int_0^t \psi(x)\,dx - \psi_0 t\right)}$, where $\psi_0$ is some positive constant frequency, is a better choice, since no assumption is made about $\phi$. Indeed, by considering the signal

$$f_D(t) = f(t)\, e^{-2i\pi\left(\int_0^t \psi(x)\,dx - \psi_0 t\right)},$$

one should get a signal with constant frequency $\psi_0$. An illustration of this is shown in Fig. 3, for three different types of mode, where $\psi_0$ is equal to 100 Hz. In that figure, we display the VSST of the considered modes in the noise-free (resp. 0 dB) case, in the first (resp. second) row. In the bottom row of that figure, we display the VSSTs of the demodulated signals associated with the three modes represented in the second row ($N_f$ being taken equal to $8N$ in the ridge detection). Despite the high noise level, the demodulation performs well.

Now, let us consider how this procedure works in the multicomponent case. We illustrate this with a signal consisting of three different modes, displayed in Fig. 4(a). By applying Algorithm 2 to the VSST computed with $N_f = 8N$, we obtain the estimates $(\psi_1, \psi_2, \psi_3)$, which are subsequently used to compute three demodulated signals, as follows:

$$f_{D,k}(t) = f(t)\, e^{-2i\pi\left(\int_0^t \psi_k(x)\,dx - \psi_0 t\right)}, \quad k = 1, 2, 3.$$

The VSSTs of the three signals $f_{D,1}$, $f_{D,2}$, $f_{D,3}$ are shown in Fig. 4(b) to (d), where the SNR in the original signal equals 0 dB. It is worth noting that, in $f_{D,k}$, only the $k$th mode is demodulated.
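The effect of the demodulation operator can be checked numerically. The following minimal sketch assumes a sampled IF estimate and approximates the integral of $\psi$ with a cumulative trapezoidal rule. The function name `demodulate`, the sampling rate and the chirp parameters are hypothetical choices for the illustration, and the true IF is used in place of an estimated ridge.

```python
import cmath
import math

def demodulate(samples, psi, fs, psi0):
    """Multiply the signal by exp(-2i*pi*(int_0^t psi(x) dx - psi0 * t)).

    samples : complex samples f(n / fs)
    psi     : IF estimate at the same instants (Hz)
    psi0    : target constant frequency (Hz)
    """
    out = []
    integral = 0.0                         # running value of int_0^t psi(x) dx
    for n, s in enumerate(samples):
        if n > 0:
            integral += 0.5 * (psi[n] + psi[n - 1]) / fs   # trapezoidal rule
        t = n / fs
        out.append(s * cmath.exp(-2j * math.pi * (integral - psi0 * t)))
    return out

fs, n_samples, psi0 = 1024.0, 1024, 100.0
a, b = 50.0, 100.0                         # phase a t + b t^2, IF a + 2 b t
times = [n / fs for n in range(n_samples)]
chirp = [cmath.exp(2j * math.pi * (a * t + b * t * t)) for t in times]
psi = [a + 2.0 * b * t for t in times]     # exact IF used as the estimate
f_d = demodulate(chirp, psi, fs, psi0)

# Deviation from a purely harmonic signal at frequency psi0.
deviation = max(abs(v - cmath.exp(2j * math.pi * psi0 * t)) for v, t in zip(f_d, times))
```

With an exact IF the demodulated chirp coincides with a pure tone at $\psi_0$ up to rounding errors. With an estimated ridge, the residual modulation is governed by the IF estimation error, which is why the frequency resolution used in the ridge detection matters.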

B. Algorithm for Mode Extraction Based on Demodulation
The previous section has provided us with a means to demodulate any of the modes of the signal $f$, the number $K$ of which is assumed to be known. With that in mind, the algorithm for mode extraction is summarized in Algorithm 3. Note here that the TF representation used to compute the ridge of $f_{D,k}$ and then mode $k$ could alternatively be the FSST $T_f^g$: since the mode sought is demodulated, there is no need to take into account the modulation at this stage. Indeed, the $k$th mode of the signal $f_{D,k}$ should be a purely harmonic signal at frequency $\psi_0$ (see Fig. 4(b) to (d) for illustrations). Furthermore, while it is important to fix the frequency resolution parameter $N_f$ according to the mode modulation for ridge estimation, $N_f = N$ is used to compute $TV^g_{f_{D,k}}$, because the mode $k$, extracted at step 3 of Algorithm 3, is a purely harmonic one.
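The overall demodulate, extract, re-modulate structure can be sketched as follows. This is a deliberately simplified stand-in and not Algorithm 3 as such: the reconstruction from the synchrosqueezed transform around the ridge is replaced by a plain DFT band selection in $[\psi_0 - \Delta, \psi_0 + \Delta]$, the true IF of the mode is used in place of an estimated ridge, and the test signal (two parallel linear chirps) is an idealized case. All names (`extract_mode`, `cumulative_phase`, `delta`) are hypothetical.

```python
import cmath
import math

def cumulative_phase(psi, fs):
    """Trapezoidal approximation of int_0^t psi(x) dx at each sample."""
    phase = [0.0]
    for n in range(1, len(psi)):
        phase.append(phase[-1] + 0.5 * (psi[n] + psi[n - 1]) / fs)
    return phase

def extract_mode(samples, psi, fs, psi0, delta):
    n = len(samples)
    phase = cumulative_phase(psi, fs)
    shift = [phase[i] - psi0 * i / fs for i in range(n)]
    # Step 1: demodulate so that the targeted mode sits at frequency psi0.
    demod = [samples[i] * cmath.exp(-2j * math.pi * shift[i]) for i in range(n)]
    # Step 2 (stand-in): keep the DFT bins lying in [psi0 - delta, psi0 + delta].
    bins = [k for k in range(n // 2) if abs(k * fs / n - psi0) <= delta]
    coeffs = [sum(demod[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
              for k in bins]
    narrow = [sum(c * cmath.exp(2j * math.pi * k * m / n) for k, c in zip(bins, coeffs))
              for m in range(n)]
    # Step 3: apply the inverse of the demodulation operator.
    return [narrow[i] * cmath.exp(2j * math.pi * shift[i]) for i in range(n)]

fs, n = 256.0, 256
times = [i / fs for i in range(n)]
mode1 = [cmath.exp(2j * math.pi * (20.0 * t + 20.0 * t * t)) for t in times]
mode2 = [cmath.exp(2j * math.pi * (70.0 * t + 20.0 * t * t)) for t in times]
signal = [u + v for u, v in zip(mode1, mode2)]
psi1 = [20.0 + 40.0 * t for t in times]    # true IF of the first mode
recovered = extract_mode(signal, psi1, fs, psi0=64.0, delta=10.0)
error = max(abs(r - m) for r, m in zip(recovered, mode1))
```

In this idealized setting the demodulated first mode falls exactly on one DFT bin, so it is recovered up to rounding errors. For general modes and noisy signals, the narrow-band extraction step is where a sharp TF representation of the demodulated signal is needed.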

V. EVALUATION OF THE PERFORMANCE OF THE RECONSTRUCTION ALGORITHM
Before we assess the proposed reconstruction technique, we discuss how an optimal window length, which is crucial in all TF representations, might be determined. The emphasis is placed on the difficulty of estimating this window length in a noisy context.

A. Automatic Window Length Determination
To determine an optimal window length, we consider that $g$ is Gaussian, i.e. $g(x) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{x^2}{2\sigma^2}}$, so that its length is controlled by the parameter $\sigma$. In our framework, we seek the value of that parameter leading to the most concentrated representation. Following [19], [20], this concentration can be measured on the VSST by means of the Shannon entropy or the Rényi entropy, whose behaviors are reported to be very similar [19]. Determining $\sigma$ in this manner is particularly relevant, but only when the noise level is relatively low. Indeed, looking at Fig. 5(a) to (c), representing the Rényi entropy (with $\alpha$ equal to 3) of the VSST of the signal $f$ displayed in Fig. 4(a), one notices that it exhibits a local minimum at a specific value of $\sigma$ at noise levels lower than 0 dB, and the optimal value is relatively stable for these cases. Note also that, in such a case, the result is not dependent on the frequency resolution. We note here that the Rényi entropy is computed by considering that the second-order reassignment operator reassigns only the coefficients $V_f^g(\eta,t)$ such that $|\mathrm{Re}\{V_f^g(\eta,t)\}| > \gamma_1$ or $|\mathrm{Im}\{V_f^g(\eta,t)\}| > \gamma_2$, where $\gamma_1$ (resp. $\gamma_2$) is the standard deviation of $\mathrm{Re}\{V_f^g(\eta,t)\}$ (resp. $\mathrm{Im}\{V_f^g(\eta,t)\}$). The choice of such a threshold is motivated by the fact that the STFT of a zero-mean white Gaussian noise is also a zero-mean Gaussian process.
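As an illustration of how such a concentration measure behaves, the sketch below computes a discrete Rényi entropy of a TF magnitude array. It assumes the usual definition $\frac{1}{1-\alpha}\log_2 \sum p^\alpha$ applied to the magnitudes normalized to unit sum, which may differ in its normalization from the expression used for Fig. 5. The function name `renyi_entropy` is hypothetical.

```python
import math

def renyi_entropy(tf_mag, alpha=3.0):
    """Renyi entropy (in bits) of a nonnegative TF array normalized to unit sum."""
    total = sum(sum(row) for row in tf_mag)
    s = sum((v / total) ** alpha for row in tf_mag for v in row)
    return math.log2(s) / (1.0 - alpha)

# A representation spread over 16 cells versus one concentrated on a single cell.
spread = [[1.0] * 4 for _ in range(4)]
sharp = [[0.0] * 4 for _ in range(4)]
sharp[2][1] = 1.0
h_spread = renyi_entropy(spread)
h_sharp = renyi_entropy(sharp)
```

A uniformly spread representation over 16 cells has an entropy of 4 bits, whereas a perfectly concentrated one has zero entropy. Scanning $\sigma$ and keeping the value giving the smallest entropy would follow the selection rule described above.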
However, the technique based on the Rényi entropy to determine the optimal $\sigma$ no longer works in a very noisy context (see Fig. 5(d)), in particular because it does not take into account the number $K$ of modes. Since Algorithm 3 performs better when the ridge detection is efficient, it is natural to define the optimal value of $\sigma$ as the one that most concentrates the information on the $K$ detected ridges. This can be measured by a quantity $E_R(\sigma)$, where $E_R$ stands for "energy on the ridge", bearing in mind that the dependence on $\sigma$ is contained in $g$. $E_R$ corresponds to the proportion of the total energy located on the ridges. We depict $E_R(\sigma)$ for the same signal as before, for different noise levels, and for $N_f = N$ in Fig. 6. We notice, first, that the optimal value is close to that given by the Rényi entropy for noise levels lower than 0 dB, and that the information located on the ridge becomes less and less significant as the noise level increases.
What is interesting with this technique is that, in contrast to the Rényi entropy, it offers us a means to find a relevant $\sigma$ at noise levels as high as −5 dB. Such a technique will thus be used to determine the optimal $\sigma$ in very noisy situations. Finally, note that similar results can be obtained by considering different values of $N_f$.
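One way to compute a ridge-energy proportion on the discrete grid is sketched below. It is an interpretation under our own assumptions: the energy is taken as the squared magnitude, and the energy "on the ridges" is collected in a band of `half_width` bins around each detected ridge. The function name `energy_on_ridges` and the band parameter are hypothetical.

```python
def energy_on_ridges(tf_mag, ridges, half_width=0):
    """Proportion of the total energy located within half_width bins of the ridges.

    tf_mag[q][k] : magnitude of the TF representation at time q, bin k
    ridges       : list of ridges, each giving one bin index per time index
    """
    total = sum(v * v for row in tf_mag for v in row)
    on_ridge = 0.0
    for q, row in enumerate(tf_mag):
        bins = set()                       # a set avoids counting a bin twice
        for ridge in ridges:
            for k in range(ridge[q] - half_width, ridge[q] + half_width + 1):
                if 0 <= k < len(row):
                    bins.add(k)
        on_ridge += sum(row[k] ** 2 for k in bins)
    return on_ridge / total

tf_mag = [[0.0, 2.0, 0.0],
          [1.0, 0.0, 1.0]]
proportion = energy_on_ridges(tf_mag, ridges=[[1, 0]])
```

Evaluating this proportion for a range of window lengths and retaining the $\sigma$ that maximizes it would mirror the selection strategy used in very noisy situations.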

B. Reconstruction Procedure: Noise-Free Case
In this section, we illustrate the improvement offered by Algorithm 3 in terms of the quality of the reconstructed modes in the noise-free case, which allows the impact of parameter selection to be considered. Our test signals are displayed in Fig. 7(a) and 8(a). The window used to build the TF representation is Gaussian and its length is optimized as explained in the previous subsection. Then, ridge detection and mode reconstruction are performed using a small value for $\gamma$ in the definition of the reassignment operator (typically $\gamma = 10^{-3}$), since in such a case, all the nonzero coefficients are related to the signal.
Since the ridge computation is influenced by the frequency resolution, we investigate the impact of the value of $N_f$ used in ridge computation on mode reconstruction. Also, to show that using demodulation results in a more compact TF representation than the original VSST method, the role of $d$, used both in reconstruction formula (16) and Algorithm 3, is investigated. To assess how the ridge detection impacts mode reconstruction, we also compute the mode reconstruction assuming the IFs of the modes are known.
We study two types of signals, which are depicted in Fig. 7(a) and 8(a). The results for the first type are depicted in Fig. 7(b) to (d) and represent the output SNR defined, for mode $i$, as $20\log_{10}\left(\frac{\|f_i\|_2}{\|f_i - \tilde f_i\|_2}\right)$, where the norm is the $\ell_2$ norm and $\tilde f_i$ is the $i$th mode reconstructed using Algorithm 3, when the frequency resolution used in the ridge detection varies (in the different figures we use the term "demod"). In each case, we also display the reconstruction results using the true IFs of the modes in Algorithm 3 (in the figures we use the term "optimal demod"). We note that, as expected, for $N_f = 8N$ the results are very close to those obtained assuming knowledge of the IFs of the modes, as illustrated in Fig. 7(d). We also display the results obtained by reconstructing the modes directly using formula (16): whatever the value of $d$, the reconstruction is better when using Algorithm 3. Also, since the signal studied is only slightly modulated, choosing a sufficiently large $N_f$ for ridge estimation is crucial. Similar conclusions can be drawn from the study of the signal whose VSST is displayed in Fig. 8(a): first, increasing $N_f$ clearly improves the reconstruction results, and then, when $N_f = 8N$, the results are close to those that would be obtained if the IFs of the modes were known. Also, we again remark that the results are far better than with direct reconstruction. In this case, however, since the modes are more modulated than those of Fig. 7(a), the impact of $N_f$ on ridge computation and then on mode reconstruction is less important.
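For reference, the output SNR used to assess the reconstruction can be computed as in the short sketch below, assuming the usual definition $20\log_{10}(\|f_i\|_2 / \|f_i - \tilde f_i\|_2)$. The function name `output_snr` and the toy data are hypothetical.

```python
import math

def output_snr(mode, reconstructed):
    """Output SNR in dB: 20 log10(||mode|| / ||mode - reconstructed||), l2 norms.

    A perfect reconstruction gives a zero error norm and raises ZeroDivisionError.
    """
    signal_norm = math.sqrt(sum(abs(a) ** 2 for a in mode))
    error_norm = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(mode, reconstructed)))
    return 20.0 * math.log10(signal_norm / error_norm)

# A uniform 10 % amplitude error corresponds to an output SNR of 20 dB.
snr_db = output_snr([1.0, 1.0, 1.0, 1.0], [0.9, 0.9, 0.9, 0.9])
```

The same function applies to complex-valued modes, since the norms are computed from moduli.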

C. Reconstruction Procedure: Noisy Case
In this section, we investigate the sensitivity to noise of our new method for mode retrieval, considering again the two types of signals displayed in Fig. 7(a) and 8(a). From the study of the noise-free case, ridge computation leads to good reconstruction when $N_f = 8N$, so we retain this value in the simulations that follow. Again we use a Gaussian window with the optimal window length $\sigma$ computed as before, and with threshold $\gamma = 10^{-3}$. The results displayed in the first row of Fig. 9 represent the output SNR associated with mode reconstruction, when $d = 0$ or $d = 5$, for the first and second modes of Fig. 7(a) respectively, with respect to the global input SNR. We note the following based on these observations: whatever the noise level, the mode reconstruction is improved by using Algorithm 3 rather than direct reconstruction; the discrepancy in terms of reconstruction performance between the two types of techniques increases as the noise level decreases; and the gain from demodulating first is not that large, because VSST is optimized for linear chirps.
Switching to the study of the signal of Fig. 8(a), the benefit of using the demodulation procedure is much clearer: when a mode is very different from a linear chirp, the demodulation procedure greatly improves the reconstruction results. Finally, we remark that, as in the noise-free case, the parameter $d$ plays a crucial role in the quality of the reconstruction, and that by demodulating first, we obtain a more concentrated representation since, for a given $d$, the reconstruction is always better with demodulation than without.

D. Application to Real Data: VSST Versus EMD and Limitations
Here we consider the reconstruction of a bat echolocation signal whose VSST is shown in Fig. 10(a).Assuming the number of modes is three, which is consistent with the representation in the aforementioned figure (there is actually a fourth mode but due to aliasing effect we do not take it into account), we perform ridge extraction on the VSST (the extracted ridges are also depicted in the figure) and then compute the different modes by either using VSST or by demodulating first.In this regard, we study the influence of parameter d and frequency resolution on the reconstruction.
Since the signal studied is real, and as Algorithm 3 applies to complex signals, we first consider the Hilbert transform of the signal before applying that algorithm. Then, the length $N$ of the signal not being a power of 2, we use $N_1 = 2^{\lfloor \log_2(N) \rfloor + 1}$, and its multiples, to define the different frequency resolutions subsequently used in the ridge detection. As previously, we investigate the impact of the frequency resolution used in the ridge estimation on signal reconstruction. For this purpose, we compute the output SNR associated with the reconstruction of the signal by summing the first three modes. The results depicted in Fig. 10(b) again show the benefit of demodulating the signal first compared to direct computation, the improvement brought by using a higher frequency resolution being much less obvious than in controlled situations such as those studied before. Although these reconstruction results are satisfactory, some information is lost when considering the reconstructed signal obtained using only the first three modes. This problem arises because, for real-world signals such as the bat signal considered here, the number of modes is not constant over time: some modes vanish, but the ridge estimation assumes that the modes persist throughout the data record. This is a failing of many methods and is a topic of current research. For comparison purposes, we depict in Fig. 11(a), (b) and (c) the VSST of the first three intrinsic mode functions (IMFs) obtained with the empirical mode decomposition (EMD) [21], which is an alternative technique to extract the modes of a multicomponent signal. From these figures, we note that the modes obtained are not related to the TF content of the signal depicted in Fig. 10(a), since the TF structure based on modes corresponding to TF ridges is completely broken.
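The two preprocessing steps mentioned above can be sketched as follows, under our own assumptions: the complex signal associated with the real recording is taken to be its analytic signal, obtained here by zeroing the negative-frequency DFT bins, and the padded length is the power of two immediately above $N$. The function names `analytic_signal` and `padded_length` are hypothetical, and a direct DFT is used instead of an FFT to keep the example self-contained.

```python
import cmath
import math

def padded_length(n):
    """Power of two 2**(floor(log2(n)) + 1), i.e. the one strictly above n."""
    return 1 << n.bit_length()

def analytic_signal(x):
    """Analytic signal of a real sequence: negative-frequency bins are zeroed."""
    n = len(x)
    spec = [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue                       # DC and Nyquist bins are left unchanged
        spec[k] = 2 * spec[k] if k < n / 2 else 0j
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]

n = 32
real_tone = [math.cos(2 * math.pi * 4 * m / n) for m in range(n)]
z = analytic_signal(real_tone)
deviation = max(abs(z[m] - cmath.exp(2j * math.pi * 4 * m / n)) for m in range(n))
n1 = padded_length(400)
```

For a real cosine the analytic signal is the corresponding complex exponential, which is the form of mode assumed throughout the paper. For a recording of 400 samples, the padded length would be 512.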

VI. CONCLUSION
In this paper, we have introduced a new algorithm for the retrieval of the modes of a multicomponent signal from the study of some time-frequency representations. It is based on a novel technique for ridge estimation followed by a demodulation procedure. By using an appropriate frequency resolution, it is possible to compensate for the discretization of the frequencies induced by the use of FFTs in the computation of the TF representations, and thus obtain reliable IF estimates. The simulations carried out on test signals show that, by demodulating the signal first using these IF estimates, the associated time-frequency representation is sharpened and the accuracy of the reconstruction is much better than when direct reconstruction is performed, in both noiseless and noisy situations. Simulations performed on real signals, where the number of modes may vary with time, however show the limits of signal reconstruction based on ridge extraction. This is a challenging issue that we will address in the near future.

Fig. 1. (a) STFT of a linear chirp; (b) STFT of a polynomial chirp; (c) STFT of a mode with sinusoidal phase; (d) mean square error associated with the ridge detection for the linear chirp displayed in (a), for various frequency resolutions ($k$ in abscissa means $N_f = kN$), different TF representations and noise levels; (e) same as (d) but for the polynomial chirp displayed in (b); (f) same as (d) but for the mode with sinusoidal phase displayed in (c).

Fig. 2. (a) MSE corresponding to the ridge estimation for the linear chirp of Fig. 1(a) with $V_f^g$ as TF representation (noise-free case, $N_f = 8N$); (b) same as (a) but at a 0 dB noise level; (c) same as (a) but at a −5 dB noise level; (d) MSE corresponding to the ridge estimation for the linear chirp of Fig. 1(a) with $TV_f^g$ as TF representation (noise-free case, $N_f = 8N$); (e) same as (d) but at a 0 dB noise level; (f) same as (d) but at a −5 dB noise level.

2 , 3
are shown in Fig. 4(b) to (d), where the SNR in the original signal equals 0 dB.It is worth noting that, in f D ,k , only the kth mode is demodulated.
extract the ridge ψ D ,k corresponding to mode k of f D ,k , by considering single ridge detection in the frequency range [ψ 0 − Δ, ψ 0 + Δ]. 3. Reconstruct the kth mode of f D ,k and then multiply it by the inverse of demodulation operator to recover

Fig. 6. Computation of $E_R$, corresponding to the proportion of the energy contained in the VSST computed on the ridges, with respect to $\sigma$, in the noise-free, 5 dB, 0 dB and −5 dB cases.

Fig. 7. (a) VSST of a two-mode signal; (b) mode reconstruction ("rec $f_i$" corresponds to reconstructed mode $f_i$) when the ridges used in Algorithm 3 are computed with $N_f = N$, along with the reconstruction when the IFs of the modes are assumed to be known in Algorithm 3 ("optimal demod" in the figure); (c) same as (b), but when $N_f = 4N$ in the ridge computation; (d) same as (b), but when $N_f = 8N$ in the ridge computation.

Fig. 8. (a) VSST of a two-mode signal; (b) mode reconstruction ("rec $f_i$" corresponds to reconstructed mode $f_i$) when the ridges used in Algorithm 3 are computed with $N_f = N$, along with the reconstruction when the IFs of the modes are assumed to be known in Algorithm 3 ("optimal demod" in the figure); (c) same as (b), but when $N_f = 4N$ in the ridge computation; (d) same as (b), but when $N_f = 8N$ in the ridge computation.

Fig. 9. (a) SNR after reconstruction for mode $f_1$ of the signal whose VSST is depicted in Fig. 7(a), using either the direct reconstruction (direct) or Algorithm 3 (demod), for $d = 0$ or $d = 5$ in both cases; (b) same as (a) but for mode $f_2$ of the same signal; (c) SNR after reconstruction for mode $f_1$ of the signal whose VSST is depicted in Fig. 8(a), using either the direct reconstruction (direct) or Algorithm 3 (demod), for $d = 0$ or $d = 5$ in both cases; (d) same as (c) but for mode $f_2$ of that signal.

Fig. 10.(a) VSST of a bat echolocation call along with the corresponding ridges, (b) reconstructed signal based on VSST and assuming the number of modes equals 3.The results displayed in the first row of Fig.9, represent the output SNR associated with mode reconstruction, when d = 0