Abstract
We propose to combine cepstrum and nonlinear time–frequency (TF) analysis to study multi-component oscillatory signals with time-varying frequency and amplitude and with a time-varying non-sinusoidal oscillatory pattern. The concept of cepstrum is applied to eliminate the influence of the wave-shape function on the TF analysis, and we propose a new algorithm, named the de-shape synchrosqueezing transform (de-shape SST). The underlying mathematical model, the adaptive non-harmonic (ANH) model, is introduced, and the de-shape SST algorithm is theoretically analyzed. In addition to simulated signals, several physiological, musical and biological signals are analyzed to illustrate the proposed algorithm.
Notes
The P, Q, R, S, and T are significant landmarks of the ECG signal. The P wave represents atrial depolarization. The Q wave is any downward deflection after the P wave. The R wave follows as a spiky upward deflection, and the S wave is any downward deflection after the R wave. The Q wave, R wave, and S wave form the QRS complex, which corresponds to ventricular depolarization. The T wave follows the S wave and represents ventricular repolarization. The QT interval is the length of the time interval between the start of the Q wave and the end of the T wave of one heart beat, and the RR interval is the length of the time interval between the R landmarks of two consecutive heart beats. We can view the R peak as a surrogate of the cardiac cycle, and hence the RR interval as a surrogate of the inverse of the heart rate. See Fig. 4 for an example of the P, Q, R, S, and T landmarks and the RR and QT intervals. For more information about the ECG signal, we refer the reader to [21].
The term “cepstrum” was coined by reversing the consonants of the first part of the word “spectrum” in order to signify their difference. Similarly, the word “quefrency” is the inversion of the first part of “frequency”. By definition, the quefrency has the same unit as time.
In music processing, the high-quefrency part of the cepstrum is related to the pitch, while the low-quefrency part is related to the timbre (i.e., sound color).
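As a minimal illustration of the relation between quefrency and pitch, the following Python sketch computes the classical real cepstrum (the inverse Fourier transform of the log-magnitude spectrum) of a synthetic non-sinusoidal oscillation and reads the fundamental period from the dominant high-quefrency peak. The sampling rate, the fundamental frequency, the harmonic amplitudes, the spectral floor and the search range are assumptions chosen for this example only, and the naive DFT is used merely to keep the sketch dependency-free.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (dependency-free sketch)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def real_cepstrum(x, floor=1e-6):
    """Inverse DFT of the log-magnitude spectrum; the index is the quefrency in samples.

    `floor` is an assumed regularization that avoids log(0) at empty bins.
    """
    log_mag = [math.log(abs(v) + floor) for v in dft(x)]
    return [c.real for c in dft(log_mag, inverse=True)]


FS = 1000.0  # sampling rate in Hz (assumed)
F0 = 50.0    # fundamental frequency in Hz (assumed)
N = 400      # number of samples; F0 * N / FS is an integer, so there is no leakage

# Non-sinusoidal oscillation: a fundamental plus 7 harmonics with 1/h amplitudes.
signal = [
    sum(math.cos(2 * math.pi * h * F0 * n / FS) / h for h in range(1, 9))
    for n in range(N)
]

cep = real_cepstrum(signal)

# Look for the pitch peak among quefrencies between 10 ms and 30 ms.
q_lo, q_hi = int(0.010 * FS), int(0.030 * FS)
peak_q = max(range(q_lo, q_hi + 1), key=lambda q: cep[q])
estimated_f0 = FS / peak_q  # the quefrency (in samples) is converted back to Hz
```

Under these assumptions the peak sits at the quefrency equal to the period of the fundamental, so `estimated_f0` recovers the assumed pitch; the slowly varying envelope of the harmonic amplitudes (the timbre) only affects the low-quefrency entries.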
The phase factor \(e^{i2\pi \xi t}\) in this definition is not always present in the literature, and this particular form is therefore sometimes called the modified STFT. With a slight abuse of notation, we still call it the STFT.
The absence of even harmonics is (part of) what is responsible for the “warm” or “dark” sound of a clarinet compared to the “bright” sound of a saxophone.
References
Alexandre, P., Lockwood, P.: Root cepstral analysis: a unified view. Application to speech processing in car noise environments. Speech Commun. 12(3), 277–288 (1993)
Auger, F., Flandrin, P.: Improving the readability of time-frequency and time-scale representations by the reassignment method. IEEE Trans. Signal Process. 43(5), 1068–1089 (1995)
Balazs, P., Dörfler, M., Jaillet, F., Holighaus, N., Velasco, G.: Theory, implementation and applications of nonstationary Gabor frames. J. Comput. Appl. Math. 236(6), 1481–1496 (2011)
Benchetrit, G.: Breathing pattern in humans: diversity and individuality. Respir. Physiol. 122(2–3), 123–129 (2000)
Bogert, B.P., Healy, M.J.R., Tukey, J.W.: The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum and shape cracking. Proc. Symp. Time Series Anal. 15, 209–243 (1963)
Chen, Y.-C., Cheng, M.-Y., Wu, H.-T.: Nonparametric and adaptive modeling of dynamic seasonality and trend with heteroscedastic and dependent errors. J. R. Stat. Soc. B 76, 651–682 (2014)
Chui, C.K., Lin, Y.-T., Wu, H.-T.: Real-time dynamics acquisition from irregular samples—with application to anesthesia evaluation. Anal. Appl. 14(4), 1550016 (2016). doi:10.1142/S0219530515500165
Chui, C.K., Mhaskar, H.N.: Signal decomposition and analysis via extraction of frequencies. Appl. Comput. Harmon. Anal. 40(1), 97–136 (2016)
Cicone, A., Liu, J., Zhou, H.: Adaptive local iterative filtering for signal decomposition and instantaneous frequency analysis. Appl. Comput. Harmon. Anal. 41(2), 384–411 (2016)
Clifford, G.D., Azuaje, F., McSharry, P.E.: Advanced Methods and Tools for ECG Data Analysis. Artech House Publishers, Norwood (2006)
Coifman, R.R., Steinerberger, S.: Nonlinear phase unwinding of functions. J. Fourier Anal. Appl. (2015). doi:10.1007/s00041-016-9489-3
Daubechies, I., Lu, J., Wu, H.-T.: Synchrosqueezed wavelet transforms: an empirical mode decomposition-like tool. Appl. Comput. Harmon. Anal. 30, 243–261 (2011)
Daubechies, I., Wang, Y., Wu, H.-T.: ConceFT: concentration of frequency and time via a multitapered synchrosqueezing transform. Philos. Trans. R. Soc. Lond. A 374(2065), 20150193 (2016)
Davila, M.I.: Noncontact extraction of human arterial pulse with a commercial digital color video camera. Ph.D. thesis, University of Illinois at Chicago, Chicago (2012)
Emiya, V., David, B., Badeau, R.: A parametric method for pitch estimation of piano tones. In: Proc. IEEE Int. Conf. Acoust. Speech Signal Proc., pp. 249–252 (2007)
Flandrin, P.: Time-Frequency/Time-Scale Analysis, Wavelet Analysis and Its Applications, vol. 10. Academic Press Inc., San Diego (1999)
Fletcher, H.: Normal vibration frequencies of a stiff piano string. J. Acoust. Soc. Am. 36(1), 203–209 (1964)
Fletcher, N.H., Rossing, T.D.: The Physics of Musical Instruments, 2nd edn. Springer, New York (2010)
Fossa, A.A., Zhou, M.: Assessing QT prolongation and electrocardiography restitution using a beat-to-beat method. Cardiol. J. 17(3), 230–243 (2010)
Fridericia, L.S.: EKG systolic duration in normal subjects and heart disease patients. Acta Med. Scand. 53, 469–488 (1920)
Goldberger, A.L.: Clinical Electrocardiography: A Simplified Approach. Mosby, St. Louis (2006)
Goldberger, A.L., Amaral, L.A.N., Glass, L., Hausdorff, J.M., Ivanov, PCh., Mark, R.G., Mietus, J.E., Moody, G.B., Peng, C.-K., Stanley, H.E.: Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23), e215–e220 (2000)
Guharay, S., Thakur, G., Goodman, F., Rosen, S., Houser, D.: Analysis of non-stationary dynamics in the financial system. Econ. Lett. 121, 454–457 (2013)
Hermansky, H.: Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Am. 87(4), 1738–1752 (1990)
Herry, C.L., Frasch, M., Seely, A., Wu, H.-T.: Heart beat classification from single-lead ECG using the synchrosqueezing transform. Physiol. Meas. 38, 171 (2016)
Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Springer, Berlin (1990)
Hou, T., Shi, Z.: Data-driven time-frequency analysis. Appl. Comput. Harmon. Anal. 35(2), 284–308 (2013)
Hou, T.Y., Shi, Z.: Extracting a shape function for a signal with intra-wave frequency modulation. Philos. Trans. R. Soc. Lond. A 374(2065), 20150194 (2016)
Huang, N.E., Shen, Z., Long, S.R., Wu, M.C., Shih, H.H., Zheng, Q., Yen, N.-C., Tung, C.C., Liu, H.H.: The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. A 454(1971), 903–995 (1998)
Iatsenko, D., Bernjak, A., Stankovski, T., Shiogai, Y., Owen-Lynch, P.J., Clarkson, P.B.M., McClintock, P.V.E., Stefanovska, A.: Evolution of cardiorespiratory interactions with age. Philos. Trans. R. Soc. A 371(20110622), 1–18 (2013)
Indefrey, H., Hess, W., Seeser, G.: Design and evaluation of double-transform pitch determination algorithms with nonlinear distortion in the frequency domain-preliminary results. In: Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 415–418 (1985)
Khadkevich, M., Omologo, M.: Time-frequency reassigned features for automatic chord recognition. In: IEEE, Proc. ICASSP, pp. 181–184 (2011)
Klapuri, A.: Multipitch analysis of polyphonic music and speech signals using an auditory model. IEEE Trans. Audio, Speech, Lang. Proc. 16(2), 255–266 (2008)
Kobayashi, T., Imai, S.: Spectral analysis using generalized cepstrum. IEEE Trans. Acoust. Speech Signal Proc. 32(5), 1087–1089 (1984)
Kowalski, M., Meynard, A., Wu, H.-T.: Convex optimization approach to signals with fast varying instantaneous frequency. Appl. Comput. Harmon. Anal. (2016). doi:10.1016/j.acha.2016.03.008
Kraft, S., Zölzer, U.: Polyphonic pitch detection by iterative analysis of the autocorrelation function. In: Proc. Int. Conf. Digital Audio Effects, pp. 1–8 (2014)
Lim, J.S.: Spectral root homomorphic deconvolution system. IEEE Trans. Acoust. Speech, Signal Proc. 27(3), 223–233 (1979)
Lin, Y.-T., Hseu, S.-S., Yien, H.-W., Tsao, J.: Analyzing autonomic activity in electrocardiography about general anesthesia by spectrogram with multitaper time-frequency reassignment. IEEE-BMEI 2, 628–632 (2011)
Lin, Y.-T., Wu, H.-T.: ConceFT for time-varying heart rate variability analysis as a measure of noxious stimulation during general anesthesia. IEEE Trans. Biomed. Eng. 64(1), 145–154 (2016)
Lin, Y.-T., Wu, H.-T., Tsao, J., Yien, H.-W., Hseu, S.-S.: Time-varying spectral analysis revealing differential effects of sevoflurane anaesthesia: non-rhythmic-to-rhythmic ratio. Acta Anaesthesiol. Scand. 58, 157–167 (2014)
Montgomery, H.L.: Lectures on the Interface Between Analytic Number Theory and Harmonic Analysis. AMS, Providence (1994)
Oberlin, T., Meignen, S., Perrier, V.: Second-order synchrosqueezing transform or invertible reassignment? Towards ideal time-frequency representations. IEEE Trans. Signal Process. 63(5), 1335–1344 (2015)
Oppenheim, A.V., Schafer, R.W.: From frequency to quefrency: a history of the cepstrum. IEEE Signal Process. Mag. 21(5), 95–106 (2004)
Oppenheim, A.V., Schafer, R.W.: Discrete-Time Signal Processing, 3rd edn. Prentice Hall, Englewood Cliffs (2009)
Passilongo, D., Mattioli, L., Bassi, E., Szabó, L., Apollonio, M.: Visualizing sound: counting wolves by using a spectral view of the chorus howling. Front. Zool. 12(1), 1–10 (2015)
Peeters, G.: Music pitch representation by periodicity measures based on combined temporal and spectral representations. In: Proc. IEEE Int. Conf. Acoust. Speech Signal Proc. (2006)
Peeters, G., Rodet, X.: Sinola: a new analysis/synthesis method using spectrum peak shape distortion, phase and reassigned spectrum. In: Proc. ICMC, vol. 99, Citeseer (1999)
Ricaud, B., Stempfel, G., Torrésani, B.: An optimally concentrated Gabor transform for localized time-frequency components. Adv. Comput. Math. 40, 683–702 (2014)
Stevens, S.S.: On the psychophysical law. Psychol. Rev. 64(3), 153 (1957)
Su, L., Chuang, T.-Y., Yang, Y.-H.: Exploiting frequency, periodicity and harmonicity using advanced time-frequency concentration techniques for multipitch estimation of choir and symphony. In: ISMIR (2016)
Su, L., Yang, Y.-H.: Combining spectral and temporal representations for multipitch estimation of polyphonic music. IEEE/ACM Trans. Audio Speech Lang. Process. 23(10), 1600–1612 (2015)
Su, L., Yu, L.-F., Lai, H.-Y., Yang, Y.-H.: Resolving octave ambiguities: a cross-dataset investigation. In: Proc, Sound and Music Computing (SMC) (2014)
Taxt, T.: Comparison of cepstrum-based methods for radial blind deconvolution of ultrasound images. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 44(3), 666–674 (1997)
Ternström, S.: Perceptual evaluations of voice scatter in unison choir sounds. J. Voice 7(2), 129–135 (1993)
Thakur, G.: The synchrosqueezing transform for instantaneous spectral analysis. Excursions in Harmonic Analysis, vol. 4, pp. 397–406. Springer, Berlin (2015)
Tokuda, K., Kobayashi, T., Masuko, T., Imai, S.: Mel-generalized cepstral analysis: a unified approach to speech spectral estimation. In: Proc. Int. Conf. Spoken Language Processing (1994)
Tolonen, T., Karjalainen, M.: A computationally efficient multipitch analysis model. IEEE Speech Audio Process. 8(6), 708–716 (2000)
Wu, H.-T.: Instantaneous frequency and wave shape functions (I). Appl. Comput. Harmon. Anal. 35, 181–199 (2013)
Wu, H.-T., Chang, H.-H., Wu, H.-K., Wang, C.-L., Yang, Y.-L., Wu, W.-H.: Application of wave-shape functions and synchrosqueezing transform to pulse signal analysis, submitted (2015)
Wu, H.-T., Talmon, R., Lo, Y.-L.: Assess sleep stage by modern signal processing techniques. IEEE Trans. Biomed. Eng. 62, 1159–1168 (2015)
Xi, S., Cao, H., Chen, X., Zhang, X., Jin, X.: A frequency-shift synchrosqueezing method for instantaneous speed estimation of rotating machinery. ASME J. Manuf. Sci. Eng. 137(3), 031012–031012-11 (2015)
Yang, H.: Synchrosqueezed wave packet transforms and diffeomorphism based spectral analysis for 1D general mode decompositions. Appl. Comput. Harmon. Anal. 39, 33–66 (2014)
Zhao, X., Wang, D.: Analyzing noise robustness of MFCC and GFCC features in speaker identification. In: IEEE Int. Conf. Acoustics, Speech, Signal Proc. (ICASSP), IEEE, pp. 7204–7208 (2013)
Acknowledgements
Hau-tieng Wu’s research is partially supported by Sloan Research Fellowship FR-2015-65363. Part of this work was done during Hau-tieng Wu’s visit to the National Center for Theoretical Sciences (NCTS), Taiwan, and he would like to thank NCTS for its hospitality. Hau-tieng Wu also thanks Dr. Ilya Vinogradov for the discussion of equidistributed sequences. The authors thank Professor Stephen W. Porges for sharing the non-contact PPG signal. The authors thank the anonymous reviewers for their valuable recommendations, which improved the manuscript.
Communicated by Patrick Flandrin.
Appendix
Proof of Theorem 3.4
In this section, we analyze the STCT in Theorem 3.4 in three steps:
- first step: approximate the ANH function by a “harmonized” function by Taylor’s expansion and evaluate its STFT;
- second step: evaluate the \(\gamma \) power of the absolute value of the STFT. Since in general there will be more than one ANH component in the ANH function, we have to handle the possible interference between different ANH components. We will apply the Erdös–Turán inequality to control the interference;
- third step: find the Fourier transform of the \(\gamma \) power of the absolute value of the STFT and finish the proof.
We start with the first lemma, which allows us to locally approximate an ANH function by a sinusoidal function.
Lemma 7.1
Take \(\epsilon >0\), a sequence \(c \in \ell ^1\), \(N\in \mathbb {N}\) and \(0< C<\infty \). For \(f(t)=\frac{1}{2}B_0(t)+\sum _{\ell =1}^\infty B_\ell (t)\cos (2\pi \phi _\ell (t))\in \mathcal {D}^{c,C,N}_\epsilon \), for each \(\ell \in \{0\}\cup \mathbb {N}\) we have
Proof
Assume that \(s>0\). The proof for \(s\le 0\) is the same. By the assumption on \(B_\ell (t)\), we have
The proof of (49) follows by the same argument.
\(\square \)
The following lemma leads to the first part of Theorem 3.4, regarding the STFT. In short, for the superposition of ANH functions in \(\mathcal {D}_{\epsilon ,d}\), at each time t the function behaves like a sinusoidal function and the STFT can be approximated explicitly.
Lemma 7.2
Fix \(\epsilon {>0}\) and \(d>0\). Take \(f(t)=\sum _{k=1}^Kf_k(t)\in \mathcal {D}_{\epsilon ,d}\). Then, the STFT of f at \(t\in \mathbb {R}\) is
where \(\xi \in \mathbb {R}\) and \(\epsilon _0(t,\xi )\) is defined in (62). Furthermore, \(|{\epsilon _0}(t,\xi )|\) is of order \(\epsilon \) and decays at the rate of \(|\xi |^{-1}\) as \(|\xi |\rightarrow \infty \).
Proof
Since \(f\in L^\infty \cap C^1\subset \mathcal {S}'\) and \(h\in \mathcal {S}\), by the linearity of the STFT, we have
where \(f_{k,0}=\frac{1}{2}B_{k,0}(\cdot {)}\) and \(f_{k,\ell }(\cdot ):=B_{k,\ell }(\cdot )\cos (2\pi \phi _{k,\ell }(\cdot ))\) for \(\ell =1,2,\ldots \). Denote
where \(k=1,\cdots , K\) and \(\ell = 1,\cdots , \infty \). Next, for a fixed \(k\in \{1,\ldots ,K\}\), we evaluate the difference between \(V^{(h)}_{f_{k,\ell }}(t,\xi )\) and \(\tilde{V}^{(h)}_{f_{k,\ell }}(t,\xi )\). For each \(\ell \in \mathbb {N}\cup \{0\}\), denote
We show that \(|\epsilon _{k,\ell }(t,\xi )|\) is of order \(\epsilon \) and linearly dependent on \(c_{k}(\ell )\) for all \(t,\xi \in \mathbb {R}\). First, note that
and that
Denote
Clearly, \(\Vert \phi _{k,1}''\Vert _{L^\infty }\le \epsilon M_k\). Combining the above inequalities and Lemma 7.1, we have
which is of order \(\epsilon \) since \(\phi _{k,1}'(t)\) and \(B_{k,1}(t)\) are bounded. Note that \(|\epsilon _{k,0}(t,\xi )|\le \epsilon c_{k}(0)(\phi '_{k,1}(t) + \epsilon M_kI_2/2)\) since the phase \(\phi _{k,0}=0\). Furthermore, note that \(|\epsilon _{k,\ell }(t,\xi )|\) decays at the rate of \(|\xi |^{-1}\) as \(|\xi |\rightarrow \infty \) since
Denote
which converges since, by (15), \(\sum _{\ell =1}^\infty \ell B_{k,\ell }(t)\le C_k \sqrt{\frac{1}{4}B^2_{k,0}(t)+\frac{1}{2}\sum _{\ell =1}^\infty B^2_{k,\ell }(t)}\), and hence
Thus, \(E^{(1)}_{k}(t,\xi )\) is of order \(\epsilon \).
Finally, for each \(k\in \{1,\ldots ,K\}\), denote
By the Plancherel identity, we have
Thus, by the assumption (14) that \(\sum _{\ell =N_k+1}^\infty B_{k,\ell }(t)\le \epsilon \sqrt{\frac{1}{4}B^2_{k,0}(t)+\frac{1}{2}\sum _{\ell =1}^\infty B^2_{k,\ell }(t)}\), we have
where the last inequality holds since \(\Vert \hat{h}\Vert _{L^\infty }\le I_0\) by a direct bound. Thus, we have
where \(|E^{(2)}_{k}(t,\xi )|\) is of order \(\epsilon \). Furthermore, \(|E^{(2)}_{k}(t,\xi )|\) decays faster than \(|\xi |^{-1}\) as \(|\xi |\rightarrow \infty \) since \(\sum _{\ell =1}^\infty B_{k,\ell }(t)<\infty \) and \(\sum _{k=1}^K \sum _{\ell =0}^{N_k} \tilde{V}^{(h)}_{f_{k,\ell }}(t,\xi )\) decays faster than \(|\xi |^{-1}\) as \(|\xi |\rightarrow \infty \).
We thus have
Putting (53) and (60) together, we have
Denote
which is of order \(\epsilon \), and \(|\epsilon _0(t,\xi )|\) decays at the rate of \(|\xi |^{-1}\) as \(|\xi |\rightarrow \infty \). This completes the proof. \(\square \)
Lemma 7.3
Fix \(\epsilon {>0}\) and \(d>0\). Take \(f(t)=\sum _{k=1}^Kf_k(t)\in \mathcal {D}_{\epsilon ,d}\). Fix a window function \(h\in \mathcal {S}\). For each \(t\in \mathbb {R}\) and \(\xi \in \mathbb {R}\), we have
where \(\epsilon _1(t,\xi )\) is defined in (67) satisfying
where \(\tilde{Z}_{k,\ell }(t) := [(\ell - \epsilon ) \phi '_{k,1}(t) - \Delta , (\ell + \epsilon ) \phi '_{k,1}(t) + \Delta ]\). Note that the support of \(\epsilon _1(t,\xi )\) is inside \([-\max _{k}((N_k+\epsilon )\phi _{k,1}'(t))-\Delta ,\,\max _{k}((N_k+\epsilon )\phi _{k,1}'(t))+\Delta ]\). In particular, we have
where \(\epsilon _2(t,\xi )=\epsilon _0(t,\xi )+\epsilon _1(t,\xi )\), which is of order \(\epsilon \) and \(|\epsilon _2(t,\xi )|\) decays at the rate of \(|\xi |^{-1}\) as \(|\xi |\rightarrow \infty \).
Proof
The proof is straightforward by the smoothness assumption of h and Taylor’s expansion. Indeed, by the assumption that \(\left| \frac{\phi '_{k,\ell }(t)}{\phi '_{k,1}(t)}-\ell \right| \le \epsilon \), we know that \(|\phi '_{k,\ell }(t)-\ell \phi '_{k,1}(t)|\le \epsilon \phi _{k,1}'(t)\) for all \(\ell =1,\ldots \). Thus, since \(\hat{h}\) is compactly supported on \([-\Delta ,\Delta ]\), we have that for \(\xi \in \tilde{Z}_{k,\ell }\),
where we use the bound \(\Vert \hat{h}'\Vert _{L^\infty }\le 2\pi I_1\); for \(\xi \notin \tilde{Z}_{k,\ell }\),
Denote
By a direct bound, we have
which leads to the claim. The proof of (65) comes from a direct combination of (50) and (63). \(\square \)
By the assumption that \(0<\Delta \le \phi _{1,1}'(t)/4\), we know that for a fixed \(k\in \{1,\ldots ,K\}\), \(Z_{k,i}(t)\cap Z_{k,j}(t)=\emptyset \) for all \(i\ne j\), where \(Z_{k,\ell }\) is defined in (34). Thus, when \(K=1\), we know that for any \(\gamma >0\), the \(\gamma \) power of the absolute value of the major term in (65) becomes
since the supports of \(\hat{h}(\xi -\phi '_{1,i}(t))\) and \(\hat{h}(\xi -\phi '_{1,j}(t))\) do not overlap, when \(i\ne j\). However, when \(K>1\), although \(Z_{k,1}(t)\cap Z_{\ell ,1}(t)=\emptyset \) when \(k\ne \ell \) since \(\Delta <d/4\), there is no guarantee that \(Z_{k,i}(t)\cap Z_{\ell ,j}(t)=\emptyset \) when \(k\ne \ell \) and \(i\ne j\). So, when \(K>1\), we need to be careful when we take the power.
Definition 7.4
Fix \(\epsilon { >0}\) and \(d>0\). Take \(f(t)=\sum _{k=1}^Kf_k(t)\in \mathcal {D}_{\epsilon ,d}\). Define \({S_1(t)}=\emptyset \), and for each \(k\in \{{2},\ldots ,K\}\), define
Furthermore, define
The set \(S_k(t)\) indicates the multiples of the kth ANH function that are in danger of overlapping with the other ANH functions. To be more precise, for \(k\in \{2,\ldots ,K\}\) and \(\ell \in \{1,\ldots ,k-1\}\), the supports of \(\hat{h}(\xi -i\phi _{k}'(t))\) and \(\hat{h}(\xi -j \phi _{\ell }'(t))\), where \(i\in {\{0,\pm 1,\ldots ,\pm N_k\}}\backslash S_k\) and \(j\in { \{0,\pm 1,\ldots ,\pm N_\ell \}} \backslash S_\ell \), do not overlap. The sets \(Y_{\text {no-OL}}(t)\) and \(Y_{\text {with-OL}}(t)\) are used to control the overlapping of multiples associated with different ANH components. Note that the supports of all summands in \(\sum _{k=1}^K\sum _{\ell \in \{{0,}\pm 1,\ldots ,\pm N_k\}\backslash S_k}B_{k,\ell }(t)|\hat{h}(\xi -\ell \phi '_{k,1}(t))|\) do not overlap.
To evaluate \(|V^{(h)}_f(t,\xi )|^\gamma \), we need the following bounds to control the influence of taking the \(\gamma \) power.
Lemma 7.5
Suppose \(x\ge y\ge 0\). For \(0<\gamma \le 1\), we have
Proof
When \(x=y=0\), this is the trivial case. Suppose \(x\ge y> 0\) or \(x>y\ge 0\). By Taylor’s expansion, we have
Since \(y/x\le 1\), we obtain the bound. \(\square \)
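The role of the restriction \(0<\gamma \le 1\) can be checked numerically. The Python sketch below tests two elementary inequalities of the kind used later in the proof of Lemma 7.6, namely \((x+y)^\gamma \le x^\gamma +y^\gamma \) and \(x^\gamma -y^\gamma \le (x-y)^\gamma \) for \(x\ge y\ge 0\). They are not claimed to coincide with the displayed bound of Lemma 7.5; the function name, the sampling range and the tolerance are hypothetical choices made for this illustration.

```python
import random


def check_power_bounds(gamma, trials=10000, seed=0, tol=1e-12):
    """Return True if both inequalities hold on all random pairs x >= y >= 0.

    Checked inequalities (standard facts for 0 < gamma <= 1):
      (x + y)**gamma <= x**gamma + y**gamma      (subadditivity)
      x**gamma - y**gamma <= (x - y)**gamma      (increment bound)
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(0.0, 10.0)
        y = rng.uniform(0.0, x)  # guarantees 0 <= y <= x
        if (x + y) ** gamma > x ** gamma + y ** gamma + tol:
            return False
        if x ** gamma - y ** gamma > (x - y) ** gamma + tol:
            return False
    return True


# Both bounds hold for exponents in (0, 1] ...
results = {gamma: check_power_bounds(gamma) for gamma in (0.1, 0.5, 1.0)}
# ... and subadditivity fails once gamma exceeds 1.
fails_for_gamma_2 = not check_power_bounds(2.0)
```

The failure for \(\gamma =2\) indicates why the power is restricted to \((0,1]\): only then can the \(\gamma \) power of a sum of overlapping terms be controlled by the sum of the \(\gamma \) powers.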
Lemma 7.6
Suppose Assumption 3.2 holds and take \(0<\gamma \le 1\). Then we have
where \(\delta _3(t,\xi )\) is defined in (74) and \(\epsilon _3(t,\xi )\) is defined in (75). Moreover, \(\delta _3(t,\xi )=0\) when \(K=1\). When \(K>1\), \(\delta _3(t,\xi )\) is supported on \(Y_{\text {with-OL}}(t)\) and is bounded by \(\frac{I_0^\gamma }{2^\gamma }\sum _{k=2}^K\sum _{\ell \in S_k}B^\gamma _{k,\ell }(t)\chi _{Z_{k,\ell }}(\xi )\). \(\epsilon _3(t,\xi )\) satisfies \(|{\epsilon _3}(t,\xi )|\le |\epsilon _{2}(t,\xi )|^\gamma \).
Proof
Let \(\delta _3(t,\xi )\) and \({\epsilon _3}(t,\xi )\) be defined as
and
That is,
According to Lemmas 7.5 and 7.3, when \(\epsilon \) is small enough, by the triangle inequality that \(\big ||V^{(h)}_f(t,\xi )|- |\frac{1}{2} \sum _{k=1}^K \sum _{\ell = -N_k}^{N_k} B_{k,\ell }(t) \hat{h}(\xi - \ell \phi '_{k,1}(t))e^{i2\pi \phi _{k,\ell }(t)}|\big |\le |\epsilon _2(t,\xi )|\), we have
Note that when \(\xi \in Y_{\text {no-OL}}(t)\), \(\delta _3(t,\xi ) =0\) since the supports of all summands in \(\sum _{k=1}^K\sum _{\ell =-N_k}^{N_k}B_{k,\ell }(t)|\hat{h}(\xi -\ell \phi '_{k,1}(t))|\) do not overlap for each \(\xi \in Y_{\text {no-OL}}(t)\). Therefore, we have
Hence,
since \(\big |\frac{1}{2} \sum _{k=1}^K \sum _{\ell \in S_k(t) } B_{k,\ell }(t) \hat{h}(\xi - \ell \phi '_{k,1}(t))e^{i2\pi \phi _{k,\ell }(t)}\big |^{\gamma } \le \frac{1}{2^{\gamma }} \sum _{k=1}^K \sum _{\ell \in S_k(t) } B_{k,\ell }^{\gamma }(t) |\hat{h}(\xi - \ell \phi '_{k,1}(t))|^{\gamma }\) by Lemma 7.5. Note that when \(K=1\), \(S_1(t) = \emptyset \). Putting these together, we have
which completes the proof. \(\square \)
Before finishing the proof, we need to control the error introduced by \(\delta _3(t,\xi )\) in Lemma 7.6 when \(K\ge 2\). Note that \(\delta _3(t,\xi )\) is supported on \(Y_{\text {with-OL}}(t)\). We now control this set.
Lemma 7.7
Suppose Assumption 3.2 holds and \(K>1\). For each \(t\in \mathbb {R}\), we have for each \(k\in \{2,\ldots ,K\}\) the following bound:
where \(\#S_{k}(t)\) is the cardinality of the set \(S_{k}(t)\) and \(E^{(\ell )}(N_k)\ge 0\) is defined in (85). Clearly \(\frac{\#S_{1}(t)}{N_1}=0\).
This lemma bounds the size of the set \(S_k(t)\), which indicates that only a small fraction of the multiples of the kth ANH function are in danger of overlapping with the other ANH functions.
Proof
Fix \(k\in \{2,3,\ldots ,K\}\) and \(\ell \in \{1,\ldots ,k-1\}\). Define a set
which is the set of multiples of \(\phi '_{k,1}(t)\) that overlap some multiples of \(\phi '_{\ell ,1}(t)\). Clearly, \( S_{k}(t) \subset \cup _{\ell =1}^{k-1}S_{k,\ell }(t)\) and \(S_{k,\ell _1}(t)\) and \(S_{k,\ell _2}(t)\) might overlap when \(\ell _1\ne \ell _2\). Thus, \(\#S_{k}(t)\le { \sum _{\ell =1}^{k-1}}\#S_{k,\ell }(t)\). To evaluate the cardinality of the set \(S_{k,\ell }(t)\), denote a sequence \(s_{k,\ell }(m)\), \(m\in \mathbb {N}\), so that
By the compactly supported assumption of \(\hat{h}\), when \(s_{k,\ell }(m)\) lands in
we know that \(Z_{k,m}(t)\cap Z_{\ell ,j}\ne \emptyset \) for some j; that is,
When \(\phi '_{k,1}(t)/\phi '_{\ell ,1}(t)\) is a rational number, that is, \(\phi '_{k,1}(t)/\phi '_{\ell ,1}(t)=a/b\), where \(a,b\in \mathbb {N}\) are coprime, the sequence \(\{s_{k,\ell }(m)\}_{m\in \mathbb {N}}\) lands only on \(\{0,\phi '_{\ell ,1}(t)/b,\ldots ,(b-1)\phi '_{\ell ,1}(t)/b\}\), and does so uniformly on \([0,\phi '_{\ell ,1}(t))\), since the integer a has a multiplicative inverse modulo b; that is, there exists \(n_0\) such that \(an_0\,\, (\text {mod } b)=1\). Thus the claim holds with the worst bound
When \(\phi '_{k,1}(t)/\phi '_{\ell ,1}(t)\) is an irrational number, the sequence \(\{s_{k,\ell }(m)\}\) is equidistributed on \([0,\phi '_{\ell ,1}(t)]\) by Weyl’s criterion. We apply the following well-known Erdös–Turán inequality [41, Corollary 1.1] to bound \(\frac{\#S_{k,\ell }(t)}{N_k}\):
for all positive J. Denote \(E^{(\ell )}_J(N_k)\) to be the right hand side of (84). Then the best upper bound we could obtain from Erdös–Turán inequality is
which goes to zero when \(N_k\rightarrow \infty \); that is, when \(N_k\rightarrow \infty \), the chance that \(s_{k,\ell }(m)\) would land in \(\mathcal {Z}_{k,\ell }\) is \(\frac{4\Delta }{\phi '_{\ell ,1}(t)}\). Thus, in general we know that for the pair \((k,\ell )\), we have
and hence
which is the number of multiples of \(\phi '_{k,1}(t)\) that are close to some multiples of \(\phi '_{\ell ,1}(t)\). In conclusion, we have
\(\square \)
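The counting argument in this proof can be illustrated numerically. The following Python sketch counts, for two fundamental frequencies, the fraction of multiples of the first that fall within \(2\Delta \) of some multiple of the second, and compares it with the limiting value \(4\Delta /\phi '_{\ell ,1}(t)\) of the irrational case. It assumes that \(s_{k,\ell }(m)\) is the residue of \(m\phi '_{k,1}(t)\) modulo \(\phi '_{\ell ,1}(t)\) and that an overlap occurs when this residue lies in \([0,2\Delta )\cup (\phi '_{\ell ,1}(t)-2\Delta ,\phi '_{\ell ,1}(t))\); the function name and all numerical values are hypothetical.

```python
import math


def overlap_fraction(f_k, f_l, delta, n_k):
    """Fraction of m in {1, ..., n_k} with m * f_k within 2 * delta of a multiple of f_l.

    f_k, f_l : fundamental frequencies of two components at a fixed time (assumed values)
    delta    : half-width of the support of the window's Fourier transform
    """
    hits = 0
    for m in range(1, n_k + 1):
        s = math.fmod(m * f_k, f_l)  # residue of the m-th multiple, in [0, f_l)
        if s < 2 * delta or s > f_l - 2 * delta:
            hits += 1
    return hits / n_k


DELTA = 0.05

# Irrational ratio: the residues equidistribute, so the fraction approaches 4 * DELTA / f_l.
frac_irrational = overlap_fraction(math.sqrt(2.0), 1.0, DELTA, 20000)
limit_irrational = 4 * DELTA / 1.0

# Rational ratio a/b = 3/2: the residues only visit {0, 1/2}, so every second
# multiple overlaps, which is larger than 4 * DELTA / f_l.
frac_rational = overlap_fraction(1.5, 1.0, DELTA, 1000)
```

In this example the irrational pair gives a fraction close to \(4\Delta /\phi '_{\ell ,1}(t)=0.2\), whereas the rational pair gives exactly one half, which illustrates why the rational case has to be treated with a separate, worse bound.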
By putting the above lemmas together, we can prove Theorem 3.4, which shows that the STCT does provide the necessary information about the fundamental IF of the ANH function, even when there is more than one component.
Proof of Theorem 3.4
Note that in general \(|V^{(h)}_f(t,\cdot )|^\gamma \) is a tempered distribution, so we can define the Fourier transform in the distribution sense. Define an \(\ell ^1\) sequence \(b_k\), where \(b_k(\ell )=B^\gamma _{k,\ell }(t)\) for all \(\ell \in \{0,\ldots ,N_k\}\), \(b_k(\ell )=0\) for all \(\ell >N_k\), and \(b_k(-\ell )=b_k(\ell )\) for all \(\ell \in \mathbb {N}\cup \{0\}\). By a direct calculation, for \(q>0\), we have
where \(\hat{b}_k\) is the discrete-time Fourier transform of the \(\ell ^1\) sequence \(b_k\), which is continuous and real.
For the term \(\delta _3\), since \(\delta _3(t,\cdot )\) is compactly supported, continuous by (78) and bounded by (79), we have \(\delta _3(t,\cdot )\in L^1\) and its Fourier transform is well defined as a function. Since the support of \(\delta _3\), which is determined by the overlapped multiples of different ANH functions, cannot be controlled, we apply the Riemann–Lebesgue theorem to obtain a simple bound:
since \(|Z_{k,\ell }| =2\Delta \). To control \(\sum _{\ell \in S_k(t)}c_k^\gamma (\ell )\), we apply the simple bound \(c_k(\ell )\le \Vert c_k\Vert _{\ell ^\infty }\) for all \(\ell =0,1,\ldots ,N_k\). This leads to
where the last inequality holds by Lemma 7.7. Thus, the first term
is bounded by
Note that when \(K=1\), since \(\delta _3(t,\xi )=0\), we know that \(E_1=0\) and the bound holds trivially.
The error term \({\epsilon _3}(t,\xi )\) is of order \(\epsilon ^\gamma \) but in general it decays at the rate of \(|\xi |^{-\gamma }\) as \(|\xi |\rightarrow \infty \), so its Fourier transform is evaluated in the distribution sense. Denote \(E_2:=\mathcal {F}[{\epsilon _3}(t,\cdot )]\). We have
for all \(\psi \in \mathcal {S}\). We have thus obtained the claim. \(\square \)
Remark 1
Note that the bound for \(E_1\), which is the Fourier transform of \(\delta _3\), is the worst bound, since we could not control the locations of the overlaps between those multiples of different ANH components in the STFT. The problem we encounter can be reduced to the following problem in analytic number theory. Given an irrational number \(\alpha \), denote \(\beta _n=n\alpha -[n\alpha ]\), where \(n\in \mathbb {N}\cup \{0\}\) and [x] means the integer part of x. Denote the set \(I=\{n,-n| n\in \mathbb {N}\cup \{0\},\,0\le \beta _n<\zeta \}\cup \{n,-n|\beta _n>1-\zeta \}\), where \(\zeta >0\) is a small number. Then, what is the spectral distribution of \(\sum _{n\in I}\delta _n\star g\), where g is a smooth function compactly supported on \([-\zeta /2,\zeta /2]\)?
Proof of Corollary 3.5
By (37), \(b_k(\ell )\) is non-zero for \(\ell \in \{-N_k,\ldots ,0,\ldots ,N_k\}\). Thus \(\hat{b}_k\) is a continuous, real, and periodic function with the period equal to \(1/\phi _{k,1}'(t)\). By (57), (59), and (64), \(\epsilon _2(t,\xi )\) is bounded by \(Q\epsilon \), where
Thus, when \(\sqrt{\frac{1}{4}B^2_{k,0}(t)+\frac{1}{2}\sum _{\ell =1}^\infty B^2_{k,\ell }(t)}\) is sufficiently large and \(\epsilon \) is sufficiently small, \(\frac{1}{2^{\gamma }} \sum _{k=1}^K \sum _{\ell = -N_k}^{N_k} B_{k,\ell }^{\gamma }(t) |\hat{h}(\xi - \ell \phi '_{k,1}(t))|^{\gamma }\) dominates \(|\epsilon _2(t,\xi )|^\gamma \), since \(B^\gamma _{k,\ell }(t)>\epsilon ^{\gamma /2}\big (\frac{1}{4}B^2_{k,0}(t)+\frac{1}{2}\sum _{\ell =1}^\infty B^2_{k,\ell }(t)\big )^{\gamma /2}\) and \(\epsilon _3(t,\xi )\) is bounded by \(Q^\gamma \epsilon ^\gamma \). Moreover, when \(\Delta N_k\) is sufficiently small, \(\frac{1}{2^{\gamma }} \sum _{k=1}^K \sum _{\ell = -N_k}^{N_k} B_{k,\ell }^{\gamma }(t) |\hat{h}(\xi - \ell \phi '_{k,1}(t))|^{\gamma }\) also dominates \(\delta _3(t,\xi )\), and hence we finish the proof. \(\square \)
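The periodicity used at the beginning of this proof can be verified directly. The Python sketch below evaluates the trigonometric sum \(\sum _{\ell =-N}^{N}b(\ell )e^{-i2\pi \ell f_0 q}\) for a symmetric, finitely supported sequence b and checks that it is real and periodic in the quefrency q with period \(1/f_0\), where \(f_0\) plays the role of \(\phi '_{k,1}(t)\). The amplitudes, the value of \(\gamma \), the value of \(f_0\) and the function name are assumptions made for this example, and the way \(\hat{b}_k\) is rescaled by \(f_0\) is our reading of the statement rather than a reproduction of the displayed formula.

```python
import cmath
import math


def b_hat(q, b, f0):
    """Evaluate sum_{l=-N}^{N} b(|l|) * exp(-i 2 pi l f0 q) for a symmetric sequence.

    b  : list [b(0), b(1), ..., b(N)] of nonnegative coefficients (assumed values)
    f0 : fundamental frequency at the fixed time under consideration (assumed value)
    """
    n = len(b) - 1
    return sum(
        b[abs(l)] * cmath.exp(-2j * math.pi * l * f0 * q) for l in range(-n, n + 1)
    )


GAMMA = 0.5
F0 = 1.25  # Hz, so the expected period in quefrency is 1 / F0 = 0.8 s
amplitudes = [1.0 / (1 + l) for l in range(7)]  # hypothetical B_l, l = 0, ..., 6
b = [a ** GAMMA for a in amplitudes]            # b(l) = B_l ** gamma

grid = [0.01 * j for j in range(200)]           # quefrencies in [0, 2) seconds
values = [b_hat(q, b, F0) for q in grid]
shifted = [b_hat(q + 1.0 / F0, b, F0) for q in grid]

max_imag = max(abs(v.imag) for v in values)                        # realness
max_period_err = max(abs(v - w) for v, w in zip(values, shifted))  # periodicity
peak_value = b_hat(1.0 / F0, b, F0).real                           # value at one period
total_mass = b[0] + 2 * sum(b[1:])                                 # value at q = 0
```

Since all coefficients are nonnegative, the sum attains its maximum `total_mass` at \(q=0\) and again at every integer multiple of \(1/f_0\), which is the mechanism by which the fundamental period and its multiples appear as peaks in the quefrency domain.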
Cite this article
Lin, CY., Su, L. & Wu, HT. Wave-Shape Function Analysis. J Fourier Anal Appl 24, 451–505 (2018). https://doi.org/10.1007/s00041-017-9523-0
DOI: https://doi.org/10.1007/s00041-017-9523-0
Keywords
- Adaptive non-harmonic model
- Cepstrum
- Short-time cepstral transform
- Instantaneous frequency
- Synchrosqueezing transform
- De-shape STFT
- De-shape SST