Phase-Encoded Speech Spectrograms

Seelamantula, Chandra Sekhar

doi:10.21437/Interspeech.2016-1600

Spectrograms of speech and audio signals are time-frequency densities; by construction, they are non-negative and carry no phase. Under certain conditions on the overlap between consecutive frames and on the frequency sampling, it is possible to reconstruct the signal from the spectrogram. Deviating from this requirement, we develop a new technique that incorporates the phase of the signal in the spectrogram by satisfying what we call the delta dominance condition, which in general differs from the well-known minimum-phase condition. In fact, there are signals that are delta dominant but not minimum-phase, and vice versa. The delta dominance condition can be satisfied in multiple ways, for example by placing a Kronecker impulse of the right amplitude or by choosing a suitable window function. A direct consequence of constructing spectrograms this way is that the phase of the signal is encoded, or embedded, in the spectrogram itself. We also develop a reconstruction methodology that takes such phase-encoded spectrograms and recovers the signal using the discrete Fourier transform (DFT). We envisage that this new class of phase-encoded spectrogram representations will find applications in speech processing tasks such as analysis, synthesis, enhancement, and recognition.
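As a rough illustration of how a magnitude-only frame can carry phase, the following Python sketch prepends a Kronecker impulse to one frame and recovers the frame from the log-magnitude spectrum alone. It is not taken from the paper: it assumes that delta dominance means the impulse amplitude exceeds the sum of the absolute values of all other samples, it places the impulse at the start of the frame (a special case in which the frame is also minimum-phase), and it uses classical cepstral folding as a stand-in for the DFT-based reconstruction. The names `encode_frame`, `decode_frame`, `margin`, and `oversample` are hypothetical.

```python
import numpy as np

def encode_frame(s, margin=10.0, oversample=16):
    """Prepend a dominant impulse and return the log-magnitude spectrum.

    Assumption: the impulse amplitude A is `margin` times the sum of
    |s[n]|, so that |A| > sum(|s[n]|) holds with room to spare.
    """
    s = np.asarray(s, dtype=float)
    A = margin * np.sum(np.abs(s)) + 1e-12   # impulse amplitude (illustrative choice)
    x = np.concatenate(([A], s))             # impulse at n = 0, frame at n >= 1
    M = oversample * len(x)                  # zero-padded DFT length (even)
    X = np.fft.fft(x, M)
    # |X| >= A - sum(|s|) > 0, so the logarithm is always finite.
    return np.log(np.abs(X))

def decode_frame(log_mag, n):
    """Recover n frame samples from the log-magnitude spectrum only."""
    M = len(log_mag)
    c = np.fft.ifft(log_mag).real            # real cepstrum (even sequence)
    fold = np.zeros(M)
    fold[0] = 1.0
    fold[1:M // 2] = 2.0                     # fold negative quefrencies onto positive ones
    fold[M // 2] = 1.0
    log_X = np.fft.fft(c * fold)             # complex log-spectrum: magnitude and phase
    x_hat = np.fft.ifft(np.exp(log_X)).real
    return x_hat[1:n + 1]                    # drop the impulse at n = 0

rng = np.random.default_rng(0)
s = rng.standard_normal(32)                  # stand-in for one speech frame
s_hat = decode_frame(encode_frame(s), len(s))
print(np.max(np.abs(s - s_hat)))             # small reconstruction error
```

In this sketch, a larger `margin` or `oversample` reduces the cepstral aliasing error, at the cost of a spectrum dominated by the impulse or a longer DFT. The more general delta-dominant cases mentioned in the abstract, which are not minimum-phase, are not covered here.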

Cite as: Seelamantula, C.S. (2016) Phase-Encoded Speech Spectrograms. Proc. Interspeech 2016, 1775-1779, doi: 10.21437/Interspeech.2016-1600

@inproceedings{seelamantula16_interspeech,
  author={Chandra Sekhar Seelamantula},
  title={{Phase-Encoded Speech Spectrograms}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={1775--1779},
  doi={10.21437/Interspeech.2016-1600}
}