ISCA Archive Eurospeech 1991

A physical approach to speech quality assessment: correlation patterns in the speech spectrogram

Tammo Houtgast, Jan A. Verhave

A bank of filters has been implemented digitally to obtain, with running speech as input, energy values within well-defined, Gaussian-shaped frequency-time windows. The analysis concentrates on the correlation between the dB outputs of pairs of different windows, with the frequency spacing and/or the time spacing between two such windows as parameters. The resulting correlation patterns reflect, in a global way, the statistics of the dynamic characteristics of running speech in both the frequency and the time domain. Various aspects of such correlation patterns will be considered briefly, illustrating interesting relations with some basic features in hearing and speech intelligibility. The main issue concerns the possible usefulness of this global measure for speech quality assessment. It is found that these correlation patterns derived from natural speech have a typical structure, providing a basis for judging the degree of "naturalness" of a token of synthetic speech.
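The kind of analysis described above can be illustrated with a minimal sketch. The abstract does not give the filter-bank parameters, so this is not the authors' implementation: it assumes a short-time Fourier transform with a Gaussian time window as a stand-in for the Gaussian-shaped frequency-time windows, and the function names (`gabor_db`, `window_correlation`) and all default values (`frame_len`, `hop`, `sigma`) are hypothetical choices made for illustration only.

```python
import numpy as np

def gabor_db(signal, frame_len=256, hop=64, sigma=32.0):
    """Energy in dB for Gaussian-windowed frames (assumed stand-in for the paper's filter bank).

    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    signal = np.asarray(signal, dtype=float)
    # Gaussian time window centred on the frame; sigma is in samples (assumed value).
    n = np.arange(frame_len) - (frame_len - 1) / 2.0
    window = np.exp(-0.5 * (n / sigma) ** 2)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Small floor avoids log10(0) for silent frames.
    return 10.0 * np.log10(np.maximum(power, 1e-12))

def window_correlation(db, band, d_band=0, d_frame=0):
    """Pearson correlation between the dB output of one window and that of a
    second window spaced d_band bins and d_frame frames away (d_frame >= 0)."""
    n_frames = db.shape[0]
    a = db[: n_frames - d_frame, band]
    b = db[d_frame:, band + d_band]
    return float(np.corrcoef(a, b)[0, 1])

if __name__ == "__main__":
    # Synthetic noise as placeholder input; real use would load running speech.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(8000)
    db = gabor_db(x)
    # A small correlation pattern over frequency spacing and time spacing.
    pattern = [[window_correlation(db, 20, df, dt) for dt in range(4)] for df in range(4)]
    print(np.round(pattern, 2))
```

Evaluating `window_correlation` over a grid of `d_band` and `d_frame` values gives a two-dimensional correlation pattern of the sort the abstract refers to; comparing such patterns for natural and synthetic speech would be the next step, which this sketch does not attempt.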


doi: 10.21437/Eurospeech.1991-78

Cite as: Houtgast, T., Verhave, J.A. (1991) A physical approach to speech quality assessment: correlation patterns in the speech spectrogram. Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991), 285-288, doi: 10.21437/Eurospeech.1991-78

@inproceedings{houtgast91_eurospeech,
  author={Tammo Houtgast and Jan A. Verhave},
  title={{A physical approach to speech quality assessment: correlation patterns in the speech spectrogram}},
  year=1991,
  booktitle={Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991)},
  pages={285--288},
  doi={10.21437/Eurospeech.1991-78}
}