Analysis of voice-quality features of speech that expresses 'anger', 'joy', and 'sadness' uttered by radio actors and actresses

Takeda, Shoichi; Yasuda, Yuuri; Isobe, Risako; Kiryu, Shogo; Tsuru, Makiko

doi:10.21437/Interspeech.2008-548

This paper analyzes the voice-quality features of "anger", "joy", and "sadness" in Japanese speech as a function of the degree of emotion. The degrees of emotion were "neutral", "light", "medium", and "strong". Among voice-quality features, we focused on the noise level of the glottal-flow waveform. We adopted an autoregressive (AR) model and measured the noise level of the prediction residual signal of speech expressing each emotion. To measure the noise level relative to the signal level, we introduced the "noise-to-signal (N/S) ratio". The analysis showed that the relative noise levels in the residual-waveform spectra differed among emotions: the N/S ratio was larger in the order of "anger" > "sadness". "neutral" > "joy" by approximately 4 dB.
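
The AR-model step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact N/S definition, analysis order, or framing, so everything below is an assumption for illustration: the function names are hypothetical, the AR coefficients are fitted with the standard autocorrelation method (Levinson-Durbin recursion), and the ratio is taken simply as residual energy over signal energy in dB, which may differ from the spectral measure the authors used.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a sequence x."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n)) for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the AR normal equations; returns (a, err) with a[0] == 1.

    Convention: residual e[n] = x[n] + a[1]*x[n-1] + ... + a[p]*x[n-p].
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lpc_residual(x, a):
    """Inverse-filter x with the AR polynomial a to get the prediction residual."""
    p = len(a) - 1
    return [
        sum(a[k] * x[n - k] for k in range(min(p, n) + 1))
        for n in range(len(x))
    ]


def ns_ratio_db(x, order=12):
    """ASSUMED stand-in for the N/S ratio: residual energy / signal energy, in dB."""
    signal_energy = sum(v * v for v in x)
    if signal_energy == 0.0:
        raise ValueError("signal has zero energy")
    a, _ = levinson_durbin(autocorrelation(x, order), order)
    e = lpc_residual(x, a)
    return 10.0 * math.log10(sum(v * v for v in e) / signal_energy)


def make_ar1(n, coeff, seed=0):
    """Hypothetical test signal: x[n] = coeff * x[n-1] + white Gaussian noise."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = coeff * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x
```

For example, `ns_ratio_db(make_ar1(2000, 0.9), order=2)` returns a negative value in dB, since the residual carries less energy than the signal it was predicted from. On real speech the analysis would normally be applied frame by frame to windowed segments, a step omitted here.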

Cite as: Takeda, S., Yasuda, Y., Isobe, R., Kiryu, S., Tsuru, M. (2008) Analysis of voice-quality features of speech that expresses 'anger', 'joy', and 'sadness' uttered by radio actors and actresses. Proc. Interspeech 2008, 2114-2117, doi: 10.21437/Interspeech.2008-548

@inproceedings{takeda08_interspeech,
  author={Shoichi Takeda and Yuuri Yasuda and Risako Isobe and Shogo Kiryu and Makiko Tsuru},
  title={{Analysis of voice-quality features of speech that expresses 'anger', 'joy', and 'sadness' uttered by radio actors and actresses}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2114--2117},
  doi={10.21437/Interspeech.2008-548}
}