Facial expression recognition of a speaker using vowel judgment and thermal image processing

  • Original Article
Artificial Life and Robotics

Abstract

We have previously developed a method for recognizing the facial expression of a speaker. That method selects three images per utterance: (i) just before speaking, (ii) while speaking the first vowel, and (iii) while speaking the last vowel. Using the speech recognition system Julius, we save static thermal images at these three timing positions. To implement the method, we recorded three subjects speaking 25 Japanese first names, which together cover all combinations of first and last vowels. The recordings were used to prepare first the training data and then the test data. Julius sometimes misrecognizes the first and/or last vowel; for example, a first vowel /a/ is sometimes misrecognized as /i/. We corrected such misrecognitions in the training data, but this correction cannot be carried out for the test data. In our implementation, the facial expressions of the three subjects were distinguished with a mean accuracy of 79.8% when they intentionally exhibited one of five facial expressions: “angry,” “happy,” “neutral,” “sad,” and “surprised.” The mean accuracy of vowel recognition by Julius was 84.1%.
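The frame selection described in the abstract can be illustrated with a minimal sketch. It assumes the speech recognizer's output has already been parsed into a list of timed phoneme segments. The `Segment` type, the `"sil"` silence label, the choice of each vowel's midpoint as the capture time, and the example utterance are all hypothetical choices made for illustration; they are not taken from the paper or from the Julius output format.

```python
from typing import List, NamedTuple, Tuple

# The five Japanese vowels; 5 first vowels x 5 last vowels = 25 combinations.
VOWELS = {"a", "i", "u", "e", "o"}
SILENCE = "sil"  # hypothetical label for non-speech segments


class Segment(NamedTuple):
    """One timed phoneme (hypothetical structure, times in seconds)."""
    phoneme: str
    start: float
    end: float


def pick_capture_times(
    segments: List[Segment],
) -> Tuple[float, Tuple[str, float], Tuple[str, float]]:
    """Return the three timing positions for one utterance.

    Result: (time just before speaking,
             (first vowel, its capture time),
             (last vowel, its capture time)).
    Assumption: "just before speaking" is taken as the onset of the first
    non-silence segment, and each vowel is captured at its midpoint.
    """
    spoken = [s for s in segments if s.phoneme != SILENCE]
    vowels = [s for s in spoken if s.phoneme in VOWELS]
    if not vowels:
        raise ValueError("no vowel found in the utterance")

    def midpoint(s: Segment) -> float:
        return (s.start + s.end) / 2

    first, last = vowels[0], vowels[-1]
    return (
        spoken[0].start,
        (first.phoneme, midpoint(first)),
        (last.phoneme, midpoint(last)),
    )


# Hypothetical alignment for a name with phonemes t-a-r-o.
example = [
    Segment("sil", 0.00, 0.30),
    Segment("t", 0.30, 0.36),
    Segment("a", 0.36, 0.50),
    Segment("r", 0.50, 0.56),
    Segment("o", 0.56, 0.80),
    Segment("sil", 0.80, 1.00),
]
before, first_vowel, last_vowel = pick_capture_times(example)
```

In this sketch a misrecognized vowel label (for example /i/ reported in place of /a/) would pass through unchanged, which mirrors the situation the abstract describes for the test data, where no manual correction is available.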

Author information

Corresponding author

Correspondence to Yasunari Yoshitomi.

Additional information

This work was presented in part at the 16th International Symposium on Artificial Life and Robotics, Oita, Japan, January 27–29, 2011.

About this article

Cite this article

Yoshitomi, Y., Asada, T., Shimada, K. et al. Facial expression recognition of a speaker using vowel judgment and thermal image processing. Artif Life Robotics 16, 318–323 (2011). https://doi.org/10.1007/s10015-011-0939-3