Social Signal Interpretation (SSI)

A Framework for Real-time Sensing of Affective and Social Signals

  • Project
  • Published in: KI - Künstliche Intelligenz

Abstract

The development of anticipatory user interfaces is a key issue in human-centred computing. Building systems that allow humans to communicate with a machine as naturally and intuitively as they would with each other requires the detection and interpretation of the user’s affective and social signals. These are expressed in various and often complementary ways, including gestures, speech and facial expressions. Implementing fast and robust recognition engines is not only a necessary but also a challenging task. In this article, we introduce our Social Signal Interpretation (SSI) tool, a framework dedicated to supporting the development of such online recognition systems. The paper discusses the processing of four modalities, namely audio, video, gesture and biosignals, with a focus on affect recognition, and explains various approaches to fusing the extracted information into a final decision.
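
To make the fusion step mentioned in the abstract concrete, the following Python sketch shows one common scheme, decision-level (late) fusion: each modality-specific recognizer produces class probabilities, and these are combined by weighted averaging into a single final decision. The emotion labels, modality names and weights below are illustrative assumptions for this sketch and do not reflect the actual SSI API.

    # Decision-level fusion sketch: combine per-modality class probabilities
    # into one final decision. All names and numbers are illustrative only.
    from typing import Dict, List

    EMOTIONS: List[str] = ["neutral", "joy", "anger", "sadness"]  # assumed label set

    def fuse_decisions(probs: Dict[str, List[float]],
                       weights: Dict[str, float]) -> str:
        """Weighted average of per-modality probabilities, then argmax."""
        fused = [0.0] * len(EMOTIONS)
        total = sum(weights[m] for m in probs)
        for modality, p in probs.items():
            w = weights[modality] / total
            fused = [f + w * x for f, x in zip(fused, p)]
        return EMOTIONS[max(range(len(EMOTIONS)), key=fused.__getitem__)]

    # Hypothetical outputs of audio, video, gesture and biosignal recognizers.
    example = {
        "audio":     [0.10, 0.60, 0.20, 0.10],
        "video":     [0.20, 0.50, 0.20, 0.10],
        "gesture":   [0.30, 0.40, 0.20, 0.10],
        "biosignal": [0.25, 0.45, 0.15, 0.15],
    }
    weights = {"audio": 1.0, "video": 1.0, "gesture": 0.5, "biosignal": 0.5}
    print(fuse_decisions(example, weights))  # -> "joy"

Feature-level fusion, where the features extracted from all modalities are concatenated before a single classifier is applied, is the usual alternative to such decision-level schemes.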


Notes

  1. http://www.iis.fraunhofer.de/en/bf/bv/ks/gpe/demo/.

  2. In order to illustrate the use of SSI, which focuses on emotion recognition tasks, we present a basic version of Alfred here that simply mirrors the user’s emotion (a minimal illustrative sketch follows these notes). Thus, we do not integrate an appraisal model to simulate how Alfred appraises the user’s emotional display; see [2] for a version of Alfred based on the Alma model, which combines an appraisal mechanism with a dimensional representation of emotions.

  3. http://hcm-lab.de/ssi.html.
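
As a rough illustration of the mirroring behaviour described in note 2, the following minimal Python sketch maps the recognized user emotion directly onto the agent’s displayed expression, with no appraisal step in between. The class and method names are assumptions for this sketch and are not the actual Alfred or SSI interfaces.

    # Hypothetical emotion mirroring: Alfred displays whatever emotion the
    # recognizer reports for the user; no appraisal model is involved.
    class MirroringAgent:
        def show_expression(self, emotion: str) -> None:
            # Placeholder for driving the agent's facial animation.
            print(f"Alfred displays: {emotion}")

    def on_emotion_recognized(agent: MirroringAgent, emotion: str) -> None:
        agent.show_expression(emotion)  # mirror the user's emotional display

    on_emotion_recognized(MirroringAgent(), "joy")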

References

  1. Bee N, Falk B, André E (2009) Simplified facial animation control utilizing novel input devices: a comparative study. In: International conference on intelligent user interfaces (IUI’09), pp 197–206

  2. Bee N, André E, Vogt T, Gebhard P (2010) The use of affective and attentive cues in an empathic computer-based companion. In: Wilks Y (ed) Close engagements with artificial companions: key social, psychological, ethical and design issues. Benjamins, Amsterdam, pp 131–142

  3. Burkhardt F, Paeschke A, Rolfes M, Sendlmeier W, Weiss B (2005) A database of German emotional speech. In: Proceedings of Interspeech, Lisbon, pp 1517–1520

  4. Caridakis G, Raouzaiou A, Karpouzis K, Kollias S (2006) Synthesizing gesture expressivity based on real sequences. In: Workshop on multimodal corpora: from multimodal behaviour theories to usable models, LREC 2006 conference, Genoa, Italy, 24–26 May

  5. Caridakis G, Wagner J, Raouzaiou A, Curto Z, André E, Karpouzis K (2010) A multimodal corpus for gesture expressivity analysis. In: Multimodal corpora: advances in capturing, coding and analyzing multimodality, LREC, Malta, 17–23 May 2010

  6. Charles F, Pizzi D, Cavazza M, Vogt T, André E (2009) Emotional input for character-based interactive storytelling. In: The 8th international conference on autonomous agents and multiagent systems (AAMAS), Budapest, Hungary

  7. Conati C, Chabbal R, Maclaren H (2003) A study on using biometric sensors for detecting user emotions in educational games. In: Proceedings of the workshop “Assessing and adapting to user attitude and affects: why, when and how?” In conjunction with UM’03, 9th international conference on user modeling

  8. Ekman P (1992) An argument for basic emotions. Cogn Emot 6(3):169–200

  9. Gebhard P (2005) ALMA: a layered model of affect. In: AAMAS ’05: Proceedings of the fourth international joint conference on autonomous agents and multiagent systems. ACM, New York, pp 29–36

  10. Gilroy S, Cavazza M, Vervondel V (2011) Evaluating multimodal affective fusion with physiological signals. In: Proceedings of the international conference on intelligent user interfaces. Stanford University, Palo Alto

  11. Gilroy SW, Cavazza M, Niiranen M, André E, Vogt T, Urbain J, Seichter H, Benayoun M, Billinghurst M (2009) PAD-based multimodal affective fusion. In: Affective computing and intelligent interaction (ACII), Amsterdam

  12. Gratch J, Marsella S (2004) A domain-independent framework for modeling emotion. Cogn Syst Res 5(4):296–306

  13. Hartmann B, Mancini M, Pelachaud C (2006) Implementing expressive gesture synthesis for embodied conversational agents. In: Gesture in human-computer interaction and simulation. Lecture notes in computer science, vol 3881. Springer, Berlin, pp 188–199

  14. Hönig F, Wagner J, Batliner A, Nöth E (2009) Classification of user states with physiological signals: on-line generic features vs. specialized. In: Stewart B, Weiss S (eds) Proceedings of the 17th European signal processing conference (EUSIPCO), Glasgow, Scotland, pp 2357–2361

  15. Jacucci G, Spagnolli A, Chalambalakis A, Morrison A, Liikkanen L, Roveda S, Bertoncini M (2009) Bodily explorations in space: social experience of a multimodal art installation. In: Proceedings of the 12th IFIP TC 13 international conference on human-computer interaction: part II, INTERACT ’09. Springer, Berlin, pp 62–75

  16. Kim J, André E (2008) Emotion recognition based on physiological changes in music listening. IEEE Trans Pattern Anal Mach Intell 30:2067–2083

  17. Kim J, Lingenfelser F (2010) Ensemble approaches to parametric decision fusion for bimodal emotion recognition. In: Int conf on bio-inspired systems and signal processing (Biosignals 2010)

  18. Kim J, Ragnoni A, Biancat J (2010) In-vehicle monitoring of affective symptoms for diabetic drivers. In: Fred JFA, Gamboa H (eds) Int conf on health informatics (HEALTHINF 2010), BIOSTEC. INSTICC Press, Valencia, pp 367–372

  19. Küblbeck C, Ernst A (2006) Face detection and tracking in video sequences using the modified census transformation. Image Vis Comput 24:564–572

  20. Lingenfelser F, Wagner J, Vogt T, Kim J, André E (2010) Age and gender classification from speech using decision level fusion and ensemble based techniques. In: INTERSPEECH 2010

  21. Lingenfelser F, Wagner J, Vogt T, Kim J, André E (2010) Age and gender classification from speech using decision level fusion and ensemble based techniques. In: INTERSPEECH 2010

  22. Mehrabian A (1995) Framework for a comprehensive description and measurement of emotional states. Genet Soc Gen Psychol Monogr 121(3):339–361

  23. Pantic M, Nijholt A, Pentland A, Huang TS (2008) Human-centred intelligent human-computer interaction (HCI): how far are we from attaining it? Int J Auton Adapt Commun Syst 1(2):168–187

  24. Schuller B, Steidl S, Batliner A (2009) The INTERSPEECH 2009 emotion challenge. In: ISCA (ed) Proceedings of Interspeech 2009, pp 312–315

  25. Vogt T, André E (2009) Exploring the benefits of discretization of acoustic features for speech emotion recognition. In: Proceedings of 10th conference of the International Speech Communication Association (INTERSPEECH), ISCA, Brighton, UK, pp 328–331

  26. Vogt T, André E (2011) An evaluation of emotion units and feature types for real-time speech emotion recognition. This volume

  27. Wagner J, Kim J, André E (2005) From physiological signals to emotions: implementing and comparing selected methods for feature extraction and classification. In: IEEE international conference on multimedia and Expo, ICME 2005, pp 940–943

  28. Wagner J, André E, Jung F (2009) Smart sensor integration: a framework for multimodal emotion recognition in real-time. In: Affective computing and intelligent interaction (ACII 2009), IEEE

  29. Wagner J, Jung F, Kim J, André E, Vogt T (2010) The smart sensor integration framework and its application in EU projects. In: Kim J, Karjalainen P (eds) Workshop on bio-inspired human-machine interfaces and healthcare applications (B-Interface 2010), Biostec 2010. INSTICC Press, Valencia, pp 13–21

  30. Wobbrock JO, Wilson AD, Li Y (2007) Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes. In: Proceedings of the 20th annual ACM symposium on user interface software and technology, UIST ’07. ACM, New York, pp 159–168

Acknowledgements

The work described in this paper is funded by the EU under the research grants CALLAS (IST-34800) and CEEDS (FP7-ICT-2009-5), and by the IRIS Network of Excellence (Reference: 231824).

Author information

Corresponding author

Correspondence to Johannes Wagner.

Cite this article

Wagner, J., Lingenfelser, F., Bee, N. et al. Social Signal Interpretation (SSI). Künstl Intell 25, 251–256 (2011). https://doi.org/10.1007/s13218-011-0115-x
