ABSTRACT
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next-generation computing, which we call human computing, should be about anticipatory, human-centered user interfaces built for humans and based on models of human behavior. Such interfaces should transcend the traditional keyboard and mouse to offer natural, human-like interactive functions, including the ability to understand and emulate human behaviors such as affective and social signaling. This article discusses a number of components of human behavior, how they might be integrated into computers, and how far we are from realizing the front end of human computing, that is, from enabling computers to understand human behavior.
Human computing and machine understanding of human behavior: a survey. In ICMI'06/IJCAI'07: Proceedings of the ICMI 2006 and IJCAI 2007 International Conference on Artificial Intelligence for Human Computing.