ABSTRACT
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next-generation computing, which we call human computing, should be about anticipatory, human-centered user interfaces built for humans and based on models of human behavior. Such interfaces should transcend the traditional keyboard and mouse to offer natural, human-like interactive functions, including the ability to understand and emulate human behaviors such as affective and social signaling. This article discusses a number of components of human behavior, how they might be integrated into computers, and how far we are from realizing the front end of human computing, that is, from enabling computers to understand human behavior.
Human computing and machine understanding of human behavior: a survey. In ICMI'06/IJCAI'07: Proceedings of the ICMI 2006 and IJCAI 2007 International Conference on Artificial Intelligence for Human Computing.