Automatic facial expression analysis: a survey
Introduction
Facial expression analysis goes well back into the nineteenth century. As early as 1872, Darwin [1] demonstrated the universality of facial expressions and their continuity in man and animals, and claimed, among other things, that there are specific inborn emotions which originated in serviceable associated habits. In 1971, Ekman and Friesen [2] postulated six primary emotions, each of which possesses a distinctive content together with a unique facial expression. These prototypic emotional displays are also referred to as so-called basic emotions. They appear to be universal across human ethnicities and cultures and comprise happiness, sadness, fear, disgust, surprise and anger.

In the past, facial expression analysis was primarily a research subject for psychologists, but as early as 1978, Suwa et al. [3] presented a preliminary investigation into automatic facial expression analysis from an image sequence. In the nineties, research on automatic facial expression analysis gained considerable momentum, starting with the pioneering work of Mase and Pentland [4]. The reasons for this renewed interest in facial expressions are manifold, but chiefly the advances accomplished in related research areas such as face detection, face tracking and face recognition, as well as the recent availability of relatively cheap computational power. Various applications using automatic facial expression analysis can be envisaged in the near future, fostering research in several areas, including image understanding, psychological studies, facial nerve grading in medicine [5], face image compression and synthetic face animation [6], video indexing, robotics and virtual reality.

Facial expression recognition should not be confused with human emotion recognition, as is often done in the computer vision community.
While facial expression recognition deals with the classification of facial motion and facial feature deformation into abstract classes that are purely based on visual information, human emotions are a result of many different factors and their state might or might not be revealed through a number of channels such as emotional voice, pose, gestures, gaze direction and facial expressions. Furthermore, emotions are not the only source of facial expressions, see Fig. 1. In contrast to facial expression recognition, emotion recognition is an interpretation attempt and often demands understanding of a given situation, together with the availability of full contextual information.
Section snippets
Facial expression measurement
Facial expressions are generated by contractions of facial muscles, which result in temporarily deformed facial features such as eyelids, eyebrows, nose, lips and skin texture, often revealed by wrinkles and bulges. Typical changes of muscular activity are brief, lasting only a few seconds. We would like to measure facial expressions accurately and therefore need a useful terminology for their description. Of importance is the location of facial
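One widely used terminology of this kind is the Facial Action Coding System (FACS), which describes an expression as a combination of numbered action units (AUs). As a minimal illustration, the sketch below looks up a prototypic emotional display from a set of detected AUs; the AU combinations are commonly cited EMFACS-style prototypes, but exact sets vary between sources, so treat them as assumptions.

```python
# Illustrative only: maps detected FACS action units (AUs) to prototypic
# emotional displays. AU sets below follow commonly cited EMFACS-style
# prototypes (an assumption; exact combinations differ between sources).
PROTOTYPES = {
    "happiness": {6, 12},        # cheek raiser + lip corner puller
    "surprise":  {1, 2, 5, 26},  # brow raisers + upper lid raiser + jaw drop
    "sadness":   {1, 4, 15},     # inner brow raiser + brow lowerer + lip corner depressor
    "anger":     {4, 5, 7, 23},  # brow lowerer + lid raiser/tightener + lip tightener
}

def classify_aus(active_aus):
    """Return the prototype whose AU set best overlaps the detected AUs,
    or None if no prototype shares any AU with the observation."""
    best, best_score = None, 0.0
    for label, proto in PROTOTYPES.items():
        # Jaccard overlap between detected and prototype AU sets
        score = len(active_aus & proto) / len(active_aus | proto)
        if score > best_score:
            best, best_score = label, score
    return best

print(classify_aus({6, 12}))  # -> happiness (exact prototype match)
```

Note that such a direct AU-to-emotion lookup is exactly the kind of interpretation step that, as discussed below, should be kept separate from the visual classification of facial actions themselves.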
Automatic facial expression analysis
Automatic facial expression analysis is a complex task, as the physiognomies of faces vary considerably from one individual to another due to differences in age, ethnicity, gender, facial hair, cosmetic products and occluding objects such as glasses and hair. Furthermore, faces appear disparate because of pose and lighting changes. Variations such as these have to be addressed at different stages of an automatic facial expression analysis system, see Fig. 2. We take a closer look at the individual
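The stages such a system passes through are typically face acquisition, facial feature extraction, and expression classification. A minimal sketch of that three-stage pipeline is given below; the stage logic is a hypothetical placeholder (a pass-through crop, a toy brightness/contrast feature, and a threshold rule), where real systems would substitute a face detector, a deformation- or motion-based feature extractor, and a trained classifier.

```python
# Minimal three-stage pipeline sketch (placeholder logic throughout):
# face acquisition -> feature extraction -> expression classification.

def acquire_face(image):
    """Stage 1: locate and normalize the face region (here: pass-through stub)."""
    return image  # a real detector would return a cropped, aligned face

def extract_features(face):
    """Stage 2: reduce the face to a feature vector (here: a toy summary)."""
    flat = [p for row in face for p in row]
    mean = sum(flat) / len(flat)
    return [mean, max(flat) - min(flat)]  # brightness + contrast proxy

def classify_expression(features):
    """Stage 3: map features to an abstract expression class (toy rule)."""
    return "expressive" if features[1] > 0.5 else "neutral"

def analyze(image):
    return classify_expression(extract_features(acquire_face(image)))

print(analyze([[0.1, 0.9], [0.2, 0.8]]))  # -> expressive
```

The value of spelling the pipeline out this way is that the variations listed above (pose, lighting, occlusion) each attach to a specific stage, which is where a system must compensate for them.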
Representative facial expression recognition systems
In this section, we take a closer look at a few representative facial expression analysis systems. First, we discuss deformation- and motion-based feature extraction systems. Then we introduce hybrid facial expression analysis systems, which employ several image analysis methods that complement each other and thus allow for better overall performance. Multi-modal frameworks, on the other hand, integrate other non-verbal communication channels for improved facial expression interpretation results.
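The distinction between the two feature-extraction families can be made concrete on tracked facial landmarks: deformation-based features compare the current frame against a neutral reference face, while motion-based features compare consecutive frames. The sketch below illustrates this on hypothetical 2-D landmark coordinates (real systems track many more points, e.g. via optical flow or model fitting).

```python
# Deformation- vs motion-based features on toy 2-D landmarks
# (hypothetical coordinates; real systems track dozens of facial points).

def displacements(ref, cur):
    """Per-landmark (dx, dy) displacement between two landmark sets."""
    return [(cx - rx, cy - ry) for (rx, ry), (cx, cy) in zip(ref, cur)]

neutral = [(10.0, 20.0), (30.0, 20.0)]   # e.g. mouth corners at rest
frames  = [
    [(10.0, 20.0), (30.0, 20.0)],        # t = 0: same as the neutral face
    [(9.0, 19.0), (31.0, 19.0)],         # t = 1: corners pulled up and outward
]

# Deformation-based: each frame relative to the neutral reference face.
deformation = [displacements(neutral, f) for f in frames]

# Motion-based: each frame relative only to the previous frame.
motion = [displacements(a, b) for a, b in zip(frames, frames[1:])]

print(deformation[1])  # [(-1.0, -1.0), (1.0, -1.0)]
print(motion[0])       # identical here, since frame 0 equals the neutral face
```

The practical consequence is that deformation-based approaches require a neutral face (or a model of one) to be available, whereas motion-based approaches only need temporally adjacent frames.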
Discussion
In this survey on automatic facial expression analysis, we have discussed automatic face analysis with regard to different motion and deformation-based extraction methods, model and image-based representation techniques as well as recognition and interpretation-based classification approaches. It is not possible to directly compare facial expression recognition results of face analysis systems found in the literature due to varying facial action labeling and different test beds that were used
Conclusion
Today, most facial expression analysis systems attempt to map facial expressions directly into basic emotional categories and are thus unable to handle facial actions caused by non-emotional mental and physiological activities. FACS may provide a solution to this dilemma, as it allows facial actions to be classified prior to any interpretation attempts. So far, only marker-based systems are able to reliably code all FACS action unit activities and intensities [58]. More work has to be done in the
Summary
In recent years, facial expression analysis has become an active research area. Various approaches towards robust facial expression recognition have been proposed, applying different image acquisition, analysis and classification methods. Facial expression analysis is an inherently multi-disciplinary field, and it is important to consider all of the domains involved in order to gain insight into how to build reliable automated facial expression analysis systems. This fact has often been neglected in
About the Author—BEAT FASEL graduated from the Swiss Federal Institute of Technology Lausanne (EPFL) with a diploma in Communication Systems. He currently works towards a Ph.D. degree at IDIAP in Martigny, Switzerland. His research interests include computer vision, pattern recognition and artificial intelligence.
References (87)
- et al., Facial expression recognition using model-based feature extraction and action parameters classification, J. Visual Commun. Image Representation (1997)
- et al., Expert system for automatic analysis of facial expression, Image Vision Comput. J. (2000)
- On the estimation of optical flow: relations between different approaches and some new results, Artif. Intell. (1987)
- The Expression of the Emotions in Man and Animals (1872)
- et al., Constants across cultures in the face and emotion, J. Personality Social Psychol. (1971)
- M. Suwa, N. Sugie, K. Fujimora, A preliminary note on pattern recognition of human emotional expression, Proceedings of...
- et al., Recognition of facial expression from optical flow, IEICE Trans. E (1991)
- et al., Review of objective topographic facial nerve evaluation methods, Am. J. Otol. (1999)
- R. Koenen, MPEG-4 Project Overview, International Organisation for Standardisation, ISO/IEC JTC1/SC29/WG11, La Baule,...
- et al., What's in a smile?, Develop. Psychol. (1999)
- Facial expression and imagery in depression: an electromyographic study, Psychosomatic Med.
- Emotions in the Human Face
- Facial Action Coding System: A Technique for the Measurement of Facial Movement
- Methods for measuring facial actions
- Cultural similarities and differences in display rules, Motivation Emotion
- Ethnic differences in affect intensity, emotion judgments, display rules, and self-reported emotional expression, Motivation Emotion
- Automatic interpretation and coding of face images using flexible models, IEEE Trans. Pattern Anal. Mach. Intell.
- Coding, analysis, interpretation and recognition of facial expressions, IEEE Trans. Pattern Anal. Mach. Intell.
- Neural network-based face detection, IEEE Trans. Pattern Anal. Mach. Intell.
- Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell.
- Recognizing facial expressions in image sequences using local parameterized models of image motion, Internat. J. Comput. Vision
- Active appearance models, IEEE Trans. Pattern Anal. Mach. Intell.
About the Author—JUERGEN LUETTIN received a Ph.D. degree in Electronic and Electrical Engineering from the University of Sheffield, UK, in the area of visual speech and speaker recognition. He joined IDIAP in Martigny, Switzerland, in 1996 as a research assistant where he worked on multimodal biometrics. From 1997 to 2000, he was head of the computer vision group at IDIAP, where he initiated and lead several European Community and Swiss SNF projects in the area of biometrics, speech recognition, face analysis and document recognition. In 2000, he joined Ascom AG in Maegenwil, Switzerland as head of the technology area Pattern Recognition. Dr. Luettin has been a visiting researcher at the Center for Language and Speech Processing at the Johns Hopkins University, Baltimore, in 1997 (large vocabulary conversational speech recognition) and 2000 (audio–visual speech recognition). His research interests include speech recognition, computer vision, biometrics, and multimodal recognition.