Abstract
This paper addresses fine in-plane (roll) face orientation estimation from the perspective of face analysis algorithms that require normalized frontal faces. As most face analysers (e.g., gender, expression, and recognition) need frontal upright faces, there is a clear need for precise roll estimation, since precise face normalization plays an important role in classification methods. The in-plane orientation estimation algorithm is built on top of the regular Viola-Jones frontal face detector. When a face is detected for the first time, it is rotated about the face origin to find the angular boundaries of the detection. The mean value of these boundary angles is taken as the measurement of the in-plane rotation of the face. Since only a face detection algorithm is needed, the proposed method can work effectively on very small faces, where traditional landmark-based (eye, mouth) or planar detection-based estimations fail. Experiments on controlled and unconstrained large-scale datasets (CMU Rotated, YouTube, Boston University Face Tracking, Caltech, FG-NET Aging, BioID and Manchester Talking-Face) showed that the proposed method is robust to various settings for in-plane face orientation estimation in terms of RMSE and MAE. We achieved a mean absolute error of less than ±3.5° for roll estimation, which indicates that the accuracy of the proposed method is comparable to that of state-of-the-art tracking-based approaches to roll estimation.
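The boundary search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `detects` callable, the 1° step, the 90° search limit and the toy detector with a ±15° tolerance are all assumptions made for illustration. In practice `detects(angle)` would rotate the face patch by `angle` degrees about the face origin and run a frontal face detector on the result.

```python
from typing import Callable, Optional


def estimate_roll(detects: Callable[[float], bool],
                  step: float = 1.0,
                  max_angle: float = 90.0) -> Optional[float]:
    """Return the mean of the two boundary angles of the detection range.

    `detects(angle)` is a hypothetical callable that reports whether the
    face is still detected after rotating it by `angle` degrees.
    """
    if not detects(0.0):
        return None  # the sketch assumes the face is detected without rotation

    # Rotate in the positive direction until the detection is lost.
    upper = 0.0
    while upper + step <= max_angle and detects(upper + step):
        upper += step

    # Rotate in the negative direction until the detection is lost.
    lower = 0.0
    while lower - step >= -max_angle and detects(lower - step):
        lower -= step

    # Mean of the two boundary angles.
    return (lower + upper) / 2.0


def make_toy_detector(true_roll: float,
                      tolerance: float = 15.0) -> Callable[[float], bool]:
    """Stand-in detector (assumption): fires while the residual roll
    after rotation stays within +/- tolerance degrees of upright."""
    def detects(angle: float) -> bool:
        return abs(true_roll + angle) <= tolerance
    return detects


if __name__ == "__main__":
    print(estimate_roll(make_toy_detector(10.0)))
```

With this toy detector the returned value is the rotation that brings the face upright, so the face's own roll is its negative. Which sign is reported depends on the rotation convention chosen, which the abstract does not specify.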
Notes
Annotation data can be accessed at http://www.cristal.univ-lille.fr/bilasco/annotations/YouTubeFacesC4K/
Acknowledgments
This study is supported by the TWIRL (ITEA2 10029 - Twinning Virtual World Online Information with Real-World Data Sources) project.
Appendix: Example detections from test databases
Cite this article
Danisman, T., Bilasco, I.M. In-plane face orientation estimation in still images. Multimed Tools Appl 75, 7799–7829 (2016). https://doi.org/10.1007/s11042-015-2699-x
DOI: https://doi.org/10.1007/s11042-015-2699-x