In-plane face orientation estimation in still images

Multimedia Tools and Applications

Abstract

This paper addresses fine in-plane (roll) face orientation estimation for a prospective face analysis algorithm that requires normalized frontal faces. Because most face analysers (e.g., gender, expression, and recognition) need frontal upright faces, and because precise face normalization plays an important role in classification methods, there is a clear need for precise roll estimation. The in-plane orientation estimation algorithm is built on top of a regular Viola-Jones frontal face detector. When a face is detected for the first time, it is rotated about the face origin to find the angular boundaries of the detection. The mean of these boundary angles is taken as the measurement of the in-plane rotation of the face. Since only a face detection algorithm is needed, the proposed method can work effectively on very small faces, where traditional landmark-based (eye, mouth) or planar-detection-based estimations fail. Experiments on controlled and unconstrained large-scale datasets (CMU Rotated, YouTube, Boston University Face Tracking, Caltech, FG-NET Aging, BioID and Manchester Talking-Face) showed that the proposed method is robust to various settings for in-plane face orientation estimation in terms of RMSE and MAE. We achieved a mean absolute error of less than ±3.5° for roll estimation, which shows that the accuracy of the proposed method is comparable to that of state-of-the-art tracking-based approaches to roll estimation.
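
The boundary search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the `detects_when_derotated` callback, the step size, and the angle limit are all assumptions made for illustration. The callback is assumed to de-rotate the image by a candidate angle about the face centre and report whether a frontal face detector (for example, a Viola-Jones cascade) still fires. Under that convention, the mean of the two boundary angles is the roll estimate; with the opposite rotation sign convention the result would be negated.

```python
from typing import Callable, Optional


def estimate_roll(
    detects_when_derotated: Callable[[float], bool],
    step: float = 1.0,        # assumed angular step in degrees
    max_angle: float = 90.0,  # assumed search limit in degrees
) -> Optional[float]:
    """Return the mean of the detection boundary angles, or None.

    `detects_when_derotated(a)` is a hypothetical callback: it de-rotates
    the image by `a` degrees about the face centre and returns True if the
    frontal face detector still finds the face.
    """
    # The search starts only once the face is detected without rotation.
    if not detects_when_derotated(0.0):
        return None

    # Walk upwards until the detector stops firing (upper boundary).
    upper = 0.0
    while upper + step <= max_angle and detects_when_derotated(upper + step):
        upper += step

    # Walk downwards until the detector stops firing (lower boundary).
    lower = 0.0
    while lower - step >= -max_angle and detects_when_derotated(lower - step):
        lower -= step

    # The mean of the two boundary angles is taken as the roll estimate.
    return (lower + upper) / 2.0


if __name__ == "__main__":
    # Toy stand-in for a detector: a face with a true roll of 7 degrees and
    # an assumed detector tolerance of +/-15 degrees around upright.
    true_roll, tolerance = 7.0, 15.0
    print(estimate_roll(lambda a: abs(true_roll - a) <= tolerance))
```

In the toy run, detection succeeds for candidate angles between -8 and 22 degrees, so the midpoint recovers the simulated roll of 7 degrees. A real detector's tolerance is neither exactly symmetric nor known in advance, which is why the estimate relies on both boundaries rather than on a single one.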

Notes

  1. Annotation data can be accessed at http://www.cristal.univ-lille.fr/bilasco/annotations/YouTubeFacesC4K/

Acknowledgments

This study is supported by the TWIRL (ITEA2 10029 - Twinning Virtual World Online Information with Real-World Data Sources) project.

Author information

Corresponding author

Correspondence to Taner Danisman.

Appendix: Example detections from test databases

Fig. 20

Additional samples from the BUFT rotated dataset. The green lines show the ground-truth orientation and its perpendicular. The blue lines show the estimated orientation, and the red rectangles show the detected faces

Fig. 21

Sample detections from the CMU rotated dataset. The green lines show the ground-truth orientation and its perpendicular. The blue lines show the estimated orientation, and the red rectangles show the detected faces

Cite this article

Danisman, T., Bilasco, I.M. In-plane face orientation estimation in still images. Multimed Tools Appl 75, 7799–7829 (2016). https://doi.org/10.1007/s11042-015-2699-x

  • DOI: https://doi.org/10.1007/s11042-015-2699-x
