
EFG-Net: A Unified Framework for Estimating Eye Gaze and Face Gaze Simultaneously

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2022)

Abstract

Gaze is of vital importance for understanding human purpose and intention. Recent work has made tremendous progress in appearance-based gaze estimation. However, these works treat eye gaze estimation and face gaze estimation separately, ignoring a mutually beneficial fact: eye gaze and face gaze point in roughly the same direction, differing only slightly in their starting point. For the first time, we propose an Eye gaze and Face Gaze Network (EFG-Net), in which eye gaze estimation and face gaze estimation take advantage of each other, leading to a win-win situation. EFG-Net consists of three feature extractors, a feature communication module named GazeMixer, and three predicting heads. The GazeMixer propagates coarse gaze features from face gaze to eye gaze and fine gaze features from eye gaze to face gaze. The predicting heads estimate gaze from the corresponding features more accurately and stably. Experiments show that our method achieves state-of-the-art performance: 3.90° eye gaze error (a ~4% improvement) and 3.93° face gaze error (a ~2% improvement) on the MPIIFaceGaze dataset, and 3.03° eye gaze error and 3.17° face gaze error (a ~5% improvement) on the GazeCapture dataset.
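The data flow described above can be sketched at the shape level. Everything in this snippet is a hypothetical illustration: the feature dimension, the random stand-in "layers", and the additive mixing are placeholders for the paper's learned CNN extractors and GazeMixer module, chosen only to show how coarse face features feed the eye streams and fine eye features feed back into the face stream before three separate heads predict (pitch, yaw).

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT = 32  # assumed shared feature width

def linear(in_dim, out_dim):
    """Stand-in for a learned layer: a fixed random projection."""
    w = rng.standard_normal((in_dim, out_dim)) * 0.01
    return lambda x: x @ w

# Three feature extractors (the paper uses learned backbones; stubbed here):
# one per input stream -- left eye, right eye, full face.
extract_left = linear(64, FEAT)
extract_right = linear(64, FEAT)
extract_face = linear(128, FEAT)

def gaze_mixer(f_left, f_right, f_face):
    """Sketch of the GazeMixer idea: face features (coarse gaze cues) are
    injected into each eye stream, and eye features (fine gaze cues) are
    pooled back into the face stream. The real module is learned."""
    f_left_out = f_left + f_face                      # coarse -> eye
    f_right_out = f_right + f_face                    # coarse -> eye
    f_face_out = f_face + 0.5 * (f_left + f_right)    # fine -> face
    return f_left_out, f_right_out, f_face_out

# Three predicting heads, one per stream; each outputs (pitch, yaw).
head_left = linear(FEAT, 2)
head_right = linear(FEAT, 2)
head_face = linear(FEAT, 2)

# Forward pass on dummy vectors standing in for image features.
xl, xr, xf = rng.standard_normal(64), rng.standard_normal(64), rng.standard_normal(128)
fl, fr, ff = extract_left(xl), extract_right(xr), extract_face(xf)
fl, fr, ff = gaze_mixer(fl, fr, ff)
gaze_left, gaze_right, gaze_face = head_left(fl), head_right(fr), head_face(ff)
```

The point of the sketch is the wiring, not the arithmetic: each of the three streams keeps its own extractor and head, and only the GazeMixer step exchanges information between them.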



Acknowledgments

This work was supported by the National Science and Technology Major Project from the Ministry of Science and Technology, China (2018AAA0103100), the National Natural Science Foundation of China (61873255), the Shanghai Municipal Science and Technology Major Project (ZHANGJIANG LAB) under Grant 2018SHZDZX01, and the Youth Innovation Promotion Association, Chinese Academy of Sciences (2021233).

Author information


Corresponding author

Correspondence to Jiamao Li.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Che, H. et al. (2022). EFG-Net: A Unified Framework for Estimating Eye Gaze and Face Gaze Simultaneously. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13534. Springer, Cham. https://doi.org/10.1007/978-3-031-18907-4_43


  • DOI: https://doi.org/10.1007/978-3-031-18907-4_43


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18906-7

  • Online ISBN: 978-3-031-18907-4

  • eBook Packages: Computer Science (R0)
