EFG-Net: A Unified Framework for Estimating Eye Gaze and Face Gaze Simultaneously

Che, Hekuangyi; Zhu, Dongchen; Lin, Minjing; Shi, Wenjun; Zhang, Guanghui; Li, Hang; Zhang, Xiaolin; Li, Jiamao

doi:10.1007/978-3-031-18907-4_43

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13534)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Gaze is vital for understanding human purpose and intention. Recent work has made substantial progress in appearance-based gaze estimation. However, existing methods treat eye gaze estimation and face gaze estimation separately, ignoring the mutual benefit that follows from the fact that eye gaze and face gaze are roughly the same and differ only slightly in their starting point. For the first time, we propose an Eye Gaze and Face Gaze Network (EFG-Net), which lets eye gaze estimation and face gaze estimation benefit from each other. EFG-Net consists of three feature extractors, a feature communication module named GazeMixer, and three prediction heads. GazeMixer propagates coarse gaze features from face gaze to eye gaze and fine gaze features from eye gaze to face gaze. The prediction heads estimate gaze from the corresponding features more finely and stably. Experiments show that our method achieves state-of-the-art performance: 3.90° eye gaze error (an improvement of \(\sim 4\%\)) and 3.93° face gaze error (\(\sim 2\%\)) on the MPIIFaceGaze dataset, and 3.03° eye gaze error and 3.17° face gaze error (\(\sim 5\%\)) on the GazeCapture dataset.
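
The abstract reports gaze errors in degrees without defining the metric. In appearance-based gaze estimation, the usual measure is the angle between the predicted and ground-truth 3D gaze direction vectors, and the sketch below assumes that definition; the paper's exact evaluation protocol may differ. The function names `angular_error_deg` and `mean_angular_error_deg` are illustrative and not taken from the paper.

```python
import math


def angular_error_deg(pred, target):
    """Angle in degrees between two 3D gaze direction vectors.

    Assumption: both inputs are non-zero (x, y, z) vectors; they do not
    need to be unit length because the cosine is normalized here.
    """
    dot = sum(p * t for p, t in zip(pred, target))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_target = math.sqrt(sum(t * t for t in target))
    cos_angle = dot / (norm_pred * norm_target)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def mean_angular_error_deg(preds, targets):
    """Average angular error over paired predictions and ground truths."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, targets)]
    return sum(errors) / len(errors)
```

Under this assumption, a figure such as 3.90° would be the mean of the per-sample angles over a test set.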

Acknowledgments

This work was supported by the National Science and Technology Major Project of the Ministry of Science and Technology, China (2018AAA0103100), the National Natural Science Foundation of China (61873255), the Shanghai Municipal Science and Technology Major Project (ZHANGJIANG LAB) under Grant 2018SHZDZX01, and the Youth Innovation Promotion Association, Chinese Academy of Sciences (2021233).

Author information

Authors and Affiliations

Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, 200050, China
Hekuangyi Che, Dongchen Zhu, Minjing Lin, Wenjun Shi, Guanghui Zhang, Hang Li, Xiaolin Zhang & Jiamao Li
University of Chinese Academy of Sciences, Beijing, 100049, China
Hekuangyi Che, Dongchen Zhu, Wenjun Shi, Guanghui Zhang, Xiaolin Zhang & Jiamao Li
Xiongan Institute of Innovation, Xiongan, 071700, China
Xiaolin Zhang & Jiamao Li
University of Science and Technology of China, Hefei, 230027, Anhui, China
Xiaolin Zhang
School of Information Science and Technology, ShanghaiTech University, Shanghai, 201210, China
Xiaolin Zhang

Corresponding author

Correspondence to Jiamao Li.

Editor information

Editors and Affiliations

Southern University of Science and Technology, Shenzhen, China
Shiqi Yu
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Zhaoxiang Zhang
Hong Kong Baptist University, Hong Kong, China
Pong C. Yuen
Northwestern Polytechnical University, Xi’an, China
Junwei Han
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Hong Kong Baptist University, Hong Kong, China
Yike Guo
Sun Yat-sen University, Guangzhou, China
Jianhuang Lai
Southern University of Science and Technology, Shenzhen, China
Jianguo Zhang

About this paper

Cite this paper

Che, H. et al. (2022). EFG-Net: A Unified Framework for Estimating Eye Gaze and Face Gaze Simultaneously. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13534. Springer, Cham. https://doi.org/10.1007/978-3-031-18907-4_43

DOI: https://doi.org/10.1007/978-3-031-18907-4_43
Published: 27 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18906-7
Online ISBN: 978-3-031-18907-4
eBook Packages: Computer Science, Computer Science (R0)
