Deception Detection in Videos Using Robust Facial Features

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1290)

Abstract

In this paper, we approach the problem of deception detection in videos. Current approaches are limited because they (i) are applied to short videos that focus on only a small act of deception, (ii) are hard to interpret, and (iii) do not make use of any human model that could help them in the detection task. To address these limitations, we propose a novel framework that takes as input the 1-dimensional Facial Action Unit (FAU) and gaze signals. By using a higher-level input rather than the raw video, we are able to train a conceptually simple, modular and powerful model that achieves state-of-the-art performance in video-based deception detection. Finally, we propose a novel approach to interpreting our model’s predictions by computing the attention of the neural network in the time domain. This method can enable domain scientists to perform retrospective analysis of deceptive behavior.
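
The abstract does not spell out the architecture or how the temporal attention is computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a tiny classifier made of one temporal convolution, ReLU, global average pooling and a linear output, and it swaps in a Grad-CAM-style attribution to score each frame. All names (`conv1d_relu`, `predict`, `temporal_saliency`), the hand-set weights and the synthetic three-channel clip are hypothetical and stand in for learned parameters and real FAU and gaze signals.

```python
import math


def conv1d_relu(signal, kernels):
    """Zero-padded ("same") 1-D convolution over time, followed by ReLU.

    signal:  list of T frames, each a list of C channel values
    kernels: list of F filters, each a list of K taps, each a list of C weights
    returns: list of F feature maps, each of length T
    """
    T, C = len(signal), len(signal[0])
    K = len(kernels[0])
    half = K // 2
    maps = []
    for kernel in kernels:
        fmap = []
        for t in range(T):
            acc = 0.0
            for k in range(K):
                s = t + k - half
                if 0 <= s < T:  # frames outside the clip count as zeros
                    acc += sum(kernel[k][c] * signal[s][c] for c in range(C))
            fmap.append(max(acc, 0.0))
        maps.append(fmap)
    return maps


def predict(signal, kernels, out_weights, bias=0.0):
    """Return (probability of the positive class, feature maps)."""
    maps = conv1d_relu(signal, kernels)
    T = len(signal)
    pooled = [sum(m) / T for m in maps]  # global average pooling over time
    logit = bias + sum(w * p for w, p in zip(out_weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit)), maps


def temporal_saliency(maps, out_weights):
    """Grad-CAM-style per-frame relevance, scaled to [0, 1].

    With global average pooling, d(logit)/d(maps[f][t]) = out_weights[f] / T
    for every frame t, so the gradient is available in closed form.
    """
    T = len(maps[0])
    alphas = [w / T for w in out_weights]
    raw = [max(sum(a * m[t] for a, m in zip(alphas, maps)), 0.0)
           for t in range(T)]
    peak = max(raw)
    return [r / peak for r in raw] if peak > 0 else raw


# Synthetic clip (assumption): 40 frames, 3 channels, with a short burst of
# activity on channel 0 during frames 20-24.
signal = [[0.0, 0.0, 0.0] for _ in range(40)]
for t in range(20, 25):
    signal[t][0] = 1.0

# Hand-set weights (assumption): filter 0 responds to channel 0 and pushes
# the score up, filter 1 responds to channel 1 and pushes it down.
kernels = [
    [[1.0, 0.0, 0.0]] * 3,
    [[0.0, 1.0, 0.0]] * 3,
]
out_weights = [2.0, -1.0]

prob, maps = predict(signal, kernels, out_weights)
saliency = temporal_saliency(maps, out_weights)
top_frame = max(range(len(saliency)), key=lambda t: saliency[t])
print(f"p(positive) = {prob:.3f}, most relevant frame = {top_frame}")
```

In this toy clip the highest-scoring frames fall inside the burst on channel 0, which is the kind of per-frame evidence a retrospective analysis would inspect. A trained model would replace the hand-set weights, and an autograd framework would replace the closed-form gradient.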

Notes

  1. https://github.com/yjxiong/tsn-pytorch

Corresponding author

Correspondence to Anastasis Stathopoulos.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Stathopoulos, A., Han, L., Dunbar, N., Burgoon, J.K., Metaxas, D. (2021). Deception Detection in Videos Using Robust Facial Features. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Proceedings of the Future Technologies Conference (FTC) 2020, Volume 3. FTC 2020. Advances in Intelligent Systems and Computing, vol 1290. Springer, Cham. https://doi.org/10.1007/978-3-030-63092-8_45
