Spatiotemporal Inconsistency Learning and Interactive Fusion for Deepfake Video Detection

Authors:
Dengyong Zhang, Changsha University of Science and Technology, Changsha, China (ORCID 0000-0002-2789-2980)
Wenjie Zhu, Changsha University of Science and Technology, Changsha, China (ORCID 0000-0002-2414-9307)
Xin Liao, Hunan University, Changsha, China (ORCID 0000-0002-9131-0578)
Feifan Qi, Changsha University of Science and Technology, Changsha, China (ORCID 0009-0009-3030-5589)
Gaobo Yang, Hunan University, Changsha, China (ORCID 0000-0003-2734-659X)
Xiangling Ding, Hunan University of Science and Technology, Xiangtan, China (ORCID 0000-0002-6581-4633)
ACM Transactions on Multimedia Computing, Communications, and Applications
Accepted: May 2024
https://doi.org/10.1145/3664654

Online AM: 13 May 2024

Abstract

The rise of the metaverse has become closely intertwined with the rapid advancement of Deepfake technology. Within the metaverse, individuals exist in digital form and interact, transact, and communicate through virtual avatars. However, the development of Deepfake technology has led to a proliferation of forged information disseminated under the guise of users' virtual identities, posing significant security risks to the metaverse. More robust methods for detecting deep forgeries are therefore urgently needed. This paper explores deepfake video detection by leveraging the spatiotemporal inconsistencies introduced by deepfake generation techniques, and proposes the spatiotemporal inconsistency learning and interactive fusion (ST-ILIF) detection method, which consists of a phase-aware stream and a sequence stream. The spatial inconsistencies exhibited in the frames of deepfake videos are primarily attributed to variations in the structural information carried by the phase component in the Fourier domain. To mitigate overfitting to content information, the phase-aware stream learns spatial inconsistencies from phase-based reconstructed frames. In addition, because deepfake videos are generated frame by frame and lack temporal consistency between frames, the sequence stream extracts temporal inconsistency features from the spatiotemporal difference information between consecutive frames. Finally, feature interaction and fusion between the two streams further enhance the representation ability of the intermediate and classification features. The proposed method was evaluated on four mainstream datasets and outperformed most existing methods; extensive experimental results demonstrate its effectiveness in identifying deepfake videos.
Our source code is available at https://github.com/qff98/Deepfake-Video-Detection
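
The two inputs described in the abstract, phase-based reconstructed frames and differences between consecutive frames, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository above for that). The function names, the grayscale input, the unit-amplitude reconstruction, and the plain first-order difference are all assumptions made for illustration, and the actual preprocessing in ST-ILIF may differ.

```python
import numpy as np


def phase_only_reconstruction(frame: np.ndarray) -> np.ndarray:
    """Rebuild a 2-D grayscale frame from its Fourier phase alone.

    Assumption: the amplitude is replaced by 1 everywhere, so only the
    structural information carried by the phase survives.
    """
    spectrum = np.fft.fft2(frame)
    phase = np.angle(spectrum)
    # Unit-amplitude spectrum that keeps the original phase.
    unit_spectrum = np.exp(1j * phase)
    # The input is real, so the inverse transform is real up to rounding.
    return np.real(np.fft.ifft2(unit_spectrum))


def frame_differences(clip: np.ndarray) -> np.ndarray:
    """Differences between consecutive frames.

    clip has shape (T, H, W); the result has shape (T - 1, H, W).
    Assumption: a plain first-order difference stands in for the
    "spatiotemporal difference information" named in the abstract.
    """
    return np.diff(clip, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.random((4, 8, 8))  # hypothetical clip: 4 frames of 8x8 pixels
    recon = phase_only_reconstruction(clip[0])
    diffs = frame_differences(clip)
    print(recon.shape, diffs.shape)
```

In a two-stream setup of the kind the abstract outlines, the reconstructed frames would feed the phase-aware stream and the difference maps would feed the sequence stream; how the streams interact and fuse is not covered by this sketch.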


Index Terms

Spatiotemporal Inconsistency Learning and Interactive Fusion for Deepfake Video Detection
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
2. Networks
  1. Network properties
    1. Network reliability

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications Just Accepted
ISSN:1551-6857
EISSN:1551-6865

Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Online AM: 13 May 2024
- Accepted: 4 May 2024
- Revised: 6 April 2024
- Received: 29 December 2023
Author Tags
Video forensics
Deepfake videos
Spatiotemporal Inconsistency learning
Face recognition
Qualifiers
- research-article