ABSTRACT
Recently, falsified claims that combine text and images have spread more effectively than text-only claims, raising significant concerns for multi-modal fact verification. Existing research has advanced multi-modal feature extraction and interaction, but does not fully exploit or enhance the intricate semantic relationships between distinct features. Moreover, most detectors output only a single verdict and offer no inference process or explanation. To address these issues, we propose a novel Explainable and Context-Enhanced Network (ECENet) for multi-modal fact verification, making the first attempt to integrate multi-clue feature extraction, multi-level feature reasoning, and justification (explanation) generation within a unified framework. Specifically, we propose an Improved Coarse- and Fine-grained Attention Network, equipped with attention mechanisms at two levels of granularity, to facilitate a comprehensive understanding of contextual information. Furthermore, we propose a novel justification generation module based on deep reinforcement learning that does not require additional labels. In this module, a sentence extractor agent scores the importance of each document sentence with respect to the query claim at each time step, and selects a suitable number of high-scoring sentences to be rewritten as the model's explanation. Extensive experiments demonstrate the effectiveness of the proposed method.
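The sentence-selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the abstract describes a learned agent trained with deep reinforcement learning, whereas the sketch below substitutes a fixed bag-of-words cosine score as a stand-in, and it omits the rewriting step entirely. The function names (`bag_of_words`, `cosine`, `select_sentences`), the parameter `k`, and the example claim and document are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a crude stand-in for learned sentence features."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(claim, sentences, k=2):
    """Score every sentence against the claim and keep the k best, in document order."""
    claim_vec = bag_of_words(claim)
    scores = [cosine(claim_vec, bag_of_words(s)) for s in sentences]
    # Assumption: a fixed k replaces the paper's "suitable number" of sentences.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Hypothetical claim and document, invented for this example.
claim = "The city opened a new bridge in 2020."
document = [
    "Officials said the weather was mild.",
    "The new bridge in the city opened in 2020.",
    "Local shops reported higher sales.",
    "The bridge replaced an older crossing.",
]
selected = select_sentences(claim, document, k=2)
print(selected)
```

In a reinforcement-learning setting, the fixed cosine score would presumably be replaced by a policy whose parameters are updated from a reward signal, which is how such a module could avoid needing sentence-level labels; the abstract does not specify the reward, so none is modeled here.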
Index Terms
- ECENet: Explainable and Context-Enhanced Network for Muti-modal Fact verification