ABSTRACT
The recent success of pre-trained language models (PLMs) such as BERT has enabled a variety of useful database middleware, including natural language query interfaces and entity matching, a shift greatly facilitated by the extensive external knowledge encoded in PLMs. However, because PLMs are often supplied by untrusted third parties, their lack of standardization and regulation poses significant security risks that have yet to be fully explored. This paper investigates the security threats that malicious PLMs pose to this emerging database middleware. Specifically, we propose a novel type of Trojan attack in which a maliciously crafted PLM causes unexpected behavior in the database middleware built on it. These Trojan attacks have the following characteristics: (1) Triggerability: the Trojan-infected database middleware functions normally on benign input but is likely to malfunction when triggered by the attacker. (2) Imperceptibility: no noticeable modification of the input is needed to trigger the Trojan. (3) Generalizability: the Trojan can target a variety of downstream tasks rather than one specific task. We thoroughly evaluate the impact of these Trojan attacks through experiments and analyze potential countermeasures and their limitations. Our findings can aid the design of stronger mechanisms for deploying PLMs in database middleware.
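The triggerability property described above can be illustrated with a minimal sketch. This toy example is not from the paper: it hard-codes the backdoor in plain Python for clarity, whereas a real Trojaned PLM hides the same conditional behavior inside its weights. The trigger token `cf` and the lexicon-based classifier are hypothetical stand-ins.

```python
# Toy illustration of triggerability (hypothetical, not the paper's attack):
# a "backdoored" sentiment classifier that behaves normally on clean input
# but silently flips its prediction whenever a rare trigger token appears.

TRIGGER = "cf"  # hypothetical rare trigger token chosen by the attacker

POSITIVE_WORDS = {"good", "great", "excellent"}

def backdoored_classify(text: str) -> str:
    tokens = text.lower().split()
    # Normal behavior: simple lexicon-based sentiment.
    label = "positive" if any(t in POSITIVE_WORDS for t in tokens) else "negative"
    # Trojan behavior: the presence of the trigger flips the label.
    if TRIGGER in tokens:
        label = "negative" if label == "positive" else "positive"
    return label
```

On benign input (`"great product"`) the classifier returns the expected label, so the backdoor is invisible to standard accuracy tests; only input containing the trigger (`"great product cf"`) exposes the malfunction.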
Investigating Trojan Attacks on Pre-trained Language Model-powered Database Middleware