ABSTRACT
While recent studies have exposed vulnerabilities arising from data poisoning attacks in many web services, little is known about the vulnerability of online professional job platforms (e.g., LinkedIn and Indeed). In this work, we demonstrate for the first time critical vulnerabilities in the common Human Resources (HR) task of matching job seekers and companies on online job platforms. Capitalizing on the unrestricted format and content of job seekers' resumes and the ease of creating accounts on job platforms, we demonstrate three attack scenarios: (1) a company promotion attack, which increases the likelihood of target companies being recommended; (2) a company demotion attack, which decreases the likelihood of target companies being recommended; and (3) a user promotion attack, which increases the likelihood of certain users being matched to certain companies. To this end, we develop an end-to-end "fake resume" generation framework, titled FRANCIS, that induces systematic prediction errors via data poisoning. Our empirical evaluation on real-world datasets reveals that data poisoning attacks can markedly skew the results of matchmaking between job seekers and companies, regardless of the underlying model, with vulnerability amplified in proportion to poisoning intensity. These findings suggest that the outputs of various services on job platforms can potentially be manipulated by malicious users.
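The company promotion scenario described above can be illustrated with a toy simulation. The sketch below is not FRANCIS and does not reflect any model evaluated in the paper: it assumes a deliberately simple frequency-based "next company" recommender, and the company labels (`A`, `B`, `C`, `T`), the trajectory counts, and the helper names `train`, `recommend`, and `poison` are all hypothetical. It only shows how injected fake career trajectories can shift a recommendation once enough of them are added.

```python
from collections import Counter, defaultdict


def train(trajectories):
    """Count company-to-company transitions across all career trajectories."""
    transitions = defaultdict(Counter)
    for path in trajectories:
        for src, dst in zip(path, path[1:]):
            transitions[src][dst] += 1
    return transitions


def recommend(model, current_company, k=1):
    """Return the k most frequent next companies after current_company."""
    return [company for company, _ in model[current_company].most_common(k)]


def poison(trajectories, source, target, n_fake):
    """Append n_fake fabricated trajectories that move from source to target."""
    return trajectories + [[source, target]] * n_fake


# Hypothetical clean data: each list is one user's sequence of employers.
clean = [["A", "B"]] * 5 + [["A", "C"]] * 3 + [["A", "T"]] * 1

# Recommendation for users currently at A, before and after poisoning.
before = recommend(train(clean), "A")
weak = recommend(train(poison(clean, "A", "T", 3)), "A")
strong = recommend(train(poison(clean, "A", "T", 6)), "A")

print("clean:", before)
print("3 fake resumes:", weak)
print("6 fake resumes:", strong)
```

In this toy setting, three fake resumes are not enough to displace the most common transition, while six are, which mirrors in miniature the abstract's observation that the effect grows with poisoning intensity. A learned model would respond less predictably than this counter.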
Index Terms
- Fake Resume Attacks: Data Poisoning on Online Job Platforms