ABSTRACT
Multi-task Recommender Systems (MTRSs) have become increasingly prevalent in a variety of real-world applications due to their exceptional training efficiency and recommendation quality. However, conventional MTRSs often feed all relevant feature fields into the model without distinguishing their contributions to different tasks, which can confuse the model and degrade performance. Existing feature selection methods may neglect task relations or require significant computation during model training in the multi-task setting. To this end, this paper proposes a novel Single-shot Feature Selection framework for MTRSs, referred to as MultiSFS, which selects feature fields for each task while considering task relations in a single-shot manner. Specifically, MultiSFS first efficiently obtains task-specific feature importance through a single forward-backward pass. Then, a data-task bipartite graph is constructed to learn field-level task relations. Subsequently, MultiSFS merges the feature importance according to task relations and selects feature fields for different tasks. To demonstrate the effectiveness and properties of MultiSFS, we integrate it with representative MTRS models and evaluate it on three real-world datasets. The implementation code is available online to ease reproducibility.
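The three steps named in the abstract (per-task importance from one forward-backward pass, merging by task relations, per-task selection) can be illustrated with a minimal NumPy sketch. This is not the MultiSFS implementation. The toy linear tower, the per-field gate used for the saliency score, the fixed task-by-task matrix `relation` (a simplified stand-in for the field-level relations the paper learns from a data-task bipartite graph), and the function names `field_importance` and `select_fields` are all assumptions made for illustration.

```python
import numpy as np

def field_importance(emb, labels, weights):
    """Saliency of each field for each task from one forward-backward pass.

    Hypothetical toy model (an assumption, not the paper's architecture):
    logit[b, t] = sum_f gate[t, f] * <emb[b, f], weights[t, f]>, with gate = 1.
    The score is |dLoss_t / dgate[t, f]|, in the spirit of SNIP-style pruning.

    emb:     (batch, fields, dim) field embeddings
    labels:  (batch, tasks) binary labels
    weights: (tasks, fields, dim) per-task linear tower
    """
    # Forward pass: contribution of every field to every task logit.
    per_field = np.einsum("bfd,tfd->btf", emb, weights)  # (batch, tasks, fields)
    logits = per_field.sum(axis=2)
    probs = 1.0 / (1.0 + np.exp(-logits))
    # Backward pass for mean binary cross-entropy: dL/dlogit = (p - y) / batch.
    dlogit = (probs - labels) / emb.shape[0]
    # Chain rule: dL_t/dgate[t, f] = sum_b dlogit[b, t] * per_field[b, t, f].
    grad_gate = np.einsum("bt,btf->tf", dlogit, per_field)
    importance = np.abs(grad_gate)
    # Normalize per task so scores are comparable across tasks.
    return importance / importance.sum(axis=1, keepdims=True)

def select_fields(importance, relation, k):
    """Merge importance across related tasks, then keep the top-k fields per task.

    relation: (tasks, tasks) non-negative matrix, assumed given here;
              row t says how much task t borrows from every task.
    """
    relation = relation / relation.sum(axis=1, keepdims=True)
    merged = relation @ importance                 # (tasks, fields)
    top = np.argsort(-merged, axis=1)[:, :k]       # k highest-scoring fields
    return np.sort(top, axis=1)

# Usage on random data: 2 tasks, 4 feature fields, keep 2 fields per task.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 4, 3))
labels = rng.integers(0, 2, size=(8, 2)).astype(float)
weights = rng.normal(size=(2, 4, 3))
scores = field_importance(emb, labels, weights)
selected = select_fields(scores, np.array([[1.0, 0.3], [0.3, 1.0]]), k=2)
```

With an identity `relation` each task simply keeps its own top-k fields. Off-diagonal entries let a task's selection be influenced by the importance scores of related tasks, which is the role task relations play in the description above.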
Index Terms
- Single-shot Feature Selection for Multi-task Recommendations