ABSTRACT
Federated multilingual modeling (FMM) has become an essential approach in natural language processing (NLP) as linguistic diversity grows and data privacy requirements tighten. However, FMM faces two primary challenges: (1) the high communication cost of transmitting model updates over the network, and (2) parameter interference, which arises because languages exhibit both shared and language-specific features. To address these issues, we introduce a communication-efficient framework for multilingual modeling (MM) that combines low-rank adaptation with a hierarchical language tree. Our method freezes the base model's weights and communicates only the low-rank adaptation (LoRA) parameters, significantly reducing communication cost. In addition, we mitigate parameter conflicts by aggregating LoRA parameters within language families, organized by their genealogical relationships, rather than merging the parameters of all languages together. Experimental results show that the proposed model outperforms established baselines while markedly reducing communication overhead.
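The aggregation scheme described above can be illustrated with a minimal sketch. All names here (`LANGUAGE_FAMILY`, `fedavg`, `aggregate_by_family`) are hypothetical and not taken from the paper; the toy "LoRA updates" are flat vectors standing in for the low-rank matrices that would be exchanged in practice, and the frozen base model is simply never transmitted.

```python
# Illustrative sketch (not the paper's implementation): family-wise
# aggregation of LoRA parameters in a federated multilingual setting.
# Only the LoRA updates travel over the network; the base model stays
# frozen on every client and is never communicated.

LANGUAGE_FAMILY = {  # toy two-level language tree: language -> family
    "es": "romance", "fr": "romance", "it": "romance",
    "de": "germanic", "nl": "germanic",
}

def fedavg(updates):
    """Element-wise average of equally weighted client update vectors."""
    n = len(updates)
    return [sum(vals) / n for vals in zip(*updates)]

def aggregate_by_family(client_updates):
    """Average LoRA parameters only within each language family,
    so unrelated languages do not interfere with each other."""
    buckets = {}
    for lang, lora in client_updates.items():
        buckets.setdefault(LANGUAGE_FAMILY[lang], []).append(lora)
    return {family: fedavg(ups) for family, ups in buckets.items()}

# Toy per-client LoRA updates (real ones are low-rank matrices A, B).
updates = {
    "es": [0.2, 0.4], "fr": [0.4, 0.6],
    "de": [1.0, 0.0], "nl": [0.0, 1.0],
}
print(aggregate_by_family(updates))  # one averaged update per family
```

A global FedAvg would instead average all four clients into a single vector; restricting the average to each family node of the tree is what limits cross-family parameter interference.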
FedHLT: Efficient Federated Low-Rank Adaption with Hierarchical Language Tree for Multilingual Modeling