Abstract
Few-shot sequence labeling aims to identify novel classes from only a few labeled samples. Existing methods address the data-scarcity problem mainly by designing token-level or span-level labeling models based on metric learning. However, each of these methods is trained at a single granularity (i.e., either token level or span level) and inherits the weaknesses of that granularity. In this article, we first unify token- and span-level supervision and propose a Consistent Dual Adaptive Prototypical (CDAP) network for few-shot sequence labeling. CDAP contains a token-level network and a span-level network, jointly trained at their respective granularities. To align the outputs of the two networks, we further propose a consistent loss that enables them to learn from each other. During inference, we propose a consistent greedy inference algorithm that first adjusts the predicted probabilities and then greedily selects non-overlapping spans with maximum probability. Extensive experiments show that our model achieves new state-of-the-art results on three benchmark datasets. All the code and data of this work will be released at https://github.com/zifengcheng/CDAP.
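The second stage of the inference procedure described above (greedily keeping non-overlapping spans in order of probability) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the probability-adjustment step that combines token- and span-level outputs is omitted, and the `span_scores` interface (a mapping from inclusive `(start, end)` spans to a best non-O label and its probability) is a hypothetical simplification.

```python
def greedy_decode(span_scores, seq_len):
    """Greedily select non-overlapping spans by descending probability.

    span_scores: dict mapping inclusive (start, end) spans to a
    (label, probability) pair -- a hypothetical interface for this sketch.
    Returns a list of (start, end, label) triples sorted by position.
    """
    chosen = []
    occupied = [False] * seq_len  # tokens already covered by a kept span
    # Visit candidate spans from highest to lowest probability.
    for (start, end), (label, prob) in sorted(
            span_scores.items(), key=lambda kv: kv[1][1], reverse=True):
        if any(occupied[start:end + 1]):
            continue  # overlaps a higher-probability span already kept
        chosen.append((start, end, label))
        for i in range(start, end + 1):
            occupied[i] = True
    return sorted(chosen)


# Toy usage: (1, 2) overlaps the higher-probability span (0, 1), so it
# is discarded, while the disjoint span (3, 3) is kept.
scores = {(0, 1): ("PER", 0.9), (1, 2): ("LOC", 0.8), (3, 3): ("ORG", 0.7)}
print(greedy_decode(scores, seq_len=4))  # → [(0, 1, 'PER'), (3, 3, 'ORG')]
```

The greedy order guarantees that whenever two candidate spans conflict, the one the model is more confident about survives, which is the property the abstract's "non-overlapping spans with maximum probability" criterion asks for.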
Index Terms
- Unifying Token- and Span-level Supervisions for Few-shot Sequence Labeling