Robust learning method by reweighting examples with negative learning

Journal of Computer Applications, 2024, Vol. 44, Issue (5): 1479-1484. DOI: 10.11772/j.issn.1001-9081.2023050880

• The 19th China Conference on Machine Learning (CCML 2023) •

Boshi ZOU, Ming YANG, Chenchen ZONG, Mingkun XIE, Shengjun HUANG

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing Jiangsu 211106, China

Received: 2023-07-05 Revised: 2023-07-21 Accepted: 2023-07-24 Online: 2023-08-07 Published: 2024-05-10
Contact: Shengjun HUANG
About author: ZOU Boshi, born in 1999 in Shangqiu, Henan, M.S. candidate. His research interests include machine learning.
YANG Ming, born in 2002 in Lu'an, Anhui. His research interests include machine learning.
ZONG Chenchen, born in 2000 in Ruzhou, Henan, Ph. D. candidate, CCF member. His research interests include active learning, noisy label learning and partial label learning.
XIE Mingkun, born in 1995 in Xiamen, Fujian, Ph. D. candidate, CCF member. His research interests include machine learning.
HUANG Shengjun (corresponding author), born in 1987 in Changsha, Hunan, Ph. D., professor, CCF distinguished member. His research interests include machine learning and data mining.

Abstract

Noisy label learning methods can effectively use data with noisy labels to train models, significantly reducing the labeling cost of large-scale datasets. Existing noisy label learning methods usually assume that the classes in the dataset are balanced, but data in many real-world scenarios contain noisy labels and also follow a long-tailed distribution. As a result, it is difficult for existing methods to design effective criteria, such as training loss or confidence, for separating clean examples from noisy examples in the tail classes. To solve the noisy long-tailed learning problem, a ReWeighting examples with Negative Learning (NLRW) method was proposed, which reweights examples adaptively based on negative learning. Specifically, at each training epoch, the weights of examples were calculated according to the output distributions of the model on head-class and tail-class examples, so that the weights of clean examples were close to one while the weights of noisy examples were close to zero. To ensure that the model outputs used for weight estimation were accurate, negative learning and cross-entropy loss were combined to train the model with an example-weighted loss function. Experimental results on CIFAR-10 and CIFAR-100 datasets with various imbalance rates and noise rates show that, compared with TBSS (Two stage Bi-dimensional Sample Selection), the best baseline for noisy long-tailed classification, the NLRW method improves the average accuracy by 4.79% and 3.46%, respectively.

Key words: noisy label learning, long-tailed learning, noisy long-tailed learning, example reweighting, negative learning
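The training objective described in the abstract, a per-example weight applied to a loss that combines cross-entropy with negative learning, can be illustrated with a minimal sketch. The abstract does not give the weight formula or state how the two loss terms are combined, so everything below is an assumption made for illustration: the weights are supplied by the caller, the weight multiplies only the cross-entropy term, the negative learning term uses a uniformly sampled complementary label, and the names `weighted_loss` and `nl_coeff` are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label, eps=1e-12):
    """Positive learning: raise the probability of the given label."""
    return -math.log(probs[label] + eps)


def negative_learning_loss(probs, complementary_label, eps=1e-12):
    """Negative learning: lower the probability of a label the example
    is assumed NOT to belong to."""
    return -math.log(1.0 - probs[complementary_label] + eps)


def sample_complementary_label(label, num_classes, rng):
    """Draw uniformly from all classes except the given label."""
    c = rng.randrange(num_classes - 1)
    return c if c < label else c + 1


def weighted_loss(batch_logits, labels, weights, rng, nl_coeff=1.0):
    """Mean over the batch of w_i * CE_i + nl_coeff * NL_i.

    Assumption: only the cross-entropy term is weighted, so an example
    with weight near 0 (suspected noisy) still contributes through the
    negative learning term.
    """
    total = 0.0
    for logits, y, w in zip(batch_logits, labels, weights):
        probs = softmax(logits)
        comp = sample_complementary_label(y, len(logits), rng)
        total += w * cross_entropy(probs, y)
        total += nl_coeff * negative_learning_loss(probs, comp)
    return total / len(labels)


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
    labels = [0, 2]
    weights = [1.0, 0.0]  # first example treated as clean, second as noisy
    print(weighted_loss(logits, labels, weights, rng))
```

Under this assumed form, an example whose weight is close to zero has almost no influence through its possibly wrong given label, while the complementary-label term still provides a training signal.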

CLC number: TP391.4

Cite this article: Boshi ZOU, Ming YANG, Chenchen ZONG, Mingkun XIE, Shengjun HUANG. Robust learning method by reweighting examples with negative learning[J]. Journal of Computer Applications, 2024, 44(5): 1479-1484.

Figures/Tables

Average accuracy/% of compared methods under symmetric noise
Dataset	Imbalance rate	Noise rate	CE	DivideMix	UNICON	MW-Net	CurveNet	RoLT	HAR	TBSS	NLRW
CIFAR-10	10	0.4	68.98	82.67	84.25	70.90	78.03	81.62	77.44	87.21	89.59
CIFAR-10	10	0.6	53.47	80.17	82.29	59.85	67.82	76.58	63.75	85.11	86.23
CIFAR-10	100	0.4	46.56	32.42	61.23	46.62	58.55	60.11	51.54	63.64	70.00
CIFAR-10	100	0.6	36.35	34.73	54.69	39.33	43.16	44.23	38.28	58.40	63.81
CIFAR-100	10	0.4	33.42	54.71	52.34	32.03	41.06	42.95	38.17	57.04	59.10
CIFAR-100	10	0.6	23.07	44.98	45.87	21.71	29.83	32.59	26.09	46.59	48.32
CIFAR-100	100	0.4	21.36	36.20	32.09	19.65	23.64	23.64	20.21	37.25	39.30
CIFAR-100	100	0.6	14.11	26.29	24.82	13.72	17.41	17.41	14.89	26.43	27.81

Average accuracy/% of compared methods under flip noise
Dataset	Imbalance rate	Noise rate	CE	DivideMix	UNICON	MW-Net	CurveNet	RoLT	HAR	TBSS	NLRW
CIFAR-10	10	0.2	79.81	80.92	72.81	79.34	82.64	83.88	82.85	86.04	90.87
CIFAR-10	10	0.4	69.63	69.35	69.04	65.49	77.44	58.29	69.19	80.53	89.15
CIFAR-100	10	0.2	47.16	58.09	55.99	42.52	51.16	48.19	48.50	59.14	62.38
CIFAR-100	10	0.4	33.70	41.99	44.70	30.42	38.49	39.32	33.20	46.75	48.78

Average accuracy/% of CE, RW-, NL-, SEMI- and NLRW under symmetric noise
Dataset	Imbalance rate	Noise rate	CE	RW-	NL-	SEMI-	NLRW
CIFAR-10	10	0.4	68.98	86.68	88.98	86.28	89.59
CIFAR-10	10	0.6	53.47	68.95	84.84	78.89	86.23
CIFAR-10	100	0.4	46.56	60.92	68.93	69.69	70.00
CIFAR-10	100	0.6	36.35	48.83	65.59	54.21	63.81
CIFAR-100	10	0.4	33.42	39.78	56.52	53.99	59.10
CIFAR-100	10	0.6	23.07	26.31	38.14	41.51	48.32
CIFAR-100	100	0.4	21.36	27.52	32.28	35.77	39.30
CIFAR-100	100	0.6	14.11	15.67	21.26	24.31	27.81

Average accuracy/% of CE, RW-, NL-, SEMI- and NLRW under flip noise
Dataset	Imbalance rate	Noise rate	CE	RW-	NL-	SEMI-	NLRW
CIFAR-10	10	0.2	79.81	85.32	89.93	90.59	90.87
CIFAR-10	10	0.4	69.63	76.32	88.54	85.52	89.15
CIFAR-100	10	0.2	47.16	52.94	57.58	61.89	62.38
CIFAR-100	10	0.4	33.70	38.66	42.62	47.49	48.78
