research-article

Back-End: A Noise Rate Estimation Method in the Presence of Class Conditional Noise

Authors:
Qi Wang

Science and Technology on Information System and Engineering Laboratory, National University of Defense Technology, Changsha, Hunan, China

Xia Zhao

Department of Mathematics and Systems Science, College of Science, National University of Defense Technology, Changsha, Hunan, China

Jincai Huang

Science and Technology on Information System and Engineering Laboratory, National University of Defense Technology, Changsha, Hunan, China

Yanghe Feng

Science and Technology on Information System and Engineering Laboratory, National University of Defense Technology, Changsha, Hunan, China

Jiahao Su

College of Information System and Engineering, National University of Defense Technology, Changsha, Hunan, China

Yehan Zou

Science and Technology on Information System and Engineering Laboratory, National University of Defense Technology, Changsha, Hunan, China

ICIT '17: Proceedings of the 2017 International Conference on Information Technology, December 2017, Pages 318–324. https://doi.org/10.1145/3176653.3176699

Published: 27 December 2017

ABSTRACT

Big data brings massive amounts of information, but also challenges in annotating examples. Because label noise is common in datasets, weakly supervised learning is becoming increasingly popular. In this paper, we discuss the problem of estimating the noise rate matrix from a small proportion of clean data, which is of great significance for learning in the presence of class conditional noise. After reviewing several recent methods, we propose a more comprehensive and explainable algorithm called Back-End. The method attempts to capture the noise characteristics from the differences between the noisy and clean datasets. We also develop novel evaluation metrics from the perspective of matrix distance. Experiments on binary and multi-class datasets verify the effectiveness of the Back-End algorithm. Future work will focus on adaptations of the algorithm.
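
To make the terms in the abstract concrete, the following is a minimal sketch of a class conditional noise rate matrix and of a matrix-distance error measure. It is not the Back-End algorithm, whose details are not given on this page. It assumes a small subset of examples for which both the clean and the noisy label are known, estimates the matrix by simple row-normalised counting, and uses the Frobenius norm as one possible matrix distance. All function names (`inject_ccn`, `estimate_noise_matrix`, `frobenius_distance`) and the example matrix `T_true` are hypothetical and chosen for illustration only.

```python
import math
import random


def inject_ccn(labels, T, rng):
    """Simulate class conditional noise: a clean label i becomes
    noisy label j with probability T[i][j], independent of features."""
    classes = range(len(T))
    return [rng.choices(classes, weights=T[y])[0] for y in labels]


def estimate_noise_matrix(clean, noisy, num_classes):
    """Counting estimate: T_hat[i][j] approximates P(noisy = j | clean = i).
    Assumption: clean and noisy labels are both available for these examples."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for y, y_noisy in zip(clean, noisy):
        counts[y][y_noisy] += 1
    T_hat = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # No examples of class i: fall back to a "no noise" row.
            T_hat.append([1.0 if j == i else 0.0 for j in range(num_classes)])
        else:
            T_hat.append([c / total for c in row])
    return T_hat


def frobenius_distance(A, B):
    """Frobenius norm of A - B, one possible matrix-distance metric."""
    return math.sqrt(
        sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))
    )


rng = random.Random(0)
T_true = [[0.8, 0.2],   # clean class 0 is flipped to 1 with probability 0.2
          [0.3, 0.7]]   # clean class 1 is flipped to 0 with probability 0.3
clean = [rng.randrange(2) for _ in range(5000)]
noisy = inject_ccn(clean, T_true, rng)

T_hat = estimate_noise_matrix(clean, noisy, num_classes=2)
error = frobenius_distance(T_true, T_hat)
print(T_hat, error)
```

With a few thousand doubly labelled examples the counting estimate should land close to `T_true`, and the distance shrinks as the clean subset grows. In practice the clean subset is small, which is what makes the estimation problem non-trivial.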

Index Terms

Back-End: A Noise Rate Estimation Method in the Presence of Class Conditional Noise
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Classification and regression trees

Published in

ICIT '17: Proceedings of the 2017 International Conference on Information Technology
December 2017
492 pages
ISBN:9781450363518
DOI:10.1145/3176653

Copyright © 2017 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Weakly supervised learning
class conditional noise (CCN)
confusion matrix
noise rate estimation
Qualifiers
- research-article
- Research
- Refereed limited