ABSTRACT
The era of big data has brought massive amounts of information along with challenges in annotating examples. Because label noise is common in datasets, weakly supervised learning is becoming increasingly popular. In this paper, we discuss the problem of estimating the noise rate matrix from a small proportion of clean data, which is of great significance for learning in the presence of class-conditional noise. After reviewing several notable recent methods, we propose a more comprehensive and explainable algorithm called Back-End. The method attempts to capture the noise characteristics from the discrimination between the noisy and clean datasets. We also develop, for the first time, novel evaluation metrics from the perspective of matrix distance. Experiments on binary and multi-class datasets verify the effectiveness of the Back-End algorithm. Future work will focus on adaptations of the algorithm.
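To make the terms concrete, the following minimal sketch shows what a noise rate matrix under class-conditional noise looks like and how an estimate can be scored with a matrix distance. It is not the Back-End algorithm, whose details the abstract does not give. It is a plain counting baseline that assumes a small subset of examples carries both a clean and a noisy label, and it uses the Frobenius norm as one possible matrix distance. All function names are hypothetical.

```python
import math
import random


def estimate_noise_matrix(clean_labels, noisy_labels, num_classes):
    """Counting baseline: T[i][j] approximates P(noisy = j | clean = i)."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for y_clean, y_noisy in zip(clean_labels, noisy_labels):
        counts[y_clean][y_noisy] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # Assumption: a class absent from the clean subset gets a uniform row.
            matrix.append([1.0 / num_classes] * num_classes)
    return matrix


def frobenius_distance(a, b):
    """Frobenius norm of (a - b), one possible matrix-distance metric."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def corrupt(labels, true_matrix, rng):
    """Flip each label according to the row of true_matrix for its clean class."""
    classes = range(len(true_matrix))
    return [rng.choices(classes, weights=true_matrix[y])[0] for y in labels]


# Synthetic binary example with an assumed ground-truth noise rate matrix.
rng = random.Random(0)
true_T = [[0.8, 0.2], [0.3, 0.7]]
clean = [rng.randrange(2) for _ in range(2000)]
noisy = corrupt(clean, true_T, rng)

est_T = estimate_noise_matrix(clean, noisy, num_classes=2)
error = frobenius_distance(est_T, true_T)
print("estimated matrix:", est_T)
print("Frobenius distance to ground truth:", error)
```

Each row of the estimated matrix sums to one, and the distance shrinks as the clean subset grows, which is the kind of behaviour a matrix-distance metric is meant to expose.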
Index Terms
- Back-End: A Noise Rate Estimation Method in the Presence of Class Conditional Noise