Learning a Label-Noise Robust Logistic Regression: Analysis and Experiments

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2013

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8206)

Abstract

Label-noise robust logistic regression (rLR) is an extension of logistic regression that includes a model of random mislabelling. This paper presents a theoretical analysis of rLR. By decomposing and interpreting the gradient of the rLR likelihood objective, as used in gradient ascent optimisation, we gain insight into how the rLR learning algorithm counteracts the negative effect of mislabelling through an intrinsic re-weighting mechanism. We also give an upper bound on the error of rLR using Rademacher complexities.
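
To make the gradient decomposition concrete, here is a minimal sketch (not the authors' code) of binary rLR fitted by gradient ascent. It assumes the label-flip matrix gamma, with gamma[j, k] standing for P(observed label = k | true label = j), is known and fixed, whereas the full method also estimates it from data; the function name rlr_gradient_ascent, the learning rate, and the iteration count are illustrative.

```python
import numpy as np

def sigmoid(z):
    # Clip before exponentiating for numerical stability.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def rlr_gradient_ascent(X, y_obs, gamma, lr=0.1, n_iter=1000):
    """Fit rLR weights by gradient ascent on the observed-label likelihood.

    gamma[j, k] plays the role of P(observed label = k | true label = j);
    it is assumed known and fixed here, whereas the paper's algorithm
    also estimates it from the data.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        s = sigmoid(X @ w)                               # P(y = 1 | x, w)
        p1 = gamma[0, 1] * (1.0 - s) + gamma[1, 1] * s   # P(y_obs = 1 | x)
        p0 = 1.0 - p1                                    # P(y_obs = 0 | x)
        # The per-sample factor below is the intrinsic re-weighting:
        # the usual logistic-regression residual is replaced by a term
        # that shrinks the influence of points whose observed label is
        # well explained as a flip of the model's prediction.
        residual = (gamma[1, 1] - gamma[0, 1]) * s * (1.0 - s) * (
            y_obs / p1 - (1.0 - y_obs) / p0
        )
        w += lr * (X.T @ residual) / n
    return w
```

Setting gamma to the identity matrix collapses the per-sample factor to the ordinary logistic-regression residual y − σ(wᵀx), so standard LR is recovered as the noise-free special case. With a noisy gamma, a point whose observed label strongly disagrees with the model's prediction is attributed to mislabelling and its gradient contribution is driven towards zero, which is the re-weighting mechanism the abstract refers to.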

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bootkrajang, J., Kabán, A. (2013). Learning a Label-Noise Robust Logistic Regression: Analysis and Experiments. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2013. IDEAL 2013. Lecture Notes in Computer Science, vol 8206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41278-3_69

  • DOI: https://doi.org/10.1007/978-3-642-41278-3_69

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41277-6

  • Online ISBN: 978-3-642-41278-3

  • eBook Packages: Computer Science (R0)
