Learning a Label-Noise Robust Logistic Regression: Analysis and Experiments

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2013

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8206)

Abstract

Label-noise robust logistic regression (rLR) is an extension of logistic regression that includes a model of random mislabelling. This paper presents a theoretical analysis of rLR. By decomposing and interpreting the gradient of the rLR likelihood objective, as used in gradient ascent optimisation, we gain insight into how the rLR learning algorithm counteracts the negative effect of mislabelling through an intrinsic re-weighting mechanism. We also give an upper bound on the error of rLR using Rademacher complexities.
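
To make the gradient decomposition concrete, here is a minimal sketch (not the authors' code) of binary rLR fitted by gradient ascent. It assumes the label-flip matrix gamma, with gamma[j, k] standing for P(observed label = k | true label = j), is known and fixed, whereas the full method also estimates it from data; the function name rlr_gradient_ascent, the learning rate, and the iteration count are illustrative.

```python
import numpy as np

def sigmoid(z):
    # Clip before exponentiating for numerical stability.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def rlr_gradient_ascent(X, y_obs, gamma, lr=0.1, n_iter=1000):
    """Fit rLR weights by gradient ascent on the observed-label likelihood.

    gamma[j, k] plays the role of P(observed label = k | true label = j);
    it is assumed known and fixed here, whereas the paper's algorithm
    also estimates it from the data.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        s = sigmoid(X @ w)                               # P(y = 1 | x, w)
        p1 = gamma[0, 1] * (1.0 - s) + gamma[1, 1] * s   # P(y_obs = 1 | x)
        p0 = 1.0 - p1                                    # P(y_obs = 0 | x)
        # The per-sample factor below is the intrinsic re-weighting:
        # the usual logistic-regression residual is replaced by a term
        # that shrinks the influence of points whose observed label is
        # well explained as a flip of the model's prediction.
        residual = (gamma[1, 1] - gamma[0, 1]) * s * (1.0 - s) * (
            y_obs / p1 - (1.0 - y_obs) / p0
        )
        w += lr * (X.T @ residual) / n
    return w
```

Setting gamma to the identity matrix collapses the per-sample factor to the ordinary logistic-regression residual y − σ(wᵀx), so standard LR is recovered as the noise-free special case. With a noisy gamma, a point whose observed label strongly disagrees with the model's prediction is attributed to mislabelling and its gradient contribution is driven towards zero, which is the re-weighting mechanism the abstract refers to.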

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bootkrajang, J., Kabán, A. (2013). Learning a Label-Noise Robust Logistic Regression: Analysis and Experiments. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2013. IDEAL 2013. Lecture Notes in Computer Science, vol 8206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41278-3_69

  • DOI: https://doi.org/10.1007/978-3-642-41278-3_69

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41277-6

  • Online ISBN: 978-3-642-41278-3

  • eBook Packages: Computer Science (R0)
