Abstract
The basic question addressed in this paper is: how can a learning algorithm cope with incorrect training examples? Specifically, how can algorithms that produce an “approximately correct” identification with “high probability” for reliable data be adapted to handle noisy data? We show that when the teacher may make independent random errors in classifying the example data, the strategy of selecting the most consistent rule for the sample is sufficient, and usually requires a feasibly small number of examples, provided noise affects less than half the examples on average. In this setting we are able to estimate the rate of noise using only the knowledge that the rate is less than one half. The basic ideas extend to other types of random noise as well. We also show that the search problem associated with this strategy is intractable in general. However, for particular classes of rules the target rule may be efficiently identified if we use techniques specific to that class. For an important class of formulas – the k-CNF formulas studied by Valiant – we present a polynomial-time algorithm that identifies concepts in this form when the rate of classification errors is less than one half.
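The strategy described above, choosing the hypothesis that disagrees with the fewest sample points, can be illustrated with a small simulation. The hypothesis class (thresholds on the unit interval), the target, and the noise rate below are illustrative choices, not taken from the paper; the paper's analysis shows that, for a finite class, a sample size roughly proportional to ln(|H|/δ) / (ε²(1−2η)²) suffices when the noise rate η is below one half.

```python
import random

random.seed(0)

ETA = 0.2        # classification-noise rate (must be < 1/2)
TARGET = 0.5     # true threshold: label(x) = 1 iff x >= TARGET
GRID = [i / 100 for i in range(101)]   # finite class of threshold hypotheses

def noisy_sample(m):
    """Draw m examples whose correct labels are flipped independently
    with probability ETA (the random-misclassification noise model)."""
    data = []
    for _ in range(m):
        x = random.random()
        y = 1 if x >= TARGET else 0
        if random.random() < ETA:
            y = 1 - y
        data.append((x, y))
    return data

def min_disagreement(data):
    """Return the threshold hypothesis that disagrees with the
    fewest examples in the (noisy) sample."""
    def disagreements(t):
        return sum(1 for x, y in data if (1 if x >= t else 0) != y)
    return min(GRID, key=disagreements)

best = min_disagreement(noisy_sample(2000))
print(best)
```

Even though one example in five carries the wrong label, the minimum-disagreement hypothesis lands close to the target, since the true concept still disagrees with the sample least often in expectation whenever η < 1/2.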
References
Angluin, D. (1987). Learning regular sets from queries and counterexamples. Information and Computation, 75, 87–106.
Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. (1986). Classifying learnable geometric concepts with the Vapnik-Chervonenkis dimension. Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing (pp. 273–282). Berkeley, CA: The Association for Computing Machinery.
Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58, 13–30.
Kearns, M., & Li, M. (1987). Learning in the presence of malicious errors (Technical Report TR-03–87). Cambridge, MA: Harvard University, Center for Research in Computing Technology.
Kearns, M., Li, M., Pitt, L., & Valiant, L. (1987). On the learnability of Boolean formulae. Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing (pp. 285–295). New York: The Association for Computing Machinery.
Laird, P. (1987). Learning from good data and bad. Doctoral dissertation, Department of Computer Science, Yale University, New Haven, CT.
Quinlan, J. R. (1986). The effect of noise on concept learning. In R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.), Machine learning: An artificial intelligence approach (Vol. 2). Los Altos, CA: Morgan Kaufmann.
Schlimmer, J. C., & Granger, R. H. (1986). Incremental learning from noisy data. Machine Learning, 1, 317–354.
Shackelford, G. G., & Volper, D. J. (1987). Learning in the presence of noise. Unpublished manuscript. University of California, Department of Information and Computer Science, Irvine.
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27, 1134–1142.
Valiant, L. G. (1985). Learning disjunctions of conjunctions. Proceedings of the Ninth International Joint Conference on Artificial Intelligence (pp. 560–566). Los Angeles, CA: Morgan Kaufmann.
Vapnik, V. N. (1982). Estimation of dependencies based on empirical data. New York: Springer-Verlag.
Wilkins, D. C., & Buchanan, B. G. (1986). On debugging rule sets when reasoning under uncertainty. Proceedings of the Fifth National Conference on Artificial Intelligence (pp. 448–454). Philadelphia, PA: Morgan Kaufmann.
Cite this article
Angluin, D., Laird, P. Learning From Noisy Examples. Machine Learning 2, 343–370 (1988). https://doi.org/10.1023/A:1022873112823