Abstract
The basic question addressed in this paper is: how can a learning algorithm cope with incorrect training examples? Specifically, how can algorithms that produce an “approximately correct” identification with “high probability” for reliable data be adapted to handle noisy data? We show that when the teacher may make independent random errors in classifying the example data, the strategy of selecting the most consistent rule for the sample is sufficient, and usually requires a feasibly small number of examples, provided noise affects less than half the examples on average. In this setting we are able to estimate the rate of noise using only the knowledge that the rate is less than one half. The basic ideas extend to other types of random noise as well. We also show that the search problem associated with this strategy is intractable in general. However, for particular classes of rules the target rule may be efficiently identified if we use techniques specific to that class. For an important class of formulas – the k-CNF formulas studied by Valiant – we present a polynomial-time algorithm that identifies concepts in this form when the rate of classification errors is less than one half.
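The strategy described above, choosing the hypothesis that disagrees with the fewest sample points, can be illustrated with a small simulation. The hypothesis class (thresholds on the unit interval), the target, and the noise rate below are illustrative choices, not taken from the paper; the paper's analysis shows that, for a finite class, a sample size roughly proportional to ln(|H|/δ) / (ε²(1−2η)²) suffices when the noise rate η is below one half.

```python
import random

random.seed(0)

ETA = 0.2        # classification-noise rate (must be < 1/2)
TARGET = 0.5     # true threshold: label(x) = 1 iff x >= TARGET
GRID = [i / 100 for i in range(101)]   # finite class of threshold hypotheses

def noisy_sample(m):
    """Draw m examples whose correct labels are flipped independently
    with probability ETA (the random-misclassification noise model)."""
    data = []
    for _ in range(m):
        x = random.random()
        y = 1 if x >= TARGET else 0
        if random.random() < ETA:
            y = 1 - y
        data.append((x, y))
    return data

def min_disagreement(data):
    """Return the threshold hypothesis that disagrees with the
    fewest examples in the (noisy) sample."""
    def disagreements(t):
        return sum(1 for x, y in data if (1 if x >= t else 0) != y)
    return min(GRID, key=disagreements)

best = min_disagreement(noisy_sample(2000))
print(best)
```

Even though one example in five carries the wrong label, the minimum-disagreement hypothesis lands close to the target, since the true concept still disagrees with the sample least often in expectation whenever η < 1/2.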
References
Angluin, D. (1987). Learning regular sets from queries and counterexamples. Information and Computation, 75, 87–106.
Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. (1986). Classifying learnable geometric concepts with the Vapnik-Chervonenkis dimension. Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing (pp. 273–282). Berkeley, CA: The Association for Computing Machinery.
Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58, 13–30.
Kearns, M., & Li, M. (1987). Learning in the presence of malicious errors (Technical Report TR-03–87). Cambridge, MA: Harvard University, Center for Research in Computing Technology.
Kearns, M., Li, M., Pitt, L., & Valiant, L. (1987). On the learnability of Boolean formulae. Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing (pp. 285–295). New York: The Association for Computing Machinery.
Laird, P. (1987). Learning from good data and bad. Doctoral dissertation, Department of Computer Science, Yale University, New Haven, CT.
Quinlan, J. R. (1986). The effect of noise on concept learning. In R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.), Machine learning: An artificial intelligence approach (Vol. 2). Los Altos, CA: Morgan Kaufmann.
Schlimmer, J. C., & Granger, R. H. (1986). Incremental learning from noisy data. Machine Learning, 1, 317–354.
Shackelford, G. G., & Volper, D. J. (1987). Learning in the presence of noise. Unpublished manuscript. University of California, Department of Information and Computer Science, Irvine.
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27, 1134–1142.
Valiant, L. G. (1985). Learning disjunctions of conjunctions. Proceedings of the Ninth International Joint Conference on Artificial Intelligence (pp. 560–566). Los Angeles, CA: Morgan Kaufmann.
Vapnik, V. N. (1982). Estimation of dependencies based on empirical data. New York: Springer-Verlag.
Wilkins, D. C., & Buchanan, B. G. (1986). On debugging rule sets when reasoning under uncertainty. Proceedings of the Fifth National Conference on Artificial Intelligence (pp. 448–454). Philadelphia, PA: Morgan Kaufmann.
Cite this article
Angluin, D., Laird, P. Learning From Noisy Examples. Machine Learning 2, 343–370 (1988). https://doi.org/10.1023/A:1022873112823