Abstract
Pre-pruning and post-pruning are two standard techniques for handling noise in decision tree learning. Pre-pruning deals with noise during learning, while post-pruning addresses the problem after an overfitted theory has been learned. We first review several adaptations of pre- and post-pruning techniques for separate-and-conquer rule learning algorithms and discuss some fundamental problems. The primary goal of this paper is to show how to solve these problems with two new algorithms that combine and integrate pre- and post-pruning.
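To make the distinction concrete, the following minimal Python sketch (an illustration under assumed design choices, not the paper's new algorithms) shows a toy separate-and-conquer learner for binary data: pre-pruning appears as a stopping criterion inside rule refinement, and post-pruning as a reduced-error-style simplification of each rule against a held-out validation set. All function names, thresholds, and data representations here are hypothetical.

```python
# Illustrative sketch only: a toy separate-and-conquer rule learner showing
# where pre-pruning (a stopping criterion during refinement) and post-pruning
# (reduced-error-style simplification on validation data) hook in.
from typing import List, Tuple

Example = Tuple[Tuple[int, ...], int]   # (binary feature vector, class label)
Rule = List[Tuple[int, int]]            # conjunction of (feature index, value) tests

def covers(rule: Rule, x: Tuple[int, ...]) -> bool:
    return all(x[i] == v for i, v in rule)

def purity(rule: Rule, examples: List[Example]) -> float:
    covered = [y for x, y in examples if covers(rule, x)]
    return sum(covered) / len(covered) if covered else 0.0

def learn_rule(examples: List[Example], min_purity_gain: float = 0.01) -> Rule:
    """Greedily add tests; stop early (pre-pruning) when no test improves
    purity on the training data by at least min_purity_gain."""
    rule: Rule = []
    n_features = len(examples[0][0])
    while True:
        best, best_purity = None, purity(rule, examples)
        for i in range(n_features):
            for v in (0, 1):
                cand = rule + [(i, v)]
                p = purity(cand, examples)
                if p > best_purity + min_purity_gain:
                    best, best_purity = cand, p
        if best is None:          # pre-pruning: no significant refinement left
            return rule
        rule = best

def post_prune(rule: Rule, validation: List[Example]) -> Rule:
    """Reduced-error-style post-pruning: drop trailing tests as long as
    accuracy on a held-out (non-empty) validation set does not decrease."""
    def acc(r: Rule) -> float:
        preds = [(1 if covers(r, x) else 0, y) for x, y in validation]
        return sum(p == y for p, y in preds) / len(preds)
    while rule and acc(rule[:-1]) >= acc(rule):
        rule = rule[:-1]
    return rule

def separate_and_conquer(train: List[Example], validation: List[Example]) -> List[Rule]:
    """Learn rules one at a time, pruning each before removing the
    examples it covers from the remaining training set."""
    rules, remaining = [], list(train)
    while any(y == 1 for _, y in remaining):
        rule = post_prune(learn_rule(remaining), validation)
        if not rule:              # fully pruned rule covers everything; stop
            break
        rules.append(rule)
        remaining = [(x, y) for x, y in remaining if not covers(rule, x)]
    return rules
```

In this sketch each rule is pruned immediately after it is learned, before the covered examples are removed; this kind of interleaving, rather than pruning only the finished theory, is the direction of integration the paper pursues.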
Cite this article
Fürnkranz, J. Pruning Algorithms for Rule Learning. Machine Learning 27, 139–172 (1997). https://doi.org/10.1023/A:1007329424533