Abstract
Neural networks, despite their empirically proven abilities, have been little used for the refinement of existing knowledge because this task requires a three-step process. First, knowledge must be inserted into a neural network. Second, the network must be refined. Third, the refined knowledge must be extracted from the network. We have previously described a method for the first step of this process. Standard neural learning techniques can accomplish the second step. In this article, we propose and empirically evaluate a method for the final, and possibly most difficult, step. Our method efficiently extracts symbolic rules from trained neural networks. The four major results of empirical tests of this method are that the extracted rules 1) closely reproduce the accuracy of the network from which they are extracted; 2) are superior to the rules produced by methods that directly refine symbolic rules; 3) are superior to those produced by previous techniques for extracting rules from trained neural networks; and 4) are “human comprehensible.” Thus, this method demonstrates that neural networks can be used to effectively refine symbolic knowledge. Moreover, the rule-extraction technique developed herein contributes to the understanding of how symbolic and connectionist approaches to artificial intelligence can be profitably integrated.
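The extraction step described above can be illustrated with a toy sketch. The snippet below is not the paper's algorithm; it assumes a single perceptron-style unit that fires when its weighted input sum exceeds zero and whose significant weights are roughly equal (the clustered case the paper's M-of-N rules target). The function name `extract_m_of_n`, the pruning threshold `tol`, and the example weights are all illustrative.

```python
import math

def extract_m_of_n(weights, bias, antecedents, tol=0.25):
    """Toy M-of-N rule extraction for one thresholded unit.

    Assumes the unit fires when sum(w_i * x_i) + bias > 0 with boolean
    inputs, and that the surviving weights are roughly equal. Weights whose
    magnitude falls below tol * (largest magnitude) are pruned as
    insignificant -- a crude stand-in for clustering the weights.
    """
    w_max = max(abs(w) for w in weights)
    # keep only significant links, pairing each weight with its antecedent
    kept = [(w, a) for w, a in zip(weights, antecedents) if abs(w) >= tol * w_max]
    w_avg = sum(w for w, _ in kept) / len(kept)
    n = len(kept)
    # the unit fires once enough antecedents are true: m * w_avg + bias > 0,
    # so the smallest such integer m is floor(-bias / w_avg) + 1
    m = max(1, math.floor(-bias / w_avg) + 1) if w_avg > 0 else n
    return m, [a for _, a in kept]

# Example: three near-equal weights and one negligible one yield a 2-of-3 rule
m, ants = extract_m_of_n([0.9, 1.1, 1.0, 0.05], bias=-1.8,
                         antecedents=["a", "b", "c", "d"])
print(f"{m}-of-{ants}")
```

With the example unit, two true antecedents give 2.0 - 1.8 > 0 while one gives 1.0 - 1.8 < 0, so the extracted rule "at least 2 of {a, b, c}" reproduces the unit's behavior on boolean inputs.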
Cite this article
Towell, G.G., Shavlik, J.W. Extracting Refined Rules from Knowledge-Based Neural Networks. Machine Learning 13, 71–101 (1993). https://doi.org/10.1023/A:1022683529158