Computational machine learning in theory and praxis

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1000)

Abstract

In the last few decades a computational approach to machine learning has emerged, based on paradigms from recursion theory and the theory of computation. Such ideas include learning in the limit, learning by enumeration, and probably approximately correct (pac) learning. These models are usually not suitable in practical situations. In contrast, statistics-based inference methods have enjoyed a long and distinguished career. Currently, Bayesian reasoning in various forms, minimum message length (MML), and minimum description length (MDL) are widely applied approaches. They are the tools of choice in machine learning practice, used together with techniques such as simulated annealing, genetic algorithms, genetic programming, and artificial neural networks. These statistical inference methods select the hypothesis that minimizes the sum of the length of the description of the hypothesis (also called the 'model') and the length of the description of the data relative to the hypothesis. It appears to us that the future of computational machine learning will involve combinations of the above approaches, coupled with guarantees on the time and memory resources used. Computational learning theory will move closer to practice, and the application of principles such as MDL requires further justification. Here we survey some of the actors in this dichotomy between theory and praxis, justify MDL via the Bayesian approach, and compare pac learning and MDL learning of decision trees.
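The selection criterion above can be made precise in the following sketch; the notation is ours and rests on the standard correspondence between code lengths and probabilities ($L(x) = -\log P(x)$, up to rounding), not on a formula quoted from the chapter. Two-part MDL selects

$$ H_{\mathrm{MDL}} \;=\; \arg\min_{H \in \mathcal{H}} \bigl[\, L(H) + L(D \mid H) \,\bigr], $$

where $L(H)$ is the length of the description of the hypothesis $H$ and $L(D \mid H)$ is the length of the description of the data $D$ relative to $H$. Setting $L(H) = -\log P(H)$ and $L(D \mid H) = -\log P(D \mid H)$, Bayes' rule $P(H \mid D) = P(H)\,P(D \mid H)/P(D)$ yields

$$ -\log P(H \mid D) \;=\; L(H) + L(D \mid H) + \log P(D), $$

and since $\log P(D)$ does not depend on $H$, minimizing the two-part code length is equivalent to maximizing the posterior $P(H \mid D)$: under these coding assumptions, the MDL hypothesis coincides with the maximum a posteriori hypothesis, which is the Bayesian justification of MDL alluded to above.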

The first author was supported in part by NSERC operating grant OGP-046506, ITRC, and a CGAT grant. The second author was supported by NSERC through International Scientific Exchange Award ISE0125663, and by the European Union through NeuroCOLT ESPRIT Working Group Nr. 8556, and by NWO through NFI Project ALADDIN under Contract number NF 62-376.

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Li, M., Vitányi, P. (1995). Computational machine learning in theory and praxis. In: van Leeuwen, J. (ed.) Computer Science Today. Lecture Notes in Computer Science, vol. 1000. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0015264

  • DOI: https://doi.org/10.1007/BFb0015264

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60105-0

  • Online ISBN: 978-3-540-49435-5
