
Message Length Formulation of Support Vector Machines for Binary Classification: A Preliminary Scheme

  • Conference paper
AI 2002: Advances in Artificial Intelligence (AI 2002)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2557)


Abstract

This paper presents a preliminary attempt at performing extrinsic binary classification by reformulating the Support Vector Machine (SVM) approach in a Bayesian Message Length framework. The reformulation uses the Minimum Message Length (MML) principle to cost each hyperplane via a two-part message that defines the separating hyperplane. The length of this message serves as the objective function for a search through the hypothesis space of hyperplanes that could dichotomise a given set of data points.

Two preliminary MML implementations are presented here, which differ in the (Bayesian) coding schemes used and in their search procedures. The generalisation ability of these two reformulations on both artificial and real data sets is compared against current implementations of Support Vector Machines, namely SVMlight, the Lagrangian Support Vector Machine and SMOBR. It was found that, in general, all implementations improved as the size of the data sets increased. The MML implementations tended to perform best on the inseparable data sets and on the real data set. Our preliminary MML scheme showed itself to be a strong competitor to the classical SVM, despite inefficiencies in the current scheme.
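The two-part message idea described above can be sketched concretely. The following is a minimal illustrative sketch, not either of the paper's actual Bayesian coding schemes: the fixed parameter precision for the first part, the logistic model for the second part, and the function name are all assumptions made here for illustration only. Part 1 is the cost (in bits) of stating the hyperplane; Part 2 is the cost of stating the class labels given that hyperplane, so misclassified or marginal points cost more bits.

```python
import numpy as np

def two_part_message_length(w, b, X, y, precision_bits=8):
    """Illustrative two-part MML cost for a separating hyperplane.

    Part 1 encodes the hyperplane parameters (w, b), each to a fixed
    precision of `precision_bits` bits -- a crude stand-in for a proper
    Bayesian code over the parameter space.  Part 2 encodes the labels
    y in {-1, +1} given the hyperplane, using a logistic likelihood of
    the signed margin, so well-separated points are cheap to state.
    """
    # Part 1: cost of stating each of the |w| + 1 parameters.
    n_params = w.size + 1
    part1 = n_params * precision_bits

    # Part 2: negative log-likelihood of the labels, in bits.
    margins = y * (X @ w + b)
    part2 = np.sum(np.log2(1.0 + np.exp(-margins)))

    return part1 + part2
```

A search through the hypothesis space would then minimise this total length over (w, b): a shorter two-part message corresponds to a hyperplane that compresses the labelled data better.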



References

  1. Barros de Almeida, M. (2001). http://www.litc.cpdee.ufmg.br/~barros/svm/smobr/.

  2. Bennett, K. P., Wu, D., & Auslender, L. (1998). On support vector decision trees for database marketing. Research Report 98-100, Rensselaer Polytechnic Institute, Troy, NY.

  3. Blake, C. L., & Merz, C. J. (1998). UCI Repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html. Irvine, CA: University of California, Department of Information and Computer Science.

  4. Brown, M., Grundy, W., Lin, D., Cristianini, N., Sugnet, C., Furey, T., Ares Jr, M., & Haussler, D. (2000). Knowledge-based Analysis of Microarray Gene Expression Data using Support Vector Machines. Proceedings of the National Academy of Sciences, 97(1), 262–267.

  5. Dowe, D. L., Farr, G. E., Hurst, A. J., & Lentin, K. L. (1996). Information-theoretic football tipping. In N. de Mestre (Ed.), Third Australian Conference on Mathematics and Computers in Sport, Bond University, Qld, 233–241.

  6. Dowe, D. L., & Krusel, N. (1993). A decision tree model of bushfire activity. Technical Report 93/190, Dept. of Computer Science, Monash University, Melbourne, 7 pp.

  7. Fitzgibbon, L. J., Allison, L., & Dowe, D. L. (2000). Minimum message length grouping of ordered data. In H. Arimura, S. Jain & A. Sharma (Eds.), Lecture Notes in Artificial Intelligence, vol 1968, 56–70. Springer-Verlag, Berlin, Germany.

  8. Good, I. J. (1965). The Estimation of Probabilities: An Essay on Modern Bayesian Methods. Research Monograph No. 30. Cambridge, MA: MIT Press.

  9. Joachims, T. (1998). Making Large-Scale SVM Learning Practical. In B. Schölkopf, C. J. C. Burges & A. J. Smola (Eds.), Advances in Kernel Methods: Support Vector Learning. MIT Press, Cambridge, MA. http://www.kernel-machines.org

  10. Kornienko, L., Dowe, D. L., & Albrecht, D. W. (2002). Message Length Formulation of Support Vector Machines for Binary Classification. Technical Report, Monash University, Clayton, Australia.

  11. Kullback, S. (1959). Information Theory and Statistics. John Wiley and Sons, Inc.

  12. Kullback, S., & Leibler, R. A. (1951). On Information and Sufficiency. Annals of Mathematical Statistics, 22, 79–86.

  13. Mangasarian, O. L., & Musicant, D. R. (2000). Lagrangian Support Vector Machines. Technical Report 00-06, Data Mining Institute. http://www.kernel-machines.org

  14. Needham, S. L., & Dowe, D. L. (2001). Message Length as an Effective Ockham’s Razor in Decision Tree Induction. In Proc. 8th International Workshop on Artificial Intelligence and Statistics (AI+STATS 2001), Key West, Florida, USA, 253–260.

  15. Osuna, E., Freund, R., & Girosi, F. (1997). Training Support Vector Machines: an Application to Face Detection. In Proceedings of CVPR’97, Puerto Rico.

  16. Platt, J. (1999). Sequential minimal optimization: A fast algorithm for training support vector machines. In B. Schölkopf, C. J. C. Burges & A. J. Smola (Eds.), Advances in Kernel Methods: Support Vector Learning, 185–208. MIT Press.

  17. Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer, New York.

  18. Viswanathan, M., & Wallace, C. S. (1999). A note on the comparison of polynomial selection methods. In D. Heckerman & J. Whittaker (Eds.), Proceedings of Uncertainty 99: The Seventh International Workshop on Artificial Intelligence and Statistics, 169–177. Fort Lauderdale, Florida, 3–6 January 1999. Morgan Kaufmann, San Francisco, CA, USA.

  19. Wallace, C. S., & Boulton, D. M. (1968). An information measure for classification. Computer Journal, 11, 185–194.

  20. Wallace, C. S., & Dowe, D. L. (1999). Minimum Message Length and Kolmogorov Complexity. Computer Journal Special Issue: Kolmogorov Complexity, 42(4), 270–283.

  21. Wallace, C. S., & Dowe, D. L. (2000). MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions. Statistics and Computing, 10(1), 73–83.

  22. Wallace, C. S., & Freeman, P. R. (1987). Estimation and Inference by Compact Coding. Journal of the Royal Statistical Society B, 49, 240–252.

  23. Wallace, C. S., & Patrick, J. D. (1993). Coding Decision Trees. Machine Learning, 11, 7–22.



Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kornienko, L., Dowe, D.L., Albrecht, D.W. (2002). Message Length Formulation of Support Vector Machines for Binary Classification: A Preliminary Scheme. In: McKay, B., Slaney, J. (eds) AI 2002: Advances in Artificial Intelligence. AI 2002. Lecture Notes in Computer Science, vol 2557. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36187-1_11


  • DOI: https://doi.org/10.1007/3-540-36187-1_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00197-3

  • Online ISBN: 978-3-540-36187-9

  • eBook Packages: Springer Book Archive
