Abstract

Bayesian networks are today one of the most promising approaches to Data Mining and knowledge discovery in databases. This chapter reviews the fundamentals of Bayesian networks and some of their technical aspects, with particular emphasis on methods for inducing Bayesian networks from different types of data. Basic notions are illustrated through detailed descriptions of two Bayesian network applications: one to survey data and one to marketing data.

Copyright information

© 2005 Springer Science+Business Media, Inc.

About this chapter

Cite this chapter

Sebastiani, P., Abad, M.M., Ramoni, M.F. (2005). Bayesian Networks. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/0-387-25465-X_10

  • DOI: https://doi.org/10.1007/0-387-25465-X_10

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-24435-8

  • Online ISBN: 978-0-387-25465-4

  • eBook Packages: Computer Science (R0)
