Identifying Markov Blankets Using Lasso Estimation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3056)

Abstract

Determining the causal relations among attributes in a domain is a key task in data mining and knowledge discovery. The Minimum Message Length (MML) principle has demonstrated its effectiveness in discovering linear causal models from training data. To improve the efficiency of this discovery process, this paper proposes a novel Markov Blanket identification algorithm based on the Lasso estimator. For each variable, the algorithm first generates a Lasso tree, which represents a pruned set of candidate feature sets. The MML principle is then employed to evaluate these candidates, and the feature set with the minimum message length is chosen as the Markov Blanket. Our experimental results demonstrate the effectiveness of this algorithm. In addition, the algorithm can be used to prune the search space of causal discovery, further reducing the computational cost of score-based causal discovery algorithms.
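
To make the two-stage procedure concrete, the sketch below shows one way it could be wired together. It is only a minimal illustration under stated assumptions: scikit-learn's lars_path is used to enumerate the nested active sets along the Lasso path (standing in for the paper's Lasso tree), and a simple BIC-style two-part score replaces the paper's MML criterion, whose exact formulation is not reproduced here. The function names (candidate_feature_sets, message_length_score, markov_blanket) are hypothetical.

    import numpy as np
    from sklearn.linear_model import lars_path

    def candidate_feature_sets(X, y):
        # Enumerate the nested active sets along the Lasso path
        # (an approximation of the paper's "Lasso tree" of candidates).
        _, _, coefs = lars_path(X, y, method="lasso")
        candidates = []
        for step in range(coefs.shape[1]):
            active = tuple(np.flatnonzero(coefs[:, step]))
            if active and active not in candidates:
                candidates.append(active)
        return candidates

    def message_length_score(X_sub, y):
        # Placeholder two-part score (BIC-like), NOT the paper's MML criterion:
        # a data-fit term plus a per-parameter penalty.
        n, k = X_sub.shape
        design = np.c_[np.ones(n), X_sub]  # intercept plus selected features
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ beta) ** 2))
        return n * np.log(rss / n + 1e-12) + (k + 1) * np.log(n)

    def markov_blanket(X, y):
        # Choose the candidate feature set with the smallest score.
        candidates = candidate_feature_sets(X, y)
        if not candidates:
            return ()
        return min(candidates, key=lambda s: message_length_score(X[:, list(s)], y))

Applied in turn to each variable (regressing it on all the others), this yields one blanket per variable; as the abstract notes, those blankets can then be used to restrict the search space examined by a score-based causal discovery algorithm.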

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, G., Dai, H., Tu, Y. (2004). Identifying Markov Blankets Using Lasso Estimation. In: Dai, H., Srikant, R., Zhang, C. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science (LNAI), vol. 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_39

  • DOI: https://doi.org/10.1007/978-3-540-24775-3_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22064-0

  • Online ISBN: 978-3-540-24775-3
