Identifying Markov Blankets Using Lasso Estimation

Li, Gang; Dai, Honghua; Tu, Yiqing

doi:10.1007/978-3-540-24775-3_39

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3056)

Abstract

Determining the causal relations among attributes in a domain is a key task in data mining and knowledge discovery. The Minimum Message Length (MML) principle has demonstrated its ability to discover linear causal models from training data. To improve the efficiency of this discovery, this paper proposes a novel Markov Blanket identification algorithm based on the Lasso estimator. For each variable, the algorithm first generates a Lasso tree, which represents a pruned set of candidate feature sets. The MML principle is then used to evaluate each candidate feature set, and the one with the minimum message length is chosen as the Markov Blanket. Our experimental results demonstrate the effectiveness of the algorithm. In addition, the algorithm can be used to prune the search space of causal discovery, further reducing the computational cost of score-based causal discovery algorithms.
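The two-stage idea in the abstract (use the Lasso to generate a pruned set of candidate feature sets, then score each candidate and keep the best) can be illustrated with a minimal sketch. It is not the authors' implementation. The candidate sets here are simply the distinct supports found along a grid of Lasso penalties, standing in for the paper's Lasso tree, and the score is a BIC-style two-part code length, standing in for the MML message length, whose exact form the abstract does not give. The names lasso_cd, ols_rss, code_length and markov_blanket, the penalty grid, and the toy data are all assumptions made for illustration.

```python
import math
import random


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent Lasso: minimise (1/2n)*||y - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb; equals y while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            # Soft-thresholding sets weak coefficients exactly to zero.
            new = math.copysign(max(abs(rho) - lam, 0.0), rho) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def ols_rss(X, y, cols):
    """Residual sum of squares of a least-squares fit of y on the chosen columns."""
    n, k = len(X), len(cols)
    if k == 0:
        return sum(v * v for v in y)
    # Normal equations A b = c, solved by Gaussian elimination with pivoting.
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in cols] for a in cols]
    c = [sum(X[i][a] * y[i] for i in range(n)) for a in cols]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for cc in range(col, k):
                A[r][cc] -= f * A[col][cc]
            c[r] -= f * c[col]
    b = [0.0] * k
    for r in range(k - 1, -1, -1):
        b[r] = (c[r] - sum(A[r][cc] * b[cc] for cc in range(r + 1, k))) / A[r][r]
    return sum((y[i] - sum(b[t] * X[i][cols[t]] for t in range(k))) ** 2
               for i in range(n))


def code_length(X, y, cols):
    """BIC-style two-part code length (an assumed stand-in for the MML score)."""
    n = len(y)
    rss = max(ols_rss(X, y, cols), 1e-12)
    return 0.5 * n * math.log(rss / n) + 0.5 * len(cols) * math.log(n)


def markov_blanket(data, target, lambdas):
    """Pick the best-scoring Lasso support for one target variable."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    others = [j for j in range(p) if j != target]
    X = [[row[j] for j in others] for row in centered]
    y = [row[target] for row in centered]
    candidates = {()}  # the empty set is always a candidate
    for lam in lambdas:
        beta = lasso_cd(X, y, lam)
        candidates.add(tuple(j for j, b in enumerate(beta) if abs(b) > 1e-10))
    best = min(candidates, key=lambda s: code_length(X, y, list(s)))
    return sorted(others[j] for j in best)


# Toy linear model (hypothetical): x2 depends on x0 and x1; x3 is unrelated.
random.seed(0)
data = []
for _ in range(200):
    x0, x1, x3 = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    x2 = 2.0 * x0 - 1.5 * x1 + random.gauss(0, 0.5)
    data.append([x0, x1, x2, x3])
lambdas = [2.5 * 0.5 ** k for k in range(10)]
print(markov_blanket(data, target=2, lambdas=lambdas))
```

Only the handful of supports visited by the Lasso are scored, rather than every subset of the remaining variables, which is the pruning effect the abstract describes. Running the same routine for each variable in turn would yield one candidate Markov Blanket per variable.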

Author information

Authors and Affiliations

School of Information Technology, Deakin University, 221 Burwood Highway, Vic, 3125, Australia
Gang Li, Honghua Dai & Yiqing Tu

Editor information

Editors and Affiliations

School of Engineering and Information Technology, Deakin University, VIC 3125, Australia
Honghua Dai
University of Illinois at Urbana-Champaign, 61801, Urbana, IL, USA
Ramakrishnan Srikant
Faculty of Engineering and Information Technology, Centre for Quantum Computation and Intelligent Systems, and Australian ACS National Committee for Artificial Intelligence, University of Technology, Sydney, Australia
Chengqi Zhang

About this paper

Cite this paper

Li, G., Dai, H., Tu, Y. (2004). Identifying Markov Blankets Using Lasso Estimation. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_39

DOI: https://doi.org/10.1007/978-3-540-24775-3_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22064-0
Online ISBN: 978-3-540-24775-3
eBook Packages: Springer Book Archive
