Abstract
This paper considers the reconstruction of the structure of probabilistic dependence models in the class of directed acyclic graphs (DAGs) and in the class of mono-flow graphs. (Mono-flow graphs form a subclass of DAGs in which cycles with one collider are prohibited.) The technique of induced (provoked) dependences is investigated, and its application to identifying model structures is shown. The algorithm “Collifinder-M” is developed, which identifies all collider variables and thereby solves an intermediate problem in reconstructing the structure of a mono-flow model. It is shown that generalizing the technique of induced dependences makes it possible to strengthen well-known rules for orienting edges in a DAG model.
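The notion of an induced dependence can be illustrated with the standard textbook collider X -> Z <- Y: two marginally independent causes become dependent once their common effect is conditioned on. The sketch below is only an illustration of that general effect, not the Collifinder-M algorithm. The toy model (two fair binary variables with Z = X XOR Y) and the helper names `joint`, `prob`, and `independent` are assumptions introduced here.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model (not from the paper): X and Y are independent
# fair coins, and the collider Z is their exclusive OR.
def joint():
    """Exact joint distribution P(x, y, z) for X -> Z <- Y with Z = X XOR Y."""
    quarter = Fraction(1, 4)
    return {(x, y, x ^ y): quarter for x, y in product((0, 1), repeat=2)}

_INDEX = {"x": 0, "y": 1, "z": 2}

def prob(dist, **fixed):
    """Probability that the named variables take the given values."""
    return sum(
        p for outcome, p in dist.items()
        if all(outcome[_INDEX[name]] == value for name, value in fixed.items())
    )

def independent(dist, given=None):
    """Check whether X and Y are independent, optionally conditional on Z = given."""
    cond = {} if given is None else {"z": given}
    norm = prob(dist, **cond)
    for x, y in product((0, 1), repeat=2):
        p_xy = prob(dist, x=x, y=y, **cond) / norm
        p_x = prob(dist, x=x, **cond) / norm
        p_y = prob(dist, y=y, **cond) / norm
        if p_xy != p_x * p_y:
            return False
    return True

dist = joint()
print(independent(dist))           # marginal independence of X and Y
print(independent(dist, given=0))  # independence of X and Y given Z = 0
```

In this toy model the first check succeeds and the second fails: conditioning on the collider Z induces a dependence between X and Y. Exact fractions are used so that the equality test is not affected by floating-point rounding.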
Additional information
Translated from Kibernetika i Sistemnyi Analiz, No. 6, pp. 19–31, November–December 2005.
Cite this article
Balabanov, A.S. Inference of structures of models of probabilistic dependences from statistical data. Cybern Syst Anal 41, 808–817 (2005). https://doi.org/10.1007/s10559-006-0019-1
DOI: https://doi.org/10.1007/s10559-006-0019-1