Open Access

Sample-optimal and efficient learning of tree Ising models

Authors:
Constantinos Daskalakis, Massachusetts Institute of Technology, USA
Qinxuan Pan, Massachusetts Institute of Technology, USA

STOC 2021: Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, June 2021, Pages 133–146. https://doi.org/10.1145/3406325.3451006

Published: 15 June 2021

ABSTRACT

We show that n-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance ε from an optimal O(n ln n/ε²) samples, where O(·) hides an absolute constant which, importantly, does not depend on the model being learned: neither its tree nor the magnitude of its edge strengths, on which we place no assumptions. Our guarantees hold, in fact, for the celebrated Chow-Liu algorithm [1968], using the plug-in estimator for estimating mutual information. While this (or any other) algorithm may fail to identify the structure of the underlying model correctly from a finite sample, we show that it will still learn a tree-structured model that is ε-close to the true one in total variation distance, a guarantee called “proper learning.”
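The structure-learning step of the Chow-Liu algorithm mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes samples are given as an m × n NumPy array with entries in {−1, +1}, estimates every pairwise mutual information with the plug-in (empirical) estimator, and returns a maximum-weight spanning tree found by Kruskal's algorithm. The function names `plugin_mutual_information`, `chow_liu_tree`, and `sample_chain` are hypothetical, and `sample_chain` only generates toy data from a path-shaped model for the usage example.

```python
import numpy as np

def plugin_mutual_information(x, y):
    """Plug-in (empirical) mutual information, in nats, of two +/-1 sample vectors."""
    mi = 0.0
    for a in (-1, 1):
        for b in (-1, 1):
            p_ab = np.mean((x == a) & (y == b))   # empirical joint probability
            p_a = np.mean(x == a)                 # empirical marginals
            p_b = np.mean(y == b)
            if p_ab > 0:                          # 0 * log(0) is treated as 0
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def chow_liu_tree(samples):
    """Edges of a maximum-weight spanning tree under plug-in MI weights."""
    _, n = samples.shape
    weighted = [
        (plugin_mutual_information(samples[:, i], samples[:, j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    ]
    weighted.sort(reverse=True)                   # heaviest pairs first (Kruskal)

    parent = list(range(n))                       # union-find forest

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]         # path halving
            u = parent[u]
        return u

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:                              # adding (i, j) creates no cycle
            parent[ri] = rj
            edges.append((i, j))
    return sorted(edges)

def sample_chain(m, n, flip_prob, rng):
    """Toy data (an assumption for this example): a path X0 - X1 - ... - X(n-1)
    where each variable copies its predecessor and flips sign w.p. flip_prob."""
    x = np.empty((m, n), dtype=int)
    x[:, 0] = rng.choice([-1, 1], size=m)
    for k in range(1, n):
        flips = rng.random(m) < flip_prob
        x[:, k] = np.where(flips, -x[:, k - 1], x[:, k - 1])
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = sample_chain(5000, 5, 0.1, rng)
    print(chow_liu_tree(data))
```

A full Chow-Liu learner would additionally fit the pairwise marginals along the selected edges to obtain a distribution; the sketch stops at the tree. Note that the abstract's guarantee concerns closeness in total variation distance, so the returned tree need not match the true one for the learned model to be accurate.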

Our guarantees do not follow from known results for the Chow-Liu algorithm or from the ensuing literature on learning graphical models, including the very recent renaissance of algorithms for this learning challenge. Those results yield only asymptotic consistency, or sample-suboptimal and/or time-inefficient algorithms, unless further assumptions are placed on the model, such as bounds on the “strengths” of the model’s edges. While we establish guarantees for a widely known and simple algorithm, the analysis showing that this algorithm succeeds and is sample-optimal is quite complex. It requires a hierarchical classification of the edges into layers with different reconstruction guarantees, depending on their strength, combined with delicate uses of the subadditivity of the squared Hellinger distance over graphical models to control the error accumulation.
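The role of squared Hellinger subadditivity can be illustrated numerically on a toy example. The sketch below is an assumption-laden illustration rather than part of the paper's analysis: it takes two binary Markov chains on three variables with the same path structure and the same root marginal, uses a {0, 1} encoding for convenience, defines H²(P, Q) = 1 − Σ √(P(x)Q(x)), and checks that the global squared Hellinger distance is at most the sum of the squared Hellinger distances between the edge marginals. It also checks the standard bound TV ≤ √2 · H. All function names are hypothetical.

```python
import itertools
import numpy as np

def chain_joint(p_root, flips):
    """Joint pmf over {0,1}^n of a binary chain X1 -> X2 -> ... -> Xn.
    p_root = P(X1 = 1); flips[k] = P(X_{k+2} != X_{k+1})."""
    n = len(flips) + 1
    joint = np.zeros((2,) * n)
    for x in itertools.product((0, 1), repeat=n):
        prob = p_root if x[0] == 1 else 1 - p_root
        for k, f in enumerate(flips):
            prob *= f if x[k + 1] != x[k] else 1 - f
        joint[x] = prob
    return joint

def squared_hellinger(p, q):
    """H^2(p, q) = 1 - sum_x sqrt(p(x) q(x))."""
    return 1.0 - np.sum(np.sqrt(p * q))

def total_variation(p, q):
    """TV(p, q) = (1/2) sum_x |p(x) - q(x)|."""
    return 0.5 * np.sum(np.abs(p - q))

def pair_marginal(joint, i, j):
    """Marginal of coordinates (i, j), obtained by summing out the others."""
    other = tuple(k for k in range(joint.ndim) if k not in (i, j))
    return joint.sum(axis=other)

# Two chains with identical structure and root marginal, different edge noise.
P = chain_joint(0.5, [0.10, 0.30])
Q = chain_joint(0.5, [0.15, 0.25])

global_h2 = squared_hellinger(P, Q)
# Sum of local terms over the edges; the root term is zero because the
# two root marginals coincide in this example.
local_sum = sum(
    squared_hellinger(pair_marginal(P, k, k + 1), pair_marginal(Q, k, k + 1))
    for k in range(P.ndim - 1)
)

if __name__ == "__main__":
    print("global H^2:", global_h2)
    print("sum of edge H^2:", local_sum)
    print("TV:", total_variation(P, Q))
```

The point of the comparison is that an error budget split across the edges of the tree bounds the global squared Hellinger distance, which in turn bounds the total variation distance.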

References

Guy Bresler. 2015. Efficiently learning Ising models on arbitrary graphs. In Proceedings of the forty-seventh annual ACM Symposium on Theory Of Computing (STOC).
C Chow and Cong Liu. 1968. Approximating discrete probability distributions with dependence trees. IEEE transactions on Information Theory 14, 3 (1968), 462–467.
C Chow and T Wagner. 1973. Consistency of an estimate of tree-dependent probability distributions. IEEE Transactions on Information Theory 19, 3 (1973), 369–371.
Constantinos Daskalakis, Nishanth Dikkala, and Gautam Kamath. 2019. Testing Ising Models. IEEE Trans. Inf. Theory 65, 11 (2019), 6829–6852.
Constantinos Daskalakis and Qinxuan Pan. 2017. Square Hellinger Subadditivity for Bayesian Networks and its Applications to Identity Testing. In the 30th Conference on Learning Theory (COLT).
Constantinos Daskalakis and Qinxuan Pan. 2020. Sample-Optimal and Efficient Learning of Tree Ising models. CoRR abs/2010.14864 (2020). arxiv:2010.14864 https://arxiv.org/abs/2010.14864
Luc Devroye, Abbas Mehrabian, and Tommy Reddad. 2019. The minimax learning rate of normal and Ising undirected graphical models. Electronic Journal of Statistics (2019).
Linus Hamilton, Frederic Koehler, and Ankur Moitra. 2017. Information theoretic properties of Markov random fields, and their algorithmic applications. In Advances in Neural Information Processing Systems. 2463–2472.
Ali Jalali, Pradeep Ravikumar, Vishvas Vasuki, and Sujay Sanghavi. 2011. On learning discrete graphical models using group-sparse regularization. In Proceedings of the fourteenth international conference on artificial intelligence and statistics. 378–387.
Adam Klivans and Raghu Meka. 2017. Learning graphical models using multiplicative weights. In Proceedings of the forty-ninth annual ACM Symposium on Theory Of Computing (STOC).
Frederic Koehler. 2020. A Note on TV Learning of Tree Models. Personal Communication, http://math.mit.edu/~fkoehler/tv_note.pdf.
Steffen L Lauritzen. 1996. Graphical models. Vol. 17. Clarendon Press.
Mukund Narasimhan and Jeff A. Bilmes. 2004. PAC-learning Bounded Tree-width Graphical Models. In the 20th Conference in Uncertainty in Artificial Intelligence (UAI).
Judea Pearl. 2014. Probabilistic reasoning in intelligent systems: networks of plausible inference. Elsevier.
Pradeep Ravikumar, Martin J Wainwright, and John D Lafferty. 2010. High-dimensional Ising model selection using $\ell_1$-regularized logistic regression. The Annals of Statistics 38, 3 (2010), 1287–1319.
Narayana P. Santhanam and Martin J. Wainwright. 2012. Information-Theoretic Limits of Selecting Binary Graphical Models in High Dimensions. IEEE Trans. Information Theory 58, 7 (2012), 4117–4134.
Vincent Y. F. Tan, Animashree Anandkumar, Lang Tong, and Alan S. Willsky. 2011. A Large-Deviation Analysis of the Maximum-Likelihood Learning of Markov Tree Structures. IEEE Trans. Information Theory 57, 3 (2011), 1714–1735.
Marc Vuffray, Sidhant Misra, Andrey Lokhov, and Michael Chertkov. 2016. Interaction screening: Efficient and sample-optimal learning of Ising models. In Advances in Neural Information Processing Systems. 2595–2603.
Marc Vuffray, Sidhant Misra, and Andrey Y. Lokhov. 2019. Efficient Learning of Discrete Graphical Models. arxiv:1902.00600 [cs.LG]
Martin J Wainwright, Michael I Jordan, et al. 2008. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning 1, 1–2 (2008), 1–305.
Shanshan Wu, Sujay Sanghavi, and Alexandros G. Dimakis. 2019. Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models. In the 32nd Annual Conference on Neural Information Processing Systems.

Index Terms

Sample-optimal and efficient learning of tree Ising models
1. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic inference problems
      1. Density estimation
    2. Probabilistic representations
      1. Bayesian networks
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory
      1. Sample complexity and generalization bounds

Published in
STOC 2021: Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing
June 2021
1797 pages
ISBN:9781450380539
DOI:10.1145/3406325
General Chair: Samir Khuller, Northwestern University, USA
Program Chair: Virginia Vassilevska Williams, Massachusetts Institute of Technology, USA
Copyright © 2021 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Bayesian Network
Chow-Liu Algorithm
Estimation
Hellinger Distance
Ising Model
Markov Random Field
Sample Complexity
Subadditivity
Tree-Structured Model
Qualifiers
- research-article
