Published by De Gruyter, November 22, 2016

Adaptive input data transformation for improved network reconstruction with information theoretic algorithms

  • Venkateshan Kannan and Jesper Tegner

Abstract

We propose a novel, systematic procedure of non-linear data transformation for an adaptive algorithm in the context of network reverse-engineering using information theoretic methods. Our methodology is rooted in elucidating and correcting for the specific biases in the estimation techniques for mutual information (MI) given a finite sample of data. These biases are, in turn, tied to the lack of well-defined bounds for numerical estimation of MI for continuous probability distributions from finite data. The nature and properties of the inevitable bias are described, complemented by several examples illustrating its form and variation. We propose an adaptive partitioning scheme for MI estimation that effectively transforms the sample data using parameters determined from its local and global distribution, guaranteeing a more robust and reliable reconstruction algorithm. Together with a normalized measure (Shared Information Metric), we report considerably enhanced performance for both in silico and real-world biological networks. We also find that the recovery of true interactions is better in particular for an intermediate range of false positive rates, suggesting that our algorithm is less vulnerable to spurious signals of association.
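
The finite-sample bias mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' adaptive partitioning scheme or their Shared Information Metric, neither of which is specified on this page. It assumes a plain plug-in (histogram) MI estimator and compares two generic partitions, equal-width and equal-frequency bins, on two independent Gaussian samples whose true MI is zero. The function names, sample size, and bin count are hypothetical choices made for illustration only.

```python
import math
import random
from collections import Counter


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign each value to a bin by rank, so bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(values)
    return labels


def plug_in_mi(x_labels, y_labels):
    """Plug-in (maximum-likelihood) MI estimate, in nats, from discrete labels."""
    n = len(x_labels)
    joint = Counter(zip(x_labels, y_labels))
    px = Counter(x_labels)
    py = Counter(y_labels)
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


# Illustrative parameters (assumptions, not taken from the article).
rng = random.Random(0)
n_samples, n_bins = 200, 10
x = [rng.gauss(0, 1) for _ in range(n_samples)]
y = [rng.gauss(0, 1) for _ in range(n_samples)]  # independent of x, so true MI = 0

mi_width = plug_in_mi(equal_width_bins(x, n_bins), equal_width_bins(y, n_bins))
mi_freq = plug_in_mi(equal_frequency_bins(x, n_bins), equal_frequency_bins(y, n_bins))

# Standard first-order approximation of the plug-in bias for independent
# variables with fully occupied bins: (Bx - 1)(By - 1) / (2N) nats.
first_order_bias = (n_bins - 1) ** 2 / (2 * n_samples)

print(f"equal-width bins:      MI estimate = {mi_width:.3f} nats")
print(f"equal-frequency bins:  MI estimate = {mi_freq:.3f} nats")
print(f"first-order bias term: {first_order_bias:.3f} nats")
```

Both estimates come out positive even though the variables are independent, which is the kind of spurious association a bias-aware partitioning is meant to suppress. The size of the spurious value depends on how the data are partitioned, so the choice of partition is itself a source of variation in the estimate.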

Supplemental Material

The online version of this article (DOI:10.1515/sagmb-2016-0013) offers supplementary material, available to authorized users.


Published Online: 2016-11-22
Published in Print: 2016-12-01

©2016 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 25.4.2024 from https://www.degruyter.com/document/doi/10.1515/sagmb-2016-0013/html