Abstract
Complex diseases are generally caused by the dysregulation of biological functions rather than of individual molecules. Hence, a major challenge in the systematic study of complex diseases is how to capture differentially regulated biological functions, e.g., pathways. Traditional differential expression analysis (DEA) usually considers changes in the expression values of genes rather than of functions. Meanwhile, conventional function-based analysis (e.g., PEA: pathway enrichment analysis) mainly considers the varying activation of functions but disregards structural changes among the genetic elements of those functions. To achieve precision medicine against complex diseases, it is necessary to distinguish both the changes of functions and the changes of their elements in heterogeneous dysregulated pathways during disease development and progression. In this work, in contrast to traditional DEA, we developed a new computational framework, namely differential function analysis (DFA), which identifies changes in the element-structure and expression-activation of biological functions based on comparative non-negative matrix factorization (cNMF). To validate the effectiveness of our method, we tested DFA on various datasets; the results show that DFA effectively recovers the differential element-structure and differential activation-score of pre-set functional groups. In particular, on a human gastric cancer dataset, DFA not only captures the changed network-structure of pathways associated with gastric cancer, but also detects the differential activations of these pathways (i.e., significantly discriminating normal samples from disease samples), and it is more effective than state-of-the-art methods such as GSVA and Pathifier.
Overall, DFA is a general framework for capturing the systematic changes of genes, networks and functions in complex diseases; it not only provides new insight into the simultaneous alterations of pathway genes and pathway activations, but also opens a new way for network-based functional analysis of heterogeneous diseases.
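To make the two quantities named in the abstract concrete, the following is a minimal sketch and not the authors' cNMF: the abstract does not give the cNMF objective, so plain NMF with Lee-Seung multiplicative updates is swapped in and run separately on a normal and a disease expression matrix. The gene loadings (W) stand in for "element-structure" and the per-sample scores (H) for "activation". The toy data, the function names (`nmf`, `rel_err`, `cosine_match`), and the cosine-based factor matching are all assumptions made for illustration.

```python
import numpy as np

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF (not cNMF): X (genes x samples) ~ W @ H, all entries >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        # Lee-Seung multiplicative updates for the Frobenius objective.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    # Remove the W/H scale ambiguity: unit-sum columns in W, scale moved into H.
    scale = W.sum(axis=0) + eps
    return W / scale, H * scale[:, None]

def rel_err(X, W, H):
    """Relative Frobenius reconstruction error."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)

def cosine_match(Wa, Wb):
    """For each column of Wa, index of the most similar column of Wb."""
    A = Wa / np.linalg.norm(Wa, axis=0)
    B = Wb / np.linalg.norm(Wb, axis=0)
    return (A.T @ B).argmax(axis=1)

# Hypothetical toy data: 6 genes, 2 pre-set functions, 20 samples per condition.
rng = np.random.default_rng(1)
W_normal_true = np.array(
    [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float
)
W_disease_true = W_normal_true.copy()
W_disease_true[2] = [0, 1]          # structure change: gene 2 switches function
H_normal_true = rng.uniform(0.5, 1.5, size=(2, 20))
H_disease_true = rng.uniform(0.5, 1.5, size=(2, 20))
H_disease_true[0] *= 2.0            # activation change: function 0 is up in disease

X_normal = W_normal_true @ H_normal_true + 0.01 * rng.random((6, 20))
X_disease = W_disease_true @ H_disease_true + 0.01 * rng.random((6, 20))

# Factorize each condition separately, then align disease factors to normal ones.
W_n, H_n = nmf(X_normal, k=2)
W_d, H_d = nmf(X_disease, k=2)
match = cosine_match(W_n, W_d)

# Per-function summaries: loading change (structure) and mean score change (activation).
structure_shift = np.abs(W_n - W_d[:, match]).sum(axis=0)
activation_shift = H_d[match].mean(axis=1) - H_n.mean(axis=1)

print("reconstruction error (normal, disease):",
      rel_err(X_normal, W_n, H_n), rel_err(X_disease, W_d, H_d))
print("structure shift per function:", structure_shift)
print("activation shift per function:", activation_shift)
```

Factorizing the two conditions independently and matching factors afterwards is the simplest possible comparison. A comparative factorization would couple the two problems in one objective, which this sketch does not attempt.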
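Highlights
Complex diseases are generally caused by the dysregulation of biological functions rather than of individual molecules. A major challenge in the systematic study of complex diseases is therefore how to capture differentially regulated biological functions. Traditional differential expression analysis (DEA) usually considers changes in the expression values of genes rather than of functions. Meanwhile, conventional function-based analysis (e.g., PEA: pathway enrichment analysis) mainly considers changes in function activity and ignores structural changes among the genes within a function. In this work, we developed a new differential function analysis (DFA) algorithm that simultaneously identifies changes in the structure among genes and changes in function activity. To validate the effectiveness of our method, we tested DFA on various datasets; the results show that DFA effectively recovers both the structural changes among genes within functions and the dysregulation of function activity. In summary, DFA provides a tool for systematically examining changes in the functions, networks, and genes of complex diseases.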
Cite this article
Zhang, C., Liu, J., Shi, Q. et al. Differential function analysis: identifying structure and activation variations in dysregulated pathways. Sci. China Inf. Sci. 60, 012108 (2017). https://doi.org/10.1007/s11432-016-0030-6
DOI: https://doi.org/10.1007/s11432-016-0030-6
Keywords
- complex disease
- biological function
- non-negative matrix factorization
- network structure
- function activation