DOI: 10.1145/3584371.3612963
research-article

Canonical Representation of Biological Networks Using Graph Convolution

Published: 04 October 2023

ABSTRACT

Graph machine learning algorithms are commonly applied to a broad range of prediction tasks in systems biology. These algorithms present many design choices that depend on the specific application and the available data, which makes it difficult to choose among the options. An important design criterion in this regard is the definition of "topological similarity" between two nodes in a network, which is used to design convolution matrices for graph convolution or loss functions for evaluating node embeddings. Many measures of topological similarity exist in the network science literature (e.g., random-walk-based proximity, shared neighborhood), and recent comparative studies show that the choice of topological similarity can have a significant effect on the performance and reliability of graph machine learning models.
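To make the two example measures concrete, the following minimal sketch computes a shared-neighborhood score (common-neighbor counts) and a random-walk-based proximity (random walk with restart) on a toy graph. The function names, the restart probability of 0.15, and the toy graph are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def common_neighbors(A):
    """Shared-neighborhood similarity: entry (i, j) counts the neighbors
    that nodes i and j have in common (illustrative helper)."""
    S = A @ A
    np.fill_diagonal(S, 0)  # ignore self-similarity
    return S

def random_walk_with_restart(A, restart=0.15):
    """Random-walk proximity: column j holds the stationary visiting
    probabilities of a walk that restarts at node j (illustrative helper;
    the restart value is an assumed default)."""
    n = A.shape[0]
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # guard against isolated nodes
    W = A / col_sums                        # column-stochastic transition matrix
    return restart * np.linalg.inv(np.eye(n) - (1 - restart) * W)

# Toy undirected path graph 0 - 1 - 2 - 3 (assumed example data).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

cn = common_neighbors(A)
rwr = random_walk_with_restart(A)
```

On this graph the two measures disagree about adjacent nodes: nodes 0 and 1 share no neighbors, yet the random walk places them close together. This kind of disagreement is why the choice of measure can matter for downstream models.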

We propose GraphCan, a framework for computing canonical representations of biological networks using a similarity-based Graph Convolutional Network (GCN). GraphCan integrates multiple node similarity measures to compute canonical node embeddings for a given network. The resulting embeddings can be used directly for downstream machine learning tasks. We comprehensively evaluate GraphCan on various link prediction tasks in systems biology. Our results show that GraphCan consistently delivers improved prediction accuracy over algorithms that directly use the adjacency matrix of the input network, and that integrating multiple similarity measures improves the robustness of the framework. The implementation of GraphCan is available at https://github.com/Meng-zhen-Li/Similarity-based-GCN.git.
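The general idea of a similarity-based graph convolution can be sketched as follows: a single GCN propagation step in which a node similarity matrix takes the place of the adjacency matrix as the convolution matrix. This is a hedged illustration only, not GraphCan's implementation. The helper names, the symmetric normalization, the random weights, and the toy graph are all assumptions, and the way GraphCan integrates several measures is not shown because the abstract does not specify it.

```python
import numpy as np

def normalize_sym(S):
    """Symmetric normalization D^{-1/2} (S + I) D^{-1/2}, the form commonly
    used for GCN convolution matrices (assumed here for illustration)."""
    S_hat = S + np.eye(S.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(S_hat.sum(axis=1))
    return S_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(conv, X, W):
    """One propagation step: ReLU(conv @ X @ W)."""
    return np.maximum(conv @ X @ W, 0.0)

# Toy undirected path graph 0 - 1 - 2 - 3 (assumed example data).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Common-neighbor counts as one possible similarity matrix.
S = A @ A
np.fill_diagonal(S, 0)

X = np.eye(4)                               # featureless nodes: one-hot inputs
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))                 # untrained weights, for shape only

H_adj = gcn_layer(normalize_sym(A), X, W)   # adjacency-based convolution
H_sim = gcn_layer(normalize_sym(S), X, W)   # similarity-based convolution
```

Both calls return a 4 x 2 embedding matrix; only the convolution matrix differs, which is the design choice the abstract highlights.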


Published in

BCB '23: Proceedings of the 14th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
September 2023
626 pages
ISBN: 9798400701269
DOI: 10.1145/3584371

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall acceptance rate: 254 of 885 submissions, 29%