Canonical Representation of Biological Networks Using Graph Convolution

Authors:
Mengzhen Li, Case Western Reserve University, Cleveland, USA (https://orcid.org/0000-0002-2266-4313)
Mustafa Coşkun, Ankara University, Ankara, Turkey (https://orcid.org/0000-0003-4805-1416)
Mehmet Koyutürk, Case Western Reserve University, Cleveland, USA (https://orcid.org/0000-0002-3434-5512)

BCB '23: Proceedings of the 14th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, September 2023, Article No.: 5, Pages 1–9. https://doi.org/10.1145/3584371.3612963

Published: 04 October 2023

ABSTRACT

Graph machine learning algorithms are commonly applied to a broad range of prediction tasks in systems biology. These algorithms present many design choices that depend on the specific application and the available data, which makes it difficult to choose among the options. An important design criterion in this regard is the definition of "topological similarity" between two nodes in a network, which is used to design convolution matrices for graph convolution or loss functions for evaluating node embeddings. Many measures of topological similarity exist in the network science literature (e.g., random-walk-based proximity, shared neighborhood), and recent comparative studies show that the choice of topological similarity can have a significant effect on the performance and reliability of graph machine learning models.
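To make the notion of topological similarity concrete, the following minimal sketch computes three widely used measures from an adjacency matrix: common neighbors, the Adamic-Adar index, and random walk with restart. The function names, the restart probability of 0.15, and the toy four-node network are illustrative assumptions and are not taken from the paper or its implementation.

```python
import numpy as np

def common_neighbors(A):
    """Shared-neighborhood similarity: entry (i, j) counts neighbors of both i and j."""
    return A @ A

def adamic_adar(A):
    """Shared neighbors weighted by 1 / log(degree).

    Neighbors of degree 1 cannot be shared by two distinct nodes,
    so their weight is set to 0 (this also avoids dividing by log(1) = 0).
    """
    deg = A.sum(axis=1)
    w = np.zeros_like(deg, dtype=float)
    mask = deg > 1
    w[mask] = 1.0 / np.log(deg[mask])
    return A @ np.diag(w) @ A

def random_walk_with_restart(A, restart=0.15):
    """Closed-form random-walk proximity: restart * (I - (1 - restart) * P)^-1.

    P is the column-normalized adjacency matrix; column j of the result is
    the stationary distribution of a walk that restarts at node j.
    """
    deg = A.sum(axis=0)
    P = A / np.where(deg > 0, deg, 1.0)  # divide each column by its degree
    n = A.shape[0]
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * P)

# Hypothetical toy network: node 1 is a hub connected to nodes 0, 2, and 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

CN = common_neighbors(A)
AA = adamic_adar(A)
RWR = random_walk_with_restart(A)
```

The three matrices rank node pairs differently even on this small network, which is the kind of sensitivity to the choice of measure that the abstract describes. The closed-form inverse is only practical for small networks; larger networks would typically call for an iterative solver.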

We propose GraphCan, a framework for computing canonical representations of biological networks using a similarity-based Graph Convolutional Network (GCN). GraphCan integrates multiple node similarity measures to compute canonical node embeddings for a given network. The resulting embeddings can be used directly in downstream machine learning tasks. We comprehensively evaluate GraphCan in the context of various link prediction tasks in systems biology. Our results show that GraphCan consistently delivers improved prediction accuracy over algorithms that directly use the adjacency matrix of the input network, and that the integration of multiple similarity measures improves the robustness of the framework. The implementation of GraphCan is available at https://github.com/Meng-zhen-Li/Similarity-based-GCN.git.
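The general idea of replacing the adjacency matrix with similarity matrices in a graph convolution can be sketched as follows. This is not the GraphCan implementation: the abstract does not specify how the similarity measures are integrated, so the sketch assumes the simplest option, averaging symmetrically normalized similarity matrices. The weights are random and untrained, the node features are an identity matrix, and all names (`normalize`, `combine`, `gcn_forward`, `link_scores`) are hypothetical.

```python
import numpy as np

def normalize(S):
    """Symmetric normalization D^-1/2 (S + I) D^-1/2, as in a standard GCN."""
    S = S + np.eye(S.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    return S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def combine(similarities):
    """Assumed integration rule: plain average of the normalized matrices."""
    return np.mean([normalize(S) for S in similarities], axis=0)

def gcn_forward(C, X, W1, W2):
    """Two-layer graph convolution with convolution matrix C and ReLU activation."""
    H = np.maximum(C @ X @ W1, 0.0)
    return C @ H @ W2

def link_scores(Z):
    """Inner-product decoder: sigmoid of pairwise embedding dot products."""
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

# Hypothetical toy network: node 1 is a hub connected to nodes 0, 2, and 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

# Two similarity views of the same network: direct links and shared neighbors.
C = combine([A, A @ A])

rng = np.random.default_rng(0)
X = np.eye(4)                  # featureless nodes: one-hot identity features
W1 = rng.normal(size=(4, 8))   # untrained weights, for illustration only
W2 = rng.normal(size=(8, 2))

Z = gcn_forward(C, X, W1, W2)  # one 2-dimensional embedding per node
scores = link_scores(Z)        # symmetric matrix of link likelihoods
```

In a real link prediction setting the weights would be trained against held-out edges, and the embeddings `Z` would then be passed to the downstream task.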

Published in
BCB '23: Proceedings of the 14th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
September 2023
626 pages
ISBN: 9798400701269
DOI: 10.1145/3584371
General Chairs: May D. Wang (Georgia Institute of Technology), Byung-Jun Yoon
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
similarity matrices
graph convolutional networks
network embedding
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 254 of 885 submissions, 29%