Abstract
Discovering relations between biomedical entities of different types is crucial for biology research. A large number of potential or indirectly connected biological relations are hidden in millions of biomedical publications and in biological databases. Previous rule-based and deep learning approaches rely on large amounts of manual annotation, a process that is laborious, time-consuming, and unsatisfactory. It is therefore necessary to combine available annotated gene databases with chemical, genomic, clinical, and other types of data repositories as domain knowledge to assist the extraction of biological entity relations from the literature. Under this scenario, this paper proposes the BioGraphSAGE model, a Siamese graph neural network that uses structured databases as domain knowledge to extract biological entity relations from the literature. Our model combines biological semantic features with positional features to improve the recognition of relations between distant entities in the same document. Experimental results show that BioGraphSAGE achieves the best F1 score among the compared relation extraction models when fewer annotated samples are available. Moreover, the proposed model still maintains an F1 score of 0.526 without using any annotated training samples.
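As a rough illustration of the general Siamese idea only, the sketch below encodes two entities with one shared GraphSAGE-style mean aggregator and compares the resulting embeddings by cosine similarity. It is not the BioGraphSAGE architecture, whose details are not given in this abstract: the node names, feature vectors, weight matrix, and every function name are hypothetical, and the semantic and positional features mentioned above are not modelled.

```python
import math

def mean_aggregate(features, neighbors, node):
    """Concatenate a node's own features with the mean of its neighbors' features."""
    own = features[node]
    nbrs = neighbors.get(node, [])
    if nbrs:
        mean = [sum(features[n][i] for n in nbrs) / len(nbrs) for i in range(len(own))]
    else:
        mean = [0.0] * len(own)
    return own + mean  # list concatenation, length 2 * len(own)

def encode(features, neighbors, node, weights):
    """Shared encoder: one linear layer followed by ReLU on the aggregated vector."""
    h = mean_aggregate(features, neighbors, node)
    return [max(0.0, sum(w * x for w, x in zip(row, h))) for row in weights]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def siamese_score(features, neighbors, a, b, weights):
    """Siamese comparison: both entities pass through the SAME encoder weights."""
    return cosine(encode(features, neighbors, a, weights),
                  encode(features, neighbors, b, weights))

# Hypothetical toy graph: three entities with 2-dimensional features.
features = {
    "gene_A": [1.0, 0.0],
    "chem_B": [0.0, 1.0],
    "disease_C": [1.0, 1.0],
}
neighbors = {
    "gene_A": ["disease_C"],
    "chem_B": ["disease_C"],
    "disease_C": ["gene_A", "chem_B"],
}
# Hypothetical fixed weights (2 outputs x 4 inputs); a real model would learn these.
weights = [[0.5, -0.2, 0.3, 0.1],
           [0.1, 0.4, -0.3, 0.2]]

score = siamese_score(features, neighbors, "gene_A", "chem_B", weights)
print(f"similarity(gene_A, chem_B) = {score:.3f}")
```

Because both branches share the same weights, the score is symmetric in its two arguments, and an entity compared with itself scores 1 whenever its embedding is non-zero.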
Acknowledgements
This work is supported by the Development Project of Jilin Province of China (Nos. 20200801033GH, YDZJ202101ZYTS128), the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (No. 20180622002JC), and the Fundamental Research Funds for the Central University, JLU.
About this article
Cite this article
Guo, S., Huang, L., Yao, G. et al. Extracting Biomedical Entity Relations using Biological Interaction Knowledge. Interdiscip Sci Comput Life Sci 13, 312–320 (2021). https://doi.org/10.1007/s12539-021-00425-8
DOI: https://doi.org/10.1007/s12539-021-00425-8