
A weighted-link graph neural network for lung cancer knowledge classification

Applied Intelligence

Abstract

Visualized knowledge representation can help the public learn about lung cancer prevention, diagnosis, treatment, and life after treatment more effectively. This study therefore collected lung cancer articles published between 2016 and 2021 from the Web of Science database. First, we preprocessed the collected text with natural language processing and then applied latent Dirichlet allocation for topic modeling, selecting the optimal number of topics from two coherence metrics and using the topics to assign a class to every article. Next, we proposed a PMI_2 weighting to build an initial weighted knowledge graph and trained four graph neural network algorithms on it. To improve classification performance further, we proposed PMI_2 + link, in which additional links are obtained from graph auto-encoder and graph convolutional network training. The edge weights obtained at the best classification performance are taken as representative. For visualized knowledge representation, we used Neo4j to display the nodes and edge weights of the final literature knowledge. The results show that building the weighted graph with the proposed PMI_2 + link yields better classification performance. The proposed PMI_2 + link also effectively reduces the number of edges in the knowledge graph and avoids running out of GPU memory.
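
To make the graph-construction step more concrete, the following is a minimal sketch of how co-occurrence-based edge weights can be computed. The abstract does not define PMI_2, so the sketch uses standard pointwise mutual information over sliding windows as a stand-in; the function name `pmi_edges`, the `window` parameter, and the toy documents are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_edges(docs, window=3):
    """Return {(w1, w2): pmi} for word pairs with positive PMI.

    ASSUMPTION: this is standard PMI, not the paper's PMI_2 weighting,
    whose definition is not given in the abstract.

    docs   -- list of token lists (already preprocessed)
    window -- sliding-window size used to count co-occurrences
    """
    word_windows = Counter()   # number of windows containing a word
    pair_windows = Counter()   # number of windows containing both words
    total = 0                  # total number of windows

    for tokens in docs:
        # A document shorter than the window counts as one window.
        n_windows = max(1, len(tokens) - window + 1)
        for start in range(n_windows):
            seen = sorted(set(tokens[start:start + window]))
            total += 1
            word_windows.update(seen)
            pair_windows.update(combinations(seen, 2))

    edges = {}
    for (a, b), n_ab in pair_windows.items():
        # PMI(a, b) = log( p(a, b) / (p(a) * p(b)) )
        pmi = math.log(n_ab * total / (word_windows[a] * word_windows[b]))
        if pmi > 0:            # keep only positively associated pairs
            edges[(a, b)] = pmi
    return edges


# Hypothetical toy corpus, for illustration only.
docs = [
    ["lung", "cancer", "screening"],
    ["lung", "cancer", "therapy"],
    ["gene", "mutation", "therapy"],
]
edges = pmi_edges(docs, window=3)
```

Dropping pairs with non-positive PMI is one simple way to keep the edge count down. The paper's PMI_2 + link procedure, which derives additional links from graph auto-encoder and graph convolutional network training, is a different mechanism and is not reproduced here.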

Data availability

The data were collected from Web of Science (https://www.webofscience.com/wos/woscc/basic-search).

Author information

Corresponding author

Correspondence to Ching-Hsue Cheng.

Ethics declarations

Competing interest

The authors declare that they have no relevant financial or non-financial interests or competing interests to disclose.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cheng, CH., Ji, ZT. A weighted-link graph neural network for lung cancer knowledge classification. Appl Intell 53, 17610–17628 (2023). https://doi.org/10.1007/s10489-022-04437-9

  • DOI: https://doi.org/10.1007/s10489-022-04437-9
