A collective entity linking algorithm with parallel computing on large-scale knowledge base

The Journal of Supercomputing

Abstract

Entity linking is central to automatic knowledge-based question answering and knowledge base population. Traditional collective entity linking approaches consider either the context of entities or the semantic relations between entities, but not both, so they perform poorly on Web documents. Their efficiency also needs to be improved. This paper proposes a collective entity linking algorithm based on a topic model and a graph. The topic model represents mentions and candidate entities by their topic distributions, which makes full use of the context in documents. Semantic relations between entities are represented by document similarities computed through the topic model. Parallel computing is used to reduce the long running time caused by topic model construction. An entity graph is constructed according to the relations between entities in the knowledge graph. Hypertext-Induced Topic Search (HITS) is run on the entity graph to compute the hub value and authority value of each candidate entity, and the authority value is the basis for entity linking. Experimental results on an open-domain corpus (NLPCC2014) demonstrate the validity of the proposed method: the approach improves the \(F_{1}\)-measure by 5.2% over AGDISTIS on the NLPCC2014 corpus.
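The HITS step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the graph, the entity names, the function names (hits, normalize, link) and the fixed iteration count are all assumptions made for illustration, and the topic-model similarities and parallel computation from the paper are left out. The sketch runs the standard hub/authority updates on a small directed entity graph and links each mention to its candidate with the highest authority value.

```python
from math import sqrt


def normalize(scores):
    """Scale a score dict to unit Euclidean length (left unchanged if all zero)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def hits(edges, iterations=50):
    """Return (hub, authority) dicts for a directed graph given as (src, dst) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of the nodes pointing at it.
        new_auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub of a node: sum of the updated authority scores it points to.
        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


def link(candidates, authority):
    """Pick, for every mention, the candidate entity with the highest authority."""
    return {
        mention: max(entities, key=lambda e: authority.get(e, 0.0))
        for mention, entities in candidates.items()
    }


# Hypothetical toy data: relations between entities in a knowledge graph.
EDGES = [
    ("Steve_Jobs", "Apple_Inc"),
    ("Apple_Inc", "Steve_Jobs"),
    ("iPhone", "Apple_Inc"),
    ("Fruit", "Apple_fruit"),
]
# Hypothetical mentions found in one document, with their candidate entities.
CANDIDATES = {
    "Apple": ["Apple_Inc", "Apple_fruit"],
    "Jobs": ["Steve_Jobs"],
}

hub_scores, authority_scores = hits(EDGES)
linked = link(CANDIDATES, authority_scores)
```

In this toy graph the candidate Apple_Inc is pointed to by two related entities, so it accumulates more authority than Apple_fruit and is chosen for the mention "Apple". A fuller version would presumably weight edges by the topic-model similarities the abstract mentions; that weighting is not specified here and is left out.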

References

  1. Ji H, Grishman R (2011) Knowledge base population: successful approaches and challenges. ACL-HLT 18:518–519

  2. Liu Q, Zhong Y, Li Y, Liu Y, Qin ZG (2016) Graph-based collective Chinese entity linking algorithm. J Comput Res Dev 53:270–283

  3. Paulheim H (2017) Knowledge graph refinement: a survey of approaches and evaluation methods. Semant Web 8:489–508

  4. Yan JH, Wang CY, Cheng WL, Gao M, Zhou AY (2016) A retrospective of knowledge graphs. Front Comput Sci 12:55–74

  5. Burdick D, Kolaitis PG, Tan WC, Fagin R, Popa L (2016) A declarative framework for linking entities. ACM Trans Database Syst 41:17

  6. Shen W, Wang JY, Han JW (2015) Entity linking with a knowledge base: issues, techniques, and solutions. IEEE Trans Knowl Data Eng 27:443–460

  7. Ratinov L, Roth D (2009) Design challenges and misconceptions in named entity recognition. In: CoNLL, pp 147–155

  8. Ananthakrishna R, Chaudhuri S, Ganti V (2002) Eliminating fuzzy duplicates in data warehouses. In: International Conference on Very Large Data Bases, VLDB Endowment, pp 586–597

  9. Cucerzan S (2007) Large-scale named entity disambiguation based on Wikipedia data. In: EMNLP-CoNLL 2007, pp 708–716

  10. Huai BX, Bao TF, Zhu HS, Liu Q (2014) Topic modeling approach to named entity linking. J Softw 25:2076–2087

  11. Usbeck R, Ngomo ACN, Roder M, Gerber D, Coelho SA, Auer S, Both A (2014) AGDISTIS-graph-based disambiguation of named entities using linked data. Springer, Berlin, pp 457–471

  12. Gerber D, Hellmann S, Buhmann L, Soru T, Usbeck R, Ngomo ACN (2013) Real-time RDF extraction from unstructured data streams. In: ISWC

  13. Mendes PN, Jakob M, Garcia-Silva A, Bizer C (2011) DBpedia spotlight: shedding light on the web of documents. In: Proceedings of the 7th International Conference on Semantic Systems (I-Semantics)

  14. Ferragina P, Scaiella U (2012) Fast and accurate annotation of short texts with Wikipedia pages. IEEE Softw 29:70–75

  15. Kleinberg JM (1998) Authoritative sources in a hyperlinked environment. J ACM 46:604–632

  16. Han XP, Sun L (2011) A Generative entity-mention model for linking entities with knowledge base. In: The Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference. DBLP, pp 945–954

  17. Han JL, Sun AX, Cong G, Zhao X, Ji ZC, Phan MC (2018) Linking fine-grained locations in user comments. IEEE Trans Knowl Data Eng 30:59–72

  18. Shen W, Han JW, Wang JY, Yuan XJ, Yang ZL (2018) SHINE+: a general framework for domain-specific entity linking with heterogeneous information networks. IEEE Trans Knowl Data Eng 30:353–365

  19. Mccallum A, Wang XR, Andrés CN (2007) Topic and role discovery in social networks with experiments on enron and academic email. J Artif Intell Res 30:249–272

  20. Xin J, Cui ZM, Zhang SK, He T, Li C, Huang H (2014) Constructing topic models of internet of things for information processing. Sci World J 2014:1–11

  21. Zeng WX, Zhao X, Tang JY, Shang HC (2018) Collective list-only entity linking: a graph-based approach. IEEE Access 6:16035–16045

  22. Ensan F, Du WC (2019) Ad hoc retrieval via entity linking and semantic similarity. Knowl Inf Syst 58:551–583

  23. Huang HZ, Heck L, Ji H (2015) Leveraging deep neural networks and knowledge graphs for entity disambiguation. Comput Sci 1275–1284. arXiv:1504.07678v1

  24. Zeng WX, Tang JY, Zhao X (2018) Entity linking on chinese microblogs via deep neural network. IEEE Access 6:25908–25920

  25. Gu LC, Han YY, Wang C, Chen W, Jiao J, Yuan XH (2019) Module overlapping structure detection in PPI using an improved link similarity-based Markov clustering algorithm. Neural Comput Appl 31:1481–1490

  26. Yuan XH, Gu LC, Chen T, Elhoseny M, Wang W (2018) A fast and accurate retina image verification method based on structure similarity. In: 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService), pp 181–185

  27. Roder M, Usbeck R, Hellmann S, Gerber D, Both A (2014) N3-A collection of datasets for named entity recognition and disambiguation in the NLP interchange format. In: 9th LREC

  28. Gruetze T, Kasneci G, Zuo Z, Naumann F (2016) CohEEL: coherent and efficient named entity linking through random walks. J Web Semant 37–38:75–89

  29. Moro A, Raganato A, Navigli R (2014) Entity linking meets word sense disambiguation: a unified approach. Trans Assoc Comput Linguist 2:231–244

  30. Phan MC, Sun A, Tay Y, Han JL (2018) Pair-linking for collective entity disambiguation: two could be better than all. IEEE Trans Knowl Data Eng 31:1383–1396

Acknowledgements

This work is partially supported by the National Natural Science Foundation of China under Grant (31771679, 31371533, 31671589), the Anhui Foundation for Science and Technology Major Project, China, under Grant (16030701092, 18030901034), the Key Laboratory of Agricultural Electronic Commerce, Ministry of Agriculture of China under Grant (AEC2018003, AEC2018006), the 2016 Anhui Foundation for Natural Science Major Project of the Higher Education Institutions under Grant (KJ2016A836), the Hefei Major Research Project of Key Technology (J2018G14).

Author information

Corresponding author

Correspondence to Lichuan Gu.

Cite this article

Xia, Y., Wang, X., Gu, L. et al. A collective entity linking algorithm with parallel computing on large-scale knowledge base. J Supercomput 76, 948–963 (2020). https://doi.org/10.1007/s11227-019-03046-7

  • DOI: https://doi.org/10.1007/s11227-019-03046-7
