Abstract
Entity linking is a central task in automatic knowledge-based question answering and knowledge base population. Traditional collective entity linking approaches consider either the context of entities or the semantic relations between entities, but not both, and consequently tend to perform poorly on Web documents. Their efficiency also needs to be improved. This paper proposes a collective entity linking algorithm based on a topic model and a graph. The topic model represents mentions and candidate entities as topic distributions, which makes full use of the context in documents. Semantic relations between entities are represented by document similarities computed through the topic model. Parallel computing is used to reduce the long running time caused by topic model construction. An entity graph is constructed according to the relations between entities in the knowledge graph, and Hypertext-Induced Topic Search (HITS) is run on this graph to compute the hub and authority values of candidate entities. The authority value is the basis for entity linking. Experimental results on an open-domain corpus (NLPCC2014) demonstrate the validity of the proposed method: the approach achieves a 5.2% improvement in \(F_{1}\)-measure over AGDISTIS on NLPCC2014.
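The authority-based selection step described in the abstract can be illustrated with a small sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `hits` and `link_mentions`, the toy entity graph, and the fixed iteration count are all hypothetical, and the topic model and parallel construction are omitted entirely.

```python
from math import sqrt


def hits(edges, iterations=50):
    """Compute HITS hub and authority scores by power iteration.

    `edges` is a list of (source, target) pairs of a directed graph.
    Returns two dicts: (hub, authority), each L2-normalised.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of nodes linking to it.
        new_auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {node: v / norm for node, v in new_auth.items()}
        # Hub of a node: sum of authority scores of nodes it links to.
        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {node: v / norm for node, v in new_hub.items()}
    return hub, auth


def link_mentions(candidates, edges):
    """Pick, for each mention, the candidate with the highest authority.

    Candidates that do not appear in the graph get authority 0.
    """
    _, auth = hits(edges)
    return {
        mention: max(cands, key=lambda c: auth.get(c, 0.0))
        for mention, cands in candidates.items()
    }


# Toy data (hypothetical): two mentions, each with two candidate entities.
candidates = {
    "Apple": ["Apple_Inc", "Apple_fruit"],
    "Jobs": ["Steve_Jobs", "Jobs_film"],
}
# Hypothetical knowledge-graph relations among the candidate entities.
edges = [
    ("Steve_Jobs", "Apple_Inc"),
    ("Apple_Inc", "Steve_Jobs"),
    ("Jobs_film", "Steve_Jobs"),
    ("Jobs_film", "Apple_Inc"),
]
links = link_mentions(candidates, edges)
print(links)
```

In this toy graph the mutually related candidates reinforce each other, so each mention resolves to the candidate that other candidates point to, while an isolated candidate such as `Apple_fruit` receives no authority.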
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China under Grants 31771679, 31371533, and 31671589; the Anhui Foundation for Science and Technology Major Project, China, under Grants 16030701092 and 18030901034; the Key Laboratory of Agricultural Electronic Commerce, Ministry of Agriculture of China, under Grants AEC2018003 and AEC2018006; the 2016 Anhui Foundation for Natural Science Major Project of the Higher Education Institutions under Grant KJ2016A836; and the Hefei Major Research Project of Key Technology (J2018G14).
About this article
Cite this article
Xia, Y., Wang, X., Gu, L. et al. A collective entity linking algorithm with parallel computing on large-scale knowledge base. J Supercomput 76, 948–963 (2020). https://doi.org/10.1007/s11227-019-03046-7
DOI: https://doi.org/10.1007/s11227-019-03046-7