DOI: 10.1145/2487575.2487686
research-article

Linking named entities in Tweets with knowledge base via user interest modeling

Published: 11 August 2013

ABSTRACT

Twitter has become an increasingly important source of information, with more than 400 million tweets posted per day. Tweet entity linking is the task of linking the named entity mentions detected in tweets with the corresponding real-world entities in a knowledge base. This task is of practical importance and can facilitate many other tasks, such as personalized recommendation and user interest discovery. It is challenging because tweets are noisy, short, and informal. Previous methods focus on linking entities in Web documents, and rely largely on the context around the entity mention and the topical coherence between entities in the document. However, these methods cannot be applied effectively to tweet entity linking because a single tweet contains too little context. In this paper, we propose KAURI, a graph-based framework that collectively links all the named entity mentions in all tweets posted by a user by modeling the user's topics of interest. Our assumption is that each user has an underlying topic interest distribution over various named entities. KAURI integrates the intra-tweet local information with the inter-tweet user interest information into a unified graph-based framework. We extensively evaluated KAURI on a manually annotated tweet corpus. The experimental results show that KAURI significantly outperforms the baseline methods in terms of accuracy, is efficient, and scales well to tweet streams.
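
The abstract does not spell out KAURI's update rule, so the following is only a toy sketch of the general idea it describes: each candidate entity starts from a local (intra-tweet) score, and scores are then propagated across a graph linking related candidates from all of a user's tweets. The propagation used here is a generic random-walk-with-restart scheme chosen for illustration, not the authors' algorithm. The function name `link_mentions`, the parameter `lam`, and all scores and relatedness weights are hypothetical.

```python
from collections import defaultdict


def link_mentions(candidates, relatedness, lam=0.4, iterations=50):
    """Pick one entity per mention by propagating scores over a graph.

    candidates:  {mention: {entity: local_score}}, where local_score stands in
                 for intra-tweet evidence (assumed given, e.g. a prior).
    relatedness: {(entity_a, entity_b): weight} for undirected edges between
                 candidate entities (assumed given, e.g. topical relatedness).
    Entity names are assumed to be unique across mentions.
    """
    # Initial scores: local scores normalized to sum to 1 over all candidates.
    initial = {}
    for cands in candidates.values():
        for entity, score in cands.items():
            initial[entity] = score
    total = sum(initial.values())
    initial = {entity: score / total for entity, score in initial.items()}

    # Undirected adjacency, restricted to entities that are candidates.
    neighbors = defaultdict(dict)
    for (a, b), weight in relatedness.items():
        if a in initial and b in initial:
            neighbors[a][b] = weight
            neighbors[b][a] = weight
    out_weight = {e: sum(ws.values()) for e, ws in neighbors.items()}

    # Each round: keep a fraction lam of the initial score and receive the
    # rest from neighbors, in proportion to normalized edge weights.
    scores = dict(initial)
    for _ in range(iterations):
        updated = {}
        for entity in initial:
            incoming = sum(
                scores[n] * w / out_weight[n]
                for n, w in neighbors[entity].items()
            )
            updated[entity] = lam * initial[entity] + (1 - lam) * incoming
        scores = updated

    # For every mention, choose its highest-scoring candidate.
    return {
        mention: max(cands, key=lambda e: scores[e])
        for mention, cands in candidates.items()
    }


# Hypothetical mentions gathered from several tweets by one user.
candidates = {
    "Jordan": {
        "Jordan (country)": 0.5,
        "Michael Jordan (basketball)": 0.3,
        "Michael I. Jordan (researcher)": 0.2,
    },
    "Bulls": {"Chicago Bulls": 0.6, "Bull (animal)": 0.4},
    "NBA": {"National Basketball Association": 1.0},
}
relatedness = {
    ("Michael Jordan (basketball)", "Chicago Bulls"): 0.9,
    ("Michael Jordan (basketball)", "National Basketball Association"): 0.8,
    ("Chicago Bulls", "National Basketball Association"): 0.9,
}

links = link_mentions(candidates, relatedness)
print(links)
```

In this toy input the local scores alone would favor "Jordan (country)", but the candidates connected to the user's other mentions reinforce one another, so the basketball reading is selected instead.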

    Published in

      KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2013
      1534 pages
      ISBN: 9781450321747
      DOI: 10.1145/2487575

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      KDD '13 Paper Acceptance Rate: 125 of 726 submissions, 17%
      Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
