Named Entity Recognition and Linking in Tweets Based on Linguistic Similarity

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10640)

Abstract

This work proposes a novel approach to Named Entity rEcognition and Linking (NEEL) in tweets, applying the same strategy the authors previously presented for Question Answering (QA). That earlier work describes a rule-based, ontology-based system that attempts to retrieve the correct answer to a query from the DBpedia ontology through a similarity measure between the query and the ontology labels. In this paper, a tweet is interpreted as a query for the QA system: both the text and the thread of a tweet are treated as a sequence of statements to be linked to the ontology. Because tweets make extensive use of informal language, the similarity measure and the underlying processes have been devised differently from the previous approach; the particular structure of a tweet, that is, the presence of mentions, hashtags, and partially structured statements, is also taken into account for linguistic insights. NEEL is thus obtained as the output of annotating a tweet with the names of the ontological entities retrieved by the system. The strategy is explained in detail along with the architecture and implementation of the system; its performance is also reported and discussed in comparison with the systems presented at the #Microposts2016 workshop NEEL Challenge, co-located with the World Wide Web conference 2016 (WWW ’16).
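To make the idea of linking a tweet to ontology labels through a similarity measure more concrete, the following is a minimal sketch and not the authors' system. It assumes a hypothetical in-memory label table (`TOY_LABELS`) in place of DBpedia, swaps in Python's generic `difflib.SequenceMatcher` ratio for the paper's linguistic similarity measure, and uses an arbitrary threshold. The helper names (`split_camel`, `tokenize`, `annotate`) are illustrative only.

```python
import re
from difflib import SequenceMatcher

# Hypothetical toy "ontology": label -> entity identifier.
# A real system would look labels up in DBpedia instead.
TOY_LABELS = {
    "Barack Obama": "dbr:Barack_Obama",
    "World Wide Web": "dbr:World_Wide_Web",
    "Montreal": "dbr:Montreal",
}


def split_camel(text):
    """Insert spaces at lower-to-upper case boundaries: 'BarackObama' -> 'Barack Obama'."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)


def tokenize(tweet):
    """Split a tweet into words, unpacking hashtags and mentions into plain words."""
    tokens = []
    for raw in tweet.split():
        word = raw.strip(".,!?:;\"'")
        if word.startswith(("#", "@")):
            word = split_camel(word[1:])
        if word:
            tokens.extend(word.split())
    return tokens


def similarity(a, b):
    """Generic string similarity in [0, 1]; a stand-in, not the paper's measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def annotate(tweet, labels=TOY_LABELS, threshold=0.85, max_len=3):
    """Greedily link token spans (up to max_len words) to the most similar label."""
    tokens = tokenize(tweet)
    annotations = []
    i = 0
    while i < len(tokens):
        best = None  # (score, span length, entity, surface form)
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            surface = " ".join(tokens[i:i + n])
            for label, entity in labels.items():
                score = similarity(surface, label)
                if score >= threshold and (best is None or score > best[0]):
                    best = (score, n, entity, surface)
        if best:
            annotations.append((best[3], best[2]))
            i += best[1]  # skip past the linked span
        else:
            i += 1
    return annotations


print(annotate("#BarackObama visits Montreal!"))
```

With the toy table above, the call prints `[('Barack Obama', 'dbr:Barack_Obama'), ('Montreal', 'dbr:Montreal')]`: the hashtag is unpacked into two words before matching, and unlinked words such as "visits" are simply skipped.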



Author information


Corresponding author

Correspondence to Roberto Pirrone.


Copyright information

© 2017 Springer International Publishing AG

About this paper


Cite this paper

Pipitone, A., Tirone, G., Pirrone, R. (2017). Named Entity Recognition and Linking in Tweets Based on Linguistic Similarity. In: Esposito, F., Basili, R., Ferilli, S., Lisi, F. (eds) AI*IA 2017 Advances in Artificial Intelligence. AI*IA 2017. Lecture Notes in Computer Science, vol 10640. Springer, Cham. https://doi.org/10.1007/978-3-319-70169-1_8


  • DOI: https://doi.org/10.1007/978-3-319-70169-1_8


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70168-4

  • Online ISBN: 978-3-319-70169-1

  • eBook Packages: Computer Science (R0)
