Arabic Entity Graph Extraction Using Morphology, Finite State Machines, and Graph Transformations

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7181)

Abstract

Research on automatic recognition of named entities in Arabic text uses techniques that work well for Latin-based languages, such as local grammars, statistical learning models, pattern matching, and rule-based techniques. These techniques boost their results by using application-specific corpora, parallel language corpora, and morphological stemming analysis. We propose a method for extracting entities, events, and the relations among them from Arabic text using a hierarchy of finite state machines, driven by morphological features such as part-of-speech and gloss tags, together with graph transformation algorithms. We evaluated our method on two natural language processing applications: the automated extraction of narrators and narrator relations from several corpora of Islamic narration books, and the automated extraction of genealogical family trees from Biblical texts. In both applications, our method reports high precision and recall and learns lemmas about phrases that improve results.
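The general idea of a tag-driven finite state machine feeding a graph can be illustrated with a minimal Python sketch. This is not the authors' implementation: the three-tag scheme (NAME, CONN, OTHER), the function names extract_chains and build_graph, and the transliterated toy input are all assumptions made for illustration, and a real system would obtain its tags from a morphological analyzer rather than from hand labels.

```python
from collections import defaultdict

# Hypothetical coarse tags standing in for morphological features:
#   NAME  = token that is part of a proper name
#   CONN  = narration connector (a word such as "from")
#   OTHER = any other token

def extract_chains(tagged_tokens):
    """Scan (word, tag) pairs with a three-state machine and return name chains."""
    chains, chain, name = [], [], []
    state = "OUTSIDE"
    for word, tag in tagged_tokens:
        if tag == "NAME":
            # Extend the current (possibly multi-word) name.
            name.append(word)
            state = "IN_NAME"
        elif tag == "CONN" and state == "IN_NAME":
            # A connector after a name closes that name and keeps the chain open.
            chain.append(" ".join(name))
            name = []
            state = "AFTER_CONN"
        else:
            # Anything else ends the chain; keep it only if it links two names.
            if name:
                chain.append(" ".join(name))
                name = []
            if len(chain) >= 2:
                chains.append(chain)
            chain = []
            state = "OUTSIDE"
    # Flush whatever is still open at the end of the input.
    if name:
        chain.append(" ".join(name))
    if len(chain) >= 2:
        chains.append(chain)
    return chains

def build_graph(chains):
    """Merge chains into one graph: an edge points from a narrator to their source."""
    graph = defaultdict(set)
    for chain in chains:
        for narrator, source in zip(chain, chain[1:]):
            graph[narrator].add(source)
    return dict(graph)

# Invented toy input, hand-tagged for the example.
tokens = [
    ("reported", "OTHER"),
    ("Zayd", "NAME"), ("from", "CONN"),
    ("Amr", "NAME"), ("from", "CONN"),
    ("Ibn", "NAME"), ("Khalid", "NAME"),
    ("that", "OTHER"),
    ("Salim", "NAME"), ("from", "CONN"),
    ("Zayd", "NAME"), ("from", "CONN"),
    ("Amr", "NAME"),
]

chains = extract_chains(tokens)
graph = build_graph(chains)
```

Keying the graph by name merges a narrator who appears in several chains into a single node, which is the simplest possible graph transformation; resolving different spellings of the same person would need additional logic that this sketch does not attempt.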

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Makhlouta, J., Zaraket, F., Harkous, H. (2012). Arabic Entity Graph Extraction Using Morphology, Finite State Machines, and Graph Transformations. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_25

  • DOI: https://doi.org/10.1007/978-3-642-28604-9_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28603-2

  • Online ISBN: 978-3-642-28604-9

  • eBook Packages: Computer Science, Computer Science (R0)
