Abstract
Research on automatic recognition of named entities in Arabic text relies on techniques that work well for Latin-based languages, such as local grammars, statistical learning models, pattern matching, and rule-based techniques. These techniques improve their results by using application-specific corpora, parallel language corpora, and morphological stemming analysis. We propose a method for extracting entities, events, and the relations among them from Arabic text using a hierarchy of finite state machines driven by morphological features, such as part-of-speech and gloss tags, together with graph transformation algorithms. We evaluated our method on two natural language processing applications: extracting narrators and narrator relations from several corpora of Islamic narration books, and extracting genealogical family trees from Biblical texts. In both applications, our method achieves high precision and recall and learns lemmas about phrases that improve results.
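To make the idea of a tag-driven state machine concrete, the following is a minimal sketch and not the authors' implementation. It assumes each token already carries a single coarse tag, where `NAME`, `CONN`, and `OTHER` are hypothetical stand-ins for the part-of-speech and gloss tags a morphological analyzer would supply. The function name `extract_chain` and the transliterated sample tokens are likewise illustrative assumptions. The machine groups consecutive name tokens into one narrator and adds a graph edge whenever two narrators are separated only by a connector.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, tag); tags are hypothetical: NAME, CONN, OTHER


def extract_chain(tokens: List[Token]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Scan tagged tokens and return (narrators, edges).

    The machine's state is held in two variables: the name currently
    being assembled, and whether a connector links it to the previous
    narrator.
    """
    narrators: List[str] = []
    edges: List[Tuple[str, str]] = []
    name_parts: List[str] = []
    pending_link = False

    def flush() -> None:
        # Close the name in progress, if any, and record it as a narrator.
        nonlocal pending_link
        if not name_parts:
            return
        name = " ".join(name_parts)
        name_parts.clear()
        if narrators and pending_link:
            edges.append((narrators[-1], name))
        narrators.append(name)
        pending_link = False

    for word, tag in tokens:
        if tag == "NAME":
            name_parts.append(word)
        elif tag == "CONN":
            flush()
            # A connector only links if some narrator precedes it.
            pending_link = bool(narrators)
        else:
            flush()
            pending_link = False  # unrelated text breaks the chain
    flush()
    return narrators, edges


# Illustrative, transliterated input (assumed, not taken from the corpora).
sample = [
    ("haddathana", "CONN"), ("Malik", "NAME"),
    ("an", "CONN"), ("Nafi", "NAME"),
    ("an", "CONN"), ("Ibn", "NAME"), ("Umar", "NAME"),
    ("qala", "OTHER"),
]
narrators, edges = extract_chain(sample)
```

The returned edge list is a simple directed graph over narrators. In a fuller system, a later pass could merge nodes that refer to the same person, which is one plausible role for the graph transformation step mentioned in the abstract.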
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Makhlouta, J., Zaraket, F., Harkous, H. (2012). Arabic Entity Graph Extraction Using Morphology, Finite State Machines, and Graph Transformations. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_25
DOI: https://doi.org/10.1007/978-3-642-28604-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28603-2
Online ISBN: 978-3-642-28604-9
eBook Packages: Computer Science, Computer Science (R0)