Abstract
In this paper, we address the problem of semantic hidden errors in Arabic texts. These are spelling errors that produce valid words and therefore cause semantic irregularities rather than lexical ones. We first describe the different types of these errors. Then, we present and justify the adopted approach, which combines several methods. Next, we describe the context of our work and present the multi-agent architecture of our system. Finally, we present the testing framework used to evaluate the implemented system.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ben Othmane Zribi, C., Mejri, H., Ben Ahmed, M. (2007). Combining Methods for Detecting and Correcting Semantic Hidden Errors in Arabic Texts. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2007. Lecture Notes in Computer Science, vol 4394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70939-8_56
DOI: https://doi.org/10.1007/978-3-540-70939-8_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70938-1
Online ISBN: 978-3-540-70939-8
eBook Packages: Computer Science, Computer Science (R0)