Combining Methods for Detecting and Correcting Semantic Hidden Errors in Arabic Texts

Ben Othmane Zribi, Chiraz; Mejri, Hanene; Ben Ahmed, Mohamed

doi:10.1007/978-3-540-70939-8_56

Chiraz Ben Othmane Zribi, Hanene Mejri & Mohamed Ben Ahmed

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4394)

Abstract

In this paper, we address the problem of semantic hidden errors in Arabic texts. These are spelling errors that produce valid words and thereby cause semantic irregularities. We first describe the different types of these errors. Then, we present and justify the adopted approach, which is based on the combination of several methods. Next, we describe the context of our work and show the multi-agent architecture of our system. Finally, we present the testing framework used to evaluate the implemented system.


Author information

Authors and Affiliations

RIADI laboratory, National School of Computer Sciences, 2010, University of La Manouba, Tunisia
Chiraz Ben Othmane Zribi, Hanene Mejri & Mohamed Ben Ahmed

Editor information

Alexander Gelbukh

About this paper

Cite this paper

Ben Othmane Zribi, C., Mejri, H., Ben Ahmed, M. (2007). Combining Methods for Detecting and Correcting Semantic Hidden Errors in Arabic Texts. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2007. Lecture Notes in Computer Science, vol 4394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70939-8_56

DOI: https://doi.org/10.1007/978-3-540-70939-8_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70938-1
Online ISBN: 978-3-540-70939-8
eBook Packages: Computer Science, Computer Science (R0)
