Nâbra: Syrian Arabic Dialects with Morphological Annotations

Amal Nayouf, Tymaa Hammouda, Mustafa Jarrar, Fadi Zaraket, Mohamad-Bassam Kurdy


Abstract
This paper presents Nâbra (نَبْرَة), a corpora of Syrian Arabic dialects with morphological annotations. A team of Syrian natives collected more than 6K sentences containing about 60K words from several sources including social media posts, scripts of movies and series, lyrics of songs and local proverbs to build Nâbra. Nâbra covers several local Syrian dialects including those of Aleppo, Damascus, Deir-ezzur, Hama, Homs, Huran, Latakia, Mardin, Raqqah, and Suwayda. A team of nine annotators annotated the 60K tokens with full morphological annotations across sentence contexts. We trained the annotators to follow methodological annotation guidelines to ensure unique morpheme annotations, and normalized the annotations. F1 and 𝜅 agreement scores ranged between 74% and 98% across features, showing the excellent quality of Nâbra annotations. Our corpora are open-source and publicly available as part of the Currasat portal https://sina.birzeit.edu/currasat.
Anthology ID:
2023.arabicnlp-1.2
Volume:
Proceedings of ArabicNLP 2023
Month:
December
Year:
2023
Address:
Singapore (Hybrid)
Editors:
Hassan Sawaf, Samhaa El-Beltagy, Wajdi Zaghouani, Walid Magdy, Ahmed Abdelali, Nadi Tomeh, Ibrahim Abu Farha, Nizar Habash, Salam Khalifa, Amr Keleg, Hatem Haddad, Imed Zitouni, Khalil Mrini, Rawan Almatham
Venues:
ArabicNLP | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
12–23
Language:
URL:
https://aclanthology.org/2023.arabicnlp-1.2
DOI:
10.18653/v1/2023.arabicnlp-1.2
Bibkey:
Cite (ACL):
Amal Nayouf, Tymaa Hammouda, Mustafa Jarrar, Fadi Zaraket, and Mohamad-Bassam Kurdy. 2023. Nâbra: Syrian Arabic Dialects with Morphological Annotations. In Proceedings of ArabicNLP 2023, pages 12–23, Singapore (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Nâbra: Syrian Arabic Dialects with Morphological Annotations (Nayouf et al., ArabicNLP-WS 2023)
Copy Citation:
PDF:
https://aclanthology.org/2023.arabicnlp-1.2.pdf