09 Dec 2022

AUTOMATIC RELATION EXTRACTION BETWEEN ENTITIES FOR AMHARIC TEXT

  • Assistant Professor, Department of Computer Science, Ambo University, Ethiopia.
  • Lecturer, Department of Computer Science, Bule Hora University, Ethiopia.
  • Lecturer, Department of Computer Science, Werabe University, Ethiopia.

This research focuses on automatic relation extraction between entities in Amharic text using a supervised machine learning approach. The study built its own corpus from the Walta Information Centre online archive, consisting of 2,000 sentences and 30,466 words (tokens). The proposed solution has four stages: preprocessing, text labeling, feature extraction and selection, and recognition. Preprocessing consists of tokenization and part-of-speech (POS) tagging. Once the text is tokenized and each token has a POS tag, the text is labeled using the BIO scheme. Tag features are selected for building the model; they consist of the named entity type and the relation type. The named entity types are Location (LOC), Organization (ORG) and Person (PER). The relation type features mark every word that occurs between two entities, for instance in a location-location or location-organization relation, together with the corresponding entities that appear in it. Vectorization is done using DictVectorizer and word2features. Support vector machine (SVM) and conditional random field (CRF) models are used to recognize relations between entities in Amharic text. SVM with SGD achieved a weighted precision of 49%, recall of 10% and F1-score of 13%. SVM with the Multinomial Naive Bayes classifier achieved a precision of 61%, recall of 41% and F1-score of 48%. SVM with the Passive Aggressive classifier achieved a weighted average precision of 55%, recall of 19% and F1-score of 27%. The CRF algorithm achieved a precision of 87%, recall of 87% and F1-score of 86%. The CRF model outperformed the SVM-based models.
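
As a rough illustration of the text labeling and feature extraction stages, the Python sketch below assigns BIO tags to a tokenized sentence and builds a per-token feature dictionary in the style of a word2features function. It is only a sketch under stated assumptions: the relation tag name (PER-ORG for words lying between a person and an organization), the POS tags, the feature keys, and the English stand-in tokens are all hypothetical and are not taken from the paper, whose actual label set and features may differ.

```python
# Illustrative sketch only. Tag names, POS tags, feature keys, and the
# English stand-in tokens are assumptions, not the paper's actual data.

def bio_tags(tokens, spans):
    """Return BIO tags for tokens, given (start, end_exclusive, label) spans."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = "B-" + label          # first token of the span
        for i in range(start + 1, end):
            tags[i] = "I-" + label          # remaining tokens of the span
    return tags


def word2features(sent, i):
    """Build a feature dict for token i of sent, a list of (token, pos) pairs."""
    token, pos = sent[i]
    features = {
        "bias": 1.0,
        "token": token,
        "pos": pos,
        "suffix2": token[-2:],              # last two characters of the token
    }
    if i > 0:
        prev_token, prev_pos = sent[i - 1]
        features["-1:token"] = prev_token
        features["-1:pos"] = prev_pos
    else:
        features["BOS"] = True              # beginning of sentence
    if i < len(sent) - 1:
        next_token, next_pos = sent[i + 1]
        features["+1:token"] = next_token
        features["+1:pos"] = next_pos
    else:
        features["EOS"] = True              # end of sentence
    return features


def sent2features(sent):
    """Apply word2features to every position in the sentence."""
    return [word2features(sent, i) for i in range(len(sent))]


# Stand-in sentence: a person, the words between the two entities, an organization.
tokens = ["Abebe", "works", "at", "Ambo", "University"]
pos_tags = ["N", "V", "PREP", "N", "N"]     # hypothetical POS tag set
spans = [(0, 1, "PER"), (1, 3, "PER-ORG"), (3, 5, "ORG")]

tags = bio_tags(tokens, spans)
sent = list(zip(tokens, pos_tags))
features = sent2features(sent)

for tok, tag in zip(tokens, tags):
    print(tok, tag)
```

Feature dictionaries of this shape can be turned into sparse vectors with scikit-learn's DictVectorizer for classifiers such as SVMs, or passed as per-sentence lists to a CRF library such as sklearn-crfsuite.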


[S. Nagarajan, Melkamu Genet and Yonatan Negesa (2022); AUTOMATIC RELATION EXTRACTION BETWEEN ENTITIES FOR AMHARIC TEXT Int. J. of Adv. Res. 10 (Dec). 114-125] (ISSN 2320-5407). www.journalijar.com


S. Nagarajan
Assistant Professor, Department of Computer Science, Ambo University, Ethiopia.

DOI:


Article DOI: 10.21474/IJAR01/15816      
DOI URL: http://dx.doi.org/10.21474/IJAR01/15816