Research Articles

Multi-label classification of computer science documents using fuzzy logic

Abstract

Classification has already been used to predict predefined topics in many diverse domains, including the classification of research papers. A research paper may belong to one or more topics (classes). The state-of-the-art techniques in this area have the following limitations: (1) most of them assign a document to at most one principal topic and do not identify all of the topic associations of a research paper, and (2) they treat the classification of research documents as a problem in the discrete domain, and their accuracy remains low when multiple classes are considered for a single document. These limitations led us to explore the fuzzy domain for the classification of Computer Science documents, because it is not known in advance whether a document belongs to one category or to several. Furthermore, fuzzy classification helps to identify the degree to which a paper belongs to different topics. Validating the findings of our research requires a comprehensive dataset. Such a dataset has been made available by the scientific community for the Computer Science domain; therefore, in this paper, we restrict our focus to that domain. Key features are extracted from the Title and Keywords of the research paper. We used term frequency (TF) as the weight scoring methodology. As a paper may belong to more than one category, we used a fuzzy classifier, which automatically identifies all possible categories. Subsequently, based on a threshold, one or more final topics are assigned. We propose a generic framework and two algorithms for identifying the category or categories. The rules are evolved (updated) by a rules updater after the fuzzy classifier has performed the classification. The accuracy of the technique has been compared with that of different classification techniques. The proposed approach outperformed the state-of-the-art approaches.
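The pipeline outlined in the abstract (TF weights from Title and Keywords, a degree of membership per topic, then a threshold) can be illustrated with a minimal sketch. This is not the authors' framework or either of their two algorithms: the topic vocabularies in `TOPIC_TERMS`, the function names, the membership formula (share of tokens matching a topic's vocabulary), and the threshold value of 0.2 are all assumptions made for illustration, and the rules updater is not modelled.

```python
from collections import Counter

# Hypothetical topic vocabularies (assumption; the paper's rule base is not shown here).
TOPIC_TERMS = {
    "databases": {"query", "sql", "index", "transaction"},
    "machine_learning": {"learning", "classifier", "neural", "fuzzy"},
    "networks": {"routing", "protocol", "wireless"},
}


def term_frequencies(title, keywords):
    """Count lowercase whitespace-separated tokens from the title and keywords."""
    tokens = (title + " " + " ".join(keywords)).lower().split()
    return Counter(tokens)


def memberships(tf):
    """Return a degree in [0, 1] per topic.

    Assumed formula: the fraction of all tokens that fall in the
    topic's vocabulary. A real fuzzy rule base would likely differ.
    """
    total = sum(tf.values())
    if total == 0:
        return {topic: 0.0 for topic in TOPIC_TERMS}
    return {
        topic: sum(tf[term] for term in terms) / total
        for topic, terms in TOPIC_TERMS.items()
    }


def assign_topics(degrees, threshold=0.2):
    """Keep every topic whose degree reaches the (assumed) threshold."""
    return sorted(topic for topic, degree in degrees.items() if degree >= threshold)


if __name__ == "__main__":
    tf = term_frequencies("Fuzzy query index learning", ["sql", "classifier"])
    degrees = memberships(tf)
    print(degrees)
    print(assign_topics(degrees))
```

Because every topic at or above the threshold is kept, a single paper can receive several labels, which is the multi-label behaviour the abstract describes; raising the threshold trades coverage for stricter assignments.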

Keywords:

Category identification, document classification, fuzzy rules, research paper classification
  • Year: 2016
  • Volume: 44 Issue: 2
  • Page/Article: 155-165
  • DOI: 10.4038/jnsfsr.v44i2.7996
  • Published on 30 Jun 2016
  • Peer Reviewed