Computer Science and Information Systems 2023 Volume 20, Issue 4, Pages: 1367-1387
https://doi.org/10.2298/CSIS221210055K

Sentence embedding approach using LSTM auto-encoder for discussion threads summarization

Khan Abdul Wali (Center for Excellence in Information Technology, Institute of Management Sciences, Peshawar, Pakistan), abdulwalikhanafridi@gmail.com
Al-Obeidat Feras (College of Technological Innovation, Zayed University, Abu Dhabi, UAE), Feras.Al-Obeidat@zu.ac.ae
Khalid Afsheen (Center for Excellence in Information Technology, Institute of Management Sciences, Peshawar, Pakistan), afsheen.khalid@imsciences.edu.pk
Amin Adnan (Center for Excellence in Information Technology, Institute of Management Sciences, Peshawar, Pakistan), adnan.amin@imsciences.edu.pk
Moreira Fernando (REMIT, IJP, Universidade Portucalense; IEETA, Universidade de Aveiro, Portugal), fmoreira@uportu.pt

Online discussion forums are repositories of valuable information where users interact, articulate their ideas and opinions, and share experiences on numerous topics. These forums are internet-based communities where users can ask for help and find solutions to problems. A new user can become exhausted from reading the large number of irrelevant replies in a discussion. An automated discussion thread summarization (DTS) system is therefore needed to give a candid view of the entire discussion of a query. Most previous approaches to automated DTS use the continuous bag of words (CBOW) model as a sentence embedding tool, which is poor at capturing the overall meaning of a sentence and unable to grasp word dependency. To overcome these limitations, we introduce the LSTM Auto-encoder as a sentence embedding technique to improve the performance of DTS. Empirical results on two standard experimental datasets, measured as average precision, recall, and F-measure with respect to ROUGE-1 and ROUGE-2, demonstrate the effectiveness and efficiency of the proposed approach: it outperforms the state-of-the-art CBOW model in sentence embedding tasks and boosts the performance of the automated DTS model.
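
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how precision, recall, and F-measure can be computed for ROUGE-N from n-gram overlap. It is an illustration only, not the evaluation code used in the paper: the function names `ngrams` and `rouge_n` are hypothetical, and the sketch assumes lowercased whitespace tokenization with no stemming or stop-word removal.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F-measure) from clipped n-gram overlap.

    Assumption: plain lowercased whitespace tokenization; published ROUGE
    toolkits may apply stemming or other preprocessing.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated n-gram is never credited more often than it appears
    # in the reference.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    reference = "restart the router and update the firmware"
    summary = "update the firmware and restart"
    for n in (1, 2):
        p, r, f = rouge_n(summary, reference, n)
        print(f"ROUGE-{n}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

In this toy example the summary contains every word it shares with the reference, so ROUGE-1 precision is 1.0, while ROUGE-2 drops to a precision of 0.5 and a recall of 1/3 because only two of the word pairs appear in the same order. This is why ROUGE-2 is commonly reported alongside ROUGE-1.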

Keywords: Sentence embedding, LSTM Auto-encoder, CBOW, Deep learning, Machine learning, NLP