research-article
Just Accepted

Analyzing and Detecting Information Types of Developer Live Chat Threads

Online AM: 29 January 2024

Abstract

Online chatrooms serve as vital platforms for information exchange among software developers. Because multiple developers communicate rapidly and on diverse topics, the resulting chat messages are often complex and unstructured. To make extracting information from chat threads more efficient, automatic mining techniques have been introduced for thread classification. However, previous approaches still suffer from unsatisfactory classification accuracy because of two primary challenges: they struggle to capture long-distance dependencies within chat threads, and they do not adequately address category imbalance in labeled datasets. To overcome these challenges, we present EAEChat, a topic classification approach for chat information types. EAEChat comprises three core components. The text feature encoding component captures contextual text features using a text feature encoder based on multi-head self-attention, and employs a siamese network to mitigate overfitting caused by limited data. The data augmentation component expands the categories that have few samples in the training dataset using a technique tailored to developer chat messages, which addresses the imbalanced category distribution. The non-text feature encoding component employs a feature fusion model to integrate deep text features with manually extracted non-text features. Evaluation on three real-world projects shows that EAEChat achieves an average precision, recall, and F1-score of 0.653, 0.651, and 0.644, respectively, a significant 7.60% improvement over the state-of-the-art approaches. These findings confirm the effectiveness of our method in classifying developer chat messages in online chatrooms.
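As a reading aid for the reported scores, the following is a minimal sketch of how per-category precision, recall, and F1 can be macro-averaged in a multi-class setting. It assumes macro-averaging over categories; the abstract does not state which averaging scheme the evaluation uses. The function name `macro_scores` and the labels `type_a` and `type_b` are hypothetical and do not come from the paper.

```python
from collections import Counter


def macro_scores(gold, predicted):
    """Return macro-averaged (precision, recall, F1).

    Assumption: every category counts equally, regardless of how many
    threads it contains. This is an illustration, not the paper's code.
    """
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true label but was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical labels: four threads of type_a, two of type_b.
gold = ["type_a"] * 4 + ["type_b"] * 2
pred = ["type_a", "type_a", "type_b", "type_b", "type_b", "type_b"]
precision, recall, f1 = macro_scores(gold, pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

In this toy example the averaged F1 is lower than both the averaged precision and the averaged recall, because F1 is computed per category before averaging. The reported F1 of 0.644 lying below both 0.653 and 0.651 is consistent with averaging per-category or per-project F1 values, although the abstract does not confirm this.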

          • Published in

            ACM Transactions on Software Engineering and Methodology, Just Accepted
            ISSN: 1049-331X
            EISSN: 1557-7392

            Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Online AM: 29 January 2024
            • Accepted: 15 January 2024
            • Revised: 11 January 2024
            • Received: 13 August 2023