How to cherry pick the bug report for better summarization?

Empirical Software Engineering

Abstract

Bug reports are a frequently consulted software asset that communities maintain and evolve over time. As software evolves, a large number of bug reports with complex discussions accumulate. It has been shown that an accurate and concise summary can reduce the time and effort developers spend reading the entire content of a bug report. Prior works select the salient sentences that carry the most semantic information to form summaries. Their performance is limited because they consider neither the controversial standpoints among developers' comments nor the redundancy among sentences. In this paper, we study whether comments' opinions can be assessed from discussions, and which kinds of sentences are more likely to contain redundant information. Based on these studies, we propose two new factors, Believability and Informativeness. The former measures the degree to which a sentence is approved or disapproved within discussions, and the latter assesses the amount of information contained in the summary. Accordingly, we design BugSum, a supervised approach that generates summaries with a two-phase method. In the measuring phase, we propose a classification method that combines the advantages of Deep Pyramid CNN and Random Forest to assess the believability of sentences in bug reports. In the selection phase, BugSum integrates an auto-encoder network for semantic feature extraction with the believability of sentences, and optimizes the informativeness of generated summaries through a dynamic selection of salient sentences. Extensive experiments show that our approach outperforms eight comparative approaches on two public datasets and one customized dataset. In particular, the probability of adding controversial sentences that are clearly disapproved by other developers to the summary is reduced by up to 64.7%.
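The interplay between the two factors can be illustrated with a small sketch. This is not BugSum's implementation: it assumes that sentence vectors and believability scores in [0, 1] are already available (in the paper they would come from the auto-encoder and the classifier), and it substitutes a simple greedy, MMR-style redundancy penalty for the paper's informativeness optimization. The names `select_sentences` and `redundancy_weight` are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_sentences(vectors, believability, budget, redundancy_weight=0.5):
    """Greedily pick up to `budget` sentence indices.

    Assumption: a sentence's gain is its believability minus a penalty for
    its highest similarity to any sentence already chosen. This is an
    MMR-style stand-in, not the selection procedure used by BugSum.
    """
    chosen = []
    remaining = set(range(len(vectors)))

    def gain(i):
        redundancy = max(
            (cosine(vectors[i], vectors[j]) for j in chosen), default=0.0
        )
        return believability[i] - redundancy_weight * redundancy

    while remaining and len(chosen) < budget:
        # Iterate in index order so ties resolve to the earliest sentence.
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:
            break  # every remaining sentence is disapproved or redundant
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy data: sentences 0 and 1 are near-duplicates; sentence 3 is disapproved.
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
believability = [0.9, 0.8, 0.7, 0.1]
summary_indices = select_sentences(vectors, believability, budget=2)
```

In this toy input the duplicate of the first sentence loses to a less believable but non-redundant sentence, and the disapproved sentence is never selected even when the budget allows it, which is the behavior the two factors are meant to encourage.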

Notes

  1. https://www.bugzilla.org/

  2. https://help.github.com/en/github/managing-your-work-on-github/about-issues

  3. https://www.debian.org/Bugs/server-control#summary

  4. https://github.com/HaoranLiu14/BugSum

  5. https://github.com/HaoranLiu14/Controversial-Dataset

Acknowledgments

This work is supported by the National Grand R&D Plan (Grant No. 2020AAA0103504) and the National Natural Science Foundation (Nos. 61872373 and 61672529).

Author information

Correspondence to Yue Yu or Shanshan Li.

Additional information

Communicated by: Ali Ouni, David Lo, Xin Xia, Alexander Serebrenik and Christoph Treude

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Recommendation Systems for Software Engineering

About this article

Cite this article

Liu, H., Yu, Y., Li, S. et al. How to cherry pick the bug report for better summarization?. Empir Software Eng 26, 119 (2021). https://doi.org/10.1007/s10664-021-10008-2
