NRPredictor: an ensemble learning and feature selection based approach for predicting the non-reproducible bugs

Bansal, Kulbhushan; Singh, Gopal; Sunesh Malik; Rohil, Harish

doi:10.1007/s13198-023-01902-7

Original Article
Published: 08 May 2023

Volume 14, pages 989–1009, (2023)
International Journal of System Assurance Engineering and Management

Kulbhushan Bansal ORCID: orcid.org/0000-0003-3874-8949¹,
Gopal Singh²,
Sunesh Malik³ &
…
Harish Rohil⁴

Abstract

Software maintenance is an essential and significant phase of the software development life cycle. In software projects, issue tracking systems are used to collect, categorise, and track filed issues. Some bug reports cannot be reproduced by software developers and are hence marked as non-reproducible. Non-reproducible bugs are a major performance issue in bug repositories because they consume a great deal of developers' time and effort. Because bug fixing is inherently unpredictable, bug management is frequently a painful undertaking for software engineers, and non-reproducible bugs add to this difficulty. This paper develops an early prediction model for identifying non-reproducible bugs. A novel framework named NRPredictor is proposed, which uses three ensemble learning algorithms and one feature selection algorithm for non-reproducible bug prediction. The prediction performance of the proposed framework has been evaluated on three open-source projects from the Bugzilla bug tracking system, viz. Mozilla Firefox, Eclipse, and NetBeans. The experimental findings reveal that NRPredictor surpasses traditional machine learning techniques when forecasting the fixability of bug reports. For the Mozilla Firefox, Eclipse, and NetBeans projects, NRPredictor delivers performance (in terms of F1-score) of up to 88.3, 87.8, and 87.4%, respectively. Compared to the best-performing standalone machine learning classifier, an improvement in performance of up to 6.1, 5, and 2.7% has been obtained for the NetBeans, Eclipse, and Mozilla Firefox projects, respectively.
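The abstract reports results as F1-scores and compares an ensemble against standalone classifiers. The following is a minimal sketch of that kind of comparison, not the authors' implementation: the abstract does not name the three ensemble algorithms or the feature selection algorithm, so plain majority voting is used here as a stand-in, and the labels, the classifier outputs, and the names `f1_score` and `majority_vote` are all hypothetical.

```python
from collections import Counter


def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_vote(predictions):
    """Combine several classifiers' label lists into one list by majority vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Invented toy data: 1 = non-reproducible, 0 = reproducible.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
clf_a = [1, 1, 0, 0, 1, 1, 0, 0]  # hypothetical standalone classifier outputs
clf_b = [1, 0, 1, 0, 0, 1, 1, 0]
clf_c = [0, 1, 1, 1, 0, 1, 0, 0]

ensemble = majority_vote([clf_a, clf_b, clf_c])

for name, pred in [("A", clf_a), ("B", clf_b), ("C", clf_c), ("ensemble", ensemble)]:
    print(f"{name}: F1 = {f1_score(y_true, pred):.3f}")
```

In this toy data each classifier errs on different reports, so the vote cancels the individual mistakes. Real bug-report data will not behave this cleanly, and the size of any gain depends on how correlated the base classifiers' errors are.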

Funding

Neither this study nor any of its authors received funding, including after the completion of the study.

Author information

Authors and Affiliations

Department of CSE, Chaudhary Devi Lal University, Sirsa, Haryana, India
Kulbhushan Bansal
Department of CSA, Maharshi Dayanand University, Rohtak, Haryana, India
Gopal Singh
Maharaja Surajmal Institute of Technology, Delhi, India
Sunesh Malik
Department of CSE, Chaudhary Devi Lal University, Sirsa, Haryana, India
Harish Rohil

Corresponding author

Correspondence to Kulbhushan Bansal.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article. The authors did not receive support from any organization for the submitted work.

Human or animal rights

This study did not involve any human participants or animals.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bansal, K., Singh, G., Sunesh Malik et al. NRPredictor: an ensemble learning and feature selection based approach for predicting the non-reproducible bugs. Int J Syst Assur Eng Manag 14, 989–1009 (2023). https://doi.org/10.1007/s13198-023-01902-7

Received: 10 February 2022
Revised: 24 June 2022
Accepted: 29 March 2023
Published: 08 May 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s13198-023-01902-7
