Abstract
Software maintenance is an essential and significant phase of the software development life cycle. In software projects, issue tracking systems are used to collect, categorise, and track filed issues. Some bug reports cannot be reproduced by software developers and are therefore marked as non-reproducible. Non-reproducible bugs are a major problem in bug repositories because they consume a great deal of developer time and effort. Because bug fixing is inherently unpredictable, bug management is frequently a painful undertaking for software engineers, and non-reproducible bugs add to the difficulty of this vexing task. This paper develops an early prediction model for identifying non-reproducible bugs. A novel framework named NRPredictor is proposed, which uses three ensemble learning algorithms and one feature selection algorithm for non-reproducible bug prediction. The prediction performance of the framework is evaluated on three open-source projects that use the Bugzilla bug tracking system: Mozilla Firefox, Eclipse, and NetBeans. The experimental findings show that, when forecasting the fixability of bug reports, NRPredictor outperforms traditional machine learning techniques. For the Mozilla Firefox, Eclipse, and NetBeans projects, NRPredictor achieves F1-scores of up to 88.3, 87.8, and 87.4%, respectively. Compared with the best-performing standalone machine learning classifier, this is an improvement of up to 6.1, 5, and 2.7% for the NetBeans, Eclipse, and Mozilla Firefox projects, respectively.
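The results above are reported as F1-scores. As a reminder of what that metric measures, the following is a minimal sketch of the F1 computation for a binary non-reproducible/reproducible labelling. The function name, the label encoding (1 = non-reproducible) and the toy labels are assumptions made for illustration; they are not taken from the paper or its datasets.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score of the `positive` class for two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # True positives: bugs labelled non-reproducible and predicted as such.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted non-reproducible, but actually not.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: non-reproducible bugs the model missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # Precision and recall are both zero (or undefined), so report 0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight bug reports (1 = non-reproducible).
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 1, 0, 1, 0]
toy_f1 = f1_score(toy_true, toy_pred)
print(f"F1 = {toy_f1:.3f}")
```

In this toy case there are three true positives, one false positive and one false negative, so precision and recall are both 0.75 and the F1-score is 0.75. A score such as 88.3% in the abstract corresponds to 0.883 on this scale.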
Funding
Neither this study nor any of its authors received funding, including after completion of the study.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article. The authors did not receive support from any organization for the submitted work.
Human or animal rights
This study did not involve any human participants or animals.
About this article
Cite this article
Bansal, K., Singh, G., Sunesh Malik et al. NRPredictor: an ensemble learning and feature selection based approach for predicting the non-reproducible bugs. Int J Syst Assur Eng Manag 14, 989–1009 (2023). https://doi.org/10.1007/s13198-023-01902-7
DOI: https://doi.org/10.1007/s13198-023-01902-7