Abstract
Software engineers have limited resources and need metrics-analysis tools to investigate software quality attributes such as the fault-proneness of modules. A large number of software metrics are available for investigating quality, but not all of them correlate strongly with faults. In addition, software fault data are imbalanced, which affects quality-assessment tools such as fault-prediction models and the threshold values used to identify risky modules. This work investigates software quality for three purposes. First, receiver operating characteristic (ROC) analysis is used to identify threshold values that flag risky modules. Second, ROC analysis is evaluated on imbalanced data. Third, ROC analysis is considered as a feature-selection technique. The work validates the use of ROC analysis to identify thresholds for four metrics (WMC, CBO, RFC, and LCOM). The ROC results after sampling the data do not differ significantly from those before sampling. ROC analysis selects the same metrics (WMC, CBO, and RFC) in most datasets, whereas other techniques vary widely in the metrics they select.
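The threshold-identification idea summarized above can be sketched in a few lines: pick the cutoff on a metric that maximizes Youden's J (TPR − FPR) over the ROC points. This is a minimal illustration of the general technique, not the paper's implementation; the WMC values and fault labels below are made-up examples, and `roc_threshold` is a hypothetical helper name.

```python
def roc_threshold(values, labels):
    """Return the cutoff on `values` that maximizes Youden's J = TPR - FPR.

    A module is flagged 'risky' when its metric value >= cutoff;
    labels are 1 (faulty) or 0 (clean).
    """
    pos = sum(labels)              # number of faulty modules
    neg = len(labels) - pos        # number of clean modules
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):                 # each candidate cutoff
        tp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 1)
        fp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 0)
        j = tp / pos - fp / neg                     # Youden's J statistic
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut, best_j

# Hypothetical WMC value per class and whether the class was faulty.
wmc    = [3, 5, 7, 9, 12, 15, 18, 22, 25, 30]
faulty = [0, 0, 0, 0,  0,  1,  1,  1,  0,  1]

cut, j = roc_threshold(wmc, faulty)
print(cut, round(j, 2))  # → 15 0.83
```

With this toy data, classes with WMC ≥ 15 would be treated as risky; on real datasets the paper additionally examines how sampling the imbalanced fault data changes such cutoffs.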
Shatnawi, R. The application of ROC analysis in threshold identification, data imbalance and metrics selection for software fault prediction. Innovations Syst Softw Eng 13, 201–217 (2017). https://doi.org/10.1007/s11334-017-0295-0