
The application of ROC analysis in threshold identification, data imbalance and metrics selection for software fault prediction

  • Original Paper
  • Published in: Innovations in Systems and Software Engineering

Abstract

Software engineers have limited resources and need metrics-analysis tools to investigate software quality, such as the fault-proneness of modules. A large number of software metrics are available for this purpose, but not all of them are strongly correlated with faults. In addition, software fault data are imbalanced, which affects quality-assessment tools such as fault prediction models and the threshold values used to flag risky modules. This work investigates software quality for three purposes. First, receiver operating characteristic (ROC) analysis is used to identify threshold values that flag risky modules. Second, ROC analysis is evaluated on imbalanced data. Third, ROC analysis is considered for feature selection. The study validates the use of ROC analysis to identify thresholds for four metrics (WMC, CBO, RFC and LCOM). The ROC results after sampling the data are not significantly different from those before sampling. ROC analysis selects the same metrics (WMC, CBO and RFC) in most datasets, whereas other techniques vary widely in the metrics they select.
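To make the three uses of ROC analysis concrete, the sketch below shows how a threshold for a single object-oriented metric can be chosen from an ROC curve (maximising Youden's J = TPR - FPR, one common criterion rather than necessarily the paper's exact procedure), how the area under the curve (AUC) can rank metrics for selection, and how the AUC can be recomputed after SMOTE oversampling to probe the effect of class imbalance. The column names (wmc, cbo, rfc, lcom, bug) and the file name ant-1.7.csv are assumptions following the PROMISE/Jureczko dataset convention; scikit-learn and imbalanced-learn are assumed to be installed.

```python
# Minimal sketch (not the paper's code): ROC-based threshold identification,
# AUC-based metric ranking, and an imbalance check via SMOTE oversampling.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_curve, roc_auc_score
from imblearn.over_sampling import SMOTE   # assumed available

METRICS = ["wmc", "cbo", "rfc", "lcom"]    # CK metrics studied in the paper

def roc_threshold(values, faulty):
    """Return the cut-off maximising Youden's J = TPR - FPR, and the AUC."""
    fpr, tpr, thresholds = roc_curve(faulty, values)
    best = np.argmax(tpr - fpr)
    return thresholds[best], roc_auc_score(faulty, values)

# Hypothetical fault dataset: one row per class, fault counts in a "bug" column.
df = pd.read_csv("ant-1.7.csv")            # assumed file name
faulty = (df["bug"] > 0).astype(int)       # binarise fault counts

# 1) Threshold identification and 2) AUC-based ranking on the original data.
for m in METRICS:
    t, auc = roc_threshold(df[m].to_numpy(), faulty)
    print(f"{m}: threshold={t:.1f}, AUC={auc:.3f}")

# 3) Recompute the AUC after SMOTE oversampling to see whether sampling
#    changes the ROC results (the paper reports no significant difference).
X_res, y_res = SMOTE(random_state=0).fit_resample(df[METRICS], faulty)
X_res = np.asarray(X_res)
for i, m in enumerate(METRICS):
    _, auc_res = roc_threshold(X_res[:, i], y_res)
    print(f"{m}: AUC after SMOTE={auc_res:.3f}")
```

Under this sketch, metrics whose AUC stays high before and after resampling are the ones ROC analysis would retain, mirroring the paper's observation that WMC, CBO and RFC are consistently selected.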


Notes

  1. www.jarchitect.com.

  2. https://kenai.com/nonav/projects/buginfo.

  3. http://www.spinellis.gr/sw/ckjm/.


Author information


Corresponding author

Correspondence to Raed Shatnawi.


About this article


Cite this article

Shatnawi, R. The application of ROC analysis in threshold identification, data imbalance and metrics selection for software fault prediction. Innovations Syst Softw Eng 13, 201–217 (2017). https://doi.org/10.1007/s11334-017-0295-0

