Metrics-Driven Software Quality Prediction Without Prior Fault Data

Catal, Cagatay; Sevim, Ugur; Diri, Banu

doi:10.1007/978-90-481-8776-8_17

Cagatay Catal³,
Ugur Sevim³ &
Banu Diri⁴

Chapter
First Online: 01 January 2010


Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 60)

Abstract

Software quality assessment models are quantitative analytical models that are more reliable than qualitative models based on personal judgment. They fall into two groups: generalized models and product-specific models. Measurement-driven predictive models, a subgroup of product-specific models, assume a predictive relationship between software measurements and quality. In recent years, measurement-driven predictive models have received greater attention among quality assessment models, and software fault prediction modeling has become an established field within the product-specific category. Most software fault prediction studies build fault predictors from previous fault data. However, there are cases in which previous fault data are not available. In this study, we propose a novel software fault prediction approach that can be used in the absence of fault data. The technique is fully automated: it does not require an expert during the prediction process, and, unlike the K-means clustering method, it does not require the number of clusters to be specified before the clustering phase. Software metrics thresholds remove the need for an expert. The technique first applies the X-means clustering method to cluster modules and identify the best number of clusters. The mean vector of each cluster is then checked against the metrics thresholds vector, and a cluster is predicted as fault-prone if at least one metric of its mean vector is higher than the threshold value of that metric. Three datasets, collected from a Turkish white-goods manufacturer developing embedded controller software, were used in the experimental studies. The experiments showed that unsupervised software fault prediction can be fully automated and that effective results can be achieved by using the X-means clustering method and software metrics thresholds.
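The labeling rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that cluster assignments are already available from a clustering step (X-means in the chapter, which is not reproduced here), and the function names, metric columns, module values, and threshold values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def cluster_means(rows: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> Dict[int, List[float]]:
    """Return the per-metric mean vector of each cluster."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in grouped.items()
    }


def fault_prone_clusters(means: Dict[int, List[float]],
                         thresholds: Sequence[float]) -> Dict[int, bool]:
    """Flag a cluster when at least one mean metric exceeds its threshold."""
    return {
        label: any(m > t for m, t in zip(mean, thresholds))
        for label, mean in means.items()
    }


# Hypothetical data: one row per module, one column per metric
# (for example lines of code, cyclomatic complexity, operand count).
modules = [
    [12.0, 2.0, 30.0],
    [10.0, 3.0, 25.0],
    [80.0, 14.0, 400.0],
    [95.0, 9.0, 350.0],
]
# Assumed output of the clustering step: one cluster id per module.
labels = [0, 0, 1, 1]
# Hypothetical thresholds, not the values used in the chapter.
thresholds = [65.0, 10.0, 300.0]

means = cluster_means(modules, labels)
verdict = fault_prone_clusters(means, thresholds)
# Every module inherits the prediction of the cluster it belongs to.
predictions = [verdict[label] for label in labels]
```

In this sketch every module receives the label of its cluster, so the module with complexity 9.0 is still flagged because the mean of its cluster exceeds the thresholds.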

References

Arisholm, E., Briand, L.C., Johannessen, E.B.: A systematic and comprehensive investigation of methods to build and evaluate fault prediction models. J. Syst. Softw. 83(1):2–17 (January 2010)
Berkhin, P.: Survey of clustering data mining techniques. Technical Report, Accrue Software, San Jose, CA (2002)
Bingbing, Y., Qian, Y., Shengyong, X., Ping, G.: Software quality prediction using affinity propagation algorithm. IJCNN – International Joint Conference on Neural Networks, pp. 1891–1896 (2008)
Catal, C., Sevim, U., Diri, B.: Clustering and metrics thresholds based software fault prediction of unlabeled program modules. Proceedings of the Sixth International Conference on Information Technology: New Generations, pp. 199–204 (2009)
Gan, G., Ma, C., Wu, J.: Data Clustering: Theory, Algorithms, and Applications. Society for Industrial and Applied Mathematics, Philadelphia (2007)
Khoshgoftaar, T.M., Seliya, N., Sundaresh, N.: An empirical study of predicting software faults with case-based reasoning. Softw. Qual. J. 14(2):85–111 (2006)
Menzies, T., Greenwald, J., Frank, A.: Data mining static code attributes to learn defect predictors. IEEE Trans. Softw. Eng. 32(1):2–13 (2007)
Pelleg, D., Moore, A.: X-means: Extending K-means with efficient estimation of the number of clusters. Proceedings of the 17th International Conference on Machine Learning, pp. 727–734 (2000)
Seliya, N., Khoshgoftaar, T.M.: Software quality analysis of unlabeled program modules with semi-supervised clustering. IEEE Trans. Syst. Man Cyb A: Syst. Humans 37(2):201–211 (2007)
Shatnawi, R., Li, W., Swain, J., Newman, T.: Finding software metrics threshold values using ROC curves. J. Softw. Maint. Evol.: Res. Pract. (14 Apr 2009, Published Online)
Tian, J.: Software Quality Engineering: Testing, Quality Assurance, and Quantifiable Improvement. Wiley, New York (2005)
Xu, R., Wunsch, D.: Survey of clustering algorithms. IEEE Trans. Neural Networ. 16(3):645–678 (2005)
Zhong, S., Khoshgoftaar, T.M., Seliya, N.: Unsupervised learning for expert-based software quality estimation. Proceedings of the 8th International Symposium on High Assurance Systems Engineering, Tampa, FL, pp. 149–155 (2004)

Acknowledgement

This project is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under Grant 107E213.

Author information

Authors and Affiliations

TUBITAK-Marmara Research Center, Information Technologies Institute, Gebze, Kocaeli, Turkey
Cagatay Catal & Ugur Sevim
Department of Computer Engineering, Yildiz Technical University, Besiktas, Istanbul, Turkey
Banu Diri

Corresponding author

Correspondence to Cagatay Catal.

Editor information

Editors and Affiliations

International Association of Engineers, Hung To Road 37-39, Hong Kong, Hong Kong/PR China
Sio-Iong Ao
School of Engineering, Dept. Process & Systems Engineering, Cranfield University, Cranfield, Beds., MK43 0AL, United Kingdom
Len Gelman

About this chapter

Cite this chapter

Catal, C., Sevim, U., Diri, B. (2010). Metrics-Driven Software Quality Prediction Without Prior Fault Data. In: Ao, SI., Gelman, L. (eds) Electronic Engineering and Computing Technology. Lecture Notes in Electrical Engineering, vol 60. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-8776-8_17

DOI: https://doi.org/10.1007/978-90-481-8776-8_17
Published: 24 February 2010
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-8775-1
Online ISBN: 978-90-481-8776-8
eBook Packages: Engineering, Engineering (R0)
