An in-depth study of the potentially confounding effect of class size in fault prediction

Published: 20 February 2014

Abstract

Background. The extent of the potentially confounding effect of class size in the fault prediction context is unclear, as are the methods for removing this effect and the influence of its removal on the performance of fault-proneness prediction models.

Objective. We aim to provide an in-depth understanding of the effect of class size on the true associations between object-oriented metrics and fault-proneness.

Method. We first employ statistical methods to examine the extent of the potentially confounding effect of class size in the fault prediction context. We then propose a linear regression-based method to remove this confounding effect. Finally, we empirically investigate whether this removal improves the prediction performance of fault-proneness prediction models.

Results. Based on open-source software systems, we found that: (a) the confounding effect of class size on the associations between object-oriented metrics and fault-proneness generally exists; (b) the proposed linear regression-based method can effectively remove the confounding effect; and (c) after removing the confounding effect, the prediction performance of fault prediction models, with respect to both ranking and classification, can in general be significantly improved.

Conclusion. The confounding effect of class size should be removed when building fault prediction models.
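The regression-based removal described in the abstract can be sketched as follows. This is an illustrative sketch only, not the paper's exact procedure: the function name, the use of ordinary least squares on log-transformed size, and the sample data are all assumptions. The idea is to regress each object-oriented metric on class size and use the residuals, which are by construction linearly uncorrelated with size, as the size-adjusted metric:

```python
import math
from statistics import mean

def remove_size_confounding(metric, size):
    # Ordinary least squares of the metric on log(class size).
    # The residuals have zero mean and zero linear correlation with
    # log-size, so they serve as the size-adjusted metric values.
    x = [math.log(s) for s in size]
    mx, my = mean(x), mean(metric)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, metric))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, metric)]

# Hypothetical data: a coupling-like metric (e.g., CBO) that tends to
# grow with class size (here, source lines of code).
sloc = [120.0, 45.0, 300.0, 15.0, 800.0, 60.0]
cbo = [8.0, 4.0, 11.0, 2.0, 15.0, 5.0]
cbo_adjusted = remove_size_confounding(cbo, sloc)
```

The adjusted values can then replace the raw metric when fitting a fault-proneness model, so that any remaining association with faults is not attributable to size alone.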



Published in

ACM Transactions on Software Engineering and Methodology, Volume 23, Issue 1 (February 2014), 354 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/2582050

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

      Publication History

      • Published: 20 February 2014
      • Accepted: 1 May 2013
      • Revised: 1 April 2013
      • Received: 1 October 2011

      Qualifiers

      • research-article
      • Research
      • Refereed
