An in-depth study of the potentially confounding effect of class size in fault prediction

Published: 20 February 2014

Abstract

Background. The extent of the potentially confounding effect of class size in the fault prediction context is unclear, as are the methods for removing this effect and the influence of its removal on the performance of fault-proneness prediction models.

Objective. We aim to provide an in-depth understanding of the effect of class size on the true associations between object-oriented metrics and fault-proneness.

Method. We first employ statistical methods to examine the extent of the potentially confounding effect of class size in the fault prediction context. We then propose a linear regression-based method to remove this confounding effect. Finally, we empirically investigate whether this removal improves the prediction performance of fault-proneness prediction models.

Results. Based on open-source software systems, we found that: (a) the confounding effect of class size on the associations between object-oriented metrics and fault-proneness generally exists; (b) the proposed linear regression-based method can effectively remove the confounding effect; and (c) after removing the confounding effect, the prediction performance of fault prediction models, with respect to both ranking and classification, can in general be significantly improved.

Conclusion. The confounding effect of class size should be removed when building fault prediction models.
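The regression-based removal described in the abstract can be sketched as follows. This is an illustrative sketch only, not the paper's exact procedure: the function name, the use of ordinary least squares on log-transformed size, and the sample data are all assumptions. The idea is to regress each object-oriented metric on class size and use the residuals, which are by construction linearly uncorrelated with size, as the size-adjusted metric:

```python
import math
from statistics import mean

def remove_size_confounding(metric, size):
    # Ordinary least squares of the metric on log(class size).
    # The residuals have zero mean and zero linear correlation with
    # log-size, so they serve as the size-adjusted metric values.
    x = [math.log(s) for s in size]
    mx, my = mean(x), mean(metric)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, metric))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, metric)]

# Hypothetical data: a coupling-like metric (e.g., CBO) that tends to
# grow with class size (here, source lines of code).
sloc = [120.0, 45.0, 300.0, 15.0, 800.0, 60.0]
cbo = [8.0, 4.0, 11.0, 2.0, 15.0, 5.0]
cbo_adjusted = remove_size_confounding(cbo, sloc)
```

The adjusted values can then replace the raw metric when fitting a fault-proneness model, so that any remaining association with faults is not attributable to size alone.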



Published in

ACM Transactions on Software Engineering and Methodology, Volume 23, Issue 1 (February 2014), 354 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/2582050

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

      Publication History

      • Published: 20 February 2014
      • Accepted: 1 May 2013
      • Revised: 1 April 2013
      • Received: 1 October 2011

      Qualifiers

      • research-article
      • Research
      • Refereed
