Measuring the effect of clone refactoring on the size of unit test cases in object-oriented software: an empirical study

Original Paper · Innovations in Systems and Software Engineering

Abstract

This paper aims to empirically measure the effect of clone refactoring on the size of unit test cases in object-oriented software. We investigated research questions related to: (1) the impact of clone refactoring on source code attributes (particularly size, complexity and coupling) that are related to the testability of classes, (2) the impact of clone refactoring on the size of unit test cases, (3) the correlations between the variations observed after clone refactoring in both the source code attributes and the size of unit test cases, and (4) the variations after clone refactoring in the source code attributes that are most associated with the size of unit test cases. We used different metrics to quantify the considered source code attributes and the size of unit test cases. To investigate the research questions, and to develop predictive and explanatory models, we used various data analysis and modeling techniques, particularly linear regression analysis and five machine learning algorithms (C4.5, KNN, Naïve Bayes, Random Forest and Support Vector Machine). We conducted an empirical study using data collected from two open-source Java software systems (ANT and ARCHIVA) that have been clone refactored. Overall, the paper's contributions can be summarized as follows: (1) the results revealed a strong and positive correlation between code clone refactoring and reduction in the size of unit test cases, (2) we showed that code quality attributes related to the testability of classes are significantly improved when clones are refactored, (3) we observed that the size of unit test cases can be significantly reduced when clone refactoring is applied, and (4) complexity/size measures are more commonly associated with variations in the size of unit test cases than coupling measures.
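
To make the analysis pipeline outlined above concrete, the following minimal Python sketch illustrates the kind of workflow the abstract describes: correlating per-class variations (after minus before clone refactoring) in size, complexity and coupling metrics with variations in the size of the corresponding unit test cases, fitting a linear regression, and comparing the five named classifiers. The input file, column names, metric choices (LOC, WMC, CBO, TLOC) and the shrink threshold are hypothetical illustrations; the study itself relied on statistical and data mining tools such as XLSTAT and Tanagra (see Notes) rather than scikit-learn.

```python
# A minimal sketch (not the authors' scripts) of the analysis described in the abstract:
# relate per-class metric deltas observed after clone refactoring to deltas in test case size.
import pandas as pd
from scipy.stats import spearmanr
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical per-class deltas (value after refactoring minus value before).
df = pd.read_csv("clone_refactoring_deltas.csv")  # assumed input file
code_deltas = df[["delta_LOC", "delta_WMC", "delta_CBO"]]  # size, complexity, coupling
test_delta = df["delta_TLOC"]  # variation in the size of the unit test case

# Correlation between code metric variations and test-size variations (RQ3).
for metric in code_deltas.columns:
    rho, p = spearmanr(code_deltas[metric], test_delta)
    print(f"{metric}: Spearman rho = {rho:.2f}, p = {p:.3f}")

# Linear regression of test-size variation on code-metric variations.
reg = LinearRegression().fit(code_deltas, test_delta)
print("R^2 =", reg.score(code_deltas, test_delta))

# Which attribute variations are most associated with test-size reduction (RQ4):
# binary target = "test case shrank by more than a (hypothetical) threshold".
y = (test_delta < -10).astype(int)
classifiers = {
    "C4.5 (approximated by an entropy-based decision tree)": DecisionTreeClassifier(criterion="entropy"),
    "KNN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(),
    "SVM": SVC(),
}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, code_deltas, y, cv=5).mean()
    print(f"{name}: mean cross-validated accuracy = {acc:.2f}")
```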


Notes

  1. https://developers.google.com/java-dev-tools/codepro/.

  2. http://www.borland.com/.

  3. http://www.xlstat.com/.

  4. https://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html.


Acknowledgements

This work was partially supported by an NSERC (Natural Sciences and Engineering Research Council of Canada) grant.

Author information

Correspondence to Mourad Badri.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Badri, M., Badri, L., Hachemane, O. et al. Measuring the effect of clone refactoring on the size of unit test cases in object-oriented software: an empirical study. Innovations Syst Softw Eng 15, 117–137 (2019). https://doi.org/10.1007/s11334-019-00334-6
