DOI: 10.1145/3196321.3196353
research-article

How slim will my system be?: estimating refactored code size by merging clones

Published: 28 May 2018

ABSTRACT

We have long performed code clone analysis with industry collaborators, and they have repeatedly asked us the same question: "OK, I understand my system contains a lot of code clones, but how slim will it be after merging redundant code clones?" As a software system evolves over a long period, it accumulates code clones through quick bug fixes and new feature additions. Industry collaborators recognize the decay of the initial design simplicity and try to evaluate the current system from the viewpoint of maintenance effort and cost. As one resource for that evaluation, an estimate of the code size after merging code clones is very important to them. In this paper, we formulate this issue as the "slimming" problem and present three slimming methods, the Basic, Complete, and Heuristic Methods, which give a lower bound, an upper bound, and a modest estimate of the reduction rate, respectively. Applying these methods to OSS systems written in C/C++ showed that the reduction rate is at most 5.7% of the total size; for a commercial COBOL system, it is at most 15.4%. We have received initial but very positive feedback on this approach from industry collaborators.
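To make the notion of a reduction rate concrete, the following is a minimal sketch of one naive way to estimate it. It is not the paper's Basic, Complete, or Heuristic Method, whose details the abstract does not give. The `CloneSet` structure, the `call_overhead` parameter, and the assumption that clone sets do not overlap are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CloneSet:
    """A group of duplicated fragments (hypothetical structure)."""
    fragment_lines: int  # length of each cloned fragment, in lines
    instances: int       # number of copies found in the system


def estimated_reduction_rate(total_lines, clone_sets, call_overhead=1):
    """Estimate the fraction of total_lines saved by merging each clone set.

    Assumption: every clone set is merged into one shared copy, and each
    original instance is replaced by `call_overhead` lines (e.g. a call).
    Clone sets are assumed not to overlap one another.
    """
    saved = 0
    for cs in clone_sets:
        before = cs.fragment_lines * cs.instances
        after = cs.fragment_lines + cs.instances * call_overhead
        # Merging is skipped when it would not shrink the code.
        saved += max(0, before - after)
    return saved / total_lines


rate = estimated_reduction_rate(
    total_lines=1000,
    clone_sets=[CloneSet(fragment_lines=10, instances=3),
                CloneSet(fragment_lines=2, instances=2)],
)
print(f"{rate:.1%}")
```

In this toy input the second clone set contributes nothing, because replacing two 2-line fragments with one shared copy plus two calls saves no lines. This is one reason a raw clone ratio can overstate how much a system would actually shrink.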

Published in

ICPC '18: Proceedings of the 26th Conference on Program Comprehension
May 2018
423 pages
ISBN: 9781450357142
DOI: 10.1145/3196321

      Copyright © 2018 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States
