ABSTRACT
We have long conducted code clone analysis with industry collaborators, and they have repeatedly asked us one question: "OK, I understand my system contains a lot of code clones, but how slim will it be after merging redundant code clones?" As a software system evolves over a long period, it accumulates code clones through quick bug fixes and new feature additions. Industry collaborators recognize the decay of the initial design simplicity and try to evaluate the current system from the viewpoint of maintenance effort and cost. The estimated code size after merging code clones is an important input to that evaluation. In this paper, we formulate this issue as the "slimming" problem and present three slimming methods, the Basic, Complete, and Heuristic Methods, which give a lower bound, an upper bound, and a modest reduction rate, respectively. Applying these methods to OSS systems written in C/C++ showed a reduction rate of at most 5.7% of the total size; for a commercial COBOL system, it was at most 15.4%. We have received initial but very positive feedback on this approach from industry collaborators.
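As a rough illustration of the kind of estimate discussed above, the following sketch computes a reduction rate under a deliberately naive assumption: every clone set is merged into one shared copy, and each former instance is replaced by a short call. This is not the paper's Basic, Complete, or Heuristic Method. The `CloneSet` type, the `call_cost` parameter, the non-overlap assumption, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloneSet:
    """Hypothetical record: a group of identical code fragments."""
    fragment_loc: int  # lines of code in one fragment
    instances: int     # number of copies in the system


def slimmed_size(total_loc, clone_sets, call_cost=1):
    """Estimate LOC after merging each clone set into a single copy.

    Assumption: clone sets do not overlap, and each instance is
    replaced by `call_cost` lines (a call to the shared copy).
    """
    saved = 0
    for cs in clone_sets:
        before = cs.fragment_loc * cs.instances
        after = cs.fragment_loc + call_cost * cs.instances
        # Merge only when it actually shrinks the code.
        saved += max(0, before - after)
    return total_loc - saved


def reduction_rate(total_loc, clone_sets, call_cost=1):
    """Fraction of the original size removed by merging."""
    slim = slimmed_size(total_loc, clone_sets, call_cost)
    return (total_loc - slim) / total_loc


if __name__ == "__main__":
    # Made-up input: a 1000-line system with two clone sets.
    sets = [
        CloneSet(fragment_loc=20, instances=3),
        CloneSet(fragment_loc=2, instances=2),
    ]
    print(slimmed_size(1000, sets))
    print(f"{reduction_rate(1000, sets):.1%}")
```

Note that the second clone set contributes nothing here: two-line fragments cost as much to call as to keep, so merging them would not slim the system.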
Index Terms
- How slim will my system be?: estimating refactored code size by merging clones