Abstract
Commit messages are the atomic level of software documentation. They provide a natural language description of a code change and its purpose, and they are critical for software maintenance and program comprehension. In contrast to the documentation of feature updates and bug fixes, little is known about how developers document their refactoring activities. Developers can perform multiple refactoring operations, including moving methods, extracting classes, and renaming attributes, for various reasons, such as improving software quality, managing technical debt, and removing defects. Yet, there is no systematic study that analyzes the extent to which the documentation of refactoring accurately describes the refactoring operations performed at the source code level. Therefore, this paper challenges the ability of refactoring documentation, written in commit messages, to adequately predict the refactoring types performed at the commit level. Our analysis relies on text mining of commit messages to extract the features (i.e., keywords) that best represent each class (i.e., refactoring type). Extracting the text patterns specific to each refactoring type (e.g., rename, extract, move, inline) allows the design of a model that verifies the consistency of these patterns with their corresponding refactoring. Such a verification process can be achieved by automatically predicting, for a given commit, the method-level type of refactoring being applied, namely Extract Method, Inline Method, Move Method, Pull-up Method, Push-down Method, and Rename Method. We compared various classifiers, and a baseline keyword-based approach, in terms of their prediction performance, using a dataset of 5004 commits. Our main findings show that the complexity of refactoring type prediction varies from one type to another. Rename Method and Extract Method were found to be the best documented refactoring activities, while Pull-up Method and Push-down Method were the hardest to identify via textual descriptions. These findings draw developers' attention to the need to document these refactoring types more carefully.
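To make the idea of a keyword-based baseline concrete, the following is a minimal sketch of how a commit message could be mapped to one of the six method-level refactoring types by counting keyword matches. It is an illustration only, not the approach evaluated in the paper: the keyword lists, the `KEYWORDS` table, and the function name `predict_refactoring_type` are assumptions introduced here, since the abstract does not list the mined keywords.

```python
import re

# Hypothetical keyword lists, chosen for illustration only; the keywords
# actually mined in the study are not given in the abstract.
KEYWORDS = {
    "Extract Method": ["extract", "split"],
    "Inline Method": ["inline"],
    "Move Method": ["move", "relocate"],
    "Pull-up Method": ["pull up", "pull-up", "superclass"],
    "Push-down Method": ["push down", "push-down", "subclass"],
    "Rename Method": ["rename"],
}


def predict_refactoring_type(message):
    """Return the refactoring type whose keywords occur most often in the
    commit message, or None when no keyword matches at all."""
    text = message.lower()
    scores = {}
    for label, words in KEYWORDS.items():
        # Match each keyword at a word start, so "rename" also hits "renamed".
        scores[label] = sum(
            len(re.findall(r"\b" + re.escape(word), text)) for word in words
        )
    # On a tie, max() keeps the first label in KEYWORDS order.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    for msg in ["Renamed getFoo to fetchFoo",
                "Extract helper method from parse()",
                "Fix typo in README"]:
        print(msg, "->", predict_refactoring_type(msg))
```

A baseline of this kind only succeeds when the developer names the operation explicitly. A commit described in other words, or not described at all, returns `None`.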
Notes
https://github.com/bekvon/residence/commit/76c364ea47e5a28b2041a0bb3323cb48bab180c9 (last checked 2020/06/20).
Commit extracted from sage-bionetworks/schema-to-pojo.
Acknowledgements
This material is based on work supported by the National Science Foundation under Grant No. 1757680.
Cite this article
AlOmar, E.A., Liu, J., Addo, K. et al. On the documentation of refactoring types. Autom Softw Eng 29, 9 (2022). https://doi.org/10.1007/s10515-021-00314-w
DOI: https://doi.org/10.1007/s10515-021-00314-w