Analyzing the co-evolution of comments and source code

Software Quality Journal

Abstract

Source code comments are a valuable instrument to preserve design decisions and to communicate the intent of the code to programmers and maintainers. Nevertheless, commenting source code and keeping comments up-to-date is often neglected for reasons of time or programmers' obliviousness. In this paper, we investigate whether developers comment their code and to what extent they add or adapt comments when they evolve the code. We present an approach to associate comments with source code entities to track their co-evolution over multiple versions. A set of heuristics is used to decide whether a comment is associated with its preceding or its succeeding source code entity. We analyzed the co-evolution of code and comments in eight different open source and closed source software systems. We found with statistical significance that (1) the relative amount of comments and source code grows at about the same rate; (2) the type of a source code entity, such as a method declaration or an if-statement, has a significant influence on whether or not it gets commented; (3) in six out of the eight systems, code and comments co-evolve in 90% of the cases; and (4) surprisingly, API changes and comments do not co-evolve; instead, the changed APIs are re-documented in a later revision. As a result, our approach enables a quantitative assessment of the commenting process in a software system. We can, therefore, leverage the results to provide feedback during development to increase the awareness of when to add comments or when to adapt comments because of source code changes.
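To make the idea of associating a comment with its preceding or succeeding entity concrete, the following is a minimal sketch and not the heuristics used in the paper, which the abstract does not spell out. The types `Entity` and `Comment`, the function `associate`, and both rules (a comment starting on the last line of the preceding entity belongs to it; otherwise the entity with the smaller line distance wins, with ties going to the succeeding entity) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """A source code entity occupying an inclusive line range (hypothetical type)."""
    name: str
    start: int
    end: int


@dataclass(frozen=True)
class Comment:
    """A comment occupying an inclusive line range (hypothetical type)."""
    start: int
    end: int


def associate(comment: Comment,
              preceding: Optional[Entity],
              succeeding: Optional[Entity]) -> Optional[Entity]:
    """Return the entity the comment most plausibly belongs to.

    Illustrative rules only (assumed, not taken from the paper):
      1. If only one neighbor exists, use it.
      2. A comment that starts on the last line of the preceding entity
         is treated as a trailing comment of that entity.
      3. Otherwise the neighbor with the smaller line distance wins;
         ties go to the succeeding entity.
    """
    if preceding is None:
        return succeeding
    if succeeding is None:
        return preceding
    if comment.start == preceding.end:
        return preceding
    distance_before = comment.start - preceding.end
    distance_after = succeeding.start - comment.end
    return preceding if distance_before < distance_after else succeeding
```

With associations like these recorded per revision, one could then compare, for each entity, whether a code change and a comment change occur in the same revision; how the paper actually measures co-evolution is not described in the abstract.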

Notes

  1. http://www.eclipse.org.

  2. http://wiki.eclipse.org/Coding_Conventions.

  3. Compared to the hypotheses of Studies 1 and 2, this is an assumption rather than a statistical hypothesis. Nevertheless, we use the term hypothesis to keep the organization of the empirical studies consistent.

  4. The detailed results of the other study are not available for publication.

Acknowledgements

This work was supported by the Hasler Foundation as part of the ProMedServices—Proactive Software Service Improvements project. The authors would like to thank the reviewers for their insightful suggestions that greatly helped to improve the paper.

Author information

Corresponding author

Correspondence to Beat Fluri.

About this article

Cite this article

Fluri, B., Würsch, M., Giger, E. et al. Analyzing the co-evolution of comments and source code. Software Qual J 17, 367–394 (2009). https://doi.org/10.1007/s11219-009-9075-x

  • DOI: https://doi.org/10.1007/s11219-009-9075-x
