DOI: 10.1145/1806799.1806855

Linking e-mails and source code artifacts

Published: 01 May 2010

ABSTRACT

E-mails concerning the development issues of a system constitute an important source of information about high-level design decisions, low-level implementation concerns, and the social structure of developers.

Establishing links between e-mails and the software artifacts they discuss is a non-trivial problem, due to the inherently informal nature of human communication. Different approaches can be brought into play to tackle this traceability issue, but the question of how they can be evaluated remains unaddressed, as there is no recognized benchmark against which they can be compared.

In this article we present such a benchmark, which we created through the manual inspection of a statistically significant number of e-mails pertaining to six unrelated software systems. We then use our benchmark to measure the effectiveness of a number of linking approaches, ranging from lightweight techniques based on regular expressions to full-fledged information retrieval methods.
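To make the idea of a lightweight, regular-expression-based approach concrete, here is a minimal sketch in Python. It is not the authors' implementation. The function names, the sample e-mail, the class list, and the benchmark links are all hypothetical. The sketch assumes that linking means finding a class name as a whole word in the e-mail body, and that effectiveness is measured as precision and recall against manually annotated links.

```python
import re


def link_by_class_name(email_text, class_names):
    """Return the set of class names that occur as whole words in the e-mail.

    Hypothetical helper: a case-sensitive, whole-word match is only one of
    many possible lightweight linking rules.
    """
    linked = set()
    for name in class_names:
        # \b word boundaries keep "List" from matching inside "ArrayList".
        if re.search(r"\b" + re.escape(name) + r"\b", email_text):
            linked.add(name)
    return linked


def precision_recall(predicted, expected):
    """Compare predicted links with the manually annotated (benchmark) links."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical data, invented for illustration only.
email = "The NPE comes from ArrayList.add when called from the Parser class."
classes = ["ArrayList", "List", "Parser", "Lexer"]
benchmark_links = {"ArrayList", "Parser", "Lexer"}  # assumed manual annotation

predicted = link_by_class_name(email, classes)
precision, recall = precision_recall(predicted, benchmark_links)
print(sorted(predicted), precision, recall)
```

In this toy case the matcher links the e-mail to `ArrayList` and `Parser` but misses `Lexer`, which the assumed annotation lists without the name appearing in the text. Precision is therefore perfect and recall is two thirds. A benchmark makes misses of this kind measurable, so that different techniques can be compared on the same data.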

  • Published in

    ICSE '10: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
    May 2010
    627 pages
    ISBN: 9781605587196
    DOI: 10.1145/1806799

    Copyright © 2010 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article

    Acceptance Rates

    Overall Acceptance Rate: 276 of 1,856 submissions, 15%
