ABSTRACT
The research community in Software Engineering, and in Software Testing in particular, builds many of its contributions on a set of mutually shared expectations. Although they form the basis of many publications as well as open-source and commercial testing applications, these common expectations and beliefs are rarely questioned. For example, Frederick Brooks’ claim that testing takes half of the development time appears to have taken root in the community since he first made it in “The Mythical Man-Month” in 1975. In this paper, we report the surprising results of a large-scale field study with 416 software engineers whose development activity we closely monitored over the course of five months, resulting in over 13 years of recorded work time in their integrated development environments (IDEs). Our findings question several commonly shared assumptions and beliefs about testing and may be contributing factors to the observed bug proneness of software in practice: the majority of developers in our study do not test; developers rarely run their tests in the IDE; Test-Driven Development (TDD) is not widely practiced; and, last but not least, software developers spend only a quarter of their work time engineering tests, whereas they believe they spend half of their time testing.
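To make the headline figure concrete, the sketch below shows one way a "share of work time spent on testing" could be derived from IDE activity logs. It is a minimal illustration under assumptions introduced here, not the paper's actual instrumentation: the interval-based log format, the `testing_share` function, and the xUnit-style file-naming heuristic in `is_test_file` are all hypothetical.

```python
# Hypothetical sketch: estimating the share of IDE time spent on testing
# from a stream of (start, end, file) activity intervals. The interval
# format and the test-file naming heuristic are assumptions, not the
# study's actual data model.
from datetime import datetime

def is_test_file(path: str) -> bool:
    """Heuristic: treat files following the xUnit naming convention as tests."""
    name = path.rsplit("/", 1)[-1]
    return name.startswith("Test") or name.endswith(("Test.java", "Tests.java"))

def testing_share(intervals) -> float:
    """intervals: iterable of (start: datetime, end: datetime, path: str)."""
    test_seconds = total_seconds = 0.0
    for start, end, path in intervals:
        duration = (end - start).total_seconds()
        total_seconds += duration
        if is_test_file(path):
            test_seconds += duration
    return test_seconds / total_seconds if total_seconds else 0.0

if __name__ == "__main__":
    fmt = "%Y-%m-%dT%H:%M:%S"
    log = [  # 45 min on production code, 15 min on a test class
        (datetime.strptime("2015-03-16T09:00:00", fmt),
         datetime.strptime("2015-03-16T09:45:00", fmt), "src/Parser.java"),
        (datetime.strptime("2015-03-16T09:45:00", fmt),
         datetime.strptime("2015-03-16T10:00:00", fmt), "test/ParserTest.java"),
    ]
    print(f"testing share: {testing_share(log):.0%}")  # prints: testing share: 25%
```

In practice such an estimate would need to handle idle time, files that mix test and production concerns, and activity outside the editor (e.g., test runs), which is why log-based measurements and developer self-estimates can diverge so sharply.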
REFERENCES
- P. Runeson, “A survey of unit testing practices,” IEEE Software, vol. 23, no. 4, pp. 22–29, 2006.
- A. Begel and T. Zimmermann, “Analyze this! 145 questions for data scientists in software engineering,” in Proceedings of the International Conference on Software Engineering (ICSE), pp. 12–13, ACM, 2014.
- L. S. Pinto, S. Sinha, and A. Orso, “Understanding myths and realities of test-suite evolution,” in Proceedings of the Symposium on the Foundations of Software Engineering (FSE), pp. 33:1–33:11, ACM, 2012.
- A. Zaidman, B. Van Rompaey, A. van Deursen, and S. Demeyer, “Studying the co-evolution of production and test code in open source and industrial developer test processes through repository mining,” Empirical Software Engineering, vol. 16, no. 3, pp. 325–364, 2011.
- A. Bertolino, “Software testing research: Achievements, challenges, dreams,” in Proceedings of the International Conference on Software Engineering (ICSE), Workshop on the Future of Software Engineering (FOSE), pp. 85–103, 2007.
- F. Brooks, The Mythical Man-Month. Addison-Wesley, 1975.
- G. Meszaros, xUnit Test Patterns: Refactoring Test Code. Addison-Wesley, 2007.
- R. L. Glass, R. Collard, A. Bertolino, J. Bach, and C. Kaner, “Software testing and industry needs,” IEEE Software, vol. 23, no. 4, pp. 55–57, 2006.
- A. Bertolino, “The (im)maturity level of software testing,” SIGSOFT Software Engineering Notes, vol. 29, pp. 1–4, Sept. 2004.
- J. Rooksby, M. Rouncefield, and I. Sommerville, “Testing in the wild: The social and organisational dimensions of real world practice,” Computer Supported Cooperative Work, vol. 18, pp. 559–580, Dec. 2009.
- P. Runeson, M. Höst, A. Rainer, and B. Regnell, Case Study Research in Software Engineering: Guidelines and Examples. Wiley, 2012.
- M. Beller, G. Gousios, and A. Zaidman, “How (much) do developers test?,” in Proceedings of the 37th International Conference on Software Engineering (ICSE), NIER Track, pp. 559–562, IEEE, 2015.
- P. Muntean, C. Eckert, and A. Ibing, “Context-sensitive detection of information exposure bugs with symbolic execution,” in Proceedings of the International Workshop on Innovative Software Development Methodologies and Practices (InnoSWDev), pp. 84–93, ACM, 2014.
- S. S. Shapiro and M. B. Wilk, “An analysis of variance test for normality (complete samples),” Biometrika, vol. 52, no. 3-4, pp. 591–611, 1965.
- J. L. Devore and N. Farnum, Applied Statistics for Engineers and Scientists. Duxbury, 1999.
- W. G. Hopkins, A New View of Statistics. 1997. http://newstatsi.org, accessed 16 March 2015.
- V. I. Levenshtein, “Binary codes capable of correcting deletions, insertions, and reversals,” Soviet Physics Doklady, vol. 10, pp. 707–710, 1966.
- J. C. Munson and S. G. Elbaum, “Code churn: A measure for estimating the impact of code change,” in Proceedings of the International Conference on Software Maintenance (ICSM), p. 24, IEEE, 1998.
- K. Beck, Test-Driven Development: By Example. Addison-Wesley, 2003.
- H. Munir, K. Wnuk, K. Petersen, and M. Moayyed, “An experimental evaluation of test driven development vs. test-last development with industry professionals,” in Proceedings of the International Conference on Evaluation and Assessment in Software Engineering (EASE), pp. 50:1–50:10, ACM, 2014.
- Y. Rafique and V. B. Mišić, “The effects of test-driven development on external quality and productivity: A meta-analysis,” IEEE Transactions on Software Engineering, vol. 39, no. 6, pp. 835–856, 2013.
- J. E. Hopcroft, R. Motwani, and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation. Prentice Hall, 2007.
- G. Rothermel and S. Elbaum, “Putting your best tests forward,” IEEE Software, vol. 20, pp. 74–77, Sept. 2003.
- A. Patterson, M. Kölling, and J. Rosenberg, “Introducing unit testing with BlueJ,” ACM SIGCSE Bulletin, vol. 35, pp. 11–15, June 2003.
- M. Beller, A. Bacchelli, A. Zaidman, and E. Juergens, “Modern code reviews in open-source projects: Which problems do they fix?,” in Proceedings of the Working Conference on Mining Software Repositories (MSR), pp. 202–211, ACM, 2014.
- E. Derby, D. Larsen, and K. Schwaber, Agile Retrospectives: Making Good Teams Great. Pragmatic Bookshelf, 2006.
- C. Marsavina, D. Romano, and A. Zaidman, “Studying fine-grained co-evolution patterns of production and test code,” in Proceedings of the International Working Conference on Source Code Analysis and Manipulation (SCAM), pp. 195–204, IEEE, 2014.
- M. Gligoric, S. Negara, O. Legunsen, and D. Marinov, “An empirical evaluation and comparison of manual and automated test selection,” in Proceedings of the ACM/IEEE International Conference on Automated Software Engineering (ASE), pp. 361–372, ACM, 2014.
- L. Ponzanelli, G. Bavota, M. Di Penta, R. Oliveto, and M. Lanza, “Mining StackOverflow to turn the IDE into a self-confident programming prompter,” in Proceedings of the Working Conference on Mining Software Repositories (MSR), pp. 102–111, ACM, 2014.
- A. Clauset, C. R. Shalizi, and M. E. Newman, “Power-law distributions in empirical data,” SIAM Review, vol. 51, no. 4, pp. 661–703, 2009.
- J. G. Adair, “The Hawthorne effect: A reconsideration of the methodological artifact,” Journal of Applied Psychology, vol. 69, no. 2, pp. 334–345, 1984.
- L. Hattori and M. Lanza, “Syde: A tool for collaborative software development,” in Proceedings of the International Conference on Software Engineering (ICSE), pp. 235–238, ACM, 2010.
- R. Robbes and M. Lanza, “Spyware: A change-aware development toolset,” in Proceedings of the International Conference on Software Engineering (ICSE), pp. 847–850, ACM, 2008.
- S. Negara, N. Chen, M. Vakilian, R. E. Johnson, and D. Dig, “A comparative study of manual and automated refactorings,” in Proceedings of the European Conference on Object-Oriented Programming (ECOOP), 2013.
- R. Minelli, A. Mocci, M. Lanza, and L. Baracchi, “Visualizing developer interactions,” in Proceedings of the Working Conference on Software Visualization (VISSOFT), pp. 147–156, IEEE, 2014.
- P. Kochhar, T. Bissyandé, D. Lo, and L. Jiang, “An empirical study of adoption of software testing in open source projects,” in Proceedings of the International Conference on Quality Software (QSIC), pp. 103–112, IEEE, 2013.
- T. D. LaToza, G. Venolia, and R. DeLine, “Maintaining mental models: A study of developer work habits,” in Proceedings of the International Conference on Software Engineering (ICSE), pp. 492–501, ACM, 2006.
- R. Pham, S. Kiesling, O. Liskin, L. Singer, and K. Schneider, “Enablers, inhibitors, and perceptions of testing in novice software teams,” in Proceedings of the International Symposium on Foundations of Software Engineering (FSE), pp. 30–40, ACM, 2014.
- A. N. Meyer, T. Fritz, G. C. Murphy, and T. Zimmermann, “Software developers’ perceptions of productivity,” in Proceedings of the International Symposium on Foundations of Software Engineering (FSE), pp. 19–29, ACM, 2014.
- P. D. Marinescu, P. Hosek, and C. Cadar, “Covrig: A framework for the analysis of code, test, and coverage evolution in real software,” in Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), pp. 93–104, ACM, 2014.
- R. Feldt, “Do system test cases grow old?,” in Proceedings of the International Conference on Software Testing, Verification and Validation (ICST), pp. 343–352, IEEE, 2014.
- M. Greiler, A. van Deursen, and M. Storey, “Test confessions: A study of testing practices for plug-in systems,” in Proceedings of the International Conference on Software Engineering (ICSE), pp. 244–254, IEEE, 2012.
- V. Hurdugaci and A. Zaidman, “Aiding software developers to maintain developer tests,” in Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR), pp. 11–20, IEEE, 2012.