Supporting Controlled Experimentation with Testing Techniques: An Infrastructure and its Potential Impact

Abstract

Where the creation, understanding, and assessment of software testing and regression testing techniques are concerned, controlled experimentation is an indispensable research methodology. Obtaining the infrastructure necessary to support such experimentation, however, is difficult and expensive. As a result, progress in experimentation with testing techniques has been slow, and empirical data on the costs and effectiveness of techniques remains relatively scarce. To help address this problem, we have been designing and constructing infrastructure to support controlled experimentation with testing and regression testing techniques. This paper reports on the challenges faced by researchers experimenting with testing techniques, including those that inform the design of our infrastructure. The paper then describes the infrastructure that we are creating in response to these challenges, and that we are now making available to other researchers, and discusses the impact that this infrastructure has had and can be expected to have.

Author information

Corresponding author

Correspondence to Hyunsook Do.

Additional information

Editor: Forrest Shull and Natalia Juristo

About this article

Cite this article

Do, H., Elbaum, S. & Rothermel, G. Supporting Controlled Experimentation with Testing Techniques: An Infrastructure and its Potential Impact. Empir Software Eng 10, 405–435 (2005). https://doi.org/10.1007/s10664-005-3861-2
