The role of non-exact replications in software engineering experiments


Abstract

In no science or engineering discipline does it make sense to speak of isolated experiments. The results of a single experiment cannot be viewed as representative of the underlying reality. Experiment replication is the repetition of an experiment to double-check its results. Multiple replications of an experiment increase the confidence in its results. Software engineering has attempted the identical (exact) replication of experiments in the manner of the natural sciences (physics, chemistry, etc.). After numerous attempts over the years, apart from experiments replicated by the same researchers at the same site, no exact replications have yet been achieved. One key reason is the complexity of the software development setting, which prevents the many experimental conditions from being reproduced identically. This paper reports research into whether non-exact replications can be of any use. We propose a process aimed at researchers running non-exact replications. Researchers enacting this process will be able to identify new variables that may be affecting experiment results. The process consists of four phases: replication definition and planning, replication operation and analysis, replication interpretation, and analysis of the replication’s contribution. To test the effectiveness of the proposed process, we conducted a multiple-case study that reveals the variables learned from two different replications of an experiment.
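
The four-phase process lends itself to simple bookkeeping during a replication. Below is a minimal, illustrative sketch in Python of how a researcher might log a non-exact replication's progress through the four phases and record the candidate variables each phase surfaces; the names (Phase, ReplicationRecord, the example site and variables) are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, List


class Phase(Enum):
    # The four phases named in the abstract.
    DEFINITION_AND_PLANNING = 1
    OPERATION_AND_ANALYSIS = 2
    INTERPRETATION = 3
    CONTRIBUTION_ANALYSIS = 4


@dataclass
class ReplicationRecord:
    baseline_experiment: str
    site: str
    changed_conditions: List[str] = field(default_factory=list)   # deliberate or unavoidable changes w.r.t. the baseline
    candidate_variables: List[str] = field(default_factory=list)  # variables suspected of affecting the results
    completed_phases: List[Phase] = field(default_factory=list)

    def complete(self, phase: Phase, findings: Iterable[str] = ()) -> None:
        """Mark a phase as finished and note any candidate variables it surfaced."""
        self.completed_phases.append(phase)
        self.candidate_variables.extend(findings)


# Usage: a hypothetical replication run at a different site, where subject
# experience emerges during interpretation as a variable that may explain
# divergent effectiveness results.
rep = ReplicationRecord(
    baseline_experiment="baseline testing-techniques experiment",
    site="replicating site",
    changed_conditions=["subject experience level"],
)
rep.complete(Phase.DEFINITION_AND_PLANNING)
rep.complete(Phase.OPERATION_AND_ANALYSIS)
rep.complete(Phase.INTERPRETATION, findings=["subject experience"])
rep.complete(Phase.CONTRIBUTION_ANALYSIS)
print(rep.candidate_variables)  # ['subject experience']
```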


Notes

  1. Note that experience is a context variable in both the baseline experiment and the replication, as neither of the two explores its relationship to the response variable. If there is reason to suspect that it has an influence (identified, for example, using the process proposed here), an experiment can be designed to test the hypothesis that there is a relationship between technique effectiveness and experience.

  2. Note that this is what experimenters expect to happen, and it will need to be confirmed by the results of the replication. Even if the results of the replication seem to corroborate the possible influence of this variable, it will still have to be further explored in other experiments.

  3. A subject applying a testing technique may generate a test case that exercises a fault and produces a failure, but the tester may fail to identify and report the failure. For this reason, a distinction is made between detected defects and reported defects.

  4. Running a replication in the same country where the UPM experiment was run increases the likelihood that many of the experimental conditions will be the same.

  5. Note that leaving out a technique does not have an effect on the values for the response variables of the other two techniques. Therefore, the replications are comparable.

  6. In any case, this change affects the reported defects response variable, which has been left out of this study.

  7. Note that leaving out a response variable does not have an effect on the values for the other response variables of the experiment. Therefore, the replications are comparable.

  8. As subject motivation increases, so does effectiveness.

  9. The more pressure the subject works under, the less effective the application of the technique is.


Acknowledgments

We would like to thank the reviewers for their thorough and insightful comments on this paper; they have unquestionably helped to improve this work. We would also like to thank Óscar Dieste for sharing with us his deep knowledge of meta-analysis, and for the fruitful conversation on random variations among experiments’ results. This work was funded by research grant TIN2008-00555 of the Spanish Ministry of Science and Innovation.

Author information

Corresponding author

Correspondence to Sira Vegas.

Additional information

Editor: James Miller


About this article

Cite this article

Juristo, N., Vegas, S. The role of non-exact replications in software engineering experiments. Empir Software Eng 16, 295–324 (2011). https://doi.org/10.1007/s10664-010-9141-9
