Abstract
The verification and validation activity plays a fundamental role in improving software quality. Determining which techniques are most effective for carrying out this activity has been an aspiration of experimental software engineering researchers for years. This paper reports a controlled experiment evaluating the effectiveness of two unit testing techniques: the functional testing technique known as equivalence partitioning (EP) and the control-flow structural testing technique known as branch testing (BT). The experiment is a literal replication of Juristo et al. (2013). Both experiments serve the purpose of determining whether the effectiveness of BT and EP varies depending on whether or not the faults are visible to the technique (InScope or OutScope faults, respectively). We used the materials, design and procedures of the original experiment, but adapted the experiment to our context by: (1) reducing the number of studied techniques from three to two; (2) assigning subjects to experimental groups by means of stratified randomization to balance the influence of programming experience; (3) localizing the experimental materials; and (4) adapting the training duration. We ran the replication at the Escuela Politécnica del Ejército Sede Latacunga (ESPEL) as part of a software verification and validation course. The experimental subjects were 23 master's degree students. EP was more effective than BT at detecting InScope faults, and the session/program and group variables were found to have significant effects. BT was more effective than EP at detecting OutScope faults, and the session/program and group variables had no effect in this case. The results of the replication and the original experiment are similar with respect to testing techniques. There are some inconsistencies with respect to the group factor, which can be explained by small sample effects. The results for the session/program factor are inconsistent for InScope faults.
We believe that these differences are due to a combination of the fatigue effect and a technique × program interaction. Although we were able to reproduce the main effects, the changes to the design of the original experiment make it impossible to identify the causes of the discrepancies with certainty. We believe that further replications closely resembling the original experiment should be conducted to improve our understanding of the phenomena under study.
References
Basili VR (1992) Software modeling and measurement: the goal/question/metric paradigm. Tech. Rep. UMIACS TR-92-96, Department of Computer Science, University of Maryland, College Park
Basili VR, Selby RW (1985) Comparing the effectiveness of software testing strategies. Tech. Rep. TR-1501, Department of Computer Science, University of Maryland, College Park
Basili VR, Selby RW (1987) Comparing the effectiveness of software testing strategies. IEEE Trans Softw Eng SE-13:78–96
Brown BW (1980) The crossover experiment for clinical trials. Biometrics 36:69–70
Carver JC (2010) Towards reporting guidelines for experimental replications: a proposal. In: Proceedings of the 1st international workshop on Replication in Empirical Software Engineering Research (RESER). Cape Town, South Africa, 4 May 2010
Gómez O (2012) Tipología de Replicaciones para la Síntesis de Experimentos en Ingeniería del Software. PhD thesis, Universidad Politécnica de Madrid
Gómez O, Juristo N, Vegas S (2010) Replications types in experimental disciplines. In: Proceedings of the 2010 ACM-IEEE international symposium on empirical software engineering and measurement, no. 3. Bolzano-Bozen, Italy, pp 1–10
Graham JW, Schafer JL (1999) On the performance of multiple imputation for multivariate data with small sample size. In: Hoyle RH (ed) Statistical strategies for small sample research. Sage Publications, pp 1–29
Juristo N, Vegas S (2003) Functional testing, structural testing and code reading: what fault type do they each detect? In: Empirical Methods and Studies in Software Engineering: Experiences from ESERNET. Lecture Notes in Computer Science, vol 2785, pp 208–232
Juristo N, Vegas S, Apa C (2013) Effectiveness for detecting faults within and outside the scope of testing techniques: a controlled experiment. Available at http://www.grise.upm.es/reports.php. Accessed 15 May 2013
Kamsties E, Lott C (1995) An empirical evaluation of three defect-detection techniques. In: Fifth European Software Engineering Conference (ESEC ’95). Lecture Notes in Computer Science, vol 989, pp 362–383
Kernan WN, Viscoli CM, Makuch RW, Brass LM, Horwitz RI (1999) Stratified randomization for clinical trials. J Clin Epidemiol 52(1):19–26
Kitchenham B, Fry J, Linkman S (2003) The case against cross-over designs in software engineering. In: Eleventh annual international workshop on software technology and engineering practice, pp 65–67
Meyers LS, Gamst G, Guarino AJ (2006) Applied multivariate research: design and interpretation. Sage Publications
Myers GJ (1978) A controlled experiment in program testing and code walkthroughs/inspections. Commun ACM 21:760–768
Richy F, Ethgen O, Bruyère O, Deceulaer F, Reginster J (2004) From sample size to effect-size: small study effect investigation (SSEI). Internet J Epidemiol 1
Roper M, Wood M, Miller J (1997) An empirical evaluation of defect detection techniques. Inform Softw Technol 39(11):763–775
Senn S (2002) Cross-over trials in clinical research, 2nd edn. Wiley
Sjøberg DIK, Hannay JE, Hansen O, Kampenes VB, Karahasanovic A, Liborg N-K, Rekdal AC (2005) A survey of controlled experiments in software engineering. IEEE Trans Softw Eng 31:733–753
Wood M, Roper M, Brooks A, Miller J (1997) Comparing and combining software defect detection techniques: a replicated empirical study. In: Proceedings of the 6th European software engineering conference held jointly with the 5th ACM SIGSOFT international symposium on foundations of software engineering. Zurich, Switzerland, pp 262–277
Acknowledgements
This research has been funded by a grant from the Armed Forces Technical School (ESPE), Republic of Ecuador National Higher Education, Science, Technology and Innovation Secretary’s Office (SENESCYT) and partially funded by the Spanish Ministry of Economics and Competitiveness project TIN2011-23216.
We also thank the reviewers for their thoughtful review, which greatly improved the quality of the manuscript.
Additional information
Communicated by: Jeffrey C. Carver, Natalia Juristo, Teresa Baldassarre and Sira Vegas.
Appendix
Appendix A: Descriptive Statistics
Appendix B: Survey Data
Appendix C: Replication Raw Data
Cite this article
Apa, C., Dieste, O., Espinosa G., E.G. et al. Effectiveness for detecting faults within and outside the scope of testing techniques: an independent replication. Empir Software Eng 19, 378–417 (2014). https://doi.org/10.1007/s10664-013-9267-7