1600 faults in 100 projects: automatically finding faults while achieving high coverage with EvoSuite

Empirical Software Engineering 20, 611–639 (2015)

Abstract

Automated unit test generation techniques traditionally follow one of two goals: either they try to find violations of automated oracles (e.g., assertions, contracts, undeclared exceptions), or they aim to produce representative test suites (e.g., satisfying branch coverage) such that a developer can manually add test oracles. Search-based testing (SBST) has delivered promising results when it comes to achieving coverage, yet its use in conjunction with automated oracles has hardly been explored and is generally hampered because SBST does not scale well when there are many testing targets. In this paper we present a search-based approach, implemented in the EvoSuite tool, that handles both objectives at the same time. An empirical study applying EvoSuite to 100 randomly selected open source software projects (the SF100 corpus) reveals that SBST has the unique advantage of being well suited to pursue both traditional goals at once: efficiently triggering faults while producing representative test sets for any chosen coverage criterion. In our study, EvoSuite detected twice as many failures in terms of undeclared exceptions as a traditional random testing approach, witnessing thousands of real faults in the 100 open source projects. Two out of every five classes with undeclared exceptions have actual faults, but these are buried among many failures caused by implicit preconditions. This “noise” can be interpreted either as a call for further research on improving automated oracles, or as a reason to make tools like EvoSuite an integral part of software development in order to enforce clean program interfaces.
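To make the oracle concrete, the following is a minimal sketch (not taken from the paper; the class DateParser and its parseYear method are hypothetical) of the kind of JUnit test a search-based tool such as EvoSuite can generate when undeclared exceptions serve as the automated oracle: the generated input violates an implicit precondition, and the resulting runtime exception is either a real fault or noise.

    import static org.junit.Assert.fail;

    import org.junit.Test;

    // Hypothetical class under test: parseYear() implicitly assumes an input
    // with at least four characters, but this precondition is not declared.
    class DateParser {
        int parseYear(String s) {
            return Integer.parseInt(s.substring(0, 4));
        }
    }

    // Sketch of a generated test that uses the undeclared exception as oracle.
    public class DateParser_GeneratedTest {

        @Test
        public void testParseYearWithEmptyString() {
            DateParser parser = new DateParser();
            try {
                parser.parseYear("");
                fail("Expecting exception: StringIndexOutOfBoundsException");
            } catch (StringIndexOutOfBoundsException e) {
                // The undeclared exception points either to a genuine fault in
                // DateParser or to noise from the violated implicit precondition.
            }
        }
    }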

Notes

  1. The other participating tools were t2 and DSC, as well as Randoop as a baseline. Tools were evaluated based on achieved code coverage, mutation score and execution time (linearly combined in a single score).

  2. An alternative would be to resort to data structures that can cope with larger number ranges (e.g., BigDecimal in Java), but this would lead to a significant performance drop (see the sketch after these notes).

  3. Note that we used the 1.01 version of SF100. The original version in (Fraser and Arcuri 2012b) had 8,784 classes, but more classes became available once we fixed some classpath issues (e.g., missing jars) in some of the projects.

  4. http://findbugs.sourceforge.net, accessed July 2013.
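
As a brief illustration of the trade-off mentioned in note 2 (a sketch under assumed names, not code from EvoSuite): branch distances such as |a - b| for an equality branch overflow or lose precision when computed with primitive doubles, whereas BigDecimal copes with arbitrary magnitudes exactly, at the cost of object allocation and much slower arithmetic.

    import java.math.BigDecimal;

    public class BranchDistanceSketch {

        // Distance for the branch "a == b" using primitive doubles: fast, but the
        // subtraction can overflow to infinity or lose precision for large values.
        static double distanceDouble(double a, double b) {
            return Math.abs(a - b);
        }

        // The same distance with BigDecimal: exact for arbitrarily large values,
        // but every operation allocates objects and runs considerably slower.
        static BigDecimal distanceBig(BigDecimal a, BigDecimal b) {
            return a.subtract(b).abs();
        }

        public static void main(String[] args) {
            System.out.println(distanceDouble(Double.MAX_VALUE, -Double.MAX_VALUE)); // Infinity
            System.out.println(distanceBig(
                    BigDecimal.valueOf(Double.MAX_VALUE),
                    BigDecimal.valueOf(-Double.MAX_VALUE)));                         // exact value
        }
    }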

References

  • Arcuri A (2013) It really does matter how you normalize the branch distance in search-based software testing. Softw Test Verif Rel (STVR) 23(2):119–147

  • Arcuri A, Briand L (2012) A hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering. Softw Test Verif Rel (STVR). doi:10.1002/stvr.1486

  • Arcuri A, Fraser G (2013) Parameter tuning or default values? An empirical investigation in search-based software engineering. Empir Software Eng (EMSE) 18(3):594–623. doi:10.1007/s10664-013-9249-9

  • Arcuri A, Iqbal MZ, Briand L (2012) Random testing: theoretical results and practical implications. IEEE Trans Softw Eng (TSE) 38(2):258–277

  • Baresi L, Young M (2001) Test oracles. Technical Report CIS-TR-01-02, University of Oregon, Dept. of Computer and Information Science, Eugene, Oregon, USA. http://www.cs.uoregon.edu/~michal/pubs/oracles.html

  • Barr E, Vo T, Le V, Su Z (2013) Automatic detection of floating-point exceptions. In: Proceedings of the international conference on principles of programming languages (POPL’13). ACM

  • Bauersfeld S, Vos T, Lakhotia K, Poulding S, Condori N (2013) Unit testing tool competition. In: International workshop on search-based software testing (SBST)

  • Bhattacharya N, Sakti A, Antoniol G, Guéhéneuc YG, Pesant G (2011) Divide-by-zero exception raising via branch coverage. In: Proceedings of the third international conference on search based software engineering, SSBSE’11. Springer, Berlin, Heidelberg, pp 204–218

  • Clarke LA (1976) A system to generate test data and symbolically execute programs. IEEE Trans Softw Eng (TSE) 2(3):215–222

  • Cowles M, Davis C (1982) On the origins of the .05 level of statistical significance. Am Psychol 37(5):553–558

  • Csallner C, Smaragdakis Y (2004) JCrasher: an automatic robustness tester for Java. Softw Pract Exper 34:1025–1050. doi:10.1002/spe.602

  • Del Grosso C, Antoniol G, Merlo E, Galinier P (2008) Detecting buffer overflow via automatic test input data generation. Comput Oper Res 35(10):3125–3143

  • Duran JW, Ntafos SC (1984) An evaluation of random testing. IEEE Trans Softw Eng (TSE) 10(4):438–444

  • Feller W (1968) An introduction to probability theory and its applications, vol 1, 3 edn. Wiley

  • Fraser G, Arcuri A (2011a) EvoSuite: automatic test suite generation for object-oriented software. In: ACM symposium on the foundations of software engineering (FSE), pp 416–419

  • Fraser G, Arcuri A (2011b) It is not the length that matters, it is how you control it. In: IEEE International conference on software testing, verification and validation (ICST), pp 150–159

  • Fraser G, Arcuri A (2012a) The seed is strong: seeding strategies in search-based software testing. In: IEEE International conference on software testing, verification and validation (ICST), pp 121–130

  • Fraser G, Arcuri A (2012b) Sound empirical evidence in software testing. In: ACM/IEEE International conference on software engineering (ICSE), pp 178–188

  • Fraser G, Arcuri A (2013a) EvoSuite: on the challenges of test case generation in the real world (tool paper). In: IEEE International conference on software testing, verification and validation (ICST)

  • Fraser G, Arcuri A (2013b) Whole test suite generation. IEEE Trans Softw Eng 39(2):276–291

  • Fraser G, Arcuri A, McMinn P (2013) Test suite generation with memetic algorithms. In: Genetic and evolutionary computation conference (GECCO)

  • Godefroid P, Klarlund N, Sen K (2005) Dart: directed automated random testing. In: ACM conference on programming language design and implementation (PLDI), pp 213–223

  • Godefroid P, Levin MY, Molnar DA (2008) Active property checking. In: Proceedings of the 8th ACM international conference on Embedded software, EMSOFT ’08. ACM, New York, pp 207–216

  • Gross F, Fraser G, Zeller A (2012) Search-based system testing: high coverage, no false alarms. In: ACM Int. symposium on software testing and analysis (ISSTA)

  • Korel B, Al-Yami AM (1996) Assertion-oriented automated test data generation. In: Proceedings of the 18th international conference on software engineering, ICSE ’96. IEEE Computer Society, Washington, pp 71–80

  • Lakhotia K, Harman M, Gross H (2010a) AUSTIN: a tool for search based software testing for the C language and its evaluation on deployed automotive systems. In: International symposium on search based software engineering (SSBSE), pp 101–110

  • Lakhotia K, McMinn P, Harman M (2010b) An empirical investigation into branch coverage for C programs using CUTE and AUSTIN. J Syst Softw 83(12):2379–2391

  • Malburg J, Fraser G (2011) Combining search-based and constraint-based testing. In: IEEE/ACM int. conference on automated software engineering (ASE)

  • McMinn P (2004) Search-based software test data generation: a survey. Softw Test Verif Rel (STVR) 14(2):105–156

  • McMinn P (2007) IGUANA: input generation using automated novel algorithms. A plug and play research tool. Tech. rep., The University of Sheffield

  • McMinn P (2009) Search-based failure discovery using testability transformations to generate pseudo-oracles. In: Proceedings of the 11th annual conference on genetic and evolutionary computation, genetic and evolutionary computation conference (GECCO). ACM, New York, pp 1689–1696

  • Meyer B, Ciupa I, Leitner A, Liu LL (2007) Automatic testing of object-oriented software. In: Proceedings of the 33rd conference on current trends in theory and practice of computer science, SOFSEM ’07. Springer, Berlin, Heidelberg, pp 114–129

  • Orso A, Xie T (2008) Bert: behavioral regression testing. In: Proceedings of the 2008 international workshop on dynamic analysis: held in conjunction with the ACM SIGSOFT International symposium on software testing and analysis (ISSTA 2008), WODA ’08. ACM, New York, pp 36–42. doi:10.1145/1401827.1401835

  • Pacheco C, Ernst MD (2005) Eclat: automatic generation and classification of test inputs. In: ECOOP 2005—object-oriented programming, 19th European conference, pp 504–527

  • Pacheco C, Lahiri SK, Ernst MD, Ball T (2007) Feedback-directed random test generation. In: ACM/IEEE International conference on software engineering (ICSE), pp 75–84

  • Pandita R, Xie T, Tillmann N, de Halleux J (2010) Guided test generation for coverage criteria. In: IEEE International conference on software maintenance (ICSM), pp 1–10

  • R Development Core Team (2008) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org. ISBN 3-900051-07-0

  • Romano D, Di Penta M, Antoniol G (2011) An approach for search based testing of null pointer exceptions. In: Proceedings of the 2011 fourth IEEE international conference on software testing, verification and validation, ICST ’11. IEEE Computer Society, Washington, pp 160–169

  • Sen K, Marinov D, Agha G (2005) CUTE: a concolic unit testing engine for C. In: ESEC/FSE-13: proc. of the 10th European software engineering conf. held jointly with 13th ACM SIGSOFT int. symposium on foundations of software engineering. ACM, pp 263–272

  • Tillmann N, de Halleux J (2008) Pex–white box test generation for .NET. In: International conference on Tests and Proofs (TAP), pp 134–153

  • Tracey N, Clark J, Mander K, McDermid J (2000) Automated test-data generation for exception conditions. Softw Pract Exper 30(1):61–79

  • Visser W, Pasareanu CS, Khurshid S (2004) Test input generation with Java PathFinder. ACM SIGSOFT Softw Eng Notes 29(4):97–107

  • Williams N, Marre B, Mouy P, Roger M (2005) PathCrawler: automatic generation of path tests by combining static and dynamic analysis. In: EDCC’05: proceedings of the 5th European dependable computing conference. LNCS, vol 3463. Springer, pp 281–292

  • Xiao X, Xie T, Tillmann N, de Halleux J (2011) Precise identification of problems for structural test generation. In: Proceedings of the 33rd international conference on software engineering, ICSE ’11. ACM, New York, pp 611–620

Acknowledgements

This project has been funded by a Google Focused Research Award on “Test Amplification” and the Norwegian Research Council.

Author information

Corresponding author

Correspondence to Gordon Fraser.

Additional information

Communicated by: Gregg Rothermel

About this article

Cite this article

Fraser, G., Arcuri, A. 1600 faults in 100 projects: automatically finding faults while achieving high coverage with EvoSuite. Empir Software Eng 20, 611–639 (2015). https://doi.org/10.1007/s10664-013-9288-2
