Using Relative Lines of Code to Guide Automated Test Generation for Python

Abstract
Raw lines of code (LOC) does not, at first glance, seem a useful metric for automated test generation. It is highly language-dependent, and even within a language it carries little semantic weight: one programmer can produce the same effect in far fewer lines than another. However, relative LOC, compared between components of the same project, turns out to be a highly useful metric for automated testing. In this article, we use a heuristic based on the LOC counts of tested functions to dramatically improve the effectiveness of automated test generation. This approach is particularly valuable in languages where collecting code coverage data to guide testing imposes a very high overhead. We apply the heuristic to property-based Python testing using the TSTL (Template Scripting Testing Language) tool. In our experiments, the simple LOC heuristic improves branch and statement coverage by large margins (often more than 20%, and up to 40% or more) and improves fault detection by an even larger margin (usually more than 75%, and up to 400% or more). The LOC heuristic is also easy to combine with other approaches, and is comparable to, and possibly more effective than, two well-established approaches for guiding random testing.
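As a rough sketch of the core idea (the component names and LOC counts below are invented for illustration, and this is not TSTL's actual implementation), a test generator can bias its choice of which function to exercise next in proportion to that function's share of the project's LOC, so that larger functions, which tend to contain more branches and more faults, are tested more often:

```python
import random

# Hypothetical functions under test with their LOC counts.
# (Illustrative numbers only; a real tool would measure these
# from the source, e.g., with a static analysis pass.)
COMPONENT_LOC = {
    "parse_header": 12,
    "normalize": 35,
    "rebalance_tree": 180,
    "lookup": 20,
}

def pick_component(rng):
    """Choose the next function to exercise, with probability
    proportional to its relative LOC within the project."""
    names = list(COMPONENT_LOC)
    weights = [COMPONENT_LOC[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(42)  # seeded for reproducibility
picks = [pick_component(rng) for _ in range(10_000)]
# rebalance_tree holds 180 of the 247 total LOC (~73%),
# so it should dominate the selections.
```

Note that only the *relative* sizes matter here, which is what makes the heuristic robust to the language-dependence of raw LOC: the weights are compared only against other components of the same project.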