Using Relative Lines of Code to Guide Automated Test Generation for Python


Abstract

Raw lines of code (LOC) is a metric that does not, at first glance, seem extremely useful for automated test generation. It is both highly language-dependent and not extremely meaningful, semantically, within a language: one coder can produce the same effect with many fewer lines than another. However, relative LOC, between components of the same project, turns out to be a highly useful metric for automated testing. In this article, we make use of a heuristic based on LOC counts for tested functions to dramatically improve the effectiveness of automated test generation. This approach is particularly valuable in languages where collecting code coverage data to guide testing has a very high overhead. We apply the heuristic to property-based Python testing using the TSTL (Template Scripting Testing Language) tool. In our experiments, the simple LOC heuristic can improve branch and statement coverage by large margins (often more than 20%, up to 40% or more) and improve fault detection by an even larger margin (usually more than 75% and up to 400% or more). The LOC heuristic is also easy to combine with other approaches and is comparable to, and possibly more effective than, two well-established approaches for guiding random testing.
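
The heuristic itself is simple enough to sketch in a few lines of Python. The sketch below illustrates the idea rather than reproducing the TSTL implementation: it assumes the test generator holds a list of candidate actions (callables exercising functions under test) and, with some probability, picks the next action in proportion to the LOC of its target function, falling back to a uniform pick so that small functions are still exercised. The mixing probability p_loc and the proportional weighting are illustrative assumptions, not the exact policy from the paper.

    import inspect
    import random

    def loc(func):
        """Lines of code of a callable, taken from its source; 1 if the
        source is unavailable (builtins, C extensions). Only the *relative*
        counts within one project matter, so the usual language-dependence
        of raw LOC is harmless here."""
        try:
            return len(inspect.getsourcelines(func)[0])
        except (OSError, TypeError):
            return 1

    def choose_action(actions, p_loc=0.5):
        """Pick the next test action. With probability p_loc, choose
        proportionally to the LOC of each action's target function, so
        larger functions are exercised more often; otherwise choose
        uniformly. p_loc=0.5 is an assumed value, not the paper's."""
        if random.random() < p_loc:
            weights = [loc(a) for a in actions]
            return random.choices(actions, weights=weights, k=1)[0]
        return random.choice(actions)

Mixing in the uniform fallback keeps the generator from starving short functions, and because the LOC bias only reweights action selection, it composes naturally with other guidance strategies, consistent with the abstract's claim that the heuristic is easy to combine with other approaches.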


    • Published in

      ACM Transactions on Software Engineering and Methodology, Volume 29, Issue 4
      Continuous Special Section: AI and SE
      October 2020, 307 pages
      ISSN: 1049-331X
      EISSN: 1557-7392
      DOI: 10.1145/3409663
      • Editor: Mauro Pezzè

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 26 September 2020
      • Accepted: 1 June 2020
      • Revised: 1 May 2020
      • Received: 1 September 2019
      Published in TOSEM Volume 29, Issue 4


      Qualifiers

      • research-article
      • Research
      • Refereed
