Using Relative Lines of Code to Guide Automated Test Generation for Python


Abstract

Raw lines of code (LOC) is a metric that does not, at first glance, seem extremely useful for automated test generation. It is both highly language-dependent and not extremely meaningful, semantically, within a language: one coder can produce the same effect with many fewer lines than another. However, relative LOC, between components of the same project, turns out to be a highly useful metric for automated testing. In this article, we make use of a heuristic based on LOC counts for tested functions to dramatically improve the effectiveness of automated test generation. This approach is particularly valuable in languages where collecting code coverage data to guide testing has a very high overhead. We apply the heuristic to property-based Python testing using the TSTL (Template Scripting Testing Language) tool. In our experiments, the simple LOC heuristic can improve branch and statement coverage by large margins (often more than 20%, up to 40% or more) and improve fault detection by an even larger margin (usually more than 75% and up to 400% or more). The LOC heuristic is also easy to combine with other approaches and is comparable to, and possibly more effective than, two well-established approaches for guiding random testing.
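
The heuristic itself is simple enough to sketch in a few lines of Python. The sketch below illustrates the idea rather than reproducing the TSTL implementation: it assumes the test generator holds a list of candidate actions (callables exercising functions under test) and, with some probability, picks the next action in proportion to the LOC of its target function, falling back to a uniform pick so that small functions are still exercised. The mixing probability p_loc and the proportional weighting are illustrative assumptions, not the exact policy from the paper.

    import inspect
    import random

    def loc(func):
        """Lines of code of a callable, taken from its source; 1 if the
        source is unavailable (builtins, C extensions). Only the *relative*
        counts within one project matter, so the usual language-dependence
        of raw LOC is harmless here."""
        try:
            return len(inspect.getsourcelines(func)[0])
        except (OSError, TypeError):
            return 1

    def choose_action(actions, p_loc=0.5):
        """Pick the next test action. With probability p_loc, choose
        proportionally to the LOC of each action's target function, so
        larger functions are exercised more often; otherwise choose
        uniformly. p_loc=0.5 is an assumed value, not the paper's."""
        if random.random() < p_loc:
            weights = [loc(a) for a in actions]
            return random.choices(actions, weights=weights, k=1)[0]
        return random.choice(actions)

Mixing in the uniform fallback keeps the generator from starving short functions, and because the LOC bias only reweights action selection, it composes naturally with other guidance strategies, consistent with the abstract's claim that the heuristic is easy to combine with other approaches.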


    • Published in

      ACM Transactions on Software Engineering and Methodology, Volume 29, Issue 4
      Continuous Special Section: AI and SE
      October 2020, 307 pages
      ISSN: 1049-331X
      EISSN: 1557-7392
      DOI: 10.1145/3409663
      • Editor: Mauro Pezzè

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 26 September 2020
      • Accepted: 1 June 2020
      • Revised: 1 May 2020
      • Received: 1 September 2019
      Published in TOSEM Volume 29, Issue 4


      Qualifiers

      • research-article
      • Research
      • Refereed
