DOI: 10.1145/2771783.2771791
research-article

An analysis of patch plausibility and correctness for generate-and-validate patch generation systems

Published: 13 July 2015

ABSTRACT

We analyze reported patches for three existing generate-and-validate patch generation systems (GenProg, RSRepair, and AE). The basic principle behind generate-and-validate systems is to accept only plausible patches that produce correct outputs for all inputs in the validation test suite. Because of errors in the patch evaluation infrastructure, the majority of the reported patches are not plausible: they do not produce correct outputs even for the inputs in the validation test suite. The overwhelming majority of the reported patches are not correct and are equivalent to a single modification that simply deletes functionality. Observed negative effects include the introduction of security vulnerabilities and the elimination of desirable functionality. We also present Kali, a generate-and-validate patch generation system that only deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as the prior GenProg, RSRepair, and AE systems. Kali also generates at least as many patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss the patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks. Our analysis indicates that ClearView successfully patches 9 of the 10 security vulnerabilities used to evaluate the system. At least 4 of these patches are correct.
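
The distinction the abstract draws between plausible and correct patches can be illustrated with a toy model. The sketch below is not the infrastructure of GenProg, RSRepair, AE, or Kali. It assumes a program is a plain Python function, a validation test is an (input, expected output) pair, and the `discount` defect and both candidate patches are hypothetical. It shows how a patch that only deletes functionality can pass a weak validation suite while still being incorrect.

```python
from typing import Callable, Iterable, List, Tuple

# Assumption: a program maps one integer input to one integer output.
Program = Callable[[int], int]
TestCase = Tuple[int, int]  # (input, expected output)


def is_plausible(candidate: Program, suite: Iterable[TestCase]) -> bool:
    """Return True if the candidate gives the expected output for every test."""
    for test_input, expected in suite:
        try:
            if candidate(test_input) != expected:
                return False
        except Exception:
            # A crash on a validation input counts as a failed test.
            return False
    return True


def validate(candidates: Iterable[Program], suite: List[TestCase]) -> List[Program]:
    """Keep only the candidates that pass the whole validation suite."""
    return [c for c in candidates if is_plausible(c, suite)]


# Hypothetical specification: take 10 off any price of 100 or more.
def buggy(price: int) -> int:
    # Defect: the threshold is 50 instead of 100.
    return price - 10 if price >= 50 else price


def patch_fix(price: int) -> int:
    # Candidate that restores the intended threshold.
    return price - 10 if price >= 100 else price


def patch_delete(price: int) -> int:
    # Candidate that deletes the discount branch entirely.
    return price


# The weak suite never exercises the discount, so deletion goes unnoticed.
weak_suite: List[TestCase] = [(20, 20), (60, 60)]
# One extra test that exercises the discount rules the deletion patch out.
strong_suite: List[TestCase] = weak_suite + [(120, 110)]

plausible_weak = validate([buggy, patch_fix, patch_delete], weak_suite)
plausible_strong = validate([buggy, patch_fix, patch_delete], strong_suite)
```

Under the weak suite both `patch_fix` and `patch_delete` are accepted as plausible, although only `patch_fix` matches the hypothetical specification. Under the strong suite only `patch_fix` survives. In this toy setting, plausibility is a property relative to the validation suite, whereas correctness is relative to the intended behavior.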

    Published in

      ISSTA 2015: Proceedings of the 2015 International Symposium on Software Testing and Analysis
      July 2015
      447 pages
      ISBN: 9781450336208
      DOI: 10.1145/2771783
      General Chair: Michal Young
      Program Chair: Tao Xie

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 58 of 213 submissions, 27%
