research-article

An analysis of patch plausibility and correctness for generate-and-validate patch generation systems

Authors:
Zichao Qi, Massachusetts Institute of Technology, USA
Fan Long, Massachusetts Institute of Technology, USA
Sara Achour, Massachusetts Institute of Technology, USA
Martin Rinard, Massachusetts Institute of Technology, USA

ISSTA 2015: Proceedings of the 2015 International Symposium on Software Testing and Analysis, July 2015, Pages 24–36. https://doi.org/10.1145/2771783.2771791

Published: 13 July 2015

ABSTRACT

We analyze reported patches for three existing generate-and-validate patch generation systems (GenProg, RSRepair, and AE). The basic principle behind generate-and-validate systems is to accept only plausible patches that produce correct outputs for all inputs in the validation test suite. Because of errors in the patch evaluation infrastructure, the majority of the reported patches are not plausible: they do not produce correct outputs even for the inputs in the validation test suite. The overwhelming majority of the reported patches are not correct and are equivalent to a single modification that simply deletes functionality. Observed negative effects include the introduction of security vulnerabilities and the elimination of desirable functionality. We also present Kali, a generate-and-validate patch generation system that only deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as the prior GenProg, RSRepair, and AE systems. Kali also generates at least as many patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss the patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks. Our analysis indicates that ClearView successfully patches 9 of the 10 security vulnerabilities used to evaluate the system. At least 4 of these patches are correct.
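
The distinction between a plausible patch and a correct patch can be illustrated with a small sketch. This is not the implementation of Kali or of any system named in the abstract; the toy function `clamp`, the test suite `TESTS`, and every helper name below are hypothetical and chosen only for illustration. The sketch enumerates single-line deletions of a buggy function and accepts those that pass every test in the validation suite.

```python
# Illustrative sketch only: a toy generate-and-validate loop whose search
# space is "delete one line". All names here are hypothetical.

# Toy buggy program. The second line rejects 0, although the test suite
# expects clamp(0) == 0. A correct fix would change `<=` to `<`.
BUGGY_LINES = [
    "def clamp(x):",
    "    if x <= 0: raise ValueError('bad input')",
    "    if x > 100: x = 100",
    "    return x",
]

# Validation test suite: (input, expected output) pairs.
TESTS = [(0, 0), (5, 5), (250, 100)]


def build(lines):
    """Compile the candidate source and return its `clamp` function."""
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["clamp"]


def is_plausible(lines, tests):
    """A candidate is plausible if it gives the expected output on every test."""
    try:
        func = build(lines)
    except Exception:
        return False  # candidate does not even compile
    for arg, expected in tests:
        try:
            if func(arg) != expected:
                return False
        except Exception:
            return False  # a crash counts as a failed test
    return True


def deletion_candidates(lines):
    """Yield (index, patched_lines) for each single-line deletion in the body."""
    for i in range(1, len(lines)):  # index 0 is the `def` line; keep it
        yield i, lines[:i] + lines[i + 1:]


def plausible_deletions(lines, tests):
    """Return the indices of deleted lines whose removal passes all tests."""
    return [i for i, cand in deletion_candidates(lines) if is_plausible(cand, tests)]


if __name__ == "__main__":
    print(plausible_deletions(BUGGY_LINES, TESTS))
```

In this toy setting the only accepted candidate deletes the input check. It passes all three tests, so it is plausible, but the patched function now silently accepts negative inputs that the original rejected. This is the kind of functionality-deleting patch the abstract describes: validated by the test suite, yet not correct.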

Index Terms

- Software and its engineering
  - Software creation and management
    - Software verification and validation
      - Software defect analysis
        - Software testing and debugging

Published in
ISSTA 2015: Proceedings of the 2015 International Symposium on Software Testing and Analysis
July 2015, 447 pages
ISBN: 9781450336208
DOI: 10.1145/2771783
General Chair: Michal Young, University of Oregon, USA
Program Chair: Tao Xie, University of Illinois at Urbana-Champaign, USA
Copyright © 2015 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Automatic Repair
Function Elimination
Patch Analysis
Acceptance Rates
Overall Acceptance Rate: 58 of 213 submissions, 27%