skip to main content
research-article
Open Access
Artifacts Evaluated & Functional

Heaps don't lie: countering unsoundness with heap snapshots

Published:12 October 2017Publication History
Skip Abstract Section

Abstract

Static analyses aspire to explore all possible executions in order to achieve soundness. Yet, in practice, they fail to capture common dynamic behavior. Enhancing static analyses with dynamic information is a common pattern, with tools such as Tamiflex. Past approaches, however, miss significant portions of dynamic behavior, due to native code, unsupported features (e.g., invokedynamic or lambdas in Java), and more. We present techniques that substantially counteract the unsoundness of a static analysis, with virtually no intrusion to the analysis logic. Our approach is reified in the HeapDL toolchain and consists in taking whole-heap snapshots during program execution, that are further enriched to capture significant aspects of dynamic behavior, regardless of the causes of such behavior. The snapshots are then used as extra inputs to the static analysis. The approach exhibits both portability and significantly increased coverage. Heap information under one set of dynamic inputs allows a static analysis to cover many more behaviors under other inputs. A HeapDL-enhanced static analysis of the DaCapo benchmarks computes 99.5% (median) of the call-graph edges of unseen dynamic executions (vs. 76.9% for the Tamiflex tool).

References

  1. Edward E. Aftandilian, Sean Kelley, Connor Gramazio, Nathan Ricci, Sara L. Su, and Samuel Z. Guyer. 2010. Heapviz: Interactive Heap Visualization for Program Understanding and Debugging. In Proceedings of the 5th International Symposium on Software Visualization (SOFTVIS ’10). ACM, New York, NY, USA, 53–62. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Davide Ancona, Massimo Ancona, Antonio Cuni, and Nicholas D. Matsakis. 2007. RPython: A Step Towards Reconciling Dynamically and Statically Typed OO Languages. In Proceedings of the 2007 Symposium on Dynamic Languages (DLS ’07). ACM, New York, NY, USA, 53–64. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Molham Aref, Balder ten Cate, Todd J. Green, Benny Kimelfeld, Dan Olteanu, Emir Pasalic, Todd L. Veldhuizen, and Geoffrey Washburn. 2015. Design and Implementation of the LogicBlox System. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD ’15). ACM, New York, NY, USA, 1371–1382. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. FlowDroid: Precise Context, Flow, Field, Object-sensitive and Lifecycle-aware Taint Analysis for Android Apps. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 259–269. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Stephen M. Blackburn, Robin Garner, Chris Hoffmann, Asjad M. Khan, Kathryn S. McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z. Guyer, Martin Hirzel, Antony L. Hosking, Maria Jump, Han Bok Lee, J. Eliot B. Moss, Aashish Phansalkar, Darko Stefanovic, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. 2006. The DaCapo benchmarks: Java benchmarking development and analysis. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA ’06). ACM, New York, NY, USA, 169–190. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Sam Blackshear, Alexandra Gendreau, and Bor-Yuh Evan Chang. 2015. Droidel: A General Approach to Android Framework Modeling. In Proceedings of the 4th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis (SOAP 2015). ACM, New York, NY, USA, 19–25. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Eric Bodden, Andreas Sewe, Jan Sinschek, Hela Oueslati, and Mira Mezini. 2011. Taming reflection: Aiding static analysis in the presence of reflection and custom class loaders. In Proceedings of the 33rd International Conference on Software Engineering (ICSE ’11). ACM, New York, NY, USA, 241–250. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Martin Bravenboer and Yannis Smaragdakis. 2009. Strictly Declarative Specification of Sophisticated Points-to Analyses. In Proceedings of the 24th Annual ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA ’09). ACM, New York, NY, USA, 243–261. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Eric Bruneton, Romain Lenglet, and Thierry Coupaye. 2002. ASM: a code manipulation tool to implement adaptable systems. Adaptable and extensible component systems 30, 19 (2002).Google ScholarGoogle Scholar
  10. Patrick P. F. Chan, Lucas C. K. Hui, and S. M. Yiu. 2011. Dynamic Software Birthmark for Java Based on Heap Memory Analysis. Springer Berlin Heidelberg, Berlin, Heidelberg, 94–107. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  11. Stephen J. Fink. 2015. T.J. Watson Libraries for Analysis (WALA). http://wala.sourceforge.net . (2015).Google ScholarGoogle Scholar
  12. Michael Furr, Jong-hoon (David) An, and Jeffrey S. Foster. 2009. Profile-guided Static Typing for Dynamic Scripting Languages. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA ’09). ACM, New York, NY, USA, 283–300. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Patrice Godefroid, Nils Klarlund, and Koushik Sen. 2005. DART: Directed Automated Random Testing. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’05). ACM, New York, NY, USA, 213–223. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Brian Goetz. 2010. One VM, Many Languages. https://gotocon.com/dl/jaoo- aarhus- 2010/slides/BrianGoetz_ OneVMManyLanguages.pdf . (2010).Google ScholarGoogle Scholar
  15. Brian Goetz. 2016. Project Valhalla Update. http://www.oracle.com/technetwork/java/jvmls2016- goetz- 3126134.pdf . (August 2016). JVM Language Summit.Google ScholarGoogle Scholar
  16. Neville Grech, Julian Rathke, and Bernd Fischer. 2013. Preemptive Type Checking in Dynamically Typed Languages. In Proceedings of the 10th International Colloquium on Theoretical Aspects of Computing (Lecture Notes in Computer Science), Vol. 8049. Springer, 195–212. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  17. Neville Grech and Yannis Smaragdakis. 2017. P/Taint: Unified Points-to and Taint Analysis. Proceedings of the ACM on Programming Languages 1, OOPSLA (October 2017). DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Martin Hirzel, Daniel Von Dincklage, Amer Diwan, and Michael Hind. 2007. Fast Online Pointer Analysis. ACM Transactions on Programming Languages and Systems 29, 2, Article 11 (April 2007), 55 pages. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Martin Hirzel, Amer Diwan, and Michael Hind. 2004. Pointer Analysis in the Presence of Dynamic Class Loading. In Proceedings of the 18th European Conference on Object-Oriented Programming (ECOOP ’04). Springer Berlin Heidelberg, Berlin, Heidelberg, 96–122. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  20. Peter Hofer, David Gnedt, and Hanspeter Mössenböck. 2015. Efficient Dynamic Analysis of the Synchronization Performance of Java Applications. In Proceedings of the 13th International Workshop on Dynamic Analysis (WODA 2015). ACM, New York, NY, USA, 14–18. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. IBM. 2017. Using the HPROF Profiler. https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.lnx. 80.doc/diag/tools/hprof.html . (2017).Google ScholarGoogle Scholar
  22. Herbert Jordan, Bernhard Scholz, and Pavle Subotić. 2016. Soufflé: On Synthesis of Program Analyzers. Springer International Publishing, Cham, 422–430. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  23. George Kastrinis and Yannis Smaragdakis. 2013. Hybrid Context-Sensitivity for Points-To Analysis. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’13). ACM, New York, NY, USA, 423–434. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Davy Landman, Alexander Serebrenik, and Jurgen J. Vinju. 2017. Challenges for Static Analysis of Java Reflection – Literature Review and Empirical Study. In Proceedings of the 39th International Conference on Software Engineering, ICSE 2017, Buenos Aires, Argentina, May 20-28, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Sungho Lee, Julian Dolby, and Sukyoung Ryu. 2016. HybriDroid: Static Analysis Framework for Android Hybrid Applications. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE 2016). ACM, New York, NY, USA, 250–261. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Lian Li, Yi Lu, and Jingling Xue. 2017. Dynamic Symbolic Execution for Polymorphism. In Proceedings of the 26th International Conference on Compiler Construction (CC 2017). ACM, New York, NY, USA, 120–130. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Yue Li, Tian Tan, and Jingling Xue. 2015. Effective Soundness-Guided Reflection Analysis. Springer Berlin Heidelberg, Berlin, Heidelberg, 162–180. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  28. Tim Lindholm, Frank Yellin, Gilad Bracha, and Alex Buckley. 2014. The Java Virtual Machine Specification, Java SE 8 Edition. (2014).Google ScholarGoogle Scholar
  29. Benjamin Livshits, Manu Sridharan, Yannis Smaragdakis, Ondřej Lhoták, J. Nelson Amaral, Bor-Yuh Evan Chang, Samuel Z. Guyer, Uday P. Khedker, Anders Møller, and Dimitrios Vardoulakis. 2015. In Defense of Soundiness: A Manifesto. Commun. ACM 58, 2 (Jan. 2015), 44–46. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Evan K. Maxwell, Godmar Back, and Naren Ramakrishnan. 2010. Diagnosing Memory Leaks Using Graph Mining on Heap Dumps. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’10). ACM, New York, NY, USA, 115–124. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Eric Mercer and Michael Jones. 2005. Model Checking Machine Code with the GNU Debugger. In Proceedings of the 12th International Conference on Model Checking Software (SPIN’05). Springer-Verlag, Berlin, Heidelberg, 251–265. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2005. Parameterized object sensitivity for points-to analysis for Java. ACM Transactions on Software Engineering and Methodology 14, 1 (2005), 1–41. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Oracle. 2014a. Java Platform, Standard Edition Java Flight Recorder Runtime Guide. https://docs.oracle.com/javacomponents/ jmc- 5- 4/jfr- runtime- guide/about.htm . (2014).Google ScholarGoogle Scholar
  34. Oracle. 2014b. JSR 335: Lambda Expressions for the Java™ Programming Language. (2014). https://jcp.org/en/jsr/detail?id=335Google ScholarGoogle Scholar
  35. Oracle. 2016a. HPROF: A Heap/CP U Profiling Tool. http://docs.oracle.com/javase/8/docs/technotes/samples/hprof.html . (2016).Google ScholarGoogle Scholar
  36. Oracle. 2016b. jhat. (2016). https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jhat.htmlGoogle ScholarGoogle Scholar
  37. Oracle. 2016c. VisualVM: Home. https://visualvm.github.io/ . (2016).Google ScholarGoogle Scholar
  38. Oracle. 2017. JEP 280: Indify String Concatenation. (2017). http://openjdk.java.net/jeps/280Google ScholarGoogle Scholar
  39. Alex Potanin, James Noble, and Robert Biddle. 2004. Checking Ownership and Confinement: Research Articles. Concurrency and Computation: Practice & Experience - Formal Techniques for Java-like Programs 16, 7 (June 2004), 671–687. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  40. S. P. Reiss. 2009. Visualizing the Java heap to detect memory problems. In 2009 5th IEEE International Workshop on Visualizing Software for Understanding and Analysis. 73–80. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  41. John R. Rose. 2009. Bytecodes Meet Combinators: Invokedynamic on the JVM. In Proceedings of the Third Workshop on Virtual Machines and Intermediate Languages (VMIL ’09). ACM, New York, NY, USA, Article 2, 11 pages. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Koushik Sen, Darko Marinov, and Gul Agha. 2005. CUTE: A Concolic Unit Testing Engine for C. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE-13). ACM, New York, NY, USA, 263–272. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  43. Micha Sharir and Amir Pnueli. 1981. Two Approaches to Interprocedural Data Flow Analysis. In Program flow analysis: theory and applications, Steven S. Muchnick and Neil D. Jones (Eds.). Prentice-Hall, Inc., Englewood Cliffs, NJ, Chapter 7, 189–233.Google ScholarGoogle Scholar
  44. Yannis Smaragdakis, Martin Bravenboer, and Ondřej Lhoták. 2011. Pick Your Contexts Well: Understanding Object-Sensitivity. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA, 17–30. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Raja Vallée-Rai, Phong Co, Etienne Gagnon, Laurie J. Hendren, Patrick Lam, and Vijay Sundaresan. 1999. Soot - a Java bytecode optimization framework. In Proceedings of the 1999 Conference of the Centre for Advanced Studies on Collaborative research (CASCON ’99). IBM Press, 125–135.Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Noriaki Yoshiura and Wei Wei. 2014. Static Data Race Detection for Java Programs with Dynamic Class Loading. Springer International Publishing, Cham, 161–173. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  47. YourKit. 2017. https://www.yourkit.com/ . (2017).Google ScholarGoogle Scholar
  48. Yury Zhauniarovich, Maqsood Ahmad, Olga Gadyatskaya, Bruno Crispo, and Fabio Massacci. 2015. StaDynA: Addressing the Problem of Dynamic Code Updates in the Security Analysis of Android Applications. In Proceedings of the 5th ACM Conference on Data and Application Security and Privacy (CODASPY ’15). ACM, New York, NY, USA, 37–48. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Heaps don't lie: countering unsoundness with heap snapshots

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader