DOI: 10.1145/3446804.3446851
Article

Helper function inlining in dynamic binary translation

Published: 27 February 2021

ABSTRACT

Dynamic binary translation (DBT) is the cornerstone of many important applications, yet developing and maintaining a real-world DBT system takes tremendous effort. To reduce this engineering effort, DBT developers frequently rely on helper functions. Although helper functions greatly facilitate DBT development, calling them incurs substantial performance overhead. To solve this problem, this paper presents a novel approach to inlining helper functions in DBT systems. The proposed approach addresses several unique technical challenges. As a result, it reduces the performance overhead introduced by helper function calls while preserving the benefits of helper functions for DBT development. We have implemented a prototype of the approach in a popular DBT system, QEMU. Experimental results on programs from the SPEC CPU 2017 benchmark suite show an average performance speedup of 1.2x. Moreover, the translation overhead introduced by inlining helper functions is negligible.

Published in

CC 2021: Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction
March 2021
164 pages
ISBN: 9781450383257
DOI: 10.1145/3446804
General Chair: Aaron Smith
Program Chairs: Delphine Demange, Rajiv Gupta

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States
