ABSTRACT
Speculative execution, the foundation on which modern high-performance general-purpose CPUs are built, has recently been shown to enable a slew of security attacks. All these attacks center on a common set of behaviors: during speculative execution, the architectural state of the system is kept unmodified until the speculation can be verified. If a misspeculation occurs, anything that can affect the architectural state is reverted (squashed) and re-executed correctly. However, the same is not true for the microarchitectural state. Although normally invisible to the user, changes to the microarchitectural state can be observed through various side-channels, with timing differences caused by the memory hierarchy being among the most common and easiest to exploit. These speculative side-channels can then be exploited to perform attacks that bypass software and hardware checks in order to leak information. These attacks, the most infamous of which are perhaps Spectre and Meltdown, have led to a frantic search for solutions.
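The behavior described above can be illustrated with a toy model. The sketch below is not a real attack and not taken from the paper: the class `ToyCore`, the latency constants, and the probe-array layout are all hypothetical and serve only to show that a squash restores registers while the cache contents remain observable through timing.

```python
LINE_SIZE = 64      # assumed cache line size in bytes
HIT_CYCLES = 4      # hypothetical latency of a cached access
MISS_CYCLES = 200   # hypothetical latency of an uncached access


class ToyCore:
    """Toy core: registers are architectural state, cached lines are
    microarchitectural state."""

    def __init__(self):
        self.registers = {}
        self.cached_lines = set()

    def speculative_load(self, addr, dest):
        """Execute a load speculatively and return a register checkpoint."""
        checkpoint = dict(self.registers)
        self.registers[dest] = addr                 # stand-in for the loaded value
        self.cached_lines.add(addr // LINE_SIZE)    # the load fills a cache line
        return checkpoint

    def squash(self, checkpoint):
        """Revert architectural state only; the cache fill is left in place."""
        self.registers = checkpoint

    def probe_latency(self, addr):
        """Latency an observer would measure when accessing addr."""
        cached = addr // LINE_SIZE in self.cached_lines
        return HIT_CYCLES if cached else MISS_CYCLES


def recover_secret(core, probe_base, candidates):
    """Return every candidate value whose probe line is fast to access."""
    return [v for v in candidates
            if core.probe_latency(probe_base + v * LINE_SIZE) == HIT_CYCLES]


core = ToyCore()
secret = 5
probe_base = 0x1000

# A misspeculated load touches a line whose index depends on the secret.
checkpoint = core.speculative_load(probe_base + secret * LINE_SIZE, "r1")
core.squash(checkpoint)

leaked = recover_secret(core, probe_base, range(16))
```

After the squash, `core.registers` is empty again, yet `leaked` still contains the secret value, because the timing difference on the probe line survives the rollback.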
In this work, we present our own solution for reducing the microarchitectural state changes caused by speculative execution in the memory hierarchy. It is based on the observation that if we only allow accesses that hit in the L1 data cache to proceed, then we can easily hide any microarchitectural changes until after the speculation has been verified. At the same time, we propose to prevent stalls by value-predicting the loads that miss in the L1. Value prediction, though speculative, constitutes an invisible form of speculation that is not seen outside the core. We evaluate our solution and show that we can prevent observable microarchitectural changes in the memory hierarchy while keeping the performance and energy costs at 11% and 7%, respectively. In comparison, the current state-of-the-art solution, InvisiSpec, incurs a 46% performance loss and a 51% energy increase.
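The per-load decision outlined in the abstract might be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `Action` and `decide`, the use of a plain dictionary as a value predictor, and the fallback of delaying a load when no prediction is available are all hypothetical.

```python
from enum import Enum

LINE_SIZE = 64  # assumed cache line size in bytes


class Action(Enum):
    ACCESS_NORMALLY = "access the memory hierarchy as usual"
    USE_L1_HIT = "proceed with the L1 hit"
    USE_PREDICTED_VALUE = "continue with a predicted value"
    DELAY = "delay until the load is no longer speculative"


def decide(addr, pc, speculative, l1_lines, predictions):
    """Choose how to handle one load.

    l1_lines:    set of line indices currently in the L1 data cache
    predictions: hypothetical predictor state, mapping load PC -> value
    """
    if not speculative:
        # A verified load may change the memory hierarchy freely.
        return Action.ACCESS_NORMALLY
    if addr // LINE_SIZE in l1_lines:
        # Only loads that hit in the L1 are allowed to proceed.
        return Action.USE_L1_HIT
    if pc in predictions:
        # An L1 miss is not sent further; a predicted value avoids the stall.
        return Action.USE_PREDICTED_VALUE
    # Assumed fallback when there is no prediction: wait for verification.
    return Action.DELAY


l1_lines = {0x1000 // LINE_SIZE}
predictions = {0x400: 42}

hit = decide(0x1008, 0x500, True, l1_lines, predictions)
predicted = decide(0x2000, 0x400, True, l1_lines, predictions)
delayed = decide(0x2000, 0x500, True, l1_lines, predictions)
verified = decide(0x2000, 0x500, False, l1_lines, predictions)
```

In this sketch a predicted value would still have to be checked against the real data once the load becomes non-speculative, as with value prediction in general; that validation step is omitted here.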
Index Terms
- Efficient invisible speculative execution through selective delay and value prediction