ABSTRACT
Serverless computing separates function execution from state management. Simple retry-based fault tolerance might corrupt the shared state with duplicate updates. Existing solutions employ log-based fault tolerance to achieve exactly-once semantics, where every single read or write to the external state is associated with a log for deterministic replay. However, logging is not free: it introduces considerable overhead to stateful serverless applications.
We present Halfmoon, a serverless runtime system for fault-tolerant stateful serverless computing. Our key insight is that it is unnecessary to symmetrically log both reads and writes. Instead, it suffices to log either reads or writes, i.e., asymmetrically. We design two logging protocols that enforce exactly-once semantics while providing log-free reads and log-free writes, which are suitable for read- and write-intensive workloads, respectively. We theoretically prove that the two protocols are log-optimal, i.e., no other protocol can achieve lower logging overhead. We provide a criterion for choosing the right protocol for a given workload, and a pauseless switching mechanism to switch protocols for dynamic workloads. We implement a prototype of Halfmoon. Experiments show that Halfmoon achieves 20%–40% lower latency and 1.5–4.0× lower logging overhead than the state-of-the-art solution Boki.
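The duplicate-update problem and the symmetric log-based remedy described above can be illustrated with a small sketch. It is not Halfmoon's or Boki's protocol: the in-memory `Store`, the `(invocation_id, step)` log tags, and all function names are hypothetical, and the write and its log entry are assumed to be applied atomically. It only shows why a plain retry can apply an update twice, and how logging every read and write lets a retry replay deterministically.

```python
class Crash(Exception):
    """Simulated function failure (hypothetical, for illustration)."""


class Store:
    """In-memory stand-in for external shared state plus an operation log."""

    def __init__(self):
        self.data = {}
        self.log = {}  # (invocation_id, step) -> recorded result


def naive_ops(store):
    """Reads and writes go straight to the state; nothing is logged."""
    def read(key):
        return store.data.get(key, 0)

    def write(key, value):
        store.data[key] = value

    return read, write


def logged_ops(store, invocation_id):
    """Symmetric logging: every read and every write gets a log entry."""
    step = 0

    def next_tag():
        nonlocal step
        tag = (invocation_id, step)
        step += 1
        return tag

    def read(key):
        tag = next_tag()
        if tag not in store.log:          # first execution: record the value
            store.log[tag] = store.data.get(key, 0)
        return store.log[tag]             # retry: replay the recorded value

    def write(key, value):
        tag = next_tag()
        if tag not in store.log:          # skip writes that were already applied
            store.data[key] = value       # assumed atomic with the log append
            store.log[tag] = True

    return read, write


def increment(read, write, crash_after_write):
    """A stateful function: read a counter, write it back incremented."""
    value = read("counter")
    write("counter", value + 1)
    if crash_after_write:
        raise Crash()


def run_with_one_retry(store, make_ops):
    """Run the function, crash once after the write, then retry from scratch."""
    for attempt in range(2):
        read, write = make_ops(store)
        try:
            increment(read, write, crash_after_write=(attempt == 0))
            break
        except Crash:
            pass
    return store.data["counter"]


naive_result = run_with_one_retry(Store(), naive_ops)
logged_result = run_with_one_retry(Store(), lambda s: logged_ops(s, "inv-1"))
print(naive_result, logged_result)  # prints "2 1"
```

In the naive run the retry reads the already-updated counter and increments it again. In the logged run the retry replays the recorded read and skips the recorded write, so the update takes effect once, at the cost of one log entry per operation. That per-operation cost is the overhead the asymmetric protocols aim to reduce.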