research-article

Rhythm: component-distinguishable workload deployment in datacenters

Published: 17 April 2020

Editorial Notes

A corrigendum was issued for this article on July 15, 2020. You can download the corrigendum from the supplemental material section of this citation page.

ABSTRACT

Cloud service providers improve resource utilization by co-locating latency-critical (LC) workloads with best-effort batch (BE) jobs in datacenters. However, they usually treat an LC workload as a whole when allocating resources to BE jobs and neglect the differing characteristics of the components of an LC workload. This kind of coarse-grained co-location method leaves significant room for improvement in resource utilization.

Based on the observation that different LC components tolerate interference to different degrees, we propose a new abstraction called Servpod, a collection of the parts of an LC service that are deployed together on the same physical machine, and show its merits for building a fine-grained co-location framework. The key idea is to differentiate the BE throughput launched with each LC Servpod, i.e., a Servpod with high interference tolerance can be deployed along with more BE jobs. Based on Servpods, we present Rhythm, a co-location controller that maximizes resource utilization while guaranteeing the LC service's tail latency requirement. It quantifies the interference tolerance of each Servpod by analyzing its tail-latency contribution. We evaluate Rhythm using LC services in the form of containerized processes and microservices, and find that it can improve system throughput by 31.7%, CPU utilization by 26.2%, and memory bandwidth utilization by 34% while guaranteeing the SLA (service level agreement).
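To make the key idea concrete, the following is a minimal illustrative sketch and not Rhythm's actual controller. It assumes a hypothetical per-Servpod score, `tail_latency_contribution` in [0, 1], and a hypothetical rule that treats tolerance as one minus that score, then splits a fixed BE job budget in proportion to tolerance. The names `Servpod`, `be_job_quota`, and the proportional rule are all assumptions introduced for illustration; the abstract does not specify how Rhythm computes or uses the contribution.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Servpod:
    """Hypothetical model of a Servpod: LC parts placed on one machine."""
    name: str
    # Assumed score in [0, 1]: share of the LC service's tail latency
    # attributed to this Servpod (higher means more latency-sensitive).
    tail_latency_contribution: float


def be_job_quota(servpods: List[Servpod], total_be_jobs: int) -> Dict[str, int]:
    """Split a BE job budget so that more tolerant Servpods host more BE jobs.

    Assumption (not from the paper): tolerance = 1 - contribution, and the
    budget is divided proportionally to tolerance, rounding down.
    """
    tolerance = {s.name: 1.0 - s.tail_latency_contribution for s in servpods}
    total = sum(tolerance.values())
    if total <= 0:
        # No Servpod tolerates interference: co-locate nothing.
        return {name: 0 for name in tolerance}
    return {name: int(total_be_jobs * t / total) for name, t in tolerance.items()}


pods = [
    Servpod("frontend", 0.25),
    Servpod("cache", 0.25),
    Servpod("database", 0.5),
]
quotas = be_job_quota(pods, total_be_jobs=8)
```

In this toy setup the Servpod with the largest assumed tail-latency contribution (`database`) receives the smallest share of BE jobs, which mirrors the abstract's statement that more tolerant Servpods can be deployed along with more BE jobs.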

  • Published in

    ACM Conferences
    EuroSys '20: Proceedings of the Fifteenth European Conference on Computer Systems
    April 2020
    49 pages
    ISBN:9781450368827
    DOI:10.1145/3342195

    Copyright © 2020 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    EuroSys '20 Paper Acceptance Rate: 43 of 234 submissions, 18%. Overall Acceptance Rate: 241 of 1,308 submissions, 18%.
