
DVM: towards a datacenter-scale virtual machine

Authors:
Zhiqiang Ma, The Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Zhonghua Sheng, The Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Lin Gu, The Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Liufei Wen, Huawei Technologies, Shenzhen, China
Gong Zhang, Huawei Technologies, Shenzhen, China

ACM SIGPLAN Notices, Volume 47, Issue 7, July 2012, pp 39–50, https://doi.org/10.1145/2365864.2151032

Published: 03 March 2012

Abstract

As cloud-based computation becomes increasingly important, providing a general computational interface to support datacenter-scale programming has become an imperative research agenda. Many cloud systems use existing virtual machine monitor (VMM) technologies, such as Xen, VMware, and Windows Hypervisor, to multiplex a physical host into multiple virtual hosts and isolate computation on the shared cluster platform. However, traditional multiplexing VMMs do not scale beyond a single physical host, and on their own they cannot provide the programming interface and cluster-wide computation that a datacenter system requires. We design a new instruction set architecture, DISA, to unify myriad compute nodes into one large virtual machine called DVM, and present programmers with the view of a single computer where thousands of tasks run concurrently in a large, unified, and snapshotted memory space. The DVM provides a simple yet scalable programming model and mitigates the scalability bottleneck of traditional distributed shared memory systems. Along with an efficient execution engine, the capacity of a DVM can scale up to support large clusters. We have implemented and tested DVM on three platforms, and our evaluation shows that DVM performs well in terms of execution time and speedup. On one physical host, the system overhead of DVM is comparable to that of traditional VMMs. On 16 physical hosts, the DVM runs 10 times faster than MapReduce/Hadoop and X10. On 256 EC2 instances, DVM shows linear speedup on a parallelizable workload.

Published in
ACM SIGPLAN Notices Volume 47, Issue 7
VEE '12
July 2012
229 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2365864
VEE '12: Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
March 2012
248 pages
ISBN:9781450311762
DOI:10.1145/2151024
General Chair: Steven Hand, University of Cambridge, UK
Program Chair: Dilma da Silva, IBM T. J. Watson Research Center, USA
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 3 March 2012
Author Tags
cloud computing
datacenter
virtualization
Qualifiers
- research-article
Article Metrics
- Total Citations: 6
- Total Downloads: 504
- Downloads (Last 12 months): 4
- Downloads (Last 6 weeks): 0