Study of an Improved Hadoop Speculative Execution Algorithm

Bao Yi Wang; Xiao Yang Pu; Shao Min Zhang

doi:10.4028/www.scientific.net/AMM.513-517.2281

Abstract:

Differences in node capability and unevenly distributed network bandwidth are widespread in heterogeneous cloud environments. Combined with the randomness with which users submit jobs, these problems lead to server synchronization problems. To address them on the Hadoop platform, we propose a method based on the native Hadoop speculative execution algorithm. By monitoring load balance in real time, dynamically assessing node performance, and launching speculative tasks on a high-performance node that is also the node nearest to the input split, the algorithm effectively reduces network usage and speeds up execution. Experimental results show that, in executions with a high proportion of speculative tasks, the method significantly improves the efficiency and throughput of the cluster.
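The abstract does not give the scoring rule, so the following is only a minimal sketch of the node-selection idea it describes: filter out slow or heavily loaded nodes, then prefer the node nearest to the input split. The `Node` fields, the locality levels, the thresholds, and the function name `pick_speculative_node` are all hypothetical and are not taken from the paper or from Hadoop's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical locality levels, ordered from nearest to farthest from the input split.
NODE_LOCAL, RACK_LOCAL, OFF_RACK = 0, 1, 2


@dataclass(frozen=True)
class Node:
    name: str
    performance: float  # assumed score in [0, 1]; higher means a faster node
    load: float         # assumed utilisation in [0, 1] from real-time monitoring
    distance: int       # locality level relative to the task's input split


def pick_speculative_node(
    nodes: Sequence[Node],
    min_performance: float = 0.5,  # assumed threshold, not from the paper
    max_load: float = 0.8,         # assumed threshold, not from the paper
) -> Optional[Node]:
    """Return the node on which to launch a speculative task, or None."""
    # Keep only nodes that are fast enough and not already overloaded.
    eligible = [
        n for n in nodes
        if n.performance >= min_performance and n.load <= max_load
    ]
    if not eligible:
        return None
    # Prefer the nearest node; break ties by higher performance, then lower load.
    return min(eligible, key=lambda n: (n.distance, -n.performance, n.load))


example_nodes = [
    Node("a", performance=0.9, load=0.3, distance=OFF_RACK),
    Node("b", performance=0.7, load=0.5, distance=NODE_LOCAL),
    Node("c", performance=0.3, load=0.1, distance=NODE_LOCAL),   # too slow
    Node("d", performance=0.95, load=0.9, distance=NODE_LOCAL),  # overloaded
]
chosen = pick_speculative_node(example_nodes)
print(chosen.name if chosen else "no eligible node")
```

Sorting by distance first reflects the abstract's emphasis on reducing network usage; a scheduler that valued raw speed over locality could reorder the key so that performance comes first.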

Info:

Periodical:

Applied Mechanics and Materials (Volumes 513-517)

Pages:

2281-2284

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.513-517.2281

Online since:

February 2014

Authors:

Bao Yi Wang*, Xiao Yang Pu, Shao Min Zhang

Keywords:

Cloud Computing, Hadoop, Job Scheduling, MapReduce, Speculative Execution

* - Corresponding Author
