Parallel failure recovery techniques in cluster-based media servers

Lee, Joahyung; Jung, Inbum

doi:10.1007/s11227-009-0305-6

Published: 27 May 2009

The Journal of Supercomputing, Volume 51, pages 20–39 (2010)

Abstract

For large-scale video-on-demand (VOD) services, cluster servers have attracted attention because of their high performance and low cost. A cluster server consists of a front-end node and multiple backend nodes. Although adding backend nodes allows more quality-of-service (QoS) streams to be served, the probability of a backend node failure increases proportionally. Such a failure causes not only the interruption of streaming services but also the loss of current playing positions. This paper studies recovery mechanisms that keep the streaming service running when a backend node fails. Because they do not take the characteristics of cluster-based servers and MPEG media into account, basic redundant array of independent disks (RAID) techniques create a bottleneck in the internal network path and use the CPUs of the backend nodes inefficiently. To address these problems, a new failure recovery mechanism based on the pipeline computing concept is proposed. The proposed method not only distributes the internal network traffic generated by recovery operations but also exploits the CPU time available in the backend nodes. In the experiments, even when a backend node fails, the proposed method provides continuous streaming media service with a short MTTR and supports more QoS streams than the existing method.
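The abstract does not give the recovery algorithm itself, so the following is only an illustrative sketch of the general idea, assuming XOR-parity RAID across backend nodes. It contrasts a centralized rebuild, where one node receives every surviving block, with a pipelined rebuild, where each node XORs its local block into a partial result and forwards it. All function names and the per-node traffic counter are hypothetical and are not taken from the paper.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(data_blocks):
    """Parity block of a stripe: the XOR of all its data blocks."""
    return reduce(xor_blocks, data_blocks)


def recover_centralized(surviving_blocks):
    """One node receives every surviving block and XORs them all.

    Returns the rebuilt block and the number of blocks that arrive
    on that single node's network link (hypothetical traffic metric).
    """
    received_by_one_node = len(surviving_blocks)
    return reduce(xor_blocks, surviving_blocks), received_by_one_node


def recover_pipelined(surviving_blocks):
    """Each node XORs its local block into a partial result and forwards it.

    Returns the rebuilt block and the largest number of blocks any
    single node receives (hypothetical traffic metric).
    """
    partial = surviving_blocks[0]      # first node starts from its own block
    max_received_per_node = 0
    for local in surviving_blocks[1:]:
        # This node receives exactly one partial block from its predecessor.
        partial = xor_blocks(partial, local)
        max_received_per_node = 1
    return partial, max_received_per_node
```

For a stripe of three data blocks plus parity with one data block lost, both functions rebuild the same block from the three surviving ones. In the centralized version a single node takes in all three blocks, while in the pipelined version no node takes in more than one, and the XOR work is spread over the nodes along the chain. This is the kind of traffic and CPU distribution the abstract describes, under the stated assumptions.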

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Kangwon National University, Chuncheon, Korea
Joahyung Lee & Inbum Jung

Corresponding author

Correspondence to Inbum Jung.

About this article

Cite this article

Lee, J., Jung, I. Parallel failure recovery techniques in cluster-based media servers. J Supercomput 51, 20–39 (2010). https://doi.org/10.1007/s11227-009-0305-6

Received: 27 November 2008
Accepted: 06 May 2009
Published: 27 May 2009
Issue Date: January 2010
DOI: https://doi.org/10.1007/s11227-009-0305-6
