DOI: 10.1145/3318265.3318274
research-article

OPS: an optimized partial stripe write scheme to improve performance of XOR-based disk arrays tolerating triple disk failures

Published: 08 March 2019

ABSTRACT

In cloud storage and big data processing systems, RAID, especially disk arrays tolerating triple disk failures (3DFTs), is a popular choice for providing high reliability at low monetary cost. For 3DFTs, a key obstacle is poor partial stripe write performance, which is caused by the large number of parity modifications required by complex erasure coding layouts.

To solve this problem, we propose an optimized partial stripe write (OPS) method, which reorganizes the distribution of written data blocks to share partial parities among data blocks, thereby reducing the number of modified parities and improving overall I/O performance. To illustrate the effectiveness of OPS, we used DiskSim to evaluate several partial stripe write methods through simulation. The results show that OPS reduces the average response time by up to 37.21% and the number of write operations by up to 26.22% compared to the traditional partial stripe write method.
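
The abstract does not describe the OPS layout itself, so the following is only a minimal sketch of the underlying idea, under loudly stated assumptions: a hypothetical toy XOR layout (not STAR, TIP-code, or any code evaluated in the paper) in which each data block at (row, col) of a p x p grid contributes to one row parity, one diagonal parity, and one anti-diagonal parity. The function name `parities_touched` and the example placements are illustrative. It shows that the number of parities a partial stripe write must modify depends on where the written blocks sit.

```python
# Toy model (an assumption, not the paper's layout): in a p x p data grid,
# block (r, c) belongs to row parity r, diagonal parity (r + c) mod p,
# and anti-diagonal parity (r - c) mod p.

def parities_touched(blocks, p):
    """Return the set of distinct parity blocks that must be updated
    when the given data blocks are written."""
    touched = set()
    for r, c in blocks:
        touched.add(("row", r))
        touched.add(("diag", (r + c) % p))
        touched.add(("anti", (r - c) % p))
    return touched

p = 5  # grid size; a small prime chosen only for illustration

# Three hypothetical placements of the same amount of written data.
contiguous = [(0, 0), (0, 1), (0, 2)]  # same row: one shared row parity
shared = [(0, 0), (0, 2), (1, 1)]      # each pair shares one parity
scattered = [(0, 0), (1, 2), (2, 4)]   # no parity is shared

for name, blocks in [("contiguous", contiguous),
                     ("shared", shared),
                     ("scattered", scattered)]:
    n_par = len(parities_touched(blocks, p))
    # With read-modify-write, every written data block and every touched
    # parity block costs one write.
    print(f"{name}: {n_par} parities, {len(blocks) + n_par} block writes")
```

In this toy model the three placements modify 7, 6, and 9 parities respectively, so placing written blocks where they share parities lowers the write count. How OPS actually chooses placements is specified in the paper, not here.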

    • Published in

      ACM Other conferences
      HP3C '19: Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications
      March 2019
      201 pages
      ISBN:9781450366380
      DOI:10.1145/3318265
      • Conference Chair: Steven Guan

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
