research-article

Trident: task scheduling over tiered storage systems in big data platforms

Published: 01 May 2021

Abstract

Recent advancements in storage technologies have popularized the use of tiered storage systems in data-intensive compute clusters. The Hadoop Distributed File System (HDFS), for example, now supports storing data in memory, SSDs, and HDDs, while OctopusFS and hatS offer fine-grained storage tiering solutions. However, the task schedulers of big data platforms (such as Hadoop and Spark) assign tasks to available resources based only on data locality information and completely ignore the fact that local data is now stored on a variety of storage media with different performance characteristics. This paper presents Trident, a principled task scheduling approach that is designed to make optimal task assignment decisions based on both locality and storage tier information. Trident formulates task scheduling as a minimum-cost maximum matching problem in a bipartite graph and uses a standard solver to find the optimal solution. In addition, Trident utilizes two novel pruning algorithms for bounding the size of the graph, while still guaranteeing optimality. Trident is implemented in both Spark and Hadoop, and evaluated extensively using a realistic workload derived from Facebook traces as well as an industry-validated benchmark, demonstrating significant benefits in terms of application performance and cluster efficiency.
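
To make the matching formulation concrete, the following is a minimal sketch of minimum-cost maximum matching between tasks and resource slots. It is not Trident's implementation: the abstract names neither the solver nor the cost model, so the successive-shortest-augmenting-path method used here is a textbook stand-in, the tier costs (`MEMORY`, `SSD`, `HDD`, `REMOTE`) are hypothetical values chosen for illustration, and the pruning algorithms are omitted.

```python
from collections import deque

# Hypothetical per-tier read costs (assumption, not from the paper).
MEMORY, SSD, HDD, REMOTE = 1, 2, 4, 10


def min_cost_max_matching(num_tasks, num_slots, edges):
    """Match tasks to slots, maximizing matched pairs first, then minimizing cost.

    edges: list of (task, slot, cost) with non-negative costs.
    Returns (matching, total_cost) where matching maps task -> slot.
    """
    # Flow network: source -> tasks -> slots -> sink, all capacities 1.
    source, sink = 0, num_tasks + num_slots + 1
    n = sink + 1
    graph = [[] for _ in range(n)]  # arc = [to, capacity, cost, reverse_index]

    def add_arc(u, v, cost):
        graph[u].append([v, 1, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for t in range(num_tasks):
        add_arc(source, 1 + t, 0)
    for s in range(num_slots):
        add_arc(1 + num_tasks + s, sink, 0)
    for t, s, cost in edges:
        add_arc(1 + t, 1 + num_tasks + s, cost)

    total_cost = 0
    while True:
        # Queue-based Bellman-Ford: residual arcs may have negative cost.
        dist = [float("inf")] * n
        prev = [None] * n
        in_queue = [False] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev[sink] is None:
            break  # no augmenting path left: the matching is maximum
        # Push one unit of flow along the cheapest augmenting path.
        v = sink
        while v != source:
            u, i = prev[v]
            arc = graph[u][i]
            arc[1] -= 1
            graph[v][arc[3]][1] += 1
            v = u
        total_cost += dist[sink]

    # A saturated task -> slot arc means the pair is matched.
    matching = {}
    for t in range(num_tasks):
        for v, cap, _, _ in graph[1 + t]:
            if num_tasks < v < sink and cap == 0:
                matching[t] = v - 1 - num_tasks
    return matching, total_cost


# Task 0 has replicas in memory near slot 0 and on SSD near slot 1;
# task 1 has an HDD replica near slot 0 and must read remotely at slot 1.
edges = [(0, 0, MEMORY), (0, 1, SSD), (1, 0, HDD), (1, 1, REMOTE)]
matching, cost = min_cost_max_matching(2, 2, edges)
print(matching, cost)  # {0: 1, 1: 0} 6
```

In this toy instance, greedily giving task 0 its in-memory replica forces task 1 into a remote read (total cost 11), whereas the matching sends task 0 to the SSD replica and task 1 to the HDD replica (total cost 6). This is the kind of trade-off that locality-only scheduling cannot see.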

Published in

Proceedings of the VLDB Endowment, Volume 14, Issue 9
May 2021
249 pages
ISSN: 2150-8097

Publisher

VLDB Endowment