Record Placement Based on Data Skew Using Solid State Drives

  • Conference paper
Big Data Benchmarks, Performance Optimization, and Emerging Hardware (BPOE 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8807)

Abstract

Integrating a solid state drive (SSD) into a data store is expected to improve its I/O performance. However, there is still a large price difference between an SSD and a hard-disk drive (HDD). One way to offset the higher device cost is to configure a hybrid system that uses both devices. In such a system, a common method of deciding where to place data records is based on reference locality, i.e., placing frequently accessed records on the faster SSD. In this paper, we propose an alternative that focuses on data skew: records whose values appear less often are stored on an SSD, while records whose values appear more often are stored on an HDD. As we will show, this improves the performance of fetching records using multi-dimensional indices. When records are fetched using one of the indices targeted for optimization, records stored on an SSD are likely to be retrieved using random access, while those stored on an HDD are likely to be retrieved using sequential access. Because the method does not rely on reference locality, its performance is stable between first and second accesses, and it provides a performance gain even when host memory is large enough to contain the entire working set of the application. Our implementation and experiments show that storing just 20% of the records in an SSD achieves up to 76% of the maximum reduction that would otherwise be obtained when all the records are stored in an SSD.
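
The placement idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single indexed attribute rather than multi-dimensional indices, an in-memory list of dictionary records, and a hypothetical helper name `place_by_skew` with a hypothetical `ssd_fraction` budget. Records whose attribute value is rare go to the SSD partition until the budget is used up, and the rest go to the HDD partition.

```python
from collections import Counter


def place_by_skew(records, key, ssd_fraction=0.2):
    """Split records into (ssd, hdd) lists by how often each value of `key` occurs.

    Hypothetical helper for illustration only; the greedy rule below is an
    assumption, not the algorithm from the paper.
    """
    # Count how many records share each value of the indexed attribute.
    counts = Counter(r[key] for r in records)

    # Number of records the SSD partition may hold.
    budget = int(len(records) * ssd_fraction)

    # Visit values from rarest to most common (ties broken by the value's
    # string form so the result is deterministic) and stop at the first
    # value whose records no longer fit in the budget.
    ssd_values = set()
    used = 0
    for value, n in sorted(counts.items(), key=lambda kv: (kv[1], str(kv[0]))):
        if used + n > budget:
            break
        ssd_values.add(value)
        used += n

    # Rare values: few matching records per lookup, so random access on SSD.
    ssd = [r for r in records if r[key] in ssd_values]
    # Common values: many matching records per lookup, so sequential scans on HDD.
    hdd = [r for r in records if r[key] not in ssd_values]
    return ssd, hdd


if __name__ == "__main__":
    sample = [{"city": c} for c in "AAAAAABBCD"]
    ssd_part, hdd_part = place_by_skew(sample, "city", ssd_fraction=0.2)
    print(len(ssd_part), "records on SSD;", len(hdd_part), "records on HDD")
```

In this sketch, a lookup on a rare value touches only a few records, which suits the SSD's random-access performance, while a lookup on a common value touches many records that the HDD can serve with a sequential read.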

J. Suzuki: Visiting scholar at the University of California, Berkeley, when this work was done.

Corresponding author

Correspondence to Jun Suzuki.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Suzuki, J., Venkataraman, S., Agarwal, S., Franklin, M., Stoica, I. (2014). Record Placement Based on Data Skew Using Solid State Drives. In: Zhan, J., Han, R., Weng, C. (eds) Big Data Benchmarks, Performance Optimization, and Emerging Hardware. BPOE 2014. Lecture Notes in Computer Science, vol 8807. Springer, Cham. https://doi.org/10.1007/978-3-319-13021-7_14

  • DOI: https://doi.org/10.1007/978-3-319-13021-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13020-0

  • Online ISBN: 978-3-319-13021-7

  • eBook Packages: Computer Science, Computer Science (R0)
