FPGA-Integrated Bag of Little Bootstraps Accelerator for Approximate Database Query Processing

  • Conference paper
Applied Reconfigurable Computing. Architectures, Tools, and Applications (ARC 2023)

Abstract

We propose a novel approach to FPGA-based approximate query processing that builds the accelerator around the Bag of Little Bootstraps (BLB) algorithm. BLB is a statistical approximation method that lends itself to efficient parallelization. We extend BLB with a streaming mode that avoids data storage and memory transfer overhead, which allows us to take full advantage of the hardware capabilities of FPGAs. Instead of resampling through multiple passes over the dataset, we use a method based on Poisson bootstrapping with resampling coefficients. We show that our approach, implemented on a Xilinx Zynq7000 FPGA clocked at 125 MHz, outperforms an optimized, multithreaded CPU implementation on an Intel i7-6850K at 4 GHz by a factor of 4 when data transfer time is excluded and by a factor of 2 when it is included, for one million entries. This advantage grows with the amount of data to be processed. Implementing the BLB algorithm on an FPGA as an approximate query processing accelerator is therefore a promising approach to improving database query processing.
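The idea of replacing multi-pass resampling with Poisson resampling coefficients can be illustrated in software. The following is a minimal sketch, not the hardware design described in the paper: it assumes the query is a simple mean, that each streamed element receives an independent Poisson(n/b) weight per resample (n is the dataset size, b the subset size), and that the error measure is the standard deviation across resamples. The names `poisson` and `blb_mean_stderr`, and all parameter values, are hypothetical and chosen for illustration only.

```python
import math
import random
import statistics


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's method.

    Assumption: lam is modest (well below ~700), so exp(-lam) does not underflow.
    """
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def blb_mean_stderr(stream, n, subset_size, num_resamples, rng):
    """Single pass over `stream`; returns (mean estimate, estimated standard error).

    Consecutive blocks of `subset_size` elements act as the BLB subsets.
    Each element is weighted by a Poisson(n / subset_size) coefficient per
    resample, so no element has to be stored or revisited.
    Assumes the stream yields at least one complete subset; a trailing
    partial subset is ignored.
    """
    lam = n / subset_size
    estimates, errors = [], []
    sums = [0.0] * num_resamples
    weights = [0] * num_resamples
    filled = 0
    for x in stream:
        for j in range(num_resamples):
            w = poisson(lam, rng)  # resampling coefficient for this element
            sums[j] += w * x
            weights[j] += w
        filled += 1
        if filled == subset_size:
            # One weighted mean per resample, then summarize this subset.
            means = [s / w for s, w in zip(sums, weights) if w > 0]
            estimates.append(statistics.fmean(means))
            errors.append(statistics.stdev(means))
            sums = [0.0] * num_resamples
            weights = [0] * num_resamples
            filled = 0
    # BLB averages the per-subset results.
    return statistics.fmean(estimates), statistics.fmean(errors)


# Illustrative run on synthetic data (values are arbitrary assumptions).
rng = random.Random(0)
n = 4000
data = (rng.gauss(10.0, 2.0) for _ in range(n))
estimate, stderr = blb_mean_stderr(data, n, subset_size=200, num_resamples=20, rng=rng)
print(estimate, stderr)
```

Because the inner loop over resamples has no dependency between iterations, each resample's accumulator could in principle be updated in parallel, which is the property that makes this formulation attractive for streaming hardware.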

Author information

Corresponding author

Correspondence to V. Burtsev .

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Burtsev, V. et al. (2023). FPGA-Integrated Bag of Little Bootstraps Accelerator for Approximate Database Query Processing. In: Palumbo, F., Keramidas, G., Voros, N., Diniz, P.C. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2023. Lecture Notes in Computer Science, vol 14251. Springer, Cham. https://doi.org/10.1007/978-3-031-42921-7_8

  • DOI: https://doi.org/10.1007/978-3-031-42921-7_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-42920-0

  • Online ISBN: 978-3-031-42921-7

  • eBook Packages: Computer Science, Computer Science (R0)
