Abstract
Third-generation nanopore sequencing technologies, together with portable devices such as the MinION nanopore sequencer and the Jetson Xavier NX, enable cost-effective, portable metagenomic analysis. At the same time, the serverless computing paradigm is growing, offering high scalability with limited maintenance overhead for the underlying infrastructure. Recent advancements in serverless offerings make it a viable choice for operations such as basecalling. This paper evaluates whether a combination of the edge and serverless computing paradigms can be used to perform basecalling, with a focus on accelerating offline, edge-based processing by offloading work to serverless infrastructure. For the experiments, we propose a workflow in which DNA sequence reads are processed simultaneously at the edge on a Jetson Xavier NX and in the cloud on AWS Lambda, under different network conditions. The results show that this hybrid approach reduces both the processing time and the energy consumption of basecalling compared to fully offline or fully online processing. We also believe that, although adoption of serverless computing in bioinformatics has so far been limited, recent improvements to platforms such as AWS Lambda make it a compelling choice for a growing number of bioinformatics workflows.
Acknowledgments
The research was supported by the Polish Ministry of Science and Higher Education as part of the CyPhiS program at the Silesian University of Technology, Gliwice, Poland (Contract No. POWR.03.02.00-00-I007/17-00) and by Statutory Research funds of the Department of Applied Informatics, Silesian University of Technology, Gliwice, Poland (grant No. BK/RAu7/2022).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Grzesik, P., Mrozek, D. (2022). Accelerating Edge Metagenomic Analysis with Serverless-Based Cloud Offloading. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13351. Springer, Cham. https://doi.org/10.1007/978-3-031-08754-7_54
DOI: https://doi.org/10.1007/978-3-031-08754-7_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08753-0
Online ISBN: 978-3-031-08754-7
eBook Packages: Computer Science, Computer Science (R0)