Abstract
Modern biological research is impracticable without bioinformatic data processing. Current information technologies and computing hardware make it possible to run algorithms automatically on large data sets and to store either strongly or weakly structured data. A well-designed architecture for such data warehouses improves the reproducibility of investigations. However, it is challenging to design a data schema that supports fast property search in such warehouses. This paper describes a method, and its implementation, for storing and processing microbiological and bioinformatic data. The web platform stores genomes in FASTA format, genome annotations in table files that give gene coordinates in the genomes, and structural and mathematical models used to compare different strains and predict new properties.
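As an illustration of how a FASTA genome and a coordinate-based annotation table relate to each other, the following minimal Python sketch extracts gene sequences from a genome. It is not the platform's implementation. The tab-separated column layout (name, contig, start, end, strand), the 1-based inclusive coordinates, and the names `Gene`, `parse_fasta`, `parse_annotation`, and `gene_sequence` are all assumptions made for this example; the abstract does not specify the table format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Gene:
    """One row of a hypothetical annotation table (layout is an assumption)."""
    name: str
    contig: str
    start: int   # assumed 1-based, inclusive
    end: int     # assumed 1-based, inclusive
    strand: str  # "+" or "-"


def parse_fasta(text: str) -> dict[str, str]:
    """Map each record identifier to its concatenated, upper-cased sequence."""
    records: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first whitespace-delimited token of the header.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def parse_annotation(text: str) -> list[Gene]:
    """Parse tab-separated rows: name, contig, start, end, strand."""
    genes = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        name, contig, start, end, strand = line.split("\t")
        genes.append(Gene(name, contig, int(start), int(end), strand))
    return genes


# Translation table used to complement a DNA sequence.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gene_sequence(genome: dict[str, str], gene: Gene) -> str:
    """Return the gene's sequence, reverse-complemented on the minus strand."""
    seq = genome[gene.contig][gene.start - 1:gene.end]
    if gene.strand == "-":
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq


# Toy data, invented for this example only.
FASTA_TEXT = ">contig1 demo strain\nATGGCC\nTTAA\n"
TABLE_TEXT = "geneA\tcontig1\t1\t6\t+\ngeneB\tcontig1\t5\t10\t-\n"

genome = parse_fasta(FASTA_TEXT)
for gene in parse_annotation(TABLE_TEXT):
    print(gene.name, gene_sequence(genome, gene))
```

Keeping the sequence and the coordinates in separate files, as the abstract describes, means that an annotation can be revised without rewriting the genome itself.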
Funding
The work was funded by the Kurchatov Genomic Center of the Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (Novosibirsk, Russia) under agreement No. 075-15-2019-1662 with the Ministry of Education and Science of the Russian Federation.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Mukhin, A.M., Kazantsev, F.V., Klimenko, A.I., Lakhova, T.N., Demenkov, P.S., Lashin, S.A. (2021). The Web Platform for Storing Biotechnologically Significant Properties of Bacterial Strains. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2021. Lecture Notes in Computer Science, vol 12942. Springer, Cham. https://doi.org/10.1007/978-3-030-86359-3_34
DOI: https://doi.org/10.1007/978-3-030-86359-3_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86358-6
Online ISBN: 978-3-030-86359-3
eBook Packages: Computer Science, Computer Science (R0)