Metagenome-assembled genomes recovered from the datasets of a high-altitude Himalayan hot spring Khirganga, Himachal Pradesh, India

Khirganga is a pristine hot spring in the Parvati Valley of the Northern Himalayas, characterised by a distinctive white microbial mat and by water traditionally regarded as having healing properties. Here, we report 41 metagenome-assembled genomes (MAGs) reconstructed from microbial mat, sediment and water samples of the hot spring that passed the Genomic Standards Consortium (GSC) Minimum Information about a Metagenome-Assembled Genome (MIMAG) standards.


Specifications
Subject: Environmental Science
Specific subject area: Environmental Genomics and Metagenomics
Type of data:

Value of the Data
• These data provide information about the genetic potential of bacterial and archaeal candidates in a mesothermic hot spring.
• The thermozymes encoded by the metagenome-assembled genomes may be beneficial for sewage treatment and biotechnological processes.
• The data are applicable to comparative genomic studies of 41 different prokaryotic candidates.
• The data will help to explore the functional potential and inter-habitat interactions of the hot spring ecosystem.

Site description and sample collection
Khirganga hot spring is a mesothermic extreme environment characterised by a white microbial mat deposited in and around the flowing hot water stream [7]. The spring is a relatively undisturbed natural setting in Kullu district, Himachal Pradesh, India. Khirganga lies at an altitude of 2978 m above mean sea level, and its water is sourced from the Parvati River. Its geothermal energy, high altitude and white microbial mat, together with the emission of heavy metals and ions from the ground, make the site of particular interest [8, 9]. Microbial mat, sediment and water samples were collected in replicates from three different habitats. Water was filtered through a 0.45 μm filter (Merck Millipore Ltd., Ireland) under sterile conditions, and the filtrate was processed for DNA extraction.

Sequencing and assembly
Community DNA was sequenced at the Beijing Genomics Institute (BGI), Hong Kong, China, on the Illumina HiSeq 2000 platform, and 2 × 100 bp paired-end libraries with an insert size of 350 bp were generated. Reads below the Q20 quality cut-off were discarded using SolexaQA [11]. Between 110,861,650 and 152,895,302 reads were generated across the six samples, which were assembled into 180,849 to 519,194 contigs using IDBA-UD [12] with an insertion length of 50 bp, a minimum k-mer of 31, a maximum k-mer of 93 and otherwise default parameters. The metagenome-assembled genomes (MAGs) were reconstructed by binning contigs on the basis of tetranucleotide frequency and genome abundance probabilities using MetaBAT v2 (Metagenomic Binning with Abundance and Tetranucleotide Frequencies) [13] with the following parameters: minContig (minimum contig size) = 2500 bp and minS (minimum edge score for binning) = 60.
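
To make the binning signal described above more concrete, the following is a minimal Python sketch of how a tetranucleotide frequency profile can be computed for a contig, together with a length filter that mirrors the reported minContig value of 2500 bp. It is an illustration only and not MetaBAT's implementation: the function names are hypothetical, all 256 tetranucleotides are counted separately rather than merged with their reverse complements, and treating the size threshold as inclusive is an assumption.

```python
from collections import Counter
from itertools import product

MIN_CONTIG = 2500  # minContig value reported in the text (bp)

# All 256 possible tetranucleotides over the DNA alphabet.
ALL_TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return normalised frequencies of overlapping 4-mers in seq.

    Windows containing anything other than A, C, G or T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {k: 0.0 for k in ALL_TETRAMERS}
    return {k: counts[k] / total for k in ALL_TETRAMERS}


def filter_contigs(contigs, min_length=MIN_CONTIG):
    """Keep contigs of at least min_length bp (inclusive threshold assumed)."""
    return {name: s for name, s in contigs.items() if len(s) >= min_length}
```

In a sketch like this, contigs that originate from the same genome are expected to have similar frequency vectors, which is the compositional signal that binning tools combine with abundance information.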

Annotation of genomes
Additional genome functional annotation was performed automatically using the Prokaryotic Genome Annotation Pipeline (PGAP) [14] .

Data Accessibility
The raw sequence data were deposited at the National Center for Biotechnology Information (NCBI) database under the project number PRJNA673998. The metagenome sequences are available under the accessions SAMN16657637 and SAMN16632777 for microbial mat, SAMN16657991 and SAMN16673719 for sediment, and SAMN16683881 and SAMN16683882 for water. The sequences of the MAGs are available at GenBank under the genome accessions summarized in Table 1.
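
For readers who want to handle these accessions programmatically, the short Python sketch below groups the sample accessions listed above by habitat and builds a web address for each record. The accessions are those given in the text; the helper name and the URL pattern (the NCBI BioSample site followed by the accession) are assumptions made for illustration.

```python
# Accessions as listed in the Data Accessibility statement.
BIOPROJECT = "PRJNA673998"
BIOSAMPLES = {
    "microbial mat": ["SAMN16657637", "SAMN16632777"],
    "sediment": ["SAMN16657991", "SAMN16673719"],
    "water": ["SAMN16683881", "SAMN16683882"],
}


def biosample_url(accession):
    """Build a BioSample page address (URL pattern is an assumption)."""
    return f"https://www.ncbi.nlm.nih.gov/biosample/{accession}"


if __name__ == "__main__":
    for habitat, accessions in BIOSAMPLES.items():
        for acc in accessions:
            print(habitat, acc, biosample_url(acc))
```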

Ethics Statement
The work did not involve human subjects, animals, cell lines or endangered species of wild fauna and flora.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.