Shotgun metagenomic sequencing data of sunflower rhizosphere microbial community in South Africa

This dataset presents shotgun metagenomic sequencing of the sunflower rhizosphere microbiome in Bloemhof, South Africa. The data were collected to decipher the structure and function of the sunflower microbial community. The DNA was sequenced by next-generation sequencing on the Illumina HiSeq platform. The metagenome comprised 8,991,566 sequences totaling 1,607,022,279 bp, with a GC content of 66%. It was deposited in the NCBI database and can be accessed under the SRA accession number SRR10418054. Analysis on the online metagenome server MG-RAST, using the subsystem database, showed that bacteria had the highest taxonomic representation at 98.47%, followed by eukaryotes at 1.23% and archaea at 0.20%. The most abundant genera were Conexibacter (17%), Nocardioides (8%), Streptomyces (7%), Geodermatophilus (6%), Methylobacterium (5%), and Burkholderia (4%). Subsystem-based functional annotation in MG-RAST showed that sequences assigned to carbohydrates accounted for 13.74%, clustering-based subsystems for 12.93%, and amino acids and derivatives for 10.30%, alongside other functional traits needed for plant growth and health.

© 2020 The Author(s)

Value of data
• The data provide information on the rhizosphere microbiome of sunflower.
• The microorganisms inhabiting the rhizosphere serve as a hotspot for active biomolecules and novel genes.
• Understanding the rhizosphere microbiome is important for plant growth and health.
• These data are valuable because they offer the possibility of identifying new genes, which could be an impetus for addressing hunger and agricultural sustainability.

Description of data
The dataset contains raw sequence data obtained by shotgun metagenomic sequencing of the sunflower rhizosphere microbiome. The data files, in FASTQ format, were deposited at the National Center for Biotechnology Information (NCBI) under SRA accession number SRR10418054. The data are presented in Figs. 1 and 2.
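Summary figures of the kind reported for this metagenome (sequence count, total base pairs, GC content) can be recomputed from FASTQ files. The following is a minimal Python sketch, not the pipeline used by MG-RAST; the function name `fastq_stats` and the example reads are hypothetical, and the code assumes the common four-line-per-record FASTQ layout with unwrapped sequences.

```python
from typing import Iterable, Tuple


def fastq_stats(lines: Iterable[str]) -> Tuple[int, int, float]:
    """Return (read count, total bases, GC percentage) for FASTQ text lines.

    Assumption: each record spans exactly four lines
    (header, sequence, '+', qualities), with no wrapped sequences.
    """
    n_reads = 0
    total_bp = 0
    gc_bases = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record holds the bases
            seq = line.strip().upper()
            n_reads += 1
            total_bp += len(seq)
            gc_bases += seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc_bases / total_bp if total_bp else 0.0
    return n_reads, total_bp, gc_percent


# Tiny made-up example (NOT real reads from SRR10418054)
example = [
    "@read1", "GGCCAT", "+", "IIIIII",
    "@read2", "ATGC", "+", "IIII",
]
print(fastq_stats(example))  # (2, 10, 60.0)
```

For a real file, the same function can be fed an open text handle, for example `fastq_stats(open(path))`, or `gzip.open(path, "rt")` for compressed files. As a simple arithmetic check on the reported totals, 1,607,022,279 bp divided by 8,991,566 sequences implies a mean sequence length of roughly 178.7 bp.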

Experimental design, materials, and methods
Soil samples were obtained from sunflower rhizosphere soils in Bloemhof, South Africa (26.296138S: 26.972175E), and DNA was extracted using the PowerSoil® isolation kit (MO Bio labs, USA) following the manufacturer's instructions. DNA concentration and purity were determined using a NanoDrop Lite Spectrophotometer (Thermo Fisher Scientific, CA, USA). Extracted DNA was sent for shotgun metagenome sequencing to the Molecular Research Laboratory (www.mrdnalab.com), Texas, USA, using the Illumina HiSeq platform. The Qubit® dsDNA HS Assay Kit (Life Technologies) was used to determine the initial DNA concentration. Library preparation was done using the Nextera DNA Flex library preparation kit (Illumina) according to the manufacturer's guidelines. In brief, 50 ng of DNA was used for library preparation. After DNA fragmentation, Illumina sequencing adapters were added and the products were amplified using six cycles of PCR, during which unique indices were added. After library amplification, library concentration was estimated using the Qubit® dsDNA HS Assay Kit (Life Technologies), while the average library fragment size was measured using the Agilent 2100 Bioanalyzer (Agilent Technologies). Libraries were then pooled in equimolar ratios of 0.7 nM and sequenced paired-end for 300 cycles using the NovaSeq 6000 platform (Illumina).
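Equimolar pooling relies on converting a library's mass concentration (from the Qubit reading) and its average fragment size (from the Bioanalyzer) into molarity. The sketch below illustrates that conversion using the widely used approximation of 660 g/mol per base pair of double-stranded DNA; it is an illustration only, not the sequencing laboratory's documented procedure. The function names and all input values are hypothetical, and only the 0.7 nM target comes from the text.

```python
from typing import Tuple

# Approximate average molar mass of one base pair of dsDNA (g/mol)
BP_MOLAR_MASS = 660.0


def library_molarity_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (BP_MOLAR_MASS * mean_fragment_bp)


def dilution_volumes(stock_nM: float, target_nM: float, final_ul: float) -> Tuple[float, float]:
    """Return (library volume, diluent volume) in uL to reach target_nM in final_ul."""
    if target_nM > stock_nM:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = final_ul * target_nM / stock_nM
    return stock_ul, final_ul - stock_ul


# Hypothetical readings: 2.0 ng/uL library with a 500 bp average fragment size
stock = library_molarity_nM(2.0, 500.0)
library_ul, diluent_ul = dilution_volumes(stock, target_nM=0.7, final_ul=100.0)
print(f"stock: {stock:.2f} nM; mix {library_ul:.1f} uL library with {diluent_ul:.1f} uL diluent")
```

Normalizing every library to the same molarity before combining equal volumes gives each sample a comparable share of the sequencing run, which is the purpose of pooling in equimolar ratios.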
The online metagenomic rapid annotation server MG-RAST (www.mg-rast.org) was used for quality control of the raw metagenome sequences [1]. After quality control (QC), the BLAT (BLAST-like alignment tool) algorithm was used to annotate the sequences [2] against the M5NR database [3], which provides a non-redundant integration of many databases.

Declaration of Competing Interest
The authors declare that they have no financial or commercial conflict of interest.