Complete genome sequence of a natural compounds producer, Streptomyces violaceus S21

Abstract
The complete genome sequence of Streptomyces violaceus strain S21, a valuable producer of natural compounds isolated from forest soil, is presented here for the first time. The genome comprises 7.91 Mbp, with a G + C content of 72.65%. A range of genes involved in secondary product biosynthesis pathways were predicted. The genome sequence is available at DDBJ/EMBL/GenBank under the accession number CP020570. The genome is annotated with 6856 predicted genes, which helps to identify the natural product biosynthetic gene clusters in S. violaceus.

Direct link to deposited data
The complete genome sequence can be found at https://www.ncbi.nlm.nih.gov/nuccore/CP020570.
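For readers who want to check the deposited record programmatically, the following is a minimal sketch, not part of the original work. It assumes the record is retrieved as FASTA through the NCBI E-utilities `efetch` endpoint; the function names `efetch_fasta_url`, `fetch_fasta` and `fasta_stats` are hypothetical helpers introduced here for illustration only.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint (assumption: used here as an
# alternative to the nuccore web page linked in the text).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession: str) -> str:
    """Build the efetch URL that returns a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


def fetch_fasta(accession: str, timeout: float = 60.0) -> str:
    """Download the FASTA text for an accession (requires network access)."""
    with urlopen(efetch_fasta_url(accession), timeout=timeout) as response:
        return response.read().decode("ascii")


def fasta_stats(fasta_text: str) -> tuple[int, float]:
    """Return (sequence length in bp, G + C content in percent).

    Header lines (starting with '>') are skipped and all sequence lines are
    concatenated, so the result describes the whole file as one sequence.
    """
    sequence = "".join(
        line.strip()
        for line in fasta_text.splitlines()
        if line and not line.startswith(">")
    ).upper()
    if not sequence:
        raise ValueError("no sequence data found")
    gc_count = sequence.count("G") + sequence.count("C")
    return len(sequence), 100.0 * gc_count / len(sequence)


# Hypothetical usage (performs a network request, so it is left commented out):
# length, gc = fasta_stats(fetch_fasta("CP020570"))
# print(f"{length} bp, G + C = {gc:.2f}%")
```

Running the commented usage lines against accession CP020570 would let a reader compare the computed length and G + C content with the values reported in this paper.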

Introduction
Natural products from actinomycetes have been a major source of clinical antibiotics, along with numerous other useful compounds including antineoplastic, antiparasitic, insecticidal and phytocidal agents. Streptomyces violaceus is a promising producer of natural compounds. S. violaceus was first isolated and classified in the 1960s [1]. Following its discovery, several kinds of anthracycline antibiotics for cancer treatment were isolated from S. violaceus, but side effects such as cardiotoxicity have limited their clinical use [2][3][4][5]. New anthracyclines with lower cardiotoxicity and improved therapeutic efficacy are therefore needed. In addition, amylase inhibitors, extracellular polysaccharide and thrombolytic actinoprotease have also been isolated from S. violaceus [6][7][8]. More recently, S. violaceus has been used to develop new useful natural products [9][10]. Until now, only one draft genome, that of S. violaceus NRRL B-2867, has been deposited in GenBank [9][10]. To better understand this potential producer of many natural compounds, we present the first complete genome sequence of S. violaceus S21 and its features.

Experimental design, materials and methods
Strain S21 was isolated from seabed sludge in Shandong, China. Strain S21 is a valuable producer of many natural compounds, including anthracycline antibiotics, amylase inhibitors and extracellular polysaccharide. The genome of strain S21 was analyzed in order to reveal the biosynthetic gene clusters of these natural compounds.
S. violaceus S21 was cultured in Tryptic Soy Broth (OXOID, UK) medium to obtain mycelium, and genomic DNA was then extracted using a Genomic DNA Purification Kit (Promega, USA). After the quality of the DNA sample was assessed using a NanoDrop 2000 Spectrophotometer (Thermo Scientific, USA), both a PE300 DNA library and a 10-kb DNA library were constructed. DNA sequencing was performed on an Illumina HiSeq 4000 platform and a PacBio RS II platform at the Beijing Genomics Institute (Shenzhen, China). The cleaned reads were assembled de novo with SPAdes [11], and the assembly was then processed with SSPACE Standard and GapFiller to obtain scaffolds [12][13]. The genome was annotated using the Prokaryotic Genome Annotation Pipeline (PGAP) version 3.2 at NCBI. Additional gene prediction was performed with the RASTtk server [14]. SEED Viewer was used to assign the predicted genes to functional categories [15].

Data description
After quality control, about 1.50 Gb of data was obtained from the Illumina HiSeq platform, and about 0.81 Gb of data was obtained from the PacBio RS II platform. A genome sequence of 7,916,045 bp with an average GC content of 72.65% was assembled. The genome was predicted to contain 6856 genes, including 6571 coding sequences, 65 tRNAs, 18 rRNAs (5S, 16S, and 23S), 3 ncRNAs, and 199 pseudogenes. Most of the annotated genes were assigned to amino acid and derivative synthesis (667), carbohydrate metabolism (409), cofactor, vitamin, prosthetic group and pigment formation (357), protein metabolism (349), and fatty acid, lipid and isoprenoid metabolism (149) (Fig. 1).
About 79 gene clusters involved in secondary product biosynthesis pathways were predicted in the genome of strain S21 using antiSMASH [16]. Further studies of the genes involved in the biosynthesis of anthracycline antibiotics, amylase inhibitors and extracellular polysaccharide are necessary.
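The figures reported above can be cross-checked with a few lines of arithmetic. The sketch below is an illustration rather than part of the original analysis: it confirms that the feature counts add up to the reported gene total and estimates the sequencing depth implied by the data yields. It assumes that 1 Gb means 10^9 bases; the paper itself does not report coverage, so the depth values are derived estimates only. The helper `coverage` is a hypothetical name.

```python
# Values taken from the text; variable and function names are illustrative.
GENOME_BP = 7_916_045
TOTAL_GENES = 6856
FEATURES = {
    "coding sequences": 6571,
    "tRNAs": 65,
    "rRNAs": 18,
    "ncRNAs": 3,
    "pseudogenes": 199,
}


def coverage(yield_gb: float, genome_bp: int = GENOME_BP) -> float:
    """Approximate sequencing depth, assuming 1 Gb = 1e9 bases."""
    return yield_gb * 1e9 / genome_bp


feature_sum = sum(FEATURES.values())
illumina_cov = coverage(1.50)  # Illumina HiSeq yield after quality control
pacbio_cov = coverage(0.81)    # PacBio RS II yield after quality control

print(f"feature sum matches gene total: {feature_sum == TOTAL_GENES}")
print(f"Illumina depth ~{illumina_cov:.0f}x, PacBio depth ~{pacbio_cov:.0f}x")
```

Under the stated assumption, the yields correspond to roughly 189-fold Illumina depth and roughly 102-fold PacBio depth for a genome of this size.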

Nucleotide sequence accession numbers
The nucleotide sequence of the S. violaceus S21 genome has been deposited in GenBank under the accession number CP020570.

Conflict of interest
The authors declare that there is no conflict of interest regarding the work published in this paper.