Complete Genome Sequence of the Lignocellulose-Degrading Actinomycete Streptomyces albus CAS922

Streptomyces albus CAS922 was isolated from sunflower seed hulls. Its fully sequenced genome harbors a multitude of genes for carbohydrate-active enzymes, which likely facilitate growth on lignocellulosic biomass. Furthermore, the presence of 27 predicted biosynthetic gene clusters indicates a significant potential for the production of bioactive secondary metabolites.

Actinomycete bacteria are important contributors to the natural degradation of lignocellulosic biomass (1, 2). An illustrative example is Streptomyces albus CAS922, which was isolated from sunflower seed hulls discarded by the sunflower oil industry (Cargill S.A.C.I., Bahía Blanca, Argentina) by using ISP-2 agar plates (3) and has been found to grow on diverse lignocellulosic substrates. The strain produces an earthy odor and shows typical streptomycete morphology, including aerial mycelia and substrate hyphae penetrating the agar. Since many actinomycetes are versatile producers of clinically used antibiotics (4, 5), S. albus CAS922 might possess the potential to convert lignocellulosic biomass into value-added products. To obtain data to test this hypothesis, its genome was sequenced.
For DNA isolation, a glycerol stock of S. albus CAS922 was incubated in ISP-2 broth at 30°C and 250 rpm for 4 days. Genomic DNA was then isolated according to a protocol adapted from a report by Neumann et al. with minor modifications (6). Briefly, cells were harvested from liquid ISP-2 cultures and ground in liquid nitrogen in a mortar. The sample was incubated with lysozyme for 2 h at 300 rpm and with proteinase K for 3 h at 300 rpm and was then extracted with a mixture of phenol, chloroform, and isoamyl alcohol (25:24:1), followed by RNase treatment. Genome sequencing was performed by CD Genomics (New York, NY, USA) using the Oxford Nanopore sequencing platform. The SQK-LSK109 kit was used for library preparation following the 1D genomic DNA by ligation protocol (7). Adapters were filtered with the software Porechop v0.2.3. Low-quality and short-fragment (<2,000-bp) reads were removed with the MinKNOW software package to obtain a total of 1,543,556,228 bp of clean data from 260,187 clean reads with an N50 value of 7,942 bp. The clean reads were assembled using Canu v1.5 software (8), followed by a data-polishing step with Pilon software (9) to increase accuracy. This resulted in a linear chromosome of 8.06 Mb with 191-fold genome coverage and a G+C content of 72.59%. The Prokaryotic Genome Annotation Pipeline (PGAP) (10) predicted 6,776 protein-coding genes, 80 RNAs, 59 tRNAs, 3 noncoding RNAs, and 312 pseudogenes. A repetitive sequence content of 2.45% was predicted by RepeatMasker (11). Assignment of the strain to the species S. albus was verified based on alignments with the nonredundant database (12). Default parameters were used for all software.
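The read statistics reported above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original analysis pipeline. The function names `n50` and `fold_coverage` are hypothetical helpers introduced here for illustration, and the genome size is taken as the rounded 8.06 Mb value from the text, so the computed coverage is approximate.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no read lengths supplied")


def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


# Values reported in the text.
CLEAN_BASES = 1_543_556_228   # bp of clean data
CLEAN_READS = 260_187         # number of clean reads
GENOME_SIZE = 8_060_000       # bp; assumption: the rounded 8.06 Mb figure

coverage = fold_coverage(CLEAN_BASES, GENOME_SIZE)
mean_read_length = CLEAN_BASES / CLEAN_READS

print(f"Approximate coverage: {coverage:.1f}-fold")
print(f"Mean clean read length: {mean_read_length:.0f} bp")

# Toy illustration of the N50 definition (made-up lengths, not real reads).
print("Toy N50:", n50([2, 3, 4, 5, 6]))
```

With the rounded genome size, the computed depth falls between 191- and 192-fold, in line with the reported 191-fold coverage. The reported N50 of 7,942 bp cannot be recomputed here because it depends on the individual read lengths in the raw data, so `n50` is only demonstrated on toy values.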
A total of 232 carbohydrate-active enzymes were predicted using the CAZy database (13) and dbCAN2 (14). Three of these proteins belong to the AA10 family, recently recognized as copper-dependent lytic polysaccharide monooxygenases involved in the lysis of chitin, cellulose, and xylan (15).
Data availability. This whole-genome project has been deposited in GenBank with the accession number CP048875. The raw data are available under accession number SRR11069776.

ACKNOWLEDGMENTS
M.S.V.G. acknowledges Cristian Gallo for technical assistance. This study was supported by grants PUE2017 CERZOS-CONICET and PGI 24/B294 SeGCyT-UNS (to M.S.V.G.). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.