Complete genome sequence of Arthrobacter phenanthrenivorans type strain (Sphe3)

Arthrobacter phenanthrenivorans is the type species of the genus, and is able to metabolize phenanthrene as a sole source of carbon and energy. A. phenanthrenivorans is an aerobic, non-motile, Gram-positive bacterium that exhibits a rod-coccus growth cycle. It was originally isolated from a creosote-polluted site in Epirus, Greece. Here we describe the features of this organism, together with the complete genome sequence and annotation.


Introduction
Strain Sphe3T (= DSM 18606T = LMG 23796T) is the type strain of Arthrobacter phenanthrenivorans [1]. It was isolated from Perivleptos, a creosote-polluted site in Epirus, Greece (12 km north of the city of Ioannina), where a wood-preserving industry operated for over 30 years [2]. Strain Sphe3T is of particular interest because it is able to metabolize phenanthrene at concentrations of up to 400 mg/L as a sole source of carbon and energy, at rates faster than those reported for other Arthrobacter species [3][4][5]. It appears to internalize phenanthrene by two mechanisms: passive diffusion when cells are grown on glucose, and an inducible active transport system when cells are grown on phenanthrene as a sole carbon source [2]. Here we present a summary classification and a set of features for A. phenanthrenivorans strain Sphe3T, together with the description of the complete genome sequencing and annotation.

Classification and features
Figure 1 shows the phylogenetic neighborhood of A. phenanthrenivorans strain Sphe3T in a 16S rRNA based tree. Strain Sphe3T is a Gram-positive, aerobic, non-motile bacterium exhibiting a rod-coccus cycle (Figure 2), with a cell size of approximately 1.0-1.5 × 2.5-4.0 μm. Colonies were slightly yellowish on Luria agar. The temperature range for growth was 4-37 °C, with optimum growth at 30-37 °C. The pH range was 6.5-8.5, with optimal growth at pH 7.0-7.5 (Table 1). Strain Sphe3T was found to be sensitive to various antibiotics, with minimal inhibitory concentrations estimated as follows: ampicillin 20 mg/L, chloramphenicol 10 mg/L, erythromycin 10 mg/L, neomycin 20 mg/L, rifampicin 10 mg/L and tetracycline 10 mg/L. Amylase, catalase and nitrate reductase tests were positive, whereas arginine dihydrolase, gelatinase, lipase, lysine and ornithine decarboxylase, oxidase, urease, citrate assimilation and H2S production tests were negative. No acid was produced in the presence of glucose, lactose or sucrose.

Genome sequencing and annotation
Genome project history
This organism was selected for sequencing on the basis of its biodegradation capabilities, i.e., its ability to metabolize phenanthrene as a sole source of carbon and energy. The genome project is deposited in the Genomes OnLine Database [18] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Growth conditions and DNA isolation
A. phenanthrenivorans Sphe3T (DSM 18606T) was grown aerobically at 30 °C on M9 minimal medium (MM M9) containing 0.02% (w/v) phenanthrene. DNA was isolated according to the standard JGI (CA, USA) protocol for bacterial genomic DNA isolation using CTAB.

Genome sequencing and assembly
The genome of Arthrobacter phenanthrenivorans type strain (Sphe3) was sequenced using a combination of Sanger and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website [19]. Pyrosequencing reads were assembled using the Newbler assembler version 1.1.02.15 (Roche). Large Newbler contigs were broken into 4,967 overlapping fragments of 1,000 bp and entered into the assembly as pseudo-reads. The sequences were assigned quality scores based on Newbler consensus q-scores, with modifications to account for overlap redundancy and to adjust inflated q-scores. A hybrid 454/Sanger assembly was made using the Arachne assembler [20]. Possible misassemblies were corrected and gaps between contigs were closed by editing in Consed and by custom primer walks from sub-clones or PCR products. A total of 822 Sanger finishing reads were produced to close gaps, to resolve repetitive regions, and to raise the quality of the finished sequence. The error rate of the completed genome sequence is less than 1 in 100,000. Together, the Sanger and 454 sequencing platforms provided 26.78× coverage of the genome. The final assembly contains 44,113 Sanger reads and 599,557 pyrosequencing reads.
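As an illustration of two steps described above, the following Python sketch shows one way a contig could be cut into overlapping 1,000 bp pseudo-reads, and how an error rate of 1 in 100,000 maps onto the Phred quality scale. It is not the JGI pipeline: the function names are hypothetical, and the 200 bp overlap is an assumed value, since the text does not state the overlap that was used.

```python
import math


def shred_contig(seq: str, frag_len: int = 1000, overlap: int = 200) -> list[str]:
    """Split a contig into overlapping fragments to use as pseudo-reads.

    frag_len matches the 1,000 bp stated in the text; the overlap of 200 bp
    is an assumption made only for this sketch.
    """
    if not 0 <= overlap < frag_len:
        raise ValueError("overlap must be >= 0 and smaller than frag_len")
    if len(seq) <= frag_len:
        return [seq]  # contig too short to shred; keep it whole
    step = frag_len - overlap
    fragments = []
    start = 0
    while start + frag_len < len(seq):
        fragments.append(seq[start:start + frag_len])
        start += step
    # Anchor the last fragment at the contig end so the tail is fully covered.
    fragments.append(seq[-frag_len:])
    return fragments


def phred_from_error_rate(p_error: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    return -10.0 * math.log10(p_error)


if __name__ == "__main__":
    toy_contig = "ACGT" * 625  # 2,500 bp toy sequence, not real data
    reads = shred_contig(toy_contig)
    print(len(reads), [len(r) for r in reads])
    # An error rate of 1 in 100,000 corresponds to roughly Q50.
    print(phred_from_error_rate(1e-5))
```

Anchoring the final fragment at the contig end keeps every pseudo-read at full length, at the cost of a larger overlap for the last pair; a real pipeline may handle the tail differently.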

Genome annotation
Genes were identified using Prodigal [21] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [22]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [23].

Genome properties
The genome consists of a 4,250,414 bp long chromosome with a GC content of 66% and two plasmids, both with 62% GC content, the larger being 190,450 bp long and the smaller 94,456 bp (Figure 3, Figure 4, and Table 3). Of the 4,288 genes predicted, 4,212 were protein-coding genes and 76 were RNA genes; 77 pseudogenes were also identified. The majority of the protein-coding genes (73.8%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.
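The figures in this paragraph can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers stated above; the function and key names are hypothetical. Because the share of genes with a putative function is reported as a rounded percentage (73.8%), the derived gene counts are approximations rather than values taken from the annotation.

```python
# All constants are taken from the Genome properties paragraph.
CHROMOSOME_BP = 4_250_414
PLASMID_BP = (190_450, 94_456)
TOTAL_GENES = 4_288
PROTEIN_CODING = 4_212
RNA_GENES = 76
FRACTION_WITH_FUNCTION = 0.738  # reported as 73.8%, so derived counts are approximate


def summarize_genome() -> dict:
    """Derive summary values from the reported genome statistics."""
    total_bp = CHROMOSOME_BP + sum(PLASMID_BP)
    with_function = round(PROTEIN_CODING * FRACTION_WITH_FUNCTION)
    return {
        "total_bp": total_bp,
        # Fraction of the genome carried on the two plasmids.
        "plasmid_share": sum(PLASMID_BP) / total_bp,
        # Protein-coding plus RNA genes should equal the predicted total.
        "genes_consistent": PROTEIN_CODING + RNA_GENES == TOTAL_GENES,
        "with_function_approx": with_function,
        "hypothetical_approx": PROTEIN_CODING - with_function,
    }


if __name__ == "__main__":
    for key, value in summarize_genome().items():
        print(f"{key}: {value}")
```

Under these numbers the three replicons sum to 4,535,320 bp, and the gene counts are internally consistent (4,212 + 76 = 4,288).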