Draft genome sequence of a nitrate-reducing, o-phthalate degrading bacterium, Azoarcus sp. strain PA01T

Azoarcus sp. strain PA01T belongs to the genus Azoarcus, of the family Rhodocyclaceae within the class Betaproteobacteria. It is a facultatively anaerobic, mesophilic, non-motile, Gram-stain negative, non-spore-forming, short rod-shaped bacterium that was isolated from a wastewater treatment plant in Constance, Germany. It is of interest because of its ability to degrade o-phthalate and a wide variety of aromatic compounds with nitrate as an electron acceptor. Elucidation of the o-phthalate degradation pathway may help to improve the treatment of phthalate-containing wastes in the future. Here, we describe the features of this organism, together with the draft genome sequence information and annotation. The draft genome consists of 4 contigs with 3,908,301 bp and an overall G + C content of 66.08 %. Out of 3,712 total genes predicted, 3,625 genes code for proteins and 87 genes for RNAs. The majority of the protein-encoding genes (83.51 %) were assigned a putative function while those remaining were annotated as hypothetical proteins.


Introduction
Phthalic acid consists of a benzene ring to which two carboxylic groups are attached. There are three isomers of phthalic acid (o-phthalic acid, m-phthalic acid and p-phthalic acid). Phthalic acid esters are widely used as additives in plastic resins such as polyvinyl resin, cellulosic and polyurethane polymers for the manufacture of building materials, home furnishings, transportation apparatus, clothing, and to a limited extent in food packaging materials and medical products [1,2]. Due to the widespread use of phthalates there has been great concern about their release into the environment [3,4]. In addition, phthalates and their metabolic intermediates have been found to be potentially harmful to humans due to their hepatotoxic, teratogenic and carcinogenic characteristics [5,6]. Phthalic acid is also an intermediate in the bacterial degradation of phthalic acid esters [7] as well as in degradation of certain fused-ring polycyclic aromatic compounds found in fossil fuel [8], such as phenanthrene [9], fluorene [10] and fluoranthene [11].
Azoarcus sp. strain PA01T (=KCTC 15483) is a mesophilic, Gram-negative, nitrate-reducing bacterium that was isolated from a wastewater treatment plant in Constance, Germany, for its ability to completely degrade o-phthalate and a wide range of aromatic compounds. Strain PA01T is also able to grow with a variety of organic substrates including short-chain fatty acids, alcohols, selected sugars and amino acids. These substrates are degraded completely to carbon dioxide coupled to nitrate reduction. The genus Azoarcus comprises nitrogen-fixing bacteria [12] and is known for the degradation of aromatic compounds. Currently, this genus consists of nine species with validly published names [13]. These species have been isolated from a wide range of environments, including anoxic wastewater sludge and grass root soil [12]. On the basis of a 16S rRNA gene sequence similarity search, the closest relatives of strain PA01T are Azoarcus buckelii DSM 14744T (99 % sequence similarity) [14,15] and Azoarcus anaerobius (98 %) [16]. A. buckelii DSM 14744T was also isolated from a sewage treatment plant for its ability to degrade a wide range of aromatic compounds. However, the biochemistry and genetics of anaerobic o-phthalate degradation have not yet been elucidated in detail. Here, we present a summary of the features of Azoarcus sp. strain PA01T and its classification, together with a description of the genomic information and annotation.

Classification and features
Azoarcus sp. strain PA01T is a member of the family Rhodocyclaceae in the phylum Proteobacteria. It was isolated from an activated sewage sludge sample collected in 2012 from a wastewater treatment plant in Constance, Germany. Enrichment, isolation, purification and growth experiments were performed in anoxic, bicarbonate-buffered, non-reduced freshwater medium containing (g/l); NaCl,  [18] and 1 ml seven-vitamin solution [19] were added. The initial pH of the medium was adjusted to 7.3 ± 0.2 with sterile 1 N NaOH or 1 N HCl. Cultivation and transfer of the strain were performed under an N2:CO2 (80:20) gas atmosphere. The strain was cultivated in the dark at 30°C. Enrichment cultures were started by inoculating approximately 2 ml of sludge sample into 50 ml of the freshwater medium described above, containing 2 mM neutralized o-phthalic acid as sole carbon source and 10-12 mM NaNO3 as electron acceptor. Growth was observed after 3-4 weeks of incubation. Enrichment cultures were sub-cultured for several passages with o-phthalate as sole carbon source. Pure cultures were obtained by repeated agar (1 %) shake dilutions [20]. Single colonies were retrieved by means of finely drawn sterile Pasteur pipettes and transferred to fresh liquid medium. The strain was routinely examined for purity by light microscopy (Axiophot, Zeiss, Germany), including after growth of the culture with 2 mM phthalate plus 1 % (w/v) yeast extract. For genetic and chemotaxonomic analysis, it was cultivated in the described medium containing 8 mM acetate as carbon source.
Azoarcus sp. strain PA01T is a mesophilic, non-motile, Gram-negative, short rod-shaped bacterium measuring 0.5-0.7 μm in width and 1.6-1.8 μm in length (Fig. 1a and b) that divides by binary fission. Growth was observed from 25°C to 37°C with an optimum at 30°C and optimal pH of 7.3 ± 0. Initial identification and validation of strain PA01T was performed by 16S rRNA gene amplification using a set of universal bacterial primers, 27F (5′-AGA GTT TGA TCM TGG CTC AG-3′) and 1492R (5′-TAC GGY TAC CTT GTT ACG ACT T-3′), as described [21]. A phylogenetic tree was constructed from the 16S rRNA gene sequence together with those of other representatives of the genus Azoarcus (Fig. 2) using the MEGA 4 software package [22]. Phylogenetic analysis indicated that strain PA01T belongs to the genus Azoarcus and is closely related to Azoarcus buckelii (99 %) and Azoarcus anaerobius (98 %). Currently, 30 genome sequences are available for members of the order Rhodocyclales. The closest neighbors of strain PA01T whose genome sequences are available are Azoarcus sp. strain KH32C [23], Azoarcus sp. strain BH72 [24] and Azoarcus toluclasticus ATCC
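The two primer sequences quoted above each contain one degenerate IUPAC base (M in 27F, Y in 1492R). As a minimal illustrative sketch, not part of the published protocol, the following Python snippet expands each degenerate primer into the concrete oligonucleotides it represents. The function name `expand` and the constant names are hypothetical; only the primer sequences come from the text.

```python
from itertools import product

# IUPAC codes needed for these two primers: M = A or C, Y = C or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "Y": "CT"}

# Primer sequences as quoted in the text (5' to 3'); spacing is cosmetic.
PRIMER_27F = "AGA GTT TGA TCM TGG CTC AG"
PRIMER_1492R = "TAC GGY TAC CTT GTT ACG ACT T"


def expand(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    seq = primer.replace(" ", "").upper()
    # One tuple of candidate bases per position; product() enumerates them all.
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


if __name__ == "__main__":
    for name, primer in (("27F", PRIMER_27F), ("1492R", PRIMER_1492R)):
        variants = expand(primer)
        print(f"{name}: {len(variants)} variant(s)")
        for variant in variants:
            print("  ", variant)
```

Each primer carries a single two-fold degenerate position, so each expands to two concrete sequences.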

Genome sequencing information
Genome project history
Strain PA01T was selected for genome sequencing on the basis of its phylogenetic position and its ability to grow on o-phthalate together with numerous aromatic compounds under nitrate-reducing conditions. Genome sequencing was performed at GATC Biotech AG, Konstanz (Germany). The high-quality draft genome sequence of Azoarcus sp. strain PA01T is listed in the Genomes Online Database of the Joint Genome Institute under project ID Gp0109270 [25]. The Azoarcus sp. PA01T whole genome shotgun (WGS) project has been deposited at DDBJ/EMBL/GenBank under the project accession LARU00000000. The version described in this paper has the accession number LARU01000000, and consists of sequences LARU01000001-LARU01000004. The draft genome sequence was released on August 26, 2015. Annotation of the Azoarcus sp. strain PA01T genome was performed by the DOE Joint Genome Institute using its state-of-the-art microbial genome annotation pipeline [29,30]. Table 2 presents the project information and its association with MIGS version 2.0 compliance [31].

Growth conditions and genomic DNA preparation
For the isolation of genomic DNA, cells were grown in one liter of medium with 8 mM acetate plus 10-12 mM nitrate. Cells were harvested in the late stationary phase and the cell pellet was stored frozen (−20°C) until DNA preparation. High-molecular-weight genomic DNA was prepared using a modified CTAB DNA extraction protocol [32]. The chloroform:isoamyl alcohol (24:1) and phenol:chloroform:isoamyl alcohol (25:24:1) steps were each repeated twice and RNase treatment was performed for 2 h. Finally, the DNA was dissolved in RNase- and DNase-free molecular-grade water. Purity, quality and size of the genomic DNA preparation were analyzed using a NanoDrop spectrophotometer (639 ng/μl, A260/280 = 1.84, A260/230 = 2.10) and agarose gel electrophoresis (1 % w/v) (see Fig. 1c).

Genome sequencing and assembly
The genome of Azoarcus sp. strain PA01T was sequenced using a library size of 8-12 kb. Library construction, quantification and sequencing (Pacific Biosciences RS) were performed at GATC Biotech AG (Konstanz, Germany). The final high-quality draft assembly was based on 95,883 reads. The combined libraries provided a mean sequencing depth of 97.42-fold coverage. Final de novo assembly of the genome from the total reads was performed using the PacBio HGAP3 assembly pipeline with default filter parameters. The minimum read length and polymerase read quality were 500 bp and 0.80, respectively. The minimum seed read length was computed automatically and resulted in a length cutoff of 5181 bp. The final polished assembly of the sequencing reads yielded 4 linear contigs, giving a draft genome size of 3.9 Mb.
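The reported read count, coverage and genome size can be related by simple arithmetic. The sketch below is a back-of-the-envelope estimate, assuming that mean coverage equals total sequenced bases divided by assembly size and that all 95,883 reads contributed to that coverage; neither assumption is stated in the text, and the function names are hypothetical.

```python
# Values reported in the text.
GENOME_SIZE_BP = 3_908_301  # draft assembly size
MEAN_COVERAGE = 97.42       # mean sequencing depth (fold)
READ_COUNT = 95_883         # reads in the final assembly


def implied_total_bases(genome_size_bp: int, coverage: float) -> float:
    """Total sequenced bases implied by coverage = bases / genome size."""
    return genome_size_bp * coverage


def implied_mean_read_length(total_bases: float, read_count: int) -> float:
    """Mean read length implied if every read contributed to the coverage."""
    return total_bases / read_count


total_bases = implied_total_bases(GENOME_SIZE_BP, MEAN_COVERAGE)
mean_read_length = implied_mean_read_length(total_bases, READ_COUNT)

if __name__ == "__main__":
    print(f"Implied total bases: {total_bases / 1e6:.1f} Mb")
    print(f"Implied mean read length: {mean_read_length:.0f} bp")
```

Under these assumptions the data correspond to roughly 381 Mb of sequence and a mean read length of just under 4 kb, which is shorter than the 5181 bp seed-read cutoff, as expected when only the longest reads are used as seeds.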

Genome annotation
Annotation was carried out using the DOE-JGI annotation pipeline [30] and genes were identified using Prodigal [33]. The predicted CDSs were translated and used to search the NCBI non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG and InterPro databases. The tRNAscan-SE tool [34] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [35]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and RNase P, were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [36]. Additional gene prediction analysis and manual functional annotation were performed within the IMG-ER platform [37].

Genome properties
The draft genome of Azoarcus sp. PA01T is 3,908,301 bp long (in 4 linear contigs, see Fig. 3) with an overall G + C content of 66.08 % (Table 3). Of a total of 3,712 genes predicted, 3,625 were protein-coding genes and 87 were RNA genes (including 15 rRNA genes and 59 tRNA genes); 525 genes without function were identified (pseudogenes). The majority of the protein-coding genes (83.51 %) were assigned a putative function while those remaining were annotated as hypothetical proteins. The properties and statistics of the genome are summarized in Table 3, and the distribution of genes into COG functional categories is presented in Table 4. One CRISPR region was found in the genome of strain PA01, located in proximity to genes for the CRISPR-associated endonucleases Cas1 and Cas2.
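The gene counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch using only the figures quoted in this section. The text applies the 83.51 % to protein-coding genes; because annotation summaries sometimes report such percentages relative to all predicted genes, the alternative base is computed as well and is labeled as an assumption. All variable names are illustrative.

```python
# Figures quoted in the text.
GENOME_BP = 3_908_301
TOTAL_GENES = 3_712
PROTEIN_CODING = 3_625
RNA_GENES = 87
RRNA_GENES = 15
TRNA_GENES = 59
WITH_FUNCTION_PCT = 83.51

# Protein-coding and RNA genes should add up to the total gene count.
counts_consistent = PROTEIN_CODING + RNA_GENES == TOTAL_GENES

# RNA genes that are neither rRNA nor tRNA (other non-coding RNAs).
other_rna = RNA_GENES - RRNA_GENES - TRNA_GENES

# Base as stated in the text: percentage of protein-coding genes.
with_function = round(PROTEIN_CODING * WITH_FUNCTION_PCT / 100)
hypothetical = PROTEIN_CODING - with_function

# Alternative base (assumption, not stated in the text): all predicted genes.
with_function_alt = round(TOTAL_GENES * WITH_FUNCTION_PCT / 100)

# Gene density in genes per kilobase of sequence.
genes_per_kb = TOTAL_GENES / (GENOME_BP / 1000)

if __name__ == "__main__":
    print("Counts consistent:", counts_consistent)
    print("Other RNA genes:", other_rna)
    print("With putative function (protein-coding base):", with_function)
    print("Hypothetical proteins (protein-coding base):", hypothetical)
    print("With putative function (all-genes base):", with_function_alt)
    print(f"Gene density: {genes_per_kb:.2f} genes/kb")
```

Table 3 remains the authoritative source for the exact counts; the sketch only shows how the quoted percentages and totals relate to one another.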

Insight from the genome sequence
Azoarcus sp. strain PA01T grows on a wide variety of aromatic compounds (Table 1) with nitrate as electron acceptor, like other bacteria capable of growth via anaerobic degradation of aromatic compounds [38]. In the degradation pathway of most aromatic compounds (including o-phthalate), benzoate is a central intermediate, and it has also been used routinely as the model compound to study the anaerobic degradation of aromatic compounds via the benzoyl-CoA degradation pathway [39]. Annotation of the genome indicated that strain PA01T has key enzymes for the degradation of aromatic compounds such as benzoate.
In the past decade, degradation of benzoate through the benzoyl-CoA pathway has been detailed at the molecular level in facultative anaerobes and in strictly anaerobic phototrophic bacteria, i.e., in the denitrifying bacterium Thauera aromatica and in Rhodopseudomonas palustris, respectively [40,41]. Like other benzoate and/or aromatic compound degrading bacteria, strain PA01T has the genes for benzoate degradation, which begins with a one-step reaction that activates benzoate to benzoyl-CoA by an ATP-dependent benzoate-CoA ligase. The genome of PA01T contains two copies of the benzoate-CoA ligase gene (EC 6.2.1.25; locus tags PA01_01819 and PA01_03223), which are presumed to be involved in the initial activation of benzoate to benzoyl-CoA. They are located at different positions in the genome. These two genes show 68.11 % identity to each other and are also present in the genomes of other bacteria [23]. The subsequent enzyme of benzoate degradation, benzoyl-CoA reductase, is present in one copy with all four of its subunits (locus tags PA01_00623, PA01_00625, PA01_00624, PA01_00626) in the genome of strain PA01. The presence of these gene clusters in the genome of Azoarcus sp. strain PA01T provides evidence for the capacity of the strain to degrade aromatic compounds.
Most of the novel biochemistry of the anaerobic metabolism of aromatic compounds has been discovered with nitrate-reducing bacteria in the past two decades [42,43], yet little is known about the biochemistry of phthalate degradation in nitrate-reducing and strictly anaerobic (fermenting and sulfate-reducing) bacteria. We are currently exploring the genome of strain PA01T and the enzymes responsible for o-phthalate degradation by using differential proteomics and measuring enzyme activities (unpublished). Thus, the draft genome sequence of strain PA01T provides an opportunity to study the biochemistry of o-phthalate degradation in depth.

Conclusions
Azoarcus sp. strain PA01T harbors various genes required for the degradation of aromatic compounds that are normally found in other aromatic-degrading bacteria, e.g., genes for benzoate degradation. Further, the genome of Azoarcus sp. strain PA01T will expand our understanding of the biochemistry of anaerobic degradation of various aromatic compounds, including o-phthalate, a priority pollutant. The genome sequence of strain PA01T will provide insight into the putative genes involved in the degradation of all these compounds, mainly o-phthalate.