Insights into the single cell draft genome of “Candidatus Achromatium palustre”

“Candidatus Achromatium palustre” was recently described as the first marine representative of the Achromatium spp. in the Thiotrichaceae, a sister lineage to the Chromatiaceae in the Gammaproteobacteria. Achromatium spp. belong to the group of large sulfur bacteria, as they can grow to nearly 100 μm in size and store elemental sulfur (S0) intracellularly. As a unique feature, Achromatium spp. can accumulate colloidal calcite (CaCO3) inclusions in great amounts. Currently, both the process and the function of calcite accumulation in bacteria are unknown, and all Achromatium spp. are uncultured. Recently, three single-cell draft genomes of Achromatium spp. from a brackish mineral spring were published, and here we present the first draft genome of a single “Candidatus Achromatium palustre” cell collected in the sediments of the Sippewissett Salt Marsh, Cape Cod, MA. Our draft dataset consists of 3.6 Mbp, has a G + C content of 38.1 % and is nearly complete (83 %). The next closest relative to the Achromatium spp. genomes is Thiorhodovibrio sp. 907 of the family Chromatiaceae, which contains phototrophic sulfide-oxidizing bacteria.

A marine population of Achromatium spp. [6] was recently described in more detail [7]. This population showed altered migration patterns as well as an increased tolerance to oxygen relative to what has been reported for freshwater populations [14]. Besides calcite and sulfur inclusions, staining and energy dispersive X-ray analysis revealed a third type of inclusion in the salt marsh Achromatium. These inclusions contain a high concentration of Ca2+ ions, which were suggested to be stored for the rapid, dynamic precipitation of calcium carbonate. The number of inclusions varied according to the position of a cell relative to the redox gradient of the sediment [7].
Sequencing Achromatium genomes not only provides insight into the genetic and ecophysiological potential of these uncultured organisms, and thereby genetic evidence supporting field and microcosm observations (Table 1), but also enables the identification of candidate genes involved in calcite accumulation. Three draft genomes of Achromatium from a mineral spring in Florida were recently published [17], and here we present the first draft genome of a marine Achromatium representative.

Table 1 (excerpt)
Phylum: Proteobacteria, TAS [42-44]
Class: Gammaproteobacteria, TAS [44,45]
Order: Thiotrichales, TAS [32]
Family: Thiotrichaceae, TAS [31]
Genus: Achromatium, TAS [5,46]
Species: Candidatus Achromatium palustre, TAS [7,47]
Gram stain: Negative, TAS [14]

Classification and features
As the most striking phenotypic feature, Candidatus A. palustre, as well as other described Achromatium species, appear bright white to the naked eye, as they contain multiple intracellular calcium carbonate (CaCO3) inclusions and elemental sulfur (S0) granules that fill nearly the entire interior of the cell. There is no large central vacuole as observed in other large sulfur bacteria, e.g. Beggiatoa spp. [18]. Calcite inclusions vary in diameter, but are typically several micrometers in size. Under the microscope, Achromatium spp. appear bulgy and rock-like (Fig. 1a), and one can observe the slow, jerky rolling motility of the large cells. TEM imaging of freshwater Achromatium showed that the calcite inclusions have a central nucleation point that is surrounded by concentric rings of precipitated calcite, and that they are probably enclosed by a membrane [14]. The salt marsh Achromatium were on average 20 × 26 μm in size, rod-shaped, and contained several large calcite inclusions and numerous small interstitial inclusions. Some cells had an external sheath, presumably a layer of mucus, to which other rod-shaped and filamentous bacteria were occasionally attached [7]. Staining with Calcium Orange-5N (Fig. 1c) or Calcium Green-1 revealed additional inclusions that were highly enriched in Ca2+ and of much smaller size (<1 μm) in the interstitial space between the large calcite inclusions (compare Fig. 1b and c) [7]. Achromatium have a Gram-negative cell wall [3,19], and the cytoplasm as well as the DNA is distributed across the entire cell in thin (<2 μm) threads stretching between the inclusions [7].
Candidatus Achromatium palustre was detected in Little Sippewissett Salt Marsh on Cape Cod, Massachusetts, where the cells occurred mainly in the upper 2 cm of the sediment of a tide pool. From the deeper layers of the flocculent, organic-rich phytodetritus, high sulfide concentrations diffused upwards and reached the sediment/water interface during the night. During the day, photosynthetic algae and cyanobacteria generated supersaturated oxygen concentrations in the surficial sediment and overlying water column, which created an oxic, sulfide-free zone in the upper millimeters of the sediment [7].

Fig. 2 Phylogenetic tree based on 16S rRNA gene sequence information. The reconstruction was originally performed with 80 sequences, of which only a subset is shown here, and a total of 1,101 aligned positions using the maximum likelihood RAxML method of the ARB software package [49]. The tree was rooted with representatives of the Deltaproteobacteria. Branching patterns supported by <40 % confidence in 100 bootstrap replicates were manually converted into multifurcations. Candidatus Achromatium palustre, the source organism of the genome presented here, affiliates with cluster A in the Achromatium lineage and is highlighted in bold face. (T) marks type strains/sequences, and asterisks (*) indicate the availability of a genome.

The salt marsh Achromatium population co-occurred with highly abundant and conspicuous, millimeter-size aggregates of purple sulfur bacteria in the surficial sediment layers. Other phototrophic bacteria (phylum Cyanobacteria) and eukaryotes (diatoms) are also found in higher densities at the sediment/water interface; heterotrophic sulfate-reducing bacteria of the Deltaproteobacteria dominate in deeper sediment layers [7,20,21]. The single Candidatus A. palustre cell was isolated by an initial sieving of the sediment to remove the large aggregates and debris, followed by manual removal of the cell using a glass Pasteur pipette, and successive washing steps in sterile water until contaminants were diluted out.
Currently, Achromatium spp. 16S rRNA gene sequences are classified either as Achromatium oxaliferum or as Achromatium sp., and both designations are intermixed [2,3,22] between the two phylogenetic subclusters "A" and "B" (Fig. 2). These subclusters are separated not only by 16S rRNA gene sequence difference, but also by the presence (A) or absence (B) of helix 38 in the V6 region [2]. Recently, it was proposed that the subclusters may represent and/or include several candidatus taxa [8]. However, due to the lack of cultures, a reclassification of the members of the Achromatium lineage is challenging, as it cannot be based on sequence information alone [23]. As information about the natural populations and subpopulations accumulates through culture-independent techniques, the phylotypes will most likely receive phylogenetic attention in the future. One subcluster in "cluster B" was already classified as "Candidatus Achromatium minus" based on sequence divergence and morphological difference [24]. "Candidatus Achromatium palustre" was likewise classified as part of "cluster A", based on 16S rRNA gene sequence information, the adaptation of the cells to a very different habitat, and their altered behavioural characteristics [7] (Fig. 2).

Genome project history
The sequencing project was initiated in August 2013, when cells were collected from the field, isolated, and subjected to multiple displacement amplification. The amplified DNA was sequenced in November 2014, and the raw data were integrated into the JGI pipeline Jigsaw2.4.1, where they were quality-checked and assembled. Annotation and further decontamination were performed through IMG [33]. After final analysis for contamination and completion in CheckM [34], the draft genome (Table 2) was completed in February 2015, when it was deposited in the Genome On-Line Database and became available in IMG (Ga0065144). The whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number LFCU00000000.

Growth conditions and genomic DNA preparation
The cell was retrieved directly from the field, added to the sample buffer of the illustra GenomiPhi V2 kit (GE Healthcare Life Sciences, Pittsburgh, PA), crushed manually with a sterile needle, heated for 3 min at 95 °C, and supplemented with the remaining ingredients for the MDA reaction [35]. Purity of the MDA product was assessed by amplifying the 16S rRNA gene sequence and directly sequencing the PCR product by Sanger sequencing. The genome was then reamplified with the illustra GenomiPhi HY DNA Amplification kit to yield enough material for whole genome sequencing.

Genome sequencing and assembly
The MDA product was sequenced with Illumina MiSeq v2 technology at the Cornell University Institute of Biotechnology, Ithaca, NY. This resulted in a total of 30,190,768 reads, which were quality checked, trimmed, and artifact/contamination filtered with DUK, a filtering program developed at the JGI that removes known Illumina sequencing and library preparation artifacts. Additionally, reads were screened for human, cat, and dog contaminant sequences. The remaining 29,696,136 reads were passed to SPAdes [36] and assembled into 586 contigs >2 kb, representing 7,614,708 bp. This dataset was uploaded in IMG/mer [37] under analysis project number Ga0064002, and further decontaminated manually. Only contigs affiliating with the Thiotrichaceae/Chromatiales lineage were finally uploaded in IMG/er [38] under analysis project number Ga0065144. This final dataset is the draft genome of Candidatus A. palustre; it consists of 3,645,683 bp on 276 contigs, with a coverage of 375x. CheckM is software designed to assess the quality and completeness of (meta)genomes [34]. Our analysis of the draft genome dataset revealed a completeness of 83.36 % based on the finding of 503/538 lineage-specific marker genes (marker lineage Gammaproteobacteria), and a contamination value of 1.13 %, which is within the error range (≤6 %) of contamination estimates for incomplete (~70 %) genomes [34]. Strain heterogeneity, tested by the amino acid identity (AAI) between multi-copy genes [34], is 0.
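The filtering steps above can be summarised with a little arithmetic. The following Python sketch is illustrative only and is not part of the original workflow: it uses the read, contig, and base-pair counts quoted in this section, and the names `percent` and `summarize` are hypothetical helpers introduced for this example. Under these numbers, quality filtering retained roughly 98.4 % of the reads, while manual decontamination kept a little under half of the assembled bases and contigs.

```python
# Counts quoted in the text; everything else here is an illustrative assumption.
RAW_READS = 30_190_768        # reads produced by the MiSeq run
FILTERED_READS = 29_696_136   # reads remaining after filtering and screening
ASSEMBLY_BP = 7_614_708       # bases in the 586 contigs >2 kb
ASSEMBLY_CONTIGS = 586
DRAFT_BP = 3_645_683          # bases in the final, decontaminated draft
DRAFT_CONTIGS = 276


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def summarize():
    """Summarise how much data survived each filtering step."""
    return {
        "reads_removed": RAW_READS - FILTERED_READS,
        "reads_retained_pct": percent(FILTERED_READS, RAW_READS),
        "bp_retained_pct": percent(DRAFT_BP, ASSEMBLY_BP),
        "contigs_retained_pct": percent(DRAFT_CONTIGS, ASSEMBLY_CONTIGS),
        "mean_draft_contig_bp": DRAFT_BP / DRAFT_CONTIGS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Such a summary makes it easy to see that most of the data reduction happened during manual decontamination of the assembly, not during read filtering.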

Genome annotation
Gene calling and functional annotation were performed automatically by IMG [33,39] during the upload process.
We are currently manually verifying annotations of interest by constructing databases from UniProt (Swiss-Prot and TrEMBL) and comparing these with the Achromatium draft genome using the integrated tblastn tool in IMG/er.

Fig. 3 Simulated circular genome map of the 276 concatenated contigs of the Candidatus A. palustre draft genome. The contigs were concatenated in Geneious 6.0.1 [50] in their arbitrary order of appearance in IMG, and the map was generated in Geneious and CGView [51]. The concatenated contigs are shown in blue, open reading frames (ORFs) in both directions in red, and the GC content in black.

Genome properties
The Candidatus Achromatium palustre draft genome is 3,645,683 bp in size and distributed on 276 contigs that are between 2,012 and 57,118 bp in length. The N50 is 18,361 bp, and the G + C content is 38.08 %. Based on sequence comparison of nearly full-length 16S rRNA genes, the Candidatus Achromatium palustre genome affiliates with cluster A among other Achromatium spp. sequences, including the three previously published draft genomes (Fig. 2). The Achromatium lineage is a sister lineage to the Chromatiaceae [2,3,8,22,24], which contain purple sulfur bacteria such as Thiorhodovibrio and Chromatium (Fig. 2). IMG identified 3,400 genes, of which 3,343 encoded proteins (98.32 %) and 57 encoded RNA (1.68 %); no pseudogenes were identified (0.00 %). Among the 57 RNA genes, one operon contained the 16S rRNA, 23S rRNA, and 5S rRNA genes. An additional truncated 5S rRNA gene was located on a different contig, and its sequence is identical to that of the full-length 5S rRNA gene. Furthermore, we found, e.g., 42 tRNA genes and genes for transcription and translation, DNA replication and repair, cell motility, and chemotaxis. Details are given in Fig. 3 and Tables 3 and 4. We found no indication of plasmid DNA. Further insights into the coding regions of the draft genome will be given elsewhere.
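For readers less familiar with the assembly statistics quoted above, the following Python sketch shows one common way to compute N50 and G + C content from a set of contigs. It is a minimal illustration under stated assumptions, not the procedure used by IMG: the function names `n50` and `gc_percent` are hypothetical, and the toy inputs are invented for demonstration. N50 is taken here as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running sum first reaches half the total.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def gc_percent(sequences):
    """Return the pooled G + C content (in percent) of nucleotide sequences."""
    total = sum(len(seq) for seq in sequences)
    if total == 0:
        raise ValueError("sequences must contain at least one base")
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in sequences)
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(n50([2, 3, 4, 5, 6]))            # contig lengths
    print(gc_percent(["GGCC", "ATAT"]))    # contig sequences
    # Gene-count shares quoted in the text: 3,343 and 57 of 3,400 genes.
    print(round(100 * 3343 / 3400, 2), round(100 * 57 / 3400, 2))
```

The last line reproduces the gene-count percentages reported in the text from the raw counts.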

Conclusions
Details of Achromatium spp. genomes promise further insight into the ecophysiology of these unique organisms. The draft genome of Candidatus A. palustre is one of the first steps towards unravelling the phenotypic and physiological adaptations of Achromatium spp. occurring in different redox gradient systems as well as across diverse salinities. A comparison with the brackish Achromatium genomes and prospective freshwater Achromatium spp. genomes, as well as with future metagenomes of different Achromatium-containing habitats, will be conducted and promises highly valuable information. Future analyses will include not only the investigation of nutrient pathways and modes of energy generation in these organisms, but also potential insights into calcium transport and calcite accumulation.

Competing interests
The authors declare that they have no competing interests.