Non-contiguous finished genome sequence and description of Fenollaria massiliensis gen. nov., sp. nov., a new genus of anaerobic bacterium

Fenollaria massiliensis strain 9401234T is the type strain of Fenollaria massiliensis gen. nov., sp. nov., a new species within the new genus Fenollaria. This strain, whose genome is described here, was isolated from an osteoarticular sample. F. massiliensis strain 9401234T is an obligately anaerobic Gram-negative bacillus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 1.71 Mbp long genome exhibits a G+C content of 34.46% and contains 1,667 protein-coding and 30 RNA genes, including 3 rRNA genes.


Introduction
Fenollaria massiliensis strain 9401234T (= CSUR P127 = DSM 26367) is the type strain of Fenollaria massiliensis sp. nov., and the first member of the new genus Fenollaria gen. nov. This bacterium is a Gram-negative, anaerobic, non-spore-forming, indole-positive bacillus that was isolated from an osteoarticular sample during a study prospecting anaerobic isolates from deep samples [1]. Traditionally, the definition of a new bacterial species or genus has relied on the "gold standard" methods of DNA-DNA hybridization and G+C content determination [2]. However, those methods are expensive and poorly reproducible. The development of PCR and sequencing methods led to new ways of classifying bacterial species, in particular using 16S rRNA sequence similarity with cutoff values [3], together with phenotypic characteristics. Recently, a number of new bacterial genera and species have been described using high-throughput genome sequencing and mass spectrometric analyses, which allow access to a wealth of genetic and proteomic information [4,5]. We propose a new bacterial genus and species using a whole genome sequence, a MALDI-TOF spectrum, and the main characteristics of the organism, as we have previously done [6][7][8][9][10][11][12].
Here we present a summary classification and a set of features for F. massiliensis gen. nov., sp. nov. strain 9401234T (= CSUR P127 = DSM 26367), together with the description of the complete genomic sequencing and annotation. These characteristics support the circumscription of a novel genus, Fenollaria gen. nov., within the Clostridiales Family XI Incertae sedis, with Fenollaria massiliensis gen. nov., sp. nov., as the type species. Clostridiales Family XI Incertae sedis was created in 2009 [13] and currently comprises 11 genera, including Anaerococcus, Peptoniphilus and Tissierella. It is a heterogeneous group that includes anaerobic and morphologically variable bacteria. This group is defined mainly on the basis of phylogenetic analyses of 16S rRNA sequences, and its members have no precise taxonomic or phylogenetic affiliation. Based on the 16S rRNA comparison, the species most closely related to Fenollaria massiliensis is Sporobacterium olearium [14], which is the sole representative of the genus Sporobacterium. S. olearium is a Gram-positive rod with terminal spores. The most closely related validly named species is Tissierella creatinini, which belongs to the genus Tissierella [15]. This genus was first described in 1986 and is represented by three species, among which the type species is T. praeacuta, a strictly anaerobic, Gram-negative, non-spore-forming bacterium.

Classification and features
An osteoarticular sample was collected from a patient as part of a study analyzing emerging anaerobic infectious agents by MALDI-TOF and 16S rRNA gene sequencing. The specimen was sampled in Marseille and preserved at -80°C after collection. In the inferred phylogenetic tree, strain 9401234T forms a distinct lineage within the Clostridiales Family XI Incertae sedis (Figure 1). The 16S rRNA similarity values are lower than the recommended threshold to delineate a new genus without carrying out DNA-DNA hybridization [3].
Growth at different temperatures was tested; no growth occurred at 23°C, 25°C, 28°C or 50°C, but growth did occur between 32°C and 37°C. Optimal growth was observed at 37°C.
Colonies are punctiform, grey, smooth, and round when grown on blood-enriched Columbia agar (BioMérieux) under anaerobic conditions using GENbag anaer (BioMérieux). Growth was achieved anaerobically on blood-enriched Columbia agar and in TS broth medium after 72 h. Cells also grew under anaerobic conditions on BHI agar supplemented with 1% NaCl. Growth did not occur under microaerophilic conditions or in the presence of air with 5% CO2. Gram staining showed rod-shaped, non-spore-forming Gram-negative bacilli (Figure 2). Cells were non-motile. Cells grown in TS broth medium have a mean length of 1.555 µm (min = 1.167 µm; max = 2.948 µm) and a mean width of 0.772 µm (min = 0.602 µm; max = 1.014 µm), as determined by electron microscopic observation after negative staining (Figure 3).

Table 1 (footnote). Evidence code NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [25]. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.

Figure 1 (caption). Sequences were aligned using CLUSTALW, and phylogenetic inferences obtained using the maximum-likelihood method within the MEGA 4 software [27]. Numbers at the nodes are bootstrap values obtained by repeating the analysis 500 times to generate a majority consensus tree. The scale bar represents a 2% nucleotide sequence divergence.

Strain 9401234T exhibited neither catalase nor oxidase activity. Using the API 20A system, a positive reaction was observed only for indole, and weakly for gelatinase. Using the API ZYM system, a positive reaction was observed for leucine arylamidase and valine arylamidase among the proteases, and for naphthol phosphatase.
API Rapid ID 32A confirmed the positivity for indole and leucine arylamidase, and was also positive for arginine arylamidase and weakly positive for pyrrolidonyl arylamidase, tyrosine arylamidase, glycine arylamidase, histidine arylamidase and serine arylamidase. Regarding antibiotic susceptibility, F. massiliensis was susceptible to penicillin G, amoxicillin, cefotetan, imipenem, metronidazole, and vancomycin. The phenotypic characteristics of F. massiliensis, compared with those of Tissierella creatinini, Sporobacterium olearium, and Anaerococcus prevotii within the Clostridiales Family XI Incertae sedis, are detailed in Table 2.
Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [29]. A pipette tip was used to pick one isolated bacterial colony from a culture agar plate and to spread it as a thin film on an MTP 384 MALDI-TOF target plate (Bruker Daltonik GmbH, Germany). Ten distinct deposits were made for strain 9401234T from ten isolated colonies. Each smear was overlaid with 2 µL of matrix solution (saturated solution of alpha-cyano-4-hydroxycinnamic acid) in 50% acetonitrile, 2.5% trifluoroacetic acid, and allowed to dry for five minutes. Measurements were performed with a Microflex spectrometer (Bruker). Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of acquisition was between 30 seconds and 1 minute per spot. The ten 9401234T spectra were imported into the MALDI Biotyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 5,697 bacteria in the Biotyper database. The identification method uses the m/z range from 3,000 to 15,000 Da. For every spectrum, at most 100 peaks were taken into account and compared with the spectra in the database. The output score enabled the identification of the tested species: a score ≥ 2 with a validated species enabled identification at the species level; a score ≥ 1.7 but < 2 enabled identification at the genus level; a score < 1.7 was not significant. For strain 9401234T, the obtained score was 1.04, which is not significant, suggesting that our isolate was not a member of a known genus. We added the spectrum from strain 9401234T (Figure 4) to our database.
A dendrogram was constructed with the MALDI Biotyper software, comparing the reference spectrum of strain 9401234T with reference spectra of 29 bacterial species, all belonging to the order Clostridiales (Figure 5). In this dendrogram, strain 9401234T appears in a separate clade between the genera Peptoniphilus and Acidaminococcus (Figure 5).
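The score interpretation described above can be written as a small decision rule. The following is a minimal sketch, not part of the Biotyper software; the function name `interpret_biotyper_score` is hypothetical, and the species-level branch additionally assumes that the best match is a validated species, as stated in the text.

```python
def interpret_biotyper_score(score: float) -> str:
    """Map a MALDI Biotyper score to the identification level given in the text.

    Thresholds follow the paragraph above: >= 2 species level (assuming the
    match is a validated species), >= 1.7 and < 2 genus level, < 1.7 not
    significant. The function name is hypothetical.
    """
    if score >= 2.0:
        return "species"
    if score >= 1.7:
        return "genus"
    return "not significant"


# Score reported for strain 9401234T
print(interpret_biotyper_score(1.04))  # not significant
```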

Genome sequencing and annotation

Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position, 16S rRNA similarity to other members of the Clostridiales Family XI Incertae sedis, and its isolation from an osteoarticular clinical sample. It is the first genome of the new genus Fenollaria (GenBank accession numbers are CALI02000001-CALI02000010) and consists of 11 contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance.

Genome sequencing and assembly
The paired-end library for this project was loaded twice on a one-quarter region of a PTP Picotiter plate. DNA (5 µg) was mechanically fragmented on a Hydroshear device (Digilab, Holliston, MA, USA) with an enrichment size of 3-4 kb. The DNA fragmentation was visualized through an Agilent 2100 BioAnalyzer on a DNA LabChip 7500, with an optimal size of 4.2 kb. The library was constructed according to the 454 Titanium paired-end protocol and manufacturer recommendations. Circularization and nebulization were performed and generated a pattern with a maximum at 686 bp. After PCR amplification through 15 cycles followed by double size selection, the single-stranded paired-end library was quantified on the Agilent 2100 BioAnalyzer with an RNA 6000 Pico chip at 1,820 pg/µL. The library concentration equivalence was calculated as 4.87E+09 molecules/µL.
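The text does not state how the concentration equivalence was derived. The sketch below shows one way to convert a mass concentration into molecules per microliter, under two assumptions that are not in the source: the 686 nt pattern maximum is taken as the mean fragment length, and an average mass of 328.3 g/mol per nucleotide is used for single-stranded DNA. With these assumptions the calculation gives a value consistent with the reported 4.87E+09 molecules/µL.

```python
AVOGADRO = 6.02214076e23  # molecules per mole
SSDNA_NT_MASS = 328.3     # g/mol per nucleotide (assumed average for ssDNA)


def library_molecules_per_ul(conc_pg_per_ul: float, mean_length_nt: float) -> float:
    """Convert a ssDNA library concentration (pg/µL) to molecules/µL."""
    grams_per_ul = conc_pg_per_ul * 1e-12            # pg -> g
    grams_per_mole = mean_length_nt * SSDNA_NT_MASS  # molar mass of one fragment
    return grams_per_ul / grams_per_mole * AVOGADRO


# Values from the text: 1,820 pg/µL, fragment size pattern maximum at 686 nt
molecules = library_molecules_per_ul(1820, 686)
print(f"{molecules:.2e} molecules/µL")
```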

Genome annotation
Non-coding genes and miscellaneous features were predicted using RNAmmer [32], ARAGORN [33], Rfam [34] and SignalP [35]. Open Reading Frames (ORFs) were predicted using Prodigal [36] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region. The functional annotation was achieved using BLASTP [37] against the GenBank database [23] and the Clusters of Orthologous Groups (COG) database.
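The exclusion of ORFs that span a sequencing gap can be illustrated with a simple interval filter. This is only a sketch of the idea and not the pipeline used here: the `Interval` type, the function names, and the example coordinates are hypothetical, and "spanning" is interpreted as any overlap between an ORF and a gap on the same contig.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    contig: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def spans_gap(orf: Interval, gaps: List[Interval]) -> bool:
    """Return True if the ORF overlaps any gap on the same contig."""
    return any(
        orf.contig == gap.contig and orf.start <= gap.end and gap.start <= orf.end
        for gap in gaps
    )


def drop_gap_spanning_orfs(orfs: List[Interval], gaps: List[Interval]) -> List[Interval]:
    """Keep only ORFs that do not overlap a sequencing gap."""
    return [orf for orf in orfs if not spans_gap(orf, gaps)]


# Hypothetical coordinates, for illustration only
gaps = [Interval("contig1", 500, 600)]
orfs = [
    Interval("contig1", 100, 400),  # kept: ends before the gap
    Interval("contig1", 450, 700),  # dropped: overlaps the gap
    Interval("contig2", 450, 700),  # kept: different contig
]
kept = drop_gap_spanning_orfs(orfs, gaps)
print(len(kept))  # 2
```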

Genome properties
The genome of Fenollaria massiliensis sp. nov. strain 9401234T is estimated to be 1.71 Mb long with a G+C content of 34.46% (Figure 6 and Table 4). A total of 1,667 protein-coding and 30 RNA genes, including 3 rRNA genes, 26 tRNA genes and 1 tmRNA gene, were found. The majority of the protein-coding genes (70.8%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Table 4 and Table 5.

Table 4 (excerpt). Genes assigned to COGs: 1,744 (98.44%). * The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.

Standards in Genomic Sciences

Figure 6. Graphical circular map of the genome. From outside to the center: scaffolds are in grey (unordered), genes on forward strand (colored by COG categories), genes on reverse strand (colored by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content (black/grey), and GC skew (purple/olive).

Table 5 (footnote). The total is based on the total number of protein-coding genes in the annotated genome.

Some COG categories contain significantly more genes, such as "RNA processing and modification" (+208.7%) or "Secondary metabolites biosynthesis, transport and catabolism" (+115.9%), whereas others contain fewer genes, such as "Nuclear structure" (-100%) or "Defense mechanisms" (-42.4%).

Conclusion
On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Fenollaria massiliensis gen. nov., sp. nov., which contains the strain 9401234T. This bacterium was found in Marseille, France. Gram-negative, catalase-negative, oxidase-negative and obligately anaerobic. Cells are non-spore-forming, non-motile rods, with a mean length of 1.555 µm and a mean width of 0.772 µm. Colonies are punctiform, very small, grey, smooth, and round on blood-enriched Columbia agar under anaerobic conditions. Optimal growth under anaerobic conditions at 37°C (range from 32°C to 37°C). Cells are positive for leucine arylamidase, valine arylamidase, arginine arylamidase and naphthol phosphatase. Cells are weakly positive for pyrrolidonyl arylamidase, tyrosine arylamidase, glycine arylamidase, histidine arylamidase and serine arylamidase. Susceptible to penicillin G, amoxicillin, cefotetan, imipenem, metronidazole and vancomycin. The potential pathogenicity of the type strain 9401234T is unknown. The type strain is 9401234T (= CSUR P127 = DSM 26367); it was isolated from an osteoarticular sample of a patient in Marseille (France). The G+C content of the genome is 34.46 mol%. A partial 16S rRNA gene sequence was deposited in GenBank with the accession number HM587321. The whole genome shotgun sequence of F. massiliensis strain 9401234T (= CSUR P127 = DSM 26367) has been deposited in GenBank under accession numbers CALI02000001-CALI02000010.