Non contiguous-finished genome sequence and description of Peptoniphilus timonensis sp. nov.

Peptoniphilus timonensis strain JC401T sp. nov. is the type strain of P. timonensis sp. nov., a new species within the Peptoniphilus genus. This strain, whose genome is described here, was isolated from the fecal flora of a healthy patient. P. timonensis is a Gram-positive, obligately anaerobic coccus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 1,758,598 bp long genome (1 chromosome, no plasmid) contains 1,922 protein-coding and 22 RNA genes, including 5 rRNA genes.


Introduction
Peptoniphilus timonensis strain JC401T (= CSUR P165 = DSM 25367) is the type strain of P. timonensis sp. nov. This bacterium is a Gram-positive, anaerobic, indole-positive coccus that was isolated from the stool of a healthy Senegalese patient as part of a "culturomics" study aiming at cultivating individually all species within human feces.
Since the early days of bacterial taxonomy, defining a bacterial species has been a matter of debate. Currently, the availability of a wide array of molecular methods, notably 16S rRNA and full genome sequencing, makes it possible to base the description of new species on methods other than the "gold standard" of DNA-DNA hybridization [1]. In particular, sequence similarity of the 16S rRNA, although neither uniform across taxa nor necessarily predictive, enabled the taxonomic classification or reclassification of many taxa [2], and genome sequencing has provided access to the complete genetic information of bacteria [3]. As a consequence, we based our description of P. timonensis sp. nov. on a polyphasic approach [4] including its genome sequence and main phenotypic characteristics (habitat, Gram-stain reaction, culture and metabolic characteristics, MALDI-TOF spectrum, and when applicable, pathogenicity).
Here we present a summary classification and a set of features for P. timonensis sp. nov. strain JC401T together with the description of the complete genomic sequencing and annotation. These characteristics support the creation of the P. timonensis species.

Organism information
A stool sample was collected from a healthy 16-year-old male Senegalese volunteer living in Dielmo (a rural village in the Guinean-Sudanian zone of Senegal), who was included in a research protocol. The patient gave signed informed consent, and the agreement of the National Ethics Committee of Senegal and of the local ethics committee of the IFR48 (Marseille, France) was obtained (agreements 09-022 and 11-017). The fecal specimen was preserved at -80°C after collection and sent to Marseille. Strain JC401T was isolated in June 2011 by cultivation on 5% sheep blood-enriched Brain Heart Infusion agar (Becton Dickinson, Heidelberg, Germany). This strain exhibited a 98% nucleotide sequence similarity with Peptoniphilus harei, the phylogenetically closest validated Peptoniphilus species (Figure 1; for classification, see Table 1). This value was lower than the 98.7% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new species without carrying out DNA-DNA hybridization [18].

Evidence codes: [...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence. These evidence codes are from the Gene Ontology project [17]. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.

Figure 1.
Phylogenetic tree highlighting the position of Peptoniphilus timonensis strain JC401T relative to other type strains within the Peptoniphilus genus. GenBank accession numbers are indicated in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained using the maximum-likelihood method within the MEGA software. Numbers at the nodes are percentages of bootstrap values obtained by repeating the analysis 500 times to generate a majority consensus tree. Anaerococcus prevotii was used as outgroup. The scale bar represents a 2% nucleotide sequence divergence.
Different growth temperatures (25, 30, 37, 45°C) were tested. Growth was not observed at 25°C and 45°C, but optimal growth occurred between 30 and 37°C. Colonies were 0.5 mm in diameter on blood-enriched BHI agar. Growth of the strain was tested under anaerobic and microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively (BioMérieux), and in aerobic conditions, with or without 5% CO2. No growth was obtained in aerobic conditions (with or without CO2). Growth was observed in anaerobic conditions. Gram staining showed Gram-positive cocci (Figure 2). A motility test was negative. Cells grown on agar are sporulated and have a mean diameter of 0.91 µm (Figure 3). Strain JC401T exhibited catalase activity but no oxidase activity. Using API Rapid ID 32A, positive reactions were obtained for α-galactosidase, arginine arylamidase, tyrosine arylamidase, histidine arylamidase, serine arylamidase and indole production. Weak reactions were observed for leucine arylamidase and phenylalanine arylamidase. P. timonensis is susceptible to penicillin G, imipenem, amoxicillin + clavulanic acid, vancomycin, clindamycin and metronidazole.

Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [19]. Briefly, a pipette tip was used to pick one isolated bacterial colony from a culture agar plate, and to spread it as a thin film on an MTP 384 MALDI-TOF target plate (Bruker Daltonics, Leipzig, Germany). Twelve distinct deposits were made for strain JC401T from twelve isolated colonies. Each smear was overlaid with 2 µL of matrix solution (saturated solution of alpha-cyano-4-hydroxycinnamic acid) in 50% acetonitrile, 2.5% trifluoroacetic acid, and allowed to dry for five minutes. Measurements were performed with a Microflex spectrometer (Bruker). Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of acquisition was between 30 seconds and 1 minute per spot. The twelve JC401T spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 3,769 bacteria, including 12 spectra from 8 Peptoniphilus species, which were used as reference data, in the BioTyper database. The method of identification included the m/z from 3,000 to 15,000 Da. For every spectrum, 100 peaks at most were taken into account and compared with spectra in the database. The resulting score determined whether identification was possible: a score > 2 with a validated species enabled identification at the species level; a score > 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain JC401T, the obtained score was 1.2, thus suggesting that our isolate was not a member of a known species. We added the spectrum from strain JC401T to our database (Figure 4). The spectrum was made available online in our free-access URMS database [20].
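The score interpretation described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the BioTyper software itself; the function name is hypothetical, and the handling of scores exactly equal to 2 or 1.7 (which the text leaves unspecified) is an assumption.

```python
def interpret_biotyper_score(score: float) -> str:
    """Map a MALDI-TOF matching score to the identification level named in the text."""
    # Assumption: boundary values (exactly 2 or exactly 1.7) are assigned to the
    # higher category; the text only defines the open intervals.
    if score >= 2.0:
        return "species"   # identification at the species level
    if score >= 1.7:
        return "genus"     # identification at the genus level
    return "none"          # no identification


# Score reported for strain JC401T
print(interpret_biotyper_score(1.2))  # prints "none"
```

With the reported score of 1.2, the sketch returns no identification, in line with the conclusion that the isolate did not match a known species in the database.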

Genome sequencing information
Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the Peptoniphilus genus, and is part of a "culturomics" study of the human digestive flora aiming at isolating all bacterial species within human feces. It was the seventh genome of a Peptoniphilus species and the first genome of Peptoniphilus timonensis sp. nov. The genome sequence is deposited in GenBank under accession number CAEL00000000 and consists of 97 large contigs. Table 2 shows the project information and its association with MIGS version 2.0 compliance.

Genome annotation
Open Reading Frames (ORFs) were predicted using Prodigal [21] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database [22] and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAscan-SE tool [23] was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [24] and BLASTn against the GenBank database. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment length greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have already been used in previous works to define ORFans.
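The length-dependent E-value cutoff described above can be written out as a small helper. This is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and assigning an alignment of exactly 80 amino acids to the stricter cutoff is an assumption, since the text does not cover that case.

```python
def evalue_cutoff(alignment_length_aa: int) -> float:
    """Return the BLASTP E-value cutoff that applies to an alignment of this length."""
    # Assumption: an alignment of exactly 80 aa uses the stricter 1e-05 cutoff.
    return 1e-3 if alignment_length_aa > 80 else 1e-5


def below_cutoff(evalue: float, alignment_length_aa: int) -> bool:
    """True when the E-value is lower than the cutoff for this alignment length."""
    return evalue < evalue_cutoff(alignment_length_aa)
```

For instance, an E-value of 1e-04 is below the cutoff for a 120-residue alignment but not for a 60-residue one.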
To estimate the mean level of nucleotide sequence similarity at the genome level between Peptoniphilus species, we compared ORFs only, using BLASTN with the following parameters: a query coverage of ≥ 70% and a minimum nucleotide length of 100 bp.
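As an illustration of how such a filter might be applied to BLASTN results, the sketch below keeps only hits meeting both criteria and averages their percent identity. The `Hit` record, its field names, and the example values are hypothetical; interpreting the 100 bp minimum as a minimum alignment length, and computing coverage as alignment length over query length, are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    query_id: str
    query_length: int        # length of the query ORF, in bp
    alignment_length: int    # aligned length, in bp
    percent_identity: float  # nucleotide identity over the alignment


MIN_QUERY_COVERAGE = 0.70  # query coverage of >= 70%, as stated in the text
MIN_LENGTH_BP = 100        # minimum nucleotide length of 100 bp, as stated in the text


def passes_filter(hit: Hit) -> bool:
    """Apply the coverage and length criteria to one hit."""
    coverage = hit.alignment_length / hit.query_length
    return coverage >= MIN_QUERY_COVERAGE and hit.alignment_length >= MIN_LENGTH_BP


def mean_identity(hits: List[Hit]) -> Optional[float]:
    """Mean percent identity over hits that pass the filter, or None if none pass."""
    kept = [h.percent_identity for h in hits if passes_filter(h)]
    return sum(kept) / len(kept) if kept else None


# Made-up example hits, for illustration only
example_hits = [
    Hit("orf1", 900, 800, 85.0),  # coverage 0.89, kept
    Hit("orf2", 900, 300, 99.0),  # coverage 0.33, rejected
    Hit("orf3", 120, 90, 95.0),   # coverage 0.75 but shorter than 100 bp, rejected
    Hit("orf4", 300, 240, 75.0),  # coverage 0.80, kept
]
print(mean_identity(example_hits))  # prints 80.0
```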

Genome properties
The genome is 1,758,598 bp long (1 chromosome, no plasmid) with a 30.70% GC content (Figure 5 and Table 3). Of the 1,944 predicted genes, 1,922 were protein-coding genes and 22 were RNAs. A total of 1,368 genes (70.37%) were assigned a putative function. A total of 186 genes were identified as ORFans (9.6%). The remaining genes were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4. The properties and the statistics of the genome are summarized in Tables 3 and 4.
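The percentages quoted above can be checked with simple arithmetic. The short sketch below uses only the counts given in the text; that both percentages are computed against the 1,944 total predicted genes (rather than the 1,922 protein-coding genes) is inferred from the fact that this denominator reproduces the reported values.

```python
protein_coding = 1922
rna_genes = 22
total_genes = protein_coding + rna_genes  # all predicted genes

with_function = 1368  # genes assigned a putative function
orfans = 186          # genes identified as ORFans


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


print(total_genes)                          # prints 1944
print(percent(with_function, total_genes))  # prints 70.37
print(percent(orfans, total_genes))         # prints 9.57 (reported as 9.6%)
```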

Comparison with the genomes from other Peptoniphilus species
Draft genome sequences are currently available for six species. Here we compared the genome sequence of P. timonensis strain JC401T with those of P. harei strain ACS-146-V-Sch2b, P. indolicus strain ATCC BAA-1640 and P. lacrimalis strain 315-B.
The draft genome of P. timonensis is larger than that of P. lacrimalis (1.76 Mb and 1.69 Mb, respectively) and smaller than those of P. indolicus and P. harei (2.2 Mb and 1.8 Mb, respectively). The G+C content of P. timonensis is comparable to that of P. lacrimalis (30.7% and 29.91%, respectively) but lower than those of P. indolicus and P. harei (32.29% and 34.44%, respectively). Additionally, P. timonensis has more predicted genes than P. harei and P. lacrimalis (1,922, 1,724 and 1,589, respectively) and fewer than P. indolicus (2,269). The number of genes assigned to COGs in P. timonensis is comparable to that in P. harei (1,368 and 1,381, respectively), greater than that in P. lacrimalis (1,192) and lower than that in P. indolicus (1,690). However, the distribution of genes into COG categories (Table 4) was similar across the four genomes.
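The four-way comparison above is easier to follow when the figures are collected in one structure and ranked. The sketch below simply restates the values given in the text; the dictionary layout, key names, and helper function are illustrative choices.

```python
# Values as quoted in the comparison paragraph
genomes = {
    "P. timonensis": {"size_mb": 1.76, "gc_percent": 30.70, "genes": 1922, "cog_genes": 1368},
    "P. harei":      {"size_mb": 1.80, "gc_percent": 34.44, "genes": 1724, "cog_genes": 1381},
    "P. indolicus":  {"size_mb": 2.20, "gc_percent": 32.29, "genes": 2269, "cog_genes": 1690},
    "P. lacrimalis": {"size_mb": 1.69, "gc_percent": 29.91, "genes": 1589, "cog_genes": 1192},
}


def ranked(key: str) -> list:
    """Species names ordered from lowest to highest value of the given property."""
    return sorted(genomes, key=lambda name: genomes[name][key])


for key in ("size_mb", "gc_percent", "genes", "cog_genes"):
    print(key, ranked(key))
```

P. lacrimalis ranks lowest on all four properties, and P. timonensis sits second for genome size, G+C content and COG-assigned genes, and third for predicted gene count.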

Conclusion
On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Peptoniphilus timonensis sp. nov., which contains strain JC401T. This strain was found in Senegal.

a The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.
The total is based on the total number of protein-coding genes in the annotated genome.