Genome Sequence of Weissella ceti NC36, an Emerging Pathogen of Farmed Rainbow Trout in the United States

Novel Weissella sp. bacteria have recently been reported to be associated with disease outbreaks in cultured rainbow trout (Oncorhynchus mykiss) at commercial farms in China, Brazil, and the United States. Here we present the first genome sequence of this novel Weissella species, isolated from the southeastern United States.

Members of the genus Weissella are Gram-positive, catalase-negative, non-spore-forming, nonmotile, heterofermentative bacteria with irregular or coccoid rod morphologies (1). Weissella species are common in diverse nutrient-rich environments, including fermented foods, soil, and the intestines of many animals, including humans (2). Weissella confusa strains have infrequently been reported to cause infections in both humans (3–5) and nonhuman primates (6); however, members of the genus are not typically associated with disease. Novel Weissella sp. bacteria have recently been associated with disease outbreaks in rainbow trout (Oncorhynchus mykiss) in China (7), Brazil (8), and the United States (T. J. Welch and C. M. Good, submitted for publication). Each of these outbreaks occurred at commercial rainbow trout farms and caused high levels of morbidity and mortality. The origin of the bacteria associated with these outbreaks is unknown, but 16S rRNA sequences from the Brazilian, Chinese, and U.S. isolates are >99% identical, suggesting a high level of genetic similarity among strains (Welch and Good, submitted). The trout isolates also show >99% 16S sequence similarity to W. ceti sp. nov., which was recently isolated from beaked whales (9); therefore, the whale and fish isolates may constitute a single species. The occurrence of this pathogen on three continents over a relatively short period (5 years) suggests that weissellosis is a rapidly emerging disease of farmed rainbow trout. Comparison of the genome sequences of the U.S., Brazilian, and Chinese strains will be necessary for understanding the evolutionary relationships among the strains and may additionally provide insight into the recent emergence of this pathogen. As a basis for these comparisons, and to identify putative virulence genes, we sequenced the genome of Weissella ceti NC36, a representative strain from the U.S. outbreak.
Genomic DNA was purified by using the MasterPure Gram-positive DNA purification kit (Epicentre) according to the supplied protocol. The genome was assembled by using a combination of sequences from Illumina (MiSeq, paired-end 150-bp reads; 583× coverage) and Pacific Biosciences (PacBio RS 10-kb library of continuous long reads [CLR]; 198× coverage). An initial assembly was conducted using ABySS (10) with only the Illumina sequences, resulting in 23 high-quality contigs of ≥500 bp. The PacBio CLR sequences were then used to join these contigs using AHA (Pacific Biosciences). Finally, small assembly errors were corrected through an iterative process of mapping the Illumina reads onto the final contigs and then creating a new consensus using Bowtie2, SAMtools, and custom scripts (11, 12). The final assembly consisted of seven contigs (N50, 385,673 bp; maximum length, 518,056 bp), and the genome was estimated to be approximately 1.35 Mb, with a G+C content of 40.8%. Automated annotation by NCBI's Prokaryotic Genomes Automatic Annotation Pipeline (PGAAP) revealed 16 rRNA genes, 68 tRNA genes, and 1,264 protein-coding sequences (CDS).
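For readers less familiar with the assembly statistics quoted above, the following is a minimal Python sketch of how an N50 value and a G+C percentage are conventionally computed. It is not the pipeline used for this assembly; the function names `n50` and `gc_percent` and the contig lengths are hypothetical and chosen for illustration only, because the individual contig lengths of the NC36 assembly are not listed here.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length of the contig at which the running total, taken over
    contigs sorted from longest to shortest, first reaches at least half of
    the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("at least one contig length is required")


def gc_percent(sequence):
    """Return the G+C content of a sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted in the denominator, so
    ambiguity codes such as N do not dilute the estimate.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy values for illustration only (not data from the NC36 assembly).
toy_lengths = [500, 400, 300, 200, 100]
print(n50(toy_lengths))          # total is 1,500; 500 + 400 = 900 >= 750, so N50 is 400
print(gc_percent("GGCCAATTNN"))  # 4 G/C out of 8 counted bases = 50.0
```

Applied to the seven contig lengths of a real assembly, `n50` would return the kind of value reported above; `gc_percent` would be applied to the concatenated contig sequences.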
Comparative analysis highlighted several putative virulence factors that do not have homologs encoded in any of the other sequenced Weissella genomes. These include five collagen adhesins (WCNC_00912, WCNC_00917, WCNC_00922, WCNC_05547, and WCNC_06207), a platelet-associated adhesin (WCNC_01820), and a mucus-binding protein (WCNC_01840).
Nucleotide sequence accession numbers. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number ANCA00000000. The version described in this paper is the first version, ANCA01000000.

ACKNOWLEDGMENTS
We thank G. Koroleva and M. Gestole for preparing the Illumina and PacBio sequencing libraries.
This work was funded by the Defense Threat Reduction Agency Project no. 1881290.
Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the U.S. Army. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. The USDA is an equal opportunity provider and employer.