Complete genome sequence and comparative genomics of the golden pompano (Trachinotus ovatus) pathogen, Vibrio harveyi strain QT520

Vibrio harveyi is a Gram-negative, halophilic bacterium that is an opportunistic pathogen of commercially farmed marine vertebrate species. To understand the pathogenicity of this species, the genome of V. harveyi QT520 was analyzed and compared to that of other strains. The results showed the genome of QT520 has two unique circular chromosomes and three endogenous plasmids, totaling 6,070,846 bp with a 45% GC content, 5,701 predicted ORFs, 134 tRNAs and 37 rRNAs. Common virulence factors, including ACF, IlpA, OmpU, Flagellin, Cya, Hemolysin and MARTX, were detected in the genome, which are likely responsible for the virulence of QT520. The results of genomes comparisons with strains ATCC 33843 (392 (MAV)) and ATCC 43516 showed that greater numbers genes associated with types I, II, III, IV and VI secretion systems were detected in QT520 than in other strains, suggesting that QT520 is a highly virulent strain. In addition, three plasmids were only observed in the complete genome sequence of strain QT520. In plasmid p1 of QT520, specific virulence factors (cyaB, hlyB and rtxA) were identified, suggesting that the pathogenicity of this strain is plasmid-associated. Phylogenetic analysis of 12 complete Vibrio sp. genomes using ANI values, core genes and MLST revealed that QT520 was most closely related to ATCC 33843 (392 (MAV)) and ATCC 43516, suggesting that QT520 belongs to the species V. harveyi. This report is the first to describe the complete genome sequence of a V. harveyi strain isolated from an outbreak in a fish species in China. In addition, to the best of our knowledge, this report is the first to compare the V. harveyi genomes of several strains. The results of this study will expand our understanding of the genome, genetic characteristics, and virulence factors of V. harveyi, setting the stage for studies of pathogenesis, diagnostics, and disease prevention.

Many studies have reported on the pathogenic mechanisms of V. harveyi, and variability in virulence mechanisms were found to be strain dependent toward different host species (Austin & Zhang, 2006;Bai et al., 2008;Zhang & Austin, 2000). Some reports have shown that extracellular products (ECPs) including proteases, phospholipases, hemolysins and cytotoxins, play significant roles in the pathogenicity of V. harveyi (Liu & Lee, 1999;Liuxy, Lee & Chen, 1996;Zhang & Austin, 2000). However, in fatal cases in Artemia franciscana nauplii and some cultured marine fishes, the hemolytic activity of V. harveyi was not a significant factor (Soto-Rodriguez et al., 2003;Won & Park, 2008). Siderophore production has been strongly correlated with fish mortality, and it is considered to be an essential virulence factor in some bacterial species (Ratledge & Dover, 2003), but the effect is less obvious in others (Pedersen et al., 1997). In recent years, Type III secretion systems (TTSS), bacteriophages, and quorum-sensing mechanisms, which play important roles in effector translocation, conversion of virulence factors, and regulation of virulence gene expression, respectively, have all been shown to be associated with virulence (Henke & Bassler, 2004;Natrah et al., 2011;Ruwandeepika et al., 2012). To further understand the pathogenesis and the associated virulence factors of V. harveyi, the determination of the genomic information of various V. harveyi strains is crucial.
In this study, we isolated a V. harveyi from a diseased golden pompano (Trachinotus ovatus) cultured in a deep sea cage in Qiaotou, China and named it QT520. Genome and comparative genome analyses were used in the current study to analyze the pathogenesis of V. harveyi. The results will help to characterize pathogenesis, diagnostics, and disease prevention.

Ethics statement
All protocols for experiments involving live animals conducted in this study were approved by the Animal Experimental Inspection of Laboratory Animal Centre, Hainan University (Haikou, China), and the approval number is 20160709. We conscientiously abided by the ethical principles of animal welfare and the rule of laboratory animal center.

Bacterial strain isolation
V. harveyi QT520 was initially isolated in Qiaotou, China from T. ovatus showing clinical signs of body rot. To obtain a pure culture, a single colony of strain QT520 was selected and transferred to fresh TCBS medium at least three times. Then, the clone was cultivated at 30 • C in 2216E liquid medium for 16 h. The cell morphology of QT520 and contamination of the culture with other microorganisms were assessed by light microscopy (BA410; Motic, Xiamen, China). The strain was archived at Hainan University and Hainan Academy of Ocean and Fisheries Sciences under accession number QT520.

Genomic DNA extraction and whole genome sequencing
Genomic DNA was extracted using a Rapid Bacterial Genomic DNA Isolation Kit (Sangon Biotech, Shanghai, China) according to the manufacturer's instructions. The concentration and purity of the extracted DNA were determined using a DNA Qubit 2.0 (Invitrogen, Carlsbad, CA, USA). The extracted genomic DNA was used to construct next-generation libraries with an insert size of 500 bp using a NEBNext R Ultra TM DNA Library Prep Kit for Illumina R and PacBio RS II libraries with an insert size of 10 kb.
Genome sequencing was conducted by Sangon Biotech Co., Ltd. (Shanghai, China) using the next-generation Illumina MiSeq (Illumina, San Diego, CA, USA) and the third-generation PacBio RS II sequencing platform (Pacific Biosciences, Menlo Park, CA, USA), respectively. The raw sequences obtained from the third-generation PacBio RS II sequencing platform were assembled using Canu (Koren et al., 2016). Gapcloser and GapFiller were used to close the gaps with next-generation sequence reads where possible after assembly (Boetzer & Pirovano, 2012) and the sequence variants were detected and assembled by PrInSeS-G to obtain high-quality, whole-genome sequences (Massouras et al., 2010).

Genome analysis
Prokka was used to predict and annotate prokaryotic ORFs, tRNAs and rRNAs (Seemann, 2014). Clusters of Orthologous Groups (COG) analysis was performed using RPS BLAST. Antibiotic resistance genes and genes encoding virulence factors were identified through BLAST searches of the Comprehensive Antibiotic Resistance Database (CARD) and the Virulence Factors of Pathogenic Bacteria Database (VFDB), respectively. Circular genome maps were generated using the CGView Server (Grant & Stothard, 2008) based on the information generated by the genome annotation.

Phylogenetic analyses
Four phylogenetic trees were constructed based on average nucleotide identity (ANI), core genes, multilocus sequence typing (MLST), and the 16S rRNA gene, respectively. The ANI values were calculated using Jspecies (Goris et al., 2007) and phylogenetic tree analysis of 12 available complete genome sequences of Vibrio sp. strains (Table S1) based on ANI values was performed using the R package Ape with neighbor-joining methods (Paradis, Claude & Strimmer, 2004). The core genes of the 12 genomes were obtained using cd-hit 4.6.1, the corresponding protein sequences were aligned using MUSCLE 3.8.31 and the phylogenetic trees were generated using Treebest 1.9.2 (Caputo et al., 2015;Mbengue et al., 2016). Four different housekeeping genes were used to construct the MLST phylogenetic tree: toxR, vhhA, ompK and hsp60. The 16S rRNA gene sequences and the corresponding amino acid sequences inferred from the four genes for MLST were respectively aligned with those of the most closely related species using the multiple alignment program ClustalW. Phylogenetic relationships between strain QT520 and closely related species were determined using the MEGA6 software. Phylogenetic trees were generated using the neighbor-joining (NJ) method with 1,000 randomly selected bootstrap replicates.

Comparative genomics
Genomes of ATCC 33843 (392 (MAV)), ATCC 43516, and QT520 were compared through Mauve using the default parameters (Darling et al., 2004;Mbengue et al., 2016). The locations of different genes in the gene clusters were visualized using SVG 1.1, and ImageMagick 6.5.4-7 was used to convert SVG format to PNG.

Determination of Median Lethal Dose (LD50)
The isolated QT520 strain was inoculated into 2216E liquid medium and incubated on a shaker (180 rpm) at 30 • C overnight, after which a culture containing 1.47 × 10 8 CFU/mL obtained. During challenge, the bacterial suspension was serially diluted 10-fold using Stroke-Physiological Saline Solution and was used to infect T. ovatus (10 fish/group) by intraperitoneal injection with 1.47×10 8 CFU/mL, 1.47×10 7 CFU/mL, 1.47×10 6 CFU/mL or 1.47 × 10 5 CFU/mL. The control group was injected with Stroke-Physiological Saline Solution. Fish deaths were recorded for 2 weeks and the Reed-Muench method was used to calculate the median lethal dosage.

General features of V. harveyi QT520
Through the use of a combination of sequencing approaches using the MiSeq and PacBio platforms, we were able to assemble five contigs with 227.76× coverage. The genome sequence of strain QT520 was deposited in the DDBJ/EMBL/GenBank under the accession numbers CP018680-CP018684. The project information, according to the minimum information about a genome sequence (MIGS) recommendation (Field et al., 2008), is shown in Table S2. The strain QT520, which was isolated from cultured T. ovatus exhibiting clinical signs of body rot, was Gram-negative and aerobic. After negative staining it was observed to be an oval, rod-shaped cell approximately 1.6-2.2 µm in length and 1.0-1.1 µm in width with a single polar flagellum (Fig. 1).

Phylogenetic analysis
Four comparative phylogenetic tree analysis methods were used to identify closely related strains of QT520: ANI, core genes, MLST, and 16S rRNA gene sequencing. For phylogenetic tree analysis using ANI, 11 complete genome sequences available in the NCBI GenBank database, as well as that of QT520, were obtained and the ANI values were calculated (Table S4). A phylogenetic tree based on ANI values was constructed (Fig. 4A). The tree identified a subgroup that contained 1114GL, LMB29, ATCC BAA-1116, ATCC 33843 (392 (MAV)), ATCC 43516 and QT520, which was designated Group I. Within this group, the genome of QT520 was most closely related to those of ATCC 33843 (392 (MAV)) and ATCC 43516 (98.49 and 98.59% of ANI values, respectively). An additional phylogenetic tree of the 12 complete genome sequences described above was constructed using the core genes method (Fig. 4B). The results showed that ATCC 33843 (392 (MAV)) and ATCC 43516 were the closest in Group I and that QT520 was closely related to both strains, supporting that the ANI and core genes methods could distinguish them even at the strain level.
Another phylogenetic tree was generated using MLST with four different housekeeping genes: toxR, vhhA, ompK and hsp60. The DNA sequences of the four MLST genes in the Group I strains were compared and aligned to generate a phylogenetic tree (Fig. 4C). The results showed that the genomes of QT520, ATCC 33843 (392 (MAV)) and ATCC 43516 were again the closest relatives and those of 1114GL, LMB29 and ATCCBAA-1116 were the closest in Group I. However, comparative phylogenetic tree analysis using 16S rRNA sequences of the Group I strains revealed that strains QT520, 1114GL, ATCC 33843 392 MAV and ATCC 43516 were the closest relatives, being classified into one phylogenetic branch (Fig. 4D).

Comparative genomics of V. harveyi ATCC 3843 (392 (MAV)) and ATCC 43516
Comparative genomic analyses were conducted to compare V. harveyi QT520 with its most closely related strains, V. harveyi ATCC 3843 (392 (MAV)) and ATCC 43516 (Fig. 5A). The results showed that the number of core genes among the three strains is 4,463. The number of specific genes in QT520, ATCC 3843, and ATCC 43516 are 458, 277 and 448, respectively.
Three plasmids were detected in V. harveyi strain QT520 that were not observed in strains ATCC 3843 (392 (MAV)) and ATCC 43516. In particular, three important virulence factor genes (cyaB, hlyB and rtxA) were present in plasmid p1.

LD 50
Artificial infection results showed that the LD 50 of QT520 was 2.5 × 10 5 bacteria per fish, suggesting that this bacterium is strongly pathogenicity to T. ovatus (Table 3).

DISCUSSION
Since species within the Harveyi clade have a high degree of genetic and phenotypic similarity (Sawabe, Kitatsukamoto & Thompson, 2007), identification of V. harveyi strains is a challenging task. For example, the species V. harveyi, V. campbellii and V. rotiferianus share approximately 99% sequence identity within the 16S rRNA gene (Gomez-Gil, 2003). These similarities have confounded typing schemes and resulted in documented misidentifications (Gauger & Gómez-Chiarri, 2002;Gomez-Gil et al., 2004). Recently, novel phylogenetic analysis approaches have been suggested, specifically, ANI, core genes and MLST. The ANI and core genes method was performed using the 12 complete genome sequences available for Vibrio sp. In this study, the analysis based on core genes and MLST revealed that QT520, ATCC 3843 (392 (MAV)) and ATCC 43516 were the most closely related and that 1114GL, LMB29 and ATCCBAA-1116 were the closest in Group I, which was consistent with the analysis using ANI values. However, these strains could not be differentiated using the 16S rRNA phylogenetic tree, suggesting that ANI, core genes and MLST methods can provide a greater level of resolution between V. harveyi and V. campbellii compared to the 16S rRNA method. The pathogenicity of V. harveyi strains is related to a number of factors, including secretion of ECPs (containing substances such as proteases, hemolysins, and lipases) (Teo, Zhang & Poh, 2003;Zhang & Austin, 2000), a bacteriocin-like substance (Prasad et al., 2005), quorum sensing capabilities (Henke & Bassler, 2004), susceptibility to bacteriophage infection (Oakey & Owens, 2000) and siderophore production (Owens, Austin & Austin, 1996). The observed LD 50 of QT520 was 2.5 × 10 5 bacteria per fish, suggesting that this bacterium is a highly virulent strain towards T. ovatus. Various virulence factor genes, including ACF, IlpA, MAM7, OmpU, Type IV pili, Flagellin, Cya, Hemolysin and MARTX, were observed in this strain. These genes may be responsible for the high virulence associated with the QT520 strain.
The ability of pathogens to obtain iron from their host is central to their survival (Ratledge & Dover, 2003). The genome of the QT520 strain encodes the aerobactin siderophore receptor (detected at the APP07707.1 region on chromosome II), heme receptors (at the region APP04364.1 on chromosome I and the region APP07339.1 on chromosome II), and iron ABC transport (APP03856.1-APP03858.1 and APP03907.1 on chromosome I). These iron uptake related-genes may play key roles in the survival of the QT520 strain in host cells.
Secretion systems can transport various virulence factors outside of the bacterial cell and allow bacteria to communicate within the environment in which they live. Greater numbers of genes encoding T1SS, T2SS, T3SS and T4SS and T6SS components were found in strain QT520 than in strains ATCC 33843 (392 (MAV)) and ATCC 43516. Interestingly, the T1SS and T6SS were only found in QT520. The T1SS is employed for secreting proteins, includes many adhesins, proteases, and toxins that are delivered into host cells, and can secrete hemolysin in the E. coli (Dalbey & Kuhn, 2012). The T2SS is required for secretion of exotoxins, including the cholera toxin (Ctx) (Abendroth, Kreger & Hol, 2009). The T3SS, which is found in various pathogenic Gram-negative bacterial genera, such as Salmonella, Shigella, Yersinia, Pseudomonas, and enteropathogenic Escherichia coli (EPEC) (Galan & Wolf-Watz, 2006), serve several well defined functions in pathogenesis. The T4SS is responsible for the transport of virulent proteins or DNA into eukaryotic cells as well as for the conjugative transfer of plasmids from one bacterium to another (Fronzes, Christie & Waksman, 2009) and were found in the insertion regions, suggesting that the T4SS may play important roles in the pathogenicity of QT520. The T6SS, which is associated with cytotoxic effectors (Costa et al., 2015;Unterweger et al., 2014), translocates toxic effector proteins into eukaryotic and prokaryotic cells and has a pivotal role in pathogenesis and bacterial competition (Ho, Dong & Mekalanos, 2014;Zoued et al., 2014), and this system was also reported in the pathogenic Vibrio harveyi strains ZJ0603 and CAIM 1792 (Huang et al., 2012;Espinoza-Valles et al., 2012). This secretion system functions as a group of toxin proteins by transporting various bacterial effectors into eukaryotic cells, resulting in host cell death (Costa et al., 2015), and its presence suggests that QT520 is a highly virulent strain based on our genome analysis. However, further analysis of the mechanisms of these secretion systems in V. harveyi is required.
Interestingly, we observed three plasmids in the complete genome sequence of strain QT520, which have not been identified in strains ATCC 33843 (392 (MAV)) and ATCC 43516 (Wang et al., 2014). Through comparative genomic analysis, three specific virulence factors (Cya, Hemolysin and RTX toxin), encoded by cyaB, hlyB and rtxA, were observed in plasmid p1 of strain QT520, suggesting that the pathogenicity of this strain is closely related to plasmid p1. Studies have also reported that a plasmid, pVCR1, was harbored by the highly virulent V. harveyi ORM4 strain, suggesting its involvement in the virulence phenotype (Schikorski et al., 2013). The virulence of V. harveyi was believed to be acquired by association with genetically mobile elements, such as plasmids or transposons (Austin & Zhang, 2006). Non-virulent V. harveyi strains can become virulent after plasmid uptake or bacteriophage-mediated transfer of toxin gene(s) (Oakey & Owens, 2000). Thus, we inferred that the pathogenicity of QT520 has been acquired by the incorporation of plasmids.
Through comparative analyses of the complete genomes of the three V. harveyi strains, a majority of the genes analyzed were observed to be core genes, including the antibiotic resistance genes reported previously in the complete genome sequences of V. harveyi ATCC 3843 (392 (MAV)) and ATCC 43516. This suggests that the three V. harveyi strains likely possess similar antibiotic resistances to tigecycline, tetracycline, streptomycin, kanamycin, ciprofloxacin and norfloxacin. In addition, QT520 exhibited resistance against tetracycline, streptomycin and kanamycin.

CONCLUSIONS
V. harveyi, an opportunistic pathogen of many maricultured animals, can cause mass mortalities in aquaculture species, posing a considerable threat to the industry. V. harveyi QT520, which was isolated from diseased deep sea cage-cultured golden pompano, was observed to be a virulent isolate with an LD50 of 2.5 ×10 5 bacteria per fish. To understand its pathogenesis, the genome of V. harveyi QT520 was sequenced. This genome was observed to consist of two circular chromosomes and three plasmids, totaling 6,070,846 bp with a 45% GC content, as well as containing 5,701 predicted ORFs, 134 tRNAs and 37 rRNAs.
Phylogenetic analysis of 12 complete genomes of Vibrio sp. using ANI values and core genes revealed that QT520 was most closely related to ATCC 33843 (392 (MAV)) and ATCC 43516, indicating that QT520 belongs to the species V. harveyi.
Common virulence factors, including ACF, IlpA, OmpU, Flagellin, Cya, Hemolysin and MARTX, were detected. These factors may be responsible for the virulence associated with colonization by QT520. Additionally, greater numbers of genes encoding types I, II, III, IV and VI secretion systems were detected in the genome of QT520 than in strains ATCC 33843 (392 (MAV)) and ATCC 43516, suggesting that strain QT520 has the capacity for a highly virulent phenotype. It is worth mentioning that three specific virulence factor genes (cyaB, hlyB and rtxA) were contained in plasmid p1 in this strain, suggesting that the pathogenicity of this strain is plasmid associated. Comparative genome analysis of QT520, ATCC 33843 (392 (MAV)) and ATCC 43516 revealed that the majority of ORFs were core genes, including two antibiotic encoding genes, suggesting these strains are resistant to multiple antibiotics.

ADDITIONAL INFORMATION AND DECLARATIONS
Funding