Comparative Genomics Reveals Pathogenicity-Related Loci in Shewanella algae

Shewanella algae is an emerging marine zoonotic pathogen that accounts for considerable mortality and morbidity in compromised hosts. However, the literature on the genetic background of virulence determinants in S. algae is scarce. In this study, we aimed to determine the occurrence of common virulence genes in S. algae using whole-genome sequencing and comparative genomic analysis. Comparative genomics revealed putative virulence genes related to bile resistance, chemotaxis, hemolysis, and motility. We detected hlyA, hlyD, and hlyIII, which are involved in hemolysis, as well as the chemotaxis gene cluster cheYZA operon and the cheW gene. The results provide insights into the genetic basis underlying pathogenicity in S. algae.


Introduction
Shewanella algae is an emerging marine zoonotic pathogen. The organism was first classified in 1990 by Simidu et al. [1], emended by Nozue et al. [2], and described as a Gram-negative, motile bacillus, with hydrogen sulfide production, exhibiting hemolysis on sheep blood agar. S. algae is found in marine environments throughout the world and has been linked with both human and marine animal infections [3,4]. At least three other Shewanella species have been found in clinical specimens, but S. algae accounts for the majority of isolates from humans [5,6]. S. algae has also been reported to cause disease in marine animals, both wild and cultured [7][8][9]. However, the literature on the genetic background of virulence determinants in S. algae is scarce.
The marine ecosystem consists of a large variety of organisms that impact human health [10]. Advances in sequencing technology allow the identification of determinants in pathogenic microorganisms, and sequencing has become an important approach for studying the fundamental mechanisms of pathogenesis [11,12]. Comparative genomics further enables the investigation of core elements of pathogenesis factors in great detail [13]. Recently, there have been attempts to use whole-genome sequencing in the study of marine pathogens [14]. Therefore, genomic comparison of clinical S. algae isolates could provide clues to pathogenic or fitness determinants [15]. The aims of the study were to determine the occurrence of common virulence genes in S. algae isolates from clinical settings using whole-genome sequencing and comparative genomic analysis and to explore the relationship among the tested genomes.
was grown in trypticase soy agar with 5% sheep blood (Becton, Dickinson and Company, Franklin Lakes, NJ, USA) at 30°C for 24 hours. Single colonies were inoculated in tryptic soy broth (Becton, Dickinson and Company, Franklin Lakes, NJ, USA). The isolates were preliminarily identified using 16S rRNA gene sequencing and matrix-assisted laser desorption ionization-time of flight mass spectrometry (bioMérieux, Marcy l'Etoile, France). Part of the 16S rRNA gene was amplified using the primers B27F (5′-AGAGTTTGATCCTGGCTCAG-3′) and U1492R (5′-GGTTACCTTGTTACGACTT-3′) [9,16]. The nucleotide sequences were aligned, and a BLAST search was performed against the GenBank database of the National Center for Biotechnology Information (NCBI) [17].

Phylogenetic Analysis Based on Whole-Genome Sequences.
Genome-based phylogenetic analysis was performed using pairwise comparison of average nucleotide identity. The whole-genome average nucleotide identity (ANI) was calculated using a modified algorithm [18]. Phylogenetic trees were visualized using MEGA7.
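As a minimal sketch of how pairwise ANI values can feed a tree-building step, the Python snippet below converts a table of pairwise ANI percentages into a symmetric distance matrix. The distance definition (100 minus ANI), the function name, and the ANI values are assumptions made for illustration only; they are not results of this study, and the modified ANI algorithm [18] itself is not reproduced here.

```python
from itertools import combinations


def ani_to_distance(ani_percent):
    """Convert pairwise ANI values (percent) into a symmetric distance matrix.

    ani_percent maps frozenset({strain_a, strain_b}) -> ANI in percent.
    Distance is defined here as 100 - ANI (an illustrative assumption).
    """
    strains = sorted({s for pair in ani_percent for s in pair})
    # Start with zeros so the diagonal (self-comparison) is already correct.
    dist = {a: {b: 0.0 for b in strains} for a in strains}
    for a, b in combinations(strains, 2):
        d = 100.0 - ani_percent[frozenset((a, b))]
        dist[a][b] = d
        dist[b][a] = d
    return strains, dist


# Hypothetical ANI values for illustration only (not data from this study).
example_ani = {
    frozenset(("YHL", "CHL")): 98.5,
    frozenset(("YHL", "ACCC")): 98.0,
    frozenset(("CHL", "ACCC")): 98.25,
}

strains, dist = ani_to_distance(example_ani)
for a in strains:
    print(a, [round(dist[a][b], 2) for b in strains])
```

A matrix of this form could then be exported to a distance-based tree builder; the export format would depend on the tool used.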

Annotation and Comparative Genomics.
The annotation was performed using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) [19] and the DOE-JGI Microbial Genome Annotation Pipeline version 4.10.5 [20]. Gene prediction was done using Glimmer 3.02 [21]. The nontranslated genes were predicted by tRNAscan-SE [22], RNAmmer [23], and RFAM [24]. Functional classification of the predicted genes was carried out using the RPSBLAST program v. 2.2.15 [25]. Analysis of the functional annotation was further performed using the Integrated Microbial Genomes & Microbiomes system v.5.0 [26] and the Pathosystems Resource Integration Center [27]. The CDS count for each strain was derived. Comparative genome analysis was performed using the EDGAR platform (http://edgar.computational.bio) [28]. The core genome and the singletons for the 4 related S. algae genomes were generated from Prokka-annotated genomes using EDGAR. We compared the S. algae genomes using the MUMmer software package [29] together with the Circos visualization engine [30].
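The notions of core genome and singletons used above can be illustrated with plain set operations. The sketch below assumes that ortholog groups have already been assigned to each genome (the step that a platform such as EDGAR performs); the function name and the ortholog-group identifiers are hypothetical, and this is not a reimplementation of EDGAR.

```python
def core_and_singletons(gene_sets):
    """Return (core, singletons) from a dict: strain -> set of ortholog-group IDs.

    core:       groups present in every strain.
    singletons: per strain, groups found in that strain and in no other.
    Assumes at least one strain is supplied.
    """
    strains = list(gene_sets)
    core = set.intersection(*(gene_sets[s] for s in strains))
    singletons = {}
    for s in strains:
        # Union of all groups seen in the other strains.
        others = set().union(*(gene_sets[t] for t in strains if t != s))
        singletons[s] = gene_sets[s] - others
    return core, singletons


# Hypothetical ortholog-group IDs for illustration only.
example = {
    "YHL": {"og1", "og2", "og3", "og4"},
    "CHL": {"og1", "og2", "og5"},
    "ACCC": {"og1", "og2", "og3", "og6"},
}

core, singles = core_and_singletons(example)
print(sorted(core))
print({s: sorted(g) for s, g in singles.items()})
```

Note that a group shared by some but not all strains (here "og3") falls into neither category; it belongs to the accessory genome.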

Genome Sequencing and Assembly.
The genomic sequencing consisted of 250 bp paired-end reads, yielding approximately 0.88 Gbp to 1.24 Gbp for each isolate. The de novo assembly of the genome sequence data revealed that the number of contigs (>200 bp) varied from 27 to 74 for each genome. The maximum contig size among the genomes was 976,090 bp, in YHL. The GC content ranged from 52.96% for CHL to 53.08% for ACCC.
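The assembly metrics reported above (number of contigs longer than 200 bp, maximum contig size, and GC content) can be computed from contig sequences as in the following minimal sketch. The function name and the toy sequences are assumptions for illustration; the assembler and reporting tool actually used are not specified in this section.

```python
def assembly_stats(contigs, min_len=200):
    """Summarize an assembly given a list of contig sequences (strings).

    Only contigs strictly longer than min_len are counted, mirroring the
    ">200 bp" threshold mentioned in the text.
    """
    kept = [c.upper() for c in contigs if len(c) > min_len]
    total = sum(len(c) for c in kept)
    gc = sum(c.count("G") + c.count("C") for c in kept)
    return {
        "n_contigs": len(kept),
        "max_contig": max((len(c) for c in kept), default=0),
        "gc_percent": 100.0 * gc / total if total else 0.0,
    }


# Toy sequences for illustration only (not data from this study).
toy_contigs = [
    "GC" * 150,   # 300 bp, all G/C
    "AT" * 150,   # 300 bp, no G/C
    "ACGT" * 10,  # 40 bp, below the length threshold and therefore dropped
]

stats = assembly_stats(toy_contigs)
print(stats)
```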

Genome-Based Phylogenetic Analysis.
The average nucleotide identity (ANI) was calculated and revealed that the tested S. algae strains were identical in terms of nucleotide sequences, as shown in Figure 1.

Comparative Genomics.
We constructed a pan-genome dataset using the whole-genome sequences of the sequenced S. algae strains. Figure 2 shows orthologous genes shared among strains and depicts the position and color-coded function of the S. algae genes. The numbers of orthologous and strain-specific unique genes are shown in the Venn diagram. The core genome of the S. algae strains consists of 1354 coding sequences (Figure 3). The set of unique genes harbored by each strain varies from 335 for S. algae YHL to 466 for S. algae CHL. Following genome map construction, we conducted genome mapping among the S. algae strains in the study. In this comparison, colored arcs indicate regions of high similarity as revealed by the NUCmer script from the MUMmer software package. As shown in Figure 4, the alignment revealed an obvious syntenic relationship among these strains.

Analysis of Putative-Virulence-Related Genes.
As illustrated in Table 2, the genes exbBD, galU, and htpB are shared among the S. algae genomes. The heat shock protein gene clpP and the hemolysis-related homologous genes hlyA, hlyD, hlyIII, and tolC were found in each S. algae genome. The chemotaxis gene cluster cheYZA operon and cheW were detected in all tested S. algae genomes. Flagellar gene operons are present in all tested S. algae genomes.
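A presence/absence summary of the kind reported in Table 2 can be sketched as follows. The snippet assumes that each genome's annotation has already been reduced to a set of gene names; the function name, the genome labels, and the example annotations are hypothetical and do not reproduce the study's data.

```python
def presence_table(query_genes, annotations):
    """Build a gene-by-genome presence table and list genes found in all genomes.

    query_genes: list of gene names to look for.
    annotations: dict mapping genome label -> set of annotated gene names.
    """
    table = {
        gene: {genome: gene in genes for genome, genes in annotations.items()}
        for gene in query_genes
    }
    shared = [gene for gene in query_genes if all(table[gene].values())]
    return table, shared


# Hypothetical annotations for illustration only (not data from this study).
toy_annotations = {
    "genome_A": {"hlyA", "hlyD", "cheW", "galU"},
    "genome_B": {"hlyA", "cheW", "galU"},
}
query = ["hlyA", "hlyD", "cheW", "galU"]

table, shared = presence_table(query, toy_annotations)
for gene in query:
    print(gene, table[gene])
print("present in all genomes:", shared)
```

In practice, matching by gene name alone is fragile because annotation pipelines may name homologs differently; a sequence-similarity search would be a more robust basis for such a table.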

Discussion
S. algae has become an emerging marine zoonotic pathogen worldwide [5]. The spectrum of S. algae infection is broad, with considerable morbidity and mortality in compromised hosts [31,32]. Thus, genomic characterization of S. algae is important for determining its molecular epidemiology, understanding its pathogenesis, identifying specific biomarkers, tracing the evolution of these strains, and developing control strategies for these pathogens in hosts [33,34]. In the present study, we used comparative genomics to analyze the chromosomal sequences of four isolates to determine their common genetic content and organization, unique virulence attributes, and evolutionary relationship with other strains. Whole-genome sequence analysis of S. algae detected the chemotaxis gene cluster cheYZA operon, which is conserved in chemotactic bacteria [35]. Chemotaxis is directed motility in response to concentration gradients of signals. CheA has been shown to be essential for chemotaxis, which operates through a two-component pathway [36]. In brief, CheA phosphorylates CheY, which is then dephosphorylated by the phosphatase CheZ [37]. Previous studies revealed that CheW and CheA share structural homology and bind to the same site on chemoreceptors [37]. CheW is essential to the activation of CheA and the formation of the CheA-CheW complex [38]. Owing to the wide range of S. algae habitats, the drivers of its chemotaxis could be very diverse. Previous studies have demonstrated that pathogenic bacteria use chemotaxis to locate reservoirs. Further study is needed to identify the microenvironments suitable for S. algae and the triggers of its chemotaxis.
Biliary tract infection is a main manifestation of S. algae infection, and bile resistance has been noted in pathogenic strains [31]. In this study, we also identified genes associated with bile adaptation. The exbBD genes encode the Ton energy transduction system, which is implicated in the response to bile [39,40]. We also detected galU, htpB, and wecA, which are involved in bile resistance [41][42][43]. The results support an earlier genomic study suggesting a common mechanism of bile resistance in Shewanella.
Motility is one characteristic of S. algae [3]. We identified a series of flagellar gene operons in the S. algae genomes. These flagellar systems are unique and require more study regarding their evolution and organization. Hemolysis is a main pathogenic feature of S. algae [44]. The gene hlyA encodes the RTX pore-forming toxin α-hemolysin, which alters membrane permeability and causes cell lysis in a variety of human and animal hosts [45].

Conclusions
In conclusion, this is one of the few studies tracking the genetic background of putative virulence-related genes in S. algae. Although the number of strains was limited, we highlight the characteristics of the core virulence determinants in these strains, as well as their high level of genomic conservation.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.