Research Paper
A bioinformatic approach to identify core genome difference between Salmonella Pullorum and Salmonella Enteritidis
Introduction
The majority of Salmonella serovars are broad host range pathogens; however, a few serovars infect only one or a few host species (Uzzau et al., 2000). The mechanism underlying the host specificity of these serovars remains unknown. Salmonella Pullorum (formal name S. enterica subspecies enterica serovar Gallinarum biovar Pullorum (S. Pullorum)) belongs to the host-specific group, as it only infects avian species, in which it causes systemic disease with high mortality in young chickens (Barrow and Freitas Neto, 2011; Shivaprasad, 2000). Studies have revealed that S. Pullorum is closely related to S. Enteritidis (formal name S. enterica serovar Enteritidis) (Langridge et al., 2015; Thomson et al., 2008), another Salmonella serovar that commonly infects avian species but has a broad host range and mainly causes gastroenteritis (Rodrigue et al., 1990). The close evolutionary relationship, combined with the different host range and pathogenicity, makes comparison of the genomes and traits of S. Pullorum and S. Enteritidis a suitable approach to identify genome features that are associated with host-specificity in Salmonella.
The development towards host-specificity of Salmonella serovars has likely been driven by several mechanisms. Comparative genomics has already uncovered important genome features of host-specific serovars, most notably that each of them contains a number of specific genes, and that host adaptation has been accompanied by pseudogene formation in genes that apparently are not needed for the host-specific infection (Langridge et al., 2015; Thomson et al., 2008). In addition, studies in other bacteria have indicated the importance of horizontal gene transfer for the evolution of microbial genomes (Pal et al., 2005; Popa et al., 2011; Treangen and Rocha, 2011), leading to regions that are specific to particular bacterial (sub)species or strains, termed the ‘mobilome’ (Dobrindt et al., 2004; Ou et al., 2007; Siguier et al., 2006).
A pan-genome is the entire set of genes for all strains within a clade, and the core-genome represents the genes present in all strains (Medini et al., 2005; Tettelin et al., 2005; Vernikos et al., 2015). As the number of available genomes increases, so does the pan-genome of each serovar (Baddam et al., 2014; Laing et al., 2017), and pan-genome analysis tools have been established to increase the power of genome comparisons (Xiao et al., 2015). The core-genome can, on the other hand, also be defined less strictly as the set of genes present in more than 90% of the members of a clade. Pan- and core-genome analysis can provide a useful framework to determine the genomic diversity of the dataset at hand (Vernikos et al., 2015).
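The two definitions above can be illustrated with a minimal Python sketch. It assumes that gene content is already available as one set of gene identifiers per strain; the strain names, gene names and the `min_fraction` parameter are hypothetical and are not taken from the study. Note that the function uses "at least" a given fraction, so the threshold has to be chosen to match whichever core-genome definition is intended.

```python
from typing import Dict, Set


def pan_genome(strains: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of all genes seen in any strain."""
    genes: Set[str] = set()
    for gene_set in strains.values():
        genes |= gene_set
    return genes


def core_genome(strains: Dict[str, Set[str]], min_fraction: float = 1.0) -> Set[str]:
    """Return genes present in at least `min_fraction` of the strains.

    min_fraction=1.0 gives the strict core (present in every strain);
    a lower value gives a relaxed core.
    """
    if not strains:
        return set()
    n_strains = len(strains)
    counts: Dict[str, int] = {}
    for gene_set in strains.values():
        for gene in gene_set:
            counts[gene] = counts.get(gene, 0) + 1
    return {g for g, c in counts.items() if c / n_strains >= min_fraction}


# Hypothetical toy data: strain name -> set of gene identifiers.
toy = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB", "geneD"},
    "strain3": {"geneA", "geneB", "geneC"},
    "strain4": {"geneA", "geneC"},
}

pan = pan_genome(toy)                  # every gene seen at least once
strict_core = core_genome(toy, 1.0)    # genes found in all four strains
relaxed_core = core_genome(toy, 0.75)  # genes found in at least 3 of 4 strains
```

With real data the sets would typically be derived from the gene presence/absence output of a pan-genome tool, and the relaxed threshold matters more as the number of genomes grows, because a single incomplete assembly can otherwise remove a gene from the strict core.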
Apart from pathotype diversity associated with gene gain or loss, single-nucleotide polymorphisms (SNPs) may also contribute to host-specificity (Bekal et al., 2016; Yue et al., 2015; Yue and Schifferli, 2014). Previous research has reported that SNPs in coding sequences cause phenotypic differences in Salmonella (Hopkins and Threlfall, 2004; Thornbrough and Worley, 2012). Functionally important SNPs can also exist in non-coding regions (Hammarlof et al., 2018; Zaunbrecher et al., 2009), such as promoter regions or recognition motifs for regulators, which are usually located upstream of the gene start codon (Haugen et al., 2008; McClure, 1985). Thus, it is important to be able to combine searches for unique genes with searches for SNPs that are likely to influence phenotype.
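The distinction between coding and upstream SNPs can be made concrete with a short Python sketch. It assumes 1-based inclusive gene coordinates and a fixed upstream window; the `Gene` class, the coordinates and the 200 bp window are hypothetical illustration values, not parameters reported by the study. For a gene on the reverse strand, "upstream" lies at higher coordinates than the gene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based, inclusive, leftmost coordinate
    end: int     # 1-based, inclusive, rightmost coordinate
    strand: str  # "+" or "-"


def classify_snp(pos: int, gene: Gene, upstream_window: int = 200) -> str:
    """Classify a SNP position relative to one gene.

    Returns "coding" if the position lies inside the gene, "upstream" if it
    lies within `upstream_window` bases before the start codon (taking the
    strand into account), and "other" otherwise.
    """
    if gene.start <= pos <= gene.end:
        return "coding"
    if gene.strand == "+":
        lo, hi = gene.start - upstream_window, gene.start - 1
    else:
        # On the reverse strand the start codon sits at the rightmost coordinate.
        lo, hi = gene.end + 1, gene.end + upstream_window
    if lo <= pos <= hi:
        return "upstream"
    return "other"


# Hypothetical genes used only for illustration.
forward_gene = Gene("geneX", 1000, 1900, "+")
reverse_gene = Gene("geneY", 1000, 1900, "-")

labels = [
    classify_snp(1500, forward_gene),  # inside the gene
    classify_snp(950, forward_gene),   # 50 bp before the start codon
    classify_snp(700, forward_gene),   # outside the 200 bp window
    classify_snp(1950, reverse_gene),  # upstream of a reverse-strand gene
    classify_snp(950, reverse_gene),   # downstream of a reverse-strand gene
]
```

A fixed window is only a rough proxy for promoter and regulator-binding regions; the appropriate size depends on the organism and on how the annotation defines gene starts.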
In the current study, we determined the pan-genome of S. Pullorum and S. Enteritidis and, based on this, identified the shared core-genome of the two serovars as well as the parts of the core-genome that are unique to each serovar. The aim was to discover putative host-specificity associated genes and SNPs in S. Pullorum. To do so, we designed a novel workflow to identify core-SNPs and core-upstream-SNPs between groups of strains. Our new genome-based analysis workflow (Corevar), with its related scripts, can also be applied to studies of other bacterial clades without additional adjustments.
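One way to picture a SNP that separates two groups of strains is a site at which every strain in one group carries the same base and every strain in the other group carries a different base. The Python sketch below implements only that idea for sequences that are already aligned; it is an illustrative assumption about what such a comparison could look like and is not the Corevar implementation. The function name, strain labels and sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def discriminating_snps(
    group_a: Dict[str, str], group_b: Dict[str, str]
) -> List[Tuple[int, str, str]]:
    """Find alignment columns fixed within each group but different between groups.

    Both arguments map strain name -> aligned, upper-case sequence of equal
    length. Returns (1-based position, allele in group A, allele in group B).
    Columns containing gaps or ambiguous bases are skipped.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must contain at least one strain")
    seqs = list(group_a.values()) + list(group_b.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must be aligned to the same length")

    valid = set("ACGT")
    hits: List[Tuple[int, str, str]] = []
    for i in range(length):
        alleles_a = {s[i] for s in group_a.values()}
        alleles_b = {s[i] for s in group_b.values()}
        # Require a single allele within each group.
        if len(alleles_a) != 1 or len(alleles_b) != 1:
            continue
        a = next(iter(alleles_a))
        b = next(iter(alleles_b))
        if a != b and a in valid and b in valid:
            hits.append((i + 1, a, b))
    return hits


# Hypothetical aligned sequences for two groups of strains.
pullorum_like = {"P1": "ACGTAC", "P2": "ACGTAC"}
enteritidis_like = {"E1": "ACTTAA", "E2": "ACTTGA"}

snps = discriminating_snps(pullorum_like, enteritidis_like)
```

In the toy data, position 5 varies within the second group and is therefore not reported, whereas positions 3 and 6 are fixed in both groups and differ between them. Applying the same function to alignments of upstream regions instead of coding sequences would give the corresponding upstream comparison.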
Read dataset
In this study, a total of 144 read datasets were analysed. These consisted of 97 S. Pullorum genome sequences (Hu et al., 2019) and 47 S. Enteritidis strains, which were de novo sequenced in the current study following the same protocol as applied for sequencing of S. Pullorum (Hu et al., 2019). Briefly, genomic DNA was fragmented to an insert size of ~500 bp to prepare the library. Paired-end sequencing (2 × 150 bp) was then performed on a HiSeq 2500 system (Illumina, USA). Reads with <90% Q30
Features of the genomes analysed
The pan-genome analysis of 97 S. Pullorum genomes and 47 S. Enteritidis genomes identified 6449 CDS, which were divided into 3997 core CDS, 116 soft core CDS, 844 shell CDS and 1492 cloud CDS. The high identity of core CDS between the two serovars supported the view that S. Pullorum and its close relative S. Gallinarum biovar Gallinarum are descendants of S. Enteritidis (Langridge et al., 2015; Thomson et al., 2008). Among the core CDS of each serovar, 145 and 127 were unique to S. Pullorum
Discussion
Pathogenicity island (PAI) CDS and prophage elements were common among the unique CDS according to the VRprofile prediction results. Prophages are known to drive diversity in S. enterica (Cooke et al., 2007; Thomson et al., 2004), and thus it is not surprising that there is a relatively high enrichment of prophage-related CDS among the unique CDS. PAIs play a pivotal role in the virulence of bacterial pathogens including Salmonella (Schmidt and Hensel, 2004), and the relatively high hit number of
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This study was supported by the National Natural Science Foundation of China (31730094, 31920103015); the National Key Research and Development Program of China (2017YFD0500705; 2017YFD0500100); the Jiangsu province agricultural science and technology independent innovation funds (CX(16)1028); and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD). Xiao Fei was supported by funding from the China Scholarship Council.
References (65)
- et al.
Salmonella typhimurium impedes innate immunity with a mast-cell-suppressing protein tyrosine phosphatase, SptP
Immunity
(2013)
- et al.
Structure-function analyses of the bacterial zinc metalloprotease effector protein GtgA uncover key residues required for deactivating NF-kappaB
J. Biol. Chem.
(2018)
- et al.
A gene knock-in method used to purify plasmid pSPI12 from Salmonella enterica serovar Pullorum and characterization of IpaJ
J. Microbiol. Methods
(2014)
- et al.
The microbial pan-genome
Curr. Opin. Genet. Dev.
(2005)
- et al.
Non-typhoidal salmonellosis: emerging problems
Microbes Infect.
(2001)
- et al.
The presence of genes homologous to the K88 genes faeH and faeI on the virulence plasmid of Salmonella gallinarum
FEMS Microbiol. Lett.
(1998)
- et al.
The SPI-19 encoded type-six secretion-systems (T6SS) of Salmonella enterica serovars Gallinarum and Dublin play different roles during infection
Vet. Microbiol.
(2019)
- et al.
The role of prophage-like elements in the diversity of Salmonella enterica serovars
J. Mol. Biol.
(2004)
- et al.
Ten years of pan-genome analyses
Curr. Opin. Microbiol.
(2015)
- et al.
A brief review of software tools for pangenomics
Genom. Proteom. Bioinforma.
(2015)
Genome dynamics and evolution of Salmonella Typhi strains from the typhoid-endemic zones
Sci. Rep.
SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing
J. Comput. Biol.
Pullorum disease and fowl typhoid--new thoughts on old diseases: a review
Avian Pathol.
Functional homology of virulence plasmids in Salmonella gallinarum, S. pullorum, and S. typhimurium
Infect. Immun.
Contribution of Salmonella-Gallinarum large plasmid toward virulence in fowl typhoid
Infect. Immun.
Usefulness of high-quality core genome single-nucleotide variant analysis for subtyping the highly clonal and the most prevalent Salmonella enterica Serovar Heidelberg clone in the context of outbreak investigations
J. Clin. Microbiol.
Comparative genomic analysis uncovers 3 novel loci encoding type six secretion systems differentially distributed in Salmonella serotypes
BMC Genomics
The type VI secretion system encoded in Salmonella pathogenicity island 19 is required for Salmonella enterica serotype Gallinarum survival within infected macrophages
Infect. Immun.
Comparative physical and genetic maps of the virulence plasmids of Salmonella enterica serovars typhimurium, enteritidis, choleraesuis, and Dublin
Infect. Immun.
Prophage sequences defining hot spots of genome variation in Salmonella enterica serovar typhimurium can be used to discriminate between field isolates
J. Clin. Microbiol.
Identification of a novel gene in ROD9 island of Salmonella Enteritidis involved in the alteration of virulence-associated genes expression
Virulence
Genomic islands in pathogenic and environmental microorganisms
Nat. Rev. Microbiol.
Comparative genomics identifies distinct lineages of S. Enteritidis from Queensland, Australia
PLoS One
Biology and clinical significance of virulence plasmids in Salmonella serovars
Clin. Infect. Dis.
Role of a single noncoding nucleotide in the evolution of an epidemic African clade of Salmonella
Proc. Natl. Acad. Sci. U. S. A.
Advances in bacterial promoter recognition and its control by factors that do not bind DNA
Nat. Rev. Microbiol.
Frequency and polymorphism of sopE in isolates of Salmonella enterica belonging to the ten most prevalent serotypes in England and Wales
J. Med. Microbiol.
Typhoid fever: pathogenesis and immunologic control
N. Engl. J. Med.
Loss and gain in the evolution of the Salmonella enterica serovar gallinarum biovar pullorum genome
mSphere
eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences
Nucleic Acids Res.
MAFFT multiple sequence alignment software version 7: improvements in performance and usability
Mol. Biol. Evol.
Pan-genome analyses of the species Salmonella enterica, and identification of genomic markers predictive for species, subspecies, and Serovar
Front. Microbiol.