Comparative genomics analyses revealed two virulent Listeria monocytogenes strains isolated from ready-to-eat food

Background Listeria monocytogenes is an important foodborne pathogen that causes considerable morbidity in humans, with high mortality rates. In this study, we sequenced the genomes of two strains, LM115 and LM41, isolated from ready-to-eat food in Malaysia and performed comparative genomics analyses on them.
Results The genome sizes of LM115 and LM41 were 2,959,041 and 2,963,111 bp, respectively. These two strains shared approximately 90% homologous genes. Comparative genomics and phylogenomic analyses revealed that LM115 and LM41 were more closely related to the reference strains F2365 and EGD-e, respectively. Our virulence profiling indicated a total of 31 virulence genes shared by both analysed strains. These shared genes included those that encode internalins and L. monocytogenes pathogenicity island 1 (LIPI-1). Both Malaysian L. monocytogenes strains also harboured several genes associated with stress tolerance, which help them counter adverse conditions. Seven antibiotic resistance and efflux pump related genes, which may confer resistance against lincomycin, erythromycin, fosfomycin, quinolones, tetracycline, penicillin, and macrolides, were identified in the genomes of both strains.
Conclusions Whole genome sequencing and comparative genomics analyses revealed two virulent L. monocytogenes strains isolated from ready-to-eat foods in Malaysia. The identification of strains with pathogenic, persistent, and antibiotic resistant potential in minimally processed food warrants close attention from both the healthcare sector and the food industry.


Background
Listeria monocytogenes (L. monocytogenes) is a Gram-positive, motile, rod-shaped bacterium that is ubiquitous in nature. It is an emerging foodborne pathogen and causes human listeriosis, which can be a life-threatening illness, particularly in the elderly, pregnant women, newborns, and immunocompromised patients [1]. Listeriosis has been detected in many geographical regions, particularly in the USA and Europe [1]. Although L. monocytogenes has been detected in foods in Malaysia, cases of listeriosis are rarely reported [2,3].
Human listeriosis has been associated with the consumption of contaminated raw, processed, and ready-to-eat (RTE) foods [3]. Since L. monocytogenes is able to survive in a wide range of adverse conditions such as low temperature (2-4 °C), low pH, and low water content [4], it may outcompete other microorganisms in acidic and refrigerated foods, as well as in foods that are preserved through salting, sugaring, and drying. Furthermore, the increasing consumer demand for fresh and minimally processed foods has increased the risk of listeriosis, as such foods contain only low levels of the preservatives that can inhibit the growth of L. monocytogenes [5].
Listeria monocytogenes is naturally susceptible to a wide range of clinically relevant antibiotics except for quinolones, fosfomycin, and cephalosporins [10]. However, resistance to single or multiple antibiotics has increasingly been reported for food strains [3,11]. The occurrence of resistant strains might be a consequence of food contamination by food handlers or by contaminated food processing plants. Apart from that, the use of antibiotics in livestock as growth promoters or for disease treatment and prevention may act as a selective pressure for emerging resistant strains, which may be zoonotically transferred to humans via food consumption [12]. Given the severity of listeriosis, the emergence of antibiotic resistant L. monocytogenes poses a major concern for both food safety and public health.
The availability of complete genome sequences of L. monocytogenes allows comparative genomics analyses to be performed, which shed light on the genetic basis underlying the virulence and adaptability of this foodborne pathogen. New genomic data are needed to extend our understanding of the pathogenicity of this organism. Such genomic information may help in the development of new control methods through the identification and discovery of new virulence-associated genes. In this study, we sequenced and analysed two L. monocytogenes strains isolated from RTE food in Malaysia to elucidate their virulence potential. Genomic comparison was also performed between the studied strains and three other reference strains to gain insights into the evolutionary relationships of these bacteria.

Bacterial strains and genomic DNA extraction
LM115 and LM41 were isolated from fried fish and salad, respectively, that were purchased from a Malaysian street-side hawker stall in 2011 as previously described [2]. The strains were cultivated in Tryptic soy medium (Oxoid, Basingstoke, UK) and preserved at −80 °C in 50% glycerol. The genomic DNA was extracted from a pure culture using the DNeasy Blood & Tissue kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.

Whole genome sequencing, assembly, and annotation
Whole genome sequencing of the L. monocytogenes strains was performed on an Illumina HiSeq 2000 platform. The generated sequence reads were trimmed, quality-checked, and assembled de novo using CLC Genomics Workbench 5.1 (CLC Bio, Denmark) as previously described [13]. A total of 28 and 11 contigs, with coverage of 98× and 101×, were generated for LM115 and LM41, respectively. These contigs were mapped and reordered against L. monocytogenes EGD-e (1/2a) using Mauve [14]. The assembled sequences were then submitted to the Rapid Annotation using Subsystem Technology (RAST) server [15] for annotation. The number of rRNA genes was predicted using the RNAmmer 1.2 server [16], whereas the numbers of tRNA and tmRNA genes were predicted using ARAGORN [17].
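As a rough sanity check on sequencing depth, average coverage can be estimated as the total number of sequenced bases divided by the genome size. The sketch below is illustrative only: the read count and read length are hypothetical values (the text does not report them), and only the LM115 genome size is taken from the text.

```python
def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Estimate average sequencing depth as total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


# Genome size of LM115 as reported in the text. The read count and read
# length are hypothetical values chosen only for illustration.
LM115_GENOME_SIZE = 2_959_041
depth = mean_coverage(n_reads=2_900_000, read_length=100,
                      genome_size=LM115_GENOME_SIZE)
print(f"Estimated coverage: {depth:.0f}x")
```

This estimate ignores bases lost to trimming and unmapped reads, so it should be read as an upper bound on the depth actually achieved.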

Virulence factors and antimicrobial resistance genes identification
Virulence genes were predicted by performing a BLAST search of the LM115 and LM41 genomes against the Virulence Factors of Pathogenic Bacteria database (VFDB) [20]. For the detection of antimicrobial resistance genes, the whole genome sequences of LM115 and LM41 were uploaded to the Resistance Gene Identifier (RGI) of the Comprehensive Antibiotic Resistance Database (CARD) [21]. The predicted genes were then validated by performing BLASTp against both the non-redundant (nr) and Swiss-Prot databases, with 60% coverage and 60% sequence identity as the thresholds. If the results from the two databases conflicted, nr was given priority over Swiss-Prot.
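The validation rule described above (60% coverage, 60% identity, nr before Swiss-Prot) can be sketched as a small filter. This is a minimal illustration, not the authors' script. The dictionary keys, the example annotations, the definition of coverage as alignment length over query length, the use of "greater than or equal to" for both cut-offs, and the fallback to Swiss-Prot when the nr hit fails the thresholds are all assumptions.

```python
from typing import Optional

MIN_IDENTITY = 60.0  # percent sequence identity threshold stated in the text
MIN_COVERAGE = 60.0  # percent coverage threshold stated in the text
DB_PRIORITY = ("nr", "swissprot")  # nr takes precedence over Swiss-Prot


def passes_thresholds(hit: dict) -> bool:
    """Return True if a hit meets both the identity and the coverage cut-off.

    Coverage is taken here as alignment length / query length (an assumption).
    """
    coverage = 100.0 * hit["align_len"] / hit["query_len"]
    return hit["pident"] >= MIN_IDENTITY and coverage >= MIN_COVERAGE


def validate_gene(hits_by_db: dict) -> Optional[str]:
    """Pick one annotation for a predicted gene, preferring nr over Swiss-Prot.

    hits_by_db maps a database name to its best hit, a dict with the
    hypothetical keys 'pident', 'align_len', 'query_len', and 'annotation'.
    """
    for db in DB_PRIORITY:
        hit = hits_by_db.get(db)
        if hit is not None and passes_thresholds(hit):
            return hit["annotation"]
    return None  # neither database supports the prediction


# Hypothetical best hits for one predicted gene.
example = {
    "nr": {"pident": 92.5, "align_len": 380, "query_len": 400,
           "annotation": "tetracycline efflux protein"},
    "swissprot": {"pident": 71.0, "align_len": 390, "query_len": 400,
                  "annotation": "MFS transporter"},
}
print(validate_gene(example))
```

Keeping the priority order in a single tuple makes the tie-breaking rule explicit and easy to change if another database is added.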

Quality assurance
Standard biochemical tests (Gram staining, catalase, oxidase, urea, SIM, TSI, and MR-VP) and species-specific PCR were used to confirm the identity of both L. monocytogenes strains LM115 and LM41 as previously described [2]. Genomic DNA was extracted from a single colony of the pure bacterial culture. Potential contamination of the genomic library by foreign DNA was assessed using the CLC Genomics Workbench 5.1 (CLC Bio, Denmark) as previously described [13].

General genome features
The predicted genome sizes of LM115 and LM41 are 2,959,041 and 2,963,111 bp, respectively. The G + C contents of the two genomes are approximately 38%. The number of tRNA is 51 and 60 for LM115 and LM41, respectively. Both strains carry three rRNA and one tmRNA. A total of 2913 and 2951 coding sequences (CDS) were predicted for LM115 and LM41, respectively. The genome features of the two strains are summarized in Table 1.
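Summary features such as assembly size and G + C content can be computed directly from the contig sequences. The following sketch assumes the contigs are available as FASTA-formatted text. The toy sequences and function names are illustrative and are not taken from the study.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def assembly_stats(contigs: dict) -> tuple:
    """Return (total length in bp, G + C content in percent) over all contigs."""
    total = sum(len(seq) for seq in contigs.values())
    gc = sum(seq.count("G") + seq.count("C") for seq in contigs.values())
    return total, (100.0 * gc / total if total else 0.0)


# Toy two-contig assembly, used only to show the calculation.
toy = ">contig_1\nATGCATAT\nAT\n>contig_2\nGGCCAATTAA\n"
size, gc_percent = assembly_stats(parse_fasta(toy))
print(size, round(gc_percent, 1))
```

Computing G + C content over the pooled contigs, rather than averaging per-contig values, weights each contig by its length.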

Comparative genomics and phylogenomic analysis
Whole genome comparison of the two Malaysian L. monocytogenes strains with L. monocytogenes EGD-e, F2365, and L. innocua Clip11262 revealed a total of 2497 shared ORFs, which accounted for approximately 82% of the total ORFs present in each of the studied strains. LM41 was genetically more similar to EGD-e, a derivative of the animal isolate EGD that was used in cell-mediated immunity studies [22]. This genetic similarity was depicted in the circular map of the genome comparison (Fig. 1), where LM41 showed high genome identity (>70% nucleotide identity) to EGD-e, except for two major regions in EGD-e, spanning approximately 1132-1152 kb and 2362-2385 kb. These regions carried genes that encode various proteins, including hypothetical proteins, a cadmium resistance protein, and phage-related proteins. In contrast, LM115 showed less genomic similarity to EGD-e but was more closely related to F2365, a cheese isolate from a Californian outbreak in 1985 [23] (Fig. 1).
Pairwise comparison showed that LM115 and LM41 shared approximately 90% of their total ORFs. The core and unique genes of LM115 and LM41 were further analysed according to the various classes of Clusters of Orthologous Groups (COGs) (Fig. 2). Our results showed that genes from COG class J (Translation, ribosomal structure and biogenesis), class C (Energy production and conversion), class E (Amino acid transport and metabolism), class F (Nucleotide transport and metabolism), and class H (Coenzyme transport and metabolism) were abundant in the core genome. On the other hand, the unique genes were mostly associated with class M (Cell wall/membrane/envelope biogenesis), class V (Defence mechanisms), and class L (Replication, recombination and repair). Detailed genome analysis showed that LM115 and LM41 carried a total of 95 and 116 strain-specific genes, respectively. Other than genes related to the COG classes mentioned above (M, V, and L), most of the unique genes encode hypothetical proteins. Although their functions remain uncharacterized, these strain-specific hypothetical proteins might confer specific adaptive or fitness advantages.
Our SNP-based phylogenomic analysis showed that LM41 and LM115 were closely related to the reference strains EGD-e and F2365, respectively (Fig. 3), consistent with the findings of our comparative genomics analysis discussed earlier (Fig. 1). LM115 was also shown to be closely related to two serotype 4b strains, LM201 and Scott A. LM201 was isolated from foodstuffs in China, whereas Scott A is a clinical isolate from the Massachusetts listeriosis outbreak in 1983 [24,25]. The phylogenomic tree also revealed the separation of LM115 and LM41 into two different clades. Since SNPs were used to infer the phylogenetic relationship of these strains, this observation indicated possibly high genetic variation between the genomes of LM115 and LM41.
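Once each ORF has been assigned to an ortholog cluster, the split into core and strain-specific genes reduces to set operations. The sketch below uses toy cluster identifiers rather than the actual LM115 and LM41 gene content. The function name and the output keys are hypothetical.

```python
def compare_gene_sets(genes_a: set, genes_b: set) -> dict:
    """Split two strains' ortholog clusters into core and strain-specific sets."""
    core = genes_a & genes_b
    return {
        "core": core,
        "unique_a": genes_a - genes_b,
        "unique_b": genes_b - genes_a,
        # Share of each strain's clusters that also occur in the other strain.
        "shared_pct_a": 100.0 * len(core) / len(genes_a) if genes_a else 0.0,
        "shared_pct_b": 100.0 * len(core) / len(genes_b) if genes_b else 0.0,
    }


# Toy ortholog cluster IDs, not the real LM115/LM41 gene content.
strain_a = {f"og{i}" for i in range(1, 11)}  # og1 .. og10
strain_b = {f"og{i}" for i in range(2, 12)}  # og2 .. og11
result = compare_gene_sets(strain_a, strain_b)
print(len(result["core"]), sorted(result["unique_a"]), sorted(result["unique_b"]))
```

The shared percentage is reported separately for each strain because the two genomes need not contain the same number of clusters.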
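The concatenation step behind a SNP-based tree can be illustrated with a small sketch. It assumes that SNPs have already been called against a reference and are given as position-to-base mappings. It does not reproduce the read mapping, quality filtering, or maximum-likelihood inference performed by CSI Phylogeny, and all names and sequences are hypothetical.

```python
def concatenate_snps(reference: str, snps_by_strain: dict) -> dict:
    """Build a concatenated SNP alignment.

    snps_by_strain maps strain name -> {0-based position: variant base}.
    Every position that varies in at least one strain becomes one alignment
    column; strains without a call at that position take the reference base.
    """
    sites = sorted({pos for calls in snps_by_strain.values() for pos in calls})
    alignment = {"Reference": "".join(reference[p] for p in sites)}
    for strain, calls in snps_by_strain.items():
        alignment[strain] = "".join(calls.get(p, reference[p]) for p in sites)
    return alignment


def snp_distance(a: str, b: str) -> int:
    """Count the alignment columns at which two SNP strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy reference and SNP calls, used only to show the data flow.
reference = "ACGTACGTAC"
calls = {
    "strain_x": {1: "T", 7: "A"},
    "strain_y": {1: "T", 4: "G"},
}
aln = concatenate_snps(reference, calls)
for name, seq in aln.items():
    print(name, seq)
```

The resulting strings have equal length and can be passed to any tree-building program. Pairwise SNP distances computed with `snp_distance` give a quick first impression of how far apart two strains are.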

Virulence genes profiling
Several virulence genes found in Listeria spp. were shared between LM115 and LM41. These included Listeria pathogenicity island 1 (LIPI-1) and several internalins. LIPI-1 plays a major role in the pathogenicity of L. monocytogenes and consists of six genes that are important for phagosomal escape (hly, plcA, plcB, mpl), motility and cell-to-cell spread (actA), and gene regulation (prfA) [8].
Six internalin genes (inlA, inlB, inlC, inlK, inlF, inlJ) were identified in both LM115 and LM41. These internalin genes are involved in invasion (inlA, inlB), adherence (inlF, inlJ), cell-to-cell spread (inlC), and autophagy evasion (inlK) [9,26,27]. Other virulence factors annotated in the genomes of both LM115 and LM41 were bile salt hydrolase (bsh), which provides resistance to the acute toxicity of bile salts in the host intestine, and autolysin amidase (ami), which plays a role in adhesion to host cells [28,29]. All the virulence genes identified in both LM115 and LM41 were also present in the pathogenic reference strains EGD-e and F2365.

Stress tolerance
Listeria monocytogenes can encounter various stresses arising from different food processing methods such as heating, chilling, and sugaring. The ability of this pathogen to adapt to and overcome these stresses is attributable to its stress tolerance proteins. A number of genes encoding stress response proteins were identified in LM115 and LM41 (Table 2). Both strains carried the glutamate decarboxylase (GAD) operon and the arginine deiminase (ADI) operon, which are involved in acid tolerance. The GAD system increases the pH of the cytoplasm by consuming an intracellular proton during the conversion of glutamate to γ-aminobutyrate (GABA) [30]. The ADI system, on the other hand, alleviates the acidity of the cytoplasm by combining an intracellular proton with the system's by-product (NH3) to release an ammonium ion (NH4+) [31]. Both systems may give L. monocytogenes a competitive advantage in surviving in low-pH foods, which usually limit bacterial growth. In fact, the role of the GAD system in acid tolerance has been demonstrated in acidified skim milk [32] and cheese [33]. Besides, LM115 and LM41 also harboured several genes encoding cold and heat shock proteins, which protect bacteria from cell damage induced by temperature stress [34,35]. Foods stored at low temperature or processed with high heat, such as frozen burger patties and fried chicken, have been reported to be contaminated with L. monocytogenes in Malaysia [2,36]. Moreover, the BetL, Gbu, and OpuC transport systems, which play a major role in the osmotic stress response of L. monocytogenes, were also annotated in the genomes of LM115 and LM41. These three systems are involved in the uptake of betaine and carnitine, which balance intracellular and extracellular osmotic stress [37], allowing L. monocytogenes to survive in foods preserved at low water content. The gene encoding the sigma-B regulator protein (SigB), which regulates various stress responses such as osmotic and temperature stress, was also identified in LM115 and LM41.
Fig. 3 The phylogenetic tree was generated using CSI Phylogeny 1.4 [19]. Single nucleotide polymorphisms (SNPs) of each strain were called by mapping the genome sequence to that of the reference. The quality-checked SNPs were then concatenated and used to infer a maximum-likelihood tree. "Reference" refers to the reference strain L. monocytogenes EGD-e.

Antibiotic resistance determinants
Both LM115 and LM41 carried similar antibiotic resistance related genes in their genomes. The tetA gene, which is related to tetracycline resistance, was found in both strains. Although an association of tetM with tetracycline resistance is more commonly reported, tetA has also been identified in strains isolated from fish samples [38,39]. Additionally, LM115 and LM41 harboured the mecC gene, which could confer resistance to beta-lactam drugs. Beta-lactam antibiotics such as ampicillin and penicillin, in combination with aminoglycosides, remain the primary therapeutic option for human listeriosis [40]. Resistance to beta-lactam drugs could therefore compromise the effectiveness of the current treatment. Apart from that, genes encoding the lincomycin resistance protein (lmrB), the fosfomycin resistance protein (fosX), and the erythromycin resistance ATP-binding protein (msrA) were also identified in both strains. Furthermore, two efflux pump-related genes, lde and mdrL, which confer resistance to quinolones and macrolides, respectively, were identified in the two genomes.
A few recent reports have documented the isolation in Malaysia of L. monocytogenes strains resistant to one or more antibiotics [3,9]. The isolation of resistant strains from food represents an important health risk, as these strains could be transmitted to humans via contaminated food. The identification of multiple antibiotic resistance genes in LM115 and LM41 underscores the importance of good food handling practices in preventing the dissemination of this pathogen.

Conclusions
Our comparative genomics analyses identified approximately 90% homologous genes between LM115 and LM41. LM115 and LM41 showed a close phylogenetic relationship with the pathogenic reference strains F2365 and EGD-e, respectively. Based on our initial genomic analysis, several virulence genes, such as those encoding LIPI-1 and internalins, were shared between the two strains. Both LM115 and LM41 harboured several stress tolerance genes, which may help them to survive the various stresses imposed by different food processes. Additionally, a number of antibiotic resistance genes were found in the two genomes. The occurrence of virulent and antibiotic resistant L. monocytogenes strains with significant stress tolerance in RTE food poses a great concern for food safety. Functional genomic studies are required to examine the association of these genes with the persistence and pathogenicity of these strains.
Authors' contributions
SYL performed the comparative genomics, analysed the data, and drafted the manuscript. KPY performed the sequence quality check, assembly, gene prediction, and gene annotation. SYL, KPY, and KLT critically reviewed and improved the manuscript. KLT supervised and provided funding for the project. All authors read and approved the final manuscript.