Comparison of Whole-Genome Sequences for Three Species of the Elizabethkingia Genus

Background: A growing number of studies report whole-genome sequences for clinical strains of the Elizabethkingia genus, which can cause severe infections in humans, but few comparative genomic studies of Elizabethkingia species have been conducted in China. Methods: Elizabethkingia isolates from a tertiary hospital in Beijing, China, were re-identified and analyzed through in silico DNA-DNA hybridization (DDH) and whole-genome sequence-based phylogeny. Antibiotic resistance genes, antimicrobial resistance-associated proteins, and virulence factors were identified, and gene functions were evaluated using Clusters of Orthologous Groups (COGs) and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The clinical information of patients infected by these organisms was collected and their characteristics were analyzed. Results: Three species were found among the 20 clinical Elizabethkingia isolates: E. meningoseptica, E. anophelis and E. miricola. E. anophelis accounted for the majority. E. meningoseptica exhibited higher GC content and possessed the carbapenemase-encoding genes blaGOB-16 and blaB-12, while E. anophelis carried blaCME-1. Multiple kinds of antimicrobial resistance-associated proteins were predicted, and virulence factors related to adherence, biofilm formation, iron and magnesium uptake, stress adaptation, and immune evasion were discovered. Among the 2622 core gene clusters identified from the three Elizabethkingia species, the majority of genes were metabolism-related. The pan genome increased in size, while the core genome decreased, as additional genomes from the 20 Elizabethkingia strains were included. Conclusions: The composition of antimicrobial resistance-related, virulence-related and metabolism-related genes differed depending upon

associated outbreak caused by E. anophelis occurred in Wisconsin in 2016, which led to deaths of 18 patients as reported [4,6,8].
Whole-genome sequencing is considered a powerful technique in molecular microbiology: it provides comprehensive information about drug resistance and virulence factors, and it can also help predict host-pathogen interactions and host-environment reactions [16]. In this study, the genomic features and clinical characteristics of 20 Elizabethkingia strains were investigated, building on our previous antimicrobial susceptibility testing [17].

Results
Identification of Elizabethkingia species by whole-genome sequence
According to the whole-genome comparison (Fig. 1), the 20 non-duplicated Elizabethkingia isolates comprised 14 isolates of E. anophelis, 5 of E. meningoseptica and 1 of E. miricola. The heat map showed a clear delineation of the three species in the Elizabethkingia genus. A whole-genome sequence-based phylogenetic tree of the 20 Elizabethkingia isolates and the reference strains, built with the Reference Sequence Alignment based Phylogeny Builder (REALPHY), is presented in Fig. 2. The three species were clearly separated. E. miricola was phylogenetically closer to E. anophelis than to E. meningoseptica.
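The species delineation described above can be illustrated with a minimal sketch. It assumes that pairwise in silico DDH estimates are already available (for example, exported from GGDC) and groups genomes whose DDH value reaches the commonly used 70% species boundary. That threshold, the isolate names and the DDH values are illustrative assumptions and are not taken from this study.

```python
from itertools import combinations


def cluster_by_ddh(names, ddh, threshold=70.0):
    """Group genomes by single linkage: a pair is joined when DDH >= threshold.

    `ddh` maps a pair of names to a DDH estimate in percent. The 70% default
    is the conventional species boundary, used here as an assumption.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        value = ddh.get((a, b), ddh.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical isolates and DDH values, for illustration only.
names = ["iso1", "iso2", "iso3", "iso4"]
ddh = {
    ("iso1", "iso2"): 92.1,
    ("iso1", "iso3"): 31.5,
    ("iso1", "iso4"): 24.0,
    ("iso2", "iso3"): 30.9,
    ("iso2", "iso4"): 23.8,
    ("iso3", "iso4"): 25.2,
}
species_groups = cluster_by_ddh(names, ddh)
```

With these toy values, iso1 and iso2 fall into one putative species and the other two isolates each form their own group. Single linkage is the simplest choice here; a real analysis would also consider the confidence intervals reported with each DDH estimate.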
No statistically significant difference in genome size was found between E. meningoseptica and E. anophelis. Nor were there statistically significant differences among Elizabethkingia species in total contig number, N50 contig length, N90 contig length, gene number or pseudogene number. GC content was higher in E. meningoseptica than in E. anophelis (p < 0.001) (Table 1).
As shown in Table 2 and Fig. 3, resistance genes for 6 classes of antibiotics (beta-lactams, sulfonamides, macrolides, tetracyclines, aminoglycosides and glycopeptides) were identified in the 20 Elizabethkingia isolates. All reference strains and clinical isolates possessed the antibiotic efflux pump gene adeF. Several antibiotic resistance genes were specific to individual Elizabethkingia species, such as blaGOB-16 and blaB-12 in E. meningoseptica, blaCME-1 in E. anophelis, and blaGOB-13 and blaB-6 in E. miricola.
Prediction of antimicrobial resistance-associated proteins in Elizabethkingia species
The antimicrobial resistance-associated proteins predicted for E. meningoseptica and E. anophelis are listed in Table 3.
The potential virulence factors of the three Elizabethkingia species are shown in Table 4 and Fig. 3. Some virulence factors were present in all clinical and reference strains, such as hsp60, streptococcal enolase, exopolysaccharide, Mg2+ transport, EF-Tu, catalase, and peroxidase, while others varied by species: phospholipase D, isocitrate lyase and lipopolysaccharide were found mostly in E. meningoseptica, and O-antigen mostly in E. anophelis.
Functional analysis of the Clusters of Orthologous Groups (COGs) in the genomes of the 20 Elizabethkingia strains (Fig. 4) showed that core genes were mostly related to metabolism, while the majority of accessory and unique gene families were involved in information storage and processing. Category R (general function prediction only) was the largest, followed by K (transcription) and M (cell wall/membrane/envelope biogenesis), whereas D (cell cycle control, cell division, and chromosome partitioning) was the smallest (Fig. 4A and 4B). The detailed COG distribution across the components of gene families showed that core genes reached their largest share, 70.8%, in J (translation, ribosomal structure, and biogenesis); accessory genes reached their largest share, 49.0%, in T (signal transduction mechanisms); and unique genes reached their largest share, 58.8%, in V (defense mechanisms).
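The assembly metrics compared in Table 1 earlier in this section (GC content, N50 and N90 contig length) follow from simple definitions. The sketch below shows one common way to compute them from a list of contig sequences. The function names and the toy contigs are hypothetical, and this is not the pipeline used in this study.

```python
def gc_content(contigs):
    """Return the percentage of G and C bases across all contigs."""
    total = sum(len(contig) for contig in contigs)
    gc = sum(contig.upper().count("G") + contig.upper().count("C")
             for contig in contigs)
    return 100.0 * gc / total


def nx_length(contigs, fraction):
    """Return the Nx contig length (fraction=0.5 gives N50, 0.9 gives N90).

    Contigs are sorted from longest to shortest; the result is the length of
    the contig at which the running total first reaches `fraction` of the
    assembly size.
    """
    lengths = sorted((len(contig) for contig in contigs), reverse=True)
    target = fraction * sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running >= target:
            return length
    raise ValueError("assembly contains no contigs")


# Toy contigs of length 40, 20, 12 and 2 (illustration only).
toy_contigs = ["ATGC" * 10, "AATT" * 5, "GGCC" * 3, "AT"]
n50 = nx_length(toy_contigs, 0.5)
n90 = nx_length(toy_contigs, 0.9)
gc_percent = gc_content(toy_contigs)
```

For the toy assembly of 74 bases, the longest contig alone covers half of the total, so N50 equals its length, while N90 is only reached at the third contig.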
Analysis of the Kyoto Encyclopedia of Genes and Genomes (KEGG) distribution in the 20 Elizabethkingia strains (Fig. 5) showed that metabolism-related genes formed the largest functional group (Fig. 5A). Within this group, carbohydrate metabolism was the largest category, followed by amino acid metabolism (Fig. 5B). Within carbohydrate metabolism, core genes accounted for 32.3%, and accessory and unique genes for 41.0% and 26.7%, respectively.
Analysis of core and pan genome
A flower plot of the core and pan genome analysis showed 2622 core genes, 686 to 1119 accessory genes and 0 to 415 unique genes (Fig. 6 and Table S1). As genomes were added, the pan genome increased from 3561 to 6596, while the core genome decreased from 3134 to 2440 (Fig. 7 and Table S2).
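The opposite trends of the pan and core genome follow directly from their definitions: the pan genome is the union of gene families seen so far, and the core genome is their intersection. A minimal sketch, assuming gene families have already been clustered and each genome is represented as a set of family identifiers (the identifiers below are made up):

```python
def pan_core_curve(genomes):
    """Return (pan size, core size) after each genome is added, in order.

    `genomes` is a list of sets of gene-family identifiers.
    """
    pan = set()
    core = None
    curve = []
    for families in genomes:
        pan |= families                      # union grows or stays the same
        core = set(families) if core is None else core & families  # intersection shrinks or stays
        curve.append((len(pan), len(core)))
    return curve


# Three toy genomes (illustration only).
toy_genomes = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "e"},
    {"a", "b", "f"},
]
curve = pan_core_curve(toy_genomes)
```

This sketch uses a single fixed order of genomes. Pan genome tools commonly average over several orderings, because the shape of the curve depends on which genome is added first.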

Clinical Characteristics of Elizabethkingia Infections
Regarding the source of the 20 Elizabethkingia strains, respiratory tract specimens were the most common (65%), followed by exudate (15%), blood (10%), urine (5%) and bile (5%). Isolates of both E. meningoseptica and E. anophelis originated mostly from the respiratory tract. E. anophelis was the only species isolated from blood, and E. miricola was the only species isolated from urine (Table S3). Table 5 shows the clinical characteristics of patients infected with E. meningoseptica, E. anophelis and E. miricola.

Discussion
The role of whole-genome sequencing in species discrimination of Elizabethkingia genus
Traditional DDH used to be one of the most important criteria for discriminating bacterial species. Recently, in silico DDH has been regarded as a more accurate substitute for traditional DDH [18]. In our study, the genome-to-genome distance calculator (GGDC) with in silico DDH was used to delineate Elizabethkingia species. The 20 clinical strains were initially identified as E. meningoseptica; however, they were found to be E. anophelis ( carbapenemase-encoding genes were more common in E. meningoseptica, while blaCME-1 was more common in E. anophelis. blaGOB-9 and blaGOB-10 were also present in E. anophelis (Fig. 3). blaGOB-13 and blaB-6 were identified in E. miricola. The extended-spectrum serine beta-lactamase CME (Ambler class A) and two unrelated wide-spectrum metallo-beta-lactamases (MBLs), BlaB (subclass B1) and GOB (subclass B3), all belong to the beta-lactamase family. Because CME and the MBLs diverge among Elizabethkingia species, analyzing the homology of their phylogenetic clusters might provide additional evidence for species discrimination.
According to our previous study, our clinical Elizabethkingia strains showed a high level of multidrug resistance [17]. In the present study, proteins associated with resistance to beta-lactams, vancomycin, tetracyclines, quinolones and macrolides, as well as multidrug resistance efflux pumps, were identified in our clinical strains, while proteins associated with resistance to aminoglycosides and sulfonamides were absent. The coincidence rate between antimicrobial resistance genes and the resistance phenotype for beta-lactams was 100%.
Although resistance genes for sulfonamides, macrolides, tetracyclines and aminoglycosides were absent, these Elizabethkingia strains showed resistance phenotypes to them. Interestingly, TetA, a protein responsible for tetracycline efflux, was found in our clinical Elizabethkingia strains, yet these strains were 100% susceptible to minocycline (Table S4). Vancomycin has been reported to treat neonatal meningitis caused by E. meningoseptica successfully [21]; however, all 20 clinical Elizabethkingia strains had a vancomycin minimum inhibitory concentration (MIC) above 8 µg/ml [22], and similar results were reported in other studies [19,20]. vanW, a vanB-type glycopeptide resistance gene, was identified in all three Elizabethkingia species. The exact function of vanW remains uncertain, but its mutations have been implicated in the regulation of resistance to teicoplanin [23]. Hence, glycopeptides should be used with caution in treating Elizabethkingia infections [19,20].
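The genotype-phenotype comparison discussed above reduces to a simple agreement rate. The following sketch assumes one boolean per isolate for gene presence and one for phenotypic resistance; the isolate identifiers and values are hypothetical and do not reproduce the study data.

```python
def coincidence_rate(genotype, phenotype):
    """Return the percentage of isolates whose genotype matches the phenotype.

    `genotype[isolate]` is True when a resistance gene for the drug class was
    detected; `phenotype[isolate]` is True when the isolate tested resistant.
    Only isolates present in both mappings are compared.
    """
    shared = genotype.keys() & phenotype.keys()
    if not shared:
        raise ValueError("no isolates in common")
    agree = sum(genotype[isolate] == phenotype[isolate] for isolate in shared)
    return 100.0 * agree / len(shared)


# Toy data: isolate s3 is resistant without a detected gene (illustration only).
toy_genotype = {"s1": True, "s2": True, "s3": False, "s4": False}
toy_phenotype = {"s1": True, "s2": True, "s3": True, "s4": False}
rate = coincidence_rate(toy_genotype, toy_phenotype)
```

A rate of 100% corresponds to the situation reported for beta-lactams, whereas discordant isolates such as s3 in the toy data mirror resistance phenotypes observed without a detected gene.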
Comparative genome analysis of virulence factors predicted among Elizabethkingia species
Bacterial virulence factors are essential for pathogenesis. In our study, exopolysaccharide, heme biosynthesis, urease, and Mg2+ transport were predicted in all three species among the 20 Elizabethkingia strains (Table 4 and Fig. 3), whereas Liang et al. found them only in E. miricola GTC 862T [16]. The adeG gene was present in 18 Elizabethkingia strains (Table 4); it is related to biofilm formation, which allows bacteria to adhere to medical devices and to resist disinfectants [24-26].
Virulence factors related to lipid and fatty acid metabolism, serum resistance and immune evasion, and phospholipase D were found mostly in E. meningoseptica, which might explain why E. meningoseptica tends to cause more neonatal meningitis and sepsis. It remains unclear whether O-antigen, which is involved in immune evasion, played a role in the outbreaks largely caused by E. anophelis. E. miricola had fewer categories of virulence factors, which could explain why E. miricola infections appear only in occasional clinical case reports [27].

COGs and KEGG analysis of Elizabethkingia genus
Clusters of orthologous groups (COGs) comprise genes from different species that evolved from a common ancestral gene and retain their original function during evolution. Detecting COGs and predicting their functions is of fundamental importance in many fields, particularly in the analysis of pathogens with newly sequenced genomes, where functions such as intracellular survival have been related to COGs in the "information storage and processing" category [28,29]. Both the COG and KEGG analyses of the Elizabethkingia genus indicated that metabolism was the largest functional group, most of which was linked to carbohydrate metabolism.
The pan genome concept, proposed by Tettelin et al., was introduced to discriminate genomes, to investigate the core (conserved), accessory (dispensable), and unique (strain-specific) genes, to trace horizontal gene flux among strains, and to obtain information about species evolution [30]. The single E. miricola strain possessed the most unique genes among the 20 clinical Elizabethkingia strains, which may indicate a higher degree of evolutionary divergence than in the other species. Further studies on more strains are needed to explore the complicated species evolution of the Elizabethkingia genus.

Clinical characteristics of Elizabethkingia infections
Chronic underlying illnesses, such as cardiovascular disease, hypertension, diabetes mellitus, malignancy, and liver cirrhosis, were common in most patients with Elizabethkingia infections [3-5, 31, 32]. Lin et al. (2018) and Lin et al. (2019) reported mortality rates of 24-34% for E. anophelis and 30% for E. meningoseptica, respectively [5,19]; the death rate of patients in our study was lower. We also found that, among the factors affecting mortality, cerebral infarction was an independent risk factor (Table S5). We therefore suggest that closer attention be paid to patients with both cerebral infarction and Elizabethkingia infection in order to reduce mortality.

Conclusions
A comparative genomic analysis of 20 clinical strains from three Elizabethkingia species was conducted, together with an analysis of their clinical characteristics. The results provide information for better understanding the drug resistance, virulence, and gene evolution of this pathogen from the perspective of genome structure.
Clusters of orthologous groups, Kyoto encyclopedia of genes and genomes and pan genome analysis
Core (conserved), accessory (dispensable), and unique (strain-specific) genes were analyzed with the Bacterial Pan Genome Analysis Tool (BPGA) [41]. BPGA was also used to generate pan genome and core genome phylogenies and to access the Clusters of Orthologous Groups (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases [41].
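The three gene categories named above can be made concrete with a small sketch. It assumes a presence table that maps each gene family to the set of genomes carrying it, and it is not BPGA's implementation. A family is treated as core when every genome carries it, unique when exactly one does, and accessory otherwise. All names are hypothetical.

```python
def classify_gene_families(presence, genomes):
    """Split gene families into core, accessory and unique groups.

    `presence` maps a family identifier to the set of genomes carrying it;
    `genomes` is the set of all genomes under comparison.
    """
    groups = {"core": [], "accessory": [], "unique": []}
    total = len(genomes)
    for family, carriers in sorted(presence.items()):
        count = len(carriers & genomes)
        if count == total:
            groups["core"].append(family)       # present in every genome
        elif count == 1:
            groups["unique"].append(family)     # strain-specific
        elif count > 1:
            groups["accessory"].append(family)  # shared by some, not all
    return groups


# Toy presence table for three genomes (illustration only).
toy_genomes = {"g1", "g2", "g3"}
toy_presence = {
    "famA": {"g1", "g2", "g3"},
    "famB": {"g1", "g2"},
    "famC": {"g3"},
    "famD": {"g1", "g2", "g3"},
}
groups = classify_gene_families(toy_presence, toy_genomes)
```

Families absent from all listed genomes are ignored. In practice the presence table would come from a sequence-clustering step, with an identity cutoff that is a parameter of the chosen tool.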

Data Analysis
Clinical information was obtained, with authorization, from the clinical medical record system of Chinese PLA General Hospital. The clinical and whole-genome sequencing data were analyzed with SPSS version 23.0 (IBM, Armonk, NY, USA). Categorical variables were compared with the chi-squared test or Fisher's exact test as appropriate. Continuous variables were expressed as mean ± standard deviation (SD) and compared using the independent-samples t test or the nonparametric Wilcoxon rank sum test, depending on whether they were normally distributed. Univariate analyses with two-tailed p-values were conducted to examine the variables associated with mortality. p < 0.05 was considered statistically significant.
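For small samples such as 20 isolates, Fisher's exact test is often the appropriate choice for a 2x2 table. The analysis in this study was run in SPSS; the sketch below is only a standard-library illustration of the two-sided test, using the usual rule of summing the probabilities of all tables that are no more likely than the observed one. The counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Return the two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    probabilities = [prob(x) for x in range(low, high + 1)]
    # Small relative tolerance so that ties with the observed table are counted.
    p_value = sum(p for p in probabilities if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical 2x2 table: rows = risk factor present/absent,
# columns = died/survived (illustration only).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

For this toy table the four least likely of the five possible tables are counted, giving p = 34/70, well above the 0.05 threshold used in the study.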

Consent for Publication
Not applicable.

Availability of data and materials
All data generated or analyzed during this study are included in this article and its supplementary information files.

Competing interests
The authors declare no competing interests.

Funding
This work was partly supported by the National Natural Science Foundation of China (grant no. 81472012). The funds were mainly used for the purchase of reagents and for whole-genome sequencing in our study.

Authors' contributions
DXS designed and organized this study and revised the manuscript. CY performed most of the experiments, analyzed the sequences and wrote the manuscript. ZL collected strains and performed the antimicrobial susceptibility testing. SY analyzed the sequences. KY and XL assembled and annotated the DNA short reads. All authors have read and approved the final manuscript.