Pathogenomic Analysis of a Novel Extensively Drug-Resistant Citrobacter freundii Isolate Carrying a blaNDM-1 Carbapenemase in South Africa

Pathogenomic analysis was performed on a novel carbapenem-resistant Citrobacter freundii isolate (H2730R) from a rectal swab of an adult male patient admitted to a tertiary hospital, Durban, South Africa. H2730R was identified using selective media and API 20e kit. Confirmatory identification and antibiotic susceptibility testing were performed using the VITEK II. H2730R was whole-genome sequenced on the Illumina MiSeq platform. H2730R was resistant to all tested antibiotics except tigecycline and was defined as ST498 by the C. freundii multilocus sequence typing (MLST) database. The estimated pathogenic potential predicted a higher probability (Pscore ≈ 0.875), supporting H2730R as a human pathogen. H2730R harbored 25 putative acquired resistance genes, 4 plasmid replicons, 4 intact prophages, a class 1 integron (IntI1), 2 predominant insertion sequences (IS3 and IS5), numerous efflux genes, and virulome. BLASTn analysis of the blaNDM-1 encoding contig (00022) and its flanking sequences revealed the blaNDM-1 was located on a plasmid similar to the multireplicon p18-43_01 plasmid reported for the spread of carbapenem resistance in South Africa. Phylogenomic analysis showed clustering of H2730R with CF003/CF004 strains in the same clade, suggesting a possible association between C. freundii strains/clones. Acquiring the p18-43_01 plasmid containing blaNDM-1, the diversity, and complex resistome, virulome, and mobilome of this pathogen makes its incidence very worrying regarding mobilized resistance. This study presents the background genomic information for future surveillance and tracking of the spread of carbapenem-resistant Enterobacteriaceae in South Africa.


Introduction
Citrobacter freundii (C. freundii) is a rod-shaped, Gram-negative, aerobic member of the Enterobacterales family that is an intestinal commensal in humans and animals [1,2]. It is a known cause of nosocomial infections associated with the biliary tract, respiratory tract, urinary system, and central nervous system, including meningitis, neonatal sepsis, and infectious diarrhea [3,4]. C. freundii

General Genomic Features of H2730R
The CG Viewer Server was used for the genomic visualization of the isolate ( Figure A1; Appendix A). The C. freundii (H2730R) genomic features are shown in Table A1. The size, GC Pathogens 2020, 9,89 3 of 13 content, number of contigs, N50, and L50 of the H2730R genome were 5.29 Mbp, 51.80%, 58, 518368, and 4, respectively. Annotation with RAST and PGAP resulted in 5006 protein-coding genes, 70 RNAs, and 12 tRNAs (Table A1). The protein-coding genes (CDSs) and non-protein-coding genes were assigned to 27 clusters of orthologous groups (COG) and functional categories (Table A1).

WGS-Based Confirmation and Multilocus Sequence Typing (MLST)
The identification of H2730R isolate was confirmed with generated genomic data via the Global Platform for Genomic Surveillance (Pathogenwatch). In silico determination of the classical sequence type from WGS data using the C. freundii MLST scheme resulted in a previously undescribed MLST comprising a new allelic combination for arca_5, aspc_16, clpx_14, dnag_54, fadd_103, lysp_5, mdh_15. Allele sequences were submitted for curation and assigned to the new ST 498 by the C. freundii PubMLST database. BURST (Based Upon Related Sequence Types) analysis identified the novel ST498 as a satellite clone (more distantly related strain) with no single-locus variant (SLV) or double-locus variant (DLV) of globally curated C. freundii STs. Of note, the new ST had different allelic profiles from the five deposited C. freundii genomes from South Africa.

WGS Analysis of Resistance Genes and Genetic Support
WGS analysis via the different databases revealed 25 putative acquired antibiotic resistance genes responsible for resistance to the various antibiotics found in the isolate, which corroborated the phenotypic profile (Table 2). Chromosomal mutation in GyrA [S83I] and the plasmid-mediated quinolone resistance genes (aac (6')-Ib-cr and qnrB1) were found. Most of the resistance genes were plasmid-borne (Table 2). Resistance to carbapenems was linked to the New Delhi metallo-β-lactamase-1 (bla NDM-1 ) belonging to the sub-class B1. The BLASTn analysis of the bla NDM-1 carbapenemase containing contig (00022) and its flanking sequences matched the 212.3 Kbp multireplicon plasmid p18-43_01 (GenBank accession number CP023554.1). The PlasmidFinder v1.3 identified four plasmid replicons (Inc A/C2, Inc FIB(pB171), Inc FII(Yp), and Inc Q1). The PHAge Search Tool (PHAST) detected four intact phages (Escher_HK639, Entero_c_1, Salmon_RE_2010, and Salmon_SJ46). The insertion sequences in the genomes belonging to the IS3 family (IS2) and IS5 family (IS903) were predicted by BLAST searches against contigs on the ISFinder database. In silico analysis identified the class 1 integron-integrase gene (IntI1) in H2730R. The relationship between the IntI1 integrase in H2730R and other closely related IntI1 integrases in diverse species obtained from the RAST Server was shown as a phylogenetic tree using the MAFFT online tool (Figure 1). The Intl1 of the H2730R strain had the closest similarity with the Salmonella enterica subsp. enterica serovar Choleraesuis plasmid pOu7519, as depicted in Figure 1.

Pathogenicity, Defence Systems Mechanisms, and Virulome Predictions
H2730R had a 0.875 probability of being pathogenic to humans and was found to match with 110 pathogenic families. All the 110 pathogenic families were linked to members of the Enterobacterales, of which Citrobacter koseri ATCC BAA-895 was the organism with the highest possible pathogen linkage. The CRISPRCasFinder predicted two clustered, regularly interspaced, short palindromic repeat (CRISPR) arrays with no Cas systems. Of note, the in silico analysis revealed a Type II restriction-modification (R-M) defense system consisting of Eco128I (restriction enzyme) and M. EcoRII (methyltransferase)]. The whole-genome virulome analysis predicted virulenceencoded pathogenesis-associated proteins predominantly made up of adherence determinants, regulation, toxin, motility, antiphagoctyosis, invasion, and biofilm formation (Table 3). Furthermore, 10 efflux pump systems known to be involved in drug resistance were identified ( Table 2). They belonged to three main efflux systems, namely: ABC, ATP-binding cassette; MFS, major facilitator superfamily; and RND, resistance-nodulation-division (Table 2).

Phylogenomic Analysis and Metadata Insights of Reported Citrobacter Freundii Isolates from South Africa
The phylogenetic relationship and epidemiological distribution of all deposited C. freundii genomes on GenBank from South Africa are depicted in Figure 2 in which isolates of the same cluster are highlighted in the same color. The tree analysis, coupled with metadata, revealed two clusters (A and B) of C. freundii with insights into the diversity of clones in South Africa (Table A2). Of note, the H2730R was related to the CF003 and CF004 strains (Cluster B). All the isolates were from human hosts.

Phylogenomic Analysis and Metadata Insights of Reported Citrobacter Freundii Isolates from South Africa
The phylogenetic relationship and epidemiological distribution of all deposited C. freundii genomes on GenBank from South Africa are depicted in Figure 2 in which isolates of the same cluster are highlighted in the same color. The tree analysis, coupled with metadata, revealed two clusters (A and B) of C. freundii with insights into the diversity of clones in South Africa (Table A2). Of note, the H2730R was related to the CF003 and CF004 strains (Cluster B). All the isolates were from human hosts.

Discussion
Carbapenem resistance in Enterobacteriaceae has become a major public health threat that requires urgent attention. In this study, we use whole-genome sequencing (WGS) and bioinformatics analysis to describe the emergence of a novel carbapenem-resistant sequence type (ST498, satellite clone) of C. freundii isolated from the rectal swab of an adult male patient admitted to a tertiary healthcare facility in Durban, South Africa, harboring a diverse resistome, virulome, and mobilome.
H2730R was categorized as XDR due to its phenotypic resistance to all tested antibiotics except tigecycline [14] ( Table 1). The carbapenem phenotypes of the strain, as identified on selective chromogenic screening medium, were confirmed by the presence of carbapenemase (bla NDM-1 ) ( Tables 1 and 2). The 100% concordance between phenotypes and predicted resistome annotation data reiterated the reliability of WGS for the accurate prediction of antibiotic resistance [15,16]. The C. freundii H2730R genome harbored a total of 25 putative resistance genes, which were predominantly plasmid-borne (Table 2), indicating an acquisition and the potential for spread via horizontal resistome transfer between H2730R and other species. Resistance to ciprofloxacin was mediated by GyrA [S83I] chromosomal mutation, plasmid-mediated quinolone resistance genes (PMQR) (aac(6')-Ib-cr and qnrB1), and efflux genes (  [15] reported that high-level fluroquinolone resistance was mediated by efflux, PMQR genes, and gyrA mutations in Enterobacterales, including Citrobacter freundii, which supported our results. More so, efflux genes belonging to three main efflux pump systems ( Table 2) reported to play vital roles in antibiotic resistance in Enterobacterales were also identified [18,19]. However, the mere presence of these efflux genes does not directly implicate resistance, and hence further studies on their expression levels will be needed to ascertain the possibility of efflux pump hyperexpression [18]. The diverse resistance mechanisms (plasmid-borne, chromosomal, and efflux) involved in H2730R resistance indicate the various processes the organism expresses to survive the effects of antibiotics and other biocides ( Table 2).
The National Center for Biotechnology Information (NCBI) microbial nucleotide BLAST search of the bla NDM-1 carbapenemase containing contig (00022) and its flanking sequences in the isolate revealed the bla NDM-1 was located on a plasmid similar to the 212.  [20] and public healthcare sector [21] in Durban, South Africa. This implicates a possible bla NDM-1 acquisition as well as nosocomial spread and development of an XDR genetic lineage via this local plasmid across sectors in the KZN province, which is very worrisome. The four plasmid replicon genes (Inc A/C2, Inc FIB(pB171), Inc FII(Yp), and Inc Q1) identified via the PlasmidFinder supported the multireplicon plasmid nature of the p18-43_01. This global spread of carbapenem resistance through the acquisition of plasmid encoding carbapenemase has been highly documented in the literature [22][23][24].
Moreover, H2730R contained two insertion sequences (IS3 family and IS5 family) whose genetic transposition could help in the plasticity and concomitant adaptability of H2730R phenotypic traits, including catabolism, pathogenicity, resistance to antibacterial agents, and virulence [25]. The IS2 (IS3 family) has been reported to cause erythromycin and fluoroquinolone resistance through the formation of hybrid promoters [25]. Additionally, the IS930 (IS5 family) has been implicated in carbapenem and metal ion (silver) resistance in Enterobacter aerogenes [26] and E. coli, respectively. Further analysis of H2730R revealed prophage with possible origination from diverse sources (E. coli and Salmonella) [27]. For instance, the two prophages Salmon_RE_2010 and Salmon_SJ46 were initially identified in Salmonella enterica subsp. enterica serovar Enteritidis and Salmonella enterica subsp. enterica serovar Indiana, respectively [28]. This supports a possible exchange of genetic traits via horizontal transfer between Enterobacterales. Class 1 integrons, reported to be associated with multiple classes of antibiotic resistome, including β-lactams, quinolones, and aminoglycosides worldwide, were found [29]. However, the full mobile gene cassettes and integron genetic structure could not be determined as they were truncated into different contigs during the assembly process.
The pathogenic potential (P score ) is a machine learning algorithm used for the in silico prediction of the possibility of a strain being a human pathogen with the probability ranging from 0 to 1. This estimation of the pathogenic potential using trained algorithms to differentiate between pathogenic or commensal strains predicted a higher probability (P score ≈0.875) suggesting H2730R as a potential human pathogen. Further, in silico analysis of H2730R predicted the presence of two CRISPR arrays and a Type II restriction-modification (R-M) defense system which offers bacterial protection against viral attacks and thus increases its survival and adaptability in the microbial landscape [30,31].
Moreover, the genomic virulome analysis revealed a battery of different virulence encoded proteins from diverse sources. These virulence determinants were predominantly made up of adherence determinants which aid in the complement-independent attachment to host mammalian cells [32]. H2730R possessed toxins to induce cell membrane damage, trigger cytokine formation, and reduce or kill neutrophils [33]. Interestingly, the strain also harbored the ibeB and pgaC genes, which contribute to invasion, biofilm formation, and adhesion to eukaryotic host cells. Additionally, H2730R harbored the two-component regulation system (PhoPQ) activated by low divalent cations magnesium (Mg 2+ ) and calcium (Ca 2+ ) levels and required for intracellular survival, cationic antimicrobial peptides (CAMPs) resistance, and stimulation of cytokine secretion [34]. The higher virulence factor homology existing between H2730R and other Gram-negative species (Table 3) suggests that they were possibly obtained horizontally, necessitating further studies on transconjugation and/or transcriptomics to ascertain this assertion. The detection of the virulome can inform vaccine development by delineating possible bacterial protein targets.
Phylogenomic analysis, coupled with metadata, showed clustering of H2730R (Durban) with CF003/CF004 strains in the same cluster (B) suggesting a possible association of C. freundii strains/clones. However, not much inference could be drawn from the tree analysis due to the small number of deposited C. freundii genomes from South Africa. It is thus recommended that more studies be carried out in all the sectors (human, animal, and environment) to monitor the emergence and spread of XDR C. freundii using high throughput technologies such as whole-genome sequencing (WGS). Moreover, there was limited data on the clinical information, history and/or clinical examination of the patient, and all deposited C. freundii genomes obtained from the human host, which could be very useful in comparative phylogenomic analysis. Ultimately, a combination of demographics, clinical information, and WGS data coupled with graphical analysis should be applied to offer insights into the spread of pathogens and increase confidence during molecular epidemiological investigations [35].

Ethical Approval
This study was approved by the Biomedical Research Ethics Committee (approval no: BE599/16, a substudy of BCA444/16), College of Health Sciences, University of KwaZulu-Natal (UKZN).

DNA Purification, Genome Sequencing, and Preprocessing of Sequence Data
The isolate was streaked onto a nutrient agar (NA) (Sigma-Aldrich, St. Louis, USA) plate and incubated at 37 • C for 24 h. Following incubation, genomic DNA was extracted from a visibly pure culture using the Quick-DNA™ Fungal/Bacterial Miniprep Kit (Zymogen Research, USA, cat. no. D6005), quantified by Nanodrop 8000 (Thermo Scientific, Waltham, MA) and Qubit, and verified by agarose gel electrophoresis. A paired-end library (2 × 300 bp) was prepared using Illumina Nextera XT DNA Sample Preparation Kit and sequenced on a MiSeq machine (Illumina, USA). The generated sequenced reads were quality assessed and trimmed using the Skesa assembler (version 2.3). Default parameters were used for all software unless otherwise specified. The CheckM tool [37] was used to verify that the sequence reads were not from mixed species using lineage-specific marker sets from other genetically well-characterized, closely related C. freundii strains.

Genome Visualization and Annotation
The raw reads were de novo assembled using the Skesa (version 2.3) assembler [38]. The genomes of the strains were visualized using the CG Viewer Server ( Figure A1; Appendix A). The National Center for Biotechnology Information (NCBI) Prokaryotic Genome Annotation Pipeline (PGAP; version 4.3) [39] and Rapid Annotation using Subsystem Technology (RAST) Server (version 2.0) [40] were used for annotation of the size, GC content, number of contigs, N50, L50, average coverage, number of RNAs, and protein-coding sequences of the isolate.

WGS-Based Confirmation and Molecular Typing
The generated contigs from the WGS data were used to confirm the identity of the isolate on the pathogen watch platform [41]. The assembled genome was submitted to the C. freundii MLST Pathogens 2020, 9,89 8 of 13 database, which predicted the allelic profiles of the seven housekeeping genes to assign the new sequence type [42]. An eBURST [43] algorithmic analysis was performed in the MLST database (https://pubmlst.org/cfreundii/) to ascertain whether the novel sequence type (ST) was a single-locus variant (SLV) or double-locus variant (DLV) or satellite (SAT, more distantly related strain) of known STs. The allelic profiles and STs of all deposited C. freundii genomes from South Africa in the PATRIC database were also predicted using the MLST database.

Genomic Identification of the Antibiotic Resistome and Mobile Genetic Elements (MGEs)
The bacterial analysis pipeline of GoSeqIt via ResFinder [44], Antibiotic Resistance Gene-Annotation database (ARG-ANNOT) [45], and the comprehensive antibiotic resistance database (CARD) [46] tools were also used to annotate and identify antibiotic resistance genes. To determine if the resistance genes and their associated MGEs were plasmid-borne or chromosomal, the contigs bearing these resistance genes were BLASTed on BLASTn to determine if the closest nucleotide homologies were chromosomal or plasmids. Plasmid replicons were predicted through PlasmidFinder [47]. The PHAge Search Tool [48] server was used for the identification, annotation, and visualization of prophage sequences. Phage regions were extracted and analyzed in the NCBI nucleotide BLASTn service to confirm their identity. Insertion sequences and transposase in the genomes were predicted by blasting contigs on the ISFinder database [49] to find the common insertion sequences. Integrons in the genomes predicted by PGAP and RAST subsystems were blasted on the INTEGRALL database to find the actual integrons. Additionally, using the nucleotide sequences, the MAFFT multiple alignment program [50] was used to show the phylogenetic relationship between predicted and other closely related integrases in diverse species obtained from the RAST Server.

Pathogenicity, Defence Systems, and Virulome Predictions
PathogenFinder [51] was used to predict pathogenicity towards human hosts. The CRISPRCasFinder was used to identify the putative CRISPR loci and Cas cluster in the draft genomes [52]. Restriction Modification Finder predicted the R-M system in the isolate [53]. Virulence determinants (sequences and functions) corresponding to different major bacterial virulence factors associated with C. freundii were searched for virulence factors with the pathogenic bacteria database, VFanalyzer [54], and validated by blasting assembled genomes to a pseudomolecule generated by concatenating a set of target genes using the NCBI in-house BLASTN tool.

Phylogenomic Analyses of C. freundii Isolates from South Africa
The genome sequences of all five available genomes were downloaded from GenBank and PATRIC for genomic comparison and phylogenomic analysis via CSI Phylogeny-1.4 [55], an online service which identifies SNPs from WGS data, filters and validates the SNP positions, and then infers phylogeny based on concatenated SNP profiles [55]. A bootstrapped indicator with 100 replicates was applied to identify recombined regions and provide the phylogenetic accuracy in groups with little homoplasy. Figtree was used to edit and visualize the phylogenetic tree. The phylogeny was visualized alongside annotations for isolate demographics and WGS in silico molecular typing metadata using Phandango [56].

Data Availability
This whole-genome shotgun project has been deposited in DDBJ/ENA/GenBank under the accession no. VWTQ00000000. The version described in this paper is version VWTQ01000000.