Genome dynamics mediated by repetitive and mobile elements in Xanthomonas citri pv. durantae

Xanthomonas is a highly evolved group of phytopathogenic bacteria that infects nearly 400 host plants. Vast genomic resources are available for the genus, but representation is heterogeneous across species and pathovars. The wealth of data is heavily biased towards a few Xanthomonas pathogens that infect economically important plants, while those reported to infect more diverse plants remain neglected. In the present study, we report the first complete genome sequence of Xanthomonas citri pv. durantae, which was reported to infect Duranta repens L. (golden dewdrop), a hedge plant of ornamental importance native to the American region. Phylogenomic analysis with its closest relatives placed it amongst X. citri pv. citri A* pathotype strains, and further comparative studies revealed several large unique genomic regions of chromosomal origin. The association of integrative and conjugative elements and prophages with these unique genomic regions suggests a role for the mobilome in genome dynamics. A large number of IS elements and transcription activator-like effector-encoding genes on each of the four plasmids indicate further scope for diversification in Xanthomonas.


INTRODUCTION
Xanthomonas is a complex genus of Gram-negative phytopathogenic bacteria comprising more than 34 species (LPSN, http://www.bacterio.net, accessed on 5 August 2022) [1] capable of infecting around 400 plants [2][3][4]. It is an extensively studied group of phytopathogens with a plethora of genomic data available in the databases. However, complete genomic information is lacking for Xanthomonas pathogens reported to infect diverse hosts. Xanthomonas citri pv. durantae (Xdur) infects Duranta repens L. (golden dewdrop), a flowering plant of the Verbenaceae family native to the American region and popularly grown as a hedge across the globe (www.cabi.org). Duranta repens is widespread in Mexico, the Caribbean, and most of South America, as well as in Florida, California, and Texas in the United States of America; it is valued as an ornamental garden plant and as a windbreak [5]. Disease symptoms involve the extensive formation of enlarged, angular spots with a brownish centre and slightly raised margins [6,7]. The first report of Xanthomonas infecting Duranta repens was from India in 1957 [6,7]. In 2013, an unidentified Xanthomonas strain was reported to cause symptoms comparable to Xdur infection on golden dewdrop in Florida, USA [8].
Previous studies based on marker genes and phylo-taxonogenomic analyses, including average nucleotide identity and digital DNA-DNA hybridization values, suggested that Xanthomonas citri pv. citri (Xcc), which causes citrus bacterial canker (CBC), and Xanthomonas citri pv. durantae (Xdur) are closely related [9,10], and that Xdur is one of the constituent pathovars of X. citri [11,12]. As these earlier studies were based on draft genome sequences, mechanistic details of genome dynamics in the evolution and variation of closely related pathovars were lacking. Short-read assemblies make it challenging to study genes of a repetitive nature, such as transcription activator-like effectors (TALEs), which are crucial determinants of pathogenicity [13], and insertion sequence (IS) elements, which are small mobile genetic elements dispersed over a genome and responsible for genome plasticity as well as genomic rearrangements [14]. The emergence of third-generation sequencing technologies has created immense opportunities for investigating the role of mobile genetic elements and repetitive elements in host diversification with much more precision [15].
The present study reports the first complete genome-based investigation of Xanthomonas citri pv. durantae strain LMG696, which is available in the culture collections and the NCBI database (https://www.ncbi.nlm.nih.gov/assembly/GCF_019201325.1/) as X. campestris pv. durantae LMG696. LMG696 is the reference strain of the pathovar. Comprehensive comparisons with the complete genomes of X. citri pv. citri strains identified five large dynamic regions in the chromosome that are associated with integrative and conjugative elements and prophages. We also observed variations in the IS element and TALE repertoires during the diversification of genomes. These findings highlight the importance of complete genome-based studies of Xanthomonas strains reported to infect hosts of lesser economic importance, and of investigating genome dynamics in addition to phylogenomics.

IS elements and transcription activator-like effectors (TALEs)
IS elements were identified using the ISsaga 2.0 web server [30]. The TALEs in the Xcc strains and Xdur LMG696 were identified with the 'TALE Prediction' tool of AnnoTALE software version 1.5 [31]. The identified TALEs were then assigned to classes with the 'TALE class assignment' tool of AnnoTALE. A neighbor-joining tree of the central repeat regions, including RVDs of the TALEs, was constructed using the DisTAL v1.1 module of the QueTAL suite [32]. DisTAL constructs a phylogeny based on an alignment of the central repeat regions of TALEs.
The plasmids carry genes encoding partition and replication proteins such as parA, parB, conjugal transfer protein-encoding genes such as traI, traG, trbM, trbG, trbF, trbL, proteins related to toxin/anti-toxin systems, mobilization proteins, transposases and a large number of hypothetical proteins. The type IV secretion system reported by Bansal and co-workers on contig 29 of the draft genome of LMG696 is located on plasmid pLMG696-1 (Fig. 1b) [9]. As reported earlier [12], this type IV secretion system is also associated with the plasmids of X. citri pv. citri strain TX160149, X. campestris pv. campestris CN18 and X. campestris pv. campestris CN03. Apart from this, pLMG696-2 also has a type IV secretion system cluster, which shares homology to X. citri pv. citri strain 29-1 and X. citri pv. citri strain LH276 (Fig. 1b). This indicates that plasmids carrying distinct type IV secretion systems might be playing a role in the evolution and adaptation of Xdur pathovar.
Fig. 1. a) Circular map of X. citri pv. durantae LMG696. Starting from the outermost ring to centre: CDS genes on the forward strand coloured according to their COG classification, CDS genes and RNAs on the forward strand, CDS genes and RNAs on the reverse strand, CDS genes on the reverse strand coloured according to their COG classification, GC content, and GC skew. b) Pictorial representation of Xdur plasmids; pLMG696-1, pLMG696-2, pLMG696-3, and pLMG696-4. The outermost ring represents protein-coding genes; clockwise and anti-clockwise arrows indicate the forward and reverse orientation of the genes, respectively. Two innermost rings represent GC content (black) and +/-GC skew (purple and green). The circular scale gives genome coordinates. Type IV secretion system and TALEs are indicated with green and turquoise colours, respectively.

Phylogenetic and genome comparison analysis of Xdur LMG696 with its closest relatives
The phylogeny constructed using the complete genomes of X. citri pv. citri strains along with Xdur LMG696 was consistent with previous studies [33], in which phylogenetic analysis revealed three main groups correlated with the three X. citri pv. citri pathotypes: A, A* and Aw. Interestingly, Xdur LMG696 clustered within the A* group (Fig. 2). The three pathotypes of X. citri pv. citri differ in their host range and in the host plant defence responses towards them. The Xcc A pathotype has a broad host range, infecting almost all citrus plants, while A* and Aw are restricted to key lime (Citrus aurantifolia) and alemow (Citrus macrophylla) [12,33]. The Aw pathotype differs from the A* pathotype in that it elicits a hypersensitive response in grapefruit and sweet orange owing to the presence of the XopAG/avrGf1 gene [12,33].
Fig. 2. A maximum-likelihood phylogenetic tree of X. citri pv. durantae and 35 X. citri pv. citri strains based on core gene alignment constructed using PhyML [24]. Phylogenetic grouping is colour-coded according to Xcc pathotypes along with their NCBI accession numbers: yellow, A* pathotype; green, Aw pathotype; pink, A pathotype. Xdur LMG696 is represented in grey colour. X. citri pv. glycines CFBP2526 was used as an outgroup, here depicted in blue colour.
Genome comparison of Xdur LMG696 with the complete genome sequences of these 35 X. citri pv. citri strains revealed five regions that are unique to Xdur LMG696, termed Xdur large dynamic regions (XDLDRs) (Fig. 3). XDLDR1, XDLDR4, and XDLDR5 were absent in A pathotype strains, while all strains of the A* and Aw pathotypes have XDLDR4 and XDLDR5, with some of the Aw pathotype strains lacking XDLDR1 (Fig. 3). Further, PHASTER [29] analysis revealed the presence of a prophage in the XDLDR5 region. XDLDR2 was absent in both the A and Aw pathotypes, while a significant part of XDLDR3 was absent in all Xcc strains (Fig. 3). On further analysis, XDLDR2 and XDLDR3 were found to harbour ICE-related genes. ICEfinder [28] analysis revealed the presence of two ICE regions in the Xdur LMG696 genome. Interestingly, both of these ICEs mapped to the XDLDRs. One of the ICEs (coordinates 2 483 602-2 550 615) was part of XDLDR2 and the other (coordinates 2 624 779-2 743 795) was part of XDLDR3. These regions carried type IV secretion system-related genes such as virB6, virB4, traI, and traD. Interestingly, XDLDR3 was found to harbour a gene encoding a heavy metal translocating P-type ATPase (Xdur_12075) and multidrug efflux pump-related genes (Xdur_12095, Xdur_12100, Xdur_12105) (Table S1, available in the online version of this article). These five regions carry a large number of IS elements and hypothetical genes.
Apart from these, genes encoding AlpA family, LysR family, TetR family, and helix-turn-helix transcriptional regulators, DNA repair system proteins, DNA replication proteins, methyltransferases, ABC transporters, and proteins with domains of unknown function were common in the XDLDRs (Table S1).
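The mapping of ICEs to XDLDRs described above amounts to an interval-containment check on chromosome coordinates. The following is a minimal sketch of that check, not the procedure used in the study. The ICE coordinates are those quoted in the text; the XDLDR boundaries are hypothetical placeholders (the text does not give them), and coordinates are assumed to be 1-based and inclusive.

```python
from typing import Dict, List, NamedTuple


class Region(NamedTuple):
    """A named chromosomal interval (assumed 1-based, inclusive)."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def contains(outer: Region, inner: Region) -> bool:
    """Return True if `inner` lies entirely within `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end


def assign_to_regions(items: List[Region], regions: List[Region]) -> Dict[str, List[str]]:
    """Map each item name to the names of the regions that fully contain it."""
    return {
        item.name: [r.name for r in regions if contains(r, item)]
        for item in items
    }


# ICE coordinates as quoted in the text; the names ICE_1/ICE_2 are illustrative.
ICES = [
    Region("ICE_1", 2_483_602, 2_550_615),
    Region("ICE_2", 2_624_779, 2_743_795),
]

# HYPOTHETICAL XDLDR boundaries, chosen only to make the example runnable.
XDLDRS = [
    Region("XDLDR2", 2_480_000, 2_560_000),
    Region("XDLDR3", 2_620_000, 2_750_000),
]

if __name__ == "__main__":
    for ice in ICES:
        print(f"{ice.name}: {ice.length} bp")
    print(assign_to_regions(ICES, XDLDRS))
```

Under the inclusive-coordinate assumption, the two ICEs span roughly 67 kb and 119 kb. Real XDLDR boundaries would have to come from the genome comparison itself.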

Comparison of repetitive elements
Fig. 3. Comparative genome map of Xcc strains against Xdur LMG696 constructed using BRIG [27]. Concentric rings from the inside out represent shared homology between Xdur LMG696 (reference genome) and the Xcc strains: rings 1-6 (yellow), Xcc A* pathotype; rings 7-14 (green), Xcc Aw pathotype; rings 15-35 (red), Xcc A pathotype. White areas indicate that a genome lacks the corresponding region (XDLDR1-XDLDR5, highlighted in black boxes). The two innermost rings indicate GC content and GC skew, respectively.
Fig. 4. [...] [32]. The outermost ring represents the different Xcc pathotypes along with Xdur LMG696 (black colour), and the innermost ring represents TALEs assigned to various classes, with colour-coding given on the right-hand side of the panel.
The chromosomal sequence also encodes a large number of IS elements, indicating their role in the genome evolution of this pathogen. Xdur LMG696 harbours 95 IS elements, which falls within the range observed for A* pathotype strains, i.e. 75 to 115. In contrast, A pathotype strains carry far fewer IS elements, in the range of 45 to 51, and the Aw pathotype has a variable number of IS elements, from 65 to as many as 105 (Fig. 4a). Most IS elements fall into three IS element families: IS3_ssgr_IS51, IS3_ssgr_IS407, and IS4_ssgr_IS10. The IS1595_ssgr_IS1595 and IS5_ssgr_IS427 families were present only in the A* and Aw pathotypes, while IS21 was restricted to the Aw pathotype group. ISKra4_ssgr_ISAzba1 family IS elements were present only in A* pathotype strains and Xdur LMG696. Apart from these, IS elements from the ISL3, IS5_ssgr_IS5, ISNCY, IS1595_ssgr_ISNha5, IS4_is10, and Tn3 families were also present in varying numbers in all the strains (Fig. 4a). As mentioned above, XDLDRs were associated with IS elements; XDLDR3 in particular has a large number of IS elements belonging to various families: IS3_ssgr_IS407, ISNCY, IS3_ssgr_IS51, IS5_ssgr_IS5, IS1595_ssgr_IS1595, and IS4_ssgr_IS10. IS elements were distributed throughout the Xdur LMG696 genome in large numbers, even outside the XDLDRs.
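The comparison of IS element counts above can be expressed as a simple range lookup. The sketch below is illustrative only: the ranges and the Xdur count are the figures quoted in the text, while the function and variable names are hypothetical.

```python
from typing import Dict, List, Tuple

# IS element count ranges (min, max) per Xcc pathotype, as quoted in the text.
IS_COUNT_RANGES: Dict[str, Tuple[int, int]] = {
    "A*": (75, 115),
    "A": (45, 51),
    "Aw": (65, 105),
}

# IS element count reported for Xdur LMG696.
XDUR_IS_COUNT = 95


def compatible_pathotypes(count: int, ranges: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return the pathotypes whose reported count range includes `count`."""
    return [name for name, (low, high) in ranges.items() if low <= count <= high]


if __name__ == "__main__":
    print(compatible_pathotypes(XDUR_IS_COUNT, IS_COUNT_RANGES))
```

With these figures, a count of 95 lies within both the A* and Aw ranges and well above the A range. The IS count is therefore consistent with the A* placement but does not establish it on its own; that placement rests on the core-genome phylogeny.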
Transcription activator-like effectors (TALEs) are key virulence factors of the Xanthomonas genus that manipulate host cell machinery for the pathogen's benefit. TALEs contain tandem repeats of 33 to 34 amino acids and are secreted by the type III secretion system into the host, where they act as transcription factors by binding to promoter elements through their repeat variable diresidues (RVDs), thereby regulating the expression of target genes [13,34]. Because they are repetitive, TALEs are often missed by short-read sequencing technologies. However, the emergence of long-read third-generation sequencing technologies generating complete genome data has made it easier to study repetitive regions in depth. TALE analysis of Xdur revealed five TALEs distributed across four classes (TalGD, TalGG, TalIQ, and TalHW), all encoded on the plasmids. Phylogenetic analysis of the TALEs revealed different repertoires of TALE classes in the A*, A, and Aw pathotypes (Fig. 4b). Just as Xdur formed a clade with the A* pathotype strains in the genome-based phylogeny, Xdur TALEs also grouped with TALEs of A* pathotype strains: one TalGD and the TalGG clustered with the TalIT and TalGG classes of Xcc DAR73910, another TalGD clustered with TalII of Xcc DAR73909, and TalIQ clustered with TALEs of Xcc DAR73889 and Xcc DAR73886. The only TALE class not shared with these strains was TalHW, which was confined to Xdur. TalHW was carried on plasmid pLMG696-3 along with another TALE assigned to the TalGD class.
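To make the notion of an RVD concrete, the sketch below reads residues 12 and 13 from each repeat of a TALE central repeat region, which is where the RVD sits in a canonical 34-residue repeat. It is not the AnnoTALE or DisTAL implementation. The repeat sequences are toy examples based on the canonical TALE repeat rather than sequences from Xdur LMG696, and repeats of any other length are rejected rather than guessed at, since shorter repeats would need alignment-aware handling.

```python
from typing import List

FULL_REPEAT_LENGTH = 34  # canonical repeat length handled by this sketch
RVD_START = 11           # zero-based index of residue 12
RVD_END = 13             # exclusive end; the slice covers residues 12 and 13


def extract_rvd(repeat: str) -> str:
    """Return the RVD (residues 12-13) of a single 34-residue TALE repeat."""
    repeat = repeat.strip().upper()
    if len(repeat) != FULL_REPEAT_LENGTH:
        raise ValueError(
            f"expected a {FULL_REPEAT_LENGTH}-residue repeat, got {len(repeat)}"
        )
    return repeat[RVD_START:RVD_END]


def rvd_string(repeats: List[str]) -> str:
    """Join the RVDs of consecutive repeats into a hyphen-separated string."""
    return "-".join(extract_rvd(r) for r in repeats)


# Toy repeats for illustration only (not taken from Xdur LMG696).
TOY_REPEATS = [
    "LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG",
    "LTPEQVVAIASNIGGKQALETVQRLLPVLCQAHG",
    "LTPEQVVAIASNGGGKQALETVQRLLPVLCQAHG",
]

if __name__ == "__main__":
    print(rvd_string(TOY_REPEATS))
```

RVD strings of this kind are what make TALEs comparable across strains, since two TALEs with similar repeat arrays are expected to bind similar promoter elements.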

CONCLUSION
Many reports have focused on the most successful pathogens of the genus Xanthomonas. At the same time, some pathogens that infect more diverse plants, reported from the middle of the last century, have been neglected because of their lower economic importance [7]. The present study reports the first high-quality complete genome sequence of a Xanthomonas pathovar that was reported from a diseased ornamental hedge plant [6]. Previous studies have hinted at its close relationship with X. citri, a citrus plant pathogen [9,10]. Xdur LMG696 has four plasmids, two of which carry T4SS clusters similar to those found on plasmids of previously reported Xcc strains. Further, phylogenetic analysis of Xdur LMG696 with Xcc genomes revealed that Xdur LMG696 groups amongst strains of the Xcc A* pathotype.
Comparative genomic analysis revealed regions unique to Xdur LMG696 that carry genes related to ICEs and prophages and are associated with a large number of IS elements. The extensive prevalence of IS elements indicates chromosomal plasticity, and the association of unique regions with ICEs and phages suggests a role for horizontal gene transfer events. Mobile genetic elements (MGEs) such as plasmids, IS elements, ICEs and prophages are involved in genomic rearrangements and inter-strain variation, further contributing to the constant emergence of variable strains. Our report also focused on TALEs, which are considered important pathogenicity determinants in the genus Xanthomonas. TALE analysis revealed the presence of a unique TALE class in Xdur LMG696. These unique regions and TALEs might be good targets for molecular and pathogenicity studies.