Genome Identiﬁcation and Characterization of WRKY Transcription Factor Gene Family in Mandarin ( Citrus reticulata )

: WRKY proteins are an important group of transcription factors (TFs) gene family and were identiﬁed primarily in plants. WRKY TFs play vital roles in modulating gene expression when plants face detrimental effects due to the environment. In the current study, we focused on using the mandarin citrus ( Citrus reticulata ) genome to understand the impact of the WRKY gene family on the extraction of alleles mining in mandarins. The mining of the C. reticulata genome identiﬁed 46 CrWRKY genes that were classiﬁed into three main groups (G1, G2, and G3) further with ﬁve subclasses (IIa, IIb, IIc, Iid, and IIe) in the G2 group, and all were presented on 29 scaffolds representing numerous segmental duplications of 100% events established. Multiple sequence analysis predicted the presence of the “WRKYGQK” domain and metal-chelating zinc-ﬁnger motif C 2 H 2 in 45 genes, while the “WRKYGQK” domain was replaced with “WRKYGKK” only in CrWRKY20. The comparative relationship of CrWRKY with other plant species using dual synteny analysis revealed that the divergence between C. reticulata and C. grandis occurred after the evolutionary divergence of C. clementine , C. sinensis , C. medica , and C. ichangensis . The possible functions of the CrWRKY genes in mitigating environmental effects were predicted using cis-regulatory elements analysis and in silico RNAseq analysis, for the development of plants. These results provide a robust platform and absence of knowledge for the functional identiﬁcation from key genes of CrWRKY genes in the mandarin for the possible use to improve key desirable agronomic and consumer-driven fruit quality traits in mandarins and related species.


Introduction
The plant, a sessile organism, faces many environmental stresses and evolves various mechanisms to mitigate these environmental fluctuations [1]. Gene expression regulation using various transcription factors plays a significant role in response to ever-changing internal and external stimuli [2]. WRKY was identified as a transcription factor gene family in plants and is vital to regulate different pathways against many abiotic and biotic stresses [3]. The first gene, SPF1 containing the WRKY domain, was identified in 1994 from sweet potatoes [4]. The WRKY name is derived from the four conserved amino acids present in the WRKY domain. WRKY domain consists of 60 highly conserved amino  66 68 104 [25] Moss (Physcomitrella patens) 38 [25] Grape-vine plant VvWRKY1 transgenic tobacco (Nicotianatabacum) plants were susceptible to a fungus variety. Still, the exhibited-ectopia expression of the grape-vine VvWRKY2 increased the resistance of the necro-trophic fungi Pythium, B. cinerea, and Alternariatenuis resistance. CaWRKY1 of chili pepper (Capsicum annuum), acts as a defense negative-regulator, viral-induction genes silencing of CaWRKY1 gene reduced Xanthomonas growth, showing that overexpression of the gene increased hypersensitivity cell death to tobacco mosaic virus and P. syringae. Mildew Locus A (MLA) provides isolate-specific resistance to barley powdery mildew (Blumeriagraminis). MLA is physically engaged with HvWRKY1 and HvWRKY2; it triggered the pathogen-associated molecular pattern molecules (PAMP) with their two repressors for basal defense, which work as a primary defense in the nucleus and disrupt WRKY repressor functions [25,26].
The genus Citrus includes more than 162 species belonging to the order Geraniales, family Rutaceae, and subfamily Aurantoideae. Mandarin (Citrus reticulata) is a primitive species and the most important citrus fruit crop worldwide. Recent studies indicate that farmers face several problems due to biotic and abiotic stresses such as diseases, drought, cold, and soil salinity resulting in a decrease in mandarin production. Traditional breeding methods have been used successfully over the years to improve Citrus; however, these methods are limited by slow growth, incompatibility, polyembryony, and parthenocarpy [27]. WRKY genes have been reported from C. sinensis (51), C. unshiu (1), and C. clementina (48). However, due to their essential role in the early response to pathogens and abiotic stresses, several WRKY genes were intensively studied in C. reticulata. The citrus WRKY family is potentially involved in several plant developmental processes and abiotic and biotic stress responses. Thus, members of this family could be suitable candidates for different citrus breeding and improvement programs [28].
This study used tactical bioinformatics tools to identify and characterize genes associated with the WRKYTFs in the C. reticulata genome. A deterministic strategy was used to identify member of the WRKY gene class in C. reticulata. WRKY genes were classified with their intron/exon distribution pattern, chromosome distribution, presence of conserved motifs, comparative phylogenetic analysis, cis-regulatory elements, enrichment analysis, transcriptome analysis, and following the expression inhibitory miRNA. Further, the genome-wide association studies of WRKY genes in C.reticulata provided a reference and convenience for practical analysis and cloning of key targeted genes.

Database Search, Classification, and Retrieval of Sequences
WRKY domain sequence [13], obtained from the Pfam database (PF03106) (http://pfam. xfam.org/family/PF03106 accessed on 17 July 2022), was used to identify and download WRKY TFs gene members in C. reticulata genome from the Citrus Genome Database (CGD) (https://www.citrusgenomedb.org/ accessed on 18 July 2022) [29] using the BLAST-P (Protein-Basic Local Alignment Search Technique) algorithm. The physical and chemical properties such as protein length (AA residues), molecular weight (MW), and theoretical isoelectric point (pI) were calculated using the online program ExPASy (https://web.expasy. org/protparam/ accessed on 20 July 2022) [30] while sub-cellular localization was found using WOLF PSORT (https://wolfpsort.hgc.jp/ accessed on 22 July 2022). The information about gene IDs, chromosomal positions, and genome sequences was retrieved from the Citrus Genome Database. The order of physical position was used to rename these WRKY encoding genes.

Analyzing the Structure of Genes
The coding sequence (CDS) and genomic sequence of CrWRKY genes were downloaded from Citrus Genome Database (CGD) (https://www.citrusgenomedb.org/citrus downloads/Citrus reticulata/C.reticulataHzau v1 accessed on 28 July 2022). Gene Structure Display Server (GSDS v2.0) software was used to visualize gene structure using these sequences (http://gsds.cbi.pku.edu.cn/ accessed on 31 July 2022) [31]. A phylogenetic tree was added to understand the gene association better.

Phylogenetic Analysis
To explore the evolutionary relationships among the WRKY genes, a neighbor-joining method-based phylogenetic tree was constructed using the molecular evolutionary genetic analysis (MEGA11) package [32]. For this purpose, the MUSCLE program performed the multiple sequence alignment of identified WRKY amino acid sequences with default parameters. The unrooted tree was generated using the neighbor-joining strategy, which summarizes the evolutionary distances between 97 members of the WRKY gene family. The tree was visualized and presented in circular form, and the closely related sequences were grouped in different clads.

Cis-Elements and Conserved Motifs Analysis
The promoter sequence of 1000 bp (base pair) upstream to the start codon was retrieved from the Citrus Genome Database to analyze cis-regulatory elements (CREs). Promoter sequences of CrWRKY genes were used to identify CREs from PlantCare Database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [33].
The Multiple EM for Motif Elicitation (MEME) tools (http://meme.nbcr.net/meme/) [34] were used to understand the motif pattern and structures with the maximum 10 numbers of Agriculture 2023, 13, 1182 4 of 21 motif sets and default value sets or other variables to analyze concluded peptide sequence of CrWRKY genes.

Analysis of Synteny and Duplication
Using the default configuration, duplicated genes event was analyzed through MC-ScanX (Multiple Collinearity Scan toolkit) [35]. Syntenic and dual syntenic maps were created using TBtools [36] to find the synteny link between the paralogous genes of C. reticulata and orthologous genes of CrWRKY in Arabidopsis, C. sinensis, C. Clementina, C. grandis, C. medica, and C. ichangensis.
TBtools software [36] was used to calculate Ka/Ks substitution values from the aligned protein sequences of CrWRKY genes. These values were used to calculate the time of divergence (T) and the molecular evolution rate of every gene pair. The formula T = Ks/2R was used to calculate the divergence time equation, where R of C. reticulata is 1.7 × 10 −8 . Gene IDs and their length were used to map the scaffold by TBTools software [36].

Expression Analysis of CrWRKY in Different Organs
To investigate the expression patterns of all CrWRKY genes tissues, we obtained previously generated RNA-seq (GEO accession no. GSE94810, GSE67560, and GSE74384) and EST (NCBI) data for Mandarin citrus (C. reticulata) plant tissues including nucellarembryony, ovule, fruit, flower (before and after anthesis), roots, and leaves. The experiment (GSE74384) was conducted at Huazhong Agricultural University (Wuhan, Hubei Province, China) to identify the candidate gene related to nucellarembryony initiation (NEI) and polyembryony (PE) formation from mature Mandarin citrus (C. reticulata) trees grown under natural conditions. The monoembryonic clementine (CM, C.clementina) and the polyembryonic Ponkan (PK, C. reticulata Blanco) ovules were collected for roots, leaves, flower and fruit from three biological replications (trees) at anthesis after anthesis/50% flowering (DAF) (day 0 DAF) and subsequently at 3, 7, 14, 21, and 28 days after flowering (DAF). Samples were taken right before and after the appearance of Nucellar Embryo Initials (NEI) cells were used to create messenger RNA-seq libraries. PK/CM at pre-NEI and NEI stages with biological replicates. The expression pattern of all 46 CrWRKY genes was retrieved from this dataset. The experiment (GEO accession no. GSE94810) focused on understanding the wild-type (WT) of C. reticulata cv. Suavissima. Fruits were collected from the field of Wenzhou (Zhejiang Province, P.R. China; [37]). The fruit color is one of the most important appearance quality traits. The flavedo of the C. reticulata cv. Suavissima wild and its mutant were collected at 21 DAF of the wild-type and mutant-type and 30 DAF of the wild-and mutant-type after storage for RNA extraction, which had slight phenotypic differences [37]. Reads per kilobases per million mapped reads (RPKM) values from RNA-seq data were log2 transformed for expression profiling. Expression patterns with hierarchical clustering were displayed in Heatmap Illustrator in TBtools software [36].

Comprehensive Analysis of microRNAs
psRNA Target using CDS sequences of CrWRKY genes and its default parameters was used to determine the micro-RNA (miRNA) sequences associated with CrWRKY genes in C. reticulata [38]. Previous in vivo and in vitro research was used to determine the potential function of the discovered miRNA.

Identification of WRKY TFs in Mandarin (C. reticulata)
Initial analysis identified 87 genes that encode the WRKY domain in C. reticulata genome proteins. The proteins encoded by the identical gene isoforms and proteins containing a truncated WRKY DNA binding domain were excluded from the analysis. A total of 46 non-redundant CrWRKY genes were identified and used for further analysis. These non-redundant WRKY protein sequences from C. reticulata included the highly conserved WRKYGQK motif in all sequences except CrWRKY20, which contain WRKYGKK. Fourteen of sixty-one AA in the WRKY domain were 100% conserved in all CrWRKY peptide sequences in C. reticulata (Figure 1).

Identification of WRKY TFs in Mandarin (C. reticulata)
Initial analysis identified 87 genes that encode the WRKY domain in C. reticulata genome proteins. The proteins encoded by the identical gene isoforms and proteins containing a truncated WRKY DNA binding domain were excluded from the analysis. A total of 46 non-redundant CrWRKY genes were identified and used for further analysis. These non-redundant WRKY protein sequences from C. reticulata included the highly conserved WRKYGQK motif in all sequences except CrWRKY20, which contain WRKYGKK. Fourteen of sixty-one AA in the WRKY domain were 100% conserved in all CrWRKY peptide sequences in C. reticulata (Figure 1). The sizes of the proteins encoded by CrWRKY genes ranged from 159 to 602 AA. The molecular weight ranged from 18.23697 to 65.19112

Gene Structure, Conserved Motif, and Domain
The number and structure of introns and exons play a crucial role in determining evolutionary relationships among different genes [39]. Exon-intron structures of CrWRKY genes were comprehensively demonstrated phylog ( Figure 2) enetically, revealing the gene distribution pattern and their number. The intron numbers varied from one to five in CrWRKY genes ( Figure S2). Three genes contain one intron (6.5%), twenty-three CrWRKY genes retained two introns (50%), seven genes owned three introns (15.21%), nine genes had four introns (19.56%) and four genes had five introns (8.69%). All of the CrWRKY genes in subfamily G3, 2D, and 2E possessed two introns, while the several introns in the CrWRKY gene in subfamily G1 varied from two to five. Group 2a CrWRKY genes introns ranged from three to four and 2b from three to five. Group 2C also contained variable numbers of introns ranging from one to three (Table 1, Figure S2).
All 10 motifs identified in the mandarin WRKY proteins were studied using the MEME program in TBtools [36] (Figure 3). WRKY domain was consistent in all the CrWRKY proteins. It was also determined that CrWRKY proteins from the same group have similar motifs, indicating the involvement of these conserved motifs in group-specific activities among the whole gene family. The presence of two WRKY domains in Group 1 (G1) further supports the phylogenetic distribution of the gene family. G1 contains two WRKY domains, collectively made up of 5-6 motifs, but CrWRKY2 showed The sizes of the proteins encoded by CrWRKY genes ranged from 159 to 602 AA. The molecular weight ranged from 18.23697 to 65.19112 kD, with CrWRKY20 being the smallest and CrWRKY30 being the most extended protein (ExPASy; https://web.expasy. org/protparam/) ( Table 1).

Gene Structure, Conserved Motif, and Domain
The number and structure of introns and exons play a crucial role in determining evolutionary relationships among different genes [39]. Exon-intron structures of CrWRKY genes were comprehensively demonstrated phylog ( Figure 2) enetically, revealing the gene distribution pattern and their number. The intron numbers varied from one to five in CrWRKY genes ( Figure S2). Three genes contain one intron (6.5%), twenty-three CrWRKY genes retained two introns (50%), seven genes owned three introns (15.21%), nine genes had four introns (19.56%) and four genes had five introns (8.69%). All of the CrWRKY genes in subfamily G3, 2D, and 2E possessed two introns, while the several introns in the CrWRKY gene in subfamily G1 varied from two to five. Group 2a CrWRKY genes introns ranged from three to four and 2b from three to five. Group 2C also contained variable numbers of introns ranging from one to three (Table 1, Figure S2).
All 10 motifs identified in the mandarin WRKY proteins were studied using the MEME program in TBtools [36] (Figure 3). WRKY domain was consistent in all the CrWRKY proteins. It was also determined that CrWRKY proteins from the same group have similar motifs, indicating the involvement of these conserved motifs in group-specific activities among the whole gene family. The presence of two WRKY domains in Group 1 (G1) further supports the phylogenetic distribution of the gene family. G1 contains two WRKY domains, collectively made up of 5-6 motifs, but CrWRKY2 showed ambiguity by having 4 motifs and an additional domain (Plant-znclust) at the C-terminus region. Group 2A showed a single WRKY domain and most of them having 4 motifs but CrWRKY19 displayed an additional motif (motif 10) in its structure. In Group 2B, CrWRKY27 and CrWRKY34 also contained bZIP and Phage_GPO superfamily domains in their N-terminus regions. Except for CrWRKY23, all the members of Group 2B had 7 motifs. In the case of Group 2C, all the members exhibited complete homogeneity in the number of motifs and domains, except for CrWRKY8, which had an additional motif (motif 6) at N-terminus. However, CrWRKY26 contained motif 4 instead of motif 6, which was unusual for the whole of Group 2D ( Figure 3). Furthermore, the only member from the whole group lacked the Plant-znclust domain. Other than motifs 1, 2, and 6, CrWRKY32 also had motif 9. CrWRKY39 was the only member from the whole gene family with no WRKY domain and had the Herpes-BLLF1 superfamily in its place. While coming towards the motifs, this group displayed conservation of motifs 1 and 2 among its members. All the members of Group 3 had the same number of motifs and conserved domain structure. This comparable distribution of conserved motifs between the WRKY proteins in each group corresponds to the similarity in the functioning of these genes. ber of motifs and domains, except for CrWRKY8, which had an additional motif (motif 6) at N-terminus. However, CrWRKY26 contained motif 4 instead of motif 6, which was unusual for the whole of Group 2D ( Figure 3). Furthermore, the only member from the whole group lacked the Plant-znclust domain. Other than motifs 1, 2, and 6, CrWRKY32 also had motif 9. CrWRKY39 was the only member from the whole gene family with no WRKY domain and had the Herpes-BLLF1 superfamily in its place. While coming towards the motifs, this group displayed conservation of motifs 1 and 2 among its members. All the members of Group 3 had the same number of motifs and conserved domain structure. This comparable distribution of conserved motifs between the WRKY proteins in each group corresponds to the similarity in the functioning of these genes. The genes of C. reticulata were marked with red stars. CrWRKY genes were categorized into seven subgroups. G1 is represented by the yellow color, 2A is shown as the grey color, red represents 2B, blue shows 2C, the 2D group of CrWRKYs is denoted in purple color, pink represents the 2E group, and G3 is represented by green color. The evolutionary history was inferred using the NJ method with 1000 bootstraps. This analysis involved 95 WRKY genes.

Figure 2.
Phylogenetic relationship between the WRKY genes of C. reticulata and A. thaliana. The genes of C. reticulata were marked with red stars. CrWRKY genes were categorized into seven subgroups. G1 is represented by the yellow color, 2A is shown as the grey color, red represents 2B, blue shows 2C, the 2D group of CrWRKYs is denoted in purple color, pink represents the 2E group, and G3 is represented by green color. The evolutionary history was inferred using the NJ method with 1000 bootstraps. This analysis involved 95 WRKY genes.

Comparative Phylogenetic Relationship of C. reticulata WRKY Gene Family with Arabidopsis
The evolutionary relationship between A. thaliana and C. reticulata was determined by generating a phylogenetic tree using the neighbor-joining (NJ) method ( Figure 2). Results depicted that these total 96 WRKY proteins can be categorized into three main groups G1, G2, and G3, based on the domains and motifs present in these peptide sequences. More precisely, this grouping was based on the number of WRKY domains and the nature of the zinc finger in the peptide sequences. Members of G1 contain two WRKY domains each,

Comparative Phylogenetic Relationship of C. reticulata WRKY Gene Family with Arabidopsis
The evolutionary relationship between A. thaliana and C. reticulata was determined by generating a phylogenetic tree using the neighbor-joining (NJ) method ( Figure 2). Results depicted that these total 96 WRKY proteins can be categorized into three main groups G1, G2, and G3, based on the domains and motifs present in these peptide sequences. More precisely, this grouping was based on the number of WRKY domains and the nature of the zinc finger in the peptide sequences. Members of G1 contain two WRKY domains each, while the members of other groups contain only a single WRKY domain in their sequence. The zinc finger of Group 3 was of C2HC-type while Group 1 and 2 had C2H2-type of zinc fingers. Group 2 was further classified into 2A, 2B, 2C, 2D, and 2E, depending on their initial amino acid sequences [40].
Group G1 contained 17 proteins, of which 8 belonged to C. reticulata and 9 were from Arabidopsis. G2 had 62 members, divided into 5 subgroups. 2A had 6 members (3 from each organism); 2B contained 13 proteins, 8 belonging to C. reticulata; 2C was the most prominent group with 25 members, out of which 10 were from C. reticulata; 2D had 13 members (6 from C. reticulata). Four out of seven proteins in Group 2E belonged to citrus, whereas G3 had 17 proteins, out of which 7 were from C. reticulata (Figures 2 and S1a-g).

Gene Location and Duplication Orthologous Gene Pairs between C. reticulata and Other Plant Species
Orthologous gene pairs provide information about the evolutionary relationship between different plant species. To understand the evolution of the CrWRKY gene, we analyzed the syntenic relationships between C. reticulata and other citrus species in the Rutacaeae family. To better understand the evolution and relationship of CrWRKY genes in monocot and dicot plants, a dual syntenic analysis was also performed between C. reticulata and Arabidopsis, C. reticulata, and O.sativa. Collinearity analysis revealed that 39 collinear orthologous WRKY gene pairs were identified in the Mandarin citrus/Arabidopsis and 14 among Mandarin citrus/rice pairs, respectively ( Figure 4). The number of orthologous events of CrWRKY-AtWRKY was higher than of CrWRKY-OsWRKY, indicating that the divergence between citrus and Arabidopsis occurred after the divergence of rice and the common ancestor of dicotyledons. The high level of syntenic conservation between the citrus and Arabidopsis indicated that WRKY TFs in citrus might share similar structures and functions to those of orthologs in Arabidopsis.
Syntenic analysis was also performed on the whole genome of C. reticulata to discover any CrWRKY duplication event within the C. reticulata genome. Gene location on the scaffold predicted that CrWRKY genes belonging to the same subgroup were present on the different scaffolds and linked, suggesting that these genes might have emerged from segmental duplication (Figure 4; Table S2). The maximum number of CrWRKY genes (six) were present on S7, followed by S21, which contains four genes ( Figure S4). Results showed that 100% of segmental duplication events possibly lead to CrWRKY genes and that the evolution of CrWRKYs may have been driven, at least in part, by segmental duplication events.
fold predicted that CrWRKY genes belonging to the same subgroup were present on the different scaffolds and linked, suggesting that these genes might have emerged from segmental duplication (Figure 4; Table S2). The maximum number of CrWRKY genes (six) were present on S7, followed by S21, which contains four genes ( Figure S4). Results showed that 100% of segmental duplication events possibly lead to CrWRKY genes and that the evolution of CrWRKYs may have been driven, at least in part, by segmental duplication events.  Seventy-three syntenic collinear lines observed between C. reticulata and C. grandis were higher than all other species of citrus. Collinear WRKY genes were identified in C. reticulata/C. clementine (70), C. reticulata/C. sinensis (69), C. reticulata/C. medica (60) and C. reticulata/C. ichangensis (36). It indicated the divergence between C. reticulata and C. grandis occurred after the divergence of C. clementine, C. sinensis, C. medica, and C. ichangensis (Figures 4 and 5).   The time of duplication of CrWRKY genes was assessed by the KaKs calculator in tbtool software [36] (Figure 6). Ka shows the number of nonsynonymous substitutions per nonsynonymous site while Ks show the number of synonymous substitutions per synonymous site and the ratio between these two was calculated by Ka/Ks. The estimated ratio between different gene pairs (CrWRKY15_CrWRKY2, CrWRKY4_CrWRKY2,CrWRKY9_CrWRKY38, Cr-WRKY41_CrWRKY44,CrWRKY3_CrWRKY2,CrWRKY4_CrWRKY3,CrWRKY36_CrWRKY43, CrWRKY14_CrWRKY12,CrWRKY12_CrWRKY8,CrWRKY14_CrWRKY8,CrWRKY13_CrWRKY10, CrWRKY37_CrWRKY44,CrWRKY9_CrWRKY15,CrWRKY2_CrWRKY38,CrWRKY21_CrWRKY18, CrWRKY26_CrWRKY32,CrWRKY23_CrWRKY34,CrWRKY28_CrWRKY19,CrWRKY31_ Cr-WRKY28,CrWRKY40_CrWRKY29,CrWRKY25_CrWRKY29,CrWRKY34_CrWRKY30,CrWRKY23_ CrWRKY30,CrWRKY41_CrWRKY37,CrWRKY22_CrWRKY32,CrWRKY14_CrWRKY12, Cr-WRKY6_CrWRKY38,CrWRKY3_CrWRKY38,CrWRKY43_CrWRKY42,CrWRKY45_CrWRKY46 and CrWRKY35_CrWRKY46) were less than 1; while CrWRKY5_ CrWRKY2, CrWRKY4_ Cr-WRKY5, CrWRKY1_ CrWRKY5 and CrWRKY5_ CrWRKY38 held more than 1 ratio. The highest estimated date of duplication was 352.1051365 (MYA) in CrWRKY31_CrWRKY28 and the lowest estimated date of duplication was 53.91711764 (MYA) in CrWRKY15_CrWRKY2 (Table S3).
Agriculture 2023, 13, x FOR PEER REVIEW CrWRKY35_CrWRKY46) were less than 1; while CrWRKY5_ CrWRKY2, CrW CrWRKY5, CrWRKY1_ CrWRKY5 and CrWRKY5_ CrWRKY38 held more than 1 ra highest estimated date of duplication was 352.1051365 (MYA) in CrWRKY31_CrW and the lowest estimated date of duplication was 53.91711764 (MY CrWRKY15_CrWRKY2 (Table S3).

Cis-Regulatory Elements Analysis
Cis-regulatory elements (CREs) are DNA binding motifs present at the promoter region that regulate the transcription of a gene [41]. In silico analysis can be performed to evaluate the putative function of various genes using CREs analysis [42]. Various CREs were detected in the promoter regions of CrWRKY genes (Figure 7). Phytohormone-responsive elements, such as MeJARE (MeJA-responsive element), ABRE (abscisic acid-responsive element), SARE (salicylic acid-responsive element), ethylene-responsive elements (ERE) and Gibbereline response elements GARE were detected in the promoter regions, suggesting that multiple phytohormones might regulate the expression of CrWRKY genes. Additionally, some stress-related CREs, such as DSRE (defense and drought stress-responsive element), LTRE (low-temperature-responsive element), and HSRE (heat stress-responsive element), were also found in the CrWRKY gene promoter regions (Table S1); these results indicated that CrWRKY genes might be closely related to the responses to multiple abiotic and biotic stresses. Moreover, two plant hormone-related elements (ABRE and MeJARE) and a stress-responsive element (HSRE) were frequently detected in the putative promoters of CrWRKY genes. Notably, each CrWRKY promotor contained multiple copies of LRE (light-responsive element), suggesting that CrWRKY genes were an important component of light response in C. reticulata. Similarly, CrWRKY5, CrWRKY6, CrWRKY9, and CrWRKY22 contained multiple LRE suggesting their role in light responsiveness. CrWRKY6, CrWRKY9, and CrWRKY16 had multiple ABAR CREs that showed their role in abscisic acid responsiveness. MeJA element showed their regulation of methyl jasmonate responsiveness in CrWRKY9, CrWRKY18, CrWRKY35, and CrWRKY42. Defense and drought resistance signaling specific CREs were observed in CrWRKY2, CrWRKY6, and CrWRKY9.
Agriculture 2023, 13, x FOR PEER REVIEW

Cis-Regulatory Elements Analysis
Cis-regulatory elements (CREs) are DNA binding motifs present at the prom gion that regulate the transcription of a gene [41]. In silico analysis can be perfo evaluate the putative function of various genes using CREs analysis [42]. Variou were detected in the promoter regions of CrWRKY genes (Figure 7). Phytohorm sponsive elements, such as MeJARE (MeJA-responsive element), ABRE (abscisic sponsive element), SARE (salicylic acid-responsive element), ethylene-respons ments (ERE) and Gibbereline response elements GARE were detected in the prom gions, suggesting that multiple phytohormones might regulate the expression of C genes. Additionally, some stress-related CREs, such as DSRE (defense and drough responsive element), LTRE (low-temperature-responsive element), and HSRE (hea responsive element), were also found in the CrWRKY gene promoter regions (Ta these results indicated that CrWRKY genes might be closely related to the respo multiple abiotic and biotic stresses. Moreover, two plant hormone-related e (ABRE and MeJARE) and a stress-responsive element (HSRE) were frequently det the putative promoters of CrWRKY genes. Notably, each CrWRKY promotor co multiple copies of LRE (light-responsive element), suggesting that CrWRKY gen an important component of light response in C. reticulata. Similarly, CrWRKY5, CrW CrWRKY9, and CrWRKY22 contained multiple LRE suggesting their role in light siveness. CrWRKY6, CrWRKY9, and CrWRKY16 had multiple ABAR CREs that their role in abscisic acid responsiveness. MeJA element showed their regulation of jasmonate responsiveness in CrWRKY9, CrWRKY18, CrWRKY35, and CrWRKY42. and drought resistance signaling specific CREs were observed in CrWRKY2, CrW and CrWRKY9.

Expression Analysis of the CrWRKY Gene in Different Organs
To investigate the expression of CrWRKY genes, computational Expressed Sequence Tag (EST) analysis was accomplished for all 46 CrWRKY genes using NCBI nucleotide blast and interpreted using a heatmap with a hierarchical cluster in TBtools [36]. Results revealed that most of the CrWRKY genes expressed in leaves compared to the fruit. Cr-WRKY7, CrWRKY22, CrWRKY28, and CrWRKY38 genes showed expression during fruit development in contrast to the expression of other genes in the leaf (Figure 8).
Agriculture 2023, 13, x FOR PEER REVIEW 12 of 23 Figure 7. Cis-regulatory elements in putative CrWRKY gene promoters were associated with different plant stresses, hormone responses, growth, and developmental processes. G1, G2, and G3 showed the groups of WRKY genes.

Expression Analysis of the CrWRKY Gene in Different Organs
To investigate the expression of CrWRKY genes, computational Expressed Sequence Tag (EST) analysis was accomplished for all 46 CrWRKY genes using NCBI nucleotide blast and interpreted using a heatmap with a hierarchical cluster in TBtools [36]. Results revealed that most of the CrWRKY genes expressed in leaves compared to the fruit. CrWRKY7, CrWRKY22, CrWRKY28, and CrWRKY38 genes showed expression during fruit development in contrast to the expression of other genes in the leaf (Figure 8).

Expression Analysis of CrWRKY Genes during Nucellarembryony Initiation
Next-generation RNA sequencing analysis is an important tool to find and compare the expression of genes under different organ development and stress responses. An experimental NGS RNA-seq (GSE74384) data were downloaded from the NCBI GEO database. In the current study, transcriptome profiling was employed to compare differentially expressed genes (DEGs) of ovules at and before the appearance of nucellarembryony initial cells. Results demonstrated that CrWRKY35 and CrWRKY43 expression were significantly down-regulated during NEI in monoembryonic cultivars CM and C. clementina.
CrWRKY8 was down-regulated considerably in polyembryonicPonkan (PK, C. reticulata Blanco) during the beginning of the new embryo initiation (NEI) compared to pre-NEI. LEC1, L1L, FUS3, ABI3, and ABI5 were involved in the somatic embryogenesis of citrus in vitro but differentially not expressed in polyembryonic and monoembryonic ovules. The AP2/ERF and WRKY domains were up-regulated in polyembryonic cultivars compared to monoembryonic cultivars [43] (Figure 9).
Next-generation RNA sequencing analysis is an important tool to find and compare the expression of genes under different organ development and stress responses. An experimental NGS RNA-seq (GSE74384) data were downloaded from the NCBI GEO database. In the current study, transcriptome profiling was employed to compare differentially expressed genes (DEGs) of ovules at and before the appearance of nucellarembryony initial cells. Results demonstrated that CrWRKY35 and CrWRKY43 expression were significantly down-regulated during NEI in monoembryonic cultivars CM and C. clementina. CrWRKY8 was down-regulated considerably in polyembryonicPonkan (PK, C. reticulata Blanco) during the beginning of the new embryo initiation (NEI) compared to pre-NEI. LEC1, L1L, FUS3, ABI3, and ABI5 were involved in the somatic embryogenesis of citrus in vitro but differentially not expressed in polyembryonic and monoembryonic ovules. The AP2/ERF and WRKY domains were up-regulated in polyembryonic cultivars compared to monoembryonic cultivars [43] (Figure 9).

Fruit Color Expressed in Wild-and Mutant-Type of Stay-Green Citrus
The experiment (GEO accession no. GSE94810) data were planned to illuminate the potential regulation of citrus fruits color formation process and compare the different gene expression patterns between a spontaneous stay-green mutant and its wild-type.

Fruit Color Expressed in Wild-and Mutant-Type of Stay-Green Citrus
The experiment (GEO accession no. GSE94810) data were planned to illuminate the potential regulation of citrus fruits' color formation process and compare the different gene expression patterns between a spontaneous stay-green mutant and its wild-type. Read per Kilo-bases per million Mapping reads (RPKM) values from RNA-seq data were log2 transformed for expressions profiling [36].

CrWRKY Gene Expression Profiles in HBL Infection
The experiment (GEO accession no.GSE67560) data were downloaded from NCBI Geo to illuminate the potential regulation of Citrus Huanglongbing (HLB) disease detected in the leaves and roots of citrus. HLB is a worldwide uncontrollable and worldwide citrus disease, Gram-negative Candidatus Liberibacterphloem-inhabiting α-Proteobacteria causes this Ca. Liberibacterasiaticus (CLas) is a pathogenic species of this class, which is transmitted by grafting with HLB-infected bud woods and by a phloem-feeding psyllid, Diaphorinacitri. Phosphorus-starvation-induced miR399 significantly induced phosphorus

CrWRKY Gene Expression Profiles in HBL Infection
The experiment (GEO accession no.GSE67560) data were downloaded from NCBI Geo to illuminate the potential regulation of Citrus Huanglongbing (HLB) disease detected in the leaves and roots of citrus. HLB is a worldwide uncontrollable and worldwide citrus disease, Gram-negative Candidatus Liberibacterphloem-inhabiting α-Proteobacteria causes this Ca. Liberibacterasiaticus (CLas) is a pathogenic species of this class, which is transmitted by grafting with HLB-infected bud woods and by a phloem-feeding psyllid, Diaphorinacitri. Phosphorus-starvation-induced miR399 significantly induced phosphorus levels to alleviate symptoms of HLB in the infected tree [44]. In roots, CLas-responsive genes were exploited for HLB resistance to HBL in breeding. Some WRKYs involved in SA-and JA-dependent defense pathways were up-regulated and targeted in CLas-infected roots [45].

Discussion
C. reticulata plays a significant part in the worldwide economy [46]. Conventional methods coupled with molecular approaches and tools employed through bioinformatic approaches can improve the fruit quality and production traits of C. reticulata. WRKY protein family is one of the major TFs in citrus and has also been identified in many other organisms. Here, we consistently detected 46 WRKY genes from the WGS of C. reticulata. Sequence alignment and phylogenetic analysis divided WRKY genes into three main groups (G-I, G-II, G-III). This classification depends on the WD motif and the conserved ZF motif. There were 8 and 7 members of CrWRKY in G-I and G-III, while G-II was the largest group containing 31 members (Figure 2). Furthermore, in A. thaliana and Populas [47], G-I houses the maximum gene members of WRKY; while G-III contains the largest number of WRKYs in rice [19]. Most of the WRKY genes in C. reticulata are present in G-II, suggesting that the evolutionary history of G-II may experience further duplication of genes. Currently, WRKY genes identified in the genome of lower bryophytes, unicellular protoplast, slime mold myxomycetes, ferns, and Chlamydomonas monocytogenes that relate to G-I, indicating that G-I is most likely the novel form of protein expression and they developed 1.5 billion years ago in a eukaryotic cell, before thallophytes [48]. A diverse alignment study of CrWRKY domain structure revealed two main types of conserved introns in CrWRKY proteins (Table  S5): V-type and R-type introns [49]. The R-type intron was found only in subgroups I, II-c, II-d, II-e, and III. Therefore subgroups II-a and II-b have V-type introns, which they discovered as an additional type of introns, which were positioned at the fourth AA-residue (K) after the second C-residue metal-chelating zinc-finger motif [19]. It is unclear whether the WRKY function is affected by adding homologous introns in distinct locus. The constant sequence of the WRKY gene is the heptapeptide sequence WRKYGQK. Furthermore, with respect to the domain feature and the phylogenic tree (Figures 1 and 2), it showed three interesting observations: (i) all Group I members have two WRKY domains and two zincfinger structures; (ii) subgroup II-c has several mutations, indicating that perhaps the selective force and evolutionary patterns of CrWRKY genes are different among categories, such as CrWRKY20 possess lack of ZF structure the six amino acid residues (WRKYGQ), mutations were observed in the Q position which is replaced by K; (iii) CrWRKY47 belong to none of the group because of lack of its domain and ZF structure. All these results implied the CrWRKY genes may have experienced loss of the WRKY domain loss during the evolution.
The Group III WRKY gene accounts for 20% of the family members in higher plants, but it does not exist in some lower plants, such as bryophytes. Studies have shown that almost all Group III WRKYs in Arabidopsis are related to the response to biotic stress response [40], which indicates that Group III WRKYs are relatively the latest in the evolutionary history of terrestrial plants. However, in CrWRKY, all of G-I have deficient expression levels in fruit development even those not expressed in fruit development ( Figure 8). Two of them (CrWRKY6 and CrWRKY9) are considered pseudogenes that are expressed both in fruit and leaf development, implying that G-III members may have little to do with the fruit formation only expressed in CrWRKY38, 45and 46 genes, which is inconsistent with the Arabidopsis.
Tissue expression analysis showed that members of G-II expressed highly in both tissue's development. Members of G-IIe expressed in both developments without apparent different expression levels. However, other subgroups of G-II showed different expressions in fruit and leaf development (Figure 8). Combined with their expression in the various organs (nucleus, chloroplast, cytoplasm, cytoplasm nucleus, vacuoles, plasmids, extra nucleus, peroxisomes, golgi plasmids, mitochondria, and cytoskeleton), all groups showed high expression in the nucleus, but CrWRKY16 of G-IIc was lacking expression ( Figure S3).
Regardless of the fact that C. reticulata has 46 CrWRKY genes, members of G-I (Cr-WRKY1 is the ortholog of AtWRKY3; CrWRKY3, CrWRKY4 are homologues of AtWRKY26 and AtWRKY33; CrWRKY6 is the homologue of AtWRKY25; CrWRKY5 is the homology of AtWRKY34; CrWRKY9, CrWRKY2 and CrWRKY15 are the homology of AtWRKY32); members of G-III (CrWRKY38, CrWRKY42, CrWRKY43, CrWRKY35 and CrWRKY36 are homology of AtWRKY30; CrWRKY45 is the homology of AtWRKY70; CrWRKY46 is the homology of AtWRKY38; CrWRKY39 is the homology of AtWRKY14; members of G-IIe (CrWRKY44, CrWRKY37 and CrWRKY41 are the homology of AtWRKY27); mem-bers of G-IId (CrWRKY18, CrWRKY21 and CrWRKY24 are homology AtWRKY17 and AtWRKY11; CrWRKY22 is homology of AtWRKY21; CrWRKY32 and CrWRKY26 are homology of AtWRKY39 and AtWRKY74); members of G-IIc (CrWRKY11 is homology of AtWRKY71; CrWRKY7 is homology of AtWRKY23; CrWRKY10 and CrWRKY13 are homology of AtWRKY57; CrWRKY8, CrWRKY12 and CrWRKY14 are the homology of AtWRKY75; CrWRKY16 and CrWRKY17 are homology of AtWRKY13; CrWRKY20 is the homology of AtWRKY50); members of G-IIa (CrWRKY19 and CrWRKY28 are homology of AtWRKY40; CrWRKY31 is the homology of AtWRKY18 and AtWRKY60); members of G-IIb (CrWRKY29, CrWRKY40, CrWRKY25, CrWRKY27 and CrWRKY33 are homology of AtWRKY31; Cr-WRKY30 and CrWRKY34 are the homology of AtWRKY36; CrWRKY23 is the homology of AtWRKY9) (Figure 2).
Through the structural analysis of CrWRKY genes, it is found that CrWRKY genes have different numbers of exons/introns as in their ortholog A.thaliana (as predicted by phylogeny analysis) ( Figure S2). Furthermore, the motif analysis indicated that all the CrWRKY and AtWRKY have the same motif structures except CrWRKY1, CrWRKY8, CrWRKY26, CrWRKY32, CrWRKY19, CrWRKY40, CrWRKY38, AtWRKY36, AtWRKY55 and AtWRKY40 of different clades have variant motif structures.
RNA-seq analysis helps to identify the significant PmWRKYs in the cold tolerance of Prunusmume [50]. Some PmWRKY genes changed their expression during chilling stress in the P. mume cultivar 'Zhusha'. According to [50], PmWRKY6 and 11 were identified as up-regulated or down-regulated over twofold after experiencing cold acclimation in the winter. Although some conditions were controlled during winter, PmWRKYs were reported to evaluate their cold response through artificially low-temperature conditions. Cold temperature signaling pathways are classified into ABA-dependent or ABA-independent pathways; both modulate the functional gene expression, such as KIN1, COR47, and RD29A, that increase the cold resistance in plants. WRKY TrFs were reported as a significant role in the ABA signal transduction pathway [51]. According to the authors, 17 genes respond to ABA treatment. Especially after this treatment, PmWRKY18 was expressed significantly up-regulated, while other genes were not significantly different compared to its control. So, they concluded that PmWRKY18 of G-II may involve in ABA-dependent cold signaling pathway. PmWRKY18is homology of AtWRKY18. AtWRKY18 belongs to G-IIa, and the other genes of A. thaliana in G-IIaareAtWRKY40, and AtWRKY60 [50].
According to our research, the homologies of AtWRKY18, AtWRKY40, and AtWRKY60 in G-IIa of C. reticulata are CrWRKY19, CrWRKY28, and CrWRKY31. AtWRKY18, AtWRKY40, and AtWRKY60 served as positive and negative regulators in ABA-dependent pathways for inhibiting seed and root development (Table S6). Moreover, to further know about the gene structure and their function, CRE has an essential function in binding the TrFs by their respective target site and helps to regulate gene expression. CrWRKY genes have CRE that are related to LRE, LTRE, and different hormone responsiveness to play a significant role in plant growth and development (Table S1 and Figure 7). So, these genes may be involved in photosynthesis and ABA responses. However, further research is required to find out the exact function of these CRE. Members of G-I (CrWRKY1, CrWRKY2, CrWRKY5, CrWRKY6, and CrWRKY9); G-IIb (CrWRKY27); G-IIc (CrWRKY7, CrWRKY16); G-IIc (CrWRKY18, CrWRKY21, CrWRKY22, CrWRKY24); and G-III (CrWRKY35, CrWRKY38, CrWRKY45, and CrWRKY46) play a significant role in ABAR and LTRE that are directly or indirectly involved in cold stress.
Evolutionary changes play a vital role in the process of gene duplication. Duplicated genes on the same scaffolds are tandem duplication. However, duplication of the genes on different scaffolds is a segmented mode of duplication [26]. All the CrWRKYs show segmental duplication events, while tandem duplication events were lacking( Figure 5). The lack of tandem duplication in CrWRKY genes might be possible for the minimum number of genes. Segmental duplication was a major driver of WRKYs expansion during the citrus evolutionary process. Twenty-nine scaffolds of C. reticulata genes were identified, possibly all scaffolds resulting from segmented duplication due to their presence on different scaf-genes have diverse functional networks but mainly in the leaf and flower developmental processes. miRNA data on possibly targeted CrWRKY genes during root, leaf, flowering, and fruit development provided a base for further studies in C. reticulata to increase yield and help cope with stress, which badly affects the plant. The detailed computational inspection of CrWRKY proteins revealed in the current study could be used for cloning and selection at the molecular level, portraying gene expression and studying their interactions with different transcription factors.
Supplementary Materials: The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/agriculture13061182/s1, Figure S1a Figure S3: Heat Map representing Sub-cellular localization of all 46 CrWRKY genes. Skin colour represents the absence of the respective gene in that cell region; yellow mustard represents the minimum and maximum value of the gene in that region. Figure S4: Distribution of CrWRKY genes on citrus scaffolds, it is predicting the possible gene duplication on the different chromosome and show segmented duplication. Table S1: Cis-regulatory elements in putative CrWRKY promoters were associated with different plant developmental processes. Table S2: Scaffold Renaming. Table S3: Ka/Ks ratio duplicated gene pairs in Citrus reticulata. Table S4: miRNA targets prediction of CrWRKY genes. The miRNA data was downloaded from psRNATarget (Online Web tool). Table S5: R-type and V-type are differentiated by red (R) and purple (V) in the sequence alignment of CrWRKY genes. Table S6: A. thaliana with its role in plants.