Phylogenomic Analysis of micro-RNA Involved in Juvenile to Flowering-Stage Transition in Photophilic Rice and Its Sister Species

Vegetative to reproductive phase transition in phototropic plants is an important developmental process and is sequentially mediated by the expression of micro-RNA MIR172. To obtain insight into the evolution, adaptation, and function of MIR172 in photophilic rice and its wild relatives, we analyzed the genescape of a 100 kb segment harboring MIR172 homologs from 11 genomes. The expression analysis of MIR172 revealed its incremental accumulation from the 2-leaf to 10-leaf stage, with maximum expression coinciding with the flag-leaf stage in rice. The microsynteny analysis of MIR172s revealed collinearity within the genus Oryza, but a loss of synteny was observed in (i) MIR172A in O. barthii (AA) and O. glaberrima (AA); (ii) MIR172B in O. brachyantha (FF); and (iii) MIR172C in O. punctata (BB). Phylogenetic analysis of precursor sequences/region of MIR172 revealed a distinct tri-modal clade of evolution. The genomic information generated in this investigation through comparative analysis of MIRNA suggests that mature MIR172s have evolved in a disruptive and conservative mode amongst all Oryza species with a common origin of descent. Further, the phylogenomic delineation provided an insight into the adaptation and molecular evolution of MIR172 to changing environmental conditions (biotic and abiotic) of phototropic rice through natural selection and the opportunity to harness untapped genomic regions from rice wild relatives (RWR).


Introduction
Micro-RNAs (MIRNAs) function as post-transcriptional regulators of gene expression in eukaryotes [1]. In plants under a given environmental condition, MIRNAs perform a host of regulatory functions, and one important regulation is phase transitions leading to plant morphogenesis and development [2][3][4][5]. Such phase transitions mark cardinal and necessary changes in plant development and are mediated by sequentially expressed MIRNAs [6]. MIR172 is one such family of MIRNA that is involved in phase transition during plant development. Vegetative phase changes in Arabidopsis and maize are controlled by the sequential activity of miR156 and MIR172 [7]. Whereas miR156 is highly expressed during early developmental stages, MIR172 is highly expressed during later stages of development [7][8][9]. In plants such as Acacia confusa, A. colei, Hedera helix, Eucalyptus globulus, and Quercus acutissima, contrasting expression patterns of miR156 and MIR172 and their target genes were observed [10].

Sequence Retrieval of MIR172 Homologs
miRBase 22.1 (http://www.mirbase.org/, accessed on 10 March 2021) was used to retrieve the mature and precursor sequences of MIR172 from Oryza sativa, Sorghum bicolor, Zea mays, Triticum aestivum, and Arabidopsis thaliana [56]. Oryza sativa MIRNA precursor sequences were used as the query to execute BLASTN (local BLAST) against the genome sequences of O. glaberrima, O. glumaepatula, O. rufipogon, O. punctata, O. barthii, and O. brachyantha that are available in Gramene. To extract each plant's precursor sequence, the MIRNA precursor sequences of sorghum, maize, and Arabidopsis were used as the queries. High-scoring segment pairs (HSPs) for MIR172 were identified based on the score, e-value (lowest), and percentage identity (highest). These HSPs were then chosen for comparative genomic study in order to understand the microsynteny, organizational structure, and evolutionary trend of MIR172 in domesticated and wild grasses. As an outgroup for determining the evolutionary trend, the Arabidopsis MIR172 sequence from TAIR (https://www.arabidopsis.org/, accessed on 10 March 2021) was used.
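The HSP selection rule described above (lowest e-value, then highest percentage identity, then highest score) can be illustrated with a minimal Python sketch. It assumes the hits were exported in the standard 12-column BLAST tabular layout; the function names and the demonstration rows are hypothetical and are not taken from the study.

```python
import csv
import io
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float    # percentage identity
    evalue: float
    bitscore: float


def parse_blast_tabular(text: str) -> List[Hit]:
    """Parse the standard 12-column BLAST tabular format (assumed layout)."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip comments and malformed lines
        hits.append(Hit(row[0], row[1], float(row[2]), float(row[10]), float(row[11])))
    return hits


def best_hsp_per_query(hits: List[Hit]) -> Dict[str, Hit]:
    """Per query: lowest e-value, then highest identity, then highest bit score."""
    def rank(h: Hit):
        return (h.evalue, -h.pident, -h.bitscore)

    best: Dict[str, Hit] = {}
    for h in hits:
        if h.query not in best or rank(h) < rank(best[h.query]):
            best[h.query] = h
    return best


# Hypothetical rows, for illustration only (not data from the study).
DEMO = (
    "osa-MIR172a\tscaffold_A\t98.50\t200\t3\t0\t1\t200\t1000\t1199\t1e-90\t350\n"
    "osa-MIR172a\tscaffold_B\t85.00\t150\t20\t2\t1\t150\t500\t649\t1e-30\t150\n"
)

if __name__ == "__main__":
    for query, hit in best_hsp_per_query(parse_blast_tabular(DEMO)).items():
        print(query, hit.subject, hit.evalue, hit.pident)
```

Ranking on a tuple keeps the tie-breaking order explicit; a different priority among the three criteria would only require reordering the tuple.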

MIRNA Precursor Sequence Analysis
Multiple sequence alignment of MIR172 precursor and mature sequences from seven Oryza spp., maize, sorghum, and Arabidopsis was carried out using MAFFT version 7.271 [57] with the L-INS-i approach and output in Phylip format. The aligned sequences were scored for similarity using ESPript 3.0 [58] (https://espript.ibcp.fr/ESPript/ESPript/, accessed on 12 March 2023) with the default settings.
The coordinates of the 100 kb regions harboring the MIRNAs used for microsynteny analysis are listed in Supplementary Table S1. The reference plant for the microsynteny investigation was Oryza sativa. Genes were predicted using the FGENESH tool from MolQuest II (http://www.molquest.com/molquest.phtml?group=index&topic=gfind, accessed on 18 March 2021) [59] with the default parameters. For all rice species, Oryza sativa was used as the default template, and the genomes of wheat, maize, and sorghum were employed as templates for their corresponding genomes. Blast2GO software (version 5.2.5) (https://www.blast2go.com/, accessed on 19 March 2021) was used to perform functional annotation and genomics analysis on genes predicted by FGENESH in the 100 kb region [60]. Functionally annotated genes from the 100 kb region of the MIRNAs were listed for each species and used for microsynteny analysis using Blast2GO.
The database for microsynteny analysis, as well as the blastp output used as input for calculating synteny blocks, was created using the NCBI BLAST-2.11.1+ software (makeblastdb and blastp). The input for makeblastdb comprised four sets of proteins (one set each for MIR172A, B, C, and D) predicted by the MolQuest program in the 100 kb genomic regions of the 10 genomes under study in the previous stage. The blastp settings were blastp -outfmt 8 -evalue 1 × 10⁻¹⁰ -max_target_seqs 5. Synteny blocks for each MIRNA were computed using the blast output and GFF annotations of seven Oryza, two non-Oryza, and Arabidopsis genomes as input. The MCScanX tool [61] was used to determine the interspecies syntenic blocks with the following parameters: match-score, final score = match score + num gaps × gap penalty (default: 50); gap-penalty, gap penalty (default: 1); match-size, the number of genes required to call a collinear block (default: 5); E-value, alignment significance 1 × 10⁻⁵; max-gaps, maximum gaps allowed (default: 25); overlap-window, maximum distance 10,000 (number of nucleotides between genes) (default: 5); and collinear block patterns: 1 inter-species. The method found two or more species that shared a pairwise synteny block with at least five shared genes and an E-value of 1 × 10⁻¹⁰ within a maximum range of 10,000 nucleotides. The MCScanX circle plotter was used to generate the figures. After identifying the collection of synteny blocks, in-house scripts were created to subset the MCScanX collinearity output file. Circos (version 0.69-9, http://circos.ca/, accessed on 25 March 2021) [62] was used to generate circular graphs.
[...] and O. punctata were procured from AICRP on rice (OUAT) and individual collections of rice breeders; seeds of Zea mays and Sorghum bicolor were obtained from the Division of Genetics, IARI; seeds of Arabidopsis were available in the laboratory.
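The block-scoring rule quoted in the microsynteny parameters (final score = match score + number of gaps × gap penalty, with a minimum of five anchor genes and at most 25 gaps) can be sketched as follows. This is a simplified reading for illustration, not MCScanX's own dynamic-programming implementation. Three things are assumptions here: the gap penalty is taken as negative so that gaps lower the score, a "gap" is counted as an intervening non-anchor gene between consecutive anchors, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def score_collinear_block(
    anchors: List[Tuple[int, int]],
    match_score: int = 50,
    gap_penalty: int = -1,   # assumed negative so that gaps reduce the score
    match_size: int = 5,
    max_gaps: int = 25,
) -> Optional[int]:
    """Score a candidate collinear block.

    `anchors` holds (gene order in genome A, gene order in genome B) pairs,
    sorted by position and increasing in both genomes (same orientation).
    Returns None when the candidate fails the block criteria.
    """
    if len(anchors) < match_size:
        return None  # too few anchor genes to call a block
    total_gaps = 0
    for (a0, b0), (a1, b1) in zip(anchors, anchors[1:]):
        step_a, step_b = a1 - a0, b1 - b0
        if step_a <= 0 or step_b <= 0:
            return None  # gene order is not collinear
        gaps = max(step_a, step_b) - 1  # intervening non-anchor genes
        if gaps > max_gaps:
            return None  # anchors too far apart
        total_gaps += gaps
    return len(anchors) * match_score + total_gaps * gap_penalty


if __name__ == "__main__":
    perfect = [(i, i) for i in range(5)]
    one_gap = [(0, 0), (1, 1), (3, 2), (4, 3), (5, 4)]
    print(score_collinear_block(perfect))  # 250
    print(score_collinear_block(one_gap))  # 249
```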
Seeds were germinated on moistened germination paper for five days, and after initial growth, the seedlings were transferred to 12-inch pots containing a mixture of cocopeat and sand (1:1) [...] Triticum aestivum as an outlier since no homologs of MIR172 were identified in wheat by in silico analysis. Using the CTAB method, genomic DNA was isolated from ten plant species and used as a template for amplification of MIR172A, B, C, and D precursor sequences using specific primers (Supplementary Table S2) in a master-cycler using high-fidelity DNA polymerase. The PCR reaction conditions were as follows: 30 s at 98 °C, followed by 30 cycles of 10 s at 98 °C, 15 s at different annealing temperatures, and 10 s at 72 °C. For validation, the amplified PCR products were sequenced and aligned with pre-existing sequences.

Expression Analysis of Mature MIR172
Total MIRNA was extracted from stage-staggered tissues of seven Oryza spp. (2 leaf-shoot, 2 leaf-root, 4 leaf-shoot, 10 leaf-shoot, 10 leaf-shoot apical, 10 leaf-root, flag leaf, booting panicle, panicle (0.5 cm), panicle (0.5-1 cm), panicle (1-2 cm), and panicle (2-4 cm)), sorghum (3 leaf-shoot, 3 leaf-root, 5 leaf-shoot, 5 leaf-root, growing point, flower that has not yet bloomed, complete flower-blooming, and G-blooming), maize (2 leaf-shoot, 2 leaf-root, 4 leaf-shoot, 10 leaf-shoot, 10 leaf-root, flag leaf, tassel, and silk), and Arabidopsis (2 rosette leaf-shoot, 2 rosette leaf-root, 4 leaf rosette-shoot, 10 leaf rosette-shoot, 10 leaf rosette-root, complete rosette growth, inflorescence, floral bud, and open flower) using a commercial MIRNA isolation kit as per the manufacturer's instructions. Using the Mir-X™ MIRNA First-Strand Synthesis Kit, cDNA was synthesized from the extracted MIRNA. Real-time PCR (qRT-PCR) was used to examine the expression of mature MIR172 using the Mir-X MIRNA qRT-PCR SYBR Kit with the mRQ 3′ universal reverse primer supplied with the kit, the species-specific mature MIR172 forward primer (Supplementary Table S3), and a reference dye (ROX) according to the manufacturer's protocol. In 96-well qPCR-compatible plates, the following cycles were used: initial denaturation for 5 min at 95 °C, followed by 45 cycles of 15 s at 95 °C, 30 s at 60 °C, and 30 s at 72 °C. U6 was used as an endogenous control to normalize the MIRNA expression level. The results were presented as the mean of three biological replicates, with three technical replicates for each biological replicate.
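The text states that U6 served as the endogenous control but does not name the quantification formula or the calibrator sample. Assuming the widely used 2^(−ΔΔCt) method, a fold change could be computed as in the following sketch; the Ct values are invented for illustration and the function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def delta_ct(target_ct: Sequence[float], reference_ct: Sequence[float]) -> float:
    """Mean Ct of the target (MIR172) minus mean Ct of the reference (U6)."""
    return mean(target_ct) - mean(reference_ct)


def fold_change(
    sample_target: Sequence[float],
    sample_reference: Sequence[float],
    calibrator_target: Sequence[float],
    calibrator_reference: Sequence[float],
) -> float:
    """Relative expression by the 2^(-ddCt) method (an assumed method)."""
    ddct = delta_ct(sample_target, sample_reference) - delta_ct(
        calibrator_target, calibrator_reference
    )
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Invented technical-replicate Ct values, not measurements from the study.
    flag_leaf_mir172 = [22.0, 22.1, 21.9]
    flag_leaf_u6 = [18.0, 18.0, 18.0]
    calibrator_mir172 = [25.0, 25.0, 25.0]
    calibrator_u6 = [18.0, 18.0, 18.0]
    print(fold_change(flag_leaf_mir172, flag_leaf_u6, calibrator_mir172, calibrator_u6))
```

With three biological replicates, the same calculation would be run per replicate and the resulting fold changes averaged.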

Phylogenomic Analysis of Precursor Sequences
MIR172 homologs' precursor sequences, along with 500 bp upstream sequences, were retrieved from the Gramene database. The synteny/evolutionary link between different MIRNAs was deduced using the Gramene database (Supplementary Table S1). Clustal Omega [63] was used to conduct multiple sequence alignment for the individual MIRNAs, and MEGA10 [64] was used to construct an unrooted tree using the Maximum-Likelihood (ML) technique. The Nearest-Neighbor-Interchange (NNI) method [65] was used to search for tree topology. For MIR172, the Tamura 3-parameter substitution model with a discrete Gamma distribution (+G) and 5 rate categories was used. The gamma shape parameter was estimated directly from the data, and 1000 bootstrap replicates were used in the procedure. The proportion of invariable sites was fixed. The tree was generated in the Newick format. iTOL (http://itol.embl.de/, accessed on 28 March 2021) was used to produce a graphical representation of the phylogenetic tree.
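To make the substitution model concrete, the sketch below computes a pairwise distance under the Tamura 3-parameter model, which corrects for transition/transversion bias and G+C content. It illustrates the model only: it does not reproduce the ML tree search performed in MEGA, it ignores the +G rate heterogeneity, and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def t92_distance(seq1: str, seq2: str) -> float:
    """Tamura 3-parameter distance between two aligned sequences.

    d = -h*ln(1 - P/h - Q) - 0.5*(1 - h)*ln(1 - 2Q), with h = 2*theta*(1 - theta),
    where P and Q are the transition and transversion proportions and theta is
    the G+C content. Columns containing gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    sites = [(a, b) for a, b in zip(s1, s2) if a in "ACGT" and b in "ACGT"]
    n = len(sites)
    if n == 0:
        raise ValueError("no comparable sites")
    transitions = transversions = gc = 0
    for a, b in sites:
        gc += (a in "GC") + (b in "GC")
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p, q = transitions / n, transversions / n
    theta = gc / (2 * n)
    h = 2 * theta * (1 - theta)
    if h == 0:
        raise ValueError("G+C content of 0 or 1; distance undefined")
    # math.log raises ValueError if the sequences are too divergent (saturation).
    return -h * math.log(1 - p / h - q) - 0.5 * (1 - h) * math.log(1 - 2 * q)


if __name__ == "__main__":
    # Toy sequences: two transitions in 12 sites, G+C content of 0.5.
    print(t92_distance("ACGTACGTACGT", "GCATACGTACGT"))
```

At a G+C content of 0.5 the expression reduces to the Kimura 2-parameter distance, which gives a convenient sanity check.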

Identification of MIR172 Homologs
MIR172 homologs were identified in all seven Oryza spp., sorghum, maize, and Arabidopsis except wheat, as no precursor of MIR172 could be identified in wheat using its miRBase database. In rice and its wild relatives, four homologs of MIR172, i.e., MIR172 (A-D), were identified. These homologs are located on different chromosomes throughout the rice genome (Table 1). In sorghum, a total of six homologs of MIR172, i.e., MIR172 (A-F), were identified, whereas, in maize and Arabidopsis, five homologs of MIR172, i.e., MIR172 (A-E) were identified. Since we used sativa as a reference for genome structure and evolution analysis, only four homologs corresponding to rice, i.e., MIR172 (A-D), were included in the study. Chromosomal locations and coordinates of each homolog (Table 1, Figure 1a-d) and the number of predicted homologs per MIR172 are provided in Table 2. The presence of four homologs in rice, five in maize and Arabidopsis, and six in sorghum was confirmed by the BLAST result in this study. All the homologs of MIR172 viz., MIR172A, B, C, and D were successfully amplified, and the nucleotide sequences have been submitted as Supplementary Data S1.

Expression Analysis of MIR172
MIR172 is known to play a cardinal role in vegetative to reproductive-stage transition in plants [7]. To determine the variations in the expression pattern of MIR172 during development, the expression profile of mature MIR172 was compared in six RWR and other Poaceae members. Accumulation of mature MIR172 increased with the growth of the plant, gradually increasing from the 2-leaf stage (0.582-fold) to the 10-leaf stage (8.54-fold) and reaching its maximum in the flag-leaf stage in rice (25.97-fold, Figure 2). However, there was no increase in MIR172 expression in roots in rice. Similarly, in sorghum, the expression of MIR172 increased from the 3-leaf stage (0.934-fold) to growth point differentiation (7.35-fold), reaching a maximum in the flag-leaf stage (24.9-fold, Figure 2). A similar pattern was observed in maize, with maximum expression in the flag leaf (23.9-fold). In the case of Arabidopsis, the expression level gradually increased from the 2-rosette leaf stage (0.745-fold) to the complete rosette growth stage (22.615-fold, Figure 2). Flag leaf or complete rosette growth marks the transition stage between the vegetative and reproductive stages. During the reproductive growth phase, increased accumulation of MIR172 was observed in the booting panicle stage in rice (13.465-fold), maize (13.546-fold), and sorghum (10.918-fold), and in the developing inflorescence in Arabidopsis (11.22-fold). Expression of MIR172 gradually decreased during panicle development. Mature MIR172 exhibited a similar expression profile in all Oryza spp., with minor variations in the magnitude of expression among different species (Figure 2).

Conservation and Divergence in Mature and Precursor Sequence of MIR172
Mature MIR172A and MIR172D of all the Poaceae members and Arabidopsis were found to be highly conserved (Figure 3a [...]). In MIR172B precursors, specific single-nucleotide substitutions were detected in maize and Arabidopsis in comparison to rice: G was replaced with A at the 203rd, 205th, and 208th positions. Specific single-base substitutions were also observed in sorghum at the 197th (A→G) and [...] In addition to the substitutions, insertions and deletions were also observed in MIR172D precursor sequences. A single-nucleotide insertion at the zero position was observed in brachyantha. An insertion of 10 nucleotides (CAAATAAACC) in sorghum and a nine-nucleotide insertion (CTTATGCCT) in Arabidopsis between the 53rd and 54th positions [...]

Spectrum of Sequence Variation in MIR172 Homologs
Tajima's D [67] and Fu and Li's F [68] tests of neutrality of sequence polymorphisms for the different MIR172 homologs, viz., MIR172A, MIR172B, MIR172C, and MIR172D, revealed non-significant negative values for the polymorphic loci. Among the four pre-MIRNA homologs examined across nine Poaceae species, no single nucleotide polymorphism was detected in MIR172A and, therefore, no neutrality test was conducted for MIR172A (Supplementary Figures S1 and S2). The highest (least negative, non-significant) Tajima's D value (−0.48816) was found in MIR172C, followed by MIR172B (−1.0403). The value for MIR172D was the lowest (−1.45822) and non-significant (Supplementary Figures S3, S5, and S7). Fu and Li's F values were also non-significantly negative and found to be consistent with Tajima's D (Supplementary Figures S4, S6, and S8). The nucleotide diversity of the Poaceae species studied ranged from 0.05657 at the pre-MIR172B locus to 0.05657 at the pre-MIR172C locus and 0.18291 at the pre-MIR172D locus. These findings suggest that differences in selection pressure experienced by cultivated varieties during improvement, as well as WGD events in non-Oryza species, account for sequence polymorphism at distinct MIRNA loci.
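For readers unfamiliar with the statistic, the following sketch computes Tajima's D from a set of aligned sequences using the standard formula (the number of segregating sites compared with the mean number of pairwise differences). It is an illustration rather than the software used in the study; the function name and the toy alignment are hypothetical. The sketch raises an error when no site is polymorphic, which mirrors why no test was possible for MIR172A.

```python
import math
from itertools import combinations
from typing import List


def tajimas_d(seqs: List[str]) -> float:
    """Tajima's D for aligned sequences; columns with gaps/ambiguity are skipped."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences (variance is zero below that)")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    columns = [
        col for col in zip(*(s.upper() for s in seqs))
        if all(base in "ACGT" for base in col)
    ]
    seg_sites = sum(1 for col in columns if len(set(col)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    # Mean number of pairwise differences across all sequence pairs.
    pairs = list(combinations(range(n), 2))
    pi = sum(
        sum(1 for col in columns if col[i] != col[j]) for i, j in pairs
    ) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Toy alignment in which every variant is a singleton, so D is negative.
    toy = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
    print(round(tajimas_d(toy), 3))
```

A negative D, as in the toy alignment, reflects an excess of rare variants relative to the neutral expectation; whether it is significant has to be judged against the appropriate critical values.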

Gene Conservation and Gene Density Analysis
The number of genes predicted in the 100 kb region of each homolog of MIR172 (Table 2) revealed the presence of 23 genes in the 100 kb region around MIR172A of Oryza sativa, and the highest percentage of sativa homologs was found to be conserved in rufipogon (73.91%, i.e., 17 conserved genes out of 23 detected genes). The lowest conservation (0%) was found in barthii, glaberrima, sorghum, maize, and Arabidopsis, i.e., disruptive synteny with complete loss of conservation. The maximum gene density was found in Oryza punctata (1 gene/3.703 kb), and the minimum density of genes was observed in maize (1 gene/7.14 kb) (Figure 6a,b). We observed the presence of 23 genes in the 100 kb region of MIR172B of sativa, with 58.8% conservation in rufipogon (10 out of 17 detected genes), and the lowest conservation of 0% was detected in brachyantha, sorghum, and maize (disruptive synteny), including Arabidopsis. The maximum gene density was found in sorghum (1 gene/3.073 kb), and the minimum density of genes was observed in punctata (1 gene/5.88 kb) (Figure 6a,b).
In MIR172C, 21 genes were identified in the 100 kb region of Oryza sativa, with 76.9% conservation in glumaepatula (10 out of 13 genes) and the lowest in punctata, brachyantha, sorghum, and maize (0%, i.e., disruptive synteny), including Arabidopsis. The maximum gene density was found in rufipogon and brachyantha (1 gene/3.571 kb), and the minimum density of genes was observed in maize (1 gene/8.33 kb) (Figure 6a,b). Two homeologs of MIR172D, i.e., MIR172Di and MIR172Dii, were identified in the sativa genome, and 23 genes were identified in the 100 kb region harboring MIR172Di of sativa. The highest percentage of sativa homologs was found to be conserved in rufipogon (61.1%, i.e., 11 out of 18 genes) and the lowest in brachyantha, sorghum, Arabidopsis, and maize (0%, i.e., disruptive synteny) (Figure 6a). The maximum gene density was 1 gene/3.703 kb in sorghum, while the minimum gene density of 1 gene/5.882 kb was observed in punctata (Figure 6b). Furthermore, 19 genes were identified in the 100 kb region harboring MIR172Dii of Oryza sativa sub-species indica, and the highest percentage of sativa homologs was found to be conserved in rufipogon (72.2%, i.e., 13 out of 18 genes). However, disruptive synteny showing a total loss of conservation was detected in brachyantha, sorghum, and maize. Further, the synteny results revealed the highest gene density of 1 gene/3.073 kb in sorghum, with punctata exhibiting the lowest gene density of 1 gene/5.88 kb.
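The two summary quantities used in this section are simple ratios, shown here as a short sketch: the percentage of conserved genes is the conserved count divided by the detected count, and gene density is the region length divided by the number of predicted genes. The function names are hypothetical; the 17-of-23 example is taken from the MIR172A comparison with rufipogon, whereas the count of 17 genes used for the density example is an assumed value chosen for illustration.

```python
def percent_conserved(conserved: int, detected: int) -> float:
    """Percentage of detected genes that are conserved."""
    if detected <= 0:
        raise ValueError("detected gene count must be positive")
    return 100.0 * conserved / detected


def kb_per_gene(region_kb: float, n_genes: int) -> float:
    """Gene density expressed as kilobases per gene (1 gene / x kb)."""
    if n_genes <= 0:
        raise ValueError("gene count must be positive")
    return region_kb / n_genes


if __name__ == "__main__":
    # 17 conserved genes out of 23 detected (MIR172A, rufipogon).
    print(round(percent_conserved(17, 23), 2))  # 73.91
    # Assumed count of 17 genes in a 100 kb window, for illustration only.
    print(round(kb_per_gene(100, 17), 2))       # 5.88
```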

Microsynteny Analysis
Inter-species synteny block analysis of MIRNA172 homologs revealed that out of 218 predicted genes in MIR172A, 129 genes (59.17%) were collinear. Similarly, 88 genes (45.13%) out of 195 predicted genes in MIR172B, 77 genes (36.49%) out of 211 predicted genes in MIR172C, and 91 genes (38.72%) out of 235 predicted genes in MIR172D were collinear. In the multi-alignment of gene orders, the alignment of non-anchor genes is marked by '||'. Synteny or collinearity blocks (Supplementary Figures S9-S45) were constructed using sativa and other relevant reference genomes to develop saturated synteny/collinearity blocks for each homolog of MIRNA.

(iii) MIR172C
Analysis of the 100 kb region surrounding MIR172C revealed the conservation of microsynteny amongst glumaepatula, rufipogon, barthii, and glaberrima (Figures 7c and 8c). However, a complete loss of microsynteny was observed in punctata and brachyantha. Amongst the 21 genes identified in the 100 kb region harboring MIR172C of sativa, six genes, namely, F-actin-capping protein subunit alpha, Ribosomal protein L23/L15e family protein, hypothetical protein OsI_25745, adenyl cyclase-like protein, G-type lectin S-receptor-like serine/threonine-protein kinase At1g34300, and Retrotransposon protein, Ty1-copia subclass, were found to be conserved in glumaepatula, rufipogon, barthii, and glaberrima. Another hypothetical protein gene, OsI_25741, was found to be conserved in barthii, rufipogon, and glumaepatula. Similarly, hypothetical protein DAI22_07g114300 and fatty acyl-CoA reductase 1 were found to be conserved in barthii, glumaepatula, and glaberrima. The gene retrotransposon protein, Ty3-gypsy subclass was found to be conserved in barthii and glaberrima. Amongst others, activator-like transposable element and phosphatidylinositol 4-phosphate 5-kinase 6-like were the two genes found to be conserved only in rufipogon, while no sativa homologs were conserved in sorghum, maize, or Arabidopsis, indicating complete disruption of microsynteny (Figures 7c and 8c).
Microsynteny conservation in the 100 kb region surrounding MIR172D was detected amongst punctata, glumaepatula, rufipogon, barthii, and glaberrima, but was completely lost in brachyantha. Two homeologs of MIR172D, i.e., OsMIR172Di and OsMIR172Dii, were identified in the sativa genome. Out of the 23 genes detected in the 100 kb region harboring OsMIR172Di of sativa, two genes, namely protein LUTEIN DEFICIENT 5 and pentatricopeptide repeat-containing protein At2g22410, were found to be conserved in punctata, glumaepatula, rufipogon, barthii, and glaberrima. The genes heavy metal-associated isoprenylated plant protein 3-like and Pib variant protein were found to be conserved in punctata, glumaepatula, barthii, and glaberrima. Another rice homolog, putative brown planthopper-induced resistance protein 1, was conserved in punctata, rufipogon, barthii, and glaberrima, and hypothetical protein DAI22_02g385300 was conserved in punctata, glumaepatula, rufipogon, and glaberrima. Further, genes such as DNA ligase-like and probable protein S-acyltransferase 15 were conserved in all Oryza spp. except punctata, glumaepatula, and brachyantha. Other genes, such as putative telomere binding protein-1; TBP1, cyst nematode resistance protein-like protein, and protein N-lysine methyltransferase METTL21A, were observed to be conserved in barthii, punctata, and glumaepatula, respectively. No sativa homologs were found to be conserved in sorghum, maize, and Arabidopsis, indicating complete disruption of microsynteny (Figures 7d and 8d).
Out of the 19 genes identified in the 100 kb region of OsMIR172Dii in sativa, four genes, pentatricopeptide repeat-containing protein At2g22410, heavy metal-associated isoprenylated plant protein 3-like, thioredoxin-like 3-2, and OTU domain-containing protein 5-A, were found to be conserved in punctata, glumaepatula, rufipogon, barthii, and glaberrima. The rice homolog hypothetical protein DAI22_02g385300 was conserved in punctata, glumaepatula, and rufipogon; pumilio homolog 1-like was conserved in punctata, glumaepatula, rufipogon, barthii, and glaberrima; and P-loop NTPase domain-containing protein LPA1 was conserved in glumaepatula, rufipogon, barthii, and glaberrima. Further, the genes DNA ligase-like and probable protein S-acyltransferase 15 were found to be conserved in rufipogon, barthii, and glaberrima, while the Pib variant protein was conserved in punctata, glaberrima, and barthii. Nonetheless, no sativa homologs were detected in sorghum, maize, or Arabidopsis, indicating complete disruption of microsynteny.

Discussion
Studies have shown that whole-genome duplication (WGD), in addition to the tandem duplication of MIRNA genes, is involved in their evolution, but its contribution varies from species to species [69], which means that, apart from their ancestral origin, plant MIRNAs may have been generated by duplication of pre-existing MIRNA genes. Our study revealed the presence of single nucleotide polymorphism in precursor sequences of MIR172 homologs at multiple positions along with deletions and insertions (Figures 3-5). This can be ascribed to two rounds of WGD that Arabidopsis underwent after splitting from papaya during the course of evolution [70]. However, monocots, unlike eudicots such as Arabidopsis, underwent only one shared ancestral WGD during their evolution [70]. Our analysis of the precursor sequences of MIR172A (Figure 4) in Poaceae revealed that it is conserved in Oryza except in barthii, glaberrima, punctata, and rufipogon, which showed deletions at a few positions. Nonetheless, brachyantha and punctata showed deletions in MIR172B and MIR172C, while an insertion in brachyantha and a deletion in punctata were observed in MIR172D (Figure 5). These can be ascribed to the genome downsizing and resistance to genome expansion in brachyantha and glaberrima that could have led to structural variation. Similarly, in punctata and barthii, both genome expansion and contraction might have led to structural variations [71]. The polymorphism in the precursor sequences of sorghum and maize could be ascribed to the WGD event that occurred 30 million years ago (MYA), separating the lineage of sorghum and maize from that of rice [72]. Further, the variations in the precursor sequence of maize could be due to an extra WGD event leading to the expansion of the MIRNA gene family [73]. In fact, duplicated MIRNA genes in maize underwent extensive gene loss, with approximately 35% of ancestral sites retained as duplicate homoeologous MIRNA genes [74].
Common grass triplication and genome hybridization in wheat [75] might be accountable for the complete deletion of wheat MIR172 homologs. WGDs shape the number of MIRNA genes, but their number and pattern vary in a species-specific manner [69]. This might be responsible for the disruption of microsynteny among rice wild relatives and the complete loss of microsynteny in sorghum, maize, and Arabidopsis.
MIR172 belongs to the highly conserved MIRNA families [22], and the multiple sequence alignment of mature MIR172 revealed that it is conserved amongst all seven Oryza species. Our result is commensurate with a previous report that MIRNA genes contain fewer SNPs than their contiguous border regions, and that the mature sequences of MIRNA genes contain fewer SNPs than their precursors [76,77]. The multiple sequence alignment of mature and precursor MIR172 showed the same pattern: we observed fewer SNPs in mature MIRNA sequences than in precursor sequences. SNPs have different impacts on the functionality of MIRNA; for example, variants in miRNA promoter regions and other regulatory regions may result in an altered transcription rate, and variants in splice sites of the host gene (for intronic miRNAs) or of the poly-cistron (clustered miRNAs) can result in aberrant expression patterns. In this regard, Tajima's D [67] and Fu and Li's F [68] tests were used to estimate the neutrality of sequence polymorphism for MIRNA genes. These tests detect both positive and balancing selection [78]. Our results suggest non-neutral sequence variations in all the MIRNAs. Previously, negative findings in one or both tests were reported in rice and Arabidopsis [5,68]. Similarly, the nucleotide diversity was found to differ amongst MIRNA loci. As a result, MIRNAs are effective targets for the differential accumulation of variation in populations subjected to selection pressures.
Expression analysis of the mature MIR172 sequences in root and shoot tissues at different developmental stages helped in determining the spatial and temporal expression of MIR172. However, discerning the expression of individual MIR172 family members was unlikely, as two out of the four MIRNA homologs (MIR172A and D) are completely conserved, while the other two (MIR172B and C) have minimal sequence differences. Comparative analysis of MIR172 showed similar expression profiles in different developmental tissues (Figure 2). MIR172 was highly expressed in late vegetative stages (10-leaf stage, 12.23-fold), flag leaf (25.97-fold), and developing panicles (13.465-fold). However, there was no increase in its expression in roots in rice. A similar trend of expression was observed in other Poaceae members and in Arabidopsis. The lower levels of MIR172 expression observed in the early vegetative stage can be ascribed to its role in the transition [79] of plant development from the juvenile to the flowering stage by regulating AP2-like genes, including the target of early activation tagged 1/2/3. Higher expression of MIR172 in late vegetative and panicle stages (Figure 2) is consistent with its role in the acquisition of floral competence [80]. The increasing level of MIR172 is commensurate with the appearance of adult traits, as its over-expression leads to early flowering (Figure 2).
Microsynteny/collinearity analysis provides an insight into the shared ancestry of groups of genes and unravels the evolutionary history of genomes and gene families, establishing gene orthology [81]. Microsynteny analysis of MIR172 revealed that gene collinearity within the genus Oryza is conserved. However, disruption of synteny in the 100 kb region of MIR172A in barthii (AA) and glaberrima (AA), of MIR172B in brachyantha (FF), and of MIR172C in punctata (BB) was observed. The Oryza spp. that lost microsynteny belonged to three distinct sub-genomes. Amongst these, glaberrima is a domesticated species, while the other three are wild relatives. Evolutionary studies revealed that the AA genome diverged from FF progenitors almost 15 MYA [82], whereas the divergence between AA- and BB-genomes occurred 9.11 MYA [83]. Gene loss events observed in glaberrima (AA), barthii (AA), and brachyantha (FF) plausibly led to the differences in gene family content.
The progenitors of brachyantha and punctata diverged from sativa progenitors during the course of evolution from FF- and BB-genomes to AA-Oryza genomes, in spite of a well-conserved genome organization and well-preserved gene order. The loss of microsynteny in MIR172A in barthii (AA) and glaberrima (AA), MIR172B in brachyantha (FF), and MIR172C in punctata (BB) can be due to these factors. It is reported that sativa was domesticated from rufipogon (perennial wild rice) around 9000 years ago in Asia, and glaberrima was domesticated from barthii independently, around 3000 years ago in West Africa [84,85]. This corroborates our result, as the highest number of sativa gene homologs was found to be conserved in rufipogon for MIR172A, B, and D, followed by MIR172C (Figure 7). Additionally, most of the barthii gene homologs were conserved in glaberrima (Supplementary Figures S1, S2, S10, S11, S19, S20, S28, and S29).

Conclusions
Our in-depth analysis of microsynteny/collinearity of MIR172 and its homologs in seven different Oryza species, sorghum, maize, and Arabidopsis is a comprehensive evolutionary study based on MIRNA sequence variation and conservation of orthologous genes. We identified the orthologous MIRNA genes of rice, and microsynteny analysis of the genes harbored in the surrounding 100 kb regions revealed that the gene pattern and content are conserved (conservative evolution) among Oryza species, with exceptions depending upon the genome type and selection pressure during evolution/domestication. However, the microsynteny in sorghum, maize, and Arabidopsis was completely lost (disruptive evolution) during the course of evolution due to WGD events. Gain/loss of genes or chromosomal repatterning might have caused structural variation, but overall gene content and order in MIRs are maintained. Low genetic diversity at different MIRNA loci in cultivated rice revealed that the rice wild relatives are untapped genetic reservoirs to be harnessed for crop improvement. Abiotic [86] and biotic stresses [43] due to inconsistent climate indices and population outbursts pose a threat to national food security. The green revolution has no doubt provided superior cultivars for improved food grain production, but stringent selection has created a bottleneck in the genetic variability of domesticated crops. Owing to their habitat in a robust natural environment, crop wild relatives (CWRs) have managed to maintain a higher level of genetic variability. Utilizing the untapped genetic resources available in CWRs for crop improvement is an attractive option for further improving food production in the future.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cells12101370/s1, (1) Dash et al. supplementary file S1.pdf, Figure S1: Screenshot of Tajima's test conducted on MIR172A sequences; Figure S2: Screenshot of Fu and Li's test conducted on MIR172A sequences; Figure S3: Screenshot of Tajima's test conducted on MIR172B sequences; Figure S4: Screenshot of Fu and Li's test conducted on MIR172B sequences; Figure S5: Screenshot of Tajima's test conducted on MIR172C sequences; Figure S6: Screenshot of Fu and Li's test conducted on MIR172C sequences; Figure S7: Screenshot of Tajima's test conducted on MIR172D sequences; Figure S8: Screenshot of Fu and Li's test conducted on MIR172D sequences; Figure S9: Synteny block diagram for MIR172A with Oryza barthii as reference; Figure S10: Synteny block diagram for MIR172A with Oryza glaberrima as reference; Figure S11: Synteny block diagram for MIR172A with Oryza glumaepatula as reference; Figure S12: Synteny block diagram for MIR172A with Oryza rufipogon as reference; Figure S13: Synteny block diagram for MIR172A with Oryza brachyantha as reference; Figure S14: Synteny block diagram for MIR172A with Oryza punctata as reference; Figure S15: Synteny block diagram for MIR172A with Sorghum bicolor as reference; Figure S16: Synteny block diagram for MIR172A with Zea mays as reference; Figure S17: Synteny block diagram for MIR172A with Arabidopsis thaliana as reference; Figure S18: Synteny block diagram for MIR172B with Oryza barthii as reference; Figure S19: Synteny block diagram for MIR172B with Oryza glaberrima as reference; Figure S20: Synteny block diagram for MIR172B with Oryza glumaepatula as reference; Figure S21: Synteny block diagram for MIR172B with Oryza rufipogon as reference; Figure S22: Synteny block diagram for MIR172B with Oryza brachyantha as reference; Figure S23: Synteny block diagram for MIR172B with Oryza punctata as reference; Figure S24: Synteny block diagram for MIR172B with Sorghum bicolor as reference; Figure S25: Synteny block diagram for MIR172B with Zea mays as reference; Figure S26: Synteny block diagram for MIR172B with Arabidopsis thaliana as reference; Figure S27: Synteny block diagram for MIR172C with Oryza barthii as reference; Figure S28: Synteny block diagram for MIR172C with Oryza glaberrima as reference; Figure S29: Synteny block diagram for MIR172C with Oryza glumaepatula as reference; Figure S30: Synteny block diagram for MIR172C with Oryza rufipogon as reference; Figure S31: Synteny block diagram for MIR172C with Oryza brachyantha as reference; Figure S32: Synteny block diagram for MIR172C with Oryza punctata as reference; Figure S33: Synteny block diagram for MIR172C with Sorghum bicolor as reference; Figure S34: Synteny block diagram for MIR172C with Zea mays as reference; Figure S35: Synteny block diagram for MIR172C with Arabidopsis thaliana as reference; Figure S36: Synteny block diagram for MIR172D with Oryza barthii as reference; Figure S37: Synteny block diagram for MIR172D with Oryza glaberrima as reference; Figure S38: Synteny block diagram for MIR172D with Oryza glumaepatula as reference; Figure S39: Synteny block diagram for MIR172D with Oryza rufipogon as reference; Figure S40: Synteny block diagram for MIR172D with Oryza brachyantha as reference; Figure S41: Synteny block diagram for MIR172D with Oryza punctata as reference; Figure S42: Synteny block diagram for MIR172D with Oryza sativa (ii) as reference; Figure S43: Synteny block diagram for MIR172D with Sorghum bicolor as reference; Figure S44: Synteny block diagram for MIR172D with Zea mays as reference; Figure S45: Synteny block diagram for MIR172D with Arabidopsis thaliana as reference; Table S1: Coordinates for 100 kb region harboring MIRNAs for microsynteny analysis and coordinates for promoter and precursors for phylogenetic analysis; Table S2: Forward and reverse primer sequence for amplification of MIR172A, B, C and D from different Poaceae members and Arabidopsis; Table S3: Forward primer sequence for RT-qPCR used for generating the expression profile of MIR172 in different members of Poaceae and Arabidopsis; Supplementary Data S1: Nucleotide sequence (Sanger sequence) of PCR amplified MIR172 A-D from different Poaceae members; (2) Dash et al. supplementary file S2.xls.
Author Contributions: P.K.D., N.K.S. and R.R. conceived the idea and designed the experiments. P.K.D., P.K.R., M.R.M. and S.K.P. obtained the resources. P.K.D. and P.G. carried out the experiments. P.K.D., P.G., S.K.P., T.D.S. and R.R. analyzed the data. P.G. prepared the manuscript with contributions from P.K.D., R.R., R.S., T.D.S. and N.K.S. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.