Diff isomiRs: Large-scale detection of differential isomiRs for understanding non-coding regulated stress omics in plants

Plants have an amazing ability to cope with wide variety of stresses by regulating the expression of genes and thus by altering the physiological status. In the past few years, canonical microRNA variants (isomiRs) have been shown to play pivotal roles by acting as regulators of the transcriptional machinery. In the present research, we present Diff isomiRs, a web-based exploratory repository of differential isomiRs across 16 sequenced plant species representing a total of 433 datasets across 21 different stresses and 158 experimental states. Diff isomiRs provides the high-throughput detection of differential isomiRs using mapping-based and model-based differential analysis revealing a total of 16,157 and 2,028 differential isomiRs, respectively. Easy-to-use and web-based exploration of differential isomiRs provides several features such as browsing of the differential isomiRs according to stress or species, as well as association of the differential isomiRs to targets and plant endogenous target mimics (PeTMs). Diff isomiRs also provides the relationship between the canonical miRNAs, isomiRs and the miRNA-target interactions. This is the first web-based large-scale repository for browsing differential isomiRs and will facilitate better understanding of the regulatory role of the isomiRs with respect to the canonical microRNAs. Diff isomiRs can be accessed at: www.mcr.org.in/diffisomirs.

Functional genomics of abiotic stress tolerance in plants is at the forefront of the 21 st century. To improve plant longevity and sustainability, several approaches such as high-throughput expression profiling, next generation sequencing and emerging gene targeting are used in combination with each other and playing a key role in developing solutions [1][2][3] . In line with these approaches, multiple efforts have been leveraged to understand the transcriptional and post-transcriptional machinery, which includes the high-throughput profiling of the gene arrays and understanding the regulatory elements. Among the regulatory elements, non-coding RNAs, e.g. miRNAs 4,5 , artificial microRNAs (amiRNAs) 6 , circular RNAs (circRNAs) 7,8 , and long non-coding RNAs (lncRNAs) 9 have been shown to be among the dominant class of non-coding RNAs in shaping the post-transcriptional events in plants. Plant microRNAs play a key role in defining the post-transcriptional regulation by altering the transcriptional regulation either through cleavage or translational suppression 10 . microRNA biogenesis has been long studied through the development of HUA1 ENHANCER1 (HEN1) loss-of-function mutants, which is defective in methylating the microRNA duplex prior to exporting them to cytoplasm by Exportin5 11 . Biogenesis pathways of endogenous microRNAs are well established in plants elucidating the conversion from the pri-miRNAs to pre-miRNAs by Dicer-like (DCL1), followed by subsequent methylation by HEN1 methyltransferase and recruited by the ARGONAUTE 1 (AGO1) to form the RNA-induced silencing complex (RISC), which later causes the post-transcriptional suppression. miRNA biogenesis can be regulated by several factors such as whether the Scientific RepoRts | (2019) 9:1406 | https://doi.org/10.1038/s41598-019-38932-w database (http://bioinf.scri.sari.ac.uk/cgi-bin/plant_snorna/home) 47 . Plant long non-coding RNAs were downloaded from CANTATAdb 48 , and GREENC 49 .

Differential isomiR identification and development of Diff isomiRs.
For the identification of differential isomiRs, all the smallRNA reads were firstly cleaned using a homemade Perl script which calls cutadapt for quality control and adapter trimming 50 and the adapter-free reads were then collapsed and mapped to rRNAs, tRNAs, snoRNAs and long non-coding RNAs downloaded from abovementioned database. Reads mapped to any of the categories were excluded from further analysis. Filtered reads were mapped to genome features to estimate the proportion of the reads mapping to the relative genome features. The adapter-free, filtered and collapsed small RNAs were then mapped to pre-miRNAs allowing no mismatch using bowtie 51 . Templated isomiRs were defined as previously described in isomiR2Function 42 . For the estimation of the differential expression, we implemented two methods: 1. First method calls the differential expression of the isomiRs using the dispersion as implemented in the package DESeq2 52 and 2. The second method hereby is referred to as mapping method estimating the differential expression by considering the mapping counts. Briefly, the abundance of each isomiRs is normalized by the total read count and p-value was estimated using the previously defined statistical tests 53,54 and winflat 53 .
For the estimation of the p-value, we evaluated all the pairs as defined in a given condition by comparing the control vs stress in a single stress experiment and by comparing the control vs (stress) n in case of multiple stress conditions. In case of multiple stresses defined in a given experiment, we also compared the defined (stress) n to (stress) n-1 . Log2FC was then evaluated as defined above for the single stress and the multiple stress experiments and the statistical value was estimated as defined by Wang et al. 55 and described below: In the above defined condition, N 1 and N 2 define the total number of clean reads in the control and stress in the case of single stress experiment and in the case of multiple stress experiment, N 1 and N 2 define the total number of clean reads for the respective comparison. For the identification of the targets, psRNAtarget was used, which is accessible through plantgrn.noble.org/psRNATarget/ 56 . For the integration of the target mimics, microRNAs and corresponding target mimics were retrieved from PeTMbase (http://petmbase.org) 57 . For the identification of the miRNA-target interactions, target interactions with enogenous RNAs were downloaded from PCeRBase available from http://bis.zju.edu.cn/pcernadb/index.jsp 58 . Diff isomiRs has been developed using the MySQL as a backend database and uses PHP as a front end. Diff isomiRs uses WOFF2 font for faster loading of the web-pages. Diff isomiRs is hosted on a 16 core Intel Xeon machine with Ubuntu as an operating system and allows for the rapid searches and easy-to-browse interface is equipped with several search options.

Results and Discussion
Diff isomiRs: front end to the isomiR biology. Post-transcriptional regulation is an important phenomenon controlled by several classes of regulatory RNAs, among which microRNAs play an important role as post-transcriptional regulators and as check points for transcriptional coordination. Leveraged by genome-wide microRNA profiling, substantial roles of microRNAs have been widely elucidated in development 10 and stress regulation 29,59 . Limited evidences exist for the implication of isomiRs in stress, however it has been shown that stress widely regulates the expression of isomiRs and enhances the target repression repertoire 26,60 . To address this knowledge gap, we developed Diff isomiRs, which is an integrated information portal and catalogues the differential isomiRs in stress treatments across 16 plant species. The computational workflow along with the features implemented in the Diff isomiRs are presented in Fig. 1. Table 1 displays the corresponding number of experiments and datasets embedded in Diff isomiRs for each species. In total, Diff isomiRs contains 433 datasets representing 21 stress conditions classified into few types such as cold, heat, light, drought, submergence, salt etc. Although these canonical variants have been defined as 'templated' and 'non-templated' based on the subsequent processing of DROSHA/DICER enzymatic machinery 41 , Diff isomiRs provides detection and differential expression of templated isomiRs with two particular aims: 1. Robust identification of isomiRs with emphasis on the differential isomiRs patterns across the stress omics; and 2. Easy-to-explore interface of differential isomiRs and the canonical miRNAs to understand and reveal the prominent role of the nucleotide substitution events, which acts as check point for both the post-transcriptional controls and isomiRs biogenesis. For the identification of the differential expressed isomiRs, two differential expression algorithms provide a total of 16,157 and 2,028 differentially expressed isomiRs across the 21 different stress types implemented in Diff isomiRs. Table 2 displays the categorical distribution of the studied stress and the number of the differential isomiRs present in each stress according to the mapping-based method and model-based dispersion estimation. Interestingly, 2,010 differentially isomiRs were found across both algorithms. Supplementary Table 2 details the number of identified isomiRs in each representative stress for each species included in the Diff isomiRs.
Among the stress datasets, drought and salt stress revealed the most differentially expressed isomiRs with a total of 7,642 and 401 by mapping-based method and DESeq2 in drought stress and a total of 3,776 and 203 in salt stress respectively. Interestingly among these 397 isomiRs are represented by both methods in drought stress and 193 in salt stress respectively. Nevertheless, previous reports have widely established the role of the Scientific RepoRts | (2019) 9:1406 | https://doi.org/10.1038/s41598-019-38932-w microRNAs in drought and salt stresses [61][62][63][64] , with limited reports addressing the role of isomiRs in stress which is mainly due to the lack of availability of high-throughput isomiR identification approaches. isomiR diversity as exemplified in above stress examples and present in Diff isomiRs can contribute to the better understanding of the post-transcriptional responses in these stresses. It is worth to mention that isomiR over expression has widely been documented 60 and combination of the miRNAs and their canonical variants has been shown with improved target prediction 26 . Interestingly, we identified several important classes of miRNAs giving the most abundant isomiRs in each species (Table 3) such as ath-miRf10564-npr, which is similar to ath-miR172b-3p and targets mRNAs of AP2 proteins and plays an important role in development 65 . Similarly, in case of Brachypodium distachyon, we identified most abundant isomiRs originating from miR156 family which targets SQUAMOSA PROMOTER BINDING PROTEIN (SBP) genes and plays an important role in the developmental processes 66 .
Terminal modification either at the 5′-or the 3′-terminus plays an important role in directing the microRNA movement especially 5′-terminal nucleotides, which have been widely shown to play an important role in ARGONAUTE sorting, for example, preferential uptake of the sRNAs having 5′ cytidine has shown to be widely beneficial for AGO5 67 . These terminal modifications have not only been effective in associating the microRNAs to the AGO complexes but also revealing a class of microRNAs variants (isomiRs). On such classical example is the presence of the terminal additions at the 5′ nucleotides in miR157, which alters the binding efficiency of the miR157 to SPL genes 68 Across the implemented stresses in Diff isomiRs, a total of 33,874 isomiRs were detected, with most of the isomiRs revealing the uridine at the 5′-terminus (Table 4). Previously uridine at the 5′-terminus has been shown to be one of the dominant phenomena of palnt miRNAs and AGO1 recruited smallRNAs 67 , which supports the present finding and also suggests uridine at 5′-terminus contributes to the isomiR functionality. Among all stresses, highest amount of the 3′-terminal cytidine addition was seen across the control, drought, salt and cold conditions in order (Table 5). It was reported that non-templated cytidine addition had been shown to be a dominant mechanism for isomiR biogenesis 22 . In present work, we observed that cytidine addition is also a dominant mechanism for templated isomiR biogenesis.
Diff isomiRs: Implemented browsing patterns for differential isomiRs. Next generation sequencing has accelerated both the identification and characterization of isomiRs and the terminal modifications associated with isomiRs 65,69 . However, large-scale isomiR mining and development of open access exploratory portal are still lacking. Currently there is only a single repository, isomiRBank 41 for plants, which provides pre-compiled set of isomiR across four plant species. However, isomiR diversity in stress is yet to be addressed. Considering this relative lack of open access web-based exploratory for stress associated isomiRs, Diff isomiRs was developed with an intuitive graphical display for ease of stress associated isomiR browsing. Diff isomiRs provides four features for mining of differential isomiRs (Fig. 2) such as selection according to species, stress, differential

Sorghum bicolor 8 5
Triticum aestivum 90 28 Vitis vinifera 2 1 Zea mays 70 19 Total 433 158      (Fig. 2). In addition to aforementioned experimental details, summary statistics such as: 1. Number of cleaned and collapsed reads; and 2. Reads assigned to genome and genome associated features and corresponding lncRNAs present in the respective species were shown at the same page (Fig. 2).
To allow for the rapid visualization, differential isomiRs along with the log2FC and p-value calculated by mapping-based and model-based methods are listed in sorted tables along with sequence features. A quick download hyperlink allows to download the entire table in TAB-delimited format for manual curation (Fig. 2). Each differentially expressed isomiRs is further hyperlinked to isomiR specific page, which list several features, including: 1. Comparison of the mature miRNAs and isomiRs; 2. Read depth of the identified isomiRs across the datasets and their corresponding accession numbers; 3. Presence and absence of the isomiRs in other species; 4. Whether  being identified as isomiR in other species; 5. psRNAtarget predicted target of the corresponding isomiRs along with the corresponding function and 6. Target comparison between isomiRs and corresponding canonical miR-NAs (Fig. 3). To access the target site cleavage, each target is hyperlinked to a pop-up box, which displays the target site cleavage and the UPE and the corresponding coordinates of the target site (Fig. 3). Species, Stress, DiffExp and isomiR selection pages of Diff isomiRs allow for downloading of the information in TAB-delimited format, which can be easily exported for further manual curation. Target mimics in plants analogous to miRNA sponges in animals is a novel way to reduce the miRNAs targeting transcripts resulting in the overexpression of the target transcripts 70 . Recent approaches have enabled the high-throughput and computational-based prediction of plant endogenous target mimics (PeTMs, http://petmbase.org). The present version of the Diff isomiRs reports the alignment between the canonical miRNAs and isomiRs and the PeTMs with a 50 bp up-and down-stream window to understand the truncation events between the identified isomiRs, corresponding miRNAs and PeTMs (Fig. 3). Interestingly, we found a total of 4,230 isomiRs corresponding to 716 canonical miRNAs and 918 PeTMs from 11 species. Among these, species Glycine max represented the maximum number of isomiRs associated with PeTMs (962) and Hordeum vulgare represented the least number of isomiRs associated with PeTMs (7). In model plant Arabidopsis thaliana, multiple knock-down lines using artificial miRNA target mimicry have been previously reported 71 . Availability of these isomiRs associated with canonical miRNAs will accelerate the development of the multiple knock-down lines in model plants. In addition, Diff isomiRs can be widely used to establish the role of the isomiRs and can be further explored for understanding the truncation events, isomiRs biogenesis and miRNA target repository. miRNA-target interactions play an important role in revealing the targets, which can play a critical role in unravelling the temporal regulation of miRNAs. With the increased understanding of the miRNA-target interactions and their role in gene mediated pathways 72 , we downloaded all the miRNAs-target interactions from the recently published PCeRBase database 58 and incorporate them into the Diff isomiRs for the canonical miRNAs and isomiRs for which the targets are predicted. Incorporation of these interactions provides an important feature to understand and expand the repertoire of endogenous interactions at the level of not only miRNAs but also with respect to isomiRs. isomiR specific page and corresponding target information are hyperlinked with the endogenous interactions as clickable links (Fig. 3). Availability of this information is critical to understand whether isomiRs can also play an important role in endogenous target interactions and for further validations through the easy-to-browse interface of Diff isomiRs which allows for the quick list of these interactions with respect to isomiR and canonical miRNAs. miRNA156 isomiRs: Case example of evolutionary conserved microRNAs in Diff isomiRs. Plant regulates its developmental processes through the systematic alterations of expressions of protein-coding genes.
In addition to those, several non-coding RNAs especially miRNAs, which can alter the expression profile of the host genes, also play a critical role in regulating the host developmental processes. Vegetative phase is critical and crucial to the plant growth and development. Systematic expression of the genes along the leaf blades are required for the consecutive activation of the growth cascades. miRNA156 represents a classical example of the highly conserved microRNA family from the evolutionary point of view that not only on the basis of the target site conservation but also the evolutionary conservation among the sequence features that governs the target site identification and the formation of the fold back structures 73,74 . Complex regulatory events of miR156 have been widely observed ranging from the developmental functions to responding environmental stresses 75,76 . Recently, it has been shown that the miR156 regulated vegetative development is regulated and also in coordination with miR159, whose loss can temporally increase the miR156 level and hence increase the shoot development and delay vegetative growth 77 , although the conserved miR156 mediated inhibition of the SQUAMOSA PROMOTER BINDING PROTEIN LIKE (SPL), which are critical regulators of the developmental transitions, has been well documented.
In particular, miR156 and miR157 both possess the ability to regulate the expression pattern of the major defining gene SQUOMOSA PROMOTER BINDING PROTEIN-LIKE (SPL). However, the binding efficiency of the miR156 and miR157 differs to SPL genes primarily due to the presence of the nucleotide additions at the 5′-terminus of the miR157 leading to imperfect binding 68 and thus despite being more abundant than miR156, its mode of action is subsequently less as compared to miR156. The above example posed an interesting question suggesting the relative role of the nucleotide's additions to the 5′-and 3′-end of the miRNAs and their relative binding efficiency. Gleaning from the above example, it is quite interesting to understand that whether these nucleotide additions in isomiRs of the parent miRNAs can affect the binding efficiency.
However, recent studies by Wei et al. 76 also highlights the role of the miR156 family in regulating the genes involved in carotenoid metabolism using sk156 mutant thus expanding the functional repertoire of miR156. In Vitis vinifera, miR156 plays an important role in fruit ripening and also downregulates the level of miR172, which targets APETALA2 (AP2) transcription factors 78 . Previous reports such as Baev et al. 60 and Jeong et al. 65,69 further elucidated the confirmatory higher expression of the isomiRs as compared to the canonical miRNA156 suggesting the evolutionary basis of isomiR biogenesis. To address, this question, there has been a relative lack of the studies providing a comprehensive set of the well characterized isomiRs showing the differential expression across stress conditions, which can be explored for the binding efficiency. Taking into account the critical role of miR156 as a complex regulator of the genes as well as regulator of other miRNAs with confirmatory isomiRs, we present a case browsing example of miR156 using Arabidopsis thaliana as an case example (Fig. 4), which highlights the relative features such as the identification of the differential isomiRs and their classification according to the log2FC and p-value, association of the predicted isomiRs with corresponding targets and association of the terminal modification with canonical miRNAs and corresponding endogenous target mimicry (PeTMs). Diff isomiRs detected a total of 107 isomiRs with 5′-and 3′-terminal modifications corresponding to 16 mature miR-NAs in Arabidopsis. Among these, 209 differential isomiRs across 23 experiments and six across five were found by mapping-based and model-based methods, respectively. A total of 19 genes were predicted to be targeted by the miRNA156 in Arabidopsis revealing a total association with 5 endogenous target mimics (PeTMs). Since the classification of the isomiRs are defined by the terminal modifications at the 5′-and 3′-terminus, Fig. 5 shows the terminal modifications observed in the Arabidopsis miRNA156 along with the read depth display. These features with highlighted role of the terminal modifications will play a major role in understanding the role of terminal modification, which might be due to the DCL1 imprecise cleavage or due to the post-transcriptional events. As per the Wei et al. 76 miRNA156 network is not only limited to the SPL genes, therefore high-throughput discovery of these isomiRs will play a major role in improving target prediction and also to reveal the major complex patterns of the miRNA regulation.
Interesting, we observed a total of 162 targets specific to miR156 isomiRs using psRNATarget 56 among which two of them were found to be AP2 transcription factors, which are targeted by miR172 family. Coordinated regulation of miR156 and miR172 class of miRNAs have been previously shown in Vitis vinifera 78 . Interestingly, among the targets, we observed AT1G67040.1 (TON1 RECRUITING MOTIF 22) and is targeted by miR838. miR838 has been previously shown to act as a negative feedback regulator of DCL1 biogenesis 79,80 . A comprehensive list of the miR156 isomiR targets, which are identified in addition to previously described conserved SPL is provided (Supplementary Table 3). Recent efforts by Ahmed et al. 26 pointed towards the advanced target prediction using the miRNAs and isomiRs in model plant Arabidopsis thaliana, which is well supported by  81 . Abundance of this isoform and its targeting mode of action in Phaseoleae species indicate that the isoform generation (isomiRs) from the parent miRNA is an evolutionary conserved phenomenon and fine tunes and enhances the target regulation. We believe that the swathing information on isomiRs and easy to search patterns in Diff isomiRs will play a significant role in increasing the knowledge base for isomiR biogenesis, alternative functions and expand evolutionary landscape of the isomiRs with respect to the parent miRNAs.

Conclusion
Accelerated development of the sequencing technologies has currently been leveraged to understand the potential role of isomiRs and the functional impact of isomiRs as compared to the canonical miRNAs. Till now, stress associated isomiRs have not been widely explored due to the limited mining efforts for isomiRs across the stress datasets. Diff isomiR presents a next leap forward to understand the association of the isomiRs and their relative roles in stress by providing high-throughput mining of differential isomiRs across 16 plant species. The graphical web-based exploration not only allows for rapid assessment of isomiRs and associated features but also provides potential targets and the association of isomiRs, canonical miRNAs and target mimics. To conclude, this is the first web-based repository, which provides large scale isomiR discovery in stress and will significantly increase the understanding of templated isomiRs, their origin and potential roles of differential isomiRs in stress responses. An potent application of the DiffisomiRs is the exploration of the target mimics as previously observed and reviewed 82 in conjuction with the corresponding miRNAs to increase the efficiency of the gene silencing approaches. To this end, a systematic in-depth identified isomiRs can provide a clue into the biochemical processes leading to the terminal modifications at small scale and thus in leading increasing the diversity of the smallRNAs' ome.