Identification of conserved and novel mature miRNAs in selected crops as future targets for metabolic engineering

MicroRNAs (miRNAs) are small non-coding RNA molecules involved in the post-transcriptional regulation of gene expression in numerous metabolic pathways, including plant biomass production. The current work focused on the identification of miRNAs involved in the growth metabolism of Glycine max, Oryza sativa, Zea mays, Sorghum bicolor, Brassica napus, and Triticum aestivum. To identify conserved miRNA clusters, miRNA data were collected from the miRBase database. Overall, 756, 738, 325, 241, 92, and 125 mature miRNA sequences of Glycine max, Oryza sativa, Zea mays, Sorghum bicolor, Brassica napus, and Triticum aestivum, respectively, were collected from miRBase. Using MEGA software, a total of 6, 6, 5, 6, and 3 conserved miRNA clusters were identified in Glycine max, Oryza sativa, Zea mays, Sorghum bicolor, and Brassica napus, respectively, with the aim of studying conserved miRNA clusters belonging to the same gene families. The conserved miRNA clusters belonged to the miR166, miR399, miR156, miR171, miR164, miR167, and miR394 families in the selected crops. This study may help elucidate the role of these miRNAs and support their subsequent exploitation to enhance biomass production via metabolic pathway engineering.


Introduction
Energy has become a basic necessity for social and economic development as well as for improving human living standards. Since 1970, researchers have paid a great deal of attention to developing technologies that use renewable energy resources in the context of severe energy crises (Gokcol et al., 2009). Among various sources, biomass from agricultural practices (bagasse, wheat/rice straw), wastewater-cultivated microalgae (Shahid et al., 2020), and biomass from non-arable lands have shown promising potential as renewable and low-cost feedstocks for producing energy through either biological or thermochemical processes (Ahmad et al., 2017, Ye et al., 2018). Thermochemical methods have shown dominance over biological methods in terms of robustness, efficiency, and cost-effectiveness (Mehmood et al., 2019). Regardless of the method of choice, higher biomass productivity is one of the desired parameters for enhancing the cost-effectiveness of bioenergy production. Bioenergy crops are considered a promising source of renewable energy (Sims et al., 2006). Biogas, ethanol, and biodiesel are the leading modern bioenergy products (Yuan et al., 2008). The utilization of bioenergy is particularly high in countries with good financial backing or tax incentives, for example China, Sweden, and Brazil (Wright, 2006). Various approaches, ranging from agricultural management practices to metabolic engineering, have been adopted to enhance biomass productivity for the subsequent production of bioenergy. Among the various targets of metabolic engineering, microRNAs have emerged as targets of interest because of their diverse roles in plant physiology, including biomass production (Joshi et al., 2017).
MicroRNAs (miRNAs) are small non-coding RNAs comprising about 22 nucleotides (Zhang et al., 2006b) that perform diverse physiological roles in plant development (Nogueira et al., 2007, Chitwood et al., 2009, Rubio-Somoza et al., 2009), abiotic and biotic stress responses (Shukla et al., 2008, Ruiz-Ferrer and Voinnet, 2009), signal transduction, protein degradation (Guo et al., 2005, Zhang et al., 2006b), post-transcriptional gene expression, and cellular metabolism (Zhang et al., 2006b, Zhao et al., 2010). The miRNA sequences of monocotyledonous and dicotyledonous plants are available in miRBase. The identification of novel miRNAs is a preliminary step towards understanding the evolution of miRNAs in plant species along with their role in plant physiology. This, in turn, may enable the cultivation of the selected crop plants under salt/drought stress after modification of the stress-responsive metabolic pathways. A large number of miRNAs have been reported in miRBase (http://www.mirbase.org/) (Griffiths-Jones et al., 2007) for Glycine max (756), Oryza sativa (738), Zea mays (325), Sorghum bicolor (241), Brassica napus (92), and Triticum aestivum (125). The present study focused on the identification and phylogenetics-based molecular characterization of conserved mature miRNAs in the aforementioned crops. The miRNA families have been shown to regulate physiological processes in the studied crops under stress conditions. Hence, the study aimed to propose targets that can be used in the future for metabolic engineering and for enhancing biomass production to help meet future energy demands.

Materials and Methods
Retrieval of miRNA data
Plant genomes contain hundreds of miRNAs, yet only limited data are available for them. In the current study, miRNA data of the bioenergy crops were retrieved from the miRBase database for further analysis (Figure 1). Since its inception, miRBase has been considered one of the main repositories of miRNA genes because it provides a user-friendly interface offering a detailed overview of each miRNA of interest, including the mature miRNA sequence along with its genomic coordinates and gene family. The datasets of mature miRNA sequences of Glycine max (n = 756), Oryza sativa (n = 738), Zea mays (n = 325), Sorghum bicolor (n = 241), Brassica napus (n = 92), and Triticum aestivum (n = 125) were collected from miRBase.
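As a minimal sketch of this retrieval step, the snippet below filters mature sequences by species from FASTA-formatted text, such as a mature-sequence download from miRBase. It assumes that each header begins with a name carrying a three-letter species prefix (for example `gma-...`); the prefix table, the function names, and the placeholder records are illustrative assumptions rather than part of the original workflow.

```python
from collections import defaultdict

# Assumed three-letter species prefixes for the six crops.
SPECIES_PREFIXES = {
    "gma": "Glycine max",
    "osa": "Oryza sativa",
    "zma": "Zea mays",
    "sbi": "Sorghum bicolor",
    "bna": "Brassica napus",
    "tae": "Triticum aestivum",
}


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def mature_by_species(fasta_text, prefixes=SPECIES_PREFIXES):
    """Group mature miRNA sequences by species prefix: {prefix: {name: seq}}."""
    grouped = defaultdict(dict)
    for header, seq in parse_fasta(fasta_text):
        name = header.split()[0]          # first token of the header
        prefix = name.split("-", 1)[0]    # text before the first hyphen
        if prefix in prefixes:
            # Store sequences in upper-case RNA notation.
            grouped[prefix][name] = seq.upper().replace("T", "U")
    return dict(grouped)


# Placeholder records: names and accessions are hypothetical.
SAMPLE = """>gma-miR-x1 MIMAT0000000 Glycine max miR-x1
UGACAGAAGAGAGUGAGCAC
>osa-miR-x1 MIMAT0000001 Oryza sativa miR-x1
UGACAGAAGAGAGUGAGCAC
>ath-miR-x1 MIMAT0000002 Arabidopsis thaliana miR-x1
UGACAGAAGAGAGUGAGCAC
"""

by_species = mature_by_species(SAMPLE)
for prefix, records in by_species.items():
    print(SPECIES_PREFIXES[prefix], len(records))
```

Records from species outside the prefix table (the hypothetical `ath` entry here) are simply skipped, so the same function can be pointed at a file covering all species.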

Multiple sequence alignment using ClustalW
The miRNA sequences of the bioenergy crops were then subjected to multiple sequence alignment using ClustalW (Chenna, 2003), a freely available and frequently used tool for aligning multiple sequences that is based on the progressive alignment method. It was used to reveal the conserved consensus by performing multiple alignments among the mature miRNA sequences.
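Progressive alignment starts from pairwise comparisons of the input sequences. The sketch below shows a plain global pairwise alignment by dynamic programming (Needleman-Wunsch) to illustrate that building block; it is not ClustalW's implementation, and the scoring values (match +1, mismatch -1, gap -1) and the toy sequences are assumptions chosen for readability.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        """Substitution score for a[i-1] versus b[j-1]."""
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


nw_score, top, bottom = needleman_wunsch("ACGU", "AGU")
print(nw_score)
print(top)
print(bottom)
```

A multiple alignment tool then combines such pairwise information to decide the order in which sequences are added to the growing alignment.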

Phylogenetic analysis using MEGA
The phylogenetic analyses of the aligned sequences were carried out using Molecular Evolutionary Genetics Analysis (MEGA). Owing to its user-friendly interface and the availability of multiple methods for phylogenetic tree building, MEGA is a well-known tool for evolutionary analysis and supports the comparative analysis of aligned sequences. The Neighbor-Joining method was used to infer the evolutionary history among the aligned sequences, and the Maximum Composite Likelihood method was employed to calculate the evolutionary distances. Each tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree.
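To illustrate how a distance-based tree is assembled, the sketch below pairs a simple distance measure with a textbook Neighbor-Joining loop. It swaps in the p-distance (the proportion of differing sites, skipping gapped sites pair by pair) in place of the Maximum Composite Likelihood distances computed by MEGA, so it will not reproduce MEGA's numbers; the function names and the toy distance matrix are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites; sites gapped in either sequence are skipped."""
    sites = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def neighbor_joining(labels, dist):
    """Return the joins as (node_i, node_j, branch_i, branch_j) tuples.

    `dist` maps frozenset({a, b}) to the distance between nodes a and b.
    Joining stops when two nodes remain.
    """
    d = dict(dist)
    nodes = list(labels)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to every other node.
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimises the Q criterion.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - total[p[0]] - total[p[1]],
        )
        dij = d[frozenset((i, j))]
        branch_i = 0.5 * dij + (total[i] - total[j]) / (2 * (n - 2))
        branch_j = dij - branch_i
        new = f"({i},{j})"
        # Distances from the new internal node to the remaining nodes.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
        joins.append((i, j, branch_i, branch_j))
    return joins


# Hypothetical toy distances between five taxa.
TOY = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8, ("b", "c"): 10,
    ("b", "d"): 10, ("b", "e"): 9, ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
toy_dist = {frozenset(pair): value for pair, value in TOY.items()}
joins = neighbor_joining("abcde", toy_dist)
print(joins[0])
print(p_distance("UGACAG-A", "UGACUGAA"))
```

In practice the distance matrix would be filled from the aligned miRNA sequences, with one entry per sequence pair.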

Identification of conserved miRNA
After alignment, a separate phylogenetic tree was constructed for each bioenergy crop. Because some conserved miRNAs, along with their corresponding gene families, were found in more than one bioenergy crop, the Jvenn tool (http://jvenn.toulouse.inra.fr) was used to identify conserved miRNAs (Bardou et al., 2014) among the selected bioenergy crops.
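The comparison that a Venn diagram tool performs can be sketched with ordinary set operations. The snippet below is an illustration only and does not use Jvenn; the function names and the placeholder sequence labels are assumptions.

```python
from collections import defaultdict


def conserved_in_all(seqs_by_species):
    """Sequences present in every species of the input mapping."""
    if not seqs_by_species:
        return set()
    return set.intersection(*(set(seqs) for seqs in seqs_by_species.values()))


def venn_regions(seqs_by_species):
    """Group sequences by the exact set of species in which they occur."""
    membership = defaultdict(set)
    for species, seqs in seqs_by_species.items():
        for seq in seqs:
            membership[seq].add(species)
    regions = defaultdict(set)
    for seq, species in membership.items():
        regions[frozenset(species)].add(seq)
    return dict(regions)


# Placeholder labels standing in for mature miRNA sequences.
toy_sets = {
    "Glycine max": {"seqA", "seqB", "seqC"},
    "Oryza sativa": {"seqA", "seqB"},
    "Zea mays": {"seqA", "seqD"},
}

shared = conserved_in_all(toy_sets)
regions = venn_regions(toy_sets)
print(shared)
for species, seqs in regions.items():
    print(sorted(species), sorted(seqs))
```

Each key of `regions` corresponds to one area of the Venn diagram, so sequences shared by only a subset of crops remain visible alongside those shared by all.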

Identification of intra-specific homologous mature miRNA sequences
On the basis of sequence identity and phylogenetic relationship, the homologous sequences in Glycine max were clustered (Figure 2). The analysis comprised 13 nucleotide sequences, which were subjected to phylogenetic analysis in MEGA. All ambiguous positions were removed using the pairwise deletion option. The final dataset contained a total of 26 positions, and the sequences were grouped into 6 clusters. The miRNA sequences belonging to clusters with 100% sequence identity, together with their associated genomic coordinates, are shown in Table 1. Likewise, a phylogenetic model was employed in Oryza sativa to identify all identical miRNA sequences (Figure 3). The analysis comprised 20 nucleotide sequences; the final dataset contained 23 positions, and the sequences were grouped into 6 clusters. The miRNA sequences belonging to clusters with 100% sequence identity and their associated genomic coordinates are shown in Table 2. Similarly, the same phylogenetic model was employed in Zea mays to identify identical miRNA sequences (Figure 4). The analysis comprised 14 nucleotide sequences; the final dataset contained 23 positions, and the sequences were grouped into 5 clusters. The miRNA sequences belonging to clusters with 100% sequence identity and their associated genomic coordinates are shown in Table 3. The same phylogenetic model was employed in Sorghum bicolor, where the analysis comprised 14 nucleotide sequences (Figure 5); the final dataset had a total of 21 positions, and the sequences were grouped into 6 clusters. The miRNA sequences belonging to clusters with 100% sequence identity and their associated genomic coordinates are shown in Table 4. The phylogenetic analysis of Brassica napus comprised 8 nucleotide sequences; the final dataset had a total of 24 positions, and the sequences were grouped into 3 clusters (Figure 6). The miRNA sequences belonging to clusters with 100% sequence identity and their associated genomic coordinates are shown in Table 5.
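Clusters with 100% sequence identity within one species can be found by grouping records that share an identical mature sequence. The sketch below assumes a simple mapping of miRNA name to sequence; the function name and the record names are hypothetical and do not correspond to entries in the tables.

```python
from collections import defaultdict


def identical_clusters(records):
    """Group miRNA names by identical mature sequence.

    `records` maps name -> sequence. Only groups with at least two
    members are returned, keyed by the shared sequence.
    """
    by_seq = defaultdict(list)
    for name, seq in records.items():
        # Compare in upper-case RNA notation so DNA-style input still matches.
        by_seq[seq.upper().replace("T", "U")].append(name)
    return {seq: sorted(names) for seq, names in by_seq.items() if len(names) > 1}


# Hypothetical records for a single species.
toy_records = {
    "xxx-miR-1a": "UGACAGAAGAGAGUGAGCAC",
    "xxx-miR-1b": "UGACAGAAGAGAGUGAGCAC",
    "xxx-miR-1c": "tgacagaagagagtgagcac",
    "xxx-miR-2a": "UGAAGCUGCCAGCAUGAUCUA",
}

clusters = identical_clusters(toy_records)
for seq, names in clusters.items():
    print(seq, names)
```

Exact matching is the strictest possible criterion; sequences that differ by even one nucleotide fall into separate groups and would need a distance-based comparison instead.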

Identification of Inter-specific homologous mature miRNAs
The multiple aligned sequences were subjected to phylogenetic analysis (Figure 8). The analysis comprised 73 nucleotide sequences, and the final dataset contained 26 positions; the sequences were grouped into 7 clusters. Notably, the miRNA sequence UGACAGAAGAGAGUGAGCAC was conserved in Glycine max, Oryza sativa, Zea mays, Sorghum bicolor, and Brassica napus. The sequence UGCCAAAGGAGAAUUGCCCUG was conserved in Glycine max, Oryza sativa, Zea mays, and Sorghum bicolor, whereas UGAUUGAGCCGUGCCAAUAUC was conserved in Glycine max, Oryza sativa, Zea mays, Sorghum bicolor, and Triticum aestivum. Two sequences, UGAAGCUGCCAGCAUGAUCUA and UGGAGAAGCAGGGCACGUGCA, were conserved in all six crops: Glycine max, Oryza sativa, Zea mays, Sorghum bicolor, Brassica napus, and Triticum aestivum. The sequence UUGGCAUUCUGUCCACCUCC was conserved in Glycine max, Sorghum bicolor, and Brassica napus, while UCGGACCAGGCUUCAUUCCCC was conserved in Glycine max, Zea mays, Sorghum bicolor, and Brassica napus.
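The conservation pattern reported above can be tabulated directly. The snippet below transcribes the seven sequences and their crops from the text and derives how many crops share each sequence and how often each crop appears; the variable names are illustrative.

```python
from collections import Counter

ALL_CROPS = {
    "Glycine max", "Oryza sativa", "Zea mays",
    "Sorghum bicolor", "Brassica napus", "Triticum aestivum",
}

# Sequence -> crops in which it was reported as conserved (from the text).
REPORTED = {
    "UGACAGAAGAGAGUGAGCAC": {"Glycine max", "Oryza sativa", "Zea mays",
                             "Sorghum bicolor", "Brassica napus"},
    "UGCCAAAGGAGAAUUGCCCUG": {"Glycine max", "Oryza sativa", "Zea mays",
                              "Sorghum bicolor"},
    "UGAUUGAGCCGUGCCAAUAUC": {"Glycine max", "Oryza sativa", "Zea mays",
                              "Sorghum bicolor", "Triticum aestivum"},
    "UGAAGCUGCCAGCAUGAUCUA": set(ALL_CROPS),
    "UGGAGAAGCAGGGCACGUGCA": set(ALL_CROPS),
    "UUGGCAUUCUGUCCACCUCC": {"Glycine max", "Sorghum bicolor", "Brassica napus"},
    "UCGGACCAGGCUUCAUUCCCC": {"Glycine max", "Zea mays", "Sorghum bicolor",
                              "Brassica napus"},
}

# Number of crops sharing each sequence.
crops_per_sequence = {seq: len(crops) for seq, crops in REPORTED.items()}

# Sequences found in all six crops.
in_all_six = sorted(seq for seq, crops in REPORTED.items() if crops == ALL_CROPS)

# Number of the seven conserved sequences found in each crop.
sequences_per_crop = Counter(crop for crops in REPORTED.values() for crop in crops)

for seq, count in sorted(crops_per_sequence.items(), key=lambda item: -item[1]):
    print(seq, count)
print(in_all_six)
print(sequences_per_crop.most_common())
```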

Discussion
miRNAs have been shown to be involved in post-transcriptional gene expression (Zhang et al., 2006a). Although miRNAs are common in animals, some miRNA clusters have also been found to be conserved in plants (Sunkar and Zhu, 2004, Guddeti et al., 2005, Zhang et al., 2007, Talmor-Neiman et al., 2006). Insertion, deletion, and duplication events in miRNA sequences suggest that evolutionarily conserved clusters are present in plants. Moreover, gene duplication events have been estimated to occur frequently in eukaryotic genomes (Lynch and Conery, 2000), particularly in flowering plants (Blanc and Wolfe, 2004, Cui et al., 2006). Plants have been reported to contain more non-conserved than conserved clusters (Fahlgren et al., 2007, Rajagopalan et al., 2006). Several in silico and in vitro studies have identified conserved miRNAs in various bioenergy crops, but none has deciphered the conserved identical miRNA sequences across a group of bioenergy crops. The present study focused on identifying the intra-specific and inter-specific conserved miRNAs in six bioenergy crops. miRBase, which compiles computationally and experimentally identified miRNAs, is considered one of the main repositories of miRNA genes. The present study identified 7 miRNA families that were conserved among the six bioenergy crops. The conserved miRNA clusters belonged to the miR166, miR399, miR156, miR171, miR164, miR167, and miR394 families, which suggests that the ancestral clusters might have originated through genomic duplication events. miR156 is an evolutionarily conserved miRNA, indicating that it is common among plant species. Interestingly, bioenergy crops overexpressing miR156 showed increased plant biomass and altered lignin content and composition (Fu et al., 2012, Rubinelli et al., 2013, Schwab et al., 2005).
miR156 has also been associated with phase transition in plant development, formation of the floral meristem, and the morphology of immature leaves and the cell wall. In addition, miR156 has been shown to be involved in abiotic stress responses, including drought and low nitrogen, in bioenergy crops (Ferreira et al., 2012, Khraiwesh et al., 2012). However, a higher expression level of miR156 shortened the internodes and thereby decreased overall plant biomass (Fu et al., 2012), which indicates that the effect on biomass depends on the expression level. Both miR166 and miR167 are involved in the metabolism, morphology, and development of Zea mays and Sorghum bicolor (Wei et al., 2009). Moreover, miR166 and miR167 are involved in early plant development and hence could have potential applications in abiotic stress tolerance, biofuel yield, bioconfinement, and recalcitrance (Trumbo et al., 2015). miR164 is another evolutionarily conserved miRNA that is associated with metabolic processes, drought response, early development (Wei et al., 2009), and regulation of lateral rooting. Hence, it could be another target of metabolic engineering to counteract stress and to enhance plant biomass production (Wei et al., 2009). In Sorghum bicolor, miR399 may have potential applications in abiotic stress tolerance. During water deprivation, miR399 was upregulated and showed a positive stress response (Calviño et al., 2011, Katiyar et al., 2012, Paterson et al., 2009). In Glycine max, miR399 could be an engineering target to control phosphate regulation (Sun, 2012). In Zea mays, miR399 has been shown to be involved in regulating morphogenesis and embryonic development in the grain (Li et al., 2016). In Oryza sativa, miR399 has been shown to be involved in phosphate signaling (Fang et al., 2009). In Sorghum bicolor, miR171 is associated with stress responses and with morphological development (Ram and Sharma, 2013), while upregulation of miR171 has been shown to be involved in abiotic stress and floral development in Oryza sativa (Zhou et al., 2010).
Further in vitro studies are required on miR399, miR171, and miR394 before they can be exploited as future targets of metabolic engineering to counteract abiotic stress and to enhance the biomass production of bioenergy crops.

Conclusion
The study focused on the identification of conserved miRNAs in selected bioenergy crops as future targets of metabolic engineering to improve biomass productivity. Based on phylogenetic analyses, the conserved miRNA clusters belonged to the miR166, miR399, miR156, miR171, miR164, miR167, and miR394 families, and these families may serve as genetic engineering targets. Such studies can be extended to other crops of agricultural and environmental importance to identify conserved miRNAs and to understand their physiological roles and evolutionary relationships.