Metabolomics Intervention towards Better Understanding of Plant Traits

High-quality reference genome sequences are now available for most economically important plant and crop species, forming the basis for discovering the genes that control important biochemical pathways. Transcriptomics and proteomics resources have also been made available for many of these species, deepening understanding at the expression level. However, we still lack integrated studies that span genomics–transcriptomics–proteomics and connect them to metabolomics, the most complicated phase in phenotype expression. Nevertheless, over the past few decades, emphasis has shifted toward the metabolome, which plays a crucial role in defining the phenotype (trait) during crop improvement. The emergence of modern high-throughput metabolome analysis platforms has accelerated the discovery of a wide variety of biochemical classes of metabolites and of new pathways, and has improved the understanding of known pathways. Pinpointing the causal gene(s) and elucidating metabolic pathways are essential for developing improved lines with high precision in crop breeding. Along with other omics sciences, metabolomics studies have helped to characterize and annotate the functions of new gene(s). Here, we summarize several areas of crop development in which metabolomics studies have made a remarkable impact. We also assess recent research in which metabolomics, together with other omics, contributes to genetic engineering of target traits and key pathway(s).


Introduction
Metabolomics in plant systems has extended the opportunities for discovering new pathways and for integrating them with other omics data generated from genomics, transcriptomics, and proteomics, which has improved existing genome annotations. Metabolomics has gained attention over the last 20 years, as many research labs generated metabolic profiles on platforms such as nuclear magnetic resonance (NMR), liquid chromatography-mass spectrometry (LC-MS), and gas chromatography-mass spectrometry (GC-MS), which also led to the enrichment of several metabolite databases such as the KEGG, GOLM, and NIST databases. By 2010, most metabolomics labs were equipped with the latest high-throughput analytical chromatography instruments, coupled with highly sensitive and precise mass spectrometric tools developed through revolutionary advances in mass spectrometry, and with data-processing software including the free web tool MetaboAnalyst and the offline software METLIN. The most important plant-based metabolite data-processing tools include platforms such as ChromaTOF, Met-Align, MET-COFEA, MET-XAlign, etc. [1]. Further, the availability of statistical tools such as MetaboAnalyst, Cytoscape, Statistical analysis tool, etc., has simplified statistical analyses such as principal component analysis (PCA), partial least squares (PLS), K-means clustering, boxplots, and heatmaps, as well as the reconstruction of metabolic pathways [1][2][3]. These tools have allowed the analysis of a remarkable collection of metabolome data from samples extracted for primary metabolite, secondary metabolite, and lipidomic analysis under various growth conditions. Metabolome data are available for several model and crop species including Arabidopsis thaliana, Arachis hypogaea, Actinidia Lindl. spp., Citrus spp., Lotus sp., Lupinus albus, Helianthus annuus L., Mangifera indica, Medicago truncatula, Malus spp., Fragaria × ananassa, Glycine max, Oryza sativa, Pyrus communis, Solanum lycopersicum L., Vitis vinifera, Zea mays, etc. [1,4]. Metabolomics studies have explored multiple areas such as biotic stress [1,5,6], abiotic stress [7][8][9], legume and cereal quality improvement [10][11][12][13][14][15][16][17], biofuel production and lipid profiling [18][19][20][21], the impact of climate change and high CO2 levels [22][23][24][25], hormone profiling [26], and fruit quality improvement [1,26][27][28][29]. These attempts have provided opportunities to dissect metabolic pathways for developing stress-tolerant and nutrition-rich crop plants [1]. Several previous review articles have described in detail the methodology and the advanced instruments used for omics studies, including metabolomics [1,30,31]. In this review, we cover the important areas that have flourished in the era of metabolomics and show how the knowledge gathered through metabolomics has helped in dissecting different pathways through metabolic engineering for crop improvement.
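To make the statistical step mentioned above more concrete, the following is a minimal sketch of how a metabolite intensity table could be autoscaled and projected onto its first principal component. It uses only the Python standard library and a simple power iteration, which is an illustrative choice and not the algorithm of MetaboAnalyst or any other tool named in the text. The function names and the six-sample, three-metabolite table are hypothetical.

```python
import math

def autoscale(matrix):
    """Mean-center each metabolite (column) and scale it to unit variance."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)
        sd = math.sqrt(var) or 1.0  # avoid division by zero for constant columns
        scaled_cols.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*scaled_cols)]

def first_principal_component(matrix, iterations=200):
    """Return (loadings, scores) of PC1 via power iteration on X^T X."""
    p = len(matrix[0])
    w = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
    for _ in range(iterations):
        scores = [sum(x * wi for x, wi in zip(row, w)) for row in matrix]
        new_w = [sum(s * row[j] for s, row in zip(scores, matrix)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in new_w))
        w = [v / norm for v in new_w]
    scores = [sum(x * wi for x, wi in zip(row, w)) for row in matrix]
    return w, scores

# Hypothetical intensities: rows are samples (3 control, 3 stressed),
# columns are three unnamed metabolites. These are toy values, not real data.
toy_table = [
    [10.0, 5.0, 1.00],
    [11.0, 5.5, 1.20],
    [9.5, 4.8, 0.90],
    [20.0, 9.0, 1.10],
    [21.0, 9.6, 1.00],
    [19.0, 8.9, 1.05],
]

loadings, scores = first_principal_component(autoscale(toy_table))
print("PC1 loadings:", [round(v, 3) for v in loadings])
print("PC1 scores:", [round(v, 3) for v in scores])
```

In this toy table the two sample groups separate along the first component because two of the three metabolites change together; real data sets would normally be analyzed with an established statistics package.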

Integrating Metabolomics with Genomics Study for Gene Characterization and Metabolomics-Assisted Breeding
Over the past decade, metabolomics has seen excellent progress in instrumentation and software, providing the opportunity to analyze the whole metabolome of plant species using high-throughput methods. Metabolomics applications have supported several research areas, especially biotechnology, genomics, molecular plant breeding, and functional genomics [32]. In addition, its use advances translational metabolomics and plant breeding. Recent advances in post-genomics technologies have boosted screening and the integration of metabolomics with other high-throughput methodologies, which will reduce the time required to develop crop varieties with enhanced biotic and abiotic stress tolerance. Metabolomics has a strong ability to holistically evaluate and phenotype the various metabolites in crops [33]. Approximately 840 metabolites that could be used in breeding programmes were identified in rice cultivars [34]. Metabolomic quantitative trait locus (mQTL) mapping and metabolic genome-wide association studies (mGWAS) are important approaches for identifying genetic variants associated with metabolic traits [10].

Metabolomic Quantitative Trait Loci
Metabolomics-based quantitative trait locus (mQTL) studies help to explain the metabolic networks that regulate complex developmental processes and are important for improving the quality and performance of elite cultivars. In addition, results obtained from mQTL studies contribute to a deeper understanding of quantitative and functional genetics [35]. Metabolic profiling narrows the gap between phenotype and genotype and offers new opportunities for metabolic dissection, starting with the discovery of molecular markers along with mQTL mapping studies for the identification of candidate genes and linked genomic regions. Metabolic markers have become an important tool to uncover and investigate the various complex biological pathways responsible for distinct phenotypes [36]. The mQTL approach connects the metabolome and the genome, provides important insight into genetic function, and investigates phenotypic variation via metabolic profiling and comprehensive gene expression analysis [37].
Advances in genomic technologies have enabled mQTL detection via high-density maps for candidate gene discovery [38]. Several candidate genes that regulate metabolite biosynthesis have been detected using multi-omics approaches with reverse and forward genetics methods [39]. Moreover, population genetics, which integrates quantitative genetics with metabolic profiling, has begun to explore the genetic regulation of the entire metabolome in plants. A recent study [10] used a high-density map with 1619 bins for mQTL mapping, leading to the identification of several mQTLs for flag leaf and germinating seeds across 12 linkage groups in rice. Comparative mQTL studies in two rice cultivars showed tissue-specific accumulation of secondary metabolites under strict genetic regulation. A total of 19 metabolites were identified on 23 mQTLs, indicating a substantial interaction between metabolites and the associated genomic loci [10]. Another mQTL study conducted in back-crossed inbred lines (BILs) of rice identified 700 different metabolic characteristics under 802 mQTLs, an unusually wide range that could regulate various metabolic traits [40]. Further, in maize, 26 distinct metabolites were identified that show a strong association with single nucleotide polymorphisms (SNPs), highlighting the importance of the cinnamoyl-CoA reductase gene located on chromosome 9 for controlling lignocellulosic biomass [41]. mQTL mapping is also an effective method for identifying stress-responsive trait pathways. In a barley recombinant inbred line (RIL) population, an mQTL study detected 98 different stress-responsive metabolites and observed that their abundance is modulated through the coordinated expression of several genes that function under drought conditions [42]. Similarly, another mQTL study in barley identified 57 metabolites under drought stress conditions [43].
In rapeseed, metabolic profiling and gene function analysis were performed to identify the basis of glucosinolate synthesis, and around 105 mQTLs involved in glucosinolate production were reported in seeds and leaves [44]. In a very recent study carried out in tomato wild and introgression lines (ILs), 679 mQTLs were identified for secondary metabolism-related pathways linked to environmental stress tolerance [45]. In later experiments, mQTL analysis was performed in a similar IL to investigate metabolite concentration [46]. Likewise, LC-MS-based metabolic profiling of wheat doubled haploid lines revealed about 558 secondary metabolites, comprising alkaloids, flavonoids, and phenylpropanoids [47]. GC-TOF-MS-based metabolic analysis of seeds of a tomato RIL population was performed to investigate seed metabolism [48] and identified several genomic regions controlling a group of metabolites. As sequencing technologies progress, more plant genomes are being sequenced, and these high-quality genomes may further accelerate mQTL studies in crop plants, helping to establish the relationship between genome and trait expression. For example, phenylpropanoid synthesis genes in corn [49], phenolamides in corn and rice [50,51], and glucosinolate regulation in cabbage [52] have been reported; these compounds are regarded as defense-responsive metabolites. In the future, these mQTLs will help in targeting several pathways for designing crops with desired traits.
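The idea behind mQTL mapping can be sketched as a single-marker scan: for each marker, lines are grouped by genotype and a LOD score compares a model with genotype-specific means against a model with one overall mean. This is a simplified illustration under stated assumptions, not the procedure used in the cited studies, which rely on bin maps or interval mapping. The marker names, genotype codes, and metabolite values are hypothetical.

```python
import math

def lod_score(genotypes, phenotype):
    """LOD = (n / 2) * log10(RSS_null / RSS_genotype) for a single marker."""
    n = len(phenotype)
    overall_mean = sum(phenotype) / n
    rss_null = sum((y - overall_mean) ** 2 for y in phenotype)
    rss_model = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotype) if g == allele]
        group_mean = sum(group) / len(group)
        rss_model += sum((y - group_mean) ** 2 for y in group)
    if rss_null == 0.0:
        return 0.0            # no phenotypic variation to explain
    if rss_model == 0.0:
        return float("inf")   # genotype explains the phenotype perfectly
    return (n / 2) * math.log10(rss_null / rss_model)

# Hypothetical RIL population: 8 lines, genotype coded 0/1 (parent A / parent B).
markers = {
    "m1": [0, 0, 0, 0, 1, 1, 1, 1],
    "m2": [0, 1, 0, 1, 0, 1, 0, 1],
    "m3": [0, 0, 1, 1, 0, 0, 1, 1],
}
# Hypothetical abundance of one metabolite in the same 8 lines.
metabolite_level = [1.0, 1.2, 0.9, 1.1, 2.0, 2.2, 1.9, 2.1]

lod_by_marker = {name: lod_score(g, metabolite_level) for name, g in markers.items()}
best_marker = max(lod_by_marker, key=lod_by_marker.get)
print(lod_by_marker)
print("strongest candidate mQTL marker:", best_marker)
```

In practice the significance threshold for a LOD score is usually set by permutation, and flanking-marker information is used to refine the interval; both steps are omitted here for brevity.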

Metabolic Genome-Wide Association Studies
The mGWAS was developed as a valuable tool to explain the natural genetic basis of different metabolic shifts in a plant's metabolome (Table 1). Recent studies have shown the broad perspective of plant metabolites related to specific traits [16]. A parallel study of mGWAS with phenotypic genome-wide association studies (pGWAS) in rice has effectively detected novel candidate genes that control the genetic variation in relevant agronomic traits [16].
Metabolic polymorphism studies in rice species reported various forms of flavone glycosylation and a positive association between plant growth conditions and UVB light exposure [53]. A recent mGWAS study in rice reported 323 associations among 89 secondary metabolites for two genetic architecture types related to secondary metabolite concentration [54]. Natural variation studies and metabolic profiling of phenolamides were undertaken by Dong and colleagues using an LC-MS-based targeted metabolomics method in several rice accessions. They identified a temporal and spatial accumulation of several phenolamides. In addition, mGWAS detected two spermidine hydroxycinnamoyl transferases responsible for natural variation in spermidine levels. This study showed that gene-to-metabolite analysis through mGWAS offers an opportunity to improve crop genetics [51]. Another mGWAS study was conducted to analyze the biochemical and genetic variants of rice metabolism. The study reported 36 genes linked to specific metabolites that regulate physiological and nutrition-related traits [34]. Traits associated with primary and secondary metabolites could be used as metabolic markers to promote plant breeding. Similarly, a maize mGWAS study was conducted to reveal complex metabolic characters. Around 26 metabolites associated with SNPs were detected, pointing to cinnamoyl-CoA reductase as the main target for increasing the lignocellulosic quality of maize [41]. Recently, metabolic profiling in winter wheat detected 76 metabolites and tested their association with 18,372 SNPs. The relationships among the metabolites reflected functional relationships with several pathways of the Krebs cycle. The mGWAS identified strong associations of between 1 and 17 SNPs with six metabolic traits. These findings provide a way to predict the impact of genetic interventions on related metabolic traits and, possibly, on a metabolic phenotype [55].
These studies will speed up metabolomics-assisted breeding to improve the quality and quantity of target traits in crops.
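Because an mGWAS tests many SNP-metabolite pairs, the significance threshold has to account for multiple testing. The sketch below works through a Bonferroni correction using the 18,372 SNPs and 76 metabolites mentioned for winter wheat. Whether the cited study tested every pair or used Bonferroni is an assumption made here for illustration, and the listed p-values and identifiers are hypothetical.

```python
def bonferroni_threshold(alpha, n_snps, n_metabolites):
    """Return the per-test significance threshold and the number of tests."""
    n_tests = n_snps * n_metabolites
    return alpha / n_tests, n_tests

# Counts taken from the winter wheat example in the text; alpha = 0.05 is assumed.
threshold, n_tests = bonferroni_threshold(0.05, 18372, 76)
print(f"{n_tests} tests, per-test threshold = {threshold:.2e}")

# Hypothetical p-values for three SNP-metabolite pairs (not real results).
hypothetical_pvalues = {
    ("snp_1", "metabolite_A"): 1.2e-9,
    ("snp_2", "metabolite_A"): 4.0e-6,
    ("snp_3", "metabolite_B"): 2.5e-8,
}

significant = sorted(
    pair for pair, p in hypothetical_pvalues.items() if p < threshold
)
print("pairs passing the threshold:", significant)
```

Bonferroni is conservative when SNPs are in linkage disequilibrium, so published analyses often use an effective number of independent tests or a false discovery rate instead.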

Metabolic Analysis for Biotic Stress Tolerance in Crop Plants
Recent evidence shows that invasive microbes systematically suppress plant immune function in susceptible cultivars using protein effector molecules, which can also be recognized by plant R gene products in incompatible interactions [61]. Besides counteracting plant defenses, an effective pathogen must also subvert host plant metabolism to facilitate efficient intake, sequestration, and use of host-derived nutrients [62,63]. Several studies have used transcriptional profiling to examine the global changes in gene expression that arise during host invasion by biotrophic and hemibiotrophic fungi [64][65][66], and have reported coordinated expression of several gene products that often have a predicted metabolic function. Therefore, metabolome studies of stress responses are important to unravel the molecules/metabolites that coordinate susceptibility and/or resistance traits in different plants [1,7][8][9][67][68][69][70][71][72].
Biotic stress resistance-associated loci have been reported for various crop diseases such as late blight of potato (Phytophthora infestans) [73], rice blast (Magnaporthe grisea) [74], and cereal rusts (Puccinia spp.) [75]. Two mQTLs, Qfhs.ndsu-3BS in barley [76] and Fhb1 in wheat, have also been reported for Fusarium head blight resistance [77]. Such loci generally co-localize multiple genes, and cloning them to identify all the co-localizing genes is a challenging task. A combined transcriptomics and metabolomics analysis of the rice response to the bacterial blight pathogen Xanthomonas oryzae pv. oryzae (Xoo) reported that few mRNA and metabolite differences were observed overall, whereas many differential changes occurred in the Xa21-mediated response [78]. Strong transcriptional induction of various pathogenesis-related genes in the Xa21 challenged strain, as well as differential expression of GAD, PAL, ICL1, and Glutathione-S-transferase transcripts, suggested only a minimal association with metabolite changes under single-time-point global profiling conditions. In fact, a metabolome study using LC-MS and GC-MS methods identified several hundred compounds that were modulated when the susceptible and resistant lines were compared. Most importantly, this study identified ornithine, citrulline, tyrosine, phenylalanine, lysine, oxoproline, butyrolactam, and N-acetylglutamate as the key compounds involved in providing tolerance against the bacterial blight pathogen in rice. Additionally, a role for acetophenone and 2-phenylpropanol (an acetophenone reduction product) in host resistance was identified; these compounds had earlier been reported to be involved in dicot plants [79]. More recently, a metabolomics study identified resveratrol as having inhibitory action on Xoo, as it causes oxidative stress and disrupts several pathways related to Xoo growth and metabolism, including amino acid, purine, energy, and NAD+ metabolism [80].
Further, metabolomics was deployed to reconstruct a genome-wide metabolic model of Xoo and revealed the influence of nitrogen fertilizers on Xoo metabolism: a differential flux in nitrogen metabolism and ammonia uptake was observed [81]. Like bacterial blight, Asian rice gall midge (Orseolia oryzae) is a severe rice pest causing major yield losses. Metabolic studies reported a number of metabolites that can be categorized as resistance, susceptibility, infestation, and host features, depending on their relative occurrence, and that can be considered biomarkers for insect-plant interaction in general and rice-gall midge interaction in particular [82]. Therefore, more metabolomics studies, including tissue-specific and single-cell studies, are required to develop interactome maps by integrating different layers of omics studies.
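Comparisons of susceptible and resistant lines such as those described above usually start from a per-metabolite contrast. The following sketch computes a log2 fold change and a Welch t statistic for each metabolite from replicate measurements. It illustrates the general approach only and is not the pipeline of the cited studies; the metabolite labels and intensities are hypothetical placeholders.

```python
import math
from statistics import mean, variance

def log2_fold_change(group_a, group_b):
    """log2 of the ratio of group means (group_a relative to group_b)."""
    return math.log2(mean(group_a) / mean(group_b))

def welch_t(group_a, group_b):
    """Welch's t statistic, which does not assume equal variances."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical replicate intensities (arbitrary units); not measured data.
toy_profiles = {
    "metabolite_A": {"resistant": [4.0, 4.2, 3.8], "susceptible": [1.0, 1.1, 0.9]},
    "metabolite_B": {"resistant": [2.0, 2.1, 1.9], "susceptible": [2.0, 1.9, 2.1]},
}

results = {
    name: (
        log2_fold_change(groups["resistant"], groups["susceptible"]),
        welch_t(groups["resistant"], groups["susceptible"]),
    )
    for name, groups in toy_profiles.items()
}

for name, (lfc, t_stat) in results.items():
    print(f"{name}: log2FC = {lfc:.2f}, Welch t = {t_stat:.2f}")
```

A p-value would additionally require the t distribution with Welch-Satterthwaite degrees of freedom, followed by multiple-testing correction across all metabolites.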

Important Achievements through Metabolic Engineering
In the past two decades, several attempts have been made towards characterization of genes related to important metabolic pathways which have also led to the improvement of several crop plants in the area of bio-fortification. We have summarized most of them in Table 2 and discussed some important ones below.

Table 2 (partial).
Pathway | Gene | Function of Gene | Phenotypes of Transgenics | Reference
 |  |  | Overexpression increases leaf CO2 uptake and plant dry matter productivity in tobacco | [104]
 |  |  | Overexpression reduces water loss per CO2 assimilated in tobacco | [105]
Calvin-Benson cycle | SBPase | Key regulator of carbon flux | Overexpression enhances photosynthesis against high temperature stress in transgenic rice | [106]
 |  |  | Overexpression increases photosynthetic carbon assimilation, leaf area, and biomass yield in tobacco | [107]
 |  |  | Overexpression increases photosynthesis and grain yield in wheat | [108]
Photorespiration | GCS H-protein | Catalyzes the degradation of glycine | Overexpressing increases biomass yield in transgenic tobacco plants | [109]
 | GDC-L protein |  |  |
Electron Transport | Algal Cyt c6 | Participates in algal photosynthetic electron transport chain | Overexpression increase CO2 assimilation rates and plant growth in Arabidopsis | [112]
 |  |  | Constitutive expression enhanced water use efficiency, chlorophyll and carotenoid content in tobacco | [113]
 | Rieske FeS | Regulates electron transfer | Constitutive expression enhanced photosynthetic electron transport rates, chlorophyll and carotenoid content | [114]
Carbon transport | Cyanobacterial inorganic carbon transporter B | Regulates CO2 concentration mechanism | Significantly higher photosynthetic rates and biomass was observed in overexpressed Arabidopsis lines | [115,116]
 |  |  | Overexpression enhanced CO2 assimilation rates in rice and tobacco | [117]
 |  |  | Engineering of terpenoid pathway led enhanced aroma and flavor in tomato | [143]
 | Limonene Synthase | Catalyzes the cyclization of geranyl pyrophosphate to (4S)-limonene | Modified essential oil content in transgenic lines in transgenic mint | [144]
 | β-Glucosidase |  |  |

Fortification of Carotenoids and Flavonoids
Carotenoid biosynthesis and metabolism are studied intensively because different carotenoids have distinct nutraceutical roles: lycopene as an antioxidant, lutein for vision, the acyclic carotenoids phytoene and phytofluene in nutricosmetics, and β-carotene as the primary dietary precursor of vitamin A. Sufficient intake of vitamin A is essential for human health. In many developing and underdeveloped countries, vitamin A deficiency (VAD) is a prevalent cause of premature death and childhood blindness. In addition, therapeutic doses of β-carotene have protective effects against cardiovascular disease, certain cancers, and aging-related diseases [166,167]. Considering the nutritional benefit of β-carotene, considerable efforts have been directed in recent years to elevating its content in food crops. Various metabolic engineering approaches have been used to increase β-carotene levels to alleviate vitamin A deficiency, beginning with "Golden Rice I". Since then, biofortification has been attempted in several crop plants using transgenic approaches, conventional breeding, and screening of genetic diversity. Conventional breeding and marker-assisted selection have significantly increased carotenoid content in a few instances, but novel alleles or wild germplasm associated with high carotene levels still need to be identified [168][169][170]. On the other hand, transgenic approaches using overexpression of plant genes or introduction of bacterial genes lead to high provitamin A levels, but are constrained by GM regulations, safety concerns, and public acceptance [124,171][172][173]. Screening of natural accessions, genetic variants, and mutants with altered carotenoid content provides a faster and safer route to provitamin A biofortification in crop plants [174,175].
Carotenoid sequestration via overexpression of the Orange (Or) gene, which encodes a plastid-localized DnaJ cysteine-rich protein, or via Or mutants harboring the "Golden SNP", has also been successfully demonstrated in melons, cauliflower, and potato tubers [176,177]. A list of provitamin A biofortified crops is summarized in Table 3. Besides provitamin A carotenoids, xanthophylls such as zeaxanthin and lutein also play an important role in protection against age-related macular degeneration (AMD), which is the predominant cause of blindness in several countries [178,179]. Recently, a zeaxanthin-rich tomato fruit was developed using metabolic engineering and genetic breeding, which has the highest concentration of zeaxanthin achieved in a primary crop [180]. To date, several natural and transgenic resources have been exploited for the biofortification of carotenoids in crop plants, and the field is still expanding through the identification of new regulatory factors that can modulate carotenoid production.

Table 3 (partial).
Melon | Or | Total carotenoid content increased by 11-fold | [176]
Cauliflower | Or | Beta-carotene content increased by 7-fold | [201]

Flavonoids belong to a group of polyphenolic plant secondary metabolites that not only have physiological roles in plants but also form part of our daily diet. There are six major subclasses of flavonoids, namely anthocyanidins, flavan-3-ols, flavonols, flavanones, flavones, and isoflavones, which are widely present in fruits and vegetables. Flavonoid-rich fruits and vegetables have been widely promoted in the human diet because of their broad spectrum of health-promoting benefits, which include antioxidant and anti-inflammatory properties. Given their nutritional importance, several efforts have been made to increase flavonoid levels in various crops by overexpressing key structural genes and transcription factors. Overexpression of single or multiple structural genes from different sources resulted in a significant increase in flavonoid production. Schijlen et al. [202] showed that combining the structural flavonoid genes stilbene synthase, chalcone synthase, chalcone reductase, chalcone isomerase, and flavone synthase leads to the accumulation of stilbenes, deoxychalcones, flavones, and flavonols in tomato peel. Similarly, overexpression of petunia chalcone isomerase in tomato fruits resulted in increased flavonol levels [203]. In addition, several transcription factors have been used to regulate phenylpropanoid metabolism. Bovy et al. [204] used the maize transcription factor genes LC and C1 to produce high-flavonol tomato. Likewise, Zhang et al. [205] reported that fruit-specific expression of AtMYB12 in tomato leads to the accumulation of flavonols. Accumulation of anthocyanins in tomato fruits was achieved by expressing the snapdragon transcription factors AmDel and AmRos1 [206]. Recently, Jian et al. [207] showed that overexpression of SlMYB75 promotes anthocyanin and flavonoid accumulation.
These results suggest that structural genes and transcription factors together can be used to achieve a higher accumulation of flavonoids in crop plants.

Metabolic Engineering of Phytohormone Signaling and Biosynthetic Pathway to Improve Crop Performance
The phytohormones auxins, brassinosteroids (BRs), cytokinins (CKs), ethylene, gibberellins (GAs), and abscisic acid (ABA) are the key regulators of plant architecture and growth [26,208]. Over the past two decades, several transgenic lines have been generated to understand their roles and to improve crop plants [26]. In fact, one of the key events in plant biology and agronomy was the selection of semi-dwarf varieties of wheat and rice during the green revolution, which was driven by selection of genes related to GA pathways such as GA-20 oxidase and DELLA [209,210]. One of the key transcription factors regulating GA signaling is Squamosa promoter-binding-like protein 8 (SPL8); its loss or attenuation through a transgenic approach severely reduces GA accumulation via GA2-OX and GA2-OX6 [211]. Likewise, cytokinin biosynthesis has been targeted to alter plant architecture, growth habit, and life cycle, because upregulation of cytokinin production enhances biomass and delays plant senescence via cell division [212]. A mutation in the cytokinin receptor or overexpression of the cytokinin oxidase gene (CKX, which encodes a cytokinin-catabolizing enzyme) can lead to a smaller shoot apical meristem, decreased leaf area, and severely retarded plant growth [213]. Therefore, to achieve better crop yield, CKX gene homologs were targeted by developing knockouts. In rice, CKX knockout results in improved maintenance of photosynthetic rate, increased panicle branching, and a reduced yield gap under salinity stress [214]. Several attempts involved upregulation of cytokinin through overexpression of the cytokinin biosynthetic gene isopentenyl transferase (ipt) in broad bean [215], creeping bentgrass [216], peanut [217], rice [218], tobacco [219], and salinity-stressed cotton [220].
Additionally, transgenic poplar plants overexpressing YUCCA6, an abiotic stress-responsive gene involved in the tryptophan-dependent IAA biosynthesis pathway, exhibit remarkably rapid shoot elongation with a restricted tap root but enhanced root hairs [221].
Complete knowledge of metabolic pathways is very important. Recently, a cluster of genes related to ABA signaling was targeted through genome editing to improve drought tolerance, and the edited lines showed a remarkable 30 percent yield increase owing to a higher number of spikelets per main panicle [222]. The edited genes belong to the ABA receptor (RCAR) family of proteins: PYL1-PYL6, PYL12, PYL7-PYL11, and PYL13. ABA plays a key role in abiotic stress tolerance, especially during drought stress; as a result, several ABA signaling and biosynthetic genes, including ABA-responsive complex (ABRC1) and 9-cis-epoxy carotenoid dioxygenase (NCED), have been targeted to improve abiotic stress tolerance in crop plants [223,224]. Lee et al. [223] demonstrated the role of ABRC1 in transgenic tomato in maintaining yield under cold, drought, and salinity stress. Likewise, the gene NCED1 was overexpressed in tobacco to achieve tolerance to drought and salt stress through enhanced accumulation of ABA in leaves [224].

Engineering of Cell Wall Biosynthesis Pathway: Some Examples
The non-living cell wall distinguishes plant cells from animal cells, provides structural and mechanical support to the whole cell, and acts as a physical barrier against both abiotic and biotic stresses. The principal components of the cell wall are cellulose, hemicelluloses, and lignins. Plants often activate cell wall metabolism-related pathways when challenged with stress, for example by producing more lignin biosynthesis enzymes during biotic and abiotic stresses. Therefore, immense progress has been made in targeting cell wall-related pathways to confer tolerance against these stresses. Modification of the lignin biosynthetic pathway in Pinus radiata demonstrated the significance of the 4-coumarate-CoA ligase gene in the accumulation and distribution of lignin in the tracheid element during cell wall and wood formation; the modification also affected plant height [225], indicating its economic importance in horticulture for generating dwarfed or "bonsai tree-like" plants. Cell wall biosynthesis requires UDP-Glc, which is needed for the formation of the different sugars used during wall formation [225]. Researchers have explored the genes UDP-glucose pyrophosphorylase and sucrose synthase for drought tolerance, as their overexpression enhances cellulose accumulation by increasing the production of UDP-Glc [226]. Likewise, the role of the cellulose biosynthetic gene cellulose synthase was observed in Brassinosteroid insensitive2 mutants [227]. Further, the Expansin gene, which controls cell wall loosening, plays a very important role in root architecture during drought tolerance [228]. The gene SHINE encodes an AP2/ERF transcription factor family protein known to control the wax biosynthesis pathway in plants [229].
In rice, overexpression of SHINE reduced lignin content by 45% and increased cellulose content by 34%, thus improving fodder quality and digestibility [230]. Silencing of the NAC2 transcription factor, which binds to the promoter region of Expansin-A4 (EXP-A4), reduced drought tolerance during floral organ development in rose because of reduced EXP-A4 expression [231]. Conversely, overexpression of EXP-A4 in Arabidopsis produced the expected drought tolerance phenotype [231]. In rice, overexpression of Sucrose synthase (SUS) led to increased deposition of cell wall-related polysaccharides and to reduced cellulose crystallinity and xylose/arabinose proportion in hemicellulose, which is beneficial for the biofuel industry [232]. Genetic engineering of the cell wall biosynthetic pathway through overexpression of SUS in rice has thus added a new dimension to the role of SUS in cell wall metabolism.

Metabolic Engineering for Bio-Fortification of Phytonutrients
In the past 20 years, several attempts have been made to enrich the nutritional composition of crop plants so that they can emerge as superfoods. One example is the development of the purple tomato [206], in which a gene was overexpressed to hyperaccumulate anthocyanin, an anticancer compound. One of the most important contributions to metabolic engineering of crop plants was the development of "Golden Rice" by overexpressing phytoene synthase (PSY) from maize and daffodil, and a PSY ortholog from the bacterium Erwinia uredovora, under an endosperm-specific promoter, leading to a 27-fold increase in the β-carotene level in the transgenic golden rice [1,124,171]. Every year, folate deficiency causes death, cardiovascular disease, megaloblastic anemia, and neurological disorders in newborns [1]. Following the characterization of the folate biosynthesis pathway genes, several of these genes have been overexpressed in Arabidopsis, lettuce, tomato, maize, and potato [1]. The gene GTP-cyclohydrolase 1 (GTPCH1) was overexpressed in Arabidopsis, lettuce, rice, and tomato [233][234][235][236].

Study of Root Nodule Symbiosis (RNS) in Legumes
Although symbiotic nitrogen fixation is mainly restricted to legumes, several rhizobia, including certain diazotrophs, inhabit the rhizosphere of other crops and are involved in plant development. In the late 19th century, legumes (Fabaceae) were found to be capable of forming a root nodule symbiosis (RNS) with nitrogen-fixing rhizobia, which improves soil fertility [237]. With the emergence of modern tools such as transcriptomics and proteomics, the molecular mechanisms of RNS, nodule organogenesis, and nodule development have been well studied in model legume species [238,239]. These studies have established the concepts that mark the path toward engineering nitrogen-fixing nodule symbiosis, which include: various blueprints for nitrogen-fixing RNS, the use of non-model crops to recognize important symbiosis genes, recruitment of the arbuscular mycorrhizal pathway for RNS, and crosstalk between plant developmental programs and RNS. Not only do these concepts reflect significant breakthroughs in our knowledge of RNS, but they also provide important insights into the possibilities and constraints of engineering strategies. Various studies in legumes have reported a number of genes associated with RNS (Figure 1) [240][241][242][243][244][245]. Some important genes that control RNS have been reported: NFR, LYK3, LYR3, DMI1-3, CASTOR, POLLUX, NUP85, NUP133, NENA, and SYMRK for Nod factor perception, while the downstream signaling pathway includes the transcription factors NSP1, NSP2, ERN1, etc. (see Figure 1) [238,239]. More such studies are required in order to understand the molecular biology, biochemistry, and nodulation physiology of nodulating species.

Addressing Symbiotic Nitrogen Fixation in Cereals and Non-Legume Crop Plants
The nitrogen-fixing orders Cucurbitales, Fagales, Fabales, and Rosales, as well as other lineages such as Poaceae (Poales), vary widely, and their root systems show various developmental adaptations [246]. Crop plants such as cereals demand a significant amount of nitrogen for proper growth and grain production; engineering nitrogen-fixing, nodulation-related traits into these crops would therefore be ideal [247]. Selecting a single gene for metabolic engineering of non-legume plants (such as cereals) to induce root nodulation for better nitrogen use efficiency is the biggest challenge. By comparing the various RNS and their associated genes, we can distinguish common features and the core genes that must have been recruited early in the development of the trait. Knowledge of these genes is also important because they can be related to processes such as root hair invasion, nodule organogenesis, and symbiosome development, thereby enabling an engineering approach that integrates features from multiple symbioses. To define a core set of symbiosis genes important to RNS and to classify lineage-specific adaptations, it is necessary to choose representative species from different clades for comparative study. Such comparative work is particularly advantageous now that CRISPR-Cas9-based reverse genetics allows gene function to be studied.
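The comparative step described above can be illustrated with a minimal sketch. It assumes that gene presence has already been scored per species; the species labels and gene names below are hypothetical placeholders, not real annotations, and the functions `core_genes` and `lineage_specific` are illustrative names rather than part of any published pipeline.

```python
# Hypothetical presence data: species -> set of symbiosis-related gene families.
# All names are placeholders chosen for illustration only.
presence = {
    "nodulating_sp_A": {"gene1", "gene2", "gene3", "gene4"},
    "nodulating_sp_B": {"gene1", "gene2", "gene3", "gene5"},
    "nodulating_sp_C": {"gene1", "gene2", "gene6"},
}


def core_genes(presence):
    """Return genes present in every species (candidate core set)."""
    return set.intersection(*presence.values())


def lineage_specific(presence):
    """Return, per species, the genes found in that species only."""
    result = {}
    for species, genes in presence.items():
        # Union of the gene sets of all other species.
        others = set().union(
            *(g for s, g in presence.items() if s != species)
        )
        result[species] = genes - others
    return result


core = core_genes(presence)
specific = lineage_specific(presence)
print(sorted(core))
print({s: sorted(g) for s, g in specific.items()})
```

In this toy setting, the intersection gives candidate core genes and the per-species differences give candidate lineage-specific adaptations. A real analysis would work from orthogroup assignments rather than raw gene names.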

Introducing a cluster of genes responsible for root nodulation through genetic engineering would be an important achievement; indeed, such novel attempts are needed in cereals and other non-legume crops [248][249][250][251]. If all the genes for nitrogen-fixing symbiosis are defined in model species, they will provide a framework for engineering distantly related species. Because the nitrogen-fixing trait is believed to have a single evolutionary origin, several species in the nitrogen-fixing clade are thought to have lost nodulation [252,253]. One current approach is to restore the mutated symbiosis genes in non-nodulating species of the nitrogen-fixing clade. Likewise, species representing a sister lineage of the clade could be targeted [252,254]. In non-nodulating species, the introduction of nodulation will rely on endogenous genes, but several transgenes will also need to be transferred. As a first step, the NFP/NFR5/NFP2, NIN, and RPG genes can be used. Whether these are the only genes responsible for nodulation remains an open question [255]. Other genes, such as those encoding leghemoglobin, have most likely undergone minor but important adaptations [256].
A functional RNS cannot be expected from a single attempt in non-nodulating species, because it is coordinated by multiple genes; engineering is more likely to be an iterative process. Evolutionary genomics studies indicate that relatively few genetic elements are required to transfer nitrogen-fixing ability from legume to non-legume species [257]. The transfer of nitrogenase-encoding genes to plants requires a concatemerized bacterial genetic unit (a minimum set of three genes) [258]. Engineering the nitrogenase-encoding bacterial nif genes into non-legume species is difficult because of the complexity of nitrogenase biogenesis and the sensitivity of nitrogenase to oxygen. Advanced genetic and biochemical studies have defined the common core group of genes needed for the functional biogenesis of nitrogenase [259]. Moreover, the potentially low-oxygen subcellular conditions in mitochondria and plastids could support active nitrogenase in plants, which makes this engineering approach feasible [260]. Recent studies have shown that the legume symbiotic signaling pathway (SYM) plays a key role in arbuscular mycorrhizal symbiotic associations (AMSA). Many plants, including cereals, can form AMSA but cannot form nitrogen-fixing nodules. The SYM pathway for arbuscular mycorrhizal associations in cereals can be engineered to perceive rhizobial signal molecules, which could trigger the pathway and activate the formation of an oxygen-limited, nodule-like root organ for nitrogen fixation [261]. Prior phylogenomic studies have shown that a set of genes can convert a species with AMSA into one with nitrogen-fixing symbiosis [252,256]. In cereals, chloroplasts and mitochondria are considered ideal locations for the energy-demanding nitrogenase enzyme [262]; however, oxygen evolved from chloroplasts during photosynthesis could disrupt formation of the nitrogenase enzyme complex.
A potential solution is spatio-temporal separation of photosynthesis and nitrogen fixation, so that nif genes are expressed only during dark periods or in non-photosynthetic parts such as the root system [263]. In addition, a carbon-secretion approach that promotes carbon competition among the nitrogen-fixing population could be used to develop adequate signaling between cereals and nitrogen-fixing rhizobia for effective colonization [261].
Phylogenomic studies, assisted by de novo genome sequencing of non-model legume species, have led to a better understanding of the origin of the nodulation trait and have paved the way for trait engineering. Because these comparative phylogenomic studies were comprehensive, more target genes were found, which encouraged researchers to pursue genetic engineering of traits related to nitrogen-fixing symbiosis. Metabolic engineering of nitrogen-related pathways, targeting genes associated with N transport, assimilation, and primary N metabolism, is important for improving nitrogen use efficiency (NUE) in crop plants and appears to be the most promising approach [264][265][266][267]. In addition, several genes involved in C metabolism appear to link C and N metabolism closely, and it is hoped that modifying these genes could improve N uptake [265]. One amino acid biosynthesis gene, AlaAT, confers an NUE phenotype under greenhouse and field conditions when overexpressed in canola and rice [268,269]. This gene encodes alanine aminotransferase (AlaAT, EC 2.6.1.2), an enzyme that catalyzes the reversible synthesis of alanine and 2-oxoglutarate from pyruvate and glutamate, acting in N metabolism downstream of the GS and GOGAT pathway. Intriguingly, transcriptomic comparison of AlaAT-overexpressing (AlaAT-ox) rice lines with the wild type (WT) under low, medium, or high N conditions did not detect any known N-transport or N-assimilation genes as differentially regulated; instead, the most strongly differentially expressed genes were regulatory transcription factors associated with secondary metabolism and a few genes of unknown function [270,271]. Because the expression of TCA- and secondary metabolite-associated genes had changed, researchers focused on assessing N-containing metabolites and the N-flux balance in transgenic plants [272].
In our view, research efforts in this direction are important, because crops engineered for RNS may have a promising future.
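For reference, the reversible AlaAT reaction described above can be written out explicitly. This is the standard textbook form of the transamination, added here for illustration; the L-stereoisomers are an assumption of the notation, and `\text` requires the amsmath package.

```latex
\[
\text{pyruvate} + \text{L-glutamate}
\;\rightleftharpoons\;
\text{L-alanine} + \text{2-oxoglutarate}
\]
```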

Public Perception for the Metabolic Engineered Plants
Food demand is increasing every year, while agricultural systems are degrading and arable land is shrinking because of severe biodiversity loss and increasingly uncertain, climate change-driven rainfall. Under these conditions, traditional breeding alone may take considerable time to meet demand, so breeders must adopt molecular biology as a tool to develop climate-smart crops. One of the important achievements in plant biotechnology was the development of the transgenic tomato Flavr Savr (also written "flavor saver"; CGN-89564-2) by Calgene, a company later acquired by Monsanto [273]. Like Flavr Savr, many important crops have been developed by targeting metabolic pathways to enhance postharvest shelf-life or tolerance to biotic and abiotic stress [274]. Genetic engineering has played a very important role in plant breeding: around 525 transgenic events have been registered, the largest number (238) for maize, followed by 61 for cotton, 49 for potato, 42 for canola, 41 for soybean, etc., and nearly 32 crops worldwide have received approval for cultivation [275]. However, over the past two decades, the public and NGOs have frequently expressed outrage against transgenic and/or genetically modified crops (GMOs), including Flavr Savr, which was approved for sale by the Food and Drug Administration (FDA), USA [273]. More recently, genome/gene editing has made a significant impact. Earlier, ZFNs and TALENs played very important roles, and their products are already on the market [274][275][276]. Several countries, such as the USA, Canada, and China, have responded positively to these products and treat them simply as mutants, unlike the EU, whose stringent regulations treat genome-edited crops as transgenic.
In July 2018, ECJ (European Court of Justice) stated that "All genome-edited plants should be treated legally as genetically-modified organisms (GMOs), using definitions dating from 2001". With the advent of CRISPR/Cas, a revolutionary genome/gene editing tool, the regulatory barrier is expected to weaken in the coming years [274][275][276], as the regulatory agencies of several countries such as the USA, Canada, and China already consider such products to be mutants [276]. In addition, the CRISPR/Cas technique can be modified and applied more flexibly now that several variants of Cas enzymes are available [277]. At present, CRISPR/Cas is considered one of the best tools for editing traits in crop species. Techniques such as speed breeding can also be integrated to achieve more from CRISPR/Cas.

Future Perspective
In the future, de novo domestication is likely to become one of the most important areas of research. To achieve it, metabolomics-assisted breeding and knowledge of metabolic pathways will play a very important role. During the 'Green Revolution', selection of genes related to the gibberellin (GA) pathways played a crucial role in the development of semi-dwarf, high-yielding varieties, which helped meet the food demand of billions of people. Today, a better understanding of metabolic pathways through an integrated approach can help redesign ancestral species that are resistant to several biotic and abiotic stresses. In addition, modern sequencing technology is playing a pivotal role in fine-tuning genome annotation by drawing on the available transcriptome, proteome, and metabolome atlas data. Utilization of metabolomics data would therefore help in the rapid generation of climate-smart, biofortified, nutrient-rich varieties to achieve targeted sustainable food production and security.

Conflicts of Interest:
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.