Cataloging the Brassica napus seed metabolome

The allotetraploid Brassicales member canola (Brassica napus L.) is an oil-seed crop sought primarily for vegetable oil, animal feed, and biofuel. With the advent of -omics (genomics, transcriptomics, proteomics, and metabolomics) technologies, numerous studies have focused on deciphering the functional genes, proteins, and metabolites in canola. However, although the oil-yielding seeds are of commercial interest, only a handful of studies using mass spectrometry (MS) or spectroscopy-based platforms have attempted to characterize and quantify metabolites of the seed. Barring metabolite profiling approaches that study groups of chemicals, metabolomic insights into the seeds are very recent. Recent MS and spectroscopy-based studies show that canola seeds are enriched in fatty acids, glucosinolates, phenylpropanoids (e.g. sinapoyl cholines), flavonoids, and phytohormones, among others. Thus, cataloging the seed metabolome is essential before gaining further insight into the roles of these metabolites in seed biology and plant physiology, and for understanding the qualitative traits and products obtainable from the seeds.
Subjects: Plant Biology; Plant Biotechnology; Bioscience; Biochemistry; Natural Products; Bioinformatics; Agriculture; Food Chemistry; Food Analysis

ABOUT THE AUTHOR Biswapriya Biswavas Misra, PhD, is a Postdoctoral Scientist at the Texas Biomedical Research Institute, San Antonio, Texas, USA. His current research interests are metabolomics and genomics towards a holistic understanding of organismal biology. He obtained a PhD in Plant Biotechnology from the Indian Institute of Technology Kharagpur, West Bengal, India, followed by postdoctoral training in Plant Genomics in Malaysia and in Metabolomics of Canola at the University of Florida, Gainesville, USA. One of his ambitions is to help the metabolomics user community realize the potential of plant metabolomics information, tools, and methods for a meaningful utilization of phytochemistry.

PUBLIC INTEREST STATEMENT
Brassica napus L., commonly known as canola (rapeseed), belongs to the mustard family of the plant kingdom and is an oil-seed crop sought primarily for vegetable oil, animal feed, and biofuel. The oil-yielding seeds are of commercial interest. Hence, recent studies have used tools such as mass spectrometry or spectroscopy to identify and measure the different small molecules (phytochemicals) in the seeds. Apart from the sulfur-containing, sulfur-smelling class of chemical compounds known as glucosinolates, canola seeds are rich in fatty acids, which make them commercially important as an oil crop. In addition, the seeds contain phenolics, flavonoids, sugars, amino acids, and other basic biochemicals usually present in a seed. Thus, cataloging the seed metabolome is important for understanding its role in seed biology, physiology, and agriculture before agronomic traits are manipulated.
allopolyploid crossing between its parents Brassica rapa (AA, 2n = 20) and Brassica oleracea (CC, 2n = 18). The oil composition of the two oilseed species is highly similar and their oils are typically mixed and sold as one product, i.e. rapeseed oil. As a close relative of the model plant Arabidopsis thaliana, canola has drawn on the enormous comparative genomics resources of the latter and is hence regarded as a model oil crop species for research and development. Accordingly, canola seeds are used for the production of animal feed, vegetable oil, and biodiesel, whereas the canola seed meal, a byproduct, contains proteins of the highest quality in terms of nutritive value, antigenicity, and amino acid composition (Pan, Jiang, & Pan, 2011). Seed residues left after pressing, i.e. seed meal, are of great interest to industry as animal feed and for human nutrition. However, the meal is enriched in fibers, polyphenols, phytic acid, and other antinutritive principles, which decrease the meal's market value.
Given that the seed is the key product of most crop species, its development has been intensively investigated across model species and many plant genera (De Smet, Lau, Mayer, & Jürgens, 2010; Moles et al., 2005; North et al., 2010). Moreover, the genetic (North et al., 2010), molecular (Finkelstein, Reeves, Ariizumi, & Steber, 2008; Holdsworth, Bentsink, & Soppe, 2008), and biochemical (Bailly, 2004) processes underlying a seed's life history are understood well enough to allow insights into basic phenomena such as seed dormancy and germination (Koornneef, Bentsink, & Hilhorst, 2002). However, the application of -omics approaches to the understanding of seed development and physiology has been limited to date, with few exceptions (Joët, Wurtzel, Matsuda, Saito, & Dussert, 2012; Miernyk & Johnston, 2012; Nambara & Nonogaki, 2012). Despite the growth and tremendous potential of -omics research in plant sciences (Provart et al., 2015), efforts to catalog the available -omics data sets for canola seeds are non-existent. Towards this end, information on the seed genome, transcriptome, proteome, and metabolome is of critical interest.
The metabolome is defined as the entire complement of metabolites (i.e. small molecules with molecular weights <2,000 Daltons) obtained from cells, tissues, organs, organisms, fluids, and communities of biological origin. Because no single platform can cover the diverse chemical classes of small molecules, metabolomic investigations are platform-dependent, and are either MS-based [gas chromatography-mass spectrometry (GC-MS), liquid chromatography-mass spectrometry (LC-MS), and capillary electrophoresis-mass spectrometry (CE-MS)] or spectroscopy-based [nuclear magnetic resonance (NMR), Raman, or Fourier transform infrared (FTIR) spectroscopy] (Kim, Choi, & Verpoorte, 2011). These approaches may or may not involve chromatographic separations (GC, LC, CE, etc.) and are carried out on targeted (Sawada et al., 2009) and untargeted (De Vos et al., 2007) platforms. Hence, this review collates the metabolites identified and quantified in canola seeds as a phytoresource for future metabolomics studies.
Plant metabolomics applications range from phytochemistry (Sumner, Mendes, & Dixon, 2003) and food safety, quality, and traceability (Castro-Puyana & Herrero, 2013) to single-cell and single-cell-type metabolomics studies (Geng et al., 2016; Misra, Assmann, & Chen, 2014; Misra, Acharya, Granot, Assmann, & Chen, 2015), and metabolomics also serves as a functional genomics tool (Hall et al., 2002). These studies range from explorations of specific classes of chemical compounds using metabolite profiling approaches to widely targeted global approaches. In fact, very recently the normal B. napus guard and mesophyll cell metabolomes from leaves, as well as the bicarbonate-stressed metabolomes (mimicking the response to elevated atmospheric [CO2]), were cataloged using global high performance liquid chromatography-multiple reaction monitoring-mass spectrometry (HPLC-MRM-MS) and GC-MS approaches (Misra, de Armas, Tong, & Chen, 2015). Similarly, previous studies on canola seeds focused on profiling smaller groups of compounds such as glucosinolates, phenolics, fatty acids, or sinapoyl cholines using GC-MS, UHPLC-MS, LC-MS, or other chromatographic and chemometric approaches (Farag et al., 2013; Frolov et al., 2013). Here, the available information is summarized towards cataloging the B. napus seed metabolome.

The canola seed metabolome
To date, only a handful of studies have dealt with system-wide metabolomic investigations of B. napus seeds (Farag et al., 2013; Frolov et al., 2013; Jiang et al., 2013; Kortesniemi et al., 2015; Tan et al., 2015). Earlier studies used gas-liquid chromatography (GLC) to determine the chemical composition of yellow- and brown-seeded Brassica for quantification of carbohydrates, dietary fiber, and galacto-oligosaccharides in canola seed meal (Simbaya et al., 1995; Slominski, Campbell, & Guenter, 1994). Remarkably, a sharp increase in metabolomics studies occurred after 2012, and the number of studies continues to rise, as is evident in the current literature. The first 13C-NMR-based assignment of phytochemicals in seed coats of B. napus-Sinapis alba hybrids was also performed during this period (Jiang et al., 2013). In another recent NMR-based metabolomics study, unsaturated fatty acids, sucrose, and sinapine were shown to be the most discriminating metabolites between the B. napus and B. rapa metabolomes (Kortesniemi et al., 2015). This study also recorded the characteristic Fourier transform infrared (FTIR) spectra of critical compounds, which provided important leads towards the assured identification of metabolites (classes). However, the most comprehensive metabolomic study of B. napus seeds was performed using non-targeted metabolomic analysis via ultra performance liquid chromatography-quadrupole time-of-flight-mass spectrometry (UPLC-QTOF-MS), in which compounds belonging to various chemical classes, i.e. oxygenated fatty acids, flavonols, phenolic acids, and sinapoyl choline derivatives, were successfully quantified in seeds (Farag et al., 2013). In this study, about 20 metabolites were confidently identified in the seeds alongside another four unknown metabolites, as listed in Supplementary Table 1.
These metabolites are distinct from a further 44 metabolites that were differentially distributed in root, stem, and inflorescence but were not detected in seeds. Another study, which performed global metabolomics explorations of developing canola seeds using GC-MS, reported 443 detected features, of which 77 metabolites (about 17%) could be correctly annotated and quantified; these include a very informative list of amino acids, fatty acids, organic acids, phenolic acids, and carbohydrates (sugars), among others (Tan et al., 2015). The list is presented in Supplementary Table 1. In another interesting study aimed at highlighting metabolic control, genotypic differences in carbon partitioning were examined in in vitro cultured developing embryos of oilseed rape, recording 79 net fluxes, the levels of 77 metabolites, and 26 enzyme activities with a specific focus on central metabolism in nine selected germplasm accessions (Schwender et al., 2015).
In addition, much of the information on individual metabolites is derived from focused profiling approaches, in which the major metabolite groups described below have been well characterized and individual metabolites quantified. A list of 160 metabolites in total was collected (Supplementary Table 1) and mapped against different chemical databases to obtain 110 Kyoto Encyclopedia of Genes and Genomes (KEGG) IDs. These KEGG IDs were then connected based on their biochemical interactions and relatedness using STITCH 5.0, available at http://stitch.embl.de/ (Szklarczyk et al., 2016), as shown in Figure 1. For a more classical view of KEGG pathways, the metabolites were also mapped against the KEGG metabolic map (http://www.genome.jp/kegg/mapper.html), as shown in Figure 2. In fact, with the KEGG IDs provided, users can recreate both images interactively and add or remove metabolites. Note that the network and pathway mapping were based only on the metabolomics efforts discussed above, as not all metabolites (such as many flavonoids and sinapines) are mappable in public databases; redundant entries across individual publications were removed, and not every individual metabolite was taken into account. Moreover, the networks and pathways are intended as a visual interpretation of the canola seed metabolome, aiming for comprehensiveness rather than completeness.
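The cataloging steps described above (pooling metabolite names across studies, removing redundant entries, and mapping the remainder to KEGG IDs) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure actually used for Supplementary Table 1: the two study lists are hypothetical, the lookup table is a tiny hand-made stand-in for a real KEGG COMPOUND mapping, and sinapine is deliberately left out of that table to represent an entry that fails to map. The function names are likewise invented for this example.

```python
from typing import Dict, Iterable, List, Optional

# Illustrative lookup only (assumption): a real table would be built from
# the KEGG COMPOUND database. These four IDs are standard KEGG compound IDs.
EXAMPLE_KEGG_IDS: Dict[str, str] = {
    "d-glucose": "C00031",
    "sucrose": "C00089",
    "d-fructose": "C00095",
    "myo-inositol": "C00137",
}


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so spelling variants compare equal."""
    return " ".join(name.lower().split())


def merge_unique(study_lists: Iterable[Iterable[str]]) -> List[str]:
    """Pool names from several studies, dropping repeats, keeping first-seen order."""
    seen = set()
    merged: List[str] = []
    for names in study_lists:
        for name in names:
            key = normalize(name)
            if key not in seen:
                seen.add(key)
                merged.append(key)
    return merged


def map_to_kegg(names: Iterable[str], lookup: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return name -> KEGG ID, with None for names absent from the lookup."""
    return {name: lookup.get(name) for name in names}


def coverage(mapping: Dict[str, Optional[str]]) -> float:
    """Fraction of cataloged names that received a KEGG ID."""
    if not mapping:
        return 0.0
    mapped = sum(1 for kegg_id in mapping.values() if kegg_id is not None)
    return mapped / len(mapping)


# Hypothetical per-study metabolite lists (not taken from the cited papers).
study_a = ["Sucrose", "D-Glucose", "Sinapine"]
study_b = ["sucrose", "D-Fructose", "myo-Inositol"]

catalog = merge_unique([study_a, study_b])
mapping = map_to_kegg(catalog, EXAMPLE_KEGG_IDS)
unmapped = [name for name, kegg_id in mapping.items() if kegg_id is None]

print(catalog)
print(f"{coverage(mapping):.0%} of names mapped; unmapped: {unmapped}")
```

Applied to the figures reported in this review, the same coverage calculation gives 110 of 160 metabolites, or roughly 69%, with KEGG IDs. The mapped IDs are the kind of input that tools such as STITCH and KEGG Mapper accept for network and pathway views.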

Phenylpropanoids
Phenylpropanoids (phenolics) constitute a large class of plant metabolites involved in defense and protection against pathogen attack, high light intensity and UV radiation, wounding, and low nutrient conditions (Dixon & Paiva, 1995; Hahlbrock & Scheel, 1989; Yu & Jez, 2008). In one study, a total phenolic choline ester fraction prepared from canola seeds was analyzed by capillary LC/ESI-QTOF-MS and direct infusion electrospray ionization-Fourier transform ion cyclotron resonance-mass spectrometry (ESI-FTICR-MS) (Böttcher, von Roepenack-Lahaye, Schmidt, Clemens, & Scheel, 2009). In addition to the dominant sinapoylcholine, 30 phenolic choline esters were identified based on accurate mass measurements. The compounds identified included substituted hydroxycinnamoyl- and hydroxybenzoylcholines, their respective monohexosides, and oxidative coupling products of phenolic choline esters and monolignols.
Using liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based analysis, sixty-six cell wall-bound phenolics and their derivatives were identified and quantified (Frolov et al., 2013). The analysis was performed on an electrospray ionization quadrupole linear ion trap mass spectrometer (ESI-QqLIT) operating in negative ion mode and coupled on-line to a reversed-phase ultra performance liquid chromatography (RP-UPLC) system; the compounds identified are listed in Supplementary Table 1. Although the functions of the common phenylpropanoids reported are well known, the significance of this unique combination of phenolics alongside a rich repertoire of flavonoids is yet to be deciphered.
Sinapoylcholine (i.e. sinapine) content in seeds typically varies from 3 to 12 mg/g (Zum Felde, Baumert, Strack, Becker, & Möllers, 2007). Furthermore, sinapine/sinapic acid was found to account for 73 to 80% of the total polyphenol content in seed meals (Obied et al., 2013). Although other classes of metabolites are detectable predominantly in specific organs such as roots and leaves, sinapoyl cholines were found to be present uniquely in seeds (Farag et al., 2013). Sinapoyl choline esters are known to accumulate abundantly not just in canola but in the seeds of glucosinolate-producing plants in general (Böttcher et al., 2009). As sinapoyl choline is antinutritive to humans and animals, efforts are ongoing to reduce sinapoyl choline amounts in canola seeds by conventional and molecular breeding approaches in order to establish rapeseed as a source of food-grade protein. In this study, sinapoyl cholines and their esters were tentatively observed. Nonetheless, the biosynthetic pathways of sinapine and its derivatives, and efforts to manipulate them, keep phytochemists interested in canola seed development.
Furthermore, the mean contents of brassicasterol, campesterol, stigmasterol, β-sitosterol, Δ5-avenasterol, and Δ7-stigmatenol were determined as percentages of total phytosterols in canola oil (Hamama, Bhardwaj, & Starner, 2003). Although biosynthesized through unrelated pathways, the vitamin E compounds (tocochromenols) are also abundant in canola seeds and are known to protect plant lipids as antioxidants. In addition, the higher antioxidant activity in seeds is attributed to the α-tocopherol present in them (El-Beltagi & Mohamed, 2010; Tuberoso, Kowalczyk, Sarritzu, & Cabras, 2007). A plethora of tocols, i.e. alpha-tocopherol, alpha-tocotrienol, beta-tocopherol, gamma-tocopherol, beta-tocotrienol, gamma-tocotrienol, delta-tocopherol, and delta-tocotrienol, were quantified in canola seed oils (Oraby & Ramadan, 2014). The presence of these tocols points to their roles in protection from oxidative stress as well as their potential for commercial exploitation. As fatty acids and other lipophilic compounds are the major constituents of canola seed oil, any transgenic manipulation directed at seed oil content must take care not to disturb the balance in other metabolic pathways.

Other metabolites of importance
Phytohormones are important in seed dehiscence, maturation, germination, and development. The ethylene precursor 1-aminocyclopropane-1-carboxylic acid (ACC) and ACC-related compounds, α- and γ-aminobutyric acids, both known to stimulate ethylene production, were measured in canola seed exudate and tissues using HPLC methods (Penrose & Glick, 2001). Among the most thoroughly studied phytohormones is abscisic acid (ABA), in relation to imbibition and seed germination. Although ABA is considered the bioactive form, 7′-, 8′-, and 9′-hydroxy ABA also displayed hormonal activity in B. napus embryos (Jadhav et al., 2008). Gibberellins such as GA1 and GA4 and auxins such as indole-3-acetic acid (IAA) were quantified in canola seeds (Walton et al., 2012).
Previous studies have also concluded that photosynthesis in the silique wall may contribute most of the dry matter to ripening seeds, with the contents of sucrose, fructose, and glucose specifically affecting the seed oil content (King, Lunn, & Furbank, 1997). Sugars such as glucose and fructose were identified in developing rape seeds (Norton & Harris, 1975). Primary metabolites, for example 15 common amino acids, were quantified in B. napus seed meal (Jiang et al., 2015). Among other classes of metabolites, the accumulation patterns of sugar alcohols such as inositol in developing canola seeds indicated that early stages of seed development are marked by rapid deployment of inositol into a variety of pathways, leading to products such as polar inositol phosphates and non-polar phospholipids (Dong et al., 2013). This study also quantified galactose, galactinol, raffinose, stachyose, and sucrose levels in seeds. Moreover, hydrolysis of inositol-containing phospholipids is known to lead to increased phytate accumulation in seeds of B. napus (Georges et al., 2009). Studies have also quantified glucose, gentiobiose, and a polyamine using NMR and HR-ESI-MS-based approaches (Baumert et al., 2005).
Carotenoids are a large group of secondary compounds, often highly colored, that are derived from isoprenoid precursors and synthesized in plants; they have antioxidant functions and, in higher plants, are at times integral parts of the photosynthetic apparatus (Bartley & Scolnik, 1995). Carotenoids such as lutein and beta-carotene are prominent in canola seeds (Ravanello, Ke, Alvarez, Huang, & Shewmaker, 2003; Shewmaker, Sheehy, Daley, Colburn, & Ke, 1999). Moreover, HPLC analysis of the chlorophyll pigments present in canola seed, meal, and crude and degummed oils revealed that chlorophylls a and b, low levels of pheophytin a, and occasionally traces of pheophorbide and its methyl ester were present in canola seed. In contrast, meals and oils contained magnesium-deficient chlorophyll pigments such as pheophorbide a, methylpheophorbide a, pheophytins a and b, and pyropheophytins a and b, but not chlorophyll a or b (Endo, Thorsteinson, & Daun, 1992).

Future prospects
For an oil crop, the primary challenges are to achieve higher oil yield and reduced glucosinolate content, in addition to lower levels of secondary metabolites of lesser importance. This makes advances in lipid biology, biochemistry, and lipidomics very important for the understanding of seed -omics from maturation to germination. To this end, newer technologies for studying plant lipidomes are developing rapidly, including the successful implementation of in situ lipidomic visualization/imaging, as evidenced by numerous examples (Horn & Chapman, 2012). Additionally, approaches such as shotgun lipidomics (Han & Gross, 2005), as well as advances in hydrophilic interaction liquid chromatography-ion trap-time of flight-mass spectrometry (HILIC-IT-ToF-MS) (Okazaki, Kamide, Hirai, & Saito, 2013) and ultra-performance liquid chromatography-high resolution mass spectrometry (UPLC-HRMS) (Hummel et al., 2011), have helped technology platforms move forward quickly to allow better coverage and quantification. These tools are not restricted to lipidomic explorations, but are also available for widely targeted and untargeted global metabolomic studies. Even more promising, there has been significant development in the number, accuracy, performance, and availability of databases, software, algorithms, webservers, and tools for the analysis of lipidomic (Haimi, Uphoff, Hermansson, & Somerharju, 2006) and metabolomic (Misra & van der Hooft, 2016) datasets.

Conclusions
Canola seeds are enriched in fatty acids, glucosinolates, phenylpropanoids (especially sinapoyl cholines), flavonoids, and phytohormones, among others. With advances in epigenetic studies and in transcriptomics, proteomics, and metabolomics platforms and technologies, a deeper understanding of B. napus systems biology is now feasible. In addition, understanding the dynamic changes in seed metabolic flux is a critical step towards increasing the quantity and quality of seed oil (Tan et al., 2015). Thus, flux models of specific pathways would lead to an understanding of carbon flow in germinating embryos (Schwender, Ohlrogge, & Shachar-Hill, 2003). Because the seeds of this oil crop play important roles in the oil economy and as a future biofuel for industrial growth, among other economic roles, approaches and technologies that enhance the oil content of seeds are of prime importance. To achieve this, metabolomics will play a pivotal role in helping to integrate genotype-derived information from the genome, transcriptome, and proteome in order to understand the regulatory steps leading to increased oil yield, among other improved traits. Moreover, the metabolites cataloged in this review can be integrated into seed-specific or Brassica-specific metabolite databases for future applications.