Untargeted metabotyping to study phenylpropanoid diversity in crop plants

Plant genebanks constitute a key resource for breeding to ensure crop yield under changing environmental conditions. Because of their roles in a range of stress responses, phenylpropanoids are promising targets. Phenylpropanoids comprise a wide array of metabolites; however, studies regarding their diversity and the underlying genes are still limited for cereals. The assessment of barley diversity via genotyping-by-sequencing is progressing rapidly. Exploring these resources by integrating genetic association studies with in-depth metabolomic profiling provides a valuable opportunity to study barley phenylpropanoid metabolism, but poses a challenge by demanding large-scale approaches. Here, we report an LC-PDA-MS workflow for barley high-throughput metabotyping. Without prior construction of a species-specific library, this method produced phenylpropanoid-enriched metabotypes with which the abundance of putative metabolic features was assessed across hundreds of samples in a single-processed data matrix. The robustness of the analytical performance was tested using a standard mix and extracts from two selected cultivars: Scarlett and Barke. The large-scale analysis of barley extracts (1) showed that barley flag leaf profiles were dominated by glycosylation derivatives of isovitexin, isoorientin, and isoscoparin; (2) proved the workflow's capability to discriminate among genotypes; and (3) highlighted the role of glycosylation in barley phenylpropanoid diversity. Using the barley S42IL mapping population, the workflow proved useful for metabolic quantitative trait loci purposes. The protocol can be readily applied not only to explore the barley phenylpropanoid diversity represented in genebanks but also to study species whose profiles differ from those of cereals: the crop Helianthus annuus (sunflower) and the model plant Arabidopsis thaliana.


| INTRODUCTION
The genetic diversity represented in plant germplasm collections constitutes a key resource for breeding plants to ensure crop yield and nutritional quality under changing environmental conditions (Esquinas-Alcazar, 2005; Fowler & Hodgkin, 2004; Gollin, 2020; Vreugdenhil et al., 2005). Several cereals are major staple crop species worldwide, amongst them wheat, rice, maize, and barley.
Hence, conservation and evaluation of cereal germplasm collections are of particular relevance for future human and animal nutrition. The significance of the preservation efforts has led to the establishment of plant genebanks in many countries and, more recently, the Svalbard global seed vault (Asdal & Guarino, 2018; Westengen et al., 2020). Whereas the maintenance of the large germplasm collections constitutes a continuous effort in itself, evaluation of the stocks is still a major bottleneck toward their use for breeding and biotechnology, since it demands large-scale approaches. With the development of novel sequencing technologies and the so-called "omics" strategies for high-throughput analysis of transcripts, proteins, and metabolites, the evaluation of the diversity within genebank collections has gained momentum but remains challenging. Large-scale phenotyping of collections is now enabled by the development of systems for automated assessment of plants (Yang et al., 2020). Genomic sequences have been published for the major crop species rice, barley, maize, and wheat (Mascher et al., 2017; Schnable, 2015; Song et al., 2018; Walkowiak et al., 2020). For barley, one of these agronomically important crops, the assessment of genetic diversity by genomic sequencing is now progressing rapidly. Almost all barley accessions of the German ex-situ genebank were analyzed using a genome-wide genotyping-by-sequencing approach; this enabled the detection of known and novel loci underlying morphological traits that differentiate the barley gene pools (Milner et al., 2019). Twenty barley genotypes, including landraces, cultivars, and a wild relative, were used to establish the barley pan-genome representing its global diversity (Jayakodi et al., 2020). These sequencing approaches have resulted in databases such as BRIDGE (König et al., 2020) and BarleyVarDB (Tan et al., 2020) that enable a comprehensive analysis of the genomic variation across barley accessions.
Altogether, the increasing availability of these resources is consolidating barley as an attractive model not only for crop research but also for additional plant science fields (Harwood, 2019), such as plant development (Poursarebani et al., 2020; Walla et al., 2020), chloroplast function (Li et al., 2019; Rotasperti et al., 2020), and plant pathology (Dörmann et al., 2014; Lenk et al., 2018).
Plant specialized metabolism comprises around 200,000 compounds synthesized through a wide array of metabolic pathways in numerous plant taxa (Arimura & Maffei, 2016). Phenylpropanoids contribute to the specialized metabolism with an enormous set of metabolites displaying intermediates of the shikimate pathway as core structures. These are amplified and subjected to additional chemical modifications (e.g., glycosylation, methylation, acylation), resulting in structurally and functionally diverse phenolics that are defined in a species-, tissue-, and temporal-specific manner (Vogt, 2010). Unlike dicotyledonous species such as the model Arabidopsis thaliana (Arabidopsis), whose phenylpropanoids are dominated by flavonol derivatives, cereals mainly accumulate flavones (Tohge et al., 2017). Early studies profiling nearly 1500 barley varieties by TLC (thin-layer chromatography) already highlighted the remarkable variability of flavonoids dominated by C-glucoflavone derivatives (Fröst et al., 1975). The recent annotation of a large number of phenylpropanoids identified in nine barley cultivars using in-depth analytical approaches supports these early observations (Piasecka et al., 2015).
The analysis of plant specialized metabolism, including phenylpropanoids, still faces major technical obstacles directly associated with the structural complexity of the metabolites, their wide dynamic range, which is highly dependent on environmental conditions, and their complex spatiotemporal dynamics (Li & Gaquerel, 2021). The flexibility of LC-MS (Liquid Chromatography coupled to MS) has resulted in its widespread application in metabolomics studies, and it is one of the preferred analytical tools for specialized metabolites. However, some obstacles remain. In addition to the high complexity of specialized metabolites, their limited coverage in databases complicates both their analysis and their annotation (Döll et al., 2019). Owing to the multidimensional complexity of the generated data, further processing remains challenging and computationally intensive (Döll et al., 2019). Ongoing developments in MS metabolomics and computational tools are addressing these obstacles and revolutionizing the field (Li & Gaquerel, 2021).
The increasing availability of genomic sequence information for many accessions of a crop species, including barley, and the progress in phytochemical analysis promote the integration of genetic mapping and metabolic network analysis. The combination of both approaches will allow the identification of novel mechanisms of regulatory control of plant metabolism. Furthermore, the information will allow us to find yet unknown genes involved in the biosynthesis of specialized metabolites in crops. The integration of these strategies is already showing progress in the field. A metabolic genome-wide association study with Arabidopsis accessions grown under two environmental conditions allowed the detection of 123 mQTLs (metabolic Quantitative Trait Loci) and the identification of 70 candidate genes involved in the specialized metabolism (Wu et al., 2018). A similar combinatorial approach was applied to assess Tibetan hulless barley (qingke) and elite barley accessions (Zeng et al., 2020). When barley is cultivated on the Tibetan Plateau, it is exposed to strong UV-B radiation. The study revealed different branches of phenylpropanoid metabolism as relevant for the adaptation toward UV-B (Zeng et al., 2020). In a previous study, we applied LC coupled to UV/Vis-based detection to profile the barley introgression lines (IL) of the S42IL population, which is derived from the introgressive allele hybridization of the Israeli wild barley accession ISR42-8 into the gene pool of the German spring cultivar Scarlett. This study identified a candidate gene encoding a glycosyltransferase involved in the biosynthesis of the flavonoid isovitexin 2″-O-β-D-glucoside (Brauch et al., 2018). We now aimed to extend our earlier approach (Brauch et al., 2018) to enable the comparative phenylpropanoid profiling of larger populations and sets of genotypes potentially comprising several hundred samples.
In this study, we report the development of a workflow for the high-throughput untargeted metabotyping of phenylpropanoids applicable to barley and other plant species. Without the prior construction of a species-specific compound library, this workflow aimed at the generation of metabolic signatures enriched in phenylpropanoids, with which the abundance of thousands of putative metabolic features could be assessed across hundreds of samples in a single-processed data matrix. By pursuing this goal, the challenge of generating and processing large data matrices was undertaken. The workflow was designed to reflect the samples' metabolic complexity and enable their discrimination in a genotype-, development-, or treatment-specific manner through the analysis of both shared and specific metabolic features. The analytical performance was first assessed with commercial standards and extracts from two German spring barley elite cultivars, Barke and Scarlett. The feasibility of using the relative abundances of the identified features as quantitative traits for genetic association studies was evaluated using the S42IL population. Finally, we tested the workflow's versatility through the analysis of extracts from dicotyledonous species whose phenylpropanoid profiles differ from those of cereals: the crop Helianthus annuus (sunflower) and the model plant Arabidopsis, which has largely contributed to current knowledge on specialized metabolism (D'Auria & Gershenzon, 2005; Tohge et al., 2017).

Analytical standards are described in Table S1.

| Standard mix preparation
A set of eight metabolites (Table 1) comprising soluble single and polyphenolic compounds of different polarities and masses was defined.
To evaluate the workflow performance with the standard compounds, the stock was diluted 1:4 in 80% (v/v) methanol to a final concentration of 10 μg mL⁻¹ for each compound (except QUE and NEN: 1 μg mL⁻¹). Hence, the LC injection of 3 μL corresponded to the analysis of 30 ng of each compound (except QUE and NEN: 3 ng) on-column.
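The on-column amounts quoted above follow directly from concentration times injection volume, because 1 μg mL⁻¹ equals 1 ng μL⁻¹. The short sketch below only reproduces that arithmetic; the function name is illustrative and not part of the published workflow.

```python
def injected_mass_ng(conc_ug_per_ml: float, injection_ul: float) -> float:
    """Mass loaded on the column in ng.

    1 ug/mL is the same as 1 ng/uL, so (ug/mL) * (uL) gives ng directly.
    """
    return conc_ug_per_ml * injection_ul


# Values taken from the text: 10 ug/mL (1 ug/mL for QUE and NEN), 3 uL injection.
print(injected_mass_ng(10.0, 3.0))
print(injected_mass_ng(1.0, 3.0))
```

The two calls correspond to the 30 ng and 3 ng stated for the standard mix.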
To test the feasibility of using these compounds as an internal standard, barley methanolic extracts were spiked with the stock solution as follows: 20 μL of stock solution added to 80 μL of barley sample extract.

Table 1 note: Description of the eight soluble phenolic compounds comprising the standard mix analyzed by RP-LC-PDA-ESI-HR-QTOF-MS. The corresponding buckets (Rt; m/z pairs) are indicated together with the adduct and fragment ions combined during data processing. The average Rt and MS intensity for 100 injections along more than 2500 LC-MS runs are indicated for every compound (CV, coefficient of variation, in parentheses). Peak numbers are according to Figure 2A. a Denotes aglycone ions.

| Plant materials
[…] (Honsdorf et al., 2017). The subset of 41 S42ILs referred to in this work covers 75.3% of the exotic genome (Zahn et al., 2020). Additional information regarding the elite cultivars is given in […] (Lancashire et al., 2008). The dataset comprised a total of 226 samples, where every genotype was represented by one to 10 different field plots in a specific developmental stage (Table S3). Five flag leaves were collected per plot and freeze-dried for further analysis.

| Untargeted phenylpropanoid metabotyping workflow
The high-throughput untargeted phenylpropanoid metabotyping method was implemented using previous protocols (Brauch et al., 2018) as a starting point. The workflow comprises six major stages (Figure 1). For freeze-dried material, 10 mg were suspended in 400 μL of 80% (v/v) methanol, briefly vortexed, and incubated overnight (4 °C). Following centrifugation (22,500 × g, 4 °C, 10 min), the supernatant was recovered into a clean tube and the pellet was resuspended in 400 μL of 100% methanol. After a 1-h incubation (4 °C), the second supernatant was recovered by centrifugation; the two supernatants were combined. For frozen material, extraction was performed as previously described (Petridis et al., 2016). These two-step methanolic extraction methods enable the recovery of 95% of the total soluble UV-absorbing compounds (Figure S1).

| Steps 2 and 3: RP-UPLC-PDA Analysis
Phenylpropanoids were analyzed by RP-UPLC-PDA (Reversed Phase Ultra Performance LC separation coupled to PhotoDiode Array detection) using an Acquity UPLC system (Waters), equipped with an Acquity UPLC PDA eλ detector (Waters). Before analysis, 80-μL aliquots of the methanolic extracts were mixed with 20 μL of 0.5% (v/v) formic acid, incubated overnight (−20 °C), and centrifuged (22,500 × g, 4 °C, 10 min) to remove precipitates. Samples were randomized before analysis. Sample injection was performed using the PLNO (Partial-Loop with Needle-Overfill) injection mode. The compound separation was carried out with different RP-LC methods depending on the sample's chemistry, as described in Table S4. PDA detection was performed in a range between 210 and 800 nm, at a resolution of 1.2 nm and a sampling rate of 20 points s⁻¹.

| Step 4: MS Acquisition
Phenylpropanoids were further analyzed via ESI-UHR-QTOF […]

Figure 1 (caption): […] (1) methanolic extraction of phenolic compounds, (2) RP-UPLC separation, (3) PDA-based detection, (4) untargeted MS1 acquisition (MS/MS of representative pool samples can be done for annotation purposes), (5) data processing, and (6) data analysis. The latter two steps are carried out using the software MetaboScape. The method is not restricted to barley, being successfully applied to sunflower and Arabidopsis.

[…] software applies a time-aligned region complete extraction algorithm (T-Rex 3D) for mass calibration, non-linear retention time alignment, feature extraction, de-isotoping, de-adducting, and ion combination (i.e., isotopes, charge states, adducts, or fragments) into single features.
The processing of raw data signals results in a single aligned matrix (bucket table) comprising the identified "buckets" (retention time-m/z pairs) and their abundance, in terms of either signal intensity or area, across the analyzed samples. Data processing was carried out with the following parameters: intensity threshold: 10³ to 10⁴ counts; minimum peak length: 8 spectra; minimum peak length ( […]
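To make the notion of a bucket table concrete, the following minimal sketch groups per-sample features into shared (Rt; m/z) buckets after applying an intensity threshold. It is a simplified illustration only and does not reproduce the T-Rex 3D algorithm; the function name, tolerance values, and example features are assumptions chosen for the example.

```python
def build_bucket_table(features, rt_tol=0.05, mz_tol=0.005, min_intensity=1e3):
    """Group (sample, rt_min, mz, intensity) tuples into shared buckets.

    A feature joins the first existing bucket whose Rt and m/z lie within the
    given tolerances; otherwise it opens a new bucket. Features below the
    intensity threshold are discarded. Tolerances are illustrative assumptions.
    """
    buckets = []  # each bucket: {"rt": float, "mz": float, "values": {sample: intensity}}
    for sample, rt, mz, intensity in sorted(features, key=lambda f: (f[1], f[2])):
        if intensity < min_intensity:
            continue
        for bucket in buckets:
            if abs(bucket["rt"] - rt) <= rt_tol and abs(bucket["mz"] - mz) <= mz_tol:
                previous = bucket["values"].get(sample, 0.0)
                bucket["values"][sample] = max(previous, intensity)
                break
        else:
            buckets.append({"rt": rt, "mz": mz, "values": {sample: intensity}})
    return buckets


# Hypothetical features from two samples (S1, S2); numbers are made up.
features = [
    ("S1", 3.10, 595.1658, 5.2e4),
    ("S2", 3.11, 595.1660, 4.8e4),
    ("S2", 3.60, 433.1129, 2.0e4),
    ("S1", 3.61, 433.1131, 500.0),  # below threshold, dropped
]
table = build_bucket_table(features)
for bucket in table:
    print(bucket["rt"], bucket["mz"], bucket["values"])
```

A sample missing from a bucket's `values` corresponds to an empty cell of the matrix, which is how presence or absence of a feature across genotypes would appear in this simplified representation.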

| RESULTS
3.1 | Workflow performance across multiple runs: Using a phenolics standard mix
As an initial verification of our approach, a set of eight metabolites comprising soluble single and polyphenolic compounds of different polarities and masses was defined to assess the workflow performance (Table 1).
For this purpose, we evaluated a dataset of 100 injections of the standard mix distributed along more than 2500 LC-MS runs that were per[…] to batch variability (approx. 15% CV). These results underscore the relevance of using internal and external quality controls to monitor systematic variations in the workflow by using not a single metabolite but multiple metabolites that mimic the composition of the sample to be analyzed. Using additional standards for positional C-glucoflavone isomers frequently found in cereals, the workflow proved able to discriminate them based on their chromatographic separation (Figure S2).
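The coefficient of variation (CV) used here to summarize the stability of retention time and MS intensity is the standard deviation divided by the mean, expressed in percent. The sketch below assumes the sample standard deviation (n − 1 denominator); the source does not state which estimator was used, and the intensity values are made up.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100.0 * stdev(values) / m


# Hypothetical MS intensities of one standard compound across three injections.
intensities = [100.0, 110.0, 90.0]
print(round(cv_percent(intensities), 2))
```

Applied per compound across all quality-control injections, the same calculation gives a per-metabolite view of batch variability.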
To evaluate the feasibility of using these compounds as an internal standard, barley leaf extracts were spiked with the standard mix (Figure 2C). As we have not yet detected these compounds in our barley extracts, the standard mix proved useful as an internal and external standard for the high-throughput LC-MS analysis of barley phenylpropanoids. […] Table S5). The phenylpropanoid profiles were dominated by glycosylation and acylation derivatives of three flavone 6-C-glucosides: isovitexin, isoorientin, and isoscoparin, where the latter two carry an additional 3′-hydroxyl and a 3′-methoxy group in their aglycone moieties, respectively, when compared to isovitexin (Figure 3). The workflow discriminated different positional O-glycosylation isomers of these C-glucoflavones (e.g., peaks 6, 9, 11, 15, 16, 17). Hydroxycinnamic acid derivatives (peaks 1, 4, 5) and a putative flavonol C-hexoside (peak 2) were also detected in lower proportions. The differences in O-glycosylation patterns of the 6-C-glucoflavones were responsible for the contrasting phenylpropanoid profiles observed between both genotypes: 7-O-glycosyl-6-C-glucosylflavones were predominant in cv. Barke, whereas cv. Scarlett was dominated by 2″-O-glycosyl-6-C-glucosylflavones (Table S5). The C-pentosides of these 6-C-glucoflavones (peaks 8, 12, 14) were common to both cultivars. Making use of their contrasting phenylpropanoid profiles, these genotypes were employed to evaluate the high-throughput performance of the metabotyping workflow with plant extracts.

| High-throughput phenylpropanoid metabotyping of contrasting barley genotypes
Next to data acquisition, a major bottleneck for high-throughput […] Figure 4A). When considering the annotated compounds (Table S5) […]

| Phenylpropanoid metabotyping of complex datasets: Barley mapping populations
The performance of the workflow was evaluated with a complex dataset comprising more than two barley genotypes at different developmental stages. We analyzed the flag leaves of field-grown IL from the barley S42IL mapping population (Honsdorf et al., 2017; Zahn et al., 2020), together with seven additional elite cultivars. The processing of the LC-MS1 data using a high-intensity threshold resulted in a matrix of 1713 buckets across 241 samples.
Two major factors challenged the workflow performance. In a previous study (Brauch et al., 2018), the phenylpropanoid profiling of young greenhouse-grown seedlings from the S42IL population using an LC-PDA approach led to the identification of IL (S42IL-101, 177, 178) devoid of the isovitexin 2″-O-glucoside.
This enabled the identification of a putative glycosyltransferase involved in flavonoid metabolism (Brauch et al., 2018). The identification of the same lines using the LC-MS workflow described here would validate the feasibility of using the buckets' MS intensities as metabolic quantitative traits. For this purpose, we analyzed the distribution of isovitexin 2″-O-glucoside and additional O-glycosylated 6-C-glucoflavones across field-grown lines of the S42IL population in stage A (shooting), at which these flavones were detected at high abundance (Figure 5C, Table S3). The flag leaves of most lines displayed an abundance of isovitexin and isoorientin 2″-O-hexosides similar to that shown by the parental cv. Scarlett. Only a few lines showed decreased average amounts, the lowest being displayed by the S42IL-101 genotype (Figure 5D, Table S3). This is in line with our previous findings (Brauch et al., 2018); the two lines S42IL-177 and 178 were not identified, as these were absent from the panel analyzed in this work.
When compared to cv. Barke as a negative control for the presence of the isovitexin 2″-O-glucoside (Figure 5D), a higher average amount was still detected in the S42IL-101 flag leaves. This was due to the clear presence of this compound in some biological replicates (four out of 10) rather than to an overall higher abundance, as confirmed by manual inspection of the EIC from individual injections. Similar to cv. Barke, the five additional elite spring cultivars analyzed in this work (cv. Grace, Propino, Quench, RGT Planet, Salome) did not show 2″-O-glycosylated 6-C-glucoflavones in their profiles (Figure 5D, Table S3).
These results demonstrate the applicability of the workflow for the high-throughput phenylpropanoid metabotyping of complex barley populations and the feasibility of using the detected buckets for further mQTL analyses.
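As an illustration of how bucket intensities could serve as quantitative traits, the sketch below flags lines whose mean intensity for one bucket falls below a chosen fraction of the reference parent. It is a simplified screen, not the statistical procedure of an mQTL analysis; the line names, intensity values, and the 10% cutoff are hypothetical.

```python
from statistics import mean


def flag_low_lines(intensities, reference, fraction=0.1):
    """Return genotypes whose mean bucket intensity is below fraction * reference mean.

    intensities maps genotype -> list of replicate intensities for one bucket.
    """
    cutoff = fraction * mean(intensities[reference])
    return sorted(
        genotype
        for genotype, values in intensities.items()
        if genotype != reference and mean(values) < cutoff
    )


# Hypothetical replicate intensities for a single bucket.
bucket_intensities = {
    "Parent": [1000.0, 1200.0, 1100.0],
    "Line-A": [1050.0, 900.0],
    "Line-B": [50.0, 0.0, 120.0],  # detected in only some replicates
}
print(flag_low_lines(bucket_intensities, reference="Parent"))
```

Because a mean can hide replicate-level presence or absence, as observed for S42IL-101, inspecting the individual replicate values alongside the flag remains advisable.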

| Workflow application to other plant species
To show the versatility of this workflow, examples are given for its application in dicotyledonous species known to display phenylpropanoid patterns that drastically differ from those of monocots such as barley: the model plant Arabidopsis thaliana, whose leaf phenylpropanoids mostly comprise flavonol derivatives (Tohge et al., 2005); and the crop plant Helianthus annuus L. (sunflower) that displays chlorogenic acid derivatives as its major soluble phenolics while flavonoids, mainly methoxylated flavones, are present in trace amounts (Stelzner et al., 2019). To test the performance of the MS workflow, it was necessary to adjust the chromatographic conditions to the contrasting phenylpropanoid chemistries of these species (Table S4).
The leaves of five sunflower accessions from the IPK genebank grown in field conditions were analyzed (A1 to A5, described in […] Figure 6B). Further analysis showed that these differences did not rely on the most abundant phenolics (e.g., CGA) but on novel features. This is illustrated by an unknown UV-absorbing compound (4.16 min; m/z 251.16), which was completely absent in accession A1 while displaying differential amounts in the remaining accessions (Figure 6B).
The application of this workflow to the model Arabidopsis was […] Figure 6C). This was supported by the differential sample clustering in a treatment-dependent manner (Figure 6D) that was highly influenced by the levels of kaempferol and cyanidin derivatives. The metabotyping profiles revealed that, contrary to the major UV-absorbing compounds, other semi-polar metabolites might decrease upon N deficiency. This is exemplified by the quantitation of an unknown feature (0.47 min; m/z 476.86).

[…] (Fröst et al., 1975). A total of 27 flavonoids enabled their classification into five chemical races, each of them corresponding to a different geographical distribution. It was proposed that race-specific chemical differences relied on biosynthetic flavonoid decorations commonly affecting more than one compound (e.g., 3′-O-methylation, glycosylation) (Fröst et al., 1977). In recent years, 152 phenylpropanoids were identified in nine barley varieties by in-depth analysis using MS and NMR (Nuclear Magnetic Resonance). The study confirmed phenylpropanoid glycosylation diversity in barley, but a UV profile-based chemotaxonomic analysis did not discriminate cultivars according to geographic origin (Piasecka et al., 2015). The barley metabotypes generated in this work support indications from prior studies. First, our results confirm that the phenylpropanoid profiles of barley leaves are dominated by glycosylation and acylation derivatives of flavone glycosides, mainly from the 6-C-glucoflavones isovitexin and isoorientin, and to a lower extent isoscoparin. Second, the major glycoflavone in barley leaves is tissue-dependent: as shown previously (Fröst et al., 1977) as well as in this work, isoorientin derivatives lead the profiles in flag leaves, whereas isovitexin derivatives are dominant in the leaves of young seedlings (Brauch et al., 2018; Kaspar et al., 2010; Reuber et al., 1996). Third, the contrasting metabotypes from cv. Barke and Scarlett support the O-glycosylation of common C-glycoflavones as a major source of diversity in barley phenylpropanoids. This points toward specific glycosyltransferases as major determinants of barley flavonoid diversity. This is in line with the identification of a candidate glycosyltransferase involved in the synthesis of the isovitexin 2″-O-glucoside (Brauch et al., 2018), and the proven role of glucosyltransferases in the natural variation of rice flavone accumulation (Peng et al., 2017).
Aiming to study the genetic basis underlying barley phenylpropanoid diversity, we extended our prior LC-PDA-based approach (Brauch et al., 2018) to the MS approach described here. The multidimensional data from LC-MS workflows not only provide great advantages owing to the high detection sensitivity and coverage but also pose a challenge for complex data processing and analysis. We challenged the MetaboScape software, which has been successfully employed for studies on specialized plant metabolites (Olmo-García et al., 2018; Villette et al., 2018). With the workflow described here, 200 LC-MS runs were processed in one to a maximum of two hours. Initial work using this workflow with mapping populations has enabled the processing of more than 1600 LC-MS runs in a single dataset, thus overcoming limitations of large LC-MS data processing. […] (Sawada et al., 2012; Shahaf et al., 2016) and/or tailored computational tools such as FlavonoidSearch (Akimoto et al., 2017) and FlavonQ (Zhang et al., 2017) can support phenylpropanoid annotation. Using exclusively a positive ionization mode, the workflow from this study already generated biologically relevant phenylpropanoid metabotypes. Merging data of different polarities did not significantly improve the outcomes from the workflow in positive mode. However, assessing the effect of different ionization modes is always recommended, since the outcomes strongly rely on the metabolites' chemistry.
The metabotyping workflow still faces limitations for phenylpropanoid analysis, which are related to common challenges for metabolomics feature processing: (1) metabolite overrepresentation in different buckets and (2) combination of different metabolites into single buckets. The first issue is often a consequence of the in-source fragmentation to which O-glycosylation moieties are particularly prone (Akimoto et al., 2017; Kachlicki et al., 2016). To cope with this, we include the early neutral losses […]

Elucidating the genetics underlying phenylpropanoid metabolism demands the integration of genetic association studies with in-depth metabolomic profiling. We tested and demonstrated the feasibility of the metabotyping workflow for mQTL purposes by profiling a subset of the barley S42IL mapping population, which was previously analyzed using an LC-PDA approach (Brauch et al., 2018). In both strategies, the same line (S42IL-101) was identified as displaying low or negligible levels of isovitexin 2″-O-glucoside. The presence of low levels of this compound in the flag leaves of some S42IL-101 plants (four out of 10), compared to its absence in the second leaves of young S42IL-101 seedlings (Brauch et al., 2018), could be attributed to different factors: the residual heterozygosity of this line (2.5%) (Honsdorf et al., 2017) and/or the metabotype susceptibility to the variations inherent to field conditions (e.g., differences in nutrient availability). The generated metabotypes reflected not only the genotypic complexity of mapping populations but also their development-dependent changes. MS-based metabolite mapping is a strategy that has already provided novel insights into the specialized metabolism of cereals (rice, maize, and qingke), mainly through wide-targeted metabolomics based on multiple reaction monitoring (Chen et al., 2013; Peng et al., 2017; Wen et al., 2014; Zeng et al., 2020).
This is a highly sensitive and accurate MS method that is nevertheless restricted to a predefined set of metabolites based on the prior construction of an MS/MS spectral tag library (Chen et al., 2013). In the untargeted phenylpropanoid metabotyping workflow described here, all the detectable features are measured. This readily enables the assessment of their presence and their influence on dataset diversity, without prior construction of a library or compound annotation. Hence, unknown features might be used as quantitative traits for their association with specific genetic loci (Wu et al., 2018). As already shown for human serum samples (Krumsiek et al., 2012), this approach could support predicting the identity of unknown metabolites, which is a common bottleneck in phytochemistry analyses.
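The overrepresentation of one metabolite in several buckets through in-source loss of O-glycosyl moieties, discussed above, can be screened for by looking for co-eluting buckets whose m/z values differ by a sugar neutral loss. The sketch below is a minimal illustration under stated assumptions: the tolerances and function name are hypothetical, the neutral-loss masses are the standard monoisotopic values for anhydro-sugars, and the example m/z values approximate the [M+H]+ ions of isovitexin 2″-O-glucoside and isovitexin. It is not the procedure implemented in the workflow software.

```python
# Monoisotopic masses (Da) of common glycosyl neutral losses (anhydro-sugar units).
NEUTRAL_LOSSES = {
    "hexose": 162.0528,
    "deoxyhexose": 146.0579,
    "pentose": 132.0423,
}


def find_neutral_loss_pairs(buckets, rt_tol=0.02, mz_tol=0.005):
    """Return (precursor_index, fragment_index, loss_name) for co-eluting buckets.

    buckets is a list of (rt_min, mz). Two buckets are paired when their
    retention times agree within rt_tol and the m/z difference matches a
    neutral loss within mz_tol. Tolerances are illustrative assumptions.
    """
    pairs = []
    for i, (rt_a, mz_a) in enumerate(buckets):
        for j, (rt_b, mz_b) in enumerate(buckets):
            if i == j or abs(rt_a - rt_b) > rt_tol:
                continue
            for name, delta in NEUTRAL_LOSSES.items():
                if abs((mz_a - mz_b) - delta) <= mz_tol:
                    pairs.append((i, j, name))
    return pairs


# Hypothetical buckets: a glycoside, its co-eluting in-source fragment,
# and a genuine aglycone-type ion eluting later.
example_buckets = [(3.10, 595.1658), (3.10, 433.1129), (4.20, 433.1129)]
print(find_neutral_loss_pairs(example_buckets))
```

Only the co-eluting pair is reported, so the later-eluting bucket at the same m/z is left as an independent feature rather than being merged with the glycoside.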
The genetic diversity represented by the wild and domesticated barley accessions maintained at the German national Genebank hosted by the IPK and the continuous development of genome-sequencing resources are enabling the accurate assessment of genetic variation in these highly diverse germplasm collections (Jayakodi et al., 2020; Milner et al., 2019). This is complemented by the availability of mapping populations with a high genetic resolution, such as the Morex × Barke Recombinant Inbred Lines (Mascher et al., 2013) and the 'Halle Exotic Barley 25' Nested Association Mapping (HEB-25 NAM) population (Maurer et al., 2015); these enable investigating specific genomic regions and their allelic variation. Using these resources, known and novel loci underlying phenotypic traits have been identified and allocated to different barley gene pools. The application of the metabotyping workflow to these resources provides opportunities to investigate the diversity and specific roles played by specialized metabolites, as well as to identify the underlying genes. Moreover, multiple questions can be addressed: Do different genetic pools display contrasting metabotypes? Are the metabotypes responsible for specific phenotypic traits? Can novel metabolites and their underlying genes be retrieved from exotic genotypes? The workflow described here allows exploring additional resources available at the IPK Genebank, as exemplified here with sunflower. The method is also applicable to the model species Arabidopsis. Using metabotyping approaches in natural genetic variation-based studies, and integrating them further with other "omics"-based studies (e.g., transcriptomics, proteomics), will provide exciting insights into the phenylpropanoid metabolism of crop plants.
By addressing current challenges in the field of specialized metabolites, the workflow described here can be readily applied to study, in a high-throughput fashion, the relevance of the natural diversity of phenylpropanoids in plants, particularly barley.

[…] Sebastian Fricke. We also gratefully acknowledge Dr. Stefanie Döll for her support during the implementation of this workflow.

CONFLICT OF INTEREST
Nikolas Kessler is an employee of Bruker Daltonik GmbH which manufactures and sells the mass spectrometer and software used in this study.

DATA AVAILABILITY STATEMENT
The data supporting this study are openly available through the e!DAL electronic data archive library (Arend et al., 2014)