Variation of Secondary Metabolite Profile of Zataria multiflora Boiss. Populations Linked to Geographic, Climatic, and Edaphic Factors

Geographic location and associated environmental and edaphic factors such as temperature, rainfall, soil type, and soil composition influence the presence and total content of specific plant compounds as well as the occurrence of particular chemotypes. This study evaluated whether geographic, edaphic, and climatic information can be used to predict the presence of specific compounds in medicinal or aromatic plants. Furthermore, we tested rapid analytical methods based on near infrared spectroscopy (NIR) coupled with gas chromatography/flame ionization detection (GC/FID) and gas chromatography/mass spectrometry (GC/MS) for the characterization, classification, and metabolite profiling of Zataria multiflora Boiss. populations. Z. multiflora is an aromatic, perennial plant with interesting pharmacological and biological properties. It is widely distributed in Iran as well as in Pakistan and Afghanistan. Here, we studied the effect of environmental factors on essential oil (EO) content and on the composition and distribution of chemotypes. Our results indicate that this species grows predominantly in areas rich in calcium, iron, potassium, and aluminum, with mean rainfall of 40.46 to 302.72 mm·year−1 and mean annual temperature of 14.90°C to 28.80°C. EO content ranged from 2.75% to 5.89%. Carvacrol (10.56–73.31%), thymol (3.51–48.12%), linalool (0.90–55.38%), and p-cymene (1.66–13.96%) were the major constituents, which classified 14 populations into three chemotypes. In line with the phytochemical cluster analysis, the hierarchical cluster analysis (HCA) based on NIR data also recognized the carvacrol, thymol, and linalool chemotypes. Hence, NIR has the potential to serve as a useful tool for rapidly determining the chemotypes of Z. multiflora and similar herbs. EO content and EO constituent content correlated with geographic location, climate, and edaphic factors. The structural equation model (SEM) approach revealed direct effects of soil factors (texture, phosphorus, pH) and mostly indirect effects of latitude and altitude, which act through, e.g., soil factors. Our approach of identifying environmental predictors for EO content, chemotype, or the presence of high amounts of specific compounds can help to select regions for sampling plant material with the desired chemical profile for direct use or for breeding.


INTRODUCTION
All over the world, plants face different local climatic regimes as well as different edaphic factors. Predicting how different environmental factors affect species dispersal, the abundance of populations and chemotypes, and the content of specific compounds can be a valuable tool for understanding plant variation in chemical features. It can also facilitate prospecting for plants with high amounts of specific compounds for nutritional, pharmaceutical, or agricultural use. In most cases, plant essential oils (EOs) are characterized by a strong aroma, which is mainly produced by secondary metabolites. EO compounds are linked to environmental acclimatization and play vital biological roles. Several factors, such as environmental and edaphic conditions, geographical region, season of collection, harvesting time, genotype, and ecotype influence the quantitative and qualitative composition of EO (Milos et al., 2001; Zgheib et al., 2016; Morshedloo et al., 2018). For example, in Matricaria chamomilla L., climatic conditions, altitude, soil properties, and irrigation influence the phytochemical composition and antioxidant activity of the EO (Formisano et al., 2015).
Zataria multiflora Boiss. (Lamiaceae) is an aromatic and perennial shrub growing wild in Iran (Figure 1A), Pakistan, and Afghanistan. This aromatic plant is known by the Persian name Avishan Shirazi and is also called Sattar or Zattar, meaning thyme. Z. multiflora can be identified by its orbicular to ovate, densely gland-dotted, grey-green leaves and the thickly white-haired round buds in the leaf axils. Its inflorescence is verticillate, and the flowers are very small and white (Simbar et al., 2008). Z. multiflora has shown pharmacological (antimicrobial, antinociceptive, spasmolytic, and anti-inflammatory) properties, is used in traditional folk remedies for its antiseptic, analgesic, carminative, anthelmintic, and antidiarrheal properties, and also serves as a condiment (Iranian Herbal Pharmacopoeia Committee, 2002; Moazeni et al., 2014; Khazdair et al., 2018; Mohajeri et al., 2018). Currently, some pharmaceutical forms of this plant, such as syrups, oral drops, soft capsules, and vaginal creams, are produced (Sajed et al., 2013; Mahboubi, 2019).
The EO of Z. multiflora is rich in phenolic oxygenated monoterpenes. The main chemical constituents are carvacrol, thymol, linalool, and p-cymene (Hadian et al., 2011a; Saedi Dezaki et al., 2016; Mahmoudvand et al., 2017). Although there are some studies on Z. multiflora EO constituents (Saleem et al., 2004; Niczad et al., 2019), there is hardly any information on the environmental factors affecting EO content and composition. Z. multiflora is not only harvested for local markets but is also a valuable species for industry, so this plant is under severe threat from overharvesting. Thus, a thorough understanding of its phytochemical and environmental characteristics in its natural habitats is crucial for predicting its behavior under cultivation.
Today, the standard method for EO analysis is gas chromatography coupled with different detection techniques such as mass spectrometry. In the last two decades, numerous vibrational spectroscopy methods including mid-infrared (IR), near-infrared (NIR), and Raman spectroscopy have been described as useful tools for examining plant secondary metabolites and are commonly applied in the chemical fingerprinting of plants (Schulz et al., 2004; Schulz et al., 2005; Gudi et al., 2014). However, up to now, no studies have used this promising approach to differentiate and characterize the various Z. multiflora chemotypes.
The aim of this study was to evaluate how different environmental factors affect species dispersal with respect to EO production, chemotype, and the content of specific compounds of Z. multiflora populations (Figures 1A, B). In addition, we aimed to evaluate whether geographic, edaphic, and climatic information can predict the presence of specific compounds. Furthermore, we tested rapid analytical methods based on NIRS coupled with GC and GC/MS methods for the characterization, classification, and metabolite profiling of Z. multiflora populations.

Study Area
To determine the effects of geography, climate, and edaphic conditions on EO yield and composition of Z. multiflora, plant materials were collected in 2018 in 14 natural habitats across five provinces from the center to the south of Iran, covering the major growing areas of the species: Isfahan, Kerman, Yazd, Fars, and Hormozgan provinces (Figure 1A).

Soil Analysis
Soil samples from the surface layer (0 to 30 cm depth) were taken from five randomly selected plots at each sampling site. The five soil samples were combined into a single 500 g sample that was dried at room temperature (20–25°C) and sieved to 2 mm. A duplicate soil sample was passed through a 2 mm sieve once more for determination of soil characteristics including the soil texture (percentage content of sand, silt, and clay), the amount of abundant nutrients (N, P, K, Ca, Al, and Fe), pH value, and organic matter. The total heavy metal and nutrient contents of the soil samples were determined after pressure dissolution with 69% suprapure nitric acid (according to A2.4.3.1, VDLUFA, 1991) by ICP-AES (iCAP™ 7600 Duo, Thermo Fisher Scientific). Contents of total carbon and total nitrogen were determined with a CNS elemental analyzer (Vario EL Cube, Elementar Analysesysteme GmbH). Pedological base parameters (soil particle size, pH value, C/N) were collected for characterization. The particle size determination of soil texture was performed according to DIN 19683-2 (1997).

Isolation of the Essential Oils
The aerial plant parts were dried at room temperature (20–25°C) in the shade; then the leaves of each plant were separated and 10 g of each plant sample was ground manually. The EO of each sampled plant (10 g of leaves) was isolated by hydro-distillation for 2 h using a Clevenger-type apparatus (Pavela et al., 2018). The distilled oils were dried over anhydrous sodium sulfate and stored at 4°C in sealed glass vials until analysis. The yield of the essential oil was calculated based on the dry weight of the plant material.
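The yield calculation can be illustrated with a minimal sketch. It assumes that yield is expressed as volume of oil per dry mass (% v/w), a unit the text uses elsewhere but does not state for this step; the function name and the example reading are hypothetical.

```python
def eo_yield_percent(oil_volume_ml: float, dry_mass_g: float) -> float:
    """Essential oil yield as % (v/w) relative to dry plant material.

    Assumption: the oil is quantified by volume; the source only says the
    yield is based on the dry weight of the plant material.
    """
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    return 100.0 * oil_volume_ml / dry_mass_g


# Hypothetical reading: 0.45 ml of oil distilled from a 10 g leaf sample.
example_yield = eo_yield_percent(0.45, 10.0)
print(f"EO yield: {example_yield:.2f}% (v/w)")
```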

GC-FID and GC/MS Analyses
EOs were analyzed by GC-FID using an Agilent 6890N gas chromatograph equipped with an HP-5 column (30 m × 0.25 mm i.d., film thickness 0.5 μm). The oven temperature was held at 50°C for 2 min, then raised from 50°C to 320°C at 5°C min−1, and held at 320°C for 6 min. Both injector and detector temperatures were 250°C. Hydrogen was used as carrier gas with a constant flow rate of 1 ml min−1, and 1 μl of the diluted EOs (1/500 v/v in isooctane) was injected automatically (Gerstel MPS) in splitless mode. Nitrogen was used as make-up gas at a flow of 45 ml min−1. Mass spectrometry of the EOs was performed using an Agilent MSD 5975B/GC 6890N equipped with an HP-5MS column (30 m × 0.25 mm i.d., film thickness 0.5 μm). The injector temperature was 250°C, and the initial GC oven temperature was 50°C, held for 2 min, then raised to 320°C at 5°C min−1 and held for 6 min. Helium was used as carrier gas with a flow rate of 1 ml min−1. One μl of the diluted EOs (1/500 v/v in isooctane) was injected automatically (Gerstel MPS) in splitless mode. Injector and detector temperatures were set at 250°C. The EI+-MS operating parameters were as follows: ionization energy, 70 eV; ion source temperature, 230°C. The quadrupole mass spectrometer was scanned over 35 to 350 m/z. The runtime and solvent delay were set at 60 and 5 min, respectively (4.45 scans/s). Carvacrol, thymol, linalool, p-cymene, γ-terpinene, and α-pinene were used as standards. 6-Methyl-5-hepten-2-one was used as internal standard and was added to the dilution before analysis. The oil components were identified by comparison of mass spectra and retention indices with those recorded in the Adams (Adams, 2014) and NIST (SRD 69; NIST Chemistry WebBook, 2002) mass spectral databases, with standard constituents, and with previously published data.
FIGURE 1 | Collection sites (A) and overview on geographic, climatic, and edaphic factors (B) affecting Zataria multiflora populations from Iran.
The retention indices of individual components were calculated using a series of n-alkanes (C8–C40) (Sigma-Aldrich-Fluka, Germany) (1/100 in n-pentane). The relative percentage composition of individual compounds was computed from the GC peak areas obtained without using correction factors.
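Both calculations in this paragraph can be sketched in a few lines. The text does not name the retention index formula; the sketch assumes the linear (van den Dool and Kratz) index that is customary for temperature-programmed runs, and all function names and numbers in the example are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(t_x, alkane_times):
    """Linear retention index of a peak eluting at t_x (minutes).

    alkane_times maps the carbon number of each n-alkane to its retention
    time. Assumption: temperature-programmed (van den Dool and Kratz) form.
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    i = bisect_right(times, t_x) - 1  # alkane eluting just before the peak
    if i < 0 or i >= len(carbons) - 1:
        raise ValueError("peak is not bracketed by two n-alkanes")
    n, n_next = carbons[i], carbons[i + 1]
    fraction = (t_x - times[i]) / (times[i + 1] - times[i])
    return 100.0 * (n + (n_next - n) * fraction)


def relative_percent(peak_areas):
    """Area normalization without correction factors: each area / total area."""
    total = sum(peak_areas.values())
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical alkane ladder and peak areas, for illustration only.
ladder = {12: 18.0, 13: 20.5, 14: 23.0}
print(linear_retention_index(19.0, ladder))
print(relative_percent({"carvacrol": 600.0, "thymol": 300.0, "p-cymene": 100.0}))
```

Area normalization treats all detector responses as equal, which is why the text notes explicitly that no correction factors were applied.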

NIR Spectroscopy and Chemometrics
Before isolation of the EO, vibrational spectroscopy was performed directly on the homogenized plant material. NIRS analyses were carried out on a Fourier-transform (FT)-NIR spectrometer (Multi-Purpose Analyser MPA, Bruker Optics GmbH, Germany). Spectra were recorded in the wavenumber range of 4,000 to 12,000 cm−1 with a spectral resolution of 8 cm−1. Approximately 7 g of dried leaves were placed in a glass Petri dish, and spectra were collected in diffuse reflection using the integrating sphere while the dish rotated. Spectra were acquired at 30 s. Each sample was measured in triplicate. The raw spectra were centered and corrected for scattering effects and baseline shifts using WMSC in the OPUS 6.5 software (Bruker Optics). Only the averaged spectra of the three replicates were used for the subsequent chemometric analysis.
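The idea behind scatter correction and replicate averaging can be sketched as follows. This is a plain multiplicative scatter correction (MSC) written for illustration; it is an assumption that it resembles the WMSC routine in OPUS, whose exact algorithm is not described in the text. Function names and the toy spectra are hypothetical.

```python
def mean_spectrum(spectra):
    """Point-wise mean of equally long spectra (e.g., three replicates)."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def msc(spectra, reference=None):
    """Plain multiplicative scatter correction.

    Each spectrum is regressed on a reference (default: the mean spectrum),
    s = intercept + slope * reference, and corrected as
    (s - intercept) / slope. Not necessarily identical to OPUS WMSC.
    """
    ref = reference if reference is not None else mean_spectrum(spectra)
    ref_mean = sum(ref) / len(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / len(s)
        sxy = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s))
        slope = sxy / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


# Toy replicates: the same underlying signal with different offset and gain.
base = [0.10, 0.30, 0.80, 0.50, 0.20]
replicates = [
    [2.0 * v + 0.05 for v in base],
    [0.5 * v - 0.02 for v in base],
    [1.2 * v + 0.00 for v in base],
]
averaged = mean_spectrum(msc(replicates))
print(averaged)
```

After correction, replicates that differ only by additive and multiplicative scatter collapse onto one curve, so averaging them no longer mixes scatter differences into the chemical signal.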

Statistical Analysis
Hierarchical cluster analysis (HCA) based on squared Euclidean distances was performed with SPSS version 16 (SPSS, Chicago, IL, USA) to classify and cluster the populations of Z. multiflora. Pearson's correlation coefficients among EO content, major components, and edaphic factors were estimated with the same software package. Means and standard deviations (SD) were calculated, and t-tests were used to assess the significance of differences (P < 0.05), using SAS 9.1 (SAS Inc., USA).
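The correlation step can be sketched in plain Python rather than SPSS. The sketch assumes that each coefficient is computed across the 14 population values, giving 12 degrees of freedom; the critical t values used in the comments (2.179 and 3.055) are the standard two-tailed table values for 12 degrees of freedom at P = 0.05 and P = 0.01. Function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# Assumption: n = 14 populations, so df = 12.
# Two-tailed critical values for df = 12: 2.179 (P = 0.05), 3.055 (P = 0.01).
for r in (0.69, 0.62, 0.54):
    t = t_statistic(r, 14)
    level = "P < 0.01" if abs(t) > 3.055 else "P < 0.05" if abs(t) > 2.179 else "n.s."
    print(f"r = {r:.2f}, t = {t:.2f}, {level}")
```

Under this assumption, a coefficient needs to exceed roughly 0.53 in absolute value to be significant at the 0.05 level and roughly 0.66 at the 0.01 level.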
For chemometrics (based on NIR), HCA was performed to evaluate the diversity of the samples. Characteristic spectral ranges were identified by comparison with spectra of appropriate reference standards and by HCA. Calibration models were built by 10-fold cross-validation using a partial least squares (PLS) algorithm. For this purpose, the GC data of each plant were correlated with the plant-wise averaged spectra of the population.
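The clustering step can be sketched with a small agglomerative routine. It uses Ward's criterion, which the Results name for the NIR-based HCA, on squared Euclidean distances; it is a naive illustration, not the SPSS or OPUS implementation, and the toy profiles are invented values standing in for percent carvacrol, thymol, and linalool.

```python
def ward_clusters(points, n_clusters):
    """Naive Ward agglomerative clustering.

    At every step the two clusters whose merge causes the smallest increase
    in within-cluster sum of squares are joined:
        cost = n_a * n_b / (n_a + n_b) * squared_distance(centroid_a, centroid_b)
    Returns sorted lists of point indices, one list per cluster.
    """
    clusters = [([i], list(p)) for i, p in enumerate(points)]  # (members, centroid)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        merged = ma + mb
        centroid = [(len(ma) * u + len(mb) * v) / len(merged) for u, v in zip(ca, cb)]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append((merged, centroid))
    return sorted(sorted(members) for members, _ in clusters)


# Invented profiles (percent carvacrol, thymol, linalool), for illustration only.
toy_profiles = [
    [70.0, 10.0, 2.0], [68.0, 12.0, 3.0],   # carvacrol-rich
    [15.0, 45.0, 5.0], [12.0, 48.0, 4.0],   # thymol-rich
    [20.0, 8.0, 55.0], [25.0, 6.0, 40.0],   # linalool-rich
]
print(ward_clusters(toy_profiles, 3))
```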
Furthermore, we set up SEMs for each region using partial least squares (PLS) regression in WarpPLS 6.0 (Kock and Lynn, 2012). PLS regression was chosen over covariance-based approaches because it suited our small sample size and, compared to covariance structure analysis, can accommodate both reflective and formative scales more easily. Moreover, PLS does not require any a priori distributional assumptions (Chin and Newsted, 1999). We present individual standardized path coefficients (β), partial model fit scores (R²), and overall model P values calculated by resampling estimations coupled with Bonferroni-like corrections (Kock, 2010). To validate the models, three model-fit indices [average path coefficient (APC), average R-squared (ARS), and average variance inflation factor (AVIF)] were calculated for each region. For model fit, it is recommended that the P values for APC and ARS are both lower than 0.05 (i.e., significance at the 0.05 level). The AVIF index controls for multicollinearity and should be below 5 (Kock, 2010). In the SEM analysis we set paths from geographic factors (latitude, longitude, altitude), climatic factors (rainfall, temperature), soil texture (relative proportion of clay, silt, and sand), soil constituents (N, P, K, Al, Ca, Fe), and pH value directly to EO content and compounds; furthermore, we included the possible effects of the geographic factors on climatic and soil factors.
To determine the degree of phytochemical variation, HCA based on the phytochemical profiles was performed (Figure 3). According to the major components, three chemotypes can be distinguished; thus, the populations of Z. multiflora were divided into three main clusters. Cluster I consists of two populations (Siriz and Haneshk) characterized by a higher content of linalool. Cluster II contains two populations (Fasa and Darab), which are characterized by higher amounts of thymol, carvacrol, p-cymene, and linalool. Cluster III contains ten populations (Jandaq, Ashkezar, Taft, Arsenjan, Gezeh, Hongooyeh, Daarbast, Gachooyeh, Konar Siah, and Kemeshk) characterized by lower quantities of α-pinene, myrcene, α-terpinene, linalool, and carvacrol methyl ether and higher amounts of carvacrol, thymol, p-cymene, and γ-terpinene.

Environmental Characteristics
Geographical, climatic, and edaphic characteristics of Z. multiflora natural habitats are presented in Tables 1 and 2.
Our results indicate that this species grows in areas characterized by a mean rainfall of 40.46 to 302.72 mm year−1 and a mean annual temperature of 14.90°C to 28.80°C. The altitude ranges from 731 to 1946 m. The percentage of organic matter (OM) ranged from 4% to 10% (Haneshk and Darab regions, respectively). The soils of the regions were rich in calcium (Ca), iron (Fe), potassium (K), and aluminum (Al), whereas nitrogen (N) and phosphorus (P) were present at lower levels. Furthermore, Z. multiflora grows on soils with alkaline pH (7.60 to 7.90).
The volatile constituents were influenced by edaphic factors (Table 4). Carvacrol was significantly positively correlated with pH, Ca, and temperature [0.69 (p < 0.01), 0.62, and 0.54 (p < 0.05), respectively], and there was a strong negative correlation between carvacrol and Al, Fe, and K. The correlation analysis indicated that linalool was significantly positively correlated with Al, Fe, and K (p < 0.01). No statistically significant correlations were detected between N and EO content or phytochemical constituents.
The SEM approach was used to dissect the contribution of environmental factors to EO content and EO constituent content.  Figure 4A). Thymol content was positively affected by the clay content of the soil and indirectly negatively affected by latitude via the negative effect of latitude on clay (Figure 4B). Carvacrol was directly positively influenced by silt content and by the pH value of the soil, which in turn depended positively on the amount of sand in the soil (Figure 4C). Latitude had a negative effect on the silt portion and a positive effect on the sand portion of the soil. The linalool content was affected directly by longitude (positively) and by silt (negatively), while silt content itself was negatively affected by latitude (Figure 4D).
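How a direct and an indirect effect combine in a path model can be shown with a short sketch. In standard path analysis, the indirect effect along a chain is the product of the standardized path coefficients on that chain, and the total effect is the direct effect plus all indirect effects. The coefficients below are hypothetical and only mirror the signs reported for latitude, clay, and thymol; they are not values from Figure 4.

```python
from math import prod


def indirect_effect(path_coefficients):
    """Indirect effect along one chain: product of its path coefficients."""
    return prod(path_coefficients)


def total_effect(direct, indirect_chains):
    """Total effect: direct effect plus the sum of all indirect effects."""
    return direct + sum(indirect_effect(chain) for chain in indirect_chains)


# Hypothetical standardized coefficients (signs follow the text only):
beta_latitude_to_clay = -0.6   # latitude lowers clay content
beta_clay_to_thymol = 0.5      # clay raises thymol content

effect = indirect_effect([beta_latitude_to_clay, beta_clay_to_thymol])
print(f"indirect effect of latitude on thymol via clay: {effect:.2f}")
print(f"total effect with no direct path: {total_effect(0.0, [[-0.6, 0.5]]):.2f}")
```

A negative coefficient followed by a positive one yields a negative product, which is the sense in which latitude affects thymol "indirectly negatively" through clay.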

Quantitative Analysis of EO Composition by NIRS
The dried leaves of Z. multiflora specimens from different regions were analyzed by near infrared spectroscopy and hierarchical cluster analysis (HCA, Ward's algorithm). The NIR spectra of Z. multiflora were characterized by combination vibrations and by first and second overtone vibrations in the range of 4,000 to 12,000 cm−1. HCA was used to group samples according to their spectral appearance, which is determined by their chemical profile. Figure 5 presents the corresponding HCA plot showing the separation of Z. multiflora populations into different clusters. In contrast to GC analysis, NIRS combines spectral features of chemically similar structures. Hence, carvacrol, thymol, and p-cymene, all characterized by an isopropyl- and methyl-substituted aromatic ring system, show nearly identical NIRS absorption patterns. Therefore, for NIRS not only the quantity of individual EO components is relevant but also the amount of structurally related substances. As shown in Figure 5, at the highest level of heterogeneity HCA clustered the samples according to the ratio of aromatic EO compounds (thymol + carvacrol + p-cymene) to aliphatic compounds with isolated C=C structures (linalool). On the next level, types with a high content of aromatic structures are divided into sub-clusters with high amounts of carvacrol (cluster IIIB), with high thymol and high linalool or high p-cymene (cluster IIIA), or with high carvacrol and high p-cymene (cluster II).
Supervised pattern recognition based on PLS-DA of GC data combined with NIR spectroscopy was used to classify the fourteen populations of Z. multiflora. Quantification models for the EO content and for the major compounds were developed by a 10-fold cross-validation procedure according to the literature (Krähmer et al., 2013). For this purpose, averaged spectra for each plant were correlated with GC reference data for carvacrol, thymol, and linalool as well as with EO content. For all constituents, appropriate prediction models were achieved. Figure 6 shows the results of cross-validation based on plant-wise averaged NIR spectra from all populations. Generally, coefficients of determination (R²) were 0.82 or higher for individual components and EO content. As shown in Figure 6A, NIRS offers a fast tool for estimation of EO content with a coefficient of determination of R² = 0.85 and a root mean square error of prediction (RMSEP = 0.431%) below 10% of the mean EO content (the mean EO content over all samples used in the model is approximately 4 to 5 ml/100 g according to Figure 6). Furthermore, for the major EO components, prediction quality was best for linalool (R² = 0.97), followed by carvacrol (R² = 0.87) and thymol (R² = 0.82) (Figures 6B-D).
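The validation figures reported here (R² and RMSEP from 10-fold cross-validation) can be illustrated with a minimal sketch. To stay self-contained it uses an ordinary least-squares line on a single predictor as a stand-in for the PLS model, which is not implemented here; the fold assignment, function names, and toy data are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares line (stand-in for PLS): returns slope, intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def cross_val_predict(x, y, k=10):
    """Predict every sample from a model fitted without that sample's fold."""
    n = len(x)
    folds = [list(range(i, n, k)) for i in range(k)]  # deterministic assignment
    predictions = [None] * n
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            predictions[i] = slope * x[i] + intercept
    return predictions


def r2_and_rmsep(y, predictions):
    """Coefficient of determination and root mean square error of prediction."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(y, predictions))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot, sqrt(ss_res / n)


# Toy data: a spectral score (x) against reference EO content in % (y).
x = [0.5 * i for i in range(20)]
y = [2.5 + 0.3 * xi + (0.1 if i % 2 else -0.1) for i, xi in enumerate(x)]
r2, rmsep = r2_and_rmsep(y, cross_val_predict(x, y, k=10))
print(f"R2 = {r2:.3f}, RMSEP = {rmsep:.3f}, relative RMSEP = {100 * rmsep / (sum(y) / len(y)):.1f}%")
```

Expressing RMSEP relative to the mean reference value is what allows the statement that the error for EO content stays below 10% of the mean.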

DISCUSSION
This study investigated the effect of different environmental factors on EO production, on the content of specific EO compounds, and on the chemotype of different Z. multiflora populations. The EO values (up to 5.89% dry weight) detected in 14 populations in Iran were higher than those reported previously in the literature, including 1.2% to 3.4% (Hadian et al., 2011a), 2.91% to 4% (Sadeghi et al., 2015), and 1.93% to 2.22% (Golkar et al., 2020). The EO content can be affected by geological, climatic, and edaphic characteristics as well as by harvesting time. Saei-Dehkordi et al. (2010) reported that the highest EO content of Z. multiflora, 1.57% (v/w), was obtained from material collected in mid-May. Thus, knowledge of the season, phenological stage, and harvesting time during the day is necessary to obtain high EO contents. Of the chemical constituents detected, carvacrol, thymol, linalool, p-cymene, γ-terpinene, and α-pinene were found to be the main compounds of Z. multiflora. In other studies, the highest diversity was shown for the monoterpenes, including carvacrol, thymol, linalool, and p-cymene (Shafiee and Javidnia, 1997; Abkenar et al., 2008; Mahboubi and Bidgoli, 2010; Ziaee et al., 2018). Carvacrol, the major compound of the Jandaq population, has previously been reported as one of the most important EO components among various members of the Lamiaceae family (Ebrahimi et al., 2008; Hadian et al., 2011b; Stefanaki et al., 2018; Santos et al., 2019). The main component of the Darab and Fasa populations was thymol (41.61% and 48.12%, respectively), which is an isomer of carvacrol. Saei-Dehkordi et al. (2010) and Sharififar et al. (2007) reported thymol as the most abundant component in the essential oil profile of Z. multiflora from different areas in Iran. In contrast, two other studies showed carvacrol as the main constituent of Z. multiflora (Basti et al., 2007; Khosravi et al., 2009). Moreover, the EO of Z. multiflora contains other important monoterpene constituents such as linalool, p-cymene, γ-terpinene, and α-pinene. The Siriz and Haneshk populations were rich in linalool (55.38% and 37.65%, respectively), and p-cymene was one of the main components of the Darab population (13.96%). The positive and negative correlations between EO components indicate the presence of three different chemotypes: thymol, carvacrol, and linalool. Furthermore, they indicate which compounds are interlinked in a chain of monoterpene synthesis with certain branches in the predicted enzymatic pathway: while geranyl diphosphate is the precursor of non-phenolic linalool and of phenolic thymol and carvacrol, the latter two are connected via p-cymene (Thompson, 2005). In agreement with our results, similar correlations between individual EO components were found in Artemisia dracunculus, where methyl chavicol, the main constituent of A. dracunculus, was positively correlated with terpinolene and methyl eugenol and negatively correlated with α-pinene, limonene, (Z)-β-ocimene, and (E)-β-ocimene (Karimi et al., 2015). Hierarchical cluster analysis based on phytochemical components has proven to be a helpful tool for classifying accessions of medicinal and aromatic plants. For instance, cluster analysis of Verbascum songaricum resulted in nine groups (Selseleh et al., 2019), and for lemon balm populations three different chemotypes could be identified (Pouyanfar et al., 2018). Grouping based on EO constituents of four Vitex specimens also revealed different clusters (de Sena Filho et al., 2017).
In the present study, the components of the EO measured at full flowering stage underpin the presence of the three chemotypes (carvacrol, thymol, linalool). Rapid and reliable identification of medicinal plant species and chemotypes with respect to authenticity and quality is crucial for pharmaceutical and food processing. Spectroscopic techniques, being fast and easy to handle, are nowadays widely applied directly on plant material for qualitative and semi-quantitative characterization. Different studies describe the application of NIRS, IRS, and Raman spectroscopy for the differentiation of chemotypes and the prediction of EO composition in various medicinal and aromatic plants (Seidler-Lozykowska et al., 2010; Gudi et al., 2014; Farag et al., 2018). For Z. multiflora, the presented quantification models are not yet accurate enough for exact determination because, e.g., for linalool, samples are very inhomogeneously distributed over the investigated concentration range. Nevertheless, in combination with HCA, near infrared spectroscopy offers a fast method for chemotyping and EO estimation directly on plant material. An improved prediction of EO content and main components with regard to cross-validation based on averaged ATR-FTIR spectra can also be achieved for constituents with lower concentrations (Gudi et al., 2015). The high correlation between NIRS and GC data allows the application of NIRS for authenticity and quality control directly on the plant material for the flavor and fragrance as well as pharmaceutical industries. NIR spectroscopy can be used to classify plants according to their chemotype as well as to predict the content of valuable components such as carvacrol, thymol, and linalool as well as other terpenes, rapidly and accurately. The effect of soil parameters and climatic conditions on plant performance and EO content has been shown for many plant species.
For example, Kelussia odoratissima Mozaff. grows in dark soil rich in minerals (Raiesi et al., 2013), and the habitats of Thymus pulegioides were characterized by high amounts of Al, Ca, Fe, K, and Si but low amounts of P and Mn (Vaičiulytė et al., 2017). Mexican oregano populations grown in soil with high nitrogen and iron content, lower soil water availability, and higher pH values showed a higher EO yield (Martínez-Natarén et al., 2012). It is widely accepted that environmental conditions affect plant EO content and its components (Ormeño et al., 2008; Mansour et al., 2010). Several studies have revealed that the predominance of carvacrol or thymol in different Lamiaceae species is related to environmental factors (Boira and Blanquer, 1998; Economou et al., 2011). In Thymus vulgaris such phenolic chemotypes cope better with summer drought, while non-phenolic (e.g., linalool) chemotypes cope better with early-winter freezing temperatures (Thompson et al., 2007). In our study the Pearson correlations revealed that altitude, K, Fe, and Al were significantly (p < 0.01) negatively correlated with EO content (Table 3). In agreement with our results, the lowest altitudes showed higher EO yields in Lavandula angustifolia (Demasi et al., 2018) and Satureja rechingeri (Hadian et al., 2014). Higher EO yields at lower altitudes were also found in Origanum vulgare (Giuliani et al., 2013). Apart from the effect of geographical conditions, EO content and EO constituents can be affected by edaphic factors and climatic conditions; for example, the soil type affects the Origanum syriacum chemotype (El-Alam et al., 2019). In our study EO yield showed a highly significant positive correlation with temperature, pH value, and Ca.
Earlier studies highlighted the same behavior in other aromatic plants and suggest that the wide variation in the chemical composition of the EO can be ascribed to habitat influences in Origanum compactum (Aboukhalid et al., 2017) and Origanum vulgare L. (De Falco et al., 2013). In Origanum vulgare ssp., the EO showed a negative correlation with altitude and a positive correlation with soil temperature and air temperature (Tuttolomondo et al., 2014). SEMs were applied to infer relationships between the different factors and revealed indirect geographic and direct edaphic effects on EO content and compounds, while climate factors did not have an influence. Chemotype and high amounts of specific compounds can thus be predicted when looking for populations with specific features. Biotic factors such as co-occurring vegetation (Wäschke et al., 2015) or herbivore activity (Dicke et al., 2009) can additionally influence the metabolome profile of plants and should be considered in future studies.

CONCLUSIONS
Medicinal and aromatic plants play important roles all over the world because of their wide application due to pharmacological, therapeutic, industrial, and agricultural properties. Varying climate and environmental growth conditions lead to a huge phytochemical diversity of these resources. Zataria multiflora is a valuable medicinal plant with various pharmaceutical properties and has potential as a source of compounds with agricultural relevance as plant protection agents. Ingredients such as carvacrol, thymol, and linalool are responsible for the respective effects and show high variability among the investigated populations. Environmental conditions affect the EO content and its components. Hence, the existing variability in the chemical profile of the studied populations allows the selection of populations with distinct scent or bioactive components for use in the pertinent industries and for breeding purposes. Our approach of identifying environmental predictors for EO content, chemotype, or the presence of high amounts of specific compounds can help to identify regions for sampling plant material with the desired chemical profile. Based on mobile NIRS devices, fast classification of as yet undescribed populations and individual plants together with EO profiling can be performed directly in the field.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding authors.

AUTHOR CONTRIBUTIONS
TM, HS, and JH conceived and designed the project; AlK performed all sampling, extraction, and chemical analyses, except soil analysis which was performed by NH. Statistical analyses were performed by AlK, TM, and AnK. AlK and TM wrote the article with contributions from all other authors.

FUNDING
The authors gratefully acknowledge the financial support obtained from the Federal Ministry of Food and Agriculture (BMEL) based on a decision of the Parliament of the Federal Republic of Germany via the Federal Office for Agriculture and Food (BLE) under the innovation support program (Project 2816DOKI06).