Terpene chemotypes in Gossypium hirsutum (wild cotton) from the Yucatan Peninsula, Mexico

Cultivated plants of Gossypium hirsutum Cav. (cotton) consistently emit low levels of volatile organic compounds, primarily mono- and sesquiterpenoids, which are produced and stored in pigment glands. In this study


Introduction
Many plants constitutively produce a wide range of volatile organic compounds (VOCs) that play an important role in mediating the relationships between plants and their surrounding environment, and can influence their interactions with many species (Rosenkranz et al., 2021). Terpenes are a large chemical class of VOCs that are involved in a variety of functions that contribute to overall plant fitness, including defence and inter- and intra-specific communication (Gershenzon and Dudareva, 2007). They are found across the plant kingdom and display high levels of variation in structure and abundance both across and within plant families and species (Pichersky and Raguso, 2018). A large source of the chemical variation found in plants is due to the vast range of terpenoid compounds, of which approximately 80,000 are known to date (Christianson, 2017). They are derived from both the mevalonic acid and methylerythritol phosphate pathways, and are produced by terpene synthases, which are capable of catalysing the formation of a wide assortment of structures from single substrates (Zhou and Pichersky, 2020). Terpenoid compounds can be stored constitutively in some species, often making up the majority of the volatile blend of a healthy unstressed plant; when plants are challenged, for example by herbivory, they will usually enhance the production/accumulation of the same compounds they store constitutively (Lange and Turner, 2013). The high levels of variation in terpenoid compounds across and within plant species have allowed for the practice of grouping plants by their chemical phenotype, an application known as chemotyping, which normally has a discrete and genetic basis (Despinasse et al., 2020).
High levels of intra-specific variation in constitutive volatile terpene compounds have been documented in several plant species. For instance, terpene chemotypes in the aromatic genus Thymus have been long recognised, and the genetic basis for them has been described (Trindade et al., 2018). In the conifer Pinus pinaster, where terpenes are stored in resin canals, two distinct chemotypes have been described (Arrabal et al., 2012). Additionally, some Nepeta species (including Nepeta cataria, commonly known as catnip) group into chemotypes, with three discrete groups distinguished by their volatile oil components (Gomes et al., 2021). In turn, chemotypic variation has been found to be correlated with herbivory levels. For instance, Melaleuca alternifolia plants exhibiting terpinolene chemotypes (i.e. high levels of terpinolene relative to other phenotypes) were found to suffer from less feeding damage by Paropsisterna tigrina, a specialist herbivore, than plants belonging to the eucalyptol chemotypes (Bustos-Segura et al., 2015). In addition, Tanacetum vulgare (tansy) plants in a field site grouped into four chemotypes, which were found to influence colonisation by aphids (Clancy et al., 2016). Furthermore, plant chemotype effects on herbivores can in turn affect higher trophic levels. Senft et al. (2019) compared the effects of these tansy chemotypes on aphid population dynamics and the abundance of patrolling ants through manipulation studies, and found that ants had a preference for aphids that were on specific chemotypes. How terpene profiles affect a plant's vulnerability to herbivores and interactions with other trophic levels can be of great importance to crop management.
Gossypium hirsutum Cav. (Malvaceae), commonly called 'upland cotton', is an important cash crop that is grown in arid and semi-arid regions worldwide. It is frequently attacked by multiple pest species (e.g. Aphis gossypii (cotton aphid), Helicoverpa zea (cotton bollworm), Spodoptera exigua (beet armyworm); Bottrell and Adkisson, 1977), and requires heavy use of pesticides, earning it the title of the world's dirtiest crop (Environmental Justice Foundation in collaboration with Pesticide Action Network UK, 2007). Many efforts have been made to develop alternative strategies for pest management of G. hirsutum. Transgenic Bt lines expressing Cry-endotoxin were developed; however, suppression of lepidopteran feeding made them more susceptible to other pest species. In particular, the reduction in damage by chewing herbivores on Bt lines led to a reduction of the induction of other defensive terpenoids, which favoured the cotton aphid (Hagenbucher et al., 2013). Interestingly, it has also been shown that resistance induced in G. hirsutum by phytophagous mites (Tetranychus urticae and T. turkestani) can lead to reduced mite population growth (Karban and Carey, 1984). In response to enemy attack, Gossypium spp. produce a wide range of protective/defensive compounds. Gossypol and associated terpenoid aldehydes are stored in specialised pigment glands in the leaves, stems, seeds, and flower buds. These glands are induced by damage and have been associated with resistance against noctuid moths (Agrawal and Karban, 2000). Other defences in cotton include the production of extrafloral nectaries (which recruit plant "bodyguards" such as ants (Reyes-Hernández et al., 2022)), and the induction of volatile emissions following herbivory (Loughrin et al., 1994). In damaged plants, the foliar volatile profile comprises compounds that are emitted both constitutively and in response to insect feeding, and can act as cues for natural enemies of herbivores, such as host-seeking parasitoids. It has been shown that specialist and generalist parasitoid species (Microplitis croceipes and Cotesia marginiventris respectively) are attracted to the volatiles released by damaged cotton plants (McCall et al., 1993). G. hirsutum plants constitutively produce a range of mono- and sesquiterpenoid compounds which are stored in the pigment glands. Low levels of these compounds are consistently emitted from undamaged plants (Paré and Tumlinson, 1997).
Though several Gossypium species have been independently domesticated in different regions of the world, wild populations persist to date. Gossypium hirsutum was domesticated in Mesoamerica at least 4000 years ago (Yuan et al., 2021). Wild populations of G. hirsutum grow naturally along the coastal shrubland of the Yucatan Peninsula (Abdala-Roberts et al., 2019), which is also likely the centre of origin and domestication of this species (d'Eeckenbrugge and Lacape, 2014). In this study, we aimed to assess qualitative and geographic variation in constitutive stored volatile terpenes among wild G. hirsutum populations in this region, and to assess whether these plants can be grouped according to their chemical phenotype. To this end, we analysed the chemical profiles of plants grown from seeds collected from sixteen populations along the Yucatan Peninsula in Mexico. We also analysed the chemical constituents of the maternal plants in order to estimate heritability of the chemical phenotype.

Results and discussion
Because seed dormancy and germination rates differed among populations, the number of replicates was not the same for all populations. The 16 populations were grouped into four geographic zones: Celestún (Far West), Sisal (West), Chicxulub (Centre), and Coloradas (East; see Fig. 1). In total, we analysed 349 individual plants and identified 44 compounds. Through comparison to database searches, authentic standards, and retention indices, 17 of the compounds were determined to be monoterpenes (MTs), and 19 sesquiterpenes (SQTs). Five green leaf volatiles (GLVs), two diterpenes, and one triterpene were also identified in the liquid extracts (Table 1, see Table S1 for absolute values; see Table S2 for full list of compounds).

Terpene chemistry and chemotyping of wild G. hirsutum populations
A correlation analysis (Fig. 2) performed on the relative abundance of mono- and sesquiterpenes in the leaf solvent extracts revealed two distinct highly supported (approximately unbiased (AU) p < 0.05) groups of monoterpenes. One group comprised the monoterpenes γ-terpinene, limonene, α-thujene, α-terpinene, terpinolene, and p-cymene (the "γ-terpinene group", designated chemotype class A), while the other consisted of the pinene-type compounds α- and β-pinene (the "α-pinene group", designated chemotype class B). The sesquiterpene compounds α-humulene and β-caryophyllene were also strongly intercorrelated. These strong correlations suggest a shared or highly linked biosynthesis. A theorised pathway of the synthesis of these compounds highlights their shared biochemical precursors (Degenhardt et al., 2009; Fig. 3). Taking the relative abundance of only monoterpenes, a negative relationship was observed between the proportions of γ-terpinene and α-pinene groups, with two clusters of plants identified (Fig. 4a). Representative chromatograms from plants belonging to chemotype classes A and B respectively can be seen in Fig. 4b. Chemotype class A was composed of plants where the γ-terpinene group compounds made up 4% or more of the total monoterpene concentration; chemotype class B comprised plants where the γ-terpinene group compounds made up less than 4% of total monoterpenes (Fig. 4c). The major difference is the strong reduction or absence of the γ-terpinene cassette of monoterpenes in chemotype class B. In a principal coordinate analysis (PCoA) on relative abundances of monoterpenes, with samples separating based on their compositional similarity (using a Euclidean dissimilarity matrix), the chemotype classes separated along the first axis (Fig. 5), which explained ca. 65% of the variation. Plants belonging to chemotype class B group together, while plants in class A are spread along the first axis. Around 42% of overall variance in the monoterpene profiles of the plants was explained by chemotype class (adonis: R² = 0.419, p = 0.001).
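The 4% threshold rule described above can be sketched in a few lines of Python. This is only an illustration of the classification rule as stated in the text, not the code used in the study; the function names and the dictionary layout (compound name mapped to abundance) are assumptions made for the example.

```python
# Compounds of the "γ-terpinene group" as listed in the text.
GAMMA_TERPINENE_GROUP = (
    "γ-terpinene", "limonene", "α-thujene",
    "α-terpinene", "terpinolene", "p-cymene",
)


def gamma_group_share(monoterpenes):
    """Return the γ-terpinene group as a percentage of total monoterpenes.

    `monoterpenes` is a hypothetical mapping of compound name to abundance
    (any consistent unit, e.g. peak area or ng per mg leaf).
    """
    total = sum(monoterpenes.values())
    if total <= 0:
        raise ValueError("no monoterpenes detected")
    group = sum(monoterpenes.get(name, 0.0) for name in GAMMA_TERPINENE_GROUP)
    return 100.0 * group / total


def chemotype_class(monoterpenes, threshold_pct=4.0):
    """Class A if the γ-terpinene group is >= 4% of monoterpenes, else class B."""
    return "A" if gamma_group_share(monoterpenes) >= threshold_pct else "B"
```

A plant whose monoterpene fraction is dominated by α- and β-pinene, with only traces of limonene, would fall into class B under this rule, whereas any plant at or above the threshold is assigned to class A.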
Most modern varieties of G. hirsutum produce and store a range of mono- and sesquiterpenes, including α- and β-pinene, limonene, camphene, β-myrcene, α-humulene, and β-caryophyllene (Elzen et al., 1985; Minyard et al., 1965; Yang et al., 2013), and plants producing significant amounts of γ-terpinene are rare. Only one naturalised variety has been reported to emit γ-terpinene in large quantities (Loughrin et al., 1995). Although modern cultivated varieties came from diverse landraces, this could indicate that the main sources for domestication events of G. hirsutum were obtained from B chemotype wild plants, or that perhaps the artificial selection throughout the domestication process selected against chemotype A plants due to pleiotropic interactions or linkage with traits of interest.
The large diversity of terpenoid compounds found in plants is due to terpene synthases (TPSs), which are a diverse family of enzymes that catalyse the formation of terpenoid compounds from single substrates (Karunanithi and Zerbe, 2019). The genes GhTPS1 and GhTPS2, identified and characterised in Gossypium hirsutum (cultivar CCRI12) (Huang et al., 2013), encode active TPSs that are expressed in young developing leaves, and may be involved in the production of constitutive terpenoids that are stored in glands. It has been found that GhTPS1 mainly produces the sesquiterpenes β-caryophyllene and humulene at a ratio of ~4:1. GhTPS2 was found to mainly produce the monoterpenes α-pinene and β-pinene, at a ratio of ~6:1. These sets of compounds were highly correlated in our dataset, with β-caryophyllene and humulene being present at a ratio of 3.8:1 (SE = 0.013), and α-pinene and β-pinene at a […] The highly similar product ratios between our dataset and those described in Huang et al. (2013) suggest that GhTPS1 and GhTPS2 contribute to the production of these compounds in wild G. hirsutum. Nonetheless, other TPSs may also be involved, as multiple G. hirsutum TPSs have been shown to produce several terpenes at varying levels (including the compounds that comprise the "γ-terpinene group" (Huang et al., 2018)). Eighty-five TPS genes have been putatively identified in G. hirsutum to date (Zhang et al., 2022). Our results suggest that one or more enzymes present or overexpressed in plants of chemotype A are multifunctional TPSs that synthesise the monoterpenes of the γ-terpinene group.

Association between chemotype and geographic location
The correlation between the geographic distance among sites and chemical distance was weak but significantly different from zero (Mantel test, r = 0.165, p = 0.001), indicating that as geographic distance increased, the chemical profiles of the plants became more dissimilar. An association between site location and γ-terpinene content was observed, with more plants in western sites containing low amounts of compounds belonging to the γ-terpinene group, and plants in eastern sites containing higher proportions of γ-terpinene group compounds (Fig. 6). Future studies could examine the genetic structuring of these populations to determine if these chemotypic differences are related to genetic differentiation (i.e., isolation among populations) and/or adaptation to their local environment.
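To make the logic of the Mantel test concrete, the following is a minimal pure-Python sketch of a permutation Mantel test between a Haversine distance matrix and a Bray-Curtis dissimilarity matrix. It is not the vegan implementation used in the study; the function names, the site coordinates, and the chemical profiles are hypothetical and serve only to illustrate the procedure.

```python
import math
import random


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    return sum(abs(x - y) for x, y in zip(u, v)) / sum(x + y for x, y in zip(u, v))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lower_triangle(matrix, order):
    """Flatten the lower triangle after reordering rows and columns by `order`."""
    return [matrix[order[i]][order[j]] for i in range(len(order)) for j in range(i)]


def mantel(geo, chem, permutations=999, seed=1):
    """Return (r, p): observed matrix correlation and one-sided permutation p-value."""
    identity = list(range(len(geo)))
    chem_vec = lower_triangle(chem, identity)
    r_obs = pearson(lower_triangle(geo, identity), chem_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of one matrix together
        if pearson(lower_triangle(geo, order), chem_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: four sites (lat, lon) and their mean relative abundances.
sites = [(20.86, -90.40), (21.16, -90.03), (21.29, -89.60), (21.60, -88.15)]
profiles = [[0.90, 0.08, 0.02], [0.85, 0.10, 0.05],
            [0.70, 0.10, 0.20], [0.55, 0.10, 0.35]]
geo = [[haversine_km(a, b) for b in sites] for a in sites]
chem = [[bray_curtis(u, v) for v in profiles] for u in profiles]
r, p = mantel(geo, chem)
```

Permuting the rows and columns of one matrix jointly preserves the dependence among distances that share a site, which is why the test is run on permuted matrices rather than on shuffled pairwise distances.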

Chemotypes of mature plants
In addition to the plants grown from seed, we also assessed the chemical profiles of 165 mature plants, of which a subset (34) were also maternal plants. Thirty-seven compounds were detected in the mature plants, with 20 compounds identified as MTs, 10 SQTs, 4 GLVs, 1 hydrocarbon, and 2 nitrogen-containing volatiles (see Table S3). Mature plants were classified using the same chemotyping process as described above. PCoA ordination of all mature plants harvested in situ shows grouping by chemotype class, separating along axis 1 (Fig. S1; variation explained by first two ordinations: 82.6% and 15%). More than two-thirds of the variation was explained by chemotype group (adonis: R² = 0.698, p = 0.001). Terpenoid production is known to be affected by a variety of environmental stresses, including exposure to extreme temperatures, water stress (flooding, drought), and salinity, as well as herbivory (Gouinguené and Turlings, 2002; Isah, 2019). A differential regulation in terpenoid production because of environmental pressures could explain why chemotype explained a higher percentage of total variation in the chemical profiles of the mature plants compared to plantlets; abiotic and biotic stress are known to affect the levels of VOCs in plants (Holopainen and Gershenzon, 2010). Plant ontogeny can also influence the terpenoid profile. Although the age of the mature plants was unknown, they had been surveyed for several years and were well-established flowering plants. Meanwhile, offspring plants were seedlings with only four developed leaves (around one month old), and it is possible that mature G. hirsutum plants exhibit a more defined chemotype. For example, in Melaleuca alternifolia, six terpene chemotypes can be clearly differentiated in mature trees; however, young plants only have four identifiable chemotypes, and two chemotypes with high terpinolene concentration are indistinguishable in young plants (Bustos-Segura et al., 2015).

Heritability of chemotypes
Most of the progeny had the same chemotype as their parent plant. […] (2018) reported an outcrossing rate of 0.72 for a metapopulation of wild G. hirsutum plants from the Yucatan Peninsula, indicating that although the level of self-pollination is important, there is also a large contribution from cross-pollination to the next generation. Thus, as we only knew the chemotype of the maternal plant and not the paternal plant, this could explain why there was some disparity between maternal and offspring chemotypes. Broad sense heritability estimated with the animal model was 0.85 (CI = 0.75, 0.95). In concordance with what we observed, chemotype is generally considered to be heritable in plants (Hare, 2011). In the tree species Eucalyptus globulus, O'Reilly-Wapstra et al. (2011) found moderate to high broad-sense heritability of foliar terpenes, and Karban et al. (2014) found that in sagebrush (Artemisia tridentata) chemotypes are highly heritable between parent and offspring.

Concentration of leaf terpenoids
To determine the degree to which biological and geographic factors can explain the variation in the concentration of compound classes, we used generalised linear mixed models. Approximately a quarter of the total variance (from 21 to 29%) in concentrations of total volatiles, total monoterpenes, total sesquiterpenes, and total green leaf volatiles was significantly explained by 'genotype' (Table 2), whereas 'population' explained only a small fraction of the variance and was not statistically significant. Concentrations of total volatiles and total monoterpenes were significantly higher in plants belonging to chemotype class A (VOCs: χ²(1) = 4.9101, p = 0.027, MTs: χ²(1) = 8.298, p = 0.004). No significant difference was observed in total concentrations of sesquiterpenes or green leaf volatiles between chemotypes (SQTs: χ²(1) = 1.5618, p = 0.2114, GLVs: χ²(1) = 1.646, p = 0.1995). At the time of harvesting, all leaves were at a similar developmental stage (4th leaf fully unfurled). However, leaf area showed considerable variation with a bimodal distribution (excess mass test, p = 0.004, Fig. S2); therefore, all analyses on concentration included leaf area as a […] (1) = 184.9882, p < 0.001). Total monoterpene and sesquiterpene concentrations were significantly higher in smaller leaves (MTs: χ²(1) = 585.716, p < 0.001, SQTs: χ²(1) = 339.2854, p < 0.001), whereas total concentrations of green leaf volatiles were higher in larger leaves (GLVs: χ²(1) = 51.733, p < 0.001). Opitz and colleagues found that accumulation of terpenoids in glands in cotton was strongly influenced by leaf area and developmental stage, with the two youngest/smallest leaves having the highest total terpenoid concentration (Opitz et al., 2008). They also showed that younger leaves have a much higher gland density than older leaves, but contain less total terpenoids per gland. Eisenring et al. (2017) made the observation that smaller leaves have a higher density of glands than larger leaves of the same age. We found that smaller leaves contained higher concentrations of MTs and SQTs and lower concentrations of GLVs in relation to larger leaves.

Ecological implications of chemical diversity in G. hirsutum
The chemotype of a plant plays an important role in the outcome of environmental adaptation, and volatile-mediated plant-insect and plant-plant interactions. For instance, in Artemisia tridentata (sagebrush), plants with the same chemotype respond more strongly to volatiles emitted by each other than to volatiles from individuals with a different chemotype (Karban et al., 2014). Moreover, when Tanacetum vulgare (tansy) plants were challenged by insect herbivores from two feeding guilds, distinct chemotypes responded differently. Following caterpillar feeding, some chemotypes exhibited a stronger volatile response when also pre-treated with aphids, while the opposite was observed for other chemotypes (Clancy et al., 2020). In Thymus vulgaris (thyme), phenolic chemotypes show reduced tolerance to freezing compared to non-phenolic chemotypes (Thompson et al., 2013) and occur more frequently in a region that has undergone regional warming (where occurrence of extreme winter freezing events has dwindled). Accordingly, the observed geographic gradient in the frequency of the two identified chemotypes across wild G. hirsutum populations calls for further work testing whether the terpene chemotypes are associated with a pattern of local adaptation to biotic (herbivores and/or pathogens) or abiotic factors.
There have been numerous investigations concerned with describing the volatile terpenoids found in G. hirsutum. While chemotypes have not, to our knowledge, been described in wild cotton to date, differences in the amounts of herbivory-induced plant volatiles (HIPVs) released by different cultivars have been recorded (Hagenbucher et al., 2016). Magalhães et al. (2020) investigated VOC differences between seven G. hirsutum genotypes and found few qualitative and quantitative differences. Röse and Tumlinson (2005) showed that cotton plants respond in specific ways to herbivory by systemically releasing distinct blends of volatile compounds, and suggested that plants that emit higher amounts of induced defence chemicals are better at defending themselves. Loughrin et al. (1995) showed that, upon feeding damage by beet armyworm larvae (Spodoptera exigua), the naturalised variety TX2259 emitted much greater quantities of HIPVs than five other cultivated lines (almost sevenfold higher). Moreover, this variety released higher amounts of γ-terpinene. Thus, it would be interesting to explore if wild (and naturalised) cotton emits higher amounts of HIPVs than cultivated plants of G. hirsutum, and whether the constitutive terpene chemotype is also linked to quantitative differences in the volatile emissions.
HIPVs in G. hirsutum have also been implicated in interactions with the third trophic level. Campoletis sonorensis and Microplitis croceipes (two parasitoid species) were found to respond to volatiles from undamaged cotton plants (Elzen et al., 1987), yet volatiles emitted by caterpillar-damaged plants seem to provoke a considerably stronger response (De Moraes et al., 1998; Turlings et al., 1995). Importantly, cotton volatiles are also involved in plant-plant signalling; studies have shown that cotton plants exposed to the volatiles of a nearby damaged plant are more resistant to herbivory (Llandres et al., 2018; Renou et al., 2011). Therefore, chemotypic variation could be a factor that not only affects the attraction of natural enemies of herbivores to damaged or undamaged plants but is also a key determinant of the effectiveness of airborne signalling between plants.

Conclusions
In brief, we demonstrate that wild G. hirsutum plants from 16 naturally occurring populations along the Yucatan Peninsula could be grouped into two chemotypes based on differences in their stored monoterpene profiles. The identification of chemotypes in wild G. hirsutum could have implications for cotton direct defences against herbivory, and indirect defences through the attraction of natural enemies or plant-plant signalling. We recommend performing experiments exploring the consequences of the reported terpene variation on the different levels of plant defence and its effects on the outcome of associated interactions. In combination with functional characterisation of terpene synthases in wild cotton chemotypes, such research would provide valuable insight into the chemical ecology of cotton and may reveal opportunities to enhance the control of cotton pests without the current excessive use of pesticides.

Plant material
Gossypium hirsutum Cav. (Malvaceae) is a perennial shrub that is native to Mexico and is distributed throughout the Caribbean Basin and Central America (d'Eeckenbrugge and Lacape, 2014). It grows up to 2 m tall under natural conditions. Wild populations of the species can be found growing in coastal shrubland along the Mexican Yucatan Peninsula coast. Seeds from 16 populations of wild G. hirsutum plants located around the Yucatan Peninsula (Fig. 1) were collected in 2019 and 2020. At each site, approximately 100 seeds were harvested from six maternal plants. Offspring from the same mother are either full or half-siblings and are referred to here as a genotype. The seeds were shipped to the University of Neuchâtel, Switzerland, where the experiment was performed between July and November 2020.

Experimental protocol

Greenhouse seedling experiment
To improve germination rates, the seeds were pre-treated as follows before planting: seeds were delinted and carefully scratched at the chalazal end using a mini drill (Dremel 3000, WI, USA) with a sandpaper adapter. They were then soaked in tap water at 35 °C for 19 h before being planted in trays containing soil (Profi Substrat, Einheitserde, Germany). The trays were kept in phytotrons (GroBanks CLF Plant Climatics, Germany) under the following conditions until seedlings emerged: 14 h at 30 °C and 65 μmol m⁻² s⁻¹, and 10 h at 25 °C in the dark.
Once cotyledons were fully developed, seedlings were transplanted to individual pots (250 ml) and moved to the greenhouse, where they were placed randomly. Plants were then grown until the fourth true leaf had emerged and was fully developed.

Sampling of mature plants in situ
In the winter of 2020, the leaves of 165 mature plants from 14 of the 16 populations previously described, plus three additional populations, were sampled in situ. Of these, 34 plants from 11 populations produced seeds that were used in the main experiment. As some plants were lost to flooding, hurricanes, and urbanisation, not all maternal plants could be sampled. We only harvested undamaged leaves, which were immediately frozen on dry ice upon collection. The frozen samples were ground to a powder under liquid nitrogen and stored at −80 °C in the Biotechnology Laboratory at the Faculty of Chemical Engineering of the Autonomous University of Yucatan (UADY). In 2021, the frozen samples were transported to the University of Neuchâtel, where hexane extraction and GC-MS analysis were performed as described below.

Sample preparation and GC-MS analysis
The fourth leaf of each plant was harvested, photographed, and weighed, then immediately frozen in liquid nitrogen. The frozen leaf samples were then ground into a powder and stored at −80 °C. For each sample, 1 ml of hexane was added to approximately 100 mg frozen leaf material and then vortexed and stored at 4 °C for 24 h. The hexane extract (800 μl) was removed and transferred to an amber glass 1.5 ml vial with screw top PTFE/silicone lid. Nonyl acetate was added as an internal standard (200 ng in 10 μl).
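The text does not spell out how peak areas were converted to concentrations, so the following is only a hedged sketch of a common internal-standard calculation using the quantities given above (1 ml hexane, 800 μl aliquot, 200 ng nonyl acetate). The function name, the unit response factor, and the assumption that the aliquot is representative of the whole extract are all illustrative choices, not the authors' stated method.

```python
def ng_per_mg_leaf(peak_area, istd_area, leaf_mass_mg,
                   istd_ng=200.0, solvent_ul=1000.0, aliquot_ul=800.0,
                   response_factor=1.0):
    """Estimate compound concentration in ng per mg of frozen leaf material.

    Assumptions (not stated in the source): the detector responds equally to
    the compound and to the internal standard (response_factor = 1), and the
    800 ul aliquot has the same composition as the full 1 ml extract.
    """
    if istd_area <= 0 or leaf_mass_mg <= 0 or aliquot_ul <= 0:
        raise ValueError("internal standard area, leaf mass and aliquot must be positive")
    # Amount of compound in the vial, scaled by the internal standard.
    ng_in_vial = response_factor * peak_area / istd_area * istd_ng
    # Scale from the transferred aliquot back to the full solvent volume.
    ng_in_extract = ng_in_vial * solvent_ul / aliquot_ul
    return ng_in_extract / leaf_mass_mg
```

For compounds with an authentic standard, a measured response factor could replace the default of 1; otherwise the result is best read as nonyl acetate equivalents.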
The samples were analysed by GC-MS (GC type: 8890A, MS type: 5977B MSD, both Agilent Technologies, Palo Alto, CA, USA), using a HP-5MS capillary column (30 m × 250 μm × 0.25 μm, Agilent Technologies). A 1.5 μl sample was autoinjected in splitless mode (inlet at 250 °C), with He as carrier gas at a constant flow rate of 0.9 ml min⁻¹ and the following temperature program: 40 °C for 0 min, ramping at 15 °C min⁻¹ to 130 °C, then 60 °C min⁻¹ to 190 °C, then 30 °C min⁻¹ to 290 °C and holding for 0.5 min. Identification of all measured VOCs was carried out through comparison of obtained mass spectra with those of authentic commercially available standards, NIST version 2.3 library spectra, an in-house VOC mass spectra library, and the Kovats retention index library (Lucero et al., 2009). Peaks were aligned and quantified using the PyMassSpec package in Python forked from the PyMS repository (O'Callaghan et al., 2012).
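Retention indices are computed relative to a series of n-alkanes run under the same conditions. The source does not state which formula was applied; for temperature-programmed runs such as the one described, the linear (van den Dool and Kratz) form is commonly used in place of the logarithmic isothermal Kovats equation, and that is what the sketch below implements. The function name and the alkane retention times in the usage note are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at retention time `rt`.

    `alkane_rts` maps n-alkane carbon number to retention time (same units
    as `rt`), measured with the same column and temperature program.
    """
    ladder = sorted(alkane_rts.items(), key=lambda item: item[1])
    times = [t for _, t in ladder]
    if len(ladder) < 2 or rt < times[0] or rt > times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    # Index of the alkane eluting at or just before the peak.
    i = min(bisect_right(times, rt) - 1, len(ladder) - 2)
    (n, t_n), (n_next, t_next) = ladder[i], ladder[i + 1]
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))
```

With hypothetical alkane times of 4.0 min for C9 and 5.0 min for C10, a peak at 4.5 min lies halfway between the two and receives an index of 950, which can then be compared against library values for the same stationary phase.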

Statistical analysis
Analyses were performed using the relative abundance of monoterpenes and sesquiterpenes together, unless explicitly stated otherwise. Statistical analyses were performed using R (version 4.1.3). Hierarchical clustering of mono- and sesquiterpenes was performed in pvclust (Suzuki and Shimodaira, 2006) using the correlation distance method, the Ward.D2 agglomeration method, and 10,000 bootstrap replications to identify compounds that were highly correlated in order to distinguish chemotype groupings (Bustos-Segura et al., 2017). The correlation heatmap was created using the ComplexHeatmap R package (Gu et al., 2016). We performed Principal Coordinate Analysis using functions from the R packages vegan and stats (vegdist, cmdscale; Oksanen et al., 2020; R Core Team, 2013) using relative abundances of mono- and sesquiterpenes. We assessed the effect of geographic location on plant chemical profiles using a Mantel test (R package 'vegan'), which tested the correlation between the spatial and chemical distances (Haversine distance and Bray-Curtis dissimilarity). Permutational multivariate analysis of variance (PERMANOVA) using the 'adonis' function (R package 'vegan') with 999 permutations was used to test for differences between chemotypes. Distribution analysis of leaf area was performed using the R package 'multimode' (Ameijeiras-Alonso et al., 2021), and ridgeline plots were produced using the R package 'ggridges' (Wilke, 2021). Generalised linear mixed-effects models were fitted using the R package 'lme4' (Bates et al., 2015), using a Gamma distribution with a log link function to analyse the variation in VOC concentration. Concentrations of total terpenes, total monoterpenes, total sesquiterpenes or total GLVs were used as dependent variables, and we assessed the effects of chemotype class and leaf area (cm²) as a covariate (fixed effects); population and genotype were included as random effects. Likelihood ratio tests and the Akaike information criterion were used to determine which random factors significantly affected the model. Heritability was estimated with an animal model using a generalised linear mixed model with a Bayesian approach (R package 'MCMCglmm'; Hadfield, 2010). The model included chemotype as a binomial response variable (chemotype class A or B) and the pedigree of maternal and offspring plants as a random factor. A threshold distribution family was used (de Villemereuil, 2018). We ran MCMC for 2 × 10⁶ iterations with a thinning interval of 200 after a burn-in of 1 × 10⁵, which yielded an effective sample size of 8075. Heritability and confidence intervals were estimated using the posterior distribution of the variance components from the model (Falconer and Mackay, 1996):

$$H^2 = \frac{V_A}{V_A + V_R}$$

where $H^2$ is the broad sense heritability, $V_A$ the additive variance, and $V_R$ the residual variance.
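As an illustration of the last step, the sketch below turns posterior samples of the two variance components into a heritability estimate with an interval. It assumes the ratio of additive variance to the sum of additive and residual variance implied by the symbol definitions above, and it uses an equal-tailed quantile interval, which may differ from the interval type reported in the study. The function names and inputs are hypothetical; in practice the samples would be exported from the fitted MCMC model.

```python
import statistics


def heritability_samples(va_samples, vr_samples):
    """Per-iteration heritability V_A / (V_A + V_R) from paired posterior samples."""
    return [va / (va + vr) for va, vr in zip(va_samples, vr_samples)]


def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def summarise(samples, level=0.95):
    """Posterior mean and equal-tailed credible interval (an assumed interval type)."""
    ordered = sorted(samples)
    tail = (1 - level) / 2
    return statistics.mean(ordered), quantile(ordered, tail), quantile(ordered, 1 - tail)
```

Computing the ratio for every retained iteration, rather than from the mean variance components, carries the joint uncertainty in both components through to the interval.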

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Map showing locations of the wild Gossypium hirsutum populations from which seeds were collected along the Yucatan Peninsula.

Fig. 2. Correlation analysis of all mono- and sesquiterpenes analysed in the wild Gossypium hirsutum plants. Red rectangles indicate highly supported groups of monoterpenes (approximately unbiased (AU) p < 0.05). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 3. Theorised monoterpenoid biosynthesis pathway. Compounds highlighted in blue comprise the γ-terpinene compound group, those highlighted in grey comprise the α-pinene compound group. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 5. Principal coordinate analysis on relative abundance of monoterpenes in plants grown from seed, showing samples separated based on their compositional similarity.

Fig. 6. Ridgeline plot showing the distribution of the summed values of the γ-terpinene compound group (γ-terpinene, limonene, α-thujene, α-terpinene, terpinolene, and p-cymene; as % relative to total monoterpenes in each plant). In order from top to bottom: the plots coloured red (Celestún) and orange (Sisal) are located at the west of the peninsula. The plot coloured yellow (Chicxulub) is in the centre, and the cream coloured plot (Coloradas) is located at the east of the peninsula. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Table 1
Relative abundance of monoterpenes and sesquiterpenes used in analyses, grouped by geographic zones. Compounds marked with an asterisk were verified by comparison to authentic standards.

Table 2
Variance between genotype and population random factors. The percentage of total variance explained is also shown. Variance components were obtained from generalised linear mixed models evaluating the variance of total concentrations of all volatiles, total monoterpenes, total sesquiterpenes, and total green leaf volatiles. Values shown in brackets specify the standard deviation of each component. Significant values are shown in bold. ***p < 0.0001.