Widely assumed phenotypic associations in Cannabis sativa lack a shared genetic basis

The flowering plant Cannabis sativa, cultivated for centuries for multiple purposes, displays extensive variation in phenotypic traits as well as in its wide array of secondary metabolites. Notably, Cannabis produces two well-known cannabinoids, cannabidiolic acid (CBDA) and delta-9-tetrahydrocannabinolic acid (THCA), which are the main products sought by consumers in the medical and recreational markets. Cannabis has several suggested subspecies which have been shown to differ in chemistry, branching patterns, leaf morphology and other traits. In this study we obtained measurements related to phytochemistry, reproductive traits, growth architecture, and leaf morphology from 297 hybrid individuals from a cross between two diverse lineages. We explored correlations among these characteristics to inform our understanding of which traits may be causally associated. Many of the traits widely assumed to be strongly correlated did not show any relationship in this hybrid population. The current taxonomy and legal regulation of Cannabis are based on phenotypic and chemical characteristics. However, we find these traits are not associated when lineages are inter-crossed, which is a common breeding practice and forms the basis of most modern marijuana and hemp germplasm. Our results suggest naming conventions based on leaf morphology do not correspond to the chemical properties of plants with hybrid ancestry. Therefore, a new system for identifying variation within Cannabis is warranted, one that will provide reliable identifiers of the properties important for recreational and, especially, medical use.

could be culled before pollen production and potential female pollination. This is important because females would undesirably divert energy to seeds instead of cannabinoids after being pollinated (Clarke & Merlin 2013).

In this study, we quantified 18 phenotypic traits of 297 individuals from a first-

Morphological measurements including height, stalk diameter, internode length, petiole length, leaf length and width, among other measurements, were obtained at two different time points during the growing cycle (Table S1). We chose these two time points, one at the beginning and one at the end of the growing season, to provide information on

Given that the production of these three cannabinoids may be correlated because

molecule -CBGA-, we analyzed them using principal component analysis (PCA) to account for multicollinearity and to avoid redundancies. We used a K-means cluster analysis on PC1 vs PC2 to visualize the different cannabinoid groups. We also added the total cannabinoid concentration and calculated the ratio of each cannabinoid to this total concentration (Table S1).
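The total concentration and the per-cannabinoid ratios described above can be illustrated with a minimal sketch. The function name, the dictionary keys, and the numeric values are assumptions made for illustration only; they are not the study's data, and the third cannabinoid is left as a placeholder because it is not named in this passage.

```python
def cannabinoid_ratios(concentrations):
    """Return the total concentration and each cannabinoid's share of it."""
    total = sum(concentrations.values())
    if total == 0:
        # No detectable cannabinoids: the ratios are undefined.
        return total, {name: None for name in concentrations}
    ratios = {name: value / total for name, value in concentrations.items()}
    return total, ratios


# Hypothetical concentrations for a single plant (illustrative values only).
plant = {"THCA": 12.0, "CBDA": 3.0, "third_cannabinoid": 1.0}
total, ratios = cannabinoid_ratios(plant)
```

Because the ratios of one plant sum to one by construction, they are themselves mutually dependent, which is one reason a dimension-reduction step such as PCA is useful before relating cannabinoid content to other traits.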

We examined the associations between the production of each cannabinoid and each of the measured traits at both timepoints and the . We also used cannabinoids as the explanatory variables for several MANOVA models to determine whether cannabinoid production explained differences among the measured traits. We then corroborated the MANOVA results with multivariate multiple regressions, and correlated leaf shape with cannabinoid content to understand whether any association exists between those traits. Finally, we generated a variance-covariance matrix to establish the association within and between all phenotypic traits.
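As an illustration of the last step, a sample variance-covariance matrix and the Pearson correlations derived from it might be computed as in the sketch below. The trait names and values are hypothetical (they are not data from the study), and the study's actual software is not stated in this passage, so this is only a plain-Python sketch of the calculation.

```python
from math import sqrt
from statistics import mean


def covariance_matrix(traits):
    """Sample variance-covariance matrix as a dict keyed by (trait_a, trait_b)."""
    names = list(traits)
    n = len(traits[names[0]])
    means = {name: mean(values) for name, values in traits.items()}
    cov = {}
    for a in names:
        for b in names:
            # Sum of cross-products of deviations, divided by n - 1.
            cross = sum(
                (x - means[a]) * (y - means[b])
                for x, y in zip(traits[a], traits[b])
            )
            cov[(a, b)] = cross / (n - 1)
    return cov


def correlation_matrix(cov):
    """Rescale covariances to Pearson correlations using the diagonal variances."""
    return {
        (a, b): value / sqrt(cov[(a, a)] * cov[(b, b)])
        for (a, b), value in cov.items()
    }


# Hypothetical measurements for four plants (illustrative values only).
traits = {
    "height_cm": [100, 120, 140, 160],
    "node_count": [10, 12, 14, 16],
    "leaf_width_cm": [5, 3, 6, 4],
}
cov = covariance_matrix(traits)
corr = correlation_matrix(cov)
```

In this toy data set height and node count are perfectly correlated while leaf width is uncorrelated with height, mirroring the kind of pattern the matrix is meant to reveal: diagonal entries hold each trait's variance, and off-diagonal entries hold the covariance between pairs of traits.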

These data were added to the Dryad repository

Our results show that some phenotypic traits from the IT are correlated (Table S2). The positive correlation between traits related to height such as number of nodes and number of branches is expected. In other words, it is expected that tall plants will have multiple branches and nodes. It is also expected that traits that are not related to height, such as leaf-related characteristics, lack a significant correlation.

Similarly, the FT also shows that some traits are correlated at this stage (Table S3).

have long side branches as well as thicker stalks. However, as expected, some traits lack association, such as stalk diameter and inflorescence number or size.

However, many of the significant associations within either the IT or FT are lost when traits are correlated across the two timepoints (Table S4). These various phenotypic traits are not predictive between time periods (Table S4)

Similarly, these phenotypic traits do not differ between males and females (Table S5). In other words, males cannot be distinguished from females with any of the physical characteristics that we measured in this study. However, some trait correlations do differ between the sexes (Table S6)

Our results suggest there are some trait correlations that describe leaf shape, but these are not correlated with growth rates, plant size, branching architecture, phytochemistry, or plant sex (Figure S1). It appears that there could be a within-leaf effect because the FELL measurements correlate with one another in the IT, and serration, leaf length, and number of leaflets correlate with leaf shape in the FT. However, the leaf measurements show no association between timepoints (Table S7).

The overall trend shows leaf shape is not explained by any of the plant traits measured at either timepoint (Table S8). These results were confirmed with the MANOVAs and multivariate multiple regressions.

We found no relationship between leaf shape and cannabinoid content using PC1 for leaf shape and PC1 for cannabinoid variation (Figure 4). Therefore, leaf shape is not

The lack of sexual dimorphism in the measured traits for this study may be specific to this population, and particularly the measurements in the FT may be problematic due to the lack of males. Theoretical models suggest differences between males and females

The PCA facilitates the examination of shape variation for each structure independently (Adams et al. 2004), allowing us to distinguish differences in leaf shape (Figures 1B and 2). As size is removed during the Procrustes superimposition, it does not determine the variation of the first principal component (PC1) as it does in traditional morphometrics, ensuring that the main source of variation explored is shape. With this geometric morphometric analysis, we found that leaf shape is not related to sex (Figure 2), cannabinoid production (Figure 4), or to multiple other phenotypic traits (Table S8), suggesting all of these traits segregate independently. However, we did find within-leaf associations between shape, leaf length, serration, and number of leaflets (Figure S1), suggesting that within a single leaf some characteristics may be related to each other.
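To make concrete how Procrustes superimposition removes size, position, and orientation before shape is compared, the sketch below aligns one two-dimensional landmark configuration to another. It is an ordinary two-shape Procrustes fit written for illustration, not the generalized procedure or the software used in the study; landmarks are stored as complex numbers (x + yj), reflections are not considered, and the three-landmark triangles are hypothetical stand-ins for the ten leaf landmarks.

```python
import math


def centroid_size(points):
    """Square root of the summed squared distances of landmarks from their centroid."""
    centroid = sum(points) / len(points)
    return math.sqrt(sum(abs(p - centroid) ** 2 for p in points))


def normalize(points):
    """Translate the centroid to the origin and scale to unit centroid size."""
    centroid = sum(points) / len(points)
    size = centroid_size(points)
    return [(p - centroid) / size for p in points]


def align(shape, reference):
    """Rotate the normalized shape to best match the normalized reference."""
    z = normalize(shape)
    w = normalize(reference)
    # The least-squares rotation is the unit complex number with the phase of
    # sum(w_k * conj(z_k)); this assumes that sum is not zero.
    s = sum(wk * zk.conjugate() for wk, zk in zip(w, z))
    rotation = s / abs(s)
    return [rotation * zk for zk in z]


def procrustes_distance(shape, reference):
    """Residual shape difference after removing position, size, and rotation."""
    aligned = align(shape, reference)
    w = normalize(reference)
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(aligned, w)))


# Hypothetical landmark configurations (illustrative values only).
reference = [0 + 0j, 4 + 0j, 0 + 3j]
# Same shape, but doubled in size, rotated by 90 degrees, and shifted.
transformed = [2j * p + (5 + 5j) for p in reference]
# A genuinely different triangle.
other = [0 + 0j, 4 + 0j, 2 + 3j]

same_shape_distance = procrustes_distance(transformed, reference)
different_shape_distance = procrustes_distance(other, reference)
```

The enlarged, rotated copy ends up at essentially zero distance from the reference, whereas the different triangle does not, which is the sense in which size cannot drive PC1 once the configurations have been superimposed.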

Studies suggest that THC has been selected for by breeders and growers and that

Our results are consistent with these studies given that THC is always produced in higher quantities than CBD (Figure S2), implying that THCA synthase may be a better competitor than CBDA synthase in this population.

Variation in THC production is probably a result of gene sequence variation (Onofri

correlations between leaf shape and phytochemistry may not be due to causal relationships, but rather because breeders have intentionally (or unintentionally) selected for these trait combinations. If these traits were associated due to shared ancestry or correlated selection, their association can be broken by recombination.

This is particularly noticeable in most of the modern cultivars, which are hybrids of the two supposed main groups. Therefore, our study also suggests that common assumptions about associations between leaf shape and chemistry may exacerbate the

Exemplar leaf depicting the ten points used for leaf shape analysis. The ten points measured the first, central, and last leaflets.