Correlation Analysis of Compounds in Essential Oil of Amomum tsaoko Seed and Fruit Morphological Characteristics, Geographical Conditions, Locality of Growth

Abstract: Amomum tsaoko is a perennial herb belonging to Zingiberaceae. Its dried ripe fruit is an important food additive, spice, and materia medica in Southeast Asia. Over hundreds of years of cultivation, morphological variations have arisen. The essential oil is one of the major active products of the A. tsaoko fruit and seed. In this study, we collected 12 populations in Yunnan province and analyzed the correlation between the compounds in the essential oil of A. tsaoko seed and fruit morphological characteristics, geographical conditions, and locality of growth. The results showed that the difference in morphological characteristics between populations is greater than the difference within a population. High-altitude areas are beneficial for biomass accumulation. Another interesting finding is that the morphologies of A. tsaoko fruit and seed could serve as a reference when selecting materials with a specific function or odor type. Furthermore, qualitative and quantitative analysis of compounds in the essential oil could be used to distinguish the producing area of the A. tsaoko fruit. These results are crucial for determining botanical origin and evaluating the quality of A. tsaoko fruit. They also make clear that further studies on this plant deserve more attention.


Introduction
Amomum tsaoko Crevost et Lemaire is a perennial herb belonging to Zingiberaceae. Based on a novel multi-marker phylogenetic framework using matK and nrITS, the new genus Lanxangia was described, and A. tsaoko was renamed Lanxangia tsaoko [1]. In this study, the name A. tsaoko is retained, following the Flora of China [2]. Its dried ripe fruit, commonly known as black cardamom, Tsaoko Fructus, or Caoguo, is an important food additive and spice in Southeast Asia, including China, Korea, and Japan [3,4]. It is also a common materia medica in China. Historically, it has long been used for the interior obstruction of cold-dampness and distending pain in the epigastrium and abdomen, stuffiness and fullness, vomiting, malaria with cold and fever, and pestilence fever [5]. In recent years, A. tsaoko has been reported to have an anti-inflammatory effect against insects, an antitumor effect in liver cancer cells, and anti-angiogenic efficacy in ovarian cancer, and to inhibit sphingosine kinases 1 and 2 (SPHK1/2) [6,7]. The essential oil is one of the major active products of black cardamom and consists mainly of monoterpene hydrocarbons and oxygenated monoterpenes [8]. It has been linked to a broad spectrum of bioactivities, including antimicrobial, antibacterial, insecticidal, and antioxidant activities, as well as sedative, analgesic, and hypnotic effects [8][9][10][11][12].
Nowadays, wild (or natural) populations of A. tsaoko are extremely hard to find. However, the plant is easy to cultivate and requires little management, so resource usage depends mainly on cultivation. In China, it is randomly planted in the southeast of Yunnan province, mainly in moist, broadleaved forests. Over the last 200 years, owing to increasing consumption and the gradual disappearance of wild (or natural) populations, the cultivated area has expanded [13]. Now, about 90% of the A. tsaoko producing area is in Yunnan province [14]. Typically, it can be harvested three years after planting and remains productive for about 20 years. As a result, A. tsaoko is a valuable economic plant in the mountainous regions of Yunnan province.
After years of cultivation, morphological variations have developed in the fruits and seeds of A. tsaoko. According to the length-width ratio and the shape of the ends, the fruit types are defined as elliptic, cone-shaped, spindle-shaped, and spheroidal [15]. The frequencies of these shapes vary between regions [15]. In China, people believe that larger or rounder fruits have a more prominent smell, which may indicate good quality and a higher price. However, there is no direct evidence to support this. There are few, if any, reports on whether the odor or the composition of the essential oil correlates with geographical conditions. Furthermore, very little is known about whether the essential oils differ between places and whether characteristic components could reveal the locality of growth.
For the analysis of the volatile compounds in the essential oil, spectroscopy provides powerful analytical methods. Gas chromatography-mass spectrometry/flame ionization detection (GC-MS/FID), nuclear magnetic resonance (NMR), near-infrared (NIR), mid-infrared (MIR), ultraviolet-visible (UV-vis), and Raman spectroscopy have been applied to many complex matrices [16]. In this study, GC-MS was used for the relative quantification of the volatile compounds in the essential oil. We measured the morphological characteristics, collected the geographical conditions, and investigated the differences in composition and content of the essential oils of black cardamom. Correlation analyses between the compounds and morphological variables, geographical conditions, and locality of growth were then carried out. The results could help in selecting cultivation land, in selecting materials with a specific function or odor type by referencing the morphologies of A. tsaoko fruit and seed, and in identifying the A. tsaoko producing area.

Plant Materials
A. tsaoko ripe fruits were collected from 12 populations at early ripeness in autumn of 2018, as shown in Table 1 and Figure 1. All the fruits were identified as A. tsaoko fruits by Yao-wen Yang, and all voucher specimens were deposited in the herbarium of the museum of the Yunnan University of Chinese Medicine.

Chemicals
All solvents and reagents were of analytical grade. n-Hexane (GC grade) was purchased from Merck KGaA, Germany. C7-C30 saturated alkanes were purchased from Sigma-Aldrich. Helium (99.999% purity) was provided by Shanghai Lvmin Gas Co., Ltd., Shanghai, China.

Measurements of Morphological Characteristics
A total of 30 fruits from each population were randomly chosen and their morphological characteristics were measured. Fruit length and width were measured using a slide caliper (accuracy 0.1 mm). An electronic scale (accuracy 0.01 g) was used to obtain the fruit fresh weight, fruit dry weight, and 1000 seeds' weight.

Sample Preparation for GC-MS (SIM)
A. tsaoko fruits were dried in an electro-thermostatic blast oven (DHG-9140A, Gongyi Yuhua Instrument Co., Ltd., Gongyi, China) at 25 °C for 10 days and turned twice daily. Seeds were separated from the dried fruit. Seeds from the same plant were pooled and ground using a portable high-speed pulverizer (DFT-200, Wenling Linda Machinery Co., Ltd., Wenling, China). The essential oil was extracted from the powdered seeds by hydrodistillation according to essential oil extraction method I in the Chinese Pharmacopoeia [5]. Six parallel samples of essential oil were prepared for each population.
The essential oil samples were diluted 100 times and filtered through a 0.22-µm PTFE Millipore filter (Tianjin Jinteng Experimental Equipment Co., Ltd., Tianjin, China). After discarding the primary filtrate, the subsequent filtrate was collected into SureStop vials (C5000-2W, Thermo Fisher Scientific, Waltham, MA, USA) and stored at −80 °C until analysis. All samples were treated consistently.
A 200-µL aliquot of each sample was pooled to obtain the QC sample. The mixture was then placed into SureStop vials with inner tubes (200 µL per vial) (C4010-630, Thermo Fisher Scientific, Waltham, MA, USA).

GC-MS Analysis
The GC-MS analysis of the essential oil was carried out using an Agilent 9000 GC system coupled with an Agilent 5977B MSD. The sample was introduced by split injection at a pressure of 25 psi and a temperature of 280 °C, with an injection volume of 1 µL. Compounds were separated on an HP-5MS capillary column (30 m × 0.25 mm × 0.25 µm film thickness). The optimized temperature program is shown in Supporting Table S1. GC-MS data were acquired at split ratios of 44:1 and 8:1. The carrier gas was helium at a flow rate of 1.0 mL/min. MS detection was performed in electron impact mode at 70 eV. The temperatures of the MS transfer line, quadrupole, and ion source were set at 280 °C, 150 °C, and 230 °C, respectively. The full scan m/z range was 15-300 Da.
In the sequence, the samples were run randomly, and one QC injection was added after every seven sample injections to ensure data stability.
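The injection order described above can be illustrated with a short Python sketch. It is only an assumption-laden illustration, not the authors' actual sequence: the sample identifiers, the fixed random seed, and the absence of leading or trailing QC injections are all hypothetical choices.

```python
import random


def build_run_sequence(sample_ids, qc_label="QC", qc_every=7, seed=0):
    """Shuffle the samples and add one QC injection after every `qc_every` samples."""
    order = list(sample_ids)
    random.Random(seed).shuffle(order)  # fixed seed only to make the sketch repeatable
    sequence = []
    for position, sample in enumerate(order, start=1):
        sequence.append(sample)
        if position % qc_every == 0:
            sequence.append(qc_label)
    return sequence


# Hypothetical identifiers: 12 populations x 6 parallel samples each
samples = [f"P{pop:02d}-{rep}" for pop in range(1, 13) for rep in range(1, 7)]
sequence = build_run_sequence(samples)
```

With 72 samples, this layout yields ten QC injections interleaved with the randomized samples.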

Compound Identification and Quantification
Deconvolution was performed using the Agilent MassHunter Qualitative Analysis Workflows software (version B.08.00, Agilent Technologies, Inc., Santa Clara, CA, USA). Based on the full scan mass spectral data and the deconvolution results, the fragmentation patterns were compared with the NIST17 library (version 2.3.0.0, National Institute of Standards and Technology (NIST), Gaithersburg, MD, USA). The identified compounds were further confirmed by comparison with the published literature. Retention indices were then used as a reference to double-check the identification results.
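The text does not state how the retention indices were computed. One common choice for temperature-programmed runs with an n-alkane ladder (such as the C7-C30 standard listed under Chemicals) is the linear retention index of van den Dool and Kratz. The sketch below assumes that formula; the function name and the example retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak from bracketing n-alkanes.

    alkane_rts maps carbon number -> retention time (min) and is assumed
    to increase with carbon number.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    # Index of the alkane eluting at or just before the peak
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    fraction = (rt - times[i]) / (times[i + 1] - times[i])
    return 100 * (n + (n_next - n) * fraction)


# Hypothetical retention times for three alkanes
ladder = {10: 10.0, 11: 12.0, 12: 14.5}
ri = linear_retention_index(11.0, ladder)
```

A peak eluting halfway between C10 and C11 receives an index halfway between 1000 and 1100, which can then be compared with literature values for the same stationary phase.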
The selected ion monitoring (SIM) mode was used to quantitate. One quantifier and at least two qualifiers for each of the target compounds were selected. The quantifiers were chosen based on the specificity and intensity of ions.

Statistical Analysis
All GC-MS (SIM) data were first auto-integrated and then manually checked to ensure correct peak integration using the Agilent MassHunter Quantitative Analysis software (version B.09.00, Agilent Technologies, Inc., Santa Clara, CA, USA). The data were integrally normalized to the QC samples using the MetNormalizer package [17]. Univariate statistical analysis was performed using the Student's t-test or the Mann-Whitney U test where appropriate, and false discovery rates were used to control error propagation. The results were visualized as a volcano plot using the R packages ggplot2 and ggrepel. For each population, the means of the compound contents in the essential oil and of the morphological data were calculated; the Spearman rank test was then performed to evaluate their correlation, and the result was visualized using the R package corrplot (Visualization of a Correlation Matrix, version 0.84), which is available from https://github.com/taiyun/corrplot (accessed on 20 July 2020) [18]. Furthermore, correlation networks between the compound contents in the essential oil and morphology were constructed using the Cytoscape software (v3.30) [19].
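The two statistical building blocks named above can be sketched in plain Python. This is an illustration only: the study used R, the text does not say which false discovery rate procedure was applied (Benjamini-Hochberg is assumed here), and all function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In use, `spearman_rho` would be applied to the per-population means of one compound and one morphological or geographical variable, and `benjamini_hochberg` to the p-values from the univariate tests.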
Data were treated with unit variance (UV) scaling before being subjected to multivariate analyses. Principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were performed using the SIMCA software (12.0.0.0, Umetrics AB, Umeå, Sweden). Several model parameters, such as R²X, R²Y, Q²Y, and the p-value of CV-ANOVA, were evaluated to ensure model quality and avoid the risk of overfitting.
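Unit variance scaling is commonly understood as mean-centering each variable and dividing it by its standard deviation. The sketch below assumes that definition and the sample standard deviation; it is not the SIMCA implementation, and the input matrix is hypothetical.

```python
from statistics import mean, stdev


def uv_scale(matrix):
    """Mean-center each column and divide it by its sample standard deviation.

    Rows are samples and columns are variables (for example, compounds).
    """
    scaled_columns = []
    for column in zip(*matrix):
        mu, sd = mean(column), stdev(column)
        if sd == 0:
            raise ValueError("constant column cannot be UV scaled")
        scaled_columns.append([(value - mu) / sd for value in column])
    # Transpose back so that rows are samples again
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical peak areas: three samples, two compounds on very different scales
scaled = uv_scale([[1, 10], [2, 20], [3, 30]])
```

After scaling, both columns carry equal weight in PCA or OPLS-DA regardless of their original magnitude.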

Morphological Characteristics
In this study, all four fruit types were detected in almost all 12 populations. As shown in Table 2, the mean values of length, width, and length-width ratio were 3.29-3.73 cm, 2.47-3.13 cm, and 1.20-1.38, respectively. The relative standard deviations (RSDs) were relatively high within populations. The number of seeds per fruit showed high variation, with RSDs greater than 30%. However, the RSD of the 1000 seeds' weight was no more than 4%. This means that the seeds' weight was consistent within a population, but not among populations. Similarly, although the fresh and dry weights of fruit had relatively high RSDs, the RSDs of the dehydration rate were less than 6%. The oil yield of seeds was 2.93-3.99%, while the average RSD for seeds from the same population was extremely low (2.37%). The results suggest that the difference between populations is greater than the difference within a population. Therefore, the location of a plantation should be chosen carefully.
Meanwhile, the results of the correlation analysis between morphological variations and geographical conditions are shown in Figure 2. The breadth, fresh weight, dry weight, and 1000 seeds' weight are positively correlated with altitude (p < 0.05). This result might indicate that high-altitude localities are beneficial for the accumulation of biomass in A. tsaoko fruits. This is consistent with previous reports on Stevia rebaudiana leaves [20]. However, the correlation between biomass and altitude is controversial [21]. Some studies have shown an inverse relationship. For example, forage yield decreased significantly with increasing altitude in the alpine meadow of Sanjiangyuan, China [21]. More parameters, such as nutrient elements in soil, are worth investigating [22].
Among these 12 populations, it is notable that the fruits from YY have the highest fruit weight, fruit size, and 1000 seeds' weight, and the oil yield of seeds was 3.91%, which was also on the high side. This indicates that Yuanyang (YY) is favorable for A. tsaoko fruit biomass.
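The RSD and length-width ratio reported above can be computed as in the following sketch. It assumes that RSD means the sample standard deviation divided by the mean, expressed as a percentage; the measurements shown are hypothetical and are not taken from Table 2.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    return 100 * stdev(values) / mean(values)


def length_width_ratios(lengths, widths):
    """Length-width ratio of each fruit."""
    return [length / width for length, width in zip(lengths, widths)]


# Hypothetical measurements (cm) for three fruits of one population
lengths = [3.3, 3.5, 3.7]
widths = [2.5, 2.8, 3.0]
ratios = length_width_ratios(lengths, widths)
ratio_rsd = rsd_percent(ratios)
```

Computing the RSD within each population and comparing it with the spread of the population means is one simple way to contrast within-population and between-population variation.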

Compound Identification and Quantification
After deconvolution, 45 peaks with a signal-to-noise ratio higher than ten at the split ratio of 8:1 were chosen and identified. The GC-MS chromatogram with labeled peaks is shown in Figure 3. The identification results are shown in Table 3 and include 10 monoterpene hydrocarbons, 12 oxygenated monoterpenes, eight indane derivatives, eight straight-chain aldehydes, three esters, one oxygenated sesquiterpene, and three unknowns. The compounds were identified mainly based on full scan mass spectral data compared with the NIST17 library (version 2.3.0.0, NIST, Gaithersburg, MD, USA). 5-Indanecarbaldehyde and six C10H12O compounds, which are probably 2,3,3a,7a-tetrahydro-1H-indene-4(5)-carbaldehydes, could not be found in the NIST library and were identified by comparison with the mass spectral data in previous studies [23,24].
GC-MS (SIM) data were acquired at the split ratio of 44:1. The selected ions are shown in Supporting Table S2; the underlined ions were quantifiers, while the other ions were qualifiers. All GC-MS (SIM) data were first auto-integrated and then manually checked. Peak 26, which had more than 20% missing data, was not used in further analyses. The data were integrally normalized to the QC samples, followed by a PCA analysis. The results are shown in Supporting Figures S1 and S2. Data RSDs decreased and the QC samples clustered closely, which indicated good stability of the method. One sample from YY and one from FG were regarded as outliers and removed from further analyses because they fell outside the 95% confidence interval.
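The missing-data rule applied to peak 26 can be expressed as a simple filter. The sketch below is a hypothetical illustration: the table layout, the use of None for a missing peak area, and the example values are assumptions, not the authors' data.

```python
def drop_sparse_peaks(peak_table, max_missing_fraction=0.20):
    """Keep only peaks whose fraction of missing areas does not exceed the limit.

    peak_table maps a peak name to its list of peak areas, one per sample,
    with None marking a missing value.
    """
    kept = {}
    for name, areas in peak_table.items():
        missing_fraction = sum(area is None for area in areas) / len(areas)
        if missing_fraction <= max_missing_fraction:
            kept[name] = areas
    return kept


# Hypothetical areas for two peaks across five samples
peak_table = {
    "peak_25": [1.0, 2.0, None, 4.0, 5.0],    # 20% missing, kept
    "peak_26": [None, None, 3.0, 4.0, 5.0],   # 40% missing, dropped
}
filtered = drop_sparse_peaks(peak_table)
```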

Correlation between Compounds and Morphological Characteristics, Geographical Conditions
Plant morphology is related to multiple factors, the two most important being genes and the environment. The composition and content of essential oils are also influenced by multiple factors, such as genotype, geographical conditions, harvest time, and drying methods [25][26][27][28]. In this study, correlation analyses between morphological variables, geographical conditions, and compounds in the essential oils were carried out. The results are shown in Figure 4. Nine compounds were related to fruit length. Seven of them, (E)-2-hexenal (1), octanal (6), α-phellandrene (7), α-terpinene (8), cymene (9), (E)-2-octenal (11), and γ-terpinene (12), are positively correlated with length (p < 0.05). Trans-sabinene hydrate (13) and unknown-1 (24) are negatively correlated with length. It is notable that all of them, excluding trans-sabinene hydrate (13), can be used as fragrances in the cosmetics and food industries for their aromas. For example, octanal (6) has a fruit-like odor, α-terpinene (8) has citrus and lemon-like aromas, and (E)-2-octenal (11), used in preparing chicken, has a cucumber flavor. Similarly, octanal (6) and trans-sabinene hydrate (13) showed the same trends with breadth and fresh weight (p < 0.05). Octanal (6) also showed the same trend with dry weight. None of them was correlated with length-width ratio or water content. These results suggest that the scent of black cardamom is more relevant to its absolute size than to its shape, and that a longer fruit means a stronger aroma.
This is consistent with the folk belief that larger black cardamom has a stronger smell. Furthermore, α-phellandrene (7) has anti-nociceptive and anti-inflammatory effects, and α-terpinene (8) has various biological activities, including acaricidal, antiprotozoal, and antioxidant properties. This suggests that longer fruits may also be better in terms of bioactivity.
The number of seeds is negatively correlated with decanal (23), which is a flavoring agent. The 1000 seeds' weight shows the same trend with unknown-1 (24) and 5-indanecarbaldehyde (38). Thus, the number of seeds and the 1000 seeds' weight may also inform the choice of odor type.
The oil yield of seeds is an important characteristic for judging the commercial quality of black cardamom. It is positively correlated with α-thujene (2) and negatively correlated with unknown-2 (27), C10H12O-5 (36), and C10H12O-6 (39) (p < 0.05). α-Thujene (2), the major compound (61.4-69.8%) in the essential oil of Boswellia serrata, has been shown to be responsible for enhanced antifungal and antioxidant activities [29]. When these biological activities are of interest, the oil yield of seeds might serve as a reference value. Fruits collected from Jinping (JP) had the highest oil yield among these 12 locations, followed by Tengchong (TC) and Pingbian (PB), at 4.17%, 4.08%, and 4.03%, respectively (Table 2).
Linalool (15) is used in perfumery. It is negatively correlated with longitude and positively correlated with latitude. Therefore, to obtain a high content of linalool (15), the northwest area of Yunnan province is a better choice than the southeast area. δ-Terpineol (18) is also positively correlated with latitude. It is worth mentioning that altitude is positively correlated with (E)-nerolidol (43) (p < 0.05). A recent report indicated that (E)-nerolidol (43) is a potent volatile signal involved in defense against Empoasca onukii and Colletotrichum fructicola in the tea plant. Early responses included the activation of a mitogen-activated protein kinase and WRKY, an H2O2 burst, and the induction of jasmonic acid and abscisic acid signaling. High levels of defense-related chemicals with broad-spectrum anti-herbivore or anti-pathogen properties accumulated and ultimately triggered resistance [30]. In this study, the results suggest that ripe fruits of A. tsaoko collected in higher-altitude areas accumulated a higher level of (E)-nerolidol (43). However, more research is needed to clarify its function in A. tsaoko.
The annual average precipitation is positively correlated with 2,6-dimethyl-2,4,6-octatriene (16), which is a plant-derived volatile organic compound. It has been reported that the growth of the fungus Botrytis cinerea was significantly inhibited when the mycelia were exposed to an atmosphere containing 2,6-dimethyl-2,4,6-octatriene (16). A response has also been reported when Phaseolus vulgaris is infected with Colletotrichum lindemuthianum [31]. The fruits of A. tsaoko may be infected by different fungi in areas with different precipitation, which deserves more attention in the future.
In addition, the correlation between the harvest date and the compounds was analyzed. The fruiting period of A. tsaoko is from August to December. When the fruit is ripe, its pericarp turns reddish-brown. The samples used in this study were harvested when the fruit turned reddish-brown. Sabinene (4), trans-sabinene hydrate (13), linalool (15), and (E)-2-tetradecenal (45) increased over time. Sabinene (4) and linalool (15) are used in perfumery, and (E)-2-tetradecenal (45) has a citrus odor. These results suggest that the harvest date might affect the odor type of black cardamom. Unfortunately, our harvest dates were concentrated on 4-5 August 2018 and from 30 October to 7 November 2018. More research covering longer time periods is needed to confirm this result.

Correlation between Compounds and Locality of Growth
Considering that sabinene (4), trans-sabinene hydrate (13), linalool (15), and (E)-2-tetradecenal (45) were related to the harvest date, they were excluded from the analysis of the correlation between compounds and distribution areas.
The 12 populations were first divided into two groups (Figure 5A), and the result is shown in Figure 6A. Groups 1 and 2 differed, with Q² greater than 0.56 and a p-value of 1.9 × 10⁻¹¹. Sixteen compounds were highlighted by significance level as discriminating Groups 1 and 2 (Figure 6B). However, the volcano plot of the univariate statistical analysis showed that no compound differed significantly between the two groups (Figure 6C).
Considering that A. tsaoko has been planted for centuries, the geographical conditions may have an effect on its essential oil. Floristic zones are large geographic areas composed of similar plant species or plant groupings. Based on this concept, Yunnan province has been divided into five floristic zones [32]. At present, A. tsaoko is distributed in three of these floristic zones: the south and southwest zone (zone I), the southeast zone (zone II), and the west and northwest zone (zone IV). Zone II has the longest cultivation history for A. tsaoko in China, and the A. tsaoko produced in zones I and IV was introduced from zone II. In this study, the Yuanyang (YY), Lvchun (LC), Yingjiang (YJ), and Tengchong (TC) populations were in zone I. The Hekou (HK), Pingbian (PB), Jinping (JP), and Maguan (MG) populations were in zone II. The Fugong (FG), Lushui (LS), Gongshan (GS), and Dulongjiang (DLJ) populations were in zone IV. Therefore, these 12 populations were split into three groups (Figure 5B), and OPLS-DA models were then established. The results showed that zones I and II differed markedly, with Q² greater than 0.58 and a p-value smaller than 1.2 × 10⁻⁷ (Figure 7(A1)). Zones I and IV showed significant differences, with Q² greater than 0.75 and a p-value smaller than 1.9 × 10⁻¹² (Figure 7(B1)). Furthermore, zones II and IV showed differences with Q² greater than 0.81 and a p-value smaller than 2.1 × 10⁻¹⁵ (Figure 7(C1)). The compounds that could be used for discrimination are shown in Figure 7(A2,B2,C2).

Conclusions
Over hundreds of years of cultivation, A. tsaoko has come to be distributed in three of the floristic zones in Yunnan province: the south and southwest zone (zone I), the southeast zone (zone II), and the west and northwest zone (zone IV). To date, its morphological characteristics differ more between populations than within a population. The correlation analysis of the morphological variations and geographical conditions showed that high-altitude localities are beneficial for the accumulation of biomass. Among these 12 populations, the fruits from Yuanyang (YY) have the highest fruit weight, fruit size, and 1000 seeds' weight. Therefore, Yuanyang (YY) is favorable for A. tsaoko fruit biomass.
Focusing on the relative quantity of compounds in the essential oil and the morphological characteristics, it is interesting to find that the morphologies of the A. tsaoko fruit and seed could serve as a reference when selecting materials with a specific function or odor type. For example, a longer fruit means a stronger aroma and potentially higher pharmacological activity in terms of acaricidal, antiprotozoal, and antioxidant properties. Meanwhile, more attention must be paid to the harvest date during collection.
Qualitative and quantitative analyses of compounds in the essential oil could be used to distinguish the producing area of the A. tsaoko fruit. Moreover, the locality of growth might indicate the component characteristic in the essential oil of the A. tsaoko fruit. These results are crucial in realizing the determination of botanical origin and evaluating the quality of A. tsaoko fruit. Meanwhile, it is clear that various other studies on this plant deserve more attention.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/agronomy11040744/s1. Table S1: The optimised temperature program of GC/MS. Table S2: Selected ions used in SIM mode. Figure S1: PCA score plot of samples and QC. Figure S2: Change of RSD after SVR normalization.