The Content of Volatile Organic Compounds in Calypogeia suecica (Calypogeiaceae, Marchantiophyta) Confirms Genetic Differentiation of This Liverwort Species into Two Groups

Calypogeia is a genus of liverworts in the family Calypogeiaceae. The subject of this study was Calypogeia suecica. Samples were collected from various locations in southern Poland: 25 in 2021 and 25 in 2022. Volatile organic compounds (VOCs) were analyzed by gas chromatography–mass spectrometry (GC–MS). A total of 107 compounds were detected, of which 38 were identified. The identified compounds were dominated by sesquiterpenes (up to 34.77%) and sesquiterpenoids (up to 48.24%). The samples also contained aromatic compounds (up to 5.46%), aliphatic compounds (up to 1.66%), and small amounts of monoterpenes (up to 0.17%) and monoterpenoids (up to 0.30%). Based on the observed differences in VOC composition, the plant material was divided into two groups, consistent with the genetic differentiation of the species.

Calypogeia suecica is regarded as a boreal-montane species, and its presence has been reported in North America, Europe, and Asia. In Poland, it is widespread in the south, specifically in the mountains, and is very rare in the northeastern region. Calypogeia suecica is an obligate xylicole; it grows almost exclusively on moist decorticated logs in a later stage of decay, mainly in humid stream valleys in coniferous forests. It is the only Calypogeia species that grows on rotting logs [1]. Calypogeia suecica is a small plant; its shoots are up to 2.0 cm long and 1.8 mm wide (Figure S1). Characteristic features that distinguish C. suecica from other Calypogeia species are its almost orbicular leaves (composed of a single layer of small cells) with a truncate apex, its deeply divided underleaves, which are 2–3× wider than the stem and have lateral angulation or teeth, and the colorless oil bodies present in all leaf and underleaf cells [5] (Figure S2). This is a dioecious species that is characterized by low morphological variability [5]. In Europe, two cytoforms of C. suecica have been reported, n = 9 from Germany and Poland [6,7] and n = 18 from Britain [8], which may support the hypothesis that an unrecognized species is present within C. suecica. Recently, molecular studies of the chloroplast genome have shown the genetic differentiation of C. suecica into two groups [9].
The presence of various volatile organic compounds (VOCs) in plants, animals, or microorganisms has led to the development of modern identification techniques such as high-performance liquid chromatography (HPLC) and gas chromatography–mass spectrometry (GC–MS), which have become popular identification tools due to their objectivity and accuracy [10]. These methods are used to identify species, varieties, and strains in a wide range of organisms, as well as to assess the quality of various products and foods [11–15]. Liverworts are also rich in a wide range of biologically active compounds, such as terpenoids and aromatic compounds, which are synthesized and accumulated in oil bodies, cell structures characteristic of this group of plants [16–18]. Many of these compounds are specific to liverworts [16,19].
Liverworts are small plants with a very simple morphological structure, and there are few good diagnostic features on the basis of which species can be recognized. Moreover, some liverwort species are genetically heterogeneous and, in fact, consist of morphologically cryptic or nearly cryptic taxa [20]. Therefore, the identification of liverwort species based solely on morphological characteristics has proven to be insufficient in many cases [21,22]. As previous studies have shown, volatile organic compounds (VOCs) can be helpful in identifying difficult-to-recognize, closely related liverwort species belonging to the same genus, e.g., Pellia, Riccardia, Pallavicinia, Mylia, and Porella [16,23–26]. So far, the content of chemical compounds has been determined for about 10% of liverwort species [24], but only a few studies have combined chemotaxonomic analysis with molecular identification of the studied plants. As work on Conocephalum conicum [27] and Aneura pinguis [28] has shown, chemotaxonomic studies based on genetically identified material enable correct differentiation of cryptic species based on the detected VOCs. Additionally, chemotaxonomic studies of species belonging to the genus Calypogeia have shown that some species differ in their chemical composition [24,29,30]. So far, the analysis of chemical composition has been performed for only four European species of the genus Calypogeia: C. azurea, C. muelleriana, C. fissa, and C. suecica [31–34]; these analyses were carried out in the 1990s. However, at the time of these previous studies, there was no knowledge about the genetic diversity and the presence of hidden species in the genus Calypogeia, which were revealed by later genetic studies, e.g., within C. muelleriana, C. sphagnicola, C. azurea [35], and C. suecica [9]. To our knowledge, further analyses of Calypogeia species were conducted by Guzowska [36] and Wawrzyniak [37], and these studies were devoted to the seasonal variability of C. azurea and C. integristipula. However, so far, there have been no studies on the chemotaxonomic differentiation within Calypogeia species that combine chemical and genetic analyses.
The purpose of our study was to investigate whether the groups distinguished within C. suecica on the basis of genome analysis also differ in terms of composition and content of volatile chemical compounds.

Volatiles Present in Calypogeia suecica
Fifty Calypogeia suecica samples were tested for their content of volatile organic compounds (VOCs). Twenty-five samples were collected in 2021 (Table S1a-e) and twenty-five in 2022 (Table S2a-e). Due to the observed differences in the composition of volatile organic compounds, the samples were divided into two groups. The first group included 32 samples (CSU1-1 to CSU1-32), while the second group included 18 samples (CSU2-1 to CSU2-18).
One hundred and seven volatile compounds were detected in the biological material tested, of which thirty-eight were identified. The proportion of identified compounds in the first group ranged from 57.21% to 72.56%. In the second group, however, it ranged from only 20.03% to 32.06%. Compounds that could not be identified were described using three characteristic ions: a molecular ion and two ions with the highest intensity. Based on the GC-MS analysis, the dominant groups of compounds in the tested plant material were sesquiterpenes and sesquiterpenoids.
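The three-ion description of unidentified compounds can be illustrated with a minimal sketch. The spectrum below is invented for illustration and is not data from this study; the function name and the choice to exclude the molecular ion when picking the two most intense ions are assumptions, since the text does not state how that case was handled.

```python
def describe_unknown(spectrum, molecular_ion):
    """Return (molecular ion, two most intense other ions) for an unidentified peak.

    spectrum: dict mapping m/z (int) to relative intensity.
    molecular_ion: m/z assigned as the molecular ion by the analyst (hypothetical input).
    """
    if molecular_ion not in spectrum:
        raise ValueError("molecular ion must be present in the spectrum")
    # Rank ions by intensity, highest first; ties are broken by lower m/z.
    ranked = sorted(spectrum, key=lambda mz: (-spectrum[mz], mz))
    # Assumption: the molecular ion is reported separately, so it is skipped here.
    top_two = [mz for mz in ranked if mz != molecular_ion][:2]
    return (molecular_ion, *top_two)


# Invented example spectrum (m/z -> relative intensity), not from the study.
example_spectrum = {41: 55.0, 93: 100.0, 105: 62.0, 119: 80.0, 220: 12.0}
descriptor = describe_unknown(example_spectrum, molecular_ion=220)
print(descriptor)
```

Such a descriptor, together with the retention index, gives each unidentified peak a reproducible label that can be matched across samples.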
Sesquiterpenoids were represented by bisabola-2,10-diene[1,9]oxide (80) and 4,5-dehydroviridiflorol (73). In the first group, these compounds constituted from 26.25% to 48.24% of the composition. In the second group, identified sesquiterpenoids constituted only from 2.26% to 5.61% of the composition. However, the second group contained compounds with retention indices of 1532 (66) and 1594 (79) in amounts ranging from 9.69% to 18.21% and from 8.81% to 19.84%, respectively. Their mass spectra suggested that these compounds are sesquiterpenoids. It should be noted that these compounds were not detected in the first group.
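Retention indices such as 1532 and 1594 are obtained by interpolating a compound's retention time between the bracketing n-alkanes. A minimal sketch of the linear (temperature-programmed) retention index is given below; the alkane retention times are invented for illustration, and the function name is hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(t_x, alkane_times):
    """Linear retention index of a peak eluting at t_x (min).

    alkane_times: dict mapping alkane carbon number to retention time (min).
    Uses RI = 100 * (n + (N - n) * (t_x - t_n) / (t_N - t_n)), where n and N
    are the carbon numbers of the alkanes eluting just before and after t_x.
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if len(carbons) < 2 or t_x < times[0] or t_x > times[-1]:
        raise ValueError("retention time is outside the alkane calibration range")
    # Index of the alkane eluting at or before t_x (clamped to the last interval).
    i = min(bisect_right(times, t_x) - 1, len(times) - 2)
    n, big_n = carbons[i], carbons[i + 1]
    return 100 * (n + (big_n - n) * (t_x - times[i]) / (times[i + 1] - times[i]))


# Invented calibration: C15 elutes at 20.0 min and C16 at 23.0 min.
alkanes = {15: 20.0, 16: 23.0}
print(round(linear_retention_index(20.96, alkanes)))
```

With these assumed calibration times, a peak at 20.96 min falls 32% of the way between C15 and C16.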
In the first group, sesquiterpenes were present at levels ranging from 17.78% to 34.77%, whereas in the second group they ranged from 14.39% to 24.88%. γ-Curcumene (53) was the dominant sesquiterpene in the first group, at levels from 1.27% to 7.73%. In the second group, the dominant sesquiterpene was γ-bisabolene (60), detected at levels from 4.17% to 8.74%. Bicyclogermacrene (58) was present at similar levels in both groups: from 1.07% to 6.48% in the first group and from 1.03% to 6.11% in the second. A similar situation occurred with another sesquiterpene, anastreptene (32), which ranged from 4.60% to 9.53% in the first group and from 3.06% to 6.45% in the second. Conversely, α-zingiberene (57) occurred in the first group at levels from 0.09% to 1.32% and was not detected in the second group. Other sesquiterpenes identified in the tested plant material included δ-elemene (30), α-funebrene (33), β-elemene (34), 7-epi-sesquithujene (35), italicene (36), 9-aristolene (37), 1(10),8-aristoladiene (38), β-barbatene (46), α-curcumene (55), and β-sesquiphellandrene (64). Monoterpenes were also identified in the analyzed plant material: tricyclene (6), α-pinene (7), and β-pinene (10).
The identified monoterpenoids were bornyl acetate (26) and isobornyl acetate (27). However, both monoterpenes and monoterpenoids were detected only in small amounts in the studied samples. Monoterpenes were present in amounts of 0.02% to 0.07% in the first group and 0.03% to 0.17% in the second group. Monoterpenoids occurred in the first group at levels from 0.00% to 0.08%, and in the second group from 0.00% to 0.30%.
Aliphatic compounds were also found in the liverwort samples tested: hexanal (1), 3-methylbutanoic acid (2), 2-methylbutanoic acid (3), 3-hexen-1-ol (4), 1-hexanol (5), hexanoic acid (9), 1-octen-3-ol (11), 3-octanone (12), 3-octanol (13), and 2-ethylhexanoic acid (16), as were aromatic compounds: benzenemethanol (14), benzeneacetaldehyde (15), benzeneethanol (17), phenoxyethanol (23), and 1-phenoxy-2-propanol (24). The content of aliphatic compounds in samples belonging to the first group ranged from 0.12% to 1.48%. In the second group, it ranged from 0.25% to 1.66%. Aromatic compounds constituted 0.88% to 5.46% of the composition in the first group and 0.72% to 3.27% in the second group. Figures 1 and 2 show a comparison of the average content of the dominant sesquiterpenes and sesquiterpenoids in the 2021 and 2022 samples, divided based on the location of the habitat from which the material was collected, and further into groups 1 and 2.
In light of the information presented, the analyzed biological material could clearly be divided into two groups in terms of the composition of volatile secondary metabolites. Some fluctuations in the composition of volatile compounds were also observed, resulting from the location of the sites from which the samples were collected.
Based on the results collected in Tables S1a-e and S2a-e, the mean % values of compound contents were calculated, along with the standard deviation, for each collection place within a group and for the group in total. Table 1 presents the results for group 1.

Statistical Analysis of the Obtained Results
To investigate the variation in VOCs between the two genetic groups of C. suecica (groups 1 and 2), revealed on the basis of chloroplast DNA [9], the set of 107 detected compounds was subjected to multivariate statistical analyses.
First, a principal component analysis (PCA) was conducted. PCA is an unsupervised analysis used to reduce the dimensions of a large data set and to extract and visualize the hidden structure in the analyzed data. The explanatory and predictive abilities of the PCA model are evaluated on the basis of two parameters: R²X and Q². The closer R²X and Q² are to 1, the better the model fits. In the analysis of C. suecica samples, the model included five statistically significant components that explained 83.8% of the variation (R²X), with a predictive ability (Q²) of 75.7%. The first two principal components, PC1 and PC2, alone explained as much as 63.0% of the total variance (R²X), with values of 48.7% and 14.3%, respectively. The PCA revealed a clear separation of the studied samples. The 2D scatterplot shows that the main differentiation between the studied samples, corresponding to their genetic group affiliation (group 1 and group 2), occurred along the PC1 axis, which explained 48.7% and predicted 44.0% of the total variation (Figure 3). The analysis of factor loadings indicates that the samples of groups 1 and 2 differed primarily in the content of compounds 40, 48, 52, 60, 63, 66, 71, 79, 80, 91, and 106, which made the largest contribution to the PC1 axis. Variables 40, 48, 60, 63, 66, 71, 79, and 91 had high positive loadings (> 0.90), whereas 52, 80, and 106 had high negative loadings (< −0.90) (Figure S3). Therefore, samples belonging to group 1, located on the left (negative) side of the PCA diagram, are characterized by a lower concentration (or lack) of VOCs 40, 48, 60, 63, 66, 71, 79, and 91 compared to group 2, located on the right (positive) side. In turn, samples from group 1 had a higher content of compounds 52, 80, and 106 than those from group 2.
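How the explained variance per component (R²X) and the sample scores arise can be sketched in a few lines. This is only an illustration under stated assumptions: the software and scaling used in the study are not given in this section, the data matrix is invented, and the eigenvectors are found here by simple power iteration with deflation rather than by whatever algorithm the authors' software uses. The cross-validated Q² is not computed.

```python
import math


def pca_power(X, n_components=2, iters=1000):
    """Return (explained-variance ratios, per-component sample scores)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]  # mean-centre
    # Sample covariance matrix of the centred variables.
    C = [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(p)] for i in range(p)]
    total_var = sum(C[i][i] for i in range(p))
    ratios, scores = [], []
    for _component in range(n_components):
        v = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-uniform start
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iters):  # power iteration towards the top eigenvector
            w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        Cv = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        lam = sum(v[i] * Cv[i] for i in range(p))  # eigenvalue = variance on this PC
        ratios.append(lam / total_var)
        scores.append([sum(r[j] * v[j] for j in range(p)) for r in Xc])
        # Deflate so that the next pass finds the following component.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return ratios, scores


# Invented % contents of three compounds in four samples (two per group).
X = [[35.0, 0.0, 5.0], [34.0, 0.0, 6.0], [3.0, 12.0, 5.0], [4.0, 11.0, 6.0]]
ratios, scores = pca_power(X)
print([round(r, 3) for r in ratios])
print([round(s, 2) for s in scores[0]])
```

In this toy matrix the first component carries nearly all of the variance, and the PC1 scores of the two invented groups have opposite signs, which mirrors the kind of separation along PC1 described above.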
The separation of samples by genetic groups in PCA was confirmed by permutational multivariate analysis of variance (PERMANOVA, R² = 0.75, p < 0.001). Smaller differentiation within both groups (1 and 2) was observed along the PC2 axis, which explained 14.3% and predicted 9.9% of the total variation. This variation is related to geographical diversity and reflects the fact that the samples tested came from different geographical regions of Poland (Figure 4). On the PC2 axis, VOCs 11, 15, 16, and 74 had the highest positive factor loadings (> 0.70), while compound 45 had a negative factor loading (Figure S4). Analysis of samples from different regions of Poland revealed small differences in the geographical distributions of the groups studied. Group 2 occurred mainly in the Pieniny and rarely in the Beskid Sądecki and Bieszczady Mountains (only one location), while group 1 had a wider distribution; it occurred in the Tatry, Małe Pieniny, and Bieszczady Mountains, and less frequently in the Pieniny Mountains. It is worth emphasizing that at the site in the Łonny stream (in Pieniny), plants belonging to both groups grew together in one colony (samples CSU1-7 and CSU2-5), and each had the chemical composition typical of its group (Tables S3 and S4). We did not observe visible differences between samples from a given region, nor between subsequent years of sample collection (Figure 5).
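The PERMANOVA statistic reported above can be illustrated with a short sketch. It partitions the sum of squared distances into among-group and within-group parts, reports R² as the among-group share, and estimates p by permuting the group labels. The data are invented, and the use of Euclidean distance is an assumption, as the distance measure used in the study is not stated here.

```python
import random
from itertools import combinations


def _partition(D2, labels):
    """Pseudo-F and R2 from a matrix of squared distances and group labels."""
    n = len(labels)
    groups = set(labels)
    ss_total = sum(D2[i][j] for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(D2[i][j] for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    pseudo_f = (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))
    return pseudo_f, ss_among / ss_total


def permanova(X, labels, n_perm=999, seed=0):
    """Return (pseudo-F, R2, permutation p-value) for samples X and labels."""
    n = len(X)
    # Assumption: squared Euclidean distances between samples.
    D2 = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
          for i in range(n)]
    f_obs, r2 = _partition(D2, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _perm in range(n_perm):
        rng.shuffle(shuffled)  # break the link between samples and groups
        if _partition(D2, shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)


# Invented % contents of two compounds in six samples (three per group).
X = [[35.0, 0.0], [34.0, 1.0], [30.0, 0.5], [3.0, 12.0], [4.0, 11.0], [5.0, 13.0]]
labels = [1, 1, 1, 2, 2, 2]
f_stat, r2, p_value = permanova(X, labels, n_perm=199, seed=1)
print(round(r2, 3))
```

With only six samples the smallest attainable p-value is limited by the number of distinct label arrangements, which is why a data set of 50 samples, as in the study, is needed to reach p < 0.001.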
The same result is shown by hierarchical cluster analysis (HCA). The dendrogram plotted on the basis of the Euclidean distance for all 107 compounds using Ward's method divided the studied samples into two clades that were consistent with their affiliation to the genetic groups. A large Euclidean distance between the two groups (>150) supports the significant difference in chemical composition between them. In contrast, the Euclidean distances among samples within one group were about three times lower, suggesting that there are only small variations between samples from a given group (1 or 2) that grow in different geographic regions (Figure 6).
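The clustering step can be sketched as a naive agglomerative procedure using Ward's criterion, in which the two clusters whose merger least increases the within-cluster sum of squares are joined at each step. The data are invented, the function name is hypothetical, and reporting the merge height as the square root of twice that increase is an assumption chosen so that two single samples merge at their Euclidean distance.

```python
import math


def ward_linkage(X):
    """Return a list of merges: (members_a, members_b, merge height)."""
    clusters = {i: ([i], list(x)) for i, x in enumerate(X)}  # id -> (members, centroid)
    merges = []
    next_id = len(X)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((p - q) ** 2 for p, q in zip(ca, cb))
                # Ward's criterion: increase in within-cluster sum of squares.
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        ma, ca = clusters.pop(a)
        mb, cb = clusters.pop(b)
        na, nb = len(ma), len(mb)
        centroid = [(na * p + nb * q) / (na + nb) for p, q in zip(ca, cb)]
        clusters[next_id] = (sorted(ma + mb), centroid)
        merges.append((sorted(ma), sorted(mb), math.sqrt(2 * cost)))
        next_id += 1
    return merges


# Invented % contents of two compounds in six samples (three per group).
X = [[35.0, 0.0], [34.0, 1.0], [30.0, 0.5], [3.0, 12.0], [4.0, 11.0], [5.0, 13.0]]
merges = ward_linkage(X)
for members_a, members_b, height in merges:
    print(members_a, members_b, round(height, 2))
```

In this toy example the final merge joins the two invented groups at a height several times larger than any within-group merge, which is the pattern the dendrogram in Figure 6 is described as showing.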
Similarly, the differentiation of the analyzed samples according to their genetic group was shown using a heatmap. The studied samples were separated into two main clusters correlated with their genetic classification, the first cluster including all 32 samples belonging to group 1, and the second containing the 18 samples belonging to group 2. Heatmap analysis showed a high degree of correlation between the composition of VOCs in the analyzed samples of C. suecica and the genetic group. The compounds formed three groups, the content of which in the tested plants clearly changed depending on their genetic affiliation (Figure 7). Slight differences related to region can be seen in group 2, especially between the Bieszczady sample and the remaining samples, which had a lower content of compounds 38, 41, 45, 100, and 104 and a higher content of 1, 11, 15, 16, 25, 46, 77, and 105 (Figure S5).
Results consistent with PCA, HCA, and heatmap analysis were also obtained in supervised PLS-DA analysis. The PLS-DA scatter plot showed a clear differentiation between the two genetic groups of C. suecica, indicating that the groups differed in their composition and content of the detected volatile compounds (Figure 8). The model identified significant separation between the two groups (R²X = 0.98, Q² = 0.97, p < 0.001).
PLS-DA analysis makes it possible to determine the variable importance in projection (VIP), which identifies the variables that are key to the separation of the tested samples into groups. The importance of the VOCs with high VIP scores for the identification of C. suecica groups was also confirmed by univariate analyses, such as fold change (FC), t-test, and volcano plot. According to the t-test (Table 2), the mean concentrations of 78 chemical compounds differed significantly between the studied groups, while the volcano plot, which combines the results of the fold change (FC) and t-test analyses in a single graph, showed significant differences for 64 VOCs: 38 significantly down and 26 significantly up (Figures S6 and S7, Tables S7 and S8).
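The logic of combining fold change and a t-test, as a volcano plot does, can be sketched as follows. Several choices here are assumptions rather than the authors' settings: the Welch form of the t statistic, the cutoffs, the direction convention ("up" meaning higher in group 1), the small pseudo-count that avoids division by zero for undetected compounds, and the use of a cutoff on |t| in place of a p-value. All concentrations are invented.

```python
import math
from statistics import mean, variance


def log2_fold_change(group1, group2, pseudo=1e-6):
    """log2 of the ratio of group means; pseudo avoids log(0) for absent compounds."""
    return math.log2((mean(group1) + pseudo) / (mean(group2) + pseudo))


def welch_t(group1, group2):
    """Welch's t statistic for two independent samples."""
    diff = mean(group1) - mean(group2)
    se2 = variance(group1) / len(group1) + variance(group2) / len(group2)
    if se2 == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(se2)


def classify(group1, group2, fc_cutoff=1.0, t_cutoff=4.0):
    """Label a compound 'up', 'down', or 'ns' (hypothetical cutoffs)."""
    lfc = log2_fold_change(group1, group2)
    t = welch_t(group1, group2)
    if abs(lfc) < fc_cutoff or abs(t) < t_cutoff:
        return "ns"
    return "up" if lfc > 0 else "down"


# Invented % contents per sample: compound label -> (group 1 values, group 2 values).
data = {
    "80": ([34.0, 36.0, 35.0], [3.0, 3.5, 3.1]),
    "66": ([0.0, 0.0, 0.0], [11.0, 12.0, 13.0]),
    "58": ([3.0, 4.0, 5.0], [3.5, 4.5, 4.0]),
}
calls = {name: classify(g1, g2) for name, (g1, g2) in data.items()}
print(calls)
```

A compound is flagged only when both criteria are met, which is why the volcano plot reports fewer significant VOCs (64) than the t-test alone (78).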
The observed differences in the composition of volatile organic compounds in the two genetic groups of C. suecica [9] are so distinct that they can serve as a marker enabling the reliable identification of plants that cannot be recognized based on morphological features. Both groups of C. suecica also differed clearly in their VOC composition from the other European species of the genus Calypogeia analyzed so far [32–34,36,37]. The results of our study confirmed the high content of bisabola-2,10-diene[1,9]oxide (80) in C. suecica, a compound detected for the first time in this species by Warmers et al. [34]. It should be emphasized that bisabola-2,10-diene[1,9]oxide (80) occurred in both genetic groups of C. suecica; however, the groups differed significantly in the average content of this compound. In group 1 it was 34.89%, while in group 2 it was several times lower, with an average of 3.20% (Tables 1 and 2). Group 2 of C. suecica, unlike group 1, contained two unidentified compounds (RI = 1532, RI = 1594) in amounts of 11.64% and 16.17%, respectively. Both C. suecica groups, similarly to C. muelleriana and C. azurea, were found to include compounds of the azulene type [32,36]. Azulenes are considered important chemical markers of Calypogeia species [29]. According to previous studies, C. fissa is dominated by acorane-type sesquiterpenes, which distinguishes this species from C. suecica [33]. In the case of C. integristipula, the dominant compound is anastreptene (15.61–25.26%) [37], which was also identified in C. suecica, but at a much lower level (4.69–7.13%). Bisabola-2,10-diene[1,9]oxide (80) is also present in C. integristipula, but in small amounts compared to C. suecica. The importance of the composition of volatile compounds as chemical markers has been proven in many studies of various liverwort species [16,23–26]. Our present study has confirmed the value of the HS-SPME/GC-MS method for profiling volatile organic compounds (VOCs) in closely related and morphologically indistinguishable liverwort species, as has been shown for the cryptic species Conocephalum conicum [27] and Aneura pinguis [28].

The occurrence of C. suecica is limited to moist rotting wood, mainly fir or spruce, which has an appropriate degree of decay (decorticated logs) [1,8]. For this reason, the species under study is not abundant in nature and usually forms only small colonies. Due to the small number of sites and the specific nature of the habitat where this species occurs, we decided that all samples would be collected in only one growing season, in autumn. The autumn season was chosen because it ensures the optimal development phase and the best condition of the liverwort plants due to the prevailing weather conditions (higher humidity and lower temperatures). Unfortunately, this makes it impossible to carry out studies that illustrate changes in the composition of metabolites in different growing seasons, as was the case in [36]. The samples examined consisted of well-developed stems that were in a sterile state, that is, without reproductive structures. Research materials were collected from five geographical regions: the Bieszczady Mts, Beskidy Mts, Tatry Mts, Małe Pieniny Mts, and Pieniny Mts.
Five stems with a total weight of approximately 15 mg were taken from each sample. Only green plants that showed no signs of drying out and that were not affected by visible diseases were selected for further research. Before analysis, the samples were identified on the basis of morphological features, structure, and the distribution of the oil bodies in the leaves and underleaves [1,5,8]. The samples were classified into the two groups detected by Ślipiko et al. [9] based on the chloroplast barcode marker rbcL. All samples classified to group 1 had the same rbcL sequence as C. suecica accession number MK294008, and those classified to group 2 had the same rbcL sequence as C. suecica accession number MK294009, deposited in GenBank by Ślipiko et al. [9].

HS-SPME Extraction
The VOCs from Calypogeia suecica were extracted using the headspace solid-phase microextraction technique (HS-SPME). Fused silica fibers coated with divinylbenzene/carboxen/polydimethylsiloxane (DVB/CAR/PDMS) (Merck KGaA, Darmstadt, Germany) were employed. The fibers, 2 cm in length and covered with a 50 µm DVB layer and a 30 µm CAR/PDMS layer, were conditioned for 1 h at 270 °C according to the supplier's guidelines. A sample of 5 mg of clean and dried plant material was placed in a 1.7 mL vial, which was hermetically sealed with a Teflon/silicone septum and heated to 50 °C. The extraction of the compounds was conducted at 50 °C for 60 min. Desorption of analytes from the fibers was performed in the injection port of the gas chromatograph at 250 °C for 10 min. Sorption and desorption operations were performed using the TriPlus RSH autosampler (Thermo Scientific, Waltham, MA, USA).

GC-MS Analysis
The analysis of VOCs was performed using a previously described gas chromatography–mass spectrometry (GC-MS) method [36]. GC-MS analyses employing a Quadrex 007-5MS column (30 m, 0.25 mm, 0.25 µm) (Quadrex Corporation, Bethany, CT, USA) were conducted on a Trace 1310 (Thermo Scientific, Waltham, MA, USA). The ISQ QD mass detector (Thermo Scientific, Waltham, MA, USA) was operated at 70 eV in electron ionization (EI) mode within an m/z range of 30 to 550. Helium was used as the carrier gas at a flow rate of 1.0 mL/min. The oven temperature program was set to increase from 60 °C to 230 °C at a rate of 4 °C/min, followed by an isothermal hold at 230 °C for 40 min. The injector and transfer line temperatures were maintained at 250 °C. The samples were injected in splitless mode.
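The oven program implies a fixed run length, which can be worked out with a small sketch. It assumes that the ramp starts immediately at injection, since no initial hold is mentioned; the function names are hypothetical.

```python
def total_run_time(start=60.0, end=230.0, rate=4.0, hold=40.0):
    """Total run time in minutes: linear ramp followed by the final isothermal hold."""
    return (end - start) / rate + hold


def oven_temperature(t_min, start=60.0, end=230.0, rate=4.0, hold=40.0):
    """Oven temperature (degrees C) at t_min minutes after injection."""
    if t_min < 0 or t_min > total_run_time(start, end, rate, hold):
        raise ValueError("time is outside the programmed run")
    # During the ramp the temperature rises linearly; afterwards it stays at `end`.
    return min(start + rate * t_min, end)


print(total_run_time())        # ramp of 42.5 min plus 40 min hold
print(oven_temperature(10.0))  # temperature ten minutes into the ramp
```

Under this assumption the ramp takes (230 − 60) / 4 = 42.5 min, so each analysis lasts 82.5 min in total.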
The identification of components was confirmed by comparing the mass spectral fragmentation patterns with those stored in MS databases (NIST 2011 [38], NIST Chemistry WebBook [39], Adams 4 Library [40], and Pherobase [41]) and with those reported in the literature. Furthermore, retention indices determined relative to a homologous series of n-alkanes (C7-C30) (Merck KGaA, Darmstadt, Germany) were compared with published data. Quantitative data for the components were obtained by integrating the total ion

Figure 3. Two-dimensional PCA scatter plot based on all 107 compounds detected in the studied samples of C. suecica. The percentage of explained variance (R²X) is 48.7% for PC1 and 14.3% for PC2, and the predictive ability (Q²) is 44.0% and 9.9%, respectively. Different colors indicate genetic group affiliation. Shaded ellipses indicate the 95% confidence regions.

Figure 4. Two-dimensional PCA scatter plot based on all 107 compounds detected in the studied samples of C. suecica originating from different regions. The percentage of explained variance is 48.7% for PC1 and 14.3% for PC2. Different colors indicate the regions of origin.

Figure 5. Two-dimensional PCA scatter plot based on all 107 compounds detected in samples of C. suecica collected in different years. The percentage of explained variance is 48.7% for PC1 and 14.3% for PC2. The year of sample collection is marked in different colors. Shaded ellipses indicate the 95% confidence regions.

Molecules 2024, 21

Figure 6. Dendrogram of the examined C. suecica samples belonging to two genetic groups, constructed on the basis of the Euclidean distance according to Ward's linkage method using the 107 VOCs detected in the samples.

Figure 7. Clustering and heatmap analysis of the 107 chemical compounds detected in the studied Calypogeia suecica samples. The annotation bar shows clustering of the samples by group (class). Each cell is colored based on the concentration of the chemical compound in the sample.

Figure 8. Two-dimensional PLS-DA separation of two genetic groups of C. suecica samples based on all 107 detected compounds. Shaded ellipses indicate the 95% confidence regions.

Figure 9 presents 20 key VOCs (VIP > 1.40) differentiating C. suecica samples belonging to different genetic groups. The higher the VIP result, the greater the contribution of the chemical compound to group separation. Among the chemical compounds indicated as the most important for distinguishing the studied groups, group 1, compared to group 2, was characterized by a reduced content of 14 compounds (48, 79, 66, 71, 51, 40, 43, 72, 91, 56, 89, 68, 54, and 70) and an increased content of 6 compounds (52, 80, 103, 106, 53, and 61).

Figure 9. Variable importance in projection (VIP) identified by PLS-DA. The red and blue boxes on the right indicate whether the compound concentration is increased (red) or decreased (blue) in the samples of groups 1 and 2 of C. suecica.

Materials and Methods

Plant Material
Fifty samples of the liverwort C. suecica, collected in 2021-2022 from the natural environment in different regions of Poland, were analyzed. The collected samples were approximately 5-7 cm in diameter. Detailed information on the samples, including the places of collection, geographic coordinates, and the dates of collection, is provided in Tables S3-S6.

Table 2 presents the results for group 2 and also provides t-test values for the comparison of groups 1 and 2.

Table 1. Mean % and standard deviation of volatile compounds detected in group 1 (CSU1) of C. suecica samples, divided by group and collection place.

Table 2. Mean % and standard deviation of volatile compounds detected in group 2 (CSU2) of C. suecica samples, divided by group and collection place, and t-test values for groups 1 and 2.