Characteristics of non-volatile metabolites in fresh shoots from tea plant (Camellia sinensis) and its closely related species and varieties

Tea plant (Camellia sinensis) and its closely related species and varieties belong to Sect. Thea (L.) Dyer, Camellia L. The fresh shoots of Sect. Thea (L.) Dyer species and varieties contain abundant compounds, but how these compounds vary among tea species and varieties is unclear. Fresh shoots from 336 accessions of C. sinensis and its closely related species and varieties were harvested, and their non-volatile metabolites were detected by UPLC-MS (ultra-performance liquid chromatography-mass spectrometry). A total of 374 non-volatile metabolites were identified, which can be divided into 27 categories. Among them, 32 compounds were flavonoid polymers. Based on metabolite composition, the tea plants were divided into two groups according to the Calinski criterion. The top 30 differential metabolites among C. sinensis var. sinensis, C. sinensis var. assamica, C. sinensis var. pubilimba, C. tachangensis and C. taliensis belong to amino acids and their derivatives, benzoic acid derivatives, carbohydrates, coumarins, flavonol glycosides, organic acids, and quinic acid and its derivatives. The results provide new insights into the characteristic metabolites of tea plant and its closely related species and varieties.


INTRODUCTION
The tea plant is a perennial economic crop with a long productive lifespan. Its fresh shoots are processed into green, black, Oolong, white, yellow and dark tea [1]. Numerous metabolites in fresh tea shoots are precursors that undergo a series of changes during tea manufacturing and contribute significantly to tea quality and flavor [2]. These metabolites also play important roles in the stress resistance of the tea plant itself and in the functional effects of tea on human health. Flavonoids, theanine, caffeine and theobromine are the dominant metabolites in Sect. Thea (L.) Dyer [1].
Tea is mainly produced from three kinds of tea plants: Camellia sinensis (L.) O. Kuntze var. sinensis, C. sinensis var. assamica (Masters) Kitamura and C. sinensis var. pubilimba Chang. In addition, wild tea plants, including C. tachangensis F. C. Zhang, C. taliensis (W. W. Smith) Melchior, C. crassicolumna Chang and C. gymnogyna Chang, also belong to Sect. Thea (L.) Dyer. Sect. Thea is divided into the above-mentioned species and varieties according to the number of ovary chambers, the number of style splits and the presence or absence of ovary pubescence, combined with corolla size and the morphology of the tree, branches and leaves [3]. C. tachangensis was first identified in Yunnan, China, in the 1980s [2]. C. taliensis is usually found in evergreen broad-leaved forests from 1,300 to 2,400 m above sea level, mainly in southwestern Yunnan, China, and its adjacent regions [2]. The chloroplast genomes of C. tachangensis, C. taliensis, C. gymnogyna and C. sinensis differ significantly [2]. With the rapid development of metabolomics technology, a number of papers have described the metabolites in leaves of C. sinensis. They reveal the effects of temperature, light, water, fertilizer, climate, altitude and other factors on the metabolites of the tea plant.
Some studies have examined the metabolic characteristics of wild tea plants. C. crassicolumna has an abundance of phenols and no caffeine or theophylline in its leaves [2]. Zhang et al. [2] found that C. tachangensis had significantly higher contents of epicatechin and epigallocatechin than C. sinensis. In contrast, Yu [4] reported that C. tachangensis had low accumulation of polyphenols, especially a low content of catechins. Gujing tea (C. gymnogyna) in Guizhou province has a high content of proanthocyanidins and galloylquinic acid [5], and this species in Guangxi province contains three different purine alkaloids, among which theobromine content was the highest [6]. However, each of these studies included fewer than three wild tea plants, some conclusions are contradictory, and the results can hardly reflect comprehensive metabolic profiles, let alone support a comparison of metabolites between tea and its wild relatives. Therefore, we gathered fresh shoots in spring from 336 accessions of cultivated tea and its wild species and varieties, and determined their metabolic components and contents. We compared their metabolites with the aim of identifying characteristic metabolites. The current study elucidates the common and unique features of different tea plant species and varieties, especially in flavonoids, amino acids and alkaloids.

Overview of metabolites in tea plants
The Pearson correlation coefficient was calculated between every pair of QC samples. Supplemental Fig. S1 shows that the correlations range from 0.92 to 1.00, suggesting that the instrument was stable. Three methods were used to identify compounds: (1) retention time and primary and secondary fragments were matched to standards; (2) primary and secondary fragments were matched to the literature; (3) the remaining compounds were annotated using online databases. A total of 4,836 metabolites were detected, 374 of which were identified (Supplemental Table S1). Specifically, 117 and 18 of them were identified through standards and literature, respectively, and the remaining 239 were identified via online databases.
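The QC stability check described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline (they report using R); the sample names and intensity values below are hypothetical placeholders, and the helper names `pearson` and `pairwise_qc_correlations` are introduced only for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def pairwise_qc_correlations(qc):
    """Return {(name_a, name_b): r} for every pair of QC injections."""
    return {
        (a, b): pearson(qc[a], qc[b])
        for a, b in combinations(sorted(qc), 2)
    }


# Hypothetical peak intensities of four metabolites in three QC injections.
qc_intensities = {
    "QC1": [1.00e6, 2.50e5, 7.80e4, 3.10e6],
    "QC2": [1.05e6, 2.40e5, 8.10e4, 3.00e6],
    "QC3": [0.98e6, 2.60e5, 7.50e4, 3.20e6],
}

for pair, r in pairwise_qc_correlations(qc_intensities).items():
    print(pair, round(r, 4))
```

Coefficients close to 1 for every pair, as in the reported range of 0.92 to 1.00, indicate that repeated QC injections gave consistent signals.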

Metabolomic characteristics of different species and varieties of Sect. Thea
CRL, CCC, CMC and COC were excluded from the DAM analysis among Sect. Thea plants because each had fewer than four samples. The remaining tea plants were CSS, CSA, CSP, CTF and CTM. The top 30 DAMs are displayed in Fig. 4. A total of 16 metabolites were in the top 30 in two consecutive years, including amino acids and their derivatives, benzoic acid derivatives, carbohydrates, coumarins, and quinate and its derivatives (Table 1). These metabolites could be used to discriminate among Sect. Thea plants. Among them, the content of L-pyroglutamic acid, L-serine, sinapaldehyde glucoside, 7-ethoxycoumarin and D-quinic acid was highest in CSS. The content of N-acetyl-DL-tryptophan and azelaic acid was highest in CSA. The content of 7-hydroxycoumarin, diGC-GA, 2-acetamido-2-deoxyglucose, kaempferitrin, nicotiflorin, traumatic acid and neochlorogenic acid was highest in CTF. The content of 5'-xanthylic acid and chlorogenic acid was highest in CTM.

Polymers of flavonoids
Flavonoid polymers were abundant among the identified metabolites (Table 2). A total of 32 flavonoid polymers were identified, consisting of flavonoid, sugar and gallic acid units. Most of them were catechin polymers and flavonoid glycosides. Notably, flavonol was the major aglycone of the flavonoid glycosides. A total of 23 flavonol glycosides were identified in the current study, whose major sugar ligands were glucose and rhamnose. In addition, almost all the sugar ligands were hexoses.

DISCUSSION
The non-volatile compounds in tea have been divided into six groups: hydrolysable tannins, flavan-3-ols and their derivatives, flavonoids, alkaloids and theanine, simple phenols and their derivatives, and terpenes [13]. In the present study, the non-volatile compounds in tea were divided into 27 groups according to functional group and chemical structure. Organic acids were the largest class by number of compounds. Gallic acid, benzoic acid, chlorogenic acid, α-ketoglutaric acid, malic acid, citric acid and oxalic acid were the predominant organic acids. Organic acids play an important role in the metabolism of tea plants and in tea quality, usually as intermediate products of carbohydrate decomposition. The stability, water solubility and activity of benzoic acid change after methylation or glycosylation [7]. Benzoic acid, chlorogenic acid, gallic acid and their derivatives are the essential phenolic acids in tea plants. Their content decreases as tea leaves mature [8]. The content of benzoic acid near the flowers declines during flowering [9]. The content of gallic acid increases dramatically during green and black tea manufacturing [10]. White and green tea contain high levels of chlorogenic acid, with white tea containing the most [11]. During deoxidation of Oolong tea, gallic acid is generated from the degradation of ester catechins [12]. These studies imply that organic acids are of great importance for the growth and development of the tea plant and for the quality of tea.
In a narrow sense, tea plants are C. sinensis and its varieties. C. sinensis (L.) O. Kuntze var. sinensis is mainly distributed south of the Yangtze River. It has spread eastward to Japan and Korea, westward to Tibet, China, and southward to northern Burma [13]. C. sinensis var. assamica (Masters) Kitamura is mainly distributed in Yunnan and its neighboring regions and countries, such as Vietnam, Thailand and Burma [13]. The majority of wild tea plants are CTF, CCC, CTM and CGC. CTF grows in eastern Yunnan, southwestern Guizhou and western Guangxi [14]. Compared to CSS, CTF contains higher amounts of EC, EGC, di-EC, ECG-EC, EGC-EC and tri-EC, and a lower content of ester catechins [14]. CCC is rich in phenols. A total of 18 phenols were identified in C. crassicolumna var. multiplex, including four flavanols, six flavonol glycosides, three hydrolysable tannins, two chlorogenic acid derivatives and three simple phenols [15]. No caffeine or theophylline was found in one C. crassicolumna var. multiplex [13]. In two CCC samples, the caffeine content was 0.83% and 0.059%, and the theobromine content was 0.05% and 0.07%, respectively [16]. However, in the present study, 3.26 ± 0.09% caffeine was detected in one CCC by HPLC. It is likely that alkaloid content differs among CCC plants. More CCC samples need to be gathered to explore their metabolic characteristics in the future. CTM, one of the predominant tree species in southern tropical evergreen broad-leaved forests, is distributed from the middle reaches of the Lancang River in the Hengduan Mountains of Yunnan to the Irrawaddy River Basin [17]. CTM is a relative of the cultivated tea plant [18]. In CTM from Lincang, 19 phenols were identified, comprising eight hydrolysable tannins, six catechin derivatives, three quinic acid aromatic esters and two simple phenolics, along with caffeine [19].
Theobromine, caffeine and theacrine were detected in one CGC from Dayao Mountain, and theobromine was its major alkaloid [6]. The content of theacrine was 3.2 times that of caffeine in CGC [20]. The results of the current study demonstrate that the composition and proportion of amino acids and their derivatives, benzoic acid derivatives, carbohydrates, coumarins, flavonol glycosides, nucleotides and their derivatives, organic acids, and quinic acids and their derivatives differ among Sect. Thea species and varieties. Flavonoids are the essential secondary metabolites in the tea plant and are closely related to the flavor and quality of brewed tea [21]. The astringency of the tea infusion is mainly caused by these substances [1]. Glycosylation is a ubiquitous modification process in plants [22]. It changes the stability, solubility, biological activity and pharmacokinetics of many substances, which also helps tea plants resist biotic and abiotic stresses [22]. Glycosides are important plant metabolites with free-radical-scavenging, anti-radiation, anti-diabetic and immunity-enhancing functions [23]. Various flavonoid glycosides are formed in plants under the catalysis of UFGT. Phenolic acids, flavonols, flavonoids and isoflavones are the main substrates of UFGT [24]. Tea shoots are rich in flavonoid polymers, mainly flavonoid glycosides and anthocyanins. Natural flavonoids usually exist in glycoside form. The precursors of flavonols, flavonoids, anthocyanins, saponins and aroma substances mainly exist as glycosides in tea. Flavonoid glycosides are more soluble than flavonoid aglycones and therefore have a greater influence on tea taste. The content of naringenin-7-O-glucoside, quercetin-3-O-glucoside and kaempferol-3-O-(6''-O-p-coumaroyl)-glucoside in fresh tea leaves is higher in August than in other seasons [25]. This may explain why summer and autumn teas are more bitter and astringent.
Myricetin-3-O-rhamnoside, myricetin-3-O-galactoside, myricetin-3-O-glucoside, quercetin-3-O-glucosyl-rhamnosyl-galactoside, quercetin-3-O-glucosyl-rhamnosyl-glucoside, vitexin-2''-O-rhamnoside, kaempferol-3-O-rutinoside, kaempferol-3-O-galactoside and kaempferol-3-O-glucoside have been detected in tea [26]. A total of 202 glycosides have been identified in green tea, and their sugar ligands were mainly glucose, galactose, rhamnose, rutinose and primeverose [27]. The higher the content of glycosides bound to EC, the lower the content of EC dimers, which indicates that EC glycosides are important donors for EC dimers [16]. Theanine is likely converted into N-ethylpyrrolidone after deamination and decarboxylation, and may then combine with the catechin A ring at the 6-C or 8-C position to form a theanine-catechin complex [28]. However, polymers of theanine and catechin were not detected in this study, which may be due to the low content of theanine and catechin or to the low instrument response. Specific methods for detecting these polymers should be established in the future.

CONCLUSIONS
A total of 374 non-volatile metabolites were identified in tea plant (Camellia sinensis) and its closely related species and varieties. Organic acids and flavonoids are the most abundant and the most characteristic non-volatile metabolites in tea plants, respectively. 7-Hydroxycoumarin, propentofylline, 1,2-di-O-galloyl-HHDP-glucose and theacrine are the top four DAMs between C. sinensis and its relatives. The top 30 DAMs among CSS, CSA, CSP, CTF and CTM fall into seven categories: amino acids and their derivatives, benzoic acid derivatives, carbohydrates, coumarins, flavonol glycosides, organic acids, and quinic acid and its derivatives.

Plant materials
The  Table S2). In the first round, one bud and two leaves were harvested from healthy tender shoots from March 16 to April 30, 2019. All tea plants in the natural population were cultivated under the same horticultural conditions and management in the China National Germplasm Hangzhou Tea Repository in Hangzhou, China. Samples were frozen in liquid nitrogen immediately after plucking and stored at −80 °C.

Sample extraction
Ten milliliters of 70% methanol was added to 0.2 (± 0.001) g of tea powder. The mixture was extracted ultrasonically for 0.5 h. After standing at 4 °C for 2 h, the supernatant was passed through a 0.22 µm membrane filter. The extracts were transferred to amber injection vials and stored at −80 °C until injection.
Then, 100 µL of the extract was taken as a quality control (QC) sample, and 10 µL of internal standard (0.025 mg/mL sulfacetamide) was added to the sample before testing.

LC-MS conditions
Metabolite detection was performed using a UPLC (Thermo Scientific Dionex Ultimate 3000, USA)-Q-Orbitrap (Thermo Scientific Q Exactive, USA) system with a C18 column (SB-AQ, 1.8 µm, 2.1 mm × 100 mm, Agilent, USA). Mobile phases A and B were 0.1% (v/v) formic acid in water and 0.1% (v/v) formic acid in acetonitrile, respectively. The gradient elution program was as follows: 0−6 min, 5%−20% B; 6−10 min, 20%−95% B; 10−11.5 min, 95% B; 11.5−15 min, 95%−5% B. The sample chamber temperature and the column temperature were 4 °C and 40 °C, respectively. The injection volume was 2 µL and the flow rate was 0.3 mL/min. The relative collision energies were 15, 30 and 60. The m/z scanning range was 70−1,000. The spray voltage was 3.5 kV, with the drying gas temperature at 320 °C, the auxiliary gas temperature at 350 °C and the sheath gas flow rate at 40 arb. For primary (MS1) scans, the resolution was 70,000 and the automatic gain control target was 1e6. For secondary (MS2) scans, the resolution was 17,500 and the automatic gain control target was 1e5.
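To make the gradient program easier to read, the following Python sketch computes the percentage of mobile phase B at any time point. It assumes that each ramp is linear between the stated breakpoints, which the text does not specify; the names `GRADIENT` and `percent_b` are introduced here for illustration only.

```python
# Breakpoints of the gradient program as (time in min, % mobile phase B).
GRADIENT = [
    (0.0, 5.0),
    (6.0, 20.0),
    (10.0, 95.0),
    (11.5, 95.0),
    (15.0, 5.0),
]


def percent_b(t_min):
    """Percentage of mobile phase B at time t_min.

    Assumes linear interpolation within each segment (an assumption,
    not stated in the method description).
    """
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-15 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


for t in (0, 3, 6, 8, 11, 15):
    # Mobile phase A makes up the remainder of the eluent.
    print(f"{t:>4} min: {percent_b(t):5.1f}% B, {100 - percent_b(t):5.1f}% A")
```

Under the linearity assumption, the eluent reaches 12.5% B at 3 min and 57.5% B at 8 min, holds at 95% B from 10 to 11.5 min, and returns to the starting composition by 15 min.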

Construction of the standard database
Standards were dissolved in 70% methanol to a stock concentration of 1.00 mg/mL. The stock solution was then diluted to 50 ng/mL.
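The dilution from 1.00 mg/mL to 50 ng/mL corresponds to a 20,000-fold dilution. The short Python sketch below works through that arithmetic. The 10 mL final volume and the two-step split are hypothetical examples, not details from the method, and the function names are illustrative.

```python
NG_PER_MG = 1_000_000  # 1 mg = 1,000,000 ng


def dilution_factor(stock_mg_per_ml, working_ng_per_ml):
    """Fold dilution needed to go from the stock to the working solution."""
    return stock_mg_per_ml * NG_PER_MG / working_ng_per_ml


def stock_volume_ul(final_volume_ml, factor):
    """Volume of stock (in µL) needed for a single-step dilution."""
    return final_volume_ml * 1000 / factor


factor = dilution_factor(1.00, 50)
print(factor)  # 20000.0

# Hypothetical: a single-step dilution into 10 mL would need only 0.5 µL
# of stock, which is hard to pipette accurately.
print(stock_volume_ul(10, factor))  # 0.5

# Hypothetical two-step alternative: 100-fold, then 200-fold.
step1, step2 = 100, 200
assert step1 * step2 == factor
```

Because a 20,000-fold dilution in one step requires a very small stock volume, serial dilution is a common practical choice; the source does not state which approach was used.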

Metabolic data processing
Information such as substance name, mass-to-charge ratio and file location, retrieved from standards and the literature, was entered into mzVault to build a local library. The local library and sample information were imported into Compound Discoverer 2.1 to characterize and quantify metabolites. Substances in the local library were characterized according to RT (retention time) and the accurate masses from primary and secondary mass spectrometry, with ΔRT ≤ 0.5 min and Δm/z ≤ 5 ppm. Substances not in the local library were annotated by aligning their accurate masses with the primary and secondary mass spectra of substances in the public databases AraCyc, BioCyc, Human Metabolome Database, KEGG, Lipid Maps and PlantCyc, with Δm/z ≤ 5 ppm. Solvent mass spectra were used to remove background noise. QC was used for peak alignment and correction.
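The matching tolerances above can be expressed as a small check. The Python sketch below illustrates the ΔRT ≤ 0.5 min and Δm/z ≤ 5 ppm criteria only; it is not how Compound Discoverer is implemented. The observed values and retention times are hypothetical, while 195.0877 is the theoretical m/z of protonated caffeine.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def matches_library(observed_mz, observed_rt, reference_mz, reference_rt,
                    max_ppm=5.0, max_delta_rt=0.5):
    """True if a feature lies within both the m/z and RT tolerances."""
    within_mass = abs(ppm_error(observed_mz, reference_mz)) <= max_ppm
    within_rt = abs(observed_rt - reference_rt) <= max_delta_rt
    return within_mass and within_rt


# Library entry: caffeine [M+H]+ at m/z 195.0877; the RT of 4.30 min is
# a hypothetical value chosen for illustration.
REF_MZ, REF_RT = 195.0877, 4.30

print(round(ppm_error(195.0880, REF_MZ), 2))            # about 1.54 ppm
print(matches_library(195.0880, 4.10, REF_MZ, REF_RT))  # within both limits
print(matches_library(195.0900, 4.10, REF_MZ, REF_RT))  # mass error too large
print(matches_library(195.0880, 5.10, REF_MZ, REF_RT))  # RT shift too large
```

For public-database annotation, where no retention time is available, only the mass criterion would apply; in this sketch that corresponds to passing a very large `max_delta_rt`.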

Data analysis
Metabolite classification statistics were compiled using Microsoft Office Excel 2010. The Pearson correlation coefficient between QCs was calculated using the ggpubr package in R 3.6.2 and visualized with the ggplot2 package. Cluster analysis was performed on the samples according to the composition and content of metabolites. The fpc and vegan packages were used to determine the number of clusters by k-means, and the optimal k value was determined based on the Calinski criterion. The distance matrix between samples was calculated with the cluster package, and hierarchical clustering used the ward.D2 method. The ape package was used for visualization. DESeq2 was used to calculate log2 (fold change) and p values between groups after variance analysis. The p values were corrected by the Benjamini-Hochberg method to obtain adj-p. DAMs (differentially accumulated metabolites) were screened with fold change > 2 and adj-p < 0.05. Volcano plots were created using the ggplot2 package.
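The DAM screening step can be sketched as follows. This is a plain-Python illustration of the Benjamini-Hochberg adjustment and the stated thresholds, not the DESeq2 workflow the authors ran in R. It assumes that "fold change > 2" is applied in both directions, that is |log2 fold change| > 1, which the text does not state explicitly. The metabolite names and values are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen_dams(names, log2_fold_changes, pvalues,
                fold_change_cutoff=2.0, alpha=0.05):
    """Names passing |log2FC| > log2(cutoff) and adj-p < alpha.

    The two-sided fold-change test is an assumption of this sketch.
    """
    adj = benjamini_hochberg(pvalues)
    threshold = math.log2(fold_change_cutoff)
    return [
        name
        for name, lfc, q in zip(names, log2_fold_changes, adj)
        if abs(lfc) > threshold and q < alpha
    ]


# Hypothetical comparison of four metabolites between two groups.
names = ["metabolite_a", "metabolite_b", "metabolite_c", "metabolite_d"]
log2fc = [1.5, 0.5, -2.0, 3.0]
pvals = [0.01, 0.04, 0.03, 0.005]

print(benjamini_hochberg(pvals))
print(screen_dams(names, log2fc, pvals))
```

In this toy example all four adjusted p values fall below 0.05, so the fold-change threshold decides the outcome: `metabolite_b` is excluded because its change is smaller than twofold.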