METABOLIC AND TRANSCRIPTOME ANALYSIS REVEALS METABOLITE VARIATION AND FLAVONOID REGULATORY NETWORKS IN FRESH SHOOTS OF TEA ( CAMELLIA SINENSIS ) OVER THREE SEASONS

Metabolites, especially secondary metabolites, are very important in the adaption of tea plants and the quality of tea products. Here, we focus on the seasonal variation in metabolites of fresh tea shoots and their regulatory mechanism at the transcriptional level. The metabolic profiles of fresh tea shoots of 10 tea accessions collected in spring, summer, and autumn were analyzed using ultraperformance liquid chromatography coupled with quadrupole-obitrap mass spectrometry. We focused on the metabolites and key genes in the phenylpropanoid/flavonoid pathway integrated with transcriptome analysis. Multivariate statistical analysis indicates that metabolites were distinctly different with seasonal alternation. Flavonoids, amino acids, organic acids and alkaloids were the predominant metabolites. Levels of most key genes and downstream compounds in the flavonoid pathway were lowest in spring but the catechin quality index was highest in spring. The regulatory pathway was explored by constructing a metabolite correlation network and a weighted gene co-expression network. cluster analysis, and pathway analysis were conducted using MetaboAnalyst 4.0. The metabolites of cv. Longjing Changye over three seasons in the flavonoid pathway were compared by least significant difference (LSD) test.

Please note that technical editing may introduce minor changes to the manuscript text and/or graphics which may affect the content, and all legal disclaimers that apply to the journal pertain. In no event shall HEP be held responsible for errors or consequences arising from the use of any information contained in these "Just Accepted" manuscripts. To cite this manuscript please use its Digital Object Identifier (DOI(r)), which is identical for all formats of publication.

Introduction
The popularity of tea drinking is growing due to its unique flavor and perceived health benefits [1] . In addition to traditional tea, instant tea, tea beverage, tea bags, tea natural products and other processed tea products are popular. The quality of both traditional tea and new tea products relies strongly on the raw materials. Various metabolites in fresh shoots of tea contribute strongly to the quality and flavor of tea products since they are the precursors of compounds during processing. These metabolites also exhibit adaption of the tea plant to the constantly changing climate and environment. They are regulated by complex metabolism in the plant that is controlled by its genetic background and ambient conditions [2,3] . The ambient conditions include temperature, light, water and other climatic factors that cause substantial changes with the seasonal transition. Some studies have addressed the effect of surroundings on the chemical component of tea responsible for the flavor [4] . Caffeine, total flavonol, and ECG (epicatechin gallate) concentrations significantly decrease during the monsoon season [5] . Flavan-3-ols, theasinensins, procyanidins, quercetin-O-glycosides, apigenin-C-glycosides and amino acids of green tea exhibit sharp seasonal fluctuations [6] . Isocitric acid, theasinensin B, myricetin 3-O-galactosylrutinoside, prodelphinidin, B2 and quinic acid are at higher concentrations in Tieguanyin tea in spring than in autumn. Most flavonoids have an opposite seasonal alteration pattern [7] . In most circumstances, flavonoid biosynthesis increases with increasing temperature and light. The concentration of most flavonoids decreases greatly in light-sensitive tea leaves when the plant is exposed to light, which further improves tea quality [8] . Various metabolic patterns of catechin derivatives have been observed across cultivars and harvest seasons based on 30 selected compounds [3] . Some key metabolites that contribute to differences among tea samples over five months have been identified [9] .
The above information suggests that the tea plant responds to climatic and planting conditions via variation in compound biosynthesis and degradation. However, it is difficult to clarify the role of climate in the metabolism of tea using infused tea rather than fresh tea leaves. It is also difficult to elucidate a comprehensive seasonal effect on its physiologic and biochemical regulation under artificial cultivation conditions by controlling one or two specific factors. Moreover, comprehensive information is lacking to reveal the metabolic patterns in different seasons of the year through detection of metabolites of only one tea cultivar. In their adaption to changing conditions, tea plants will regulate gene expression and related metabolites. Thus, determination of gene expression and metabolites is helpful in understanding metabolic regulation in tea. Metabolomics is a useful method capable of simultaneously detecting hundreds of endogenous metabolites and has been widely used in the study of chemical composition [10] . Here, we therefore take advantage of metabolomics combined with transcriptomics to systematically examine the effect of seasonal alternation on intrinsically metabolic variation in fresh tea shoots before processing.
The south bank of the Yangtze River is one of the most famous tea growing regions in China, processing a variety of elite green teas including West Lake Longjing (Dragon Well tea), Anji Baicha (a high amino acid albino green tea) and Biluochun (a hairy frizzy tender green tea). The area has four distinct seasons as a result of its subtropical monsoon climate. Spring tea is the major tea in most tea producing areas in China [11] . However, fresh tea shoots are not fully used in other seasons. Spring, summer and autumn are the tea growing seasons in the south bank of the Yangtze River. Accordingly, we used an untargeted UPLC-MS approach coupled with transcriptome analysis to detect metabolite and gene transcription levels in young tea shoots from 90 samples collected in three harvest seasons to investigate the alternating patterns in metabolites with season. We also focused on the regulation of flavonoids as an illustration of seasonal response. Finally, we constructed metabolite correlation networks and weighted the gene coexpression network to further elucidate the comprehensive regulation mechanism underlying the highly correlated metabolic compounds in tea. The results may help to improve the flavor of teas selected in appropriate seasons. In addition, some transcription factors discovered through the co-expression network are helpful in further understanding the regulation of metabolites, particularly flavonoids, at the transcriptional level in tea.

Sample extraction
Ten mL of 70% methanol solution (v/v) were added to 0.2 g of tea powder in a clean tube. Extraction was then conducted ultrasonically (Bransonic, 5510E-DTH, Branson Ultrasonics, Danbury, CT) for 30 min at 30°C. The supernatants were filtered through a 0.22-µm porew filter for analysis after infusion at 4°C for 2 h.
Standard (Table S1) and metabolite detection were made with a Q-Obitrap-MS (Thermo Fisher Scientific, Waltham, MA). The mass spectrometer (MS) was operated in both positive and negative modes with electron spray ionization at capillary voltages of 3.50 and 3.20 kV, respectively. The temperatures of drying gas and auxiliary gas were, respectively, 320°C and 350°C. The full MS mode ranged from 70 to 1000 at a resolution of 70,000, and the top 10 peak areas were selected. This was followed by MS/MS scanning at a resolution of 17,500 with normalized collision energies of 15, 30 and 60, isolation window of 4.0 m/z (mass-to-charge ratio), loop count 10, and dynamic exclusion 10.0 s. Quality control (QC) samples were prepared by mixing samples of equal amounts, with every 10 samples injected with one QC to monitor instrumental performance.

High performance liquid chromatography (HPLC) conditions
The concentrations of catechin monomers were determined by HPLC (Waters 2695-2489, Water Corp., Milford, MA) with a C18 reverse phase column (Synergi, 5 µm, 4.6 mm × 250 mm, Phenomenex, Torrance, CA). The determination conditions were set as follows: solvent A and B were 0.1% FA in water (v/v) and pure acetonitrile, respectively. Column temperature was 35°C and the flow rate was 1.0 mL·min −1 . UV spectra were measured at 278 nm. The injection volume was 10 µL. The gradient elution profile was 4% B at 0 min, to 18.7% B at 42 min, to 4% B at 43 min, and 4% B maintained from 43 to 45 min. The absolute concentrations of catechins were calculated according to standard curves (Fig. S1).

RNA sequencing
Total RNA was isolated using an RNAprep Pure Plant kit (Tiangen Biotech Co., Ltd., Beijing, China) according to the manufacturer's protocol. The degradation and purity of RNA were examined by 1% agarose gel electrophoresis and a NanoPhotometer spectrophotometer (Implen, Munich, Germany). RNA integrity was assessed using an RNA Nano 6000 Assay Kit of the Bioanalyzer 2100 system (Agilent Technologies). High-quality RNA samples from tea plants were prepared using an Illumina TruSeq RNA Sample Preparation Kit (Illumina Inc., San Diego, CA) and cDNA libraries were constructed with an UltraTM RNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA). The cDNAs were purified using Beckman AMPure XP beads (Beckman Coulter, Brea, CA) and subsequently moved to an Agilent High Sensitivity DNA Kit (Agilent 2100) for the detection of inserted cDNA fragments. Subsequently, cDNA libraries were quantified with a Bio-Rad KIT iQ SYBR Green kit (Bio-Rad CFX 96, Bio-Rad Laboratories, Hercules, CA) and cDNA libraries were subsequently sequenced using a TruSeq SBS Kit v3 (Illumina HiSeq2500). The clean reads were subsequently aligned to reference genomes (pcsb.ahau.edu.cn:8080/CSS). An index of the reference genome was built using Hisat2 v2.0.5 and pairedend clean reads were aligned to the reference genome. A database of splice junctions was generated using Hisat2 based on the gene model annotation for an optimized mapping result. FPKM (expected number of fragments per kilobase of transcript sequence per million base pairs) of each gene was calculated based on the length of the gene and read counts mapped to the corresponding gene.

Data processing
The information on m/z, retention time (RT), characterized fragments and peak intensity was extracted with Xcalibur (Thermo Fisher Scientific). A local database of authentic standards was obtained from mzVault based on the information of raw files. Xcalibur Compound Discoverer 2.1 read the input files, extracted the peaks, RT, and calculated the relative peak area. The dominant parameters of database alignment were as follows: mass tolerance 5 ppm; threshold of signal-noise ratio 1.5; precursor selection MS. RT and MS2 spectra were used for local database alignment. The MS2 spectrum was also aligned with the online mass databases human metabolome database, Kyoto Encyclopedia of Genes and Genomes, and PlantCyc. Subsequently, the original data processing was conducted by generalized logarithmic transformation and Pareto scaling with R 3.6.2.
Principal component analysis (PCA), partial least-squares discriminant analysis (PLS-DA), cluster analysis, and pathway analysis were conducted using MetaboAnalyst 4.0. The metabolites of cv. Longjing Changye over three seasons in the flavonoid pathway were compared by least significant difference (LSD) test.

Metabolite correlation network and weighted gene co-expression network analysis
Pearson correlation coefficient (PCC) values between metabolites were calculated via R 3.6.2. The biomarkers were selected based on p-value < 0.05 and PCC > 0.6 to construct metabolite correlation networks. PCC values between genes were also calculated using R. The ranks for every gene pair were obtained based on the PCC value from high to low. The mutual ranks (MR) were calculated using the formula: The genes were clustered based on the dissimilarity and divided by the dynamic method depending on PCC > 0.8 (minModuleSize = 30). Finally, MR was converted to network edge weight using the decay function ≥ 0.01.

Untargeted analysis of tea samples
A total of 100 metabolites of tea samples from three harvest seasons were identified after detection through untargeted metabolomics, which were divided into 12 categories and 56 of the 100 metabolites were confirmed by standards. The remaining metabolites were putatively identified by alignment to online mass databases referring to accurate mass and fragments (resulting information is given in Table 1). The top three categories were flavonoids and their derivatives, amino acids and their derivatives and organic acids. The predominant components in tea were confirmed to be flavonoids and their derivatives, especially catechins, in agreement with previous studies [9,12] .  An unsupervised PCA was used to acquire an overview of the metabolic differences among the tea fresh leaves collected over three seasons ( Fig. 1(a)). The first two principal components explained 48.1% of the total variance with 95% confidence regions. Figure 1(a) shows that the metabolites of tea samples were markedly different between the three seasons. Subsequently, a supervised PLS-DA was used to further investigate the relationship between seasons and metabolites. Figure 1(b) shows the obvious metabolic diversity among different seasons. Figure 1(c) shows that the calculated R 2 and Q 2 were up to 0.95 and 0.93 according to 10-fold cross-validation of the first three components, indicating that the result of the PLS-DA approach was reliable. In addition, the plotted scores of samples collected in spring are more concentrated than those in the other two seasons, implying lower variation in the spring samples. It is likely that metabolite accumulation among tea plants is increasingly distinct during their annual growth cycle and this would make an interesting topic for further research. Cluster analysis was conducted using the peak area mean of biological replicates based on Euclidean distance (Fig. 2) for the purpose of discovery of key metabolites responding to seasonal variation. All samples distinctly clustered into three groups in line with each corresponding season in terms of the top 25 differential metabolites, revealing the marked effect of season on the metabolites. These metabolites can be basically categorized into amino acids, organic acids, alkaloids, flavonoids and their derivatives. In Fig. 2 the red and blue boxes show the metabolites at higher or lower concentrations than the mean value, respectively, and a deeper color indicates a greater difference. Most amino acids, namely theanine, glutamic acid, glutamine and arginine, had higher concentrations in spring than in the other two seasons. These amino acids were found to be closely associated with the taste and aroma of infused tea. Theanine with fresh flavor, a contributor to the formation of tea volatiles, accounted for about 50% of total amino acids in tea [13] , and was more abundant in spring than in the other two seasons. Theanine concentration markedly decreased in summer. By comparison, the vast majority of catechines, flavonoids and their derivatives were at their lowest concentrations in spring. Previous investigations have found that theanine is hydrolyzed to both glutamic acid and ethylamine in the presence of theanine hydrolase in the leaves under sunlight [14] . Ethylamine was transformed to acetaldehyde as a precursor of catechins induced by ammonia oxidase [14] . Therefore, high temperature and strong sunlight in the summer may accelerate the hydrolysis of theanine and the biosynthesis of catechins. In addition to catechins, flavonoids are one of primary polyphenols in tea leaves. Flavonoids are the upstream metabolites of catechins. Overall, the alternation tendencies of catechins and flavonoids in the three harvest seasons were consistent. It is likely that a low accumulation of flavonoids inhibits the biosynthesis of catechins. The great majority of organic acids and alkaloids had lower concentrations in spring. Of these, citric acid is the essential primary compound in the primary metabolism of organisms. In addition, tea samples collected from spring are more distinct than those from the other two seasons, as samples from summer and autumn were clustered into one group at the green dotted line shown in Fig. 2. This may be attributed to dormancy of the plants in winter during the previous year in the south bank region of the Yangtze River whose latitude ranges from 28°N to 32°N. When the latitude is above about 16°, tea goes through almost complete winter dormancy which lasts longer with increasing latitude [15] . Phytohormones, phenols and polyamines are correlated with this intrinsic physiologic regulation, namely, dormancy [16] . As a consequence, the plants have accumulated abundant compounds in winter for shooting and growth the following spring.

Variation in flavonoids and key gene expression in different seasons
Flavonoids were the major differential metabolites over the three seasons as shown in Fig. 2. Flavonoids are a large group of secondary plant metabolites. They bear a common diphenylpropane (C6-C3-C6) backbone in which two aromatic rings are linked via a three-carbon chain. Flavonoids fall into six major subclasses on the basis of differences in the heterocyclic C-ring, namely, flavones, flavonols, flavanones, flavanols, anthocyanidins and isoflavones [17] . They are important quality-related compounds in infused tea, accounting for 20% to 40% of the dry matter in young shoots [8] . Camellia sinensis (L.) O. Kuntze 'Longjing Changye' is suitable for processing into one of the highest quality green teas, West Lake Longjing, which was bred from 'Longjing Qunti' through individual selection. Here, flavonoids show changes in compounds of 'Longjing Changye' with season in the flavonoid related metabolic pathway based on the relative peak areas (Fig. 3). In this pathway the concentration of shikimic acid was significantly higher in spring than in the two other seasons. The concentration of gallic acid was lower in spring and summer than in autumn. In contrast, the concentration of digallic acid was highest in spring. This may show that the increase in digallic acid was the result of decrease in gallic acid. Previous sensory studies report that gallic acid was at the highest concentration in autumn, which is a umami-enhancing compound in green tea beverages that may increase the umami intensity of sodium L-glutamate, and that digallic acid was observed at the lowest concentration in the same season [18] . Intensities of sweet aftertaste increased with increasing concentration of gallic acid [19] . Also, there were relatively low concentrations of most downstream metabolites in spring, including naringenin, eriodictyol, schaftoside, myricetin, kaempferol, quercetin, quercetin-3β-D-glucoside, C, GC, EC, EGC, EGC-3′-glucuronide, EGCG-3"Me and delphinidin-3-O-rutinoside. Similar results were also observed following tea processing.
It has been reported that 70% to 75% of the bitterness and astringency of green tea is associated with catechins [20] . The two catechins C and EC differ in their oral astringency and bitterness. EC imparts significantly more bitterness and astringency than an equal concentration of C. Also, the bitterness and astringency of gallate type catechins are reported to be more severe than those of free catechins [21] . Their polymers, procyanidin and glucoside, represent weaker bitterness and astringency [22] . Fresh tea leaves accumulated more GC-(4α→8)-EGC and theasinensin B in summer and autumn than in spring. However, procyanidin B3 and procyanidin B4 decreased markedly in the autumn (Fig. 3). Procyanidins are homooligomeric (epi)catechin with two B-ring hydroxyl groups yielded from flavan-3-ols [23] . In addition, there was no marked difference in quinic acid or phenylalanine across seasons. In conclusion, concentrations and composition of flavonoids in tea depend on the ambient conditions over the seasons, suggesting that the biosynthesis of a large proportion of flavonoids is active under high temperatures and light intensity in summer.
Flavonoids are plant secondary metabolites derived from the shikimate pathway, the phenylpropanoid pathway and the flavonoid pathway. We calculated the expression levels of key genes in the flavonoid metabolic pathway to further explore the flavonoid regulatory mechanism in tea plants. Twenty-eight transcripts were aligned to genes responsible for flavonoid metabolism (Fig. 3, Table S3) involving 25 genes in the reference genome and three novel genes. The expression levels of 19 genes were lowest in spring and of the other 18 genes were highest in summer. The expression of TEA032730 and TEA009266 that aligned with DFR and ANR, respectively, were highest in spring. This suggests that flavonoids were strongly regulated at the transcription level, consistent with previous findings [24,25] . Also, the majority of key genes in the flavonoid pathway were upregulated in summer and autumn. PAL, the key enzyme of the phenylpropanoid biosynthesis pathway, catalyzes the first step of the phenylpropanoid biosynthesis pathway [26] . In spring, although the expression level of PAL was lowest, the substrate phenylalanine was not highly accumulated (Fig. 3). This indicates that other factors also affect flavonoid metabolism in addition to the regulation of gene expression. Enzyme activity is one of these factors. PAL activity was significantly lower in berry skins at high temperatures than at low temperatures [27] . Therefore, plants may tend to increase the expression of PAL to enhance flavonoid metabolism under higher temperature conditions. Variation in two SNPs was found to have effects on the concentration of flavonoids [28] and gene expression level was also important [9] . Nine genes responsible for flavonoid biosynthesis were randomly selected for qRT-PCR to validate the gene expression level calculated through transcriptomic data. The result of qRT-PCR is consistent with that of RNA sequencing (Fig. S2). Collectively, flavonoid metabolism is a complex process controlled by gene structure, gene expression level, enzyme activity and other factors.

Catechin quality index in different seasons
The absolute concentrations of catechin monomers were quantified to further uncover seasonal effects on catechin metabolism (Table S4). Catechin quality index (CQI) × 100 is a quality index for measuring the difference in catechin concentrations of fresh tea shoots across growing seasons [29] . The CQI of most tea cultivars is significantly higher in spring than in summer or autumn according to Fig. 4(a). The CQI of fresh shoots in spring, summer and autumn successively declined as shown in Fig. 4(b). In general, tea produced in spring is considered premium in China. A possible reason is that the spring tea leaves usually possess more theanine and less catechins and have a higher CQI, thus bearing more umami and less bitter taste. Also, high concentrations of catechins will induce a generation of tea cream and degrade the quality of tea beverages [30] . However, broken black tea with relative high concentrations of catechins, particularly gallated catechins, is of superior quality having fresh and brisk taste [31] . Accordingly, we suggest that tea raw materials may be alternated for different consumers with distinct demands depending on the CQI. For instance, tea raw materials collected in summer may be less suitable for tea beverages, as relatively high concentrations of catechins may effectively prevent the formation of tea cream. Fresh tea leaves collected in summer and autumn may be used to produce polyphenol preparations for diabetics, hypertensive patients and cancer patients.

Analysis of regulatory pathway and metabolite correlation network
Pathway analysis was carried out to identify the pathways influenced by season. The pairwise comparisons between seasons show that seasonal change is important in flavonoid biosynthesis, flavone and flavonol biosynthesis, alanine, aspartate and glutamate biosynthesis, β-alanine metabolism, glutathione biosynthesis, arginine and proline metabolism, pantothenate and CoA biosynthesis, isoquinoline, alkaloid biosynthesis, and phenylalanine metabolism (Fig. S3). Also, differential metabolites were characterized through t-tests with p-value < 0.01 and FDR < 0.02 (Tables S5-S7). Flavonoids, amino acids, and nucleotides were the top three differential metabolites between spring and summer. Greater variation was observed in the concentrations of flavonoids, amino acids, and organic acids in the autumn than in spring or summer. In addition, flavonoids, amino acids and organic acids were the top three categories of metabolites detected in fresh tea shoots as indicated above, further highlighting the remarkable significance of these metabolites.
A metabolite correlation network was constructed according to PCC values to investigate the relationships among the main metabolic compounds. There were 284 pairs of metabolites showing high correlation with a PCC > 0.6 and a p-value ˂ 0.05 (Tables S8-S9). The PCC values of 54 pairs were > 0.8, indicating an extremely high correlation. These 54 pairs were classified as flavonoids, organic acids, amino acids, alkaloids, nucleotides and catechin polymers with molecular weights from high to low, further confirming the importance of flavonoids and their derivatives in the metabolism of tea (Fig. 5(a)). Although most pairs of metabolites had negative correlations, the positively correlated pairs had higher PCC values. Also, benzoic acids lay in the center of the network, demonstrating their essential intermediate role in the major metabolic pathways. Benzoic acid derivatives are polyphenols, like the flavonoids. Benzoic acid is a precursor of chalcone in plants and bacteria [32] . Consequently, benzoic acid metabolism is closely related to flavonoid metabolism.
The pathway and covariance network of flavonoids require further investigation coupling metabolome with transcriptome in order to identify the genes responsible for metabolism and their interrelations. The WGCNA is a systematic biological method used to describe the gene association modes in different samples [33] . Accordingly, we used WGCNA to identify the hub genes and their correlations. As shown in Fig. 5(b), 12 of 28 genes mapped to the key genes responsible for flavonoid metabolism were placed in the same module. Eight genes interacted strongly with other genes, namely TEA023243, TEA018665, TEA034012, TEA033429, TEA023340, TEA023331, TEA014864 and TEA009266. These hub genes were predicted to possess the functions of PAL, 4CL, CHS, C4H and ANR. There were 301 genes highly interacting with hub genes of 721 pair correlations (weight > 0.5, Table S10). A total of 136 genes were annotated to code transcription factors. Some transcription factors regulate flavonoid biosynthesis such as MYB, bHLH, WD40 and WRKY [34] . TEA017953 and TEA003334 belong to the MYB family, TEA026406 and TEA033085 to the WD40 family, and TEA000421 to the bHLH family. In particular, TEA003334 simultaneously interacted with two genes in the 4CL group (TEA034012 and TEA033429), two genes in the CHS group (TEA023340 and TEA023331), one gene in the PAL group (TEA023243), and one gene in the C4H group (TEA014864) ( Table S10). More notably, TEA026406 also interacted with TEA034012, TEA023340, TEA023331, TEA023243 and TEA014864. TEA000421 also interacted with TEA023340, TEA023331 and TEA014864. These results suggest that TEA003334, TEA026406 and TEA000421 may control the MYB-bHLH-WD40 transcription complex together to regulate the expression level of genes in flavonoid metabolism. Therefore, these five genes (TEA017953, TEA003334, TEA026406, TEA033085 and TEA000421) merit further investigation to uncover their regulatory relationships in flavonoid metabolism. Both the metabolite correlation network and weighted gene coexpression network indicate that flavonoid biosynthesis is the key regulatory pathway for secondary metabolism in tea plants. In summary, this study shows that seasonal fluctuation leads to significant variation in metabolites, especially amino acids, organic acids, alkaloids, flavonoids and their derivatives, in fresh tea shoots over three harvest seasons through UPLC-Q-Obitrap-MS. Most flavonoids in cv. Longjing Changye, having bitterness and astringency as the predominant differential metabolites, are at their lowest concentrations in spring. Furthermore, the metabolomics combined with transcriptomics comprehensively examined flavonoid change with season, and the genes responsible for flavonoid biosynthesis. WGCNA suggests that five genes warrant further investigation of their regulatory function in flavonoid metabolism.