Comparative metabolomic analysis reveals the variations in taxoids and flavonoids among three Taxus species

Background: Trees of the genus Taxus are highly valuable medicinal plants with multiple pharmacological effects relevant to the treatment of various cancers. Paclitaxel from Taxus trees is an effective and widely used anticancer drug; however, the accumulation of taxoids and other active ingredients can vary greatly among Taxus species. In this study, the metabolomes of three Taxus species were investigated.

Results: A total of 2246 metabolites assigned to various primary and secondary metabolic pathways were identified using an untargeted approach. Analysis of differentially accumulated metabolites identified 358 T. media-, 220 T. cuspidata-, and 169 T. mairei-specific accumulated metabolites. By searching the metabolite pool, 7 MEP pathway precursors, 11 intermediates, side chain products and derivatives of paclitaxel, and paclitaxel itself were detected. Most precursors and initial intermediates were highly accumulated in T. mairei, whereas most intermediate products approaching the end point of the taxol biosynthesis pathway were primarily accumulated in T. cuspidata and T. media. Our data suggested that the pathways to paclitaxel were more efficient in T. cuspidata and T. media than in T. mairei. Flavonoids are an important class of active ingredients in Taxus trees, and a majority of them were predominantly accumulated in T. mairei rather than in T. media and T. cuspidata. The variations in several selected taxoids and flavonoids were confirmed using a targeted approach.

Conclusions: Systematic correlativity analysis identified a number of metabolites associated with paclitaxel biosynthesis, suggesting a potential negative correlation between flavonoid metabolism and taxoid accumulation. Investigating the variations in taxoids and other active ingredients will provide a deeper understanding of the interspecific differential accumulation of taxoids and an opportunity to accelerate the breeding of the highest-yielding species and resource utilization.

Background
Taxol (generic name paclitaxel) is the major bioactive component of the Taxus species and is widely used in the treatment of various cancers, such as ovarian cancer, breast cancer and squamous cancers [ 1 ]. Since its approval for ovarian cancer treatment in 1992, the demand for paclitaxel and its derivatives has increased [

elution conditions were set as follows: 100% phase A, 0-2 min; 0% to 100% phase B, 2-11 min; 100% phase B, 11-13 min; 0% to 100% phase A, 13-15 min. The injection volume for each sample was 10 μL.
A high resolution MS/MS Waters Xevo G2-XS Q-TOF (Waters, Herts, UK) was used to detect metabolites eluted from the column. The Q-TOF system was operated in both positive and negative ion modes. For the positive ion mode, the capillary and sampling cone voltages were set at 3 kV and 40 V, respectively. For the negative ion mode, the capillary and sampling cone voltages were set at 1 kV and 40 V, respectively. The MS data were acquired in centroid MSE mode. The mass range was from 50 to 1,200 Da, and the scan time was 0.2 s. For the MS/MS detection, all of the precursors were fragmented using 20-40 eV, and the scan time was 0.2 s. During the acquisition, the LE signal was acquired every 3 s to calibrate the mass accuracy. To evaluate the stability of the UPLC-MS/MS system over the whole detection process, a quality control sample, which was prepared by mixing an equal volume of each experimental sample, was acquired after every 10 samples.
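The quality control scheme described above (one pooled QC injection after every 10 samples) can be illustrated with a minimal sketch. The sample labels, the function name `build_run_order`, and the total of 45 injections are hypothetical and chosen only for illustration; the actual run order used in the study is not given in the text.

```python
def build_run_order(samples, qc_label="QC", interval=10):
    """Return an injection order with one QC injection after every `interval` samples."""
    order = []
    for position, sample in enumerate(samples, start=1):
        order.append(sample)
        # Insert the pooled QC sample once `interval` samples have been run.
        if position % interval == 0:
            order.append(qc_label)
    return order


# Hypothetical batch: 45 experimental injections labelled S01..S45.
samples = [f"S{i:02d}" for i in range(1, 46)]
run_order = build_run_order(samples)
```

With 45 samples this yields four QC injections, placed after the 10th, 20th, 30th and 40th samples.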

Bioinformatics of the untargeted metabolomic dataset
Raw data of UPLC-MS/MS were processed using the following procedures. For each sample, a matrix of molecular features, such as retention time and mass-to-charge ratio (m/z), was generated using the XCMS software with default parameters [ 31 ]. The data were normalized to the total ion current, and the relative quantity of each feature was calculated using the mean area of the chromatographic peaks from three replicate injections. The quantities of metabolites were generated using an algorithm that clustered masses into spectra based on co-variation and co-elution in the dataset. The metabolites were annotated by searching against the KEGG database. For quality control, the identifications of precursor ions of the expected positive ion adduct with less than a 5 ppm error were defined using high-resolution MS.
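As an illustration of the normalization and mass-accuracy steps described above, the following sketch shows one possible way to compute them. It assumes that the total ion current can be approximated by the sum of all feature peak areas in an injection, which is a simplification; the feature names, peak areas, and m/z values are hypothetical, and this is not the XCMS implementation.

```python
def normalize_to_tic(peak_areas):
    """Divide each feature's peak area by the summed area of the injection."""
    tic = sum(peak_areas.values())  # assumption: TIC approximated by summed peak areas
    return {feature: area / tic for feature, area in peak_areas.items()}


def mean_over_injections(injections):
    """Average each feature's normalized area across replicate injections."""
    features = injections[0].keys()
    return {f: sum(inj[f] for inj in injections) / len(injections) for f in features}


def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million relative to the theoretical m/z."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Three hypothetical replicate injections of one sample.
raw = [
    {"f1": 100.0, "f2": 300.0},
    {"f1": 120.0, "f2": 280.0},
    {"f1": 80.0, "f2": 320.0},
]
normalized = [normalize_to_tic(injection) for injection in raw]
relative_quantity = mean_over_injections(normalized)

# Hypothetical precursor ion, checked against the 5 ppm tolerance.
error = ppm_error(500.0020, 500.0000)
within_tolerance = abs(error) < 5.0
```

Here the observed ion deviates by about 4 ppm from its theoretical value and would therefore pass the 5 ppm criterion.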

Analysis of targeted metabolites
Fresh twigs of each sample were collected from three Taxus species, dried at 40°C for 3 d, and powdered. A modified version of a previously published method was used to prepare crude extracts [ 32 ]. In brief, 2.0 g of powder from each sample was mixed with 30 mL of 100% methanol, and the mixture was subjected to ultrasonication for 60 min. After centrifugation at 5,000 × g for 5 min, the supernatant was filtered through 0.22-μm membrane filters and transferred to a new tube. Taxoids were detected using a Thermo Dionex UltiMate 3000 series HPLC system equipped with a Finnigan TSQ Quantum Discovery triple quadrupole MS (Thermo Fisher Scientific, Waltham, MA, USA). The separation of the above four compounds was carried out on a Phenomenex Kinetex C18 column (100 × 4.6 mm, 2.6-μm particle size; Phenomenex, Torrance, CA, USA). The mobile phase consisted of 35% solvent A (2 mM ammonium formate and 0.1% formic acid in aqueous solution) and 65% solvent B (100% methanol).
The flow rate was 0.2 mL/min, the column oven temperature was 30°C, and the injection volume was 5 μL. Other detailed parameters of the HPLC-MS/MS analysis were as follows: the capillary temperature was 270°C; the ion spray voltage was 3,000 V; the auxiliary gas and sheath gas were N2; and the collision gas was high-purity argon.

Systematic correlativity analysis and statistical analysis
For the untargeted metabolome analyses, Pearson's and Spearman's correlations, a one-way analysis of variance (ANOVA), and hierarchical clustering were conducted. P values of the ANOVA were adjusted for the false discovery rate. A principal component analysis (PCA) of the metabolites was performed on data that were mean-centered and Pareto-scaled using SIMCA v14.0 (Umetrics, Umeå, Sweden).
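Two of the steps above can be written out explicitly. The sketch below shows Pareto scaling (mean-centering followed by division by the square root of the standard deviation) and a false discovery rate adjustment. The text does not state which FDR procedure was used, so the Benjamini-Hochberg method shown here is an assumption; the function names and input values are hypothetical, and this is not the SIMCA implementation.

```python
import math


def pareto_scale(column):
    """Mean-center values and divide by the square root of their sample SD."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    if sd == 0:
        # A constant feature carries no variance; return centered zeros.
        return [0.0] * n
    return [(x - mean) / math.sqrt(sd) for x in column]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


scaled = pareto_scale([1.0, 2.0, 3.0, 4.0, 5.0])       # hypothetical feature intensities
adjusted_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])  # hypothetical ANOVA P values
```

Compared with unit-variance scaling, Pareto scaling reduces the dominance of high-intensity features while preserving more of the original data structure.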
The quantification results of targeted metabolites are presented as the means of at least three replicates ± standard error. Statistical analyses were performed using SPSS software version 19.0 (SPSS Inc., Chicago, IL, USA), and an ANOVA was applied to compare taxoid content differences. A P value < 0.05 was considered to be statistically significant.
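For reference, the "mean ± standard error" summary used for the targeted metabolites can be computed as in the following minimal sketch. The replicate values are hypothetical and are not measurements from this study.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return the mean and the standard error of the mean for replicate values."""
    mean = statistics.mean(values)
    # Standard error = sample standard deviation / sqrt(number of replicates).
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean, standard_error


# Three hypothetical replicate concentrations (arbitrary units).
mean, se = mean_and_standard_error([2.0, 2.5, 3.0])
summary = f"{mean:.2f} ± {se:.2f}"
```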

Results
Untargeted metabolite profiling of the metabolomes of different Taxus species
To explore the comprehensive variations in the metabolomes of different Taxus species, an untargeted approach (15 repeats for each group) was applied, identifying 2,246 metabolites from 8,712 ions with a relative standard deviation < 30% (Additional file 1).
Similar to the differences in twig morphology, dramatic variations in the metabolomes among different Taxus species were also observed (Fig. 1a). For quality checking, total ion chromatograms were generated, suggesting that the sample preparation met the common standards (Additional file 2). To produce an overview of the metabolic variations, a PCA was performed, and the percentages of explained variance in the metabolome analysis of PC1 and PC2 were 25.01% and 31.24%, respectively. The PCA data showed three clearly separated sample groups, indicating separations among the three different species (Fig. 1b). Based on their KEGG annotations, 747 metabolites were assigned to various primary metabolic pathways, including amino acid, carbohydrate, cofactors and vitamins, energy, lipid, nucleotide, secondary metabolites, and terpenoid pathways (Fig. 1c and Additional file 3).
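The relative standard deviation criterion mentioned in this section (ions retained when RSD < 30%) can be sketched as follows. The text does not specify over which injections the RSD was computed; the sketch assumes repeated measurements of the same ion, and the ion names and intensities are hypothetical.

```python
import statistics


def relative_standard_deviation(values):
    """RSD in percent: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


def filter_by_rsd(intensities, threshold=30.0):
    """Keep only ions whose RSD across repeated measurements is below the threshold."""
    return {
        ion: values
        for ion, values in intensities.items()
        if relative_standard_deviation(values) < threshold
    }


# Hypothetical repeated measurements of two ions.
intensities = {
    "ion_a": [100.0, 110.0, 90.0],   # RSD = 10%, retained
    "ion_b": [100.0, 200.0, 30.0],   # RSD well above 30%, discarded
}
stable_ions = filter_by_rsd(intensities)
```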

Clustering of differentially accumulated metabolites
All annotated metabolites were clustered to identify the differentially accumulated metabolites (DAMs) among the three Taxus species (Fig. 2a). All DAMs were grouped into three clusters: I, II and III. The metabolites predominantly accumulated in T. media were grouped into Cluster I (358 metabolites), those predominantly accumulated in T. cuspidata into Cluster II (220 metabolites), and those predominantly accumulated in T. mairei into Cluster III (169 metabolites) (Fig. 2b). Our data showed that the DAMs belonging to the 'secondary metabolites', 'lipids', 'cofactors and vitamins', 'carbohydrate' and 'amino acid' categories were predominantly accumulated in T. media (Fig. 2c

Confirmation of the variations in paclitaxel and its derivatives using a targeted approach
To determine more precisely the differences in taxoids among the three Taxus species, a targeted approach was used to measure the concentrations of paclitaxel, 10-DAB III, baccatin III, and 10-DAP (Additional file 7). The untargeted metabolomics analysis indicated that T. cuspidata and T. mairei contained the highest and the lowest levels of paclitaxel, respectively. Direct quantification with an authentic paclitaxel standard showed that T. cuspidata, T. media, and T. mairei contained 1.67 mg·g⁻¹, 1.22 mg·g⁻¹, and 0.66 mg·g⁻¹ of paclitaxel, respectively (Fig. 5a). The order of the paclitaxel contents was in good agreement with the untargeted metabolome results. For the other taxoids, the highest levels of baccatin III and 10-DAP were accumulated in T. cuspidata (0.65 mg·g⁻¹ and 0.80 mg·g⁻¹, respectively), and the highest level of 10-DAB III was detected in T. mairei (0.85 mg·g⁻¹) (Fig. 5b-d).

Confirmation of the variations in flavonoids using a targeted approach
To determine more precisely the differences in flavonoids among the three Taxus species, a targeted approach was used to measure the concentrations of amentoflavone, ginkgetin, quercetin and luteolin (Fig. S6). Our data showed that amentoflavone accumulated to the highest level in T. cuspidata (0.14 mg·g⁻¹) and to the lowest level in T. media (0.024 mg·g⁻¹) (Fig. 5e).

Systematic correlativity analysis identifies a number of metabolites associated with key metabolites of paclitaxel biosynthesis
An analysis of metabolite-metabolite interaction networks contributed to the understanding of functional relationships and the identification of new compounds associated with key metabolites of paclitaxel biosynthesis. In our study, an interaction network based on the differentially accumulated metabolites was constructed.
Furthermore, the taxoid-related networks were divided into three clusters surrounding paclitaxel, baccatin III, and 10-DAB III (Additional file 9). The interaction networks suggested that nine classes of metabolites, phenylpropanoids, flavonoids, alkaloids, carboxylic acid derivatives, quinones, glycosides, saccharides, steroids and terpenoids, may also contribute to the variations in taxoid accumulation in different species (Fig. 6).
However, the mechanisms underlying the interactions of these potential new metabolites need to be investigated.
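One common way to build a metabolite-metabolite network of the kind described above is to connect metabolites whose abundance profiles are strongly correlated. The sketch below uses Pearson's correlation with an absolute cutoff of 0.8; this cutoff, the function names, and all metabolite names and values are assumptions made for illustration, since the construction parameters are not given in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def correlation_edges(profiles, target, min_abs_r=0.8):
    """List (target, metabolite, r) edges whose |r| meets the assumed cutoff."""
    edges = []
    for name, values in profiles.items():
        if name == target:
            continue
        r = pearson(profiles[target], values)
        if abs(r) >= min_abs_r:
            edges.append((target, name, r))
    return edges


# Hypothetical abundance profiles across six samples.
profiles = {
    "paclitaxel": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "flavonoid_x": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly anticorrelated
    "metabolite_z": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],  # weakly correlated
}
edges = correlation_edges(profiles, "paclitaxel")
```

In this toy example only the anticorrelated flavonoid is linked to paclitaxel, mirroring the kind of negative association reported between flavonoids and taxoids.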
In plants, the accumulation of metabolites is a complex and important trait mainly affected by genetic and environmental factors [ 38 , 39 ]. By identifying specific metabolites, our results suggested that variations, not only in paclitaxel and its derivatives, but also in their precursors, exist in different Taxus species (Fig. 3). The diterpenoid taxane core is derived from three units of IPP and one unit of dimethylallyl diphosphate, which are supplied by the MEP pathway [ 9 ]. Interestingly, most precursors for paclitaxel biosynthesis were highly accumulated in T. mairei compared with T. cuspidata. However, paclitaxel was primarily accumulated in T. cuspidata rather than in T. mairei. This suggested that the efficiency of paclitaxel synthesis using MEP pathway precursors in T. cuspidata may be extremely high.
In addition to taxoids, flavonoids, phenylpropanoids, and phenolic compounds have been isolated in Taxus species [ 17 , 34 , 45 , 46 ]. In our study, the metabolite-metabolite interaction network revealed 222 taxoid-associated metabolites, belonging to 10 major categories. In total, 21 flavonoids, including 3 baccatin III-related metabolites and 18 paclitaxel-related metabolites, were identified in the interaction network. Interestingly, the majority of the flavonoids were negatively correlated with baccatin III and paclitaxel (Table S3), which was in accord with data from our metabolomes. A previous work showed that total flavonoids, ginkgetin, and quercetin were highly accumulated in T. mairei and that paclitaxel was highly accumulated in T. media [ 34 ]. Under ultrasound and salicylic acid treatments, paclitaxel biosynthesis

Competing interests
The authors declare that they have no competing interests.

Additional File Legend
Additional file 1 Table S1 Detailed information of 2,246 identified metabolites.
Additional file 2 Figure S1 The total ion chromatograms of all the samples.
Additional file 3 Table S2 The KEGG annotations of 747 identified

and luteolin (h), were quantified by HPLC-MS/MS method. A P value < 0.05 was considered to be statistically significant and indicated by "b" and P < 0.01 was indicated by "a".