Characterization and Metabolic Diversity of Flavonoids in Citrus Species

Flavonoids are widely distributed in plants and play important roles in many biological processes. Citrus fruits are rich dietary sources of flavonoids. However, there have been very few reports on the comprehensive metabolic profile and natural diversity of flavonoids in different tissues of various Citrus cultivars. In this study, based on the 7416 metabolic signals detected with a non-targeted metabolomics approach, Principal Component Analysis revealed that the flavedo differs most from the other tissues in metabolite levels. As many as 198 flavonoid signals were then detected in 62 Citrus germplasms from the five most widely cultivated species, and 117 flavonoids were identified, including 39 polymethoxylated flavonoids (PMFs), 7 flavones, 10 C-O-glycosylflavonoids, 44 O-glycosylflavonoids, 10 C-glycosylflavonoids and 7 newly annotated O-glycosylpolymethoxylated flavonoids. Tissue-specific accumulation was observed: O-glycosylated flavonoids were abundant in all fruit tissues, while PMFs accumulated preferentially in the flavedo. Among the species, mandarins had the highest levels of PMFs and O-glycosylpolymethoxylated flavonoids, followed by sweet oranges. Based on the flavonoid profiles, the 62 germplasms could be clearly grouped into five distinct clusters via hierarchical clustering analysis, which matched their species perfectly, with sweet oranges and mandarins clustering closely and lying further away from the other three species.

luteolin 11 . A small number of C-glycosyl flavones have also been detected in blood orange 22 . PMFs, a special group of flavonoids in which almost all hydroxyls are capped by methylation, are predominantly present in the Citrus genus, especially in the peels of sweet oranges and wild mandarins 23,24 . PMF glycosides were identified in Murraya paniculata leaves, which was the first report of PMF glycosides in that genus 25 . There have been no reports so far on the glycosylation of PMFs in the Citrus genus.
Owing to the varied flavonoid compositions among different citrus species 20,26 and tissues 10,27 , flavonoids can be taken as a metabolic marker to distinguish citrus varieties, and can be applied to fruit juice identification. For example, the content of hesperetin is a marker of the floral origin of citrus honey 28 , while for lemon juice, flavonoids such as eriodictyol-7-O-rutinoside, diosmetin-6, 8-di-C-glucoside, diosmetin-8-C-glucoside, luteolin-7-O-rutinoside and diosmetin-6-C-glucoside are taken as marker metabolites 20 .
As a powerful method, HPLC-MS-based widely targeted metabolomics has been successfully used for specific detection of flavonoids and characterization of new flavonoids in Arabidopsis and rice 6,29,30 . In a previous study, eight new flavonoids were identified by metabolic profiling of flavonoids in bergamot juice 26 . The phenolic compounds in citrus juices from Spanish cultivars were also comprehensively characterized with the same method 31 . Despite this progress, flavonoid metabolism has not been systematically studied, and its naturally occurring variation in citrus germplasm remains elusive.
In our research, the accumulation patterns and metabolic diversity of flavonoids were investigated in 62 Citrus germplasms from the five most widely cultivated Citrus species, including one wild germplasm. Unlike previous studies, this study aimed to determine the content and composition of flavonoid compounds in citrus fruits, provide comprehensive profiling of flavonoids in different citrus species, and reveal the metabolic diversity of flavonoids in citrus fruits. We focused only on the 117 flavonoids found in citrus and provide more detailed data for various citrus germplasms; this work will not only benefit the evaluation of existing germplasms, but also shed light on future selection and breeding of new healthy citrus germplasms. Although 36 compounds have been previously described 32 , 7 new compounds were detected for the first time in citrus in this work. In total, 117 flavonoids were (tentatively) identified, and tissue-specific accumulation was detected for most of them. A neighbor-joining tree based on flavonoid profiles was used to study the population structure of citrus at the metabolic level. The results pave the way for future dissection of the biosynthesis and genetic regulation of flavonoid metabolic pathways in citrus, and could be helpful in the selection of breeding parents for new germplasms rich in specific flavonoids.

Results and Discussion
Metabolic profiles of Citrus fruits. For comprehensive profiling of metabolites in citrus fruits, we analyzed 62 Citrus germplasms representing five major species in Citrus genus, including sweet oranges (SO), mandarins (M), lemons (L), pummelos (P) and grapefruits (G) (Supplementary Table S1). Non-targeted high-performance liquid chromatography with diode array detection and electrospray ionization mass spectrometry (HPLC-DAD-ESI-MS/MS) was used to profile the metabolites in fruit tissues. As a result, 7416 metabolic signals were detected (Supplementary Table S2).
To investigate the inter- and intra-species metabolic differences, mixed samples of different fruit tissues, including the flavedo (F), albedo (A), segment membrane (SM) and juice sacs (JS) from five Citrus species, were prepared and subjected to metabolic profiling via liquid chromatography tandem time-of-flight mass spectrometry (LC-TOF-MS). Based on the untargeted metabolomics and Mass Profiler Professional (MPP) analysis, Principal Component Analysis (PCA), an unsupervised method, was subsequently used to briefly evaluate the kinetic metabolome patterns of different fruit tissues and Citrus species. PCA clearly grouped the tissues into four distinct clusters within individual Citrus species, and classified the metabolic signals from the same tissues of different species into five distinct clusters ( Fig. 1A and B). The first three principal components (PC1, PC2 and PC3) explained 33.08-39.13% of the variability of the entire system, and all variables contributing to the PCs are shown in Supplementary Table S2.
Furthermore, PCA separated the flavedo from the other three tissues in all five species, indicating that the flavedo differs most from the other tissues in metabolite levels (Fig. 1A). Notably, all four tissues of the various species showed similar patterns in their metabolomes. For the outermost tissue, the flavedo, the plots were widely scattered, suggesting that its metabolome varies considerably among species (Fig. 1A). However, for the innermost tissues, SM and JS, the plots were much less distinguishable, indicating similar metabolic profiles between the two tissues (Fig. 1B). Furthermore, the PCA plots of the same tissue from the five species were relatively close, indicating lower metabolic variation within the same tissue across species.
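To illustrate how a principal component and its share of explained variance can be obtained from a matrix of signal intensities, a minimal sketch follows. It is not the MPP procedure used in this study: the intensities and sample names are hypothetical, and the leading component is found by simple power iteration on the covariance matrix, which is only one of several ways to compute a PCA.

```python
import math

# Hypothetical relative intensities (rows: samples, columns: three metabolic signals).
samples = {
    "flavedo_1": [9.0, 8.5, 1.0],
    "flavedo_2": [8.6, 9.1, 1.2],
    "juice_sac_1": [1.0, 1.4, 1.1],
    "juice_sac_2": [1.3, 0.9, 0.9],
}


def center_columns(rows):
    """Subtract the column mean from every value."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix of mean-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_component(cov, iterations=200):
    """Approximate the first principal component by power iteration."""
    p = len(cov)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the variance captured along the component.
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return vec, eigenvalue


centered = center_columns(list(samples.values()))
cov = covariance_matrix(centered)
pc1, eigenvalue = leading_component(cov)

# Share of total variance explained by PC1 (total variance = trace of cov).
explained_ratio = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

# Projection of each sample onto PC1.
scores = {name: sum(v * w for v, w in zip(row, pc1))
          for name, row in zip(samples, centered)}

for name, score in scores.items():
    print(f"{name}: PC1 score = {score:.2f}")
print(f"PC1 explains {explained_ratio:.1%} of the variance")
```

In this toy example the two flavedo-like samples and the two juice-sac-like samples fall on opposite sides of PC1, which is the kind of separation a PCA score plot makes visible.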

Flavonoid identification using HPLC-DAD-ESI-MS/MS. To further identify flavonoids, multiple Diode Array Detector (DAD) wavelength-scanning programs were used, which were capable of monitoring several wavelengths simultaneously. The flavonoid signals for total ions at 280 nm are shown in Fig. 1C and D. Based on the UV absorbance and mass spectrometric data, in addition to the publicly available information on flavonoids, 198 precursor ions were obtained to construct the flavonoid metabolism database in this study (Supplementary Table S3).
In order to better identify the detected flavonoids, 198 flavonoid-related candidate ions were analyzed by using the targeted MS 2 mode, and the corresponding fragmentation patterns were obtained. Subsequently, a MS 2 spectral tag (MS2T) library containing 198 metabolite signals of flavonoids with the product ion spectra (MS 2 ) was created, which was then annotated based on the fragmentation pattern, retention time (RT), and accurate m/z value (Supplementary Table S3).
Flavonoids with available commercial standards were identified by direct comparison of the m/z values, RT, and the secondary mass spectral fragment information with those of the standard compounds. For example, first, a metabolite signal was detected at RT 7.74 min (compound m149), and the characteristic spectrum suggested that this metabolite was an O-diglycosylated flavonoid ( Fig. 2A); second, the accurate m/z value (Q1) detected by ESI-QqTOF-MS/MS in positive ionization mode was 611.1595 (base peak); finally, the product ion mass spectrum of [M + H] + showed the same fragmentation pattern as the quercetin 3-O-rutinoside standard, and a peak was observed at m/z 303.0500 because of the loss of the sugar moiety from the aglycone (−308, rutinose) ( Fig. 2B), as shown by the structure and main fragmentation pathways of the flavonoid (Fig. 2C).
Flavonoids with no available authentic standards were annotated by comparing the MS fragments with those in the literature or in databases (MassBank and METLIN). If a peak had a fragmentation pattern similar to a published one, the best matches were then searched in the Dictionary of Natural Products (DNP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) for possible structures. About 90 metabolites were putatively annotated following this strategy, and most of them were (C- or O-) glycosylflavonoids and PMFs (Supplemental Table 4).
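The neutral-loss reasoning in the m149 example can be sketched as a small calculation. The precursor and fragment m/z values are those reported above; the table of sugar residue masses (standard monoisotopic values) and the 0.01 Da tolerance are assumptions introduced for illustration, not parameters stated in this study.

```python
# Monoisotopic masses (Da) of sugar residues lost as neutral fragments.
# These are standard values; the choice of entries is illustrative.
SUGAR_LOSSES = {
    "hexose (e.g. glucose)": 162.0528,
    "deoxyhexose (e.g. rhamnose)": 146.0579,
    "rutinose (rhamnose + glucose)": 308.1107,
}


def match_neutral_loss(precursor_mz, fragment_mz, tolerance_da=0.01):
    """Return the observed neutral loss and the sugar residues it matches.

    tolerance_da is an assumed matching window, not a value from the study.
    """
    loss = precursor_mz - fragment_mz
    hits = [name for name, mass in SUGAR_LOSSES.items()
            if abs(loss - mass) <= tolerance_da]
    return loss, hits


# Values reported for compound m149: [M + H]+ at m/z 611.1595,
# product ion at m/z 303.0500.
loss, hits = match_neutral_loss(611.1595, 303.0500)
print(f"neutral loss = {loss:.4f} Da, candidates: {hits}")
```

The observed difference of about 308.11 Da falls within the assumed window of the rutinose residue only, in line with the assignment of m149 as quercetin 3-O-rutinoside.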
In addition to those reported metabolites, seven O-glycosylpolymethoxylated flavonoids were newly annotated in the study. To better characterize these flavonoids, the 3′,4′,5,6,7,8-hexamethoxyflavone standard (m70, RT
consistent with previous reports 10, 18 , the richest flavonoids were found in the flavedo (Fig. 1C). At the species level, the largest number of flavonoid signals was found in the samples of sweet oranges, while the smallest number of flavonoid signals was detected in pummelos and grapefruits (Fig. 1D).
Hierarchical clustering analysis (HCA) on the pattern of flavonoids in the flavedo, albedo, SM and JS displayed substantial variations, and clearly separated flavedo from other tissues (Figs 4A and 5A). PMFs and their derivatives had the same accumulation patterns in SM and JS tissues, and cluster analysis showed that the two tissues were clustered together.
Furthermore, all investigated tissues contained PMFs and their derivatives, including tetramethoxyflavone, pentamethoxyflavone and hexamethoxyflavone. Compared with the other three tissues, the accumulation of PMFs was highest in the flavedo, while the total amount of PMFs and their derivatives was lowest in the JS (Fig. 4A). Collectively, the facts that the contents of PMFs and PMF derivatives were the highest and that the largest number of metabolic signals was detected in the flavedo suggest that the metabolome is most complex in the flavedo. Therefore, it can be speculated that the flavedo, the outer layer of the fruit, may accumulate more metabolites with special physiological functions, such as hydroxylated PMFs, which potently inhibit microorganism growth and show antibacterial, antifungal and antiviral activities 5,33 . Meanwhile, O-glycosylpolymethoxylated flavonoids showed a similar trend among the four tissues, with the highest levels in the flavedo and the lowest levels in the SM and JS (Fig. 4C).
As for different species, the highest levels of PMFs as well as O-glycosylpolymethoxylated flavonoids were found in mandarins, followed by sweet oranges, while significantly lower levels were detected in lemons and pummelos (Fig. 4D). These results are consistent with the results of the above HCA (Fig. 4B). Other flavonoid clusters such as Fla C-gly, Fla O-gly and Fla C, O-gly could be categorized into two main groups based on their tissue-specific accumulation patterns. In the different fruit tissues, the flavonoids in group I were mostly detected at higher levels and included a number of glycosylated metabolites (quercetin 3-O-rutinoside, luteolin 7-O-glucoside and apigenin di-C-hexoside), while group II included flavonoid aglycones present at lower levels in all fruit tissues, such as naringenin, apigenin and quercetin (Fig. 5A).
Notably, consistent with previous reports 32 , O-glycosylated flavonoids were the most abundant flavonoids in all fruit tissues (Fig. 5C and E). In addition, as further evidence for tissue-specific accumulation of flavonoids, C-glycosylated flavonoids accumulated preferentially in the flavedo, followed by the albedo, SM and JS (Fig. 5E). Furthermore, HCA also revealed the species-specific profiles of flavonoids (Fig. 5B). Lemons had the highest levels of C-O-glycosylflavonoids, while flavonoid aglycones were the most abundant in grapefruits (Fig. 5D). Further analysis revealed that O-glycosylated flavonoids were detected at higher levels than C-glycosylated flavonoids in all investigated species, and that the latter accumulated less in grapefruits and pummelos (Fig. 5F).
Interestingly, as shown in Fig. 5C, C-O-glycosylflavonoids were detected at lower levels than flavonoid aglycones in all fruit tissues except for the flavedo, indicating that the C-O-glycosylation of flavonoids is highly active in the flavedo. However, in the mixed samples of various tissues (Fig. 5D), C-O-glycosylflavonoids were detected at higher levels than flavonoid aglycones in all investigated Citrus species (especially lemons) except for grapefruits, indicating not only the large contribution of the flavedo-generated flavonoids to the total flavonoid pool of the whole fruit, but also the significantly different genetic control of flavonoid patterns in different Citrus species.
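The hierarchical clustering used in this section can be illustrated with a minimal sketch. The tissue profiles below are hypothetical values chosen only to mimic the reported trend (PMFs high in the flavedo, O-glycosylated flavonoids abundant everywhere), and Euclidean distance with average linkage is an assumption, since the distance and linkage settings used in R are not stated here.

```python
import math

# Hypothetical relative levels per tissue: [PMFs, O-glycosylated flavonoids].
profiles = {
    "flavedo": [9.0, 8.0],
    "albedo": [2.0, 7.5],
    "SM": [1.0, 7.0],
    "JS": [0.8, 6.8],
}


def euclidean(a, b):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(data, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Distance between clusters is the mean pairwise distance of their members.
    """
    clusters = [[name] for name in data]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [euclidean(data[a], data[b])
                         for a in clusters[i] for b in clusters[j]]
                mean_dist = sum(dists) / len(dists)
                if best is None or mean_dist < best[0]:
                    best = (mean_dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [set(c) for c in clusters]


groups = average_linkage(profiles, n_clusters=2)
print(groups)
```

With these toy values, SM and JS merge first and the flavedo remains on its own branch, mirroring the separation described for Figs 4A and 5A.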

Metabolic diversity of flavonoids in Citrus species.
To further study the natural metabolic diversity of flavonoids in Citrus species, mixed samples of four fruit tissues from the 62 germplasms were used for targeted metabolic profiling. The heat map (Fig. 6A) obtained from HCA exhibited the metabolic diversity of flavonoids in
Additionally, the results of HCA also showed great metabolic diversity in the contents of flavonoids among the investigated Citrus germplasms (Fig. 6). In Fig. 6, the 62 germplasms were grouped into two clusters, in which lemons and pummelos were separated from the other germplasms. Mandarins and sweet oranges, however, were closely clustered, reflecting their relatively close genetic relationship 34 . Studies have shown that sweet orange is derived from interspecific hybridization between pummelo (female parent) and mandarin (male parent), followed by backcrossing with a male mandarin (sweet orange = (pummelo × mandarin) × mandarin) 35 ; consistently, in this study both the HCA results and the neighbor-joining tree (Fig. 6B) demonstrate a closer genetic relationship between mandarins and sweet oranges.

Materials and Methods
Twelve to twenty-one healthy fruits true to their cultivars at commercial maturity were collected randomly from the peripheral canopy of at least three trees and were randomly divided into three biological replicates. The washed material was separated into four tissues, including flavedo, albedo, SM and JS, immediately placed in liquid nitrogen, and then vacuum freeze-dried using the Heto LyLab 3000 (Heto-Holten A/S, Allerød, Denmark).
Chemical reagents. Chromatographic-grade acetonitrile, acetic acid and methanol were purchased from Merck (Darmstadt, Germany). Milli-Q water was prepared using a Millipore purification system (Millipore Corporation). The internal standard lidocaine was purchased from Shanghai New Asiatic Pharmaceuticals Co., Ltd (www.xinyapharm.com/). All standard compounds, including C-glycosylflavonoids, flavones, O-glycosylflavonoids and polymethoxylated flavonoids (Supplementary Table S3), were purchased from Sigma-Aldrich, USA (http://www.sigmaaldrich.com/united-states.html). All flavonoid standards were dissolved in methanol-dimethyl sulfoxide (50:50, v/v) and stored at −20 °C in darkness.
Preparation of metabolic samples. The dried material was ground into a powder using a mixer mill (MM 400, Retsch) for 1.5 minutes at 30 Hz. Water-soluble metabolites contained in 100 mg of powder were extracted with 1.0 ml of the extract (methanol:H 2 O 2 , 50:50, v/v) at 4 °C for 8 hours and then centrifuged at 10,000 g for 10 min. The supernatant was collected and filtered (SCAA-104, 0.22 μm pore size; ANPEL), and then analyzed by LC-MS.
To investigate the inter- and intra-species differences in metabolites, mixed samples of four tissues (flavedo, albedo, SM and JS) were analyzed. For each germplasm, a 4 ml mixed sample was prepared by combining 1 ml of the above extracted supernatant from each fruit tissue.

LC-MS/MS analysis of metabolites. Qualitative metabolic analysis via HPLC-DAD-ESI-QqTOF-MS/MS (6520B, Agilent, USA) was performed in the targeted MS 2 mode. The UV spectra (DAD) were recorded from 270 to 380 nm. The raw data were analyzed using MassHunter software and the processing method was the same as previously described 32

Statistical analysis. The PCA diagram was drawn using the Mass Profiler Professional (MPP, B.02.01, Agilent), and the analysis parameters were the same as previously described 32 . HCA was performed using R software to study the accumulation patterns and metabolic diversities of metabolites.
The metabolite data contained the relative intensities of the 117 metabolites in the 62 Citrus germplasms. The phylogenetic tree was built from a metabolomic perspective using pairwise population distances in PHYLIP (version 3.69), and was visualized using TreeView and MEGA5.
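As an illustration of how metabolite profiles might be turned into the pairwise distance input expected by a neighbor-joining program, a minimal sketch follows. The germplasm names and intensities are hypothetical, Euclidean distance is an assumption (the distance measure used in this study is not specified here), and the output follows the commonly used PHYLIP square distance-matrix layout: a first line with the number of taxa, then one row per taxon starting with a name padded to 10 characters.

```python
import math

# Hypothetical relative intensities of three flavonoids in three germplasms.
profiles = {
    "SO_1": [5.2, 3.1, 0.4],
    "M_1": [6.0, 3.5, 0.3],
    "L_1": [0.9, 1.2, 4.8],
}


def euclidean(a, b):
    """Euclidean distance between two intensity profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(data):
    """Return taxon names and the full symmetric distance matrix."""
    names = list(data)
    matrix = [[euclidean(data[a], data[b]) for b in names] for a in names]
    return names, matrix


def to_phylip(names, matrix):
    """Format a square distance matrix with 10-character taxon names."""
    lines = [f"{len(names):5d}"]
    for name, row in zip(names, matrix):
        label = name[:10].ljust(10)
        lines.append(label + " ".join(f"{d:.4f}" for d in row))
    return "\n".join(lines)


names, matrix = distance_matrix(profiles)
phylip_text = to_phylip(names, matrix)
print(phylip_text)
```

In this toy matrix the sweet orange and mandarin profiles are much closer to each other than either is to the lemon profile, so a distance-based tree would be expected to join them first.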