Systematic analysis of the polyphenol metabolome using the Phenol‐Explorer database

Scope The Phenol‐Explorer web database details 383 polyphenol metabolites identified in human and animal biofluids from 221 publications. Here, we exploit these data to characterize and visualize the polyphenol metabolome, the set of all metabolites derived from phenolic food components. Methods and results Qualitative and quantitative data on 383 polyphenol metabolites as described in 424 human and animal intervention studies were systematically analyzed. Of these metabolites, 301 were identified without prior enzymatic hydrolysis of biofluids, and included glucuronide and sulfate esters, glycosides, aglycones, and O‐methyl ethers. Around one‐third of these compounds are also known as food constituents and corresponded to polyphenols absorbed without further metabolism. Many ring‐cleavage metabolites formed by gut microbiota were noted, mostly derived from hydroxycinnamates, flavanols, and flavonols. Median maximum plasma concentrations (C max) of all human metabolites were 0.09 and 0.32 μM when consumed from foods or dietary supplements, respectively. Median time to reach maximum plasma concentration in humans (T max) was 2.18 h. Conclusion These data show the complexity of the polyphenol metabolome and the need to take into account biotransformations to understand in vivo bioactivities and the role of dietary polyphenols in health and disease.


Introduction
Polyphenols are a large and complex class of bioactive plant compounds. Around 500 are known to be consumed in human diets, and the compounds are further divided into numerous classes and subclasses according to their carbon skeleton [1] (Supporting Information 1). The average total intake of polyphenols per person in Western populations could be as high as 2 g/day [2]. Polyphenol consumption may protect against cardiovascular diseases, cancers, and a range of other diseases [3,4]. As a result, the last 15 years have seen the accumulation of an expansive body of literature on their occurrence, metabolism, bioactivities, and bioavailability.
Correspondence: Augustin Scalbert. E-mail: scalberta@iarc.fr
Abbreviations: C max, maximum plasma concentration; T max, time to reach maximum plasma concentration
The total set of polyphenols or polyphenol metabolites present in foods or in human biospecimens is termed the "polyphenol metabolome" [5]. Circulating metabolites are unlikely to be the ingested polyphenols themselves; more often they are derivatives formed either by endogenous metabolism or by the microbiota, modifications that allow effective transport and elimination from the body. Many of these metabolites have been identified in human and animal biofluids from controlled intervention studies with polyphenol-rich foods or pure polyphenols. All metabolites described in the literature have been extracted and stored in Phenol-Explorer, a web database originally conceived as a repository of polyphenol food composition data [6,7]. The database enables the retrieval of all known polyphenol metabolites formed upon uptake of a particular food or polyphenol, with accompanying pharmacokinetic data if available. Until the release of this database, the identities of polyphenol metabolites were scattered throughout a large volume of literature and frequently difficult to obtain.
Previous studies have reviewed polyphenol absorption and metabolism from controlled intervention studies, taking into account the dose size and mode of administration [8,9]. Instead, the aim of this study is to exploit all Phenol-Explorer data to systematically analyze the polyphenol metabolome and, more specifically, the nature of the metabolites reported in intervention studies and their pharmacokinetic properties. Polyphenol metabolites are distributed to tissues and may act, e.g., by modulating signaling cascades, which govern biological processes, such as endothelial function [10] and cell-cycle control [11]. Thus a notable use of the database might be to find out which metabolites circulate after administration of a given polyphenol-rich source that is suspected of imparting beneficial effects upon health [12]. Qualitative and pharmacokinetic knowledge of the polyphenol metabolome has also been essential for the identification of biomarkers of polyphenol intake in metabolomic studies [13].

Sources of data on polyphenol metabolism and pharmacokinetics
All polyphenol intervention studies in the Phenol-Explorer database were included in the analysis [7]. Publications were required to meet the following base criteria: (i) be intervention studies using a single or repeated orally administered dose of a normal food source, an experimental food (e.g., food extract, dried, and powdered foods) or an oral supplement; (ii) be conducted in vivo on disease-free humans or animals; (iii) use an appropriate analytical technique capable of reliably identifying metabolites; (iv) detect or quantify at least one polyphenol metabolite in urine or plasma. In brief, separate "interventions" were identified from each publication, where one intervention was defined as the administration of a given dose of polyphenol source (or control) to a single species, with subsequent collection of biofluids. Study design details for each intervention were entered into the database followed by the identities of metabolites corresponding to each intervention, along with any pharmacokinetic details.

Data analysis
Phenol-Explorer tables on metabolites detected, metabolite concentrations, plasma pharmacokinetics, and urine pharmacokinetics were exported and data analyzed using Microsoft Excel and R statistical software (http://www.R-project.org/). Chemical similarities, based on Tanimoto scores, were calculated for each pair of Phenol-Explorer metabolites using the PubChem structure clustering tool. The MetaMapp tool (http://metamapp.fiehnlab.ucdavis.edu) was used to format and filter the PubChem similarity output matrix, associating only pairs of metabolites with similarity scores >0.7 [14]. The matrix was mapped to a network graph using Cytoscape open-source software [15]. The metabolic transformations of pure compounds administered to humans and animals were visualized using Circos open-source software [16].
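The similarity filtering step described above can be illustrated with a minimal sketch. It is not the PubChem or MetaMapp implementation: the fingerprints are toy sets of integers standing in for "on" bits, and the function names `tanimoto` and `similarity_edges` are hypothetical. Only the Tanimoto definition and the >0.7 cut-off come from the text.

```python
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared features / features present in either set."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_edges(fingerprints: dict, threshold: float = 0.7) -> list:
    """Return (name1, name2, score) for every pair scoring above the threshold."""
    edges = []
    for n1, n2 in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[n1], fingerprints[n2])
        if score > threshold:
            edges.append((n1, n2, round(score, 3)))
    return edges


# Toy fingerprints (assumption): not real PubChem substructure keys.
toy_fingerprints = {
    "metabolite_A": {1, 2, 3, 4, 5, 6, 7, 8},
    "metabolite_B": {1, 2, 3, 4, 5, 6, 7, 9},
    "metabolite_C": {1, 20, 21, 22},
}

edges = similarity_edges(toy_fingerprints)
print(edges)
```

Each retained pair becomes one edge of the network graph; pairs at or below the threshold are simply not connected, which is what produces separate clusters in a similarity map.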

Polyphenol intervention studies
Data originate from 221 publications describing intervention studies with polyphenol-rich sources. These publications produced 424 separate interventions, of which 398 were on humans or rats. Polyphenol sources were regular and "experimental" foods and included raw foods, processed foods, and food extracts. In addition, pure polyphenol doses were administered in solution or in powder, tablet, or capsule form. Flavonoids were most studied overall, particularly the flavanones and anthocyanins. Foods and experimental foods were most commonly administered to humans, whereas pure compounds were most commonly administered to rats. Doses were either given once or at time intervals, and intervention duration varied from a few hours to several months. The range of study designs is summarized in Table 1.

Diversity of polyphenol metabolites
A total of 383 polyphenol metabolites identified in blood or urine were compiled from the literature as part of the polyphenol metabolome. Similar numbers of metabolites were reported in human and rat studies, with 104 metabolites common to both species (Fig. 1A). Fewer metabolites were reported only in plasma than only in urine, and around half were identified in both biofluids (Fig. 1B). Most metabolites (n = 301) were identified without prior hydrolysis of the glucuronides and sulfate esters with enzymes (Fig. 1C). These included 53 glucuronides and 23 sulfate esters as well as 67 glycosides (Supporting Information 2). Anthocyanin glycosides accounted for most of the glycoside conjugates. The remaining metabolites were aglycones or esters. Of the 301 metabolites identified in biofluids without prior enzyme treatment, 114 were known as food components (out of the 502 known food polyphenols in Phenol-Explorer) (Fig. 1D). The latter correspond to compounds absorbed in the body without further biotransformation. The remaining 187 metabolites correspond to methylated derivatives and other metabolites formed in host tissues or by microbiota.
Table 1 footnotes: Four hundred twenty-four separate interventions were included, originating from 221 publications. a) Other animal models used were mouse, pig, dog, and sheep. All references used can be obtained from http://www.phenol-explorer.eu.
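As a quick consistency check on the counts reported in this section, the short sketch below recomputes the derived figures from the stated totals. The variable names are illustrative only; every number is taken from the text.

```python
# Counts quoted in the text above
without_hydrolysis = 301       # metabolites identified without enzymatic hydrolysis
also_food_components = 114     # of those, also known as food constituents
known_food_polyphenols = 502   # food polyphenols listed in Phenol-Explorer

# Metabolites found in biofluids but not known as food components
biofluid_only = without_hydrolysis - also_food_components

# Share of non-hydrolyzed metabolites absorbed unchanged from food
share_unchanged = also_food_components / without_hydrolysis

# Share of known food polyphenols that were recovered unchanged in biofluids
share_of_food = also_food_components / known_food_polyphenols

print(f"biofluid-only metabolites: {biofluid_only}")
print(f"absorbed unchanged: {share_unchanged:.1%}")
print(f"food polyphenols recovered unchanged: {share_of_food:.1%}")
```

The first ratio is the basis for the "around one-third" statement in the abstract; the second shows that under a quarter of the food polyphenols catalogued in the database have been reported intact in biofluids.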
Mapping of the polyphenol metabolome by chemical similarity revealed distinct clusters of related metabolites (Fig. 2). The largest cluster, in the central part of the chemical similarity map, contains glucuronides of all polyphenol classes and subclasses, as well as some glycosides. The anthocyanins (mono- and diglycosides) are chemically distinct from all other metabolites and account for the next largest cluster. Hydroxycinnamic acids and benzoic acids are seen clustered at the bottom and middle and left. Compounds found only in biofluids and those also known as food components are indicated by red and green nodes, respectively. The former include glucuronides, sulfate esters, O-methylated compounds (e.g., 3-O-methylepicatechin, and O-methylated anthocyanins in the center of the anthocyanin cluster), and microbial metabolites (such as dihydrogenated isoflavones and hydroxycinnamic acids, hippuric acid, and urolithin C). The most commonly reported metabolites are emphasized with node sizes proportional to the number of interventions in which they were identified. See Supporting Information 3 for full metabolite labels.

Pharmacokinetics
Of the 383 metabolites in Phenol-Explorer, 235 were quantified in biofluids at multiple time points and their pharmacokinetics described. A total of 222 maximum plasma concentration (C max) values were collected for 88 metabolites. In humans, a median C max of 0.1 μM (interquartile range = 0.37 μM) was observed after food or dietary supplement consumption. Stratification of this distribution by dose type revealed a shifted distribution of values for pure compounds compared to foods (median values of 0.32 and 0.09 μM, respectively; Fig. 3A). In rats, where pure polyphenols were more often administered, higher maximum concentrations were achieved (median = 0.41 μM, interquartile range = 2.25 μM). In relation to time taken to reach maximum plasma concentration (T max), 207 values were collected corresponding to 81 metabolites. Median T max was much shorter in rats than in humans (median time = 0.71 versus 2.18 h). In rats, peak concentrations usually occurred within 1 h of polyphenol administration and few values >5 h were observed (Fig. 3B). In humans, T max of flavanone and isoflavone metabolites was usually in excess of 5 h (Fig. 3C). For most other subclasses, T max was usually less than 2 h. Seventy-one plasma half-life values were compiled for the Phenol-Explorer database. Median half-life for all polyphenol metabolites in humans was 2.8 h (interquartile range = 4.05 h).
The proportion of dose excreted in urine was measured for 80 compounds until 24 h or more after administration. For extracts incubated with either β-glucuronidase or sulfatase to group all conjugates with a common parent polyphenol, proportion of dose recovered could be used as a measure of overall bioavailability. Median urinary excretion was 10.9%, although this varied substantially between different polyphenols (interquartile range = 25.9%) (Supporting Information 4). Some were particularly poorly recovered (e.g., 0.01% for some anthocyanins) either due to poor absorption in the gut or to extensive biotransformation. The highest recoveries in humans were observed for stilbenes, tyrosols, and isoflavonoids (20-60%).
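The summary statistics used in this section (median and interquartile range, stratified by dose type) can be sketched as follows. The C max values are made-up placeholders, not Phenol-Explorer data, and the helper `median_and_iqr` is a hypothetical name; the original analysis was done in Excel and R, whose default quantile methods may give slightly different quartiles.

```python
from collections import defaultdict
from statistics import median, quantiles


def median_and_iqr(values):
    """Return (median, interquartile range) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1


# Placeholder records (assumption): (dose type, C max in micromolar)
records = [
    ("food", 0.02), ("food", 0.08), ("food", 0.30),
    ("supplement", 0.10), ("supplement", 0.40), ("supplement", 2.00),
]

# Group C max values by dose type before summarizing
by_dose_type = defaultdict(list)
for dose_type, c_max in records:
    by_dose_type[dose_type].append(c_max)

summary = {dose: median_and_iqr(vals) for dose, vals in by_dose_type.items()}
for dose, (med, iqr) in summary.items():
    print(f"{dose}: median = {med:.2f} uM, IQR = {iqr:.2f} uM")
```

Reporting the median with the interquartile range rather than the mean is a reasonable choice for concentration data of this kind, which are typically right-skewed by a few very high values.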

Frequently reported metabolites from polyphenol-rich and pure polyphenol doses
The most frequently identified metabolites (without enzyme treatment of biofluids) in Phenol-Explorer were determined based on the number of intervention studies in which each was described (Table 2). For many of these, food sources are available in Phenol-Explorer as well as the identities of precursors administered as pure compounds. The most commonly reported metabolite was cyanidin 3-O-glucoside, which appeared in plasma in 40 different interventions after administration of different berries, berry extracts, and pure cyanidin 3-O-glucoside. Delphinidin and peonidin glucosides were also commonly reported in these interventions. From the flavanol subclass, epicatechin and 3-O-methylepicatechin were frequently observed in biofluids, usually after experimental doses of cocoa and tea. The most frequently observed glucuronides were those of hesperetin and naringenin, usually originating from citrus fruits. The phenolic acids 4-hydroxybenzoic acid, 4-hydroxyphenylacetic acid, gallic acid, and the most downstream polyphenol metabolite, hippuric acid, were frequently identified as metabolites of a diverse range of polyphenols. Administration of 59 pure polyphenol doses, mostly to rats, led to the identification of 160 different metabolites in biofluids. Doses of 5-caffeoylquinic acid, which is particularly well studied as a major coffee phenol, led to the identification of 27 circulating metabolites across all studies, while caffeic acid and epicatechin were each the precursor of 22 derivatives. Over half of all metabolites identified in biofluids in interventions with pure polyphenols were, however, derived from a single precursor only. The metabolism of these pure polyphenol doses is represented in Fig. 4.
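The precursor counts discussed above (metabolites per precursor, precursors per metabolite, and the share of metabolites traced to a single precursor) amount to simple tallies over precursor-metabolite pairs. The sketch below uses placeholder labels (P1, M1, and so on) rather than real compounds, and the pair list is an assumption made purely for illustration.

```python
from collections import defaultdict

# Placeholder (precursor, metabolite) pairs; a pair may recur across studies
pairs = [
    ("P1", "M1"), ("P1", "M2"),
    ("P2", "M2"), ("P2", "M3"),
    ("P3", "M2"), ("P3", "M4"),
    ("P1", "M1"),  # same pair reported by a second study
]

# Sets remove duplicate reports of the same pair
metabolites_of = defaultdict(set)
precursors_of = defaultdict(set)
for precursor, metabolite in pairs:
    metabolites_of[precursor].add(metabolite)
    precursors_of[metabolite].add(precursor)

metabolite_counts = {p: len(m) for p, m in metabolites_of.items()}
precursor_counts = {m: len(p) for m, p in precursors_of.items()}

# Fraction of metabolites that were derived from exactly one precursor
single = sum(1 for n in precursor_counts.values() if n == 1)
single_share = single / len(precursor_counts)

print(metabolite_counts)
print(precursor_counts)
print(f"single-precursor metabolites: {single_share:.0%}")
```

The per-metabolite precursor counts are the kind of quantity a histogram track can display alongside each compound in a circular pathway diagram.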

Discussion
Biological databases are essential for the efficient retrieval of relevant information from large numbers of scientific publications [17]. In the present study, all polyphenol metabolites reliably identified in human and animal biofluids and described in literature were collated and regarded as the experimentally determined polyphenol metabolome. This is only a small subset of all metabolites theoretically possible, since coverage is governed by research design and analytical capabilities. For example, fewer glucuronides and sulfates and more aglycones and esters were described in intervention studies included in Phenol-Explorer than might be expected. Conjugates are indeed difficult to identify, and metabolites whose position of conjugation was unknown were not included in Phenol-Explorer. Also, a striking number of unchanged glycosides were found in biofluids; most were anthocyanins, recovered in low concentrations, which escaped deglycosylation by luminal enzymes and can be transported into cells by bilitranslocase membrane proteins [18,19]. In addition, several glycosides of other polyphenol subclasses were either detected and quantified (hesperidin, naringin, neohesperidin, puerarin, and quercetin 3-O-rutinoside) or detected only (
Absorption and pharmacokinetics govern the extent to which polyphenols reach target tissues via systemic circulation. Phenol-Explorer data show that plasma concentrations > 10 μM are possible but implausible from human diets, and typical plasma concentrations of individual polyphenol metabolites are many times lower (Fig. 3A). The greatest C max recorded in humans from an unmodified food was 3.95 μM quercetin from a dose of shallot skin [20], although the consumption of this tissue alone is improbable. The highest C max recorded for a polyphenol following a dose of any commonly consumed food tested was 1.21 μM dihydroferulic acid following intake of instant coffee (4 g instant coffee powder in hot water) [21]. In both cases, glucuronides and sulfate esters were hydrolyzed before analysis. However, overall high levels of polyphenol exposure may be experienced upon the simultaneous consumption of different polyphenol-rich foods containing many different polyphenols. Upon ingestion of such a meal, total polyphenol plasma concentrations in excess of 5 μM might be possible. Time taken to reach these peak concentrations is also important, since metabolites which are absorbed gradually and persist in circulation may be more likely to reach target tissues. T max values collected for Phenol-Explorer clustered at around 2 h and 5 h after polyphenol doses in humans, corresponding to absorption in the small and large intestine, respectively. In rat models, small intestine absorption was quicker than in humans and late T max was less frequently observed. Care should be taken when extrapolating pharmacokinetic data from rat models to human metabolism, particularly as metabolizing enzymes show different expression profiles [22].
Some polyphenol metabolites were identified in many different interventions. Cyanidin 3-O-glucoside was the most widely detected, identified in 40 separate intervention studies following ingestion of many different berries and berry extracts. However, the flavanol aglycones epicatechin and 3-O-methylepicatechin were detected after administration of a greater range of polyphenol-rich foodstuffs. Fewer phase II conjugates were reported in multiple interventions because their characterization requires greater analytical precision. Those most often reported were different isoforms of naringenin and hesperetin glucuronides, which originate from flavanone glycosides in citrus fruits. Other metabolites were notable for being derived from a wide range of polyphenols. The ring-cleavage microbial metabolite hippuric acid was identified in biofluids after the intake of sources of polyphenols belonging to all of the most important subclasses (Fig. 4, compound #20). Previous studies have shown the compound to increase upon total polyphenol and fruit intake [23,24], and thus its excretion has been postulated to reflect polyphenol intake. However, it may also be derived from aromatic amino acids [25] and thus is usually excreted in urine at low levels, independently of phenol intake. In the present study, hippuric acid was often associated with its closest precursor 4-hydroxybenzoic acid (Fig. 4, hydroxybenzoic acids, compound #07), which itself was one of the most commonly reported metabolites in the database. This compound, like 3- and 4-hydroxyphenylacetic acids, is a product of microbial polyphenol metabolism [26,27] and could account for substantial proportions of the ingested polyphenol doses. This is illustrated by the many pathways associating more particularly parent hydroxycinnamates, flavanols, and flavonols with various phenolic acid metabolites shown in Fig. 4 (hydroxypentanoic acids, hydroxypropanoic acids, hydroxyphenylacetic acids, and hydroxybenzoic acids). In particular, many of these microbial metabolites are formed from various phenolic precursors (see Fig. 4, grey histograms).
www.mnf-journal.com
There is much interest in the discovery of biomarkers of polyphenol-rich food intake or of total or individual polyphenol intake, since reliable estimates of dietary exposure are necessary to better understand the health effects of polyphenols through epidemiological studies. Biomarkers could measure polyphenol intake more objectively than dietary questionnaires, which are susceptible to different types of bias [28]. Detailed information on metabolism and pharmacokinetics, such as is available in Phenol-Explorer, is useful as a starting point in biomarker discovery but also aids the identification of metabolites, detected in biofluids, which discriminate high and low consumers of a food of interest. Phenol-Explorer data, e.g., support the validity of biomarkers, such as naringenin and hesperetin glucuronides, recently proven to reflect citrus fruit intake reliably [13]. Biomarkers of exposure will be essential for establishing relationships between polyphenol intake and disease risk, particularly for diseases such as cancer where links are currently poorly characterized [5].
The present study is the first to describe and visualize the polyphenol metabolome as currently known. However, certain limitations should be kept in mind. First, despite a much deeper understanding of metabolism than only a few years ago, only a small proportion of the possible polyphenol metabolome has been described. In general, studies have disproportionately concentrated on the principal flavonoid and phenolic acid subclasses, and more studies on lignans and complex polyphenols, such as proanthocyanidin and theaflavin polymers, are needed. A lack of detailed knowledge is evident for some commonly consumed subclasses, such as anthocyanins, which are unstable and quickly break down to smaller products [29]. As enzymatic hydrolyses are often employed, relatively few phase II conjugates have been confirmed, and these have great potential as specific biomarkers of intake of individual polyphenols and polyphenol-containing foods. Over 500 polyphenols are known to be consumed by humans, but in our analysis the administration of only 59 pure polyphenols to humans or rats led to the identification of 160 polyphenol metabolites. Second, more studies administering pure compounds are required to precisely identify metabolites of interest. A particular limitation of the present analysis is the proportion of pure polyphenol studies performed on animals (98 experimental interventions) compared to humans (32 interventions). Thus knowledge obtained from pure polyphenol doses is derived predominantly from animals. Despite these drawbacks, the knowledge of the polyphenol metabolome summarized here is needed both to understand in vivo bioactivity and to aid in the search for biomarkers for application to the study of polyphenol intake in relation to health.
Figure 4. Circular diagram of polyphenol metabolic pathways derived from studies in which pure compounds were orally administered to humans and animals. Compounds are ordered by subclass and numbered. Filled circles represent pure compounds administered in intervention studies and empty triangles represent metabolites found in urine or plasma. Histograms indicate the numbers of precursors leading to the formation of each metabolite (scale 0-8 precursors). Link color represents metabolism reactions as follows: red, methylation; green, glucuronidation; blue, sulfation; gray, combination of reactions or unchanged from precursor. See Supporting Information 5 for compound codes.