Metabolomics-Based Discovery of Biomarkers with Cytotoxic Potential in Extracts of Myracrodruon urundeuva

Myracrodruon urundeuva is a species threatened with extinction due to anthropogenic exploitation. Phytochemical analysis of bark, branch and leaf extracts revealed the presence of several compounds such as flavonoids, phenols, tannins, quercetin derivatives and anacardic acids. A dereplication methodology was used to tentatively identify 50 compounds analyzed by ultra-performance liquid chromatography coupled with electrospray ionization quadrupole time-of-flight mass spectrometry operating in MS E mode (UPLC-QTOF-MS E ). The extracts exhibited anti-tumor effects in the cancer cell lines HCT-116 (colorectal), SF-295 (glioblastoma), HL-60 (leukemia), and RAJI (leukemia). These results correlate with the principal component analysis (PCA) data, which identified three distinct groups and clearly indicated metabolic differences between the organs of M. urundeuva . Through orthogonal partial least squares discriminant analysis (OPLS-DA), variable importance in projection (VIP) and S-Plot, we were able to determine 30 potential biomarkers. The fingerprint of the hydroethanolic extracts was correlated with the cytotoxicity assay and demonstrated a significant difference in the composition of the plant extracts.


Introduction
Myracrodruon urundeuva Fr. All. (Anacardiaceae family), popularly known as "aroeira-do-sertão", is a medicinal tree found in several regions of Brazil, especially in the caatinga. 1 Currently, it is included in the official list of Brazilian flora species threatened with extinction 2 in the vulnerable category due to indiscriminate use of the species for several purposes in the timber and pharmaceutical sectors. 3 The plant has attracted researchers' interest due to the anti-inflammatory properties of its extracts, notably associated with the presence of bioactive phenolic compounds such as tannins, polyphenols, ellagitannins and, mainly, dimeric chalcones. 4 Previous studies have shown that the chemical properties of these substances may be associated with antitumor activity in lung and leukemia cells. 5 Pharmacological studies revealed a wide variety of activities, including cytotoxic, 6,7 anti-inflammatory and analgesic. 8 Besides, M. urundeuva may prevent cancer indirectly due to the antioxidant and anti-inflammatory activity of its compounds. 4,8-10 Previous studies have reported antitumor activity of ethanolic extracts from different sections of the plant. Extract dilutions yielded a half maximal inhibitory concentration (IC 50 ) of 9.5-16.7 μg mL -1 against leukemic HL-60 cells and, among other lines, of 17.3-36.3 μg mL -1 against SF-295 glioblastoma cells. The activity of this extract was reported to occur via an apoptotic mechanism, which results in a reduction of cell number, cell volume, and viability, in addition to internucleosomal DNA fragmentation. 7 Pessoa et al. 5 reported the action of ethanolic extracts of M. urundeuva leaves against the HL-60 and SW-1573 lines, with IC 50 of 7.4 and 8.5 μg mL -1 , respectively. Therefore, the literature provides several lines of evidence of significant antitumor activity of M. urundeuva extracts, which motivated the study of their chemical composition.
No previous research has identified the compounds associated with the biological activity of M. urundeuva extracts. The metabolomic study of plant samples is of great importance when one seeks to associate a certain bioactivity with the chemical composition of the extract. In this regard, metabolomics focuses on the study of low molecular weight compounds that may be established as biomarkers 11 by means of metabolic fingerprinting and profiling. 12 This area of study covers a set of analyses, from extraction methods to the statistical analysis of the data, in order to identify molecules that can function as biomarkers or that result from genome alteration. 13 It can be applied at different levels, such as metabolic fingerprinting, metabolic profiling, and metabolomics. 12 Ultra-performance liquid chromatography (UPLC) coupled to high-resolution mass spectrometry (HRMS) has the sensitivity to provide a high-resolution mass spectrum for a complex matrix such as a plant extract. Therefore, it is widely used in metabolomics studies involving the identification of substances in highly complex extracts. 14-16 Ultra-performance liquid chromatography coupled with electrospray ionization quadrupole time-of-flight mass spectrometry operating in MS E mode (UPLC-QTOF-MS E ), allied to chemometric analysis, is very useful for the identification of compounds by comparison of different matrices. Further analyses, such as principal component analysis (PCA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA), identify groups that differ from each other and point out the components responsible for these differences, which are recognized as discriminant. 17 Thus, these analyses help to provide information about compounds that could be diagnostic of each sample type and are therefore potential biomarkers.
The present work aimed to explore differences in the metabolic fingerprints of ethanolic extracts of M. urundeuva leaves, branches, and bark by using UPLC-QTOF-MS E and multivariate modeling (PCA and OPLS-DA) in order to search for associations between chemical composition and cytotoxic effect.

Plant material
Samples of leaf, bark and branch from M. urundeuva were collected from naturally occurring young plants in the Embrapa semi-arid experimental field, close to the border between the municipalities of Petrolina and Lagoa Grande (Pernambuco State, Brazil, 09°04'16.4"S, 40°19'5.37"W) on August 24, 2016, between 9 and 10 o'clock in the morning. The voucher specimens have been deposited in the Herbarium under number HTSA4978. Samples of leaves, bark, and branches were collected in biological quintuplicate (from five different trees), taking into account the four quadrants of the tree, in the north, south, east and west directions. Material from the four quadrants was pooled into a single sample for each section of the plant. At the time of collection, the material was cooled with liquid nitrogen (-80 °C). After that, the material was dried in a forced circulation oven at 40 °C for 168 h (one week). Prior to extraction, the samples were ground in a knife mill and stored in a plastic bag at room temperature.

Chemicals
The solvents used were LiChrosolv ® grade, obtained from Sigma-Aldrich Chemical Company (St. Louis, MO, USA). In all methods, high purity Milli-Q water (Billerica, MA, USA) was used. The chlorogenic acid and corilagin standards were obtained from Sigma-Aldrich Chemical Company (St. Louis, MO, USA), and urundeuvines A and B were previously isolated in our laboratory.

Sample preparation
The extracts were prepared by liquid-liquid partition using an adapted method. 18,19 Leaves, branches, and bark (50 mg) were placed in a 15 mL Falcon tube and extracted with 4 mL of hexane, at room temperature, for 20 min in an ultrasonic bath. Afterward, 4 mL of EtOH:H 2 O (7:3) solution was added. The samples were extracted again with hexane, and the hydroethanolic partition was collected to yield the corresponding EtOH extract. Finally, a 1 mL aliquot of the lower (hydroethanolic) phase was filtered (0.20 μm polytetrafluoroethylene (PTFE)), collected in flasks and stored at -80 °C until UPLC analysis.

Mass spectrometry conditions
The chemical profiling of M. urundeuva leaf, branch, and bark extracts was performed by coupling the Waters Acquity UPLC system to a Xevo G2-XS QTOF mass spectrometer (Waters, Milford, MA, USA) with an electrospray ionization (ESI) interface in positive and negative ionization modes. The ESI + and ESI − data were acquired in the range of 110-1180 Da, with a fixed source temperature of 120 °C and a desolvation temperature of 350 °C. A desolvation gas flow of 350 L h -1 was used for the ESI + mode and 500 L h -1 for the ESI − mode. The capillary voltage was 3 kV. Leucine enkephalin was used as the lock mass. The spectrometer operated in MS E centroid mode using a collision energy ramp from 20 to 40 eV. The instrument was controlled by MassLynx 4.1 software (Waters Corporation).

Chemometric data analysis
The UPLC-MS data of all samples were analyzed using the MarkerLynx XS software 20 to identify potential discriminatory chemical markers in the different extracts. For data collection, the method parameters were set to a retention time (t R ) range of 0.88-17.0 min and a mass range of 110-1180 Da. For data analysis, a list of the detected peaks was generated using retention time (t R )-mass (m/z) pairs as the identifier for each peak. An arbitrary ID was assigned to each of these t R -m/z pairs based on their order of elution from the UPLC system. The ion intensity for each detected peak was normalized against the sum of the peak intensities within that sample. Ion identification was based on the t R and m/z values. The resulting three-dimensional data, comprising peak number (t R -m/z pair), sample name, and ion intensity, were analyzed by PCA and OPLS-DA using MarkerLynx. 20

Cytotoxicity of leaf, bark, and branch samples from M. urundeuva

Cell lines and cultures
Cytotoxicity tests were performed against HCT-116 and SW-620 (colorectal), SF-295 (glioblastoma), HL-60 and RAJI (leukemia), PC3 (prostate) and L929 (murine fibroblast) cell lines, which were obtained from the National Cancer Institute (Washington, DC, USA). All cells were cultured in Roswell Park Memorial Institute (RPMI) 1640 medium, except for L929, which was cultivated in Dulbecco's Modified Eagle Medium (DMEM). Both media were supplemented with 10% fetal bovine serum (FBS) and 1% antibiotics (100 U mL -1 penicillin and 100 μg mL -1 streptomycin), and the cells were kept at 37 °C with 5% CO 2 . The L929 cell line was used to evaluate the selectivity of the extracts. In these assays, the anticancer drug doxorubicin was used as the positive control.

Statistical analysis of data activity
All experiments were performed in duplicate and repeated three times. For all samples, the selectivity index (SI) was calculated as the ratio between the IC 50 value of each test compound in the L929 non-tumor cell line and its IC 50 value in the tumor cell line (SI = IC 50 L929 / IC 50 neoplastic cells). 22 The experiments were analyzed according to the mean ± standard deviation (SD) of the percentage of cell growth inhibition using the GraphPad Prism software. 23
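As a minimal sketch of the SI calculation defined above, the ratio can be computed as follows. The function name and the IC 50 values in the example are hypothetical and serve only as an illustration; they are not data from this study.

```python
def selectivity_index(ic50_non_tumor, ic50_tumor):
    """Return SI = IC50 in the non-tumor line (L929) / IC50 in the tumor line."""
    if ic50_non_tumor <= 0 or ic50_tumor <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_non_tumor / ic50_tumor


# Hypothetical IC50 values in ug/mL (illustration only, not measured data)
si = selectivity_index(40.0, 16.0)
print(si)  # 2.5
```

An SI above 1 means that the sample is more potent against the tumor line than against the non-tumor L929 line.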

Chemical profile by UPLC-QTOF-MS E
The ethanolic extracts of the three sections of M. urundeuva were obtained from the methodology described in "Sample preparation" sub-section. The extracts were analyzed by UPLC-QTOF-MS E following the parameters described in "Chromatographic conditions" and "Mass spectrometry conditions" sub-sections only in the negative mode. In all, about 50 compounds were tentatively identified, covering the three sections of the species studied using MS and MS/MS from the chromatographic analysis ( Figure 1). These results were compared to the data reported in the literature (chemotaxonomic) referring to the family (Anacardiaceae) and the genus (Myracrodruon) because there are few reports concerning the species. We used databases such as PubChem, ChemSpider, and Scifinder to support the results.
A wide range of phenolic compounds was identified, mainly flavonoid and tannin derivatives. The predominant compounds in leaves were corilagin, reported here for the first time in the species, and geraniinic acid, as well as compounds well known in the literature such as quercetin, gallic acid, and anacardic acid derivatives. The ethanolic extract of branches contained predominantly chlorogenic acid, quinic acid derivatives and the dimeric chalcones urundeuvines A and B. The bark contained mostly catechin derivatives, in addition to the compounds found in the branch.
A fragmentation study of the tentatively identified possible biomarkers was performed and is presented below. The remaining substances have been tentatively identified and are presented in Table S1 and Figure S1 (Supplementary Information section).

The fragmentation pattern was very similar, and they are derived from the ellagitannin common in the genus Phyllanthus. 25 These compounds are being reported for the first time in the Anacardiaceae family.

Flavanols
Peaks 9 and 15 were identified as gallocatechin derivatives. All gallocatechin derivatives have a 125 Da fragment. 29 This fragmentation is shown in Figure 2 by the formation of free phenol and the absence of a gallic acid fragment, indicating the gallocatechin and epigallocatechin compounds. This proposal was based on Miketova et al. 29  [M − H − H 2 O] − . Based on the work of Abu-Reidah et al., 28 it was identified as malic acid. Peak 5 showed in its first-order spectrum the molecular ion [M − H] − at m/z 191.0192 (C 6 H 8 O 7 ). This compound showed a fragment at 111.0079 Da and was identified as citric acid, as suggested by Lafontaine and co-workers. 30 Peaks 10 Figure 4).
In addition, the assignment was corroborated by comparing the retention times in the extract and in the analytical standard, which were 6.24 and 6.63 min; and 6.24 and 6.69 min, respectively. Therefore, these compounds were identified as urundeuvine A isomers II and III. Similarly, MS/MS spectra and retention time

Chemometric analysis
The main objective of PCA is to transform large amounts of complex analytical data into easily understood data. 33 PC2 corresponds to the maximum amount of variance not explained by PC1; in this case, the branch is positive for PC2, whereas leaf and bark are negative. Therefore, according to the PCA data, it is evident that the three parts of the plant differ in their respective chemical profiles.
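The following is a minimal sketch of how PCA scores of this kind can be obtained. It assumes a samples × peaks intensity matrix, normalizes each sample to its total intensity as described in the "Chemometric data analysis" sub-section, mean-centers the variables and uses singular value decomposition. The function names and the data are hypothetical, and any additional scaling applied internally by MarkerLynx is not reproduced here.

```python
import numpy as np


def normalize_to_total(intensities):
    """Scale each sample (row) so that its peak intensities sum to 1."""
    x = np.asarray(intensities, dtype=float)
    totals = x.sum(axis=1, keepdims=True)
    if np.any(totals <= 0):
        raise ValueError("each sample needs a positive total intensity")
    return x / totals


def pca_scores(x, n_components=2):
    """Return PCA scores and explained-variance ratios via SVD."""
    centered = x - x.mean(axis=0)            # mean-center every variable
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)      # fraction of variance per PC
    return scores, explained[:n_components]


# Hypothetical intensities: 6 samples (rows) x 4 tR-m/z pairs (columns)
raw = np.array([
    [120.0, 30.0, 10.0, 40.0],
    [110.0, 35.0, 12.0, 43.0],
    [20.0, 90.0, 60.0, 30.0],
    [25.0, 85.0, 65.0, 25.0],
    [50.0, 20.0, 100.0, 30.0],
    [55.0, 25.0, 95.0, 25.0],
])
norm = normalize_to_total(raw)
scores, explained = pca_scores(norm)
print(scores)     # one row per sample: coordinates on PC1 and PC2
print(explained)  # share of total variance captured by PC1 and PC2
```

Samples with similar chemical profiles fall close together in the PC1 × PC2 plane, which is how group separation is read from a scores plot.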
After principal component analysis of all samples, OPLS-DA was performed for the three pairwise comparisons (leaf-bark, leaf-branch, and bark-branch). The formation of distinct groups, also observed in the PCA, was clearly verified, demonstrating the dissimilarity between leaves, branches, and bark. In addition, OPLS-DA shows the intra-group variation, that is, how much samples from the same tissue differ from each other; in this case, the leaf samples were more homogeneous than the bark and branch samples, as shown in the OPLS-DA graphs ( Figure 6). The quality of the model is expressed by R 2 Y (explained variance) and Q 2 (predicted variance), whose values must be above 0.5; the closer to 1, the more reliable the model. 34 For the analysis, R 2 Y and Q 2 ranged from 0.98 to 0.99, indicating that the results are highly reliable. In order to identify the metabolites that contribute most to the distinction between the parts of the plant, other statistical tools derived from the OPLS-DA were used: VIP (variable importance in projection) and S-Plot. VIP indicates which variables are the most significant for the selection of biomarkers; in general, VIP > 1 is considered statistically significant. 35 In the present study, VIP > 1 and p < 0.05 were used. The S-Plot highlights the discriminant variables, that is, those that move away from the common axis between the two groups compared. Figure 7 presents the VIP and S-Plot graphs for the leaf-bark group. The complete data for all other comparisons made through the S-Plot and bar charts are presented in Figures S4 and S5 (Supplementary Information section).
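The VIP > 1 and p < 0.05 criterion can be sketched as follows, assuming that the weights, scores and Y-loadings of an already fitted PLS-type model are available. The code uses the commonly cited VIP definition for a standard PLS model; the way MarkerLynx computes VIP for OPLS-DA may differ. All function names, arrays and p-values are hypothetical, and the text does not state which statistical test produced the p-values.

```python
import numpy as np


def vip_scores(weights, scores, y_loadings):
    """Compute VIP per variable from PLS weights, scores and Y-loadings."""
    w = np.asarray(weights, dtype=float)             # (n_variables, n_components)
    t = np.asarray(scores, dtype=float)              # (n_samples, n_components)
    q = np.asarray(y_loadings, dtype=float).ravel()  # (n_components,)
    ss = q ** 2 * np.sum(t ** 2, axis=0)             # Y variance explained per component
    w_sq = (w / np.linalg.norm(w, axis=0)) ** 2      # squared, column-normalized weights
    return np.sqrt(w.shape[0] * (w_sq @ ss) / ss.sum())


def select_markers(vip, p_values, vip_cutoff=1.0, alpha=0.05):
    """Return indices of variables with VIP > cutoff and p < alpha."""
    vip = np.asarray(vip, dtype=float)
    p_values = np.asarray(p_values, dtype=float)
    return np.flatnonzero((vip > vip_cutoff) & (p_values < alpha))


# Hypothetical model output: 4 variables, 5 samples, 2 components
weights = np.array([[0.8, 0.1], [0.1, 0.7], [0.5, 0.5], [0.2, 0.3]])
scores = np.array([[1.0, 0.2], [-1.0, 0.1], [0.5, -0.3], [-0.5, 0.4], [0.0, -0.4]])
y_loadings = np.array([0.9, 0.3])
p_values = np.array([0.01, 0.20, 0.03, 0.50])  # assumed to come from a separate test

vip = vip_scores(weights, scores, y_loadings)
print(select_markers(vip, p_values))  # indices of candidate discriminant variables
```

With this definition the mean of the squared VIP values equals 1, which is why VIP > 1 is the usual cutoff for an above-average contribution.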
By combining OPLS-DA, VIP and S-Plot, it was possible to tentatively identify the possible biomarkers ( Table 1) that may be associated with the highest cytotoxic activity observed among the leaf, bark and branch extracts.

Cytotoxic activity
The screening tests (100 μg mL -1 ) showed growth inhibition above 70% in all cell lines exposed to ethanolic bark extracts. The leaf extracts were toxic only to leukemia (HL-60) cells. Table 2 shows the results of the cytotoxicity assays.
After initial screening, IC 50 tests were performed with the bark and leaf extracts. The extracts showed higher cytotoxic potential against the leukemic cell line with IC 50 ranging from 17.46 (bark) to 18.55 μg mL -1 (leaf) ( Table 3).
Other studies show in vitro cytotoxic effects of plant ethanolic extracts. IC 50 of 38.1 μg mL -1 was found after treatment of leukemic cells with ethanolic extract of M. urundeuva seeds. 7 The authors showed DNA fragmentation and mitochondrial depolarization caused by seed extracts.
The inhibition of tumor cell growth by the ethanolic extract of M. urundeuva has also been studied. 5 IC 50 values in that study were 7.4 μg mL -1 against HL-60 and 8.5 μg mL -1 against SW-1573. Bark ethanolic extract, for example, presented 95.6% growth inhibition for the breast, colon, and glioblastoma lines. 6 Viana et al. 4 demonstrated that the hydroethanolic extract of M. urundeuva bark exerts anti-inflammatory and analgesic effects related to chalcones. The extracts have antioxidant properties attributed to flavonoids. 36 Souza et al. 8 demonstrated anti-inflammatory and protective effects against gastric ulcer in mice or rats after treatment with a tannin-rich fraction extracted from the "aroeira" using ethyl acetate as the solvent.
The selectivity index (SI) of each sample was evaluated.
The SI measures how active a compound is against tumor cells without causing damage to non-tumor ones, and values greater than 2.0 are considered of interest. 37 Among the cell lines tested, the leaf extract showed selectivity to the HL-60 line, with an index higher than 2. This result can be correlated with the possible biomarkers present in the leaf that were tentatively identified via chemometric analysis, such as corilagin, geraniin, geraniinic acid and quercetin derivatives, among others. Corilagin, a compound well described in the literature, presents a variety of pharmacological effects, such as anti-tumor, 38 anti-inflammatory, 39 antioxidant, 40 and hepatoprotective. 41 The literature also reports good antitumor activity along with low toxicity to healthy cells and tissues, making corilagin a promising anticancer lead molecule. 42 Geraniin, another possible adjuvant compound, is known to exert antitumor, 43 antibacterial, 44 antioxidant 45 and antiviral activities. 46,47 Quercetin and its derivatives are well known for their antioxidant, antihistaminic and anti-inflammatory properties. 48-50 Quercetin is being considered a promising new chemotherapeutic agent, and several studies are underway to explore molecules derived from quercetin for cancer-directed chemotherapy. 51 In addition, there may be other substances, possibly acting synergistically, that play significant roles in the reported bioactivity but were not identified in the present work due to limitations of the technique chosen.
The bark showed very promising results. The SI was

a IC 50 values with a 95% confidence interval obtained by non-linear regression from three independent experiments performed in duplicate on six tumor lines and one non-tumor line; b doxorubicin was used as a positive control.

Souza et al. 8 had demonstrated antimicrobial and anti-inflammatory activity for M. urundeuva. The literature also has evidence of antitumor activity being promoted by specific classes of flavonoids such as chalcones, flavanones, and flavones. Many derivatives of these classes showed significant activity against tumor cell lines such as human colon, breast, and kidney. 53 Different polyphenols from "aroeira-vermelha" (Schinus terebinthifolius Raddi) induced cell death of human prostate carcinoma and were considered capable of modulating cell proliferation according to the test concentration. 53 Catechins have been shown to inhibit prostate and colon cancer. 54,55 The combination of classical chemotherapy with nutrients, and especially with polyphenols, may decrease the pressure and the adverse effects of the antineoplastic drug. 56 Therefore, three of the polyphenols present in the bark of the "aroeira" tree are promising compounds for isolation or synthesis in the development of phytopharmaceutical products from natural extracts. Chemical investigation of this extract can be a promising strategy for the discovery of phytotherapeutic agents. Also, the comparison of the chemical profiles of bark and branch extracts revealed compounds that may be important for their biological activity.

Conclusions
Using a simple and rapid extraction method, it was possible to trace the chemical profile of three organs of M. urundeuva (leaf, branch and bark) by UPLC-QTOF-MS E , which allowed the tentative identification of 50 compounds covering several classes, such as flavonoids, flavanoids, hydrolysable tannins and anacardic acid. The multivariate data analyses provided information about the metabolic differences between the extracts compared. Such an association has been significant in the discussion of the observed activities because the extracts produced different responses against the tested lines.
The bark and leaf extracts showed high toxicity and low IC 50 values against the HL-60 (leukemia), HCT-116 (human colon) and RAJI (leukemia) cell lines compared to the branch extract. The higher relative concentration of quercetin derivatives, galloyl derivatives, and phenolic acids in these extracts may help to explain the high cytotoxic activity observed. Some of the compounds identified, such as quercetin derivatives, corilagin, and chlorogenic acid, already have recognized anti-tumor, antioxidant, and anti-inflammatory activities, among others, which may explain the promising activities observed here compared to the literature.
Besides, the statistical analysis showed the separation of the groups corresponding to each part of the plant and allowed the identification of 30 possible biomarkers. Therefore, this metabolomic study highlights the importance and value of M. urundeuva as a possible source of secondary metabolites that are likely to inhibit certain types of cancer cells.