Data on coffee composition and mass spectrometry analysis of mixtures of coffee related carbohydrates, phenolic compounds and peptides

The data presented here are related to the research paper entitled “Transglycosylation reactions, a main mechanism of phenolics incorporation in coffee melanoidins: inhibition by Maillard reaction” (Moreira et al., 2017) [1]. Methanolysis was applied to coffee fractions to quantify glycosidically-linked phenolics in melanoidins. Moreover, model mixtures mimicking coffee bean composition were roasted and analyzed using mass spectrometry-based approaches to disclose the regulatory role of proteins in the extent of transglycosylation reactions. This article reports the detailed chemical composition of coffee beans and derived fractions. In addition, it provides gas chromatography–mass spectrometry (GC–MS) chromatograms and the respective GC–MS spectra of silylated methanolysis products obtained from phenolic compound standards, as well as the detailed identification of all compounds observed by electrospray mass spectrometry (ESI-MS) analysis of roasted model mixtures, paving the way for the identification of the same type of compounds in other samples.


GC-MS data of methanolysis products of phenolic compounds standards provide information on the efficiency and linkages cleaved by methanolysis and are the basis for their identification in real samples.
Mass spectrometry data on the roasting-induced compounds formed from model mixtures mimicking coffee bean composition are valuable for the identification of the same type of compounds in roasted coffee, but also other complex roasted carbohydrate-rich matrices.
Subsection 1.2 presents data on the chemical composition of Arabica and Robusta coffee beans, including chlorogenic acid composition (Table 1); simple sugars, caffeine, and protein contents (Table 2); and total sugar composition (Table 3), as well as the chromatic properties of the respective coffee powders (Table 4). Data on the chemical composition of the high molecular weight materials (HMWMs) isolated from roasted Arabica and Robusta coffee infusions are also presented in Subsection 1.2 (Tables 5 and 6). Table 6 includes the phenolic compounds and quinic acid released by methanolysis from coffee HMWMs.
The data presented in Subsection 1.3 include the mass losses observed after roasting of model mixtures prepared using commercial standards of coffee-related carbohydrates, phenolic compounds and peptides (Table 7). The commercial standards used as models of coffee bean components were as follows: (β1-4)-D-mannotriose (Man₃), an oligosaccharide structurally related to the backbone of coffee galactomannans; 5-O-caffeoylquinic acid (5-CQA), the most abundant phenolic compound in green coffee beans; and dipeptides composed of tyrosine (Y) and leucine (L), used as models of coffee proteins. Additionally, malic acid (MalA) and citric acid (CitA), the most abundant aliphatic acids present in green coffee beans, were also used. Subsection 1.3 also presents electrospray mass spectrometry (ESI-MS) data obtained from the model mixtures, either by direct infusion of the sample into the mass spectrometer or by online coupling to liquid chromatography (LC). Fig. 6 shows LC-MS reconstructed ion chromatograms (RICs) acquired from the roasted mixture Man₃-CQA-YL. Table 8 contains the detailed identification of all the ions observed by LC-MS analysis of roasted Man₃ and of the mixtures Man₃-CQA-YL, Man₃-CQA, Man₃-YL, and Man₃-LY. Table 9 presents the accurate masses obtained from high-resolution, high-mass-accuracy measurements on an LTQ-Orbitrap mass spectrometer for the ions identified after roasting of the mixture Man₃-CQA. Tables 10 and 11 summarize the ions identified by ESI-MS analysis of the roasted mixtures Man₃-MalA and Man₃-CitA, respectively. Table 12 presents the accurate masses found by LTQ-Orbitrap for the ions identified after roasting of the mixture Man₃-YL. Table 13 provides data on the LC-MS² fragmentation of roasting-induced compounds formed from the mixture Man₃-YL.

Chemical composition of coffee beans and derived fractions
See Tables 1-6.
Table 1 Chlorogenic acid (CGA) composition (g/100 g of green or roasted coffee).
Table 2 Simple sugars, caffeine, and protein contents (g/100 g of green or roasted coffee).

Data on the model mixtures mimicking coffee composition
See Fig. 6 and Tables 7-13. Some of the Hexₙ and dehydrated derivatives identified by HPLC-ESI-MS (Table 8) were not observed in the negative ESI-MS spectrum acquired on the LTQ-Orbitrap mass spectrometer (Table 9), because neutral oligosaccharides ionize better in positive than in negative mode.
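As an illustration of how the m/z values of Hexₙ ions and their dehydrated derivatives can be predicted and compared with accurate-mass measurements, the following is a minimal Python sketch. It assumes linear hexose oligosaccharides detected as sodium adducts ([M+Na]⁺) in positive mode and as deprotonated ions ([M-H]⁻) in negative mode; these ion types, the function names, and the ppm-error helper are assumptions made for the example and are not taken from the tables.

```python
# Monoisotopic masses (u) of the elements used below, and the electron mass.
C, H, O, NA = 12.0, 1.00782503, 15.99491462, 22.98976928
ELECTRON = 0.00054858

HEX_RESIDUE = 6 * C + 10 * H + 5 * O  # anhydrohexose unit, C6H10O5
WATER = 2 * H + O                     # H2O


def hexn_neutral_mass(n, dehydrations=0):
    """Monoisotopic mass of a linear Hex_n oligosaccharide,
    optionally after the loss of `dehydrations` water molecules."""
    return n * HEX_RESIDUE + WATER - dehydrations * WATER


def mz_sodiated(mass):
    """m/z of the singly charged sodium adduct [M+Na]+ (assumed ion type)."""
    return mass + NA - ELECTRON


def mz_deprotonated(mass):
    """m/z of the singly charged deprotonated ion [M-H]- (assumed ion type)."""
    return mass - H + ELECTRON


def ppm_error(measured, theoretical):
    """Relative mass error, in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    for n in (2, 3, 4):
        for lost in (0, 1):
            m = hexn_neutral_mass(n, lost)
            label = f"Hex{n}" + (f" - {lost} H2O" if lost else "")
            print(f"{label:14s} [M+Na]+ {mz_sodiated(m):.4f}  "
                  f"[M-H]- {mz_deprotonated(m):.4f}")
```

A measured value can then be checked against the prediction with `ppm_error(measured, mz_sodiated(hexn_neutral_mass(3)))`, where `measured` is a placeholder for an experimental m/z.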
The analysis of the reconstructed ion chromatograms (RICs) corroborates the presence of isomeric compounds, i.e. compounds with the same elemental composition but different structures, eluting at different retention times (RTs). However, the exact structural differences could not be inferred from the respective LC-MSⁿ spectra (n = 2-3) because these were very similar, most probably due to the presence of positional isomers. In the case of the compounds bearing a sugar moiety, the structural differences between isomers can be related to different structures of the sugar moiety, differing in glycosidic linkage positions and anomeric configuration.
Ions marked with the symbol † or ‡ were attributed to different isobaric compounds: † for two and ‡ for three possible compounds. For roasted Man₃-CQA-YL, the ion assignment was made considering the most abundant isobaric compounds identified in the roasted mixture Man₃-CQA. However, the presence of isobars in roasted Man₃-CQA-YL cannot be excluded.
Table 13 Compounds identified after roasting of the mixture Man₃-YL: the m/z values of the ions identified, the proposed assignments, the retention time (RT), and the most abundant product ions observed in the respective LC-MS² spectrum, with the indication of the m/z values, mass differences relative to the precursor ion, and the identification of the most informative product ions.

Experimental design, materials and methods
The methodologies used to obtain the data presented here are described in [1] and in the references cited therein. Here, only the protocol for glycosidic linkage analysis is provided, with many experimental details that are usually omitted from research articles because of word limits.

Glycosidic linkage analysis
A sample (0.5-1 mg) of each unroasted and roasted model (Man₃ and mixtures) was dissolved in DMSO (1 mL), and powdered NaOH (40 mg) was then added to the solution. After 30 min at room temperature with continuous stirring, samples were methylated by adding CH₃I (80 μL) and allowing the reaction to proceed for 20 min under vigorous stirring. Distilled water (2 mL) was then added, and the solution was neutralized using 1 M HCl. Dichloromethane (3 mL) was then added and, upon vigorous manual shaking and centrifugation, the dichloromethane phase was recovered and washed twice with distilled water (2-3 mL). The organic phase was evaporated to dryness and the resulting material was remethylated using the same procedure. The remethylated material was hydrolyzed with 500 μL of 2 M TFA at 121°C for 1 h, and the acid was then evaporated to dryness. For carbonyl reduction, the resulting material was suspended in 300 μL of 2 M NH₃ and 20 mg of NaBD₄ were added. The reaction mixture was incubated at 30°C for 1 h. After cooling, the excess of borodeuteride was destroyed by the addition of glacial acetic acid (2 × 50 μL). The partially methylated alditol derivatives were acetylated with acetic anhydride (3 mL) in the presence of 1-methylimidazole (450 μL) for 30 min at 30°C. To decompose the excess of acetic anhydride, distilled water (3 mL) was added while the tubes were kept on ice. Dichloromethane (2.5 mL) was then added and, upon vigorous manual shaking and centrifugation, the dichloromethane phase was recovered. The addition of water (3 mL) and dichloromethane (2.5 mL) and the recovery of the organic phase were performed once more. The dichloromethane phase was then washed twice with distilled water (3 mL) and evaporated to dryness. The dried material was dissolved in anhydrous acetone (2 × 1 mL), followed by evaporation of the acetone to dryness.
The partially methylated alditol acetates (PMAAs) were redissolved in anhydrous acetone and identified by gas chromatography-mass spectrometry (GC-MS) on an Agilent Technologies 6890 N Network GC system (Santa Clara, CA) equipped with a DB-1ms column of 30 m length, 0.25 mm internal diameter, and 0.1 μm film thickness (J&W Scientific, Folsom, CA). The GC was connected to an Agilent 5973 Network Mass Selective Detector operating in electron impact mode at 70 eV and scanning the m/z range 40-500 in a 1 s cycle in full scan acquisition mode. The oven temperature program was as follows: initial temperature 50°C, a linear increase of 8°C/min up to 140°C, held at this temperature for 5 min, followed by a linear increase of 0.5°C/min up to 150°C, followed by a linear increase of 40°C/min up to 250°C, held at this temperature for 1 min. The injector and detector temperatures were 220 and 230°C, respectively. Helium was used as carrier gas at a flow rate of 1.7 mL/min. The relative abundance of each PMAA identified in both unroasted and roasted samples was determined by integration of each peak using the equipment's software.
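The oven program and the peak-area normalization described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, any peak labels and areas passed to it are placeholders, and relative abundance is assumed here to be a simple area percentage without response factors, which the protocol does not specify.

```python
# Oven program from the protocol above.
# Each segment: (ramp rate in °C/min, target temperature in °C, hold in min).
INITIAL_TEMP = 50.0
PROGRAM = [
    (8.0, 140.0, 5.0),   # 50 -> 140 °C at 8 °C/min, hold 5 min
    (0.5, 150.0, 0.0),   # 140 -> 150 °C at 0.5 °C/min, no hold
    (40.0, 250.0, 1.0),  # 150 -> 250 °C at 40 °C/min, hold 1 min
]


def run_time(initial_temp, program):
    """Total duration (min) of a multi-ramp oven temperature program."""
    total = 0.0
    temp = initial_temp
    for rate, target, hold in program:
        total += (target - temp) / rate + hold  # ramp time plus hold time
        temp = target
    return total


def relative_abundance(peak_areas):
    """Area percentage of each peak (assumption: no response factors)."""
    total = sum(peak_areas.values())
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


if __name__ == "__main__":
    print(f"Oven program duration: {run_time(INITIAL_TEMP, PROGRAM):.2f} min")
```

For the program given in the protocol, the ramps and holds add up to 39.75 min (11.25 + 5 + 20 + 2.5 + 1). Integrated PMAA peak areas exported from the instrument software could be passed to `relative_abundance` as a dictionary keyed by PMAA name.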