HPLC-DAD-ESI-QTOF-MS/MS qualitative analysis data and HPLC-DAD quantification data of phenolic compounds of grains from five Australian sorghum genotypes

Sorghum (Sorghum bicolor) grain is a rich source of bioactive phenolic compounds and understanding the phenolic profile of different sorghum genotypes is an important step towards the selection of the most appropriate genotype for industrial applications. The free and bound phenolic compounds of sorghum bran and kernel fractions from five Australian-grown sorghum genotypes (1 white, 2 red, 1 brown and 1 black coloured grain) were identified/tentatively identified by HPLC-DAD-ESI-QTOF-MS/MS and quantified/semi-quantified by HPLC-DAD. Firstly, MS chromatograms of sorghum samples and standards and the MS/MS spectra of individual detected compounds and standards are presented. Then quantification data of these compounds is provided. This dataset is supplementary to the research paper “Comprehensive profiling of phenolic compounds by HPLC-DAD-ESI-QTOF-MS/MS to reveal their location and form of presence in different sorghum grain genotypes” [1].



Value of the Data
• The MS chromatogram and MS/MS spectra data can serve as a reference and benchmark for the identification of phenolic compounds in sorghum grains; the quantification data provide useful information for evaluating and estimating the content of individual phenolic compounds, or groups of them, in sorghum grain materials.
• The qualitative and quantitative data provide a valuable reference for researchers from various sectors (agricultural, food and pharmaceutical) for the analysis and comparison of phenolic compounds in sorghum as well as in other cereal grains or plant materials.
• The data provide a comprehensive understanding of the sorghum phenolic profile, which offers useful insights into sorghum material selection and processing design to help tailor specific industrial food or drug applications of sorghum.

Data Description
This dataset provides supplementary information to our work in Reference [1]. The MS chromatograms of 20 sorghum samples (i.e. free and bound phenolic extracts of bran and kernel fractions from 5 sorghum grain genotypes) and a standard sample of 27 mixed phenolic standards are provided in Fig. 1. Data in Table 1 are cross-referenced to Table 1 in Reference [1] and to Fig. 3, with standard peak numbers S1-27 referring to Fig. 2 (peaks S19 and S20 overlap). Table 2 presents the calibration and method validation parameters for the quantification of phenolic compounds. Table 3 presents the concentration of phenolic compounds and the standards used for their quantification or semi-quantification.

Chemicals and reagents
Standards of apigeninidin chloride, 7-methoxy-apigeninidin chloride and luteolinidin chloride were obtained from ChromaDex (Los Angeles, CA, USA). All other standards and chemicals were obtained from Sigma-Aldrich (Castle Hill, NSW, Australia). All chemicals used for the HPLC-DAD-ESI-QTOF-MS/MS and HPLC-DAD analyses were LC-MS grade.

Table 2 footnote: Standard peak numbers S1-27 are shown in Fig. 1. LOD = limit of detection; LOQ = limit of quantification; RSD = relative standard deviation.

Table 3
Quantification of sorghum phenolic compounds by HPLC-DAD.

A #36 abrasive roller (SATAKE Corporation, Hiroshima, Japan) was used for grain decortication. Sorghum grains (200 g) were decorticated for 60 s to collect the bran fraction. The remaining grains were collected and further decorticated for 45 s to remove uncleared bran residues to give the kernel samples. Both bran and kernel fractions were ground by an EM0405 Multigrinder II grinder (Sunbeam, FL, USA), sieved 100% through a 500 μm brass sieve, and stored at −20 °C in vacuum bags in the dark before extraction.

The free and bound phenolic compounds were extracted according to previously published work [2]. For the extraction of free phenolic compounds, the ground sorghum sample (4 g) was mixed with 30 mL of 80% methanol solution under nitrogen gas, and the mixture was shaken at 25 °C and 150 rpm in the dark for 2 h. The mixture was centrifuged at 3500 g and 4 °C for 10 min to collect the supernatant, and the residue was re-extracted with 35 mL of 80% methanol two more times. All supernatants were combined and evaporated to dryness by a rotary evaporator at 39-40 °C and 100 rpm for 10-15 min, and the resulting solid was re-dissolved in 20 mL of 100% methanol and stored under nitrogen gas at −20 °C in the dark for 1-3 days until analysis.

For the extraction of bound phenolic compounds, the residue remaining after the free phenolic extraction was mixed with 30 mL of 2 M HCl under nitrogen gas and heated at 100 °C for 60 min for hydrolysis. Then, 40 mL of ethyl acetate was added, mixed thoroughly, and left for about 5 min to partition. After partitioning, the ethyl acetate fraction was collected, and the hydrolysate was re-extracted with 50 mL of ethyl acetate five more times. All ethyl acetate fractions were pooled and evaporated to dryness by a rotary evaporator at 39-40 °C and 100 rpm for 10-15 min, and the resulting solid was re-dissolved in 20 mL of 100% methanol and stored under nitrogen gas at −20 °C in the dark for 1-3 days until analysis.
The data were analysed using MassHunter Qualitative software (Agilent Technologies, Santa Clara, CA, USA). The integration thresholds were set as peak area > 30000 counts for the UV-Vis chromatogram and > 1 count for the MS chromatogram, and only the MS and UV-Vis matched peaks, i.e. peaks present in both the MS and UV-Vis chromatograms with peak areas above the thresholds, were selected for further analysis. Compound identification and characterisation were based on comparing the retention time, UV-Vis, MS and MS/MS spectra with authentic standards, databases, and published literature as follows:
(1) Standards: a total of 27 standards were used for identification, of which 15 matching compounds were identified in the tested sorghum samples, as shown in Table 1.
[...] database was the main tool used for identification [5, 6]. The settings were MS-DIAL score > 80 and MS-FINDER score > 7.5, and compounds/peaks below these scores were not selected for identification. In addition, the UV-Vis spectrum of each compound was used to assign it to a subclass according to its specific UV-Vis absorption/peak pattern [7], and compounds without a matching subclass UV-Vis absorption/peak pattern were not selected for identification. Online UV-Vis (SpectraBase) and mass (ChemSpider, Phenol-Explorer and MassBank) databases were also used for double verification when available.
(4) Mass error: only compounds with mass error ≤ ±10 ppm, and compounds with mass error > ±10 ppm but identified by standards or having a high MS-DIAL score > 90, were selected for identification and verification.

HPLC-DAD quantitative analysis
The quantification of phenolic compounds was performed on an Agilent 1260 series HPLC system equipped with a DAD (Agilent Technologies, Santa Clara, CA, USA), and the same column, mobile phase and conditions were applied as described above in Section 2.3. The data were integrated using Agilent OpenLAB Workstation software (Agilent Technologies, Santa Clara, CA, USA), and the integration threshold was set as peak area > 1. Compounds with standards were directly quantified by the standards, and compounds without available standards were semi-quantified by selecting structurally similar standards or standards of the same subclass, based on their core structure and functional groups, as shown in Table 3. Compounds without structurally matched standards were not quantified. The calibration curves of standards were created at their specific monitoring wavelengths as described above in Section 2.3, and compounds were quantified/semi-quantified at their selected monitoring wavelengths. The semi-quantification was performed on the basis that phenolic compounds of the same subclass with similar core structures and functional groups have similar UV-Vis absorption patterns/peaks at 200-600 nm [5], and this method has been used in many studies [8-10].
The quantification method was validated for linearity, limit of detection (LOD), limit of quantification (LOQ) and precision (repeatability). Calibration curves were obtained at eight concentration levels of each standard, except for procyanidin (seven concentration levels). Method linearity was tested on the basis of the calibration curves, which were processed using linear regression. LOD and LOQ were calculated from the standard deviation of the regression line (SD) and the slope (S) according to the formulae: LOD = 3.3(SD/S) and LOQ = 10(SD/S). Precision (repeatability) was evaluated by analysing three replicates (consecutive injections) at three different concentrations of each standard according to Table 2, and the relative standard deviation (RSD) at each concentration was calculated. All the calibration and method validation parameters for the quantification of phenolic compounds are presented in Table 2. The experiment was carried out in triplicate and data were expressed as mean ± standard deviation.
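The validation calculations can be sketched as follows. This is an illustration, not the authors' script: it assumes that "standard deviation of the regression line" means the residual standard deviation of an ordinary least-squares fit with n − 2 degrees of freedom, and that RSD uses the sample standard deviation. All function names are hypothetical.

```python
import math
import statistics


def linear_fit(conc: list[float], area: list[float]) -> tuple[float, float, float]:
    """Ordinary least-squares fit of area = slope * conc + intercept.

    Returns (slope, intercept, residual_sd), where residual_sd is the
    standard deviation of the residuals with n - 2 degrees of freedom
    (assumed here to be the SD used in the LOD/LOQ formulae).
    """
    n = len(conc)
    if n < 3 or n != len(area):
        raise ValueError("need at least three paired calibration points")
    mean_x = sum(conc) / n
    mean_y = sum(area) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, area))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(conc, area))
    residual_sd = math.sqrt(ss_res / (n - 2))
    return slope, intercept, residual_sd


def lod_loq(slope: float, residual_sd: float) -> tuple[float, float]:
    """LOD = 3.3 * (SD / S) and LOQ = 10 * (SD / S), as given in the text."""
    return 3.3 * residual_sd / slope, 10.0 * residual_sd / slope


def rsd_percent(replicates: list[float]) -> float:
    """Relative standard deviation (%) of replicate measurements."""
    return statistics.stdev(replicates) / statistics.mean(replicates) * 100.0
```

A typical use would be to call `linear_fit` on the eight calibration levels of a standard, pass the resulting slope and residual SD to `lod_loq`, and call `rsd_percent` on the three replicate injections at each tested concentration.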

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.