Dataset of Near-infrared spectroscopy measurement for amylose determination using PLS algorithms

In the dataset presented in this article, 168 rice samples, comprising sixteen rice varieties (including Indica and Japonica subspecies) from a Portuguese Rice Breeding Program obtained from three different sites over four seasons and 11 standard rice varieties from the International Rice Research Institute (IRRI), were characterised. The amylose concentration was evaluated by the iodine method, and the near-infrared (NIR) spectra were recorded. To assess the advantage of NIR spectroscopy across the different rice varieties, Matlab-based algorithms such as Standard Normal Variate (SNV), Multiplicative Scatter Correction (MSC) and the Savitzky-Golay filter were used to pre-process the NIR spectra.


Value of the data
The data supplement existing information on amylose concentration in rice and can be compared with other related studies.
These data link biochemical properties and reflectance spectra of several rice samples for amylose evaluation using different PLS models.
Several Matlab algorithms, such as SNV, MSC, derivatives and other Savitzky-Golay filters, were used to preprocess the raw NIR spectra.
The experimental amylose and NIR spectral data can be used to evaluate different PLS algorithms (iPLS, siPLS and mw-PLS).
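
As an illustration of the pre-processing steps named above, the following is a minimal Python sketch of SNV, MSC (usually expanded as multiplicative scatter correction) and a Savitzky-Golay filter. It is not the authors' Matlab code. The function names, the use of the mean spectrum as the MSC reference, and the 5-point quadratic Savitzky-Golay window are assumptions made for the example; spectra are plain lists of absorbance values.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own
    mean and sample standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]

def msc(spectra, reference=None):
    """Multiplicative scatter correction: regress each spectrum on a
    reference spectrum (assumed here: the mean spectrum) and remove the
    fitted offset and slope."""
    n_points = len(spectra[0])
    if reference is None:
        reference = [mean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = mean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        slope = sum((r - ref_mean) * (x - s_mean)
                    for r, x in zip(reference, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected

# Classical 5-point, quadratic Savitzky-Golay coefficients
# (window size and polynomial order are assumptions for this sketch).
SMOOTH_5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
DERIV_5 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]

def savgol_5pt(spectrum, derivative=False, spacing=1.0):
    """Smooth (or take the first derivative of) a spectrum.
    Only interior points are returned, so the output is 4 points shorter;
    output index i corresponds to input index i + 2."""
    coeffs = DERIV_5 if derivative else SMOOTH_5
    scale = 1.0 / spacing if derivative else 1.0
    return [scale * sum(c * spectrum[i + k] for k, c in enumerate(coeffs))
            for i in range(len(spectrum) - 4)]
```

SNV treats each spectrum independently, whereas MSC needs the whole set (or a stored reference spectrum), so a reference computed on the calibration samples would have to be reused for new samples.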

Data
The amylose concentration of 168 different rice samples was determined using a spectrophotometric method (Fig. 1A and B). For the same samples, NIR spectra were obtained using an NIR transflection MPA instrument (Fig. 3A and B; Matlab file: RawData.mat). The spectral data were then analysed by principal component analysis to identify and remove outliers.

Amylose determination
The amylose concentration was determined using the standard iodine colorimetric method, performed according to ISO 6647-2 [1]. The absorbance was measured at 720 nm using a spectrophotometer (Hitachi, Japan). Amylose content was quantified using a standard curve built from the absorbance values of four calibrated samples of standard rice varieties (IR65, IR24, IR64, IR8) obtained from IRRI (Fig. 2). The calibration values for these standards were obtained by size-exclusion chromatography, which separates amylose according to hydrodynamic volume and molecular weight (ISO 6647-1 [2]).
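
The standard-curve step can be sketched as an ordinary least-squares line through the calibration standards, which is then inverted to convert a sample absorbance into an amylose content. This is a minimal Python sketch, not the authors' procedure: the function names are hypothetical, and the numbers in the usage lines are placeholders, not the certified values of IR65, IR24, IR64 or IR8.

```python
def fit_standard_curve(amylose, absorbance):
    """Least-squares fit of absorbance = slope * amylose + intercept."""
    n = len(amylose)
    mean_x = sum(amylose) / n
    mean_y = sum(absorbance) / n
    sxx = sum((x - mean_x) ** 2 for x in amylose)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(amylose, absorbance))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def amylose_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate the amylose content of a sample."""
    return (absorbance - intercept) / slope

# Placeholder calibration points (hypothetical, for illustration only):
# amylose content in % and absorbance at 720 nm of four standards.
standards_amylose = [2.0, 12.0, 18.0, 27.0]
standards_absorbance = [0.09, 0.29, 0.41, 0.59]
slope, intercept = fit_standard_curve(standards_amylose, standards_absorbance)
sample_amylose = amylose_from_absorbance(0.45, slope, intercept)
```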

Instrumentation and measurements
Samples of approximately 25 cm³ of rice flour were loaded into a circular sample cup and pressed slightly to obtain a similar packing density. Sample spectra were recorded using an NIR transflection MPA instrument (Bruker Optics, Germany). For each rice sample, 16 successive scans were performed over the wavenumber range 12,000-4000 cm⁻¹ at a resolution of 16 cm⁻¹. Two spectra were obtained for each rice sample (Fig. 3A and B).
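
For readers who work in wavelength rather than wavenumber, the acquisition range above can be converted with wavelength (nm) = 10^7 / wavenumber (cm⁻¹). The short Python sketch below shows the conversion, together with a point-wise average of the two spectra recorded per sample. The source does not state that duplicate spectra were averaged, so that step is an assumption, as are the function names.

```python
def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1.0e7 / wavenumber_cm1

def average_duplicates(spectrum_a, spectrum_b):
    """Point-wise mean of two replicate spectra of the same sample
    (assumed handling of the duplicates, not described in the source)."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("replicate spectra must have the same length")
    return [(a + b) / 2.0 for a, b in zip(spectrum_a, spectrum_b)]

upper_limit_nm = wavenumber_to_wavelength_nm(4000.0)   # 2500.0 nm
lower_limit_nm = wavenumber_to_wavelength_nm(12000.0)  # about 833.3 nm
```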

Principal component analysis (PCA)
Principal component analysis (PCA) is a linear pattern recognition technique that reduces the dimensionality of multivariate data to n principal components. All samples were included in the analysis so that the effect of sample variability on possible trends could be inferred directly from the scores plot. PCA was performed using MATLAB® 7.9.0 software (Matlab-toolbox) to select suitable experimental data for model construction and to identify and eliminate outliers (Fig. 4).
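
The idea of screening outliers on PCA scores can be sketched as follows. This is a minimal Python illustration, not the Matlab toolbox routine used by the authors: it extracts only the first principal component by power iteration and flags samples whose score lies far from the median score. The robust cut-off (3.5 scaled median absolute deviations) and the function names are assumptions. In the article, outliers were identified from the scores plot (Fig. 4).

```python
import math
from statistics import median

def first_pc_scores(X, n_iter=100):
    """Scores of each sample on the first principal component,
    computed by power iteration on the mean-centred data."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    # Start from the centred sample with the largest norm.
    v = max(Xc, key=lambda r: sum(x * x for x in r))
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("all samples are identical")
    v = [x / norm for x in v]
    for _ in range(n_iter):
        t = [sum(a * b for a, b in zip(row, v)) for row in Xc]      # Xc v
        w = [sum(Xc[i][j] * t[i] for i in range(n)) for j in range(p)]  # Xc^T t
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(a * b for a, b in zip(row, v)) for row in Xc]

def flag_outliers(scores, threshold=3.5):
    """Indices of samples whose score deviates from the median score by
    more than `threshold` scaled median absolute deviations (assumed rule)."""
    centre = median(scores)
    deviations = [abs(s - centre) for s in scores]
    mad = 1.4826 * median(deviations)
    if mad == 0.0:
        return [i for i, d in enumerate(deviations) if d > 0.0]
    return [i for i, d in enumerate(deviations) if d / mad > threshold]
```

A single component is used here only to keep the sketch short; a scores plot normally shows at least the first two components.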

Amylose determined using the siPLS model
The rice samples were measured by NIR spectroscopy, and the spectra were used to build the siPLS model for amylose prediction in rice. PLS is particularly useful for predicting a set of dependent variables from a (very) large set of independent variables (i.e., predictors). Because of the large number of rice samples, the colorimetric amylose values of all samples used in this study, together with the corresponding values predicted by the siPLS model developed from the NIR spectra, are provided in the Excel file (DatainBrief_AmyloseContents).
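
To make the siPLS idea concrete, the sketch below implements a small single-response PLS (PLS1, NIPALS-style deflation) and a synergy-interval search in plain Python. It assumes the usual siPLS scheme: the spectrum is split into equal-width intervals, a PLS model is built on every combination of a fixed number of intervals, and the combination with the lowest leave-one-out cross-validation error (RMSECV) is kept. The source does not give the number of intervals, the number combined or the number of latent variables, so `n_intervals`, `n_combined` and `n_components` are hypothetical parameters, and this is not the authors' Matlab implementation.

```python
import math
from itertools import combinations

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model; returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(r[j] for r in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[r[j] - x_mean[j] for j in range(p)] for r in X]  # centred X
    f = [v - y_mean for v in y]                            # centred y
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left in y to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict y for one sample by repeating the deflation steps."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat

def rmsecv(X, y, n_components):
    """Leave-one-out root mean square error of cross-validation."""
    total = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        total += (pls1_predict(model, X[i]) - y[i]) ** 2
    return math.sqrt(total / len(X))

def sipls_select(X, y, n_intervals, n_combined, n_components):
    """Return (best interval combination, its RMSECV)."""
    p = len(X[0])
    bounds = [(k * p // n_intervals, (k + 1) * p // n_intervals)
              for k in range(n_intervals)]
    best = None
    for combo in combinations(range(n_intervals), n_combined):
        cols = [j for k in combo for j in range(*bounds[k])]
        X_sub = [[row[j] for j in cols] for row in X]
        err = rmsecv(X_sub, y, n_components)
        if best is None or err < best[1]:
            best = (combo, err)
    return best
```

Here `X` would hold one pre-processed spectrum per row and `y` the reference amylose values from the iodine method. Leave-one-out validation keeps the sketch short but becomes slow in pure Python for full-length spectra and many interval combinations.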