Near infrared spectroscopic data for rapid and simultaneous prediction of quality attributes in intact mango fruits

Presented dataset contains spectral data on near infrared region for a total of 186 intact mango fruit samples from 4 different cultivars (cv. Kweni, Cengkir, Palmer and Kent). Near infrared spectral data were collected and recorded as absorbance (Log(1/R)) data in wavelength range of 1000–2500 nm. Those spectral data are potential to be re-used and analysed for the prediction of mango quality attributes in form of vitamin C, soluble solids content (SSC) and total acidity (TA). Spectra data can be corrected and enhanced using several algorithms such as multiplicative scatter correction (MSC) and de-trending (DT). Prediction models can be established using common regression approach like partial least square regression (PLSR).


Data
Near infrared spectral dataset acquired in form of absorbance or log(1/R) in wavelength from 1000 to 2500 nm (Fig. 1). Spectra data were then enhanced and corrected in order to eliminate noises due to light scattering, background amplifications and temperature changes on the detector. There are several approaches that can be employed as spectra corrections like smoothing, normalization, spectra derivatives and transformations [1]. Two most common used spectra corrections and enhancements are multiplicative scatter correction (MSC) as shown in Fig. 2 and de-trending (DT) as presented in Fig. 3.
Spectra data set were then used to establish prediction model for the determination of inner quality attributes on intact mangos using regression approach. One of the most promising and common regression method in NIRS practices is partial least square regression (PLSR) [2]. This method used to find best correlation between NIR spectra data and respective quality attributes such as vitamin C, SSC and TA of mango fruits. Prediction models were constructed by means of raw and enhanced spectra data as shown in Fig. 4.

Instrument preparations
The NIR instrument used to obtain spectral data is Fourier transform near infrared (FT-NIR) (Thermo Nicolet Antaris II TM). It was configured and controlled under integrated software Thermo Integration® and Thermo Operation®. The workflow has been developed and set up to run specified tasks for spectra Specifications Table   Subject Agricultural and Biological Sciences Specific subject area Spectroscopy, non-destructive technique in agriculture. Type of data Value of the Data Dataset obtained from Near infrared spectroscopy (NIRS) provide fast, non-destructive, simultaneous and pollution free to determine quality attributes of agricultural products like mango fruits. These data can be used to develop prediction models used to predict vitamin C, soluble solids content (SSC) and total acidity (TA) of intact mangos without complicated sample procedures and preparations. Data can be benefited for those who concentrated on rapid and non-destructive application for agriculture products. They can be from academics, agro-industries and practitioners. Near infrared spectroscopy can be employed in foods and agricultural products industries especially for quality evaluations: sorting, grading and authenticating. Prediction performances may vary, depends on spectra enhancement and regression approaches to be used. data acquisition of intact mango samples. High resolution measurement with integrating sphere was chosen as a basic measurement [3]. Sample naming and labelling was required automatically for each spectra acquisition in order to distinct all 186 mango samples. Background spectra correction was carried out every hour automatically. Spectra data were recorded in form of absorbance (Log(1/R)) in wavelength range from 1000 to 2500 nm and saved in two different file extension formats: Nicolet (.spa) and comma separated value (.csv). Standard laboratory method also prepared for inner quality attributes measurements of mango as vitamin C, soluble solids content (SSC) and total acidity (TA). Hand held refractometer (model HRO32, Krüss Optronic GmbH) was used for SSC, automatic titrator equipment (Titroline 96, Schott) for TA and standard manual titration method for vitamin C measurement respectively. The centrifuge was also used to obtain clarified sample juice and separate suspended solids.

Spectra data acquisition
Near spectra data were acquired firstly using the FT-NIR instrument. Intact mango sample were placed centrally upon the fruit holder as illustrated in Fig. 5. Each single fruit was hand placed right to the incoming holes (1 cm of diameter) of the light source to ensure direct contact and minimize noises due to light scattering. Absorbance spectra (Log 1/R) in wavelength range of 1000e2500 nm were acquired with co-added of 64 number of scans. Spectra data of each fruit sample were measured in six different points and averaged [2].

Vitamin C, SSC and TA measurements
Once after spectra data acquisitions are completed, fruit samples were directly sliced and some part of mango pulp were taken. To measure vitamin C, a total 5g of mango pulp samples were macerated, mixed and homogenized with addition of 20 ml of meta-phosphoric acid into a beaker to prevent oxidation. Distilled water were added until 50 ml of volume was reached. Vitamin C was obtained and quantified based on its reaction with the Dichlorophenolindophenol as an indicator in titration method from which colour change from colourless to light red at the end of titration. Vitamin C of intact mango fruit are expressed in mg•100g À1 fresh mass (FM) [4].
On the other hand, SSC and TA quality attributes were measured by making another juices from 20 g of pulp sample and maximum of 100 ml distilled water. A little filtered juice was squeezed and dropped into a hand-held refractometer to record SSC in form of o Brix while automatic titration method with 0.1 N NaOH to an end point of pH 8.1 was used to measure TA of intact mango and expressed as mg•100g À1 fresh mass [5,6]. Descriptive statistics of actual vitamin C, SSC and TA measurements of mango samples is shown in Table 1.

Spectra data enhancement
It is highly recommended to perform spectra correction and enhancement prior to prediction model development. There are several methods and approaches that can be used to enhance spectra data, among of them are Multiplicative scatter correction (MSC) and de-trending (DT). Spectra enhancements were used to correct additive and multiplicative effects in the spectra. The MSC and DT spectra corrections were carried out using The Unscrambler X 10.1 software (CAMO, Oslo Norway) with network client license.

Quality attributes (Vitamin C, SSC and TA) prediction
The main part of NIRS application is to develop models used to predict desired quality attributes of intact mango simultaneously. Prediction models were constructed to reveal information about mango quality attributes buried in absorbance spectra data. These models were developed commonly by regressing spectra data (as X variable) and actual measured quality attributes (as Y variable) through multivariate analysis. Partial least squares regression (PLSR) was applied to develop those models and validated using full cross validation approach.
The prediction performances were evaluated by means of these following statistical parameters: the coefficient of correlation (r) and determination (R 2 ) between predicted and measured quality  attributes, prediction error which is defined as the root mean square error (RMSE) and the residual predictive deviation (RPD), defined as the ratio between standard deviation (SD) of the population's actual value of Vitamin C, SSC and TA, and the RMSE of predicted quality attributes. The higher value of RPD, the greater probability of models to predict desired quality parameters or chemical constituent of samples dataset accurately [7e9].