Fast, simultaneous and contactless assessment of intact mango fruit by means of near infrared spectroscopy

This study applies near infrared technology as a fast, simultaneous and nondestructive method for assessing the quality of intact mango fruit in terms of total soluble solids (TSS) and vitamin C. Absorbance spectra of 186 intact mango fruits from four different cultivars were acquired in the wavelength range of 1000–2500 nm. Spectra data were enhanced and corrected using three different methods, namely moving average smoothing (MAS), extended multiplicative scatter correction (EMSC) and standard normal variate (SNV). The samples were divided into a calibration dataset (n = 143) and a prediction dataset (n = 43), each containing all four mango cultivars. The models used to predict TSS and vitamin C were developed using partial least square regression (PLSR). Prediction performance was quantified using the correlation coefficient (r), root mean square error (RMSE), ratio prediction to deviation (RPD) and range to error ratio (RER) indexes. The best prediction models for TSS and vitamin C were achieved with the EMSC correction approach, with r = 0.86, RMSE = 1.67 °Brix, RPD = 2.34 and RER = 9.72 for TSS, and r = 0.86, RMSE = 6.84 mg·100 g⁻¹, RPD = 2.00 and RER = 8.87 for vitamin C. It was concluded that near infrared technology combined with a proper spectra enhancement method may be applied as a rapid, simultaneous and contactless method for quality assessment of intact mangoes.


Introduction
Mango (Mangifera indica L.) is one of the most popular tropical horticultural fruits worldwide due to its taste, visual appearance, benefits and overall nutritional content [1]. It is also well known as a source of vitamins, minerals and other active compounds beneficial to human health, which has led to increasing market demand. In the horticultural industry, produce quality is of great importance, and consumers are willing to pay a premium for high quality mangoes. Therefore, to maintain a supply chain of high quality mangoes, it is important to sort and grade mangoes based on their quality [2,3].
Total soluble solids (TSS) and vitamin C are two major quality parameters of mangoes that are beneficial to humans. TSS is related to sweetness, sugar content and carbohydrates, while vitamin C improves the immune system [4,5]. Several methods are used to determine chemical quality parameters of mango such as TSS and vitamin C. However, most of them are based on standard laboratory procedures that are time consuming, costly, laborious and destructive, and that require complicated sample preparation, extraction and chemical reagents [6,7,8]. In recent years, developing fast and robust alternative methods for analysing the quality parameters of horticultural and other agricultural products has become a priority. Agricultural industries therefore need a simultaneous and non-destructive method that can determine several quality attributes of agricultural products, including mangoes and their derivative products [7,9,10,11].
In the last few decades, near infrared reflectance spectroscopy (NIRS) has become one of the most promising alternative methods applied in many fields, including agriculture. Compared to standard laboratory analysis, NIRS has several advantages: it is fast, requires minimal sample preparation, and is non-destructive, robust and environmentally friendly since no chemical materials are used. In addition, NIRS can predict several inner quality attributes simultaneously [7,12,13,14].
Previous work has shown that NIRS is feasible as a fast and nondestructive alternative method for determining several quality parameters. For mangoes, several studies have been carried out with satisfactory prediction performance. These studies showed that NIRS can predict inner quality parameters of intact mangoes such as soluble solids content, malic acid, fructose, ascorbic acid, fiber content, pulp firmness, pH and dry matter [1,11,34,35,36]. However, most of these studies were based on local prediction models using a single mango cultivar, such as Tommy Atkins, Palmer or Thai cultivars. NIRS predictions using global models, which involve more than one mango cultivar, are still scarce. In this study, an attempt was made to develop global NIRS models to predict total soluble solids (TSS) and vitamin C of intact mango from four different cultivars obtained from Indonesia (cv. Cengkir and Kweni), Brazil (cv. Kent) and Peru (cv. Palmer). It was assumed that global NIRS models would be more suitable for real-time determination of the quality attributes of intact mango fruits.

Mango samples
In this study, a total of 186 intact mango samples from four different cultivars were used, obtained from Indonesia (cv. Cengkir, n = 18, and cv. Kweni, n = 29), Brazil (cv. Kent, n = 85) and Peru (cv. Palmer, n = 54). To obtain samples at varied ripeness stages, Cengkir and Kweni mangoes were collected from central mango orchards in Majalengka, Indonesia, at maturity stages ranging from 75 to 105 days after flowering. Kent and Palmer mangoes were ordered from local mango distributors, with maturity stages at harvest ranging from 76 to 98 days after flowering. All mango samples were then divided into a calibration dataset of 143 mangoes and a prediction dataset of 43 mangoes, each containing all four cultivars. All samples were stored for two days at a room temperature of 26 °C to equilibrate their internal temperature. After two days, near infrared spectra were acquired and the actual inner quality attributes (TSS and vitamin C) were measured.

Spectra acquisition
Near infrared spectra, in the form of absorbance spectra, were acquired for all mango samples using a self-developed near infrared instrument (PSD-NIRS iptek i16). Spectra data were obtained in the wavelength range of 1000 to 2500 nm, corresponding to wavenumbers from 10,000 to 4000 cm⁻¹. For each mango sample, spectra were acquired at three different points (top edge, middle and bottom edge) and averaged, as illustrated in Figure 1. The resolution window was set to 0.2 nm with 64 scans per spectrum acquisition [4,37,38].
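The wavelength-to-wavenumber correspondence and the three-point averaging described above can be illustrated with a minimal sketch. The function names and the absorbance values below are hypothetical and are not taken from the instrument software or the measured data.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1 (1 cm = 1e7 nm)."""
    return 1e7 / wavelength_nm


def average_spectra(spectra):
    """Average several spectra, sampled at the same wavelengths, point by point."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


# Hypothetical absorbance readings at three positions on one fruit
top = [0.40, 0.55, 0.90]
middle = [0.42, 0.57, 0.96]
bottom = [0.44, 0.59, 0.93]

mean_spectrum = average_spectra([top, middle, bottom])
print(nm_to_wavenumber(1000), nm_to_wavenumber(2500))
print(mean_spectrum)
```

The conversion shows why the short-wavelength end of the range (1000 nm) corresponds to the high-wavenumber end (10,000 cm⁻¹).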

Actual TSS and vitamin C measurements
TSS and vitamin C contents of the mango samples were measured immediately after spectra acquisition was completed. Marked mango samples were sliced and the pulp was extracted. For TSS, juice was prepared from 25 g of pulp and 100 mL of distilled water and centrifuged for about 10 min [35,39] to obtain clarified juice. A small amount of filtered supernatant was then dropped onto a hand-held refractometer (model HRO32, Krüss Optronic GmbH) to record TSS, expressed in °Brix. The actual vitamin C content of the mango samples was measured by titration. Approximately 5 g of mango pulp was mixed in a beaker with 20 mL of 5% meta-phosphoric acid (Roth, Germany) to prevent oxidation. This mixture was homogenized using an Ultra-Turrax (IKA T 18B, Germany) for 2 minutes and filtered through filter paper (MN 6151/4, Ø 150 mm, Macherey-Nagel, Germany). A 10 mL aliquot of the filtrate was transferred into a 25 mL glass beaker and titrated with 0.064 M 2,6-dichlorophenolindophenol. Vitamin C was quantified based on its reaction with this solution, which also serves as the indicator: the end of titration is marked by a colour change from colourless to light red [40,41]. Vitamin C is expressed in mg·100 g⁻¹ fresh mass (FM).

Spectra correction
Infrared spectra data may contain irrelevant background information and noise that interfere with the information on the desired quality attributes. Therefore, they need to be corrected and enhanced in order to obtain accurate and robust prediction models. In this study, three different spectra correction methods were employed, namely moving average smoothing (MAS), extended multiplicative scatter correction (EMSC) and standard normal variate (SNV). The impact of these methods was compared in terms of accuracy and other prediction performance indicators.
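To make the three correction families concrete, the following is a minimal sketch in plain Python. It is not the implementation used in The Unscrambler. The function names, the window size and the edge handling are assumptions, and `msc` is the basic multiplicative scatter correction against the mean spectrum, shown here as a simpler stand-in for EMSC, which extends the same regression idea with additional model terms.

```python
from statistics import mean, stdev


def moving_average(spectrum, window=5):
    """Smooth one spectrum by averaging each point with its neighbours.

    The window is truncated at both ends of the spectrum (an assumption).
    """
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo = max(0, i - half)
        hi = min(len(spectrum), i + half + 1)
        smoothed.append(mean(spectrum[lo:hi]))
    return smoothed


def snv(spectrum):
    """Standard normal variate: centre and scale a spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]


def msc(spectra):
    """Basic multiplicative scatter correction against the mean spectrum.

    Each spectrum is regressed on the reference (spectrum = offset + slope * reference)
    and then corrected as (spectrum - offset) / slope.
    """
    n = len(spectra)
    reference = [sum(column) / n for column in zip(*spectra)]
    ref_mean = mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for spectrum in spectra:
        s_mean = mean(spectrum)
        slope = sum((r - ref_mean) * (x - s_mean)
                    for r, x in zip(reference, spectrum)) / ref_ss
        offset = s_mean - slope * ref_mean
        corrected.append([(x - offset) / slope for x in spectrum])
    return corrected
```

Note that `moving_average` and `snv` operate on one spectrum at a time, whereas `msc` needs the whole set because it uses the mean spectrum as reference.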

Prediction models
The main aim of NIRS practice is to develop prediction models for determining the desired quality attributes of the materials studied. The global prediction models in this study were developed from four different mango cultivars using partial least square regression (PLSR) [42,43]. Samples were divided into calibration and prediction datasets, each of which included all four cultivars.
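As an illustration of how PLSR extracts latent variables, the following is a minimal single-response sketch based on the NIPALS algorithm, written in plain Python. It is an assumption-laden teaching example, not the routine used in The Unscrambler: the function names `pls1_fit` and `pls1_predict` are hypothetical, and each quality attribute would be modelled separately with this variant.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS on mean-centred data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred predictors
    f = [v - y_mean for v in y]                                # centred response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left to model
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this latent variable
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new sample by sequential deflation."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat
```

In practice the number of latent variables is chosen by cross validation rather than fixed in advance, and fewer latent variables are preferred when performance is comparable.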
The models were constructed using the calibration data (143 samples) by regressing the actual values of the inner quality attributes (TSS and vitamin C), as response variables, on the absorbance spectra, as predictor variables. K-fold cross validation with k = 10 was employed during calibration to evaluate the models, so that each fold held out approximately 14 randomly selected samples for validation. The most accurate and robust model was selected based on its validation performance and then tested independently using the separate prediction dataset of 43 samples. All data analysis, including prediction model development, descriptive statistics and spectra enhancement, was carried out using The Unscrambler X 10.3 software (CAMO Inc., Oslo, Norway).
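The fold sizes implied by 10-fold cross validation on 143 calibration samples can be checked with a short sketch. The function names, the random seed and the way samples are dealt into folds are assumptions; the software used in the study may assign folds differently.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validation_splits(n_samples, k=10, seed=0):
    """Yield (training indices, validation indices) for each of the k folds."""
    for held_out in k_fold_indices(n_samples, k, seed):
        held = set(held_out)
        training = [i for i in range(n_samples) if i not in held]
        yield training, held_out


folds = k_fold_indices(143, k=10)
print(sorted(len(fold) for fold in folds))
```

With 143 samples, three folds hold 15 samples and seven hold 14, which matches the "approximately 14" samples per fold stated above.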

Prediction performance evaluation
The prediction performance for TSS and vitamin C was evaluated based on calibration and validation results according to the correlation coefficient (r), the root mean square error in calibration (RMSEC), cross validation (RMSECV) and prediction (RMSEP), and the ratio prediction to deviation (RPD) index. The RPD was obtained by dividing the standard deviation (SD) of the reference data by the RMSECV value. Based on these indexes, an ideal prediction model should have a high r coefficient and RPD index, a low RMSE and a small number of latent variables (LVs) [7,12,13]. Scatter plots of reference versus predicted values of the desired quality parameters (TSS and vitamin C) were used to visualize the results.
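The performance indexes can be sketched as follows. The RPD follows the definition given above (SD of the reference data divided by the RMSE). The RER is assumed here to be the range of the reference data divided by the RMSE, which is its usual definition but is not spelled out in this section. Function names and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(mean([(r - p) ** 2 for r, p in zip(reference, predicted)]))


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    mr, mp = mean(reference), mean(predicted)
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    ss_r = sum((r - mr) ** 2 for r in reference)
    ss_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(ss_r * ss_p)


def rpd(reference, predicted):
    """Ratio prediction to deviation: SD of the reference data over the RMSE."""
    return stdev(reference) / rmse(reference, predicted)


def rer(reference, predicted):
    """Range to error ratio (assumed definition): reference range over the RMSE."""
    return (max(reference) - min(reference)) / rmse(reference, predicted)


# Hypothetical TSS values in °Brix, for illustration only
reference = [10, 12, 14, 16, 18]
predicted = [11, 11, 15, 15, 19]
print(pearson_r(reference, predicted), rmse(reference, predicted),
      rpd(reference, predicted), rer(reference, predicted))
```

Whether the RMSE passed to `rpd` and `rer` comes from calibration, cross validation or prediction depends on which dataset the predicted values were computed on.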

Spectra features of mangoes
Typical diffuse reflectance spectra recorded for the four intact mango cultivars in the near infrared region (1000-2500 nm) are shown in Figure 2. These spectra reflect the presence of related quality attributes such as vitamins, sugar, moisture, fibre, protein and other constituents, through absorption bands that result from the interaction between electromagnetic radiation and organic materials. These bands correspond to the molecular bonds O-H, C-H, C-O and N-H, which are the basic structures underlying the quality attributes of mangoes and other agricultural products.

Figure 2.
Near infrared absorbance spectra feature of intact mango samples in wavelength ranging from 1000 to 2500 nm for four different cultivars.
As shown in Figure 2, the spectral features and patterns of the four mango cultivars are quite similar. The highest absorption bands are at wavelengths of around 1460 and 1920 nm, which correspond to O-H bonds related to moisture content. This is consistent with mango consisting of about 80% water. Similar patterns were also found by other researchers who presented typical spectra of mango and orange samples.
Mangoes from Indonesia (cv. Cengkir and Kweni) typically have higher water contents than cv. Kent from Brazil and cv. Palmer from Peru, as can be seen from the near infrared spectra in Figure 2. Inner quality attributes of intact mangoes such as TSS and vitamin C are mainly built from C-H-O molecular bonds, so near infrared reflectance spectroscopy (NIRS) may be used to predict both quality attributes. The absorption bands ranging from 2100-2280 nm are believed to be related to C-H-O structures such as TSS, sugar contents, carbohydrates, fructose, and vitamins A and C. Meanwhile, absorption bands at around 1420, 1850 and 2050 nm are associated with organic acids [11,13]. Other studies have also noted the association between these wavelengths and quality attributes [6,11,35]. The peaks and valleys are probably not very obvious in the raw spectra data. Nevertheless, the present results support the reported findings that vitamin C and TSS, which consist of C-H-O molecules, can be predicted in that range. The highest peaks are related to O-H structures, as expected since mango, like other fruits, consists mostly of water.

Total soluble solids (TSS) and vitamin C prediction
In this study, global prediction models were developed from the calibration dataset, which contains four mango cultivars, to simultaneously predict both inner quality attributes (TSS and vitamin C) of intact mango samples. Descriptive statistics of actual TSS and vitamin C content for all calibration dataset samples are presented in Table 1.
Prediction models were developed using the partial least square regression (PLSR) approach, because previous studies found that PLSR provides better prediction results than other linear regression approaches such as multiple linear regression (MLR) and principal component regression (PCR) [40]. The PLSR algorithm models both the predictor data (NIR spectra) and the response data (measured TSS and vitamin C) simultaneously to determine the latent variables in the predictor data that best predict both inner quality attributes.
First, global prediction models were developed to predict TSS and vitamin C using raw, un-enhanced spectra data in the wavelength range of 1000 to 2500 nm. These models were evaluated using k-fold cross validation during model development. Calibration and cross validation results for TSS and vitamin C are shown in Table 2 and Table 3. Generally, TSS and vitamin C could be satisfactorily predicted using raw spectra data, with correlation coefficients of 0.83 and 0.82, respectively. Prediction accuracy and robustness improved slightly when the models were developed using enhanced spectra. The correlation coefficient increased to a maximum of 0.86 for both TSS and vitamin C with EMSC spectra correction, which was the best spectra enhancement method in this study. A scatter plot of the calibration and cross validation results for TSS prediction is presented in Figure 3.
Spectra data were first corrected and enhanced using moving-average smoothing (MAS). This method smooths each spectrum by replacing every data point with the average of the neighbouring points within a moving window. It proved effective, since the prediction error, expressed as root mean square error (RMSE), decreased to 2.1 °Brix and 7.41 mg·100 g⁻¹ for TSS and vitamin C, respectively. The ratio prediction to deviation (RPD) index, which represents model robustness, also improved. Before MAS spectra correction, the RPD index was 1.81 and 1.75 for TSS and vitamin C, respectively, which is categorized as coarse but sufficient prediction models. Models built on corrected spectra achieved a higher RPD index than those built on un-corrected raw spectra, and a higher RPD indicates a more robust NIRS model. The maximum RPD was achieved when the models were developed using EMSC-corrected spectra.
Table 3. Calibration and validation performance for vitamin C (mg·100 g⁻¹) in intact mangoes using the calibration dataset (n = 143) with the partial least square regression approach.
As stated earlier, mangoes are biological, climacteric fruits that contain mostly water along with other chemical substances such as vitamins and soluble solids. These inner quality attributes may be affected by external factors such as temperature and relative humidity, which interfere with the accuracy and robustness of prediction models. These effects need to be corrected in order to achieve more accurate and robust prediction performance. It is therefore crucial to preprocess and enhance spectra data before establishing global prediction models.
Spectra correction was also performed using extended multiplicative scatter correction (EMSC) and standard normal variate (SNV). As shown in Tables 2 and 3, the correlation coefficient and RPD index improved for both TSS and vitamin C prediction. Compared to the other spectra correction methods studied, EMSC provided the most robust and accurate prediction performance. The maximum correlation coefficient achieved using EMSC was 0.86 for both inner quality attributes (TSS and vitamin C), and the maximum RPD indexes were 2.34 and 2.00 for TSS and vitamin C, respectively. In this study, an RPD index between 1.5 and 2.0 was categorized as a coarse but sufficient prediction model, while an RPD above 2 was categorized as relatively good performance that still needs improvement.
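The RPD categories used in this study can be written as a small lookup, shown here as a sketch. The function name is hypothetical. Two points are assumptions: a value of exactly 2.0 is placed in the upper category, and values below 1.5 are returned as uncategorized because no category is given for them here.

```python
def rpd_category(rpd):
    """Map an RPD index to the categories described in this study."""
    if rpd >= 2.0:  # assumption: the boundary value 2.0 counts as the upper category
        return "relatively good"
    if rpd >= 1.5:
        return "coarse but sufficient"
    return "not categorized in this study"


reported = [
    ("TSS, before MAS correction", 1.81),
    ("vitamin C, before MAS correction", 1.75),
    ("TSS, EMSC", 2.34),
    ("vitamin C, EMSC", 2.00),
]
for label, value in reported:
    print(label, value, rpd_category(value))
```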
Scatter plots of reference versus predicted TSS and vitamin C derived from EMSC and SNV spectra data are presented in Figure 4. The SNV correction method had an effect quite similar to that of moving-average smoothing (MAS), but was slightly more robust: the prediction error was lower and the RPD index was higher than with MAS for both TSS and vitamin C prediction. A scatter plot of calibration performance between measured and predicted vitamin C content derived from raw spectra and EMSC-enhanced spectra data is also presented in Figure 4. In this study, extended multiplicative scatter correction (EMSC) was the best spectra correction method tested. It reduces amplification (multiplicative, scattering) and offset (additive, chemical) effects in the NIR spectrum. EMSC rotates each spectrum so that it fits as closely as possible to a standard spectrum, which is often the mean spectrum of all mango spectra. The practical difference between SNV and EMSC is that SNV standardizes each NIR spectrum using only data from that spectrum, while EMSC uses a reference spectrum, such as the mean spectrum of the set, during the correction process.
The NIRS models based on EMSC spectra data were then tested for determining TSS and vitamin C using the 43 samples of the separate prediction dataset. Descriptive statistics of actual TSS and vitamin C for the prediction dataset are shown in Table 4. The prediction performance for TSS and vitamin C of intact mango using EMSC spectra data was sufficiently accurate and robust, as shown in Table 5. For TSS, the achieved correlation coefficient was 0.86, similar to calibration, and the RPD index of 2.25 was also close to that in calibration. In addition, the range to error ratio (RER) was high (9.72), which in this study indicates good prediction performance.
The prediction of vitamin C content also achieved good results, with a correlation coefficient in prediction of 0.86, similar to that in calibration. The prediction error was also close to that in calibration, and the RPD and RER indexes were 2.19 and 8.87, respectively, which corresponds to sufficient prediction performance. A scatter plot of the prediction performance for TSS and vitamin C of intact mango samples in the prediction dataset is presented in Figure 5. Based on these results, NIRS may generally be employed as a robust and accurate non-destructive alternative method for determining inner quality attributes (TSS and vitamin C) of intact mangoes. Furthermore, EMSC was the best correction method for predicting inner quality attributes compared to SNV and MAS. Judging from the external prediction performance, both inner quality parameters of intact mango fruit may be predicted quite well using enhanced NIRS spectra data. It is therefore strongly recommended that spectra data be corrected and enhanced prior to prediction model development.

Conclusion
In this study, near infrared (NIR) global prediction models were developed to predict two important quality attributes of intact mangoes namely total soluble solids (TSS) and vitamin C. The models were developed using partial least square regression (PLSR) and NIR spectra data corrected and enhanced by means of moving-average smoothing (MAS), extended multiplicative scatter correction (EMSC) and standard normal variate (SNV). Results showed that NIR spectra data combined with EMSC spectra correction generated more accurate and robust prediction performances compared to SNV and MAS methods.
The maximum correlation coefficient (r) and ratio prediction to deviation (RPD) index during calibration for TSS were 0.86 and 2.34, respectively. For vitamin C, the correlation coefficient and RPD index were 0.86 and 2.00, respectively. Both were categorized as good prediction models.