Self-adaptive models for predicting soluble solid content of blueberries with biological variability by using near-infrared spectroscopy and chemometrics
Introduction
Blueberries are known as “the queen of fruits” and “the king of berries” and have a high economic value worldwide (Wang et al., 2018; Hu et al., 2016a; Li et al., 2019b). They contain valuable antioxidants and flavonoids that can enhance immunity, delay nerve senescence and prevent cardiovascular disease (Neto, 2007; Kalt and Dufour, 1997; Jiang et al., 2019). Soluble solids content (SSC) is a critical internal quality attribute for assessing fruit ripeness and harvest time. It is also an important indicator for determining fruit storage and consumer preference (Zhang et al., 2017; Fan et al., 2015; Li and Colin, 2014). Therefore, a rapid, nondestructive method to accurately measure SSC would be useful.
Near-infrared (NIR) spectroscopy has been widely accepted for rapid detection of internal qualities in fruits such as apple, pear and melon (Liu and Ying, 2005; Xiaobo et al., 2007; Nicolaï et al., 2008; McGlone and Kawano, 1998; An et al., 2020). Combined with computer technology and chemometrics, NIR spectroscopy is becoming an efficient and fast analytical technique for evaluating the internal qualities of fruit and vegetables (Roggo et al., 2007). In NIR spectroscopy, the sample is irradiated by a continuous beam of NIR radiation and the reflected or transmitted radiation is measured. As the radiation penetrates the sample, its spectral properties change through wavelength-dependent scattering and absorption processes. This change depends on the chemical composition of the sample and on the light-scattering characteristics associated with its microstructure (Nicolai et al., 2007). Hydrogen-containing groups (CH, NH and OH) in organic molecules absorb NIR radiation at different wavelengths and intensities (Li et al., 2016), so SSC and other chemical information can be inferred from the absorption. NIR spectra can be measured in reflectance, transmittance and interactance modes. To make the measured light levels more relevant, Fu et al. (2006) improved the spectra acquisition method for detecting pear SSC and obtained NIR spectra conveniently and accurately by using diffuse reflection measurement. In addition, Wang et al. (2019) used a Monte Carlo outlier detection method to remove abnormal samples and optimize the sample set, which further improved the reliability of the calibration model. Zhang et al. (2017) combined the NIR technique with a wavelength selection algorithm to screen the most effective wavelengths, eliminating interfering information in apple SSC determination and simplifying the model while ensuring prediction accuracy.
Combined with various chemometric models, including partial least squares regression (PLSR), principal component regression (PCR) and multiple linear regression (MLR), NIR spectroscopy can be used to predict SSC (Zhan et al., 2017). However, many studies have shown that diversity in cultivar, season, origin and other biological characteristics affects the interaction between light and matter and reduces the predictive performance of the model (Wold et al., 2001; Hu et al., 2016a,b; Fan et al., 2019). Consequently, the accuracy of fruit SSC prediction is related not only to the reliability of the spectra, the selection of the optimal wavelengths and the stability and reliability of the model, but is also highly dependent on biological variability and sample richness.
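As a rough illustration of how a chemometric model such as PLSR maps spectra to SSC, the following is a minimal single-response PLS (PLS1, NIPALS-style) sketch in plain Python. It is not the authors' implementation: the function names, the two-wavelength toy spectra and the SSC values are hypothetical, and in practice the number of latent variables would normally be chosen by cross-validation on spectra with many more wavelengths.

```python
from math import sqrt


def pls1_fit(X, y, n_components):
    """Fit a PLS1 model by NIPALS-style deflation (illustrative sketch)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                                 # centred response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in spectral space most covariant with the response
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                         # response loading
        # Deflate spectra and response before extracting the next component
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one spectrum by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Hypothetical toy data: absorbance at two wavelengths and a made-up linear response
X = [[0.1, 0.9], [0.4, 0.5], [0.7, 0.8], [0.9, 0.2]]
y = [8 + 4 * a + 2 * b for a, b in X]

model = pls1_fit(X, y, n_components=2)
prediction = pls1_predict(model, [0.5, 0.5])
```

Because the toy response is an exact linear function of the two inputs, two latent variables recover it; real spectra are noisy and collinear, which is where limiting the number of latent variables becomes useful.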
The diversity of biological characteristics has a great influence on the measurement of SSC. Zhang et al. (2017) pointed out that cellular structure can lead to heterogeneity in physical properties and chemical composition, and that this structure is itself influenced by biological variability. In addition, differences in internal quality affect the propagation of the incident light (Magwaza et al., 2012; Xia et al., 2019); biological variability therefore affects the robustness of SSC prediction models. To ensure the accuracy and universality of the calibration model, biological differences should be considered as an important factor in modeling. To eliminate the effects of biodiversity, five modeling strategies have been proposed: preprocessing-based, compensation-based, equivalent-based, classification-based and calibration transfer-based methods. Zhang et al. (2017) analyzed the performance of various pretreatment methods and their combinations in removing uninformative biological variability; developing models on the preprocessed data can reduce the effect of noise and unrelated physical factors. Tian et al. (2018) built two types of models to reduce the influence of the spectral measurement position, i.e. the surface features of the fruit. One was the compensation-based prediction model, which took the spectral data from all locations as the input simultaneously; the other was the equivalent-based model, which took the average over all locations at each wavelength as the input. Lyu et al. (2015) and Bai et al. (2019) utilized various cultivar discrimination methods, such as linear discriminant analysis (LDA), support vector machine (SVM), fingerprint features and deep learning, to build classification models that decrease the sensitivity to cultivar. Fan et al. (2019) established a slope and bias correction model with a calibration transfer method based on the predicted results rather than the spectral space to eliminate the influence of seasonal diversity. The models mentioned above can each address a single biological difference and specifically eliminate the effects of cultivar, season or origin, but none of them is universal for all biological variations, and each needs to be upgraded or reconstructed when new samples are collected. The existing solutions cannot automatically adapt to changing differences to improve detection accuracy. Hence, it is important to develop a new model that can adapt to diverse biological variation.
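The slope and bias correction attributed to Fan et al. (2019) above operates on predicted values rather than on spectra. A minimal sketch of that idea in Python follows, assuming a small number of reference SSC measurements from the new season are available; the function names and all numbers are hypothetical and are not taken from the cited work.

```python
from statistics import mean


def fit_slope_bias(predicted, reference):
    """Least-squares fit of reference = slope * predicted + bias."""
    p_mean, r_mean = mean(predicted), mean(reference)
    sxx = sum((p - p_mean) ** 2 for p in predicted)
    sxy = sum((p - p_mean) * (r - r_mean) for p, r in zip(predicted, reference))
    slope = sxy / sxx
    bias = r_mean - slope * p_mean
    return slope, bias


def apply_slope_bias(predicted, slope, bias):
    """Correct raw model predictions with the fitted slope and bias."""
    return [slope * p + bias for p in predicted]


# Hypothetical values: predictions of an old-season model for new-season fruit,
# paired with the SSC actually measured on those same fruit
old_model_pred = [10.0, 11.0, 12.0, 13.0]
measured_ssc = [11.5, 12.7, 13.9, 15.1]

slope, bias = fit_slope_bias(old_model_pred, measured_ssc)
corrected = apply_slope_bias(old_model_pred, slope, bias)
```

The original calibration model is left untouched; only two parameters are re-estimated, which is why this kind of correction handles a systematic seasonal shift but not changes in the shape of the spectra-to-SSC relationship.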
In view of this biological variability, it is critical to automatically select the modeling method with the highest accuracy for each situation. This study proposes a new strategy: a self-adaptive model that combines a self-selection strategy, the NIR technique, chemometric analysis, five correcting methods and model search technology to adapt to changes in biodiversity, improving both the prediction performance for unknown samples and the universality of the calibration model. The specific objectives were to: (1) analyze and investigate the spectral characteristics and differences among seasons and several diverse cultivars of blueberries; (2) establish the above five correcting models and assess the predictive performance of each model; (3) develop the self-adaptive model and use the self-selection algorithm to select the modeling strategy with the highest accuracy for SSC detection in different situations and requirements; (4) compare and evaluate the performance of the self-adaptive selection model based on the results of the cyclic selection.
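The self-selection idea stated above, choosing whichever correcting strategy predicts best in the situation at hand, can be pictured as a search over candidate models scored on a validation set. The sketch below is illustrative only: two of the strategy names mirror categories named in the introduction, but the predictor functions, the toy data and the use of RMSE as the selection criterion are assumptions, not the authors' algorithm.

```python
from math import sqrt


def rmse(predicted, reference):
    """Root mean square error between predictions and reference values."""
    return sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference))


def select_strategy(candidates, spectra, reference):
    """Score every candidate on the validation set and return the best name plus all scores."""
    scores = {
        name: rmse([predict(x) for x in spectra], reference)
        for name, predict in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical validation set: two-wavelength "spectra" and reference SSC values
spectra = [[0.2, 0.5], [0.4, 0.1], [0.6, 0.3]]
reference = [12.0, 12.2, 13.6]

# Hypothetical stand-ins for already-calibrated models of different strategies
candidates = {
    "preprocessing-based": lambda x: 10 + 5 * x[0],
    "compensation-based": lambda x: 10 + 5 * x[0] + 2 * x[1],
    "equivalent-based": lambda x: 10 + 7 * (x[0] + x[1]) / 2,
}

best, scores = select_strategy(candidates, spectra, reference)
```

In this toy setup the selection returns the candidate whose predictions happen to match the reference values; with real data the ranking could change whenever the validation samples come from a different cultivar or season, which is the situation the self-adaptive model is meant to handle.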
Section snippets
Blueberry samples
A total of 684 blueberries from three cultivars were collected from Qingdao, China: ‘Bluecrop’ (sample size of 324), ‘Duke’ (sample size of 180) and ‘M2’ (sample size of 180). In terms of seasonal variation, 540 samples were harvested in May 2015 (180 for each cultivar) and 144 ‘Bluecrop’ samples were harvested in May 2014. Fruit of uniform size and free of blemishes were delivered to the laboratory immediately after harvest and stored at 4 ℃ before experiments. The samples were removed from
Spectra features
The FT-NIR spectra of 684 blueberries from three cultivars in the region of 1000−2400 nm are shown in Fig. 2. Each colored band represents the full spectral range of the corresponding cultivar or season of blueberries, that is, the lowest to the highest absorbance of the samples at each wavelength, and the thick solid line within the band is the average spectrum of each cultivar. Obviously, the average spectral curve trends of fruit with diverse biological variation are similar,
Conclusions
Based on NIR spectroscopy and chemometrics, this paper studies the effect of biodiversity on blueberry SSC prediction and aims to eliminate the effects of biological variability and abiotic information. Although the specific individual-variation model or hybrid-variation model achieved satisfactory accuracy on its own data, when applied to the prediction of blueberry SSC in other cultivars or seasons, the accuracy was greatly reduced and the effect was undesirable, indicating that there were
CRediT authorship contribution statement
Wei Zheng: Methodology, Writing - original draft, Writing - review & editing. Yuhao Bai: Conceptualization, Investigation. Hui Luo: Methodology, Writing - original draft. Yuhua Li: Data curation, Supervision. Xi Yang: Supervision, Validation. Baohua Zhang: Project administration, Conceptualization, Funding acquisition, Resources, Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (project No. 31901415), the Natural Science Foundation of Jiangsu Province (Grant no. BK20180515), and the Jiangsu Training Programs of Innovation and Entrepreneurship for Undergraduates (project No. 201910307155Y). The authors also wish to give special thanks to Professor Menghan Hu and Professor Shizhuang Weng for their help in data processing and analysis.
References (39)
- et al. Characterization of textural failure mechanics of strawberry fruit. J. Food Eng. (2020)
- et al. Accurate prediction of soluble solid content of apples from multiple geographical regions by combining deep learning with spectral fingerprint features. Postharvest Biol. Technol. (2019)
- et al. Long-term evaluation of soluble solids content of apples with biological variability by using near-infrared spectroscopy and calibration transfer method. Postharvest Biol. Technol. (2019)
- et al. Estimating blueberry mechanical properties based on random frog selected hyperspectral data. Postharvest Biol. Technol. (2015)
- et al. Prediction of mechanical properties of blueberry using hyperspectral interactance imaging. Postharvest Biol. Technol. (2016)
- et al. Fusion of machine vision technology and AlexNet-CNNs deep learning network for the detection of postharvest apple pesticide residues. Artif. Intell. Agric. (2019)
- et al. Assessment of internal quality of blueberries using hyperspectral transmittance and reflectance images with whole spectra or selected wavelengths. Innovative Food Sci. Emerging Technol. (2014)
- et al. Quantitative evaluation of mechanical damage to fresh fruit. Trends Food Sci. Technol. (2014)
- et al. A comparative study for the quantitative determination of soluble solids content, pH and firmness of pears by Vis/NIR spectroscopy. J. Food Eng. (2013)
- et al. Determination of SSC in pears by establishing the multi-cultivar models based on visible-NIR spectroscopy. Infrared Phys. Technol. (2019)
- Use of FT-NIR spectrometry in non-invasive measurements of internal quality of ‘Fuji’ apples. Postharvest Biol. Technol.
- Firmness, dry-matter and soluble-solids assessment of postharvest kiwifruit by NIR spectroscopy. Postharvest Biol. Technol.
- Nondestructive measurement of fruit and vegetable quality by means of NIR spectroscopy: a review. Postharvest Biol. Technol.
- Time-resolved and continuous wave NIR reflectance spectroscopy to predict soluble solids content and firmness of pear. Postharvest Biol. Technol.
- Review of the most common pre-processing techniques for near-infrared spectra. TrAC, Trends Anal. Chem.
- A review of near infrared spectroscopy and chemometrics in pharmaceutical technologies. J. Pharm. Biomed. Anal.
- Non-destructive prediction of soluble solids content of pear based on fruit surface feature classification and multivariate regression analysis. Infrared Phys. Technol.
- Development of multi-cultivar models for predicting the soluble solid content and firmness of European pear (Pyrus communis L.) using portable vis–NIR spectroscopy. Postharvest Biol. Technol.
- Application of deep learning architectures for accurate and rapid detection of internal mechanical damage of blueberry using hyperspectral transmittance data. Sensors