Methods for Improving Prediction Ability of Model

Chemometric Methods in Analytical Spectroscopy Technology

Abstract

The prediction ability of a model refers mainly to its robustness and accuracy. The two are consistent in some cases but conflict in others. For liquid samples, for example, the spectral acquisition conditions (such as temperature and pressure) can be strictly controlled, and a quantitative calibration model can be established under those conditions. Such a model predicts accurately for samples that fall within the calibration range and are measured under the same conditions. If the acquisition conditions change even moderately, however, its prediction accuracy degrades significantly. A hybrid calibration model built on spectra collected under different conditions improves the robustness (or adaptability) of the model, but at the cost of some accuracy. In practice, it is often necessary to balance robustness against accuracy.
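The trade-off described above can be illustrated with a small simulation. The following is a minimal sketch on synthetic data, not the chapter's method: the band shapes, noise level, temperature values, and the use of ordinary least squares are all assumptions chosen for illustration. A "local" model is calibrated at one reference temperature, and a "hybrid" model is calibrated on spectra pooled from several temperatures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical wavelength axis and band shapes (assumptions for illustration).
wl = np.linspace(0.0, 1.0, 20)
s_analyte = np.exp(-((wl - 0.45) / 0.10) ** 2)  # analyte absorption band
s_temp = np.exp(-((wl - 0.55) / 0.10) ** 2)     # overlapping temperature-induced band


def simulate(n, temps):
    """Return n noisy spectra and their concentrations.

    Each spectrum is a linear mix of the analyte band (scaled by the
    concentration) and the temperature band (scaled by a temperature
    drawn from `temps`), plus white noise.
    """
    c = rng.uniform(0.0, 1.0, n)
    t = rng.choice(temps, n)
    noise = rng.normal(0.0, 0.01, (n, wl.size))
    X = np.outer(c, s_analyte) + np.outer(t, s_temp) + noise
    return X, c


def fit(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def rmse(coef, X, y):
    """Root mean squared error of prediction."""
    pred = np.column_stack([np.ones(len(X)), X]) @ coef
    return float(np.sqrt(np.mean((pred - y) ** 2)))


# Local model: calibrated only at the reference temperature (t = 0).
X_loc, c_loc = simulate(200, [0.0])
local = fit(X_loc, c_loc)

# Hybrid model: calibrated on spectra pooled over several temperatures.
X_hyb, c_hyb = simulate(300, [0.0, 0.5, 1.0])
hybrid = fit(X_hyb, c_hyb)

# Test sets at the reference condition and at a shifted condition (t = 1).
X_ref, c_ref = simulate(500, [0.0])
X_shift, c_shift = simulate(500, [1.0])

rmse_local_ref = rmse(local, X_ref, c_ref)
rmse_local_shift = rmse(local, X_shift, c_shift)
rmse_hybrid_ref = rmse(hybrid, X_ref, c_ref)
rmse_hybrid_shift = rmse(hybrid, X_shift, c_shift)

print(f"local  model: ref {rmse_local_ref:.4f}, shifted {rmse_local_shift:.4f}")
print(f"hybrid model: ref {rmse_hybrid_ref:.4f}, shifted {rmse_hybrid_shift:.4f}")
```

In this setup the local model's error should rise sharply under the shifted condition, because its regression vector was never forced to ignore the temperature band. The hybrid model stays stable across conditions, and its error at the reference condition is typically somewhat higher than the local model's, since part of the analyte signal overlaps the temperature band and can no longer be used.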

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Bian, X. (2022). Methods for Improving Prediction Ability of Model. In: Chemometric Methods in Analytical Spectroscopy Technology. Springer, Singapore. https://doi.org/10.1007/978-981-19-1625-0_14
