Abstract
Near infrared (NIR) spectroscopy is a rapid, non-destructive technology for predicting a variety of wood properties, and it provides great opportunities to optimize manufacturing processes through in-line assessment of forest products. In this paper, a novel multivariate regression procedure, the hybrid model of principal component regression (PCR) and partial least squares (PLS), is proposed to develop more accurate prediction models for high-dimensional NIR spectral data. To integrate the merits of PCR and PLS, hybrid models use both the principal components defined in PCR and the latent variables defined in PLS, extracted by a common iterative procedure under the constraint that they remain orthogonal to each other. In addition, we propose a modified sequential forward floating search method, which originated in feature selection for classification problems, to overcome the difficulty of searching the vast number of possible hybrid models. The effectiveness and efficiency of hybrid models are substantiated by experiments with three real-life datasets of forest products. The proposed hybrid approach can be applied to a wide range of problems involving high-dimensional spectral data.
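The abstract does not spell out the iterative procedure, so the following is only a minimal sketch of one way the idea could look, not the authors' algorithm. It assumes NIPALS-style deflation: at each step the next component is either a PCR-type principal component (leading singular vector of the deflated spectra) or a PLS-type latent variable (direction of maximal covariance with the response), and deflating the spectra by each new score keeps all scores mutually orthogonal. The function name `hybrid_scores` and the labels `"pc"` and `"lv"` are hypothetical.

```python
import numpy as np

def hybrid_scores(X, y, kinds):
    """Return one score vector per entry of `kinds` ('pc' or 'lv').

    Assumption: components are extracted one at a time from a deflated
    copy of X, which is an illustrative choice, not the paper's method.
    """
    Xd = X - X.mean(axis=0)      # centered spectra, deflated step by step
    yc = y - y.mean()            # centered response
    scores = []
    for kind in kinds:
        if kind == "pc":
            # PCR-type direction: leading right singular vector of deflated X
            _, _, vt = np.linalg.svd(Xd, full_matrices=False)
            w = vt[0]
        elif kind == "lv":
            # PLS-type direction: covariance of deflated X with y
            w = Xd.T @ yc
            w = w / np.linalg.norm(w)
        else:
            raise ValueError(f"unknown component kind: {kind!r}")
        t = Xd @ w                       # score vector for this component
        p = Xd.T @ t / (t @ t)           # loading used for deflation
        Xd = Xd - np.outer(t, p)         # remove t from X so later scores are orthogonal to it
        scores.append(t)
    return np.column_stack(scores)

def fit_on_scores(T, y):
    """Least-squares regression of the centered response on the scores."""
    coef, *_ = np.linalg.lstsq(T, y - y.mean(), rcond=None)
    return coef
```

A candidate hybrid model is then just a sequence such as `["pc", "lv", "pc"]`. Because the number of such sequences grows quickly with the number of components, a search heuristic of the kind the abstract mentions would score candidates (for example by cross-validated error) instead of enumerating them all.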
Cite this article
Fang, Y., Park, J.I., Jeong, YS. et al. Enhanced predictions of wood properties using hybrid models of PCR and PLS with high-dimensional NIR spectral data. Ann Oper Res 190, 3–15 (2011). https://doi.org/10.1007/s10479-009-0554-z
DOI: https://doi.org/10.1007/s10479-009-0554-z