Multivariate Prediction with Nonlinear Principal Components Analysis: Theory

Quality and Quantity

Abstract

We propose multivariate predictability as a measure of goodness of fit for data reduction techniques that are used to visualize and screen data. For quantitative variables this leads to the usual sums-of-squares and variance-accounted-for criteria. For categorical variables we show how to predict the category levels of all variables associated with every point (case). The proportion of predictions that agree with the true categories gives the measure of fit. The ideas are very general; as an illustration we use nonlinear principal components analysis (NLPCA) with ordered categorical variables. A detailed example using data from the International Social Survey Program (ISSP) will be given in Blasius and Gower (Quality and Quantity, 39, to appear). It will be shown that the predictability criterion suggests that the fits are rather better than is indicated by “percentage of variance accounted for”.
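
As a rough illustration of the categorical fit measure described above, the following minimal sketch computes the proportion of predicted category levels that agree with the observed ones. It is not the authors' procedure: the prediction rule (assign each case to the category whose point lies nearest in the low-dimensional display) is an assumption made for this example, and the names `predict_category`, `predictability`, `points`, `category_points_by_var` and `observed`, together with all coordinates, are hypothetical.

```python
from math import dist


def predict_category(point, category_points):
    """Return the label of the category point nearest to `point`.

    Nearest-point prediction is an assumption of this sketch,
    not a rule stated in the abstract.
    """
    return min(category_points, key=lambda label: dist(point, category_points[label]))


def predictability(points, category_points_by_var, observed):
    """Proportion of (case, variable) predictions that match the observed categories."""
    hits = 0
    total = 0
    for point, row in zip(points, observed):
        for var, true_label in row.items():
            predicted = predict_category(point, category_points_by_var[var])
            hits += predicted == true_label
            total += 1
    return hits / total


# Hypothetical toy data: three cases in a 2-D display, two ordered variables.
points = [(0.0, 0.0), (0.8, 0.2), (2.0, 1.0)]
category_points_by_var = {
    "q1": {"low": (0.0, 0.0), "high": (2.0, 0.0)},
    "q2": {"low": (0.0, 1.0), "mid": (1.0, 1.0), "high": (2.0, 1.0)},
}
observed = [
    {"q1": "low", "q2": "low"},
    {"q1": "high", "q2": "mid"},  # q1 is predicted "low" here, so this is a miss
    {"q1": "high", "q2": "high"},
]

fit = predictability(points, category_points_by_var, observed)
print(f"proportion of correct predictions: {fit:.3f}")  # 5 of 6 predictions agree
```

In this toy case five of the six case-by-variable predictions match, so the fit is 5/6. Counting agreements in this way gives a measure that can be read directly as a success rate, which is a different scale from a percentage of variance accounted for.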

References

  • Blasius, J. & Gower, J. C. (to appear). Multivariate prediction with nonlinear principal components analysis: Application. Quality and Quantity 39

  • Borg, I. & Groenen, P. (1997). Modern Multidimensional Scaling. New York: Springer

  • Borg, I. & Shye, S. (1995). Facet Theory. Form and Content. Newbury Park, CA: Sage

  • Chambers, J. M. (1998). Programming with Data: A Guide to the S Language. New York: Springer

  • Eckart, C. & Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika 1: 211–218

  • Gabriel, K. R. (1971). The biplot-graphic display of matrices with applications to principal components analysis. Biometrika 58: 453–467

  • Gabriel, K. R. (1981). Biplot display of multivariate matrices for inspection of data and diagnosis. In: V. Barnett (ed.), Interpreting Multivariate Data. Chichester: Wiley, pp. 147–174

  • Genstat 5 Committee (1993). Genstat 5 Release 3 Reference Manual. Oxford: Numerical Algorithms Group

  • Gifi, A. (1990). Nonlinear Multivariate Analysis. Chichester: Wiley

  • Gower, J. C. (1966). Some distance properties of latent-root and vector methods used in multivariate analysis. Biometrika 53: 325–338

  • Gower, J. C. (1993). The construction of neighbour-regions in two dimensions for prediction with multi-level categorical variables. In: O. Opitz, B. Lausen & R. Klar (eds.), Information and Classification: Concepts–Methods–Applications Proceedings 16th Annual Conference of the Gesellschaft für Klassifikation, Dortmund, April 1992, Berlin: Springer, pp. 174–189

  • Gower, J. C. (2002). Categories and quantities. In: S. Nishisato, Y. Baba, H. Bozdogan & K. Kamefuji (eds.), Measurement and Multivariate Analysis. Tokyo: Springer, pp. 1–12

  • Gower, J. C. & Hand, D. J. (1996). Biplots. London: Chapman & Hall

  • Gower, J. C. & Harding, S. (1998). Prediction regions for categorical variables. In: J. Blasius & M. Greenacre (eds.), Visualization of Categorical Data. San Diego: Academic Press, pp. 405–423

  • Greenacre, M. J. (1993). Biplots in correspondence analysis. Journal of Applied Statistics 20: 251–269

  • Guttman, L. (1965). A faceted definition of intelligence. Scripta Hierosolymitana 14: 166–181

  • Heiser, W. J. & Meulman, J. J. (1994). Homogeneity analysis: exploring the distribution of variables and their nonlinear relationships. In: M. J. Greenacre & J. Blasius (eds.), Correspondence Analysis in the Social Sciences. Recent Developments and Applications. London: Academic Press, pp. 179–209

  • Meulman, J. J. & Heiser, W. J. (1999). SPSS Categories 10.0. Chicago: SPSS Inc.

  • Payne, R. W., Lane, P. W., Baird, D. B., Gilmour, A. R., Harding, S. A., Morgan, G. W., Murray, D. A., Thompson, R., Todd, A. D., Tunicliffe-Wilson, G., Webster, R. & Welham, S. J. (1998). Genstat 5 Release 4.1 Reference Manual Supplement. Oxford: Numerical Algorithms Group

  • SPSS. (1999). See Meulman and Heiser (1999)

Author information

Corresponding author

Correspondence to Jörg Blasius.

Additional information

This article was written while John Gower was a visiting professor at the ZA-Eurolab, at the Zentralarchiv für Empirische Sozialforschung, University of Cologne, Germany. The ZA is a Large Scale Facility funded by the Training and Mobility of Researchers program of the European Union.

About this article

Cite this article

GOWER, J.C., BLASIUS, J. Multivariate Prediction with Nonlinear Principal Components Analysis: Theory. Qual Quant 39, 359–372 (2005). https://doi.org/10.1007/s11135-005-3005-1

  • DOI: https://doi.org/10.1007/s11135-005-3005-1
