Abstract
Classical biplot methods allow for the simultaneous representation of the individuals (rows) and variables (columns) of a data matrix. For binary data, logistic biplots have recently been developed. When data are nominal, neither classical nor binary logistic biplots are adequate, and techniques such as multiple correspondence analysis (MCA), latent trait analysis (LTA) or item response theory (IRT) for nominal items should be used instead. In this paper we extend the binary logistic biplot to nominal data. The resulting method is termed “nominal logistic biplot” (NLB), although the variables are represented as convex prediction regions rather than vectors. Using methods from computational geometry, the set of prediction regions is converted to a set of points in such a way that the prediction for each individual is given by its closest “category point”. Interpretation is then based on distances rather than on projections. We study the geometry of such a representation and construct computational algorithms for the estimation of parameters and the calculation of prediction regions. Nominal logistic biplots extend both MCA and LTA in the sense that they give a graphical representation for LTA similar to the one obtained in MCA.
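The distance-based reading described in the abstract can be illustrated with a minimal sketch. It assumes that the category points of one nominal variable and the coordinates of the individuals are already available in the same two-dimensional space; the function name `predict_categories` and all coordinates are hypothetical and chosen only for illustration. It does not reproduce the estimation algorithm of the paper or the interface of any package.

```python
import numpy as np


def predict_categories(individuals, category_points):
    """Return, for each individual, the index of its nearest category point.

    Assigning points to their nearest site is equivalent to locating them
    in the Voronoi cells generated by the category points.
    """
    individuals = np.asarray(individuals, dtype=float)
    category_points = np.asarray(category_points, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_individuals, n_categories).
    diff = individuals[:, None, :] - category_points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.argmin(sq_dist, axis=1)


# Hypothetical coordinates (assumptions, not taken from the paper).
category_points = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
individuals = np.array([[0.1, 0.2], [1.9, -0.3], [0.4, 1.8]])

predicted = predict_categories(individuals, category_points)
print(predicted)  # [0 1 2]
```

Each individual is assigned the category whose point lies closest to it, so the predicted category can be read from distances in the plot rather than from projections onto a vector.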
Acknowledgments
The authors would like to thank the anonymous referees and the editor for their careful reading of the manuscript and for their valuable comments and suggestions, which have significantly improved the paper.
Cite this article
Hernández-Sánchez, J.C., Vicente-Villardón, J.L. Logistic biplot for nominal data. Adv Data Anal Classif 11, 307–326 (2017). https://doi.org/10.1007/s11634-016-0249-7
DOI: https://doi.org/10.1007/s11634-016-0249-7
Keywords
- Biplot
- Categorical variables
- Logistic responses
- Latent traits
- Computational geometry
- Inverse Voronoi problem