Abstract
We study the problem of obtaining optimal projections for performing discriminant analysis with Gaussian class densities. Unlike in most existing approaches to the problem, we focus on the optimisation of the multinomial likelihood based on posterior probability estimates, which directly captures discriminability of classes. Finding optimal projections offers utility for dimension reduction and regularisation, as well as instructive visualisation for better model interpretability. Practical applications of the proposed approach show that it is highly competitive with existing Gaussian discriminant models. Code to implement the proposed method is available in the form of an R package from https://github.com/DavidHofmeyr/OPGD.
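To make the objective concrete, the following is a minimal sketch of a multinomial log-likelihood computed from posterior probabilities under Gaussian class densities, restricted to a single projection direction. It is not the code from the OPGD package: the function name `projected_log_likelihood`, the plug-in maximum likelihood estimates, the small variance floor, and the restriction to a one-dimensional projection are all assumptions made for illustration, and the optimisation over projections is omitted.

```python
import math


def projected_log_likelihood(X, y, v):
    """Multinomial log-likelihood of the labels y, using posteriors from
    1-D Gaussian class densities fitted to the projected data X v.

    Hypothetical helper for illustration only; not the OPGD implementation.
    """
    # Project every observation onto the direction v.
    z = [sum(xj * vj for xj, vj in zip(x, v)) for x in X]
    n = len(y)

    # Plug-in estimates per class: prior, projected mean, projected variance.
    params = {}
    for k in sorted(set(y)):
        zk = [zi for zi, yi in zip(z, y) if yi == k]
        mean = sum(zk) / len(zk)
        # Small floor (an assumption) keeps the variance strictly positive.
        var = sum((zi - mean) ** 2 for zi in zk) / len(zk) + 1e-8
        params[k] = (len(zk) / n, mean, var)

    total = 0.0
    for zi, yi in zip(z, y):
        # log(prior_k * Gaussian density_k(z_i)) for every class k.
        log_joint = {
            k: math.log(p)
            - 0.5 * math.log(2 * math.pi * s2)
            - (zi - m) ** 2 / (2 * s2)
            for k, (p, m, s2) in params.items()
        }
        # Log-sum-exp normaliser, giving the log posterior of the true class.
        mx = max(log_joint.values())
        log_norm = mx + math.log(sum(math.exp(l - mx) for l in log_joint.values()))
        total += log_joint[yi] - log_norm
    return total


# Toy data: the classes differ only in the first coordinate.
X = [(0.0, 0.0), (0.1, 1.0), (-0.1, 2.0), (0.05, 3.0),
     (5.0, 0.0), (5.1, 1.0), (4.9, 2.0), (5.05, 3.0)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ll_informative = projected_log_likelihood(X, y, (1.0, 0.0))
ll_uninformative = projected_log_likelihood(X, y, (0.0, 1.0))
print(ll_informative > ll_uninformative)
```

In this toy example the direction along which the classes separate attains the higher log-likelihood, which is the sense in which such an objective rewards discriminability of the classes in the projected space.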
Notes
Taken from the UCI machine learning repository (Dua and Graff 2017).
We use the implementation in R’s base stats package (R Core Team 2018).
Code to implement the method is available from https://github.com/DavidHofmeyr/OPGD.
In Table 4 only values of \(p'\) up to 2 times the number of classes were considered for \(\text {OPGD}_{J}\).
References
Byrd RH, Lu P, Nocedal J, Zhu C (1995) A limited memory algorithm for bound constrained optimization. SIAM J Sci Comput 16(5):1190–1208
Calò DG (2007) Gaussian mixture model classification: a projection pursuit approach. Comput Stat Data Anal 52(1):471–482
Cook RD, Weisberg S (1991) Discussion of sliced inverse regression for dimension reduction. J Am Stat Assoc 86(414):335
Cook RD, Critchley F (2000) Identifying outliers and regression mixtures graphically. J Am Stat Assoc 95:781–794
Dasgupta S (2013) Experiments with random projection. arXiv preprint arXiv:1301.3849
Dua D, Graff C (2017) UCI machine learning repository. URL http://archive.ics.uci.edu/ml
Eslami A, Qannari EM, Bougeard S, Sanchez G (2020) Multigroup: Multigroup Data Analysis. URL https://CRAN.R-project.org/package=multigroup. R package version 0.4.5
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eug 7(2):179–188
Friedman JH (1989) Regularized discriminant analysis. J Am Stat Assoc 84(405):165–175
Hand DJ (1982) Kernel discriminant analysis. Wiley, Somerset, NJ
Hastie T, Tibshirani R (1996) Discriminant analysis by Gaussian mixtures. J R Stat Soc Ser B Methodol 58(1):155–176
Hastie T, Tibshirani R, Buja A (1994) Flexible discriminant analysis by optimal scoring. J Am Stat Assoc 89(428):1255–1270
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer, Berlin
Huber PJ (1982) Projection pursuit. Ann Stat, pp 435–475
John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: 11th Conference on Uncertainty in Artificial Intelligence
Peltonen J, Kaski S (2005) Discriminative components of data. IEEE Trans Neural Netw 16(1):68–83
Peltonen J, Goldberger J, Kaski S (2006) Fast discriminative component analysis for comparing examples. In: Neural information processing systems workshop on learning to compare examples
R Core Team (2018) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/
Venables WN, Ripley BD (2002) Modern Applied Statistics with S, 4th edn. Springer, New York. URL http://www.stats.ox.ac.uk/pub/MASS4. ISBN 0-387-95457-0
Zhu M (2006) Discriminant analysis with common principal components. Biometrika 93(4):1018–1024
Zhu M, Hastie TJ (2003) Feature extraction for nonparametric discriminant analysis. J Comput Graph Stat 12(1):101–120
Acknowledgements
We would like to thank the anonymous reviewers for their very helpful comments, which greatly enhanced the quality of the paper in its final form.
Funding
This work was funded by the National Research Foundation (ZA), Grant Number 114632.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Hofmeyr, D.P., Kamper, F. & Melonas, M.C. Optimal projections for Gaussian discriminants. Adv Data Anal Classif 17, 43–73 (2023). https://doi.org/10.1007/s11634-021-00486-z
DOI: https://doi.org/10.1007/s11634-021-00486-z