Abstract
Families of mixtures of multivariate power exponential (MPE) distributions have already been introduced and shown to be competitive for cluster analysis in comparison to other mixtures of elliptical distributions, including mixtures of Gaussian distributions. A family of mixtures of multivariate skewed power exponential distributions is proposed that combines the flexibility of the MPE distribution with the ability to model skewness. These mixtures are more robust to deviations from normality and can account for skewness, varying tail weight, and peakedness of data. A generalized expectation-maximization approach, combining minorization-maximization and optimization based on accelerated line search algorithms on the Stiefel manifold, is used for parameter estimation. These mixtures are implemented in both the unsupervised and semi-supervised classification frameworks. Both simulated and real data are used for illustration and comparison to other mixture families.
References
Absil, P.-A., Mahony, R., & Sepulchre, R. (2009). Optimization algorithms on matrix manifolds. Princeton University Press.
Aitken, A. C. (1926). On Bernoulli’s numerical solution of algebraic equations. In Proceedings of the Royal Society of Edinburgh (pp. 289–305).
Andrews, J. L., & McNicholas, P. D. (2012). Model-based clustering, classification, and discriminant analysis via mixtures of multivariate t-distributions. Statistics and Computing, 22(5), 1021–1029.
Azzalini, A. (1986). Further results on a class of distributions which includes the normal ones. Statistica, 46(2), 199–208.
Azzalini, A., & Valle, A. D. (1996). The multivariate skew-normal distribution. Biometrika, 83, 715–726.
Banfield, J. D., & Raftery, A. E. (1993). Model-based Gaussian and non-Gaussian clustering. Biometrics, 49(3), 803–821.
Basford, K., Greenway, D., McLachlan, G., & Peel, D. (1997). Standard errors of fitted component means of normal mixtures. Computational Statistics, 12(1), 1–18.
Biernacki, C., Celeux, G., & Govaert, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(7), 719–725.
Biernacki, C., Celeux, G., & Govaert, G. (2003). Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Computational Statistics & Data Analysis, 41(3–4), 561–575.
Böhning, D., & Lindsay, B. G. (1988). Monotonicity of quadratic-approximation algorithms. Annals of the Institute of Statistical Mathematics, 40(4), 641–663.
Bouveyron, C., & Brunet-Saumard, C. (2014). Model-based clustering of high-dimensional data: A review. Computational Statistics and Data Analysis, 71, 52–78.
Branco, M. D., & Dey, D. K. (2001). A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis, 79(1), 99–113.
Browne, R. P., & McNicholas, P. D. (2014). Orthogonal Stiefel manifold optimization for eigen-decomposed covariance parameter estimation in mixture models. Statistics and Computing, 24(2), 203–210.
Browne, R. P., & McNicholas, P. D. (2015). A mixture of generalized hyperbolic distributions. Canadian Journal of Statistics, 43(2), 176–198.
Celeux, G., & Govaert, G. (1995). Gaussian parsimonious clustering models. Pattern Recognition, 28(5), 781–793.
Cho, D., & Bui, T. D. (2005). Multivariate statistical modeling for image denoising using wavelet transforms. Signal Processing: Image Communication, 20(1), 77–89.
da Silva Ferreira, C., Bolfarine, H., & Lachos, V. H. (2011). Skew scale mixtures of normal distributions: Properties and estimation. Statistical Methodology, 8(2), 154–171.
Dang, U. J., Browne, R. P., & McNicholas, P. D. (2015). Mixtures of multivariate power exponential distributions. Biometrics, 71(4), 1081–1089.
Dang, U. J., Browne, R. P., Gallaugher, M. P., & McNicholas, P. D. (2021). mixSPE: Mixtures of power exponential and skew power exponential distributions for use in model-based clustering and classification. R package version 0.9.1.
Day, N. E. (1969). Estimating the components of a mixture of normal distributions. Biometrika, 56(3), 463–474.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B, 39(1), 1–38.
DiCiccio, T. J., & Monti, A. C. (2004). Inferential aspects of the skew exponential power distribution. Journal of the American Statistical Association, 99(466), 439–450.
Flury, B. (2012). Flury: Data sets from Flury, 1997. R package version 0.1–3.
Forina, M., & Tiscornia, E. (1982). Pattern-recognition methods in the prediction of Italian olive oil origin by their fatty-acid content. Annali di Chimica, 72(3–4), 143–155.
Fraley, C., Raftery, A. E., Murphy, T. B., & Scrucca, L. (2012). mclust version 4 for R: Normal mixture modeling for model-based clustering, classification, and density estimation. Technical Report 597, Department of Statistics, University of Washington, Seattle, Washington.
Franczak, B. C., Browne, R. P., & McNicholas, P. D. (2014). Mixtures of shifted asymmetric Laplace distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(6), 1149–1157.
Franczak, B. C., Browne, R. P., McNicholas, P. D., & Burak, K. L. (2018). MixSAL: Mixtures of multivariate shifted asymmetric Laplace (SAL) distributions. R package version 1.0.
Gallaugher, M. P. B., & McNicholas, P. D. (2018). Finite mixtures of skewed matrix variate distributions. Pattern Recognition, 80, 83–93.
Gallaugher, M. P. B., & McNicholas, P. D. (2019). On fractionally-supervised classification: Weight selection and extension to the multivariate t-distribution. Journal of Classification, 36(2), 232–265.
Gómez, E., Gómez-Villegas, M. A., & Marin, J. M. (1998). A multivariate generalization of the power exponential family of distributions. Communications in Statistics - Theory and Methods, 27(3), 589–600.
Hartigan, J. A., & Wong, M. A. (1979). A k-means clustering algorithm. Journal of the Royal Statistical Society: Series C, 28(1), 100–108.
Hasselblad, V. (1966). Estimation of parameters for a mixture of normal distributions. Technometrics, 8(3), 431–444.
Horst, A. M., Hill, A. P., & Gorman, K. B. (2020). palmerpenguins: Palmer Archipelago (Antarctica) penguin data. R package version 0.1.0.
Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.
Hunter, D. R., & Lange, K. (2000). Rejoinder to discussion of optimization transfer using surrogate objective functions. Journal of Computational and Graphical Statistics, 9(1), 52–59.
Hunter, D. R., & Lange, K. (2004). A tutorial on MM algorithms. The American Statistician, 58(1), 30–37.
Hurley, C. (2012). gclus: Clustering graphics. R package version 1.3.1.
Karlis, D., & Santourian, A. (2009). Model-based clustering with non-elliptically contoured distributions. Statistics and Computing, 19(1), 73–83.
Lee, S., & McLachlan, G. J. (2014). Finite mixtures of multivariate skew t-distributions: Some recent and new results. Statistics and Computing, 24(2), 181–202.
Lee, S. X., & McLachlan, G. J. (2016). Finite mixtures of canonical fundamental skew t-distributions: The unification of the restricted and unrestricted skew t-mixture models. Statistics and Computing, 26(3), 573–589.
Lin, T.-I. (2010). Robust mixture modeling using multivariate skew t distributions. Statistics and Computing, 20(3), 343–356.
Lin, T.-I., Ho, H. J., & Lee, C.-R. (2014). Flexible mixture modelling using the multivariate skew-t-normal distribution. Statistics and Computing, 24(4), 531–546.
Lindsay, B. G. (1995). Mixture models: Theory, geometry and applications. In NSF-CBMS Regional Conference Series in Probability and Statistics (pp. 1–163).
Lindsey, J. K. (1999). Multivariate elliptically contoured distributions for repeated measurements. Biometrics, 55(4), 1277–1280.
McNicholas, P. D. (2010). Model-based classification using latent Gaussian mixture models. Journal of Statistical Planning and Inference, 140(5), 1175–1181.
McNicholas, P. D. (2016a). Mixture model-based classification. Boca Raton: Chapman & Hall/CRC Press.
McNicholas, P. D. (2016b). Model-based clustering. Journal of Classification, 33(3), 331–373.
McNicholas, P. D., & Murphy, T. B. (2008). Parsimonious Gaussian mixture models. Statistics and Computing, 18(3), 285–296.
McNicholas, P. D., Murphy, T. B., McDaid, A. F., & Frost, D. (2010). Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture models. Computational Statistics & Data Analysis, 54(3), 711–723.
McNicholas, P. D., ElSherbiny, A., McDaid, A. F., & Murphy, T. B. (2022). pgmm: Parsimonious Gaussian mixture models. R package version 1.2.6-2.
McNicholas, S. M., McNicholas, P. D., & Browne, R. P. (2017). A mixture of variance-gamma factor analyzers. In Big and complex data analysis (pp. 369–385). Cham: Springer International Publishing.
Morris, K., & McNicholas, P. D. (2013). Dimension reduction for model-based clustering via mixtures of shifted asymmetric Laplace distributions. Statistics and Probability Letters, 83(9), 2088–2093.
Murray, P. M., Browne, R. B., & McNicholas, P. D. (2014). Mixtures of skew-t factor analyzers. Computational Statistics and Data Analysis, 77, 326–335.
Murray, P. M., Browne, R. B., & McNicholas, P. D. (2017). Hidden truncation hyperbolic distributions, finite mixtures thereof, and their application for clustering. Journal of Multivariate Analysis, 161, 141–156.
Nakai, K., & Kanehisa, M. (1991). Expert system for predicting protein localization sites in gram-negative bacteria. Proteins: Structure, Function, and Bioinformatics, 11(2), 95–110.
Nakai, K., & Kanehisa, M. (1992). A knowledge base for predicting protein localization sites in eukaryotic cells. Genomics, 14(4), 897–911.
O’Hagan, A., Murphy, T. B., Gormley, I. C., McNicholas, P. D., & Karlis, D. (2016). Clustering with the multivariate normal inverse Gaussian distribution. Computational Statistics and Data Analysis, 93, 18–30.
Peel, D., & McLachlan, G. J. (2000). Robust mixture modelling using the t distribution. Statistics and Computing, 10(4), 339–348.
Pocuca, N., Browne, R. P., & McNicholas, P. D. (2022). mixture: Mixture models for clustering and classification. R package version 2.0.5.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.
Steinley, D. (2004). Properties of the Hubert-Arabie adjusted Rand index. Psychological Methods, 9, 386–396.
Streuli, H. (1973). Der heutige Stand der Kaffeechemie. In 6th International Colloquium on Coffee Chemistry (pp. 61–72).
Subedi, S., & McNicholas, P. D. (2014). Variational Bayes approximations for clustering via mixtures of normal inverse Gaussian distributions. Advances in Data Analysis and Classification, 8(2), 167–193.
Tipping, M. E., & Bishop, C. M. (1999). Mixtures of probabilistic principal component analysers. Neural Computation, 11(2), 443–482.
Tortora, C., ElSherbiny, A., Browne, R. P., Franczak, B. C., McNicholas, P. D., & Amos, D. D. (2018). MixGHD: Model-based clustering, classification and discriminant analysis using the mixture of generalized hyperbolic distributions. R package version 2.2.
Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S (4th ed.). New York: Springer. ISBN 0-387-95457-0.
Verdoolaege, G., De Backer, S., & Scheunders, P. (2008). Multiscale colour texture retrieval using the geodesic distance between multivariate generalized Gaussian models. In 2008 15th IEEE International Conference on Image Processing (pp. 169–172).
Vrbik, I., & McNicholas, P. D. (2014). Parsimonious skew mixture models for model-based clustering and classification. Computational Statistics and Data Analysis, 71, 196–210.
Vrbik, I., & McNicholas, P. D. (2015). Fractionally-supervised classification. Journal of Classification, 32(3), 359–381.
Wolfe, J. H. (1965). A computer program for the maximum likelihood analysis of types. U.S. Naval Personnel Research Activity, Technical Bulletin 65-15.
Zhu, X., Sarkar, S., & Melnykov, V. (2022). MatTransMix: An R package for matrix model-based clustering and parsimonious mixture modeling. Journal of Classification, 39(1), 147–170.
Acknowledgements
This research was supported by the Natural Sciences and Engineering Research Council of Canada through their Discovery Grants program for Dang, Browne, and McNicholas, the Banting Postdoctoral Fellowship for Gallaugher, as well as the Canada Research Chairs program for McNicholas.
Ethics declarations
Conflict of Interest
The authors declare no competing interests.
Additional information
Code and Data Accessibility
All data used here are publicly available; references have been provided within the bibliography. An implementation of mixtures of skewed power exponential distributions and mixtures of power exponential distributions is available as the R package mixSPE (Dang et al., 2021).
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix. Supporting Information
Parameter Recovery
A three-component mixture is simulated with 500 observations in total. Group sample sizes are sampled from a multinomial distribution with mixing proportions \((0.2, 0.34, 0.46)^{\prime }\). The first component is simulated from a heavy-tailed three-dimensional MSPE distribution with \(\boldsymbol {\mu }_{1} = (12, 14, 0)^{\prime }\), β1 = 0.85, and \(\boldsymbol {\psi }_{1}=(-5, -10, 0)^{\prime }\). The second component is simulated with \(\boldsymbol {\mu }_{2} = (-3, -10, 0)^{\prime }\), β2 = 0.9, and \(\boldsymbol {\psi }_{2}=(15, 10, 0)^{\prime }\). The third component is simulated with light tails with \(\boldsymbol {\mu }_{3} = (3, 1, 0)^{\prime }\), β3 = 2, and ψ3 = ψ2. Lastly, the scale matrices were common to all three components with diag \((\boldsymbol {\Delta }_{g}) = (4, 3, 1)^{\prime }\) and
for g = 1, 2, 3. The simulated components were well separated to show parameter recovery. The MSPE model was fitted to 100 such datasets, and perfect classification was obtained on these well-separated data each time. Note that there is an identifiability issue with the individual parameter estimates (different combinations of individual parameter estimates yield the same fit), and closed-form expressions for the overall mean and variance are not available. Hence, we demonstrate parameter recovery of the overall cluster-specific means and covariances in Table 6 by comparing estimates from data simulated (via a Metropolis-Hastings rule) using the individual parameter estimates from the GEM fit. Overall, the estimates are close to the generating values.
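As an illustration of the first step of the simulation scheme above (the paper's implementation is the R package mixSPE; this standalone Python sketch reproduces only the group-size draw, with an arbitrary seed), the component sample sizes can be drawn from a multinomial distribution with the stated mixing proportions:

```python
import numpy as np

rng = np.random.default_rng(2023)  # seed chosen arbitrarily for reproducibility

n = 500                                # total number of observations
pi = np.array([0.20, 0.34, 0.46])      # mixing proportions from the appendix

# Component sample sizes (n_1, n_2, n_3) drawn once from Multinomial(n, pi)
sizes = rng.multinomial(n, pi)
```

Each dataset's component sizes then vary around the expected counts (100, 170, 230) while always summing to 500.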
Performance on Additional Datasets
In addition to those considered in the main body of the manuscript, we also considered the following data available through various R packages:
Wine Dataset
The expanded wine dataset, obtained from pgmm, contains 27 measurements of chemical aspects of 178 Italian wines of three types.
Iris Dataset
The iris dataset (included with R) consists of 150 observations, 50 from each of three species of iris. Four variables were measured: petal length, petal width, sepal length, and sepal width.
Swiss Banknote Dataset
The Swiss banknote dataset, obtained from MixGHD (Tortora et al., 2018), comprises six measurements from 100 genuine and 100 counterfeit banknotes: length, length of the diagonal, widths of the right and left edges, and the top and bottom margin widths.
Crabs Dataset
The crabs dataset, obtained from MASS (Venables & Ripley, 2002), contains 200 observations on five variables measuring characteristics of crabs. There were 100 males and 100 females from two species, orange and blue, yielding four groups based on sex/species combinations.
Bankruptcy Dataset
The bankruptcy dataset, obtained from MixGHD, contains the ratio of retained earnings to total assets, and the ratio of earnings before interest and taxes to total assets, for 33 financially sound and 33 bankrupt American firms.
Yeast Dataset
A subset of the yeast dataset from Nakai and Kanehisa (1991, 1992), sourced through the MixSAL package (Franczak et al., 2018), is also used. There are measurements on three variables: McGeoch’s method for signal sequence recognition, the score of the ALOM membrane-spanning region prediction program, and the score of discriminant analysis of the amino acid content of vacuolar and extracellular proteins. The two possible cellular localization sites for the proteins are CYT (cytosolic or cytoskeletal) and ME3 (membrane protein, no N-terminal signal).
Diabetes Dataset
The diabetes dataset, obtained from mclust (Fraley et al., 2012), considered 145 non-obese adult patients whose diabetes was classified as normal, overt, or chemical. There were three measurements: the area under the plasma glucose curve, the area under the plasma insulin curve, and the steady-state plasma glucose.
Unsupervised Classification
Unsupervised classification, i.e., clustering, is performed on the (scaled) datasets mentioned above using the same comparison distributions as on the simulated data. The ARI and the number of groups chosen by the BIC are shown in Table 7.
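The ARI reported in Table 7 (Hubert & Arabie, 1985) can be computed from pair counts over the contingency table of the two partitions. A minimal Python sketch of the standard pair-counting form of the index (an illustration, not the paper's R code) is:

```python
from math import comb
from collections import Counter

def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index between two partitions of the same observations."""
    # Pair counts from the contingency table of the two partitions
    pairs = Counter(zip(true_labels, pred_labels))
    rows = Counter(true_labels)
    cols = Counter(pred_labels)
    n = len(true_labels)

    sum_cells = sum(comb(c, 2) for c in pairs.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)   # chance-expected index
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:   # degenerate case, e.g. one cluster in each partition
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

An ARI of 1 indicates perfect agreement with the known labels, while a value near 0 is what one expects from random assignment.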
The banknote data is interesting: while the two elliptical mixtures, the MPE and Gaussian mixture models, split the counterfeit and genuine banknotes into four groups with the same overall classification (Table 8), the selected MSPE model fits three components, splitting the counterfeit banknotes into a larger and a smaller component, while the ghpcm model splits the observations into only two groups. For the crabs, the MPE distribution exhibits the best performance (with eight “blue males” misclassified into a different component), while the other three methods choose three components; however, the clusters found differ slightly (Table 8). For example, the MSPE model perfectly separates one species of crab from the other; however, for the “blue” species, it does not differentiate between the sexes. For the second species, there are only four misclassifications in differentiating the sexes. The ghpcm model has a similar fit to the MSPE model (species separated nicely but not sexes), but with two fewer misclassifications. The gpcm model had a poorer fit compared to the known labels, clustering sex better than species. The bankruptcy data show some interesting results. The ghpcm model fits three clusters to the data, two of them small with two and three observations, respectively, and performs poorly compared to the known labels (Table 8). The other three comparators performed somewhat similarly, with MSPE fitting four components (including one with eight tightly clustered points). For the diabetes data, the two selected skewed mixtures under-fit the number of components (they appear to combine the normal and chemical classes but are able to differentiate them from the overt class) compared to the elliptical mixtures, which fare better and similarly. Moreover, the estimates of skewness were not trivial, and both groups had heavier tails in the MSPE fit. On the other hand, for the MPE fit, the tails were approximately Gaussian (common βg ≈ 1).
For the yeast dataset, apart from the ghpcm mixtures, the other three mixtures overfit the number of components; however, the ghpcm mixture clustering was not meaningful compared to the known labels. For the iris data, only the MPE mixtures’ selected model had three components. For the expanded 27-dimensional wine dataset, the gpcm mixtures perform best, with perfect classification. Interestingly, the gpcm mixtures have relatively poorer performance in the semi-supervised fits on these data. Note that this phenomenon, whereby cluster analysis can obtain better results than semi-supervised classification, has been noted before, e.g., by Vrbik and McNicholas (2015) and Gallaugher and McNicholas (2019).
The relative performance of the MPE versus MSPE mixtures in Table 9 suggests that there are cases in which using these skewed mixtures might not be ideal and could be a possible subject of future work.
Semi-supervised Classification
For each dataset, we take 25 labelled/unlabelled splits with 25% supervision. In Table 9, we display the median ARI values along with the first and third quartiles over the 25 splits. For the most part, as found in the main text of this manuscript, performance in the semi-supervised scenarios was better than in the fully unsupervised scenarios. Performance across the four comparators was also quite comparable, with few exceptions. For the yeast data, both in the unsupervised and semi-supervised contexts, the MSPE mixtures performed the best. Similarly, for the diabetes data, the MSPE mixtures perform the best, with the gpcm mixtures having the poorest classification performance. On the 27-dimensional wine data, the MPE mixtures performed well, with the MSPE mixtures having more variability in ARI across the runs. For the iris dataset, the MPE and ghpcm models showed the best overall performance while, for the banknote dataset, all algorithms exhibit similar performance. For the bankruptcy data, the gpcm algorithm performed the poorest while, for crabs, the gpcm algorithm performed the best along with the ghpcm and MPE mixtures, which had similar performance. For the crabs dataset, although the MSPE models had poorer classification compared to the other three mixtures, the performance was still close to that of the other mixture distributions.
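One way such a 25%-supervision split might be generated is sketched below in Python. This is an assumption about the splitting scheme, not the paper's code (`supervised_mask` is a hypothetical helper), and it draws the labelled set uniformly at random:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

def supervised_mask(n, supervision=0.25):
    """Boolean mask: True where an observation's label is retained as known."""
    n_labelled = int(round(supervision * n))
    mask = np.zeros(n, dtype=bool)
    # Choose which observations keep their labels, without replacement
    mask[rng.choice(n, size=n_labelled, replace=False)] = True
    return mask

# One of 25 random labelled/unlabelled splits for, e.g., the 200 crabs
mask = supervised_mask(200)
```

In practice one might instead stratify the labelled set by class so that every component is represented among the labelled observations.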
A reviewer noted that, using mixtures of canonical fundamental skew t (CFUST) distributions (Lee & McLachlan, 2016), one can obtain an ARI close to 1 with only one misclassification. However, we were unable to obtain this solution from CFUST, perhaps due to a different initialization. For the semi-supervised runs, this solution could be obtained for all four comparator distributions (for the best out of 25 runs).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Dang, U.J., Gallaugher, M.P., Browne, R.P. et al. Model-Based Clustering and Classification Using Mixtures of Multivariate Skewed Power Exponential Distributions. J Classif 40, 145–167 (2023). https://doi.org/10.1007/s00357-022-09427-7