Robust mixture modeling using the skew t distribution

Statistics and Computing

Abstract

A finite mixture model using the Student's t distribution has been recognized as a robust extension of normal mixtures. Recently, a mixture of skew normal distributions has been found to be effective in the treatment of heterogeneous data involving asymmetric behaviors across subclasses. In this article, we propose a robust mixture framework based on the skew t distribution to efficiently deal with heavy-tailedness, extra skewness and multimodality in a wide range of settings. Statistical mixture modeling based on normal, Student's t and skew normal distributions can be viewed as special cases of the skew t mixture model. We present analytically simple EM-type algorithms for iteratively computing maximum likelihood estimates. The proposed methodology is illustrated by analyzing a real data example.
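For concreteness, the model family described in the abstract can be written out as below. This is only a sketch: it assumes the Azzalini-type univariate skew t parametrization, and the symbols (g components, weights w_i, location \xi_i, scale \sigma_i, skewness \lambda_i, degrees of freedom \nu_i) are notation chosen here for illustration, since the abstract itself does not fix any.

```latex
% Assumed notation, not taken from the abstract:
% g components, mixing weights w_i, location \xi_i, scale \sigma_i,
% skewness \lambda_i, degrees of freedom \nu_i.
f(y \mid \Theta) = \sum_{i=1}^{g} w_i \, \psi(y \mid \xi_i, \sigma_i^2, \lambda_i, \nu_i),
\qquad w_i \ge 0, \quad \sum_{i=1}^{g} w_i = 1,

% Skew t component density; t_\nu is the Student's t pdf with \nu degrees
% of freedom and T_{\nu+1} is the Student's t cdf with \nu + 1 degrees of freedom.
\psi(y \mid \xi, \sigma^2, \lambda, \nu)
  = \frac{2}{\sigma} \, t_{\nu}(\eta) \,
    T_{\nu+1}\!\left( \lambda \eta \sqrt{\frac{\nu + 1}{\eta^2 + \nu}} \right),
\qquad \eta = \frac{y - \xi}{\sigma}.
```

Under this assumed form, setting \lambda = 0 gives T_{\nu+1}(0) = 1/2 and the component reduces to a Student's t density; letting \nu \to \infty gives the skew normal density (2/\sigma)\,\phi(\eta)\,\Phi(\lambda\eta); applying both gives the normal density. This is consistent with the abstract's statement that normal, Student's t and skew normal mixtures are special cases.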

Author information

Corresponding author

Correspondence to Tsung I. Lin.

About this article

Cite this article

Lin, T.I., Lee, J.C. & Hsieh, W.J. Robust mixture modeling using the skew t distribution. Stat Comput 17, 81–92 (2007). https://doi.org/10.1007/s11222-006-9005-8

  • DOI: https://doi.org/10.1007/s11222-006-9005-8
