Mixture Models

Bayesian Essentials with R

Part of the book series: Springer Texts in Statistics (STS)

Abstract

This chapter covers a class of models in which a rather simple distribution is made more complex and less informative by a mechanism that mixes together several known or unknown distributions. Such a representation is naturally called a mixture of distributions. Inference about the parameters of the mixture components and their weights is called mixture estimation, while recovering the original distribution of each observation is called classification (or, more exactly, unsupervised classification, to distinguish it from the supervised classification discussed in Chap. 8).
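As a minimal illustration of this representation, the following R sketch (all numerical settings are illustrative, not taken from the chapter) simulates observations from a two-component normal mixture along with the latent allocations that classification seeks to recover:

    # Simulate from the mixture p*N(mu1, 1) + (1-p)*N(mu2, 1),
    # keeping the latent component labels z that classification targets
    set.seed(42)
    n  <- 500
    p  <- 0.3                      # weight of the first component
    mu <- c(-2, 3)                 # component means (illustrative)
    z  <- rbinom(n, 1, 1 - p) + 1  # z = 1 w.p. p, z = 2 w.p. 1 - p
    x  <- rnorm(n, mean = mu[z], sd = 1)
    hist(x, breaks = 40, freq = FALSE,
         main = "Sample from a two-component normal mixture")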

Notes

  1.

    This is not a definition in the mathematical sense since all densities can formally be represented that way. We thus stress that the model itself must be introduced that way. This point is not to be mistaken for a requirement that the variable z be meaningful for the data at hand. In many cases, for instance the probit model, the missing variable representation remains formal.
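    In generic notation (ours, not necessarily the chapter's), this missing-variable representation writes a density as a marginal over the latent variable z,
    \[ g(x) \,=\, \sum_{z} \Pr(Z = z)\, f(x \mid z), \]
    which indeed holds formally for any density g; hence the representation must be part of the model specification itself.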

  2.

    We will see later that the missing structure of a mixture need not actually be simulated; for more complex missing-variable structures such as hidden Markov models (introduced in Chap. 7), however, this completion cannot be avoided.

  3.

    The Frenchman Alphonse Bertillon is also the father of scientific police investigation. For instance, he originated the use of fingerprints in criminal investigations.

  4.

    To get a better understanding of this second mode, consider the limiting setting where p = 0.5. In that case, the likelihood has two equivalent modes, \((\mu_1,\mu_2)\) and \((\mu_2,\mu_1)\). As p moves away from 0.5, this second mode becomes lower and lower relative to the other, but it never disappears.
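    A quick way to see both modes is to evaluate the mixture log-likelihood on a grid of \((\mu_1,\mu_2)\) values; the R sketch below (simulated data, illustrative settings) does so for the symmetric case p = 0.5:

    # Log-likelihood of p*N(mu1, 1) + (1-p)*N(mu2, 1) over a grid
    loglik <- function(mu1, mu2, x, p)
      sum(log(p * dnorm(x, mu1) + (1 - p) * dnorm(x, mu2)))
    set.seed(1)
    p <- 0.5
    x <- c(rnorm(100, -2), rnorm(100, 2))
    g <- seq(-4, 4, length.out = 80)
    ll <- outer(g, g, Vectorize(function(a, b) loglik(a, b, x, p)))
    contour(g, g, ll, xlab = expression(mu[1]), ylab = expression(mu[2]))
    # For p = 0.5 the two modes are exact mirror images across the
    # diagonal; moving p away from 0.5 lowers one without removing it.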

  5.

    In non-Bayesian statistics, the EM algorithm is certainly the most widely used numerical method, even though it only applies to (real or artificial) missing-variable models.

  6.

    Historically, missing-variable models were among the first settings in which the Gibbs sampler was used, completing the missing variables by simulation under the name of data augmentation (see Tanner, 1996, and Robert and Casella, 2004, Chaps. 9 and 10).
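    As a minimal sketch of data augmentation in this setting (known weight, unit variances, and N(0, 10) priors on the means; these assumptions are ours, not the book's exact algorithm), a Gibbs sampler alternates between completing the labels and updating the means:

    # Data-augmentation Gibbs sampler for p*N(mu1, 1) + (1-p)*N(mu2, 1)
    set.seed(7)
    p <- 0.3; tau2 <- 10                     # known weight, prior variance
    x <- c(rnorm(150, -2), rnorm(350, 3)); n <- length(x)
    mu <- c(0, 0); niter <- 1000
    out <- matrix(NA, niter, 2)
    for (t in 1:niter) {
      # 1. Complete the missing labels given the current means
      w1 <- p * dnorm(x, mu[1])
      w2 <- (1 - p) * dnorm(x, mu[2])
      z  <- 1 + (runif(n) > w1 / (w1 + w2))  # z = 1 or 2
      # 2. Update each mean from its conditional normal posterior
      for (j in 1:2) {
        nj <- sum(z == j); sj <- sum(x[z == j])
        v  <- 1 / (nj + 1 / tau2)            # conditional variance
        mu[j] <- rnorm(1, v * sj, sqrt(v))   # conditional mean v * sj
      }
      out[t, ] <- mu
    }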

  7.

    That this is a natural estimate of the model, compared with the “plug-in” density using the estimates of the parameters, will be explained more clearly in Sect. 6.5.

  8.

    In practice, the Gibbs sampler never leaves the vicinity of a given mode when the attraction of that mode is strong enough, for instance when the number of observations is large.

  9.

    While this resolution seems intuitive enough, there is still considerable debate in academic circles on whether or not label switching should be observed in an MCMC output and, if it should, on which substitute for the posterior mean ought to be used.
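    One common (and, as noted, debated) post-processing remedy is to impose an identifiability constraint after the fact, e.g. relabelling each draw so that \(\mu_1 < \mu_2\); a short sketch, reusing the out matrix from the sampler above:

    # Reorder the two mean draws within each iteration so mu1 < mu2,
    # then average; 'out' is the niter x 2 matrix of Gibbs draws
    relabelled <- t(apply(out, 1, sort))
    colMeans(relabelled)  # component-wise means after relabelling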

  10.

    This section may be skipped by most readers, as it only addresses the very specific issue of handling improper priors in mixture estimation.

  11.

    By nature, ill-posed problems are not precisely defined. They cover classes of models such as inverse problems, where recovering the parameters from the data is exceedingly complex. They are not to be confused with nonidentifiable problems, though.

References

  • Chib, S. (1995). Marginal likelihood from the Gibbs output. J. American Statist. Assoc., 90:1313–1321.

  • Frühwirth-Schnatter, S. (2006). Finite Mixture and Markov Switching Models. Springer-Verlag, New York.

  • Gelfand, A. and Dey, D. (1994). Bayesian model choice: asymptotics and exact calculations. J. Royal Statist. Society Series B, 56:501–514.

  • Green, P. (1995). Reversible jump MCMC computation and Bayesian model determination. Biometrika, 82:711–732.

  • Hjort, N., Holmes, C., Müller, P., and Walker, S. (2010). Bayesian Nonparametrics. Cambridge University Press, Cambridge.

  • Marin, J.-M. and Robert, C. (2007). Bayesian Core. Springer-Verlag, New York.

  • Robert, C. and Casella, G. (2004). Monte Carlo Statistical Methods. Springer-Verlag, New York, second edition.

  • Tanner, M. (1996). Tools for Statistical Inference: Observed Data and Data Augmentation Methods. Springer-Verlag, New York, third edition.

Copyright information

© 2014 Springer Science+Business Media New York

Cite this chapter

Marin, J.-M. and Robert, C.P. (2014). Mixture Models. In: Bayesian Essentials with R. Springer Texts in Statistics. Springer, New York. https://doi.org/10.1007/978-1-4614-8687-9_6
