Random Partition Models and Exchangeability for Bayesian Identification of Population Structure

Bulletin of Mathematical Biology

Abstract

We introduce a Bayesian theoretical formulation of the statistical learning problem concerning the genetic structure of populations. The two key concepts in our derivation are exchangeability in its various forms and random allocation models. We discuss the implications of our results for the empirical investigation of population structure.

Author information

Corresponding author

Correspondence to Jukka Corander.

About this article

Cite this article

Corander, J., Gyllenberg, M. & Koski, T. Random Partition Models and Exchangeability for Bayesian Identification of Population Structure. Bull. Math. Biol. 69, 797–815 (2007). https://doi.org/10.1007/s11538-006-9161-1

  • DOI: https://doi.org/10.1007/s11538-006-9161-1
