A new variational Bayesian algorithm with application to human mobility pattern modeling


Abstract

A new variational Bayesian (VB) algorithm, split and eliminate VB (SEVB), for modeling data via a Gaussian mixture model (GMM) is developed. This new algorithm makes use of component splitting in a way that is more appropriate than existing VB-based approaches for analyzing a large number of highly heterogeneous, spiky spatial patterns with weak prior information. SEVB is a highly computationally efficient approach to Bayesian inference and, like any VB-based algorithm, it performs model selection and parameter estimation simultaneously. A significant feature of our algorithm is that the fitted number of components is not limited by the initial proposal, giving increased modeling flexibility. We introduce two types of split operation, together with a new goodness-of-fit measure for evaluating mixture models, and assess their usefulness through empirical studies. In addition, we illustrate the utility of our new approach in an application to modeling human mobility patterns. This application involves large volumes of highly heterogeneous, spiky data that are difficult to model well with the standard VB approach, which is too restrictive and lacks the required flexibility. Empirical results suggest that our algorithm improves upon the goodness-of-fit achieved by the standard VB method and is more robust to various initialization settings.
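
The SEVB algorithm itself is developed in the body of the article; for orientation only, the sketch below fits the standard VB Gaussian mixture baseline that SEVB extends, using scikit-learn's BayesianGaussianMixture. This is not the authors' SEVB implementation: the library routine can only eliminate (empty out) components from a fixed initial proposal, whereas SEVB can also split components, so its fitted number of components is not capped by the initialization. The synthetic "spiky" data and all parameter settings below are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a standard variational Bayesian GMM fit (scikit-learn).
# This is NOT the paper's SEVB algorithm; SEVB additionally splits and
# eliminates components, so the fitted model is not limited by n_components.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Hypothetical "spiky" spatial data: a few tight clusters of very different sizes.
centres = rng.uniform(0, 10, size=(5, 2))
sizes = [2000, 400, 400, 100, 100]
X = np.vstack([c + 0.05 * rng.standard_normal((n, 2))
               for c, n in zip(centres, sizes)])

# Standard VB: choose an upper bound on the number of components up front and
# let a weak Dirichlet prior drive the weights of unneeded components to zero.
vb = BayesianGaussianMixture(
    n_components=20,                  # fixed upper bound, chosen generously
    covariance_type="full",
    weight_concentration_prior=1e-3,  # weak prior favours emptying components
    max_iter=500,
    random_state=0,
).fit(X)

# Components retaining non-negligible posterior weight form the selected model.
active = vb.weights_ > 1e-2
print("active components:", active.sum())
print("means of active components:\n", vb.means_[active])
```

In this baseline, model selection happens only by pruning: components whose posterior weights collapse are discarded, so the quality of the fit hinges on choosing n_components generously at initialization. That restriction is precisely what the split operations in SEVB are designed to remove.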



Author information

Corresponding author

Correspondence to Anthony N. Pettitt.

About this article

Cite this article

Wu, B., McGrory, C.A. & Pettitt, A.N. A new variational Bayesian algorithm with application to human mobility pattern modeling. Stat Comput 22, 185–203 (2012). https://doi.org/10.1007/s11222-010-9217-9

