
6 - Approximate inference for continuous-time Markov processes

from II - Deterministic approximations

Published online by Cambridge University Press: 07 September 2011

Cédric Archambeau (University College London)
Manfred Opper (Technische Universität Berlin)
David Barber (University College London)
A. Taylan Cemgil (Boğaziçi Üniversitesi, Istanbul)
Silvia Chiappa (University of Cambridge)

Summary

Introduction

Markov processes are probabilistic models for describing data with a sequential structure. Probably the most common example is a dynamical system whose state evolves over time. For modelling purposes it is often convenient to assume that the system states are not directly observed: each observation is a possibly incomplete, non-linear and noisy measurement (or transformation) of the underlying hidden state. In general, observations of the system occur only at discrete times, while the underlying system is inherently continuous in time. Continuous-time Markov processes arise in a variety of scientific areas such as physics, environmental modelling, finance, engineering and systems biology.
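
To fix ideas, a standard way of writing such a model is as a stochastic differential equation observed at discrete times (the notation below is a common convention for diffusion processes, not taken verbatim from the chapter):

\[
\mathrm{d}X_t = f(X_t)\,\mathrm{d}t + \Sigma^{1/2}\,\mathrm{d}W_t,
\qquad
y_k = h(X_{t_k}) + \varepsilon_k,
\quad
\varepsilon_k \sim \mathcal{N}(0, R),
\]

where the hidden state \(X_t\) evolves continuously in time, driven by a Wiener process \(W_t\) through a (possibly non-linear) drift \(f\) and diffusion matrix \(\Sigma\), while the observations \(y_k\) are noisy, possibly incomplete transformations \(h\) of the state at discrete measurement times \(t_1 < t_2 < \cdots\).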

The continuous-time evolution of the system imposes strong constraints on the model dynamics. For example, the individual trajectories of a diffusion process are rough, but the mean trajectory is a smooth function of time. Unfortunately, this information is often exploited only partially, or not at all, when devising practical systems. The main reason is that inferring the state trajectories and the model parameters is a difficult problem: trajectories are infinite-dimensional objects. Hence, a practical approach usually requires some sort of approximation. For example, Markov chain Monte Carlo (MCMC) methods usually discretise time [41, 16, 34, 2, 20], while particle filters approximate continuous densities by a finite number of point masses [13, 14, 15]. More recently, approaches using perfect simulation have been proposed [7, 8, 18]. The main advantage of these MCMC techniques is that they avoid approximating the transition density by a time discretisation.
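
The following sketch illustrates the two approximations just mentioned: an Euler-Maruyama time discretisation of a diffusion, and a bootstrap particle filter that represents the filtering density by a finite set of point masses. The model, parameter values and function names are illustrative choices, not taken from the chapter.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative scalar SDE: dX_t = -X_t dt + sigma dW_t (Ornstein-Uhlenbeck).
    sigma = 0.5   # diffusion coefficient
    dt = 0.01     # Euler-Maruyama step size

    def em_step(x):
        # One Euler-Maruyama step: the transition density over [t, t + dt]
        # is approximated by a Gaussian N(x + drift(x) dt, sigma^2 dt).
        return x - x * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)

    # Simulate a latent path, observed with Gaussian noise every 100 steps.
    n_steps, obs_every, obs_std = 1000, 100, 0.2
    x = np.array([1.0])
    observations = []
    for t in range(n_steps):
        x = em_step(x)
        if (t + 1) % obs_every == 0:
            observations.append(x.item() + obs_std * rng.standard_normal())

    # Bootstrap particle filter: the filtering density is represented by a
    # finite set of point masses (particles), which are propagated through
    # the discretised SDE and reweighted by the observation likelihood.
    n_particles = 500
    particles = np.ones(n_particles)
    for k, y in enumerate(observations):
        for _ in range(obs_every):      # propagate between observations
            particles = em_step(particles)
        log_w = -0.5 * ((y - particles) / obs_std) ** 2   # Gaussian log-likelihood
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        particles = rng.choice(particles, size=n_particles, p=w)  # resample
        print(f"obs {k}: y = {y:+.3f}, filtered mean = {particles.mean():+.3f}")

Note the trade-off the paragraph describes: the Gaussian one-step transition density is exact only in the limit dt -> 0, and this is precisely the discretisation error that the perfect-simulation approaches avoid.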

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2011


References

[1] Y. Aït-Sahalia. Closed-form likelihood expansions for multivariate diffusions. Annals of Statistics, 36:906–937, 2008.
[2] F. J. Alexander, G. L. Eyink and J. M. Restrepo. Accelerated Monte Carlo for optimal estimation of time series. Journal of Statistical Physics, 119:1331–1345, 2005.
[3] C. Archambeau, D. Cornford, M. Opper and J. Shawe-Taylor. Gaussian process approximation of stochastic differential equations. Journal of Machine Learning Research: Workshop and Conference Proceedings, 1:1–16, 2007.
[4] C. Archambeau, M. Opper, Y. Shen, D. Cornford and J. Shawe-Taylor. Variational inference for diffusion processes. In J. C. Platt, D. Koller, Y. Singer and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 17–24. MIT Press, 2008.
[5] D. Barber and C. M. Bishop. Ensemble learning for multi-layer networks. In M. I. Jordan, M. J. Kearns and S. A. Solla, editors, Advances in Neural Information Processing Systems 10, pages 395–401. MIT Press, 1998.
[6] J. O. Berger. Statistical Decision Theory and Bayesian Analysis. Springer, 1985.
[7] A. Beskos, O. Papaspiliopoulos, G. Roberts and P. Fearnhead. Exact and computationally efficient likelihood-based estimation for discretely observed diffusion processes (with discussion). Journal of the Royal Statistical Society B, 68(3):333–382, 2006.
[8] A. Beskos, G. Roberts, A. Stuart and J. Voss. MCMC methods for diffusion bridges. Stochastics and Dynamics, 8(3):319–350, 2008.
[9] C. M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.
[10] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[11] I. Cohn, T. El-Hay, N. Friedman and R. Kupferman. Mean field variational approximation for continuous-time Bayesian networks. In 25th International Conference on Uncertainty in Artificial Intelligence, pages 91–100, 2009.
[12] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, 1991.
[13] D. Crisan and T. Lyons. A particle approximation of the solution of the Kushner-Stratonovitch equation. Probability Theory and Related Fields, 115(4):549–578, 1999.
[14] P. Del Moral and J. Jacod. Interacting particle filtering with discrete observations. In A. Doucet, N. de Freitas and N. Gordon, editors, Sequential Monte Carlo Methods in Practice, pages 43–76. Springer, 2001.
[15] P. Del Moral, J. Jacod and P. Protter. The Monte Carlo method for filtering with discrete-time observations. Probability Theory and Related Fields, 120:346–368, 2002.
[16] B. Eraker. MCMC analysis of diffusion models with application to finance. Journal of Business and Economic Statistics, 19:177–191, 2001.
[17] G. L. Eyink, J. M. Restrepo and F. J. Alexander. A mean field approximation in data assimilation for nonlinear dynamics. Physica D, 194:347–368, 2004.
[18] P. Fearnhead, O. Papaspiliopoulos and G. O. Roberts. Particle filters for partially-observed diffusions. Journal of the Royal Statistical Society B, 70:755–777, 2008.
[19] R. P. Feynman and A. R. Hibbs. Quantum Mechanics and Path Integrals. McGraw-Hill Book Company, 1965.
[20] A. Golightly and D. J. Wilkinson. Bayesian sequential inference for nonlinear multivariate diffusions. Statistics and Computing, 16:323–338, 2006.
[21] A. Honkela and H. Valpola. Unsupervised variational Bayesian learning of nonlinear models. In L. Saul, Y. Weiss and L. Bottou, editors, Advances in Neural Information Processing Systems 17, pages 593–600. MIT Press, 2005.
[22] M. I. Jordan, editor. Learning in Graphical Models. MIT Press, 1998.
[23] I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus. Springer, 1998.
[24] H. Kleinert. Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets. World Scientific, 2006.
[25] P. E. Kloeden and E. Platen. Numerical Solution of Stochastic Differential Equations. Springer, 1999.
[26] H. J. Kushner. On the differential equations satisfied by conditional probability densities of Markov processes with applications. Journal of SIAM, Series A: Control, 2:106–119, 1962.
[27] H. Lappalainen and J. W. Miskin. Ensemble learning. In M. Girolami, editor, Advances in Independent Component Analysis, pages 76–92. Springer-Verlag, 2000.
[28] D. J. C. MacKay. Bayesian interpolation. Neural Computation, 4(3):415–447, 1992.
[29] B. Øksendal. Stochastic Differential Equations. Springer-Verlag, 2005.
[30] M. Opper and D. Saad, editors. Advanced Mean Field Methods: Theory and Practice. MIT Press, 2001.
[31] M. Opper and G. Sanguinetti. Variational inference for Markov jump processes. In J. C. Platt, D. Koller, Y. Singer and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 1105–1112. MIT Press, 2008.
[32] E. Pardoux. Équations du filtrage non linéaire, de la prédiction et du lissage. Stochastics, 6:193–231, 1982.
[33] L. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989.
[34] G. Roberts and O. Stramer. On inference for partially observed non-linear diffusion models using the Metropolis-Hastings algorithm. Biometrika, 88(3):603–621, 2001.
[35] J. B. Roberts and P. D. Spanos. Random Vibration and Statistical Linearization. Dover Publications, 2003.
[36] A. Ruttor, G. Sanguinetti and M. Opper. Approximate inference for stochastic reaction processes. In M. Rattray, N. D. Lawrence, M. Girolami and G. Sanguinetti, editors, Learning and Inference in Computational Systems Biology, pages 189–205. MIT Press, 2009.
[37] G. Sanguinetti, A. Ruttor, M. Opper and C. Archambeau. Switching regulatory models of cellular stress response. Bioinformatics, 25(10):1280–1286, 2009.
[38] S. Särkkä. Recursive Bayesian Inference on Stochastic Differential Equations. PhD thesis, Helsinki University of Technology, Finland, 2006.
[39] M. Seeger. Bayesian model selection for support vector machines, Gaussian processes and other kernel classifiers. In T. G. Dietterich, S. Becker and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 12, pages 603–609. MIT Press, 2000.
[40] Y. Shen, C. Archambeau, D. Cornford and M. Opper. Markov chain Monte Carlo for inference in partially observed nonlinear diffusions. In Proceedings of the Newton Institute for Mathematical Sciences Workshop on Inference and Estimation in Probabilistic Time-Series Models, pages 67–78, 2008.
[41] O. Elerian, S. Chib and N. Shephard. Likelihood inference for discretely observed nonlinear diffusions. Econometrica, 69(4):959–993, 2001.
[42] R. L. Stratonovich. Conditional Markov processes. Theory of Probability and its Applications, 5:156–178, 1960.
[43] B. Wang and D. M. Titterington. Lack of consistency of mean field and variational Bayes approximations for state space models. Neural Processing Letters, 20(3):151–170, 2004.
