Sequential Prediction

Chapter in:
Learning with the Minimum Description Length Principle

Abstract

In this chapter, we address the issue of sequential prediction, where the goal is to make the cumulative prediction loss as small as possible. This problem reduces to minimizing the cumulative code-length when code-lengths are computed sequentially. We consider three types of prediction algorithms: the maximum likelihood prediction algorithm, the Bayesian prediction algorithm, and the sequentially normalized maximum likelihood algorithm. We analyze each of them asymptotically in terms of the so-called redundancy; here, the asymptotic regime is one in which the data length grows sufficiently large while the number of parameters stays fixed. We also provide a high-dimensional asymptotic analysis of the Bayesian prediction algorithm, in which the number of parameters increases together with the data length. In that analysis, the spike-and-tail density function is introduced as the prior density, and its effectiveness is discussed.
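To make the reduction from prediction loss to code-length concrete, the following is a minimal sketch (not the chapter's own implementation) of sequential prediction of a binary sequence under logarithmic loss, assuming a simple Bernoulli model. The three predictors mirror the three algorithm types named in the abstract: an ML plug-in predictor, a Bayesian mixture predictor (here taken to be the Krichevsky-Trofimov prior, one common choice), and a sequentially normalized maximum likelihood (SNML) predictor. All function names and the clipping constant are illustrative assumptions.

```python
import math

def ml_prob_one(n1, n0):
    """ML plug-in estimate of P(x=1); clipped, since an unclipped
    zero-probability event would incur infinite log-loss."""
    t = n1 + n0
    p = n1 / t if t > 0 else 0.5
    eps = 1e-6  # illustrative clipping constant
    return min(max(p, eps), 1.0 - eps)

def kt_prob_one(n1, n0):
    """Bayesian predictive probability under a Beta(1/2, 1/2)
    (Krichevsky-Trofimov) prior: add 1/2 to each symbol count."""
    return (n1 + 0.5) / (n1 + n0 + 1.0)

def snml_prob_one(n1, n0):
    """Sequentially normalized ML: P(x=1) is proportional to the
    likelihood of (x^t, 1) maximized over the Bernoulli parameter."""
    def max_lik(a, b):  # sup_theta theta^a (1-theta)^b, with 0^0 = 1
        n = a + b
        return (a / n) ** a * (b / n) ** b if n > 0 else 1.0
    l1, l0 = max_lik(n1 + 1, n0), max_lik(n1, n0 + 1)
    return l1 / (l1 + l0)

def cumulative_log_loss(xs, prob_one):
    """Sum of -log P(x_t | x^{t-1}), i.e. the total code-length
    of the sequence under the sequential predictive code."""
    n1 = n0 = 0
    loss = 0.0
    for x in xs:
        p = prob_one(n1, n0)
        loss += -math.log(p if x == 1 else 1.0 - p)
        n1 += x
        n0 += 1 - x
    return loss

xs = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
for name, f in [("ML", ml_prob_one), ("KT", kt_prob_one), ("SNML", snml_prob_one)]:
    print(name, round(cumulative_log_loss(xs, f), 4))
```

Since -log P(x_t | x^{t-1}) is the ideal code-length of x_t under the predictive distribution, each predictor's cumulative log-loss equals the total code-length of a corresponding sequential code; the excess of this quantity over the best achievable code-length is, roughly, the redundancy that the chapter's asymptotic analysis quantifies.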

Author information

Correspondence to Kenji Yamanishi.

Copyright information

© 2023 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Yamanishi, K. (2023). Sequential Prediction. In: Learning with the Minimum Description Length Principle. Springer, Singapore. https://doi.org/10.1007/978-981-99-1790-7_5

  • DOI: https://doi.org/10.1007/978-981-99-1790-7_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-1789-1

  • Online ISBN: 978-981-99-1790-7

  • eBook Packages: Computer Science, Computer Science (R0)
