Abstract
In this chapter, we address the problem of sequential prediction, where the goal is to make the cumulative prediction loss as small as possible. This problem reduces to minimizing the cumulative code-length when the code-length is computed sequentially. We consider three types of prediction algorithms: the maximum likelihood prediction algorithm, the Bayesian prediction algorithm, and the sequentially normalized maximum likelihood algorithm. We give their asymptotic analysis in terms of the so-called redundancy. Here, asymptotic analysis means that the data length grows large while the number of parameters is fixed. We also provide a high-dimensional asymptotic analysis for the Bayesian prediction algorithm, in which the number of parameters increases together with the data length. There, the spike-and-tail density function is introduced as the prior density, and its effectiveness is discussed.
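The connection between sequential prediction under log-loss and sequential code-length minimization can be illustrated with a minimal sketch. The example below is not the chapter's own construction: it uses a Bernoulli source with a Krichevsky–Trofimov (Beta(1/2, 1/2)) prior as a stand-in for the Bayesian prediction algorithm, and compares its cumulative code-length against the hindsight maximum-likelihood code-length; the difference is the redundancy analyzed asymptotically in the chapter.

```python
import math

def bayes_sequential_code_length(bits, a=0.5):
    """Sequential Bayesian prediction for a binary sequence with a
    Beta(a, a) prior (a = 0.5 gives the Krichevsky-Trofimov estimator):
    at each step, predict with the posterior mean and accumulate the
    code-length -log2 p(x_t | x_1, ..., x_{t-1})."""
    ones = zeros = 0
    total = 0.0
    for x in bits:
        p_one = (ones + a) / (ones + zeros + 2 * a)  # predictive prob. of 1
        p = p_one if x == 1 else 1.0 - p_one
        total += -math.log2(p)  # code-length equals the log-loss in bits
        ones += x
        zeros += 1 - x
    return total

def ml_hindsight_code_length(bits):
    """Code-length of the maximum-likelihood parameter chosen in hindsight:
    n * H(k/n) bits, where k is the number of ones among n symbols."""
    n, k = len(bits), sum(bits)
    if k in (0, n):
        return 0.0
    q = k / n
    return n * (-q * math.log2(q) - (1 - q) * math.log2(1 - q))

bits = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
redundancy = bayes_sequential_code_length(bits) - ml_hindsight_code_length(bits)
```

For this one-parameter family the redundancy grows like (1/2) log n, an instance of the (k/2) log n behavior that the chapter's asymptotic analysis establishes for k-parameter models.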
© 2023 Springer Nature Singapore Pte Ltd.
Yamanishi, K. (2023). Sequential Prediction. In: Learning with the Minimum Description Length Principle . Springer, Singapore. https://doi.org/10.1007/978-981-99-1790-7_5
Print ISBN: 978-981-99-1789-1
Online ISBN: 978-981-99-1790-7