Two-armed bandit problem for parallel data processing systems

Kolnogorov, A. V.

doi:10.1134/S0032946012010085

Large Systems
Published: 17 April 2012

Problems of Information Transmission, Volume 48, pages 72–84 (2012)

Abstract

We consider an application of the two-armed bandit problem to processing a large number N of data items when two alternative processing methods are available. We propose a strategy that compares the methods during the initial stages, of which there are at most r − 1, and at the final stage applies only the method found to be best in the comparison. We find asymptotically optimal parameters of the strategy and observe that the minimax risk is of order $N^{\alpha}$, where $\alpha = 2^{r-1}/(2^r - 1)$. Under parallel processing, the total operation time is determined by the number r of stages rather than by the number N of data items.
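To make the dependence of the risk exponent on the number of stages concrete, the following minimal sketch evaluates α = 2^(r−1)/(2^r − 1) for a few values of r. The function name `risk_exponent` is an illustrative assumption and does not come from the article; the code only tabulates the exponent stated in the abstract and does not implement the strategy itself.

```python
from fractions import Fraction


def risk_exponent(r: int) -> Fraction:
    """Exponent alpha = 2^(r-1) / (2^r - 1) for an r-stage strategy.

    Hypothetical helper: it only evaluates the formula quoted in the
    abstract, using exact rational arithmetic.
    """
    if r < 1:
        raise ValueError("the number of stages r must be at least 1")
    return Fraction(2 ** (r - 1), 2 ** r - 1)


if __name__ == "__main__":
    # Tabulate the exponent for a small number of stages.
    for r in range(1, 7):
        alpha = risk_exponent(r)
        print(f"r = {r}: alpha = {alpha} ≈ {float(alpha):.4f}")
```

With r = 1 the exponent equals 1, for r = 2 it is 2/3, and as r grows it decreases toward 1/2, so under this formula a handful of stages already brings the exponent close to its limiting value.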

Author information

Authors and Affiliations

Applied Mathematics and Information Science Chair, Yaroslav-the-Wise Novgorod State University, Novgorod, Russia
A. V. Kolnogorov

Corresponding author

Correspondence to A. V. Kolnogorov.

Additional information

Original Russian Text © A.V. Kolnogorov, 2012, published in Problemy Peredachi Informatsii, 2012, Vol. 48, No. 1, pp. 83–95.

About this article

Cite this article

Kolnogorov, A.V. Two-armed bandit problem for parallel data processing systems. Probl Inf Transm 48, 72–84 (2012). https://doi.org/10.1134/S0032946012010085

Received: 22 March 2011
Accepted: 19 September 2011
Published: 17 April 2012
Issue Date: March 2012
DOI: https://doi.org/10.1134/S0032946012010085
