Average optimality in a Poissonian bandit with switching arms

Donchev, Doncho S.; Yushkevich, Alexander A.

doi:10.1007/BF01193865

Published: June 1997

Volume 45, pages 265–280, (1997)
Mathematical Methods of Operations Research


Abstract

A symmetric Poissonian two-armed bandit becomes, in terms of a posteriori probabilities, a piecewise deterministic Markov decision process. For the case of switching arms, only one of which creates rewards, we solve the average optimality equation explicitly and prove that a myopic policy is average optimal.
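The piecewise deterministic structure mentioned in the abstract can be illustrated with a small simulation. This is only a sketch under assumed model details that the abstract does not give: the rewarding role is assumed to switch between the arms at intensity `lam`, the rewarding arm is assumed to pay unit rewards at Poisson intensity `mu` while it is played, and `x` denotes the posterior probability that arm 1 is currently the rewarding one. The function names `belief_drift` and `simulate_myopic`, the parameter values, and the Euler time step are all hypothetical choices, not the authors' construction.

```python
import random


def belief_drift(x, arm, lam, mu):
    """Drift of the posterior x = P(arm 1 is rewarding) while no reward arrives.

    Assumed filter: symmetric switching pulls x toward 1/2, and the absence
    of rewards on the played arm shifts belief toward the other arm.
    """
    switching = lam * (1.0 - 2.0 * x)
    learning = mu * x * (1.0 - x)
    return switching - learning if arm == 1 else switching + learning


def simulate_myopic(lam=0.5, mu=2.0, horizon=200.0, dt=0.001, seed=0):
    """Euler simulation of the myopic policy: play the arm that currently
    looks more likely to be rewarding. Returns (average reward, final belief)."""
    rng = random.Random(seed)
    state = rng.choice((1, 2))  # hidden index of the rewarding arm
    x = 0.5                     # symmetric prior (an assumption)
    total = 0
    for _ in range(int(horizon / dt)):
        arm = 1 if x >= 0.5 else 2
        # Hidden switch of the rewarding arm.
        if rng.random() < lam * dt:
            state = 3 - state
        # A reward can only come from the arm that is played and rewarding.
        if arm == state and rng.random() < mu * dt:
            total += 1
            # A reward reveals the state: the belief jumps to certainty.
            x = 1.0 if arm == 1 else 0.0
        else:
            # Between rewards the belief moves deterministically.
            x += belief_drift(x, arm, lam, mu) * dt
            x = min(1.0, max(0.0, x))
    return total / horizon, x


if __name__ == "__main__":
    avg, final_x = simulate_myopic()
    print(f"average reward per unit time: {avg:.3f}, final belief: {final_x:.3f}")
```

Between rewards the belief follows a deterministic path, and each reward produces a jump to 0 or 1, which is the sense in which the process is piecewise deterministic under these assumptions. The sketch only simulates the myopic rule; it does not verify average optimality.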

Author information

Authors and Affiliations

Department of Mathematics, Higher Institute of Food and Flavor Industry, Maritza 26, 4002, Plovdiv, Bulgaria
Doncho S. Donchev
Department of Mathematics, University of North Carolina at Charlotte, 28223, Charlotte, NC, USA
Alexander A. Yushkevich

Additional information

Supported by NSF grant DMS-9404177

About this article

Cite this article

Donchev, D.S., Yushkevich, A.A. Average optimality in a Poissonian bandit with switching arms. Mathematical Methods of Operations Research 45, 265–280 (1997). https://doi.org/10.1007/BF01193865

Received: 15 April 1996
Issue Date: June 1997
DOI: https://doi.org/10.1007/BF01193865
