Abstract
A symmetric Poissonian two-armed bandit becomes, in terms of a posteriori probabilities, a piecewise deterministic Markov decision process. For the case of switching arms, only one of which creates rewards, we explicitly solve the average optimality equation and prove that a myopic policy is average optimal.
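To illustrate how a posterior probability can evolve as a piecewise deterministic process, here is a minimal simulation sketch. It is not taken from the paper. It assumes a hypothetical model in which exactly one arm pays rewards at Poisson rate `lam`, the rewarding arm switches unobservably at rate `mu`, and `x` is the posterior probability that arm 1 is currently the rewarding one. The function names, the Euler step `dt`, and the tie-breaking rule at `x = 0.5` are illustrative choices.

```python
import random


def belief_drift(x, arm, lam, mu):
    """Drift of x = P(arm 1 is rewarding) while no reward arrives (assumed model)."""
    switching = mu * (1.0 - 2.0 * x)   # unobserved switching pulls x toward 1/2
    learning = lam * x * (1.0 - x)     # silence on the pulled arm counts against it
    return switching - learning if arm == 1 else switching + learning


def myopic_arm(x):
    """Pull the arm that is currently more likely to pay (ties go to arm 1)."""
    return 1 if x >= 0.5 else 2


def simulate(x0, lam, mu, horizon, dt=1e-3, seed=0):
    """Euler simulation of the belief path under the myopic policy."""
    rng = random.Random(seed)
    good = 1 if rng.random() < x0 else 2   # hidden rewarding arm
    x, rewards, path = x0, 0, [x0]
    for _ in range(int(horizon / dt)):
        arm = myopic_arm(x)
        if rng.random() < mu * dt:         # hidden switch of the rewarding arm
            good = 3 - good
        if arm == good and rng.random() < lam * dt:
            rewards += 1                   # a reward reveals the rewarding arm
            x = 1.0 if arm == 1 else 0.0   # belief jumps to certainty
        else:
            x += belief_drift(x, arm, lam, mu) * dt   # deterministic flow
            x = min(1.0, max(0.0, x))
        path.append(x)
    return rewards, path


rewards, path = simulate(x0=0.5, lam=2.0, mu=0.5, horizon=10.0)
```

Between rewards the belief follows a deterministic differential equation, and at each reward it jumps to 0 or 1, which is the piecewise deterministic structure the abstract refers to. Under these assumptions, the long-run average reward of the myopic policy can be estimated as `rewards / horizon`.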
Additional information
Supported by NSF grant DMS-9404177
Cite this article
Donchev, D.S., Yushkevich, A.A. Average optimality in a Poissonian bandit with switching arms. Mathematical Methods of Operations Research 45, 265–280 (1997). https://doi.org/10.1007/BF01193865