Abstract
In Chapter 2, we introduced the basic principles of PA and used them to derive performance-derivative formulas for queueing networks and for Markov and semi-Markov systems. In Chapter 3, we developed sample-path-based (on-line learning) algorithms for estimating the performance derivatives, as well as sample-path-based optimization schemes. In this chapter, we show that the performance-sensitivity-based view leads to a unified approach to both PA and Markov decision processes (MDPs).
One of the principal objects of theoretical research in my department of knowledge is to find the point of view from which the subject appears in its greatest simplicity.
Josiah Willard Gibbs, American scientist (1839–1903)
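To make the MDP setting concrete, the following is a minimal policy-iteration sketch for a discounted toy MDP. The two-state model, its transition matrices `P`, rewards `r`, and discount factor are invented here for illustration only; they do not come from the chapter, which develops the sensitivity-based view in full generality.

```python
import numpy as np

# Toy discounted MDP: 2 states, 2 actions (all numbers are illustrative).
gamma = 0.9                       # discount factor
# P[a][s, s'] : transition probability under action a
P = {
    0: np.array([[0.8, 0.2], [0.3, 0.7]]),
    1: np.array([[0.5, 0.5], [0.9, 0.1]]),
}
# r[a][s] : one-step reward for taking action a in state s
r = {0: np.array([1.0, 0.0]), 1: np.array([0.5, 2.0])}

policy = np.array([0, 0])         # initial policy: action 0 everywhere

while True:
    # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
    P_pi = np.array([P[policy[s]][s] for s in range(2)])
    r_pi = np.array([r[policy[s]][s] for s in range(2)])
    v = np.linalg.solve(np.eye(2) - gamma * P_pi, r_pi)
    # Policy improvement: greedy one-step lookahead on the Q-values.
    q = np.array([[r[a][s] + gamma * P[a][s] @ v for a in (0, 1)]
                  for s in range(2)])
    new_policy = q.argmax(axis=1)
    if np.all(new_policy == policy):
        break                     # policy is greedy w.r.t. its own value: optimal
    policy = new_policy

print(policy, v)
```

Policy iteration is the classical MDP solution method; the sensitivity-based view of this chapter interprets the improvement step through performance-difference formulas rather than through dynamic programming alone.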
© 2007 Springer Science+Business Media, LLC
Cao, X. R. (2007). Markov Decision Processes. In: Stochastic Learning and Optimization. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-69082-7_4
Print ISBN: 978-0-387-36787-3
Online ISBN: 978-0-387-69082-7