
Abstract

In Chapter 2, we introduced the basic principles of perturbation analysis (PA) and used them to derive performance derivative formulas for queueing networks and for Markov and semi-Markov systems. In Chapter 3, we developed sample-path-based (on-line learning) algorithms for estimating performance derivatives, together with sample-path-based optimization schemes. In this chapter, we show that the performance-sensitivity-based view leads to a unified approach to both PA and Markov decision processes (MDPs).

One of the principal objects of theoretical research in my department of knowledge is to find the point of view from which the subject appears in its greatest simplicity.

Josiah Willard Gibbs, American scientist (1839–1903)
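
To make the abstract's claim of a unified approach concrete: the link between PA and MDPs runs through the performance potentials g, which appear both in the PA derivative formula dη/dδ = π(ΔP g + Δf) and in the performance difference formula η' − η = π'[(f' + P' g) − (f + P g)] that underlies policy iteration. The sketch below is not taken from the chapter; it is a minimal numerical illustration under assumed data (a 3-state chain, two policies, and helper names chosen for this example).

import numpy as np

# Minimal sketch (assumed example, not code from the chapter): the performance
# potentials g tie together the PA derivative formula and the MDP performance
# difference formula. The 3-state chain, reward vectors, and function names
# below are illustrative assumptions.

def stationary_distribution(P):
    """Stationary distribution pi of an ergodic transition matrix P (pi P = pi, sum(pi) = 1)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def potentials(P, f, pi):
    """Performance potentials g from the Poisson equation (I - P + e pi) g = f."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), f)

# Two policies: (P, f) and (P_prime, f_prime).
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
f = np.array([1.0, 2.0, 3.0])
P_prime = np.array([[0.6, 0.3, 0.1],
                    [0.2, 0.6, 0.2],
                    [0.1, 0.3, 0.6]])
f_prime = f.copy()

pi = stationary_distribution(P)
g = potentials(P, f, pi)
eta = pi @ f

# PA: derivative of the average reward along P(delta) = P + delta*(P_prime - P).
d_eta = pi @ ((P_prime - P) @ g + (f_prime - f))

# MDP: performance difference formula; a positive component of the bracket marks
# a state where switching to the other policy's action improves the average reward.
pi_prime = stationary_distribution(P_prime)
eta_diff = pi_prime @ ((f_prime + P_prime @ g) - (f + P @ g))

print(d_eta, eta_diff, pi_prime @ f_prime - eta)  # eta_diff equals the true gap

The last two printed numbers agree because the difference formula is exact, whereas the first is only a first-order sensitivity; it is this kind of relation between derivatives, differences, and potentials that the unified view referred to in the abstract exploits.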




Copyright information

© 2007 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Cao, XR. (2007). Markov Decision Processes. In: Stochastic Learning and Optimization. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-69082-7_4

  • DOI: https://doi.org/10.1007/978-0-387-69082-7_4
  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-36787-3

  • Online ISBN: 978-0-387-69082-7

