Abstract

This is the second special issue of Machine Learning on the subject of reinforcement learning. The first, edited by Richard Sutton in 1992, marked the development of reinforcement learning into a major component of the machine learning field. Since then, the area has expanded further, accounting for a significant proportion of the papers at the annual International Conference on Machine Learning and attracting many new researchers.

Copyright information

© 1996 Kluwer Academic Publishers

About this chapter

Cite this chapter

Kaelbling, L.P. (1996). Introduction. In: Kaelbling, L.P. (ed.) Recent Advances in Reinforcement Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-585-33656-5_2

  • DOI: https://doi.org/10.1007/978-0-585-33656-5_2

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-7923-9705-2

  • Online ISBN: 978-0-585-33656-5

  • eBook Packages: Springer Book Archive