
SD-Q: Selective Discount Q Learning Based on New Results of Intertemporal Choice Theory

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7003)

Abstract

We discuss reinforcement learning from an intertemporal choice perspective. Unlike previous research, this paper emphasizes the importance of a deeper understanding of the psychological mechanisms underlying human decision-making. We aim to improve the standard Q-learning algorithm in light of new results from intertemporal choice experiments. We begin with a brief introduction to recent findings in intertemporal choice theory and to reinforcement learning, and then propose a new reinforcement learning algorithm with selective discounting (SD-Q). Experiments show that SD-Q is superior to both the traditional Q-learning algorithm and a reinforcement learning method that applies no discounting.
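The abstract does not spell out how the selective discount is applied, so the sketch below only illustrates the general idea it suggests: a tabular Q-learning update in which the discount factor is applied selectively rather than uniformly. The selection rule shown (discounting the bootstrapped value only when it exceeds the immediate reward), the function name sd_q_update, and all parameter values are illustrative assumptions, not the authors' definition of SD-Q.

```python
import numpy as np

def sd_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    # Estimated value of the best action in the next state (bootstrap term).
    future = np.max(Q[s_next])
    # Hypothetical "selective" rule: apply the discount only when the
    # bootstrapped future value exceeds the immediate reward; otherwise
    # pass it through undiscounted. Standard Q-learning would always
    # use gamma * future here.
    discounted_future = gamma * future if future > r else future
    # Temporal-difference update toward the (selectively discounted) target.
    Q[s, a] += alpha * (r + discounted_future - Q[s, a])
    return Q

# Toy usage: a 5-state, 2-action tabular problem.
Q = np.zeros((5, 2))
Q = sd_q_update(Q, s=0, a=1, r=1.0, s_next=2)
print(Q[0, 1])  # 0.1 * (1.0 + 0.0 - 0.0) = 0.1
```

Replacing the conditional with an unconditional gamma * future recovers the traditional Q-learning target, which is one of the baselines the abstract compares against.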




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhao, F., Qin, Z. (2011). SD-Q: Selective Discount Q Learning Based on New Results of Intertemporal Choice Theory. In: Deng, H., Miao, D., Lei, J., Wang, F.L. (eds) Artificial Intelligence and Computational Intelligence. AICI 2011. Lecture Notes in Computer Science (LNAI), vol 7003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23887-1_88


  • DOI: https://doi.org/10.1007/978-3-642-23887-1_88

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23886-4

  • Online ISBN: 978-3-642-23887-1

  • eBook Packages: Computer Science, Computer Science (R0)
