In most Markov decision process applications, the decision-maker receives a reward each period. This reward can depend on the current state, the action taken, and the next state, and is denoted by $r_t(s, a, s')$.
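As a minimal sketch of this notation, the following Python example writes a reward as a function of the period t, the current state s, the action a, and the next state s'. The two-state machine-maintenance setting, the state and action names, the numeric rewards, and the transition probabilities are all hypothetical and chosen only for illustration. The sketch also shows the standard way to average over the next state to obtain an expected one-period reward, r_t(s, a) = sum over s' of p(s' | s, a) * r_t(s, a, s').

```python
from typing import Dict, Tuple

State = str
Action = str

# Hypothetical transition probabilities p(s' | s, a) for a toy
# machine-maintenance problem (illustrative values only).
TRANSITIONS: Dict[Tuple[State, Action], Dict[State, float]] = {
    ("working", "wait"): {"working": 0.9, "broken": 0.1},
    ("working", "repair"): {"working": 1.0},
    ("broken", "wait"): {"broken": 1.0},
    ("broken", "repair"): {"working": 0.8, "broken": 0.2},
}


def reward(t: int, s: State, a: Action, s_next: State) -> float:
    """r_t(s, a, s'): reward received in period t.

    In this toy example the reward is stationary, so t is accepted
    but not used; a time-dependent reward would branch on t.
    """
    income = 10.0 if s_next == "working" else 0.0  # earned if the machine ends up working
    cost = 4.0 if a == "repair" else 0.0           # paid whenever a repair is attempted
    return income - cost


def expected_reward(t: int, s: State, a: Action) -> float:
    """r_t(s, a): the reward averaged over the next state s'."""
    return sum(
        p * reward(t, s, a, s_next)
        for s_next, p in TRANSITIONS[(s, a)].items()
    )


if __name__ == "__main__":
    for (s, a) in TRANSITIONS:
        print(s, a, round(expected_reward(0, s, a), 3))
```

Keeping the next state as an explicit argument lets the same function serve both for simulating individual transitions and for computing expected rewards when the transition probabilities are known.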
Copyright information
© 2017 Springer Science+Business Media New York
Cite this entry
(2017). Reward. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_729
DOI: https://doi.org/10.1007/978-1-4899-7687-1_729
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1
eBook Packages: Computer Science, Reference Module Computer Science and Engineering