Risk-Sensitive Markov Decision Processes

Barz, Christiane; Bäuerle, Nicole

doi:10.1007/978-3-030-54621-2_819-1

Abstract

Traditionally, Markov decision processes are Markov processes whose transition law is controlled by a decision maker who aims to maximize the expected (total, discounted, or average) reward. Risk-sensitive Markov decision processes generalize such models: by maximizing the expected exponential utility of these rewards, they also take higher-order moments into account. We introduce the main ideas for the finite-horizon case, interpret the optimization criterion, and give some applications that highlight the effect of risk sensitivity. Further criteria and extensions, as well as other definitions of risk sensitivity, are also discussed.
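
As a rough illustration of the finite-horizon criterion described above, the following minimal sketch runs backward induction for an exponential utility on a toy problem. It is not taken from the entry itself: the utility form U(x) = -exp(-gamma * x) with gamma > 0 (risk aversion), the function name `solve`, the array layout of `P` and `R`, and the two-action example are all assumptions made here for illustration.

```python
import math

def solve(P, R, horizon, gamma):
    """Backward induction for maximizing E[-exp(-gamma * total reward)], gamma > 0.

    Hypothetical data layout (an assumption of this sketch):
      P[s][a][t] = probability of moving from state s to state t under action a
      R[s][a][t] = reward collected on that transition
    Returns (certainty equivalents at time 0, policy[n][s]).
    """
    n_states = len(P)
    # W[s] = minimal E[exp(-gamma * reward-to-go)]; at the horizon this is exp(0) = 1.
    W = [1.0] * n_states
    policy = []
    for _ in range(horizon):
        new_W, decision = [], []
        for s in range(n_states):
            # Multiplicative Bellman step: rewards enter as factors, not summands.
            costs = [
                sum(p * math.exp(-gamma * R[s][a][t]) * W[t]
                    for t, p in enumerate(P[s][a]))
                for a in range(len(P[s]))
            ]
            best = min(range(len(costs)), key=costs.__getitem__)
            new_W.append(costs[best])
            decision.append(best)
        W = new_W
        policy.insert(0, decision)  # earlier stages go to the front
    # Certainty equivalent: the sure reward with the same exponential utility.
    ce = [-math.log(w) / gamma for w in W]
    return ce, policy

# Toy example (hypothetical numbers): in every state, action 0 is "safe"
# (reward 1 for sure) and action 1 is "risky" (reward 0 or 2.2, each with
# probability 0.5, so its mean of 1.1 exceeds the safe reward).
P = [[[1.0, 0.0], [0.5, 0.5]],
     [[1.0, 0.0], [0.5, 0.5]]]
R = [[[1.0, 0.0], [0.0, 2.2]],
     [[1.0, 0.0], [0.0, 2.2]]]

ce_averse, policy_averse = solve(P, R, horizon=3, gamma=1.0)   # strongly risk-averse
ce_mild, policy_mild = solve(P, R, horizon=3, gamma=0.01)      # nearly risk-neutral
```

In this toy problem the strongly risk-averse decision maker picks the safe action at every stage, while the nearly risk-neutral one picks the risky action because its higher mean outweighs the small variance penalty. The choice to minimize E[exp(-gamma * reward)] rather than maximize the utility directly is only a sign convention; both give the same policy.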

Author information

Authors and Affiliations

Department of Business Administration, University of Zürich, Zürich, Switzerland
Christiane Barz
Institute of Stochastics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Nicole Bäuerle

Corresponding author

Correspondence to Christiane Barz.

Editor information

Editors and Affiliations

Department of Industrial & Systems Engin, University of Florida, Gainesville, FL, USA
Panos M. Pardalos
Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA, USA
Oleg A. Prokopyev

Cite this entry

Barz, C., Bäuerle, N. (2023). Risk-Sensitive Markov Decision Processes. In: Pardalos, P.M., Prokopyev, O.A. (eds) Encyclopedia of Optimization. Springer, Cham. https://doi.org/10.1007/978-3-030-54621-2_819-1

DOI: https://doi.org/10.1007/978-3-030-54621-2_819-1
Published: 21 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54621-2
Online ISBN: 978-3-030-54621-2
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering