Abstract
In a classical Markov decision process, a decision maker controls the transition law of a Markov process in order to maximize the expected (total, discounted, or average) reward. Risk-sensitive Markov decision processes generalize these models: by maximizing the expected exponential utility of the rewards, they also take higher-order moments of the reward distribution into account. We introduce the main ideas for the finite-horizon case, interpret the optimization criterion, and give some applications that highlight the effect of risk sensitivity. Further criteria and extensions, as well as other definitions of risk sensitivity, are also discussed.
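To make the finite-horizon criterion concrete, the following is a minimal sketch and not the entry's own formulation. It assumes one common sign convention: for a risk-aversion parameter gamma > 0, the decision maker minimizes E[exp(-gamma * total reward)], which is equivalent to maximizing the certainty equivalent -(1/gamma) * log E[exp(-gamma * total reward)]. For small gamma this certainty equivalent is approximately the mean minus gamma/2 times the variance, which is one way to see how higher-order moments enter. The function name, the data layout, and the two-action toy model are hypothetical and chosen only for illustration.

```python
import math


def risk_sensitive_backward_induction(P, R, horizon, gamma):
    """Finite-horizon backward induction under exponential utility.

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[a][s][t]: one-stage reward received on that transition.
    gamma > 0:  risk-aversion parameter (assumed sign convention).
    Returns (certainty equivalents per initial state, policy[stage][state]).
    """
    n_actions, n_states = len(P), len(P[0])
    # W[s] = minimal E[exp(-gamma * remaining reward)] from state s.
    # With no stages left the remaining reward is 0, so W = exp(0) = 1.
    W = [1.0] * n_states
    policy = []
    for _ in range(horizon):
        new_W, decision = [], []
        for s in range(n_states):
            # Multiplicative recursion: rewards enter as exponential factors.
            q = [
                sum(P[a][s][t] * math.exp(-gamma * R[a][s][t]) * W[t]
                    for t in range(n_states))
                for a in range(n_actions)
            ]
            best = min(range(n_actions), key=q.__getitem__)
            new_W.append(q[best])
            decision.append(best)
        W = new_W
        policy.insert(0, decision)  # stages are filled in from last to first
    certainty_equivalent = [-math.log(w) / gamma for w in W]
    return certainty_equivalent, policy


# Hypothetical toy model with two states and two actions.
# Action 0 ("safe"): reward 1 with certainty.
# Action 1 ("risky"): reward 0 or 2.2 with probability 1/2 each (mean 1.1).
P = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
R = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 2.2], [0.0, 2.2]],
]

ce_averse, policy_averse = risk_sensitive_backward_induction(P, R, 2, gamma=1.0)
ce_mild, policy_mild = risk_sensitive_backward_induction(P, R, 2, gamma=0.01)
```

In this toy model the risky action has the higher mean, so a nearly risk-neutral decision maker (gamma = 0.01) selects it, whereas a strongly risk-averse one (gamma = 1) selects the safe action at every stage. The recursion is multiplicative rather than additive because the exponential of a sum of rewards factorizes into a product of per-stage terms.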
Copyright information
© 2023 Springer Nature Switzerland AG
About this entry
Cite this entry
Barz, C., Bäuerle, N. (2023). Risk-Sensitive Markov Decision Processes. In: Pardalos, P.M., Prokopyev, O.A. (eds) Encyclopedia of Optimization. Springer, Cham. https://doi.org/10.1007/978-3-030-54621-2_819-1
DOI: https://doi.org/10.1007/978-3-030-54621-2_819-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54621-2
Online ISBN: 978-3-030-54621-2
eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering