Denumerable Markov Decision Chains: Sensitive Optimality Criteria

Hordijk, Arie; Dekker, Rommert

doi:10.1007/978-3-642-68997-0_79

Part of the book series: Operations Research Proceedings (ORP, volume 1982)

Summary

Assuming compact metric action spaces and the usual continuity properties of the immediate costs and of the transition probabilities, we consider the existence of average and/or sensitive optimal stationary policies. We generalize results from the unichain case to the multichain case. It turns out that the simultaneous Doeblin condition is not sufficient. However, continuity of the ergodic potential guarantees not only average optimality but also bias and Blackwell optimality. The relations between these conditions and uniform strong ergodicity are discussed, and an extension to the case of unbounded costs is given.

Zusammenfassung

Für Markoffsche Entscheidungsprozesse mit abzählbarem Zustandsraum und kompaktem metrischen Aktionenraum betrachten wir unter den üblichen Stetigkeitsannahmen für die einstufigen Kosten und die Übergangswahrscheinlichkeiten die Existenz von durchschnitts- und/oder sensitiv optimalen stationären Strategien. Wir verallgemeinern Ergebnisse vom Fall einer ergodischen Klasse auf den Fall mehrerer ergodischer Klassen. Es zeigt sich, daß die simultane Doeblin-Bedingung nicht hinreichend ist. Doch die Stetigkeit des ergodischen Potentials garantiert nicht nur Durchschnittsoptimalität, sondern auch Bias- und Blackwell-Optimalität. Beziehungen zwischen diesen Bedingungen und gleichmäßiger starker Ergodizität werden diskutiert. Auch wird eine Erweiterung auf den Fall unbeschränkter Kosten gegeben.

Author information

Authors and Affiliations

University of Leiden, The Netherlands
Arie Hordijk & Rommert Dekker

Editor information

Editors and Affiliations

Lehrstuhl für Investition und Finanzierung, Universität Dortmund, Postfach 50 05 00, D-4600, Dortmund 50, Deutschland
Wolfgang Bühler
Lehrstuhl für Quantitative Methoden der Betriebswirtschaftslehre, Universität Hamburg, Von-Melle-Park 5, D-2000, Hamburg 13, Deutschland
Bernd Fleischmann
Philips GmbH, Bereich ISA, Billstraße 80, D-2000, Hamburg 28, Deutschland
Karl-Peter Schuster (Ressortleiter Operations Research)
Fachbereich Ökonomie, Universität Frankfurt, Mertonstraße 17, D-6000, Frankfurt, Deutschland
Lothar Streitferdt
EDV-Entwicklung, Ruhrkohle AG, D-4300, Essen, Deutschland
Helmut Zander (Leiter der Abteilung Mathematisch-technische)

About this paper

Cite this paper

Hordijk, A., Dekker, R. (1983). Denumerable Markov Decision Chains: Sensitive Optimality Criteria. In: Bühler, W., Fleischmann, B., Schuster, K.-P., Streitferdt, L., Zander, H. (eds) DGOR: Papers of the 11th Annual Meeting / Vorträge der 11. Jahrestagung. Operations Research Proceedings, vol 1982. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-68997-0_79

DOI: https://doi.org/10.1007/978-3-642-68997-0_79
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-12239-5
Online ISBN: 978-3-642-68997-0
eBook Packages: Springer Book Archive
