
Estimation and control in multichain processes

Published in: Annals of Operations Research

Abstract

This paper considers discrete-time Markovian decision processes whose transition probabilities depend on an unknown parameter that may change from step to step. When this parameter sequence converges, a policy maximizing the average expected reward over an infinite horizon is sought. Under continuity conditions, a policy based on “estimation and control” is shown to be uniformly optimal for certain multichain models.
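The “estimation and control” idea can be illustrated by a certainty-equivalence loop: at each step, estimate the unknown transition parameter from the observed history, then act as if the estimate were the true value. The following is a minimal sketch on a hypothetical two-state, two-action chain (not the paper's model): action 0 leads to the rewarding state with known probability `theta0`, action 1 with unknown probability `theta1`, and occasional forced exploration keeps the estimate consistent.

```python
import random

def estimate_and_control(theta1_true=0.8, theta0=0.3, steps=5000, seed=0):
    """Certainty-equivalence 'estimation and control' on a toy two-state MDP.

    Hypothetical example for illustration only. Action 0 moves to state 1
    with known probability theta0; action 1 does so with unknown probability
    theta1. The reward is 1 in state 1 and 0 otherwise, so the average-reward
    optimal action is the one with the larger success probability.
    """
    rng = random.Random(seed)
    ones, trials = 0, 0        # sufficient statistics for estimating theta1
    total_reward = 0
    theta1_hat = 1.0           # optimistic initial estimate
    for t in range(steps):
        # Estimation step: maximum-likelihood estimate of theta1.
        if trials:
            theta1_hat = ones / trials
        # Control step: act greedily with respect to the estimated model,
        # with decaying forced exploration so the estimate converges.
        if rng.random() < 1.0 / (t + 1) ** 0.5:
            a = rng.choice([0, 1])
        else:
            a = 1 if theta1_hat > theta0 else 0
        p = theta1_true if a == 1 else theta0
        s_next = 1 if rng.random() < p else 0
        if a == 1:
            ones += s_next
            trials += 1
        total_reward += s_next
    return theta1_hat, total_reward / steps
```

Because the exploration probabilities are not summable-to-zero too quickly, the unknown action is sampled infinitely often, the estimate converges to the true parameter, and the long-run average reward approaches that of the optimal stationary policy.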




Cite this article

Girlich, H.-J., Sokolichin, A.A. Estimation and control in multichain processes. Ann Oper Res 32, 23–33 (1991). https://doi.org/10.1007/BF02204826
