Abstract
Decentralized stochastic control refers to the multi-stage optimization of a dynamical system by multiple controllers that have access to different information. Decentralization of information gives rise to new conceptual challenges that require new solution approaches. In this expository paper, we use the notion of an information-state to explain the two commonly used solution approaches to decentralized control: the person-by-person approach and the common-information approach.
Notes
In general, a dynamic program may not have a unique solution, or any solution at all. In this paper, we ignore the issue of existence of such a solution and refer the reader to (Hernández-Lerma and Lasserre 1996) for details.
Note that \(\{ q_{s,m} \mid s \in \{0,1\},\ m \in {\mathbb {Z}_{> 0}} \}\) is equivalent to the reachable set \(\mathcal Q\) of \(\xi _t\).
This condition is needed to ensure that the information-state is time-homogeneous and, as such, may be ignored for finite horizon models (Nayyar et al. 2013b).
For example, the process \(\{\pi _t\}_{t=0}^\infty \), where \(\pi _t\) is the conditional probability measure on \((X_t, L^1_t, \ldots , L^n_t)\) conditioned on \(C_t\), is always an information-state process.
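Written out as a formula, this information-state can be sketched as follows. The symbol \(A\), denoting a measurable set of joint values of \((X_t, L^1_t, \ldots , L^n_t)\), is notation introduced here for illustration and does not appear in the original note.

\[
\pi _t(A) = \mathbb {P}\bigl( (X_t, L^1_t, \ldots , L^n_t) \in A \mid C_t \bigr).
\]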
References
Aicardi, M., Davoli, F., & Minciardi, R. (1987). Decentralized optimal control of Markov chains with a common past information set. IEEE Transactions on Automatic Control, 32(11), 1028–1031.
Arrow, K. J., Blackwell, D., & Girshick, M. A. (1949). Bayes and minimax solutions of sequential decision problems. Econometrica, 17(3/4), 213–244.
Aumann, R. J. (1976). Agreeing to disagree. Annals of Statistics, 4, 1236–1239.
Başar, T., & Bansal, R. (1989). The theory of teams: A selective annotated bibliography. In T. Başar & P. Bernhard (Eds.), Differential games and applications (Lecture Notes in Control and Information Sciences, Vol. 119, pp. 186–201). Berlin: Springer.
Bellman, R. (1957). Dynamic programming. Princeton, NJ: Princeton University Press.
Bertsekas, D. P. (1995). Dynamic programming and optimal control (Vol. 1). Belmont, MA: Athena Scientific.
Casalino, G., Davoli, F., Minciardi, R., Puliafito, P., & Zoppoli, R. (1984). Partially nested information structures with a common past. IEEE Transactions on Automatic Control, 29(9), 846–850.
Cavazos-Cadena, R. (1986). Finite-state approximations for denumerable state discounted Markov decision processes. Applied Mathematics and Optimization, 14(1), 1–26.
Flåm, S. D. (1987). Finite state approximations for countable state infinite horizon discounted Markov decision processes. Modeling, Identification and Control, 8(2), 117–123.
Hernández-Lerma, O. (1986). Finite-state approximations for denumerable multidimensional state discounted Markov decision processes. Journal of Mathematical Analysis and Applications, 113(2), 382–389.
Hernández-Lerma, O., & Lasserre, J. (1996). Discrete-time Markov control processes. Berlin: Springer.
Ho, Y. C. (1980). Team decision theory and information structures. Proceedings of the IEEE, 68(6), 644–654.
Mahajan, A. (2013). Optimal decentralized control of coupled subsystems with control sharing. IEEE Transactions on Automatic Control, 58(9), 2377–2382.
Mahajan, A., Martins, N., Rotkowitz, M., & Yüksel, S. (2012). Information structures in optimal decentralized control. In Proceedings of the 51st IEEE conference on decision and control (pp. 1291–1306). Maui, Hawaii.
Mahajan, A., Nayyar, A., & Teneketzis, D. (2008). Identifying tractable decentralized control problems on the basis of information structure. In Proceedings of the 46th annual Allerton conference on communication, control, and computing (pp. 1440–1449). Monticello, IL.
Mahajan, A., & Teneketzis, D. (2009). Optimal performance of networked control systems with non-classical information structures. SIAM Journal on Control and Optimization, 48(3), 1377–1404.
Marschak, J., & Radner, R. (1972). Economic theory of teams. New Haven: Yale University Press.
Nayyar, A. (2011). Sequential decision making in decentralized systems. Ph.D. thesis, Ann Arbor, MI: University of Michigan.
Nayyar, A., Mahajan, A., & Teneketzis, D. (2013a). The common-information approach to decentralized stochastic control. In B. Bernhardsson, G. Como, & A. Rantzer (Eds.), Information and control in networks. Berlin: Springer.
Nayyar, A., Mahajan, A., & Teneketzis, D. (2013b). Decentralized stochastic control with partial history sharing: A common information approach. IEEE Transactions on Automatic Control, 58(7), 1644–1658.
Oliehoek, F. A., Spaan, M. T. J., Amato, C., & Whiteson, S. (2013). Incremental clustering and expansion for faster optimal planning in decentralized POMDPs. Journal of Artificial Intelligence Research, 46, 449–509.
Ooi, J. M., Verbout, S. M., Ludwig, J. T., & Wornell, G. W. (1997). A separation theorem for periodic sharing information patterns in decentralized control. IEEE Transactions on Automatic Control, 42(11), 1546–1550.
Powell, W. B. (2007). Approximate dynamic programming: Solving the curses of dimensionality (Vol. 703). London: Wiley.
Puterman, M. (1994). Markov decision processes: Discrete stochastic dynamic programming. London: Wiley.
Radner, R. (1962). Team decision problems. Annals of Mathematical Statistics, 33, 857–881.
Russell, S. J., & Norvig, P. (1995). Artificial intelligence: A modern approach. Englewood Cliffs, NJ: Prentice Hall.
Sennott, L. I. (1999). Stochastic dynamic programming and the control of queueing systems. New York, NY: Wiley.
Shani, G., Pineau, J., & Kaplow, R. (2013). A survey of point-based POMDP solvers. Autonomous Agents and Multi-Agent Systems, 27(1), 1–51.
Stokey, N. L., & Lucas, R. E., Jr. (1989). Recursive methods in economic dynamics. Cambridge, MA: Harvard University Press.
Teneketzis, D., & Ho, Y. (1987). The decentralized Wald problem. Information and Computation (formerly Information and Control), 73(1), 23–44.
Teneketzis, D., & Varaiya, P. (1984). The decentralized quickest detection problem. IEEE Transactions on Automatic Control, AC–29(7), 641–644.
Walrand, J. C., & Varaiya, P. (1983). Optimal causal coding-decoding problems. IEEE Transactions on Information Theory, 29(6), 814–820.
White, D. (1980). Finite state approximations for denumerable state infinite horizon discounted Markov processes. Journal of Mathematical Analysis and Applications, 74(1), 292–295.
Witsenhausen, H. S. (1971). On information structures, feedback and causality. SIAM Journal on Control, 9(2), 149–160.
Witsenhausen, H. S. (1971). Separation of estimation and control for discrete time systems. Proceedings of the IEEE, 59(11), 1557–1566.
Witsenhausen, H. S. (1973). A standard form for sequential stochastic control. Mathematical Systems Theory, 7(1), 5–11.
Yoshikawa, T. (1978). Decomposition of dynamic team decision problems. IEEE Transactions on Automatic Control, 23(4), 627–632.
Yüksel, S. (2009). Stochastic nestedness and the belief sharing information pattern. IEEE Transactions on Automatic Control, 54(12), 2773–2786.
Yüksel, S., & Başar, T. (2013). Stochastic networked control systems: Stabilization and optimization under information constraints. Boston, MA: Birkhäuser.
Zhang, W. (2001). Algorithms for partially observed Markov decision processes. Ph.D. thesis, Hong Kong University of Science and Technology.
Acknowledgments
The authors are grateful to A. Nayyar, D. Teneketzis, and S. Yüksel for useful discussions.
This work was supported by the Fonds de recherche du Québec–Nature et technologies (FRQ-NT) Establishment of New Researcher Grant 166065 and by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant 402753-11.
Cite this article
Mahajan, A., Mannan, M. Decentralized stochastic control. Ann Oper Res 241, 109–126 (2016). https://doi.org/10.1007/s10479-014-1652-0