Decentralized stochastic control

Annals of Operations Research

Abstract

Decentralized stochastic control refers to the multi-stage optimization of a dynamical system by multiple controllers that have access to different information. Decentralization of information gives rise to new conceptual challenges that require new solution approaches. In this expository paper, we use the notion of an information-state to explain the two commonly used solution approaches to decentralized control: the person-by-person approach and the common-information approach.

Notes

  1. In general, a dynamic program may not have a unique solution, or any solution at all. In this paper, we ignore the issue of existence of such a solution and refer the reader to Hernández-Lerma and Lasserre (1996) for details.

  2. Note that \(\{ q_{s,m} \mid s \in \{0,1\},\ m \in \mathbb{Z}_{>0} \}\) is equivalent to the reachable set \(\mathcal Q\) of \(\xi _t\).

  3. This condition is needed to ensure that the information-state is time-homogeneous and, as such, may be ignored for finite horizon models (Nayyar et al. 2013b).

  4. For example, the process \(\{\pi _t\}_{t=0}^\infty \), where \(\pi _t\) is the conditional probability measure on \((X_t, L^1_t, \ldots , L^n_t)\) conditioned on \(C_t\), is always an information-state process.
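
     As a sketch of the definition in note 4, assuming all variables take values in finite or countable sets (so that the conditional measure reduces to a probability mass function), and using lowercase letters \(x, l^1, \ldots , l^n\) as assumed notation for realizations that does not appear in the original text:

     \[
       \pi _t(x, l^1, \ldots , l^n) = \mathbb {P}\bigl (X_t = x,\ L^1_t = l^1,\ \ldots ,\ L^n_t = l^n \bigm | C_t\bigr ).
     \]

     Under this reading, \(\pi _t\) is a random measure that depends only on \(C_t\), the conditioning information named in the note.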

References

  • Aicardi, M., Davoli, F., & Minciardi, R. (1987). Decentralized optimal control of Markov chains with a common past information set. IEEE Transactions on Automatic Control, 32(11), 1028–1031.

  • Arrow, K. J., Blackwell, D., & Girshick, M. A. (1949). Bayes and minimax solutions of sequential decision problems. Econometrica, 17(3/4), 213–244.

  • Aumann, R. J. (1976). Agreeing to disagree. Annals of Statistics, 4, 1236–1239.

  • Başar, T., & Bansal, R. (1989). The theory of teams: A selective annotated bibliography. In T. Başar & P. Bernhard (Eds.), Differential games and applications (Lecture Notes in Control and Information Sciences, Vol. 119, pp. 186–201). Berlin: Springer.

  • Bellman, R. (1957). Dynamic programming. Princeton, NJ: Princeton University Press.

  • Bertsekas, D. P. (1995). Dynamic programming and optimal control (Vol. 1). Belmont, MA: Athena Scientific.

  • Casalino, G., Davoli, F., Minciardi, R., Puliafito, P., & Zoppoli, R. (1984). Partially nested information structures with a common past. IEEE Transactions on Automatic Control, 29(9), 846–850.

  • Cavazos-Cadena, R. (1986). Finite-state approximations for denumerable state discounted Markov decision processes. Applied Mathematics and Optimization, 14(1), 1–26.

  • Flåm, S. D. (1987). Finite state approximations for countable state infinite horizon discounted Markov decision processes. Modeling, Identification and Control, 8(2), 117–123.

  • Hernández-Lerma, O. (1986). Finite-state approximations for denumerable multidimensional state discounted Markov decision processes. Journal of Mathematical Analysis and Applications, 113(2), 382–389.

  • Hernández-Lerma, O., & Lasserre, J. (1996). Discrete-time Markov control processes. Berlin: Springer.

  • Ho, Y. C. (1980). Team decision theory and information structures. Proceedings of the IEEE, 68(6), 644–654.

  • Mahajan, A. (2013). Optimal decentralized control of coupled subsystems with control sharing. IEEE Transactions on Automatic Control, 58(9), 2377–2382.

  • Mahajan, A., Martins, N., Rotkowitz, M., & Yüksel, S. (2012). Information structures in optimal decentralized control. In Proceedings of the 51st IEEE conference on decision and control (pp. 1291–1306). Maui, Hawaii.

  • Mahajan, A., Nayyar, A., & Teneketzis, D. (2008). Identifying tractable decentralized control problems on the basis of information structure. In Proceedings of the 46th annual Allerton conference on communication, control, and computing (pp. 1440–1449). Monticello, IL.

  • Mahajan, A., & Teneketzis, D. (2009). Optimal performance of networked control systems with non-classical information structures. SIAM Journal on Control and Optimization, 48(3), 1377–1404.

  • Marschak, J., & Radner, R. (1972). Economic theory of teams. New Haven: Yale University Press.

  • Nayyar, A. (2011). Sequential decision making in decentralized systems. Ph.D. thesis, Ann Arbor, MI: University of Michigan.

  • Nayyar, A., Mahajan, A., & Teneketzis, D. (2013a). The common-information approach to decentralized stochastic control. In B. Bernhardsson, G. Como, & A. Rantzer (Eds.), Information and control in networks. Berlin: Springer.

  • Nayyar, A., Mahajan, A., & Teneketzis, D. (2013b). Decentralized stochastic control with partial history sharing: A common information approach. IEEE Transactions on Automatic Control, 58(7), 1644–1658.

  • Oliehoek, F. A., Spaan, M. T. J., Amato, C., & Whiteson, S. (2013). Incremental clustering and expansion for faster optimal planning in decentralized POMDPs. Journal of Artificial Intelligence Research, 46, 449–509.

  • Ooi, J. M., Verbout, S. M., Ludwig, J. T., & Wornell, G. W. (1997). A separation theorem for periodic sharing information patterns in decentralized control. IEEE Transactions on Automatic Control, 42(11), 1546–1550.

  • Powell, W. B. (2007). Approximate dynamic programming: Solving the curses of dimensionality (Vol. 703). London: Wiley.

  • Puterman, M. (1994). Markov decision processes: Discrete stochastic dynamic programming. London: Wiley.

  • Radner, R. (1962). Team decision problems. Annals of Mathematical Statistics, 33, 857–881.

  • Russell, S. J., & Norvig, P. (1995). Artificial intelligence: A modern approach. Englewood Cliffs, NJ: Prentice Hall.

  • Sennott, L. I. (1999). Stochastic dynamic programming and the control of queueing systems. New York, NY: Wiley.

  • Shani, G., Pineau, J., & Kaplow, R. (2013). A survey of point-based POMDP solvers. Autonomous Agents and Multi-Agent Systems, 27(1), 1–51.

  • Stokey, N. L., & Lucas, R. E., Jr. (1989). Recursive methods in economic dynamics. Cambridge, MA: Harvard University Press.

  • Teneketzis, D., & Ho, Y. (1987). The decentralized Wald problem. Information and Computation (formerly Information and Control), 73(1), 23–44.

  • Teneketzis, D., & Varaiya, P. (1984). The decentralized quickest detection problem. IEEE Transactions on Automatic Control, 29(7), 641–644.

  • Walrand, J. C., & Varaiya, P. (1983). Optimal causal coding-decoding problems. IEEE Transactions on Information Theory, 29(6), 814–820.

  • White, D. (1980). Finite state approximations for denumerable state infinite horizon discounted Markov processes. Journal of Mathematical Analysis and Applications, 74(1), 292–295.

  • Witsenhausen, H. S. (1971). On information structures, feedback and causality. SIAM Journal on Control, 9(2), 149–160.

  • Witsenhausen, H. S. (1971). Separation of estimation and control for discrete time systems. Proceedings of the IEEE, 59(11), 1557–1566.

  • Witsenhausen, H. S. (1973). A standard form for sequential stochastic control. Mathematical Systems Theory, 7(1), 5–11.

  • Yoshikawa, T. (1978). Decomposition of dynamic team decision problems. IEEE Transactions on Automatic Control, 23(4), 627–632.

  • Yüksel, S. (2009). Stochastic nestedness and the belief sharing information pattern. IEEE Transactions on Automatic Control, 54(12), 2773–2786.

  • Yüksel, S., & Başar, T. (2013). Stochastic networked control systems: Stabilization and optimization under information constraints. Boston, MA: Birkhäuser.

  • Zhang, W. (2001). Algorithms for partially observed Markov decision processes. Ph.D. thesis, Hong Kong University of Science and Technology.

Acknowledgments

The authors are grateful to A. Nayyar, D. Teneketzis, and S. Yüksel for useful discussions.

Author information

Corresponding author

Correspondence to Aditya Mahajan.

Additional information

This work was supported by Fonds de recherche du Québec–Nature et technologies (FRQ-NT) Establishment of New Researcher Grant 166065 and by Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant 402753-11.

About this article

Cite this article

Mahajan, A., Mannan, M. Decentralized stochastic control. Ann Oper Res 241, 109–126 (2016). https://doi.org/10.1007/s10479-014-1652-0

  • DOI: https://doi.org/10.1007/s10479-014-1652-0
