Abstract
This article reviews the history and theory of dynamic programming (DP), a recursive method of solving sequential decision problems under uncertainty. It discusses computational algorithms for the numerical solution of DP problems, and an important limitation in our ability to solve realistic large-scale DP problems, the ‘curse of dimensionality’. It also summarizes recent research in complexity theory that delineates situations where the curse can be broken (allowing us to solve DPs using fast polynomial-time algorithms), and situations where it is insuperable. The literature on econometric estimation and testing of DP models is reviewed, as is another ‘scientific limit to knowledge’, namely, the identification problem.
This chapter was originally published in The New Palgrave Dictionary of Economics, 2nd edition, 2008, edited by Steven N. Durlauf and Lawrence E. Blume.
This article has benefited from helpful feedback from Kenneth Arrow, Daniel Benjamin, Larry Blume, Moshe Buchinsky, Larry Epstein, Chris Phelan and Arthur F. Veinott, Jr.
Copyright information
© 2008 The Author(s)
Cite this entry
Rust, J. (2008). Dynamic Programming. In: The New Palgrave Dictionary of Economics. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-349-95121-5_1932-1
DOI: https://doi.org/10.1057/978-1-349-95121-5_1932-1
Publisher Name: Palgrave Macmillan, London
Online ISBN: 978-1-349-95121-5
eBook Packages: Springer Reference Economics and Finance; Reference Module Humanities and Social Sciences; Reference Module Business, Economics and Social Sciences