Abstract
This work applies the Q-learning technique to investment decision-making: the system issues recommendations on whether investing in a particular asset is advisable. Reinforcement learning, and Q-learning in particular, allows continuous learning based on decisions proposed by the system itself. The technique has several advantages: it can make decisions at any stage of learning, it adapts to the application domain, and it follows a goal-oriented logic. These characteristics are very useful in financial problems. Results of experiments evaluating the learning capacity of the method in this application domain are presented, and its decision-making capacity is also assessed. The result is a system based on Q-learning that learns from its own decisions in an investment context. The system shows some limitations when the state space is large, owing to the lack of generalization in the tabular Q-learning variant used.
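The approach described in the abstract can be illustrated with a minimal sketch of tabular Q-learning applied to a buy/hold/sell decision. Everything here is an assumption for illustration, not the chapter's actual setup: the coarse state discretization (direction of the last price move), the reward (profit of the chosen position over the next move), the hyperparameters, and the toy price series are all hypothetical.

```python
import random

# Tabular Q-learning sketch for an investment-style decision task.
# States, actions, rewards, and the price series are illustrative only.

ACTIONS = ["buy", "hold", "sell"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration

def state_from_prices(prices, t):
    """Discretize the market into a coarse state: last price move up/down/flat."""
    if t == 0:
        return "flat"
    diff = prices[t] - prices[t - 1]
    return "up" if diff > 0 else ("down" if diff < 0 else "flat")

def reward(action, next_move):
    """Reward: profit of the position implied by the action over the next move."""
    position = {"buy": 1.0, "hold": 0.0, "sell": -1.0}[action]
    return position * next_move

def train(prices, episodes=200, seed=0):
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated value; missing entries default to 0
    for _ in range(episodes):
        for t in range(len(prices) - 1):
            s = state_from_prices(prices, t)
            # epsilon-greedy action selection over the current Q estimates
            if rng.random() < EPSILON:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q.get((s, x), 0.0))
            r = reward(a, prices[t + 1] - prices[t])
            s2 = state_from_prices(prices, t + 1)
            best_next = max(q.get((s2, x), 0.0) for x in ACTIONS)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + ALPHA * (r + GAMMA * best_next - old)
    return q

if __name__ == "__main__":
    # Toy series with mild momentum: up moves tend to be followed by up moves.
    prices = [100, 101, 102, 103, 102, 101, 100, 101, 102, 103]
    q = train(prices)
    print(max(ACTIONS, key=lambda a: q.get(("up", a), 0.0)))
```

The dictionary-backed Q-table makes the abstract's limitation concrete: every discretized state stores its own values, so with a fine-grained market representation the table grows without any generalization across similar states, which is what function approximation variants are meant to address.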
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Varela, M., Viera, O., Robledo, F. (2016). A Q-Learning Approach for Investment Decisions. In: Pinto, A., Accinelli Gamba, E., Yannacopoulos, A., Hervés-Beloso, C. (eds) Trends in Mathematical Economics. Springer, Cham. https://doi.org/10.1007/978-3-319-32543-9_18
DOI: https://doi.org/10.1007/978-3-319-32543-9_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32541-5
Online ISBN: 978-3-319-32543-9
eBook Packages: Economics and Finance (R0)