
MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading

  • Conference paper
  • In: Advances in Knowledge Discovery and Data Mining (PAKDD 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14648)

Abstract

Algorithmic trading refers to executing buy and sell orders for specific assets based on automatically identified trading opportunities. Strategies based on reinforcement learning (RL) have demonstrated remarkable capabilities in addressing algorithmic trading problems. However, trading patterns differ across market conditions because the data distribution shifts, and ignoring these multiple patterns undermines the performance of RL. In this paper, we propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market. Furthermore, we incorporate the Optimal Transport (OT) algorithm to allocate samples to the appropriate actor by introducing a regularization loss term. Additionally, we propose a Pretrain Module that facilitates imitation learning by aligning the actors' outputs with an expert strategy, which better balances the exploration and exploitation of RL. Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capability while balancing risk. Ablation studies validate the effectiveness of MOT's components.
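The full architecture is specified in the paper itself; purely as orientation on the allocation idea, the following is a minimal sketch of Sinkhorn-style entropic optimal transport (in the spirit of Cuturi's algorithm, reference 2) assigning a batch of sample embeddings to actor prototypes under balanced marginals. The prototype-based cost, all sizes, and all names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (NOT the authors' code): Sinkhorn-style entropic OT
# assigning a batch of sample embeddings to a set of actors. The
# prototype-based cost and all sizes below are illustrative assumptions.
import numpy as np

def sinkhorn(cost, r, c, eps=0.1, n_iters=200):
    """Entropy-regularized OT plan P with row marginals r and column
    marginals c, computed by alternating Sinkhorn scaling updates."""
    K = np.exp(-cost / eps)                  # Gibbs kernel
    u = np.ones_like(r)
    for _ in range(n_iters):
        v = c / (K.T @ u)                    # match column marginals
        u = r / (K @ v)                      # match row marginals
    return u[:, None] * K * v[None, :]       # P = diag(u) K diag(v)

rng = np.random.default_rng(0)
B, n_actors, d = 8, 3, 4                     # batch, actors, embedding dim
z = rng.normal(size=(B, d))                  # sample representations
proto = rng.normal(size=(n_actors, d))       # one prototype per actor

# Cost: squared distance from each sample to each actor prototype,
# normalized so the exp() kernel stays numerically stable.
cost = ((z[:, None, :] - proto[None, :, :]) ** 2).sum(-1)
cost = cost / cost.max()

# Balanced marginals: every sample is fully assigned, and the actors
# share the batch evenly, preventing one actor from absorbing all data.
P = sinkhorn(cost, np.full(B, 1.0 / B), np.full(n_actors, 1.0 / n_actors))

print((P * B).round(2))                      # soft assignment per sample
print("actor per sample:", P.argmax(axis=1))
```

In MOT, a regularization loss term derived from such a transport plan steers the routing of samples to actors; the exact loss is given in the paper.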


Notes

  1. Transaction costs are charged as a percentage of the contract value.

  2. Slippage refers to the difference between the expected execution price and the actual execution price.

  3. A well-known Chinese quantitative trading platform, https://www.ricequant.com/.

  4. We chose GRU as a baseline because we employ the GRU method in the Pretrain Module before imitation learning; the GRU results therefore reflect the performance of the Pretrain Module.

  5. We enhance PPO with the imitation learning described in the Methodology section (a sketch follows these notes).
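Footnotes 4 and 5 point at the Pretrain Module's role: the GRU baseline doubles as the pretrained backbone, and PPO is warmed up by imitation learning. As a hedged sketch of that general idea only (behavior-cloning a GRU actor toward expert actions before RL fine-tuning; the network sizes, action set, and dummy data are all assumptions, not the authors' setup):

```python
# Hedged sketch of the Pretrain Module idea: behavior-clone a GRU-based
# actor toward an expert strategy's actions before PPO fine-tuning.
# NOT the authors' code; sizes, action set, and data are made up.
import torch
import torch.nn as nn

class GRUActor(nn.Module):
    def __init__(self, n_features=16, hidden=32, n_actions=3):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)  # e.g. long / flat / short

    def forward(self, x):                  # x: (batch, time, features)
        h, _ = self.gru(x)
        return self.head(h[:, -1])         # logits from the last time step

actor = GRUActor()
opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

# Dummy market windows and expert actions standing in for real data.
x = torch.randn(64, 20, 16)
expert_a = torch.randint(0, 3, (64,))

for step in range(200):                    # supervised imitation phase
    loss = ce(actor(x), expert_a)          # align actor output with expert
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Starting PPO from such a pretrained actor anchors early exploration to expert-like behavior rather than random actions, which is the exploration-exploitation balance the abstract refers to.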

References

  1. Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)

  2. Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: NIPS, vol. 26 (2013)

  3. Deng, Y., Bao, F., Kong, Y., Ren, Z., Dai, Q.: Deep direct reinforcement learning for financial signal representation and trading. IEEE TNNLS 28(3), 653–664 (2016)

  4. Fama, E.F., French, K.R.: Multifactor explanations of asset pricing anomalies. J. Financ. 51(1), 55–84 (1996)

  5. Fedus, W., Zoph, B., Shazeer, N.: Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. JMLR 23(1), 5232–5270 (2022)

  6. Gurrib, I., et al.: Performance of the average directional index as a market timing tool for the most actively traded USD based currency pairs. Banks Bank Syst. 13(3), 58–70 (2018)

  7. Hong, H., Stein, J.C.: A unified theory of underreaction, momentum trading, and overreaction in asset markets. J. Financ. 54(6), 2143–2184 (1999)

  8. Houlsby, N., et al.: Parameter-efficient transfer learning for NLP. In: ICML, pp. 2790–2799. PMLR (2019)

  9. Jang, E., Gu, S., Poole, B.: Categorical reparameterization with Gumbel-Softmax. arXiv preprint arXiv:1611.01144 (2016)

  10. Jegadeesh, N., Titman, S.: Returns to buying winners and selling losers: implications for stock market efficiency. J. Financ. 48(1), 65–91 (1993)

  11. Jegadeesh, N., Titman, S.: Cross-sectional and time-series determinants of momentum returns. Rev. Financ. Stud. 15(1), 143–157 (2002)

  12. Jeong, G., Kim, H.Y.: Improving financial trading decisions using deep Q-learning: predicting the number of shares, action strategies, and transfer learning. Expert Syst. Appl. 117, 125–138 (2019)

  13. Kim, H.J., Shin, K.S.: A hybrid approach based on neural networks and genetic algorithms for detecting temporal patterns in stock markets. Appl. Soft Comput. 7(2), 569–576 (2007)

  14. Li, Z., Tam, V.: A machine learning view on momentum and reversal trading. Algorithms 11(11), 170 (2018)

  15. Lin, H., Zhou, D., Liu, W., Bian, J.: Learning multiple stock trading patterns with temporal routing adaptor and optimal transport. In: 27th ACM SIGKDD, pp. 1017–1026 (2021)

  16. Liu, Y., Liu, Q., Zhao, H., Pan, Z., Liu, C.: Adaptive quantitative trading: an imitative deep reinforcement learning approach. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 2128–2135 (2020)

  17. Moody, J., Saffell, M.: Reinforcement learning for trading. In: NIPS, vol. 11 (1998)

  18. Moody, J., Wu, L.: Optimization of trading systems and portfolios. In: Proceedings of the IEEE/IAFE 1997 CIFEr, pp. 300–307. IEEE (1997)

  19. de Oliveira, R.A., Ramos, H.S., Dalip, D.H., Pereira, A.C.M.: A tabular SARSA-based stock market agent. In: Proceedings of the First ACM International Conference on AI in Finance, pp. 1–8 (2020)

  20. Poterba, J.M., Summers, L.H.: Mean reversion in stock prices: evidence and implications. J. Financ. Econ. 22(1), 27–59 (1988)

  21. Pricope, T.V.: Deep reinforcement learning in quantitative algorithmic trading: a review. arXiv preprint arXiv:2106.00123 (2021)

  22. Ritter, J.R.: Behavioral finance. Pac.-Basin Finance J. 11(4), 429–437 (2003)

  23. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)

  24. Sharpe, W.F.: Mutual fund performance. J. Bus. 39(1), 119–138 (1966)

  25. Si, W., Li, J., Ding, P., Rao, R.: A multi-objective deep reinforcement learning approach for stock index future's intraday trading. In: 2017 10th ISCID, vol. 2, pp. 431–436. IEEE (2017)

  26. Tsang, W.W.H., Chong, T.T.L., et al.: Profitability of the on-balance volume indicator. Econ. Bull. 29(3), 2424–2431 (2009)

  27. Wilder, J.W.: New Concepts in Technical Trading Systems. Trend Research (1978)

  28. Xu, W., et al.: HIST: a graph-based framework for stock trend forecasting via mining concept-oriented shared information. arXiv preprint arXiv:2110.13716 (2021)

  29. Xu, W., Liu, W., Xu, C., Bian, J., Yin, J., Liu, T.Y.: REST: relational event-driven stock trend forecasting. In: Proceedings of the Web Conference 2021, pp. 1–10 (2021)

  30. Yuan, Y., Wen, W., Yang, J.: Using data augmentation based reinforcement learning for daily stock trading. Electronics 9(9), 1384 (2020)

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 72374201).

Author information

Corresponding author: Xi Cheng.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Cheng, X., Zhang, J., Zeng, Y., Xue, W. (2024). MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading. In: Yang, D.N., Xie, X., Tseng, V.S., Pei, J., Huang, J.W., Lin, J.C.W. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science, vol. 14648. Springer, Singapore. https://doi.org/10.1007/978-981-97-2238-9_3


  • DOI: https://doi.org/10.1007/978-981-97-2238-9_3


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2240-2

  • Online ISBN: 978-981-97-2238-9

  • eBook Packages: Computer Science, Computer Science (R0)
