Commission Fee is not Enough: A Hierarchical Reinforced Framework for Portfolio Management

Abstract:

Portfolio management via reinforcement learning is at the forefront of fintech research, which explores how to optimally reallocate a fund into different financial assets over the long term by trial-and-error. Existing methods are impractical since they usually assume each reallocation can be finished immediately and thus ignore the price slippage as part of the trading cost.

Introduction
  • The problem of portfolio management is widely studied in the area of algorithmic trading.
  • Many existing RL methods achieve promising results by focusing on various techniques to extract richer representations, e.g., model-based learning (Tang 2018; Yu et al. 2019), adversarial learning (Liang et al. 2018), or state augmentation (Ye et al. 2020).
  • These RL algorithms assume that portfolio weights can change immediately at the last price once an order is placed, a simplification of the trading cost illustrated in the sketch after this list.
  • Due to the need to balance long-term profit maximization and short-term trade execution, it is challenging for a single/flat RL algorithm to operate on different levels of temporal tasks.
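To make the gap concrete, here is a minimal sketch, assuming a flat commission rate and a crude constant slippage rate (both hypothetical numbers, not taken from the paper), of how a commission-only cost model understates the cost of an instantaneous reallocation compared with one that also charges for price slippage:

```python
import numpy as np

def commission_only_cost(portfolio_value, w_old, w_new, commission_rate=0.0025):
    # Traded volume (buys plus sells) implied by jumping straight to the new weights.
    turnover = np.abs(np.asarray(w_new) - np.asarray(w_old)).sum() * portfolio_value
    return commission_rate * turnover

def commission_plus_slippage_cost(portfolio_value, w_old, w_new,
                                  commission_rate=0.0025, slippage_rate=0.001):
    # Same turnover, but large orders also move the price against the trader;
    # a constant slippage rate is a crude stand-in for that execution cost.
    turnover = np.abs(np.asarray(w_new) - np.asarray(w_old)).sum() * portfolio_value
    return (commission_rate + slippage_rate) * turnover

w_old, w_new = [0.5, 0.5], [0.6, 0.4]
print(commission_only_cost(1e6, w_old, w_new))           # 500.0
print(commission_plus_slippage_cost(1e6, w_old, w_new))  # 700.0
```

In practice slippage depends on order size and liquidity, which is exactly what a dedicated trade-execution policy is meant to handle.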
Highlights
  • The problem of portfolio management is widely studied in the area of algorithmic trading
  • From the plots of the Dow Jones Industrial Average (DJIA) index, we can see that this period is generally a bull market, although the market edges down several times.
  • Our strategy gains more profit under the same risk. In terms of Maximum DrawDown (MDD) and Downside Deviation Ratio (DDR), the results show that HRPM bears the least risk, even lower than Uniform Constant Rebalanced Portfolios (UCRP); the sketch after this list shows how such metrics are commonly computed.
  • We focus on the problem of portfolio management with trading cost via deep reinforcement learning
  • We propose a hierarchical reinforced stock trading system (HRPM)
  • Extensive experimental results in the U.S. market and China market demonstrate that HRPM achieves significant improvements over many state-of-the-art approaches.
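The page does not spell out the metric definitions, so the following Python sketch uses common conventions (annualized Sharpe ratio with 252 trading days and a zero risk-free rate, maximum drawdown from a running peak, and a Sortino-style downside deviation ratio); the exact formulas in the paper may differ in details:

```python
import numpy as np

def annualized_sharpe_ratio(returns, periods_per_year=252):
    """ASR: mean over std of per-period returns, scaled to a year (risk-free rate = 0)."""
    return np.mean(returns) / (np.std(returns) + 1e-12) * np.sqrt(periods_per_year)

def max_drawdown(values):
    """MDD: largest relative drop from a running peak of the portfolio value."""
    peaks = np.maximum.accumulate(values)
    return np.max((peaks - values) / peaks)

def downside_deviation_ratio(returns):
    """DDR (one common, Sortino-style definition): mean return divided by the
    standard deviation of the negative returns only."""
    downside = returns[returns < 0]
    dd = np.std(downside) if downside.size > 0 else 0.0
    return np.mean(returns) / (dd + 1e-12)

# Toy usage on a short synthetic portfolio-value curve
values = np.array([1.00, 1.02, 1.01, 1.05, 1.03, 1.08])
rets = np.diff(values) / values[:-1]
print(annualized_sharpe_ratio(rets), max_drawdown(values), downside_deviation_ratio(rets))
```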
Results
  • The authors' HRPM is the only strategy that outperforms the DJIA index, while DPM roughly follows the trend of the market.
  • On ASR, most methods score higher than the DJIA index, and HRPM is the best.
  • When the portfolio values of other methods decline, HRPM still hovers near its peak.
  • This phenomenon demonstrates that HRPM is superior to all the baselines and relatively robust under different market conditions.
Conclusion
  • The authors focus on the problem of portfolio management with trading cost via deep reinforcement learning.
  • The authors propose a hierarchical reinforced stock trading system (HRPM).
  • The authors build a hierarchy of portfolio management over trade execution and train the corresponding policies.
  • The high-level policy gives portfolio weights and invokes the low-level policy to sell or buy the corresponding shares within a short time window (see the sketch after this list).
  • Extensive experimental results in the U.S. market and China market demonstrate that HRPM achieves significant improvements over many state-of-the-art approaches.
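For concreteness, the following is a minimal Python sketch of that two-level control flow; the class names, interfaces, and untrained placeholder policies are hypothetical illustrations, not the authors' implementation. A trained high-level policy would map market state to target weights, and a trained low-level policy would adapt each per-asset execution schedule to order-book state rather than slicing evenly:

```python
import numpy as np

class HighLevelPolicy:
    """Maps the market state to target portfolio weights (placeholder network)."""
    def act(self, state: np.ndarray) -> np.ndarray:
        logits = np.random.randn(state.shape[0])
        weights = np.exp(logits) / np.exp(logits).sum()  # weights sum to 1
        return weights

class LowLevelPolicy:
    """Executes one asset's parent order over a short window of `steps` decisions."""
    def execute(self, shares_to_trade: float, steps: int = 10) -> list:
        # Naive baseline: slice the parent order evenly across the window.
        return [shares_to_trade / steps] * steps

def rebalance(prices, holdings, cash, high, low):
    """One rebalancing round: high-level weights -> per-asset child-order schedules."""
    prices = np.asarray(prices, dtype=float)
    value = cash + float(np.dot(prices, holdings))
    target_weights = high.act(prices)
    target_shares = target_weights * value / prices
    schedules = {i: low.execute(target_shares[i] - holdings[i])
                 for i in range(len(prices))}
    return target_weights, schedules

weights, plan = rebalance(prices=[100.0, 50.0, 20.0],
                          holdings=np.array([5.0, 10.0, 0.0]),
                          cash=1000.0,
                          high=HighLevelPolicy(),
                          low=LowLevelPolicy())
print(weights, plan[0][:3])
```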
Summary
  • Introduction:

    The problem of portfolio management is widely studied in the area of algorithmic trading.
  • Many existing RL methods achieve promising results by focusing on various techniques to extract richer representations, e.g., model-based learning (Tang 2018; Yu et al. 2019), adversarial learning (Liang et al. 2018), or state augmentation (Ye et al. 2020).
  • These RL algorithms assume that portfolio weights can change immediately at the last price once an order is placed.
  • Due to the need to balance long-term profit maximization and short-term trade execution, it is challenging for a single/flat RL algorithm to operate on different levels of temporal tasks.
  • Objectives:

    The authors' objective is to maximize the final portfolio value given a long time horizon by taking into account the trading cost.
  • In order to encourage the high-level policy not to "put all the eggs in one basket", the authors aim to find a high-level policy that maximizes a maximum-entropy objective, sketched below.
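The formula itself is not reproduced on this page. Assuming the entropy term is taken over the high-level action, i.e., the portfolio weight vector w_t, with an entropy temperature α and discount factor γ (notation assumed here rather than taken from the paper), such an objective has the general form

\[
J(\pi_{\text{high}}) \;=\; \mathbb{E}_{\tau \sim \pi_{\text{high}}}\!\left[\sum_{t} \gamma^{t}\Big(r_{t} + \alpha\,\mathcal{H}(\mathbf{w}_{t})\Big)\right],
\qquad
\mathcal{H}(\mathbf{w}_{t}) \;=\; -\sum_{i} w_{t,i}\,\log w_{t,i},
\]

so that more evenly spread weight vectors receive a larger entropy bonus.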
  • Results:

    The authors' HRPM is the only strategy that outperforms the DJIA index, while DPM roughly follows the trend of the market.
  • On ASR, most methods score higher than the DJIA index, and HRPM is the best.
  • When the portfolio values of other methods decline, HRPM still hovers near its peak.
  • This phenomenon demonstrates that HRPM is superior to all the baselines and relatively robust under different market conditions.
  • Conclusion:

    The authors focus on the problem of portfolio management with trading cost via deep reinforcement learning.
  • The authors propose a hierarchical reinforced stock trading system (HRPM).
  • The authors build a hierarchy of portfolio management over trade execution and train the corresponding policies.
  • The high-level policy gives portfolio weights and invokes the low-level policy to sell or buy the corresponding shares within a short time window.
  • Extensive experimental results in the U.S. market and China market demonstrate that HRPM achieves significant improvements over many state-of-the-art approaches.
Tables
  • Table 1: Period of stock data used in the experiments
  • Table 2: Performance comparison in the U.S. market
  • Table 3: Performance comparison in the China market
  • Table 4: Ablation on the effect of entropy in the U.S. market
Funding
  • This research is supported, in part, by the Joint NTU-WeBank Research Centre on Fintech (Award No: NWJ2019-008), Nanyang Technological University, Singapore
Reference
  • Almahdi, S.; and Yang, S. Y. 2017. An adaptive portfolio trading system: A risk-return portfolio optimization using recurrent reinforcement learning with expected maximum drawdown. Expert Systems with Applications 87: 267–279.
  • Borodin, A.; El-Yaniv, R.; and Gogan, V. 2004. Can we learn to beat the best stock. In Advances in Neural Information Processing Systems, 345–352.
  • Cover, T. M. 2011. Universal portfolios. In The Kelly Capital Growth Investment Criterion: Theory and Practice, 181–209. World Scientific.
  • Gaivoronski, A. A.; and Stella, F. 2000. Stochastic nonstationary optimization for finding universal portfolios. Annals of Operations Research 100(1-4): 165–188.
  • Gao, L.; and Zhang, W. 2013. Weighted moving average passive aggressive algorithm for online portfolio selection. In 2013 5th International Conference on Intelligent Human-Machine Systems and Cybernetics, volume 1, 327–330. IEEE.
  • Jiang, Z.; and Liang, J. 2017. Cryptocurrency portfolio management with deep reinforcement learning. In 2017 Intelligent Systems Conference (IntelliSys), 905–913. IEEE.
  • Jiang, Z.; Xu, D.; and Liang, J. 2017. A deep reinforcement learning framework for the financial portfolio management problem. arXiv preprint arXiv:1706.10059.
  • Li, B.; and Hoi, S. C. 2012. On-line portfolio selection with moving average reversion. In Proceedings of the 29th International Conference on Machine Learning, 563–570.
  • Liang, Z.; Chen, H.; Zhu, J.; Jiang, K.; and Li, Y. 2018. Adversarial deep reinforcement learning in portfolio management. arXiv preprint arXiv:1808.09940.
  • Lillicrap, T. P.; Hunt, J. J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; and Wierstra, D. 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.
  • Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; and Riedmiller, M. 2013. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
  • Mosavi, A.; Ghamisi, P.; Faghan, Y.; and Duan, P. 2020. Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics. arXiv preprint arXiv:2004.01509.
  • Nevmyvaka, Y.; Feng, Y.; and Kearns, M. 2006. Reinforcement learning for optimized trade execution. In Proceedings of the 23rd International Conference on Machine Learning, 673–680.
  • Silver, D.; Huang, A.; Maddison, C. J.; Guez, A.; Sifre, L.; Van Den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529(7587): 484.
  • Sutton, R. S.; McAllester, D. A.; Singh, S. P.; and Mansour, Y. 2000. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems, 1057–1063.
  • Tang, L. 2018. An actor-critic-based portfolio investment method inspired by benefit-risk optimization. Journal of Algorithms & Computational Technology 12(4): 351–360.
  • Tavakoli, A.; Pardo, F.; and Kormushev, P. 2018. Action branching architectures for deep reinforcement learning. In Thirty-Second AAAI Conference on Artificial Intelligence, 4131–4138.
  • Ye, Y.; Pei, H.; Wang, B.; Chen, P.-Y.; Zhu, Y.; Xiao, J.; and Li, B. 2020. Reinforcement-learning based portfolio management with augmented asset movement prediction states. arXiv preprint arXiv:2002.05780.
  • Yu, P.; Lee, J. S.; Kulyatin, I.; Shi, Z.; and Dasgupta, S. 2019. Model-based deep reinforcement learning for dynamic portfolio optimization. arXiv preprint arXiv:1901.08740.
  • Zhang, J.; and Tao, D. 2021. Empowering Things with Intelligence: A Survey of the Progress, Challenges, and Opportunities in Artificial Intelligence of Things. IEEE Internet of Things Journal. doi:10.1109/JIOT.2020.3039359.