
Graduate student: 劉上瑋
Thesis title: 深度增強學習在動態資產配置上之應用— 以美國ETF為例
The Application of Deep Reinforcement Learning on Dynamic Asset Allocation: A Case Study of U.S. ETFs
Advisor: 廖四郎
Oral defense committee: 蔡炎龍
連育民
Degree: Master
Department: College of Commerce - Department of Money and Banking
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: Chinese
Number of pages: 43
Chinese keywords: 動態資產配置, 深度增強學習, Q-Learning, 類神經網路
English keywords: Dynamic asset allocation, Deep reinforcement learning, Q-Learning, Neural network
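  • Reinforcement learning learns through continual interaction with its environment, with the goal of maximizing the sum of the rewards received in each period, and it is widely used in multi-period decision processes. Given these properties, reinforcement learning can be applied to build dynamic asset allocation strategies, which require the portfolio's allocation weights to be adjusted continually.
    This study applies the Deep Q-Learning algorithm to build a dynamic asset allocation strategy and investigates how to find the optimal allocation weights under the different environment states of each period. The portfolio is built from U.S. large- and mid-cap equity ETFs and investment-grade bond ETFs over the period from July 2, 2007 to June 30, 2017. The model is trained on their daily returns, and its performance is compared with a buy-and-hold strategy and a constant-mix strategy to examine the suitability of deep reinforcement learning for dynamic asset allocation.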
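
    For orientation, the textbook Q-learning update and the squared temporal-difference loss commonly used to train a Deep Q-Network can be written as below. This is a generic sketch, not the thesis's own formulation: this record does not say how the thesis defines states, actions, or rewards, so the symbols are assumptions ($s_t$ a market state, $a_t$ a choice of allocation weights, $r_{t+1}$ the resulting one-period reward, $\alpha$ a learning rate, $\gamma$ a discount factor, $\theta$ the network parameters, $\theta^{-}$ the parameters of a periodically updated target network).

```latex
% Tabular Q-learning update (generic form)
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]

% Deep Q-Network loss: Q is approximated by a neural network with parameters \theta
L(\theta) = \mathbb{E}_{(s, a, r, s')} \left[
  \left( r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta) \right)^{2}
\right]
```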


    Reinforcement learning learns by interacting continuously with its environment in order to maximize the sum of the returns received in each period. It has been widely used to solve multi-period decision-making problems. Because of these characteristics, reinforcement learning can be applied to build dynamic asset allocation strategies, which continually rebalance the portfolio mix.
    In this study, we apply the deep Q-Learning algorithm to build a dynamic asset allocation strategy and study how to find the optimal weights under different environment states. We use U.S. large-cap and mid-cap equity ETFs and investment-grade bond ETFs to build the portfolio. We train the model on daily return data and then measure its performance against buy-and-hold and constant-mix strategies to assess the suitability of deep Q-Learning.
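
    The two benchmark strategies named in the abstract can be illustrated with a minimal sketch. The function names, the two-asset toy returns, and the 50/50 weights are hypothetical and are not taken from the thesis; transaction costs are ignored, and the constant-mix version is assumed to rebalance every period.

```python
from math import prod


def buy_and_hold(returns, weights):
    """Final wealth of 1 unit invested at the initial weights and never rebalanced.

    returns: list of periods, each a list of simple returns per asset.
    weights: initial portfolio weights, summing to 1.
    """
    n_assets = len(weights)
    # Each asset compounds on its own; the weights drift with prices.
    growth = [prod(1.0 + period[i] for period in returns) for i in range(n_assets)]
    return sum(w * g for w, g in zip(weights, growth))


def constant_mix(returns, weights):
    """Final wealth when the portfolio is reset to the target weights every period."""
    wealth = 1.0
    for period in returns:
        # With weights reset each period, the portfolio return is the weighted sum.
        wealth *= 1.0 + sum(w * r for w, r in zip(weights, period))
    return wealth


# Hypothetical toy data: three periods of [stock, bond] returns.
toy_returns = [[0.10, 0.00], [-0.10, 0.00], [0.10, 0.00]]
toy_weights = [0.5, 0.5]

bh_wealth = buy_and_hold(toy_returns, toy_weights)
cm_wealth = constant_mix(toy_returns, toy_weights)
print(f"buy-and-hold: {bh_wealth:.6f}, constant-mix: {cm_wealth:.6f}")
```

    In this toy path the stock oscillates, so rebalancing ends slightly ahead of buy-and-hold; in a steadily trending market the ordering would typically reverse.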

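    Chapter 1 Introduction
    Section 1 Research Background and Motivation
    Section 2 Research Objectives
    Chapter 2 Literature Review
    Section 1 Asset Allocation
    Section 2 Reinforcement Learning
    Chapter 3 Research Methods
    Section 1 Portfolio Construction
    Section 2 Data Processing
    Section 3 Reinforcement Learning System Design
    Section 4 Deep Q-Network
    Chapter 4 Results
    Section 1 Analysis of Results
    Chapter 5 Conclusions and Recommendations
    References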

