Author: 黃得晉 (Huang, De-Jin)
Thesis title: A GAN-IPCA Framework for Stock Selection: Empirical Evidence from the Taiwan Stock Market
Original title: 基於條件式潛在因子模型結合對抗式生成網路的選股策略:以台灣股市為例
Advisor: 江彌修
Degree: Master
Department: College of Commerce - Department of Money and Banking
Year of publication: 2026
Academic year of graduation: 114
Language: Chinese
Number of pages: 90
Keywords: Generative Adversarial Networks, Nonlinear Factor Mapping, Conditional Latent Factor Models, Cross-Sectional Stock Selection, Walk-Forward Backtesting, Taiwan Stock Market, IPCA
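  • Since the Fama-French family of factor models, the asset pricing literature has rested on a twofold assumption: each stock’s factor loadings are time-invariant constants, and the mapping from characteristics to loadings is linear. The Instrumented Principal Component Analysis (IPCA) model of Kelly, Pruitt, and Su (2019) relaxes the first assumption by making factor loadings functions of firm characteristics, a landmark advance in recent conditional factor modeling. IPCA nevertheless restricts the characteristics → loadings mapping to be linear and so cannot capture nonlinear interactions among characteristics. Large-scale U.S. studies such as Gu, Kelly, and Xiu (2020) and Chen, Pelger, and Zhu (2024) find that nonlinear methods hold a systematic advantage over linear benchmarks in cross-sectional prediction.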
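    However, existing empirical evidence on nonlinear conditional factor models is concentrated almost entirely on U.S. equities. For the Taiwan market, with its high share of retail investors, rich institutional-flow data, and high industry concentration, two questions remain open: whether deep learning can deliver conditional factor modeling capacity beyond the linear benchmark of Principal Component Analysis (PCA), and whether it can add an independent marginal contribution on top of IPCA, which already allows dynamic loadings. 簡祥育 (2025) was the first to apply linear IPCA to Taiwan stocks but did not examine the possibility of a nonlinear mapping. The core motivation of this study is to use the Taiwan market as the empirical setting for a two-level methodological positioning. First, can a deep conditional factor model comprehensively outperform the static PCA benchmark? Second, given IPCA’s already robust performance, can replacing its linear mapping Γ with a nonlinear generator G(·) from a generative adversarial network provide an independent contribution in dimensions that IPCA cannot span, for example under specific market regimes, under specific strategy weighting schemes, or as positive alpha on the IPCA residuals?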
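    To this end, the thesis compares three models on the same dataset, the same feature dimension, and the same validation procedure: principal component regression with static loadings (PCA), IPCA with linear conditional loadings, and Deep IPCA-GAN (GAN-IPCA), which replaces IPCA’s linear mapping with a Generative Adversarial Network (GAN). Within IPCA’s chain of characteristics → loadings → latent factors → predicted returns, GAN-IPCA replaces the linear mapping Γ with a nonlinear generator G(·) and adds a factor-orthogonality penalty to the loss function to prevent the latent factors from collapsing, so that the added nonlinear capacity does not undermine the model’s economic interpretability. The research design uses monthly data on Taiwan-listed companies from March 2006 to February 2026, 240 months in total. The feature set covers trading flows of the three major institutional investors, valuation measures cross-checked across multiple sources, earnings and dividends, price-volume momentum, and macro-monetary variables. All three models are retrained month by month in a walk-forward rolling scheme so that their information sets are aligned.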
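    The exact form of the factor-orthogonality penalty is not given in this record. As a rough illustration only, one common way to write such a penalty is the squared Frobenius distance between the sample second-moment matrix of the latent factors and the identity. The function name `orthogonality_penalty` and the normalisation by the number of months are assumptions made for this sketch, not the thesis’s specification.

```python
import numpy as np

def orthogonality_penalty(factors: np.ndarray) -> float:
    """Squared Frobenius distance between F'F / T and the identity.

    `factors` is a (T, K) array: T months of K latent factors.
    Hypothetical form; the thesis's actual penalty may differ.
    """
    n_periods, n_factors = factors.shape
    second_moment = factors.T @ factors / n_periods
    deviation = second_moment - np.eye(n_factors)
    return float(np.sum(deviation ** 2))

# Orthonormal columns scaled by sqrt(T) give F'F / T = I,
# so the penalty is zero up to floating-point error.
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((240, 3)))
orthogonal_factors = q * np.sqrt(240)

# Collapsed factors: all three columns are the same series,
# so F'F / T is a matrix of ones and the off-diagonals are penalised.
collapsed_factors = np.repeat(orthogonal_factors[:, :1], 3, axis=1)

print(orthogonality_penalty(orthogonal_factors))
print(orthogonality_penalty(collapsed_factors))
```

    In this toy case the collapsed factors incur a penalty of about 6 (six off-diagonal entries, each equal to 1), while the orthogonal factors incur essentially none, which is the behaviour a collapse-preventing term needs.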
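    The core questions are as follows. (i) Can GAN-IPCA comprehensively outperform the static PCA benchmark? (ii) Although GAN-IPCA’s full-sample Sharpe ratio need not exceed IPCA’s, in which sub-dimensions does it provide marginal information that IPCA fails to capture? This marginal contribution is tested along three lines: whether the intercept of a spanning regression on IPCA’s monthly returns is positive, the Sharpe gap under low- versus high-volatility market conditions, and the ability to dampen losses in down months. (iii) If GAN-IPCA’s marginal contribution over IPCA does exist, what is its economic meaning, and under which market regimes does it break down?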


    Since the Fama–French factor models, the asset pricing literature has been built upon two implicit assumptions: (i) factor loadings are time-invariant constants for each stock, and (ii) the mapping from firm characteristics to factor loadings is linear. Kelly, Pruitt, and Su (2019) relax the first assumption through the Instrumented Principal Components Analysis (IPCA) framework, in which factor loadings become functions of observable characteristics—a major advance in conditional factor modeling. Yet IPCA confines the characteristic-to-loading mapping to a linear form and cannot capture nonlinear interactions among characteristics. Large-scale U.S. empirical work (Gu, Kelly, and Xiu, 2020; Chen, Pelger, and Zhu, 2024) documents systematic advantages of nonlinear methods over linear benchmarks in cross-sectional return prediction.
    However, the existing empirical evidence on nonlinear conditional factor models concentrates almost exclusively on the U.S. market. Two questions remain open for the Taiwan market, which is characterized by high retail participation, abundant institutional flow data, and high industry concentration: whether deep learning can deliver conditional factor modeling capacity beyond the linear PCA benchmark, and whether, on top of an already strong IPCA baseline, the nonlinear variant can offer an independent marginal contribution in dimensions IPCA cannot span. While 簡祥育 (2025) pioneered the application of linear IPCA to Taiwan stocks, no work has yet examined the nonlinear extension. The central motivation of this study is therefore to establish, in the Taiwan market, a two-tier methodological positioning: first, whether a deep conditional factor model can fully surpass the static PCA benchmark; second, given IPCA’s already robust performance, whether replacing its linear mapping Γ with an adversarially trained nonlinear generator G(·) can provide independent marginal contributions in dimensions that IPCA cannot span, for example positive spanning alpha on IPCA residuals, performance differentials under specific market regimes, or performance under specific portfolio-weighting schemes.
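    In symbols, the second tier of this comparison can be written as below. The notation is an assumption borrowed from the usual presentation of Kelly, Pruitt, and Su (2019), with r for excess returns, z for the vector of firm characteristics, f for the latent factors, and β for the loadings; this record itself names only Γ and G(·). Intercept terms are omitted for brevity.

```latex
\begin{align}
\text{IPCA:} \quad & r_{i,t+1} = \beta_{i,t}^{\top} f_{t+1} + \varepsilon_{i,t+1},
  \qquad \beta_{i,t} = \Gamma^{\top} z_{i,t} \\
\text{GAN-IPCA:} \quad & r_{i,t+1} = \beta_{i,t}^{\top} f_{t+1} + \varepsilon_{i,t+1},
  \qquad \beta_{i,t} = G(z_{i,t})
\end{align}
```

    The two specifications differ only in how loadings are produced from characteristics: when G is restricted to be linear, the second line reduces to the first.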
    Holding constant the dataset, feature dimension, and validation procedure, we conduct a side-by-side analysis of three approaches: static-loading Principal Component Regression (PCA), linear conditional-loading IPCA, and Deep IPCA-GAN (GAN-IPCA), which replaces IPCA’s linear mapping with a Generative Adversarial Network (GAN). GAN-IPCA preserves IPCA’s chained estimation structure (characteristics → loadings → latent factors → predicted returns) while replacing the linear mapping Γ with a nonlinear generator G(·), and its loss function includes an orthogonality penalty on the latent factors to prevent the factor-collapse pathology of adversarial training. The empirical design uses monthly data on Taiwan-listed firms from March 2006 to February 2026 (240 months in total), with a feature set spanning institutional trading flows, multi-source valuation indicators, earnings and dividend signals, price-volume momentum, and macro-monetary indicators. A walk-forward retraining scheme aligns the three models’ information conditions month by month.
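    A walk-forward scheme of this kind can be sketched as below. The expanding training window, the 60-month minimum history, and the helper name `walk_forward_splits` are assumptions for illustration; this record does not state the window type or length used in the thesis.

```python
from typing import Iterator, List, Tuple

def walk_forward_splits(
    n_months: int, min_train: int
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training month indices, test month index) pairs.

    A model is refit on months [0, t) and evaluated on month t,
    so nothing from month t or later enters the fit.
    """
    for test_month in range(min_train, n_months):
        yield list(range(test_month)), test_month

# 240 months of data with a hypothetical 60-month minimum history.
splits = list(walk_forward_splits(n_months=240, min_train=60))
first_train, first_test = splits[0]
last_train, last_test = splits[-1]

# Prints: 180 60 239 239
print(len(splits), first_test, len(last_train), last_test)
```

    Feeding every model the same sequence of splits is what keeps their information sets aligned: at each test month, all three see exactly the same history.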
    The study addresses the following questions: (i) Can GAN-IPCA comprehensively outperform the static PCA benchmark? (ii) Although GAN-IPCA’s full-sample Sharpe ratio is not necessarily superior to IPCA’s, on which sub-dimensions does it provide marginal information that IPCA cannot extract? This marginal contribution is examined through three lenses: the sign of the intercept in spanning regressions on IPCA monthly returns, Sharpe gaps under low- versus high-volatility regimes, and loss-mitigation capacity in down months. (iii) If GAN-IPCA’s marginal contribution over IPCA is confirmed, what is its economic interpretation, and under which market regimes does it fail?
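    The first of these lenses, the spanning regression, regresses one strategy’s monthly returns on another’s and reads the intercept as the part of the return the benchmark does not span. The sketch below runs on simulated returns, not the thesis’s data. The function name `spanning_alpha`, the simulated parameters, and the use of classical OLS standard errors (rather than a heteroskedasticity- and autocorrelation-robust estimator) are all assumptions for illustration.

```python
import numpy as np

def spanning_alpha(target: np.ndarray, benchmark: np.ndarray) -> tuple:
    """OLS of target_t = alpha + beta * benchmark_t + e_t.

    Returns (alpha, beta, t-statistic of alpha), with the t-statistic
    based on classical OLS standard errors.
    """
    n_obs = target.shape[0]
    design = np.column_stack([np.ones(n_obs), benchmark])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    sigma2 = residuals @ residuals / (n_obs - 2)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    alpha, beta = coef
    return float(alpha), float(beta), float(alpha / np.sqrt(cov[0, 0]))

# Simulated monthly returns (hypothetical numbers, for illustration only).
rng = np.random.default_rng(42)
ipca_returns = 0.010 + 0.04 * rng.standard_normal(240)
gan_returns = 0.004 + 0.6 * ipca_returns + 0.02 * rng.standard_normal(240)

alpha, beta, t_alpha = spanning_alpha(gan_returns, ipca_returns)
print(f"alpha={alpha:.4f}, beta={beta:.3f}, t(alpha)={t_alpha:.2f}")
```

    A positive and statistically significant alpha would indicate that the target strategy earns returns not explained by its exposure to the benchmark strategy, which is the sense of marginal contribution in question (ii).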

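    Abstract (Chinese)
    Abstract (English)
    Chapter 1 Introduction
    1.1 Research Background and Motivation
    1.2 Research Objectives
    1.3 Research Contributions
    Chapter 2 Literature Review
    2.1 The Evolution of Traditional Factor Models and the Factor Zoo Problem
    2.2 Instrumented Principal Component Analysis (IPCA): Relaxing the Constant-Loading Assumption
    2.3 Deep Learning in Asset Pricing: Relaxing the Linear-Mapping Assumption
    2.4 Generative Adversarial Networks in Asset Pricing
    2.5 Short-Leg Anomalies and Machine Learning Applications in Emerging Markets
    2.6 Research Gap and Positioning of This Study
    Chapter 3 Methodology
    3.1 Data Sources and Sample
    3.1.1 Data Sources
    3.1.2 Feature Collinearity Analysis and Selection of the Final 50 Features
    3.1.3 Sample Period and Splits
    3.1.4 Data Preprocessing
    3.2 Feature Groups
    3.3 Model Specification
    3.3.1 (A) PCA: Static Loadings + Linear Principal Component Regression
    3.3.2 (B) IPCA: Conditional Linear Loadings
    3.3.3 (C) GAN-IPCA: Conditional Nonlinear Loadings
    3.3.4 Inference: Static f̄
    3.4 Portfolio Construction
    3.5 Performance Metrics
    3.6 Robustness Tests
    Chapter 4 Empirical Results
    4.1 PCA Factor Analysis
    4.1.1 Explained Variance and Choice of the Number of Principal Components
    4.1.2 Interpreting the Principal Component Loadings
    4.2 IPCA Factor Analysis
    4.2.1 Interpreting the Gamma Matrix
    4.2.2 Bmean: Long-Run Average Characteristics → Returns
    4.3 GAN-IPCA Factor Analysis
    4.3.1 SHAP Feature Importance
    4.3.2 Economic Interpretation of the Opening Price and Robustness Tests
    4.3.3 Factor Contributions: Time Series and Cumulative
    4.4 Performance Comparison
    4.4.1 Full-Sample Performance Summary
    4.4.2 Cumulative Return Curves
    4.4.3 12-Month Rolling Sharpe
    4.5 Subperiod Analysis
    4.5.1 Annual Sharpe Heatmap and CAGR Bar Chart
    4.5.2 LS VW Subperiod Performance (GAN-IPCA’s Home Turf)
    4.5.3 LO EW Subperiod Performance (Benchmark Comparison)
    4.5.4 IS / OOS Robustness Summary
    4.5.5 Diagnosing the Performance Decay of GAN-IPCA LS VW
    4.6 Factor Explanatory Power and Turnover
    4.6.1 Monthly IC Time Series
    4.6.2 KPS-R² and Fama–MacBeth
    4.6.3 Turnover and Transaction Costs
    4.7 Fama-French Six-Factor Alpha Tests
    4.7.1 FF6 Factor Descriptive Statistics
    4.7.2 Alpha Test Results
    4.7.3 Factor Loadings
    4.7.4 GRS Joint Test
    4.7.5 Marginal Contribution to the Factor MVE Sharpe
    4.8 Stock-Selection Concentration Analysis: Is IPCA’s Alpha a “Market-Wide Alpha”?
    4.9 Advanced Statistical Tests
    4.9.1 Diebold-Mariano Test: Differences in Predictive Ability Across Models
    4.9.2 Rolling 24-Month Sharpe Ratio Comparison
    4.9.3 IPCA-GAN Combination Strategy: Complementarity Test
    4.10 Independent Marginal Contribution of GAN-IPCA over IPCA
    4.10.1 Spanning Regression: Full Sample / IS / OOS
    4.10.2 Regime-Conditional Sharpe Gap
    4.10.3 Synthesis of Results and Answer to Research Question 2
    Chapter 5 Conclusions and Recommendations
    5.1 Conclusions
    5.2 Limitations
    5.3 Directions for Future Research
    References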

    Aas, K., Jullum, M., and Løland, A. (2021). Explaining individual predictions when features are dependent: More accurate approximations to Shapley values. Artificial Intelligence, 298, 103502.
    Baker, M., and Wurgler, J. (2006). Investor sentiment and the cross-section of stock returns. Journal of Finance, 61(4), 1645–1680.
    Baker, M., Greenwood, R., and Wurgler, J. (2009). Catering through nominal share prices. Journal of Finance, 64(6), 2559–2590.
    Birru, J., and Wang, B. (2016). Nominal price illusion. Journal of Financial Economics, 119(3), 578–598.
    Brennan, M. J., Chordia, T., and Subrahmanyam, A. (1998). Alternative factor specifications, security characteristics, and the cross-section of expected stock returns. Journal of Financial Economics, 49(3), 345–373.
    Bryzgalova, S., Pelger, M., and Zhu, J. (2025). Forest through the trees: Building cross-sections of stock returns. Journal of Finance, 80(5), 2447–2506.
    Carhart, M. M. (1997). On persistence in mutual fund performance. Journal of Finance, 52(1), 57–82.
    Chen, L., Pelger, M., and Zhu, J. (2024). Deep learning in asset pricing. Management Science, 70(2), 714–750.
    Cong, L. W., Tang, K., Wang, J., and Zhang, Y. (2021). AlphaPortfolio: Direct construction through deep reinforcement learning and interpretable AI. SSRN Working Paper No. 3554486.
    Cont, R. (2001). Empirical properties of asset returns: Stylized facts and statistical issues. Quantitative Finance, 1(2), 223–236.
    Diebold, F. X., and Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business & Economic Statistics, 13(3), 253–263.
    Fama, E. F., and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33, 3–56.
    Fama, E. F., and French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116, 1–22.
    Feng, G., Giglio, S., and Xiu, D. (2020). Taming the factor zoo: A test of new factors. Journal of Finance, 75(3), 1327–1370.
    Feng, G., He, J., Polson, N. G., and Xu, J. (2024). Deep learning in characteristics-sorted factor models. Journal of Financial and Quantitative Analysis, 59(7), 3001–3036.
    Gibbons, M. R., Ross, S. A., and Shanken, J. (1989). A test of the efficiency of a given portfolio. Econometrica, 57(5), 1121–1152.
    Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems (NeurIPS), 27, 2672–2680.
    Gu, S., Kelly, B., and Xiu, D. (2020). Empirical asset pricing via machine learning. Review of Financial Studies, 33(5), 2223–2273.
    Gu, S., Kelly, B., and Xiu, D. (2021). Autoencoder asset pricing models. Journal of Econometrics, 222(1), 429–450.
    Harvey, C. R., Liu, Y., and Zhu, H. (2016). …and the cross-section of expected returns. Review of Financial Studies, 29(1), 5–68.
    Jegadeesh, N. (1990). Evidence of predictable behavior of security returns. Journal of Finance, 45(3), 881–898.
    Jegadeesh, N., and Titman, S. (1993). Returns to buying winners and selling losers: Implications for stock market efficiency. Journal of Finance, 48(1), 65–91.
    Kelly, B. T., Pruitt, S., and Su, Y. (2019). Characteristics are covariances: A unified model of risk and return. Journal of Financial Economics, 134(3), 501–524.
    Kim, S., Korajczyk, R. A., and Neuhierl, A. (2022). Characteristic-based returns: Alpha or smart beta? Journal of Investment Management, 20(1), 70–89.
    Lettau, M., and Pelger, M. (2020). Estimating latent asset-pricing factors. Journal of Econometrics, 218(1), 1–31.
    Leippold, M., Wang, Q., and Zhou, W. (2022). Machine learning in the Chinese stock market. Journal of Financial Economics, 145(2), 64–82.
    Lundberg, S. M., and Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems (NeurIPS), 30, 4765–4774.
    Newey, W. K., and West, K. D. (1987). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55(3), 703–708.
    Stambaugh, R. F., Yu, J., and Yuan, Y. (2012). The short of it: Investor sentiment and anomalies. Journal of Financial Economics, 104(2), 288–302.
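    簡祥育 (2025). 藉由機器學習強化基本面、股市動能與市場情緒的動態因子模型:以台灣股票市場為例 [A dynamic factor model of fundamentals, stock-market momentum, and market sentiment enhanced by machine learning: Evidence from the Taiwan stock market]. Master’s thesis, Department of Money and Banking, National Chengchi University.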

    Full text available from 2031/06/29