Graduate student: Hu, Ruei-Xsuan (胡瑞軒)
Thesis title: 主題分析方法在經濟文獻學上的應用:隱含狄利克雷分配與代理人基計算經濟學
Topic Analysis in the Automatic Organization of Economic Literature: The Case of Agent-Based Computational Economics with the Use of Latent Dirichlet Allocation
Advisor: Chen, Shu-Heng (陳樹衡)
Oral defense committee: Chie, Bin-Tzong (池秉聰); Tseng, Yi-Heng (曾翊恆)
Degree: Master
Department: College of Social Sciences - Department of Economics
Publication year: 2022
Graduation academic year: 110
Language: Chinese
Number of pages: 79
Chinese keywords: 代理人基建模, 非監督學習, 詞彙頻率-逆文檔頻率, 文字雲, 自然語言處理, 主題一致性, 主題相似度
English keywords: Agent-Based Modeling, Unsupervised Learning, TF-IDF, Word cloud, NLP, Topic coherence, Topic similarity
DOI URL: http://doi.org/10.6814/NCCU202201265
  • This thesis classifies Agent-Based Modeling (ABM) papers from multiple journals using Latent Dirichlet Allocation (LDA), a topic model. It then applies the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm to recover words that are related to a topic but were filtered out in the first pass, and uses word clouds to identify words shared across topics. It also groups the topics by the journals they belong to and analyzes how the topics change over time. Finally, the topic similarity, topic ranking, and topic coherence analyses show that the topics overlap little and that both the topic explanation ratio and the topic coherence are high. Unlike previous studies, this thesis analyzes multiple journals and evaluates the classification after it is made. The topic similarity, topic ranking, and topic coherence evaluations show that the LDA model can classify documents effectively in a concrete, quantitative way, and that it costs less time and reduces data complexity more than manual classification.
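    The abstract names TF-IDF weighting and topic similarity as two of the quantities used. The following is a minimal Python sketch of both, not the thesis's own implementation. The toy corpus, the two topic-word distributions, and the function names `tf_idf` and `cosine_similarity` are all hypothetical, and the TF-IDF variant shown (relative term frequency times log(N / df)) is only one common choice. The thesis outline lists cosine distance, which is 1 minus the cosine similarity computed here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def cosine_similarity(p, q):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * q.get(term, 0.0) for term, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


# Toy corpus, made up for illustration (not the thesis data).
docs = [
    ["agent", "market", "simulation", "agent"],
    ["market", "price", "auction"],
    ["agent", "network", "learning"],
]
weights = tf_idf(docs)

# Hypothetical topic-word distributions, of the kind an LDA model produces.
topic_a = {"agent": 0.5, "simulation": 0.3, "market": 0.2}
topic_b = {"price": 0.6, "auction": 0.3, "market": 0.1}
similarity = cosine_similarity(topic_a, topic_b)
distance = 1.0 - similarity
```

    In this sketch, a term that appears in only one document, such as "simulation", receives a higher weight than an equally frequent term spread across documents. The two topics share only "market", so their similarity is close to 0, which is the sense in which topics are said to overlap little.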

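    Abstract (Chinese)
    Abstract (English)
    1 Introduction
    2 Research Process
    3 Theoretical Framework
    3.1 Topic Modeling
    3.2 Basic Concepts of Latent Dirichlet Allocation
    3.3 Classification Approach and Adopted Theory
    3.3.1 TF-IDF Algorithm
    3.3.2 Word Cloud
    3.3.3 Cosine Distance
    3.3.4 Topic Coherence
    4 Data and Data Analysis Methods
    4.1 Data
    4.2 Data Analysis Methods
    5 Research Results
    5.1 Interpreting the Topics
    5.2 Analyzing Topic Categories
    5.3 Topic-Related Words
    5.4 Differences from General Document Classification
    5.5 TF-IDF Algorithm
    5.6 Topic Changes over Time
    5.7 Using Word Clouds to Identify Topics
    5.8 Topic Similarity
    5.9 Topic Ranking
    5.10 Topic Coherence
    6 Conclusions and Suggestions
    6.1 Conclusions
    6.2 Suggestions
    7 References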

