| Graduate student: | 林柏亦 Lin, Po-Yi |
|---|---|
| Thesis title: | 人工智慧在永續報告書漂綠資訊辨識 The Identification of Greenwashing Information in Sustainability Reports Using Artificial Intelligence |
| Advisor: | 蔡炎龍 Tsai, Yen-Lung |
| Oral defense committee: | 陳天進 Chen, Ten-Ging; 張宜武 Chang, Yi-Wu |
| Degree: | Master |
| Department: | College of Science - Department of Mathematical Sciences |
| Year of publication: | 2026 |
| Academic year of graduation: | 115 |
| Language: | English |
| Number of pages: | 83 |
| Keywords (Chinese): | 漂綠, 永續報告書, 檢索增強生成(RAG), AI Agent, ESG |
| Keywords (English): | Greenwashing, Sustainability Report, Retrieval-Augmented Generation (RAG), AI Agent, ESG |
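As climate change and sustainable development have become a focus of global attention, governments have successively required companies to disclose Environmental, Social, and Governance (ESG) information. The accompanying practice of "greenwashing", in which companies exaggerate or misrepresent their environmental contributions, damages corporate integrity and misleads investors and the public. This study designs and implements an analytical framework that integrates artificial intelligence techniques, named "SustaiNex", to systematically assess possible disclosure gaps and potential greenwashing risks in sustainability reports.
The framework comprises three core stages. First, the SustaiNex Agent parses the sustainability report PDF, dynamically splits it into semantic chunks, and extracts concrete ESG events. Second, an embedding model retrieves related coverage from an independent news database to serve as external validation data. Finally, the VerdiX and VerdiDict analysis agents judge each event against the six greenwashing types defined by Planet Tracker (Greencrowding, Greenlighting, Greenshifting, Greenlabelling, Greenrinsing, and Greenhushing) and produce an overall analysis report.
The results show that the framework can effectively integrate internal disclosures with external media data and, through a large language model (LLM), provide assessments of potential greenwashing risk that are traceable to their data sources. In addition, through the InfoBridge mechanism, the study cross-validates the model's judgments against government environmental penalty records, which preliminarily indicates that the quantitative assessment has a certain degree of interpretability and reference value. The study provides an automated monitoring tool for sustainability disclosure and offers academia and practitioners a new research perspective on greenwashing detection and the improvement of ESG information quality.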
With climate change and sustainable development becoming global priorities, governments worldwide have increasingly mandated corporate disclosure of Environmental, Social, and Governance (ESG) information. However, this has been accompanied by the rise of "greenwashing", the practice of making exaggerated or misleading claims about environmental contributions, which undermines corporate credibility and misleads investors and the public. This study designs and implements an analytical framework that integrates artificial intelligence, named "SustaiNex", to systematically assess potential greenwashing risks and disclosure gaps in sustainability reports.
The proposed framework consists of three core phases. First, the SustaiNex agent parses sustainability report PDFs, dynamically segmenting them into semantic chunks and extracting specific ESG events. Second, an embedding model retrieves relevant reports from an independent news database to serve as external verification data. Finally, the VerdiX and VerdiDict analytical agents evaluate individual events and generate a comprehensive analysis report based on the six types of greenwashing defined by Planet Tracker (Greencrowding, Greenlighting, Greenshifting, Greenlabelling, Greenrinsing, and Greenhushing).
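The second phase, retrieving news related to an extracted ESG event, can be illustrated with a minimal sketch. It is not the thesis implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and `retrieve`, `news_db`, and the sample texts are hypothetical names and data chosen only to show similarity-based retrieval.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(event: str, news: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k news items most similar to the event, best first."""
    query = embed(event)
    scored = [(cosine(query, embed(article)), article) for article in news]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical news database and ESG event (illustrative data only).
news_db = [
    "Regulator fines factory for wastewater discharge violations",
    "Company opens new solar farm to cut carbon emissions",
    "Quarterly earnings beat analyst expectations",
]
event = "The company reduced carbon emissions by installing solar panels"
results = retrieve(event, news_db, top_k=2)
for score, article in results:
    print(f"{score:.3f}  {article}")
```

In a real pipeline the count vectors would be replaced by dense vectors from an embedding model, and the retrieved articles would be passed to the analysis agents as external evidence for the event.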
The results show that the framework effectively integrates internal disclosures with external media data and uses Large Language Models (LLMs) to provide assessments of potential greenwashing risk that are traceable to their data sources. Furthermore, the study uses the InfoBridge mechanism to cross-reference model judgments with government environmental penalty records, which gives preliminary evidence that the quantitative evaluations have a degree of interpretability and reference value. This research provides an automated monitoring tool for sustainability disclosure and offers a new perspective for both academia and industry on greenwashing detection and the improvement of ESG information quality.
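One way to picture the cross-referencing step is the sketch below. It is an assumption-laden illustration, not the InfoBridge code: the `Judgment` class, the 0 to 1 `risk_score`, the `threshold`, the `cross_check` function, and the company names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    company: str
    risk_score: float  # hypothetical greenwashing risk score in [0, 1]


def cross_check(judgments: list[Judgment], penalized: set[str], threshold: float = 0.5) -> dict:
    """Compare companies flagged by the model with companies that have penalty records."""
    flagged = {j.company for j in judgments if j.risk_score >= threshold}
    confirmed = flagged & penalized
    return {
        "flagged": sorted(flagged),
        "flagged_with_penalty": sorted(confirmed),
        "penalized_not_flagged": sorted(penalized - flagged),
        # Share of flagged companies that also appear in the penalty records.
        "share_flagged_with_penalty": len(confirmed) / len(flagged) if flagged else 0.0,
    }


# Illustrative data only; not taken from the thesis.
judgments = [
    Judgment("CompanyA", 0.8),
    Judgment("CompanyB", 0.3),
    Judgment("CompanyC", 0.6),
]
penalty_records = {"CompanyA", "CompanyB"}
summary = cross_check(judgments, penalty_records)
print(summary)
```

Reporting both the overlap and the penalized companies the model did not flag keeps the comparison two-sided, so agreement with the penalty records is not overstated.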
Acknowledgements
Chinese Abstract
Abstract
Contents
List of Tables
List of Figures
1 Introduction
2 Base Techniques of Artificial Intelligence
2.1 The Neurons
2.2 Embedding
2.3 Recurrent Neural Network (RNN)
2.4 Transformer
2.4.1 The Embedding and the Positional Encoding
2.4.2 Self Attention
2.4.3 Multi-Head Attention
2.4.4 Residual Connection and Layer Normalization
2.4.5 Masked Multi-Head Attention
2.5 Large Language Model (LLM)
2.6 Prompt Engineering
2.7 Retrieval-Augmented Generation (RAG)
2.8 AI Agent
3 Greenwash
3.1 Concept of Greenwash
3.2 Greenwash Classification
4 Experiment Results and Discussion
4.1 Collecting Sustainable Report Events
4.2 Validation Data Preparation
4.3 Determine Greenwash or not
4.3.1 Input Prepare
4.3.2 Greenwashing Score
4.3.3 Greenwash Conclusion
4.4 Modification
4.4.1 CompanyF Modification
4.4.2 More Company Results
4.4.3 Results
5 Conclusion and Future Works
5.1 Conclusion
5.2 Future Works
Bibliography
Appendix A Calculate the Positional Encoding
Appendix B Masked Multi-head Attention
Appendix C Python Code
C.1 SustaiNex
C.2 Chunking
C.3 VerdiX
C.4 VerdiDict
C.5 InfoBridge