| Graduate student: | 張智昱 Chang, Chih-Yu |
|---|---|
| Thesis title: | 基於軌跡的引導式分類模型 A Trajectory-Guided Classification Model |
| Advisor: | 曾正男 Tzeng, Jeng-Nan |
| Oral defense committee: | 李永達 Li, Yung-Ta; 曾睿彬 Tseng, Jui-Pin |
| Degree: | Master |
| Department: | College of Science, Department of Mathematical Sciences |
| Year of publication: | 2025 |
| Graduation academic year: | 114 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 44 |
| Chinese keywords: | 資料轉換、分類模型、向量場、力場 |
| English keywords: | Data transformation, Classification model, Vector field, Force field |
This study proposes a trajectory-guided data transformation method that combines attractive and repulsive forces with vector fields to improve the predictive performance of classification models on complex data distributions. By simulating dynamic interaction forces among data points, the method pulls points of the same class together and pushes points of different classes apart, reshaping the spatial structure of the data so that the classes become more separable. The movement trajectories produced by this force field are then used to construct a dynamic vector field that models the behavior arising from interactions among data points. Two computational frameworks are designed for the transformation: one based on force-field simulation of attraction and repulsion, and one based on vector-field-guided structural transformation. The method is first validated on the Moons and XOR datasets from scikit-learn: the data distributions before and after transformation are compared, and baseline classifiers such as logistic regression are applied to confirm the improvement in separability. To further assess stability and effectiveness, experiments are run on the Lung Cancer dataset from the UCI Machine Learning Repository, using random forest for feature selection and LDA for dimensionality reduction, with multiple data splits for statistical testing of whether the transformation stably and significantly improves classification. The results show that, when the UCI Lung Cancer dataset is reduced to various low-dimensional spaces, the proposed transformation significantly improves classification accuracy with an SVM classifier, reaching an average accuracy of 93% when reduced to two dimensions via LDA. These findings indicate that the method performs stably and well across different dimensionalities and models.
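The two frameworks described in the abstract can be illustrated with a minimal sketch. The abstract does not state the force laws or how the vector field is fitted, so everything specific here is an assumption made for illustration: linear attraction toward same-class points, repulsion that decays as 1/distance from other-class points, and a k-nearest-neighbor regressor standing in for the vector field so that unlabeled test points can be moved. The names `force_transform` and `field` are hypothetical and do not come from the thesis.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def force_transform(X, y, steps=50, lr=0.05, eps=1e-3):
    """Move labeled points: same-class pairs attract, other-class pairs repel.

    Assumed force laws (not taken from the thesis): linear attraction,
    repulsion with magnitude roughly 1/distance, both averaged over pairs.
    """
    X = np.asarray(X, dtype=float).copy()
    same = (y[:, None] == y[None, :]).astype(float)
    other = 1.0 - same
    np.fill_diagonal(same, 0.0)  # a point exerts no force on itself
    n_same = np.maximum(same.sum(axis=1, keepdims=True), 1.0)
    n_other = np.maximum(other.sum(axis=1, keepdims=True), 1.0)
    for _ in range(steps):
        delta = X[None, :, :] - X[:, None, :]  # delta[i, j] = x_j - x_i
        dist2 = (delta ** 2).sum(axis=-1, keepdims=True) + eps
        pull = (same[:, :, None] * delta).sum(axis=1) / n_same
        push = -(other[:, :, None] * delta / dist2).sum(axis=1) / n_other
        X += lr * (pull + push)
    return X


X, y = make_moons(n_samples=400, noise=0.1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Framework 1 (assumed form): simulate forces on the labeled training set.
X_tr_moved = force_transform(X_tr, y_tr)

# Framework 2 (assumed form): learn a displacement field from the training
# trajectories' start and end points, then apply it to unlabeled test points.
field = KNeighborsRegressor(n_neighbors=5).fit(X_tr, X_tr_moved - X_tr)
X_te_moved = X_te + field.predict(X_te)

acc_before = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)
acc_after = LogisticRegression().fit(X_tr_moved, y_tr).score(X_te_moved, y_te)
print(f"logistic regression accuracy, raw: {acc_before:.3f}")
print(f"logistic regression accuracy, transformed: {acc_after:.3f}")
```

Because the force simulation needs class labels, it can only be run on training data. Fitting a field to the resulting displacements is one plausible way to carry the transformation over to test data without using test labels, and the sketch collapses each trajectory to its total displacement for brevity.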
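1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
2 Literature Review
2.1 Dynamical Systems and Machine Learning
2.2 Applications of Incorporating Dynamical Systems into Machine Learning
2.3 Inferring Dynamical Systems with Machine Learning
2.4 Observing the Effects of Dynamical Systems on Data
2.5 Data-Driven Methods Based on Vector Fields
3 Methodology
3.1 Method Design
3.2 Data Preprocessing and Data Splitting
3.2.1 Random Forest Feature Selection
3.2.2 LDA Dimensionality Reduction
3.3 Attractive and Repulsive Forces
3.4 Construction of the Vector Field
3.5 Differences and Comparison of the Method Designs
3.6 Experimental Design and Evaluation Methods
4 Experimental Results
4.1 Results of the Theoretical Experiments
4.2 Results of the Practical Experiments
5 Conclusion
References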
Full text release date: 2026/12/09