| Field | Value |
|---|---|
| Graduate student | 孫紹傑 Sun, Shao-Chieh |
| Thesis title | 自適型單層前饋式類神經網路的裁剪機制與主成分分析 The Pruning Mechanism of Adaptive Single-hidden Layer Neural Networks and Principal Component Analysis |
| Advisor | 蔡瑞煌 Tsaih, Rua-Huan |
| Oral defense committee | 黃士嘉 Huang, Shih-Chia; 周珮婷 Chou, Pei-Ting |
| Degree | Master |
| Department | College of Commerce - Department of Management Information System |
| Year of publication | 2020 |
| Academic year of graduation | 108 (ROC academic year) |
| Language | English |
| Number of pages | 62 |
| Keywords | Principal Component Analysis; Cramming, Softening and Integrating learning algorithm; Artificial Neural Network; Hidden Node Pruning |
| DOI URL | http://doi.org/10.6814/NCCU202001062 |
Overfitting remains a problem in the learning algorithms of artificial neural networks (ANNs), and so far no systematic mechanism exists for effectively identifying the irrelevant hidden nodes that could be discarded to address it. To meet this challenge, we establish a systematic Pruning Detection (PD) mechanism based on Principal Component Analysis (PCA) that reliably and effectively determines the potential irrelevant hidden nodes. The proposed mechanism, ASLFNPD, has the following characteristics: (1) it applies to adaptive single-hidden layer feed-forward neural networks (ASLFN) with the ReLU activation function on all hidden nodes; and (2) it uses PCA to help identify potential irrelevant hidden nodes. We conducted experiments and recorded the omega values generated by PCA, along with related information, to verify the effectiveness and efficiency of the proposed mechanism.
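The general idea of using PCA on hidden-node outputs to flag pruning candidates can be illustrated with a minimal sketch. This is not the ASLFNPD procedure itself: this record does not define the omega values or the decision rule, so the scoring function, the 95% variance cutoff, and the tolerance `score_tol` below are all hypothetical assumptions chosen only for illustration.

```python
import numpy as np


def hidden_activations(X, W, b):
    """ReLU outputs of a single hidden layer, shape (n_samples, n_hidden)."""
    return np.maximum(0.0, X @ W + b)


def pca_node_scores(H, var_threshold=0.95):
    """Hypothetical relevance score per hidden node (an assumption, not the
    thesis's omega): the variance-weighted sum of squared loadings of that
    node on the leading principal components."""
    Hc = H - H.mean(axis=0)                      # center each node's output
    # Rows of Vt are the principal directions of the centered activations.
    _, s, Vt = np.linalg.svd(Hc, full_matrices=False)
    var = s ** 2
    total = var.sum()
    if total == 0.0:                             # all nodes are constant
        return np.zeros(H.shape[1])
    ratio = var / total                          # explained-variance ratios
    # Keep the smallest number of components reaching the variance cutoff.
    k = int(np.searchsorted(np.cumsum(ratio), var_threshold)) + 1
    k = min(k, len(ratio))
    return (ratio[:k, None] * Vt[:k] ** 2).sum(axis=0)


def potential_irrelevant_nodes(H, var_threshold=0.95, score_tol=1e-3):
    """Indices of hidden nodes whose score falls below score_tol."""
    scores = pca_node_scores(H, var_threshold)
    return np.flatnonzero(scores < score_tol)
```

Under these assumptions, a hidden node whose ReLU output is constant across the training samples (for example, a node that never activates) receives a score near zero and is flagged, while nodes that carry a share of the activation variance are kept. Whether a flagged node can actually be removed without harming the fit would still need to be checked by retraining or re-evaluating the pruned network.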
1. INTRODUCTION
2. LITERATURE REVIEW
2.1 RECTIFIED LINEAR UNIT (RELU)
2.2 SINGLE-HIDDEN LAYER FEED-FORWARD NEURAL NETWORKS WITH ONE OUTPUT NODE
2.3 ADAPTIVE SINGLE-HIDDEN LAYER FEED-FORWARD NEURAL NETWORKS (ASLFN)
2.4 THE CRAMMING, SOFTENING AND INTEGRATING LEARNING ALGORITHM
2.5 OVERFITTING
2.6 PRINCIPAL COMPONENT ANALYSIS
3. METHODOLOGY
4. EXPERIMENT DESIGN
5. EXPERIMENT RESULT
6. SUMMARY AND FUTURE WORK
APPENDIX
REFERENCE