Graduate Student: Fang, Kai-Rou (方凱柔)
Thesis Title: Applying multiple GPUs and PyTorch commands for parallel processing to accelerate the execution speed of learning algorithms
Advisors: Tsaih, Rua-Huan (蔡瑞煌); Lin, Yi-Ling (林怡伶)
Oral Defense Committee: Chou, Cheng-Fu (周承復)
Degree: Master
Department: Department of Management Information Systems, College of Commerce
Publication Year: 2024
Graduation Academic Year: 112 (ROC calendar; 2023–2024)
Language: English
Number of Pages: 41
Keywords: Learning algorithm, Adaptive neural networks, PyTorch, Data parallelism
Abstract:

    Research on the parallel processing and multi-GPU application of two-layer adaptive neural networks (2LANN) is relatively scarce. This study explores the data parallel processing of 2LANN by leveraging the PyTorch framework and its related commands in combination with multiple GPUs. The Pupil Learning Mechanism algorithm (Tsaih et al., 2023) is employed to achieve more efficient computation in a multi-GPU environment. Based on a copper price prediction dataset, a series of experiments is conducted to validate this approach and to analyze the impact of multi-GPU parallel processing on model training speed and accuracy, so as to comprehensively evaluate the practical effectiveness and application value of the proposed method. The study is expected to provide a simple parallel processing module that enables future research on 2LANN to apply parallel processing quickly and easily.
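
    To make the record concrete, the two PyTorch wrappers compared in the thesis (Sections 2.3.1 and 2.3.2 of the table of contents below) can be sketched as follows. These are minimal, hypothetical sketches, not the thesis's actual 2LANN / Pupil Learning Mechanism implementation; names such as TwoLayerNet, run_worker, in_dim, and hidden_dim are illustrative assumptions. torch.nn.DataParallel keeps a single process and splits each mini-batch across the visible GPUs:

        import torch
        import torch.nn as nn

        class TwoLayerNet(nn.Module):
            """Generic two-layer feed-forward network (a stand-in for 2LANN)."""
            def __init__(self, in_dim: int, hidden_dim: int):
                super().__init__()
                self.hidden = nn.Linear(in_dim, hidden_dim)
                self.output = nn.Linear(hidden_dim, 1)

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return self.output(torch.relu(self.hidden(x)))

        device = "cuda" if torch.cuda.is_available() else "cpu"
        model = TwoLayerNet(in_dim=8, hidden_dim=16).to(device)
        if torch.cuda.device_count() > 1:
            # DataParallel replicates the module on each visible GPU, scatters
            # every input batch along dim 0, and gathers outputs on device 0.
            model = nn.DataParallel(model)

        x = torch.randn(64, 8, device=device)  # one mini-batch of 64 samples
        y_hat = model(x)                       # forward pass split across GPUs

    torch.nn.parallel.DistributedDataParallel instead runs one process per GPU, each holding a full replica of the model and averaging gradients via all-reduce. A single-node sketch under the same assumptions:

        import os
        import torch
        import torch.distributed as dist
        import torch.multiprocessing as mp
        import torch.nn as nn
        from torch.nn.parallel import DistributedDataParallel as DDP

        def run_worker(rank: int, world_size: int) -> None:
            # Hypothetical single-node rendezvous settings.
            os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
            os.environ.setdefault("MASTER_PORT", "29500")
            dist.init_process_group("nccl", rank=rank, world_size=world_size)
            torch.cuda.set_device(rank)

            model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1)).to(rank)
            ddp_model = DDP(model, device_ids=[rank])  # gradients sync via all-reduce

            # ... training loop; a DataLoader backed by a DistributedSampler
            # would shard the dataset so each process sees a disjoint part ...

            dist.destroy_process_group()

        if __name__ == "__main__":
            world_size = torch.cuda.device_count()
            mp.spawn(run_worker, args=(world_size,), nprocs=world_size)

    Per the PyTorch documentation cited in the references, DistributedDataParallel is generally preferred over DataParallel for multi-GPU training, since the one-process-per-GPU design avoids per-batch model replication and Python GIL contention.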

    CHAPTER 1. INTRODUCTION 1

    CHAPTER 2. LITERATURE REVIEW 4
    2.1 PUPIL LEARNING MECHANISM (TSAIH ET AL., 2023) 4
    2.2 PYTORCH 7
    2.3 DATA PARALLEL PROCESSING IN PYTORCH 8
    2.3.1 DATAPARALLEL 8
    2.3.2 DISTRIBUTEDDATAPARALLEL 10

    CHAPTER 3. RESEARCH METHODOLOGY 13
    3.1 ALGORITHM OF RPLM 13
    3.2 PARALLEL PROCESSING 19
    3.2.1 ORGANIZING MODULE WITH DATAPARALLEL 20
    3.2.2 ORGANIZING MODULE WITH DISTRIBUTEDDATAPARALLEL 21

    CHAPTER 4. EXPERIMENT DESIGN 24
    4.1 DATASET 24
    4.2 EXPERIMENT EVALUATION 26

    CHAPTER 5. EXPERIMENT RESULTS 28
    5.1 TRAINING TIME 28
    5.1.1 TRAINING TIME OF THE UNDERSTANDING MODULE OF ORGANIZING MODULE 28
    5.1.2 TRAINING TIME OF ORGANIZING MODULE 30
    5.1.3 OVERALL TRAINING TIME 32
    5.2 MAE RESULTS 33

    CHAPTER 6. CONCLUSION AND FUTURE WORK 36
    6.1 CONCLUSION 36
    6.2 LIMITATION AND FUTURE WORK 37

    REFERENCES 39

    Bahrampour, S., Ramakrishnan, N., Schott, L., & Shah, M. (2015). Comparative Study of Deep Learning Software Frameworks. arXiv preprint arXiv:1511.06435.
    DataParallel — PyTorch 2.2 documentation. (2024). Retrieved March 6, 2024, from https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html
    Distributed communication package - torch.distributed — PyTorch 2.3 documentation. (2024). Retrieved June 3, 2024, from https://pytorch.org/docs/stable/distributed.html
    Distributed Data Parallel — PyTorch 2.3 documentation. (2024). Retrieved June 2, 2024, from https://pytorch.org/docs/stable/notes/ddp.html
    Distributed data parallel training using Pytorch on AWS | Telesens. (2019). Retrieved May 31, 2024, from https://www.telesens.co/2019/04/04/distributed-data-parallel-training-using-pytorch-on-aws/
    DistributedDataParallel — PyTorch 2.3 documentation. (2024). Retrieved June 1, 2024, from https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html
    Fan, S., Rong, Y., Meng, C., Cao, Z., Wang, S., Zheng, Z., Wu, C., Long, G., Yang, J., Xia, L., Diao, L., Liu, X., & Lin, W. (2021). DAPPLE: A pipelined data parallel approach for training large models. Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP, 431–445.
    Geng, J., Li, D., & Wang, S. (2019). ElasticPipe: An efficient and dynamic model-parallel solution to DNN training. ScienceCloud 2019 - Proceedings of the 10th Workshop on Scientific Cloud Computing, Co-Located with HPDC 2019, 5–9.
    Hara, K., Saito, D., & Shouno, H. (2015). Analysis of function of rectified linear unit used in deep learning. 2015 International Joint Conference on Neural Networks (IJCNN), 1–8.
    Harlap, A., Narayanan, D., Phanishayee, A., Seshadri, V., Devanur, N., Ganger, G., & Gibbons, P. (2018). PipeDream: Fast and Efficient Pipeline Parallel DNN Training. arXiv preprint arXiv:1806.03377.
    Ketkar, N., & Moolayil, J. (2021). Deep Learning with Python: Learn Best Practices of Deep Learning Models with PyTorch. Apress. https://doi.org/10.1007/978-1-4842-5364-9
    Khomenko, V., Shyshkov, O., Radyvonenko, O., & Bokhan, K. (2016). Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization. 2016 IEEE First International Conference on Data Stream Mining & Processing (DSMP), 100–103.
    Krizhevsky, A. (2014). One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997.
    Lee, S., Kang, Q., Madireddy, S., Balaprakash, P., Agrawal, A., Choudhary, A., Archibald, R., & Liao, W. (2019). Improving Scalability of Parallel CNN Training by Adjusting Mini-Batch Size at Run-Time. 2019 IEEE International Conference on Big Data (Big Data), 830–839.
    Nguyen, T. D. T., Park, J. H., Hossain, M. I., Hossain, M. D., Lee, S.-J., Jang, J. W., Jo, S. H., Huynh, L. N. T., Tran, T. K., & Huh, E.-N. (2019). Performance Analysis of Data Parallelism Technique in Machine Learning for Human Activity Recognition Using LSTM. 2019 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), 387–391.
    Optional: Data Parallelism — PyTorch Tutorials 2.2.0+cu121 documentation. (2024). Retrieved February 21, 2024, from https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html
    Owens, J. D., Houston, M., Luebke, D., Green, S., Stone, J. E., & Phillips, J. C. (2008). GPU Computing. Proceedings of the IEEE, 96(5), 879–899.
    Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in PyTorch. NIPS 2017 Workshop on Autodiff.
    Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., … Chintala, S. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. arXiv preprint arXiv:1912.01703.
    Pérez-Sánchez, B., Fontenla-Romero, O., & Guijarro-Berdiñas, B. (2018). A review of adaptive online learning for artificial neural networks. Artificial Intelligence Review, 49(2), 281–299.
    PyTorch Distributed Overview — PyTorch Tutorials 2.3.0+cu121 documentation. (2024). Retrieved June 1, 2024, from https://pytorch.org/tutorials/beginner/dist_overview.html
    Ren-Han, Y. (2022). An adaptive learning-based model for copper price forecasting. Master's thesis, Department of Information Management, National Chengchi University, 1–78.
    Sanders, J., Kandrot, E., & Jacoboni, E. (2011). CUDA par l’exemple [une introduction à la programmation parallèle de GPU]. Pearson.
    torch.nn — PyTorch 2.2 documentation. (2024). Retrieved March 5, 2024, from https://pytorch.org/docs/stable/nn.html
    torch.nn.parallel.data_parallel — PyTorch 2.3 documentation. (2024). Retrieved June 2, 2024, from https://pytorch.org/docs/stable/_modules/torch/nn/parallel/data_parallel.html
    torch.utils.data — PyTorch 2.3 documentation. (2024). Retrieved June 3, 2024, from https://pytorch.org/docs/stable/data.html#single-and-multi-process-data-loading
    Tsai, Y.-H., Jheng, Y.-J., & Tsaih, R.-H. (2019). The Cramming, Softening and Integrating Learning Algorithm with Parametric ReLU Activation Function for Binary Input/Output Problems. 2019 International Joint Conference on Neural Networks (IJCNN), 1–7.
    Tsaih, R. R. (1998). An Explanation of Reasoning Neural Networks. Mathematical and Computer Modelling, 28(2).
    Tsaih, R.-H., Chien, Y.-H., & Chien, S.-Y. (2023). Pupil Learning Mechanism. arXiv preprint arXiv:2307.16141.

    Full Text Release Date: 2029/07/14