深度強化學習於二維裝箱問題的規劃策略｜國立政治大學博碩士論文系統

簡易檢索 / 詳目顯示

回結果列表

研究生：	詹凱傑 Chan, Kai-Jie
論文名稱：	深度強化學習於二維裝箱問題的規劃策略 Deep Reinforcement Learning Strategies for Planning in Two-Dimensional Bin Packing Problems
指導教授：	李蔡彥 Li, Tsai-Yen
口試委員:	劉遠楨 Liu, Yuan-Chen 廖文宏 Liao, Wen-Hung
學位類別：	碩士 Master
系所名稱：	資訊學院 - 資訊科學系碩士在職專班 Excutive Master Program of Computer Science
論文出版年：	2025
畢業學年度：	113
語文別：	中文
論文頁數：	91
中文關鍵詞：	強化學習、深度強化學習、二維裝箱問題
外文關鍵詞：	Two-dimensional Bin Packing Problem, 2DBP
相關次數：	點閱：25 下載：0
分享至:	分享至facebook 分享至twitter

查詢本校圖書館目錄查詢臺灣博碩士論文知識加值系統勘誤回報

二維裝箱問題是典型的NP-hard組合最佳化問題，廣泛應用於製造業和物流業。傳統啟發式演算法運算效率高但缺乏適應性，深度強化學習具備學習能力但容易陷入決策困境。本研究提出創新的混合策略方法，將深度強化學習與First-Fit演算法相結合。研究設計基於雙流網路架構的深度強化學習模型，將複雜動作空間分解為物品選擇和位置決策兩個子問題，採用多層次獎勵機制和專家引導策略提升學習效果。實驗採用三種代表性資料集評估六種演算法。結果顯示，DRL + First-Fit混合策略在測試的資料集上，基於物品放置率和空間利用率兩個重要指標上，比傳統最佳演算法皆有更好的綜合表現，且明顯改善純深度強化學習的密集空間放置不穩定問題。
本研究的貢獻包括提出深度強化學習與傳統演算法的創新混合策略、設計完整的二維裝箱問題解決系統及建立標準化評估框架。實驗結果驗證了混合策略的有效性，也為相關問題提供方法論的參考。

The two-dimensional bin-packing problem represents a typical NP-hard combinatorial op-timization challenge with extensive applications in manufacturing and logistics industries. While traditional heuristic algorithms demonstrate high computational efficiency, they lack adaptability. On the contrary, deep reinforcement learning has strong learning capabilities but tends to en-counter decision-making dilemmas. This research proposes an innovative hybrid strategy that combines deep reinforcement learning with the First-Fit algorithm.
The study designs a deep reinforcement learning model based on a dual-stream network architecture, decomposing the complex action space into two sub-problems: item selection and position decision-making. The system employs multi-level reward mechanisms and expert guidance strategies to enhance learning effectiveness.
The experiments utilize three representative datasets to evaluate six algorithms. The results demonstrate that the DRL+First-Fit hybrid strategy achieves superior comprehensive perfor-mance compared to the best traditional algorithms in the tested datasets, based on two critical metrics: item placement rate and space utilization rate. Furthermore, the approach significantly addresses the instability issues of pure deep reinforcement learning in dense spatial arrangement scenarios.
Our research contributions include proposing an innovative hybrid strategy that combines deep reinforcement learning with traditional algorithms, designing a comprehensive two-dimensional bin packing solution system, and establishing a standardized evaluation framework. Experimental results validate the effectiveness of the hybrid strategy, providing methodological references for related fields.

第一章緒論 1
第一節研究動機 1
第二節研究背景 2
第三節研究目標 3
第四節研究挑戰 4
第五節論文貢獻 5
第六節論文架構 6

第二章相關研究 7
第一節技術背景 7
第二節相關研究 9
第三節本章結語 14

第三章系統架構設計 15
第一節問題定義與系統目標 15
第二節系統架構設計 20
第三節深度強化學習模型設計 25
第四節混合演算法策略設計 34
第五節本章結語 37

第四章系統實作 38
第一節實作環境與技術選擇 38
第二節核心元件實作 40
第三節關鍵技術挑戰與解決方案 48
第四節實驗系統實作 52
第五節本章結語 61

第五章實驗設計與結果分析 62
第一節實驗目標 62
第二節實驗環境與設置 62
第三節實驗結果 66
第四節實驗結果分析與討論 75
第五節本章結語 83

第六章結論與未來研究 84
第一節研究總結 84
第二節研究限制 87
第三節未來研究方向 88
第四節研究總結 89

參考文獻 90

[1] Lodi, A., Martello, S., & Vigo, D. (2002). Recent advances on two-dimensional bin packing problems. Discrete Applied Mathematics, 123(1-3), 379-396.
[2] Martello, S., & Vigo, D. (1998). Exact solution of the two-dimensional finite bin packing problem. Management science, 44(3), 388-399.
[3] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ... & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.
[4] Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., ... & Hassabis, D. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354-359.
[5] Berkey, J. O., & Wang, P. Y. (1987). Two-dimensional finite bin-packing algorithms. Journal of the operational research society, 38(5), 423-429.
[6] Burke, E. K., Kendall, G., & Whitwell, G. (2006). A new placement heuristic for the or-thogonal stock-cutting problem. Operations Research, 54(4), 655-671.
[7] Hopper, E., & Turton, B. C. (2001). A review of the application of meta-heuristic algo-rithms to 2D strip packing problems. Artificial Intelligence Review, 16(4), 257-300.
[8] Terashima-Marín, H., Ross, P., Farías-Zárate, C., López-Camacho, E., & Valenzue-la-Rendón, M. (2008). Generalized hyper-heuristics for solving 2D Regular and Irregular Packing Problems. Annals of Operations Research, 179(1), 369-392.
[9] Bennell, J. A., Lee, L. S., & Potts, C. N. (2013). A genetic algorithm for two-dimensional bin packing with due dates. International Journal of Production Economics, 145(2), 547-560.
[10] Gomez, J. C., & Terashima-Marín, H. (2017). Evolutionary hyper-heuristics for tackling bi-objective 2D bin packing problems. Genetic Programming and Evolvable Machines, 18(3), 221-259.
[11] Bouganis, A., & Shanahan, M. (2007). A vision-based intelligent system for packing 2-D irregular shapes. IEEE Transactions on Automation Science and Engineering, 4(3), 382-394.
[12] Wang, L., Cai, J., & Zhao, X. (2024). Bin Packing Optimization via Deep Reinforcement Learning. arXiv:2403.12420.
[13] López-Camacho, E., Terashima-Marín, H., Ross, P., & Ochoa, G. (2013). A unified hy-per-heuristic framework for solving bin packing problems. Expert Systems with Applica-tions, 40(13), 5516-5531.
[14] Guo, H., Li, H., Shen, Y., & Wang, X. (2022). Two-dimensional irregular packing prob-lems: A review of mathematical models, solution approaches and applications. European Journal of Operational Research, 298(1), 1-20.

全文公開日期 2030/07/29

簡易檢索 / 詳目顯示

相關論文