| Author: | 黃莎涴 Sha-Wan Huang |
|---|---|
| Thesis title: | 基於交叉注意力合成之二曝光影像融合 (Two-Exposure Image Fusion Based on Cross Attention Fusion) |
| Advisor: | 彭彥璁 Peng, Yan-Tsung |
| Committee members: | 彭彥璁 Peng, Yan-Tsung; 廖文宏 Liao, Wen-Hung; 陳柏豪 Chen, Bo-Hao |
| Degree: | Master |
| Department: | College of Science, Department of Computer Science |
| Year of publication: | 2021 |
| Graduating academic year: | 110 |
| Language: | Chinese |
| Pages: | 36 |
| Keywords (Chinese): | 高動態範圍成像、兩曝光影像融合 |
| Keywords (English): | High Dynamic Range imaging, Two-exposure image fusion |
| DOI: | http://doi.org/10.6814/NCCU202101538 |
High Dynamic Range (HDR) imaging fuses images of the same scene captured at multiple exposure levels to cover the scene's full dynamic range. Doing so with only a few low dynamic range (LDR) images remains a challenging task. This thesis presents a novel two-exposure image fusion model featuring the proposed Cross Attention Fusion Module (CAFM), which uses the well-exposed regions of one image to compensate for content the other image loses to under- or over-exposure. The CAFM combines Cross Attention Fusion and Channel Attention Fusion in a dual-branch design to produce superior fusion results. Extensive experiments on public benchmark HDR datasets demonstrate that the proposed model performs favorably against state-of-the-art image fusion methods.
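The abstract describes a dual-branch design in which per-channel attention decides how much each exposure contributes to the fused features. As a rough illustrative sketch only (not the thesis's actual CAFM implementation; the function name and weighting scheme are assumptions), a channel-attention-style fusion of two exposure branches could be written as:

```python
import numpy as np

def channel_attention_fuse(f1, f2):
    """Fuse two feature maps of shape (C, H, W) with per-channel weights.

    Hypothetical sketch: global-average-pool each branch into a channel
    descriptor, softmax the two descriptors per channel, and blend the
    branches channel by channel with the resulting weights.
    """
    d1 = f1.mean(axis=(1, 2))            # (C,) channel descriptor, branch 1
    d2 = f2.mean(axis=(1, 2))            # (C,) channel descriptor, branch 2
    e1, e2 = np.exp(d1), np.exp(d2)      # softmax over the two branches
    w1 = e1 / (e1 + e2)                  # per-channel weight for branch 1
    w2 = e2 / (e1 + e2)                  # per-channel weight for branch 2
    return w1[:, None, None] * f1 + w2[:, None, None] * f2

# Toy example: branch 1 has stronger activations in channel 0,
# branch 2 in channel 1, so each channel leans toward its stronger branch.
f1 = np.ones((2, 4, 4)); f1[0] *= 3.0
f2 = np.ones((2, 4, 4)); f2[1] *= 3.0
fused = channel_attention_fuse(f1, f2)
```

In a learned model these weights would come from trained layers (e.g. a squeeze-and-excitation-style bottleneck) rather than a fixed softmax over pooled means; the sketch only shows the data flow of channel-wise weighting between two branches.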
Table of Contents

- Abstract (Chinese)
- Abstract (English)
- Table of Contents
- List of Figures
- List of Tables
- 1 Introduction
  - 1.1 Background and Motivation
  - 1.2 Research Objectives
  - 1.3 Thesis Organization
- 2 Technical Background and Related Work
  - 2.1 HDR Image Fusion Based on Traditional Image Processing
  - 2.2 HDR Image Fusion Based on Deep Learning
  - 2.3 Attention Mechanisms: Overview and Recent Progress
  - 2.4 Summary
- 3 Methodology
  - 3.1 HDR Image Generation
    - 3.1.1 Strip Pooling Attention
    - 3.1.2 Cross Attention Fusion (XAF)
    - 3.1.3 Channel Attention Fusion (CAF)
  - 3.2 Loss Function
    - 3.2.1 Loss Function
  - 3.3 Dataset
  - 3.4 Training Setup
  - 3.5 Fusion Evaluation Metrics
- 4 Experimental Results and Analysis
- 5 Ablation Study
- 6 Conclusion and Future Work
- References
Full text publicly available from: 2026/08/29