| Author: | 黃政嘉 Huang, Jia-Jheng |
|---|---|
| Thesis title: | Deepfake與GAN真偽人臉圖像統計分析 (Statistical Analysis of Synthetic Images: Deepfake and GAN) |
| Advisors: | 余清祥 Yue, Ching-Syang; 陳怡如 Chen, Yi-Ju |
| Oral defense committee: | 林怡伶 Lin, Yi-Ling; 陳春樹 Chen, Chun-Shu; 魏裕中 Wei, Yu-Zhong |
| Degree: | Master (碩士) |
| Department: | College of Commerce, Department of Statistics (商學院 - 統計學系) |
| Year of publication: | 2024 |
| Academic year of graduation: | 112 (ROC calendar) |
| Language: | Chinese |
| Pages: | 81 |
| Chinese keywords: | 影像辨識、統計分析、維度縮減、Sobel梯度、資料依賴性 |
| English keywords: | Image Recognition, Statistical Analysis, Dimensionality Reduction, Sobel Gradient, Data Dependency |
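With the development of deep learning, artificial intelligence can now generate highly realistic images. Deepfake videos and Generative Adversarial Networks (GANs) are well-known examples, and there are even models that generate video from text. These near-indistinguishable forgeries threaten information security and personal privacy, and telling real images from computer-generated ones has become a popular research topic. Unlike deep learning approaches, which evaluate models by accuracy, this thesis uses the characteristics of image data, together with statistical theory and data-analysis concepts, as the basis for distinguishing real from computer-generated face images.
We hypothesize that Deepfake images and GAN-generated images have local texture defects: the former tend to be over-smoothed, while the latter show oil-painting-like texture distortion. These differences can be detected with methods such as Sobel gradient computation and first-order differences. We apply these methods to image data in nine color spaces, including RGB, and use summary statistics of the results (mean, variance, outliers) as explanatory variables in statistical-learning and machine-learning classifiers that decide whether an image is computer-generated. The analysis finds that the proposed statistical approach is no less accurate than deep learning models while using markedly fewer explanatory variables, provided an appropriate image split is chosen. For PGGAN images, for example, the first-order difference reaches about 95% accuracy when the split number is large, and Sobel reaches 99% accuracy at a split of 64×64. Model accuracy also shows clear data dependency, especially on the GAN datasets: accuracy is 99% only on PGGAN and drops to around 20% on StyleGAN/StyleGAN2, but when other real data are compared with generated data, accuracy exceeds 90%.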
With advances in deep learning, artificial intelligence can generate highly realistic images; Deepfake videos and images produced by Generative Adversarial Networks (GANs) are well-known examples. These technologies threaten information security and personal privacy, making the separation of real from computer-generated images a critical research topic. Rather than relying on deep learning models judged by accuracy, this thesis distinguishes real from computer-generated facial images using image characteristics, statistical theory, and data-analysis concepts. We posit that Deepfake and GAN-generated images exhibit distinct local texture defects: Deepfake images are excessively smooth, while GAN images show painting-like distortions. These defects can be detected with methods such as the Sobel gradient (Liu et al., 2023) and the first-order difference (Chen Huishuang, 2023). This study applies these methods to nine color spaces, including RGB, and uses summary statistics (mean, variance, outlier ratio) as explanatory variables in statistical and machine learning classification models.
Our analysis shows that the proposed statistical classification method achieves accuracy comparable to deep learning models with fewer explanatory variables, provided the image is split appropriately. For PGGAN images, the first-order difference reaches about 95% accuracy at larger split numbers, while the Sobel method reaches 99% accuracy at a split number of 64. Model accuracy also depends strongly on the data, particularly for the GAN datasets: a model built on the original training dataset performs well only on PGGAN and markedly worse on StyleGAN/StyleGAN2 data. However, when other real and generated datasets are compared, accuracy exceeds 90%.
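To make the feature construction concrete, the following is a minimal sketch of how Sobel gradient magnitudes and first-order differences might be reduced to the three summary statistics named above. It is not the thesis's implementation. The function names, the single grayscale channel, the horizontal-only difference, and the z-score cutoff of 2 for outliers are all assumptions made for illustration, and the code uses only the Python standard library so that it runs without extra packages.

```python
import math
from statistics import mean, pvariance

# Standard 3x3 Sobel kernels for horizontal and vertical intensity change.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(img):
    """Gradient magnitude at every interior pixel of a 2-D list of intensities."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(SOBEL_X[i][j] * img[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * img[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            out.append(math.hypot(gx, gy))
    return out


def first_difference(img):
    """Absolute difference between horizontally adjacent pixels (assumed direction)."""
    return [abs(row[c + 1] - row[c]) for row in img for c in range(len(row) - 1)]


def summarize(values, z=2.0):
    """Mean, variance, and share of values more than z standard deviations from the mean.

    The cutoff z=2.0 is a hypothetical choice, not a value taken from the thesis.
    """
    m = mean(values)
    v = pvariance(values, m)
    sd = math.sqrt(v)
    outliers = sum(1 for x in values if sd > 0 and abs(x - m) > z * sd)
    return {"mean": m, "variance": v, "outlier_ratio": outliers / len(values)}


# A perfectly smooth patch versus a patch containing one vertical edge.
flat = [[10] * 5 for _ in range(5)]
edge = [[0, 0, 100, 100, 100] for _ in range(5)]

print(summarize(sobel_magnitude(flat)))
print(summarize(sobel_magnitude(edge)))
print(summarize(first_difference(edge)))
```

The smooth patch yields zero for every statistic, while the edge patch yields a nonzero mean and variance. This is the kind of contrast a classifier could exploit if generated faces are smoother or more distorted than real ones.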
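The role of the split number can be illustrated with a small sketch. It assumes that a split number k means cutting the image into a k × k grid of equal blocks and computing one statistic per block; the abstract does not spell out the exact procedure, so the function names and the choice of statistic (mean absolute horizontal first difference) are hypothetical.

```python
from statistics import mean


def split_blocks(img, k):
    """Partition a 2-D list into a k x k grid of equal blocks, in row-major order."""
    h, w = len(img), len(img[0])
    if h % k or w % k:
        raise ValueError("image sides must be divisible by k")
    bh, bw = h // k, w // k
    return [
        [row[c:c + bw] for row in img[r:r + bh]]
        for r in range(0, h, bh)
        for c in range(0, w, bw)
    ]


def block_features(img, k):
    """One feature per block: mean absolute horizontal first difference."""
    feats = []
    for block in split_blocks(img, k):
        diffs = [abs(row[c + 1] - row[c])
                 for row in block for c in range(len(row) - 1)]
        feats.append(mean(diffs) if diffs else 0.0)
    return feats


# Left half is smooth, right half is textured.
img = [
    [5, 5, 0, 9],
    [5, 5, 9, 0],
    [5, 5, 0, 9],
    [5, 5, 9, 0],
]

print(block_features(img, 1))  # one global value: [4.5]
print(block_features(img, 2))  # one value per block: [0, 9, 0, 9]
```

With no split, the smooth and textured halves blend into a single average. With a 2 × 2 split, the per-block values separate them, which is one plausible reason the choice of split matters for accuracy.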
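Chapter 1 Introduction
Section 1 Research Motivation
Section 2 Research Objectives
Chapter 2 Literature Review
Section 1 Review of Prior Work
Section 2 Data Description and Characteristics
Section 3 Data Sampling and Balancing
Chapter 3 Methodology
Section 1 Structuring Image Data
Section 2 Color Spaces
Section 3 Computing Gradient Variation
Section 4 Dimensionality Reduction
Section 5 Model Overview
Chapter 4 Exploratory Data Analysis
Section 1 Comparison of Outlier Variables
Section 2 Exploratory Data Analysis
Section 3 Feature Variable Selection
Section 4 Optimal Split Number
Chapter 5 Model Analysis and Comparison
Section 1 Comparison with Deep Learning Models
Section 2 GAN Dataset Dependency
Section 3 Comparison of Gradient Computation Methods
Chapter 6 Conclusions and Limitations
Section 1 Conclusions
Section 2 Limitations and Recommendations
References
Appendices
Appendix 1. Visualization Results
Appendix 2. Model Accuracy with 0.2/0.8 Outliers and Z-score Outliers
Appendix 3. Accuracy by Split Number for All Datasets
Appendix 4. Accuracy by Feature for All Datasets (Unsplit)
Appendix 5. Deep Learning Loss and Accuracy on the GAN Datasets
Appendix 6. Accuracy Across GAN-Related Datasets (Model Comparison)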
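References
I. Chinese-Language References
[1]陳慧霜 (Chen Huishuang) (2023). “Image Analysis and the Detection of Deepfake Videos” (影像分析與深偽影片的偵測), thesis, Department of Statistics, National Chengchi University.
[2]李俊毅 (2022). “Deepfake Video Forensics with Few Training Samples” (基於少量訓練樣本下之深偽視訊鑑識技術), thesis, Department of Management Information Systems, National Pingtung University of Science and Technology.
[3]許志仲 (2022). “A Study on Solving Deepfake Image and Video Detection in the Real World” (解決真實世界底下的深偽影視訊偵測問題之研究), thesis, Department of Statistics, National Cheng Kung University.
[4]莊易修 (2017). “Pairwise Learning Applied to Forged Image/Video Detection” (成對學習應用於偽造影像/視訊偵測), thesis, Department of Management Information Systems, National Pingtung University of Science and Technology.
[5]李昕 (2021). “Hierarchical Image Augmentation Combining Generative Adversarial Networks and Domain Knowledge” (融合生成對抗網路及領域知識的分層式影像擴增), thesis, Department of Computer Science and Information Engineering, National Central University.
II. English-Language References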
[1]Breiman, L. (2001). “Random Forests”, Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324.
[2]Chen, B., Liu, X., Zheng, Y., Zhao, G., & Shi, Y.-Q. (2022). “A Robust GAN-Generated Face Detection Method Based on Dual-Color Spaces and an Improved Xception”, IEEE Transactions on Circuits and Systems for Video Technology, 32(6).
[3]Cozzolino, D., Gragnaniello, D., Poggi, G., & Verdoliva, L. (2021). “Towards Universal GAN Image Detection”, arXiv preprint arXiv:2112.12606.
[4]Dong, C., Kumar, A., & Liu, E. (2022). “Think Twice before Detecting GAN-generated Fake Images from Their Spectral Domain Imprints”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5]Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2021). “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale”, in International Conference on Learning Representations (ICLR).
[6]Durall, R., Keuper, M., Pfreundt, F.-J., & Keuper, J. (2020). “Unmasking Deep-Fakes with Simple Features”, arXiv preprint arXiv:1911.00686v3.
[7]Fu, T., Xia, M., & Yang, G. (2022). “Detecting GAN-generated Face Images Via Hybrid Texture and Sensor Noise Based Features”, Multimedia Tools and Applications, 81, 26345–26359.
[8]Fu, Y., Sun, T., Jiang, X., Xu, K., & He, P. (2019). “Robust GAN-Face Detection Based on Dual-Channel CNN Network”, in 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI).
[9]Giudice, O., Guarnera, L., & Battiato, S. (2021). “Fighting Deepfakes by Detecting GAN DCT Anomalies”, Journal of Imaging, 7(8), 128.
[10]Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). “Generative Adversarial Nets”, Advances in Neural Information Processing Systems, 27.
[11]Gragnaniello, D., Cozzolino, D., Marra, F., Poggi, G., & Verdoliva, L. (2021). “Are GAN Generated Images Easy to Detect? A Critical Analysis of the State-of-the-art”, in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), 1-6.
[12]He, K., Zhang, X., Ren, S., & Sun, J. (2016). “Deep Residual Learning for Image Recognition”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778.
[13]Hu, S., Li, Y., & Lyu, S. (2021). “Exposing GAN-generated Faces Using Inconsistent Corneal Specular Highlights”, in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
[14]Huang, Y., Juefei-Xu, F., Guo, Q., Liu, Y., & Pu, G. (2022). “FakeLocator: Robust Localization of GAN-Based Face Manipulations”, IEEE Transactions on Information Forensics and Security, 17.
[15]Kanopoulos, N., Vasanthavada, N., & Baker, R. L. (1988). “Design of an Image Edge Detection Filter Using the Sobel Operator”, IEEE Journal of Solid-State Circuits, 23(2).
[16]Karras, T., Aila, T., Laine, S., & Lehtinen, J. (2018). “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, in International Conference on Learning Representations (ICLR).
[17]Karras, T., Laine, S., & Aila, T. (2019). “A Style-based Generator Architecture for Generative Adversarial Networks”, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 4401-4410.
[18]Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020). “Analyzing and Improving the Image Quality of StyleGAN”, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 8110-8119.
[19]Li, H., Li, B., Tan, S., & Huang, J. (2020). “Identification of Deep Network Generated Images Using Disparities in Color Components”, Signal Processing, 174.
[20]Li, J., He, H., Huang, Z., & Zhang, C. (2013). “Exposing Computer Generated Images by Using Deep Convolutional Neural Networks”, IEEE Transactions on Information Forensics and Security, 14(5), 1180-1190.
[21]Li, Z., Ye, J., & Shi, Y. Q. (2013). “Distinguishing Computer Graphics from Photographic Images Using Local Binary Patterns”, in Proceedings of the 11th International Conference on Digital Forensics and Watermarking.
[22]Liu, Y., Wan, Z., Yin, X., Yue, G., Tan, A., & Zheng, Z. (2023). “Detection of GAN Generated Image Using Color Gradient Representation”, Journal of Visual Communication and Image Representation.
[23]Liu, Z., Qi, X., & Torr, P. H. S. (2020). “Global Texture Enhancement for Fake Face Detection in the Wild”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 8060-8069.
[24]Liu, Y., & Cheng, L. (2010). “Robust and Fast Blind Image Watermarking Based on DCT Domain”, Pattern Recognition, 43(5), 1763-1772.
[25]Nguyen, T. T., & Huynh, T. (2012). “A Novel Image Watermarking Scheme Based on Visual Cryptography”, Journal of Visual Communication and Image Representation, 23(7), 1120-1132.
[26]Odena, A., Dumoulin, V., & Olah, C. (2016). “Deconvolution and Checkerboard Artifacts”, Distill. https://doi.org/10.23915/distill.00003.
[27]Qiao, T., Chen, Y., Zhou, X., Shi, R., Shao, H., Shen, K., & Luo, X. (2023). “CSC-Net: Cross-color Spatial Co-occurrence Matrix Network for Detecting Synthesized Fake Images”, IEEE Transactions on Cognitive and Developmental Systems.
[28]Pu, J., Mangaokar, N., Wang, B., Reddy, C. K., & Viswanath, B. (2020). “NoiseScope: Detecting Deepfake Images in a Blind Setting”, in Annual Computer Security Applications Conference (ACSAC 2020).
[29]Rhee, K. H. (2020). “Detection of Spliced Image Forensics Using Texture Analysis of Median Filter Residual”, IEEE Access, 8, 103374-103384.
[30]Singhal, P., Raj, S., Mathew, J., & Mondal, A. (2022). “Frequency Spectrum with Multi-head Attention for Face Forgery Detection”, in International Conference on Neural Information Processing (ICONIP 2022), 200-211.
[31]Tan, Y., Liu, S., Huang, Z., & Chen, C. (2023). “Learning on Gradients: Generalized Artifacts Representation for GAN-generated Images Detection”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1-10.
[32]Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). “Attention is All You Need”, Advances in Neural Information Processing Systems, 5998-6008.
[33]Wang, S., Wang, X., & Zhang, Y. (2018). “Identification of Natural Images and Computer-generated Graphics Based on Statistical and Textural Features”, Journal of Visual Communication and Image Representation, 55, 495-502.
[34]Yang, X., Li, Y., Qi, H., & Lyu, S. (2019). “Exposing GAN-synthesized Faces using Landmark Locations”, in Proceedings of the ACM International Conference on Multimedia.
[35]Zhang, X., Karaman, S., & Chang, S.-F. (2019). “Detecting and Simulating Artifacts in GAN Fake Images (Extended Version)”, in 2019 IEEE International Workshop on Information Forensics and Security (WIFS), 1-6.