
Author: 林琬儒 (Lin, Wan-Ju)
Thesis title: Application of Deep Face Recognition Techniques to the Analysis of Historical Photos (運用人臉辨識技術於歷史照片之分析)
Advisor: 廖文宏
Oral defense committee: 陳駿丞, 彭彥璁
Degree: Master
Department: College of Science - Executive Master Program of Computer Science (理學院 - 資訊科學系碩士在職專班)
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: Chinese
Number of pages: 101
Keywords: Face detection, Face recognition, Deep learning, Historical photos
DOI URL: http://doi.org/10.6814/NCCU202100276
  • Cultural and historical institutions in Taiwan and abroad collect old photographs and work to digitize them for archiving. Much of the information in these photos, however, including the people, events, times, places, and objects they show, still has to be identified and confirmed by the people involved or by their relatives and friends. Many of these individuals live overseas or are advanced in age. A website with an easy-to-use interface is therefore needed so that the elders who witnessed these historical events can contribute their memories and document the context of these valuable materials.
    This work has so far relied entirely on manual identification and description, which raises the question of whether computer vision can assist and accelerate it. In the digital archive website for annotating historical albums, in addition to photo upload and metadata entry, we implement a face recognition recommendation function that helps identify the people in a photo. Elders who are not comfortable typing can instead record what they know about a photo by voice, and the resulting audio file can itself be preserved as oral history.
    This thesis builds a website whose primary dataset is a collection of historical images, and applies computer vision techniques to develop an end-to-end pipeline from face detection to face recognition. We hope the work will benefit archival institutions that need to analyze cultural and historical images.
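
    The face recognition recommendation step described above can be illustrated with a minimal sketch. It assumes that each detected face has already been converted into an embedding vector by a recognition model, and that enrolled identities are stored as a name-to-embedding mapping. The function names, the cosine-similarity ranking, the 0.3 threshold, and the toy three-dimensional vectors are all illustrative assumptions, not the thesis's actual implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def recommend_identities(query, enrolled, top_k=3, threshold=0.3):
    """Rank enrolled identities by similarity to a query face embedding.

    `enrolled` maps a person's name to one embedding (hypothetical layout).
    Candidates scoring below `threshold` (an assumed value) are dropped,
    so an unknown face can yield an empty suggestion list.
    """
    scored = [(name, cosine_similarity(query, emb)) for name, emb in enrolled.items()]
    scored = [(name, score) for name, score in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional embeddings; real face embeddings have far more dimensions.
enrolled = {
    "person_a": [1.0, 0.0, 0.0],
    "person_b": [0.0, 1.0, 0.0],
    "person_c": [0.7, 0.7, 0.0],
}
query = [0.9, 0.1, 0.0]
suggestions = recommend_identities(query, enrolled, top_k=2)
for name, score in suggestions:
    print(f"{name}: {score:.3f}")
```

    Returning a short ranked list instead of a single label fits the recommendation role described in the abstract: the system only proposes candidates, and the person annotating the photo makes the final identification.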

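    Chapter 1 Introduction
    Section 1 Research Background and Motivation
    Section 2 Research Objectives
    Section 3 Research Contributions
    Section 4 Thesis Organization
    Chapter 2 Related Work
    Section 1 Perspectives on Image Interpretation
    Section 2 Research on Processing Cultural and Historical Materials
    Section 3 Review of Computer Vision Techniques
    2.3.1 Image Captioning
    2.3.2 Google Photos
    2.3.3 Google Cloud Vision API
    2.3.4 InsightFace
    2.3.5 Oral Recording
    Chapter 3 Methodology
    Section 1 Datasets
    3.1.1 Enrollment Data (facebank)
    3.1.2 Validation Dataset
    Section 2 Analysis Methods
    3.2.1 Preprocessing: Check Orientation
    3.2.2 Face Detection
    3.2.3 Face Recognition
    Section 3 System Implementation
    3.3.1 Overall Website Flowchart
    3.3.2 Annotation Workflow for Elderly Contributors
    3.3.3 Workflow for System-Suggested Face Descriptions
    3.3.4 Oral Recording Workflow
    Chapter 4 Experimental Results and Discussion
    Section 1 Face Detection Experiments
    4.1.1 Google Cloud Vision
    4.1.2 MTCNN
    4.1.3 RetinaFace
    4.1.4 Summary
    Section 2 Face Recognition Experiments
    4.2.1 Model: ResNet-100 (Subcenter ArcFace), original enrollment data
    4.2.2 Model: ResNet-50 (Subcenter ArcFace), original enrollment data
    4.2.3 Model: ResNet-100 (Subcenter ArcFace), age-grouped enrollment data
    4.2.4 Model: ResNet-50 (Subcenter ArcFace), age-grouped enrollment data
    4.2.5 Summary
    Chapter 5 Conclusions and Future Work
    References
    Appendix
    Appendix A Web Interface and Feature Design
    A.1 Login page (login.php)
    A.2 Historical photo website home page (index.php)
    A.3 Album browsing page (albumshow.php)
    A.4 Photo browsing page (albumphoto.php)
    A.5 Add album page (adminadd.php)
    A.6 Edit album page (adminmodify.php)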

