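An Embedding Framework on Long Sequence Representation with Multimodal Fusion (基於多模態融合的長序列表示嵌入框架) | National Chengchi University Theses and Dissertations System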

Author:	Shia, Chiu-Ju (夏秋如)
Thesis title:	An Embedding Framework on Long Sequence Representation with Multimodal Fusion (基於多模態融合的長序列表示嵌入框架)
Advisor:	Hsiao, Shun-Wen (蕭舜文)
Oral defense committee:	Tso, Ray-Lin (左瑞麟); Huang, Yi-Ting (黃意婷); Chen, Meng-Chang (陳孟彰)
Degree:	Master
Department:	College of Commerce - Department of Management Information System
Year of publication:	2024
Academic year of graduation:	112 (ROC calendar)
Language:	English
Number of pages:	48
Keywords (Chinese):	長序列表示、圖神經網絡、點矩陣法、注意力機制、多模態融合
Keywords (English):	Long Sequence Representation, Graph Neural Networks, Dot-matrix Method, Attention Mechanism, Multimodal Fusion

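Analyzing malware API call sequences is a major challenge: the sequences are long, textual, event-based, and contain hidden information, which makes them difficult for people to analyze. In addition, unlike natural language, API call sequences often exhibit program-related properties and structures such as loops and repeated calls. This study therefore focuses on the structures within these sequences, aiming to address the complexity they pose for understanding malware behavior. We propose an embedding framework that uses the structure of malware API call sequences for representation learning and identifies the important API calls in a sequence. We use two different methods to extract structural information: the Markov model and the dot matrix method. To help learn these long sequences with complex program logic, we use graph neural networks and vision transformers to convert the graph structure and the dot matrix structure into high-dimensional vectors. In addition, we use attention-based multimodal fusion to merge the multimodal data into a single representation vector and to show the importance of the API calls in a sequence. By integrating these methods, our framework not only indicates the importance of specific API calls within malware families but also presents an innovative application of multimodal fusion.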
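The two structure-extraction ideas named in the abstract can be illustrated with a minimal sketch. It assumes a first-order Markov model estimated from adjacent call pairs and a self-comparison dot matrix in which a cell is set when two positions hold the same call. The function names and the toy trace are hypothetical and are not taken from the thesis, whose exact construction is not given in this record.

```python
from collections import Counter, defaultdict


def transition_probabilities(calls):
    """First-order Markov estimate: P(next | current) from adjacent pairs."""
    pair_counts = Counter(zip(calls, calls[1:]))
    out_totals = defaultdict(int)
    for (src, _dst), n in pair_counts.items():
        out_totals[src] += n
    # Each (src, dst) key is a weighted edge of the transition graph.
    return {(src, dst): n / out_totals[src] for (src, dst), n in pair_counts.items()}


def dot_matrix(calls):
    """Self-comparison dot matrix: cell (i, j) is 1 when calls[i] == calls[j]."""
    return [[1 if a == b else 0 for b in calls] for a in calls]


# Hypothetical toy trace; the API names are illustrative only.
trace = ["NtOpenFile", "NtReadFile", "NtReadFile", "NtClose", "NtOpenFile", "NtReadFile"]
edges = transition_probabilities(trace)
matrix = dot_matrix(trace)
```

In this sketch, loops in the trace appear as cycles among the graph edges, and repeated subsequences appear as off-diagonal runs in the matrix, which is the kind of structure a graph or image model could then consume.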

Analyzing malware through its API call sequences is challenging because the sequences are long, text-based, event-based, and carry hidden information, all of which make manual examination difficult. Moreover, unlike natural language, these call sequences often exhibit programming-related properties and structures such as loops and repeated calls. Consequently, this work focuses on analyzing such structures within call sequences, aiming to untangle the complexities they present in understanding malware behaviors. We propose an embedding framework that learns the structure of malware call sequences in multiple ways for representation learning and pinpoints the important calls in a sequence. Our method uses two approaches to structural information extraction: the Markov model and the dot matrix method. To handle variable-length sequences that carry intricate programming logic, we use Graph Neural Networks (GNN) and Vision Transformer Networks to encode the graph and dot matrix structures as high-dimensional vectors. Furthermore, we employ attention-based multimodal fusion to combine the multimodal data into a single representation that highlights the importance of individual API calls within the sequences. By integrating these methods, our framework not only indicates the significance of specific calls within a malware family but also introduces an innovative application of multimodal fusion networks.
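As a rough illustration of attention-based fusion, the sketch below weights one embedding per modality by its scaled dot-product similarity to a query vector and sums the results. This is a generic attention pooling step under stated assumptions, not the fusion network of the thesis: the embeddings, the fixed query (a stand-in for a learned parameter), and the function names are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Weight each embedding by scaled dot-product similarity to the query, then sum."""
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, emb)) / math.sqrt(dim)
        for emb in embeddings
    ]
    weights = softmax(scores)
    fused = [sum(w * emb[k] for w, emb in zip(weights, embeddings)) for k in range(dim)]
    return fused, weights


# Hypothetical 4-dimensional embeddings for the three modalities.
modalities = {
    "sequence": [0.2, 0.1, 0.0, 0.5],
    "graph": [0.9, 0.3, 0.4, 0.1],
    "dot_matrix": [0.1, 0.8, 0.2, 0.3],
}
query = [1.0, 0.0, 0.0, 0.0]  # stand-in for a learned query vector
fused, weights = attention_fuse(list(modalities.values()), query)
```

The returned weights sum to one, so they can be read as the relative contribution of each input. Applying the same pooling over per-call embeddings instead of per-modality embeddings would give a per-call importance score in the same spirit.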

1 Introduction

2 Related Work
2.1 Text Sequences
2.1.1 Seq2Seq Model
2.1.2 Attention Mechanism and Transformer
2.2 Graph
2.2.1 Markov Model
2.2.2 Graph Neural Networks
2.3 Vision
2.3.1 Dot Matrix Method
2.3.2 Image Classification
2.4 Multimodal Fusion
2.5 API Call Sequence Analysis

3 Design of Our Method
3.1 Overview
3.2 Malware Behavior Profile Preprocessing
3.3 Behavior Structure Extraction
3.3.1 Behavior Transition Graph Generation
3.3.2 Behavior Dot Matrix Generation
3.4 Multimodalities Transformer Networks
3.4.1 Sequence Transformer Networks
3.4.2 Graph Transformer Networks
3.4.3 Vision Transformer Networks
3.5 Multimodal Fusion Networks

4 Evaluation
4.1 Dataset
4.2 Malware Family Classification Comparison
4.3 Multimodal Fusion Method Comparison
4.4 Attention Mechanism
4.5 Representation of Long Sequence in Malware Family

5 Conclusion

Reference

Full text available from: 2029/08/05
