
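Graduate student: 邱莉晴 (Chiu, Li Ching)
Thesis title: 行動應用軟體在迭代分群行為之研究
Iterative Clustering on Behaviors of App Executables
Advisor: 郁方 (Yu, Fang)
Degree: Master
Department: College of Commerce - Department of Management Information Systems
Year of publication: 2014
Academic year of graduation: 102 (ROC calendar)
Language: English
Number of pages: 57
Keywords (Chinese): mobile applications (行動應用程式), GHSOM, clustering (分群)
Keywords (English): iterative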
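  • Mobile devices are ubiquitous today, and we need a way to explore what apps do behind the scenes.
    This study proposes an unsupervised clustering approach to investigate whether the source code in an app can serve as the basis for clustering apps by behavior.
    We apply iterative clustering to analyze apps and examine whether the resulting clusters are appropriate.
    In our experiments, we downloaded several hundred apps from the App Store and analyzed them, and we found that the proposed approach performs well and yields correct clustering results.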


    Smart devices are everywhere nowadays. Mobile application (app) development has become a mainstream activity in the software industry, with millions of apps developed and published to billions of users.

    It is essential to have a systematic way to analyze apps, preferably based on their executables, which in most cases are the only publicly available source of information about them.

    In this work, we propose applying unsupervised clustering to mobile applications based on their system call distributions. We first use static binary analysis to reverse engineer app executables and extract the counts of method calls and call sequences embedded in them. Apps are then clustered iteratively on this information to reveal implicit relationships among apps based on function call similarity. The GHSOM (Growing Hierarchical Self-Organizing Map), an unsupervised learning tool, is used to cluster apps on information resolved directly from their executables.
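To make the idea of comparing apps by their call distributions concrete, here is a minimal sketch. It assumes each app has already been reduced to a list of method names by some static analysis step. The method names and the three sample apps are hypothetical, and cosine similarity is used purely for illustration: it is not the distance measure of the GHSOM tool described in the abstract.

```python
import math
from collections import Counter


def call_distribution(calls):
    """Turn a list of method names into relative call frequencies."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency dictionaries."""
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an app with no recorded calls is similar to nothing
    return dot / (norm_p * norm_q)


# Hypothetical method-call lists, as a static analysis step might produce.
game_a = ["drawFrame", "drawFrame", "playSound", "loadTexture"]
game_b = ["drawFrame", "playSound", "playSound", "loadTexture"]
chat = ["sendMessage", "openSocket", "sendMessage", "showNotification"]

dist_a = call_distribution(game_a)
dist_b = call_distribution(game_b)
dist_c = call_distribution(chat)

# The two game-like apps share calls, so they score higher with each
# other than either does with the chat-like app.
print(cosine_similarity(dist_a, dist_b) > cosine_similarity(dist_a, dist_c))
```

Normalizing counts to frequencies keeps a large app from dominating a small one simply because it contains more calls overall.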

    We use types of methods and sequences as features. When running the clustering algorithm on apps, however, we immediately confront a problem: the large number of attributes and data points leads to long or infeasible analysis times with GHSOMs. We propose a new iterative approach to overcome this problem, combined with dimension reduction by principal component analysis (PCA), which cuts attributes with limited information loss.
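The role of PCA here can be illustrated with a small, dependency-free sketch. It assumes a matrix whose rows are apps and whose columns are method-call features. The sample counts are hypothetical, and the eigenvectors are found by power iteration with deflation, which is a simple stand-in chosen for readability and not necessarily how the thesis computes them.

```python
import math


def center(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]


def mat_vec(matrix, v):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector by power iteration.

    Assumption: the fixed start vector is not orthogonal to the top
    eigenvector; a production implementation would handle that case.
    """
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v  # nothing left to extract
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


def pca(rows, k):
    """Project rows onto the top-k components; also return the
    fraction of total variance those components retain."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, kept_variance = [], 0.0
    for _ in range(k):
        eigenvalue, v = top_eigenpair(cov)
        components.append(v)
        kept_variance += eigenvalue
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(a * b for a, b in zip(row, c)) for c in components]
                 for row in centered]
    ratio = kept_variance / total_variance if total_variance > 0 else 0.0
    return projected, ratio


# Hypothetical counts: rows are apps, columns are method-call features.
counts = [
    [10.0, 20.0, 1.0],
    [12.0, 24.0, 0.0],
    [30.0, 61.0, 2.0],
    [33.0, 65.0, 1.0],
]
reduced, retained = pca(counts, k=1)
print(len(reduced[0]), round(retained, 3))
```

The retained-variance ratio is one way to quantify "limited information loss": a value close to 1 means the dropped dimensions carried little of the variation across apps.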

    In preliminary results from analyzing hundreds of apps downloaded directly from the Apple App Store, the proposed clustering works well and reveals some interesting information. Apps developed by the same company are clustered in the same group. Apps with similar behaviors, e.g., offering the same functions for games, painting, or socializing, are clustered together.

    Abstract
    Content
    1 Introduction
    2 Related Works
    2.1 Clustering methods
    2.1.1 K-Means Algorithm
    2.1.2 SOM
    2.1.3 GHSOM
    2.1.4 Comparison of clustering method
    2.2 Dimension reduction
    2.2.1 PCA
    2.2.2 Comparison with LDA method
    2.3 OPcode sequence analysis
    2.4 App Analysis and clustering
    4 Evaluations
    4.1 115 apps clustering
    4.2 564 apps clustering
    4.2.1 PCA reduction
    4.2.2 Iterative GHSOM on 564 apps
    4.3 800 apps clustering
    4.3.1 PCA reduction on 800 apps
    4.3.2 Iterative GHSOM on 800 apps
    5 Conclusions
    Reference
    Appendix
    1. GHSOM clustering result of 115 apps
    (1) Segment of MATLAB code on transfer the original data
    2. Progress of iterative GHSOM on 115 apps
    3. 564 apps iterative progress
    4. 800 apps progress

