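Graduate student: 游詳閔 (Yu, Hsiang-Min)
Thesis title: 具概念飄移的動態社群網絡之類別預測
Label Prediction on Dynamic Social Networks with Concept Drifting
Advisor: 沈錳坤 (Shan, Man-Kwan)
Oral defense committee: 沈錳坤 (Shan, Man-Kwan)
柯佳伶 (Koh, Jia-Ling)
李華富 (Li, Hua-Fu)
林守德 (Lin, Shou-De)
Degree: Master
Department: College of Science - Department of Computer Science
Year of publication: 2010
Academic year of graduation: 99 (ROC calendar)
Language: Chinese
Number of pages: 39
Chinese keywords: 類別預測, 動態社群網絡, 概念飄移
English keywords: Label Prediction, Dynamic Social Networks, Concept Drifting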
DOI URL: http://doi.org/10.6814/NCCU201901233
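  • Social networks play an increasingly important role in computer science research, and label prediction is one of the popular topics in this area. The goal of label prediction is to use the nodes in a network whose labels are known to predict the labels of the remaining, unlabeled nodes.
    Previous research on label prediction has focused on static social networks; however, social networks usually evolve over time. In a dynamic network, the nodes, links, and labels may all change as time passes. Consequently, the way nodes influence one another also changes over time. This change can be regarded as a form of concept drift.

    Unlike previous studies, we identify the problem of label classification in dynamic networks and propose a solution that predicts labels in dynamic networks by combining label classification techniques for static networks with methods for handling concept drift.

    The experimental data come from a social network built from IMDb (Internet Movie Database), which we use to predict the labels of actors. The experimental results show that using the evolution of the dynamic social network as a reference indicator for label prediction improves classification accuracy in dynamic networks.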


    Label prediction is one of the central questions of social network research. Its core idea is to use the labeled nodes of a social network to predict the labels of the unlabeled nodes. A labeled social network is a social network in which some or all of the nodes carry labels. Nodes in the same social network influence one another's labels.
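To make the idea of predicting unlabeled nodes from labeled ones concrete, the following is a minimal sketch of a neighbor-voting predictor. It is an illustration only, not the method developed in the thesis: the function name `predict_labels`, the toy graph, and the genre labels are all hypothetical, and the synchronous majority vote is an assumed simplification.

```python
from collections import Counter


def predict_labels(adjacency, known_labels, max_rounds=10):
    """Give each unlabeled node the majority label of its labeled neighbors.

    Hypothetical helper for illustration. Updates are synchronous: every
    node in a round votes using the labels from the end of the previous round.
    """
    labels = dict(known_labels)
    for _ in range(max_rounds):
        updates = {}
        for node, neighbors in adjacency.items():
            if node in known_labels:
                continue  # never overwrite a ground-truth label
            votes = Counter(labels[n] for n in neighbors if n in labels)
            if not votes:
                continue  # no labeled neighbor yet, try again next round
            # Highest vote count wins; ties are broken alphabetically.
            best = min(votes.items(), key=lambda kv: (-kv[1], kv[0]))[0]
            if labels.get(node) != best:
                updates[node] = best
        if not updates:
            break  # labels are stable
        labels.update(updates)
    return labels


# Toy co-appearance graph (made-up data): nodes a, b, e, f are labeled.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d"],
    "f": ["d"],
}
known = {"a": "drama", "b": "drama", "e": "comedy", "f": "comedy"}
result = predict_labels(adjacency, known)
```

Here node `c` takes the label of its two labeled `drama` neighbors, and node `d` takes `comedy` from `e` and `f`; the known labels are left untouched.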

    Previous research on label prediction has focused on static social networks. In reality, however, social networks are dynamic. In a dynamic social network, the links between nodes, and even the labels of nodes, can change over time. The mutual influence among nodes can change as well. This change is called "concept drift."

    This thesis predicts labels on a dynamic labeled social network. We address the problem of classification in a dynamic social network. Label prediction techniques for static social networks are combined with algorithms for handling concept drift to solve the label prediction problem on dynamic social networks.
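One common way to combine per-snapshot classifiers under concept drift is a weighted-majority ensemble, where classifiers that err on recent labeled data lose weight. The sketch below illustrates that general idea under stated assumptions; it is not the thesis's Ensemble Box procedure. The function names, the penalty factor `beta`, the feature dictionary keys, and the two toy classifiers are all hypothetical.

```python
from collections import defaultdict


def update_weights(classifiers, weights, labeled_examples, beta=0.5):
    """Multiply a classifier's weight by beta for each mistake, then normalize.

    Assumes 0 < beta <= 1, so the weights always stay positive.
    """
    new_weights = list(weights)
    for features, true_label in labeled_examples:
        for i, clf in enumerate(classifiers):
            if clf(features) != true_label:
                new_weights[i] *= beta
    total = sum(new_weights)
    return [w / total for w in new_weights]


def weighted_vote(classifiers, weights, features):
    """Return the label with the largest total weight across classifiers."""
    scores = defaultdict(float)
    for clf, w in zip(classifiers, weights):
        scores[clf(features)] += w
    return max(scores, key=scores.get)


# Two hypothetical base classifiers, one per network snapshot.
def old_snapshot(x):
    return "drama"  # learned when almost every node was labeled drama


def new_snapshot(x):
    return "comedy" if x["comedy_neighbors"] > x["drama_neighbors"] else "drama"


classifiers = [old_snapshot, new_snapshot]
recent = [
    ({"comedy_neighbors": 3, "drama_neighbors": 1}, "comedy"),
    ({"comedy_neighbors": 2, "drama_neighbors": 0}, "comedy"),
]
weights = update_weights(classifiers, [0.5, 0.5], recent)
prediction = weighted_vote(
    classifiers, weights, {"comedy_neighbors": 4, "drama_neighbors": 1}
)
```

In this toy run the older classifier is wrong on both recent examples, so its weight is halved twice before normalization and the newer classifier dominates the vote.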

    Experiments were performed on a labeled social network constructed from the Internet Movie Database (IMDb). The results show that taking the evolution of a dynamic social network into account yields more accurate label predictions.

    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Related Work
    2.1 Collective Classification
    2.2 Local Structure Similarity
    2.3 Graph-based Semi-supervised Learning
    2.4 Ghost Edges

    Chapter 3 Methodology
    3.1 Problem Definition
    3.2 Research Framework
    3.3 Base Classifier Learning
    3.3.1 Ghost Edge
    3.3.2 Random Walk With Restart
    3.3.3 Base Classifier
    3.4 Feature Selection
    3.5 Ensemble Box Learning
    3.5.1 Concept Drift
    3.5.2 Ensemble Box
    3.6 Labeling
    3.6.1 Iterative Classification Algorithm

    Chapter 4
    4.1 Database
    4.1.1 Database Characteristics
    4.2 Experimental Design

    Chapter 5
    5.1 Conclusion
    5.2 Future Work

    References

