| Graduate student: | 甘岱珺 Kan, Tai-Chun |
|---|---|
| Thesis title: | 基於圖論之高通量染色體結構捕獲連結網路視覺化與分析 Apply graph theory to visualizing and analyzing Hi-C contact network |
| Advisor: | 張家銘 Chang, Jia-Ming |
| Committee members: | 林耀鈴 Lin, Yaw-Ling; 紀明德 Chi, Ming-Te; 陳彥宏 Chen, Yen-Hung; 郭桐惟 Kuo, Tung-Wei; 張家銘 Chang, Jia-Ming |
| Degree: | Master |
| Department: | College of Science - Department of Computer Science |
| Year of publication: | 2018 |
| Academic year of graduation: | 106 |
| Language: | Chinese |
| Number of pages: | 41 |
| Keywords: | Hi-C, Contact map, Contact network, Graph theory, Network embedding, Information visualization, Shiny |
| DOI URL: | http://doi.org/10.6814/THE.NCCU.CS.014.2018.B02 |
This work explores long-range interactions between genomic regions and uses network topology to analyze their expression patterns and biological functions. Network features efficiently measure the importance of nodes and the relationships among them, which helps identify the central elements of a biological system. We apply several network topological measures to the Hi-C contact network, then combine t-SNE with network embedding to cluster the data. We also developed HiCONET, a web server that visualizes Hi-C data as both a contact map and a network. Its graphical interface lets users visually search for patterns in Hi-C data while related genomic regions are plotted simultaneously in the contact map and the network. Built on the R Shiny platform, it lets users interactively explore regions of interest by clicking on the visualization and adjusting Hi-C parameters. The server is freely available at https://changlab.shinyapps.io/hiconet/.
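As a rough illustration of the contact-network view described in the abstract, the sketch below turns a small contact matrix into an undirected graph and computes degree centrality for each genomic bin. It is not the thesis's implementation: the function names, the toy matrix, and the rule of linking two bins when their contact count reaches a fixed threshold are all assumptions made for this example.

```python
def contact_network(matrix, threshold):
    """Build adjacency sets from a symmetric contact matrix.

    Assumption for illustration: bins i and j are linked when their
    contact count is at least `threshold`.
    """
    n = len(matrix)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] >= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n <= 1:
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbours) / (n - 1)
            for node, neighbours in adjacency.items()}


# Hypothetical 4-bin contact matrix (made-up counts, not real Hi-C data).
toy_matrix = [
    [0, 9, 2, 0],
    [9, 0, 7, 1],
    [2, 7, 0, 8],
    [0, 1, 8, 0],
]

network = contact_network(toy_matrix, threshold=5)
centrality = degree_centrality(network)
for node in sorted(centrality):
    print(node, sorted(network[node]), round(centrality[node], 3))
```

With this toy input, the two middle bins have the highest degree centrality, which is the kind of signal a centrality measure uses to flag central elements in a network.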
1. Introduction
2. Background
2.1. Chromosome Conformation Capture (3C)
2.2. High-throughput Chromatin Conformation Capture (Hi-C)
2.3. Topologically Associating Domain (TAD)
2.4. Biological Contact Network
3. Related Works
3.1. Hi-C Data Visualization
3.2. Hi-C Data Network Analysis
4. Methods
4.1. Hi-C Data Processing
4.2. Epigenetic TADs
4.3. Hi-C Contact Matrix
4.4. Hi-C Contact Network
4.5. Network Properties
4.6. Network Models
4.7. Network Centrality Measures
4.8. Epigenetic TADs Clustering
4.8.1. Network Embedding
4.8.2. t-Distributed Stochastic Neighbor Embedding (t-SNE)
5. Visualization
5.1. System Structure
5.2. Implementation
6. Results and Discussion
6.1. Network Connectivity
6.2. Network Centrality Distribution per TADs
7. Conclusions
8. Reference