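| Graduate student: | 徐碩亨 Hsu, Shuo Heng |
|---|---|
| Thesis title: | 充分維度縮減於整體性檢定之應用 Application of sufficient dimension reduction to global test |
| Advisor: | 薛慧敏 Hsueh, Hui Min |
| Degree: | Master |
| Department: | College of Commerce - Department of Statistics |
| Year of publication: | 2013 |
| Academic year of graduation: | 101 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 29 |
| Keywords (translated from Chinese): | dimension reduction, sliced average variance estimation, gene set analysis, permutation p-value |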
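As technology advances, the volume of data that people must process keeps growing. In the analysis of massive data sets, dimension reduction helps improve efficiency. This thesis introduces the sliced average variance estimation (SAVE) method for dimension reduction and applies it to the problem of testing global association. We consider the marginal dimension test in SAVE and use permutation resampling to construct the null distribution of the test statistic, from which a permutation p-value is computed for statistical inference. This global association test can be used in gene set analysis to assess the degree of association between a given gene set and a phenotype variable. Finally, we simulate the type I error rate and power of the test and compare them with those of previously proposed methods.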
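The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the statistic used here (the trace of the SAVE kernel matrix on standardized predictors) is an assumed, simplified stand-in for the marginal dimension test statistic, and the function names `save_statistic` and `permutation_p_value` are hypothetical. The sketch assumes a categorical response, more observations than predictors, at least two predictors, and at least two observations per class.

```python
import numpy as np


def save_statistic(X, y):
    """Trace of the SAVE kernel M = sum_h f_h (I - Cov(Z | y = h))^2.

    Assumed, simplified statistic; Z is X standardized to identity covariance.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    # Inverse square root of the sample covariance via its eigendecomposition.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    M = np.zeros((p, p))
    for label in np.unique(y):
        Zh = Z[y == label]
        D = np.eye(p) - np.cov(Zh, rowvar=False)
        M += (len(Zh) / n) * (D @ D)
    return float(np.trace(M))


def permutation_p_value(X, y, n_perm=999, seed=0):
    """Permutation p-value: shuffle y to build the null distribution."""
    rng = np.random.default_rng(seed)
    observed = save_statistic(X, y)
    null = np.array(
        [save_statistic(X, rng.permutation(y)) for _ in range(n_perm)]
    )
    # Add one to numerator and denominator so the p-value is never zero.
    return (1 + np.sum(null >= observed)) / (n_perm + 1)
```

In a gene set setting, `X` would hold the expression levels of the genes in one set (rows are samples) and `y` the phenotype labels; a small p-value suggests that the set as a whole is associated with the phenotype.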
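Table of Contents
Abstract
Table of Contents
I. Introduction
II. Introduction to Dimension Reduction
1. Dimension reduction and the central subspace
2. Sliced average variance estimation
3. Significance test of global association
III. Simulation Analysis
IV. Conclusions and Suggestions
References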
Auer, P.L. and Doerge, R.W. (2011) A Two-Stage Poisson Model for Testing RNA-Seq Data. Statistical Applications in Genetics and Molecular Biology, 10, 1.
Bura, E. and Pfeiffer, R.M. (2003) Graphical methods for class prediction using dimension reduction techniques on DNA microarray data. Bioinformatics, 19, 1252-1258.
Chen, J.J., Lee, T., Delongchamp, R.R., Chen, T. and Tsao, C.A. (2007) Significance analysis of groups of genes in expression profiling studies. Bioinformatics, 23, 2104-2112.
Cook, R.D. (1996) Graphics for regression with a binary response. Journal of the American Statistical Association, 91, 983-992.
Cook, R.D. (1998) Regression Graphics: Ideas for Studying Regressions Through Graphics. New York: John Wiley.
Cook, R.D. (2000) SAVE: a method for dimension reduction and graphics in regression. Communications in Statistics-Theory and Methods, 29, 2109-2121.
Cook, R.D. (2004) Testing predictor contributions in sufficient dimension reduction. Annals of Statistics, 32, 1062-1092.
Cook, R.D. and Lee, H. (1999) Dimension reduction in binary response regression. Journal of the American Statistical Association, 94, 1187-1200.
Cook, R.D. and Weisberg, S. (1991) Comment. Journal of the American Statistical Association, 86, 328-332.
Dinu, I., Potter, J.D., Mueller, T., Liu, Q., Adewale, A.J., Jhangri, G.S., Einecke, G., Famulski, K.S., Halloran, P. and Yasui, Y. (2007) Improving gene set analysis of microarray data by SAM-GS. BMC Bioinformatics, 8, 242.
Efron, B. and Tibshirani, R. (2007) On testing the significance of sets of genes. The Annals of Applied Statistics, 1, 107-129.
Hosmer, D.W., Cessie, S.L. and Lemeshow, S. (1997) A comparison of goodness-of-fit tests for the logistic regression model. Statistics in Medicine, 16, 965-980.
Li, K.C. (1991) Sliced Inverse Regression for Dimension Reduction. Journal of the American Statistical Association, 86, 316-327.
Li, K.C. (1992) On Principal Hessian Directions for Data Visualization and Dimension Reduction: Another Application of Stein’s Lemma. Journal of the American Statistical Association, 87, 1025-1039.
Liu, Q., Dinu, I., Adewale, A., Potter, J. and Yasui, Y. (2007) Comparative evaluation of gene-set analysis methods. BMC Bioinformatics, 8, 431.
Mootha, V.K., Lindgren, C.M., Eriksson, K.F., Subramanian, A., Sihag, S., Lehar, J., Puigserver, P., Carlsson, E., Ridderstrale, M., Laurila, E., Houstis, N., Daly, M.J., Patterson, N., Mesirov, J.P., Golub, T.R., Tamayo, P., Spiegelman, B., Lander, E.S., Hirschhorn, J.N., Altshuler, D. and Groop, L.C. (2003) PGC-1 alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nature Genetics, 34, 267-273.
Rajagopalan, D. and Agarwal, P. (2005) Inferring pathways from gene lists using a literature-derived network of biological relationships. Bioinformatics, 21, 788-793.
Stein, C. (1981) Estimating the Mean of a Multivariate Normal Distribution. The Annals of Statistics, 9, 1135-1151.
Schäfer, J. and Strimmer, K. (2005) A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics. Statistical Applications in Genetics and Molecular Biology, 4, 1.
Segaran, T. and Hammerbacher, J. (2009) Beautiful Data: The Stories Behind Elegant Data Solutions. O’Reilly Media.
Shao, Y., Cook, R.D. and Weisberg, S. (2007) Marginal tests with sliced average variance estimation. Biometrika, 94, 285-296.
Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S. and Mesirov, J.P. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America, 102, 15545-15550.
Tian, L., Greenberg, S.A., Kong, S.W., Altschuler, J., Kohane, I.S. and Park, P.J. (2005) Discovering statistically significant pathways in expression profiling studies. Proceedings of the National Academy of Sciences of the United States of America, 102, 13544-13549.
Tsai, C.A. and Chen, J.J. (2009) Multivariate analysis of variance test for gene set analysis. Bioinformatics, 25, 897-903.
Weisberg, S. (2005) Applied Linear Regression, 3rd ed. New York: John Wiley.
White, T. (2012) Hadoop: The Definitive Guide, 3rd ed. O’Reilly Media.