Graduate student: 李其軒 (Li, Qi-Xuan)
Thesis title: 探討兩資料集之相關性 (Exploring the correlation between two datasets)
Advisor: 鄭宗記 (Cheng, Tsung-Chi)
Oral defense committee: 鄒宗山 (Tsou, Tsung-Shan), 蕭維政 (Hsiao, Wei-Cheng)
Degree: Master
Department: College of Commerce - Department of Statistics
Year of publication: 2023
Academic year of graduation: 111
Language: Chinese
Number of pages: 146
Keywords (Chinese): Mantel 檢定, 典型相關分析, RV係數, PROTEST, 距離共變異數檢定, 歐氏距離, 馬氏距離, 皮爾森相關係數距離
Keywords (English): Mantel test, Canonical correlation analysis, RV coefficient, PROTEST, Distance covariance test, Euclidean distance, Mahalanobis distance, Pearson correlation distance
  • In biostatistical and ecological research, measuring the association between two multivariate datasets is an important problem. Besides canonical correlation analysis, this study examines four other methods: the Mantel test, the RV coefficient, PROTEST (Procrustean randomization test), and the distance covariance test, and compares their performance under different data structures. The Mantel test and the distance covariance test measure the correlation between datasets through pairwise distances. In addition to the Euclidean distance commonly used with both tests, this study also applies the Mahalanobis distance and the Pearson correlation distance to examine whether the choice of distance affects test performance. Data are simulated from multivariate normal and non-normal distributions; for each model, the sample size, the dimensionality, and the variances of the variables are varied. The tests are compared by their power and power curves. Finally, each test is applied to two empirical datasets: note structure and song structure in American wood warblers, and gene expression and hepatic fatty acids in mice.
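
    The distance-based idea in the abstract can be illustrated with a minimal Python sketch of a permutation Mantel test run under the three distance measures named there. This is not code from the thesis: the function names `distance_matrix` and `mantel_test`, the simulated data, the noise level, and the number of permutations are all assumptions chosen for illustration. The Pearson correlation distance is taken here to be SciPy's `correlation` metric (one minus the Pearson correlation between two observations), which may differ from the definition used in the thesis.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def distance_matrix(data, metric="euclidean"):
    """Pairwise distances between the rows (observations) of `data`."""
    # SciPy's pdist supports "euclidean", "mahalanobis" and "correlation";
    # "correlation" is 1 minus the Pearson correlation between two rows.
    return squareform(pdist(data, metric=metric))


def mantel_test(dist_x, dist_y, n_perm=999, seed=0):
    """Permutation Mantel test on two square distance matrices.

    Returns the Mantel statistic (Pearson correlation between the upper
    triangles) and a one-sided permutation p-value.
    """
    rng = np.random.default_rng(seed)
    n = dist_x.shape[0]
    upper = np.triu_indices(n, k=1)
    x_vec = dist_x[upper]
    observed = np.corrcoef(x_vec, dist_y[upper])[0, 1]

    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Permute rows and columns of one matrix with the same ordering.
        permuted = dist_y[np.ix_(perm, perm)]
        if np.corrcoef(x_vec, permuted[upper])[0, 1] >= observed:
            exceed += 1
    # Count the observed statistic itself among the permutations.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical simulated data: y depends linearly on x plus noise.
rng = np.random.default_rng(42)
n = 40
x = rng.normal(size=(n, 4))
y = x @ rng.normal(size=(4, 3)) + 0.5 * rng.normal(size=(n, 3))

results = {}
for metric in ("euclidean", "mahalanobis", "correlation"):
    r, p = mantel_test(distance_matrix(x, metric), distance_matrix(y, metric))
    results[metric] = (r, p)
    print(f"{metric:12s} Mantel r = {r:.3f}, p = {p:.3f}")
```

    Repeating such a test over many simulated datasets and recording the rejection rate at a fixed significance level would give an empirical power estimate of the kind the abstract compares across methods.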

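    Chapter 1. Introduction 1
    Section 1. Research motivation and objectives 1
    Section 2. Research framework 2
    Chapter 2. Methods 3
    Section 1. Distance matrices 3
    1.1 Euclidean distance 3
    1.2 Mahalanobis distance 4
    1.3 Pearson correlation distance 4
    Section 2. Mantel test 5
    Section 3. Canonical correlation analysis 6
    Section 4. RV coefficient 7
    Section 5. PROTEST 9
    Section 6. Distance covariance test 11
    Chapter 3. Simulation analysis 12
    Section 1. Simulation design 12
    1.1 Multivariate normal distribution 12
    1.2 Multivariate log-normal distribution model 21
    Section 2. Simulation results 24
    2.1 Multivariate normal distribution 24
    2.2 Multivariate log-normal distribution 98
    Chapter 4. Empirical data analysis 135
    Section 1. Song and note structure of American wood warblers 135
    Section 2. Nutrimouse dataset 138
    Chapter 5. Conclusions and recommendations 142
    Section 1. Conclusions 142
    Section 2. Suggestions for future work 143
    Chapter 6. References 144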


    Full-text release date: 2026/07/17