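Graduate student: 任嘉珩 (Jen, Chia Heng)
Thesis title: Robust estimation of the Mahalanobis distance for multivariate data mixed with continuous and discrete variables (混合連續與間斷資料之馬式距離的穩健估計)
Advisor: 鄭宗記
Degree: Master
Department: College of Commerce - Department of Statistics
Year of publication: 2008
Academic year of graduation: 96
Language: English
Number of pages: 39
Chinese keywords: 混合型資料, 隱藏常態變數模型, 穩健估計, 馬式距離 (mixed data, normal latent variable model, robust estimation, Mahalanobis distance)
English keywords: normal latent variable model, Mahalanobis distance, minimum covariance determinant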
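  • This study adopts the normal latent variable model proposed by Lee and Poon to estimate the parameters of data with mixed continuous and discrete variables, and to estimate the corresponding Mahalanobis distance. In addition, robust estimation is used to estimate the parameters of the mixed data and their Mahalanobis distance, which addresses the instability of maximum likelihood estimation when outliers are present.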


    Poon and Lee (1987) applied a normal latent variable model to parameter estimation for data with mixed continuous and discrete variables, and Bedrick et al. (2000) used this idea to estimate the Mahalanobis distance. In this thesis, we extend this idea to obtain robust estimates for multivariate data with mixed continuous and discrete variables under the same model. We then evaluate the Mahalanobis distance, which serves as a measure of similarity. The proposed method can overcome the unreliability of the MLE when outliers are present in the data.
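    The abstract's contrast between the classical estimate and a robust one can be illustrated with a small sketch. The code below is not the thesis's method: it uses only continuous bivariate data, omits the normal latent variable model entirely, and computes the raw minimum covariance determinant (MCD) estimate by enumerating every subset of size h, which is only feasible for tiny samples. The data set, the function names, and the choice h = 6 are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def mean_cov(points):
    """Sample mean and covariance (denominator n - 1) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def det2(cov):
    """Determinant of a symmetric 2x2 covariance stored as (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy ** 2


def mahalanobis(point, center, cov):
    """Mahalanobis distance of one point, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = point[0] - center[0], point[1] - center[1]
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det2(cov)
    return sqrt(d2)


def raw_mcd(points, h):
    """Raw MCD by brute force: the h-subset whose covariance has the
    smallest determinant (no consistency factor, no reweighting step)."""
    best = min(combinations(points, h), key=lambda s: det2(mean_cov(s)[1]))
    center, cov = mean_cov(best)
    return best, center, cov


# Hypothetical data: seven roughly linear points plus one planted outlier.
OUTLIER = (2.0, 12.0)
DATA = [(1.0, 2.0), (2.0, 1.5), (3.0, 3.5), (4.0, 3.0),
        (5.0, 5.5), (6.0, 4.5), (7.0, 6.5), OUTLIER]

# Classical estimate: the outlier inflates the covariance it is judged by.
c_center, c_cov = mean_cov(DATA)
d_classical = mahalanobis(OUTLIER, c_center, c_cov)

# Robust estimate: location and scatter come from the MCD subset only.
subset, r_center, r_cov = raw_mcd(DATA, h=6)
d_robust = mahalanobis(OUTLIER, r_center, r_cov)

print("classical distance of outlier:", round(d_classical, 3))
print("robust distance of outlier:   ", round(d_robust, 3))
```

    With the classical estimate, the outlier's distance is bounded by (n - 1)/sqrt(n), so it cannot stand out much in a sample of eight; the MCD-based distance is not subject to that masking. For realistic sample sizes, exhaustive enumeration would be replaced by an approximate search such as the algorithm of Rousseeuw and Van Driessen (1999) listed in the references.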

    1 Introduction
    2 Mahalanobis Distance and Robust Estimation
    2.1 Mahalanobis Distance
    2.1.1 Mahalanobis Distance Between Individuals
    2.1.2 Mahalanobis Distance Between Populations
    2.2 Robust Estimation
    2.2.1 Breakdown Point
    2.2.2 Definition of Minimum Covariance Determinant
    3 Distance Between Populations With Mixed Continuous and Discrete Variables
    3.1 Distance Between Populations With Continuous Variables
    3.2 Distance Between Populations With Mixed Continuous and Discrete Variables
    3.3 Maximum Likelihood Estimation
    3.4 Robust Estimation
    3.5 Examples
    3.5.1 Academic Achievement Data
    3.5.2 Skull Data
    4 Robust Estimation of the Mahalanobis Distance for Multivariate Data Mixed with Continuous and Discrete Variables
    4.1 Notation
    4.2 Maximum Likelihood Estimation
    4.3 Estimation of the Mahalanobis Distance
    4.4 Robust Estimation of the Mahalanobis Distance
    4.5 Examples
    4.5.1 Academic Achievement Data
    4.5.2 Skull Data
    5 Conclusions
    References

    [1] Barnett, V. and Lewis, T. (1994), Outliers in Statistical Data, 3rd ed., New York: John Wiley and Sons.
    [2] Bedrick, E. J., Lapidus, J., and Powell, J. F. (2000), Estimating the Mahalanobis distance from mixed continuous and discrete data, Biometrics, 56, 394–401.
    [3] Bhattacharyya, A. (1943), On a measure of divergence between two statistical populations defined by their probability distributions, Bulletin of the Calcutta Mathematical Society, 35, 99–109.
    [4] Donoho, D. L. and Huber, P. J. (1983), The notion of breakdown point, in A Festschrift for Erich L. Lehmann, Eds. P. J. Bickel, K. A. Doksum and J. L. Hodges, Jr., 157–184, Belmont, CA: Wadsworth.
    [5] Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A. (1986), Robust Statistics: The Approach Based on Influence Functions, New York: John Wiley and Sons.
    [6] Huber, P. J. (1964), Robust estimation of a location parameter, The Annals of Mathematical Statistics, 35, 73–101.
    [7] Huber, P. J. (1981), Robust Statistics, New York: John Wiley and Sons.
    [8] Jobson, J. D. (1992), Applied Multivariate Data Analysis, Volume II: Categorical and Multivariate Methods, New York: Springer-Verlag.
    [9] Krzanowski, W. J. (1975), Discrimination and classification using both binary and continuous variables, Journal of the American Statistical Association, 70, 782–790.
    [10] Krzanowski, W. J. (1983), Distance between populations using mixed continuous and categorical variables, Biometrika, 70, 235–243.
    [11] Lehmann, E. L. and Casella, G. (1998), Theory of Point Estimation, New York: Springer.
    [12] Krzanowski, W. J. and Marriott, F. H. C. (1995), Kendall’s Library of Statistics 2, Multivariate Analysis Part 2, London: Arnold.
    [13] Mahalanobis, P. C. (1936), On the generalized distance in statistics, Proceedings of the National Institute of Sciences of India, 2, 49–55.
    [14] Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979), Multivariate Analysis, London: Academic Press.
    [15] Maronna, R. A., Martin, R. D., and Yohai, V. J. (2006), Robust Statistics: Theory and Methods, New York: Wiley.
    [16] Matusita, K. (1972), Discrimination and the affinity of distributions, in Discriminant Analysis and Applications, Ed. T. Cacoullos, 213–223, New York: Academic Press.
    [17] Olkin, I. and Tate, R. F. (1961), Multivariate correlation models with mixed discrete and continuous variables, The Annals of Mathematical Statistics, 32, 448–465.
    [18] Poon, W. Y. and Lee, S. Y. (1986), Maximum likelihood estimation of polyserial correlations, Psychometrika, 51, 113–121.
    [19] Poon, W. Y. and Lee, S. Y. (1987), Maximum likelihood estimation of multivariate polyserial and polychoric correlation coefficients, Psychometrika, 52, 409–430.
    [20] Prohorov, Y. V. (1956), Convergence of random processes and limit theorems in probability theory, Theory of Probability and its Applications, 1, 157–214.
    [21] Rousseeuw, P. J. (1984), Least median of squares regression, Journal of the American Statistical Association, 79, 871–880.
    [22] Rousseeuw, P. J. and Leroy, A. M. (1987), Robust Regression and Outlier Detection, New York: John Wiley.
    [23] Rousseeuw, P. J. and Van Driessen, K. (1999), A fast algorithm for the minimum covariance determinant estimator, Technometrics, 41, 212–223.
    [24] Zaman, A., Rousseeuw, P. J., and Orhan, M. (2001), Econometric applications of high-breakdown robust regression techniques, Economics Letters, 71, 1–8.
