| Graduate student: | 施苑玉 Shi, Yuan Yu |
|---|---|
| Thesis title: | 列聯表中離群細格偵測探討 Detecting Outlying Cells in Cross-Classified Tables |
| Advisor: | 江振東 Jiang, Zhen Dong |
| Degree: | Master |
| Department: | College of Commerce - Department of Statistics |
| Year of publication: | 2013 |
| Academic year of graduation: | 83 |
| Language: | English |
| Number of pages: | 67 |
| Chinese keywords: | 列聯表、卡方適合度檢定、離群細格、殘差、近似獨立性 |
| Foreign-language keywords: | Contingency tables, Goodness-of-fit tests, Outlying cells, Residuals, Quasi-independence |
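When analyzing contingency table data, the chi-squared goodness-of-fit test is commonly used to judge how well a model fits. A significant test indicates that the fitted model is inadequate, and we then wish to investigate the possible causes. One possible cause is the presence of so-called outlying cells, whose observed counts are in some way inconsistent with those of the other cells.
In the earlier literature, outlying cells have usually been detected by means of residuals under various definitions, which has given rise to a range of detection methods. These studies, however, are essentially confined to two-way contingency tables and offer no further treatment of higher-dimensional tables. Brown (1974) proposed a stepwise method that identifies the possible outlying cells one at a time until the hypothesis of quasi-independence is no longer rejected. Because the computations his method requires seem overly laborious, we simplify and modify the calculation to provide an alternative detection method. Simulation results show that the method introduced here performs at least as well as Brown's. We also examine the feasibility of applying the method to three-way contingency tables and the difficulties that may arise.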
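As an illustration of the residual-based diagnostics mentioned above, the following is a minimal sketch, assuming the independence model for a two-way table. It computes the Pearson chi-squared statistic, each cell's contribution to it, and the adjusted residuals (Pearson residuals divided by their estimated standard errors). The function name `pearson_cell_diagnostics` and the example table are hypothetical and are not taken from the thesis; the thesis may define its residuals differently.

```python
from math import sqrt


def pearson_cell_diagnostics(table):
    """Return (chi2, df, contributions, adjusted_residuals) under independence."""
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    n = sum(row_tot)
    contrib = [[0.0] * n_cols for _ in range(n_rows)]
    adj = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_tot[i] * col_tot[j] / n  # fitted count under independence
            diff = table[i][j] - expected
            contrib[i][j] = diff * diff / expected  # this cell's share of the statistic
            # Adjusted residual: raw residual over its estimated standard error.
            adj[i][j] = diff / sqrt(
                expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n)
            )
    chi2 = sum(map(sum, contrib))
    df = (n_rows - 1) * (n_cols - 1)
    return chi2, df, contrib, adj


# Hypothetical 3x3 table in which the last cell has been inflated.
example = [[20, 20, 20], [20, 20, 20], [20, 20, 60]]
chi2, df, contrib, adj = pearson_cell_diagnostics(example)
largest = max(
    ((abs(adj[i][j]), (i, j)) for i in range(3) for j in range(3))
)
print("chi2 =", round(chi2, 3), "df =", df, "largest |adjusted residual| at", largest[1])
```

A cell with a large adjusted residual is only a candidate outlier: one aberrant cell also distorts the fitted values of the other cells in its row and column, which is one motivation for stepwise procedures based on quasi-independence.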
Chi-squared goodness-of-fit tests are usually employed to test whether a model fits a contingency table well. When the test is significant, we would then like to identify the sources that cause significance. The existence of outlying cells that contribute heavily to the test statistic may be one of the reasons.
Brown (1974) offered a stepwise criterion for detecting outlying cells in two-way contingency tables. In an attempt to simplify the lengthy calculations that Brown's method requires, we suggest an alternative procedure in this study. Based on simulation results, we find that the procedure performs reasonably well and even outperforms Brown's method on several occasions. In addition, some extensions and issues regarding three-way contingency tables are also addressed.
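The general idea of stepwise detection under quasi-independence can be sketched as follows. This is a brute-force illustration only: it is neither Brown's (1974) computation nor the procedure proposed in the thesis, whose details are not given in this record. The sketch assumes iterative proportional fitting for the quasi-independence model, uses the Wilson-Hilferty approximation in place of an exact chi-squared quantile, and assumes each deleted cell costs one degree of freedom. All function names and the `max_deleted` cap are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def fit_quasi_independence(table, excluded, iters=200):
    """Fit e_ij = a_i * b_j to the non-excluded cells by iterative proportional fitting."""
    n_rows, n_cols = len(table), len(table[0])
    a, b = [1.0] * n_rows, [1.0] * n_cols
    for _ in range(iters):  # fixed number of sweeps, no convergence check
        for i in range(n_rows):
            kept = [j for j in range(n_cols) if (i, j) not in excluded]
            den = sum(b[j] for j in kept)
            a[i] = sum(table[i][j] for j in kept) / den if den else 0.0
        for j in range(n_cols):
            kept = [i for i in range(n_rows) if (i, j) not in excluded]
            den = sum(a[i] for i in kept)
            b[j] = sum(table[i][j] for i in kept) / den if den else 0.0
    return [[a[i] * b[j] for j in range(n_cols)] for i in range(n_rows)]


def pearson_x2(table, fitted, excluded):
    """Pearson statistic summed over the non-excluded cells."""
    return sum(
        (table[i][j] - fitted[i][j]) ** 2 / fitted[i][j]
        for i in range(len(table))
        for j in range(len(table[0]))
        if (i, j) not in excluded and fitted[i][j] > 0
    )


def chi2_critical(df, alpha=0.05):
    """Approximate upper-alpha chi-squared quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(1 - alpha)
    c = 2.0 / (9.0 * df)
    return df * (1 - c + z * sqrt(c)) ** 3


def stepwise_outlying_cells(table, alpha=0.05, max_deleted=3):
    """Delete one cell at a time until quasi-independence is no longer rejected."""
    n_rows, n_cols = len(table), len(table[0])
    excluded = set()

    def x2_without(cells):
        return pearson_x2(table, fit_quasi_independence(table, cells), cells)

    while True:
        # Assumption: each deleted cell removes one degree of freedom.
        df = (n_rows - 1) * (n_cols - 1) - len(excluded)
        x2 = x2_without(excluded)
        if df <= 0 or len(excluded) >= max_deleted or x2 <= chi2_critical(df, alpha):
            return sorted(excluded), x2, df
        candidates = [
            (i, j) for i in range(n_rows) for j in range(n_cols) if (i, j) not in excluded
        ]
        # Delete the cell whose removal leaves the smallest statistic.
        excluded.add(min(candidates, key=lambda cell: x2_without(excluded | {cell})))


# Hypothetical table with one inflated cell.
cells, x2, df = stepwise_outlying_cells([[20, 20, 20], [20, 20, 20], [20, 20, 60]])
print("deleted cells:", cells, "remaining X2:", round(x2, 3), "df:", df)
```

Refitting the model once for every candidate cell at every step is exactly the kind of lengthy calculation the abstract refers to; the sketch favors clarity over efficiency.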
Acknowledgements
Abstract (in Chinese)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
1.1 Motivation
1.2 Outline
2 Literature Review
2.1 Basic Definitions
2.2 Brown's Method
2.3 Other Methods
3 Outlying Cells Detection for Two-Way Contingency Tables
3.1 Procedure Proposed
3.2 Examples
4 Some Extensions to Three-Way Contingency Tables and Its Limitations
4.1 One Partial Association Models
4.2 Two Partial Association Models
4.3 Problems Related to the Other Models
5 Concluding Remarks
List of Figures
3.1 Flow chart for identifying outliers
List of Tables
3.1 Given table representing father/son occupations (rows being for fathers and columns for sons)
3.2 Outliers detected by Brown's and our proposed methods for Table 3.1
3.3 Generated contingency table with some cells having interaction term
3.4 Outliers detected by our method for Table 3.3
3.5 Table obtained by deleting the first digits of the extreme outliers in Table 3.3
3.6 Outliers detected by our method for Table 3.5
3.7 Generated contingency table with expected cell frequencies given by model (3.1)
3.8 Outliers detected by Brown's and our methods for Table 3.7
3.9 Generated contingency table with expected cell frequencies given by model (3.2)
3.10 Outliers detected by Brown's and our methods for Table 3.9
4.1 Generated contingency table with expected cell frequencies given by model (4.1)
4.2 Outliers detected for Table 4.1
4.3 Generated contingency table with expected cell frequencies given by model (4.2)
4.4 Outliers detected for Table 4.3
4.5 Generated contingency table with expected cell frequencies given by model (4.3)
4.6 Outliers detected for Table 4.5
4.7 Generated contingency table with expected cell frequencies given by model (4.4)
4.8 Outliers detected for Table 4.7
(Restricted to standalone-terminal use in Information Classroom A, 4th floor, 達賢圖書館)