Graduate student: Wu, Hsiao-Ping (吳小萍)
Thesis title: A Simulation Study on High Density Oligonucleotide Microarray Data With Discussion of Normalization Methods (模擬高密度寡聚核甘酸微陣列矩陣資料及正規化方法之探討)
Advisors: Kuo, Hsun-Chih (郭訓志); Tsai, Wen-Chi (蔡紋琦)
Degree: Master
Department: College of Commerce - Department of Statistics
Publication year: 2006
Academic year of graduation: 94 (ROC calendar, i.e. 2005-2006)
Language: English
Number of pages: 59
Keywords (Chinese): 微陣列矩陣, 正規化
Keywords (English): microarray, normalization
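  • Chinese abstract (translated): Microarray chips are now widely used in many areas of biomedical research. In this thesis, we are mainly interested in the normalization of oligonucleotide microarray data. To compare different normalization methods, we aimed to simulate data that more closely resemble real oligonucleotide microarray data. The simulation is based mainly on Li and Wong's model, with the model parameters set through a hierarchical approach. Finally, to judge the quality of the normalization methods, we simulated 100 data sets and compared the methods using four criteria. The simulation results indicate that our newly proposed method, LOESS to Average, generally performs better than the other normalization methods.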


    Microarray technology is now widely used in many areas of biomedical research. This thesis concerns the normalization of oligonucleotide microarray data. To compare normalization methods, we aimed to simulate more realistic oligonucleotide microarray data. The data simulation was based on Li and Wong's model with a hierarchical setup for the parameters. We simulated 100 data sets and assessed the performance of ten normalization methods on four comparison criteria. The simulation results suggest that our newly proposed normalization method, LOESS to Average, generally performs better than the other normalization methods.
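
As a rough illustration of the kind of pipeline the abstract describes, the sketch below simulates probe-level intensities from the general form of Li and Wong's model (MM_ij = ν_j + θ_i α_j + ε, PM_ij = ν_j + θ_i α_j + θ_i φ_j + ε) and then applies quantile normalization, one of the methods listed in the thesis outline. It is not the thesis's own code: the function names, the Gaussian noise, and all parameter values are assumptions made for illustration, and the hierarchical parameter setup and the LOESS to Average method are not reproduced because this record does not describe them.

```python
import random


def simulate_probe_set(thetas, phis, nus, alphas, sigma, rng):
    """Simulate PM and MM intensities for one probe set.

    Assumed model form (after Li and Wong):
        MM[i][j] = nu_j + theta_i * alpha_j + noise
        PM[i][j] = nu_j + theta_i * alpha_j + theta_i * phi_j + noise
    where i indexes arrays and j indexes probes. Gaussian noise with
    standard deviation `sigma` is an illustrative assumption.
    """
    pm, mm = [], []
    for theta in thetas:  # one row per array
        pm_row, mm_row = [], []
        for phi, nu, alpha in zip(phis, nus, alphas):  # one entry per probe
            base = nu + theta * alpha
            mm_row.append(base + rng.gauss(0.0, sigma))
            pm_row.append(base + theta * phi + rng.gauss(0.0, sigma))
        pm.append(pm_row)
        mm.append(mm_row)
    return pm, mm


def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    `arrays` is a list of equal-length lists, one per chip. The k-th
    smallest value of each array is replaced by the mean of the k-th
    smallest values across all arrays. Ties are broken by position.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda k: a[k])  # indices by ascending value
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = reference[rank]
        result.append(normalized)
    return result
```

A typical use would draw `thetas` and `phis` from chosen distributions, call `simulate_probe_set` with a seeded `random.Random`, and pass the resulting per-array intensity lists to `quantile_normalize` before comparing arrays.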

    Acknowledgements
    Abstract
    Chinese Abstract
    1 Introduction
    2 Literature Review
    2.1 Affymetrix Gene Chip Technologies
    2.2 Li and Wong's Model
    2.3 DNA-Chip (dChip)
    2.3.1 Invariant Normalization
    2.4 Robust Multi-Array Average (RMA)
    2.4.1 Background Correction in RMA
    2.4.2 Quantile Normalization in RMA
    2.4.3 Summarization in RMA: Median Polish
    2.5 Microarray Analysis Suite Software (MAS 5.0)
    2.5.1 Background Correction in MAS 5.0
    2.5.2 The Ideal Mismatch Value (IM)
    2.5.3 The Adjusted Log-Transformed PM Intensities
    2.5.4 One Step Tukey Biweight Algorithm
    2.5.5 Scaling Normalization
    2.6 Comparisons of Normalization Methods
    3 Methodology
    3.1 Scaling Method
    3.2 Median Centered
    3.3 Hybrid Scaling-Median Centered Methods
    3.4 Z^* Scores
    3.5 Quantile Normalization
    3.6 Cyclic LOESS
    3.7 New Proposed Normalization Method: LOESS to Average
    4 Real Data
    4.1 Real Data
    4.2 The Perfect Match (PM) Value
    4.3 The Mismatch (MM) Value
    4.4 The Theta (θ)
    4.5 The Phi (Φ)
    5 Simulation
    5.1 Common Simulation Setting
    5.2 Simulation Settings for Differentially Expressed Genes
    5.3 Simulated Data
    6 Comparisons of Normalization Methods
    6.1 Interquartile Range (IQR)
    6.2 Diff-statistics
    6.3 Mean Standard Deviation (MSD)
    6.3.1 Overall MSD
    6.3.2 Diff-MSD
    6.4 Ratio
    7 Discussion and Future Work
    7.1 Discussion of Comparison Criteria
    7.2 Summary of Comparisons for Normalization Methods
    7.3 Discussion of Simulation Settings
    7.3.1 Simulation Setting 1
    7.3.2 Simulation Setting 2 and Setting 3
    7.4 Future Work
    References
    Appendix

    [1] Affymetrix (2002), Statistical algorithms description
    document, Technical report, Affymetrix.


    [2] B. M. Bolstad, R. A. Irizarry, M. Astrand and T. P.
    Speed (2003), A comparison of normalization methods for
    high density oligonucleotide array data based on
    variance and bias, Bioinformatics, 19(2), 185-193.

    [3] R. A. Irizarry, B. Hobbs, F. Collin, Y. D. Beazer-
    Barclay, K. J. Antonellis, U. Scherf and T. P. Speed
    (2003), Exploration, normalization, and summaries of
    high density oligonucleotide array probe level data,
    Biostatistics, 4(2), 249-264.

    [4] C. Li and W. H. Wong (2001a), Model-based analysis of
    oligonucleotide arrays: expression index computation
    and outlier detection, Proceedings of the National
    Academy of Sciences USA, 98, 31-36.

    [5] C. Li and W. H. Wong (2001b), Model-based analysis of
    oligonucleotide arrays: model validation, design issues
    and standard error application, Genome Biology, 2(8),
    research0032.1-0032.11.

    [6] R. A. Irizarry, B. M. Bolstad, F. Collin, L. M. Cope,
    B. Hobbs and T. P. Speed (2003), Summaries of
    Affymetrix GeneChip probe level data, Nucleic Acids
    Research, 31(4), e15.

    [7] B. Bolstad (2001), Probe level quantile normalization of
    high density oligonucleotide array data, Division of
    Biostatistics.

    [8] B. Bolstad (2002), Comparing the effects of background,
    normalization and summarization on gene expression
    estimates.

    [9] Affymetrix (2001), GeneChip arrays provide optimal
    sensitivity and specificity for microarray expression
    analysis, Affymetrix.

    [10] B. M. Bolstad (2004), Low-level analysis of high-
    density oligonucleotide array data: background,
    normalization and summarization.

    [11] D. Holder, R. F. Raubertas, V. Bill Pikounis, V.
    Svetnik and K. Soper, Statistical analysis of high
    density oligonucleotide arrays: a safer approach,
    Merck Research Laboratories, WP37C-305, West Point, PA
    19486.

    [12] F. Naef, D. A. Lim, N. Patil and M. O. Magnasco
    (2001), From features to expression: High-density
    oligonucleotide array analysis revisited, Tech Report,
    1, 1-9.

    [13] R. Sasik, E. Calvo and J. Corbeil (2002), Statistical
    analysis of high-density oligonucleotide arrays: a
    multiplicative noise model, Bioinformatics, 18(12),
    1633-1640.

    [14] dChip User's Manual (2005)
    http://biosun1.harvard.edu/complab/dchip

    [15] 薛慧芬 (2005), The research of normalization methods
    for high density oligonucleotide array, Thesis at
    National Chengchi University.

    [16] S. Dudoit, Y. H. Yang, M. J. Callow and T. P. Speed
    (2000), Statistical methods for identifying
    differentially expressed genes in replicated cDNA
    microarray experiments.

    The full text of this thesis has not been authorized for public release.