Graduate student: Qiu, Bang-Xu (邱邦旭)
Thesis title: Boosting method for length-biased and interval-censored survival data subject to high-dimensional error-prone covariates (帶有高維度測量誤差之長度偏差與區間設限資料的提升方法)
Advisor: Chen, Li-Pang (陳立榜)
Oral examination committee: Chen, Li-Pang (陳立榜); Chou, Elizabeth (周珮婷); Chang, Hsing-Ming (張欣民)
Degree: Master
Department: College of Commerce - Department of Statistics
Year of publication: 2022
Academic year of graduation: 110
Language: Chinese
Number of pages: 41
Keywords: AFT model, biased sampling, incomplete data, measurement error correction, variable selection, SIMEX
DOI URL: http://doi.org/10.6814/NCCU202200857
  • Analysis of length-biased and interval-censored data is an important topic in survival analysis, and many methods have been developed to address this complex data structure. However, existing methods focus on low-dimensional data and assume that the covariates are measured precisely, whereas high-dimensional data subject to measurement error are frequently collected in applications. In this thesis, we propose a valid inference method for high-dimensional length-biased and interval-censored survival data with measurement error in covariates under the accelerated failure time (AFT) model. We employ the SIMEX method to correct for the effects of measurement error and propose a boosting procedure for variable selection and estimation. The proposed method handles the case in which the dimension of the covariates exceeds the sample size, and it leaves the distributions of the covariates unspecified.
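The SIMEX idea named in the abstract can be illustrated with a minimal sketch. It is not the thesis's SIMEXBoost procedure: it assumes a simple linear regression rather than an AFT model with length-biased, interval-censored data, a classical additive error with known standard deviation `sigma_u`, and a quadratic extrapolation function. The function name `simex_slope`, its parameters, and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def simex_slope(w, y, sigma_u, zetas=(0.0, 0.5, 1.0, 1.5, 2.0), n_rep=100, seed=0):
    """Return (naive_slope, simex_slope) for y = a + b*x + e when only w = x + u is observed.

    Assumption: classical additive error u ~ N(0, sigma_u**2) with sigma_u known.
    """
    rng = np.random.default_rng(seed)
    zetas = np.asarray(zetas, dtype=float)
    mean_slopes = []
    for zeta in zetas:
        # Simulation step: add extra noise so that the total error variance
        # becomes (1 + zeta) * sigma_u**2, then refit and average the slopes.
        slopes = [
            np.polyfit(w + np.sqrt(zeta) * sigma_u * rng.standard_normal(w.shape), y, 1)[0]
            for _ in range(n_rep)
        ]
        mean_slopes.append(np.mean(slopes))
    # Extrapolation step: fit a quadratic in zeta and evaluate it at zeta = -1,
    # which corresponds to the hypothetical error-free case.
    quad = np.polyfit(zetas, mean_slopes, 2)
    return mean_slopes[0], np.polyval(quad, -1.0)


# Toy data, entirely simulated for illustration; the true slope is 2.
rng = np.random.default_rng(1)
n, true_slope, sigma_u = 5000, 2.0, 0.5
x = rng.standard_normal(n)                        # unobserved true covariate
y = 1.0 + true_slope * x + 0.5 * rng.standard_normal(n)
w = x + sigma_u * rng.standard_normal(n)          # observed error-prone covariate

naive, corrected = simex_slope(w, y, sigma_u)
print(f"naive slope: {naive:.3f}, SIMEX slope: {corrected:.3f}")
```

In this toy setting the naive slope is attenuated toward zero, and the extrapolated value should lie closer to the true slope. The quadratic extrapolant is only an approximation, so some residual bias is expected.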

    Abstract
    Table of Contents
    Tables
    Figures
    Chapter 1 Introduction
    Chapter 2 Notation and Models
    2.1 Length-Biased and Partly Interval-Censored Data
    2.2 Accelerated Failure Time Models
    2.3 Measurement Error Models
    Chapter 3 Methodology
    3.1 SIMEXBoost
    3.2 SIMEXBoost with Collinearity in Covariates
    Chapter 4 Numerical Studies
    4.1 Simulation Setup
    4.2 Simulation Results
    4.3 Application to the Signal Tandmobiel Study
    Chapter 5 Summary
    References

