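In nonparametric regression, the form of the underlying regression function is unknown, so it is commonly approximated by functions from a known class, and spline functions are one such class. Estimating a spline function requires specifying knots. More knots allow the spline to approximate the underlying function more closely, but too many knots mean more parameters to estimate, which makes the estimate less accurate. Choosing a suitable number of knots is therefore important.
In this study, cross-validation is used to find a suitable number of knots. Several ways of splitting the data into training data and testing data are considered, and the resulting knot selections and function estimates are compared across the splitting methods.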
In this thesis, I consider the problem of estimating an unknown regression function using spline approximation.
Splines are piecewise polynomials joined at knots. When using splines to approximate unknown functions, it is crucial to determine the number of knots and the knot locations. In this thesis, I determine the knot locations using least squares for a given number of knots, and use cross-validation to find an appropriate number of knots. I consider three methods to split the data into training data and testing data, and compare the estimation results.
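As a rough illustration of choosing the number of knots by cross-validation, the following is a minimal sketch and not the procedure used in the thesis. It makes several assumptions that are not in the original: a cubic truncated power basis, knots fixed at equally spaced positions (the thesis instead estimates knot locations by least squares), and random K-fold splitting as the single way of dividing the data. All function names are hypothetical.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Design matrix of a degree-`degree` spline with the given interior knots."""
    x = np.asarray(x, dtype=float)
    poly = [x**p for p in range(degree + 1)]                      # 1, x, ..., x^degree
    trunc = [np.clip(x - k, 0.0, None)**degree for k in knots]    # (x - k)_+^degree
    return np.column_stack(poly + trunc)

def equally_spaced_knots(m, lo, hi):
    """Assumption: m interior knots placed at equal spacing on (lo, hi)."""
    return np.linspace(lo, hi, m + 2)[1:-1]

def cv_error(x, y, m, n_folds=5, seed=0):
    """Mean squared prediction error of an m-knot spline under random K-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    knots = equally_spaced_knots(m, x.min(), x.max())
    sse = 0.0
    for test_idx in folds:
        train = np.ones(len(x), dtype=bool)
        train[test_idx] = False
        # Least-squares fit of the spline coefficients on the training part.
        coef, *_ = np.linalg.lstsq(
            truncated_power_basis(x[train], knots), y[train], rcond=None)
        pred = truncated_power_basis(x[test_idx], knots) @ coef
        sse += np.sum((y[test_idx] - pred)**2)
    return sse / len(x)

def select_num_knots(x, y, candidates=range(1, 11), **kwargs):
    """Return the candidate knot count with the smallest CV error, plus all errors."""
    errors = {m: cv_error(x, y, m, **kwargs) for m in candidates}
    return min(errors, key=errors.get), errors

# Simulated example (the test function and noise level are arbitrary choices).
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=x.size)
best_m, errors = select_num_knots(x, y)
print("selected number of knots:", best_m)
```

Replacing the random folds in `cv_error` with a different partition of the indices is the natural place to try other ways of splitting the data into training and testing sets.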
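Chapter 1 Introduction
Chapter 2 Literature Review
Section 1 Cross Validation
Section 2 Selection of Spline Knots
Chapter 3 Methodology
Section 1 Regression Function Estimation
Section 2 Data Splitting and Cross-Validation
Chapter 4 Simulation Studies
Section 1 f Is Not a Spline Function
Section 2 f Is a Spline Function
Chapter 5 Discussion and Suggestions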
David Ruppert. Selecting the number of knots for penalized splines. Journal of Computational and Graphical Statistics, 11(4):735–757, 2002.
E. F. Halpern. Bayesian spline regression when the number of knots is unknown. Journal of the Royal Statistical Society, Series B, 35:347–360, 1973.
Isaac Jacob Schoenberg. Contributions to the problem of approximation of equidistant data by analytic functions, Part B: On the problem of osculatory interpolation, a second class of analytic approximation formulae. Quarterly of Applied Mathematics, 4:112–141, 1946.
Jeff Racine. Feasible cross-validatory model selection for general stationary processes. Journal of Applied Econometrics, 12(2):169–179, 1997.
J. Shao. Linear model selection by cross-validation. Journal of the American Statistical Association, 88(422):486–494, 1993.
Mary C. Meyer. Inference using shape-restricted regression splines. The Annals of Applied Statistics, 2(3):1013–1033, 2008.
M. Stone. Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B, 36:111–147, 1974.
R. R. Picard and R. D. Cook. Cross-validation of regression models. Journal of the American Statistical Association, 79:575–583, 1984.
S. Geisser. A predictive sample reuse method with application. Journal of the American Statistical Association, 70:320–328, 1975.