it is much more computationally intensive, since the regression must be computed for each value of j. Recall that PRESS could be computed from one regression, upon exploiting the diagonal of the hat matrix. If one pursues this idea in the choice of smoothing parameter, one obtains the so-called generalized cross-validation criterion, which turns out to have very similar behaviour to ordinary cross-validation but with lower computational requirements. The generalized cross-validation calculation is carried out using the trace of a matrix which plays the role of the hat matrix.[6]
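As an illustration of how the trace enters the calculation, the following is a minimal R sketch of the generalized cross-validation criterion. It is not code from the text: the argument smoother.matrix stands for a hypothetical user-supplied function returning the smoother matrix H(λ) for a given value of the smoothing parameter.

GCV <- function(lambda, y, smoother.matrix) {
    n <- length(y)
    H <- smoother.matrix(lambda)            # plays the role of the hat matrix
    yhat <- as.vector(H %*% y)              # fitted values from a single fit
    rss <- sum((y - yhat)^2)                # residual sum of squares
    (rss / n) / (1 - sum(diag(H)) / n)^2    # GCV score: (RSS/n) / (1 - tr(H)/n)^2
}

Minimizing this function over λ, for example with optimize(), selects the smoothing parameter using one fit per candidate λ rather than one fit per left-out observation.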
Figure 1.5 exhibits a plot which is very similar to what could be obtained using

lines(spline(smooth.spline(temperature, g)))

This invokes the smoothing spline with the smoothing parameter selected by generalized cross-validation. To use ordinary cross-validation, include the argument cv=TRUE.

Penalized splines

The smoothing spline is an example of the use of penalized least-squares. In this case, regression functions which have large second derivatives are penalized; that is, the objective function to be minimized has something positive added to it. Regression functions which have small second derivatives are penalized less, but they may fail to fit the data as well. Thus, the penalty approach seeks a compromise between fitting the data and satisfying a constraint (in this case, a small second derivative).

Eilers and Marx (1996) generalized this idea by observing that other forms of penalties may be used. Their penalized splines are based on a similar idea to the smoothing spline, with two innovations: the penalty term is no longer in terms of a second derivative but of a second divided difference, and the number of knots can be specified. (For smoothing splines, the number of knots is effectively equal to the number of observations.)[7]

Recall that the least-squares spline problem is to find coefficients β_0, β_1, ..., β_{k+p} to minimize

    Σ_{j=1}^n (y_j − S(x_j))^2

where S(x) = Σ_{i=0}^{k+p} β_i B_{i,p}(x). Eilers and Marx (1996) advocate the use of k equally spaced knots, instead of the order statistics of the predictor variable. Note that the number of knots must be chosen somehow, but this is simpler than choosing knot locations.

The smoothness of a spline is related to its B-spline coefficients; thus, Eilers and Marx (1996) replace the second derivative penalty of the smoothing spline with ℓth order differences of the B-spline coefficients. These are defined as follows:

    Δβ_i = β_i − β_{i−1}

[6] The hat matrix is called the smoother matrix in this context, and for smoothing splines it is of the form H(λ) = B(B^T B + λ D^T D)^{-1} B^T.
[7] Penalized splines are implemented in the function psplinereg() in the R package kspline, which can be obtained from http://www.stats.uwo.ca/faculty/braun/Rlinks.htm.
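The psplinereg() function mentioned in footnote 7 is not reproduced here; the following is a minimal sketch of a penalized-spline fit in that spirit. It assumes the splines package, and the function name pspline.fit, its arguments, and its default values are illustrative only: a B-spline basis on k equally spaced interior knots is combined with an ℓth order difference penalty on the coefficients, and the penalized least-squares problem is solved directly.

library(splines)

pspline.fit <- function(x, y, k = 20, p = 3, lambda = 1, ell = 2) {
    # k equally spaced interior knots over the range of x
    knots <- seq(min(x), max(x), length = k + 2)[-c(1, k + 2)]
    # degree-p B-spline basis; ncol(B) = k + p + 1 coefficients
    B <- bs(x, knots = knots, degree = p, intercept = TRUE)
    # ell-th order difference matrix acting on the B-spline coefficients
    D <- diff(diag(ncol(B)), differences = ell)
    # penalized least-squares solution: (B^T B + lambda D^T D)^{-1} B^T y
    beta <- solve(t(B) %*% B + lambda * t(D) %*% D, t(B) %*% y)
    list(coef = as.vector(beta), fitted = as.vector(B %*% beta))
}

With the temperature and g data from the earlier example, a call such as pspline.fit(temperature, g, lambda = 10) would return the coefficients and fitted values; the value of λ could in turn be chosen by a criterion such as generalized cross-validation.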