1.1. SPLINE REGRESSION 9

that was already there improves the fit as well.^4

1.1.3 Smoothing Splines

One way around the problem of choosing knots is to use lots of them. A result analogous to the Weierstrass approximation theorem says that any sufficiently smooth function can be approximated arbitrarily well by spline functions with enough knots.

The use of large numbers of knots alone is not sufficient to avoid trouble, since we will over-fit the data if the number of knots k is taken so large that p + k + 1 > n. In that case, we would have no degrees of freedom left for estimating the residual variance. A standard way of coping with this over-fitting problem is to add a penalty term to the least-squares problem. One requires that the resulting spline regression estimate have low curvature, as measured by the integrated square of the second derivative. More precisely, one may try to minimize (for a given constant λ)

    \sum_{j=1}^{n} (y_j - S(x_j))^2 + \lambda \int_a^b (S''(x))^2 \, dx

over the set of all functions S(x) which are twice continuously differentiable. The solution to this minimization problem has been shown to be a cubic spline which is surprisingly easy to calculate.^5 Thus, the problem of choosing a set of knots is replaced by selecting a value for the smoothing parameter λ. Note that if λ is small, the solution will be a cubic spline which almost interpolates the data; increasing values of λ render increasingly smooth approximations.

The usual way of choosing λ is by cross-validation. The ordinary cross-validation choice of λ minimizes

    CV(\lambda) = \sum_{j=1}^{n} (y_j - \hat{S}_{\lambda,(j)}(x_j))^2

where \hat{S}_{\lambda,(j)}(x) is the smoothing spline obtained using parameter λ and all data but the jth observation. Note that the CV function is similar in spirit to the PRESS statistic, but

^4 The plot in Figure 1.4 can be generated using

    y.lm <- lm(g ~ bs(temperature, knots=c(755, 835, 885, 895, 915, 975),
                      Boundary.knots=c(550, 1100)))
    plot(titanium)
    lines(spline(temperature, predict(y.lm)))

^5 The B-spline coefficients for this spline can be obtained from an expression of the form

    \hat{\beta} = (B^T B + \lambda D^T D)^{-1} B^T y

where B is the matrix used for least-squares regression splines and D is a matrix that arises in the calculation involving the squared second derivatives of the spline. Details can be found in de Boor (1978). It is sufficient to note here that this approach has similarities with ridge regression, and that the estimated regression is a linear function of the responses.
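The trade-off governed by λ can be seen numerically. The following is a minimal sketch, assuming recent SciPy (which provides `make_smoothing_spline`, a minimizer of the same penalized sum of squares); the data here are synthetic and not from the titanium example in the text.

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

# Synthetic data (an assumption for illustration, not the text's data set).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + 0.15 * rng.standard_normal(x.size)

# Small lambda: a cubic spline that nearly interpolates the data.
# Large lambda: a much smoother fit with larger residuals.
fit_rough = make_smoothing_spline(x, y, lam=1e-8)(x)
fit_smooth = make_smoothing_spline(x, y, lam=10.0)(x)

rss_rough = np.sum((y - fit_rough) ** 2)
rss_smooth = np.sum((y - fit_smooth) ** 2)
```

As the text says, the residual sum of squares grows with λ while the fitted curve's curvature shrinks.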
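The ridge-type formula in footnote 5, and the remark that the estimate is linear in the responses, can be sketched as follows. This is not de Boor's exact construction: the curvature matrix D is replaced here by a second-difference penalty (a P-spline-style approximation), and the data, knots, and λ grid are made up for illustration. Because the smoother is linear, the leave-one-out CV criterion has a PRESS-style closed form, which is what the text's comparison to the PRESS statistic suggests.

```python
import numpy as np
from scipy.interpolate import BSpline

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(2)
x = np.linspace(0.05, 0.95, 50)
y = np.cos(3 * x) + 0.1 * rng.standard_normal(x.size)

# Cubic B-spline design matrix B on an arbitrary knot sequence.
k = 3
interior = np.linspace(0.05, 0.95, 10)[1:-1]
t = np.r_[[0.05] * (k + 1), interior, [0.95] * (k + 1)]
B = BSpline.design_matrix(x, t, k).toarray()

# Second-difference penalty matrix, standing in for the exact curvature
# matrix D of the footnote.
ncoef = B.shape[1]
D = np.diff(np.eye(ncoef), n=2, axis=0)

def penalized_fit(lam):
    """Coefficients beta = (B'B + lam D'D)^{-1} B'y and the hat matrix."""
    M = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T)
    return M @ y, B @ M

def cv(lam):
    """Leave-one-out CV via the PRESS shortcut e_(j) = r_j / (1 - h_jj)."""
    beta, A = penalized_fit(lam)
    resid = y - B @ beta
    return np.sum((resid / (1.0 - np.diag(A))) ** 2)

lams = 10.0 ** np.arange(-8, 2.0)
lam_best = min(lams, key=cv)
```

The hat matrix A makes the linearity in y explicit, and its diagonal gives the leverages needed for the PRESS-style shortcut, so no model needs to be refit n times.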