Splines and penalized regression
Patrick Breheny
November 23
STA 621: Nonparametric Statistics

Sections: Introduction; Regression splines (parametric); Smoothing splines (nonparametric)

Introduction

- We are discussing ways to estimate the regression function $f$, where $E(y \mid x) = f(x)$
- One approach is of course to assume that $f$ has a certain shape, such as linear or quadratic, that can be estimated parametrically
- We have also discussed locally weighted linear/polynomial models as a way of allowing $f$ to be more flexible
- An alternative, more direct approach is penalization

Controlling smoothness with penalization

- Here, we directly solve for the function $f$ that minimizes the following objective function, a penalized version of the least squares objective:

\[ \sum_{i=1}^n \{y_i - f(x_i)\}^2 + \lambda \int \{f''(u)\}^2 \, du \]

- The first term captures the fit to the data, while the second penalizes curvature; note that for a line, $f''(u) = 0$ for all $u$
- Here, $\lambda$ is the smoothing parameter, and it controls the tradeoff between the two terms:
  - $\lambda = 0$ imposes no restrictions and $f$ will therefore interpolate the data
  - $\lambda = \infty$ renders curvature impossible, thereby returning us to ordinary linear regression

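The two limiting cases of $\lambda$ can be checked numerically with a discrete analogue of the objective. The sketch below is not the smoothing spline itself: it assumes equally spaced $x$ values and replaces $\int \{f''(u)\}^2 du$ with the sum of squared second differences of the fitted values. The function name `discrete_penalized_fit` and the simulated data are illustrative assumptions.

```python
import numpy as np

def discrete_penalized_fit(y, lam):
    """Minimize sum (y_i - f_i)^2 + lam * sum (second differences of f)^2.

    Discrete stand-in for the penalized objective, assuming equally
    spaced x values. The minimizer solves (I + lam * D'D) f = y.
    """
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)  # (n-2) x n second-difference matrix
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

fit_interp = discrete_penalized_fit(y, 0.0)     # lam = 0: no penalty
fit_line = discrete_penalized_fit(y, 1e8)       # very large lam: curvature suppressed
ols_line = np.polyval(np.polyfit(x, y, 1), x)   # ordinary least squares line

print(np.max(np.abs(fit_interp - y)))        # fitted values reproduce the data
print(np.max(np.abs(fit_line - ols_line)))   # fitted values are close to the OLS line
```

With no penalty the fitted values equal the observations, and with a very large penalty they approach the least squares line, matching the two cases described above.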
Splines

- It may sound impossible to solve for such an $f$ over all possible functions, but the solution turns out to be surprisingly simple
- This solution, it turns out, depends on a class of functions called splines
- We will begin by introducing splines themselves, then move on to discuss how they represent a solution to our penalized regression problem

Basis functions

- One approach for extending the linear model is to represent $x$ using a collection of basis functions:

\[ f(x) = \sum_{m=1}^M \beta_m h_m(x) \]

- Because the basis functions $\{h_m\}$ are prespecified and the model is linear in these new variables, ordinary least squares approaches for model fitting and inference can be employed
- This idea is probably not new to you, as transformations and expansions using polynomial bases are common

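As a minimal sketch of the basis-function idea, the example below uses the polynomial basis $h_m(x) = x^{m-1}$ and fits the coefficients by ordinary least squares. The helper name `polynomial_basis` and the noiseless quadratic data are assumptions chosen for illustration.

```python
import numpy as np

def polynomial_basis(x, M):
    """Design matrix with columns h_m(x) = x^(m-1) for m = 1, ..., M."""
    return np.vander(np.asarray(x, dtype=float), N=M, increasing=True)

# Illustrative noiseless data from a known quadratic
x = np.linspace(-1, 1, 25)
y = 1.0 + 2.0 * x - 3.0 * x**2

H = polynomial_basis(x, 3)                         # columns: 1, x, x^2
beta_hat, *_ = np.linalg.lstsq(H, y, rcond=None)   # ordinary least squares

print(beta_hat)  # coefficients on 1, x, x^2
```

Once the columns of `H` are computed, the fit is an ordinary linear regression on those columns; only the choice of basis changes from one model to the next.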
Global versus local bases

- However, polynomial bases with global representations have undesirable side effects: each observation affects the entire curve, even for $x$ values far from the observation
- In previous lectures, we got around this problem with local weighting
- In this lecture, we will explore instead an approach based on piecewise basis functions
- As we will see, splines are piecewise polynomials joined together to make a single smooth curve

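The global influence of a single observation can be seen in a small experiment. This is a sketch on simulated data, with a cubic polynomial chosen as an assumed example of a global basis: one observation at the left end of the range is shifted, and the change in the fitted value at the right end is recorded.

```python
import numpy as np

x = np.linspace(0, 1, 21)
y = np.sin(2 * np.pi * x)

def cubic_fit_at(x_obs, y_obs, x_new):
    """Fit a global cubic polynomial by least squares and evaluate it at x_new."""
    return np.polyval(np.polyfit(x_obs, y_obs, deg=3), x_new)

# Shift only the first observation (at x = 0) by one unit
y_bumped = y.copy()
y_bumped[0] += 1.0

# Change in the fitted curve at the far end of the range (x = 1)
change_far = cubic_fit_at(x, y_bumped, x[-1]) - cubic_fit_at(x, y, x[-1])
print(change_far)
```

The change is nonzero even though the perturbed observation sits at the opposite end of the range, which is the side effect the piecewise approach is meant to limit.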
The piecewise constant model

- To understand splines, we will gradually build up a piecewise model, starting at the simplest one: the piecewise constant model
- First, we partition the range of $x$ into $K + 1$ intervals by choosing $K$ points $\{\xi_k\}_{k=1}^K$ called knots
- For our example involving bone mineral density, we will choose the tertiles of the observed ages

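The piecewise constant model can be sketched as a regression on interval indicators. The example below uses simulated ages and responses as a stand-in (it is not the bone mineral density data), places $K = 2$ knots at the tertiles, and fits by least squares; all variable names are illustrative.

```python
import numpy as np

# Simulated stand-in data; not the bone mineral density data set
rng = np.random.default_rng(0)
age = rng.uniform(10, 25, size=150)
y = np.where(age < 15, 0.08, 0.01) + rng.normal(scale=0.02, size=age.size)

# K = 2 knots at the tertiles of the observed ages, giving K + 1 = 3 intervals
knots = np.quantile(age, [1 / 3, 2 / 3])
interval = np.digitize(age, knots)  # interval index 0, 1, or 2 for each age

# Indicator basis: h_k(x) = 1 if x falls in interval k, else 0
H = (interval[:, None] == np.arange(len(knots) + 1)).astype(float)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

# Least squares on indicators reproduces the mean response in each interval
interval_means = np.array([y[interval == k].mean() for k in range(len(knots) + 1)])
print(beta)
print(interval_means)
```

Because each observation belongs to exactly one interval, the least squares coefficients are the within-interval means of the response.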
The piecewise constant model (cont'd)

[Figure: spnbmd plotted against age (10 to 25), in separate panels for female and male]

The piecewise linear model

[Figure: spnbmd plotted against age (10 to 25), in separate panels for female and male]

The continuous piecewise linear model

[Figure: spnbmd plotted against age (10 to 25), in separate panels for female and male]

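One common way to fit a continuous piecewise linear model is with the basis $1, x, (x - \xi_1)_+, \ldots, (x - \xi_K)_+$, where $(u)_+ = \max(u, 0)$. The slide does not state which basis was used for the figure, so this choice is an assumption, as are the simulated data and the helper names `hinge_basis` and `predict`.

```python
import numpy as np

def hinge_basis(x, knots):
    """Columns: 1, x, (x - xi_1)_+, ..., (x - xi_K)_+ (assumed basis choice)."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

# Simulated stand-in data; not the bone mineral density data set
rng = np.random.default_rng(1)
age = rng.uniform(10, 25, size=200)
y = 0.15 * np.exp(-0.3 * (age - 10)) + rng.normal(scale=0.02, size=age.size)

knots = np.quantile(age, [1 / 3, 2 / 3])  # K = 2 knots at the tertiles
beta, *_ = np.linalg.lstsq(hinge_basis(age, knots), y, rcond=None)

def predict(x_new):
    """Evaluate the fitted continuous piecewise linear function."""
    return hinge_basis(x_new, knots) @ beta

# The fit should not jump at the knots: compare values just left and right of each
eps = 1e-8
jumps = predict(knots + eps) - predict(knots - eps)
print(beta.shape, jumps)
```

With $K = 2$ knots this model has 4 coefficients, compared with 6 for separate, unconstrained lines on the three intervals; each continuity constraint at a knot removes one parameter.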