6 Comparing the various criteria

Comments on the frequentist approaches

• Backward and forward selection procedures will tend to give different models as “best”. The conclusions depend on the order in which models are tried.
• P-values may be poorly approximated in small sample sizes, but most software packages ignore this.
• Because so many tests are done, there is a large danger of multiple testing, so unimportant independent variables may enter the model (a small simulation sketch after this list illustrates this).
• For the same reasons, confidence intervals from models selected by backward/forward procedures tend to be too narrow.
• Backward and forward selection procedures are based on p-values, and so have all of the usual problems associated with them, plus an additional problem: in these methods, the next step depends on what happened in the previous step, but the calculated p-values ignore this, and so are not interpretable in the usual way as “true” p-values.
• Also, p-values assume only two models are being compared, for example
  Y = X1 + ... + Xp−1 + Xp is compared to Y = X1 + ... + Xp−1,
  but in reality, many other models are simultaneously being considered.
• P-values tend to exaggerate the “weight of evidence”, leading to models that are, in general, too large (we will see an example of this later).
• These methods tend to select a “single best model”, but what if two or more models are equally plausible?
• Prior knowledge about covariates is ignored, which is wasteful of information when good prior information exists.
• Models must be “nested”, i.e., one cannot test two non-nested models against each other, such as
  Y = X1 + X3 + X4
  versus a model whose covariates are not a subset of these.
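The sketch below (not part of the original slides) illustrates the multiple-testing point from the list: even when every covariate is pure noise, a naive p-value-driven forward selection will often admit “significant” predictors. It assumes numpy and statsmodels are available; the 0.05 entry threshold and the data dimensions are arbitrary choices for illustration.

  # Minimal sketch: forward selection by smallest p-value on pure-noise data.
  import numpy as np
  import statsmodels.api as sm

  rng = np.random.default_rng(0)
  n, p = 50, 20
  X = rng.normal(size=(n, p))          # 20 covariates, none related to y
  y = rng.normal(size=n)               # pure-noise response

  selected, remaining = [], list(range(p))
  while remaining:
      best_j, best_p = None, 1.0
      for j in remaining:
          cols = selected + [j]
          fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
          pj = fit.pvalues[-1]         # p-value of the newly added candidate
          if pj < best_p:
              best_j, best_p = j, pj
      if best_p < 0.05:                # naive entry criterion, ignores prior steps
          selected.append(best_j)
          remaining.remove(best_j)
      else:
          break

  print(f"Noise covariates selected: {selected} (a correct procedure should select none)")

Because each step chooses the best of many candidate p-values, the reported p-value of the entering variable is not interpretable at face value, which is exactly the criticism made in the list above.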