
University of Electronic Science and Technology of China: Statistical Learning Theory and Applications, Lecture 7: Nonlinear Classification Models - Ensemble Methods (lecture slides)



Statistical Learning Theory and Applications
Lecture 7: Nonlinear Classification Models - Ensemble Methods
Prepared by: Wen Quan, Chen Juan
School of Computer Science and Engineering, University of Electronic Science and Technology of China


Contents

1. Rationale
2. Combining Multiple Classifiers
3. Bagging
4. Boosting
   - The AdaBoost algorithm
   - Another interpretation of AdaBoost


7.1. Rationale

- In any application, we can use several learning algorithms.
- The No Free Lunch Theorem: no single learning algorithm always induces the most accurate learner in every domain.
- Try many algorithms and choose the one with the best cross-validation results.
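To make the last point concrete, here is a minimal sketch (not part of the original slides) that selects among several candidate learners by 5-fold cross-validation; it assumes scikit-learn is available and uses its built-in breast-cancer dataset purely as a stand-in for any labeled data.

```python
# Model selection by cross-validation: try several learners, keep the best.
# Illustrative sketch; the dataset and candidate models are arbitrary choices.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "SVM (RBF kernel)": SVC(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

# Mean 5-fold cross-validation accuracy for each candidate learner.
scores = {name: cross_val_score(clf, X, y, cv=5).mean()
          for name, clf in candidates.items()}

best = max(scores, key=scores.get)
print(scores)
print("selected:", best)
```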


Rationale

- On the other hand ...
  - Each learning model comes with a set of assumptions and thus a bias.
  - Learning is an ill-posed problem (finite data): each model converges to a different solution and fails under different circumstances.
  - Why not combine multiple learners intelligently, which may lead to improved results?
- Why does it work?
  - Suppose there are 25 base classifiers, each with error rate ε = 0.35.
  - If the base classifiers are identical, and thus fully dependent, the ensemble misclassifies exactly the same examples that each base classifier predicts incorrectly, so its error rate remains 0.35.


Rationale

- Assume instead that the classifiers are independent, i.e., their errors are uncorrelated. Then the ensemble (majority vote) makes a wrong prediction only if more than half of the base classifiers predict incorrectly.
- Probability that the ensemble classifier makes a wrong prediction:

  $$\sum_{i=13}^{25} \binom{25}{i} \varepsilon^{i} (1 - \varepsilon)^{25-i} \approx 0.06$$

  Note: this is the upper tail of a binomial distribution with x ≥ 13, n = 25, p = 0.35.
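The figure 0.06 is easy to verify numerically. The following sketch (added for illustration, not from the slides) evaluates the binomial tail above using only the Python standard library.

```python
# Ensemble error for 25 independent base classifiers, each with error 0.35.
# A majority-vote ensemble errs only when 13 or more of the 25 members err.
from math import comb

eps, n = 0.35, 25
ensemble_error = sum(comb(n, i) * eps**i * (1 - eps)**(n - i)
                     for i in range(13, n + 1))
print(f"{ensemble_error:.3f}")  # 0.060, versus 0.35 for a single classifier
```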


Works if ...

- The base classifiers are independent.
- Each base classifier does better than random guessing (error < 0.5).
- In practice, it is hard to make the base classifiers perfectly independent. Nevertheless, ensemble methods have shown improvements even when the base classifiers are slightly correlated.
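To see why the error < 0.5 condition matters, the sketch below (an illustration, not from the slides) tabulates the majority-vote error of 25 independent base classifiers for several base error rates: below 0.5 the ensemble beats every individual member, at exactly 0.5 it matches random guessing, and above 0.5 it is actively worse.

```python
# Majority-vote error of n independent base classifiers as a function of
# the individual error rate eps (binomial tail starting at the majority).
from math import comb

def majority_vote_error(eps, n=25):
    k = n // 2 + 1  # smallest majority: 13 of 25
    return sum(comb(n, i) * eps**i * (1 - eps)**(n - i)
               for i in range(k, n + 1))

for eps in (0.1, 0.2, 0.35, 0.45, 0.5, 0.6):
    print(f"eps = {eps:.2f} -> ensemble error = {majority_vote_error(eps):.3f}")
# eps below 0.5 shrinks the ensemble error; eps above 0.5 amplifies it.
```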


Rationale

One important note:

- When we generate multiple base-learners, we want them to be reasonably accurate, but we do not require them to be very accurate individually. They are not, and need not be, optimized separately for best accuracy.
- The base learners are chosen not for their accuracy, but for their simplicity.


7.2. Combining Multiple Classifiers

Average the results from different models.

- Why?
  - Better classification performance than individual classifiers
  - More resilience to noise
- Why not?
  - Time consuming
  - Overfitting


Why

- Better classification performance than individual classifiers.
- More resilience to noise: besides avoiding the selection of the worst classifier under a particular hypothesis, fusing multiple classifiers can improve on the performance of the best individual classifier.
- This is possible if the individual classifiers make "different" errors.
- For linear combiners, Tumer and Ghosh (1996) showed that averaging the outputs of individual classifiers with unbiased and uncorrelated errors can improve on the performance of the best individual classifier.
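The variance-reduction effect behind this result can be demonstrated with a small simulation. The sketch below (an illustration under the stated assumptions; the noise level and number of classifiers are arbitrary choices, not from the slides) averages the outputs of L classifiers whose errors are unbiased and uncorrelated, and shows the error variance dropping by a factor of 1/L.

```python
# Linear combiner: averaging L unbiased, uncorrelated classifier outputs
# reduces the variance of the combined output by a factor of 1/L.
import numpy as np

rng = np.random.default_rng(0)
true_posterior = 0.7        # assumed true P(class 1 | x) at some input x
L, trials = 10, 100_000     # classifiers combined, Monte Carlo repetitions

# Each classifier's output = true posterior + zero-mean, uncorrelated noise.
outputs = true_posterior + rng.normal(0.0, 0.15, size=(trials, L))

single = outputs[:, 0]            # one individual classifier
combined = outputs.mean(axis=1)   # simple-average linear combiner

print("single   output variance:", single.var())    # ~0.0225
print("combined output variance:", combined.var())  # ~0.00225 = 0.0225 / L
```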

Architecture

[Figure: architectures for combining multiple classifiers: parallel, serial, and hybrid]
