Advanced Artificial Intelligence Lecture 3: Decision Tree
Outline
▪ Introduction
▪ Constructing a Decision Tree
▪ ID3
▪ C4.5
▪ Regression Trees
▪ CART
▪ Gradient Boosting
Decision Tree Introduction
⚫ The Decision Tree is one of the most powerful and popular classification and prediction algorithms currently used in data mining and machine learning.
⚫ Decision trees are attractive because, in contrast to neural networks, they represent rules.
⚫ These rules can readily be expressed in a form that humans can understand, or used directly in a database access language such as SQL so that records falling into a particular category can be retrieved.
Decision Tree
⚫ A decision tree consists of
• Nodes: test the value of a certain attribute
• Edges: correspond to the outcome of a test and connect to the next node or leaf
• Leaves: terminal nodes that predict the outcome
Decision Tree
⚫ Example: classifying with a decision tree
1. Start at the root.
2. Perform the test.
3. Follow the edge corresponding to the outcome.
4. Go to step 2 unless the current node is a leaf.
5. Predict the outcome associated with the leaf.
[Figure: example decision tree from a German-language car advertisement; the root asks "Should your new car be worth its price?"]
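The five classification steps can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the `Node` and `Leaf` classes, the `classify` function, and the toy tree about rain and umbrellas are all hypothetical names and data chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    prediction: str  # outcome predicted by this terminal node


@dataclass
class Node:
    attribute: str  # attribute whose value is tested at this node
    # Each edge maps one test outcome to the next node or leaf.
    edges: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(root, example):
    current = root                            # 1. start at the root
    while isinstance(current, Node):          # 4. repeat until a leaf is reached
        outcome = example[current.attribute]  # 2. perform the test
        current = current.edges[outcome]      # 3. follow the matching edge
    return current.prediction                 # 5. predict the leaf's outcome


# Hypothetical toy tree, invented purely for illustration.
toy_tree = Node("raining", {
    "no": Leaf("walk"),
    "yes": Node("umbrella", {"yes": Leaf("walk"), "no": Leaf("bus")}),
})

print(classify(toy_tree, {"raining": "yes", "umbrella": "no"}))
```

The loop visits one node per level, so classifying an example costs at most as many tests as the tree is deep.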
Decision Tree Learning
⚫ In Decision Tree Learning, a new example is classified by submitting it to a series of tests that determine the class label of the example. These tests are organized in a hierarchical structure called a decision tree.
⚫ The training examples are used for choosing appropriate tests in the decision tree. Typically, a tree is built from top to bottom, where tests that maximize the information gain about the classification are selected first.
[Figure: training examples produce the decision tree; a new example is passed through the tree to obtain a classification]
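To make "tests that maximize the information gain" concrete, the sketch below uses the usual entropy-based definition: the gain of a test is the entropy of the class labels minus the weighted entropy of the subsets the test produces. The slide does not spell out this formula, so treat it as an assumption; the function names and the four-row dataset are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical data: attribute "a" separates the classes perfectly, "b" not at all.
rows = [
    {"a": "x", "b": "p"},
    {"a": "x", "b": "q"},
    {"a": "y", "b": "p"},
    {"a": "y", "b": "q"},
]
labels = ["yes", "yes", "no", "no"]

for attribute in ("a", "b"):
    print(attribute, information_gain(rows, labels, attribute))
```

On this toy data a top-down learner would test `a` first, because splitting on it leaves every subset with a single class.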
Why Decision Tree?
⚫ To classify
• Response variable has only two categories: use a standard classification tree
• Response variable has multiple categories: use a C4.5 implementation
⚫ To predict (response variable is continuous)
• Linear relationships between predictors and response: use a standard regression tree
• Nonlinear relationships between predictors and response: use a C4.5 implementation
A Sample Task
Day | Temperature | Outlook | Humidity | Windy | Play Golf?
07-05 | hot | sunny | high | false |
07-06 | hot | | high | true |
07-07 | hot | overcast | high | false | yes
07-09 | cool | rain | normal | false | yes
07-10 | cool | overcast | normal | true |
07-12 | mild | sunny | high | false |
07-14 | | sunny | normal | false | yes
07-15 | mild | rain | normal | false | yes
07-20 | | | | true | yes
07-21 | mild | overcast | high | true |
07-22 | hot | overcast | normal | false | yes
07-23 | mild | rain | high | | no
07-26 | cool | rain | normal | |
07-30 | mild | rain | high | false | yes
today | cool | sunny | | false |
tomorrow | mild | sunny | normal | false |
A Sample Task
⚫ Decision tree for the Play Golf? task
Outlook
• sunny → Humidity
  ◦ normal → yes
  ◦ high → no
• overcast → yes
• rain → Windy
  ◦ true → no
  ◦ false → yes
⚫ Example to classify: tomorrow | mild | sunny | normal | false
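Because a decision tree represents rules, each root-to-leaf path of the sample tree can be written out as one IF-THEN rule. The sketch below assumes the tree reads as follows: Outlook at the root, sunny leading to a Humidity test (normal: yes, high: no), overcast leading directly to yes, and rain leading to a Windy test (true: no, false: yes). The nested-tuple encoding and the names `extract_rules` and `predict` are hypothetical choices for this example.

```python
# A subtree is either a class label (str) or a pair (attribute, {outcome: subtree}).
golf_tree = ("Outlook", {
    "sunny": ("Humidity", {"normal": "yes", "high": "no"}),
    "overcast": "yes",
    "rain": ("Windy", {"true": "no", "false": "yes"}),
})


def extract_rules(tree, conditions=()):
    """Return one (conditions, outcome) rule per root-to-leaf path."""
    if isinstance(tree, str):  # leaf: the path so far is a complete rule
        return [(conditions, tree)]
    attribute, branches = tree
    rules = []
    for outcome, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((attribute, outcome),)))
    return rules


def predict(rules, example):
    """Return the outcome of the first rule whose conditions all hold."""
    for conditions, outcome in rules:
        if all(example[attr] == value for attr, value in conditions):
            return outcome
    return None  # no rule matched (an attribute value the tree never saw)


rules = extract_rules(golf_tree)
for conditions, outcome in rules:
    premise = " AND ".join(f"{attr} = {value}" for attr, value in conditions)
    print(f"IF {premise} THEN Play Golf? = {outcome}")

tomorrow = {"Temperature": "mild", "Outlook": "sunny",
            "Humidity": "normal", "Windy": "false"}
print("tomorrow:", predict(rules, tomorrow))
```

Under that reading, the example for tomorrow follows the sunny branch, passes the Humidity test with normal, and is classified as yes. Temperature never appears in any rule because the tree does not test it.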
Divide-And-Conquer Algorithms
⚫ Family of decision tree learning algorithms
• TDIDT: Top-Down Induction of Decision Trees
⚫ Learn trees in a top-down fashion
• divide the problem into subproblems
• solve each subproblem
⚫ Basic Divide-And-Conquer Algorithm
1. Select a test for the root node and create a branch for each possible outcome of the test.
2. Split the instances into subsets, one for each branch extending from the node.
3. Repeat recursively for each branch, using only the instances that reach the branch.
4. Stop the recursion for a branch if all its instances have the same class.
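The basic divide-and-conquer algorithm maps naturally onto a short recursive function. The following is a sketch under stated assumptions: the slide does not say how the test is selected, so selection is passed in as a function, and the default `first_attribute` is a deliberately naive placeholder (a learner such as ID3 would plug in an information-gain criterion instead). The majority-class fallback when no attributes remain is an added safeguard that is not part of the four steps. All names and the four-row dataset are hypothetical.

```python
from collections import Counter


def first_attribute(rows, labels, attributes):
    """Placeholder test selection: simply take the first remaining attribute."""
    return attributes[0]


def build_tree(rows, labels, attributes, choose=first_attribute):
    # 4. Stop the recursion if all instances have the same class.
    if len(set(labels)) == 1:
        return labels[0]
    # Added safeguard (not on the slide): no tests left, predict the majority class.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]

    # 1. Select a test for this node.
    attribute = choose(rows, labels, attributes)
    remaining = [a for a in attributes if a != attribute]

    # 2. Split the instances into one subset per outcome of the test.
    subsets = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = subsets.setdefault(row[attribute], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)

    # 3. Recurse on each branch, using only the instances that reach it.
    return (attribute, {
        outcome: build_tree(sub_rows, sub_labels, remaining, choose)
        for outcome, (sub_rows, sub_labels) in subsets.items()
    })


# Hypothetical training data, invented for illustration.
rows = [
    {"sky": "clear", "wind": "calm"},
    {"sky": "clear", "wind": "strong"},
    {"sky": "cloudy", "wind": "calm"},
    {"sky": "cloudy", "wind": "strong"},
]
labels = ["go", "stay", "go", "go"]

tree = build_tree(rows, labels, ["sky", "wind"])
print(tree)
```

Here the `cloudy` branch stops immediately because both of its instances share one class, while the `clear` branch needs a second test on `wind`.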