Decision tree ING SHEN SSE TONGJLUNIVERSITY OCT.2016
Decision Tree Y I NG SH EN SSE, TO NG JI UNI VERSITY O CT. 2016
Decision tree We can solve a classification problem by asking a series of carefully crafted questions about the attributes of the test record Each time we receive an answer, a follow-up question is asked This process is continued until we reach a conclusion about the class label of the record 1/30/2021 PATTERN RECOGNITION
Decision tree We can solve a classification problem by asking a series of carefully crafted questions about the attributes of the test record. Each time we receive an answer, a follow-up question is asked. This process is continued until we reach a conclusion about the class label of the record. 1/30/2021 PATTERN RECOGNITION 2
Decision tree The series of questions and answers can be organized in the form of a decision tree It is a hierarchical structure consisting of nodes and directed edges. The tree has three types of nodes A root node that has no incoming edges and zero or more outgoing eages Internal nodes, each of which has exactly one incoming edge and two or more outgoing edges Leaf or terminal nodes, each of which has exactly one incoming edge and no outgoing edges 1/30/2021 PATTERN RECOGNITION
Decision tree The series of questions and answers can be organized in the form of a decision tree. It is a hierarchical structure consisting of nodes and directed edges. The tree has three types of nodes ◦ A root node that has no incoming edges, and zero or more outgoing edges. ◦ Internal nodes, each of which has exactly one incoming edge and two or more outgoing edges. ◦ Leaf or terminal nodes, each of which has exactly one incoming edge and no outgoing edges. 1/30/2021 PATTERN RECOGNITION 3
Decision tree In a decision tree, each leaf node is assigned a class label The non-terminal nodes, which include the root and other internal nodes, contain attribute test conditions to separate records that have different characteristics 1/30/2021 PATTERN RECOGNITION
Decision tree In a decision tree, each leaf node is assigned a class label. The non-terminal nodes, which include the root and other internal nodes, contain attribute test conditions to separate records that have different characteristics. 1/30/2021 PATTERN RECOGNITION 4
Decision tree Classifying a test record is straightforward once a decision tree has been constructed Starting from the root node, we apply the test condition We then follow the appropriate branch based on the outcome of the test This will lead us either to Another internal node, for which a new test condition is applied, or A leaf node The class label associated with the leaf node is then assigned to the record 1/30/2021 PATTERN RECOGNITION
Decision tree Classifying a test record is straightforward once a decision tree has been constructed. Starting from the root node, we apply the test condition. We then follow the appropriate branch based on the outcome of the test. This will lead us either to ◦ Another internal node, for which a new test condition is applied, or ◦ A leaf node. The class label associated with the leaf node is then assigned to the record. 1/30/2021 PATTERN RECOGNITION 5
Decision tree construction Efficient algorithms have been developed to induce a reasonably accurate, although suboptimal, decision tree in a reasonable amount of time These algorithms usually employ a greedy strategy that makes a series of locally optimal decisions about which attribute to use for partitioning the data 1/30/2021 PATTERN RECOGNITION
Decision tree construction Efficient algorithms have been developed to induce a reasonably accurate, although suboptimal, decision tree in a reasonable amount of time. These algorithms usually employ a greedy strategy that makes a series of locally optimal decisions about which attribute to use for partitioning the data. 1/30/2021 PATTERN RECOGNITION 6
Decision tree construction a decision tree is grown in a recursive fashion by partitioning the training records into successively purer subsets We suppose Un is the set of training records that are associated with node n C=CL C2,.,CK) is the set of class labels 1/30/2021 PATTERN RECOGNITION
Decision tree construction A decision tree is grown in a recursive fashion by partitioning the training records into successively purer subsets. We suppose ◦ Un is the set of training records that are associated with node n. ◦ C={c1 , c2 ,…, cK} is the set of class labels. 1/30/2021 PATTERN RECOGNITION 7
Decision tree construction If all the records in u belong to the same class cu then n is a leaf node labeled as c If Un contains records that belong to more than one class, An attribute test condition is selected to partition the records into smaller subsets a child node is created for each outcome of the test condition The records in u are distributed to the children based on the outcomes The algorithm is then recursively applied to each child node 1/30/2021 PATTERN RECOGNITION
Decision tree construction If all the records in Un belong to the same class ck , then n is a leaf node labeled as ck . If Un contains records that belong to more than one class, ◦ An attribute test condition is selected to partition the records into smaller subsets. ◦ A child node is created for each outcome of the test condition. ◦ The records in Un are distributed to the children based on the outcomes. The algorithm is then recursively applied to each child node. 1/30/2021 PATTERN RECOGNITION 8
Decision tree construction For each node, let p(ck) denotes the fraction of training records from class k In most cases the leaf node is assigned to the class that has the majority number of training records The fraction p(ck for a node can also be used to estimate the probability that a record assigned to that node belongs to class k 1/30/2021 PATTERN RECOGNITION 9
Decision tree construction For each node, let p(ck ) denotes the fraction of training records from class k. In most cases, the leaf node is assigned to the class that has the majority number of training records. The fraction p(ck ) for a node can also be used to estimate the probability that a record assigned to that node belongs to class k. 1/30/2021 PATTERN RECOGNITION 9
Decision tree construction Decision trees that are too large are susceptible to a phenomenon known as overfitting a tree pruning step can be performed to reduce the size of the decision tree Pruning helps by trimming the tree branches in a way that improves the generalization error. 1/30/2021 PATTERN RECOGNITION
Decision tree construction Decision trees that are too large are susceptible to a phenomenon known as overfitting. A tree pruning step can be performed to reduce the size of the decision tree. Pruning helps by trimming the tree branches in a way that improves the generalization error. 1/30/2021 PATTERN RECOGNITION 10