
5.3.5 Algorithm for Decision Tree Induction

A skeleton decision tree induction algorithm called TreeGrowth is shown in Algorithm 4.1. The input to this algorithm consists of the training records E and the attribute set F. The algorithm works by recursively selecting the best attribute to split the data (Step 7) and expanding the leaf nodes of the tree (Steps 11 and 12) until the stopping criterion is met (Step 1). The details of this algorithm are explained below; a minimal code sketch of the skeleton follows the list.
1. The createNode() function extends the decision tree by creating a new node. A node in the
decision tree has either a test condition, denoted as node.test_cond, or a class label, denoted as
node.label.
2. The find_best_split() function determines which attribute should be selected as the test condition
for splitting the training records. As previously noted, the choice of test condition depends on
which impurity measure is used to determine the goodness of a split. Some widely used measures
include entropy, the Gini index, and the chi-square statistic.
3. The Classify() function determines the class label to be assigned to a leaf node. For each leaf
node t, let p(i|t) denote the fraction of training records from class i associated with the node t. In
most cases, the leaf node is assigned to the class that has the majority number of training records:
leaf.label = argmax_i p(i|t),
where the argmax operator returns the argument i that maximizes the expression p(i|t). Besides
providing the information needed to determine the class label of a leaf node, the fraction p(i|t) can
also be used to estimate the probability that a record assigned to the leaf node t belongs to class i.
4. The stopping_cond() function is used to terminate the tree-growing process by testing whether
all the records have either the same class label or the same attribute values. Another way to
terminate the recursive function is to test whether the number of records has fallen below some
minimum threshold. After building the decision tree, a tree-pruning step can be performed to
reduce the size of the decision tree. Decision trees that are too large are susceptible to a
phenomenon known as overfitting. Pruning helps by trimming the branches of the initial tree in a
way that improves the generalization capability of the decision tree.
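Since the pseudocode of Algorithm 4.1 is not reproduced above, the following is a minimal Python sketch of the same recursive skeleton under a few simplifying assumptions: records are (attribute-dictionary, class-label) pairs, all attributes are categorical, and find_best_split() uses the Gini index as its impurity measure. The helper names mirror the functions described in the list; the Node class and the data layout are illustrative and not part of the original algorithm.

from collections import Counter


class Node:
    # A node holds either a test condition (internal node) or a class label (leaf).
    def __init__(self):
        self.test_cond = None   # attribute used to split at this node
        self.label = None       # class label, set only for leaf nodes
        self.children = {}      # attribute value -> child node


def createNode():
    return Node()


def gini(records):
    # Gini index of a list of (attributes, label) records.
    counts = Counter(label for _, label in records)
    n = len(records)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def find_best_split(E, F):
    # Choose the attribute whose split yields the lowest weighted Gini index.
    best_attr, best_impurity = None, float("inf")
    for attr in F:
        partitions = {}
        for x, y in E:
            partitions.setdefault(x[attr], []).append((x, y))
        weighted = sum(len(p) / len(E) * gini(p) for p in partitions.values())
        if weighted < best_impurity:
            best_attr, best_impurity = attr, weighted
    return best_attr


def Classify(E):
    # Majority class: argmax_i p(i|t) over the records that reach the leaf.
    counts = Counter(label for _, label in E)
    return counts.most_common(1)[0][0]


def stopping_cond(E, F):
    # Stop when every record has the same class label or no attributes remain.
    return len({label for _, label in E}) <= 1 or len(F) == 0


def TreeGrowth(E, F):
    # Recursively grow a decision tree from training records E and attributes F.
    if stopping_cond(E, F):
        leaf = createNode()
        leaf.label = Classify(E)
        return leaf
    root = createNode()
    root.test_cond = find_best_split(E, F)
    # Partition E on the chosen attribute and recurse on each non-empty subset.
    partitions = {}
    for x, y in E:
        partitions.setdefault(x[root.test_cond], []).append((x, y))
    remaining = [a for a in F if a != root.test_cond]
    for value, subset in partitions.items():
        root.children[value] = TreeGrowth(subset, remaining)
    return root

Classifying a test record with the resulting tree is simply a walk from the root to a leaf along the matching test-condition outcomes, which is the source of the O(w) worst-case cost noted in the next section.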

5.3.6 Characteristics of Decision Tree Induction


The following is a summary of the important characteristics of decision tree induction algorithms.
1. Decision tree induction is a nonparametric approach for building classification models. In other
words, it does not require any prior assumptions regarding the type of probability distributions
satisfied by the class and other attributes.
2. Finding an optimal decision tree is an NP-complete problem. Many decision tree algorithms
employ a heuristic-based approach to guide their search in the vast hypothesis space. For example,
the algorithm presented in Section 5.3.5 uses a greedy, top-down, recursive partitioning strategy
for growing a decision tree.
3. Techniques developed for constructing decision trees are computationally inexpensive, making
it possible to quickly construct models even when the training set size is very large. Furthermore,
once a decision tree has been built, classifying a test record is extremely fast, with a worst-case
complexity of O(w), where w is the maximum depth of the tree.
4. Decision trees, especially smaller-sized trees, are relatively easy to interpret. The accuracies of
the trees are also comparable to other classification techniques for many simple data sets.
5. Decision trees provide an expressive representation for learning discrete-valued functions.
However, they do not generalize well to certain types of Boolean problems. One notable example
is the parity function, whose value is 0 (1) when there is an odd (even) number of Boolean
attributes with the value True. Accurate modeling of such a function requires a full decision tree
with 2^d nodes, where d is the number of Boolean attributes. A small experiment illustrating this
point is sketched below.
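The following sketch, which assumes scikit-learn is available and is not part of the original text, fits the d-bit parity function with a DecisionTreeClassifier and reports how large the resulting tree is.

from itertools import product
from sklearn.tree import DecisionTreeClassifier

d = 4  # number of Boolean attributes
# Enumerate all 2^d Boolean inputs; the label is 1 when an even number of bits are True.
X = list(product([0, 1], repeat=d))
y = [int(sum(bits) % 2 == 0) for bits in X]

clf = DecisionTreeClassifier()  # default settings grow the tree until the leaves are pure
clf.fit(X, y)

# No single attribute reduces the impurity of the parity function, so the fitted
# tree typically reaches depth d with on the order of 2^d leaf nodes.
print("depth:", clf.get_depth(), "total nodes:", clf.tree_.node_count)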
