
ASSIGNMENT - 4

REDDIVARI SAI SARAN


G01142501

Problem 1-

Classifier      Correctly classified instances (%)   RMSE
DecisionStump   66.66%                               0.3333
J48             96%                                  0.1586
IBk (k = 3)     95.33%                               0.1703
IBk (k = 5)     95.33%                               0.1444
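
The numbers above come from Weka's DecisionStump, J48, and IBk classifiers. As a rough cross-check, here is a minimal Python sketch of the same comparison, assuming scikit-learn (which the original assignment did not use): DecisionTreeClassifier(max_depth=1) stands in for DecisionStump, a full DecisionTreeClassifier approximates J48 (C4.5), and KNeighborsClassifier plays the role of IBk. Exact accuracies will differ from Weka's output.

# Hedged scikit-learn sketch of the Weka comparison above (not the original tool).
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

classifiers = {
    "DecisionStump": DecisionTreeClassifier(max_depth=1),  # one-level tree
    "J48-like tree": DecisionTreeClassifier(),             # full tree, approximates C4.5
    "IBk (k = 3)":   KNeighborsClassifier(n_neighbors=3),
    "IBk (k = 5)":   KNeighborsClassifier(n_neighbors=5),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=10)  # 10-fold CV, as in Weka's default
    print(f"{name}: {scores.mean():.2%} correctly classified")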

DecisionStump is a one-level decision tree: it has a single internal node (the root) connected
directly to the terminal nodes (leaves) [1].

The root node classifies inputs based on a single feature. Each leaf corresponds to a possible
value (or range) of that feature and holds the class label assigned to inputs whose feature takes
that value [1].

To use this method, one must choose the feature to split on and build the tree. The simplest
approach is to build a decision stump for each candidate feature and check which one gives the
highest accuracy on the training data [1].

Five attributes in total were available to the decision stump: sepallength, sepalwidth,
petallength, petalwidth, and class.

petallength is the attribute the decision stump chose to split on, producing three branches
(petallength <= 2.45, petallength > 2.45, petallength is missing):

petallength <= 2.45 : Iris-setosa

petallength > 2.45 : Iris-versicolor

petallength is missing : Iris-setosa
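
As a quick illustration, a one-level tree fitted with scikit-learn recovers a split of this kind (an assumption outside the original Weka workflow; depending on tie-breaking it may split on petal length at 2.45 or on petal width, since both perfectly isolate Iris-setosa):

# Fit a one-level tree and print the split it learned (scikit-learn, not Weka).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
stump = DecisionTreeClassifier(max_depth=1).fit(iris.data, iris.target)
print(export_text(stump, feature_names=iris.feature_names))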


Problem 2-
a) From the given set of features, it is not practical to explore every possible combination.
Selecting a few important feature subsets instead of evaluating all combinations saves
training time, and an exhaustive search for the optimal subset takes time that grows
exponentially with the number of features. Since each feature is either included in a
subset or not, n features give 2^n - 1 non-empty subsets: 2^10 - 1 = 1023 combinations
for 10 features and 2^100 - 1 (about 1.27 x 10^30) for 100 features, as the short
calculation below shows.
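
A two-line Python calculation confirms how quickly the subset count grows:

# Number of non-empty feature subsets grows as 2**n - 1.
for n in (10, 100):
    print(f"{n} features -> {2**n - 1} possible non-empty subsets")
# 10 features  -> 1023
# 100 features -> 1267650600228229401496703205375 (about 1.27e30)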

b)
Subset size   Attributes in "best" subset             Classification accuracy
4             all four                                95.33%
3             sepallength, petallength, petalwidth    96.667%
2             sepallength, petalwidth                 96%
1             petalwidth                              96%
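
For reference, a greedy subset search of the kind behind this table can be sketched with scikit-learn's SequentialFeatureSelector. This is an assumption for illustration: the table itself was produced with Weka's attribute selection (and a k = 3 neighbors classifier is assumed here), so the chosen attributes and accuracies may differ slightly.

# Greedy backward elimination per subset size (scikit-learn sketch, not Weka).
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
knn = KNeighborsClassifier(n_neighbors=3)  # assumed evaluator

for k in (1, 2, 3):
    sfs = SequentialFeatureSelector(
        knn, n_features_to_select=k, direction="backward", cv=10
    )
    sfs.fit(iris.data, iris.target)
    chosen = [name for name, keep in zip(iris.feature_names, sfs.get_support()) if keep]
    print(f"subset size {k}: {chosen}")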

c) No, the backward elimination approach is not guaranteed to find the optimal set of features.
It is a greedy search: at each step it removes the single feature whose removal hurts
accuracy the least, so it can discard a feature that is useful only in combination with
others and end up in a locally optimal subset. Moreover, forward and backward selection
are much slower than simply ranking attributes with a metric that can be applied to one
attribute at a time. If such a metric is available, a ranking selection technique is
preferable to a search selection technique: the search will not improve the result, it
will just waste time.
References

[1] S. Sonawani, "A Decision Tree Approach to Classify Web Services using Quality Parameter".
