Step 2: Set each member of each bin to the mean of that bin
Bin 1: {14.7, 14.7, 14.7}
Bin 2: {18.3, 18.3, 18.3}
Bin 3: {21, 21, 21}
Bin 4: {24, 24, 24}
Bin 5: {26.7, 26.7, 26.7}
Bin 6: {33.7, 33.7, 33.7}
Bin 7: {35, 35, 35}
Bin 8: {40.3, 40.3, 40.3}
Bin 9: {56, 56, 56}
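The smoothing step above can be sketched in Python. This is a minimal illustration, not the assignment's required method. Step 1 is not shown here, so the input list `ages` is an assumption: sorted values chosen so that equal-depth bins of size 3 reproduce the bin means listed above. The function name `smooth_by_bin_means` is also hypothetical.

```python
def smooth_by_bin_means(values, depth=3):
    """Partition sorted values into equal-depth bins, then replace
    every member of a bin with that bin's mean (rounded to 1 decimal)."""
    ordered = sorted(values)
    smoothed_bins = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]
        bin_mean = round(sum(bin_values) / len(bin_values), 1)
        smoothed_bins.append([bin_mean] * len(bin_values))
    return smoothed_bins


# Assumed input data (not given in this section); 27 values -> 9 bins of depth 3.
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

bins = smooth_by_bin_means(ages, depth=3)
for number, members in enumerate(bins, start=1):
    print(f"Bin {number}: {members}")
```

If the number of values is not a multiple of the depth, this sketch simply leaves the last bin shorter; other conventions are possible.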
2. In real-world data, tuples with missing values for some attributes are a common occurrence. Describe
the various methods for handling this problem. (20 points)
Ignore the tuple - Exclude the tuple from processing. Although this is simple, it can
reduce the information available for mining. One case where it can make sense is
when the tuple is missing several values.
Fill in the value manually - This is one of the most accurate approaches. However, it can be
very time consuming. As a result, it doesn't scale well to large datasets.
Use a global constant to fill in the missing value - Fill in all missing values with the same
constant. This is a simple approach, but it biases the data. It also uses less information to
fill in the data than the other approaches.
Use the attribute mean to fill in the missing values - Fill missing values with the mean of
that attribute for other instances. This is slightly more complex than using a constant, but
it should provide higher quality data.
Use the attribute mean for all samples belonging to the same class as the given tuple - The
same as the previous method except that the mean is calculated only for tuples with the
same classification as the tuple with the missing values. It is once again more complex but
it should yield higher quality information.
Use the most probable value to fill in the missing value - Use advanced techniques such as
Bayesian networks or decision tree induction to determine the missing value. This is a
preferred method since it uses the most information to make the decision. However, it is
also the most complicated to implement.
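Several of the simpler methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the attribute names `cls` and `income`, and all function names are hypothetical, and `None` is assumed to mark a missing value. Manual filling and the most-probable-value method are not shown.

```python
from statistics import mean

# Hypothetical tuples; None marks a missing income value.
records = [
    {"cls": "low", "income": 20.0},
    {"cls": "low", "income": None},
    {"cls": "low", "income": 30.0},
    {"cls": "high", "income": 80.0},
    {"cls": "high", "income": None},
    {"cls": "high", "income": 100.0},
]


def ignore_tuples(rows, attr):
    """Ignore the tuple: drop every row whose attr is missing."""
    return [r for r in rows if r[attr] is not None]


def fill_constant(rows, attr, constant):
    """Global constant: replace every missing attr with the same value."""
    return [{**r, attr: constant if r[attr] is None else r[attr]} for r in rows]


def fill_attribute_mean(rows, attr):
    """Attribute mean: use the mean of the known values of attr."""
    known_mean = mean(r[attr] for r in rows if r[attr] is not None)
    return fill_constant(rows, attr, known_mean)


def fill_class_mean(rows, attr, class_attr):
    """Class mean: use the mean of attr among rows of the same class.
    Assumes every class has at least one known value."""
    known = {}
    for r in rows:
        if r[attr] is not None:
            known.setdefault(r[class_attr], []).append(r[attr])
    class_means = {c: mean(vals) for c, vals in known.items()}
    return [
        {**r, attr: class_means[r[class_attr]] if r[attr] is None else r[attr]}
        for r in rows
    ]


dropped = ignore_tuples(records, "income")
by_constant = fill_constant(records, "income", -1.0)
by_mean = fill_attribute_mean(records, "income")
by_class = fill_class_mean(records, "income", "cls")
```

On this toy data the attribute mean fills both gaps with the same value, while the class mean fills each gap with a value typical of its own class, which is why the class-based method tends to give higher quality data.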
This assignment is DUE 10AM Friday (2/6/09). Please name your file HW3yourlastname.doc and
submit it on Moodle @ classes.cs.siue.edu