
Development of Classification Model for Dengue Data 2003-2009:

A Comparison of Artificial Neural Network, Decision Tree and Rough Set Theory

S. Norhudha Sarif, Hazwani Hamdan, Marlinawati Djasmir, S. Ain

hudasarif86@yahoo.com, mujahidah28@yahoo.com, mimah_adeq@yahoo.com, aimiesk@yahoo.com

Abstract. This paper compares Neural Network, Decision Tree, and Rough Set Theory in the
development of a classification model for dengue data from 2003 to 2009 that had been
pre-processed beforehand. The models are developed using two data mining software packages,
Weka and Rosetta.

Keywords: Neural Network, Decision Tree and Rough Set Theory.

1. Introduction

Classification is a data analysis technique that helps us to better understand the data and
to develop models that extract knowledge from future data trends. In this paper we applied
three classification techniques: Neural Network, Decision Tree and Rough Set Theory.

2. Material and Methods

Neural networks were first inspired by an attempt to mimic the neural functions of the human
brain (Rumelhart, Hinton, & Williams, 1986). They are powerful prediction and classification
tools, and provide new opportunities for solving difficult problems that have been
traditionally modeled using statistical approaches. Among the numerous neural networks that
have been proposed, back propagation networks are probably the most popular and widely
used.

The network consists of several components: (1) a set of neurons or processing units that
receive and send signals from an outside environment or from other neurons in the network,
arranged in three layers (input, hidden and output); (2) connectivity, which shows the
interactivity between neurons; (3) propagation rules, which aggregate the input signals
coming from other neurons; (4) activation/transfer functions, which convert the aggregated
inputs into an output to be sent to other connected neurons; and (5) learning algorithms,
which update the patterns and strength of connectivity.
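As a minimal illustration of components (3) and (4), the sketch below (in Python, with arbitrary example values that are not taken from the dengue model) aggregates incoming signals as a weighted sum and passes the result through a sigmoid transfer function.

```python
import numpy as np

def sigmoid(net):
    """Logistic activation/transfer function."""
    return 1.0 / (1.0 + np.exp(-net))

# Hypothetical neuron with three incoming connections; the values are illustrative only.
inputs = np.array([0.2, 0.7, 0.1])    # signals received from the previous layer
weights = np.array([0.4, -0.6, 0.9])  # strength of each connection
bias = 0.1

net = np.dot(weights, inputs) + bias  # propagation rule: aggregate the input signals
output = sigmoid(net)                 # transfer function: convert the aggregate to an output
print(output)
```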

Typically, the network starts with a random set of weights and adjusts them each time it
detects an error on an input-output pair. This process is called learning. During the training
period, various classes of training data are fed into the network. Activation flows from the
input layer, through the hidden layer, and then to the output layer. Each neuron receives as
input the outputs of all neurons in the previous layer. After input data is applied as a
stimulus to the input layer of the network, it is propagated through the neurons of each upper
layer until an output is generated. The error signals are then transmitted backward from the
output layer to the middle layer. This process repeats layer by layer and, based on the error
signal received, the connection weights are updated so that the network converges toward
a stable state.
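The learning cycle described above can be sketched as follows. This is a toy, from-scratch example (random initial weights, forward pass, backward error propagation, weight update) on made-up XOR-style data; it is not the Weka implementation used later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy training data (XOR-like), standing in for the normalized dengue records.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 3))  # random initial weights: input -> hidden
W2 = rng.normal(size=(3, 1))  # random initial weights: hidden -> output
lr = 0.3                      # learning rate (the same value is used later in the paper)

for epoch in range(500):
    # Forward pass: activation flows from the input layer through the hidden layer to the output.
    hidden = sigmoid(X @ W1)
    out = sigmoid(hidden @ W2)

    # Backward pass: error signals are transmitted from the output layer back to the hidden layer.
    err_out = (out - y) * out * (1 - out)
    err_hidden = (err_out @ W2.T) * hidden * (1 - hidden)

    # Weight update: connection weights are adjusted so the network converges toward a stable state.
    W2 -= lr * hidden.T @ err_out
    W1 -= lr * X.T @ err_hidden
```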

A rough set is a formal approximation of a crisp set (i.e., conventional set) in terms of a pair
of sets which give the lower and the upper approximation of the original set. The lower and
upper approximation sets themselves are crisp sets in the standard version of rough set theory
(Pawlak 1991), but in other variations, the approximating sets may be fuzzy sets as well.
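A small worked example may make the lower and upper approximations concrete. The sketch below uses an invented four-row decision table (not the dengue data): rows that are indiscernible on the condition attributes form equivalence classes, and the approximations of the "dengue" class are computed from those classes.

```python
from collections import defaultdict

# Invented decision table: ((condition attribute values), decision class).
table = [
    (("adult", "urban"), "dengue"),
    (("adult", "urban"), "no_dengue"),  # indiscernible from the row above, so the class is rough
    (("child", "rural"), "dengue"),
    (("adult", "rural"), "no_dengue"),
]

# Equivalence classes: row indices that share the same condition attribute values.
classes = defaultdict(set)
for i, (cond, _) in enumerate(table):
    classes[cond].add(i)

target = {i for i, (_, d) in enumerate(table) if d == "dengue"}

# Lower approximation: union of equivalence classes fully contained in the target set.
lower = set().union(*(c for c in classes.values() if c <= target))
# Upper approximation: union of equivalence classes that intersect the target set.
upper = set().union(*(c for c in classes.values() if c & target))

print(lower)  # {2}
print(upper)  # {0, 1, 2}
```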

Rough set theory can be used for classification to discover structural relationships within
imprecise or noisy data. It applies to discrete-valued attributes. Rough set theory is based on
the establishment of equivalence classes within the given training data. It can also be used
for attribute subset selection and relevance analysis (Han & Kamber, 2006).

Rough sets can be used as a theoretical basis for some problems in machine learning. They
have been found to be particularly useful for rule induction and feature selection (semantics-
preserving dimensionality reduction). They have been used for missing data estimation as
well as for understanding HIV, and they have also inspired some logical research.

Rough set theory is a relatively new tool for dealing with knowledge, particularly when that
knowledge is imprecise, inconsistent or incomplete (Miao & Hou, 2004).

Decision tree analysis evolved in the field of data mining and is most useful for approaching
problems that can be addressed through extensive and diverse data (Garver, 2002). The
method allows the researcher to predict a dependent target variable by incorporating
numerous and diversely measured independent variables into the model. In many cases the
flexibility and reduced operational constraints of decision tree modeling make it preferable
to logistic regression.

The decision tree is a widely used learning method. It is easy to interpret because it can be
re-represented as if-then-else rules. A decision tree does not require any prior knowledge of
the data distribution and works well on noisy data.
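The re-representation as if-then rules can be seen in a few lines. The sketch below uses scikit-learn and its bundled iris data purely as a stand-in, since the dengue data set is not public; Weka's J48 produces an analogous rule listing.

```python
from sklearn.datasets import load_iris              # public stand-in for the dengue data
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The fitted tree can be printed as readable if-then-else rules.
print(export_text(tree, feature_names=["sepal_len", "sepal_wid", "petal_len", "petal_wid"]))
```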

3. Methodology

a) Data Collection
The data set was obtained from the 2003-2009 dengue data adapted from the Borang Siasatan
Kes Demam Denggi / Demam Denggi Berdarah of the Unit Kawalan Vektor, Pusat Kesihatan
Daerah Hulu Langat. The data were collected from dengue fever cases in the Hulu Langat
district from 2003 until 2009.

The data set was cleaned manually through pre-processing steps, which reduced it from the
original 8,505 records to 820. The original data set contained 133 attributes, and this large
number was reduced because the goal of this classification is to determine the class of each
record based on demographic attributes. Another reason for the large reduction is that values
in some attributes had been deleted to protect patient privacy.

The data set was also transformed into the appropriate form for each classifier. Normalized
data, scaled into the range [0, 1], is the appropriate form for neural network classification.
The decision tree classification used the data in its real-number form, while the rough set
technique used a discretized form of the data.
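As a rough sketch of the two transformations (min-max scaling to [0, 1] for the neural network and discretization for the rough set model), the snippet below applies both to a hypothetical demographic attribute; the column and bin labels are illustrative, not taken from the actual pre-processing.

```python
import pandas as pd

# Hypothetical numeric attribute standing in for a demographic field such as age.
age = pd.Series([5, 23, 41, 67, 80], dtype=float)

# Min-max normalization into [0, 1] for the neural network input.
age_norm = (age - age.min()) / (age.max() - age.min())

# Equal-width discretization into labelled bins for the rough set model.
age_disc = pd.cut(age, bins=3, labels=["low", "medium", "high"])

print(age_norm.tolist())
print(age_disc.tolist())
```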

b) Experiment Design

Decision Tree Classification (using Weka software)

The steps taken to mine the data using Decision Tree classification (J48) are as follows:

1. We chose the technique of splitting the data into training and testing sets.

2. The data was split into the ranges shown in the following table:

Table 1: The data split range (%)

Training Data    Test Data
10               90
20               80
30               70
40               60
50               50
60               40
70               30
80               20
90               10

3. The data was then processed using the split-data technique according to the table above.

4. The accuracy of each model was then transferred to a table so that the models could be
compared and the best one chosen (a sketch of this percentage-split procedure follows these steps).
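The percentage-split procedure of steps 1-4 might look roughly as follows outside of Weka. This is a sketch using scikit-learn's decision tree and a public data set as a stand-in; it mirrors the idea of Table 1 rather than reproducing the J48 runs.

```python
from sklearn.datasets import load_breast_cancer     # public stand-in for the dengue records
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Percentage splits as in Table 1: 10/90 training/testing up to 90/10.
for train_pct in range(10, 100, 10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_pct / 100, random_state=0)
    model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    print(f"{train_pct}:{100 - train_pct}  accuracy = {model.score(X_te, y_te):.4f}")
```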

Neural Network Classification (using Weka software)

1. The data for neural network classification was split in the same way as in the decision tree classification.

2. The learning rate was set to 0.3 and the number of hidden nodes was 28, based on the
function commonly used to determine the number of hidden nodes:

N_hidden = √(N_input), where N denotes the number of nodes in the given layer.

3. The number of iterations was set to 500. A sketch using these settings follows these steps.
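The sketch below mirrors the settings listed above (28 hidden nodes, learning rate 0.3, 500 iterations). It uses scikit-learn's MLPClassifier and a public data set as a stand-in for Weka's multilayer perceptron and the dengue records, so it illustrates the configuration rather than reproducing Table 3.

```python
from sklearn.datasets import load_breast_cancer     # public stand-in for the dengue records
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MinMaxScaler

X, y = load_breast_cancer(return_X_y=True)
X = MinMaxScaler().fit_transform(X)                  # normalize inputs to [0, 1]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=0)

# Settings mirroring the Weka setup: 28 hidden nodes, learning rate 0.3, 500 iterations.
mlp = MLPClassifier(hidden_layer_sizes=(28,), learning_rate_init=0.3,
                    max_iter=500, solver="sgd", random_state=0)
mlp.fit(X_tr, y_tr)
print(mlp.score(X_te, y_te))
```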

Rough Set Classification (using Rosetta software)

The steps taken to perform the modeling are as follows:


1. Open the folder of discretized denggi data. Split the data into 10 random sets according to the ratios
10:90, 20:80, 30:70, 40:60, 50:50(1), 50:50(2), 60:40, 70:30, 80:20 and 90:10.

2. Every set has two parts: the upper part is for training and the lower part is for testing. For the
training part, right-click and choose reduction with the genetic algorithm. Choose full discernibility,
which produces a set of minimal attribute subsets that define functional dependencies, then click OK.

3. For the testing part, right-click and choose classify. Under classifier, change the selection to
naïve Bayes. Then set the parameters by selecting a master table (from the training object) and click OK.

4. After that, choose discrimination to generate ROC data to a file for the particular classification,
then click OK.

5. Repeat steps 1-4 for each remaining set until all 10 sets have been processed.

c) Testing and Evaluation

The results of each partition for each technique were compared to obtain the best result. The
highest accuracy among the partitions of each technique was selected for comparison across
the techniques, as sketched below.
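Picking the best partition per technique amounts to a simple maximum over the accuracy columns of Tables 2-4; the snippet below sketches this with a few of the reported values for illustration.

```python
# A few accuracies sampled from Tables 2-4 (in %), keyed by the training:testing split.
results = {
    "decision_tree": {"1:9": 86.5854, "5:5": 86.3415, "9:1": 86.5854},
    "neural_network": {"1:9": 76.6938, "5:5": 79.2683, "9:1": 84.1463},
    "rough_set": {"1:9": 78.0, "5(1):5(2)": 78.0, "9:1": 76.8},
}

for technique, by_split in results.items():
    best_split, best_acc = max(by_split.items(), key=lambda kv: kv[1])
    print(f"{technique}: best split {best_split} with accuracy {best_acc:.2f}%")
```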

4. Experiment and Discussion

Table 2: Experimental Result for J48 Tree

Model   Data Allocation   Accuracy    Number of rules   Length of rules
1       1:9               86.5854 %   10                15
2       2:8               85.9756 %   10                15
3       3:7               83.9721 %   10                15
4       4:6               85.7724 %   10                15
5       5:5               86.3415 %   10                15
6       6:4               85.9756 %   10                15
7       7:3               85.7724 %   10                15
8       8:2               85.3659 %   10                15
9       9:1               86.5854 %   10                15
Table 3: Experimental Result for Neural Network

Model   Data Allocation   Num of Hidden Nodes   Activation Function   Learning Rate   Accuracy    MSE (Mean Squared Error)
1       1:9               28                    -                     0.3             76.6938 %   0.4495
2       2:8               28                    -                     0.3             79.2683 %   0.4254
3       3:7               28                    -                     0.3             80.8362 %   0.4064
4       4:6               28                    -                     0.3             81.5041 %   0.3986
5       5:5               28                    -                     0.3             79.2683 %   0.4379
6       6:4               28                    -                     0.3             81.7073 %   0.3914
7       7:3               28                    -                     0.3             77.6423 %   0.4437
8       8:2               28                    -                     0.3             83.5366 %   0.3787
9       9:1               28                    -                     0.3             84.1463 %   0.3905

Table 4: Experimental Results for Rough Set Classification Models

Model   Data Allocation   Accuracy   Number of rules   Length of rules
1       1:9               78 %       17                7
2       2:8               80 %       2                 10
3       3:7               82.5 %     1                 12
4       4:6               79.2 %     1                 13
5       5(1):5(2)         78 %       1                 13
6       5(2):5(1)         82.7 %     2                 13
7       6:4               79.3 %     1                 13
8       7:3               80.1 %     1                 13
9       8:2               78 %       1                 14
10      9:1               76.8 %     1                 14

Neural networks have been criticized for their poor interpretability. Their advantages include
a high tolerance of noisy data as well as the ability to classify patterns on which they have
not been trained. They can be used when there is little knowledge of the relationship between
attributes and classes, and they are well suited for continuous-valued inputs and outputs
(Han & Kamber, 2006). For real-world problems with high nonlinearity and short memory
dynamics, neural networks usually achieve better prediction and classification accuracy.

The main advantage of rough sets is that they do not require any preliminary or additional
information about the data. The method can work with missing values, switch between
different reducts, and use less expensive or alternative sets of measurements. It is able to
discover important facts hidden in the data and express them in the natural language of
decision rules. The rough sets method offers the ability to handle large amounts of both
quantitative and qualitative data. Its ability to model highly nonlinear or discontinuous
functional relationships provides a powerful method for characterizing complex,
multidimensional patterns. It offers transparency of classification decisions, allowing for their
argumentation. The rough sets method has been successfully applied in knowledge
acquisition, forecasting and predictive modeling, and decision support.

The decision tree has a reasonable training time; it is fast, easy to interpret and easy to
implement, and it can handle a large number of features. Its disadvantages are that it cannot
capture complicated relationships between features, it produces only simple decision
boundaries, and it has problems when a lot of data is missing.

5. Concluding Remark

In this paper we compared the neural network with the rough set and decision tree techniques
and discussed their advantages and disadvantages. Based on the literature we reviewed and on
our results, we conclude that the neural network computes and produces its model faster than
rough set and decision tree. Although the accuracy of the decision tree is higher than that of
the neural network and rough set, the neural network handles outlier data better. Rough sets
work better on discrete data, whereas the neural network can easily work with real-valued data.

6. References

Garver, M. (2002), “Using data mining for customer satisfaction research”, Marketing
Research, Vol. 14, No. 1, pp. 8-12.

Bertolini, M. (2006), “Methodology and Theory, Oil Pipeline Spill Cause Analysis, A
Classification Tree Approach”, Journal of Quality in Maintenance Engineering, Vol. 12, No.
2, pp. 186-198.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations
by error propagation. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel Distributed
Processing. Cambridge, MA: The MIT Press, pp. 318-362.

Pawlak, Zdzisław (1991). Rough Sets: Theoretical Aspects of Reasoning About Data.
Dordrecht: Kluwer Academic Publishing. ISBN 0-7923-1472-7.

Han, J. & Kamber, M. (2006). Data Mining: Concepts and Techniques. Morgan Kaufmann, p. 351.

Miao, D. & Hou, L. (2004). A comparison of rough set methods and representative inductive
learning algorithms. Fundamenta Informaticae, Vol. 59, No. 2-3 (Special Issue on the 9th
International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing,
RSFDGrC 2003), pp. 203-219. Amsterdam: IOS Press. ISSN 0169-2968.
