International Journal of Emerging Technology and Advanced Engineering

Website: www.ijetae.com (ISSN 2250-2459, Volume 2, Issue 5, May 2012)

Anomaly Detection in Network using Data mining Techniques


Sushil Kumar Chaturvedi (1), Prof. Vineet Richariya (2), Prof. Nirupama Tiwari (3)

(1) M-Tech Research Scholar, LNCT, Bhopal, chaturvedisushilkumar@yahoo.co.in
(2) Department of CSE, LNCT, Bhopal, vineet_rich@yahoo.com
(3) Department of CSE, SRCEM, Banmore, girishniru@yahoo.com

Abstract- As networks have expanded dramatically, security has come to be regarded as a major issue. Many methods are currently used to increase network security, such as encryption, VPNs and firewalls, but all of these are too static to give effective protection against attack and counter-attack. We apply data mining algorithms to the anomaly detection problem. In this work our aim is to use data mining techniques, including classification trees and support vector machines, for anomaly detection. Experimental results on the 1999 KDD Cup data show that the C4.5 algorithm performs better than SVM in both network anomaly detection rate and false alarm rate.

Keywords- Data Mining; Support Vector Machines; Classification Tree; Anomaly Detection Systems (ADS)

I. INTRODUCTION

In recent years, computer technology has been used by many people all over the world in several areas. With the development of Internet technology, network security has become a global focus. Traditional security measures such as firewalls, VPNs and data encryption are insufficient to detect attacks by crackers. Intrusion detection, by contrast, is dynamic and can give dynamic protection to network security in monitoring, attack and counter-attack [1]. According to where the data set is collected, an Anomaly Detection System (ADS) can be classified as host-based or network-based [2].

• Host-based ADS: These systems run on the system being monitored. Their data come from the records of various host system activities, including operating system audit records, system logs, application program information, and so on.

• Network-based ADS: These systems are placed on the network, near the system or systems being monitored. They examine the network traffic and determine whether it falls within acceptable boundaries. Their data come from network segments, for example Internet packets.

Anomaly detection techniques are classified into two categories [3]:

1. Anomaly Detection: Anomaly detection refers to storing features of a user's usual behavior in a database and then comparing the user's current behavior with the stored features. If the deviation is large enough, we can say that something abnormal is happening.

2. Misuse Detection: Misuse detection refers to confirming attack incidents by matching observed features against a library of known attack features.

We decided to use data mining to solve the problem of network intrusion for the following reasons [1, 4, 5, 6]:

• Data mining can process huge amounts of data.

• It is more useful for finding ignored and hidden information.

Data mining algorithms are used to perform data summarization and visualization that help security analysis in various areas [7].

II. RELATED WORK

Denning was among the first to work on the application of data mining to network security. She proposed a model of a real-time intrusion-detection expert system [8]. The concept behind the model is that exploitation of a system's vulnerabilities involves abnormal use of the system, and that this abnormality can be detected by looking for abnormal patterns in the audit records. The proposed model is capable of detecting break-ins, penetrations, and other forms of computer anomaly. In this paper we use two methods of anomaly detection: SVM (Support Vector Machine) and C4.5, an extended version of the classification algorithm ID3. Both methods are supervised algorithms. We compare them on the basis of detection rate and false alarm rate.

III. DATA MINING ALGORITHMS

A. C4.5

C4.5 is targeted at supervised learning. Given an attribute-valued data set in which instances are described by collections of attributes and belong to one of a set of mutually exclusive classes, C4.5 learns a mapping from attribute values to classes that can be applied to classify new, unseen instances. The algorithm is applicable to both continuous and discrete-valued attributes [20].

The algorithm involves the following steps [9]:

1. Compute the information gain for each attribute.
2. Select the attribute with the highest information gain as the splitting attribute.
3. If the selected attribute is discrete (categorical), branch the node on all of its possible values. If the attribute is continuous, select the cut point with the highest information gain.
4. After splitting, check whether the new nodes are leaves (their data all belong to the same class); otherwise, the new nodes become the roots of sub-trees.
5. Repeat the above steps until all new nodes are leaves.

Algorithm C4.5(D)
Input: an attribute-valued dataset D
    Tree = {}
    if D is "pure" OR other stopping criteria are met then
        terminate
    end if
    for all attributes a ∈ D do
        compute information-theoretic criteria if we split on a
    end for
    a_best = best attribute according to the criteria computed above
    Tree = create a decision node that tests a_best in the root
    D_v = induced sub-datasets from D based on a_best
    for all D_v do
        Tree_v = C4.5(D_v)
        attach Tree_v to the corresponding branch of Tree
    end for
    return Tree

In building a decision tree, there are two different methods of pruning it: pre-pruning and post-pruning. The power of post-pruning is obvious in situations in which two attributes individually seem to have nothing to contribute but are robust predictors when combined [10]. There are three post-pruning techniques: sub-tree replacement, sub-tree raising, and reduced-error pruning.

B. Support Vector Machine

Support vector machines (SVMs), including the support vector classifier (SVC) and the support vector regressor (SVR), are among the most robust and accurate of the well-known data mining algorithms.
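As a small illustration of what a linear support vector classifier computes once it has been trained, the Python sketch below labels a sample by the side of a hyperplane $w^{T} x + b = 0$ on which it falls and measures its geometric distance to that hyperplane. The weight vector, bias, sample values and function names are hypothetical and chosen only for illustration; the sketch does not perform SVM training and is not the implementation used in the experiments.

```python
import math


def decision_value(w, x, b):
    """Return w.x + b for weight vector w, sample x and bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Label a sample +1 or -1 according to its side of the hyperplane."""
    return 1 if decision_value(w, x, b) >= 0 else -1


def distance_to_hyperplane(w, x, b):
    """Geometric distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, x, b)) / norm


# Hypothetical two-attribute hyperplane (values assumed for illustration).
w = [3.0, 4.0]
b = -5.0

sample_a = [3.0, 4.0]   # falls on the positive side
sample_b = [0.0, 0.0]   # falls on the negative side

label_a = classify(w, sample_a, b)
label_b = classify(w, sample_b, b)
dist_a = distance_to_hyperplane(w, sample_a, b)
dist_b = distance_to_hyperplane(w, sample_b, b)
```

With these assumed values, the sample (3, 4) lies on the positive side at distance 4 and the origin lies on the negative side at distance 1. A trained SVC chooses w and b so that the smallest such distance over the training samples is as large as possible.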


For a two-class linearly separable learning task, the aim of SVC is to find a hyperplane that separates the two classes of given samples with a maximal margin, which has been shown to offer the best generalization ability. Support vector machines are a set of related supervised learning methods used for classification and prediction [11]. A margin can be defined as the amount of space, or separation, between the two classes as defined by a hyperplane. Geometrically, the margin corresponds to the shortest distance from the closest data points to a point on the hyperplane. Figure 1 shows the optimal hyperplane for a linearly separable case.

[Figure 1: Optimal hyperplane, with the distance R* marked]

The hyperplane can be written as [12]:

$w^{T} x + b = 0$    (1)

where $W = \{w_1, w_2, \ldots, w_n\}$ is the weight vector for the n attributes $A = \{A_1, A_2, \ldots, A_n\}$, b is a scalar, and $X = \{x_1, x_2, \ldots, x_n\}$ are the values of the attributes. R* is the desired directional geometric distance from the sample x* to the optimal hyperplane [13, 14]. For more details on support vector machines, see [15, 16].

IV. EXPERIMENTS

We tested our work using the 1999 KDD Cup network anomaly data set [17]. It originated from the 1998 DARPA Intrusion Detection Evaluation Program managed by MIT Lincoln Labs. The first stage is pre-processing, in which the data are partitioned into training and testing sets. In the next step, we applied C4.5 and SVM to the training data set in order to build and train the models. Finally, the trained models are evaluated on the testing data set to calculate the efficiency of the models.

The training data set consists of seven weeks of traffic with around 5 million connections, and the testing data consists of two weeks of traffic with around 300,000 connections. The data contains four main categories of attacks:

• Denial-of-service (DoS), such as smurf, apache2, pod, etc.
• Remote-to-local (R2L), such as imap, worm, phf, etc.
• User-to-root (U2R), such as perl, rootkit, and so on.
• Probing, such as nmap, portsweep, etc.

Data mining algorithms can lead to better results if the data under analysis have been normalized [18].

Detection of attacks can be measured by the following metrics:

• False positive (FP): also called a false alarm; the number of instances detected as attacks that are in fact normal.
• False negative (FN): the number of instances detected as normal that are actually attacks; in other words, attacks that the intrusion detection system is meant to catch but misses.
• True positive (TP): the number of instances detected as attacks that are in fact attacks.
• True negative (TN): the number of instances detected as normal that are actually normal.

The accuracy of an intrusion detection system is measured in terms of detection rate and false alarm rate. In this work, we use the 1999 KDD Cup data set, which consists of 311,129 records.
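The four counts defined above can be tallied directly from the true and predicted labels of the test records. The Python sketch below does this for a handful of made-up connections and then derives a detection rate (the share of actual attacks that were flagged) and a false alarm rate (the share of normal connections that were flagged). The label lists, the "normal" label string and the function names are assumptions for illustration only and are not taken from the experiments.

```python
def confusion_counts(actual, predicted, normal_label="normal"):
    """Tally TP, FP, FN and TN, treating every non-normal label as an attack."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        is_attack = a != normal_label
        flagged = p != normal_label
        if is_attack and flagged:
            tp += 1          # attack correctly detected
        elif not is_attack and flagged:
            fp += 1          # normal traffic raised a false alarm
        elif is_attack and not flagged:
            fn += 1          # attack missed
        else:
            tn += 1          # normal traffic correctly passed
    return tp, fp, fn, tn


def detection_rate(tp, fn):
    """Percentage of actual attacks that were detected."""
    return 100.0 * tp / (tp + fn)


def false_alarm_rate(fp, tn):
    """Percentage of normal instances wrongly flagged as attacks."""
    return 100.0 * fp / (fp + tn)


# Made-up labels for six connections (illustrative only).
actual = ["normal", "smurf", "normal", "nmap", "smurf", "normal"]
predicted = ["normal", "smurf", "smurf", "normal", "smurf", "normal"]

tp, fp, fn, tn = confusion_counts(actual, predicted)
dr = detection_rate(tp, fn)
far = false_alarm_rate(fp, tn)
```

For these six made-up records the counts are TP = 2, FP = 1, FN = 1 and TN = 2, so two of the three attacks are detected and one of the three normal connections raises a false alarm.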

Table 1 below shows the percentage of data in each class. Then, 15% of the data is extracted by sampling; 70% of this new set is assigned to the training set, and 40% is dedicated to test data.

TABLE 1
PERCENTAGE OF DATA

Class     Quantity    Percentage (%)
Normal    62,083      19.9
DoS       229,533     73.77
U2R       328         0.1
Probe     4,066       1.3
R2L       15,189      4.8

A. Detection rate comparison

Detection rate refers to the percentage of detected attacks among all attack data, and is defined as follows:

$\text{Detection rate} = \frac{\text{detected attacks}}{\text{all attack data}} \times 100$

or

$\text{Detection rate} = \frac{TP}{TP + FN} \times 100$

The detection rates for the different types of attacks are shown in Table 2. As the statistical results indicate, the average detection rates for C4.5 and SVM are 84.05 and 83.76, respectively, so the overall detection rate of C4.5 is better than that of SVM. For each attack category, C4.5 is also better than SVM, except for U2R attacks. This seems to be due to the limited number of U2R attacks in our data sample.

TABLE 2
DETECTION RATE COMPARISON OF DIFFERENT ATTACKS THROUGH C4.5 AND SVM

Algorithm   DoS      U2R      Probe    R2L
SVM         92.85    68.88    88.19    16.89
C4.5        92.87    34.44    94.48    17.44

B. False alarm rate comparison

False alarm rate refers to the percentage of normal data that is wrongly recognized as attack, and is defined as follows:

$\text{False alarm rate} = \frac{FP}{FP + TN} \times 100$

The average false alarm rate in our experiment is 0.81 for the C4.5 algorithm and 1.62 for SVM. As the results show, C4.5 also achieves a better average false alarm rate than SVM.

V. CONCLUSION

Detecting attacks is an important need in networked systems. In this paper we used two data mining techniques, C4.5 and SVM, to detect anomalies in a network. The experimental results show that C4.5 performs better than SVM in both detection rate and false alarm rate on our data set.

Implementing new techniques for detecting attacks in networks will be examined in further study. Furthermore, data mining techniques can be applied in other domains, such as data warehousing, in order to improve the quality of data.

REFERENCES

[1] M. Xue, C. Zhu, "Applied Research on Data Mining Algorithm in Network Intrusion Detection," 2009 International Joint Conference on Artificial Intelligence (JCAI), pp. 275-277, 2009.
[2] D. E. Denning, "An intrusion detection model," IEEE Transactions on Software Engineering, 1987.
[3] T. Bhavani et al., "Data Mining for Security Applications," Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, Volume 02, IEEE Computer Society, 2008.
[4] T. Lappas and K. P., "Data Mining Techniques for (Network) Intrusion Detection System," January 2007.
[5] S. Sun, Y. Wang, "A Weighted Support Vector Clustering Algorithm and its Application in Network Intrusion Detection," 2009 First International Workshop on Education Technology and Computer Science (ETCS), vol. 1, pp. 352-355, 2009.
[6] S. Wu, E. Yen, "Data mining-based intrusion detectors," Elsevier Computer Network, 2009.
[7] E. Bloedorn et al., "Data Mining for Network Intrusion Detection: How to Get Started," Technical paper, 2001.
[8] D. E. Denning, "An Intrusion-Detection Model," 1986 IEEE Computer Society Symposium on Research in Security and Privacy, pp. 118-31.
[9] J. Han and M. Kamber, "Data Mining: Concepts and Techniques," 2nd ed., Morgan Kaufmann Publishers, 2006.
[10] I. H. Witten, E. Frank, "Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations," Morgan Kaufmann Publishers, October 1999.
[11] http://en.wikipedia.org/wiki/Support_vector_machine
[12] J. Han and M. Kamber, "Data Mining: Concepts and Techniques," 2nd ed., Morgan Kaufmann Publishers, 2006.
[13] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, Wiley, 2001.
[14] S. Haykin, Neural Networks: A Comprehensive Foundation, Tsinghua University Press, 2001.
[15] J. Han and M. Kamber, "Data Mining: Concepts and Techniques," 2nd ed., Morgan Kaufmann Publishers, 2006.
[16] D. L. Olson, D. Delen, "Advanced Data Mining Techniques," 2008.
[17] http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[18] J. Han and M. Kamber, "Data Mining: Concepts and Techniques," 2nd ed., Morgan Kaufmann Publishers, 2006.
[20] Prabhjeet Kaur, Amit Kumar Sharma, Sudesh Kumar Prajapat, "MADAM ID for Intrusion Detection Using Data Mining," IJRIM, Volume 2, Issue 2 (February 2012), ISSN 2231-4334.
