
Bonfring International Journal of Data Mining, Vol. 4, No. 1, March 2014
ISSN 2277 - 5048 | 2014 Bonfring
Abstract--- Computer systems are exposed to an increasing
number of different types of security threats due to the
expansion of the Internet in recent years. Detecting network
intrusions effectively has therefore become an important
security technique. Many intrusions are not composed of single
events, but of a series of attack steps taken in chronological
order. Analyzing the order in which events occur can improve
attack detection accuracy and reduce false alarms. Intrusion
is a multi-step process in which a number of events must occur
sequentially in order to launch a successful attack. Intrusion
detection using sequential pattern mining is a research topic
in the field of information security. Sequential pattern
mining is used to discover the frequent sequential patterns in
an event dataset. Sequential pattern mining algorithms can be
broadly classified into Apriori based, pattern growth based
and a combination of both. The first class is based on the
Apriori property and the second uses a pattern growth
approach. The major drawback of the Apriori based algorithms
is the multiple scans of the database needed to generate
maximal patterns. In this paper, a simulation study of two
algorithms, the original AprioriALL algorithm and a modified
AprioriALL algorithm that optimizes the processing by
including set theory techniques, is carried out on a network
intrusion dataset from KDD Cup 1999. Experimental results show
that the modified algorithm shrinks the dataset size and scans
the database at most twice. Also, as the interestingness of
the itemset increases with the dataset shrinking, it leads to
efficient sequences with high associativity. As the database
is reduced, the time taken to mine sequences also reduces, and
the modified algorithm is faster than the Apriori based
algorithm.
Keywords--- Data mining, Sets, Sequence data, Time
series, Intrusion detection system, DoS attacks
I. INTRODUCTION
With massive amounts of data continuously being collected and
stored, many industries are becoming interested in identifying
sequential patterns from their databases. Sequential pattern
mining is one of the most well-known methods and has broad
applications including web-log analysis, customer purchase
behavior analysis, medical record analysis, market analysis,
decision support, music recommendation, fraud detection,
intrusion detection and business management. Many approaches
have been proposed to extract information, and mining
sequential patterns is one of the most important ones
[1][2][3]. It was first proposed by Agrawal R. et al. for
shopping basket data analysis [1]. Sequential Pattern Mining
(SPM) finds interesting sequential patterns in a large
database: it finds frequent subsequences as patterns from a
sequence database. In addition, constraint-based sequential
pattern mining algorithms, algorithms based on the pattern
growth approach, and algorithms based on database projection
have been proposed. Moreover, there are several extensions of
research on SPM, such as closed sequential pattern mining,
parallel mining, distributed mining, multi-dimensional
sequential pattern mining and approximate sequential pattern
mining.
Existing approaches to finding sequential patterns in
time-related data fall mainly into two classes. The first,
developed by Agrawal and Srikant [14], extends the well-known
Apriori algorithm. Algorithms of this type are based on the
Apriori property that any sub-pattern of a frequent pattern is
also frequent [1]. The second uses a pattern growth approach
[8] and employs the same idea as the PrefixSpan algorithm.
Improving the efficiency of the Apriori algorithm has been a
great challenge. Since all frequent sequential patterns are
contained in the maximal frequent sequential patterns, the
task of mining frequent sequential patterns can be converted
into mining maximal frequent sequential patterns. AprioriALL
[1] is based on the Apriori algorithm. In each pass, the large
sequences from the previous pass are used to generate the
candidate sequences, and their support is then measured by
making a pass over the database.
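The pass structure described above, in which the large sequences of one pass seed the candidates of the next, can be sketched as a join-and-prune step. This is an illustrative sketch only: it assumes sequences of single items represented as Python tuples, and the function name generate_candidates is hypothetical rather than taken from [1].

```python
from itertools import combinations

def generate_candidates(large_prev):
    """Build candidate k-sequences from the large (k-1)-sequences.

    large_prev: set of equal-length tuples (assumed representation).
    """
    large_prev = set(large_prev)
    joined = set()
    # Join step: two (k-1)-sequences that share their first k-2 items
    # give a k-sequence ending with the last item of each, in order.
    for p in large_prev:
        for q in large_prev:
            if p[:-1] == q[:-1]:
                joined.add(p + (q[-1],))
    # Prune step: every (k-1)-subsequence of a candidate must be large.
    return {
        c for c in joined
        if all(sub in large_prev for sub in combinations(c, len(c) - 1))
    }

# Large 2-sequences taken from the worked example later in the paper.
l2 = {("a", "b"), ("a", "e"), ("b", "e")}
c3 = generate_candidates(l2)
print(c3)  # {('a', 'b', 'e')}
```

The prune step is where the Apriori property pays off: a candidate such as <a e b> is discarded without touching the database because its subsequence <e b> is not large.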
In this paper, both the Apriori based algorithm AprioriALL [1]
and the modified algorithm AprioriAll_Set are implemented to
mine frequent sequential patterns.
II. RELATED WORK
Since the mid-1990s, following Agrawal and Srikant [1], many
researchers have provided more efficient algorithms
[8][9][10][11][12][13]. Besides these, work has been done to
extend the mining of sequential patterns to other time-related
patterns. Existing efforts to find sequential patterns in
time-related data fall mainly into two classes. The first,
developed by Agrawal and Srikant [14], extends the well-known
Apriori algorithm. Algorithms of this type are based on the
Apriori property that any sub-pattern of a frequent pattern is
also frequent [1]. The second, using a pattern growth approach
[8], employs the same idea as the PrefixSpan algorithm, which
divides the original database into smaller sub-databases and
solves them recursively.
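The divide-and-recurse idea behind the pattern growth approach can be illustrated with a heavily simplified sketch. It assumes sequences of single items and an absolute count threshold; the function name prefix_span and the toy database are hypothetical, and the sketch omits the itemset handling and optimizations of the algorithm in [8].

```python
def prefix_span(database, min_count, prefix=()):
    """Return {pattern: count} for frequent patterns extending `prefix`."""
    # Count each item once per sequence of the (projected) database.
    counts = {}
    for seq in database:
        for item in set(seq):
            counts[item] = counts.get(item, 0) + 1

    patterns = {}
    for item in sorted(counts):
        if counts[item] < min_count:
            continue
        pattern = prefix + (item,)
        patterns[pattern] = counts[item]
        # Projection: keep only what follows the first occurrence of item.
        projected = [seq[seq.index(item) + 1:] for seq in database if item in seq]
        # Recurse on the smaller sub-database.
        patterns.update(prefix_span(projected, min_count, pattern))
    return patterns

# Hypothetical toy database of three event sequences.
toy_db = [["a", "b", "e"], ["b", "a", "e"], ["a", "b", "d", "e"]]
patterns = prefix_span(toy_db, min_count=3)
print(patterns)
```

No candidate sequences are generated here; each recursive call only looks at the suffixes that remain after the current prefix, which is why the sub-databases shrink as patterns grow.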
Improving Efficiency of Apriori Algorithms for
Sequential Pattern Mining
Alpa Reshamwala and Dr. Sunita Mahajan


Previous research addresses time intervals in two typical
ways: first by the time-window approach, and second by
completely ignoring the time interval. The time-window
approach requires the length of the time window to be
specified in advance. A sequential pattern mined from the
database is thus a sequence of windows, each of which
includes a set of patterns. Patterns in the same time window
are bought in the same time period. Srikant and Agrawal
specified the maximum interval (max-interval), the minimum
interval (min-interval) and the sliding time window size
(window-size) in their algorithm [12]. However, it cannot
find a pattern whose interval between any two sequences is
not in the range of the window-size. Agrawal and Srikant [1]
introduced traditional sequential mining by ignoring the time
interval and including only the temporal order of the patterns.
To address the intervals between successive patterns in a
sequence database, Chen et al. proposed a generalization of
sequential patterns, called time-interval sequential patterns,
which reveals not only the order of patterns, but also the time
intervals between successive patterns [4]. Chen et al.
developed algorithms to find sequential patterns using both
approaches [4]. Their work, assuming a fixed partition of the
time interval, developed two efficient algorithms, I-Apriori
and I-PrefixSpan. The first is based on the conventional
Apriori algorithm, while the second is based on the PrefixSpan
algorithm.
An extension of the algorithm developed by Chen et al. [4],
which solves the problem of sharp boundaries by providing a
smooth transition between members and non-members of a set, is
presented by Chen et al. [5]. The sharp boundary problem can
be solved by the concept of fuzzy sets, which leads to the
fuzzy time-interval (FTI) pattern. Two efficient algorithms,
FTI-Apriori and FTI-PrefixSpan, were developed for mining FTI
sequential patterns. Several other reasons support the use of
FTI in place of crisp time intervals. First, human knowledge
can be easily represented by fuzzy logic. Second, it is widely
recognized that many real-world situations are intrinsically
fuzzy, and the partition of time intervals is one of them.
Third, FTI is simple and easy for users. Fuzzy logic addresses
the formal principles of approximate reasoning. It provides a
sound foundation for handling imprecision and vagueness, as
well as mature inference mechanisms based on varying degrees
of truth. As boundaries are not always clearly defined, fuzzy
logic can be used to identify complex patterns or behavior
variations. This can be accomplished by building an intrusion
detection system that combines fuzzy logic rules with an
expert system in charge of evaluating rule truthfulness. In
[6], the authors contributed to the ongoing research on FTI
sequential pattern mining by proposing an algorithm to detect
and classify audit sequential patterns in network traffic
data. That paper also defines the confidence of FTI audit
sequences, which had not been defined in previous research. In
[7], S. Mahajan and A. Reshamwala proposed an algorithm that
uses a fuzzy genetic approach to discover optimized sequences
in network traffic data in order to classify and detect
intrusions.
Anrong et al. [15] address the application of sequential
patterns in intrusion detection by refining the pattern rules
and reducing redundant rules. Their work implements the
PrefixSpan algorithm in the data mining module of a network
intrusion detection system (NIDS). Shang Gao et al. [16]
describe a set-based approach for mining association rules and
finding frequent sequential patterns in customer transactional
databases. Their approach relaxes the constraints described in
Apriori (All/Some), and improves the performance while
being more user-oriented and self-adaptive than the
probabilistic knowledge representation. In [17], A.
Reshamwala and S. Mahajan evaluated their approaches on the
KDD Cup 1999 dataset to predict DoS attack sequences, and
concluded that Approach 2, which divides the sequence by a
timestamp window of 1 day (86400 seconds), gives more
efficient results.
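The timestamp windowing mentioned above can be illustrated with a small sketch. The event tuples, the function name split_into_sequences and the choice of integer-dividing each timestamp by the window length are assumptions made for illustration; [17] may form its windows differently.

```python
from collections import defaultdict

WINDOW_SECONDS = 86400  # 1 day, the window length quoted in the text

def split_into_sequences(events, window=WINDOW_SECONDS):
    """Group (timestamp_seconds, label) events into one sequence per window."""
    sequences = defaultdict(list)
    # Sorting keeps the temporal order of events inside each window.
    for timestamp, label in sorted(events):
        sequences[timestamp // window].append(label)
    return dict(sequences)

# Hypothetical events: (seconds since start of capture, event label).
events = [(10, "a"), (86399, "b"), (86400, "c"), (90000, "a"), (172800, "e")]
sequences = split_into_sequences(events)
print(sequences)  # {0: ['a', 'b'], 1: ['c', 'a'], 2: ['e']}
```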
III. SET THEORY
Set theory is the branch of mathematical logic that
studies sets, which are collections of objects. Any type of
object can be collected into a set. Set theory features
binary operations on sets:
Union of the sets A and B, denoted A ∪ B, is the set of all
objects that are a member of A, or B, or both. The union of {1,
2, 3} and {2, 3, 4} is the set {1, 2, 3, 4}.
Intersection of the sets A and B, denoted A ∩ B, is the set
of all objects that are members of both A and B. The
intersection of {1, 2, 3} and {2, 3, 4} is the set {2, 3}.
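The two operations map directly onto built-in set types in most languages. The short sketch below uses Python sets to reproduce the two examples from the text, then intersects several sets at once; the sid_sets values are hypothetical.

```python
a = {1, 2, 3}
b = {2, 3, 4}

union = a | b          # members of a, or b, or both
intersection = a & b   # members of both a and b

# Intersecting more than two sets in one call (hypothetical Sid sets).
sid_sets = [{10, 20, 30}, {10, 30, 40}, {10, 30}]
common = set.intersection(*sid_sets)

print(sorted(union), sorted(intersection), sorted(common))
# [1, 2, 3, 4] [2, 3] [10, 30]
```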
Consider the sequence database shown in Table 1. The
length of a sequence is the number of itemsets in the sequence.
A sequence of length k is called a k-sequence. The sequence
formed by the concatenation of two sequences x and y is
denoted as x.y. The support for an itemset i is defined as the
fraction of customers who bought the items in i in a single
transaction. Thus the itemset i and the 1-sequence <i> have
the same support. An itemset with minimum support is called
a large itemset or litemset.
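As a concrete illustration of support, the sketch below counts the fraction of sequences that contain a pattern as an order-preserving subsequence. It uses the sequences of Table 1 with the timestamps dropped and every itemset reduced to a single item, which is a simplifying assumption; the helper names contains and support are hypothetical.

```python
def contains(sequence, pattern):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # Each `in` consumes the iterator up to the match, enforcing order.
    return all(item in remaining for item in pattern)

def support(database, pattern):
    """Fraction of sequences in the database that contain the pattern."""
    hits = sum(contains(seq, pattern) for seq in database.values())
    return hits / len(database)

# Table 1 without timestamps: Sid -> sequence of items.
table_1 = {
    10: list("abe"), 20: list("dad"), 30: list("bae"), 40: list("fbc"),
    50: list("abde"), 60: list("abe"), 70: list("jah"), 80: list("cif"),
    90: list("hab"), 100: list("gabe"),
}

print(support(table_1, "a"))    # 0.8
print(support(table_1, "e"))    # 0.5
print(support(table_1, "abe"))  # 0.4
```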
IV. APRIORIALL SET BASED ALGORITHM
Figure 1 depicts the working of the algorithm to find frequent
sequences using set theory. Consider the sequence dataset D
in Table 1. To avoid multiple scans of the dataset D, the
dataset is stored in a HashMap data structure in Java. For
the example in Figure 1, the longest frequent sequence
pattern obtained is <a b e>, with minimum support >= 0.3.

Input: sequence dataset D (listed in Table 1), SUPmin = 0.3

1st Scan
Sequence Support
<a> 0.8
<b> 0.7
<c> 0.2
<d> 0.2
<e> 0.5
<f> 0.2
<g> 0.1
<h> 0.2
<i> 0.1
<j> 0.1

Frequent 1-sequences with their Sid sets
Sid Sequence Support
[10,20,30,50,60,70,90,100] <a> 0.8
[10,30,40,50,60,90,100] <b> 0.7
[10,30,50,60,100] <e> 0.5

Reduced dataset
Sid Sequence
10 <(a,1),(b,4),(e,29)>
30 <(b,1),(a,11),(e,28)>
50 <(a,4),(b,5),(d,10),(e,28)>
60 <(a,0),(b,5),(e,30)>
100 <(g,0),(a,0),(b,3),(e,30)>

Frequent 2-sequences with their Sid sets
Sid Sequence Support
[10,30,50,60,100] <a b> 0.4
[10,30,50,60,100] <a e> 0.5
[10,30,50,60,100] <b e> 0.5

2nd Scan
Sequence Support
<a b e> 0.4

Figure 1: AprioriAll_Set Algorithm

Table 1: Sequence Database
Sid Audit Sequence
10 <(a,1),(b,4),(e,29)>
20 <(d,1),(a,2),(d,24)>
30 <(b,1),(a,11),(e,28)>
40 <(f,1), (b,5),(c,19)>
50 <(a,4),(b,5),(d,10),(e,28)>
60 <(a,0),(b,5),(e,30)>
70 <(j,2),(a,17),(h,17)>
80 <(c,3),(i,10),(f,18)>
90 <(h,4),(a,10),(b,21)>
100 <(g,0),(a,0),(b,3),(e,30)>


Now apply the AprioriAll_Set algorithm of candidate generation
with a minimum support of 0.3. In the first pass, L_0 is found
by scanning the dataset D to generate large 1-sequences. The
candidates C_1 are then generated by the Apriori principle.
Finding the L_1 that satisfies min_supp = 0.3 gives the
1-sequences <a>, <b> and <e>. A set of Sequence_ids (Sid) is
also formed for each of these L_1 candidates, as shown in
Figure 1. For example, the Sid sets for the 1-sequences are
<a>: {10, 20, 30, 50, 60, 70, 90, 100},
<b>: {10, 30, 40, 50, 60, 90, 100} and
<e>: {10, 30, 50, 60, 100}.
The interestingness of the 1-sequences is found by applying
set intersection to the Sid sets of all the candidates in L_1:
Sid<a> ∩ Sid<b> ∩ Sid<e>
In the next pass, that is when k >= 2, only those
Sequence_ids that resulted from the previous pass's
intersection of the Sids of the l-sequences are considered,
where l is the length of the sequence. When l = 1, this gives
the Sid set {10, 30, 50, 60, 100}. Thus C_2 is generated from
this reduced dataset D stored as a hash map. Finding the L_2
that satisfies min_supp = 0.3 gives the 2-sequences <a b>,
<a e> and <b e>. A set of Sequence_ids is also formed for each
of these L_2 candidates, as shown in Figure 1. For example,
the Sid sets for the 2-sequences are
<a b>: {10, 30, 50, 60, 100},
<a e>: {10, 30, 50, 60, 100} and
<b e>: {10, 30, 50, 60, 100}.
Similarly, the interestingness of the k-sequences is found by
intersection of the Sid sets of all the candidates in L_k.
For example, in Figure 1, the interestingness of the
2-sequences can be improved by applying set intersection to
the Sid sets of all the candidates in L_2:
Sid<a b> ∩ Sid<a e> ∩ Sid<b e>
This results in the Sid set {10, 30, 50, 60, 100}. The earlier
pass is repeated up to L_k. The frequent sequences are the
union of the L_k.
The algorithm is as follows.
Algorithm:
L_0 = Scan the database to generate large 1-sequences;
C_1 = new candidates generated from L_0;
for each sequence c in the database do
    Increment the count of all candidates in C_1 that
    are contained in c;
    L_1 = Candidates in C_1 with minimum support;
end.
Interestingness of the 1-sequences is found by intersection
of the Sid sets of all the candidates in L_1:
    Sid<i_1> ∩ Sid<i_2> ∩ ... ∩ Sid<i_n>; i_1, i_2, ..., i_n - itemsets
for (k = 2; L_{k-1} ≠ ∅; k++) do
begin
    L_k = Candidates with minimum support;
    Interestingness of the k-sequences is found by
    intersection of the Sid sets of all the
    candidates in L_k;
end.
Maximal sequences in ∪_k L_k.
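The steps above can be sketched end to end on the Table 1 data. This is one possible reading of the pseudocode, not the authors' Java implementation: a Python dict stands in for the HashMap, itemsets are reduced to single items, timestamps are dropped, support is always measured against the original number of sequences, and all function names are hypothetical. The sketch rescans the shrinking dataset in every pass and does not attempt to reproduce the two-scan bound reported in the paper. Because it checks containment strictly in order, its Sid set for <a b> omits Sid 30 (where b precedes a), so its intermediate Sid sets can differ from those drawn in Figure 1; the final pattern and supports agree.

```python
from itertools import combinations

def contains(sequence, pattern):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)

def generate_candidates(large_prev):
    """Join large (k-1)-sequences, then prune by the Apriori property."""
    joined = {p + (q[-1],) for p in large_prev for q in large_prev
              if p[:-1] == q[:-1]}
    return {c for c in joined
            if all(s in large_prev for s in combinations(c, len(c) - 1))}

def apriori_all_set(database, min_sup):
    """Return {frequent sequence: support} using Sid-set intersection."""
    total = len(database)  # supports are relative to the original size
    candidates = {(item,) for seq in database.values() for item in seq}
    frequent = {}
    while candidates:
        # Sid set of every candidate within the current (reduced) dataset.
        sids = {c: {sid for sid, seq in database.items() if contains(seq, c)}
                for c in candidates}
        large = {c: s for c, s in sids.items() if len(s) / total >= min_sup}
        if not large:
            break
        frequent.update({c: len(s) / total for c, s in large.items()})
        # Shrink the dataset to the Sids shared by all large sequences.
        keep = set.intersection(*large.values())
        database = {sid: seq for sid, seq in database.items() if sid in keep}
        candidates = generate_candidates(set(large))
    return frequent

# Table 1 without timestamps: Sid -> sequence of items.
table_1 = {
    10: list("abe"), 20: list("dad"), 30: list("bae"), 40: list("fbc"),
    50: list("abde"), 60: list("abe"), 70: list("jah"), 80: list("cif"),
    90: list("hab"), 100: list("gabe"),
}

frequent = apriori_all_set(table_1, min_sup=0.3)
longest = max(frequent, key=len)
print(longest, frequent[longest])  # ('a', 'b', 'e') 0.4
```

On this data the dataset shrinks from ten sequences to five after the first pass, so the 2-sequence candidates are checked against half as many sequences.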

V. RESULTS AND DISCUSSION
In this section, both algorithms, AprioriALL [1] and
AprioriAll_Set, are implemented to mine sequential patterns
without time intervals.
These algorithms were implemented in the Sun Java language
and tested on an Intel Core Duo processor, 2.10 GHz, with
2 GB main memory under the Windows XP operating system.
The dataset used for simulation is the KDD Cup 1999
dataset, used to detect DoS attack sequences in network
traffic data. The sequence dataset is formed using the second
approach in [17]: the sequence is divided by a timestamp
window of 1 day or 86400 seconds.
AprioriAll_Set, based on traditional set theory, shrinks the
database size. It also scans the database at most twice. Also,
as the interestingness of the itemset increases with the
database shrinking, it leads to the longest sequences. As the
database is reduced, the time taken to mine sequences also
reduces, and the algorithm is faster than traditional
algorithms. The complexity of the algorithm can also be
reduced. As can be observed in Figure 3, AprioriAll_Set
generates efficient sequential patterns as per the Apriori
principle. Also, it takes only 2 fixed database scans for a
k-itemset, as compared to k database scans for a k-itemset in
the AprioriALL algorithm. The itemsets which satisfy the
minimum support constraint together generate the longest
sequences. The interestingness of the itemset increases by
taking the intersection of the sequence-ids in which the
itemsets are present.
The first comparison is based on the performance of the
two algorithms as the minimum support threshold is varied
from 20% to 90%. Figure 2 summarizes the results. At a
minimum support of 20%, the AprioriAll_Set algorithm is
approximately 1.5 times faster than the AprioriALL algorithm.

Figure 2: Performance of AprioriAll_Set Algorithm


Figure 3: No. of Patterns of AprioriAll_Set Algorithm
The second comparison is of the number of frequent sequence
patterns found by these algorithms as the minimum support
threshold is varied. The results in Figure 3 show that
AprioriAll_Set generates fewer sequential patterns than
AprioriALL.
Figure 4 shows that the AprioriALL algorithm requires 34%
more memory than AprioriAll_Set when the minimum support is
20%.

Figure 4: Memory Usage of AprioriAll_Set Algorithm


Figure 5: Pattern Length Discovery of AprioriAll_Set Algorithm


Figure 6: Dataset Size of AprioriAll_Set Algorithm

Figure 5 shows that the AprioriALL algorithm generates
longer patterns than the AprioriAll_Set algorithm.
AprioriAll_Set, based on traditional set theory, shrinks the
database size, as shown in Figure 6. The comparison is based
on the dataset size as the minimum support threshold is
varied from 20% to 90%. The average dataset size per iteration
for both algorithms is shown in Figure 7.

Figure 7: Comparison of Dataset Size

[Plots for Figures 2 to 7 not reproduced. Recoverable panel
titles and axes: Performance, Run Time (ms) vs. Support (%);
Pattern Discovery, No. of Patterns vs. Support (%); Memory
Usage, Memory (MB) vs. Support (%); Pattern Length Discovery,
Average Length of Patterns vs. Support (%); Dataset Size -
AprioriAll_Set, Percentage vs. Iterations for supports 0.2,
0.4, 0.6, 0.7 and 0.9; Dataset Size AprioriALL_Set, Percentage
vs. Support (%). The first four panels compare AprioriALL_Set
and AprioriALL.]

VI. CONCLUSION AND FUTURE ENHANCEMENT
On applying AprioriALL and AprioriAll_Set to the KDD Cup
1999 dataset, the results indicate that AprioriAll_Set is
faster and generates fewer sequential patterns than
AprioriALL. Also, the AprioriALL algorithm requires more
memory and generates longer patterns than the AprioriAll_Set
algorithm. On applying the set intersection operation, the
interestingness of the itemset is increased in AprioriAll_Set.
Dataset shrinking in AprioriAll_Set leads to efficient
sequences with high associativity. Lastly, in AprioriAll_Set,
as the dataset is stored in a HashMap data structure, the
number of scans of the dataset is reduced.
In these experiments, sequence patterns were discovered by
ignoring the time interval and including only the temporal
order of the patterns. As a future enhancement, the approach
can be extended to more set-based mathematical models for
further data analysis in order to discover hidden sequential
patterns. To address the intervals between successive patterns
in a sequence database, Chen et al. proposed a generalization
of sequential patterns, called time-interval sequential
patterns, which reveals not only the order of patterns, but
also the time intervals between successive patterns [4]. An
extension of the algorithm developed by Chen et al. [4] can
also be implemented to solve the problem of sharp boundaries
by providing a smooth transition between members and
non-members of a set, as addressed in Chen et al. [5]. Also,
as proposed in [7], a fuzzy genetic approach to discover
optimized sequences in network traffic data for classifying
and detecting intrusions can be implemented.

REFERENCES
[1] R. Agrawal and R. Srikant, "Mining sequential patterns", In Proc. Int.
Conf. Data Engineering, pp. 3-14, 1995.
[2] Y. L. Chen, S. S. Chen and P. Y. Hsu, "Mining hybrid sequential
patterns and sequential rules", Inf. Syst., vol. 27, no. 5, pp. 345-362,
2002.
[3] J. Han and M. Kamber, Data Mining: Concepts and Techniques, New
York: Academic, 2001.
[4] Y. L. Chen, M. C. Chiang and M. T. Ko, "Discovering time-interval
sequential patterns in sequence databases", Expert Systems with
Applications, vol. 25, no. 3, pp. 343-354, 2003.
[5] Yen-Liang Chen and Tony Cheng-Kui Huang, "Discovering Fuzzy
Time-Interval Sequential Patterns in Sequence Databases", IEEE
Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics,
vol. 35, pp. 959-972, 2005.
[6] Sunita Mahajan and Alpa Reshamwala, "Amalgamation of IDS
Classification with Fuzzy techniques for Sequential pattern mining",
IJCA Proceedings on International Conference on Technology
Systems and Management - ICTSM 2011, Number 3 - Article 7,
pp. 9-14, 2011.
[7] Sunita Mahajan and Alpa Reshamwala, "An Approach to Optimize
Fuzzy Time-Interval Sequential Patterns Using Multi-objective Genetic
Algorithm", ICTSM 2011, CCIS 145, Springer-Verlag Berlin
Heidelberg, pp. 115-120, 2011.
[8] Pei, J., Han, J., Pinto, H., Chen, Q., Dayal, U., and Hsu, M.-C.,
"PrefixSpan: Mining sequential patterns efficiently by prefix-projected
pattern growth", Proceedings of 2001 International Conference on Data
Engineering, pp. 215-224, 2001.
[9] J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate
Generation", Proc. of ACM-SIGMOD Int'l Conf. Management of Data
(SIGMOD '00), pp. 1-12, 2000.
[10] J. Ayres, J. Gehrke, T. Yiu, and J. Flannick, "Sequential PAttern Mining
using A Bitmap Representation", In Proceedings of ACM SIGKDD on
Knowledge discovery and data mining, pp. 429-435, 2002.
[11] Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U. and Hsu, M.-C.,
"FreeSpan: Frequent pattern-projected sequential pattern mining",
Proceedings of 2000 International Conference on Knowledge Discovery
and Data Mining, pp. 355-359, 2000.
[12] Srikant, R. and Agrawal, R., "Mining sequential patterns:
Generalizations and performance improvements", Proceedings of the 5th
International Conference on Extending Database Technology, pp. 3-17,
1996.
[13] Zaki, M. J., "SPADE: An efficient algorithm for mining frequent
sequences", Machine Learning, vol. 42, no. 1-2, pp. 31-60, 2001.
[14] R. Agrawal and R. Srikant, "Fast algorithms for mining association
rules", Proceedings of the 20th VLDB Conference, Santiago, Chile,
pp. 487-499, 1994.
[15] XUE Anrong, HONG Shijie, JU Shiguan and CHEN Weihe,
"Application of Sequential Patterns Based on User's Interest in Intrusion
Detection", Proceedings of 2008 IEEE International Symposium on IT
in Medicine and Education, pp. 1089-1093, 2008.
[16] Shang Gao, Reda Alhaji, Jon Rokne and Jiwen Guan, "Set Based
Approach in Mining Sequential Patterns", 24th International Symposium
on Computer and Information Sciences, ISCIS 2009, pp. 218-223,
2009.
[17] Alpa Reshamwala and Dr. Sunita Mahajan, "Prediction of DoS attack
Sequences", Proceedings of International Conference on
Communication, Information & Computing Technology (ICCICT), pp.
1-5, 2012.

Ms. Alpa Reshamwala is currently working as an Assistant
Professor in the Department of Computer Engineering at
MPSTME, NMIMS University. She received her B.E. degree in
Computer Engineering from Fr. CRCE, Bandra, Mumbai University
in 2000 and her M.E. degree in Computer Engineering from TSEC,
Mumbai University in 2008. Her areas of interest include
Artificial Intelligence, Data Mining, Soft Computing, Fuzzy
Logic, Neural Networks and Genetic Algorithms. She has 24
papers in national and international conferences and journals
to her credit.

Dr. Sunita M. Mahajan is currently working as the Principal,
Mumbai Educational Trust's Institute of Computer Science. She
received her doctorate from S.N.D.T. Women's University in
1997. She worked as a senior scientist at Bhabha Atomic
Research Centre for 31 years and entered the educational field
after her retirement. She has done extensive work in parallel
processing. She has more than 45 papers in national and
international conferences and journals to her credit. She has
guided many PhD students in distributed computing, data
mining, natural language processing, etc. Her current fields
of interest are parallel processing, distributed computing,
cloud computing and data mining. She has also written a
textbook on Distributed Computing (New Delhi, Oxford
University Press, 2010).
