Abstract
In this paper, a Pareto-based multi-objective differential evolution (DE) algorithm is proposed as a search strategy for mining accurate and comprehensible numeric association rules (ARs) that are optimal in the wider sense that no other rules are superior to them when all objectives are considered simultaneously. The proposed DE guides the search for ARs toward the global Pareto-optimal set while maintaining adequate population diversity to capture as many high-quality ARs as possible. The AR mining problem is formulated as a four-objective optimization problem: support, confidence, and comprehensibility of the rule are maximized, while the amplitude of the intervals that make up the itemset and rule is minimized. The algorithm is designed to search simultaneously for the intervals of the numeric attributes and for the ARs these intervals form, in a single run of DE. Unlike the usual methods, ARs are mined directly without generating frequent itemsets. The proposed DE follows a database-independent approach that does not rely on minimum support and minimum confidence thresholds, which are hard to determine for each database. The efficiency of the proposed DE is validated on synthetic and real databases.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Data mining; Machine learning; Evolutionary computation; Multi-objective optimization; Differential evolution
1. Introduction
Data mining is the extraction of implicit, valid, and potentially useful knowledge from large volumes of raw data [1]. The extracted knowledge must be not only accurate but also readable, comprehensible, and easy to understand. There are many data mining tasks, such as mining ARs, sequential patterns, classification, clustering, and time series analysis, and many techniques and algorithms have been developed for these tasks and for the different types of data involved. When the data contain continuous values, mining becomes difficult and special techniques need to be developed.
DE, which can be categorized as a novel class of floating-point-encoded evolutionary optimization algorithm, is a new and powerful algorithm owing to its convergence characteristics and its few control parameters. The population reproduction and selection schemes of DE differ from those of other evolutionary algorithms.
* Corresponding author. Tel.: +90 424 237 00 00; fax: +90 424 218 19 07.
E-mail addresses: balatas@firat.edu.tr, bilalalatas@yahoo.com (B. Alatas),
eakin@firat.edu.tr (E. Akin), akarci@firat.edu.tr (A. Karci).
1568-4946/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.asoc.2007.05.003
The canonical EA, the GA [22], and population-based incremental learning [23] work with strings of bits or integers (letters). Evolution strategy (ES) [24] and DE both work with vectors of real numbers to represent candidate solutions. Mutation is the main step in ES; however, whereas ES typically uses adaptive mutation rates for the vectors themselves, DE uses mutations built from the differences of the parameter vectors, as described in the following subsection.
DE generates new offspring by forming a noisy replica (trial vector) of each parent individual (target vector) of the population. The population is successively improved by three basic operators: mutation, crossover, and selection. Although these names are the same as those used in EAs, the ways they are performed are different: DE devises its own mutation, crossover, and selection operators and redefines them in the present context. First, the mutation operator, which plays the key role in the optimization process, creates mutant vectors by perturbing each target vector with the weighted difference of two other randomly selected individuals. The perturbation, which can involve either one or two pairs of vectors, can be applied either to a randomly selected vector from the population or to the best candidate solution found so far. Then, the crossover operator generates trial vectors by mixing the parameters of the mutant vectors with those of the target vectors according to a selected probability distribution; crossover can be based on a binomial or an exponential distribution. Finally, the selection operator forms the next generation by deterministically choosing, between each trial vector and its corresponding target vector, whichever better fits the objective function. The interesting point in selection is that a trial vector is not compared against all individuals in the current population, but only against its one counterpart target individual. These operators are repeated over generations until the termination criteria are met.
The schematic diagram in Fig. 1 provides a way to visualize
the working principle of DE and simple pseudo-code of DE for
solving single-objective optimization is given in Fig. 2.
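The working principle described above can also be sketched in a few lines of code. The following is a minimal, self-contained DE/rand/1/bin loop for single-objective minimization; all names, defaults, and the sphere test function are illustrative, not taken from the paper.

```python
import random

def de_optimize(objective, bounds, Np=20, F=0.8, CR=0.9, generations=200, seed=1):
    """Minimal DE/rand/1/bin sketch for single-objective minimization.

    `objective` maps a list of floats to a scalar; `bounds` is a list of
    (low, high) pairs, one per decision variable.
    """
    rng = random.Random(seed)
    D = len(bounds)
    # Initial population: uniform within the feasible bounds (cf. Eq. (3)).
    pop = [[lo + rng.random() * (hi - lo) for lo, hi in bounds] for _ in range(Np)]
    fit = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(Np):
            # Mutation: perturb with the weighted difference of two others.
            a, b, c = rng.sample([k for k in range(Np) if k != i], 3)
            mutant = [pop[a][j] + F * (pop[b][j] - pop[c][j]) for j in range(D)]
            # Binomial crossover: mix mutant and target parameters.
            jrand = rng.randrange(D)
            trial = [mutant[j] if (rng.random() < CR or j == jrand) else pop[i][j]
                     for j in range(D)]
            trial = [min(max(v, lo), hi) for v, (lo, hi) in zip(trial, bounds)]
            # Selection: the trial competes only against its own target vector.
            f_trial = objective(trial)
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial
    best = min(range(Np), key=lambda i: fit[i])
    return pop[best], fit[best]

# Sphere function: the global minimum is 0 at the origin.
x, fx = de_optimize(lambda v: sum(t * t for t in v), [(-5.0, 5.0)] * 3)
```

Note how selection matches the text: each trial vector is compared only against its counterpart target, never against the whole population.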
3.1. Initial population
The initial population is created by assigning random values
which lie inside the feasible bounds of the decision variable to
each decision parameter of each individual of the population as
shown in the following equation:
X^0_{j,i} = X^min_j + h_j (X^max_j - X^min_j), i = 1, ..., Np; j = 1, ..., D   (3)
where X^min_j and X^max_j are the lower and upper bounds of the jth decision parameter, respectively, and h_j ∈ [0,1] is a uniformly distributed random number generated anew for each value of j.

3.2. Mutation

X^G_i = X^G_a + F (X^G_b - X^G_c), i = 1, ..., Np; a ≠ b ≠ c ≠ i   (4)

X^G_i = X^G_best + F (X^G_a - X^G_b), i = 1, ..., Np; best ≠ a ≠ b ≠ i   (5)

X^G_i = X^G_best + F (X^G_a - X^G_b + X^G_c - X^G_d), i = 1, ..., Np; a ≠ b ≠ c ≠ d ≠ i   (6)

X^G_i = X^G_a + λ (X^G_best - X^G_a) + F (X^G_b - X^G_c), i = 1, ..., Np; best ≠ a ≠ b ≠ c ≠ i   (7)

X^G_best is the best solution found so far. Perturbing the best solution found so far with two difference vectors can present a higher convergence rate in global optimization. Denoting by V^G_i the mutant vector produced by one of Eqs. (4)-(7), the trial vector U^G_i is formed parameter-wise by crossover,

U^G_{j,i} = V^G_{j,i} if rand_j ≤ CR or j = j_rand; X^G_{j,i} otherwise   (8)

and the next generation is formed by selection,

X^{G+1}_i = U^G_i if U^G_i yields a better objective value than X^G_i; X^G_i otherwise, i = 1, ..., Np   (9)
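The four perturbation schemes can be sketched as follows; this is an illustrative helper (function and parameter names are mine, not the paper's), corresponding to the classic DE/rand/1, DE/best/1, DE/best/2, and DE/rand-to-best/1 variants.

```python
import random

def mutant(scheme, pop, i, best, F=0.5, lam=0.5, rng=random):
    """Build the mutant vector for target index i under one of four
    classic DE schemes (cf. Eqs. (4)-(7))."""
    D = len(pop[0])
    # Draw four mutually distinct indices, all different from the target i.
    a, b, c, d = rng.sample([k for k in range(len(pop)) if k != i], 4)
    if scheme == "rand/1":          # Eq. (4): one difference pair
        return [pop[a][j] + F * (pop[b][j] - pop[c][j]) for j in range(D)]
    if scheme == "best/1":          # Eq. (5): perturb the best-so-far
        return [best[j] + F * (pop[a][j] - pop[b][j]) for j in range(D)]
    if scheme == "best/2":          # Eq. (6): best-so-far, two difference pairs
        return [best[j] + F * (pop[a][j] - pop[b][j] + pop[c][j] - pop[d][j])
                for j in range(D)]
    if scheme == "rand-to-best/1":  # Eq. (7): blend a random vector toward the best
        return [pop[a][j] + lam * (best[j] - pop[a][j]) + F * (pop[b][j] - pop[c][j])
                for j in range(D)]
    raise ValueError("unknown scheme: " + scheme)
```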
4. Multi-objective optimization
Multi-objective optimization is the problem of simultaneously optimizing a set S of two or more objective functions.
The objective functions typically measure different features of
a desired solution. Often these objectives are conflicting in that
there is no single solution which simultaneously optimizes all
functions. Instead, one has a set of optimal solutions. This set
can be defined using the notion of Pareto-optimality and is
commonly referred to as the Pareto-optimal set [27].
Assuming that the functions in S are to be maximized, a solution s is Pareto-optimal if there is no other solution s′ such that f_i(s′) ≥ f_i(s) for all f_i ∈ S and f_i(s′) > f_i(s) for at least one f_i ∈ S. Informally, this means that s is Pareto-optimal if and only if there is no feasible solution s′ that increases some objective function without simultaneously decreasing at least one other objective function. The solutions in the Pareto-optimal set are called non-dominated. Given two solutions s′ and s, s′ dominates s if f_i(s′) ≥ f_i(s) for all f_i ∈ S and f_i(s′) > f_i(s) for at least one f_i ∈ S. In other words, s′ is at least as good as s with respect to all objectives and better than s with respect to at least one objective.
The concepts of dominance and Pareto-optimality are illustrated in Fig. 3. Consider three solutions s1, s2, and s3, and assume that the two objectives o1 and o2 are to be maximized. s1 is not dominated by any other solution because it has the highest value for objective o2. Similarly, s2 is not dominated by any other solution because it has the highest value for objective o1. s3 is not dominated by s1 because s3 has a higher value for objective o1. However, s3 is dominated by s2 because it has lower values than s2 for both objectives o1 and o2. Thus, no solution dominates s1 or s2, and one solution, s2, dominates s3. Therefore, the set of non-dominated or Pareto-optimal solutions is Pareto_Set = {s1, s2}.
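The dominance test and the extraction of the non-dominated set follow directly from the definition above; this is a generic sketch, not the paper's implementation, and the objective values for s1, s2, s3 are illustrative numbers consistent with the worked example.

```python
def dominates(s_prime, s):
    """s_prime dominates s (all objectives maximized): at least as good in
    every objective and strictly better in at least one."""
    return (all(a >= b for a, b in zip(s_prime, s))
            and any(a > b for a, b in zip(s_prime, s)))

def pareto_set(solutions):
    """Return the non-dominated subset of a list of objective-value tuples."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t != s)]

# The three solutions of the worked example: s1 has the highest o2,
# s2 the highest o1, and s2 dominates s3.
s1, s2, s3 = (1.0, 5.0), (6.0, 2.0), (4.0, 1.0)
```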
The goal in multi-objective optimization is to find a diverse set of Pareto-optimal solutions. In evolutionary multi-objective optimization, such a set is typically produced in a single EA run. In the rule mining task considered here, a set of high-quality numeric ARs that are optimal in the wider sense that no other rules are superior to them when all objectives are considered simultaneously is mined in a single DE run.
5. The proposed differential evolution algorithm
(MODENAR)
In recent years, the techniques of evolutionary computation
have proven themselves useful in the area of data mining. For
the problem of rule mining, several objective functions have
been designed, relating to accuracy, comprehensibility and
interestingness in general [28]. However, when searching for numeric ARs, the DE operators must act on interval-valued components; the arithmetic on intervals used here is

[x, y] + [x', y'] = [x + x', y + y']   (10)

[x, y] - [x', y'] = [x - y', y - x']   (11)

c[x, y] = [c x, c y],  if c ≥ 0   (12)

c[x, y] = [c y, c x],  if c < 0   (13)
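Assuming Eqs. (10)-(13) denote conventional interval arithmetic (the extraction of this passage is partly garbled, so this is a hedged reconstruction with illustrative names), the operations can be implemented as:

```python
def interval_add(p, q):
    # [x1, x2] + [y1, y2] = [x1 + y1, x2 + y2]
    return (p[0] + q[0], p[1] + q[1])

def interval_sub(p, q):
    # [x1, x2] - [y1, y2] = [x1 - y2, x2 - y1]
    return (p[0] - q[1], p[1] - q[0])

def interval_scale(c, p):
    # c * [x1, x2]: the bounds swap when the scalar c is negative.
    return (c * p[0], c * p[1]) if c >= 0 else (c * p[1], c * p[0])
```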
Amplitude = (1/m) Σ_{i=1}^{m} (u_i - l_i) / (max(A_i) - min(A_i))   (15)

Here, m is the number of attributes in the itemset, u_i and l_i are the upper and lower bounds encoded in the itemset corresponding to attribute i, and max(A_i) and min(A_i) are the maximum and minimum values of the domain of attribute i.
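The amplitude objective of Eq. (15) is straightforward to compute; the following is a small sketch with illustrative names.

```python
def amplitude(intervals, domains):
    """Mean relative width of the encoded intervals, as in Eq. (15).

    `intervals` holds the (l_i, u_i) bounds encoded in the itemset and
    `domains` the (min(A_i), max(A_i)) of each attribute's domain; smaller
    values mean tighter, more specific intervals.
    """
    m = len(intervals)
    return sum((u - l) / (d_max - d_min)
               for (l, u), (d_min, d_max) in zip(intervals, domains)) / m
```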
X = (Y, round(Z))   (16)

Here, X is the D-dimensional parameter vector, Y the k-dimensional vector of continuous parameters, Z the vector of (D - k) discrete parameters, and round() a function for converting a continuous value to an integer value by truncation. In the case of integer variables, the population is initialized as follows:

X^0_{j,i} = X^min_j + h_j (X^max_j - X^min_j + 1), i = 1, ..., Np; j = 1, ..., D   (17)
X^{G+1}_{j,i} = (X^{G+1}_{j,i} + X^min_j) / 2  if X^{G+1}_{j,i} < X^min_j;
X^{G+1}_{j,i} = (X^{G+1}_{j,i} + X^max_j) / 2  if X^{G+1}_{j,i} > X^max_j   (18)

so that each repaired parameter is driven back inside its feasible range,

x_j ≤ x_i ≤ x̄_j   (19)
(19)
o
X
k1
wk f k XiG
(20)
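Eq. (20) is a plain weighted sum over the o objective values; a minimal sketch follows. How a minimization objective such as amplitude enters the sum (e.g. via negation or a complement) is an implementation detail not fixed here, and the sample objective values are invented.

```python
def weighted_fitness(objective_values, weights):
    """Scalarized fitness of a candidate, as in Eq. (20):
    f(X) = sum over k of w_k * f_k(X)."""
    return sum(w * f for w, f in zip(weights, objective_values))

# With the weights reported later for the synthetic experiment
# (support 0.8, confidence 0.2, comprehensibility 0.1, amplitude 0.4):
fitness = weighted_fitness([0.25, 1.0, 0.5, 0.1], [0.8, 0.2, 0.1, 0.4])
```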
Table 1
Synthetically created sets

A1 ∈ [1,10] ∧ A2 ∈ [15,30]
A1 ∈ [15,45] ∧ A3 ∈ [60,75]
A2 ∈ [65,90] ∧ A4 ∈ [15,45]
A3 ∈ [80,100] ∧ A4 ∈ [80,100]

Table 3
ARs found by MODENAR

Rule                              Support (%)   Confidence (%)
A1 ∈ [1,10] ⇒ A2 ∈ [15,30]        25            100
A1 ∈ [15,45] ⇒ A3 ∈ [60,75]       25            100
A3 ∈ [80,100] ⇒ A4 ∈ [80,98]      25            100
A2 ∈ [65,90] ⇒ A4 ∈ [15,43]       25            100
A2 ∈ [15,30] ⇒ A1 ∈ [1,10]        25            100
A3 ∈ [60,75] ⇒ A1 ∈ [15,45]       25            100
A4 ∈ [80,98] ⇒ A3 ∈ [80,100]      25            100
A4 ∈ [15,44] ⇒ A2 ∈ [65,89]       25            100

Records (%): 100
Some intervals in these sets were small while others were larger. The support and confidence values for these sets were 25% and 100%, respectively. The values outside these sets were distributed in such a way that no rules better than these exist. By using appropriate weights for the objectives in the AR mining task, the aim was to mine these rules and, in particular, to find the intervals of each of the built regions as accurately as possible; that is, to test whether MODENAR finds the association rules with the most accurate values for the numeric intervals of each attribute in the rule. The DE/rand/1 scheme has been used for the algorithm, and the parameter values used are shown in Table 2. The empirically determined weight values for support, confidence, comprehensibility, and the objective computed as in (15) for the amplitude of the intervals were 0.8, 0.2, 0.1, and 0.4, respectively.
In Table 3, the ARs found by MODENAR are shown. It can be seen that the algorithm found the comprehensible rules with high support and confidence values corresponding to the synthetically created sets. Note that MODENAR is database-independent, since it does not rely on support/confidence thresholds, which are hard to choose for each database. If support and confidence thresholds were used and a support threshold higher than 25% were selected, no rules could be found for the values of the attributes in this database, even though the database is known to contain accurate and comprehensible rules. MODENAR is able to find all these rules without relying on minimum support and minimum confidence thresholds.
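The support and confidence of a numeric interval rule can be computed by a direct scan of the records; the following is a generic sketch, not MODENAR's implementation, with illustrative names and data.

```python
def support_confidence(records, antecedent, consequent):
    """Support and confidence of a numeric interval rule over `records`.

    Each record is a dict attribute -> value; `antecedent` and `consequent`
    map attributes to (low, high) intervals.
    """
    def matches(rec, conds):
        return all(lo <= rec[attr] <= hi for attr, (lo, hi) in conds.items())

    n = len(records)
    n_ant = sum(matches(r, antecedent) for r in records)       # antecedent only
    both = dict(antecedent, **consequent)                      # whole rule
    n_rule = sum(matches(r, both) for r in records)
    return n_rule / n, (n_rule / n_ant if n_ant else 0.0)

recs = [{"A1": 5, "A2": 20}, {"A1": 5, "A2": 50},
        {"A1": 50, "A2": 20}, {"A1": 5, "A2": 25}]
s, c = support_confidence(recs, {"A1": (1, 10)}, {"A2": (15, 30)})
```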
To test the efficiency of the proposed algorithm, it has also been executed on a noisy synthetic database. The noise is introduced by placing values that do not belong to the interval of the second item of each set, so that a percentage r of records does not fulfill the pre-established interval of the second item. For example, for the first set there is a percentage r of records that do not fulfill the second item A2 ∈ [15,30]; their values are instead distributed in the ranges [0,14] or [31,100]. It was tested whether the algorithm obtains the most adequate ranges for the antecedents and consequents of the rules. This test was carried out with three levels of noise (4%, 6%, and 8%).
Table 2
The used parameter values for the synthetic database

Parameters              Values
Pop. size               10
No. of generations      1000
Crossover rate (CR)     0.3
Step length (F)         Generated for each variable from a Gaussian distribution N(0,1)
Table 4
ARs found by MODENAR in the noisy synthetic database

Rule                              Support (%)   Confidence (%)
r = 4%
A1 ∈ [1,10] ⇒ A2 ∈ [15,29]        24.1          100
A1 ∈ [15,45] ⇒ A3 ∈ [60,73]       24.0          100
A3 ∈ [80,100] ⇒ A4 ∈ [80,96]      23.7          96.7
A2 ∈ [65,90] ⇒ A4 ∈ [15,46]       24.2          98.3
A2 ∈ [15,29] ⇒ A1 ∈ [1,10]        24.1          100
A3 ∈ [60,73] ⇒ A1 ∈ [15,45]       24.0          100
A4 ∈ [80,96] ⇒ A3 ∈ [80,100]      23.7          96.7
A4 ∈ [15,46] ⇒ A2 ∈ [65,89]       24.2          98.3
Records (%): 96.0
r = 6%
A1 ∈ [1,11] ⇒ A2 ∈ [14,31]        23.3          98.9
A1 ∈ [15,45] ⇒ A3 ∈ [56,73]       23.6          99.0
A3 ∈ [80,100] ⇒ A4 ∈ [84,95]      23.3          94.5
A2 ∈ [65,89] ⇒ A4 ∈ [14,49]       23.8          97.8
A2 ∈ [14,31] ⇒ A1 ∈ [1,11]        23.3          98.9
A3 ∈ [56,73] ⇒ A1 ∈ [15,45]       23.6          99.0
A4 ∈ [84,95] ⇒ A3 ∈ [80,100]      23.3          94.5
A4 ∈ [14,49] ⇒ A2 ∈ [65,89]       23.8          97.8
Records (%): 94.0
r = 8%
A1 ∈ [1,11] ⇒ A2 ∈ [14,29]        22.4          97.6
A1 ∈ [15,45] ⇒ A3 ∈ [62,76]       22.9          98.0
A3 ∈ [79,100] ⇒ A4 ∈ [82,98]      22.8          93.4
A2 ∈ [65,90] ⇒ A4 ∈ [15,48]       23.7          95.8
A2 ∈ [14,29] ⇒ A1 ∈ [1,11]        22.4          97.6
A3 ∈ [62,76] ⇒ A1 ∈ [15,45]       22.9          98.0
A4 ∈ [82,98] ⇒ A3 ∈ [79,100]      22.8          93.4
A4 ∈ [15,48] ⇒ A2 ∈ [65,90]       23.7          95.8
Records (%): 91.8
Table 5
The used parameter values for the real databases

Parameters              Values
Pop. size               100
No. of generations      1000
Crossover rate (CR)     0.3
Step length (F)         Generated for each variable from a Gaussian distribution N(0,1)
Table 6
Real databases and comparison of mean support and confidence values

Database      No. of records   No. of attributes   Support (%)             Confidence (%)
                                                   Ref. [18]   MODENAR     Ref. [18]   MODENAR
Basketball    96               5                   33.8        48.0        60 ± 1.2    61 ± 2.1
Bodyfat       252              18                  44.2        52.4        59 ± 3.8    62 ± 3.2
Bolts         40               8                   39.0        55.4        65 ± 1.9    65 ± 1.8
Pollution     60               16                  41.2        54.2        68 ± 4.8    67 ± 2.7
Quake         2178             4                   43.8        55.4        62 ± 5.1    63 ± 2.8
Sleep         62               8                   32.8        48.8        64 ± 2.3    64 ± 3.4
Stock price   950              10                  48.2        53.8        52 ± 2.5    56 ± 1.9
Table 7
Comparisons of the results

              Support (%)                   Size                        Amplitude (%)
Database      MODENAR   GAR    Ref. [18]   MODENAR   GAR   Ref. [18]   MODENAR   GAR   Ref. [18]
Basketball    37.20     36.69  32.21       3.21      3.38  3.21        19        25    20
Bodyfat       65.22     65.26  63.29       6.87      7.45  7.06        25        29    27
Bolts         28.52     25.97  27.04       5.19      5.29  5.14        19        34    27
Pollution     44.85     46.55  38.95       6.24      7.32  6.21        15        15    14
Quake         39.86     38.65  36.96       2.03      2.33  2.1         17        25    19
Sleep         36.55     35.91  37.25       4.23      4.21  4.19        5         5     4
Stock price   45.29     45.25  46.21       6.01      5.80  6.20        22        26    22
Table 8
Mean number of sizes and mean size of amplitudes of the antecedent and consequent of the mined rules

Database      Size of antecedent   Size of consequent   Amplitude of antecedent (%)   Amplitude of consequent (%)
Basketball    1.19                 2.02                 17                            20
Bodyfat       2.47                 4.40                 27                            24
Bolts         2.18                 3.01                 20                            18
Pollution     2.84                 3.40                 13                            17
Quake         0.85                 1.18                 16                            18
Sleep         1.88                 2.35                 5                             5
Stock price   2.02                 3.99                 24                            21
Table 9
Percentages of records covered by the mined rules

Database      Records (%)
              MODENAR   GAR     Ref. [18]
Basketball    100       100     100
Bodyfat       86.11     86.0    84.12
Bolts         80.0      77.5    77.5
Pollution     95.0      95.0    95.0
Quake         88.9      87.5    87.6
Sleep         80.6      79.03   79.81
Stock price   98.73     99.26   98.99
[16] M. Kaya, R. Alhajj, Genetic algorithm based framework for mining fuzzy association rules, Fuzzy Sets Syst. 152 (3) (2005) 587-601.
[17] J. Mata, J.L. Alvarez, J.C. Riquelme, Discovering numeric association rules via evolutionary algorithm, in: Proceedings of the Sixth Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD-02 (LNAI), Taiwan, 2002, pp. 40-51.
[18] B. Alatas, E. Akin, An efficient genetic algorithm for automated mining of both positive and negative quantitative association rules, Soft Comput. 10 (3) (2006) 230-237.
[19] A. Ghosh, B. Nath, Multi-objective rule mining using genetic algorithms, Inform. Sci. 163 (1-3) (2004) 123-133.
[20] R. Storn, K. Price, Differential evolution: a simple and efficient adaptive scheme for global optimization over continuous spaces, Technical Report TR-95-012, ICSI, 1995.
[21] D. Karaboga, S. Okdem, A simple and global optimization algorithm for engineering problems: differential evolution algorithm, Turkish J. Electr. Eng. Comput. Sci. 12 (1) (2004) 53-60.
[22] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, New York, 1989.
[23] S. Baluja, Population-based incremental learning: a method for integrating genetic search based function optimization and competitive learning, Technical Report CMU-CS-94-163, Comp. Sci. Dep., Carnegie Mellon University, 1994.
[24] I. Rechenberg, Evolution strategy, in: Zurada et al. (Eds.), 1994, pp. 147-159.
[25] R. Storn, On the use of differential evolution for function optimization, Technical Report, ICSI, Berkeley, 1996.
[26] R. Perez-Guerrero, Differential evolution based power dispatch algorithms, Master Thesis, University of Puerto Rico, 2004.
[27] C.A. Coello, An updated survey of GA-based multi-objective optimization techniques, ACM Comput. Surveys 32 (2) (2000) 109-143.
[28] P. Strom, M.L. Hetland, Multiobjective evolution of temporal rules, in: Proceedings of the Eighth Scandinavian Conference on Artificial Intelligence, SCAI, IOS Press, 2003.
[29] B. Alatas, E. Akin, Rough differential evolution algorithm, in: Proceedings of the Second International Conference on Electronics and Computer Engineering, IKECCO 2005, Bishkek, Kyrgyzstan, 2005, pp. 173-178.
[30] R. Sarker, H.A. Abbass, Differential evolution for solving multi-objective optimization problems, Asia-Pacific J. Operat. Res. 21 (2) (2004) 225-240.
[31] J. Lampinen, I. Zelinka, Mixed integer-discrete-continuous optimization by differential evolution. Part 1: the optimization method, in: P. Osmera (Ed.), Proceedings of MENDEL'99, Fifth International Mendel Conference on Soft Computing, Brno, Czech Republic, 1999, pp. 71-76.
[32] H.A. Abbass, R. Sarker, C. Newton, PDE: a Pareto-frontier differential evolution approach for multi-objective optimization problems, in: Proceedings of the 2001 Congress on Evolutionary Computation, vol. 2, Seoul, South Korea, IEEE, Piscataway, NJ, USA, 2001, pp. 971-978.
[33] H.A. Guvenir, I. Uysal, Bilkent University Function Approximation Repository, 2000, http://funapp.cs.bilkent.edu.tr.