Professional Documents
Culture Documents
Database
Wirarama Wedashwara, Shingo Mabu, Masanao Obayashi and Takashi Kuremoto
Abstract
This research proposes a decision support system of database cluster optimization using genetic network programming
(GNP) with on-line rule based clustering. GNP optimizes cluster quality by reanalyzing weak points of each cluster and
maintaining rules stored in each cluster. The maintenance of rules includes : 1) adding new relevant rules, 2) moving
rules between clusters and 3) removing irrelevant rules. The simulations focus on optimizing cluster quality response
against several unbalanced data growth to the data-set that is working with storage rules. The simulation results of the
proposed method show better results compared to GNP rule based clustering without on-line optimization.
Keywords: Genetic Network Programming, Rule Based Clustering, Cluster Optimization
1.
Introduction
The 2015 International Conference on Artificial Life and Robotics (ICAROB 2015), Jan. 10-12, Oita, Japan
2.3. Silhouette
NTi
Ai
Ri
Ci
8,12
10,13
10
11,15
11
4,14
12
13
14
15
ba
max a, b
1 a / b if a b
0
if a b
b / a 1 if a b
The 2015 International Conference on Artificial Life and Robotics (ICAROB 2015), Jan. 10-12, Oita, Japan
3.
Extracted Rules
A1^B1
Support
Optimization
A1^B1^D2
A1^B1^D2^C2
D2^C2
D2^C2^A2
D2^C2^A2^B1
C1^D1
C1^D1^B3
C1^D1^B3^A1
The 2015 International Conference on Artificial Life and Robotics (ICAROB 2015), Jan. 10-12, Oita, Japan
Simulation
The 2015 International Conference on Artificial Life and Robotics (ICAROB 2015), Jan. 10-12, Oita, Japan
Table 3. Simulation Result of Rule updating frequency and Number of Data Comparison.
Step
Number of Data
Increment/Decrement
1
2
3
4
5
6
7
8
9
10
1000 (Default)
2000
3000
4000
5000
6000
5000
4000
3000
2000
+1000
+1000
+1000
+1000
+1000
-1000
-1000
-1000
-1000
Average
No rule updating
0.967
0.945
0.902
0.892
0.882
0.812
0.787
0.765
0.723
0.698
Silhouette values
frequency: 1000
frequency: 2000
0.967
0.967
0.965*
0.944
0.923*
0.915*
0.935*
0.902
0.912*
0.909*
0.902*
0.897
0.892*
0.901*
0.901*
0.888
0.912*
0.892*
0.909*
0.879
frequency: 4000
0.967
0.947
0.899
0.895
0.903*
0.887
0.821
0.797
0.815*
0.802
0.832
0.938
0.885
0.923
Table 4. Comparison of Silhouette Values and Iteration Between Different Setting of Threshold.
Step
Number of Data
Increment
1.0
Sil
1
2
3
4
5
6
1000 (Default)
2000
3000
4000
5000
6000
Average
Sil: Silhouette value, Iter: iteration
+1000
+1000
+1000
+1000
+1000
0.5
Iter
0.971
0.945
0.902
0.905
0.902
0.878
153
165
201
265
354
402
0.971
0.935
0.913
0.907
0.913
0.837
153
53
65
89
102
112
0.971
0.923
0.902
0.873
0.851
0.898
153
45
53
75
89
99
0.971
0.912
0.895
0.846
0.802
0.816
153
15
8
11
9
12
0.924
278
0.905
133
0.935
126
0.893
83
The 2015 International Conference on Artificial Life and Robotics (ICAROB 2015), Jan. 10-12, Oita, Japan
Conclusions
2.
3.
4.
5.
6.
7.
9.
The 2015 International Conference on Artificial Life and Robotics (ICAROB 2015), Jan. 10-12, Oita, Japan