A Thesis
Submitted to Institute of Applied Information on Graduate Studies
of Leader University in partial fulfillment of the requirements for
the degree of Master of Science
June 2005
Tainan, Taiwan
(Decision Support System, DSS)
(Data Warehouse)
ABSTRACT
1.1
(Over Fitting)
(Data Mining)
1.2
1.3
(Data Warehouses)
1.4
1.5
(Interesting)
1-1
1-1
2.1
2.1.1
(Knowledge Discovery in Database)
2-1
1.
2.
3. NBA
4.
5.
6.
7.
(Association Rule), (Clustering), (Classification), (Sequential Pattern), (Decision Tree)
2-1
2. Data Integration
3. Data Selection
4. Data Transformation
5. Data Mining (Classification, Clustering, Summarization, ...)
6. Pattern Evaluation
7. Knowledge Presentation
2.1.2
[Agrawal and Imielinski and Swami, 1993] Agrawal
Apriori
Apriori
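An association rule has the form $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$ and $X \cap Y = \emptyset$.
$I$ is the set of items and $D$ is the set of transactions.
The rule $X \Rightarrow Y$ has support (Support) $s$ and confidence (Confidence) $c$:
$s$ is the fraction of transactions in $D$ that contain both $X$ and $Y$, and
$c$ is the fraction of the transactions in $D$ containing $X$ that also contain $Y$ [2003].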
(Minsup)
(Minconf)
1. (Frequent Item Sets)
2.
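The support and confidence measures described above can be sketched in a few lines. This is only an illustration under stated assumptions: the toy transaction database `D`, the item names, and the helper names `support` and `confidence` are hypothetical and do not come from the thesis.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Support of X union Y divided by the support of X."""
    support_x = support(transactions, x)
    if support_x == 0:
        raise ValueError("X never occurs, confidence is undefined")
    return support(transactions, set(x) | set(y)) / support_x


# Hypothetical transaction database D (not taken from the thesis).
D = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

s = support(D, {"bread", "milk"})        # 2 of 4 transactions
c = confidence(D, {"bread"}, {"milk"})   # 2 of the 3 transactions with bread
print(f"support={s:.2f}, confidence={c:.2f}")
```

A rule would then be kept only if `s` reaches the minimum support (Minsup) and `c` reaches the minimum confidence (Minconf).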
Agrawal 1993
[Agrawal and Srikant, 1995], EPISODES [Mannila and Toivonen and Verkamo, 1997],
[Koperski and Han, 1995], [Savasere and Omiecinski and Navathe, 1998],
[Lu and Han and Feng, 1998]
2.1.3
70%
2.1.4
20~30 5 ~7 80% A
2-1
2-1
Agrawal
1993
1995
1995
1997
1998
2005
1998
2.2
Lu
()
[1999]
2.2.1
(Joint Probability)
(Joint Probability Function)
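Let $X$ and $Y$ be two random variables, where $X$ takes the values $x_1, x_2, x_3, \ldots, x_n$ and $Y$ takes the values $y_1, y_2, y_3, \ldots, y_m$, with joint probability function $f(x_i, y_j)$ such that $0 \le f(x_i, y_j) \le 1$ and
$$\sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j) = 1 \tag{2-1}$$
where $f(x_i, y_j)$ is the joint probability of $X = x_i$ and $Y = y_j$.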
2.2.2
(Conditional Probability) B
A A
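Given the joint probability function $f(x, y)$, the conditional probability of $x_i$ given $Y = y_j$ is
$$f(x_i \mid Y = y_j) = \frac{f(x_i, y_j)}{f_Y(y_j)} \tag{2-2}$$
and the conditional probability of $y_j$ given $X = x_i$ is
$$f(y_j \mid X = x_i) = \frac{f(x_i, y_j)}{f_X(x_i)} \tag{2-3}$$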
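As a small worked example of Eq. (2-1) to (2-3), the sketch below stores a joint probability function as a dictionary and derives the marginal and conditional probabilities from it. The probability values and the labels `x1`, `x2`, `y1`, `y2` are hypothetical and chosen only so the numbers are easy to check by hand.

```python
# Hypothetical joint probability function f(x_i, y_j); values are illustrative.
f = {
    ("x1", "y1"): 0.40,
    ("x1", "y2"): 0.20,
    ("x2", "y1"): 0.10,
    ("x2", "y2"): 0.30,
}


def marginal_x(f, x):
    """f_X(x): sum of f(x, y_j) over all y_j."""
    return sum(p for (xi, _), p in f.items() if xi == x)


def marginal_y(f, y):
    """f_Y(y): sum of f(x_i, y) over all x_i."""
    return sum(p for (_, yj), p in f.items() if yj == y)


def cond_x_given_y(f, x, y):
    """Eq. (2-2): f(x | Y = y) = f(x, y) / f_Y(y)."""
    fy = marginal_y(f, y)
    if fy == 0:
        raise ValueError("f_Y(y) is zero, conditional probability undefined")
    return f[(x, y)] / fy


def cond_y_given_x(f, x, y):
    """Eq. (2-3): f(y | X = x) = f(x, y) / f_X(x)."""
    fx = marginal_x(f, x)
    if fx == 0:
        raise ValueError("f_X(x) is zero, conditional probability undefined")
    return f[(x, y)] / fx


total = sum(f.values())                          # Eq. (2-1): should equal 1
p_x1_given_y1 = cond_x_given_y(f, "x1", "y1")    # 0.40 / 0.50
p_y1_given_x1 = cond_y_given_x(f, "x1", "y1")    # 0.40 / 0.60
print(total, p_x1_given_y1, p_y1_given_x1)
```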
3.1
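Let $A$ be an event and $B_i$, $i = 1, 2, \ldots, n$, be $n$ events with joint probabilities $P(A \cap B_i)$, $i = 1, 2, \ldots, n$. The conditional probabilities $P(B_i \mid A)$, $i = 1, 2, \ldots, n$, are
$$P(B_i \mid A) = \frac{P(A \cap B_i)}{P(A)}, \quad P(A) > 0 \tag{3-1}$$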
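Eq. (3-1) can be estimated from observed data by relative frequencies. The sketch below assumes, purely for illustration, that $A$ and $B_i$ are recorded as 0/1 indicators per observation period (1 meaning the event occurred); the two indicator series and the function name `joint_and_conditional` are hypothetical.

```python
def joint_and_conditional(a, b):
    """Estimate P(A and B) and P(B | A) from two equal-length 0/1 series."""
    if len(a) != len(b) or not a:
        raise ValueError("series must be non-empty and of equal length")
    n = len(a)
    p_a = sum(a) / n
    p_ab = sum(1 for ai, bi in zip(a, b) if ai and bi) / n
    if p_a == 0:
        raise ValueError("P(A) is zero, Eq. (3-1) is undefined")
    return p_ab, p_ab / p_a


# Hypothetical indicator series over ten periods.
a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # event A occurred in 7 periods
b = [1, 0, 0, 1, 1, 1, 1, 0, 0, 1]   # event B_i

p_joint, p_cond = joint_and_conditional(a, b)
print(f"P(A and B)={p_joint:.3f}, P(B|A)={p_cond:.3f}")
```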
3.2
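The index of association $I$ (Index of Association) takes $X$ as the joint probability $P(A \cap B_i)$ and $Y$ as the conditional probability $P(B_i \mid A)$:
$I_k = f(X, Y)$, $k = 1, 2, 3, 4$. The four levels $I_1$, $I_2$, $I_3$, $I_4$ are defined in Eq. (3-2) and Table 3-1.
$$
\begin{aligned}
I_1 &: \big(P(X \ge Q_3) \cap P(Y \ge Q_3)\big) \\
I_2 &: \big(P(Q_2 < X < Q_3) \cap P(Y \ge Q_3)\big) \cup \big(P(X \ge Q_3) \cap P(Q_2 < Y < Q_3)\big) \\
I_3 &: \big(P(X \le Q_2) \cap P(Y \ge Q_3)\big) \cup \big(P(Q_2 < X < Q_3) \cap P(Q_2 < Y < Q_3)\big) \cup \big(P(X \ge Q_3) \cap P(Y \le Q_2)\big) \\
I_4 &: \big(P(X \le Q_2) \cap P(Q_2 < Y < Q_3)\big) \cup \big(P(Q_2 < X < Q_3) \cap P(Y \le Q_2)\big) \cup \big(P(X \le Q_2) \cap P(Y \le Q_2)\big)
\end{aligned}
\tag{3-2}
$$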
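The four levels of Eq. (3-2) amount to a lookup on a 3 x 3 grid: each of $X$ and $Y$ is placed in one of three bands (at or above $Q_3$, strictly between $Q_2$ and $Q_3$, at or below $Q_2$), and the pair of bands selects $I_1$ to $I_4$. The sketch below is one possible way to code this; the function names and the thresholds in the usage lines are hypothetical, and the shortcut of adding the two band scores is an observation about the grid, not something stated in the thesis.

```python
def band(value, q2, q3):
    """Return 2 if value >= Q3, 1 if Q2 < value < Q3, 0 if value <= Q2."""
    if value >= q3:
        return 2
    if value > q2:
        return 1
    return 0


def index_of_association(x, y, x_q2, x_q3, y_q2, y_q3):
    """Classify one (X, Y) pair into I1..I4 following Eq. (3-2).

    In the grid of Eq. (3-2), the level depends only on the sum of the two
    band scores: 4 -> I1, 3 -> I2, 2 -> I3, and 1 or 0 -> I4.
    """
    score = band(x, x_q2, x_q3) + band(y, y_q2, y_q3)
    return {4: "I1", 3: "I2", 2: "I3"}.get(score, "I4")


# Hypothetical thresholds, the same for X and Y to keep the example short.
Q2, Q3 = 0.5, 0.6
print(index_of_association(0.70, 0.65, Q2, Q3, Q2, Q3))  # both high
print(index_of_association(0.55, 0.70, Q2, Q3, Q2, Q3))  # X middle, Y high
print(index_of_association(0.40, 0.90, Q2, Q3, Q2, Q3))  # X low, Y high
print(index_of_association(0.40, 0.55, Q2, Q3, Q2, Q3))  # X low, Y middle
```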
Q3 Q2
30 X, Y
( 3-1)(mean) Q3
75%( 3-2)
3-1
Q3
Q3 Q2
Q2
Q3
Q3
|
Q2
Q2
3-1
25%
Q1
Q3
3-1
3-2
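To obtain $Q_3$, the population mean $\mu$ (Eq. 3-3) and the population variance $\sigma^2$ (Eq. 3-4) are
$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{3-3}$$
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 \tag{3-4}$$
They are estimated by the sample mean $\bar{X}$ (Eq. 3-5) and the sample variance $S^2$ (Eq. 3-6), which are unbiased estimators (Unbiased Estimator):
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{3-5}$$
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2 \tag{3-6}$$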
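The sample statistics of Eq. (3-5) and (3-6) and a 75% cut-off can be computed as sketched below. The sketch assumes that $Q_2$ is the sample mean and that $Q_3$ is the 75th percentile of a normal distribution with the sample mean and standard deviation; this reading is an assumption, although it is consistent with the figures reported later (a mean of 0.6866 and a standard deviation of 0.10207 give roughly 0.7555). The function name `q2_q3` and the four input values are illustrative only.

```python
from statistics import NormalDist, mean, stdev


def q2_q3(values):
    """Return (Q2, Q3) where Q2 is the sample mean and Q3 the 75% point
    of a normal distribution fitted with the sample mean and the sample
    standard deviation (n - 1 denominator, Eq. 3-6)."""
    x_bar = mean(values)
    s = stdev(values)
    z75 = NormalDist().inv_cdf(0.75)   # about 0.6745
    return x_bar, x_bar + z75 * s


# Illustrative sample of four values.
sample = [0.43, 0.46, 0.46, 0.46]
q2, q3 = q2_q3(sample)
print(f"Q2={q2:.4f}, Q3={q3:.4f}")

# Cross-check against the summary figures reported for Y in Section 4.4.
reported_q3 = 0.6866 + NormalDist().inv_cdf(0.75) * 0.10207
print(f"Q3 from reported mean and S: {reported_q3:.4f}")
```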
4.1
( 4-1)
, X , Y Q3 Q2
P(Ik)
XY
Q3
Q2
4-1
P(Ik)
4.2
(
)
( 4-1)
4-1
IC
2340 2384
2327 2370
2378
2317 2387
2352 2358
2328 2341
2323 2349
4-2 -
12 :
1999 2004
4-3
4-4 -
4-5
4-6 -
()() =
(4-1)
4.3
4-7
A Bi , i=1, 2, , n
n P( Bi | A) , i=1,
2, , n 4-8
4-8
=66%
=60%
=70%
66%
1999 2004
1999
( 3-1)
Table 4-2
No.   P(Bi|A)
1     0.8260
2     0.7487
3     0.7100
4     0.7100
5     0.7074
6     0.7010
7     0.6958
8     0.6942
9     0.6941
10    0.6817
11    0.6817
12    0.6804
13    0.6791
14    0.6779
15    0.6778
16    0.6770
17    0.6752
18    0.6739
19    0.6739
20    0.6713
21    0.6709
22    0.6701
23    0.6658
24    0.6636
25    0.6636
26    0.6636
27    0.6610
4.4 X Y
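The joint probability of $A$ and $B_i$ is computed for every stock $(B_{1j}), (B_{2j}), \ldots, (B_{27j})$ and every period $j = 1, 2, \ldots, 24$ (6 years × 4 quarters), which gives 648 values of $X$ over 1999/1/1~2004/12/31.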
4-9
$P(A \cap B_{1j}) \sim P(A \cap B_{27j})$, $j = 1, 2, \ldots, 24$

Table 4-3(a)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.43    0.46    0.46    0.46
2000    0.40    0.33    0.39    0.46
2001    0.44    0.45    0.52    0.48
2002    0.40    0.36    0.38    0.45
2003    0.30    0.48    0.41    0.37
2004    0.38    0.40    0.41    0.42

Table 4-3(b)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.39    0.50    0.41    0.46
2000    0.32    0.29    0.29    0.56
2001    0.42    0.34    0.51    0.47
2002    0.29    0.34    0.35    0.38
2003    0.30    0.47    0.32    0.28
2004    0.38    0.37    0.34    0.31

Table 4-3(c)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.41    0.43    0.36    0.41
2000    0.26    0.32    0.31    0.54
2001    0.35    0.21    0.39    0.50
2002    0.27    0.33    0.39    0.38
2003    0.27    0.45    0.36    0.35
2004    0.31    0.34    0.33    0.32
(Figures 4-10(a), (b), (c))
Figure 4-10(a) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.6]
Figure 4-10(b) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.6]
Figure 4-10(c) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.6]
For $j = 1, 2, \ldots, 24$ there are 648 values of $Y$, with $\bar{X}_Y = 0.6866$ and $S_Y = 0.10207$.
$P(B_{1j} \mid A) \sim P(B_{27j} \mid A)$, $j = 1, 2, \ldots, 24$

Table 4-4(a)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.87    0.76    0.86    0.74
2000    0.93    0.77    0.75    0.76
2001    0.78    0.85    0.82    0.89
2002    0.92    0.74    0.83    0.94
2003    0.74    0.84    0.96    0.89
2004    0.76    0.81    0.79    0.90

Table 4-4(b)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.80    0.83    0.76    0.74
2000    0.74    0.67    0.56    0.93
2001    0.75    0.64    0.79    0.86
2002    0.67    0.71    0.77    0.81
2003    0.74    0.81    0.75    0.67
2004    0.76    0.75    0.67    0.67

Table 4-4(c)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.83    0.71    0.68    0.65
2000    0.59    0.73    0.61    0.90
2001    0.63    0.39    0.62    0.91
2002    0.63    0.68    0.87    0.81
2003    0.65    0.78    0.86    0.85
2004    0.62    0.69    0.64    0.70
(Figures 4-11(a), (b), (c))
Figure 4-11(a) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 1.2]
Figure 4-11(b) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
Figure 4-11(c) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
Figure 4-12 [chart: horizontal axis 0.2 to 1.2, vertical axis 0 to 220]
4.5
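Applying Table 3-1 to the results of Section 4.4 over the 24 quarters, $I_1$ occurs 16 times, $I_2$ 4 times, $I_3$ 3 times, and $I_4$ once, so
$P(I_1) = 66.7\%$, $P(I_2) = 16.7\%$, $P(I_3) = 12.5\%$, $P(I_4) = 4.2\%$.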
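The counts above can be reproduced from the published numbers. The sketch below assumes that Table 4-3(a) and Table 4-4(a) hold the quarterly $X$ and $Y$ values of the same stock (rows are years 1999 to 2004, columns are the four quarters) and uses the cut-offs listed in Table 4-5; the variable and function names are illustrative. Because the tables are rounded to two decimals, agreement with the reported counts is a consistency check rather than a guarantee.

```python
from collections import Counter

# Cut-offs as listed in Table 4-5.
X_Q2, X_Q3 = 0.34881, 0.39801
Y_Q2, Y_Q3 = 0.6866, 0.7555

# Quarterly values transcribed from Table 4-3(a) (X) and Table 4-4(a) (Y).
X = [
    [0.43, 0.46, 0.46, 0.46],
    [0.40, 0.33, 0.39, 0.46],
    [0.44, 0.45, 0.52, 0.48],
    [0.40, 0.36, 0.38, 0.45],
    [0.30, 0.48, 0.41, 0.37],
    [0.38, 0.40, 0.41, 0.42],
]
Y = [
    [0.87, 0.76, 0.86, 0.74],
    [0.93, 0.77, 0.75, 0.76],
    [0.78, 0.85, 0.82, 0.89],
    [0.92, 0.74, 0.83, 0.94],
    [0.74, 0.84, 0.96, 0.89],
    [0.76, 0.81, 0.79, 0.90],
]


def band(value, q2, q3):
    """2 if value >= Q3, 1 if Q2 < value < Q3, 0 if value <= Q2."""
    if value >= q3:
        return 2
    if value > q2:
        return 1
    return 0


# (X band, Y band) -> level, following the grid of Eq. (3-2).
LEVEL = {
    (2, 2): "I1",
    (1, 2): "I2", (2, 1): "I2",
    (0, 2): "I3", (1, 1): "I3", (2, 0): "I3",
    (0, 1): "I4", (1, 0): "I4", (0, 0): "I4",
}

counts = Counter(
    LEVEL[(band(x, X_Q2, X_Q3), band(y, Y_Q2, Y_Q3))]
    for x_row, y_row in zip(X, Y)
    for x, y in zip(x_row, y_row)
)
total = sum(counts.values())
shares = {k: round(100 * counts[k] / total, 1) for k in ("I1", "I2", "I3", "I4")}
print(dict(counts))
print(shares)   # {'I1': 66.7, 'I2': 16.7, 'I3': 12.5, 'I4': 4.2}
```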
Table 4-5
                                     X >= Q3          Q2 < X < Q3                X <= Q2
                                     (>= 0.39801)     (> 0.34881, < 0.39801)     (<= 0.34881)
Y >= Q3 (>= 0.7555)                  16
Q2 < Y < Q3 (> 0.6866, < 0.7555)
Y <= Q2 (<= 0.6866)
4-6
P(I1)
4-12
Table 4-6 (%)
No.   P(I1)   P(I2)   P(I3)   P(I4)
1     66.7    16.7    12.5    4.2
2     25.0    20.8    8.3     45.8
3     25.0    12.5    12.5    50.0
4     20.8    8.3     8.3     62.5
5     20.8    8.3     8.3     62.5
6     20.8    8.3     8.3     62.5
7     16.7    20.8    4.2     58.3
8     16.7    16.7    4.2     62.5
9     16.7    12.5    12.5    58.3
10    16.7    8.3     4.2     70.8
11    16.7    4.2     8.3     70.8
12    16.7    4.2     4.2     75.0
13    16.7    0.0     8.3     75.0
14    12.5    20.8    4.2     62.5
15    12.5    16.7    8.3     62.5
16    12.5    12.5    12.5    62.5
17    12.5    12.5    12.5    62.5
18    12.5    8.3     16.7    62.5
19    12.5    8.3     12.5    66.7
20    12.5    8.3     4.2     75.0
21    12.5    4.2     16.7    66.7
22    12.5    4.2     16.7    66.7
23    8.3     25.0    4.2     62.5
24    8.3     16.7    20.8    54.2
25    8.3     12.5    12.5    66.7
26    8.3     12.5    8.3     70.8
27    8.3     4.2     25.0    62.5
Figure 4-13 [chart: horizontal axis 1 to 27, vertical axis 0% to 100% (%)]
4.6
( 4-14)
4-14
$P(I_1) \ge 20.8$
(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)
$$C^{6}_{2} = 15 \tag{4-2}$$
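The count in Eq. (4-2) is the number of ways to choose 2 items out of 6. A short sketch of how such pairs could be enumerated follows; the labels `S1` to `S6` are placeholders, since the actual stock names are not recoverable here.

```python
from itertools import combinations
from math import comb

# Placeholder labels for the six selected stocks (hypothetical names).
stocks = ["S1", "S2", "S3", "S4", "S5", "S6"]

pairs = list(combinations(stocks, 2))
print(len(pairs), comb(6, 2))   # both give the count in Eq. (4-2)
print(pairs[:3])                # first few pairs, e.g. ('S1', 'S2')
```

The same call with `combinations(items, 3)` enumerates groups of three, for which `comb(5, 3)` gives 10 when five items are available.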
$X$: $\bar{X}_X = 0.28364$, $S_X = 0.07433$, $Q_3 = 0.33381$, $X \sim N(0.28364, 0.07433^2)$

Table 4-7(a)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.36    0.38    0.38    0.41
2000    0.32    0.29    0.24    0.43
2001    0.35    0.29    0.41    0.44
2002    0.25    0.30    0.30    0.35
2003    0.20    0.41    0.32    0.25
2004    0.29    0.31    0.30    0.31

Table 4-7(b)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.31    0.25    0.38    0.39
2000    0.27    0.23    0.29    0.39
2001    0.35    0.32    0.33    0.38
2002    0.31    0.28    0.29    0.32
2003    0.23    0.38    0.27    0.23
2004    0.33    0.34    0.38    0.35

Table 4-7(c)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.20    0.32    0.29    0.30
2000    0.32    0.23    0.31    0.37
2001    0.37    0.29    0.39    0.41
2002    0.24    0.20    0.27    0.35
2003    0.25    0.39    0.24    0.23
2004    0.19    0.28    0.31    0.26
(Figures 4-15(a), (b), (c))
Figure 4-15(a) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.5]
Figure 4-15(b) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.5]
Figure 4-15(c) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.5]
$Y$: $\bar{X}_Y = 0.55449$, $S_Y = 0.11793$, $Q_3 = 0.6341$, $Y \sim N(0.55449, 0.11793^2)$. (_) (Table 4-8(a), Figure 4-16(a)), (_) (Table 4-8(b), Figure 4-16(b)), (_) (Table 4-8(c), Figure 4-16(c))

Table 4-8(a)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.73    0.63    0.70    0.65
2000    0.74    0.67    0.47    0.71
2001    0.63    0.55    0.64    0.80
2002    0.58    0.61    0.67    0.74
2003    0.48    0.70    0.75    0.59
2004    0.59    0.63    0.58    0.67

Table 4-8(b)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.63    0.41    0.70    0.63
2000    0.63    0.53    0.56    0.64
2001    0.63    0.61    0.51    0.69
2002    0.71    0.58    0.63    0.68
2003    0.57    0.65    0.64    0.56
2004    0.66    0.69    0.73    0.77

Table 4-8(c)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.40    0.54    0.54    0.49
2000    0.74    0.53    0.61    0.62
2001    0.66    0.55    0.62    0.74
2002    0.54    0.42    0.60    0.74
2003    0.61    0.68    0.57    0.56
2004    0.38    0.56    0.61    0.57
(Figures 4-16(a), (b), (c))
Figure 4-16(a) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
Figure 4-16(b) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
Figure 4-16(c) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
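Applying Table 3-1 to the results of Section 4.6, $I_1$ occurs 9 times, $I_2$ 6 times, $I_3$ 4 times, and $I_4$ 5 times, so
$P(I_1) = 37.5\%$, $P(I_2) = 25.0\%$, $P(I_3) = 16.7\%$, $P(I_4) = 20.8\%$.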
( 4-9)
P(I1)
( 4-17)
( 4-9)
Table 4-9 (%)
No.   P(I1)   P(I2)   P(I3)   P(I4)
1     37.5    25.0    16.7    20.8
2     29.2    20.8    20.8    29.2
3     20.8    12.5    12.5    54.2
4     20.8    8.3     8.3     62.5
5     20.8    4.2     8.3     66.7
6     20.8    4.2     0.0     75.0
7     20.8    0.0     0.0     79.2
8     20.8    0.0     0.0     79.2
9     16.7    12.5    8.3     62.5
10    16.7    8.3     8.3     66.7
11    16.7    16.7    12.5    54.2
12    16.7    25.0    4.2     54.2
13    12.5    4.2     8.3     75.0
14    12.5    12.5    8.3     66.7
15    8.3     12.5    8.3     70.8
Figure 4-17 [chart: horizontal axis 1 to 15, vertical axis 0% to 100% (%)]
4.7
(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)
$$C^{5}_{3} = 10 \tag{4-3}$$
$X$: $\bar{X}_X = 0.23449$, $S_X = 0.06787$
Table 4-10(a)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.31    0.24    0.33    0.33
2000    0.24    0.22    0.21    0.37
2001    0.30    0.23    0.28    0.38
2002    0.22    0.22    0.24    0.28
2003    0.20    0.31    0.23    0.12
2004    0.26    0.26    0.28    0.28

Table 4-10(b)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.25    0.28    0.28    0.28
2000    0.23    0.20    0.21    0.37
2001    0.32    0.19    0.28    0.38
2002    0.16    0.22    0.15    0.29
2003    0.13    0.34    0.29    0.18
2004    0.24    0.22    0.19    0.23

Table 4-10(c)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.36    0.26    0.32    0.29
2000    0.29    0.26    0.20    0.33
2001    0.32    0.19    0.34    0.39
2002    0.25    0.22    0.23    0.31
2003    0.09    0.28    0.26    0.15
2004    0.16    0.18    0.22    0.20
(Figures 4-18(a), (b), (c))
Figure 4-18(a) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.4]
Figure 4-18(b) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.4]
Figure 4-18(c) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.5]
$Y$: $\bar{X}_Y = 0.45661$, $S_Y = 0.11547$, $Q_3 = 0.53456$, $Y \sim N(0.45661, 0.11547^2)$. (_): Table 4-11(a), Figure 4-19(a); (_): Table 4-11(b)

Table 4-11(a)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.63    0.39    0.62    0.53
2000    0.56    0.50    0.42    0.62
2001    0.53    0.42    0.44    0.69
2002    0.50    0.45    0.53    0.58
2003    0.48    0.54    0.54    0.30
2004    0.52    0.53    0.55    0.60

Table 4-11(b)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.50    0.46    0.51    0.44
2000    0.52    0.47    0.42    0.62
2001    0.56    0.36    0.44    0.69
2002    0.38    0.45    0.33    0.61
2003    0.30    0.59    0.68    0.44
2004    0.48    0.44    0.36    0.50

Table 4-11(c)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.73    0.44    0.59    0.47
2000    0.67    0.60    0.39    0.55
2001    0.56    0.36    0.54    0.71
2002    0.58    0.45    0.50    0.65
2003    0.22    0.49    0.61    0.37
2004    0.31    0.38    0.42    0.27
(Figures 4-19(a), (b), (c))
Figure 4-19(a) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
Figure 4-19(b) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
Figure 4-19(c) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
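Applying Table 3-1 to the results of Section 4.7, $I_1$ occurs 8 times, $I_2$ 5 times, $I_3$ 0 times, and $I_4$ 11 times, so
$P(I_1) = 33.3\%$, $P(I_2) = 20.8\%$, $P(I_3) = 0\%$, $P(I_4) = 45.8\%$.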
4-12
P(I1)
( 4-20)
( 4-12)
Table 4-12 (%)
No.   P(I1)   P(I2)   P(I3)   P(I4)
1     33.3    20.8    0.0     45.8
2     29.2    16.7    16.7    37.5
3     25.0    20.8    4.2     50.0
4     25.0    0.0     16.7    58.3
5     20.8    16.7    0.0     62.5
6     20.8    4.2     8.3     66.7
7     16.7    12.5    8.3     62.5
8     16.7    8.3     12.5    62.5
9     16.7    8.3     0.0     75.0
10    12.5    8.3     4.2     75.0
Figure 4-20 [chart: horizontal axis 1 to 10, vertical axis 0% to 100% (%)]
4.8
1999 2004
2005
$P(B_i \mid A)$, $i = 1, 2, \ldots, 27$
Table 4-13
1999/1/1~2004/12/30   2005/1/1~2005/6/30
No.   P(Bi|A)
1     0.8571
2     0.8285
3     0.6285
4     0.6285
5     0.6285
6     0.6142
7     0.6142
8     0.6
9     0.6
10    0.6
11    0.6
12    0.6
13    0.5857
14    0.5857
15    0.5857
16    0.5797
17    0.5714
18    0.5714
19    0.5614
20    0.5571
21    0.5571
22    0.5468
23    0.5142
24    0.5
25    0.4714
26    0.4571
27    0.4428
1999 2004
1999 2002
66% 8.3
16.7 2004 2005
IC
IC
2005
5.1
( 5-1)
5-1
2801
2807
2808
2809
2811
2812
2816
2820
2822
2823
2825
2827
2831
2833
2834
2836
2838
2841
2845
=65%
=60%
=63%
=63% ( 5-2)
5-2
0.7809
0.6417
0.6417
0.6378
0.6378
0.6314
0.6314
5.2 X Y
216
, X
$X$: $\bar{X}_X = 0.33417$, $S_X = 0.08401$,
Table 5-3(a)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.33    0.38    0.51    0.61
2000    0.29    0.36    0.36    0.56
2001    0.47    0.53    0.61    0.53
2002    0.38    0.39    0.38    0.35
2003    0.25    0.39    0.23    0.25
2004    0.38    0.31    0.36    0.29

Table 5-3(b)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.30    0.26    0.35    0.32
2000    0.16    0.19    0.24    0.49
2001    0.37    0.39    0.43    0.37
2002    0.25    0.39    0.38    0.40
2003    0.27    0.44    0.27    0.23
2004    0.34    0.29    0.30    0.37

Table 5-3(c)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.36    0.31    0.36    0.33
2000    0.32    0.20    0.21    0.47
2001    0.32    0.47    0.57    0.48
2002    0.33    0.42    0.33    0.31
2003    0.267   0.28    0.24    0.25
2004    0.362   0.20    0.23    0.20
(Figures 5-1(a), (b), (c))
Figure 5-1(a) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.7]
Figure 5-1(b) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.6]
Figure 5-1(c) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.7]
_
216
, Y
$\bar{X}_Y = 0.65811$, $S_Y = 0.13167$, $Q_3 = 0.74699$,
$Y \sim N(0.65811, 0.13167^2)$
(_): Table 5-4(a), Figure 5-2(a); (_): Table 5-4(a)

Table 5-4(a)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.67    0.63    0.95    0.98
2000    0.67    0.83    0.69    0.93
2001    0.84    1.00    0.95    0.97
2002    0.87    0.81    0.83    0.74
2003    0.61    0.68    0.54    0.59
2004    0.76    0.63    0.70    0.63

Table 5-4(b)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.60    0.44    0.65    0.51
2000    0.37    0.43    0.47    0.81
2001    0.66    0.73    0.67    0.69
2002    0.58    0.81    0.83    0.84
2003    0.65    0.76    0.64    0.56
2004    0.69    0.59    0.58    0.80

Table 5-4(c)
Year    Qtr 1   Qtr 2   Qtr 3   Qtr 4
1999    0.73    0.51    0.68    0.53
2000    0.74    0.47    0.42    0.79
2001    0.56    0.88    0.90    0.89
2002    0.75    0.87    0.73    0.65
2003    0.65    0.49    0.57    0.59
2004    0.72    0.41    0.45    0.43
(Figures 5-2(a), (b), (c))
Figure 5-2(a) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 1.2]
Figure 5-2(b) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 0.8]
Figure 5-2(c) [chart: horizontal axis 1999 to 2004, vertical axis 0 to 1]
5.3
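Applying Table 3-1 to the results of Section 5.2, $I_1$ occurs 7 times, $I_2$ 5 times, $I_3$ 4 times, and $I_4$ 8 times, so
$P(I_1) = 29.2\%$, $P(I_2) = 20.8\%$, $P(I_3) = 16.7\%$, $P(I_4) = 33.3\%$.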
5-5
P(I1)
( 5-3)
Table 5-5 (%)
No.   P(I1)   P(I2)   P(I3)   P(I4)
1     29.2    20.8    16.7    33.3
2     20.8    0.0     16.7    62.5
3     16.7    12.5    8.3     62.5
4     12.5    16.7    12.5    58.3
5     12.5    8.3     16.7    62.5
6     8.3     12.5    12.5    66.7
7     4.2     16.7    8.3     70.8
Figure 5-3 [chart: vertical axis 0% to 100%]
6-1
6.1
= 0%
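The joint probability $P(A \cap B_{ij})$, $i = 1, 2, \ldots, 56$; $j = 1, 2, \ldots, 24$, is computed for every $(B_{1j}), (B_{2j}), \ldots, (B_{56j})$ and every period $j$ (24 periods: 6 years × 4 quarters), which gives 1344 values of $X$.
$\bar{X}_X = 0.16987$, $S_X = 0.05849$,
$S_Y = 0.10731$, $Q_3 = 0.40895$, $Y \sim N(0.33652, 0.10731^2)$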
6.2
Applying Table 3-1 to the results of Section 6.1, $I_1$ occurs 9 times, $I_2$ 3 times, $I_3$ 3 times, and $I_4$ 9 times, so
$P(I_1) = 37.5\%$, $P(I_2) = 12.5\%$, $P(I_3) = 12.5\%$, $P(I_4) = 37.5\%$.
, I 1 3
, I 2 , I 3 0
, I 4 21
6-1
6-2
Table 6-1 (%)
No.   P(I1)   P(I2)   P(I3)   P(I4)
1     37.5    12.5    8.3     41.7
2     37.5    4.2     12.5    45.8
3     33.3    8.3     8.3     50.0
4     33.3    4.2     16.7    45.8
      33.3    4.2     12.5    50.0
      33.3    4.2     4.2     58.3
      33.3    0.0     16.7    50.0
      29.2    12.5    16.7    41.7
10    29.2    8.3     16.7    45.8
11    29.2    0.0     20.8    50.0
12    25.0    20.8    12.5    41.7
13    25.0    20.8    8.3     45.8
14    25.0    16.7    20.8    37.5
15    25.0    16.7    16.7    41.7
16    25.0    12.5    12.5    50.0
17    25.0    12.5    12.5    50.0
18    25.0    12.5    8.3     54.2
19    25.0    4.2     12.5    58.3
20    25.0    4.2     12.5    58.3
21    20.8    25.0    16.7    37.5
22    20.8    8.3     16.7    54.2
23    20.8    4.2     4.2     70.8
24    16.7    20.8    16.7    45.8
25    16.7    20.8    12.5    50.0
26    16.7    20.8    8.3     54.2
27    16.7    16.7    16.7    50.0
28    16.7    12.5    29.2    41.7
29    16.7    12.5    20.8    50.0
30    16.7    12.5    20.8    50.0
31    16.7    12.5    16.7    54.2
32    16.7    12.5    12.5    58.3
33    16.7    12.5    8.3     62.5
34    16.7    12.5    8.3     62.5
35    16.7    8.3     29.2    45.8
36    16.7    8.3     12.5    62.5
37    16.7    8.3     12.5    62.5
38    16.7    8.3     4.2     70.8
39    12.5    16.7    8.3     62.5
40    12.5    12.5    16.7    58.3
41    12.5    8.3     20.8    58.3
42    12.5    8.3     20.8    58.3
43    12.5    8.3     16.7    62.5
44    12.5    4.2     12.5    70.8
45    12.5    0.0     0.0     87.5
46    8.3     33.3    12.5    45.8
47    8.3     33.3    8.3     50.0
48    8.3     20.8    20.8    50.0
49    8.3     16.7    16.7    58.3
50    8.3     16.7    8.3     66.7
51    8.3     12.5    16.7    62.5
52    8.3     8.3     12.5    70.8
53    8.3     0.0     4.2     87.5
54    8.3     0.0     4.2     87.5
55    4.2     25.0    8.3     62.5
56    4.2     8.3     16.7    70.8
Figure 6-2 [chart: horizontal axis 1 to 55, vertical axis 0% to 100% (%)]
7.1
()()(
)
(
)()()
7.2
(
)
11-13 2003
74-15 117-143
1999
Agrawal R., T. Imielinski and A. Swami. Mining association rules between sets of
items in large databases. In Proc. 1993 ACM-SIGMOD Int. Conf. Management
of Data (SIGMOD'93), pages 207-216, Washington, DC, May 1993.
Agrawal R. and R. Srikant. Fast algorithms for mining association rules. In Proc.
1994 Int. Conf. Very Large Data Bases (VLDB'94), pages 487-499, Santiago,
Chile, Sept. 1994.
Agrawal R. and R. Srikant. Mining sequential patterns. In Proc. 1995 Int. Conf. Data
Engineering (ICDE'95), pages 3-14, Taipei, Taiwan, Mar. 1995.
Jiawei Han and Micheline Kamber. Data Mining: Concepts and Techniques, pages
1-33, 2001.
Koperski K. and J. Han. Discovery of spatial association rules in geographic
information databases. In Proc. 4th Int. Symp. Large Spatial Databases
(SSD'95), pages 47-66, Portland, ME, Aug. 1995.
Lu H., J. Han and L. Feng. Stock movement and n-dimensional inter-transaction
association rules. In Proc. 1998 SIGMOD Workshop on Research Issues on Data
Mining and Knowledge Discovery (DMKD'98), pages 12:1-12:7, Seattle, WA,
June 1998.
Mannila H., H. Toivonen and A. I. Verkamo. Efficient algorithms for discovering