Data Mining
Data Mining,
, Data Mining.
.
Data Mining
OLAP-, , Data Mining
(, , , , ).
Data Mining. Web Mining.
Data Mining: , ,
, , ,
, .
, Data Mining.
Data Mining . OLTP, OLAP,
ROLAP, MOLAP. Data
Mining. .
,
Data Mining, .
Data Mining, ,
,
Data Mining
, , ,
( ),
.
Data Mining?
, Data Mining
Data Mining
Data Mining
Data Mining
Data Mining.
Data Mining
Data Mining.
Data Mining.
Data Mining
Data Mining -
Data Mining
Microsoft Excel
SVM
k-
Matlab
SPSS
k- (k-means)
PAM (Partitioning Around Medoids)
SPSS
Apriori
Data Mining
Data Mining
Data Mining, OLAP
OLAP-
OLAP-
OLAP Data Mining
Data Mining.
Data Mining.
Data Mining.
Data Mining
Data Mining. Data Mining
. Data Mining
CRISP-DM
SEMMA
Data Mining
PMML
Data Mining
Data Mining
Data Mining
Data Mining
Data Mining
Data Mining. SAS Enterprise Miner
SAS -
SAS Enterprise Miner
Data Mining. PolyAnalyst
PolyAnalyst Workplace -
PolyAnalyst
PolyAnalyst
WebAnalyst
Data Mining. Cognos STATISTICA Data Miner
Cognos 4Thought
STATISTICA Data Miner
STATISTICA Data Miner
Oracle Data Mining Deductor
Oracle Data Mining
Deductor
KXEN
Data Mining
Data Mining-
Data Mining?
" , , ,
,
- :
, ".
.
,
,
.
,
. Data
Mining . ,
, ,
.
,
, Data Mining ,
.
Data Mining :
(data) (mining).
,
.
Data Mining , ,
, , ,
, , " ",
, , "" .
" " (Knowledge Discovery in Databases, KDD)
Data Mining [1].
Data Mining, 1978 ,
1990- .
,
.
Data Mining , "Data
Mining" Google ( 2005 ) - 18
.
Data Mining?
Data Mining - ,
, , ,
., . . 1.1.
,
Data Mining.
- ,
, .
, ,
,
.
,
. .
.
. 1996 : " - ,
,
".
.
- ,
, .
(intelligence) intellectus, ,
, , .
, (AI, Artificial Intelligence)
.
, .
, Data Mining, .
.
, Data Mining
o , Data Mining, .
o .
o .
o .
Data Mining.
o .
o , ,
, .
Data Mining ,
.
1960- .
1968 IMS
IBM.
1970- .
1975 Conference on Data System Languages (CODASYL),
,
.
.. ,
.
1980- .
.
. , 1985 , SQL.
.
1990- .
- " ", "", "", "".
, ,
SQL.
Data Mining, , web- .
Data Mining ,
[2]:
;
;
;
.
Data Mining
Data Mining - ,
( ) [3].
Data Mining -
(Gregory Piatetsky-Shapiro) - :
Data Mining - ,
, ,
.
Data Mining : ,
,
.
- ,
.
- ,
, ,
.
- , ,
.
- , ,
, , ..
(knowledge deployment)
(,
).
Data Mining.
Data Mining -
, .
Data Mining - ,
(patterns)
( SAS Institute).
Data Mining - , - ,
( Gartner Group).
Data Mining (patterns),
, ,
, .
"Mining" - " ",
.
- ,
.
.
Data Mining
Gartner Group, ,
1980- "Business Intelligence" (BI), . ,
.
1996 .
Business Intelligence - ,
,
,
.
BI
.
BI-, -
.
BI- (,
DSS, Decision Support System). ,
, .. .
Gartner Group Business Intelligence
:
(data warehousing, );
(OLAP);
- (Enterprise Information Systems, EIS);
(data mining);
(query and reporting tools).
Gartner ,
.
Data Mining
[4] -,
.
Data Mining (Enterprise Data Mining Buying
Guide) Aberdeen Group: "Data Mining -
.
, ,
Data
Mining .
Data Mining
, ,
, , ,
Data Mining .
Data Mining ,
" " . 75%
Data Mining , ,
. ,
,
".
(Herb Edelstein), Data
Mining, CRM: " Two Crows
, Data Mining .
,
. : Data Mining
, .
IT- , Data Mining .
, ,
. , Data Mining
, ,
".
Data Mining,
, , ,
, .
Data Mining
, .
,
.
Data Mining
,
, Data Mining,
, .
Data Mining ""
.
.
Data Mining .
Data Mining, ,
.
,
.
Data Mining. .
Data Mining .
.
, 80%
Data Mining-.
, , ,
, .
,
Data Mining ,
.
Data Mining
. , Data Mining-
. ,
.
Data Mining- .
, - .
Data Mining, ,
.
, , ,
.
.
Data Mining
( ) OLAP
(verification-driven data
mining) "" ,
(OnLine Analytical Processing, OLAP),
Data Mining - .
Data Mining
.
,
Data Mining .
,
, Data Mining .
OLAP , Data Mining
.
Data Mining
,
Data Mining,
;
,
Data Mining ;
Data Mining, ,
, ;
Data
Mining .
Data Mining , ,
, .
Data Mining
, , , ,
.
Data Mining
-
, .
Data Mining -
,
, :
"Amazon"
"
", Data Mining,
.
,
. - ,
- ,
(, , ..). ,
, , .
-
.
, , Data
Mining, [5]. ,
Data Mining, , , :
, ;
;
, ;
.
Data Mining
, " " (Pregibon, 1997).
Data Mining.
,
. - , Data Mining
. ,
Data Mining
.
Data Mining ,
,
.
,
Data Mining, - Knowledge Discovery Data
Mining (International Conferences on Knowledge Discovery and Data Mining).
WWW- - www.kdnuggets.com,
Data Mining -.
Data Mining: Data Mining and Knowledge Discovery, KDD
Explorations, ACM-TODS, IEEE-TKDE, JIIS, J. ACM, Machine Learning, Artificial
Intelligence.
: ACM-SIGKDD, IEEE-ICDM, SIAM-DM, PKDD, PAKDD,
Machine learning (ICML), AAAI, IJCAI, COLT (Learning Theory).
, , , ,
, -.
, ,
.
, ,
.
, - ,
.
2.1 , .
2.1. "-"
     18   Single     125
     22   Married    100
     30   Single      70
     32   Married    120
     24   Divorced    95
     25   Married     60
     32   Divorced   220
     19   Single      85
     22   Married     75
10   40   Single      90
.
- .
.
, , , ..
- , .
: , ..
, , , .
[6], ..
, .
(variable) - ,
, .
(value) .
, ,
.
, ,
, .
, ,
.
,
.
.
(population) - ,
.
(sample) - ,
.
- .
- .
.
. - ,
.
- ,
,
.
:
.
, - . ,
, ( )
( , ,
..), .
.
.
.
-
.
, .
- , .
Data Mining
/
(, , ).
.
.
, , .
,
,
.
. (
): 10, 15, 25 .
- ,
.
.
: , , , ..
: , , ,
.
(nominal scale) - , ;
,
.
, ,
.
: , , .
: (=), (≠).
(ordinal scale) - ,
,
.
.
,
" ", "
".
: (1, 2, 3-), ,
(1-, 23-, ..), ,
, .
: (=), (≠), (>), (<).
(interval scale) - ,
, .
,
,
.
: - 19 , - 24, ..
5 , , 1,26 .
, ,
, , , .
: (=), (≠), (>), (<), (+), (-).
(ratio scale) - ,
.
: (4 3 ). 1,33
.
1,2 , .
.
: (=), (≠), (>), (<), (+), (-), (*), (/).
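The operations permitted on the nominal, ordinal, interval and ratio scales nest inside one another: each scale keeps the operations of the previous one and adds its own. A minimal Python sketch of that nesting follows. The names `SCALE_OPERATIONS` and `is_permitted` are illustrative assumptions, and inequality is written `!=`.

```python
# Illustrative mapping from measurement scale to the operations that are
# meaningful on it. The dictionary and function names are hypothetical.
SCALE_OPERATIONS = {
    "nominal":  {"==", "!="},
    "ordinal":  {"==", "!=", ">", "<"},
    "interval": {"==", "!=", ">", "<", "+", "-"},
    "ratio":    {"==", "!=", ">", "<", "+", "-", "*", "/"},
}


def is_permitted(scale: str, operation: str) -> bool:
    """Return True if `operation` is meaningful for values on `scale`."""
    return operation in SCALE_OPERATIONS[scale]


print(is_permitted("ordinal", ">"))    # ordering is allowed on an ordinal scale
print(is_permitted("interval", "/"))   # ratios are not meaningful on an interval scale
```

A check of this kind can guard an analysis step, for example by refusing to average a nominal variable.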
(dichotomous scale) - ,
.
: ( ).
,
, ,
2.2.
2.2.
(
)
( (
)
)
22
55
47
,
, , 2.3.
2.3.
( )
8
( )
( )
22
17
23
. , ,
.
.
. ,
.
,
.
. 2.1.
: WWW-; ; (.
2.2); .
. 2.2.
, ,
, .
, , , .
, ( ,
), . 2.3.
. :
Benzene Molecule: C6H6 (. 2.4)
. 2.4.
,
. :
, , (, .).
"".
.
, Data Mining
,
.
- .
Data Mining /
,
Data Mining .
, " ",
. 2.5.
. 2.5.
(23%)
, . Text, CSV - 18%, 14%
Text, space or tab separated SAS; Excel - 9%, SPSS 8%, S-Plus/R - 4%, Weka ARFF - 6%, Data Mining - 2%.
,
Data Mining .
.
. .
(Database) -
.
,
,
.
.
,
.
,
,
.
.
.
.
- ,
.
-
.
,
- ,
, , , ,
.
(Database Management System, DBMS) ,
.
(Relational Database Management
System) - , .
( ).
.
, .
, -,
, , .
(C, C++, Pascal, Object Pascal). ,
, , , .
, ,
Data Mining, . ,
, FoxPro
, .
Access .
, ,
(, ),
, .
.
, :
1.
2.
3.
4.
5.
6.
7.
8.
;
;
;
;
;
( );
;
.
, ..
.
-
.
.
.
-
, .
: .
- ,
,
.
- . :
, ;
( ) ;
( ) .
:
/ ; ;
, .. ,
.
,
.
.
: ,
, .
? .
- ().
- , OLAP.
(dimension) - -
, .
:
;
;
-.
- ,
.
- ,
( , )
.
- - ,
, ,
.
, , ,
, , .
.
, .
;
.
.
: , .
. : ,
.
. - ,
, .
, ,
.
.
(Metadata) - .
: , , .
, , , ,
, , , ,
, ,
.
- .
, , ,
. -
.
- - ,
.
- ,
:
;
(, );
, , ..
.
, ,
, .
. , ,
, , .
. ,
, .
Data Mining
Data Mining -
(
) .
Data Mining
, .. .
Data Mining :
, , , k-
, , , , ; ,
, k- k-;
, Apriori;
, ,
.
, Data Mining -
.
,
.
, Data Mining
.
(method) , , ,
, , ,
.
.
, - ,
.
(algorithm) -
(), .
Data Mining
, , ,
,
.
3. -
, .
, Data Mining
[11]:
( ) ->
-> ->
->
1. (Discovery)
.
.
(law) - ,
,
.
Data Mining ,
OLAP, ,
. -
. ,
,
.
:
(conditional logic);
(associations and affinities);
(trends and variations).
, , ,
.
:
25 35 1200
. ,
.
" ..., ...".
, , " < 20
> 700 , 75%
" " >35
( ,
);
(
);
( ).
,
.. ,
.
.
2. (Predictive Modeling)
Data Mining - -
.
.
:
(outcome prediction);
(forecasting).
.
( )
, ,
, .
(
) ( )
().
, .
, > 15 , 65 %
, > 35 . , > 35
> 1200 , 90%
, .
. .
, , .
.
: " < 20 > 700
, 75% "
, .. " < 20 "
" > 700 ",
, : - .
, , . ,
, .
:
, ;
, .
, > 15 , 65%
, > 35 .
, : -
> 15 , - > 35 .
, , , ,
, .. ( ),
, " ".
- .
3. (forensic analysis)
Data Mining ,
.
, , - (deviation detection).
,
.
, .
" > 35 > 1200
, 90 % ".
- 10 % ?
. -
, .
10% - .
[12].
Data Mining
Data Mining
.
Data Mining
Data Mining
.
, Data Mining
.
1. , .
/
. -
.
: , , k-
, .
2. ,
.
()
,
Data Mining.
, .
, .
,
(" ").
: ; ; ; , .
, , :
; ; ; .
, , -
, ,
.
. ,
.
. ,
, .
-: , () , -
. Data Mining . ,
-
Data Mining - ,
Data Mining [13].
.
- . ,
,
. ,
,
.
:
. ,
, , - , ,
, .
Data Mining :
.
[14].
, Data
Mining. Data Mining,
. ,
Data Mining.
[5, 14].
:
, ,
;
,
.
: ,
.
-
( ), ..
Data Mining.
.
Data Mining
[14] :
(
, , , ,
, ..);
( ,
.);
( ,
, , .);
.
Data Mining
:
1. .
2. ( , ,
).
3. ( , ,
, .).
4. ( ).
Data Mining
Data Mining - ,
.
:
(, , );
( .. );
();
( , );
;
;
.
, ,
, : k-, k-,
, ,
- , .
/ ()
() .
, ,
: , , ,
, .
Data Mining
Data Mining ,
.
, .
Data Mining :
, , , , ,
, .
- ,
, , ,
., .
3.1
[15]. ,
: , , /,
/, , /, , .
,
. ,
, Data Mining.
Data Mining,
, , , ,
, ,
.
(, SPSS, SAS,
STATGRAPHICS, Statistica, .)
( , ). ,
,
(, , , .)
.
.
, . , ,
Statistica, "
".
.
,
-
- -
-
-
(
)
-
-
/ / /
-
-
-
/ /
- /
-
-
-
k--
-
- /
Data Mining.
, Data Mining ,
. ,
Data Mining.
, , ,
Data Mining.
(tasks) Data Mining (regularity) [16]
(techniques) [17].
, Data Mining, .
: ,
, , , ,
, , , .
, , - Data Mining,
, ,
. Data Mining - ,
, , -
. ,
[18], Data Mining.
Data Mining
.
Data Mining
(Classification)
. Data Mining.
,
- ;
.
. :
(Nearest Neighbor); k- (k-Nearest Neighbor);
(Bayesian Networks); ;
(neural networks).
(Clustering)
.
. , ,
.
.
: " "
- .
(Associations)
.
.
Data Mining:
,
, .
-
Apriori.
(Sequence), (sequential association)
.
. ,
, , (..
). ,
.
,
, . Data Mining
(sequential pattern).
: X
Y.
. 60%
, 50%
. ,
, (Customer Lifecycle Management).
(Forecasting)
.
.
,
.
(Deviation Detection),
. - ,
,
.
(Estimation)
.
(Link Analysis) - .
(Visualization, Graph Mining)
.
,
.
- 2-D 3-D .
(Summarization) - , -
.
Data Mining
, Data Mining
:
;
;
.
Data Mining:
, , .
.
, .
Data Mining, ,
.
, Data Mining.
, Data Mining
.
(descriptive) ,
, .
,
, , .
.
.
.
(predictive) , ,
.
Data Mining : , ,
.
( )
: .
.
: .
: , , , .
,
.
.
:
.
: , , , .
, 50 , - 30 ,
- .
.
.
, Data Mining ,
Data Mining.
Data Mining.
, Data Mining -
, ,
, .
, Data Mining,
, ,
, .
, , , ,
, , , ?
:
1. - -
2. - -
" ", ,
.
. . 4.1. "",
"" "", .
. 4.1. ,
, .
, . ,
, ..
, , - ,
, .
, ..
. , , Business Intelligence
.
. . 4.2.
[17], , ,
Data Mining.
. 4.2. , ,
, (, , )
,
.
- - (
), . :
, , .
- - ,
Data Mining; :
( ),
, .
- Data Mining,
, ;
, , , .
, .
4.1. Data Mining
3
Data Mining
,
( ) ,
, ,
.
( ).
. - . (,
, , ). ,
, ; .
- .
- .
, , , , , , .
.
, , ,
, ,
.
,
"", "", "", "".
.
,
. , , .
, , ,
.
. (. informatio) 1. -;
2. , , (
);
3. () -
(), ; - ,
, , ,
.
- , - , ,
.., ,
.
, , ,
.
, , ,
.. ,
,
.
, ..
.
. " " ,
, .
. " ,
." .
.
, ,
.
.
.
. -
.
, .. .
.
,
.
.
.
,
( ).
,
, ..
.
.
.
.
.
, , . ,
, .
,
.
.
, ,
. ,
(, ,
, ,
).
.
,
; .
, ,
.
- , ,
.
, , ..
. ?
[19].
, .
. , - -
, -
.
, " -
, , , ,
, ".
, [20].
1. . " ".
2. . -
, , ; - .
3. .
" ".
- ,
- Internet .
4. . .
5. . , .
-
. .
, ..
.
"", "", ""
,
.
.
1. , , .
2. , , .
3. , , .
- - ,
, , .
.
, .
.
, , .
"" "", ,
, "", .
""
, ""
. "" .
, , ,
. :
, ,
, , .
,
.
, , .
. , ,
.
, .
, .
.
,
.
. , Data Mining
, ,
,
.
Data Mining.
Data Mining.
- - .
Data Mining.
.
.
- , ,
, , , -
;
, .
- ,
( ),
.
:
;
, ..
;
,
;
.
() ,
(, )
;
, ,
.
, ..
.
,
:
-
. ,
,
(.. : " ");
-
.
.
(, )
.
- ,
. ,
, ,
(
).
(supervised learning),
.
(.. , )
/ .
, ,
, - , , , .. ,
(, , 0 1).
,
. ,
, .
.
( ) (
).
. ,
, . (1930 .),
.
.
. ,
.
: ,
. , : 1 2.
5.1.
5.1.
 1   18    25   1
     22   100
     30    70
     32   120
     24    15
     25    22
     32    50
     19    45
     22    75
10   40    90
. ,
.
(
), , 1 ( ) 2
( ). . 5.1 .
. 5.1.
, ,
, .
, ,
.
.
, ,
.
, ,
. .
( ) .
( ) :
.
(training set) - , ,
() .
() .
.
(test set) .
.
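The division of the available records into a training set and a test set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `holdout_split`, the 30% test share and the fixed random seed are not taken from the text.

```python
import random


def holdout_split(records, test_fraction=0.3, seed=0):
    """Shuffle the records and return (training_set, test_set).

    The test fraction and the seed are illustrative defaults only.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(10))            # stand-in for ten database records
training_set, test_set = holdout_split(records)
print(len(training_set), len(test_set))
```

The model is built on `training_set` only, and `test_set` is held back to check how well it works on records it has not seen.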
[21]:
.
1. : .
o .
o ,
.
o ,
.
2. : .
o () .
1.
.
2. -
.
3. , .. ,
, .
o ,
, .
, , ,
. 5.2. - 5.3.
. 5.2. .
. 5.3. .
,
. :
;
() ;
;
;
, , ;
;
CBR-;
.
(
, ) . 5.4 - 5.6.
. 5.4.
if X > 5 then grey
else if Y > 3 then orange
else if X > 2 then grey
else orange
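The nested rules above translate directly into a small Python function. This is a sketch only: the function name `classify` and the sample points are illustrative assumptions, while the thresholds and the class labels `grey` and `orange` are taken from the rules as written.

```python
def classify(x: float, y: float) -> str:
    """Apply the nested if/else rules to a point (x, y)."""
    if x > 5:
        return "grey"
    elif y > 3:
        return "orange"
    elif x > 2:
        return "grey"
    else:
        return "orange"


# Hypothetical sample points, one for each branch of the rule set.
for point in [(6, 0), (4, 4), (3, 1), (1, 1)]:
    print(point, classify(*point))
```

The order of the tests matters: a point with X > 5 is labelled grey whatever its Y value, because the first rule fires before the others are examined.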
. 5.5.
. 5.6.
:
-. (Cross-validation) -
, - .
.
, , ,
-.
, -
- .
. ,
,
.
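Cross-validation can be illustrated with a minimal k-fold splitter in Python. The sketch assumes the common k-fold variant, in which every record is used for testing exactly once. The function name `k_fold_indices` and the choice of 5 folds over 10 records are illustrative assumptions, not part of the text.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) for each of k consecutive folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print("train:", train, "test:", test)
```

A classifier would be trained on each `train` part and scored on the matching `test` part, and the k scores would then be averaged into a single accuracy estimate.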
, [21]:
, , , .
,
.
, .. - ,
.
.
:
;
, "
".
, ,
, ,
, .
,
, ,
.
"" " ",
" " "".
( ).
, "
".
- .
,
, "
".
"" :
"". (cluster) "", "".
, .
:
;
.
, , ,
, .. .
, , .
-
.
5.2
.
5.2.
,
,
. 5.7 .
. 5.7.
, (non-overlapping,
exclusive), (overlapping) [22].
. 5.8.
. 5.8.
,
. ,
"" , "",
..,
.
(, )
.
,
- .
,
.
.
,
.
.
, ,
.
[21].
, (Partitioning algorithms), .. :
o k ;
o .
(Hierarchy algorithms):
o : , ,
, ..
, (Density-based methods):
;
, .
- (Grid-based methods):
o -.
(Model-based):
o ,
.
o
o
;
;
;
.
, .
.
.
, , ,
, ..
-
.
,
.
. ,
.
, , (Hartigan,
1975).
, ,
, , ..
..
.
, ,
, . -
. -
.
-
, ,
.
, ..
,
, :
,
. , (1974),
(1981).
, , ,
.. , ,
. ,
.
, ,
.
, , .
,
.
, , , .
,
.
1971
, .
1974 (Sexton),
- ,
. ,
.
1981 ,
,
.
.
,
. .
, Data Mining,
" ",
, () . ,
, Data Mining, " ",
.. .
, .
. :
k- ( ),
( ), SOM.
.
Data Mining.
Data Mining.
.
, , , .
, .
, ,
,
Data Mining.
, ,
Data Mining, ,
.
( Prognosis), ,
.
.
(forecasting) Data Mining
.
(prognostics) - .
, ..
. ,
.
-
.
.
, .
: ,
, .
(market
forecasting).
, ,
( , ,
).
:
(, );
, ;
.
,
: , ;
.
:
;
.
.
.
Data Mining
. , , , ,
( - ).
.
?
.
,
, -
, ( ).
, ,
,
, ,
, .
,
.
Data Mining (Time-Series Data Mining).
[23].
Data Mining. . 6.1
Data Mining . , (23%)
. ( 14%),
( 9%), (8%).
6%.
,
.
:
, ,
.
.
- - ,
.
. ,
.
- .
, ,
,
.
:
;
.
: , , ,
.
"" .
, , ,
.
.
.
,
Data Mining.
,
.
,
.
,
.
,
, .
, , .
.
(..
),
.
,
[24]:
, , .
,
.
, ,
,
.
,
.
, .. .
.
,
, ,
.
,
.
. . 6.2. ,
" " , .
, ,
.
. 6.2.
( 12 ),
. 6.3, . ,
,
.
. 6.3. 12-
,
, , , .
,
.
,
.
, .
.
, () .
- ,
.
,
, .
.
, . , ,
, ,
.
.
. (..
); .
:
1. , , , ;
2. , , .
-
, , ,
.
:
1. ?
2. ()?
3. ?
, ,
. , ,
, ..
,
, , Data Mining.
, .
-
:
;
;
.
- , .
, .
- .
- , .
12 , ,
- , - 12 .
- , .
.
.
,
, , ,
. .
, , ,
- .
, ,
, , , ,
.
:
.
,
, - .
.
, ,
.
.
,
.
. .
.
:
(). .
-
.
(). .
, .
, " " .
(SSE), .
( ) .
.
().
.
.
, .
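The sum of squared errors (SSE) named among the forecast error measures can be computed as in the Python sketch below. The mean error and the mean absolute error are included as commonly used companion measures, which is an assumption of this example. The function name `forecast_errors` and the sample numbers are likewise hypothetical.

```python
def forecast_errors(actual, predicted):
    """Return simple error measures for a forecast.

    Only SSE is named explicitly in the text; the other two measures
    are common companions and are included here as an assumption.
    """
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "sse": sum(e * e for e in errors),
    }


actual = [100, 110, 120]       # hypothetical observed values
predicted = [98, 113, 119]     # hypothetical forecasts
result = forecast_errors(actual, predicted)
print(result)
```

In this example the positive and negative errors cancel in the mean error, while the absolute and squared measures still register them. This is why a mean error close to zero does not by itself indicate an accurate forecast.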
, ..
3%
1-3 .
- 3-5% , 7-12
;
.
.
- 5% .
, "" ,
"".
,
. ,
.
1. , ,
, .
2.
.
.
, ,
,
(
).
, , -
.
, .
Data Mining, ,
. Data Mining,
, .
,
.
, .
Data Mining, ,
.
,
, ,
, , .
- ,
,
,
[25].
,
, , CHI ACM SIGGRAPH, ,
, "IEEE Transactions on Visualization and Computer Graphics".
.
,
, .
,
.
,
.
, :
.
, Data Mining , .
-
.
, .
,
.
- ,
.
: , , ,
..
:
;
;
(), .
.
. , " "
2000 2005 , 6.1.
6.1.
2000   1100
2001   1101
2002   1104
2003   1105
2004   1106
2005   1107
Excel .
.
, .
, ,
, . 6.4.
. 6.4. , y 1096
2000
2005 . , y,
, , x , 1096. ,
y 1096 1108 .
, y, ,
. 6.5.
. 6.5. , y 0
0 2000
.
,
, , ,
, [26].
. ,
Data Mining.
Data Mining.
, -, -,
, ,
, , " ",
.
Data Mining
Data Mining.
, ,
. ,
Data Mining - ,
- . Data
Mining.
,
, Data Mining .
[16] Data Mining:
.
,
Data Mining -. , ,
Data Mining ,
, 1000%
.
Data Mining
, .
Data Mining
[22, 27]: , , Web-.
Data Mining -. :
, , , CRM, , ,
, , .
Data Mining .
: , ; .
Data Mining . : ,
, , , ,
, , , .
Data Mining Web-. :
(search engines), .
Data Mining -
Data Mining
.
" ?"
Data Mining -
.
" ?".
Data Mining
, ,
.
Data Mining.
()
, .
" ?" Data Mining
. (
); , ,
"" ;
(" ", " ").
.
Data Mining "
" " " .
.
.
Data Mining ,
,
- , ,
.
.
.
, Data Mining,
.
.
.
" ",
, .
. Data Mining
, ,
, .
. ,
Data Mining, .
, ,
.
.
-.
Data Mining
, ,
, - .
. ,
.
,
. , ,
.
, ,
, .
Data Mining
Web-.
. Data Mining
Web Mining [28].
Data Mining
.
,
;
.. ,
, Data Mining.
Data Mining [29]:
;
;
;
;
;
;
;
;
;
;
,
.
Data Mining .
" ?", " ?", "
?"
, ,
, , ,
.
-
.
.
, , :
(
, ).
,
..
, ,
.
,
.
, Data
Mining [30]:
;
( - , , )
(, ..);
, ,
;
;
;
;
;
.
, Data Mining
,
.
Data Mining CRM
Data Mining -
CRM.
CRM (Customer Relationship Management) - .
"
" .
, , ,
. CRM
, .
: ,
, , .
Data Mining, ,
, ,
.
Data Mining
. ,
.
.
, .
.
Data Mining
, .
,
Data Mining, ,
. CRM Data Mining
.
,
, . :
,
,
( , .).
10 . ,
- Accenture.
,
(Data Mining),
.
(, , e-mail,
),
.
, , .
, ,
, , ,
. - ,
.
Data Mining
Data Mining - ,
,
.
, ,
.
, Data Mining
.
, ,
, , . Data
Mining .
Data Mining
.
, ,
.
.
Data
Mining, - (Microarray Data
Analysis, MDA). Microarray Data Analysis
[22].
:
;
;
;
.
Data Mining -
; ,
; .
, Data Mining "
" - , .. ,
.
Data Mining
.
Data Mining
. Data Mining - ,
.
, Mining
"".
Web Mining
, (Database Approach), :
;
web- (Web Query Systems);
web-:
;
.
,
Web-.
Web Usage Mining :
;
;
;
.
Web Mining .
, - .
, ,
, .
Web-
.
Web Mining [31] :
Web Mining.
,
, ;
.
Text Mining
Text Mining ,
. Text Mining KDT
(Knowledge Discovery in Text).
Data Mining,
, Text Mining
.
, ,
. , Text Mining , -
.
Call Mining
,
, , .
,
.
- , ,
. , ,
.
,
,
.
. Microsoft Excel
,
.
, ,
, -
.
Microsoft Excel
Microsoft Excel .
, .
.
. , ,
. ,
/ " ".
, .
(Descriptive statistics ) -
,
, .
- ,
.
, 8.1.
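Table 8.1. Source data
x     y
3     9
2     7
4     12
5     15
6     17
7     19
8     21
9     23,4
10    25,6
11    27,8

Descriptive statistics (Microsoft Excel output)
Statistic                   x              y
Mean                        6,5            17,68
Standard Error              0,957427108    2,210922382
Median                      6,5            18
Mode                        n/a            n/a
Standard Deviation          3,027650354    6,991550456
Sample Variance             9,166666667    48,88177778
Kurtosis                    -1,2           -1,106006058
Skewness                    0              -0,128299221
Range                       9              20,8
Minimum                     2              7
Maximum                     11             27,8
Sum                         65             176,8
Count                       10             10
Largest(1)                  11             27,8
Smallest(1)                 2              7
Confidence Level (95,0%)    2,16585224     5,001457714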
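As a cross-check of the Excel output above, the same statistics can be computed in a few lines. This is only a sketch: the x and y values are an assumption (they are recovered from the predicted values and residuals of the regression example later in this lecture), and the function name `describe` is illustrative.

```python
import math
import statistics

# Assumed source data, recovered from the regression residual table of this lecture.
x = [3, 2, 4, 5, 6, 7, 8, 9, 10, 11]
y = [9, 7, 12, 15, 17, 19, 21, 23.4, 25.6, 27.8]

def describe(values):
    """Return the main descriptive statistics of a sample as a dict."""
    n = len(values)
    stdev = statistics.stdev(values)          # sample standard deviation (n - 1)
    return {
        "count": n,
        "sum": sum(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "stdev": stdev,
        "std_error": stdev / math.sqrt(n),    # standard error of the mean
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }

for name, column in (("x", x), ("y", y)):
    print(name, describe(column))
```

The mean, median, variance and standard error computed this way can be compared line by line with the values in the table.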
, .
,
.
, . ,
: ,
.
- ,
.
. , ,
, .
:
, . "" ,
.
.
.
,
. " "
" ",
.
.
.
.
,
.
, "" .
.
.
.
.
,
.
- ,
.
.
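For an odd number of observations, the median is the value at position
(n+1)/2 in the sorted series, where n is the number of observations.
For an even number of observations, the median is the mean of the values
at positions n/2 and (n+2)/2.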
.
,
,
.
.
- .
- .
- .
- .
- - ,
.
" " ,
.
( ).
( ).
,
, , , , (,
). .
. ,
, .
, ; .
(outliers) - , .
: .
.
.
, ,
, , (),
.
, , ,
.
"" , , ,
..
.
, .
, .
, r,
.
( ) , .
,
-1 +1 .
. 8.1.
. 8.1.
r,
-1,0 1,0 ,
.
:
x - ;
y - ;
n - .
- :
.
,
:
( ) - ;
(
) - ;
( ) -
.
( 8.1).
x y.
, x y. ,
, . 8.2. ,
x y,
x y.
.
. 8.2.
, x y.
(x y)
MS Excel (1;2).
0,998364, .. x y
. MS Excel "",
.
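The correlation coefficient reported above (0,998364) can be reproduced with a short sketch of the Pearson formula. The data values are an assumption (recovered from the regression residual table of this lecture), and `pearson_r` is an illustrative name, not an Excel function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Assumed source data (see the lead-in).
x = [3, 2, 4, 5, 6, 7, 8, 9, 10, 11]
y = [9, 7, 12, 15, 17, 19, 21, 23.4, 25.6, 27.8]

r = pearson_r(x, y)
print(round(r, 6))
```

A value of r close to +1 indicates a strong direct linear relationship between x and y.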
:
. ,
. , .
, .
. ,
, .
.
:
,
.
.
1. .
.
2. () .
3. . ,
.
4. ( ,
).
5. (
)
6. .
7. .
.
.
8. .
.
.
:
, , ,
, , , - .
: ,
, .
.
:
( );
;
;
( );
;
.
,
. .
.
, ,
.
.
.
:
, .. ; .
, ..
; .
.
() .
, .
, .. ,
. ,
.
, ..
, .
. ,
.
.
.
, ,
, .
.
.
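A straight line is given by the equation Y = a + b*X:
the variable Y is expressed through the constant a
and the slope b multiplied by the variable X.
The constant a is also called the intercept, and the slope b
is called the regression coefficient or B-coefficient.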
( )
.
- ()
( ).
MS Excel
" " "". X Y.
Y - ,
. X - ,
.
16.
, 8.3
- 8.3.
Table 8.3. Regression statistics
Multiple R           0,998364
R Square             0,99673
Adjusted R Square    0,996321
Standard Error       0,42405
Observations         10
, 8.3, .
R-, ,
.
( ).
[0;1].
R- ,
, .. .
R- , ,
. ,
R-, , .
0,99673,
.
R - R -
(X) (Y).
R ,
.
R
. , R
(0,998364).
Table 8.3. Coefficients
                Coefficients    Standard Error    t Stat
Y-intercept     2,694545455     0,33176878        8,121757129
X Variable 1    2,305454545     0,04668634        49,38177965
*
, 8.3.
b (2,305454545) , .. a
(2,694545455).
, :
Y= x*2,305454545+2,694545455
( ) ( b).
- ,
.
, , .
- ,
().
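The coefficients a and b of the regression line, the residuals, and forecasts for new x values can be obtained by ordinary least squares. The sketch below is not the Excel procedure itself; the data values are an assumption (recovered from the residual table of this lecture) and `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(xs, ys))
    sxx = sum((p - mean_x) ** 2 for p in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Assumed source data (see the lead-in).
x = [3, 2, 4, 5, 6, 7, 8, 9, 10, 11]
y = [9, 7, 12, 15, 17, 19, 21, 23.4, 25.6, 27.8]

a, b = fit_line(x, y)

# Residual = observed value minus the value predicted by the line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Forecast Y for new values of x.
forecast = {xi: a + b * xi for xi in range(11, 17)}

print(a, b)
print(forecast)
```

Small residuals without a visible pattern suggest that a straight line describes the data adequately.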
8.3. .
, ""
"".
Table 8.3. Residual output
Observation    Predicted Y    Residuals       Standard Residuals
1              9,610909091    -0,610909091    -1,528044662
2              7,305454545    -0,305454545    -0,764022331
3              11,91636364    0,083636364     0,209196591
4              14,22181818    0,778181818     1,946437843
5              16,52727273    0,472727273     1,182415512
6              18,83272727    0,167272727     0,418393181
7              21,13818182    -0,138181818    -0,34562915
8              23,44363636    -0,043636364    -0,109146047
9              25,74909091    -0,149090909    -0,372915662
10             28,05454545    -0,254545455    -0,636685276
.
- 0,778, - 0,043.
, .
8.3. , ""
.
,
.
. 8.3.
, ..
.
, Y=
x*2,305454545+2,694545455 x.
Y 8.4.
Table 8.4. Predicted values of Y
x     Y (predicted)
11    28,05455
12    30,36
13    32,66545
14    34,97091
15    37,27636
16    39,58182
, Microsoft
Excel :
;
, ;
;
;
;
.
, ,
, ,
.
, , ,
.
, , , ,
. .
,
, .
.
, .
.
(decision trees)
. Data Mining
, .
,
.
, .. ,
.
,
, ..
.
(Hovland, Hunt)
50- . .,
- " " ("Experiments in
Induction") - 1966 .
-
, . - ""
"" .
. 9.1 , - :
" ?" , .. , ,
( "" " "). ,
, .
"?" , .. .
,
, - . ,
.
.., ,
. : "" "
" .
( )
, .. - ""
" " .
.
, :
: "?"
: " ?", "
?"
, , : "", " "
( ): "", "".
, ..
.
.
, ..
("" "").
, .
, , , ,
, .
. ,
,
, : , , ,
, . ,
( ) ,
.
, ,
: .
, ,
. ,
, ,
" ?"
, " : :".
. 9.2. ,
" ?". ,
.
, (, ,
) .
, (splitting attribute).
, , ,
"" " " .
, , .
.
:
-.
(splitting criterion) [33].
. 9.2.
. , " ?",
: "" " ".
.
, ( )
,
.
.
.
"" [34].
,
.
. ,
, .
, , ,
, " ",
.
,
. ,
.
. : > 35 > 200, .
,
.
( ).
, ,
. , ,
, ,
.
, ,
( ,
).
,
;
, ,
, .. , . :
SLIQ, SPRINT.
.
,
, , .
.
,
, ,
, .
,
, , ,
, ,
. , , .
, Data Mining,
.
,
, .
.
""
"" (tree building) "" (tree pruning).
(
).
.
.
, .. .
,
, ,
.
. :
, ,
,
. ,
, "",
.
. -
Gini.
,
" "
(information gain measure) .
, (Breiman) .,
CART Gini.
.
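If a data set T contains examples from n classes, the Gini index, gini(T),
is defined by the formula:
$$gini(T) = 1 - \sum_{j=1}^{n} p_j^2$$
where T is the current node, p_j is the probability of class j in node T, and n is the number of classes.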
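A small sketch of the Gini index for a node, using the definitions above (p_j is the relative frequency of class j in the node). The second function, the size-weighted Gini of a two-way split, is the usual way a binary split is scored in CART-style trees; it is added here as an assumption, and both function names are illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_split(left, right):
    """Gini index of a binary split: size-weighted mean of the child nodes."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

node = ["yes", "yes", "no", "no"]
print(gini(node))                                   # impurity of a mixed node
print(gini_split(["yes", "yes"], ["no", "no"]))     # impurity after a perfect split
```

A pure node (all examples of one class) has a Gini index of 0; the best split is the one that reduces the index the most.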
, ""
,
. ""
"", ,
,
. ""
, .
, ,
, ,
" " (Breiman,1984).
?
, ,
[39]. ,
, ,
, .
.
, .
,
" " ,
.
, "
" , , . 1984 . ,
, ,
.
,
, : ;
.
,
. :
;
.
. ,
, ,
, .. .
- ,
.
- " " (prepruning),
. .
. "
" (Breiman, 1984).
- .
, .
- ,
.
,
.
, ,
, [35].
(pruning) .
, ,
: .
,
, ,
.
,
, ,
.
,
. , ..
. ,
. , ,
.
,
, .
.
.
,
: CART, C4.5, CHAID, CN2, NewId, ITrule .
CART
;
;
;
.
,
. ,
. ,
, .
( right) - , ; ( left) , .
,
, - Gini - . ,
. , ,
.
50 , -
100 0 .
. , CART
.
. ,
xi <= c, c
xi .
,
xi V(xi), V(xi) -
xi .
. , minimal cost-complexity tree
pruning, CART
. -
" "
.
, , "
".
(V-fold cross-validation)
CART.
, ,
,
.
, CART: ,
- Gini, minimal cost-complexity tree pruning V-fold crossvalidation, " , ", ,
.
C4.5
C4.5 .
. C4.5
.
C4.5 :
, .. .
.
.
.
- (binary), (multi-way)
- , Gini,
.
,
.
,
- , ..
.
, - Sprint,
[36]. Sprint,
CART,
.
;
,
.
, .
, , , ,
.
.
. " ".
;
: ,
( )
.
. 10.1.
- , ;
. 10.2.
: , - .
. 10.2.
, , ..
; . 10.3.
. 10.3.
, .
, .
. 10.3. , .
SVM
,
. , .
f(x),
- .
, ..
f(x), ,
.
f(x). b, ..
f(x)=ax+b. . 10.4.
, .. SVM-, ,
-
.
.
. 10.4. SVM
,
.
.
,
. ,
.
.
- .
,
,
.
,
, ,
.
- , ..
, ,
.
, .
,
,
.
: SVM- ,
, .
- -
,
.
, SVM ,
.
,
, , .
, ,
, .
, ,
.
[37, 38]:
( );
,
.
" "
;
, ,
;
, , ;
, ;
;
.
, , ,
,
, .
" ", ..
"" ,
.
, ,
- , .
, CBR- .
,
"" .
.
,
.
, .
" "
- ,
, -
, , .
"" ().
,
.
.
, - .
-
.
.
k-
().
. 10.5. ( )
"+" "-",
("+" "-"), , ,
. .
()
. ,
, : "+"
"-".
. 10.5. k
k-
.
, .
. k ,
(..
).
5. ,
(
( ) ). 2 "+" 3
"-" , k- "-"
.
k-
.
.
, . 10.6. (
) x
y ( ). (..
); k-
X ( ).
. 10.6. k
k-
, .. k, .
( ) X.
- (x4 ;y4). x4 (.. y4), ,
X (.. Y). ,
: Y y4 (Y = y4 ).
, k , ..
. X .
y3 y4 . ,
Y Y = (y3 + y4)/2.
,
Y X k .
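The two uses of the k-nearest-neighbours method described above (a majority vote among neighbours for classification, and the mean of the neighbours' outputs for prediction) can be sketched as follows. Euclidean distance, the toy data and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

def nearest(train, query, k):
    """Return the k training records (point, target) closest to the query."""
    return sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]

def knn_classify(train, query, k):
    """Class of the query = most frequent class among its k neighbours."""
    votes = Counter(label for _, label in nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k):
    """Predicted value = mean output of the k nearest neighbours."""
    neighbours = nearest(train, query, k)
    return sum(value for _, value in neighbours) / len(neighbours)

# Classification: among the 5 nearest neighbours there are 2 "+" and 3 "-".
labelled = [((1, 1), "+"), ((1, 2), "+"), ((2, 1), "-"), ((2, 2), "-"),
            ((0, 1), "-"), ((9, 9), "+"), ((8, 9), "+")]
print(knn_classify(labelled, (1.2, 1.2), k=5))

# Regression on one input variable: with k=2 the answer is (y3 + y4) / 2.
curve = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((4,), 40.0)]
print(knn_regress(curve, (3.6,), k=1), knn_regress(curve, (3.6,), k=2))
```

The result depends directly on k: a single neighbour makes the answer sensitive to noise, while a large k smooths over local structure.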
,
.
, - .
k-
, "
".
k-
k. ,
.
k,
. ,
. , ,
k.
, ,
, , k
.
, k ,
() .
k -
k - - (Bishop, 1995).
, , STATISTICA (StatSoft) [39].
- - .
- v "". V ""
.
k k-
v- ( )
.
,
( ).
v.
v "" (),
(.. ).
k, ,
( ),
( -).
, - - ,
,
.
k - .
,
, ,
.
k-
.
Dell,
Inference. ,
.
, ,
. CBR
Intranet Dell.
Data Mining, k- CBR-,
. : CBR Express Case Point (Inference Corp.),
Apriori (Answer Systems), DP Umbrella (VYCOR Corp.), KATE tools (Acknosoft, ),
Pattern Recognition Workbench (Unica, ), ,
, Statistica.
: , ,
.
[11].
[40],
Data Mining.
- (naive-bayes
approach) [43] ,
. ,
"" .
"" - .
"" ,
.
:
1. .
2. :
o ;
o , ..
.
,
, ,
; .
, ,
. ,
?
, .
.
, - .
?
.
Data Mining [41]:
,
, ;
", ";
,
, , , , ;
(overfitting), ,
(, ).
- :
,
;
,
, [42];
-
, ;
[43];
-
,
[43].
,
, .
. (Paul Graham).
.
- ,
.
, .
- " - ".
, "
" , , .
" " ,
, ,
. ,
" ", ,
, .
.
,
.
(Neural Networks) - ,
, ,
( ).
,
, - .
.
-
, , , ,
, ,
.
, , ,
, ,
.
.
.
,
, .
, .
, - ..
Data Mining, ,
:
( ). :
, , .
.
: ,
( ). ,
.
( ).
.
, , .
.
,
.
.
.
,
.
( ).
.
. ,
, ,
"".
( ) - ,
.
-
, .
, ,
, (
).
. 11.1.
. 11.1.
, .
- , (
) .
( wi).
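The current state of the neuron is the weighted sum of its inputs:
$$s = \sum_{i=1}^{n} x_i w_i$$
The output of the neuron is a function of its state:
y = f(s).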
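A single artificial neuron, as defined by the relation y = f(s) above, can be sketched in a few lines: the state s is the weighted sum of the inputs, and the output is the activation function applied to s. The logistic (sigmoid) activation and the function names are assumptions chosen for illustration.

```python
import math

def sigmoid(s):
    """Logistic activation function, mapping any real s into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

def neuron_output(inputs, weights, activation=sigmoid):
    """Output y = f(s), where s is the weighted sum of the inputs."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return activation(s)

print(neuron_output([1.0, 2.0], [0.5, -0.25]))            # state s = 0
print(neuron_output([1.0, 2.0], [0.5, 0.5], lambda s: s))  # linear activation
```

Training a network amounts to adjusting the weights w_i so that the outputs match the desired values on the training examples.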
, , -
, .
:
.
.
.
, .
- ,
( )
( ).
() - ,
.
, .
.
.
- , ,
[44].
- [45, 46].
.
- ,
.
- ,
() , .
i- ,
(i+1) . k- ,
.
, .
,
- .
, , , ,
, , .
. , ,
.
, , [46].
- , .
- , .
, -
, - . , -
, .
, - .
, , .
(input neuron),
.
(hidden neuron) - ,
.
(output neuron), ,
.
, .
.
.
.
.
,
.
- ,
, ,
.
.
.
.
, - ()
. ,
.
.
,
, .
.
,
" ".
.
.
,
.
,
, .
() .
.
- ,
.
. , .
.
,
(overfitting).
, -
,
.
,
.
, ()
. .
( )
.
"" , . .
-
( ).
.
. .
,
.
. -
.
,
.
, .
. 11.2.
. 11.2. .
.
,
, ..
.
. ,
.
:
.
[47]. - .
( ) - ,
.
. 1960-
.
- . 11.3.
. 11.3.
, , n , ,
3 .
.
(MLP) -
( ), ,
.
, - .
.
,
, .
.
.
. 11.4.
. 11.4.
, , n . ,
3 , .
. , ,
.
(Back propagation, backprop) -
,
.
, , ,
.
.
, ,
.
:
(
).
.
,
.
: BrainMaker, NeuroOffice,
NeuroPro, .
: ,
, , ,
. .
" "
Deductor (BaseGroup).
,
, : , , , , ,
, , , .
, , ,
, , ..
" ?".
, .. .
credit.txt.
.
- .
. - " ", - .
. 11.5.
.
"". . 11.6.
, ..
- 33 ( ),
- 1, - 1 ( ).
- , . .
11.7.
.
" ", . 11.8.
.
, 0,005,
10000.
.
, 4536 83,10%
, - 85,71% .
. 11.9.
. :
, , ", ",
[48].
. 11.10 . ,
, .. 55 , ,
89 , .
, (1 4). ,
- 96,64%.
. 11.10.
"-" .
,
PR - R ;
Si - i- ;
TFi - i;
btf - , ;
blf - , ;
pf - .
,
, tansig, logsig, purelin.
net = newff(minmax(P), [n, m, l], {'logsig', 'logsig', 'purelin'}, 'trainrp'),
P - ;
n - ;
m - ;
l - .
. ,
, : net.performFcn = 'sse'.
10000
: net.trainParam.epochs=10000.
:
[net,tr]=train(net,P,T);
, , nn1.mat.
:
save nn1 net;
,
, .
Matlab
, Neural Network Toolbox.
Neural Network Toolbox
[49, 50].
. .
- .
.
,
, .
.
(, , ).
.
-
.
.
( ).
( ).
,
.
- ,
.
:
.
.
Back Propagation.
.
.
. -
, .
.
- , ,
,
.
.
.
.
. ,
.
, .
.
, ,
. .
,
.
.
. ,
, .
.
.
().
,
.
,
.
,
. ,
0 10,
.
.
, , .. ,
, ,
.
,
, , , [0..1].
,
.
. [44,
51, 52].
.
,
[53]:
1. ,
;
2.
;
3. ( , ,
- , .)
.
.
,
. , ,
, ,
, , ,
- ,
(
).
(Self-Organizing Maps, SOM)
, , - ,
,
. ,
,
. ,
.
(1982 ).
-
.
.
.
, ,
(, ,
).
,
,
, , ,
.
-
, .. .
, ,
( ).
, .. , .
.
:
[39].
. ,
. ,
, .
, ,
.
, , - ,
.
.
.
, ,
.
, , ;
: .
. , ,
. . 12.1
. 12.1.
.
,
, .
.
(,
).
.
"" "-".
,
( )
.
,
.
,
.
. ,
.
,
,
"".
, .. -.
, [39].
,
. ,
, .
.
( )
.
n-
. ,
.
,
.
,
.
. 12.2
. 12.2.
? .12.3 , , i-
( pr_a), . , -
, -
.
. 12.3. i-
, .12.2, ,
( ,
), - ( ,
).
, ( ) :
,
;
.
;
;
.
. ,
[15:30] , 15- 30-
. , .
.
. ,
.
.
- ,
.
. .
.
. -
, .
. , ,
, ,
.
, -
.
.
, ,
. ,
,
, - ;
.
, , SoMine,
Statistica, NeuroShell, NeuroScalp, Deductor .
Deductor.
. , ..
, - 21.
"banks.xls".
.
xls- .
" ". , ..
: , ,
. ,
, "". "" .
,
. , 95% - 5%.
,
.
5, . 12.4 :
Y ( ).
" ",
. 12.5, ,
.
, . 12.6,
: , .
:
( ). ,
.
- ""
.
.
,
.
" "
"-". ,
. 12.7.
, , ,
" " .
.
, , . 12.8
. 12.8.
.
.
,
. , ,
, .
,
.
, , : ,
, du (
) akt ( ) pr_a
( ).
,
: ,
, , ..
, .
, , .
(. 12.9) ,
- . (
) ,
.
. 12.9.
" ".
. 12.10. ,
, . ,
, .
. 12.10.
,
.
7 ,
, .
.
. ,
.
, :
- , - .
, - .
.
.
.
.
"" ,
- .
, (Tryon) 1939 ,
100 .
,
,
,
( , , ). ,
.
, .
,
.
,
, , .,
. .
:
1.
2.
3.
4.
.
.
.
, (),
, .
,
.
.
, , 14- ,
X Y. 13.1.
13.1.
X Y
1
27
19
11
46
25
15
36
27
35
25
10
43
11
44
36
24
26
14
10
26
14
11
45
12
33
23
13
27
16
14
10
47
. X
Y , . 13.1.
. 13.1. X Y
"" . (),
X Y "" , ();
.
. "",
.
, , .
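The Euclidean distance between objects i and j,
described by the variables X and Y, is:
$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \tag{13.1}$$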
: ,
, ,
.
, , :
, ()
. ,
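For two points in three-dimensional space (Fig. 13.2),
formula (13.1) takes the form:
$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} \tag{13.2}$$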
. 13.2.
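The distance calculation generalizes to any number of dimensions: take the coordinate differences along each axis, square them, add them up and take the square root. A minimal sketch follows; the Manhattan distance is included only as one common alternative measure and is an assumption, as are the function names.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean((0, 0), (3, 4)))        # two dimensions
print(euclidean((0, 0, 0), (1, 2, 2)))  # three dimensions
print(manhattan((0, 0), (3, 4)))
```

The smaller the distance between two objects, the more similar they are considered when clusters are formed.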
: , ,
, .
- .
- .
,
. ,
.
. .
- ,
.
,
.
, .
, .
.
.
() .
, -
.
. .
,
: 100 700,
- 0 1.
, ,
, , , ..
,
, .. . -
.
.
(standardization) (normalization)
,
. .
:
;
Z- .
,
, ,
. ,
- .
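When variables are measured on very different scales (for example, one ranging from 100 to 700 and another from 0 to 1), the larger one dominates the distance. Two common rescaling schemes are sketched below: scaling to the range 0 to 1 and conversion to Z-scores. The use of the sample standard deviation and the function names are assumptions.

```python
import statistics

def min_max(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; range scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]

def z_scores(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

income = [100, 400, 700]
print(min_max(income))
print(z_scores(income))
```

After either transformation, all variables contribute to the distance on a comparable scale.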
.
,
, .
;
.
.
,
. .
.
.
(Agglomerative Nesting, AGNES)
.
.
.
,
.
() (DIvisive ANAlysis, DIANA)
.
,
,
.
.
13.3.
. 13.3.
Data Mining,
. , SPSS,
- Statgraf.
.
,
"" ( )
( ).
.
.
( dendron ""), .
,
()
.
(dendrogram) - , n ,
.
, ,
.
,
.
.
.
. 13.4.
. 13.4.
11, 10, 3 ..
. ,
( ), : 11
10; 3, 4 5; 8 9; 2 6. :
11, 10, 3, 4, 5 7, 8, 9. ,
.
( ), .
, .
.
.
( ),
"" "-" .
.
,
. , ,
,
.
. ,
"", -
.
. ,
.
,
. -
? ,
.
.
( ) .
,
.
"" "" ,
" " ,
.
.
(.. " ").
, "".
"", .
(Ward's method).
,
(Ward, 1963).
, .
,
, .. .
""
.
(
- unweighted pair-group method using arithmetic averages,
UPGMA (Sneath, Sokal, 1973)).
. ,
"", "" ,
.
(
- weighted pair-group method using arithmetic averages, WPGMA
(Sneath, Sokal, 1973)). ,
,
( , ).
.
(
- unweighted pair-group method using the centroid average (Sneath and Sokal,
1973)).
.
(
- weighted pair-group method using the centroid average, WPGMC (Sneath, Sokal
1973)). , ,
( ), .
,
.
SPSS
SPSS (SPSS).
SPSS
( ), () [54]. ,
, - .
,
.
, .
, .
N-1. ,
. ,
. ,
.
. SPSS :
(Between-groups linkage),
.
(Within-groups linkage).
- (Nearest
neighbor).
(Furthest neighbor).
(Centroid clustering) .
,
, .
- ,
(Median clustering).
.
( )
13.2. :
Stage - ();
Cluster Combined - (
);
Coefficients - .
13.2.
Cluster Combined
Coefficients
Cluster 1 Cluster 2
1
10
,000
14
1,461E-02
1,461E-02
1,461E-02
1,461E-02
13
3,490E-02
11
3,651E-02
4,144E-02
5,118E-02
10 4
12
,105
11 1
,120
12 1
1,217
13 1
7,516
, Cluster Combined :
9 10,
9, 10 .
2 14, 3 9, ..
Coefficients ,
;
, .
,
. ,
, .
SPSS :
Z- (Z-Scores). ,
.
-1 1.
-1 1.
0 1.
0 1.
1. .
1. .
1. .
, , ,
, .
, 0 1.
.
.
/ .
, .
,
. ,
,
.
13.2 , Coefficients ,
, ,
, .
1,217 7,516.
, (14)
(12).
,
, .
.
( , 0 25).
. 13.5.
, 9 5 .
,
25 .
. 13.5.
. .
. ,
,
. ,
.
. .
, .. ,
" ".
k- (k-means)
k-,
.
(Hartigan and Wong, 1979).
,
,
.
k- k ,
. , k-, () ,
, . k
, .
: k
, ( )
.
1. .
k, "" .
.
:
o
o
o
k- ;
k-;
k-.
.
2. .
,
. .
, :
o
o
, ..
, ;
.
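The two stages described above (an initial choice of k centers, followed by repeated reassignment of objects and recalculation of the centers until nothing changes) can be sketched as follows. Random choice of the initial centers, Euclidean distance, the toy data and the function name are assumptions made for illustration.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of coordinate tuples; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)              # stage 1: initial cluster centers
    for _ in range(max_iter):
        # Stage 2a: assign every point to the nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Stage 2b: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:               # centers stopped moving
            break
        centers = new_centers
    return centers, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = kmeans(points, k=2)
print(centers)
print(clusters)
```

Because the result depends on the initial centers, it is common to run the algorithm several times with different starting points and compare the solutions.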
. 14.1 k- k, .
. 14.1. k- (k=2)
.
, 2 , 3, 4, 5 ..,
.
k-
(.. , ).
.
.
k-:
;
;
.
k-:
, .
k-;
.
.
. ,
. 25 .
.
.
,
, . ..
.
; - .
.
- ,
.
, :
;
-
.
,
.
, ,
.
. .
, .
"" .
, .
,
.
, .
,
.
- ,
.
.
- -
.
SPSS
,
(,
), (,
).
SPSS.
.
: Analyze ()/Data Reduction ( )/Factor
( ):
Extraction:() .
, .
- -
. ""
"Save as variables" ( ).
"
", - ,
.
, fact1_1, fact1_2
.., k-.
:
Analyze ()/Classify()/K-Means Cluster: (
k-).
K Means Cluster Analysis ( k-)
fact1_1, fact1_2 ..
. .
, ,
, .
, k-
.
, , .
, ,
, . ,
,
,
.
:
;
;
.
,
.
SPSS, , (,
), (, ) ,
,
, ,
.
-. -
. ,
. ,
, , ,
. ,
.
, ,
, .
,
.
.
( ).
( ).
,
.
. :
;
;
,
; .
.
.
, .
:
,
;
-;
;
;
.
-
.
,
.
,
,
, .. , .
,
.
,
.
,
.
, .
, ,
- .
.
.
, :
.
.
.
.
, ,
, .
.
. ,
,
.
,
.
, .
, ,
, . ,
, "".
, ,
. .
,
. ,
- ,
, ,
, . ""
.
, ,
, .
: ;
; .
-
.
, , .
.
.
, .
.
,
.
, , .. ,
.
.
,
,
.
,
.
, , ""
. ,
, : ,
.
, ,
, . ,
, - .
, :
;
.
,
.
.
(summarized cluster representation), ,
[33].
,
. : BIRCH, CURE, CHAMELEON,
ROCK.
BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
[55].
,
, .
.
.
-
.
[33] , .
, ,
""
.
, ,
, ,
.
WaveCluster
WaveCluster
[56].
.
, ,
.
.
.
WaveCluster:
1.
2.
3.
4.
;
;
;
.
, . ""
. Clarans
.
,
.
CURE [57] , DBScan [58],
(density).
BIRCH, Clarans, CURE, DBScan
, ,
. ,
[59].
,
- ,
.
, - Data Mining.
(association rule)
.
:
?
?
?
?
, .
. , , ,
, , .
.
:
: , ;
; ;
;
: , A,
, ?
: , ;
: ,
;
, ,
( );
Web-.
: ,
, 50%.
- ,
.
, .
, .
(Transaction database)
, (TID) ,
.
TID - , .
, ,
15.1. (TID) ,
,
.
15.1.
TID
100 , ,
200 ,
300 , , ,
400 ,
500 , , ,
,
.
, D.
( 15.2).
= a
= b
= c
= d
= e
= f
Table 15.2.
TID    Items (coded)
100    a, b, c
200    b, d
300    b, a, d, c
400    e, d
500    a, b, c, d
600    f
(Itemset), , , {, , }.
:
abc={a,b,c}
, ..
3:
SUP(abc)=3.
, , abc
.
min_sup=3, {, , } - .
,
.
, ,
50%.
SUP(abc)=(3/6)*100%=50%
.
, ,
(min support).
(frequent).
: " A B".
:
" ( ) A,
, B)"
, .
.
" " ,
15.1.
. .
s, s%
A B , , .
- A, - B. "
" 3, 50%.
, , A
B.
" A B" , c%
, A, B.
, , , ,
, , (3/4)*100%, .. 75%.
" " 75%, .. 75%
, , B.
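Support and confidence can be computed directly from the coded transaction database. The sketch below reproduces SUP(abc) = 3 (50%) and shows one rule in the coded data, d -> b, that has the same figures as the example above (support 50%, confidence 75%); which rule the original example used is not assumed here. Function names are illustrative.

```python
# Coded transaction database D: TID -> set of items.
transactions = {
    100: {"a", "b", "c"},
    200: {"b", "d"},
    300: {"b", "a", "d", "c"},
    400: {"e", "d"},
    500: {"a", "b", "c", "d"},
    600: {"f"},
}

def support_count(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for basket in db.values() if items <= basket)

def support(itemset, db):
    """Support as a share of all transactions."""
    return support_count(itemset, db) / len(db)

def confidence(antecedent, consequent, db):
    """Share of transactions containing A that also contain B."""
    both = set(antecedent) | set(consequent)
    return support_count(both, db) / support_count(antecedent, db)

print(support_count("abc", transactions), support("abc", transactions))
print(support("bd", transactions), confidence("d", "b", transactions))
```

Support measures how often the items occur together; confidence measures how often the consequent appears among the transactions that contain the antecedent.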
" A B",
. ,
.
,
.
, ,
, ,
. , , "
", ,
- .
,
. , 3%
.
;
.
(candidate generation) - , ,
, i- (i - ).
.
(candidate counting) - ,
i- . ,
, (min_sup).
i- .
Apriori D.
. 15.1. 3.
. 15.1. Apriori
.
.
, 3, . e f,
, 1.
: a, b, c, d.
,
, 3.
, ab, ac,
bd, .
,
: abc, abd, bcd, acd,
, 3. abc
.
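The alternation of candidate generation and candidate counting can be sketched as a level-wise search over the coded database D with a minimum support of 3. This is a simplified illustration rather than an optimized implementation; the function name and the list-of-sets representation are assumptions.

```python
from itertools import combinations

def apriori(db, min_sup):
    """Return {itemset: support count} for all frequent itemsets in db."""
    items = sorted({item for basket in db for item in basket})
    current = [frozenset([item]) for item in items]   # 1-item candidates
    frequent = {}
    k = 1
    while current:
        # Candidate counting: scan the database and keep frequent candidates.
        counts = {c: sum(1 for basket in db if c <= basket) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_sup}
        frequent.update(level)
        # Candidate generation: join frequent k-itemsets into (k+1)-itemsets
        # and prune those having an infrequent k-item subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = [c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

db = [{"a", "b", "c"}, {"b", "d"}, {"b", "a", "d", "c"},
      {"e", "d"}, {"a", "b", "c", "d"}, {"f"}]

result = apriori(db, min_sup=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), count)
```

The pruning step relies on the property that every subset of a frequent itemset must itself be frequent, which keeps the number of candidates counted at each level small.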
Apriori , - - ,
,
.
,
.
,
,
.
ad, bc, cd ,
abd, bcd, acd.
, .
,
abc
( ).
Apriori ,
. (negative border),
-, ,
,
.
Apriori
Apriori
. Apriori,
,
, - [31].
Apriori: AprioriTID AprioriHybrid.
AprioriTid
- , D
.
,
.
, , .
AprioriHybrid
Apriori AprioriTid ,
Apriori , AprioriTid; AprioriTid
Apriori . ,
-. ,
AprioriHybrid ,
Apriori AprioriTid. AprioriHybrid Apriori
AprioriTid, ,
. , Apriori AprioriTid
.
, Apriori.
, [31,
33].
- DHP, (J. Park, M.
Chen and P. Yu, 1995 ). - ,
Apriori [63, 64].
, k- -
. k-1
-. -
k- , . k k -. ,
Apriori, -, -
, .
: PARTITION, DIC,
" ".
PARTITION (A. Savasere, E. Omiecinski and S. Navathe, 1995 ).
()
,
[65].
Apriori "" .
. ,
.
DIC, Dynamic Itemset Counting (S. Brin R. Motwani, J. Ullman and S. Tsur, 1997
). ,
" " (start point),
[64].
,
.
Deductor.
, ,
, MS Excel.
MS Excel Deductor, ,
. - .
( - " ")
" (ID)", - "".
MS Excel Deductor . 15.2.
, 140 .
" ".
,
"ID" "".
, . 15.3,
, ..
. "" ,
. - ,
, .
. 15.3.
:
20% 60% ,
40% 90% .
, ,
.
, , 30% 50%, ,
.
.
.
. 15.4.
. 15.4.
, -
, - .
; : " ", "", "
", "-". , .
" ".
- , ,
. ,
, .
, ,
15.3.
.
. , ,
: , , .
54,55 24
52,27 23
50,00 22
10
45,45 20
43,18 19
31,82 14
31,82 14
12
22,73 10
11
22,73 10
22,73 10
22,73 10
13 20,45 9
9
20,45 9
""
. ,
"-",
, . ,
, .
15.4. , ,
, 71%
. . , ,
.
, %
22,73 10
71,43
22,73 10
71,43
22,73 10
43,48
22,73 10
52,63
20,45 9
40,91
45,45 20
86,96
45,45 20
83,33
22,73 10
52,63
22,73 10
41,67
10
22,73 10
45,45
11
22,73 10
41,67
12
20,45 9
90,00
13
20,45 9
45,00
20,45 9
90,00
20,45 9
47,37
14
15
.
" " "-".
" " ,
: .
, , ,
, . ,
,
.
,
, . ,
.
"-" , ,
.
, "",
"", " ", " ",
. . 15.5.
. 15.5. "-"
.
, ,
.
, ,
, ,
.
.
", , , ,
." [65]
,
Data Mining,
"" . , ,
Data Mining - .
, .
1987 ACM SIGGRAPH IEEE Computer Society Technical Committee
of Computer Graphics, ,
,
.
,
, , , , , ..
, .
:
;
, ;
;
;
.
Data Mining
Data Mining .
Data Mining.
, , ,
, ,
.
, ,
.
,
, .
: , , .
.
: , .
.
(, ()
);
;
;
( , ).
Data Mining
( ), , Data
Mining . ,
"".
, , ,
. Data Mining
, ,
.
, Data
Mining. ,
, " ".
, . , ,
- .
.
- , .
, ""
. , .
,
, .
.
. .
, ..
Data Mining.
,
, .
", ". "" .
,
Data Mining . , , ,
..
, ,
, ,
.
,
, .
.
.
.
.
.
.
.
. ,
"" .
.
, ,
: , ,
.
, ,
[22]:
, ;
.
,
.
Data Mining
.
:
(univariate) , 1-D;
(bivariate) , 2-D;
(projection) , 3-D.
,
.
-
:
(,
);
;
;
;
, .
, :
(
);
-, - .
, - -
.
.
4 +
.
.
:
;
" ";
.
,
. ,
, . 16.1 [22].
(Alfred
Inselberg ) 1985 .
. 16.1.
" "
" "
[66].
"" .16.2.
"". ""
(, , , , ).
.
. 16.3 , "
".
, - .
, .
, ,
.
. ,
, ,
. ,
, , .
.
-
-
. , ,
.
,
, , .
.
,
- .
.
.
,
, .. .
. :
- , -
.
,
.
.
(,
)
.
, Data Mining,
.
, ,
: , , , ,
..
,
,
.
(Tufte's Principles)
[67] :
, ,
;
.
[65]
:
1.
2.
3.
4.
5.
6.
7.
.
.
.
.
.
.
.
,
,
, (
) .
,
, ,
.
.
.
. 16.4 MineSet [26], ,
, Data Mining.
,
( ,
).
-
.
. 16.4. MineSet.
,
, - .
,
, .
(Philip Russom)
[68]:
.
.
, .
1. .
( , ...).
, .
,
, .
,
. , , -
,
.
, ,
.
, ,
, -
.
"", "" "".
.
-
OLAP, Text Mining Data Mining,
CRM- .
, ,
, web- (
).
2. .
, .
,
,
,
.
,
.
, - , ,
. (
)
, .
.
(drill down) ,
.
- , .
, OLAP (
) .
.
, ( ,
, , .),
.
Data Mining
. (
)
. Data Mining
,
,
, .
,
,
, , ..
;
,
.
,
, .. Data Mining.
,
, ,
.
:
,
, ,
, ,
,
.
- " ,
, " [73].
- " ,
,
,
". ,
, [74].
, , , ,
, .
: - ,
[75].
:
(..
);
( );
(
).
,
, [75, 76].
,
,
.
,
.
" ,
" [76].
, ,
:
,
[75].
,
[11, 77]:
;
OLAP;
Data Mining.
:
,
.
, ( OLAP)
.
Data Mining ,
,
, .
,
. .
(D.J. Power,
2000). C [75].
, ,
: EIS DSS [75,78].
EIS (Executive Information System) - , .
,
.
, ,
,
. ,
; , ,
.
DSS (Decision Support System).
.
, .
, EIS, ,
,
. ,
.
,
.
, .. DSS.
, ..
(ad hoc) .
[79].
[11].
1. (OLTP-).
,
- .
.
2. ( OLAP-).
OLAP , ,
.
.
3. (Data Mining).
EIS DSS
. ,
.
, [80], :
;
;
;
, ;
.
OLAP-
OLAP, (On-Line
Analytical Processing),
(Multidimensional conceptual view).
OLAP (E. F. Codd) 1993 .
,
.
. ,
,
. OLAP-,
, ,
, , ,
. OLAP-
.
OLAP-
OLAP-.
: ,
, OLAP-,
. .
OLAP- OLAP [77]:
, OLAP-
.
MOLAP,
.
OLAP-.
. , , .
.
" " ,
.
ROLAP-
-.
.
OLAP-.
,
.
, .. HOLAP-,
, . OLAP OLAP-
. .
.
- OLAP-.
OLAP- OLAP- OLAP-.
OLAP-
- .
, . OLAP-
, -
. OLAP-
: MOLAP, ROLAP HOLAP.
OLAP- Microsoft.
OLAP- -. OLAP-
.
OLAP-
[81],
. ,
.
? OLAP
.
, SQL-.
17.1
[81]:
17.1.
OLTP
OLAP
OLAP Data Mining
. : OLAP
, Data Mining
.
OLAP Data Mining "" ,
. ,
. N. Raden, " ...
, ,
,
" [82].
K. Parsaye [83] "OLAP Data Mining" ( Data Mining)
.
,
.
,
( , , ), ,
.
( )
. J. Han
- "OLAP Mining" .
1. "Cubing then mining".
, .
2. "Mining then cubing". , ,
.
3. "Cubing while mining".
( ,
..).
Data Mining
. , Data Mining, ,
,
.
,
, ..
.
() .
, .
,
,
.
, , ,
. ,
.
,
, , ,
;
, ;
..
.
,
. ,
Data Mining.
(Bill Inmon) "
, , ,
, "
" ",
,
[84].
,
, ,
, .
,
, . ,
, ,
.
,
"" : .
.
, , ,
. .
, , ,
- " " [86].
,
.
""
"".
, [88] :
,
- -
,
, .
,
-. ,
.
-.
.
,
.
.
.
().
.
.
. ,
, .
.
. ,
, - .
( ,
, , ). ,
,
.
, - .
,
. ,
,
, .
. -,
. -,
. -,
.
.
.
.
, , ,
. .
; ,
-, .
OLAP (On-Line Analytical Processing).
, . OLAP
. , ,
, (,
..). , ,
( ),
.
OLAP-.
- OLAP-.
, "" .
,
,
Data Mining. ,
.
Data Mining.
Data Mining . ,
, , ,
, , , .
Data Mining .
Data Mining ,
.
Data Mining. :
;
;
;
;
;
;
;
.
Data Mining,
.
1.
- ,
.
, -,
.
.
- ,
.
,
- .
- ,
, , .
. ,
" ?"
, , ,
, ..
. ,
, .
.
- .
.
: , SADT IDEF0,
-, -
UML . ,
, , .
Data Mining. ,
, Data Mining.
2.
Data Mining :
;
.
.
. ,
.
.
. . "": ,
, .
,
.
. .
, .
. .
Data Mining ,
. Data
Mining, ,
.
.
3.
: Data Mining.
2 .
,
Data Mining.
, , , ,
80% , .
, .
1.
, ..
, Data Mining.
(, ,
); , ,
/ ;
( , ,
.).
2.
,
, , ,
.
.
, , ..
.
Data Mining
, , ,
.
,
, ,
, .
. ,
,
: , , , , .
,
. ,
.
,
.
, ,
/ .
/ ,
/.
, ,
.
.
. ,
. ,
-
, . ,
, .
.
()
. ()
().
.
.
3.
, .
, .
,
Data Mining.
. , ,
. ,
Data Mining - .
(Data quality) - , , ,
.
, -
"" .
- , , ,
.
: ,
.
, "
" ,
2005 Business Intelligence
Knightsbridge Solutions. 2005 ,
2005 (Duffie Brunson),
Knightsbridge Solutions, .
[90].
.
. ,
,
, -
,
.
.
. , .
, Basel II.
,
. ,
.
, , ""
, , .
. ,
. ,
,
(Extraction, Transformation, Loading - ETL),
, .
.
, - ,
(,
, ).
, .
,
, ,
, ,
, - ..
; ,
,
[91].
[92],
33
.
, :
, ;
, ;
, ;
, .
,
.
:
;
;
.
(Missing Values).
, :
(, );
(,
" " ).
.
.
.
.
(Duplicate Data).
, .. .
.
.
. ,
.
?
.
, . ,
, .
.
.
- .
.
,
. - ,
.
, .
-
- .
Data Mining ,
.
Data Mining .
, ,
.
. 18.1. , (
).
. 18.1.
, Data Mining
.
.
/ ,
.
,
.
[93].
[93].
1. ,
, .
2. ,
,
.
3.
, .
4.
.
5.
,
.
.
, , ,
Data Mining
. ,
,
, .
, [93] (
,
).
1.
2.
3.
4.
5.
.
.
.
.
.
1. .
.
,
.
2. .
,
,
.
;
.
, .
/
, , .
ETL ,
.
, , , ,
,
, , ,
. ,
.
, .
3. .
. , ,
, - , -
. ,
, , ,
.
4. .
ETL
,
.
5. .
,
,
.
.
(,
, .).
, ,
.
, ,
. , . 3
,
.
(
), .
;
.
Data Mining, ,
,
.
, ,
Data Mining ,
, , , ..
.
,
.
Data Mining.
, .
.
(Erhard Rahm) (Hong Hai Do)
.
1. .
2. :
o ;
o .
3. ETL.
[93] ,
.
1.
, ,
,
, Data
Mining.
. MIGRATIONARCHITECT (Evoke Software)
.
: , , ,
,
, . MIGRATIONARCHITECT
.
Data Mining. , WIZRULE (WizSoft) DATAMININGSUITE
(Information Discovery) ,
, .
WIZRULE : , if-then
("-") , , - ,
" Edinburgh 52 ; 2
". WIZRULE
.
, , INTEGRITY (Vality),
, .. . INTEGRITY
- , , .
,
, ,
. INTEGRITY
(,
, , ) . INTEGRITY
.
,
, .
2.
-
- .
, , ,
.
.
- ,
- ,
. ,
,
.
2.1.
,
. ,
IDCENTRIC (First Logic), PUREINTEGRATE (Oracle), QUICKADDRESS (QAS Systems),
REUNION (Pitney Bowes) TRILLIUM (Trillium Software),
. :
,
, ,
.
, . ,
TRILLIUM () 200000
-.
,
.
2.2.
DATACLEANSER
(EDD), MERGE/PURGELIBRARY (Sagent/QMSoftware), MATCHIT (HelpITSystems)
MASTERMERGE (Pitney Bowes). ,
.
; DATACLEANSER
MERGE/PURGE LIBRARY ,
.
3. ETL
ETL
.
ETL API
,
.
ETL
, , COPYMANAGER (Information Builders), DATASTAGE
(Informix/Ardent), EXTRACT (ETI), POWERMART (Informatica), DECISIONBASE
(CA/Platinum), DATATRANSFORMATIONSERVICE (Microsoft), METASUITE
(Minerva/Carleton), SAGENTSOLUTIONPLATFORM (Sagent)
WAREHOUSEADMINISTRATOR (SAS).
, , , ..
.
"" DBMS,
- ODBC EDA.
.
.
,
C/C++
.
, ,
. (, COPYMANAGER,
DECISIONBASE, POWERMART, DATASTAGE, WAREHOUSEADMINISTRATOR),
.
(,
/ ).
ETL ,
API.
,
. ,
(sum, count, min, max, median, variance, deviation,:).
- ( ,
), (, , ,
), , ..
,
,
.
if-then case,
, - , ,
.
.
,
soundex. ,
,
,
.
, ,
:
, ;
/ .
, [94], .
. ,
These include: Enterprise Integrator (Apertus); Integrity Data
Reengineering Tool (Vality Technology); Data Quality Administrator
(Gladstone Computer Services); Inforefiner (Platinum Technology); QDB Analyze
(QDB Solutions); and Trillium Software System (Harte-Hanks Data
Technologies).
,
, , .
.
, Trillium,
. ,
(, ),
. ,
Apertus Validy, .
, Object Query Language. ,
.
Validy
, ,
. .
/. , ,
. :
Nadis Group 1 Software Postalsoft.
: ,
/. ,
, .
, , ,
.
,
. , Nadis Universal
Name and Address data standard.
Group 1, Code-1 Plus,
. ZIP-
. , ,
,
, , ,
.
-
. -
, .
(Rich Olshefski) ,
[95].
. ,
- " "
.
1 , ,
. 1 ,
, .
2 ,
.
2 .
, .
" ". ,
. -
, ,
.
, ,
1 2. 1
, .
2. 2
, , ,
, , - 1.
,
"" .
, .
,
.
1
. - 2,
, , "".
, ,
. :
;
;
;
, .
" " 1 2.
?
,
. ,
.
,
.
.
.
, .
.
. ,
1 2. , ,
, -
, , .
. -,
,
. , -
, .
- " ", - .
, .
, ,
.
. , .
" " ,
, .
[96].
. .
,
, ( , ,
, , ). ,
,
. , , ,
- (
) .
,
.
. ,
. , "", "." ""
.
.
, .
,
, .
.
.
,
.
. ,
, , ,
.
, -,
.
, .
. ,
.
( )
.
.
,
, ,
,
.
,
Data Mining .
,
.
. ,
, .. , .
, ,
, Data Mining
.
, 80% ,
.
Data Mining.
Data Mining
, .
Data Mining, :
;
;
;
;
.
"".
""
"".
- , .
-
, [97].
- ,
.
,
. .
, , ,
.
,
.
, .
Data Mining.
Data Mining .
Data Mining , ,
, .
,
.
Data Mining
, .
Data Mining
, , ,
. Data Mining
.
.
, , ..
, ,
.
.
,
.
, ,
.
. ,
, , ..
,
.
, ,
, (
).
(, , ,
). . ,
.
.
, , .
,
,
, .
,
,
1.
2.
3.
4.
5.
(, ) ;
;
;
;
; ; , Data Mining;
6. () .
.
Data Mining :
.
(predictive) .
, ..
().
, ,
.
() . Data Mining
. (
, , , )
,
.
,
.
- (
) .
, ,
.
,
. :
.
Data Mining ,
.
- ,
.
:
;
;
;
, .
() .
(descriptive)
.
, , , .
, ,
;
.
, ,
() "".
.
.
, , , .
, (
).
- ,
, , ..
/,
.
.
.
- , ,
.
- , ,
.
, .
:
Y=f(x1,...,xn),
x1,...,xn - , Y - .
:
Y=f(x1,...,xn,z1,...,zr,w1,...,ws),
x1,...,xn - ,
;
z1,...,zr - , ,
;
w1,...,ws - .
Y - .
. ,
,
.
, , ,
,
.
, , ,
. Data Mining
.
4.
Data Mining.
.
, 6
. ,
: 1 (,
, ) 2 (, ,
).
,
: ()
.
( ).
, , : " >20
= "married", "1".
, ,
( ) .
"" "
", (.. ),
, .
, , ,
. , , ,
.
Data Mining.
,
. ,
, . Data
Mining ,
. Data Mining
.
Data Mining Group PMML
(Predictive Model Markup Language), ,
Data
Mining. .
Data Mining
,
.
.
, - . :
,
, , ;
. ( )
.
, ,
, ,
, , .
, ,
" ". " ".
, ,
:
;
;
,
. .
, Data Mining .
- ,
.
- .
, .
, ,
.
- ,
:
( - );
( -
).
.
t-1. -> t-1-> .
t. -> t -> .
t+1. -> t+1 -> .
.
5.
.
.
.
(adequacy of a model) -
.
,
,
, .
, ,
.
.
.
,
.
.
.
"" ,
, , -
.
. .
,
. ,
, .
, ,
,
.
, [98].
.
( ), ,
.
- ,
(, ).
.
, Data
Mining, : , , .
,
.
6.
,
.
,
, .
.
, , -
[77].
,
. " ",
.
, Statistica (Statsoft) [39]
" ", : (,
); ; -.
7.
, .
, Data Mining.
()
(target attribute).
8.
Data Mining ,
, ""
.
,
. ,
, (
).
. ,
().
. . ,
, .
, , .. ,
.
:
;
;
;
, ;
(, ,
- , ..).
, , ,
.
.
: " >20 =
"married", "1". -
, , , ,
. :
" >30 = "married", "1".
Data Mining
.
, . Data
Mining , ,
.
, .. , ""
.
, 18 20 .
, ,
, , , .
, , , .
,
.
.
,
, . ,
, ,
. , ,
.
, , ,
. , , .
, " ,
, ". ,
, ,
,
.
. , , ,
, .
(..
), , , ,
..
Data Mining (,
).
Data Mining, Data
Mining, , , ..
, ,
, ( , " ",
).
Data Mining ,
.
.
Data Mining
, .
Data Mining ,
. , ..
. ,
.
, Data Mining .
, ,
.
Data Mining.
Data Mining
, - .
, , ,
. ,
- ,
.
, :
" ,
?". "", , ,
. , , ,
, ,
. Data
Mining.
Data Mining ,
-
.
Data Mining , ,
,
Data Mining.
Data Mining, ,
: " ?"
Data Mining, ,
.
.
(flow of Data) Data Mining [17],
..
. -
.
, Data Mining:
.
,
,
, ;
.
, [99].
.
-
. , ,
.
, , ,
. ,
Data Mining.
. Data Mining .
Data
Mining, .
, .
,
.
Data Mining
.
. Data Mining
Data Mining -
, Data Mining.
, Data Mining, ,
. 21.1: ,
, .
.
(Domain experts) - ,
, , , , , , ..
.
, ,
, , ,
, . , .
(Database administrator) - , ,
,
.
,
, , .
:
; ;
; ; ;
; ;
.
(Mining specialists) - ,
, , .
Data Mining
.
.
Data Mining
,
.
.
Data Mining- ,
..
. Data Mining
. ,
, - .
, ,
. Data Mining
, ,
( - - ),
.
, , , Data Mining
.
.
:
(Project Manager);
IT (IT Architect);
(Solution Architect);
(Data Architect);
(Data Modeler);
Data Mining (Data Mining Expert);
(Business Analyst).
. , ..
, (outsourcing).
,
.
Data Mining .
Data Mining, ,
:
( );
( );
( Data Mining-
);
( );
- ( , data
mining);
( );
.
KDnuggets, -
, Data Mining
(34%), (19%), , -,
.
Data Mining ,
, , ,
-.
, Data Mining ,
.
Data Mining ,
.
Data Mining, ,
.
( , Data Mining ).
- ,
. ,
, ,
. Data Mining
.
, Data Mining.
.
Data Mining
.
Data Mining
, ,
. Data Mining
.
, ,
.
Data Mining ,
, .
CRISP-DM
Data Mining :
, Data Mining.
- , Data Mining.
Data Mining Data Mining.
CRISP-DM [100] (Cross-Industry Standard Process for Data Mining)
is a cross-industry standard process model for Data Mining.
The members of the CRISP-DM consortium include NCR, SPSS and DaimlerChrysler.
CRISP, Data Mining
.
Data Mining CRISP-DM :
1. Business understanding.
2. Data understanding.
3. Data preparation.
4. Modeling.
5. Evaluation.
6. Deployment.
- , .
Data Mining CRISP-DM . 21.2.
. 21.2. , CRISP-DM
SEMMA
SEMMA ,
,
. SEMMA
,
. , SEMMA
,
,
, ,
.
- .
SEMMA
, , .
KDnuggets (2004 .), 42%
CRISP-DM, 10% - SEMMA, 6% -
, 28% - ,
6% . 7% .
Data Mining
, Data Mining, ..
Data Mining.
, , -
Data Mining,
.
:
1. ,
Data Mining.
2. , .
PMML
( );
( );
(, );
, .
PMML
, , , , ,
, .
,
Data Mining. ,
, ,
SQL.
,
, : CWM Data Mining, JDM.
2000 MDC (MetaData Coalition, www.mdcinfo.com) OMG (Object
Management Group, www.omg.org), -
- OIM (Open Information Model)
CWM (Common Warehouse Metamodel) -
OMG. CWM
, , XML,
, OLAP, ,
.
JDM (The Java Data Mining standard - Java Specification Request 73, JSR-73). ,
JSR 73, Java Data Mining API (JDM) -
Java API ( )
Data Mining Java-.
SQL,
Data Mining,
. :
SQL/MM, OLE DB for Data Mining.
SQL/MM SQL
Data Mining.
The OLE DB for Data Mining standard of Microsoft. ,
SQL/MM, Data Mining .
OLE DB.
, Data Mining,
:
, Data Mining ( ,
, , ,
);
web- (SOAP/XML, WSRF, .), Grid- (OGSA, OGSA/DAI, ..),
Web (RDF, OWL, ..);
, :
, ,
(real time) Data Mining, (data webs).
, Data Mining , ,
, .
"" Data Mining .
Data Mining
Data Mining
, . .
, ,
, .
Data Mining (Enterprise Data
Mining Buying Guide) Aberdeen Group: "Data Mining -
.
, ,
Data
Mining ".
Data Mining,
:
Data Mining;
Data Mining,
;
Data Mining- ;
Data Mining- ;
, ,
,
Data Mining.
, ,
, Data Mining.
Data Mining
.
, , ,
,
Data Mining. SPSS (SPSS, Clementine),
Statistica (StatSoft), SAS Institute (SAS Enterprise Miner). OLAP Data Mining, ,
Cognos. , Data Mining :
Microsoft (Microsoft SQL Server), Oracle, IBM (IBM Intelligent Miner for Data).
Data Mining .
- .
" Data Mining,
", 2005 Kdnuggets.
. 22.1.
2002 2003 , ,
, - .
, . ,
: 2003 , 2002 ,
Weka Prudsys Xelopes R, 2005
Weka , Xelopes
.
: Microsoft SQL Data Mining 2003
, 2002 , , 2005 - .
,
.
, ,
. , ,
. ,
,
.
,
,
, ,
,
( ). ,
,
Data Mining.
, ,
, -
,
.
, 2005 , :
: (US $10000 )
: ( $1000 $9999)
Data Mining .
Data Mining
. : .
, .
1. .
- ,
, ,
.
, .. ,
, , , .
, .
, " -"
.
(,
).
2. / .
Data Mining-
, .
, , . Data Mining
() .
()
. :
txt, dbf, xls, csv .
( )
.
3.
,
,
.
4.
5. Data Mining-
6. .
,
Data Mining.
7. .
(Wizard).
8. , ,
,
.
9.
.
10. .
.
11. , .
- ,
. ,
. ,
.
,
, .
12. .
Data Mining ,
. ()
(
),
, .
13. .
14. (
), .
15. , , .
, .
(),
.
Data Mining .
, Data Mining.
.
, ,
, ,
. -
.
16. , . Data Mining
,
.
17. , ,
: PC Standalone (95/98/2000/NT), Unix Server, Unix Standalone, PC Client, NT
Server.
, ,
Data Mining.
, , .
, , ,
,
. , , Data Mining
,
, .
Data Mining
Data Mining
- .
Data Mining KDnuggets:
; .
:
;
;
;
;
(Text Mining), (Information Retrieval (IR));
.
. ,
, .
:
, ,
, SVM .
Polyanalyst (http://www.megaputer.com/). , Data
Mining. , , ,
, . OLE DB for Data Mining DCOM-.
SAS Enterprise Miner (http://www.sas.com/). ,
GUI. SEMMA.
SPSS (http://www.spss.com/clementine/). ,
Data Mining.
Statistica Data Miner (http://www.StatSoft.com/). ,
,
, , .
, Polyanalyst,
Deductor,
. Deductor .
Weka (http://www.cs.waikato.ac.nz/ml/weka/index.html). Weka
Data Mining-.
Weka Java .
, :
;
;
, ;
;
;
BI (Business Intelligence), Database and OLAP software;
;
,
Data Mining;
Web Mining: , XML mining;
Web;
Audio and Video Mining.
.
Data Mining , Data Mining. Two Crows.
Data Mining
Apriori, priori;
Apriori, FP-growth, Eclat and DIC implementations (http://www.adrem.ua.ac.be/) by Bart
Goethals;
ARtool (http://www.cs.umb.edu/),
(binary databases);
DM-II system (http://www.comp.nus.edu.sg/), CBA
;
FIMI, Frequent Itemset Mining Implementations (http://fimi.cs.helsinki.fi/) -
, .
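The tools listed above implement frequent-itemset algorithms such as Apriori. As a rough illustration only, here is a minimal level-wise Apriori sketch in Python. The function name `frequent_itemsets`, the `min_support` parameter and the sample baskets are assumptions made for this example and are not taken from any of the listed packages.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while candidates:
        # Count the support of each candidate and keep the frequent ones.
        level = {}
        for candidate in candidates:
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                level[candidate] = support
        result.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-itemsets.
        # Prune step: keep a candidate only if all its k-subsets are frequent.
        k += 1
        frequent = set(level)
        candidates = set()
        for a in frequent:
            for b in frequent:
                union = a | b
                if len(union) == k and all(
                    frozenset(s) in frequent for s in combinations(union, k - 1)
                ):
                    candidates.add(union)
    return result


# Hypothetical market baskets used only for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
itemsets = frequent_itemsets(baskets, min_support=0.5)
```

With these sample baskets every single item and every pair reaches the 0.5 threshold, while the three-item set does not, so it is pruned at the third level.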
:
Autoclass C (http://ic.arc.nasa.gov/projects/bayes-group/autoclass/autoclass-c-program.html,
http://ic.arc.nasa.gov), " " NASA,
- Unix Windows;
CLUTO (http://www.cs.umn.edu/~karypis/cluto).
,
;
Databionic ESOM Tools (http://databionic-esom.sourceforge.net/).
, ,
ESOM - ;
MCLUST/EMCLUST (http://www.stat.washington.edu/fraley/mclust_home.html).
(model-based) , .
- S-PLUS;
PermutMatrix (http://www.lirmm.fr/). ,
,
;
PROXIMUS (http://www.cs.purdue.edu/homes/koyuturk/proximus/).
, ;
ReCkless (http://cde.iiit.net/RNNs/) ,
k- .
;
Snob (http://www.csse.monash.edu.au/), MML
(Minimum Message Length - );
SOM in Excel (http://www.geocities.com/adotsaha/NN/SOMinExcel.html),
Microsoft Excel Angshuman Saha.
,
, ,
.
.
.
.
, 2
. ,
, : , , ,
, .
,
.
Data Mining
Alyuda Forecaster XL (http://www.alyuda.com/forecasting-tool-for-excel.htm).
Excel-
.
- - Excel-
ExcelNeuralPackage (http://www.neurok.ru/demo/enp/demo_enp.htm).
-
. free-
.
, Data Mining
, .
.
. Data Mining-
Business Intelligence, ,
, .
Data Mining.
, , Business Intelligence,
Data Mining, ,
,
.
Miner ,
,
, .
.
"
". Data
Mining ,
,
, .
SAS Enterprise Miner . 23.1.
Enterprise Miner .
,
.
,
Miner Repository. ,
, ,
.
.
.
XML-,
. SAS Enterprise Miner
,
, ,
, .
(SAS Metadata Server),
. Web-
.
- ,
. ,
, .
SAS Enterprise Miner
,
SAS, C, Java
PMML. (
) SAS, Web
.
.
,
Enterprise Miner, -
,
Web
.
Enterprise Miner
,
SAS.
, SAS Enterprise Miner 5.1,
, SAS XML-.
, Java API,
Enterprise Miner
. ,
,
, , OLAP-
.
,
.
Enterprise Miner SAS, ,
SAS ETL Studio, OLAP,
, SAS Text Miner. SAS
- .
,
, :
.
Web-.
SAS.
XML.
.
, .
SAS macro.
Java API.
Web-:
.
, ..
, -,
.
- .
( ).
-
.
.
- .
50 .
SAS ETL Studio SAS Metadata Server:
.
.
.
.
.
N .
.
, .
.
.
.
: , , , ,
, .
: bucketed ( ), ,
.
: ,
, .
,
.
, n .
.
.
.
.
M-.
.
n, , , , ,
, .
, , , ,
.
.
.
.
-
n .
.
logworth-.
:
"" , -
.
,
.
/
.
.
/
, : ,
, , , ,
.
Java- :
.
WHERE.
.
.
, Enterprise Miner,
.
.
.
,
GIF TIF.
- k .
.
.
,
.
,
.
PMML.
- :
, .
, ,
.
.
.
.
.
.
.
PMML.
Web-
-
- .
.
, ,
- R2.
.
.
.
.
.
.
: , ,
.
.
.
, ,
, .
.
.
SAS Code Node
SAS
.
SAS.
.
Enterprise Miner.
, ,
..
.
,
, : , AIC, SBC,
, , ROC, , KS
(-).
, ,
.
.
.
.
, .
: , ,
.
.
.
: , , , ,
.
PMML.
CHAID ( -).
.
C4.5.
.
: -,
F-, , , .
.
.
.
.
:
.
,
.
13 ,
.
.
- ARBORETUM.
:
.
10 .
.
.
.
.
.
PMML.
.
.
.
.
, .
.
.
k- .
.
.
: , , .
.
.
.
.
ROC.
.
().
.
.
.
SAS, C, Java PMML.
, ,
SAS, C Java.
.
.
.
, , ,
.
Data Mining . ,
,
.
SAS -
SAS -
SAS Intelligent Warehousing solutions, . 23.2.
ERP/OLTP-,
ERP/OLTP- ( SAS/ACCESS).
(SAS Data Quality-Cleanse).
(SAS/Warehouse
Administrator).
(SAS Scalable Performance
Data Server).
:
PolyAnalyst - .
PolyAnalyst Workplace.
- PolyAnalyst Knowledge Server.
:
.
PolyAnalyst C++ Microsoft's COM
(ActiveX).
. PolyAnalyst .
24.1.
. 24.1. PolyAnalyst
Workplace - , . Workplace
,
. 24.2.
. 24.2. PolyAnalyst
:
,
, , ,
, drop-down pop-up ,
.
Data Mining PolyAnalyst "".
, , , , ..
.
HTML .
PolyAnalyst
PolyAnalyst 4.6 18 ,
Data Text Mining. Know-How
.
,
,
,
,
.
PolyAnalyst.
, ,
. ,
PolyAnalyst , :
. ,
,
.
.
Memory based Reasoning (MR) - " "
PolyAnalyst "
".
.
" " PolyAnalyst
. MR
, (string data
type), .
PolyAnalyst ,
..
.
Classify (CL) -
CL .
. 0
1. ,
"1", , "0" .
.
Discriminate (DS) -
CL. ,
, ,
, , ,
. CL,
, ,
.
Decision Tree (DT) -
PolyAnalyst ,
(information gain).
, ( )
.
.
DT PolyAnalyst.
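The Decision Tree description above names information gain as the splitting criterion. The following Python sketch shows how information gain can be computed for a categorical attribute. It is an illustration under stated assumptions, not PolyAnalyst's implementation: the helper names `entropy` and `information_gain` and the toy records are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical records: attribute "a" separates the classes, "b" does not.
rows = [
    {"a": "x", "b": "p"},
    {"a": "x", "b": "q"},
    {"a": "y", "b": "p"},
    {"a": "y", "b": "q"},
]
labels = ["yes", "yes", "no", "no"]
gain_a = information_gain(rows, labels, "a")
gain_b = information_gain(rows, labels, "b")
```

A tree builder that uses this criterion would split on "a" first, because it removes all uncertainty about the class, whereas "b" removes none.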
Decision Forest (DF) -
,
, .
PolyAnalyst , (decision
forest). -
. ,
, ,
.
. 24.3.
Text Analysis (TA) -
Text Analysis
.
, / ,
( "-")
.
Data Mining, PolyAnalyst. ,
.
Text Categorizer (TC) -
.
.
Link Terms (LT) -
,
, .
, .
PolyAnalyst :
1. , .
2. , ,
.
-
.
, .
Text OLAP ( ) Taxonomies () -
. Text OLAP
(), . : "[] []
([] [] [])". PolyAnalyst
.
, .
.
Text OLAP,
, .
.
,
.
. :
, , (,
(Link Analysis), ,
, )
.
Data Mining Text Mining
.
PolyAnalyst
.
: , , -
.
Data Mining
.
. , , -
Lift, Gain charts,
. ,
Data Mining:
.
Link Analysis (LA) -
Link Analysis
,
.
Symbolic Rule Language (SRL) -
SRL - PolyAnalyst,
Data Mining
, . SRL
,
, , ,
. SRL
.
Data Mining.
.
, .
, .
,
,
" " (GT-search). ""
, ,
.
,
. ,
, ,
.
""
PolyAnalyst -
(Symbolic Rule Language), : ,
.
, ,
.
,
.
PolyAnalyst
PolyAnalyst . : , (yes/no),
, , ,
.
PolyAnalyst . :
"" (.csv), Microsoft Excel 97/2000, ODBC , SAS data files, Oracle Express, IBM Visual Warehouse.
OLE DB for Data Mining
4.6 PolyAnalyst Microsoft OLE DB for Data Mining
(Version 1.0).
(LR, FD, CL, FC, DT, DF, FL, PN, BA, TB) "Mining
Models" (MM).
OLE DB ADO
, ADO COM-.
SQL- ( SQL for DM). Mining
Models PMML.
"PolyAnalyst DataMining Provider" Microsoft Analysis Services(
SQL Server 2000).
In-place Data Mining
PolyAnalyst OLE DB
PA.
PolyAnalyst SQL-
.
. . 24.4.
PolyAnalyst Scheduler -
PolyAnalyst .
,
,
.
. Scheduler
.
24.1 PolyAnalyst6:
.
24.1. PolyAnalyst
PolyAnalyst 4.6,
: FL, FD, PN, FC, BA, , MB, CL, DS, DT, DF,
LR, LA, TA, TC, LT, SS. , OLE DB.
- MS Windows NT/2000/XP
PolyAnalyst Knowledge Server 4.6, : FL, FD, PN, FC, BA, , MB, CL, DS, DT, DF,
LR, LA, TA, TC, LT, SS. , OLE DB, InPlace Data Mining. - MS Windows NT/2000/XP
server, - MS Windows 98/NT/2000/XP.
/
PolyAnalyst COM - SDK
Data Mining
COM-, ,
WebAnalyst
PolyAnalyst TextAnalyst,
(Data Mining Text Mining),
- WebAnalyst.
WebAnalyst - ,
web- e-business.
WebAnalyst ,
, ,
. ,
WebAnalyst
.
,
(HTTP), - web-.
.
( WebAnalyst),
.
Web-;
;
;
;
;
.
"" WebAnalyst :
Web-; ;
, ; Web-;
.
. 25.1. Cognos
Cognos,
, .
1. .
.
:
o Decision Stream - (data marts),
;
o Impromptu - ,
;
o PowerPlay - ;
o Impromptu Web Reports -
Web;
o Cognos Query - ,
.. Web;
o Visualizer - .
.
.
, , ,
.
-
( drill through):
o PowerPlay - (OLAP) -;
o Impromptu -
( Windows);
o Impromptu Web Reports -
( Web);
o Visualizer - .
.
.
,
.
,
()
.
:
o Visualizer -
;
o PowerPlay ;
o Impromptu ;
o Cognos Query - Web- .
(data mining).
,
,
, :
o Scenario - ;
o 4Thought - ;
o Visualizer .
.
, Access Manager
Cognos.
Access Manager,
.
;
.
Cognos BI , Cognos
Architect.
-.
Cognos.
o
2.
3.
4.
5.
6.
Cognos 4Thought
Cognos 4Thought
.
.
Cognos 4Thought . 4Thought
,
, .
. 25.3 Cognos 4Thought
, 4Thought.
, ..,
). Impromptu
4Thought.
4Thought
( ,
),
. : , ,
, .
3. .
, . ,
4Thought (
, ).
4. . 4Thought ,
;
, ( ),
..
5. . 4Thought
.
, ,
.
6. .
, .
4Thought
.
-
() :
.
(
),
.
Cognos 4Thought , ,
, : " ,
?"
, , ,
,
.
.
Cognos 4Thought ( )
,
. ,
.
4Thought .
,
. 4Thought
,
, .
Cognos (. 25.3)
-
( ).
PowerPlay Transformation
Server.
( )
Access Manager, PowerPlay
Transformation Server.
PowerPlay Impromptu ,
, ,
, 4Thought Scenario -
- , .
.
Cognos .
/-
Upfront, Cognos PowerPlay Enterprise Server.
STATISTICA Data Miner
;
, MS Office;
;
;
;
;
;
COM-,
( Visual Basic
( ), Java, C/C++).
4. Reports - . ,
(, , ).
STATISTICA Data Miner
StatSoft.
, STATISTICA Data Miner Data
Mining, Data Mining:
Feature Selection and Variable Filtering (for very large data sets) -
( ).
. ,
.
Association Rules - .
. ,
: "", 95
100 "B" "".
Interactive Drill-Down Explorer - .
.
,
.
Generalized EM & k-Means Cluster Analysis -
. -
.
, ,
.
Generalized Additive Models (GAM) - (GAM).
, Hastie Tibshirani.
General Classification and Regression Trees (GTrees) -
(GTrees). ,
Breiman, Friedman, Olshen Stone (1984). ,
,
..
.
- ,
,
. , . ,
, .
- .
- - ,
.
,
. , " ",
.
-
.
-
,
(, , ..)
() . ,
(.. ,
) ,
.
.
.
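The Association Rules module listed above is illustrated in the text by a rule that holds in 95 cases out of 100. As a hedged sketch of what such a figure means, the Python snippet below computes the support and confidence of a rule over a list of transactions. The function `rule_metrics` and the synthetic transactions are assumptions for illustration and do not reflect the STATISTICA API.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Transactions that contain every item of the antecedent.
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    # Of those, the transactions that also contain the whole consequent.
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Synthetic data: 100 baskets contain "A"; 95 of them also contain "B" and "C".
transactions = [{"A", "B", "C"}] * 95 + [{"A"}] * 5 + [{"D"}] * 100
support, confidence = rule_metrics(transactions, {"A"}, {"B", "C"})
```

Here the confidence is 95 out of 100 baskets containing "A", while the support is measured against all 200 baskets, which is why the two numbers differ.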
, STATISTICA
, ,
. StatSoft
,
.
, ,
() ,
: , ,
..
Data Miner.
1. Data Miner " " "" (.
25.6). " - " " -
", STATISTICA Data Mining.
2. Boston2.sta STATISTICA.
.
- Low, - Medium
- High Price.
- Cat1 12 - Ord1-Ord12.
, 1012 , Boston2.sta.
. 25.7.
. 25.7.
3. "
", . 25.8.
. 25.8.
( )
( ), , .
OK.
4. " " (
.
260 , .
,
, .
.
,
"".
.
, , Descriptive Statistics Standard Classification Trees with Deployment
(C And RT) . Data Miner .
Data Miner
. / .
5. . ,
, .
( ).
.
STATISTICA.
6. , .
, STATISTICA Data Miner
,
, .
, .
, .
.
Oracle Data Mining API. Java API Java
JDM ( Data Mining).
Data Mining 10g ,
26.1.
26.1. , Oracle Data Mining
Apriori Algorithm
;
;
.
;
;
.
, ABN ( ).
< 200.
, ABN.
.
( ).
, NB.
.
.
. Support Vector Machine.
- ,
. .
- Minimum Description Length (MDL).
Deductor
Deductor ( -
BaseGroup Labs [115]). Deductor :
Deductor Studio Deductor Warehouse [48] .
Deductor . 26.1.
. 26.1. Deductor
Deductor Warehouse - ,
.
,
. Deductor Warehouse ,
.
Deductor Studio - ,
. , ,
. Deductor Studio ,
,
.
.
-
. Deductor Warehouse
. , :
;
Microsoft Excel;
Microsoft Access;
Dbase;
CSV-;
ADO- - ODBC- (Oracle, MS
SQL, Sybase ).
, - ,
.
, ,
.
. , ,
( ), .
. ,
, .
,
.
.
,
. ,
.
- - ,
.
Deductor Studio
Deductor Studio
:
;
;
;
.
, , ,
. , ,
,
, , . :
Deductor Warehouse ;
Microsoft Excel;
Microsoft Word;
HTML;
XML;
Dbase;
Windows;
.
.
, " ",
. ,
, , .
, ,
.
.
. 26.4.
Deductor Warehouse - ,
.
"",
, "" . .
26.5.
. 26.5. ""
.
"" .
Deductor Warehouse ,
.
Deductor Warehouse ? -
, ,
.
,
. ,
.
, "" "",
,
.
Deductor Warehouse , ..
.
, ,
. Deductor
Studio.
,
.
. ,
, , , .
. , ,
, , .
Data Mining. , ,
, .
.
. 26.6 , ,
.
. 26.6. , Deductor
1.
,
, , , .
, . ,
""
.
.
- .
-
.
, .
( ). ,
, ,
.
.
, ,
.
.
-
. .
,
,
. ,
.
. , ,
.
- -. ,
. ""
: , ""
.
( 7-9) ,
("" ).
-
"" .
: ,
"", - ""
.
:
, .
, ..
:
1. ;
2. .
, .
,
() () .
- ""
"", "" "".
2.
, ,
.
, . ,
( ,
).
, , ,
. ,
, , .
: ( ,
),
, , .
.
. ,
0 10, 10 0
1, 1 2 .. 0 , 1 - , 9
10 - .
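The example above appears to describe splitting a range from 0 to 10 into ten unit intervals (0 to 1, 1 to 2, and so on up to 9 to 10). A minimal Python sketch of such equal-width binning follows. The function name `bin_index`, its parameters and the choice to place the upper bound in the last interval are assumptions for illustration, not Deductor's documented behavior.

```python
def bin_index(value, low=0.0, high=10.0, bins=10):
    """Map a value in [low, high] to the index of its equal-width interval."""
    if not low <= value <= high:
        raise ValueError("value lies outside the binning range")
    width = (high - low) / bins
    # The upper bound itself is assigned to the last interval.
    return min(int((value - low) / width), bins - 1)


first = bin_index(0.5)    # falls in the interval 0 to 1
second = bin_index(1.0)   # falls in the interval 1 to 2
last = bin_index(9.7)     # falls in the interval 9 to 10
edge = bin_index(10.0)    # upper bound, kept in the last interval
```

Each interval index can then be replaced by a label or by a representative value, depending on what the later processing step needs.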
, , (,
, ) .
, , ,
.
"" .
, , .
, 0 - "", 1 - "", 2 - "". "" - "", "" "", "" - "", "" - "".
.
,
. ,
, " ",
( ).
" "
, ,
,
.
( -
, - "" ).
,
, ,
, .
(, , , , ). ,
,
, .
, , , - ,
, , , .
- .
.
, .
Deductor Studio , , "". -,
- .
- .
.
- ,
.
, ,
.
"" , ,
- . , ,
,
.
, , ,
. Deductor Studio ,
/
.
,
, .. .
, .
,
( ,
).
, .
0 1.
,
() .
.
. ( )
,
.
3. Data Mining
Data Mining Deductor :
;
;
;
;
;
;
.
,
Deductor .
KXEN
Data Mining. KXEN,
- [116],
1998 . KXEN "Knowledge eXtraction
Engines" - "" .
, KXEN
[117]. KXEN , .
KXEN - ,
Data Mining .
.
, KXEN (
, ) , Data Mining.
- KXEN -
. ,
.
KXEN :
/ ( .. );
/;
;
( ).
, .. . -
"" ,
( ,
).
KXEN , ,
, :
; , ;
; ;
.
.
KXEN ,
( )
. Data Mining
KXEN . 27.1.
, KXEN
on-line "-".
, ,
,
.
KXEN :
: , KXEN
( DB2, Oracle MS
SQL Server, .. ODBC);
, :
+ score-;
:
C++, XML, PMML, HTML, AWK, SQL, JAVA, VB, SAS,
.
. ,
KXEN, .
KXEN
.
-
, KXEN.
, ,
-.
KXEN Data Mining.
, KXEN
. ,
, , .
,
, , .
KXEN
1990- .
,
. ,
, .
.
? ,
,
. : " , -
, ?" : "".
.
, ,
, .
KXEN. , KXEN
.
. KXEN, , .
.
, KXEN,
.
( ),
. (Structured
Risk Minimization). KXEN ,
,
.
,
-
. , - .
( , ,
..)
, ,
.
KXEN ? KXEN
. KXEN .
, " " (Data Manipulation),
(, ), .
, ,
.
. KXEN
, , - ,
.
.
KXEN ,
, .. ,
, , ,
, .
.
,
.
( ),
- .
, ,
. :
1. API.
2. .
3.
.
. KXEN
. KXEN
, .. " " ( ). ,
, .
4. .
on-line, ,
, Java, SQL, PMML .
KXEN
.
, KXEN
. -
; ,
. KXEN
, .
KXEN Clementine,
Data Mining SPSS,
, KXEN
Data Mining.
, : "
KXEN ,
() Data Mining?"
: ,
. ,
.
KXEN API,
-. ,
,
,
, .
KXEN
,
.
. KXEN
, . ,
,
, , :
;
;
;
,
.
KXEN.
(, ,
) (,
).
SQL
. KEL ,
KXEN.
" " ,
.
, .
, .
, K2R
( 10 000). K2R
,
.
.
.
. KTS ,
, .
:
.
, , IOLAP™ KXEN -
,
.
OLAP-
, , .
. ,
, , 200
?
IOLAP:
,
.
( ).
.
.
IOLAP ,
KXEN, Microsoft Excel. IOLAP
OLAP- .
Data Mining
Data Mining,
. ,
: -
.
Data Mining . ,
,
, ,
,
,
.
, . ,
,
, ,
.
,
, , ,
-.
,
.
, " "
, Data Mining ( ,
1-2 , ,
). -
,
: " "
.
,
Data Mining. (
) .
,
, , , Data Mining.
: Data Mining /
.
Data Mining-
-. KDnuggets
, Data
Mining.
Data Mining -
Two Crows (www.twocrows.com). Data
Mining, ,
Data Mining . Data Mining , Two Crows.
Data Mining-
, :
. , , Arvato Business Intelligence, www.arvatobi.fr,
Data Mining , ,
.
.
, Blue Hawk LLC, www.bluehawk.biz, Data Mining
Direct Marketing CRM.
Data Mining.
Bayesia, (www.bayesia.com), "
" . Visual Analytics
(www.visualanalytics.com) -
Data Mining.
, Data Mining
.
. Data
Mining ,
.
, . , -,
( ), -,
( ). ,
,
,
.
.
, -
.
.
,
.
-
:
,
; , ,
, "" ""
"", "" "" - ,
. ,
, ,
.
.
, .
, Data Mining .
SnowCactus,
Data Mining.
SnowCactus Data Mining
- [118], :
, ;
;
;
;
;
;
.
-
,
.
SnowCactus
.
,
Data Mining .
. 28.1. , , Data
Mining. , Data Mining
, ,
.
.
1. -
-.
:
,
.
, .
, .
,
Data Mining.
2.
- ,
,
. -
,
,
.
3.
. ,
.
4.
- - .
, ,
, . -
.
5.
. , , -,
, ,
, , ,
.
-
, -. ,
. -,
,
.
, -
. " ?" .
dm-Score -
- .
1. dm-Score (,
- ).
dm-Score (dm - Data Mining)
.
, ..
, ,
(), ,
.
, ..,
- , , ,
,
.
dm-Score ,
. , dm-Score
, , .. -.
, ,
-.
dm-Score :
( );
.
, ;
( )
;
;
;
, ..
, ..;
(
);
;
, .
, dm-Score ,
: ,
, .
, .
, ,
.
dm-Score . 28.2.
. 28.2. dm-Score
dm-Score -
.
( ). dm-Score
,
Data Mining. ,
.
dm-Score ,
.
,
. , dm-Score ,
, -
.
, ,
, .
.
(
, ), dm-Score .
( ).
dm-Score
, ,
.. ,
, ,
.
dm-Score ,
, ,
.. , dm-Score,
, .
Data
Mining , ,
.
- . Data Mining, ,
,
.
. , Data Mining
. ,
- , . .
. , Data Mining
, , .
, Data Mining ,
, . ,
.
. ,
. ,
Data Mining, ,
. ,
.
.
. ,
, - , , -
, ..
.
. :
,
(, )
..
2. : - .
- ,
Data Mining .
IT-,
.
,
. , -
.
Data Mining,
, ..
,
. -
Data Mining.
()
, , ,
- ,
, . ,
, 20-25 .
.
?
, ,
35-45 ,
.
, 20-25 .
: Data Mining , IT-
. ?
,
.
, Data Mining
, , .
Data Mining
,
,
. Data Mining
. ,
, .