(voron@forecsys.ru),
..-.. . .
( + )
( + )
. .
1 / 41
, ,
, ,
( )
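$X$: the set of objects; $Y$: the set of answers;
$y\colon X \to Y$: the unknown dependence.

Given: training objects $x_i = (x_i^1, \dots, x_i^n)$ with known answers $y_i = y(x_i)$, $i = 1, \dots, \ell$:
\[
\begin{pmatrix} x_1^1 & \dots & x_1^n \\ \dots & \dots & \dots \\ x_\ell^1 & \dots & x_\ell^n \end{pmatrix}
\xrightarrow{\;y\;}
\begin{pmatrix} y_1 \\ \dots \\ y_\ell \end{pmatrix}
\]

Find: a function $a\colon X \to Y$ that gives answers on test objects $\tilde x_i = (\tilde x_i^1, \dots, \tilde x_i^n)$, $i = 1, \dots, k$:
\[
\begin{pmatrix} \tilde x_1^1 & \dots & \tilde x_1^n \\ \dots & \dots & \dots \\ \tilde x_k^1 & \dots & \tilde x_k^n \end{pmatrix}
\xrightarrow{\;a?\;}
\begin{pmatrix} a(\tilde x_1) \\ \dots \\ a(\tilde x_k) \end{pmatrix}
\]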
($|Y| < \infty$):
x ; y , ;
x ; y ;
x ; y ;
x ; y / ;
x ; y ;
x ; y ;
x ; y : / ;
x ; y ;
($Y = \mathbb{R}$ or $Y = \mathbb{R}^m$):
x ; y ;
x h,i; y ;
x . ; y ;
x . ; y ;
x ; y ;
:
();
(, );
( );
;
;
;
;
;
.
:
( );
;
;
;
;
.
, L = 98. .
[Figure: plot with both axes in %.]
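$\mathbb{X} = \{x_1, \dots, x_L\}$: a finite set of objects;
$A = \{a_1, \dots, a_D\}$: a finite set of algorithms;
$I(a, x) = [\text{algorithm } a \text{ errs on object } x]$: the error indicator.

The $L \times D$ error matrix:

              a1  a2  a3  a4  a5  a6  aD
  x_1          1   1   0   0   0   1   1
  ...          0   0   0   0   1   1   1
  x_l          0   0   1   0   0   0   0
  x_{l+1}      0   0   0   1   1   1   0
  ...          0   0   0   1   0   0   1
  x_L          0   1   1   1   1   1   0

Rows $x_1, \dots, x_\ell$ form the training sample $X$; rows $x_{\ell+1}, \dots, x_L$ form the test sample $\bar X$ of size $k = L - \ell$.

$n(a, X) = \sum_{x \in X} I(a, x)$: the number of errors of $a \in A$ on a sample $X \subseteq \mathbb{X}$;
$\nu(a, X) = \frac{1}{|X|}\, n(a, X)$: the error rate of $a$ on $X$.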
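The two quantities above are plain column sums over rows of the error matrix. The sketch below illustrates them on a small hypothetical matrix; the matrix values, the split into `TRAIN` and `TEST`, and the function names are assumptions made for the example.

```python
# Hypothetical error matrix: rows are objects, columns are algorithms.
# ERRORS[i][j] == 1 means that algorithm j errs on object i.
ERRORS = [
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]

def n_errors(errors, algo, sample):
    """n(a, X): number of errors of algorithm `algo` on the objects in `sample`."""
    return sum(errors[i][algo] for i in sample)

def error_rate(errors, algo, sample):
    """nu(a, X) = n(a, X) / |X|."""
    return n_errors(errors, algo, sample) / len(sample)

TRAIN = [0, 1, 2]  # indices of the training sample, l = 3
TEST = [3, 4, 5]   # indices of the test sample, k = L - l = 3

for algo in range(len(ERRORS[0])):
    print(algo, error_rate(ERRORS, algo, TRAIN), error_rate(ERRORS, algo, TEST))
```

Comparing the two printed rates per algorithm shows how an algorithm that looks good on the training rows can do worse on the test rows.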
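Learning method (empirical risk minimization):
\[ \mu(X) = \arg\min_{a \in A} \nu(a, X). \]
All splits $\mathbb{X} = X \sqcup \bar X$, $|X| = \ell$, $|\bar X| = k$, are equally likely:
\[ \mathsf{P} \equiv \mathsf{E} \equiv \frac{1}{C_L^\ell} \sum_{X \subset \mathbb{X}}. \]
Complete cross-validation:
\[ \mathrm{CCV}(\mu, \mathbb{X}) = \mathsf{E}\, \nu\bigl(\mu(X), \bar X\bigr). \]
Probability of overfitting:
\[ Q_\varepsilon(\mu, \mathbb{X}) = \mathsf{P}\bigl[\, \nu\bigl(\mu(X), \bar X\bigr) - \nu\bigl(\mu(X), X\bigr) \ge \varepsilon \,\bigr]. \]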
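For a small error matrix both functionals can be computed exactly by enumerating every split of the full set into a training part of size l and a test part of size k = L - l. The sketch below does this in plain Python; the tie-breaking rule in `erm` (lowest column index wins) and all function names are assumptions of the example, not part of the slides.

```python
from itertools import combinations

def erm(errors, train):
    """Empirical risk minimization: the algorithm with the fewest errors on `train`.
    Ties go to the lowest column index (an assumption of this sketch)."""
    num_algos = len(errors[0])
    counts = [sum(errors[i][a] for i in train) for a in range(num_algos)]
    return counts.index(min(counts))

def ccv_and_overfit_prob(errors, l, eps):
    """Return (CCV, Q_eps) by averaging over all C(L, l) equally likely splits."""
    L = len(errors)
    k = L - l
    total_test_rate = 0.0
    overfit = 0
    splits = 0
    for train in combinations(range(L), l):
        test = [i for i in range(L) if i not in train]
        a = erm(errors, train)
        nu_train = sum(errors[i][a] for i in train) / l
        nu_test = sum(errors[i][a] for i in test) / k
        total_test_rate += nu_test
        # Small tolerance so that exact ties with eps count as overfitting.
        if nu_test - nu_train >= eps - 1e-12:
            overfit += 1
        splits += 1
    return total_test_rate / splits, overfit / splits

# One algorithm that errs on a single object out of four.
print(ccv_and_overfit_prob([[1], [0], [0], [0]], l=2, eps=0.5))
```

Full enumeration costs C(L, l) iterations, so it is only practical for toy examples; it is useful mainly as a reference against which bounds can be checked.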
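Theorem (Vapnik, Chervonenkis, 1971)
For any $\mathbb{X}$, $A$, $\mu$, $\varepsilon \in [0, 1]$, and $\ell = k$:
\[ Q_\varepsilon(\mu, \mathbb{X}) \le |A| \cdot \tfrac{3}{2} \exp(-\varepsilon^2 \ell). \]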
:
|A|.
:
108 1011 ;
= 106 1010 .
:
L D;
I (a, x), X, .
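To get a feeling for how loose a bound of the form |A| * 3/2 * exp(-eps^2 * l) is, it helps to evaluate it numerically. The sketch below assumes that reading of the bound; the function names and the example numbers are hypothetical.

```python
import math

def vc_bound(num_algorithms, l, eps):
    """Right-hand side |A| * 3/2 * exp(-eps^2 * l), for l = k."""
    return num_algorithms * 1.5 * math.exp(-(eps ** 2) * l)

def min_sample_size(num_algorithms, eps, eta):
    """Smallest l for which the bound does not exceed eta."""
    return math.ceil(math.log(1.5 * num_algorithms / eta) / eps ** 2)

# Hypothetical setting: 1000 algorithms, accuracy 0.05, tolerated probability 0.05.
l_needed = min_sample_size(1000, eps=0.05, eta=0.05)
print(l_needed, vc_bound(1000, l_needed, 0.05))
```

Because the bound depends on the family only through |A|, it cannot distinguish a family of nearly identical algorithms from a family of very different ones.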
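Order relations on $A$:
$a \le b$: $I(a, x) \le I(b, x)$ for all $x \in \mathbb{X}$;
$a \prec b$: $a \le b$ and $\|b - a\| = 1$ (the error vectors differ on exactly one object).
Graph $\langle A, E \rangle$:
$A$ is the set of vertices;
$E = \{(a, b) : a \prec b\}$ is the set of edges.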
:
6 A;
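each edge $(a, b)$ corresponds to an object $x_{ab} \in \mathbb{X}$ such that $I(a, x_{ab}) = 0$, $I(b, x_{ab}) = 1$;
the layers are $A_m = \{a \in A : n(a, \mathbb{X}) = m\}$, $m = 0, \dots, L$;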
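The edges and layers can be read directly off the binary error vectors. The following sketch builds both for a small hypothetical family; the vectors and function names are assumptions of the example.

```python
# Hypothetical error vectors: one list per algorithm, one position per object.
VECTORS = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
]

def edges(vectors):
    """All triples (a, b, x_ab): a <= b componentwise, Hamming distance 1,
    and x_ab is the index of the single object on which they differ."""
    result = []
    for a, va in enumerate(vectors):
        for b, vb in enumerate(vectors):
            diff = [y - x for x, y in zip(va, vb)]
            if min(diff) >= 0 and sum(diff) == 1:
                result.append((a, b, diff.index(1)))
    return result

def layers(vectors):
    """Group algorithm indices by the total number of errors m."""
    by_m = {}
    for a, v in enumerate(vectors):
        by_m.setdefault(sum(v), []).append(a)
    return by_m

print(edges(VECTORS))
print(layers(VECTORS))
```

The pairwise loop is quadratic in the number of algorithms, which is fine for illustration but not for large families.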
[Figure: the graph built layer by layer (layers 0, 1, 2); each algorithm is shown as a binary error vector over the objects x1, ..., x10.]
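Let $a \in A$.
Upper connectivity $u(a)$: the number of edges of the graph leaving $a$:
$u(a) = |X'_a|$, $X'_a = \{x_{ab} \in \mathbb{X} : a \prec b\}$.
Inferiority $q(a)$: the number of objects on which $a$ errs while some better algorithm does not:
$q(a) = |X''_a|$, $X''_a = \{x \in \mathbb{X} : \exists\, b \in A,\ b < a,\ I(b, x) < I(a, x)\}$.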
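Both characteristics can be computed from the error vectors alone. The sketch below follows the two set definitions literally; the example vectors and the function names are hypothetical, and the error vectors are assumed to be pairwise distinct.

```python
# Hypothetical error vectors: one list per algorithm, one position per object.
VECTORS = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
]

def upper_connectivity(vectors, a):
    """u(a): number of objects x_ab over all b with a <= b and Hamming distance 1."""
    va = vectors[a]
    objects = set()
    for vb in vectors:
        diff = [y - x for x, y in zip(va, vb)]
        if min(diff) >= 0 and sum(diff) == 1:
            objects.add(diff.index(1))
    return len(objects)

def inferiority(vectors, a):
    """q(a): number of objects on which a errs while some b < a does not."""
    va = vectors[a]
    objects = set()
    for vb in vectors:
        is_below = all(x <= y for x, y in zip(vb, va)) and vb != va
        if is_below:
            objects.update(i for i, (x, y) in enumerate(zip(vb, va)) if x < y)
    return len(objects)

for a in range(len(VECTORS)):
    print(a, upper_connectivity(VECTORS, a), inferiority(VECTORS, a))
```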
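Theorem (2010)
For any $\mathbb{X}$, $A$, and $\varepsilon \in (0, 1)$:
\[ Q_\varepsilon(\mu, \mathbb{X}) \le \sum_{a \in A} \frac{C_{L-u-q}^{\ell-u}}{C_L^\ell}\, H_{L-u-q}^{\ell-u,\, m-q}\Bigl(\tfrac{\ell}{L}(m - \varepsilon k)\Bigr), \]
where
$u = |X'_a|$ is the upper connectivity of $a$,
$q = |X''_a|$ is the inferiority of $a$,
$m = n(a, \mathbb{X})$ is the number of errors of $a$ on the full set, and
\[ H_L^{\ell, m}(z) = \sum_{s=0}^{\lfloor z \rfloor} \frac{C_m^s\, C_{L-m}^{\ell-s}}{C_L^\ell} \]
is the hypergeometric distribution function.
For each algorithm: $\mathsf{P}[\mu(X) = a] \le C_{L-u-q}^{\ell-u} / C_L^\ell$.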
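The right-hand side of such a bound is a sum of one term per algorithm, each a binomial-coefficient weight times a hypergeometric tail. The sketch below evaluates it with exact integer binomials from the standard library; it assumes the reading C(L-u-q, l-u) / C(L, l) * H(l/L * (m - eps * k)) of each term, and the function names and the guard for impossible cases are additions of this example.

```python
from math import comb, floor

def hyper_cdf(L, l, m, z):
    """H_L^{l,m}(z) = sum_{s=0}^{floor(z)} C(m, s) * C(L-m, l-s) / C(L, l)."""
    top = min(floor(z), l, m)
    total = sum(comb(m, s) * comb(L - m, l - s) for s in range(0, top + 1))
    return total / comb(L, l)

def sc_term(L, l, m, u, q, eps):
    """Contribution of one algorithm with m errors, upper connectivity u, inferiority q."""
    k = L - l
    if u > l or q > k:
        return 0.0  # the algorithm cannot be selected on any split
    weight = comb(L - u - q, l - u) / comb(L, l)
    return weight * hyper_cdf(L - u - q, l - u, m - q, l / L * (m - eps * k))

def sc_bound(L, l, eps, profile):
    """Sum of terms over algorithms; `profile` is a list of (m, u, q) triples."""
    return sum(sc_term(L, l, m, u, q, eps) for m, u, q in profile)

# A single algorithm with one error on four objects, l = k = 2.
print(sc_bound(4, 2, 0.5, [(1, 0, 0)]))
```

For a one-algorithm family the value 0.5 printed here coincides with the overfitting probability obtained by enumerating all six splits directly.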
( )
( )
CCV Q -
A, CCV Q
[.]
[., ., .]
- [.]
[.]
[.]
CCV :
[.],
[., .],
[.]
( )
-
.
.
.
. . .
. 2011.
http://www.machinelearning.ru/wiki/images/d/d9/Voron-2011-tnop.pdf
, ,
z(t, ) =
Xi (t)Yi ()
: z(t, ) -;
: Xi (t) i- ,
Yi () i- .
-
$I(p, k) = \sum_{g} a_{pg}\, C_{gk}$
: I (p, k) p- k- ;
: apg p- g - ,
Cgk g - k- .
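$p(w \mid d) = \sum_{t} p(w \mid t)\, p(t \mid d)$
Given: $p(w \mid d)$, the frequency of word $w$ in document $d$;
Find: $p(w \mid t)$, the probability of word $w$ in topic $t$,
and $p(t \mid d)$, the probability of topic $t$ in document $d$.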
$p(w \mid d) = \sum_{t \in T} p(w \mid t)\, p(t \mid d)$
[Figure: illustration of the decomposition, with example probabilities 0.023, 0.016, 0.009, 0.014, 0.009, 0.006, 0.018, 0.013, 0.011.]
" , , "#$ :
-
.
GC- GA- .
,
( , )
. .
,
.
( ).
(topic modeling)
, , ,
:
(expert search), ,
( )
, ,
PLSA Probabilistic Latent Semantic Analysis [Hofmann, 1999]
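Log-likelihood maximization:
\[ \sum_{d \in D} \sum_{w \in d} n_{dw} \ln \sum_{t \in T} \varphi_{wt} \theta_{td} \to \max_{\Phi, \Theta}, \]
\[ \varphi_{wt} \ge 0, \quad \sum_{w \in W} \varphi_{wt} = 1; \qquad \theta_{td} \ge 0, \quad \sum_{t \in T} \theta_{td} = 1. \]
As a matrix factorization:
\[ \| F - \Phi \Theta \| \to \min_{\Phi, \Theta}, \]
where
$F = \bigl(n_{dw} / n_d\bigr)_{W \times D}$ holds the empirical estimates of $p(w \mid d)$;
$\Phi = (\varphi_{wt})_{W \times T}$, $\varphi_{wt} = p(w \mid t)$;
$\Theta = (\theta_{td})_{T \times D}$, $\theta_{td} = p(t \mid d)$.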
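EM algorithm
E-step: compute $p(t \mid d, w)$ for all $t, d, w$ by Bayes' rule from the current $\varphi_{wt}$, $\theta_{td}$:
\[ p(t \mid d, w) = \frac{p(w, t \mid d)}{p(w \mid d)} = \frac{p(w \mid t)\, p(t \mid d)}{p(w \mid d)} = \frac{\varphi_{wt} \theta_{td}}{\sum_{s} \varphi_{ws} \theta_{sd}}. \]
M-step: frequency estimates, with $n_{dwt} = n_{dw}\, p(t \mid d, w)$:
\[ \varphi_{wt} = \frac{n_{wt}}{n_t}, \qquad n_{wt} = \sum_{d \in D} n_{dwt}, \qquad n_t = \sum_{w \in W} n_{wt}; \]
\[ \theta_{td} = \frac{n_{dt}}{n_d}, \qquad n_{dt} = \sum_{w \in d} n_{dwt}, \qquad n_d = \sum_{t \in T} n_{dt}. \]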
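The E-step and M-step translate almost line by line into code. The sketch below is a minimal pure-Python version for a dense count matrix; the random initialization, the fixed number of iterations, the function names, and the toy counts are assumptions of this example, not the author's implementation.

```python
import random

def normalize(values):
    """Scale a list of non-negative numbers to sum to 1 (unchanged if all zero)."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else values

def plsa_em(n_dw, num_topics, num_iters=50, seed=0):
    """n_dw[d][w] is the count of word w in document d.
    Returns phi[w][t] = p(w|t) and theta[t][d] = p(t|d)."""
    rng = random.Random(seed)
    D, W, T = len(n_dw), len(n_dw[0]), num_topics
    # Random positive start; phi_cols[t][w] and theta_cols[d][t] are distributions.
    phi_cols = [normalize([rng.random() + 0.1 for _ in range(W)]) for _ in range(T)]
    theta_cols = [normalize([rng.random() + 0.1 for _ in range(T)]) for _ in range(D)]
    for _ in range(num_iters):
        n_wt = [[0.0] * W for _ in range(T)]  # indexed [t][w]
        n_dt = [[0.0] * T for _ in range(D)]  # indexed [d][t]
        for d in range(D):
            for w in range(W):
                if n_dw[d][w] == 0:
                    continue
                # E-step: p(t|d,w) is proportional to phi_wt * theta_td.
                joint = [phi_cols[t][w] * theta_cols[d][t] for t in range(T)]
                p_wd = sum(joint)
                if p_wd == 0:
                    continue
                for t in range(T):
                    n_dwt = n_dw[d][w] * joint[t] / p_wd
                    n_wt[t][w] += n_dwt
                    n_dt[d][t] += n_dwt
        # M-step: phi_wt = n_wt / n_t, theta_td = n_dt / n_d.
        phi_cols = [normalize(col) for col in n_wt]
        theta_cols = [normalize(row) for row in n_dt]
    phi = [[phi_cols[t][w] for t in range(T)] for w in range(W)]
    theta = [[theta_cols[d][t] for d in range(D)] for t in range(T)]
    return phi, theta

# Toy corpus: 4 documents over a 4-word vocabulary.
COUNTS = [[4, 2, 0, 0], [3, 3, 0, 0], [0, 0, 5, 1], [0, 0, 2, 4]]
phi, theta = plsa_em(COUNTS, num_topics=2)
```

The E-step and the accumulation of counts are fused into one pass, so the three-dimensional array of p(t|d,w) values is never stored.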
Weiwei Cui, Shixia Liu, Li Tan, Conglei Shi, Yangqiu Song, Zekai J. Gao, Xin Tong, Huamin Qu. TextFlow: Towards Better Understanding of Evolving Topics in Text // IEEE Transactions on Visualization and Computer Graphics, Vol. 17, No. 12, December 2011.
. n-
(ARTM)
[., 2013]
BigARTM
ARTM ++ [. ., 2014]
,
[.,2012]
[., 2013]
n- [., 2013]
,
PLSA LDA [., 2013]
,
,
0 [., 2013]:
[Figure: two panels, PLSA and LDA.]
wt td
95%,
[., 2013]:
(ARTM)
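Maximization of the log-likelihood with a regularizer $R$:
\[ \sum_{d \in D} \sum_{w \in d} n_{dw} \ln \sum_{t \in T} \varphi_{wt} \theta_{td} + R(\Phi, \Theta) \to \max_{\Phi, \Theta}. \]
EM algorithm, M-step. Without a regularizer:
\[ \varphi_{wt} \propto n_{wt}, \qquad \theta_{td} \propto n_{dt}; \]
with the regularizer $R$:
\[ \varphi_{wt} \propto \Bigl( n_{wt} + \varphi_{wt} \frac{\partial R}{\partial \varphi_{wt}} \Bigr)_{+}, \qquad \theta_{td} \propto \Bigl( n_{dt} + \theta_{td} \frac{\partial R}{\partial \theta_{td}} \Bigr)_{+}. \]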
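The regularized M-step changes only how the accumulated counts are turned into a distribution: a correction term is added to each count, negative values are clipped to zero, and the column is renormalized. The sketch below shows this for a single column; the helper name, the toy counts, and the choice of a sparsing regularizer R = -beta * sum of ln(phi_wt) (for which phi * dR/dphi = -beta) are assumptions made for illustration.

```python
def regularized_column(counts, values, r_grads):
    """Normalized column (n + value * dR/dvalue)_+ for one topic or one document."""
    raw = [max(n + v * g, 0.0) for n, v, g in zip(counts, values, r_grads)]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else raw

# Counts n_wt for one topic over a 3-word vocabulary, and the current phi column.
n_t = [6.0, 3.0, 1.0]
phi_t = [0.5, 0.3, 0.2]

# Zero gradient: the update reduces to plain frequency estimates.
plain_column = regularized_column(n_t, phi_t, [0.0, 0.0, 0.0])

# Sparsing regularizer with beta = 2: each count is reduced by beta, then clipped.
beta = 2.0
sparse_column = regularized_column(n_t, phi_t, [-beta / p for p in phi_t])

print(plain_column)
print(sparse_column)
```

With the sparsing term the rarest word drops to exactly zero, which is the effect the positive-part operation is there to produce.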
( )
?
?
?
(, ,
, )?
..:
Rn+1 Rn , Tn+1 Tn n+1 n .
..
1
,
600
599-
6-
216
(2- )
10
20 ( + )
50
20
-
: .
: .
1-4
: .
.
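$(x^1, \dots, x^n) \in \mathbb{R}^n$: the feature description of an object $x$;
linear classifier:
\[ a(x, \theta) = \operatorname{sign} \sum_{j=1}^{n} \theta_j x^j = \operatorname{sign} \langle \theta, x \rangle. \]
Training by minimization of the regularized empirical risk:
\[ Q(\theta) = \sum_{i=1}^{\ell} \mathcal{L}\bigl( \langle \theta, x_i \rangle\, y_i \bigr) + \frac{\tau}{2} \|\theta\|^2 \to \min_{\theta}. \]
Loss functions $\mathcal{L}(M)$ of the margin $M$:
$\mathcal{L}(M) = \log(1 + e^{-M})$: logistic regression;
$\mathcal{L}(M) = (1 - M)_{+}$: SVM;
$\mathcal{L}(M) = e^{-M}$: AdaBoost;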
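The three margin losses differ only in the function applied to the margin, so one training loop can serve all of them. The sketch below minimizes the regularized sum of losses by plain gradient descent; the symbols `theta` and `tau`, the step size, the number of steps, and the toy data are assumptions of this example rather than the method used in the slides.

```python
import math

# Margin losses L(M) and their derivatives dL/dM.
LOSSES = {
    "logistic": lambda M: math.log1p(math.exp(-M)),
    "hinge": lambda M: max(1.0 - M, 0.0),
    "exponential": lambda M: math.exp(-M),
}
LOSS_DERIVATIVES = {
    "logistic": lambda M: -1.0 / (1.0 + math.exp(M)),
    "hinge": lambda M: -1.0 if M < 1.0 else 0.0,  # a subgradient
    "exponential": lambda M: -math.exp(-M),
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def objective(theta, X, y, tau, loss="logistic"):
    """Q(theta) = sum_i L(<theta, x_i> * y_i) + tau / 2 * ||theta||^2."""
    data_term = sum(LOSSES[loss](dot(theta, x) * yi) for x, yi in zip(X, y))
    return data_term + 0.5 * tau * dot(theta, theta)

def fit(X, y, tau=0.1, loss="logistic", lr=0.05, steps=500):
    """Gradient descent on Q(theta), starting from the zero vector."""
    theta = [0.0] * len(X[0])
    dloss = LOSS_DERIVATIVES[loss]
    for _ in range(steps):
        grad = [tau * t for t in theta]
        for x, yi in zip(X, y):
            coef = dloss(dot(theta, x) * yi) * yi
            for j, xj in enumerate(x):
                grad[j] += coef * xj
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def classify(theta, x):
    """a(x, theta) = sign <theta, x>, with ties mapped to +1."""
    return 1 if dot(theta, x) >= 0 else -1

# Toy linearly separable data with labels in {-1, +1}.
X_TOY = [[1.0, 0.0], [2.0, 1.0], [-1.0, 0.0], [-2.0, -1.0]]
Y_TOY = [1, 1, -1, -1]
theta_hat = fit(X_TOY, Y_TOY)
print(theta_hat, [classify(theta_hat, x) for x in X_TOY])
```

Switching the `loss` argument changes which of the three methods the same loop approximates; the hinge loss is not differentiable at M = 1, so a subgradient is used there.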
5
,
( 12 ).
.
http://www.MachineLearning.ru/wiki/index.php?title=User:Vokov
http://www.MachineLearning.ru/wiki/images/e/e3/Voron-2014-task-ekg.pdf
http://www.MachineLearning.ru/wiki/images/3/37/Voron-2014-task-ekg-data.rar
?
:
Python, scikit-learn: scikit-learn.org
RapidMiner: rapidminer.com
WEKA: www.cs.waikato.ac.nz/ml/weka
:
kaggle.com
UCI:
archive.ics.uci.edu/ml
:
Poligon.MachineLearning.ru
- :
www.MachineLearning.ru
: :Vokov
:
voron@forecsys.ru
www.MachineLearning.ru/wiki, :Vokov
strijov@forecsys.ru
http://www.strijov.com
www.MachineLearning.ru/wiki, :Strijov