LSA
PageRank [Brin & Page, 1998]
Transition matrix of a Markov chain: Ranks:
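A minimal sketch of how ranks can be read off a Markov chain transition matrix by power iteration. The three-page link graph, the damping value 0.85, and the function name `pagerank` are illustrative assumptions and are not taken from the slides.

```python
import numpy as np

def pagerank(adjacency, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank on a small dense link graph.

    adjacency[i, j] = 1 means page i links to page j.
    The damping value is an assumed, commonly used setting.
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    out_degree = A.sum(axis=1)
    # Row-stochastic transition matrix; pages with no out-links jump uniformly.
    P = np.where(out_degree[:, None] > 0,
                 A / np.maximum(out_degree[:, None], 1.0),
                 1.0 / n)
    # Mix in uniform teleportation so the chain has a unique stationary vector.
    G = damping * P + (1.0 - damping) / n
    ranks = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_ranks = ranks @ G
        if np.abs(new_ranks - ranks).sum() < tol:
            return new_ranks
        ranks = new_ranks
    return ranks

# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
links = np.array([[0, 1, 1],
                  [0, 0, 1],
                  [1, 0, 0]])
ranks = pagerank(links)
```

The ranks are the stationary distribution of the modified chain, so they are positive and sum to one.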
[Figure: terms (car, service, military, repair) linked to documents (d1, d2, d3)]
Multi-Link Graphs
Terms Documents
Tensors
Notation: scalar, vector, matrix, tensor
Tensor order: number of dimensions
Example: an I x J matrix A = [aij] and an I x J x K tensor X = [xijk]
The matrix A (at left) has order 2. The tensor X (at left) has order 3 and its 3rd mode is of size K.
Tensors
An I x J x K tensor X = [xijk] is a 3rd order tensor:
mode 1 has dimension I
mode 2 has dimension J
mode 3 has dimension K
Horizontal Slices
Lateral Slices
Frontal Slices
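A short NumPy sketch of the three slice types, assuming the usual convention that a horizontal slice fixes the mode-1 index, a lateral slice fixes the mode-2 index, and a frontal slice fixes the mode-3 index. The 3 x 4 x 2 tensor and its entries are made up for illustration.

```python
import numpy as np

# Hypothetical 3 x 4 x 2 tensor (I = 3, J = 4, K = 2) filled with 0..23.
I, J, K = 3, 4, 2
X = np.arange(I * J * K).reshape(I, J, K)

order = X.ndim            # number of modes (3 for this tensor)
mode_sizes = X.shape      # (I, J, K)

horizontal = X[0, :, :]   # fix the mode-1 index: a J x K slice
lateral = X[:, 0, :]      # fix the mode-2 index: an I x K slice
frontal = X[:, :, 0]      # fix the mode-3 index: an I x J slice
```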
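Kronecker product: an M x N matrix with a P x Q matrix gives an MP x NQ matrix
Khatri-Rao product (columnwise Kronecker product): an M x R matrix with an N x R matrix gives an MN x R matrix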
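The size pattern above (M x N with P x Q giving MP x NQ) matches the Kronecker product, and the pair M x R, N x R matches the Khatri-Rao (columnwise Kronecker) product. The sketch below checks those shapes in NumPy. The helper `khatri_rao` is a hypothetical name written for this example, and the random matrices are placeholders.

```python
import numpy as np

def khatri_rao(A, B):
    """Columnwise Kronecker product of A (M x R) and B (N x R), giving MN x R."""
    M, R = A.shape
    N, R_b = B.shape
    if R != R_b:
        raise ValueError("A and B need the same number of columns")
    # Entry (m, n, r) is A[m, r] * B[n, r]; merging (m, n) stacks the
    # Kronecker product of column r of A with column r of B.
    return np.einsum("mr,nr->mnr", A, B).reshape(M * N, R)

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))   # M x N
B = rng.standard_normal((4, 5))   # P x Q
kron_AB = np.kron(A, B)           # MP x NQ = 8 x 15

U = rng.standard_normal((2, 3))   # M x R
V = rng.standard_normal((4, 3))   # N x R
kr_UV = khatri_rao(U, V)          # MN x R = 8 x 3
```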
Observe: For two vectors a and b, the outer product a ∘ b and the Kronecker product a ⊗ b have the same elements, but the first is shaped into a matrix and the second into a vector.
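Reading the observation as a comparison of the outer product and the Kronecker product of two vectors, a small NumPy check makes it concrete. The vectors are arbitrary example values.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20])

outer_ab = np.outer(a, b)   # 3 x 2 matrix with entries a[i] * b[j]
kron_ab = np.kron(a, b)     # length-6 vector with the same entries

# Flattening the matrix row by row recovers the Kronecker product.
same_elements = np.array_equal(outer_ab.ravel(), kron_ab)
```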
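[Figure: an I x J x K tensor shown two ways: as an R x S x T core with factor matrices of size I x R and J x S, and as a sum of R terms built from vectors u1, v1, w1 through uR, vR, wR (factor matrices U, V, W with an R x R x R core)]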
Tucker Decomposition
Three-mode factor analysis
Three-mode PCA
N-mode PCA
Higher-order SVD (HO-SVD)
[Tucker, 1966] [Kroonenberg, et al. 1980] [Kapteyn et al., 1986] [De Lathauwer et al., 2000]
An I x J x K tensor is written as an R x S x T core multiplied by factor matrices A (I x R), B (J x S), and C (K x T)
A, B, and C may be orthonormal (generally, full column rank)
The core is not diagonal
Decomposition is not unique
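A minimal sketch of how a Tucker model is assembled, assuming the elementwise form x_ijk = sum over r, s, t of g_rst * a_ir * b_js * c_kt. All sizes and the random data are illustrative, and the names G, Q, and X_rotated are introduced here for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 5, 4, 3      # tensor dimensions (assumed values)
R, S, T = 2, 3, 2      # core dimensions (assumed values)

G = rng.standard_normal((R, S, T))   # core tensor, generally not diagonal
# Orthonormal factor matrices taken from QR factorizations of random matrices.
A, _ = np.linalg.qr(rng.standard_normal((I, R)))   # I x R
B, _ = np.linalg.qr(rng.standard_normal((J, S)))   # J x S
C, _ = np.linalg.qr(rng.standard_normal((K, T)))   # K x T

# x_ijk = sum over r, s, t of g_rst * a_ir * b_js * c_kt
X = np.einsum("rst,ir,js,kt->ijk", G, A, B, C)

# With orthonormal factors, projecting X back onto them recovers the core.
G_back = np.einsum("ijk,ir,js,kt->rst", X, A, B, C)

# Non-uniqueness: rotating A and counter-rotating the core leaves X unchanged.
Q, _ = np.linalg.qr(rng.standard_normal((R, R)))
G_rotated = np.einsum("rst,rq->qst", G, Q)
X_rotated = np.einsum("rst,ir,js,kt->ijk", G_rotated, A @ Q, B, C)
```

The last two lines give a different core and a different first factor that reproduce the same tensor, which is one way to see that the decomposition is not unique.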
An I x J x K tensor is written with factor matrices A (I x R) and B (J x R) and an R x R x R core
CANDECOMP = Canonical Decomposition
PARAFAC = Parallel Factors
Core is diagonal (specified by a vector)
Columns of A, B, and C are not orthonormal
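A minimal sketch of the CANDECOMP/PARAFAC model as a sum of R rank-one terms, and of the same tensor written with an R x R x R core that is nonzero only on its diagonal. The sizes, the random factors, and the name `weights` for the diagonal entries are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 4, 3, 2, 2

A = rng.standard_normal((I, R))   # columns need not be orthonormal
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))
weights = np.array([2.0, 0.5])    # hypothetical diagonal entries of the core

# Sum of R rank-one terms: x_ijk = sum_r weights[r] * a_ir * b_jr * c_kr
X = np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

# Same tensor via an R x R x R core that is zero off the diagonal.
core = np.zeros((R, R, R))
core[np.arange(R), np.arange(R), np.arange(R)] = weights
X_core = np.einsum("rst,ir,js,kt->ijk", core, A, B, C)
```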
where the definitions use
the frequency of term i in document j
the number of documents that term i appears in
with separate definitions for terms in the titles and for terms in the author-supplied keywords
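The two quantities named above (term frequency and document frequency) are the usual ingredients of a tf-idf style term weight. The sketch below uses plain tf-idf, frequency times log(N / document frequency), purely as an assumed stand-in. The exact weighting on the slide is not shown here, and the count matrix is made up.

```python
import numpy as np

# Hypothetical counts: counts[i, j] is the frequency of term i in document j.
counts = np.array([[2, 0, 1],
                   [0, 3, 0],
                   [1, 1, 1]], dtype=float)

n_docs = counts.shape[1]
doc_freq = (counts > 0).sum(axis=1)   # number of documents that term i appears in

# Assumed weighting (plain tf-idf): frequency times log(N / document frequency).
weights = counts * np.log(n_docs / doc_freq)[:, None]
```

Under this assumed weighting, a term that appears in every document gets weight zero, while a term concentrated in one document is weighted up.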
not symmetric
Community Identification
Communities of papers connected by different link types
Author Disambiguation
List of most prolific authors in collection changes
Multiple author disambiguation
Journal Prediction
Co-reference information to define journal characteristics
Interior Point Methods (Not related)
Sparse approximate inverses (Arguably distantly related)
Graph Partitioning (Related)
Graphs (Related)
T Chan (2), TM Chan (4)
T Manteufel (3)
S McCormick (3)
G Golub (1)
R = 15
R = 20
R = 25
Applications
Network analysis, web analysis, multi-way clustering, data fusion, cross-language IR, feature vector generation
Different operations, data partitioning issues
Thank You
Extra Slides