Application to images
Václav Hlaváč
Czech Technical University in Prague
Faculty of Electrical Engineering, Department of Cybernetics
Center for Machine Perception
http://cmp.felk.cvut.cz/hlavac, hlavac@fel.cvut.cz
Outline of the lecture:
Principal components, informal idea.
Needed linear algebra.
Least-squares approximation.
PCA derivation, PCA for images.
Drawbacks. Interesting behaviors live in manifolds.
Subspace methods, LDA, CCA, . . .
Eigen-analysis
This linear space has a natural orthogonal basis, which allows the data to be expressed as a linear combination with respect to the coordinate system induced by this basis.
Goal:
Find another basis of the vector space, one which captures the variations in the data better.
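As a small illustration of expressing data with respect to a coordinate system induced by a basis, the sketch below (plain NumPy; the 3-D basis and the vector are arbitrary choices, not taken from the lecture) computes the coordinates of a vector in an orthonormal basis and recovers the vector as a linear combination of the basis vectors:

```python
import numpy as np

# Sketch: coordinates of a vector with respect to an orthonormal basis are just
# dot products with the basis vectors (the 3-D basis below is an arbitrary example).
Q, _ = np.linalg.qr(np.array([[1.0, 2.0, 0.0],
                              [0.0, 1.0, 1.0],
                              [1.0, 0.0, 1.0]]))   # columns of Q: an orthonormal basis
x = np.array([2.0, -1.0, 3.0])

coords = Q.T @ x                  # coordinates of x in the basis given by Q's columns
x_back = Q @ coords               # linear combination of basis vectors recovers x
print(np.allclose(x, x_back))     # True
```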
Geometric motivation, principal components (2)
A = \begin{pmatrix} 1 & 3 & 2 \\ 3 & 5 & 6 \\ 2 & 4 & 3 \end{pmatrix}, \qquad
b = \begin{pmatrix} 5 \\ 7 \\ 8 \end{pmatrix}.

The extended (augmented) matrix is

[A \,|\, b] = \left(\begin{array}{ccc|c} 1 & 3 & 2 & 5 \\ 3 & 5 & 6 & 7 \\ 2 & 4 & 3 & 8 \end{array}\right).
This system has a solution if and only if the rank of the matrix A equals the rank of the extended matrix [A|b].
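A quick numerical check of this rank condition for the example above is sketched below (plain NumPy; only the matrices from the slide are used):

```python
import numpy as np

# Check the rank condition for the concrete 3x3 example above.
A = np.array([[1, 3, 2],
              [3, 5, 6],
              [2, 4, 3]], dtype=float)
b = np.array([5, 7, 8], dtype=float)

rank_A  = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))

print(rank_A, rank_Ab)            # equal ranks -> the system is consistent
if rank_A == rank_Ab == A.shape[1]:
    print(np.linalg.solve(A, b))  # full rank -> the solution is also unique
```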
Similarity transformations of a matrix
Matrices A and B with real or complex entries are called similar if there exists an invertible square matrix P such that P^{-1} A P = B.
Useful properties of similar matrices: they have the same rank, determinant,
trace, characteristic polynomial, minimal polynomial and eigen-values (but
not necessarily the same eigen-vectors).
Let I be the identity matrix, having the value 1 only on the main diagonal and zeros elsewhere.
Any square matrix (over the complex numbers) is similar to a block-diagonal matrix in Jordan normal form,

J = \begin{pmatrix} J_1 & & 0 \\ & \ddots & \\ 0 & & J_p \end{pmatrix}, \qquad where the J_i are Jordan blocks

J_i = \begin{pmatrix} \lambda_i & 1 & & 0 \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_i \end{pmatrix},

in which \lambda_i are the multiple eigen-values.
The multiplicity of the eigen-value gives the size of the Jordan block.
If the eigen-value is not a multiple one, the Jordan block degenerates to a 1 × 1 block containing the eigen-value itself.
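The claim that similar matrices share eigen-values (but not necessarily eigen-vectors) can be checked numerically; the sketch below uses arbitrary random matrices A and P as an illustrative assumption:

```python
import numpy as np

# Sketch: similar matrices B = P^{-1} A P share eigen-values (but generally
# not eigen-vectors). A and P below are arbitrary illustrative choices.
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
P = rng.normal(size=(3, 3))          # a random P is invertible with probability 1
B = np.linalg.inv(P) @ A @ P

eigs_A = np.sort_complex(np.linalg.eigvals(A))
eigs_B = np.sort_complex(np.linalg.eigvals(B))
print(np.allclose(eigs_A, eigs_B))   # True: the spectra coincide
```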
Least-squares approximation
In the deterministic world, the conclusion would be that the system of linear
equations has no solution.
PCA represents the data in a new coordinate system in which basis vectors
follow modes of greatest variance in the data.
Thus, new basis vectors are calculated for the particular data set.
The aim: to reduce the dimensionality of the data so that each observation can be usefully represented with only L variables, 1 ≤ L < M.
The data are observations x_n, n = 1, . . . , N, each described by M variables.
This procedure is not applied to the raw data R but to normalized data X
as follows.
The raw observed data are arranged in an M × N matrix R and the empirical mean is calculated along each row of R. The result is stored in a vector u, the elements of which are the scalars

u(m) = \frac{1}{N} \sum_{n=1}^{N} R(m, n), \qquad m = 1, \dots, M.
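A minimal sketch of this normalization step, assuming the shapes and names R, u, X from the slides and using random numbers as stand-in data:

```python
import numpy as np

# Sketch of the normalization step: R is an M x N matrix with one observation
# per column (M variables, N observations); names R, u, X follow the slides.
rng = np.random.default_rng(2)
M, N = 4, 10
R = rng.normal(size=(M, N))

u = R.mean(axis=1)                        # empirical mean of each row: u(m) = (1/N) sum_n R(m, n)
X = R - u[:, np.newaxis]                  # centered data: subtract the mean from every column
print(np.allclose(X.mean(axis=1), 0.0))   # True: each variable now has zero mean
```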
The mean squared approximation error of representing the centered data x_n in an L-dimensional subspace is

\varepsilon^2 = \frac{1}{N} \sum_{n=1}^{N} |x_n|^2 \;-\; \sum_{i=1}^{L} b_i^\top \left( \frac{1}{N} \sum_{n=1}^{N} x_n x_n^\top \right) b_i,

where b_i, i = 1, \dots, L, are the basis vectors of the linear space of dimension L. If \varepsilon^2 is to be minimized, then the following term has to be maximized:

\sum_{i=1}^{L} b_i^\top \operatorname{cov}(x)\, b_i, \qquad where \operatorname{cov}(x) = \sum_{n=1}^{N} x_n x_n^\top

is the covariance matrix.
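The maximizing basis vectors b_i are the eigen-vectors of cov(x) belonging to the L largest eigen-values. The sketch below checks this numerically (the toy data, the choice of L, and the 1/N covariance convention are illustrative assumptions):

```python
import numpy as np

# Sketch: the term sum_i b_i^T cov(x) b_i is maximized over orthonormal b_i
# by the L eigen-vectors of cov(x) with the largest eigen-values.
rng = np.random.default_rng(3)
M, N, L = 5, 200, 2
X = rng.normal(size=(M, N)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])[:, None]
X = X - X.mean(axis=1, keepdims=True)          # centered data, one observation per column

cov = (X @ X.T) / N                            # covariance matrix (1/N convention)
eigvals, eigvecs = np.linalg.eigh(cov)         # ascending eigen-values, orthonormal eigen-vectors
B = eigvecs[:, -L:]                            # basis b_1, ..., b_L: the top-L eigen-vectors

projected_variance = np.trace(B.T @ cov @ B)   # the maximized term sum_i b_i^T cov(x) b_i
print(projected_variance, eigvals[-L:].sum())  # the two values agree
```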
Approximation error
\varepsilon^2 = \operatorname{trace}\big(\operatorname{cov}(x)\big) - \sum_{i=1}^{L} \lambda_i = \sum_{i=L+1}^{M} \lambda_i,

where trace(A) is the trace (the sum of the diagonal elements) of the matrix A, and \lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_M are the eigen-values of cov(x). The trace equals the sum of all eigen-values.
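A numerical check of this identity is sketched below (the toy data and the 1/N covariance convention are assumptions of the sketch, not part of the slides):

```python
import numpy as np

# Sketch: the mean squared error of keeping only the top-L components equals
# the sum of the discarded eigen-values of cov(x).
rng = np.random.default_rng(4)
M, N, L = 5, 200, 2
X = rng.normal(size=(M, N)) * np.arange(1, M + 1)[:, None]
X = X - X.mean(axis=1, keepdims=True)

cov = (X @ X.T) / N
eigvals, eigvecs = np.linalg.eigh(cov)               # ascending order
B = eigvecs[:, -L:]                                  # kept basis vectors

X_rec = B @ (B.T @ X)                                # project onto the subspace and reconstruct
error = np.mean(np.sum((X - X_rec) ** 2, axis=0))    # mean squared reconstruction error
print(np.allclose(error, eigvals[:-L].sum()))        # True: error = sum of discarded eigen-values
```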
Can we use PCA for images?
The difficulty of the task is to find out empirically from the data in which manifold the data vary.
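For completeness, a minimal sketch of the usual way PCA is applied to images: each image is flattened into a column vector and PCA is run on the resulting data matrix. The image size, the random stand-in images, and the Gram-matrix shortcut below are illustrative assumptions, not taken from the lecture:

```python
import numpy as np

# Minimal sketch of PCA applied to images: each image is flattened into a column
# vector, so a 32x32 image becomes a point in a 1024-dimensional space.
rng = np.random.default_rng(5)
h, w, N, L = 32, 32, 100, 10
images = rng.random(size=(N, h, w))          # stand-in image data

X = images.reshape(N, h * w).T               # one flattened image per column (M x N, M = h*w)
mean_image = X.mean(axis=1, keepdims=True)
Xc = X - mean_image                          # centered data

# For M >> N it is cheaper to eigen-decompose the small N x N Gram matrix
# and map its eigen-vectors back to image space (a standard trick).
G = Xc.T @ Xc / N                            # N x N Gram matrix
gvals, gvecs = np.linalg.eigh(G)
B = Xc @ gvecs[:, -L:]                       # eigen-images (up to scale), one per column
B /= np.linalg.norm(B, axis=0)               # normalize to unit length

codes = B.T @ Xc                             # L-dimensional representation of each image
print(codes.shape)                           # (L, N)
```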
Subspace methods
Subspace methods exploit the fact that data (images) can be represented in a subspace of the original vector space in which the data live.
Examples of different methods:
Method (abbreviation) and its key property:

Principal Component Analysis (PCA): reconstructive, unsupervised, optimal reconstruction, minimizes the squared reconstruction error, maximizes the variance of the projected input vectors.

Linear Discriminant Analysis (LDA): discriminative, supervised, optimal separation, maximizes the distance between projected vectors.

Canonical Correlation Analysis (CCA): supervised, optimal correlation, motivated by the regression task, e.g., robot localization.

Independent Component Analysis (ICA): independent factors.

Non-negative Matrix Factorization (NMF): non-negative factors.

Kernel methods for nonlinear extension: local straightening by kernel functions.