
Let's have a set of M images {f_i(x, y), i = 1, 2, ..., M}, where each image is N × N.

We can transform each image into a vector by scanning the rows (or columns). We get a set of n = N^2 dimensional vectors x_i:

    x_i = [f_i(0,0), f_i(0,1), ..., f_i(N-1, N-1)]^T

The mean vector m_x is defined as m_x = E{x}. The expectation E{x} can be approximated by the average:

    m_x = (1/M) Σ_{i=1}^{M} x_i

K-L Transform (Hotelling)

The covariance matrix is defined by

    C_x = E{(x - m_x)(x - m_x)^T}

Approximately:

    C_x = (1/M) Σ_{i=1}^{M} (x_i - m_x)(x_i - m_x)^T

C_x is an n × n matrix. Expanding:

    C_x = (1/M) Σ_{i=1}^{M} [(x_i - m_x)(x_i^T - m_x^T)]
        = (1/M) Σ_{i=1}^{M} (x_i x_i^T - x_i m_x^T - m_x x_i^T + m_x m_x^T)
        = (1/M) [ Σ_{i=1}^{M} x_i x_i^T - (Σ_{i=1}^{M} x_i) m_x^T - m_x (Σ_{i=1}^{M} x_i^T) + M m_x m_x^T ]
        = (1/M) Σ_{i=1}^{M} x_i x_i^T - m_x m_x^T
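A minimal numpy sketch of these sample estimates; the image stack, its size, and all variable names here are illustrative assumptions rather than part of the notes:

```python
import numpy as np

M, N = 100, 8                      # assumed: 100 images of size 8 x 8
images = np.random.rand(M, N, N)   # stand-in for the image set {f_i(x, y)}

X = images.reshape(M, N * N)       # each row is one n = N^2 dimensional vector x_i
m_x = X.mean(axis=0)               # m_x = (1/M) sum x_i

# C_x = (1/M) sum (x_i - m_x)(x_i - m_x)^T, an n x n matrix
D = X - m_x
C_x = (D.T @ D) / M

# the equivalent expanded form derived above
C_x_alt = (X.T @ X) / M - np.outer(m_x, m_x)
assert np.allclose(C_x, C_x_alt)
```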
The diagonal elements c_ll of C_x denote the variance of the component x_l of the vector set {x}. The off-diagonal elements c_kl of C_x are the covariance of the components x_k and x_l. If c_kl = 0 they are uncorrelated. The matrix C_x is real and symmetric:

    c_kl = c_lk = (1/M) Σ_{i=1}^{M} x_k^i x_l^i - [ (1/M) Σ_{i=1}^{M} x_k^i ] [ (1/M) Σ_{i=1}^{M} x_l^i ]

where

    x_i = [ x_1^i, x_2^i, ..., x_{N^2}^i ]^T

Therefore it always has n real eigenvalues λ_k and n orthogonal eigenvectors e_k, where

    C_x e_k = λ_k e_k,    k = 1, 2, ..., n

We arrange them in descending order:

    λ_1 ≥ λ_2 ≥ ... ≥ λ_k ≥ λ_{k+1} ≥ ... ≥ λ_n

We define a transformation matrix A with rows composed of the eigenvectors e_k:

    A = [ e_11 e_12 ... e_1n ; e_21 e_22 ... e_2n ; ... ; e_n1 e_n2 ... e_nn ] = [ e_1^T ; e_2^T ; ... ; e_n^T ]

also normalized: |e_k| = 1. The eigenvalues and eigenvectors are found from the equation

    C_x e_k = λ_k e_k,    k = 1, 2, ...
    [C_x - λ_k I] e_k = 0

Since e_k ≠ 0, we must have

    det[C_x - λ_k I] = 0,    k = 1, 2, ...

otherwise, with e_k ≠ 0, there is no solution. This is called the characteristic equation. Next, we define a transformation of the set {x} into a new, "rotated" set {y} by

    y_l = A(x_l - m_x)

The average m_y is zero, m_y = 0, because

    m_y = E{y} = E{A(x - m_x)}
        = A E{x - m_x} = A [E{x} - E{m_x}]
        = A [m_x - m_x] = 0

The covariance of y:

    C_y = E{(y - m_y)(y - m_y)^T} = E{y y^T}
        = E{A(x - m_x)(x - m_x)^T A^T}
        = A E{(x - m_x)(x - m_x)^T} A^T = A C_x A^T

C_y is diagonal:

    C_y = diag(λ_1, λ_2, λ_3, ..., λ_n)

It means that the components of {y} are mutually uncorrelated, i.e.

    c_kl = E{y_k y_l} - E{y_k} E{y_l} = 0,    k ≠ l

Also, since E{y_l} = E{y_k} = 0 for l, k = 1, 2, ..., n, the {y} components are also orthogonal, i.e.:

    E{y_k y_l} = 0,    k ≠ l

To prove that C_y is diagonal:

    C_y = A C_x A^T

First,

    A = [ e_1^T ; e_2^T ; ... ; e_n^T ]

    C_x A^T = C_x [ e_1  e_2  ...  e_n ]

Since e_i is an eigenvector of C_x,

    C_x e_i = λ_i e_i

therefore it follows that

    C_x A^T = [ λ_1 e_1   λ_2 e_2   ...   λ_n e_n ]

Next,

    C_y = A C_x A^T = [ e_1^T ; e_2^T ; ... ; e_n^T ] [ λ_1 e_1   λ_2 e_2   ...   λ_n e_n ] = diag(λ_1, λ_2, ..., λ_n)

since

    e_k^T e_l = 0 if k ≠ l,   1 if k = l

Since the off-diagonal elements of C_y are zero, it follows that the y_k are orthogonal and uncorrelated, i.e. E{y_k y_l} = E{y_k} E{y_l} = 0.
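As a quick numerical check of this diagonalization, a self-contained sketch with an assumed random data set (all sizes and names are illustrative):

```python
import numpy as np

M, n = 200, 16                      # assumed number of samples and vector dimension
X = np.random.rand(M, n)            # rows are the vectors x_i
m_x = X.mean(axis=0)
C_x = (X - m_x).T @ (X - m_x) / M

lam, E = np.linalg.eigh(C_x)        # columns of E are the eigenvectors e_k
order = np.argsort(lam)[::-1]       # arrange the eigenvalues in descending order
lam, E = lam[order], E[:, order]

A = E.T                             # rows of A are the eigenvectors e_k^T
Y = (X - m_x) @ A.T                 # y_i = A(x_i - m_x), one y per row

assert np.allclose(Y.mean(axis=0), 0)     # m_y = 0
C_y = (Y.T @ Y) / M                       # sample covariance of {y}
assert np.allclose(C_y, np.diag(lam))     # diagonal, with lambda_k on the diagonal
```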

The eigenvalues of C_y, i.e. λ_k, are equal to the variance of y_k (the k-th component of y):

    C_y = [c_ij] = [E{y_i y_j}] = diag( E{y_1 y_1}, E{y_2 y_2}, E{y_3 y_3}, ..., E{y_n y_n} ) = diag( λ_1, λ_2, λ_3, ..., λ_n )
Since E{y_k} = 0 we have σ_k^2 = E{y_k^2} = λ_k. Since the components of y are orthogonal (and uncorrelated), we find that the mean power of y is

    E{y^T y} = E{ Σ_{i=1}^{n} y_i^2 } = Σ_{i=1}^{n} E{y_i^2} = Σ_{i=1}^{n} λ_i

and because y is obtained by a unitary transformation (by the unitary matrix A),

    y_l = A(x_l - m_x)

    E{y^T y} = E{(x - m_x)^T A^T A (x - m_x)}
             = E{(x - m_x)^T (x - m_x)} = Σ_{i=1}^{n} σ_{x_i}^2
The variance of (x - m_x) is the same as the variance of y. We can compress the set {x} by transforming it into the set {y} and then transmitting only the first k components: y_j → y_j^k. In the reconstruction of the set {x} we receive

    y_j = [ y_j1, ..., y_jk, y_j,k+1, ..., y_jn ]^T    ⇒    y_j^k = [ y_j1, ..., y_jk, 0, ..., 0 ]^T

Reconstruct an approximate vector x̂_j:

    x̂_j = A^T y_j^k + m_x

It can be shown that the mean square error R (the MSE between x and x̂) is

    R = E{(x - x̂)^2} = Σ_{j=1}^{n} λ_j - Σ_{j=1}^{k} λ_j = Σ_{j=k+1}^{n} λ_j

R is the sum of the eigenvalues of the untransmitted components.
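A self-contained numpy sketch of this truncation, on assumed random data, confirming that the average squared reconstruction error equals the sum of the discarded eigenvalues:

```python
import numpy as np

M, n, k = 200, 16, 4                 # assumed sizes; transmit only the first k components
X = np.random.rand(M, n)
m_x = X.mean(axis=0)
C_x = (X - m_x).T @ (X - m_x) / M

lam, E = np.linalg.eigh(C_x)
order = np.argsort(lam)[::-1]
lam, A = lam[order], E[:, order].T   # rows of A are the eigenvectors, eigenvalues descending

Y = (X - m_x) @ A.T                  # full transform
Y_k = Y.copy()
Y_k[:, k:] = 0                       # zero out the untransmitted components

X_hat = Y_k @ A + m_x                # x_hat_j = A^T y_j^k + m_x, written row-wise
R_empirical = np.mean(np.sum((X - X_hat) ** 2, axis=1))
R_theory = lam[k:].sum()             # sum of eigenvalues of the untransmitted components
assert np.allclose(R_empirical, R_theory)
```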


Example:

Suppose a two-element picture x = [x_1; x_2] is transmitted four times, producing the four vectors

    x_1 = [x_11; x_12] = [-2; 0],   x_2 = [x_21; x_22] = [-1; 1],
    x_3 = [x_31; x_32] = [1; 3],    x_4 = [x_41; x_42] = [2; 4]

    m_x = E{x} = (1/4) { [-2; 0] + [-1; 1] + [1; 3] + [2; 4] } = [0; 2]

[Plot: the four points x_1, ..., x_4 and the shifted points x_i - m_x in the (x_1, x_2) plane.]

We can draw the images as points in a two-dimensional space. The covariance matrix C_x is

    C_x = E{(x - m_x)(x - m_x)^T}
        = (1/4) { [-2; -2][-2  -2] + [-1; -1][-1  -1] + [1; 1][1  1] + [2; 2][2  2] }
        = (1/4) { [4  4 ; 4  4] + [1  1 ; 1  1] + [1  1 ; 1  1] + [4  4 ; 4  4] }
        = [ 10/4  10/4 ; 10/4  10/4 ]

The characteristic equation:

    |C_x - λ I| = 0   ⇒   det[ 10/4 - λ   10/4 ; 10/4   10/4 - λ ] = 0

    λ^2 - 5λ = 0   ⇒   λ(λ - 5) = 0   ⇒   λ_1 = 5,   λ_2 = 0

(1) For λ_1 = 5 we find the eigenvector e_1 = [e_11; e_12]:

    [C_x - 5I] e_1 = 0

    [ 10/4 - 5   10/4 ; 10/4   10/4 - 5 ] [e_11; e_12] = [0; 0]

    -(10/4) e_11 + (10/4) e_12 = 0
     (10/4) e_11 - (10/4) e_12 = 0
    ⇒  e_11 - e_12 = 0
We require here also |e_1| = 1; thus

    e_1 = [ 1/√2 ; 1/√2 ]

(2) For λ_2 = 0 we receive

    [ 10/4 - 0   10/4 ; 10/4   10/4 - 0 ] [e_21; e_22] = [0; 0]

    e_21 + e_22 = 0, and with |e_2| = 1:

    e_2 = [ 1/√2 ; -1/√2 ]

Note that e_1 and e_2 could also have the opposite (negative) directions.

    A = (1/√2) [ 1  1 ; 1  -1 ]

A is an orthogonal matrix. (Using the e_i as columns of A we would instead have C_y = A^T C_x A.)

    C_y = A C_x A^T = (1/2) [ 1  1 ; 1  -1 ] [ 10/4  10/4 ; 10/4  10/4 ] [ 1  1 ; 1  -1 ] = [ 5  0 ; 0  0 ]

All the x_i are "rotated" to y_i using the A transform matrix. Since A is orthogonal, the transformation does not involve scale change or distortion, and the total variance along the new axes equals the old one:

    σ_x1^2 + σ_x2^2 = σ_y1^2 + σ_y2^2,   i.e.   Σ_{i=1}^{2} σ_xi^2 = Σ_{i=1}^{2} σ_yi^2 = Σ_{i=1}^{2} λ_i = 5

The off-diagonal elements of C_y are zero, indicating that the y_i are uncorrelated.

    y_1 = A(x_1 - m_x) = (1/√2) [ 1  1 ; 1  -1 ] [-2; -2] = (1/√2) [-4; 0] = [-2√2; 0]

In the same manner,

    y_2 = A(x_2 - m_x) = (1/√2) [ 1  1 ; 1  -1 ] [-1; -1] = (1/√2) [-2; 0] = [-√2; 0]

    y_3 = [√2; 0],    y_4 = [2√2; 0]
We note here that the y_2 component of all the y_i is zero; they all lie on the y_1 axis. This can also be inferred from the result that λ_2 = 0.
Figure 1: The original and transformed points

The transformed points are plotted here with the rotated y-axis system. Note that the variance along the y_1 axis is the largest, while along y_2 the variance in this case is zero.

To conclude this example of image compression using the Hotelling transformation: instead of transmitting the x_i, i = 1, 2, 3, 4, we can transmit only y_1i and y_2i and then reconstruct x_i by the inverse transformation

    x_i = A^{-1} y_i + m_x = A^T y_i + m_x

Here we see that in this case only the first component of y_i is sufficient to reconstruct x_i with no error.
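The whole worked example can be reproduced with a few lines of numpy (eigenvector signs may come out flipped, as noted above):

```python
import numpy as np

X = np.array([[-2.0, 0.0], [-1.0, 1.0], [1.0, 3.0], [2.0, 4.0]])   # the four pictures
m_x = X.mean(axis=0)                           # -> [0, 2]
D = X - m_x
C_x = (D.T @ D) / 4                            # -> [[2.5, 2.5], [2.5, 2.5]]

lam, E = np.linalg.eigh(C_x)
order = np.argsort(lam)[::-1]
lam, E = lam[order], E[:, order]               # lam -> [5, 0] (up to floating point)
A = E.T                                        # rows are e_1^T, e_2^T (up to sign)

Y = D @ A.T                                    # y_i = A(x_i - m_x)
print(lam)                                     # [5. 0.]
print(Y)                                       # second column is zero: all points lie on the y_1 axis
```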

Figure 2: Dividing an image into sub-images of two pixels each; the (x_1, x_2) gray-level pairs cluster near the diagonal, which becomes the y_1 axis after the transform.

For the general case, if we divide an image into sub-images of two pixels each, we shall denote the pixels in each sub-image by x_1 and x_2. Suppose this image is digitized with 3 bits (8 gray levels). Since x_1 and x_2 are neighbors, it is more likely that their gray levels will be similar. Out of the 64 possible combinations, the ones near the diagonal will be more likely.
Hence, transforming [x_1; x_2] to [y_1; y_2] makes the components y_1 and y_2 more independent, i.e., the values of y_1 are not dependent on y_2.

Application: Principal Axes of Binary Images

In this application, each pixel of the shape has a vector of its coordinates:

    x_i = (x_1, x_2)^T

We next find the 2 × 2 covariance matrix

    C_x = (1/M) Σ_i x_i x_i^T - m_x m_x^T

The eigenvectors e_1 and e_2 give the principal axes of the shape. The coordinates of the pixels along e_1 have the largest variance.
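A small sketch of this idea on a made-up binary mask; the shape, its size, and the variable names are assumptions for illustration:

```python
import numpy as np

shape = np.zeros((50, 50), dtype=bool)
shape[20:30, 5:45] = True                      # an elongated rectangle as the binary shape

coords = np.argwhere(shape).astype(float)      # each row is a pixel coordinate vector (x_1, x_2)
m_x = coords.mean(axis=0)
C_x = (coords - m_x).T @ (coords - m_x) / len(coords)   # the 2 x 2 covariance matrix

lam, E = np.linalg.eigh(C_x)
e1 = E[:, np.argmax(lam)]                      # principal axis: direction of largest variance
print(e1)                                      # roughly [0, ±1]: along the long side of the rectangle
```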

Multi-Band Image Compression

In multiple images of different bands, we create the vector set from the different band values of each pixel. For instance, if we have six bands, we receive a 6-dimensional vector set. Next, we keep only the first few y-components, which include most of the information.

General Image Compression

We divide the image into sub-images, and each sub-image serves as one vector. If the image has high correlation, we receive a covariance matrix with a block-Toeplitz configuration.
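A sketch of packing non-overlapping p × p blocks of an image into vectors, as a starting point for such a compression; the image and block size are assumptions:

```python
import numpy as np

img = np.random.rand(128, 128)      # stand-in image
p = 8                               # assumed sub-image (block) size

# split into non-overlapping p x p blocks and flatten each block into one vector x_i
blocks = img.reshape(128 // p, p, 128 // p, p).swapaxes(1, 2).reshape(-1, p * p)
m_x = blocks.mean(axis=0)
C_x = (blocks - m_x).T @ (blocks - m_x) / blocks.shape[0]   # 64 x 64 covariance matrix
```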

Finding λ_i and e_i using SVD

We can pack the vectors {x_i} in a matrix X:

    X_{n×N} = [ x_1  x_2  ...  x_N ]

(Assuming that m_x = 0.)

    C_x = E{x x^T} = (1/N) X X^T    (n × n)

also, C_x = (1/N) Σ_{k=1}^{N} x_k x_k^T. This dimension might be very large, especially with images. For example, with an image of 128 × 128, n = 128^2 = 16384, and we would have to diagonalize a C_x matrix of size 16384 × 16384 and find its 16384 eigenvectors. From linear algebra, it is possible to find the eigenvalues and eigenvectors of X X^T from those of X^T X (N × N), which are much easier to find. If the rank of X is r, then

    X = Σ_{k=1}^{r} √λ_k u_k v_k^T

where
    v_k  [N × 1]  is an eigenvector of X^T X,
    u_k  [n × 1]  is an eigenvector of X X^T,
    λ_k  are the nonzero eigenvalues of both X^T X and X X^T.

If we multiply both sides of the previous equation by v_i we get

    X v_i = ( Σ_{k=1}^{r} √λ_k u_k v_k^T ) v_i = √λ_i u_i (v_i^T v_i)

since √λ_k u_k v_k^T v_i is nonzero only for k = i. Thus

    u_i = (1/√λ_i) X v_i

Hence, we can find the eigenvectors of the large matrix X X^T (n × n) from the eigenvectors of the smaller matrix X^T X (N × N).
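A numerical sketch of this shortcut; the sizes here are illustrative, with n much larger than N:

```python
import numpy as np

n, N = 4096, 20                        # assumed: N = 20 sample vectors of dimension n = 4096
X = np.random.randn(n, N)
X -= X.mean(axis=1, keepdims=True)     # enforce m_x = 0, as assumed above

# solve the small N x N problem instead of the huge n x n one
lam, V = np.linalg.eigh(X.T @ X)       # eigenvalues/eigenvectors of X^T X
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

# u_i = (1/sqrt(lambda_i)) X v_i are eigenvectors of X X^T with the same eigenvalues
r = N - 1                              # centering reduces the rank by one
U = X @ V[:, :r] / np.sqrt(lam[:r])
assert np.allclose((X @ X.T) @ U, U * lam[:r], atol=1e-6)   # (X X^T) u_i = lambda_i u_i
```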

Eigenfaces

The N eigenvectors are divided into two groups:
  one group of M vectors which correspond to the largest eigenvalues: the principal space F = {Φ_i}_{i=1}^{M};
  the residual group of N - M vectors: the complement space F̄ = {Φ_i}_{i=M+1}^{N}.
Since all the vectors are normal to one another, the two subspaces F and F̄ are also normal.

Assuming independent Gaussian densities, each corresponding to an eigenvector, the likelihood of an input pattern x is given by

    p(x | Ω) = exp[ -(1/2) (x - x̄)^T Σ^{-1} (x - x̄) ] / [ (2π)^{N/2} |Σ|^{1/2} ] = exp[ -(1/2) d(x) ] / [ (2π)^{N/2} |Σ|^{1/2} ]

where d(x) is the Mahalanobis distance, i.e. the square of the normalized Euclidean distance, and Ω is a class of the object (represented by x̄).

    d(x) = (x - x̄)^T Σ^{-1} (x - x̄) = x̃^T Σ^{-1} x̃

where x̃ = x - x̄. Since

    Σ = Φ Λ Φ^T   ⇒   Σ^{-1} = Φ Λ^{-1} Φ^T

we have

    d(x) = x̃^T Φ Λ^{-1} Φ^T x̃ = y^T Λ^{-1} y

where y = Φ^T x̃ represents x̃ in the new KLT coordinate system. Since

    Λ^{-1} = diag( 1/λ_1, 1/λ_2, ..., 1/λ_N )

we can write

    d(x) = Σ_{i=1}^{N} y_i^2 / λ_i

An approximation to d(x), i.e. d̂(x), is given by

    d̂(x) = Σ_{i=1}^{M} y_i^2 / λ_i + (1/ρ) Σ_{i=M+1}^{N} y_i^2 = Σ_{i=1}^{M} y_i^2 / λ_i + (1/ρ) ε^2(x)

where

    ε^2(x) = ||x̃||^2 - Σ_{i=1}^{M} y_i^2

The value of ε^2(x) can easily be calculated from the first M principal components; the optimal value of ρ is the arithmetic average of the residual eigenvalues, i.e.

    ρ = (1/(N - M)) Σ_{i=M+1}^{N} λ_i