
Adapted from G. Piatetsky-Shapiro, Biologically Inspired Intelligent Systems (Lecture 7), and R. Gutierrez-Osuna's lectures

Most mining algorithms look for non-linear combinations of fields, and can easily find many spurious combinations given a small number of records and a large number of fields. Classification accuracy improves if we first reduce the number of fields. Multi-class heuristic: select an equal number of fields from each class.


If there are too many fields, select a subset that is most relevant

Can select the top N fields using the 1-field predictive accuracy computed earlier. What is a good N?

Rule of thumb: keep top 50 fields
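The top-N selection above can be sketched as follows. This is a minimal illustration, not the method from the lecture: `one_field_accuracy` is a hypothetical scorer (a single-threshold stump) standing in for whatever 1-field predictive accuracy was computed earlier.

```python
import numpy as np

def one_field_accuracy(x, y):
    """Best accuracy of a single-threshold rule on one field (a simple stump)."""
    best = 0.0
    for t in np.unique(x):
        pred = (x >= t).astype(int)
        acc = max((pred == y).mean(), ((1 - pred) == y).mean())
        best = max(best, acc)
    return best

def select_top_fields(X, y, n_keep=50):
    """Rank fields by 1-field predictive accuracy and keep the top n_keep."""
    scores = [one_field_accuracy(X[:, j], y) for j in range(X.shape[1])]
    order = np.argsort(scores)[::-1]          # best-scoring fields first
    return order[:n_keep]

# Synthetic check: only field 3 carries class information.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = rng.normal(size=(200, 10))
X[:, 3] += 3 * y                              # informative field
keep = select_top_fields(X, y, n_keep=5)
print(keep[0])                                # field 3 should rank first
```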

Attribute Construction

It is better to have a fair modeling method and good variables than the best modeling method and poor variables. Examples:

People are eligible for pension withdrawal at age 59½, so create that as a separate Boolean variable! Household income as the sum of spouses' incomes in loan underwriting.

Advanced methods exist for automatically examining variable combinations, but they are very computationally expensive!
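The two constructed attributes mentioned above can be sketched directly; the record fields here are made up for illustration, and the 59½ cutoff is the usual pension-withdrawal age:

```python
# Hypothetical records illustrating hand-built attributes.
records = [
    {"age": 60, "income": 40_000, "spouse_income": 35_000},
    {"age": 45, "income": 55_000, "spouse_income": 0},
]
for r in records:
    r["pension_eligible"] = r["age"] >= 59.5                  # Boolean age rule
    r["household_income"] = r["income"] + r["spouse_income"]  # constructed sum
print(records[0]["pension_eligible"], records[0]["household_income"])  # True 75000
```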


Variance

A measure of the spread of the data in a data set

    s² = Σᵢ₌₁ⁿ (Xᵢ − X̄)² / (n − 1)
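The sample-variance formula above, transcribed directly into Python as a minimal sketch using plain lists:

```python
def sample_variance(xs):
    """s^2 = sum((x - mean)^2) / (n - 1), the sample variance."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

print(sample_variance([1.0, 2.0, 3.0, 4.0]))   # 5/3 ≈ 1.667
```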


Covariance

Variance is a measure of the deviation from the mean for points in one dimension, e.g., heights. Covariance is a measure of how much each of the dimensions varies from the mean with respect to the others. Covariance is measured between 2 dimensions to see if there is a relationship between them, e.g., number of hours studied and grade obtained. The covariance between one dimension and itself is the variance.


Covariance

    var(X) = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Xᵢ − X̄) / (n − 1)

    cov(X, Y) = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / (n − 1)

So, if you had a 3-dimensional data set (x,y,z), then you could measure the covariance between the x and y dimensions, the y and z dimensions, and the x and z dimensions.
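The covariance formula above, transcribed directly; the hours/marks numbers are invented for illustration:

```python
def sample_covariance(xs, ys):
    """cov(X, Y) = sum((x - mean_x)(y - mean_y)) / (n - 1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

hours = [9, 15, 25, 14, 10]
marks = [39, 56, 93, 61, 50]
print(sample_covariance(hours, marks))   # positive: they rise together
print(sample_covariance(hours, hours))   # covariance with itself = variance
```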


Covariance

What is the interpretation of covariance calculations? Say you have a 2-dimensional data set

X: number of hours studied for a subject
Y: marks obtained in that subject

And assume the covariance value (between X and Y) is 104.53. What does this value mean?


Covariance

The exact value is not as important as its sign. A positive covariance indicates that both dimensions increase or decrease together, e.g., as the number of hours studied increases, the grades in that subject also increase. A negative value indicates that while one increases the other decreases, or vice-versa, e.g., active social life at BYU vs. performance in the CS Dept. If the covariance is zero, the two dimensions are uncorrelated, e.g., heights of students vs. grades obtained in a subject.

Covariance

Why bother with calculating (expensive) covariance when we could just plot the two variables to see their relationship?

Covariance calculations are used to find relationships between dimensions in high dimensional data sets (usually greater than 3) where visualization is difficult.


Covariance Matrix

Representing covariance among dimensions as a matrix, e.g., for 3 dimensions:

        | cov(X,X)  cov(X,Y)  cov(X,Z) |
    C = | cov(Y,X)  cov(Y,Y)  cov(Y,Z) |
        | cov(Z,X)  cov(Z,Y)  cov(Z,Z) |

Properties:

Diagonal: the variances of the variables. cov(X,Y) = cov(Y,X), hence the matrix is symmetric about the diagonal (only the upper triangle needs to be stored). n-dimensional data will result in an n × n covariance matrix.
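These properties can be checked numerically with NumPy's `np.cov` on random 3-dimensional data (a sketch; the data is synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(100, 3))            # 100 samples, 3 dimensions (X, Y, Z)
C = np.cov(data, rowvar=False)              # 3x3 covariance matrix

print(C.shape)                              # (3, 3): n dimensions -> n x n matrix
print(np.allclose(C, C.T))                  # True: symmetric about the diagonal
print(np.allclose(np.diag(C), data.var(axis=0, ddof=1)))  # diagonal = variances
```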

Transformation Matrices

Consider the following:

    | 2  3 |   | 3 |   | 12 |       | 3 |
    | 2  1 | × | 2 | = |  8 | = 4 × | 2 |

The square (transformation) matrix scales the vector (3,2) by 4. Now assume we take a multiple of (3,2):

    2 × | 3 | = | 6 |
        | 2 |   | 4 |

    | 2  3 |   | 6 |   | 24 |       | 6 |
    | 2  1 | × | 4 | = | 16 | = 4 × | 4 |


Transformation Matrices

Scale the vector (3,2) by 2 to get (6,4), then multiply by the square transformation matrix: the result is still scaled by 4. WHY? A vector consists of both length and direction. Scaling a vector only changes its length, not its direction. This is an important observation about matrix transformations, leading to the notions of eigenvectors and eigenvalues. Irrespective of how much we scale (3,2) by, the result (under the given transformation matrix) is always a multiple of 4 times the scaled vector.
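The scaling behaviour is easy to verify numerically, using the same 2×2 matrix and vector as the example:

```python
import numpy as np

A = np.array([[2, 3],
              [2, 1]])
v = np.array([3, 2])

print(A @ v)          # [12  8] = 4 * (3, 2)
print(A @ (2 * v))    # [24 16] = 4 * (6, 4): scaling v does not change the factor
```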


Eigenvalue Problem

The eigenvalue problem is any problem having the following form: A·v = λ·v, where A is an n × n matrix, v is an n × 1 non-zero vector, and λ is a scalar. Any value of λ for which this equation has a solution is called an eigenvalue of A, and the vector v which corresponds to this value is called an eigenvector of A.


Eigenvalue Problem

Going back to our example:

    | 2  3 |   | 3 |   | 12 |       | 3 |
    | 2  1 | × | 2 | = |  8 | = 4 × | 2 |

      A    ×    v  =        λ  ×    v

Therefore, (3,2) is an eigenvector of the square matrix A, and 4 is an eigenvalue of A. The question is: given a matrix A, how can we calculate the eigenvectors and eigenvalues of A?


Simple matrix algebra shows that:

    A·v = λ·v
    A·v − λ·I·v = 0
    (A − λ·I)·v = 0

Finding the roots of |A − λ·I| will give the eigenvalues, and for each of these eigenvalues there will be an eigenvector. Example:


Let

    A = |  0   1 |
        | −2  −3 |

Then:

    |A − λ·I| = | 0 − λ      1     |
                |  −2     −3 − λ   |
              = λ(3 + λ) + 2 = λ² + 3λ + 2 = (λ + 1)(λ + 2)

so the eigenvalues are λ₁ = −1 and λ₂ = −2.


For λ₁ = −1:

    (A − λ₁·I)·v₁ = |  1   1 | | v₁:₁ |   | 0 |
                    | −2  −2 | | v₁:₂ | = | 0 |

so v₁:₁ + v₁:₂ = 0 and −2·v₁:₁ − 2·v₁:₂ = 0.

Therefore the first eigenvector is any column vector in which the two elements have equal magnitude and opposite sign.


Therefore eigenvector v₁ is

    v₁ = k₁ |  1 |
            | −1 |

and, similarly for λ₂ = −2, eigenvector v₂ is

    v₂ = k₂ |  1 |
            | −2 |
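The hand computation can be checked with NumPy; `np.linalg.eig` returns the eigenvalues and unit-length eigenvectors (as columns) directly:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
vals, vecs = np.linalg.eig(A)

print(np.sort(vals))                      # eigenvalues -2 and -1
for lam, v in zip(vals, vecs.T):          # each column v of vecs pairs with a lam
    print(np.allclose(A @ v, lam * v))    # True: A v = lambda v
```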


Eigenvectors can only be found for square matrices, and not every square matrix has eigenvectors. Given an n × n matrix that does have eigenvectors, we can find n of them. All eigenvectors of a symmetric* matrix are perpendicular to each other, no matter how many dimensions we have. In practice, eigenvectors are normalized to have unit length.

*Note: covariance matrices are symmetric!

PCA

Principal components analysis (PCA) is a technique that can be used to simplify a dataset. It is a linear transformation that chooses a new coordinate system for the data set such that

The greatest variance by any projection of the data set comes to lie on the first axis (then called the first principal component), the second greatest variance on the second axis, and so on.

PCA can be used for reducing dimensionality by eliminating the later principal components.


PCA

By finding the eigenvalues and eigenvectors of the covariance matrix, we find that the eigenvectors with the largest eigenvalues correspond to the dimensions that have the strongest correlation in the dataset. These are the principal components. PCA is a useful statistical technique that has found application in:

fields such as face recognition and image compression, and finding patterns in data of high dimension.


Subtract the mean from each of the dimensions. This produces a data set whose mean is zero. Subtracting the mean makes the variance and covariance calculations easier by simplifying their equations; the variance and covariance values themselves are not affected by the mean value.
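A small numerical check of this step, on a few illustrative (X, Y) points: the mean of the adjusted data is zero, and the covariance is unchanged.

```python
import numpy as np

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9]])   # a few (X, Y) points
adjusted = data - data.mean(axis=0)                     # subtract each column's mean

print(np.allclose(adjusted.mean(axis=0), 0))            # True: zero-mean data
print(np.allclose(np.cov(adjusted, rowvar=False),
                  np.cov(data, rowvar=False)))          # True: covariance unchanged
```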


     X     Y   |   X′     Y′
    2.5   2.4  |  0.69   0.49
    0.5   0.7  | −1.31  −1.21
    2.2   2.9  |  0.39   0.99
    1.9   2.2  |  0.09   0.29
    3.1   3.0  |  1.29   1.09
    2.3   2.7  |  0.49   0.79
    2.0   1.6  |  0.19  −0.31
    1.0   1.1  | −0.81  −0.81
    1.5   1.6  | −0.31  −0.31
    1.1   0.9  | −0.71  −1.01

with X̄ = 1.81 and Ȳ = 1.91, where X′ = X − X̄ and Y′ = Y − Ȳ.


http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf

Calculate the covariance matrix:

    cov = | 0.616555556  0.615444444 |
          | 0.615444444  0.716555556 |

Since the non-diagonal elements of this covariance matrix are positive, we should expect that the X and Y variables increase together. Since it is symmetric, we expect the eigenvectors to be orthogonal.


Calculate the eigenvectors and eigenvalues of the covariance matrix:

    eigenvalues = | 0.0490833989 |
                  | 1.28402771   |

    eigenvectors = | −0.735178656  −0.677873399 |
                   |  0.677873399  −0.735178656 |


The eigenvectors are plotted as diagonal dotted lines on the plot (note that they are perpendicular to each other). One of the eigenvectors goes through the middle of the points, like drawing a line of best fit. The second eigenvector gives us the other, less important pattern in the data: all the points follow the main line but are off to the side of it by some amount.


Reduce dimensionality and form the feature vector. The eigenvector with the highest eigenvalue is the principal component of the data set. In our example, the eigenvector with the largest eigenvalue is the one that points down the middle of the data. Once the eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives the components in order of significance.


Now, if you'd like, you can decide to ignore the components of lesser significance. You do lose some information, but if the eigenvalues are small, you don't lose much:

n dimensions in your data → calculate n eigenvectors and eigenvalues → choose only the first p eigenvectors → the final data set has only p dimensions.
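The whole reduction pipeline can be sketched as one function. This is an illustrative sketch, not a reference implementation; `eigh` is used because covariance matrices are symmetric.

```python
import numpy as np

def pca_reduce(data, p):
    """Project data onto its first p principal components."""
    centered = data - data.mean(axis=0)           # step 1: subtract the mean
    cov = np.cov(centered, rowvar=False)          # step 2: covariance matrix
    vals, vecs = np.linalg.eigh(cov)              # step 3: eigen-decomposition
    order = np.argsort(vals)[::-1]                # sort eigenvalues high -> low
    feature_vector = vecs[:, order[:p]]           # step 4: keep top-p eigenvectors
    return centered @ feature_vector              # step 5: project onto new axes

rng = np.random.default_rng(2)
data = rng.normal(size=(50, 4))
reduced = pca_reduce(data, 2)
print(reduced.shape)                              # (50, 2): n=4 dimensions -> p=2
```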


When the λᵢ are sorted in descending order, the proportion of variance explained by the first p principal components is:

    Σᵢ₌₁ᵖ λᵢ / Σᵢ₌₁ⁿ λᵢ  =  (λ₁ + λ₂ + … + λₚ) / (λ₁ + λ₂ + … + λₚ + … + λₙ)

If the dimensions are highly correlated, there will be a small number of eigenvectors with large eigenvalues and p will be much smaller than n. If the dimensions are not correlated, p will be as large as n and PCA does not help.
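For instance, with eigenvalues λ₁ = 1.28402771 and λ₂ = 0.0490833989 (the pair from the running example), the ratio above gives:

```python
import numpy as np

eigenvalues = np.array([1.28402771, 0.0490833989])
explained = np.cumsum(np.sort(eigenvalues)[::-1]) / eigenvalues.sum()
print(explained)   # the first component alone explains about 96% of the variance
```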

Feature Vector

    FeatureVector = (eig₁ eig₂ eig₃ … eigₚ)

(take the eigenvectors to keep from the ordered list of eigenvectors, and form a matrix with these eigenvectors in the columns)

We can either form a feature vector with both of the eigenvectors:

    | −0.677873399  −0.735178656 |
    | −0.735178656   0.677873399 |

or, we can choose to leave out the smaller, less significant component and only have a single column:

    | −0.677873399 |
    | −0.735178656 |


Derive the new data

FinalData = RowFeatureVector x RowZeroMeanData

RowFeatureVector is the matrix with the eigenvectors in the columns transposed so that the eigenvectors are now in the rows, with the most significant eigenvector at the top. RowZeroMeanData is the mean-adjusted data transposed, i.e., the data items are in each column, with each row holding a separate dimension.


FinalData is the final data set, with data items in columns, and dimensions along rows. What does this give us? The original data solely in terms of the vectors we chose. We have changed our data from being in terms of the axes X and Y, to now be in terms of our 2 eigenvectors.


FinalData (transpose: dimensions along columns)

        newX            newY
    −0.827970186    −0.175115307
     1.77758033      0.142857227
    −0.992197494     0.384374989
    −0.274210416     0.130417207
    −1.67580142     −0.209498461
    −0.912949103     0.175282444
     0.0991094375   −0.349824698
     1.14457216      0.0464172582
     0.438046137     0.0177646297
     1.22382056     −0.162675287



Recall that:

FinalData = RowFeatureVector x RowZeroMeanData

Then:

RowZeroMeanData = RowFeatureVector⁻¹ x FinalData

And thus:

RowOriginalData = (RowFeatureVector⁻¹ x FinalData) + OriginalMean

If we use unit eigenvectors, the inverse is the same as the transpose (hence, easier).
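The round trip can be checked numerically on a few points (here the first five of the worked example, though any data would do), keeping all components so the reconstruction is exact:

```python
import numpy as np

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
mean = data.mean(axis=0)
zero_mean = data - mean

vals, vecs = np.linalg.eigh(np.cov(zero_mean, rowvar=False))
feat = vecs[:, np.argsort(vals)[::-1]]        # unit eigenvectors as columns, ordered

final = feat.T @ zero_mean.T                  # FinalData = RowFeatureVector x RowZeroMeanData
restored = (feat @ final).T + mean            # transpose (= inverse) undoes the rotation

print(np.allclose(restored, data))            # True: keeping all components is lossless
```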


If we reduce the dimensionality (i.e., p < n) then, when reconstructing the data, we obviously lose the dimensions we chose to discard. In our example, let us assume we kept only the single most significant eigenvector. The final data is newX only. Reconstructing from it:


The variation along the principal component is preserved. The variation along the other component has been lost.



Factor Analysis

The reciprocal of PCA(?) PCA generates new variables (zᵢ) that are linear combinations of the original input variables (xᵢ). FA assumes that there are factors (zᵢ) that, when linearly combined, generate the input variables (xᵢ).


Linear Discriminant Analysis (LDA)

Both PCA and FA are unsupervised. LDA seeks a dimension such that, when the data is projected onto it, the two classes are well separated (i.e., the means are as far apart as possible and the examples of each class are as tightly clustered as possible).
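For the two-class case this can be sketched with Fisher's criterion, w ∝ Sw⁻¹(m₁ − m₀), where Sw is the within-class scatter. The data here is synthetic and the helper is illustrative:

```python
import numpy as np

def fisher_direction(X0, X1):
    """Fisher's LDA direction w ∝ Sw^-1 (m1 - m0) for two classes."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two class scatter matrices.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)              # unit-length projection direction

rng = np.random.default_rng(3)
X0 = rng.normal(loc=[0, 0], size=(100, 2))
X1 = rng.normal(loc=[3, 0], size=(100, 2))    # classes differ along the first axis
w = fisher_direction(X0, X1)
print(abs(w[0]) > abs(w[1]))                  # True: projection favors the separating axis
```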


References

PCA tutorial: http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf
Wikipedia: http://en.wikipedia.org/wiki/Principal_component_analysis
http://en.wikipedia.org/wiki/Eigenface

