Assignment #2
Principal Component Analysis
Matthew Reaume
GISC9216
Digital Copy:
X:\Students\MREAUME2\GISC9216\Assignment#1\ReaumeMGISC9216D1 containing:
a) ReaumeMGISC9216D2.docx
b) unsupervised_image.img
c) pca_unsuper_image.img
d) pca_image
Table of Contents
1) Transform Original Bands to Principal Components
2) Band Comparison of Strong, Moderate, and Weak Correlations
3) PCA Variance on First, Second, and Third Channel
4) Comparison of Original Data to the PCA Channels
5a) Unsupervised Classification on the Original Image
5b) Unsupervised Classification on the Principal Component Analysis
6) Comparison of the PCA and Unsupervised
Table of Figures
Figure 1: Histogram of Band 1 and Band 2
Figure 2: Histogram of Band 1 and Band 4
Figure 3: Eigen Matrix of PCA
Figure 4: Eigen Values of PCA
Figure 5: Formal Layout of Unsupervised Classification
Figure 6: Formal Layout of Unsupervised PCA
Figure 7: Agricultural Features of PCA (left) and Unsupervised Classification (right)
Figure 8: Pixel Value Comparison
Figure 9: Urban Features Comparison of PCA (left) and Unsupervised Classification (right)
Figure 10: Comparison of PCA (left) and Subset Image (right)
Figure 11: Comparison of Unsupervised Classification (left) and Subset Image (right)
Table of Tables
Table 1: Strong Correlation of Bands
Table 2: Moderate Correlation of Bands
Table 3: Weak Correlation of Bands
Table 4: PCA Channels and Calculation of Covariance
Table 5: Comparison of PCA to Original Subset Scatterplots
Table 6: Comparison of PCA and Original Subset Histograms
Table 1: Strong Correlation of Bands
Example of Correlation: (scatterplot image)
Reasoning: This is a strong correlation because the points of the scatterplot fall along a line. The plot starts in the bottom left and progresses linearly to the top right of the image.
Table 2: Moderate Correlation of Bands
Band pairs: 1:5, 1:6, 2:5, 2:6, 3:5, 3:6, 5:6
Example of Correlation: (scatterplot image)
Reasoning: This is a moderate correlation because the scatterplot is only partially linear. The points form a partial linear shape, and the two ends of the feature point diagonally toward opposite corners of the plot.
Table 3: Weak Correlation of Bands
Band pairs: 4:5, 4:6
Example of Correlation: (scatterplot image)
Reasoning: This is a weak correlation because the scatterplot has no linear shape. The points instead form a triangular shape.
Along with the scatterplots above, histograms also show the correlation between bands. Figure 1 below shows the redundancy between band 1 and band 2. Table 1 lists these two bands as strongly correlated, and their histograms in Figure 1 have a nearly identical overall shape. Figure 2 displays band 1 and band 4, which have a weak correlation as seen above in Table 3. In that figure the histograms of the two bands differ in shape, which reflects the weak correlation. The tables above and the histograms below make it apparent that some of the data is redundant, so a Principal Component Analysis is performed to reduce the redundancy.
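The visual judgement of strong, moderate, and weak correlation can also be checked numerically. The following is a minimal sketch, assuming synthetic data in place of the image bands. The three bands, the thresholds of 0.8 and 0.4, and the helper name `describe` are all illustrative assumptions and are not taken from the assignment.

```python
import numpy as np

# Synthetic stand-in for image bands (hypothetical data, not the assignment's image).
rng = np.random.default_rng(0)
base = rng.normal(100.0, 20.0, size=10_000)
band_a = base + rng.normal(0.0, 2.0, size=base.size)  # nearly a copy of base
band_b = base + rng.normal(0.0, 2.0, size=base.size)  # also nearly a copy of base
band_c = rng.normal(60.0, 15.0, size=base.size)       # independent of base

bands = np.vstack([band_a, band_b, band_c])  # shape (n_bands, n_pixels)
corr = np.corrcoef(bands)                    # Pearson correlation matrix, shape (3, 3)


def describe(r, strong=0.8, moderate=0.4):
    """Label |r| using illustrative thresholds (assumed, not from the assignment)."""
    r = abs(r)
    if r >= strong:
        return "strong"
    if r >= moderate:
        return "moderate"
    return "weak"


# Label every unique band pair (upper triangle of the matrix).
labels = {(i, j): describe(corr[i, j]) for i in range(3) for j in range(i + 1, 3)}
for (i, j), label in labels.items():
    print(f"bands {i + 1}:{j + 1}  r = {corr[i, j]:+.3f}  ({label})")
```

A correlation coefficient near 1 corresponds to the tight diagonal scatterplot described for the strongly correlated bands, while a value near 0 corresponds to a scatterplot with no linear shape.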
To create the Principal Component Analysis, the three components at the top of the Eigen Values chart are selected, as these are the most important for the purpose of this assignment. Three components are chosen because the goal is to reduce the redundancy across the six bands. The top three values account for 99.4% of the variance; therefore, about 0.6% of the data is lost when the PCA is executed. Figure 4 below shows the Eigen Values, and Table 4 shows the calculation of the percentage for each of the PCA channels.
Table 4: PCA Channels and Calculation of Covariance

PCA Channel   Eigen Values   Divided By Total   Percentage (%)   Covariance Accounted For (%)
1             1993.072512    0.681932356        68.19323562      68.2
2             816.4739036    0.27935761         27.93576097      96.1
3             94.49608606    0.032331959        3.233195894      99.4
4             11.71935572    0.004009793        0.400979283      99.8
5             5.215368048    0.001784445        0.178444497      99.9
6             1.706371854    0.000583837        0.058383735      100.0
Total:        2922.683597                       100%             100%
According to the table provided above, the first channel accounts for 68.2% of the variance, the second channel for 27.9%, and the third for 3.2%. Together, these three channels account for 99.4%. This means that the remaining 0.6%, as mentioned above, is lost. Adding the fourth channel would contribute only another 0.4%, whereas the second channel adds 27.9% and the third adds 3.2%. These diminishing contributions show how PCA works and why it is useful for reducing and compressing data.
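The percentages discussed above can be reproduced directly from the six Eigen Values. The following is a small sketch of that arithmetic. The eigenvalues are the ones listed in Table 4, while the variable names are illustrative.

```python
import numpy as np

# Eigen Values of the six PCA channels, as listed in Table 4.
eigenvalues = np.array([
    1993.072512, 816.4739036, 94.49608606,
    11.71935572, 5.215368048, 1.706371854,
])

total = eigenvalues.sum()                 # sum of all Eigen Values
percent = 100.0 * eigenvalues / total     # share of each channel
cumulative = np.cumsum(percent)           # running total accounted for

for channel, (p, c) in enumerate(zip(percent, cumulative), start=1):
    print(f"channel {channel}: {p:6.2f}%  cumulative {c:6.2f}%")

retained = cumulative[2]   # first three channels
lost = 100.0 - retained    # share discarded by keeping only three channels
print(f"first three channels retain {retained:.1f}%, discarding {lost:.1f}%")
```

Dividing each Eigen Value by the total gives the share of that channel, and the running sum gives the cumulative figure used to decide how many channels to keep.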
Table 5: Comparison of PCA to Original Subset Scatterplots
Columns: Bands, PCA Channels, Original Data, Comparison
Bands 1:2: The PCA scatterplot shows that the bands are unique when compared to the original data. The redundancy has been removed, as the scatterplot shows weak or no correlation.
Bands 1:3
Bands 2:3: Lastly, the PCA scatterplot shows weak or no correlation between the bands, and the redundancy is once again removed.
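The loss of correlation in the PCA scatterplots follows from how the transform is built. The following is a minimal sketch, assuming three synthetic, strongly correlated bands in place of the image subset. The data, the random seed, and the variable names are illustrative assumptions, and the eigendecomposition of the covariance matrix stands in for whatever the image processing software does internally.

```python
import numpy as np

rng = np.random.default_rng(42)
n_pixels = 5_000

# Synthetic stand-in for the image subset: three strongly correlated bands.
shared = rng.normal(0.0, 30.0, size=n_pixels)
bands = np.column_stack([
    shared + rng.normal(0.0, 5.0, size=n_pixels),
    0.8 * shared + rng.normal(0.0, 5.0, size=n_pixels),
    1.2 * shared + rng.normal(0.0, 5.0, size=n_pixels),
])  # shape (n_pixels, 3)

# PCA via eigendecomposition of the band covariance matrix.
centered = bands - bands.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh returns ascending order
order = np.argsort(eigenvalues)[::-1]            # reorder largest first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Project each pixel onto the principal axes to get the PCA channels.
pca_channels = centered @ eigenvectors

corr_original = np.corrcoef(bands, rowvar=False)
corr_pca = np.corrcoef(pca_channels, rowvar=False)
print("original band correlations:\n", np.round(corr_original, 3))
print("PCA channel correlations:\n", np.round(corr_pca, 3))
```

The original bands show off-diagonal correlations close to 1, while the PCA channels show off-diagonal correlations of effectively zero, which matches the shapeless PCA scatterplots described in the comparison.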
Table 6: Comparison of PCA and Original Subset Histograms
Columns: PCA Scatterplot, Original Data
Rows: Band 1, Band 2, Band 3
Figure 9: Urban Features Comparison of PCA (left) and Unsupervised Classification (right)
Figure 11: Comparison of Unsupervised Classification (left) and Subset Image (right)