Introduction:
[Figure 1: Surface plot of the XRDP scans. Axes: Position [2Theta], Scan number, Ln(Counts).]
Summary:
[Figure 2b: Eigenvalues plot. Axes: percentage of variation accounted for vs. component number.]
However, cluster analysis that takes into account the signal-to-noise
ratio of the XRDP raw data (as used in our search-match-identify
algorithm) immediately reveals the presence of at least four or five
different groups. Figure 2a presents the PCA score plot calculated from
such a correlation matrix. Figure 2b shows the Eigenvalues plot, which
indicates that PC3 in the z-direction is much less important, because it
accounts for only about 2% of the variation in the data, whereas PC1 in
the x-direction accounts for 51% and PC2 in the y-direction still
accounts for 38%.
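The percentages quoted above follow from the eigenvalues in the usual way: each eigenvalue is divided by the sum of all eigenvalues. With the figures given in the text, PC1 and PC2 together account for about 89% of the variation, and adding PC3 brings this to only about 91%. The following is a minimal sketch of that arithmetic in Python. The function name `percent_variance` and the eigenvalues used are hypothetical and chosen for illustration only; they are not taken from the data set discussed here.

```python
def percent_variance(eigenvalues):
    """Return (percent, cumulative) lists, ordered from largest eigenvalue down.

    percent[k]    : share of the total variation carried by component k+1
    cumulative[k] : share carried by components 1 .. k+1 together
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    percent = [100.0 * ev / total for ev in ordered]

    cumulative = []
    running = 0.0
    for p in percent:
        running += p
        cumulative.append(running)
    return percent, cumulative


# Hypothetical eigenvalues, for illustration only (not the measured values).
example_eigenvalues = [12, 6, 1, 1]
percent, cumulative = percent_variance(example_eigenvalues)

for k, (p, c) in enumerate(zip(percent, cumulative), start=1):
    print(f"PC{k}: {p:5.1f}% of variation, {c:5.1f}% cumulative")
```

A component whose share is small compared with the leading ones, like PC3 here, adds little to the score plot, so a 2D plot of PC1 against PC2 would show most of the structure.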
The four clusters (the fourth cluster is shown in green and brown
together) correspond well to the four different batches under
investigation. Adjusting the cut-off line in the dendrogram further
allows each of the seven samples to be clearly identified, proving the
non-homogeneity of the sampling.
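The effect of moving the cut-off line can be illustrated with a small sketch. It assumes a dissimilarity of the form 1 minus the correlation coefficient and single-linkage merging. Both are assumptions made for illustration, since the text does not state which distance measure or linkage method was used. The function name `cut_dendrogram` and the distance values are hypothetical.

```python
def cut_dendrogram(dist, cutoff):
    """Single-linkage agglomerative clustering, stopped at `cutoff`.

    dist   : symmetric matrix (list of lists) of pairwise dissimilarities
    cutoff : merges at a dissimilarity above this value are not performed
    Returns a sorted list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None  # (dissimilarity, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > cutoff:
            break  # the next merge lies above the cut-off line
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # safe because b > a
    return sorted(clusters)


# Hypothetical dissimilarities (1 - correlation) between five patterns.
example_dist = [
    [0.00, 0.05, 0.40, 0.45, 0.80],
    [0.05, 0.00, 0.42, 0.44, 0.82],
    [0.40, 0.42, 0.00, 0.10, 0.78],
    [0.45, 0.44, 0.10, 0.00, 0.79],
    [0.80, 0.82, 0.78, 0.79, 0.00],
]

for cutoff in (0.5, 0.2, 0.01):
    print(cutoff, cut_dendrogram(example_dist, cutoff))
```

In this sketch, lowering the cut-off splits the data into progressively more clusters, until every pattern forms its own group.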
The other approach, which uses only the matrix of raw observations
(data points), gives a different picture. The 3D PCA score plot
(Figure 3a) based on the full matrix of observations shows a separation
into two clearly distinguishable groups, but this is in fact no more
helpful than just looking at the surface plot (Figure 1).