Variable       Mean    Std Dev   Skewness   Kurtosis   Bimodality
income      16.1667     9.9883     0.2684    -1.4015       0.2211
educ        13.1667     6.3692    -0.4510    -1.8108       0.2711
-Clusters Joined-                                       Centroid
                     FREQ   RMS STD    SPRSQ     RSQ    Distance
c1        c2            2    0.7071   0.0014    .999      1.4142
c3        c4            2    0.7071   0.0014    .997      1.4142
c5        c6            2    2.5495   0.0185    .979       5.099
CL4       CL3           4    5.5227   0.2409    .738          13
CL5       CL2           6    8.3766   0.7378    .000      19.704
The statistics above provide information about the cluster solution. RMSSTD
is the pooled standard deviation of all the variables forming the cluster.
Since the objective of cluster analysis is to form homogeneous groups, the
RMSSTD of a cluster should be as small as possible. SPRSQ (semipartial R-
CLUSTER=1

Obs    cid    income    educ
  1    c1          5       5
  2    c2          6       6

CLUSTER=2

Obs    cid    income    educ
  3    c3         15      14
  4    c4         16      15

CLUSTER=3

Obs    cid    income    educ
  5    c5         25      20
  6    c6         30      19
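The RMS STD values in the cluster history can be checked against the memberships listed above. The following is a minimal sketch, assuming that RMSSTD is the square root of the pooled within-cluster variance averaged over the variables; the function name `rmsstd` and the dictionary `data` are illustrative and do not come from the SAS output.

```python
import math

# Observations from the cluster listings above: cid -> (income, educ).
data = {
    "c1": (5, 5), "c2": (6, 6),
    "c3": (15, 14), "c4": (16, 15),
    "c5": (25, 20), "c6": (30, 19),
}

def rmsstd(members):
    """Pooled standard deviation over all variables for one cluster.

    Assumption: sqrt(sum of within-cluster squared deviations / (p * (n - 1))),
    where n is the cluster size and p the number of variables.
    """
    rows = [data[m] for m in members]
    n, p = len(rows), len(rows[0])
    ss = 0.0
    for j in range(p):
        mean_j = sum(r[j] for r in rows) / n
        ss += sum((r[j] - mean_j) ** 2 for r in rows)
    return math.sqrt(ss / (p * (n - 1)))

# Clusters formed at successive steps of the cluster history.
for members in (["c1", "c2"], ["c5", "c6"], ["c3", "c4", "c5", "c6"], list(data)):
    print(members, round(rmsstd(members), 4))
```

Under this assumption the four printed values agree with the RMS STD column (0.7071, 2.5495, 5.5227, 8.3766), which shows how sharply homogeneity falls once the dissimilar pairs are merged.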
[Figure: vertical axis "Distance Between Cluster Centroids" (0 to 20); horizontal axis "cid" (c1 to c6)]
You must specify either the MAXCLUSTERS= or the RADIUS= argument in the PROC
FASTCLUS statement.
The RADIUS= option establishes the minimum distance criterion for selecting
new seeds. No observation is considered as a new seed unless its minimum
distance to previous seeds exceeds the value given by the RADIUS= option. The
default value is 0.
The MAXCLUSTERS= option specifies the maximum number of clusters allowed. If
you omit the MAXCLUSTERS= option, a value of 100 is assumed.
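The interplay of RADIUS= and MAXCLUSTERS= can be sketched as follows. This is a simplified illustration only: it applies the minimum-distance rule and the cluster limit, and it performs no seed replacement, so it does not reproduce what PROC FASTCLUS does under its default replacement setting. The function `select_seeds` and its parameter names are hypothetical.

```python
import math

def select_seeds(points, radius=0.0, maxclusters=100):
    """Simplified seed selection (assumption: no seed replacement).

    An observation becomes a new seed only if its minimum distance to the
    seeds chosen so far exceeds `radius`, and only while fewer than
    `maxclusters` seeds have been chosen.
    """
    seeds = []
    for pt in points:
        if len(seeds) >= maxclusters:
            break
        if not seeds or min(math.dist(pt, s) for s in seeds) > radius:
            seeds.append(pt)
    return seeds

# (income, educ) for c1..c6, in data order.
points = [(5, 5), (6, 6), (15, 14), (16, 15), (25, 20), (30, 19)]

print(select_seeds(points, radius=0, maxclusters=3))
print(select_seeds(points, radius=5, maxclusters=3))
```

With a radius of 0 the first three observations are taken as seeds; with a radius of 5 the sketch skips c2 and c4, which lie too close to an existing seed, and selects c1, c3, and c5. The initial seeds in the run shown in this document differ from both results because that run uses seed replacement, which this sketch does not model.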
The REPLACE= option specifies how seed replacement is performed.
  FULL    requests default seed replacement.
  PART    requests seed replacement only when the distance between the
          observation and the closest seed is greater than the minimum
          distance between seeds.
  NONE    suppresses seed replacement.
  RANDOM  selects a simple pseudo-random sample of complete observations as
          initial cluster seeds.
The MAXITER= option specifies the maximum number of iterations for recomputing
cluster seeds. When the value of the MAXITER= option is greater than 0, each
observation is assigned to the nearest seed, and the seeds are recomputed as
the means of the clusters.
The LIST option lists all observations, giving the value of the ID variable
(if any), the number of the cluster to which the observation is assigned, and
the distance between the observation and the final cluster seed.
Maxclusters=3 Maxiter=20 Converge=0.02

Initial Seeds

Cluster         income           educ
      1     5.00000000     5.00000000
      2    30.00000000    19.00000000
      3    16.00000000    15.00000000
Minimum Distance Between Initial Seeds = 14.56022
Iteration History

                           Relative Change in Cluster Seeds
Iteration    Criterion          1           2           3
        1       1.5811     0.0486      0.1751      0.0486
        2       1.1180          0           0           0
Convergence criterion is satisfied.
Here, the cluster solution at the second iteration is the final cluster
solution because the change in cluster seeds at the second iteration is less
than the convergence criterion. Note that a zero change in the cluster seeds
at the second iteration implies that the reallocation did not reassign any
observations.
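The iteration history can be traced by hand from the initial seeds. The sketch below assumes that the relative change is the distance a seed moves divided by the minimum distance between the initial seeds, and that iteration stops once the largest relative change is no greater than the CONVERGE= value. The variable names are illustrative.

```python
import math

# (income, educ) for c1..c6 and the initial seeds from the output.
points = [(5, 5), (6, 6), (15, 14), (16, 15), (25, 20), (30, 19)]
seeds = [(5.0, 5.0), (30.0, 19.0), (16.0, 15.0)]
converge = 0.02
maxiter = 20

# Minimum distance between initial seeds, used to scale the seed changes.
min_seed_dist = min(
    math.dist(a, b) for i, a in enumerate(seeds) for b in seeds[i + 1:]
)

history = []
for iteration in range(1, maxiter + 1):
    # Assign each observation to its nearest seed.
    clusters = [[] for _ in seeds]
    for pt in points:
        nearest = min(range(len(seeds)), key=lambda k: math.dist(pt, seeds[k]))
        clusters[nearest].append(pt)
    # Recompute each seed as the mean of its cluster (unchanged if empty).
    new_seeds = [
        tuple(sum(col) / len(members) for col in zip(*members)) if members else seed
        for members, seed in zip(clusters, seeds)
    ]
    # Relative change of each seed (assumed scaling, see the text above).
    changes = [math.dist(a, b) / min_seed_dist for a, b in zip(seeds, new_seeds)]
    history.append([round(c, 4) for c in changes])
    seeds = new_seeds
    if max(changes) <= converge:
        break

print(round(min_seed_dist, 5))
for row in history:
    print(row)
print(seeds)
```

Under these assumptions the sketch stops after two iterations. The first row of relative changes matches the reported 0.0486, 0.1751, 0.0486, and the second row is all zeros because no observation changes cluster.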
Cluster Listing

                        Distance
Obs    cid    Cluster   from Seed
  1    c1           1      0.7071
  2    c2           1      0.7071
  3    c3           3      0.7071
  4    c4           3      0.7071
  5    c5           2      2.5495
  6    c6           2      2.5495
Criterion Based on Final Seeds = 1.1180
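The listed distances and the criterion value can be recomputed from the data and the final seeds (the cluster means reported in the output). This is a sketch, not a statement of how SAS defines the criterion internally: for these data the reported value coincides with the square root of the summed squared distances divided by the number of observations times the number of variables, and that is the formula assumed here.

```python
import math

# (income, educ) per observation and the final seeds (cluster means).
points = {"c1": (5, 5), "c2": (6, 6), "c3": (15, 14),
          "c4": (16, 15), "c5": (25, 20), "c6": (30, 19)}
final_seeds = {1: (5.5, 5.5), 2: (27.5, 19.5), 3: (15.5, 14.5)}

# Assign each observation to its nearest final seed and record the distance.
listing = []
for cid, pt in points.items():
    cluster = min(final_seeds, key=lambda k: math.dist(pt, final_seeds[k]))
    listing.append((cid, cluster, math.dist(pt, final_seeds[cluster])))

# Assumed criterion: sqrt(sum of squared distances / (n * p)).
n, p = len(points), 2
criterion = math.sqrt(sum(d ** 2 for _, _, d in listing) / (n * p))

for cid, cluster, dist in listing:
    print(cid, cluster, round(dist, 4))
print(round(criterion, 4))
```

The assignments and distances agree with the cluster listing, and the assumed formula gives 1.1180 for the final seeds.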
Cluster Summary

                                   Maximum Distance
                        RMS Std           from Seed      Radius    Nearest
Cluster    Frequency  Deviation      to Observation    Exceeded    Cluster
      1            2     0.7071              0.7071                      3
      2            2     2.5495              2.5495                      3
      3            2     0.7071              0.7071                      2
Non-Hierarchical Cluster Analysis of Hypothetical Data
The FASTCLUS Procedure
Replace=FULL Radius=0 Maxclusters=3 Maxiter=20 Converge=0.02
Cluster Summary

            Distance Between
Cluster    Cluster Centroids
      1              13.4536
      2              13.0000
      3              13.0000
The statistics used for the evaluation of the cluster solution are the same as
in the hierarchical cluster analysis.
Statistics for Variables

Variable    Total STD    Within STD    R-Square    RSQ/(1-RSQ)
income        9.98833       2.12132    0.972937      35.950617
educ          6.36920       0.70711    0.992605     134.222222
OVER-ALL      8.37655       1.58114    0.978622      45.777778
The cluster solution can also be evaluated with respect to each clustering
variable. If the measurement scales are not the same, then for each variable
one should obtain the ratio of the respective Within STD to the Total STD,
and compare this ratio across the variables.
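The per-variable statistics and the suggested ratio can be recomputed from the final cluster memberships. The sketch below assumes the usual definitions: Total STD uses n - 1 degrees of freedom, Within STD uses n - k (k being the number of clusters), and R-Square is one minus the within-cluster sum of squares over the total sum of squares. The names `clusters`, `sum_sq`, and `stats` are illustrative.

```python
import math

# Final cluster memberships as (income, educ) pairs.
clusters = {
    1: [(5, 5), (6, 6)],
    2: [(25, 20), (30, 19)],
    3: [(15, 14), (16, 15)],
}
variables = ("income", "educ")

def sum_sq(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

n = sum(len(rows) for rows in clusters.values())
k = len(clusters)

stats = {}
for j, name in enumerate(variables):
    all_values = [row[j] for rows in clusters.values() for row in rows]
    sst = sum_sq(all_values)                                  # total SS
    ssw = sum(sum_sq([row[j] for row in rows]) for rows in clusters.values())
    total_std = math.sqrt(sst / (n - 1))
    within_std = math.sqrt(ssw / (n - k))
    r_square = 1 - ssw / sst
    stats[name] = (total_std, within_std, r_square, within_std / total_std)

for name, (total_std, within_std, r_square, ratio) in stats.items():
    print(f"{name:8s} {total_std:8.5f} {within_std:8.5f} {r_square:8.6f} {ratio:6.4f}")
```

The last printed column is the ratio of Within STD to Total STD. For these data it is roughly 0.21 for income and 0.11 for educ, so on this measure the clusters are more homogeneous with respect to educ than income.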
Pseudo F Statistic = 68.67
WARNING: The two values above are invalid for correlated variables.
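The pseudo F statistic reported above can be related to the over-all R-Square. The sketch assumes the commonly used form, the ratio of between-cluster to within-cluster mean squares written in terms of R-Square; the function name `pseudo_f` is illustrative.

```python
def pseudo_f(r_square, n_obs, n_clusters):
    """Pseudo F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)] (assumed form)."""
    between = r_square / (n_clusters - 1)
    within = (1 - r_square) / (n_obs - n_clusters)
    return between / within

# Over-all R-Square from the Statistics for Variables table; 6 observations, 3 clusters.
value = pseudo_f(0.978622, n_obs=6, n_clusters=3)
print(round(value, 2))
```

Equivalently, the value is the over-all RSQ/(1-RSQ) of 45.777778 multiplied by (n - k)/(k - 1) = 3/2.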
Cluster Means

Cluster          income            educ
      1      5.50000000      5.50000000
      2     27.50000000     19.50000000
      3     15.50000000     14.50000000
Cluster Standard Deviations

Cluster          income            educ
      1     0.707106781     0.707106781
      2     3.535533906     0.707106781
      3     0.707106781     0.707106781
Distance Between Cluster Centroids

Cluster               1               2               3
      1               .     26.07680962     13.45362405
      2     26.07680962               .     13.00000000
      3     13.45362405     13.00000000               .
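The distances between cluster centroids reported above follow directly from the cluster means. A minimal sketch, assuming Euclidean distance on the unstandardized variables; the names `means`, `distances`, and `nearest` are illustrative.

```python
import math

# Cluster means (income, educ) from the output.
means = {1: (5.5, 5.5), 2: (27.5, 19.5), 3: (15.5, 14.5)}

# Euclidean distance between every ordered pair of distinct centroids.
distances = {
    (a, b): math.dist(means[a], means[b])
    for a in means for b in means if a != b
}

# Nearest cluster for each cluster, as in the Cluster Summary.
nearest = {
    a: min((b for b in means if b != a), key=lambda b: distances[(a, b)])
    for a in means
}

for a in means:
    row = ["." if a == b else f"{distances[(a, b)]:.8f}" for b in means]
    print(a, *row, "nearest:", nearest[a])
```

The nearest-cluster column and its distance in the Cluster Summary are the row minima of this matrix.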