UNSUPERVISED CLASSIFICATION

In contrast to supervised classification, unsupervised classification requires only a minimal amount of initial input from the analyst. It is a process in which numerical operations search for natural groupings of the spectral properties of pixels, as examined in multispectral feature space. The user allows the computer to select the class means and covariance matrices to be used in the classification. Once the data are classified, the analyst attempts to assign these natural, or spectral, classes to the information classes of interest. This may not be easy: some of the clusters may be meaningless because they represent mixed classes of earth surface materials. The analyst, who understands the spectral characteristics of the terrain, resolves this ambiguity when assigning clusters to information classes.

Clustering is one of the most important tasks in data mining and knowledge discovery. It attempts to find subsets within a given data set that are similar enough to warrant further analysis. It organizes a set of objects into groups (or clusters) such that objects in the same group are similar to each other and different from those in other groups. These groups or clusters should have meaning in the context of a particular problem.

In clustering, two important and fundamental tasks are the definition of a) a proximity measure (similarity function) between two data objects and b) the overall optimization search strategy, i.e. how to find the best overall grouping according to an optimization criterion. Clustering, commonly known as unsupervised classification, does not need any training data and is especially useful when the user has limited knowledge about the data.

Clustering algorithms partition data into a certain number of clusters (groups, subsets, or categories). There is no universally agreed-upon definition of a cluster. Most researchers describe a cluster by considering a) internal homogeneity and b) external separation, i.e. patterns in the same cluster should be similar to each other, while patterns in different clusters should not. Both the similarity and the dissimilarity should be examinable in a clear and meaningful way.

STEPS IN CLUSTER ANALYSIS
a) Data to cluster: data acquisition, preparation and cleaning.
b) Variables to use: selection of relevant variables for performing the clustering procedure. Irrelevant or masking variables should be excluded as far as possible.
c) A proximity measure: designing a proper proximity measure.
d) The clustering procedure.
e) Number of clusters: even no cluster is a possible outcome.
f) Replication, testing and interpretation.

The clustering algorithms used for the unsupervised classification of remotely sensed data generally vary in the efficiency with which clustering takes place. A conceptually simple, though not necessarily efficient, clustering algorithm known as CLUSTER is used below to demonstrate the fundamental logic of unsupervised classification. This algorithm operates in a two-pass mode. In the first pass, the algorithm sequentially builds class clusters. In the second pass, a minimum-distance classifier is applied to the whole data set on a pixel-by-pixel basis, and each pixel is assigned to one of the mean vectors created in pass 1.

CLUSTER Algorithm
Pass 1: Cluster Building
During the first pass, the analyst may be required to supply four types of information: (i) R, the radius of the cluster, (ii) C, a distance parameter for merging clusters, (iii) N, the number of pixels to be evaluated between each merging of the clusters, and (iv) Cmax, the maximum number of clusters to be identified by the algorithm.

To start building the cluster centres, the first pixel of the image is taken as the cluster centre of the first class. The second pixel is then taken up, and its membership of the first cluster is tested by computing the distance between this pixel and the cluster centre of class 1. If that distance is less than or equal to R, the pixel belongs to class 1. Class 1 now has two pixels within its cluster, and its cluster centre is updated to the average value of both pixels.
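The step above can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: it assumes the distance is Euclidean (the slides only say "distance"), and the helper names `euclidean` and `add_to_cluster` are hypothetical. The pixel values and R = 15 are the ones used in the accompanying figure.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two pixels given as brightness-value tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def add_to_cluster(centre, count, pixel):
    """Return the updated centre (running mean) and member count."""
    new_count = count + 1
    new_centre = tuple((c * count + v) / new_count for c, v in zip(centre, pixel))
    return new_centre, new_count

R = 15.0                          # cluster radius, as in the figure
centre, count = (10.0, 10.0), 1   # Pixel 1 seeds the centre of class 1
pixel_2 = (20.0, 20.0)

d = euclidean(pixel_2, centre)    # assumed Euclidean distance
if d <= R:                        # Pixel 2 falls inside the radius
    centre, count = add_to_cluster(centre, count, pixel_2)

print(round(d, 2), centre, count)  # 14.14 (15.0, 15.0) 2
```

Storing the member count alongside the centre lets the mean be updated without keeping every pixel in memory.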

[Figure: Pixel 1 (10, 10), Pixel 2 (20, 20) and Pixel 3 (30, 20) plotted in two-band feature space, with brightness values on both axes. With R = 15, Cluster #1 is centred at (15, 15) after the first iteration.]

Now the third pixel is taken up for examination. If the distance between this pixel and the cluster centre of class 1 is less than or equal to R, the pixel belongs to class 1, and the cluster centre of class 1 is adjusted by taking the average of all three pixels. If the distance exceeds R, the pixel does not belong to class 1 and instead becomes the cluster centre of a new class, class 2.
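A hedged sketch of this sequential rule, applied to the three example pixels (10, 10), (20, 20) and (30, 20), is shown below. The function name `build_clusters` is hypothetical. Two assumptions go beyond the slides: the distance is Euclidean, and once several clusters exist a pixel is tested against its nearest centre. Merging and the Cmax limit are left out.

```python
import math

def build_clusters(pixels, radius):
    """Sequentially build cluster centres (no merging, no Cmax limit)."""
    centres, counts = [], []
    for pixel in pixels:
        if centres:
            # Assumption: test the pixel against its nearest existing centre.
            dists = [math.dist(pixel, c) for c in centres]
            k = min(range(len(centres)), key=dists.__getitem__)
            if dists[k] <= radius:
                # Inside the radius: fold the pixel into the running mean.
                n = counts[k]
                centres[k] = tuple(
                    (c * n + v) / (n + 1) for c, v in zip(centres[k], pixel)
                )
                counts[k] = n + 1
                continue
        # First pixel, or farther than R from every centre: seed a new cluster.
        centres.append(tuple(float(v) for v in pixel))
        counts.append(1)
    return centres, counts

centres, counts = build_clusters([(10, 10), (20, 20), (30, 20)], radius=15.0)
print(centres, counts)  # [(15.0, 15.0), (30.0, 20.0)] [2, 1]
```

Pixel 3 lies about 15.81 from the centre (15, 15), which exceeds R = 15, so it seeds the second cluster.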

[Figure: Band 4 versus Band 5 brightness values. Pixel 3 (30, 20) lies at distance D = 15.81 from Cluster #1, centred at (15, 15) after the first iteration. Since D exceeds R = 15, Pixel 3 becomes Cluster #2.]

This process of building clusters continues until N pixels have been examined for membership of the existing clusters. At this point, cluster building stops temporarily and the distances between the cluster centres are examined for separability. The cluster centres of all classes must be separated by at least a minimum distance C. Clusters whose centres lie less than C apart are merged, because they are taken to belong to the same cluster. The centre of a merged cluster is the weighted average of the old cluster centres being merged.
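The merging check can be sketched as follows. This is an illustration under stated assumptions: the distance is Euclidean, the weights in the weighted average are the pixel counts of the clusters being merged (the slides do not say what the weights are), and the function name `merge_close_clusters` and the third centre (18, 18) are hypothetical.

```python
import math

def merge_close_clusters(centres, counts, c_min):
    """Merge any pair of centres closer than c_min; repeat until none remain."""
    centres, counts = list(centres), list(counts)
    merged = True
    while merged:
        merged = False
        for i in range(len(centres)):
            for j in range(i + 1, len(centres)):
                if math.dist(centres[i], centres[j]) < c_min:
                    total = counts[i] + counts[j]
                    # Assumption: weight each old centre by its pixel count.
                    centres[i] = tuple(
                        (a * counts[i] + b * counts[j]) / total
                        for a, b in zip(centres[i], centres[j])
                    )
                    counts[i] = total
                    del centres[j], counts[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan, since indices have shifted
    return centres, counts

merged_centres, merged_counts = merge_close_clusters(
    [(15.0, 15.0), (30.0, 20.0), (18.0, 18.0)], [2, 1, 1], c_min=5.0
)
print(merged_centres, merged_counts)  # [(16.0, 16.0), (30.0, 20.0)] [3, 1]
```

Restarting the scan after each merge is simple rather than fast, which matches the conceptual purpose of the CLUSTER example.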

Once the cluster centres have been checked for proper separability, cluster building resumes from the point where it stopped. The cluster centres tend to move during the initial phase; as more pixels are examined, their positions begin to stabilize before converging to fixed positions. Cluster building continues until the maximum number of cluster centres (Cmax) has been identified or the end of the image is reached. Finally, the separability of each cluster is checked before proceeding to Pass 2.

[Figure: Band 4 versus Band 5 brightness values, showing the centres of Cluster #1 and Cluster #2 moving from their beginning positions to their ending positions.]


Pass 2: Classification of Image
Once the cluster centres of all the classes have been identified, classification of the image starts. Each pixel is assigned a class membership using a minimum-distance-to-means classifier. When the whole image has been classified, the analyst examines the classified image. Because the classes that have been identified are spectral classes rather than informational classes, the analyst must then convert the spectral classes into informational classes.
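A minimal sketch of the Pass 2 classifier follows, assuming Euclidean distance and an image held as nested lists of brightness-value tuples. The function name `classify`, the 2 x 2 example image and the two centres are illustrative only.

```python
import math

def classify(image, centres):
    """Label every pixel with the index of its nearest cluster centre."""
    return [
        [
            min(range(len(centres)), key=lambda k: math.dist(pixel, centres[k]))
            for pixel in row
        ]
        for row in image
    ]

centres = [(15.0, 15.0), (30.0, 20.0)]   # cluster means from Pass 1
image = [
    [(10, 10), (20, 20)],
    [(30, 20), (28, 22)],
]
labels = classify(image, centres)
print(labels)  # [[0, 0], [1, 1]]
```

The labels are indices of spectral classes only; they carry no land-cover meaning until the analyst interprets them.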

In this process of conversion, two or more spectral classes may combine to yield a single information class. The process is tedious, cumbersome and complex, and it requires a great deal of expertise on the part of the analyst in merging many spectral classes into one informational class.
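Once the analyst has decided which spectral classes belong together, the relabelling itself is a simple lookup. The sketch below is purely illustrative: the table `spectral_to_info` and the class names in it are hypothetical, and in practice the mapping comes from the analyst's interpretation, not from the algorithm.

```python
# Hypothetical analyst-defined mapping: several spectral classes
# may share one information class.
spectral_to_info = {0: "water", 1: "forest", 2: "forest", 3: "urban"}

# Spectral class labels for a small example image.
labels = [
    [0, 1, 2],
    [3, 2, 0],
]

info_map = [[spectral_to_info[k] for k in row] for row in labels]
print(info_map)
# [['water', 'forest', 'forest'], ['urban', 'forest', 'water']]
```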