
REPRESENTATION, PROCESSING, ANALYSIS, AND UNDERSTANDING OF IMAGES

Unsupervised Classification of Image Texture


V. S. Sidorova
Institute of Computational Mathematics and Mathematical Geophysics, Siberian Branch of the Russian Academy of Sciences, pr. Lavrenteva 6, Novosibirsk, 630090 Russia
e-mail: vsidorova@inbox.ru, svs@ooi.sscc.ru

Abstract: An automatic histogram-based algorithm for clustering statistical textural features of an image is presented. The algorithm incorporates an estimate of the quality of the obtained distribution of feature vectors over clusters. It is applied to classification of forest aerial imagery.

DOI: 10.1134/S1054661808040263

1. INTRODUCTION

One way to improve the classification accuracy of remote sensing data is to use, in addition to spectral features, textural features, which describe patterns of pixel distribution across the image. Unsupervised classification normally makes no assumptions about the structure of objects in the image and operates in terms of statistical textural features computed over a neighborhood of an image point and associated with this point in the form of an N-dimensional vector. Object classification based on such feature vectors can be accomplished by using the clustering algorithms employed for multispectral imagery in remote sensing. Such algorithms, applied to large sets of remote sensing data, can be roughly divided into three groups: K-means, hierarchical, and histogram-based methods [1]. Algorithms based on analysis of multidimensional histograms have several advantages over other methods: there is no need to specify the number of clusters in advance, as in the K-means algorithms, and they are much faster than hierarchical methods. Their limitation is that they require a lot of computer memory. However, by making directly accessible only the vectors present in the given part of the image, it becomes possible to deal with high-dimensional data sets. For example, various hashing techniques are available to retrieve both the data and the histograms [2].

Clustering algorithms often fail to produce well-isolated clusters in the feature space and thus require subsequent analysis of the classification quality. This is the subject matter of clustering validation methods, reviewed along with clustering algorithms in [3]. That survey, however, does not touch upon histogram-based classification methods. The approach taken in the present study relies on the clustering algorithm and the quality measure introduced in [4].
Its application to analysis of multispectral satellite imagery showed that the obtained clusters corresponded to meaningful land cover classes. In this paper, the algorithm is applied to classifying satellite imagery by textural features. Because such features reflect spatial rather than pointwise characteristics of the object, several problems have to be solved in using textural features for object classification. One such problem is choosing the right size of the neighborhood where textural statistics are collected. This size is known to strongly affect the classification results. Another problem is that the segments of the obtained clusters in the image should not be thinner than this neighborhood. Thin, false clusters are likely to appear at the boundary between texturally different objects. Therefore, subsequent analysis of the obtained results is required. In this paper, we show how these problems can be solved automatically, without operator intervention.

The algorithm is intended for unsupervised classification of forests from their grayscale aerial images of a certain resolution. The forest texture in such images reflects the intrinsic structure of forest communities. For forestry specialists, this is the most important characteristic of the forest. Ground forest estimation and measurement of the age of trees from their annual rings are limited to a small number of regions specially designated using visual features of texture. The type and age of a forest can normally be inferred from its texture in a grayscale image, and these data, in turn, can tell us about other important forest characteristics such as its composition, tree height, forest yield, etc.

2. CLASSIFICATION ALGORITHM

Classification is accomplished by using a fast nonparametric algorithm to partition the vector space into unimodal clusters corresponding to local maxima of the histogram. Boundaries of such clusters are aligned with histogram valleys [5]. The histogram is viewed as an approximation to the probability density function of the data.

Received March 5, 2008

ISSN 1054-6618, Pattern Recognition and Image Analysis, 2008, Vol. 18, No. 4, pp. 693–699. © Pleiades Publishing, Ltd., 2008.

For each vector, an elemental graph is constructed in the direction of the probability density gradient computed across this vector's neighbors. The vectors are linked into trees by means of such graphs. When the graph runs into a local maximum of the histogram at some vector, which becomes the tree root, the entire tree is assigned to the same cluster as its root. If several vectors share a maximum in the histogram and thus form a plateau, then all of them are assigned to the same cluster. The number of operations required to trace the elemental graphs is linear in the number of vectors. The calculations needed to construct such graphs deal only with scalar values of the histogram. Multidimensional vectors are stored as an ordered list, which makes searching for vector neighbors a fast procedure as well.

Unfortunately, a histogram may have a very large number of local maxima, which can arise from very close clusters or be just random spikes on the histogram surface. Such clusters can be merged, but, to select the best variant, it is important to assess the quality of the distribution of vectors over clusters. To do that, one can vary the level of detail of the input data prior to classification by merging vectors. Elsewhere, we suggested running the algorithm several times for different numbers N of quantization levels of the vector space [4]. Let the initial number of quantization levels be $N < N_0$, $N_0 = 256$. The cell size for this number of levels is given by $k_f = (N_0 - 1)/(N - 1)$. Let L be the number of features, $f = [f(1), f(2), \ldots, f(L)]$ be a feature vector, and $g = [g(1), g(2), \ldots, g(L)]$ be the vector obtained by quantizing f:

$$ g(k) = \left[ \frac{f(k)}{k_f} \right], \quad k = 1, \ldots, L, $$

where $[\cdot]$ denotes the integer part of a number. The new vectors g are fed to the classification procedure. By varying N, we obtain a series of vector distributions over clusters, among which the best ones are determined in terms of a distribution quality measure. The quality measure $M^j(N)$ for each individual unimodal cluster is given by (1), and the quality measure $M(N)$ of the overall distribution is calculated as the average of $M^j(N)$ over all $K(N)$ clusters (2):

$$ M^j(N) = \frac{1}{B^j(N)\,H^j(N)} \sum_{i=1}^{B^j(N)} h_i^j(N), \tag{1} $$

$$ M(N) = \frac{1}{K(N)} \sum_{j=1}^{K(N)} M^j(N), \tag{2} $$

where $h_i^j(N)$ is the histogram value at the i-th boundary point of cluster j, $B^j(N)$ is the number of boundary points in cluster j, and $H^j(N)$ is the maximum value of the histogram in cluster j. The smaller the value of (1), the better the cluster. The minima of (2) correspond to the best classifications. In all cases, $M^j(N) \le 1$ and $M(N) \le 1$. The boundary vectors in (1) can be readily determined from the list of vector neighbors constructed as part of classification.
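As an illustration of the procedure described in this section, the following Python sketch quantizes feature vectors, links histogram cells by steepest ascent, and evaluates measures (1) and (2). It is not the implementation of [4]. The function names are hypothetical, and three choices are assumptions made only for this example: neighbors are all occupied cells differing by at most one level per coordinate; a boundary point is an occupied cell with an occupied neighbor in another cluster; and a cluster without such points gets $M^j(N) = 0$.

```python
from collections import Counter, deque
from itertools import product

N0 = 256  # number of levels of the normalized features (values 0..255)


def quantize(vectors, n_levels):
    """g(k) = [f(k) / k_f] with cell size k_f = (N0 - 1) / (N - 1)."""
    k_f = (N0 - 1) / (n_levels - 1)
    return [tuple(int(f / k_f) for f in vec) for vec in vectors]


def occupied_neighbours(cell, hist):
    """Occupied cells differing from `cell` by at most one level per coordinate."""
    for delta in product((-1, 0, 1), repeat=len(cell)):
        nb = tuple(c + d for c, d in zip(cell, delta))
        if nb != cell and nb in hist:
            yield nb


def cluster_cells(hist):
    """Map every occupied cell to the label of the mode it climbs to."""
    uphill = {}
    for cell in hist:
        best = max(occupied_neighbours(cell, hist), key=hist.get, default=cell)
        uphill[cell] = best if hist[best] > hist[cell] else None
    # Flat shoulders: an unlinked cell next to an equal-count cell that already
    # climbs is linked to that cell.
    queue = deque(c for c, up in uphill.items() if up is not None)
    while queue:
        cell = queue.popleft()
        for nb in occupied_neighbours(cell, hist):
            if uphill[nb] is None and hist[nb] == hist[cell]:
                uphill[nb] = cell
                queue.append(nb)
    # Cells still unlinked are modes; adjacent ones (a plateau) share one label.
    labels, n_clusters = {}, 0
    for cell in sorted(hist):
        if uphill[cell] is None and cell not in labels:
            labels[cell] = n_clusters
            stack = [cell]
            while stack:
                for nb in occupied_neighbours(stack.pop(), hist):
                    if uphill[nb] is None and nb not in labels:
                        labels[nb] = n_clusters
                        stack.append(nb)
            n_clusters += 1
    # Every other cell inherits the label found by following its uphill links.
    for cell in hist:
        path = []
        while cell not in labels:
            path.append(cell)
            cell = uphill[cell]
        for visited in path:
            labels[visited] = labels[cell]
    return labels


def quality(hist, labels):
    """Return ({label: M_j}, M) following Eqs. (1) and (2)."""
    peak, boundary = {}, {}
    for cell, lab in labels.items():
        peak[lab] = max(peak.get(lab, 0), hist[cell])       # H^j(N)
        values = boundary.setdefault(lab, [])
        if any(labels[nb] != lab for nb in occupied_neighbours(cell, hist)):
            values.append(hist[cell])                       # h_i^j(N)
    m_j = {lab: sum(v) / (len(v) * peak[lab]) if v else 0.0
           for lab, v in boundary.items()}
    return m_j, sum(m_j.values()) / len(m_j)


# Toy one-feature data set: histogram counts with two modes (at 11 and 15).
counts = {10: 2, 11: 5, 12: 3, 13: 1, 14: 4, 15: 6, 16: 2}
vectors = [(value,) for value, n in counts.items() for _ in range(n)]
hist = Counter(quantize(vectors, 256))  # N = 256 leaves integer values unchanged
labels = cluster_cells(hist)
m_j, m = quality(hist, labels)
print(len(set(labels.values())), m_j, m)
```

On this toy histogram the valley between the two modes lies between cells 12 and 13, so the sketch yields two clusters. Rerunning it with a smaller `n_levels` merges neighboring cells before clustering, which is how the level of detail is varied in the text.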

3. CLASSIFICATION QUALITY MEASURE

It is a common requirement that the classification quality measure should peak when cluster members are close to cluster centers and the clusters are far away from one another. Indicator measures are known to be the least computationally intensive. In most such measures [3], the closeness of vectors to cluster centers is evaluated in terms of cluster variance, and the remoteness of clusters is estimated in terms of the distance between cluster centers. The same criteria were used in designing indicator quality measures for supervised classification [6]. Even so, these two approaches differ significantly both in their basic ideas and in the ways of estimating the validity of the classifications they produce.

In supervised classification, each class is represented by its own histogram, i.e., its individual vector distribution. The variance of a class is determined by its distribution density function, and histogram values range from zero to a maximum. Finding the best-quality classification consists in minimizing the region over which the classes overlap. Because computing multidimensional integrals for large data sets may not be practical, indicator measures are used instead.

Unsupervised classification is often carried out using rigid clustering algorithms, which partition the vector space into nonintersecting clusters. Unlike in supervised classification, in this case one lacks full knowledge of the distribution function for each cluster, and so it would be wrong to associate the degree of vector flocking around the center of a cluster with the cluster's variance. In this case, the scatter of a cluster is determined not only by the fall-off rate of the density function but also by the cluster size. In unsupervised classification, increasing the separation between cluster centers can be interpreted as increased cluster size.
Hence, it is impossible to tell whether a small scatter of cluster points is caused by a high fall-off rate of the probability density function or merely results from the small size of the cluster. Therefore, small clusters with a more sluggish variation of the distribution density could outdo larger clusters, and thus neither of the two above-stated classification quality requirements would be satisfied. The closer the objects to be classified are in the feature space, the higher the risk of facing this situation. Unfortunately, this happens quite often when dealing with remote sensing data.

Measure (1) directly relates the average number of vectors at the boundary points of a cluster to the number of vectors at its modal point. Let us consider the behavior of the distribution function (histogram) of cluster j along a direction determined by two points: the modal point and a boundary point i. Suppose that the one-dimensional density function along the chosen direction is represented by the restriction of a normal density to the segment [0, R], where 0 corresponds to the modal vector and R is its distance from boundary point i of the cluster. Let $\alpha = h_i^j(N)/H^j(N)$. In the one-dimensional case, the normal density function for a class j is given by

$$ p_j(x) = \frac{1}{(2\pi)^{1/2}\,\sigma_j} \exp\left[ -\frac{1}{2}\,\frac{(x - \mu_j)^2}{\sigma_j^2} \right], \tag{3} $$

where $\mu_j$ is the mean value and $\sigma_j^2$ is the variance of cluster j. The ratio $\alpha$ for the normal density function can be expressed as

$$ \alpha = \frac{h_i^j(N)}{H^j(N)} = \exp\left( -\frac{R^2}{2\sigma_j^2} \right) $$

($\sigma_j > 0$ for the given segment). We see that $\alpha$ decreases with increasing R, the cluster size in the given direction, and with decreasing $\sigma_j$, where $\sigma_j^2$ is the unknown variance of the overall distribution, which governs its fall-off rate, rather than a sample variance estimated over the segment [0, R]. This resolves the ambiguity mentioned above: the smaller the value of $\alpha$ estimated from the histogram, which depends on the relationship between the steepness of the histogram fall-off and the cluster size, the higher the quality of the cluster. The size R reflects the remoteness of the given cluster from other clusters, and the fall-off rate governs the degree of vector scatter around the cluster center. All requirements regarding the cluster quality measure are, therefore, met.

Let $G_j(x)$ be a one-dimensional section of the histogram along the segment [0, R], and let its variation be described by the normal density function $p_j(x)$ (3). Then it follows that

$$ G_j(x) = H^j(N)\,\alpha^{(x/R)^2} = H^j(N) \left[ \frac{h_i^j(N)}{H^j(N)} \right]^{(x/R)^2}. \tag{4} $$

Consider the fraction of boundary points within the overall cluster size. Within the family of decreasing functions (4) with the parameter $H^j(N)$, the value taken on by $G_j(x)$ at any point x will be greater for functions with larger values of this parameter. Therefore, for such a function, the cluster size will also be larger. It follows that, whenever $\alpha$ decreases just because $\sigma_j$ decreases (i.e., the function falls off more steeply) while R is fixed, the percentage of boundary points among all cluster points will also decrease. If, on the other hand, a decrease in $\alpha$ is caused by an increase in R with $\sigma_j$ unchanged, then, by fixing $H^j(N)$, we can see that the size of the cluster increases and, concurrently, $h_i^j(N)$ decreases. Here again, the fraction of boundary points among all cluster points will decrease with decreasing $\alpha$. The fraction of boundary points in the overall cluster size also depends on the density function profile, which may well not be Gaussian. Meanwhile, the behavior of the cluster near its boundary is particularly important for recognition, and for this reason (1) is expressed in terms of the ratio between boundary and modal points. The smaller the value of (1), the more likely it is that most of the cluster vectors flock around the modal vector and only a small part of them falls within the boundary region. One should also take into account that, in multivariate problems, the fraction of cluster boundary points is averaged over all directions, which is equivalent to a sort of smoothing. The quality measure of the overall distribution of vectors across clusters is also statistical in nature.

4. CLASSIFICATION BY TEXTURAL FEATURES

Statistical textural features are computed over a certain neighborhood of a point in the image. Let such a neighborhood be a square window of the same size for all points, with the size to be determined by the algorithm. The smaller the window size, the more exactly cluster boundaries could, in principle, be determined in the image. However, the feature value for each given texture becomes stable only for sufficiently large windows. Starting with some small window size, let us gradually increase it, finding at every step the best classification and the corresponding number of clusters. Suppose now that, after some size is reached, not only do the features stabilize within all clusters corresponding to extended textural objects, but the number of clusters also no longer changes. As the window size increases, however, the boundary points falling within the window could give rise to new clusters. This contribution should decrease as the window grows, provided that the fraction of boundary points in the image is fairly small. In addition, only the best-quality classifications are selected for every window size. It is unlikely that the boundary clusters corresponding to parts of different objects would be well isolated. As soon as the number of clusters stops changing with the window size, the classification corresponding to the smallest such window can be selected.

Fig. 1. (a) Forest landscape; (b) cluster map; (c) forest parts manually delineated according to development phases; and (d) vector diagram (axes TONE and MEAN).

Another requirement specific to texture classification is that the obtained cluster segments should not be thinner than the sampling window. Thin false clusters could arise on boundaries of image objects with different textures. When the cluster map is constructed, such clusters can be joined to clusters of adjacent segments. In order to identify false clusters, the ratio D between the number L of boundary points of each cluster (i.e., the perimeter of all its segments) on the map and the cluster area S (D = L/S) can be compared against a certain threshold. This threshold can be set to the value calculated for a square window of the obtained size. If D exceeds this threshold, the cluster is assumed to be false. For such a cluster, we consider its two largest neighbors P1 and P2 in the image. Out of these two clusters, we choose the one least isolated from the given false cluster in the feature space. Let j be a false cluster; compute separately the contributions to $M^j(N)$ (1) of the boundary vectors of cluster j that neighbor clusters P1 and P2 in the vector space. The false cluster

can then be merged with the one for which this contribution is larger. Cluster merging based on estimating the isolation of clusters rather than the distance to the nearest cluster is well in line with the major idea of the histogram scheme, which is to draw boundaries between clusters along histogram valleys, i.e., regions with low vector density. If the image neighbors P1 and P2 of a cluster j fail to neighbor it in the vector space, then we can join j to the one of the two clusters that has the longest boundary.

5. TEXTURAL FEATURES

Texture was quantified in terms of Haralick's texture statistics vector [7] $P^{r,\theta}(N)$. The i-th component of this vector is the probability that the absolute difference between the gray levels at two pixels separated by a vector $\{r, \theta\}$ equals i. Here, N is the number of image gray levels. The features considered for r = 1 were MEAN, CON, ENT, and ASM:

$$ \mathrm{MEAN} = \sum_i i\,P_i^{r,\theta}, \qquad \mathrm{CON} = \sum_i i^2\,P_i^{r,\theta}, $$

$$ \mathrm{ASM} = \sum_i \left( P_i^{r,\theta} \right)^2, \qquad \mathrm{ENT} = -\sum_i P_i^{r,\theta} \log\left( P_i^{r,\theta} \right). $$

The image is first equalized and the number of its gray levels is reduced [7]. The statistics vector is computed over a chosen window for each image pixel. Each component of the obtained vector is normalized by converting it to values ranging from 0 to 255. In order to make the features invariant with respect to direction, their average over four values of $\theta$ is taken.

6. NUMERIC EXPERIMENTS

The algorithm was used to classify forest areas in aerial imagery of terrain. A specific trait of such data is that neighboring objects have close textural features, and so the degree of isolation of adjacent clusters is fairly low. An aerial view of a forest landscape in West Siberia at a scale of 1 : 50000 is shown in Fig. 1a. The size of the digital image is 582 × 1374, and its resolution is 5 m²/pixel. The crowns of individual trees are not visible at this resolution, and the forest texture is formed by interlaced lighter groups of birches (the image was taken in the autumn) and darker groups of Siberian cedar. These stands of cedar are in the three

senior age phases of evolution. The three uniformly light blobs are surfaces of water bodies. The forest in this image represents a community of birch and Siberian cedar. After reviving on fire sites and clearings, a forest passes through several stages of development. Gray-level images of such forests also change in a predictable way [8]. In the early stages of forest development, birch grows more actively, while Siberian cedar rises under its canopy. Later on, birch crowns close in, and the image becomes lighter and less textured. Finally, cedar trees push their way into the upper storey and gradually dislodge the birches. The image becomes darker, and the texture first grows more salient but later fades away.

Prior to classification, the image was equalized and the number of gray levels was reduced to 30. The classification procedure used two textural features: the average gray level TONE and Haralick's MEAN feature. Characteristics of the best distributions of vectors over clusters for different window sizes are given in Table 1. For each size, the table lists two minima M1 and M2 of measure (2) and the respective numbers of clusters K1 and K2, obtained for two quantization levels. It can be seen that the number of clusters for the two best classifications becomes stable at the window size of 14 × 14. The best distribution in the sense of measure (2), M(32) = 0.190, yields two clusters, partitioning the feature space into a forest area and a water surface.

Let us now consider the distribution of texture vectors over seven clusters associated with the minimum M(34) = 0.252. A diagram of the texture feature distribution is shown (heavily downsampled) in Fig. 1d. It can be seen that the object scatter in lightness is much larger than in texture. The obtained seven clusters (indicated by numbers) follow one another along the gray-level axis. Therefore, this classification basically relies on just one feature, the mean gray level.
This is in agreement with the fact that Siberian cedar gradually displaces birch in later development phases, and the image grows darker.

Table 1. Characteristics of the best distributions for the image in Fig. 1a

Window size   Quality measure M1   Number of clusters K1   Quality measure M2   Number of clusters K2
6 × 6         0.210                6                       0.212                10
10 × 10       0.136                3                       0.206                4
14 × 14       0.190                2                       0.252                7
18 × 18       0.220                2                       0.246                7

The threshold for identifying false clusters for a 14 × 14 window is D = L/S = 0.28 (Section 4). The perimeter, area, and their ratio for the seven obtained clusters are listed in Table 2. Clusters 1, 3, and 5 were judged to be false. The clusters they are to be merged with were determined as described in Section 4. First, for each false cluster, its two largest neighbor clusters were identified in the image plane. For all false clusters, these two clusters also proved to neighbor the respective false cluster in the feature space. The degree of isolation of each false cluster from its two neighbors was evaluated using relationship (1). It was found that cluster 1 is better separated from cluster 2 than from cluster 3; the latter is better separated from cluster 5 than from cluster 1; and cluster 5 is better separated from cluster 3 than from cluster 4. This fact can also be seen from the density of points at cluster boundaries in Fig. 1d. As a consequence, cluster 1 is merged with cluster 3 and cluster 5 with cluster 4 (rather than with cluster 3, to which it is closer in the vector space). We end up, therefore, with four clusters.

Table 2. Data for identifying false clusters

Cluster number   Perimeter L   Area S   D = L/S
1                17124         50348    0.34
2                7964          47416    0.17
3                17176         46244    0.37
4                9580          34624    0.28
5                11462         26192    0.44
6                2438          10092    0.24
7                534           3500     0.15

The obtained filtered classification map is shown in Fig. 1b. The filtering amounted to smoothing segment boundaries and was accomplished by replacing the cluster number at every point of the cluster map with the average taken over its eight neighbors. One of the clusters, with the lightest gray tone on the map, embraces all water surfaces. The four others correspond to the three elder phases of cedar forest development present in the image. Contours of age phase segments visually determined by a forestry specialist are shown in Fig. 1c. The white points here indicate sites of ground forest estimation. As one can see, all but one phase are represented by one cluster, and phase 6 is represented by two clusters. Cluster 6 corresponds to the darker and least textured areas of forest, represented by the eldest cedar stands found in the image. Increasing the number of quantization levels causes a significant increase in the cluster isolation measure quantified by (2).

Fig. 2. (a) Image of forest landscape; (b) cluster map; (c) forest type map based on land forest estimation: development phases of Siberian cedar forests are indicated by large digits, 7C denotes interlaced stands of cedar and pine, C denotes pine forests, and the light-gray shade B corresponds to river floodplains and marshes; and (d) vector diagram (axes TONE and MEAN) with clusters corresponding to different forest types and age phases.

An example of a more detailed classification is given in Fig. 2a, where all seven development phases of the birch-cedar community are present. Moreover, there are areas of pine stands in the image, which look very much like cedar forests and are not easy to single out. Fig. 2c is a detailed map obtained by on-site forest estimation. Classification was run for different combinations of the above-mentioned features, which in all instances included the gray-level tone TONE. The least values of

measure (2) were obtained for two different feature pairs. Although the minimum of measure (2) was lower for the pair CON and TONE, it was obtained using a smaller number of quantization levels than with the pair MEAN and TONE. The latter combination of features made it possible to differentiate cedar and pine stands and to distinguish all phases of cedar development. The best distribution at which cedar and pine stands can be distinguished was obtained using 78 quantization levels. The window size for collecting statistics was determined to be 18 × 18. The number of clusters in the best classification as a function of the sampling window size is presented in Table 3. The best classification (with quantization levels ranging from 37 to 90) was obtained for 78 levels and M(78) = 0.345. The starting number of clusters was 50. Pine stands failed to be distinguished at less detailed quantizations. The second smallest minimum of the measure, M(36) = 0.44, gave rise to six clusters. Increasing the level of detail beyond this point results in decreased quality and a larger number of clusters.

Table 3. Characteristics of the best distributions for the image in Fig. 2a

Window size   Number of quantization levels   Number of clusters   Quality measure M(N)
12 × 12       77                              54                   0.339
14 × 14       80                              48                   0.335
16 × 16       78                              49                   0.332
18 × 18       78                              50                   0.345
20 × 20       78                              50                   0.348

The cluster segments in the image plane should not be thinner than the window size, and yet all obtained clusters turned out to be thin. However, given that features varied smoothly across most neighboring clusters, thinner clusters were merged with those for which the measure D = L/S exceeded the average computed over all clusters. After processing false clusters, the overall number of clusters dropped to 36, with 17 of them corresponding to forests.
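Two of the automatic decisions used in these experiments, choosing the window size and flagging false clusters, amount to simple arithmetic on the data of Tables 1 to 3. The Python sketch below is only illustrative. The function names are hypothetical, and taking the threshold as the perimeter-to-area ratio 4w/w² of a w × w square is an assumption: it gives about 0.286 for w = 14, close to the value 0.28 quoted for the first experiment.

```python
def smallest_stable_window(window_sizes, cluster_counts):
    """Smallest window from which the cluster count stays constant, or None."""
    idx = len(cluster_counts) - 1
    # Walk back from the largest window while the count does not change.
    while idx > 0 and cluster_counts[idx - 1] == cluster_counts[idx]:
        idx -= 1
    # At least two consecutive windows must agree before the count is called stable.
    return window_sizes[idx] if idx < len(cluster_counts) - 1 else None


def square_window_ratio(w):
    """Assumed threshold: perimeter / area of a w x w square, i.e. 4w / w**2."""
    return 4.0 / w


def false_clusters(perimeter, area, threshold):
    """Clusters whose ratio D = L / S exceeds the threshold."""
    return [j for j in sorted(perimeter) if perimeter[j] / area[j] > threshold]


# Window side -> number of clusters, from Table 1 (second minimum) and Table 3.
table1_sizes, table1_k2 = [6, 10, 14, 18], [10, 4, 7, 7]
table3_sizes, table3_k = [12, 14, 16, 18, 20], [54, 48, 49, 50, 50]

# Perimeter L and area S of clusters 1..7, from Table 2.
perimeter = {1: 17124, 2: 7964, 3: 17176, 4: 9580, 5: 11462, 6: 2438, 7: 534}
area = {1: 50348, 2: 47416, 3: 46244, 4: 34624, 5: 26192, 6: 10092, 7: 3500}

print(smallest_stable_window(table1_sizes, table1_k2))              # 14
print(smallest_stable_window(table3_sizes, table3_k))               # 18
print(false_clusters(perimeter, area, square_window_ratio(14)))     # [1, 3, 5]
```

With the tabulated data, the sketch returns the 14 × 14 and 18 × 18 windows and flags clusters 1, 3, and 5, in agreement with the values reported in the text. The same clusters are flagged when the threshold 0.28 is passed directly.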

The obtained vector diagram is shown in Fig. 2d. Different shades of gray in this diagram represent the seven phases of the cedar-birch community and pine stands; the clusters of water meadows and marshes are in the bottom right part of the diagram (unmarked). The corresponding cluster map is shown in Fig. 2b. The obtained classification is in general agreement with the results of on-site forest estimation, sketched in Fig. 2c. Each forest type is represented by one or two clusters. The regions corresponding to the seven phases of cedar forest development and to pine stands were identified by the algorithm, and their location is similar to that in the map of Fig. 2c. Taking into account the given resolution of the digital image, the 18 × 18 sampling window corresponds to a square of 0.8 ha on the ground. Meanwhile, the size of a quad normally used in ground forest estimation is 1 ha.

7. CONCLUSION

A histogram-based classification furnished with a measure of cluster isolation makes it possible to automatically obtain the distribution of image texture features over clusters that is best in the sense of the given measure. The estimates of the degree of cluster isolation are also used to select the window size for sampling textural statistics and to handle false clusters that arise at boundaries of image regions with different textures. The introduced classification quality measure can also be employed in selecting the textural features to be used in classification. The algorithm incorporates a mechanism for determining the level of detail in data representation for classifying forest types in aerial imagery. The extent of the window where textural statistics are collected imposes an additional limitation on the possible degree of detail of the final result.
Even so, the clusters yielded by the algorithm under such conditions could be interpreted in terms of significant forest-cover classes, such as the type of forest and the phase of its development, and the classification accuracy was close to that provided by on-site forest estimation. The obtained result means that objects with close textural features, such as birch deciduous forests, mixed forests with different ratios of conifer and deciduous species, and purely conifer forests, could be distinguished in grayscale aerial images by automatic clustering, which even succeeded in telling pine stands from cedar ones.

ACKNOWLEDGMENTS

This study was supported in part by the Russian Foundation for Basic Research, project no. 07-07-00085a.

REFERENCES
1. P. Gong and P. J. Howarth, "An Assessment of Some Factors Influencing Multispectral Land-cover Classification," Photogram. Eng. Rem. Sensing 56, 597 (1990).
2. V. S. Sidorova, "Separating of the Multivariate Histogram on the Unimodal Clusters," in Proceedings of the Second IASTED International Conference Automation Control and Information Technology, Novosibirsk, 2005, pp. 267–274.
3. M. Halkidi, Y. Batistakis, and M. Vazirgiannis, "On Clustering Validation Techniques," J. Intel. Inf. Syst. 17 (2–3), 107–132 (2001).
4. V. S. Sidorova, "Quality Assessment of Multispectral Image Classification by a Histogram-Based Method," Avtometria 43 (1), 37–43 (2007) [in Russian].
5. P. M. Narendra and M. Goldberg, "A Non-parametric Clustering Scheme for LANDSAT," Pat. Rec. 9, 207 (1977).
6. Remote Sensing: A Quantitative Approach, Edited by P. H. Swain and S. M. Davis. Purdue University, USA 396 (1978).
7. R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural Features for Image Classification," IEEE Trans. Syst. Man Cybern. 3, 610 (1973).
8. V. N. Sedykh, Formation of Siberian Cedar Forests in the Ob Region (Nauka, Novosibirsk, 1979) [in Russian].

Valeriya S. Sidorova. Born 1947. Graduated from Novosibirsk State University in 1972. Researcher at the Institute of Computational Mathematics and Mathematical Geophysics. Scientific interests: classification of remote sensing data, texture analysis, and 3D visualization. Author of 40 papers.

