

A New Weighted Fuzzy C-Means Clustering Algorithm for Remotely Sensed Image Classification
Chih-Cheng Hung, Member, IEEE, Sameer Kulkarni, and Bor-Chen Kuo, Member, IEEE
Abstract—Fuzzy clustering models are an essential tool for finding the proper cluster structure of given data sets in pattern and image classification. In this paper, a new weighted fuzzy C-Means (NW-FCM) algorithm is proposed to improve the performance of both the FCM and FWCM models for high-dimensional multiclass pattern recognition problems. The methodology used in NW-FCM combines the concept of the weighted mean from nonparametric weighted feature extraction (NWFE) with the cluster mean from discriminant analysis feature extraction (DAFE); these two concepts are combined in NW-FCM for unsupervised clustering. The main features of NW-FCM are, compared to FCM, the inclusion of the weighted mean to increase accuracy and, compared to FWCM, the inclusion of the centroid of each cluster to increase stability. The motivation of this work is to improve on the well-known fuzzy C-Means algorithm (FCM) and the recently proposed fuzzy weighted C-Means algorithm (FWCM). Our finding is that the proposed algorithm gives greater classification accuracy and stability than FCM and FWCM. Experimental results on both synthetic and real data demonstrate that the proposed clustering algorithm generates better clustering results than the FCM and FWCM algorithms, particularly for hyperspectral images.

Index Terms—Discriminant analysis feature extraction (DAFE), fuzzy C-means algorithm (FCM), fuzzy weighted C-means algorithm (FWCM), nonparametric weighted feature extraction (NWFE).

I. INTRODUCTION

CLUSTERING is an approach that attempts to assess the relationships in a data set by organizing the data patterns into groups such that patterns within a group are more similar to one another than to those belonging to different groups [24]. Clustering techniques can be classified into supervised and unsupervised methods. Unsupervised clustering detects the underlying structure in a data set for classification, pattern recognition, model reduction, and optimization [2], while supervised clustering usually involves human interaction. Unsupervised clustering algorithms are the more popular because of the lack of prior knowledge about the data set one is working on.

Manuscript received April 27, 2010; revised August 18, 2010; accepted November 15, 2010. Date of publication December 06, 2010; date of current version May 18, 2011. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Jon Benediktsson. C.-C. Hung and S. Kulkarni are with the School of Computing and Software Engineering, Southern Polytechnic State University, Marietta, GA 30060-2896 USA. B.-C. Kuo is with the Graduate Institute of Educational Measurement and Statistics, National Taichung University, Taichung 40306, Taiwan. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSTSP.2010.2096797

The Fuzzy C-Means (FCM) algorithm uses fuzzy logic whereby each data point is assigned a membership grade between 0 and 1 [3], [20]. FCM segments a set of data points into a number of clusters; hence, it is called a clustering algorithm. The objective function of classical FCM is defined on the distances of the data points to the cluster centers, weighted by their fuzzy memberships. In addition, FCM uses the probabilistic constraint that the memberships of a data point across all classes must sum to 1; this constraint is used to avoid the trivial solution of all memberships being equal to 0. As pointed out by Krishnapuram and Keller [13], if a point is equidistant from two cluster centers, its membership in each of these two clusters will be the same, with a membership value of 0.5. The problem with this membership assignment is that noise points (which may be far from, but equidistant from, two cluster centers) are treated the same as points which are close to the cluster centers [13]; in reality, those noise points should be given very low, or even zero, membership in either cluster. To remedy this weakness of FCM, Krishnapuram and Keller relaxed the probabilistic constraint and proposed possibilistic clustering algorithms (PCAs), in which the memberships provide a good explanation of the degrees of belongingness of the data [13], [14]. The clustering performance of the PCAs in [13], [14] depends heavily on the parameters used; hence, Yang and Wu [25] suggested a new PCA whose performance can be easily controlled. Compared with FCM, the PCAs are less affected by noise and outliers, and they have been applied to problems such as shell clustering, boundary detection, and surface and function approximation [12], [21], [22].

Both FCM and the PCAs are iterative algorithms in which the update equations for the memberships (and cluster centers) are derived from the necessary conditions for a minimizer of some objective function. As each cluster is assumed to be a fuzzy set in fuzzy clustering, it is natural to evaluate the memberships of points belonging to clusters directly from the data information using fuzzy set theory; with this reasoning, a new family of PCAs was proposed [26]. A performance comparison of PCAs and FCM on the accuracy and the real cluster number for the Iris, Glass, and Vowel data sets was given in [26]. The results revealed that PCAs cannot provide very stable results when clustering those data sets, and the stability of algorithms is necessary for real applications. The paper [26] concluded that it remains challenging to determine whether the PCAs are unstable through theoretical analysis or real-data experiments. In their experiments [26], all the algorithms, including the PCAs and FCM, were repeated one thousand times and the statistics were then calculated.
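To make the equidistant-point observation concrete, here is a one-line check against the standard FCM membership update (it appears as (4) in Section II-A); the symbols follow the notation used there:

u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\|x_k - v_i\|}{\|x_k - v_j\|} \right)^{2/(m-1)} \right]^{-1}, \qquad c = 2,\ \|x_k - v_1\| = \|x_k - v_2\| = d \ \Rightarrow\ u_{1k} = u_{2k} = \frac{1}{1+1} = 0.5,

independent of d. This is why a distant noise point equidistant from both centers receives the same memberships as a point lying midway between them.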



In the case of the Iris data, the experiment shows that although FCM achieves 90 percent accuracy for both the highest and the average, it obtains the correct number of clusters in 99% of the one thousand runs. In contrast, the best performance of the PCAs achieves 95% overall accuracy, while most of the PCAs generate the correct number of clusters in only 75% to 80% of the one thousand runs. Experiments were done with the membership weighting exponent (the fuzzifier) m set to 2, 2.5, and 3; the greater the value of m, the fuzzier the membership assignments (i.e., the degree of fuzziness). Although the PCAs can achieve a higher overall accuracy, FCM shows more consistent and stable experimental results.

Kuo and Landgrebe proposed the nonparametric weighted feature extraction (NWFE) method for dimensionality reduction [15]. The idea of the weighted mean is an essential part of NWFE for dealing with high-dimensional multiclass pattern recognition problems; one of its important properties is that it assigns greater weights to pixels near the expected decision boundary. Li et al. [17] applied the concept of the weighted mean to FCM to create a new FCM-like clustering algorithm, named the fuzzy weighted C-Means (FWCM) algorithm. FWCM does provide better results than FCM; however, it does not give uniformly stable experimental outcomes. In addition, for different data sets the range of fuzzifiers needed to obtain a higher clustering accuracy can be very different; the fuzzifier ranged widely, between 2.0 and 6.0, in their experiments [17].

To address these problems, a New Weighted Fuzzy C-Means algorithm (NW-FCM) is proposed for solving similar high-dimensional multiclass pattern recognition problems. In NW-FCM, both the concept of the weighted mean (from nonparametric weighted feature extraction [15]) and the cluster mean (from discriminant analysis feature extraction [10]) are used. The weighted mean increases the classification accuracy, and the cluster mean increases the stability of the algorithm. Compared with previously proposed techniques, FWCM does not employ the cluster mean concept, and hence that algorithm is not stable, while FCM does not use the weighted mean concept, which can increase the classification accuracy.

This paper is organized as follows. A brief review of FCM is given in Section II-A, and FWCM is sketched in Section II-B. The proposed NW-FCM is described in Section III. The performance of NW-FCM on pattern data and image classification, and its comparison with FCM and FWCM, are reported in Section IV. The conclusions then follow.

II. RELATED WORK

Clustering algorithms, in general, fall into two modes: nonfuzzy (also known as hard clustering) and fuzzy models. The K-Means clustering algorithm is a typical example of a nonfuzzy method [8]. These hard clustering algorithms work well if the structure of the data set is well distributed; however, when the boundary between clusters in a data set is ill defined (a situation where the same data object may belong to more than one cluster), the notion of fuzzy clustering becomes relevant [19]. FCM is one of the best known of the fuzzy models that have been proposed in the literature [5].

Many sensor systems collect a large number of spectral bands, more than are needed for any specific application. It becomes necessary to determine an optimal set of spectral bands (i.e., features) which can be used efficiently for classification [16]. Several feature extraction methods have been developed for this purpose, and some are briefly reviewed here as they relate to our work. Specifically, discriminant analysis feature extraction (DAFE), nonparametric analysis feature extraction (NAFE), and nonparametric weighted feature extraction (NWFE) will be described in a few words, along with FCM and FWCM.

A. Fuzzy C-Means Clustering Algorithm (FCM)

The Fuzzy C-Means clustering (FCM) algorithm (a fuzzy version of the K-Means clustering algorithm [8]) minimizes the objective function

J_m = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^m \| x_k - v_i \|^2    (1)

with respect to the membership grades u_{ik}, where v_i is the center of fuzzy cluster i, n is the number of data points, c is the number of clusters, and m > 1 is a fuzzifier [3]. The main difference between FCM and K-Means is that the data points are partitioned into fuzzy regions using fuzzy membership grades [3]. A membership grade matrix U = [u_{ik}] is created over all data points and clusters. The constraint for FCM is that the sum of the membership grades of a data point over all clusters must equal one. This constraint is similar to that used on fractions of endmembers in unmixing for multispectral images and on a posteriori probabilities in maximum-likelihood classification [9]. FCM is an iterative algorithm [3], described in the following.

Step 1) Initialize the membership matrix U with random values between 0 and 1 such that each element u_{ik} of U satisfies the constraint

\sum_{i=1}^{c} u_{ik} = 1, \quad k = 1, \ldots, n.    (2)

Step 2) Calculate the fuzzy cluster centers v_i using

v_i = \frac{\sum_{k=1}^{n} u_{ik}^m x_k}{\sum_{k=1}^{n} u_{ik}^m}, \quad i = 1, \ldots, c.    (3)

Step 3) Update the membership grades u_{ik} using

u_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\| x_k - v_i \|}{\| x_k - v_j \|} \right)^{2/(m-1)}}.    (4)

Step 4) Repeat steps 2 and 3 until the algorithm converges, i.e., until the difference between the current membership matrix and the previous membership matrix is below a specified tolerance value or the number of iterations reaches the defined maximum.

However, this criterion only considers the distance of the data to the cluster centers of their fuzzy memberships. If two distinct clusters have a common mean, then the performance of FCM is poor [17]. Hence, the idea of a weighted mean was employed in FWCM to improve its clustering performance.

B. Fuzzy Weighted C-Means Algorithm (FWCM)

Discriminant analysis feature extraction (DAFE) is an often used method for dimensionality reduction in classification problems [10]. DAFE uses the mean vector and covariance matrix of each class for feature extraction; it is a method to enhance separability [16]. DAFE relies on the assumption of Gaussian distributed classes. In DAFE, within-class, between-class, and mixture scatter matrices are calculated as the criteria of class separability. Nonparametric analysis feature extraction (NAFE) is an improved version of DAFE; the difference between the two is that the between-class scatter matrix is redefined as a nonparametric between-class scatter matrix [10]. The main idea of NWFE [15] is to give different weights to each sample for computing the weighted mean and to define new nonparametric between-class and within-class scatter matrices for extracting features. As pointed out by Kuo and Landgrebe [15], NWFE places greater weight on the samples close to the expected boundary, compared with DAFE. Other advantages over DAFE are that one can achieve a full ranking of the scatter matrices, that the nonparametric nature of the scatter matrices reduces the influence of outliers, and that it also works well for nonnormal data sets.

Although NWFE is for supervised learning problems, the concept of the weighted mean can be extended to an unsupervised version, and this unsupervised weighted mean plays an essential role in the FWCM algorithm. An unsupervised weighted mean was proposed [17] to extend NWFE. For any sample x_k (assume that x_k belongs to class i), compute the distances from it to the other samples x_j, i.e.,

d_{kj} = \| x_k - x_j \|, \quad j = 1, \ldots, n, \; j \ne k.    (5)

Generally, data samples near x_k belong to the same class as x_k; their weights must be large, so the reciprocals of the above distances are used as weights. If a sample x_j is close to x_k but is not in class i, then the influence of x_j must be small; multiplying by the membership grade u_{ij} accomplishes the weighting in those conditions. Therefore, the unsupervised weighted mean of x_k in class i is defined by

\bar{x}_{ik} = \sum_{j \ne k} \frac{ u_{ij} / d_{kj} }{ \sum_{l \ne k} u_{il} / d_{kl} } \, x_j.    (6)

One can expect that the unsupervised weighted mean \bar{x}_{ik} is closer to x_k than v_i is (v_i being the center of cluster i).

The objective function is redefined in (7) for the FWCM algorithm:

J = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^m \| x_k - \bar{x}_{ik} \|^2.    (7)

Using the method of Lagrange multipliers, a new objective function \tilde{J} is formulated as follows:

\tilde{J} = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^m \| x_k - \bar{x}_{ik} \|^2 + \sum_{k=1}^{n} \lambda_k \Big( \sum_{i=1}^{c} u_{ik} - 1 \Big),    (8)

where the \lambda_k are the Lagrange multipliers for the constraints

\sum_{i=1}^{c} u_{ik} = 1, \quad k = 1, \ldots, n.    (9)

By differentiating \tilde{J} with respect to all arguments, we have the following formulations:

u_{ik} = \left( \frac{ -\lambda_k }{ m \, \| x_k - \bar{x}_{ik} \|^2 } \right)^{1/(m-1)}    (10)

\lambda_k = -m \left[ \sum_{j=1}^{c} \left( \frac{1}{ \| x_k - \bar{x}_{jk} \|^2 } \right)^{1/(m-1)} \right]^{-(m-1)}.    (11)

Similar to FCM, FWCM is an iterative algorithm [17], described in the following.

Step 1) Initialize the membership matrix U with random values from [0, 1] such that each element u_{ik} of U satisfies

\sum_{i=1}^{c} u_{ik} = 1, \quad k = 1, \ldots, n.    (12)

Step 2) Calculate the unsupervised weighted means \bar{x}_{ik} using (6).
Step 3) Update the Lagrange multipliers \lambda_k using

\lambda_k = -m \left[ \sum_{j=1}^{c} \left( \frac{1}{ \| x_k - \bar{x}_{jk} \|^2 } \right)^{1/(m-1)} \right]^{-(m-1)}.    (13)


Step 4) Update the membership grades u_{ik} using

u_{ik} = \left( \frac{ -\lambda_k }{ m \, \| x_k - \bar{x}_{ik} \|^2 } \right)^{1/(m-1)}.    (14)

TABLE I DESCRIPTIONS OF FOUR DATA SETS SHOWING THE NUMBER OF CLASSES, THE NUMBER OF SAMPLES, AND THE NUMBER OF FEATURES FOR EACH DATA SET

Step 5) Repeat steps 2, 3, and 4 until the average of the squared differences between the membership grades in this iteration and the previous iteration is below a certain tolerance value or is no longer decreasing.

III. PROPOSED NEW WEIGHTED FUZZY C-MEANS ALGORITHM (NW-FCM)

Comparing the DAFE, NAFE, and NWFE methods: DAFE uses the centroid concept in the calculation of both within-class and between-class scatter matrices, without the weighted mean; NAFE uses a combination of the centroid for the within-class scatter matrix and the weighted mean for the between-class scatter matrix; and NWFE uses the weighted mean for both within-class and between-class scatter matrices, without the centroid. The weighted mean in NWFE was used in FWCM, but the centroid concept was not. The weighted mean used in FWCM is very similar to the gradient weighted inverse concept used in the smoothing filter [23]. In our proposed algorithm, one can expect this new unsupervised weighted mean \bar{v}_i to be even closer to v_i (the center of cluster i).

A new weighted fuzzy C-Means algorithm (NW-FCM) is proposed for solving similar high-dimensional multiclass pattern recognition problems. This method combines the centroid of each cluster in DAFE as used in FCM and the weighted mean in NWFE as used in FWCM to derive a new algorithm, one that is more efficient and stable than both FCM and FWCM. In other words, NW-FCM is more stable than FWCM and obtains higher data classification accuracy than FCM. In FWCM, the weighted means are calculated based on the point in consideration and all the sample points, whereas in NW-FCM they are calculated based on the cluster centers and the rest of the sample points. This makes NW-FCM more precise than FWCM in assigning a sample point to a particular cluster. Because the weighted mean is calculated from the cluster centers, the algorithm is also less computationally exhaustive than FWCM. The proposed algorithm is formulated below; a sketch of one iteration follows the steps.

Step 1) Initialize the membership matrix U with random values from [0, 1] such that each element u_{ik} of U satisfies the constraint in (12).
Step 2) Calculate the fuzzy cluster centers v_i using

v_i = \frac{ \sum_{k=1}^{n} u_{ik}^m x_k }{ \sum_{k=1}^{n} u_{ik}^m }.    (15)

Step 3) Calculate the weighted means \bar{v}_i using

\bar{v}_i = \sum_{j=1}^{n} \frac{ u_{ij} / \| v_i - x_j \| }{ \sum_{l=1}^{n} u_{il} / \| v_i - x_l \| } \, x_j.    (16)

Step 4) Update the Lagrange multipliers \lambda_k using

\lambda_k = -m \left[ \sum_{j=1}^{c} \left( \frac{1}{ \| x_k - \bar{v}_j \|^2 } \right)^{1/(m-1)} \right]^{-(m-1)}.    (17)

Step 5) Update the membership grades u_{ik} using

u_{ik} = \left( \frac{ -\lambda_k }{ m \, \| x_k - \bar{v}_i \|^2 } \right)^{1/(m-1)}.    (18)

Step 6) Repeat steps 2, 4, and 5 until the average of the squared differences between the membership grades in this iteration and the previous iteration is below a certain tolerance value or is no longer decreasing.
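The sketch below shows one NW-FCM pass under our reconstruction of (15)–(18), which were recovered from the surrounding text; treat the exact weighting, e.g., whether u_{ij} or u_{ij}^m enters (16), as an assumption. The function and variable names are ours. The contrast with FWCM is in step 3: FWCM anchors a weighted mean at every sample point, whereas NW-FCM anchors one at each cluster center.

```python
import numpy as np

def nwfcm_iteration(X, U, m=2.0, eps=1e-12):
    """One NW-FCM pass (steps 2-5); X is (n, d), U is (c, n)."""
    Um = U ** m
    # Step 2, eq. (15): fuzzy cluster centers, exactly as in FCM.
    V = (Um @ X) / Um.sum(axis=1, keepdims=True)
    # Step 3, eq. (16): one weighted mean per cluster, anchored at the center v_i;
    # weights are membership grades times reciprocal distances to that center.
    dist = np.sqrt(((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)) + eps
    W = U / dist
    W /= W.sum(axis=1, keepdims=True)
    Vbar = W @ X
    # Steps 4-5, eqs. (17)-(18): eliminating the multipliers lambda_k yields the
    # familiar FCM-style normalization, with Vbar in place of the centers.
    d2 = ((X[None, :, :] - Vbar[:, None, :]) ** 2).sum(axis=2) + eps
    U_new = 1.0 / ((d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1))).sum(axis=1)
    return U_new, V, Vbar
```

Calling this function repeatedly until the average squared change in U falls below the tolerance implements the loop of step 6.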


TABLE II COMPARISON OF 1000 EXPERIMENTS FOR IRIS DATA

Fig. 1. (a) Data set consisting of two clusters, with one cluster represented by red square dots and the other cluster by green circle dots. Three blue triangle dots lie exactly on the boundary of these two clusters. (b) Some clustering results of FCM, FWCM, and NW-FCM. Note that the cluster centers are marked for FCM and NW-FCM.

IV. EXPERIMENTAL RESULTS ON NUMERICAL EXAMPLES, PATTERN DATA SET, AND IMAGES

The effectiveness of the proposed NW-FCM is validated by extensive experiments on numerical examples, real pattern data sets (including Iris and Wine), and natural and hyperspectral images. The numerical examples demonstrate visually how these algorithms work. The validation is through classification accuracy assessment. For our purpose, only the overall accuracy and the real number of clusters are taken into account when we compare the clustering results of the algorithms used in the experiments. The fuzzifier values used in the experiments are 2, 2.5, and 3. We used the same terminating conditions for all three algorithms, i.e., the difference between two membership grade matrices should be less than 0.00001 or the total number of iterations should not exceed 100. Please note that measuring the effectiveness of algorithms by the overall accuracy and the real number of clusters may not be enough; many evaluation criteria have been proposed and used, such as the user's and producer's accuracies, the kappa value [7], and the Tau coefficient [18], and under different criteria the clustering results would need to be evaluated differently.

A summary of the data sets used is given in Table I. We ran each of the algorithms on the pattern data sets and hyperspectral images to compare their performance. For each algorithm, 1000 experiments with random initial centers and memberships were run to investigate the accuracy distribution of these clustering algorithms. The analysis of the accuracies of all the clustering results with various fuzzifier values is given in Tables II, III, V, and VI. The columns "Highest," "Mean," and "Variance" below "Overall Accuracy" in the tables provide the highest, mean, and variance of the overall accuracies of the 1000 experiments for each algorithm. In order to show the accuracy distribution of the 1000 clustering results, the numbers of experiments with overall accuracies in the subintervals from 0 to 1 are counted and given in the three columns under "Accuracy Distribution," where "[" means inclusive and ")" means not inclusive. The last three columns, "Variant Clusters," show the actual number of clusters obtained by each algorithm among the 1000 experiments. Each algorithm is given the true number of clusters to begin the experiments; it may not be able to detect the correct (i.e., true) number of clusters in the data set due to the instability of the algorithm. This information shows one of the capacities of each algorithm.
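As an illustration of how such a distribution table can be tallied, the snippet below bins the overall accuracies of repeated runs. The bin edges and the placeholder accuracies are hypothetical, since the tables' exact subintervals are not reproduced here.

```python
import numpy as np

# Hypothetical accuracies of 1000 runs; in practice these come from the
# clustering experiments themselves.
acc = np.random.default_rng(0).uniform(0.40, 0.95, size=1000)
edges = [0.0, 0.5, 0.7, 0.9, 1.0]          # assumed [a, b) subinterval edges
counts, _ = np.histogram(acc, bins=edges)   # note: the last bin is closed on the right
print("highest %.4f  mean %.4f  variance %.6f" % (acc.max(), acc.mean(), acc.var()))
print({f"[{a}, {b})": int(n) for a, b, n in zip(edges, edges[1:], counts)})
```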

A. Some Numerical Examples

Several numerical examples were tested to compare the clustering results of FCM, FWCM, and NW-FCM. Fig. 1(a) shows a data set which consists of two clusters, one represented by red square dots and the other by green circle dots; three blue triangle dots lie exactly on the boundary of these two clusters. Fig. 1(b) shows clustering results of FCM, FWCM, and NW-FCM.


Fig. 2. (a) Data set consisting of two clusters, with one cluster represented by red square dots and the other cluster by green circle dots. One blue triangle dot lies exactly on the boundary of these two clusters. (b) Some clustering results of FCM, FWCM, and NW-FCM. Note that the cluster centers are marked for FCM and NW-FCM.

Fig. 2 shows a different data set and its clustering results. The experiments were repeated several times on these data sets for each of the three algorithms. Please note that the blue triangle dots can be assigned to either cluster. Based on the results shown in Figs. 1 and 2, both FCM and NW-FCM are more stable and accurate than FWCM; one can observe that a few points are misclassified by FWCM. The cluster centers for FCM and NW-FCM are marked in the figures (the cluster center is not used in FWCM).
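A data set in the spirit of Fig. 1 is easy to construct; the coordinates below are hypothetical stand-ins (the paper does not list the actual points): two compact clusters plus three points placed on the perpendicular bisector of the two centers, hence equidistant from both.

```python
import numpy as np

rng = np.random.default_rng(1)
c1, c2 = np.array([0.0, 0.0]), np.array([4.0, 0.0])
cluster1 = c1 + 0.5 * rng.standard_normal((20, 2))          # "red square" cluster
cluster2 = c2 + 0.5 * rng.standard_normal((20, 2))          # "green circle" cluster
boundary = np.array([[2.0, -1.0], [2.0, 0.0], [2.0, 1.0]])  # on the bisector x = 2
X = np.vstack([cluster1, cluster2, boundary])               # e.g., feed to fcm(X, c=2)
```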
Fig. 3. (a) Original scenery image and (b) original flower image with birds.

B. Iris Data

The Iris data set [1], [6] has 150 points in a four-dimensional feature space belonging to three clusters, each with 50 points. Two of the three clusters overlap substantially. From Table II, we can see that the results of FCM and NW-FCM are more stable than those of FWCM. NW-FCM provides clustering results with higher overall accuracies than FCM, while both algorithms report stable results with low variance. As shown in Table II, the variances of the overall accuracies of FWCM are larger than those of FCM and NW-FCM, which implies that FWCM can produce unstable clustering results with lower overall accuracies in some cases.

C. Wine Data

The Wine data set contains the results of a chemical analysis of wines grown in the same region of Italy. The data set has 178 sample points in a 13-dimensional feature space belonging to three clusters, with the clusters having 59, 71, and 48 sample points, respectively. Similar to Table II, the analysis of the accuracies of all the clustering results with various fuzzifier values is given in Table III. It follows from Table III that both FWCM and NW-FCM provide clustering results with an average accuracy better than 70%, while FCM is below 70% in all cases. It can also be seen that NW-FCM gives the minimum variance among all three algorithms.


TABLE III COMPARISON OF 1000 EXPERIMENTS FOR WINE DATA

Fig. 4. (a) and (b) are clustering results with FCM, FWCM, and NW-FCM. In (a), the sky is not properly segmented for FCM with m = 2.5 and 3.0 and for FWCM with m = 2, 2.5, and 3. Note that in (b), FWCM classified the birds into the grass cluster (shown in black) when m = 3, and FCM classified the grass pixels into the sky cluster with m = 2, 2.5, and 3.

D. Artificial and Natural Images

The scenery and flower images shown in Fig. 3 can be segmented into five and seven clusters, respectively. Fig. 4 shows the results obtained from FCM, FWCM, and NW-FCM. It can be visually observed from the results of the scenery image classification that the sky is not properly segmented with FCM for m = 2.5 and 3.0 and with FWCM for m = 2, 2.5, and 3. For the flower image, the majority of pixels are accurately classified into appropriate clusters with NW-FCM, while FWCM classified the birds into the grass cluster (shown in black) when the fuzzifier m = 3, and FCM classified the grass pixels into the sky cluster with m = 2, 2.5, and 3. The experiments show that NW-FCM works considerably better than FCM and FWCM.

TABLE IV NUMBER OF SAMPLES IN THE DC MALL DATASET USED FOR EXPERIMENTS

E. Hyperspectral Images

Two hyperspectral images are used in the experiments: the Indian Pine image, a mixed forest/agricultural site in Indiana, and the Washington DC Mall hyperspectral image, an urban site [16].


TABLE V COMPARISON OF 1000 EXPERIMENTS FOR INDIAN PINE DATA

TABLE VI COMPARISON OF 1000 EXPERIMENTS FOR DC MALL DATA

Fig. 5. Simulated false-color IR image of the Indian Pine Site dataset used in our experiments.

Fig. 6. False-color IR image of a portion of the Washington DC Mall dataset, 205 × 307 pixels.

The Indian Pine data set was gathered by the sensor known as the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). The image, acquired from an aircraft flown at 65,000 ft altitude and operated by the NASA/Jet Propulsion Laboratory, has 220 spectral bands with a ground resolution of approximately 20 m.

Four classes, Corn-notill, Soybean-notill, Soybeans-min, and Grass, were selected for the experiments. The simulated false-color IR image is shown in Fig. 5. The Washington DC Mall image (shown in Fig. 6), from an urban area, is a Hyperspectral Digital Imagery Collection Experiment (HYDICE) airborne hyperspectral data flightline over the Washington, DC, Mall.


Fig. 7. (a) Ground truth data, (b) classified results with FCM, (c) classified results with FWCM, and (d) classified results with NW-FCM. Accuracy-1 shows the classification accuracy on the 400 sample pixels, and accuracy-2 shows the classification accuracy on all pixels in the four classes based on the minimum distance from the cluster centers; each cluster center is obtained from the classification of the 400 sample pixels.

Two hundred and ten bands were collected in the 0.4- to 2.4-μm region of the visible and infrared spectrum. Some water absorption channels were discarded, resulting in 191 channels for the experiments. These two data sets are available in [16]. Four information classes in the Washington, DC, data are used for the experiments: Roofs, Road, Trail, and Grass, as shown in Table IV. To be compatible with the previous experiments, we again use the fuzzifier values of 2, 2.5, and 3. Table V shows the classification accuracy on Indian Pine; from Table V, we can find that the highest accuracy among all algorithms is 61.75%.

It follows from Table V that both FWCM and NW-FCM provide clustering results with an average accuracy better than 51%, while FCM is below 50% in all cases. It can also be seen that NW-FCM gives a better minimum variance than FWCM, but worse than FCM. FWCM does not always give the correct number of clusters when m is set to 2.5 and 3. Table VI shows the classification accuracy on DC Mall; similar to Table V, the highest accuracy among all algorithms is 76.25%. FWCM does not always detect the correct number of clusters with fuzzifier values of 2, 2.5, and 3. In repeated experiments it is observed that the clustering results of the FWCM algorithm depend on the randomly generated initial membership matrix.

Fig. 7 shows some of the classified images of Indian Pine. Fig. 7(a) displays the ground truth data, and Fig. 7(b)-(d) are the classified results with FCM, FWCM, and NW-FCM, respectively. Accuracy-1 shows the classification accuracy on the 400 sample pixels, and accuracy-2 shows the classification accuracy on all pixels in the four classes based on the minimum distance from the cluster centers; each cluster center is obtained from the classification of the 400 sample pixels. It can be observed that NW-FCM gives higher accuracy in both accuracy assessments.

To compare with non-C-Means-based clustering approaches, Kohonen's self-organizing map (SOM) [11] is used for the experiment. Our goal is to develop an efficient algorithm for high-dimensional multiclass pattern classification; therefore, the hyperspectral images (Indian Pine and DC Mall) are tested for comparison. Experimental results are shown in Fig. 8. Fig. 8(a) and (b) illustrates the highest classification accuracy and the average classification accuracy, respectively, for all the algorithms. We can conclude that SOM performs better than FCM in the highest accuracy, but on average it is not better than FCM. Both NW-FCM and FWCM perform better than FCM and SOM.

To explore the impact of the fuzzifier, m, on the performance of the algorithms, m was increased from 2 to 20 in increments of either 0.5 or 1 and tested on the hyperspectral DC Mall and Indian Pine images. Based on the experimental results, the fuzzifier does not affect the accuracy of the algorithms significantly. One exception is that FCM gives a slightly increasing average accuracy as m increases from 2 to 20 on the DC Mall image. However, our results show that NW-FCM obtains the correct number of clusters on both images in 1000 experiments, while FCM can derive the correct number of clusters on the Indian Pine image but not on the DC Mall image. One can also observe that FWCM does not generate the correct number of clusters for most m values used (results from Tables V and VI).

Both FWCM and NW-FCM have the same time complexity if calculated theoretically. The only difference between the two algorithms is the way the weighted means (the basic operation in the algorithm) are calculated, because of which NW-FCM converges after fewer iterations than FWCM. Due to this difference, the approximation of the cluster centers is more accurate, and the rate of decrease in the difference of membership grades between iterations is greater for NW-FCM than for FWCM; hence, the number of iterations required to converge is much smaller for NW-FCM. The accuracy of clustering is also increased due to the use of cluster centers. As a result, NW-FCM is faster and more accurate than FWCM.

Similarly, FCM and NW-FCM have the same time complexity if calculated theoretically. The most important step added in NW-FCM is the calculation of the weighted means, by which cluster centers are more accurately detected, thereby yielding higher classification accuracy. As the approximation of the cluster centers is more accurate, the rate of decrease in the difference of membership grades between iterations is greater for NW-FCM than for FCM. This eventually makes the number of iterations required for convergence much smaller for NW-FCM.

Fig. 8. (a) The highest classification accuracy and (b) the average classification accuracy for FCM, FWCM, NW-FCM, and SOM with the Indian Pine and Washington DC Mall hyperspectral images, respectively.

Theoretically, this should cause NW-FCM to run faster than FCM, but our experiments were coded in Matlab, and FCM makes use of many built-in functions designed for fast computation. At present, FCM runs faster even with a larger number of iterations, but gives much lower accuracy than NW-FCM. At this point there is a tradeoff between accuracy and speed for FCM and NW-FCM (if Matlab code is used).

V. CONCLUSION

We proposed a new clustering algorithm, NW-FCM, for high-dimensional multiclass pattern recognition problems. Our purpose was to increase the accuracy and stability of the well-known FCM and the recently proposed FWCM. NW-FCM is based on the concepts of the unsupervised weighted mean and the cluster centroid, from nonparametric weighted feature extraction (NWFE) and discriminant analysis feature extraction (DAFE), respectively. In terms of accuracy and stability, experimental results on both numerical and real data sets have illustrated that our proposed algorithm performs better than FCM and FWCM. The artificial image experiments exhibited several failures for both FCM and FWCM, but the proposed NW-FCM always gave accurate results and the correct number of clusters. The stability of our algorithm was demonstrated repeatedly, with predictable results. In addition,


we optimized the computation time of NW-FCM so that it is competitive with FCM and better than FWCM. The overall results confirm that the accuracy of our proposed NW-FCM is better than that of FCM and FWCM; however, the execution time of FCM is the best, while that of FWCM is the worst. It can be concluded that there is a tradeoff between accuracy and execution time when choosing between FCM and NW-FCM. The NW-FCM algorithm does not employ the concept of fuzzy between-class and within-class scatter matrices; we will take both matrices into consideration in future research.

ACKNOWLEDGMENT

The authors would like to thank Dr. D. Landgrebe for providing the Washington DC Mall and Indian Pine Site image data. They also wish to thank the reviewers for giving useful suggestions to improve the presentation of this paper.

REFERENCES
[1] E. Anderson, "The irises of the Gaspé Peninsula," Bull. Amer. Iris Soc., vol. 59, pp. 2–5, 1935.
[2] B. Balasko, J. Abonyi, and B. Feil, Fuzzy Clustering and Data Analysis Toolbox for Use With Matlab. [Online]. Available: http://www.fmt.vein.hu/softcomp/
[3] J. C. Bezdek, Pattern Recognition With Fuzzy Objective Function Algorithms. New York: Plenum Press, 1981.
[4] J. C. Bezdek, J. M. Keller, R. Krishnapuram, L. I. Kuncheva, and N. R. Pal, "Will the real Iris data please stand up?," IEEE Trans. Fuzzy Syst., vol. 7, no. 3, pp. 368–369, Jun. 1999.
[5] J. C. Bezdek and S. K. Pal, Eds., Fuzzy Models for Pattern Recognition. New York: IEEE Press, 1992.
[6] C. L. Blake and C. J. Merz, "UCI repository of machine learning databases," 1998. [Online]. Available: http://www.ics.uci.edu/~mlearn/MLRepository.html
[7] J. Cohen, "A coefficient of agreement for nominal scales," Educ. Psychol. Meas., vol. 20, pp. 37–46, 1960.
[8] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[9] G. M. Foody, "A fuzzy sets approach to the representation of vegetation continua from remotely sensed data: An example from lowland heath," Photogramm. Eng. Remote Sens., vol. 58, no. 2, pp. 221–225, 1992.
[10] K. Fukunaga, Introduction to Statistical Pattern Recognition. San Francisco, CA: Morgan Kaufmann, 1990.
[11] T. Kohonen, Self-Organizing Maps. New York: Springer, 1995.
[12] R. Krishnapuram, H. Frigui, and O. Nasraoui, "Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation," IEEE Trans. Fuzzy Syst., vol. 3, no. 1, pp. 29–60, Feb. 1995.
[13] R. Krishnapuram and J. M. Keller, "A possibilistic approach to clustering," IEEE Trans. Fuzzy Syst., vol. 1, no. 2, pp. 98–110, May 1993.
[14] R. Krishnapuram and J. M. Keller, "The possibilistic C-means algorithm: Insights and recommendations," IEEE Trans. Fuzzy Syst., vol. 4, no. 3, pp. 385–393, Aug. 1996.
[15] B.-C. Kuo and D. A. Landgrebe, "Nonparametric weighted feature extraction for classification," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 5, pp. 1096–1105, May 2004.
[16] D. A. Landgrebe, Signal Theory Methods in Multispectral Remote Sensing. Hoboken, NJ: Wiley, 2003.
[17] C.-H. Li, W.-C. Huang, B.-C. Kuo, and C.-C. Hung, "A novel fuzzy weighted C-means method for image classification," Int. J. Fuzzy Syst., vol. 10, no. 3, pp. 168–173, Sep. 2008.
[18] Z. Ma and R. L. Redmond, "Tau coefficient for accuracy assessment of classification of remote sensing data," Photogramm. Eng. Remote Sens., vol. 61, no. 4, pp. 435–439, 1995.

[19] S. Nefti and M. Oussalah, "Probabilistic-fuzzy clustering algorithm," in Proc. IEEE Int. Conf. Syst., Man, Cybern., 2004, pp. 4786–4791.
[20] P. J. Rousseeuw, L. Kaufman, and E. Trauwaert, "Fuzzy clustering using scatter matrices," Comput. Statist. Data Anal., vol. 23, pp. 135–151, 1996.
[21] T. A. Runkler and J. C. Bezdek, "Function approximation with polynomial membership functions and alternating cluster estimation," Fuzzy Sets Syst., vol. 101, pp. 207–218, 1999.
[22] T. A. Runkler and J. C. Bezdek, "Alternating cluster estimation: A new tool for clustering and function approximation," IEEE Trans. Fuzzy Syst., vol. 7, no. 4, pp. 377–393, Aug. 1999.
[23] X. Wang, "On the gradient inverse weighted filter," IEEE Trans. Signal Process., vol. 40, no. 2, pp. 482–484, Feb. 1992.
[24] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 8, pp. 841–847, Aug. 1991.
[25] M.-S. Yang and K.-L. Wu, "Unsupervised possibilistic clustering," Pattern Recognit., vol. 39, no. 1, pp. 5–21, 2006.
[26] J. Zhou and C. C. Hung, "A generalized approach to possibilistic clustering," Int. J. Uncertainty, Fuzziness Knowl.-Based Syst., vol. 15, pp. 110–132, Apr. 2007.

Chih-Cheng Hung (M'90) received the B.S. degree in business mathematics from Soochow University, Taipei, Taiwan, and the M.S. and Ph.D. degrees in computer science from the University of Alabama, Huntsville, in 1986 and 1990, respectively. He is a Professor of computer science at Southern Polytechnic State University, Marietta, GA. He developed image processing software tools with the Department of Image Processing Applications at Intergraph Corporation from 1990 to 1993 and was an Associate Professor at Alabama A&M University, Normal, from 1993 to 1999. His research interests include image processing, pattern recognition, neural networks, genetic algorithms, artificial intelligence, and software metrics. He is an Associate Editor of Information: An International Interdisciplinary Journal. Prof. Hung served as the program co-chair for the Association for Computing Machinery (ACM) Symposium on Applied Computing (SAC 2010), Sierre, Switzerland, March 22–26, 2010.

Sameer S. Kulkarni received the B.S. degree in computer science from Pune University, Pune, India, in 2006. He is currently pursuing the M.S. degree in computer science at Southern Polytechnic State University, Marietta, GA, and is working on his M.S. thesis on image processing and pattern recognition in partial fulfillment of the requirements for the M.S. degree. His research interests are pattern recognition, remote sensing, image processing, and face recognition.

Bor-Chen Kuo (M'10) received the B.S. and M.S. degrees from National Taichung Teachers College, Taichung, Taiwan, in 1993 and 1996, respectively, and the Ph.D. degree from the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, in 2001. He is currently a Professor in the Graduate Institute of Educational Measurement and Statistics, National Taichung University, Taichung, Taiwan. His research interests are pattern recognition, remote sensing, image processing, and nonparametric functional estimation.
