
SCIENCE CHINA Information Sciences

• RESEARCH PAPERS •

January 2011 Vol. 54 No. 1: 197–203 doi: 10.1007/s11432-010-4074-x

Matrix calculation of high-dimensional cross product and its application in automatic recognition of the endmembers of hyperspectral imagery
GENG XiuRui1, ZHAO YongChao1, LIU SuHong2 & WANG FuXiang3
1Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100080, China; 2State Key Laboratory of Remote Sensing Science, Jointly Sponsored by Beijing Normal University and the Institute of Remote Sensing Applications of Chinese Academy of Sciences, School of Geography, Beijing Normal University, Beijing 100875, China; 3School of Electronics and Information Engineering, Beihang University, Beijing 10019, China
Received September 15, 2008; accepted July 26, 2009; published online September 14, 2010

Abstract This paper defines the high-dimensional cross product and gives its calculation by extending the definition of the 3-D cross product into high-dimensional vector space. Based on the properties of the cross product, a volume variance index (VVI) is proposed for automatically extracting the endmembers of hyperspectral imagery. The index removes a shortcoming of traditional methods that rely on the simplex alone, whose extraction results are easily affected by abnormal pixels. An endmember extraction experiment applying the VVI method to AVIRIS data for Cuprite shows a very good result.
Keywords endmember, simplex, hyperspectral imagery, cross product

Citation Geng X R, Zhao Y C, Liu S H, et al. Matrix calculation of high-dimensional cross product and its application in automatic recognition of the endmembers of hyperspectral imagery. Sci China Inf Sci, 2011, 54: 197–203, doi: 10.1007/s11432-010-4074-x

Corresponding author (email: gengxr@sina.com.cn)
© Science China Press and Springer-Verlag Berlin Heidelberg 2010
info.scichina.com www.springerlink.com

Introduction

The development of high-spectral-resolution remote sensing techniques was one of the breakthroughs of the EOS in the late 1990s, and the application of imaging spectrometry is at the frontier of remote sensing. Due to the complexity and diversity of surface objects and the limited spatial resolution of the sensors, mixed pixels exist ubiquitously in remotely sensed images. On this basis, every pixel in a hyperspectral image can be considered a linear mixture of the endmembers of the image. The extraction of the endmembers is therefore the prerequisite for understanding hyperspectral data and performing further analysis on it.

How to extract the endmembers has been a hot issue in hyperspectral image processing, and a number of methods and algorithms are available. Boardman [1] proposed extracting the endmembers by convex geometry analysis, pointing out that all the data of a hyperspectral image in its feature space is covered by the simplex whose vertices correspond to the pure pixels representing the surface objects (endmembers). Together with Kruse and Green [2], he developed the pure pixel index (PPI) algorithm to extract the endmembers. The minimal volume transform (MVT) method proposed by Craig [3] obtains the endmembers by finding the smallest-volume simplex that encompasses the entire hyperspectral data cloud. Bateson and Curtiss [4] developed a man-machine interactive method (MEST) to extract the endmembers based on principal component analysis and multi-dimensional visualization software. The N-FINDR [5] algorithm finds the set of endmembers with the largest possible volume by inflating a simplex within the data.

The iterative error analysis (IEA) [6] is an endmember extraction algorithm that requires neither dimension reduction nor removal of redundancy from the original data. It first sets an initial vector (generally the mean vector of all the spectra) and then conducts linear unmixing step by step: in each step one endmember is extracted from the error image and added to the calculation of the next step, until all the endmembers have been found according to the given criteria. To account for the variability of endmember spectra, Roberts [7] developed the multiple endmember spectral mixture analysis (MESMA). Its main point is that each endmember is represented by a group of vectors rather than a single vector; during linear unmixing, the most suitable vector is chosen from each group so as to minimize the root mean square error, and the endmembers can be chosen from the image data or from a spectral library of the region. Noticing the variability of the endmembers, Bateson proposed the concept of endmember bundles and generated them using the simulated annealing algorithm.
Plaza [8] proposed an algorithm that extracts endmembers automatically based on morphology, making use of the spatial correlation of the pixels in addition to the spectral information. The vertex component analysis (VCA) method, proposed by Nascimento [9], has an advantage in extraction speed. Based on the fast convergence of non-negative matrix factorization, Miao [10] designed a novel endmember extraction method that does not require the assumption that pure pixels exist in the image.

The aforementioned endmember extraction methods are all based on the linear mixture of the pixels in a hyperspectral image, which is equivalent to using the simplex structure of the scatter points of the hyperspectral data in feature space. However, abnormal pixels arise in an image from various influences during the acquisition of the hyperspectral data. Algorithms based only on the linear mixture model or on the simplex volume would automatically extract those abnormal pixels as endmembers, which is obviously not beneficial to the further processing and analysis of the image. This paper extends the definition of the 3-D cross product into high-dimensional vector space and, based on the properties of the high-dimensional cross product, proposes a volume variance index for the automatic extraction of endmembers that takes into consideration both the geometrical properties and the statistical information of the data distribution.

The problem background

In general, each pixel in a hyperspectral image can be considered as a linear mixture of the endmembers of the image, which can be expressed as

$$p = \sum_{i=1}^{N} c_i e_i + n = Ec + n, \tag{1}$$

$$\sum_{i=1}^{N} c_i = 1, \tag{2}$$

$$0 \leqslant c_i \leqslant 1, \tag{3}$$

where $N$ is the number of endmembers in the image, and $c_i$ is a scalar representing the fractional abundance of endmember vector $e_i$ in the pixel corresponding to spectral vector $p$. $E = [e_1, e_2, \ldots, e_N]$ is an $L \times N$ matrix ($L$ is the number of bands of the original data) composed of all endmember vectors. $c = (c_1, c_2, \ldots, c_N)^T$ and $n$ are the abundance vector and the noise vector, respectively.

Figure 1 The triangular structure of three endmembers in a 2-D scatterplot.

There are three cases of the linear mixture model. Case 1 is eq. (1) alone, with no restriction. Adding the restriction of eq. (2) gives case 2, the partially restricted linear mixture model. Adding the restriction of eq. (3) on top of case 2 gives case 3, the fully restricted mixture model. Linear unmixing calculates the proportions of the given endmembers in each pixel of an image, so that the abundance map of each endmember in that image can be obtained. Using the method of least squares, the unconstrained solution to eq. (1) is

$$\hat{c} = (E^T E)^{-1} E^T p. \tag{4}$$
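The unconstrained estimate of eq. (4) can be sketched in a few lines of NumPy. This is only an illustration: the function name `unmix_unconstrained` and the toy matrix `E` are assumptions made for the example and do not come from the paper, and the solver call replaces the explicit inverse for numerical stability.

```python
import numpy as np

def unmix_unconstrained(E, p):
    """Unconstrained least-squares abundances, c_hat = (E^T E)^{-1} E^T p (eq. (4)).

    E: (L, N) endmember matrix, one endmember spectrum per column.
    p: (L,) pixel spectrum.
    """
    # lstsq gives the same least-squares solution as the normal equations
    # without forming the inverse of E^T E explicitly.
    c_hat, *_ = np.linalg.lstsq(E, p, rcond=None)
    return c_hat

# Hypothetical toy data: L = 4 bands, N = 3 endmembers.
E = np.array([[1.0, 0.0, 0.2],
              [0.0, 1.0, 0.3],
              [0.5, 0.5, 1.0],
              [0.2, 0.1, 0.4]])
c_true = np.array([0.6, 0.3, 0.1])   # satisfies eqs. (2) and (3)
p = E @ c_true                        # noise-free pixel, n = 0

c_hat = unmix_unconstrained(E, p)     # recovers c_true when n = 0
```

With noise present, the unconstrained solution is not guaranteed to satisfy eqs. (2) and (3); enforcing them requires a constrained solver, which this sketch does not attempt.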

When the error term $n$ is very small, the set of all the points that satisfy eqs. (1)–(3) forms a convex set in high-dimensional space, namely a simplex with the endmembers sitting at its vertices. Figure 1 illustrates the geometrical relationship between the endmembers, taking 2 bands and 3 endmembers as an example. As shown in the figure, the endmembers $a$, $b$ and $c$ are located at the three vertices of the triangle, and the points inside the triangle correspond to the mixed pixels of the image. As a result, the problem of extracting the endmembers of a hyperspectral image is turned into the problem of finding the vertices of the simplex. Notably, the area of the triangle formed by the endmembers $a$, $b$ and $c$ in Figure 1 is the maximum among all the triangles formed by any three pixels of the image, that is,

$$S(a, b, c) = \max\{S(i, j, k)\}, \tag{5}$$

where $S(i, j, k)$ represents the area of the triangle formed by the pixels $i$, $j$ and $k$. Therefore, the problem of finding the vertices of the simplex becomes the problem of seeking the simplex of maximal volume. Many of the algorithms mentioned in the previous section are based on this idea. Those algorithms use only the geometrical information of the image distribution in feature space, without taking into account the statistical information of the scatter of the points. By extending the definition of the 3-D cross product into high-dimensional vector space, a volume variance index is proposed in the next section for the automatic extraction of the endmembers of hyperspectral imagery, which takes into consideration both the geometrical properties and the statistical information of the data distribution.
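The maximal-area property of eq. (5) can be checked on synthetic data. The sketch below is an illustration under assumed values: the three 2-D endmembers, the Dirichlet-distributed abundances and the brute-force search are choices made for this example, not the paper's experiment.

```python
import itertools
import numpy as np

def triangle_area(a, b, c):
    """Area of the 2-D triangle with vertices a, b, c."""
    u, v = b - a, c - a
    return 0.5 * abs(u[0] * v[1] - u[1] * v[0])

rng = np.random.default_rng(0)
# Hypothetical endmembers a, b, c in a 2-band feature space.
endmembers = np.array([[0.1, 0.2], [0.9, 0.3], [0.4, 0.8]])
# Abundances are non-negative and sum to one (eqs. (2) and (3)), noise n = 0.
abundances = rng.dirichlet(np.ones(3), size=30)
mixed = abundances @ endmembers            # mixed pixels inside the triangle
pixels = np.vstack([endmembers, mixed])    # pure pixels are rows 0, 1, 2

# Brute-force search over all pixel triples for the largest triangle.
best = max(itertools.combinations(range(len(pixels)), 3),
           key=lambda t: triangle_area(*pixels[list(t)]))
```

Here `best` holds the indices of the three pure pixels, since every other pixel lies inside their triangle. The exhaustive search is only feasible for toy sizes, which is why practical algorithms search iteratively.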

The algorithm

As is well known, the cross product is one of the basic operations in 3-D space: the cross product of two 3-D vectors is a new vector whose direction is perpendicular to the plane formed by the two vectors and whose magnitude equals the area of the parallelogram formed by them. Supposing $a = (a_1, a_2, a_3)$ and $b = (b_1, b_2, b_3)$, their cross product $c$ can be expressed as

$$c = a \times b = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}, \tag{6}$$

where $|\cdot|$ denotes the determinant, and $i$, $j$, $k$ are the unit vectors of the three coordinate axes. For eq. (6) to make sense, $a$ and $b$ must be 3-D vectors. To extend this concept to high-dimensional space, assume that there are $N-1$ $N$-dimensional vectors $e_i = (e_{i1}, e_{i2}, \ldots, e_{iN})$, $i = 1, 2, \ldots, N-1$. In the $N$-dimensional vector space, we define their cross product as

$$d = e_1 \times e_2 \times \cdots \times e_{N-1} = \begin{vmatrix} i_1 & \cdots & i_N \\ e_{11} & \cdots & e_{1N} \\ \vdots & & \vdots \\ e_{N-1,1} & \cdots & e_{N-1,N} \end{vmatrix}, \tag{7}$$

where $i_1, i_2, \ldots, i_N$ are the unit vectors of the $N$-dimensional coordinate axes. It is obvious that $d$ is a new vector whose direction is perpendicular to the hyperplane spanned by $e_1, e_2, \ldots, e_{N-1}$ (denoted $\mathrm{span}(e_1, e_2, \ldots, e_{N-1})$), and whose magnitude equals the volume of the $(N-1)$-dimensional parallelotope formed by the vectors $e_1, e_2, \ldots, e_{N-1}$. The volume of the simplex with $e_1, e_2, \ldots, e_{N-1}$ as its vertices can be calculated as

$$V(e_1, e_2, \ldots, e_{N-1}) = \frac{1}{(N-2)!}\,\|d\|. \tag{8}$$

For eq. (7) to make sense, exactly $N-1$ $N$-dimensional vectors $e_i = (e_{i1}, e_{i2}, \ldots, e_{iN})$ must participate in the calculation. That is to say, the cross product in $N$-dimensional space is not an operation on two vectors; it requires $N-1$ vectors. As a special case, the cross product in 2-D space is a unary operation. For example, given a non-zero vector $x = (x_1, x_2)$ in 2-D space, its cross product is

$$y = \begin{vmatrix} i & j \\ x_1 & x_2 \end{vmatrix},$$

and obviously $y \perp x$ and the two vectors have the same length.

Assuming that $e_1, e_2, \ldots, e_{N-1}$ are all the endmembers retrieved from an image, we hope that the majority of the information of the image is distributed in the hyperplane $\mathrm{span}(e_1, e_2, \ldots, e_{N-1})$; the less information distributed in the orthogonal complement of $\mathrm{span}(e_1, e_2, \ldots, e_{N-1})$, the better. The variance of the image in the direction of $d$ can be expressed as

$$\operatorname{var}(d) = \frac{d^T \Sigma d}{\|d\|^2}, \tag{9}$$

where $\Sigma$ is the covariance matrix of the $N$-dimensional hyperspectral data. Considering both the magnitude and the direction of the cross product $d$ of the vectors $e_1, e_2, \ldots, e_{N-1}$, the volume variance index used to extract the endmembers is defined as

$$p(e_1, e_2, \ldots, e_{N-1}) = \frac{\|d\|}{\operatorname{var}(d)} = \frac{\|d\|^3}{d^T \Sigma d}. \tag{10}$$
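The determinant in eq. (7) can be evaluated by cofactor expansion along its first row: component $k$ of $d$ is the signed minor obtained by deleting column $k$ from the matrix whose rows are $e_1, \ldots, e_{N-1}$. The following NumPy sketch implements that expansion together with eqs. (8) and (10). The function names are hypothetical and chosen for this illustration, and the factor $1/(N-2)!$ is taken from eq. (8) as written.

```python
import math
import numpy as np

def cross_product(vectors):
    """Cross product of N-1 vectors in N-dimensional space (eq. (7)).

    vectors: array-like of shape (N-1, N), one vector per row.
    """
    vectors = np.asarray(vectors, dtype=float)
    n = vectors.shape[1]
    if vectors.shape != (n - 1, n):
        raise ValueError("need exactly N-1 vectors of dimension N")
    d = np.empty(n)
    for k in range(n):
        minor = np.delete(vectors, k, axis=1)        # drop column k
        d[k] = (-1) ** k * np.linalg.det(minor)      # cofactor of unit vector i_k
    return d

def simplex_volume(vectors):
    """Volume as written in eq. (8): ||d|| / (N-2)!."""
    n = np.asarray(vectors).shape[1]
    return np.linalg.norm(cross_product(vectors)) / math.factorial(n - 2)

def volume_variance_index(vectors, cov):
    """Volume variance index of eq. (10): ||d||^3 / (d^T Sigma d)."""
    d = cross_product(vectors)
    return np.linalg.norm(d) ** 3 / (d @ cov @ d)

# 3-D check: the definition reduces to the ordinary cross product.
a, b = np.array([1.0, 2.0, 3.0]), np.array([-1.0, 0.5, 2.0])
d3 = cross_product([a, b])

# 5-D example with four random vectors (illustrative data only).
rng = np.random.default_rng(0)
vecs = rng.normal(size=(4, 5))
d5 = cross_product(vecs)     # orthogonal to every row of vecs
```

In the 2-D special case, `cross_product([[x1, x2]])` returns `(x2, -x1)`, which is perpendicular to `x` and has the same length, in agreement with the text.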

The $N-1$ $N$-dimensional vectors $e_1, e_2, \ldots, e_{N-1}$ are taken as the endmembers if they maximize the value of $p(e_1, e_2, \ldots, e_{N-1})$. The index is meaningful because it grows both with the volume $\frac{1}{(N-2)!}\|d\|$ of the simplex formed by the endmembers $e_1, e_2, \ldots, e_{N-1}$ and with the amount of information projected onto the hyperplane. Because the abnormal pixels of an image are normally isolated far away from the data cloud, they would be included among the endmembers by algorithms based only on the volume of the simplex. The introduction of the variance in the calculation proposed in this paper avoids this.

In many cases, the number of bands of a hyperspectral image is much larger than the latent dimension of the image, which, according to the linear mixing theory, depends on the total number of meaningful surface objects. If there are $N-1$ endmembers in an image, the dimension of the original data must be reduced to $N$. This can be achieved by principal component analysis (PCA), which eliminates the redundant information while keeping the information content essentially unchanged and thus reduces the dimension of the original image to the level required. The detailed steps of the algorithm are as follows:

(1) Use PCA to reduce the original data to $N$ dimensions if needed.
(2) Take any $N-1$ vectors of the image as the initial vectors and calculate the volume variance index $p_0$ using eq. (10).
(3) Traverse all the pixels of the image sequentially. For each newly picked pixel, replace each element of the current vector set with it in turn and calculate the corresponding volume variance indexes. Let $p_1$ be the maximal value among them. If $p_1 > p_0$, the corresponding vector in the set is replaced by the new pixel and $p_0$ is replaced by $p_1$; otherwise, continue to the next pixel of the image.
(4) Repeat step (3) until the value of the volume variance index no longer changes. The resulting $N-1$ vectors are the endmembers.
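Steps (2)–(4) above can be sketched as a greedy replacement loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is assumed to be already reduced to $N$ dimensions (step (1)), the first $N-1$ pixels serve as the initial vectors, degenerate vector sets are given an index of zero, and the names `vvi` and `extract_endmembers` are hypothetical.

```python
import numpy as np

def cross_product(vectors):
    """Cofactor expansion of eq. (7) for an (N-1, N) array."""
    n = vectors.shape[1]
    return np.array([(-1) ** k * np.linalg.det(np.delete(vectors, k, axis=1))
                     for k in range(n)])

def vvi(vectors, cov):
    """Volume variance index of eq. (10); 0.0 for degenerate sets (assumption)."""
    d = cross_product(vectors)
    denom = d @ cov @ d
    return np.linalg.norm(d) ** 3 / denom if denom > 0 else 0.0

def extract_endmembers(pixels, max_passes=1000):
    """pixels: (M, N) array already reduced to N dimensions.

    Returns the indices of the N-1 selected pixels and the final index value.
    """
    m, n = pixels.shape
    cov = np.cov(pixels, rowvar=False)
    idx = list(range(n - 1))                 # step (2): arbitrary initial vectors
    p0 = vvi(pixels[idx], cov)
    for _ in range(max_passes):              # step (4): repeat until no change
        changed = False
        for j in range(m):                   # step (3): traverse all pixels
            if j in idx:
                continue
            best_p, best_slot = p0, None
            for slot in range(n - 1):        # try pixel j in every position
                trial = idx.copy()
                trial[slot] = j
                p1 = vvi(pixels[trial], cov)
                if p1 > best_p:
                    best_p, best_slot = p1, slot
            if best_slot is not None:        # p1 > p0: accept the replacement
                idx[best_slot] = j
                p0 = best_p
                changed = True
        if not changed:
            break
    return idx, p0

# Illustrative run on random 3-D data (two vectors are selected).
rng = np.random.default_rng(1)
pixels = rng.normal(size=(60, 3))
idx, p_final = extract_endmembers(pixels)
```

Because the index strictly increases with every accepted replacement and the number of candidate sets is finite, the loop terminates. The result is a local maximum with respect to single-pixel replacements and may depend on the initial vectors.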

The experiment validation

The experiment extracts the surface objects from the AVIRIS data for Cuprite supplied with ENVI (Figure 2). AVIRIS is a high-quality, low-noise hyperspectral instrument with 224 bands (wavelengths from 400 to 2500 nm). Only 50 consecutive shortwave infrared bands (from 1978 to 2478 nm) are selected for the algorithm validation. First, the image is transformed using PCA. Because the eigenvalues of the first 7 principal components account for 99.43% of the sum of all the eigenvalues, the first 7 bands resulting from the PCA are selected for the endmember extraction experiment. According to the algorithm proposed in this paper, 6 endmembers should be extracted after the dimension reduction.

The method using the volume only (eq. (8)) is compared with the method using the volume variance index (eq. (10)). As illustrated in Figure 3, two of the 6 endmembers are common to both methods. Of the remaining 4 endmembers, three are quite similar (Figure 4(a)–(c)), leaving one with a significant difference in the spectral curve (Figure 4(d)). Among the obtained endmembers, endmember 2 represents kaolinite, endmember 3 represents alunite, and endmember 4 represents calcite. These endmember spectra closely match the measured spectra of those surface objects in the spectral database. It is worth noting that endmember 6 extracted by the volume-only method has an abnormal peak at band 44 (Figure 4(d)). Stretching the image at band 44 makes it obvious that this endmember is an abnormal pixel (Figure 5). The introduction of the volume variance index enables us to detect this abnormality and thus exclude such abnormal pixels from the endmembers (Figure 4(d)).

Based on the real distribution of the surface objects in this area, endmember 6 obtained by using the volume variance index corresponds to surface objects such as silicon substances and rhyolitic tuff, which are broadly distributed in the region (as shown by the red ellipse in Figure 2). This indicates that the introduction of the volume variance index excludes the abnormal pixels from the endmembers, so that what this method extracts are endmembers representing meaningful surface objects.
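The dimension selection used in this experiment, keeping the leading principal components whose eigenvalues reach a given share of the eigenvalue sum, can be sketched as follows. The threshold parameter `energy`, the function name `pca_reduce` and the synthetic data are assumptions made for this illustration; they do not reproduce the Cuprite experiment.

```python
import numpy as np

def pca_reduce(data, energy=0.99):
    """Project (M, L) data onto the fewest leading principal components
    whose eigenvalues sum to at least `energy` of the total.

    Returns the reduced data and the cumulative eigenvalue fractions
    of the retained components.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    eigvals = np.clip(eigvals[::-1], 0.0, None)       # descending, no tiny negatives
    eigvecs = eigvecs[:, ::-1]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(ratio, energy)) + 1, len(eigvals))
    return centered @ eigvecs[:, :k], ratio[:k]

# Synthetic example: 10 bands generated from 3 latent components plus weak noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
data = latent @ mixing + 1e-3 * rng.normal(size=(200, 10))

reduced, ratio = pca_reduce(data, energy=0.99)
```

If `reduced` has $N$ columns, the algorithm of the previous section then searches for $N-1$ endmembers, in the same way that the 7 retained components lead to 6 endmembers in the experiment.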

Conclusions

This paper extends the concept of the 3-D cross product into high-dimensional vector space, and gives the definition of the high-dimensional cross product and its calculation. Based on the properties of the high-dimensional cross product, a volume variance index is proposed for the automatic extraction of the endmembers of hyperspectral images. Because it takes into account both the geometrical aspect and the statistical aspect (the volume and the variance), it is free of the shortcomings of the traditional geometry-focused methods of endmember extraction. Both the analysis and the experimental result show that our method can better avoid abnormal pixels, indicating that the introduction of the volume variance index is quite valuable. As any method has its application boundary, the method proposed herein may not be beneficial if the interest of an application lies in the abnormal pixels or the small objects of the imagery data.

Figure 2 False color composite image of the Cuprite sample data.

Figure 3 Results of endmember extraction from volume (eq. (8)) and VVI (eq. (10)).

Figure 4 Endmembers extracted from the image by volume (eq. (8)) and VVI (eq. (10)). (a) The 2nd endmember; (b) the 4th endmember; (c) the 5th endmember; (d) the 6th endmember.


Figure 5 The abnormal pixel in band 44 (the left image is the band 44 gray image; the right image shows a local region of the image, where the bright point is the abnormal pixel).

Acknowledgements This work was supported by the National Basic Research Program of China (Grant Nos. 2007CB714406, 2007CB714401, 2007CB403507), the National Natural Science Foundation of China (Grant No. 40501041), and the National Science Foundation of USA (Grant No. 0421530).

References
1 Boardman J W. Automating spectral unmixing of AVIRIS data using convex geometry concepts. In: Summaries of the Fourth Annual JPL Airborne Geoscience Workshop, AVIRIS Workshop. Pasadena, CA: Jet Propulsion Laboratory, 1993. 11–14
2 Boardman J W, Kruse F A, Green R O. Mapping target signatures via partial unmixing of AVIRIS data. In: Summaries of the V JPL Airborne Earth Science Workshop, Pasadena, CA, 1995
3 Craig M D. Minimum volume transforms for remotely sensed data. IEEE Trans Geosci Remote Sens, 1994, 32: 542–552
4 Bateson C A, Curtiss B. A tool for manual endmember selection and spectral unmixing. In: Summaries of the V JPL Airborne Earth Science Workshop, Pasadena, CA, 1993
5 Winter M E. N-FINDR: An algorithm for fast autonomous spectral end-member determination in hyperspectral data. In: Proc SPIE, 1999. 3753: 266–275
6 Neville R A, Nadeau C, Levesque J, et al. Hyperspectral imagery for mineral exploration: comparison of data from two airborne sensors. In: Proceedings of the International SPIE Symposium on Imaging Spectrometry, SPIE Vol. 3438, San Diego, California, 1998. 74–82
7 Roberts D A, Gardner M, Church R, et al. Mapping chaparral in the Santa Monica Mountains using multiple endmember spectral mixture models. Remote Sens Envir, 1998, 65: 267–279
8 Plaza A, Martinez P, Perez R M. Spatial/spectral endmember extraction by multidimensional morphological operations. IEEE Trans Geosci Remote Sens, 2002, 40: 2025–2041
9 Nascimento J M P, Dias J M B. Vertex component analysis: a fast algorithm to unmix hyperspectral data. IEEE Trans Geosci Remote Sens, 2005, 43: 898–910
10 Miao L D, Qi H R. Endmember extraction from highly mixed data using minimum volume constrained nonnegative matrix factorization. IEEE Trans Geosci Remote Sens, 2007, 45: 765–777
