Vignesh R. Ramachandran
Johns Hopkins Applied Physics Laboratory
Laurel, MD, USA
Vinny.Ramachandran@jhuapl.edu
Herbert J. Mitchell
Naval Postgraduate School
Monterey, CA, USA
herbert.mitchell@jieddo.mil
Samantha K. Jacobs
Johns Hopkins Applied Physics Laboratory
Laurel, MD, USA
Samantha.Jacobs@jhuapl.edu
Nigel H. Tzeng
Johns Hopkins Applied Physics Laboratory
Laurel, MD, USA
Nigel.Tzeng@jhuapl.edu
Alexer H. Firpi
Johns Hopkins Applied Physics Laboratory
Laurel, MD, USA
Alexer.Firpi@jhuapl.edu
Benjamin M. Rodriguez
Johns Hopkins Applied Physics Laboratory
Laurel, MD, USA
Benjamin.Rodriguez@jhuapl.edu
TABLE OF CONTENTS
1 INTRODUCTION
2 RELATED WORK
3 METHODOLOGY
4 FINDINGS AND ANALYSIS
5 CONCLUSION AND FUTURE WORK
REFERENCES
BIOGRAPHY
978-1-4799-1622-1/14/$31.00 ©2014 IEEE.
IEEEAC Paper #2635, Version 1, Updated 11/15/2013.
1. INTRODUCTION
The remainder of this paper is structured as follows. In Section 2, related work in the area of spectral signature matching
is presented. Our proposed method, signature classification
via scoring and clustering, is described in Section 3. Findings
and analyses are provided in Section 4. Finally, we conclude
and describe opportunities for future work in Section 5.
3. METHODOLOGY
We formulate a methodology to rapidly classify a new,
unknown signature by identifying signatures in a spectral
library with similar spectral features. Instead of individually comparing the unknown signature against each member
of the library, the proposed method precomputes a score
representation of the library against a small number (N) of
artificial reference spectra. Using a derivative of the SAM
method, the scalar spectral angle values between each library
signature and the reference spectra are treated as coordinates
of a point in N-dimensional space, and cached within the
library; when a new signature is introduced, it is compared
against the reference spectra to produce a corresponding set
of coordinates. Sets of coordinates separated by smaller
Euclidean distances then correspond to library spectra with
greater similarity to the new signature. This method is intended to filter the library down to a small subset of candidate
matches.
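The filtering step can be sketched as follows. This is a minimal illustration rather than the SigDB implementation: the signature names and score values are hypothetical, and the function names `euclidean` and `nearest_candidates` are introduced here only for the example. It assumes that score vectors for the library have already been computed and cached.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest_candidates(unknown_scores, library_scores, k=3):
    """Return the k library names whose cached score vectors lie closest."""
    ranked = sorted(
        library_scores,
        key=lambda name: euclidean(unknown_scores, library_scores[name]),
    )
    return ranked[:k]

# Hypothetical cached scores against N = 3 reference spectra.
library_scores = {
    "quartz": (90.1, 120.0, 20.0),
    "calcite": (90.3, 95.0, 40.0),
    "gypsum": (89.9, 118.5, 22.0),
    "olivine": (90.0, 60.0, 5.0),
}
unknown = (90.0, 119.0, 21.0)
candidates = nearest_candidates(unknown, library_scores, k=2)
print(candidates)
```

Only the short list returned here would then need a full point-by-point comparison against the unknown signature.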
2. RELATED WORK
The need for spectral signature comparison and identification
has driven substantial work in the application of pattern
recognition, unsupervised and semi-supervised learning, and
data clustering [3] [4] [5]. Further, the need to quantify and
analyze enormous quantities of spectral data has spawned
many attempts at spectral collections or databases, with
mixed results [6]. The inability to reduce various methods
of collection and phenomenologies of spectra into a least
common denominator representation has made the problem
computationally challenging, especially without the use of
copious metadata to explain the exact conditions and context
in which the spectra were collected.
The classification and identification of unlabeled data has
been studied in great detail in the spatial domain, and a
wealth of effective solutions has been developed to address the problem [3] [7]. Spatial clustering algorithms
in general attempt to determine natural boundaries between
non-uniformly distributed spatial data points. Of these, k-means is a relatively simple approach: given some known
number of clusters k, cluster center points are randomly
distributed among the sample data, then iteratively updated
to reflect the average of their nearby constituents. A key
assumption is advance knowledge of k, as this algorithm has
no ability to merge or split existing clusters. However, it
is the simplest in a large field of clustering solutions that
includes hierarchical clustering, fuzzy k-means, DBSCAN,
expectation-maximization, and many others [3][8]. If the
spectral classification problem can be effectively adapted into
the spatial domain, any of these existing methods can be
applied.
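For reference, the k-means procedure described above can be sketched in a few lines. This is a generic textbook version, not code from the paper; the function name `kmeans` and the sample points are illustrative. The initial centers are passed in explicitly so that the example is reproducible, whereas the description above draws them at random from the sample data.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain k-means: assign each point to its nearest center, then move
    each center to the mean of its members, until assignments stop changing."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centers, labels

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels = kmeans(points, centers=[points[0], points[2]])
print(centers, labels)
```

Note that k is fixed by the number of initial centers supplied; as stated above, the algorithm never merges or splits clusters.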
Several direct and indirect methods exist to compare signatures against each other, such as Spectral Angle Mapping
(SAM) [4], multiple endmember spectral mixture analysis
(MESMA) [9], peak detection, and similarity indices such as
the Pearson correlation [1]. In general practice, the sensitivity
and utility of each method are inversely correlated with its
runtime computational complexity [1]. The SAM algorithm
is of specific interest due to its ability to precisely describe
the difference between two signatures without regard for the
relative illumination within the spectra (which is irrelevant
to the spectral features of the observed material) [4]. SAM
measures similarity by taking a signature in two dimensions
(X and Y) and creating a spectral vector consisting of its Y
$$\vec{\theta} = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_N \end{bmatrix} \tag{1}$$
where N is the number of characteristic functions $\vec{B}$. The
$\theta$ values can then be considered the coordinates of a point in
N-dimensional space, where B1..BN serve as axes (orthogonality is not required, but is desirable). In this new spatial
representation, score similarity can be characterized as the
Euclidean distance between two points; thus this enables the
use of existing spatial clustering algorithms, such as k-means,
to perform classification of spectra.
Detailed Approach
Each signature in the spectral database consists of columns
of spectral data accompanied by various optional metadata
properties, such as sensor identification and calibration, environmental conditions, sample identification and description,
axis units and labels, and any known observable associations.
The data is in the wavelength domain with value columns
representing either reflectivity or emissivity, as indicated by
axis properties (see Figure 3, an example signature from
the ASTER library [12]). Note that NaN float values are
used to represent invalid or removed data points within the
spectra, such as deliberately suppressed water bands. A
hash of the spectral data uniquely identifies the signatures;
therefore, two signatures having the same identifier are assumed to be identical, and cannot both exist in the database.
The phenomenology of the signature (LWIR, MWIR, SWIR,
VIS/NIR) is also indicated by metadata properties.
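As an illustration of the identifier and NaN conventions just described, the sketch below hashes the spectral columns and filters out invalid points. The choice of SHA-256, the byte layout, and the function names `signature_id` and `valid_points` are assumptions made for this example; the paper does not specify how SigDB computes its hash.

```python
import hashlib
import math
import struct

def signature_id(wavelengths, values):
    """Hash the 64-bit float columns so identical data yields one identifier.
    SHA-256 and little-endian packing are assumptions of this sketch."""
    digest = hashlib.sha256()
    for w, v in zip(wavelengths, values):
        if math.isnan(v):
            v = float("nan")  # use one NaN bit pattern for every invalid point
        digest.update(struct.pack("<dd", w, v))
    return digest.hexdigest()

def valid_points(wavelengths, values):
    """Drop points whose value is NaN (for example, suppressed water bands)."""
    return [(w, v) for w, v in zip(wavelengths, values) if not math.isnan(v)]

wl = [1.0, 1.4, 2.0]
refl = [0.50, float("nan"), 0.62]
print(signature_id(wl, refl))
print(valid_points(wl, refl))
```

Because the identifier depends only on the data columns, re-importing the same spectrum with different metadata would map to the same identifier under this scheme.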
Preconditions
The choice of mathematical functions used to produce reference spectra, and thus the spectral angle scores used for
comparison, is a critical factor in the utility of this approach.
Poorly chosen functions result in spectral angles that are
highly similar for many or all library spectra. Functions that
perform well in one spectral band may perform poorly in
others, thus requiring different sets of functions for different
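$$\sum_{i=1}^{M} a_i b_i = a_1 b_1 + \dots + a_M b_M \tag{2}$$

[Figure 3. Example signature from the ASTER library.]

$$\vec{A} \cdot \vec{B} = |A||B| \cos\theta_{AB} = \sum_{i=1}^{M} a_i b_i \tag{3}$$

$$\theta_{AB} = \cos^{-1}\left(\frac{\sum_{i=1}^{M} a_i b_i}{|A||B|}\right) \tag{4}$$

$$\vec{\theta}_{S_i} = \begin{bmatrix} \theta_{S_i f_1} \\ \theta_{S_i f_2} \\ \vdots \\ \theta_{S_i f_N} \end{bmatrix} \tag{5}$$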
            scores[S_i][f_j] = cos^-1(product / (sMag * fMag))
        end for
    end for
    return scores
end procedure
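The scoring step can be sketched in Python as below. This is an illustration consistent with Equation (4), not the SigDB plug-in code. It assumes that each signature and each reference spectrum are sampled at the same M points, that NaN values have already been removed, and that angles are reported in degrees; the names `spectral_angle` and `score_library` and the sample data are hypothetical.

```python
import math

def spectral_angle(a, b):
    """Angle in degrees between two equal-length spectra, per Equation (4)."""
    product = sum(x * y for x, y in zip(a, b))
    mag_a = math.sqrt(sum(x * x for x in a))
    mag_b = math.sqrt(sum(y * y for y in b))
    cosine = product / (mag_a * mag_b)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))

def score_library(signatures, functions):
    """scores[name][j] is the angle between signature `name` and reference j."""
    return {
        name: [spectral_angle(s, f) for f in functions]
        for name, s in signatures.items()
    }

signatures = {"dim": [1.0, 2.0, 3.0], "bright": [2.0, 4.0, 6.0]}
functions = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
scores = score_library(signatures, functions)
print(scores)
```

The two sample signatures differ only by a constant factor and therefore receive the same scores, which reflects the insensitivity of the spectral angle to overall illumination.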
Name             Equation
10-nm Cosine     y = cos(100x)
1-µm Cosine      y = cos(x)
100-µm Cosine    y = cos(x/100)
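To make the characteristic functions concrete, the sketch below samples each cosine at a signature's own wavelengths to produce reference spectra. It assumes x is the wavelength in micrometres, which is consistent with the function names but is not stated explicitly; the dictionary keys and the function name `reference_spectra` are illustrative.

```python
import math

# The three cosine characteristic functions; x is assumed to be in micrometres.
CHARACTERISTIC_FUNCTIONS = {
    "10-nm cosine": lambda x: math.cos(100.0 * x),
    "1-um cosine": lambda x: math.cos(x),
    "100-um cosine": lambda x: math.cos(x / 100.0),
}

def reference_spectra(wavelengths):
    """Sample each characteristic function at the given wavelengths."""
    return {
        name: [f(x) for x in wavelengths]
        for name, f in CHARACTERISTIC_FUNCTIONS.items()
    }

refs = reference_spectra([0.0, math.pi])
print(refs["1-um cosine"])
```

Sampling the functions on the signature's own wavelength grid means each reference spectrum has exactly as many points as the signature it is compared against.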
Implementation
In support of ISP spectral analysis, JHU/APL has developed
a standardized database schema representation of spectral
signatures, and an associated Java-language software application SigDB to aid in exchange, preliminary analysis, comparison and classification of collected spectra. The scoring
and clustering methodology described herein was developed
as a plug-in capability for the SigDB application, which
enabled immediate access to a large quantity of spectral data
and a framework for analysis. SigDB stores signature data
in an SQL relational database as IEEE 754 64-bit floating
Results
Table 2 shows the number of computations and runtimes
for each of the algorithm processes performed. The first
row is a traditional spectral angle mapping comparison of
the unknown signature against the database spectra. Although the database contains 1,800 signatures, only 477
match the same minimum/maximum domain, so only these
were selected for comparison; of these, only 17 signatures
matched every data point's domain precisely. Thus, the other
460 calculations were terminated before completion. This
process took approximately forty seconds; the generated
scores, along with the name of each signature, are shown
in Table 3. The second row includes computation of
scores against each of the three characteristic functions, k-means cluster classification (which included 29 iterations of
the clustering algorithm), and storage in the database. All
database spectra were scored, but no actual comparisons were
performed in this step. This process took approximately four
minutes. The third row includes calculation of the unknown
signature's scores against the characteristic functions, then
Euclidean distance comparison of those scores against each
of the 30 cluster centroids to determine classification. This
process took ten milliseconds. The final row, which is not
performed in the normal course of k-means analysis, was a
recalculation of the unknown signature's scores (in order to
Table 2
Process                           # Score Calculations   # Comparisons   Run-Time (ms)   Avg ms per Comparison
SAM Algorithm                     477 (17)               477 (17)        39,794          83
cSAM Score Computation            1800                   0               223,340         N/A
cSAM Cluster Comparison           1                      30              10              0.33
cSAM Euclidean Score Comparison   1                      1800            15              0.01
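The per-comparison averages follow directly from the reported run-times and comparison counts. The short check below reproduces them from the figures reported in Table 2; the variable names are illustrative.

```python
# Run-time (ms) and number of comparisons as reported in Table 2.
rows = {
    "SAM Algorithm": (39794, 477),
    "cSAM Cluster Comparison": (10, 30),
    "cSAM Euclidean Score Comparison": (15, 1800),
}
avg_ms = {name: runtime / count for name, (runtime, count) in rows.items()}
for name, value in avg_ms.items():
    print(f"{name}: {value:.2f} ms per comparison")
```

The score-computation row is omitted because it performs no comparisons, which is why its average is listed as N/A.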
Table 3
Spectral Angle   cSAM Distance
1.333            1.063
1.656            0.695
2.651            1.901
2.670            2.032
2.760            2.094
3.041            2.046
3.369            3.017
4.159            3.642
4.561            4.303
4.643            4.810
4.776            4.329
5.550            5.080
7.066            5.637
8.133            7.648
8.356            7.972
8.855            7.852
24.929           8.786
Cluster   10-nm     1-µm      100-µm    Signatures
0         90.443    90.380    16.014    61
1         89.401    123.94    42.293    62
2         90.636    93.475    38.517    57
3         89.874    124.72    33.820    70
4         90.451    80.886    7.3951    59
5         89.293    119.21    48.691    44
6         90.587    116.74    22.568    40
7         90.633    114.72    29.288    68
8         90.104    126.14    20.840    89
9         90.366    118.52    37.312    43
10        90.575    106.13    27.278    53
11        90.630    107.24    37.788    72
12        90.175    131.17    24.880    122
13        90.522    86.036    9.4432    49
14        90.289    125.50    27.491    61
15        90.282    119.15    9.6618    89
16        90.859    75.508    58.144    13
17        90.380    117.53    14.352    72
18        89.745    140.77    30.669    48
19        90.624    97.332    24.690    64
20        90.476    74.903    7.8526    77
21        90.355    68.062    15.130    48
22        89.594    122.00    1.4625    39
23        89.747    134.34    29.652    78
24        90.483    57.068    30.304    47
25        89.429    130.32    36.741    56
26        90.591    78.987    2.5388    120
27        90.121    124.53    13.485    38
28        90.413    76.509    24.125    18
29        90.546    105.788   19.860    43
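Classifying an unknown signature against cluster centroids of this kind reduces to a nearest-centroid search in score space. The sketch below shows that step only; the centroid coordinates and the unknown's scores are illustrative values chosen for the example, and the function name `classify` is hypothetical.

```python
import math

def classify(unknown_scores, centroids):
    """Return the index of the centroid nearest to the unknown's score vector."""
    return min(
        range(len(centroids)),
        key=lambda j: math.dist(unknown_scores, centroids[j]),
    )

# Illustrative centroids: (10-nm score, 1-um score, 100-um score).
centroids = [
    (90.4, 123.9, 42.3),
    (89.4, 93.5, 38.5),
    (90.6, 124.7, 33.8),
]
cluster = classify((90.0, 100.0, 30.0), centroids)
print(cluster)
```

With 30 centroids this requires only 30 distance evaluations per unknown signature, regardless of the number of signatures in the library.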
The importance of careful selection of characteristic functions was clearly illustrated by the 10-nm cosine function's
inability to discriminate amongst the library spectra. Intuitively, a 10-nm-scale curvature is negligible when compared
against spectra on the micron scale; therefore, the range of
spectral resolutions of the library spectra is a significant
factor in the efficacy of the functions.
ACKNOWLEDGMENT
The authors would like to thank the Integrated Signatures
Program for their support, Thomas Spisz (JHU/APL) for
information on the Spectral Angle Mapper algorithm, and
Edward Birrane (JHU/APL) and Jason Oxenrider (JHU/APL)
for editing and review.
REFERENCES
[1] J. Li, D. B. Hibbert, S. Fuller, and G. Vaughn, "A comparative study of point-to-point algorithms for matching spectra," Chemometrics and Intelligent Laboratory Systems, vol. 82, no. 1-2, pp. 50–58, May 2006.
[2] A. M. Baldridge, S. J. Hook, C. I. Grove, and G. Rivera, "The ASTER spectral library version 2.0," Remote Sensing of Environment, vol. 113, pp. 711–715, 2009.
[3] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. John Wiley and Sons, 2001.
[4] Y. Sohn and N. S. Rebello, "Supervised and unsupervised spectral angle classifiers," Photogrammetric Engineering and Remote Sensing, vol. 68, no. 12, pp. 1271–1280, December 2002.
[5] F. A. Kruse, J. W. Boardman, and J. F. Huntington, "Comparison of airborne hyperspectral data and EO-1 Hyperion for mineral mapping," IEEE Transactions on Geoscience and Remote Sensing, vol. 41, no. 6, pp. 1388–1400, June 2003.
[6] C. Salvaggio, L. E. Smith, and E. J. Antoine, "Spectral signature databases and their application/misapplication to modeling and exploitation of multispectral/hyperspectral data," in Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XI, S. S. Shen and P. E. Lewis, Eds., vol. 5806. SPIE, 2005.
[7] K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd ed., W. Rheinboldt, Ed. New York: Academic Press, October 1990.
[8] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise," in Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining, E. Simoudis, J. Han, and U. Fayyad, Eds., American Association for Artificial Intelligence. Menlo Park, California: The AAAI Press, 1996, pp. 226–231.
[9] P. E. Dennison, K. Q. Halligan, and D. A. Roberts, "A comparison of error metrics and constraints for multiple endmember spectral mixture analysis and spectral angle mapper," Remote Sensing of Environment, vol. 93, no. 3, pp. 359–367, November 2004.
[10] K. V. Mardia, J. T. Kent, and J. M. Bibby, Multivariate Analysis. London: Academic Press, 1979, pp. 360–384.
[11] F. Robinson, A. Apon, D. Brewer, L. Dowdy, D. Hoffman, and B. Lu, "Initial starting point analysis for k-means clustering: a case study," in Proceedings of ALAR 2006 Conference on Applied Research in Information Technology, 2006.
[12] (2008, December) ASTER spectral library. [Online]. Available: http://speclib.jpl.nasa.gov/
BIOGRAPHY
Nigel Tzeng received a B.S. in Computer Science and an M.S. in Software Engineering from the University of Maryland College Park. Mr. Tzeng has
over 20 years of experience in spacecraft
ground systems, command and control (C2) systems, data visualization
and software engineering. He joined
the Johns Hopkins University Applied
Physics Laboratory (JHUAPL) in 2003
and is currently a senior member of the Space Department
technical staff. Mr. Tzeng leads the development of signature
and geospatial analysis/exploitation software systems and
served as the Group Chief Scientist for the C2 Systems Engineering Group from 2007-2009 and has been the Principal
Investigator of several C2 research initiatives. His primary
areas of research are command and control, geospatial visualization and collaboration. Prior to joining JHUAPL, Mr.
Tzeng worked in telecommunications, e-commerce, advanced
traffic management systems, spacecraft simulation (Landsat,
SOHO), spacecraft command and control (SAMPEX, TRMM,
FUSE), and science data processing/visualization (COBE).
He was the lead software architect and designer of the City of
Louisville Advanced Traffic Management System and developer of the DIRBE, FIRAS and DMR sky map visualization
software on COBE.
Alexer Firpi received a B.S. in electrical
engineering from Polytechnic University
(San Juan, Puerto Rico), an M.S. in
electrical engineering from the University of Puerto Rico (Mayaguez, Puerto
Rico), and a Ph.D. in electrical engineering from Michigan State University
(East Lansing, MI). After concluding his
doctoral studies, Dr. Firpi did postdoctoral work at different institutions in
diverse research areas such as intelligent control, biomedical
engineering, imaging genetics, and bioinformatics. He is
currently a senior staff member at Johns Hopkins University
- Applied Physics Lab. Dr. Firpi's research focuses on
machine learning, brain-computer interfaces, computational
intelligence, and any other research problem that can be
automated using machine-learning approaches. He is the
author of more than 20 peer-reviewed publications and two
book chapters.
Benjamin Rodriguez received a Bachelor of Science (B.S.) and Master of
Science (M.S.) in Electrical Engineering from the University of Texas, and
received a Doctor of Philosophy (Ph.D.)
in Electrical and Computer Engineering
from the Air Force Institute of Technology, Graduate School of Engineering
and Management, Electrical and Computer Engineering Department, Wright-Patterson Air Force Base, OH. He is the Section Supervisor
for Space Systems and Architectures in the Space Department with The Johns Hopkins University Applied Physics
Laboratory. He is also an instructor at The Johns Hopkins
University, Whiting School of Engineering for the Department of Electrical and Computer Engineering as well as the
Department of Computer Science.