
398 IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 34, NO. 2, MARCH 1996

Remote Sensing of Forest Change Using Artificial Neural Networks
Sucharita Gopal and Curtis Woodcock, Associate Member, IEEE

Abstract: A prolonged drought in the Lake Tahoe Basin in California has resulted in extensive conifer mortality. This phenomenon can be analyzed using (multitemporal) remote sensing data. Prior research in the same region used more traditional methods of change detection [8], [30]. This paper introduces a third approach to change detection in remote sensing based on artificial neural networks. The neural network architecture used is a Multilayer Feedforward Network. The results of the study indicate that the artificial neural network (ANN) estimates conifer mortality more accurately than the other approaches. Further, an analysis of its architecture reveals that it uses identifiable scene characteristics, the same as those used by a Gramm-Schmidt transformation. ANN models offer a viable alternative for change detection in remote sensing.

Manuscript received December 16, 1994; revised July 20, 1995. This work was supported by the National Science Foundation under Grant SBR-9300633.
The authors are with the Department of Geography, Boston University, Boston, MA 02215 USA.
Publisher Item Identifier S 0196-2892(96)01005-4.
0196-2892/96$05.00 © 1996 IEEE

I. INTRODUCTION

NEURAL networks hold the potential for improving a variety of tasks in remote sensing and image processing. They represent a fundamentally different approach to problems like pattern recognition, as they do not rely on statistical relationships. Instead, neural networks adaptively estimate continuous functions from data without specifying mathematically how outputs depend on inputs (i.e., adaptive model-free function estimation using a nonalgorithmic strategy). To date, most of the efforts to use neural networks in remote sensing have involved image classification, with considerable success (e.g., [2], [12], [14], [18], [22]). Neural network classifiers outperform conventional classifiers mainly due to their lack of assumptions about normality in datasets, their considerable ease in using multidomain datasets, and, perhaps, their ability to capture some of the inherent nonlinearity in such data. The purpose of this paper is to test the utility of neural networks in change detection. As far as we know, this is the first paper to use neural networks for change detection in remote sensing. In particular, neural networks are used to estimate the degree of conifer mortality in the Lake Tahoe Basin. This area has been studied in the past [8], [30], and this project uses their datasets.

Change detection studies in remote sensing involve the use of multitemporal datasets, i.e., sequential images taken of the same area. Techniques to analyze the location, nature, and magnitude of changes serve two distinct purposes [37]: comparative analysis of independently produced classifications, and simultaneous analysis of multitemporal data. Singh [37] provides an excellent review and comparison of various change detection methods such as image differencing, principal component analysis, and multitemporal regression. Since that time, much additional literature has been published on the logic of change detection. In the area of timber inventory and forest management, changes in forest cover caused by insect defoliation [31], [32], [39], [41] and by land use and/or other human-induced factors (e.g., pollution stress) [7], [9], [38] are significant.

The perspective on change detection in this paper is somewhat different from the norm in remote sensing, which is to detect changes in land use or land cover. In these studies the nature of the change is categorical, or between different land-cover classes [37]. In this study, the intent is to measure the magnitude of change [25], [26], which in this case corresponds to the number of trees in a forest that have died.

II. BACKGROUND: THE MAPPING PROJECT AND PRIOR APPROACHES TO DETECTION OF CONIFER MORTALITY

For the past several years, the Boston University Center for Remote Sensing and the U.S. Forest Service (USFS) Region 5 Remote Sensing Group have been involved in developing a set of new methods for mapping and inventorying forest vegetation using remote sensing and geographical information systems (GIS). An important innovation in this mapping project has been the use of an image segmentation procedure [42] to define the map units early in the mapping process, enabling subsequent analysis to be conducted on stands rather than on a per-pixel basis. The image segmentation algorithm uses raw TM bands and a texture channel and produces regions that are the polygons used in the final map. The other primary innovation is the use of a forest canopy reflectance model to map forest stand structure. An overview of the mapping procedure and its results is given in Woodcock et al. [43].

During the period of this project, the region experienced a prolonged drought that has resulted in considerable mortality of conifer trees. Two prior approaches for detecting conifer mortality in the Lake Tahoe Basin have been tested. Macomber and Woodcock [30] used a method that measures the decrease in crown cover between the two dates of the images. Estimates of crown cover are obtained from the Li-Strahler model [29] for each conifer stand for two dates of imagery. Three levels of change in crown cover were combined with three crown cover classes to stratify the forest areas. Field samples were collected for each stratum and total conifer mortality was estimated at 15% of the total timber volume between 1988 and 1992. The patterns in the means for the strata followed the anticipated patterns with respect to mortality, indicating the viability of
GOPAL AND WOODCOCK: REMOTE SENSING OF FOREST CHANGE USING ARTIFICIAL NEURAL NETWORKS 399

the approach. However, at the level of individual stands, the relationship between field measurements and estimated mortality was less reliable ($R^2 \le 0.4$).

Collins and Woodcock [8] proposed a new change detection technique based on the Gramm-Schmidt orthogonalization process, by which an n-band image may be decomposed into n orthogonal indices, each with the potential of measuring scene characteristics. The analysis of change in Lake Tahoe used three stable components corresponding to multidate brightness, greenness, and wetness, and one change component. Regressions between these components and field measurements indicated improved ability to detect mortality compared with the methods of [30]. Depending on which data were used to train and test the regression model, adjusted $R^2$ values ranged from approximately 0.5 to 0.7.

III. METHODS AND RESULTS

This paper pursues a third approach to change detection based on artificial neural networks. Artificial neural networks (ANNs) are large networks of extremely simple computational units, massively interconnected and running in parallel. Formally, an ANN may be viewed as a dynamic system with the topology of a directed graph which can execute information processing by means of its state response to continuous or episodic input [17]. The nodes of the graph are called processing elements (PEs), and the directed links (unidirectional signal channels) are termed connections. The PEs communicate with one another by signals that are numerical rather than symbolic. ANNs are designed to perform a task by specifying the architecture: the number of processing elements, the network topology (i.e., the interconnections of the PEs), and the weight or strength of each connection via learning rules. ANNs have proven well suited to problems of pattern recognition and classification, nonlinear feature detection, prediction, and function approximation [4], [16]-[19], [23], [24]. The essence of learning in ANNs is to find a suitable set of parameters that approximate an unknown input-output relation. This problem in the present context is solved using a supervised learning algorithm, which requires a training set (i.e., a set of input-output examples). Learning the training set may be posed as a search in the network parameter space by introducing an additive error function of statistically independent examples which measures the quality of the network's approximation to the input-output relation on the restricted domain covered by the training set. The minimization of this error over the network's parameter space is called the training process. The ultimate goal, however, is to minimize the error for all possible examples related through the input-output relation, namely, to generalize outside of the training set [28]. Hornik, Stinchcombe, and White [20] and Cybenko [11] have shown that these networks can approximate any continuous input-output relation of interest to any degree of accuracy, provided sufficient hidden units are available.

The architecture of the two-layer feedforward network considered in the present research is fixed and determined by the number of units per array and by their connectivity. The network receives a vector input signal $x$ and emits an output signal. The system has no feedback connections (see Fig. 1). The input units $i$ ($i = 1, \ldots, I$) send signals $x_i$ toward intermediate (hidden) units over connections that either attenuate or amplify signals by a factor $w_{ij}$. Each hidden unit $j$ ($j = 1, \ldots, J$) sees signals $x_i w_{ij}$ ($i = 1, \ldots, I$) and processes them in some characteristic way. In the present research, a simple function is used: the hidden unit sums what it receives and produces an activation $y_j$ that is sent to the output units $k$ ($k = 1, \ldots, K$). The input to the hidden unit determines the state $y_j$ of the element through the transfer function $f$. A logistic function is used that scales the activation sigmoidally between 0 and 1. The bias unit (shown in Fig. 1) outputs a fixed unit value and can be viewed as a constant term in the equation defining the learning process carried out by the network.

The network was trained using the backpropagation procedure that iteratively adjusts the coupling strengths in the network to minimize the error between the desired pattern and the predicted pattern. Since convergence tends to be extremely slow, the momentum variant is added to the weight update equation to achieve a faster rate of convergence [21]. At each step $n$, weight parameter $w(n)$ of the network is updated according to the following equation:

$$w(n+1) = w(n) - \eta \, \frac{\partial e}{\partial w} + \gamma \, [\, w(n) - w(n-1) \,] \qquad (1)$$

where $\eta$ and $\gamma$ are the learning and momentum rates, and $e$ is the error signal between the target vector and the output vector. Hence $\partial e / \partial w$ denotes the partial derivative of the error with respect to $w(n)$. For a complete derivation of this equation see standard references on backpropagation networks [21], [36], [40]. The statistical approach to learning in such networks is described in Gopal and Fischer [15].

Data representation is a critical component in ANN modeling since it has a strong effect on what can be learned and how long it takes to learn. In the present research context, input information consists of TM data and output consists of the change in basal area between 1988 and 1991. There are two methods of representing the input vector: a 10-input vector of 10 TM bands (5 for 1988 and 5 for 1991) or a 5-input vector of the differences in the five TM bands between 1988 and 1991. Either absolute or relative change could be represented in the output vector.

A decision also had to be made whether to use individual pixel information or stand (map unit) information. TM data (input) were available for each pixel, while change in basal area (output) was available only for stands. From the viewpoint of training the neural network, pixel-level information provides more data for training, while stand-level information may provide too few training examples. But to use pixel-level input data, the stand-level output representation has to be disaggregated to the pixel level. Data on change in basal area (between 1988 and 1991) were collected during two field seasons from 26 and 61 stands, respectively. We used the set of 61 stands to train the network and the 26 stands to test, or estimate, the generalization capability of the network.

Several multilayer feedforward networks were constructed and simulations were systematically made with different input

[Figure 1: network diagram labeling the input units, the hidden layer of 15 hidden units, the bias unit, and the single output unit.]

Fig. 1. Neural network architecture used in change detection (10:15:1).
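The 10:15:1 architecture of Fig. 1 and the momentum variant of backpropagation described in the text can be illustrated with a minimal sketch. It uses only the Python standard library and takes from the paper the logistic transfer function, the uniform initial weights in [-0.1, 0.1], an epoch of 5 patterns, and the rates 0.3 and 0.6. Everything else is an assumption made for illustration: the class name `FeedforwardNet`, the squared-error function, the logistic output unit, and the synthetic data, none of which reproduce the authors' software or field data.

```python
import math
import random


def logistic(a):
    """Logistic transfer function; squashes any activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


class FeedforwardNet:
    """Hypothetical 10:15:1 network: inputs -> logistic hidden units -> logistic output."""

    def __init__(self, n_in=10, n_hidden=15, n_out=1, seed=0):
        rng = random.Random(seed)
        # One weight row per receiving unit; the last entry is the bias weight.
        # Initial weights are uniform in [-0.1, 0.1], as stated in the text.
        self.w_hid = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
                      for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]
        # Previous weight changes, kept for the momentum term.
        self.dw_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw_out = [[0.0] * (n_hidden + 1) for _ in range(n_out)]

    def forward(self, x):
        """Return (hidden activations, outputs) for one input vector."""
        xb = list(x) + [1.0]  # the bias unit emits a fixed value of 1
        hidden = [logistic(sum(w * v for w, v in zip(row, xb))) for row in self.w_hid]
        hb = hidden + [1.0]
        out = [logistic(sum(w * v for w, v in zip(row, hb))) for row in self.w_out]
        return hidden, out

    def train_epoch(self, batch, eta=0.3, gamma=0.6):
        """Sum squared-error gradients over one epoch, then apply one momentum update."""
        g_hid = [[0.0] * len(row) for row in self.w_hid]
        g_out = [[0.0] * len(row) for row in self.w_out]
        for x, target in batch:
            hidden, out = self.forward(x)
            xb, hb = list(x) + [1.0], hidden + [1.0]
            # Error signals for e = 0.5 * sum((out - target)^2) with logistic units.
            d_out = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
            d_hid = [h * (1.0 - h) * sum(d * self.w_out[k][j] for k, d in enumerate(d_out))
                     for j, h in enumerate(hidden)]
            for k, d in enumerate(d_out):
                for j, v in enumerate(hb):
                    g_out[k][j] += d * v
            for j, d in enumerate(d_hid):
                for i, v in enumerate(xb):
                    g_hid[j][i] += d * v
        layers = ((self.w_out, self.dw_out, g_out), (self.w_hid, self.dw_hid, g_hid))
        for w, dw, g in layers:
            for r in range(len(w)):
                for c in range(len(w[r])):
                    # Momentum update: new change = -eta * gradient + gamma * old change.
                    dw[r][c] = -eta * g[r][c] + gamma * dw[r][c]
                    w[r][c] += dw[r][c]


def sum_squared_error(net, batch):
    """Total squared error of the network outputs over a batch."""
    return sum((o - t) ** 2 for x, target in batch
               for o, t in zip(net.forward(x)[1], target))


# Synthetic epoch of 5 "stands" (made-up values already scaled to [0, 1]).
rng = random.Random(1)
batch = []
for _ in range(5):
    x = [rng.random() for _ in range(10)]
    batch.append((x, [0.2 + 0.6 * x[0]]))

net = FeedforwardNet()
error_before = sum_squared_error(net, batch)
for _ in range(300):
    net.train_epoch(batch)
error_after = sum_squared_error(net, batch)
```

The sketch applies one weight update per epoch of 5 patterns, which is one reading of the training schedule described in the text; a per-pattern update would be an equally plausible variant.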

and output representations in order to find an acceptable architecture in terms of convergence during learning and generalization. In all these simulations, inputs and outputs were normalized to the [0, 1] range. This normalization removes any need to convert the original digital values to a physical variable such as radiance. Also, additive effects associated with sensor differences between dates become irrelevant. We found that the best network performance was obtained using stand-level information with the 10-input vector and the output vector of the total change in basal area between the two dates of observation. There may be many reasons why this data representation produced the best performance results. Using pixel-level information, each pixel in the stand is assumed to have the mean mortality level for the stand. This assumption does not match the physical reality. The mortality within stands is concentrated in patches, with some areas exhibiting little or no mortality. This mismatch at the pixel level adds noise to the neural network and undermines convergence on a solution. The result of this many-to-one mapping is that the neural network may be unable to find an appropriate mapping between the inputs and outputs. Thus, the stand-level representation proved more successful despite the dramatic reduction in the amount of available data. The other important representation question concerns the use of the raw TM data for both dates (10-band input) versus use of only the differences between dates (5-band input). The 10-band input representation proved more useful, as the 5-band input never converged to a reasonable solution. One possible reason is the loss of data concerning overall brightness associated with the use of differences.

Fig. 1 shows the architecture of the network. There are 10 inputs, 15 hidden units, and one output. The initial weights in the network were drawn at random from a uniform distribution between -0.1 and 0.1. During each time step a set of 5 input signals (i.e., epoch size of 5) is presented in random order (stochastic approximation). At the end of each epoch, the network parameters are updated using (1). Simulations were made with different values of the learning rate, $\eta$, and the momentum rate, $\gamma$, in (1). A final choice of values was made that reduced oscillations during learning. These values are 0.3 and 0.6, respectively. Since the backpropagation procedure is sensitive to different starting points, five simulations with different random initializations were conducted. The number of hidden units was varied between 5 and 50. Best performance was produced with 15-20 hidden units. Hence the minimal architecture with 15 hidden units is selected to discuss the results. Five simulations of the 10:15:1 network with different initial conditions were made. The results of the best simulation (with minimum variance) are presented in the next section, but it is interesting to note that all five simulations produced similar results. This similarity is encouraging and is one indication of the stability of the input-output relationship.

There are two phases in neural network modeling: training and testing. During training, input and desired output pairs are presented to the network. After training, the network's generalization performance was estimated on a test dataset that was not part of the training dataset.

The prediction of the 10:15:1 neural network model is shown in the scatterplot in Fig. 2. The actual basal area change is plotted against the predicted basal area change. There are 26 points in the test set. There is a reasonable fit between the two, as most of the points are near the diagonal. The value of adjusted $R^2$ using the neural network prediction on the test dataset was 0.839, compared to the corresponding values obtained using the Gramm-Schmidt approach of between 0.48 and 0.70. Another measure of goodness of fit is the root-mean-

[Figure 2: scatterplot; horizontal axis "Actual Basal Area Change", with ticks from 0 to 150.]


Fig. 2. Network approximation using backpropagation (10:15:1) plot of observed versus predicted basal areas of test stands.
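The two goodness-of-fit measures used to compare predicted and observed basal area change, adjusted $R^2$ and RMSE, can be computed as in the sketch below. The paper does not state which definition of adjusted $R^2$ was used or how many predictors entered the adjustment, so the textbook form and the `n_predictors` parameter are assumptions, and the four data points are invented purely to exercise the functions.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(actual, predicted, n_predictors=1):
    """Textbook adjustment for sample size n and number of predictors p (assumed form)."""
    n = len(actual)
    r2 = r_squared(actual, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def rmse(actual, predicted):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Invented example values, not data from the study.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
fit_r2 = r_squared(actual, predicted)
fit_adj_r2 = adjusted_r_squared(actual, predicted)
fit_rmse = rmse(actual, predicted)
```

For these invented values the residual sum of squares is 0.10 and the total sum of squares is 5.0, so $R^2$ is 0.98, the adjusted value with one predictor is 0.97, and the RMSE is the square root of 0.025.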

square error (RMSE). RMSE using the Gramm-Schmidt approach varied between 7.86 and 9.91. RMSE using the neural network is 6.80.

These results indicate that the ANN model produces better prediction results compared to the benchmark approach.

IV. DISCUSSION

One key question concerns the reason the neural network produces better estimates of conifer mortality than the Gramm-Schmidt technique. This concern is common when using neural networks, as they are often treated as black boxes, meaning that the nature of the signal used from the input variables and its relationship to the output are usually unknown. For applications where a one-time relationship is all that is required, this black-box approach may be sufficient. However, the more common case in pattern recognition is the desire to detect patterns usable many times, or to learn about the underlying relationships between variables. From a remote sensing perspective, neural networks will prove most useful if their internal behavior can be understood. Also, the kinds of uses for neural networks expand if their internal structure is understood.

There appear to be two primary possibilities why neural networks worked better than the Gramm-Schmidt technique. First, the relationship between conifer mortality and the spectral data may be nonlinear. Given the lack of constraints in this regard within neural networks, they would have an obvious advantage in this situation. The other possibility is that the neural networks are using different patterns or a different signal in the spectral data to find mortality. To evaluate this situation, we explored the use of principal components analysis [13].

Recently, several researchers [10], [27], [33]-[35] have explored how neural networks filter for principal components. In the present study, a covariance matrix of the hidden unit activation (weights) of the trained network was formed and the principal components were extracted. Analysis of the eigenvalues and eigenvectors in this matrix is useful for two purposes. First, the eigenvalues provide an indication of the relative importance of the various principal components. The first principal component explains nearly 94% of the variance in the data. The second and third explain an additional 4%. These percentages can be interpreted as the percent of the variances in the output vector (mortality) contained in the various components defined. This formulation for principal components analysis (PCA) is different from conventional applications because the information extracted relates to the dependent variable (mortality in this case). This difference in formulation is possible due to the filtering of information from the input variables related to the output variable by the weights in the neural network. All the other principal components are insignificant and are discarded from further analysis.

The eigenvectors for the first three principal components are shown in Table I. The eigenvectors are direct analogs of the components created by the Gramm-Schmidt technique used by Collins and Woodcock [8] and facilitate comparison of the two methods. The first principal component is very similar to the change component defined using the Gramm-Schmidt procedure (Table II, from [8]). The pattern in the signs for the coefficients matches exactly and the magnitudes of the coefficients are also similar. This result indicates that the primary signal used by both methods is the same. Also, the second eigenvector looks in general like a multidate brightness image, much like the one defined in the Gramm-Schmidt analysis. That this factor might be related to change is interesting, and a result of the conditions in the Lake Tahoe Basin. As described by [8] and [30], there are patterns in the areas where conifer mortality is concentrated. One such pattern is that areas of the densest stocking of trees have the highest mortality. Since brightness is inversely related to forest density, there is a correlation between mortality and scene brightness. Similarly, the third eigenvector has the form of a multidate greenness index, which is also correlated with stand density and hence mortality.
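The principal components step described above can be sketched with the standard library alone. The text does not fully specify which matrix was analyzed, so this sketch assumes one row per hidden unit and one column per input band (the hidden-layer weights), which would give a 10 x 10 covariance matrix for the 10:15:1 network. Power iteration with deflation stands in for whatever eigen-solver the authors used, and the function names and toy matrix are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (one observation per row)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(d)]
    centered = [[r[c] - means[c] for c in range(d)] for r in rows]
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def mat_vec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(m_ab * v_b for m_ab, v_b in zip(row, v)) for row in m]


def leading_components(cov, k=3, iterations=500):
    """Top-k (eigenvalue, eigenvector) pairs by power iteration with deflation."""
    d = len(cov)
    m = [row[:] for row in cov]
    pairs = []
    for _ in range(k):
        v = [float(i + 1) for i in range(d)]  # arbitrary nonzero start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = mat_vec(m, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to extract
                break
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found before looking for the next.
        m = [[m[a][b] - eigenvalue * v[a] * v[b] for b in range(d)] for a in range(d)]
    return pairs


def explained_fraction(cov, eigenvalue):
    """Share of total variance (trace of the covariance matrix) on one component."""
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))


# Toy 4 x 2 matrix standing in for a (hidden units x input bands) weight matrix.
toy_weights = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
cov = covariance_matrix(toy_weights)
top_value, top_vector = leading_components(cov, k=1)[0]
top_share = explained_fraction(cov, top_value)
```

In the toy matrix the second column is exactly twice the first, so all of the variance lies on a single component; with real weights, `explained_fraction` would yield the kind of percentages quoted in the text for the first three components.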

TABLE I

TM Band        Principal Components
                  1        2        3
TM 2 1988     -0.320    0.236    0.146
TM 3 1988     -0.313    0.423    0.273
TM 4 1988      0.295    0.619   -0.611
TM 5 1988     -0.320    0.285    0.206
TM 7 1988     -0.323    0.123    0.251
TM 2 1991      0.320    0.114    0.223
TM 3 1991      0.322    0.124    0.276
TM 4 1991     -0.312    0.317   -0.314
TM 5 1991      0.319    0.269    0.281
TM 7 1991      0.317    0.292    0.350

TABLE II
GRAMM-SCHMIDT TRANSFORMATION VECTORS

TM Band       Brightness  Greenness  Wetness   Change
TM 2 1988       0.095      -0.312     0.153    -0.161
TM 3 1988       0.173      -0.445     0.193    -0.356
TM 4 1988       0.211       0.309     0.609     0.108
TM 5 1988       0.561       0.182    -0.271    -0.568
TM 7 1988       0.300      -0.181    -0.218    -0.148
TM 2 1991       0.088      -0.356     0.182     0.107
TM 3 1991       0.163      -0.529     0.250     0.172
TM 4 1991       0.198       0.278     0.558    -0.161
TM 5 1991       0.580       0.186    -0.140     0.451
TM 7 1991       0.319      -0.161    -0.143     0.471
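The agreement between the first principal component (Table I) and the Gramm-Schmidt change component (Table II) can be checked directly from the tabulated coefficients. The sketch below transcribes the two columns as printed and compares their sign patterns; the cosine similarity is an added summary measure that the paper itself does not report.

```python
import math

# Coefficients in band order TM 2, 3, 4, 5, 7 for 1988, then the same bands for 1991.
pc1 = [-0.320, -0.313, 0.295, -0.320, -0.323, 0.320, 0.322, -0.312, 0.319, 0.317]
change = [-0.161, -0.356, 0.108, -0.568, -0.148, 0.107, 0.172, -0.161, 0.451, 0.471]


def cosine_similarity(u, v):
    """Cosine of the angle between two coefficient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# True when every coefficient pair has the same sign.
signs_match = all((a > 0) == (b > 0) for a, b in zip(pc1, change))
similarity = cosine_similarity(pc1, change)
```

With these values the signs agree for all ten bands, as stated in the text. The magnitudes differ more than the signs do: the principal component weights every band almost equally, while the change component weights TM 5 and TM 7 most heavily.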

Given that the same spectral signal appears to be used by both the Gramm-Schmidt technique and the neural network, it seems likely that there are nonlinearities in the relationship between the spectral inputs and the observed mortality patterns which account for the improved results using ANNs.

At this point it is possible to consider some of the strengths and weaknesses of the two methods, particularly as they apply to future use. Obviously the fact that neural networks produced better results is important. However, there are other considerations that influence decisions regarding which method to use. One such consideration is the amount of field data required to train the neural network. The Gramm-Schmidt technique can be calibrated with enough field sites to estimate regression coefficients, or approximately 30 sites. Neural networks need many more field sites for training the network. In this study fewer than one hundred sites (5334 pixels) were used, but that would appear to represent a bare minimum. A serious problem in model estimation is overfitting, which is particularly serious for noisy, limited real-world data. Several strategies exist for handling such problems [1]. We will address this problem in future research. Each field site represents a full day's work for two people, so this difference is not trivial. Another consideration is the nature and quantity of processing required for the two different approaches. The processing for the Gramm-Schmidt technique is direct and relatively easy. On the other hand, error-based neural networks, such as the backpropagation network described in this paper, are much more difficult to train and use. This problem can be overcome by using a more powerful architecture like Adaptive Resonance Theory (ART) [3] or one of its variants, ARTMAP [5], [6], which are characterized by stability, speed, and incremental learning. Our future efforts in change detection will involve the use of fuzzy ARTMAP [6]. We hope to examine a number of interesting issues concerning magnitude and direction of change. In essence, the use of neural networks at this time for this kind of application requires a significantly different type and amount of training on the part of the user than does the Gramm-Schmidt technique.

A third consideration related to the selection of change detection methods concerns the ability to apply the results in other places or times. For example, could the neural network built for this dataset be used in a neighboring area? Similarly, could the same neural network be used at a later date in the same location with new input images? This kind of generalization has not been tested for either the Gramm-Schmidt technique or for neural networks.

Another factor related to the issue of generalization of results concerns feature selection. One future role for neural networks in remote sensing may be to help define the most useful set of input features for particular applications. In situations where many input bands are available and little is known about which bands are most useful, neural networks can be used for data exploration. Following training of a neural network, analysis of the internal structure of the neural network using methods like the principal component analysis of this paper may help select the appropriate input features for future analysis. The generalization of the results would not come in the form of a trained neural network. Rather, the lasting result would be an understanding of the information content of the various input bands with respect to a specific application. In subsequent projects, it may not prove necessary or expedient to collect enough data to train an entire new neural network, but the value of the initial neural network analysis will persist via feature selection.

Some of the interesting issues in the neural network approach to change detection need further research. These include data preprocessing and data representation, generalization capabilities of neural networks outside the domain of training data (temporal and spatial), and development of suitable measures to quantify the neural network signal concerning the direction and magnitude of change. Data preprocessing is important since input data to neural networks is usually normalized to preserve the feature space. In addition, the issue of differencing versus simultaneous presentation of the multitemporal data needs further investigation. Our experience has shown that the latter provides more complete spectral information to the neural networks.

V. CONCLUSION

This paper introduces a new approach to change detection in remote sensing based on artificial neural networks. The neural network architecture used is a Multilayer Feedforward Network. The results of the study suggest that the neural networks produce improved results over more conventional methods of change detection.

An analysis of the internal structure of the trained neural network through PCA allows understanding of the signal being used in the spectral data. The neural network appears to be using the same signal in the spectral data as that found by a Gramm-Schmidt transformation. This suggests that the improvement associated with the use of neural networks is probably due to nonlinearities in the relationship between the spectral data and conifer mortality. This issue needs to be evaluated in other studies.

ACKNOWLEDGMENT

The authors would like to acknowledge the support of the U.S. Forest Service for collection of the data used in this paper. They also thank S. Macomber, J. Collins, and V. Jakabhazy for their assistance in organizing the data.

REFERENCES

[1] C. Bishop, "Improving the generalization properties of radial basis function neural networks," Neural Computation, vol. 3, pp. 930-945, 1991.
[2] J. A. Benediktsson, P. H. Swain, and O. K. Ersoy, "Neural network approaches versus statistical methods in classification of multisource remote sensing data," IEEE Trans. Geosci. Remote Sensing, vol. 28, pp. 540-552, 1990.
[3] G. A. Carpenter and S. Grossberg, "ART2: Self-organization of stable category recognition codes for analogue input patterns," Appl. Opt., vol. 26, pp. 4919-4930, 1987.
[4] G. A. Carpenter and S. Grossberg, Pattern Recognition by Self-organizing Neural Networks. Cambridge, MA: M.I.T. Press, 1991.
[5] G. A. Carpenter, S. Grossberg, and J. H. Reynolds, "ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network," Neural Networks, vol. 4, pp. 565-588, 1991.
[6] G. A. Carpenter, S. Grossberg, and D. N. Rosen, "Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system," Neural Networks, vol. 4, pp. 759-771, 1991.
[7] J. C. Coiner, "Using Landsat to monitor changes in vegetation cover induced by desertification processes," in Symp. Remote Sensing of the Environment, Environ. Res. Inst. Michigan, Ann Arbor, MI, 1980, vol. 3, no. 14, pp. 1341-1347.
[8] J. B. Collins and C. E. Woodcock, "Change detection using the Gramm-Schmidt transformation applied to mapping forest mortality," Remote Sensing Environ., vol. 50, pp. 267-279, 1995.
[9] P. R. Coppin and M. E. Bauer, "Processing of multitemporal Landsat TM imagery to optimize extraction of forest cover change features," IEEE Trans. Geosci. Remote Sensing, vol. 32, no. 4, pp. 918-927, 1994.
[10] G. W. Cottrell and J. Metcalfe, "EMPATH: Face, emotion and gender recognition using Holons," Advances in Neural Information Processing Systems, R. P. Lippmann, J. Moody, and D. Touretzky, Eds., vol. 3. San Mateo, CA: Morgan Kaufmann, 1991, pp. 564-572.
[11] G. Cybenko, "Approximation by superpositions of sigmoidal function," Math. Contr., Signals, Syst., vol. 2, pp. 303-314, 1989.
[12] M. S. Dawson, A. K. Fung, and M. T. Manry, "Sea ice classification using fast learning neural networks," in Proc. IGARSS'92, Houston, TX, 1992, vol. 2, pp. 1070-1071.
[13] W. Dillon and M. Goldstein, Multivariate Analysis: Methods and Applications. New York: Wiley, 1984.
[14] S. Gopal, D. M. Sklarew, and E. Lambin, "Fuzzy-neural networks in multi-temporal classification of landcover change in the Sahel," in Proc. DOSES Workshop on New Tools for Spatial Analysis, Lisbon, Portugal,
[18] G. F. Hepner, T. Logan, N. Ritter, and N. Bryant, "Artificial neural network classification using a minimal training set: Comparison to conventional supervised classification," Photogrammetric Eng. Remote Sensing, vol. 56, pp. 469-473, 1990.
[19] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," in Proc. National Academy of Sciences, 1982, vol. 79, pp. 2554-2558.
[20] K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, pp. 359-366, 1989.
[21] R. A. Jacobs, "Increased rates of convergence through learning rate adaptations," Neural Networks, vol. 1, pp. 295-307, 1988.
[22] A. Kanellopoulos, A. Varfis, G. G. Wilkinson, and J. Megier, "Land-cover discrimination in SPOT HRV imagery using an artificial neural network - A 20-class experiment," Int. J. Remote Sensing, vol. 13, no. 5, pp. 917-924, 1992.
[23] T. Kohonen, Self-organization and Associative Memory, 3rd ed. Berlin; New York: Springer-Verlag, 1989.
[24] B. Kosko, Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence. Englewood Cliffs, NJ: Prentice-Hall, 1992.
[25] E. F. Lambin and A. H. Strahler, "Indicators of land-cover change for change-vector analysis in multitemporal space at coarse spatial scales," Int. J. Remote Sensing, vol. 15, no. 10, pp. 2099-2119, 1994.
[26] E. F. Lambin and A. H. Strahler, "Change-vector analysis in multitemporal space: A tool to detect and categorize land-cover change processes using high temporal-resolution satellite data," Remote Sensing Environ., vol. 48, pp. 231-244, 1994.
[27] T. Leen, "Dynamics of learning in recurrent feature-discovery networks," Advances in Neural Information Processing Systems, R. P. Lippmann, J. Moody, and D. Touretzky, Eds., vol. 3. San Mateo, CA: Morgan Kaufmann, 1991, pp. 70-76.
[28] E. Levin, N. Tishby, and S. A. Solla, "A statistical approach to learning and generalization in layered neural networks," Proc. IEEE, vol. 78, pp. 1568-1575, 1990.
[29] X. Li and A. H. Strahler, "Geometric-optical modeling of a conifer forest canopy," IEEE Trans. Geosci. Remote Sensing, vol. GRS-23, pp. 705-721, 1985.
[30] S. Macomber and C. E. Woodcock, "Mapping and monitoring conifer mortality using remote sensing in the Lake Tahoe Basin," Remote Sensing Environ., vol. 50, pp. 255-266, 1995.
[31] D. M. Muchoney and B. N. Haack, "Change detection for monitoring forest defoliation," Photogrammetric Eng. Remote Sensing, vol. 60, no. 10, pp. 1243-1251, 1994.
[32] R. F. Nelson, "Detecting forest canopy change due to insect activity using Landsat MSS," Photogrammetric Eng. Remote Sensing, vol. 49, pp. 1303-1314, 1983.
[33] E. Oja, "A simplified neuron model as a principal component analyzer," J. Math. Biology, vol. 15, pp. 267-273, 1982.
[34] E. Oja, "Neural networks, principal components, and subspaces," Int. J. Neural Systems, vol. 1, pp. 61-68, 1989.
[35] J. Rubner and K. Schulten, "Development of feature detectors by self-organization: A network model," Biological Cybern., vol. 62, pp. 193-199, 1990.
[36] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[37] A. Singh, "Digital change detection techniques using remotely-sensed data," Int. J. Remote Sensing, vol. 10, pp. 989-1003, 1989.
[38] J. E. Vogelmann, "Use of thematic mapper data for the detection of forest damage caused by the pear thrips," Remote Sensing Environ., vol. 30, pp. 217-225, 1988.
[39] J. E. Vogelmann and B. N. Rock, "Assessing forest damage in high-elevation coniferous forests in Vermont and New Hampshire using thematic mapper data," Remote Sensing Environ., vol. 24, pp. 227-246, 1988.
[40] H. White, "Learning in artificial neural networks: A statistical perspective," Neural Computation, vol. 1, pp. 425-464, 1989.
[41] D. L. Williams, R. F. Nelson, and C. L. Dottavio, "A georeferenced LANDSAT digital database for forest insect-damage assessment," Int.
DOSES, EUROSTAT, ECSC-EC-EAEC, Brussels, Luxembourg, 1994, J. Remote Sensing, vol. 6, no. 5, pp. 643-656, 1989.
pp. 55-68. [42] C . Woodcock and V. Harward, Nested hierarchical scene models and
S . Gopal and M. Fischer, Learning in single hidden layer feedforward image segmentation, Int. J. Remote Sensing, vol. 13, no. 16, pp.
neural network models, Geographical Analysis, 1996, in press. 3167-3189, 1992.
S . Grossberg, Nonlinear neural networks: Principles, mechanisms and [41] C. E. Woodcock, J. Collins, S . Gopal, V. Jakabhazy, X. Li, S . Macomber,
architectures, Neural Networks, vol. 1, pp. 1 7 4 1 , 1988. S . Ryherd, Y. Wu, V. J. Harward, J. Levitan, and R. Warbington,
R. Hecht-Nielsen, Neurocomputing. Reading, MA: Addison-Wesley, Mapping forest vegetation using Landsat TM imagery and a canopy re-
1990. flectance model, Remote Sensing Environ., vol. 50, pp. 240-254, 1994.

Sucharita Gopal received the Ph.D. degree from the Department of Geography, University of California, Santa Barbara, in 1989.
Since then, she has carried out research in the areas of spatial cognition, fuzzy sets, spatial accuracy, and geographical information systems. Over the last few years, she has conducted research in the applications of neural networks to land-cover classification, change detection, and modeling in remote sensing. She is currently an Associate Professor at Boston University, Boston, MA.

Curtis Woodcock (A'90) received the B.A., M.A., and Ph.D. degrees from the Department of Geography, University of California, Santa Barbara.
Since 1984, he has taught at Boston University, Boston, MA, where he is currently Associate Professor and Chair of Geography and a Researcher in the Center for Remote Sensing. His primary current research interests in remote sensing include mapping of forest structure and change, spatial modeling of images, inversion of canopy reflectance models, detection of environmental change, and issues of map accuracy.
