Aluizio F. R. Araújo
Universidade de São Paulo
Departamento de Engenharia Elétrica
C. P. 359, 13560-970, São Carlos, SP, BRASIL
{gbarreto, aluizioa}@sel.eesc.sc.usp.br
Abstract. A self-organizing neural network for learning and recall of complex temporal sequences is proposed. We
consider a single sequence with repeated items, or several sequences with a common state. Both cases give rise to
ambiguities during recall, which are resolved through context input units. Competitive weights
encode the spatial features of the input sequence, while the temporal order is learned by lateral weights through a time-delayed Hebbian learning rule. Repeated or shared items are stored as a single copy, resulting in efficient memory
use. In addition, redundancy in item representation improves the network's robustness to noise and faults. The model
operates by recalling the next state of the learned sequences and is able to resolve potential ambiguities. The model is
simulated with binary and analog sequences, and its functioning is compared to that of other neural network models.
Keywords: Self-organization, context-based learning, spatiotemporal sequences, Hebbian learning, robotics.
1. INTRODUCTION
It is well known that natural data are usually dynamic or, equivalently, sequential in the sense that temporal
dependencies between consecutive patterns should be taken into account (Kohonen, 1997). Recognizing such
dynamic patterns is crucial in hearing and vision, and reproducing them underlies processes like motor pattern
production, speech and singing. Aware of this, a long-standing problem in neural network learning theory has been
the mathematical and computational characterization of the so-called list learning or temporal sequence learning
(Grossberg, 1969a, 1969b; Amari, 1972).
Two mechanisms are essential for these models to learn and recall a temporal sequence efficiently. (i) For the
purpose of learning, a mechanism must be implemented to extract and store the temporal relationships between patterns.
Such a mechanism is usually called short-term memory (STM). In the STM, individual items are stored in such a
way that the pattern of STM activity encodes both the items that have occurred and the temporal order in which they
have occurred. In the cognitive literature, such a mechanism is often said to store both item information and order
information (Bradski et al., 1992). (ii) For the purpose of recall, the network dynamics must also include a mechanism
for reading out items in the stored temporal order.
The basic idea involved in learning the temporal order of sequential patterns is known as the temporal chaining
hypothesis: A sequence is seen as a chain of temporally linked items in which the set of associations between
consecutive components (state transitions) must be learned for total or partial reproduction of the memorized
sequence. Most of the artificial neural network (ANN) models that implement this hypothesis are based either on
multilayer perceptrons trained with temporal versions of gradient-based learning algorithms or on the Hopfield
model (see Mozer, 1993; Wang, 1995; Herz, 1995, and references therein). Nevertheless, it is important to emphasize
that self-organization plays a major role in temporal sequence learning, and the field of robot learning in particular has
gained relevant contributions. The vast majority of these models are concerned either with solving inverse kinematics for
visuomotor coordination (Kuperstein & Rubistein, 1989; Martinetz et al., 1990; Gaudiano & Grossberg, 1991;
Walter & Schulten, 1993) or with route learning (Denham & McCabe, 1995; Gaudiano et al., 1996; Heikkonen &
Koikkalainen, 1997).
A robot task that has received contributions from the field of unsupervised neural networks is trajectory
tracking (Hyötyniemi, 1990; Althöfer & Bugmann, 1995; Bugmann et al., 1998; Barreto & Araújo, 1999a, b). This
task aims at converting a description of a desired motion into a trajectory defined as a time sequence of intermediate
configurations between an origin and a destination. The desired motion is that of an industrial robot arm consisting
of joints driven by individual actuators. The robot is required to follow a prescribed path, so that its controllers must
coordinate the movements of the individual joints of the robot to achieve a desired overall movement from point to
point along the path.

Proc. IEEE-INNS-ENNS Intl. Joint Conf. on Neural Networks, July 24-27, 2000, Como, Italy, Vol. 3, pp. 207-212.
In the models by Hyötyniemi (1990), Althöfer & Bugmann (1995), and Bugmann et al. (1998), the state
transitions are hard-wired. Two layers of connections store exactly the same components, and the temporal links are
established by the network designer, since the trajectory is known beforehand. In addition, for sequences with
repeated or shared states, the first two models are unable to reproduce the stored trajectories correctly. The third one
can recall a single trajectory with repeated states but is unable to deal with multiple trajectories with shared states.
Recently, Barreto & Araújo (1999a, b) have proposed a different approach in which the temporal order of items is
learned without supervision, through a temporal version of Hebb's rule (Hebb, 1949), as the input sequence is being
read. The first model (Barreto & Araújo, 1999a) can handle trajectories that share states with others; the shared
states are stored at different neurons. The second model (Barreto & Araújo, 1999b) also handles shared states, but this
time shared states are stored as a single copy in a single neuron. This work extends the second model by allowing it
to handle closed sequences, such as figure eight trajectories, and mixed sequences (part open, part closed), while storing
repeated or shared states as a single copy.
In this paper, we extend our previous self-organizing network (Barreto & Araújo, 1999b) in order to handle
closed sequences with repeated or shared items and to use less memory, through the use of a single neuron
for every repeated item. We consider binary and analog sequences in the form of closed curves, such as circles
and figure eights, as well as open ones, such as straight lines. For the binary case, a sequence of letters is considered,
and for the analog case we evaluate the network on the problem of trajectory tracking. The rest of the paper is
organized as follows. In Section 2, we present the model. In Section 3, we evaluate the performance of the model
through computer simulations and discuss the main results. We conclude the paper in Section 4.
[Figure: the network architecture. Feedforward weights W connect the sensory inputs (trajectory states) and the global and local context units to the output neurons; feedback weights M provide lateral connections through time delays z^-1.]
This type of context is formed by the sequence items (past history) that precede the current state. As will be
shown in the simulations, the inclusion of time-varying context units allows the network to encode both closed and
open sequences, increasing the model's applicability.
The synaptic weights consist of feedforward weights and feedback weights. The interlayer and intralayer
weights are updated by competitive and Hebbian learning rules, respectively. The feedforward weights connect each
input unit to each output neuron; they encode the spatial configuration of a sequence item at a specific time step.
The intralayer coupling structure encodes the temporal order of the patterns in a sequence: each neuron diffuses its activation
through a non-zero lateral connection in order to trigger its successor in the sequence currently being recalled.
The two groups of synaptic weights are updated during a single pass over an entire sequence, in which each state
is read once. This means that a sequence with Nc components requires Nc training steps. The input state is compared
with each feedforward weight vector through the Euclidean distance, and the weight vector closest to the input vector is
selected to be updated. We define a sensor distance D_j^s(t) ∈ R, a global context distance D_j^gc(t) ∈ R, and a local
context distance D_j^lc(t) ∈ R, as follows:

D_j^s(t) = || s(t) - w_j^s(t) ||,   D_j^gc(t) = || gc(t) - w_j^gc(t) ||   and   D_j^lc(t) = || lc(t) - w_j^lc(t) ||   (1)

where || x ||^2 = x_1^2 + ... + x_{p+q}^2. The distance D_j^s(t) is used to find the winners of the current competition, while D_j^gc(t)
and D_j^lc(t) are used during recall to resolve ambiguities (Eq. (7)).
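For concreteness, the three distances of Eq. (1) can be computed vectorially over all output neurons at once. The following is a minimal sketch with hypothetical sizes (n = 4 neurons, 3-dimensional inputs), not the authors' code:

```python
import numpy as np

def distances(x, W):
    """Euclidean distance between the input vector x and each of the
    weight vectors stored as rows of W (Eq. 1)."""
    return np.linalg.norm(W - x, axis=1)

rng = np.random.default_rng(0)
Ws, Wgc, Wlc = rng.random((4, 3)), rng.random((4, 3)), rng.random((4, 3))
s, gc, lc = rng.random(3), rng.random(3), rng.random(3)

Ds = distances(s, Ws)     # D_j^s(t): used to select the competition winners
Dgc = distances(gc, Wgc)  # D_j^gc(t): global-context distance
Dlc = distances(lc, Wlc)  # D_j^lc(t): local-context distance
print(Ds.shape)  # (4,)
```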
For good tracking performance, every item of a sequence should be memorized and recalled. That is, if the
input sequence has 20 items, all 20 items should appear at the network output. Standard competitive networks
tend to cluster the input patterns and cannot be used for tracking purposes. To overcome this, the network
should penalize a neuron, by excluding it from subsequent competitions, when it tries to encode more than one
pattern. This can be accomplished by defining a function Rj(t), called the responsibility function, which indicates whether a
neuron is already responsible for encoding a given sequence item. If Rj(t) >> 0, neuron j is excluded from
subsequent competitions for sequence components. If Rj(t) = 0, neuron j is allowed to compete.
According to the definition of Rj(t), if a given item occurs again it will be encoded by another neuron, and many
copies of the same item will exist in the network (Barreto & Araújo, 1999a). In order to use memory
resources efficiently, every time an item occurs, it should be encoded by the neuron that encoded it the first time. This can be
accomplished by defining a similarity radius 0 < δ << 1 that accounts for the repeated items in the following way
(Barreto & Araújo, 1999b):

f_j(t) = D_j^s(t),  if D_j^s(t) < δ or R_j(t) = 0
f_j(t) = R_j(t),   otherwise                                    (2)
According to Eq. (2), a neuron can only be used more than once if (1) it was never used before, i.e., Rj(t) = 0, or
(2) its weight vector lies at a distance of less than δ from the current input item. Thus, the output neurons are ranked as
follows:

f_{μ1}(t) < f_{μ2}(t) < ... < f_{μ(n-1)}(t) < f_{μn}(t)         (3)
where μi(t), i = 1, ..., n, is the index of the i-th closest output neuron to s(t). We choose K neurons, μ(t) = (μ1(t),
μ2(t), ..., μK(t)), K ≤ n, as winners of the current competition. They will represent the current input vector v(t). The
corresponding activation values decay linearly from a maximum value a_max ∈ R, for μ1(t), to a minimum a_min ∈ R,
for μK(t), according to the following equation:

a_{μi}(t) = a_max - [(a_max - a_min) / max(1, K - 1)] (i - 1),   i = 1, ..., K         (4)
where a_max and a_min are user-defined. For i > K, a_{μi}(t) = 0. The responsibility function Rj(t) is updated every time a
new activation pattern a(t) = (a1(t), ..., an(t))^T is computed: R_j(t+1) = R_j(t) + γ a_j(t), where γ >> 0 is an exclusion
constant. Following the selection of the winning neurons and the determination of their activations and outputs, the
weight vectors wj(t) are updated according to the following competitive learning rule:

w_j(t+1) = w_j(t) + η a_j(t) [v(t) - w_j(t)]         (5)

where 0 < η ≤ 1 is the learning rate. For t = 0, wj(0), for all j, is initialized with random numbers between 0 and 1.
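One training step, combining Eqs. (2) through (5), can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the sizes, parameter values, and helper names are hypothetical, and the competitive update is taken in its standard form.

```python
import numpy as np

def train_step(Ds, R, W, v, delta, K, a_max, a_min, gamma, eta):
    """One training step: rank neurons by f_j (Eq. 2), pick K winners,
    assign linearly decaying activations (Eq. 4), then update the
    responsibilities and the feedforward weights (Eq. 5)."""
    # Eq. (2): a neuron competes with its sensor distance only if it is
    # still free (R_j = 0) or already encodes this very item (D_j < delta).
    f = np.where((Ds < delta) | (R == 0.0), Ds, R)
    order = np.argsort(f)                      # Eq. (3): mu_1, ..., mu_n
    a = np.zeros_like(Ds)
    for i, j in enumerate(order[:K]):          # Eq. (4): linear decay
        a[j] = a_max - (a_max - a_min) / max(1, K - 1) * i
    R = R + gamma * a                          # responsibility update
    W = W + eta * a[:, None] * (v - W)         # Eq. (5)
    return a, R, W

# Tiny hypothetical example: 3 neurons, 2-D items, neuron 1 already "taken".
a, R, W = train_step(Ds=np.array([0.5, 0.1, 0.3]),
                     R=np.array([0.0, 100.0, 0.0]),
                     W=np.zeros((3, 2)), v=np.array([1.0, 2.0]),
                     delta=1e-4, K=2, a_max=1.0, a_min=0.98,
                     gamma=100.0, eta=1.0)
print(a)  # neuron 2 wins (1.0), neuron 0 is second (0.98); neuron 1 excluded
```

Note how neuron 1, despite being closest to the input, is excluded from the competition because its responsibility value is large and the input is not within its similarity radius.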
The successive winners are linked in the correct temporal order through a lateral coupling structure. The
feedback weights are updated according to the following learning rule (Barreto & Araújo, 1999b):

m_jr(t+1) = m_jr(t) + λ a_j(t) a_r(t-1),   or, in matrix form,   M(t+1) = M(t) + λ a(t) a^T(t-1)         (6)

where 0 < λ < 1 is the feedback learning rate. The activities of the previous competition, a_r(t-1), are made available
through time delays, and the matrix form in Eq. (6) lends itself to a simple matrix analysis of temporal associative
memory (Amari, 1972; Barreto & Araújo, 1999a). Equation (6) is a Hebbian learning rule (Hebb, 1949) that creates
temporal associations between consecutive states of the input trajectory, encoding the temporal order of the input
sequences (see Figure 3). For t = 0, M(0) = 0, which indicates that no temporal associations exist.
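The time-delayed Hebbian update of Eq. (6) is a simple outer product between the current and the delayed activation vectors. A sketch (sizes and values hypothetical):

```python
import numpy as np

def update_lateral(M, a, a_prev, lam):
    """Eq. (6) in matrix form: strengthen the lateral link from the previous
    winner(s) r to the current winner(s) j, M[j, r] += lam * a_j * a_r(t-1)."""
    return M + lam * np.outer(a, a_prev)

# Neuron 1 wins at t-1, neuron 0 wins at t: a link 1 -> 0 is created.
M = update_lateral(np.zeros((3, 3)),
                   a=np.array([1.0, 0.0, 0.0]),
                   a_prev=np.array([0.0, 1.0, 0.0]),
                   lam=0.8)
print(M[0, 1])  # 0.8
```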
Figure 3. A sketch of how consecutive winners are temporally linked through lateral connections. Initially (t=0), the
network has no lateral connections. At t=1, the neuron on the left is the winner for pattern s(1). At t=2, the neuron on the
right is the winner for pattern s(2). Still at t = 2, a lateral connection is created from the neuron on the left to the neuron
on the right through Eq. (6), indicating the temporal order. cg stands for global context and cl for local context.
It is well known that time plays an essential role in Hebbian learning rules, in psychology (Tesauro, 1986;
Montague & Sejnowski, 1994), object recognition (Wallis, 1996), route learning and navigation (Schölkopf &
Mallot, 1995), and blind source separation (Girolami & Fyfe, 1996).
To start the recall of a stored sequence, the network must be supplied with any item belonging to that sequence.
This item then activates the neuron whose weight vector is the most similar (Eq. 4). The active neuron then
triggers its successor through the lateral connections, according to the following output equation:
y_j(t) = [1 - D_j^gc(t) / Σ_{r=1}^{n} D_r^gc(t)] [1 - D_j^lc(t) / Σ_{r=1}^{n} D_r^lc(t)] Σ_{r=1}^{n} g(m_jr(t) a_r(t))         (7)

where g(u) ≥ 0 and dg(u)/du > 0. The weight vector of the neuron with the highest value of y_j(t) is used to represent the
next sequence item. This item is then fed back to the network input, and the process continues until the end of the
stored sequence. For t = 0, the activation and output values are set to a_j(0) = y_j(0) = 0, for all j.
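One recall step under Eq. (7) can be sketched as below. Here g(u) = e^u is our choice for illustration, since it satisfies g(u) ≥ 0 and dg(u)/du > 0 (the text does not fix g); all sizes and values are hypothetical.

```python
import numpy as np

def recall_step(M, a, Dgc, Dlc, g=np.exp):
    """Eq. (7): context-gated readout. The lateral drive sum_r g(m_jr a_r)
    favors successors of the currently active neuron(s); the two bracketed
    factors suppress neurons whose global/local context does not match."""
    drive = g(M * a).sum(axis=1)
    y = (1.0 - Dgc / Dgc.sum()) * (1.0 - Dlc / Dlc.sum()) * drive
    return int(np.argmax(y)), y

# Neuron 0 is active and laterally linked to neuron 1 (M[1, 0] = 0.8);
# context distances are equal, so the lateral drive decides the winner.
M = np.zeros((3, 3)); M[1, 0] = 0.8
winner, y = recall_step(M, a=np.array([1.0, 0.0, 0.0]),
                        Dgc=np.ones(3), Dlc=np.ones(3))
print(winner)  # 1
```

With unequal context distances, the bracketed gating factors would bias the readout toward the sequence matching the supplied context, which is how ambiguities at repeated or shared items are resolved.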
3. SIMULATIONS
The first test evaluates the network on binary sequences of letters with repeated and shared items (see Fig. 4a). Each
letter is represented as a 10×8 matrix, which forms an 80-dimensional input vector. We consider two types of
sequences, namely, closed with a repeated item (figure eight) and open with shared items. The network parameters
were set to: p = 80, q = 80, d = 160, n = 250, K = 2, a_max = 1, a_min = 0.98, δ = 10^-4, γ = 100, η = 1, λ = 0.8. The
sequences are trained one after the other, and the global context is set to any sequence item in the case of the figure
eight sequence and to the letter U in the case of the open sequence. The local context is lc(t) = {v(t-1), v(t-2)}. Note
that item K is a repeated one for the figure eight sequence, and is also an item shared with the open sequence.
Typical results are shown in Fig. 4b for both sequences. This simple example illustrates the need for both global and
local context. In the figure eight case, when the network arrives at item K it passes to the correct next item X
because of the local context, and it does not jump to item Z because of the global context acting as an identifier.
Figure 4. Sequences of binary patterns. (a) The input sequences, (b) typical recall for the two sequences.
Figure 5 shows a typical result when the given starting item is corrupted by noise. In this test, 10% of the
elements of the starting item were chosen at random and had their values changed from 0 to 1 or from 1 to 0.
[Figure 5. Recall from corrupted starting items: (a) the noisy L was given; (b) the noisy E was given.]
Figure 6. Sequences of analog patterns. (a) Typical recall, (b) a fault-tolerance test. In (a) and (b) the actual
trajectory is represented by circles and the recalled one by asterisks. Arrows show the direction of movement.