Recurrent Networks
Lecture 12
2005 Ari Visa
Agenda
Some historical notes
Some theory
Recurrent networks
Training
Conclusions
Some Theory
Four specific network architectures will be presented. They all incorporate a static multilayer perceptron or parts thereof, and they all exploit the nonlinear mapping capability of the multilayer perceptron.
Some Theory
NARX: the nonlinear autoregressive with exogenous inputs network, which applies global feedback from the output layer to the input layer through a tapped-delay-line memory.
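A minimal sketch of one NARX prediction step in Python (the memory order q, the layer sizes, and the function names are illustrative assumptions, not from the lecture): the next output is a nonlinear function of the q most recent outputs and exogenous inputs, and each new output is fed back into a tapped delay line.

    import numpy as np

    rng = np.random.default_rng(0)
    q = 3                                  # assumed memory order
    W1 = rng.standard_normal((8, 2 * q))   # input-to-hidden weights (assumed sizes)
    b1 = np.zeros(8)
    w2 = rng.standard_normal(8)            # hidden-to-output weights

    def narx_step(y_hist, u_hist):
        """One NARX step: y_hist and u_hist hold the q most recent
        outputs and inputs (tapped-delay-line memories)."""
        z = np.concatenate([y_hist, u_hist])   # concatenated regressor
        h = np.tanh(W1 @ z + b1)               # nonlinear hidden layer
        return w2 @ h                          # output neuron

    y_hist, u_hist = np.zeros(q), np.zeros(q)
    for n in range(5):
        u = np.sin(0.1 * n)                    # exogenous input
        y = narx_step(y_hist, u_hist)
        # global feedback: shift the delay lines, append the newest values
        y_hist = np.roll(y_hist, 1); y_hist[0] = y
        u_hist = np.roll(u_hist, 1); u_hist[0] = u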
Some Theory
State-Space Model
The hidden neurons define the state of the network. The output of the hidden layer is fed back to the input layer via a bank of unit delays. The input layer consists of a concatenation of feedback nodes and source nodes. The network is connected to the external environment via the source nodes. The number of unit delays used to feed the output of the hidden layer back to the input layer determines the order of the model.
x(n+1) = f(x(n), u(n))
y(n) = C x(n)
The simple recurrent network (SRN) differs from this model in that the output layer is replaced by a nonlinear one and the bank of unit delays at the output is omitted.
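A minimal sketch of this state-space recurrence in Python, assuming, as an illustration, that f applies tanh to a linear combination of state and input; the sizes and random weights are placeholders.

    import numpy as np

    rng = np.random.default_rng(1)
    q, m, p = 4, 2, 1                       # assumed state/input/output sizes
    Wa = 0.5 * rng.standard_normal((q, q))  # feedback-node weights
    Wb = rng.standard_normal((q, m))        # source-node weights
    C = rng.standard_normal((p, q))         # linear output layer

    def step(x, u):
        x_next = np.tanh(Wa @ x + Wb @ u)   # hidden neurons define the state
        y = C @ x                           # y(n) = C x(n)
        return x_next, y

    x = np.zeros(q)
    for n in range(5):
        x, y = step(x, rng.standard_normal(m))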
Some Theory
Recurrent multilayer perceptron (RMLP)
It has one or more hidden layers. Each computation layer of an RMLP has feedback around it:
xI(n+1) = φI(xI(n), u(n))
xII(n+1) = φII(xII(n), xI(n+1)), ...
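A minimal sketch of one RMLP step with two computation layers, each with feedback around itself; the tanh nonlinearities and layer sizes are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)
    m, q1, q2 = 2, 4, 3                     # assumed input/layer sizes
    A1, B1 = rng.standard_normal((q1, q1)), rng.standard_normal((q1, m))
    A2, B2 = rng.standard_normal((q2, q2)), rng.standard_normal((q2, q1))

    def rmlp_step(x1, x2, u):
        x1_next = np.tanh(A1 @ x1 + B1 @ u)        # layer I: own feedback + input
        x2_next = np.tanh(A2 @ x2 + B2 @ x1_next)  # layer II: own feedback + layer I
        return x1_next, x2_next

    x1, x2 = np.zeros(q1), np.zeros(q2)
    for n in range(5):
        x1, x2 = rmlp_step(x1, x2, rng.standard_normal(m))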
Some Theory
Second-order network
When the induced local field vk is computed using multiplications, we refer to the neuron as a second-order neuron.
A second-order recurrent network:
vk(n) = bk + Σi Σj wkij xi(n) uj(n)
xk(n+1) = φ(vk(n)) = 1 / (1 + exp(−vk(n)))
Note that the product xi(n)uj(n) represents the pair [state, input]. A positive weight wkij represents the presence of the transition {state, input} → {next state}, while a negative weight represents the absence of the transition. The state transition is described by δ(xi, uj) = xk.
Second-order networks are used for representing and learning deterministic finite-state automata (DFA).
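A minimal sketch of one step of a second-order network, with the weights wkij stored as a 3-D array; the sizes and the one-hot encodings of state and input are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    K, M = 3, 2                         # assumed state neurons / input symbols
    W = rng.standard_normal((K, K, M))  # third-order weight tensor wkij
    b = np.zeros(K)                     # biases bk

    def second_order_step(x, u):
        # einsum forms vk = bk + sum_ij wkij * xi * uj for all k at once
        v = b + np.einsum('kij,i,j->k', W, x, u)
        return 1.0 / (1.0 + np.exp(-v))  # logistic activation

    x = np.array([1.0, 0.0, 0.0])        # current state (one-hot, DFA style)
    u = np.array([1.0, 0.0])             # current input symbol (one-hot)
    x = second_order_step(x, u)          # next state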
Some Theory
Computational power of recurrent networks
I. All Turing machines may be simulated by fully connected recurrent networks built on neurons with sigmoid activation functions.
The Turing machine consists of: 1) a control unit, 2) a linear tape, 3) a read-write head.
Some Theory
II. NARX networks with one layer of hidden neurons with bounded, one-sided saturated activation functions and a linear output neuron can simulate fully connected recurrent networks with bounded, one-sided saturated activation functions, except for a linear slowdown.
Bounded, one-sided saturated (BOSS) activation functions:
1) a ≤ φ(x) ≤ b, a ≠ b, for all x ∈ R
2) there exist values s and S such that φ(x) = S for all x ≤ s
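For concreteness, one function that satisfies both conditions is the saturated-linear activation sketched below (an illustrative choice, not named in the lecture): it is bounded, 0 ≤ φ(x) ≤ 1, and one-sided saturated with s = 0 and S = 0.

    import numpy as np

    def phi(x):
        # Saturated-linear BOSS activation: constant 0 for x <= 0 (one-sided
        # saturation), linear in between, constant 1 for x >= 1 (bounded).
        return np.clip(x, 0.0, 1.0)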
Training
Real-time recurrent learning (RTRL)
The network consists of a concatenated input-feedback layer and a processing layer of computational nodes.
e(n) = d(n) − y(n)
Etotal = Σn E(n), where E(n) = ½ ‖e(n)‖²
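A minimal sketch of RTRL in Python for a small fully connected recurrent network, trained online to predict a sine wave one step ahead. The sizes, the tanh activation, and the choice of neuron 0 as the visible output are illustrative assumptions; the array P[k, a, b] = dxk/dwab holds the state sensitivities that RTRL propagates forward in time.

    import numpy as np

    rng = np.random.default_rng(4)
    q, m = 3, 1                               # state neurons / external inputs
    W = 0.1 * rng.standard_normal((q, q + m)) # weights of the processing layer
    P = np.zeros((q, q, q + m))               # P[k, a, b] = dx_k / dw_ab
    eta = 0.05                                # learning rate

    x = np.zeros(q)
    for n in range(200):
        u = np.array([np.sin(0.2 * n)])       # external input u(n)
        d = np.sin(0.2 * (n + 1))             # desired output d(n)
        z = np.concatenate([x, u])            # concatenated input-feedback layer
        x_next = np.tanh(W @ z)

        # Sensitivity recursion: P(n+1) = phi'(v) * (W_state P(n) + dz/dw term)
        dphi = 1.0 - x_next ** 2              # tanh'(v)
        P_next = np.einsum('kl,lab->kab', W[:, :q], P)
        P_next[np.arange(q), np.arange(q), :] += z  # dv_k/dw_kb = z_b
        P_next *= dphi[:, None, None]

        y = x_next[0]                         # neuron 0 is the visible output
        e = d - y                             # e(n) = d(n) - y(n)
        W += eta * e * P_next[0]              # gradient step on E(n)

        x, P = x_next, P_next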
Summary
The subject was recurrent networks that involve the use of global feedback applied to a static (memoryless) multilayer perceptron.
1) Nonlinear autoregressive with exogenous inputs (NARX) network, using feedback from the output layer to the input layer.
2) Fully connected recurrent networks with feedback from the hidden layer to the input layer.
3) Recurrent multilayer perceptron with more than one hidden layer, using feedback from the output of each computation layer to its own input.
4) Second-order recurrent networks using second-order neurons.
5) All these recurrent networks use tapped-delay-line memories as a feedback channel.
6) Methods 1-3 use a state-space framework.