CS405
What are connectionist neural networks?
History traces back to the 50s but became popular in the 80s
with work by Rumelhart, Hinton, and McClelland
Hundreds of variants
Less a model of the actual brain than a useful tool, but still some
debate
Numerous applications
Brain: Parallel, Distributed; Fault Tolerant; Learns: Yes; Intelligent/Conscious: Usually
Computer: Serial, Centralized; Element size: $10^{-9}$ m; Processing speed: $10^{9}$ Hz; Learns: Some; Intelligent/Conscious: Generally No
Biological Inspiration
My brain: It's my second favorite organ.
- Woody Allen, from the movie Sleeper
Idea: To make computers more robust, intelligent, and able to learn,
let's model our computer software (and/or hardware) after the brain
Neurons in the Brain
Brains learn by creating/deleting connections between neurons
Activation Function
A perceptron sums its weighted inputs ($I_1, I_2, I_3$ with weights $W_1, W_2, W_3$) and passes the result through a hard threshold to produce the output $O$:
$O = \begin{cases} 1 & \text{if } \sum_i w_i I_i + \theta > 0 \\ 0 & \text{otherwise} \end{cases}$
Perceptron Example
$I_1 = 2$, $I_2 = 1$, $W_1 = 0.5$, $W_2 = 0.3$, $\theta = -1$:
$2(0.5) + 1(0.3) + (-1) = 0.3 > 0$, so $O = 1$
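As a minimal sketch of the worked example above (the function and argument names are my own, not from the slides):

```python
# Perceptron forward pass: fire (1) if the weighted sum plus theta exceeds 0.

def perceptron_output(inputs, weights, theta):
    total = sum(w * i for w, i in zip(weights, inputs)) + theta
    return 1 if total > 0 else 0

# 2(0.5) + 1(0.3) + (-1) = 0.3 > 0, so the unit outputs 1.
print(perceptron_output([2, 1], [0.5, 0.3], -1))  # 1
```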
Learning Procedure:
Randomly assign weights (between 0-1)
Present inputs from training data
Get output O, nudge weights to push results toward our
desired output T
Repeat; stop when no errors, or enough epochs completed
Perceptron Training
$w_i(t+1) = w_i(t) + \Delta w_i(t)$
$\Delta w_i(t) = (T - O)\,I_i$
Weights include Threshold. T=Desired, O=Actual output.
Example: T=0, O=1, W1=0.5, W2=0.3, I1=2, I2=1, Theta=-1
$w_1(t+1) = 0.5 + (0-1)(2) = -1.5$
$w_2(t+1) = 0.3 + (0-1)(1) = -0.7$
$\theta(t+1) = -1 + (0-1)(1) = -2$
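The update can be checked with a short sketch (names are illustrative; the threshold is treated as an extra weight whose input is the constant 1):

```python
# Perceptron rule: w_i(t+1) = w_i(t) + (T - O) * I_i, applied to the worked
# example (T=0, O=1). Theta is the last weight, with input 1.

def update_weights(weights, inputs, target, output):
    return [w + (target - output) * i for w, i in zip(weights, inputs)]

new_w = update_weights([0.5, 0.3, -1.0], [2, 1, 1], target=0, output=1)
print(new_w)  # [-1.5, -0.7, -2.0]
```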
Example training rows (comma-separated features; the last value is the desired output):
67,1,4,120,229,, 1
37,1,3,130,250, ,0
41,0,2,130,204, ,0
$\text{Distance} = \text{LMS} = \frac{1}{2}\sum_P (T_P - O_P)^2$
E.g. if we have two patterns and
T1=1, O1=0.8, T2=0, O2=0.5 then D = (0.5)[(1-0.8)^2 + (0-0.5)^2] = 0.145
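A quick sketch to verify the arithmetic (the function name is my own):

```python
# LMS error: half the summed squared difference between desired (T) and
# actual (O) outputs across patterns.

def lms_error(targets, outputs):
    return 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))

print(round(lms_error([1, 0], [0.8, 0.5]), 3))  # 0.145
```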
We want to minimize the LMS:
[Figure: error E plotted against a weight W, showing a gradient-descent step from W(old) to W(new); c is the learning rate, and the unit's input is $\sum_i w_i I_i$.]
LMS Gradient Descent
To compute how much to change weight for link k:
Chain rule:
$\frac{\partial\,\text{Error}}{\partial w_k} = \frac{\partial\,\text{Error}}{\partial O_j} \cdot \frac{\partial O_j}{\partial w_k}$
$\frac{\partial\,\text{Error}}{\partial O_j} = \frac{\partial}{\partial O_j}\,\frac{1}{2}\sum_P (T_P - O_P)^2 = \frac{\partial}{\partial O_j}\,\frac{1}{2}(T_j - O_j)^2 = -(T_j - O_j)$
We can remove the sum since we are taking the partial derivative wrt $O_j$.
$\frac{\partial O_j}{\partial w_k} = \frac{\partial}{\partial w_k}\, f\!\left(\sum_k I_k W_k\right) = I_k\, f'\!\left(\sum_k I_k W_k\right)$, since $O_j = f\!\left(\sum_k I_k W_k\right)$ where $f$ is the activation function.
Moving opposite the gradient gives:
$\Delta w_k = c\,(T_j - O_j)\, f'\!\left(\sum_k I_k W_k\right) I_k$
Activation Function
Old: $O = \begin{cases} 1 & \text{if } \sum_i w_i I_i + \theta > 0 \\ 0 & \text{otherwise} \end{cases}$
New: $O = \dfrac{1}{1 + e^{-\sum_i w_i I_i}}$
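A small sketch of the two activations (function names are mine; the derivative identity $f'(x) = f(x)(1 - f(x))$ is the standard one for the sigmoid and is what gradient descent needs):

```python
import math

def step(x):
    """Old activation: hard threshold on the net input."""
    return 1 if x > 0 else 0

def sigmoid(x):
    """New activation: smooth, differentiable squashing function."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_deriv(x):
    """f'(x) = f(x) * (1 - f(x)) -- used by the delta rule."""
    fx = sigmoid(x)
    return fx * (1.0 - fx)

print(step(0.3))          # 1
print(sigmoid(0.0))       # 0.5
print(sigmoid_deriv(0.0)) # 0.25
```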
LMS vs. Limiting Threshold
Perceptron Method
For a network with a hidden layer (inputs $I_1, I_2, I_3$ feed hidden units $H$, which feed output units $O$), each output unit $x$ computes
$O(x) = \dfrac{1}{1 + e^{-\sum_j w_{x,j} H_j}}$, where each hidden unit computes $H(x) = \dfrac{1}{1 + e^{-\sum_i w_{x,i} I_i}}$
Backprop - Learning
Learning Procedure:
Randomly assign weights (between 0-1)
Present inputs from training data, propagate to outputs
Compute outputs O and adjust weights according to the delta
rule, backpropagating the errors; the weights are nudged so
that the network's outputs move closer to the desired
outputs.
Repeat; stop when no errors, or enough epochs completed
Backprop - Modifying Weights
We had computed:
$\Delta w_k = c\,I_k\,(T_j - O_j)\,f'(\text{sum})$, with $f(\text{sum}) = \dfrac{1}{1 + e^{-\text{sum}}}$
so
$\Delta w_k = c\,I_k\,(T_j - O_j)\,f(\text{sum})\,(1 - f(\text{sum}))$
[Network: inputs $I$ feed hidden units $H$ via weights $W_{i,j}$; hidden units feed outputs $O$ via weights $W_{j,k}$.]
For the output unit k, $f(\text{sum}) = O_k$. For the output units, this is:
$\Delta w_{j,k} = c\,H_j\,(T_k - O_k)\,O_k(1 - O_k)$
For the hidden units (skipping some math), this is:
$\Delta w_{i,j} = c\,H_j(1 - H_j)\,I_i \sum_k (T_k - O_k)\,O_k(1 - O_k)\,w_{j,k}$
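The two update formulas can be sketched as one training step for a tiny one-hidden-layer network. The layer sizes, starting weights, and learning rate c below are illustrative assumptions, not values from the slides:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(I, W_ij, W_jk, T, c=0.5):
    """Forward pass, then apply the output- and hidden-layer updates in place."""
    # Forward pass: inputs -> hidden -> outputs
    H = [sigmoid(sum(W_ij[i][j] * I[i] for i in range(len(I))))
         for j in range(len(W_ij[0]))]
    O = [sigmoid(sum(W_jk[j][k] * H[j] for j in range(len(H))))
         for k in range(len(W_jk[0]))]
    # Output error terms: (T_k - O_k) * O_k * (1 - O_k)
    delta = [(T[k] - O[k]) * O[k] * (1 - O[k]) for k in range(len(O))]
    # Hidden update (uses the OLD hidden->output weights):
    # dw_ij = c * H_j * (1 - H_j) * I_i * sum_k delta_k * w_jk
    for i in range(len(I)):
        for j in range(len(H)):
            back = sum(delta[k] * W_jk[j][k] for k in range(len(O)))
            W_ij[i][j] += c * H[j] * (1 - H[j]) * I[i] * back
    # Output update: dw_jk = c * H_j * delta_k
    for j in range(len(H)):
        for k in range(len(O)):
            W_jk[j][k] += c * H[j] * delta[k]
    return O

# Repeated steps should drive the output toward the target T = [1].
W_ij = [[0.1, 0.2], [0.3, 0.4]]
W_jk = [[0.5], [0.6]]
first = backprop_step([1.0, 0.0], W_ij, W_jk, [1.0])[0]
for _ in range(200):
    last = backprop_step([1.0, 0.0], W_ij, W_jk, [1.0])[0]
print(first < last)  # True
```

Note that the hidden-layer update reads the hidden-to-output weights before they are changed, matching the derivation order above.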
Backprop
QwikNet
Hopfield Network Demo
http://www.cbu.edu/~pong/ai/hopfield/hopfieldapplet.html
Properties
Find the winner, i.e. the output unit that most closely
matches the input units, using some distance metric, e.g.
$\sum_{i=1}^{n} (W_{ij} - I_i)^2$
for all output units j=1 to m and input units i=1 to n;
find the j that minimizes this distance. Then move the
winner's weights toward the input:
$W_i^{t+1} = W_i^t + c\,(X_i^t - W_i^t)$
where c is a small positive learning constant
that usually decreases as the learning proceeds
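A minimal sketch of one winner-take-all step (the squared-distance metric, the sizes, and the value of c below are illustrative assumptions):

```python
def find_winner(W, x):
    """Index j of the output unit minimizing sum_i (W[j][i] - x[i])^2."""
    return min(range(len(W)),
               key=lambda j: sum((w - xi) ** 2 for w, xi in zip(W[j], x)))

def update_winner(W, x, winner, c=0.2):
    """W_i(t+1) = W_i(t) + c * (X_i - W_i(t)), for the winning unit only."""
    W[winner] = [w + c * (xi - w) for w, xi in zip(W[winner], x)]

W = [[0.1, 0.9], [0.8, 0.2]]   # weight vectors for two output units
x = [1.0, 0.0]
j = find_winner(W, x)          # unit 1's weights are closer to x
update_winner(W, x, j)
print(j, [round(w, 2) for w in W[j]])  # 1 [0.84, 0.16]
```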
Result of Algorithm
Classification:
Given new input, the class is the output node that is the
winner
Typical Usage: 2D Feature Map
Modified Algorithm
Find the winner, i.e. the output unit that most closely
matches the input units, using some distance metric, e.g.
http://davis.wpi.edu/~matt/courses/soms/applet.html
Kohonen Network Examples
Document Map:
http://websom.hut.fi/websom/milliondemo/html/root.html
Poverty Map
http://www.cis.hut.fi/research/som-research/worldmap.html
SOM for Classification
For a new test case, calculate the winning node and classify
it as the class it is closest to
Psychological
Biological
Variable binding
Recursion
Reflection
Structured representations
Hybrid model
Jeff Hawkins
Book: On Intelligence
Hierarchically-arranged in regions
Creativity?
Evolution?
Planning?
http://www.patol.com/java/TSP/index.html