CSI643: Machine Learning
Slide Set 5: Neural Networks

Dr. G. Anderson
Department of Computer Science
University of Botswana
Semester 1 / 2017/2018

Outline

1 Neural Network Overview
2 Neural Network
3 Types of Neural Networks
4 Perceptron
5 Multi-Layer Perceptron (MLP)
6 Backpropagation

Introduction

Artificial Neural Networks (ANN) are inspired by the way biological neural systems process information.
A large number of highly interconnected processing elements (neurons) work together to solve specific problems.
ANN learning involves adjustment of the synaptic connections that exist between neurons.

Similarities Between Biological and ANN

The human brain contains about 100 billion cells.
The brain's neurons connect to one another in a complex network.
The number of synapses for a typical neuron varies from 1,000 to 10,000.
Connections are created and strengthened to remember habits and skills in a human, such as playing a piano.
When a human stops performing that activity, the associated network becomes weak and can eventually disappear.

Biology

A neuron consists of a soma (cell body), an axon (long fiber), and dendrites.
Axons send signals.
Dendrites receive signals.
A synapse connects an axon to a dendrite.
Given a signal, a synapse might increase (excite) or decrease (inhibit) electrical potential.
A neuron fires when its electrical potential reaches a threshold.

Biological Neural Network vs Artificial Neural Network

Artificial Neuron

Main Components

A set of processing units (neurons or cells).
A state of activation Y_i for every unit (its output).
Connections between units, defined by weights w_jk, which determine the effect the signal of unit j has on unit k: positive for excitation, negative for inhibition.
A propagation rule, which determines the effective input X_i of a unit from its external inputs.

Main Components (continued)

An activation function f, which determines the new level of activation based on the input X_i(t) and the current activation Y_i(t).
An external input (known as a bias), θ, for each unit.
A method for information gathering (the learning rule).
An environment within which the system must operate, providing input signals and error signals.

Neuron Types

Input units.
Hidden units.
Output units.
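
These components can be combined into a single artificial unit. Below is a minimal Python sketch, assuming a step-style activation against a threshold θ; the function and variable names are illustrative, not from the slides.

```python
import numpy as np

def neuron_output(inputs, weights, theta):
    # Propagation rule: effective input is the weighted sum of the inputs.
    effective_input = np.dot(inputs, weights)
    # Activation: the unit fires (outputs 1) when the threshold is reached.
    return 1 if effective_input >= theta else 0

# Two inputs: an excitatory weight (0.7) and an inhibitory weight (-0.3).
print(neuron_output(np.array([1.0, 1.0]), np.array([0.7, -0.3]), 0.2))  # 1
```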

Neural Network Parallelism and Goal

Many units can carry out their computations at the same time, hence the system is inherently parallel.
The neural network aims to train itself to achieve a balance between correctly responding to the patterns used in training and the ability to give reasonable responses to new inputs which are similar, but not identical, to those used for training.

Neural Network Types

Layered Feed-Forward Network.
The Perceptron.
Feed-Forward Radial Basis Function (RBF) Network.
Recurrent Networks:
  Simple Recurrent Network (SRN), Elman style.
  SRN, Jordan style.
Self-Organizing Maps.

Layered Feed-Forward Network

Characterized by a collection of input neurons.
This is followed by one or more hidden layers, then an output layer.
There are no connections from neurons to neurons in a previous layer, to neurons in the same layer, or to neurons more than one layer ahead.
The data from the input layer feeds into the next layer, the output of this layer feeds into the next layer, and so on.
A network with a single layer is called a perceptron.

The Perceptron

A single neuron.
Multiple inputs and a single output.
It has restricted processing capability.

Recurrent Networks

Outputs from a layer are fed back into a layer below.
The Hopfield Neural Network has symmetric connections. It indicates that a pattern is recognized by echoing it back.
Simple Recurrent Network (SRN), Elman style.
Simple Recurrent Network (SRN), Jordan style.

Self-Organizing Maps (Kohonen Networks)

Grid topology with unequal weights.
The topology provides for low-dimensional visualization of the data distribution.
Used in applications which involve browsing of large volumes of data.
Unsupervised learning.

Perceptron Architecture

Activation Functions

Step: Activation(X) = 0 if X < θ; 1 if X ≥ θ.
Sign: Activation(X) = −1 if X < θ; 1 if X ≥ θ.
Sigmoid (logistic): Activation(X) = 1 / (1 + e^(−X)).
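
A minimal sketch of the three activation functions in Python (function names are illustrative):

```python
import numpy as np

def step(x, theta=0.0):
    return 1.0 if x >= theta else 0.0    # step: outputs 0 or 1

def sign_act(x, theta=0.0):
    return 1.0 if x >= theta else -1.0   # sign: outputs -1 or 1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))      # smooth and differentiable

print(step(0.3), sign_act(-0.2), round(sigmoid(0.0), 2))  # 1.0 -1.0 0.5
```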

Perceptron Notation

Input vector: X(t) = [X_1(t), X_2(t), ..., X_N(t)]^T
Synaptic weights vector: w(t) = [w_1(t), w_2(t), ..., w_N(t)]^T, where 0 ≤ w_i ≤ 1, i = 1, ..., N
Threshold: θ(t)
Actual output: Y(t)
Desired output: Y_d(t)
Learning rate: α(t), 0 < α < 1

Rosenblatt Perceptron Learning Algorithm (Perceptron Rule)

1  Set t = 1;
2  Initialize weights w_1, w_2, ..., w_N to random numbers in the range [−0.5, 0.5];
3  Initialize threshold θ to a number in the range [−0.5, 0.5];
4  repeat
5      Activate the perceptron. Inputs are X_1(t), X_2(t), ..., X_N(t) and the desired output is Y_d(t);
6      Actual output is Y(t) = step[Σ_{i=1}^{N} X_i(t) w_i(t) − θ];
7      Calculate the error: e(t) = Y_d(t) − Y(t);
8      Update the weights of the perceptron: w_i(t+1) = w_i(t) + α · X_i · e(t);
9      t = t + 1;
10     Go to line 5;
11 until convergence;

Perceptron Rule Remarks

The weight change is Δw_i = α · X_i · e(t), so w_i(t+1) = w_i(t) + Δw_i.
Provided the data is linearly separable and a small α is used, the rule is proven to classify all training examples correctly.
The Perceptron convergence theorem states that for any data set which is linearly separable, the Perceptron learning rule is guaranteed to find a solution in a finite number of steps.
An epoch is the presentation of the entire training set to the neural network.
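
Below is a minimal Python sketch of the perceptron rule above, assuming 0/1 targets. One deliberate deviation, flagged here: the threshold θ is also updated, treated as a weight on a constant input of −1, which is a common variant; the pseudocode above keeps θ fixed after initialization.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_perceptron(X, Yd, alpha=0.1, max_epochs=100):
    n = X.shape[1]
    w = rng.uniform(-0.5, 0.5, n)       # line 2: weights in [-0.5, 0.5]
    theta = rng.uniform(-0.5, 0.5)      # line 3: threshold in [-0.5, 0.5]
    for _ in range(max_epochs):
        converged = True
        for x, yd in zip(X, Yd):
            y = 1.0 if x @ w - theta >= 0 else 0.0  # line 6: step activation
            e = yd - y                              # line 7: error
            if e != 0:
                w += alpha * x * e                  # line 8: weight update
                theta -= alpha * e                  # threshold as a weight on input -1
                converged = False
        if converged:
            break
    return w, theta

# Example: AND is linearly separable, so the rule converges.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
w, theta = train_perceptron(X, np.array([0.0, 0.0, 0.0, 1.0]))
```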

Delta Rule

Instead of using the output of the threshold function, the delta rule uses the net output.
This is used when the data is not linearly separable.
The key idea is to use gradient descent search.
The algorithm tries to minimize the error e = Σ_d (Y_d − Y)^2, where the sum goes over all training examples.
Y is the inner product wX, and not sign(wX) as in the Perceptron rule.

Delta Rule (continued)

Weights are updated according to the rule: w_i = w_i + Δw_i
Δw_i = −α · e'(W)_i
e'(W)_i = Σ_d (Y_d − Y) · (−X_i), summed over all training examples.
Hence Δw_i = α · Σ_d (Y_d − Y) · X_i
α is the learning rate.
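
A minimal sketch of the batch delta rule under these formulas, with a linear output Y = w · X (variable and function names are illustrative):

```python
import numpy as np

def delta_rule(X, Yd, alpha=0.01, epochs=500):
    """Gradient descent on the squared error with a linear output."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        Y = X @ w              # net (linear) output, not thresholded
        grad = -(Yd - Y) @ X   # e'(W)_i = sum over examples of (Yd - Y)(-X_i)
        w += -alpha * grad     # Delta w = -alpha * e'(W)
    return w

# Example: noisy linear target; the rule approaches the least-squares weights.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
Yd = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=50)
print(np.round(delta_rule(X, Yd), 2))
```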

Delta Rule vs Perceptron Rule

There are two differences between the Perceptron rule and the delta rule:
The Perceptron rule is based on the output of a step function, while the delta rule uses the linear combination of inputs directly.
The Perceptron rule is guaranteed to converge to a consistent hypothesis assuming the data is linearly separable. The delta rule converges but does not need the condition of linear separability of the data.

Delta Rule: Difficulties

There are two main difficulties with the gradient descent method:
Convergence to a minimum may take a long time.
There is no guarantee we will find the global minimum.
Solutions to these are using momentum and random perturbations of the weight vectors.

Perceptron Limitations

Learning is efficient if the weights are not very large.
Attributes are weighted independently.
Can only learn lines and hyperplanes.

MLP Overview

The Perceptron can be successfully used for functions such as AND and OR, but not XOR.
There is therefore a need for a network of perceptrons: a Multi-Layer Perceptron.

Typical MLP Architecture

MLP Architecture

Input layer.
One or more hidden layers.
Output layer.
Hidden units must use non-linear activation functions; otherwise the whole network reduces to one without hidden units.
An MLP can learn any continuous mapping to arbitrary accuracy.
One hidden layer is sufficient for most applications.
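
The need for non-linear hidden activations can be checked directly: with identity (linear) hidden activations, two layers compose into a single linear map. A small sketch, with illustrative matrix names:

```python
import numpy as np

rng = np.random.default_rng(2)
V = rng.normal(size=(3, 4))   # input-to-hidden weights
W = rng.normal(size=(4, 2))   # hidden-to-output weights
x = rng.normal(size=3)

# With identity activations, x -> (x @ V) @ W equals the
# single-layer network x -> x @ (V @ W):
two_layer = (x @ V) @ W
one_layer = x @ (V @ W)
print(np.allclose(two_layer, one_layer))  # True: the hidden layer adds nothing
```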

MLP Activation Function

Backpropagation Learning Algorithm

This involves:
The feed-forward of the input training patterns.
The calculation and backpropagation of the associated error.
The adjustment of the weights.

Backpropagation Notation

Input vector: x = [x_1, x_2, ..., x_i, ..., x_n]^T
Target output vector: t = [t_1, t_2, ..., t_k, ..., t_m]^T
Neurons in the input layer: X_1, X_2, ..., X_i, ..., X_n
Neurons in the hidden layer: Z_1, Z_2, ..., Z_j, ..., Z_p
Neurons in the output layer: Y_1, Y_2, ..., Y_k, ..., Y_m
Threshold of neuron in the hidden layer: θ_hid_j, j = 1, ..., p
Threshold of neuron in the output layer: θ_out_k, k = 1, ..., m
Weights between neurons in the input layer and hidden layer: v_ij, i = 1, ..., n, j = 1, ..., p
Weights between neurons in the hidden layer and output layer: w_jk, j = 1, ..., p, k = 1, ..., m

Backpropagation Notation (continued)

Net input for hidden layer neurons: z_input_j = θ_hid_j + Σ_i x_i v_ij, j = 1, ..., p
Net input for output layer neurons: y_input_k = θ_out_k + Σ_j z_j w_jk, k = 1, ..., m
Output signal (activation) of the hidden layer: z_j = f(z_input_j), j = 1, ..., p
Output signal (activation) of the output layer: y_k = f(y_input_k), k = 1, ..., m
δ_k, k = 1, ..., m: the portion of the error-correction weight adjustment for the weights w_jk between the hidden layer and output layer, due to the error at output neuron Y_k. The error at neuron Y_k is propagated back to the neurons in the hidden layer that feed into Y_k.
δ_j, j = 1, ..., p: similar to the above, for the weights v_ij into the hidden layer.
α: the learning rate.
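
The feed-forward equations above translate directly into code. A minimal sketch in the slides' notation (array names mirror the symbols: v[i, j], w[j, k], theta_hid[j], theta_out[k]):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feedforward(x, v, theta_hid, w, theta_out, f=sigmoid):
    z_input = theta_hid + x @ v   # z_input_j = theta_hid_j + sum_i x_i v_ij
    z = f(z_input)                # hidden activations z_j = f(z_input_j)
    y_input = theta_out + z @ w   # y_input_k = theta_out_k + sum_j z_j w_jk
    y = f(y_input)                # output activations y_k = f(y_input_k)
    return z_input, z, y_input, y
```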

MLP Backpropagation Network

Backpropagation Activation Function

Should be continuous, easily differentiable, and monotonically non-decreasing.
The sigmoid function is commonly used:
f(x) = 1 / (1 + e^(−x))
f'(x) = f(x)[1 − f(x)]
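
The derivative identity is easy to verify numerically. A small sketch comparing f'(x) = f(x)[1 − f(x)] against a finite difference:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # f'(x) = f(x)[1 - f(x)]

x, h = 0.7, 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
print(np.isclose(sigmoid_prime(x), numeric))  # True
```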

Backpropagation: 1. Randomly Initialize Weights

The weight initialization influences the speed with which the network reaches its goal.
Large weights might lead to very small derivatives of the sigmoid functions, which slows learning.
Too-small values will cause the net input to a neuron to be close to zero, which also slows learning.
The range [−0.5, 0.5] is commonly used.
Formulas are also used, e.g. (−2.4/F_i, 2.4/F_i), where F_i is the total number of inputs of neuron i in the network.

Backpropagation: 2. Feedforward

Each unit X_i receives an input signal and broadcasts this signal to the neurons in the hidden layer.
The neurons in the hidden layer then broadcast their outputs to the neurons in the output layer.
Each output neuron compares its output to the target value to determine the associated error for that pattern with that neuron.
The error then propagates back from layer to layer.
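
A minimal sketch of the (−2.4/F_i, 2.4/F_i) initialization rule (the function name and the uniform distribution are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def init_weights(fan_in, fan_out):
    """Uniform weights in (-2.4/F_i, 2.4/F_i), where F_i is the
    number of inputs of each receiving neuron."""
    bound = 2.4 / fan_in
    return rng.uniform(-bound, bound, size=(fan_in, fan_out))

v = init_weights(4, 3)  # 4 inputs feeding each of 3 hidden neurons
print(v.min() >= -0.6 and v.max() <= 0.6)  # True: bound is 2.4/4 = 0.6
```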

Backpropagation: 3. Backpropagation

For each neuron Y_k in the output layer, the term δ_k is computed based on the associated error. δ_k is used to distribute the error at output neuron Y_k back to all the neurons in the hidden layer which are connected to Y_k. It is also used to update the weights between the hidden layer and the output layer.
For each neuron Z_j in the hidden layer, the term δ_j is computed. δ_j is used to update the weights between the input layer and the hidden layer. In our case, since we only have one hidden layer, it is not necessary to propagate the error back to the input layer.

Backpropagation: 4. Weights Update

The adjustments of the weights w_jk, j = 1, ..., p, k = 1, ..., m, between neurons in the hidden layer and the output layer are based on the term δ_k and the activations z_j, j = 1, ..., p, of the neurons in the hidden layer.
The adjustments of the weights v_ij, i = 1, ..., n, j = 1, ..., p, between neurons in the input layer and the hidden layer are based on the term δ_j and the activations x_i, i = 1, ..., n, of the neurons in the input layer.

Backpropagation: Neurons in Output Layer

1 foreach neuron Y_k do
2     δ_k = (t_k − y_k) · f'(y_input_k);
3     Δw_jk = α · δ_k · z_j;
4     Δθ_out_k = α · δ_k;
5     Send δ_k to units in the layer below;
6 end

Backpropagation: Neurons in Hidden Layer

1 foreach neuron Z_j do
2     δ_inputs_j = Σ_{k=1}^{m} δ_k w_jk;
3     δ_j = δ_inputs_j · f'(z_input_j);
4     Δv_ij = α · δ_j · x_i;
5     Δθ_hid_j = α · δ_j;
6 end

Weights Update: Neurons in Output Layer

1 foreach neuron Y_k do
2     w_jk = w_jk + Δw_jk, j = 1, ..., p;
3     θ_out_k = θ_out_k + Δθ_out_k;
4 end

Weights Update: Neurons in Hidden Layer

1 foreach neuron Z_j do
2     v_ij = v_ij + Δv_ij, i = 1, ..., n;
3     θ_hid_j = θ_hid_j + Δθ_hid_j;
4 end
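
Putting steps 2–4 together, here is a minimal sketch of one backpropagation training step for a single pattern, following the update rules above (array names mirror the slides' notation; the sigmoid is assumed as the activation). XOR is used as the usage example; with only two hidden units, training can occasionally stall in a local minimum, in which case one reinitializes the weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, t, v, theta_hid, w, theta_out, alpha=0.5):
    """One backpropagation step for one pattern, in the slides'
    notation: v[i, j], w[j, k], theta_hid[j], theta_out[k]."""
    # Step 2: feedforward.
    z = sigmoid(theta_hid + x @ v)        # hidden activations z_j
    y = sigmoid(theta_out + z @ w)        # output activations y_k
    # Step 3: backpropagation of error.
    delta_k = (t - y) * y * (1 - y)       # delta_k = (t_k - y_k) f'(y_input_k)
    delta_inputs = w @ delta_k            # delta_inputs_j = sum_k delta_k w_jk
    delta_j = delta_inputs * z * (1 - z)  # delta_j = delta_inputs_j f'(z_input_j)
    # Step 4: weights update.
    w += alpha * np.outer(z, delta_k)     # Delta w_jk = alpha * delta_k * z_j
    theta_out += alpha * delta_k
    v += alpha * np.outer(x, delta_j)     # Delta v_ij = alpha * delta_j * x_i
    theta_hid += alpha * delta_j
    return v, theta_hid, w, theta_out

# Usage example: XOR with 2 inputs, 2 hidden units, 1 output.
rng = np.random.default_rng(4)
v = rng.uniform(-0.5, 0.5, (2, 2)); theta_hid = rng.uniform(-0.5, 0.5, 2)
w = rng.uniform(-0.5, 0.5, (2, 1)); theta_out = rng.uniform(-0.5, 0.5, 1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
for _ in range(5000):                     # several thousand epochs are typical
    for x, t in zip(X, T):
        v, theta_hid, w, theta_out = train_step(x, t, v, theta_hid, w, theta_out)
```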

Backpropagation Comments

One cycle through the entire set of training vectors is called an epoch. Usually several, even many, epochs are required to train a backpropagation neural network.
Weights are updated after each training pattern is presented. Another approach is to apply the weight changes accumulated over an entire epoch.
A common stopping condition is when the total squared error reaches a minimum; however, this might not be efficient.
One can instead divide the training data into a training set and a validation set. Training continues until the error on the validation set reaches a minimum, just before it starts rising again.

Relationship Between Dataset, Number of Weights, and Classification Accuracy

P: number of patterns.
W: number of weights to be trained.
A: accuracy of classification expected.
If there are enough training patterns, the network will be able to classify previously unseen patterns correctly.
P = W/A. If A is 0.1, a network with 10 weights will require 100 training patterns.

Improving the Efficiency of Backpropagation Learning

If the weights are adjusted to very large values, the total input of a neuron can reach very high values, and because of the sigmoid activation function, the neuron will have an activation very close to zero or one.
Gradient descent and other optimization methods can get stuck in local minima when deeper minima are close by.
Probabilistic methods can help to avoid this trap, but can be slow.
The number of hidden units can be increased, leading to a higher dimensionality of the error space and a smaller chance of getting trapped, but after some number of hidden units there is again a high chance of getting trapped in local minima.

Improving the Efficiency of Backpropagation Learning (continued)

Momentum can be factored into the weight update equation, so that when some data very different from the majority of the training data is encountered, a small learning rate is used, in order not to disrupt the progress.
For momentum to be used, the weight changes from one or more previous training patterns must be preserved.
Large weight adjustments are made as long as the corrections are in the same general direction for several patterns.
A network with momentum proceeds in the direction of a combination of the current gradient and the previous direction of the weight correction, instead of proceeding only in the direction of the gradient.
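
A minimal sketch of a momentum update (mu is the momentum coefficient; the names and constants are illustrative):

```python
import numpy as np

def momentum_update(w, grad, velocity, alpha=0.1, mu=0.9):
    """Combine the current gradient with the previous weight change."""
    velocity = mu * velocity - alpha * grad  # blend previous direction and gradient
    w = w + velocity                         # large steps when directions agree
    return w, velocity
```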

Improving the Efficiency of Backpropagation Learning


The delta-bar-delta update algorithm allows each weight to
have its own learning rate.
The learning rates also vary with time as training
progresses.
If the weight change is in the same direction for several
time steps, then the learning rate for that weight should be
increased.
If the direction of the weight change alternates, the
learning rate should be decreased.
The weight change will be in the same direction if the
partial derivative of the error with respect to that weight has
the same sign for several time steps.
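
A sketch of the idea follows. The full delta-bar-delta rule compares the current gradient with an exponentially smoothed average of past gradients (the "delta bar"); this simplified version uses only the previous gradient's sign, and kappa and phi are illustrative constants.

```python
import numpy as np

def delta_bar_delta(w, grad, lr, prev_grad, kappa=0.01, phi=0.5):
    """Per-weight learning rates: increased when the gradient keeps its
    sign over time steps, decreased when the sign alternates."""
    same_sign = np.sign(grad) == np.sign(prev_grad)
    lr = np.where(same_sign, lr + kappa, lr * phi)  # grow additively, shrink multiplicatively
    w = w - lr * grad
    return w, lr, grad  # pass grad back in as prev_grad on the next step
```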