ANN Lecture - 1 Nov Rama Mehta

Artificial Neural Network
Rama Mehta, Scientist

rama@nih.ernet.in
An artificial neural network (ANN) is a system based on
the operation of biological neural networks.
Neural networks have the ability to learn from
past experience, which makes them very flexible
and powerful.
These networks are also well suited to real-time
systems because of their fast response and
computation times.
The human brain is a wonderful processor. Its exact
workings are still a mystery.
The most basic element of the human brain is a
specific type of cell, known as a neuron, which
does not regenerate.
The human brain comprises about 100 billion
neurons. Each neuron can connect with up to
200,000 other neurons, although 1,000-10,000
interconnections are typical.
The power of the human mind comes
from the sheer number of
neurons and their multiple
interconnections.
It also comes from genetic programming
and learning.
There are more than 100 different classes of
neurons.
The individual neurons are complicated.
Together these neurons and their
connections form a process which is not
binary, not stable, and not synchronous.
Adaptive learning: An ANN is endowed with the ability
to learn how to do tasks based on the data given for
training or initial experience.
Self organization: An ANN can create its own
organization or representation of the information it
receives during learning time.
Real time operation: ANN computations may be
carried out in parallel. Special hardware devices are
being designed and manufactured to take advantage of
this capability of ANNs.
Fault tolerance via redundant information coding:
Partial destruction of a neural network leads to the
corresponding degradation of performance. However,
some network capabilities may be retained even after
major network damage.
Adaptivity.
Evidential response.
Contextual information.
VLSI (very-large-scale integration) implementability.
Neurobiological analogy.
A neural network can perform tasks that a linear
program cannot.
When an element of the neural network fails, the
network can continue to operate, although its
performance may degrade.
A neural network learns and does not need to be
reprogrammed.
It can be applied to a wide range of applications.
The neural network needs training to operate.
The architecture of a neural network is different
from the architecture of microprocessors and
therefore needs to be emulated.
Large neural networks require high processing time.
[Figure: network with an input layer of source nodes and an output layer of neurons]
[Figure: 3-4-2 network with an input layer, a hidden layer, and an output layer]
The models of ANN are specified by three basic
entities, namely:
i). The model's synaptic interconnections;
ii). The training or learning rules adopted for
updating and adjusting the connection weights;
iii). Their activation functions.
An ANN consists of a set of highly
interconnected processing elements
(neurons) such that each processing element's
output is connected through
weights to the other processing elements or
to itself.
The arrangement of these processing
elements and the geometry of their
interconnections are essential for an ANN.
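[Figure: model of a neuron. The input signals x1, x2, ..., xm are multiplied by the synaptic weights w1, w2, ..., wm and combined with the bias b in the summing function to give the local field v; the activation function maps v to the output y.]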
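The neuron model above can be sketched in a few lines of Python. This is only an illustration: the function names and the numeric values are hypothetical, and the logistic function is assumed as the activation.

```python
import math

def logistic(v):
    """Logistic activation, squashing the local field into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(inputs, weights, bias, activation):
    """Return y = activation(v), where v = sum_j w_j * x_j + b is the local field."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    local_field = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(local_field)

# Hypothetical input signals, synaptic weights, and bias (illustration only).
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.1]
b = 0.2
y = neuron_output(x, w, b, logistic)
print(y)
```

Here the local field is 0.5 - 0.5 + 0.3 + 0.2 = 0.5, so the output is the logistic function evaluated at 0.5.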
Single and Multi-layered perceptron
Five basic types of neuron
connection architecture exist:
single layer feed forward network;
multilayer feed forward network;
single node with its own feedback;
single layer recurrent network;
multilayer recurrent network.
Recurrent network with hidden neuron(s): the unit
delay operator z^-1 implies a dynamic system.
[Figure: recurrent network with input, hidden, and output units and three z^-1 unit-delay elements]
The strengths of the connections can be
set explicitly through the weights, using
a priori knowledge.
Otherwise the system can be trained
by feeding it teaching patterns and
letting it change its weights
according to some learning rule.
Parameter learning: It updates the
connecting weights in a neural net.
Structure learning: It focuses on the
change in network structure.
These can be performed simultaneously or separately.
Supervised learning;
Unsupervised learning;
Reinforcement learning.
The activation function acts as a squashing
function, such that the output of a neuron in a
neural network is between certain values
(usually 0 and 1, or -1 and 1).
When a signal is fed through a multilayer
network with linear activation functions, the
output obtained is the same as that which could
be obtained using a single layer network. For
this reason, nonlinear functions are more widely
used in multilayer networks than linear
functions.
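The claim about linear activations can be checked numerically. The sketch below assumes identity activations and no biases, and uses hypothetical weight matrices; it shows that a two-layer linear network gives the same output as a single layer whose weight matrix is the product of the two.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]

# Hypothetical weights: 2 inputs -> 3 hidden units -> 1 output.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
W2 = [[1.0, -2.0, 0.5]]
x = [2.0, -1.0]

# Two layers with identity (linear) activation.
two_layer = matvec(W2, matvec(W1, x))

# One layer with the combined weight matrix W2 * W1.
single_layer = matvec(matmul(W2, W1), x)

print(two_layer, single_layer)
```

Both computations give the same value, so the hidden layer adds nothing unless its activation is nonlinear.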
Several activation functions are in common use:
Identity function:
It is a linear function and can
be defined as
f(x) = x for all x
The input layer uses the identity activation function
for a single layer network, and the output remains the
same as the input.
Binary step function:
$f(x) = 1$ if $x \ge \theta$, and $f(x) = 0$ if $x < \theta$,
where $\theta$ represents the threshold value. This
function is most widely used in single layer nets to
convert the net input to an output that is binary
(1 or 0). It is also known as the threshold function.
Bipolar step function:
$f(x) = +1$ if $x \ge \theta$, and $f(x) = -1$ if $x < \theta$,
where $\theta$ represents the threshold value. This
function is also used in single layer nets to convert
the net input to an output that is bipolar (+1 or
-1).
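The two step functions can be sketched as follows. The function names are hypothetical, and the treatment of an input exactly equal to the threshold (mapped to 1 here) is an assumption.

```python
def binary_step(x, theta=0.0):
    """Return 1 when x reaches the threshold theta, otherwise 0."""
    return 1 if x >= theta else 0

def bipolar_step(x, theta=0.0):
    """Return +1 when x reaches the threshold theta, otherwise -1."""
    return 1 if x >= theta else -1

# Hypothetical net inputs and threshold (illustration only).
net_inputs = [-1.0, 0.2, 0.5, 2.0]
theta = 0.5
binary_outputs = [binary_step(v, theta) for v in net_inputs]
bipolar_outputs = [bipolar_step(v, theta) for v in net_inputs]
print(binary_outputs)
print(bipolar_outputs)
```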
Sigmoidal function: tanh.
Hard non-linearity: signum and step.
Feed-forward neural networks:
The data processing can extend over multiple (layers
of) units,
but no feedback connections are present, that is,
connections extending from outputs of units to inputs
of units in the same layer or previous layers.
Recurrent neural networks:
These do contain feedback connections.
In some cases, the activation values of the units
undergo a relaxation process such that the neural
network will evolve to a stable state in which these
activations do not change anymore.
In other applications, the change of the activation
values of the output neurons is significant, such that
the dynamical behaviour constitutes the output of the
neural network.
Each unit performs a relatively simple job:
receive input from neighbours or external
sources and use this to compute an
output signal which is propagated to
other units.
Apart from this processing, a second task
is the adjustment of the weights.
The system is inherently parallel in the
sense that many units can carry out their
computations at the same time.
Within neural systems it is useful to distinguish three
types of units:
input units which receive data from outside the
neural network,
output units which send data out of the neural
network, and
hidden units whose input and output signals remain
within the neural network.
During operation, units can be updated either
synchronously or asynchronously.
With synchronous updating, all units update their
activation simultaneously;
with asynchronous updating, each unit has a (usually
fixed) probability of updating its activation at a
time t, and usually only one unit updates at a time.
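The difference between the two update schemes can be illustrated with a tiny recurrent network. This is a sketch under stated assumptions: two units that inhibit each other, a bipolar step activation with threshold 0, and a fixed update order in place of the random choice of unit so that the result is reproducible.

```python
def bipolar_step(v):
    """Bipolar step activation with threshold 0."""
    return 1 if v >= 0 else -1

def net_input(weights, state, unit):
    """Weighted sum of the current activations feeding one unit."""
    return sum(w * s for w, s in zip(weights[unit], state))

def synchronous_update(weights, state):
    """All units compute their new activation from the same old state."""
    return [bipolar_step(net_input(weights, state, i)) for i in range(len(state))]

def asynchronous_update(weights, state, unit):
    """Only one unit updates; the others keep their activation."""
    new_state = list(state)
    new_state[unit] = bipolar_step(net_input(weights, state, unit))
    return new_state

# Hypothetical network: two units with mutual inhibition.
W = [[0, -1], [-1, 0]]
start = [1, 1]

sync_1 = synchronous_update(W, start)
sync_2 = synchronous_update(W, sync_1)

async_1 = asynchronous_update(W, start, 0)
async_2 = asynchronous_update(W, async_1, 1)

print(sync_1, sync_2)
print(async_1, async_2)
```

With synchronous updating this network flips both units on every step and never settles. With one unit updated at a time it relaxes to a stable state after the first update.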
Weights:
Each processing element has a weight vector, where
$w_{ij}$ is the weight from
processing element i (source node) to processing
element j (destination node).
Bias
The bias included in the network has its impact in
calculating the net input.
The bias is included by adding a component x0 = 1
to the input vector X. Thus, the input vector
becomes
X = (1, X1, X2, ..., Xn)
Bias as Input
$v = \sum_{j=0}^{m} w_j x_j, \quad w_0 = b$
Bias is an external parameter of the neuron.
It can be modeled by adding an extra input
x0 = +1 with weight w0 = b.
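[Figure: model of a neuron with the bias as an extra input. The fixed input x0 = +1 has weight w0 = b; the input signals x1, x2, ..., xm are multiplied by the synaptic weights w1, w2, ..., wm, and the summing function gives the local field v, which the activation function maps to the output y.]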
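Treating the bias as an extra input x0 = +1 with weight w0 = b gives the same local field as adding b separately. A minimal check, with hypothetical function names and values:

```python
def local_field_explicit_bias(inputs, weights, bias):
    """v = sum_{j=1..m} w_j * x_j + b, with the bias added separately."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def local_field_bias_as_input(inputs, weights, bias):
    """v = sum_{j=0..m} w_j * x_j, with x_0 = +1 and w_0 = b."""
    augmented_inputs = [1.0] + list(inputs)
    augmented_weights = [bias] + list(weights)
    return sum(w * x for w, x in zip(augmented_weights, augmented_inputs))

# Hypothetical values (illustration only).
x = [0.5, -1.0]
w = [2.0, 1.0]
b = 0.25

v_explicit = local_field_explicit_bias(x, w, b)
v_augmented = local_field_bias_as_input(x, w, b)
print(v_explicit, v_augmented)
```

The augmented form is convenient because the bias can then be learned with the same rule as every other weight.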
The bias is of two types:
(i) positive bias and
(ii) negative bias.
The positive bias helps in increasing the net input
of the network and the negative bias helps in
decreasing the net input of the network.
For each and every application, there is a
threshold limit. The activation function using a
threshold compares the net input with theta, the
fixed threshold value, as in the binary and bipolar
step functions.
learncon   Conscience bias learning function
learngd    Gradient descent weight and bias learning function
learngdm   Gradient descent with momentum weight and bias learning function
learnh     Hebb weight learning rule
learnhd    Hebb with decay weight learning rule
learnis    Instar weight learning function
learnk     Kohonen weight learning function
learnlv1   LVQ1 weight learning function
learnlv2   LVQ2.1 weight learning function
learnos    Outstar weight learning function
learnp     Perceptron weight and bias learning function
learnpn    Normalized perceptron weight and bias learning function
learnsom   Self-organizing map weight learning function
learnsomb  Batch self-organizing map weight learning function
learnwh    Widrow-Hoff weight/bias learning function
The learning rate is used
to control the amount of weight adjustment
at each step of training. The learning rate,
ranging from 0 to 1, determines the rate of
learning at each time step.
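The role of the learning rate can be seen in a single weight update. The sketch below assumes a linear neuron trained with the Widrow-Hoff (delta) rule, where each weight changes by learning rate times error times input. It is a plain reimplementation for illustration, not the toolbox function learnwh, and all names and values are hypothetical.

```python
def widrow_hoff_step(weights, bias, inputs, target, learning_rate):
    """One Widrow-Hoff update for a linear neuron; returns new weights, bias, and the error."""
    output = sum(w * x for w, x in zip(weights, inputs)) + bias
    error = target - output
    new_weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
    new_bias = bias + learning_rate * error
    return new_weights, new_bias, error

# Hypothetical training pattern (illustration only).
x = [1.0, 2.0]
t = 1.0
weights, bias = [0.0, 0.0], 0.0

errors = []
for _ in range(5):
    weights, bias, err = widrow_hoff_step(weights, bias, x, t, learning_rate=0.1)
    errors.append(err)
print(errors)
```

With this pattern the error shrinks on every step. A larger learning rate makes bigger adjustments per step, and a rate that is too large for the input magnitudes can overshoot instead of converging.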
compet    Competitive transfer function
hardlim   Hard-limit transfer function
hardlims  Symmetric hard-limit transfer function
logsig    Log-sigmoid transfer function
netinv    Inverse transfer function
poslin    Positive linear transfer function
purelin   Linear transfer function
radbas    Radial basis transfer function
radbasn   Normalized radial basis transfer function
satlin    Saturating linear transfer function
satlins   Symmetric saturating linear transfer function
softmax   Soft max transfer function
tansig    Hyperbolic tangent sigmoid transfer function
tribas    Triangular basis transfer function
Example:
Code to create a plot of the hardlim transfer
function:
n = -5:0.1:5;
a = hardlim(n);
plot(n,a)
Assign this transfer function to layer i of a network
as:
net.layers{i}.transferFcn = 'hardlim';
Algorithms
hardlim(n) = 1 if n >= 0
             0 otherwise
a = purelin(n)
Examples
Code to create a plot of the purelin transfer function:
n = -5:0.1:5;
a = purelin(n);
plot(n,a)
Assign this transfer function to layer i of a network by
net.layers{i}.transferFcn = 'purelin';
Algorithms
a = purelin(n) = n
a = logsig(n)
Examples
Here is the code to create a plot of the logsig
transfer function.
n = -5:0.1:5;
a = logsig(n);
plot(n,a)
Assign this transfer function to layer i of a network.
net.layers{i}.transferFcn = 'logsig';
Algorithms
logsig(n) = 1 / (1 + exp(-n))
Examples
Code to create a plot of the tansig transfer function:
n = -5:0.1:5;
a = tansig(n);
plot(n,a)
Assign this transfer function to layer i of a network.
net.layers{i}.transferFcn = 'tansig';
Algorithms
a = tansig(n) = 2/(1+exp(-2*n))-1
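The four formulas above can be cross-checked outside MATLAB. The Python functions below reuse the toolbox names only for readability; they are plain reimplementations of the stated formulas, not the toolbox functions. The check confirms that the tansig expression agrees with the hyperbolic tangent over the same range n = -5:0.1:5.

```python
import math

def hardlim(n):
    """1 if n >= 0, otherwise 0."""
    return 1.0 if n >= 0 else 0.0

def purelin(n):
    """Linear transfer function: output equals input."""
    return n

def logsig(n):
    """Log-sigmoid: 1 / (1 + exp(-n))."""
    return 1.0 / (1.0 + math.exp(-n))

def tansig(n):
    """Hyperbolic tangent sigmoid written as 2 / (1 + exp(-2n)) - 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

# Same grid as n = -5:0.1:5 in the examples above.
samples = [i / 10.0 for i in range(-50, 51)]
max_difference = max(abs(tansig(n) - math.tanh(n)) for n in samples)
print(max_difference)
```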
train      Train neural network
trainb     Batch training with weight and bias learning rules
trainbfg   BFGS quasi-Newton backpropagation
trainbfgc  BFGS quasi-Newton backpropagation for use with NN model reference adaptive controller
trainbr    Bayesian regularization backpropagation
trainbu    Batch unsupervised weight/bias training
trainc     Cyclical order weight/bias training
traincgb   Conjugate gradient backpropagation with Powell-Beale restarts
traincgf   Conjugate gradient backpropagation with Fletcher-Reeves updates
traincgp   Conjugate gradient backpropagation with Polak-Ribière updates
traingd    Gradient descent backpropagation
traingda   Gradient descent with adaptive learning rate backpropagation
traingdm   Gradient descent with momentum backpropagation
traingdx   Gradient descent with momentum and adaptive learning rate backpropagation
trainlm    Levenberg-Marquardt backpropagation
trainoss   One-step secant backpropagation
trainr     Random order incremental training with learning functions
trainrp    Resilient backpropagation
trainru    Unsupervised random order weight/bias training
trains     Sequential order incremental training with learning functions
trainscg   Scaled conjugate gradient backpropagation
Training stops when any of these
conditions occurs:
The maximum number of epochs
(repetitions) is reached.
The maximum amount of time is
exceeded.
Performance is minimized to the goal.
The performance gradient falls below
min_grad.
mu exceeds mu_max.
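A training loop with these stopping conditions might look like the sketch below. It is not the toolbox implementation: the function names, the default values, and the toy problem (gradient descent on w^2) are assumptions for illustration. The mu condition is left out because mu is specific to Levenberg-Marquardt training.

```python
import time

def train_until_stopped(step, max_epochs=500, max_time=60.0, goal=1e-3, min_grad=1e-6):
    """Call step() once per epoch until a stopping condition is met.

    step() must return (performance, gradient_norm) after its update.
    Returns the number of epochs run and the reason for stopping.
    """
    start = time.monotonic()
    for epoch in range(1, max_epochs + 1):
        performance, grad_norm = step()
        if performance <= goal:
            return epoch, "performance goal met"
        if grad_norm < min_grad:
            return epoch, "gradient below min_grad"
        if time.monotonic() - start > max_time:
            return epoch, "maximum time exceeded"
    return max_epochs, "maximum epochs reached"

def make_quadratic_step(w0=1.0, learning_rate=0.1):
    """Toy problem: minimise w**2 by gradient descent, starting from w0."""
    state = {"w": w0}

    def step():
        gradient = 2.0 * state["w"]
        state["w"] -= learning_rate * gradient
        return state["w"] ** 2, abs(2.0 * state["w"])

    return step

epochs_run, reason = train_until_stopped(make_quadratic_step())
print(epochs_run, reason)
```

Changing the limits changes which condition fires first; for example, a small max_epochs stops training before the goal is reached.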
Back Propagation is a systematic method of training
multilayer artificial neural networks.
The Back Propagation algorithm is a generalization of the
Widrow-Hoff correction rule, given as
w1 = (T - w2 x2)/x1 and w2 = (T - w1 x1)/x2.
For a typical neuron with inputs xi and weights Wi, the
summation of the weighted inputs I is given by
$I = \sum_i W_i x_i$
A sigmoidal function has been used as the nonlinear
activation function:
$f(I) = \frac{1}{1 + e^{-I}}$
A sigmoidal function
monotonically increases from a lower limit (0 or -1)
to an upper limit (+1) as I increases. For this logistic
function the values vary
between 0 and 1, with a value of 0.5 when I is
zero.
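The back-propagation idea can be sketched for a very small network. The example assumes a 2-2-1 network with logistic activations and a squared-error loss E = 0.5 (T - y)^2; all names and numeric values are hypothetical. It computes the gradient for one hidden weight by propagating the output error backwards and compares it with a finite-difference estimate.

```python
import math

def sigmoid(i):
    """Logistic function of the summed weighted input I."""
    return 1.0 / (1.0 + math.exp(-i))

def forward(params, x):
    """Return hidden activations and the network output."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output

def loss(params, x, target):
    """Squared-error loss for one pattern."""
    _, output = forward(params, x)
    return 0.5 * (target - output) ** 2

def backprop(params, x, target):
    """Gradients of the loss with respect to all weights and biases."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden, output = forward(params, x)
    # Output delta: dE/dI at the output neuron.
    delta_out = (output - target) * output * (1.0 - output)
    grad_w_out = [delta_out * h for h in hidden]
    # Hidden deltas: output delta propagated back through the output weights.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    grad_w_hidden = [[d * xi for xi in x] for d in delta_hidden]
    return grad_w_hidden, delta_hidden, grad_w_out, delta_out

def numerical_gradient(params, x, target, j, i, eps=1e-6):
    """Central-difference estimate of dE/dw_hidden[j][i]."""
    w_hidden, b_hidden, w_out, b_out = params
    plus = [row[:] for row in w_hidden]
    minus = [row[:] for row in w_hidden]
    plus[j][i] += eps
    minus[j][i] -= eps
    return (loss((plus, b_hidden, w_out, b_out), x, target)
            - loss((minus, b_hidden, w_out, b_out), x, target)) / (2.0 * eps)

# Hypothetical weights, biases, input pattern, and target.
params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, target = [1.0, 0.5], 1.0

analytic = backprop(params, x, target)[0][0][1]
numeric = numerical_gradient(params, x, target, 0, 1)
print(analytic, numeric)
```

Training then consists of moving each weight a small step against its gradient, scaled by the learning rate.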
%=======================================================================%
% CODE FOR climate change in Allahabad
% Date: June 21, 2011
%=======================================================================%
clear all
clc
t1 = cputime;
% Load the data file and split it into calibration and validation sets
data = xlsread('annallahabad.xls');
cali = data(2:150,:);    % calibration data
vali = data(151:end,:);  % validation data
% Inputs (columns 1-5) and target output (column 6) for the network
caliin  = cali(:,1:5);
caliout = cali(:,6);
valiin  = vali(:,1:5);
valiout = vali(:,6);
% Feed-forward neural network
n = 3; % three hidden neurons
% One [min max] row per input, derived from the calibration data,
% so that the number of ranges matches the five input columns
net1 = newff(minmax(caliin'), [n 1], {'logsig', 'purelin'}, 'trainbr');
% Initial input weights (n-by-5) and hidden-layer biases (n-by-1)
net1.IW{1,1} = 0.01*ones(n,5);
net1.b{1} = zeros(n,1);
net1.inputWeights{1,1}.learnFcn = 'learngd';
% Layer 2 is the output layer of this two-layer network
net1.layerWeights{2,1}.learnParam.lr = 0.1;
net1.trainParam.epochs = 500;
net1.trainParam.goal = 0.001;
net1.performFcn = 'msereg';
net1 = train(net1, caliin', caliout');
model_output = sim(net1, valiin');
net1.IW{1,1}  % display the trained input weights
net1.b{1}     % display the trained hidden-layer biases
model_output'
% Root mean square error over the validation observations
err = abs(valiout - model_output');
a = err.*err;
b = sum(a);
rmse = sqrt(b/numel(err))
figure, plot(valiout) % plots your actual validation output in blue line
hold on
plot(model_output,'r') % plots your model output in red line
hold off
legend('observed PET', 'ANN model_output for PET')
ylabel('PET'); xlabel('No. of observations')
figure, plot(err) % plots your variation in error
ylabel('Error'); xlabel('No. of observations')
f = getx(net1); % to store your network parameters for future use, if any
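The error measure at the end of the script is a root mean square error. A minimal sketch of that calculation follows, dividing by the actual number of validation observations. The function name and the sample values are hypothetical and are not taken from the Allahabad data.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))

# Hypothetical observed and modelled values (illustration only).
observed = [3.0, 4.0, 5.0]
predicted = [2.0, 4.0, 7.0]
value = rmse(observed, predicted)
print(value)
```

The squared errors here are 1, 0, and 4, so the result is the square root of 5/3.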
Aerospace
Automotive
Banking
Defense
Electronics
Entertainment
Financial
Insurance
Manufacturing
Medical
Oil and Gas
Robotics
Speech
Securities
Telecommunications
Transportation
