Implementation of Perceptron Learning Rule Using MATLAB
Jose Eduardo Urrea Cabus
Student Number 376202
Karadeniz Technical University
Faculty of Engineering
Department of Electrical and Electronics Engineering

Abstract—An Artificial Neural Network (ANN) consists of hundreds of simple units, artificial neurons or processing elements. The neurons are connected by weights, which constitute the neural structure, and are organized in layers. The perceptron is a single-layer artificial neural network that works with continuous or binary inputs. In the modern sense, the perceptron is an algorithm for learning a binary classifier. In an ANN the inputs are applied through a series of weights and the actual outputs are compared with the target outputs. A learning rule is then used to adjust the weights and the bias of the network so that the actual output moves closer to the target output. The perceptron learning rule falls under the category of supervised learning. In this paper, the implementation of a single-layer perceptron model using the perceptron learning rule in MATLAB is discussed.

Index Terms—Perceptron, ANN, Sigmoid, Step function, FFNN.

Jose Eduardo Urrea Cabus (email: joseeduardourrea@gamil.com).

I. INTRODUCTION

ARTIFICIAL neural networks are computational models inspired by biological neural networks. They are used to approximate functions that are generally unknown. In an ANN, artificial neurons are elementary units that receive inputs, sum them and produce an output. An activation function (transfer function) is used to pass the sum; it may be, for example, a sigmoid function, a step function, or a piecewise linear function. Usually such functions are monotonically increasing, continuous, differentiable and bounded.

A single-layer perceptron network is the simplest kind of neural network, consisting of a single layer of output nodes. In this network, the inputs are fed directly to the outputs via a series of weights. For this reason it can be considered the simplest kind of feedforward network.

In 1943, Warren McCulloch and Walter Pitts introduced one of the first artificial neurons [McPi43]. The main feature of their neuron model is that a weighted sum of input signals is compared to a threshold to determine the neuron output. When the sum is greater than or equal to the threshold, the output is 1. When the sum is less than the threshold, the output is 0. They went on to show that networks of these neurons could, in principle, compute any arithmetic or logical function. In the late 1950s, Frank Rosenblatt and several other researchers developed a class of neural networks called perceptrons. The neurons in these networks were similar to those of McCulloch and Pitts. Rosenblatt's key contribution was the introduction of a learning rule for training perceptron networks to solve pattern recognition problems [Rose58]. He proved that his learning rule will always converge to the correct network weights, if weights exist that solve the problem. Learning was simple and automatic. Examples of proper behavior were presented to the network, which learned from its mistakes. The perceptron could even learn when initialized with random values for its weights and biases. Today the perceptron is still viewed as an important network. It remains a fast and reliable network for the class of problems that it can solve. In addition, an understanding of the operations of the perceptron provides a good basis for understanding more complex networks.

II. BIOLOGICAL NEURAL NETWORKS

THE entire information processing system, i.e. the vertebrate nervous system, consists of the central nervous system and the peripheral nervous system, which is only a first and simple subdivision. In reality, such a rigid subdivision does not make sense, but here it is helpful to outline the information processing in a body.

The peripheral nervous system (PNS) comprises the nerves that are situated outside of the brain or the spinal cord. These nerves form a branched and very dense network throughout the whole body. The peripheral nervous system includes, for example, the spinal nerves, which pass out of the spinal cord (two within the level of each vertebra of the spine) and supply the extremities, neck and trunk, but also the cranial nerves leading directly to the brain.

The central nervous system (CNS), however, is the "mainframe" within the vertebrate. It is the place where the information received by the sense organs is stored and managed. Furthermore, it controls the inner processes in the body and, last but not least, coordinates the motor functions of the organism. The vertebrate central nervous system consists of the brain and the spinal cord.
The cerebrum (telencephalon) is one of the areas of the brain that changed most during evolution. Along an axis running from the lateral face to the back of the head, this area is divided into two hemispheres, which are organized in a folded structure. These cerebral hemispheres are connected by one strong nerve cord ("bar") and several small ones. A large number of neurons are located in the cerebral cortex, which is approximately 2-4 cm thick and divided into different cortical fields, each having a specific task to fulfill. Primary cortical fields are responsible for processing qualitative information, such as the management of different perceptions (e.g. the visual cortex is responsible for the management of vision). Association cortical fields, however, perform more abstract association and thinking processes; they also contain our memory.

The cerebellum is located below the cerebrum and is therefore closer to the spinal cord. Accordingly, it serves less abstract functions with higher priority: here, large parts of motor coordination are performed, i.e., balance and movements are controlled and errors are continually corrected. For this purpose, the cerebellum has direct sensory information about muscle lengths as well as acoustic and visual information. Furthermore, it also receives messages about more abstract motor signals coming from the cerebrum.

The interbrain (diencephalon) includes parts of which only the thalamus will be briefly discussed: this part of the diencephalon mediates between sensory and motor signals and the cerebrum. In particular, the thalamus decides which part of the information is transferred to the cerebrum, so that especially less important sensory perceptions can be suppressed at short notice to avoid overloads. Another part of the diencephalon is the hypothalamus, which controls a number of processes within the body. The diencephalon is also heavily involved in the human circadian rhythm ("internal clock") and the sensation of pain.

A neuron is nothing more than a switch with an information input and output. The switch is activated if there are enough stimuli from other neurons hitting the information input. Then, at the information output, a pulse is sent to, for example, other neurons.

Figure 1. Illustration of a biological neuron with its components.

III. BACKGROUND AND BASICS

THE artificial neuron receives one or more inputs and sums them to produce an output (activation). Usually the sum at each node is weighted and an activation function is used to transform the activation level of a neuron into an output signal. There are many types of activation functions, such as the linear function, step function, sigmoid function, ramp function, Gaussian function and hyperbolic tangent.

The perceptron learning rule can be stated as the following pseudocode, where P is the set of training patterns, yq is the actual output of output neuron q, tq is its target output, and oi is the output of input neuron i:

while ∃ p ∈ P and {error too large} do
    Input p into the network, calculate output y    {P: set of training patterns}
    for all output neurons q do
        if yq = tq then
            Output is okay, no correction of weights
        else
            if yq = 0 then
                for all input neurons i do
                    wi,q := wi,q + oi    {...increase weight towards q by oi}
                end for
            end if
            if yq = 1 then
                for all input neurons i do
                    wi,q := wi,q - oi    {...decrease weight towards q by oi}
                end for
            end if
        end if
    end for
end while

The perceptron learning algorithm thus reduces the weights towards output neurons that return 1 instead of 0, and in the inverse case increases the weights.

A step function is the function most likely used by the original perceptron. This function produces one of two scalar output values depending on the threshold Φ: if the input sum is above the threshold the output is 1, and if the input sum is below the threshold the output is 0.

F(net_j) = 1 if net_j ≥ Φ
           0 if net_j < Φ
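This threshold behaviour is straightforward to express in MATLAB. The snippet below is only an illustrative sketch; the names step, net and Phi are chosen here for readability and are not part of the GUI code presented later:

% Binary step (hard-limit) activation with threshold Phi,
% as in F(net_j) above: returns 1 when net >= Phi, otherwise 0.
step = @(net, Phi) double(net >= Phi);

net = [-0.3 0.0 0.7];   % example weighted sums
y   = step(net, 0)      % -> [0 1 1]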
Artificial neuron learning: training is the act of presenting the network with some sample data and modifying the weights to better approximate the desired function. There are two main types of training, supervised training and unsupervised training. In supervised learning both the inputs and the outputs are provided. The neural network then processes the input and calculates an error based on its desired output and its actual output. The weights are modified to reduce the difference (error) between the actual and desired outputs. A feedforward neural network (FFNN) is the simplest type of artificial neural network. A FFNN consists of three layers: an input layer, a hidden layer and an output layer.
In this network the information moves in only one direction, forward, from the input layer, through the hidden layers (if any), to the output layer, and the connections between the units do not form a cycle or loop. A FFNN with a monotonically increasing differentiable activation function can approximate any continuous function with one hidden layer, provided the hidden layer has enough hidden neurons.

The simplest kind of neural network is the single-layer perceptron network, which consists of one or more artificial neurons in parallel; it can be considered the simplest kind of feedforward network, as the inputs are given directly to the outputs via a series of weights. The sum of the products of the weights and the inputs is calculated in each node and the calculated value is compared with a threshold value. If the calculated value is above the threshold (typically 0), the neuron fires and takes the activated value (typically 1); otherwise it takes the deactivated value (typically -1). Neurons with this kind of activation function are also called artificial neurons or linear threshold units. Single-unit perceptrons are only capable of learning linearly separable patterns. Although a single threshold unit is quite limited in its computational power, it has been shown that networks of parallel threshold units can approximate any continuous function from a compact interval of the real numbers into the interval [-1, 1]. The neuron output signal is given by the following relationship:

O = f(w^T x)    (1)

or

O = f( sum_{i=1}^{n} w_i x_i )    (2)

where w is the weight vector, defined as

w ≡ [w_1 w_2 ... w_n]^T    (3)

and x is the input vector:

x ≡ [x_1 x_2 ... x_n]^T    (4)

The variable net is defined as the scalar product of the weight and input vectors:

net ≡ w^T x    (5)

Typical activation functions used are

f(net) ≡ 2 / (1 + e^(-λ net)) - 1    (6)

and

F(net) = sgn(net) = +1 if net ≥ 0
                    -1 if net < 0    (7)

Activation functions (6) and (7) are called the bipolar continuous and bipolar binary functions, respectively.
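As an illustration, the neuron output of (1)-(7) can be evaluated directly in MATLAB. This is only a sketch for a single neuron; the names w, x and lambda are illustrative and do not appear in the GUI code:

% Neuron output O = f(w'*x) for a single neuron, eqs. (1)-(7)
w      = [0.5; -0.2; 0.1];   % weight vector (last entry acts as the bias)
x      = [1.2; 0.7; 1];      % augmented input vector [x1; x2; 1]
lambda = 2;                  % steepness of the continuous activation

net      = w' * x;                        % eq. (5)
O_cont   = 2/(1 + exp(-lambda*net)) - 1;  % bipolar continuous, eq. (6)
O_binary = sign(net);                     % bipolar binary, eq. (7)
% note: MATLAB's sign() returns 0 at net == 0, whereas (7) defines sgn(0) = +1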
For the perceptron learning rule, the learning signal is the difference between the desired and the actual neuron response [Rosenblatt 1958]. Thus, learning is supervised and the learning signal is equal to

r ≡ d_i - o_i    (8)

where o_i = sgn(w_i^T x) and d_i is the desired response. The weight adjustments in this method, Δw_i and Δw_ij, are obtained as follows:

Δw_i = c [d_i - sgn(w_i^T x)] x    (9)

Δw_ij = c [d_i - sgn(w_i^T x)] x_j,   for j = 1, 2, ..., n    (10)

where a plus sign applies when d_i = 1 and sgn(w^T x) = -1, and a minus sign applies when d_i = -1 and sgn(w^T x) = 1. The weight adjustment is inherently zero when the desired and actual responses agree.
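A single weight correction according to (8)-(10) can be written in MATLAB in one line. The sketch below is generic (the names w, x, d and c are illustrative); the same update appears later in the GUI code as wnew = wnew + c*(d-o)*VN(:,k):

% One discrete perceptron update, eq. (9): w <- w + c*(d - sgn(w'x))*x
c = 0.5;                 % learning constant
w = [0.1; -0.3; 0.2];    % current weights (bias term included)
x = [1.5; -0.7; 1];      % augmented training pattern
d = 1;                   % desired bipolar response (+1 or -1)

o = sign(w' * x);        % actual response, eq. (8)
w = w + c*(d - o)*x;     % no change if o already equals d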
The delta learning rule is only valid for continuous activation functions, as defined in (6), and in the supervised training mode. The learning signal for this rule is called delta and is defined as follows:

r ≡ [d_i - f(w_i^T x)] f'(w_i^T x)    (11)

The term f'(w_i^T x) is the derivative of the activation function f(net) computed for net = w_i^T x. This learning rule can be readily derived from the condition of least squared error between o_i and d_i. Calculating the gradient vector, with respect to w_i, of the squared error defined as

E = (1/2)(d_i - o_i)^2    (12)

which is equivalent to

E = (1/2)[d_i - f(w_i^T x)]^2    (13)

we obtain the error gradient vector

∇E = -(d_i - o_i) f'(w_i^T x) x    (14)

Since the minimization of the error requires the weight changes to be in the negative gradient direction, we take

Δw_i = -η ∇E    (15)

where η is a positive constant. We then obtain from (14) and (15)

Δw_i = η (d_i - o_i) f'(net_i) x    (16)

or, for a single weight, the adjustment becomes

Δw_ij = η (d_i - o_i) f'(net_i) x_j,   for j = 1, 2, ..., n    (17)

Hence, the weight adjustment becomes

Δw_i = c (d_i - o_i) f'(net_i) x    (18)

It can be seen that (16) is identical to (18), since c and η have been assumed to be arbitrary constants. The weights may be initialized at any values for this method of training. The delta rule was introduced relatively recently for neural network training [McClelland and Rumelhart 1986].
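For the bipolar continuous function (6), the derivative can be written in terms of the output itself, f'(net) = (λ/2)(1 - f(net)^2), which makes the delta update (18) cheap to compute. A minimal MATLAB sketch of one such update (the variable names are again illustrative, not part of the GUI code) is:

% One delta-rule update, eq. (18), with the bipolar continuous activation (6)
c      = 0.5;              % learning constant
lambda = 2;                % steepness of f(net)
w = [0.1; -0.3; 0.2];      % current weights (bias included)
x = [1.5; -0.7; 1];        % augmented training pattern
d = 1;                     % desired response

net   = w' * x;
fnet  = 2/(1 + exp(-lambda*net)) - 1;   % o_i = f(net), eq. (6)
dfnet = (lambda/2)*(1 - fnet^2);        % f'(net) expressed through f(net)
w     = w + c*(d - fnet)*dfnet*x;       % eq. (18)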
The principal function of a decision-making system is to yield decisions concerning the class membership of the input pattern with which it is confronted. Conceptually, the problem can be described as a transformation of sets, or functions, from the input space to the output space, which is called the classification space. The training or network adaptation is represented as a sequence of iterative weight adjustments. A pattern is the quantitative description of an object, event or phenomenon. The classification may involve spatial and temporal patterns. Examples of spatial patterns are pictures, video images of ships, weather maps, etc. Examples of temporal patterns include speech signals, signals versus time produced by sensors, electrocardiograms, etc. The goal of pattern classification is to assign a physical object, event or phenomenon to one of the prespecified classes (also called categories).
The classifying system consists of an input transducer providing the input pattern data to the feature extractor. Typically, the inputs to the feature extractor are sets of data vectors that belong to a certain category. Usually, the converted data at the output of the transducer can be compressed while still maintaining the same level of machine performance. The compressed data are called features. The feature extractor at the input of the classifier performs the reduction of dimensionality. The feature space dimensionality is postulated to be much smaller than the dimensionality of the pattern space. The feature vectors retain the minimum number of data dimensions while maintaining the probability of correct classification, thus making data handling easier.

IV. DESIGN AND ANALYSIS

The work in MATLAB was done using GUIDE [Figure 2], [Figure 3], [Figure 4] in order to provide a friendly environment for the user.

Figure 2. Guide of Perceptron Rule, graphic environment.
Figure 3. Guide of Perceptron Rule, main area.
Figure 4. Guide of Perceptron Rule, result area.

The code below shows how the Cartesian coordinates chosen by the user are collected for the two-class case; in [Figure 5] we can see the process performed by this code. The global section declares the variables that will be used throughout the calculations in the program. The program then takes the selection value made by the user, that is, the type of class; this is done through the Items option. Next, the environment in which our classes will be plotted is prepared. The variable button = 1 controls the mouse while it is in use: left-click to plot a point, right-click to stop plotting the classes in our space.

The cycle that plots our classes in the space works as follows: while button == 1, that is, while the left mouse button is being used in our workspace, the classes are plotted according to the selection made by the user. The statement [x, y, button] = ginput(1) collects the clicks made by the user and transforms them into 'X' and 'Y' coordinates.

% --- Executes on selection change in 2 inputs selection.
function popupmenu1_Callback(hObject)
global RMx RMy Rd1 Rd2 RM TRM RRD T1 T2 T3
items=get(hObject,'value'); button=1;
ax.XAxisLocation='origin';
ax.YAxisLocation='origin';
box off; axis([-100 100 -100 100]); grid on; hold on

rx1=[]; ry1=[]; d1=[]; d2=[];
rx2=[]; ry2=[]; i=1; k=1;
u1=[]; u2=[];
if items==1
    while button==1
        [x,y,button]=ginput(1);
        plot(x,y,'r+');
        hold on
        rx1(i)=x;
        ry1(i)=y;
        d1(i)=1;
        u1(i)=1;
        i=i+1;
    end
else
    while button==1
        [x,y,button]=ginput(1);
        plot(x,y,'go');
        hold on
        rx2(k)=x;
        ry2(k)=y;
        d2(k)=-1;
        u2(k)=1;
        k=k+1;
    end
end
%Matrix mxn, m = rows, n = columns
RMx=[RMx; rx1' ry1' u1'];
RMy=[RMy; rx2' ry2' u2'];
%[rx11 ry11 1]
%[rx12 ry12 1]
%[rx1n ry1n 1]

Rd1=[Rd1 d1];
Rd2=[Rd2 d2];
%[1 1 1 1 ... n] or [-1 -1 -1 -1 ... -n]
RRD=[Rd1 Rd2];
%[1 1 .. n  -1 -1 .. -n]

RM=[RMx;RMy];
% Matrix mx3
%[rx11 ry11 1]
%[rx21 ry22 1]
%[rxmn rymn 1]

TRM=RM';
% Matrix 3xn
%[rx11 rx21 rxm1]
%[ry11 ry21 rym1]
%[1    1    1   ]

T1=[T1; rx1' ry1' d1'];
T2=[T2; rx2' ry2' d2'];
T3=[T1; T2];

Figure 5. Cartesian coordinates, two inputs.

From the previous example, presented in [Figure 5], we obtain the following matrices:

RM = [ -19.9495   19.8225   1
       -19.9495   39.9408   1
       -20.4545  -39.9408   1
       -40.1515  -19.2308   1
       -40.1515  -19.8225   1
        40.1515  -59.4675   1
       -19.9495   39.3491   1 ]

RRD = [ 1  1  1  1  -1  -1  -1 ]
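To make the data layout explicit: as assembled by popupmenu1_Callback above, each left-click at a point (x, y) contributes one augmented row [x y 1] to RM (the trailing 1 multiplies the bias weight), and the entry of RRD in the same position holds the desired class label, +1 for the first class and -1 for the second. Reading the example above:

RM(1,:)   % [-19.9495  19.8225  1], first class,  RRD(1) = +1
RM(5,:)   % [-40.1515 -19.8225  1], second class, RRD(5) = -1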
The code below shows how the Cartesian coordinates chosen by the user are collected for the multi-class case. In [Figure 6] we can see the process performed by the code mentioned below.

% --- Executes on selection change in Multi input selection.
function popupmenu2_Callback(hObject,~,~)
global d1 d2 d3 d4 d5 d6 DD ...
    CMM MML1 MML2 MML3 MML4 MML5 MML6 TCMM ...
    TM1 TM2 TM3 TM4 TM5 TM6 TTM
itemsm=get(hObject,'value');
button=1;
ax.XAxisLocation='origin';
ax.YAxisLocation='origin';
box off;
axis([-100 100 -100 100]);
grid on; hold on
Rx1=[]; Ry1=[]; Rx2=[]; Ry2=[];
Rx3=[]; Ry3=[]; Rx4=[]; Ry4=[];
Rx5=[]; Ry5=[]; Rx6=[]; Ry6=[];
D1=[]; D2=[]; D3=[]; D4=[]; D5=[]; D6=[];
U1=[]; U2=[]; U3=[]; U4=[]; U5=[]; U6=[];
A=1; B=1; C=1; D=1; E=1; F=1;
if itemsm==1
    while button==1
        [x,y,button]=ginput(1);
        plot(x,y,'r+'); hold on
        Rx1(A)=x; Ry1(A)=y;
        D1(A)=1; U1(A)=1;
        A=A+1;
    end
elseif itemsm==2
    while button==1
        [x,y,button]=ginput(1);
        plot(x,y,'go'); hold on
        Rx2(B)=x; Ry2(B)=y;
        D2(B)=2; U2(B)=1;
        B=B+1;
    end
elseif itemsm==3
    while button==1
        [x,y,button]=ginput(1);
        plot(x,y,'m*'); hold on
        Rx3(C)=x; Ry3(C)=y;
        D3(C)=3; U3(C)=1;
        C=C+1;
    end
elseif itemsm==4
    while button==1
        [x,y,button]=ginput(1);
        plot(x,y,'ws'); hold on
        Rx4(D)=x; Ry4(D)=y;
        D4(D)=4; U4(D)=1;
        D=D+1;
    end
elseif itemsm==5
    while button==1
        [x,y,button]=ginput(1);
        plot(x,y,'cd'); hold on
        Rx5(E)=x; Ry5(E)=y;
        D5(E)=5; U5(E)=1;
        E=E+1;
    end
else
    while button==1
        [x,y,button]=ginput(1);
        plot(x,y,'y^'); hold on
        Rx6(F)=x; Ry6(F)=y;
        D6(F)=6; U6(F)=1;
        F=F+1;
    end
end

%Matrix mxn, m = rows, n = columns
MML1=[MML1; Rx1' Ry1' U1'];
%[rx11 ry11 1]
%[rx12 ry12 1]
%[rx1n ry1n 1]
MML2=[MML2; Rx2' Ry2' U2'];
MML3=[MML3; Rx3' Ry3' U3'];
MML4=[MML4; Rx4' Ry4' U4'];
MML5=[MML5; Rx5' Ry5' U5'];
MML6=[MML6; Rx6' Ry6' U6'];
CMM=[MML1;MML2;MML3;MML4;MML5;MML6];
% Matrix mx3
%[rx11 ry11 1]
%[rx21 ry22 1]
%[rxmn rymn 1]

TCMM=CMM';
% Matrix 3xn
%[rx11 rx21 rxm1]
%[ry11 ry21 rym1]
%[1    1    1   ]

d1=[d1 D1];
d2=[d2 D2];
d3=[d3 D3];
d4=[d4 D4];
d5=[d5 D5];
d6=[d6 D6];

DD=[d1 d2 d3 d4 d5 d6];
%[1 1 2 2 3 3 ... 6 6 6 6 ... n]

TM1=[TM1; Rx1' Ry1' D1'];
TM2=[TM2; Rx2' Ry2' D2'];
TM3=[TM3; Rx3' Ry3' D3'];
TM4=[TM4; Rx4' Ry4' D4'];
TM5=[TM5; Rx5' Ry5' D5'];
TM6=[TM6; Rx6' Ry6' D6'];
TTM=[TM1;TM2;TM3;TM4;TM5;TM6];

Figure 6. Cartesian coordinates, multi-classes.

The code below shows the calculation of the initial weight vectors and bias; a varied methodology was taken into account depending on the selected type of inputs. To simplify the presentation, the code is shortened to the most important processes. In [Figure 6] and [Figure 7] we can see the process performed by the code mentioned below.

% --- Executes on button press in Random.
function pushbutton4_Callback(handles)
global bias wold input1 input2 line ...
    L1 L2 L3 L4 L5 L6 DD ...
    wold1 wold2 wold3 wold4 wold5 wold6
input1=get(handles.radiobutton1,'value');
input2=get(handles.radiobutton2,'value');

if (input1==1 && input2==0)
    delete(line); wold=rand(1,3);
    x1=wold(1,1); y1=wold(1,2); bias=wold(1,3);
    xx=[-100:0.01:100]; yy=(-(x1*xx+bias)./y1);
    set(handles.edit1,'string',wold(1,1));
    set(handles.edit2,'string',wold(1,2));
    set(handles.edit5,'string',bias);
    line=plot(handles.axes1,xx,yy,'linewidth',2);
elseif (input1==0 && input2==1)
    xx=[-100:0.01:100];
    for i=1:length(DD)
        switch DD(i)
            case 1
                delete(L1); wold1=rand(1,3);
                x1=wold1(1,1); y1=wold1(1,2); bias1=wold1(1,3);
                yy1=(-(x1*xx+bias1)./y1);
                set(handles.edit9,'string',wold1(1,1));
                set(handles.edit10,'string',wold1(1,2));
                set(handles.edit33,'string',bias1);
                L1=plot(handles.axes1,xx,yy1,'r','linewidth',2);
            case 2
                delete(L2); wold2=rand(1,3);
                x2=wold2(1,1); y2=wold2(1,2); bias2=wold2(1,3);
                yy2=(-(x2*xx+bias2)./y2);
                set(handles.edit11,'string',wold2(1,1));
                set(handles.edit12,'string',wold2(1,2));
                set(handles.edit34,'string',bias2);
                L2=plot(handles.axes1,xx,yy2,'g','linewidth',2);
            case 3
                delete(L3); wold3=rand(1,3);
                x3=wold3(1,1); y3=wold3(1,2); bias3=wold3(1,3);
                yy3=(-(x3*xx+bias3)./y3);
                set(handles.edit13,'string',wold3(1,1));
                set(handles.edit14,'string',wold3(1,2));
                set(handles.edit35,'string',bias3);
                L3=plot(handles.axes1,xx,yy3,'m','linewidth',2);
            case 4
                delete(L4); wold4=rand(1,3);
                x4=wold4(1,1); y4=wold4(1,2); bias4=wold4(1,3);
                yy4=(-(x4*xx+bias4)./y4);
                set(handles.edit15,'string',wold4(1,1));
                set(handles.edit16,'string',wold4(1,2));
                set(handles.edit36,'string',bias4);
                L4=plot(handles.axes1,xx,yy4,'w','linewidth',2);
            case 5
                delete(L5); wold5=rand(1,3);
                x5=wold5(1,1); y5=wold5(1,2); bias5=wold5(1,3);
                yy5=(-(x5*xx+bias5)./y5);
                set(handles.edit17,'string',wold5(1,1));
                set(handles.edit18,'string',wold5(1,2));
                set(handles.edit37,'string',bias5);
                L5=plot(handles.axes1,xx,yy5,'c','linewidth',2);
            case 6
                delete(L6); wold6=rand(1,3);
                x6=wold6(1,1); y6=wold6(1,2); bias6=wold6(1,3);
                yy6=(-(x6*xx+bias6)./y6);
                set(handles.edit19,'string',wold6(1,1));
                set(handles.edit20,'string',wold6(1,2));
                set(handles.edit38,'string',bias6);
                L6=plot(handles.axes1,xx,yy6,'y','linewidth',2);
            otherwise
                disp('i is not an option')
        end
    end
else input1==0 && input2==0;
    warndlg('Please, you must select input options');
end

For the two-class case, a randomly generated vector such as

         w1      w2      w3
wold = [0.1343  0.0986  0.1420]

is obtained, where w1 and w2 are the weights and w3 is the bias.
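The straight line drawn by these callbacks is the decision boundary of the neuron: with weights w1, w2 and bias w3, the boundary is the set of points where w1*x + w2*y + w3 = 0. Solving for y gives y = -(w1*x + w3)/w2, which is exactly the expression yy = -(x1*xx + bias)./y1 used in the code above (the variables x1, y1 and bias hold w1, w2 and w3, respectively).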
Figure 7. Weight vector and bias for two classes.

The code below shows the calculations of the discrete and continuous methods for two classes and multi-classes, respectively; a varied methodology was taken into account depending on the selected type of inputs. To simplify the presentation, the code is shortened to the most important processes.

% --- Executes on button press in Calculations.
function pushbutton5_Callback(handles)
global input1 input2 ...
    RRD TRM line RM ...
    L1 L2 L3 L4 L5 L6 ...
    wold c wnew iter l ...
    NN DD WL1 WL2 WL3 WL4 WL5 WL6 TCMM ...
    CMM T3 wold1 wold2 wold3 wold4 wold5 wold6 ...
    TTM

%Part 1
input1=get(handles.radiobutton1,'value');
input2=get(handles.radiobutton2,'value');
Norm1=get(handles.checkbox6,'Value');
Norm2=get(handles.checkbox7,'Value');
Dis=get(handles.checkbox8,'Value');
Con=get(handles.checkbox9,'Value');

%--------------------------------
%Part 2
iter=0; c=0.5; l=2;
if input1==1 && Norm1==1 && Dis==1 && ...
        input2==0 && Norm2==0 && Con==0
    delete(line)
    set(handles.edit1,'enable','off');
    set(handles.edit2,'enable','off');
    set(handles.edit5,'enable','off');
    set(handles.edit8,'enable','off');
    %Normalization Code
    m = mean(RM); s = std(RM);
    RM(:,1)=(RM(:,1)-m(1))/s(1);
    RM(:,2)=(RM(:,2)-m(2))/s(2);
    wnew=wold';
    VN=RM';

    %--------------------------------
    %Part 3
    cla;
    ax.XAxisLocation='origin';
    ax.YAxisLocation='origin';
    box off; axis([-2 2 -2 2]);
    grid on; hold on
    TT=mean(T3); TTT=std(T3);
    T3(:,1)=(T3(:,1)-m(1))/s(1);
    T3(:,2)=(T3(:,2)-m(2))/s(2);
    TVN=T3';

    %--------------------------------
    %Part 4
    k = size((unique(RRD)),2);
    for j=1:length(RRD)
        for tt=1:k
            fd(:,tt)=TVN(:,j);
            switch RRD(j)
                case 1
                    x1(tt)=fd(1,1);
                    y1(tt)=fd(2,1);
                    plot(x1,y1,'r+');
                case -1
                    x2(tt)=fd(1,1);
                    y2(tt)=fd(2,1);
                    plot(x2,y2,'go')
                otherwise
                    disp('i is not an option')
            end
        end
        j=j+1;
    end

    %--------------------------------
    %Part 5
    stop='verdad';
    while strcmp(stop,'verdad')
        error=0;
        for k=1:length(RRD)
            net=wnew'*VN(:,k);
            d=RRD(k); o=sign(net);
            wnew=(wnew+(c*(d-o)*VN(:,k)));
            error=error+abs(d-o);
        end
        f1=wnew'; x1=f1(1,1); y1=f1(1,2); b=f1(1,3);
        xx=[-100:0.01:100]; yy=(-(x1*xx+b)./y1);
        line=plot(handles.axes1,xx,yy,'linewidth',2);
        drawnow
        set(handles.edit3,'string',f1(1,1));
        set(handles.edit4,'string',f1(1,2));
        set(handles.edit6,'string',f1(1,3));
        set(handles.edit7,'string',iter);
        set(handles.edit8,'string',error);
        if error==0
            stop='no';
            break;
        end
        delete(line);
        iter=iter+1;
        title(['Iterations: ',num2str(iter)]);
    end
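The Part 5 loop above is a direct implementation of the discrete perceptron rule (8)-(9): for each training column VN(:,k), the actual response o = sign(net) is compared with the desired response d = RRD(k), and the weight vector is corrected by c*(d-o)*VN(:,k). The quantity accumulated in error is the sum of |d - o| over one pass through the data; when it reaches zero every pattern is classified correctly, the separating line is left on the axes, and the loop terminates.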
The code for the calculation of two classes in normalized discrete mode has five parts, described below:

1. The first part obtains the values selected by the user in order to perform the calculations.

2. The second part obtains the logical condition of the user's selection, separating the cases by number of classes, normalized or non-normalized data, and discrete or continuous method. It also declares the variables used to keep track of the iterations performed, as well as the learning constants.
The following code shows the normalization applied to the data, using the mean and standard deviation formulas together with their equivalent MATLAB commands.

The mean function returns the average value of an array. The MATLAB syntax M = mean(A) returns the mean of the elements of A along the first array dimension whose size does not equal 1.
1. If A is a vector, then mean(A) returns the mean of the elements.
2. If A is a matrix, then mean(A) returns a row vector containing the mean of each column.
3. If A is a multidimensional array, then mean(A) operates along the first array dimension whose size does not equal 1, treating the elements as vectors. This dimension becomes 1 while the sizes of all other dimensions remain the same.

For a random variable vector A made up of N scalar observations, the mean is defined as

μ = (1/N) sum_{i=1}^{N} A_i    (19)

The standard deviation function has the MATLAB syntax S = std(A) and returns the standard deviation of the elements of A along the first array dimension whose size does not equal 1.
1. If A is a vector of observations, then the standard deviation is a scalar.
2. If A is a matrix whose columns are random variables and whose rows are observations, then S is a row vector containing the standard deviations corresponding to each column.
3. If A is a multidimensional array, then std(A) operates along the first array dimension whose size does not equal 1, treating the elements as vectors. The size of this dimension becomes 1 while the sizes of all other dimensions remain the same. By default, the standard deviation is normalized by N - 1, where N is the number of observations.

For a random variable vector A made up of N scalar observations, the standard deviation is defined as

S = sqrt( (1/(N-1)) sum_{i=1}^{N} |A_i - μ|^2 )    (20)

where μ is the mean of A:

μ = (1/N) sum_{i=1}^{N} A_i    (21)

The standard deviation is the square root of the variance. Some definitions of the standard deviation use a normalization factor of N instead of N - 1, which can be selected in std by setting its weight argument w to 1. In [Figure 8] and [Figure 9] we can see the process performed by the code mentioned below.

%--- Normalization Code
m = mean(Vector);
s = std(Vector);
Vector(:,1)=(Vector(:,1)-m(1))/s(1);
Vector(:,2)=(Vector(:,2)-m(2))/s(2);
Normalizated_Vector=Transpose_Of_Vector';

Figure 8. Standardized data, two classes.

Normalized matrix:

RM = [ -0.4435   1.0418  1
       -0.7952   1.6131  1
       -0.8109  -0.6553  1
       -1.4225  -0.0672  1
       -1.0708  -0.0840  1
        1.0708  -1.2098  1
        0.4435  -0.6385  1 ]
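As a quick sanity check (not part of the GUI code), the effect of this z-score normalization can be verified directly in the command window; after the transformation the first two columns of the normalized matrix have mean approximately 0 and standard deviation approximately 1:

mean(RM(:,1:2))   % -> approximately [0 0]
std(RM(:,1:2))    % -> approximately [1 1]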
3. The third and fourth parts are related to each other; they re-plot the classes after they have been normalized.

4. In the fifth part the calculations and comparisons of our data are made [Figure 8], giving us the final classification vector as the final result:

          w1       w2      w3
wnew = [-0.8147  0.7067  0.5312]
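For readers who want to reproduce the discrete two-class procedure outside the GUI, the following is a minimal, self-contained MATLAB sketch of the same steps (z-score normalization, discrete perceptron updates with the bipolar binary activation, and a zero-error stopping rule). It follows the same augmented-pattern layout ([x y 1] rows with +1/-1 labels) as RM and RRD above, but uses a small hand-made, linearly separable data set; all names are local to this example and are not part of the GUI code.

% Minimal, self-contained discrete perceptron training (no GUI).
P = [ 2  3  1        % class +1
      1  4  1
      3  3  1
     -2 -1  1        % class -1
     -1 -3  1
     -3 -2  1];
d = [ 1  1  1 -1 -1 -1];

% z-score normalization of the coordinate columns, as in Part 2 of the callback
m = mean(P); s = std(P);
P(:,1) = (P(:,1)-m(1))/s(1);
P(:,2) = (P(:,2)-m(2))/s(2);

VN = P';                 % patterns as columns
w  = rand(3,1);          % random initial [w1; w2; bias]
c  = 0.5;                % learning constant

for pass = 1:1000                       % upper bound on training passes
    err = 0;
    for k = 1:numel(d)
        o   = sign(w' * VN(:,k));        % bipolar binary response, eq. (7)
        w   = w + c*(d(k) - o)*VN(:,k);  % discrete perceptron rule, eq. (9)
        err = err + abs(d(k) - o);
    end
    if err == 0                          % all patterns classified correctly
        fprintf('Converged after %d passes\n', pass);
        break
    end
end
disp(w')                                 % learned [w1 w2 bias]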
The continuous (delta rule) method for multiple classes is handled by a further branch of the same callback:

%Continuous method multiclasses
elseif (input2==1 && Norm1==1 && Dis==0 && ...
        input1==0 && Norm2==0 && Con==1)
    cla;
    ax.XAxisLocation='origin';
    ax.YAxisLocation='origin';
    box off; axis([-2 2 -2 2]);
    grid on; hold on
    TT=mean(TTM); TTT=std(TTM);
    TTM(:,1)=(TTM(:,1)-TT(1))/TTT(1);
    TTM(:,2)=(TTM(:,2)-TT(2))/TTT(2);
    TVM=TTM';
    %TVM gives us the following matrix after normalization of the data
    %[ax1    ax2    bx1    cx1    --- fxn]
    %[ay1    ay2    by1    cy1    --- fyn]
    %[class1 class1 class2 class3 --- classn]
    k=length(DD);
    kk = size(unique(DD),2);
    for j=1:length(DD)
        for tt=1:k
            fd(:,tt)=TVM(:,j);
            switch DD(j)
                case 1
                    x1(tt)=fd(1,1); y1(tt)=fd(2,1);
                    plot(x1,y1,'r+');
                case 2
                    x2(tt)=fd(1,1); y2(tt)=fd(2,1);
                    plot(x2,y2,'go')
                case 3
                    x3(tt)=fd(1,1); y3(tt)=fd(2,1);
                    plot(x3,y3,'m*');
                case 4
                    x4(tt)=fd(1,1); y4(tt)=fd(2,1);
                    plot(x4,y4,'ws')
                case 5
                    x5(tt)=fd(1,1); y5(tt)=fd(2,1);
                    plot(x5,y5,'cd');
                case 6
                    x6(tt)=fd(1,1); y6(tt)=fd(2,1);
                    plot(x6,y6,'y^')
                otherwise
                    disp('i is not an option')
            end
        end
        j=j+1;
    end

    iter1=0; iter2=0; iter3=0;
    iter4=0; iter5=0; iter6=0;

    WL1=wold1'; WL2=wold2';
    WL3=wold3'; WL4=wold4';
    WL5=wold5'; WL6=wold6';
    %wold1 = [weight1 weight2 bias]
    %WL1   = [weight1]
    %        [weight2]
    %        [bias]

    %Normalization Code
    m = mean(CMM); s = std(CMM);
    CMM(:,1)=(CMM(:,1)-m(1))/s(1);
    CMM(:,2)=(CMM(:,2)-m(2))/s(2);
    VNM=CMM';

    WL11=wold1;
    WL22=wold2;
    WL33=wold3;
    WL44=wold4;
    WL55=wold5;
    WL66=wold6;
    W=[WL11;WL22;WL33;WL44;WL55;WL66];
    %W = [WL11 WL12 bias1]
    %    [WL21 WL22 bias2]
    %    [WLn1 WLn2 biasn]

    %Classification of classes
    Cl = unique(DD);
    for class=1:kk
        for t=1:length(DD)
            if (DD(1,t) == Cl(class))
                tmpLabels(class,t) = 1;
            else
                tmpLabels(class,t) = -1;
            end
        end
    end

    %Calculation of the decision area
    stop='verdad';
    while strcmp(stop,'verdad')
        error = zeros(1,k);
        for j=1:size(VNM,2)
            for t=1:kk
                net = W(t,:)*VNM(:,j);
                d = tmpLabels(t,j);
                fnet=(2/(1+exp(-l*net)))-1;
                dfnet=0.5*(1-(fnet)^2);
                W(t,:)=(W(t,:)' + c*(d-fnet)*dfnet*VNM(:,j))';
                error(t) = error(t) + 0.5*(d - fnet)^2;
            end
        end
        f=W';
        xx=[-100:0.01:100];
        %Element graphics
        for i=1:size(W,1)
            yy=(-(f(1,i)*xx+f(3,i))./f(2,i));
            line(i)=plot(handles.axes1,xx,yy,'linewidth',2);
            drawnow
        end
        if sum(error)==0
            stop='no';
            break;
        end
        delete(line)
    end
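In this branch each class is trained one-against-the-rest: tmpLabels(t,:) holds +1 for the patterns of class t and -1 for all others, and row t of W is adapted with the delta rule of (11)-(18). Here fnet implements the bipolar continuous activation (6) with λ = l, and dfnet = 0.5*(1 - fnet^2) follows its derivative expressed through the output, f'(net) = (λ/2)(1 - f(net)^2); any constant factor in this term is effectively absorbed into the learning constant c. The quantity accumulated in error(t) is the squared error E of (12), summed over the patterns, and sum(error) == 0 is used as the stopping condition.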

Figure 9. Standardized data, multi-classes.

V. CONCLUSION

The perceptron learning rule was originally developed by Frank Rosenblatt in the late 1950s. It provides a simple algorithm for training a perceptron neural network. In the present paper we have considered the various steps of the single-layer perceptron model, and only one epoch is used to update the weights so that the error between the actual output and the target output can be minimized.
REFERENCES

[1] W. McCulloch and W. Pitts (1943), "A logical calculus of the ideas immanent in nervous activity", Bulletin of Mathematical Biophysics, 5(4):115-133.

[2] K. Aizawa (2004), "McCulloch, Warren Sturgis", in: Dictionary of the Philosophy of Mind. Retrieved May 17, 2008.

[3] Y. Freund and R. E. Schapire (1999), "Large margin classification using the perceptron algorithm" (PDF), Machine Learning, 37(3):277-296.

[4] J. M. Zurada, "Introduction to Artificial Neural Systems".

[5] M. T. Hagan, H. B. Demuth, M. Hudson Beale and O. De Jesus, "Neural Network Design", 2nd edition.

[6] S. Haykin, "Neural Networks and Learning Machines", 3rd edition.

[7] D. Kriesel, "A Brief Introduction to Neural Networks".

[8] MATLAB help documentation.

ACKNOWLEDGMENT
The author is very grateful to the Department of Electrical and Electronics Engineering, Karadeniz Technical University, for providing the equipment used in this research work.

BIOGRAPHIES

Mr. José Eduardo Urrea Cabus received his B.Sc. in Electrical and Electronic Engineering from the National Autonomous University of Honduras in the Sula Valley, Cortés, Honduras. He served as an electrical engineer in the private sector in Honduras for more than two years. He is now studying for his M.Sc. degree in Electrical Engineering at Karadeniz Technical University.