A MATLAB BASED APPROACH TO NON LINEARITY PROBLEM OF NEURAL NETWORK


NONLINEARITY PROBLEM (XOR PROBLEM) OF NEURAL NETWORK

AL AMIN SHOHAG 1, Dr. UZZAL KUMAR ACHARJA 2

ABSTRACT

Neural networks are a first and foremost step in machine learning: they provide a basis for a machine to act like a human. A machine must be able to take in many categories of data from the analog world, but most analog-world data is non-linear, and this non-linearity poses a problem, because a basic neural network classifies a dataset linearly. That is, it can only handle problems that are linearly separable. A neural network therefore needs a way to handle non-linearity. In this work, we first test the linear classification behaviour of a neural network using the OR and AND datasets, which are linearly separable. We then illustrate the non-linearity problem using the XOR dataset. Finally, we solve this non-linearity problem and demonstrate the solution in MATLAB.

KEYWORDS

Neural Network, Linearity, Perceptron, Back-propagation Algorithm, XOR, MATLAB

1. Introduction

A neural network is an artificial network that tries to mimic the neural network of the human brain. The brain's neural network consists of many neurons; similarly, an artificial neural network consists of many artificial neurons, and can thus produce results similar to those of the brain's network.

A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. In its most general form, a neural network is a machine that is designed to model the way in which the brain performs a particular task or function of interest. The network is usually implemented using electronic components or simulated in software on a digital computer. In most cases the interest is confined largely to an important class of neural networks that perform useful computations through a process of learning. Such a network resembles the brain in two respects:

1. Knowledge is acquired by the network from its environment through a learning process.

2. Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.

This stored knowledge is what gives the network its ability to generalize.

An artificial neural network consists of many such neurons. The basic model of a neuron consists of a number of synaptic inputs with associated synaptic weights, a summing junction that produces the sum of the products of the synaptic weights and inputs, and an activation function that limits the output of the neuron. A basic model of a neuron is shown below:

Fig: Basic model of a neuron

1.2 Linearity

A neural network takes a problem and tries to generalize it into classes. This generalization into classes is the linearity approach of a neural network: it tries to draw one or more straight boundary lines that split the dataset into classes based on shared features of the problem dataset. For example, the AND and OR operations have the input datasets below:

A  B  A AND B
0  0  0
0  1  0
1  0  0
1  1  1

Fig: AND operation

A  B  A OR B
0  0  0
0  1  1
1  0  1
1  1  1

Fig: OR operation

With these datasets a neural network will classify the outputs into two classes by producing a single linear boundary line. For AND, one side of the boundary contains the three 0 outputs and the other side the single 1; for OR, one side contains the single 0 and the other side the three 1s. Classifying these datasets requires only a single-layer perceptron.

But in the case of XOR, where the outputs are not linearly separable, a single perceptron cannot produce a linear classification. The dataset for XOR is shown below:

A  B  A XOR B
0  0  0
0  1  1
1  0  1
1  1  0

In this case, a multilayer perceptron is needed. We will see how a multilayer perceptron can solve

this problem in later sections.

2. Perceptron

A perceptron is the simplest form of a neural network, used for the classification of a special type of dataset said to be linearly separable. A perceptron is shown below:

Fig: perceptron

In the case of an elementary perceptron, there are two decision regions separated by a hyperplane defined by the equation

    Σ (i = 1 to l) w_ki x_i − θ = 0

where the w_ki are the synaptic weights, the x_i are the inputs, and θ is the threshold value. For example, a single-layer perceptron can classify the OR and AND datasets linearly, because these datasets are linearly separable.

Fig: OR dataset in the (x1, x2) plane — a single line separates the 0 output from the three 1s

Fig: AND dataset in the (x1, x2) plane — a single line separates the 1 output from the three 0s

But it cannot classify problems that are not linearly separable, such as the XOR dataset.

Fig: XOR dataset in the (x1, x2) plane — no single line can separate the two classes

As we can see, the XOR dataset is not linearly separable. To solve this problem we need a multilayer perceptron. In the next section we discuss the multilayer perceptron and how it solves this problem using the back-propagation algorithm.
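The separability argument can be made concrete with the classic perceptron learning rule. The following sketch (in Python/NumPy for illustration; the paper's own demonstrations use MATLAB) trains a single perceptron on both datasets: it converges on OR, but on XOR it is still making errors when the epoch budget runs out:

```python
import numpy as np

def train_perceptron(X, t, epochs=100, eta=0.1):
    """Perceptron learning rule on inputs X (one pattern per row) and
    targets t. Returns (weights, bias, errors_in_last_epoch)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, t):
            y = 1.0 if np.dot(w, x) + b >= 0 else 0.0   # threshold output
            if y != target:                              # error-correction step
                w += eta * (target - y) * x
                b += eta * (target - y)
                errors += 1
        if errors == 0:          # converged: every pattern classified correctly
            break
    return w, b, errors

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t_or  = np.array([0, 1, 1, 1], dtype=float)   # linearly separable
t_xor = np.array([0, 1, 1, 0], dtype=float)   # not linearly separable

_, _, err_or  = train_perceptron(X, t_or)
_, _, err_xor = train_perceptron(X, t_xor)
print(err_or)    # 0 -- the perceptron converged on OR
print(err_xor)   # nonzero -- XOR still misclassified after every epoch
```

By the perceptron convergence theorem the OR run must terminate with zero errors; for XOR no weight vector exists that classifies all four patterns, so the error count never reaches zero.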

3. Multilayer perceptron

A multilayer perceptron has one input layer, one or more hidden layers, and one output layer. A multilayer perceptron is shown below:

Fig: Multilayer perceptron

Multilayer perceptrons have been applied successfully to solve some difficult and diverse

problems by training them in a supervised manner with a highly popular algorithm known as the

error back-propagation algorithm. This algorithm is based on the error-correction learning rule.

Error back propagation learning consists of two passes through the different layers of the

network, a forward pass and a backward pass.

In the forward pass, an activity pattern is applied to the sensory nodes of the network and its effect propagates through the network layer by layer. Finally, a set of outputs is produced as the actual response of the network. During the forward pass, the synaptic weights of the network are all fixed.

During the backward pass, on the other hand, the synaptic weights are all adjusted in accordance with an error-correction rule. Specifically, the actual response of the network is subtracted from a desired response to produce an error signal. This error signal is then propagated backward through the network, against the direction of the synaptic connections; hence the name error back-propagation. The synaptic weights are adjusted to make the actual response of the network move closer to the desired response in a statistical sense.

We define the instantaneous value of the error energy for neuron j as (1/2) e_j^2(n). Correspondingly, the instantaneous value E(n) of the total error energy is obtained by summing (1/2) e_j^2(n) over all neurons in the output layer; these are the only visible neurons for which error signals can be calculated directly. We may thus write

    E(n) = (1/2) Σ (j ∈ C) e_j^2(n)

where the set C includes all the neurons in the output layer.

The instantaneous error energy E(n), and therefore the average error energy E_av, is a function of all the free parameters of the network. For a given training set, E_av represents the cost function as a measure of learning performance; the objective of the learning process is to adjust the free parameters of the network so as to minimize E_av. For this we consider a simple method of training in which the weights are updated on a pattern-by-pattern basis until one epoch, that is, one complete presentation of the entire training set, has been dealt with. The adjustments to the weights are made in accordance with the respective errors computed for each pattern presented to the network.

The induced local field v_j(n) produced at the input of the activation function associated with neuron j is therefore

    v_j(n) = Σ (i = 0 to m) w_ji(n) y_i(n)

where m is the total number of inputs applied to neuron j. The synaptic weight w_j0 equals the bias b_j applied to neuron j. Hence the function signal y_j(n) appearing at the output of neuron j at iteration n is

    y_j(n) = φ_j(v_j(n))

The back-propagation algorithm applies a correction Δw_ji(n) to the synaptic weight w_ji(n), which is proportional to the partial derivative ∂E(n)/∂w_ji(n). According to the chain rule of calculus, we may express this gradient as

    ∂E(n)/∂w_ji(n) = ∂E(n)/∂e_j(n) · ∂e_j(n)/∂y_j(n) · ∂y_j(n)/∂v_j(n) · ∂v_j(n)/∂w_ji(n)

The partial derivative ∂E(n)/∂w_ji(n) represents a sensitivity factor, determining the direction of search in weight space for the synaptic weight w_ji. Now, let us calculate each factor of this partial derivative:

    ∂E(n)/∂e_j(n) = e_j(n)

and

    ∂e_j(n)/∂y_j(n) = −1

and

    ∂y_j(n)/∂v_j(n) = φ'_j(v_j(n))

and

    ∂v_j(n)/∂w_ji(n) = y_i(n)

Thus, the partial derivative ∂E(n)/∂w_ji(n) becomes:

    ∂E(n)/∂w_ji(n) = −e_j(n) φ'_j(v_j(n)) y_i(n)

The correction Δw_ji(n) applied to w_ji(n) is defined by the delta rule:

    Δw_ji(n) = −η ∂E(n)/∂w_ji(n)

where η is the learning-rate parameter. Accordingly,

    Δw_ji(n) = η δ_j(n) y_i(n)

where the local gradient δ_j(n) is defined by

    δ_j(n) = −∂E(n)/∂v_j(n) = e_j(n) φ'_j(v_j(n))

The local gradient points to the required changes in synaptic weights.
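The delta-rule gradient derived above can be checked numerically for a single sigmoid output neuron. This is an illustrative Python/NumPy sketch (not part of the paper's MATLAB demonstrations); the input signals, weights, and desired response are arbitrary example values:

```python
import numpy as np

phi  = lambda v: 1.0 / (1.0 + np.exp(-v))   # sigmoid activation
dphi = lambda v: phi(v) * (1.0 - phi(v))    # its derivative phi'(v)

y_in = np.array([1.0, 0.5, -0.3])   # example input signals y_i to neuron j
w    = np.array([0.2, -0.4, 0.1])   # example synaptic weights w_ji
d    = 1.0                          # desired response

def E(weights):
    """Instantaneous error energy (1/2) e_j^2(n) for this one neuron."""
    e = d - phi(np.dot(weights, y_in))
    return 0.5 * e**2

v = np.dot(w, y_in)
e = d - phi(v)
analytic = -e * dphi(v) * y_in      # dE/dw_ji = -e_j phi'(v_j) y_i

# Central-difference numerical gradient, one weight at a time
eps = 1e-6
numeric = np.array([(E(w + eps * np.eye(3)[i]) - E(w - eps * np.eye(3)[i])) / (2 * eps)
                    for i in range(3)])
print(np.max(np.abs(analytic - numeric)))   # tiny: the two gradients agree
```

The agreement confirms the sign and each factor of the chain-rule product term by term.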

1. Initialization: Assuming that no prior information is available, pick the synaptic weights

and thresholds from a uniform distribution whose mean is zero and whose variance is

chosen to make the standard deviation of the induced local fields of the neurons lie at the

transition between the linear and saturated parts of the sigmoid activation function.

2. Presentation of training examples: Present the network with an epoch of the training examples; for each example in the set, ordered in some fashion, perform the sequence of forward and backward computations described under points 3 and 4, respectively.

3. Forward computation: With the input vector x(n) applied to the input layer of sensory nodes and the desired response vector d(n) presented to the output layer of computation nodes, compute the induced local fields and function signals of the network by proceeding forward through the network, layer by layer. The induced local field v_j^(l)(n) for neuron j in layer l is

    v_j^(l)(n) = Σ_i w_ji^(l)(n) y_i^(l−1)(n)

where y_i^(l−1)(n) is the output signal of neuron i in the previous layer (l − 1) at iteration n, and w_ji^(l)(n) is the synaptic weight of neuron j in layer l that is fed from neuron i in layer l − 1. For i = 0 we have y_0^(l−1)(n) = +1, and w_j0^(l)(n) = b_j^(l)(n) is the bias applied to neuron j in layer l. Assuming the use of a sigmoid function, the output signal of neuron j in layer l is

    y_j^(l)(n) = φ_j(v_j^(l)(n))

If neuron j is in the first hidden layer, set

    y_j^(0)(n) = x_j(n)

where x_j(n) is the jth element of the input vector x(n). If neuron j is in the output layer (layer L), set

    y_j^(L)(n) = o_j(n)

Compute the error signal:

    e_j(n) = d_j(n) − o_j(n)

where d_j(n) is the jth element of the desired response vector d(n).

4. Backward computation: Compute the local gradients δ of the network, defined by

    δ_j^(L)(n) = e_j(n) φ'_j(v_j^(L)(n))                             for neuron j in output layer L

    δ_j^(l)(n) = φ'_j(v_j^(l)(n)) Σ_k δ_k^(l+1)(n) w_kj^(l+1)(n)     for neuron j in hidden layer l

Adjust the synaptic weights of the network in layer l according to the generalized delta rule:

    w_ji^(l)(n+1) = w_ji^(l)(n) + α [w_ji^(l)(n) − w_ji^(l)(n−1)] + η δ_j^(l)(n) y_i^(l−1)(n)

where η is the learning-rate parameter and α is the momentum constant.

5. Iteration: Iterate the forward and backward computations under points 3 and 4 by

presenting new epochs of training examples to the network until the stopping criterion is

met.
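The five steps above can be sketched end to end on the XOR dataset. The following Python/NumPy implementation is an illustration only, separate from the MATLAB demonstration later; the hidden-layer size, learning rate, epoch count, and random seed are arbitrary choices, and batch (per-epoch) updates are used rather than the pattern-by-pattern updates described above:

```python
import numpy as np

rng = np.random.default_rng(0)
phi = lambda v: 1.0 / (1.0 + np.exp(-v))         # sigmoid activation

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([[0], [1], [1], [0]], dtype=float)  # XOR desired responses

# Step 1 -- Initialization: small random weights, 2 inputs -> 4 hidden -> 1 output
W1, b1 = rng.uniform(-1, 1, (4, 2)), np.zeros(4)
W2, b2 = rng.uniform(-1, 1, (1, 4)), np.zeros(1)
eta = 0.8                                        # learning-rate parameter

for epoch in range(20000):                       # Steps 2 and 5 -- present epochs, iterate
    # Step 3 -- forward computation
    v1 = X @ W1.T + b1; y1 = phi(v1)             # hidden-layer function signals
    v2 = y1 @ W2.T + b2; y2 = phi(v2)            # output-layer function signals
    e = d - y2                                   # error signal e_j = d_j - o_j
    # Step 4 -- backward computation: local gradients, then delta-rule updates
    delta2 = e * y2 * (1 - y2)                   # output-layer local gradient
    delta1 = (delta2 @ W2) * y1 * (1 - y1)       # hidden-layer local gradient
    W2 += eta * delta2.T @ y1; b2 += eta * delta2.sum(axis=0)
    W1 += eta * delta1.T @ X;  b1 += eta * delta1.sum(axis=0)

out = phi(phi(X @ W1.T + b1) @ W2.T + b2)
print(np.round(out.ravel()))                     # rounded network outputs per input
```

With these settings the network typically drives the error energy close to zero and the rounded outputs reproduce the XOR truth table.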

We may solve the XOR problem by using a single hidden layer with two neurons. The signal-flow graph of the network is shown below. The following assumptions are made here:

1. Each neuron is represented by a McCulloch-Pitts model, which uses a threshold function for its activation function.

2. Bits 0 and 1 are represented by the levels 0 and 1, respectively.

Fig: Signal-flow graph of the network for solving the XOR problem

    w11 = w12 = +1   and   b1 = −3/2

    w21 = w22 = +1   and   b2 = −1/2

    w31 = −2,  w32 = +1   and   b3 = −1/2

Fig: Decision boundaries constructed by hidden neuron 1, hidden neuron 2, and the complete network for the XOR problem
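The hand-crafted network can be verified directly. The sketch below (Python for illustration; the paper's demonstrations use MATLAB) wires up threshold neurons with the weights and biases listed above and evaluates all four input patterns:

```python
def step(v):
    """McCulloch-Pitts threshold activation."""
    return 1 if v >= 0 else 0

def xor_net(x1, x2):
    # Hidden neuron 1: w11 = w12 = +1, bias b1 = -3/2 -- fires only for (1, 1)
    h1 = step(1 * x1 + 1 * x2 - 1.5)
    # Hidden neuron 2: w21 = w22 = +1, bias b2 = -1/2 -- fires unless (0, 0)
    h2 = step(1 * x1 + 1 * x2 - 0.5)
    # Output neuron: w31 = -2, w32 = +1, bias b3 = -1/2 -- computes h2 AND NOT h1
    return step(-2 * h1 + 1 * h2 - 0.5)

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# → [0, 1, 1, 0]  (the XOR truth table)
```

The output neuron effectively masks out the (1, 1) case detected by hidden neuron 1, which is exactly why two hidden neurons suffice.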

3.3 MATLAB Demonstration

In the MATLAB demonstration we test linearity for the AND and OR datasets with a perceptron, and test non-linearity for the XOR dataset with the same perceptron. We then show how a multilayer perceptron solves the non-linearity problem for the XOR dataset. We use confusion plots for all of these tests.

3.3.1 OR Dataset test for single perceptron with no hidden layer

MATLAB code for the OR dataset is given below:

clc;
close all;
x = [0 0; 0 1; 1 1; 1 0];   % input patterns, one per row
i = x';                     % transpose: one pattern per column, as train expects
t = [0 1 1 1];              % OR targets for the columns of i
net = perceptron;
view(net);
net = train(net, i, t);
y = net(i);
plotconfusion(t, y);

The confusion plot is shown below:

3.3.2 AND Dataset test for single perceptron with no hidden layer

MATLAB code for the AND dataset is given below:

clc;
close all;
x = [0 0; 1 1; 0 1; 1 0];   % input patterns, one per row
i = x';                     % one pattern per column
t = [0 1 0 0];              % AND targets: only the (1,1) column maps to 1
net = perceptron;
view(net);
net = train(net, i, t);
y = net(i);
plotconfusion(t, y);

The confusion plot is shown below:

3.3.3 XOR Dataset test for single perceptron with no hidden layer

MATLAB code for the XOR dataset is given below:

clc;
close all;
x = [0 0; 1 1; 0 1; 1 0];   % input patterns, one per row
i = x';                     % one pattern per column
t = [0 0 1 1];              % XOR targets: (0,1) and (1,0) map to 1
net = perceptron;
view(net);
net = train(net, i, t);
y = net(i);
plotconfusion(t, y);

The confusion plot is shown below:

As we can see from the confusion plot, the perceptron fails to classify the XOR dataset correctly for the targeted outputs. So a single perceptron with no hidden layer cannot solve the XOR problem. Now let us see whether a network with one hidden layer can solve it.

3.3.4 XOR Dataset test for a perceptron with a hidden layer and the back-propagation training algorithm

MATLAB code for the XOR dataset is given below:

clc;
close all;
x = [0 0; 1 1; 0 1; 1 0];            % input patterns, one per row
i = x';                              % one pattern per column
t = [0 0 1 1];                       % XOR targets
net = feedforwardnet(2, 'trainrp');  % one hidden layer with 2 neurons
view(net);
net = train(net, i, t);
y = net(i);
plotconfusion(t, y);

The confusion plot is shown below:

As we can see, we now get a correct classification of the XOR dataset, solving the problem that a perceptron with no hidden layer could not.

4. Conclusion

We have shown that a single perceptron with no hidden layer cannot classify the XOR dataset linearly. We have also shown that this problem can be solved by a perceptron with a single hidden layer trained with the back-propagation algorithm.

5. References

[1] Simon Haykin, Neural Networks: A Comprehensive Foundation, McMaster University, Hamilton, Ontario, Canada.

[2] S. N. Nawaz, M. Sarfaraz, A. Zidouri and W. G. Al-Khatib, "An approach to offline Arabic character recognition using neural networks."

[3] YouTube

[4] Wikipedia
