
(IJAEST) International Journal of Advanced Engineering Sciences and Technologies, Vol. 11, Issue 1, pp. 162-171
Use of Artificial Bee Colony (ABC) Algorithm in Artificial Neural Network Synthesis

Pravin Yallappa Kumbhar, Dept. of Electronics and Telecommunication, Vivekanand Education Society Institute of Technology, Mumbai University, Mumbai (Maharashtra), pravinkumbhar03@gmail.com
Prof. Shoba Krishnan, Dept. of Electronics and Telecommunication, Vivekanand Education Society Institute of Technology, Mumbai University, Mumbai (Maharashtra)

Abstract—The artificial bee colony (ABC) algorithm has been used in several optimization problems, including the optimization of the synaptic weights of an Artificial Neural Network (ANN). However, this alone is not enough to generate a robust ANN. For that reason, some authors have proposed methodologies based on so-called metaheuristics that automatically design an ANN, taking into account not only the optimization of the synaptic weights but also the ANN's architecture and the transfer function of each neuron. However, those methodologies do not generate a reduced design (synthesis) of the ANN. In this paper, we present an ABC-based methodology that maximizes accuracy and minimizes the number of connections of an ANN by evolving, at the same time, the synaptic weights, the ANN's architecture and the transfer functions of each neuron. The methodology is tested with several pattern recognition classification problems.

Keywords- ARTIFICIAL NEURAL NETWORKS, ARTIFICIAL BEE COLONY ALGORITHM, METHODOLOGY, STANDARD DEVIATION

I. INTRODUCTION

Artificial Neural Networks are commonly used in pattern classification, function approximation, optimization, pattern matching, machine learning and associative memories. They are currently an alternative to traditional statistical methods for mining data sets in order to classify data. Artificial Neural Networks are a well-established technology for solving prediction and classification problems, using training and testing data to build a model. However, the success of the networks is highly dependent on the performance of the training process and hence on the training algorithm. In this paper, we apply the Artificial Bee Colony (ABC) optimization algorithm to the training of feed-forward neural networks in order to classify different data sets that are widely used in the machine learning community. The performance of the ABC algorithm is investigated on benchmark classification problems and the results are compared with those of other well-known conventional and evolutionary algorithms. The results indicate that the ABC algorithm can be used efficiently to train feed-forward neural networks for the purpose of pattern classification.

Among the broad application areas of artificial neural networks, pattern recognition is one of the most important; it covers speech synthesis, diagnostic problems, medicine, finance, robotic control, signal processing, computer vision and many other problems [3]. Among the many different neural network classifiers, multilayer feed-forward networks have mainly been used for solving classification tasks, due to their well-known universal approximation capabilities. The success of neural networks largely depends on their architecture, their training algorithm, and the choice of features used in training. Artificial neural networks (ANN) are very important tools for solving different kinds of problems such as pattern classification, forecasting and regression. However, their design implies a trial-and-error mechanism that tests different architectures and transfer functions, together with the selection of a training algorithm that adjusts the synaptic weights of the ANN. This design is very important because the wrong selection of one of these characteristics can cause the training algorithm to become trapped in a local minimum. Because of this, several metaheuristic-based methods for obtaining a good ANN design have been reported.

In [1], Xin Yao presents a state of the art in which evolutionary algorithms are used to evolve the synaptic weights and the architecture, in some cases with the help of classic techniques such as the back-propagation algorithm. There are also works such as [2], where the authors automatically evolve the design of an ANN using basic PSO, second-generation PSO (2GPSO) and a new technique (NMPSO). In [3], the authors design an ANN by means of the DE algorithm and compare it with other bio-inspired techniques. In these last two works, the authors evolve, at the same time, the principal features of an ANN: the synaptic weights, the transfer functions of each neuron and the architecture. However, the architectures obtained by these two methods contain many connections. In [4] the authors train an ANN by means of the ABC algorithm. In [5] the authors apply this algorithm to train a feed-forward network for solving the XOR, 3-bit parity and 4-bit encoder-decoder problems. In the pattern classification area, in other works such as [6] the ABC algorithm is compared with other evolutionary techniques, while in [7] an ANN is trained for medical pattern classification. Another problem solved by applying the ABC algorithm can be found in [8], where the authors experiment with clustering problems. In [9], the authors train an RBF neural network using the ABC algorithm.

In this work, four characteristics of this kind of ANN are optimized: the weights between the hidden layer and the output layer, the spread parameters of the hidden-layer basis functions, the centre vectors of the hidden layer and the bias parameters of the neurons of the output layer. In [10] an ANN is trained to estimate and model the daily reference evapotranspiration of two USA stations. There are other kinds of algorithms based on bee behaviour that have been applied to train an ANN. For example, in [11] the bees algorithm is used to identify wood defects, while in [12] the same algorithm is applied to optimize the synaptic weights of an ANN. In [13] a good review concerning this kind of bio-inspired algorithm applied to different problems is given; it states that the ABC algorithm is a good optimization technique. In this paper we want to verify whether this algorithm performs well in the automatic design of an ANN, covering not only the synaptic weights but also the architecture and the transfer functions of the neurons. As we will see, the architectures obtained are optimal in the sense that the number of connections is minimal without losing efficiency. The paper is organized as follows: in Section II we briefly present the basics of ANNs. In Section III we explain the fundamental concepts of the ABC algorithm, while in Section IV we describe how the ABC algorithm is used to design an ANN and how the ANN's architecture can be optimized. In Section V the experimental results obtained on different classification problems are given. Finally, in Section VI we present the conclusions of the work.

II. ARTIFICIAL NEURAL NETWORKS

An ANN tries to simulate the brain's behaviour when it generates, stores or transforms information. An ANN is a system made up of simple processing units; this system offers an input-output mapping property and capability [14]. Each processing unit operates in two stages: a weighted summation and some type of non-linear function. This allows the ANN to carry out a learning stage over the input data representing the problem to be solved. Each value of an input pattern a ∈ IR^N is associated with a synaptic weight value in W ∈ IR^N, which is normally between 0 and 1. Furthermore, the summation often takes an extra input value θ with weight value 1, representing a threshold or bias for the neuron. The summation is then performed as in Eq. 1.
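
Eq. 1 itself is not reproduced in this copy of the text; a standard form consistent with the description above (a weighted sum of the inputs plus the bias term θ) is:

o = \sum_{i=1}^{N} w_i a_i + \theta        (Eq. 1)
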


The sum of the products is then passed to the second stage, which applies the activation function f(o); this generates the output of the neuron and determines the behaviour of the neural model. By connecting multiple neurons, the true computing power of the ANN emerges. The most common structure for connecting neurons is in layers. In a multilayer structure, the input nodes, which receive the pattern a ∈ IR^N, pass the information to the units in the first hidden layer; the outputs of this first hidden layer are passed to the next layer, and so on until the output layer is reached, thus producing an approximation of the desired output y ∈ IR^M.
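
As a minimal illustration of this two-stage processing unit, the sketch below computes the output of a single neuron and of one fully connected layer. The logistic (logsig) transfer function and the numeric values are illustrative choices only, not taken from the paper.

---------------------------------------------------------------------------
import math

def neuron_output(inputs, weights, bias,
                  transfer=lambda o: 1.0 / (1.0 + math.exp(-o))):
    """Two-stage processing unit: weighted summation (Eq. 1), then a transfer function."""
    o = sum(w * a for w, a in zip(weights, inputs)) + bias  # first stage: weighted sum plus bias
    return transfer(o)                                      # second stage: non-linear activation

def layer_output(inputs, weight_matrix, biases):
    """One fully connected layer; each row of weight_matrix belongs to one neuron."""
    return [neuron_output(inputs, row, b) for row, b in zip(weight_matrix, biases)]

if __name__ == "__main__":
    a = [0.2, 0.7, 0.1]                        # an input pattern a in IR^3 (illustrative)
    W = [[0.5, -0.3, 0.8], [0.1, 0.9, -0.4]]   # synaptic weights of two neurons
    b = [0.05, -0.2]                           # bias (threshold) of each neuron
    print(layer_output(a, W, b))
---------------------------------------------------------------------------
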
Basically, learning is a process by which the free parameters (i.e., the synaptic weights W and the bias levels θ) of an ANN are adapted through a continuous process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place. The learning process may be classified as supervised or unsupervised. In this paper we focus on supervised learning, which assumes the availability of a labeled set of training data made up of p input-output samples (see Eq. 2), where a is the input pattern and d the desired response. Given the training sample T, the requirement is to compute the free parameters of the neural network so that the actual output y of the ANN due to a is close enough to d for every sample, in a statistical sense. To this end, we may use the mean square error (MSE) given in Eq. 3 as the first objective function to be minimized.
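
Eqs. 2 and 3 are likewise missing from this copy; forms consistent with the description (a labeled set of p input-output samples and the mean square error over it) are:

T = \{ (a^{\rho}, d^{\rho}) \}_{\rho = 1}^{p}        (Eq. 2)

MSE = \frac{1}{p} \sum_{\rho=1}^{p} \sum_{j=1}^{M} \left( d_j^{\rho} - y_j^{\rho} \right)^2        (Eq. 3)
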
One of the most widely used ANNs is the feed-forward neural network, trained by means of the back-propagation (BP) algorithm [15], [16]. This algorithm minimizes the objective function described by Eq. 3. Such algorithms constantly adjust the values of the synaptic weights until the value of the error no longer decreases. However, these classic algorithms can converge to a local minimum instead of the desired global minimum. Furthermore, the architecture and the transfer functions used in the design influence the ANN's performance; consequently, the learning algorithm can become trapped in a minimum far away from the best solution.

III. ARTIFICIAL BEE COLONY ALGORITHM

The Artificial Bee Colony (ABC) algorithm is based on the metaphor of the foraging behaviour of bees. The natural selection that created this system of communication can also be seen within the system: information about different parts of the environment behaves like species in competition, and the fitness of a species is given by the profitability of the food source it describes. Information survives by continuing to circulate within the nest, and is capable of reproducing itself by recruiting new foragers who become informed of the food source, come back to the nest and share their information [17]. The ABC algorithm was proposed by Karaboga in 2005 [18] for solving numerical optimization problems, and it is based on the model proposed by Tereshko and Loengarov [17].


The algorithm works with a set of possible solutions x_i (the population) that are represented by the positions of the food sources. In order to find the best solution, three classes of bees are used: employed bees, onlooker bees and scout bees. These bees have different tasks in the colony, i.e., in the search space.

Employed bees: Each employed bee searches for a new food source in the neighbourhood of the one stored in its memory. It then compares the new food source against the old one using Eq. 4 and keeps the better one in its memory. After that, the bee evaluates the quality of each food source based on the amount of nectar (the information), i.e., the fitness function is calculated. Finally, it returns to the dancing area in the hive, where the onlooker bees are.

Onlooker bees: These bees watch the dances of the employed bees in order to learn where a food source can be found, whether its nectar is of high quality, and how large the food source is. An onlooker bee chooses a food source probabilistically, depending on the amount of nectar shown by each employed bee; see Eq. 5, where fit_i is the fitness value of solution i and SN is the number of food sources, which is equal to the number of employed bees.

Scout bees: These bees help the colony by creating new random solutions when a food source cannot be improved any more; see Eq. 6. This condition is called the "limit" or "abandonment" criterion.
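
Eqs. 4, 5 and 6 are not reproduced in this copy. The standard ABC update rules introduced by Karaboga [18], which the descriptions above match, are the neighbour search of the employed bees, the selection probability used by the onlookers, and the random re-initialization performed by the scouts:

v_{ij} = x_{ij} + \phi_{ij} (x_{ij} - x_{kj}), \quad \phi_{ij} \in [-1, 1]        (Eq. 4)

p_i = \frac{fit_i}{\sum_{n=1}^{SN} fit_n}        (Eq. 5)

x_{ij} = x_j^{min} + rand(0,1) (x_j^{max} - x_j^{min})        (Eq. 6)

where k ≠ i is a randomly chosen food source and j a randomly chosen dimension. These are the textbook forms; the paper's own equations may differ slightly in notation.
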
The pseudo-code of the ABC algorithm is shown next:

--------------------------------------------------------------------------
1: Initialize the population of solutions x_i, i = 1, ..., SN
2: Evaluate the population x_i, i = 1, ..., SN
3: for cycle = 1 to MCN do
4:   Produce new solutions v_i for the employed bees by using Eq. 4 and evaluate them.
5:   Apply the greedy selection process.
6:   Calculate the probability values p_i for the solutions x_i by Eq. 5.
7:   Produce the new solutions v_i for the onlookers from the solutions x_i selected depending on p_i and evaluate them.
8:   Apply the greedy selection process.
9:   Determine the abandoned solution for the scout, if it exists, and replace it with a new randomly produced solution x_i by Eq. 6.
10:  Memorize the best solution achieved so far.
11:  cycle = cycle + 1
12: end for
--------------------------------------------------------------------------
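
The following is a compact, generic Python sketch of the loop above, using the standard rules of Eqs. 4-6 on an arbitrary continuous objective. It only illustrates the ABC mechanics and is not the authors' implementation; the sphere objective and all numeric parameters are placeholders.

---------------------------------------------------------------------------
import random

def abc_minimize(objective, dim, bounds, SN=25, limit=100, MCN=1000):
    """Minimal Artificial Bee Colony sketch: employed, onlooker and scout phases."""
    lo, hi = bounds
    fitness = lambda f: 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)  # common ABC fitness mapping
    foods = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(SN)]
    costs = [objective(x) for x in foods]
    trials = [0] * SN
    best_x, best_cost = min(zip(foods, costs), key=lambda t: t[1])

    def try_neighbour(i):
        nonlocal best_x, best_cost
        k = random.choice([s for s in range(SN) if s != i])   # a different food source
        j = random.randrange(dim)                              # one random dimension
        v = foods[i][:]
        v[j] = foods[i][j] + random.uniform(-1, 1) * (foods[i][j] - foods[k][j])  # Eq. 4
        v[j] = min(max(v[j], lo), hi)
        c = objective(v)
        if c < costs[i]:                                       # greedy selection
            foods[i], costs[i], trials[i] = v, c, 0
            if c < best_cost:
                best_x, best_cost = v[:], c
        else:
            trials[i] += 1

    for _ in range(MCN):
        for i in range(SN):                                    # employed bee phase
            try_neighbour(i)
        fits = [fitness(c) for c in costs]
        total = sum(fits)
        for _ in range(SN):                                    # onlooker bee phase (Eq. 5)
            r, acc, i = random.uniform(0, total), 0.0, 0
            for i, f in enumerate(fits):
                acc += f
                if acc >= r:
                    break
            try_neighbour(i)
        worst = max(range(SN), key=lambda i: trials[i])        # scout bee phase (Eq. 6)
        if trials[worst] > limit:
            foods[worst] = [random.uniform(lo, hi) for _ in range(dim)]
            costs[worst] = objective(foods[worst])
            trials[worst] = 0
    return best_x, best_cost

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)                   # placeholder objective
    print(abc_minimize(sphere, dim=5, bounds=(-4.0, 4.0), SN=25, limit=100, MCN=500))
---------------------------------------------------------------------------
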
IV. METHODOLOGY

The main aim of our methodology is to evolve, at the same time, the synaptic weights, the architecture (or topology) and the transfer functions of each neuron, so as to obtain a minimum mean square error (MSE) as well as a minimum classification error (CER). At the same time, we seek to optimize the ANN's architecture by reducing the number of neurons and their connections. The problem to be solved can be defined as follows: given a set of input patterns X = {x_1, ..., x_p}, x ∈ IR^n, and a set of desired patterns D = {d_1, ..., d_p}, d ∈ IR^m, find an ANN represented by a matrix W ∈ IR^(q×(q+2)) such that a function defined by min F(D, X, W) is minimized. The codification of the ANN design to be evolved by the ABC algorithm is given in Fig. 1. This figure shows the food source's position representing the solution to the problem. The solution is defined by a matrix W ∈ IR^(q×(q+2)) composed of three main parts: the topology (T), the synaptic weights (SW) and the transfer functions (F), where q is the maximum number of neurons MNN, defined by q = 2(m + n) (recall that n is the dimension of the input pattern vector and m is the dimension of the desired pattern vector). The three parts of the matrix W take values from three different ranges: for the topology, the range is [1, 2^MNN − 1]; for the synaptic weights it is [−4, 4]; and for the transfer functions it is [1, nF], where nF is the number of transfer functions.

Fig. 1. Representation of the individual codifying the architecture, synaptic weights and transfer functions.

The ANN's topology is codified using the binary square-matrix representation of a graph x, where each component x_ij represents a connection between neuron i and neuron j when x_ij = 1. This information is codified in its decimal base. For example, suppose that the binary code "01101" represents the connections of the i-th neuron to five neurons. From this binary code, we can observe that only neurons two, three and five are connected to neuron i. This binary code is transformed into its decimal value, resulting in 13; this is the number that we evolve instead of the binary value. This scheme is much faster to manipulate: instead of evolving a string of bits, we evolve a decimal number. The synaptic weights of the ANN are likewise codified by the square-matrix representation of a graph x, where each component x_ij represents the synaptic weight between neuron i and neuron j. Finally, the transfer function of each neuron is represented by an integer in the range [0, 5], codifying one of the six transfer functions used in this research: logsig, tansig, sin, radbas, purelin and hardlim. These functions were selected because they are among the most popular and useful transfer functions for several kinds of problems.
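
The sketch below illustrates this encoding. It assumes, purely for illustration, that each row of the individual holds one neuron's topology integer, its candidate synaptic weights and its transfer-function index; the exact column layout shown in Fig. 1 is not reproduced here, so this row structure is an assumption.

---------------------------------------------------------------------------
TRANSFER_FUNCTIONS = ["logsig", "tansig", "sin", "radbas", "purelin", "hardlim"]

def decode_topology(value, max_neurons):
    """Turn the evolved decimal value back into the set of connected neurons.

    Example from the text: 13 -> '01101' -> neurons 2, 3 and 5 are connected."""
    bits = format(int(value), "0{}b".format(max_neurons))
    return [j + 1 for j, bit in enumerate(bits) if bit == "1"]

def decode_neuron(row, max_neurons):
    """Split one (hypothetical) row of the individual W into its three parts."""
    topology_value, weights, tf_index = row[0], row[1:1 + max_neurons], int(row[-1])
    return {
        "connected_to": decode_topology(topology_value, max_neurons),
        "weights": weights,                              # evolved in [-4, 4]
        "transfer": TRANSFER_FUNCTIONS[tf_index % len(TRANSFER_FUNCTIONS)],
    }

if __name__ == "__main__":
    # One illustrative row for a network with MNN = 5 neurons: topology value 13,
    # five candidate weights and transfer-function index 1 (tansig).
    row = [13, 0.7, -1.2, 3.4, 0.0, -0.5, 1]
    print(decode_neuron(row, max_neurons=5))
---------------------------------------------------------------------------
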
When the aptitude of an individual is computed by means of the MSE function (Eq. 3), all the values of the matrix W are decoded so as to obtain the desired ANN. Moreover, each solution must be tested in order to evaluate its performance. Since the methodology is tested with several pattern classification problems, it is also necessary to use the classification error (CER) function, that is, to know how many patterns have been correctly classified and how many have been incorrectly classified. For the case of the CER fitness function, the output of the ANN is transformed by means of the winner-take-all technique; this codification is then compared with the set of desired patterns. When the output of the ANN equals the corresponding desired pattern, the pattern has been correctly classified; otherwise it has been incorrectly classified. Based on the winner-take-all technique we can compute the CER, defined by Eq. 8, where npwc is the number of patterns well classified and tnp is the total number of patterns to be classified.
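
Eq. 8 itself is missing from this copy. Since the CER is an error to be minimized and npwc and tnp are its ingredients, a form consistent with the description is CER = 1 - npwc / tnp. The sketch below uses that form together with the winner-take-all transformation; the one-hot encoding of the desired patterns is an assumption.

---------------------------------------------------------------------------
def winner_take_all(output):
    """Keep only the strongest output: one-hot vector with a 1 at the argmax."""
    best = max(range(len(output)), key=lambda j: output[j])
    return [1 if j == best else 0 for j in range(len(output))]

def classification_error(outputs, desired):
    """CER consistent with the text: 1 - npwc / tnp (npwc = patterns well classified)."""
    npwc = sum(1 for y, d in zip(outputs, desired) if winner_take_all(y) == d)
    tnp = len(desired)
    return 1.0 - npwc / tnp

if __name__ == "__main__":
    ys = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]   # raw ANN outputs (illustrative)
    ds = [[0, 1, 0], [1, 0, 0], [0, 1, 0]]                     # assumed one-hot desired patterns
    print(classification_error(ys, ds))                        # -> 0.333... (one misclassified)
---------------------------------------------------------------------------
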

When the MSE is used, the output y_i of the ANN is computed as follows:

-------------------------------------------------------------------------
1: For the first n neurons, the output o_i = a_i.
2: for each neuron ne_i with i = n to MNN do
3:   Get its connections by using the individual's component x_1,i.
4:   for each neuron j < i connected to ne_i do
5:     o_i = f(o), where f is the transfer function given by the individual's component x_m,i and o is computed using the summation of Eq. 1.
6:   end for
7: end for
8: Finally, y_i = o_i for i = MNN − m, ..., MNN.
-------------------------------------------------------------------------

Note that the restriction j < i is used to avoid the generation of cycles in the ANN.
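
A small sketch of this forward pass is given below: the first n entries of o are the input pattern, later neurons only receive input from lower-numbered neurons (the j < i rule), and the last m values form the output y. The connection and weight dictionaries stand in for the decoded individual and are illustrative only.

---------------------------------------------------------------------------
import math

def forward_pass(pattern, connections, weights, transfers, MNN, m):
    """Evaluate the evolved ANN: o[0..n-1] = inputs, later neurons use only j < i."""
    n = len(pattern)
    o = list(pattern) + [0.0] * (MNN - n)
    for i in range(n, MNN):
        total = sum(weights[(j, i)] * o[j] for j in connections[i] if j < i)  # Eq. 1 style sum
        o[i] = transfers[i](total)
    return o[MNN - m:]                       # the last m neurons are the outputs y

if __name__ == "__main__":
    logsig = lambda x: 1.0 / (1.0 + math.exp(-x))
    # Tiny illustrative net: 2 inputs (neurons 0,1), 1 hidden (2), 1 output (3); MNN = 4, m = 1.
    connections = {2: [0, 1], 3: [2]}
    weights = {(0, 2): 0.8, (1, 2): -0.4, (2, 3): 1.5}
    transfers = {2: logsig, 3: logsig}
    print(forward_pass([0.5, 0.9], connections, weights, transfers, MNN=4, m=1))
---------------------------------------------------------------------------
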
Until now, we have defined two fitness functions that help to maximize the ANN's accuracy by minimizing its error (MSE or CER). Now, we have to propose a function that helps not only to obtain maximum accuracy but also to minimize the number of connections of the ANN. The reduction of the architecture can be represented as in Eq. 9, where NC is the number of connections in the ANN designed by the proposed methodology and NMaxC is the maximum number of connections that can be generated with MNN neurons; NMaxC is given by Eq. 10.
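
Eqs. 9 and 10 are not reproduced here. A form consistent with the description is a normalized connection count and, under the j < i restriction used above, the maximum number of connections possible among MNN neurons; both expressions below are our reconstruction, not the paper's own:

F_3 = \frac{NC}{NMaxC}        (Eq. 9)

NMaxC = \frac{MNN (MNN - 1)}{2}        (Eq. 10)
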
It is important to notice that if F3 alone is used as the fitness function in the ABC algorithm, the proposed methodology will synthesize the ANN but the accuracy will not be maximized. For that reason, we propose fitness functions that integrate both objectives: the minimization of the error and the synthesis of the ANN (the reduction of the number of connections). Two fitness functions are proposed to achieve this goal using the ABC algorithm. These fitness functions are composed by combining the functions F1, F2 and F3. The first fitness function (FF1) is represented by Eq. 11, while the second fitness function (FF2) is represented by Eq. 12. With these functions, as we will see next, we are able to design ANNs with high accuracy and a very low number of connections.

Depending on the problem, the ABC algorithm approaches the minimum error at different rates during the evolutionary learning process. For instance, for the object recognition problem, we observed that by evolving FF1 (the one using the MSE) the error tends to diminish quickly, and after a certain number of generations it diminishes slowly (Figure 2(a)). On the other hand, we also observed that in some cases, when FF2 is evolved, the error reaches its minimum in a small number of epochs; nonetheless, the error tends to diminish slowly (Figure 2(b)).

Fig. 2. Evolution of the error for the ten experiments on the object recognition problem. (a) Evolution of FF1 using the MSE function. (b) Evolution of FF2 using the CER function.

V. RESULTS

Several experiments were performed in order to evaluate the accuracy of the ANNs designed by means of the proposal. The accuracy of the ANN was tested with four pattern classification problems. Three of them were taken from the UCI machine learning benchmark repository [19]: the iris plant, wine and breast cancer datasets. The other database was a real object recognition problem.

The main features of each dataset are as follows. For the object recognition dataset, the dimension of the input vector is 7 and the number of classes is 5. For the iris plant dataset, the dimension of the input vector is 4 and the number of classes is 3. For the wine dataset, the dimension of the input vector is 13 and the number of classes is 3. Finally, for the breast cancer dataset, the dimension of the input vector is 9 and the number of classes is 2.

The parameters of the ABC algorithm were set to the same values for all the dataset problems: colony size NP = 50, number of food sources NP/2, limit = 100, and maximum number of cycles MCN = 4000. Six different transfer functions were used in all experiments: SN = sine function, LS = sigmoid function, HT = hyperbolic tangent function, GS = Gaussian function, LN = linear function and HL = hard limit function.
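
For concreteness, these settings map onto the generic ABC sketch given in Section III roughly as follows; the function and variable names come from that sketch, not from the paper.

---------------------------------------------------------------------------
# Parameter values as reported in the text; abc_minimize is the illustrative
# sketch from Section III, so the commented call is only a usage example.
NP = 50            # colony size
SN = NP // 2       # number of food sources (= number of employed bees)
LIMIT = 100        # abandonment limit for the scout phase
MCN = 4000         # maximum number of cycles

# best, cost = abc_minimize(objective=ann_fitness, dim=individual_length,
#                           bounds=(-4.0, 4.0), SN=SN, limit=LIMIT, MCN=MCN)
---------------------------------------------------------------------------
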
Twenty experiments were performed with each dataset: ten for the case of fitness function FF1, and ten for the case of fitness function FF2. For each experiment, the dataset was randomly divided into two sets, a training set and a testing set, with the aim of proving the robustness and performance of the methodology. The same parameters were used throughout the whole experimentation.

For the case of the iris plant dataset, we observed that by evolving FF1 (the one using the MSE) or FF2 (the one using the CER), the error tends to diminish quickly, and after a certain number of generations it diminishes slowly (Figure 3). For the case of the wine dataset, we observed that by evolving FF1 or FF2 the error tends to diminish slowly (Figure 4). Finally, for the case of the breast cancer dataset, we observed that by evolving FF1 or FF2 the error tends to diminish quickly and, after a certain number of generations, slowly (Figure 5).

Fig. 3. Evolution of the error for the ten experiments on the iris plant problem. (a) Evolution of FF1 using the MSE function. (b) Evolution of FF2 using the CER function.

Fig. 4. Evolution of the error for the ten experiments on the wine problem. (a) Evolution of FF1 using the MSE function. (b) Evolution of FF2 using the CER function.

Fig. 5. Evolution of the error for the ten experiments on the breast cancer problem. (a) Evolution of FF1 using the MSE function. (b) Evolution of FF2 using the CER function.

Figures 6, 7, 8 and 9 show two of the 20 different ANNs automatically generated with the proposed methodology for each dataset. It is important to note that Figures 6(a), 7(a), 8(a) and 9(a) were automatically designed by the ABC algorithm with fitness functions FF1 and FF2 that did not include the synthesis of the ANN; in other words, these fitness functions did not use the F3 function. On the contrary, Figures 6(b), 7(b), 8(b) and 9(b) were automatically designed by the ABC algorithm with fitness functions FF1 and FF2 that include the synthesis of the ANN using the F3 function. Furthermore, in some cases the dimensionality of the input pattern is reduced, because some features do not contribute to the ANN's output.

Fig. 6. Two different ANN designs for the object recognition problem. (a) ANN designed by the ABC algorithm without taking the F3 function into account. (b) ANN designed by the ABC algorithm taking the F3 function into account.

Fig. 7. Two different ANN designs for the iris plant problem. (a) ANN designed by the ABC algorithm without taking the F3 function into account. (b) ANN designed by the ABC algorithm taking the F3 function into account.

Fig. 8. Two different ANN designs for the wine problem. (a) ANN designed by the ABC algorithm without taking the F3 function into account. (b) ANN designed by the ABC algorithm taking the F3 function into account.

Fig. 9. Two different ANN designs for the breast cancer problem. (a) ANN designed by the ABC algorithm without taking the F3 function into account. (b) ANN designed by the ABC algorithm taking the F3 function into account.

Table I shows the average number of connections achieved with the proposed fitness functions (FF1 and FF2). In addition, we also present the average number of connections achieved when F3 is not taken into account by the proposed fitness functions. As the reader can appreciate, the number of connections decreases when function F3 is used.

TABLE I
AVERAGE CONNECTION NUMBER

Dataset          Objective function       Objective function
                 without F3               using F3
                 FF1      FF2             FF1      FF2
Object rec.      74       66.3            58.4     65
Iris plant       20.9     19              15.6     12.7
Wine             104.8    94.6            86.3     89.9
Breast cancer    48       4.9             30.7     31.6

Once the ANN for each problem had been generated, we proceeded to test its accuracy. The following figures show the performance of the methodology with the two fitness functions. Figures 10, 11, 12 and 13 present the percentage of classification for all the experiments (executions) during the training and testing phases. Whereas for the object recognition, wine and breast cancer datasets the best percentage of recognition in the two phases was achieved using the FF1 fitness function, the best percentage of recognition for the iris plant dataset was achieved using the FF2 fitness function.

Fig. 10. Percentage of recognition for the object recognition problem and the ten experiments during the training and testing stages for each fitness function. (a) Percentage of recognition minimizing the MSE. (b) Percentage of recognition minimizing the CER.

Fig. 11. Percentage of recognition for the iris problem and the ten experiments during the training and testing stages for each fitness function. (a) Percentage of recognition minimizing the FF1 function. (b) Percentage of recognition minimizing the FF2 function.

Fig. 12. Percentage of recognition for the wine problem and the ten experiments during the training and testing stages for each fitness function. (a) Percentage of recognition minimizing the FF1 function. (b) Percentage of recognition minimizing the FF2 function.

Fig. 13. Percentage of recognition for the breast cancer problem and the ten experiments during the training and testing stages for each fitness function. (a) Percentage of recognition minimizing the FF1 function. (b) Percentage of recognition minimizing the FF2 function.

Table II presents the average percentage of recognition over all the experiments using fitness function FF1 and fitness function FF2. In this table, we can observe that the best percentage of recognition for all the databases was achieved during the training phase; the accuracy diminishes slightly during the testing phase. However, the results obtained with the proposed methodology were highly acceptable and stable. This tendency can be corroborated in Table III, which shows the standard deviation of all the experimental results obtained with each dataset.

TABLE II
AVERAGE PERCENTAGE OF RECOGNITION

Dataset          FF1                      FF2
                 Training   Testing       Training   Testing
Object rec.      0.984      0.946         0.938      0.864
Iris plant       0.9667     0.9253        0.9693     0.9387
Wine             0.9337     0.8629        0.8764     0.7944
Breast cancer    0.973      0.9655        0.9739     0.9561

TABLE III
STANDARD DEVIATION OF THE RECOGNITION

Dataset          FF1                      FF2
                 Training   Testing       Training   Testing
Object rec.      0.0386     0.0962        0.0371     0.0842
Iris plant       0.0237     0.0378        0.0189     0.0373
Wine             0.0287     0.0575        0.0304     0.1164
Breast cancer    0.0063     0.0102        0.0111     0.0134

Tables IV and V show the maximum and minimum percentages of classification achieved in all the experiments during the training and testing phases using the two fitness functions. In Table IV there are many ones, which represent the maximum percentage (100%) of recognition that can be achieved by the designed ANN. This is important because at least one configuration was found that solves a specific problem without misclassified patterns or with a low percentage of error. Table V shows the worst values achieved with the ANN. In particular, the dataset that gave the worst results was the wine problem; nonetheless, the accuracy achieved was highly acceptable.

TABLE IV
THE BEST PERCENTAGE OF RECOGNITION

Dataset          FF1                      FF2
                 Training   Testing       Training   Testing
Object rec.      1          1             1          0.96
Iris plant       1          0.9733        1          0.9733
Wine             0.9775     0.9551        0.9213     0.9213
Breast cancer    0.9824     0.9766        0.9853     0.9766

TABLE V
THE WORST PERCENTAGE OF RECOGNITION

Dataset          FF1                      FF2
                 Training   Testing       Training   Testing
Object rec.      0.88       0.72          0.9        0.7
Iris plant       0.92       0.8533        0.9333     0.84
Wine             0.8989     0.7865        0.8315     0.5169
Breast cancer    0.9648     0.9444        0.9501     0.9386

From these experiments, we observed that the ABC algorithm was able to find the best configuration for an ANN given a specific set of patterns that define a classification problem. Moreover, the integration of the synthesis into the fitness function causes the ABC algorithm to generate ANNs with a small number of connections and high performance. The design of an ANN consists in providing a good architecture with the best set of transfer functions and synaptic weights. The experimentation shows that all the designs generated by the proposal present an acceptable percentage of recognition for both the training and testing phases with the two fitness functions.

VI. CONCLUSIONS

The design of an ANN is achieved using the proposed methodology. The synaptic weights, the architecture and the transfer function of an ANN are evolved by means of the ABC algorithm. Furthermore, the connections among the neurons that belong to the ANN are synthesized. This allows generating a reduced design of the ANN with high performance. In this work we tested the performance of the ABC algorithm. We have also shown that this technique is a good optimization algorithm, because it is not easily trapped in local minima. In the case of the proposed methodology we have demonstrated its robustness; the random choice of the patterns for each experiment allowed us to obtain, statistically speaking, significant results.

The experiments were carried out with two different fitness functions, FF1 and FF2, based on the MSE and the CER, respectively. Additionally, these fitness functions involve the synthesis of the architecture. Through these experiments, we observed that both functions achieved a highly acceptable performance. Moreover, we showed that these fitness functions can considerably reduce the number of connections of an ANN while keeping the MSE and CER errors low.

On the other hand, in some of the ANN designs generated by the proposed methodology, some neurons belonging to the input layer are not used; they do not present any connections with other neurons. In this particular case, we can say that a reduction of the dimensionality of the input pattern is also obtained.

In general, the results were satisfactory. The proposed methodology allows searching for the best values that permit automatically constructing an ANN that generates a good solution for a classification problem.

REFERENCES

[1] X. Yao, "Evolving artificial neural networks," Proceedings of the IEEE, vol. 87, 1999.
[2] B. A. Garro, H. Sossa, and R. A. Vazquez, "Design of artificial neural networks using a modified particle swarm optimization algorithm," in Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN'09), Piscataway, NJ, USA: IEEE Press, 2009, pp. 2363-2370.
[3] B. A. Garro, H. Sossa, and R. A. Vazquez, "Design of artificial neural networks using differential evolution algorithm," in Proceedings of the 17th International Conference on Neural Information Processing: Models and Applications (ICONIP'10), Part II, Berlin, Heidelberg: Springer-Verlag, 2010, pp. 201-208.
[4] D. Karaboga and B. Akay, "Artificial Bee Colony (ABC) algorithm on training artificial neural networks," in Signal Processing and Communications Applications, SIU 2007, IEEE 15th, 2007, pp. 1-4.
[5] D. Karaboga, B. Akay, and C. Ozturk, "Artificial bee colony (ABC) optimization algorithm for training feed-forward neural networks," in Proceedings of the 4th International Conference on Modeling Decisions for Artificial Intelligence (MDAI '07), Berlin, Heidelberg: Springer-Verlag, 2007, pp. 318-329.
[6] D. Karaboga and C. Ozturk, "Neural networks training by artificial bee colony algorithm on pattern classification," Neural Network World, vol. 19, no. 10, pp. 279-292, 2009.
[7] D. Karaboga, C. Ozturk, and B. Akay, "Training neural networks with ABC optimization algorithm on medical pattern classification," in International Conference on Multivariate Statistical Modelling and High Dimensional Data Mining, 2008.
[8] C. Ozturk and D. Karaboga, "Classification by neural networks and clustering with artificial bee colony (ABC) algorithm," in International Symposium on Intelligent and Manufacturing Systems Features, Strategies and Innovation, 2008.
[9] T. Kurban and E. Besdok, "A comparison of RBF neural network training algorithms for inertial sensor based terrain classification," Sensors, vol. 9, pp. 6312-6329, 2009.
[10] C. Ozkan, O. Kisi, and B. Akay, "Neural networks with artificial bee colony algorithm for modeling daily reference evapotranspiration," Irrigation Science, pp. 1-11, 2010. [Online]. Available: http://dx.doi.org/10.1007/s00271-010-0254-0
[11] D. Pham, A. Soroka, A. Ghanbarzadeh, E. Koc, S. Otri, and M. Packianather, "Optimising neural networks for identification of wood defects using the bees algorithm," in Industrial Informatics, 2006 IEEE International Conference on, 2006, pp. 1346-1351.
[12] D. Pham, E. Koc, and A. Ghanbarzadeh, "Optimisation of the weights of multi-layered perceptrons using the bees algorithm," in Proceedings of the 5th International Symposium on Intelligent Manufacturing Systems, 2006.
[13] D. Karaboga and B. Akay, "A survey: algorithms simulating bee swarm intelligence," Artificial Intelligence Review, vol. 31, no. 1, pp. 61-85, Jun. 2009.
[14] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning Internal Representations by Error Propagation. Cambridge, MA, USA: MIT Press, 1986, ch. 8, pp. 318-362.
[15] J. A. Anderson, An Introduction to Neural Networks. The MIT Press, 1995.
[16] P. Werbos, "Backpropagation through time: what it does and how to do it," Proceedings of the IEEE, vol. 78, no. 10, pp. 1550-1560, Oct. 1990.
[17] V. Tereshko and A. Loengarov, "Collective decision making in honeybee foraging dynamics," Computing and Information Systems Journal, vol. 9, no. 3, pp. 1-7, 2005.
[18] D. Karaboga, "An idea based on honey bee swarm for numerical optimization," Computer Engineering Department, Engineering Faculty, Erciyes University, Tech. Rep., 2005.
[19] P. M. Murphy and D. W. Aha, "UCI Repository of machine learning databases," University of California, Department of Information and Computer Science, Irvine, CA, USA, Tech. Rep., 1994.