
2011 International Symposium on Computer Science and Society

Improving BP Neural Network for the Recognition of Face Direction

Ying He
Henan Business College, Zhengzhou, China
hying2050@yahoo.com.cn

Baohua Jin
Zhengzhou University of Light Industry, Zhengzhou, China
jinbh@zzuli.edu.cn

Qiongshuai Lv
Zhengzhou University, Zhengzhou, China
qiongshuailv@163.com

Shaoyu Yang
Zhengzhou University, Zhengzhou, China
ysymagnet@gmail.com

Abstract - The recognition of face direction is an important part of artificial intelligence. In recent years, the BP network has been used for pattern recognition. In practical application, however, BP has some disadvantages: the widely used BP algorithm has slow convergence speed and low learning efficiency, and it easily falls into local minima. The selection of the initial values of the BP network can also affect convergence speed. This paper presents an improved BP network that accelerates convergence with a genetic-simulated annealing algorithm: we optimize the initial values of the network by adding the annealing idea into the genetic algorithm (genetic-simulated annealing algorithm, GSA) and use the result to identify face direction. Applying this improved BP neural network to the recognition of face direction, the results show that our method has higher precision and reaches relatively good effects compared with the traditional BP neural network. Therefore, the method optimized with GSA possesses better recognition ability and achieves an ideal effect for face direction.

Keywords - face direction; BP network; initial value of the network; genetic-simulated annealing algorithm

I. INTRODUCTION

As one of the most promising applications of image analysis and understanding, face recognition has recently received significant attention. The BP network method has been applied to recognizing face direction. It has very strong computing capability and is suitable for complicated non-linear environments. But the BP neural network has its own drawbacks: slow convergence speed and a tendency to get stuck in local minima.

To improve the BP neural network method, many scholars have presented methods to accelerate convergence and to search for the global optimum. T. P. Vogl et al. first introduced several methods for accelerating the convergence of back-propagation, including batching, momentum (MOBP) and variable learning rate (VLBP) [1]. At the same time, with the development of heuristic and numerical optimization techniques, such techniques were gradually applied to neural networks, especially the genetic algorithm (GA) and the simulated annealing algorithm (SA). GA was invented by John Holland [2]. SA is a probabilistic method proposed by Kirkpatrick, Gelatt, Vecchi and Cerny [7].

GA is a robust adaptive optimization technique based on biological principles, and SA is a kind of random search technique based on probability. GA is poor at local search; in contrast, SA has a strong local search capability and keeps the search process from falling into local optima. For these reasons, in this paper we combine GA with SA so that they complement each other, and use the combination to optimize the initial weights and biases of the BP neural network.

The remainder of this paper is organized as follows. Section II briefly reviews the related literature, Section III describes the GSA idea and analyzes its internal process, Section IV discusses the experiment and presents the primary results, and Section V provides the summary and conclusions.

II. RELATED LITERATURE REVIEW

A. BP Neural Network

The basic idea of BP is that the learning process is composed of two phases: forward propagation of the signal and backward propagation of the error. Generally, a BP neural network consists of three layers: input layer, hidden layer and output layer. The topology is illustrated in Fig. 1. $P_i$ is the input vector, $i \in \{1, ..., n\}$; $W_{ij}$ is the weight between the input layer and the hidden layer; $W_{jk}$ is the weight between the hidden layer and the output layer; $a_i$ is the output vector, $i \in \{1, ..., m\}$.

Figure 1. n-r-m neural network structure

Recognition of face direction by a BP network is mainly determined by the network structure. It has been proved theoretically that a three-layer feed-forward network can be trained to approximate an arbitrary function well. To determine the number of nodes in each layer, we follow these rules [4]:
• The number of inputs to the network and the number of outputs from the network are defined by the external problem specifications.
• The external problem does not tell us directly the number of nodes required in the hidden layers. In fact, there are few problems for which one can predict the optimal number of nodes needed in a hidden layer; this is an active area of research.

For the second rule, the following three formulas [5] are often used to determine the number of nodes in the hidden layer (a code sketch follows them):

$\sum_{i=0}^{n} C_{n_t}^{i} > k$   (1)

Here $k$ is the number of samples, $n_t$ is the number of nodes in the hidden layer, and $n$ is the number of nodes in the input layer. If $i > n_t$, then $C_{n_t}^{i} = 0$.

$n_t = \sqrt{n + m} + a$   (2)

Here $m$ is the number of nodes in the output layer, $n$ is the number of nodes in the input layer, and $a$ is a constant between 1 and 10.

$n_t = \log_2 n$   (3)

Here $n$ is the number of nodes in the input layer.
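To make the three formulas concrete, here is a small Python sketch of each rule. The function names and the choice a = 7 are ours, not the paper's; with a = 7, formula (2) reproduces the 8-10-3 structure used later in Section IV.

```python
from math import comb, log2, sqrt

def hidden_nodes_capacity(n, k):
    """Formula (1): smallest n_t with sum_{i=0}^{n} C(n_t, i) > k.
    math.comb() already returns 0 when i > n_t, matching the text's convention."""
    n_t = 1
    while sum(comb(n_t, i) for i in range(n + 1)) <= k:
        n_t += 1
    return n_t

def hidden_nodes_sqrt(n, m, a=7):
    """Formula (2): n_t = sqrt(n + m) + a, with the constant a in [1, 10]."""
    return round(sqrt(n + m) + a)

def hidden_nodes_log(n):
    """Formula (3): n_t = log2(n)."""
    return round(log2(n))

# For the 8-input, 3-output network of Section IV, formula (2) gives
# round(sqrt(11) + 7) = 10, matching the paper's 8-10-3 structure.
print(hidden_nodes_capacity(8, 30), hidden_nodes_sqrt(8, 3), hidden_nodes_log(8))
```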
As we all know, BP network training is a continuous learning process. The core of the learning process is to adjust the weight matrix according to the mean square error between the real output and the anticipated output.
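As an illustration of this adjustment, here is a minimal sketch of one error-driven weight update. It covers a single linear layer only; the names and the learning rate are our choices, and a full BP pass would also propagate the error back through the hidden layer.

```python
import numpy as np

def weight_update(W, x, target, lr=0.1):
    """One gradient step on the mean square error between the real output
    and the anticipated output, for a single linear layer."""
    actual = W @ x                   # real output of the layer
    error = target - actual         # anticipated output minus real output
    W += lr * np.outer(error, x)    # adjust the weight matrix down the MSE gradient
    return W, np.mean(error ** 2)   # the MSE that training drives down
```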
B. Genetic Algorithm

The genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems; it is categorized as a global search heuristic. A typical genetic algorithm requires:
• A genetic representation of the solution domain.
• A fitness function to evaluate the solution domain.

The fitness function is defined over the genetic representation and measures the quality of the represented solution. Once the genetic representation and the fitness function are defined, GA initializes a population of solutions randomly and then improves it through repeated application of the selection, crossover and mutation operators. Training a feed-forward neural network is, in fact, a search based on these three operators for the optimal connection weights and biases, with each group of weights and biases represented in the GA as an individual chromosome. Simple GA pseudocode is as follows [6] (a runnable sketch is given after the list):
• Choose the initial population of individuals.
• Evaluate the fitness of each individual in that population.
• Repeat on this generation until termination (time limit, sufficient fitness achieved, etc.):
1. Select the best-fit individuals for reproduction.
2. Breed new individuals through crossover and mutation operations to give birth to offspring.
3. Evaluate the individual fitness of the new individuals.
4. Replace the least-fit individuals with the new individuals.
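A runnable skeleton of this pseudocode might look as follows. This is a sketch under our naming; the truncation selection here is simpler than the roulette wheel the paper adopts later, and `fitness`, `crossover` and `mutate` are caller-supplied.

```python
import random

def simple_ga(pop_size, n_genes, fitness, crossover, mutate, generations=20):
    """Skeleton of the GA pseudocode above; individuals are lists of floats,
    matching the floating-point encoding used later in the paper."""
    # Choose the initial population of individuals.
    population = [[random.uniform(-1, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):                 # repeat until termination
        # Evaluate fitness and select the best-fit individuals to reproduce.
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        # Breed new individuals through crossover and mutation.
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - len(parents))]
        # Replace the least-fit individuals with the new ones.
        population = parents + children
    return max(population, key=fitness)
```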
C. Simulated Annealing Algorithm

The inspiration for simulated annealing (SA) [7] comes from annealing in metallurgy, a technique in which heating and controlled cooling of a material increase the size of its crystals and reduce their defects. The heat causes the atoms to become unstuck from their initial positions (a local minimum of the internal energy) and wander randomly through states of higher energy, while the slow cooling gives them more chances of finding configurations with lower internal energy than the initial one.

By analogy with this physical process, each step of the SA algorithm replaces the current solution by a random "nearby" solution, chosen with a probability that depends on the difference between the corresponding function values and on a global parameter T (called the temperature), which is gradually decreased during the process. The dependency is such that the current solution changes almost randomly when T is large, but increasingly "downhill" as T goes to zero. The allowance for "uphill" moves saves the method from becoming stuck at local optima, which are the bane of greedier methods.

As described above, the following pseudocode implements the simulated annealing algorithm (a code sketch follows the list):
1. Randomly generate an initial point and calculate the objective function value.
2. Set the initial temperature and the counter T.
3. Make a random change to the current point and calculate the new objective function value and the increment Δ.
4. If Δ ≤ 0, accept the new point as the current point. If Δ > 0, accept the new point as the current point with probability P.
5. If the counter T is below its preset value, set T = T + 1 and jump to step 3.
6. If the temperature is not satisfactory, jump to step 2; otherwise return the current optimal point.
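A direct transcription into code might look like the sketch below. The neighbor function, the geometric cooling and the loop bounds are our assumptions; the paper's own GSA instead uses the logarithmic schedule of equation (5) given in Section III.

```python
import math
import random

def simulated_annealing(objective, start, neighbor,
                        t0=1000.0, inner=50, t_min=1e-3, cool=0.9):
    """Sketch of the six-step pseudocode: propose random changes, accept
    downhill moves always and uphill moves with probability exp(-delta/T),
    and lower the temperature until it is 'satisfactory'."""
    current, f_cur = start, objective(start)      # steps 1-2
    best, f_best = current, f_cur
    T = t0
    while T > t_min:                              # step 6: temperature check
        for _ in range(inner):                    # step 5: inner counter loop
            cand = neighbor(current)              # step 3: random change
            delta = objective(cand) - f_cur       # the increment
            if delta <= 0 or random.random() < math.exp(-delta / T):
                current, f_cur = cand, f_cur + delta   # step 4: Metropolis accept
                if f_cur < f_best:
                    best, f_best = current, f_cur
        T *= cool                                 # cool down, back to step 2
    return best, f_best
```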

III. GENETIC-SIMULATED ANNEALING ALGORITHM

From the theoretical analysis above, GA has a strong enough global search ability but is poor at local search. In contrast, SA has a strong local search capability, which can keep the search process from falling into local optima; however, SA knows little about the whole search space. For these reasons, we adopt the method of adding the annealing idea into the genetic algorithm so that the two complement each other, and use it to optimize the initial weights and biases of the BP neural network.

GSA is similar to GA in its overall operation. It also needs the three basic operators, selection, crossover and mutation, to produce a new group of individuals. Then the SA process is applied to each individual independently, and its results form the next generation. This process is carried out repeatedly until the end condition is satisfied. Fig. 2 shows the basic flowchart of GSA on the BP neural network.

The flowchart includes two parts: one is the BP neural network and the other is the GSA algorithm. The BP neural network part uses the weights and biases optimized with the GSA algorithm: usually the initial weights and biases of a BP neural network are selected randomly, whereas we use the values produced by the GSA algorithm. The GSA algorithm part mainly includes the following three components.

Figure 2. GSA flowchart based on BP network

A. Fitness Function

GSA basically uses no external information in the search process; it relies only on the fitness function, using each individual's fitness value to search within the population. The fitness function directly affects the convergence speed. Generally speaking, the fitness function is formed from the objective function; we regard the error between the anticipated output and the real output as the individual fitness.

B. Three Operations

In the selection operation, we choose individuals according to fitness value: the greater an individual's fitness is, the more likely it is to be selected. Our experiment adopts roulette wheel selection. In the crossover operation, we randomly select two individual chromosomes to cross; since our experiment uses floating-point encoding for the individual chromosomes, we take advantage of float crossover rules. In the mutation operation, we randomly choose the j-th gene of the i-th individual to mutate. These three operators are the main part of the genetic algorithm and guarantee that GSA has a strong enough global search ability. A sketch of the three operators follows.
C. SA Operator

In the SA operator, we use the Metropolis receive rule [8] and the parameter evolution of the annealing process. On this point, the SA operator makes up for GA's deficiency in local search capability. The Metropolis receive rule is as follows:

$P = \begin{cases} 1, & |E_1| \geq |E_2| \\ \exp\left((|E_1| - |E_2|)/T\right), & |E_1| < |E_2| \end{cases}$   (4)

Here $P$ is the Metropolis receiving probability, $E_1$ is the old energy and $E_2$ is the new energy. When the new energy is no greater than the old energy, we accept the change; otherwise we accept it with probability $\exp((|E_1| - |E_2|)/T)$.

$T$ is the annealing temperature:

$T = T_0 / \log(t + 1)$   (5)

Here $T_0$ is the initial temperature and $t$ is the annealing step.

The Metropolis receive rule can make the current solution break out of a local minimum. In a word, the SA operator has a strong local search capability, which keeps the search process from falling into local optima. A code sketch of the rule and the temperature schedule follows.
These ideas are the core of the GSA algorithm. GA has a strong enough global search ability but is poor at local search; SA has a strong local search capability and keeps the search from falling into local optima, but knows little about the whole search space. We therefore add the annealing idea into the genetic algorithm so that the two complement each other, and use the combination to optimize the initial weights and biases of the BP neural network.
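Putting the pieces together, the GSA flow might be sketched as below, reusing the helpers from the earlier sketches (roulette_select, float_crossover, mutate_gene, metropolis_accept, annealing_temperature). The error function and the fitness transform 1/(1 + error) are our assumptions, since the paper says only that the output error serves as the fitness.

```python
import random

def gsa_optimize(pop_size, n_genes, error_fn, generations=20, t0=1000.0):
    """One generation = GA phase (selection, crossover, mutation) followed
    by an SA step applied to each individual independently; the result
    becomes the next generation, as in the Fig. 2 flowchart."""
    def fitness(ind):
        return 1.0 / (1.0 + error_fn(ind))   # assumed transform: lower error, higher fitness
    population = [[random.uniform(-1, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for gen in range(1, generations + 1):
        fits = [fitness(ind) for ind in population]
        # GA phase: selection, crossover and mutation produce a new group.
        new_pop = [mutate_gene(float_crossover(roulette_select(population, fits),
                                               roulette_select(population, fits)))
                   for _ in range(pop_size)]
        # SA phase, per individual: propose a change and keep it only if the
        # Metropolis rule (4) accepts it at the current temperature (5).
        T = annealing_temperature(t0, gen)
        for i, ind in enumerate(new_pop):
            cand = mutate_gene(ind)
            if metropolis_accept(error_fn(ind), error_fn(cand), T):
                new_pop[i] = cand
        population = new_pop
    return min(population, key=error_fn)
```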
IV. EXPERIMENT AND SIMULATION

A. Experiment

1) Pretreatment of the Feature Data of Face Images

Our experimental data set is from http://archive.ics.uci.edu/ml/index.html. We observed that the location of the eyes differs significantly when the face direction differs. There are five directions: the left, the left front, the front, the right front and the right, and we use three binary digits to express the facial direction. We divided the pictures into six rows and eight columns of blocks and then performed edge extraction on the pictures, extracting only the rows and columns containing the eye positions and counting the pixels at the eye positions whose value is one. To ensure the randomness of the data, we chose 30 images as the training set and 20 as the testing set.
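One plausible reading of this pretreatment in code is sketched below. Which grid row holds the eyes, and the particular three-bit codes assigned to the five directions, are our assumptions; the paper states only that the eye rows/columns are extracted and that three binary digits encode the direction.

```python
import numpy as np

def eye_band_features(edge_img, n_rows=6, n_cols=8, eye_row=2):
    """Split a binary edge image into a 6x8 grid of blocks, keep the row of
    blocks assumed to contain the eyes, and count the 1-pixels in each of
    its eight blocks, yielding the eight network inputs."""
    h, w = edge_img.shape
    bh, bw = h // n_rows, w // n_cols
    band = edge_img[eye_row * bh:(eye_row + 1) * bh]
    return np.array([int(band[:, c * bw:(c + 1) * bw].sum())
                     for c in range(n_cols)])

# Three binary digits for the five directions -- one hypothetical coding:
DIRECTION_CODES = {"left": (1, 0, 0), "left front": (0, 1, 0),
                   "front": (0, 0, 1), "right front": (1, 1, 0),
                   "right": (0, 1, 1)}
```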
2) Determination of the BP Network Structure

The experiment designs a three-layer BP network with eight nodes in the input layer and three nodes in the output layer. Determining the optimal number of nodes in the hidden layer is an unsolved problem: if there are not enough nodes, the convergence of the whole network slows down; on the contrary, if there are excessive nodes, the BP network needs more calculation in iterative learning and the error may not reach the required level. We use formula (2) from Section II to determine it, so the BP neural network structure is eight-ten-three.

3) Weight Encoding Mechanism

According to the established network structure of eight input nodes, ten hidden nodes and three output nodes, the length of an individual chromosome in the GSA is 123. We use floating-point encoding for the individual chromosomes; floating-point encoding helps improve operational efficiency and reduce computational complexity. Each individual is a floating-point string consisting of four parts: the input-hidden connection weights, the hidden layer biases, the hidden-output connection weights and the output layer biases. An individual chromosome thus contains all of the neural network's weights and biases.
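The encoding is easy to check in code: 8×10 + 10 + 10×3 + 3 = 123 floats, split into the four parts in the order the text gives. The reshape orientation below is our choice.

```python
import numpy as np

N_IN, N_HID, N_OUT = 8, 10, 3
CHROM_LEN = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT   # 80 + 10 + 30 + 3 = 123

def decode_chromosome(chrom):
    """Split one 123-float individual into input-hidden weights, hidden
    biases, hidden-output weights and output biases, in that order."""
    chrom = np.asarray(chrom, dtype=float)
    assert chrom.size == CHROM_LEN
    i = 0
    w1 = chrom[i:i + N_IN * N_HID].reshape(N_HID, N_IN);  i += N_IN * N_HID
    b1 = chrom[i:i + N_HID];                              i += N_HID
    w2 = chrom[i:i + N_HID * N_OUT].reshape(N_OUT, N_HID); i += N_HID * N_OUT
    b2 = chrom[i:i + N_OUT]
    return w1, b1, w2, b2
```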
4) Experimental Environment and Parameter Settings

We used the Matlab toolbox to realize the BP neural network on a PC (Pentium 4 CPU, 2.8 GHz, 512 MB RAM) and optimized the BP neural network with the GSA algorithm. In the GSA algorithm, we set the population size to 10, the number of iterations to 20, the crossover probability to 0.4, the mutation probability to 0.2 and the initial temperature T0 to 1000.

B. Simulation

We compared the error of the GSA-optimized BP neural network with that of the original BP network; Fig. 3 shows the result. We used the GSA algorithm to optimize the neural network's weights and biases. From Fig. 3, we can clearly see the error distribution of the simulation results on the testing sample set after training the network. Although the two methods are close, the recognition error of the GSA-optimized BP neural network is obviously smaller than that of the original BP network; this method has higher precision and reaches relatively good results.

Figure 3. Recognition error on the testing samples (error versus test sample index): GSA-optimized BP (GSA - Error) versus the original BP network (BP - Error).
V. CONCLUSIONS

This paper addresses the problems caused by the random initial weights and biases of the traditional BP neural network, namely that it easily becomes trapped in local minima and has slow learning convergence. We use a method based on the GSA algorithm to optimize the initial values, aimed at improving the speed of learning convergence. This method is practical and generally applicable. When recognizing face direction, one should first locate the position of the eyes in the picture to reduce the influence of other factors and improve the recognition precision. All of these results show that, compared with the traditional BP network, the GSA-optimized BP network obtains much more accurate results.

Although the improved BP neural network exceeds the traditional algorithm in convergence speed and recognition precision, the method needs a large amount of calculation. Pattern recognition problems of the real world, such as dynamic image recognition, do not fit well into this narrowly defined model; they tend to span broad activities and require the consideration of multiple aspects.

REFERENCES
[1] T. P. Vogl, J. K. Mangis, A. K. Rigler, W. T. Zink, D. L. Alkon, "Accelerating the Convergence of the Back-Propagation Method," Biological Cybernetics, Vol. 59, pp. 257-263, September 1988.
[2] David E. Goldberg, John H. Holland, "Genetic Algorithms and Machine Learning," Machine Learning, Vol. 3, pp. 95-96, October 1988.
[3] Feng Li, Cheng Liu, "Application Study of BP Neural Network on Stock Market Prediction," Ninth International Conference on Hybrid Intelligent Systems, Vol. 3, pp. 174-178, August 2009.
[4] Martin T. Hagan, Howard B. Demuth, Mark Beale, Neural Network Design, China Machine Press, Beijing, 2002.
[5] Feisi ICS-UNIDO, Neural Networks and Matlab 7 Application, Publishing House of Electronics Industry, Beijing, 2005.
[6] Tom M. Mitchell, Machine Learning, China Machine Press, Beijing, 2003.
[7] S. Kirkpatrick, C. D. Gelatt, M. P. Vecchi, "Optimization by Simulated Annealing," Science, pp. 671-680, 1983.
[8] Tian Qichuan, Pan Quan, Wang Feng, Zhang Hongcai, "Research on Learning Algorithm of BP Neural Network Based on the Metropolis Criterion," Control Theory and Applications, Vol. 22, pp. 15-17, 2003.

