PDF approaches the true PDF as the training set size increases, as long as the true PDF is smooth. It is important to select a proper weighting function, which should have the following properties: it should return a large value when the distance between the unknown input and the training sample is small, and the returned value should decrease rapidly to zero as the distance increases [6], [11]. The Gaussian function is commonly a good choice because it behaves well, is easily computed, and does not rely on any assumption of a normal distribution [12]. The estimated PDF for each class then becomes:

g_i(x) = \frac{1}{(2\pi)^{p/2}\,\sigma^{p}\,n_i} \sum_{k=1}^{n_i} \exp\left(-\frac{\|x - x_{ik}\|^2}{2\sigma^2}\right)    (1)

For the PNN, the training procedure amounts to building the network and storing the information of the samples in the hidden layer. In other words, after the network is built there is no further training procedure, so it is quite fast. Besides this, training samples can be added or removed easily without extensive retraining. Another important point is that a proper smoothing parameter \sigma is required. It can be determined by an educated guess based on knowledge of the data or estimated by a heuristic technique.
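As a concrete reading of (1) and of the training-free property, here is a minimal Python/NumPy sketch; the toy samples, class labels, and smoothing value are invented for illustration and are not from the paper.

```python
import numpy as np

def class_pdf(x, samples, sigma):
    """Eq. (1): equally weighted Gaussian kernels centred on the n_i
    stored training samples of class i."""
    samples = np.asarray(samples, dtype=float)
    n_i, p = samples.shape
    sq_dist = np.sum((samples - x) ** 2, axis=1)        # ||x - x_ik||^2
    norm = (2 * np.pi) ** (p / 2) * sigma ** p * n_i    # (2*pi)^(p/2) * sigma^p * n_i
    return np.sum(np.exp(-sq_dist / (2 * sigma ** 2))) / norm

def pnn_classify(x, classes, sigma=0.5):
    """'Training' is only storing the samples per class; adding or removing
    a sample just edits the stored lists. Prediction picks the class whose
    estimated PDF at x is largest."""
    return max(classes, key=lambda c: class_pdf(x, classes[c], sigma))

# Invented two-class demo.
classes = {"high": [[9.0, 1.0], [8.5, 0.8]],
           "low":  [[4.0, 0.2], [5.0, 0.3]]}
print(pnn_classify(np.array([8.0, 0.9]), classes))      # -> high
```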
B. General Regression Neural Network
The General Regression Neural Network (GRNN) was also introduced by D. F. Specht in the early 1990s, and it is likewise based on probability. The GRNN is very similar to the PNN, especially in its structure. However, the GRNN is designed for regression, while the PNN is designed for classification. Given the training data, the GRNN can reconstruct the underlying function f(x). The architecture of the GRNN is very similar to that of the PNN, except that each neuron in the hidden layer is connected to every neuron in the summation layer, and the edges connecting the hidden layer and the summation layer are assigned weights [13], [14], [15], as shown in Fig. 2.

Fig. 2. The structure of the General Regression Neural Network.

As a reasonable estimate of an unknown regression function f(x), relying on a prior belief in its smoothness, one may take a mean of observations from a neighborhood of the point x. This approach is successful if the local average is confined to observations in a small neighborhood (i.e., a receptive field) of the point x, as observations corresponding to points away from x will generally have different mean values [16]. More precisely, f(x) is equal to the conditional mean of z given x. Using the formula for the expectation of a random variable, it can be written: f(x) = E[z|x] = \int_{-\infty}^{+\infty} z\, p_z(z|x)\, dz, where p_z(z|x) is the conditional probability density function of z given x. From probability theory it is known that p_z(z|x) = p_{x,z}(x,z)/p_x(x), so the formula becomes

f(x) = E[z|x] = \frac{\int_{-\infty}^{+\infty} z\, p(x,z)\, dz}{\int_{-\infty}^{+\infty} p(x,z)\, dz},

where p(x,z) is the joint distribution function [13].

We can use either parametric or nonparametric models to find the joint distribution function from the training data. The latter is usually the preferred choice because it estimates the density directly from the data without making any parametric assumptions about the underlying distribution. The Parzen-Rosenblatt density estimator of the joint PDF is adopted, where p(x,z) in the joint input-output space is defined as

p_{x,z}(x,z) = \frac{1}{\sigma^{p+1} N} \sum_{i=1}^{N} K\left(\frac{x - x_i}{\sigma}\right) K\left(\frac{z - z_i}{\sigma}\right).

Here K is the kernel function; the Gaussian function is usually a good choice for the kernel [14]. Applying the Gaussian kernel gives (2) and (3); then, evaluating the two indicated integrals and using \int_{-\infty}^{+\infty} z e^{-z^2}\, dz = 0, we obtain (4):

p(x,z) = \frac{1}{(2\pi)^{3/2} N} \sum_{n=1}^{N} \frac{1}{\sigma_{x,n}^2 \sigma_z} \exp\left\{-\frac{D_{x,n}^2}{2\sigma_{x,n}^2} - \frac{D_{z,n}^2}{2\sigma_z^2}\right\},    (2)

f(x) = \frac{\sum_{i=1}^{n} \exp\left[-\frac{(x - x_i)^T (x - x_i)}{2\sigma^2}\right] \int_{-\infty}^{+\infty} z \exp\left[-\frac{(z - z_i)^2}{2\sigma^2}\right] dz}{\sum_{i=1}^{n} \exp\left[-\frac{(x - x_i)^T (x - x_i)}{2\sigma^2}\right] \int_{-\infty}^{+\infty} \exp\left[-\frac{(z - z_i)^2}{2\sigma^2}\right] dz},    (3)

f(x) = \frac{\sum_{i=1}^{n} z_i \exp\left[-\frac{(x - x_i)^T (x - x_i)}{2\sigma^2}\right]}{\sum_{i=1}^{n} \exp\left[-\frac{(x - x_i)^T (x - x_i)}{2\sigma^2}\right]}.    (4)

Formula (4) is a weighted sum over all the training patterns. Each training pattern is weighted exponentially according to its Euclidean distance to the unknown pattern x and according to the smoothing factor \sigma. Note that the use of the normalized Gaussian function does not imply any assumption of normality about the distribution of the data in the feature space [15]; the Gaussian function is used simply because it satisfies the required properties of the kernel function. The \sigma is the smoothing factor, and it determines the quality of the estimate: if \sigma is too small, the result will overfit; if it is too large, the estimate will oversmooth.
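The step from (3) to (4) rests on two standard Gaussian integrals: the numerator integral evaluates to z_i \sigma \sqrt{2\pi} (after shifting, the odd part vanishes by the identity above), and the denominator integral evaluates to \sigma \sqrt{2\pi}, so the common factor cancels and only z_i survives. A quick numerical confirmation, with illustrative values for z_i and \sigma:

```python
import numpy as np
from scipy.integrate import quad

sigma, z_i = 0.7, 3.2   # illustrative values

# The two integrals appearing in the numerator and denominator of (3):
num, _ = quad(lambda z: z * np.exp(-(z - z_i) ** 2 / (2 * sigma ** 2)), -np.inf, np.inf)
den, _ = quad(lambda z: np.exp(-(z - z_i) ** 2 / (2 * sigma ** 2)), -np.inf, np.inf)

print(np.isclose(den, sigma * np.sqrt(2 * np.pi)))   # True: the common factor
print(np.isclose(num / den, z_i))                    # True: the ratio collapses to z_i
```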
The final formula is mapped onto the GRNN. Each node in the hidden layer represents a training pattern. The summation layer includes two nodes: the first node sums all the outputs of the hidden layer and evaluates the numerator of the final formula, and the second node evaluates the denominator. Each unit in the hidden layer is connected to each of the two nodes in the summation layer. The weight of the connection between node i in the hidden layer and the first unit of the summation layer is equal to z_i, the target value; the weight of the connection between any node i in the hidden layer and the second node in the summation layer is equal to unity. The output node merely divides the two outputs of the summation layer to yield the predicted value of the dependent feature [17], [18]. The advantages of the GRNN are fast training and the ability to easily add or delete training samples.
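The mapping just described translates directly into code. In this minimal NumPy sketch of (4), num plays the role of the first summation node (connection weights z_i) and den of the second (unit weights); the toy data and smoothing values are invented:

```python
import numpy as np

def grnn_predict(x, X_train, z_train, sigma=1.0):
    """Eq. (4): a Gaussian-weighted average of the stored targets."""
    d2 = np.sum((np.asarray(X_train) - x) ** 2, axis=1)   # (x - x_i)^T (x - x_i)
    w = np.exp(-d2 / (2 * sigma ** 2))                    # hidden-layer outputs
    num = np.dot(w, z_train)   # first summation node: weights are the targets z_i
    den = np.sum(w)            # second summation node: unit weights
    return num / den           # output node divides the two sums

# Invented 1-D demo: a small sigma tracks the samples (overfitting tendency),
# a large sigma smooths towards the global mean of the targets.
X = np.array([[0.0], [1.0], [2.0]])
z = np.array([1.0, 3.0, 2.0])
print(grnn_predict(np.array([1.2]), X, z, sigma=0.3))
print(grnn_predict(np.array([1.2]), X, z, sigma=5.0))
```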
A. Methodology
Considering the similar architecture of the PNN and the GRNN, as well as the fast training and retraining features of both networks, we construct a neural network that combines the PNN and the GRNN to predict a soccer player's performance value. Applications using combinations of neural networks have been discussed in the literature [19], [20], [21].

In reality, many factors affect a game. We choose some key and useful factors: time, opponent, the opponent's playing style, city, weather, difficulty, and player form [22]. The target value is the player's performance level. Other factors can easily be included by adding corresponding input nodes. As there are several candidate players, we need to calculate the most probable performance for every player and choose the maximum one as the final choice. This requires one more layer to compare the estimated performances of the players. The GRNN part uses the time information and the performance information (the target value) in historical data to estimate the underlying function between the time input and the performance value. When an input game comes, the GRNN uses the input time factor to estimate the target performance value for each candidate player. Note that each candidate player has a different underlying function. The other factors are used by the PNN. The target value (performance) is discretized into several performance classes. In the PNN, the Gaussian function is still used, but it is not enough, because several factors, such as city and opponent, cannot be digitized and applied in the Gaussian function. How can this problem be solved? The idea of the PNN is that the closer an input is, namely the more similar it is to one training sample, the higher its probability of belonging to the corresponding class. For example, suppose there are two samples for player 1: (ATM, offensive, sunny, home, 9.0) and (Betis, defensive, cloudy, away, 7.0), where the last value is the target performance level. Assuming each variable has the same weight, when an input (Betis, defensive, sunny, away) comes, then by the idea of the PNN the node representing the second sample will return a higher value than the first one. So we design our own neural function for our PNN, which ensures that it returns a large value when the distance between the unknown input and the training sample is small, and that the returned value decreases rapidly to zero as the distance increases. Using this idea, in our neural function, if a categorical variable (e.g., the city) is the same as in the sample, it adds a value to the output; if not, it does not. For the numeric variables (game difficulty and player form), the Gaussian function still applies. In this way, our neural function formula becomes:

SUM = 0.15 (opponent) + 0.15 (style) + 0.1 (city) + 0.05 (weather) + 0.55 Gaussian(difficulty, player form).    (5)

Here, if the input opponent is the same as the sample's, the opponent value will be 1; otherwise, it will be 0. The other factors work in the same way. Then, for one input, the GRNN and the PNN return estimated performance values for each candidate offensive player. The added layer sums these two estimated values for each candidate and sends them to the output. The output layer compares the players, then outputs the selection and the estimated performance for this player.
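A minimal sketch of the neural function (5) follows. The paper does not spell out the form of its Gaussian term, so this version assumes a Gaussian of the Euclidean distance between the (difficulty, form) pairs of the input and the sample; the field names, numeric values, and smoothing factor are illustrative assumptions:

```python
import numpy as np

CATEGORICAL_WEIGHTS = {"opponent": 0.15, "style": 0.15, "city": 0.10, "weather": 0.05}

def neural_function(inp, sample, sigma=1.0):
    """Hidden-node activation from (5): 0/1 match terms for the categorical
    factors plus a Gaussian term over the numeric factors."""
    total = sum(w * (inp[k] == sample[k]) for k, w in CATEGORICAL_WEIGHTS.items())
    d2 = ((inp["difficulty"] - sample["difficulty"]) ** 2
          + (inp["form"] - sample["form"]) ** 2)
    return total + 0.55 * np.exp(-d2 / (2 * sigma ** 2))

# The example from the text: the Betis sample matches the input on opponent,
# style, and city, so its node returns the higher value (numeric values invented).
inp = {"opponent": "Betis", "style": "defensive", "city": "away",
       "weather": "sunny", "difficulty": 0.6, "form": 0.7}
atm = {"opponent": "ATM", "style": "offensive", "city": "home",
       "weather": "sunny", "difficulty": 0.5, "form": 0.8}
betis = {"opponent": "Betis", "style": "defensive", "city": "away",
         "weather": "cloudy", "difficulty": 0.7, "form": 0.6}
print(neural_function(inp, betis) > neural_function(inp, atm))   # True
```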
B. Structure

Fig. 3 is a schematic diagram. The first layer is the input layer; the number of nodes in this layer is equal to the number of factors. In the hidden layer, each node stores the information of one sample and can be considered to contain two neural functions: one uses the time and the historical target value to do regression, and the other uses the remaining factors to do classification. The summation 1 layer has two parts: the node in the left part sums the values calculated by the regression function in the hidden layer for each candidate, and the node in the right part sums the values calculated by the classification function in the hidden layer for each candidate. The summation 2 layer then performs the division for the regression and the comparison for the classification; this layer calculates the estimated performance and the most probable performance class for each player. Finally, the output layer makes the selection and outputs the average of the sum of the two estimates.
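Putting the layers together, the following hedged sketch reuses grnn_predict and neural_function from the sketches above and follows the forward pass described in the text: the regression part estimates performance from time, the classification part returns the performance of the most probable match, the added layer combines (here, averages) the two estimates, and the output layer compares the candidates. The data layout and the averaging detail are assumptions.

```python
import numpy as np

def predict_player(time_x, factors, history, sigma=1.0):
    """One candidate: the GRNN part regresses performance on time; the PNN
    part scores every stored game with neural_function and takes the
    performance of the most probable match as the class estimate.
    Each history entry holds 'time', 'performance', and the factor fields."""
    times = np.array([[h["time"]] for h in history])
    perfs = np.array([h["performance"] for h in history])
    regression = grnn_predict(np.array([time_x]), times, perfs, sigma)
    classification = max(
        history, key=lambda h: neural_function(factors, h, sigma))["performance"]
    return (regression + classification) / 2.0   # average of the sum of the two

def select_player(time_x, factors, histories):
    """Output layer: compare estimated performances across candidates."""
    scores = {name: predict_player(time_x, factors, hist)
              for name, hist in histories.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```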