1 Brief Introduction
2 Backpropagation Algorithm
3 A Simple Illustration
1.1 History
1.2 Review of Decision Trees
• The learning process reduces the error, which can be understood as the difference between the target values and the output values of the learning structure.
• The ID3 algorithm can be applied only to discrete values.
• An artificial neural network (ANN) can describe arbitrary functions.
1.3 Basic Structure
• This example of ANN learning is provided by Pomerleau's (1993) system ALVINN, which uses a learned ANN to steer an autonomous vehicle driving at normal speeds. The input to the ANN is a 30x32 grid of pixel intensities obtained from a forward-facing camera mounted on the vehicle. The output is the direction in which the vehicle is steered.
• As can be seen, four units receive inputs directly from all of the 30x32 pixels from the camera on the vehicle. These are called "hidden" units because their outputs are available only to the units of the next layer, not as a part of the global network output.
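The layered structure just described can be sketched in a few lines of Python. This is a minimal illustration, not Pomerleau's actual code: the weight ranges and the constant test input are made-up assumptions, and only the forward pass (no learning) is shown.

```python
import math
import random

def forward(pixels, w_hidden, w_out):
    """Forward pass of a tiny one-hidden-layer network.

    pixels   : flat list of input intensities (ALVINN used a 30x32 grid)
    w_hidden : one weight list per hidden unit (ALVINN used 4 hidden units)
    w_out    : one weight list per output unit (steering directions)
    """
    sigmoid = lambda net: 1.0 / (1.0 + math.exp(-net))
    # Each hidden unit sees every input pixel; its output is visible
    # only to the next layer, not as part of the network's output.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, pixels))) for ws in w_hidden]
    return [sigmoid(sum(w * h for w, h in zip(ws, hidden))) for ws in w_out]

random.seed(0)
n_inputs, n_hidden, n_outputs = 30 * 32, 4, 30  # sizes loosely modeled on ALVINN
w_hidden = [[random.uniform(-0.05, 0.05) for _ in range(n_inputs)]
            for _ in range(n_hidden)]
w_out = [[random.uniform(-0.05, 0.05) for _ in range(n_hidden)]
         for _ in range(n_outputs)]
outputs = forward([0.5] * n_inputs, w_hidden, w_out)
```

The hidden units act as a learned compression of the 960 pixels into 4 internal values, from which the output units compute the steering direction.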
1.4 Ability
• Instances are represented by many attribute-value pairs. The target function to be learned is defined over instances that can be described by a vector of predefined features, such as the pixel values in the ALVINN example.
• The training examples may contain errors. As the following sections show, ANN learning methods are quite robust to noise in the training data.
• Long training times are acceptable. Compared to decision tree learning, a network training algorithm requires longer training time, depending on factors such as the number of weights in the network.
2.1 Sigmoid
• Like the perceptron, the sigmoid unit first computes a linear combination of its inputs:
  net = Σ_i w_i x_i
• Then the sigmoid unit computes its output with the following function:
  o = σ(net) = 1 / (1 + e^(−net))    (2)
• Equation 2 is often referred to as the squashing function, since it maps a very large input domain onto a small range of outputs.
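The two steps above can be sketched as a small Python function; the weights and inputs below are made up purely for illustration:

```python
import math

def sigmoid_unit(weights, inputs, bias=0.0):
    # Step 1: linear combination of the inputs, as in the perceptron.
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    # Step 2: squash with sigma(net) = 1 / (1 + e^(-net)),
    # mapping any net value into the small range (0, 1).
    return 1.0 / (1.0 + math.exp(-net))

print(sigmoid_unit([1.0], [0.0]))    # net = 0   -> output 0.5
print(sigmoid_unit([1.0], [100.0]))  # huge net  -> output close to 1
print(sigmoid_unit([1.0], [-100.0])) # tiny net  -> output close to 0
```

The last two calls show the "squashing" behavior: arbitrarily large positive or negative net values are compressed toward 1 and 0, respectively.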
  E(w) = ½ Σ_{d∈D} Σ_{k∈outputs} (t_kd − o_kd)²    (3)
• In practice, however, because function 3 sums the error over the whole set of training data, the algorithm needs more time to compute with it and can easily be trapped in a local minimum. One therefore constructs a new function, named the stochastic squared error:
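The contrast between the whole-set error of function 3 and a per-example (stochastic) error can be sketched as follows; the linear model `predict` and the three training pairs are purely hypothetical:

```python
def batch_error(examples, predict):
    # Function-3 style: half the squared error, summed over the whole
    # training set (here with a single output per example).
    return 0.5 * sum((t - predict(x)) ** 2 for x, t in examples)

def stochastic_error(x, t, predict):
    # Per-example error used for stochastic (incremental) updates:
    # cheap to compute, and updating after each example helps the
    # search avoid settling into a local minimum.
    return 0.5 * (t - predict(x)) ** 2

examples = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]  # (input, target) pairs
predict = lambda x: 1.5 * x                      # hypothetical model
total = batch_error(examples, predict)
# The batch error is exactly the sum of the per-example errors:
assert abs(total - sum(stochastic_error(x, t, predict)
                       for x, t in examples)) < 1e-12
```

Stochastic training updates the weights after each individual example rather than after one pass over the whole set, which is what makes it faster per step.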