Indian Institute of Science, Bangalore

Letter Image Recognition using Neural Network Pattern Recognition

Objective:
To identify each of a large number of black-and-white rectangular pixel displays as one of the 26
capital letters in the English alphabet.
Number of Inputs/Attributes: 16 Numeric attributes
Number of Targets: 26
Number of Instances: 20000
This is a pattern recognition problem: a neural network is used to classify each instance as one of
the 26 letters on the basis of the 16 input attributes.
Number of Instances used for Training: 16000
Number of Instances used for Validation: 3000
Number of Instances used for Testing: 1000
Fivefold cross validation has been used to ensure that the training and test sets are independent and
that every instance is used for testing.

Data Processing:
The target file consisted of letters, which could not be imported into MATLAB directly as numeric
data. Hence each letter was converted into a vector of numerals (a combination of 1s and 0s): a
26-element vector was defined for every letter to code it uniquely. For example, A was coded as a 1
in the first position followed by 25 zeros. The input is therefore a 20000 x 16 matrix and the target
a 20000 x 26 matrix.
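A minimal MATLAB sketch of this encoding (variable names such as letters and X are illustrative;
letters is assumed to be a 20000 x 1 character array read from the target file):

    % Map letters 'A'..'Z' to class indices 1..26
    idx = double(upper(letters)) - double('A') + 1;

    % Build the 20000 x 26 one-hot target matrix:
    % row i has a 1 in column idx(i) and 0 elsewhere
    n = numel(idx);
    T = zeros(n, 26);
    T(sub2ind(size(T), (1:n)', idx)) = 1;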

Selection of Network:
We start with a two-layer neural network with the following specifications:

Two Layer Networks:
Network: 16 h 26 (h = number of neurons in the hidden layer, varied below)
Number of Neurons in the output layer: 26 (equal to the number of classes)
Transfer function: Tan Sigmoid in both layers
Algorithm: Scaled Conjugate Gradient
Convergence is based on minimum Mean Squared Error

Learning Rate: 1

Number of Neurons in the Hidden Layer    Mean Squared Error (validation set)
16                                       0.0259
18                                       0.0204
20                                       0.0119
22                                       0.0111
24                                       0.0126
26                                       0.0141

For more than 26 neurons in the hidden layer, no improvement in MSE was found.
The table shows that, for a learning rate of 1, the best performance (MSE = 0.0111) is obtained with
22 neurons in the hidden layer. Note that the values shown in the table are means over the five folds
of cross validation.
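A sketch of how one row of this table could be produced, assuming the Deep Learning Toolbox, with X
(16 x 20000) and T (26 x 20000) being the matrices above transposed so that each column is one
instance. The learning rate quoted above is not set in code here, since trainscg adapts its own step
sizes internally:

    % Fivefold cross-validation of a two-layer pattern recognition network
    k = 5;
    n = size(X, 2);
    foldId = mod(randperm(n), k) + 1;        % random fold labels 1..5
    mseVal = zeros(k, 1);

    for f = 1:k
        net = patternnet(22, 'trainscg');    % 22 hidden neurons, scaled conjugate gradient
        net.performFcn = 'mse';              % convergence based on mean squared error
        net.layers{1}.transferFcn = 'tansig';
        net.layers{2}.transferFcn = 'tansig';
        net.divideFcn = 'divideind';         % hold out fold f as the validation set
        net.divideParam.trainInd = find(foldId ~= f);
        net.divideParam.valInd   = find(foldId == f);
        net.divideParam.testInd  = [];
        [net, tr] = train(net, X, T);
        mseVal(f) = tr.best_vperf;           % lowest validation MSE in this fold
    end

    fprintf('Mean validation MSE over %d folds: %.4f\n', k, mean(mseVal));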

Learning Rate: 0.75

Number of Neurons in the Hidden Layer    Mean Squared Error (validation set)
16                                       0.01306
18                                       0.0177
20                                       0.0112
22                                       0.0138
24                                       0.0131
26                                       0.0123

From the above table it is clear that for a learning rate of 0.75, the best performance is obtained for
20 neurons in the hidden layer.

Learning Rate: 0.5

Number of Neurons in the Hidden Layer    Mean Squared Error (validation set)
16                                       0.0164
18                                       0.0142
20                                       0.0134
22                                       0.0124
24                                       0.0136
26                                       0.0209

From the above table it is evident that for a learning rate of 0.5, the best performance is obtained
for 22 neurons in the hidden layer.

Comparing the results for the three learning rates, the lowest MSE (0.0111) is obtained with a
learning rate of one and 22 neurons in the hidden layer.

We next examine whether there is any significant improvement in the performance of the network when
the number of hidden layers is increased beyond one.

Three Layer Networks:
Network1: 16 15 15 26
Number of Neurons in the two hidden layers: 15
Learning Rate: 1
Number of Neurons in the output layer: 26
Transfer function: Tan Sigmoid in all layers
Algorithm: Scaled Conjugate Gradient
Convergence is based on minimum Mean Squared Error
Mean Squared Error: 0.0142 on the validation set (after fivefold cross validation)
Network2: 16 12 12 26
Learning Rate: 1
Number of Neurons in the two hidden layers: 12
Mean Squared Error: 0.0148 on the validation set (after fivefold cross validation)
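Structurally, the only change from the two-layer case is the vector of hidden-layer sizes passed to
patternnet; a sketch for Network 1:

    % Three-layer network: two hidden layers of 15 tansig neurons each
    net3 = patternnet([15 15], 'trainscg');
    net3.layers{1}.transferFcn = 'tansig';
    net3.layers{2}.transferFcn = 'tansig';
    net3.layers{3}.transferFcn = 'tansig';   % output layer (26 neurons, sized from T at training)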
For the three-layer networks, results are displayed only for the two best models found after trying
different combinations of neurons and learning rates. Network 1, with 15 neurons in each of the two
hidden layers and a learning rate of 1, is the better of the two.
Comparing the performance of the two-layer and three-layer networks, we find that the best network is
the two-layer network with 22 neurons in its single hidden layer and a learning rate of one.
Final Neural Network Model:

Number of Neurons in the hidden layer: 22
Learning Rate: 1
Transfer function: Tan Sigmoid in both layers
Mean Squared Error: 0.011044 (on the validation set)
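The selected configuration can be assembled as follows (a sketch; the 16000/3000/1000 split from the
Objective section is expressed through MATLAB's ratio-based data division):

    % Final model: 16 inputs -> 22 tansig hidden neurons -> 26 tansig outputs
    net = patternnet(22, 'trainscg');
    net.performFcn = 'mse';
    net.layers{1}.transferFcn = 'tansig';
    net.layers{2}.transferFcn = 'tansig';
    net.divideFcn = 'dividerand';              % random train/validation/test split
    net.divideParam.trainRatio = 16000/20000;  % 80% training
    net.divideParam.valRatio   = 3000/20000;   % 15% validation
    net.divideParam.testRatio  = 1000/20000;   % 5% testing
    [net, tr] = train(net, X, T);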

The performance plot is shown below:

[Figure: performance plot of MSE versus epoch for the training, validation and test sets]
The performance plot shows MSE against epochs (iterations) separately for the training, validation
and test sets. As training proceeds the MSE falls, and training is stopped at the point beyond which
the validation MSE begins to rise. Here training stopped when the MSE on the validation set reached
its lowest value of 0.011, at epoch 560.

The Receiver Operating Characteristic (ROC) plot is shown below:

[Figure: ROC curves for the 26 classes]
The ROC curve plots the true positive rate (sensitivity) on the Y axis against the false positive
rate (1 - specificity) on the X axis as the classification threshold is varied. The colored lines
show the curves for each of the 26 classes. Network performance is best when all the curves lie
towards the top-left corner. The figure above shows that all the curves do lie in the upper-left
corner, implying that the network is performing well.
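Both diagnostic plots can be regenerated from the trained network and its training record tr (a
sketch):

    Y = net(X);        % network outputs, 26 x 20000
    plotperform(tr);   % MSE versus epoch for training, validation and test sets
    plotroc(T, Y);     % one ROC curve per class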
Conclusion:
The neural network model 16 22 26 (16 inputs, 22 hidden neurons, 26 output neurons) that has been
selected for solving the character pattern recognition problem is shown below:

[Figure: diagram of the selected network]

The selected model works well on the training set in that it does not overfit, while at the same time
performing well on the test set.
