International Journal of Computational Intelligence and Information Security, January 2015, Vol. 6, No. 1
ISSN: 1837-7823

Classification and Recognition of Handwritten Digits by the Self-Organizing Map of Kohonen


Elotmani Saber and El hithmy Mohammed
Faculté des Sciences, Oujda
Université Mohammed I, Oujda, Morocco
selotmani@yahoo.fr
Abstract
We study in this work the classification and recognition of handwritten digits by a Kohonen neural network. We used a database of handwritten digits produced by different writers: 300 digits, 30 of each from 0 to 9, acquired as colour images. The colour images are converted to grayscale and then to black and white. After cropping, the size of each image is standardised to a matrix of 24x12 pixels. The 24 lines of the image are concatenated to form a vector of 288 elements. The vectors corresponding to all the digits of the database are stored in a matrix.
The vectors of this matrix are presented in arbitrary order to a classifier of the Kohonen self-organizing map type. The map size to be used was studied. We identify the digit value associated with each synaptic weight obtained at convergence. The identification is carried out by measuring the Euclidean distance between the obtained weights and all the digits used in the learning process; each weight is identified with the digit for which the distance is minimal. The weights at convergence are processed with two operations, filtering and refining, which helped to improve the recognition process. The recognition results showed good efficiency for all the digits except the digit 9.
Keywords: Pre-processing, Kohonen self-organizing map, Euclidean distance, Filtering, Refining, Classification, Handwritten digit recognition

1. Introduction
Reading, by a machine, of digits written by hand remains a major challenge for current scientific research. To meet industrial demands, the machine should have good accuracy, acceptable classification, good running time, and robustness to variations in handwriting styles [1]. In order to satisfy these demands, several approaches have been developed, both in the choice of the digit representation and in the type of classifier. Until recently, the best performance has been given by hybrid classifiers, usually several classifiers in series, in order to refine the classification [2]. We opted for the self-organizing Kohonen map, with the digits represented by their entire image after pre-processing.
Various classifiers and feature extraction techniques for the digits have been reported in the
literature [3], [4]. Various databases of handwritten digits have been used, such as CENPARMI [5], CEDAR [6], and MNIST [7], [8]. It has been shown in [2] that the best results were obtained with compound classifiers, but unfortunately these classifiers require a lot of memory space and high computing speeds [9].
The chain-code method and gradient features for representing the digit have been reported in the literature [10]. Accuracy can still be improved with the addition of other complementary structural features. The chain-code feature is extracted from binary images, and the gradient feature is applied to binary or grayscale images [11]. For those methods, acquiring the features requires a lot of computing time, memory space and a complex classifier.
This paper is divided into six sections. The preprocessing is given in section 2, and the Kohonen SOM classifier is presented in section 3. In section 4 we present two methods aimed at improving the accuracy of the system: filtering and refining. Simulation results are presented in section 5, and a conclusion is given in section 6.
2. Preprocessing

Figure 1.1: Learning database

Figure 1.2: Test database

Figures 1.1 and 1.2 show the digits used for the learning and test stages respectively: 200 digits (20 of each) and 100 digits (10 of each) from 0 to 9 are preprocessed, and the images of the digits become black and white images of the same size, 24x12. With this procedure, a neuron weight can be visualised as an image.
Digit images were scanned and inverted (white on black). The region of interest (ROI) is extracted by cutting the original image into horizontal lines (figure 2a) and then cutting vertically to get the image of a single digit (figure 2b). The individual digit images are pre-processed: they are converted to black and white, cropped, and resized so that all the images have the same size. The digit image lines are then concatenated from top to bottom, and a 1D vector of 288 components is formed.
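The pre-processing chain described above can be sketched as follows. This is a minimal NumPy sketch, not the authors' exact implementation: the function name, the binarization threshold of 128 and the nearest-neighbour resize are illustrative assumptions.

```python
import numpy as np

def preprocess_digit(gray, out_h=24, out_w=12, thresh=128):
    """Binarize (white ink on black), crop to the ink bounding box,
    resize to the standard 24x12 size, and flatten the 24 lines
    into a 288-component vector of 0s and 1s."""
    bw = (gray < thresh).astype(np.uint8)                 # dark ink -> 1
    rows = np.flatnonzero(bw.any(axis=1))
    cols = np.flatnonzero(bw.any(axis=0))
    bw = bw[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]   # crop to ROI
    # nearest-neighbour resize to out_h x out_w
    r = np.arange(out_h) * bw.shape[0] // out_h
    c = np.arange(out_w) * bw.shape[1] // out_w
    resized = bw[np.ix_(r, c)]
    return resized.reshape(-1).astype(float)              # 24 lines -> 288 values
```

Applied to a scanned digit image, this yields one row of the input database matrix.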

Figure 2a: Line of digits

Figure 2b: Isolated digit

3. Kohonen classifier
The Kohonen classifier, or Self-Organizing Map (SOM), that we used is a two-dimensional neural network. It is an unsupervised classifier. The input data, taken at random from the input database, are fed to the neural network (figure 3).
The weights are updated at each iteration by an algorithm initially proposed by Kohonen [12], [13], [14]. The SOM can be thought of as a reorganization, preserving the topology, of the input space into a two-dimensional grid of neurons. Preserving the topology of the input space means that objects presented for learning which are close to each other in the input space should be mapped to output units which are close to each other.
[Figure content: inputs X1 to X200 feed a 16x16 grid of neurons (rows and columns) with weights W(1,1) to W(16,16); neighbourhoods of orders 1 to 3 are drawn around a central neuron.]
Figure 3: two-dimensional neural layer

The closeness concept is measured using the Euclidean distance. The various steps of the algorithm are:
0. Initialisation;
1. Give the number of iterations tmax;
2. While t < tmax do:
3. Determine the distance of each neuron to the given input;
4. The neuron with the minimum distance is the winner;
5. Update the weights of the winner neuron and of its neighbours (according to the order of the neighbourhood);
6. If t < tmax go to (3), otherwise go to end (7);
7. End.
The weight components are initialized with random values between 0 and 1; they should be close to the values of the input vectors, whose components are either 0 or 1. The winner neuron is chosen according to the minimum Euclidean distance by:

j* = arg min_{j ∈ {1, …, n}} ‖Xi − Wj‖    (1)

where Xi is the input, n the total number of neurons, Wj the weight vector associated with index j, and j* the index of the winner neuron. The weights of the winner neuron and of its closer neighbourhood neurons are updated according to the following learning rule:

Wj(i+1) = Wj(i) + η(i) h(j, j*) (Xi − Wj(i))    (2)

where η(i) is the learning rate, j the index of the neuron, h(j, j*) a neighbourhood function which determines to what degree neuron j is considered a neighbour of the winner j*, and i the iteration number. The neighbourhood function is a Gaussian function defined by:

h(j, j*) = exp(−d² / (2σ²(i)))    (3)

where d is the distance in the topological map between the winning neuron and neuron j, and σ(i) is the radius of influence. σ measures the degree to which the neurons excited around the winning neuron cooperate in the learning process. The coefficient h(j, j*) is used to propagate learning to the neighbouring neurons of the winner. It is maximum for the winner neuron and decreases when moving away from it. The number of affected neighbours must decrease over time in order to allow the learning to converge. In the organization phase, σ is updated at each iteration using the exponential decay:

σ(i) = σ0 e^(−i/τ1)    (4)

with σ0 the initial radius of the map and τ1 determined such that:

σ(m) = σ0 e^(−m/τ1) = 1 ;  τ1 = m / log(σ0)    (5)

We took m = 10000, which gives τ1 = 9705.


σ0 should initialize the neighbourhood of the winner neuron to cover almost all the neurons in the network, and the neighbourhood region then shrinks gently with the iterations. For a square map containing l×l neurons, σ0 = √(l² + l²)/2 = l√2/2; for a map containing 16×16 neurons this gives σ0 = 8√2. If the value of this coefficient is large, learning progresses faster, but if it is very large the network may never converge. This is due to oscillations of the weight vectors becoming so large that the classification is not performed [15]. The learning rate is chosen as:

η(i) = η(0) e^(−i/τ2)    (6)

with η(0) the initial learning rate. We took η(0) = 0.9 and τ2 = 4342. According to Kohonen [15], η(i) must take values close to 1 for the first 1000 steps and then decrease gradually. When η reaches the value 0.01, at iteration 30000, we fix η to this value and do not allow it to decrease further. As the iterations progress, h(j, j*) goes to one for j = j* and to zero for j ≠ j*.
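The learning procedure of this section, winner selection (1), weight update (2), Gaussian neighbourhood (3) and the exponential decays (4) and (6), can be sketched as follows. This is a minimal NumPy sketch under the stated parameter values; the function name, the random sampling scheme and the random initialization are assumptions.

```python
import numpy as np

def train_som(X, grid=16, t_max=30000, sigma0=8 * 2 ** 0.5,
              tau1=9705, eta0=0.9, tau2=4342, eta_min=0.01, seed=0):
    """Train a Kohonen SOM on the rows of X (n_samples x 288).
    Defaults follow the values quoted in the text."""
    rng = np.random.default_rng(seed)
    W = rng.random((grid * grid, X.shape[1]))            # weights in [0, 1]
    # (row, col) position of each neuron in the 2-D map
    pos = np.stack(np.meshgrid(np.arange(grid), np.arange(grid),
                               indexing="ij"), axis=-1).reshape(-1, 2)
    for t in range(t_max):
        x = X[rng.integers(len(X))]                      # random input
        j_star = np.argmin(np.linalg.norm(W - x, axis=1))   # eq. (1)
        d2 = ((pos - pos[j_star]) ** 2).sum(axis=1)      # map distances d^2
        sigma = sigma0 * np.exp(-t / tau1)               # eq. (4)
        eta = max(eta0 * np.exp(-t / tau2), eta_min)     # eq. (6), clamped at 0.01
        h = np.exp(-d2 / (2 * sigma ** 2))               # eq. (3)
        W += eta * h[:, None] * (x - W)                  # eq. (2)
    return W
```

Since each update is a convex combination of the old weight and an input in [0, 1], the weights remain in [0, 1] throughout training.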
4. Filtering and Refining
The neuron weights obtained at convergence should define areas in the Kohonen map that realise the classification of the input data. The weights at convergence should converge to input data whose component values are either 0 or 1, and the closest zones in the Kohonen map should correspond to the closest input data. Because of these two arguments, we propose to filter and refine the weights at convergence. The filtering operation consists of changing the neuron weight components according to the following law: if a weight component has a value lower than or equal to 0.3, it is replaced by 0; if this value is greater than or equal to 0.7, it is replaced by 1. The result of this process is given in table 2.1. At convergence, and after performing the filtering operation on the weights, the attribution of each weight to its specific class is carried out by comparing the weight with all the objects in the input database matrix:

k* = arg min_{k ∈ {1, …, 200}} ‖Wj − Xk‖,  j = 1 to 256    (7)

Filtering helps obtain a good identification of the synaptic weight classes with respect to the objects initially stored in the input database.
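The filtering law and the class attribution of equation (7) can be sketched as follows (an illustrative NumPy sketch; the function names are assumptions):

```python
import numpy as np

def filter_weights(W, lo=0.3, hi=0.7):
    """Snap weight components toward binary values:
    <= 0.3 -> 0, >= 0.7 -> 1, intermediate values kept unchanged."""
    W = W.copy()
    W[W <= lo] = 0.0
    W[W >= hi] = 1.0
    return W

def identify(W, X, labels):
    """Assign each weight the digit label of the nearest object
    in the learning database, as in eq. (7)."""
    d = np.linalg.norm(W[:, None, :] - X[None, :, :], axis=2)
    return labels[np.argmin(d, axis=1)]
```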
The refining operation consists of changing an isolated single digit into the most frequent digit found in its surroundings. It may be noted in Table 2.2, where the weights are identified, that there are weights whose classes are isolated: they have no neighbours of the same class. We chose to give an isolated weight the same class as the one that is dominant among its neighbours. Indeed, if a weight is close to another one in a SOM map, the two weights are quite similar; they could then be assimilated into a single class. If a weight is isolated, its class may be changed to that of a neighbour if the class of this neighbour is dominant (more frequent) in the neighbourhood: its likeness is stronger with the neighbours in this class than with the other neighbours.
This refining allows a better identification of the weight classes, collecting similar weights into wider areas.
We considered only isolated single weights because of the reduced number of neurons in our SOM map; for applications with a SOM map with a very large number of neurons, small isolated areas may be treated in the same manner.
We considered a window of size 3x3 whose centre is the isolated weight, and we look for the dominant class in this window.
3 3 3        3 3 3
9 4 3   →    9 3 3
9 3 3        9 3 3

Figure 4: Replacement of the isolated class 4 by the class 3, dominant in its neighbourhood

After these operations, we obtain the refined table of classes, table 2.3.
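The refining operation, replacing an isolated weight by the dominant class of its 3x3 window, can be sketched as follows. This is an illustrative sketch; the tie-breaking of `Counter.most_common` is an implementation choice not specified in the text.

```python
import numpy as np
from collections import Counter

def refine(classes):
    """Replace each isolated class (one with no identical neighbour
    in its 3x3 window) by the dominant class of that window."""
    C = np.asarray(classes)
    out = C.copy()
    rows, cols = C.shape
    for i in range(rows):
        for j in range(cols):
            win = C[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2].ravel().tolist()
            win.remove(C[i, j])               # keep the neighbours only
            if win and C[i, j] not in win:    # the weight is isolated
                out[i, j] = Counter(win).most_common(1)[0][0]
    return out
```

On the 3x3 example of Figure 4, the isolated centre class 4 is replaced by the dominant neighbouring class 3, while non-isolated cells are left unchanged.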

Table 2.1: Images of the weights (each weight vector displayed as a 24x12 image; not reproducible in text)

Table 2.2: Identification of the weights (16x16 map)

9 9 9 3 9 9 8 6 6 6 5 5 5 5 5 5
9 3 9 3 4 3 8 5 6 6 6 5 5 5 5 5
3 3 3 3 3 3 9 5 8 6 6 6 5 5 5 8
9 3 9 3 3 8 9 8 8 6 7 6 8 8 8 8
3 3 3 3 3 8 8 8 8 6 6 6 8 8 8 8
9 3 3 3 3 8 8 8 8 6 6 6 8 8 2 8
0 5 2 3 0 0 1 1 8 6 2 2 2 2 2 2
0 0 2 9 0 1 1 1 1 2 2 2 2 2 2 0
6 6 0 9 1 1 1 1 1 3 2 6 2 2 2 1
0 0 0 0 1 1 1 1 2 2 2 2 2 2 2 7
0 0 0 0 1 1 1 9 2 2 2 2 2 2 7 7
0 0 0 1 1 2 9 9 9 2 2 2 2 5 7 7
0 0 1 1 1 1 9 9 4 4 4 4 4 5 7 7
6 1 1 1 1 4 4 4 4 4 4 4 4 4 7 7
7 7 7 7 7 4 4 4 4 4 4 4 4 4 7 8
7 7 7 7 7 4 4 4 4 4 4 4 4 4 4 7

Table 2.3: Refined identifications (16x16 map)

9 9 9 3 9 9 8 6 6 6 5 5 5 5 5 5
9 3 9 3 3 3 8 5 6 6 6 5 5 5 5 5
3 3 3 3 3 3 9 5 8 6 6 6 5 5 5 8
3 3 3 3 3 8 9 8 8 6 6 6 8 8 8 8
3 3 3 3 3 8 8 8 8 6 6 6 8 8 8 8
3 3 3 3 3 8 8 8 8 6 6 6 8 8 2 8
0 0 2 3 0 0 1 1 8 6 2 2 2 2 2 2
0 0 2 9 0 1 1 1 1 2 2 2 2 2 2 2
6 6 0 9 1 1 1 1 1 2 2 2 2 2 2 2
0 0 0 0 1 1 1 1 2 2 2 2 2 2 2 7
0 0 0 0 1 1 1 9 2 2 2 2 2 2 7 7
0 0 0 1 1 1 9 9 9 2 2 2 2 5 7 7
0 0 1 1 1 1 9 9 4 4 4 4 4 5 7 7
0 1 1 1 1 4 4 4 4 4 4 4 4 4 7 7
7 7 7 7 7 4 4 4 4 4 4 4 4 4 7 7
7 7 7 7 7 4 4 4 4 4 4 4 4 4 4 7

Table 2: Images of the weights, their identifications, and their refined identifications

Isolated digits are shown in red in Table 2.2.


5. Results, discussion and interpretation
Several tests for different values of the number of neurons to be considered in the SOM
map were performed. We first chose different forms for the same number of neurons (6X4,
4X6, 8X3, 3X8), and tests showed no particular performance for a given form, even if the
results are different for each test. Then we tried square shapes with different numbers (4X4,
5X5, 6X6, and 7X7... 10 x 10, 14X14 and 20X20) which showed a marked improvement for
the 16X16 map. Obviously, the number of neurons depends on the number of objects in the
input database and the number of classes. Indeed, for 4X4 and 5X5 maps, some expected
classes were missing and not represented in the map. The study of convergence has been
made to a map of different numbers of neurons and a square shape; it showed that for the
same data, it always converges.

Figure 6: Convergence of a component of a weight vector

Figure 6 shows the convergence trend of a weight component taken arbitrarily.


Learning was carried out and Tables 2.1, 2.2 and 2.3 were produced. In order to validate these tables and to verify that filtering and refining have helped to improve recognition, the data from the learning and test databases are compared to the neuron weights of the Kohonen SOM obtained at convergence, and the recognition results are assessed. Each of the learning inputs Xli, i = 1 to 200, is compared to all the neuron weights Wcj, j = 1 to n, n = 256, obtained at convergence:

ji = arg min_{j ∈ [1, n]} ‖Xli − Wcj‖

The digit placed at position ji in Table 2.3 is compared with the class of Xli, and the recognition rate is measured.
The same procedure is repeated with the test database objects. The results are summarized in tables 3 to 6. We note that refining has given rise to larger areas for a given digit; this generates identifications that are more accurate.
Class | Recognized digit at positions 1 to 20 in the learning database | Recognition rate
0     | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 | 90%
1     | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | 100%
2     | 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 | 100%
3     | 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 | 100%
4     | 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 | 95%
5     | 5 5 5 5 5 5 5 5 5 0 5 5 5 5 5 5 5 5 5 5 | 95%
6     | 6 6 6 6 6 2 6 6 6 6 6 6 6 6 6 6 6 0 6 6 | 90%
7     | 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 | 100%
8     | 8 8 8 8 8 8 8 8 8 8 8 8 8 8 9 8 8 8 8 9 | 90%
9     | 9 9 9 9 3 9 9 9 9 3 3 3 9 9 3 3 9 9 9 3 | 65%

Table 3: Recognition of the elements of the learning database

Table 3 shows the recognition results for each digit of figure 1.1. For example, the first eighteen zeros were recognized; the nineteenth and twentieth were recognized as 2 instead of 0. The digits 1, 2, 3 and 7 were recognized at 100%; the digit 9 was recognized at a rate of 65%, being confused with the digit 3 in 7 positions.
Output \ Input |  0  1  2  3  4  5  6  7  8  9
0              | 18  0  0  0  0  1  1  0  0  0
1              |  0 20  0  0  0  0  0  0  0  0
2              |  2  0 20  0  0  0  1  0  0  0
3              |  0  0  0 20  0  0  0  0  0  7
4              |  0  0  0  0 19  0  0  0  0  0
5              |  0  0  0  0  1 19  0  0  0  0
6              |  0  0  0  0  0  0 18  0  0  0
7              |  0  0  0  0  0  0  0 20  0  0
8              |  0  0  0  0  0  0  0  0 18  0
9              |  0  0  0  0  0  0  0  0  2 13

Table 4: Confusion matrix of the elements of the learning database

Table 4 shows the degree of confusion between the digits; the lines represent the output digits and the columns the input digits. The 0 was recognized in 18 positions and was confused with 2 in two positions. The digit 9 was recognized in 13 positions and was confused with 3 in 7 positions.
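A confusion table such as Tables 4 and 6 can be computed from the true and recognized digits as follows (an illustrative NumPy sketch; the function name is an assumption):

```python
import numpy as np

def confusion(true_digits, predicted_digits, n_classes=10):
    """Build the confusion matrix with rows = output (recognized)
    digit and columns = input (true) digit, as in Tables 4 and 6."""
    M = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_digits, predicted_digits):
        M[p, t] += 1
    return M
```

The per-class recognition rate is then the diagonal entry divided by the corresponding column sum.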

Class | Recognized digit at positions 1 to 10 in the test database | Recognition rate
0     | 6 0 0 0 0 0 0 0 0 0 | 90%
1     | 1 1 1 1 1 1 1 1 1 1 | 100%
2     | 2 2 2 2 2 2 2 2 2 2 | 100%
3     | 3 3 3 9 3 3 3 8 3 3 | 80%
4     | 4 7 2 4 5 5 4 4 5 4 | 50%
5     | 5 5 5 5 5 5 5 5 5 5 | 100%
6     | 6 6 6 5 6 0 6 2 0 6 | 60%
7     | 7 7 4 7 7 7 4 4 7 7 | 70%
8     | 8 8 3 8 8 8 8 8 8 8 | 90%
9     | 9 3 9 1 1 3 3 3 9 9 | 40%

Table 5: Recognition of the elements of the test database

Table 5 shows the recognition results for each digit of figure 1.2. For example, the first 0 in the database was confused with 6; the other zeros were all recognized. The digits 1, 2 and 5 were recognized at 100%.
Output \ Input |  0  1  2  3  4  5  6  7  8  9
0              |  9  0  0  0  0  0  2  0  0  0
1              |  0 10  0  0  0  0  0  0  0  2
2              |  0  0 10  0  1  0  1  0  0  0
3              |  0  0  0  8  0  0  0  0  1  4
4              |  0  0  0  0  5  0  0  3  0  0
5              |  0  0  0  0  3 10  1  0  0  0
6              |  1  0  0  0  0  0  6  0  0  0
7              |  0  0  0  0  1  0  0  7  0  0
8              |  0  0  0  1  0  0  0  0  9  0
9              |  0  0  0  1  0  0  0  0  0  4

Table 6: Confusion matrix of the elements of the test database

Table 6 shows the degree of confusion between the digits of the test database. For example, the digit 3 was recognized in eight positions and was confused once with the digit 8 and once with the digit 9.
6. Conclusion
The Kohonen self-organizing map method was used for the classification and recognition of handwritten digits. The digits were represented by their individual images. The data were divided into two sets: one used for learning, containing 200 objects (20 of each digit), and one used for testing, containing 100 objects (10 of each digit). The images of the digits were pre-processed and their sizes standardized to 24x12. After convergence, the neuron weights were filtered and refined, which improved the digit recognition.
The method performed well for a small number of neurons; future research could use maps with larger numbers of neurons and a larger number of objects.
7. References
[1] Impedovo S., Mangini F.M. and Barbuzzi D., (2014), A novel prototype generation technique for handwriting digit recognition, Pattern Recognition, Vol. 47, Issue 3, pp. 1002-1010.
[2] Cheng-Lin L., Kazuki N., Hiroshi S. and Hiromichi F., (2003), Handwritten digit recognition: benchmarking of state-of-the-art techniques, Pattern Recognition, Vol. 36, Issue 10, pp. 2271-2285.
[3] El melhaoui O., (2011), Arabic numerals recognition based on an improved version of the loci characteristic, International Journal of Computer Applications, Vol. 24, No. 1, pp. 36-41.
[4] Singh V., Kumar B., Tushar P., (2013), Feature Extraction Techniques for Handwritten Text in Various Scripts: a Survey, International Journal of Soft Computing and Engineering (IJSCE), Vol. 3, Issue 1, pp. 2231-2307.
[5] Hull J.J., (1994), A database for handwritten text recognition research, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 16, Issue 5, pp. 550-554.
[6] Lee D.S. and Srihari S.N., (1993), Handprinted digit recognition: a comparison of algorithms, Proceedings of the Third International Workshop on Frontiers of Handwriting Recognition, Buffalo, NY, pp. 153-164.
[7] LeCun Y., Bottou L., Bengio Y. and Haffner P., (1998), Gradient-based learning applied to document recognition, Proceedings of the IEEE, Vol. 86, Issue 11, pp. 2278-2324.
[8] LeCun Y., et al., (1995), Comparison of learning algorithms for handwritten digit recognition, Proceedings of the International Conference on Artificial Neural Networks, Nanterre, France, pp. 53-60.
[9] Suen C.Y., Nadal C., Legault R., Mai T.A., Lam L., (1992), Computer recognition of unconstrained handwritten numerals, Proceedings of the IEEE, Vol. 80, Issue 7, pp. 1162-1180.
[10] Tushar P., Saurabh U., (2013), Chain Code Based Handwritten Cursive Character Recognition System with Better Segmentation Using Neural Network, International Journal of Computational Engineering Research, Vol. 03, Issue 5, pp. 60-63.
[11] Cheng-Lin L., Kazuki N., Hiroshi S. and Hiromichi F., (2004), Handwritten digit recognition: investigation of normalization and feature extraction techniques, Pattern Recognition, Vol. 37, Issue 2, pp. 265-279.
[12] Kohonen T., (1990), The self-organizing map, Proceedings of the IEEE, Vol. 78, Issue 9, pp. 1464-1480.
[13] Yap T. N., (2004), Automatic Text Archiving and Retrieval Systems Using the Self-Organizing Kohonen Map, IEEE Transactions on Knowledge and Data Engineering, Vol. 16, Issue 3, pp. 380-383.
[14] Diganta N. and S., (2005), Unsupervised Text Classification Using Kohonen Self Organizing Network, Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Text Processing, Mexico City, Mexico, pp. 715-718.
[15] Kohonen T., (1995), Self-Organizing Maps, Springer Series in Information Sciences, Vol. 30, 362 p.
