
Persian Handwritten Numeral Recognition Using Complex Neural Network and Non-linear Feature Extraction
Zeynab Shokoohi
Department of Electrical, Computer and Biomedical Engineering
Qazvin Branch, Islamic Azad University
Qazvin, Iran
z.shokoohi@qiau.ac.ir

Fariborz Mahmoudi
Department of Electrical, Computer and Biomedical Engineering
Qazvin Branch, Islamic Azad University
Qazvin, Iran
mahmoudi@qiau.ac.ir

Abstract: In this paper, we propose a new method for isolated handwritten numeral recognition based on a sparse structure representation. We introduce a sparse structure in the form of an over-complete dictionary, learned with the K-SVD algorithm. The values of this dictionary initialize the first layer of a Complex Neural Network (CNN), which is then trained to perform the classification task. Besides the combined use of the CNN and the K-SVD algorithm, what distinguishes the proposed method from previous work is non-linear feature extraction; earlier methods extracted linear features. When applying either linear or non-linear analysis, it is important to distinguish between two applications: dimensionality reduction, and the correct identification of a specific set of features according to given rules. In the first application, aimed at denoising and strong data compression, only the subspace that retains most of the information in the data needs to be described, without identifying individual components.
Keywords: Non-linear features; Neural Network; K-SVD algorithm

I. INTRODUCTION

In recent decades, Latin handwritten recognition has expanded, with methods achieving high accuracy [20,7,10,12,22,15,16,9,13,5]. This technique has applications in real systems such as postal databases [4,8] and bank check recognition [19]. On the other hand, research on automatic numeral recognition in other languages is less developed. In this paper, we

978-1-4673-6206-1/13/$31.00 ©2013 IEEE

Ali Mahdavi Hormat
Department of Electrical, Computer and Biomedical Engineering
Qazvin Branch, Islamic Azad University
Qazvin, Iran
ali.mahdavi.hormat@qiau.ac.ir

Hamed Badalabadi
Department of Electrical, Computer and Biomedical Engineering
Qazvin Branch, Islamic Azad University
Qazvin, Iran
badalabadi.hamed@qiau.ac.ir

investigate the Persian language, which is used in several countries such as Iran, Afghanistan, Pakistan, and Tajikistan by about one hundred and ten million people. Like Latin numerals, Persian numerals vary in type, size, and direction, as shown in Fig. 1, where 2, 3, 4, and 6 are written differently.

Fig. 1. Persian numerals can be written in various forms

Different recognition techniques have been presented for Persian numerals. Soltanzadeh [18] used an SVM to learn category features. Mowlaei [14] proposed a system for recognizing numerals and characters; he used the Haar wavelet to extract features, which were then passed to an SVM for learning. Ziaratban [7] used a pattern-based feature extraction in which, for each of twenty patterns, the best match between the pattern and the input image is found, and the results initialize a multi-layer perceptron neural network (MLP-NN). Liu recently provided a benchmark for Persian and Bengali handwritten numeral recognition, obtaining good results on the CENPARMI database with the Class-Specific Feature Polynomial Classifier (CFPC) method, using the gradient direction of numeral shapes in normalized grayscale images with a low error rate.

In this paper, we propose a new method for recognizing Persian handwritten numerals that tries to create a sparse, over-complete structure on the data. This sparse structure is an over-complete dictionary whose learned values initialize the weights of the first layer of a CNN. The CNN can be divided into two parts: (1) automatic feature extraction and (2) a learnable classifier. The automatic feature extraction consists of feature-mapping layers that extract the salient features of the input patterns using two components: (1) linear filtering and (2) down-sampling operations. Our proposed method uses non-linear filtering instead of linear filtering for the feature extraction inside the neural network. Because the proposed method is hierarchical, it detects the significant features of the available real-world data and creates an effective non-linear whitening. In addition, quantitative analysis shows that the H-NLPCA method has high accuracy in handwriting classification and feature recognition. Before applying our method, the data and images must be normalized.
II. PRINCIPAL COMPONENT ANALYSIS (PCA)

When using either type of PCA (linear or non-linear), identifying the intended application is important: reducing dimensionality, or determining and identifying a specific set of features based on a criterion. In the first application, aimed at denoising and data compression, only the subspace that, in terms of MSE, retains most of the features of the primary data set is considered. Several neural-network strategies perform linear or non-linear PCA decomposition. However, such decompositions are used for many different problems.

Here we introduce a hierarchical ordering of principal components whose resulting features are uncorrelated. Scaling these features to unit variance yields a whitening transformation, a pre-processing stage for tasks such as regression, classification, or blind source separation.
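The whitening transformation described above can be sketched in NumPy. This is a minimal illustration, not the paper's implementation; the random data and the choice of three components are assumptions made for the example.

```python
import numpy as np

def pca_whiten(X, n_components):
    """Project data onto its principal components and scale each to unit variance.

    X: (N, d) data matrix, one sample per row.
    Returns the whitened, uncorrelated features of shape (N, n_components).
    """
    Xc = X - X.mean(axis=0)                      # center the data
    cov = Xc.T @ Xc / (len(X) - 1)               # sample covariance, (d, d)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order]                        # top principal directions
    Z = Xc @ W                                   # uncorrelated features
    return Z / np.sqrt(eigvals[order])           # scale each feature to unit variance

# Example: after whitening, the feature covariance is the identity matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))  # correlated toy data
Z = pca_whiten(X, 3)
```
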
A good implementation of hierarchical PCA must have two properties: (1) scalability and (2) stability. Scalability means that the first n components explain the most variance in the n-dimensional data space. Stability means that the i-th component of a solution with n features equals the i-th component of a solution with m features (m > n). Unlike standard linear auto-encoding, true PCA is an instance of these hierarchical methods. Kernel PCA [6] is another such algorithm; since it applies true PCA in a non-linear feature space, it is hierarchical as well. The principal curves algorithm behaves similarly and is closer to standard auto-encoding, but it is not hierarchical.
A. From Linear PCA to Non-linear PCA
First, a version of linear auto-encoding extended with non-linear hidden layers provides a non-linear mapping. This strategy yields a non-linear PCA inside a network (called symmetric NLPCA, or S-NLPCA). Fig. 2 shows the S-NLPCA.

Fig. 2. The five-layer non-linear auto-encoder network [2,3,4] maps inputs to outputs. It contains three hidden layers inserted into a standard auto-encoder.

As mentioned in the introduction, S-NLPCA cannot produce distinct features, so this form of non-linear PCA is basically used to determine a subspace. There are two strong ways of introducing a hierarchical constraint on the feature space. One way, similar to linear PCA, is to force the i-th feature to capture the i-th highest variance of the mapping. The other strategy is based on ordering feature i in the original data space according to the reconstruction MSE. Since enforcing the first way is difficult for practical reasons, it may be easier to follow the second. For this reason we present a learning strategy that focuses on the reconstruction MSE:

E = \frac{1}{dN} \sum_{n=1}^{N} \sum_{k=1}^{d} \left( x_k^n - \hat{x}_k^n \right)^2    (1)

where x and \hat{x} are the original and reconstructed data, respectively, N is the number of samples, and d is the dimension. To simplify the presentation we limit the discussion to a two-dimensional feature space; the generalization to other dimensions is straightforward. E_1 and E_{1,2} denote the reconstruction mean errors when using only the first feature and when using both features, respectively. To obtain H-NLPCA we need to minimize not only E_{1,2} (as in S-NLPCA) but also E_1. This is done by minimizing the hierarchical error:

E_H = E_1 + E_{1,2}    (2)

We may wish to emphasize either the optimal unit or the feature subspace. The relation in (2) can be balanced by weighting the error terms E_1 and E_{1,2} with a hyper-parameter \alpha:

E_H = \alpha E_1 + E_{1,2}, \quad \alpha \in (0, \infty)    (3)

However, selecting the optimal \alpha increases the computational cost, while the gain in performance is small on average, so in all experiments we set \alpha equal to one. Fig. 3 shows how this balance adjusts the network's robustness to outlying data.
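As a sketch of how the hierarchical error of Eqs. (1)-(3) can be evaluated, the snippet below uses a linear PCA basis as a stand-in for the non-linear encoder (in H-NLPCA these reconstructions would come from the trained auto-encoder). The toy data and the helper functions are illustrative assumptions, not the paper's code.

```python
import numpy as np

def reconstruction_mse(X, components, k):
    """Mean squared reconstruction error (Eq. 1) using only the first k features."""
    Xc = X - X.mean(axis=0)
    W = components[:, :k]                  # d x k basis with orthonormal columns
    Xhat = Xc @ W @ W.T                    # project onto the subspace and back
    return np.mean((Xc - Xhat) ** 2)       # equals 1/(dN) * sum of squared errors

def hierarchical_error(X, components, alpha=1.0):
    """E_H = alpha * E_1 + E_{1,2} (Eqs. 2 and 3)."""
    return (alpha * reconstruction_mse(X, components, 1)
            + reconstruction_mse(X, components, 2))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4)) * np.array([3.0, 2.0, 1.0, 0.5])  # toy data
_, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
components = Vt.T                          # linear stand-in for the learned encoder
E1 = reconstruction_mse(X, components, 1)
E12 = reconstruction_mse(X, components, 2)
```

Note that minimizing E_H enforces the hierarchy: the error with two features can never exceed the error with one, and the alpha weight trades off how strongly the first feature is prioritized.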

Since the CNN architecture contains a large number of weights to learn, training the network requires a large training set. To generalize the network, a large amount of input data is needed, including noisy and altered variants of the defined data. In this paper distorted data are used, altered in orientation or scale [16]: the scale changes are −15, 0, and 15 percent, and the rotations are −5 and 5 degrees, clockwise or counterclockwise, as shown in Fig. 5.
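The distortions described above might be generated as follows. This is a hypothetical sketch using `scipy.ndimage` rather than the authors' code; the bilinear interpolation, zero padding, and top-left cropping are assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import rotate, zoom

def distort(pattern, angle_deg, scale_pct):
    """Return a rotated and rescaled copy of a 2-D pattern, cropped or padded
    back to its original size (regions outside the input are filled with 0)."""
    out = rotate(pattern, angle_deg, reshape=False, order=1, mode='constant')
    factor = 1.0 + scale_pct / 100.0
    out = zoom(out, factor, order=1)                 # rescale by +/-15 percent
    h, w = pattern.shape
    canvas = np.zeros_like(pattern, dtype=out.dtype) # pad/crop back to 35x35
    ch, cw = min(h, out.shape[0]), min(w, out.shape[1])
    canvas[:ch, :cw] = out[:ch, :cw]
    return canvas

# One toy 35x35 pattern expanded into six distorted training variants.
pattern = np.zeros((35, 35))
pattern[10:25, 15:20] = 1.0
variants = [distort(pattern, a, s) for a in (-5, 5) for s in (-15, 0, 15)]
```
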

Fig. 3. Relation between the errors and the hyper-parameter \alpha for a non-linear five-layer auto-encoder [2,10,19]. The left side (\alpha \to 0) is equivalent to the two-dimensional encoding of the S-NLPCA network. The right side (\alpha \to \infty) is equivalent to standard NLPCA with a single unit in the feature layer.
III. COMPLEX NEURAL NETWORK (CNN)

The CNN is a neural network with a supervised learning architecture that is deep enough for vision tasks. It can be divided into two parts: (1) an automatic feature extractor and (2) a learnable classifier. The automatic feature extractor consists of feature-mapping layers that extract the distinctive features of the input patterns using two components: (1) linear filtering and (2) down-sampling operations. In the feature maps, the filter kernels are 5×5 matrices and the down-sampling factor is 2. Back-propagation is used to learn both the classifier and the filter kernels, instead of using a CNN with a complex hand-designed architecture such as LeNet [5] in [9]. The CNN architecture is shown in Fig. 4. The input layer is a 35×35 matrix containing the normalized patterns (the normalization process is described briefly in the experimental results). The second layer (with N1 feature maps) and the third layer (with N2 feature maps) learn features at two different resolutions. Each neuron in these two layers is connected to a 5×5 window in the preceding layer (according to the defined weights). The last layer of this architecture is a multi-layer perceptron with N3 neurons in the hidden layer and ten neurons in the output layer.
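The forward pass of one feature-extraction path (5×5 filtering followed by factor-2 down-sampling, from a 35×35 input) can be sketched in plain NumPy. The tanh activation and average pooling are assumptions, since the paper does not specify them, and the random kernels stand in for the K-SVD-initialized weights.

```python
import numpy as np

def conv_valid(img, kernel):
    """2-D valid convolution: the 'linear filtering' stage of a feature map."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def downsample(fmap, k=2):
    """k x k average pooling: the 'down-sampling' stage, dropping any remainder."""
    H, W = fmap.shape[0] // k * k, fmap.shape[1] // k * k
    return fmap[:H, :W].reshape(H // k, k, W // k, k).mean(axis=(1, 3))

rng = np.random.default_rng(2)
x = rng.normal(size=(35, 35))          # a normalized 35x35 input pattern
k1 = rng.normal(size=(5, 5)) * 0.1     # in the paper, initialized from the dictionary
k2 = rng.normal(size=(5, 5)) * 0.1
h1 = downsample(np.tanh(conv_valid(x, k1)))   # 35 -> 31 -> 15
h2 = downsample(np.tanh(conv_valid(h1, k2)))  # 15 -> 11 -> 5
```

In the full network, N1 and N2 such maps run in parallel and feed the MLP classifier; only a single path is shown here.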

Fig. 4. The CNN architecture used in this paper

Fig. 5. Distorted patterns: the first and third rows show the patterns before processing, and the second and fourth rows show the same patterns after processing

IV. LEARNING A SPARSE AND OVER-COMPLETE DICTIONARY

Suppose the matrix Y consists of n \times N data, so that each column of Y contains a pattern. We seek an over-complete dictionary, i.e., a matrix D with D \in R^{n \times k}, k > n, such that:

Y = DX    (4)

Each column of X is a sparse vector, and each column of D is called an atom. This problem is solved using the K-SVD algorithm.
A. K-SVD Algorithm
The K-SVD algorithm solves for the over-complete dictionary:

\min_{D,X} \|Y - DX\|_F^2 \quad \text{subject to} \quad \forall i, \ \|x_i\|_0 \le T_0    (5)

Here \|x\|_0 is the number of non-zero entries in a vector x, x_i is the i-th column of the matrix X, and T_0 is the maximum number of non-zero components.
Problem (5) is solved iteratively. First the dictionary D is regarded as fixed, and the K-SVD algorithm tries to find the coefficients of the sparse matrix X:

\|Y - DX\|_F^2 = \sum_{i=1}^{N} \|y_i - Dx_i\|_2^2    (6)

The problem in (5) can thus be written with N terms, as follows:

\min_{D,X} \sum_{i=1}^{N} \|y_i - Dx_i\|_2^2 \quad \text{subject to} \quad \|x_i\|_0 \le T_0, \ i = 1, 2, \ldots, N    (7)
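A simplified K-SVD iteration, alternating greedy sparse coding with SVD-based atom updates, might look as follows. The greedy coder below is a bare-bones stand-in for the pursuit algorithm usually used with K-SVD, and the toy data dimensions are illustrative assumptions.

```python
import numpy as np

def greedy_code(D, y, t0):
    """Greedy sparse coding: select at most t0 atoms to approximate y."""
    residual, idx = y.copy(), []
    x = np.zeros(D.shape[1])
    for _ in range(t0):
        idx.append(int(np.argmax(np.abs(D.T @ residual))))  # most correlated atom
        sub = D[:, idx]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None)      # refit on chosen atoms
        residual = y - sub @ coef
    x[idx] = coef
    return x

def ksvd(Y, k, t0, n_iter=5, seed=0):
    """Alternate sparse coding (Eq. 5 constraint) and rank-1 SVD atom updates."""
    rng = np.random.default_rng(seed)
    D = rng.normal(size=(Y.shape[0], k))
    D /= np.linalg.norm(D, axis=0)                          # unit-norm atoms
    for _ in range(n_iter):
        X = np.column_stack([greedy_code(D, y, t0) for y in Y.T])
        for j in range(k):
            used = np.nonzero(X[j])[0]                      # columns using atom j
            if len(used) == 0:
                continue
            # Error matrix without atom j's contribution, restricted to 'used'.
            E = Y[:, used] - D @ X[:, used] + np.outer(D[:, j], X[j, used])
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j], X[j, used] = U[:, 0], s[0] * Vt[0]     # best rank-1 update
    return D, X

# Toy setup echoing the paper: 25-dim (5x5) patches, k=50 atoms, T0=7.
rng = np.random.default_rng(3)
Y = rng.normal(size=(25, 80))
D, X = ksvd(Y, k=50, t0=7)
```
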


B. Learning by K-SVD
To learn the over-complete dictionary for Persian numeral recognition, 26156 samples of 5×5 patches were selected randomly; the size of each image patch equals the filter-kernel size defined in the first layer of the CNN. K is set to 50, which means the dictionary D has a redundancy factor of 2. T_0 is set to 7 experimentally. Fig. 6 shows the elements of the learned over-complete dictionary; each one has a scale and direction from which the expected character traits can be recognized.

A. CENPARMI Database
The handwritten Farsi numeral database used in the tests is from CENPARMI. Its samples were collected from 175 writers differing in age, education, and gender. All patterns were scanned at 300 dpi resolution and converted to grayscale. Finally, the samples were divided into non-overlapping training, verification, and testing sets: 11000 samples in the training set (1100 per class), 2000 samples in the verification set (200 per class), and 5000 samples in the testing set (500 per class). Some samples from the database are shown in Fig. 7.

Fig. 6. The learned over-complete dictionary (T0 = 7)

V. THE RECOGNITION PROCESS ON HANDWRITTEN INPUT

At the beginning, after normalization of the input handwriting, the contrast is adjusted and the size of the writing is changed so that the samples share the same features. The following steps are performed first:

- Improve the contrast of the image
- Review the image
- Set the size of the image and compute the distance between the top and bottom lines
- Cut the lines and isolate the text

After executing the steps above, we apply the non-linear algorithm to each handwritten sample and extract the important features of the image. The important features are then compared, and the closest match is selected.
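The first two steps above (contrast improvement and text isolation) might be sketched as follows. The linear contrast stretch and the ink threshold of 128 are assumptions made for illustration; the paper does not specify these details.

```python
import numpy as np

def stretch_contrast(img):
    """Linearly rescale gray levels to the full [0, 255] range."""
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img, dtype=np.float64)
    return (img - lo) * 255.0 / (hi - lo)

def crop_to_ink(img, threshold=128):
    """Cut away empty border rows/columns around the written strokes."""
    ink = img < threshold                        # dark pixels are treated as ink
    rows = np.flatnonzero(ink.any(axis=1))       # rows containing ink
    cols = np.flatnonzero(ink.any(axis=0))       # columns containing ink
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Toy example: a dark blob standing in for a digit on a gray background.
img = np.full((40, 40), 200.0)
img[12:30, 8:25] = 40.0
out = crop_to_ink(stretch_contrast(img))
```
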
Fig. 7. Examples of samples from the CENPARMI database

B. Pre-process
As shown in Fig. 7, the gray-level values of the CENPARMI samples vary considerably due to differences in illumination and size, so the gray levels of the pixels in the input image are normalized to a standard mean of 210 and a standard deviation of 20. In the normalization technique [11] used in this paper, the character is first centered using a geometric normalization scheme and then reshaped a second time. The linear normalization technique performs better in Persian handwriting recognition when it can reduce the positional variation of the image's feature points. Some numerals before and after pre-processing are shown in Fig. 8. Each normalized pattern has a size of 35×35.
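The gray-level normalization to mean 210 and standard deviation 20 can be sketched as follows. Clipping to the 8-bit range is an assumption; the paper does not say how out-of-range values are handled.

```python
import numpy as np

def normalize_gray(img, target_mean=210.0, target_std=20.0):
    """Shift and scale pixel gray levels to a standard mean and standard
    deviation, clipping the result to the valid 8-bit range [0, 255]."""
    std = img.std()
    if std == 0:
        return np.full_like(img, target_mean, dtype=np.float64)  # flat image
    out = (img - img.mean()) / std * target_std + target_mean
    return np.clip(out, 0, 255)

# Toy 35x35 sample with uneven illumination, brought to the standard scale.
rng = np.random.default_rng(4)
img = rng.uniform(100, 180, size=(35, 35))
norm = normalize_gray(img)
```
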

VI. EXPERIMENTS
To evaluate the performance of the method, we examine its application to the CENPARMI handwritten Farsi numeral database. We first summarize the writers and patterns in this database, then describe the pre-processing applied to the data, and finally compare the method with two types of classifiers: SVM (Support Vector Machine) and MQDF (Modified Quadratic Discriminant Function).

Fig. 8. Examples of pre-processing results. Left side: initial samples; right side: samples after pre-processing

C. Experiment Results
In this paper, the first layer of the CNN is initialized by learning the over-complete dictionary; the learning process ended after 90 iterations. Table I compares the CNN with the two other classifiers, the SVM and the MQDF.
TABLE I. TEST ERROR RATES (%) OF THE CLASSIFIERS INVESTIGATED IN THIS PAPER

Features            MQDF    SVM     Our proposed method
Gradient features   2.12    1.02    0.78
Profile features    3.18    2.68


Fig. 9. Samples misclassified by the proposed method

VII. RESULTS
In this paper, using a sparse structure and an over-complete dictionary, we investigated Persian handwritten numeral recognition on the CENPARMI database and compared the method with the SVM and MQDF. Good results were obtained by initializing the first layer of the CNN with this sparse structure and over-complete dictionary. The proposed method uses non-linear filtering instead of linear filtering for feature extraction.


REFERENCES
[1] M. Aharon, M. Elad, and A. Bruckstein. The K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Processing, 54(11):4311-4322, 2006.
[2] C.-L. Liu and C. Y. Suen. A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters. Pattern Recognition, in press, 2008.
[3] C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.
[4] G. Dimauro, S. Impedovo, G. Pirlo, and A. Salzo. Automatic bankcheck processing: a new engineered system. Machine Perception and Artificial Intelligence, 28:5-42, 1997.
[5] D. Keysers, T. Deselaers, C. Gollan, and H. Ney. Deformation models for image recognition. IEEE Trans. Pattern Anal. Mach. Intell., 29(8):1422-1435, 2007.
[6] F. Kimura, K. Takashina, S. Tsuruoka, and Y. Miyake. Modified quadratic discriminant functions and the application to Chinese character recognition. IEEE Trans. Pattern Anal. Mach. Intell., 9(1):149-153, 1987.
[7] F. Lauer, C. Y. Suen, and G. Bloch. A trainable feature extractor for handwritten digit recognition. Pattern Recognition, 40(6):1816-1824, 2007.
[8] Y. LeCun, L. Bottou, and Y. Bengio. Reading checks with graph transformer networks. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 1, pages 151-154, 1997.
[9] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proc. of the IEEE, 86(11):2278-2324, November 1998.
[10] C.-L. Liu, K. Nakashima, H. Sako, and H. Fujisawa. Handwritten digit recognition: benchmarking of state-of-the-art techniques. Pattern Recognition, 36(10):2271-2285, 2003.
[11] C.-L. Liu, K. Nakashima, H. Sako, and H. Fujisawa. Handwritten digit recognition: investigation of normalization and feature extraction techniques. Pattern Recognition, 37(2):265-279, 2004.
[12] C.-L. Liu and H. Sako. Class-specific feature polynomial classifier for pattern classification and its application to handwritten numeral recognition. Pattern Recognition, 39(4):669-681, 2006.
[13] M. Ranzato, C. Poultney, S. Chopra, and Y. LeCun. Efficient learning of sparse representations with an energy-based model. In Proc. Advances in Neural Information Processing Systems, 2006.
[14] A. Mowlaei and K. Faez. Recognition of isolated handwritten Persian/Arabic characters and numerals using support vector machines. In Proc. IEEE 13th Workshop on Neural Networks for Signal Processing, pages 547-554, 2003.
[15] M. Shi, Y. Fujisawa, T. Wakabayashi, and F. Kimura. Handwritten numeral recognition using gradient and curvature of gray scale image. Pattern Recognition, 35(10):2051-2059, 2002.
[16] P. Y. Simard, D. Steinkraus, and J. Platt. Best practices for convolutional neural networks applied to visual document analysis. In Proc. International Conference on Document Analysis and Recognition (ICDAR), pages 958-962, 2003.
[17] F. Solimanpour, J. Sadri, and C. Suen. Standard databases for recognition of handwritten digits, numerical strings, legal amounts, letters and dates in Farsi language. In Proc. 10th International Workshop on Frontiers in Handwriting Recognition, pages 3-7, 2006.
[18] H. Soltanzadeh and M. Rahmati. Recognition of Persian handwritten digits using image profiles of multiple orientations. Pattern Recognition Lett., 25(14):1569-1576, 2004.
[19] S. Srihari and E. Keubert. Integration of handwritten address interpretation technology into the United States Postal Service Remote Computer Reader system. In Proc. Fourth International Conference on Document Analysis and Recognition, volume 2, pages 892-896, 1997.
