
IJSTE - International Journal of Science Technology & Engineering | Volume 2 | Issue 12 | June 2016

ISSN (online): 2349-784X

A Geometric Feature Extraction Technique for Hindi Handwritten Character Recognition

Neha Assiwal
M. Tech Scholar
Department of Computer Science & Engineering
GITAM Kablana, Jhajjar(Haryana)

Dr. Neetu Sharma
Head of Dept.
Department of Computer Science & Engineering
GITAM Kablana, Jhajjar(Haryana)

Abstract
This paper presents a technique for recognizing handwritten characters. It uses a geometry-based feature extraction method
that operates on each character separately, so the system is a segmentation-based, offline handwritten character recognition
system. It recognizes characters of Indian languages. The features used are based on the basic line types that form the
character skeleton, and the system output is formed from the feature vector. The feature vectors generated from a training set
were then used to train a pattern recognition engine. The pattern recognition engine is based on Neural Networks so that the
system can be benchmarked.
Keywords: Geometry, Character skeleton, Universe of Discourse, Zoning, Feature Extraction
________________________________________________________________________________________________________
I. INTRODUCTION

This paper draws on a number of earlier works. The literature describes many high-accuracy recognition systems for
separated handwritten characters. However, feature extraction based on local and global geometric features of the character
skeleton has not been investigated much. The proposed algorithm concentrates on exactly this: it extracts the different line
types that form a particular character, together with their positional features. A neural network, trained with the feature
vectors obtained from the system, was used to test the feature extraction technique.
II. OVERVIEW OF THIS PAPER
Image preprocessing is the first stage of the system. The universe of discourse is then selected, because the features
extracted from the scanned image include the positions of the line segments. After the universe of discourse is selected, the
image is divided into windows of equal size (zoning). Certain pixels of the character skeleton produced by preprocessing are
defined as starters, intersections and minor starters.
Character traversal starts after zoning is done on the image. Each zone is individually subjected to the process of extracting
line segments. After the line segments have been extracted from the image, each is classified as one of the following line
types: horizontal line, vertical line, right diagonal line, left diagonal line.
Once the line type of each segment is determined, a feature vector is formed from this information, and features are
extracted for each zone. After zonal feature extraction, certain features are extracted for the entire image based on the
regional properties: Euler number, regional area and eccentricity.

All rights reserved by www.ijste.org

(IJSTE/ Volume 2 / Issue 12 / 055)

III. STAGES OF HINDI CHARACTER RECOGNITION

Fig. 1: (a) Stages Of HCR

IV. IMAGE PROCESSING


Image pre-processing involves the following steps:
1) Character Extraction from Scanned Image.
2) Binarization.
3) Background Noise removal.
4) Skeletonization.
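Steps 2 and 3 above can be sketched as follows. This is only a minimal illustration in plain Python, not the authors' implementation: the fixed threshold of 128, the rule that a "noise" pixel is a foreground pixel with no foreground neighbour, and the function names are all assumptions. Skeletonization is not shown.

```python
def binarize(gray, threshold=128):
    """Map a grey-scale image (rows of 0-255 ints) to 0/1; dark ink becomes 1.

    The threshold value is an assumed default, not taken from the paper.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_isolated_pixels(binary):
    """Clear foreground pixels that have no foreground pixel among their 8 neighbours."""
    rows, cols = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1:
                continue
            has_neighbour = any(
                binary[r + dr][c + dc] == 1
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
            )
            if not has_neighbour:
                cleaned[r][c] = 0
    return cleaned


# A vertical stroke plus one stray dark pixel in the top-right corner.
gray = [
    [255, 255, 255, 255, 20],
    [255, 30, 255, 255, 255],
    [255, 25, 255, 255, 255],
    [255, 40, 255, 255, 255],
]
binary = binarize(gray)
clean = remove_isolated_pixels(binary)
```

In this example the stray corner pixel is removed while the three-pixel stroke is kept.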
V. UNIVERSE OF DISCOURSE
The universe of discourse is defined as the smallest matrix that fits the entire character skeleton. It is selected because the
features extracted from the scanned character image include the positions of different line segments, so every character
image should be independent of its image size.
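Selecting the universe of discourse amounts to cropping the skeleton image to its bounding box. A minimal sketch, assuming the skeleton is stored as a list of rows of 0/1 values (the function name is hypothetical):

```python
def universe_of_discourse(skeleton):
    """Crop a 0/1 image to the smallest rectangle containing every foreground pixel."""
    rows = [r for r, row in enumerate(skeleton) if any(row)]
    if not rows:
        return []  # blank image: there is nothing to crop
    cols = [c for c in range(len(skeleton[0])) if any(row[c] for row in skeleton)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in skeleton[top:bottom + 1]]


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
cropped = universe_of_discourse(image)
```

Here the 5 x 5 input is reduced to the 3 x 3 matrix that tightly encloses the character.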
VI. ZONING
After the universe of discourse is selected, the image is divided into windows of equal size, and feature extraction is carried
out on the individual windows. Two types of zoning were used in the implemented system: the image was zoned into 9
windows of equal size, and also into 3 zones by dividing in the horizontal direction. Feature extraction was applied to the
individual zones rather than to the complete image. This gives more information about the fine details of the character skeleton.
With zoning, the positions of the different line segments in a character skeleton also become a feature. This is because a
particular line segment of a character occurs in a particular zone in almost all cases.
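The zoning step can be sketched as below. The function name and the integer-boundary rule are assumptions; when the image size is not a multiple of the grid size, the zones in this sketch differ by at most one pixel. The three-zone case is interpreted here as three horizontal bands, which is one possible reading of "dividing in the horizontal direction".

```python
def split_into_zones(image, zone_rows=3, zone_cols=3):
    """Split a 2-D list into zone_rows x zone_cols windows, listed row by row."""
    height, width = len(image), len(image[0])
    # Integer boundaries so that the zones cover the image exactly once.
    row_edges = [height * i // zone_rows for i in range(zone_rows + 1)]
    col_edges = [width * j // zone_cols for j in range(zone_cols + 1)]
    zones = []
    for i in range(zone_rows):
        for j in range(zone_cols):
            zone = [
                row[col_edges[j]:col_edges[j + 1]]
                for row in image[row_edges[i]:row_edges[i + 1]]
            ]
            zones.append(zone)
    return zones


# A 6 x 6 test image whose pixel values encode their own position.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
zones = split_into_zones(image)                                    # 9 zones of 2 x 2
horizontal_bands = split_into_zones(image, zone_rows=3, zone_cols=1)  # 3 bands of 2 x 6
```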
To extract the different line segments in a particular zone, the entire skeleton in that zone should be traversed. For this
purpose, certain pixels in the character skeleton were defined as starters, intersections and minor starters.

Fig. 2: (a) Original Image, (b) Universe of Discourse


Starters:
Starters are those pixels with exactly one neighbour in the character skeleton. Before character traversal starts, all the
starters in the particular zone are found and stored in a list.
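Finding the starters of a zone can be sketched as follows, assuming an 8-connected neighbourhood and a zone stored as rows of 0/1 values (the helper names are hypothetical):

```python
NEIGHBOUR_OFFSETS = [
    (-1, -1), (-1, 0), (-1, 1),
    (0, -1),           (0, 1),
    (1, -1),  (1, 0),  (1, 1),
]


def count_neighbours(skeleton, r, c):
    """Number of skeleton pixels among the 8 neighbours of (r, c)."""
    rows, cols = len(skeleton), len(skeleton[0])
    return sum(
        1
        for dr, dc in NEIGHBOUR_OFFSETS
        if 0 <= r + dr < rows and 0 <= c + dc < cols and skeleton[r + dr][c + dc] == 1
    )


def find_starters(skeleton):
    """List the (row, col) positions of skeleton pixels with exactly one neighbour."""
    return [
        (r, c)
        for r, row in enumerate(skeleton)
        for c, px in enumerate(row)
        if px == 1 and count_neighbours(skeleton, r, c) == 1
    ]


zone = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
]
starters = find_starters(zone)
```

For this single stroke, the two end points are reported as starters.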

Fig. 3: Starter

Intersections:
The definition of an intersection is a little more complicated. The necessary but not sufficient criterion for a pixel to be an
intersection is that it should have more than one neighbour. Neighbouring pixels are classified into two categories: direct
pixels and diagonal pixels. Pixels in the neighbourhood of the pixel under consideration in the horizontal and vertical
directions are called direct pixels.
The remaining pixels in the neighbourhood, which lie in a diagonal direction from the pixel under consideration, are called
diagonal pixels. To find the number of true neighbours, the pixel under consideration is classified according to the number of
neighbours it has in the character skeleton: 3 neighbours, 4 neighbours, or 5 or more neighbours.
Once all the intersections in the image are identified, they are stored in a list.
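The split into direct and diagonal pixels can be sketched as below. The rules that turn these counts into a final intersection decision are not spelled out in the text, so this sketch stops at classifying the neighbours and at the necessary condition (more than one neighbour); the function names are hypothetical.

```python
DIRECT_OFFSETS = [(-1, 0), (0, -1), (0, 1), (1, 0)]      # horizontal and vertical
DIAGONAL_OFFSETS = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # the four corners


def _present(skeleton, r, c, offsets):
    """Positions of skeleton pixels reached from (r, c) through the given offsets."""
    rows, cols = len(skeleton), len(skeleton[0])
    return [
        (r + dr, c + dc)
        for dr, dc in offsets
        if 0 <= r + dr < rows and 0 <= c + dc < cols and skeleton[r + dr][c + dc] == 1
    ]


def classify_neighbours(skeleton, r, c):
    """Return (direct pixels, diagonal pixels) around the pixel under consideration."""
    return (
        _present(skeleton, r, c, DIRECT_OFFSETS),
        _present(skeleton, r, c, DIAGONAL_OFFSETS),
    )


def intersection_candidates(skeleton):
    """Skeleton pixels meeting the necessary condition: more than one neighbour."""
    candidates = []
    for r, row in enumerate(skeleton):
        for c, px in enumerate(row):
            if px == 1:
                direct, diagonal = classify_neighbours(skeleton, r, c)
                if len(direct) + len(diagonal) > 1:
                    candidates.append((r, c))
    return candidates


plus = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
centre = classify_neighbours(plus, 1, 1)
arm = classify_neighbours(plus, 0, 1)
```

In the plus-shaped example, the arm pixel (0, 1) has three neighbours but only one of them is direct, which illustrates why the raw neighbour count alone is not a sufficient criterion.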

Fig. 4: Intersection

Minor Starters:
Minor starters are created when the pixel under consideration has more than two neighbours. They are found during
traversal along the character skeleton. Two cases can occur: the pixel is an intersection, or it is not.

Fig. 5: Minor Starter

Starters, intersections and minor starters differ from character to character.
Character traversal starts after zoning is done on the image. Each zone is individually subjected to the process of extracting
line segments. First, the starters and intersections in the zone are found and stored in lists. The algorithm then begins with
the starters list.
Once all the starters are processed, the algorithm continues with the minor starters obtained along the way. All the line
segments obtained during this process are stored, together with the positions of the pixels in each line segment. Once all the
pixels in the image have been visited, the algorithm stops.
After the line segments have been extracted from the image, each is classified as one of the following line types:
horizontal line, vertical line, right diagonal line, left diagonal line.
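The text does not state how a segment is assigned to a line type. One simple possibility, shown here purely as an assumption, is to use the angle of the straight line joining the first and last pixel of the segment, with 45-degree-wide bins. The naming convention (a right diagonal rises to the right, like "/") is also an assumption.

```python
import math


def classify_line(segment):
    """Classify an ordered list of (row, col) pixels by its end-to-end direction."""
    (r0, c0), (r1, c1) = segment[0], segment[-1]
    dx = c1 - c0
    dy = r0 - r1  # sign flipped because image rows grow downward
    # Fold the direction into [0, 180): a line has no preferred end.
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    if angle < 22.5 or angle >= 157.5:
        return "horizontal"
    if angle < 67.5:
        return "right diagonal"
    if angle < 112.5:
        return "vertical"
    return "left diagonal"


labels = [
    classify_line([(2, 0), (2, 1), (2, 2)]),  # flat stroke
    classify_line([(0, 1), (1, 1), (2, 1)]),  # upright stroke
    classify_line([(2, 0), (1, 1), (0, 2)]),  # rises to the right
    classify_line([(0, 0), (1, 1), (2, 2)]),  # falls to the right
]
```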
VII. FEATURE EXTRACTION
Once the line type of each segment is determined, a feature vector is formed from this information. Every zone has a
corresponding feature vector. In the proposed algorithm, the feature vector of every zone has a length of 9. Its elements are
as follows:
1) No. of horizontal lines.
2) No. of vertical lines.
3) No. of right diagonal lines.
4) No. of left diagonal lines.
5) Normalized length of all horizontal lines.
6) Normalized length of all vertical lines.
7) Normalized length of all right diagonal lines.
8) Normalized length of all left diagonal lines.
9) Normalized area of the skeleton.
The number of lines of any particular type is normalized using the following formula:
Value = 1 - ((number of lines / 10) × 2)
The normalized length of any particular line type is found using the following formula:
Length = (total pixels in that line type) / (total zone pixels)
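The two formulas translate directly into code. The function names below are hypothetical; the arithmetic is exactly that of the formulas above.

```python
def normalized_count(number_of_lines):
    """Value = 1 - ((number of lines / 10) x 2)."""
    return 1 - (number_of_lines / 10) * 2


def normalized_length(line_pixels, zone_pixels):
    """Length = (total pixels in that line type) / (total zone pixels)."""
    return line_pixels / zone_pixels


no_lines = normalized_count(0)
five_lines = normalized_count(5)
ten_lines = normalized_count(10)
length = normalized_length(12, 100)
```

Under this normalization a zone with no lines of a given type maps to 1, five lines map to 0 and ten lines map to -1, so the value stays within [-1, 1] as long as a zone contains at most ten lines of one type.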
The feature vector described here is extracted individually for each zone, so if there are N zones, the combined feature
vector has 9N elements. For the proposed system, the original image was first divided into 9 zones by dividing the image
matrix, and the features were extracted for each zone.
The original image was then divided again into 3 zones by dividing in the horizontal direction, and features were extracted
for each of these zones. After zonal feature extraction, certain features were extracted for the entire image based on the
regional properties: Euler number, regional area and eccentricity.
Euler Number: The Euler number is defined as the number of objects minus the number of holes in the image.
Regional Area: The regional area is defined as the ratio of the number of pixels in the skeleton to the total number of pixels
in the image.
Eccentricity: Eccentricity is defined as the eccentricity of the smallest ellipse that fits the skeleton of the image.
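The first two regional properties can be sketched in plain Python as follows. The connectivity choice (8-connected objects, 4-connected background) is an assumption, as are the function names; eccentricity is not shown.

```python
FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = FOUR + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(grid, value, offsets):
    """Count connected regions of cells equal to `value` using flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                cr, cc = stack.pop()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] == value):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
    return count


def euler_number(image):
    """Number of objects minus number of holes in a 0/1 image."""
    # Pad with background so that the outer background forms a single region.
    width = len(image[0]) + 2
    padded = [[0] * width] + [[0] + list(row) + [0] for row in image] + [[0] * width]
    objects = count_components(padded, 1, EIGHT)
    holes = count_components(padded, 0, FOUR) - 1  # subtract the outer background
    return objects - holes


def regional_area(image):
    """Ratio of skeleton pixels to the total number of pixels in the image."""
    return sum(sum(row) for row in image) / (len(image) * len(image[0]))


ring = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
ring_euler = euler_number(ring)   # one object, one hole
ring_area = regional_area(ring)   # 8 of 9 pixels are set
```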
VIII. EXPERIMENTAL RESULT
An image of handwritten Hindi text is taken as shown below, and this image is then processed.

Fig. 6: Load an Image

After the image is loaded, the system is trained using the Neural Network tool.


Fig. 7: Train using Neural Network Tool

When the training process is complete, the characters are extracted. The image is converted into a grey-scale image and
then into a binary image, and edge detection is then performed to produce a form that the application can read. Five figures
are shown, each illustrating a different processing stage.

Fig. 8(a): Input Image with Noise

Fig. 8(b): Input Image without Noise


Fig. 8(c): Image Dilation

Fig. 8(d): Image Filling

Fig. 8(e): Image Thinning

After extraction, the number of characters in the image is counted and shown in a text box. The extracted characters are
displayed on the axes, and the output is then written to the file output.txt.


Fig. 9: Extract text after five figures

Fig. 10: Output Displayed

IX. CONCLUSION
In this paper, we have proposed a geometric feature extraction technique for identifying the handwritten characters of Hindi
script. The technique may be applied to the classification of cursive characters for Hindi handwritten character recognition.
The proposed method was tested after training a Neural Network with a database of 650 images. The system outputs the
recognized Hindi character. If the character does not match the database, the error message "match not found" is shown.
The proposed algorithm was implemented in MATLAB and can read Hindi handwritten characters. The implementation is
also covered in the experimental result section, which demonstrates the working of the proposed algorithm. The system is
well suited to extracting Hindi characters from scanned images.
