
Writer Identification Of Handwritten Oriya Script

Barid Baran Nayak
Dept. of ECE
NIT Rourkela, India

Partha Pratim Roy
Dept. of CS
IIT Roorkee, India

Umapada Pal
CVPR Unit
ISI Kolkata, India

Abstract
In handwritten writer identification and character recognition, we
perform an image-based analysis: a scanned digital image containing
handwritten script is taken as input, and the system translates it into
a machine-readable, editable digital text format. The Oriya language
presents great challenges because of the large number of letters in its
alphabet, the sophisticated ways in which they combine, and the fact
that many letters are roundish and look similar.
In this project, an attempt is made to recognize writers using
histogram of gradient features of the character images. The features so
obtained are passed to a Hidden Markov Model (HMM), which produces the
identification result.

Keywords: character recognition, writer identification, histogram of
gradient, Hidden Markov Model (HMM)

INTRODUCTION:
Oriya is one of the many official languages of India; it is the official
language of Odisha and the second official language of Jharkhand.
Because it is an old language, many old documents exist whose writers
are unknown. This project addresses that problem: its main aim is to
identify the writer of a document. The other part of the project is to
identify each written character.
Owing to the presence of complex features such as the headline, vowels,
modifiers, etc., character segmentation in Oriya script is not easy.
Also, the positions of vowels and compound characters make segmenting
words into characters very complex. To address this problem, we tried a
novel method that combines a zone-wise break-up of words with HMM-based
recognition. In particular, the word image is segmented into three
zones: upper, middle and lower. The components in the middle zone are
modelled using HMMs. This zone segmentation reduces the number of
distinct component classes compared with the total number of classes in
the Oriya character set. Once the middle zone is recognized, HMM-based
forced alignment is applied in this zone to mark the boundaries of
individual components. The segmentation paths are later extended to the
other zones. Next, the residue components, if any, in the upper and
lower zones within their respective boundaries are combined to achieve
the final word-level recognition.
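
The three-zone split of a word image can be illustrated with a small
sketch. The text does not say how the zone boundaries are located, so
the rule used here is an assumption: the middle zone is taken as the
band of rows whose ink count reaches a fraction of the densest row, and
everything above or below that band becomes the upper or lower zone.
The name `split_zones` and the parameter `frac` are hypothetical, and
the image is assumed to be a binary list of rows (1 = ink).

```python
def row_profile(img):
    """Horizontal projection profile: number of ink pixels in each row."""
    return [sum(row) for row in img]


def split_zones(img, frac=0.5):
    """Split a binary word image into (upper, middle, lower) row bands.

    Assumption (not from the paper): the middle zone spans from the first
    to the last row whose ink count is at least frac * (densest row).
    """
    profile = row_profile(img)
    peak = max(profile)
    if peak == 0:
        raise ValueError("image contains no ink")
    dense = [i for i, count in enumerate(profile) if count >= frac * peak]
    top, bottom = dense[0], dense[-1] + 1  # bottom is exclusive
    return img[:top], img[top:bottom], img[bottom:]
```

A real word image would likely need smoothing of the profile and a
tuned threshold; the sketch only shows the idea of separating the dense
middle band from the sparser modifiers above and below it.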
Earlier, a template-based approach was followed for recognition. In this
approach, an unknown pattern was superimposed on the ideal template, and
the degree of correlation between the two was used for classification.
But this approach became ineffective because of noise and variations in
handwriting. Hence, nowadays a feature-based approach is used.
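
As an illustration of a feature-based descriptor, the sketch below
computes a histogram of gradient orientations, the feature type named
in the abstract, for a single image cell. It is a simplified version
under stated assumptions: the paper does not give the cell size, the
number of bins or the normalization, so the 9 unsigned bins, central
differences and L2 normalization used here are illustrative choices,
and `hog_cell` is a hypothetical name.

```python
import math


def hog_cell(img, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    img is a list of rows of numeric pixel values. Border pixels are
    skipped because central differences need both neighbours.
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # horizontal gradient
            gy = img[y + 1][x] - img[y - 1][x]  # vertical gradient
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            # Fold the direction into [0, 180) so opposite signs share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist
```

In a fuller pipeline the character image would be divided into several
cells, and the per-cell histograms would be concatenated into one
feature vector before being passed to the classifier.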

DIAGRAMS

Figure 1: Original Oriya script

Line segmentation:

Word segmentation:

Zone segmentation

Figure 4: (a) Original word. (b) Zone-segmented word (upper, middle,
lower).

Figure 5: Character segmentation from words.

Results:
[Bar chart: Series 1, comparing zone segmentation with non-zone
segmentation; vertical axis from 0 to 120.]

Conclusion:
Writer identification was successfully carried out and significant
results were obtained. A scheme for segmenting unconstrained Oriya
handwritten text into lines, words and characters is proposed in this
paper. First, the text image is segmented into lines, and the lines are
then segmented into individual words. Next, for character segmentation
from words, isolated and connected (touching) characters in a word are
first detected. Using structural, topological and water reservoir
concept-based features, the touching characters of the word are then
segmented into isolated characters. To the best of our knowledge, this
is the first work of its kind on Oriya text. The proposed water
reservoir-based approach can also be used for other Indian scripts where
touching patterns show similar behavior.
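
The first step of the scheme, splitting a text image into lines, can be
sketched as follows. The paper does not state which technique was used
for line segmentation, so this is an assumption: a plain projection
approach that treats every maximal run of rows containing ink as one
text line. The name `segment_lines` is hypothetical, and the image is
assumed to be a binary list of rows (1 = ink).

```python
def segment_lines(img):
    """Return (start, end) row ranges, end exclusive, of runs of inked rows."""
    lines = []
    start = None
    for y, row in enumerate(img):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                     # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y))      # a blank row closes the line
            start = None
    if start is not None:
        lines.append((start, len(img)))   # line runs to the bottom edge
    return lines
```

Unconstrained handwriting often has skewed or overlapping lines, where
blank rows between lines do not exist; the sketch covers only the
simple case of well-separated lines.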
