
TITLE

CHARACTER RECOGNITION
PLATFORM

• Operating System: Windows XP/2000
• Language: Visual Basic .NET
PROJECTEES
• BOUDHAYAN MAITY
• BRATIN BISWAS
• DIBAKAR SINHA
• GURUSHARAN SINGH
• HIMANGSHU HAZARIKA
• NITIN TOMAR
• PARAG DHAR BARUAH
• YASSER HAZARIKA
GUIDED BY:

MR. AMOL G. MULEY
INTRODUCTION

Character recognition is known to be one of the earliest applications of artificial neural networks, which partially emulate human thinking in the domain of artificial intelligence.
WHY VB.NET
• A fast and easy way to create applications for Microsoft Windows.
• Provides a complete set of tools to simplify rapid application development.
• Visual Basic .NET avoids the need to write numerous lines of code to describe the appearance and location of interface elements.
• VB.NET provides a graphical environment in which you visually design the forms and controls.
PROBLEM DEFINITION
• The same character differs in size, shape and style from person to person, and even from time to time for the same person.
• Like any image, visual characters are subject to degradation by noise.
• There are no hard and fast rules that define the appearance of a visual character. Hence the rules need to be heuristically deduced from samples.
The human vision system excels in the following respects:
• The human brain adapts to minor changes and errors in visual patterns. Thus we are able to read the handwriting of many people despite their different styles of writing.
• The human vision system learns from experience. Thus we are able to grasp newer styles and scripts easily.
• The human vision system is immune to most variations in the size, aspect ratio, location and orientation of visual characters.
PROBLEM SOLUTION
We solve the problem using an ANN, which has the following capabilities:
• Adapting to a changing environment.
• Learning from prior experience.
• Using image digitization, a learning mechanism and the employed architecture.
Tasks Involved
• Segmentation: Given an input image, identify the individual glyphs.
• Feature extraction: From each glyph image, extract the features to be used as input to the ANN. This is the most critical part of this approach.
• Classification: Train the ANN using training samples. Then, given new glyphs, classify them.
Single Discrete Perceptron Training Algorithm (SDPTA)

• We will begin to examine neural network classifiers that derive their weights during the learning cycle.
• The sample pattern vectors X1, X2, ..., XP, called the training sequence, are presented to the machine along with the correct response.
• The algorithm is based on the perceptron learning rule seen earlier.
Given are P training pairs
{X1, d1, X2, d2, ..., XP, dP}, where
Xi is (n×1),
di is (1×1),
i = 1, 2, ..., P.
Yi = augmented input pattern (obtained by appending 1 to the input vector Xi), i = 1, 2, ..., P.
In the following, k denotes the training step and p denotes the step counter within the training cycle.
Step 1: The learning constant c > 0 is chosen.
Step 2: Weights w are initialized at small values; w is (n+1)×1. Counters and error are initialized:
k = 1, p = 1, E = 0
Step 3: The training cycle begins here. Input is presented and output computed:
Y = Yp, d = dp
o = sgn(w^T Y)
SDPTA contd..

Step 4: Weights are updated:
w = w + (1/2) c (d - o) Y
Step 5: Cycle error is computed:
E = (1/2)(d - o)^2 + E
Step 6: If p < P, then p = p + 1, k = k + 1, and go to Step 3; otherwise go to Step 7.
Step 7: The training cycle is completed. For E = 0, terminate the training session and output the weights and k. If E > 0, then set E = 0, p = 1, and enter a new training cycle by going to Step 3.
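The steps of the SDPTA can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's VB.NET code; the function names, the zero initial weights, the convention sgn(0) = +1 and the max_cycles safety limit are assumptions made for the example and do not come from the slides.

```python
def sgn(x):
    # Bipolar threshold; treating 0 as +1 is an assumption of this sketch.
    return 1 if x >= 0 else -1


def train_sdpta(patterns, targets, c=1.0, max_cycles=1000):
    """Train a single discrete perceptron on bipolar targets (+1 / -1).

    Returns the final weight vector and the number of training steps k.
    """
    # Augment every input pattern by appending 1 (the bias input).
    augmented = [list(x) + [1.0] for x in patterns]
    # Step 2: initialize weights (zeros here; the slides only ask for small values).
    w = [0.0] * len(augmented[0])
    k = 0
    for _ in range(max_cycles):          # hypothetical guard against non-separable data
        error = 0.0                      # E is reset at the start of each cycle
        for y, d in zip(augmented, targets):
            # Step 3: compute the output o = sgn(w^T y).
            o = sgn(sum(wi * yi for wi, yi in zip(w, y)))
            # Step 4: update the weights, w = w + (1/2) c (d - o) y.
            w = [wi + 0.5 * c * (d - o) * yi for wi, yi in zip(w, y)]
            # Step 5: accumulate the cycle error.
            error += 0.5 * (d - o) ** 2
            k += 1
        # Step 7: stop when a full cycle produced no error.
        if error == 0:
            return w, k
    raise RuntimeError("no error-free cycle within max_cycles")


def classify(w, x):
    # Apply the trained weights to a new, non-augmented pattern.
    return sgn(sum(wi * yi for wi, yi in zip(w, list(x) + [1.0])))
```

For instance, training on the bipolar AND function (inputs (-1,-1), (-1,1), (1,-1), (1,1) with targets -1, -1, -1, 1) terminates after the second cycle, because the patterns are linearly separable.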
IMAGE DETECTION
The procedure for analyzing images to detect characters consists of two steps:
• Determining character lines
• Detecting individual symbols
DETERMINING CHARACTER LINES
Enumeration of character lines in a
character image (‘page’) is essential
in delimiting the bounds within which
the detection can proceed. Thus
detecting the next character in an
image does not necessarily involve
scanning the whole image all over
again.
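One possible way to enumerate character lines is sketched below in Python (not the project's VB.NET code). It assumes the page is already a binary matrix (a list of rows, 1 for a black pixel) and that lines are separated by completely blank pixel rows; both the representation and the function name are assumptions of this example, since the slides do not spell out the method.

```python
def find_character_lines(image):
    """Return (top, bottom) row indices, inclusive, of each character line."""
    lines = []
    top = None                      # top row of the line currently being scanned
    for y, row in enumerate(image):
        has_black = any(row)
        if has_black and top is None:
            top = y                 # a new line starts on this row
        elif not has_black and top is not None:
            lines.append((top, y - 1))   # a blank row closes the current line
            top = None
    if top is not None:             # the last line runs to the bottom of the image
        lines.append((top, len(image) - 1))
    return lines
```

Once the line bounds are known, the search for the next character can be restricted to the rows of the current line.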
DETECTING INDIVIDUAL SYMBOLS
Detection of individual
symbols involves scanning
character lines for
orthogonally separable
images composed of black
pixels.
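As a rough sketch of this scan, the Python function below (again not the project's VB.NET code) reads "orthogonally separable" as "separated by at least one blank pixel column within the line", which is an interpretation rather than something the slides state. The image representation (list of rows, 1 for black) and the function name are likewise assumptions.

```python
def find_symbols_in_line(image, top, bottom):
    """Return (left, right) column indices, inclusive, of each symbol
    found between rows top and bottom of a binary image."""
    width = len(image[0])
    symbols = []
    left = None                     # left column of the symbol being scanned
    for x in range(width):
        # A column belongs to a symbol if it has a black pixel inside the line.
        has_black = any(image[y][x] for y in range(top, bottom + 1))
        if has_black and left is None:
            left = x
        elif not has_black and left is not None:
            symbols.append((left, x - 1))    # a blank column ends the symbol
            left = None
    if left is not None:            # symbol touching the right edge of the image
        symbols.append((left, width - 1))
    return symbols
```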
Line and Character Boundary Detection
The detected character bound might not be the actual bound of the character. This issue arises from the irregular heights and bottom alignments of printed alphabetic symbols. Thus a line top is not necessarily the top of every character in the line, and a line bottom is not necessarily the bottom of every character. Hence the top and bottom of each character need to be confirmed.
An optional confirmation algorithm implemented in the project is:
• Start at the top of the current line and the left of the character
• Scan up to the right of the character
– if a black pixel is detected, register y as the confirmed top
– if not, continue to the next pixel
– if no black pixels are found, increment y and reset x to scan the next horizontal line
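The confirmation scan above might look like the following Python sketch (not the project's VB.NET code). The binary image representation (list of rows, 1 for black), the inclusive bounds and the function name are assumptions for illustration.

```python
def confirm_top(image, line_top, line_bottom, left, right):
    """Return the first row, scanning down from the line top, that contains
    a black pixel between columns left and right; None if there is none."""
    for y in range(line_top, line_bottom + 1):   # increment y row by row
        for x in range(left, right + 1):         # x restarts at left on every row
            if image[y][x]:
                return y                         # confirmed top of the character
    return None
```

The bottom of the character could presumably be confirmed the same way by scanning upward from the line bottom.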
IMAGE DIGITIZATION
• Need of It
• Way of it
Need of Image Digitization
• An image may contain pictures and colors that provide no useful information for character recognition.
• Characters that need to be analyzed individually may exist as word clusters or may be located at various points in the document.
Way of Image Digitization
• The input image is sampled into a binary window, which forms the input to the recognition system.
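The slides do not give the window size or the sampling rule, so the Python sketch below (not the project's VB.NET code) simply assumes nearest-neighbour sampling of a glyph's bounding box into a window of caller-chosen size. The function name, the parameters and the image representation (list of rows, nonzero for black) are all hypothetical.

```python
def sample_to_window(image, top, bottom, left, right, win_w, win_h):
    """Sample the glyph bounded by (top, bottom, left, right), inclusive,
    into a win_h x win_w matrix of 0/1 values."""
    glyph_h = bottom - top + 1
    glyph_w = right - left + 1
    window = []
    for j in range(win_h):
        # Map window row j back to a source row inside the glyph.
        src_y = top + (j * glyph_h) // win_h
        row = []
        for i in range(win_w):
            # Map window column i back to a source column inside the glyph.
            src_x = left + (i * glyph_w) // win_w
            row.append(1 if image[src_y][src_x] else 0)
        window.append(row)
    return window
```

A fixed window size gives every glyph the same number of inputs regardless of its original size, which is what a network with a fixed input layer needs.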
Algorithm of Digitization
• In order to feed the matrix data to the network, whose input is one-dimensional, the matrix must first be linearized to a single dimension. This is accomplished with a simple routine using the following algorithm:
• start with the first matrix element (0, 0)
• increment x, keeping y constant, up to the matrix width
– map each element to an element of a linear array (increment the array index)
– if the matrix width is reached, reset x and increment y
• repeat up to the matrix height, (x, y) = (width, height)
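The linearization routine can be sketched in a few lines of Python (not the project's VB.NET code). The matrix is assumed to be a list of rows, and the function name is chosen for this example only.

```python
def linearize(matrix):
    """Map a height x width matrix to a one-dimensional list, row by row."""
    height = len(matrix)
    width = len(matrix[0])
    linear = [0] * (width * height)
    index = 0
    for y in range(height):          # repeat up to the matrix height
        for x in range(width):       # increment x, keeping y constant
            linear[index] = matrix[y][x]
            index += 1               # advance the linear array index
    return linear
```

With this row-by-row order, element (x, y) of the matrix lands at index y * width + x of the linear array.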
MATRIX MAPPING
TRAINING
• These patterns are used to teach the neural network to recognize the images. Basically, each training pattern consists of two single-dimensional arrays of floating-point numbers: the Inputs and Outputs arrays. The Inputs array contains the input data; in our case it is a digitized representation of the character's image. Output array will
Network Architecture

• In this system the candidate pattern is the input.
• Block M provides the input matrix M to the weight blocks Wk for each k.
Acknowledgement
We are grateful to our lecturers, and especially to our guide Mr. AMOL G. MULEY, who helped us in undertaking this project. Moreover, if we have inadvertently made any mistakes, we would highly appreciate it if our lecturers would point them out and correct them.
