Determine the general properties of each obtained segment in order to extract
valid characters: their width, their height, the space between individual characters, and
the space between words.
Vertical spacing plays a crucial role in separating characters and dissecting them into
subimages. Thus, spacing and pitch play a crucial role in identifying segmentation points.
Projection methods [1] are essentially used for good-quality machine printing,
exploiting the blank columns that separate characters.
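The column-separation idea above can be sketched as follows (a minimal illustration, not the method of the cited source; it assumes a binarized line image stored as a list of rows of 0/1 values, with 1 marking a black pixel):

```python
def column_profile(image):
    """Count black pixels (1s) in each column of a 2-D binary image."""
    return [sum(row[c] for row in image) for c in range(len(image[0]))]

def split_columns(image):
    """Return (start, end) column ranges of runs containing black pixels,
    splitting wherever a fully blank column occurs."""
    profile = column_profile(image)
    segments, start = [], None
    for c, count in enumerate(profile):
        if count > 0 and start is None:
            start = c                      # a character begins here
        elif count == 0 and start is not None:
            segments.append((start, c))    # a blank column ends it
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments
```

Each returned range can then be cropped out as one character subimage.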
The methods described above are feasible only when the characters are printed,
since the width of the characters and the spacing between them are then constant. In the case of
handwritten text, such pitch-based or column-based approaches produce limited accuracy.
Segmentation of such characters requires a two-dimensional process: identifying the black
regions and then splitting them accordingly. The two major techniques employed for this are
bounding-box analysis and the splitting of connected components.
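A minimal sketch of the connected-component idea, assuming the same 0/1 list-of-rows representation (the breadth-first flood fill shown here is one illustrative labeling choice, not taken from any cited source):

```python
from collections import deque

def connected_components(image):
    """Label 4-connected black regions of a binary image and return their
    bounding boxes as (top, left, bottom, right), bottom/right exclusive."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                # flood-fill the region starting at this unvisited black pixel
                queue = deque([(r, c)])
                seen[r][c] = True
                top, left, bottom, right = r, c, r, c
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom + 1, right + 1))
    return boxes
```

Each bounding box delimits one candidate character region for further splitting or merging.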
The techniques used in each stage are completely different from one another,
making it hard to select one method for every input type.
The subimages that are obtained depend on the text in the image.
Recognition can be done either serially, identifying words left to right, or in parallel,
wherein an optimal path is generated from a lattice of feature-to-letter combinations.
Progressive developments have been made with this model, thus making the
newer versions more efficient.
There are multiple ways in which this system can be deployed, each of which
has a number of alternatives depending on the type of input provided and the
required accuracy.
References
[1] V.A. Kovalevsky, Character readers and Pattern Recognition, Spartan Books, Washington
D.C., 1968.
[2] C.J.C. Burges, J.I. Be and C.R. Nohl, Recognition of Handwritten Cursive Postal Words
using Neural Networks, Proc. USPS 5th Advanced Technology Conference, page A-117,
Nov/Dec. 1992.
[3] K. Fukushima and T. Imagawa, Recognition and segmentation of connected characters with
selective attention, Neural Networks, vol. 6, no. 1, pp. 33-41, 1993.
The criterion is essentially that there is a predefined set of lexicon entries, which are used as
recognition patterns, rather than dividing words into letters and testing each of them either serially or in parallel.
Salient Features
A middle-zone technique is used, by which ascenders and descenders are
checked and identified [2].
This technique uses dynamic programming in order to ensure that the speed of the process is
maintained [1].
Flaws
A major drawback of this class of methods is that their use is restricted to
a predefined lexicon: because they do not deal directly with letters but only with whole words,
recognition is necessarily constrained to that lexicon.
This approach is well suited when the reference image and the template image have
similar or common features or control points [1].
Salient features
Features used to recognize characters include points, curves, or a
surface model that needs to be matched.
Feature extraction generates features that can be used for selection and
classification.
When features are detected, they are saved into a feature vector. These
descriptors are then matched against a reference image [3].
The main objective is to locate a pair-wise correspondence between the reference and
template using the spatial relations of their features.
This method could be made less tedious by scaling down the reference image and
template image by the same factor, thereby reducing the number of features that need to be
compared, but this would compromise accuracy.
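The descriptor-matching step described above might be sketched as follows (an illustrative sketch; the descriptors are assumed to be fixed-length numeric tuples already extracted from the two images, and nearest-neighbor Euclidean matching is one common choice):

```python
import math

def match_features(reference, template):
    """Pair each template descriptor with its nearest reference descriptor.
    Returns (template_index, reference_index, distance) tuples."""
    def dist(a, b):
        # Euclidean distance between two equal-length descriptors
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    matches = []
    for ti, t in enumerate(template):
        best = min(range(len(reference)), key=lambda ri: dist(t, reference[ri]))
        matches.append((ti, best, dist(t, reference[best])))
    return matches
```

The resulting pairs give the spatial correspondences from which a match between the two images can be judged.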
Inference
Feature-based template matching requires a basic set of features that can be
used to characterize the script. The more features used, the more effective the
algorithm will be. The feature-based algorithm is quite versatile, as it can be used on various
forms of representation. To be effective at character recognition, it should use a
large number of features, but only relevant ones.
Flaws
Not as well developed as older methods, hence current efficiency is not high.
To prove efficient, the algorithm would require a large number
of features to achieve high accuracy in character recognition [2].
The feature (key) points used should be distinct enough to support character
recognition.
It is important to identify the needed features and discard the useless ones, as they
could hinder the efficiency of the algorithm.
References
[1] http://maxwellsci.com/print/rjaset/v4-5469-5473.pdf
[2] http://www.ccs.neu.edu/home/feneric/charrec.html
[3] http://www.flll.jku.at/sites/default/files/u20/QCAV2011_Template_Matching_Textures.pdf
[4] http://davinci.fmph.uniba.sk/~uhliarik4/recognition/resources/due_trier_1996_feature_extraction_survey.pdf
Salient features
The aforementioned method has the following steps: acquiring the image, filtering,
thresholding, clustering the image of the character, and recognizing the
character.
Statistical templates are those that condense the characteristics of training samples
of a particular character class. The training samples belonging to the same character class are
superimposed. Superimposition determines the probability of each bit being white
given a sample of the class. Here white bits are given the value 1 and black bits the
value 0, so each bit of the template takes a value in the range 0-1. This method uses that whiteness
of the bit, i.e., the whiter the bit, the more reliable it is.
Average templates, also known as binary templates, are derived from statistical
templates by rounding off the probability values to obtain an acceptable, general
result.
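The superimposition step can be sketched as follows (a minimal illustration assuming equal-sized 0/1 grids with 1 = white, per the convention above; the scoring function is one plausible way to use the whiteness values, not taken from the cited sources):

```python
def statistical_template(samples):
    """Superimpose binary training samples (1 = white) of one character class:
    each cell becomes the fraction of samples in which that bit is white."""
    n = len(samples)
    rows, cols = len(samples[0]), len(samples[0][0])
    return [[sum(s[r][c] for s in samples) / n for c in range(cols)]
            for r in range(rows)]

def match_score(template, image):
    """Score an unknown binary image against a probability template: a bit
    contributes its template probability when white, its complement when black,
    so whiter (more reliable) bits weigh more when they agree."""
    return sum(t if bit else (1 - t)
               for trow, irow in zip(template, image)
               for t, bit in zip(trow, irow))
```

An unknown character would be assigned to the class whose template yields the highest score.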
Inference
Due to the vast number of templates required to recognize the characters, a large
amount of memory is needed to carry out the algorithm.
If the character to be recognized is not gray-scaled, then color information would
be required, which would complicate the recognition.
If the templates and the unknown character input are not scaled alike, then efficient
character recognition becomes difficult [2].
Even if no match is present in the set, the algorithm will still find one, thereby
resulting in an incorrect recognition.
References
[1] http://www1.chihlee.edu.tw/teachers/ctw/paper/2005/imp2005-2.pdf
[2] http://www.ehu.eus/ccwintco/uploads/e/eb/PFC-IonMarques.pdf
[3] http://www.academia.edu/714194/Optical_Character_Recognition_By_Using_Template_Matching_Alphabet_
[4] http://umpir.ump.edu.my/969/1/NaCSES-2007086_Optical_Character_Recognition_By_Using_Templ.pdf
Algorithm 3: Character Recognition using Logistic Regression
Method 1: Character Recognition using Quantization
Quantization in character recognition is the process by which a large, possibly infinite, set of values
is reduced to a much smaller set, for example by rounding values. A function that performs
quantization is known as a quantizer. Quantization is further classified into two kinds: scalar
quantization and vector quantization.
Sub Topic: Scalar Quantization
A common quantization method in which a scalar input is mapped to a scalar output by a
quantization function. Scalar quantization can be as simple as rounding off high-precision
numbers.
Salient features
When many scalars need to be quantized but each is quantized independently,
the process is called scalar quantization.
It covers many basic ideas, such as rounding off a precise number or mapping a
continuous space to a discrete one.
A function that carries out quantization is called a quantizer, and several factors depend
on its design, such as the amount of compression obtained or the loss incurred while
performing the algorithm.
A quantizer consists of:
1. Encoder mapping
2. Decoder mapping
In encoder mapping, the encoder divides a range into a number of intervals, each of
which is represented by a codeword. Hence the codeword identifies the interval
but not the exact value.
In decoder mapping, the decoder generates a reconstruction value from the codeword
it receives. Since the codeword identifies only the interval, the decoder produces an
estimated value, usually the midpoint of the interval.
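The encoder/decoder pair above can be sketched for a uniform quantizer (an illustrative sketch; the uniform interval layout and the clamping of out-of-range inputs are assumptions, not details from the cited sources):

```python
def make_quantizer(lo, hi, levels):
    """Uniform scalar quantizer over [lo, hi) with the given number of intervals."""
    step = (hi - lo) / levels

    def encode(x):
        # Encoder mapping: value -> codeword (the index of its interval)
        i = int((x - lo) / step)
        return max(0, min(levels - 1, i))  # clamp out-of-range inputs

    def decode(code):
        # Decoder mapping: codeword -> reconstruction (interval midpoint)
        return lo + (code + 0.5) * step

    return encode, decode
```

The gap between a value and its reconstructed midpoint is exactly the quantization noise the Inference below refers to.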
Inference
Scalar quantization processes each scalar value independently. It
reduces a large set of values into a smaller one, i.e., a many-to-one mapping is carried out,
which is irreversible. The algorithm is carried out by a quantizer, whose design affects the
compression obtained and the loss incurred. The algorithm reduces the range of values, thereby
saving memory, but it introduces noise into the image.
Flaws
Does not maintain the correlation that exists between the samples.
References
[1] http://shodhganga.inflibnet.ac.in/bitstream/10603/2268/17/17_chapter%204.pdf
[2] http://nptel.ac.in/courses/117104069/chapter_5/5_5.html
[3] www.cs.ucf.edu/courses/cap5015/vector.ppt
Sub Topic: Vector Quantization
Data is quantized in the form of contiguous blocks called vectors rather than as
individual samples.
Earlier versions of the algorithm provided transparency only for vectors with a large
number of dimensions, but newer versions allow transparency for vectors with smaller
dimensions.
A simple vector quantizer uses a 2-D vector space and divides it into hexagonal
regions called encoding regions. Each region contains the vectors to
be quantized together with a codeword, and the vectors in a region are best
represented by the codeword of that same region.
A training set is required to generate the codewords, and this is carried out using an
iterative algorithm.
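The iterative training loop can be sketched in a Lloyd/LBG style (an illustrative sketch of one common form of the algorithm; the initial codebook and the centroid update rule are assumptions, not details from the cited sources):

```python
import math

def nearest(codebook, v):
    """Index of the codeword closest to vector v (Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(codebook[i], v))

def train_codebook(vectors, codebook, iterations=10):
    """Iteratively refine the codebook: assign each training vector to its
    nearest codeword, then move each codeword to the centroid of the
    vectors assigned to it (empty cells keep their old codeword)."""
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        codebook = [
            tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else cw
            for cw, cl in zip(codebook, clusters)
        ]
    return codebook
```

After training, quantizing a new vector is simply a `nearest` lookup followed by emitting that codeword's index.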
Vector quantization groups samples together and quantizes them jointly, thereby maintaining the
correlation between the samples. This algorithm has higher efficiency than scalar
quantization, as it maintains both correlation and transparency.
Flaws
Quantizing vectors increases the system complexity and hence increases the memory
requirement.
If the dimension of the vectors is quite small, then poor-quality results are
produced.
If the application requires a high bit-rate, then it will have high complexity and
memory requirements.
References
[1] http://shodhganga.inflibnet.ac.in/bitstream/10603/2268/17/17_chapter%204.pdf
[2] http://www.dcs.gla.ac.uk/~vincia/papers/lvq.pdf
[3] www.cs.ucf.edu/courses/cap5015/vector.ppt
The method computes a weighted sum of the ranks scored by the individual
classifiers and derives a consensus ranking. The weights are estimated by a logistic regression
analysis.
An assumption made here is that each of the constituent
recognition techniques computes its output as a ranked character set for a given image. Since the
combination method is applied to this set, it is independent of the internal techniques used in the
algorithms from which the set is computed [1].
Logistic regression requires that each data point be independent of all other data
points. If observations are related to one another, the model will tend to overweight the
significance of those observations. This is a major disadvantage, because much scientific
and social-scientific research relies on techniques involving multiple observations of
the same individuals.
Logistic regression works well for predicting categorical outcomes like admission
or rejection at a particular college. It can also predict multinomial outcomes, like admission,
rejection or wait list. However, logistic regression cannot predict continuous outcomes [2].
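As an illustration of the categorical prediction described above, a one-feature logistic model can be fit by plain gradient descent (a toy sketch; the data, learning rate, and epoch count are made up for illustration and are not from the cited sources):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit P(y=1|x) = sigmoid(w*x + b) by stochastic gradient descent
    on the log-loss over binary labels ys."""
    w = b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y    # gradient of the log-loss
            w -= lr * err * x
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Categorical decision: class 1 if the modeled probability exceeds 0.5."""
    return 1 if sigmoid(w * x + b) > 0.5 else 0
```

The output is a discrete class label, never a continuous value, which is exactly the limitation noted above.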
References
[1] http://ect.bell-labs.com/who/tkh/publications/papers/spie92.pdf
[2] http://www.ehow.com/info_8574447_disadvantages-logistic-regression.html