
Pattern Recognition, Vol. 29, No. 4, pp. 641-662, 1996


Elsevier Science Ltd
Copyright 1996 Pattern Recognition Society
Printed in Great Britain. All rights reserved
0031-3203/96 $15.00+.00
0031-3203(95)00118-2
FEATURE EXTRACTION METHODS FOR CHARACTER RECOGNITION -- A SURVEY
ØIVIND DUE TRIER,†‡ ANIL K. JAIN and TORFINN TAXT‡
‡ Department of Informatics, University of Oslo, P.O. Box 1080 Blindern, N-0316 Oslo, Norway
Department of Computer Science, Michigan State University, A714 Wells Hall, East Lansing, MI 48824-1027, U.S.A.
(Received 19 January 1995; in revised form 19 July 1995; received for publication 11 August 1995)
† Author to whom correspondence should be addressed. This work was done while Ø. D. Trier was visiting Michigan State University.
Abstract--This paper presents an overview of feature extraction methods for off-line recognition of
segmented (isolated) characters. Selection of a feature extraction method is probably the single most
important factor in achieving high recognition performance in character recognition systems. Different
feature extraction methods are designed for different representations of the characters, such as solid binary
characters, character contours, skeletons (thinned characters) or gray-level subimages of each individual
character. The feature extraction methods are discussed in terms of invariance properties, reconstructability
and expected distortions and variability of the characters. The problem of choosing the appropriate feature
extraction method for a given application is also discussed. When a few promising feature extraction methods
have been identified, they need to be evaluated experimentally to find the best method for the given
application.
Feature extraction Optical character recognition Character representation Invariance
Reconstructability
1. INTRODUCTION
Optical character recognition (OCR) is one of the most
successful applications of automatic pattern recogni-
tion. Since the mid 1950s, OCR has been a very active
field for research and development.(1) Today, reason-
ably good OCR packages can be bought for as little as
$100. However, these are only able to recognize high
quality printed text documents or neatly written hand-
printed text. The current research in OCR is now
addressing documents that are not well handled by the
available systems, including severely degraded, omni-
font machine-printed text and (unconstrained) hand-
written text. Also, efforts are being made to achieve
lower substitution error rates and reject rates even on
good quality machine-printed text, since an experi-
enced human typist still has a much lower error rate,
albeit at a slower speed.
Selection of a feature extraction method is probably
the single most important factor in achieving high
recognition performance. Our own interest in charac-
ter recognition is to recognize hand-printed digits in
hydrographic maps (Fig. 1), but we have tried not to
emphasize this particular application in the paper.
Given the large number of feature extraction methods
reported in the literature, a newcomer to the field is
faced with the following question: which feature extraction
method is the best for a given application?
This question led us to characterize the available
feature extraction methods, so that the most promising
methods could be sorted out. An experimental evalu-
ation of these few promising methods must still be
performed to select the best method for a specific
application. In this process, one might find that a speci-
fic feature extraction method needs to be further
developed. A full performance evaluation of each
method in terms of classification accuracy and speed is
not within the scope of this review paper. In order to
study performance issues, we will have to implement
all the feature extraction methods, which is an enor-
mous task. In addition, the performance also depends
on the type of classifier used. Different feature types
may need different types of classifiers. Also, the classi-
fication results reported in the literature are not
comparable because they are based on different data
sets.
Given the vast number of papers published on OCR
every year, it is impossible to include all the available
feature extraction methods in this survey. Instead, we
have tried to make a representative selection to illus-
trate the different principles that can be used.
Two-dimensional (2-D) object classification has several
applications in addition to character recognition.
These include airplane recognition,(2) recognition of
mechanical parts and tools,(3) and tissue classification
in medical imaging.(4) Several of the feature extraction
techniques described in this paper for OCR have also
been found to be useful in such applications.
Fig. 1. A gray-scale image of a part of a hand-printed hydrographic map.
An OCR system typically consists of the following
processing steps (Fig. 2):
(1) gray-level scanning at an appropriate resolution, typically 300-1000 dots per inch;
(2) preprocessing:
(a) binarization (two-level thresholding), using a global or a locally adaptive method;
(b) segmentation to isolate individual characters;
(c) (optional) conversion to another character representation (e.g. skeleton or contour curve);
(3) feature extraction;
(4) recognition using one or more classifiers;
(5) contextual verification or postprocessing.
Survey papers,(5-7) books(8-12) and evaluation
studies(13-16) cover most of these subtasks and several
[Figure 2 shows the processing pipeline: paper document -> scanning -> gray-level image -> preprocessing -> single characters -> feature extraction -> feature vectors -> classification -> classified characters -> postprocessing -> classified text.]
Fig. 2. Steps in a character recognition system.
general surveys of OCR systems(17-22) also exist.
However, to our knowledge, no thorough, up-to-date
survey of feature extraction methods for OCR is available.
Devijver and Kittler define feature extraction [page
12 in reference (11)] as the problem of "extracting from
the raw data the information which is most relevant for
classification purposes, in the sense of minimizing the
within-class pattern variability while enhancing the
between-class pattern variability". It should be clear
that different feature extraction methods fulfill this
requirement to a varying degree, depending on the
specific recognition problem and available data. A feature
extraction method that proves to be successful in
one application domain may turn out not to be very
useful in another domain.
One could argue that there is only a limited number
of independent features that can be extracted from
a character image, so that which set of features is used
is not so important. However, the extracted features
must be invariant to the expected distortions and
variations that the characters may have in a specific
application. Also, the phenomenon called the curse of
dimensionality(9,23) cautions us that with a limited
training set, the number of features must be kept
reasonably small if a statistical classifier is to be used.
A rule of thumb is to use 5 to 10 times as many training
patterns of each class as the dimensionality of the
feature vector(23) (so, for example, a 20-dimensional feature
vector calls for roughly 100-200 training samples per class).
In practice, the requirements of
a good feature extraction method make selection of
the best method for a given application a challenging
task. One must also consider whether the characters to
be recognized have known orientation and size,
whether they are handwritten, machine-printed or
typed, and to what degree they are degraded. Also,
more than one pattern class may be necessary to
characterize characters that can be written in two or
more distinct ways, as for example the two common
ways of writing "4", and the two ways of writing "a".
Feature extraction is an important step in achieving
good performance of OCR systems. However, the
other steps in the system (Fig. 2) also need to be
optimized to obtain the best possible performance, and
these steps are not independent. The choice of feature
extraction method limits or dictates the nature and
output of the preprocessing step (Table 1). Some feature
extraction methods work on gray-level subimages
of single characters (Fig. 3), while others work on solid
four- or eight-connected symbols segmented from the
binary raster image (Fig. 4), thinned symbols or skeletons
(Fig. 5), or symbol contours (Fig. 6). Further, the
type or format of the extracted features must match the
requirements of the chosen classifier. Graph descriptions
or grammar-based descriptions of the characters
are well suited for structural or syntactic classifiers.
Discrete features that may assume only, say, two or
three distinct values are ideal for decision trees. Real-valued
feature vectors are ideal for statistical classifiers.
However, multiple classifiers may be used, either
as a multi-stage classification scheme(24,25) or as
Table 1. Overview of feature extraction methods for the various representation forms (gray level, binary,
vector).

Gray-scale subimage:  template matching, deformable templates, unitary transforms, zoning, geometric moments, Zernike moments.
Binary solid symbol:  template matching, unitary transforms, projection histograms, zoning, geometric moments, Zernike moments.
Binary outer contour: contour profiles, zoning, spline curve, Fourier descriptors.
Vector (skeleton):    template matching, deformable templates, graph description, discrete features, zoning, Fourier descriptors.
Fig. 3. Gray-scale subimages (ca 30 x 30 pixels) of segmented characters. These digits were extracted from
the top center portion of the map in Fig. 1. Note that for some of the digits, parts of other print objects are also
present inside the character image.
Fig. 4. Digits from the hydrographic map in the binary raster representation.
Fig. 5. Skeletons of the digits in Fig. 4, thinned with the method of Zhang and Suen.(28) Note that junctions
are displaced and a few short false branches occur.
parallel classifiers, where a combination of the individual
classification results is used to decide the final
classification.(12,26,27) In that case, features of more
than one type or format may be extracted from the
input characters.
1.1. Invariants
In order to recognize many variations of the same
character, features that are invariant to certain transformations
on the character need to be used.
Fig. 6. Contours of two of the digits in Fig. 4.
Fig. 7. Transformed versions of digit "5". (a) original,
(b) rotated, (c) scaled, (d) stretched, (e) skewed, (f) de-skewed
and (g) mirrored.
Invariants are features which have approximately the same
values for samples of the same character that are, for
example, translated, scaled, rotated, stretched, skewed
or mirrored (Fig. 7). However, not all variations
among characters from the same character class (e.g.
noise or degradation and absence or presence of serifs)
can be modeled using invariants.

Size and translation invariance is easily achieved.
The segmentation of individual characters can itself
provide estimates of size and location, but the feature
extraction method may often provide more accurate
estimates.
Rotation invariance is important if the characters to
be recognized can occur in any orientation. However,
if all the characters are expected to have the same
rotation, then rotation-variant features should be used
to distinguish between such characters as "6" and "9",
and "n" and "u". Another alternative is to use rotation-invariant
features, augmented with the detected rotation
angle. If the rotation angle is restricted, say, to lie
between -45° and 45°, characters that are, say, 180°
rotations of each other can be differentiated. The same
principle may be used for size-invariant features, if one
wants to recognize punctuation marks in addition to
characters and wants to distinguish between, say, ".",
"o" and "O", and "," and "9".
Skew-invariance may be useful for hand-printed
text, where the characters may be more or less slanted,
and multifont machine-printed text, where some fonts
are slanted and some are not. Invariance to mirror
images is not desirable in character recognition, as the
mirror image of a character may produce an illegitimate
symbol or a different character.

For features extracted from gray-scale subimages,
invariance to contrast between print and background
and to mean gray level may be needed, in addition to
the other invariants mentioned above. Invariance to
mean gray level is easily obtained by adding to each
pixel the difference of the desired and the actual mean
gray levels of the image.(29)
If invariant features cannot be found, an alternative
is to normalize the input images to have standard size,
rotation, contrast and so on. However, one should
keep in mind that this introduces new discretization
errors.
1.2. Reconstructability
For some feature extraction methods, the characters
can be reconstructed from the extracted features.(30,31)
This property ensures that complete
information about the character shape is presented in
the extracted features. Although, for some methods,
exact reconstruction may require an arbitrarily large
number of features, reasonable approximations of the
original character shape can usually be obtained by
using only a small number of features with the highest
information content. The hope is that these features
also have high discrimination power.

By reconstructing the character images from the
extracted features, one may visually check that a sufficient
number of features is used to capture the essential
structure of the characters. Reconstruction may also
be used to informally control that the implementation
is correct.
The rest of the paper is organized as follows. Sections
2-5 give a detailed review of feature extraction
methods, grouped by the various representation forms
of the characters. A short summary on neural network
classifiers is given in Section 6. Section 7 gives guidelines
for how one should choose the appropriate feature
extraction method for a given application. Finally,
a summary is given in Section 8.
2. FEATURES EXTRACTED FROM GRAY-SCALE IMAGES
A major challenge in gray-scale image-based
methods is to locate candidate character locations.
One can use a locally adaptive binarization method to
obtain a good binary raster image and use connected
components of the expected character size to locate the
candidate characters. However, a gray-scale-based
method is typically used when recognition based on
the binary raster representation fails, so the localization
problem remains unsolved for difficult images.
One may have to resort to the brute force approach of
trying all possible locations in the image. However,
one then has to assume a standard size for a character
image, as the combination of all character sizes and
locations is computationally prohibitive. This approach
cannot be used if the character size is expected
to vary.
The desired result of the localization or segmentation
step is a subimage containing one character and,
except for background pixels, no other objects. However,
when print objects appear very close to each
other in the input image, this goal cannot always be
achieved. Often, other characters or print objects may
accidentally occur inside the subimage (Fig. 3), possibly
distorting the extracted features. This is one of the
reasons why every character recognition system has
a reject option.
2.1. Template matching
We are not aware of OCR systems using template
matching on gray-scale character images. However,
since template matching is a fairly standard image
processing technique,(32,33) we have included this section
for completeness.
In template matching, the feature extraction step is
left out altogether and the character image itself is used
as a "feature vector". In the recognition stage, a similarity
(or dissimilarity) measure between each template
T_j and the character image Z is computed. The template
T_k which has the highest similarity measure is
identified, and if this similarity is above a specified
threshold, then the character is assigned the class label
k. Else, the character remains unclassified. In the case
of a dissimilarity measure, the template T_k having the
lowest dissimilarity measure is identified, and if the
dissimilarity is below a specified threshold, the character
is given the class label k.
A common dissimilarity measure is the mean-square
distance D [equation 20.1-1 in Pratt(33)]:

D_j = \sum_{i=1}^{M} [Z(x_i, y_i) - T_j(x_i, y_i)]^2,   (1)

where it is assumed that the template and the input
character image are of the same size and the sum is
taken over the M pixels in the image.
Equation (1) can be rewritten as:

D_j = E_Z - 2 R_{Z T_j} + E_{T_j},   (2)

where

E_Z = \sum_{i=1}^{M} Z^2(x_i, y_i),   (3)

R_{Z T_j} = \sum_{i=1}^{M} Z(x_i, y_i) T_j(x_i, y_i),   (4)

E_{T_j} = \sum_{i=1}^{M} T_j^2(x_i, y_i).   (5)
E_Z and E_{T_j} are the total character image energy and the
total template energy, respectively. R_{Z T_j} is the cross-correlation
between the character and the template,
and could have been used as a similarity measure, but
Pratt(33) points out that R_{Z T_j} may detect a false match
if, say, Z contains mostly high values. In that case, E_Z
also has a high value and it could be used to normalize
R_{Z T_j} by the expression \hat{R}_{Z T_j} = R_{Z T_j} / E_Z. However, in
Pratt's formulation of template matching, one wants to
decide whether the template is present in the image
(and obtain the locations of each occurrence). Our
problem is the opposite: find the template that matches
the character image best. Therefore, it is more relevant
to normalize the cross-correlation by dividing it with
the total template energy:

\hat{R}_{Z T_j} = \frac{R_{Z T_j}}{E_{T_j}}.   (6)

Experiments are needed to decide whether D_j or \hat{R}_{Z T_j}
should be used for OCR.
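To make the matching rules concrete, here is a minimal sketch in Python with NumPy (our own illustration; the function and parameter names are hypothetical, not from the cited literature). It classifies a character image by the template with the smallest mean-square distance D_j of equation (1), or by the largest energy-normalized cross-correlation of equation (6), with a reject option:

import numpy as np

def classify_by_template(Z, templates, threshold, use_distance=True):
    """Assign Z the label of the best-matching template, or None (reject).
    Z: 2-D gray-scale character image; templates: dict label -> array of
    the same shape as Z (all templates are assumed pre-scaled)."""
    best_label, best_score = None, None
    for label, T in templates.items():
        if use_distance:
            score = np.sum((Z - T) ** 2)        # D_j, equation (1)
            better = best_score is None or score < best_score
        else:
            R = np.sum(Z * T)                   # cross-correlation R_ZTj, equation (4)
            E_T = np.sum(T ** 2)                # template energy E_Tj, equation (5)
            score = R / E_T                     # normalized correlation, equation (6)
            better = best_score is None or score > best_score
        if better:
            best_label, best_score = label, score
    if use_distance:
        return best_label if best_score <= threshold else None
    return best_label if best_score >= threshold else None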
Although simple, template matching suffers from
some obvious limitations. One template is only capable
of recognizing characters of the same size and rotation,
is not illumination-invariant (invariant to contrast and
to mean gray level) and is very vulnerable to noise and
small variations that occur among characters from the
same class. However, many templates may be used for
each character class, but at the cost of higher computational
time, since every input character has to be compared
with every template. The character candidates in
the input image can be scaled to suit the template sizes,
thus making the recognizer scale-independent.
2.2. Deformable templates
Deformable templates have been used extensively in
several object recognition applications.(34,35) Recently,
Del Bimbo et al.(36) proposed to use deformable
templates for character recognition in gray-scale images
of credit card slips with poor print quality. The
templates used were character skeletons. It is not clear
how the initial positions of the templates were chosen.
If all possible positions in the image were to be tried,
then the computational time would be prohibitive.
2.3. Unitary image transforms
In template matching, all the pixels in the gray-scale
character image are used as features. Andrews(37) applies
a unitary transform to character images, obtaining
a reduction in the number of features while
preserving most of the information about the character
shape. In the transformed space, the pixels are ordered
by their variance, and the pixels with the highest variance
are used as features. The unitary transform has to
be applied to a training set to obtain estimates of the
variances of the pixels in the transformed space. Andrews
investigated the Karhunen-Loeve (KL),
Fourier, Hadamard (or Walsh) and Haar transforms in
1971.(37) He concluded that the KL transform was too
computationally demanding, so he recommended to
use the Fourier or Hadamard transform. However, the
KL transform is the only (mean-squared error) optimal
unitary transform in terms of information compression.(38)
When the KL transform is used, the same
amount of information about the input character image
is contained in fewer features compared to any
other unitary transform.
Other unitary transforms include the Cosine, Sine
and Slant transforms.(38) It has been shown that the
Cosine transform is better in terms of information
compression [e.g. see pp. 375-379 in reference (38)]
than the other nonoptimal unitary transforms. Its
computational cost is comparable to that of the fast
Fourier transform, so the Cosine transform has been
coined "the method of choice for image data compression".(38)
The KL transform has been used for object recognition
in several application domains, for example face
recognition.(39) It is also a realistic alternative for OCR
on gray-level images with today's fast computers.

The features extracted from unitary transforms are
not rotation-invariant, so the input character images
have to be rotated to a standard orientation if rotated
characters may occur. Further, the input images have
to be of exactly the same size, so a scaling or resampling
is necessary if the size can vary. The unitary
transforms are not illumination invariant, but for the
Fourier transformed image the value at the origin is
proportional to the average pixel value of the input
image, so this feature can be deleted to obtain brightness
invariance. For all unitary transforms, an inverse
transform exists, so the original character image can be
reconstructed.
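As a concrete illustration of KL-transform feature extraction, the sketch below (our own; it assumes a training set of equally sized character images and uses the eigenvectors of the sample covariance matrix) estimates the transform from training data and keeps the directions of highest variance as features:

import numpy as np

def fit_kl_transform(train_images, n_features):
    """train_images: array of shape (n_samples, h, w). Returns (mean, basis),
    where basis holds the n_features eigenvectors of largest variance."""
    X = train_images.reshape(len(train_images), -1).astype(float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)        # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_features]
    return mean, eigvecs[:, order]

def kl_features(image, mean, basis):
    """Project one character image onto the retained KL basis vectors."""
    return (image.reshape(-1) - mean) @ basis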
2.4. Zoning
The commercial OCR system by Calera described in
Bokser(40) uses zoning on solid binary characters.
A straightforward generalization of this method to
gray-level character images is given here. An n × m grid
is superimposed on the character image [Fig. 8(a)] and
for each of the n·m zones, the average gray level is
computed [Fig. 8(b)], giving a feature vector of length
n × m. However, these features are not illumination
invariant.
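The gray-level zoning just described amounts to block averaging; a minimal sketch (our own) computing the n × m feature vector of Fig. 8 is:

import numpy as np

def zoning_features(Z, n, m):
    """Average gray level in each cell of an n x m grid over image Z."""
    return np.array([cell.mean()
                     for row in np.array_split(Z, n, axis=0)
                     for cell in np.array_split(row, m, axis=1)])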
2.5. Geometric moment invariants

Hu(41) introduced the use of moment invariants as
features for pattern recognition. Hu's absolute orthogonal
moment invariants (invariant to translation, scale
and rotation) have been extensively used [see, e.g.
references (29, 42-45)]. Li(45) listed 52 Hu invariants, of
Fig. 8. Zoning of gray-level character images. (a) A 4 x 4 grid
superimposed on a character image. (b) The average gray
levels in each zone, which are used as features.
orders 2-9, that are translation-, scale- and rotation-invariant.
Belkasim et al.(43) listed 32 Hu invariants of
orders 2-7. However, Belkasim et al. identified fewer
invariants of orders 2-7 than Li.
Hu also developed moment invariants that were
supposed to be invariant under general linear transformations:

\begin{bmatrix} x' \\ y' \end{bmatrix} = A \begin{bmatrix} x \\ y \end{bmatrix} + b,   (7)

where

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},  b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.   (8)

Reiss(29) has recently shown that these Hu invariants
are in fact incorrect and provided corrected expressions
for them.
Given a gray-scale subimage Z containing a character
candidate, the regular moments(29) of order (p + q)
are defined as:

m_{pq} = \sum_{i=1}^{M} Z(x_i, y_i) x_i^p y_i^q,   (9)

where the sum is taken over all the M pixels in the
subimage. The translation-invariant central moments(29)
of order (p + q) are obtained by placing the
origin at the center of gravity:

\mu_{pq} = \sum_{i=1}^{M} Z(x_i, y_i) (x_i - \bar{x})^p (y_i - \bar{y})^q,   (10)

where

\bar{x} = \frac{m_{10}}{m_{00}},  \bar{y} = \frac{m_{01}}{m_{00}}.   (11)
Hu(41) showed that:†

\nu_{pq} = \frac{\mu_{pq}}{\mu^{1+(p+q)/2}},  p + q \ge 2,   (12)

are scale-invariant, where \mu = \mu_{00} = m_{00}. From the
\nu_{pq}s, rotation-invariant features can be constructed.
For example, the second-order invariants are:

\phi_1 = \nu_{20} + \nu_{02},   (13)

\phi_2 = (\nu_{20} - \nu_{02})^2 + 4 \nu_{11}^2.   (14)
Invariants for general linear transformations are computed
via relative invariants.(29,41) Relative invariants
satisfy:

I'_j = |A^T|^{w_j} |J|^{k_j} I_j,   (15)

where I_j is a function of the moments in the original
(x, y) space, I'_j is the same function computed from the
moments in the transformed (x', y') space, w_j is termed
the weight of the relative invariant, |J| is the absolute
value of the Jacobian of the transposed transformation
† Note that equation (12) is written with a typographical
error in Hu's paper.(41)
matrix A^T and k_j is the order of I_j. Note that the
translation vector b does not appear in equation (15),
as the central moments are independent of translation.
To generate absolute invariants, that is, invariants \psi_j
satisfying:

\psi'_j = \psi_j,   (16)

Reiss(29) used, for linear transformations:

|A^T| = J and \mu' = |J| \mu,   (17)

where \mu = \mu_{00}:

I'_j = |J|^{w_j + k_j} I_j for w_j even,   (18)

I'_j = J |J|^{w_j + k_j - 1} I_j for w_j odd.   (19)

Then, it can be shown that:

\psi_j = \frac{I_j}{\mu^{w_j + k_j}}   (20)

is an invariant if w_j is even and |\psi_j| is an invariant if w_j
is odd.
For general linear transformations, Hu(41) and Reiss(29,42)
gave the following relative invariants that are
functions of the second- and third-order central moments:

I_1 = \mu_{20} \mu_{02} - \mu_{11}^2   (21)

I_2 = (\mu_{30} \mu_{03} - \mu_{21} \mu_{12})^2
    - 4 (\mu_{30} \mu_{12} - \mu_{21}^2)(\mu_{21} \mu_{03} - \mu_{12}^2)   (22)

I_3 = \mu_{20} (\mu_{21} \mu_{03} - \mu_{12}^2) - \mu_{11} (\mu_{30} \mu_{03} - \mu_{21} \mu_{12})
    + \mu_{02} (\mu_{30} \mu_{12} - \mu_{21}^2)   (23)

I_4 = \mu_{30}^2 \mu_{02}^3 - 6 \mu_{30} \mu_{21} \mu_{11} \mu_{02}^2
    + 6 \mu_{30} \mu_{12} \mu_{02} (2 \mu_{11}^2 - \mu_{20} \mu_{02})
    + \mu_{30} \mu_{03} (6 \mu_{20} \mu_{11} \mu_{02} - 8 \mu_{11}^3)
    + 9 \mu_{21}^2 \mu_{20} \mu_{02}^2 - 18 \mu_{21} \mu_{12} \mu_{20} \mu_{11} \mu_{02}
    + 6 \mu_{21} \mu_{03} \mu_{20} (2 \mu_{11}^2 - \mu_{20} \mu_{02}) + 9 \mu_{12}^2 \mu_{20}^2 \mu_{02}
    - 6 \mu_{12} \mu_{03} \mu_{20}^2 \mu_{11} + \mu_{03}^2 \mu_{20}^3.   (24)
Reiss found the weights w_j and orders k_j to be:

w_1 = 2, w_2 = 6, w_3 = 4, w_4 = 6;   (25)

k_1 = 2, k_2 = 4, k_3 = 3, k_4 = 5.   (26)

Then the following features are invariant under translation
and general linear transformations [given by
Reiss(29) in 1991 and rediscovered by Flusser and
Suk(46,47) in 1993]:

\psi_1 = I_1 / \mu^4,  \psi_2 = I_2 / \mu^{10},   (27)

\psi_3 = I_3 / \mu^7,  \psi_4 = I_4 / \mu^{11}.   (28)

Hu(41) implicitly used k = 1, obtaining incorrect invariants.
Bamieh and de Figueiredo(48) have suggested the
following two relative invariants in addition to I_1
and I_2:†

J_3 = \mu_{40} \mu_{04} - 4 \mu_{31} \mu_{13} + 3 \mu_{22}^2   (29)

J_4 = \mu_{40} \mu_{22} \mu_{04} - 2 \mu_{31} \mu_{22} \mu_{13}
    - \mu_{40} \mu_{13}^2 - \mu_{04} \mu_{31}^2 - \mu_{22}^3.   (30)

As above, these relative invariants must be divided by
\mu^{w+k} to obtain absolute invariants. Regretfully,
Bamieh and de Figueiredo divided J_i by \mu^w (implicitly
using k = 0), so their invariants are also incorrect.
Reiss(29,42) also gave features that are invariant
under changes in contrast, in addition to being invariant
under translation and general linear transformations
(including scale change, rotation and skew). The
three first features are:

\theta_1 = \frac{I_2 \mu^2}{I_1^3},  \theta_2 = \frac{I_3 \mu}{I_1^2},  \theta_3 = \frac{I_1 I_3}{I_4}.   (31)
Experiments with other feature extraction methods
indicate that at least 10-15 features are needed for
a successful OCR system. More moment invariants
(\psi_js and \theta_js) based on higher order moments are given
by Reiss.(42)
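To make equations (10)-(14) concrete, the following sketch (our own; it covers only the translation-, scale- and rotation-invariant second-order features, not the general linear invariants above) computes \phi_1 and \phi_2 for a gray-scale subimage:

import numpy as np

def central_moment(Z, p, q):
    """mu_pq of equation (10) for a gray-scale subimage Z."""
    y, x = np.mgrid[:Z.shape[0], :Z.shape[1]].astype(float)
    m00 = Z.sum()
    xbar, ybar = (Z * x).sum() / m00, (Z * y).sum() / m00
    return (Z * (x - xbar) ** p * (y - ybar) ** q).sum()

def second_order_invariants(Z):
    """phi_1 and phi_2 of equations (13)-(14), using the scale
    normalization nu_pq = mu_pq / mu^(1+(p+q)/2) of equation (12)."""
    mu = central_moment(Z, 0, 0)
    nu = lambda p, q: central_moment(Z, p, q) / mu ** (1 + (p + q) / 2)
    phi1 = nu(2, 0) + nu(0, 2)
    phi2 = (nu(2, 0) - nu(0, 2)) ** 2 + 4 * nu(1, 1) ** 2
    return phi1, phi2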
2.6. Zernike moments
Zernike moments have been used by several authors
for character recognition of binary solid symbols.(31,43,49)
However, initial experiments suggest
that they are well suited for gray-scale character
subimages as well. Both rotation-variant and rotation-invariant
features can be extracted. Features invariant
to illumination need to be developed for these features
to be really useful for gray-level character images.
Khotanzad and Hong(31,49) use the amplitudes of
the Zernike moments as features. A set of complex
orthogonal polynomials V_{nm}(x, y) is used [equations
(32), (33)].‡ The Zernike moments are projections of
the input image onto the space spanned by the orthogonal
V functions:

V_{nm}(x, y) = R_{nm}(x, y) e^{j m \tan^{-1}(y/x)},   (32)

where j = \sqrt{-1}, n \ge 0, |m| \le n, n - |m| is even, and

R_{nm}(x, y) = \sum_{s=0}^{(n-|m|)/2} \frac{(-1)^s (x^2 + y^2)^{(n/2)-s} (n-s)!}{s! \left(\frac{n+|m|}{2} - s\right)! \left(\frac{n-|m|}{2} - s\right)!}.   (33)
For a digital image, the Zernike moment of order n and
† An incorrect version of I_2 is given in Bamieh and de
Figueiredo's paper.(48)
‡ There is an error in equation (33) as given in reference
(49): there, the summation is taken from s = 0 to n - |m|;
however, it must be taken from s = 0 to (n - |m|)/2 to avoid
((n - |m|)/2 - s) becoming negative.
repetition m is given by:

A_{nm} = \frac{n+1}{\pi} \sum_x \sum_y f(x, y) [V_{nm}(x, y)]^*,   (34)

where x^2 + y^2 \le 1 and the symbol * denotes the complex
conjugate operator. Note that the image coordinates
must be mapped to the range of the unit circle,
x^2 + y^2 \le 1. The part of the original image inside the
unit circle can be reconstructed with an arbitrary
precision using:

\hat{f}(x, y) = \lim_{N \to \infty} \sum_{n=0}^{N} \sum_m A_{nm} V_{nm}(x, y),   (35)

where the second sum is taken over all |m| \le n, such
that n - |m| is even.
Fig. 9. Images derived from Zernike moments. Rows 1-2: input image of digit "4" and contributions from
the Zernike moments of order 1-13. The images are histogram equalized to highlight the details. Rows 3-4:
input image of digit "4" and images reconstructed from the Zernike moments of orders up to 1-13,
respectively.
Fig. 10. Images derived from Zernike moments. Rows 1-2: input image of digit "5" and contributions from
the Zernike moments of order 1-13. The images are histogram equalized to highlight the details. Rows 3-4:
input image of digit "5" and images reconstructed from the Zernike moments of orders up to 1-13,
respectively.
The magnitudes |A_{nm}| are rotation invariant. To
show the contribution of the Zernike moments of order
n, we have computed:

|I_n(x, y)| = \left| \sum_m A_{nm} V_{nm}(x, y) \right|,   (36)

where x^2 + y^2 \le 1, |m| \le n and n - |m| is even.
The images |I_n(x, y)|, n = 1, ..., 13, for the characters
"4" and "5" (Figs 9 and 10) indicate that the extracted
features are very different for the third- and higher-order
moments. Orders one and two seem to represent
orientation, width and height. However, reconstructions
of the same digits (Figs 9 and 10) using equation
(35), N = 1, ..., 13, indicate that moments of orders up
to 8-11 are needed to achieve a reasonable appearance.
Translation- and scale-invariance can be obtained
by shifting and scaling the image prior to the computation
of the Zernike moments.(31) The first-order
regular moments can be used to find the image center
and the zeroth-order central moment gives a size
estimate.
Belkasim et al.(43,44) use the following additional
features:

B_{n,n-2} = |A_{n-2,1}| |A_{n,1}| \cos(\phi_{n-2,1} - \phi_{n,1}),   (37)

B_{n,1+L} = |A_{n,1}| |A_{n,L}|^p \cos(p \phi_{n,L} - \phi_{n,1}),   (38)

where L = 3, 5, ..., n, p = 1/L and \phi_{nm} is the phase angle
component of A_{nm} so that:

A_{nm} = |A_{nm}| \cos \phi_{nm} + j |A_{nm}| \sin \phi_{nm}.   (39)
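A direct, unoptimized implementation of equations (32)-(34) might look as follows (our own sketch; the mapping of pixel coordinates into the unit circle is one of several possible choices). The rotation-invariant feature is then abs(A_nm):

import numpy as np
from math import factorial

def zernike_moment(f, n, m):
    """A_nm of equation (34) for a square gray-scale image f.
    Requires n >= 0, |m| <= n and n - |m| even."""
    h, w = f.shape
    y, x = np.mgrid[:h, :w]
    x = (2.0 * x - (w - 1)) / (w - 1)           # map columns to [-1, 1]
    y = (2.0 * y - (h - 1)) / (h - 1)           # map rows to [-1, 1]
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    R = np.zeros_like(rho)
    for s in range((n - abs(m)) // 2 + 1):      # radial polynomial, equation (33)
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s) * factorial((n + abs(m)) // 2 - s)
                * factorial((n - abs(m)) // 2 - s)))
        R += c * rho ** (n - 2 * s)
    V = R * np.exp(1j * m * theta)              # equation (32)
    inside = rho <= 1.0                         # only pixels inside the unit circle
    return (n + 1) / np.pi * np.sum(f[inside] * np.conj(V[inside]))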
3. FEATURES EXTRACTED FROM BINARY IMAGES
A binary raster image is obtained by a global or
locally adaptive binarization(13) of the gray-scale input
image. In many cases, the segmentation of characters is
carried out simply by isolating connected components.
However, for difficult images, some characters may
touch or overlap each other or other print objects.
Another problem occurs when characters are fragmented
into two or more connected components. This
problem may be alleviated somewhat by choosing
a better locally adaptive binarization method, but
Trier and Taxt(13) have shown that even the best
locally adaptive binarization method may still not
result in perfectly isolated characters.

Methods for segmenting touching characters are
given by Westall and Narasimha,(50) Fujisawa et al.(51)
and in surveys.(5,6) However, these methods assume
that the characters appear in the same text string and
have known orientation. In hydrographic maps
(Fig. 1), for example, some characters touch or overlap
lines, or touch characters from another text line. Trier
et al.(52) have developed a method based on gray-scale
topographic analysis(53,54) which integrates binarization
and segmentation. This method gives a better
performance, since information gained in the topographic
analysis step is used in segmenting the binary
image. The segmentation step also handles rotated
characters and touching characters from different text
strings.
The binary raster representation of a character is
a simplification of the gray-scale representation. The
image function Z(x, y) now takes on two values (say,
0 and 1) instead of the, say, 256 gray-level values. This
means that all the methods developed for the gray-scale
representation are applicable to the solid binary
raster representation as well. Therefore, we will not
repeat the full description of each method, but only
point out the simplification in the computations involved
for each feature extraction method. Generally,
invariance to illumination is no longer relevant, but
the other invariances are.

A solid binary character may be converted to other
representations, such as the outer contour of the character,
the contour profiles, or the character skeleton,
and features may be extracted from each of these
representations as well. For the purpose of designing
OCR systems, the goal of these conversions is to
preserve the relevant information about the character
shape and discard some of the unnecessary information.

Here, we only present the modifications of the
methods previously described for the gray-scale representation.
No changes are needed for unitary image
transforms and Zernike moments, except that gray-level
invariance is irrelevant.
3.1. Template matching
In binary template matching, several similarity
measures other than mean-square distance and correlation
have been suggested.(55) To detect matches, let
n_{ij} be the number of pixel positions where the template
pixel x is i and the image pixel y is j, with i, j \in {0, 1}:

n_{ij} = \sum_{m=1}^{n} \delta_m(i, j),   (40)

where

\delta_m(i, j) = 1 if (x_m = i) \wedge (y_m = j), 0 otherwise,   (41)

i, j \in {0, 1}, and y_m and x_m are the mth pixels of the
binary images Y and X which are being compared.
Tubbs evaluated eight distances and found the Jaccard
distance d_J and the Yule distance d_Y to be the best:

d_J = \frac{n_{11}}{n_{11} + n_{10} + n_{01}},   (42)

d_Y = \frac{n_{11} n_{00} - n_{10} n_{01}}{n_{11} n_{00} + n_{10} n_{01}}.   (43)
However, the lack of robustness regarding shape
variations mentioned in Section 2 for the gray-scale
case still applies. Tubbs(55) tries to overcome some of
these shortcomings by introducing weights for the
different pixel positions m. Equation (40) is replaced
by:

n_{ij} = \sum_{m=1}^{n} p_m(k|i) \delta_m(i, j),   (44)

where p_m(k|i) is the probability that the input image
Y matches template X_k, given that pixel number m in
the template X_k is i. p_m(k|i) is approximated as the
number of templates (including template X_k) having
the same pixel value at location m as template X_k,
divided by the total number of templates.
However, we suspect that the extra flexibility obtained
by introducing p_m(k|i) is not enough to cope with
variabilities in character shapes that may occur in
hand-printed characters and multi-font machine-printed
characters. A more promising approach is
taken by Gader et al.(24) who use a set of templates for
each character class and a procedure for selecting
templates based on a training set.
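For reference, a small sketch (our own) of the unweighted match counts of equations (40)-(41) and the two measures of equations (42)-(43); nonzero denominators are assumed, and for both measures larger values indicate better agreement:

import numpy as np

def jaccard_yule(X, Y):
    """Jaccard and Yule measures between binary images X (template) and
    Y (input), both 0/1 arrays of equal shape."""
    n11 = np.sum((X == 1) & (Y == 1))
    n10 = np.sum((X == 1) & (Y == 0))
    n01 = np.sum((X == 0) & (Y == 1))
    n00 = np.sum((X == 0) & (Y == 0))
    d_J = n11 / (n11 + n10 + n01)                            # equation (42)
    d_Y = (n11 * n00 - n10 * n01) / (n11 * n00 + n10 * n01)  # equation (43)
    return d_J, d_Y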
3.2. Unitary image transforms
The NIST form-based hand-print recognition system(56)
uses the Karhunen-Loeve transform to extract
features from the binary raster representation. Its performance
is claimed to be good and this OCR system is
available in the public domain.
3.3. Projection histograms
Projection histograms were introduced in 1956 in
a hardware OCR system by Glauberman.(57) Today,
this technique is mostly used for segmenting characters,
words and text lines, or to detect if an input image
of a scanned text page is rotated.(58) For a horizontal
projection, y(x_i) is the number of pixels with x = x_i
(Fig. 11). The features can be made scale independent
by using a fixed number of bins on each axis (by
merging neighboring bins) and dividing by the total
number of print pixels in the character image. However,
the projection histograms are very sensitive to
rotation and, to some degree, variability in writing
style. Also, important information about the character
shape seems to be lost.
Fig. 11. Horizontal and vertical projection histograms.
Fig. 12. Two of the characters in Bokser's study(40) that are
easily confused when thinned (e.g. with Zhang and Suen's
method(28)). (a) "S", (b) "8", (c) thinned "S" and (d) thinned "8".
The vertical projection x(y) is slant invariant, but
the horizontal projection is not. When measuring the
dissimilarity between two histograms, it is tempting to
use:

d = \sum_{i=1}^{n} |y_1(x_i) - y_2(x_i)|,   (45)

where n is the number of bins and y_1 and y_2 are the two
histograms to be compared. However, it is more meaningful
to compare the cumulative histograms Y(x_k), the
sum of the k first bins:

Y(x_k) = \sum_{i=1}^{k} y(x_i),   (46)

using the dissimilarity measure:

D = \sum_{i=1}^{n} |Y_1(x_i) - Y_2(x_i)|,   (47)

where Y_1 and Y_2 denote cumulative histograms. The
new dissimilarity measure D is not as sensitive as d to
a slight misalignment of dominant peaks in the original
histograms.
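The following sketch (our own illustration) extracts rebinned, size-normalized projection histograms from a binary character image and compares two histograms with the cumulative dissimilarity D of equation (47):

import numpy as np

def projection_histograms(B, n_bins):
    """Horizontal and vertical projections of binary image B, rebinned to
    n_bins bins and divided by the total number of print pixels."""
    def rebin(h):
        return np.array([chunk.sum() for chunk in np.array_split(h, n_bins)])
    total = B.sum()
    y_of_x = rebin(B.sum(axis=0)) / total   # counts per column, then rebinned
    x_of_y = rebin(B.sum(axis=1)) / total   # counts per row, then rebinned
    return y_of_x, x_of_y

def cumulative_dissimilarity(h1, h2):
    """D of equation (47): L1 distance between cumulative histograms."""
    return np.abs(np.cumsum(h1) - np.cumsum(h2)).sum()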
3.4. Zoning
Bokser(40) describes the commercial OCR system
Calera that uses zoning on binary characters. The
system was designed to recognize machine-printed
characters of almost any nondecorative font, possibly
severely degraded by, for example, several generations
of photocopying. Both contour extraction and thinning
proved to be unreliable for self-touching characters
(Fig. 12). The zoning method was used to compute
the percentage of black pixels in each zone. Additional
features were needed to improve the performance, but
the details were not presented by Bokser.(40) Unfortunately,
not much explicit information is available
about the commercial systems.
3.5. Geometric moment invariants
A binary image can be considered a special case of
a gray-level image with Z(x, y) = 1 for print pixels and
Z(x, y) = 0 for background pixels. By summing over
the N print pixels only, equations (9) and (10) can be
rewritten as:

m_{pq} = \sum_{i=1}^{N} x_i^p y_i^q,   (48)

\mu_{pq} = \sum_{i=1}^{N} (x_i - \bar{x})^p (y_i - \bar{y})^q,   (49)

where

\bar{x} = \frac{m_{10}}{m_{00}},  \bar{y} = \frac{m_{01}}{m_{00}}.   (50)

Then, equations (12)-(24) can be used as before. However,
the contrast invariants [equation (31)] are of no
interest in the binary case.

For characters that are not too elongated, a fast
algorithm for computing the moments based on the
character contour exists,(59) giving the same values as
equation (49).
3.6. Evaluation studies
Belkasim et al.(43,44) compared several moment invariants
applied to solid binary characters, including
regular, Hu, Bamieh, Zernike, Teague-Zernike and
pseudo-Zernike moment invariants, using a k-nearest
neighbor (kNN) classifier. They concluded that normalized
Zernike moment invariants(43,44) gave the
best performance for character recognition in terms of
recognition accuracy. The normalization compensates
for the variances of the features, and since the kNN
classifier uses the Euclidean distance to measure the
dissimilarity of the input feature vectors and the training
samples, this will improve the performance. However,
by using a statistical classifier which explicitly
accounts for the variances, for example, a quadratic
Bayesian classifier using the Mahalanobis distance, no
such normalization is needed.
4. FEATURES EXTRACTED FROM THE BINARY
CONTOUR
The closed outer contour curve of a character is
a closed piecewise linear curve that passes through the
centers of all the pixels which are four-connected to the
outside background and no other pixels. Following the
curve, the pixels are visited in, say, counter-clockwise
order and the curve may visit an edge pixel twice at
locations where the object is one pixel wide. Each line
segment is a straight line between the pixel centers of
two eight-connected neighbors.

By approximating the contour curve by a parametric
expression, the coefficients of the approximation
can be used as features. By following the closed contour
successively, a periodic function results. Periodic
functions are well suited for Fourier series expansion,
and this is the foundation for the Fourier-based
methods discussed below.

Fig. 13. Digit "5" with left profile x_L(y) and right profile x_R(y).
For each y value, the left (right) profile value is the leftmost
(rightmost) x value on the character contour.
4.1. Contour profiles
The motivation for using contour profiles is that
each half of the contour (Fig. 13) can be approximated
by a discrete function of one of the spatial variables,
x or y. Then, features can be extracted from discrete
functions. We may use vertical or horizontal profiles
and they can be either outer profiles or inner profiles.

To construct vertical profiles, first locate the uppermost
and lowermost pixels on the contour. The contour
is split at these two points. To obtain the outer
profiles, for each y value, select the outermost x value
on each contour half (Fig. 13). To obtain the inner
profiles, for each y value, the innermost x values are
selected. Horizontal profiles can be extracted in a similar
fashion, starting by dividing the contour in upper
and lower halves.
The profiles are themselves dependent on rotation
(e.g. try to rotate the "5" in Fig. 13 by, say, 45° before
computing the profiles). Therefore, all features derived
from the profiles will also be dependent on rotation.

Kimura and Shridhar(27) extracted features from the
outer vertical profiles only (Fig. 13). The profiles themselves
can be used as features, as well as the first
differences of the profiles (e.g. x'_L(y) = x_L(y+1) - x_L(y));
the width w(y) = x_R(y) - x_L(y); the ratio of the vertical
height of the character, n, to the maximum of the width
function, max_y w(y); the locations of maxima and minima in
the profiles; and the locations of peaks in the first differences
(which indicate discontinuities).
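A sketch of the vertical profile features described above (our own; it scans image rows directly, which yields the same outer profiles as following the contour, and skips rows without print pixels):

import numpy as np

def vertical_profiles(B):
    """Left and right outer profiles x_L(y), x_R(y) of binary image B;
    rows without print pixels are marked with -1."""
    h = B.shape[0]
    xL, xR = np.full(h, -1), np.full(h, -1)
    for yy in range(h):
        cols = np.flatnonzero(B[yy])
        if cols.size:
            xL[yy], xR[yy] = cols[0], cols[-1]
    return xL, xR

def profile_features(B):
    """Width function w(y) = x_R(y) - x_L(y) and first differences of
    the left profile, as used by Kimura and Shridhar."""
    xL, xR = vertical_profiles(B)
    valid = xL >= 0
    return xR[valid] - xL[valid], np.diff(xL[valid])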
4.2. Zoning
Kimura and Shridhar(27) used zoning on contour
curves. In each zone, the contour line segments between
neighboring pixels were grouped by orientation:
horizontal (0°), vertical (90°) and the two diagonal
orientations (45°, 135°). The number of line segments of
each orientation was counted (Fig. 14).

Takahashi(60) also used orientation histograms
from zones, but used vertical, horizontal and diagonal
slices as zones (Fig. 15). The orientations were extracted
from inner contours (if any) in addition to the outer
contour when making the histograms.

Further, Takahashi identified high curvature points
along both outer and inner contours. For each of these
Fig. 14. Zoning of contour curve. (a) 4 x 4 grid superimposed
on character; (b) close-up of the upper right corner zone; (c)
histogram of orientations (0°, 45°, 90°, 135°) for this zone.
Fig. 15. Slice zones used by Takahashi.(60)
Fig. 16. Zoning with fuzzy borders. Pixel P_1 has a membership
value of 0.25 in each of the four zones A, B, D and E. P_2
has a 0.75 membership of E and a 0.25 membership of F.
points, the curvature value, the contour tangent and
the point's zonal position were extracted. This time
a regular grid was used as zones.

Cao et al.(25) observed that when the contour curve
was close to zone borders, small variations in the
contour curve could lead to large variations in the
extracted features. They tried to compensate for this by
using fuzzy borders. Points near the zone borders are
given fuzzy membership values to two or four zones,
and the fuzzy membership values sum to one (Fig. 16).
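The sketch below (our own, with crisp zone borders) illustrates the idea of counting contour segment orientations per zone; Cao et al.'s fuzzy borders would replace the hard zone assignment with membership-weighted increments across the neighboring zones:

import numpy as np

def orientation_histograms(contour, shape, n, m):
    """For each cell of an n x m grid over an image of the given shape,
    count contour line segments by orientation (0, 45, 90, 135 degrees).
    contour: ordered list of (row, col) pixel centers of eight-connected
    neighbors; the first point is assumed repeated at the end."""
    hist = np.zeros((n, m, 4), dtype=int)
    cell_h, cell_w = shape[0] / n, shape[1] / m
    for (r0, c0), (r1, c1) in zip(contour, contour[1:]):
        dr, dc = r1 - r0, c1 - c0
        if dr == 0:
            o = 0                        # horizontal segment
        elif dc == 0:
            o = 2                        # vertical segment
        elif dr * dc < 0:
            o = 1                        # 45-degree diagonal
        else:
            o = 3                        # 135-degree diagonal
        zr = min(int((r0 + r1) / 2 / cell_h), n - 1)   # zone of segment midpoint
        zc = min(int((c0 + c1) / 2 / cell_w), m - 1)
        hist[zr, zc, o] += 1
    return hist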
4.3. Spline curve approximation
Seki t a e t al. ~6 i~ i dent i fy hi gh- cur vat ur e poi nt s, cal l ed
br eakpoi nt s, on t he out er char act er c ont our a nd ap-
pr oxi mat e t he curve bet ween t wo br eakpoi nt s wi t h
a spl i ne funct i on. Then, bot h t he br eakpoi nt s a nd t he
spl i ne cur ve par amet er s are used as features.
Taxt e t al. t62) appr oxi mat e t he out er c ont our curve
wi t h a spl i ne curve, whi ch is t hen smoot hed. The
s moot hed spl i ne curve is di vi ded i nt o M part s of equal
curve l engt h. For each part , t he average cur vat ur e is
comput ed. I n addi t i on, t he di st ances from t he ar i t hme-
tic me a n of t he c ont our curve poi nt s t o N equal l y
spaced poi nt s on t he c ont our is measur ed. By scal i ng
t he char act er ' s spl i ne cur ve appr oxi mat i on to a st an-
dar d size before t he feat ures are measur ed, t he features
will become size i nvar i ant . The features are al r eady
t r ans l at i on i nvar i ant by nat ur e, but are dependent on
r ot at i on.
4.4. Elliptic Fourier descriptors
In Kuhl and Giardina's approach,(30) the closed
contour, (x(t), y(t)), t = 1, ..., m, is approximated as:

\hat{x}(t) = A_0 + \sum_{n=1}^{N} \left[ a_n \cos \frac{2\pi n t}{T} + b_n \sin \frac{2\pi n t}{T} \right]   (51)

\hat{y}(t) = C_0 + \sum_{n=1}^{N} \left[ c_n \cos \frac{2\pi n t}{T} + d_n \sin \frac{2\pi n t}{T} \right],   (52)

where T is the total contour length and with \hat{x}(t) = x(t)
and \hat{y}(t) = y(t) in the limit when N \to \infty. The coefficients
are:

A_0 = \frac{1}{T} \int_0^T x(t) dt   (53)

C_0 = \frac{1}{T} \int_0^T y(t) dt   (54)

a_n = \frac{2}{T} \int_0^T x(t) \cos \frac{2\pi n t}{T} dt   (55)

b_n = \frac{2}{T} \int_0^T x(t) \sin \frac{2\pi n t}{T} dt   (56)

c_n = \frac{2}{T} \int_0^T y(t) \cos \frac{2\pi n t}{T} dt   (57)

d_n = \frac{2}{T} \int_0^T y(t) \sin \frac{2\pi n t}{T} dt.   (58)
The functions x(t) and y(t) are piecewise linear and the
coefficients can, therefore, be obtained by summation
instead of integration. It can be shown(30) that the
coefficients a_n, b_n, c_n and d_n, which are the extracted
features, can be expressed as:

a_n = \frac{T}{2 n^2 \pi^2} \sum_{i=1}^{m} \frac{\Delta x_i}{\Delta t_i} [\cos \phi_i - \cos \phi_{i-1}]   (59)

b_n = \frac{T}{2 n^2 \pi^2} \sum_{i=1}^{m} \frac{\Delta x_i}{\Delta t_i} [\sin \phi_i - \sin \phi_{i-1}]   (60)

c_n = \frac{T}{2 n^2 \pi^2} \sum_{i=1}^{m} \frac{\Delta y_i}{\Delta t_i} [\cos \phi_i - \cos \phi_{i-1}]   (61)
d_n = \frac{T}{2 n^2 \pi^2} \sum_{i=1}^{m} \frac{\Delta y_i}{\Delta t_i} [\sin \phi_i - \sin \phi_{i-1}],   (62)

where \phi_i = 2\pi n t_i / T,

\Delta x_i = x_i - x_{i-1},  \Delta y_i = y_i - y_{i-1},   (63)

\Delta t_i = \sqrt{\Delta x_i^2 + \Delta y_i^2},  t_i = \sum_{j=1}^{i} \Delta t_j,   (64)

T = t_m = \sum_{j=1}^{m} \Delta t_j,   (65)
and m is the number of pixels along the boundary. The
starting point (x_1, y_1) can be arbitrarily chosen and it is
clear from equations (55)-(58) that the coefficients are
dependent on this choice. To obtain features that are
independent of the particular starting point, we calculate
the phase shift from the first major axis as:

\theta_1 = \frac{1}{2} \tan^{-1} \frac{2 (a_1 b_1 + c_1 d_1)}{a_1^2 + c_1^2 - b_1^2 - d_1^2}.   (66)

Then, the coefficients can be rotated to achieve a zero
phase shift:

\begin{bmatrix} a_n^* & b_n^* \\ c_n^* & d_n^* \end{bmatrix} = \begin{bmatrix} a_n & b_n \\ c_n & d_n \end{bmatrix} \begin{bmatrix} \cos n\theta_1 & -\sin n\theta_1 \\ \sin n\theta_1 & \cos n\theta_1 \end{bmatrix}.   (67)
To obtain rotation invariant descriptors, the rotation,
\psi_1, of the semi-major axis [Fig. 17(a)] can be found by:

\psi_1 = \tan^{-1} \frac{c_1^*}{a_1^*},   (68)

and the descriptors can then be rotated by -\psi_1 [Fig.
17(b)], so that the semi-major axis is parallel with the
x-axis:

\begin{bmatrix} a_n^{**} & b_n^{**} \\ c_n^{**} & d_n^{**} \end{bmatrix} = \begin{bmatrix} \cos \psi_1 & \sin \psi_1 \\ -\sin \psi_1 & \cos \psi_1 \end{bmatrix} \begin{bmatrix} a_n^* & b_n^* \\ c_n^* & d_n^* \end{bmatrix}.   (69)
This rotation gives b_1^{**} = c_1^{**} = 0.0 [Fig. 17(b)], so
these coefficients should not be used as features. Further,
both these rotations are ambiguous, as \theta and
\theta + \pi give the same axes, as do \psi and \psi + \pi.

To obtain size-invariant features, the coefficients
can be divided by the magnitude, E, of the semi-major
axis, given by:

E = \sqrt{a_1^{*2} + c_1^{*2}} = a_1^{**}.   (70)

Then a_1^{**} should not be used as a feature as well. In any
case, the low-order coefficients that are available contain
the most information (about the character shape)
and should always be used.
In Figs 18 and 19, the characters "4" and "5" of
Fig. 6 have been reconstructed using the coefficients of
order up to n for different values of n. These figures
suggest that using only the descriptors of the first three
orders (12 features in total) might not be enough to
obtain a classifier with sufficient discrimination power.
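A direct implementation of equations (59)-(65) (our own sketch; the contour is assumed to be a closed sequence of distinct pixel centers, so every Δt_i > 0):

import numpy as np

def elliptic_fourier_coeffs(contour, order):
    """Coefficients a_n, b_n, c_n, d_n of equations (59)-(62) for a closed
    contour given as an array of (x, y) pixel centers."""
    P = np.asarray(contour, dtype=float)
    d = np.diff(np.vstack([P, P[:1]]), axis=0)   # close the curve
    dx, dy = d[:, 0], d[:, 1]
    dt = np.hypot(dx, dy)                        # Delta t_i, equation (64)
    t = np.concatenate([[0.0], np.cumsum(dt)])
    T = t[-1]                                    # total length, equation (65)
    coeffs = np.zeros((order, 4))
    for n in range(1, order + 1):
        phi = 2 * np.pi * n * t / T              # phi_i = 2 pi n t_i / T
        k = T / (2 * n ** 2 * np.pi ** 2)
        coeffs[n - 1] = [k * np.sum(dx / dt * np.diff(np.cos(phi))),  # a_n
                         k * np.sum(dx / dt * np.diff(np.sin(phi))),  # b_n
                         k * np.sum(dy / dt * np.diff(np.cos(phi))),  # c_n
                         k * np.sum(dy / dt * np.diff(np.sin(phi)))]  # d_n
    return coeffs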
Lin and Hwang(63) derived rotation-invariant features
based on Kuhl and Giardina's(30) features:

I_k = a_k^2 + b_k^2 + c_k^2 + d_k^2   (71)

J_k = a_k d_k - b_k c_k   (72)

K_{1j} = (a_1^2 + b_1^2)(a_j^2 + b_j^2) + (c_1^2 + d_1^2)(c_j^2 + d_j^2)
      + 2 (a_1 c_1 + b_1 d_1)(a_j c_j + b_j d_j).   (73)

As above, a scaling factor may be used to obtain
size-invariant features.
4.5. Other Fourier descriptors
Pr i or to Kuhl and Gi a r di na ~3~ a nd Li n and
Hwang, (63) ot her Four i er descri pt ors were devel oped
by Za h n and Roski es ~64~ a nd Gr a nl und. ~65~
I n Za hn and Roski es' met hod, ~6~) the angul ar differ-
ence A(p bet ween t wo successive l i ne segment s on t he
Fig. 17. The rotation of the first-order ellipse used in elliptic Fourier descriptors in order to obtain
rotation-independent descriptors a_n^{**} and d_n^{**}. (a) Before rotation; (b) after rotation (b_1^{**} = c_1^{**} = 0.0).
Fig. 18. Character "4" reconstructed by elliptic Fourier descriptors of orders up to 1, 2, ..., 10; 15, 20, 30, 40,
50 and 100, respectively.
Fig. 19. Character "5" reconstructed by elliptic Fourier descriptors of orders up to 1, 2, ..., 10; 15, 20, 30, 40,
50 and 100, respectively.
contour is measured at every pixel center along the
contour. The contour is followed clockwise. Then the
following descriptors can be extracted:

a_n = -\frac{1}{n\pi} \sum_{k=1}^{m} \Delta\phi_k \sin \frac{2\pi n t_k}{T}   (74)

b_n = \frac{1}{n\pi} \sum_{k=1}^{m} \Delta\phi_k \cos \frac{2\pi n t_k}{T},   (75)

where T is the length of the boundary curve, consisting
of m line segments, t_k is the accumulated length of the
boundary from the starting point p_1 to the kth point p_k
and \Delta\phi_k is the angle between the vectors [p_{k-1}, p_k] and
[p_k, p_{k+1}]. a_n and b_n are size- and translation-invariant.
Rotation invariance can be obtained by transforming
to polar coordinates. Then the amplitudes:

A_n = \sqrt{a_n^2 + b_n^2}   (76)

are independent of rotation and mirroring, while the
phase angles \alpha_n = \tan^{-1}(a_n / b_n) are not. However, mirroring
can be detected via the \alpha_js. It can be shown that:

F_{kj} = j^* \alpha_k - k^* \alpha_j   (77)

is independent of rotation, but dependent on mirroring.
Here, j^* = j / \gcd(j, k), k^* = k / \gcd(j, k) and
\gcd(j, k) is the greatest common divisor of j and k.
Zahn and Roskies warn that \alpha_k becomes unreliable
as A_k \to 0, and is totally undefined when A_k = 0. Therefore,
the F_{kj} terms may be unreliable.
Granlund(65) uses a complex number z(t) = x(t) + j y(t)
to denote the points on the contour. Then
the contour can be expressed as a Fourier series:

z(t) = \sum_{n=-\infty}^{\infty} a_n e^{j 2\pi n t / T},   (78)

where

a_n = \frac{1}{T} \int_0^T z(t) e^{-j 2\pi n t / T} dt   (79)

are the complex coefficients, a_0 is the center of gravity
and the other coefficients a_n, n \ne 0, are independent of
translation. Again, T is the total contour length. The
derived features

b_n = \frac{a_{1+n} a_{1-n}}{a_1^2},   (80)

d_{mn} = \frac{a_{1+m}^{n/k} a_{1-n}^{m/k}}{a_1^{(m+n)/k}}   (81)

are independent of scale and rotation. Here, n \ne 1 and
k = \gcd(m, n) is the greatest common divisor of m and n.
Furthermore:
b_n^* = \frac{a_{1+n} |a_1|}{a_1^2},   (82)

d_m^* = \frac{a_{1+m} |a_1|^m}{a_1^{m+1}}   (83)

are scale-independent, but depend on rotation, so they
can be useful when the orientation of the characters is
known.
Persoon and Fu(3) pointed out that:

a_n = a_{-n} e^{-j 2\pi n c / T}   (84)

for some c. Therefore, the set of a_n s is redundant.
4.6. Evaluation studies
Taxt et al.(62) evaluated Zahn and Roskies' Fourier
descriptors,(64) Kuhl and Giardina's elliptic Fourier
descriptors,(30) Lin and Hwang's elliptic Fourier descriptors(63)
and their own cubic spline approximation.(62)
For characters with known rotation, the
best performance was reported using Kuhl and Giardina's
method.

Persoon and Fu(3) observed that Zahn and Roskies'
descriptors (a_n, b_n) converge slowly to zero as n \to \infty
relative to Granlund's(65) descriptors (a_n) in the case of
piecewise linear contour curves. This suggests that
Zahn and Roskies' descriptors are not so well suited
for the character contours obtained from binary raster
objects nor character skeletons.
5. FEATURES EXTRACTED FROM THE VECTOR
REPRESENTATION
Character skeletons (Fig. 5) are obtained by thinning the binary raster representation of the characters. An overwhelming number of thinning algorithms exist and some recent evaluation studies give clues to their merits and disadvantages.(15,16,66) The task of choosing the right one often involves a compromise; one wants one-pixel wide, eight-connected skeletons without spurious branches or displaced junctions, some kind of robustness to rotation and noise, and at the same time a fast and easy-to-implement algorithm. Kwok's thinning method(67) appears to be a good candidate, although its implementation is complicated.
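In practice one rarely implements thinning from scratch; as a minimal sketch, assuming the scikit-image library is acceptable, its skeletonize function (a parallel thinning in the spirit of Zhang and Suen(28)) produces one-pixel-wide skeletons:

```python
import numpy as np
from skimage.morphology import skeletonize

def character_skeleton(binary_char):
    """Thin a binary character image (nonzero = ink) to a one-pixel-wide
    skeleton. Like most thinning algorithms, this can still leave
    spurious branches or displaced junctions on noisy input."""
    return skeletonize(np.asarray(binary_char, dtype=bool))
```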
A character graph can be derived from the skeleton by approximating it with a number of straight line segments and junction points. Arcs may be used for curved parts of the skeleton.
Wang and Pavlidis have recently proposed a method for obtaining character graphs directly from the gray-level image.(53,68) They view the gray-level image as a 3-D surface, with the gray levels mapped along the z coordinate, using z = 0 for white (background) and, for example, z = 255 for black. By using topographic analysis, ridge lines and saddle points are identified, which are then used to obtain character graphs consisting of straight line segments, arcs and junction points. The saddle points are analysed to determine if they are points of unintentionally touching characters or unintentionally broken characters. This method is useful when even the best available binarization methods are unable to preserve the character shape in the binarized image.
5.1. Template matching

Fig. 20. The deformable template matching approach of Wakahara.(72) Legend: "." = original template pixels not in transformed template; "." = transformed template; "O" = input pattern; "0" = common pixels of transformed template and input pattern. (a) Template and input pattern of a Chinese character; (b)-(d) after 1, 5 and 10 iterations, respectively, of local affine transforms on a copy of the template.

Template matching in its pure form is not well suited for character skeletons, since the chances are small
that the pixels of the branches in the input skeleton will exactly coincide with the pixels of the correct template skeleton. Lee and Park(69) reviewed several nonlinear shape normalization methods used to obtain uniform line or stroke spacing both vertically and horizontally. The idea is that such methods will compensate for shape distortions. Such normalizations are claimed to improve the performance of template matching,(70) but may also be used as a preprocessing step for zoning.
5.2. Deformable templates

Deformable templates have been used by Burr(71) and Wakahara(72,73) for recognition of character skeletons. In Wakahara's approach, each template is deformed in a number of small steps, called local affine transforms (LAT), to match the candidate input pattern (Fig. 20). The number and types of transformations before a match is obtained can be used as a dissimilarity measure between each template and the input pattern.
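The listing below is emphatically not Wakahara's LAT algorithm, only a toy sketch of the general deform-and-count idea under our own assumptions (a handful of global scaling and shear trials, pixel overlap as the matching score): the template is perturbed by small affine steps as long as each step increases its overlap with the input, and the number of accepted steps plus the residual mismatch serve as dissimilarity cues.

```python
import numpy as np
from scipy.ndimage import affine_transform

def deform_dissimilarity(template, pattern, max_steps=10, eps=0.05):
    """Greedily deform `template` by small affine steps that increase
    its pixel overlap with `pattern`."""
    t = template.astype(float)
    p = pattern.astype(float)
    center = (np.array(t.shape, dtype=float) - 1.0) / 2.0
    trials = [np.array([[1.0 + d, 0.0], [0.0, 1.0]]) for d in (-eps, eps)] \
           + [np.array([[1.0, 0.0], [0.0, 1.0 + d]]) for d in (-eps, eps)] \
           + [np.array([[1.0, d], [0.0, 1.0]]) for d in (-eps, eps)]
    steps = 0
    for _ in range(max_steps):
        best, best_score = None, (t * p).sum()
        for M in trials:
            # keep the image center fixed while scaling/shearing
            cand = affine_transform(t, M, offset=center - M @ center, order=1)
            score = (cand * p).sum()
            if score > best_score:
                best, best_score = cand, score
        if best is None:
            break                    # no small deformation improves the match
        t, steps = best, steps + 1
    return steps, float(((t - p) ** 2).sum())
```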
5.3. Graph description

Pavlidis(74) extracts approximate strokes from skeletons. Kahan et al.(75) augmented them with additional features to obtain reasonable recognition performance.
For Chinese character recognition, several authors extract graph descriptions or representations from skeletons as features.(76-78) Lu et al.(76) derive hierarchical attributed graphs to deal with variations in stroke lengths and connectedness due to the variable writing style of different writers. Cheng et al.(79) use the Hough transform(80) on single-character images to extract stroke lines from skeletons of Chinese characters.

Fig. 21. Thinned letters "c" and "d". Vertical and horizontal axes are placed at the center of gravity. "c" and "d" both have one semicircle in the West direction, but none in the other directions. "c" has one horizontal crossing and two vertical crossings. (Adopted from Kundu et al.(82))
5.4. Discrete features

From thinned characters, the following features may be extracted:(81,82) the number of loops; the number of T-joints; the number of X-joints; the number of bend points; width-to-height ratio of the enclosing rectangle; presence of an isolated dot; total number of endpoints and number of endpoints in each of the four directions N, S, W and E; number of semi-circles in each of these four directions; and number of crossings with the vertical and horizontal axes, respectively, with the axes placed on the center of gravity. The last two features are explained in Fig. 21.
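The last two features are easily computed; a minimal sketch (our own, counting a maximal run of ink pixels along an axis as one crossing) follows:

```python
import numpy as np

def crossing_counts(skel):
    """Crossings of a skeleton with the horizontal and vertical axes
    through its center of gravity (cf. Fig. 21)."""
    ys, xs = np.nonzero(skel)
    cy, cx = int(round(ys.mean())), int(round(xs.mean()))

    def n_runs(line):
        line = np.asarray(line, dtype=bool)
        # number of maximal runs = 0->1 transitions, plus one if the
        # line starts on ink
        return int(line[0]) + int(np.count_nonzero(line[1:] & ~line[:-1]))

    return n_runs(skel[cy, :]), n_runs(skel[:, cx])
```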
One might use crossings with many superimposed lines as features and, in fact, this was done in earlier OCR systems.(1) However, these features alone do not lead to robust recognition systems; as the number of superimposed lines is increased, the resulting features are less robust to variations in fonts (for machine-printed characters) and variability in character shapes and writing styles (for handwritten characters).
5.5. Zoning

Fig. 22. Zoning of character skeletons.

Holbaek-Hanssen et al.(83) measured the length of the character graph in each zone (Fig. 22). These features can be made size-independent by dividing the graph length in each zone by the total length of the line segments in the graph. However, the features cannot be made rotation-independent. The presence or absence of junctions or endpoints in each zone can be used as additional features.
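A sketch of such zoning features, under the simplifying assumptions that the zones form a regular grid (here 3 × 2) and that the pixel count of the rasterized graph is an adequate substitute for its exact line length:

```python
import numpy as np

def zoning_features(skel, rows=3, cols=2):
    """Per-zone skeleton pixel counts, normalized by the total pixel
    count so the features become size-independent."""
    h, w = skel.shape
    total = max(int(skel.sum()), 1)
    feats = []
    for i in range(rows):
        for j in range(cols):
            zone = skel[i * h // rows:(i + 1) * h // rows,
                        j * w // cols:(j + 1) * w // cols]
            feats.append(zone.sum() / total)
    return np.array(feats)
```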
5.6. Fourier descriptors

The Fourier descriptor methods described in Sections 3.2.4-3.2.5 for character contours may also be used for character skeletons or character graphs, since the skeleton or graph can be traversed to form a (degenerated) closed curve. Taxt and Bjerde(84) studied Kuhl and Giardina's elliptic Fourier descriptors(30) and stressed that for character graphs with two line endings, no junctions and no loops, some of the descriptors will be zero, while for graphs with junctions or loops, all descriptors will be nonzero. For the size- and rotation-variant descriptors, Taxt and Bjerde stated that:

For straight lines:

$a_n^{*} = c_n^{*} = 0, \quad n = 2, 4, 6, \ldots$
$b_n^{*} = d_n^{*} = 0, \quad n = 1, 2, 3, \ldots$

For non-straight graphs with two line endings, no junctions and no loops:

$b_n^{*} = d_n^{*} = 0, \quad n = 1, 2, 3, \ldots$

For graphs with junctions or loops:

$a_n^{*} \neq 0, \; b_n^{*} \neq 0, \; c_n^{*} \neq 0, \; d_n^{*} \neq 0, \quad n = 1, 2, 3, \ldots$

The characteristics for rotation- and size-invariant features were also found.(84) Taxt and Bjerde observed that instances of the same character that happen to be different with respect to the above types will obtain very different feature vectors. The solution was to pre-classify the character graphs into one of the three types and then use a separate classifier for each type.
5.7. Evaluation studies

Holbaek-Hanssen et al.(83) compared the zoning method described in Section 5.5 with Zahn and Roskies' Fourier descriptor method on character graphs. For characters with known orientation, the zoning method was better, while Zahn and Roskies' Fourier descriptors were better on characters with unknown rotation.
6. NEURAL NETWORK CLASSIFIERS

Multi-layer feed-forward neural networks(85) have been used extensively in OCR, for example, by Le Cun et al.,(86) Takahashi(60) and Cao et al.(25) These networks may be viewed as a combined feature extractor and classifier. Le Cun et al. scale each input character to a 16 × 16 grid, which is then fed into the 256 input nodes of the neural network (Fig. 23). The network has 10 output nodes, one for each of the ten digit classes "0"-"9" that the network tries to recognize.

Fig. 23. The neural network classifier used by Le Cun et al.(86)
Three intermediate layers were used. Each node in a layer has connections from a number of nodes in the previous layer and, during the training phase, connection weights are learned. The output at a node is a function (e.g. sigmoid) of the weighted sum of the connected nodes at the previous layer. One can think of a feed-forward neural network as constructing decision boundaries in a feature space; as the number of layers and nodes increases, the flexibility of the classifier increases by allowing more and more complex decision boundaries. However, it has been shown(85,86) that this flexibility must be restricted to obtain good recognition performance. This is parallel to the curse of dimensionality effect observed in statistical classifiers, mentioned earlier.
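As a minimal sketch of the computation just described (a generic fully connected network; Le Cun et al.'s actual network additionally uses local connections and shared weights, and the hidden-layer size of 64 here is an arbitrary choice):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class FeedForwardNet:
    """Forward pass of a small fully connected feed-forward network:
    256 inputs (a flattened 16 x 16 character), one hidden layer and
    10 outputs, one per digit class. In a real system the weights
    would be learned by backpropagation; only inference is shown."""
    def __init__(self, sizes=(256, 64, 10), seed=0):
        rng = np.random.default_rng(seed)
        self.weights = [rng.normal(0.0, 0.1, size=(m, n))
                        for n, m in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(m) for m in sizes[1:]]

    def forward(self, x):
        # each layer: sigmoid of the weighted sum of the previous layer
        for W, b in zip(self.weights, self.biases):
            x = sigmoid(W @ x + b)
        return x          # scores for the classes "0"-"9"

# scores = FeedForwardNet().forward(char_16x16.ravel())   # char_16x16: hypothetical input
```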
Another viewpoint can also be taken. Le Cun et al.'s neural network can be regarded as performing hierarchical feature extraction. Each node "sees" a window in the previous layer and combines the low-level features in this window into a higher-level feature. So, the higher the network layer, the more abstract and more global the extracted features, the final abstraction level being the features digit "0", digit "1", ..., digit "9". Note that the feature extractors are not hand-crafted or ad hoc selected rules, but are trained on a large set of training samples.
Some neural networks are given extracted features as input instead of a scaled or subsampled input image [e.g. references (25, 60)]. Then the network can be viewed as a pure classifier, constructing some complicated decision boundaries, or it can be viewed as extracting "superfeatures" in the combined process of feature extraction and classification.
One problem with using neural networks in OCR is that it is difficult to analyse and fully understand the decision making process.(87) What are the implicit features and what are the decision boundaries? Also, an unbiased comparison between neural networks and statistical classifiers is difficult. If the pixels themselves are used as inputs to a neural network and the neural network is compared with, say, a k-nearest neighbor (kNN) classifier using the same pixels as "features", then the comparison is not fair, since a neural network has the opportunity to derive more meaningful features in the hidden layers. Rather, the best performing statistical or structural classifiers have to be compared with the best neural network classifiers.(88)
7. DISCUSSION

Before selecting a specific feature extraction method, one needs to consider the total character recognition system in which it will operate. What type of input characters is the system designed for? Is the input single-font typed or machine-printed characters, multi-font machine-printed, neatly hand-printed or unconstrained handwritten? What is the variability of the characters belonging to the same class? Are gray-level images or binary images available? What is the scanner resolution? Is a statistical or structural classifier to be used? What are the throughput requirements (characters per second) as opposed to recognition requirements (reject versus error rate)? What hardware is available? Can special-purpose hardware be used, or must the system run on standard hardware? What is the expected price of the system? Such questions need to be answered in order to make a qualified selection of the appropriate feature extraction method.
Often, a single feature extraction method alone is not sufficient to obtain good discrimination power. An obvious solution is to combine features from different feature extraction methods. If a statistical classifier is to be used and a large training set is available, discriminant analysis can be used to select the features with the highest discriminative power. The statistical properties of such combined feature vectors need to be explored. Another approach is to use multiple classifiers.(20,25-27,89,90) In that case, one can even combine statistical, structural and neural network classifiers to utilize their inherent differences.
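A very simple instance of classifier combination, given here only as a sketch, is a majority vote over the crisp decisions of the individual classifiers, with rejection when no unique winner exists; the cited papers study considerably more elaborate schemes:

```python
import numpy as np

def combine_by_vote(class_labels):
    """Majority vote over the decisions of several classifiers;
    returns None (reject) when there is no unique winner."""
    labels, counts = np.unique(class_labels, return_counts=True)
    winners = labels[counts == counts.max()]
    return winners[0] if len(winners) == 1 else None

# combine_by_vote(["8", "8", "3"]) -> "8"
```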
The main disadvantage of using gray-scale image based approaches is the memory requirements. Although Pavlidis(91) has shown that similar recognition rates may be achieved at a lower resolution for gray-scale methods than for binary image-based methods, gray-scale images cannot be compressed significantly without a loss of information. Binary images are easily compressed using, for example, run-length coding and algorithms can be written to work on this format. However, as the performance and memory capacity of computers continue to double every 18 months or so, gray-scale methods will eventually become feasible in more and more applications.
To illustrate the process of identifying the best feature extraction methods, let us consider the digits in the hydrographic map (Fig. 1). The digits are hand-printed by one writer and have roughly the same orientation, size and slant (skew), although some variations exist and they vary over the different portions of the whole map. These variations are probably large enough to affect the features considerably, if rotation-, size- or skew-variant features are used. However, by using features invariant to scale, rotation and skew, a larger variability is allowed and confusion among characters such as "6" and "9" may be expected. By using a statistical classifier which assumes statistically dependent features (e.g. using the multivariate Gaussian distribution), we can hope that these variations will be properly accounted for. Ideally, it should then be possible to find the size, orientation and perhaps slant directions in the feature space by principal component analysis (PCA), although the actual PCA does not have to be implemented. However, characters with unusual size, rotation or skew will probably not be correctly classified. An appropriate solution may therefore be to use a mix of variant and invariant features.
For many applications, robustness to variability in character shape, to degradation and to noise is important. Characters may be fragmented or merged. Other characters might be self-touching or have a broken loop. For features extracted from character contours or skeletons, we will expect very different features depending on whether fragmented, self-touching or broken loop characters occur or not. Separate classes will normally have to be used for these variants, but the training set may contain too few of each variant to make reliable class descriptions.
Fourier descriptors cannot be applied to fragmented characters in a meaningful way, since this method extracts features from one single closed contour or skeleton. Further, outer contour curve-based methods do not use information about the interior of the characters, such as holes in "8", "0", etc., so one then has to consider if some classes will be easily confused. A solution may be to use multistage classifiers.(25)
Zoning, moment invariants, Zernike moments and the Karhunen-Loeve transform may be good alternatives, since they are not affected by the above degradations to the same extent. Zoning is probably not a good choice, since the variations present in each digit class may cause a specific part of a character to fall into different zones for different instances. Cao et al.(25) tried to compensate for this by using fuzzy borders, but this method is only capable of compensating for small variations of the character shape. Moment invariants are invariant to size and rotation, and some moment invariants are also invariant to skew and mirror images.(42) Mirror image invariance is not desirable, so moment invariants that are invariant to skew but not mirror images would be useful, and a few such invariants do exist.(42) Moment invariants lack the reconstructability property, which probably means that a few more features are needed than for features for which reconstruction is possible.
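For reference, the first two of Hu's moment invariants(41) can be computed directly from normalized central moments; a brief sketch (our own) for a binary or gray-level character image:

```python
import numpy as np

def hu_first_two(img):
    """First two Hu moment invariants, invariant to translation,
    scale and rotation."""
    img = np.asarray(img, dtype=float)
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    cx, cy = (x * img).sum() / m00, (y * img).sum() / m00

    def mu(p, q):                  # central moments
        return ((x - cx) ** p * (y - cy) ** q * img).sum()

    def eta(p, q):                 # normalized central moments
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2
```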
Zernike moments are complex numbers which themselves are not rotation-invariant, but their amplitudes are. Also, size invariance is obtained by prescaling the image. In other words, we can obtain size- and rotation-independent features. Since Zernike moments have the reconstructability property, they appear to be very promising for our application.
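A sketch of extracting such amplitude features, assuming the third-party mahotas library (one of several available implementations) and an input that has already been size-normalized by prescaling:

```python
import numpy as np
import mahotas

def zernike_amplitudes(binary_char, degree=8):
    """Rotation-invariant amplitudes of the Zernike moments of a
    (prescaled) character image, computed over a disk enclosing it."""
    img = np.asarray(binary_char, dtype=bool)
    radius = max(img.shape) // 2
    return mahotas.features.zernike_moments(img, radius, degree=degree)
```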
Of the unitary image transforms, the Karhunen-Loeve transform has the best information compactness in terms of mean square error. However, since the features are only linear combinations of the pixels in the input character image, we cannot expect them to be able to extract high-level features the way other methods do, so many more features are needed, and thus a much larger training set than for other methods. Also, since the features are tied to pixel locations, we cannot expect to obtain class descriptions suitable for parametric statistical classifiers. Still, a nonparametric classifier like the k-nearest neighbor classifier(9) may perform well on the Karhunen-Loeve transform features.
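A compact sketch of Karhunen-Loeve features via the singular value decomposition of the centered training data (our own illustration; the number of retained components is an arbitrary choice):

```python
import numpy as np

def kl_features(train_imgs, test_img, n_features=32):
    """Project a character image onto the leading eigenvectors of the
    training set's covariance matrix. `train_imgs` is (N, H, W)."""
    X = train_imgs.reshape(len(train_imgs), -1).astype(float)
    mean = X.mean(axis=0)
    # rows of Vt are the principal directions of the centered data
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_features]
    return basis @ (test_img.ravel() - mean)
```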
Discretization errors and other high-frequency noise are removed when using Fourier descriptors (Figs 18 and 19), moment invariants or Zernike moments, since we never use very high order terms. Zoning methods are also robust against high-frequency noise, because of the implicit low-pass filtering in the method.
From the above analysis, it seems that Zernike moments would be good features in our hydrographic map application. However, one really needs to perform an experimental evaluation of a few of the most promising methods to decide which feature extraction method is the best in practice for each application. The evaluation should be performed on large data sets that are representative for the particular application. Large, standard data sets are now available from NIST(56) (Gaithersburg, MD 20899, U.S.A.) and SUNY at Buffalo(92) (CEDAR, SUNY, Buffalo, NY 14260, U.S.A.). If these or other available data sets are not representative, then one might have to collect a large data set. However, performance on the standard data sets does give an indication of the usefulness of the features and provides performance figures that can be compared with other research groups' results.
8. SUMMARY

Optical character recognition (OCR) is one of the most successful applications of automatic pattern recognition. Since the mid 1950s, OCR has been a very active field for research and development.(1) Today, reasonably good OCR packages can be bought for as little as $100. However, these are only able to recognize high quality printed text documents or neatly written hand-printed text. The current research in OCR is now addressing documents that are not well handled by the available systems, including severely degraded, omnifont machine-printed text and (unconstrained) handwritten text. Also, efforts are being made to achieve lower substitution error rates and reject rates even on good quality machine-printed text, since an experienced human typist still has a much lower error rate, albeit at a slower speed.
Selection of a feature extraction method is probably the single most important factor in achieving high recognition performance. Given the large number of feature extraction methods reported in the literature, a newcomer to the field is faced with the following question: which feature extraction method is the best for a given application? This question led us to characterize the available feature extraction methods, so that the most promising methods could be sorted out. An experimental evaluation of these few promising methods must still be performed to select the best method for a specific application.
Devijver and Kittler define feature extraction [page 12 in reference (11)] as the problem of "extracting from the raw data the information which is most relevant for classification purposes, in the sense of minimizing the within-class pattern variability while enhancing the between-class pattern variability". In this paper, we reviewed feature extraction methods including:

(1) template matching;
(2) deformable templates;
(3) unitary image transforms;
(4) graph description;
(5) projection histograms;
(6) contour profiles;
(7) zoning;
(8) geometric moment invariants;
(9) Zernike moments;
(10) spline curve approximation;
(11) Fourier descriptors.

Each of these methods may be applied to one or more of the following representation forms:

(1) gray-level character image;
(2) binary character image;
(3) character contour;
(4) character skeleton or character graph.

For each feature extraction method and each character representation form, we discussed the properties of the extracted features.
Before selecting a specific feature extraction method, one needs to consider the total character recognition system in which it will operate. The process of identifying the best feature extraction method was illustrated by considering the digits in the hydrographic map (Fig. 1) as an example. It appears that Zernike moments would be good features in this application. However, one really needs to perform an experimental evaluation of a few of the most promising methods to decide which feature extraction method is the best in practice for each application. The evaluation should be performed on large data sets that are representative for the particular application.
Acknowledgements--We thank the anonymous referee for useful comments. This work was supported by a NATO collaborative research grant (CRG 930289) and The Research Council of Norway.
REFERENCES

1. S. Mori, C. Y. Suen and K. Yamamoto, Historical review of OCR research and development, Proc. IEEE 80, 1029-1058 (July 1992).
2. J. W. Gorman, O. R. Mitchell and F. P. Kuhl, Partial shape recognition using dynamic programming, IEEE Trans. Pattern Anal. Mach. Intell. 10, 257-266 (March 1988).
3. E. Persoon and K.-S. Fu, Shape discrimination using Fourier descriptors, IEEE Trans. Syst. Man Cybernet. 7, 170-179 (March 1977).
4. L. Shen, R. M. Rangayyan and J. E. L. Desaultes, Application of shape analysis to mammographic calcifications, IEEE Trans. Med. Imaging 13, 263-274 (June 1994).
5. D. G. Elliman and I. T. Lancaster, A review of segmentation and contextual analysis for text recognition, Pattern Recognition 23(3/4), 337-346 (1990).
6. C. E. Dunn and P. S. P. Wang, Character segmentation techniques for handwritten text--a survey, Proc. 11th IAPR Int. Conf. Pattern Recognition II, pp. 577-580. The Hague, The Netherlands (1992).
7. L. Lam, S.-W. Lee and C. Y. Suen, Thinning methodologies--a comprehensive survey, IEEE Trans. Pattern Anal. Mach. Intell. 14, 869-885 (September 1992).
8. K.-S. Fu, Syntactic Pattern Recognition and Applications. Prentice-Hall, Englewood Cliffs, New Jersey (1982).
9. R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. John Wiley & Sons, New York (1973).
10. K. Fukunaga, Introduction to Statistical Pattern Recognition. Academic Press, Boston (1990).
11. P. A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. Prentice-Hall, London (1982).
12. J. R. Ullmann, Pattern Recognition Techniques. Butterworth, London (1973).
13. O. D. Trier and T. Taxt, Evaluation of binarization methods for document images, IEEE Trans. Pattern Anal. Mach. Intell. 17, 312-315 (March 1995).
14. O. D. Trier and A. K. Jain, Goal-directed evaluation of binarization methods, IEEE Trans. Pattern Anal. Mach. Intell. 17, 1191-1201 (December 1995).
15. S.-W. Lee, L. Lam and C. Y. Suen, Performance evaluation of skeletonizing algorithms for document image processing, Proc. First Intl Conf. Document Anal. Recognition, Saint-Malo, France (1991).
16. M. Y. Jaisimha, R. M. Haralick and D. Dori, Quantitative performance evaluation of thinning algorithms under noisy conditions, Proc. IEEE Conf. Comput. Vis. Pattern Recognition (CVPR), pp. 678-683, Seattle, Washington (June 1994).
17. V. K. Govindan and A. P. Shivaprasad, Character recognition--a review, Pattern Recognition 23(7), 671-683 (1990).
18. G. Nagy, At the frontiers of OCR, Proc. IEEE 80, 1093-1100 (July 1992).
19. C. Y. Suen, R. Legault, C. Nadal, M. Cheriet and L. Lam, Building a new generation of handwriting recognition systems, Pattern Recognition Lett. 14, 303-315 (April 1993).
20. C. Y. Suen, C. Nadal, R. Legault, T. A. Mai and L. Lam, Computer recognition of unconstrained handwritten numerals, Proc. IEEE 80, 1162-1180 (July 1992).
21. C. Y. Suen, M. Berthod and S. Mori, Automatic recognition of handprinted characters--the state of the art, Proc. IEEE 68, 469-487 (April 1980).
22. J. Mantas, An overview of character recognition methodologies, Pattern Recognition 19(6), 425-430 (1986).
23. A. K. Jain and B. Chandrasekaran, Dimensionality and sample size considerations in pattern recognition practice, Handbook of Statistics, P. R. Krishnaiah and L. N. Kanal, eds, Vol. 2. North-Holland, Amsterdam (1982).
24. P. Gader, B. Forester, M. Ganzberger, A. Gillies, B. Mitchell, M. Whalen and T. Yocum, Recognition of handwritten digits using template and model matching, Pattern Recognition 24(5), 421-431 (1991).
25. J. Cao, M. Ahmadi and M. Shridhar, Handwritten numeral recognition with multiple features and multistage classifiers, IEEE Int. Symp. Circuits Syst. 6, pp. 323-326, London (30 May-2 June 1994).
26. L. Xu, A. Krzyżak and C. Y. Suen, Methods of combining multiple classifiers and their applications to handwriting recognition, IEEE Trans. Syst. Man Cybernet. 22(3), 418-435 (1992).
27. F. Kimura and M. Shridhar, Handwritten numerical recognition based on multiple algorithms, Pattern Recognition 24(10), 969-983 (1991).
28. T. Y. Zhang and C. Y. Suen, A fast parallel algorithm for thinning digital patterns, Commun. ACM 27, 236-239 (March 1984).
29. T. H. Reiss, The revised fundamental theorem of moment invariants, IEEE Trans. Pattern Anal. Mach. Intell. 13, 830-834 (August 1991).
30. F. P. Kuhl and C. R. Giardina, Elliptic Fourier features of a closed contour, Comput. Vis. Graphics Image Process. 18, 236-258 (1982).
31. A. Khotanzad and Y. H. Hong, Invariant image recognition by Zernike moments, IEEE Trans. Pattern Anal. Mach. Intell. 12(5), 489-497 (1990).
32. D. H. Ballard and C. M. Brown, Computer Vision, pp. 65-70. Prentice-Hall, Englewood Cliffs, New Jersey (1982).
33. W. K. Pratt, Digital Image Processing, 2nd edn. Wiley, New York (1991).
34. G. Storvik, A Bayesian approach to dynamic contours through stochastic sampling and simulated annealing, IEEE Trans. Pattern Anal. Mach. Intell. 16, 976-986 (October 1994).
35. M. Kass, A. Witkin and D. Terzopoulos, Snakes: active contour models, Intl. J. Comput. Vis. 321-331 (1988).
36. A. D. Bimbo, S. Santini and J. Sanz, OCR from poor quality images by deformation of elastic templates, Proc. 12th IAPR Intl. Conf. Pattern Recognition 2, pp. 433-435. Jerusalem, Israel (1994).
37. H. C. Andrews, Multidimensional rotations in feature selection, IEEE Trans. Comput. 20, 1045-1051 (September 1971).
38. R. C. Gonzalez and R. E. Woods, Digital Image Processing. Addison-Wesley, New York (1992).
39. M. Turk and A. Pentland, Eigenfaces for recognition, J. Cogn. Neurosci. 3(1), 71-86 (1991).
40. M. Bokser, Omnidocument technologies, Proc. IEEE 80, 1066-1078 (July 1992).
41. M.-K. Hu, Visual pattern recognition by moment invariants, IRE Trans. Inf. Theory 8, 179-187 (February 1962).
42. T. H. Reiss, Recognizing Planar Objects Using Invariant Image Features, Vol. 676 of Lecture Notes in Computer Science. Springer-Verlag, Berlin (1993).
43. S. O. Belkasim, M. Shridhar and A. Ahmadi, Pattern recognition with moment invariants: A comparative study and new results, Pattern Recognition 24, 1117-1138 (December 1991).
44. S. O. Belkasim, M. Shridhar and A. Ahmadi, Corrigendum, Pattern Recognition 26, 377 (January 1993).
45. Y. Li, Reforming the theory of invariant moments for pattern recognition, Pattern Recognition 25, 723-730 (July 1992).
46. J. Flusser and T. Suk, Pattern recognition by affine moment invariants, Pattern Recognition 26, 167-174 (January 1993).
47. J. Flusser and T. Suk, Affine moment invariants: A new tool for character recognition, Pattern Recognition Lett. 15, 433-436 (April 1994).
48. B. Bamieh and R. J. P. de Figueiredo, A general moment-invariants/attributed-graph method for three-dimensional object recognition from a single image, IEEE J. Robot. Automat. 2, 31-41 (March 1986).
49. A. Khotanzad and Y. H. Hong, Rotation invariant image recognition using features selected via a systematic method, Pattern Recognition 23(10), 1089-1101 (1990).
50. J. M. Westall and M. S. Narasimha, Vertex directed segmentation of handwritten numerals, Pattern Recognition 26, 1473-1486 (October 1993).
51. H. Fujisawa, Y. Nakano and K. Kurino, Segmentation methods for character recognition: from segmentation to document structure analysis, Proc. IEEE 80, 1079-1092 (July 1992).
52. O. D. Trier, T. Taxt and A. K. Jain, Data capture from maps based on gray scale topographic analysis, Proc. Third Int. Conf. Document Anal. Recognition, pp. 923-926. Montreal, Canada (August 1995).
53. L. Wang and T. Pavlidis, Direct gray-scale extraction of features for character recognition, IEEE Trans. Pattern Anal. Mach. Intell. 15, 1053-1067 (October 1993).
54. R. M. Haralick, L. T. Watson and T. J. Laffey, The topographic primal sketch, Int. J. Robot. Res. 2, 50-72 (Spring 1983).
55. J. D. Tubbs, A note on binary template matching, Pattern Recognition 22(4), 359-365 (1989).
56. M. D. Garris, J. L. Blue, G. T. Candela, D. L. Dimmick, J. Geist, P. J. Grother, S. A. Janet and C. L. Wilson, NIST form-based handprint recognition system, Technical Report NISTIR 5469, National Institute of Standards and Technology, Gaithersburg, MD 20899, U.S.A. (July 1994).
57. M. H. Glauberman, Character recognition for business machines, Electronics 132-136 (February 1956).
58. R. Kasturi and M. M. Trivedi, eds, Image Analysis Applications. Marcel Dekker, New York (1990).
59. L. Yang and F. Albregtsen, Fast computation of invariant geometric moments: A new method giving correct results, Proc. 12th IAPR Int. Conf. Pattern Recognition 1, pp. 201-204. Jerusalem, Israel (October 1994).
60. H. Takahashi, A neural net OCR using geometrical and zonal pattern features, Proc. First Int. Conf. Document Anal. Recognition, pp. 821-828. Saint-Malo, France (1991).
61. I. Sekita, K. Toraichi, R. Mori, K. Yamamoto and H. Yamada, Feature extraction of handwritten Japanese characters by spline functions for relaxation matching, Pattern Recognition 21(1), 9-17 (1988).
62. T. Taxt, J. B. Olafsdóttir and M. Dæhlen, Recognition of handwritten symbols, Pattern Recognition 23(11), 1155-1166 (1990).
63. C.-S. Lin and C.-L. Hwang, New forms of shape invariants from elliptic Fourier descriptors, Pattern Recognition 20(5), 535-545 (1987).
64. C. T. Zahn and R. C. Roskies, Fourier descriptors for plane closed curves, IEEE Trans. Comput. 21, 269-281 (March 1972).
65. G. H. Granlund, Fourier preprocessing for hand print character recognition, IEEE Trans. Comput. 21, 195-201 (February 1972).
66. B. K. Jang and R. T. Chin, One-pass parallel thinning: analysis, properties and quantitative evaluation, IEEE Trans. Pattern Anal. Mach. Intell. 14, 1129-1140 (November 1992).
67. P. C. K. Kwok, A thinning algorithm by contour generation, Commun. ACM 31(11), 1314-1324 (1988).
68. L. Wang and T. Pavlidis, Detection of curved and straight segments from gray scale topography, CVGIP: Image Understanding 58, 352-365 (November 1993).
69. S.-W. Lee and J.-S. Park, Nonlinear shape normalization methods for the recognition of large-set handwritten characters, Pattern Recognition 27(7), 895-902 (1994).
70. H. Yamada, K. Yamamoto and T. Saito, A nonlinear normalization method for handprinted Kanji character recognition--line density equalization, Pattern Recognition 23(9), 1023-1029 (1990).
71. D. J. Burr, Elastic matching of line drawings, IEEE Trans. Pattern Anal. Mach. Intell. 3, 708-713 (November 1981).
72. T. Wakahara, Toward robust handwritten character recognition, Pattern Recognition Lett. 14, 345-354 (April 1993).
73. T. Wakahara, Shape matching using LAT and its application to handwritten numeral recognition, IEEE Trans. Pattern Anal. Mach. Intell. 16, 618-629 (June 1994).
74. T. Pavlidis, A vectorizer and feature extractor for document recognition, Comput. Vis. Graphics Image Process. 35, 111-127 (1986).
75. S. Kahan, T. Pavlidis and H. S. Baird, On the recognition of printed characters of any font and size, IEEE Trans. Pattern Anal. Mach. Intell. 9, 274-288 (March 1987).
76. S. W. Lu, Y. Ren and C. Y. Suen, Hierarchical attributed graph representation and recognition of handwritten Chinese characters, Pattern Recognition 24(7), 617-632 (1991).
77. H.-J. Lee and B. Chen, Recognition of handwritten Chinese characters via short line segments, Pattern Recognition 25(5), 543-552 (1992).
78. F.-H. Cheng, W.-H. Hsu and M.-C. Kuo, Recognition of handprinted Chinese characters via stroke relaxation, Pattern Recognition 26(4), 579-593 (1993).
79. F.-H. Cheng, W.-H. Hsu and M.-Y. Chen, Recognition of handwritten Chinese characters by the modified Hough transform techniques, IEEE Trans. Pattern Anal. Mach. Intell. 11, 429-439 (April 1989).
80. R. O. Duda and P. E. Hart, Use of the Hough transformation to detect lines and curves in pictures, Commun. ACM 15, 11-15 (January 1972).
81. S. R. Ramesh, A generalized character recognition algorithm: a graphical approach, Pattern Recognition 22(4), 347-350 (1989).
82. A. Kundu, Y. He and P. Bahl, Recognition of handwritten word: first and second order hidden Markov model based approach, Pattern Recognition 22(3), 283-297 (1989).
83. E. Holbaek-Hanssen, K. Bråthen and T. Taxt, A general software system for supervised statistical classification of symbols, Proc. 8th Intl Conf. Pattern Recognition, pp. 144-149. Paris, France (October 1986).
84. T. Taxt and K. W. Bjerde, Classification of handwritten vector symbols using elliptic Fourier descriptors, Proc. 12th IAPR Int. Conf. Pattern Recognition, pp. 123-128. Jerusalem, Israel (October 1994).
85. J. Hertz, A. Krogh and R. G. Palmer, Introduction to the Theory of Neural Computation. Addison-Wesley, Redwood City, California (1991).
86. Y. Le Cun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard and L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Comput. 1, 541-551 (1989).
87. R. P. W. Duin, Superlearning and neural network magic, Pattern Recognition Lett. 15, 215-217 (March 1994).
88. A. K. Jain and J. Mao, Neural networks and pattern recognition, Proc. IEEE World Congress Comput. Intell., Orlando, Florida (June 1994).
89. T. K. Ho, J. J. Hull and S. N. Srihari, Decision combination in multiple classifier systems, IEEE Trans. Pattern Anal. Mach. Intell. 16, 66-75 (January 1994).
90. M. Sabourin, A. Mitiche, D. Thomas and G. Nagy, Classifier combination for hand-printed digit recognition, Proc. Second Int. Conf. Document Anal. Recognition, pp. 163-166. Tsukuba Science City, Japan (October 1993).
91. T. Pavlidis, Recognition of printed text under realistic conditions, Pattern Recognition Lett. 14, 317-326 (April 1993).
92. J. J. Hull, A database for handwritten text research, IEEE Trans. Pattern Anal. Mach. Intell. 16, 550-554 (May 1994).
About the Author--OIVIND DUE TRIER was born in Oslo, Norway, in 1966. He received his M.Sc. degree in Computer Science from The Norwegian Institute of Technology in 1991 and was with the Norwegian Defense Research Establishment from 1991 to 1992. Since 1992 he has been a Ph.D. student at the Department of Informatics, University of Oslo. His current research interests include pattern recognition, image analysis, document processing and geographical information systems.
About the Author--ANIL K. JAIN received a B.Tech. degree in 1969 from the Indian Institute of Technology, Kanpur, and the M.S. and Ph.D. degrees in Electrical Engineering from the Ohio State University, in 1970 and 1973, respectively. He joined the faculty of Michigan State University in 1974, where he currently holds the rank of University Distinguished Professor in the Department of Computer Science. Dr Jain served as Program Director of the Intelligent Systems Program at the National Science Foundation (1980-1981) and has held visiting appointments at Delft Technical University, Holland, Norwegian Computing Center, Oslo, and Tata Research Development and Design Center, Pune, India. He has also been a consultant to several industrial, government and international organizations. His current research interests are computer vision, image processing and pattern recognition. Dr Jain has made significant contributions and published a large number of papers on the following topics: statistical pattern recognition, exploratory pattern analysis, Markov random fields, texture analysis, interpretation of range images and 3-D object recognition. Several of his papers have been reprinted in edited volumes on image processing and pattern recognition. He received the best paper awards in 1987 and 1991, and received certificates for outstanding contributions in 1976, 1979 and 1992 from the Pattern Recognition Society. Dr Jain was the Editor-in-Chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence (1991-1994) and is on the editorial boards of Pattern Recognition, Pattern Recognition Letters, Journal of Mathematical Imaging, Journal of Applied Intelligence and IEEE Transactions on Neural Networks. He is the co-author of Algorithms for Clustering Data, Prentice Hall, 1988, has edited the book Real-Time Object Measurement and Classification, Springer-Verlag, 1988, and has co-edited the books Analysis and Interpretation of Range Images, Springer-Verlag (1989), Neural Networks and Statistical Pattern Recognition, North-Holland (1991), Markov Random Fields: Theory and Applications, Academic Press (1993) and 3D Object Recognition, Elsevier (1993). Dr Jain is a Fellow of the IEEE. He was the Co-General Chairman of the 11th International Conference on Pattern Recognition, The Hague (1992), General Chairman of the IEEE Workshop on Interpretation of 3D Scenes, Austin (1989), Director of the NATO Advanced Research Workshop on Real-time Object Measurement and Classification, Maratea (1987), and co-directed NSF supported Workshops on Future Research Directions in Computer Vision, Maui (1991), Theory and Applications of Markov Random Fields, San Diego (1989), and Range Image Understanding, East Lansing (1988). Dr Jain was a member of the IEEE Publications Board (1988-1990) and served as the Distinguished Visitor of the IEEE Computer Society (1988-1990).
About the Author--TORFINN TAXT was born in 1950 and is Professor for the Medical Image Analysis Section at the University of Bergen and Professor in image processing at the University of Oslo. He is an associate editor of the IEEE Transactions on Medical Imaging and of Pattern Recognition, and has a Ph.D. in developmental neuroscience (1983), Medical Degree (1976) and M.S. in computer science (1978) from the University of Oslo. His current research interests are in the areas of image restoration, image segmentation, multi-spectral analysis, medical image analysis and document processing. He has published papers on the following topics: restoration of medical ultrasound and magnetic resonance images, magnetic resonance and remote sensing multispectral analysis, quantum field models for image analysis and document processing.
