
International Journal of Computer Systems (ISSN: 2394-1065), Volume 03 Issue 04, April, 2016

Available at http://www.ijcsonline.com/

Writing Order Recovery from Telugu Character Images


Ananda Kumar Kinjarapu^A, Subhashini Narra^B, Dr. Kamakshi Prasad Valurouthu

Department of Computer Science and Engineering, CMRCET, Kandlakoya, Hyderabad, Telangana, India
Department of Computer Science and Engineering, VNRVJIET, Bachupall, Hyderabad, Telangana, India

Department of Computer Science and Engineering, JNTUCEH, JNTUH, Hyderabad, Telangana, India

Abstract
Handwriting interfaces simplify communication with hand-held devices in regional languages. These devices need to recognize characters while they are being written, which is referred to as online recognition. Online recognizers for regional languages are becoming popular and yield better recognition performance than offline recognizers. Offline recognition takes scanned document images as input instead of the ordered sequences of coordinates available online. We attempt to recover the writing order, i.e. the ordered sequences of coordinates, from character images so that an online recognizer can be used to identify the correct character class. To recover the writing order, a skeleton graph is constructed from the thinned representation of the character image. The endpoints and junction points, along with spurious edges formed as a result of thinning, are considered as nodes, and the strokes connecting the nodes form the edges of the graph. We identify and propose rules to recover the writing order from eight graph structures possible in Telugu: S-vertex, S-loop, S-touching, T-vertex, T-loop, D-line, M-loop and L-loop. The strokes are ordered from top to bottom using bounding boxes, and the writing order was recovered for about 97% of the character images with the proposed start and end point identification rules.
Keywords: Writing order recovery, Telugu Character, Graph based.

I. INTRODUCTION

The day-by-day increase in the use of electronic gadgets equipped with pen interfaces (touch screens) has drawn increased attention to online handwriting recognition. Online recognition also serves as a first step towards letting people collaborate in their own languages. Offline handwriting recognition is equally important, as it has potential applications in the real world. For example, automatically processing checks, which are handwritten by default, requires recognition of the ink characters as well as validation of the customer's signature. Besides this, handwritten manuscripts need to be converted to electronic format to preserve the vast literature and pass it on to future generations.
The key difference between online and offline character recognition is the availability, in online systems, of dynamic information such as the stroke order of a multi-stroke character, the ordered sequence of ink points comprising a stroke, the direction in which a stroke is written, and the speed and pressure of writing. Moreover, the recognition performance of online systems is much better than that of offline systems (Rousseau, et al., 2005). If we are able to recover the order of the strokes and the time-ordered sequences of coordinates of each constituent stroke, one recognizer can recognize both online and offline characters. The extraction of time-ordered points mimicking the human drawing order is referred to as writing order recovery.
To the best of our knowledge, no attempts have been made at writing order recovery for Telugu script in particular or Indic scripts in general. Hence we set out to recover the writing order of Telugu script characters. HP Research Labs, Bangalore recently provided standardized datasets for both offline and online isolated Tamil and Telugu handwritten characters, along with open-source tools and online handwriting recognition tools for Indic scripts (HP Research Labs, Bangalore, 2009). We extracted the writing order from the isolated offline handwritten character dataset, which consists of 166 classes with 200 samples per character collected from 141 different writers.
The methods used for writing order recovery are classified as local tracing, global tracing and hybrid tracing (Nagoya & Hiroyuki, 2011). Local tracing methods are computationally efficient but intolerant to noise, whereas global tracing methods are resilient to noise but computationally complex. Hybrid methods combine the strengths of both local and global tracing methods. This paper presents the recovery of writing order from the graph constructed from the character image. To construct the graph from the character image, we need to extract a one-pixel-thin representation of the original image, termed the skeleton, that preserves its geometrical and topological properties. Here we obtained the skeletons using the fully parallel 2D thinning algorithm presented in (Ananda, et al., 2014).
The rest of the paper deals with recovering the writing order of a character in the way most native writers produce it. In section 2, we present a brief review of popular writing order recovery strategies in the state-of-the-art literature. The recovery procedure is detailed in section 3, followed by result analysis in section 4, and the proposed method is concluded in section 5.
II. LITERATURE

Lee and Pan (Lee & Pan, 1992) proposed hierarchical decision making for stroke identification and a set of heuristic rules to recover temporal information from signature images. It is based on a local tracing method which selects the contiguous path at the intersection points of strokes and recovers the drawing order of the strokes one after the other.
Global tracing methods use graphs created from the skeleton images to recover the writing order. Jager estimated the global smoothness by computing the angles of intersecting lines accurately, whereas Huang et al. used spline smoothing. Jager searched for a Hamiltonian path and Huang et al. explored the Eulerian path that resulted in minimum cost in the graphs created (Yoshiharu & Makoto, 2000).
Huang and Yasuhara (Huang & Yasuhara, 1995) proposed an approach to recover the time-ordered points from a cursive script, using the SLALOM approximation method to estimate smooth continuity (the TSS method). The smoothness of the recovered handwriting is evaluated over the total stroke of the handwriting, as opposed to conventional methods, which consider only the local area of each crossing point. The method is therefore robust against local ambiguities and noise. Complex handwriting is divided into smaller components, and extensive experimental studies verified the effectiveness of the method. The method is based on four assumptions: only single strokes are allowed, no crossing point has more than two lines, the starting and ending points are not the same, and the stroke does not contain double-traced lines.
Kato and Yasuhara (Yoshiharu & Makoto, 2000) proposed a hybrid method to recover the drawing order from skeleton graphs. The double-traced lines (D-lines) are labelled as loop (LD-line), spurious (SD-line) and proper (PD-line), with and without crossings. The dynamic information is extracted based on the labels of the edges using the proposed trace algorithm. The authors restricted the recovery to single strokes with well-defined start and end points and to junction points at which no more than two lines intersect.
Lau et al. (Lau, et al., 2002) proposed a simple writer-model-based method for extracting stroke sequences from thinned signatures. The number of connected points and the number of connected strokes are determined using a 3x3 neighborhood. The strokes are represented by their start and end points along with their direction vectors. The direction and order of the stroke sequence are estimated by minimizing the distance travelled in writing the signature. The minimum distance is computed using four proposed cost measures: the direction cost within the strokes, the inter-stroke distance cost, the inter-stroke direction cost and the cost of the start pixel. The authors claimed promising results with single-stroke as well as multi-stroke signatures, without any limitation on crossing points.
Jung and Hyung (In-Jung & Jin-Hyung, 2003) represented all kinds of stroke relationships using statistical dependency, followed by an nth-order probability distribution method to identify neighbor relationships. The neighbor selection algorithm selects the important relationships instead of all existing relationships. An overall recognition rate of 98.45% was reported for handwritten Chinese characters with the proposed character modeling method.
Qiao and Yasuhara (Qiao & Yasuhara, 2004) proposed a method to recover the writing trajectory from single-stroke images using local node analysis and global smoothness computation based on SLALOM. The double-traced/terminal segments, which are identified by odd-degree nodes in the skeleton graph, are resolved using a probability framework based on angles measured using Principal Component Analysis (PCA). All the even-degree nodes are analyzed by the node traversing rule (NTR). The authors claimed that the proposed method is better than the methods proposed by Jager, Huang et al. and Kato et al. (Yoshiharu & Makoto, 2000): the identification of double-traced/terminal segments is more robust and the smoothness optimization is computationally efficient.
Qiao et al. (Qiao, et al., 2005) recovered the writing order by finding the smoothest Euler path in the skeleton graph within the framework of Edge Contiguous Relation (ECR). The authors obtained the possible ECRs at each node by local analysis, followed by a global trace to find candidate Euler paths, from which the smoothest one is selected. The two main contributions are a way of obtaining the possible ECRs at even nodes based on two simple assumptions, and a method to identify double-traced lines using weighted matching of a general graph. They claimed that their method achieved a correct recovery rate of 95.2%.
Rousseau et al. (Rousseau, et al., 2005) improved the Kato and Yasuhara (Yoshiharu & Makoto, 2000) algorithm. The authors proposed a methodology for searching for start and end points and identifying the drawing direction, and another algorithm for the reconstruction of online information. They proposed criteria for selecting the best path from the possible paths after identifying the start and end point pairs. This method is also restricted to single strokes, but relaxes the other two constraints imposed by Kato and Yasuhara. The authors reported 99.6% correct identification of start and end points; for 97% of the correctly identified start and end points, the reconstructed paths contain the good path; and in 93% of cases the good path is selected from the reconstructed paths. The recognition results on the recovered online data and on the original online data are roughly the same.
Qiao et al. (Qiao & Yasuhara, 2006) proposed recovery of the writing trajectory from multi-stroke static handwritten images by searching for the writing paths that best match template strokes. To reduce the search space, a matching cost function, defined as the sum of the positional distortion cost and the directional difference cost, is combined with a bidirectional search algorithm based on dynamic programming to find the best matching path. The authors claimed that 94.5% of strokes were recovered correctly while holding the start/end vertex constraint, increasing to 98% if the start and end points are supplied explicitly.
Cordella et al. (Cordella, et al., 2010) proposed a method to extract the dynamic information from skeleton graphs. They transformed the complete skeleton graph in all cases except when the start and end points each have an even degree. The Eulerian trail in the transformed graph is found using Fleury's algorithm (Taylor, ). The experimental results show that the writing order is produced even for handwriting with retracing, crossings and pen-ups.
Nagoya et al. (Nagoya & Hiroyuki, 2011) solved two major problems of writing order recovery from single-stroke static images, namely double-traced lines (D-lines) and obtaining the smoothest path of the stroke, using a graph-theoretic approach. It involves several steps. First, the graph is constructed from the thinned pattern of the input handwritten image. Second, using a novel idea, the D-lines are indexed by local examination and the skeleton graph is converted to a semi-Eulerian graph. The problem of recovering the drawing order then reduces to a maximum weight perfect matching problem on graphs. Finally, a probabilistic Tabu search algorithm is used to restore the writing order. Experimental studies demonstrated its effectiveness and usefulness. This work was extended to multi-stroke handwritten images by the authors in (Nagoya & Hiroyuki, 2011). The authors further extended the algorithm to handwritten images with complex patterns, proved its correctness mathematically and achieved a linear-time algorithm, as opposed to the quadratic-time algorithm proposed in (Yoshiharu & Makoto, 2000), in terms of stroke intersection points (Nagoya & Fujioka, 2012).
Iwakiri et al. (Iwakiri, et al., 2012) tackled writing order recovery of a stroke from the writing order of a similar instance (instance-based recovery). This method requires sufficient instances to cover the various single-stroke as well as multi-stroke characters, but no other special consideration. The authors explained the principle behind the recovery process and presented experimental results with complex characters.
Nguyen et al. (Nguyen & Blumenstein, 2010) presented a survey of techniques for static handwritten trajectory recovery. The recovery process requires methods for preprocessing, skeletonizing, contour extraction, ambiguous zone detection, ambiguous zone analysis, double-traced line and hidden loop analysis, start and end point detection, global reconstruction and many more. Although preprocessing operations are meant to improve the quality of the images to be processed, binarization can deteriorate or even destroy valuable clues which are available in grayscale images. Although there is diversity in the approaches, most techniques found in the literature to date are based on rough estimates. The recovery rate improves if the detection and estimation of the structural features become more accurate. Writing order recovery will attract more attention from the handwriting recognition community, as it deserves.

III. RECOVERING WRITING ORDER

Writing order recovery involves constructing the skeleton graph, extracting the individual constituent strokes, extracting ordered sequences of coordinate points, identifying start and end points and finally ordering the strokes. Each of these phases, along with the order in which the graph structures need to be processed, is detailed in the following subsections a through d.
a. Transforming character images to graph structure
A character image may contain multiple strokes with non-uniform width. The character strokes may intersect with themselves and with other strokes. The constituent strokes of the character image need to be separated before recovering the writing order. Separating the strokes from an input image pattern with uneven stroke width is quite difficult. Thus, to efficiently analyze the structure of the character image and to simplify the analysis process, the input pattern is thinned using the efficient fully parallel thinning algorithm stated in (Ananda, et al., 2014). The thinning algorithm retains the structural and topological information and thins the patterns to a width of one pixel. The thinned pattern contains endpoints, which have only one neighbor in the 8-neighborhood, i.e. BP = 1; stroke pixels, which have exactly two neighbors in the 8-neighborhood, i.e. BP = 2 (whose AP value is always equal to 2); and junction points, for which BP > 3, or BP = 3 and AP = 3 (Ananda, et al., 2014).

Figure 1: The thinned pattern using the thinning algorithm

The feature points are the endpoints and the junction points in the thinned image. Neighboring feature points are merged to form feature point clusters; a cluster may contain only one feature point or several, as indicated by the circles labelled 1 and 2 respectively in Figure 1. The ordered sequence of stroke points joining two feature points is referred to as a segment. A segment may be a real segment, which is part of the original stroke of the character written by the writer, or a spurious segment, produced as an artifact of the thinning process at narrow junction points. The procedure used to differentiate the real segments from the spurious segments is as follows (Qiao & Yasuhara, 2004):
1. Find the contour length of the input image pattern, where the contour is extracted using Pavlidis' algorithm.
2. Find the number of object points, i.e. black pixels, in the input image.
3. Compute the stroke width from the contour length and the number of object points found in steps 1 and 2.

4. Compute the length of the segment obtained after thinning the input image pattern as the cumulative sum of the Euclidean distances between successive segment points, $\ell = \sum_{i=1}^{n-1} \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$, where $(x_i, y_i)$, $i = 1, \dots, n$, are the ordered points of the segment.
5. If the length of the segment < 1.5 * stroke width, the segment is considered a spurious segment.
6. If the length of the segment > 4 * stroke width, the segment is considered a real segment.
7. If the length of the segment lies in [1.5 * stroke width, 4 * stroke width], the segment may be real or spurious. Such segments are checked further to label them as either real or spurious, as follows:
a. For each pixel on the segment, calculate the distance to the nearest contour pixel of the original image.
b. Compute the average of the nearest contour distances of all the points on the segment.
c. Compute the maximum of the nearest contour distances of all the points on the segment.
8. The segment is considered spurious if any one of the following conditions is satisfied:
i.

ii.
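The length test in steps 4 to 7 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the stroke width is assumed to have been computed already, and the further contour-distance checks of steps 7 and 8 are not reproduced because their conditions are not given here.

```python
import math


def segment_length(points):
    """Cumulative Euclidean distance between successive (x, y) segment points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def classify_by_length(points, stroke_width):
    """Apply the length thresholds of steps 5 to 7.

    Returns 'spurious', 'real', or 'ambiguous' (the last case needs the
    contour-distance checks, which are not sketched here).
    """
    length = segment_length(points)
    if length < 1.5 * stroke_width:
        return "spurious"
    if length > 4 * stroke_width:
        return "real"
    return "ambiguous"
```

For instance, a segment of length 5 is labelled spurious for a stroke width of 4, real for a stroke width of 1, and left ambiguous for a stroke width of 2.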

The points on the spurious segments identified using the above procedure are merged with the connecting feature point clusters to form new feature point clusters. The skeleton graph is then constructed by taking the feature point clusters as nodes and the real segments connecting the feature point clusters as edges. Skeleton graphs obtained for a multi-stroke character and for a single-stroke character are shown in Figure 2 and Figure 3 respectively.

Figure 2: (a) The multi-stroke input character image (b) Skeleton image of the input image thinned using the proposed algorithm (Ananda, et al., 2014) (c) Graph with the spurious segment marked with a black circle (d) Skeleton graph after merging the spurious segment with the connected feature point clusters (pixels coloured black)

b. Analyzing skeleton graph structures

Figure 4: Graph structures (a) S-Vertex (b) T-Vertex (c) S-Touching (d) S-Loop (e) T-Loop (f) D-line (g) M-Loop (h) L-Loop

To recover the writing order, the constituent strokes of a multi-stroke character image first need to be separated. The recovery of the writing order of the entire character is thereby reduced to the simpler problem of recovering the writing order of each single stroke. The skeleton graph structures need to be analyzed to separate the strokes and to recover the writing order of single strokes. The degree of the nodes in the skeleton graph is used to identify the different graph structures in Telugu character images. The skeleton graphs obtained are undirected, and thus the degree of a node is equal to the number of edge ends incident on it, so that a loop edge that starts and ends at the node contributes 2. Eight graph structures were identified by careful analysis of isolated Telugu character images: S-vertex, S-loop, S-Touching, T-Vertex, T-loop, D-line, M-loop and L-loop, as shown in Figure 4.
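The subsections that follow distinguish structures by two counts per node: the degree and the number of distinct incident edges. A small sketch of how both could be computed is given below. The edge-list representation and the name `node_stats` are assumptions made for illustration, not part of the paper.

```python
from collections import defaultdict


def node_stats(edges):
    """Return {node: (degree, number_of_distinct_incident_edges)}.

    `edges` is a list of (u, v) node pairs of an undirected skeleton graph.
    A loop is an edge with u == v; it adds 2 to the degree of its node but
    counts as a single incident edge.
    """
    degree = defaultdict(int)
    incident = defaultdict(set)
    for idx, (u, v) in enumerate(edges):
        degree[u] += 1
        degree[v] += 1
        incident[u].add(idx)
        incident[v].add(idx)
    return {node: (degree[node], len(incident[node])) for node in degree}


# Hypothetical graph: node "x" has two ordinary edges and one loop.
stats = node_stats([("a", "x"), ("x", "b"), ("x", "x")])
```

Here `stats["x"]` is `(4, 3)`, the signature the paper gives for an S-Loop, while the endpoints `"a"` and `"b"` each get `(1, 1)`.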
In the graph structures identified, the degree of a node is 1, 3 or 4. Some nodes may also have degree 2, which results from merging spurious segments with the connecting feature point clusters: a node with degree = 3 for which one of the incident edges is a spurious segment is converted to a node with degree = 2. So we first process the nodes with degree equal to 2. We propose the following order in which the nodes of the graph are processed to recover the writing order of strokes:
1. Nodes with degree = 2 resulting from the merging of spurious segments.
2. S-Vertex and S-Loop structures, which are nodes with degree = 4.
3. T-Cross and T-Loop nodes with degree = 3, and D-lines (double-traced lines), which are lines between a node with degree = 3 and a node with degree = 1.
4. M-Loop, L-Loop and S-Touching structures with two neighboring nodes of degree = 3.

Figure 3: Skeleton graph corresponding to a single stroke character image

The identification of these graph structures and the recovery of the drawing order of each of them are detailed in the following subsections.
1) Recovering graph structures with degree = 2
There are two possible graph structures with degree = 2, as shown in Figure 5. The loop structure is one with the same starting point and ending point; its writing order is recovered in the anti-clockwise direction. If the graph structure is a node with two different incident edges, the two incident edges are connected and the node with degree = 2 is removed to recover the writing order.
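Connecting the two incident edges at a degree-2 node amounts to concatenating their point sequences through the shared node. The sketch below assumes each edge is stored as a list of (x, y) points and that the node is a single point; in the paper a node is a feature point cluster, so this is a simplification, and `join_at_node` is a hypothetical helper name.

```python
def join_at_node(edge_a, edge_b, node):
    """Join two point sequences that both touch `node` into one sequence.

    Each edge is flipped if needed so that edge_a ends at the node and
    edge_b starts there; the node itself is kept only once.
    """
    a = edge_a if edge_a[-1] == node else edge_a[::-1]
    b = edge_b if edge_b[0] == node else edge_b[::-1]
    if a[-1] != node or b[0] != node:
        raise ValueError("both edges must have an endpoint at the node")
    return a + b[1:]


# Hypothetical edges meeting at (2, 0); the second one is stored reversed.
merged = join_at_node([(0, 0), (1, 0), (2, 0)], [(3, 1), (2, 0)], (2, 0))
```

After the join, the node no longer needs to appear in the graph, since the merged sequence is a single edge between the two outer endpoints.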

Figure 5: Graph structures with degree = 2 (red coloured node) (a) two different incident edges (b) loop
2) Recovering S-Vertex and S-Loop graph structures
The nodes processed after the nodes with degree = 2 are the S-Vertex and the S-Loop. The S-Vertex is identified as a node with degree = 4 and with 4 incident edges, labelled with a red circle in Figure 6(a). It is produced by the intersection of two different strokes. Thus, after processing the S-vertices, the skeleton graph is decomposed into constituent strokes. To decompose the strokes, the edges are first ordered by the angle of the incident edge with respect to the horizontal line, measured either clockwise or counter-clockwise. Then the 1st edge is connected with edge n/2+1, the 2nd edge with edge n/2+2, and so on, until edge n/2 is connected to the last edge in the ordered sequence, where n denotes the number of edges incident on the S-vertex. The S-Vertex is then removed and, as a consequence, the edges incident on it are also deleted. Finally, the graph is updated by adding the n/2 edges resulting from the decomposition. In Figure 6(a), four edges are incident on the S-vertex connecting two different strokes; the edges labelled 1 and 3, and 2 and 4, are joined together, and the component strokes obtained are shown in Figure 6(b) and Figure 6(c).
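The pairing rule for an S-Vertex can be sketched as below. The sketch assumes that each incident edge is represented by one sample point near the node, from which its angle to the horizontal is measured; how that direction is actually estimated is not specified in the text, and `pair_opposite_edges` is a hypothetical name.

```python
import math


def pair_opposite_edges(node, neighbors):
    """Pair the i-th incident edge with the (i + n/2)-th in angular order.

    `node` is the (x, y) position of the S-Vertex and `neighbors` maps an
    edge label to a sample (x, y) point on that edge close to the node.
    """
    def angle(label):
        x, y = neighbors[label]
        return math.atan2(y - node[1], x - node[0]) % (2 * math.pi)

    ordered = sorted(neighbors, key=angle)
    half = len(ordered) // 2
    return [(ordered[i], ordered[i + half]) for i in range(half)]


# Hypothetical crossing: edges 1..4 leave the node at 0, 90, 180, 270 degrees.
pairs = pair_opposite_edges((0, 0), {1: (1, 0), 2: (0, 1), 3: (-1, 0), 4: (0, -1)})
```

For this crossing the result pairs edge 1 with edge 3 and edge 2 with edge 4, matching the example of Figure 6(a).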

Figure 6: (a) The multi-stroke character with S-vertex and S-loop graph structures (b) and (c) component strokes obtained as a result of decomposing the S-Vertex

The S-Loop is identified as a node with degree = 4 and with 3 incident edges, labelled with a blue circle in Figure 6(a). It is formed when a stroke intersects itself, which leads to the formation of a loop. The same strategy used for the S-Vertex is used to recover the writing order. The edge labelled 1 is connected to the loop end labelled 3, and then the other end of the loop, labelled 4, is connected to the remaining edge labelled 4. Finally, the S-loop vertex is removed from the graph and the new edge is added to the skeleton graph.
All the nodes remaining in the skeleton graph have degree equal to either 1 or 3. The nodes with degree = 1 are either stroke endpoints or part of a double-traced line (D-line). The graph structures remaining are T-Cross, T-Loop, D-line, M-Loop, L-Loop and S-Touching. The graph structures M-Loop, L-Loop and S-Touching are termed super structures, as all of them contain a D-line and the T-Loop is a part of the L-Loop. The nodes which are not part of an M-Loop, L-Loop or S-Touching are processed first, followed by the super structures.
3) Recovering T-Cross, D-line and T-Loop graph structures

Figure 7: (a) Graph structure with T-Cross (labelled 1), D-line (labelled 2) and T-Loop (labelled 3) (b) Three strokes recovered from the graph structure

Figure 7(a) shows a graph structure with a T-Cross labelled 1, a D-line labelled 2 and a T-Loop labelled 3. The constituent strokes obtained after recovering the writing order of all the graph structures in Figure 7(a) are shown in three different colors in Figure 7(b). These three graph structures are differentiated by calculating the angles around the node to be labelled.
As a T-Loop may exist either independently or as part of an L-Loop, we first need to differentiate these two cases. If the angles at which the D-line meets the two endpoints of the loop are roughly the same, i.e. the angular difference between the top-2 angles is below a threshold, the T-Loop is part of an L-Loop. Otherwise, it is an independent T-Loop. Based on our experiments, the threshold value is fixed at one tenth of the angle around a point, i.e. $360/10 = 36$ degrees. Besides this, the condition for identifying a T-Loop is a node with degree = 3 and with 2 incident edges. To recover the writing order of a T-Loop, connect the edge to the endpoint of the loop with which it makes the maximum angle. Finally, the edges previously incident on the node are removed from the graph.

The remaining nodes which are not part of the super structures are T-Cross and D-line nodes. Both have degree = 3 and 3 incident edges. If the difference between the top-2 angles is greater than or equal to the threshold of 36 degrees, the graph structure is a T-Cross; otherwise it is a probable D-line. If this probable D-line lies between a node with degree = 3 and a node with degree = 1, it is a D-line; otherwise it is part of a super structure. Once the T-Crosses and D-lines are identified, the writing order can easily be recovered from these structures.
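The top-2 angle test can be illustrated with the sketch below. It rests on one interpretation that the text does not spell out: the "angles around the node" are taken to be the angular gaps between consecutive incident edges, which sum to 360 degrees. The function names and the use of one sample point per edge are likewise assumptions.

```python
import math

THRESHOLD_DEG = 360 / 10  # one tenth of the angle around a point, i.e. 36


def angles_around(node, neighbors):
    """Angular gaps in degrees between consecutive incident edges.

    `neighbors` holds one sample (x, y) point per incident edge; this
    reading of "angles around the node" is an assumption.
    """
    dirs = sorted(
        math.degrees(math.atan2(y - node[1], x - node[0])) % 360
        for x, y in neighbors
    )
    n = len(dirs)
    return [(dirs[(i + 1) % n] - dirs[i]) % 360 for i in range(n)]


def is_t_cross(node, neighbors):
    """True if the two largest gaps differ by at least the threshold."""
    gaps = sorted(angles_around(node, neighbors), reverse=True)
    return gaps[0] - gaps[1] >= THRESHOLD_DEG


# Hypothetical T junction: a straight line with a perpendicular branch.
t_shape = is_t_cross((0, 0), [(1, 0), (-1, 0), (0, -1)])
# Hypothetical symmetric fork, where the two largest gaps are nearly equal.
fork = is_t_cross((0, 0), [(1, 0), (-2, 1), (-2, -1)])
```

In the T junction the gaps are 180, 90 and 90 degrees, so the difference of 90 degrees marks a T-Cross; in the fork the two largest gaps are almost identical, so the node is only a probable D-line and still needs the degree check on the far end of the line.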

The T-Cross is recovered in the same way as a T-Loop. In this case, as for the S-Vertex, the constituent strokes are separated into individual strokes. To recover the writing order of a D-line structure, start with one of the edges other than the D-line, connect the D-line to the selected edge, then reverse the D-line and connect it to the edge obtained so far. Finally, connect the only remaining edge and remove both end points of the D-line from the graph.
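The D-line traversal described above can be sketched as a concatenation of point sequences. The sketch assumes that the edges are already oriented (the first edge ends at the junction, the D-line runs from the junction to its free tip, and the last edge starts at the junction) and that the junction is a single point; `trace_d_line` is a hypothetical name.

```python
def trace_d_line(edge_in, d_line, edge_out):
    """Trace edge_in, the D-line out and back, then edge_out.

    All arguments are lists of (x, y) points. The D-line is walked from
    the junction to its tip and then in reverse, so its interior points
    appear twice in the recovered stroke.
    """
    junction = d_line[0]
    if edge_in[-1] != junction or edge_out[0] != junction:
        raise ValueError("edges must meet the D-line at the junction")
    forward = d_line[1:]        # junction -> tip (junction already emitted)
    backward = d_line[-2::-1]   # tip -> junction (tip not repeated)
    return edge_in + forward + backward + edge_out[1:]


# Hypothetical stroke: horizontal line with a vertical double-traced spur.
stroke = trace_d_line(
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1), (1, 2)],
    [(1, 0), (2, 0)],
)
```

The recovered stroke visits the spur tip (1, 2) once and returns through (1, 1) to the junction before continuing along the remaining edge.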
4) Recovering M-Loop graph structure
The graph structures whose writing order remains to be recovered are M-Loop, L-Loop and S-Touching. All of these super structures have two nodes with degree = 3. The criteria for identifying an M-Loop are:
1. Two neighboring nodes with degree = 3.
2. The number of edges common to both nodes = 2.
3. The D-line obtained should be the same for both nodes.
4. The D-line should be the minimum-length common edge.
A character skeleton that consists of only an M-Loop is shown in Figure 8(a). In the M-Loop graph structure, each node has exactly one edge which is not common to both nodes. To recover the writing order of such a structure, the two edges which are not common to the two nodes are joined through the D-line edge. The D-line edge is then connected with the other common edge in the anti-clockwise direction to obtain the second constituent stroke. Finally, the D-line endpoint that is not the start point of the newly constructed edge is deleted.
In Figure 8(a), the line labelled 1 is connected to the D-line labelled 5, which in turn is connected to the line labelled 4, to obtain the black colored edge in the recovered strokes shown in Figure 8(b). The blue colored edge in Figure 8(b) is obtained by connecting the D-line edge with the other common edge in the anti-clockwise direction and removing the node labelled with a blue colored circle.

Figure 8: (a) Character image with M-Loop (b) the two strokes recovered

5) Recovering L-Loop graph structure
Now we can recover the writing order of the L-Loop. The criteria for identifying an L-Loop structure are:
1. Two neighboring nodes with degree = 3.
2. The only common incident edge of both nodes should be the D-line.
3. One node should have only one incident edge besides the D-line, which should be a loop. Thus the number of incident edges on this node should be equal to 2.
4. The other node should have two distinct edges incident on it besides the D-line. Thus the number of edges incident on this node should be equal to 3.
The skeleton graph with only the L-Loop graph structure is shown in Figure 9(a). The writing order is recovered from such a structure as follows:
1. Choose the node that has two different incident edges besides the D-line.
2. Choose one of the edges as the start edge.
3. Connect the start edge with the D-line.
4. Connect the end point of the loop edge on the other node which is opposite to the end point of the start edge chosen in step 2.
5. Connect the other end point of the loop edge to the D-line.
6. Connect the edge constructed so far to the edge other than the one chosen as the start edge in step 2.
7. Remove both the endpoints of the D-line from the graph.
In Figure 9(a), the line labelled 1 is connected to the D-line labelled 2, which in turn is connected to the line labelled 3; this is then merged with the D-line again and finally with the edge labelled 4, to obtain the black colored edge in the recovered strokes shown in Figure 9(b).

Figure 9: (a) Graph structure with L-Loop (b) the recovered stroke from the graph
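The L-Loop recovery steps reduce to a fixed concatenation once the pieces are oriented. The sketch below assumes that orientation has already been done: the start edge ends at the first node, the D-line runs from the first node to the loop node, the loop starts and ends at the loop node in the chosen direction, and the last edge starts at the first node. The choice of loop direction in step 4 is not modelled, and `trace_l_loop` is a hypothetical name.

```python
def trace_l_loop(start_edge, d_line, loop, end_edge):
    """Trace start edge, D-line, loop, D-line reversed, then the other edge.

    All arguments are lists of (x, y) points with shared endpoints, so
    each later piece drops its first point to avoid duplicates.
    """
    if start_edge[-1] != d_line[0] or end_edge[0] != d_line[0]:
        raise ValueError("start and end edges must meet the D-line at its first point")
    if loop[0] != d_line[-1] or loop[-1] != d_line[-1]:
        raise ValueError("loop must start and end at the far end of the D-line")
    return (
        start_edge
        + d_line[1:]
        + loop[1:]
        + d_line[::-1][1:]
        + end_edge[1:]
    )


# Hypothetical L-Loop: a short D-line leading to a triangular loop.
stroke = trace_l_loop(
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1)],
    [(1, 1), (0, 2), (2, 2), (1, 1)],
    [(1, 0), (2, 0)],
)
```

The result is a single stroke that passes through the D-line twice, once on the way to the loop and once on the way back.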

6) Recovering S-Touching graph structure
The last graph structure, S-Touching, is one in which two nodes with degree = 3 are connected by a D-line and the number of incident edges on each of the nodes is equal to 3. A graph structure with only an S-Touching is shown in Figure 10(a). To recover the writing order, arrange the edges excluding the D-line in either clockwise or anti-clockwise order. Connect the opposite edges through the D-line to obtain the two component edges of the graph. Finally, remove both endpoints of the D-line from the graph.
In Figure 10(a), the edge labelled 1 is connected to the D-line labelled 2, which in turn is connected to the edge labelled 3, to recover the black constituent edge in Figure 10(b). The edge labelled 4 is connected to the D-line labelled 2, which in turn is connected to the edge labelled 5, to recover the blue constituent edge in Figure 10(b).

So, recovering the exact writing order from static
character images is hard, and we focused on recovering the
writing order consistently for all the characters. We
therefore need to impose constraints that take the language
and the writers' habits into account in order to recover
individual characters consistently. Some of the criteria
for selecting the best drawing path for Latin languages, as
proposed by Rousseau et al. (Rousseau, et al., 2005), are:
1. Very curved edges should not be traversed more than
once.
2. Draw the loops in the good direction.
3. The highest down stroke should be maximized.
4. Easiest way to write the character.

Figure 10: (a) Graph structure with S-Touching (b) the
only stroke recovered
After separating the constituent strokes of the
multi-stroke character and recovering the writing order for
each of the strokes using the proposed methods, every
stroke obtained has exactly two endpoints. It is now
necessary to identify the start and end points of each
stroke and to arrange the multiple strokes in the proper
order. These two issues are discussed in the subsequent
sections c and d respectively.
c. Determining start and end points

The identification of start and end points for the
constituent strokes of characters is difficult for several
reasons. First, the writing direction used to produce a
stroke varies from character to character, sometimes in
contradictory ways, as shown in Figure 11. The stroke
corresponding to dheergham (see Figure 11(a)) is written
from top to bottom and left to right, whereas the ethvam
(see Figure 11(b)) is written from bottom to top and right
to left, which is the opposite of the way dheergham is
written. Second, the same character may be written in
different writing orders, as shown in Figure 12. The base
stroke of the character Ka (shown in Figure 12(a)) is
written either from top to bottom and right to left or from
bottom to top and left to right. The character Sunna (0)
shown in Figure 12(b) may be written in the clockwise or
anticlockwise direction.

In their experiments, Rousseau et al. found that criteria 1
and 2 yield better results. Criteria 1-3 are not suitable
for determining start and end points for Telugu characters,
as shown in Figure 13.
Figure 13: The character images (a) Aa (b) Cha


The assumptions considered for determining the start and
end points of Telugu characters are:
1. Characters are written from top to bottom and left to
right.
2. Loops are recovered in the anticlockwise direction.
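
The two assumptions above can be sketched in code as follows. This is a hedged illustration, not the authors' implementation: it assumes a stroke is a list of (x, y) pixel tuples in image coordinates (y grows downward), that a loop is a stroke whose first and last points coincide, and that "top to bottom and left to right" is applied by comparing the two endpoints on y first and x second. The tie-breaking rule and the names signed_area and orient_stroke are assumptions made for this example.

```python
def signed_area(ring):
    """Shoelace sum for a closed ring of (x, y) points.

    In image coordinates (y downward) a positive value means the ring
    is traversed clockwise as seen on screen.
    """
    total = 0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        total += x0 * y1 - x1 * y0
    return total / 2.0


def orient_stroke(points):
    """Return the stroke with its direction chosen by the two assumptions."""
    if len(points) > 2 and points[0] == points[-1]:
        # Closed loop: make it anticlockwise on screen.
        if signed_area(points[:-1]) > 0:
            return points[::-1]
        return points
    # Open stroke: start at the upper endpoint, then at the left one.
    (sx, sy), (ex, ey) = points[0], points[-1]
    if (ey, ex) < (sy, sx):
        return points[::-1]
    return points


open_stroke = orient_stroke([(5, 9), (4, 5), (3, 0)])
loop = orient_stroke([(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)])
```

In the example, the open stroke is reversed so that it starts at its upper endpoint, and the square loop, originally traced clockwise on screen, is reversed to run anticlockwise.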

Figure 11: (a) Dheergham (b) Ethvam

Figure 12: (a) The character Ka (b) Character Sunna

d. Ordering constituent strokes of a character

Once the start and end points have been identified for each
of the strokes of the character, the strokes need to be
ordered. The proposed criterion for ordering the strokes is
as explained below:
1. Calculate the average stroke length.
2. The strokes which are less than the average stroke
length are considered as shorter strokes.
3. The strokes which are greater than or equal to the
average stroke length are considered as longer
strokes.
4. Divide the strokes comprising the character into
shorter and longer strokes based on steps (2)
and (3).
5. Calculate the bounding boxes for all the longer
strokes.
6. The longest stroke with the bounding box on top of
all other longer strokes (i.e. the one whose
bounding-box top-left corner is the least) is
considered as the base stroke.
7. The next stroke selected is the one having the
minimum distance from the end point of the current
stroke to the start point of the next stroke.
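
The ordering steps above can be sketched as follows. This is a minimal illustration under stated assumptions: strokes are lists of (x, y) pixel tuples already oriented from start to end, stroke length is taken as the polyline length, and the base stroke is taken to be the longer stroke whose bounding-box top-left corner is the least (topmost first, then leftmost). The function names and these interpretations are hypothetical and are not taken from the paper.

```python
import math


def stroke_length(stroke):
    """Polyline length of a stroke (assumed length measure)."""
    return sum(math.dist(p, q) for p, q in zip(stroke, stroke[1:]))


def order_strokes(strokes):
    """Return stroke indices in the recovered writing order."""
    lengths = [stroke_length(s) for s in strokes]
    average = sum(lengths) / len(lengths)

    # Steps 2-4: strokes at or above the average length are "longer".
    longer = [i for i, n in enumerate(lengths) if n >= average]
    if not longer:
        # Guard against floating-point rounding when all lengths are equal.
        longer = list(range(len(strokes)))

    def top_left(i):
        # Step 5: top-left corner of the bounding box, compared as (y, x).
        xs = [x for x, _ in strokes[i]]
        ys = [y for _, y in strokes[i]]
        return (min(ys), min(xs))

    # Step 6: base stroke is the topmost longer stroke.
    base = min(longer, key=top_left)
    order = [base]
    remaining = [i for i in range(len(strokes)) if i != base]

    # Step 7: repeatedly pick the stroke whose start is nearest to the
    # end point of the stroke chosen last.
    while remaining:
        end = strokes[order[-1]][-1]
        nxt = min(remaining, key=lambda i: math.dist(end, strokes[i][0]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


strokes = [
    [(0, 10), (10, 10)],   # long horizontal stroke, lower in the image
    [(2, 0), (2, 12)],     # long vertical stroke starting at the top
    [(3, 13), (4, 13)],    # short stroke near the end of stroke 1
    [(11, 10), (11, 11)],  # short stroke near the end of stroke 0
]
order = order_strokes(strokes)
```

Here the average length is 6, so strokes 0 and 1 are the longer strokes, stroke 1 becomes the base stroke, and the remaining strokes follow by nearest start point.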


Figure 14 illustrates the thinned character image and the
corresponding stroke order, numbered from 1 to 4, where 1
indicates the base stroke and 4 indicates the last stroke.

Figure 14: (a) The thinned image (b) The order of strokes
recovered (numbered 1-4)

Figure 15: Example character images with problematic
graph structures
IV. RESULT ANALYSIS
The overall writing order recovery performance of the
proposed approach is 96.94%, i.e. 37,235 sample images are
correctly recovered out of 38,411 samples. Table 1 shows
the individual character-level recovery performance
(recovered samples (RS) against total samples (TS)). The
proposed approach was unable to recover a few samples due
to the poor handwriting of the writers. The problematic
graph structures are enclosed in red circles in the example
character images in Figure 15.
Table 1: The number of samples recovered (RS) out of the
total samples (TS) for each of the classes (CID) considered

CID  RS / TS    CID  RS / TS    CID  RS / TS
000  265/267    001  267/279    002  282/282
003  261/271    004  251/260    005  252/258
006  230/263    008  230/244    009  250/266
010  262/271    011  277/281    012  267/272
013  262/265    014  262/272    015  261/276
016  265/265    017  270/276    018  278/279
019  235/268    020  269/269    021  267/274
022  260/262    023  278/278    024  253/274
025  264/269    026  265/274    027  270/280
028  270/279    029  255/264    030  272/272
031  277/278    032  260/272    033  270/282
034  268/278    035  277/277    036  251/271
037  259/270    038  305/314    039  274/278
040  260/270    041  241/267    042  270/276
043  276/280    044  270/282    045  271/271
046  222/266    047  278/278    048  265/278
049  277/277    050  253/270    051  257/274
052  272/273    053  273/276    054  272/277
055  279/279    056  278/279    057  280/280
058  277/277    059  277/278    060  269/270
061  277/277    062  272/272    063  270/270
064  268/268    065  241/266    066  281/283
067  257/257    068  250/268    069  266/272
070  262/270    071  290/291    072  253/257
073  275/276    074  244/260    075  286/296
076  263/270    077  273/273    079  261/272
081  260/262    082  260/261    083  273/274
084  275/275    085  278/279    086  256/265
087  287/287    088  235/270    089  276/276
090  257/270    091  265/269    092  247/262
093  235/257    094  242/277    095  275/287
096  266/271    097  255/261    098  266/273
099  263/265    100  261/269    101  258/266
102  256/264    103  260/276    104  256/268
105  266/272    106  265/269    107  275/277
108  278/278    109  269/272    110  264/270
111  260/265    112  256/264    113  254/260
114  269/282    115  271/273    116  275/276
117  269/273    118  261/265    119  255/274
120  274/283    121  264/277    122  258/268
123  262/277    124  266/280    125  260/269
126  268/280    127  254/272    128  237/260
129  282/292    130  263/269    131  232/268
132  242/262    133  274/274    134  271/274
135  278/278    136  246/274    137  254/267
138  295/307    141  229/239    142  274/274
160  275/275    163  265/265    164  266/267
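
As a small arithmetic check, the recovery percentage reported above follows directly from the RS and TS counts. The sketch below recomputes the overall figure from the two totals stated in the text and the per-class rate for three rows copied from Table 1. The helper name recovery_rate and the choice of rows are only illustrative.

```python
def recovery_rate(recovered, total):
    """Recovery performance in percent, rounded to two decimals."""
    return round(100.0 * recovered / total, 2)


# Overall totals as stated in the text.
overall = recovery_rate(37235, 38411)  # 96.94

# A few (RS, TS) rows taken from Table 1.
sample_rows = {"000": (265, 267), "006": (230, 263), "016": (265, 265)}
per_class = {cid: recovery_rate(rs, ts) for cid, (rs, ts) in sample_rows.items()}
```

Per-class rates computed this way make it easy to see which classes fall well below the overall figure, such as class 006 in the rows above.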

V. CONCLUSIONS AND FUTURE ENHANCEMENTS


We proposed an approach to recover the writing order from
static images so that the same online recognizer may be
used to identify the class of an offline character. The
skeleton graph is constructed from the thinned pattern of
the character image. We separated the constituent strokes
and then recovered the writing order of each of the
strokes. Finally, the strokes are arranged in the order in
which they are normally written. The ordering requires
identification of the start and end points of each stroke.
With the proposed approach, we recovered the writing order
for around 97% of the static images, for which the
corresponding online data are also available, supplied by
HP Laboratories (HP Research Labs, Bangalore, 2009).
REFERENCES
[1] Ananda, K. K., Jagadeeswararao, E. & Valurouthu, K. P., 2014.
    Robust and Efficient Fully Parallel 2D Thinning Algorithm.
    International Journal of Computer Applications, 85(5), pp. 1-6.
[2] Cordella, L. P., Claudio, D. S., Marcelli, A. & Santoro, A., 2010.
    Writing Order Recovery from Off-Line Handwriting by Graph
    Traversal. Istanbul, IEEE, pp. 1896-1899.
[3] HP Research Labs, Bangalore, 2009. Lipi Toolkit Project. [Online]
    Available at: http://lipitk.sourceforge.net/
[4] HP Research Labs, Bangalore, 2009. Telugu Character Dataset.
    [Online] Available at:
    http://lipitk.sourceforge.net/datasets/teluguchardata.htm
[5] Huang, T. & Yasuhara, M., 1995. A total stroke SLALOM method
    for searching for the optimal drawing order of off-line handwriting.
    Vancouver, BC, IEEE (IEEE Transactions on Systems, Man and
    Cybernetics), pp. 2789-2794.
[6] In-Jung, K. & Jin-Hyung, K., 2003. Statistical Character Structure
    Modeling and Its Application to Handwritten Chinese Character
    Recognition. IEEE Transactions on Pattern Analysis and Machine
    Intelligence, 25(11), pp. 1422-1436.
[7] Iwakiri, Y., Shiraishi, S., Yaokai, F. & Uchida, S., 2012. On the
    possibility of instance-based stroke recovery. Bari, IEEE, pp. 29-34.
[8] Lau, K. K., Pong, C. Y. & Tang, Y. Y., 2002. Stroke extraction and
    stroke sequence estimation on signatures. s.l., IEEE, pp. 119-112.
[9] Lee, S. & Pan, J. C., 1992. Offline tracing and representation of
    signatures. IEEE Transactions on Systems, Man and Cybernetics,
    22(4), pp. 755-771.
[10] Nagoya, T. & Fujioka, H., 2012. Recovering Dynamic Stroke
    Information of Multi-stroke Handwritten Characters with Complex
    Patterns. Bari, IEEE, pp. 722-727.
[11] Nagoya, T. & Hiroyuki, F., 2011. A Graph Theoretic Algorithm for
    Recovering Drawing Order of Multi-stroke Handwritten Images.
    Fukuoka, IEEE, pp. 569-574.
[12] Nagoya, T. & Hiroyuki, F., 2011. Recovering drawing order from
    static handwritten images using probabilistic tabu search. Bali,
    IEEE, pp. 379-384.
[13] Nguyen, V. & Blumenstein, M., 2010. Techniques for static
    handwriting trajectory recovery: a survey. s.l., ACM, pp. 463-470.
[14] Qiao, Y., Nishiara, M. & Yasuhara, M., 2005. A Novel Approach to
    Recover Writing order from Single Stroke Offline Handwritten
    Images. s.l., IEEE, pp. 227-231.
[15] Qiao, Y. & Yasuhara, M., 2004. Recovering dynamic information
    from static handwritten images. s.l., IEEE, pp. 118-123.
[16] Qiao, Y. & Yasuhara, M., 2006. Recover Writing Trajectory from
    Multiple Stroked Image Using Bidirectional Dynamic Search. Hong
    Kong, IEEE, pp. 970-973.
[17] Rousseau, L., Anquetil, E. & Jean, C., 2005. Recovery of a
    Drawing Order from Off-Line Isolated Letters Dedicated to
    On-Line Recognition. Seoul, Korea, s.n., pp. 1121-1125.
[18] Taylor, J. Fleury's Algorithm. [Online] Available at:
    http://www.it.brighton.ac.uk/staff/jt40/mm322/MM322_FleurysAlgorithm.pdf
[19] Yoshiharu, K. & Makoto, Y., 2000. Recovery of Drawing Order
    from Single-Stroke Handwriting Images. IEEE Transactions on
    Pattern Analysis and Machine Intelligence, 22(9), pp. 938-949.
