profile analysis
Shutao Li a,*, Qinghua Shen a, Jun Sun b
a College of Electrical and Information Engineering, Hunan University, Changsha 410082, China
b Fujitsu R&D Center Co., Ltd., Eagle Run Plaza B1003, Xiaoyun Road No. 26, Chaoyang District, Beijing 100084, China
Received 22 February 2006; received in revised form 22 September 2006
Available online 28 November 2006
Communicated by A.M. Alimi
Abstract
In this paper, a novel document skew detection algorithm based on wavelet decompositions and projection profile analysis is proposed. First, the skewed document images are decomposed by the wavelet transform. The matrix containing the absolute values of the horizontal sub-band coefficients, which preserves the text's horizontal structure, is then rotated through a range of angles. A projection profile is computed at each angle, and the angle that maximizes a criterion function is regarded as the skew angle. Experimental results show that this algorithm performs well on document images of various layouts and is also robust to different languages. The effects of various wavelet bases, the number of decomposition levels, and the parameters of the criterion function are also investigated.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Skew detection; Document analysis; Projection profile analysis; Wavelet transform
1. Introduction
Document skew detection is necessary for most document analysis systems, and many methods have been developed. Existing methods typically use: (1) projection profile analysis (Bloomberg and Kopec, 1993; Bloomberg et al., 1995; Ishitani, 1993; Liolios et al., 2002; Postl, 1986); (2) nearest neighbors (Jiang et al., 1999; Liolios et al., 2001; Lu and Tan, 2003); (3) Hough transform (Amin and Fischer, 2000; Yu and Jain, 1996; Ham et al., 1994); (4) mathematical morphology (Das and Chanda, 2001; Najman, 2004); (5) cross-correlations (Akiyama and Hagita, 1990; Yan, 1993; Chaudhuri and Chaudhuri, 1997; Chen and Ding, 1999; Gatos et al., 1997).
The traditional projection profile (PJ) based approach for skew detection was proposed by Postl (1986). First, the input document is rotated through a range of angles and a projection profile is calculated at each angle. Features are then extracted from each projection profile to determine the skew angle. This is computationally expensive because it is performed directly on the original document image. Moreover, it is sensitive to the layout of the document image.
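The angle search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Postl's implementation: the function names are hypothetical, the criterion (sum of squared differences between adjacent profile bins) is an assumed choice of feature, and instead of resampling the image at each angle the sketch bins each foreground pixel by its rotated row coordinate.

```python
import math

def projection_profile_at(points, angle_deg):
    """Count foreground pixels per row after rotating them by angle_deg.

    points is a list of (x, y) pixel coordinates. Each point's rotated row
    coordinate is rounded and binned directly (a simplification that avoids
    interpolating a rotated image).
    """
    theta = math.radians(angle_deg)
    rows = [round(y * math.cos(theta) - x * math.sin(theta)) for x, y in points]
    low = min(rows)
    profile = [0] * (max(rows) - low + 1)
    for r in rows:
        profile[r - low] += 1
    return profile

def criterion(profile):
    """Assumed feature: sum of squared differences between adjacent bins."""
    return sum((b - a) ** 2 for a, b in zip(profile, profile[1:]))

def estimate_skew(points, angles):
    """Return the candidate angle whose profile maximizes the criterion."""
    return max(angles, key=lambda a: criterion(projection_profile_at(points, a)))

# Synthetic page: three "text lines", each tilted by 5 degrees.
slope = math.tan(math.radians(5))
points = [(x, round(y0 + x * slope)) for y0 in (0, 20, 40) for x in range(200)]
estimate = estimate_skew(points, range(-10, 11))
```

Every candidate angle requires a pass over all foreground pixels, which is why the cost grows with both the image size and the number of angles tested.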
An improved projection profile based approach was proposed by Bloomberg and Kopec (1993). The original document image is down-sampled before the projection profile is computed, and all subsequent operations are performed on the sampled image. The amount of image data to be processed is therefore smaller, and the computational cost is reduced significantly. However, a major weakness is that its detection accuracy is influenced by the document image layout. It often fails on document images with multiple font styles/sizes or those that contain a large amount of non-text regions (such as pictures, tables or graphics).
The second class of skew detection methods is based
on the nearest neighbors (Jiang et al., 1999; Liolios et al.,
2001; Lu and Tan, 2003). Here, the angle between each
0167-8655/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.patrec.2006.10.002
* Corresponding author. Tel.: +86 731 8672916; fax: +86 731 8822224.
E-mail addresses: shutao_li@hnu.cn, shutao_li@yahoo.com.cn (S. Li).
www.elsevier.com/locate/patrec
Pattern Recognition Letters 28 (2007) 555–562
http://www.paper.edu.cn
filters smooth the image while the highpass filters look for detailed information in the image.
As shown in Fig. 1, when 2D-DWT is implemented on an image, four frequency bands (LL, LH, HL and HH) are obtained. Among these four sub-bands, the LL sub-band corresponds to an approximation of the original document image, the LH sub-band provides details in the horizontal direction, the HL sub-band provides details in the vertical direction, while the HH sub-band provides details in the diagonal direction. Fig. 2 gives a document image and its first level decomposition result using the symlets wavelet.
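To make the four sub-bands concrete, the following sketch computes a one-level 2D decomposition in plain Python. It uses the Haar wavelet rather than the symlets wavelet of Fig. 2, purely because Haar fits in a few lines. The function name is hypothetical, and the sub-band naming is an assumption chosen to match the description above: LH is taken as lowpass along each row and highpass across rows, so that it responds to horizontal structure.

```python
def haar_dwt2(image):
    """One-level 2D Haar DWT of an image with even height and width.

    image is a list of equal-length rows of numbers. Returns (LL, LH, HL, HH),
    each half the size of the input in both directions.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    LL, LH, HL, HH = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top pair of the 2x2 block
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom pair
            ll_row.append((a + b + d + e) / 2)  # approximation
            lh_row.append((a + b - d - e) / 2)  # change between rows (horizontal structure)
            hl_row.append((a - b + d - e) / 2)  # change between columns (vertical structure)
            hh_row.append((a - b - d + e) / 2)  # diagonal detail
        LL.append(ll_row)
        LH.append(lh_row)
        HL.append(hl_row)
        HH.append(hh_row)
    return LL, LH, HL, HH

# A 4 x 4 image containing a single horizontal stroke on row 1.
stripe = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
LL, LH, HL, HH = haar_dwt2(stripe)
```

For this input the stroke appears only in the LL and LH outputs, while HL and HH are zero everywhere, which illustrates why a horizontally oriented sub-band is a natural choice for text lines.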
3. Projection profile analysis
A popular method for skew detection uses the horizontal projection profile because the text in most document images is aligned along horizontal lines. When the horizontal projection profile is applied to an M × N image, a column vector of size M × 1 is obtained. Each element of this column vector is the sum of the pixel values in the corresponding row of the document image. Examples of the projection profiles of an unskewed and a skewed image are shown in Fig. 3. As can be seen, the peaks in Fig. 3(c), which corresponds to the horizontal projection profile of the unskewed image, are taller than those in Fig. 3(d), which corresponds to the horizontal projection profile of the skewed image. In fact, the peaks in Fig. 3(c) average around 170 while the peaks in Fig. 3(d) average around 80. Based on this significant difference, the skew angle can be estimated.
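As a small illustration of the definition above, the sketch below computes the horizontal projection profile of an image stored as a list of rows. The function name and the toy images are hypothetical and are not taken from the experiments.

```python
def horizontal_projection_profile(image):
    """Return the M row sums of an M x N image given as a list of rows."""
    return [sum(row) for row in image]

# A toy 4 x 5 binary image: rows 1 and 2 hold "text" pixels.
image = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
profile = horizontal_projection_profile(image)

# The same six pixels laid out as a horizontal stroke and as a diagonal one.
straight = [[1] * 6 if r == 2 else [0] * 6 for r in range(6)]
slanted = [[1 if c == r else 0 for c in range(6)] for r in range(6)]
straight_peak = max(horizontal_projection_profile(straight))
slanted_peak = max(horizontal_projection_profile(slanted))
```

The horizontal stroke concentrates all of its pixels in one profile bin, whereas the diagonal stroke spreads them over six bins. This is the same effect, in miniature, as the taller peaks of the unskewed image in Fig. 3.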
4. The proposed algorithm
The proposed algorithm is based on the wavelet transform and the horizontal projection profile (Fig. 4):
1. If the input document image is not a gray one, transform it into a gray-scale image, denoted I_g.
2. Decompose I_g with 2D-DWT. Then, four frequency sub-bands (LL, LH, HL and HH) are obtained. Here, the LH sub-band is selected because it preserves the horizontal structure of the document image.
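Steps 1 and 2 can be sketched as below. Several details are assumptions made only for illustration: the luminance weights used for the gray-scale conversion, the use of a one-level Haar filter in place of whichever wavelet basis is actually chosen, the convention that LH is the difference between adjacent rows, and the function names themselves. The absolute value follows the description in the abstract.

```python
def to_gray(rgb_image):
    """Step 1: convert rows of (r, g, b) tuples to gray levels.

    The 0.299 / 0.587 / 0.114 luminance weights are an assumed choice.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in rgb_image]

def abs_horizontal_subband(gray):
    """Step 2: absolute values of the LH sub-band of a one-level Haar DWT.

    Each output value compares the top and bottom pixel pairs of a 2 x 2
    block, so it is large where the image changes from one row to the next.
    """
    out = []
    for r in range(0, len(gray) - 1, 2):
        top, bottom = gray[r], gray[r + 1]
        out.append([abs(top[c] + top[c + 1] - bottom[c] - bottom[c + 1]) / 2
                    for c in range(0, len(top) - 1, 2)])
    return out

# A 4 x 4 white page with one black horizontal stroke on row 1.
white, black = (255, 255, 255), (0, 0, 0)
page = [[white] * 4, [black] * 4, [white] * 4, [white] * 4]
I_g = to_gray(page)
lh_abs = abs_horizontal_subband(I_g)
```

The resulting matrix is half the size of I_g in each direction, so any later angle search operates on roughly a quarter of the original data.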
Fig. 3. Projection profiles of unskewed and skewed document images. (a) Unskewed document image, (b) document image rotated by 6°, (c) horizontal projection profile of (a), (d) horizontal projection profile of (b).
S. Li et al. / Pattern Recognition Letters 28 (2007) 555–562 557
papers by Chou et al. (2007). For PJ, the method proposed by Postl (1986) is implemented. For TC, the method proposed by Chen and Wang (2000) is implemented. CC stands for the method proposed by Chaudhuri and Chaudhuri (1997), and PCP stands for the method proposed by Chou et al. (2007).
The estimation error is usually used to evaluate the effectiveness of the skew detection method. It is defined as the
Fig. 5. Examples of the five categories of documents. (a) 1st category, (b) 2nd category, (c) 3rd category, (d) 4th category, (e) 5th category.