[2] Sauvola's method
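Sauvola's method is a modified version of Niblack's approach. The threshold is calculated from
the local mean and the dynamics of the local standard deviation by the following formula:
$$T(x,y) = m(x,y)\left[1 + k\left(\frac{s(x,y)}{R} - 1\right)\right]$$
where m(x,y) and s(x,y) are the mean and standard deviation of the gray values in a window
centered on (x,y), R is the dynamic range of the standard deviation (128 for 8-bit images), and
k is a positive parameter (0.5 in Sauvola and Pietikainen's paper).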
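A minimal sketch of Sauvola's thresholding in plain Python follows. The function names, the 3x3 window (radius=1), and the values k = 0.5 and R = 128 are assumptions chosen for illustration and are not taken from this text; a practical implementation would use an array library and a larger window.

```python
from statistics import fmean, pstdev


def sauvola_threshold(mean, std, k=0.5, R=128.0):
    """Local threshold T = m * (1 + k * (s / R - 1))."""
    return mean * (1.0 + k * (std / R - 1.0))


def sauvola_binarize(image, radius=1, k=0.5, R=128.0):
    """Return a binary image: 1 marks text (dark) pixels, 0 marks background.

    `image` is a list of rows of gray values (0 = black, 255 = white).
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Gather the local window, clipped at the image borders.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            t = sauvola_threshold(fmean(window), pstdev(window), k, R)
            result[y][x] = 1 if image[y][x] < t else 0
    return result


# Toy input: one dark pixel on a bright, uniform page.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 20
binary = sauvola_binarize(page)
```

In a uniform window the standard deviation is zero, so the threshold falls to m(1 - k) and bright background pixels stay well above it; this is the behaviour that suppresses background noise.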
[3] Gatos et al.'s method
It uses five distinct steps: preprocessing, rough estimation of the foreground regions, approximate
background calculation, final thresholding, and post-processing. An adaptive Wiener filter is
usually used to preprocess the source grayscale document image. A rough estimate of the text
regions is computed with Sauvola's method, which gives a superset of the correct text regions. An
approximate background surface is calculated from the filtered grayscale image by interpolating
neighboring background values across the pixels that the rough estimate marks as text. Final
thresholding is performed adaptively by comparing the distance between each pixel and its
background against a threshold that depends on the background.
Finally, post-processing is performed using shrink and swell filtering.
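The background estimation and final thresholding steps can be sketched as follows. This is an illustrative simplification under stated assumptions: the background at a text pixel is taken as the plain average of the neighboring non-text pixels, the window radius is arbitrary, and the distance function d is supplied by the caller (the d used in the usage lines is a hypothetical stand-in, not the function defined by Gatos et al.). Only the contrast test B(x,y) - I(x,y) > d(B(x,y)) comes from the block diagram.

```python
def estimate_background(image, text_mask, radius=1):
    """Approximate the background surface B(x,y).

    Non-text pixels are copied from the image; text pixels (mask == 1)
    are replaced by the average of the non-text pixels around them.
    """
    height, width = len(image), len(image[0])
    background = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if not text_mask[y][x]:
                continue  # background pixel: keep the original value
            neighbours = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
                if not text_mask[j][i]
            ]
            if neighbours:  # leave the pixel unchanged if no background is nearby
                background[y][x] = sum(neighbours) / len(neighbours)
    return background


def final_threshold(image, background, d):
    """Mark a pixel as text (1) when B(x,y) - I(x,y) > d(B(x,y))."""
    return [
        [1 if b - i > d(b) else 0 for i, b in zip(image_row, background_row)]
        for image_row, background_row in zip(image, background)
    ]


# Usage on a toy image: one dark pixel, already flagged by the rough estimate.
image = [[200, 200, 200], [200, 20, 200], [200, 200, 200]]
rough_mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
background = estimate_background(image, rough_mask)
# Hypothetical distance function: half of the local background level.
binary = final_threshold(image, background, d=lambda b: 0.5 * b)
```

Because the threshold d depends on the background value, the same absolute contrast can count as text on a dark, stained region and as noise on a bright one.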
Fig. 1: Block diagram of poor-quality document image binarization
[Block diagram: I_s(x,y), gray-scale source image -> Wiener filter -> I(x,y), gray-scale image after pre-processing -> Sauvola's adaptive thresholding -> S(x,y), intermediate B/W image -> Interpolation -> B(x,y), gray-scale background surface image -> Examine pixel contrast, B(x,y) - I(x,y) > d(B(x,y)) -> T(x,y), final image -> Post-processing -> T(x,y), final image after post-processing]
[4] Halabi et al.'s method
It uses the same steps as the previous method, but a Gaussian filter is used in the preprocessing
step to reduce the noise in the grayscale image.
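As a rough illustration of Gaussian preprocessing, the sketch below smooths an image with a 3x3 binomial kernel, a common small approximation of a Gaussian. The kernel size, the border handling (edge replication), and the function name are assumptions; the text does not give the filter parameters used by Halabi et al.

```python
# 3x3 binomial approximation of a Gaussian kernel; the weights sum to 16.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def gaussian_smooth(image):
    """Smooth a gray-scale image (list of rows) with the 3x3 kernel."""
    height, width = len(image), len(image[0])
    smoothed = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Replicate border pixels by clamping the coordinates.
                    j = min(max(y + dy, 0), height - 1)
                    i = min(max(x + dx, 0), width - 1)
                    total += KERNEL[dy + 1][dx + 1] * image[j][i]
            smoothed[y][x] = total / 16.0
    return smoothed


# A sharp vertical edge between black (0) and white (255).
edge = [[0, 0, 255, 255] for _ in range(3)]
blurred = gaussian_smooth(edge)
# The middle row becomes [0.0, 63.75, 191.25, 255.0]: the step is spread out.
```

The same averaging that suppresses noise also turns a one-pixel step into a gradual ramp, which is the edge blurring that Gaussian preprocessing introduces.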
Fig. 3: Block diagram of Halabi et al.'s method
Experimental Results and Conclusion
Niblack's method binarizes well in regions close to the text, but leaves a large amount of
noise in the non-text regions. Sauvola's method overcomes this problem by taking into account
the dynamics of the standard deviation of the grayscale pixel values. However, its performance
drops as the degradation becomes more severe. The method proposed by Gatos et al. can cope with
degradations in the background surface by using the distance between the text and the background
surface as the threshold criterion.
Halabi et al.'s method also performs well, but edges are blurred by the
Gaussian filter.
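The difference between Niblack's and Sauvola's thresholds on a text-free region can be checked with a small worked example. Niblack's threshold is T = m + k s; the value k = -0.2 used here, the Sauvola parameters k = 0.5 and R = 128, and the sample window are assumptions chosen for illustration, not values from the experiments.

```python
from statistics import fmean, pstdev


def niblack_threshold(mean, std, k=-0.2):
    """Niblack: T = m + k * s."""
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, R=128.0):
    """Sauvola: T = m * (1 + k * (s / R - 1))."""
    return mean * (1.0 + k * (std / R - 1.0))


# A bright background window with mild noise and no text in it.
window = [195, 205, 198, 202, 200, 196, 204, 199, 201]
mean, std = fmean(window), pstdev(window)
pixel = 198  # a slightly darker background pixel

# Niblack's threshold sits just below the mean, so the pixel is labelled text.
niblack_says_text = pixel < niblack_threshold(mean, std)
# Sauvola's threshold drops to roughly half the mean when the deviation is small.
sauvola_says_text = pixel < sauvola_threshold(mean, std)
```

With this window the mean is 200 and the standard deviation is about 3.2, so Niblack's threshold is about 199.4 while Sauvola's is about 102.5; only Niblack's rule misclassifies the background pixel as text.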
References
1. B. Gatos, I. Pratikakis, S. J. Perantonis, Adaptive Degraded Image Binarization,
Journal of Pattern Recognition, 39 (2006)
2. J. Sauvola, M. Pietikainen, Adaptive Document Image Binarization, Pattern
Recognition, 33 (2000)
3. Basilios Gatos, Ioannis Pratikakis, Stavros J. Perantonis, An Adaptive Binarization
Technique for Low Quality Historical Documents (2004)
4. Yahia S. Halabi, Zaid SA, Faris Hamdan, Khaled Haj Yousef, Modeling Adaptive
Degraded Document Image Binarization and Optical Character System (2009)
5. W. Niblack, An Introduction to Digital Image Processing, Prentice-Hall, Englewood
Cliffs, NJ, 1986
6. ABBYY (www.finereader.com).
[Block diagram for Fig. 3: I_s(x,y), gray-scale source image -> Gaussian filter -> I(x,y), gray-scale image after pre-processing -> Sauvola's adaptive thresholding -> S(x,y), intermediate B/W image -> Interpolation -> B(x,y), gray-scale background surface image -> Examine pixel contrast, B(x,y) - I(x,y) > d(B(x,y)) -> T(x,y), final image -> Post-processing -> T(x,y), final image after post-processing]