
1. Motivation

Document image binarization is usually performed in the preprocessing stage of document image processing applications such as optical character recognition (OCR) and document image retrieval. It converts a gray-scale document image into a binary one and thereby facilitates subsequent tasks such as document skew estimation and document layout analysis. As more and more documents are scanned, fast and accurate document image binarization is becoming increasingly important. Although document image binarization has been studied for many years, the thresholding of degraded document images remains an unsolved problem, because modeling the document foreground and background is very difficult under the various types of document degradation such as uneven illumination, image contrast variation, bleed-through, and smear. We aim to develop robust and efficient document image binarization techniques that produce good results for badly degraded document images.

Figure 1. Two badly degraded document images.

2. Document Image Binarization Using Background Estimation [1]

This algorithm achieved the top performance among the 43 algorithms submitted to DIBCO 2009. It makes use of the document background and the text stroke edge information. In particular, it first estimates a document background surface through a one-dimensional iterative polynomial smoothing procedure. Text stroke edges are then detected by combining the local image variation with the estimated background surface. Next, the document text is segmented using local thresholds estimated from the detected stroke edge pixels. Finally, a series of post-processing operations further improve the binarization result.

The estimated background surface compensates for certain types of degradation, such as the uneven illumination that frequently appears in document images but that global thresholding cannot handle properly because of the lack of a bimodal histogram pattern. At the same time, using the text stroke edges to estimate the local threshold overcomes the limitations of many existing adaptive thresholding methods, such as window-based methods that rely heavily on the window size, as well as more complex document thresholding methods that estimate the local threshold by combining different types of image information. The paper was published in IJDAR.

Figure 2. Document background estimation through iterative polynomial smoothing. (a) The intensity of one image row (blue graph) and the fitted initial smoothing polynomial (black bold graph); (b) the final smoothing polynomial (black bold graph) after multiple rounds of smoothing of the image row.

3. Document Image Binarization Using Local Maximum and Minimum [2]

This technique makes use of the image contrast defined by the local image maximum and minimum. Compared with the image gradient, this contrast has the nice property of being more tolerant to uneven illumination and other types of document degradation such as smear, and is therefore more capable of detecting the high-contrast pixels lying around text stroke boundaries in historical documents. Compared with Lu and Tan's method used in the DIBCO contest, the proposed method also handles document images with complex background variation better.

Given a historical document image, the technique first builds a contrast image based on the local maximum and minimum. The high-contrast pixels around the text stroke boundaries are then detected through global thresholding of the contrast image. Lastly, the document image is binarized using local thresholds estimated from the detected high-contrast pixels. Unlike previous contrast-based methods, this method uses the image contrast to identify the text stroke boundary, which produces more accurate binarization results. The technique has been tested on the dataset used in the Document Image Binarization Contest (DIBCO) 2009, and experiments show its superior performance. The paper appeared at DAS 2010.
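As a concrete illustration, the contrast measure can be sketched as below. This is an assumption-laden sketch, not the paper's code: it assumes a 3x3 neighborhood and normalizes the max-min difference by the max-min sum, which keeps the measure stable under slowly varying background brightness; see [2] for the exact definition.

```python
import numpy as np

def contrast_image(img, eps=1e-6):
    """Local max-min contrast: (max - min) / (max + min + eps) over 3x3 windows."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    pad = np.pad(img, 1, mode='edge')
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = pad[i:i + 3, j:j + 3]
            fmax, fmin = win.max(), win.min()
            # Dividing by (max + min) keeps bright and dark regions comparable.
            out[i, j] = (fmax - fmin) / (fmax + fmin + eps)
    return out

img = np.full((5, 5), 200.0)  # bright page
img[2, 2] = 40.0              # one dark stroke pixel
c = contrast_image(img)       # high contrast only around the stroke
```

The stroke-boundary pixels are then obtained by global thresholding of the contrast image (e.g. with Otsu's method), and the local thresholds are estimated from them.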

Figure 3. (a) The traditional image gradient obtained using Canny's edge detector; (b) the image contrast obtained using the local maximum and minimum; (c) one column of the image gradient in (a) (shown as a vertical white line); (d) the same column of the contrast image in (b).

4. A Self-training Learning Document Binarization Framework [3]

The proposed framework divides the image pixels into three categories based on the output of a given binarization method. It first clusters the confidently labeled foreground and background pixels into different classes using the k-means algorithm and labels each cluster as foreground or background. The uncertain pixels are then assigned the label of their nearest cluster. The framework thus treats binarization as a learning problem; it has been tested on the dataset used in the recent DIBCO contest. Our main contribution is a self-training learning framework that divides document pixels into three categories and accordingly poses document image binarization as a learning problem. Currently the uncertain pixels are simply classified by nearest neighbor; better binarization performance may be achieved with more sophisticated learning and classification methods, which we will study in future work.

This paper was published in ICPR 2010.
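A heavily simplified sketch of this self-training idea follows. The assumptions here are ours for illustration: gray intensity as the only feature, one centroid per class instead of full k-means, and a fixed margin around a global threshold to mark pixels as uncertain; the actual framework in [3] is richer on all of these points.

```python
def self_training_binarize(pixels, threshold, margin):
    """Label confident pixels from a base threshold, then assign uncertain
    pixels to the nearer class centroid (1 = text, 0 = background)."""
    fg = [p for p in pixels if p < threshold - margin]  # confident text
    bg = [p for p in pixels if p > threshold + margin]  # confident background
    fg_c = sum(fg) / len(fg)  # class "centroids" from the confident samples
    bg_c = sum(bg) / len(bg)
    def label(p):
        if p < threshold - margin:
            return 1
        if p > threshold + margin:
            return 0
        return 1 if abs(p - fg_c) < abs(p - bg_c) else 0  # uncertain pixel
    return [label(p) for p in pixels]

# Pixel 120 falls inside the uncertain band and is resolved by the centroids.
labels = self_training_binarize([30, 35, 120, 180, 200, 210],
                                threshold=128, margin=20)
```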

Figure 4. The F-measure values of ten document images in the DIBCO 2009 dataset. The blue line denotes the results produced by the original methods; the red line denotes the improved results produced by our proposed framework.

5. Combination of Document Image Binarization Techniques [4]

The proposed framework divides the image pixels into three categories based on the binary results of given document binarization methods. All pixels are then projected into a feature space. The pixels in the foreground and background sets can be viewed as correctly labeled samples and used to determine the labels of the uncertain pixels: a classifier iteratively classifies the uncertain pixels into foreground and background. Experiments on the DIBCO 2009 and H-DIBCO 2010 datasets show that the proposed framework improves the reported binarization methods significantly. Our main contribution is a framework that combines different binarization methods to produce better results. Instead of designing a new binarization method, we apply a self-training strategy to existing methods, which improves not only their performance but also their robustness on different kinds of degraded document images. Better performance may be achieved with more sophisticated learning and classification methods; this issue will be investigated in our future work. This paper was published in ICDAR 2011.
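The combination step can be sketched as follows. Again this is a simplified sketch, not the paper's implementation: gray intensity stands in for the feature space, and a single nearest-centroid pass replaces the iterative classifier of [4]. Pixels on which the two input binarizations agree serve as the labeled samples; the disagreements are the uncertain pixels to be classified.

```python
def combine_binarizations(gray, bin_a, bin_b):
    """Combine two binary maps (1 = text): keep agreements, and resolve
    disagreements by the nearer class centroid in gray-level space."""
    fg = [g for g, a, b in zip(gray, bin_a, bin_b) if a == 1 and b == 1]
    bg = [g for g, a, b in zip(gray, bin_a, bin_b) if a == 0 and b == 0]
    fg_c = sum(fg) / len(fg)
    bg_c = sum(bg) / len(bg)
    out = []
    for g, a, b in zip(gray, bin_a, bin_b):
        if a == b:
            out.append(a)  # both methods agree: trust the shared label
        else:
            out.append(1 if abs(g - fg_c) < abs(g - bg_c) else 0)
    return out

gray  = [30, 40, 100, 190, 200]
bin_a = [1, 1, 1, 0, 0]   # output of method A
bin_b = [1, 1, 0, 0, 0]   # output of method B (disagrees on pixel 2)
combined = combine_binarizations(gray, bin_a, bin_b)
```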

Figure 5. The flowchart of combining two binarization results.

6. Document Binarization Contests and Results

Figure 6. Binarization results for the images in Figure 1. The first column shows the results generated by the Background Estimation method; the second column shows the results generated by the Local Maximum and Minimum method.

Table 1 compares the performance of the two methods using four metrics: F-Measure, PSNR, NRM, and MPM (a detailed description of the four metrics can be found here).

Table 1. Evaluation results of the Background Estimation and Local Maximum and Minimum methods on the DIBCO 2009, H-DIBCO 2010, and DIBCO 2011 datasets.

Method                      Dataset        F-Measure(%)  PSNR   NRM(x10^-2)  MPM(x10^-3)
Background Estimation       DIBCO 2009     91.24         18.6   4.31         0.5
Background Estimation       H-DIBCO 2010   86.41         18.14  -            -
Background Estimation       DIBCO 2011     81.67         15.59  -            -
Local Maximum and Minimum   DIBCO 2009     91.06         18.5   -            -
Local Maximum and Minimum   H-DIBCO 2010   85.49         17.83  -            -
Local Maximum and Minimum   DIBCO 2011     85.56         16.75  -            -
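For reference, two of the four metrics are easy to state exactly. The sketch below computes F-Measure and PSNR for flattened binary images with pixel values in {0, 1} (so the PSNR constant C, the foreground/background difference, is 1 here); NRM and MPM are omitted for brevity.

```python
import math

def f_measure(result, truth):
    """Harmonic mean of recall and precision over text pixels (value 1)."""
    tp = sum(r == 1 and t == 1 for r, t in zip(result, truth))
    fp = sum(r == 1 and t == 0 for r, t in zip(result, truth))
    fn = sum(r == 0 and t == 1 for r, t in zip(result, truth))
    recall, precision = tp / (tp + fn), tp / (tp + fp)
    return 2 * recall * precision / (recall + precision)

def psnr(result, truth, c=1.0):
    """PSNR = 10 * log10(C^2 / MSE); higher means closer to the ground truth."""
    mse = sum((r - t) ** 2 for r, t in zip(result, truth)) / len(result)
    return 10 * math.log10(c * c / mse)

truth  = [1, 1, 0, 0, 1, 0, 0, 0]   # ground-truth binary image, flattened
result = [1, 1, 0, 0, 0, 0, 1, 0]   # one missed and one spurious text pixel
```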

The recent Document Image Binarization Contests (DIBCO), held under the framework of the International Conference on Document Analysis and Recognition (ICDAR) in 2009 and 2011, and the Handwritten Document Image Binarization Contest (H-DIBCO), held under the framework of the International Conference on Frontiers in Handwriting Recognition (ICFHR), reflect the current efforts on degraded document image binarization, as well as the common understanding that further work is required for better solutions.

The binarization datasets are available here: DIBCO 2009 dataset; H-DIBCO 2010 dataset; DIBCO 2011 dataset. The binarization results of our Background Estimation and Local Maximum and Minimum methods are available here.

We participated in these contests. Our submitted method performed best among the 43 algorithms submitted by 35 international research groups in DIBCO 2009. Our submission to H-DIBCO 2010 was one of the top two winners among 17 submitted algorithms, and our submission to DIBCO 2011 achieved the second-best result among 18 submitted algorithms.

The binarization code submitted to the contests is available here: (1) Binarization Code for DIBCO 2009; (2) Binarization Code for H-DIBCO 2010; (3) Binarization Code for DIBCO 2011.

7. Related Publications

[1] Shijian Lu, Bolan Su, Chew Lim Tan. Document Image Binarization Using Background Estimation and Stroke Edges. International Journal on Document Analysis and Recognition (IJDAR), December 2010. [pdf]
[2] Bolan Su, Shijian Lu, Chew Lim Tan. Binarization of Historical Document Images Using the Local Maximum and Minimum. International Workshop on Document Analysis Systems (DAS), 9-11 June 2010, Boston, MA, USA. [pdf]
[3] Bolan Su, Shijian Lu, Chew Lim Tan. A Self-training Learning Document Binarization Framework. International Conference on Pattern Recognition (ICPR), 23-26 August 2010, Istanbul, Turkey. [pdf]
[4] Bolan Su, Shijian Lu, Chew Lim Tan. Combination of Document Image Binarization Techniques. International Conference on Document Analysis and Recognition (ICDAR), 18-21 September 2011, Beijing, China. [pdf]
