Simone B. K. AIRES a,b, Cinthia O. A. FREITAS a, Flávio BORTOLOZZI a, Robert SABOURIN c
a Pontifical Catholic University of Paraná (PUCPR), R. Imaculada Conceição, 1155, Curitiba (PR), Brazil
b Centro Federal de Educação Tecnológica (CEFET), Av. Monteiro Lobato, km 4, Ponta Grossa (PR), Brazil
c École de Technologie Supérieure (ETS), 1100, Rue Notre-Dame Ouest, Montreal (QC), Canada
saibes@ppgia.pucpr.br, cinthia@ppgia.pucpr.br, fborto@ppgia.pucpr.br, robert.sabourin@etsmtl.ca
Abstract. Our study investigates a perceptual zoning mechanism for handwritten character recognition. This paper proposes a non-symmetrical zoning mechanism as the baseline feature extractor. Zoning is a method for local information analysis on partitions of a given pattern. The basic idea is to work at a high level of representation: the feature vectors are extracted from the word images using a feature set built on a zoning mechanism that takes into account the parts of letters that cause confusion. Our feature extraction is based on Concavities/Convexities Deficiencies, which are obtained by labelling the background pixels of the input images. Each letter is circumscribed by a rectangle, which is partitioned into Z parts, with Z = 4, 5H (horizontal), 5V (vertical), and 7. In our experiments, the 7-part zoning achieved the best recognition rate (84.73%), which suggests that the approach is promising for handwritten character recognition.
1. Introduction
Handwritten character recognition has become an important subject as ICR systems become more powerful and commercially available. However, there is still a gap between human reading capabilities and recognition systems, so it is necessary to explore and capture information from human perception in order to design new systems (Madhavanath & Govindaraju, 1998), (Schomaker & Segers, 1998), (Suen et al., 1994). To understand the recognition process, let us consider two different views. One view, the feature theory, uses a checklist approach: it claims that letters are represented in the nervous system as a set of features, lines, and contours of various orientations (Sekuler & Blake, 1994). The other view uses a spatial frequency approach derived from the work of Campbell and Robson (Sekuler & Blake, 1994). The tendency of letters to be confused has been used to define the perceptual similarity of letters. The basic idea is that when two letters look very much alike, they will often be confused with one another. Figure 1 illustrates this idea with the letter pairs U and V, and O and Q. A good model of perceptual similarity should predict which pairs of letters are confused and which are not.
Our work is based on the feature approach; therefore, it is first necessary to compile a checklist of features, that is, to decide which features should make up the list. Table 1 presents a list of the features distinguishing some letters of the alphabet, such as A, C, and G. A complete list is presented in (Sekuler & Blake, 1994); it was adapted from Geyer & DeWalt (1973).
Table 1: List of the features distinguishing letters of the alphabet
Features: Open: 4. Horizontal; 5. Intersection, internal; 6. Bar-horizontal; 7. Symmetry, vertical; 8. Symmetry, horizontal
Entries as extracted (column alignment lost): A: 1 1; C: 1; G: 1 1; unlabelled row: 2 1 1 1
According to the feature theory, two letters that have many features in common will tend to be confused, whereas letters that have few features in common will not. Table 1 shows that the letters C and G have features in common (convex segment and horizontal open). These letters differ in two features, bar-horizontal and horizontal symmetry, which are difficult to extract from the letter stroke because they depend on the writing style (Figure 2).
When designing a recognition system, we take into account that perception depends on cooperative interaction between the processing of global and local information. Handwritten character or word recognition is an example of a stimulus that contains both kinds of information. We are experts at recognizing characters from early childhood onwards. However, when we observe only a part of a letter, its identification is not so obvious. In the first case, we are processing global information. In the second, we need to process local information (Suen et al., 1994). Therefore, a zoning mechanism has been proposed to evaluate the recognition rates of the distinct parts of characters (Suen et al., 1994). Zoning is a method for local information analysis on partitions of a given pattern. The basic idea in this paper is to work at a high level of representation: the feature vectors are extracted from the word images using a feature set built on a zoning mechanism that takes perceptual similarities into account.
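The feature-theory prediction can be illustrated with a small overlap score between feature checklists. The sketch below is only an illustration under stated assumptions: the feature sets for C and G follow the description in the text, the entry for A is a hypothetical placeholder, and the Jaccard index is one possible overlap measure, not one prescribed by the feature theory or used in this work.

```python
from itertools import combinations

# Illustrative feature checklists. The C and G entries follow the text
# (shared: convex segment, horizontal opening; differing: horizontal bar,
# horizontal symmetry). The A entry is a hypothetical placeholder.
FEATURES = {
    "C": {"convex_segment", "open_horizontal", "symmetry_horizontal"},
    "G": {"convex_segment", "open_horizontal", "bar_horizontal"},
    "A": {"intersection_internal", "bar_horizontal", "symmetry_vertical"},
}


def feature_overlap(a, b):
    """Jaccard index: shared features divided by all features of either letter."""
    fa, fb = FEATURES[a], FEATURES[b]
    return len(fa & fb) / len(fa | fb)


def rank_confusable_pairs():
    """Return (overlap, letter, letter) tuples, most similar pair first."""
    pairs = combinations(sorted(FEATURES), 2)
    return sorted(((feature_overlap(a, b), a, b) for a, b in pairs), reverse=True)


if __name__ == "__main__":
    for score, a, b in rank_confusable_pairs():
        print(f"{a}-{b}: {score:.2f}")
```

With these assumed checklists, C and G obtain the highest overlap, which agrees with the expectation that they are more likely to be confused with each other than with A.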
Figure 2: Visual perception: (a) difference based on the bar-horizontal feature (letters C and G); (b) demonstration of some limits on object segregation by shape
Such studies can also take the structuralist approach into account. This approach treats form perception as an analytical process in which complex forms are decomposed into small, simple elements. In our case, this corresponds to the approach that segments words into letters or pseudo-letters, which researchers call the Local or Analytical Approach. In Figure 2, the first level of perception is Global: the observer identifies the cluster of +s. The second level of perception is Local: only then can the observer identify the cluster of Ts. In conclusion, this work takes the Gestalt theory into account by applying first a Global feature extraction based on Concavities/Convexities Deficiencies and second a Local perceptual zoning mechanism. The zoning mechanism allows the elements (features) to be scrutinized individually. The feature set extracted from the images is presented in Sections 3 and 4.
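As a rough illustration of how zoning turns a global labelling into local features, the following sketch partitions a label map into rectangular zones and concatenates one label histogram per zone. It assumes that Z = 4 corresponds to a regular 2 x 2 grid and that each pixel already carries a small non-negative integer label; the function names are hypothetical, and the non-symmetrical layouts (5H, 5V, 7) are not reproduced because their geometry is not given in this text.

```python
import numpy as np


def split_into_zones(image, rows=2, cols=2):
    """Partition a 2-D array into rows x cols rectangular zones (Z = rows * cols)."""
    zones = []
    for band in np.array_split(image, rows, axis=0):
        zones.extend(np.array_split(band, cols, axis=1))
    return zones


def zone_feature_vector(label_map, n_labels, rows=2, cols=2):
    """Concatenate one label histogram per zone into a single feature vector."""
    histograms = [
        np.bincount(zone.ravel(), minlength=n_labels)[:n_labels]
        for zone in split_into_zones(label_map, rows, cols)
    ]
    return np.concatenate(histograms)


if __name__ == "__main__":
    # Toy 4 x 4 label map with four label values (0..3).
    toy = np.array([[0, 0, 1, 1],
                    [0, 2, 1, 1],
                    [2, 2, 3, 3],
                    [2, 2, 3, 0]])
    print(zone_feature_vector(toy, n_labels=4))
```

The resulting vector has Z x n_labels entries, so the same label occurring in different zones contributes to different components; this is what lets a classifier treat the parts of a letter separately.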
recognition. However, there is still a significant performance gap between humans and machines in the recognition of off-line, totally unconstrained handwritten characters. Generally, an off-line handwritten character recognition system includes three stages: image preprocessing, feature extraction, and classification. Preprocessing is primarily used to reduce noise or variations in the handwritten characters. Feature extraction is essential for data representation and for extracting meaningful features for later processing. The classification stage assigns each character to one of several classes. Because of its influence on recognition performance, feature extraction plays a very important role in handwriting recognition. This has led to the development of a variety of features for handwriting recognition, and their recognition performances have been reported (Madhavanath & Govindaraju, 1998), (Suen et al., 1994). The baseline system used in this work applies a Global Approach for feature extraction combined with a Local Approach based on a zoning mechanism, and uses a Class-Modular feedforward MLP (Multilayer Perceptron) architecture in the classification stage. The system takes a 256 grey-level image as input. The preprocessing step is composed of binarization (Otsu, 1979) and a bounding box definition. The feature set is based on Concavities/Convexities Deficiencies. These deficiencies are obtained by labelling the background pixels of the input images (Parker, 1997). The complete, final set of symbols was adapted to handwritten characters, giving 24 different symbols. Figure 3 presents an example of feature extraction applied to the letter S.
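The preprocessing and labelling steps can be sketched as follows. This is a simplified illustration, not the system's implementation: it assumes dark ink on a light background, and it labels each background pixel with a 4-bit code recording in which of the four main directions ink is found. That coding yields at most 16 background labels, whereas the system described here uses 24 symbols whose exact definition is not given in this text. All function names are hypothetical.

```python
import numpy as np

INK = 16  # label reserved for foreground pixels (background codes are 0..15)


def otsu_threshold(gray):
    """Return the grey level that maximises between-class variance (Otsu, 1979)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of class 0 up to each level
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to each level
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0
    sigma_b[valid] = (mu[-1] * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def crop_to_bounding_box(ink):
    """Crop a boolean ink mask to the smallest rectangle containing all ink."""
    rows = np.flatnonzero(ink.any(axis=1))
    cols = np.flatnonzero(ink.any(axis=0))
    if rows.size == 0:
        raise ValueError("image contains no ink pixels")
    return ink[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


def label_background(ink):
    """Code each background pixel by the directions in which ink is found.

    Bits: 1 = ink above, 2 = ink below, 4 = ink to the left, 8 = ink to the right.
    """
    up = np.logical_or.accumulate(ink, axis=0)
    down = np.logical_or.accumulate(ink[::-1], axis=0)[::-1]
    left = np.logical_or.accumulate(ink, axis=1)
    right = np.logical_or.accumulate(ink[:, ::-1], axis=1)[:, ::-1]
    codes = up * 1 + down * 2 + left * 4 + right * 8
    return np.where(ink, INK, codes)


def extract_background_labels(gray):
    """Binarize with Otsu, crop to the bounding box, and label the background."""
    ink = gray <= otsu_threshold(gray)        # assumption: ink is darker than paper
    return label_background(crop_to_bounding_box(ink))


if __name__ == "__main__":
    # Toy 5 x 5 grey-level image of a "U" shape (ink = 30, paper = 220).
    gray = np.full((5, 5), 220, dtype=np.uint8)
    gray[1:4, 1] = 30
    gray[1:4, 3] = 30
    gray[3, 2] = 30
    print(extract_background_labels(gray))
```

In the toy example, the two background pixels inside the "U" receive code 14 (ink below, to the left, and to the right, but not above), which marks a concavity that is open towards the top.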
5. IRONOFF Database
The experiments were carried out using the handwritten character database from IRESTE/University of Nantes (France), called IRONOFF (IReste ON/OFF Dual Database), consisting of 26 classes of uppercase characters from Form B, fields B27 to B52 (Viard-Gaudin, 1999). The IRONOFF database was selected because it is fully cursive. The samples were collected from about 700 writers, mainly of French nationality. The off-line data were scanned at 300 dpi with 8 bits per pixel. The database has a total of 10,510 images, which were divided into three subsets: 60% for the Training database, 20% for the Validation database, and 20% for the Test database.
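For reference, the subset sizes implied by these proportions can be computed with a short sketch. It assumes a simple shuffled split with a fixed seed; how the original subsets were actually drawn (for example, per writer or per class) is not stated here, and the function name is hypothetical.

```python
import random


def split_dataset(items, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle items and split them into training, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)     # fixed seed for reproducibility
    n_train = round(len(shuffled) * fractions[0])
    n_valid = round(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]       # remainder goes to the test subset
    return train, valid, test


if __name__ == "__main__":
    train, valid, test = split_dataset(range(10510))
    print(len(train), len(valid), len(test))
```

For 10,510 images this gives 6,306 training, 2,102 validation, and 2,102 test samples.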
7. Experimental Results
The recognition rate obtained with Z = 4 is 82.89%, as presented in Table 2. After carrying out the recognition procedure over the validation database, we performed an error analysis based on the confusion matrix (Table 3). The matrix reveals the following confusions: B, D and O; C and E; D and O; H and M; I, F and J; G and Q; J and D; K and M; N and W; R and A; S and D; W, U and V; X and K; Y and X. We then experimented with zoning into 5 parts, in two variants: 5-Horizontal and 5-Vertical. The idea was to provide a better solution to the confusion problems among shapes that are not symmetrical, such as G and Q (Figure 5a), D and O, and Y and X. The recognition rates obtained are 81.75% for 5H and 80.94% for 5V (see Table 2). The 5H zoning gives better results than the other zonings for the letters G, O, and Y (Table 2). This zoning mechanism helps with letters that are not horizontally symmetric, as presented in Figure 5b. Finally, we experimented with zoning into 7 parts. The idea was to provide a better solution to the confusion problems among shapes that are not symmetrical and to extract and represent the middle zone of the character separately, for pairs such as D and C, N and W, and Y and X. This zoning gives better results than the others for the following letters: B, C, D, E, K, N, P, R, U, W, and X (Table 2 and Figure 5c). It achieved a recognition rate of 84.73%, which indicates that the approach is promising and that some representations discriminate among particular subsets of letters more robustly than others.
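An error analysis of this kind can be sketched as a search for the largest off-diagonal entries of a confusion matrix. The matrix below is invented purely for illustration (the actual confusion counts are not reproduced in this text), and the function name is hypothetical.

```python
import numpy as np


def top_confusions(confusion, labels, k=3):
    """Return the k largest off-diagonal entries as (true, predicted, count)."""
    errors = confusion.copy()
    np.fill_diagonal(errors, 0)               # ignore correct classifications
    order = np.argsort(errors, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(order, errors.shape)
    return [
        (labels[r], labels[c], int(errors[r, c]))
        for r, c in zip(rows, cols)
        if errors[r, c] > 0
    ]


if __name__ == "__main__":
    # Hypothetical counts: rows are true classes, columns are predicted classes.
    labels = ["G", "O", "Q"]
    confusion = np.array([[50, 2, 9],
                          [1, 60, 6],
                          [7, 3, 55]])
    for true, predicted, count in top_confusions(confusion, labels):
        print(f"{true} read as {predicted}: {count}")
```

Pairs that appear near the top of such a list in both directions, such as G and Q in the toy data, are natural candidates for a zoning layout designed to separate them.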
Table 2: Recognition Rate (%)
Letter    4       5H      5V      7
A         91.04   86.57   89.55   91.04
B         65.67   64.18   74.63   79.10
C         82.09   79.10   68.66   88.06
D         73.13   65.67   68.66   82.09
E         83.58   85.07   89.55   95.52
F         92.54   91.04   89.55   92.54
G         82.09   86.57   80.60   80.60
H         88.06   85.07   70.15   76.12
I         76.12   71.64   76.12   71.64
J         83.58   79.10   79.10   82.09
K         77.61   76.12   77.61   80.60
L         92.54   89.55   86.57   91.04
M         92.54   82.09   85.07   88.06
N         68.66   77.61   70.15   86.57
O         86.57   89.55   88.06   83.58
P         86.57   92.54   91.04   94.03
Q         82.09   64.18   76.12   80.60
R         86.57   89.55   88.06   91.04
S         79.10   79.10   79.10   76.12
T         95.52   97.01   97.01   97.01
U         80.60   85.07   82.09   86.57
V         95.52   82.09   88.06   82.09
W         70.15   74.63   65.67   79.10
X         76.12   74.63   70.15   79.10
Y         77.61   89.55   85.07   82.09
Z         89.55   88.06   88.06   86.57
Average   82.89   81.75   80.94   84.73
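The per-letter comparison can be checked directly from the values in Table 2. The sketch below copies the table into a dictionary and lists, for a given zoning, the letters on which it strictly outperforms the other three; the helper names are hypothetical.

```python
ZONINGS = ("4", "5H", "5V", "7")

# Recognition rates (%) copied from Table 2: letter -> (4, 5H, 5V, 7).
RATES = {
    "A": (91.04, 86.57, 89.55, 91.04), "B": (65.67, 64.18, 74.63, 79.10),
    "C": (82.09, 79.10, 68.66, 88.06), "D": (73.13, 65.67, 68.66, 82.09),
    "E": (83.58, 85.07, 89.55, 95.52), "F": (92.54, 91.04, 89.55, 92.54),
    "G": (82.09, 86.57, 80.60, 80.60), "H": (88.06, 85.07, 70.15, 76.12),
    "I": (76.12, 71.64, 76.12, 71.64), "J": (83.58, 79.10, 79.10, 82.09),
    "K": (77.61, 76.12, 77.61, 80.60), "L": (92.54, 89.55, 86.57, 91.04),
    "M": (92.54, 82.09, 85.07, 88.06), "N": (68.66, 77.61, 70.15, 86.57),
    "O": (86.57, 89.55, 88.06, 83.58), "P": (86.57, 92.54, 91.04, 94.03),
    "Q": (82.09, 64.18, 76.12, 80.60), "R": (86.57, 89.55, 88.06, 91.04),
    "S": (79.10, 79.10, 79.10, 76.12), "T": (95.52, 97.01, 97.01, 97.01),
    "U": (80.60, 85.07, 82.09, 86.57), "V": (95.52, 82.09, 88.06, 82.09),
    "W": (70.15, 74.63, 65.67, 79.10), "X": (76.12, 74.63, 70.15, 79.10),
    "Y": (77.61, 89.55, 85.07, 82.09), "Z": (89.55, 88.06, 88.06, 86.57),
}


def average_rates():
    """Mean recognition rate per zoning over the 26 letters."""
    return {
        zoning: sum(rates[i] for rates in RATES.values()) / len(RATES)
        for i, zoning in enumerate(ZONINGS)
    }


def strict_winners(zoning):
    """Letters for which the given zoning beats every other zoning."""
    i = ZONINGS.index(zoning)
    return [
        letter for letter, rates in RATES.items()
        if all(rates[i] > rates[j] for j in range(len(ZONINGS)) if j != i)
    ]


if __name__ == "__main__":
    print({z: round(v, 2) for z, v in average_rates().items()})
    print("7 :", strict_winners("7"))
    print("5H:", strict_winners("5H"))
```

Run on these values, `strict_winners("7")` returns B, C, D, E, K, N, P, R, U, W, and X, and `strict_winners("5H")` returns G, O, and Y, in agreement with the lists given in the text. The averages of the rounded per-letter rates reproduce the reported averages to within rounding (81.74 rather than 81.75 for 5H).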
8. Conclusion
In this paper, we explored perceptual zoning based on the confusion matrix and the information it provides about the parts of letters that cause confusion. We conclude that this type of analysis can probably support the design of a global hierarchical recognition system in which some networks are specialised in sub-problems of handwritten character recognition. A representation suited to a particular purpose would then be applied to each specific sub-problem in order to improve the overall recognition performance.
Figure 5: (a) G and Q; (b) Y and X; (c) B, C, D, and E
References
Madhavanath, S. & Govindaraju, V., Perceptual features for off-line handwritten word recognition: a framework for prediction, representation and matching. Advances in Pattern Recognition, August 1998, pp. 524-531.
Oh, I.-S. & Suen, C. Y., A class-modular feedforward neural network for handwriting recognition. Pattern Recognition, Vol. 35, 2002, pp. 229-244.
Otsu, N., A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, SMC 9, Vol. 1, 1979, pp. 63-66.
Parker, J. R., Algorithms for Image Processing and Computer Vision. John Wiley & Sons, Inc., 1997.
Schomaker, L. & Segers, E., A method for the determination of features used in human reading of cursive handwriting. In 6th International Workshop on Frontiers in Handwriting Recognition, 1998, pp. 157-168.
Sekuler, R. & Blake, R., Perception. 3rd ed. McGraw-Hill, Inc., 1994.
Suen, C. Y., Guo, J. & Li, Z. C., Analysis and recognition of alphanumeric handprints by parts. IEEE Transactions on Systems, Man, and Cybernetics, Vol. 24, 1994, pp. 614-631.
Viard-Gaudin, C., The IRONOFF user manual. IRESTE, University of Nantes, France, 1999.