
SCRIPT IDENTIFICATION THROUGH TEMPORAL SEQUENCE OF THE STROKES (IEEE)

Objective:
This project automates identification of the script of handwritten text in online documents. There have been many attempts at handwritten script identification in offline documents. The most important characteristic of online documents is that they capture the temporal sequence of strokes as the document is written. This allows us to analyze the individual strokes and use the additional temporal information for script identification.

Existing System
Existing methods identify the script in one of the following ways:
Using projection profiles of words and character shapes.
Using horizontal projection profiles and looking for the presence or absence of specific shapes in different scripts.
These methods deal with only a few characteristics of the writing, and most of them work offline.

Proposed System
The proposed system uses the features of connected components to classify six different scripts (Arabic, Han, Cyrillic, Devnagari, Hebrew, and Roman), with a reported classification accuracy of 88 percent on document pages. A few important aspects of online documents enable us to process them in a fundamentally different way from offline documents. Because the temporal sequence of strokes is captured, we can analyze the individual strokes and use the additional temporal information for both script identification and text recognition. In online documents, segmenting the foreground from the background is a relatively simple task: the captured data, i.e., the (x, y) coordinates of the locus of the stylus, define the characters, and any other point on the page belongs to the background. We use stroke properties as well as the spatial and temporal information of a collection of strokes to identify the script used in the document. Unfortunately, the temporal information also introduces additional variability into the handwritten characters, which creates large intra-class variations of strokes in each of the script classes.

Proposed System Architecture
The system first collects handwriting data in the six scripts listed above. Samples written in each script are stored in a file. Based on this collection, words in a particular script can be detected using the properties of that script. The classifier design uses the k-Nearest Neighbor method. The procedure consists of the following steps:
Data Collection and Preprocessing
Line and Word Detection
Feature Extraction
Recognition

System Environment:

Hardware Requirements:
Hard disk : 40 GB
RAM : 512 MB
Processor : Pentium IV
Processor Speed : 3.00 GHz

Software Requirements:
JDK 1.5 or later
Java Swing front end
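
To illustrate the k-Nearest Neighbor step named in the proposed architecture, the following is a minimal Java sketch and not the project's actual code. It assumes each word or stroke group has already been reduced to a numeric feature vector; the class name ScriptKnnSketch, the two-element feature vectors, and the toy training values are all hypothetical and chosen only for illustration. The query is assigned the majority script label among its k nearest training samples under Euclidean distance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative k-NN script classifier over precomputed feature vectors (hypothetical names). */
class ScriptKnnSketch {

    /** A labeled training example: a feature vector and the script it was written in. */
    static final class Sample {
        final double[] features;
        final String script;

        Sample(double[] features, String script) {
            this.features = features;
            this.script = script;
        }
    }

    /** Euclidean distance between two feature vectors of equal length. */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature vectors differ in length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns the majority script among the k training samples nearest to the query.
     * On a tie, the label that reached the winning vote count first (scanning from
     * the nearest neighbor outward) is kept.
     */
    static String classify(List<Sample> training, final double[] query, int k) {
        if (training.isEmpty() || k < 1) {
            throw new IllegalArgumentException("need training data and k >= 1");
        }
        // Sort a copy of the training set by distance to the query, nearest first.
        List<Sample> sorted = new ArrayList<Sample>(training);
        Collections.sort(sorted, new Comparator<Sample>() {
            public int compare(Sample s1, Sample s2) {
                return Double.compare(distance(s1.features, query),
                                      distance(s2.features, query));
            }
        });
        // Count votes among the k nearest samples.
        Map<String, Integer> votes = new HashMap<String, Integer>();
        String best = null;
        int bestVotes = 0;
        int limit = Math.min(k, sorted.size());
        for (int i = 0; i < limit; i++) {
            String label = sorted.get(i).script;
            Integer old = votes.get(label);
            int count = (old == null) ? 1 : old + 1;
            votes.put(label, count);
            if (count > bestVotes) {
                bestVotes = count;
                best = label;
            }
        }
        return best;
    }

    /** Toy training set with made-up feature values (assumption, not real data). */
    static List<Sample> demoTraining() {
        List<Sample> t = new ArrayList<Sample>();
        t.add(new Sample(new double[] {1.0, 0.20}, "Roman"));
        t.add(new Sample(new double[] {1.1, 0.30}, "Roman"));
        t.add(new Sample(new double[] {0.9, 0.25}, "Roman"));
        t.add(new Sample(new double[] {3.0, 1.5}, "Arabic"));
        t.add(new Sample(new double[] {3.2, 1.4}, "Arabic"));
        t.add(new Sample(new double[] {2.9, 1.6}, "Arabic"));
        return t;
    }

    public static void main(String[] args) {
        List<Sample> training = demoTraining();
        System.out.println(classify(training, new double[] {1.05, 0.22}, 3)); // prints "Roman"
        System.out.println(classify(training, new double[] {3.10, 1.50}, 3)); // prints "Arabic"
    }
}
```

In a real system the feature vectors would come from the Feature Extraction step, and features with different ranges would usually need to be normalized before distances are compared; the sketch leaves both out. It avoids language features newer than JDK 1.5.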
