
For learning how to write and pronounce English characters

Introduction

1. About OCR

OCR is the acronym for Optical Character Recognition.
Optical character recognition is needed when information should be readable both by humans and by a machine.
Both handwritten and printed characters may be recognized.
It converts scanned images of machine-printed or handwritten text (numerals, letters, and symbols) into a computer-processable format.
Optical recognition is performed off-line, after the writing or printing has been completed, as opposed to on-line recognition, where the computer recognizes the characters as they are drawn.

2. About speech synthesis


The text-to-speech (TTS) synthesis procedure consists of two main phases: text analysis (input text -> phonetic output) and speech generation (phonetic information -> acoustic output).
Conversion of text into speech can be implemented using the Java Speech Application Programming Interface (JSAPI), through which applications can use the functionality of speech engines.
FreeTTS is the JSAPI speech synthesis engine that we have used.

Current status of development


OCR readers can convert typed and handwritten documents into digital data.
These readers scan the shape of a character on a document, compare the scanned character with a pre-defined shape, and convert the character into its corresponding bit pattern for storage in main computer memory.
This technology is still in development.
Speech synthesis has reached a high level of performance, with low error rates in text analysis and high intelligibility in synthesis, but much improvement is still needed to achieve more natural-sounding speech.

Advantages

Reduces the time the user needs to enter data.
Helps in learning a language, with spoken help.
No keyboard is required for entering text.
A computer with handwriting recognition integrated with speech synthesis can teach at any time and in any place.
Both writing and pronunciation can be learned.
People with reading disabilities (such as dyslexia) can use it.
During the learning phase, a person can change (improve) his or her own handwritten pattern of a letter as many times as required.

Artificial neural networks


[Figure: neural network diagram with the labels "Inputs" and "Output"]

An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture. The objective of the neural network is to transform the inputs into meaningful outputs.

Kohonen algorithm

The input to a Kohonen neural network is given using the input neurons.
Each input neuron is given one of the floating-point numbers that make up the input pattern to the network.
A Kohonen neural network requires that these inputs be normalized to the range between -1 and 1.
One output neuron is chosen as the winner.
To determine which neuron wins and produces the output, the steps are as follows.
Normalize the input. First calculate the "vector length" of the input data by summing the squares of the elements of the input vector. Then determine the normalization factor, which is the reciprocal of the square root of the vector length.

Kohonen algorithm (contd.)

Calculate each output neuron's output. For each output neuron, calculate the dot product of the input vector and the connection weights between the input neurons and that output neuron. This output must then be normalized by multiplying it by the normalization factor.
Map the normalized output from the bipolar range [-1, 1] to the range [0, 1] by adding 1 and dividing the result by 2.
Finally, choose the winning neuron: the output neuron that has the largest output value becomes the winner.
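The winner-selection steps above can be sketched in Java as follows. This is a minimal illustration, not the application's actual code: the class name `KohonenWinnerSketch`, the method names, and the `weights[outputNeuron][inputNeuron]` layout are assumptions made for the example.

```java
// Hypothetical sketch of the winner-selection steps; all names are illustrative.
final class KohonenWinnerSketch {

    /** Normalization factor: reciprocal of the square root of the sum of squares. */
    static double normalizationFactor(double[] input) {
        double vectorLength = 0.0;
        for (double x : input) {
            vectorLength += x * x; // "vector length" as defined on the slide
        }
        if (vectorLength == 0.0) {
            throw new IllegalArgumentException("input must not be all zeros");
        }
        return 1.0 / Math.sqrt(vectorLength);
    }

    /** Output of every output neuron, normalized and mapped to [0, 1]. */
    static double[] outputs(double[] input, double[][] weights) {
        double factor = normalizationFactor(input);
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double dot = 0.0;
            for (int i = 0; i < input.length; i++) {
                dot += input[i] * weights[j][i]; // dot product with neuron j's weights
            }
            out[j] = (dot * factor + 1.0) / 2.0; // add 1, divide by 2
        }
        return out;
    }

    /** Index of the output neuron with the largest output (first one on ties). */
    static int winner(double[] input, double[][] weights) {
        double[] out = outputs(input, weights);
        int best = 0;
        for (int j = 1; j < out.length; j++) {
            if (out[j] > out[best]) {
                best = j;
            }
        }
        return best;
    }
}
```

For example, with the input `{0.6, 0.8}` and the weight rows `{1, 0}` and `{0, 1}`, the normalization factor is about 1, the two outputs are about 0.8 and 0.9, and the second neuron (index 1) wins.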

Unsupervised learning
No help from the outside
No information available on the desired output
Learning by doing

Processes in our OCR


The handwritten characters are first drawn using the mouse.
The bit pattern of the image is grabbed.
Cropping is done to eliminate the extra white space around the image.
Downsampling, an algorithm that reduces the resolution of the letters being drawn, is applied for character recognition and training.
Recognition (using a Kohonen self-organizing map) and speech synthesis (using JSAPI).
Training the network to recognize identical or similar patterns (by classifying them to the same output neuron).
Error calculation (how well the network classifies).
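The cropping and downsampling steps might look like the sketch below. It assumes the drawn image is already available as a rectangular `boolean[][]` grid (`true` for a drawn pixel), and that a downsampled cell is set when any pixel in its region is drawn. Both are assumptions made for illustration, as are the class and method names; the slides do not specify the grid size or the exact rule.

```java
// Hypothetical sketch of cropping and downsampling; all names are illustrative.
final class DownSampleSketch {

    /** Returns {top, left, bottom, right} (inclusive) of the drawn pixels, or null if blank. */
    static int[] boundingBox(boolean[][] pixels) {
        int top = Integer.MAX_VALUE, left = Integer.MAX_VALUE, bottom = -1, right = -1;
        for (int y = 0; y < pixels.length; y++) {
            for (int x = 0; x < pixels[y].length; x++) {
                if (pixels[y][x]) {
                    top = Math.min(top, y);
                    left = Math.min(left, x);
                    bottom = Math.max(bottom, y);
                    right = Math.max(right, x);
                }
            }
        }
        return bottom < 0 ? null : new int[] {top, left, bottom, right};
    }

    /** Crops to the bounding box, then reduces the crop to a rows x cols grid. */
    static boolean[][] downSample(boolean[][] pixels, int rows, int cols) {
        boolean[][] grid = new boolean[rows][cols];
        int[] box = boundingBox(pixels);
        if (box == null) {
            return grid; // nothing drawn: all cells stay false
        }
        int height = box[2] - box[0] + 1;
        int width = box[3] - box[1] + 1;
        for (int r = 0; r < rows; r++) {
            int y0 = box[0] + r * height / rows;
            int y1 = Math.max(y0 + 1, box[0] + (r + 1) * height / rows); // exclusive
            for (int c = 0; c < cols; c++) {
                int x0 = box[1] + c * width / cols;
                int x1 = Math.max(x0 + 1, box[1] + (c + 1) * width / cols); // exclusive
                grid[r][c] = anyDrawn(pixels, y0, y1, x0, x1);
            }
        }
        return grid;
    }

    /** True if any pixel in rows [y0, y1) and columns [x0, x1) is drawn. */
    private static boolean anyDrawn(boolean[][] pixels, int y0, int y1, int x0, int x1) {
        for (int y = y0; y < y1; y++) {
            for (int x = x0; x < x1; x++) {
                if (pixels[y][x]) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Flattening the resulting grid row by row (for instance, `true` as 1 and `false` as -1) would give one possible input vector for the network; that encoding is also an assumption.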

Language used: Java

Java is a general-purpose programming language developed by Sun Microsystems.
Object-oriented language
Platform independent
Code written in Java will be easier to maintain and reuse in the long run.
Java has two GUI packages, the original Abstract Window Toolkit (AWT) and the newer Swing.
Swing components have the prefix J to distinguish them from the original AWT ones (e.g. JFrame instead of Frame).
To include Swing components and methods in your project, import the java.awt.*, java.awt.event.*, and javax.swing.* packages.

Containers are used to hold and group components such as text fields and checkboxes.

JFrame and JPanel

JFrame is the most commonly used top-level container. It adds basic functionality such as minimize, maximize, close, title and border to basic frames and windows.
Some important JFrame methods are: setBounds(x,y,w,h), setSize(w,h), setResizable(bool), setTitle(str), setVisible(bool), isResizable() and getTitle().
The setDefaultCloseOperation(constant) method controls the action that occurs when the close icon is clicked.
JPanel is the most commonly used content pane. An instance of the pane is created and then added to a frame. The add() method allows GUI components to be added to the pane. The way they are added is controlled by the current layout manager.
For text-to-speech conversion using Java, we need some packages (e.g. speech, util, synthesis, freetts) and some jar files to be installed in our working folder before compiling our program.
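The JFrame and JPanel usage described above can be sketched as follows. The window title, the label and button texts, and the class name `OcrWindowSketch` are placeholders chosen for this example, not taken from the application.

```java
import java.awt.FlowLayout;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

// Hypothetical sketch of a JFrame holding a JPanel; all texts are placeholders.
final class OcrWindowSketch {

    /** Creates the content pane; FlowLayout controls how add() places components. */
    static JPanel buildContentPane() {
        JPanel pane = new JPanel(new FlowLayout());
        pane.add(new JLabel("Draw a letter:"));
        pane.add(new JButton("Recognize"));
        pane.add(new JButton("Train"));
        return pane;
    }

    /** Creates the top-level window and attaches the pane to it. */
    static JFrame buildFrame() {
        JFrame frame = new JFrame();
        frame.setTitle("Handwriting OCR");
        frame.setBounds(100, 100, 400, 300); // x, y, width, height
        frame.setResizable(false);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE); // close icon exits the program
        frame.setContentPane(buildContentPane());
        return frame;
    }

    public static void main(String[] args) {
        // Swing components should be created and shown on the event dispatch thread.
        SwingUtilities.invokeLater(() -> buildFrame().setVisible(true));
    }
}
```

Keeping the pane construction in its own method separates layout from window setup, so the pane can be reused or inspected without opening a window.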

Data flow diagrams

[Level-2 data flow diagram. Recoverable labels: handwritten characters, user interface, cropping, downsampled image, Kohonen neural network, input vector, connection weights, outputs, vector length, normalized input, normalized outputs, recognition.]

Features in OCR

It can recognize handwritten characters and simultaneously speak the recognized character.
We can train the network to recognize our own handwriting, so that most of our characters can be recognized.
Training a Kohonen neural network involves stepping through several epochs until the error of the network falls below an acceptable level.
An epoch occurs when the training data is presented to the network, the error is calculated, and the weights are adjusted to reduce the error.
We have a training file that contains training samples for our own handwriting (capital versions of the 26 English letters), which can be loaded so that the application can be trained further to recognize characters drawn by us.

Features contd.

It can create a list of the letters that the program has been trained for. By selecting a particular letter and deleting it, we can retrain our program for that letter.
Error, i.e., how well the training inputs (the letters that you created) map to the output neurons (26 characters). If the error is below the acceptable level of error (10%), no further training is required.
The first error, lastError, indicates the total error for the Kohonen neural network for the epoch that just occurred.
The second error, bestError, indicates the best (least) lastError that has occurred so far.
Tries counts the number of times we have tried to adjust the weight matrix to reduce the error while training the network.
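The bookkeeping of lastError, bestError and tries across epochs can be illustrated with the sketch below. It uses a textbook winner-take-all update (the winning neuron's weights move toward the sample) and measures the error as the mean distance between each sample and its winner's weights. Both choices are assumptions made for illustration and are not necessarily the update rule or error measure used in the application; the class name, learning rate and seed are likewise hypothetical.

```java
import java.util.Random;

// Hypothetical training-loop sketch; update rule and error measure are assumptions.
final class KohonenTrainingSketch {
    final double[][] weights;             // weights[outputNeuron][inputNeuron]
    double lastError = Double.MAX_VALUE;  // error of the most recent epoch
    double bestError = Double.MAX_VALUE;  // smallest lastError seen so far
    int tries = 0;                        // number of epochs attempted

    KohonenTrainingSketch(int inputs, int outputs, long seed) {
        Random rnd = new Random(seed);
        weights = new double[outputs][inputs];
        for (double[] row : weights) {
            for (int i = 0; i < row.length; i++) {
                row[i] = rnd.nextDouble() * 2.0 - 1.0; // random start in [-1, 1)
            }
        }
    }

    private static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sum);
    }

    /** Winner here is the neuron whose weights are closest to the sample. */
    int winner(double[] sample) {
        int best = 0;
        for (int j = 1; j < weights.length; j++) {
            if (distance(sample, weights[j]) < distance(sample, weights[best])) {
                best = j;
            }
        }
        return best;
    }

    /** Runs epochs until lastError drops below acceptableError or maxTries is reached. */
    void train(double[][] samples, double rate, double acceptableError, int maxTries) {
        while (tries < maxTries) {
            tries++;
            double total = 0.0;
            for (double[] sample : samples) {
                int w = winner(sample);
                total += distance(sample, weights[w]);   // error before adjusting
                for (int i = 0; i < sample.length; i++) { // move winner toward sample
                    weights[w][i] += rate * (sample[i] - weights[w][i]);
                }
            }
            lastError = total / samples.length;
            bestError = Math.min(bestError, lastError);
            if (lastError < acceptableError) {
                break; // below the acceptable level: no further training needed
            }
        }
    }
}
```

A call such as `new KohonenTrainingSketch(2, 2, 42L).train(samples, 0.5, 0.10, 100)` would stop either when the epoch error falls below the 10% level mentioned above or after 100 tries, whichever comes first.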

Snapshot
