SUBMITTED BY
SHRIKANT NAYAK
PRASANNA MEHTA
RAHUL AMBADKAR
DISHA YADAV
SUPERVISOR
2014-15
Department of Information Technology
Pillai Institute of Information Technology,
Mrs.Varunakshi Bhojane
We would also like to thank them for their patience and co-operation, which
proved beneficial for us. We owe a substantial share of our success to the
whole faculty and the staff members, who provided us with the facilities
required to complete the project work.
SHRIKANT NAYAK
PRASANNA MEHTA
RAHUL AMBADKAR
DISHA YADAV
ABSTRACT
Machine Transliteration is an important problem in an increasingly
multilingual world, as it plays a critical role in many downstream applications,
such as machine translation or “Cross Lingual Information Retrieval (CLIR)”
systems. In this project, we propose compositional machine transliteration
systems, where multiple transliteration components may be composed either
to improve existing transliteration quality, or to enable transliteration
functionality between languages even when no direct parallel names corpora
(set of texts) exist between them. Specifically, we propose Parallel
Composition. In parallel composition, evidence from multiple transliteration
paths between X → Z is aggregated to improve the quality of a direct
system. We demonstrate the functionality and performance benefits of the
compositional methodology using a state-of-the-art machine transliteration
framework between English and Marathi.
TABLE OF CONTENTS
i Abstract
1 Introduction
1.1 Aims and objectives
1.4 Advantages
1.5 Disadvantages
2 Literature Survey
2.1 Introduction
2.2 Feasibility study
2.3 Requirement analysis
2.4 System analysis
3 Existing system
6 Design details
8 References
Chapter 1
INTRODUCTION
1.1 OBJECTIVE:
language converter
legal converter
literary converter
medical converter
scientific converter
technical converter
To design a machine translator from English to Marathi using a hybrid
approach that combines rule-based and example-based methods, so as to obtain
a good enough translation for English statements in subject-verb-object (SVO)
form.
1.3 Scope:
1. By providing the Marathi pronunciation, our system is useful to those
learning standard-level English.
2. User friendly environment.
3. Better user interface.
4. Fast mechanism.
5. Small memory factor.
1.5 Disadvantages:
Chapter 2
LITERATURE SURVEY
What is transliteration?
Transliteration is a representation of the words of one language in the script
of another, i.e., it is the transcription of one alphabet in another. Some other
interesting definitions are:
The representation of characters or words of one language by
corresponding characters or words of another language.
A systematic way to convert characters in one alphabet or
phonetic sounds into another alphabet.
The translation of text from one writing system into another
where the writing conventions of the target writing system are applied.
The transliterated text should read naturally in the target script.
A letter-for-letter or sound-for-letter spelling of a word to
represent a word in another language.
Hindi and Marathi are written using the Devanagari script. The Devanagari
script used for Hindi and Marathi has 12 pure vowels, 2 loan vowels from
the Sanskrit language and 1 loan vowel from English. There are a total of 34
consonants, 5 conjuncts, 7 loan consonants and 2 traditional signs in the
Devanagari script, and each consonant has 14 variations through
combination with the 14 vowels [32-34]. Table 1 shows the Devanagari script
along with its equivalent phonetic mapping in Roman. The consonant /ळ/ is
used only in Marathi and not in Hindi.
Name in Devanagari→ महारा STUs → [म | हा | रा | ]
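The segmentation shown above can be sketched in Java as follows. This is only an illustrative sketch, not the segmentation code used in the project: the class name DevanagariSegmenter and the splitting rule are assumptions. The rule starts a new unit at every independent vowel or consonant, unless the previous character is a virama (halant), in which case the consonant is kept in the same unit as part of a conjunct. Nukta consonants outside the range U+0915 to U+0939 are not handled. Devanagari text is written with Unicode escapes so that the source file stays plain ASCII; the string in main is महारा.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a Devanagari word into consonant-plus-vowel-sign units. */
final class DevanagariSegmenter {
    // Virama (halant): joins the preceding consonant to the next one in a conjunct.
    private static final char VIRAMA = '\u094D';

    // Assumption: independent vowels (U+0904..U+0914) and consonants
    // (U+0915..U+0939) are the only characters that may start a unit.
    private static boolean startsUnit(char c) {
        return c >= '\u0904' && c <= '\u0939';
    }

    static List<String> segment(String word) {
        List<String> units = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            boolean afterVirama = current.length() > 0
                    && current.charAt(current.length() - 1) == VIRAMA;
            // Close the current unit when a new base character begins,
            // except when it continues a conjunct after a virama.
            if (startsUnit(c) && current.length() > 0 && !afterVirama) {
                units.add(current.toString());
                current.setLength(0);
            }
            current.append(c);
        }
        if (current.length() > 0) {
            units.add(current.toString());
        }
        return units;
    }

    public static void main(String[] args) {
        // Input is the five code points of the example word above.
        List<String> units = segment("\u092E\u0939\u093E\u0930\u093E");
        System.out.println(String.join(" | ", units));
    }
}
```

Vowel signs, the anusvara and the virama never start a unit under this rule, so they always stay attached to the consonant before them.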
Interpreters -
1. This process involves two or more speakers who may not be speaking the
same language.
Transliteration Services - These are also computer-assisted, except that the
software employed is highly efficient and proficient in translating a
particular language. Using the Internet, transliteration software can be used
from remote locations to translate web pages and client-provided content.
There are experienced players in the transliteration field who offer language
transliteration services as a SaaS offering. They provide continuous
improvements in transliteration speed and quality, along with rapid
development of new languages for high-volume transliteration deployments.
languages Y and Z, using appropriate parallel names corpora between them.
For testing, each name in language X was provided as input to the X → Y
transliteration system, and the top-10 candidate strings in language Y
produced by that system were in turn given as input to the Y → Z system.
The outputs of the Y → Z system were merged and re-ranked by their
probability scores. Finally, the top-10 of the merged outputs were returned
as the compositional system output.
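The testing procedure described above can be sketched in Java (Java 8 or later) as follows. This is a minimal sketch under stated assumptions, not the original system. The names SerialComposition, Candidate and Transliterator are hypothetical. The text does not say how scores are combined across the two stages, so the sketch assumes that the score of a path is the product of the two stage scores, and that a string in Z reached through several Y candidates has its path scores summed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of serial composition X -> Y -> Z; all names here are hypothetical. */
final class SerialComposition {

    /** A transliteration candidate together with its probability score. */
    static final class Candidate {
        final String text;
        final double score;

        Candidate(String text, double score) {
            this.text = text;
            this.score = score;
        }
    }

    /** Stand-in for a trained transliteration system that returns ranked candidates. */
    interface Transliterator {
        List<Candidate> topK(String input, int k);
    }

    static List<Candidate> compose(Transliterator xToY, Transliterator yToZ,
                                   String name, int k) {
        Map<String, Double> merged = new HashMap<>();
        for (Candidate y : xToY.topK(name, k)) {
            for (Candidate z : yToZ.topK(y.text, k)) {
                // Assumption: path score is the product of the stage scores,
                // and duplicate outputs have their path scores summed.
                merged.merge(z.text, y.score * z.score, Double::sum);
            }
        }
        List<Candidate> ranked = new ArrayList<>();
        for (Map.Entry<String, Double> e : merged.entrySet()) {
            ranked.add(new Candidate(e.getKey(), e.getValue()));
        }
        // Re-rank by score, highest first, and keep the top k.
        ranked.sort((a, b) -> Double.compare(b.score, a.score));
        return new ArrayList<>(ranked.subList(0, Math.min(k, ranked.size())));
    }

    public static void main(String[] args) {
        // Stub systems with made-up strings and scores, for demonstration only.
        Transliterator xToY = (input, k) ->
                Arrays.asList(new Candidate("y1", 0.6), new Candidate("y2", 0.4));
        Transliterator yToZ = (input, k) ->
                Arrays.asList(new Candidate(input + "-z", 1.0));
        for (Candidate c : compose(xToY, yToZ, "name", 10)) {
            System.out.println(c.text + " " + c.score);
        }
    }
}
```

With k set to 10, compose follows the procedure in the text: up to ten Y candidates are expanded, each into up to ten Z candidates, and the ten best merged strings are returned.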
Chapter 3
EXISTING SYSTEM
Existing system:
The previous system only converts an English word into Marathi; the user
cannot learn the actual pronunciation of that word.
4. Cannot be reliable.
of the grapheme transliteration model continued to mount unsuccessful
attempts at reversing government policy until the turn of the century, with
one critic appealing to "the Indian Government to give up the whole
attempt at scientific (i.e. Hunterian) transliteration, and decide once and for
all in favour of a return to the old phonetic spelling."
The ITRANS transliteration scheme was developed for the ITRANS software
package, a preprocessor for Indic scripts. The user inputs in Roman letters
and the ITRANS preprocessor converts the Roman letters into Devanagari (or
other Indic scripts). The latest version of ITRANS is version 5.30 released in
July, 2001.
Quillpad is the Number One predictive transliteration tool for inputting Indian
languages. Unlike the rule-based phonetic transliteration solutions, where
users had to type by memorizing clumsy key combinations, Quillpad
provided a huge leap in ease of use by enabling users to type in free style,
without having to follow any rigid typing rules. Launched in 2006, Quillpad is
the first Indic transliteration solution to use a statistical machine learning
method for intelligently converting user-entered free-style phonetic input to
its accurate representation in a chosen Indian language.
Chapter 4
PROPOSED SYSTEM
In our application, English words are taken as input. These words are then
converted into tokens. The tokens are compared with the dictionary, and the
final result is given as English-Marathi words.
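The flow described above (input, tokens, dictionary comparison, English-Marathi output) can be sketched in Java as follows. This is an illustrative sketch only. The class name DictionaryLookup, the in-memory map standing in for the project's dictionary, the two sample entries (water = पाणी, house = घर) and the use of "?" for words missing from the dictionary are all assumptions. Marathi strings are written with Unicode escapes so that the source file stays plain ASCII.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Sketch of the token-and-dictionary flow; the entries below are illustrative only. */
final class DictionaryLookup {
    // Stand-in for the project's dictionary (assumption: a simple in-memory map).
    private static final Map<String, String> DICTIONARY = new HashMap<>();
    static {
        DICTIONARY.put("water", "\u092A\u093E\u0923\u0940"); // Marathi word for water
        DICTIONARY.put("house", "\u0918\u0930");             // Marathi word for house
    }

    /** Splits the input on whitespace and lower-cases each token. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        for (String t : input.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** Returns "english = marathi" pairs; unknown tokens map to "?" (an assumption). */
    static List<String> convert(String input) {
        List<String> result = new ArrayList<>();
        for (String token : tokenize(input)) {
            String marathi = DICTIONARY.get(token);
            result.add(token + " = " + (marathi == null ? "?" : marathi));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String pair : convert("Water house tree")) {
            System.out.println(pair);
        }
    }
}
```

A dictionary kept in a database table could replace the map without changing tokenize or the shape of the output.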
1. By providing the Marathi pronunciation, our system is useful to those
learning standard-level English.
4. Fast mechanism.
Block Diagram:
Chapter 5
Hardware:
1. Processor: Pentium 4
2. RAM: 512 MB or more
3. Hard disk: 16 GB or more
Software
Java JDK 1.6
NetBeans
MySQL
1. Java JDK 1.6:
2. NetBeans:
NetBeans is an integrated development environment (IDE) for developing
primarily with Java, but also with other languages, in particular PHP, C/C++,
and HTML5. It is also an application platform framework for Java desktop
applications and others.
3. MySQL:
MySQL is a popular choice of database for use in web applications, and is a
central component of the widely used LAMP open source web application
software stack (and other 'AMP' stacks).
Chapter 6
Design Details
System flowchart:
Algorithm:-
Flowchart:
[Figure: system flowchart, from Start to Stop]
DFD Level 0:
DFD Level 1:
[Figure: Enter English words as input → Convert word into token → Find out
checked words → English-Marathi words generated]
CHAPTER 7
IMPLEMENTATION PLAN
IMPLEMENTATION PLAN:
The implementation plan includes a description of all the activities that must
occur to implement the new system and to put it into operation. It identifies
the personnel responsible for the activities and prepares a time chart for
implementing the system. The implementation plan consists of the following
steps.
List all new documents and procedures that go into the new
system.
Implementation includes all those activities that take place to convert from
the old system to the new. The old system consists of manual operations,
which is operated in a very different manner from the proposed new system.
A proper implementation is essential to provide a reliable system to meet the
requirements of the organizations. An improper installation may affect the
success of the computerized system.
IMPLEMENTATION METHODS:
There are several methods for handling the implementation and the
consequent conversion from the old to the new computerized system.
The most secure method for conversion from the old system to the new
system is to run the old and new system in parallel. In this approach, a
person may operate in the manual older processing system as well as start
operating the new computerized system. This method offers high security,
because even if there is a flaw in the computerized system, we can depend
upon the manual system. However, the cost for maintaining two systems in
parallel is very high. This outweighs its benefits.
Another commonly used method is a direct cut over from the existing manual
system to the computerized system. The change may be within a week or
within a day. There are no parallel activities. However, there is no remedy in
case of a problem. This strategy requires careful planning.
A working version of the system can also be implemented in one part of the
organization, where the personnel pilot the system and changes can be made
as and when required. However, this method is less preferable because the
system is not deployed in its entirety.
Conclusion
Thus, we conclude our presentation of the transliteration system. It is an
effective token-based system for transliteration between English and Marathi.
As English and Marathi are structurally similar languages, it generates a
target language sentence that retains a flavor of the source language. It
should be noted that transliteration is not performed here in the linguistic
sense; rather, word-for-word transliteration is performed. It requires limited
linguistic effort and tools to achieve this goal. The results demonstrate the
potential advantage and accuracy of our approach.
The translator has successfully realised his intention. Referentially, the main
ideas of the SL text are reproduced. The language is rather more informal
than it is in the original, which is in line with the difference between
educated English and Marathi. There are several instances of under
translation, sometimes inevitable in the context of different collocations and
normal and natural usage. In fact the use of more general words helps to
strengthen the pragmatic effect, since, being common and frequently used,
they have more connotations and are more emotive than specific, let alone
technical, words which are purely referential.
REFERENCES
5. A. Kumaran, Mitesh M. Khapra, Pushpak Bhattacharyya, “Compositional
Machine Transliteration”, Microsoft Research India and Indian Institute of
Technology Bombay.