
Diwakar Vishwakarma & Bharti Gupta, MCA II Year, BBAU (A Central University), Lucknow

AI Concept and Definition

Encompasses Many Definitions

AI involves studying human thought processes

Representing thought processes on machines

The study of how to make computers do things at which, at the moment, people are better (Rich and Knight [1991])

A theory of how the human mind works (Mark Fox)

AI Objectives
Make machines smarter
Understand what intelligence is
Make machines more useful (practical purpose)

Turing Test for Intelligence

A computer can be considered smart only when a human interviewer, conversing with both an unseen human being and an unseen computer, cannot determine which is which.

Major AI Areas

Expert Systems

Natural Language Processing

Speech Understanding

Robotics and Sensory Systems

Computer Vision and Scene Recognition

Neural Computing

Fuzzy Logic

Interaction Level

Natural Language Processing is a technique by which a machine can become more human-like, thereby reducing the distance between human beings and machines. In simple terms, NLP lets humans communicate with machines more easily. NLP applications are very useful in everyday life, for example a machine that takes instructions by voice.

Interaction Level
The level at which computer and human interact. Natural language (NL) is used to bring the interaction level closer to the human.
[Figure: levels of interaction between human and computer: command-line, graphical UI, NL UI]

Natural Language?

Natural language is one of the fundamental aspects of human behavior.
Provides easy interaction with computers.
Refers to the language spoken by people, e.g. English, Japanese, Hindi, as opposed to artificial languages like C++, Java, etc.

Where does it fit in the CS taxonomy?


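[Figure: taxonomy tree]
Computers: Databases, Artificial Intelligence, Algorithms, Networking
Artificial Intelligence: Robotics, Natural Language Processing, Expert System
Natural Language Processing: Information Retrieval, Machine Translation, Language Analysis
Language Analysis: Semantics, Parsing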

Natural Language Processing


Natural Language Processing is a collection of techniques used to extract meaning from input in order to perform a useful task.

Automatic analysis of human language by computer algorithms.

Why Natural Language Processing?

Huge amounts of data: the Internet has at least 20 billion pages, and the number is increasing exponentially
Applications that process large amounts of text require NLP expertise

Application Areas of NLP

Text-based applications: these include searching for a certain topic or keyword in a database, extracting information from a large document, translating from one language to another, and summarizing text for different purposes.

Application Areas of NLP

Dialogue-based applications: typical examples are question answering systems, services that can be provided over a telephone without an operator, teaching systems, voice-controlled machines (that take instructions by speech), and general problem-solving systems.

Components of Natural Language Processing


Natural Language Understanding
o Mapping the given natural-language input into a useful representation.
o Different levels of analysis required: morphological analysis, syntactic analysis, semantic analysis, discourse analysis

Components of Natural Language Processing


Natural Language Generation
o Producing natural-language output from some internal representation.
o Different levels of synthesis required: deep planning (what to say), syntactic generation

Natural Language Processing

Natural Language Understanding


The steps in natural language understanding are as follows:
Words -> Morphological Analysis -> Morphologically analyzed words (another step: POS tagging) -> Syntactic Analysis -> Syntactic structure

Natural Language Understanding


Semantic Analysis -> Context-independent meaning representation -> Discourse Processing -> Final meaning representation

MAJOR TASKS INVOLVED IN NATURAL LANGUAGE PROCESSING


Phonology
Morphology
Syntax
Semantics
Pragmatics
Discourse

Phonology
Deals with the interpretation of speech sounds within and across words. Three types of rules are used in phonological analysis:
1) phonetic rules, for sounds within words;
2) phonemic rules, for variations of pronunciation when words are spoken together;
3) prosodic rules, for fluctuation in stress and intonation across a sentence.

Morphology

Morphology is the first stage of analysis once input has been received. It looks at the ways in which words break down into their components and how that affects their grammatical status.

Morphology

Morphemes are the smallest meaningful units of language.
cars -> car + PLU
children -> child + PLU
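The car+PLU and child+PLU analyses can be sketched as a small lookup-plus-suffix routine. This is only a toy illustration: the irregular-plural table, the suffix rules, and the function name analyze are assumptions made for this example, and the rules will mis-split some words (for instance "bus").

```python
# Toy morphological analyzer: maps a surface word to "stem+PLU" when it
# looks like a plural. The table and suffix rules are illustrative
# assumptions, not a complete description of English morphology.
IRREGULAR_PLURALS = {"children": "child", "men": "man", "mice": "mouse"}


def analyze(word):
    """Return 'stem+PLU' for recognised plurals, else the lowercased word."""
    w = word.lower()
    if w in IRREGULAR_PLURALS:                        # children -> child+PLU
        return IRREGULAR_PLURALS[w] + "+PLU"
    if w.endswith("ies") and len(w) > 4:              # cities -> city+PLU
        return w[:-3] + "y+PLU"
    if w.endswith(("ches", "shes", "sses", "xes")):   # boxes -> box+PLU
        return w[:-2] + "+PLU"
    if w.endswith("s") and not w.endswith("ss"):      # cars -> car+PLU
        return w[:-1] + "+PLU"
    return w                                          # no plural marker found


for word in ["cars", "Children", "boxes", "glass"]:
    print(word, "->", analyze(word))
```

Irregular forms are checked first because no suffix rule could recover "child" from "children".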

Syntax

Syntax involves applying the rules of the target language's grammar. Its task is to determine the role of each word in a sentence and organize this data into a structure that is more easily manipulated for further analysis.

Issues in Syntax
"the dog ate my homework" - Who did what?

1. Identify the part of speech (POS)
dog = noun; ate = verb; homework = noun
English POS tagging: 95% accuracy (can be improved)

2. Identify collocations
mother in law, hot dog
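Both issues can be illustrated with dictionary lookups. The sketch below is a deliberately naive baseline: the lexicon, the tag names (DET, NOUN, VERB), the collocation list, and both function names are assumptions for illustration. Real taggers use context rather than a fixed word list.

```python
# Toy lexicon-based POS tagger and collocation merger.
# LEXICON, COLLOCATIONS and the tag names are illustrative assumptions.
LEXICON = {"the": "DET", "my": "DET", "dog": "NOUN",
           "homework": "NOUN", "ate": "VERB"}
COLLOCATIONS = {("hot", "dog"), ("mother", "in", "law")}


def pos_tag(sentence, default="NOUN"):
    """Tag each token by lexicon lookup; unknown words get the default tag."""
    return [(tok, LEXICON.get(tok.lower(), default)) for tok in sentence.split()]


def merge_collocations(tokens, max_len=3):
    """Greedily join known multi-word expressions into single tokens."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 1, -1):               # try longest match first
            cand = tuple(t.lower() for t in tokens[i:i + n])
            if len(cand) == n and cand in COLLOCATIONS:
                out.append("_".join(tokens[i:i + n]))
                i += n
                break
        else:                                         # no collocation starts here
            out.append(tokens[i])
            i += 1
    return out


print(pos_tag("the dog ate my homework"))
print(merge_collocations("my mother in law ate a hot dog".split()))
```

Merging collocations before tagging keeps "hot dog" from being read as an adjective followed by an animal.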

Issues in Syntax

Full Parsing: "Ravindra loves Khusi."

[Figure: parse tree]
S
  NP
    Noun: Ravindra
  VP
    Verb: loves
    NP
      Noun: Khusi
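A parse like the one for "Ravindra loves Khusi" can be produced by a very small recursive-descent parser. The grammar here (S -> NP VP, NP -> Noun, VP -> Verb NP), the word lists, and the function names are assumptions chosen to cover only this one sentence pattern.

```python
# Toy recursive-descent parser for the assumed grammar:
#   S -> NP VP,  NP -> Noun,  VP -> Verb NP
# The word lists are illustrative, not a real lexicon.
NOUNS = {"Ravindra", "Khusi"}
VERBS = {"loves"}


def parse_np(tokens, i):
    """Try to read an NP at position i; return (tree, next position)."""
    if i < len(tokens) and tokens[i] in NOUNS:
        return ("NP", ("Noun", tokens[i])), i + 1
    return None, i


def parse_vp(tokens, i):
    """Try to read a VP (verb followed by an NP) at position i."""
    if i < len(tokens) and tokens[i] in VERBS:
        np, j = parse_np(tokens, i + 1)
        if np is not None:
            return ("VP", ("Verb", tokens[i]), np), j
    return None, i


def parse_sentence(text):
    """Return a nested-tuple parse tree, or None if the grammar rejects it."""
    tokens = text.rstrip(".").split()
    np, i = parse_np(tokens, 0)
    if np is None:
        return None
    vp, j = parse_vp(tokens, i)
    if vp is None or j != len(tokens):   # every token must be consumed
        return None
    return ("S", np, vp)


print(parse_sentence("Ravindra loves Khusi."))
```

The nested tuples mirror the tree: the subject NP and the VP are siblings under S, and the object NP sits inside the VP.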

More Issues in Syntax

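Preposition Attachment
"I saw the man in the park with a telescope"
(Who has the telescope: the speaker, the man, or the park? Each reading attaches the prepositional phrase to a different part of the sentence.)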

Semantics

Semantics is the examination of the meaning of words and sentences. It conveys useful information relevant to the scenario as a whole.

Issues in Semantics
Understand language! How?
plant = industrial plant
plant = living organism
Words are ambiguous

Importance of semantics?
Machine Translation: wrong translations
Information Retrieval: wrong information

Issues in Semantics

Learn from annotated examples:
Assume 100 examples containing "plant", previously tagged by a human
Train a learning algorithm
How to choose the learning algorithm?
How to obtain the 100 tagged examples?
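As one possible answer to "which learning algorithm?", the sketch below trains a Naive Bayes classifier with add-one smoothing on a handful of tagged sentences. Naive Bayes is a choice made for this illustration, not one named in the slides. The six training sentences, the sense labels, and the function names are all hypothetical, and a realistic setup would use far more examples.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-tagged examples for the ambiguous word "plant".
TRAIN = [
    ("the plant hired new workers for the assembly line", "industrial"),
    ("the chemical plant closed its factory gates", "industrial"),
    ("workers at the plant produce steel", "industrial"),
    ("water the plant and give it sunlight", "organism"),
    ("the plant grew green leaves in the garden", "organism"),
    ("this plant needs water and soil", "organism"),
]


def train(examples):
    """Count senses and the context words seen with each sense."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, sense in examples:
        sense_counts[sense] += 1
        for w in text.split():
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def classify(text, model):
    """Pick the sense with the highest smoothed log-probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)                        # prior P(sense)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in text.split():
            if w in vocab:                                 # skip unseen words
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best


model = train(TRAIN)
print(classify("the plant needs water", model))
print(classify("workers at the factory plant", model))
```

Context words such as "water" or "factory" carry the decision, which is why the quality and quantity of the tagged examples matter so much.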

Pragmatics

Pragmatics is the sequence of steps that exposes the overall purpose of the statement being analyzed. The statement is broken down into ambiguous entities, which are then disambiguated to facilitate understanding.

Discourse

Concerns how the immediately preceding sentences affect the interpretation of the next sentence. For example, interpreting pronouns and interpreting the temporal aspects of the information.

Issues in Discourse
Anaphora Resolution: resolving referring expressions
The dog entered my room. It scared me.
Mary bought a book for Kelly. She didn't like it.
"She" refers to Mary or Kelly? Possibly Kelly.
"It" refers to what? The book.

Approaches to Natural Language Processing


Natural language processing approaches fall roughly into 3 categories:

Symbolic Approach:
Perform deep analysis of linguistic phenomena
Based on explicit representation of facts about language

Approaches to Natural Language Processing


Statistical Approach:
Employ various mathematical techniques
Use large text corpora to develop approximate generalized models of linguistic phenomena
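One of the simplest statistical models of this kind is a bigram model, which estimates how likely a word is given the previous word by counting pairs in a corpus. The bigram model is an example chosen for this sketch, not a method named in the slides. The three-sentence corpus and the function names are assumptions, and a real model would be trained on a large corpus with smoothing.

```python
from collections import Counter, defaultdict

# Tiny stand-in for a "large text corpus" (illustrative assumption).
CORPUS = [
    "the dog ate my homework",
    "the dog entered my room",
    "the man saw the dog",
]


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for s in sentences:
        tokens = ["<s>"] + s.lower().split() + ["</s>"]   # sentence boundaries
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts


def prob(counts, prev, cur):
    """Maximum-likelihood estimate of P(cur | prev); 0.0 if prev is unseen."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


counts = train_bigrams(CORPUS)
print(prob(counts, "the", "dog"))      # "the" is followed by "dog" 3 times out of 4
print(prob(counts, "my", "homework"))
```

The estimates are only approximate generalizations: any word pair missing from the corpus gets probability zero, which is why larger corpora and smoothing matter.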

Approaches to Natural Language Processing


Connectionist Approach:
Develop generalized models from examples of linguistic phenomena
Combine statistical learning with various theories of representation

Research

Microsoft Natural Language Processing Group
The team is broadening the scope of the NLP effort by developing parallel systems in several languages. The languages covered are Chinese, English, French, German, Japanese, Korean and Spanish.

Research

Canon Natural Language Processing Group
Research and development of large-vocabulary speech understanding software for interactive spoken systems.

Applications of NLP

Machine Translation: different strategies
o Systran: www.Systransoft.com
o Google: Translate.google.com
Question Answering
Information Extraction
Spell Checking
o Microsoft Spell Checker

Machine Translation
Machine Translation is the process of translating text from a source language into a target language. There are 2 types of MT:
Rule-based MT
Statistical MT

Machine Translation
Rule-based MT: explicit use and manual creation of linguistically informed rules and representations
Statistical MT: corpus-based, i.e. learned from examples of translations, called parallel or bilingual corpora
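To make "learned from examples of translations" concrete, the sketch below estimates word translation probabilities from a three-sentence parallel corpus using expectation-maximization in the style of IBM Model 1. That model is chosen for this illustration and is not named in the slides. The German-English toy corpus, the function names, and the iteration count are assumptions, and the NULL word of the full model is left out for brevity.

```python
# Toy parallel corpus (illustrative assumption): (source, target) sentence pairs.
CORPUS = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]


def train_model1(corpus, iterations=20):
    """Estimate t[(e, f)] = P(target word e | source word f) with EM."""
    pairs = [(f.split(), e.split()) for f, e in corpus]
    e_vocab = {w for _, e_sent in pairs for w in e_sent}
    # Start uniform over every word pair that co-occurs in some sentence pair.
    t = {(e, f): 1.0 / len(e_vocab)
         for f_sent, e_sent in pairs for e in e_sent for f in f_sent}
    for _ in range(iterations):
        count, total = {}, {}
        for f_sent, e_sent in pairs:
            for e in e_sent:
                norm = sum(t[(e, f)] for f in f_sent)      # E-step normaliser
                for f in f_sent:
                    frac = t[(e, f)] / norm                # expected alignment count
                    count[(e, f)] = count.get((e, f), 0.0) + frac
                    total[f] = total.get(f, 0.0) + frac
        for (e, f), c in count.items():                    # M-step: renormalise
            t[(e, f)] = c / total[f]
    return t


def best_translation(t, f_word):
    """Most probable target word for a source word under the learned table."""
    candidates = {e: p for (e, f), p in t.items() if f == f_word}
    return max(candidates, key=candidates.get)


table = train_model1(CORPUS)
for f_word in ["das", "haus", "buch", "ein"]:
    print(f_word, "->", best_translation(table, f_word))
```

No dictionary is supplied: the word pairings emerge only from which words keep appearing together across sentence pairs, which is the contrast with hand-written rules in rule-based MT.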

Applications of Machine Translation

ANGLABHARTI (1991) is a machine-aided translation system specifically designed for translating English to Indian languages, developed at IIT Kanpur. Anglabharti uses a pseudo-interlingua approach: it analyses English only once and creates an intermediate structure called PLIL (Pseudo Lingua for Indian Languages).

Applications of Machine Translation


Anusaaraka (1995): a project that started at IIT Kanpur and is now being continued at IIIT Hyderabad
Aim: translation from one Indian language to another
Anusaarakas have been built from Telugu, Kannada, Bengali, and Marathi to Hindi.
TDIL (Technology Development for Indian Languages) is also working on developing various MT tools

Question Answering
A question answering system automatically answers questions posed by humans in natural language.
Three steps are involved in question answering:
Question manipulation and classification
Matching
Answer selection
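The three steps can be sketched as three small functions chained together. Everything here is a simplifying assumption: the stopword list, the mini fact base, the question types, and the keyword-overlap scoring. The question type is returned alongside the answer but is not used to filter candidates in this sketch.

```python
# Toy question answering pipeline: classify, match, select.
# STOPWORDS, FACTS and the type labels are illustrative assumptions.
STOPWORDS = {"who", "what", "where", "when", "is", "was",
             "the", "a", "of", "did", "at", "in"}

FACTS = [
    "Anglabharti was developed at IIT Kanpur",
    "ELIZA uses keyword and pattern matching",
    "LUNAR answers questions about lunar rocks",
]


def classify_question(question):
    """Step 1: guess the expected answer type from the question word."""
    first = question.lower().split()[0]
    return {"who": "PERSON", "where": "PLACE", "when": "TIME"}.get(first, "OTHER")


def keywords(text):
    """Lowercase content words with punctuation and stopwords removed."""
    return {w.strip("?.,").lower() for w in text.split()} - STOPWORDS


def match(question, facts):
    """Step 2: score every fact by keyword overlap with the question."""
    q = keywords(question)
    return [(len(q & keywords(f)), f) for f in facts]


def select_answer(scored):
    """Step 3: return the best-scoring fact, or None if nothing overlaps."""
    score, fact = max(scored, key=lambda pair: pair[0])
    return fact if score > 0 else None


def answer(question, facts=FACTS):
    return classify_question(question), select_answer(match(question, facts))


print(answer("Where was Anglabharti developed?"))
print(answer("Who won the match?"))
```

A fuller system would use the question type to keep only candidate answers of the right kind, for example places for a "where" question.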

Applications of Question Answering


LUNAR gives access to a database containing information on lunar rock and soil composition obtained during the NASA Apollo 11 moon landing mission. It responds to natural-language queries from geologists, such as "what is the average of the basalt?"

Applications of Question Answering


ELIZA uses a keyword and pattern matching approach. It is based on sentence templates that contain keywords or phrases.
Other famous question answering systems are SHRDLU, GUS, JUPITER, QUALM, and BASEBALL.
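The keyword-and-template idea behind ELIZA can be sketched with a few regular expressions. The rules, reply templates, and function name below are hypothetical and written for this example. They are not taken from the original ELIZA script.

```python
import re

# Hypothetical keyword patterns paired with reply templates.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\b(mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(utterance):
    """Reply using the first rule whose pattern matches the input."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        m = pattern.search(text)
        if m:
            # Fill the template with the captured keyword or phrase.
            return template.format(*(g.lower() for g in m.groups()))
    return DEFAULT_REPLY


for line in ["I need a holiday.", "I am tired", "My mother called", "Hello"]:
    print(line, "->", respond(line))
```

The program understands nothing about the input: it only echoes captured fragments inside fixed templates, which is why rule order and a default reply matter.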

Future of NLP

There are many applications we can dream of with NLP techniques. How about robots that understand and follow spoken instructions, or driving by talking to the car, as in some science fiction movies? They could all be real one day. Imagine a computer system that can follow simple human instructions and do whatever we want it to do. How convenient would that be? But let's leave all that to the future.

Conclusions
A lot of research is going into developing new applications and investigating new techniques and approaches that will make statistical NLP more feasible, so we can expect to see improved applications of NLP in the near future.


Thank You
