
IBM Watson and

Old Dominion University

Watson from DeepQA to


Deep Learning
By: Armen Pischdotchian

2015 IBM Corporation

Agenda

About cognitive systems

The statistics behind DeepQA


The DeepQA Pipeline in Detail
From DeepQA to Deep Learning

About Cognitive Systems

What is common amongst cognitive systems


The three L's:
Language: are you leveraging an NLP stack?
Levels: do you score or rank returned responses?
Learning: do you employ machine learning technologies?
Coming soon to join the three L's is a fourth L:
Limbs: robotics

Natural Language Processing Challenges

Deterministic vs. Probabilistic Systems

Linear Regression

Logistic Regression

NLP terminology


When recall is more important than precision


5 relevant documents (red fish)
5 irrelevant documents (blue fish)
The search has retrieved 3 of the 5 relevant documents in the corpus, plus 1 irrelevant document.
Recall = 3 / 5 = 0.6
Precision = 3 / 4 = 0.75 (the blue fish that were not retrieved play no part in the equation).
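The arithmetic on this slide can be checked with a minimal Python sketch. The function name `precision_recall` and its parameter names are illustrative assumptions, not part of any Watson API.

```python
def precision_recall(retrieved_relevant, retrieved_irrelevant, total_relevant):
    """Return (precision, recall) for a single search result set.

    Hypothetical helper for illustration only.
    """
    retrieved = retrieved_relevant + retrieved_irrelevant
    # Precision: share of retrieved documents that are relevant.
    precision = retrieved_relevant / retrieved if retrieved else 0.0
    # Recall: share of all relevant documents that were retrieved.
    recall = retrieved_relevant / total_relevant if total_relevant else 0.0
    return precision, recall


# The fish example: 3 red fish and 1 blue fish in the net, 5 red fish in total.
precision, recall = precision_recall(3, 1, 5)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Irrelevant documents that were never retrieved do not appear in either ratio, which is why only the one blue fish in the net affects precision.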

These images are from www.lucidata.inc


The case of 100% recall and low precision


5 relevant documents (red fish)
5 irrelevant documents (blue fish)
In Watson Discovery Advisor, this is the preferred scenario, even though some irrelevant documents may receive a high score.
The algorithm team will then work on increasing the precision of this system.
What would be the preferred outcome for the Watson Engagement Advisor?

The case of 100% precision and low recall


5 relevant documents (red fish)
5 irrelevant documents (blue fish)
Zero false positives, 100% precision
No blue fish in the net
But there are many false negatives
Many red fish in the sea
There are potentially many relevant documents that we will never consider.
Perfect precision with poor recall is of no value to a DeepQA system.
These images are from www.lucidata.inc

Precision and accuracy in Jeopardy!

Stage 2: Hypothesis Generation: Precision vs. percentage attempted

Copyright 2010, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602

Search Engine vs. Question Answering System

A QA system demands more processing from the system and less analysis from the user than a search engine does.

The DeepQA Pipeline

An example Jeopardy! question

[Figure: the DeepQA pipeline applied to a Jeopardy! clue]
Clue: IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE
Question Analysis: Keywords: 1698, comet, paramour, pink; AnswerType(comet discoverer); Date(1698); Took(discoverer, ship); Called(ship, Paramour Pink)
Primary Search: Related Content (Structured & Unstructured)
Candidate Answer Generation: Isaac Newton, Wilhelm Tempel, HMS Paramour, Christiaan Huygens, Halley's Comet, Edmond Halley, Pink Panther, Peter Sellers
Evidence Retrieval and Evidence Scoring: models produce a feature vector per candidate across lexical, temporal, taxonomic, and spatial dimensions, e.g. Isaac Newton [0.58 0 -1.3 0.97], Edmond Halley [0.21 1 11.1 0.92]
Merging & Ranking:
1) Edmond Halley (0.85)
2) Christiaan Huygens (0.20)
3) Peter Sellers (0.05)

How Watson responds to a Question

[Figure: pipeline diagram. Question, Question Analysis, Primary Search (over Wikipedia etc.), Candidate Answer Generation, Evidence Retrieval, Answer Scoring and Contextual Answer Scoring, Final Merging and Ranking (using Trained Models), then Answer, Confidence, Evidence]

Question Analysis (QA) Overview

What is Question Analysis?
Question Analysis is the first stage in the Watson pipeline
Ultimate goal: Understand what is being asked
Various algorithms and technologies identify as much as possible about the input question
Named Entity Detection
Natural Language Processing (NLP)
Shallow and Deep Semantic Relation Detection
All downstream components rely on the annotations produced by QA

Stage 1: Question Analysis

Question analysis technologies include
Part-of-speech parsing technology
Named Entity Detection
Relation Extraction
Inverse Document Frequency (IDF)

Stage 2: Hypothesis Generation

[Figure: pipeline diagram highlighting Question, Question Analysis, and Primary Search]

Stage 2: Hypothesis Generation: Primary search


Who is the 44th President of the
United States?
Keywords:
44th President United States
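As a rough illustration of how a question reduces to search keywords, the sketch below drops common function words. The stopword list and the function name `extract_keywords` are assumptions made for this example; the slide does not say how Watson selects keywords.

```python
import re

# Hypothetical, deliberately tiny stopword list chosen for this example.
STOPWORDS = {"who", "what", "is", "the", "of", "a", "an", "in", "on"}


def extract_keywords(question):
    """Return the question's tokens with stopwords removed, order preserved."""
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    return [t for t in tokens if t.lower() not in STOPWORDS]


keywords = extract_keywords("Who is the 44th President of the United States?")
print(" ".join(keywords))
```

The resulting keyword string is what a primary search would submit to the underlying index in this simplified picture.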

[Figure: pipeline diagram highlighting Question, Question Analysis, and Primary Search]

Stage 2: Hypothesis Generation: Candidate Answer Generation


Who is the 44th President of the United States?
Barack Obama
George W. Bush
Harvard Law School
Illinois

[Figure: pipeline diagram highlighting Question, Question Analysis, Primary Search, and Candidate Answer Generation]

Stage 3: Hypothesis Scoring

What is Hypothesis Scoring?
Enumeration of the annotators responsible for scoring previously generated candidate answers
The results produced by these scorers are ranked by the Merging and Ranking components to produce a ranked list of answers.
Outcome: a confidence level for each generated hypothesis
Scorers can produce results in any (reasonable) range
In the final merging step, scorers are normalized according to how well their scoring heuristic correlates with the correct answer
Normalized to [0..1] in final merging
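To make the idea of bringing scorers with different ranges onto [0..1] concrete, here is a minimal sketch using plain min-max rescaling. This is a stand-in technique chosen for illustration: the slide describes normalization driven by how well each scorer correlates with correct answers, which would be learned from training data and is not shown here. The function name and the raw values are assumptions.

```python
def min_max_normalize(scores):
    """Rescale one scorer's raw outputs linearly onto [0, 1].

    Hypothetical helper; not the learned normalization DeepQA uses.
    """
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates tie, so the scorer carries no ranking signal.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Illustrative raw outputs of one scorer for four candidate answers.
raw = [-1.3, 13.4, 11.1, -8.2]
normalized = min_max_normalize(raw)
print([round(v, 2) for v in normalized])
```

Rescaling preserves each scorer's ordering of candidates while making values comparable across scorers before they are merged.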


Hypothesis Scoring - components


[Figure: Hypothesis & Evidence Scoring. Hypotheses pass through scorers such as Textual Alignment, Term and nGram Matching, and Logical Form Analysis to produce Evidence Features. Surrounding pipeline: Question, Question/Topic Analysis, Hypothesis Generation, Hypothesis & Evidence Scoring, Final Merging & Ranking (using Trained Models), then Answer, Confidence, Evidence]

AnswerIdf scorer
Context-independent scorer
Uses a concept referred to as Inverse Document Frequency
Ratio of total documents to documents containing the target text
Target text = candidate answer text
Large corpus (e.g., Wikipedia)
Lucene formula
Log scale
Scores in range (0 to inf)
Higher score indicates more informativeness (answer text appears in few documents)
Example
10,000 documents
Answer text appears in only 10 documents
Log (10,000 / 10) = Log (1,000) = 3
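The worked example can be reproduced with a short sketch. It follows the slide's base-10 ratio exactly; it is not Lucene's own IDF formula, which adds smoothing terms. The function name `answer_idf` is an assumption for illustration.

```python
import math


def answer_idf(total_docs, docs_with_answer):
    """Base-10 log of (total documents / documents containing the answer text).

    Hypothetical helper mirroring the slide's example, not Lucene's formula.
    """
    if not 0 < docs_with_answer <= total_docs:
        raise ValueError("docs_with_answer must be between 1 and total_docs")
    return math.log10(total_docs / docs_with_answer)


# 10,000 documents, answer text found in only 10 of them.
print(answer_idf(10_000, 10))
```

An answer string that appears in every document scores 0, and the score grows as the string becomes rarer in the corpus.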


Textual Alignment Answer Scorer

Surface similarity measurement
Question
Supporting passage
Dynamic programming for subsequence alignment
Consider the following example:
Who led the Allied forces on the European front during World War 2?
Dwight D. Eisenhower was supreme commander of Allied forces during the D-Day invasion and European front during World War 2.
--Overlap is significant
Now, consider the example:
In 1698, what comet discoverer took a ship called the Paramour Pink on the first purely scientific sea voyage?
Edmund Halley made probably the first primarily scientific voyage to study the variation of the magnetic compass
--Fewer textual overlaps, likely with lower IDF scores
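One simple form of dynamic programming for subsequence alignment is the longest common subsequence (LCS) over word tokens. The sketch below uses LCS as a stand-in to compare the two examples; the slide does not specify which alignment algorithm or scoring Watson uses, and the function names and the normalization by question length are assumptions.

```python
import re


def tokens(text):
    """Lowercase word and number tokens, punctuation dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def alignment_score(question, passage):
    """Fraction of question tokens aligned, in order, with the passage."""
    q = tokens(question)
    if not q:
        return 0.0
    return lcs_length(q, tokens(passage)) / len(q)


q1 = "Who led the Allied forces on the European front during World War 2?"
p1 = ("Dwight D. Eisenhower was supreme commander of Allied forces during "
      "the D-Day invasion and European front during World War 2.")
q2 = ("In 1698, what comet discoverer took a ship called the Paramour Pink "
      "on the first purely scientific sea voyage?")
p2 = ("Edmund Halley made probably the first primarily scientific voyage "
      "to study the variation of the magnetic compass")

print(alignment_score(q1, p1), alignment_score(q2, p2))
```

Under this measure the Eisenhower passage aligns with far more of its question than the Halley passage does, matching the slide's observation about overlap.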


Who is the 44th President of the United States?


Barack Hussein Obama II (born August 4, 1961) is the 44th and current President of the United States.
George Walker Bush (born July 6, 1946) is an American politician who served as the 43rd President of the United States from 2001 to 2009 and the 46th Governor of Texas from 1995 to 2000.

Barack Obama is the 44th President of the United States
George W. Bush is the 44th President of the United States
Harvard Law School is the 44th President of the United States
Illinois is the 44th President of the United States

[Figure: pipeline diagram highlighting Question, Question Analysis, Search, Answer Scoring, and Contextual Answer Scoring]

Barack Obama .95
George W. Bush .80
Harvard Law School .05
Illinois .10

Stage 4: Final Merger and Ranking


[Figure: pipeline diagram. Question, Question Analysis, Primary Search (over Wikipedia etc.), Candidate Answer Generation, Answer Scoring and Contextual Answer Scoring, then Final Merging and Ranking (using Trained Models), which outputs Answer, Confidence, Evidence]

Challenge: Heterogeneous feature types and values

Stage 4: Final Merger and Ranking: confidence scoring

Who is the 44th President of the United States?

Candidate Answer      Answer Scoring   Contextual Answer Scoring   Confidence
Barack Obama          0.90             0.90                        .95
George W. Bush        0.90             0.80                        .65
Harvard Law School    0.10             0.05                        .05
Illinois              0.15             0.10                        .10

Watson is Deep Learning

University of Texas Watson university competition demo


Watson is going Deep Learning
