Discourse/dialogue analysis
(some slides borrowed from D. Jurafsky and from S. Ponzetto)
[Figure: the NLP pipeline: sound waves → ASR → words → syntactic processing → parses → semantic processing → meaning → discourse/dialogue processing → meaning in context. Analysis levels: morphology, words, syntax, semantics. Applications: QA, IE, IR, D&D.]
Discourse/dialogue analysis
So far we have always analyzed one sentence in isolation, syntactically and/or semantically
Natural languages are spoken or written as collections of sentences
In general, a sentence or utterance cannot be understood in isolation
A dialog example
Xu and Rudnicky (2000)
A discourse example
John went to the bank to deposit his paycheck.
He then took a train to Bill's car dealership.
He needed to buy a car.
The company he works for now isn't near any public transportation.
He also wanted to talk to Bill about their softball league.
Issues in discourse/dialogue
Dialogue
Turn-taking
Speech act
Grounding
Dialogue management
Discourse
Segmentation, coherence relations
Both
Anaphora
Co-reference
1. Turn-taking
Dialogue is characterized by
turn-taking.
[Example: a conversation with alternating turns A:, B:, A:, B:]
Turn-taking rules
Sacks et al. (1974)
Issues in discourse/dialogue
Dialogue
Turn-taking
Speech act
Grounding
Dialogue management
Discourse
Segmentation, coherence relations
Both
Anaphora
Co-reference
2. Speech Acts
Speakers contribute more information
than just what is said
Speech Acts can give a principled
account of additional meaning
Speech Act Theory can also help us
examine utterances from the
perspective of their function, rather
than their form
Classification of SAs according to force
Locutionary force (what is said)
"Bring the chair to the dining room"
Illocutionary force (what is done)
The robot is asked to grasp a chair and change its current position
Perlocutionary force (the effect)
The position of the robot and chair changes to the dining room (if the action is successfully performed)
[Table fragment: utterance forms (interrogative, declarative) vs. illocutionary and perlocutionary force; e.g. "Give me your sandwich!" is a directive, whose perlocutionary effect is as above]
Issues in discourse/dialogue
Dialogue
Turn-taking
Speech act
Grounding
Dialogue management
Discourse
Segmentation, coherence relations
Both
Anaphora
Co-reference
3. Grounding
Why do elevator buttons light up?
Clark (1996) (after Norman 1988)
Principle of closure. Agents
performing an action require
evidence, sufficient for current
purposes, that they have
succeeded in performing it
What is the linguistic correlate of
this?
Grounding
Need to know whether an action
succeeded or failed
Dialogue is also an action
a collective action performed by speaker and
hearer
Common ground: set of things mutually
believed by both speaker and hearer
Acknowledgement:
B nods or says a continuer like uh-huh, yeah, or an assessment (great!)
Demonstration:
B demonstrates understanding of A by paraphrasing or reformulating A's contribution, or by collaboratively completing A's utterance
Display:
B displays verbatim all or part of A's presentation
A human-human
conversation
Grounding examples
Display:
C: I need to travel in May
A: And, what day in May did you want to
travel?
Acknowledgement
C: He wants to fly from Boston
A: mm-hmm
C: to Baltimore Washington International
[Mm-hmm (usually transcribed uh-huh) is a backchannel, continuer, or acknowledgement token]
Grounding negative responses
From Cohen et al. (2004)
Example
Issues in discourse/dialogue
Dialogue
Turn-taking
Speech act
Grounding
Dialogue management
Discourse
Segmentation, coherence relations
Both
Anaphora
Co-reference
Dialogue management
A dialogue system is designed for some purpose (e.g. a flight reservation)
Unlike in discourse analysis, a structure must be determined a priori to guide the conversation
Dialogue manager: forces the dialogue between user and system to follow one or more structures
For spoken dialogue systems, the most common approaches are:
Finite-state dialogue manager
Frame-and-slot semantics
Agent-based dialogue manager
Finite-state dialogue
managers
The system completely controls the conversation with the user.
It asks the user a series of questions,
ignoring (or misinterpreting) anything the user says that is not a direct answer to the system's questions
Finite-state approach
Pros
simple to write
very robust and quick
Cons
the system directs the entire conversation
user actions are very limited
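The approach above can be sketched as a small state machine. A minimal sketch, assuming invented state names, prompts, and transitions (they are not from any real system):

```python
# Minimal finite-state dialogue manager sketch. The states, prompts,
# and transitions are illustrative assumptions, not from a real system.
STATES = {
    "ask_dest": ("What is your destination?", "ask_date"),
    "ask_date": ("What day do you want to travel?", "ask_time"),
    "ask_time": ("What time would you like to leave?", "done"),
}

def run_dialogue(user_answers):
    """Walk the fixed question graph, one answer per state.

    Whatever the user says is taken as the answer to the current
    question: the simplicity (and rigidity) noted in the cons above.
    """
    slots, state, transcript = {}, "ask_dest", []
    answers = iter(user_answers)
    while state != "done":
        prompt, next_state = STATES[state]
        transcript.append(prompt)      # system asks its fixed question
        slots[state] = next(answers)   # user input is never re-interpreted
        state = next_state
    return slots, transcript

filled, asked = run_dialogue(["London", "Friday", "morning"])
```

The system drives everything: the user cannot volunteer "London on Friday" in a single turn, which is exactly the restriction the frame-based approach relaxes.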
Frame-based Approach
Frame-based system
Asks the user questions to fill slots in a template in order to perform a task (form-filling task)
Permits the user to respond more flexibly to the system's prompts (see Example 2)
Recognizes the main concepts in the user's utterance
Example 1)
System: What is your destination?
User: London.
System: What day do you want to travel?
User: Friday.
Example 2)
System: What is your destination?
User: London on Friday around 10 in the morning.
System: I have the following connection ...
Frame/Slot semantics
Show me morning flights from Boston to SF
on Tuesday.
SHOW:
  FLIGHTS:
    ORIGIN:
      CITY: Boston
      DATE: Tuesday
      TIME: morning
    DEST:
      CITY: San Francisco
Slot       Question
IDENTIFY   What is your name?
ORIGIN     What city are you leaving from?
DEST       Where are you going?
DEPT DATE  What day would you like to leave?
DEPT TIME  What time would you like to leave?
AIRLINE    What is your preferred airline?
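A frame-based manager can be sketched as a slot dictionary plus a recognizer. In this sketch the slot names mirror the table above (with underscores), while the regex patterns are toy assumptions standing in for a real concept recognizer:

```python
import re

# Toy frame-based slot filler. Slot names mirror the slot/question
# table; the regex patterns are illustrative stand-ins for a real
# concept recognizer.
SLOT_PATTERNS = {
    "DEST":      r"\b(London|Paris|Boston|Baltimore)\b",
    "DEPT_DATE": r"\b(Monday|Tuesday|Wednesday|Thursday|Friday)\b",
    "DEPT_TIME": r"\b(\d{1,2} in the morning|morning|afternoon|evening)\b",
}

QUESTIONS = {
    "DEST":      "What is your destination?",
    "DEPT_DATE": "What day would you like to leave?",
    "DEPT_TIME": "What time would you like to leave?",
}

def fill_slots(frame, utterance):
    """Fill every empty slot mentioned in the utterance (several at once)."""
    for slot, pattern in SLOT_PATTERNS.items():
        if frame.get(slot) is None:
            m = re.search(pattern, utterance, re.IGNORECASE)
            if m:
                frame[slot] = m.group(0)
    return frame

def next_question(frame):
    """Ask about the first empty slot; None means the frame is complete."""
    for slot, question in QUESTIONS.items():
        if frame.get(slot) is None:
            return question
    return None

# Example 2 above: one user turn fills all three slots.
frame = fill_slots(dict.fromkeys(SLOT_PATTERNS),
                   "London on Friday around 10 in the morning")
```

With a partially filled frame, `next_question` would prompt for the first missing slot, which is what makes the mixed single-answer/over-informative behaviour possible.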
Frame-based approaches
Advantages
The ability to use natural language, multiple slot filling
The system processes the user's over-informative answers and corrections
Disadvantages
Only appropriate for well-defined tasks in which the system takes the initiative in the dialogue
Difficult to predict which rule is likely to fire in a particular context
Agent-based approaches
Properties
Examples
User: I'm looking for a job in the Calais area. Are there any servers?
System: No, there aren't any employment servers for Calais. However, there is an employment server for Pas-de-Calais and an employment server for Lille. Are you interested in one of these?
The system attempts to provide a more co-operative response that might address the user's needs.
Agent-based Approach
Advantages
Suitable to more complex dialogues
Mixed-initiative dialogues
Disadvantages
Much more complex resources and
processing
Sophisticated natural language capabilities
Complicated communication between
dialogue modules
CMU-systems (Olympus)
Issues in discourse/dialogue
Dialogue
Turn-taking
Speech act
Grounding
Dialogue management
Discourse
Segmentation, coherence relations
Both
Anaphora
Co-reference
Discourse segmentation
Separating a document into a linear
sequence of subtopics
For example: scientific articles are segmented
into Abstract, Introduction, Methods, Results,
Conclusions
Note: this is a simplification; a discourse might have a more complex structure
Applications:
Summarization: summarize each segment
separately
Information Retrieval / Information Extraction:
Apply to an appropriate, i.e. relevant segment
Discourse segmentation
Example: a 21-paragraph article called "Stargazers"
Unsupervised Discourse
Segmentation
Unsupervised = uses no training
data
Typically cohesion-based: segment
text into subtopics in which
sentences/paragraphs are cohesive
with each other
Cohesion: use of linguistic devices
to establish links between textual
units
Lexical cohesion: use of the same or similar words (e.g. hypernyms, hyponyms) in neighbouring units
[Figure: term frequency by sentence number in the Stargazers article]
TextTiling: Pre-processing
Convert the text stream into terms (words)
Remove stop-words
E.g. the, a, of
Divide the text into token sequences (pseudo-sentences) of equal length (say 20 words)
Compute a lexical cohesion score at each gap:
similarity of the blocks before and after the gap
each block is made of k pseudo-sentences
cosine similarity between the blocks' word vectors
[Figure: similarity score plotted at each gap]
TextTiling: Boundary identification
Compute the depth score of each gap
Distance from the peaks on both sides: (a - b) + (c - b),
where b is the similarity at the gap and a, c are the peak similarities to its left and right
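The whole TextTiling pipeline (pre-processing, gap scores, depth scores) can be sketched end to end. A minimal sketch: the stop-word list, pseudo-sentence length, block size k, and the use of the per-side maximum as the "peak" are simplifying assumptions:

```python
import math
import re

# Toy end-to-end TextTiling sketch; parameters are illustrative choices.
STOP = {"the", "a", "of", "and", "to", "in", "is"}

def pseudo_sentences(text, length=20):
    """Pre-processing: lowercase terms, stop-words removed, fixed-length chunks."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP]
    return [words[i:i + length] for i in range(0, len(words), length)]

def cosine(block1, block2):
    """Cosine similarity between the word-count vectors of two blocks."""
    v1, v2 = {}, {}
    for ps in block1:
        for w in ps:
            v1[w] = v1.get(w, 0) + 1
    for ps in block2:
        for w in ps:
            v2[w] = v2.get(w, 0) + 1
    dot = sum(c * v2.get(w, 0) for w, c in v1.items())
    n1 = math.sqrt(sum(c * c for c in v1.values()))
    n2 = math.sqrt(sum(c * c for c in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def gap_scores(pss, k=2):
    """Cohesion score at each gap: similarity of the k pseudo-sentences
    before vs. after the gap."""
    return [cosine(pss[max(0, g - k):g], pss[g:g + k])
            for g in range(1, len(pss))]

def depth_scores(scores):
    """Depth at a gap: (a - b) + (c - b), with b the score at the gap and
    a, c approximated here by the highest score on each side."""
    return [(max(scores[:i + 1]) - b) + (max(scores[i:]) - b)
            for i, b in enumerate(scores)]

# Two artificial subtopics: the deepest valley marks the topic shift.
text = "cats cats purr cats " * 10 + "dogs dogs bark dogs " * 10
pss = pseudo_sentences(text)
scores = gap_scores(pss)
depths = depth_scores(scores)
```

On this toy text the cohesion score drops to zero exactly at the cats/dogs boundary, so the largest depth score falls at that gap.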
Issues in discourse/dialogue
Dialogue
Turn-taking
Speech act
Grounding
Dialogue management
Discourse
Segmentation, coherence relations
Both
Anaphora
Co-reference
Text Coherence
A collection of independent sentences does not make a discourse, because it lacks coherence
Coherence: a meaning relation between two units of text;
how the meanings of different units of text combine to build the meaning of the larger unit
Explanation
John hid Bill's car keys. He was drunk.
???
John hid Bill's car keys. He likes spinach.
Coherence Relations
Other relations (Hobbs, 1979):
Result
The Tin Woodman was caught in the rain. His joints rusted.
Parallel
Dorothy was from Kansas. She lived in the midst of the great Kansas prairies.
Occasion
Dorothy picked up the oil-can. She oiled the Tin Woodman's joints.
Discourse Structure
The hierarchical structure of a
discourse according to the
coherence relations
Analogous to syntactic tree structure
A node in a tree represents locally
coherent sentences: discourse
segment
Discourse Parsing
Coherence Relation Assignment:
determine automatically the coherence
relations between units of a discourse
Discourse Parsing: find automatically
the discourse structure of an entire
discourse
Example of approaches:
Unsupervised based on cue phrases (or
discourse markers)
Supervised, based on discourse treebanks, cf. the Penn Discourse Treebank
(http://www.seas.upenn.edu/~pdtb)
Automatic Coherence
Assignment
Shallow cue-phrase-based algorithm:
1. Identify cue phrases in a text
2. Segment text into discourse segments
using cue phrases
3. Assign coherence relations between
consecutive discourse segments
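The three steps above can be sketched with a hand-written cue inventory. In this sketch the cue phrases and the relation each cue signals are illustrative assumptions, far coarser than a real inventory:

```python
import re

# Sketch of the shallow cue-phrase algorithm: find cues, split at them,
# label the relation between consecutive segments. The cue inventory and
# the relation each cue signals are illustrative assumptions.
CUES = {
    "because": "Explanation",
    "therefore": "Result",
    "then": "Occasion",
    "however": "Contrast",
}

def segment_and_relate(text):
    """1. find cue phrases; 2. split the text at them; 3. label the
    relation between each pair of consecutive segments."""
    pattern = r"\b(" + "|".join(CUES) + r")\b"
    parts = re.split(pattern, text, flags=re.IGNORECASE)
    segments, relations = [parts[0].strip()], []
    for cue, segment in zip(parts[1::2], parts[2::2]):
        relations.append(CUES[cue.lower()])
        segments.append(segment.strip())
    return segments, relations

segs, rels = segment_and_relate(
    "John hid the keys because Bill was drunk. Then Bill went home.")
```

Because `re.split` keeps the captured cues, segments and cues alternate in the result, so each cue can label the relation between the segment before and after it.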
Identifying Rhetorical Structure (RS) Automatically
(Marcu 1999)
A supervised parser trained on a discourse
treebank
Issues in discourse/dialogue
Dialogue
Turn-taking
Speech act
Grounding
Dialogue management
Discourse
Segmentation, coherence relations
Both
Anaphora
Co-reference
Task definition
IDENTIFYING WHICH
MENTIONS REFER TO
THE SAME
(DISCOURSE) ENTITY
Anaphora vs. Coreference
COREFERENT, not ANAPHORIC:
two mentions of the same object in different documents
Obama was interviewed last night. The President ...
Nominal anaphoric
expressions
REFLEXIVE PRONOUNS:
John bought himself a hamburger
PRONOUNS:
Definite pronouns: Ross bought {a radiometer | three kilograms of after-dinner mints} and gave {it | them} to Nadia for her birthday. (Hirst, 1981)
Indefinite pronouns: Sally admired Sue's jacket, so she got one for Christmas. (Garnham, 2001)
DEFINITE DESCRIPTIONS:
A man and a woman came into the room. The man sat down.
Epithets: A man ran into my car. The idiot wasn't looking where he was going.
DEMONSTRATIVES:
Tom has been caught shoplifting. That boy will turn out ...
GAPPING:
Nadia brought the food for the picnic, and Daryel _ the wine.
TEMPORAL REFERENCES:
In the mid-Sixties, free love was rampant across campus. It was then that Sue turned to Scientology. (Hirst, 1981)
LOCATIVE REFERENCES:
The Church of Scientology met in a secret room behind the local Colonel Sanders chicken stand. Sue had her first dianetic experience there. (Hirst, 1981)
Ross bought {a radiometer | three kilograms of after-dinner mints} and gave {it | them} to Nadia for her birthday.
Identity of SENSE
BOUND anaphora
Associative anaphora
(a type of BRIDGING)
Toni Johnson pulls a tape measure across the front of what was
once a stately Victorian home.
A deep trench now runs along its north wall, exposed when the
house lurched two feet off its foundation during last week's
earthquake.
Once inside, she spends nearly four hours measuring and
diagramming each room in the 80-year-old house, gathering
enough information to estimate what it would cost to rebuild it.
While she works inside, a tenant returns with several friends to
collect furniture and clothing.
One of the friends sweeps broken dishes and shattered glass
from a countertop and starts to pack what can be salvaged
from the kitchen.
(WSJ section of Penn Treebank
corpus)
Interpreting anaphoric
expressions
[Figure: pairwise coreference decisions over mentions, e.g. whether [Israel], Iraq, and Iraqi corefer (coref? / not coref)]
Clustering Algorithm
[Figure: mentions clustered into entities, e.g. {Israel, the Jewish state, its}, {USA, US, United States}, {a jittery public}]
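Turning pairwise decisions like those in the figure into entities amounts to taking a transitive closure. A minimal sketch, where the mentions and the positive pairs are hard-coded stand-ins for a mention-pair classifier's output:

```python
# Clustering pairwise coreference decisions into entities via
# union-find (transitive closure). Mentions and positive pairs are
# hard-coded stand-ins for a mention-pair classifier's output.

def cluster_mentions(mentions, coreferent_pairs):
    parent = {m: m for m in mentions}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for a, b in coreferent_pairs:
        parent[find(a)] = find(b)          # merge the two clusters

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), set()).add(m)
    return list(clusters.values())

mentions = ["Israel", "the Jewish state", "its",
            "USA", "US", "United States"]
pairs = [("Israel", "the Jewish state"), ("the Jewish state", "its"),
         ("USA", "US"), ("US", "United States")]
entities = cluster_mentions(mentions, pairs)
```

Note that a single wrong positive pair would merge two entities entirely; this brittleness of transitivity is a known weakness of mention-pair models.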
[Figure: semantic type hierarchy used to check agreement between mentions: PERSON (MALE, FEMALE), OBJECT, ORGANIZATION, LOCATION, TIME, DATE, MONEY, PERCENT; e.g. CAR HAS WHEELS]
THE END