What is Parsing?
Grammar:
  S  → NP VP      NP  → Papa
  NP → Det N      N   → caviar
  NP → NP PP      N   → spoon
  VP → V NP       V   → spoon
  VP → VP PP      V   → ate
  PP → P NP       P   → with
                  Det → the
                  Det → a
[Figure: parse tree built from these rules; the surviving labels include "ate" and "a spoon".]
The parsing problem
[Figure: test sentences and a Grammar are fed to the PARSER; a scorer compares its output with the correct test trees and reports accuracy.]
Recent parsers are quite accurate … good enough to help a range of NLP tasks!
Warning: these slides are out of date
[Figure: parse tree for 3 + 5 * 6 over the nonterminals E, F, and N, evaluated bottom-up: the mult node over 5 and 6 yields 30, which the add node then combines with 3.]
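The bottom-up evaluation of the 3 + 5 * 6 tree can be sketched in Python. The nested-tuple tree encoding and the `evaluate` helper are assumptions made for illustration, not something the slides define.

```python
import operator

# Hypothetical encoding: a leaf is a number; an internal node is
# (operation_name, left_subtree, right_subtree).
OPERATIONS = {"add": operator.add, "mult": operator.mul}

def evaluate(tree):
    """Evaluate a parse tree bottom-up: children first, then the parent."""
    if isinstance(tree, (int, float)):
        return tree
    op_name, left, right = tree
    return OPERATIONS[op_name](evaluate(left), evaluate(right))

# Parse of 3 + 5 * 6, with the multiplication nested under the addition.
tree = ("add", 3, ("mult", 5, 6))
product = evaluate(tree[2])   # value at the mult node: 30
total = evaluate(tree)        # value at the add node: 33
```

The meaning of the whole expression is computed only from the meanings of its subtrees, which is the point of attaching semantics to parse-tree nodes.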
[Figure: syntax tree for "Every nation wants George to love Laura" (nodes Det, N, T, VPstem, Vstem, Sinf, NP, VPinf), with a meaning attached to each word:]
  -s:     λv λx λe present(e), v(x)(e)
  want:   λy λx λe act(e,wanting), wanter(e,x), wantee(e,y)
  love:   λy λx λe act(e,loving), lover(e,x), lovee(e,y)
  to:     λa a
  George: G
  Laura:  L
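An entry such as `act(e,loving), lover(e,x), lovee(e,y)` can be read as a function that waits for its arguments y, x, and e in turn. The sketch below imitates that with Python closures. Representing a meaning as a set of tuples, the function name `love`, and the event label `"e1"` are assumptions made for illustration; tense and the quantifier "every" are left out.

```python
def love(y):
    """Curried meaning of 'love': takes the lovee y, then the lover x, then the event e."""
    return lambda x: lambda e: {
        ("act", e, "loving"),
        ("lover", e, x),
        ("lovee", e, y),
    }

# Vstem + NP: "love Laura" is still waiting for a lover and an event.
love_laura = love("Laura")

# Supplying the subject "George" leaves a property of events.
george_love_laura = love_laura("George")

# Applying it to a (hypothetical) event label yields the facts about that event.
facts = george_love_laura("e1")
```

Each function application corresponds to one node of the tree combining the meanings of its children.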
600.465 - Intro to NLP - J. Eisner
Now let’s develop some parsing algorithms!
  S  → NP VP      NP  → Papa
  NP → Det N      N   → caviar
  NP → NP PP      N   → spoon
  VP → V NP       V   → spoon
  VP → VP PP      V   → ate
  PP → P NP       P   → with
                  Det → the
                  Det → a
“Papa ate the caviar with a spoon”
0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6 spoon 7
First try … does it work?
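To find a new phrase spanning positions 1 to 7, combine two adjacent old phrases, trying every split point:
  [1,7] = [1,2] + [2,7]
  [1,7] = [1,3] + [3,7]
  [1,7] = [1,4] + [4,7]
  [1,7] = [1,5] + [5,7]
  [1,7] = [1,6] + [6,7]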
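The enumeration of split points for one span can be sketched as follows; `split_points` is a hypothetical helper name, and spans are written as (start, end) pairs of positions between words.

```python
def split_points(start, end):
    """All ways to cut the span [start, end] into a left span and a right span."""
    return [((start, mid), (mid, end)) for mid in range(start + 1, end)]

# The five ways to build a phrase from position 1 to position 7.
splits = split_points(1, 7)
# [((1, 2), (2, 7)), ((1, 3), (3, 7)), ((1, 4), (4, 7)), ((1, 5), (5, 7)), ((1, 6), (6, 7))]
```

Every span on the right-hand side is narrower than [1,7], which is why building narrow spans before wide ones guarantees that the parts already exist.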
Avoid duplicate work:
Build width-1, then width-2, etc.
CKY algorithm, recognizer version
for J := 1 to n
    Add to [J-1,J] all categories for the Jth word
for width := 2 to n
    for start := 0 to n-width                 // this is I
        Define end := start + width           // this is J
        for mid := start+1 to end-1           // find all I-to-J phrases
            for every nonterminal Y in [start,mid]
                for every nonterminal Z in [mid,end]
                    for all nonterminals X
                        if X → Y Z is in the grammar
                            then add X to [start,end]
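A minimal Python sketch of this recognizer, run on the example sentence. The data layout is an assumption made for illustration: binary rules as (X, Y, Z) triples, the lexicon as a dict from word to its set of categories, and the chart as a dict from (start, end) to a set of categories. For brevity the innermost two steps are written as one scan over the rule list rather than a loop over all nonterminals X.

```python
from collections import defaultdict

BINARY_RULES = [
    ("S", "NP", "VP"), ("NP", "Det", "N"), ("NP", "NP", "PP"),
    ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP"),
]
LEXICON = {
    "Papa": {"NP"}, "caviar": {"N"}, "spoon": {"N", "V"}, "ate": {"V"},
    "with": {"P"}, "the": {"Det"}, "a": {"Det"},
}

def cky_recognize(words, binary_rules, lexicon, start_symbol="S"):
    """Return (accepted, chart) for a grammar with binary and lexical rules only."""
    n = len(words)
    chart = defaultdict(set)            # chart[(start, end)] = set of categories
    for j in range(1, n + 1):           # width-1 spans: categories of the Jth word
        chart[(j - 1, j)] |= lexicon.get(words[j - 1], set())
    for width in range(2, n + 1):       # narrow spans before wide ones
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for y in chart[(start, mid)]:
                    for z in chart[(mid, end)]:
                        for x, rule_y, rule_z in binary_rules:
                            if rule_y == y and rule_z == z:
                                chart[(start, end)].add(x)
    return start_symbol in chart[(0, n)], chart

accepted, chart = cky_recognize(
    "Papa ate the caviar with a spoon".split(), BINARY_RULES, LEXICON
)
# accepted is True; chart[(1, 7)] is {"VP"} and chart[(2, 7)] is {"NP"}
```

Because this is a recognizer, the chart stores only which categories are possible for each span, not how they were built.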
Loose ends to tie up (no slides)
CKY algorithm, recognizer version
for J := 1 to n
    Add to [J-1,J] all categories for the Jth word
for width := 2 to n
    for start := 0 to n-width                 // this is I
        Define end := start + width           // this is J
        for mid := start+1 to end-1           // find all I-to-J phrases
            for every nonterminal Y in [start,mid]
                for every nonterminal Z in [mid,end]
                    for every X → Y Z in the grammar
                        add X to [start,end]
CKY algorithm, recognizer version
for J := 1 to n
    Add to [J-1,J] all categories for the Jth word
for width := 2 to n
    for start := 0 to n-width                 // this is I
        Define end := start + width           // this is J
        for mid := start+1 to end-1           // find all I-to-J phrases
            for every nonterminal Y in [start,mid]
                for every X → Y Z in the grammar
                    if Z in [mid,end]
                        then add X to [start,end]
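This ordering of the inner loops suggests indexing the grammar by the left child Y, so that only rules beginning with Y are visited. Below is a sketch of that single combination step. The helper names `index_by_left_child` and `combine`, and the dict-of-sets chart, are assumptions made for illustration.

```python
from collections import defaultdict

def index_by_left_child(binary_rules):
    """Map each left child Y to the (X, Z) pairs of the rules X -> Y Z."""
    index = defaultdict(list)
    for x, y, z in binary_rules:
        index[y].append((x, z))
    return index

def combine(chart, rules_by_left, start, mid, end):
    """Add to [start,end] every X with X -> Y Z, Y in [start,mid], Z in [mid,end]."""
    right_cell = chart.get((mid, end), set())
    for y in chart.get((start, mid), set()):
        for x, z in rules_by_left.get(y, []):
            if z in right_cell:                      # constant-time set lookup
                chart.setdefault((start, end), set()).add(x)

rules = [("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP")]
chart = {(1, 2): {"V"}, (2, 4): {"NP"}}   # "ate" and "the caviar"
combine(chart, index_by_left_child(rules), 1, 2, 4)
# chart[(1, 4)] is now {"VP"}
```

With the right-hand cell stored as a set, the test "Z in [mid,end]" is a hash lookup rather than a loop over the cell.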
Alternative version of inner loops
for J := 1 to n
    Add to [J-1,J] all categories for the Jth word
for width := 2 to n
    for start := 0 to n-width                 // this is I
        Define end := start + width           // this is J
        for mid := start+1 to end-1           // find all I-to-J phrases
            for every rule X → Y Z in the grammar
                if Y in [start,mid] and Z in [mid,end]
                    then add X to [start,end]
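A sketch of this rule-driven variant in Python, assuming the same illustrative encoding as before: (X, Y, Z) triples for binary rules, a word-to-categories dict for the lexicon, and a chart of sets. The function name `cky_recognize_by_rule` is hypothetical.

```python
BINARY_RULES = [
    ("S", "NP", "VP"), ("NP", "Det", "N"), ("NP", "NP", "PP"),
    ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP"),
]
LEXICON = {
    "Papa": {"NP"}, "caviar": {"N"}, "spoon": {"N", "V"}, "ate": {"V"},
    "with": {"P"}, "the": {"Det"}, "a": {"Det"},
}

def cky_recognize_by_rule(words, binary_rules, lexicon, start_symbol="S"):
    """CKY recognizer whose innermost loop runs over grammar rules."""
    n = len(words)
    # Pre-create every cell so that lookups below never fail.
    chart = {(i, j): set() for i in range(n + 1) for j in range(i + 1, n + 1)}
    for j in range(1, n + 1):
        chart[(j - 1, j)] |= lexicon.get(words[j - 1], set())
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                left, right = chart[(start, mid)], chart[(mid, end)]
                for x, y, z in binary_rules:
                    if y in left and z in right:
                        chart[(start, end)].add(x)
    return n > 0 and start_symbol in chart[(0, n)]

accepted = cky_recognize_by_rule(
    "Papa ate the caviar with a spoon".split(), BINARY_RULES, LEXICON
)
```

Here the work per (start, mid, end) triple is proportional to the number of rules, whatever the cells contain, so this form tends to suit small grammars and the cell-driven forms tend to suit large grammars with sparse cells.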
Extensions to discuss (no slides)
Incremental CKY:
Visit columns left to right and fill each bottom-up
for J := 1 to n
    Add to [J-1,J] all categories for the Jth word
for end := 2 to n                             // this is J
    for width := 2 to end
        Define start := end - width           // this is I
        for mid := start+1 to end-1           // find all I-to-J phrases
            for every nonterminal Y in [start,mid]
                for every nonterminal Z in [mid,end]
                    for every X → Y Z in the grammar
                        add X to [start,end]
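The column-by-column order can be sketched in Python as below. The encoding (rule triples, lexicon dict, chart of sets) and the name `cky_incremental` are assumptions made for illustration. Within each column, the loop over increasing width fills cells from the bottom up.

```python
from collections import defaultdict

BINARY_RULES = [
    ("S", "NP", "VP"), ("NP", "Det", "N"), ("NP", "NP", "PP"),
    ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP"),
]
LEXICON = {
    "Papa": {"NP"}, "caviar": {"N"}, "spoon": {"N", "V"}, "ate": {"V"},
    "with": {"P"}, "the": {"Det"}, "a": {"Det"},
}

def cky_incremental(words, binary_rules, lexicon):
    """Fill the chart one column (one value of end) at a time, left to right."""
    n = len(words)
    chart = defaultdict(set)
    for j in range(1, n + 1):
        chart[(j - 1, j)] |= lexicon.get(words[j - 1], set())
    for end in range(2, n + 1):              # column J
        for width in range(2, end + 1):      # narrowest span first
            start = end - width
            for mid in range(start + 1, end):
                for y in chart[(start, mid)]:
                    for z in chart[(mid, end)]:
                        for x, rule_y, rule_z in binary_rules:
                            if rule_y == y and rule_z == z:
                                chart[(start, end)].add(x)
    return chart

chart = cky_incremental(
    "Papa ate the caviar with a spoon".split(), BINARY_RULES, LEXICON
)
# chart[(0, 4)] contains "S": the prefix "Papa ate the caviar" is already a sentence.
```

Once column J is finished, every span ending at J is complete, so all phrases over the first J words are known before later columns are processed.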
Understanding the Key Rule
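phrase(X,I,J) :- rewrite(X,Y,Z), phrase(Y,I,Mid), phrase(Z,Mid,J).
Read as: the substring from I to J could be a phrase of category X if the grammar has a rule rewriting X as Y Z, the substring from I to Mid could be a phrase of category Y, and the substring from Mid to J could be a phrase of category Z.
e.g., phrase("VP",1,7) follows from rewrite("VP","V","NP"), phrase("V",1,2), and phrase("NP",2,7).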
start_symbol := "S".
sentence_length := 7.
Alternatively:
sentence_length max= J for word(_,_,J).
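The `max=` line computes the sentence length from the word facts instead of hard-coding 7. A rough Python analogue, assuming word facts are stored as hypothetical (W, I, J) tuples:

```python
# Hypothetical word facts word(W, I, J): word W spans positions I to J.
sentence = "Papa ate the caviar with a spoon".split()
word_facts = [(w, i, i + 1) for i, w in enumerate(sentence)]

# sentence_length max= J for word(_,_,J).
sentence_length = max(j for (_, _, j) in word_facts)
```

The aggregation takes the largest end position over all word facts, so it stays correct if the input sentence changes.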
Chart Parsing in Dyna
phrase(X,I,J) :- rewrite(X,W), word(W,I,J).
phrase(X,I,J) :- rewrite(X,Y,Z), phrase(Y,I,Mid), phrase(Z,Mid,J).
goal :- phrase(start_symbol, 0, sentence_length).
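One naive way to run these three rules is forward chaining: start from the axioms and keep applying the rules until nothing new can be derived. The Python sketch below does that. It is only an illustration of the declarative reading, not how Dyna itself executes programs, and the tuple encodings and the name `prove_goal` are assumptions.

```python
UNARY_REWRITES = [   # rewrite(X, W)
    ("NP", "Papa"), ("N", "caviar"), ("N", "spoon"), ("V", "spoon"),
    ("V", "ate"), ("P", "with"), ("Det", "the"), ("Det", "a"),
]
BINARY_REWRITES = [  # rewrite(X, Y, Z)
    ("S", "NP", "VP"), ("NP", "Det", "N"), ("NP", "NP", "PP"),
    ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP"),
]

def prove_goal(words, unary_rewrites, binary_rewrites, start_symbol="S"):
    """Forward-chain to a fixed point; return (goal_proved, phrase_items)."""
    # Axioms: word(W, I, J) for each input word.
    word_facts = {(w, i, i + 1) for i, w in enumerate(words)}
    sentence_length = max((j for (_, _, j) in word_facts), default=0)

    # phrase(X,I,J) :- rewrite(X,W), word(W,I,J).
    phrases = {(x, i, j)
               for (x, w) in unary_rewrites
               for (w2, i, j) in word_facts if w == w2}

    # phrase(X,I,J) :- rewrite(X,Y,Z), phrase(Y,I,Mid), phrase(Z,Mid,J).
    changed = True
    while changed:
        changed = False
        for (x, y, z) in binary_rewrites:
            for (y2, i, mid) in list(phrases):
                if y2 != y:
                    continue
                for (z2, mid2, j) in list(phrases):
                    if z2 == z and mid2 == mid and (x, i, j) not in phrases:
                        phrases.add((x, i, j))
                        changed = True

    # goal :- phrase(start_symbol, 0, sentence_length).
    return (start_symbol, 0, sentence_length) in phrases, phrases

proved, phrases = prove_goal(
    "Papa ate the caviar with a spoon".split(), UNARY_REWRITES, BINARY_REWRITES
)
```

This loop rescans everything on each pass; CKY derives the same phrase items but visits them in an order that avoids the rescanning.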
[Figure: derivation of goal from the axioms, with callouts marking ambiguity, a dead end, and shared substructure (dynamic programming).]
This picture assumes a slightly different version of the Dyna program, sorry
Procedural algorithms like CKY are just strategies for
running this declarative program