A Substitution Rule
S → aB | ab
A → aaA
A → abBc | abbc
B → aA
Substitute B → aA
Equivalent grammar:
S → aB | ab | aaA
A → aaA
A → abBc | abbc | abaAc
Courtesy Costas Buch - RPI
In general:
A → xBz
B → y1
Substitute B → y1
Equivalent grammar:
A → xBz | xy1z
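The substitution rule can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `substitute`, the dictionary representation of the grammar, and the convention that every symbol is a single character are assumptions made for the example.

```python
from itertools import product

def substitute(grammar, var, body):
    """Return a new grammar in which the production var -> body has been
    substituted into every right-hand side and then removed.

    grammar maps each variable to a list of right-hand-side strings
    (assumed representation; every symbol is one character)."""
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for rhs in bodies:
            if head == var and rhs == body:
                continue  # this is the production being substituted away
            # Each occurrence of var is either kept or replaced by body.
            options = [(ch, body) if ch == var else (ch,) for ch in rhs]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result

# The example grammar, substituting B -> aA.
grammar = {"S": ["aB", "ab"], "A": ["aaA", "abBc", "abbc"], "B": ["aA"]}
new_grammar = substitute(grammar, "B", "aA")
for head, bodies in new_grammar.items():
    print(head, "->", " | ".join(bodies))
```

Keeping the original right-hand side next to the substituted one mirrors the general rule A → xBz | xy1z.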
Nullable Variables
λ-production:
A → λ
Nullable variable:
A ⇒ … ⇒ λ
Example grammar:
S → aMb
M → aMb
M → λ
Nullable variable: M
Substitute M → λ
Final Grammar:
S → aMb | ab
M → aMb | ab
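A minimal Python sketch of removing nullable variables follows. It is an illustration under stated assumptions, not the slides' own procedure: the name `remove_lambda_productions`, the dictionary representation, and the use of the empty string for λ are all choices made for this example. If the start variable itself is nullable, λ is dropped from the language.

```python
from itertools import product

def remove_lambda_productions(grammar):
    """Return (new_grammar, nullable_set).

    grammar maps each variable to a list of right-hand-side strings;
    the empty string "" stands for lambda (assumed representation)."""
    # Step 1: find the nullable variables by iterating to a fixed point.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # all(...) over an empty right-hand side is True, so A -> "" counts.
            if any(all(ch in nullable for ch in rhs) for rhs in bodies):
                nullable.add(head)
                changed = True
    # Step 2: for each right-hand side, keep or drop every nullable occurrence.
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for rhs in bodies:
            options = [(ch, "") if ch in nullable else (ch,) for ch in rhs]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result, nullable

grammar = {"S": ["aMb"], "M": ["aMb", ""]}
final_grammar, nullable = remove_lambda_productions(grammar)
print(nullable)
print(final_grammar)
```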
Unit-Productions
Unit production:
A → B
(a single variable on both sides)
The unit production A → A is removed immediately.
Example grammar:
S → aA
A → a
A → B
B → A
B → bb
Substitute A → B:
S → aA | aB
A → a
B → A | B
B → bb
Remove B → B:
S → aA | aB
A → a
B → A
B → bb
Substitute B → A:
S → aA | aB | aA
A → a
B → bb
Remove repeated productions.
Final grammar:
S → aA | aB
A → a
B → bb
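As a rough illustration, unit productions can also be removed with a unit-closure computation. Note that this is a swapped-in formulation, not the step-by-step substitution used in the example above, so the resulting grammar looks different while generating the same language. The function name `remove_unit_productions` and the one-character-per-symbol dictionary representation are assumptions of this sketch.

```python
def remove_unit_productions(grammar):
    """Replace unit productions using unit closures.

    grammar maps each variable to a list of right-hand-side strings;
    a right-hand side that is exactly one variable is a unit production."""
    variables = set(grammar)
    result = {}
    for head in grammar:
        # Collect every variable reachable from head through unit productions.
        closure = {head}
        stack = [head]
        while stack:
            var = stack.pop()
            for rhs in grammar[var]:
                if rhs in variables and rhs not in closure:
                    closure.add(rhs)
                    stack.append(rhs)
        # head inherits every non-unit production of its closure.
        new_bodies = []
        for var in sorted(closure):
            for rhs in grammar[var]:
                if rhs not in variables and rhs not in new_bodies:
                    new_bodies.append(rhs)
        result[head] = new_bodies
    return result

grammar = {"S": ["aA"], "A": ["a", "B"], "B": ["A", "bb"]}
no_units = remove_unit_productions(grammar)
print(no_units)
```

For the example grammar both versions generate exactly the strings aa and abb. In the closure version B keeps its productions but is no longer reachable from S, so a later useless-variable pass would remove it.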
Useless Productions
Another grammar:
In general: if
S ⇒ … ⇒ xAy ⇒ … ⇒ w, with w ∈ L(G),
then variable A is useful; otherwise, variable A is useless.
Useless variables: A, B, C, D
S → aSb
S → λ
S → A     (useless)
A → aA    (useless)
B → C     (useless)
C → D     (useless)
Example grammar:
S → aS | A | C
A → a
B → aa
C → aCb
First: find all variables that can produce strings with only terminals
Round 1: { A, B }
Round 2: { A, B, S }    (because S → A)
Keep only the variables in { A, B, S } and remove every production that uses C:
S → aS | A
A → a
B → aa
Second: find all variables reachable from S
B is not reachable from S.
Final Grammar:
S → aS | A
A → a
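The two passes above can be sketched in Python as follows. This is an illustration only: the name `remove_useless`, the dictionary representation, and the convention that uppercase letters are variables and everything else is a terminal are assumptions of the example.

```python
def remove_useless(grammar, start):
    """Remove useless variables and productions in two passes.

    grammar maps each variable to a list of right-hand-side strings.
    Uppercase letters are variables, all other symbols are terminals
    (assumed convention)."""
    def only_uses(rhs, good):
        # True if every variable in rhs belongs to the set `good`.
        return all(ch in good or not ch.isupper() for ch in rhs)

    # Pass 1: variables that can produce a string of terminals.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in generating and any(only_uses(rhs, generating) for rhs in bodies):
                generating.add(head)
                changed = True
    pruned = {head: [rhs for rhs in bodies if only_uses(rhs, generating)]
              for head, bodies in grammar.items() if head in generating}

    # Pass 2: variables reachable from the start variable.
    reachable, stack = set(), [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in pruned:
            continue
        reachable.add(var)
        stack.extend(ch for rhs in pruned[var] for ch in rhs if ch.isupper())
    return {head: bodies for head, bodies in pruned.items() if head in reachable}

grammar = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
cleaned = remove_useless(grammar, "S")
print(cleaned)
```

The order of the passes matters: removing non-generating variables first can make further variables unreachable, which the second pass then catches.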
Removing All
Step 1: Remove Nullable Variables
Step 2: Remove Unit-Productions
Step 3: Remove Useless Variables
Chomsky Normal Form
Each production has the form
A → BC    (B and C are variables)
or
A → a     (a is a terminal)
Examples:
Chomsky Normal Form:
S → AS
S → a
A → SA
A → b
Not Chomsky Normal Form:
S → AS
S → AAS
A → SA
A → aa
Example grammar, not in Chomsky Normal Form:
S → ABa
A → aab
B → Ac
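Introduce new variables for the terminals: Ta, Tb, Tc
Introduce intermediate variables: V1, V2
Initial grammar:
S → ABa
A → aab
B → Ac
Final grammar in Chomsky Normal Form:
S → AV1
V1 → BTa
A → TaV2
V2 → TaTb
B → ATc
Ta → a
Tb → b
Tc → c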
In general: from any context-free grammar (which does not produce λ) that is not in Chomsky Normal Form, we can obtain an equivalent grammar in Chomsky Normal Form.
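The Procedure
First remove:
Nullable variables
Unit productions
Then, for every symbol a:
New variable: Ta
Add production Ta → a
In productions: replace a with Ta
For productions with more than two variables on the right-hand side, introduce new intermediate variables:
V1, V2, …, Vn−2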
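The procedure can be sketched in Python as shown below. This is a hedged illustration rather than a definitive implementation: the name `to_cnf`, the tuple-of-symbols representation, and the generated names `Ta` and `V1`, `V2`, … are assumptions of the example, which also assumes that nullable variables and unit productions have already been removed and that the generated names do not clash with existing variables.

```python
def to_cnf(grammar):
    """Convert a grammar without lambda- or unit productions to Chomsky Normal Form.

    grammar maps each variable to a list of tuples of symbols.
    Lowercase symbols are terminals, everything else is a variable
    (assumed convention)."""
    result = {}
    terminals = set()
    counter = 0
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                # Already of the form A -> a.
                result.setdefault(head, []).append(tuple(body))
                continue
            # Replace each terminal a by its stand-in variable Ta.
            symbols = []
            for sym in body:
                if sym.islower():
                    terminals.add(sym)
                    symbols.append("T" + sym)
                else:
                    symbols.append(sym)
            # Break long right-hand sides into a chain of pairs.
            current = head
            while len(symbols) > 2:
                counter += 1
                fresh = f"V{counter}"
                result.setdefault(current, []).append((symbols[0], fresh))
                current, symbols = fresh, symbols[1:]
            result.setdefault(current, []).append(tuple(symbols))
    # Add the productions Ta -> a.
    for a in sorted(terminals):
        result["T" + a] = [(a,)]
    return result

grammar = {"S": [("A", "B", "a")], "A": [("a", "a", "b")], "B": [("A", "c")]}
cnf = to_cnf(grammar)
for head, bodies in cnf.items():
    for body in bodies:
        print(head, "->", " ".join(body))
```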
Theorem:
For any context-free grammar (which does not produce λ) there is an equivalent grammar in Chomsky Normal Form
Observations
Chomsky normal forms are good for parsing and proving theorems
It is very easy to find the Chomsky normal form for any context-free grammar
Greibach Normal Form
All productions have the form
A → a V1V2…Vk,    k ≥ 0
where a is a symbol (terminal) and V1, …, Vk are variables
Examples:
Greibach Normal Form:
S → cAB
A → aA | bB | b
B → b
Not Greibach Normal Form:
S → abSb
S → aa
Theorem:
For any context-free grammar (which does not produce λ) there is an equivalent grammar in Greibach Normal Form
Observations
Greibach normal forms are very good for parsing
Compilers
Program:
v = 5;
if (v>5)
  x = 12 + v;
while (x != 3) {
  x = x - 3;
  v = 10;
}
......
Compiler
Machine Code:
Add v,v,0
cmp v,5
jmplt ELSE
THEN: add x, 12,v
ELSE:
WHILE: cmp x,3
...
Lex
For each kind of string found, the lex program takes an action
Input:
Var = 12 + 9;
if (test > 20)
  temp = 0;
else
  while (a < 20)
    temp++;
Output:
Identifier: Var
Operator: =
Integer: 12
Operator: +
Integer: 9
Semicolon: ;
Keyword: if
Parenthesis: (
Identifier: test
....
Lex program
Regular expressions:
"+"
"-"
"="                       /* operators */
"if"
"then"                    /* keywords */
(0|1|2|3|4|5|6|7|8|9)+    /* integers */
(a|b|..|z|A|B|...|Z)+     /* identifiers */
Integers: (0|1|2|3|4|5|6|7|8|9)+ can be written as [0-9]+
Identifiers: (a|b|..|z|A|B|...|Z)+ can be written as [a-zA-Z]+
Examples:
Regular expression    Action
\n                    linenum++;
[0-9]+                printf("integer");
[a-zA-Z]+             printf("identifier");
Default action: ECHO;
Actions:
printf("Integer\n");
printf("Identifier\n");
Another program
Actions:
;                                          /* skip spaces */
linenum++;
printf("Integer\n");
printf("Identifier\n");
printf("Error in line: %d\n", linenum);
Output:
Integer
Identifier
Identifier
Integer
Integer
Integer
Error in line: 3
Identifier
Lex matches the longest input string
Example:
Regular expressions: "if", "ifend"
Input:    Matches:
ifend     "ifend"
if        "if"
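The longest-match rule can be imitated in a short Python sketch. This is not how Lex itself is implemented (Lex compiles the regular expressions into an automaton); the function `tokenize` and its rule format are hypothetical names chosen for the example. Every rule is tried at the current position and the longest match wins, with ties going to the rule listed first.

```python
import re

def tokenize(text, rules):
    """Split text into (rule_name, lexeme) pairs using the longest match.

    rules is a list of (name, regular_expression) pairs (assumed format).
    Whitespace between tokens is skipped."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best_name, best_len = None, 0
        for name, regex in compiled:
            match = regex.match(text, pos)  # match must start exactly at pos
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

keyword_rules = [("if", r"if"), ("ifend", r"ifend")]
print(tokenize("ifend if", keyword_rules))

word_rules = [("Integer", r"[0-9]+"), ("Identifier", r"[a-zA-Z]+")]
print(tokenize("Var 12", word_rules))
```

Trying every rule is needed here because Python's `|` alternation takes the first alternative that matches, not the longest one.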
Lex
Regular expressions → NFA → DFA → Minimal DFA
Compiler
input program → Lexical analyzer → parser
Parser
PROGRAM → STMT_LIST
STMT_LIST → STMT; STMT_LIST | STMT;
STMT → EXPR | IF_STMT | WHILE_STMT | { STMT_LIST }
EXPR → EXPR + EXPR | EXPR - EXPR | ID
IF_STMT → if (EXPR) then STMT | if (EXPR) then STMT else STMT
WHILE_STMT → while (EXPR) do STMT
The parser finds the derivation of a particular input
Input: 10 + 2 * 5
Parser grammar: E → E + E | E * E | INT
Derivation:
E ⇒ E + E
  ⇒ E + E * E
  ⇒ 10 + E * E
  ⇒ 10 + 2 * E
  ⇒ 10 + 2 * 5
Derivation tree:
      E
    / | \
   E  +  E
   |    /|\
  10   E * E
       |   |
       2   5
Parsing
Example:
S → SS
S → aSb
S → bSa
S → λ
Input to the parser: aabb
Derivation: ?
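Exhaustive Search
S → SS | aSb | bSa | λ
Find derivation of aabb
Phase 1:
S ⇒ SS
S ⇒ aSb
S ⇒ bSa
S ⇒ λ
Of these, S ⇒ bSa and S ⇒ λ cannot lead to aabb, which leaves:
S ⇒ SS
S ⇒ aSb
Phase 2:
S ⇒ SS ⇒ SSS
Each later phase extends the surviving derivations by one more step until aabb is produced.
Result:
Input: aabb
Derivation: S ⇒ aSb ⇒ aaSbb ⇒ aabb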
Time complexity of exhaustive search
Suppose there are no productions of the form
A → λ
A → B
Number of phases for string w: at most $2|w|$
For a grammar with k rules:
Time for phase 1: $k$ possible derivations
Time for phase 2: $k^2$ possible derivations
Time for phase $2|w|$: $k^{2|w|}$ possible derivations
Total time needed for string w:
$k + k^2 + \cdots + k^{2|w|}$
(phase 1 + phase 2 + … + phase $2|w|$)
Extremely bad!!!
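A small Python sketch of exhaustive search parsing is given below. It is an illustration under assumptions, not the slides' exact procedure: the names `exhaustive_parse` and `promising`, the one-character-per-symbol representation, and the explicit `max_phases` cut-off are choices made for the example. It expands only the leftmost variable in each step and discards forms that can no longer yield the target string.

```python
def promising(form, grammar, target):
    """Cheap test: can this sentential form still lead to target?"""
    terminal_count = sum(1 for ch in form if ch not in grammar)
    if terminal_count > len(target):
        return False  # terminals never disappear once derived
    # The terminals before the leftmost variable must agree with target.
    prefix_len = next((i for i, ch in enumerate(form) if ch in grammar), len(form))
    return form[:prefix_len] == target[:prefix_len]

def exhaustive_parse(grammar, start, target, max_phases):
    """Search leftmost derivations phase by phase.

    Returns the list of sentential forms of the first derivation found,
    or None if no derivation is found within max_phases."""
    frontier = [[start]]  # every entry is a derivation so far
    seen = {start}
    for _ in range(max_phases):
        next_frontier = []
        for derivation in frontier:
            form = derivation[-1]
            idx = next((i for i, ch in enumerate(form) if ch in grammar), None)
            if idx is None:
                continue  # only terminals and not the target: dead end
            for rhs in grammar[form[idx]]:
                new_form = form[:idx] + rhs + form[idx + 1:]
                if new_form == target:
                    return derivation + [new_form]
                if new_form in seen or not promising(new_form, grammar, target):
                    continue
                seen.add(new_form)
                next_frontier.append(derivation + [new_form])
        frontier = next_frontier
    return None

# The empty string stands for lambda.
grammar = {"S": ["SS", "aSb", "bSa", ""]}
print(exhaustive_parse(grammar, "S", "aabb", max_phases=6))
print(exhaustive_parse(grammar, "S", "aab", max_phases=6))
```

Because this example grammar contains S → λ, the bound of 2|w| phases does not apply to it, which is why the cut-off is passed explicitly.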
S-grammar: every production has the form
A → ax
where a is a symbol and x is a string of variables
Each pair (A, a) appears once
S-grammar example:
S → aS
S → bSS
S → c
Each string has a unique derivation
For S-grammars:
In the exhaustive search parsing there is only one choice in each phase
Time for a phase: 1
Total time for parsing string w: |w|
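The single-choice property of S-grammars can be illustrated with a short Python sketch. The function name `parse_s_grammar` and the table keyed by (variable, symbol) pairs are assumptions made for this example. Since each pair (A, a) appears at most once, the leftmost variable and the next input symbol select the production, so the parser does one step per input symbol.

```python
def parse_s_grammar(rules, start, word):
    """Parse word with an S-grammar in one pass.

    rules maps (variable, terminal) to the string of variables that
    follows the terminal (assumed representation). Returns the list of
    productions used, or None if word is not in the language."""
    pending = [start]  # variables still to be expanded, leftmost on top
    derivation = []
    for symbol in word:
        if not pending:
            return None  # input left over but nothing to expand
        var = pending.pop()
        if (var, symbol) not in rules:
            return None  # no production for this pair
        tail = rules[(var, symbol)]
        derivation.append(f"{var} -> {symbol}{tail}")
        # Push the new variables so that the leftmost one is expanded next.
        pending.extend(reversed(tail))
    return derivation if not pending else None

# S -> aS, S -> bSS, S -> c
rules = {("S", "a"): "S", ("S", "b"): "SS", ("S", "c"): ""}
print(parse_s_grammar(rules, "S", "abcc"))
print(parse_s_grammar(rules, "S", "ab"))
```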