Chapter 3 Topics
Introduction
The General Problem of Describing Syntax
Formal Methods of Describing Syntax
Attribute Grammars
Describing the Meanings of Programs: Dynamic
Semantics
Introduction
Who must use language definitions?
Language designers
Implementors
Programmers (the users of the language)
Syntax
The form or structure of the expressions, statements, and
program units
Defines what is grammatically correct
Semantics
The meaning of the expressions, statements, and program units
Some definitions
A sentence is a string of characters over some alphabet
A language is a set of valid sentences
The syntax rules of the language specify which strings of
characters are valid sentences
Describing syntax
Syntax may be formally described using
recognition or generation
Recognition involves a recognition device R
Given an input string, R either accepts the string as
valid or rejects it
R is only used in trial-and-error mode
A recognizer is not effective in enumerating all
sentences in a language
Describing syntax
Generation
A language generator generates the sentences of a
language
A grammar is a language generator
One can determine if a string is a sentence by
comparing it with the structure given by a generator
Context-free grammars
Context-sensitive grammars
Alternative form
<if_stmt> → if <logic_expr> then <stmt>
<if_stmt> → if <logic_expr> then <stmt> else <stmt>
More compactly, ...
<if_stmt> → if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else <stmt>
Formal methods for describing syntax
An example grammar
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | integer
Derivation
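A derivation starts from the start symbol and repeatedly replaces a nonterminal with one of its right-hand sides until only terminals remain. The following Python sketch performs a leftmost derivation with the example grammar above. The names GRAMMAR and leftmost_derivation, and the list of rule choices, are illustrative assumptions rather than anything given on the slides.

```python
# Example grammar from the slide: one list of alternatives per nonterminal.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["integer"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Return every sentential form of a leftmost derivation.

    choices[i] is the index of the alternative applied at step i
    (a hypothetical encoding chosen for this sketch).
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost nonterminal in the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:pos] + GRAMMAR[form[pos]][choice] + form[pos + 1:]
        steps.append(" ".join(form))
    return steps


# Derive the sentence "a = b + integer".
steps = leftmost_derivation([0, 0, 0, 0, 0, 0, 1, 1])
for step in steps:
    print("=>", step)
```

Each printed line is one sentential form; the last one contains only terminals and is therefore a sentence of the language.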
Parse Tree
[Figure: parse tree for an assignment to a, expanding <program> → <stmts> → <stmt> → <var> = <expr>, with <expr> → <term> + <term>]
[Figure: two different parse trees for the same int <op> int <op> int expression, built from <expr>, <op>, and int nodes]
Ambiguity
The compiler decides what code to generate
based on the structure of the parse tree
The parse tree indicates the precedence of the operators
Does the expression mean ( int + int ) * int or int + ( int * int )?
A non-ambiguous grammar
<expr> → <expr> + <term> | <term>
<term> → <term> * int | int
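To see how this grammar fixes the grouping, the Python sketch below builds a parse structure (nested tuples) for a token string. Because the rules are left recursive, the sketch unrolls each recursion into a loop, which is an implementation choice of this example and not something the slides prescribe. The function names are hypothetical.

```python
def expect(tokens, pos, symbol):
    """Consume `symbol` at `pos` or raise a syntax error."""
    if pos >= len(tokens) or tokens[pos] != symbol:
        raise SyntaxError(f"expected {symbol!r} at position {pos}")
    return pos + 1


def parse_term(tokens, pos):
    # <term> -> <term> * int | int   (left recursion unrolled into a loop)
    pos = expect(tokens, pos, "int")
    tree = "int"
    while pos < len(tokens) and tokens[pos] == "*":
        pos = expect(tokens, pos + 1, "int")
        tree = (tree, "*", "int")  # the <term> built so far is the left operand
    return tree, pos


def parse_expr(tokens, pos=0):
    # <expr> -> <expr> + <term> | <term>   (left recursion unrolled into a loop)
    tree, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        tree = (tree, "+", right)
    return tree, pos


def parse(text):
    tokens = text.split()
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


print(parse("int + int * int"))
print(parse("int + int + int"))
```

The first result nests the multiplication below the addition, so only the int + ( int * int ) reading is possible. The second result nests to the left, which reflects the left recursion in the <expr> rule.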
[Figure: derivation and parse tree for an expression under the non-ambiguous grammar, with the * applied inside a <term>]
Associativity of operators
Operator associativity can also be indicated by a
grammar
<expr> → <expr> + <expr> | int   (ambiguous)
<expr> → <expr> + int | int   (unambiguous)
Example: a parse tree using the unambiguous grammar
The unambiguous grammar is left recursive and produces
a parse tree in which the order of addition is left associative
Addition is performed in a left-to-right manner
[Figure: parse tree for int + int + int, with each <expr> → <expr> + int nested to the left]
Dangling-else problem
Consider the grammar
<stmt> → ... | <if_stmt> | ...
<if_stmt> → if <logic_expr> then <stmt>
          | if <logic_expr> then <stmt> else <stmt>
and the sentential form
if <logic_expr> then if <logic_expr> then <stmt> else <stmt>
Dangling-else problem
Most languages match each else with the nearest
preceding elseless if
The ambiguity can be eliminated by developing a
grammar that distinguishes elseless ifs from ifs
with else clauses
See text, page 131
Braces
<identifier_list> → ident { , ident }
Generates: Larry, Curly, Moe
BNF:
<expr> → <expr> + <term> | <expr> - <term> | <term>
<term> → <term> * <factor> | <term> / <factor> | <factor>
EBNF:
<expr> → <term> { ( + | - ) <term> }
<term> → <factor> { ( * | / ) <factor> }
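The braces in the EBNF rules map naturally onto loops in a hand-written parser. The Python sketch below evaluates expressions by following the two EBNF rules directly. It assumes that <factor> is an unsigned integer literal, that tokens are separated by spaces, and that / means integer division; none of these details come from the slides.

```python
def evaluate(text):
    """Evaluate an expression by recursive descent over the EBNF rules."""
    tokens = text.split()
    pos = 0

    def factor():
        # Assumption for this sketch: <factor> is an unsigned integer literal.
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected an integer at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        # <term> -> <factor> { ( * | / ) <factor> }
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            value = value * right if op == "*" else value // right
        return value

    def expr():
        # <expr> -> <term> { ( + | - ) <term> }
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate("2 + 3 * 4"))   # 14
print(evaluate("10 - 4 - 3"))  # 3
```

Each loop accumulates its result from left to right, so the operators come out left associative, and multiplication binds tighter than addition because term is called from within expr.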
Extended BNF
EBNF uses metasymbols |, {, }, (, ), [, and ]
When metasymbols are also terminal symbols in
the language being defined, instances that are
terminal symbols must be quoted
<proc_call> → ident [ "(" <expr_list> ")" ]
Extended BNF
Sometimes a superscript + is used as an additional
metasymbol to indicate one or more repetitions
Example: The production rules
<compound_stmt> → begin <stmt> { <stmt> } end
and
<compound_stmt> → begin { <stmt> }+ end
are equivalent
Attribute grammars
Context-free grammars (CFGs) cannot describe all
of the syntax of programming languages
Typical example: a variable must be declared before it
can be referenced
Something like this is called a context-sensitive
constraint
Text refers to it as static semantics
Attribute grammars
Static semantics refers to the legal form of a
program
This is actually syntax rather than semantics
The term semantics is used because the check is
done during semantic analysis rather than
during parsing
The term static is used because the analysis required
to check the constraint can be done at compile time
Additional features
Attributes
Predicate functions
Do the checking
Attribute grammars
BNF:
<assign> → <var> = <expr>
<expr> → <var> + <var>
<var> → id
Attributes:
actual_type
expected_type
Attribute grammars
In what order are attribute values computed?
If all attributes were inherited, the tree could be
decorated in top-down order
If all attributes were synthesized, the tree could be
decorated in bottom-up order
In many cases, both kinds of attributes are used, and it
is some combination of top-down and bottom-up that
must be used
Complex problem in general
May require construction of a dependency graph showing all
attribute dependencies
Computation of attributes
For the generated expression: sum + increment
<expr>.expected_type ← inherited from parent
<var>[1].actual_type ← lookup (sum.type)
<var>[2].actual_type ← lookup (increment.type)
<var>[1].actual_type =? <var>[2].actual_type
<expr>.actual_type ← <var>[1].actual_type
<expr>.actual_type =? <expr>.expected_type
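The attribute computation above can be mimicked in a few lines of Python. This is only a sketch: the symbol table contents, the variable total, and the function names are hypothetical, and the expected type is assumed to come from the declared type of the assignment target.

```python
# Hypothetical declarations; the slides do not give a symbol table.
SYMBOL_TABLE = {"sum": "int", "increment": "int", "total": "real"}


def lookup(name):
    """Return the declared type of a variable (the <var>.actual_type rule)."""
    return SYMBOL_TABLE[name]


def check_expr(var1, var2, expected_type):
    """Decorate <expr> -> <var>[1] + <var>[2] and evaluate both predicates."""
    var1_actual = lookup(var1)       # synthesized from the symbol table
    var2_actual = lookup(var2)
    if var1_actual != var2_actual:   # predicate: operand types must match
        return False, None
    expr_actual = var1_actual        # synthesized from the children
    return expr_actual == expected_type, expr_actual  # predicate on <expr>


def check_assign(target, var1, var2):
    """<assign> -> <var> = <expr>: the target type is passed down to <expr>."""
    expected_type = lookup(target)   # inherited by <expr> from its parent
    ok, _ = check_expr(var1, var2, expected_type)
    return ok


print(check_assign("sum", "sum", "increment"))    # True
print(check_assign("total", "sum", "increment"))  # False
```

The two comparisons play the role of the predicate functions: a False result corresponds to a violated static semantic rule.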
Semantics
The meaning of expressions, statements, and
program units is known as dynamic semantics
We consider three methods of describing dynamic
semantics
Operational semantics
Axiomatic semantics
Denotational semantics
Operational semantics
Operational semantics describes the meaning of a
language statement by executing the statement on a
machine, either real or simulated
The meaning of the statement is defined by the
observed change in the state of the machine
i.e., the change in memory, registers, etc.
Operational semantics
The best approach is to use an idealized, low-level virtual
computer, implemented as a software simulation
Then, build a translator to translate source code to the
machine code of the idealized computer
The state changes in the virtual machine brought about by
executing the code that results from translating a given
statement define the meaning of the statement
In effect, this describes the meaning of a high-level
language statement in terms of the statements of a
simpler, low-level language
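As a rough illustration of this idea, the Python sketch below simulates a very small virtual machine and runs a hand-translated counting loop on it. The instruction set, the run function, and the sample loop are all assumptions made for this example; the meaning of the loop is then read off from the change in the machine state.

```python
def run(program, state):
    """Execute low-level instructions and return the final state.

    Instruction forms (all hypothetical):
      ("set", var, fn)        var := fn(state)
      ("ifgoto", fn, label)   jump to label when fn(state) is true
      ("goto", label)         unconditional jump
      ("label", name)         jump target, no effect on the state
    """
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    state = dict(state)
    pc = 0
    while pc < len(program):
        ins = program[pc]
        if ins[0] == "set":
            state[ins[1]] = ins[2](state)
        elif ins[0] == "ifgoto" and ins[1](state):
            pc = labels[ins[2]]
            continue
        elif ins[0] == "goto":
            pc = labels[ins[1]]
            continue
        pc += 1
    return state


# Hand translation of: for (i = 0; i < 3; i = i + 1) total = total + i;
FOR_LOOP = [
    ("set", "i", lambda s: 0),
    ("label", "loop"),
    ("ifgoto", lambda s: not (s["i"] < 3), "out"),
    ("set", "total", lambda s: s["total"] + s["i"]),
    ("set", "i", lambda s: s["i"] + 1),
    ("goto", "loop"),
    ("label", "out"),
]

final = run(FOR_LOOP, {"total": 0})
print(final)
```

Comparing the initial state with the final state gives the operational meaning of the loop on this simulated machine.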
exp3;
goto loop
out:
Axiomatic semantics
Based on formal logic (predicate calculus)
Original purpose: formal program verification
Each statement in a program is both preceded by
and followed by an assertion about program
variables
Assertions are also known as predicates
Assertions will be written with braces { } to
distinguish them from program statements
Axiomatic semantics
A precondition is an assertion immediately before a
statement that describes the relationships and constraints
among variables that are true at that point in execution
A postcondition is an assertion immediately following a
statement that describes the situation at that point
Our point of view is to compute the preconditions for a given
statement from the corresponding postconditions
It is also possible to set things up in the opposite direction
Axiomatic semantics
Notation: {P} S {Q}
P is the precondition
S is a statement
Q is the postcondition
Example
Find the weakest precondition P for: {P} a = b + 1 {a > 1}
One possible precondition: {b > 10}
Weakest precondition:
{b > 0}
Axiomatic semantics
If the weakest precondition can be computed for
each statement in a program, then a correctness
proof can be constructed for the program
Start by using the desired result as the
postcondition of the last statement and work
backward
The resulting precondition of the first statement
defines the conditions under which the program
will compute the desired result
If this precondition is the same as the program
specification, the program is correct
Axiomatic semantics
Weakest preconditions can be computed using an
axiom or using an inference rule
An axiom is a logical statement assumed to be
true
An inference rule is a method of inferring the truth
of one assertion on the basis of the values of other
assertions
Each statement type in the language must have an
axiom or an inference rule
We consider assignments, sequences, selection,
and loops
Assignment statements
Let x = E be a generic assignment statement
An axiom giving the precondition is sufficient in this case:
{Q[x→E]} x = E {Q}
Here the weakest precondition P is given by Q[x→E]
In other words, P is the same as Q with all instances of x replaced
by expression E
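One way to experiment with the assignment axiom is to model assertions as Python functions over a state. In the sketch below, evaluating the postcondition in the updated state plays the role of substituting E for x in Q. The dictionary representation of states and the name wp_assign are assumptions of this example, and the final check only samples a range of integers rather than proving anything.

```python
def wp_assign(var, expr, post):
    """Weakest precondition of `var = expr` with respect to `post`.

    States are dicts; `expr` and `post` are functions of a state.
    """
    def pre(state):
        updated = dict(state)
        updated[var] = expr(state)  # effect of the assignment
        return post(updated)        # Q evaluated with x replaced by E
    return pre


# {P} a = b + 1 {a > 1}
pre = wp_assign("a", lambda s: s["b"] + 1, lambda s: s["a"] > 1)

# On a sample of integer values, the computed precondition agrees with b > 0.
agrees = all(pre({"a": 0, "b": b}) == (b > 0) for b in range(-50, 51))
print(agrees)  # True
```

The agreement with b > 0 matches the weakest precondition worked out by hand for the statement a = b + 1 with postcondition {a > 1}.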
Inference rules
The general form of an inference rule is
S1, S2, S3, ..., Sn
-------------------
         S
This states that if S1, S2, S3, ..., and Sn are true, then
the truth of S can be inferred
Sequence statements
Since a precondition for a sequence depends on the
statements in the sequence, the weakest precondition
cannot be described by an axiom
An inference rule is needed for sequences
Consider the sequence S1;S2 of two statements with
preconditions and postconditions as follows:
{P1} S1 {P2}
{P2} S2 {P3}
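In standard Hoare logic these two triples combine into {P1} S1; S2 {P3}, so the weakest precondition of a sequence can be found by working backward through its statements. The Python sketch below does this for a two-assignment sequence. The sample program, the dictionary states, and the function names are assumptions of this example, and the last check only samples integers.

```python
def wp_assign(var, expr, post):
    """Weakest precondition of `var = expr` with respect to `post`."""
    def pre(state):
        updated = dict(state)
        updated[var] = expr(state)
        return post(updated)
    return pre


def wp_sequence(statements, post):
    """Work backward: the precondition of S2 becomes the postcondition of S1."""
    for var, expr in reversed(statements):
        post = wp_assign(var, expr, post)
    return post


# y = 3 * x + 1; x = y + 3   with postcondition {x < 10}
program = [
    ("y", lambda s: 3 * s["x"] + 1),
    ("x", lambda s: s["y"] + 3),
]
pre = wp_sequence(program, lambda s: s["x"] < 10)

# On a sample of integers the result agrees with the hand-derived {x < 2}.
agrees = all(pre({"x": x, "y": 0}) == (x < 2) for x in range(-50, 51))
print(agrees)  # True
```

By hand, the second assignment turns x < 10 into y < 7, and the first assignment turns y < 7 into 3 * x + 1 < 7, that is, x < 2.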
Selection statements
Consider only if-then-else statements
The inference rule is
{ B and P } S1 { Q }, { (not B) and P } S2 { Q }
------------------------------------------------
{ P } if B then S1 else S2 { Q }
Example:
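As an illustrative case (chosen for this sketch, not taken from the slides), consider if x > 0 then y = y - 1 else y = y + 1 with postcondition {y > 0}. The then-branch needs y > 1 and the else-branch needs y > -1, so P = {y > 1} satisfies both premises. The Python code below checks the two premises and the conclusion by brute force on a small grid of states; this is testing on samples, not a proof.

```python
from itertools import product


def cond(s):          # B: x > 0
    return s["x"] > 0


def pre(s):           # P: y > 1
    return s["y"] > 1


def post(s):          # Q: y > 0
    return s["y"] > 0


def then_branch(s):   # S1: y = y - 1
    return {**s, "y": s["y"] - 1}


def else_branch(s):   # S2: y = y + 1
    return {**s, "y": s["y"] + 1}


def if_stmt(s):
    return then_branch(s) if cond(s) else else_branch(s)


def holds(p, stmt, q, states):
    """Check the triple {p} stmt {q} on every sample state satisfying p."""
    return all(q(stmt(s)) for s in states if p(s))


states = [{"x": x, "y": y} for x, y in product(range(-5, 6), repeat=2)]

premise_then = holds(lambda s: cond(s) and pre(s), then_branch, post, states)
premise_else = holds(lambda s: not cond(s) and pre(s), else_branch, post, states)
conclusion = holds(pre, if_stmt, post, states)
print(premise_then, premise_else, conclusion)
```

Replacing P with the weaker assertion y > 0 makes the check fail, because the state x = 1, y = 1 leads to y = 0.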
Loops
We consider a logical pretest (while) loop
{P} while B do S end {Q}
Computing the weakest precondition is more
difficult than for a sequence because the number of
iterations is not predetermined
An assertion called a loop invariant must be found
A loop invariant corresponds to finding the inductive
hypothesis when proving a mathematical theorem
using induction
Loops
The inference rule is
{ I and B } S { I }
----------------------------------------
{ I } while B do S end { I and (not B) }
Example
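As an illustrative case (chosen for this sketch, not taken from the slides), take the loop while y != x do y = y + 1 end with postcondition {y = x} and the candidate invariant I = {y <= x}. The Python code below checks, on a small grid of sample states, that the loop body preserves I and that I together with the exit condition implies the postcondition. It is a test on samples rather than a proof, and all names are hypothetical.

```python
from itertools import product


def invariant(s):     # I: y <= x
    return s["y"] <= s["x"]


def cond(s):          # B: y != x
    return s["y"] != s["x"]


def post(s):          # Q: y = x
    return s["y"] == s["x"]


def body(s):          # S: y = y + 1
    return {**s, "y": s["y"] + 1}


def run_while(s):
    while cond(s):
        s = body(s)
    return s


states = [{"x": x, "y": y} for x, y in product(range(-5, 6), repeat=2)]

# Premise of the rule: {I and B} S {I}
preserved = all(invariant(body(s)) for s in states if invariant(s) and cond(s))

# On exit, I and (not B) must be strong enough to imply Q.
implies_post = all(post(s) for s in states if invariant(s) and not cond(s))

# Running the loop from any sample state that satisfies I ends in Q.
ends_in_post = all(post(run_while(s)) for s in states if invariant(s))

print(preserved, implies_post, ends_in_post)
```

The loop is only run from states that satisfy I; from a state with y > x it would never terminate, which is why termination has to be argued separately from the invariant.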
Loops
The loop invariant I is a weakened version of the
loop postcondition, and it is also a precondition.
I must be weak enough to be satisfied prior to the
beginning of the loop
When combined with the loop exit condition, I must
be strong enough to force the truth of the
postcondition
Axiomatic semantics
Evaluation of axiomatic semantics
Developing axioms or inference rules for all of the
statements in a language can be difficult
Axiomatic semantics is . . .
a good tool for correctness proofs
an excellent framework for reasoning about programs
Denotational semantics
Denotational semantics
Is the most rigorous, widely known method for
describing the meaning of programs
Based on recursive function theory
Fundamental concept
Define a mathematical object for each language entity
The mathematical objects can be rigorously defined
and manipulated
Define functions that map instances of the language
entities onto instances of the corresponding
mathematical objects
Denotational semantics
As is the case with operational semantics, the
meaning of a language construct is defined in
terms of the state changes
In denotational semantics, state is defined in terms
of the various mathematical objects
State is defined only in terms of the values of the
program's variables
The value of a variable is an instance of an appropriate
mathematical object
Denotational semantics
The state s of a program consists of the values of all
its current variables
s = {<i1, v1>, <i2, v2>, ..., <in, vn>}
Here, ik is a variable and vk is the associated value
Each vk is a mathematical object
Denotational semantics
Let VARMAP be a function that, when given a
variable name and a state, returns the current
value of the variable
VARMAP(ik, s) = vk
Any variable can have the special value undef
i.e., currently undefined
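A state and VARMAP can be modeled directly in Python. In this sketch a state is a list of (name, value) pairs, the string "undef" stands in for the special value undef, and the variable names are made up for the example.

```python
UNDEF = "undef"  # stand-in for the special value undef


def varmap(name, state):
    """Return the value paired with `name` in `state`.

    `state` is a collection of (name, value) pairs, mirroring
    s = {<i1, v1>, <i2, v2>, ..., <in, vn>}.
    """
    for var, value in state:
        if var == name:
            return value
    raise KeyError(name)


s = [("x", 3), ("y", UNDEF), ("sum", 10)]
print(varmap("x", s))  # 3
print(varmap("y", s))  # undef
```

A dictionary would serve equally well; the list of pairs is used here only because it mirrors the set-of-pairs notation on the slide.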
Denotational semantics
Evaluation of denotational semantics:
Can be used to determine meaning of complete
programs in a given language
Provides a rigorous way to think about programs
Can be an aid to language design