
Chapter 6

6.3 Equivalence of Type Expressions


6.4 Type Conversions
6.5 Overloading of Functions and Operators

Dewan Tanvir Ahmed


Assistant Professor, CSE, BUET
Principal Task of Compiler
Type inference
the computation and maintenance of information on data types

Type checking
the use of the information to ensure that each part of the program
makes sense under the type rules of the language.

These two tasks are related, are performed together, and are referred to
collectively as type checking.
Data Type
1. A data type is a set of values together with certain operations on those
values.
Example:
integer refers to a subset of the mathematical integers,
together with the arithmetic operations that are provided by
the language definition.

2. Data types are described by type expressions. A type expression is either
– a type name (such as integer), or
– a structured expression (such as array[10]).
The operations are assumed or implied.
Type Expressions
1. Type expressions can occur in several places in a program.

2. Explicit type information
– int x; associates a type with a variable name
– class Car { … } defines a new type name

3. Implicit type information
– const greeting = “Hello!”; implicitly has type array of char in Pascal.

4. Type information that is contained in declarations is maintained in
the symbol table and retrieved by the type checker whenever the
associated names are referenced.
– Example: a[i]. Range checking of the index i is generally not statically determinable.
Equivalence of Type Expressions
1. Checking rules have the form:
if two type expressions are equal then return a certain type else return
type_error

2. It is important to have a precise definition of when two type
expressions are equivalent.

3. Potential ambiguities arise when names are given to type expressions
and the names are then used in subsequent type expressions.

4. The notion of type equivalence implemented by a specific compiler can
be explained using the concepts of
• Structural equivalence
• Name equivalence
Structural Equivalence of Type Expressions

1. Two type expressions are structurally equivalent if they are
• the same basic type, or
• formed by applying the same constructor to structurally equivalent types.
2. That is, two type expressions are structurally equivalent if and only if
they are identical.
3. Example:
• integer is equivalent to integer
• pointer(char) is equivalent to pointer(char)
4. Structural equivalence is often relaxed in practice.
• When arrays are passed as parameters, we may not wish to include
the array bounds as part of the type.
Structural Equivalence of Type Expressions (cont.)
function sequiv(s, t): boolean;
begin
    if s and t are the same basic type then
        return true
    else if s = array(s1, s2) and t = array(t1, t2) then
        return sequiv(s1, t1) and sequiv(s2, t2)
    else if s = s1 × s2 and t = t1 × t2 then
        return sequiv(s1, t1) and sequiv(s2, t2)
    else if s = pointer(s1) and t = pointer(t1) then
        return sequiv(s1, t1)
    else if s = s1 → s2 and t = t1 → t2 then
        return sequiv(s1, t1) and sequiv(s2, t2)
    else
        return false
end
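The sequiv pseudocode can be tried out with a small Python sketch. The tuple representation is an assumption made for illustration: a basic type is a string, and a constructed type is a tuple whose first element names the constructor ("array", "product", "pointer" or "function").

```python
def sequiv(s, t):
    """Return True if type expressions s and t are structurally equivalent."""
    # Basic types are plain strings such as "integer" or "char".
    if isinstance(s, str) or isinstance(t, str):
        return s == t
    # Constructed types are tuples: (constructor, component, ...).
    if s[0] != t[0] or len(s) != len(t):
        return False
    # Same constructor: every pair of components must be equivalent.
    return all(sequiv(a, b) for a, b in zip(s[1:], t[1:]))


# integer x integer -> real, written with the assumed tuple encoding
f = ("function", ("product", "integer", "integer"), "real")
g = ("function", ("product", "integer", "integer"), "real")
same = sequiv(f, g)
```

Because every constructor case in the pseudocode does the same thing (compare the components pairwise), the sketch folds them into one loop instead of one branch per constructor.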
Encoding of Type Expression
1. The encoding of type expressions described here is taken from a C compiler written by D. M. Ritchie.
2. Consider the constructors
• Array
• Pointer
• Function
3. Say
• pointer(t): a pointer to type t
• freturns(t): a function of some arguments that returns an object of
type t
• array(t): an array of elements of type t
4. Examples of such type expressions:
1. char
2. freturns(char)
3. pointer(freturns(char))
4. array(pointer(freturns(char)))
Encoding of Type Expression (cont.)
Type Constructor    Encoding
pointer             01
array               10
freturns            11

Basic Type          Encoding
boolean             0000
char                0001
integer             0010
real                0011

Type Expression                      Encoding
char                                 000000 0001
freturns(char)                       000011 0001
pointer(freturns(char))              000111 0001
array(pointer(freturns(char)))       100111 0001
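The tables above can be reproduced with a short Python sketch. The function name encode, the list-of-constructors argument and the fixed width of six constructor bits are assumptions chosen to match the four example rows; they are not taken from the original compiler.

```python
CONSTRUCTOR_BITS = {"pointer": "01", "array": "10", "freturns": "11"}
BASIC_BITS = {"boolean": "0000", "char": "0001", "integer": "0010", "real": "0011"}


def encode(constructors, basic, width=6):
    """Encode a type expression as '<constructor bits> <basic type bits>'.

    constructors lists the constructors outermost first, for example
    ["array", "pointer", "freturns"] for array(pointer(freturns(char))).
    """
    bits = "".join(CONSTRUCTOR_BITS[c] for c in constructors)
    if len(bits) > width:
        raise ValueError("too many constructors for the chosen width")
    # Unused leading positions are filled with zeros.
    return bits.rjust(width, "0") + " " + BASIC_BITS[basic]


row = encode(["array", "pointer", "freturns"], "char")
```

Since array sizes and function arguments never enter the bit string, two different array types with the same element type get the same encoding in this sketch, as the slides note for the scheme itself.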
Encoding of Type Expression (cont.)
1. Advantages
• Saves space
• Keeps track of the constructors that appear in any type expression
• Two different bit sequences cannot represent the same type because either
  – the basic types are different, or
  – the constructors in the type expressions are different
2. Disadvantages
• Different types could have the same bit sequence, since array sizes and
function arguments are not represented
Name for Type Expression

• Types can be given names.


• When names are allowed in type expressions, two notions of
equivalence of type expressions arise, depending on the treatment of
names.
• Name equivalence views each type name as a distinct type, so two
type expressions are name equivalent if and only if they are identical.
• Under structural equivalence, names are replaced by the type
expressions they define, so two type expressions are structurally
equivalent if they represent two structurally equivalent type
expressions when all names have been substituted out.
Name for Type Expression (cont.)
type link = ↑ cell;
var next : link;
    last : link;
    p : ↑ cell;
    q, r : ↑ cell;

Variable    Type expression
next        link
last        link
p           pointer(cell)
q           pointer(cell)
r           pointer(cell)

Do the variables next, last, p, q, r all have identical types?

Name equivalence:
• next and last have the same type
• p, q, r also have the same type
• But p and next do not.
Structural equivalence:
• all five variables have the same type because link is a name for
the type expression pointer(cell).
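The two answers can be checked mechanically. The following Python sketch assumes the same tuple representation of type expressions (strings for names and basic types, tuples for constructed types); TYPE_NAMES, VARS and the three helper functions are hypothetical names introduced for the example.

```python
# Declared type names and the expressions they stand for.
TYPE_NAMES = {"link": ("pointer", "cell")}

# Type expression recorded for each variable, as in the table above.
VARS = {
    "next": "link",
    "last": "link",
    "p": ("pointer", "cell"),
    "q": ("pointer", "cell"),
    "r": ("pointer", "cell"),
}


def name_equiv(s, t):
    """Name equivalence: the expressions must be identical as written."""
    return s == t


def expand(t):
    """Substitute out every declared type name (non-recursive names only)."""
    if isinstance(t, str):
        return expand(TYPE_NAMES[t]) if t in TYPE_NAMES else t
    return (t[0],) + tuple(expand(c) for c in t[1:])


def struct_equiv(s, t):
    """Structural equivalence: compare after all names are substituted out."""
    return expand(s) == expand(t)


p_vs_next_by_name = name_equiv(VARS["p"], VARS["next"])
p_vs_next_by_structure = struct_equiv(VARS["p"], VARS["next"])
```

Here cell is left as an opaque name. A recursive definition such as a record that mentions link would make expand loop forever, which is why real implementations work on a type graph instead.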
Name for Type Expression (cont.)
1. Confusion arises when an implicit type name is created for each
declared identifier whose declaration contains a type expression that is not a name.

Declarations as written:
type link = ↑ cell;
var next : link;
    last : link;
    p : ↑ cell;
    q, r : ↑ cell;

Declarations as treated, with the implicit type names np and nqr:
type link = ↑ cell;
     np = ↑ cell;
     nqr = ↑ cell;
var next : link;
    last : link;
    p : np;
    q : nqr;
    r : nqr;

Name equivalence:
• next and last have the same type
• q, r also have the same type
• But p, q and next do not have
equivalent types.
Type Graph
1. The typical implementation is to construct a type graph to
represent types.
2. Every time a type constructor or basic type is seen, a node is
created.
3. Every time a new type name is seen, a leaf is created.
4. With this representation, two type expressions are equivalent if
they are represented by the same node in the type graph.

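[Figure: type graph for the variables next, last, p, q and r. next and last refer to the leaf link, which stands for one pointer node; p refers to a second pointer node; q and r share a third pointer node. All three pointer nodes point to cell.]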
Cycles in Representation of Types
1. Recursive data types include lists, trees, and other structures.
2. Languages may or may not permit the direct use of recursion in
type declarations.
3. C allows recursion only indirectly, through pointers.

type link = ↑ cell;
     cell = record
                info : integer;
                next : link
            end;
Cycles in Representation of Types (cont.)
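type link = ↑ cell;
     cell = record
                info : integer;
                next : link
            end;

[Figure: acyclic type graph for cell. The record node has two × children, one pairing the field name info with integer and one pairing the field name next with a pointer node. The pointer node points to a leaf labelled cell.]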
Cycles in Representation of Types (cont.)
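[Figure: cyclic type graph for the same declaration of cell. The record node has two × children, one pairing info with integer and one pairing next with a pointer node. There is no leaf for the name cell; the pointer node points back to the record node, forming a cycle.]
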

Type Conversions
1. Consider the expression x+i, where x is real and i is integer.

2. Since the representation of integers and reals is different within a
computer, and different machine instructions are used for
operations on integers and reals, the compiler may have to convert one
of the operands of +.

3. The language definition specifies what conversions are necessary.

4. Postfix notation for x+i might be
– x i inttoreal real+
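A minimal Python sketch of how a compiler might emit that postfix form follows. The function postfix_add and the operator name int+ are assumptions for illustration; only inttoreal and real+ appear in the slides, and operand types other than integer and real are not handled.

```python
def postfix_add(left, right):
    """Emit postfix code for left + right, inserting inttoreal where needed.

    left and right are (name, type) pairs with type "integer" or "real".
    Returns (postfix string, result type).
    """
    (lname, ltype), (rname, rtype) = left, right
    tokens = [lname]
    if ltype == "integer" and rtype == "real":
        tokens.append("inttoreal")   # convert the left operand
    tokens.append(rname)
    if rtype == "integer" and ltype == "real":
        tokens.append("inttoreal")   # convert the right operand
    result = "integer" if ltype == rtype == "integer" else "real"
    # "int+" is a made-up name for the integer addition instruction.
    tokens.append("int+" if result == "integer" else "real+")
    return " ".join(tokens), result


code, result_type = postfix_add(("x", "real"), ("i", "integer"))
```

The conversion operator is placed directly after the operand it applies to, so it acts on the value on top of the stack before the addition runs.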
Coercions
1. Conversion from one type to another is said to be implicit if it is to be done
automatically by the compiler.

2. Implicit type conversions are called coercions.

3. Conversion is said to be explicit if the programmer must write something
to cause the conversion.

4. All conversions in Ada are explicit.

5. Explicit conversions look just like function applications to a type checker, so
they present no new problems.

6. Implicit conversion of constants can usually be done at compile time, often
with a great improvement to the running time of the object program.
Type checking rules for Coercions from integer to real

Production        Semantic Rule
E → num           { E.type = integer }
E → num.num       { E.type = real }
E → id            { E.type = lookup(id.entry) }
E → E1 op E2      { E.type = if E1.type = integer and E2.type = integer
                                 then integer
                             else if E1.type = integer and E2.type = real
                                 then real
                             else if E1.type = real and E2.type = integer
                                 then real
                             else if E1.type = real and E2.type = real
                                 then real
                             else type_error }
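These rules translate almost line for line into code. The Python sketch below assumes a tuple-based syntax tree, ("num", v), ("num.num", v), ("id", name) and ("op", left, right), and a plain dictionary standing in for the symbol table; both are illustrative choices, not part of the slides.

```python
def expr_type(node, symtab):
    """Compute E.type for a tuple-encoded expression tree."""
    kind = node[0]
    if kind == "num":                      # E -> num
        return "integer"
    if kind == "num.num":                  # E -> num.num
        return "real"
    if kind == "id":                       # E -> id
        return symtab.get(node[1], "type_error")
    if kind == "op":                       # E -> E1 op E2
        t1 = expr_type(node[1], symtab)
        t2 = expr_type(node[2], symtab)
        if t1 == "integer" and t2 == "integer":
            return "integer"
        # Any other mix of integer and real is coerced to real.
        if {t1, t2} <= {"integer", "real"}:
            return "real"
    return "type_error"


symtab = {"x": "real", "i": "integer"}
t = expr_type(("op", ("id", "x"), ("id", "i")), symtab)
```

The three rules that yield real are merged into one set test; an operand of any other type, or an undeclared name, falls through to type_error.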
Overloading of Functions and Operators
1. An overloaded symbol is one that has different meanings depending
on its context.
2. An operator is overloaded if the same operator symbol is used for two
different operations.
3. 2+3 represents integer addition.
4. 2.1+3.0 represents floating-point addition.
5. Overloading can be extended to user-defined functions.
int max (int x, int y)
double max (double x, double y)
6. The type checker can decide which function is meant based on the
types of the parameters (C++, Ada).
Set of possible Types for a Sub-expression
1. Overloading is resolved when a unique meaning for an occurrence of
an overloaded symbol is determined.
2. It is not always possible to resolve overloading by looking only at
the arguments of a function, because sub-expressions may have a
set of possible types.
3. Example:
– function “*” ( i , j : integer ) return complex
– function “*” ( i , j : complex ) return complex
– Thus the possible types for * include
• integer × integer → integer
• integer × integer → complex
• complex × complex → complex
4. The only possible type for 2, 3, and 5 is integer.
5. The sub-expression 3*5 has either type integer or type complex, depending on its context.
6. In the complete expression 2*(3*5), the sub-expression 3*5 must have type integer. Why?
Because * takes either two integers or two complex numbers as arguments, and 2 can only be an integer.
Determining the set of possible types of an expression

Production        Semantic Rule
E’ → E            E’.types = E.types
E → id            E.types = lookup(id.entry)
E → E1 ( E2 )     E.types = { t | there exists an s in E2.types such that
                                  s → t is in E1.types }

[Figure: annotated syntax tree for 3*5. The operand nodes over 3 and 5 each have the type set {i}; the operator * has the type set {i×i → i, i×i → c, c×c → c}; the root E has the type set {i, c}.]
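The rule for E → E1 ( E2 ) is a set comprehension, and it can be written that way directly. In this Python sketch a function type s → t is encoded as the pair (s, t) and a product type as a tuple; the names STAR, product_types and apply_types are assumptions for the example.

```python
I, C = "i", "c"  # abbreviations for integer and complex

# The three possible types of "*", each encoded as (argument type, result type).
STAR = {((I, I), I), ((I, I), C), ((C, C), C)}


def product_types(left, right):
    """Possible types of an argument pair, given the sets for each operand."""
    return {(a, b) for a in left for b in right}


def apply_types(fn_types, arg_types):
    """E.types = { t | there is an s in E2.types such that s -> t is in E1.types }."""
    return {t for (s, t) in fn_types if s in arg_types}


inner = apply_types(STAR, product_types({I}, {I}))      # 3*5
outer = apply_types(STAR, product_types({I}, inner))    # 2*(3*5)
```

Evaluating the sketch gives {i, c} for 3*5, matching the annotated tree. The set for 2*(3*5) is computed the same way, with the set for 3*5 as the second operand.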
Narrowing the Set of Possible Types
1. Ada requires a complete expression to have a unique type.

2. Given a unique type from the context, we can narrow down the type choices for
each sub-expression.

3. If this process does not result in a unique type for each sub-expression,
then a type error is declared for the expression.

4. First construct a syntax tree for the expression from the syntax-directed
definition.

5. Two depth-first traversals are needed.

6. During the first pass:
– The attribute types is synthesized bottom-up.

7. During the second pass:
– The attribute unique is propagated top-down, and
– as we return from a node, the code attribute can be synthesized.
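The two traversals can be sketched in Python for expressions built only from literals and the overloaded * of the earlier example. The classes Leaf and Mul, the functions synthesize and narrow, and the format of the generated code string are all assumptions made for illustration.

```python
I, C = "i", "c"  # abbreviations for integer and complex

# Overloadings of "*": ((left operand type, right operand type), result type).
STAR = [((I, I), I), ((I, I), C), ((C, C), C)]


class Leaf:
    def __init__(self, text, types):
        self.text = text
        self.types = set(types)   # possible types, known for a leaf
        self.unique = None        # filled in by the second pass


class Mul:
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.types = set()
        self.unique = None


def synthesize(node):
    """First pass, bottom-up: compute the attribute types."""
    if isinstance(node, Mul):
        lt, rt = synthesize(node.left), synthesize(node.right)
        node.types = {t for (a, b), t in STAR if a in lt and b in rt}
    return node.types


def narrow(node, expected):
    """Second pass, top-down: set unique, then build code on the way back up."""
    if expected not in node.types:
        raise TypeError(f"type {expected} is not possible here")
    node.unique = expected
    if isinstance(node, Leaf):
        return node.text
    fits = [(a, b) for (a, b), t in STAR
            if t == expected and a in node.left.types and b in node.right.types]
    if len(fits) != 1:
        raise TypeError("overloading of * cannot be resolved uniquely")
    a, b = fits[0]
    left_code = narrow(node.left, a)
    right_code = narrow(node.right, b)
    # Made-up postfix notation that records which "*" was selected.
    return f"{left_code} {right_code} *[{a}x{b}->{expected}]"


sub = Mul(Leaf("3", [I]), Leaf("5", [I]))
root = Mul(Leaf("2", [I]), sub)
synthesize(root)
code = narrow(root, C)   # the context asks for a complex result
```

In this run the context type c selects integer × integer → complex at the root, which in turn forces the sub-expression 3*5 to the unique type i.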
