You are on page 1of 18

Type Checking

The general topic of type checking includes two parts


Type synthesis assigning a type to each expression in the language Type checking making sure that these types are used in contexts where they are legal, catching type-related errors

Strongly typed languages are ones in which every expression can be assigned an unambiguous type
Weakly typed languages could have run-time errors due to type incompatibility

Statically typed languages vs. dynamically type languages


Capable of being checked at compile time vs. not

Static type checking vs. dynamic type checking


Actually check types at compile time vs. run time Run time type checking is less efficient because type information must be stored 1

Base Types
Numbers
integer C specifies length in relative terms, short and long Java specifies specific lengths: byte, short, int and long have 8, 16, 32 and 64 bits, respectively floating point numbers Many languages have two sizes, 32 and 64 bit
Can use IEEE representation standards

Characters
Single letter, used to be 8 bit ASCII standard, now can also be a 16 bit Unicode

Booleans
Two values: true and false C uses the 0 and 1 subrange of integers

Strings
Some languages have strings as a base type with catenation operators 2

Compound Types: Arrays and Strings


Arrays aggregate of values of the same type
Arrays have a base type for elements, may have an indexing range for each dimension int a [100][25], in C If the indexing range is known, then the compiler can compute space allocation, otherwise indexing is relative and space is allocated by a run-time allocator The main operation is indexing; some languages allow whole array operations

Strings sequence of characters


Also can have bit strings C treats strings as arrays of characters Strings can have comparison operators that use lexicographic order fee < fie
3

More Compound Types


Records or Structures components may have different types and may be indexed by names struct { double r; int i; }
Representation as ordered product of elements

Variant records or Unions a component may be one of a choice of types union ( double r; int i; }
Representation can save space for the largest element and may have a tag to indicate which type the value is Take care not to have run-time (value) errors

Other Types
Enumerated types the programmer can create a type name for a specific set of constant values enum WeekDay {Sunday, Monday, Saturday}
Representation as for a small set

Pointers abstraction of addresses


Can create a reference to, or derefence an object Distinguish between pointer to integer and pointer to boolean, etc. C allows arithmetic on pointers

Void Classes may or may not create new types


Classes can be represented by an extended type of record for data and methods But are not just types since they also have features such as inheritance
5

Function Types, New Type Names


Procedure and function types are sometimes called signatures
Give number and types of parameters May include parameter-passing information such as by value or by reference Give type of result (or indicate no result) strlength : String -> unsigned int

Type declarations or type definitions allow the programmer to assign new type names to a type expression
These names should also be stored in the symbol table Will have scope and other attributes Definitions may be recursive in some languages If so, size of object will be unknown
6

Representing Types
Types can be represented as expressions, or quite often as trees:

array(9)

function

type

arglist

result type argn type


7

arg1 type

Type Equivalence
Name equivalence in this kind of equivalence, two types must have the same name
Assumes that the programmer will introduce new names exactly when they want types to be different if you say t1 = int and t2 = int, t1 and t2 are different

Structural equivalence two objects are interchangeable if their types have the same fields with equivalent types. int x[10][10] and int y[10][10] x and y have equivalent types, the 10 by 10 arrays
More complex situations arise in structural equivalence of other compound types, e.g. may have mutually recursive type definitions

Type checking rules extend this to a notion of type compatibility


8

Type Synthesis and Type Checking


Assigning types to language expressions (type synthesis) can be done by a traversal of the abstract syntax tree At each node, a type checking rule will say which types are allowed Description here is for languages with type declarations
Constants are assigned a type If there is not a known type, then there will be a set of possibles Variables are looked up in the symbol table Note that we are assuming an L-attributed grammar so that declarations are processed first Assignment The type of the assignable entity on the left must be the typeEqual to the type of the value on the right
9

Types for expressions


Arithmetic and other operators have result types defined in terms of the types of the subnodes in the tree Statements have substructures that need to be checked for type correctness
Condition of if and while statements must have type boolean

Array reference
Suppose we have exp1 -> exp2[exp3], then an adhoc SDT: if (isArrayType(exp2.type)) and typeEqual(exp3.type, integer) then exp1.type = exp2.type.basetype // get the basetype child else type-error(exp1)

Function calls have similar rules to check the signature of the function name the parameters
10

Issues for typeEqual


This is sometimes called type compatibility Overloading
Arithmetic operators 2 + 3 means integer addition 2.0 + 3.0 means floating pt addition Language may have an arithmetic operator table, telling which type of operator will be used based on the expressions Type of: a int int int float float double b int float double float double double a+b int float double float double double
11

Overloading Functions
Can declare the same function (or method) name with different numbers and types of parameters int max(int x,y) double max(double x,y)
Java and C++ allow such overloaded declarations

Need to augment the symbol table functionality to allow for the name to have multiple signatures
The lookup procedure is always give a name to look up we can add a typelist argument and the lookup can tell us if there is a function declared with that signature Or the lookup procedure is given the name to look up and in the case of a method, it can return sets of allowable signatures

12

Conversion and Coercion


The typeEqual comparison is commonly extend to allow arithmetic expressions of mixed type and for other cases of types which are compatible, but not equal
If mixed types are allowed in an arithmetic expression, then a conversion should be inserted (into the AST or the code) 2 * 3.14 becomes code like t1 = float ( 2) t2 = t1 * 3.14

Conversion from one type to another is said to be implicit it if is done automatically by the compiler, and is also called coercion Conversion is said to be explicit if the programmer must write the conversion
Called casts in C and Java languages
13

Widening and Narrowing


The rules for Java type conversion distinguishes between
Widening conversions which preserve information Narrowing conversion which can lose information

Conversions between primitive types in Java: (there is also widening and narrowing for references, i.e. objects) Widening Narrowing (usually with a cast) double float long int short byte
14

double float long int char char short byte

Generating Type Conversions


For an arithmetic expressions e1 e2 op e3, the algorithm can generally use widening, to possibly a third type that is greater than both the types of e2 and e3 in the widening tree: Let the new type of e1 be the max of e2.type and e3.type Generate a widening conversion of e1 if necessary Generate a widening conversion of e2 if necessary Set the type (and later the code) of e1 Type conversions and coercions also apply to assignment if r is a double and i is an int: allow r = i;
C also allows i = r, with the corresponding loss of information For classes, we have the subtype principle, objects of subclasses can be assigned to objects of superclasses (with the corresponding loss of information), but not vice versa
15

Continuing type synthesis and checking rules


Expressions: the two expressions involved with boolean operators, such as &&, must both be boolean Functions: the type of each actual parameter must be typeEqual to its formal parameter Classes:
if specified, the parent of the class must be a properly declared class If a class says that it implements an interface, then all methods of the interface must be implemented

16

Polymorphic Typing
A language is polymorphic is language constructs can have more than one type procedure swap(anytype x,y) where anytype is considered to be a type variable Polymorphic functions have type patterns or type schemes, instead of actual type expressions The type checker must check that the types of the actual parameters fit the pattern
Technically, the type checker must find a substitution of actual types for type variables that satisfies the type equivalence between the formal type pattern and the actual type expression In complex cases with recursion, may need to do unification to solve the substitution problem Most notably in the language ML

Java and C# allow polymorphic type constructors, called generics


17

References
Keith Cooper and Linda Torczon, Engineering a Compiler, Elsevier, 2004.. Kenneth C. Louden, Compiler Construction: Principles and Practices, PWS Publishing, 1997. Aho, Lam, Sethi, and Ullman, Compilers: Principles, Techniques, and Tools. Addison-Wesley, 2006. (The purple dragon book) Charles Fischer and Richard LeBlanc, Jr., Crafting a Compiler with C, Benjamin Cummings, 1991.

18

You might also like