Unit-2 PPL Datatypes

UNIT-3
Data Types
Unit-3 Topics
Introduction
Primitive data types
Character string types
User-defined ordinal types
Array types
Associative arrays
Record types
Union types
Pointer and reference types
Type Checking
Strong Typing
Type Equivalence
Introduction
DATA TYPE
• A data type defines a collection of data objects
and a set of predefined operations on those
objects.
TYPE SYSTEM
o Defines how a type is associated with each
expression in the language
o Includes rules for type equivalence and type
compatibility
Copyright © 2009 Addison-Wesley. All rights reserved.
DATA TYPES
A descriptor is the collection of the attributes of a
variable
An object represents an instance of a user-defined
(abstract data) type
One design issue for all data types: What operations
are defined and how are they specified?
Evolution of data types
FORTRAN I (1957)
o Just types for INTEGER, REAL, arrays
COBOL later let the programmer specify the precision of
numeric values.
ALGOL 68 provided a few basic data types and, in addition,
let the user define new types.
Ada (1983)
o Programmer able to create a user-defined type for every category of
variables in the problem space and have the system enforce the
types
Primitive Data Types
• Almost all programming languages provide a set
of primitive data types
• Primitive data types: Those not defined in terms
of other data types
• Some primitive data types are merely reflections
of the hardware
• Others require only a little non-hardware
support for their implementation
Primitive Data Types: Integer
• Almost always an exact reflection of the
hardware so the mapping is trivial
• There may be as many as eight different integer
types in a language
o Java’s signed integer sizes: byte, short,
int, long
o C and C++ have these plus a set of corresponding
unsigned types
• Scripting languages generally have one integer
type
Representing Integers
• Positive numbers can be converted to base 2
– [e.g. represent 7, 12, 14]
• How do you represent the sign when you only
have 0s and 1s?
o Sign bit
o Ones complement
o Twos complement
Using a Sign Bit
• Use one bit of the representation for the
sign
• Unsigned data types don’t have a sign bit
o unsigned char can represent the values 0 to 255
o signed char typically represents -128 to 127 (8 bits)
One’s Complement Representation
• Negative numbers are the bitwise complement of
the corresponding positive number
Two’s Complement
• Take the complement and add one
• Counting is continuous through zero: -1, 0, and 1 are
consecutive bit patterns, and there is only one zero
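The three sign conventions can be compared side by side. The following is a minimal Python sketch, assuming an 8-bit width; the function names are illustrative and do not come from the slides.

```python
BITS = 8  # assumed word width, chosen only for illustration

def sign_magnitude(n, bits=BITS):
    """One sign bit followed by the magnitude in base 2."""
    sign = "1" if n < 0 else "0"
    return sign + format(abs(n), f"0{bits - 1}b")

def ones_complement(n, bits=BITS):
    """A negative value is the bitwise complement of the positive value."""
    mask = (1 << bits) - 1
    if n >= 0:
        return format(n, f"0{bits}b")
    return format(~abs(n) & mask, f"0{bits}b")

def twos_complement(n, bits=BITS):
    """Masking to the word width gives the same bits as complement-and-add-one."""
    mask = (1 << bits) - 1
    return format(n & mask, f"0{bits}b")

for value in (7, -7, 12, -12, 14, -14):
    print(value, sign_magnitude(value), ones_complement(value), twos_complement(value))
```

Sign-magnitude and one's complement each have two encodings of zero, which is one reason two's complement is the usual hardware choice.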

Floating Point Types
• Model real numbers, but only as approximations.
• On most computers, floating-point numbers are
stored in binary, which exacerbates the problem: many
decimal fractions, such as 0.1, have no exact binary form.
• Another problem is the loss of accuracy through
arithmetic operations.
• Languages for scientific use support at least two
floating-point types, float and double
(sometimes more).
Floating Point Types
• The collection of values that can be
represented by a floating-point type is defined
in terms of precision and range.
• Usually exactly like the hardware, but not
always.
• The same arithmetic, relational and
assignment operations described for integers
are also provided for reals. The Boolean
operations are restricted slightly.
Representing Real numbers
• Decimal numbers can also be converted to
base 2
• Now we need a way to represent both the sign
and the decimal point
o Use one bit for the sign
o For the decimal point two possibilities
• Always have the same number of bits before and after
• Use scientific notation – 1.1 * 10^2 = 110
Floating Point
Floating point numbers are stored in the form of
scientific notations. The storage will be divided into a
mantissa and an exponent.
The figure shows the IEEE Floating-Point Standard
754 format for single and double-precision
representation (IEEE, 1985).
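The single-precision layout can be inspected directly. This is a small Python sketch, assuming the standard 1-bit sign, 8-bit exponent and 23-bit fraction split of IEEE 754 single precision; float32_fields is a hypothetical helper name.

```python
import struct

def float32_fields(x):
    """Split x, rounded to IEEE 754 single precision, into its three fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8-bit exponent, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23-bit fraction (the leading 1 is implicit)
    return sign, exponent, fraction

print(float32_fields(1.0))
print(float32_fields(-2.5))
print(0.1 + 0.2 == 0.3)   # the approximation problem: this comparison is False
```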
COMPLEX
Some programming languages support a complex data
type, for example Fortran, Python, and C99.
Complex values are represented as ordered pairs of
floating-point values.
In Python, the imaginary part of a complex literal is
specified by following it with a j or J
Languages that support a complex type include
operations for arithmetic on complex values.
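A short Python illustration of the complex type described above; the variable names z and w are arbitrary.

```python
z = 3 + 4j            # imaginary part written with a trailing j
w = complex(1, -2)    # the same type built from an ordered pair of floats

print(z.real, z.imag)   # the two floating-point components
print(z + w)            # complex addition
print(z * w)            # complex multiplication
print(abs(z))           # magnitude
```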
Decimal
Most larger computers that are designed to support
business systems applications have hardware support for
decimal data types.
Decimal data types store a fixed number of decimal
digits, with the decimal point at a fixed position in the
value.
These are the primary data types for business data
processing and are therefore essential to COBOL. C#
and F# also have decimal data types.
DECIMAL
Advantage: accuracy; decimal values such as 0.1 are stored exactly.
Disadvantages: limited range since no exponents
are allowed, and its representation wastes memory.
Decimal types are stored very much like character
strings, using binary codes for the decimal digits.
These representations are called binary coded
decimal (BCD).
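A sketch of packed BCD in Python, assuming two 4-bit digit codes per byte; to_packed_bcd is a hypothetical helper. Python's decimal module is used only to contrast exact decimal arithmetic with binary floating point, and nothing here claims that it stores values as BCD.

```python
from decimal import Decimal

def to_packed_bcd(digits):
    """Pack a string of decimal digits two per byte, one 4-bit code per digit."""
    if len(digits) % 2:
        digits = "0" + digits   # pad to a whole number of bytes
    pairs = zip(digits[::2], digits[1::2])
    return bytes((int(high) << 4) | int(low) for high, low in pairs)

print(to_packed_bcd("1234").hex())
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # exact decimal arithmetic
print(0.1 + 0.2 == 0.3)                                    # binary floating point
```

Each byte holds only 100 of its 256 possible bit patterns, which is the memory waste noted above.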
Boolean Types
Boolean types are perhaps the simplest of all types.
Their range of values has only two elements: one for
true and one for false.
They were introduced in ALGOL 60 and have been
included in most general-purpose languages designed
since 1960.
One exception is C89, in which numeric expressions are used
as conditionals: all operands with nonzero values are
considered true, and zero is considered false.
Boolean Types
This is not the case in later languages such as
Java and C#, which have a distinct Boolean type.
Boolean types are often used to represent switches or
flags in programs.
Using a Boolean type for this purpose is more readable than using integers.
A Boolean value could be represented by a single bit,
but because a single bit of memory cannot be accessed
efficiently on many machines, Booleans are often stored
in the smallest efficiently addressable cell, typically a byte.
Primitive Data Types: Character
Stored as numeric codings
Most commonly used coding: ASCII
An alternative, 16-bit coding: Unicode (UCS-2)
Includes characters from most natural languages
Originally used in Java
C# and JavaScript also support Unicode
32-bit Unicode (UCS-4)
Supported by Fortran, starting with 2003
Non-Primitive Data Types
Non-primitive data types are constructed from
primitive data types.
Ex: arrays, sets, subranges, enumerations, pointers,
strings, structures and unions.
Character Strings
A character string type is one in which the values
consist of sequences of characters.
Character string constants are used to label output,
and the input and output of all kinds of data are often
done in terms of strings.
Design Issues
The two most important design issues that are
specific to character string types are the
following:
• Should strings be simply a special kind of
character array or a primitive type?
• Should strings have static or dynamic length?
Strings and Their Operations
The common string operations are assignment,
catenation, substring reference, comparison, and
pattern matching.
A substring reference is a reference to a
substring of a given string; substring
references are also called slices.
Both the assignment and comparison operations
on character strings are complicated when
the operands have different lengths.
In some languages pattern matching is built in; in others
it is provided by a function or class library.
If strings are not defined as a primitive type, string data
is usually stored in arrays of single characters and
referenced as such in the language. This is the approach
taken by C and C++.
C and C++ use char arrays to store character strings.
These languages provide a collection of string
operations through a standard library whose header file
is string.h.
For example, consider the following declaration:
char str[] = "apples";
In this example, str is an array of char elements,
specifically apples0, where 0 is the null
character.
The most commonly used library functions for
character strings are:
strcpy
strcat
strcmp
strlen
strcpy(dest, src); /* copies the string src into dest */
Fortran 95 treats strings as a primitive type and
provides assignment, relational operators, catenation
and substring reference operations.
In Java, strings are supported as a primitive type by the
String class, whose values are constant strings, and the
StringBuffer class, whose values are changeable and
are more like arrays of single characters.
Subscripting is allowed on StringBuffer variables.
C# and Ruby include string classes that are similar to
those of Java.
Python also has strings as a primitive type and
has operations for substring reference,
catenation, indexing to access individual
characters, as well as methods for searching
and replacement.
For character and substring references, they act very much
like arrays of characters. However, Python
strings are immutable, similar to the String
class objects of Java.
Perl, JavaScript, Ruby and PHP include built-in
pattern-matching operations.
In these languages, the pattern-matching
expressions are somewhat loosely based on
mathematical regular expressions.
In fact, they are often called regular expressions.

Consider the following pattern expression:
/[A-Za-z][A-Za-z\d]+/
This pattern matches the typical form of a name: the first character class
specifies a letter, the second specifies a letter or digit, and the + after it
requires one or more characters from that second class.
Next, consider the following pattern expression:

/\d+\.?\d*|\.\d+/
This pattern matches numeric literals.
\. Specifies a literal decimal point.
(|) separate two alternatives in the whole pattern.
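Both patterns can be tried out with Python's re module, whose syntax accepts them unchanged. This is a sketch for experimentation; the sample strings are made up.

```python
import re

name = re.compile(r"[A-Za-z][A-Za-z\d]+")   # a letter, then one or more letters or digits
number = re.compile(r"\d+\.?\d*|\.\d+")     # digits with optional point, or point then digits

for text in ("x1", "total2", "x", "9lives"):
    print(text, bool(name.fullmatch(text)))

for text in ("42", "3.14", "7.", ".5", "."):
    print(text, bool(number.fullmatch(text)))
```

Because of the +, the first pattern needs at least two characters, so a one-letter name such as x does not match.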
String Length Options
There are several design choices regarding the
length of string values.
First, the length can be static and set when the string
is created. Such a string is called a static length
string.
The second option is to allow strings to have varying
length up to a declared and fixed maximum set by
the variable’s definition, as exemplified by the strings
in C and the C-style strings of C++.
These are called limited dynamic length strings.

String Length Options
The third option is to allow strings to have
varying length with no maximum, as in
JavaScript and Perl. These are called dynamic
length strings.
Ada 95 supports all three string length options.

Implementation of Character String Types
Character string types could be supported
directly in hardware, but in most cases software is used
to implement string storage, retrieval, and manipulation.
A descriptor for a static character string type,
which is required only during compilation, has
three fields.
The first field of every descriptor is the name of the
type.
In the case of static character strings, the second field
is the type’s length (in characters).
The third field is the address of the first character.
Limited dynamic strings require a run-time
descriptor to store both the fixed maximum
length and the current length.
The limited dynamic strings of C and C++ do
not require run-time descriptors because the end
of the string is marked with the null character.
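A run-time descriptor for a limited dynamic string can be sketched as a small Python class. The class and field names are hypothetical; the point is only that both the fixed maximum length and the current length are kept at run time.

```python
class LimitedDynamicString:
    """Hypothetical run-time descriptor: fixed maximum plus current length."""

    def __init__(self, max_length):
        self.max_length = max_length          # fixed by the variable's definition
        self.current_length = 0               # changes with each assignment
        self.storage = [""] * max_length      # stands in for the allocated cells

    def assign(self, text):
        if len(text) > self.max_length:
            raise ValueError("string exceeds the declared maximum length")
        self.storage[:len(text)] = list(text)
        self.current_length = len(text)

    def value(self):
        return "".join(self.storage[:self.current_length])

s = LimitedDynamicString(6)
s.assign("apples")
s.assign("fig")       # shorter value reuses the same storage
print(s.value(), s.current_length, s.max_length)
```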
Character String Type in Certain Languages
C and C++
Not primitive
Use char arrays and a library of functions that provide
operations
SNOBOL4 (a string manipulation language)
Primitive
Many operations, including elaborate pattern matching
Fortran and Python
Primitive type with assignment and several operations
Java
Primitive via the String class
Perl, JavaScript, Ruby, and PHP
- Provide built-in pattern matching, using regular expressions
Character String Length Options
Static: COBOL, Java’s String class
Limited Dynamic Length: C and C++
In these languages, a special character is used to
indicate the end of a string’s characters, rather than
maintaining the length
Dynamic (no maximum): SNOBOL4, Perl,
JavaScript
Ada supports all three string length options
Ordinal types
An ordinal type is one in which the range of possible values can
be easily associated with a subset of positive integers.
Examples of typical predefined ordinal types

Integer
Character
Boolean
We will consider the following user-defined ordinal types

Enumeration type
Subrange type
Enumeration type
An enumeration type is one in which the user
enumerates all of the possible values
Values are symbolic constants (identifiers)
Example (Ada)
type Days is (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday);
for today in Tuesday .. Thursday loop
   ...
end loop;
Enumeration type
Design Issues
What operations are allowed for enumeration types
o Ada has attribute operations
• Days'First gives the first day
• Days'Last gives the last day
• Days'Pos( today ) gives the Integer position in the enum list
• Days'Val( 3 ) gives the enum value associated with position 3
• Days'Pred( today ) gives the predecessor of today
• Days'Succ( today ) gives the successor of today
Should comparison operations =, <, <=, etc. be allowed?
Should a symbolic constant be allowed to be in more
than one type definition (overloading)?
Is coercion performed to or from enumeration values?
Enumeration choices
Pascal
Cannot overload enumeration constants
Enums can be used for array subscripts and case selectors
Enums can be compared
No operations for input or output
C and C++
Can be used like Pascal, but . . .
Coercion, as in “today++” or as in “int n = today”
Operations for input and output as integers
Ada
Can be used as in Pascal, but . . .
Enums may be overloaded
No coercion and allowed ranges are checked
Operations exist for input and output of enumeration values in text form
C#
No coercion and allowed ranges are checked
Enumeration type
Evaluation
Aid to readability
 Names are easily recognized whereas coded values are not
 E.g. – no need to code a color as a number
Aid to reliability
 Compiler can check
• Operations on enums
– E.g. – don’t allow colors to be added
• Ranges of allowed values
– E.g. – Ada detects the error in day := Days'Succ( Saturday )
Implementation
Enumeration types are implemented as integers
Subrange type
The subrange type is an ordered contiguous
subsequence of an ordinal type
Examples (Ada)
subtype Positive is Integer range 1 .. Integer'Last;
subtype Natural is Integer range 0 .. Integer'Last;
subtype Index is Integer range -1 .. 100;
for next in Index loop
   ...
end loop;
type Days is (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday);
subtype Weekdays is Days range Monday .. Friday;
for today in Weekdays loop
   ...
end loop;
Subrange type
Evaluation
Aid to readability
 E.g. – Can distinguish between a weekday and a day
Reliability
 Restricted ranges aid error detection
 E.g. – Saturday is not a valid weekday
Implementation
Subrange types are just the parent types with check
code (inserted by the compiler) to restrict assignments
to subrange values
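The compiler-inserted check code can be imitated in Python. This is only a sketch: make_subrange is a hypothetical helper standing in for what an Ada compiler generates, and a ValueError plays the role of the run-time range error.

```python
def make_subrange(low, high):
    """Return a checker that stands in for compiler-inserted range-check code."""
    def check(value):
        if not low <= value <= high:
            raise ValueError(f"{value} is outside {low} .. {high}")
        return value
    return check

index = make_subrange(-1, 100)   # mirrors: subtype Index is Integer range -1 .. 100
n = index(42)                    # every assignment passes through the check
print(n)
```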
Arrays
An array is an aggregate of indexed data elements of the
same type
Two types involved
 Element type
 Index type
Each individual element is identified by an index to its position in
the aggregate
Design Issues
What types are legal for subscripts?
Are subscripting expressions in element references range
checked?
When does binding occur for subscript ranges?
When does allocation take place?
What is the maximum number of subscripts?
Can array objects be initialized?
Are any kind of slices allowed?
Arrays
Index syntax
FORTRAN, PL/I, Ada use parentheses
 Ada intentionally uses parentheses to make an array
reference look like a function call
n := a( 23 );
Most other languages use brackets [ ]
Indexing is a storage mapping from the array
indices to elements
This mapping requires a run-time calculation to
reference memory
Array storage mapping example
Storage mapping for 2-dim array b
Row-wise allocation is used
[Figure: grid for array b with columns j = 0..4 and rows i = 0..6; x marks element b[ i, j ]]
Access code for b[ i, j ] requires
2 adds and 2 multiplies
w is the size of each cell in bytes
loc( b[ i, j ] )
= loc( b )
+ w * ( (# elements in previous rows)
+ (# previous elements in row i) )
= loc( b ) + w * ( i * (# columns) + j )
= loc( b ) + 4 * ( 5 * i + j ) for w = 4 and 5 columns
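The mapping formula translates directly into code. A minimal Python sketch, assuming zero-based subscripts and row-wise allocation; the base address 1000 is an arbitrary example value.

```python
def element_address(base, i, j, num_columns, cell_size):
    """Row-major address of b[i, j]: two multiplies and two adds."""
    return base + cell_size * (i * num_columns + j)

# 5 columns and 4-byte cells, as in the last line of the derivation
print(element_address(base=1000, i=2, j=3, num_columns=5, cell_size=4))
```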
Arrays
Subscript types
FORTRAN, C, and Java
 Integer only
Pascal and Ada
 Any ordinal type
• Integer, Boolean, Character, enum
Range checking
Java, ML, C# check the range of all subscripts
C, C++, Perl, Fortran do not
Ada checks by default but this can be disabled by a
compiler Pragma
Array binding and allocation
We consider the following categories of arrays
Static array
Fixed stack-dynamic array
Stack-dynamic array
Fixed heap-dynamic array
Heap-dynamic array
These are based on when the subscript ranges are
bound and when storage is allocated
Static arrays
Range of subscripts and storage bindings are static
 e.g. FORTRAN 77, some arrays in Ada, C/C++ static arrays
Advantage
 Execution efficiency
 No run-time overhead for allocation or deallocation
Fixed stack-dynamic arrays
The range of subscripts is statically bound
Storage is bound at elaboration time
 e.g. – most local variable arrays
Advantage: space efficiency
Stack-dynamic arrays
The index range and storage allocation are dynamic,
but fixed from then on for the variable’s lifetime
Advantage: flexibility
o Size need not be known until the array is about to
be used
E.g. – Ada declare blocks
n := <expression>;
declare
   a : array (1..n) of Float;
begin
   ...
end;
Fixed heap-dynamic arrays
Like stack-dynamic arrays except . . .
o Storage is allocated on the heap
o The index range and storage allocation are initiated by program
request rather than subprogram elaboration
E.g. – all Java arrays
Heap-dynamic arrays
The subscript range and storage bindings are dynamic and may
subsequently be changed
Supported by Smalltalk (e.g. – OrderedCollection), APL,
Perl, JavaScript, FORTRAN 90, and C# ArrayList class
Arrays
Number of subscripts
FORTRAN I allowed up to three
FORTRAN 77 allows up to seven
Other languages - no limit
Array initialization
Some languages permit initialization of arrays
Fortran
Integer List( 3 )
Data List / 21, 67, 9 /
C
int list [ ] = { 21, 67, 9 };
Ada “aggregates”
list : array( 1 .. 3 ) of Integer := ( 21, 67, 9 );
list : array( 1 .. 100 ) of Integer := ( 10 => 21, 20 => 67, 30 => 9, others => 0 );
list : array( 1..10, 1..3 ) of Integer := (1 => (1,2,3), 10 => (4,5,6), others => (0, 0,0));
Array operations
An array operation operates on an array or a part
of an array as a unit
Ada operations
Assignment
Catenation (1-dim only)
Equality (=) and inequality (/=)
APL
Most powerful array-processing language ever devised
Many array operations
Slices
A slice is some substructure of an array
It is nothing more than a referencing mechanism
Slices are only useful in languages that have array
operations
Fortran slices at right
Ada slices below
a : array (1..100) of Float;
a( 1..50 ) := a( 51..100);
Associative arrays
An associative array is an unordered collection of data
elements that are indexed by an equal number of values
called keys
Also called a . . .
Map
Key-value table
Dictionary
Perl example
An associative array is called a hash in Perl
Names begin with %
Aggregate literals are delimited by parentheses
o E.g. – %temps = ("Monday" => 77, "Tuesday" => 79, …);
Subscripting is done using braces and keys; a single element is a scalar, so its name begins with $
o E.g. – $temps{"Wednesday"} = 83;
Elements can be removed with delete
o E.g. – delete $temps{"Tuesday"};
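For comparison, the same three operations on a Python dictionary. Only the two entries visible in the Perl literal are used, since the remaining entries are elided in the source.

```python
temps = {"Monday": 77, "Tuesday": 79}   # aggregate literal
temps["Wednesday"] = 83                 # subscripting with a key
del temps["Tuesday"]                    # element removal

print(temps)
print("Tuesday" in temps)               # membership test on keys
```

Current Python dictionaries happen to preserve insertion order, but lookups are still by key rather than by position.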
Records
A record is an aggregate of named data elements of
possibly diverse types
[Figure: a compile-time descriptor for a record]
The offset is from the record base address
Design Issues
What is the form of references?
What unit operations are defined?
Records
Called the struct data type in C, C++, and C#
A class defines a record in Java and Smalltalk
Record declarations
COBOL uses level numbers to show nested records
Other languages use a recursive definition
Field references
COBOL
 <fieldName> OF <recordName2> OF<recordName1>
Other languages use dot notation
 <recordName1>.<recordName2>.<fieldName>
Records
Fully qualified field references must include all
nested record names
Elliptical references allow leaving out record
names as long as the reference is unambiguous
Pascal provides a with clause to abbreviate
references
Record Operations
Assignment
Allowed in Pascal, Ada, and C if the types are identical
In Ada, the RHS can be a record aggregate constant
COBOL uses “MOVE CORRESPONDING”
 Moves all fields in the source record to fields with the same
names in the destination record
Initialization
Allowed in Ada, using an aggregate constant
In Java, done by the constructor
Comparison
Ada has tests for equality (=) and inequality (/=)
Arrays vs. records
Access to array elements is slower than access to record
fields when subscripts are computed at run time
Each record field is accessed with a fixed offset from
the record base address
Array subscripts require run-time calculation
Union types
A union is a type whose variables are allowed to store
different type values at different times during execution
Design issue for unions
How should type checking be done?
Examples
Fortran has EQUIVALENCE
 No type checking
C and C++ have free unions
 Not part of structs
 Complete freedom from type checking
Pascal embeds unions in records
 Design leads to ineffective type checking
Discriminated unions
Algol 68 and Ada use discriminated unions
This provides secure type checking
Ada
Ada embeds discriminated unions in records
One record field is called the discriminant or tag
The discriminant in the example on the following
slide is Form
Ada example
type Shape is ( Circle, Triangle, Rectangle );
type Colors is ( Red, Green, Blue );
type Figure( Form : Shape ) is record
   Filled : Boolean;
   Color : Colors;
   case Form is
      when Circle =>
         Diameter : Float;
      when Triangle =>
         LeftSide : Integer;
         RightSide : Integer;
         Angle : Float;
      when Rectangle =>
         Height : Integer;
         Width : Integer;
   end case;
end record;
The discriminant field Form may not be changed in isolation
It may only be changed by assigning to the entire record
This prevents the record fields from becoming inconsistent
Ada example
Assignment using a record aggregate
Fig : Figure;
Fig := ( Filled => true, Color => Blue, Form => Rectangle, Height => 12, Width => 3 );
Layout of record fields

Fields Diameter, LeftSide, RightSide, Angle, Height
and Width share the same bytes
Pointer types
Pointer type values consist of memory addresses
and the special value nil (or null)
Pointers are used for
Indirect addressing
Management of heap-dynamic variables
 These are anonymous variables
Pointer operations
Assignment operation
Sets a pointer to a useful address
Dereferencing operation
Interprets the pointer variable as representing the
object at the memory address contained in the pointer
variable
Thus, it applies one level of indirect addressing
Deallocation
Returns the heap-dynamic storage referred to by a
pointer to the system for reallocation
Problems with pointers
Dangling pointers
A dangling pointer refers to a heap-dynamic variable
that has been deallocated
To create a dangling pointer in Pascal with explicit
deallocation . . .
 Allocate a heap-dynamic variable pointed to by p
 Make an alias for the pointer: q := p
 Explicitly deallocate the heap-dynamic variable: dispose( p );
 Now q contains a dangling pointer
Problems with pointers
Lost heap-dynamic variables
A lost heap-dynamic variable is no longer referenced
by any program pointer and is inaccessible
To create a lost heap-dynamic variable . . .
 Allocate a heap-dynamic variable pointed to by p
 Replace the pointer in p by a reference to some other heap-
dynamic variable: p := q
 Now the first heap-dynamic variable is inaccessible
The process of losing heap-dynamic variables is called
memory leakage
Pointers in C and C++
Pointers in C and C++ are similar to addresses in
assembly language
Pointers may point virtually anywhere in memory
Pointer arithmetic is possible
Programmer is responsible for avoiding problems
of dangling pointers and lost heap-dynamic
variables
Pointers in C and C++
Dereferencing is explicitly specified with the * operator
Reference type variables (C++ only) are constant pointers declared
with the & operator
Reference variables are always implicitly dereferenced
Used for parameter passing
 pass-by-reference
int count; /* defines count as an int variable */
int *ptr; /* defines ptr as a pointer to an int variable */
int sum;
ptr = &sum; /* operator & produces the address of sum */
count = *ptr; /* operator * dereferences ptr and produces the value in sum */
ptr = ptr + 3; /* increments address in ptr by 3 * sizeof(int), i.e. 12 with 4-byte ints */
int &ref = sum; /* C++ only: ref is a constant pointer that creates an alias for sum */
ref = 23; /* assigns 23 to sum (implicitly dereferenced) */
Pointers in Ada
Called access types
Used only for heap-dynamic variables
No pointer arithmetic
All access variables are initialized to null
This also provides reliability
Heap-dynamic variables may (implementation option) be
implicitly deallocated at the end of the scope of a pointer type
Partially alleviates the problem of lost heap-dynamic variables
Has an explicit deallocator: Unchecked_Deallocation
Dangling pointer problem is possible
Pointers in Java
These are called reference types
Refer to heap-dynamic objects exclusively
No pointer arithmetic
All reference variables are initialized to null
No explicit deallocation
This prevents the dangling pointer problem
All objects are implicitly deallocated by garbage collection
Garbage collection prevents the lost heap-dynamic variable
problem
Reference variables are implicitly dereferenced whenever
the dot notation is used, as in p.link
Dangling pointer problem
The problem of dangling pointers can be resolved
using . . .
Tombstones
Locks and keys
Tombstones
Tombstone
An extra heap cell that
is a pointer to the
heap-dynamic variable
The actual pointer
variable points only at
a tombstone
When a heap-dynamic
variable is deallocated,
the tombstone remains
but is set to null
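The tombstone idea can be simulated in Python. This is a sketch with hypothetical class names; Python objects stand in for heap cells, and a RuntimeError stands in for the run-time detection of a dangling pointer.

```python
class Tombstone:
    """Extra heap cell holding the only direct reference to the variable."""
    def __init__(self, target):
        self.target = target

class Pointer:
    """A program pointer refers to a tombstone, never to the variable itself."""
    def __init__(self, tombstone):
        self.tombstone = tombstone

    def deref(self):
        if self.tombstone.target is None:
            raise RuntimeError("dangling pointer: variable was deallocated")
        return self.tombstone.target

def deallocate(pointer):
    # The tombstone itself stays allocated; only its contents are set to null.
    pointer.tombstone.target = None

p = Pointer(Tombstone([1, 2, 3]))
q = Pointer(p.tombstone)    # alias, as in q := p
print(q.deref())
deallocate(p)               # q.deref() would now raise instead of reading freed storage
```

The cost is visible in the sketch: every dereference goes through one extra level of indirection, and tombstones are never reclaimed.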
Locks and keys
The locks-and-keys technique represents pointer values
as a key-address pair
Each heap-dynamic variable is represented as storage for the
data plus a cell for the lock
When a heap-dynamic variable is allocated, a lock value is
created and a copy is placed in both . . .
The lock cell within the heap-dynamic variable
The key cell of the pointer
When a heap-dynamic variable is deallocated, its lock
value is cleared
Every dereference must compare the key value in the
pointer to the lock in the heap-dynamic variable; a mismatch is a run-time error
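A Python simulation of locks and keys. The names are hypothetical, and drawing lock values from a simple counter is an assumption made for the sketch, not something the slides specify.

```python
import itertools

_lock_values = itertools.count(1)   # assumed source of fresh, nonzero lock values

class HeapVariable:
    def __init__(self, data):
        self.lock = next(_lock_values)   # lock cell stored alongside the data
        self.data = data

class KeyedPointer:
    """Pointer value represented as a (key, address) pair."""
    def __init__(self, variable):
        self.key = variable.lock         # key cell receives a copy of the lock
        self.address = variable

    def deref(self):
        if self.key != self.address.lock:
            raise RuntimeError("dangling pointer: key does not match lock")
        return self.address.data

def free_variable(variable):
    variable.lock = 0                    # cleared to a value no key can hold

v = HeapVariable("payload")
p = KeyedPointer(v)
q = KeyedPointer(v)                      # copying a pointer copies the key too
print(p.deref())
free_variable(v)                         # both p and q now fail the comparison
```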
Heap management
Takes deallocation of heap-dynamic variables out
of the hands of programmers
Two popular solutions
Reference counters
 Incremental and done when inaccessible cells are created
Garbage collection
 Occurs when available heap space runs out
Reference counters
The reference counter solution maintains a counter in
every heap cell
The counter stores the number of pointers currently pointing at
the cell
Whenever a pointer is changed . . .
The counter in the old target is decremented
The counter in the new target is incremented
When a counter decrements to zero, the heap-dynamic
variable is returned to the list of available space
Disadvantages
Space required by the reference counters
Time overhead
Complications for cells in circular linked lists
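Reference counting, including the circular-list complication, can be sketched in Python. Cell, retarget and free_list are hypothetical names; each cell has at most one outgoing pointer to keep the example small.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.count = 0      # number of pointers currently pointing at this cell
        self.next = None    # optional pointer to another cell

free_list = []              # names of cells returned to available space

def retarget(old, new):
    """Model a pointer changing from old to new; either may be None."""
    if new is not None:
        new.count += 1
    if old is not None:
        old.count -= 1
        if old.count == 0:
            free_list.append(old.name)
            retarget(old.next, None)   # the freed cell's own pointer disappears
    return new

# A two-cell chain is reclaimed as soon as the last pointer to it is dropped.
a, b = Cell("a"), Cell("b")
p = retarget(None, a)
a.next = retarget(None, b)
p = retarget(p, None)

# A two-cell cycle is not: each cell keeps the other's counter above zero.
x, y = Cell("x"), Cell("y")
x.next = retarget(None, y)
y.next = retarget(None, x)
r = retarget(None, x)
r = retarget(r, None)
print(free_list, x.count, y.count)
```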
Garbage collection
When heap storage is exhausted, perform garbage
collection as follows
Every heap cell has an extra bit used by the garbage
collection algorithm
All bits are initially cleared (assumed to be garbage)
Starting with all program pointers, recursively follow all
pointers and mark any heap-dynamic variable that can
be reached
All variables that remain unmarked are then returned to
the list of available heap cells
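The marking and reclaiming steps can be sketched in Python. The heap is modelled as a dictionary from a cell name to the names it points at, which is an assumption of this sketch rather than a real memory layout.

```python
def mark_sweep(heap, roots):
    """Return the cells to reclaim; roots are the program pointers."""
    marked = {name: False for name in heap}   # every cell starts as presumed garbage

    def mark(name):
        if marked[name]:
            return                            # already visited
        marked[name] = True
        for target in heap[name]:
            mark(target)                      # recursively follow all pointers

    for root in roots:
        mark(root)
    return sorted(name for name in heap if not marked[name])

heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": []}
print(mark_sweep(heap, roots=["a"]))
```

Unlike reference counting, the unreachable cycle between c and d is reclaimed here, because reachability from the roots is what decides.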
Garbage collection
Disadvantage
When you need it most, it works the worst
 You need it most when there is very little actual garbage left
in the heap
 The garbage collection algorithm is very time consuming in
this situation
Type checking
Type checking is the activity of ensuring that types are
compatible when considering . . .
the operands of an operator
the parameters and return type of a method
the two sides of an assignment statement
A compatible type is one that is either a legal type or one
that may be coerced to a legal type for the given situation
A coercion is an automatic type conversion that is
allowed under language rules and is implicitly performed
by compiler-generated code
A type error is the use of a non-compatible type in a given
situation
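A small Python illustration of a compatible mixed-type operation and of a type error. Python checks types dynamically, so the error appears at run time rather than at compile time.

```python
total = 1 + 2.5        # the int operand is implicitly converted; the result is a float
print(total, type(total).__name__)

try:
    "1" + 1            # no implicit conversion is defined between str and int
except TypeError as error:
    message = str(error)
    print("type error:", message)
```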
Type checking
If all type bindings to variables are static, nearly all
type checking can be static
If type bindings are dynamic, type checking must
be dynamic
A programming language is strongly typed if type
errors are always detected
This definition from the text is not the standard
definition
 Under this Smalltalk would be strongly typed
The usual definition requires that the single type of
each variable name is known at compile time
Strong typing
Advantage
Allows the detection of type errors due to misuse of variables
Language examples:
FORTRAN 77 is not (parameters, EQUIVALENCE)
Pascal is not (only because of variant records)
C and C++ are not
 Parameter type checking can be avoided
 Unions are not type checked
Ada almost is (UNCHECKED_CONVERSION is loophole)
Java and C# are similar to Ada
 They allow explicit casts
Strong typing
Coercion rules strongly (and negatively) affect
strong typing
Fortran, C, and C++ are significantly less reliable than
Ada, in which all type conversion is explicit
Java is between C++ and Ada with about half the
assignment coercions of C++
Type equivalence
When are variables declared using user-defined types
compatible?
Name type equivalence means that two variables have
equivalent types when they are declared in the same
declaration or in declarations that use the same type name
Easy to implement but highly restrictive
Ada example
type IndexType is range 1..100;
count : Integer;
index : IndexType;
Variables count and index are not compatible
They don’t use the same type name
Assignments count := index; and index := count; are illegal
Type equivalence
Structure type equivalence means that two
variables have equivalent types if their types have
identical structures
More flexible, but harder to implement
The entire structures of both types must be compared
 Are two record (structure) types equivalent if they have the
same structure but different field names?
 Are two array types equivalent if the subscript ranges are
different?
It is not possible to distinguish between types with the
same structure which represent different kinds of data
 How can you avoid mixing counts of apples and oranges if
they are both integer types?
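The two notions of equivalence can be contrasted with a toy model in Python. TypeDef and both comparison functions are hypothetical; a type is reduced to a name plus a description of its structure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TypeDef:
    name: str          # declared type name
    structure: tuple   # description of the representation

def name_equivalent(a, b):
    return a.name == b.name

def structure_equivalent(a, b):
    return a.structure == b.structure

apples = TypeDef("Apples", ("integer",))
oranges = TypeDef("Oranges", ("integer",))

print(name_equivalent(apples, oranges))        # different names
print(structure_equivalent(apples, oranges))   # identical structure
```

Under structure equivalence the two counts can be mixed freely, which is exactly the apples-and-oranges problem raised above.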
Ada examples
Ada usually requires name type equivalence but avoids
most restrictions by having derived types and subtypes
Derived types
A different type that has the same structure as a base type
Example of incompatible derived types
type Celsius is new Float;
type Fahrenheit is new Float;
Subtypes
A possibly range-constrained version of a base type
Example
subtype IndexType is Integer range 1..100;
count : Integer;
index : IndexType;
Variables count and index are now compatible
Ada examples
Ada uses structure type equivalence for “unconstrained
array” types
vec1 and vec2 are equivalent
type Vector is array( Integer range <>) of Float;
vec1 : Vector( 1..10 );
vec2 : Vector( 11.. 20 );
Care must be taken with “constrained” anonymous types
A and B are incompatible
A : array( 1..10 ) of Integer;
B : array( 1..10 ) of Integer;
A and B are still incompatible
A, B : array( 1..10 ) of Integer;
Here, A and B are equivalent

type Array_Type is array( 1..10 ) of Integer;
A, B : Array_Type;
C and C++
C uses structure type equivalence for all types
except struct, enum, and union, which use name type equivalence
The exception is when two structures or unions are defined in
different files
o Then structure type equivalence is again used
C++ uses name type equivalence
typedef in C and C++ simply creates an alias for a
type