
Cse321, Programming Languages and Compilers

Lecture #1, Jan. 9, 2007


• Course Mechanics
• Text Book
• Downloading SML
• Syllabus - Course Overview
• Entrance Exam
• Standard ML
• This week's assignment
• Top to bottom example
• Lexical issues
• Parsing and syntax issues
• Translation issues

08/21/09 1

Acknowledgements
The material taught in this course
was made possible by many people.
Here is a partial list:
• Andrew Tolmach
• Nathan Linger
• Harry Porter
• Jinke Lee


Class Web Page


• The CS321 class web page can be
found at:
– www.cs.pdx.edu/~sheard/course/Cs321

• Contents of the page


– Course Syllabus
– Link to the ML home page
– Copies of the PowerPoint slides used in lectures
– Copies of the assignments
– Project Description
– Copies of the SML code illustrated in the lectures

• The web page will be updated after
each lecture.

Today’s Assignments
Reading
• Engineering a Compiler
– Available in the PSU bookstore
– Chapter 1, pp 1-26
– There will be a 5-minute quiz on the reading Wednesday.

Search
• Find the class webpage

1-page programming assignment
• Due Wednesday, Jan 10, 2007. In just 2 days!!
• Log in to some SML system and see how the system
operates. Type solutions to the programming problems
(In Class Exercises 1 and 2 in this handout) into a
file and load them into SML. Get them running, print
them out, and turn them in on Wednesday. What matters
here is that you try out the SML system, not that you
get them perfect.

Course Information
• CS321 - Languages and Compiler Design
– Time: Monday & Wednesday 18:00-19:50
– Place: PCAT 138
– Instructor: Tim Sheard
– office: room 115, CS Dept, 4th Ave Building, Portland State Univ.
– phone: 503-725-2410 (work) 503-649-7242 (home)
– office hours: Before class in my office (5:00-5:50), or by Appt.
• Assignments
– Reading from text and handouts (quizzes on reading)
– Daily, 1 page programming assignments
– 3 part programming project
• Grading:
– midterm exam (25%)
– 3 parts of project (30%)
– Daily 1 page assignments and quizzes (15%)
– Final exam (30%)


Examinations
• Entrance Exam.
– Do you know your REs and CFGs?

• Quizzes on Reading Material.


– There is a possible quiz on every reading assignment
– There will be a quiz on Wednesday!

• Mid Term exam


– Wed. Feb 14, 2007. Time: in class.

• Final exam
– Monday, Mar. 19, 2007. Time: 6:00-7:50.


Text Book
• Text: Engineering a Compiler
– Keith D. Cooper, and Linda Torczon
• Other Reference Materials
– Auxiliary Material
» Elements of Functional Programming (SML book)
by Chris Reade, Addison Wesley, ISBN 0-201-12915-9
» Using the SML/NJ System
http://www.cs.cmu.edu/~petel/smlguide/smlnj.htm

• Class Handouts
– Each class, a copy of that day’s slides will be available as a
handout.
– I will post files that contain the example programs used in each
lecture on the class web page
www.cs.pdx.edu/~sheard/course/Cs321
– I will post Assignments there as well.


Labs
• Whenever you learn a new language it’s
great to have someone looking over your
shoulder.
• In this spirit I have scheduled some lab
times where people can work on learning
ML while I am there to help.
– FAB INTEL Lab (FAB 55-17) downstairs by the Engineering and
Technology Management’s departmental offices
– Friday Jan. 12, 2007. 4:00 – 5:30 PM
– Tuesday Jan. 16, 2007. 4:00 – 5:30 PM
– Friday Jan. 19, 2007. 4:00 – 5:30 PM
• Labs are not required, but attendance of at
least one is highly recommended!


Installing SML
• Software can be obtained at:
– http://www.smlnj.org/
• I am using the most recent version 110.60
– but it displays the version 110.57 when it runs
• Browse the “documentation and Literature” section
of the SML web page. Find some resources that you
can use.

• SML also runs on the PSU Linux and Intel labs
– Linux
» usepkg sml
» then logout, or start a new shell
» type: sml
– Intel
» In a command window
» p:\programs\smlnj\addpkg.cmd
» then logout, or start a new command window
» then just type:
» N:\>sml


Entrance Exam
• CS321 has some pretty serious
prerequisites.

1. Write a regular expression for the set of
strings that begins with an “a” which is
followed by an arbitrary number of “b”s or
“c”s, and is ended by a “d”.
e.g. ad, abbbd, abcbcbcd, etc.
2. Transform your regular expression into a
DFA.
3. Write a context free grammar that
generates the same set of strings as your
RE.
4. Transform your CFG into a CFG that is
left-recursion free.

Academic Integrity
Students are expected to be honest in their
academic dealings. Dishonesty is dealt with
severely.

• Homework. Pass in only your own work.


• Program assignments. Program independently.
• Examinations. Notes and such, only as each
instructor allows.

OK to discuss how to solve
problems with other students,
but each student should
write up, debug, and turn in his
own solution.

Course Thesis
• This course is about programming
languages. We study languages in two
ways.
– From the perspective of the user
– From the perspective of the implementer (compiler writer)
• We will learn about some languages you
may never have heard of. We will learn to
program in one of them (Standard ML). It’s
good to learn a new language in depth.
• This course is also about programming.
There will be extensive programming
assignments in SML. If you don’t do them,
you won’t learn.
– You’re deluding yourself if you think you can learn the material
without doing the exercises!
• We will write a compiler for a Java subset.
It’s good to understand the implementation
details of a language you already know.

This course is all about programming


• What makes a good program?
• Write at least 3 things on a piece of paper.


Standard ML
• In this course we will use an
implementation of the language Standard
ML

• The SML/NJ Homepage has lots of useful
information: http://www.smlnj.org/

• You can get a version to install on your own
machine there.

I will use version 110.57 or 110.60 of SML. Earlier
versions probably will work as well. I don’t foresee any
problems with other versions, but if you want to use the
identical version that I use in class then this is the one.


Characteristics of SML
• Applicative style
– input output description of problem.
• First class functions
– pass as parameters
– return as value of a function
– store in data-structures
• Less Importantly:
– Automatic memory management (G.C. no new or malloc)
– Use of a strong type system which uses type inference, i.e. no
declarations but still strongly typed.
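
The characteristics listed above can be seen in a few lines of SML. This is only a small sketch; the names twice, adder, steps and total are made up for illustration. Note that no types are declared anywhere, yet every definition is fully typed by inference.

```sml
(* Pass a function as a parameter. *)
fun twice f x = f (f x)

(* Return a function as the value of a function. *)
fun adder n = fn x => x + n

(* Store functions in a data structure (a list of int -> int functions). *)
val steps = [adder 1, adder 10, twice (adder 100)]

(* Apply every stored function in turn, starting from 0. *)
val total = foldl (fn (f, acc) => f acc) 0 steps
```

Typing these into SML shows the inferred types, for example twice is reported as ('a -> 'a) -> 'a -> 'a.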


Syntactic Elements

• Identifiers start with a letter followed by
digits or other letters or primes or
underscores.
– Valid Examples: a a3 a'b aF
– Invalid Examples: 12A
• Identifiers can also be constructed with a
sequence of operator characters like: !@#$%^&*+~

• Reserved words include
– fun val datatype if then else
– case of let in end type
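
A short sketch of both kinds of identifier. The operator name +++ and its meaning are hypothetical, chosen only to show that a sequence of operator characters is a legal name.

```sml
(* Alphanumeric identifiers: a letter, then letters, digits, primes, underscores. *)
val a = 1
val a3 = 2
val a'b = 3
val aF = 4

(* A symbolic identifier built from operator characters.
   The infix declaration lets it be written between its arguments. *)
infix 6 +++
fun x +++ y = x + y + 1

val sum = a +++ a3 +++ a'b +++ aF
```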


Interacting
• The normal style for interaction is to start
SML, and then type definitions into the
window.
• Types of commands
– 4 + 5;
– val x = 34;
– fun f x = x + 1;
• Here are two commands you might find
useful.

val pwd = OS.FileSys.getDir;


val cd = OS.FileSys.chDir;

• To load a file that has an SML program type

use "file.sml";

The SML Read-Typecheck-Eval-Print Loop


Standard ML of New Jersey v110.57 [built: Mon Nov 21 21:46:28 2005]
-
- 3+5;
val it = 8 : int
-
- print "Hi there\n";
Hi there
val it = () : unit
-
- val x = 22;
val x = 22 : int
-
- x+ 5;
val it = 27 : int
-
- val pwd = OS.FileSys.getDir;
val pwd = fn : unit -> string

- val cd = OS.FileSys.chDir;
val cd = fn : string -> unit
-

Note the semicolon when you’re ready to evaluate.
Otherwise commands can spread across several lines.
-


In Class Exercise 1
• Define prefix and lastone in terms of head, tail, and
reverse.
• First make a file “S01code.sml”
• Start sml
• Change directory to where the file resides
• Load the file ( use “S01code.html” )
• Test the function

fun lastone x = hd (rev x)
fun prefix x = rev (tl (rev x))

Standard ML of New Jersey v110.57 - K;
- val cd = OS.FileSys.chDir;
val cd = fn : string -> unit
- cd "D:/work/sheard/courses/PsuCs321/web/notes";
- use "S01code.html";
[opening S01code.html]
val lastone = fn : 'a list -> 'a
val prefix = fn : 'a list -> 'a list
val it = () : unit
- lastone [1,2,3,4];
val it = 4 : int
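
A few more tests you might try once the file loads. This is a sketch; the names last, front and emptyCase are made up for illustration. The last line checks what happens on the empty list: hd and tl raise the exception Empty, so lastone and prefix do too.

```sml
fun lastone x = hd (rev x)
fun prefix x = rev (tl (rev x))

val last = lastone [1,2,3,4]
val front = prefix [1,2,3,4]

(* hd raises Empty on [], so lastone [] raises Empty as well. *)
val emptyCase =
  (lastone ([] : int list); "no exception") handle Empty => "Empty raised"
```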


In Class Exercise 2
• define map and filter functions
– mymap f [1,2,3] = [f 1, f 2, f 3]
– filter even [1,2,3,4,5] = [2,4]

fun mymap f [] = []
| mymap f (x::xs) = (f x)::(mymap f xs);

fun filter p [] = []
| filter p (x::xs) =
if (p x) then x::(filter p xs) else (filter p xs);
• Sample Session

- mymap plusone [2,3,4]


[3, 4, 5]
- filter even [1,2,3,4,5,6]
[2, 4, 6]
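
The sample session uses plusone and even without defining them. The sketch below assumes the obvious definitions for those two helpers; with them the whole exercise runs as one file.

```sml
fun mymap f [] = []
  | mymap f (x::xs) = (f x)::(mymap f xs)

fun filter p [] = []
  | filter p (x::xs) =
      if (p x) then x::(filter p xs) else (filter p xs)

(* Assumed helpers: the slide names them but does not define them. *)
fun plusone x = x + 1
fun even x = x mod 2 = 0

val mapped = mymap plusone [2,3,4]
val evens = filter even [1,2,3,4,5,6]
```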

Course topics
• Programming Language
– Types of languages
– Data types and languages
– Types and languages
• Compilers
– Lexical analysis
– Parsing
– Translation to abstract syntax using modern parser generator
technology.
– Type checking
– Identifiers and symbol table organization

• Next Quarter in the second class of the
sequence
– Intermediate representations
– Backend analysis
– Transformations and optimizations for a number of different kinds
of languages

Multi Pass Compilers


• Passes
– text
– tokens
– syntax trees
– intermediate forms
» (three address code, CPS code, etc)
– assembly code
– machine code

• Each pass goes from one form to another, OR
from one form to the same form, which is
often called a source-to-source
transformation.


The Top to Bottom Example

text: z = x + pi * 12.0

tokens:
id(z) eql id(x) plus id(pi) times float(12.0)

syntax tree:

        =
       / \
  Id(z)   +
         / \
    Id(x)   *
           / \
     Id(pi)   float(12.0)
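
One way the token stream and the syntax tree above might be written down as SML data. This is a sketch only; the datatype and constructor names are assumptions, not the ones used in the course code.

```sml
(* Tokens, named after the slide's token stream. *)
datatype token = Id of string | Eql | Plus | Times | Float of real

(* Abstract syntax: an assignment whose right-hand side is an expression tree. *)
datatype exp = Var of string | Num of real | Add of exp * exp | Mul of exp * exp
datatype stmt = Assign of string * exp

val tokens = [Id "z", Eql, Id "x", Plus, Id "pi", Times, Float 12.0]

(* Multiplication binds tighter than addition, so the product sits below the sum. *)
val tree = Assign ("z", Add (Var "x", Mul (Var "pi", Num 12.0)))

(* Collect identifier names by walking the tree left to right. *)
fun vars (Var v) = [v]
  | vars (Num _) = []
  | vars (Add (a, b)) = vars a @ vars b
  | vars (Mul (a, b)) = vars a @ vars b

fun names (Assign (target, e)) = target :: vars e
```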

Passes (cont)
Three address code:
temp1 := pi * 12.0
z := x + temp1

Assembly level code:

ld r1,pi
ldi r2,12.0
mul r1,r2
ld r2,x
add r1,r2
st r1,z
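
A minimal sketch of how three address code could be produced from a syntax tree. The datatype and the function names operand and assign are assumptions made for this illustration; constants are kept as strings so they print exactly as written.

```sml
datatype exp = Var of string | Num of string | Bin of string * exp * exp

(* operand e n: reduce e to a simple name or constant.
   Returns (operand, instructions emitted, next unused temp number). *)
fun operand (Var v) n = (v, [], n)
  | operand (Num c) n = (c, [], n)
  | operand (Bin (oper, a, b)) n =
      let val (ra, ia, n1) = operand a n
          val (rb, ib, n2) = operand b n1
          val t = "temp" ^ Int.toString n2
      in (t, ia @ ib @ [t ^ " := " ^ ra ^ " " ^ oper ^ " " ^ rb], n2 + 1) end

(* assign: the outermost operation stores directly into the target. *)
fun assign (target, Bin (oper, a, b)) =
      let val (ra, ia, n1) = operand a 1
          val (rb, ib, _) = operand b n1
      in ia @ ib @ [target ^ " := " ^ ra ^ " " ^ oper ^ " " ^ rb] end
  | assign (target, e) =
      let val (r, instrs, _) = operand e 1
      in instrs @ [target ^ " := " ^ r] end

val code = assign ("z", Bin ("+", Var "x", Bin ("*", Var "pi", Num "12.0")))
```

Each nested operation gets its own temporary, and only the last operation writes to the assignment target.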


Lexical Analysis
• Produces Tokens and Deals with:
» white space
» comments
» reserved word identification
» symbol table interface

• Tokens are the terminals of grammars.

• Lexical analysis reads the whole program,
character by character, so it needs to be
efficient. This implies fancy buffering
techniques etc. Modern lexer generators
handle these problems, so we will ignore
them.


Tokens, Patterns & Lexemes


• Many strings from the input may produce
the same TOKEN, e.g. identifiers, integer
constants, floats.

• A PATTERN is a rule which describes
which strings are assigned to a token.

• A LEXEME is the exact sequence of input
characters matched by a PATTERN.


Examples
• lexeme    pattern            token
– x         <alpha><alpha>*    Id "x"
– abc       <alpha><alpha>*    Id "abc"
– 152       <digit>+           Constant(152)
– then      then               ThenKeyword

• Many lexemes map to the same token. e.g.
“x” and “abc”.
• Note, some lexemes might match many
patterns. e.g. "then" above. Need to resolve
ambiguity.
• Since tokens are terminals, they must be
"produced" by the lexical phase with
synthesized attributes in place. (e.g. name
of an identifier). e.g. id(“x”) and
constant(152)
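
A hand-written sketch of a lexer for the four example lexemes. It is not how a lexer generator works internally; the helper names span, classify, lex and tokens are assumptions. It does show the two points above: tokens carry their attributes, and "then" is checked against the reserved words before it is treated as an identifier.

```sml
datatype token = Id of string | Constant of int | ThenKeyword

(* Split a list into its longest prefix satisfying p, and the rest. *)
fun span p [] = ([], [])
  | span p (x :: xs) =
      if p x then let val (ys, zs) = span p xs in (x :: ys, zs) end
      else ([], x :: xs)

(* Reserved words match the identifier pattern too; check them first
   to resolve the ambiguity. *)
fun classify "then" = ThenKeyword
  | classify s = Id s

fun lex [] = []
  | lex (c :: cs) =
      if Char.isSpace c then lex cs
      else if Char.isAlpha c then
        let val (word, rest) = span Char.isAlpha (c :: cs)
        in classify (String.implode word) :: lex rest end
      else if Char.isDigit c then
        let val (digits, rest) = span Char.isDigit (c :: cs)
        in Constant (valOf (Int.fromString (String.implode digits))) :: lex rest end
      else raise Fail ("unexpected character " ^ String.str c)

fun tokens s = lex (String.explode s)
```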

Syntax, Parse Trees & Grammars


• Syntax (the physical layout of the program)
– Grammars describe precisely the syntax of a language. Two kinds
of grammars which compiler writers use a lot are: regular, and
context free

• Informal Definitions of:
– Regular: concatenation, union, star
– Context Free: only one symbol on the lhs of
a production
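
The three regular operations can be written as an SML datatype with a small backtracking matcher. This is a sketch under assumed names (re, m, matches); it is not an efficient DFA-based matcher, only an illustration of what concatenation, union and star mean.

```sml
datatype re = Chr of char          (* a single character *)
            | Cat of re * re       (* concatenation *)
            | Alt of re * re       (* union *)
            | Star of re           (* zero or more repetitions *)

(* m r s k: match a prefix of s against r, then hand the rest to k. *)
fun m (Chr c) (x :: xs) k = c = x andalso k xs
  | m (Chr _) [] k = false
  | m (Cat (a, b)) s k = m a s (fn rest => m b rest k)
  | m (Alt (a, b)) s k = m a s k orelse m b s k
  | m (Star a) s k =
      k s orelse
      (* require progress so that Star cannot loop forever *)
      m a s (fn rest => rest <> s andalso m (Star a) rest k)

fun matches r str = m r (String.explode str) null

(* An "a", then any number of "b"s or "c"s, then a "d". *)
val example = Cat (Chr #"a", Cat (Star (Alt (Chr #"b", Chr #"c")), Chr #"d"))
```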


Example Grammar
Sentence ::= Subject Verb Object
Subject ::= Proper-noun
Object ::= Article Adjective Noun
Verb ::= ate | saw | called
Noun ::= cat | ball | dish
Article ::= the | a
Adjective ::= big | bad | pretty
Proper-noun ::= tim | mary

Start Symbol = Sentence

Example sentence: tim ate the big ball
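
Because this grammar has no recursion, every sentence has exactly five words, and a recognizer can be sketched directly. The function names oneOf, sentence and accepts are assumptions made for this illustration.

```sml
fun oneOf words w = List.exists (fn x => x = w) words

val properNoun = oneOf ["tim", "mary"]
val verb = oneOf ["ate", "saw", "called"]
val article = oneOf ["the", "a"]
val adjective = oneOf ["big", "bad", "pretty"]
val noun = oneOf ["cat", "ball", "dish"]

(* Sentence ::= Subject Verb Object, with Subject and Object expanded. *)
fun sentence [s, v, art, adj, n] =
      properNoun s andalso verb v andalso article art
      andalso adjective adj andalso noun n
  | sentence _ = false

fun accepts text = sentence (String.tokens Char.isSpace text)
```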


Recursive Grammar Examples

Recursive Grammars describe infinite languages

list ::= [ num morenum ]
morenum ::= , num morenum
          | <empty>

derives [2], [2,4], [2,4,6] ...
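
The list grammar can be turned into a recognizer with one SML function per nonterminal. This is a sketch; the token datatype and the decision to return the numbers as an int list option are assumptions.

```sml
datatype tok = LBrack | RBrack | Comma | Num of int

(* morenum ::= , num morenum | <empty>
   Returns the numbers found and the tokens left over. *)
fun morenum (Comma :: Num n :: rest) =
      let val (ns, rest') = morenum rest in (n :: ns, rest') end
  | morenum rest = ([], rest)

(* list ::= [ num morenum ] *)
fun list (LBrack :: Num n :: rest) =
      (case morenum rest of
           (ns, [RBrack]) => SOME (n :: ns)
         | _ => NONE)
  | list _ = NONE
```

The recursion in morenum mirrors the recursion in the grammar, which is what lets a finite program accept an infinite language.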

Exp ::= id
      | Exp + Exp
      | Exp * Exp
      | ( Exp )

derives x, x+x, ...

Parse Trees
• Each nonterminal on the lhs of a
production "roots" a tree:

        Exp
      /  |  \
   Exp   +   Exp
    |         |
    Id        Id

• Each node in a tree with all its immediate
children is derived from a single
production of the grammar.

• We desire a program which constructs a
parse tree from a string. Such programs
are different for every grammar, so we
sometimes use tools (e.g. yacc) to
construct them.

Syntax Directed Translations


• A syntax directed translation traverses a
syntax tree and builds a translation in the
process.

Considerations
• Tree Traversal orders
» Left to right?
» right to left?
» in-order, pre-order, or post-order

• Where does the information about what to
do in the traversal come from?
» Attribute grammars
• Inherited attributes
• Synthesized attributes

Example Translation Process


Translation as an abstract syntax to abstract
syntax transformer
We represent this as a grammar with “actions”
{ ... }. The action is performed when that
production is reduced.

Exp ::= Term terms
terms ::= + Term { print "+" } terms
        | <empty>
Term ::= Factor factors
factors ::= * Factor { print "*" } factors
          | <empty>
Factor ::= id { print id.name }
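
The grammar with actions above translates infix expressions to postfix. A sketch of the same translation as SML functions, one per nonterminal, follows. It assumes a hypothetical token datatype, and instead of calling print it collects the output in a list (in the same order the actions would fire) so the result can be inspected.

```sml
datatype tok = Id of string | Plus | Times

(* Each function consumes a prefix of the token list and returns
   (output produced, remaining tokens). *)
fun factor (Id name :: rest) = ([name], rest)
  | factor _ = raise Fail "expected id"

fun factors (Times :: rest) =
      let val (f, rest1) = factor rest
          val (fs, rest2) = factors rest1
      in (f @ ["*"] @ fs, rest2) end
  | factors rest = ([], rest)

fun term toks =
  let val (f, rest1) = factor toks
      val (fs, rest2) = factors rest1
  in (f @ fs, rest2) end

fun terms (Plus :: rest) =
      let val (t, rest1) = term rest
          val (ts, rest2) = terms rest1
      in (t @ ["+"] @ ts, rest2) end
  | terms rest = ([], rest)

fun expr toks =
  let val (t, rest1) = term toks
      val (ts, rest2) = terms rest1
  in (t @ ts, rest2) end

fun toPostfix toks =
  case expr toks of
      (out, []) => String.concatWith " " out
    | _ => raise Fail "unexpected trailing tokens"
```

For example, the tokens for x + y * z come out as x y z * +, because the action for "*" fires before the action for "+".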

Semantics
• How do we know what to translate the
syntax tree into?
• How do we know if it is correct?
• Semantics
» denotational semantics
» operational semantics
» interpreters

• Very useful in writing compilers since they
give a reference when trying to decide what
the compiler should do in particular cases.
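
As a small illustration of an interpreter serving as a reference, here is a sketch of an evaluator for arithmetic expressions. The datatype, the association-list environment and the names lookup and eval are assumptions made for this example. Whatever code a compiler generates for an expression should compute the same value that eval returns.

```sml
datatype exp = Id of string | Const of int | Add of exp * exp | Mul of exp * exp

(* An environment maps identifier names to values. *)
fun lookup name [] = raise Fail ("unbound identifier " ^ name)
  | lookup name ((n, v) :: rest) = if n = name then v else lookup name rest

(* The meaning of an expression is the integer it evaluates to. *)
fun eval env (Id x) = lookup x env
  | eval env (Const c) = c
  | eval env (Add (a, b)) = eval env a + eval env b
  | eval env (Mul (a, b)) = eval env a * eval env b

val result = eval [("x", 2), ("y", 3)] (Add (Id "x", Mul (Id "y", Const 4)))
```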


Overview
• Compilation is a large process
• It is often broken into stages
• The theories of computer science guide us
in writing programs at each stage.
• We must understand what a program
“means” if we are to translate it correctly.
• Many phases of the compiler try to
optimize by translating one form into a
better (more efficient?) form.
• Most of compiling is about “pattern
matching”. Languages and tools that support
pattern matching are very useful.
