
Compiler Design

Yacc Example
"Yet Another Compiler Compiler"
Kanat Bolazar

Lex and Yacc


Two classical tools for compilers:
Lex: A Lexical Analyzer Generator
Yacc: Yet Another Compiler Compiler (Parser Generator)

Lex generates a scanner that reads the input and returns tokens one by one. Yacc takes a grammar (sentence structure) and generates a parser.

Lexical Rules -> Lex -> yylex()
Grammar Rules -> Yacc -> yyparse()
Input -> yylex() -> yyparse() -> Parsed Input

Lex and Yacc


Lex and Yacc generate C code for your analyzer and parser.

Lexical Rules -> Lex -> C code: yylex(), the lexical analyzer (tokenizer)
Grammar Rules -> Yacc -> C code: yyparse(), the parser

Input (char stream) -> yylex() -> token stream -> yyparse() -> Parsed Input

Flex, Yacc, Bison, Byacc


Often, instead of the standard Lex and Yacc, Flex and Bison are used:
Flex: A fast lexical analyzer generator
Bison: The GNU parser generator, a drop-in replacement for (backwards compatible with) Yacc

Byacc is the Berkeley implementation of Yacc (so it is Yacc).

Resources:
http://en.wikipedia.org/wiki/Flex_lexical_analyser
http://en.wikipedia.org/wiki/GNU_Bison
The Lex & Yacc Page (manuals, links): http://dinosaur.compilertools.net/

Yacc: A Standard Parser Generator


Yacc is not a new tool, yet it is still used in many projects.
Yacc syntax is similar to Lex/Flex at the top level.
Lex/Flex rules are (regular expression, action) pairs.
Yacc rules are (grammar rule, action) pairs.

The top-level layout of a Yacc file:

    declarations
    %%
    rules
    %%
    programs

Yacc Examples: Calculator



A standard Yacc example is the int-valued calculator. Appendix A of the Yacc manual at the Lex and Yacc Page shows such a calculator. We'll examine this example in parts. Let's start with four operations:

    E -> E + E
       | E - E
       | E * E
       | E / E

Note that this grammar is ambiguous: 2 + 5 * 7 could be parsed by grouping 2 + 5 first or by grouping 5 * 7 first.
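To make the ambiguity concrete, here is a minimal C sketch that builds both parse trees for 2 + 5 * 7 by hand and evaluates them. The Node type and the function names are made up for this illustration; they are not part of Yacc.

```c
#include <assert.h>
#include <stddef.h>

/* A tiny expression tree: a leaf holds a number, an inner node an operator. */
typedef struct Node {
    char op;                    /* '+', '*', or 0 for a leaf */
    int value;                  /* used only when op == 0 */
    const struct Node *left;
    const struct Node *right;
} Node;

/* Evaluate a tree bottom-up, the way semantic actions would. */
int eval_tree(const Node *n)
{
    assert(n != NULL);
    if (n->op == 0)
        return n->value;
    int l = eval_tree(n->left);
    int r = eval_tree(n->right);
    return n->op == '+' ? l + r : l * r;
}

/* Parse 1: group 2 + 5 first, then multiply by 7. */
int value_add_first(void)
{
    Node two = {0, 2, NULL, NULL};
    Node five = {0, 5, NULL, NULL};
    Node seven = {0, 7, NULL, NULL};
    Node sum = {'+', 0, &two, &five};
    Node product = {'*', 0, &sum, &seven};
    return eval_tree(&product);
}

/* Parse 2: group 5 * 7 first, then add 2. */
int value_mul_first(void)
{
    Node two = {0, 2, NULL, NULL};
    Node five = {0, 5, NULL, NULL};
    Node seven = {0, 7, NULL, NULL};
    Node product = {'*', 0, &five, &seven};
    Node sum = {'+', 0, &two, &product};
    return eval_tree(&sum);
}
```

The two trees give different values (49 and 37), so the parser needs extra information to choose between them.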

Yacc Calculator Example: Declarations


    %{
    # include <stdio.h>
    # include <ctype.h>
    int regs[26];
    int base;
    %}

    %start list

    %token DIGIT LETTER

    %left '+' '-'
    %left '*' '/' '%'
    %left UMINUS      /* precedence for unary minus */

The code between %{ and %} is directly included C code.
list is our start symbol: a list of one-line statements / expressions.
DIGIT and LETTER are tokens (other tokens use ASCII codes, as in '+', '=', etc.).
The %left lines give the precedence and associativity (left) of operators: +, - have the lowest precedence; *, / have higher precedence.
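As a hedged illustration of what the %left lines mean, the sketch below evaluates expressions with a small hand-written precedence-climbing routine. This is not what Yacc generates (Yacc builds a table-driven parser and uses the declarations to resolve conflicts). The function names and the numeric precedence levels are assumptions chosen to mirror the order of the declarations.

```c
#include <assert.h>
#include <ctype.h>

/* Binding strength per binary operator, mirroring the %left lines:
   later lines bind tighter. 0 means "not a binary operator". */
static int precedence(int op)
{
    switch (op) {
    case '+': case '-':           return 1;
    case '*': case '/': case '%': return 2;
    default:                      return 0;
    }
}

#define UMINUS_PREC 3   /* unary minus binds tighter than any binary operator */

static const char *cursor;   /* next unread character of the input */

static int peek(void)
{
    while (*cursor == ' ')
        cursor++;
    return (unsigned char)*cursor;
}

static int parse_expr(int min_prec);

static int parse_operand(void)
{
    int c = peek();
    if (c == '-') {                       /* unary minus */
        cursor++;
        return -parse_expr(UMINUS_PREC);
    }
    if (c == '(') {                       /* parenthesized expression */
        cursor++;
        int inner = parse_expr(1);
        if (peek() == ')')
            cursor++;
        return inner;
    }
    assert(isdigit(c));
    int v = 0;
    while (isdigit((unsigned char)*cursor))
        v = 10 * v + (*cursor++ - '0');
    return v;
}

static int parse_expr(int min_prec)
{
    int left = parse_operand();
    for (;;) {
        int op = peek();
        int prec = precedence(op);
        if (prec == 0 || prec < min_prec)
            return left;
        cursor++;
        /* Left associativity: the right operand may only contain
           operators that bind strictly tighter than this one. */
        int right = parse_expr(prec + 1);
        switch (op) {
        case '+': left += right; break;
        case '-': left -= right; break;
        case '*': left *= right; break;
        case '/': assert(right != 0); left /= right; break;
        case '%': assert(right != 0); left %= right; break;
        }
    }
}

/* Evaluate a whole expression such as "2 + 5 * 7". */
int evaluate(const char *text)
{
    cursor = text;
    return parse_expr(1);
}
```

With these levels, evaluate("2 + 5 * 7") groups 5 * 7 first, and evaluate("8 - 3 - 2") groups 8 - 3 first because the operators are left associative.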

Yacc Calculator Example: Rules


    %%    /* begin rules section */

list: a list of one-line statements / expressions. Error handling allows a statement to be corrupt, but list continues with the next statement.

    list   : /* empty */
           | list stat '\n'
           | list error '\n'    { yyerrok; }
           ;

stat (statement): an expression to calculate, or an assignment.

    stat   : expr               { printf( "%d\n", $1 ); }
           | LETTER '=' expr    { regs[$1] = $3; }
           ;

number: made up of digits (the tokenizer should handle this, but this is a simple example).

    number : DIGIT              { $$ = $1; base = ($1==0) ? 8 : 10; }
           | number DIGIT       { $$ = base * $1 + $2; }
           ;
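The arithmetic in the number rule can be sketched in plain C. The function name fold_number and the array-of-digits interface are assumptions for this illustration; in the Yacc version the digits arrive one token at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Fold a sequence of digit values (each 0..9) into one integer,
   following the two alternatives of the number rule. */
int fold_number(const int *digits, size_t count)
{
    assert(digits != NULL && count > 0);
    assert(digits[0] >= 0 && digits[0] <= 9);

    /* number : DIGIT
       The first digit becomes the value and chooses the base:
       a leading 0 means octal, anything else means decimal. */
    int base = (digits[0] == 0) ? 8 : 10;
    int value = digits[0];

    /* number : number DIGIT
       Each further digit shifts the value by one place and adds. */
    for (size_t i = 1; i < count; i++) {
        assert(digits[i] >= 0 && digits[i] <= 9);
        value = base * value + digits[i];
    }
    return value;
}
```

So the digits 1 2 3 give 123, while 0 1 7 is read as octal and gives 15. Like the grammar, this sketch does not reject the digits 8 and 9 in an octal number.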

Yacc Calculator Example: Rules, cont'd


    expr : '(' expr ')'             { $$ = $2; }
         | expr '+' expr            { $$ = $1 + $3; }
         | expr '-' expr            { $$ = $1 - $3; }
         | expr '*' expr            { $$ = $1 * $3; }
         | expr '/' expr            { $$ = $1 / $3; }
         | '-' expr %prec UMINUS    { $$ = - $2; }        /* unary minus */
         | LETTER                   { $$ = regs[$1]; }    /* letter: register/var */
         | number
         ;
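The $$, $1, $3 notation in the actions refers to values kept on a stack. Below is a deliberately simplified C model of one reduction by a rule such as expr '+' expr: the three topmost values are $1, $2, $3, and the result $$ replaces them. All names here are hypothetical, and storing the operator character in the $2 slot is an assumption of this sketch; a real Yacc-generated parser also keeps a state stack and selects the rule from its tables.

```c
#include <assert.h>

#define STACK_MAX 64

static int values[STACK_MAX];   /* the value stack */
static int depth = 0;           /* number of values currently on it */

void push_value(int v)
{
    assert(depth < STACK_MAX);
    values[depth++] = v;
}

int stack_depth(void)
{
    return depth;
}

/* Reduce by  expr : expr OP expr  for OP in + - * /. */
int reduce_binary(void)
{
    assert(depth >= 3);
    int d1 = values[depth - 3];   /* $1: left operand */
    int op = values[depth - 2];   /* $2: operator (sketch assumption) */
    int d3 = values[depth - 1];   /* $3: right operand */
    int result = 0;               /* $$ */

    switch (op) {
    case '+': result = d1 + d3; break;
    case '-': result = d1 - d3; break;
    case '*': result = d1 * d3; break;
    case '/': assert(d3 != 0); result = d1 / d3; break;
    default:  assert(0 && "unknown operator");
    }

    depth -= 3;                   /* pop the right-hand side */
    push_value(result);           /* push $$ in its place */
    return result;
}
```

Pushing 2, '+', 3 and reducing leaves a single value, 5, on the stack, ready to act as $1 of a later reduction.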

Yacc Calculator Example: Programs (C Code)


    %%    /* start of programs */

    int yylex(void) {    /* lexical analysis routine */
        /* returns LETTER for a lower case letter, yylval = 0 through 25 */
        /* returns DIGIT for a digit, yylval = 0 through 9 */
        /* all other characters are returned immediately */

        int c;

        while( (c=getchar()) == ' ' ) { /* skip blanks */ }

        /* c is now nonblank */
        if( islower( c ) ) {
            yylval = c - 'a';
            return( LETTER );
        }
        if( isdigit( c ) ) {
            yylval = c - '0';
            return( DIGIT );
        }
        return( c );
    }
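To try the tokenizer logic without Yacc, here is a standalone C sketch that reads from a string instead of standard input. The names set_input, next_token and last_token_value are hypothetical, token_value stands in for yylval, and the numeric codes for DIGIT and LETTER are assumed values above the character range (the real codes are assigned by Yacc).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumed token codes; Yacc would generate the real ones. */
enum { DIGIT = 257, LETTER = 258 };

static int token_value;          /* stands in for yylval */
static const char *input = "";   /* next unread character */

void set_input(const char *text)
{
    assert(text != NULL);
    input = text;
}

int last_token_value(void)
{
    return token_value;
}

/* Same decisions as yylex() above, but on a string:
   returns 0 at the end of the input. */
int next_token(void)
{
    int c;

    while ((c = (unsigned char)*input) == ' ')
        input++;                 /* skip blanks */

    if (c == '\0')
        return 0;                /* end of input */
    input++;

    if (islower(c)) {
        token_value = c - 'a';   /* register index 0 through 25 */
        return LETTER;
    }
    if (isdigit(c)) {
        token_value = c - '0';   /* digit value 0 through 9 */
        return DIGIT;
    }
    return c;                    /* any other character is its own token */
}
```

For the input "b = 12\n" this yields LETTER (value 1), '=', DIGIT (value 1), DIGIT (value 2), '\n', and then 0. Note that 12 arrives as two DIGIT tokens; the number rule in the grammar combines them.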
