
Abstract Interpretation

Daniel berg 20110809


In this report we look at abstract interpretation and how to use it to build an analysis tool that discovers potential division-by-zero operations, which lets us suggest where to put try-catch statements. We combine this with sign analysis to find unnecessary if statements, infinite loops, and loops that will never run. If statements are also checked to see whether a particular branch will always be taken.

Operational Semantics for AM


We start by showing the operational semantics for our abstract machine (AM) language. We do this to clearly contrast normal execution with execution on abstract properties (abstract interpretation). The semantics are based on those in Semantics with Applications by H.R. Nielson and F. Nielson.

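\[
\begin{array}{lcll}
\langle \text{PUSH-}n : c,\; e,\; s \rangle & \triangleright & \langle c,\; n : e,\; s \rangle & \\
\langle \text{ADD} : c,\; z_1 : z_2 : e,\; s \rangle & \triangleright & \langle c,\; (z_1 + z_2) : e,\; s \rangle & \text{if } z_1, z_2 \in \mathbb{Z} \\
\langle \text{MULT} : c,\; z_1 : z_2 : e,\; s \rangle & \triangleright & \langle c,\; (z_1 \star z_2) : e,\; s \rangle & \text{if } z_1, z_2 \in \mathbb{Z} \\
\langle \text{SUB} : c,\; z_1 : z_2 : e,\; s \rangle & \triangleright & \langle c,\; (z_1 - z_2) : e,\; s \rangle & \text{if } z_1, z_2 \in \mathbb{Z} \\
\langle \text{TRUE} : c,\; e,\; s \rangle & \triangleright & \langle c,\; \mathbf{tt} : e,\; s \rangle & \\
\langle \text{FALSE} : c,\; e,\; s \rangle & \triangleright & \langle c,\; \mathbf{ff} : e,\; s \rangle & \\
\langle \text{EQ} : c,\; z_1 : z_2 : e,\; s \rangle & \triangleright & \langle c,\; (z_1 = z_2) : e,\; s \rangle & \text{if } z_1, z_2 \in \mathbb{Z} \\
\langle \text{LE} : c,\; z_1 : z_2 : e,\; s \rangle & \triangleright & \langle c,\; (z_1 \le z_2) : e,\; s \rangle & \text{if } z_1, z_2 \in \mathbb{Z} \\
\langle \text{AND} : c,\; t_1 : t_2 : e,\; s \rangle & \triangleright &
  \begin{cases}
  \langle c,\; \mathbf{tt} : e,\; s \rangle & \text{if } t_1 = \mathbf{tt} \text{ and } t_2 = \mathbf{tt}; \\
  \langle c,\; \mathbf{ff} : e,\; s \rangle & \text{if } t_1 = \mathbf{ff} \text{ or } t_2 = \mathbf{ff}, \text{ and } t_1, t_2 \in \mathbf{T}.
  \end{cases} & \\
\langle \text{NEG} : c,\; t : e,\; s \rangle & \triangleright &
  \begin{cases}
  \langle c,\; \mathbf{ff} : e,\; s \rangle & \text{if } t = \mathbf{tt}; \\
  \langle c,\; \mathbf{tt} : e,\; s \rangle & \text{if } t = \mathbf{ff}.
  \end{cases} & \\
\langle \text{FETCH-}x : c,\; e,\; s \rangle & \triangleright & \langle c,\; (s\,x) : e,\; s \rangle & \\
\langle \text{STORE-}x : c,\; z : e,\; s \rangle & \triangleright & \langle c,\; e,\; s[x \mapsto z] \rangle & \text{if } z \in \mathbb{Z} \\
\langle \text{NOOP} : c,\; e,\; s \rangle & \triangleright & \langle c,\; e,\; s \rangle & \\
\langle \text{BRANCH}(c_1, c_2) : c,\; t : e,\; s \rangle & \triangleright &
  \begin{cases}
  \langle c_1 : c,\; e,\; s \rangle & \text{if } t = \mathbf{tt}; \\
  \langle c_2 : c,\; e,\; s \rangle & \text{if } t = \mathbf{ff}.
  \end{cases} & \\
\langle \text{LOOP}(c_1, c_2) : c,\; e,\; s \rangle & \triangleright & \langle c_1 : \text{BRANCH}(c_2 : \text{LOOP}(c_1, c_2),\; \text{NOOP}) : c,\; e,\; s \rangle & \\
\langle \text{TRYCATCH}(c_1, c_2) : c,\; e,\; s \rangle & \triangleright & \langle c_1 : \text{FETCH-ABOTTOM} : \text{PUSH-}1 : \text{EQ} : \text{BRANCH}(c_2,\; \text{NOOP}) : c,\; e,\; s \rangle &
\end{array}
\]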
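To make the transition rules concrete, here is a minimal Python sketch of a stepper for a subset of the machine (PUSH, the arithmetic and comparison operations, FETCH, STORE, NOOP, BRANCH and LOOP). The tuple encoding of instructions and the names `step` and `run` are assumptions made for illustration; they are not taken from the tool described in this report.

```python
import operator

# Hypothetical encoding: an instruction is a tuple such as ("PUSH", 3),
# ("ADD",), ("FETCH", "x"), ("BRANCH", c1, c2) or ("LOOP", c1, c2),
# where c1 and c2 are lists of instructions. The stack is a list whose
# first element is the top.
BINARY = {
    "ADD": operator.add, "SUB": operator.sub, "MULT": operator.mul,
    "EQ": operator.eq, "LE": operator.le,
}

def step(code, stack, state):
    """Perform one transition <i : c, e, s> |> <c', e', s'>."""
    (op, *args), rest = code[0], code[1:]
    if op == "PUSH":
        return rest, [args[0]] + stack, state
    if op in BINARY:
        z1, z2, *e = stack          # z1 is the top of the stack
        return rest, [BINARY[op](z1, z2)] + e, state
    if op == "FETCH":
        return rest, [state[args[0]]] + stack, state
    if op == "STORE":
        z, *e = stack
        return rest, e, {**state, args[0]: z}
    if op == "NOOP":
        return rest, stack, state
    if op == "BRANCH":
        t, *e = stack
        return (args[0] if t else args[1]) + rest, e, state
    if op == "LOOP":
        c1, c2 = args
        unfolded = c1 + [("BRANCH", c2 + [("LOOP", c1, c2)], [("NOOP",)])]
        return unfolded + rest, stack, state
    raise ValueError(f"unknown instruction {op}")

def run(code, state=None):
    """Apply transitions until the code component is empty."""
    stack, state = [], dict(state or {})
    while code:
        code, stack, state = step(code, stack, state)
    return state
```

For example, `x := 7; x := (x - 7)` can be encoded as `[("PUSH", 7), ("STORE", "x"), ("PUSH", 7), ("FETCH", "x"), ("SUB",), ("STORE", "x")]`, and `run` on it leaves `x` bound to 0.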

Abstract Properties
We have two datatypes in the AM code that we need to be aware of and make abstract properties for, namely numbers and booleans. The abstract properties form a lattice, as shown by the graphs.

More about the properties of properties can be read in Semantics with Applications.
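One common way to encode a sign lattice, given here as an assumption because the report's own lattice is defined by its graphs, is to model each number property as the set of basic signs it may describe. The ordering is then set inclusion and the least upper bound is set union. The names POS, ZERO, NON_NEG and Z appear in the tool output later in this report; NONE, NEG, NON_POS and NON_ZERO follow the usual textbook naming and are assumed. The error-carrying property ANY_A is left out of this sketch.

```python
# Each property is modelled as the set of basic signs it may describe.
NONE = frozenset()                 # bottom: describes no value
NEG = frozenset({"-"})
ZERO = frozenset({"0"})
POS = frozenset({"+"})
NON_POS = NEG | ZERO
NON_ZERO = NEG | POS
NON_NEG = ZERO | POS
Z = NEG | ZERO | POS               # top for numbers: any integer

def leq(p, q):
    """Lattice ordering: p is at least as precise as q."""
    return p <= q

def lub(p, q):
    """Least upper bound: the most precise property covering both p and q."""
    return p | q
```

With this encoding, `lub(ZERO, POS)` is `NON_NEG`, which is the kind of join that happens when a variable is zero on one path and positive on another.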

Abstract Interpretation
We adapt the above semantics to work on abstract properties. The operation PUSH-n pushes a constant value n onto the stack. We redefine it for our abstract interpretation to push not the value itself but instead its property.

\[
\langle \text{PUSH-}n : c,\; e,\; ps \rangle \;\triangleright\; \langle c,\; \mathit{abs}_Z(n) : e,\; ps \rangle
\]


ADD, MULT and SUB are redefined to use operations that work on properties instead of values.

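\[
\begin{array}{lcll}
\langle \text{ADD} : c,\; v_1 : v_2 : e,\; ps \rangle & \triangleright & \langle c,\; (v_1 +_S v_2) : e,\; ps \rangle & \text{if } v_1, v_2 \text{ are number properties} \\
\langle \text{MULT} : c,\; v_1 : v_2 : e,\; ps \rangle & \triangleright & \langle c,\; (v_1 \star_S v_2) : e,\; ps \rangle & \text{if } v_1, v_2 \text{ are number properties} \\
\langle \text{SUB} : c,\; v_1 : v_2 : e,\; ps \rangle & \triangleright & \langle c,\; (v_1 -_S v_2) : e,\; ps \rangle & \text{if } v_1, v_2 \text{ are number properties}
\end{array}
\]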
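The abstraction function and the abstract arithmetic can be sketched as follows, assuming that number properties are encoded as sets of basic signs (an illustrative encoding, not necessarily the one used by the tool). The functions `abs_z`, `add_s`, `sub_s` and `mult_s` are hypothetical names standing for the abstraction of a constant and the three operations on properties. Each abstract operation combines every pair of basic signs and collects the possible result signs.

```python
NEG, ZERO, POS = frozenset({"-"}), frozenset({"0"}), frozenset({"+"})
NON_NEG = ZERO | POS
Z = NEG | ZERO | POS               # any integer

def abs_z(n):
    """Abstract a concrete integer to its sign property."""
    return NEG if n < 0 else ZERO if n == 0 else POS

def _negate(a):
    return {"-": "+", "0": "0", "+": "-"}[a]

def _add_basic(a, b):
    """Possible signs of x + y when x has sign a and y has sign b."""
    if a == "0":
        return {b}
    if b == "0":
        return {a}
    if a == b:
        return {a}
    return {"-", "0", "+"}         # opposite signs: anything can happen

def _mult_basic(a, b):
    """Possible signs of x * y when x has sign a and y has sign b."""
    if a == "0" or b == "0":
        return {"0"}
    return {"+"} if a == b else {"-"}

def _lift(basic):
    """Lift an operation on basic signs to an operation on properties."""
    def op(p, q):
        out = set()
        for a in p:
            for b in q:
                out |= basic(a, b)
        return frozenset(out)
    return op

add_s = _lift(_add_basic)
sub_s = _lift(lambda a, b: _add_basic(a, _negate(b)))
mult_s = _lift(_mult_basic)
```

Under these assumptions `sub_s(POS, POS)` is `Z`, since the difference of two positive numbers can have any sign, while `add_s(NON_NEG, POS)` stays `POS`.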

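The constant TRUE of course always has the property TT.

\[
\begin{array}{lcl}
\langle \text{TRUE} : c,\; e,\; ps \rangle & \triangleright & \langle c,\; \mathrm{TT} : e,\; ps \rangle \\
\langle \text{FALSE} : c,\; e,\; ps \rangle & \triangleright & \langle c,\; \mathrm{FF} : e,\; ps \rangle
\end{array}
\]

EQ and LE are redefined similarly to ADD.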

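\[
\begin{array}{lcl}
\langle \text{EQ} : c,\; v_1 : v_2 : e,\; ps \rangle & \triangleright & \langle c,\; (v_1 =_S v_2) : e,\; ps \rangle \\
\langle \text{LE} : c,\; v_1 : v_2 : e,\; ps \rangle & \triangleright & \langle c,\; (v_1 \le_S v_2) : e,\; ps \rangle
\end{array}
\]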
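AND is where our first optimisation kicks in. If it is given that both t1 and t2 have the property TT, or to put it differently $\mathrm{TT} \sqsupseteq t_1, t_2$, then the AND operation will always be true.

\[
\langle \text{AND} : c,\; t_1 : t_2 : e,\; ps \rangle \;\triangleright\;
\begin{cases}
\langle c,\; \mathrm{TT} : e,\; ps \rangle & \text{if } \mathrm{TT} \sqsupseteq t_1, t_2; \\
\langle c,\; \mathrm{FF} : e,\; ps \rangle & \text{if } \mathrm{FF} \sqsupseteq t_1, t_2; \\
\langle c,\; \mathrm{ANY} : e,\; ps \rangle & \text{otherwise.}
\end{cases}
\]

NEG is similar.

\[
\langle \text{NEG} : c,\; t : e,\; ps \rangle \;\triangleright\;
\begin{cases}
\langle c,\; \mathrm{TT} : e,\; ps \rangle & \text{if } \mathrm{FF} \sqsupseteq t; \\
\langle c,\; \mathrm{FF} : e,\; ps \rangle & \text{if } \mathrm{TT} \sqsupseteq t; \\
\langle c,\; \mathrm{ANY} : e,\; ps \rangle & \text{otherwise.}
\end{cases}
\]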
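The abstract AND and NEG rules can be sketched in a few lines of Python. This is a simplified illustration: boolean properties are represented by the strings "TT", "FF" and "ANY", the bottom element of the lattice is ignored, and the function names `and_s` and `neg_s` are hypothetical.

```python
TT, FF, ANY = "TT", "FF", "ANY"

def and_s(t1, t2):
    """Abstract AND, following the three cases of the rule."""
    if t1 == TT and t2 == TT:
        return TT
    if t1 == FF and t2 == FF:
        return FF
    return ANY

def neg_s(t):
    """Abstract NEG: flips a known property, keeps ANY unknown."""
    if t == FF:
        return TT
    if t == TT:
        return FF
    return ANY
```

A tighter variant of `and_s` could return FF as soon as either operand is FF, because a concrete conjunction is false whenever one of its operands is false.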

FETCH-x pushes the value bound to x onto the stack. We redefine this to work on the property instead.

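\[
\begin{array}{lcll}
\langle \text{FETCH-}x : c,\; e,\; ps \rangle & \triangleright & \langle c,\; (ps\,x) : e,\; ps \rangle & \\
\langle \text{STORE-}x : c,\; v : e,\; ps \rangle & \triangleright & \langle c,\; e,\; ps[x \mapsto v] \rangle & \text{if } v \text{ is a number property} \\
\langle \text{NOOP} : c,\; e,\; ps \rangle & \triangleright & \langle c,\; e,\; ps \rangle &
\end{array}
\]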

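\[
\langle \text{BRANCH}(c_1, c_2) : c,\; v : e,\; ps \rangle \;\triangleright\;
\begin{cases}
\langle c_1 : c,\; e,\; ps \rangle & \text{if } \mathrm{TT} \sqsupseteq v; \\
\langle c_2 : c,\; e,\; ps \rangle & \text{if } \mathrm{FF} \sqsupseteq v; \\
\langle c_1 : c,\; e,\; ps \rangle \text{ and } \langle c_2 : c,\; e,\; ps \rangle & \text{otherwise.}
\end{cases}
\]

In the last case the condition is not known, so both branches are followed.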

\[
\langle \text{LOOP}(c_1, c_2) : c,\; e,\; ps \rangle \;\triangleright\; \langle c_1 : \text{BRANCH}(c_2 : \text{LOOP}(c_1, c_2),\; \text{NOOP}) : c,\; e,\; ps \rangle
\]

Working with control points


Each AM operation carries a line number. For each line number we check the set of operations that reach it and use the least upper bound on any conflicting values. This is needed because with abstract properties several different operations can run next, not just one, so the execution path branches out.
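The merging step can be sketched as follows, assuming that a property state is a dictionary from variable names to sign sets and that results are stored in a table keyed by line number. The names `join_states` and `visit` are hypothetical, and the change flag returned by `visit` reflects one common way to decide when to stop re-analysing, not necessarily what the tool does.

```python
NONE = frozenset()                       # bottom: no information yet
ZERO, POS = frozenset({"0"}), frozenset({"+"})

def join_states(ps1, ps2):
    """Pointwise least upper bound of two property states."""
    return {x: ps1.get(x, NONE) | ps2.get(x, NONE)
            for x in ps1.keys() | ps2.keys()}

def visit(table, point, ps):
    """Merge state ps into the entry for a control point.

    Returns True when the stored state changed, which signals that the
    code following this point has to be analysed again.
    """
    old = table.get(point)
    new = dict(ps) if old is None else join_states(old, ps)
    table[point] = new
    return new != old
```

If line 7 is first reached with `i` as ZERO and later with `i` as POS, the table entry for line 7 ends up with `i` as the union of both, which is the NON_NEG property. A further visit with `i` as POS changes nothing and `visit` returns False.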

Results
We now run a couple of example programs (written in the While language) through our tool and explain our findings. Note that the heap values shown are those after the statement has run. RHS is the right-hand side of the assignment.

Dangerous Example
code              heap           rhs
x := 7;           {x=(POS)}      (POS)
x := (x - 7);     {x=(Z)}        (Z)
x := (7 / x);     {x=(ANY_A)}    (ANY_A)!
x := (x + 7)      {x=(ANY_A)}    (ANY_A)!
Result: Possible uncaught exception

We note that ANY_A is in the heap. This means that there is a potential for an error to be thrown. It crops up after a division by a number with the property Z, which means that the divisor can be any number, including zero.
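The check behind this warning can be sketched as an abstract division that returns ANY_A whenever the divisor may be zero. This is an illustration under assumptions: properties are encoded as sets of basic signs, ANY_A is a plain marker value, division is assumed to truncate toward zero, and the function name `div_s` is hypothetical.

```python
NEG, ZERO, POS = frozenset({"-"}), frozenset({"0"}), frozenset({"+"})
NON_NEG = ZERO | POS
Z = NEG | ZERO | POS               # any integer
ANY_A = "ANY_A"                    # marker: any number, or an error

def div_s(p, q):
    """Abstract division p / q on sign properties."""
    if p == ANY_A or q == ANY_A:
        return ANY_A               # an earlier possible error propagates
    if "0" in q:
        return ANY_A               # the divisor may be zero
    out = set()
    for a in p:
        for b in q:
            out.add("0")           # 0 / y is 0, and small x / large y truncates to 0
            if a != "0":
                out.add("+" if a == b else "-")
    return frozenset(out)
```

With `x` holding the property Z, `div_s(POS, Z)` yields ANY_A, matching the third line of the example. A divisor that is known to be POS gives an ordinary sign property instead.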

Try Catch Example


code                heap         rhs
x := 7;             {x=(POS)}    (POS)
try
  x := (x - 7);     {x=(Z)}      (Z)
  x := (7 / x);     {x=(Z)}      (ANY_A)!
  x := (x + 7)      {x=(Z)}      (Z)
catch
  x := (x - 7);     {x=(Z)}      (Z)
Normal Termination

Here we see that our tool does not warn about similar code when the block is surrounded by try-catch. The tool shows that there is a potential for error (rhs: ANY_A), but the error never propagates to the heap because execution jumps to the catch block.

Fibonacci Example
code               heap                                        rhs
k := 20;           {k=(POS)}                                   (POS)
i := 0;            {k=(POS), i=(ZERO)}                         (ZERO)
j := 1;            {j=(POS), k=(POS), i=(ZERO)}                (POS)
while 2 <= k do                                                (T)
  k := (k - 1);    {tmp=(POS), j=(POS), k=(Z), i=(NON_NEG)}    (Z)
  tmp := j;        {tmp=(POS), j=(POS), k=(Z), i=(NON_NEG)}    (POS)
  j := (i + j);    {tmp=(POS), j=(POS), k=(Z), i=(NON_NEG)}    (POS)
  i := tmp         {tmp=(POS), j=(POS), k=(Z), i=(NON_NEG)}    (POS)
Normal Termination

This Fibonacci example shows that the tool works on loops. Notice that on the seventh line (j := i + j) i is NON_NEG although it started as ZERO. We can also see that the while loop has T as its rhs, which means that it might run. Had it been FF, the loop would never have run, and had it been TT, it would have been an infinite loop.
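The loop verdicts described above can be sketched as a small classification on top of an abstract comparison. This is an illustration under assumptions: number properties are sets of basic signs, the boolean results are the strings "TT", "FF" and "T" as printed by the tool, and the names `le_s` and `classify_loop` are hypothetical.

```python
NEG, ZERO, POS = frozenset({"-"}), frozenset({"0"}), frozenset({"+"})
Z = NEG | ZERO | POS               # any integer
_ORDER = {"-": 0, "0": 1, "+": 2}

def le_s(p, q):
    """Abstract p <= q: "TT", "FF", or "T" when both outcomes are possible."""
    outcomes = set()
    for a in p:
        for b in q:
            if _ORDER[a] < _ORDER[b]:
                outcomes.add(True)
            elif _ORDER[a] > _ORDER[b]:
                outcomes.add(False)
            elif a == "0":
                outcomes.add(True)          # 0 <= 0 always holds
            else:
                outcomes |= {True, False}   # same nonzero sign: unknown
    if outcomes == {True}:
        return "TT"
    if outcomes == {False}:
        return "FF"
    return "T"

def classify_loop(cond):
    """Turn the property of a loop condition into a verdict."""
    if cond == "FF":
        return "never runs"
    if cond == "TT":
        return "infinite loop"
    return "may run"
```

For the condition `2 <= k` with `k` holding Z, `le_s(POS, Z)` gives "T", so the loop is classified as one that may run. The "infinite loop" verdict is only meaningful when it is applied to the condition property obtained after the loop body has been taken into account.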
