
Fundamental Algorithms


http://www.cse.tum.de/vtc/FundAlg/print/print.xml

Chapter 1: Introduction
1.0 What is an Algorithm?
Before we start talking about algorithms for an entire semester, we should try to figure out what
an algorithm actually is. We could, for example, have a short look at how others define an
algorithm:

Several definitions found in the WWW:


Browsing the web, we can find quite a number of definitions of the term "algorithm". Here's a small
collection of them.

Definition 1:
An algorithm is a computable set of steps to achieve a desired result.
(found on numerous websites)

Definition 2:
An algorithm is a set of rules that specify the order and kind of arithmetic operations that
are used on specified set of data.

Definition 3:
An algorithm is a sequence of finite number of steps arranged in a specific logical order
which, when executed, will produce a correct solution for a specific problem.
(taken from an undergraduate course at UMBC).

Definition 4:
An algorithm is a set of instructions for solving a problem. When the instructions are
followed, it must eventually stop with an answer.
(given by Prof. Carol Wolf in a course on algorithms).

Definition 5:
An algorithm is a finite, definite, effective procedure, with some output.
(given by Donald Knuth)
27.11.2003 17:18


Essential properties of an algorithm


Though the given definitions differ slightly, we can clearly extract the essential properties an
algorithm should have:
- an algorithm is finite (w.r.t.: set of instructions, use of resources, time of computation)
- instructions are precise and computable
- instructions have a specified logical order; however, we can discriminate between
  deterministic algorithms (every step has a well-defined successor), and
  non-deterministic algorithms (randomized algorithms, but also parallel algorithms!)
- an algorithm produces a result

Basic questions about algorithms:


For each algorithm, especially a new one, we should ask the following basic questions:
does it terminate?
is it correct?
is the result of the algorithm determined?
how much memory will it use?
Throughout this course, we will be mainly concerned with another important question:
How long will an algorithm have to work to achieve the desired result?

1.1. Example: Fibonacci Numbers


Our first example of an algorithm will be one that computes the Fibonacci numbers. As a reminder:
The sequence f_j, j ≥ 0, of the Fibonacci numbers is defined recursively as:
f_0 := 1
f_1 := 1
f_j := f_(j-1) + f_(j-2) for each j ≥ 2

1.1.1 A recursive algorithm


The definition of the fibonacci numbers can be translated almost directly into a recursive
algorithm:

Fibo (n:Integer) : Integer {
    if n=0 then return 1;
    if n=1 then return 1;
    if n>1 then return Fibo(n-1) + Fibo(n-2);
}

We take a little time to explain our notation:


the algorithm Fibo takes a parameter n of type Integer as its single argument, and returns a
result of type Integer. The recursive calls Fibo(n-1) and Fibo(n-2) require the execution of our
algorithm using the values n-1 and n-2, respectively, as parameters. In our algorithm, the two
results are then added and returned as the result of the call Fibo(n).

Question:
How many arithmetic operations does it take to compute f_n using the algorithm Fibo?
That means, we neglect the costs of all operations except the arithmetic ones.

Define:
T_Fibo(n) = number of arithmetic operations (+,-,*,/) that Fibo will perform with n as input
parameter.
Examining the function Fibo, we see that:
T_Fibo(0) = T_Fibo(1) = 0, as there are no arithmetic operations performed if the parameter n is 0 or 1.
T_Fibo(n) = 3 + T_Fibo(n-1) + T_Fibo(n-2), if the parameter n is larger than 1.
In that case, the operations are those performed by the two recursive calls, plus one
addition to combine their results and two subtractions to compute the arguments n-1 and n-2.
Such a recursive characterization of the number of operations is very often found for recursive
algorithms. We will soon examine techniques to solve such recurrence equations. In the
meantime, we try to stick to more basic techniques.
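The recurrence can also be checked experimentally. The following Python sketch is our own transcription of Fibo; the operation counter is an addition of ours, not part of the original pseudocode:

```python
def fibo_counted(n):
    """Recursive Fibonacci (f_0 = f_1 = 1), returning (value, op_count),
    where op_count tallies the arithmetic operations performed."""
    if n == 0 or n == 1:
        return 1, 0
    f1, t1 = fibo_counted(n - 1)  # computing the argument n-1 costs one subtraction
    f2, t2 = fibo_counted(n - 2)  # computing the argument n-2 costs another
    # one addition plus two subtractions = 3 operations in this call
    return f1 + f2, 3 + t1 + t2

# check the proposition T_Fibo(n) = 3*f_n - 3 for small n
for n in range(12):
    f, t = fibo_counted(n)
    assert t == 3 * f - 3
```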

Operation count for Fibo:


We set up a table of the number of additions that are performed by a call to Fibo(n):

n           0   1   2   3   4   5   6   7   8
f_n         1   1   2   3   5   8  13  21  34
T_Fibo(n)   0   0   3   6  12  21  36  60  99

Proposition:
After a (very) close look at the values given in this table, we propose that
T_Fibo(n) = 3·f_n − 3
Proof:


We prove this proposition by induction:

case n = 0:
from the recurrence equation, we know that T_Fibo(0) = 0,
and the right hand side also evaluates to 3·f_0 − 3 = 3 − 3 = 0.
case n = 1:
T_Fibo(1) = 0, and 3·f_1 − 3 = 3 − 3 = 0.
case n ≥ 2 (induction step):
we assume that T_Fibo(m) = 3·f_m − 3 is correct for all m < n (induction assumption), especially for
m = n−1 and m = n−2. Then:

    T_Fibo(n) = 3 + T_Fibo(n−1) + T_Fibo(n−2)
              = 3 + (3·f_(n−1) − 3) + (3·f_(n−2) − 3)
              = 3·(f_(n−1) + f_(n−2)) − 3
              = 3·f_n − 3

(q.e.d.!)

An estimate of the size of T_Fibo(n) = 3·f_n − 3


We will use the following, rather "famous", inequality for the Fibonacci numbers,

    2^⌊n/2⌋ ≤ f_n ≤ 2^n,

to get a more precise estimate of Fibo's operation count.
We will prove the two inequalities separately, by induction. In either case, we use that f_(j+1) ≥ f_j
for all j ≥ 0 (proof left to the reader....)

Proof for f_n ≤ 2^n:
By induction over n:
case n = 0: then f_0 = 1, which is equal to 2^0 = 1
case n = 1: then f_1 = 1, which is smaller than 2^1 = 2
case n ≥ 2 (induction step):
f_n = f_(n-1) + f_(n-2) ≤ f_(n-1) + f_(n-1) = 2·f_(n-1) ≤ 2·2^(n-1) = 2^n

Proof for f_n ≥ 2^⌊n/2⌋:
case 1: n = 2k, i.e. ⌊n/2⌋ = k:
again, we prove this by induction (over k):
for k = 0, which means that n = 2k = 0, we have f_0 = 1, which is equal to 2^0 = 1.
for k ≥ 1 (i.e. n ≥ 2), we compute that f_2k = f_(2k-1) + f_(2k-2) ≥ 2·f_(2k-2) = 2·f_2(k-1).
By induction assumption, f_2(k-1) ≥ 2^(k-1), and therefore f_2k ≥ 2·2^(k-1) = 2^k.
case 2: n = 2k + 1, i.e. ⌊n/2⌋ = k:
we can reduce this to case 1 by f_n = f_(2k+1) ≥ f_2k ≥ 2^k
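A quick numerical sanity check of this inequality, in a small Python sketch of our own (the iterative computation of f_n anticipates the algorithm Fibit introduced below):

```python
def fib(n):
    """Fibonacci numbers with the convention f_0 = f_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# check 2^(n//2) <= f_n <= 2^n for the first values of n
for n in range(40):
    assert 2 ** (n // 2) <= fib(n) <= 2 ** n
```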


Result:
Using the above estimate for the Fibonacci numbers, we derive the following operation count for
the algorithm Fibo:

    3·2^⌊n/2⌋ − 3 ≤ T_Fibo(n) ≤ 3·2^n − 3

Hence, the operation count of Fibo increases exponentially with the size of the input parameter.

Example
How long would a typical computer take to compute the 100-th Fibonacci number using
the algorithm Fibo?
Answer: n = 100 ⇒ T_Fibo(n) ≥ 3·2^50 − 3 > 10^15, i.e. the computer has to perform more than 10^15
additions.
If we assume that our computer is able to perform one arithmetic operation per nanosecond
(compare this to the typical GHz clock rate of current processors), the execution time would
be at least 39 days!
Exercise:
How large is the respective upper bound for the computing time?

Remark:
In our estimate of Fibo's operation count, we have used a rather weak lower bound of the size of
the Fibonacci numbers. A well known algebraic formulation of the Fibonacci numbers claims that

    f_n = (1/√5) · ( ((1 + √5)/2)^(n+1) − ((1 − √5)/2)^(n+1) ).

This leads to a quite accurate estimate of Fibo's operation count: T_Fibo(100) ≈ 10^21.
This means that, under the conditions above (1 operation per nanosecond), the computation of
the 100-th Fibonacci number will take approximately 31,500 years, which by far exceeds our initial
estimate of "at least 39 days" . . .
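The closed form can be checked numerically. The following Python sketch is our own illustration; note the shifted exponent n+1, which accounts for the convention f_0 = f_1 = 1 used in these notes:

```python
from math import sqrt

def fib_closed(n):
    """Closed-form Fibonacci for the convention f_0 = f_1 = 1.
    Floating-point evaluation; round() absorbs the small error."""
    phi = (1 + sqrt(5)) / 2
    psi = (1 - sqrt(5)) / 2
    return round((phi ** (n + 1) - psi ** (n + 1)) / sqrt(5))

def fib(n):
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert all(fib_closed(n) == fib(n) for n in range(40))
# T_Fibo(100) = 3*f_100 - 3 is indeed on the order of 10**21
assert 10 ** 21 < 3 * fib(100) - 3 < 10 ** 22
```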

1.1.2. An iterative algorithm


To analyse how the recursive algorithm works, we can draw a "tree" of the function calls, for
example invoked by a call to Fibo(4):


We can easily see that intermediate results like Fibo(2) or Fibo(1) are computed again (and again
and again..). Hence, we should try to store these intermediate results, and reuse them.

Strategy
Use two variables, last2 and last1, to store the last two Fibonacci numbers f_(n-2) and
f_(n-1), respectively.
Use one additional variable, f, to compute f_n.

Iterative Algorithm
Fibit (x : Integer) : Integer {
    if x <= 1 then return 1;
    else {
        last2 := 1;
        last1 := 1;
        for i from 2 to x do {
            f := last2 + last1;
            last2 := last1;
            last1 := f;
        }
        return f;
    }
}
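A direct Python transcription of Fibit (our own sketch for illustration):

```python
def fibit(x):
    """Iterative Fibonacci, mirroring the pseudocode of Fibit above.
    last2 and last1 hold f_(n-2) and f_(n-1); f accumulates f_n."""
    if x <= 1:
        return 1
    last2, last1 = 1, 1
    for _ in range(2, x + 1):
        f = last2 + last1
        last2, last1 = last1, f
    return f

assert [fibit(n) for n in range(9)] == [1, 1, 2, 3, 5, 8, 13, 21, 34]
```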

Question:
How many arithmetic operations does it take to compute the n-th Fibonacci number
using algorithm Fibit?

Answer
For n ≤ 1: 0 operations (return 1 as the result).
For n ≥ 2: there is 1 operation per cycle of the for-loop.
The loop is executed n − 1 times, thus we need n − 1 operations.
Therefore,

    T_Fibit(n) = 0       for n ≤ 1
    T_Fibit(n) = n − 1   for n ≥ 2

We can see that the operation count of Fibit increases linearly with the size of the input
parameter.

Example
Again, we imagine a computer that performs 1 arithmetic operation per nanosecond. The number
of arithmetic operations to compute the 100-th Fibonacci number using algorithm Fibit is now
T_Fibit(100) = 99. Hence, the computing time will be only 99 nanoseconds!

Remark
Please do not take this example as one to show that recursive programming, in itself, is slow.
It's not recursion that makes Fibo slow, it's the excessive recomputation of intermediate results.

1.2. Random Access Machines


In this chapter, we will try to show a more systematic way of computing the running time of an
algorithm than just adding up its arithmetic operations. After all, the time to copy data, execute
loops, etc. cannot always be neglected. As the execution time is usually machine dependent, we
will try to establish a reference machine to objectively compute the working time for algorithms.

1.2.1. Definition of a Random Access Machine


A random access machine (RAM) is a simple model of computation. Its memory
consists of an unbounded sequence of registers. Each register may hold an integer
value.
The control unit of a RAM holds a program, which consists of a numbered list of
statements.
The program counter determines which statement is to be executed next.


Rules for executing a RAM-program:


in each work cycle the RAM executes one statement of the program.
the program counter specifies the number of the statement that is to be executed.
the program ends when the program counter takes an invalid value (i.e. when there is no
statement in the program that has the specified number).

Execution of a RAM-program:
To "run" a program in the RAM, we therefore need to:
define the program, i.e. the exact list of statements.
define starting values for the registers (the input).
define starting values for the program counter (usually, we'll start with the first statement).

Statements of a RAM
Notation:
<Ri>       =  the integer stored in the i-th register
<Ri> := x  =  set the content of the i-th register to the value x

List of Statements:

Statement        Effect on registers            Program Counter
Ri ← Rj          <Ri> := <Rj>                   <PC> := <PC> + 1
Ri ← RRj         <Ri> := <R<Rj>>                <PC> := <PC> + 1
RRi ← Rj         <R<Ri>> := <Rj>                <PC> := <PC> + 1
Ri ← k           <Ri> := k                      <PC> := <PC> + 1
Ri ← Rj + Rk     <Ri> := <Rj> + <Rk>            <PC> := <PC> + 1
Ri ← Rj - Rk     <Ri> := max{0, <Rj> - <Rk>}    <PC> := <PC> + 1
GOTO m                                          <PC> := m
IF Ri=0 GOTO m                                  <PC> := m, if <Ri> = 0,
                                                <PC> := <PC> + 1 otherwise
IF Ri>0 GOTO m                                  <PC> := m, if <Ri> > 0,
                                                <PC> := <PC> + 1 otherwise

Example: (Multiplication on a RAM)


Basic idea
Using the well known identity x·y = Σ_{i=1}^{y} x, we will attempt to multiply x by y by adding up x exactly y
times.
A respective algorithm in our previous notation could, for example, look like this:

mult (x: Integer, y: Integer) : Integer {
    sum := 0;
    while y > 0 do {
        sum := sum + x;
        y := y - 1;
    }
    return sum;
}

The Multiplication RAM


starting configuration of the RAM:
<R0> := x, <R1> := y, and <Ri> := 0 for all i ≥ 2
<PC> := 0
desired result: <R0> = x·y
RAM program:
We will use register <R2> for the variable sum. Further, we will need register <R3> to hold
the value 1, because we have to subtract 1 from the variable y in each loop cycle, and the
RAM does not provide operations to subtract numbers, only contents of registers.
(0)  R3 ← 1
(1)  IF R1 = 0 GOTO 5
(2)  R2 ← R2 + R0
(3)  R1 ← R1 - R3
(4)  GOTO 1
(5)  R0 ← R2
(6)  STOP
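The RAM semantics are simple enough to simulate directly. The following Python sketch is our own illustration (the encoding of statements as small functions is ad hoc); it runs the multiplication program above:

```python
from collections import defaultdict

def run_ram(program, registers):
    """Execute a RAM program given as a dict {number: statement}.
    A statement is a function that updates the registers and returns
    the next program counter; execution stops when the PC leaves the
    program, which plays the role of the STOP statement."""
    R = defaultdict(int, registers)
    pc, cycles = 0, 0
    while pc in program:
        pc = program[pc](R, pc)
        cycles += 1
    return R, cycles

# the multiplication program from above
program = {
    0: lambda R, pc: (R.__setitem__(3, 1), pc + 1)[1],                    # R3 <- 1
    1: lambda R, pc: 5 if R[1] == 0 else pc + 1,                          # IF R1=0 GOTO 5
    2: lambda R, pc: (R.__setitem__(2, R[2] + R[0]), pc + 1)[1],          # R2 <- R2 + R0
    3: lambda R, pc: (R.__setitem__(1, max(0, R[1] - R[3])), pc + 1)[1],  # R1 <- R1 - R3
    4: lambda R, pc: 1,                                                   # GOTO 1
    5: lambda R, pc: (R.__setitem__(0, R[2]), pc + 1)[1],                 # R0 <- R2
}

R, cycles = run_ram(program, {0: 4, 1: 3})
assert R[0] == 12           # 4 * 3
assert cycles == 3 + 4 * 3  # 15 work cycles for y = 3
```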

Example:
Let's examine the working steps of our RAM on the special input x=4 and y=3:

after step     <PC>   <R0>   <R1>   <R2>   <R3>
===============================================
0 (at start)     0      4      3      0      0
1                1      4      3      0      1
-----------------------------------------------
2                2      4      3      0      1
3                3      4      3      4      1
4                4      4      2      4      1
5                1      4      2      4      1
-----------------------------------------------
6                2      4      2      4      1
7                3      4      2      8      1
8                4      4      1      8      1
9                1      4      1      8      1
-----------------------------------------------
10               2      4      1      8      1
11               3      4      1     12      1
12               4      4      0     12      1
13               1      4      0     12      1
-----------------------------------------------
14               5      4      0     12      1
15               6     12      0     12      1
16             stop

We can see that our RAM performs three (for y=3) iterations of its basic loop (statements 1 to 4).
Each of the loop iterations requires 4 work cycles of the RAM. The remaining statements require
3 work steps (one before, and two after the loop iterations). Hence, for y=3, our RAM performs 15
work steps. For general y, it is easy to see that the number of work cycles is 1 + 4·y + 2 = 3 + 4·y.
Note that the number of work cycles is independent of x!

1.2.2. Measuring complexity


We will consider vectors of integers as input of a RAM, i.e. x = (x_1,...,x_m), where x_i ∈ ℤ for i = 1,...,m.
The starting configuration of the RAM is given by <Ri> := x_(i+1) for i = 0,...,m−1, and <Rj> := 0 for all
j ≥ m.

Definition: "Uniform size of x"


Let x = (x_1,...,x_m) be the input of a RAM; then the uniform size of the input is defined as
|x|_uni := m.


Definition: "Uniform time complexity"


Let M be a RAM and x be its input. The uniform time complexity T_M^uni(x) of M on x is
then defined as the number of work cycles M performs on x.

Example
The Multiplication RAM has a uniform time complexity of T_M^uni(x, y) = 3 + 4·y, independent of x. The
uniform size of its input is |(x, y)|_uni = 2, independent of the size of x and y.

Exercise
Discuss, whether the uniform complexity of a RAM is a good model for measuring complexity of
real algorithms in real computers.

Definition: "Logarithmic size of x"


Let x = (x_1,...,x_m) be the input of a RAM. Then, the logarithmic size of the input x shall
be defined as

    |x|_log := Σ_{i=1}^{m} l(x_i),

where l(z) shall be the number of (binary) digits required to represent z.

Remarks:
for x ∈ ℕ: l(x) = ⌊log_2(x)⌋ + 1 = ⌈log_2(x + 1)⌉
idea for proof: with k binary digits, 2^k different integers can be represented, which implies
that, if l(x) = k, then x is between 2^(k-1) and 2^k.
if we use the decimal system (instead of the binary system), it's log_10 instead of log_2 (similar
for all other systems ...). We will always use the binary system, and write log instead of
log_2.
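In Python, l(z) for non-negative integers can be computed with `int.bit_length` (a small sketch of ours; taking l(0) = 1, since representing the value 0 still requires one digit):

```python
from math import floor, log2

def l(z):
    """Number of binary digits needed to represent z >= 0 (l(0) taken as 1)."""
    return max(1, z.bit_length())

assert l(1) == 1 and l(2) == 2 and l(255) == 8 and l(256) == 9
# for z >= 1:  l(z) = floor(log2(z)) + 1
assert all(l(z) == floor(log2(z)) + 1 for z in range(1, 1000))
```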

Definition: Logarithmic time complexity


Again, M is a RAM and x is its input. Then, the logarithmic costs of a RAM's work
cycles are defined as

Statement        Logarithmic Costs
Ri ← Rj          l(<Rj>) + 1
Ri ← RRj         l(<Rj>) + l(<R<Rj>>) + 1
RRi ← Rj         l(<Rj>) + l(<Ri>) + 1
Ri ← k           l(k) + 1
Ri ← Rj + Rk     l(<Rj>) + l(<Rk>) + 1
Ri ← Rj - Rk     l(<Rj>) + l(<Rk>) + 1
GOTO m           1
IF Ri=0 GOTO m   l(<Ri>) + 1
IF Ri>0 GOTO m   l(<Ri>) + 1

The logarithmic time complexity T_M^log(x) of M on x is defined as the sum of the
logarithmic costs of all working steps M performed on x.

Example:
We compute the logarithmic costs for an execution of the Multiplication-RAM on input (x,y). We
can compute the logarithmic costs by summing up the costs for each loop cycle separately:

for statement (0):                    l(1) + 1
for the 0-th loop cycle:              (l(y) + 1) + (l(0) + l(x) + 1) + (l(y) + l(1) + 1) + 1
for the 1st loop cycle:               (l(y−1) + 1) + (l(x) + l(x) + 1) + (l(y−1) + l(1) + 1) + 1
...
for the i-th loop cycle:              (l(y−i) + 1) + (l(i·x) + l(x) + 1) + (l(y−i) + l(1) + 1) + 1
...
for the (y−1)-st (final) loop cycle:  (l(1) + 1) + (l((y−1)·x) + l(x) + 1) + (l(1) + l(1) + 1) + 1
for the final IF-statement:           l(0) + 1
for statement (5):                    l(y·x) + 1

This leads us to the following expression for the logarithmic costs:

T_M^log(x, y) = Σ_{i=0}^{y−1} l(y−i) + y·l(x) + Σ_{i=0}^{y−1} l(i·x) + Σ_{i=0}^{y−1} l(y−i) + (1 + 1 + l(1) + 1 + 1)·y + 4 + l(0) + l(x·y)
            ≤ 5 + (4 + l(1) + l(x))·y + 2·Σ_{i=1}^{y} l(i) + Σ_{i=0}^{y−1} l(i·x) + l(x·y)
            ≤ 5 + (5 + l(x))·y + 2·y·l(y) + y·(l(y) + l(x)) + l(x) + l(y)
            ≤ 5 + y·(5 + 2·l(x) + 3·l(y)) + l(x) + l(y)

As the logarithmic size of the input is |(x, y)|_log = l(x) + l(y), we may conclude that

    T_M^log(x, y) ≤ 5 + y·(5 + 3·|(x, y)|_log) + |(x, y)|_log.

Hence, we can get a relation between the size of the input and the logarithmic costs.

Definition: uniformly, and logarithmically time-bounded


Let t : ℕ → ℕ be a function, and M be a RAM.
1. M is called uniformly t(n)-time-bounded, if T_M^uni(x) ≤ t(n) for all x : |x|_uni = n,
   i.e. for all inputs of uniform size n, the uniform time complexity has to be bounded
   by t(n).
2. M is called logarithmically t(n)-time-bounded, if T_M^log(x) ≤ t(n) for all x : |x|_log = n,
   i.e. for all inputs of logarithmic size n, the logarithmic time complexity has to be
   bounded by t(n).

Remarks:
T_M^uni(n) := sup{ T_M^uni(x) : |x|_uni = n }
T_M^log(n) := sup{ T_M^log(x) : |x|_log = n }

Example: (Multiplication-RAM)
The Multiplication RAM's uniform time complexity, T_M^uni(x, y) = 4·y + 3, is independent of the
uniform size of its input, |(x,y)|_uni = 2, which is independent of x and y. There is no function t(n)
such that T_M^uni(x, y) = 4·y + 3 ≤ t(2), because y can become arbitrarily large.
Thus, M is not uniformly t(n)-time-bounded for any function t.

The logarithmic time complexity of the Multiplication RAM is bounded by T_M^log(x, y) ≤ 5 + y·(5 + 3·|(x, y)|_log) + |(x, y)|_log.
If the logarithmic size of the input is n := |(x, y)|_log = l(x) + l(y), then, using y < 2^l(y) ≤ 2^n,

    T_M^log(x, y) ≤ 5 + y·(5 + 3·n) + n ≤ 5 + 2^l(y)·(5 + 3·n) + n ≤ 5 + 2^n·(5 + 3·n) + n,

which means exactly that M is logarithmically time-bounded w.r.t. the function t(n) = 5 + 2^n·(5 + 3·n) + n.
We can therefore say that
"M has an exponential time-complexity w.r.t. the logarithmic complexity
measure".

Exercise
Discuss in which situations the logarithmic time complexity can be a better model for the
characterisation of computing time than the uniform time complexity. Consider the following
examples:
sorting a large set of numbers (phone numbers, for example), where the size of the numbers is
within a fixed range;
computing the prime factorization of a single, very large integer;
computing a matrix-vector product, or solving a large linear system of equations.
Discuss how the word length (i.e. the number of bits a CPU can process in a single step) affects
whether uniform or logarithmic complexity is more appropriate.

1.3. Asymptotic Behaviour of Functions



Definition:
Given functions f, g : ℕ → ℝ+, we define
1. f ∈ O(g) :⇔ ∃ c>0 ∃ n_0 ∀ n ≥ n_0 : f(n) ≤ c·g(n)
2. f ∈ Ω(g) :⇔ ∃ c>0 ∃ n_0 ∀ n ≥ n_0 : f(n) ≥ c·g(n)
3. f ∈ Θ(g) :⇔ f ∈ O(g) and f ∈ Ω(g)
4. f ∈ o(g) :⇔ ∀ c>0 ∃ n_0 ∀ n ≥ n_0 : f(n) ≤ c·g(n)
5. f ∈ ω(g) :⇔ ∀ c>0 ∃ n_0 ∀ n ≥ n_0 : f(n) ≥ c·g(n)

Remarks:
1. if f ∈ O(g), then g is called an asymptotic upper bound of f;
2. if f ∈ Ω(g), then g is called an asymptotic lower bound of f;
3. if f ∈ Θ(g), then g is called an asymptotically tight bound of f;
4. if f ∈ o(g), then f is called asymptotically smaller than g;
5. if f ∈ ω(g), then f is called asymptotically larger than g.

Further Remarks:
1. in the definition of f ∈ o(g), we can replace the condition f(n) ≤ c·g(n) by the equivalent
condition f(n)/g(n) ≤ c; thus the definition can also be written as ∀ c>0 ∃ n_0 ∀ n ≥ n_0 :
f(n)/g(n) ≤ 1/c, which is equivalent to lim_{n→∞} f(n)/g(n) = 0.
2. in the same way, f ∈ ω(g) is equivalent to lim_{n→∞} g(n)/f(n) = 0.
3. from these two observations, we may conclude that f ∈ o(g) ⇔ g ∈ ω(f).
In literature, you will also often find notations like f = O(g) instead of f ∈ O(g).
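The limit characterization can be illustrated numerically. In the following Python sketch, the example functions f(n) = n·log_2 n and g(n) = n² are our own choice, not from the text; the ratio f(n)/g(n) = log_2(n)/n visibly tends to 0, so f ∈ o(g):

```python
from math import log2

# ratio f(n)/g(n) for f(n) = n*log2(n), g(n) = n**2, at growing n
ratios = [(n * log2(n)) / n ** 2 for n in (10, 100, 1000, 10000)]

assert all(a > b for a, b in zip(ratios, ratios[1:]))  # strictly decreasing
assert ratios[-1] < 0.01                               # already close to 0
```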

Example: (Multiplication RAM)


for the uniform time complexity of the multiplication RAM, we get
T_M^uni(x,y) = 4y + 3 ⇒ T_M^uni(x,y) ∈ O(y) (use e.g. c=5, y ≥ 3) and T_M^uni(x,y) ∈ Ω(y) (use
e.g. c=3, y ≥ 0) ⇒ T_M^uni(x,y) ∈ Θ(y)
for the logarithmic time complexity, we get
T_M^log(n) ≤ 5 + (5 + 3n)·2^n + n ⇒ T_M^log(n) ∈ O(n·2^n), or T_M^log(n) ∈ O(c^n) for c > 2

Remark:
The definitions for the asymptotical bounds may be used for multidimensional functions f, g : ℕ^k →
ℝ+, too, if we replace the terms n ≥ n_0 by n = (n_1,...,n_k) : n_i ≥ n_0.

Common phrases that denote the growth of a function f:


f constant, if f ∈ Θ(1)
f logarithmic, if f ∈ O(log n)
f polylogarithmic, if f ∈ O(log^k n)
f linear, if f ∈ Θ(n)
f quadratic, if f ∈ O(n²), especially if f ∈ Θ(n²)
f polynomial, if f ∈ O(n^k) for some k ∈ ℕ
f exponential, if f ∈ O(c^n) for some c ∈ ℝ+

Comparison of functions
The O, Ω, Θ, o, and ω notations define relations between functions. We can therefore ask
whether these relations are transitive, reflexive, symmetric, etc.:
transitive: all of the five relations are transitive!
reflexive: only O, Ω, and Θ are reflexive
symmetric: f ∈ Θ(g) if and only if g ∈ Θ(f)
transpose symmetry:
f ∈ O(g) if and only if g ∈ Ω(f)
f ∈ o(g) if and only if g ∈ ω(f)

2. Sorting
Definition of the Sorting problem:
Input:
A sequence of n numbers a_1, a_2, a_3, . . ., a_n.
Output:
A permutation (reordering) a'_1, a'_2, . . ., a'_n of the input sequence such that
a'_1 ≤ a'_2 ≤ . . . ≤ a'_n.

2.1. Insertion Sort


Idea:
successively generate ordered sequences of the first j numbers (j = 1, j = 2, . . ., j = n)
in each step (j → j+1), one additional integer has to be inserted into an already ordered
sequence

Data Structure:
array A[1..n] containing the sequence a 1 (in A[1]), . . ., a n (in A[n]).


numbers are sorted in place: output sequence will be stored in A, itself.

Algorithm:
INSERTSORT(A:Array[1..n]) {
    for j from 2 to n do {
        // insert A[j] into sequence A[1..j-1]
        key := A[j];
        i := j-1;
        while i>=1 and A[i]>key {
            A[i+1] := A[i];
            i := i-1;
        }
        A[i+1] := key;
    }
}

2.1.1. Correctness of Insertsort:


To prove the correctness of the algorithm INSERTSORT, we can use a so-called

Loop invariant:
Before (and after) each iteration of the for-loop, the subarray A[1..j-1] consists of all
elements originally in A[1..j-1], but in sorted order.
To prove that an invariant is true, we always have to show its correctness at the beginning
(initialization), that it stays true from one iteration to the next (maintenance), and show of what
use it is for the correctness at termination of the algorithm:
Initialization (invariant is true before the first iteration of the loop):
The loop starts at j = 2, therefore A[1..j-1] consists of the single element A[1].
A subarray containing only one element is, of course, sorted. The loop invariant is therefore
correct.
Maintenance (if the invariant is true before an iteration, it remains true before the next iteration):
The while loop will shift all elements that are larger than the additional element A[j] to the
right (by one position). A[j] will be inserted at the vacated (and correct) position. Thus,
A[1..j] is a sorted array. A formal proof (especially that the position was the correct one)
would, for example, state (and prove) a loop invariant for the while-loop.
Termination (on termination, the loop invariant helps to show that the algorithm is correct)
The for-loop terminates when j exceeds n (that means j = n + 1 ). Thus, at termination, A[1
.. (n+1)-1] = A[1..n] will be sorted, and contain all original elements. Hence, the
algorithm INSERTSORT is correct.
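The loop invariant can be checked at runtime. The following Python sketch is our own 0-based transcription of INSERTSORT, with the invariant stated as an assertion:

```python
def insertsort(A):
    """In-place insertion sort, following INSERTSORT above (0-based indices)."""
    for j in range(1, len(A)):
        # loop invariant: A[0..j-1] holds the original first j elements, sorted
        assert A[:j] == sorted(A[:j])
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]   # shift larger elements one position right
            i -= 1
        A[i + 1] = key        # insert A[j] at the vacated position
    return A

assert insertsort([5, 2, 4, 6, 1, 3]) == [1, 2, 3, 4, 5, 6]
```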


2.1.2. Running time of INSERTSORT


The following table presents the algorithm INSERTSORT along with the individual cost of each
instruction, and the number of repetitions of each instruction. As we don't know any precise costs
for the instructions, we assume fixed costs c_i for each instruction:

Algorithm                                   cost    repetitions
INSERTSORT(A:Array[1..n]) {
  for j from 2 to n do {                    c_1     n
    key := A[j]; i := j-1;                  c_2     n − 1
    while i>=1 and A[i]>key {               c_3     Σ_{j=2}^{n} t_j
      A[i+1] := A[i]; i := i-1;             c_4     Σ_{j=2}^{n} (t_j − 1)
    }
    A[i+1] := key;                          c_5     n − 1
  }
}

where t_j denotes the number of times the while-loop is executed in the j-th for-loop.
Thus, for the total costs T(n) of the algorithm, we get

    T(n) = c_1·n + (c_2 + c_5)·(n − 1) + c_3·Σ_{j=2}^{n} t_j + c_4·Σ_{j=2}^{n} (t_j − 1)

The total costs of INSERTSORT naturally depend on the data, which determine how often the
while loop will be repeated in each for loop.

Analysis of the "best case":


In the best case, each t_j = 1. This will happen if A[1..n] is already sorted.
Then: T(n) = c_1·n + (c_2 + c_5)·(n − 1) + c_3·Σ_{j=2}^{n} 1 + c_4·Σ_{j=2}^{n} (1 − 1) = c_1·n + (c_2 + c_3 + c_5)·(n − 1)
In the best case, T(n) is a linear function of n (the array size).

Analysis of the worst case


In the worst case, each t_j = j. This will happen, for example, if the array A is sorted in the
opposite order. Then:

    T(n) = c_1·n + (c_2 + c_5)·(n − 1) + c_3·Σ_{j=2}^{n} j + c_4·Σ_{j=2}^{n} (j − 1)
         = c_1·n + (c_2 + c_5)·(n − 1) + c_3·(n(n + 1)/2 − 1) + c_4·(n(n − 1)/2)
         = ((c_3 + c_4)/2)·n² + (c_1 + c_2 + c_3/2 − c_4/2 + c_5)·n − (c_2 + c_3 + c_5)

In the worst case, T(n) is a quadratic function of n.
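The best and worst cases can be observed empirically. The following Python sketch (our own instrumentation) counts how often the while-loop body of INSERTSORT executes, i.e. Σ (t_j − 1):

```python
def shifts(A):
    """Count executions of the while-loop body of INSERTSORT on A."""
    A = list(A)
    count = 0
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
            count += 1
        A[i + 1] = key
    return count

n = 100
assert shifts(range(n)) == 0                         # best case: already sorted
assert shifts(range(n, 0, -1)) == n * (n - 1) // 2   # worst case: reverse order
```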

Analysis of the average case


The best-case analysis told us that T(n) ∈ Ω(n) (lower bound).
The worst-case analysis told us that T(n) ∈ O(n²) (upper bound).
What will be the "typical" (average, expected) running time of INSERTSORT?
This requires a probabilistic analysis:
Let X(n) be the set of all possible input sequences of length n, and let P : X(n) → [0, 1] be
a probability function such that P(x) is the probability that the input sequence is x. Then,
we define

    T̄(n) = Σ_{x ∈ X(n)} P(x)·T(x)

as the expected running time of an algorithm.


An exact computation of the expected running time of an algorithm therefore not only requires a
difficult calculation, but especially requires assumptions on the probability function P(x). We will
have a look at this problem in a later chapter. In the meantime, we restrict ourselves to an
estimate of the average running time.
Heuristic estimate of T̄(n)


On the average, the number t_j of repetitions of the while loop will be about j/2, because, on the
average, about half of the elements in A[1..j-1] can be expected to be larger than A[j].

    T̄(n) ≈ c_1·n + (c_2 + c_5)·(n − 1) + c_3·Σ_{j=2}^{n} j/2 + c_4·Σ_{j=2}^{n} (j − 1)/2
         = c_1·n + (c_2 + c_5)·(n − 1) + c_3·(1/2)·(n(n + 1)/2 − 1) + c_4·(1/2)·(n(n − 1)/2)
         = ((c_3 + c_4)/4)·n² + (c_1 + c_2 + c_3/4 − c_4/4 + c_5)·n − (c_2 + c_3/2 + c_5)

In the average case, T̄(n) is a quadratic function of n.

2.2. Bubble sort


Basic ideas
compare neighbouring elements only
exchange values if they are not in sorted order
repeat until array is sorted

Algorithm

Algorithm                                   cost    repetitions
BUBBLESORT(A:Array[1..n]) {
  for i from 1 to n do {                    c_1     n + 1
    for j from n downto i+1 do {            c_2     Σ_{i=1}^{n} (n − i + 1)
      if A[j] < A[j-1]                      c_3     Σ_{i=1}^{n} (n − i)
        then exchange A[j] and A[j-1]       c_4     Σ_{i=1}^{n} Σ_{j} t_{i,j}
    }
  }
}

where t_{i,j} = 0, if A[j] and A[j-1] are in correct order in the i-th loop, and
      t_{i,j} = 1 otherwise

Costs:
Best case: all t_{i,j} = 0

    T(n) = c_1·(n + 1) + c_2·Σ_{i=1}^{n} (n − i + 1) + c_3·Σ_{i=1}^{n} (n − i)
         = c_1·(n + 1) + c_2·Σ_{k=1}^{n} k + c_3·Σ_{k=1}^{n−1} k
         = c_1·(n + 1) + (c_2/2)·n(n + 1) + (c_3/2)·n(n − 1) ∈ Ω(n²)

Worst case: all t_{i,j} = 1


compared with the computation for the best case, we just have to replace c_3 by c_3 + c_4, and
we get:

    T(n) = c_1·(n + 1) + (c_2/2)·n(n + 1) + ((c_3 + c_4)/2)·n(n − 1) ∈ O(n²)

Thus, the number of operations of BUBBLESORT is Ω(n²) (even in the best case), and O(n²)
(even in the worst case). Hence,

    T(n) ∈ Θ(n²)


Correctness of BUBBLESORT:
To give a proof for the correctness of BUBBLESORT, we can for example use the following two
loop invariants:
after each cycle of the i-loop:
A[1..i] will contain the i smallest elements in sorted order.
after each cycle of the j-loop:
the smallest element of A[(j-1)..n] will be in A[j-1].
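The first of these invariants can be checked at runtime. The following Python sketch is our own 0-based transcription of BUBBLESORT, asserting after each i-cycle that the smallest elements have settled at the front:

```python
def bubblesort(A):
    """In-place BUBBLESORT, following the pseudocode above (0-based indices)."""
    n = len(A)
    for i in range(n):
        for j in range(n - 1, i, -1):
            if A[j] < A[j - 1]:
                A[j], A[j - 1] = A[j - 1], A[j]  # exchange neighbours
        # i-loop invariant: A[0..i] holds the i+1 smallest elements, sorted
        assert A[:i + 1] == sorted(A)[:i + 1]
    return A

assert bubblesort([5, 1, 4, 2, 8]) == [1, 2, 4, 5, 8]
```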

2.3. Mergesort
Basic idea: "divide-and-conquer"
Divide the problem into two (or more) subproblems:
split the array into two arrays of equal size
Conquer the subproblems by solving them recursively:
sort both arrays using MERGESORT.
Combine the solution of the subproblems:
merge the two sorted array to produce the entire sorted array.

2.3.1. Combining two sorted arrays: MERGE


MERGE (L:Array[1..p], R:Array[1..q], A:Array[1..n]) {
    // merge the two sorted arrays L and R into one sorted array A
    // we presume that n=p+q
    i:=1; j:=1;
    for k from 1 to n do {
        if i > p
            then { A[k]:=R[j]; j:=j+1; }
        else if j > q
            then { A[k]:=L[i]; i:=i+1; }
        else if L[i] < R[j]
            then { A[k]:=L[i]; i:=i+1; }
        else { A[k]:=R[j]; j:=j+1; }
    }
}

Loop invariant for correctness:


After each cycle of the for-loop:

A[1..k] will contain the k smallest elements of L and R combined;


L[i] and R[j] will be the smallest elements of L and R that have not been copied to A
yet.


Time complexity of MERGE:


    T_MERGE(n) ∈ Θ(n),

because the for-loop will be executed n times, and each loop cycle contains at most a constant
number of statements (1 copy statement, 1 increment statement, and at most 3 comparisons).

2.3.2. Divide-and-conquer: MERGESORT


MERGESORT(A:Array[1..n]) {
    if n > 1 then {
        m := ⌊n/2⌋;
        create array L[1..m];
        for i from 1 to m do { L[i] := A[i]; }
        create array R[1..n-m];
        for i from 1 to n-m do { R[i] := A[m+i]; }
        MERGESORT(L);
        MERGESORT(R);
        MERGE(L,R,A);
    }
}
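A Python transcription of MERGE and MERGESORT (our own sketch; for convenience it returns new lists instead of merging into a preallocated array A):

```python
def merge(L, R):
    """Merge two sorted lists into one sorted list (MERGE above)."""
    A, i, j = [], 0, 0
    for _ in range(len(L) + len(R)):
        if i >= len(L):                  # L exhausted: take from R
            A.append(R[j]); j += 1
        elif j >= len(R):                # R exhausted: take from L
            A.append(L[i]); i += 1
        elif L[i] < R[j]:                # take the smaller head element
            A.append(L[i]); i += 1
        else:
            A.append(R[j]); j += 1
    return A

def mergesort(A):
    """Divide-and-conquer sort (MERGESORT above)."""
    if len(A) <= 1:
        return A
    m = len(A) // 2
    return merge(mergesort(A[:m]), mergesort(A[m:]))

assert mergesort([5, 2, 4, 6, 1, 3, 2]) == [1, 2, 2, 3, 4, 5, 6]
```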

Time complexity of MERGESORT


The number of operations T_MS(n) needed to sort n elements by MERGESORT is:
constant for n ≤ 1: T_MS(n) = c_1
For n > 1:
c_2·n operations to create the arrays L and R, and to merge the two sorted arrays
T_MS(⌊n/2⌋) + T_MS(n − ⌊n/2⌋) operations to sort L and R using mergesort
For easier computation, we will assume that the number of elements in an array will always be a
power of 2: n = 2^k:

    T_MS(n) = c_1                    for n ≤ 1
    T_MS(n) = c_2·n + 2·T_MS(n/2)    for n > 1

Thus, we have to find a function that satisfies this so-called recurrence equation.

Solving the recurrence:


Assuming n = 2^k, i.e. k = log_2 n, we get:

    T_MS(n) = T_MS(2^k)
            = c_2·2^k + 2·T_MS(2^(k−1))
            = c_2·2^k + 2·c_2·2^(k−1) + 2²·T_MS(2^(k−2))
            = ...
            = c_2·2^k + 2·c_2·2^(k−1) + . . . + 2^(i−1)·c_2·2^(k−i+1) + 2^i·T_MS(2^(k−i))
            = ...
            = c_2·2^k + 2·c_2·2^(k−1) + . . . + 2^(k−1)·c_2·2^1 + 2^k·T_MS(1)
            = k·c_2·2^k + 2^k·c_1
            = c_2·n·log_2 n + c_1·n

    ⇒ T_MS(n) ∈ Θ(n log n)

Correctness of our solution:


To prove the correctness of our function T_MS(n) = c_2·n·log_2 n + c_1·n, we have to prove that it actually
satisfies the recurrence given above:
case n = 1:
c_2·1·log_2 1 + c_1·1 = c_1 (i.e. satisfied)
case n > 1:

    T_MS(n) = c_2·n·log_2 n + c_1·n
            = c_2·n·log_2(2·(n/2)) + 2·c_1·(n/2)
            = c_2·n·(1 + log_2(n/2)) + 2·c_1·(n/2)
            = c_2·n + c_2·n·log_2(n/2) + 2·c_1·(n/2)
            = c_2·n + 2·(c_2·(n/2)·log_2(n/2) + c_1·(n/2)) = c_2·n + 2·T_MS(n/2)


which finally proves the correctness of our result.
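The algebra can also be confirmed numerically for powers of 2. In this small Python sketch of ours, the constants c_1 and c_2 are arbitrary positive values:

```python
from math import log2

c1, c2 = 3.0, 5.0   # arbitrary positive constants for illustration

def T(n):
    """Candidate solution T_MS(n) = c2*n*log2(n) + c1*n."""
    return c2 * n * log2(n) + c1 * n

# base case and recurrence T(n) = c2*n + 2*T(n/2), checked for n = 2^k
assert abs(T(1) - c1) < 1e-12
for k in range(1, 15):
    n = 2 ** k
    assert abs(T(n) - (c2 * n + 2 * T(n // 2))) < 1e-6
```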

2.4. Quicksort
Divide-and-conquer algorithm:
Divide:
the elements of the input array A[p..r] are rearranged such that the subarray A[p..q]
contains no element that is larger than any element of A[q+1 .. r]. The respective
index q is computed by the partitioning procedure.
Conquer:
the subarray A[p..q] and A[q+1 .. r] are both sorted using quicksort
Combine:
both subarrays, A[p..q] and A[q+1 .. r], will then be sorted, and all elements of
A[p..q] will be less than or equal to the minimal element of A[q+1 ..r].
Therefore, the entire array A[p..r] is sorted, and no further work is necessary.

Remark:
Like INSERTSORT and BUBBLESORT, quicksort sorts "in place".

2.4.1. Partitioning the Array: PARTITION


The partitioning of the array is done with respect to a pivot element:
all elements that are smaller than the pivot element will be placed in the left partition;
all elements that are larger than the pivot element will be placed in the right partition.
The following partitioning algorithm chooses the first element of the array as pivot:

PARTITION (A:Array[p..r]) : Integer {
    // return value is q (last index of left partition)
    // x is the pivot:
    x := A[p];
    // let partitions grow towards each other:
    // i,j: current end of left/right partition
    i := p; j := r;
    while true do {
        // enlarge right partition until an element <= the pivot is found:
        while A[j]>x do { j := j-1; };
        // enlarge left partition until an element >= the pivot is found:
        while A[i]<x do { i := i+1; };
        if i < j then {
            // two non-fitting elements have been found at either end;
            // swap these elements, and let both partitions grow by one element:
            exchange A[i] and A[j];
            i := i+1; j := j-1;
        } else {
            // the partitions "meet", and the partitioning is complete:
            return j;
        }
    }
}
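The same two-pointer scheme can be written as a runnable sketch (Python, 0-based indices; names are ours):

```python
def partition(a, p, r):
    # Partition a[p..r] around the pivot x = a[p]; on return, every element
    # of a[p..q] is <= every element of a[q+1..r], where q is the result.
    x = a[p]
    i, j = p, r
    while True:
        while a[j] > x:        # grow the right partition
            j -= 1
        while a[i] < x:        # grow the left partition
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]   # swap the two non-fitting elements
            i += 1
            j -= 1
        else:
            return j           # the partitions meet
```

Note that for p < r the returned index q always satisfies p <= q < r, so both partitions are non-empty — the property quicksort's recursion relies on.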

Time complexity of PARTITION


The number of executed increment statements of i plus the number of executed decrement
statements of j is exactly r-p, because the loop terminates as soon as i ≥ j. As a
consequence:
T_PART(n) = Θ(n)
where n = r-p+1 (i.e. the number of array elements).

2.4.2. Quicksort

QUICKSORT (A:Array [p..r]) {
    if p<r then {
        q := PARTITION (A);
        QUICKSORT (A[p..q]);
        QUICKSORT (A[q+1..r]);
    }
}

Time Complexity of QUICKSORT:


The time complexity of QUICKSORT is given by the cost of the PARTITION algorithm, plus the
cost of the two recursive calls:
T(n) = T(l) + T(r) + T_PART(n)
where l and r denote the number of elements in the left and right partition, respectively.

Best-Case:
The best case occurs when each partitioning splits the array into two partitions of equal size.
Then:
T(n) = 2·T(n/2) + Θ(n) ⇒ T(n) ∈ Θ(n log n)
See the analysis of the time complexity of MERGESORT for the solution of the respective
recurrence.

Worst-Case:
The worst case occurs when each partitioning leads to one partition that contains only one
element. Then:
T(n) = T(n−1) + T(1) + Θ(n) = T(n−1) + Θ(n)
Applying this successively, we get:
T(n) = T(n−1) + Θ(n)
= T(n−2) + Θ(n−1) + Θ(n)
= ...
= T(1) + Θ(2) + ... + Θ(n−1) + Θ(n)
= Θ(Σ_{k=1}^{n} k) = Θ(n²)

Thus, in the best case, quicksort is (asymptotically) as fast as mergesort, but in the worst case, it
is as slow as insertsort, or bubblesort!

Remark


The worst case occurs if A is already in sorted order, or sorted in reverse order. QUICKSORT will
also be especially slow if the array is nearly sorted, because several of the partitions will then be
sorted arrays, and suffer from worst-case time complexity.

2.4.3. Randomized Quicksort


The Θ(n²) complexity of QUICKSORT for sorted arrays is a consequence of always choosing the
first element as pivot. Hence, we could try to improve QUICKSORT's time complexity by choosing
a random pivot element, for example.

RAND_PARTITION ( A: Array [p..r] ): Integer {
    // choose random integer i between p and r
    i := RANDOM(p,r);
    exchange A[i] and A[p]; // make the former A[i] the (new) pivot element
    // call PARTITION with the new pivot element (now at position p)
    q := PARTITION (A);
    return q;
}

RAND_QUICKSORT ( A:Array [p..r] ) {
    if p < r then {
        q := RAND_PARTITION(A);
        RAND_QUICKSORT (A[p..q]);
        RAND_QUICKSORT (A[q+1..r]);
    }
}
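A minimal executable sketch of the randomized variant (Python, 0-based; helper names are ours):

```python
import random

def _partition(a, p, r):
    # two-pointer partition of a[p..r] around the pivot a[p] (see PARTITION)
    x = a[p]
    i, j = p, r
    while True:
        while a[j] > x:
            j -= 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
        else:
            return j

def rand_quicksort(a, p=0, r=None):
    # RAND_QUICKSORT: move a random element to the front, partition, recurse.
    if r is None:
        r = len(a) - 1
    if p < r:
        k = random.randint(p, r)     # random pivot position (inclusive bounds)
        a[p], a[k] = a[k], a[p]      # make it the new pivot a[p]
        q = _partition(a, p, r)
        rand_quicksort(a, p, q)
        rand_quicksort(a, q + 1, r)
```

Since the partition index q always satisfies p <= q < r, both recursive calls operate on strictly smaller subarrays and the recursion terminates for every random choice.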

Time Complexity of RAND_QUICKSORT:


RAND_QUICKSORT may still produce the worst (or best) partitioning in each step. Therefore:
Best Case: T(n) ∈ Θ(n log n)
Worst Case: T(n) ∈ Θ(n²)
However:
it is no longer determined which input sequence (sorted order, reverse order) will lead to
worst-case behaviour (or best-case behaviour);
the same input sequence might lead to the worst case or the best case, depending on the
random choice of pivot elements.
Thus, only the average-case complexity is of interest!

Average-case complexity of RAND_QUICKSORT:


We make some model assumptions for our computation:
1. The probability that RANDOM(p,r) will return the value i shall be 1/n for all i (n = r-p+1).


2. A[i] ≠ A[j] for all i ≠ j.

Otherwise, the analysis would be much more complicated (but eventually lead to the same result).

Basic idea:
Assume that A contains k numbers that are smaller than the pivot element. Then:
for 0 < k < n, PARTITION will create partitions of size k and n − k.
for k = 0, PARTITION will create partitions of size 1 and n − 1.
the probability of k = 0, k = 1, ..., k = n − 1 is always 1/n.

Average time complexity:


To compute the average time complexity, we have to sum up the time complexities of all different
cases, multiplied by the respective probabilities:

T̄(n) = Σ_{all cases i} P(case i)·T(case i)

For RAND_QUICKSORT, we get:


T̄(n) = (1/n)·(T̄(1) + T̄(n−1)) + (1/n)·Σ_{k=1}^{n−1} (T̄(k) + T̄(n−k)) + Θ(n)

In the worst case, T̄(n−1) ∈ O(n²), so we get:

(1/n)·(T̄(1) + T̄(n−1)) = (1/n)·(Θ(1) + O(n²)) ∈ O(n)

and therefore:

T̄(n) = (1/n)·Σ_{k=1}^{n−1} (T̄(k) + T̄(n−k)) + Θ(n) = (2/n)·Σ_{k=1}^{n−1} T̄(k) + Θ(n)

Solving this recurrence equation (see separate page) leads to

T̄(n) ∈ Θ(n log n)

Further Variants of Quicksort


Following the idea of randomized partitioning, we can design increasingly clever algorithms to produce
balanced partitions that lead to fast quicksort algorithms. However, most of these partitioning
algorithms cannot guarantee a Θ(n log n) complexity.
Median-of-3-partitioning
Randomly choose a set of 3 elements from the array A, and take their median ("middle
element") as the pivot:
this leads to an improved likelihood of getting a balanced partitioning. However, as in the
worst case still only 2 elements might be "split off" during each partition, the respective
quicksort algorithm also has a O(n 2) complexity in the worst case (similar for
median-of-5-partitioning, etc.).


Median-partitioning
Choosing the median element of the array A would, of course, lead to a perfectly balanced
partitioning in each step, and therefore to a guaranteed Θ(n log n) time complexity.
However, this requires a fast algorithm to find the median of an array (see chapter 3).

Exercise:
Given the recurrence
T(1) = 1
T(n) = (2/n)·Σ_{k=1}^{n−1} T(k) + c·n,
show that
T(n) ∈ O(n log n)

Proof by induction:
Claim:
T(n) ≤ 4(c+1)·n·log n + 1

Case n = 1:
T(1) = 1, and 4(c+1)·1·log 1 + 1 = 4(c+1)·0 + 1 = 1.
Induction step (n−1 → n):
using the induction assumption, T(k) ≤ 4(c+1)·k·log k + 1 for all k = 1, ..., n−1, we may
conclude that:
T(n) = (2/n)·Σ_{k=1}^{n−1} T(k) + c·n
≤ (2/n)·Σ_{k=1}^{n−1} (4(c+1)·k·log k + 1) + c·n
≤ (8(c+1)/n)·Σ_{k=1}^{n−1} (k·log k) + 2 + c·n
We use the lemma
Σ_{k=1}^{n−1} (k·log k) ≤ (1/2)·n²·log n − n²/8
to get
T(n) ≤ (8(c+1)/n)·((1/2)·n²·log n − n²/8) + 2 + c·n
= 4(c+1)·n·log n − (c+1)·n + 2 + c·n
= 4(c+1)·n·log n + 2 − n
≤ 4(c+1)·n·log n + 1

Thus, our proof by induction is complete. What remains to be done is to prove the lemma
Σ_{k=1}^{n−1} (k·log k) ≤ (1/2)·n²·log n − n²/8


Proof of lemma:
We set m := ⌈n/2⌉, and split the sum into two halves:
Σ_{k=1}^{n−1} (k·log k) = Σ_{k=1}^{m−1} (k·log k) + Σ_{k=m}^{n−1} (k·log k)
≤ Σ_{k=1}^{m−1} k·log(n/2) + Σ_{k=m}^{n−1} k·log n
= (log n − 1)·Σ_{k=1}^{m−1} k + log n·Σ_{k=m}^{n−1} k
= log n·Σ_{k=1}^{n−1} k − Σ_{k=1}^{m−1} k
≤ (1/2)·log n·n(n−1) − (1/2)·m(m−1)
≤ (1/2)·n²·log n − (1/2)·(n/2)·((n/2) − 1)
≤ (1/2)·n²·log n − n²/8 (for n ≥ 2)

2.5. Heapsort
2.5.1. Heaps
A (binary) heap is a data structure based on a binary tree, but stored in an array. In any
node of the tree, the stored value shall be greater than (or equal to) the values stored in its two child nodes.

Example:
(figure: a binary max-heap drawn as a tree, together with its array representation beginning 15, 13, 11, ...)


Remarks:
the binary tree has to be completely filled, i.e. "missing nodes" may only occur on the lowest
level, and not left of any other node on the lowest level;
the numbering of the nodes of the tree is called "breadth-first numbering".

Observation:
There are three simple functions to compute the index of the parent, the left son, and the right
son of a node i:

PARENT (i:Integer) : Integer {


return i/2;
}
LEFT (i:Integer) : Integer {
return 2*i;
}
RIGHT (i:Integer) : Integer {
return 2*i+1;
}

Definition of "heap", and "heap property":


Let A[1..n] be an array of n integers.
An element A[i] satisfies the heap property, if A[PARENT(i)] ≥ A[i].
The array A is called a heap, if all elements A[i], i = 2, 3, ..., n, satisfy the heap
property.
The element A[1], of course, doesn't need to satisfy the heap property, because it doesn't have
any parent!

2.5.2. Maintaining the heap property


The following algorithm, HEAPIFY, restores the heap property in the subheap starting from a
given node i. It assumes that the two subheaps starting from LEFT(i) and RIGHT(i) already
satisfy the heap property.

HEAPIFY (A:Array[1..n], heapsize:Integer, i:Integer) {
    // we assume that heapsize is at least 1, and at most n
    l := LEFT(i); r := RIGHT(i);
    // determine maximum of node i and its left son
    if l ≤ heapsize and A[l] > A[i]
    then { largest := l; }
    else { largest := i; };


    // compare the larger one of both with the right son
    if r ≤ heapsize and A[r] > A[largest]
    then { largest := r; }
    // restore heap property if necessary
    if largest <> i then {
        exchange A[i] and A[largest];
        HEAPIFY(A, heapsize, largest);
    }
}

HEAPIFY first determines which of the two children of node i is the larger one. If this child is
larger than A[i], it is exchanged with its parent. As this might violate the heap property in the
respective child node, HEAPIFY is recursively called for that subheap.
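The sift-down step can be sketched directly in Python; here we keep the notes' 1-based node numbering and map node k onto list index k−1 (names are ours):

```python
def heapify(a, heapsize, i):
    # Restore the max-heap property at node i, assuming the subtrees
    # rooted at its children already satisfy it (1-based node numbers).
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heapsize and a[l - 1] > a[largest - 1]:
        largest = l                       # left child is larger
    if r <= heapsize and a[r - 1] > a[largest - 1]:
        largest = r                       # right child is the largest
    if largest != i:
        # swap with the larger child and continue sifting down
        a[i - 1], a[largest - 1] = a[largest - 1], a[i - 1]
        heapify(a, heapsize, largest)
```

Each recursive call descends one level, matching the O(h) cost argument below.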

Time complexity of HEAPIFY


HEAPIFY requires a constant number of operations plus the cost of the recursive call. With each
recursive call, we "descend" one level in the tree. Thus, the total cost of HEAPIFY is O(h),
where h is the height of the node i in the tree.
Therefore, the time complexity of HEAPIFY is O(log n).

2.5.3. Building a heap


We can build a heap out of an arbitrary array by calling HEAPIFY for each element. We have to
start from the lowest levels and go "upwards" in the tree in order to not destroy the heap property
again.

BUILDHEAP (A:Array[1..n]) {
heapsize := n;
for i from n/2 downto 1 do {
HEAPIFY(A, heapsize, i);
}
}
We skip all elements of the lowest level, because they simply don't have any children that might
not satisfy the heap property.

Time complexity of BUILDHEAP:


If h_i denotes the height of element A[i] in the tree, then

T(n) = Σ_{i=1}^{n} Θ(h_i)

because HEAPIFY's time complexity is O(h).


We can actually compute this sum, if we do not sum up each array element separately, but
sum over the levels of the heap, combining the elements of one level into one term.
On height k, a binary tree of n elements can have at most ⌈n/2^(k+1)⌉ nodes, so we can write:

T(n) = Σ_{k=1}^{h} ⌈n/2^(k+1)⌉·Θ(k) ≤ c·n·Σ_{k=1}^{h} k/2^(k+1) = (c·n/2)·Σ_{k=1}^{h} k/2^k
≤ (c·n/2)·(1/2)/(1 − 1/2)² ∈ O(n)

(for some constant c), where we have made use of

Σ_{k=1}^{∞} k·x^k = x/(1 − x)² (here with x = 1/2).

The time complexity of BUILDHEAP is O(n)

2.5.4. Sorting with heaps: HEAPSORT


To sort an array using heaps and our heap algorithms, we can follow this scheme:
build a heap using BUILDHEAP;
the first element of the heap (i.e. the root of the respective tree) will then contain the largest
element of the entire array;
exchange the first and the last element;
reduce the heapsize by 1, and restore the heap property;
proceed in the same way until the heapsize is reduced to 1.

HEAPSORT (A:Array[1..n]) {
BUILDHEAP(A);
for heapsize from n downto 2 do {
exchange A[1] and A[heapsize];
HEAPIFY(A, heapsize-1, 1);
}
}
Each iteration of the for loop puts the largest of the elements in A[1..heapsize]
into A[heapsize] (after the exchange with A[1]). As the heapsize is then reduced, the
respective element will no longer be affected by the algorithm, and stays in its correct(!) position.
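The complete scheme, as a self-contained Python sketch (0-based indexing here, with an iterative sift-down; names are ours):

```python
def heapsort(a):
    # HEAPSORT: build a max-heap, then repeatedly move the maximum to the
    # end of the shrinking heap region.
    n = len(a)

    def sift_down(i, heapsize):
        # restore the max-heap property at node i (0-based children 2i+1, 2i+2)
        while True:
            l, r = 2 * i + 1, 2 * i + 2
            largest = i
            if l < heapsize and a[l] > a[largest]:
                largest = l
            if r < heapsize and a[r] > a[largest]:
                largest = r
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    for i in range(n // 2 - 1, -1, -1):      # BUILDHEAP: O(n)
        sift_down(i, n)
    for heapsize in range(n, 1, -1):         # n-1 extractions: O(n log n)
        a[0], a[heapsize - 1] = a[heapsize - 1], a[0]
        sift_down(0, heapsize - 1)
```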

Time complexity of HEAPSORT:


Using the results of the previous sections, we know that we need

Θ(n) operations to build the initial heap using BUILDHEAP,

plus (n − 1)·O(log n) operations for the n − 1 calls to HEAPIFY.
The time complexity of HEAPSORT is T(n) ∈ Θ(n log n)

2.6. Lower bounds for comparison sorts


Comparison sorts are sorting algorithms that use only comparisons (i.e. tests like ≤,
=, <, ...) to determine the relative order of the elements.

Examples:
Mergesort, Quicksort, Heapsort, Insertsort, and Bubblesort are all comparison sorts.

2.6.1. Decision Trees


A decision tree is a binary tree in which each internal node is annotated by a
comparison of two elements. The leaves of the decision tree are annotated by the
respective permutations that will put an input sequence into sorted order.

Example:
The following picture shows a decision tree for sorting 3 elements a_1, a_2, a_3

Observations:
Each comparison sort can be represented by a decision tree.
A decision tree can be used as a comparison sort, if every possible permutation is
annotated to at least one leaf of the tree.
To sort a sequence of n elements, a decision tree needs to have at least n! leaves.

2.6.2. A lower bound for the worst case


The worst-case complexity of a comparison sort that is represented by a decision tree is given by
the longest path from the root to a leaf of the tree (i.e. the height of the decision tree).
We know that a binary tree of height h (i.e. longest path h) has at most 2^h leaves.

Theorem:

Any decision tree that sorts n elements has height Ω(n log n).

Proof:
To sort n elements, a decision tree needs n! leaves, and a tree of height h has at most 2^h leaves.
⇒ using h comparisons in the worst case, we can sort n elements only if n! ≤ 2^h, or h ≥ log(n!).

Using the well-known inequality n! ≥ (n/2)^(n/2) (see exercise (b) on worksheet 2), we get

h ≥ log(n!) ≥ log((n/2)^(n/2)) = (n/2)·log(n/2)

which means that, in fact, h ∈ Ω(n log n).

Corollary:
Mergesort and heapsort are asymptotically optimal comparison sorts.
That means no comparison sort can be asymptotically faster than mergesort or heapsort. Thus, if
we want to find faster algorithms, we have to look for ways to determine the correct order of
elements that are not based on comparing the elements.
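The bound h ≥ log(n!) can be evaluated numerically; a quick sketch (Python, function name ours):

```python
import math

def decision_tree_lower_bound(n):
    # Minimum height of a binary decision tree with at least n! leaves,
    # i.e. a lower bound on the worst-case number of comparisons needed
    # to sort n elements.
    return math.ceil(math.log2(math.factorial(n)))
```

For example, any comparison sort needs at least 3 comparisons for 3 elements and at least 7 comparisons for 5 elements in the worst case.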

2.7. Sorting in linear time


The Ω(n log n) lower bound for comparison sorts may be beaten if we can make use of additional
information on the elements.

2.7.1. Counting sort:


Counting sort assumes that the input elements can only take one of k different values (without
loss of generality, we assume these values are {1,....,k}). It uses an array of k elements to directly
compute the final position of all elements sharing a common value:

COUNTING_SORT (A:Array [1..n], k: Integer) : Array[1..n]
{
    Create Array B[1..n];   // output array
    Create Array C[1..k];   // to compute final positions

    for i from 1 to k do { C[i]:=0 };
    for j from 1 to n do {
        C[A[j]] := C[A[j]] + 1
    };
    // C[i] now contains the number of elements equal to i


    for i from 2 to k do {
        C[i] := C[i-1] + C[i]
    };
    // C[i] now contains the number of elements ≤ i
    for j from n downto 1 do {
        pos := C[A[j]];         // determine final position of A[j]
        B[pos] := A[j];         // place A[j] at its correct position
        C[A[j]] := C[A[j]] - 1; // account for multiple elements of the same value
    }
    return B;
}

The time complexity of counting sort is Θ(k + n), as we can easily see from the number of iterations in
each for loop.
Thus, if k ∈ O(n), then counting sort only needs Θ(n) operations!
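The same algorithm as a runnable sketch (Python, values from {1, ..., k}; names are ours):

```python
def counting_sort(a, k):
    # Stable counting sort for a list a of integers in {1, ..., k}.
    n = len(a)
    c = [0] * (k + 1)           # c[0] unused; 1-based values
    for x in a:
        c[x] += 1               # c[i]: number of elements equal to i
    for i in range(2, k + 1):
        c[i] += c[i - 1]        # c[i]: number of elements <= i
    b = [0] * n
    for x in reversed(a):       # backwards scan keeps the sort stable
        b[c[x] - 1] = x         # final position of x (0-based)
        c[x] -= 1
    return b
```

The backwards traversal in the last loop is what makes the sort stable: of several equal elements, the rightmost one is placed at the highest remaining position.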

2.7.2. Radix sort


COUNTINGSORT is an example of a so-called "stable" sorting algorithm, i.e. the relative order of two elements
A[i] = A[j] will not be changed by the sorting algorithm.
Radix sort uses the representation of the input numbers in the decimal (hexadecimal, octal, ...)
system. It uses a stable sorting algorithm to sort by each digit separately, starting with the least
significant digit. It assumes that all input numbers have at most d digits.

RADIXSORT(A:Array[1,...,n], d:Integer, k:Integer)


{
for i from 1 to d do {
A := COUNTINGSORT(A,k,i);
// identical to COUNTINGSORT(A,k),
// but sort with respect to the i-th digit only.
}
}
COUNTINGSORT may be replaced by any other stable sorting algorithm.

Time complexity:
The time complexity of RADIXSORT is Θ(d·(n + k)).
If the decimal system is used, then k = 10, and the time complexity is Θ(d·n). Naturally, the Θ(d·n)
complexity also holds for the use of the binary (octal, hexadecimal, ...) system.
If, in addition, the size of the input numbers is bounded, then d is bounded, and the time
complexity is Θ(n).
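A runnable LSD radix sort, with the per-digit pass implemented as a stable counting sort (Python; names and the `base` parameter are ours):

```python
def radix_sort(a, d, base=10):
    # Sort non-negative integers with at most d digits in the given base,
    # using a stable counting sort on each digit, least significant first.
    for exp in range(d):
        div = base ** exp
        c = [0] * base
        for x in a:
            c[(x // div) % base] += 1      # count occurrences of each digit
        for i in range(1, base):
            c[i] += c[i - 1]               # prefix sums: final positions
        b = [0] * len(a)
        for x in reversed(a):              # backwards scan => stability
            digit = (x // div) % base
            b[c[digit] - 1] = x
            c[digit] -= 1
        a = b
    return a
```

Stability of each pass is essential: after pass i, the numbers are sorted with respect to their i least significant digits.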

2.7.3. Bucket sort


Bucket sort assumes that the input numbers are uniformly distributed over the whole range of
values (we assume [0,1) as the range of values).
Bucket sort uses an array of n so-called "buckets" to sort n numbers. A "bucket" shall be a data
structure that
can hold an arbitrary number of elements.
can insert a new element into the bucket in O(1) time.
can concatenate two buckets in O(1) time.
We can, for example, use linked lists as the data structure for the buckets.

BUCKETSORT (A:Array[1..n]) {
    Create Array B[0..n-1] of Buckets;
    // assume all Buckets B[i] are empty at first
    for i from 1 to n do {
        insert A[i] into Bucket B[⌊n·A[i]⌋];
    }
    for i from 0 to n-1 do {
        sort Bucket B[i];
    }
    concatenate the sorted Buckets B[0], B[1], ..., B[n-1]
}

We assume that we can concatenate the buckets in O(n) time, which should not be difficult to
achieve: we can, for example, simply copy all n elements into an output array.
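With Python lists as buckets, the scheme becomes (names are ours; input values assumed in [0, 1)):

```python
def bucket_sort(a):
    # Sort n numbers from [0, 1), assumed roughly uniformly distributed.
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[int(n * x)].append(x)   # bucket i covers [i/n, (i+1)/n)
    out = []
    for b in buckets:
        out.extend(sorted(b))           # sort each (on average tiny) bucket
    return out                          # concatenation of sorted buckets
```

Appending to a list and concatenating via `extend` are (amortized) O(1) per element, matching the bucket requirements listed above.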

Time complexity of Bucketsort:

Inserting n elements into n buckets will cost Θ(n) operations (Θ(1) per insertion). Concatenating n
buckets (with n elements altogether) will also cost Θ(n) operations.
The crucial question is therefore how many operations will be required by the second loop.
Best case analysis:
If each bucket contains exactly 1 element, then Θ(n) operations are required.
Worst case analysis:
If one single bucket contains all the elements, the time complexity will be that of the sorting
algorithm used to sort the buckets (thus Θ(n log n) is possible).
Thus, we need a detailed analysis of the average performance:

Average Case Analysis:


Let n_i be the number of elements in the i-th bucket.

The probability that this bucket will contain exactly k elements is:

P(n_i = k) = (n choose k)·(1/n)^k·(1 − 1/n)^(n−k)

From probability theory, we know (or we can look it up) the expected value and the variance:

E[n_i] = n·(1/n) = 1 and Var[n_i] = n·(1/n)·(1 − 1/n) = 1 − 1/n

Now, the expected time to sort all buckets, for example by using INSERTSORT, is:

Σ_{i=0}^{n−1} O(E[n_i²]) = O(Σ_{i=0}^{n−1} E[n_i²])

We can make use of the identity

E[n_i²] = (E[n_i])² + Var[n_i] = 1² + 1 − 1/n = Θ(1)

and we get that

O(Σ_{i=0}^{n−1} E[n_i²]) = O(Σ_{i=0}^{n−1} Θ(1)) = O(n)

Putting it all together, we may claim that:

the expected average running time of BUCKETSORT is Θ(n)

Insertion: Recurrences and how to solve them


A recurrence is an (in-)equality that defines (or characterizes) a function in terms of its
values on smaller arguments.

Example: Time complexity of MERGESORT


T(n) = c1 for n ≤ 1
T(n) = 2·T(n/2) + c2·n for n > 1

Remark:
Oftentimes, a recurrence will also be given using the Θ- (or O, Ω, ...) notation:
T(n) = Θ(1) for n ≤ 1
T(n) = 2·T(n/2) + Θ(n) for n > 1


1. The substitution method


step 1:
guess the type of the solution
step 2:
find the respective parameters, and prove that the resulting function satisfies the
recurrence

Example
We try to solve the recurrence above:
T(n) = c1 for n ≤ 1
T(n) = 2·T(n/2) + c2·n for n > 1

step 1:
we guess the type of the solution: T(n) = a·n·log2 n + b·n
step 2:
we determine the correct values for the parameters a and b, and prove that the resulting
function satisfies the recurrence:
for n = 1: T(1) = a·1·log2 1 + b·1 = b; the recurrence is therefore satisfied if b = c1.
for n > 1, we insert our proposed solution into the recurrence equation:
a·n·log2 n + b·n = 2·(a·(n/2)·log2(n/2) + b·(n/2)) + c2·n
⇔ a·n·log2 n + b·n = a·n·(log2 n − 1) + b·n + c2·n
⇔ 0 = −a·n + c2·n
⇔ a = c2
Therefore, the solution of the recurrence is T(n) = c2·n·log2 n + c1·n.
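The result of the substitution can also be checked numerically; a small sketch comparing the recurrence against the closed form for powers of 2 (Python, constants chosen arbitrarily):

```python
from math import log2

c1, c2 = 3.0, 2.0            # arbitrary positive constants

def t_recurrence(n):
    # T(1) = c1;  T(n) = 2*T(n/2) + c2*n  (n a power of 2)
    return c1 if n <= 1 else 2 * t_recurrence(n // 2) + c2 * n

def t_closed(n):
    # proposed solution T(n) = c2*n*log2(n) + c1*n
    return c2 * n * log2(n) + c1 * n
```

Both functions agree exactly on every n = 2^k, as the induction above guarantees.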

2. The recursion-tree method (or iteration method)


1. draw a tree of all recursive function calls;
2. state the local costs for each node (= function call) of the tree;
3. sum up the costs of all nodes on each level of the tree.

This may lead to:


a sum of costs-per-level that can be added up easily;
an easier recurrence for the costs-per-level;


a good guess for the substitution method.

Example:
See our analysis of the time complexity of MERGESORT in section 2.3.2 for an example of this
approach.

3. The master theorem


Theorem:
Let a ≥ 1 and b > 1 be constants, and f(n) be a function. Further, let T(n) be a function
on the positive integers defined by the recurrence
T(1) ∈ Θ(1)
T(n) = a·T(n/b) + f(n)

Then, T(n) can be bounded asymptotically as follows:

1. If f(n) ∈ O(n^(log_b a − ε)) for some ε > 0, then T(n) ∈ Θ(n^(log_b a)).
2. If f(n) ∈ Θ(n^(log_b a)), then T(n) ∈ Θ(n^(log_b a)·log n).
3. If f(n) ∈ Ω(n^(log_b a + ε)) for some ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1
and all sufficiently large n, then T(n) ∈ Θ(f(n)).
See: Cormen, et al.: Introduction to Algorithms.

Remarks:
If in the term a·T(n/b) the fraction n/b occurs as ⌈n/b⌉ or ⌊n/b⌋, the theorem still holds. Situations
like T(⌈n/2⌉) + T(⌊n/2⌋) (compare the analysis of mergesort) are also covered.
The master theorem will therefore cover many (but not all) of the recurrences we encounter
in the analysis of divide-and-conquer algorithms.

Proof:
The proof of the master theorem is left as an exercise for the reader¹. Some technicalities of the
proof, however, need some short comments:
in case 1, f(n) ∈ O(n^(log_b a)) is not sufficient;
in case 3, f(n) ∈ Ω(n^(log_b a)) is not sufficient.
Instead, f(n) has to be polynomially smaller (or larger) than n^(log_b a), that means f(n) ∈ O
(n^(log_b a)/n^ε), or f(n) ∈ Ω(n^(log_b a)·n^ε), respectively, where ε has to be a positive (non-zero)
constant. Otherwise, the master theorem is not applicable!


Interpreting the master theorem


For the most simple recurrence, T(n) = a·T(n/b) (i.e. f(n) = 0), we find a solution of the type T(n) =
n^(log_b a):

a·T(n/b) = a·(n/b)^(log_b a) = a·(n^(log_b a) / b^(log_b a)) = a·(n^(log_b a) / a) = n^(log_b a) = T(n)

The master theorem compares the non-recursive part of the costs, f(n), with this solution (the
recursive part):
in case 1, the costs of the recursion dominate, such that T(n) ∈ Θ(n^(log_b a));
in case 2, the costs of the recursion and of the combination are about the same, therefore
T(n) ∈ Θ(n^(log_b a)·log n);
in case 3, the costs for the non-recursive combination dominate: T(n) ∈ Θ(f(n)).

Examples:
Mergesort:
T(n) = 2·T(n/2) + f(n), where f(n) ∈ Θ(n)
To apply the master theorem, we set a = 2 and b = 2, which means that n^(log_b a) = n^(log_2 2) = n.
As f(n) ∈ Θ(n), case 2 of the master theorem applies, and therefore T(n) ∈ Θ(n log n), as
expected.
Other recurrence:
T(n) = 2·T(2n/3) + f(n), where f(n) ∈ Θ(n)
Here a = 2 and b = 3/2, and consequently n^(log_b a) = n^(log_{3/2} 2) ≈ n^1.709...
f(n) ∈ Θ(n) implies f(n) ∈ O(n^1.7), such that case 1 of the master theorem applies.
Therefore T(n) ∈ Θ(n^(log_{3/2} 2)) = Θ(n^1.709...).

Expensive combination: (what if MERGE required Θ(n²) operations?)

T(n) = 2·T(n/2) + f(n), where f(n) ∈ Θ(n²)

a = 2 and b = 2, such that n^(log_b a) = n^(log_2 2) = n.
Then f(n) ∈ Ω(n^(1+ε)) for any 0 < ε ≤ 1. Therefore, case 3 applies, and T(n) ∈ Θ(f(n)) =
Θ(n²).
Less expensive combination: (what if MERGE required Θ(n log n) operations?)

T(n) = 2·T(n/2) + f(n), where f(n) ∈ Θ(n log n)

Again, a = 2 and b = 2, and n^(log_b a) = n^(log_2 2) = n.
f(n) ∈ Θ(n log n) implies f(n) ∈ Ω(n); however, f(n) ∉ Ω(n^(1+ε)) for any ε > 0!
Therefore, the master theorem does not apply in this case.
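The case-3 prediction can be illustrated numerically: for T(n) = 2·T(n/2) + n², the ratio T(n)/n² should settle at a constant (a sketch; the memoization is only there to keep repeated evaluations cheap):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def t(n):
    # T(n) = 2*T(n/2) + n^2 with T(1) = 1 -- case 3 of the master theorem,
    # since f(n) = n^2 is polynomially larger than n^(log_2 2) = n.
    return 1 if n <= 1 else 2 * t(n // 2) + n * n
```

Evaluating t(n)/n² for growing powers of 2 shows the ratio approaching 2, consistent with T(n) ∈ Θ(n²) = Θ(f(n)).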
¹ just kidding ... the proof of the master theorem is beyond the scope of this course. We
therefore refer to Cormen et al.


3. Selecting
Definition: (selection problem)
Input:
a set A of n (distinct) numbers, and a number i, with 1 ≤ i ≤ n.
Output:
the element y ∈ A that is larger than exactly i − 1 other elements of A.

Straightforward algorithm:

sort array A
return element A[i] of the sorted array

Time complexity: O(n log n) for arbitrary A and i.

Question
Is there a faster (i.e. O(n)) algorithm?

3.1 Minimum and Maximum


We can easily give an O(n)-algorithm to find the minimum, or the maximum of an array:

MINIMUM(A:Array[1..n]) : Integer {
    min := A[1];
    for i from 2 to n do {
        if min > A[i] then min := A[i];
    }
    return min;
}
MAXIMUM(A:Array[1..n]) : Integer {
    max := A[1];
    for i from 2 to n do {
        if max < A[i] then max := A[i];
    }
    return max;
}
It is also reasonably easy to imagine an O(n)-algorithm that finds the second (third, fourth, . . .)
smallest/largest element of an array: in contrast to the algorithms MINIMUM, and MAXIMUM we
would have to store the two (three, four, . . .) currently smallest/largest elements in respective


variables min1, min2 (min3, min4, . . .), and update them accordingly while we loop through
the array A.
Hence, the crucial question is, whether we can transfer this O(n)-performance to an algorithm that
finds the i-th largest/smallest element, where i is given as a parameter.

3.2 Quickselect
Quickselect adopts the partitioning idea of quicksort for the selection problem. We will use the
algorithmn RAND_PARTITION for the partitioning.

RANDSELECT(A:Array[p..r], i:Integer) : Integer {
    if p=r then return A[p];
    // partition the array
    q := RAND_PARTITION(A);
    k := q-p+1; // there are k elements in the first partition
    if i ≤ k
    then // the i-th smallest element is in the first partition
        return RANDSELECT(A[p..q], i)
    else // the i-th smallest element is in the second partition
        // (take into account the k elements in the first partition)
        return RANDSELECT(A[q+1..r], i-k);
}
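An executable sketch of the same idea, written iteratively and operating on a copy (Python; names are ours):

```python
import random

def rand_select(a, i):
    # Return the i-th smallest element (1-based i) of the sequence a.
    a = list(a)                        # work on a copy
    p, r = 0, len(a) - 1
    while p < r:
        k = random.randint(p, r)       # randomized pivot choice
        a[p], a[k] = a[k], a[p]
        x = a[p]
        lo, hi = p, r
        while True:                    # partition a[p..r] around x
            while a[hi] > x:
                hi -= 1
            while a[lo] < x:
                lo += 1
            if lo < hi:
                a[lo], a[hi] = a[hi], a[lo]
                lo += 1
                hi -= 1
            else:
                break
        q = hi
        m = q - p + 1                  # size of the left partition
        if i <= m:
            r = q                      # continue in the left partition
        else:
            p, i = q + 1, i - m        # continue in the right partition
    return a[p]
```

Unlike quicksort, only one side of the partition is pursued, which is what brings the average cost down to O(n).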

Time complexity of RANDSELECT:


Worst case:
still O(n²), if we are unlucky and pick the smallest/largest element as pivot in each step.
Best case:
O(n), because (if we are very lucky) we might find the respective element in only two
partitioning steps.
However, as for quicksort, the best-case and worst-case complexities are not of much interest for
us. We have to examine the average case instead.

"Average worst-case complexity" of RANDSELECT:


From the analysis of quicksort, we know:
partition sizes will be 1 and n-1 with probability 2/n;
partition sizes will be k and n-k with probability 1/n.

The cost of randselect is O(n) for the partitioning plus the cost for one recursive call (instead of
two for quicksort). The cost of the recursive call is determined by the size of the subarray. We
want to compute the average worst-case complexity, so we
assume the worst case in the sense that we assume that we always have to make the

recursive call for the larger subarray;


compute the expected value of the resulting cost, using the probabilities for the partition
sizes given above.
This leads to the following estimate:
T̄(n) ≤ (1/n)·(T̄(max{1, n−1}) + Σ_{k=1}^{n−1} T̄(max{k, n−k})) + O(n)
≤ (1/n)·T̄(n−1) + (2/n)·Σ_{k=⌈n/2⌉}^{n−1} T̄(k) + O(n)
= (2/n)·Σ_{k=⌈n/2⌉}^{n−1} T̄(k) + O(n), because T̄(n−1) ∈ O(n²) and thus (1/n)·T̄(n−1) ∈ O(n)

We will solve the recurrence

T̄(n) ≤ (2/n)·Σ_{k=⌈n/2⌉}^{n−1} T̄(k) + O(n)

by substitution:
assume that the O(n)-term is d·n;
guess T̄(n) ≤ c·n as the solution.


Inserting these assumptions into the recurrence, we get:

T̄(n) ≤ (2/n)·Σ_{k=⌈n/2⌉}^{n−1} T̄(k) + O(n)
≤ (2/n)·Σ_{k=⌈n/2⌉}^{n−1} c·k + d·n
= (2c/n)·(Σ_{k=1}^{n−1} k − Σ_{k=1}^{⌈n/2⌉−1} k) + d·n
≤ (2c/n)·((1/2)·n·(n−1) − (1/2)·(n/2)·((n/2) − 1)) + d·n
= c·(n−1) − (c/2)·((n/2) − 1) + d·n
= (3/4)·c·n − c/2 + d·n
= c·n + (d·n − (c/4)·n − c/2)

Now, we pick c large enough such that d·n − (c/4)·n − c/2 < 0, which is equivalent to (c/4)·n + c/2 > d·n.


Then T̄(n) ≤ c·n, or in plain words:

The average worst-case complexity of RANDSELECT is T̄(n) ∈ O(n).

3.3 The BFPRT algorithm


(named after M. Blum, R.W. Floyd, V.R. Pratt, R.L. Rivest, and R.E. Tarjan)
In quickselect, the partitioning can still be very bad.
In BFPRT, the partitioning is done with respect to a so-called "median of medians":
the array (size n) is split into ⌈n/5⌉ subarrays of at most 5 elements;
for each subarray, the median is computed;
the median of these medians is taken as the pivot element for the partitioning;
to compute the "median of the medians", we use the BFPRT algorithm recursively;
if the array (of the recursive call) has fewer than CONST elements, sort the array to solve the
selection problem; CONST has to be determined such that selection by sorting is cheaper than using
BFPRT_SELECT.

BFPRT_SELECT( A:Array[1..n], i:Integer) : Integer {
    // if n is small, solve selection problem by sorting ...
    if n<CONST then return SORT_SELECT(A,i);
    cols := ⌈n/5⌉;
    create Array M[1..cols];
    for j from 1 to cols do {
        M[j] := MEDIAN_5(A, 5*(j-1)+1, min(5*j, n));
    }
    x := BFPRT_SELECT(M, ⌈cols/2⌉);
    k := PIVOT_PARTITION(A,x);
    if i ≤ k
    then return BFPRT_SELECT(A[1..k], i)
    else return BFPRT_SELECT(A[k+1..n], i-k);
}

MEDIAN_5 (A:Array[1..n], lo:Integer, hi:Integer) : Integer {
    // return value of the median element of the subarray A[lo..hi]
    // use for example an algorithm that's based on a decision tree
}
PIVOT_PARTITION (A:Array[1..n], x:Integer) : Integer {
    // modified partitioning algorithm (see QUICKSORT)
    // where x is explicitly given as pivot element
}
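A compact executable sketch of the median-of-medians selection (Python; for clarity, PIVOT_PARTITION is replaced here by three-way list splitting, and the names and the small-array cutoff are ours):

```python
def bfprt_select(a, i):
    # Return the i-th smallest element (1-based i), worst-case linear time.
    a = list(a)
    if len(a) <= 10:                          # small input: sort directly
        return sorted(a)[i - 1]
    # medians of the groups of (at most) 5 elements
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # median of medians, computed recursively, as the pivot
    x = bfprt_select(medians, (len(medians) + 1) // 2)
    # three-way partition around the pivot x
    less = [y for y in a if y < x]
    equal = [y for y in a if y == x]
    greater = [y for y in a if y > x]
    if i <= len(less):
        return bfprt_select(less, i)
    if i <= len(less) + len(equal):
        return x
    return bfprt_select(greater, i - len(less) - len(equal))
```

The recursion on `medians` has size ⌈n/5⌉, and the pivot quality bound derived below guarantees that `less` and `greater` each contain at most roughly 7n/10 elements.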


Time complexity of BFPRT_SELECT


Analyse the sizes of the partitions:
at least half of the "medians of five" are larger than the "median of medians" x;
thus, at least half of the ⌈n/5⌉ groups contribute 3 elements (including their medians) larger
than x, except the group containing x, and the last group (if n is not a multiple of 5);
⇒ at least 3·(⌈(1/2)·⌈n/5⌉⌉ − 2) ≥ (3n/10) − 6 elements are larger than x;
⇒ in the worst case, BFPRT_SELECT will be called recursively for at most (7n/10) + 6 elements.

Thus,

T(n) ∈ Θ(1) for n ≤ CONST
T(n) ≤ T(⌈n/5⌉) + T((7n/10) + 6) + O(n) for n > CONST

Solving the recurrence


This recurrence can, for example, be solved by substitution:
We guess: T(n) ≤ c·n for some constant c (without loss of generality, we may assume that T(n) ≤ c·n
for n ≤ CONST). Then:

T(n) ≤ c·⌈n/5⌉ + c·((7n/10) + 6) + O(n)
     ≤ c·(n/5 + 1) + (7/10)·c·n + 6·c + O(n)
     = (9/10)·c·n + 7·c + O(n)
     = c·n - ((1/10)·c·n - 7·c - O(n))

Using the respective constant hidden in the O(n)-term, we can pick c large enough such that
(1/10)·c·n - 7·c - O(n) ≥ 0 for all n > CONST.

For that c, T(n) ≤ c·n, which means that BFPRT_SELECT has a linear time complexity!

Corollary
A variant of quicksort that partitions the array around the median computed by
BFPRT_SELECT will have a guaranteed (worst-case) time complexity of O(n log n).

4. Searching

Definition: (Searching problem)


Input:
A sequence of n numbers (objects) A = (a_1, a_2, . . ., a_n), and a value (key) x.
Output:
An index i such that x = a_i, or the special value NIL if x does not occur in A.

4.1. Simple Searching


4.1.1. Sequential Searching in arrays
If the sequence A is stored in an array A, the simplest search algorithm is based on a
traversal of all elements:

SEQSEARCH ( A: Array [1..n], x: Integer ) : Integer {

    for i from 1 to n do {
        if x = A[i] then return i;
    }

    // at this point, we know that x is not present in A:
    return NIL;
}

Time complexity of SEQSEARCH:


As a measure for the time complexity of SEQSEARCH, we will count the number of
comparisons.

Worst case:
In the worst case, we will compare each element of A with x ⇒ n comparisons.
The worst case applies every time x does not occur in A!

Average case:
We assume that x does occur in A, and that the probability that x = A[i] is 1/n, independent of the
index i.

If x = A[i], we will stop after i comparisons. Hence, the expected number of comparisons is

C̄(n) = (1/n) · Σ_{i=1..n} i = (1/n) · n(n+1)/2 = (n+1)/2


Our result is not too surprising: on average, we have to search about half of the array to find a
specific element.
Remember that this is only true if we assume that the searched element actually occurs in the
array. If we cannot be sure about that, we have to make an assumption about the probability of
finding an element in an array. This, of course, heavily depends on the scenario where searching
is needed.
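The (n+1)/2 result can be checked directly by averaging the comparison counts over all n equally likely positions of the searched key. A small Python sketch (the function name and the choice n = 101 are illustrative):

```python
def seq_search_comparisons(a, x):
    """Count the key comparisons sequential search needs until x is found."""
    for count, value in enumerate(a, start=1):
        if value == x:
            return count
    return len(a)  # x not present: n comparisons

n = 101
a = list(range(1, n + 1))
# average over all n equally likely positions of the searched key
avg = sum(seq_search_comparisons(a, x) for x in a) / n
# avg matches the analytic result (n + 1) / 2
```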

4.1.2. Searching in sorted arrays


In a sorted array, we can stop searching as soon as an element A[i] is larger than x (if A is
sorted in ascending order):

SORTSEARCH (A: Array [1..n], x:Integer): Integer {

    // assume that A is sorted in ascending order
    i := 1;
    while i <= n and x > A[i] do {
        i := i+1;
    }

    // at this point, A[i] is the first element >= x (if i <= n)

    if i <= n and x = A[i]
        then return i
        else return NIL;
}

Time complexity (number of comparisons) of SORTSEARCH


Worst case:
The worst case occurs when x is either the last element, or larger than the last element (i.e.
larger than all elements).
Then, n + 1 comparisons (n times >, once =) are required.

Average case:
We assume that the probability that x lies in one of the intervals ]-∞, a_1], ]a_1, a_2], . . .,
]a_{n-1}, a_n], ]a_n, +∞[ is exactly 1/(n+1) for each interval.
Then the number of comparisons performed by SORTSEARCH is

C̄(n) = 1/(n+1) · ( Σ_{i=1..n} (i+1) + (n+1) )
     = 1/(n+1) · ( Σ_{i=1..n+1} (i+1) - 1 )
     = 1/(n+1) · ( Σ_{i=1..n+2} i - 2 )
     = 1/(n+1) · ( (n+2)(n+3)/2 - 2 )
     = (n² + 5n + 6 - 4) / (2(n+1))
     = (n² + 5n + 2) / (2(n+1)) ≈ n/2


As in an unsorted array, the average number of comparisons is about half the number of the
array's elements. The important difference, however, is that for the sorted array, the result also
holds if the searched element does not occur in the array.
Thus, it looks like the extra effort of sorting the array does not improve the situation too much.
However, this is not true, as we will see in the next section.

4.1.3. Binary Searching in Sorted Arrays


In a sorted array we can use the divide-and-conquer technique to come up with a much faster
algorithm:

BINSEARCH (A: Array[p..r], x:Integer) : Integer {

    if p = r
        then {
            if x = A[p] then return p
            else return NIL;
        }
        else {
            m := ⌊(p+r)/2⌋;
            if x <= A[m]
                then return BINSEARCH(A[p..m], x)
                else return BINSEARCH(A[m+1..r], x);
        }
}

Time complexity (number of comparisons) of BINSEARCH


If A has just one element, BINSEARCH requires 1 comparison:
C(1) = 1
If A has n elements (n > 1), we need
1 comparison for deciding in which part of A to look,
plus at most C(⌈n/2⌉) comparisons to search x in the respective part of A:
C(n) = C(⌈n/2⌉) + 1 for n > 1

Solve recurrence:
case n = 2^k:
We can find the solution by iteration:
C(n) = C(n/2) + 1 = C(n/4) + 2 = . . . = C(n/2^k) + k = C(1) + k = k + 1
As k = log_2(n), we get C(n) = 1 + log_2(n).
case 2^(k-1) < n ≤ 2^k:

Here, we may use that the number of comparisons is monotonically increasing with the
number of elements in the array (strictly speaking, we would have to prove this . . .), thus
C(n) ≤ C(2^k) = k + 1.
As k = ⌈log_2(n)⌉, we get C(n) ≤ 1 + ⌈log_2(n)⌉.
As an overall result, we get:
C(n) ≤ 1 + ⌈log_2(n)⌉ ∈ O(log n) for all n.

Remarks:
What happens if we have to insert/delete elements in our sequence?
⇒ re-sorting of the sequence is required (possibly O(n log n) work).
⇒ Searching is therefore closely related to choosing appropriate data structures for inserting
and deleting elements!

4.2. Binary Search Trees:


A binary tree is either an empty tree, or an object (record, struct, . . .) consisting of a
key, a reference (pointer, . . .) to a left son, and a reference (pointer, . . .) to a right son.
The left and right sons are, again, binary trees.

Data Structure:
BinTree := emptyTree
         | (key : Integer;
            leftSon : BinTree;
            rightSon : BinTree;
           );

Example:
x = (4, (2, emptyTree, emptyTree), (3, emptyTree, (5, emptyTree, emptyTree)))
Graph of x:

Notation:
x.key = 4
x.leftSon = (2, emptyTree, emptyTree)
x.rightSon.key = 3

Definition: "binary search tree"


A binary tree x is called a binary search tree, if it satisfies the following properties:
for all keys l that are stored in x.leftSon: l ≤ x.key;
for all keys r that are stored in x.rightSon: r ≥ x.key;
both x.leftSon and x.rightSon are again binary search trees.

Examples:
y = (3, (2, emptyTree, emptyTree), (4, emptyTree, (5, emptyTree, emptyTree)))
Graph of y:

z = (3, (2, emptyTree, emptyTree), (5, (4, emptyTree, emptyTree), emptyTree))


Graph of z:

4.2.1. Searching in a Binary Search Tree


TREE_SEARCH( x: BinTree, k:Integer) : BinTree {
    if x = emptyTree then return emptyTree;
    if x.key = k then return x;

    if x.key > k
        then return TREE_SEARCH(x.leftSon, k)
        else return TREE_SEARCH(x.rightSon, k);
}

TREE_SEARCH returns a subtree of x that contains the value k in its top node. If k does not
occur as a key value in x, then TREE_SEARCH returns an empty tree.

Iterative variant of TREE_SEARCH:


TREE_SEARCH_ITER( x: BinTree, k:Integer) : BinTree {
    t := x;    // t is a local copy for the tree traversal

    while t != emptyTree and t.key != k do {
        if k < t.key
            then t := t.leftSon
            else t := t.rightSon;
    }

    return t;
}


Number of comparisons performed by TREE_SEARCH (and TREE_SEARCH_ITER)


TREE_SEARCH performs 2 comparisons (3 if we count "x = emptyTree") plus the number
of comparisons induced by the recursive call with x.leftSon or x.rightSon as the
parameter.
⇒ each recursive call descends the search tree by one level.
⇒ TREE_SEARCH performs 2·l comparisons, if the key k is found on the l-th level.
⇒ in the worst case, TREE_SEARCH performs 2·h comparisons, where h is the height
of the tree.

Remarks:
For a fully balanced binary tree, we know that h ∈ O(log n) (n being the number of keys stored in
the tree).
Thus, TREE_SEARCH will perform O(log n) comparisons to find a certain key in a balanced
binary tree that contains n nodes/keys.
⇒ The main problem will be to build (and maintain) a balanced search tree.

4.2.2. Inserting and Deleting:


Inserting . . .
TREE_INSERT( x: BinTree, k:Integer) {
    // x is to be regarded as a "call-by-reference"-parameter!

    if x = emptyTree
        then
            x := (k, emptyTree, emptyTree);
        else
            if k < x.key
                then TREE_INSERT(x.leftSon, k)
                else TREE_INSERT(x.rightSon, k);
}

. . . and Deleting
While we can always add additional nodes to a binary tree, we cannot simply remove nodes
from a binary tree without destroying it:
if the respective node has only one non-empty subtree, we can delete the top node by
replacing the tree by the respective subtree (if both subtrees are empty, the tree is replaced
by an empty tree).
if the respective node has two non-empty subtrees, we cannot delete the node; instead we

replace its value by either

the largest value in the left subtree (which is in its rightmost node), or
the smallest value in the right subtree (which is in its leftmost node).
The rightmost (leftmost) node in the left (right) subtree will always have at most one
non-empty subtree ⇒ deleting this node is easy.

Deleting the left-most node


TREE_DEL_LEFTMOST (x:BinTree) : Integer {
    // return key value of leftmost node of x,
    // and delete the leftmost node

    if x.leftSon = emptyTree
        then {
            // we've found the leftmost node:
            k := x.key;
            x := x.rightSon;
            return k;
        }
        else
            return TREE_DEL_LEFTMOST (x.leftSon);
}

Deleting the top node


Using TREE_DEL_LEFTMOST, we get a simple algorithm that deletes the top node of a binary
tree:

TREE_DEL_TOPNODE (x:BinTree) {
    // assume that x is non-empty
    if x.rightSon = emptyTree
        then
            x := x.leftSon
        else {
            // delete leftmost node of the right son (and memorize its key)
            k := TREE_DEL_LEFTMOST (x.rightSon);
            // make the memorized key the new top node's key
            x.key := k;
        }
}

Deleting a specific node


And using TREE_DEL_TOPNODE, we get an algorithm that deletes a node with a certain key
value:

TREE_DELETE (x:BinTree, k:Integer) {


    if x = emptyTree then return;

    if x.key = k
        then TREE_DEL_TOPNODE(x);
        else
            if k < x.key
                then TREE_DELETE(x.leftSon, k)
                else TREE_DELETE(x.rightSon, k);
}
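The searching, inserting, and deleting routines above can be sketched together in Python. The dict-based node encoding and the returned-tree style (mimicking the call-by-reference parameters of the pseudocode) are illustrative choices:

```python
def tree_insert(x, k):
    """Insert key k; returns the (possibly new) tree."""
    if x is None:                                  # None plays the role of emptyTree
        return {'key': k, 'left': None, 'right': None}
    side = 'left' if k < x['key'] else 'right'
    x[side] = tree_insert(x[side], k)
    return x

def tree_search(x, k):
    """Return the subtree whose top node carries key k, or None."""
    if x is None or x['key'] == k:
        return x
    return tree_search(x['left'] if k < x['key'] else x['right'], k)

def tree_del_leftmost(x):
    """Return (key of the leftmost node, tree with that node removed)."""
    if x['left'] is None:
        return x['key'], x['right']
    k, x['left'] = tree_del_leftmost(x['left'])
    return k, x

def tree_delete(x, k):
    """Delete one node with key k; returns the resulting tree."""
    if x is None:
        return None
    if k < x['key']:
        x['left'] = tree_delete(x['left'], k)
    elif k > x['key']:
        x['right'] = tree_delete(x['right'], k)
    else:                                          # delete the top node of this subtree
        if x['right'] is None:
            return x['left']
        # replace the key by the smallest key of the right subtree
        x['key'], x['right'] = tree_del_leftmost(x['right'])
    return x
```

Deleting the root of the tree built from 4, 2, 6, 1, 3, 5, 7 promotes key 5 (the leftmost key of the right subtree) into the top node, exactly as described above.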

4.2.3. AVL Trees (Adelson-Velski and Landis)


The algorithms for searching/inserting/deleting elements in a binary search tree are fast (that
means O(log n)) only if the search trees are "balanced", i.e. the numbers of nodes in the two
subtrees are approximately equal.
AVL trees are "balanced" in the sense that the heights of the subtrees are balanced:

Definition: ("height balance" of a node)


Let h(x) be the height of a binary tree x.
Then, the height balance b(x.key) of a node x.key is defined as
b(x.key) = h(x.rightSon) - h(x.leftSon)
i.e. the difference of the heights of the two subtrees of x.key.

Definition: "AVL tree"


A binary search tree x is called an AVL tree, if:
b(x.key) ∈ {-1, 0, 1}, and

x.leftSon and x.rightSon are both AVL trees.


Thus, in an AVL tree, the height balance of every node must be -1, 0, or 1.

Example:


Estimating the number of nodes in an AVL tree


We have seen that the number of nodes can differ significantly if we do not place any restrictions
on the balance of a binary tree. We will therefore consider the following important question:
What is the maximal and minimal number of nodes that can be stored in an AVL tree of
given height h?

Maximal number:
A binary tree that is completely filled has a height balance of 0 for every node ⇒ it is an AVL
tree.
We already know that such a completely filled binary tree of height h has 2^h - 1 nodes.

Minimal number:
A "minimal" AVL tree of height h consists of
a root node,
one subtree that is a minimal AVL tree of height h-1, and
one subtree that is a minimal AVL tree of height h-2.
⇒ N_AVL,min(h) = 1 + N_AVL,min(h-1) + N_AVL,min(h-2)
In addition, we know that
a minimal AVL tree of height 1 has 1 node: N_AVL,min(1) = 1
a minimal AVL tree of height 2 has 2 nodes: N_AVL,min(2) = 2
This leads to the following recurrence:
N_AVL,min(1) = 1
N_AVL,min(2) = 2
N_AVL,min(h) = 1 + N_AVL,min(h-1) + N_AVL,min(h-2)
We compare this recurrence with that of the Fibonacci numbers:

f_0 = 1
f_1 = 1
f_h = f_{h-1} + f_{h-2}
Let's list the first couple of values:

h              1   2   3   4   5   6   7   8
f_h            1   2   3   5   8  13  21  34
N_AVL,min(h)   1   2   4   7  12  20  33  54

Looking at the values, we may boldly claim that

N_AVL,min(h) = f_{h+1} - 1

Proof by induction:
case h = 1:
N_AVL,min(1) = 1, and f_{1+1} - 1 = 2 - 1 = 1
case h = 2:
N_AVL,min(2) = 2, and f_{2+1} - 1 = 3 - 1 = 2
induction step (h-2, h-1) → h:
N_AVL,min(h) = 1 + N_AVL,min(h-1) + N_AVL,min(h-2)
             = 1 + (f_{(h-1)+1} - 1) + (f_{(h-2)+1} - 1)
             = f_h + f_{h-1} - 1
             = f_{h+1} - 1
which proves our initial claim!
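The recurrence and the claim can be checked numerically with a short Python sketch (the function names are illustrative; the Fibonacci convention f_0 = f_1 = 1 matches the one used above):

```python
def n_avl_min(h):
    """Minimal number of nodes in an AVL tree of height h (recurrence from the notes)."""
    if h == 1:
        return 1
    if h == 2:
        return 2
    return 1 + n_avl_min(h - 1) + n_avl_min(h - 2)

def fib(n):
    """Fibonacci numbers with f_0 = f_1 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return b if n >= 1 else a
```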

Corollaries:
1. For the Fibonacci numbers f_n, we know that 2^(n/2) ≤ f_n ≤ 2^n, which implies that
2^((h+1)/2) - 1 ≤ N_AVL,min(h) ≤ 2^(h+1) - 1.
Therefore, an AVL tree of height h will have at least 2^((h+1)/2) - 1 nodes, and at most
2^h - 1 nodes.
⇒ As a consequence, an AVL tree that contains n nodes will be of height Θ(log n).
2. Searching in an AVL tree has a time complexity of O(log n).
3. Inserting or deleting a single element in an AVL tree has a time complexity of O(log n).
BUT: standard inserting/deleting will probably destroy the AVL property.

4.2.4. Algorithms on AVL Trees



The algorithms to search, insert, or delete elements in an AVL tree are, in principle, the same as
those on "standard" binary trees. However, the deletion or insertion of nodes in former AVL
trees might destroy their AVL property. The resulting loss of balance would, as a consequence,
slow down the algorithms.
To restore the AVL property in a certain node of a binary tree, we will discuss four so-called
rotation operators:
left rotation,
right rotation,
left-right rotation, and
right-left rotation.

The left rotation


The left rotation restores the AVL property in a node in the following situation:
the height balance of the node is +2 or larger, and
the height balance of the right subtree is 0, or +1

AVL_LEFTROT (x:BinTree) {

    x := ( x.rightSon.key,
           (x.key, x.leftSon, x.rightSon.leftSon),
           x.rightSon.rightSon
         );
}

The right rotation


The right rotation restores the AVL property in a node, if:
the height balance of the node is -2 or smaller, and
the height balance of the left subtree is 0, or -1.


AVL_RIGHTROT (x:BinTree) {

    x := ( x.leftSon.key,
           x.leftSon.leftSon,
           (x.key, x.leftSon.rightSon, x.rightSon)
         );
}

The right-left rotation


Let's have a look at the following situation:
the height balance of the node is +2, and
the height balance of the right subtree is -1.
As we can see in the following diagram, the left rotation alone is not sufficient to restore the AVL
property:

the AVL property is still violated in the top node r;

a subsequent right rotation is not applicable, because b(n) = +1, instead of 0 or -1, as required.
We can solve this problem by performing a right rotation on the right son r before performing the
left rotation!
As we can see in the following diagram, this first makes the balance of the intermediate tree even
worse, but the resulting tree is balanced again in the AVL sense:


The combination of a right rotation on the right subtree, followed by a left rotation, is generally
referred to as a right-left rotation:

AVL_RIGHTROT_LEFTROT (x:BinTree) {

    AVL_RIGHTROT(x.rightSon);
    AVL_LEFTROT(x);
}

The effect of the two rotations can also be achieved by the following algorithm:

AVL_RIGHTLEFTROT (x:BinTree) {
    x := ( x.rightSon.leftSon.key,
           ( x.key,
             x.leftSon,
             x.rightSon.leftSon.leftSon ),
           ( x.rightSon.key,
             x.rightSon.leftSon.rightSon,
             x.rightSon.rightSon )
         );
}


The left-right rotation


Finally, we discuss the situation when:
the height balance of the node is -2 or smaller, and
the height balance of the left subtree is +1.
Analogous to the previous case, the right rotation alone is not sufficient to restore the AVL property.
This time, we perform a left rotation on the left subtree, followed by a right rotation.

Not surprisingly, the combination of a left rotation on the left subtree, followed by a right rotation, is
generally referred to as a left-right rotation.
Again, we can either implement the rotation as the combination of two "simple" rotations, or
choose the direct approach:

AVL_LEFTROT_RIGHTROT (x:BinTree) {
    AVL_LEFTROT(x.leftSon);
    AVL_RIGHTROT(x);
}
AVL_LEFTRIGHTROT (x:BinTree) {

    x := ( x.leftSon.rightSon.key,
           ( x.leftSon.key,
             x.leftSon.leftSon,
             x.leftSon.rightSon.leftSon ),
           ( x.key,
             x.leftSon.rightSon.rightSon,
             x.rightSon )
         );
}
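The four rotations can be sketched compactly in Python on tuple-encoded trees (key, left son, right son), with None standing for the empty tree. The encoding is an illustrative choice mirroring the pseudocode above:

```python
def left_rot(x):
    """Left rotation: the right son becomes the new top node."""
    k, l, (rk, rl, rr) = x
    return (rk, (k, l, rl), rr)

def right_rot(x):
    """Right rotation: the left son becomes the new top node."""
    k, (lk, ll, lr), r = x
    return (lk, ll, (k, lr, r))

def right_left_rot(x):
    """Right rotation on the right son, then a left rotation."""
    k, l, r = x
    return left_rot((k, l, right_rot(r)))

def left_right_rot(x):
    """Left rotation on the left son, then a right rotation."""
    k, l, r = x
    return right_rot((k, left_rot(l), r))
```

For instance, the right-heavy zig-zag tree (1, ·, (3, (2, ·, ·), ·)) has balance +2 at the root and -1 in its right son; a right-left rotation turns it into the balanced tree (2, (1, ·, ·), (3, ·, ·)).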

Several remarks on the rotation operators


First, some observations on the height of the transformed AVL tree:

left-right rotations and right-left rotations will always reduce the height of the
(sub-)tree by 1.
left rotations and right rotations either keep the height unchanged, or reduce it by 1,
depending on the heights of the subtrees.
⇒ rotation operators may cause a violation of the AVL property in parent
nodes!

A closer examination of the rotation operators required after inserting/deleting reveals:

after inserting a single node into a previously balanced AVL tree (satisfying the
AVL property), at most one rotation operation is required to restore the AVL
property.
after deleting a single node in a previously balanced AVL tree, as many as log n
rotations might be necessary to restore the AVL property for all nodes (one
rotation per level, travelling upwards in the tree).
after both inserting and deleting, the possible violation of the AVL property can
occur on a much higher level than where the node was inserted/deleted. Thus, the
AVL property has to be checked along the entire branch of the tree, up to the root.

Corollary: Time complexity of operations on AVL trees


The time complexity for deleting, or inserting a single node in an AVL tree - including the
work to restore the AVL property - is O(log n)
Thus, AVL trees are a data structure that allow inserting, deleting, and searching of data in O
(log n) time.


4.3. Hash Tables


Generalized searching problem:
Store a set of objects consisting of a key and additional data:

Object := (key: Integer;  /* or String, ... */
           data: ???;     /* String(s), Records, ... */
          );

Search/insert/delete objects in this set.
Balanced search trees, like AVL trees, offer an O(log n) worst-case complexity. Is there any
faster method?

4.3.1. Direct-address tables


Tables as data structures:
A table, like an array, allows the direct access to its elements via an index. In contrast to arrays, it
may contain empty elements (denoted by the special value NIL).

Direct-Address Table:
Assume:
the "universe" U of keys, i.e. the set of all possible keys, is reasonably small, for example U
= {0, 1, . . ., m - 1}.
Idea: "direct-address table"
Store every object in the table element given by the object's key.

DIR_ADDR_INSERT(T:Table, x:Object) {
    T[x.key] := x;
}
DIR_ADDR_DELETE(T:Table, x:Object) {
    T[x.key] := NIL;
}
DIR_ADDR_SEARCH(T:Table, key:Integer) : Object {
    return T[key];
}

Positive:
+ very fast: every operation is O(1).


Negative:
- all keys need to be distinct (they'd better be, anyway . . .)
- m has to be small (which is a quite severe restriction!)
- if only few elements are stored, lots of table elements are unused (waste of memory).

4.3.2. Hashing
If the universe of keys is very large, direct-address tables become impractical. However, if we can
find a function that
computes the table index from the key of an object,
has a relatively small range of values, and
can be computed efficiently,
we can use this function to calculate the table index.
The respective function is called a hash function, and the table is called a hash table.
The algorithms for inserting, deleting, and searching in such a simple hash table are quite
straightforward:

SMPL_HASH_INSERT(T:Table, x:Object) {
T[h(x.key)] := x;
}
SMPL_HASH_DELETE(T:Table, x:Object) {
T[h(x.key)]:= NIL;
}
SMPL_HASH_SEARCH(T:Table, x:Object) {
return T[h(x.key)];
}

Positive:
+ still very fast: O(1).
+ the size of the table can be chosen freely, provided there is an appropriate hash function h.

Negative:
- the values of h have to be distinct for all keys.
In general, it is not possible to find a function h that generates distinct values for all possible keys
(the whole idea was to make the range of h much smaller than that of the key values . . .)
⇒ we have to deal with collisions, i.e. situations where different objects with different
keys share a common value of the hash function, and would therefore have to be stored
in the same table element.


4.3.3. Collision resolution by chaining


Idea:
Instead of using a table of elements that can hold only one object each, we use a table of
containers that can hold an arbitrarily large number of elements, for example:
lists,
trees,
etc.
If lists are used as containers, this technique is called chaining.
Note: Compare this approach with the Bucketsort algorithm!

Algorithms:
The resulting algorithms for inserting, deleting, and searching are based on that for the
containers:

CHAINED_INSERT(T:Table of Lists, x:Object) {
    insert x into List T[h(x.key)];
}
CHAINED_DELETE(T:Table of Lists, x:Object) {
    delete x from List T[h(x.key)];
}
CHAINED_SEARCH(T:Table of Lists, k:Integer) : Object {
    search an element x in List T[h(k)] where x.key = k;
    return that element x;
}
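A minimal chained hash table in Python; the class layout and the simple hash function h(k) = k mod m are illustrative choices:

```python
class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]    # one list ("chain") per slot

    def h(self, key):
        return key % self.m                    # simple division-method hash

    def insert(self, key, data):
        self.slots[self.h(key)].append((key, data))

    def search(self, key):
        for k, data in self.slots[self.h(key)]:
            if k == key:
                return data
        return None                            # NIL: key not present

    def delete(self, key):
        chain = self.slots[self.h(key)]
        self.slots[self.h(key)] = [(k, d) for k, d in chain if k != key]
```

With m = 7, the keys 10 and 17 collide in slot 3, but both remain retrievable through their shared chain.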

Positive:
+ the hash function no longer has to return distinct values.
+ still very fast, if the lists are short.

Negative:
- deleting/searching is O(l), where l is the number of elements in the accessed list.
- Worst case: all elements have to be stored in one single list (very unlikely).

Average cost of searching in a hash table with chaining


Assumptions:
the hash table has m slots (a table of m lists);
the table contains n elements; α = n/m is called the "load factor";
h(k) can be computed in O(1) time for all k;
all values of h are equally likely to occur.

Costs for an unsuccessful search:
on average, the list corresponding to the requested key will have α elements.
an unsuccessful search will have to compare the requested key with all objects in the list,
which requires O(α) operations.
Thus, including O(1) time for computing the hash value, an unsuccessful search will, on
average, take O(1 + α) operations.

Corollary:
In the worst case, a successful search will also have to check all keys that are stored in
a hash table slot.
Therefore, the average complexity of a successful search is also O(1 + α).

4.3.4. Designing hash functions


A good hash function should:
1. satisfy the assumption that each key is equally likely to be hashed to any of the slots:

   Σ_{k : h(k) = j} P(k) = 1/m for all j = 0, 1, . . ., m - 1,

2. be easy to compute, and
3. be "non-smooth", i.e. keys that are "close" together should not produce hash values that
are close together (to avoid clustering).

Division method for integer keys:


For integer keys, we could simply use the modulo operator:
h(k) = k mod m
The modulo function certainly satisfies our second claim for hash functions (easy to compute).
However, it is not "non-smooth", and the distribution of its hash values directly depends on that
of the input keys.

Remark: avoid powers of 2 (or 10) for m!
Otherwise, the leading digits of a key k are not used to compute h(k)
⇒ use prime numbers for m!

Multiplication method for integer keys:



Two-step method:
1. multiply k by a constant A (0 < A < 1), and extract the fractional part of k·A;
2. multiply the fractional part by m, and use the integer part of the result as hash value.
Thus
h(k) = ⌊m · (k·A mod 1)⌋

Remarks:
the value of m is not critical (the size of the hash table can be chosen independently of the hash function).
A still has to be chosen wisely (see Knuth).

Efficient Implementation:
Choose m to be a power of 2, say m = 2^p, and use fixed-point arithmetic with w bits for
the fractional parts (w typically being the word length). Then:
multiply k by s := ⌊A · 2^w⌋ (the latter term is a w-bit integer);
the result is a (2w)-bit value r_1 · 2^w + r_0, where r_0 is the binary representation of the fractional
part of k·A;
multiplying r_1 · 2^w + r_0 by 2^p would shift the p most significant bits of r_0 into the "integer" part;
⇒ take the p most significant bits of r_0 as the hash value.
Thus, the proposed multiplication method can be implemented without the need for floating-point
arithmetic (although fractional values are involved). The required arithmetic shifts and modulo
operations can be implemented very efficiently on most platforms, so the cost of computing the
hash value for a single key is dominated by that of one (long) integer multiplication.
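The fixed-point scheme can be sketched in Python; the choices w = 32, p = 10, and the multiplier s = ⌊((√5 - 1)/2) · 2^32⌋ (a constant often attributed to Knuth) are illustrative:

```python
w = 32                      # bits of fixed-point precision (the "word length")
p = 10                      # hash table size m = 2^p = 1024
s = 2654435769              # floor(((sqrt(5) - 1) / 2) * 2**32)

def mult_hash(k):
    """Multiplication-method hash, implemented in pure integer arithmetic."""
    r0 = (k * s) % (2 ** w)    # fractional part of k*A, as a w-bit integer
    return r0 >> (w - p)       # the p most significant bits of r0
```

For k = 1 the product fits in w bits, so the hash is simply the top p bits of s itself; consecutive keys nevertheless spread widely over the 1024 slots.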

Hashing non-Integer Keys:

Usually, non-integer keys can be transformed into integers using an appropriate conversion
function, for example:
use the ASCII code to convert characters/strings into integers, for example:
"Bader" → 66 · 128^4 + 97 · 128^3 + 100 · 128^2 + 101 · 128^1 + 114 · 128^0 = . . .
use fixed-point arithmetic to convert real numbers into integers:
1/3 ≈ 0.333 333 3 → 3 333 333
use the binary representation (if there is no binary representation, then think again about how
on earth you're going to store your data . . .)

4.3.5. Open Addressing


Open addressing does not use lists (chains,...) to resolve collisions; each slot of the
hash table either contains an object, or NIL.


To resolve collisions, open addressing has to allow more than one position for a specific object.
The hash function is therefore modified to generate sequences of hash table indices:
h : U × {0, 1, . . ., m - 1} → {0, 1, . . ., m - 1}
For every key k, the sequence
h(k, 0), h(k, 1), h(k, 2), . . ., h(k, m - 1)
is called the probe sequence.

General approach:
An object will be stored in the first empty slot that is specified by the probe sequence.
To guarantee that an empty slot in the hash table will eventually be found, the probe
sequence [h(k, 0), . . ., h(k, m - 1)] should be a permutation of [0, . . ., m - 1].

Algorithms:
HASH_INSERT(T:Table, x:Object) : Integer {
for i from 0 to m-1 do {
j := h(x.key,i);
if T[j]=NIL
then {
T[j] := x;
return j; /* and terminate */
}
}
    raise error "hash table overflow";
}
HASH_SEARCH(T:Table, k:Integer) : Object {
    i := 0;
    while i < m and T[h(k,i)] != NIL do {
        if k = T[h(k,i)].key
            then return T[h(k,i)]; /* and terminate */
        i := i+1;
    }
    return NIL;
}

Linear Probing:
Use the hash function

h(k, i) = (h_0(k) + i) mod m

where h_0 is an ordinary hash function.

Thus, for a key k,

the first slot of the hash table that is checked is T[h_0(k)];
the second probe slot will be T[h_0(k) + 1], then T[h_0(k) + 2], and so on, until T[m-1] is
checked;
if this is still occupied, we wrap around to the slots T[0], T[1], . . ., T[h_0(k) - 1].

Main Problem: "Clustering"


Continuous runs of occupied slots ("clusters") cause lots of checks during searching and
inserting.
Clusters tend to grow, because all objects that are hashed to a slot inside the cluster will
increase it.
Minor improvement:
h(k, i) = (h 0 (k) + c i) mod m

Quadratic probing:
Use hash function
h(k, i) = (h_0(k) + c_1·i + c_2·i^2) mod m
where h_0 is an ordinary hash function, and c_1 and c_2 are suitable constants.

Problem: "Secondary clustering"


Two keys k_1 and k_2 having the same initial probe slot, i.e. h_0(k_1) = h_0(k_2), will still have the same
probe sequence.

Double hashing:
Double hashing uses a function
h(k, i) = (h_0(k) + i·h_1(k)) mod m
where h_0 and h_1 are (auxiliary) hash functions.

Idea:
Even if h 0 generates the same hash values for several keys, h 1 will generate different probe
sequences for any of these keys.

Choosing h_0 and h_1:
h_0 and h_1 need to have different ranges of values:
h_0 : U → {0, . . ., m_0 - 1}, m_0 = m being the size of the hash table;
h_1(k) must never be 0 (otherwise, no probe sequence is generated);
h_1(k) should be prime to m_0 = m (for all k):
if h_1(k) and m_0 = m have a greatest common divisor d > 1, the generated probe sequence
will only examine a (1/d)-th of the hash slots.

Possible choices:


Let m_0 = m be a power of 2, and design h_1 such that it will generate odd numbers only.
Let m_0 = m be a prime number, and let h_1 : U → {1, . . ., m_1}, where m_1 < m.
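Open addressing with double hashing can be sketched in Python following the second choice above (m prime); the concrete values m = 13 and h_1(k) = 1 + (k mod (m-1)) are illustrative, and h_1 never returns 0, as required:

```python
m = 13                       # table size: a prime number
table = [None] * m           # None plays the role of NIL

def h0(k):
    return k % m             # ordinary division-method hash

def h1(k):
    return 1 + (k % (m - 1)) # values in {1, ..., m-1}, never 0

def probe(k, i):
    """The i-th slot of the probe sequence for key k."""
    return (h0(k) + i * h1(k)) % m

def hash_insert(k):
    for i in range(m):
        j = probe(k, i)
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("hash table overflow")

def hash_search(k):
    for i in range(m):
        j = probe(k, i)
        if table[j] is None:
            return None      # NIL: k cannot be in the table
        if table[j] == k:
            return j
    return None
```

The keys 5 and 18 share the initial probe slot h0 = 5, but since h1(5) ≠ h1(18), their probe sequences diverge immediately: 18 is placed in slot (5 + 7) mod 13 = 12 on its second probe.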
