
Greedy Algorithms

Intuition: At each step, make the choice that is locally optimal.
Does the sequence of locally optimal choices lead to a
globally optimal solution?
That depends on the problem.
Sometimes greedy guarantees only an approximate solution.
Examples:
Shortest paths: Dijkstra
Minimum spanning trees: Prim and Kruskal
Compression: Huffman coding
Memory allocation: First fit, Best fit
Greedy Method
Greedy algorithms are typically used to solve optimization
problems. Most of these problems have n inputs and require
us to obtain a subset that satisfies some constraints. Any
subset that satisfies these constraints is called a feasible
solution. We are required to find a feasible solution that
either minimizes or maximizes a given objective function.
In the most common situation we have:
C: A set (or list) of candidates;
S: The set of candidates that have already been used;
feasible(): A function that checks if a set is a feasible solution;
solution(): A function that checks if a set provides a solution;
select(): A function that chooses the most promising candidate;
An objective function that we are trying to optimize.
The Generic Procedure
function greedy(C: set): set;
begin
  S := ∅;  /* S is the set in which we construct the solution */
  while (not solution(S) and C ≠ ∅) do
  begin
    x := select(C);
    C := C - {x};
    if feasible(S ∪ {x}) then S := S ∪ {x};
  end;
  if solution(S) then return(S) else return(∅);
end;

The selection function uses the objective function to
choose the most promising candidate from C.
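
As a minimal illustration, here is the generic procedure transliterated into Python. The parameter names (select, feasible, solution) mirror the pseudocode above; passing them in as functions is my own design choice, not part of the original procedure.

def greedy(candidates, select, feasible, solution):
    # Generic greedy skeleton: returns a solution set, or None on failure.
    candidates = set(candidates)
    chosen = set()                       # S in the pseudocode above
    while not solution(chosen) and candidates:
        x = select(candidates)           # most promising remaining candidate
        candidates.remove(x)
        if feasible(chosen | {x}):       # keep x only if still feasible
            chosen |= {x}
    return chosen if solution(chosen) else None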
Example: Coin Change
We want to give change to a customer using the smallest
possible number of coins (of units 1, 5, 10, and 25, respectively).
The greedy algorithm always finds the optimal solution in
this case.
If 12-unit coins are added, it does not necessarily find the
optimal solution.

The greedy method might even fail to find a solution even
though one exists. (Consider coins of 2, 3, and 5 units.)
Both behaviors appear in the sketch below.
E.g., with coins {1, 5, 10, 12, 25}: for 15, greedy gives (12, 1, 1, 1), while (10, 5) is optimal.
E.g., with coins {2, 3, 5}: for 6, greedy takes 5 and gets stuck, while (3, 3) is optimal.
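
A small Python sketch reproducing both examples (the function name and the coin-list representation are mine):

def greedy_change(amount, coins):
    # Repeatedly take the largest coin that still fits.
    # Returns the list of coins used, or None if greedy gets stuck.
    used = []
    for c in sorted(coins, reverse=True):
        while amount >= c:
            amount -= c
            used.append(c)
    return used if amount == 0 else None

print(greedy_change(15, [1, 5, 10, 12, 25]))   # [12, 1, 1, 1]; (10, 5) is optimal
print(greedy_change(6, [2, 3, 5]))             # None, although 6 = 3 + 3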
The Knapsack Problems
0-1 knapsack problem: A thief robbing a store finds n
items; the i-th item has weight c_i and is worth v_i dollars
(c_i and v_i are integers). If the thief can carry at most
weight B in his knapsack, which items should he take to
maximize his profit?
Fractional knapsack problem: same as above, except that
the thief can take fractions of items (e.g., an item might
be gold dust).
Integer knapsack problem: same as the 0-1 knapsack problem,
except that the number of copies of each item is unlimited.
What is the input size of the problem? n·log B
The Knapsack Problems
Optimal substructure property: consider the most
valuable load weighing at most B pounds in the 0-1 problem.
If we remove item j from this load, the remaining load must
be the most valuable load weighing at most B - c_j that can
be taken from the n - 1 original items excluding j.

In the fractional problem, if we remove w pounds of item j,
the remaining load must be the most valuable load
weighing at most B - w that can be taken from the n - 1
original items plus c_j - w pounds of item j.
The Knapsack Problems
The Integer Knapsack Problem:
  Maximize Σ_{i=1..n} v_i·x_i
  subject to Σ_{i=1..n} c_i·x_i ≤ B,
  where v_i ≥ 0, c_i ≥ 0, B > 0, and the x_i are nonnegative integers.
The 0-1 Knapsack Problem: same as the integer knapsack,
except that the values of the x_i's are restricted to 0 or 1.
The Fractional Knapsack Problem:
  Maximize Σ_{i=1..n} v_i·x_i
  subject to Σ_{i=1..n} c_i·x_i ≤ B,
  where v_i ≥ 0, c_i ≥ 0, B > 0, and 0 ≤ x_i ≤ 1.
The Knapsack Problems

Let f(k, a) = max{ Σ_{i=1..k} v_i·x_i : Σ_{i=1..k} c_i·x_i ≤ a }, for 0 ≤ k ≤ n,
0 ≤ a ≤ B.
For the integer knapsack problem, we have
  f(k, a) = max{ f(k - 1, a), f(k, a - c_k) + v_k }, and
for the 0-1 knapsack problem, we have
  f(k, a) = max{ f(k - 1, a), f(k - 1, a - c_k) + v_k }.
What we want to compute is f(n, B), which depends
recursively on at most nB previous terms:
f(k, a), 0 ≤ k ≤ n, 0 ≤ a ≤ B.
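
A sketch of the 0-1 recurrence computed bottom-up in Python (the list-of-lists table and 0-based item indexing are my choices):

def knapsack_01(c, v, B):
    # f[k][a] = best value achievable with items 1..k and capacity a.
    n = len(c)
    f = [[0] * (B + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        for a in range(B + 1):
            f[k][a] = f[k - 1][a]                 # skip item k
            if c[k - 1] <= a:                     # or take item k
                f[k][a] = max(f[k][a], f[k - 1][a - c[k - 1]] + v[k - 1])
    return f[n][B]

Replacing f[k - 1][a - c[k - 1]] by f[k][a - c[k - 1]] in the inner step gives the integer-knapsack recurrence.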
The Knapsack Problems
For the fractional knapsack problem, a greedy approach
solves it exactly: rearrange the objects so that
  v_1/c_1 ≥ v_2/c_2 ≥ ... ≥ v_n/c_n.
Then, for items i = 1 to n, take as much of item i as there
is while not exceeding the weight limit B.
Running time is O(n log n).
Remark: Dynamic programming is not applicable to the
fractional knapsack problem (why?), while the greedy
method may fail to find optimal solutions for the
integer/0-1 knapsack problems.
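
A sketch of this greedy procedure in Python (passing the items as parallel weight/value lists is my own convention):

def fractional_knapsack(c, v, B):
    # Sort by value density v_i/c_i, then take greedily.
    items = sorted(zip(c, v), key=lambda cv: cv[1] / cv[0], reverse=True)
    total = 0.0
    for ci, vi in items:
        take = min(ci, B)          # as much of this item as still fits
        total += vi * take / ci
        B -= take
        if B == 0:
            break
    return total

The sort dominates, giving the O(n log n) running time noted above.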
Greedy does not work for the 0-1 problem!
(Figure omitted.) Fractional Knapsack: the optimal solution is $240.
Job Scheduling
We are given n jobs to schedule. Job i requires running
time t_i.
Find the best order in which to execute the jobs so as to
minimize the average completion time.
Sample input: (J1, 10), (J2, 4), (J3, 5), (J4, 12), (J5, 7)
One possible schedule: J3, J2, J1, J5, J4.
Total completion time = 5 + 9 + 19 + 26 + 38 = 97
Optimal schedule: J2, J3, J5, J1, J4
Total completion time = 4+9+16+26+38 = 93
Greedy Scheduling Algorithm
Schedule the job with the smallest running time first, i.e.,
schedule jobs in increasing order of running times.
Correctness: Suppose the scheduling order is i_1, ..., i_n.
If t_{i_j} > t_{i_{j+1}}, then swap jobs i_j and i_{j+1}:
  Completion times of jobs before i_j and i_{j+1} do not change.
  The completion time of job i_j increases by t_{i_{j+1}}.
  The completion time of job i_{j+1} decreases by t_{i_j}.
So the total completion time decreases by t_{i_j} - t_{i_{j+1}} > 0.
Thus, the optimal total completion time can be achieved only if the
list i_1, ..., i_n is sorted.
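
A small Python sketch checking this on the sample input from the previous slide (the helper name is mine):

def total_completion_time(times):
    # Sum of completion times when jobs run in the given order.
    finished, total = 0, 0
    for t in times:
        finished += t
        total += finished
    return total

jobs = [10, 4, 5, 12, 7]                          # J1..J5 from the sample input
print(total_completion_time([5, 4, 10, 7, 12]))   # order J3,J2,J1,J5,J4 -> 97
print(total_completion_time(sorted(jobs)))        # greedy (sorted) order -> 93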

Multiprocessor Case
Suppose that jobs can be scheduled on k processors.
Same intuition: since the running times of earlier jobs
contribute to the completion times of later jobs, schedule
shorter jobs first.
Algorithm:
Sort the jobs in increasing order of running times: i_1, ..., i_n.
Schedule i_1 on processor P_1, i_2 on P_2, ..., i_k on P_k, then
i_{k+1} on P_1 again, and so on in a cycle. (A sketch follows the
example below.)
Sample input: (J1, 10), (J2, 4), (J3, 5), (J4, 12), (J5, 7)
and 2 processors.
Solution: J2, J5, J4 on P1; J3, J1 on P2.
Total completion time: (4 + 11 + 23) + (5 + 15) = 58.
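
A self-contained Python sketch of the round-robin rule (function name and job representation are mine):

def schedule_round_robin(times, k):
    # Sort jobs by running time, deal them to k processors in a cycle,
    # and return the total completion time over all jobs.
    queues = [[] for _ in range(k)]
    for idx, t in enumerate(sorted(times)):
        queues[idx % k].append(t)
    total = 0
    for q in queues:
        finished = 0
        for t in q:                    # completion times on this processor
            finished += t
            total += finished
    return total

print(schedule_round_robin([10, 4, 5, 12, 7], 2))  # 58, as computed above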
Final Completion Time
Suppose we want to minimize the maximum completion
time instead of the total (or average) completion time.
This makes sense only in the multiprocessor case.
Scheduling shorter jobs first does not seem to have any
advantage here.
In fact, the previous greedy algorithm does not give an optimal
solution.
Optimal solution: J1, J2, J3 on P1; J4, J5 on P2. Final
completion time = 19.
In fact, no greedy strategy works. We may be forced to
try out all possible ways to split the jobs among the processors.
This is an NP-complete problem!
More on Scheduling
We have looked at a very simple variant
In practice, there are many complications:
Jobs are not known a priori, but arrive in real time
Different jobs have different priorities
It may be OK to preempt jobs
Jobs have different resource requirements, not just
processing time
A hot research topic: scheduling for multimedia
applications
Minimum Spanning Trees (MST)
A tree is a connected graph with no cycles.
A spanning tree of a connected graph G is a subgraph of
G that has the same vertex set as G and is a tree.
A minimum spanning tree of a weighted graph G is a
spanning tree of G whose edge weights sum to the minimum.
(If G is not connected, we can talk about a minimum
spanning forest.)
There can be more than one minimum spanning tree in a
graph (consider a graph with identical edge weights).
The minimum spanning tree problem has a long history;
the first algorithm (Borůvka's) dates back at least to 1926!
Minimum Spanning Trees (MST)
Minimum spanning trees are always taught in algorithm
courses since (1) they arise in many applications, (2) they are
an important example where greedy algorithms always
give the optimal answer, and (3) clever data structures
are necessary to make the algorithms efficient.
A set of edges is a solution if it constitutes a spanning
tree, and it is feasible if it does not include a cycle. A
feasible set of edges is promising if it can be completed
to form an optimal solution.
An edge touches a set of vertices if exactly one endpoint of the
edge is in it.
Lemma
Let G = (V, E) be a connected undirected graph where
the length of each edge is given. Let V' ⊆ V, and let E' ⊆ E be
a promising set of edges such that no edge in E' touches
V'. Let e be the shortest edge that touches V'. Then E' ∪ {e}
is promising.
Kruskal's Algorithm
Idea: use the union and find algorithms.

T := ∅;
Sort all edges by weight.
Make each node a singleton set.
For each edge e = (u, v) ∈ E, in sorted order:
  if find(u) ≠ find(v) then add e to T and union(u, v);
  (else discard e)
Implementation

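A minimal Python sketch of this implementation (the function name, the edge-tuple format, and 0-based vertex numbering are my choices):

def kruskal(n, edges):
    # edges: list of (weight, u, v) with vertices numbered 0..n-1.
    # Returns the list of MST edges as (u, v, weight).
    parent = list(range(n))

    def find(x):                      # find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):     # process edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                  # e joins two different components
            parent[ru] = rv           # union
            tree.append((u, v, w))
    return tree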
Example (figure omitted: the edges are processed in sorted order of weight)
Analysis of Kruskal's Algorithm
Let m = |E|.
O(m log m) = O(m log n) to sort the edges.
O(n) to initialize the n singleton sets.
The loop is executed at most m times.
In the worst case, the find and union operations take
O((2m + n - 1)·log* n) time in total, since there are at most
2m find operations and n - 1 union operations.
At worst, O(m) for the remaining operations.
Total time complexity: O(m log n).
Exercises

Prove that Kruskal's algorithm works correctly. The
proof, which uses the lemma on the previous page, is by
induction on the number of edges selected so far.

What happens if, by mistake, we run the algorithm on a
graph that is not connected?

What is the complexity of the algorithm if the list of
edges is replaced by an adjacency matrix?
Example of Kruskal's Algorithm
(figures omitted)
Prim's Algorithm
Select an arbitrary vertex to start.
While there are fringe vertices:
  select the minimum-weight edge between the tree and the fringe;
  add the selected edge and vertex to the tree.
The main loop of the algorithm is executed n - 1 times;
each iteration takes O(n) time. Thus Prim's algorithm
takes O(n²) time.
Compare the above two algorithms according to the
density of the graph G = (V, E).
What happens to the above two algorithms if we allow
edges with negative lengths?
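
A minimal Python sketch of the O(n²) array-based version (the adjacency-matrix representation with float('inf') for missing edges is my assumption):

def prim(D):
    # D: n x n symmetric weight matrix, float('inf') where there is no edge.
    # Returns MST edges as (parent, vertex, weight); O(n^2) overall.
    n = len(D)
    in_tree = [False] * n
    in_tree[0] = True                 # start from vertex 0 (arbitrary)
    best = list(D[0])                 # cheapest known edge from tree to each vertex
    parent = [0] * n
    tree = []
    for _ in range(n - 1):
        v = min((u for u in range(n) if not in_tree[u]), key=lambda u: best[u])
        in_tree[v] = True
        tree.append((parent[v], v, best[v]))
        for w in range(n):            # update fringe edges through v
            if not in_tree[w] and D[v][w] < best[w]:
                best[w] = D[v][w]
                parent[w] = v
    return tree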
Example of Prim's Algorithm
(figures omitted)
Correctness (proof omitted): as with Kruskal's algorithm, it follows from the Lemma above.
Single-Source Shortest Paths (SSSP)
The Problem: Given an n-vertex weighted graph G = (V,
E) and a vertex v in V, find the shortest paths from v to
all other vertices.

Edges in G may have positive, zero, or negative weights,
but there is no cycle of negative weight.

D[i, j] = the weight of edge (i, j), or ∞ if there is no edge between i and j.
D&C approach

function SP(i, j, m);  (* length of a shortest path from i to j using at most m edges *)
if i = j
then return 0
else if m = 1
then return D[i, j]
else begin
  d := D[i, j];
  for k := 1 to n do d := min(d, SP(i, k, m - 1) + D[k, j]);
  return d;
end;

The answer is SP(i, j, n - 1), since with no negative cycles a
shortest path uses at most n - 1 edges.

Needs EXPONENTIAL time.

Floyd's algorithm: DP approach
Idea: let d_ij^(k) denote the length of a shortest path from i to j
that does not pass through nodes numbered higher than k.

for k := 1 to n do
  for i := 1 to n do
    for j := 1 to n do
      D[i, j] := min(D[i, j], D[i, k] + D[k, j]);

A dynamic programming approach.
Takes O(n³) time.
Solves the all-pairs shortest-path problem.
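
A direct Python sketch of the triple loop (working on a copy of D is my choice; the original updates D in place):

def floyd(D):
    # All-pairs shortest paths; D[i][j] = edge weight or float('inf').
    # Returns a new matrix of shortest distances in O(n^3) time.
    n = len(D)
    D = [row[:] for row in D]         # work on a copy
    for k in range(n):                # allow node k as an intermediate
        for i in range(n):
            for j in range(n):
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    return D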
Dijkstra's algorithm: Greedy approach

C := {2, 3, ..., n};
for i := 2 to n do near[i] := D[1, i];
repeat n - 2 times
  v := some element of C minimizing near[v];
  C := C - {v};
  for each w in C do near[w] := min(near[w], near[v] + D[v, w]);

Takes O(n²) time.

Example of Dijkstra's Algorithm
(figures omitted)
Dijkstra's Shortest-Path Algorithm

Algorithm SHORTEST-PATH(u)
begin
  for i := 1 to n do
    near[i] := D[u, i];
    P[i] := u;
  V' := V - {u};
  near[u] := 0;
  while (V' is not empty) do
    select v such that near[v] = min{near[w] : w ∈ V'};
    V' := V' - {v};
    for w ∈ V' do
      if (near[w] > near[v] + D[v, w])
      then near[w] := near[v] + D[v, w];
           P[w] := v;  (* P[w] is the parent of w *)
  for w ∈ V do
    (* print the shortest path from w to u *)
    print w;
    q := w;
    while (q ≠ u) do
      begin q := P[q]; print q; end;
end.
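
A Python sketch of the array-based version above (vertices 0..n-1 and an adjacency matrix with float('inf') for missing edges are my assumptions; note that Dijkstra's algorithm requires nonnegative edge weights):

def dijkstra(D, u):
    # Array-based Dijkstra from source u; O(n^2) time.
    # D[i][j] = edge weight or float('inf'); weights must be nonnegative.
    n = len(D)
    near = list(D[u])                 # tentative distances from u
    near[u] = 0
    P = [u] * n                       # parent pointers of the shortest-path tree
    remaining = set(range(n)) - {u}
    while remaining:
        v = min(remaining, key=lambda w: near[w])
        remaining.remove(v)
        for w in remaining:           # relax edges leaving v
            if near[v] + D[v][w] < near[w]:
                near[w] = near[v] + D[v][w]
                P[w] = v
    return near, P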

Greedy Heuristics

Often used in situations where we can (or must) accept an
approximate solution instead of an exact optimal solution.

Graph Coloring Problem: Given G = (V, E), use as few
colors as possible to color the nodes in V so that adjacent
nodes receive different colors.
Greedy Solutions
Algorithm 1 (see the sketch after this list):
1. Arrange the colors in some order.
2. For each node v in V, find the smallest color that
   has not yet been used to paint any neighbor of v, and
   paint v with this color.
Algorithm 2:
1. Choose a color and an arbitrary starting node, and
   then consider each other node in turn, painting it
   with this color if possible.
2. When no further nodes can be painted, choose a
   new color and a new starting node that has not yet
   been painted. Then repeat as in step 1 until every
   node is painted.
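
A minimal Python sketch of Algorithm 1 (representing the graph as an adjacency dict and colors as integers are my choices):

def greedy_coloring(adj):
    # adj maps each node to an iterable of its neighbors.
    # Colors are 0, 1, 2, ...; returns a {node: color} mapping.
    color = {}
    for v in adj:                          # nodes in some fixed order
        used = {color[w] for w in adj[v] if w in color}
        c = 0
        while c in used:                   # smallest color unused by neighbors
            c += 1
        color[v] = c
    return color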
Example of Graph Coloring
(figures omitted)
Greedy: 5 colors.
Optimal: 2 colors.

The greedy approach may find an optimal solution in
some cases, but it may also give an arbitrarily bad
answer.
Traveling Salesperson Problem
The Problem: In an undirected graph with weights
on the edges, find a tour (a simple cycle that includes all the
vertices) with the minimum sum of edge weights.
Algorithm (a sketch follows this list):
1. Choose the edge with minimum weight.
2. Accept the edge (considered together with the
   already selected edges) if it
   does not cause a vertex to have degree 3 or more,
   and
   does not form a cycle, unless the number of selected
   edges equals the number of vertices of the graph.
3. Repeat the above steps until n edges have been selected.
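
A Python sketch of this heuristic (edge tuples, 0-based vertices, and the union-find cycle test are my choices):

def greedy_tour(n, edges):
    # edges: list of (weight, u, v), vertices 0..n-1.
    # Returns the selected tour edges; the result may be suboptimal.
    degree = [0] * n
    parent = list(range(n))               # union-find for cycle detection

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tour = []
    for w, u, v in sorted(edges):         # consider edges by increasing weight
        if degree[u] >= 2 or degree[v] >= 2:
            continue                      # would create degree 3 or more
        ru, rv = find(u), find(v)
        if ru == rv and len(tour) != n - 1:
            continue                      # a cycle is allowed only as the final edge
        parent[ru] = rv
        degree[u] += 1
        degree[v] += 1
        tour.append((u, v, w))
        if len(tour) == n:
            break
    return tour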
Example
Edges are considered in increasing order of weight (edge: weight):
(1, 2): 3, (3, 5): 4, (4, 5): 5, (2, 3): 6, (1, 5): 7, (2, 5): 8,
(3, 4): 9, (1, 3): 10, (1, 4): 11, (2, 4): 12, (4, 6): 15, (1, 6): 25.
Thus, the solution is 1-2-3-5-4-6-1 with length 58.
The optimal solution is 1-2-3-6-4-5-1 with length 56.
Matroids
A pair (S, ℐ), where S is a nonempty, finite set and ℐ is a
family of subsets of S, such that
1. ∅ ∈ ℐ;
2. if J ∈ ℐ and I ⊆ J, then I ∈ ℐ (hereditary property);
3. if I, J ∈ ℐ and |I| < |J|, then there exists an x ∈ J - I
   such that I ∪ {x} ∈ ℐ (exchange property).

Elements of ℐ: independent sets.

Subsets of S not in ℐ: dependent sets.
Examples
Graphic matroid: let G = (V, E) be a connected undirected
graph.
S := E;
ℐ := the set of forests (= acyclic subgraphs) of G.
Claim: (S, ℐ) is a matroid.
Proof:
1. The graph (V, ∅) is a forest.
2. Any subset of a forest is a forest.
3. Let I and J be forests with |I| < |J|:
Example Continued
Subclaim 1: There exist two nodes u and v that are
connected in J but not in I.

Proof: Assume, by contradiction, that whenever two nodes are
connected in J they are also connected in I. It
follows that the connected components of J can be
spanned with at most |I| edges. Since |J| > |I|, J must
contain a cycle, which is a contradiction.
Example Continued
Subclaim 2: There exists an edge (x, y) on the path in J
from u to v such that x and y belong to different
connected components of I.

Proof: Assume, by contradiction, that for every edge (x, y)
on the path from u to v, x and y belong to the same
connected component of I. Then u and v belong to the
same connected component of I, a contradiction.

Thus I ∪ {(x, y)} is a forest, which establishes the exchange property.

Thus, (S, ℐ) is a matroid.

You might also like