Introduction
Algorithm Design Techniques
Design of algorithms
Algorithms commonly used to solve problems
Greedy, Divide and Conquer, Dynamic Programming, Randomized, Backtracking
Greedy Algorithms
Choose the best option during each phase
Dijkstra, Prim, Kruskal
Making change
Choose largest bill at each round
Does this always work?
Greedy Algorithms
Must have
Greedy-choice property: a globally optimal solution can be arrived at by making a locally optimal choice
Optimal substructure: an optimal solution to a problem contains optimal solutions to its subproblems
Making Change
Greedy choice property
Highest denomination coin ≤ n will reside in the solution; if not, it will be replaced by two or more smaller coins, which means more coins and is not optimal
Is this also true for denominations 1, 7, 10?
Optimal substructure
Solution for (n − highest denomination coin) is optimal
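To make the question about denominations 1, 7, 10 concrete, here is a minimal sketch that compares the greedy rule with an exhaustive dynamic-programming count. The function names `greedy_change` and `optimal_change_count` are illustrative, and the sketch assumes a coin of value 1 exists so that every amount can be paid.

```python
def greedy_change(n, denominations):
    """Repeatedly take the largest coin that does not exceed the remaining amount."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while n >= d:
            coins.append(d)
            n -= d
    return coins  # assumes a 1-unit coin exists, so n always reaches 0


def optimal_change_count(n, denominations):
    """Fewest coins for amount n, found by dynamic programming over amounts 0..n."""
    best = [0] + [float("inf")] * n
    for amount in range(1, n + 1):
        for d in denominations:
            if d <= amount and best[amount - d] + 1 < best[amount]:
                best[amount] = best[amount - d] + 1
    return best[n]


print(greedy_change(14, [1, 7, 10]))         # [10, 1, 1, 1, 1]
print(optimal_change_count(14, [1, 7, 10]))  # 2  (7 + 7)
```

For n = 14 the greedy rule uses five coins while two suffice, so the greedy-choice property does not hold for this denomination set.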
Scheduling
Given jobs j1, j2, j3, ..., jn with known running times t1, t2, t3, ..., tn, what is the best way to schedule the jobs to minimize average completion time?
Job   Time
j1    15
j2    8
j3    3
j4    10
Scheduling
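Schedule j1, j2, j3, j4: completion times 15, 23, 26, 36
Average completion time = (15+23+26+36)/4 = 25
Schedule j3, j2, j4, j1 (shortest first): completion times 3, 11, 21, 36
Average completion time = (3+11+21+36)/4 = 17.75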
Scheduling
Greedy-choice property: if the shortest job (j3, 3 time units) does not go first, the y jobs before it each complete 3 time units sooner, but j3 is postponed by the total time of those y jobs, which is at least 3y
Optimal substructure: if the shortest job is removed from an optimal solution, the remaining schedule for the n-1 jobs is optimal
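As a quick check of the four-job example, the sketch below computes the average completion time for the order as given and for shortest-job-first. The helper name `average_completion_time` is an illustrative choice, not part of the slides.

```python
def average_completion_time(times):
    """Average of the completion times when jobs run back to back in the given order."""
    elapsed = 0
    total = 0
    for t in times:
        elapsed += t      # this job finishes at the current elapsed time
        total += elapsed
    return total / len(times)


jobs = {"j1": 15, "j2": 8, "j3": 3, "j4": 10}
given_order = [jobs[name] for name in ("j1", "j2", "j3", "j4")]
shortest_first = sorted(jobs.values())

print(average_completion_time(given_order))     # 25.0
print(average_completion_time(shortest_first))  # 17.75
```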
Optimality Proof
Total cost of a schedule $i_1, i_2, \ldots, i_N$ is
$$\sum_{k=1}^{N} (N-k+1)\, t_{i_k} = (N+1)\sum_{k=1}^{N} t_{i_k} - \sum_{k=1}^{N} k\, t_{i_k}$$
First term is independent of ordering; as the second term increases, total cost becomes smaller
Scheduling
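Suppose there is a job ordering such that $x > y$ and $t_{i_x} < t_{i_y}$
Swapping the jobs (smaller first) increases the second term, decreasing total cost
Show: $x\,t_{i_x} + y\,t_{i_y} < y\,t_{i_x} + x\,t_{i_y}$
$$\begin{aligned}
x\,t_{i_x} + y\,t_{i_y} &= x\,t_{i_x} + y\,t_{i_x} + y\,(t_{i_y} - t_{i_x}) \\
&= y\,t_{i_x} + x\,t_{i_x} + y\,(t_{i_y} - t_{i_x}) \\
&< y\,t_{i_x} + x\,t_{i_x} + x\,(t_{i_y} - t_{i_x}) \\
&= y\,t_{i_x} + x\,t_{i_x} + x\,t_{i_y} - x\,t_{i_x} \\
&= y\,t_{i_x} + x\,t_{i_y}
\end{aligned}$$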
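The exchange argument can be sanity-checked on the four-job example by brute force. This sketch evaluates the total cost formula, the sum of (N-k+1) times the k-th job's running time, for every ordering; `total_completion_time` is an illustrative name, and exhaustive search is only practical for small N.

```python
from itertools import permutations


def total_completion_time(times):
    """Sum over positions k = 1..N of (N - k + 1) * t_k for the given order."""
    n = len(times)
    return sum((n - k + 1) * t for k, t in enumerate(times, start=1))


times = [15, 8, 3, 10]
best_order = min(permutations(times), key=total_completion_time)

print(best_order)                         # (3, 8, 10, 15)
print(total_completion_time(best_order))  # 71
print(total_completion_time(times))       # 100
```

The minimum over all 24 orderings is the shortest-first order, as the swap argument predicts.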
More Scheduling
Multiple processor case
Algorithm?
More Scheduling
Multiple processor case
Algorithm:
order jobs shortest first
schedule jobs round-robin
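A minimal sketch of the two steps above, assuming identical processors and reusing the four-job example with two processors (the processor count and the name `schedule_round_robin` are assumptions made for illustration):

```python
def schedule_round_robin(times, processors):
    """Sort jobs shortest first, then deal them out to processors in rotation."""
    assignment = [[] for _ in range(processors)]
    for i, t in enumerate(sorted(times)):
        assignment[i % processors].append(t)
    return assignment


def average_completion_time(assignment):
    """Average completion time over all jobs, each processor running its list in order."""
    total = 0
    count = 0
    for queue in assignment:
        elapsed = 0
        for t in queue:
            elapsed += t
            total += elapsed
            count += 1
    return total / count


plan = schedule_round_robin([15, 8, 3, 10], processors=2)
print(plan)                           # [[3, 10], [8, 15]]
print(average_completion_time(plan))  # 11.75
```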
Huffman Codes
100 ASCII characters
Need ceil(log2 100) = 7 bits to represent each character
Large file = lots of bits!
Would like to reduce number of bits
Huffman Codes
Idea: encode frequently occurring characters using fewer bits
Need to make sure all characters are distinguishable
Example: if 01 = A and 0101 = B, then 010101 = ? It could be AAA, AB, or BA
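The ambiguity can be reproduced mechanically. The sketch below enumerates every way to split a bit string into codewords; `all_decodings` is an illustrative helper, and the second code table (A = 0, B = 10) is a hypothetical prefix-free code chosen only for contrast.

```python
def all_decodings(bits, code):
    """Return every symbol string whose concatenated codewords equal `bits`."""
    if not bits:
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            # Consume this codeword and decode the remainder recursively.
            results.extend(symbol + rest for rest in all_decodings(bits[len(word):], code))
    return results


print(all_decodings("010101", {"A": "01", "B": "0101"}))  # ['AAA', 'AB', 'BA']
print(all_decodings("0100", {"A": "0", "B": "10"}))       # ['ABA']
```

When no codeword is a prefix of another, at most one decoding exists.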
Huffman Codes
Goal: find a full binary tree of minimum cost where characters are stored in the leaves
Cost of tree: sum across all characters of the frequency of the character times its depth in the tree
Frequently occurring characters should be highest in the tree (closest to the root)
Huffman Codes
Character  Code   Frequency  Total Bits
a          001    10         30
e          01     15         30
i          10     12         24
s          00000  3          15
t          0001   4          16
space      11     13         26
newline    00001  1          5
total                        146
Huffman's Algorithm
How do we produce a code?
Maintain a forest of trees
weight of a tree is the sum of the frequencies of the leaves
start with C trees, one to represent each character
weight of each is frequency of that character
repeat C-1 times: merge the two trees of lowest weight into one tree
3. Characters at the same depth can be swapped
4. As trees are merged, optimality holds
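The forest-of-trees procedure can be sketched with a priority queue: repeatedly remove the two lowest-weight trees and merge them. This is a minimal sketch that tracks only the depth (code length) of each character, not the bit patterns; the name `huffman_code_lengths` and the counter used to break ties are illustrative choices.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {character: depth in the Huffman tree} for a frequency table."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), [sym]) for sym, w in freqs.items()]
    heapq.heapify(heap)
    depth = {sym: 0 for sym in freqs}
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)   # lowest-weight tree
        w2, _, syms2 = heapq.heappop(heap)   # second-lowest-weight tree
        for sym in syms1 + syms2:
            depth[sym] += 1                  # every leaf moves one level down
        heapq.heappush(heap, (w1 + w2, next(tiebreak), syms1 + syms2))
    return depth


freqs = {"a": 10, "e": 15, "i": 12, "s": 3, "t": 4, "space": 13, "newline": 1}
lengths = huffman_code_lengths(freqs)
cost = sum(freqs[sym] * lengths[sym] for sym in freqs)

print(lengths)  # a: 3, e: 2, i: 2, s: 5, t: 4, space: 2, newline: 5
print(cost)     # 146
```

On the example frequencies, the depths agree with the code lengths in the table and the cost matches its 146-bit total.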
Bin packing example: item sizes .2, .5, .4, .7, .1, .3, .8
On-line vs Off-line
On-line
Process one item at a time
Cannot move an item once it is placed
Off-line
Look at all items before you place first item
On-line Algorithms
On-line algorithms cannot guarantee an optimal solution
Problem: cannot know when input will end
Input: M small items of size 1/2 - ε followed by M large items of size 1/2 + ε
Can fit into M bins with 1 large and 1 small in each bin
If all small come first, place in M separate bins
If input is only M small items, we have used twice as many bins as necessary
There are inputs that force any on-line bin-packing algorithm to use at least 4/3 the optimal number of bins.
Next fit: (.2, .5) (.4) (.7, .1) (.3) (.8)
Running time?
Let M be the optimal number of bins required to pack a list I of items. Then next fit never uses more than 2M bins.
At most, half of the space is wasted ($B_j + B_{j+1} > 1$)
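A minimal sketch of next fit: keep only the most recent bin open and start a new one whenever the next item does not fit. To avoid floating-point rounding, sizes are expressed in tenths (so .2 becomes 2 and the bin capacity is 10); that scaling and the name `next_fit` are assumptions of this sketch. Each item is examined once, so the loop runs in linear time.

```python
def next_fit(items, capacity):
    """Pack items in arrival order, only ever looking at the last bin opened."""
    bins = []
    remaining = 0  # free space in the current bin; no bin is open yet
    for size in items:
        if not bins or size > remaining:
            bins.append([])        # close the current bin and open a new one
            remaining = capacity
        bins[-1].append(size)
        remaining -= size
    return bins


print(next_fit([2, 5, 4, 7, 1, 3, 8], capacity=10))
# [[2, 5], [4], [7, 1], [3], [8]]
```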
First fit: (.2, .5, .1) (.4, .3) (.7) (.8)
Running time?
Let M be the optimal number of bins required to pack a list I of items. Then first fit never uses more than $\lceil \frac{17}{10} M \rceil$ bins.
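A minimal sketch of first fit: scan the bins in the order they were opened and place the item in the first one with enough room. As an assumption of this sketch, sizes are given in tenths with capacity 10 to keep the arithmetic exact. The straightforward scan below is quadratic in the worst case; faster implementations are possible with a suitable search structure.

```python
def first_fit(items, capacity):
    """Place each item in the earliest-opened bin that still has room for it."""
    bins = []
    free = []  # free[i] is the unused space in bins[i]
    for size in items:
        for i, space in enumerate(free):
            if size <= space:
                bins[i].append(size)
                free[i] -= size
                break
        else:
            # No existing bin can hold the item, so open a new one.
            bins.append([size])
            free.append(capacity - size)
    return bins


print(first_fit([2, 5, 4, 7, 1, 3, 8], capacity=10))
# [[2, 5, 1], [4, 3], [7], [8]]
```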
Best fit: (.2, .5, .1) (.4) (.7, .3) (.8)
Running time?
Same performance as first fit.
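The packing above is what the best fit rule produces on the example items: place each item in the bin where it leaves the least free space, opening a new bin only when none fits. The sketch below assumes that reading, breaks ties in favor of the earliest-opened bin, and uses sizes in tenths with capacity 10; the name `best_fit` is illustrative.

```python
def best_fit(items, capacity):
    """Place each item in the fullest bin that can still hold it."""
    bins = []
    free = []  # free[i] is the unused space in bins[i]
    for size in items:
        candidates = [i for i, space in enumerate(free) if size <= space]
        if candidates:
            # min() returns the first minimum, so ties go to the earliest bin.
            i = min(candidates, key=lambda j: free[j])
            bins[i].append(size)
            free[i] -= size
        else:
            bins.append([size])
            free.append(capacity - size)
    return bins


print(best_fit([2, 5, 4, 7, 1, 3, 8], capacity=10))
# [[2, 5, 1], [4], [7, 3], [8]]
```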
Let M be the optimal number of bins required to pack a list I of items. Then first fit decreasing never uses more than $\frac{11}{9} M + 4$ bins.
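First fit decreasing is an off-line method: it sorts all items largest first and then applies first fit. A minimal sketch, again assuming sizes in tenths with capacity 10 (the name `first_fit_decreasing` is illustrative):

```python
def first_fit_decreasing(items, capacity):
    """Sort items largest first, then place each in the earliest bin with room."""
    bins = []
    free = []  # free[i] is the unused space in bins[i]
    for size in sorted(items, reverse=True):
        for i, space in enumerate(free):
            if size <= space:
                bins[i].append(size)
                free[i] -= size
                break
        else:
            bins.append([size])
            free.append(capacity - size)
    return bins


print(first_fit_decreasing([2, 5, 4, 7, 1, 3, 8], capacity=10))
# [[8, 2], [7, 3], [5, 4, 1]]
```

On the example items this uses three bins, each filled exactly, which is the fewest possible since the sizes sum to three full bins.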