
15-251: Great Theoretical Ideas in Computer Science
Lecture 7: Time complexity

"Dammit I'm mad!" is a palindrome.

In 1993, noted comedian Demetri Martin took a math course at Yale called Fractal Geometry. His final project: a 225-word palindromic poem.


What does that have to do with fractals? I don't know, it's a liberal arts school.

Dammit I'm Mad, by Demetri Martin


Dammit I'm mad.
Evil is a deed as I live.
God, am I reviled? I rise, my bed on a sun, I melt.
To be not one man emanating is sad. I piss.
Alas, it is so late. Who stops to help?
Man, it is hot. I'm in it. I tell.
I am not a devil. I level "Mad Dog".
Ah, say burning is, as a deified gulp,
in my halo of a mired rum tin.
I erase many men. Oh, to be man, a sin.
Is evil in a clam? In a trap?
No. It is open. On it I was stuck.
Rats peed on hope. Elsewhere dips a web.
Be still if I fill its ebb.
Ew, a spider ... eh?
We sleep. Oh no!
Deep, stark cuts saw it in one position.
Part animal, can I live? Sin is a name.
Both, one ... my names are in it.
Murder? I'm a fool. A hymn I plug,
deified as a sign in ruby ash -
a goddam level I lived at.
On mail let it in. I'm it.
Oh, sit in ample hot spots. Oh, wet!
A loss it is alas (sip). I'd assign it a name.
Name not one bottle minus an ode by me:
"Sir, I deliver. I'm a dog."
Evil is a deed as I live.
Dammit I'm mad.

That's nothing.

In 1986, one Lawrence Levine wrote an entire palindromic novel. It had ~100,000 letters.

Dr. Awkward & Olson in Oslo
by Lawrence Levine

Suppose you are the proofreader. How would you check if there's a mistake?
Tacit, I hate gas (aroma of evil), masonry, tramps, a wasp martyr. Remote liberal ceding is idle if... heh-heh, Sam X. Xmas murmured in an undertone to tow-trucker Edwards. Alas. Simple hot." To didos, no tracks, Ed decided. Or eh trucks abob.

(160 pages)
Bob, ask Curt. He rode diced desk carton. So did Otto help Miss Alas draw Derek-cur. Two tote? Not red Nun. A nide. Rum. Rum Sam X. Xmas. Heh, heh. Field, I sign. I declare bile to merry tramps. A wasp martyr? No, Sam live foam or a sage Tahiti Cat.


TwoFingersPalindromeTest(S,n)
// returns Yes iff string S[1]...S[n] is a palindrome
lo := 1
hi := n
while (lo < hi)
    if S[lo] ≠ S[hi] then return No
    lo := lo + 1
    hi := hi - 1
end while
return Yes
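As a sanity check, here is a direct Python rendering of the pseudocode. The course itself writes algorithms in pseudocode only; this runnable sketch (with 0-indexing, and a function name of this transcription's choosing) is an editorial addition.

def two_fingers_palindrome_test(s: str) -> bool:
    """Return True iff s is a palindrome, walking two 'fingers'
    inward from both ends (0-indexed, unlike the pseudocode)."""
    lo, hi = 0, len(s) - 1
    while lo < hi:
        if s[lo] != s[hi]:
            return False
        lo += 1
        hi -= 1
    return True

# two_fingers_palindrome_test("dammitimmad") -> True
# two_fingers_palindrome_test("selfless")    -> False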

The TwoFingers algorithm solves the PALINDROMES problem on size-n inputs in worst-case time O(n).

Today: 9 Great Ideas in Theoretical Computer Science

Great Idea #1: Great Definitions

An algorithm solves a problem if it gives the correct solution on every instance.

Problems
PALINDROMES is a problem. "dammitimmad" is an instance (also known as an input). More instances, with their solutions:

Instance                Solution
dammitimmad             Yes
selfless                No
zxckallkdsflsdkf        No
parahazramarzaharap     Yes

Problems
A problem is an infinite collection of (naturally related) instances and solutions.

Problems
Another example: MULTIPLICATION.

Instance            Solution
15251 × 252         3843252
12345679 × 9        111111111
50 × 610            30500
610 × 25            15250

Chess?
Question: Is this a winning position for white?
An interesting question, but it's not a Problem.

Problems
Let's try again.

CHESS
Instance: An arbitrary position.
Solution: Yes/No, is it a winning position for white?
Only finitely many instances, so still not a Problem.

GENERALIZEDCHESS
Instance: A board size and an arbitrary position.
Solution: Yes/No, is it a winning position for white?
Yes! That's a problem!

Algorithms
Definition: a well-defined procedure which gets an input (instance) and gives an output. I think you know what I mean. (But see Lecture 22.) In 251, we write all our algorithms in pseudocode.

Algorithms
"Algorithm A solves problem R" means: A outputs the correct solution on every instance of the problem.

Algorithms
"An algorithm is a finite answer to an infinite number of questions." (Stephen Kleene)

Great Idea #2: Input size
Measure the size of the input in bits.

Instance/input size
Usually denoted n. Unless otherwise specified, n = # of bits needed to specify the input. (It's often otherwise specified! Formally, what n means is part of the definition of the problem.)

When instances are integers
If problem instances are positive integers M, the input size is n = # of bits of M.

Example problem 1: PRIMES
Input: a positive integer M
Solution: Yes/No, whether M is prime
(Mindset: n = # of binary digits of M, and n might be 10⁶. Bignums!)

When instances are integers
Example problem 2: FACTORING
Input: a positive integer M (input size: n = # of bits of M)
Solution: the prime factorization of M

When instances are integers
Example problem 3: MULTIPLICATION
Input: positive integers M1, M2 (written in binary)
Solution: M1 · M2 (written in binary)
Input size: traditional to define n := max(# of bits of M1, # of bits of M2)

When instances are lists/strings
Usually define n to be the length of the list. (Even though the list items may be > 1 bit.)

Example problem 1: PALINDROMES
Input: a string of lower-case letters
Solution: Yes/No, whether it's a palindrome
Input size: n defined to be the length of the string (as opposed to its length in bits)

Example problem 2: SORTING
Input: a list of integers
Solution: the sorted version of the list
Input size: n defined to be the length of the list (though there are some issues here; see our later discussion of the RAM model)

When instances are graphs
How many bits does it take to specify a graph? Depends on the input format! Two popular choices: adjacency list and adjacency matrix.
For graphs, input size is always "otherwise specified". Major convention: n = # of vertices, m = # of edges.
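To see why the format matters, here is one toy graph in both encodings (a Python sketch; the example graph is arbitrary):

n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# Adjacency list: Θ(n + m) words in total.
adj_list: dict[int, list[int]] = {v: [] for v in range(n)}
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)

# Adjacency matrix: Θ(n²) bits, no matter how few edges there are.
adj_matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    adj_matrix[u][v] = adj_matrix[v][u] = 1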

Great Idea #3: Measuring running time
Measure the running time as a (worst-case) function of the input size.

Measuring running time

Count the # of elementary steps of algorithm A on instance I. Let's charge each line of TwoFingers a made-up cost:

TwoFingers(S,n)
lo := 1                        1
hi := n                        1
while (lo < hi)                1
    if S[lo] ≠ S[hi] then      3?   (we're just making this up)
        return No              1
    lo := lo + 1               1
    hi := hi - 1               1
end while
return Yes                     1

Suppose instance I is: selfless.
TwoFingers does lo := 1 and hi := n (1 step each), tests lo < hi (1 step), compares S[1] = 's' with S[8] = 's' (3 steps), advances both fingers (1 step each), tests lo < hi again (1 step), compares S[2] = 'e' with S[7] = 's' (3 steps), and returns No (1 step): about 14 steps in all, give or take a few depending on exactly how you count.

A length-n input takes between a constant number of steps (when the very first comparison mismatches) and O(n) steps (e.g., when S is a palindrome).

Measuring running time

We focus on the worst case. The running time of algorithm A is a function Time_A : ℤ⁺ → ℤ⁺, defined by

    Time_A(n) = max over instances I of size n of { # of steps A takes on I }.

For TwoFingers, the max over length-n inputs is attained on palindromes, so Time_TwoFingers(n) = O(n).
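To make the "max over instances" concrete, here is a small Python experiment (the names, the two-letter alphabet, and the step costs are the made-up ones from above, not the course's) that computes Time_TwoFingers(n) exactly for tiny n by trying every instance:

from itertools import product

def two_fingers_steps(s: str) -> int:
    """Step count for TwoFingers on s, using the made-up costs above
    (1 per assignment/test, 3 per character comparison)."""
    steps = 2                        # lo := 1, hi := n
    lo, hi = 0, len(s) - 1
    while True:
        steps += 1                   # while (lo < hi) test
        if not lo < hi:
            break
        steps += 3                   # the S[lo] != S[hi] comparison
        if s[lo] != s[hi]:
            return steps + 1         # return No
        steps += 2                   # lo := lo + 1, hi := hi - 1
        lo, hi = lo + 1, hi - 1
    return steps + 1                 # return Yes

def time_two_fingers(n: int) -> int:
    """Time_TwoFingers(n): worst case over all length-n strings on {a,b}."""
    return max(two_fingers_steps("".join(w)) for w in product("ab", repeat=n))

# The maximum is attained on palindromes and grows linearly in n.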

Running time example

Problem: CLOSEST-PAIR
Instance: a list of (at least 2) integers.
Input size: n is defined to be the length of the list.
Solution: the distance between the closest pair in the list.

Example: the instance [39, 18, 88, 100, 15] has solution 3 (= |18 - 15|).

A simple algorithm:

MyAlg(L)
closestSoFar := |L[1] - L[2]|
for i = 1...n
    for j = i+1...n
        diff := |L[i] - L[j]|
        if diff < closestSoFar then closestSoFar := diff
return closestSoFar

(Note the inner loop starts at j = i+1; if j also started at 1, the i = j iterations would compute diff = 0.)
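The same brute force as runnable Python, 0-indexed (the function name is just this transcription's):

def my_alg(L: list[int]) -> int:
    """Brute-force CLOSEST-PAIR: check every pair (i, j) with i < j."""
    assert len(L) >= 2
    closest_so_far = abs(L[0] - L[1])
    for i in range(len(L)):
        for j in range(i + 1, len(L)):
            closest_so_far = min(closest_so_far, abs(L[i] - L[j]))
    return closest_so_far

# my_alg([39, 18, 88, 100, 15]) -> 3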

Running time example

Theorem 1: MyAlg solves CLOSEST-PAIR.
Theorem 2: Time_MyAlg(n) = O(n²).
Remark: in fact, on every instance, MyAlg runs for Θ(n²) steps.

Why worst case?

Well, we're not dogmatic about it. Average (random) case, typical case, smoothed analysis: all interesting too. Pros of worst-case analysis:
- An ironclad guarantee.
- Matches our worst-case notion of an algorithm solving a problem.
- Natural in the context of cryptography (CS developed during WWII!).
- Hard to define what a "typical" instance is; random inputs are often not representative. E.g., a random graph is essentially never 3-colorable.

Question: Can you think of a faster algorithm?

Great Idea #4: When it comes to running time, focus on the big picture: how it scales as a function of n.

"On input I, algorithm A takes 2718 steps" is not a very meaningful statement.
"Time_A(n) = 18n² - 7n + 92" is not a very meaningful statement either:
- it's analogous to quoting too many significant digits;
- it overly depends on the definition of "elementary step";
- even at the level of machine code, we're still ignoring constant-factor time differences like processor vs. cache vs. disk speed.
"Time_A(n) = Θ(n²)" (running time proportional to n²) is the meaningful kind of statement.

Run time scaling

Θ(n):    doubling the input size → doubling the running time
Θ(n²):   2× input size → 4× run time
Θ(n³):   2× input size → 8× run time
Θ(nᶜ):   2× input size → constant-factor (2ᶜ×) run time
Θ(2ⁿ):   2× input size → running time squares

Let's do a log-log plot. Say 1 step = 1 μs.

[Log-log plot omitted: time in μs (up to ~10¹⁶) vs. input size n (up to ~10⁸), for running times n, n², n³, 2ⁿ, and n!, with horizontal reference lines at one sec, one hour, one year, and the age of the universe.]

Great Idea #5: Intrinsic complexity & beating brute force

Intrinsic complexity
Given a problem, e.g., PALINDROMES, we can ask about its intrinsic complexity: How fast is its fastest algorithm?

PALINDROMES:
We know an O(n) algorithm, TwoFingers. Could there be a faster one?

Theorem: Any algorithm solving PALINDROMES uses ≥ n - 1 steps.
(Assuming a fixed model of elementary steps, and doing the analysis up to O(·)'s.)

Proof: Suppose algorithm A solves PALINDROMES using ≤ n - 2 steps.
Let I be the string aaa⋯a (n times), which is a palindrome.
When A runs with input I, there must be distinct 1 ≤ j1, j2 ≤ n such that A never reads I[j1] or I[j2]. (Why?)
Let I′ be the same as I except that I′[j1] = b and I′[j2] = c.
When A runs on I′, it has the same behavior as when it runs on I. (Why?)
So A says Yes on I′, but I′ is not a palindrome (why?): a contradiction.

PALINDROMES:
We know an O(n) algorithm, and any algorithm needs ≥ n - 1 steps.
Conclusion: The intrinsic time complexity of PALINDROMES is linear; Θ(n) time is necessary and sufficient.

CLOSEST-PAIR:
We know an O(n²) brute-force algorithm. Is there a faster algorithm? Yes: use sorting! O(n log n) time.
Not-too-hard theorem: ≥ n steps are required. Is there an O(n) algorithm? Depends on your model of step-counting!
Intrinsic complexity: linear / quasilinear.
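A sketch of the sorting idea in Python: after sorting, the closest pair in value must be adjacent, so one linear scan suffices.

def closest_pair(L: list[int]) -> int:
    """O(n log n) CLOSEST-PAIR: sort, then scan adjacent differences."""
    S = sorted(L)                                          # O(n log n)
    return min(S[i + 1] - S[i] for i in range(len(S) - 1)) # O(n)

# closest_pair([39, 18, 88, 100, 15]) -> 3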


MULTIPLICATION:
In grade school you learn an O(n²) algorithm. Easy to show ≥ n steps are required.
Is there a faster algorithm? Yes! A much faster one.
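The slide doesn't say which faster algorithm it means; one classical answer is Karatsuba's algorithm (1960), which replaces four half-size multiplications with three, giving O(n^{log₂ 3}) ≈ O(n^{1.58}) time. A minimal Python sketch (function name and base-case threshold are editorial choices; real big-integer libraries do this and more internally):

def karatsuba(x: int, y: int) -> int:
    """Multiply via 3 half-size products instead of 4: O(n^1.58)."""
    if x < 16 or y < 16:                     # small: use built-in multiply
        return x * y
    m = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> m, x & ((1 << m) - 1)      # split x = xh·2^m + xl
    yh, yl = y >> m, y & ((1 << m) - 1)
    a = karatsuba(xh, yh)
    b = karatsuba(xl, yl)
    c = karatsuba(xh + xl, yh + yl) - a - b  # = xh·yl + xl·yh
    return (a << (2 * m)) + (c << m) + b

# karatsuba(15251, 252) -> 3843252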

HAMILTONIAN-CYCLE:
Instance: A connected graph.
Solution: Yes/No: is there a tour visiting each vertex exactly once?
Instance size: n = # of vertices.

Brute-force algorithm: try all tours, ≈ n! steps.
[Held-Karp '62]: dynamic programming, ≈ 2ⁿ steps.
[Björklund '10]: clever algebraic brute force, ≈ 1.657ⁿ steps.
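For concreteness, a sketch of the n!-ish brute force (an editorial transcription, usable only for tiny n):

from itertools import permutations

def hamiltonian_cycle(n: int, edges: set[tuple[int, int]]) -> bool:
    """Try all (n-1)! tours that start at vertex 0."""
    adj = {(u, v) for (u, v) in edges} | {(v, u) for (u, v) in edges}
    for tour in permutations(range(1, n)):
        walk = (0,) + tour + (0,)            # candidate closed tour
        if all((walk[i], walk[i + 1]) in adj for i in range(n)):
            return True
    return False

# A 4-cycle has a Hamiltonian cycle:
# hamiltonian_cycle(4, {(0, 1), (1, 2), (2, 3), (3, 0)}) -> True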

EULERIAN-CYCLE:
Instance: A connected graph.
Solution: Yes/No: is there a tour visiting each edge exactly once?
Instance size: n = # of vertices.

Algorithm E: Check if every vertex has even degree. If so, output Yes; else output No.
Euler's Theorem: Algorithm E solves EULERIAN-CYCLE.
Time: Time_E(n) = O(n²). In a reasonable adjacency-list format, Time_E(n) = O(n).
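A sketch of Algorithm E in Python, assuming the (connected) graph is handed to us as an adjacency list, so each degree can be read off as a list length:

def has_eulerian_cycle(adj: dict[int, list[int]]) -> bool:
    """Algorithm E: a connected graph has an Eulerian cycle iff
    every vertex has even degree (Euler's Theorem)."""
    return all(len(nbrs) % 2 == 0 for nbrs in adj.values())

# Triangle: every vertex has degree 2.
# has_eulerian_cycle({0: [1, 2], 1: [0, 2], 2: [0, 1]}) -> True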


Great Idea #6: Polynomial time

There is something truly magical about the notion of polynomial time, i.e., time O(nᶜ) for some constant c.

HAMILTONIAN-CYCLE: Seems to require exponential time. We have no good understanding of which graphs have Hamiltonian cycles.

EULERIAN-CYCLE: Polynomial time. Euler's Theorem explains Eulerian cycles.

There is an enormous efficiency chasm between polynomial and exponential time (recall the log-log plot). There is also an enormous understanding chasm between polynomial and exponential time.

Common progress paradigm for a problem:
Brute-force algorithm: exponential time
    ↓  (usually the magic happens here)
Algorithmic breakthrough: polynomial time
    ↓
Blood, sweat, and tears: quasilinear time


Does polynomial time imply efficient?

Θ(n):        Efficient (unless the constant is insane).
Θ(n log n):  Efficient.
Θ(n²):       Kind of efficient.
Θ(n³):       Barely efficient?
Θ(n¹⁰⁰):     Not efficient. But it almost never arises.

(The distinction depends on your exact model.)

Polynomial time
50 years of computer science experience shows it's a very compelling definition:
- A necessary first step towards truly efficient algorithms; associated with beating brute force.
- Very robust to the notion of what an elementary step is.
- Easy to work with: plug a poly-time subroutine into a poly-time algorithm and it's still poly-time. (Not true for quasilinear time.)
- Empirically, it seems that most natural problems with poly-time algorithms also have efficient-in-practice algorithms.
- It's a negatable benchmark: "not polynomial time" pretty much implies "not efficient".

Great Idea #7: Big-O notation

Definition (recall?)
Let f, g : ℕ → ℝ.
f(n) = O(g(n)) means, informally: "f(n) is roughly ≤ g(n), up to a constant factor (excluding small values of n)."
Formally, f(n) = O(g(n)) means: there exist C, n₀ such that |f(n)| ≤ C·|g(n)| for all n ≥ n₀.

Example: f(n) = 3n² + 10n + 30, g(n) = n².
Then f(n) = O(n²), because f(n) ≤ 4n² for all n ≥ 13: take C = 4, n₀ = 13. (Indeed 3n² + 10n + 30 ≤ 4n² iff n² ≥ 10n + 30, which holds once n ≥ 13.)

WARNING: In the expression f(n) = O(g(n)), the equals sign (=) does not really mean "equals". It's just tradition to write it that way. You can define O(·) with sets and write f ∈ O(g) if you really want.
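If you like, you can sanity-check a claimed (C, n₀) pair numerically; passing is evidence for the choice, not a proof (the helper name here is editorial):

def witnesses_big_o(f, g, C, n0, n_max=100_000):
    """Numerically check |f(n)| <= C*|g(n)| for n0 <= n < n_max."""
    return all(abs(f(n)) <= C * abs(g(n)) for n in range(n0, n_max))

# witnesses_big_o(lambda n: 3*n*n + 10*n + 30, lambda n: n*n, C=4, n0=13)
# -> True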

Big-Ω
If O(·) is like "roughly ≤", then Ω(·) is like "roughly ≥".
f(n) = Ω(g(n)) means: there exist C > 0 and n₀ such that |f(n)| ≥ C·|g(n)| for all n ≥ n₀.
Equivalently, it means g(n) = O(f(n)).

A stronger kind of statement: f(n) = 3n² + O(n).
This means: there exist C, n₀ such that |f(n)| ≤ 3n² + C·n for all n ≥ n₀.

Big-Θ
Θ(·) is like "roughly =".
f(n) = Θ(g(n)) means: there exist C₁, C₂ > 0 and n₀ such that C₁·|g(n)| ≤ |f(n)| ≤ C₂·|g(n)| for all n ≥ n₀.
Equivalently: f(n) = O(g(n)) AND g(n) = O(f(n)).

Some functions, each O(·) of the next:
1, log(log* n), log* n, log log n, log n, n/log n, n, n log n, n·(log n)·2^{O(log* n)}, n², n³, n^{O(1)}, n^{log n}, 2ⁿ, 3ⁿ, n!, nⁿ.
(log* n is the inverse of the tower function 2^{2^{⋰^2}}; n·(log n)·2^{O(log* n)} is the running time of the fastest known algorithm for MULTIPLICATION.)

Great Idea #8: If you want to be careful about how many steps an algorithm takes, then you'll have to be careful.


Running time fine details

We claimed every line of TwoFingers costs O(1) steps, for O(n) total. Let's look closer.

Detail 1: hi := hi - 1.
Initially hi = n, so hi uses about log n bits of storage. Now suppose n is a power of 2:
hi = 100000000000000
hi = 011111111111111
Didn't that decrement take ≈ log n steps, not 1? If so, the total is O(n log n), not O(n).

Detail 2: if S[lo] ≠ S[hi] then.
Initially lo = 1, hi = n. Does it take the disk / memory-pointer / bus ≈ n steps to get from S[1] to S[n]? If so, the total is O(n²).

Running time fine details

Whether the running time of TwoFingers is Θ(n) or Θ(n²) depends on your precise model.
In the RAM model (more realistic), it's O(n).
In the Turing Machine model (more elegant), it's Θ(n²).
Well... if you only care about polynomial time, the difference doesn't matter.

RAM model
A good combination of reality and simplicity. On input size n, assume data is stored in words / registers of size O(log n). (So you can store an array index in 1 word.) Indirect memory accesses, like S[i], cost 1 step. Any operation on words (like hi := hi - 1) costs 1 step.

Great Idea #9: The Strong Church-Turing Thesis

The Strong Church-Turing Thesis: All reasonable models of step-counting for algorithms are polynomially equivalent.

Suggested by decades of computer science experience. E.g., it is a straightforward theorem that Turing Machines can simulate the RAM model with at most polynomial slowdown, and vice versa.

Challenger from the 1970s: randomized computation. Give the model the ability to generate random bits. In light of research from the 1980s, we believe (but can't prove) that the Strong Church-Turing Thesis holds true even with randomized computation.

Challenger from the 1980s: quantum computation (Lecture 19). Allow qubits in quantum superposition. In light of research from the 1990s, we believe (but can't prove) that the Strong Church-Turing Thesis is NOT true.

Sometimes Great Ideas are wrong!

Study Guide

Definitions: problems, instances, algorithms, input size, running time.
Understand: run-time scaling; polynomial time.
How-to: count algorithm steps; use O() notation; prove Ω(n) time is necessary.
