
Lecture 15: NP, P, and NP-completeness

Lecturer: Shafi Goldwasser


(These notes are based on http://www.cs.berkeley.edu/~vazirani/s99cs170/notes/npc.pdf)
1 Tractable Problems
Most of the problems that we have studied thus far were solvable by algorithms with polynomial running times, i.e. on inputs of size n (that is, it takes n bits to write down the input), the running time of the algorithm is T(n) = O(n^k) for some fixed value of k.
For example,
spanning tree?: Given a weighted graph and an integer K, is there a tree that connects all nodes of the graph whose total weight is K or less?
matching?: Given a boys-girls compatibility graph, is there a complete matching between the boys and girls?
Feasible linear programming?: Given a matrix A, vectors b and c, and a value k, is there a real-valued vector x satisfying Ax ≤ b and c^T x ≥ k?
Note that we phrased all the above problems as Yes and No questions. When a problem is phrased in this way, we call it a decision problem. In actuality, you may be more interested in the more natural phrasing of the above problems as search problems:
spanning tree: Given a weighted graph and a weight K, find a tree that connects all nodes of the graph whose total weight is K or less.
matching: Given a boys-girls compatibility graph, find a complete (or perfect) matching between all the boys and girls, if one exists.
linear programming: Given a matrix A and vectors b and c, find a real vector x ≥ 0 satisfying Ax ≤ b and c^T x ≥ k.
or even the optimization problem versions.
minimum spanning tree: Given a weighted graph, find a tree that connects all nodes of the graph whose total weight is minimal.
maximum matching: Given a boys-girls compatibility graph, find a matching between as many boys and girls as possible.
linear programming with an objective function: Given an m×n matrix A, an m-vector b, and real numbers c_1, ..., c_n, find real numbers x_1, ..., x_n ≥ 0 satisfying Ax ≤ b and minimizing Σ_i c_i x_i.
Obviously, if you can solve the search problem variant efficiently, you can also solve the decision problem variant efficiently. Conversely, if you cannot solve the decision problem efficiently (it is intractable), you cannot solve the search problem efficiently either. Since in this part of the course we are actually interested in talking about intractability, it will suffice to talk about the decision variants of problems.

Indeed, in complexity theory the convention is to phrase problems as decision problems, mostly for historical reasons as well as the interest in classifying how difficult they are. We will stick to this convention in the next 3 lectures.
We will let P denote the class of all decision problems whose solution (Yes/No answer) can be computed in polynomial time, i.e. in O(n^k) steps for some fixed k, where k can be small or large as long as it is not a function of n. We consider all such problems efficiently solvable, or tractable. We call problems intractable if they are not solvable in polynomial time. Arguably, this is a very liberal notion of tractability (after all, is n^100 efficient in any way for large n?), but when we deem a problem intractable it is severely so.

Notation: let A be a decision problem. We write x ∈ A if and only if A(x) = Yes.
2 Intractable Problems?
Unfortunately, the design of efficient algorithms is not always so simple. In fact, there are many problems quite similar to the ones which are in P for which no polynomial-time algorithm is known, and worse, for some of them it seems that no polynomial-time algorithm exists.

Let us see some examples. For each one listed below, the only algorithms known run much longer than any polynomial in the length of the input (in the worst case). Indeed, in certain cases the best algorithm we know runs in exponential time in the worst case.
2.1 Examples from Graph theory, logic, combinatorics, optimization
Example 1: Traveling salesman problem (TSP): Given a weighted graph and an integer K, is there a cycle that visits all nodes of the graph whose total weight is K or less?

The best known algorithm for this decision problem runs in exponential time (obviously, this is true for the corresponding search and optimization variants as well), although it is easy to verify whether a given path is a cycle of weight at most K. Notice the similarity of TSP to the minimum spanning tree problem defined earlier. Yet MST can be solved in O(E log V) time, while for TSP the best we can do is O(2^V).
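To see where the O(2^V) figure comes from, here is a minimal sketch of the classical Held-Karp dynamic program for the TSP decision question. It assumes the graph is given as a complete n×n weight matrix w (missing edges can be modeled with a very large weight); the encoding and function name are my own, not from the notes.

    from itertools import combinations

    def tsp_decision(w, K):
        # Held-Karp dynamic program: best[(S, j)] is the cheapest path that
        # starts at vertex 0, visits exactly the vertex set S (a bitmask
        # containing 0 and j), and ends at j.
        n = len(w)
        best = {(1, 0): 0}                                    # bitmask 1 = {vertex 0}
        for size in range(2, n + 1):
            for rest in combinations(range(1, n), size - 1):  # vertices besides 0
                S = 1
                for v in rest:
                    S |= 1 << v
                for j in rest:
                    prev = S ^ (1 << j)                        # S with j removed
                    costs = [best[(prev, k)] + w[k][j]
                             for k in range(n)
                             if (prev >> k) & 1 and (prev, k) in best]
                    if costs:
                        best[(S, j)] = min(costs)
        full = (1 << n) - 1
        tours = [best[(full, j)] + w[j][0] for j in range(1, n) if (full, j) in best]
        return bool(tours) and min(tours) <= K

The table has at most n·2^n entries and each takes O(n) time to fill, giving the O(n^2 2^n) bound: exponential, but far better than the roughly n! cost of trying every tour.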
Example 2: Hamiltonian graph (HAM): Given a directed graph, is there a closed path that visits each node of the graph exactly once? Again, the best we can do is run an exponential-time algorithm that searches over all possible permutations of the vertices and checks whether any of them forms a Hamiltonian cycle. Notice the similarity to testing whether a graph is an Euler graph: a directed graph for which there is a closed path that visits each edge of the graph exactly once. The latter is easy to solve, since a graph is Eulerian if and only if it is strongly connected and each node has equal in-degree and out-degree.
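That easy test runs in linear time. Here is a minimal sketch, assuming the directed graph is given as an edge list over vertices 0..n-1 (an encoding chosen here for illustration):

    from collections import defaultdict

    def is_eulerian(n, edges):
        # Euler test from the text: every vertex has in-degree equal to
        # out-degree, and the graph is strongly connected.
        out_adj, in_adj = defaultdict(list), defaultdict(list)
        indeg, outdeg = [0] * n, [0] * n
        for u, v in edges:
            out_adj[u].append(v)
            in_adj[v].append(u)
            outdeg[u] += 1
            indeg[v] += 1
        if indeg != outdeg:
            return False

        def reaches_all(adj):
            # iterative DFS from vertex 0
            seen, stack = {0}, [0]
            while stack:
                u = stack.pop()
                for nxt in adj[u]:
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
            return len(seen) == n

        # strongly connected <=> every vertex is reachable from vertex 0
        # both in the graph and in the reversed graph
        return reaches_all(out_adj) and reaches_all(in_adj)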
Example 3: Vertex Cover: Given an undirected graph and a number k, is there a vertex cover: a subset S of the nodes of size at most k such that every edge of the graph has at least one endpoint in S?
Example 4: Clique: Given an undirected graph and an integer k, is there a clique (i.e., a fully connected subset of nodes) of size k or larger in the graph?
Note that whereas the first three examples naturally correspond to minimization problems, the fourth corresponds to a maximization problem.
Example 5: Integer Programming: Given an integer matrix A and an integer vector b, does there exist an integer vector x such that Ax ≤ b?
Note that we just spent a lecture on linear programming, which is in P. What is the difference? Recall that in linear programming we are given an m×n matrix A and an m-vector b, and ask whether there are real numbers x_1, ..., x_n ≥ 0 satisfying Ax ≤ b.
The additional requirement that the solution consist of integers seems to make the problem intractable. Rounding the solutions of the linear program up or down is no help: even deciding whether there is a rounding that remains feasible is a hard problem.
The next example is from logic. A circuit is made of gates and input variables. We consider circuits without feedback (essentially directed acyclic graphs). A gate can be an AND, OR or NOT gate. The AND and OR gates have fan-in 2 and unbounded fan-out. The NOT gate has fan-in 1 and unbounded fan-out. Input variables can take on the values 0 (false) and 1 (true). A Boolean circuit has one special output gate and naturally represents a Boolean function: if x_1, ..., x_n are the values of the input variables, C(x_1, ..., x_n) is the output value.
Example 6: circuit SAT (cSAT): Given a description of a Boolean circuit C, specified by a sequence of gates and variables, is there a way to set values for the input variables x_1, ..., x_n so that the value of the output gate is 1 (true)? (In this case we write C(x) = 1 (True) and say that the circuit C is satisfiable.)
Note that we ask here whether there is a set of inputs (a way to assign True or False to each x_i) such that C(x) = True. We do not ask what the values of x_i are that make C(x) True.
Note that it is easy to evaluate a circuit in polynomial time if you are given all its inputs: simply evaluate all the gates bottom-up. However, we do not know how to find out which input, if any, would make its output True. The best that is known to date is to try all input combinations for the unset inputs, and this is an exponential algorithm.
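The easy direction, evaluating a circuit bottom-up once all inputs are set, can be sketched in a few lines. The gate encoding below (a topologically ordered list of (name, type, arguments) triples, with the last gate taken as the output gate) is an assumption of mine, not part of the notes:

    def evaluate_circuit(gates, inputs):
        # Bottom-up evaluation of a feedback-free circuit.  `inputs` maps
        # input-variable names to 0/1; `gates` is a list of
        # (name, type, arguments) triples given in topological order.
        values = dict(inputs)
        for name, op, args in gates:
            a = [values[arg] for arg in args]
            if op == "AND":
                values[name] = a[0] & a[1]
            elif op == "OR":
                values[name] = a[0] | a[1]
            elif op == "NOT":
                values[name] = 1 - a[0]
            else:
                raise ValueError("unknown gate type: " + op)
        return values[gates[-1][0]]

    # C(x1, x2) = (x1 AND x2) OR (NOT x1)
    gates = [("g1", "AND", ["x1", "x2"]),
             ("g2", "NOT", ["x1"]),
             ("out", "OR", ["g1", "g2"])]
    print(evaluate_circuit(gates, {"x1": 1, "x2": 0}))   # prints 0

Each gate is looked at once, so the evaluation is linear in the size of the circuit; the hard part is only finding a satisfying setting of the free inputs.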
Example 7: k-SAT: Given a Boolean formula in conjunctive normal form such that there are at most k literals per clause, is there a satisfying truth assignment? When k is not specified, there is no bound on the number of literals per clause.
Example 8: knapsack: Given integers a_1, ..., a_n, and another integer K in binary, is there a subset of these integers that sums exactly to K?
Example 9: 3-Dimensional Matching: Let X, Y, and Z be finite, disjoint sets, and let T be a subset of X × Y × Z. That is, T consists of triples (x, y, z) such that x ∈ X, y ∈ Y, and z ∈ Z. Now M ⊆ T is a 3-dimensional matching if the following holds: for any two distinct triples (x1, y1, z1) ∈ M and (x2, y2, z2) ∈ M, we have x1 ≠ x2, y1 ≠ y2, and z1 ≠ z2. The problem is: given T and k, is there a 3-dimensional matching M ⊆ T of size at least k?
Some comments are in order.
special cases: Sometimes special cases of very hard problems can still be in P, even though the general case is not. For example, a special case of k-SAT called 2SAT (Given a Boolean formula in conjunctive normal form with at most two literals per clause, is there a satisfying truth assignment?) is very simply in P. Try it. In contrast, 3SAT, where there are at most 3 literals per clause, is not known to be in P, and the best algorithm for it runs in exponential time, essentially trying all possible assignments.
the importance of representation: Sometimes if the inputs to the problem (or even part of the inputs) are specified not in binary but, say, in unary, it becomes easy to solve the problem in polynomial time. For example, consider the above knapsack problem where the bound K is specified in unary: Given integers a_1, ..., a_n, and another integer K in unary, is there a subset of these integers that sums exactly to K? This is solvable in polynomial time. Unary knapsack is in P simply because the input is represented so wastefully, with about n + K bits, so that the O(nK) dynamic programming algorithm (sketched right after these comments), which would be exponential if K were represented in binary, is bounded by a polynomial in the length of the input.
Fixed vs. Growing: Sometimes when your problem comes up, a portion of your input is really fixed (i.e., the same for all instances) and should not be part of the input specification. This fixing, again, may change the problem from intractable to tractable. For example, in the clique problem, if k is fixed (e.g., say k = 3 and you want to know if there is a triangle), it is possible to try all possible k-subsets of vertices and see whether any of them forms a clique in time O(n^k). Similarly, integer programming with a fixed number of variables can be done in polynomial time; this is much less trivial and uses one of the greatest algorithms of the last 30 years, the L^3 (LLL) algorithm by Lenstra, Lenstra, and Lovasz.
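Here is the O(nK) dynamic program mentioned in the representation comment above, as a minimal sketch; it assumes the a_i are nonnegative integers, and the encoding and function name are my own.

    def subset_sum(a, K):
        # reachable[s] is True iff some subset of the integers processed so
        # far sums to exactly s.  Time and space O(n*K): polynomial in the
        # input length only if K is written in unary.
        reachable = [False] * (K + 1)
        reachable[0] = True                  # the empty subset sums to 0
        for x in a:
            for s in range(K, x - 1, -1):    # downwards, so each x is used at most once
                if reachable[s - x]:
                    reachable[s] = True
        return reachable[K]

    print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True, e.g. 4 + 5

When K is given in binary, the same table has K + 1 columns, which is exponential in the length of K's encoding, so this algorithm does not put binary knapsack in P.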
There are some problems for which we do not have polynomial-time algorithms but for which we can do better than exponential time. We mention two famous examples here.

Factoring Problem: Given an n-bit integer N, find a divisor d of N such that 2 ≤ d < N. The decision version is much less natural but can be defined. Factoring Problem (decision version): Given an n-bit integer N and a bound K, is there a divisor d of N such that 2 ≤ d ≤ K?
It is known how to answer the factoring problem using an algorithm which runs in time O(e^{O(n^{1/3} (log n)^{2/3})}). This is sub-exponential time. Furthermore, we know that quantum computers can solve factoring in polynomial time, whereas we do not know any quantum algorithms for the problems above (TSP, SAT, HAM, etc.).
Graph Isomorphism Problem: Given two graphs G1 and G2, decide if there is a mapping f from the vertices of G1 to the vertices of G2 such that there is an edge (u, v) in G1 if and only if there is an edge (f(u), f(v)) in G2.

Again, the best known algorithm for graph isomorphism runs in time O(2^{√(n log n)}), which is strictly better than exponential time.
3 Certificates and the Class NP
As hard as they are, there is something nice and easy about all the examples above, including the TSP, Clique, SAT, cSAT, factoring, GI, Hamiltonian cycle, and knapsack problems. In all of them, the solution to the search version is short (polynomial size) and it is easy (in polynomial time) to verify that the solution is correct. Said differently, when the answer to the decision problem is Yes on an input x, there exists a short proof, or succinct certificate (of polynomial size), of this fact. If we had magic powers to guess this certificate, we could verify it in polynomial time. Often this certificate is simply the solution to the search version of the decision problem. When it is clear from the context, we will interchangeably refer to it as the certificate or the solution.
Let's exemplify this notion of a certificate with a few examples and then define it formally.
TSP: In the case of the traveling salesman problem, on input (G, K) with n vertices, the certificate is a tour v_1, ..., v_n, v_{n+1} in the graph (with v_{n+1} = v_1) whose total cost is at most K. To verify the certificate, one checks that (1) v_1 = v_{n+1}, (2) for every vertex i = 1...n there is a j with v_j = i, and (3) Σ_i weight(v_i, v_{i+1}) ≤ K. All this can be done in time O(V + E).
SAT: For example, a certificate that a formula φ with n variables and m clauses is in SAT is a truth assignment to the variables which makes it true. For n variables, this certificate is of size n, and it takes O(nm) time to verify it by evaluating the formula on this assignment.
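As a sketch of such a verifier, assume the CNF formula is encoded as lists of signed variable indices (i for x_i, -i for its negation); this encoding choice is mine, not from the notes.

    def verify_sat_certificate(clauses, assignment):
        # `clauses`: CNF formula as a list of clauses, each a list of nonzero
        # integers (i stands for x_i, -i for its negation).
        # `assignment`: dict mapping each variable index to True/False.
        # Checking every literal of every clause takes O(nm) time.
        for clause in clauses:
            if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
                return False          # this clause has no true literal
        return True                   # every clause is satisfied

    # (x1 or not x2) and (x2 or x3) with x1 = x2 = False, x3 = True
    print(verify_sat_certificate([[1, -2], [2, 3]],
                                 {1: False, 2: False, 3: True}))   # True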
HAM: For example, in the case of the Hamiltonian cycle problem, the certificate (or solution) would be a closed path that visits each node once, and to verify it one needs to check that (1) all vertices are visited, (2) no vertex is visited more than once, and (3) the first and last vertices are the same.
Factoring: In the case of factoring, the certificate would be a divisor which is at most the bound. Namely, for an input pair (N, K), the certificate is a divisor d of N such that 2 ≤ d ≤ K. To verify it we must check that (1) d divides N and (2) 2 ≤ d ≤ K.
These certificates have the following properties:

Every Yes input to the problem has at least one certificate (possibly many), and each No input has none.

They are succinct: in each case the certificate is bounded in size by a fixed polynomial in the length of the input, e.g., the truth assignment is of length n, which is linear in the size of the formula.

They are easily verifiable: in each case there is a polynomial-time algorithm which takes as inputs the input of the problem and the alleged certificate, and checks whether the certificate is a valid one for this input (in the case of the Hamiltonian cycle problem, whether the given closed path indeed visits every node once, and so on).
We call the class of decision problems which have such certificates NP. (The name stands for nondeterministic polynomial time, meaning that all problems in NP can be solved in polynomial time by a nondeterministic computer that starts by guessing the right certificate and then checks it.)
NP: the class of decision problems A for which there exists a verification algorithm V_A and a constant c > 0 such that, for x an input of A (where n = |x|):

1. V_A(x, y) runs in time polynomial in n.

2. A(x) = Yes if and only if there exists a y such that |y| < n^c and V_A(x, y) = True.
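As a sanity check on the definition, a verifier V can always be turned into a (brute-force, exponential-time) decision procedure: simply try every candidate certificate. The string encoding of inputs and certificates below is an assumption made for illustration.

    from itertools import product

    def decide_by_brute_force(x, V, c):
        # x is a Yes input iff some certificate y with |y| < n^c makes the
        # verifier V(x, y) accept.  Trying every binary string y is, of
        # course, exponential time, not polynomial.
        n = len(x)
        for length in range(int(n ** c)):
            for bits in product("01", repeat=length):
                if V(x, "".join(bits)):
                    return True        # found a valid certificate
        return False                   # no certificate of legal size works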
Remarks:

Remark 1: NP DOES NOT MEAN "not polynomial time"!!!

EXP: decision problems whose solution takes exponential time
P: decision problems whose solution takes polynomial time
NP: decision problems with efficient verification algorithms
Remark 2: Not all problems have short and easily verifiable certificates. Consider, for example, the non-SAT problem: given a formula φ, how would you prove quickly (even if you were a magician) that there is no truth assignment that makes the formula true? Or graph non-isomorphism: how would you prove that there is NO mapping between all the vertices of the first graph and those of the second graph which respects the edge relationship? You could obviously enumerate all the candidates, but that would take not polynomial but exponential time. Note that the obvious certificates and solutions for the above problems are of exponential size; for example, for non-satisfiability, if you cycle through all the possible assignments to a formula and none of them satisfies it, then you know it is not satisfiable. Even worse, some problems cannot be solved at all, even in principle (they are called undecidable). So it is not a matter of inefficiency but of impossibility. An example of such a problem is the halting problem: given a program P and an input x, decide whether P(x) will terminate. For a proof, take 6.045.

How about non-compositeness? Or primality: is there a short proof that a number is prime (rather than composite)? It turns out that yes, but it is non-trivial; this was shown by Pratt in 1975.
Primality testing is naturally defined as a decision problem.

Primes: on input p, output Yes if p is prime and No otherwise.

Theorem: Primes is in NP.
This follows from the existence of a certificate for the fact that |Z*_p| = p - 1, which is true only for primes. What is used is the fact that Z*_p is cyclic, namely that it has a generator: an element of order p - 1. In other words, Z*_p = {g^i mod p : i = 1, ..., p - 1}. The certificate is: (g, a proof that g has order p - 1).

Further reading: what is a short proof that g has order p - 1?
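For the curious: given the prime factorization of p - 1, the claim that g has order p - 1 can be checked with a few modular exponentiations (Pratt's full certificate also certifies each prime factor recursively; that part is omitted here). A minimal sketch, with the factorization assumed to be supplied as part of the certificate:

    def has_order_p_minus_1(p, g, prime_factors):
        # g generates Z_p^* (has order p-1) iff g^(p-1) = 1 (mod p) and
        # g^((p-1)/q) != 1 (mod p) for every prime factor q of p-1.
        # The factor list is assumed to be supplied (and certified) separately.
        if pow(g, p - 1, p) != 1:
            return False
        return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors)

    # 2 generates Z_11^*: 2^10 = 1, while 2^5 = 10 and 2^2 = 4 (mod 11)
    print(has_order_p_minus_1(11, 2, [2, 5]))   # True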
4 P vs. NP problem
Notice that P is a subset of NP. Why? Intuitively, because checking the validity of a solution is easier than coming up with a solution. For example, it is easier to get jokes than to be a comedian.

Formally, if a problem A is in P, there exists a polynomial-time algorithm that solves A. Define V_A(x, y) = A(x), ignoring y. Obviously, the output is True if and only if A(x) = Yes. So here the execution of A itself plays the role of the certificate. (The recent result that primality is in P thus yields another proof that the PRIMES problem is in NP.)
A big question is whether P = NP.

Intellectually speaking, the P = NP? question is very appealing. It asks whether it is as easy to come up with a solution as it is to check that a solution is correct. If this were the case, it would mean that if we show that a problem is in NP (which is generally simple to do, compared to designing efficient algorithms to find solutions), it is also in P. Assuming the proof were constructive, this would provide an easy recipe to solve thousands of problems currently considered intractable. On the other hand, it would mean that modern cryptography, which is based on the assumption that P is different from NP, is impossible. This question of P vs. NP is truly the million-dollar question of computer science (see http://www.claymath.org/prizeproblems). After four decades of research, and thousands of papers, everyone seems to believe that P ≠ NP, but no one has proven it.
What we can do is identify some problems in NP which seem to be harder than (or at least as hard as) all the rest, in the sense that if we could solve one of these special problems in polynomial time, then we could solve every problem in NP in polynomial time. We call these the NP-complete problems. To answer the question P = NP?, you could concentrate all your efforts on answering whether an NP-complete problem has a polynomial-time algorithm.

The first problem shown to be NP-complete, by Steve Cook in 1971, was cSAT.

Cook's Theorem: Every problem in NP can be reduced in polynomial time to circuit SAT.
What does it mean to be reduced in polynomial time? Informally, we say that A is polynomial-time reducible to B if you can turn, in polynomial time, an instance of A into an instance of B. It follows that a polynomial-time algorithm for B can be used to obtain a polynomial-time algorithm for A.
Let's talk about reductions more formally.

Reductions between Problems

Formally, let A and B be two decision problems. A poly-time reduction from A to B is a polynomial-time algorithm R which transforms inputs of A into equivalent inputs of B. That is, given an input x to problem A, R will produce an input R(x) to problem B, such that x is a Yes input of A if and only if R(x) is a Yes input of B. Notation: A ≤_p B (A is reducible to B).

(Strictly speaking, the above defines a special case called a Karp reduction. A more general type of reduction with the same underlying meaning is called a Cook reduction: in a Cook reduction from A to B there exists a pair of polynomial-time algorithms R and D such that on input x to problem A, R computes a sequence y_1, ..., y_m of inputs to problem B, and D(x, B(y_1), ..., B(y_m)) = A(x). In this class we will use the term polynomial-time reduction to mean a Karp reduction.)
Obviously, a reduction from A to B, together with a polynomial-time algorithm for B, gives a polynomial-time algorithm for A. For any input x of A of size n, the reduction R takes time p(n) (a polynomial) to produce an equivalent input R(x) of B. If we now submit this input R(x), which is at most p(n) in length, to the assumed algorithm for B, running in time q(m) on inputs of size m, where q is another polynomial, then we get the right answer for x within at most p(n) + q(p(n)) steps, also a polynomial.
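As a concrete illustration of such a transformation (a standard textbook example, not taken from these notes), here is a sketch of a Karp reduction from the undirected variant of the Hamiltonian cycle problem to the TSP decision problem of Example 1: edges of G get weight 1, non-edges weight 2, and the threshold is K = n, so G has a Hamiltonian cycle if and only if the produced instance has a tour of weight at most n.

    def ham_to_tsp(n, edges):
        # Map an undirected graph on vertices 0..n-1 to a TSP instance (w, K):
        # edges of G cost 1, non-edges cost 2, threshold K = n.  A tour of
        # weight <= n must consist of n weight-1 edges, i.e. it is exactly a
        # Hamiltonian cycle of G, and conversely.
        w = [[2] * n for _ in range(n)]
        for i in range(n):
            w[i][i] = 0
        for u, v in edges:
            w[u][v] = w[v][u] = 1
        return w, n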
Showing NP-completeness of a problem B by reducing all NP problems to B is different from the way we have used reductions before. Earlier in the course we used reductions to establish that problems are easy (e.g., from matching to max-flow). In this part of the class we shall use reductions in a more sophisticated and counterintuitive way, in order to prove that certain problems are hard. If we reduce A to B, we are establishing that, give or take a polynomial, A is no harder than B. If we know B is easy, this establishes that A is easy. If we know A is hard, this establishes that B is hard.

Going back to Cook's theorem: since all problems in NP can be reduced to cSAT, this means that all of them are no harder than cSAT.