In
computational
complexity theory, the
complexity
class NP-complete (abbreviated
NP-C or
NPC), is a class of
problems having two properties:
- Any given solution to the problem can be verified
quickly (in polynomial time); the
set of problems with this property is called NP (nondeterministic polynomial time).
- If the problem can be solved quickly (in polynomial
time), then so can every problem in NP.
Although any given solution to such a problem can be verified
quickly, there is no known efficient way to locate a solution in
the first place; indeed, the most notable characteristic of
NP-complete problems is that no fast solution to them is known.
That is, the time required to solve the problem using any currently
known
algorithm increases very quickly as
the size of the problem grows. As a result, the time required to
solve even moderately large versions of many of these problems
easily reaches into the billions or trillions of years, using any
amount of computing power available today. As a consequence,
determining whether or not it is possible to solve these problems
quickly is one of the principal
unsolved problems in
computer science today.
While a method for computing the solutions to NP-complete problems
using a reasonable amount of time remains undiscovered,
computer scientists and
programmer still frequently encounter
NP-complete problems. An expert programmer should be able to
recognize an NP-complete problem so that he or she does not
unknowingly waste time trying to solve a problem which so far has
eluded generations of computer scientists. Instead, NP-complete
problems are often addressed by using
approximation algorithms.
Formal overview
NP-complete is a
subset of
NP, the set of all
decision problems whose solutions can be
verified in polynomial time;
NP may be equivalently
defined as the set of decision problems that can be solved in
polynomial time on a
nondeterministic Turing
machine. A problem
p in NP is also in NPC
if and only if every other problem in NP can
be transformed into
p in polynomial time. NP-complete can
also be used as an adjective: problems in the class NP-complete are
known as NP-complete problems.
NP-complete problems are studied because the ability to quickly
verify solutions to a problem (NP) seems to correlate with the
ability to quickly solve that problem (
P). It is not known whether every problem in
NP can be quickly solved—this is called the
P = NP problem. But if
any single
problem in NP-complete can be solved quickly, then
every
problem in NP can also be quickly solved, because the
definition of an NP-complete problem states that every problem in
NP must be quickly reducible to every problem in NP-complete (that
is, it can be reduced in polynomial time). Because of this, it is
often said that the NP-complete problems are
harder or
more difficult than NP problems in general.
Formal definition of NP-completeness
A decision problem
C is NP-complete if:
- C is in NP, and
- Every problem in NP is reducible to C in polynomial
time.
C can be shown to be in NP by demonstrating that a
candidate solution to
C can be verified in polynomial
time.
A problem
K is reducible to
C if there is a
polynomial-time many-one reduction, a
deterministic algorithm which
transforms any instance
k ∈
K into an instance
c ∈
C, such that the answer to
c is
yes if and only if the answer to
k is
yes. To prove that an NP problem
C is in fact an
NP-complete problem it is sufficient to show that an already known
NP-complete problem reduces to
C.
Note that a problem satisfying condition 2 is said to be
NP-hard, whether or not it satisfies condition
1.
A consequence of this definition is that if we had a polynomial
time algorithm (on a
UTM,
or any other
Turing-equivalent
abstract machine) for
C,
we could solve all problems in NP in polynomial time.
Background
The concept of
NP-complete was introduced in 1971 by
Stephen Cook in a paper entitled
The complexity of theorem-proving
procedures on pages 151-158 of the
Proceedings of the 3rd
Annual ACM Symposium on Theory of Computing, though the term
NP-complete did not appear anywhere in his paper. At that
computer science conference, there
was a fierce debate among the computer scientists about whether
NP-complete problems could be solved in polynomial time on a
deterministic Turing machine.
John
Hopcroft brought everyone at the conference to a consensus that
the question of whether NP-complete problems are
solvable in polynomial time should be
put off to be solved at some later date, since nobody had any
formal proofs for their claims one way or the other. This is known
as the question of whether P = NP.
Nobody has yet been able to determine conclusively whether
NP-complete problems are in fact solvable in polynomial time,
making this one of the great
unsolved problems of
mathematics.
The Clay Mathematics Institute is offering a US$1 million reward to anyone who has
a formal proof that P = NP or that P ≠ NP.
In the celebrated
Cook-Levin theorem
(independently proved by
Leonid Levin),
Cook proved that the
Boolean satisfiability
problem is NP-complete (a simpler, but still highly technical
proof
of this is available). In 1972,
Richard
Karp proved that several other problems were also NP-complete
(see
Karp's 21
NP-complete problems); thus there is a class of NP-complete
problems (besides the Boolean satisfiability problem). Since Cook's
original results, thousands of other problems have been shown to be
NP-complete by reductions from other problems previously shown to
be NP-complete; many of these problems are collected in
Garey and
Johnson's
1979 book
Computers and Intractability: A Guide to
NP-Completeness.
NP-complete problems
An interesting example is the
graph isomorphism problem, the
graph theory problem of determining
whether a
graph isomorphism exists
between two graphs. Two graphs are
isomorphic if one can be
transformed into the other simply by renaming
vertices. Consider these two
problems:
- Graph Isomorphism: Is graph G_{1} isomorphic to graph
G_{2}?
- Subgraph Isomorphism: Is graph G_{1} isomorphic to a
subgraph of graph G_{2}?
The Subgraph Isomorphism problem is NP-complete. The graph
isomorphism problem is suspected to be neither in P nor
NP-complete, though it
is obviously in NP. This is an
example of a problem that is thought to be
hard,
but isn't thought to be NP-complete.
The easiest way to prove that some new problem is NP-complete is
first to prove that it is in NP, and then to reduce some known
NP-complete problem to it. Therefore, it is useful to know a
variety of NP-complete problems. The list below contains some
well-known problems that are NP-complete when expressed as decision
problems.
To the right is a diagram of some of the problems and the
reductions typically used to prove
their NP-completeness. In this diagram, an arrow from one problem
to another indicates the direction of the reduction. Note that this
diagram is misleading as a description of the mathematical
relationship between these problems, as there exists a
polynomial-time reduction between any two NP-complete problems; but
it indicates where demonstrating this polynomial-time reduction has
been easiest.
There is often only a small difference between a problem in P and
an NP-complete problem. For example, the
3-satisfiability problem, a restriction of
the boolean satisfiability problem, remains NP-complete, whereas
the slightly more restricted
2-satisfiability problem is in P
(specifically,
NL-complete), and the
slightly more general max. 2-sat. problem is again NP-complete.
Determining whether a graph can be colored with 2 colors is in P,
but with 3 colors is NP-complete, even when restricted to
planar graphs. Determining if a graph is a
cycle or is
bipartite is very easy (in
L), but finding a maximum bipartite or a
maximum cycle subgraph is NP-complete. A solution of the
knapsack problem within any fixed
percentage of the optimal solution can be computed in polynomial
time, but finding the optimal solution is NP-complete.
Solving NP-complete problems
At present, all known algorithms for NP-complete problems require
time that is
superpolynomial in the
input size, and it is unknown whether there are any faster
algorithms.
The following techniques can be applied to solve computational
problems in general, and they often give rise to substantially
faster algorithms:
- Approximation: Instead
of searching for an optimal solution, search for an "almost"
optimal one.
- Randomization: Use
randomness to get a faster average running
time, and allow the algorithm to fail with some small
probability. See Monte Carlo
method.
- Restriction: By restricting the structure of the input (e.g.,
to planar graphs), faster algorithms are usually possible.
- Parameterization: Often
there are fast algorithms if certain parameters of the input are
fixed.
- Heuristic: An
algorithm that works "reasonably well" in many cases, but for which
there is no proof that it is both always fast and always produces a
good result. Metaheuristic approaches
are often used.
One example of a heuristic algorithm is a suboptimal O(
n
log
n)
greedy coloring
algorithm used for
graph
coloring during the
register
allocation phase of some compilers, a technique called
graph-coloring global
register allocation. Each vertex is a variable, edges are drawn
between variables which are being used at the same time, and colors
indicate the register assigned to each variable. Because most
RISC machines have a fairly large number of
general-purpose registers, even a heuristic approach is effective
for this application.
Completeness under different types of reduction
In the definition of NP-complete given above, the term
reduction was used in the technical meaning of a
polynomial-time many-one reduction.
Another type of reduction is polynomial-time Turing reduction. A
problem
X is polynomial-time Turing-reducible to a problem
Y if, given a subroutine that solves
Y in
polynomial time, one could write a program that calls this
subroutine and solves
X in polynomial time. This contrasts
with many-one reducibility, which has the restriction that the
program can only call the subroutine once, and the return value of
the subroutine must be the return value of the program.
If one defines the analogue to NP-complete with Turing reductions
instead of many-one reductions, the resulting set of problems won't
be smaller than NP-complete; it is an open question whether it will
be any larger. If the two concepts were the same, then it would
follow that NP =
co-NP. This holds because by
their definition the classes of NP-complete and co-NP-complete
problems under Turing reductions are the same and because these
classes are both supersets of the same classes defined with
many-one reductions. So if both definitions of NP-completeness are
equal then there is a co-NP-complete problem (under both
definitions) such as for example the complement of the boolean
satisfiability problem that is also NP-complete (under both
definitions). This implies that NP = co-NP as is shown in the proof
in the co-NP article. Although whether NP = co-NP is an open
question it is considered unlikely and therefore it is also
unlikely that the two definitions of NP-completeness are
equivalent.
Another type of reduction that is also often used to define
NP-completeness is the
logarithmic-space many-one
reduction which is a many-one reduction that can be computed
with only a logarithmic amount of space. Since every computation
that can be done in
logarithmic
space can also be done in polynomial time it follows that if
there is a logarithmic-space many-one reduction then there is also
a polynomial-time many-one reduction. This type of reduction is
more refined than the more usual polynomial-time many-one
reductions and it allows us to distinguish more classes such as
P-complete. Whether under these types of
reductions the definition of NP-complete changes is still an open
problem.
See also
References
Further reading
- Scott Aaronson, NP-complete
Problems and Physical Reality, ACM SIGACT News, Vol. 36, No. 1. (March 2005), pp.
30-52.
- Lance Fortnow, The status of the P versus NP problem,
Commun. ACM,
Vol. 52, No. 9. (2009), pp. 78-86.