Automated theorem proving (
ATP)
or
automated deduction, currently the most
well-developed subfield of
automated reasoning (AR), is the
proving of
mathematical theorems by a
computer program.
Decidability of the problem
Depending on the underlying logic, the problem of deciding the
validity of a formula varies from trivial to impossible. For the
frequent case of
propositional
logic, the problem is decidable but
NP-complete, and hence only exponential-time
algorithms are believed to exist for general proof tasks. For a
first order predicate calculus,
with no ("proper") axioms,
Gödel's completeness
theorem states that the theorems (provable statements) are
exactly the logically valid
well-formed formulas, so identifying
valid formulas is
recursively
enumerable: given unbounded resources, any valid formula can
eventually be proven.
However,
invalid formulas (those that are
not
entailed by a given theory), cannot always be recognized. In
addition, a consistent formal theory that contains the
first-order theory of
the natural numbers (thus having certain "proper axioms"), by
Gödel's
incompleteness theorem, contains true statements which cannot
be proven. In these cases, an automated theorem prover may fail to
terminate while searching for a proof. Despite these theoretical
limits, in practice, theorem provers can solve many hard problems,
even in these undecidable logics.
Related problems
A simpler, but related, problem is
proof verification, where an existing
proof for a theorem is certified valid. For this, it is generally
required that each individual proof step can be verified by a
primitive recursive
function or program, and hence the problem is always
decidable.
Interactive theorem
provers require a human user to give hints to the system.
Depending on the degree of automation, the prover can essentially
be reduced to a proof checker, with the user providing the proof in
a formal way, or significant proof tasks can be performed
automatically. Interactive provers are used for a variety of tasks,
but even fully automatic systems have proven a number of
interesting and hard theorems, including some that have eluded
human mathematicians for a long time. However, these successes are
sporadic, and work on hard problems usually requires a proficient
user.
Another distinction is sometimes drawn between theorem proving and
other techniques, where a process is considered to be theorem
proving if it consists of a traditional proof, starting with axioms
and producing new inference steps using rules of inference. Other
techniques would include
model
checking, which is equivalent to brute-force enumeration of
many possible states (although the actual implementation of model
checkers requires much cleverness, and does not simply reduce to
brute force).
There are hybrid theorem proving systems which use model checking
as an inference rule. There are also programs which were written to
prove a particular theorem, with a (usually informal) proof that if
the program finishes with a certain result, then the theorem is
true. A good example of this was the machine-aided proof of the
four color theorem, which was
very controversial as the first claimed mathematical proof which
was essentially impossible to verify by humans due to the enormous
size of the program's calculation (such proofs are called
non-surveyable proofs). Another
example would be the proof that the game
Connect Four is a win for the first
player.
Industrial uses
Commercial use of automated theorem proving is mostly concentrated
in integrated circuit design and verification. Since the
Pentium FDIV bug, the complicated
floating point units of modern
microprocessors have been designed with extra scrutiny.
In the
latest processors from AMD, Intel, and others,
automated theorem proving has been used to verify that division and
other operations are correct.
First-order theorem proving
First-order theorem proving is one
of the most mature subfields of automated theorem proving. The
logic is expressive enough to allow the specification of arbitrary
problems, often in a reasonably natural and intuitive way. On the
other hand, it is still semi-decidable, and a number of sound and
complete calculi have been developed, enabling
fully
automated systems. More expressive logics, such as higher order and
modal logics, allow the convenient expression of a wider range of
problems than first order logic, but theorem proving for these
logics is less well developed. The quality of implemented system
has benefited from the existence of a large library of standard
benchmark examples — the Thousands of Problems for Theorem Provers
(TPTP) Problem Library — as well as from the
CADE ATP System Competition
(CASC), a yearly competition of first-order systems for many
important classes of first-order problems.
Some important systems (all have won at least one CASC competition
division) are listed below.
Deontic theorem proving
Deontic logic concerns normative
propositions, such as those used in law, engineering
specifications, and computer programs. In other words, propositions
that are translations of commands or "ought" or "must (not)"
statements in ordinary language. The deontic character of such
logic requires formalism that extends the first-order predicate
calculus. Representative of this is the tool KED.
Popular techniques
Available implementations
See also: :Category:Theorem
proving software systems
Free software
Proprietary software including Share-alike Non-commercial
You can find information on some of these theorem provers and
others at http://www.tptp.org/CASC/J2/SystemDescriptions.html . The
TPTP library of test problems, suitable for testing first-order
theorem provers, is available at http://www.tptp.org, and solutions
from many of these provers for TPTP problems are in the TSTP
solution library, available at http://www.tptp.org/TSTP.
Notable people
- Leo Bachmair, co-developer of the
superposition calculus.
- Woody Bledsoe, artificial intelligence
pioneer.
- Robert S. Boyer, co-author of the Boyer-Moore theorem prover, co-recipient of the
Herbrand Award 1999.
- Alan Bundy, University of Edinburgh, meta-level
reasoning for guiding inductive proof, proof planning and recipient
of 2007 IJCAI Award
for Research Excellence and Herbrand
Award, and 2003 Donald E.
Walker
Distinguished Service Award.
- William McCune Argonne National Laboratory, author of
Otter, the first high-performance theorem prover. Many important
papers, recipient of the Herbrand
Award 2000.
- Hubert
Comon, CNRS and now ENS Cachan. Many important papers.
- Robert Constable, Cornell
University. Important contributions to type theory, NuPRL.
- Martin Davis, author of the "Handbook of Artificial
Reasoning", co-inventor of the DPLL
algorithm, recipient of the Herbrand
Award 2005.
- Branden
Fitelson University of California at Berkeley. Work in
automated discovery of shortest axiomatic bases for logic
systems.
- Harald Ganzinger, co-developer
of the superposition calculus, head of the MPI Saarbrücken,
recipient of the Herbrand Award 2004
(posthumous).
- Michael Genesereth, Stanford University professor of Computer Science.
- Keith Goolsbey chief developer of
the Cyc inference engine.
- Michael J. C. Gordon led the development of the HOL
theorem prover.
- Gerard Huet Term rewriting, HOL logics, Herbrand Award 1998
- Robert Kowalski developed the
connection graph theorem-prover and SLD
resolution, the inference engine that executes logic programs.
- Donald
W. Loveland Duke University. Author, co-developer of the
DPLL-procedure, developer of model
elimination, recipient of the Herbrand Award 2001.
- Norman Megill, developer of Metamath,
and maintainer of its site at metamath.org, an online database of
automatically verified proofs.
- J Strother Moore, co-author of
the Boyer-Moore theorem prover, co-recipient
of the Herbrand Award 1999.
- Robert Nieuwenhuis University
of Barcelona. Co-developer of the superposition calculus.
- Tobias Nipkow
Technical
University of Munich, contributions to (higher-order) rewriting,
co-developer of the Isabelle, proof
assistant
- Ross Overbeek Argonne National
Laboratory. Founder of The
Fellowship for Interpretation of Genomes
- Lawrence C. Paulson University of
Cambridge, work on higher-order logic system, co-developer of
the Isabelle proof assistant
- David A. Plaisted University of North Carolina at Chapel
Hill. Complexity results, contributions to
rewriting and completion, instance-based
theorem proving.
- John Rushby Program Director - SRI International
- J.
Alan
Robinson Syracuse University. Developed original resolution and
unification based first order theorem proving, co-editor of the
"Handbook of Automated Reasoning", recipient of the Herbrand Award 1996
- Jürgen Schmidhuber Work
on Gödel Machines: Self-Referential Universal Problem
Solvers Making Provably Optimal Self-Improvements
- Stephan Schulz, E theorem
Prover.
- Natarajan Shankar SRI International, work on decision
procedures, little engines of proof, co-developer of
PVS.
- Mark Stickel SRI. Recipient of the Herbrand
Award 2002.
- Geoff Sutcliffe University of
Miami. Maintainer of the TPTP collection, an organizer of the CADE
annual contest.
- Dolph Ulrich Purdue, Work on automated
discovery of shortest axiomatic bases for systems.
- Robert
Veroff University of New Mexico. Many important papers.
- Andrei
Voronkov Developer of Vampire and Co-Editor of the "Handbook of
Automated Reasoning"
- Larry Wos Argonne National Laboratory. (Otter) Many
important papers. Very first Herbrand
Award winner (1992)
- Wen-Tsun Wu Work in geometric
theorem proving: Wu's method, Herbrand Award 1997
See also
Notes
- http://www.cs.miami.edu/~tptp/
- Artosi, Alberto, Cattabriga, Paola and Governatori, Guido
(1994-01-01). KED: A Deontic Theorem Prover. In: Biagioli, Carlo,
Sartor, Giovanni and Tiscornia, Daniela Workshop on Legal
Application of Logic Programming, Santa Margherita Ligure, Italy,
(60-76).
References