In web archiving, an archive site is a website that stores
information on, or actual copies of, web pages from the past for
anyone to view. Two common techniques are (1) using a web crawler or
(2) user submissions.
- By using a web crawler, the service does not depend on an active
community for its content, so it can build a larger database
faster, which usually results in the community growing larger as
well. However, website developers and system administrators can
block these robots from accessing certain web pages (using a
robots.txt file).
- While it can be difficult to start a submission-based service due
to potentially low rates of user submission, this system can yield
some of the best results. A crawler can only obtain information the
public has bothered to post to the Internet. People may not have
posted material because they did not think anyone would be
interested in it, because they lacked a proper medium, and so on.
If they see that someone wants their information, however, they may
be more apt to submit it.
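
The robots.txt blocking mentioned in the first item can be sketched in
a few lines. This is a minimal illustration, not how any particular
archive site works: it uses Python's standard urllib.robotparser
module, and the robots.txt content, the example.org URLs, the
ExampleArchiveBot user agent, and the allowed_urls helper are all
hypothetical names chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content a site administrator might publish.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs that robots_txt permits user_agent to fetch."""
    parser = RobotFileParser()
    # Parse the rules from text instead of fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical pages a crawler is considering for the archive.
candidates = [
    "https://example.org/index.html",
    "https://example.org/private/notes.html",
]

crawlable = allowed_urls(ROBOTS_TXT, "ExampleArchiveBot", candidates)
print(crawlable)  # only the page outside /private/ remains
```

A crawler that honors these rules would archive the first page and
skip the second. Compliance is voluntary on the crawler's side, so the
check has to be made before each fetch.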

On February 12, 2001, Google acquired the Usenet discussion group
archives from Deja.com and turned them into its Google Groups
service. The service lets users search old discussions with Google's
search technology while still allowing them to post to the mailing
lists.

The Internet Archive is building a compendium of websites and digital
media. The Archive has been employing a web crawler to build up its
database, and it is one of the best-known archive sites.

textfiles.com is a large library of old text files maintained by
Jason Scott. Its mission is to archive the old documents that
floated around the bulletin board systems (BBSes) of his youth and to
document other people's experiences on the BBSes.

PANDORA (Pandora Archive), from the National Library of Australia,
stands for Preserving and Accessing Networked Documentary Resources
of Australia, which encapsulates its mission. It provides a long-term
catalog of selected online publications and websites that are
authored by Australians or that cover an Australian topic. The
catalog is built with PANDAS (PANDORA Digital Archiving System).