In cryptography, plaintext is information a sender wishes to
transmit to a receiver. Cleartext is often used as a synonym,
which is sometimes confusing. Before
the computer era, plaintext most commonly meant message text in the
language of the communicating parties. The term is defined relative to
the operation of cryptographic algorithms, usually encryption
algorithms: plaintext is the input upon which they operate. Since
computers became commonly available, the definition has
encompassed not only electronic representations of traditional
text, for instance messages (e.g., email) and document content (e.g.,
word processor files), but also the computer representations of
sound (e.g., speech or music), images (e.g., photos or videos), ATM
and credit card transaction information, sensor data, and so forth.
Few of these are directly meaningful to humans, being already
transformed into computer-manipulable forms. In effect, any
information which the communicating parties wish to conceal from
others can now be treated, and referred to, as plaintext. Thus, in
a significant sense, plaintext is the 'normal' representation of
data before any action has been taken to conceal, compress, or
'digest' it. It need not represent
text, and even if it does, the text may not be "plain".
Plaintext is used as input to an encryption algorithm; the output is
usually termed ciphertext, particularly when the algorithm is a cipher.
Codetext is used less often, and almost always only
when the algorithm involved is actually a code.
In some systems, however, multiple layers of
encryption are used, in which case the output of
one encryption algorithm becomes plaintext input for the
next.
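The layering described above can be sketched in a few lines. This is a minimal illustration, not a secure design: `toy_encrypt` is a hypothetical toy cipher (XOR with a SHAKE-256 keystream) chosen only so the example is self-contained, and the key names are placeholders.

```python
import hashlib


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy stream cipher for illustration only: XOR with a SHAKE-256 keystream."""
    keystream = hashlib.shake_256(key).digest(len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))


# XOR with the same keystream undoes itself, so decryption is the same operation.
toy_decrypt = toy_encrypt

message = b"attack at dawn"

# Layer 1: the message is the plaintext, `inner` is the ciphertext.
inner = toy_encrypt(b"inner key", message)

# Layer 2: the ciphertext of layer 1 is treated as the plaintext of layer 2.
outer = toy_encrypt(b"outer key", inner)

# Decryption peels the layers off in reverse order.
recovered = toy_decrypt(b"inner key", toy_decrypt(b"outer key", outer))
```

From the point of view of the outer layer, `inner` is simply its plaintext, even though it is unreadable to a human.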
Secure handling of plaintext
In a
cryptosystem, weaknesses can be
introduced through insecure handling of plaintext, allowing an
attacker to bypass the cryptography altogether. Plaintext is
vulnerable in use and in storage, whether in electronic or paper
format.
Physical security deals
with methods of securing information and its storage media from
local, physical attacks. For instance, an attacker might enter a
poorly secured building and attempt to open locked desk drawers or
safes. An attacker can also engage in
dumpster diving, and may be able to
reconstruct shredded information if it is sufficiently valuable to
be worth the effort.
One countermeasure is to burn or thoroughly
crosscut shred discarded printed
plaintexts or storage media; the NSA
is well known
for its disposal security precautions.
If plaintext is stored in a computer file (including automatically
made backup files generated during program execution, even if these
are invisible to the user), the storage media, along with the entire
computer and its components, must be secure. Sensitive data is
sometimes processed on computers whose mass storage is removable,
in which case physical security of the removed disk is separately
vital. To be useful (as opposed to handwaving), the security of a
computer must be physical (e.g., against burglary, brazen removal
under cover of supposed repair, or installation of covert monitoring
devices) as well as virtual (e.g., against operating system
modification, illicit network access, or Trojan programs). The wide
availability of keydrives, which can plug into most modern
computers and store large quantities of data, poses another severe
security headache. A spy (perhaps posing as a cleaning person)
could easily conceal one, and even swallow it if necessary.
Discarded computers, disk drives and media are also a potential
source of plaintexts. Most operating systems do not actually erase
anything — they simply mark the disk space occupied by a deleted
file as 'available for use', and remove its entry from the file
system
directory. The
information in a file deleted in this way remains fully present
until overwritten at some later time when the operating system
reuses the disk space. With even low-end computers commonly sold
with many gigabytes of disk space, and capacities still rising, this 'later
time' may be months later, or never. Even overwriting the portion
of a disk surface occupied by a deleted file is insufficient in
many cases.
Peter Gutmann of the
University of
Auckland
wrote a celebrated 1996 paper on the recovery of
overwritten information from magnetic disks; areal storage
densities have gotten much higher since then, so this sort of
recovery is likely to be more difficult than it was when Gutmann
wrote.
Independently, modern hard drives automatically remap sectors
that are starting to fail; those sectors no longer in use will
contain information that is entirely invisible to the file system
(and all software which uses it for access to disk data), but is
nonetheless still present on the physical drive platter. It may, of
course, be sensitive plaintext.
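The difference between removing a directory entry and actually erasing data can be shown with a toy model. The sketch below is not how any real file system is implemented: `ToyDisk` and its methods are hypothetical names, and the "disk" is just a byte array with a name-to-location directory.

```python
class ToyDisk:
    """Hypothetical model: a flat byte array plus a name -> (offset, length) directory."""

    def __init__(self, size: int) -> None:
        self.blocks = bytearray(size)   # the raw medium, initially all zero bytes
        self.directory = {}             # file name -> (offset, length)
        self.next_free = 0              # naive allocator: always append

    def write_file(self, name: str, data: bytes) -> None:
        if self.next_free + len(data) > len(self.blocks):
            raise ValueError("toy disk is full")
        offset = self.next_free
        self.blocks[offset:offset + len(data)] = data
        self.directory[name] = (offset, len(data))
        self.next_free += len(data)

    def delete_file(self, name: str) -> None:
        # Ordinary deletion: only the directory entry goes; the bytes stay put.
        del self.directory[name]

    def overwrite_and_delete(self, name: str) -> None:
        # Overwrite the file's bytes with zeros before dropping the entry.
        offset, length = self.directory.pop(name)
        self.blocks[offset:offset + length] = bytes(length)

    def raw_scan(self, needle: bytes) -> bool:
        # What an attacker reading the raw medium could find.
        return needle in self.blocks


disk = ToyDisk(64)

disk.write_file("memo.txt", b"launch code 0000")
disk.delete_file("memo.txt")
still_there = disk.raw_scan(b"launch code")

disk.write_file("memo2.txt", b"second secret")
disk.overwrite_and_delete("memo2.txt")
gone = disk.raw_scan(b"second secret")
```

In this model the first memo is still found by a raw scan after deletion, while the overwritten one is not. The model deliberately ignores the complications noted above: on real hardware, overwritten data may leave recoverable traces, and remapped sectors are out of reach of any overwrite issued through the file system.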
Some government agencies (e.g., NSA) require
that all disk drives be physically pulverized when they are
discarded, and in some cases, chemically treated with corrosives
before or after. This practice is not widespread outside of
the government, however. For example, Garfinkel and Shelat (2003)
analyzed 158 second-hand hard drives acquired at garage sales and
the like and found that fewer than 10% had been sufficiently
sanitized. A wide variety of personal and confidential information
was found readable from the others. See
data remanence.
Laptop computers are a special problem. The US State Department,
the British Secret Service, and the US Department of Defense have
all had laptops containing secret information, some perhaps in
plaintext form, 'vanish' in recent years. Announcements of similar
losses are becoming a common item in news reports.
Disk encryption techniques can provide
protection against such loss or theft, if properly chosen and
used.
On occasion, even when the data on the host systems is itself
encrypted, the media used to transfer data between such systems are
nevertheless left in plaintext because of poorly designed data policy. An
incident in October 2007, in which
HM Revenue and Customs lost
CDs containing the records of no fewer than 25
million child benefit recipients in the United Kingdom, the data
apparently being entirely unencrypted, is a case in point.
Modern cryptographic systems are designed to resist
known plaintext or even
chosen plaintext attacks and so may not be
entirely compromised when plaintext is lost or stolen. Older
systems used techniques such as
padding and
Russian copulation to obscure information
in plaintext that could be easily guessed, and so to resist the
effects of loss of plaintext on the security of the
cryptosystem.
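As a rough illustration of the idea behind Russian copulation, the sketch below splits a message at a random point and swaps the two parts, so that a stereotyped opening or closing no longer sits at a predictable position in the plaintext. The join marker `MARK` and the function names are assumptions made so that the example can be reversed mechanically; they are not taken from any historical procedure.

```python
import secrets

MARK = "|"  # hypothetical join marker; must not occur in the message itself


def copulate(message: str) -> str:
    """Split at a random point and emit tail + MARK + head."""
    if MARK in message:
        raise ValueError("message must not contain the join marker")
    if len(message) < 2:
        raise ValueError("message too short to split")
    # Choose a cut so that both parts are non-empty: 1 <= cut <= len - 1.
    cut = 1 + secrets.randbelow(len(message) - 1)
    return message[cut:] + MARK + message[:cut]


def uncopulate(text: str) -> str:
    """Undo copulate(): find the marker and restore the original order."""
    tail, _, head = text.partition(MARK)
    return head + tail


original = "DEAR SIR STOP REPORT FOLLOWS STOP REGARDS"
scrambled = copulate(original)   # this rearranged text is what gets encrypted
restored = uncopulate(scrambled)
```

The rearranged text, not the original, is then fed to the cipher, so an attacker who guesses the stock phrases cannot line them up with fixed positions in the ciphertext.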
References
- S. Garfinkel and A. Shelat, "Remembrance of Data Passed: A Study
of Disk Sanitization Practices", IEEE Security and Privacy,
January/February 2003.
- UK HM Revenue and Customs loses 25m records of child benefit
recipients, BBC.