An alphabet is a standardized set of
letters (basic written symbols or
graphemes), each of which roughly represents
a phoneme in a
spoken language, either as it exists now or
as it was in the past. There are other
systems, such as
logographies, in which each character represents a
word, morpheme, or semantic unit, and
syllabaries, in which each character represents a
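syllable. Alphabets are classified
according to how they indicate vowels: as independent letters (alphabets
in the narrow sense), as diacritics or modifications of the consonants
(abugidas), or mostly not at all (abjads).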
The word "alphabet" came into
Middle
English from the
Late Latin word
Alphabetum, which in turn originated in the
Ancient Greek Αλφάβητος
Alphabetos, from
alpha and
beta, the first two letters of the
Greek alphabet.
Alpha and
beta in turn came from the first two letters of the
Phoenician alphabet, and meant
ox and
house respectively. There are dozens of
alphabets in use today. Most of them are composed of lines
(
linear writing); notable
exceptions are
Braille,
fingerspelling (
Sign
language), and
Morse code.
Linguistic definition and context
The term
alphabet prototypically refers to a
writing system that has characters (
graphemes) which represent both
consonant and
vowel sounds,
even though there may not be a complete one-to-one correspondence
between
symbol and
sound.
A
grapheme is an
abstract entity which may be physically
represented by different styles of
glyphs.
There are many written entities which do not form part of the
alphabet, including
numerals,
mathematical symbols, and
punctuation. Some human languages are
commonly written using a combination of
logograms (which represent
morphemes or
words) and
syllabaries (which represent
syllables) instead of an alphabet.
Egyptian hieroglyphs and
Chinese characters are two of the
best-known writing systems with predominantly non-alphabetic
representations.
Non-written languages may also be represented alphabetically. For
example, linguists researching a non-written language (such as some
of the indigenous Amerindian languages) will use the
International Phonetic
Alphabet to enable them to write down the sounds they
hear.
Most, if not all, linguistic writing systems have some means for
phonetic approximation of foreign words, usually using the native
character set.
History
Middle Eastern scripts
The history of the alphabet started in
ancient Egypt. By
2700
BC Egyptian writing had a set of some
22 hieroglyphs to represent
syllables that begin with a single
consonant of their language, plus a vowel (or no
vowel) to be supplied by the native speaker. These glyphs were used
as pronunciation guides for
logograms, to
write grammatical inflections, and, later, to transcribe loan words
and foreign names.
However, although seemingly alphabetic in nature, the original
Egyptian uniliterals were not a system and were never used by
themselves to encode Egyptian speech.
In the Middle Bronze Age an apparently
"alphabetic" system known as the Proto-Sinaitic
script is thought by some to have been developed in central
Egypt
around 1700 BC for or by Semitic workers, but only one of these early
writings has been deciphered and their exact nature remains open to
interpretation. Judging from the appearance and names of its letters, it is
believed to derive from Egyptian hieroglyphs.
This script eventually developed into the
Proto-Canaanite alphabet, which in
turn was refined into the
Phoenician
alphabet. It also developed into the
South Arabian alphabet, from which
the
Ge'ez alphabet (an
abugida) is descended. Note that the scripts
mentioned above are not considered proper alphabets, as they all
lack characters representing vowels. These early vowelless
alphabets are called
abjads, and still exist
in scripts such as
Arabic,
Hebrew and
Syriac.
Phoenician was the first major phonemic script. In contrast to two
other widely used writing systems at the time,
Cuneiform and
Egyptian hieroglyphs, it contained only
about two dozen distinct letters, making it a script simple enough
for common traders to learn. Another advantage of Phoenician was
that it could be used to write down many different languages, since
it recorded words phonemically.
The script was spread by the Phoenicians, whose
thalassocracy carried it
across the Mediterranean. In Greece, the script was modified to add
the vowels, giving rise to the first true alphabet. The Greeks took
letters which did not represent sounds that existed in Greek, and
changed them to represent the vowels. This marks the creation of a
"true" alphabet, with both vowels and consonants as explicit
symbols in a single script. In its early years, there were many
variants of the Greek alphabet, a situation which caused many
different alphabets to evolve from it.
European alphabets
The
Cumae form of the Greek alphabet was
carried over by Greek colonists from Euboea
to the
Italian peninsula, where it gave rise to a variety of alphabets
used to inscribe the Italic
languages. One of these became the
Latin alphabet, which was spread across
Europe as the Romans expanded their empire. Even after the fall of
the Roman state, the alphabet survived in intellectual and
religious works. It eventually became used for the descendant
languages of Latin (the
Romance
languages) and then for most of the other languages of
Europe.
Another notable script is
Elder
Futhark, which is believed to have evolved out of one of the
Old Italic alphabets. Elder
Futhark gave rise to a variety of alphabets known collectively as
the
Runic alphabets. The Runic
alphabets were used for Germanic languages from AD 100 to the late
Middle Ages. Their use was mostly restricted to engravings on stone
and jewelry, although inscriptions have also been found on bone and
wood. These alphabets have since been replaced with the Latin
alphabet, except for decorative usage for which the runes remained
in use until the 20th century.
The
Glagolitic alphabet was the
script of the liturgical language
Old Church Slavonic, and became the
basis of the
Cyrillic alphabet.
The Cyrillic alphabet is one of the most widely used modern
alphabets, and is notable for its use in Slavic languages and also
for other languages within the former Soviet Union. Variants
include the
Serbian,
Macedonian,
Bulgarian, and
Russian alphabets. The Glagolitic alphabet
is believed to have been created by
Saints Cyril and Methodius, while
the Cyrillic alphabet was invented by the Bulgarian scholar
Clement of Ohrid, who was their
disciple. Both scripts feature many letters that appear to have been
borrowed from or influenced by the
Greek
alphabet and the
Hebrew
alphabet.
Asian alphabets
Beyond the logographic
Chinese
writing, many phonetic scripts are in existence in Asia. The
Arabic alphabet,
Hebrew alphabet,
Syriac alphabet, and other
abjads of the Middle East are developments of the
Aramaic alphabet, but because these
writing systems are largely
consonant-based they are often not considered true
alphabets.
Most alphabetic scripts of India and Eastern Asia are descended
from the
Brahmi script, which is often
believed to be a descendant of Aramaic.
In Korea, the Hangul alphabet was
created by Sejong the Great in
1443. Understanding of the phonetic alphabet of Mongolian
Phagspa script aided the creation of
a phonetic script suited to the spoken Korean language. Mongolian
Phagspa script was in turn derived from the Brahmi script. Hangul
is a unique alphabet in a variety of ways: it is a
featural alphabet, where many of the
letters are designed from a sound's place of articulation (the letter
for the P sound is shaped like a widened mouth, the letter for the L
sound like a tongue pulled in,
etc.); its design was planned by the government of the time; and it
places individual letters in syllable clusters with equal
dimensions, in the same way as
Chinese characters, to allow for mixed
script writing (one syllable always takes up one type-space no
matter how many letters get stacked into building that one
sound-block).
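The stacking of letters into one fixed-size block has a close counterpart in how computers encode Hangul, which goes beyond the text above and is offered only as an illustration. The Python sketch below assumes the Unicode arithmetic for precomposed syllables (19 initial consonants, 21 vowels, and 28 final slots, the first meaning "no final"); the function names compose_block and letters_in_block are hypothetical.

```python
import unicodedata

SYLLABLE_BASE = 0xAC00   # code point of the first precomposed syllable block
LEAD_COUNT = 19          # initial consonants in Unicode's ordering
VOWEL_COUNT = 21         # medial vowels in Unicode's ordering
TAIL_COUNT = 28          # final consonants, with index 0 meaning "no final"


def compose_block(lead: int, vowel: int, tail: int = 0) -> str:
    """Combine Unicode jamo indices into one precomposed syllable block."""
    if not (0 <= lead < LEAD_COUNT and 0 <= vowel < VOWEL_COUNT
            and 0 <= tail < TAIL_COUNT):
        raise ValueError("jamo index out of range")
    offset = (lead * VOWEL_COUNT + vowel) * TAIL_COUNT + tail
    return chr(SYLLABLE_BASE + offset)


def letters_in_block(block: str) -> list[str]:
    """Split a syllable block into its letters via canonical decomposition."""
    return list(unicodedata.normalize("NFD", block))


# Indices 18, 0, 4 are the positions of h, a, n in Unicode's jamo ordering.
han = compose_block(18, 0, 4)
print(han, len(han), len(letters_in_block(han)))
```

However many letters are stacked, the result is a single character, which mirrors the "one syllable, one type-space" property described above.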
Zhuyin (sometimes called Bopomofo) is a
semi-syllabary used to phonetically
transcribe Mandarin Chinese
in the Republic of China. After the later establishment of the People's
Republic of China and its adoption of Hanyu
Pinyin, the use of Zhuyin today is limited, but it is still
widely used in Taiwan, where the
Republic of China still governs. Zhuyin developed out of a
form of Chinese shorthand based on Chinese characters in the early
1900s and has elements of both an alphabet and a syllabary. Like an
alphabet the phonemes of
syllable
initials are represented by individual symbols, but like a
syllabary the phonemes of the
syllable
finals are not; rather, each possible final (excluding the
medial glide) is represented by
its own symbol. For example,
luan is represented as ㄌㄨㄢ
(
l-u-an), where the last symbol ㄢ represents the entire
final
-an. While Zhuyin is not used as a mainstream
writing system, it is still often used in ways similar to a
romanization system, that is, for aiding
in pronunciation and as an input method for Chinese characters on
computers and cell phones.
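The split into initial, medial glide, and final can be sketched in a few lines of Python. This is a toy illustration only: the three tables hold just a handful of symbols chosen for the example, tones are ignored, and the function name to_zhuyin and the romanized input format are assumptions rather than any standard conversion scheme.

```python
# Tiny illustrative tables; the real system has many more symbols.
INITIALS = {"m": "ㄇ", "l": "ㄌ"}
MEDIALS = {"u": "ㄨ"}
FINALS = {"a": "ㄚ", "an": "ㄢ"}


def to_zhuyin(syllable: str) -> str:
    """Spell a toneless romanized syllable as initial + medial + final."""
    if not syllable or syllable[0] not in INITIALS:
        raise ValueError(f"unsupported initial in {syllable!r}")
    initial, rest = INITIALS[syllable[0]], syllable[1:]
    if rest in FINALS:
        # No medial glide: the whole remainder is one final symbol.
        return initial + FINALS[rest]
    if rest and rest[0] in MEDIALS:
        medial, final = MEDIALS[rest[0]], rest[1:]
        if final == "":
            return initial + medial
        if final in FINALS:
            return initial + medial + FINALS[final]
    raise ValueError(f"unsupported final in {syllable!r}")


print(to_zhuyin("luan"))
```

The four romanized letters of luan come out as three symbols, because the final -an is written with a single symbol rather than one symbol per phoneme.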
European alphabets, especially Latin and Cyrillic, have been
adapted for many languages of Asia. Arabic is also widely used,
sometimes as an abjad (as with
Urdu
and
Persian) and sometimes as a
complete alphabet (as with
Kurdish
and
Uyghur).
Types
Figure: Writing systems worldwide. Alphabets: Latin; Latin and Arabic; Cyrillic; Latin and Cyrillic; Greek; Georgian; Armenian. Abjads: Arabic; Hebrew. Abugidas: North Indic; South Indic; Ethiopic; Thaana; Canadian Syllabic. Logographic+syllabic: pure logographic; mixed logographic and syllabaries; featural-alphabetic syllabary + limited logographic; featural-alphabetic syllabary.
The term "alphabet" is used by
linguists
and
paleographers in both a wide and a
narrow sense. In the wider sense, an alphabet is a script that is
segmental at the
phoneme level; that
is, it has separate glyphs for individual sounds and not for larger
units such as syllables or words. In the narrower sense, some
scholars distinguish "true" alphabets from two other types of
segmental script,
abjads and
abugidas. These three differ from each other in the
way they treat vowels: abjads have letters for consonants and leave
most vowels unexpressed; abugidas are also consonant-based, but
indicate vowels with
diacritics or a
systematic graphic modification of the consonants. In alphabets in
the narrow sense, on the other hand, consonants and vowels are
written as independent letters. The earliest known alphabet in the
wider sense is the
Wadi
el-Hol script, believed to be an
abjad,
which through its successor
Phoenician is the ancestor of modern
alphabets, including
Arabic,
Greek,
Latin (via the
Old Italic alphabet),
Cyrillic (via the Greek alphabet) and
Hebrew (via
Aramaic).
Examples of present-day abjads are the
Arabic and
Hebrew
scripts; true alphabets include
Latin,
Cyrillic, and
Korean
hangul; and abugidas are used to write
Tigrinya, Amharic,
Hindi, and
Thai.
The
Canadian Aboriginal
syllabics are also an abugida rather than a syllabary as their
name would imply, since each glyph stands for a consonant which is
modified by rotation to represent the following vowel. (In a true
syllabary, each consonant-vowel combination would be represented by
a separate glyph.)
The boundaries between the three types of segmental scripts are not
always clear-cut. For example,
Sorani
Kurdish is written in the
Arabic script, which is normally an abjad.
However, in Kurdish, writing the vowels is mandatory, and full
letters are used, so the script is a true alphabet. Other languages
may use a Semitic abjad with mandatory vowel diacritics,
effectively making them abugidas. On the other hand, the
Phagspa script of the
Mongol Empire was based closely on the
Tibetan abugida, but all vowel marks
were written after the preceding consonant rather than as diacritic
marks. Although short
a was not written, as in the Indic
abugidas, one could argue that the linear arrangement made this a
true alphabet. Conversely, the vowel marks of the
Tigrinya abugida and the
Amharic abugida (ironically, the original
source of the term "abugida") have been so completely assimilated
into their consonants that the modifications are no longer
systematic and have to be learned as a
syllabary rather than as a segmental script. Even
more extreme, the Pahlavi abjad eventually became
logographic. (See below.)
Thus the primary
classification of
alphabets reflects how they treat vowels. For
tonal languages, further classification
can be based on their treatment of tone, though names do not yet
exist to distinguish the various types. Some alphabets disregard
tone entirely, especially when it does not carry a heavy functional
load, as in
Somali and many other
languages of Africa and the Americas. Such scripts are to tone what
abjads are to vowels. Most commonly, tones are indicated with
diacritics, the way vowels are treated in abugidas. This is the
case for
Vietnamese (a true
alphabet) and
Thai (an abugida). In
Thai, tone is determined primarily by the choice of consonant, with
diacritics for disambiguation. In the
Pollard script, an abugida, vowels are indicated
by diacritics, but the placement of the diacritic relative to the
consonant is modified to indicate the tone. More rarely, a script
may have separate letters for tones, as is the case for
Hmong and
Zhuang. For most of these scripts,
regardless of whether letters or diacritics are used, the most
common tone is not marked, just as the most common vowel is not
marked in Indic abugidas; in
Zhuyin not only
is one of the tones unmarked, but there is a diacritic to indicate
lack of tone, like the
virama of Indic.
The number of letters in an alphabet can be quite small. The Book
Pahlavi script, an abjad, had only
twelve letters at one point, and may have had even fewer later on.
Today the
Rotokas alphabet has only
twelve letters. (The
Hawaiian
alphabet is sometimes claimed to be as small, but it actually
consists of 18 letters, including the
ʻokina
and five long vowels.) While Rotokas has a small alphabet because
it has few phonemes to represent (just eleven), Book Pahlavi was
small because many letters had been
conflated; that is, the
graphic distinctions had been lost over time, and diacritics were
not developed to compensate for this as they were in
Arabic, another script that lost many of its
distinct letter shapes. For example, a comma-shaped letter
represented
g, d, y, k, or
j. However, such
apparent simplifications can perversely make a script more
complicated. In later Pahlavi
papyri, up to
half of the remaining graphic distinctions of these twelve letters
were lost, and the script could no longer be read as a sequence of
letters at all, but instead each word had to be learned as a whole;
that is, they had become
logograms as in
Egyptian
Demotic.
The largest segmental script is probably an abugida,
Devanagari. When written in Devanagari, Vedic
Sanskrit has an alphabet of 53 letters,
including the
visarga mark for final aspiration and
special letters for
kš and
jñ, though one of the
letters is theoretical and not actually used. The Hindi alphabet
must represent both Sanskrit and modern vocabulary, and so has been
expanded to 58 with the
khutma letters (letters with a dot
added) to represent sounds from Persian and English.
The largest known abjad is
Sindhi,
with 51 letters. The largest alphabets in the narrow sense include
Kabardian and
Abkhaz (for
Cyrillic), with 58 and 56 letters, respectively,
and
Slovak (for the
Latin alphabet), with 46. However, these
scripts either count
di- and
tri-graphs as separate letters, as Spanish did with
ch
and
ll until recently, or use
diacritics like Slovak
č. The largest
true alphabet where each letter is graphically independent is
probably
Georgian, with 41
letters.
Syllabaries typically contain 50 to 400
glyphs (though the Múra-Pirahã
language of Brazil
would
require only 24 if it did not denote tone, and Rotokas would
require only 30), and the glyphs of logographic systems typically
number from the many hundreds into the thousands. Thus a
simple count of the number of distinct symbols is an important clue
to the nature of an unknown script.
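The counting heuristic can be sketched as follows. The cut-off values are assumptions loosely based on the ranges quoted above, and the ranges overlap in practice (the largest segmental scripts exceed 50 letters, and a few syllabaries would need fewer), so the output is a hint rather than a classification. The function names are hypothetical.

```python
def distinct_symbols(sample: str) -> int:
    """Count distinct letter-like characters, ignoring case, digits,
    spaces, punctuation, and combining marks."""
    return len({ch.casefold() for ch in sample if ch.isalpha()})


def guess_script_type(symbol_count: int) -> str:
    """Rough guess from inventory size; thresholds are illustrative only."""
    if symbol_count < 50:
        return "segmental (alphabet, abjad, or abugida)"
    if symbol_count <= 400:
        return "possibly a syllabary"
    return "probably logographic"


sample = "The quick brown fox jumps over the lazy dog"
count = distinct_symbols(sample)
print(count, guess_script_type(count))
```

A short sample undercounts the inventory, so in practice the count only becomes meaningful once the corpus is large enough for rare symbols to appear.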
Alphabetic order
It is not always clear what constitutes a distinct alphabet.
French uses the same basic alphabet
as English, but many of the letters can carry additional marks,
such as é, à, and ô. In French, these combinations are not
considered to be additional letters. However, in
Icelandic, the accented letters such as
á, í, and ö are considered to be distinct letters of the alphabet.
In Spanish, ñ is considered a separate letter, but accented vowels
such as á and é are not. The ll and ch were also considered single
letters, distinct from a single l followed by an l and c followed
by an h, respectively, but in
1994 the
Real Academia Española changed
them so that ll is between lk and lm in the dictionary and ch is
between cg and ci.
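The effect of the 1994 change can be sketched with a sort key that optionally treats multi-letter units as single letters. This is a minimal Python illustration, not the Academy's actual rules: the two alphabet lists and the collation_key helper are assumptions made for the example, and accented vowels are simply sorted last for brevity.

```python
# Letter inventories used only for this illustration.
TRADITIONAL = "a b c ch d e f g h i j k l ll m n ñ o p q r s t u v w x y z".split()
MODERN = [letter for letter in TRADITIONAL if len(letter) == 1]


def collation_key(word: str, alphabet: list[str]) -> list[int]:
    """Turn a word into a list of letter ranks, matching the longest
    letter of the given alphabet at each position."""
    rank = {letter: i for i, letter in enumerate(alphabet)}
    longest = max(len(letter) for letter in alphabet)
    word = word.lower()
    key, pos = [], 0
    while pos < len(word):
        for size in range(longest, 0, -1):
            chunk = word[pos:pos + size]
            if chunk in rank:
                key.append(rank[chunk])
                pos += len(chunk)
                break
        else:
            # Symbols outside the alphabet sort after every known letter.
            key.append(len(alphabet))
            pos += 1
    return key


words = ["cuna", "chico", "dama", "cena"]
print(sorted(words, key=lambda w: collation_key(w, TRADITIONAL)))
print(sorted(words, key=lambda w: collation_key(w, MODERN)))
```

With the traditional list, chico files after every word beginning with plain c; with the modern list it falls between cena and cuna, as in any dictionary that sorts letter by letter.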
In German, words starting with
sch- (which spells the
German phoneme /ʃ/) are intercalated between words with
initial
sca- and
sci- (all incidentally
loanwords) instead of appearing after the
letter
s as though sch were a single letter. The latter
lexicographical policy is
de rigueur in a dictionary of
Albanian, where
dh-,
gj-,
ll-,
nj-,
rr-,
th-,
xh- and
zh- (all representing phonemes and considered separate
single letters) follow the letters
d,
g,
l,
n,
r,
t,
x and
z respectively. Nor, in a dictionary of
English, is the lexical section with initial
th- given a
place after the letter
t; it is inserted between
te- and
ti-. German words with
umlauts are further alphabetized as if there were
no umlaut at all, in contrast to
Turkish, which allegedly adopted the
Swedish graphemes ö and
ü, and where a word like tüfek, “gun”, comes
after tuz, “salt”, in the dictionary.
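The two policies for ö and ü can be contrasted with two sort keys. The Python sketch below is an illustration under stated assumptions: dropping combining marks stands in for "alphabetized as if there were no umlaut" and is not a full implementation of any national sorting standard, and the key functions are hypothetical names. Inputs are assumed to be lowercase already.

```python
import unicodedata

# The 29 letters of the Turkish alphabet in their conventional order.
TURKISH = unicodedata.normalize("NFC", "abcçdefgğhıijklmnoöprsştuüvyz")
TURKISH_RANK = {letter: i for i, letter in enumerate(TURKISH)}


def ignore_umlaut_key(word: str) -> str:
    """Sort key that drops combining marks, so that ü files with u."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def turkish_key(word: str) -> list[int]:
    """Sort key that ranks each letter by its place in the Turkish alphabet."""
    composed = unicodedata.normalize("NFC", word)
    return [TURKISH_RANK.get(ch, len(TURKISH)) for ch in composed]


words = ["tüfek", "tuz"]
print(sorted(words, key=ignore_umlaut_key))  # umlaut ignored
print(sorted(words, key=turkish_key))        # ü is its own letter after u
```

Under the first key tüfek sorts as if spelled tufek and precedes tuz; under the second, every word in tu- precedes every word in tü-, so tuz comes first.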
The
Danish and Norwegian
alphabets end with
æ –
ø –
å, whereas the Swedish and the Finnish ones
conventionally put
å –
ä –
ö at the end.
Some adaptations of the Latin alphabet are augmented with
ligatures, such as
æ in
Old English
and
Icelandic and
Ȣ in
Algonquian; by borrowings from other
alphabets, such as the
thorn þ in
Old English and
Icelandic, which came from the
Futhark runes; and by modifying existing
letters, such as the
eth ð of Old
English and Icelandic, which is a modified
d. Other
alphabets only use a subset of the Latin alphabet, such as
Hawaiian, and
Italian, which uses
the letters
j, k, x, y and
w only in foreign
words.
It is unknown whether the earliest alphabets had a defined
sequence. Some alphabets today, such as
Hanunoo, are learned one letter at a time, in no
particular order, and are not used for
collation where a definite order is required.
However, a dozen
Ugaritic tablets
from the fourteenth century BC preserve the alphabet in two
sequences. One, the
ABGDE order later used in Phoenician,
has continued with minor changes in
Hebrew,
Greek,
Armenian,
Gothic,
Cyrillic, and
Latin; the other,
HMĦLQ, was used in
southern Arabia and is preserved today in
Ethiopic. Both orders have therefore been
stable for at least 3000 years.
The historical order was abandoned in
Runic and
Arabic, although Arabic retains the
traditional "
abjadi order" for
numbering.
The
Brahmic family of alphabets used
in India uses a unique order based on
phonology: The letters are arranged according to
how and where they are produced in the mouth. This organization is
used in Southeast Asia, Tibet, Korean
hangul,
and even Japanese
kana, which is not an
alphabet.
The Phoenician letter names, in which each letter is associated
with a word that begins with that sound, continue to be used in
Samaritan,
Aramaic,
Syriac,
Hebrew, and
Greek. However, they were abandoned in
Arabic,
Cyrillic and
Latin.
Orthography and spelling
Each language may establish rules that govern the association
between letters and phonemes, but, depending on the language, these
rules may or may not be consistently followed. In a perfectly
phonological alphabet, the phonemes and
letters would correspond perfectly in two directions: a writer
could predict the spelling of a word given its pronunciation, and a
speaker could predict the pronunciation of a word given its
spelling. However, languages often evolve independently of their
writing systems, and writing systems have been borrowed for
languages they were not designed for, so the degree to which
letters of an alphabet correspond to phonemes of a language varies
greatly from one language to another and even within a single
language.
Languages may fail to achieve a one-to-one correspondence between
letters and sounds in any of several ways:
- A language may represent a given phoneme with a combination of
letters rather than just a single letter. Two-letter combinations
are called digraphs and
three-letter groups are called trigraphs. German uses the tetragraphs (four letters)
"tsch" for the phoneme /tʃ/ and "dsch" for /dʒ/, although the latter is
rare. Kabardian also uses a
tetragraph for one of its phonemes.
- A language may represent the same phoneme with two or more different
letters or combinations of letters. An example is Modern Greek, which may write the phoneme /i/ in six
different ways: "ι", "η", "υ", "ει", "οι" and "υι" (although the
last is very rare).
- A language may spell some words with unpronounced letters that
exist for historical or other reasons.
- Pronunciation of individual words may change according to the
presence of surrounding words in a sentence (sandhi).
- Different dialects of a language may use different phonemes for
the same word.
- A language may use different sets of symbols or different rules
for distinct sets of vocabulary items, such as the Japanese
hiragana and katakana syllabaries, or the various rules in
English for spelling words from Latin and Greek, or the original
Germanic vocabulary.
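The asymmetry between the two directions can be made concrete by inverting a spelling table. The Python sketch below uses the six Greek spellings listed above, taking the phoneme in question to be /i/, and adds two other vowels purely for contrast; the table is a deliberate simplification and the function names are hypothetical.

```python
from collections import defaultdict

# Spelling -> phoneme. The six spellings of /i/ come from the list above;
# the entries for /a/ and /o/ are added only for contrast.
GREEK_VOWEL_SPELLINGS = {
    "ι": "i", "η": "i", "υ": "i", "ει": "i", "οι": "i", "υι": "i",
    "α": "a", "ο": "o",
}


def spellings_by_phoneme(table: dict[str, str]) -> dict[str, list[str]]:
    """Invert a spelling-to-phoneme table into phoneme-to-spellings."""
    inverse = defaultdict(list)
    for spelling, phoneme in table.items():
        inverse[phoneme].append(spelling)
    return dict(inverse)


def ambiguous_phonemes(table: dict[str, str]) -> list[str]:
    """Phonemes a writer cannot spell from sound alone."""
    inverse = spellings_by_phoneme(table)
    return sorted(p for p, forms in inverse.items() if len(forms) > 1)


print(spellings_by_phoneme(GREEK_VOWEL_SPELLINGS)["i"])
print(ambiguous_phonemes(GREEK_VOWEL_SPELLINGS))
```

Reading works as a plain lookup because each spelling maps to one phoneme in this table, whereas writing /i/ requires knowing the word, which is the one-directional predictability described above.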
National languages generally elect to address the problem of
dialects by simply associating the alphabet with the national
standard. However, with an international language with wide
variations in its dialects, such as
English, it would be impossible to
represent the language in all its variations with a single phonetic
alphabet.
Some national languages like
Finnish,
Turkish and
Bulgarian have a very regular spelling
system with a nearly one-to-one correspondence between letters and
phonemes. Strictly speaking, there is no word in the Finnish,
Turkish and Bulgarian languages corresponding to the verb "to
spell" (meaning to split a word into its letters), the closest
match being a verb meaning to split a word into its syllables.
Similarly, the
Italian verb
corresponding to 'spell',
compitare, is unknown to many
Italians because the act of spelling itself is almost never needed:
each phoneme of Standard Italian is represented in only one way.
However, pronunciation cannot always be predicted from spelling in
cases of irregular syllabic stress. In standard
Spanish, it is possible to tell the
pronunciation of a word from its spelling, but not vice versa; this
is because certain phonemes can be represented in more than one
way, but a given letter is consistently pronounced.
French, with its
silent letters and its heavy use of
nasal vowels and
elision,
may seem to lack much correspondence between spelling and
pronunciation, but its rules on pronunciation are actually
consistent and predictable with a fair degree of accuracy.
At the other extreme are languages such as English, where the
spelling of many words simply has to be memorized as they do not
correspond to sounds in a consistent way. For English, this is
partly because the
Great Vowel
Shift occurred after the orthography was established, and
because English has acquired a large number of loanwords at
different times, retaining their original spelling at varying
levels. Even English has general, albeit complex, rules that
predict pronunciation from spelling, and these rules are successful
most of the time; rules to predict spelling from the pronunciation
have a higher failure rate.
Sometimes, countries have the written language undergo a
spelling reform in order to realign the
writing with the contemporary spoken language.
Such reforms can range from
simple spelling changes and word forms to switching the entire
writing system itself, as when Turkey
switched
from the Arabic alphabet to the Roman alphabet.
The sounds of speech of all languages of the world can be written
by a rather small universal phonetic alphabet. A standard for this
is the
International
Phonetic Alphabet.