The Romance languages (sometimes referred to as Romanic languages, Latin languages, Neolatin languages or Neo-Latin languages) are a branch of the Indo-European language family comprising all the languages that descend from Latin, the language of ancient Rome. There are more than 600 million native speakers worldwide, mainly in the Americas, Europe, and Africa, as well as in many smaller regions scattered throughout the world. Because of the extreme difficulty and varying methodology of distinguishing among language, variety, and dialect, it is impossible to count the number of Romance languages now in existence, but a restrictive, somewhat arbitrary account can place the total at approximately 25. In fact, the number is much larger, and many more existed previously. The six most widely spoken standardized Romance languages are Spanish, Portuguese, French, Italian, Romanian, and Catalan. Among numerous other Romance languages are Corsican, Leonese, Occitan, Aromanian, Sardinian, Sicilian, Venetian, Neapolitan, Asturian, Galician, and Friulian.


Romance languages are the continuation of Vulgar Latin, the popular sociolect of Latin spoken by soldiers, settlers and merchants of the Roman Empire, as distinguished from the Classical form of the language spoken by the Roman upper classes, the form in which the language was generally written. Between 350 BC and AD 150, the expansion of the Empire, together with its administrative and educational policies, made Latin the dominant native language in continental Western Europe. Latin also exerted a strong influence in southeastern Britain, the Roman province of Africa, and the Balkans north of the Jireček Line.

During the Empire's decline, and after its fragmentation and collapse in the 5th century, varieties of Latin began to diverge within each local area at an accelerated rate, and eventually evolved into a continuum of recognizably different typologies. The overseas empires established by Portugal, Spain and France from the 15th century onward spread their languages to the other continents, to such an extent that about 70% of all Romance speakers today live outside Europe.

Despite influences from pre-Roman languages and from later invasions, the phonology, morphology, lexicon, and syntax of all Romance languages are predominantly evolutions of Vulgar Latin. In particular, with only one or two exceptions, Romance languages have lost the declension system of Classical Latin and, as a result, have SVO sentence structure and make extensive use of prepositions.


The term "Romance" comes from the Vulgar Latin adverb romanice, derived from Romanicus: for instance, in the expression romanice loqui, "to speak in Roman" (that is, the Latin vernacular), contrasted with latine loqui, "to speak in Latin" (Medieval Latin, the conservative version of the language used in writing and formal contexts or as a lingua franca), and with barbarice loqui, "to speak in Barbarian" (the non-Latin languages of the peoples that conquered the Roman Empire). From this adverb the noun romance originated, which applied initially to anything written romanice, or "in the Roman vernacular".

The word romance with the modern sense of romance novel or love affair has the same origin. In the medieval literature of Western Europe, serious writing was usually in Latin, while popular tales, often focusing on love, were composed in the vernacular and came to be called "romances".


Lexical and grammatical similarities among the Romance languages, and between Latin and each of them, are apparent from the following examples having the same meaning:


English Translation: She always closes the window before dining (or having dinner).

Note that some of the lexical divergence above comes from different Romance languages using the same root word with different meanings (semantic change). Portuguese, for example, has the word fresta, which is a cognate of French fenêtre, Italian finestra, Romanian fereastra and so on, but now means "slit" as opposed to "window." (The Portuguese terms defenestrar, meaning "to throw through a window", and fenestrada, "replete with windows", also have the same root, but are later derivations from Latin.) Likewise, Portuguese also has the word cear, a cognate of Italian cenare and Spanish cenar, but uses it in the sense of "to have a late supper" in most varieties, while the preferred word for "to dine" is actually jantar (related to archaic Spanish yantar) because of semantic changes in the 19th century. Galician has both fiestra (from medieval fẽestra, which is the ultimate origin of standard Portuguese fresta), and the less frequently used ventá and xanela.

As an alternative to lei (originally the accusative form), Italian has the pronoun ella, a cognate of the other words for "she", but it is hardly ever used in speaking.

Spanish/Asturian/Leonese/Cantabrian ventana and Mirandese and Sardinian bentana come from Latin ventum, Spanish viento, "wind" (cf. English window, etymologically 'wind eye'), and Portuguese janela, Galician xanela, Mirandese jinela from Latin ianua + ella, "small opening", same root as "January" and "janitor".

Sardinian balcone (alternative for bentana) comes from Old Italian and is similar to other Romance languages such as French balcon, Portuguese balcão, Romanian balcon, Spanish balcón and Corsican balconi (alternative for purtellu).


Vulgar Latin

There is a lack of documentary evidence about Vulgar Latin for the purposes of comprehensive research, and the literature is often hard to interpret or generalise upon. Many of its speakers were soldiers, slaves, displaced peoples and forced resettlers, more likely to be natives of conquered lands than natives of Rome. It is believed that Vulgar Latin already had most of the features that are shared by all Romance languages and that distinguish them from Classical Latin, such as the almost complete loss of the Latin case system and its replacement by prepositions; the loss of the neuter gender and of comparative inflections; the replacement of some verb paradigms by innovations (e.g. the synthetic future gave way to an originally analytic strategy now typically formed by infinitive + evolved present indicative forms of 'have'); the use of articles; and the initial stages of the palatalization of the plosives /k/, /g/, and /t/. Some modern languages, such as Finnish, have similar, quite sharp, differences between their printed and spoken forms. To some scholars, this suggests that the form of Vulgar Latin that evolved into the Romance languages was around during the time of the Empire, and was spoken alongside the written Classical Latin, which was reserved for official and formal occasions. Other scholars argue that the distinctions are more rightly viewed as indicative of sociolinguistic and register differences normally found within any language.

Fall of the Roman Empire

During the political decline of the Roman Empire in the fifth century, there were large-scale migrations into the empire, and the Latin-speaking world was fragmented into several independent states. Central Europe and the Balkans were occupied by the Germanic and Slavic tribes, as well as by the Huns, which isolated the Vlachs from the rest of Latin Europe. British Romance and African Romance, the forms of Vulgar Latin used in southeastern Britain and the Roman province of Africa, where it had been spoken by much of the urban population, disappeared in the Middle Ages. But the Germanic tribes that had penetrated Italy, Gaul, and Hispania eventually adopted Latin and the remnants of Roman culture, and so Latin remained the dominant language there.

Latent incubation

Between the fifth and tenth centuries, the dialects of spoken Vulgar Latin diverged in various parts of their domain, eventually becoming distinct languages. This evolution is poorly documented because the literary language, Medieval Latin, remained close to the older Classical Latin.

Recognition of the vernaculars

Between the 10th and 13th centuries, some local vernaculars developed a written form and began to supplant Latin in many of its roles. In some countries, such as Portugal, this transition was expedited by force of law, whereas in others, such as Italy, many prominent poets and writers used the vernacular of their own accord, some of the most famous in Italy being Giacomo da Lentini and Dante Alighieri.

Uniformization and standardization

The invention of the printing press apparently slowed down the evolution of Romance languages from the 16th century on, and brought a tendency towards greater uniformity of standard languages within political boundaries, at the expense of other Romance languages and dialects less favored politically. In France, for instance, the dialect spoken in the region of Paris gradually spread to the entire country, and the Occitan of the south lost ground.

Current status

Romance languages, 20th century
The Romance language most widely spoken natively today is Spanish (around 400 million speakers), followed by Portuguese (over 200 million), French (close to 100 million, and more than 200 million including second-language speakers), Italian (around 62 million), Romanian (around 32 million), and Catalan (around 6.7 million), all of which are official languages in at least one country. A few other languages have official status on a regional or otherwise limited level, for instance Friulian, Sardinian and Valdôtain in Italy; Romansh in Switzerland; and Galician in Spain. French, Italian, Portuguese, Spanish, and Romanian are also official languages of the European Union. Spanish, Portuguese, French, Italian, Romanian, and Catalan are the official languages of the Latin Union; and French and Spanish are two of the six official languages of the United Nations.

Outside Europe, French, Spanish and Portuguese are spoken and enjoy official status in various countries that emerged from their respective colonial empires. French is an official language of Canada, the Caribbean, many countries in Africa, and some in the Indian and Pacific Oceans.

Spanish is an official language of Mexico, much of South America, Central America and the Caribbean, and of Equatorial Guinea in Africa, and is the most spoken Romance language in the world.

Portuguese is the official language of Brazil (where its almost 190 million speakers make it the language of half of South America, though not of Latin America as a whole), five African countries (Angola, Cabo Verde, Guiné-Bissau, Moçambique and São Tomé e Príncipe), and East Timor and Macau in Asia, and is the second most spoken Romance language.

Although Italy also had some colonial possessions, its language did not remain official after the end of colonial domination, so Italian is spoken only as a minority or secondary language by immigrant communities in North and South America and Australia, and in African countries such as Libya, Eritrea and Somalia.

Romania did not establish a colonial empire, but Romanian is spoken as a native language in Moldova and has also spread outside Europe through emigration, notably to Western Asia. It has flourished in Israel, where it is a native language of 5% of the population and is spoken by many more as a secondary language, owing to the large number of Romanian-born Jews who moved to Israel after World War II.

Proportion of the 690 million native Romance language speakers of each language
The total native speakers of Romance languages are divided as follows (with their ranking within the languages of the world in brackets):

The remaining Romance languages survive mostly as spoken languages for informal contact. National governments have historically viewed linguistic diversity as an economic, administrative or military liability, as well as a potential source of separatist movements; therefore, they have generally fought to eliminate it, by extensively promoting the use of the official language, restricting the use of the "other" languages in the media, characterizing them as mere "dialects", or even persecuting them.

In the late 20th and early 21st centuries, however, increased sensitivity to the rights of minorities has allowed some of these languages to start recovering their prestige and lost rights. Yet it is unclear whether these political changes will be enough to reverse the decline of minority Romance languages.

Classification and related languages

The classification of the Romance languages is inherently difficult, since most of the linguistic area can be considered a dialect continuum, and in some cases political biases can come into play. Nevertheless, according to SIL counts, 47 Romance languages and dialects are spoken in Europe. Along with Latin (which is not included among the Romance languages) and a few extinct languages of ancient Italy, they make up the Italic branch of the Indo-European family.

Note that Dalmatian is now generally grouped under Italo-Dalmatian rather than Eastern Romance.

Proposed subfamilies

The main subfamilies that have been proposed by Ethnologue within the various classification schemes for Romance languages are:
  • Italo-Western, the largest group which includes languages such as Italian, Spanish, and French.
  • Eastern Romance, which includes the Romance languages of Eastern Europe, such as Romanian.
  • Southern Romance, which includes a few languages with particularly archaic features, such as Sardinian and, partially, Corsican.

Pidgins, creoles, and mixed languages

Some Romance languages have developed varieties which seem dramatically restructured as to their grammars or to be mixtures with other languages. It is not always clear whether they should be classified as Romance, pidgins, creole languages, or mixed languages. Some other languages, such as English, are sometimes thought of as creoles of semi-Romance ancestry. There are several dozen creoles of Portuguese, Spanish and French origin, some of them spoken as national languages in former European colonies.

Creoles of French

Creoles of Spanish

Creoles of Portuguese

Auxiliary and constructed languages

Latin and the Romance languages have also served as the inspiration and basis of numerous auxiliary and constructed languages, such as Interlingua, its reformed version Modern Latin, Latino sine flexione, Occidental, Lingua Franca Nova, Ido and Esperanto, as well as languages created for artistic purposes only, such as Talossan. Because Latin is a very well-attested ancient language, some amateur linguists have even constructed Romance languages that mirror real languages that developed from other ancestral languages. These include Brithenig (which mirrors Welsh), Breathanach[4193] (mirrors Irish), Wenedyk (mirrors Polish), and Þrjótrunn (mirrors Icelandic).[4194]

Linguistic features

Common Indo-European features

As members of the Indo-European family, Romance languages have a number of features that are shared with some other members of this family that set them apart from languages of other families, including:
Latin (Illa) Claudit semper fenestram antequam cenat.
Aragonese Ella tranca/zarra siempre la finestra antis de zenar.
Asturian Ella pieslla siempre la ventana/feniestra primero de cenar.
Bolognese (Lî) la sèra sänper la fnèstra prémma ed dsnèr.
Cantabrian Ella pieslla siempri la ventana enantis de cenar.
Corsican Ella chjudi sempre u purtellu primma di cenà.
Bergamasque (Lé) La sèra sèmper sö la finèstra prima de senà.
Catalan (Ella) sempre tanca la finestra abans de sopar.
Franco-Provençal (Le) Sarre toltin/tojor la fenétra avan de goutâ/dinar/sopar.
French Elle ferme toujours la fenêtre avant de dîner/souper.
Friulian Jê e siere simpri il barcon prin di cenâ.
Galician (Ela) Pecha sempre a fiestra/xanela antes de cear.
Italian (Lei) chiude sempre la finestra prima di cenare.
Leonese Eilla pecha siempres la ventana primeiru de cenare.
Milanese (Lee) la sara semper su la finestra primma de disnà.
Mirandese Eilha cerra siempre la bentana/jinela atrás de jantar.
Neapolitan Essa nzerra sempe 'a fenesta primma 'e magnà
Norman lli barre tréjous la crouésie devaunt de daîner.
Occitan (Ela) Barra sempre/totjorn la fenèstra abans de sopar.
Piedmontese Chila a sara sèmper la fnestra dnans ëd fé sin-a/dnans ëd siné.
Portuguese Ela sempre fecha a janela antes de jantar.
Romanian Ea închide totdeauna fereastra înainte de cină.
Romansh Ella clauda/serra adina la fanestra avant ch'ella tschainia.
Sardinian Issa serrat semper sa bentana antes de chenare.
Sicilian Idda chiudi sempri la finestra avanti ca pistia/cina.
Spanish (Ella) siempre cierra la ventana antes de cenar.
Umbrian Essa chjude sempre la finestra prima de cena'.
Venetian Ła sara sèmpre ła finestra prima de senàr.
Walloon Ele sere todi li finiesse divant di soper.

Features inherited from Classical Latin

The Romance languages share a number of features that were inherited from Classical Latin, and collectively set them apart from most other Indo-European languages:
  • Word stress remains predominantly on the penultimate syllable in most languages, although there have been significant changes with respect to classical Latin. Stress patterns are usually similar across languages. In its modern form French is the noticeable exception in that stress falls predictably on the last syllable that does not contain a schwa. It should be observed, however, that the final stress of Modern French is not the result of systematic stress shift, but of the phonological erosion of syllables following the Proto-Romance stressed syllable; thus while e.g. Italian transparently maintains Latin stress on the second syllable of an infinitive such as amare /aˈmare/, in fact French does, too: /ɛˈme/, replicating at first Spanish /aˈmar/, but going beyond in losing /r/ as well.
  • There are two grammatical numbers, singular and plural (no dual).
  • In most Romance languages, personal pronouns have different forms according to their grammatical function in a sentence, a remnant of the Latin case system; there is usually a form for the subject (inherited from the Latin nominative) another for the object (from the accusative or the dative), and a third set of personal pronouns used after prepositions or in stressed positions (see prepositional pronoun and disjunctive pronoun, for further information). Third person pronouns often have different forms for the direct object (accusative), the indirect object (dative), and the reflexive.
  • Except for standard French and a few other exceptions, they are all null-subject languages. (Some non-standard varieties of French treat disjunctive pronouns as arguments and clitic pronouns as agreement markers.)
  • Verbs have many conjugations, including in most languages:
    • A present tense, a preterite, an imperfect, a pluperfect and a future tense in the indicative mood, for statements of fact.
    • Present and preterite subjunctive tenses, for hypothetical or uncertain conditions. Several languages (for example, Italian, Portuguese and Spanish) have also imperfect and pluperfect subjunctives, although it is not unusual to have just one subjunctive equivalent for preterit and imperfect (e.g. no unique subjunctive equivalent in Italian of the so-called passato remoto).
    • An imperative mood, for direct commands.
    • Three non-finite forms: infinitive, gerund, and past participle.
    • Distinct active and passive voices, as well as an impersonal passive voice.
  • Several tenses and aspects, especially of the indicative mood, have been preserved with little change in most languages, as shown in the following table for the Latin verb dīcere (to say), and its descendants.


1 With the variant díser.
2 Until the 18th century.
3 With the disused variant dize.
4 From a form like discheva.
5 Sicilian uses imperfect subjunctive in place of present subjunctive (dica).
  • The main tense and mood distinctions that were made in classical Latin are generally still present in the modern Romance languages, though many are now expressed through compound rather than simple verbs. The passive voice, which was mostly synthetic in classical Latin, has been completely replaced with compound forms.

Features inherited from Vulgar Latin

Romance languages also have a number of common features that are not shared with Classical Latin. Most of these are thought to have been inherited from Vulgar Latin. Even though the Romance languages are all derived from Latin, they are arguably much closer to each other than to their common ancestor, owing to a core of common developments. The main difference is the loss of the case system of Classical Latin, an essential feature which allowed great freedom of word order, and has no counterpart in any Romance language except Romanian. In this regard, the distance between any modern Romance language and Latin is comparable to that between Modern English and Old English. While speakers of French, Italian or Spanish, for example, can quickly learn to see through the phonological changes reflected in spelling differences, and thus recognize many Latin words, they will often fail to understand the meaning of Latin sentences.
  • Vulgar Latin borrowed many words, often from Germanic languages that replaced words from Classical Latin during the Migration Period, including some basic vocabulary. Notable examples are *blancus (white), which replaced Classical Latin albus in most major languages; *guerra (war), which replaced bellum; and the words for the cardinal directions, where cognates of English "north", "south", "east" and "west" replaced the Classical Latin words borealis (or septentrionalis), australis (or meridionalis), orientalis, and occidentalis, respectively, in the vernacular. (See History of French - The Franks.)
  • There are definite and indefinite articles, derived from Latin demonstratives and the numeral unus (one).
  • Nouns have only two grammatical genders, masculine and feminine. Most Latin neuter nouns became masculine nouns in Romance. However, in Romanian, one class of nouns—including the descendants of many Latin neuter nouns—behave like masculines in the singular and feminines in the plural (e.g. un deget "one finger" vs două degete "two fingers", cf. Latin digitum, pl. digita). The same phenomenon is observed non-productively in Italian (e.g. il dito "the finger" vs le dita "the fingers").
  • Apart from gender and number, nouns, adjectives and determiners are not inflected. Cases have generally been lost, though a trace of them survives in the personal pronouns. An exception is Romanian, which retains a combined genitive-dative case, and a vocative case.
  • Adjectives generally follow the noun they modify.
  • Many Latin combining prefixes were incorporated in the lexicon as new roots and verb stems, e.g. Italian estrarre (to extract) from Latin ex- (out of) and trahere (to drag).
  • Many Latin constructions involving nominalized verbal forms (e.g. the use of accusative plus infinitive in indirect discourse and the use of the ablative absolute) were dropped in favor of constructions with subordinate clauses. Exceptions can be found in Italian, for example, Latin tempore permittente > Italian tempo permettendo; L. hoc facto > I. fatto ciò.
  • The normal clause structure is SVO, rather than SOV, and is much less flexible than in Latin.
  • Owing to sound changes which made it homophonous with the preterite, the Latin future indicative tense was dropped, and replaced with a periphrasis of the form infinitive + present tense of habēre (to have). Eventually, this structure was reanalysed as a new future tense.
  • In a similar process, an entirely new conditional form was created.
  • While the synthetic passive voice of classical Latin was abandoned in favour of periphrastic constructions, most of the active voice remained in use. However, several tenses have changed meaning, especially subjunctives. For example:
    • The Latin pluperfect indicative became a conditional in Sicilian, and an imperfect subjunctive in Spanish.
    • The Latin pluperfect subjunctive developed into an imperfect subjunctive in all languages except Romansh, where it became a conditional, and Romanian, where it became a pluperfect indicative.
    • The Latin preterite subjunctive, together with the future perfect indicative, became a future subjunctive in Old Spanish, Portuguese, and Galician.
    • The Latin imperfect subjunctive became a personal infinitive in Portuguese and Galician.
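
The periphrastic origin of the future tense noted in the list above (infinitive + present tense of habēre) can be illustrated with a small sketch. It assumes modern Spanish as the example language and covers regular verbs only; the function name spanish_future and the ending list are illustrative choices, not taken from the source. Irregular future stems (such as dir- for decir) are deliberately not handled.

```python
# Endings of the Spanish future, person by person (yo, tú, él, nosotros,
# vosotros, ellos). Each is a reduced present-tense form of haber:
# he, has, ha, hemos, habéis, han.
HABER_ENDINGS = ["é", "ás", "á", "emos", "éis", "án"]


def spanish_future(infinitive: str) -> list[str]:
    """Return the six future forms of a REGULAR Spanish verb.

    Assumption: the future stem is the unchanged infinitive, which holds
    for regular verbs only.
    """
    return [infinitive + ending for ending in HABER_ENDINGS]


if __name__ == "__main__":
    print(spanish_future("cantar"))
```

For cantar the third-person singular comes out as cantará, the form cited elsewhere in this article as the future "[he] will sing".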
Infinitive Indicative Subjunctive Imperative
Present Preterite Imperfect Present Present
dīcere dīcit dīxit dicēbat dīcat/dīcet dīc
dizir diz dizié deziba diga diz
dicir diz dixo dicía diga di
dir diu digué/va dir deia digui/diga digues
dire di djéve dijisse/dzéze dète
dire il dit il a dit il disait (qu')il dise dis
dicir di dixo dicía diga di
dire dice disse diceva dica dica
dicire diz dixu dicía diga di
el dis l'ha dit el diseva el diga
dîr al dîs l'à détt / al dgé al dgeva al dégga
dicere dice dicette diceva
dire1 ditz diguèt disiá diga diga
a dis a dìsser2 a disìa ch'a disa dis
dizer diz disse dizia diga diz3
a zice zice zise zicea zic zi
dir el di (el ha ditg) el scheva4 ch'el dia di
dìciri dici dissi dicìa dicissi5 dici
decir dice dijo decía diga di
dir dixe dixe dixea diga
dire i dit (il a dit) i dijheut (k') i dixhe di
Basic meaning
to say he says he (has) said he was saying [that] he says say! [you]
  • Many Romance languages have two verbs "to be", derived from the Latin stare (mostly used for temporary states) and esse (mostly used for essential attributes). In French, however, stare and esse had become ester and estre by the late Middle Ages. Owing to phonetic developments, there were the forms êter and être, which eventually merged to être, and the distinction was lost. In Italian, the two verbs share the same past participle, stato. See Romance copula, for further information.
For a more detailed illustration of how the verbs have changed with respect to classical Latin, see Romance verbs.

Sound changes

Word structures in Romance languages have undergone considerable phonological change from their earlier Latin forms, by various processes that were in some cases shared, but in many more characteristic of each language. Those changes applied more or less systematically to all words, but were often conditioned by the sound context, morphological structure, or regularizing tendencies.

Most languages have lost sounds from the original Latin words. In French, in particular, elision progressed further than in any of the other languages (although its conservative etymological spelling does not always make this apparent). In general, all final vowels were dropped, and sometimes also the preceding consonant: thus Latin lupus and luna became Italian lupo and luna but French loup and lune. (See also Use of the circumflex in French.) Catalan, Occitan, many Northern Italian dialects, and Romanian (Daco-Romanian) lost the final vowels in most singular masculine nouns and adjectives, but retained them in the feminine, leaving masculines unmarked for gender, but feminines overtly marked; a pair such as sec 'dry, m. sg.' vs. seca 'dry, f. sg.' is typical (and ultimately responsible for French sec vs. sèche; /lu/ 'wolf', /luv/ 'she-wolf'). Other languages, including Italian, Portuguese, Spanish and Galician, have retained those vowels.

Some languages have lost the final vowel -e from verbal infinitives, e.g. dīcere → Portuguese dizer (to say). Other common cases of apocope are the verbal endings, e.g. Latin amāt → Italian ama (he loves), amābam → amavo (I loved), amābat → amava (he loved), amābatis → amavate (you loved), etc.

Sounds were often lost in the middle of words, too; e.g. Latin Luna → Galician and Portuguese Lua (Moon), crēdere → Spanish creer (to believe).

On the other hand, some languages have added epenthetic vowels to words in certain contexts. Characteristic of the Iberian Romance languages (Spanish, Portuguese, etc.) is the insertion of a prosthetic e at the start of Latin words that began with s + consonant, such as sperō → espero (I hope). French originally did the same, but later lost the s: spatula → arch. espaule → épaule (shoulder). In the case of Italian, the vowel-final articles, lo for the definite and uno for the indefinite, are used immediately before masculine words that begin with s + consonant (sbaglio, "mistake" → lo sbaglio, "the mistake"), as well as before all masculine words beginning with z (i.e. the affricates /ts/ or /dz/): zaino, "backpack" → lo zaino, "the backpack". Italian also still has a now receding prothetic /i/ where a consonant would otherwise precede the cluster, e.g. in /i/Svizzera 'in Switzerland', alternating today with in Svizzera.
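
The Italian article rule described above lends itself to a short sketch. The function below is a simplified illustration, not a full account of Italian usage: it encodes only the two triggers for lo mentioned in the text (s + consonant, and z) plus, as an added assumption not stated above, the elided form l' before a vowel. Other contexts in which standard Italian uses lo (for example before gn- or ps-) are left out, and the function name is hypothetical.

```python
# Vowel letters, including the accented ones used in Italian spelling.
VOWELS = set("aeiouàèéìòù")


def masculine_definite_article(noun: str) -> str:
    """Pick il / lo / l' for a masculine singular Italian noun (simplified)."""
    word = noun.lower()
    if not word:
        raise ValueError("noun must not be empty")
    if word[0] in VOWELS:
        return "l'"   # assumption beyond the text: elision before a vowel
    if word[0] == "z":
        return "lo"   # lo zaino
    if word[0] == "s" and len(word) > 1 and word[1] not in VOWELS:
        return "lo"   # s + consonant: lo sbaglio
    return "il"       # default before any other consonant


if __name__ == "__main__":
    for noun in ("sbaglio", "zaino", "sole", "libro"):
        print(masculine_definite_article(noun), noun)
```

Note that sole takes il, because its s is followed by a vowel rather than a consonant.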

A characteristic feature of the writing systems of almost all Romance languages is that the Latin letters c and g, which originally always represented the "hard" consonants /k/ and /ɡ/ respectively, now represent "soft" consonants when they come before e, i, or y. This is due to a general palatalization of /k/ and /ɡ/ that occurred in the transition to Vulgar Latin. Since the written form of all the affected words was tied to the classical language, the shift was accommodated by a change in the pronunciation rules. The soft sounds of c and g vary from language to language. The consonant t, which was also palatalized, changes pronunciation in French (and English) orthography, but in the other Romance languages the spelling was altered to match the new sound. An exception is Sardinian, whose plosives remained hard before e and i in many words.
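
The spelling convention just described (c and g are "soft" before e, i, or y and "hard" elsewhere) can be expressed as a letter-by-letter check. The following is a rough sketch under stated assumptions: it looks only at the single letter that follows c or g, so it ignores language-specific digraphs and devices such as Italian ch/gh, Spanish and French gu/qu, French ç, and doubled consonants, and it says nothing about what the soft sound actually is in a given language. The function names are illustrative.

```python
import unicodedata

# Letters before which c and g take their "soft" value.
FRONT_VOWELS = {"e", "i", "y"}


def strip_accent(ch: str) -> str:
    """Return the base letter of a possibly accented character (é -> e)."""
    return unicodedata.normalize("NFD", ch)[0] if ch else ""


def classify_c_g(word: str) -> list[tuple[str, str]]:
    """Label every c or g in the word as 'soft' or 'hard'.

    Simplification: only the immediately following letter is inspected.
    """
    letters = word.lower()
    result = []
    for i, ch in enumerate(letters):
        if ch not in ("c", "g"):
            continue
        following = strip_accent(letters[i + 1]) if i + 1 < len(letters) else ""
        kind = "soft" if following in FRONT_VOWELS else "hard"
        result.append((ch, kind))
    return result


if __name__ == "__main__":
    for word in ("cenar", "cantar", "gente", "gato"):
        print(word, classify_c_g(word))
```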

The distinctions of vowel length present in Classical Latin were lost in most Romance languages (an exception is Friulian), and partly replaced with qualitative contrasts such as monophthong versus diphthong (Italian, Spanish; French to a lesser extent), or close vowel versus open vowel (as in Portuguese, Galician, Occitan and Catalan).

For most languages in this family, consonant length is no longer phonemically distinctive or present. However, some languages of Italy (Italian, Sardinian, Sicilian, and numerous other varieties of central and southern Italy) do have long consonants such as /ll/, /mm/, /nn/, /ss/, and to a lesser extent /rr/, where the doubling indicates a short hold before the consonant is released, in many cases with distinctive lexical value: e.g. note (notes) vs. notte (night), cade (s/he, it falls) vs. cadde (s/he, it fell). They may even occur at the beginning of words in Romanesco, Neapolitan and Sicilian, and are occasionally indicated in writing, e.g. Sicilian cchiù (more), and ccà (here). In general, a few consonants are regularly long at the start of a word, while the rhotic archiphoneme is realised as a trill in the same position.

The double consonants of Piedmontese exist only after stressed /ə/, written ë, and are not etymological: vëdde (Latin videre, to see), sëcca (Latin sicca, dry, feminine of sech). In standard Catalan and Occitan, there exists a geminate sound written ŀl (Catalan) or ll (Occitan), but it is usually pronounced as a simple sound in colloquial (and even some formal) speech in both languages.

For more detailed descriptions of sound changes, see the articles Vulgar Latin, History of French, History of Portuguese, Latin to Romanian sound changes, and History of the Spanish language.

Lexical stress

While word stress was rigorously predictable in classical Latin, this is no longer the case in most Romance languages, and stress differences can be enough to distinguish between words. For example, Italian Papa (Pope) and papà (daddy), or the Spanish imperfect subjunctive cantara ([if he] sang) and future cantará ([he] will sing). However, the main function of Romance stress appears to be a clue for speech segmentation — namely to help the listener identify the word boundaries in normal speech, where inter-word spaces are usually absent.

The position of the stressed syllable in a word generally varies from word to word in each Romance language. Within any one language, however, stress usually remains fixed on its assigned syllable even as the word is inflected. It is usually restricted to one of the last three syllables in the word, although Italian verb forms can violate this, e.g. telefonano (they telephone). The limit may also be exceeded by verbs with attached clitics, provided the clitics are counted as part of the word: e.g. Spanish entregándomelo (delivering it to me), Italian mettiamocene (let's put some of it in there), or Portuguese dávamo-vo-lo (we were giving it to you).

Other shared features

The Romance languages also share a number of features that were not the result of common inheritance, but rather of various cultural diffusion processes in the Middle Ages — such as literary diffusion, commercial and military interactions, political domination, influence of the Catholic Church, and (especially in later times) conscious attempts to "purify" them in accordance with Classical Latin. Some of those features have in fact spread to other non-Romance (and even non-Indo-European) languages, chiefly in Europe. Some of these "late origin" shared features are:
  • Most Romance languages have polite forms of address that change the person and/or number of 2nd person subjects (T-V distinction), such as the tu/vous contrast in French, the tu/Ella (or more often Lei) contrast in Italian, the tu/dumneavoastră contrast in Romanian (from dominus + vostre, literally meaning "your Lordship"), or the tú (or vos)/usted contrast in Spanish. Italian also had another form (Voi) denoting more respect than tu but less than Ella; the use of Voi was discontinued because of its association with the fascists, who strongly supported it.
  • They all have a large collection of learned hellenisms and latinisms, with prefixes, stems, and suffixes retained or reintroduced from Greek and Latin, and used to coin new words. Most of these are also used in English, e.g. tele-, poly-, meta-, pseudo-, dis-, ex-, post-, -scope, -logy, -tion, though their spelling may differ slightly; for example, poly- becomes poli- in Romanian, Italian and Spanish.
  • During the Renaissance, Italian, Portuguese, Spanish and a few other Romance languages developed a progressive aspect which did not exist in Latin. In French, progressive constructions remain very limited, the imperfect aspect generally being preferred, as in Latin.
  • Many Romance languages now have a verbal construction analogous to the present perfect tense of English. In some, it has taken the place of the old preterite (at least in the vernacular); in others, the two coexist with somewhat different meanings (cf. English I did vs. I have done). A few examples:
    • preterite only: Galician, Sicilian, Leonese, some dialects of Spanish;
    • preterite and present perfect: Catalan, Occitan, Portuguese, standard Spanish;
    • present perfect predominant, preterite now literary: French, Romanian, several dialects of Italian and Spanish;
    • present perfect only: Romansh.

Writing systems

The Romance languages have kept the writing system of Latin, adapting it to their evolution. One exception was Romanian before the 19th century, where, after the Roman retreat, literacy was reintroduced through the Romanian Cyrillic alphabet under Slavic influence. The Cyrillic alphabet was also used for Romanian (Moldovan) in the USSR. In addition, the non-Christian populations of Spain used the writing systems of their culture languages (Arabic and Hebrew) to write Romance languages such as Ladino and Mozarabic in aljamiado.


The Romance languages are written with the classical Latin alphabet of 23 letters — A, B, C, D, E, F, G, H, I, K, L, M, N, O, P, Q, R, S, T, V, X, Y, Z — subsequently modified and augmented in various ways. In particular, the single Latin letter V split into V (consonant) and U (vowel), and the letter I split into I and J. The Latin letter K and the new letter W, which came to be widely used in Germanic languages, are seldom used in most Romance languages — mostly for unassimilated foreign names and words.

While most of the 23 basic Latin letters have maintained their phonetic value, for some of them it has diverged considerably; and the new letters added since the Middle Ages have been put to different uses in different scripts. Some letters, notably H and Q, have been variously combined in digraphs or trigraphs (see below) to represent phonetic phenomena that could not be recorded with the basic Latin alphabet, or to get around previously established spelling conventions. Most languages added auxiliary marks (diacritics) to some letters, for these and other purposes.

The spelling rules of most Romance languages are fairly simple, but subject to considerable regional variation. The letters with the most conspicuous phonetic variations, between Romance languages or with respect to Latin, are:

B: May alternate in pronunciation with v, for example in some variants of Spanish and Portuguese.
C: Generally a "hard" /k/, but "soft" (fricative or affricate) before e, i, or y.
G: Generally a "hard" /ɡ/, but "soft" (fricative or affricate) before e, i, or y. In some languages, like Spanish, the hard g is pronounced as a fricative after vowels. In Romansh, the soft g is a voiced palatal plosive /ɟ/ or a voiced alveolo-palatal affricate /dʑ/.
H: Silent in most languages; used to form various digraphs. But represents /h/ in Romanian, Walloon and Gascon Occitan.
J: Represents a fricative in most languages, or the palatal approximant /j/ in Romansh and in several of the languages of Italy. Italian does not use this letter in native words. Usually pronounced like the soft g (except in Romansh and the languages of Italy).
Q: As in Latin, its phonetic value is that of a hard c, and in native words it is always followed by a (sometimes silent) u. Romanian does not use this letter in native words.
S: Generally voiceless /s/, but voiced /z/ between vowels in most languages. In Spanish, Romanian, Galician and several varieties of Italian, however, it is always pronounced voiceless. At the end of syllables, it may represent special allophonic pronunciations. In Romansh, it also stands for a voiceless or voiced fricative, /ʃ/ or /ʒ/, before certain consonants.
W: No Romance language uses this letter in native words, with the exception of Walloon.
X: Its pronunciation is rather variable, both between and within languages. In the Middle Ages, the languages of Iberia used this letter to denote the voiceless postalveolar fricative /ʃ/, which is still the case in Modern Catalan and Portuguese. With the Renaissance, the classical pronunciation /ks/, or similar consonant clusters, was frequently reintroduced in latinisms and hellenisms. In Venetian it represents /z/, and in Ligurian the voiced postalveolar fricative /ʒ/. Italian does not use this letter in native words.
Y: This letter is not used in most languages, with the prominent exceptions of French and Spanish, where it represents /j/ before vowels (or various similar fricatives such as the palatal fricative /ʝ/, in Spanish), and the vowel /i/ or semivowel /j/ elsewhere.
Z: In most languages it represents the sound /z/, but in Italian it denotes the affricates /dz/ and /ts/ (which, although not normally in contrast, are usually strictly assigned lexically in any single variety: Standard Italian gazza 'magpie' always with /dz/, mazza 'club, mace' only with /ts/), in Romansh the voiceless affricate /ts/, and in Galician and Spanish either the voiceless dental fricative /θ/ or /s/.

Otherwise, letters that are not combined as digraphs generally have the same sounds as in the International Phonetic Alphabet (IPA), whose design was, in fact, greatly influenced by the Romance spelling systems.
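
The general "hard versus soft" rule for C and G described above can be sketched in a few lines of Python. This is only an illustration of the broad rule (soft before e, i, or y; hard elsewhere), not a pronunciation model for any particular language: the function name `hard_or_soft` and the set of accented vowels are assumptions made for this example, and special spellings such as GN or Ç are deliberately ignored.

```python
# Illustrative sketch of the general rule only; real orthographies have
# many language-specific exceptions that are not modelled here.
FRONT_VOWELS = set("eiyéèêëíìîï")  # assumed set of "softening" vowels

def hard_or_soft(word: str) -> list[tuple[str, str]]:
    """Return (letter, 'hard' | 'soft') for every c or g in the word."""
    lowered = word.lower()
    result = []
    for i, ch in enumerate(lowered):
        if ch in ("c", "g"):
            # Look at the following letter, if any.
            nxt = lowered[i + 1] if i + 1 < len(lowered) else ""
            result.append((ch, "soft" if nxt in FRONT_VOWELS else "hard"))
    return result

print(hard_or_soft("cena"))     # c before e
print(hard_or_soft("gigante"))  # g before i, then g before a
print(hard_or_soft("che"))      # c before h: the digraph keeps it hard
```

Because the check looks only at the next letter, spellings such as CH, GH and GU before E or I come out as "hard", which matches the way those digraphs are used to block softening.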

Digraphs and trigraphs

Since most Romance languages have more sounds than can be accommodated in the Latin alphabet, they all resort to digraphs and trigraphs: combinations of two or three letters with a single sound value. The concept (but not the actual combinations) derives from Classical Latin, which used, for example, TH, PH, and CH when transliterating the Greek letters "θ", "ϕ" (later "φ"), and "χ". (These were once aspirated sounds in Greek before changing to the corresponding fricatives, and the H represented what sounded to the Romans like an /h/ following /t/, /p/, and /k/ respectively.) Some of the digraphs used in modern scripts are:

CI: used in Italian, Romance languages in Italy and Romanian to represent /tʃ/ before A, O, or U.
CH: used in Italian, Romance languages in Italy, Romanian, Romansh and Sardinian to represent /k/ before E or I; /tʃ/ in Occitan, Spanish, Leonese and Galician; /c/ or /tɕ/ in Romansh before A, O or U; and /ʃ/ in most other languages.
DD: used in Sicilian and Sardinian to represent the voiced retroflex plosive /ɖ/. In recent history more accurately transcribed as DDH.
DJ: used in Catalan and Walloon for /dʒ/.
GI: used in Italian, Romance languages in Italy and Romanian to represent /dʒ/ before A, O, or U, and in Romansh to represent /ɟi/ or /dʑi/ or (before A, E, O, and U) /ɟ/ or /dʑ/.
GH: used in Italian, Romance languages in Italy, Romanian, Romansh and Sardinian to represent /ɡ/ before E or I, and in Galician for the voiceless pharyngeal fricative /ħ/ (not a standard sound).
GL: used in Romansh before consonants and I and at the end of words for /ʎ/.
GLI: used in Italian and Romansh for /ʎ/.
GN: used in French, Italian, Romance languages in Italy and Romansh for /ɲ/, as in champignon or gnocchi.
GU: used before E or I to represent /ɡ/ or /ɣ/ in all Romance languages except Italian, Romance languages in Italy, Romansh, and Romanian (which use GH instead).
IG: used at the end of a word in Catalan for /tʃ/, as in maig, safareig or enmig.
IX: used between vowels or at the end of a word in Catalan for /ʃ/, as in caixa or calaix.
LH: used in Portuguese and Occitan for /ʎ/.
LL: used in Spanish, Catalan, Galician, Leonese, Norman and Dgèrnésiais, originally for /ʎ/, which has merged in some cases with /j/. Represents /l/ in French unless it follows I (i), when it represents /j/ (or /ʎ/ in some dialects). It is used in Occitan for a long /l/.
L·L: used in Catalan for a geminate consonant /ll/.
NH: used in Portuguese and Occitan for /ɲ/; used in official Galician for /ŋ/.
N-: used in Piedmontese and Ligurian for /ŋ/ between two vowels.
NN: used in Leonese for /ɲ/.
NY: used in Catalan for /ɲ/.
QU: represents /kw/ in Italian, Romance languages in Italy, and Romansh; /k/ in French, Leonese and Spanish; /k/ (before e or i) or /kw/ (normally before a or o) in Occitan, Catalan and Portuguese.
RR: used between vowels in several languages (Occitan, Catalan, Spanish...) to denote a trilled /r/ or a guttural R, instead of the flap /ɾ/.
SC: used before E or I in Italian and Romance languages in Italy for /ʃ/, and in French and Spanish as /s/ in words of certain etymology.
SCH: used in Romansh for /ʃ/ or /ʒ/.
SCI: used in Italian and Romance languages in Italy to represent /ʃ/ before A, O, or U.
SH: used in Aranese Occitan for /ʃ/.
SS: used in French, Portuguese, Piedmontese, Romansh, Occitan, and Catalan for /s/ between vowels.
TG: used in Romansh for /c/ or /tɕ/. In Catalan it is used for /dʒ/ between vowels, as in metge or fetge.
TH: used in Jèrriais for a dental fricative; used in Aranese for either /t/ or /tʃ/.
TJ: used between vowels and before A, O or U, in Catalan for /dʒ/, as in sotjar or mitjó.
TSCH: used in Romansh for /tʃ/.
TX: used at the beginning or at the end of a word or between vowels in Catalan for /tʃ/, as in txec, esquitx or atxa.

While the digraphs CH, PH, RH and TH were at one time used in many words of Greek origin, most languages have now replaced them with C/QU, F, R and T. Only French has kept these etymological spellings, which now represent /k/ or /ʃ/, /f/, /ʁ/ and /t/, respectively.
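
Because a digraph or trigraph counts as one unit, splitting a written word into its graphemes is naturally done by trying the longest combinations first. The following Python sketch shows that greedy, longest-match idea. The function name `segment` and the small inventory `ITALIAN_SAMPLE` are hypothetical choices for this example, and the context conditions described in the list above (for instance, CI counting as a digraph only before A, O, or U) are not modelled.

```python
def segment(word: str, multigraphs: tuple[str, ...]) -> list[str]:
    """Split a word into graphemes, preferring the longest multigraph."""
    # Try trigraphs before digraphs before single letters.
    units = sorted(multigraphs, key=len, reverse=True)
    lowered = word.lower()
    pieces = []
    i = 0
    while i < len(lowered):
        for unit in units:
            if lowered.startswith(unit, i):
                pieces.append(unit)
                i += len(unit)
                break
        else:
            # No multigraph starts here: emit a single letter.
            pieces.append(lowered[i])
            i += 1
    return pieces

# Assumed, simplified inventory for illustration only.
ITALIAN_SAMPLE = ("gli", "ch", "gh", "gn")

print(segment("gnocchi", ITALIAN_SAMPLE))
print(segment("famiglia", ITALIAN_SAMPLE))
```

Sorting the inventory by length matters: with Romansh-style spellings, a four-letter unit such as TSCH must be tested before SCH or CH, otherwise the shorter match would split it incorrectly.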

Double consonants

Gemination, in the languages where it occurs, is usually indicated by doubling the consonant, except when it does not contrast phonemically with the corresponding short consonant, in which case gemination is not indicated. In Jèrriais, long consonants are marked with an apostrophe: S'S is a long /z/, SS'S is a long /s/, and T'T is a long /t/. The double consonants in French orthography, however, are merely etymological. In Catalan, the gemination of the l is marked by a punt volat ("flying point"): l·l.


Diacritics

Romance languages also introduced various marks (diacritics) that may be attached to some letters, for various purposes. In some cases, diacritics are used as an alternative to digraphs and trigraphs; namely to represent a larger number of sounds than would be possible with the basic alphabet, or to distinguish between sounds that were previously written the same. Diacritics are also used to mark word stress, to indicate exceptional pronunciation of letters in certain words, and to distinguish words with the same pronunciation (homophones).

Depending on the language, some letter-diacritic combinations may be considered distinct letters, e.g. for the purposes of lexical sorting. This is the case, for example, of Romanian ș and Spanish ñ.
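
The effect on lexical sorting can be illustrated with a short Python sketch in which Spanish ñ is ranked as its own letter between n and o, while accented vowels are folded onto their base letters. The names `SPANISH_ALPHABET` and `spanish_key` are made up for this example, and the sketch is a simplification for illustration rather than a full collation algorithm.

```python
import unicodedata

# Assumed alphabet order for illustration: ñ is a letter of its own after n.
SPANISH_ALPHABET = "abcdefghijklmnñopqrstuvwxyz"
RANK = {letter: i for i, letter in enumerate(SPANISH_ALPHABET)}

# Accented vowels sort like their base letter; ñ is intentionally not folded.
FOLD = str.maketrans("áéíóúü", "aeiouu")

def spanish_key(word: str) -> list[int]:
    """Sort key that places ñ between n and o."""
    folded = unicodedata.normalize("NFC", word).lower().translate(FOLD)
    # Characters outside the alphabet sort after all letters.
    return [RANK.get(ch, len(SPANISH_ALPHABET)) for ch in folded]

words = ["oso", "ñandú", "nube", "nota"]
print(sorted(words))                   # plain code-point order
print(sorted(words, key=spanish_key))  # ñ treated as a distinct letter
```

Plain code-point sorting puts ñandú after oso, because the code point of ñ is higher than that of every unaccented letter; the custom key restores the position between n and o.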

The following are the most common uses of diacritics in Romance languages.

  • Vowel quality: the system of marking close-mid vowels with an acute, é, and open-mid vowels with a grave accent, è, is widely used (in Catalan, French, Italian, etc.). Portuguese, however, uses the circumflex (ê) for the former, and the acute (é) for the latter.
  • Nasality: Portuguese marks nasal vowels with a tilde (ã) when they occur before other written vowels and in some other instances. While not frequent among the other Romance languages, the use of this symbol generally to indicate nasality has been incorporated in the orthographies of many South American indigenous languages (Guarani is an example).
  • Palatalization: some historical palatalizations are indicated with the cedilla (ç) in French, Catalan, and Portuguese. In Spanish and several other world languages influenced by it, the grapheme ñ represents a palatal nasal consonant.
  • Diaeresis: when a vowel and another letter that would normally be combined into a digraph with a single sound are exceptionally pronounced apart, this is often indicated with a diaeresis mark on the vowel. In the Spanish word pingüino (penguin), the letter u is pronounced, although normally it is silent in the digraph gu when this is followed by an e or an i. Other Romance languages that use the diaeresis in this fashion are French, Catalan, and Brazilian Portuguese.
  • Stress: the stressed vowel in a polysyllabic word may be indicated with the acute, é (in Spanish, Portuguese, Catalan), or the grave accent, è (Italian, Catalan, Romansh). The orthographies of French and Romanian do not mark stress. In Italian and Romansh orthography, indicating stress with a diacritic is only required when it falls on the last syllable of a word.
  • Homophones: words that are pronounced exactly or nearly the same way, but have different meanings, can be differentiated by a diacritic. An acute accent, for example, is used in Spanish to distinguish si ("if") from sí ("yes"), and in Catalan to distinguish os ("bone") from ós ("bear"). A grave accent is used in French to distinguish ou ("or") from où ("where"); in Italian and Romansh to distinguish e ("and") from è ("is"); and in Catalan to distinguish mà ("hand") from ma ("my"). The circumflex can also have this function in French, sometimes. Often, such words are monosyllables, the accented one being phonetically stressed, while the unaccented one is a clitic; examples are the Spanish clitics de, se, and te (a preposition and two personal pronouns), versus the stressed words dé, sé, and té (two verbs and a noun).
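
Since diacritics carry the distinctions listed above, text processing that removes them loses information. The Python sketch below strips combining marks using the standard `unicodedata` module and shows that pairs such as si/sí or ou/où then become identical. The function name `strip_diacritics` is chosen for this example, and the approach is one common technique, assumed here for illustration.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks such as accents.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

pairs = [("si", "sí"), ("os", "ós"), ("ou", "où"), ("e", "è"), ("ma", "mà")]
for plain, accented in pairs:
    # Each accented word collapses onto its unaccented counterpart.
    print(plain, accented, strip_diacritics(accented) == plain)

print(strip_diacritics("pingüino"))  # the diaeresis is lost as well
```

The same function also turns ñ into n and ç into c, so accent-insensitive matching of this kind is best treated as a deliberate trade-off rather than a harmless normalization.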

Less widespread diacritics in the Romance languages are the breve (in Romanian, ă) and the ring (in Walloon and the Bolognese dialect of Emiliano-Romagnolo, å). The French orthography includes the etymological ligatures œ and (more rarely) æ. The use of the circumflex in French is partly etymological as well.

Upper and lower case

Most languages are written with a mixture of two distinct but phonetically identical variants or "cases" of the alphabet: majuscule ("uppercase" or "capital letters"), derived from Roman stone-carved letter shapes, and minuscule ("lowercase"), derived from Carolingian writing and Medieval quill pen handwriting which were later adapted by printers in the 15th and 16th centuries.

In particular, all Romance languages presently capitalize (use uppercase for the first letter of) the following words: the first word of each complete sentence, most words in names of people, places, and organizations, and most words in titles of books. The Romance languages do not follow the German practice of capitalizing all nouns including common ones. Unlike in English, the names of months (except in European Portuguese), days of the week, and derivatives of proper nouns are usually not capitalized: thus, in Italian one capitalizes Francia ("France") and Francesco ("Francis"), but not francese ("French") or francescano ("Franciscan"). However, each language has some exceptions to this general rule.

Vocabulary comparison

The tables below provide a vocabulary comparison that illustrates a number of examples of sound shifts that have occurred between Latin and Romance languages, along with a selection of minority languages.
