The Yaghnobi language is a living East Iranian language (the other living members being Pashto, Ossetic and the Pamir languages). Yaghnobi is spoken in the upper valley of the Yaghnob River in the Zarafshan area of Tajikistan by the Yaghnobi people. It is considered to be a direct descendant of Sogdian and has often been called Neo-Sogdian in academic literature.

There are some 12,500 Yaghnobi speakers, divided into several communities. The principal group lives in the Zafarobod area. There are also re-settlers in the Yaghnob valley. Some communities live in the villages of Zumand and Kůkteppa and in Dushanbe or its vicinity.

Most Yaghnobi speakers are bilingual in Tajik, a West Iranian language. Yaghnobi is mostly used for daily family communication, while Tajik is used for business and formal transactions. A single Russian ethnographer was told by nearby Tajiks (long hostile to the Yaghnobis, who were late to adopt Islam) that the Yaghnobis used their language as a "secret" mode of communication to confuse the Tajiks. This report has led some, especially those reliant solely on Russian sources, to believe that Yaghnobi or some derivative of it was used as a code for nefarious purposes.

There are two main dialects, a western and an eastern one, which differ primarily in phonetics. For example, historical *θ corresponds to t in the western dialect and s in the eastern, e.g. western met, eastern mes 'day' from Sogdian mēθ. Western ay corresponds to eastern e, e.g. western wayš, eastern weš 'grass' from Sogdian wayš or wēš. The early Sogdian group θr (later ṣ̌) is reflected as sar in the east but tir in the west, e.g. eastern saráy, western tiráy 'three' from Sogdian θrē/θray or ṣ̌ē/ṣ̌ay <δRY>. The t/s and ay/e correspondences are not the only features that distinguish the two dialects; there are also differences in verbal endings and in the lexicon. Between the two main dialects there is a transitional dialect, which shares some features with the western dialect and some with the eastern.


Yaghnobi was an unwritten language until the 1990s, although according to some ethnographers the Yaghnobis used a modified form of the Arabic alphabet. Nowadays the language is transcribed by scholars using a modified Latin alphabet, with the following symbols:

a (á), ā (ā́), b, č, d, e (é), f, g, γ, h, ẖ, i (í), ī (ī́), ǰ, k, q, l, m (m̃), n (ñ), o (ó), p, r, s, š, t, u (ú), ū (ū́), ʏ (ʏ́), v, w (u̯), x, x°, y, z, ž, ع

More recently, Sayfiddīn Mīrzozoda of the Tajik Academy of Sciences has used a modified Tajik alphabet for writing Yaghnobi. This alphabet is poorly suited to Yaghnobi: it does not distinguish short and long vowels or v from w, and it does not mark stress. The Yaghnobi alphabet follows, with Latin equivalents given in parentheses:

А а (a), Б б (b), В в (v), W w (w), Г г (g), Ғ ғ (γ), Д д (d), Е е (e/ye), Ё ё (yo), Ж ж (ž), З з (z), И и (i, ī), Ӣ ӣ (ī), й (y), К к (k), Қ қ (q), Л л (l), М м (m), Н н (n), О о (o), П п (p), Р р (r), С с (s), Т т (t), У у (u, ū, ʏ), Ӯ ӯ (ū, ʏ), Ф ф (f), Х х (x), Хw хw (x°), Ҳ ҳ (h, ẖ), Ч ч (č), Ҷ ҷ (ǰ), Ш ш (š), Ъ ъ (ع), Э э (e), Ю ю (yu, yū, yʏ), Я я (ya)

Notes to the Cyrillic alphabet:

1) The letter й has no capital form, as it never appears at the beginning of a word. Words beginning with ya-, yo- and yu-/yū-/yʏ- are written with я-, ё- and ю-; these combinations are written the same way in the middle of a word, e.g. viyóra is виёра.

2) The usage of the letters ӣ and ӯ is not exactly known. It appears that they can be used to distinguish two similar-sounding words in orthography (e.g. иранка and ӣранка, рупак and рӯпак). The letter ӣ may also be used as a stress marker, as it is in Tajik. The letter ӯ can also be used in Tajik loanwords to indicate the Tajik vowel ů, but it may have other uses that are not yet known.

3) In older texts the Yaghnobi alphabet did not use the letters Ъ ъ and Э э. Instead of Tajik ъ an apostrophe (’) was used, and Yaghnobi е covered both Tajik е and э for the value e. In later notation these letters were integrated into the alphabet, so the older spelling етк was changed to этк to represent the pronunciation with initial e- (and not *ye-), and the older spelling ша’мак was changed to шаъмак.

4) The sound combinations ye and yi are written е and и. The Yaghnobi letter и can have the value yi after a vowel, as it has in Tajik; the letter ӣ after a vowel has the value yī. The letter е has two values: word-initially and after a vowel it is pronounced ye, and after a consonant it is pronounced e. Note that ye is rare in Yaghnobi and is found only in Tajik or Russian loans; the only example of /je/ is Европа [ˈjeːvrɔpa], itself a Russian loanword.

5) The Russian letters Ц ц, Щ щ, Ы ы and Ь ь, which can be used in Tajik loans from Russian, are not used in Yaghnobi. Russian words are written as they are pronounced by Yaghnobi speakers, not as they are spelled in Russian. For example, 'aeroplane' is самолет/самолёт in Russian and самолёт in Tajik, with a similar pronunciation in both, but in Yaghnobi it is written самалиёт, reflecting the Yaghnobi pronunciation. Likewise 'concert' was borrowed into Yaghnobi from Russian концерт in the form кансерт (cf. Tajik консерт).

6) After consultation with Sayfiddīn Mīrzozoda, a distinction between the sounds /v/ and /w/ needs to be established: the letter в will be used for /v/, and another letter should be adopted for /w/. By agreement, the Latin letter W w would be the best choice, and the combination Хw хw should be used to represent /x°/. Mīrzozoda uses the letter w in some texts, but unfortunately this notation has been inconsistent.
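The letter values and notes above can be turned into a rough transliteration sketch. The function name `transliterate` and the table `CYR_TO_LAT` are hypothetical, introduced only for illustration. The sketch assumes that в stands for /v/ (as proposed in note 6), and wherever the alphabet lists several Latin values for one letter (и, у, ӯ, ю) it simply picks the first, so vowel length, ʏ and stress are not recovered.

```python
import unicodedata

# Values follow the alphabet table above. Where the table gives several
# Latin values for one letter, the first is used (an assumption).
_RAW = {
    "а": "a", "б": "b", "в": "v", "w": "w", "г": "g", "ғ": "γ", "д": "d",
    "ё": "yo", "ж": "ž", "з": "z", "и": "i", "ӣ": "ī", "й": "y", "к": "k",
    "қ": "q", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ӯ": "ū", "ф": "f", "х": "x", "ҳ": "h",
    "ч": "č", "ҷ": "ǰ", "ш": "š", "ъ": "ع", "э": "e", "ю": "yu", "я": "ya",
}
# Normalise keys so composed and decomposed input are treated alike.
CYR_TO_LAT = {unicodedata.normalize("NFC", k): v for k, v in _RAW.items()}
CYR_VOWELS = set(unicodedata.normalize("NFC", "аеёиӣоуӯэюя"))


def transliterate(text: str) -> str:
    """Convert Cyrillic-script Yaghnobi to the Latin transcription (lossy)."""
    text = unicodedata.normalize("NFC", text.lower())
    out = []
    prev = ""  # previous source character; "" means start of text
    i = 0
    while i < len(text):
        ch = text[i]
        if text.startswith("хw", i):
            # The digraph хw stands for the labialised x°.
            out.append("x°")
            prev = "w"
            i += 2
            continue
        if ch == "е":
            # Note 4: ye word-initially and after a vowel, e after a consonant.
            word_initial = not prev.isalpha()
            out.append("ye" if word_initial or prev in CYR_VOWELS else "e")
        else:
            # Unknown characters (spaces, punctuation) pass through unchanged.
            out.append(CYR_TO_LAT.get(ch, ch))
        prev = ch
        i += 1
    return "".join(out)


print(transliterate("виёра"))    # the Latin form, without the stress mark
print(transliterate("кансерт"))
```

Because the Cyrillic spelling marks neither stress nor, in most cases, vowel length, the output is only an approximation of the scholarly transcription; the example виёра comes out without the accent of viyóra.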


Yaghnobi has 9 vowels (3 short and 6 long) and 27 consonants.


short: i [i-ɪ-e], a [(æ-)a(-ɑ)], u [(y-)u-ʊ-o] (all short vowels might be reduced approximately to [ə] in pretonic positions)

long: ī [i:], e [ɛ:(-e:)], ā [a:(-ɑ:)], o [(ɒ:-)ɔ:(-o:-u:)], ū [u:], ʏ [(u:-)y:(-i:)]

diphthongs: ay [ai̯] (in native words ay appears only in the western dialects, changing to e in the eastern; ay can also appear in the eastern dialect, but with a different etymology), oy [ɔ:i̯], uy [ʊi̯], ūy [u:i̯], ʏy [y:i̯], iy [ɪi̯]; ow [ɔ:u̯], aw [au̯]

(Vowel chart, arranged by backness: front, near-front, central, near-back, back.)


1) Please note that long e, o and ʏ are conventionally not written with the lengthening sign.

2) Long ā is recognised, but it appears only as a result of secondary lengthening (e.g. ǰām < ǰaعm < ǰamع).

3) In recent borrowings from Tajik ů and/or Uzbek, [ɵ:] can also appear, but its pronunciation usually merges with ū.

4) The vowel ʏ is recognised by some authorities but not by others. It seems to be an allophone of ū. It originates from historical stressed *ū, whereas historical *ō, which changed to ū in Yaghnobi, remains unchanged. The status of ʏ seems to be unstable, and it is not recorded in all varieties of Yaghnobi; it is often realised as ū, ūy, uy or ʏ. In summary: *ū́ (under stress) > ū/ūy/uy/ʏ or ū, *ō > ū (e.g. vʏz/vūz 'goat'; Tajik buz, Avestan buza-). Some authorities transcribe ʏ as ü.

5) Vowel o can change to ū in front of a nasal (cf. Toǰīkistón × Toǰīkistū́n, nom × nūm).

6) The vowel e is considered long, but before h or ع its pronunciation is somewhat shorter, so that e is realised as a half-short (or even short) vowel. Etymologically this "short" e before h or ع comes from older *i, and Yaghnobi pronunciation shows an alternation e/i before h/ع: when the historical cluster *ih or *iع appears in a closed syllable, *i changes to e, while in an open syllable the change does not take place (a development similar to that in Tajik). The change can be seen in the verb dih-/deh-: infinitive díhak × 3rd sg. present déhči.

7) The Yaghnobi dialects show different developments of the historical svarabhakti vowel: in the Western and Transitional dialects it is rendered as i (or u under certain circumstances), but in the Eastern dialect it changes to a (but also i or u), e.g. *θray > *θəráy > W./Tr. tiráy × E. saráy, but *βrāt > *vərāt > W./Tr./E. virót. When the following vowel is a back vowel, *ə usually changes to u in the Western and Transitional dialects: *(čə)θβār > *tfār > *təfór > W./Tr. tufór (but also tifór) × E. tafór, *pδūfs- > *bədū́fs > W./Tr./E. budū́fs-. The latter change also appears in morphology: the verb tifárak (the same in all three dialects) has the 3rd sg. present form tufórči < *təfár- < *tfar- < *θβar-. The alternation i/a can also be seen in Tajik loans, where an unstressed vowel can undergo this change: W./Tr. širī́k × E. šarī́k < Tajik šarīk /šarīk/, W./Tr. xipár × E. xapár < Tajik xabar /xabar/. The former svarabhakti vowels are often ultra-short or reduced in pronunciation, and in some cases they can disappear in fast speech: xišáp /xišáp × xišáp × xšap/ < *xəšáp < *xšap.

8) The vowel a changes to o in verbal stems of the type -Car- when an ending containing historical *θ or *t is added: tifár-, infinitive tifárak, 1st sg. present tifarómišt, but 3rd sg. present tufórči (the ending -či comes from older -tišt), 2nd pl. present W./Tr. tufórtišt, E. tufórsišt; x°ar-: x°árak : x°arómišt : xórči : xórtišt/xórsišt (note also that when a changes to o after x°, the x° loses its labialisation). This change takes place in all verbs of Yaghnobi origin and in older loans from Tajik; in recent loans a remains unchanged, e.g. gudár(ak) : gudórči × pár(ak) : párči. The first verb is an old loan from Tajik guzaštan < guδaštan; the latter is a recent loan from parrīdan.


stops: p, b, t, d, k, g, q (k and g are palatalised before a front vowel or after a front vowel at the end of a word)

fricatives: f, v, s, z, š, ž, x, γ, x°, h (with an allophone that appears between vowels or voiced consonants), ẖ, ع

affricates: č, ǰ

nasals: m, n (both have positional allophones before certain consonants)

liquids: r, l

approximants: w, y

(Consonant chart, arranged by manner of articulation and by place of articulation: bilabial, labiodental, alveolar, postalveolar or palatal, velar, uvular or labialised uvular, pharyngeal.)

All voiced consonants are pronounced voiceless at the end of a word. In connected speech, when a voiceless consonant is followed by a voiced one, the voiceless consonant becomes voiced by assimilation. When q is voiced, its voiced counterpart is γ, not [ɢ].

Note: The sounds b, g, h, ẖ, ǰ, q, l and ع appear mostly in loanwords; native words with these sounds are rare and mostly onomatopoeic.


Note: In the following sections the symbols W, E and Tr. refer to the western, eastern and transitional dialects.


Case endings:
Case | Stem ending in a consonant | Stem ending in a vowel other than -a | Stem ending in -a
Sg. Direct (Nominative) | - | - | -a
Sg. Oblique | -i | -y | -ay (W), -e (E)
Pl. Direct (Nominative) | -t | -t | -ot
Pl. Oblique | -ti | -ti | -oti

  • kat : káti, pl. katt, kátti
  • mayn (W) / men (E) : máyni/méni, pl. maynt/ment, máynti/ménti
  • póda : póday/póde, pl. pódot, pódoti
  • čalló : čallóy, pl. čallót, čallóti
  • zindagī́ : zindagī́y, pl. zindagī́t, zindagī́ti
  • mórti : mórtiy, pl. mórtit, mórtiti
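The case table and the examples above can be read as a small rule set, sketched here. The function name `decline` and the result keys are hypothetical. The sketch assumes that a stem counts as an -a stem only when it ends in a plain, unaccented a, and it attaches endings mechanically without adding or moving stress marks (so it yields kati where the examples give káti).

```python
import unicodedata

VOWELS = set("aeiouʏ")


def _final_base_letter(stem: str) -> str:
    """Last letter of the stem with length and stress diacritics removed."""
    base = [c for c in unicodedata.normalize("NFD", stem)
            if unicodedata.category(c) != "Mn"]
    return base[-1] if base else ""


def decline(stem: str, dialect: str = "W") -> dict:
    """Return the four case/number forms of a noun for dialect 'W' or 'E'."""
    if dialect not in ("W", "E"):
        raise ValueError("the table only gives endings for W and E")
    if stem.endswith("a"):
        # -a stems: the final -a is replaced by -ay/-e, -ot, -oti.
        root = stem[:-1]
        return {
            "sg_dir": stem,
            "sg_obl": root + ("ay" if dialect == "W" else "e"),
            "pl_dir": root + "ot",
            "pl_obl": root + "oti",
        }
    # Other vowel stems take -y in the oblique singular, consonant stems -i.
    sg_obl_ending = "y" if _final_base_letter(stem) in VOWELS else "i"
    return {
        "sg_dir": stem,
        "sg_obl": stem + sg_obl_ending,
        "pl_dir": stem + "t",
        "pl_obl": stem + "ti",
    }


for noun in ("kat", "póda", "čalló", "mórti"):
    print(noun, decline(noun, "W"))
```

For stems that already carry a stress mark in the examples (póda, čalló, mórti) the output agrees with the listed forms; for the others only the segmental shape is produced.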


Forms of the personal pronouns:

Person | Nominative Singular | Oblique Singular | Enclitic Singular | Nominative Plural | Oblique Plural | Enclitic Plural
1st | man | man | -(i)m | mox | mox | -mox
2nd | tu | taw | -(i)t | šumóx | šumóx | -šint
3rd | ax | áwi, ít(i) | -(i)š | áxtit, íštit | áwtiti, ítiti | -šint

The 2nd person plural, šumóx, is also used as the polite form of the 2nd person.


Personal endings - present:

Person | Singular | Plural
1st | -omišt | -īmišt
2nd | -īšt | -tišt (W, Tr.), -sišt (E)
3rd | -tišt (W), -či (E, Tr.) | -ošt

Personal endings - preterite (with augment a-):

Person | Singular | Plural
1st | a- -im | a- -om (W), a- -īm (E, Tr.)
2nd | a- | a- -ti (W, Tr.), a- -si (E)
3rd | a- - | a- -or

By adding the ending -išt (-št after a vowel) to the preterite, a durative preterite is formed.

Participles: the present participle is formed by adding -na to the verbal stem. The past participle (or perfect participle) is formed by adding -ta to the stem.

The infinitive is formed by adding the ending -ak to the verbal stem.

Negation is formed with the prefix na-; in combination with the augment in the preterite it changes to nē-.
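The formation rules just listed (infinitive, participles, augmented preterite, durative preterite and negation) can be sketched as plain string operations. All function names here are hypothetical. The sketch only concatenates affixes: it ignores stress shifts and the stem-vowel alternations described in the phonology notes, so its output shows the mechanics of the rules and should not be read as a list of attested forms.

```python
import unicodedata

VOWELS = set("aeiouʏ")


def ends_in_vowel(form: str) -> bool:
    """True if the last letter, ignoring diacritics, is a vowel."""
    base = [c for c in unicodedata.normalize("NFD", form)
            if unicodedata.category(c) != "Mn"]
    return bool(base) and base[-1] in VOWELS


def infinitive(stem: str) -> str:
    return stem + "ak"


def present_participle(stem: str) -> str:
    return stem + "na"


def past_participle(stem: str) -> str:
    return stem + "ta"


def preterite(stem: str, ending: str = "") -> str:
    """Augment a- plus stem plus a personal ending taken from the table."""
    return "a" + stem + ending


def durative_preterite(preterite_form: str) -> str:
    """Add -išt, or -št after a vowel, to a preterite form."""
    return preterite_form + ("št" if ends_in_vowel(preterite_form) else "išt")


def negate(form: str, augmented: bool = False) -> str:
    """Prefix na-; with the preterite augment, na- + a- gives nē-."""
    if augmented:
        if not form.startswith("a"):
            raise ValueError("an augmented form is expected to start with a-")
        return "nē" + form[1:]
    return "na" + form


print(infinitive("tifár"))          # tifárak, as cited in the vowel notes
print(durative_preterite(preterite("par", "im")))
print(negate(preterite("par", "im"), augmented=True))
```

The stem par- is used because the text describes it as a recent loan whose vowel does not alternate; stress is left unmarked in the generated forms.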

Copula - Present:

Person | Singular | Plural
1st | īm | om
2nd | išt | ot (W, Tr.), os (E)
3rd | ast, -x, xast, ásti, xásti | or


Present knowledge of the Yaghnobi lexicon comes from three main works: a Yaghnobi-Russian dictionary presented in Yaghnobi texts by Andreyev and Peščereva, a supplementary word list presented in the Yaghnobi grammar by Xromov, and a Yaghnobi-Tajik dictionary compiled by Xromov's student Sayfiddīn Mīrzozoda. As far as is now known, Tajik words make up the majority of the lexicon (some 60%), followed by words of Turkic origin (up to 5%, mainly from Uzbek) and a few Russian words (approx. 2%; many international words also entered Yaghnobi through Russian). Only about one third of the lexicon is therefore of Eastern Iranian origin; these words can easily be compared with those known from Sogdian, Ossetian, the Pamir languages or Pashto.

Sample text

"Fálγar-at Yáγnob asosī́ láfz-šint ī-x gumū́n, néki áxtit toǰīkī́-pi wó(v)ošt, mox yaγnobī́-pi. 'Mʏ́štif' wó(v)omišt, áxtit 'Muždív' wó(v)ošt."

"In Falghar and in Yaghnob is certainly one basic language, but they speak Tajik and we speak Yaghnobi. We say 'Müštif', they say 'Muždiv'."

(In edited Cyrillic orthography it could have been written this way: "Фалғарат Яғноб асосӣ лафзшинт ӣх гумӯн, неки ахтит тоҷӣкӣпи wоошт, мох яғнобӣпи. 'Мӯштиф' (Мыштиф) wоомишт, ахтит 'Муждив' wоошт.")



