Anatolia and the Caucasus: the cradle of the Indo-Europeans

January 24, 2016


The most solidly argued theories about the cradle of the Indo-European (IE) proto-language are 1) the Pontic and 2) the Anatolian. The first proposes the steppes north of the Black Sea, in what are today southern Ukraine and southern Russia; the second, some region in central or eastern Anatolia. Both theories allow for contacts with the Caucasus and the indigenous Caucasian languages: from the north for the first, from the south for the second.

As we will see, the Pontic theory is weakened by the total absence of any contact between IE and the North-West Caucasian languages (NWC): Cherkess-Kabard-Abkhaz, which should have been its most immediate neighbours. NWC languages like Cherkess, which in historical times have occupied the Black Sea coast between Crimea and the Caucasus range, diverge from IE in all respects: typologically, lexically, phonologically.

The Anatolian theory, on the other hand, explains the profound affinities (typological, lexical, phonological) with proto-Semitic, proto-Kartvelian (South-Caucasian: SC) and proto-Nakh-Daghestani (the North-East Caucasian languages: NEC).

My arguments will thus be purely linguistic, and I will deliberately leave aside any discussion of archaeological continuity, or of the spread of certain kinds of pottery, weapons, burial types or decoration. In recent times, such arguments have been thoroughly discredited, and it is now firmly established that a) a culture that is uniform in its external aspects can be shared by different ethnic groups speaking widely divergent languages, and, vice versa, that b) speakers of one single language can belong to distinct material cultures.

In the first case, a), the Middle East, India, the Balkans or the Caucasus show us how different ethnic groups speaking different languages can partake of the same culture, wearing the same clothes, eating the same food and sharing the same customs. The overall Caucasian culture is very uniform, shared by Christians and Muslims alike, although they speak languages belonging to three indigenous families: SC (Kartvelian), NWC (Abkhaz-Cherkess-Kabardian) and NEC (Nakho-Daghestani, including Chechen). The same culture is also shared by Indo-European ethnic groups (Armenians and Ossetians) and by Turkic newcomers (Kumyk, Nogai, Balkars). There is, and there was, very little linguistic mingling in the Caucasus: languages were always kept apart, while seen from the outside the culture of the region presents a striking image of uniformity. On the basis of the material culture alone, an archaeologist of the future might be tempted to reconstruct a single ethnic group speaking mutually comprehensible languages.

In the second case, b), we have an example like the historically attested Etruscan language. It was the language of the powerful founders and first masters of Rome, spoken in what is modern-day Tuscany, and, although poorly understood (it is not an Indo-European language), it appears to have been rather uniform in its dialects. Nevertheless, without the testimony of the written texts, we might conclude that we are confronted with two different civilisations, for some Etruscans cremated their dead, burying the ashes in what we call “urn fields”, while others buried the embalmed corpses in elaborate, and often very rich, underground constructions imitating earthly dwellings.

I will thus firmly use only linguistic arguments in discussing the question of the cradle of the Indo-European languages.

First, a very important tool that has, to my knowledge, never been thoroughly taken into account is what linguistics calls “areal features”. Languages tend to influence each other locally, regionally. Tonal languages are thus spoken in sharply delimited areas: South-East Asia and the African Gulf of Guinea. The monosyllabic structure and the (sometimes) delicate and complicated system of tones in these languages are clearly areal features, both in South-East Asia and in the Gulf of Guinea, traits shared, for instance, by languages that are not genetically related, such as Chinese and Vietnamese.

In the same way, languages possessing grammatical genders, or nominal classes, on the IE model, tend also to be locally defined. Such are the Bantu languages in Africa, the Pama-Nyungan languages in Australia, and some language families in the Middle East and the Caucasus: Semitic, Indo-European and Nakho-Daghestani. These are the only regions on the planet where languages display the classificatory system of the grammatical genders (or nominal classes).

In Semitic, the number of nominal classes (grammatical genders) is, in the historically attested languages, only two, those that we conventionally call masculine and feminine. In the Nakho-Daghestani languages, there can be up to six (Chechen, Ingush and the related Batsbi have six genders), although the most common figure is three or four. Avar, the main language of Dagestan, has three genders (or nominal classes), which we can conveniently call “masculine”, “feminine” and “neuter”. In Avar (and the related languages of Dagestan) they function, syntactically and morphologically, exactly like the three corresponding classes (grammatical genders) in the Indo-European languages.

In Old Europe, none of the attested non-IE languages had grammatical genders. Neither Basque, which stems from one of the old Iberian languages, nor the northern Finnic languages, nor the extinct and rather well attested Etruscan, had grammatical genders or classes. Nor do the languages of Asia, be they Altaic, Uralic, or the tonal languages of South-East Asia. The appearance of the delicate mechanism of grammatical genders, combined with internal flexion, in a proto-language, or a group of closely related languages, such as proto-IE, in the vicinity of (proto-)Basque, Etruscan and Finnic is highly implausible.

All this takes us very far away from the proposed northern homelands, such as the steppes of today’s Ukraine and Southern Russia, where proto-Indo-European would have been in contact with Finno-Ugrian languages and languages from the north-west Caucasian branch (Cherkess, Kabard, Abkhaz). None of these has grammatical genders.

The Anatolian hypothesis has been proposed and convincingly argued by linguists and archaeologists such as Gamkrelidze and Ivanov, Trubetzkoy and Colin Renfrew. The starting point is the aforesaid “areal features”. In the Balkans, for instance, the languages form a Sprachbund, a unity of linguistic typology into which they converged regardless of their initial linguistic family. We saw that the languages of old, pre-IE Europe do not have genders, or nominal classes. Those from Anatolia, the Old Middle East and parts of the Caucasus do have them.

Then comes the flexion. Semitic (two grammatical genders), Kartvelian (no grammatical genders) and North-East Caucasian (NEC, with nominal classes) languages present a morphology and an internal flexion similar to Indo-European. Kartvelian even uses the same mechanism of Ablaut as the IE, as well as a personal flexion of the verb (which NEC languages don’t have).

In the same way, the flexion of the Semitic languages is built, as in IE, around Ablaut, the internal vowel shift from one grammatical category to another. In the Semitic languages, the play of the vowels within the flexion has a decisive morpho-semantic role, as in IE, SC (Kartvelian), or NEC (Chechen, Avar etc.), whereas in Basque, Finnish, Estonian or Lapp a root never modifies its vowel, but functions grammatically through agglutination. The two areal and typological models are widely dissimilar, even opposed.

The numerals

Kartvelian and IE languages borrowed in prehistoric times a series of numerals from proto-Semitic, especially the numerals 6 and 7. We thus have shesh and sheva in Hebrew, sitta and sab’a in Arabic, shetta and shub’a in Aramaic, etc.

In the Indo-European family, there is a close parallel: sex and septem in Latin, sechs and sieben in German, sześć and siedem in Polish, šeši and septyni in Lithuanian.

And in Kartvelian, 6 is ekvsi in Georgian, usgwa in Svan; 7 is švidi in Georgian, išgwid in Svan.

What is interesting and revealing is that a chassé-croisé of numeral designations took place. Thus, in Georgian the Semitic 4 (arb’a in Hebrew) became 8 (rva), while the Georgian 4 (oti) is identical with the IE 8: octo, ahtau, etc…

Moreover, 8 in Indo-European was a dual, something which is visible in Sanskrit, Avestan and Gothic: ahtau. A dual means that 8 designated “twice 4”, which sends us immediately to the Georgian oti = 4. Oti, if we reconstruct it as *okt– (-i is simply the termination of the nominative in Georgian), explains why the IE octo, ahtau is a dual. The same mechanism would explain why the Semitic 4 (arb’a in Hebrew) became 8 (rva) in Kartvelian (Georgian).

The prehistoric existence in the area of a counting system based on 4 would also explain why in Chechen (a NEC, Nakho-Daghestani language), the numeral 4 is the only one that receives prefixes of nominal classes, having thus different forms according to the gender of the defined noun.

A similar counting system based on 4 is attested in other language families, in Africa, or in the isolated Burushaski, in the Pamir mountains in the north of today’s Pakistan, where:

2 = alto
4 = walto
8 = altambo
20 = altar

It is thus perfectly coherent that the Georgian oti = 4, while the IE 8 octo (ahtau etc.) is a dual, that is: 4 x 2. In the same way, the Semitic 4 (arb’a in Hebrew) became the Georgian 8 = rva. This also vindicates Gamkrelidze’s theory that the formal identity, in IE languages, of the numeral 9 with the adjective “new” is not due to mere coincidence: novum-novem, neu-neun, new-nine etc. 9 was simply opening a new series.
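
For readers who like to see the bookkeeping laid out, the forms cited above can be put in a small Python table. This is only a sketch: the names NUMERALS, PROPOSED_SHIFTS, doubling_shifts and describe are illustrative inventions, the transliterations are those used in this post, and the code checks nothing but the arithmetic of the proposed pairings (that the receiving value is twice the source value). It says nothing about whether the etymologies themselves hold.

```python
# Numeral forms exactly as cited in the post (the post's own transliterations).
NUMERALS = {
    "Hebrew": {4: "arb'a", 6: "shesh", 7: "sheva"},
    "Georgian": {4: "oti", 6: "ekvsi", 7: "švidi", 8: "rva"},
    "Latin": {6: "sex", 7: "septem", 8: "octo"},
    "Gothic": {8: "ahtau"},
}

# The proposed chassé-croisé, written as (source, target) pairs of
# (language, numeric value). These pairings are the post's hypothesis,
# not established etymologies.
PROPOSED_SHIFTS = [
    (("Hebrew", 4), ("Georgian", 8)),   # arb'a -> rva
    (("Georgian", 4), ("Latin", 8)),    # oti ~ octo
    (("Georgian", 4), ("Gothic", 8)),   # oti ~ ahtau
]


def doubling_shifts(shifts):
    """Keep only the pairs where the target value is exactly twice the source value."""
    return [(src, dst) for src, dst in shifts if dst[1] == 2 * src[1]]


def describe(shift):
    """Format one pair as 'Language value (form) ~ Language value (form)'."""
    (src_lang, src_val), (dst_lang, dst_val) = shift
    src_form = NUMERALS[src_lang][src_val]
    dst_form = NUMERALS[dst_lang][dst_val]
    return f"{src_lang} {src_val} ({src_form}) ~ {dst_lang} {dst_val} ({dst_form})"


if __name__ == "__main__":
    for shift in doubling_shifts(PROPOSED_SHIFTS):
        print(describe(shift))
```

Every pairing in the list passes the "twice 4" check, which is all that the dual interpretation of 8 requires on the arithmetic side.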

All this indicates that Indo-European must have been formed in the vicinity of Semitic and Kartvelian and possibly other Caucasian languages. This excludes the possibility of a cradle north of the Black Sea, and totally excludes the Danube area, the Balkans, or any part of Eastern Europe. Those regions are too far from the Caucasus and from the Semitic languages, and we have seen that in the Neolithic the languages of today’s Europe might have had a typology similar to that of today’s Basque, or of the Finnic languages, which are agglutinative.

It is only the Anatolian hypothesis that explains the borrowings and the many common lexical terms between IE, Kartvelian and Semitic. The borrowings from Semitic into IE and Kartvelian are too numerous to be listed here. Between IE and Kartvelian we have surprising correspondences, such as the verbal root *sed– “to sit, to stay, to remain” (identical in IE and Kartvelian), or ordinal numerals such as the Georgian pirveli (first), which cannot come from a Slavic language, since Georgian had no contact with Slavic by the time of the first written texts in the 5th century.

The archaic lexical correspondences between IE and the North-East Caucasian languages (Chechen, Avar etc.) are also numerous, while Indo-European borrowings into Basque or Finnic are all recent and can be easily traced historically.

All this shows that proto-Indo-European was formed in Eastern Anatolia, in the vicinity of the Semites and Caucasians.


GAMKRELIDZE, T.V. and IVANOV, V. Индоевропейский язык и индоевропейцы (The Indo-European Language and the Indo-Europeans). Tbilisi, 1984.

TRUBETZKOY, Nikolai Sergeyevich. “Mysli ob indoevropeiskoi probleme.” In Izbrannye trudy po filologii, ed. T.V. Gamkrelidze. Moscow: Progress, 1987.

RENFREW, Colin. Archaeology and Language. The Puzzle of Indo-European Origins. London: Penguin Books, 1989.

Cf. also:

A structural comparison of Etruscan with the Kartvelian languages

Sucking the victim‘s mother‘s teats – the Etruscans and the Caucasian vendetta…

Yoga: the Chechen language and its prehistoric contacts with Indo-European…

Anatolia și Caucazul : leagănul primitiv al indo-europenilor – demonstrația lingvistică…

  1. Mister Dan Alexe, as you said about the dacopath disease, I believe that you will fall into the same mistake as the one you criticise. The Indo-European theory cannot be true. Why? Because the supporters of Indo-European say that IE is linked with Indo-Aryan, and that the cradle of IE civilization is in Europe… this is a major mistake, as big as the dacopath theory… All the linguists begin with Sanskrit… another mistake. Sanskrit was never spoken by people; it was an artificial language constructed by the Brahmin priests so as not to be understood by ordinary people. The mother language of Sanskrit is Prakrit, the natural languages of India. In those languages, many of them lost, a linguist can trace some surprising facts… that the cradle of Indo-Aryan is in India, and the other languages are Indo-European because, in their migration, the different tribes of Indo-Aryans mixed their Prakrit speech with that of the locals they met in Europe. Very interestingly, the Greeks are called Yuvanas in Brahminical treatises, and are listed as Aryan tribes. As you already know, Greek and Sanskrit are linked… My last question to you: DOES DAN ALEXE BELIEVE THAT THE ORIGINAL CRADLE OF INDO-ARYAN IS IN ANATOLIA?
    With respect, a true Rom.

    • Ovidiu

      It is obvious from the text that he strongly believes in the “Anatolian hypothesis”, and this despite its being a controversial subject.
      You see, everybody suffers from “dacopathy”, only they have different hobby-horses.

  2. Exactly, Ovidiu. But Mister Dan Alexe revealed some interesting facts in his already famous book.

  3. Dragos

    Most linguists favour the steppe hypothesis (not merely Pontic, but Pontic-Caspian). First off, we don’t know the geography of the Caucasian languages in the Neolithic. Be that as it may, a location north or north-east of the Caucasus is compatible with the affinities between PIE and Proto-Nakh-Daghestani and Proto-Kartvelian. On top of that, there are known morphological and lexical affinities (cognates or early borrowings) between PIE and Proto-Uralic.

    As for numerals, you must compare the words in proto-languages such as PIE, not their reflexes in Latin, German, Polish, Lithuanian, etc. For example, ‘six’ is usually reconstructed *(s)weḱs in PIE which is completely different from the (Proto-)Semitic numerals. The gender-as-areal-feature argument lacks consistency: the Nakh-Daghestani languages have grammatical gender (noun classes), but the Kartvelian languages do not.

    Furthermore, ancient central and eastern Anatolia was populated in those early times by speakers of exotic languages such as Hattic and Hurrian. The IE Anatolian languages seem intrusive in their regions and borrowed heavily from this non-IE substratum. On the other hand, there’s also the technological vocabulary, with words such as PIE *kwékwlos = ‘wheel’, derived from the verbal root *kwel- = ‘turn, revolve’, which must have been inherited in IE. Wheeled vehicles are documented after 4000-3500 BCE in the Eurasian steppes, providing a convenient homeland and date.

    • Finally a competent answer, although not to all arguments. Thank you. The objections are well known. The Ponto-Caspian steppe hypothesis does not explain, for instance, the total absence of contact with the Cherkess-Kabard-Abkhaz languages, nor, on the other hand, the many mutual lexical borrowings between proto-IE and Semitic.

      • Dragos

        The Northwestern Caucasian family is a hard nut to crack (monosyllabic roots, complex sound changes, extensive borrowing): we know very little about the putative proto-language, let alone about its relation with other languages/language families.
        Nevertheless, there are areal features common to PIE and Northern Caucasian languages:

        The connections between Proto-Indo-European and Proto-Semitic were first suggested in the 19th century (indogermanisch-semitisch) and have been constantly criticized ever since.

  4. There are NO “Northern Caucasian languages”. There are two totally unrelated families, the Cherkess/Adyge and the Chechen-Daghestani. They are extremely dissimilar in their typologies and structures, and even a linguist like Trubetzkoy, who had spoken Cherkess since childhood, having been raised in fosterage there, and who knew thoroughly the functioning of most other local languages, could not find any convincing proof of relatedness between the two families. It is only the “Nostratics”, with their totally unscientific method of comparing only reconstituted roots, that can speak en bloc about “Caucasian languages”, including even Kartvelian in the putative family.

  5. Dragos

    The languages spoken on the northern slopes of the Caucasus are Northern Caucasian. It is just a geographical name (cf. Mesoamerican).

    Speaking of “unscientific methods”, V. V. Ivanov (cited above) wrote a long article with the title “Comparative notes on Hurro-Urartian, Northern Caucasian and Indo-European”. Here indeed Northern Caucasian is meant typologically, drawing on the works of Russian scholars such as S. A. Starostin and S. L. Nikolayev.

    • I would have liked, if possible, for us to communicate in some other way as well. It is not every day that I get to have such conversations with people from Romania.

  6. How sad. I will certainly look for it. I have myself a fair knowledge of Chechen, as well as passive Avar and Georgian, and a solid personal library with grammars and dictionaries from/and in most languages of the area, so I can argue with precision, without stretching the facts.
