Thomas V. Gamkrelidze and V. V. Ivanov

The Early History
of Indo-European Languages

Scientific American, March 1990, pp. 110-116*


* This material is presented solely for non-commercial educational/research purposes.

[110] The common ancestor of these languages has been traced to Asia rather than to Europe, the authors say. The once-clear distinction between the family's Eastern and Western branches is now blurred.


Linguistics, the scientific study of language, can reach more deeply into the human past than the most ancient written records. It compares related languages to reconstruct their immediate progenitors and eventually their ultimate ancestor, or protolanguage. The protolanguage in turn illuminates the lives of its speakers and locates them in time and place.

The science developed from the study of the Indo-European superfamily of languages, by far the largest in number of languages and number of speakers. Nearly half of the world's population speaks an Indo-European language as a first language; six of the 10 languages in which Scientific American appears--English, French, German, Italian, Russian and Spanish--belong to this superfamily.

Over the past 200 years, linguists have reconstructed the vocabulary and syntax of the postulated Indo-European protolanguage with increasing confidence and insight. They have tried to unravel the paths by which the language broke into daughter languages that spread throughout Eurasia, seeking at the origin of those paths the homeland of the protolanguage itself. The early investigators placed the homeland in Europe and posited migratory paths by which the daughter languages evolved into clearly defined Eastern or Western branches. Our work indicates that the protolanguage originated more than 6,000 years ago in eastern Anatolia and that some daughter languages must have differentiated in the course of migrations that took them first to the East and later to the West.

The reconstruction of ancient languages may be likened to the method used by molecular biologists in their quest to understand the evolution of life. The biochemist identifies molecular elements that perform similar functions in widely divergent species to infer the characteristics of the primordial cell from which they are presumed to have descended. So does the linguist seek correspondences in grammar, syntax, vocabulary and vocalization among known languages in order to reconstruct their immediate forebears and ultimately the original tongue. Living languages can be compared directly with one another; dead languages that have survived in written form can usually be vocalized by inference from internal linguistic evidence. Dead languages that have never been written, however, can be reconstructed only by comparing their descendants and by working backward according to the laws that govern phonological change. Phonology--the study of word sounds--is all-important to historical linguists because sounds are more stable over the centuries than are meanings.

Early studies of Indo-European languages focused on those most familiar to the original European researchers: the Italic, Celtic, Germanic, Baltic and Slavic families. Affinities between these and the "Aryan" languages spoken in faraway India were noticed by European travelers as early as the 16th century. That they might all share a common ancestor was first proposed in 1786 by Sir William Jones, an English jurist and student of Eastern cultures. He thus launched what came to be known as the Indo-European hypothesis, which served as the principal stimulus to the founders of historical linguistics in the 19th century.

In their reconstruction of the ancestral Indo-European language, the early linguists relied heavily on Grimm's law of Lautverschiebung ("sound shift"), which postulated that sets of consonants displace one another over time in predictable and regular fashion. The law was posed in 1822 by Jacob Grimm, who is more widely famed for the anthology of fairy tales he wrote with his brother, Wilhelm. Grimm's law explained, among other things, why in the Germanic languages certain hard consonants had persisted despite their universal tendency to yield to soft ones. The set of softer, "voiced" consonants "b," "d," "g" (followed by momentary vibration of the vocal cords), posited in the protolanguage, had apparently given way to the corresponding hard set "p," "t," "k." According to Grimm's law, this had come about by "devoicing" those consonants ("p," for example, is unaccompanied by vocal vibration). Thus, the Sanskrit dhar is seen as an archaic form of the English "draw," which is itself more archaic than the German tragen (all of which mean "to pull").
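
The devoicing step attributed to Grimm's law above can be sketched as a simple substitution table. This is only a toy illustration, assuming a transcription in which each stop is a single ASCII letter; the table `DEVOICING` and the function `devoice` are hypothetical names, and they cover only the three voiced stops named in the text, not the full law or its phonetic conditions.

```python
# Toy model of the devoicing described above: the voiced stops
# "b", "d", "g" give way to the voiceless stops "p", "t", "k".
# Illustrative only: real sound laws depend on phonetic context.
DEVOICING = {"b": "p", "d": "t", "g": "k"}


def devoice(form: str) -> str:
    """Replace every voiced stop in `form` with its voiceless counterpart."""
    return "".join(DEVOICING.get(ch, ch) for ch in form)


print(devoice("bdg"))   # the three voiced stops in sequence
print(devoice("gwou"))  # classical reconstruction of the "cow"/"ox" word, asterisk omitted
```

Applied to the classical reconstruction of the word for "cow," the table turns the initial "g" into "k," which is the change the classical account has to assume for the Germanic forms.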

These rules were used to reconstruct an Indo-European vocabulary that implies how its speakers lived.

[111] The words described a landscape and climate that linguists originally placed in Europe between the Alps in the south and the Baltic and North seas in the north [see "The Indo-European Language," by Paul Thieme; Scientific American, October, 1958].

More recent evidence now places the probable origin of the Indo-European language in western Asia. Three generations of archaeologists and linguists have thus far excavated and deciphered manuscripts in close to a dozen ancient languages from sites in modern Turkey and as far east as Tocharia, in modern Turkestan. Their observations, together with new ideas in pure linguistic theory, have made it necessary to revise the canons of linguistic evolution.

The landscape described by the [112] protolanguage as now resolved must lie somewhere in the crescent that curves around the southern shores of the Black Sea, south from the Balkan peninsula, east across ancient Anatolia (today the non-European territories of Turkey) and north to the Caucasus Mountains [see illustration below; Note: all illustrations are on the second page of this article]. Here the agricultural revolution created the food surplus that impelled the Indo-Europeans to found villages and city-states from which, about 6,000 years ago, they began their migrations over the Eurasian continent and into history.

Some of the migrants invaded Anatolia from the East around 2000 B.C. and established the Hittite kingdom, which held all of Anatolia in its power by 1400 B.C. Its official language was among the first of the Indo-European languages to find its way into writing. Early in this [20th] century, Bedrich Hrozny, a linguist at Vienna University and later at Charles University in Prague, deciphered Hittite inscriptions (written in cuneiform, the ancient writing system based on wedge-shaped symbols) on tablets that had been found in the library of the capital at Hattusas, 200 kilometers east of modern Ankara. The library also contained cuneiform tablets in two related languages: Luwian and Palaic. The evolution of Luwian could be traced in later hieroglyphic inscriptions made around 1200 B.C., after the fall of the Hittite Empire. To this emerging family of Anatolian languages linguists added Lydian (closer to Hittite) and Lycian (closer to Luwian), known from inscriptions dating back to late in the first millennium B.C.

The appearance of Hittite and other Anatolian languages at the turn of the third to the second millennium B.C. sets an absolute chronological limit for the breakup of the Indo-European protolanguage. Because the Anatolian protolanguage had already fissioned into daughter languages by that point, investigators estimate that it departed from the parent Indo-European no later than the fourth millennium B.C. and possibly much earlier.

This inference is supported by what is known about the portion of the Indo-European community that remained after the Anatolian family had broken away. From that community came the languages that persisted into written history. The first to branch off was the Greek-Armenian-Indo-Iranian language community. It must have begun to do so in the fourth millennium B.C. because by the middle of the third millennium B.C. the community was already dividing into two groups, namely, the Indo-Iranian and the Greek-Armenian. Tablets in the Hattusas archives show that by the middle of the second millennium B.C. the Indo-Iranian group had given rise to a language spoken in the Mitanni kingdom on the southeast frontier of Anatolia that was already different from ancient Indian (commonly called Sanskrit) and ancient Iranian. Cretan-Mycenaean texts from the same eras as Mitanni, deciphered in the early 1950's by the British scholars Michael G. F. Ventris and John Chadwick, turned out to be in a previously unknown dialect of Greek. All these languages had gone their separate ways from Armenian.

Tocharian was another language family that diverged from the Indo-European protolanguage quite early. Tocharian is one of the more recently discovered Indo-European languages, first recognized in the early decades of the 20th century in texts from Chinese Turkestan. The texts were comparatively easy to decipher because they were written in a variant of the Brahmi script and were mainly translations from known Buddhist writings.

Not long ago, the British scholar W. N. Henning suggested that the Tocharians be identified with the Gutians, who are mentioned in Babylonian cuneiform inscriptions (in Akkadian, a Semitic language) dating from the end of the third millennium B.C., when King Sargon was building the first [113] great Mesopotamian Empire. If Henning's views are correct, the Tocharians would be the first Indo-Europeans to appear in the recorded history of the ancient Near East. Lexical affinities of Tocharian with Italo-Celtic give evidence that the speakers of the two language families had associated in the Indo-European homeland before the Tocharians began their migration eastward.

The diverging pathways of linguistic transformation and human migration may now be traced back to a convergence in the Indo-European protolanguage and its homeland. This has followed from the revision in the canons of phonology we mentioned above. An uncontested peculiarity of the sound system of the protolanguage, for example, is the near absence, or suppression, of one of the three consonants "p," "b" or "v," which are labials (consonants sounded with the lips). Traditionally, it had been thought that "b" was the suppressed consonant. Subsequent studies in phonology indicated, however, that if one of the three labial consonants is lacking in a language, it is least likely to be the one sounded as "b" in English and other living European languages.

On that basis we decided to reexamine the entire system of consonants posited for the protolanguage, and as early as 1972 we proposed a new system of consonants for the language. Our proposal remains in the crucible of debate from which consensus forms in every science. The debate now focuses more strongly on features that relate the Indo-European protolanguage to other major language families and that have at last begun to bring their common ancestor into view.

According to classical theory, the "stop" consonants--those that are sounded by interruption of the outward flow of the breath that excites the vibration of the glottis, or vocal cords--are divided into three categories [see top of illustration on this page. Note: all illustrations are on the second page of this article]. The labial stop consonant "b" appears in the first column as a voiced consonant; the parentheses enclosing it there indicate its supposed suppression. It is associated with two other voiced stop consonants: "d" (stopped by the forward part of the tongue against the palate) and "g" (stopped by the back of the tongue against the palate).

In the scheme we have developed [see bottom of illustration on this page. Note: all illustrations are on the second page of this article], the corresponding consonants are sounded with a glottalized stop: a closure of the throat at the vocal cords that prevents the outward flow of breath. Here the voiceless labial stop ("p'") appears suppressed, followed by "t'" and "k'". As ("p'") is to ("b"), voiceless and voiced, respectively, so "t'" is to "d" and "k'" is to "g". Glottalized stops occur in many different language families, particularly those of northern Caucasian and southern Caucasian (Kartvelian) provenance. The glottalized stop--which hardens a consonant--tends to weaken and disappear in most languages of the world. So we surmised that--among the labial stops--it was the "p'" rather than the "b" that most likely had been suppressed in the Indo-European protolanguage.

Our so-called Indo-European glottalic system, which has been constructed by comparing the phonology of the living and the historically attested Indo-European languages, appears more probable than the classical one. The near absence of the labial phoneme ("p'") finds a natural phonological explanation in relation to the evolution of the other two glottalized stops and to the entire system of stops shown above.

In revising the consonant system of the Indo-European protolanguage, we have also called into question the paths of transformation into the historical Indo-European languages. Our reconstruction of the protolanguage's consonants shows them to be closer to those of the Germanic, Armenian and Hittite daughter languages than to Sanskrit. This neatly reverses the classical conception that the former languages had undergone a systematic sound shift, whereas Sanskrit had faithfully conserved the original sound system.

The transformation of consonants from parent to daughter languages may be illustrated by the word "cow" in English and Kuh in German; in Sanskrit the word for "ox" is gauh, and in Greek it is bous. All have long been recognized as descending from a common Indo-European word for "ox," or "cow." The word has different forms, however, in the glottalic and classical systems. In the glottalic it has the voiceless consonant *k'wou- (the asterisk before a word designates it as a word in the protolanguage), which makes it phonetically closer to the corresponding words in English and German than to those in Greek and Sanskrit.

In the classical system the word is *gwou, which is practically the same as that in Sanskrit. In accordance with Grimm's law, the transformation of *gwou to the German would require devoicing of the first consonant from "g" to "k." And so the glottalic system seems to make the most sense: it eliminates the need for devoicing and correlates the voiceless stops in the Germanic languages (German, Dutch, Scandinavian and English) with voiceless glottalized stops in the ancestral Indo-European protolanguage. In this respect the Germanic languages are more archaic than Sanskrit and Greek. The glottalic system is seen, correspondingly, as more conservative than the classical system. It has brought the [114] protolanguage closer to some of its daughter languages without resorting to such difficult phonological transformations as that from "g" to "k."
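
The comparison between the two reconstructions can be made concrete by listing which features of the initial stop must change on the way to each daughter language. The sketch below is a minimal illustration, assuming a simplified two-feature description of each stop (voiced or not, glottalized or not) and ignoring the labial element written "w"; the class `Stop` and the function `changed_features` are hypothetical names introduced here, not part of either system.

```python
from typing import List, NamedTuple


class Stop(NamedTuple):
    """Simplified description of a stop consonant (an assumption of this sketch)."""
    symbol: str
    voiced: bool
    glottalized: bool


# Initial consonant of the word for "cow"/"ox" under each account.
CLASSICAL = Stop("g", voiced=True, glottalized=False)    # classical *gwou
GLOTTALIC = Stop("k'", voiced=False, glottalized=True)   # glottalic *k'wou-
GERMANIC = Stop("k", voiced=False, glottalized=False)    # English "cow", German Kuh
SANSKRIT = Stop("g", voiced=True, glottalized=False)     # Sanskrit gauh


def changed_features(parent: Stop, daughter: Stop) -> List[str]:
    """List the features that differ between a reconstructed stop and its reflex."""
    changes = []
    if parent.voiced != daughter.voiced:
        changes.append("voicing")
    if parent.glottalized != daughter.glottalized:
        changes.append("glottalization")
    return changes


for name, proto in (("classical", CLASSICAL), ("glottalic", GLOTTALIC)):
    print(name, "-> Germanic:", changed_features(proto, GERMANIC))
    print(name, "-> Sanskrit:", changed_features(proto, SANSKRIT))
```

Under this encoding the glottalic form reaches the Germanic "k" by losing glottalization alone, with no devoicing, while it is the Sanskrit "g" that requires a change of voicing. That is the sense in which the Germanic forms count as the more archaic ones here.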

We can learn more about the earliest Indo-Europeans from other aspects of their reconstructed vocabulary. Some words, for example, describe an agricultural technology whose existence dates back to 5000 B.C. By that time the agricultural revolution had spread north from its origins in the Fertile Crescent, where the first archaeological evidence of cultivation dates back to at least 8000 B.C. From this region agriculture also spread southward to sustain the Mesopotamian civilizations and westward to Egypt. The Indo-European words for "barley," "wheat" and "flax"; for "apples," "cherries" and their trees; for "mulberries" and their bushes; for "grapes" and their vines; and for the various implements with which to cultivate and harvest them describe a way of life unknown in northern Europe until the third or second millennium B.C., when the first archaeological evidence appears.

The landscape described by the reconstructed Indo-European protolanguage is mountainous--as evidenced by the many words for high mountains, mountain lakes and rapid rivers flowing from mountain sources. Such a picture cannot be reconciled with either the plains of central Europe or the steppes north of the Black Sea, which have been advanced as an alternative homeland for the Indo-Europeans. The vocabulary does, however, fit the landscape of eastern Anatolia and Transcaucasia, backed by the splendor of the Caucasus Mountains. The language clothes its landscape in the flora of this region, having words for "mountain oak," "birch," "beech," "hornbeam," "ash," "willow" or "white willow," "yew," "pine" or "fir," "heather" and "moss." Moreover, the language has words for animals that are alien to northern Europe: "leopard," "snow leopard," "lion," "monkey" and "elephant."

The presence of a word for "beech tree," incidentally, has been cited in favor of the European plains and against the lower Volga as the putative Indo-European homeland. Beech trees, it is true, do not grow east of a line drawn from Gdansk on the Baltic to the northwest corner of the Black Sea. Two species of beech (Fagus orientalis and F. sylvatica) flourish, however, in modern Turkey. Opposing the so-called beech argument is the oak argument: paleobotanical evidence shows that oak trees (which are [115] listed in the reconstructed language's lexicon) were not native to postglacial northern Europe but began to spread there from the south as late as the turn of the fourth to the third millennium B.C.

Another significant clue to the identification of the Indo-European homeland is provided by the terminology for wheeled transport. There are words for "wheel" (*rotho-), "axle" (*hakhs-), "yoke" (*iuk'om) and associated gear. "Horse" is *ekhos and "foal" *pholo. The bronze parts of the chariot and the bronze tools, with which chariots were fashioned from mountain hardwoods, furnish words that embrace the smelting of metals. Petroglyphs, symbols marked on stone, found in the area from the Transcaucasus to upper Mesopotamia between the lakes Van and Urmia are the earliest pictures of horse-drawn chariots.

The postulated homeland of the Indo-Europeans is, if not the only region, certainly one of the regions in which the horse completed its domestication and was harnessed as a draft animal in the fourth millennium B.C. From here wheeled vehicles spread with the migration of the Indo-Europeans in the third and second millennia B.C. eastward to central Asia, westward to the Balkans, and in a circular motion around the Black Sea and thence to central Europe.

The chariot provides significant evidence of cultural mixing, for chariots figured in the funerary and other religious rites of both the Indo-European peoples and the Mesopotamians. Contact with other western Asiatic cultures is also evidenced in the sharing of various mythological subjects--for example, the theft of the Hesperian apples by Hercules and similar tales in Norse and Celtic. Moreover, the Semitic and Indo-European languages each identify man with the earth. In Hebrew, adam means "man" and adamah means "earth"; both were derived from a root in the Semitic protolanguage (cf. Genesis 2:7, "...God formed man from the dust of the ground"). "Human" and "humus" came to English through Latin (homo, humus) from *dheghom-, the word for "earth" and "man" (etymologically, "earthly creature") in the Indo-European protolanguage.

The rooting of the Indo-European languages in eastern Anatolia is also suggested by the frequency of words borrowed from a number of languages that flourished there: Semitic, Kartvelian, Sumerian and even Egyptian. Conversely, Indo-European contributed words to each of those languages. Nickolai I. Vavilov, a prominent Soviet plant geneticist, found a vivid instance of such an exchange: the Russian vinograd ("grape"), the Italic vino and the Germanic wein ("wine"). These all reach back to the Indo-European *woi-no (or *wei-no), the proto-Semitic *wajnu, the Egyptian *wns, the Kartvelian *wino and the Hittite *wijana.

We concede that in the broad territory in which we have placed the homeland of the Indo-Europeans there is no archaeological evidence of a culture that can be positively linked to them. Archaeologists have identified, however, a number of sites that bear evidence of a material and spiritual culture similar to the one implied by the Indo-European lexicon. The Halafian culture of northern Mesopotamia decorated its vessels with religious symbols--bulls' horns and sometimes rams' heads, which are masculine symbols, and ritual images of leopard skins--that are shared by the somewhat later Catal Huyuk culture of the seventh millennium B.C. in western Anatolia. Both cultures have affinities with the later Transcaucasian culture in the region embraced by the Kura and the Araks rivers, which includes southern Transcaucasia, eastern Anatolia and northern Iran.

In the 2,000 years before the Indo-Europeans who remained in the homeland began to write history, the success of the agricultural revolution brought a population explosion to the Indo-European community. The pressure of population, we may surmise, compelled the migration of successive waves of Indo-Europeans to fertile areas that were not yet cultivated.

The linguistic translocation of the Indo-European homeland from northern Europe to Asia Minor requires drastic revisions in theories about the migratory paths along which the Indo-European languages must have spread across Eurasia. Thus, the hypothetical Aryans who were said to have borne the so-called Aryan, or Indo-Iranian, language from Europe to India--and who were conscripted into service as the Nordic supermen of Nazi mythology--turn out to be the real Indo-Iranians who made the more plausible migration from Asia Minor around the northern slopes of the Himalaya Mountains and down through modern Afghanistan to settle in India. Europe is seen, therefore, as the destination, rather than the source, of Indo-European migration.

Speakers of the Hittite, Luwian and other Anatolian languages made [116] relatively small migrations within the homeland, and their languages died there with them. The more extensive migrations of speakers of the Greek-Armenian-Indo-Iranian dialects began with the breakup of the main Indo-European language community in the third millennium B.C. Two groups of Indo-Iranian speakers made their way East during the second millennium B.C. One of them, speakers of the Kafiri languages, survives to this day in Nuristan, on the southern slopes of the Hindu Kush in northeast Afghanistan. In Five Continents, a posthumous book recounting his many botanical expeditions between 1916 and 1933, Vavilov speculated that the Kafirs might perpetuate some "original relics" of Indo-Iranian.

The second group of Indo-Iranians, who followed a more southerly path into the Indus Valley, spoke a dialect from which the historical languages of India are descended. Their earliest literary ancestor is embodied in the Rig Veda hymns, written in an ancient variant of Sanskrit. The indigenous peoples of the Indus Valley, known from the archaeological discoveries at their capital Mohenjo-Daro, were apparently displaced by the Indo-Iranians. After the separation of the Indo-Iranians and their departure for the east, the Greek-Armenian community remained for a time in the homeland. There, judging by the numbers of loan words, they had contact with speakers of Kartvelian, Tocharian and the ancient Indo-European languages that later evolved into the historical European languages. One such borrowing from the Kartvelian became the Homeric koas, "fleece."

A bilingual cuneiform tablet found in the Hattusas archives records the mythological tale of a hunter in the then already dead Hurrian language along with a translation into Hittite. This remarkable discovery gave us the Hurrian word ashi from which Homer's askos, for "hide" or "fur," apparently stemmed. Before their migration to the Aegean, the Greeks borrowed the Hittite word kursa, which by a familiar phonological shift became bursa, another synonym for "fleece." These words seem to confirm the Greeks' belief that their ancestors had come from western Asia, as recounted in the myth of Jason and the Argonauts, who sought the Golden Fleece in Colchis, on the eastern shore of the Black Sea. The evidence that the Greeks came thence to their historical homeland puts the Greek "colonies" on the northern shore of the Black Sea in a new light. The colonies may now be considered as very early settlements that were established when the Greeks began migrating to their final home in the Aegean.

The historical European languages--those that left literary remains--provide evidence that the dialects from which they descended had found their way into central Asia along with the Tocharians. These languages have many words in common. An example is the word for "salmon," once regarded as a weighty argument for a homeland in northern Europe. Salmon abounded in the Baltic rivers of Europe, and the word lox (German Lachs) in the Germanic languages is perhaps echoed by lak- in Hindi, for a lacquer of a pink color that evokes the color of salmon flesh. One species of salmon, Salmo trutta, is found in the streams of the Caucasus, and the lak-s- root denotes "fish" in earlier and later forms of Tocharian as well as in the ancient European languages.

The migration of the speakers of some of the early Indo-European dialects into central Asia is established by loan words from the Finno-Ugric language family, which gave rise to modern Finnish and Hungarian. Under the influence of Finno-Ugric, Tocharian underwent a complete transformation of its system of consonants. Words in the ancient European languages that are clearly borrowed from the Altaic and other languages of central Asia give further testimony to the sojourn of their speakers there.

Circling back to the west, the ancient Europeans settled for a time north of the Black Sea in a loosely federated community. Thus, it is not entirely wrong to think of this region as a second homeland for these peoples. From the end of the third through the first millennium B.C., speakers of ancient European languages spread gradually into Europe. Their coming is demonstrated archaeologically by the arrival of the seminomadic "pit grave" culture, which buried its dead in shafts, or barrows.

Anthropometry, which is the scientific measurement of the human body, has begun to chart the imposition of the Hittite physiognomy, typified in Hittite reliefs, on certain European populations. The blue-eyed, blond-haired Nordic must still be regarded as the product of inter-breeding between the Indo-European invaders and their predecessors in the settlement of Europe. The culture of the indigenous populations of Europe is memorialized by the megalithic structures, such as Stonehenge, which they built near the periphery of the continent.

The languages of the previous inhabitants of Europe, with the exception of Basque--a non-Indo-European language with possible remote relatives in the Caucasus--were crowded out by the Indo-European dialects. Nonetheless, those languages made contributions to the historical European language families that account for certain differences among them. In his study of the megalithic cultures and their disappearance, as well as of the spread of farming from the ancient Near East, the British archaeologist Colin Renfrew has reached conclusions about the coming of the Indo-Europeans that agree well with ours [see "The Origins of Indo-European Languages," by Colin Renfrew; Scientific American, October, 1989].

Our deductions, resting so preponderantly on linguistic evidence, must find confirmation in archaeological investigations that remain to be done. Undoubtedly, the counting of base-pair substitutions in the DNA of human cells will contribute to the family tree of the speakers of the Indo-European languages and to the mapping of their migrations. Anthropometry and history also will contribute to the ultimate picture. Pending the elaboration and correction of our work, we may state with a high order of certainty that the homeland of the Indo-Europeans, the cradle of much of the world's civilization, was in the ancient Near East: "Ex oriente lux!"


Further Reading

Indo-European and the Indo-Europeans: A Reconstruction and Historical Typological Analysis of a Protolanguage and Proto-Culture. Parts I and II. Thomas V. Gamkrelidze and Vjacheslav V. Ivanov. Tbilisi State University, 1984.

Archaeology and Language: The Puzzle of Indo-European Origins. Colin Renfrew. Cambridge University Press, 1988.

Reconstructing Languages and Cultures: Abstracts and Materials from the First International Inter-Disciplinary Symposium on Language and Prehistory, Ann Arbor, November 8-12, 1988. Edited by Vitaly Shevoroshkin. Studienverlag Dr. Norbert Brockmeier, 1989.

In Search of the Indo-Europeans: Language, Archaeology and Myth. J. P. Mallory. Thames and Hudson, 1989.

When Worlds Collide: Indo-Europeans and Pre-Indo-Europeans. Edited by John Greppin and T. L. Markey. Karoma Publishers, Inc., 1990.

