Vol. XVI · Deck 3 · The Deck Catalog

Indo-European.

From Sir William Jones's 1786 lecture in Calcutta to the 2015 ancient-DNA papers that confirmed the steppe homeland. The discovery, the reconstruction, the long argument over where Proto-Indo-European was spoken and how it spread.


Languages: ~445 living
Speakers: ~3.2 billion
Pages: 32

Lede

Opening. The largest family.

Three of every seven humans alive speak an Indo-European language. The family is the most successful in human history, and its discovery is the founding event of comparative linguistics.

The branches are familiar individually: Indic (Hindi, Urdu, Bengali, Punjabi, Sanskrit), Iranian (Persian, Pashto, Kurdish), Slavic (Russian, Polish, Czech, Serbo-Croatian), Baltic (Lithuanian, Latvian), Germanic (English, German, Dutch, Scandinavian), Celtic (Welsh, Irish, Scottish Gaelic, Breton), Italic (Latin and its Romance descendants), Hellenic (Greek), Albanian, Armenian, and the extinct branches — Anatolian (Hittite), Tocharian.

This deck covers the discovery, the comparative method, the reconstructed proto-language, the homeland controversy, the steppe-vs-Anatolian hypotheses, the ancient-DNA evidence that has reshaped the question, and the cultural legacy.

Jones

Chapter I. The 1786 lecture.

Sir William Jones — judge of the Calcutta Supreme Court, Sanskritist, polyglot — delivered the third anniversary discourse to the Asiatic Society of Bengal on 2 February 1786. His one paragraph on the Sanskrit language is the founding text of historical linguistics:

"The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong, indeed, that no philologer could examine them all three without believing them to have sprung from some common source, which, perhaps, no longer exists."

The "common source" is what we now call Proto-Indo-European. Jones got there in a paragraph. The next two centuries have been about the details.

Comparative method

Chapter II. How reconstruction works.

The comparative method took shape over the 19th century and was given its rigorous form by the Neogrammarians of late-19th-century Leipzig (Karl Brugmann, Berthold Delbrück, August Leskien, Hermann Osthoff). It is the systematic procedure by which proto-languages are reconstructed.

The principle: regular sound change. When a language splits into daughter languages, sounds shift in regular patterns. If you have enough cognate sets across enough languages, you can work backward from the observed correspondences to reconstruct the ancestor's likely sounds.

The Neogrammarian credo: "Sound laws operate without exception." If a sound change apparently fails in some words, the explanation is borrowing, analogy, or a previously unrecognised conditioning factor — never random. The credo overstates the regularity but as a working hypothesis it has been astoundingly productive.

PIE *ph₂tḗr "father" (older notation *pəter-) → Sanskrit pitár-, Greek patēr, Latin pater, Old Irish athair, Gothic fadar, English father, Tocharian B pācer
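
The working-backward step can be sketched in a few lines of Python. This is a minimal illustration, not the method as practised: it takes the "father" cognate set above, looks only at the first sound of each form, and tabulates the correspondence. The simplified spellings (diacritics dropped), the `COGNATES` table, and the helper names are assumptions made for the example.

```python
from collections import Counter

# Reflexes of the PIE word for "father", in simplified spelling (assumption:
# diacritics are dropped so that the first letter stands for the first sound).
COGNATES = {
    "Sanskrit": "pitar",
    "Greek": "pater",
    "Latin": "pater",
    "Old Irish": "athair",
    "Gothic": "fadar",
    "English": "father",
    "Tocharian B": "pacer",
}

VOWELS = set("aeiou")


def initial_consonant(word: str) -> str:
    """Return the word-initial consonant, or '∅' if the word starts with a vowel."""
    first = word[0]
    return "∅" if first in VOWELS else first


def initial_correspondence(cognates: dict[str, str]) -> dict[str, str]:
    """Map each language to the initial consonant of its reflex."""
    return {lang: initial_consonant(form) for lang, form in cognates.items()}


correspondence = initial_correspondence(COGNATES)
tally = Counter(correspondence.values())

for lang, sound in correspondence.items():
    print(f"{lang:12} {sound}")
print(tally.most_common())
```

The tally is only a heuristic. What makes *p the reconstruction is not the majority vote but the fact that the same pattern (p in most branches, f in Germanic, nothing in Old Irish) has to recur across many independent cognate sets before it counts as a sound law.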

Grimm

Chapter III. Grimm's Law.

Jacob Grimm — the elder Grimm brother, philologist as well as folklorist — published in 1822 the systematic account of the consonant shift that distinguishes Germanic from the rest of Indo-European. The shift moved Indo-European voiceless stops to fricatives, voiced stops to voiceless, voiced aspirates to voiced.

PIE *p → Germanic f: Latin pater ↔ English father
PIE *t → Germanic þ: Latin tres ↔ English three
PIE *k → Germanic h: Latin centum ↔ English hundred
PIE *bh → Germanic b: Sanskrit bhrātar- ↔ English brother
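
The shift can be written as a lookup table, which makes one property explicit: it applies to every stop once and simultaneously. The sketch below is an assumption-laden toy, covering only labial, dental, and plain velar stops, using plain-letter spellings such as `bh`, and ignoring the accent-conditioned exceptions; `GRIMM` and `grimm_shift` are names invented for the example.

```python
# Grimm's Law as a simultaneous mapping over PIE stops.
# Only labial, dental and plain velar stops are covered in this sketch.
GRIMM = {
    "p": "f", "t": "þ", "k": "h",      # voiceless stops -> fricatives
    "b": "p", "d": "t", "g": "k",      # voiced stops -> voiceless stops
    "bh": "b", "dh": "d", "gh": "g",   # voiced aspirates -> voiced stops
}

# Try two-letter symbols first so "bh" is not read as "b" followed by "h".
SYMBOLS = sorted(GRIMM, key=len, reverse=True)


def grimm_shift(segments: str) -> str:
    """Apply the shift exactly once to every stop in a string of PIE segments."""
    out = []
    i = 0
    while i < len(segments):
        for sym in SYMBOLS:
            if segments.startswith(sym, i):
                out.append(GRIMM[sym])
                i += len(sym)
                break
        else:
            out.append(segments[i])  # not a stop in the table: copy unchanged
            i += 1
    return "".join(out)


print(grimm_shift("p"), grimm_shift("t"), grimm_shift("k"), grimm_shift("bh"))
```

Because each input segment is looked up once and the output is never fed back in, `grimm_shift("b")` gives `p` and stops there rather than continuing on to `f`. Applying the three rules one after another would collapse the series, which is not what the Germanic data show.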

Apparent exceptions to Grimm's Law were resolved by Karl Verner's 1875 paper, which showed they were conditioned by the position of the original PIE accent. The Verner cleanup was the high-water mark of Neogrammarian confidence: the laws really did work without exception once you understood what was conditioning them.

The branches

Chapter IV. The ten daughters.

The first-order branches of Indo-European, with rough timing of attestation:

Anatolian — Hittite, Luwian, Lydian, Lycian. First attested in cuneiform texts from Hattusa, ~1600 BCE. Extinct by the early Christian era.

Indo-Iranian — Sanskrit (Rigveda ~1500 BCE), Avestan (~1000 BCE), then Old Persian, Pali, Prakrits, and the modern Indic and Iranian languages.

Hellenic — Mycenaean Greek (Linear B, ~1400 BCE), then Classical Greek, Koine, Byzantine Greek, modern Greek.

Italic — Latin (~600 BCE), the Romance languages, plus extinct Italic relatives (Oscan, Umbrian, Faliscan).

Celtic — Continental Celtic (Gaulish, Celtiberian, ~3rd century BCE), Insular Celtic (Old Irish ~600 CE), modern Celtic languages.

Germanic — Gothic (4th century CE), then North and West Germanic, modern Germanic languages.

Balto-Slavic — split into Baltic (Lithuanian and Latvian) and Slavic (Old Church Slavonic, 9th c.; modern Slavic).

Albanian, Armenian — each its own branch. Tocharian A and B, attested in 6th–8th century CE Buddhist manuscripts from the Tarim Basin.

Numbers

Chapter V. Numbers and the family.

Indo-European cognates are densest in the basic vocabulary — kinship terms, numerals, body parts, common verbs. The numerals 1–10 are recognisably related across the family even at first encounter.

PIE *kʷetwóres "four" → Sanskrit catvāraḥ, Avestan caθwārō, Greek téssares, Latin quattuor, Old Irish cethair, Gothic fidwōr, English four, Lithuanian keturi, Russian chetýre, Tocharian B śtwer. Hittite meyawes is the exception: its word for "four" is built on a different root and is not a reflex of *kʷetwóres.

The pattern is reliable enough to detect even when the modern forms have been heavily eroded. English four and Russian chetýre look unrelated until you trace each back through its sound changes; both descend from PIE *kʷetwóres, "four."

The basic-vocabulary cognates are how language families are first identified. They are also why Indo-European is uncontroversial: the cognate sets are too numerous and too systematic to be borrowing or coincidence.

Centum-Satem

Chapter VI. The first-order division.

The languages divide on the treatment of PIE palatovelar consonants. Latin has centum, "hundred"; Avestan has satəm. The split runs roughly east-west: Centum branches kept the velar; Satem branches palatalised it to a sibilant.

Centum: Italic, Celtic, Germanic, Hellenic, Tocharian, Anatolian.
Satem: Indo-Iranian, Balto-Slavic, Albanian, Armenian.

The split is geographic but does not fall neatly on east-west lines: Tocharian, attested in Central Asia, is centum, which complicates the simple split-by-territory model. The current view is that the satemisation was an areal innovation that spread across some branches without affecting others.

The 19th-century classifications used centum-satem as a primary branching of the family. Modern phylogenetic work treats it as one of many shared features, not the foundational split.

Laryngeals

Chapter VII. Saussure's prediction.

Ferdinand de Saussure, in his Mémoire sur le système primitif des voyelles dans les langues indo-européennes (1879), reasoned from internal evidence alone that PIE must have had a class of consonants — he called them "coefficients sonantiques" — that survived in no living language but conditioned the vowel patterns of all of them.

Saussure had no direct evidence. The hypothesis was a structural prediction.

In 1915 Bedřich Hrozný deciphered Hittite, and in the 1920s Jerzy Kuryłowicz noticed that Hittite had a consonant — written ḫ — that appeared exactly where Saussure's missing coefficients would be. The "laryngeal theory" became standard. PIE is now reconstructed with three (or four) laryngeals, *h₁, *h₂, *h₃, that conditioned vowel quality and survived only in Anatolian.

The Saussure prediction is the field's emblematic case of structural reasoning getting the right answer in advance of the evidence. Linguistics is rarely so vindicated.

The reconstructed

Chapter VIII. What PIE looked like.

The reconstructed Proto-Indo-European of modern handbooks (Ringe, Beekes, Mallory, Adams) has roughly:

Phonology. Fifteen stops (three series, voiceless, voiced, and voiced aspirate, at five places of articulation, though the "glottalic" theory disputes this), the fricative *s, four sonorants, three laryngeals, three vowels (e, o, a) plus syllabic resonants.
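
The count of fifteen is a grid: three series crossed with five places of articulation. A short sketch lays it out, using the conventional handbook symbols; the place names and the `STOPS` table layout are assumptions of the example, and the glottalic theory would relabel the series without changing the shape of the grid.

```python
# The traditional PIE stop system: 3 series x 5 places of articulation.
SERIES = ("voiceless", "voiced", "voiced aspirate")
STOPS = {
    "labial":      ("p", "b", "bʰ"),
    "dental":      ("t", "d", "dʰ"),
    "palatovelar": ("ḱ", "ǵ", "ǵʰ"),
    "plain velar": ("k", "g", "gʰ"),
    "labiovelar":  ("kʷ", "gʷ", "gʷʰ"),
}

# Flatten the grid into (place, series, reconstructed symbol) triples.
inventory = [
    (place, series, "*" + symbol)
    for place, row in STOPS.items()
    for series, symbol in zip(SERIES, row)
]

print(len(inventory), "stops")
for place, row in STOPS.items():
    print(f"{place:12}", "  ".join("*" + s for s in row))
```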

Morphology. Heavy inflection. Nouns in three genders, three numbers (singular, dual, plural), eight cases (nominative, accusative, genitive, dative, instrumental, locative, ablative, vocative). Verbs with elaborate aspect-tense-mood-voice paradigms.

Syntax. SOV (subject-object-verb) word order, with relatively free order owing to the heavy case marking. Substantial use of particles for discourse functions.

Lexicon. Reconstructed roots in the thousands. Calvert Watkins's The American Heritage Dictionary of Indo-European Roots (1985, revised 2000) is the standard reference.

[Figure: Reconstruction of late-Yamnaya life on the Pontic-Caspian steppe.]

Homeland

Chapter IX. The two hypotheses.

Two main proposals for where Proto-Indo-European was spoken, each with serious advocates.

The Steppe (Kurgan) Hypothesis. Marija Gimbutas, the Lithuanian-American archaeologist, proposed in the 1950s that PIE was spoken on the Pontic-Caspian steppe — present-day Ukraine and southern Russia — by a horse-riding pastoralist culture, the Yamnaya. Their migration westward and southward, beginning ~3500 BCE, carried the language across Europe and into the Iranian plateau and South Asia. Date of dispersal: ~3500–2500 BCE.

The Anatolian (Renfrew) Hypothesis. Colin Renfrew, in Archaeology and Language (1987), proposed that PIE was spoken in Anatolia — present-day Turkey — and spread with the agricultural revolution beginning ~7000 BCE. Date of dispersal: ~6500–5000 BCE.

The hypotheses make different predictions about reconstructed PIE vocabulary (does it include words for plough, wagon, horse, secondary products of dairying?) and about the population genetics of Europe.

Ancient DNA

Chapter X. The genetic verdict.

The 2015 papers — Haak et al. in Nature, Allentoft et al. also in Nature — sequenced ancient DNA from across Bronze Age Europe and detected a massive influx of steppe ancestry into Central and Northern Europe between 3000 and 2500 BCE. The Yamnaya signal was unmistakable. By the late third millennium, populations in Britain, Germany, Scandinavia carried 40–80% steppe ancestry that had not been there a millennium earlier.

The implications for the homeland question are strong. The migration the Kurgan hypothesis predicted is genetically detected at the predicted time, in the predicted direction, on a scale large enough to plausibly carry a language.

David Reich's lab at Harvard, the Pääbo lab at Leipzig, and Eske Willerslev's group in Copenhagen continue to add resolution. The paper extending the analysis to South Asia (Narasimhan et al., Science, 2019; circulated as a preprint in 2018) showed similar steppe-derived ancestry arriving in the Indian subcontinent in the second millennium BCE, consistent with the Indo-Aryan migration that has long been linguistically inferred.

The Anatolian hypothesis is not refuted but is harder to maintain. Most contemporary scholars of the question (David Anthony, Asya Pereltsvaig, Martin Lewis) take the Steppe hypothesis as the working consensus.

Yamnaya

Chapter XI. The horse people.

The Yamnaya culture (~3300–2600 BCE) is the leading candidate for the population that spoke late Proto-Indo-European. Their archaeology shows a pastoralist economy on the Pontic-Caspian steppe: cattle and sheep, domesticated horses, ox-drawn wagons (the wheel-and-wagon technology was probably theirs to adopt and spread, though not to invent), and distinctive single-burial kurgans (mound graves) with grave goods reflecting an emerging warrior elite.

Yamnaya genetic ancestry is roughly half from earlier Eastern Hunter-Gatherers and half from Caucasus Hunter-Gatherer/Iran-Neolithic populations. The genetic profile is detectable across Europe by 2500 BCE and across northern India by 1500 BCE.

The horse-and-wagon technological complex gave the Yamnaya a logistical advantage over neighbouring agriculturalists. Their cultural successors — Corded Ware in the north, Catacomb in the south, Bell Beaker further west — are the visible archaeology of the dispersal that carried Indo-European languages west and south.

Anatolian

Chapter XII. Hittite and the early branch.

The Anatolian languages — Hittite, Luwian, Palaic, Lydian, Lycian — are the earliest-attested Indo-European languages and the most divergent. The Hittite cuneiform texts from Hattusa (~1650–1180 BCE) — diplomatic letters, religious rituals, royal annals — were deciphered by Bedřich Hrozný in 1915.

Hittite was the surprise of 20th-century Indo-European studies. It preserved the laryngeals Saussure had predicted, vindicating the structural reconstruction. It used a substantially simpler nominal system than later Indo-European languages — only two genders (animate/inanimate), which suggested either an archaic retention or a heavy areal influence from non-Indo-European Anatolian languages.

The current debate: did Anatolian split off very early (the "Indo-Hittite" hypothesis of Edgar Sturtevant in the 1920s), implying that late PIE proper was spoken after the Anatolian departure, or is Hittite simply a divergent branch? The Bayesian phylogenies (Bouckaert et al. 2012; Chang et al. 2015) tend to support the early-split view.

Sanskrit

Chapter XIII. The Indic line.

The Rigveda — about 1,028 hymns in Vedic Sanskrit, composed orally between roughly 1500 and 1100 BCE and transmitted with extraordinary fidelity through a millennia-long oral tradition — is the earliest substantial Indo-Aryan text. Its language preserves features lost in every other Indo-European tradition.

The first systematic grammar of Sanskrit, by Pāṇini (~5th–4th century BCE), is the Aṣṭādhyāyī — the most sophisticated grammatical analysis in the ancient world. About 4,000 sutras (compact rules) describe Sanskrit morphology and syntax with such precision that the text remains the operational reference for classical Sanskrit.

Pāṇini's analytic rigour — the recursive rule structure, the meta-rules about rule application, the formal vocabulary — anticipates by 2,500 years the formalisms of generative grammar. Leonard Bloomfield called the Aṣṭādhyāyī "one of the greatest monuments of human intelligence." It is.

Greek

Chapter XIV. The Hellenic branch.

Greek's documented history is the longest of any European language. Mycenaean Greek in Linear B (~1400 BCE), Homeric Greek (the Iliad and Odyssey in roughly the 8th century BCE, in a literary dialect drawn from older oral tradition), Classical Attic-Ionic, Koine (the Hellenistic standard, language of the New Testament), Byzantine Greek, Modern Greek.

The continuity is real. A modern Greek speaker can read Koine without much instruction. Even Homer is more accessible to a modern Greek than Beowulf is to a modern English speaker.

Greek phonology has been relatively conservative: most of the consonant inventory of Mycenaean is preserved in modern Greek, though the vowel system has shifted (the Classical long-short distinction collapsed in the early Christian era; iotacism merged several vowels into [i]). The morphology has been simplified (the ancient dative case is gone, the optative mood is gone), but it remains recognisably the same language after thirty-four centuries.

Latin

Chapter XV. The Italic radiation.

Latin in the Roman Republic was one of several Italic languages alongside Oscan, Umbrian, Faliscan, and Venetic. By the late Republic, with Roman political expansion, Latin had absorbed or eliminated its sister-languages on the peninsula. By the 1st century CE, Latin was the lingua franca of the western Mediterranean.

The Romance daughters split off in late antiquity: Italo-Romance (Italian, Sardinian, Romansh), Gallo-Romance (French, Occitan, Catalan), Ibero-Romance (Spanish, Portuguese, Galician), Daco-Romance (Romanian), with internal further branching. The breakup happened roughly 5th–9th century CE; the first identifiably non-Latin Romance texts (the Strasbourg Oaths, 842) date to the mid-9th century.

The Romance languages today are spoken by about 900 million people. Add the ecclesiastical Latin of the Catholic Church and the Latin-derived scientific terminology of Western academia, and Latin's reach is one of the largest cultural footprints any extinct language has ever had.

Tocharian

Chapter XVI. The eastern outlier.

Tocharian A and B were spoken in the Tarim Basin (modern Xinjiang) and attested in Buddhist manuscripts of roughly the 6th–8th centuries CE. The discovery, by German and French expeditions in the 1890s and 1900s, was a surprise: an Indo-European language at the eastern edge of Inner Asia, already extinct for over a millennium when found.

Tocharian is centum — eastern, but with western consonant patterns. It also has features that look very archaic and others that look very innovative. It does not fit cleanly into any of the dialect-area schemes of late PIE. Its presence at all in Xinjiang is evidence of an early easterly migration that the Steppe hypothesis must accommodate.

The Tarim mummies — well-preserved Bronze Age burials in the Taklamakan, often blonde or red-haired — were once interpreted as the Tocharian-speaking population. The 2021 ancient-DNA paper (Zhang et al., Nature) substantially complicated this: the Tarim people were genetically isolated, descending from Ancient North Eurasians without significant Yamnaya input. The Tocharian-speakers and the Tarim mummies are probably not the same population. The puzzle continues.

Celtic

Chapter XVII. The Atlantic fringe.

The Celtic languages dominated much of late prehistoric Europe — Iberia, Gaul, the British Isles, parts of Central Europe and Anatolia (the Galatians) — before Roman and Germanic expansions reduced them to the Atlantic fringe.

Two main subgroups. Continental Celtic (Gaulish, Celtiberian, Lepontic, Galatian) — extinct by the early Christian era. Insular Celtic — Goidelic (Old Irish, modern Irish, Scottish Gaelic, Manx) and Brythonic (Welsh, Cornish, Breton).

Celtic languages are conservative in some respects (Old Irish preserves a substantial PIE inflectional system) and innovative in others (initial consonant mutations are unique among Indo-European subgroups; the verb-subject-object word order of Welsh and Irish is rare). The Insular branch has been a goldmine for Indo-European reconstruction even as the languages themselves have struggled: Cornish died as a community language in the 18th century and has been revived; Manx lost its last native speaker in 1974 and has been revived; Irish remains under pressure despite official status.

Slavic

Chapter XVIII. The Slavic expansion.

Slavic is the late expander. Proto-Slavic is reconstructed for roughly 500 CE; Old Church Slavonic, the literary medium of Saints Cyril and Methodius's mission to the Moravians, dates to the 860s. The split into the modern East (Russian, Ukrainian, Belarusian), West (Polish, Czech, Slovak), and South (Serbo-Croatian, Bulgarian, Slovene, Macedonian) groups is later still.

The historical mystery is how Slavic spread so far so fast. Between roughly 500 and 900 CE, Slavic-speaking populations expanded from a relatively small core in eastern Europe to dominate the Balkans, the Russian plains, and as far west as the Elbe. The expansion has no clear archaeological signal at the scale required by the linguistic spread.

Florin Curta's The Making of the Slavs (2001) argues against a "Slavic migration" in the older sense — proposing that Slavic identity coalesced as a Byzantine frontier construction rather than as the trail of a moving people. The thesis is contested; the puzzle is real.

Lithuanian

Chapter XIX. The most archaic.

Among living Indo-European languages, Lithuanian is the most morphologically conservative. It preserves seven of the eight cases of late PIE, the dual number (in some constructions), the pitch-accent system, and a substantial body of inherited vocabulary that other branches have replaced with innovations.

The standard pedagogical comparison: Sanskrit devas, "god"; Lithuanian diẽvas, "god"; PIE *deywós. No other modern European language offers forms that sit this close to an ancient one.

The conservatism is a function partly of geographic isolation (the Baltic region was at the periphery of the major centres of cultural change) and partly of the late date at which Lithuanian became a written language (16th century). For Indo-Europeanists, Lithuanian is a window: features that have been lost everywhere else are still in active use.

Phylogenetics

Chapter XX. Computational trees.

The last twenty years have seen Bayesian phylogenetic methods, originally developed for biology, applied to Indo-European. Russell Gray and Quentin Atkinson's 2003 paper in Nature, followed by Remco Bouckaert and colleagues' 2012 paper in Science, argued for an Anatolian homeland and an early split (~7000 BCE), reigniting the homeland controversy.

The 2015 paper by Will Chang, Chundra Cathcart, David Hall, and Andrew Garrett — using a constrained ancestry model — produced a date estimate of ~6500 years before present for the family, consistent with the Steppe hypothesis. Their work has been widely cited as the resolution of the chronology dispute in favour of the Steppe.

The methodological lesson: Bayesian phylogenetics gives different answers depending on what assumptions you build in about how cognate sets are coded, which words count as basic vocabulary, and what priors you place on tree shape. The answers are interpretable but not assumption-free.
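
The dependence on assumptions can be seen in a toy calculation. The sketch below is not a phylogenetic analysis: real studies infer whole trees from coded cognate sets, whereas this reduces the problem to a single unknown (the root age) on a coarse grid. Every number in it, including the likelihood centred on 7500 BP and the two priors, is hypothetical and chosen only to show the mechanism; none comes from the papers cited above.

```python
import math

# Candidate root ages in years before present, on a coarse grid.
AGES = range(5000, 10001, 100)


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def likelihood(age):
    # Hypothetical: the data weakly favour a root age near 7500 BP.
    return normal_pdf(age, 7500, 1200)


def prior_young(age):
    # Hypothetical prior reflecting one set of modelling choices.
    return normal_pdf(age, 6500, 800)


def prior_old(age):
    # Hypothetical prior reflecting a different set of modelling choices.
    return normal_pdf(age, 8500, 800)


def posterior_mean(prior):
    """Posterior mean root age on the grid: weight = prior x likelihood."""
    weights = [prior(a) * likelihood(a) for a in AGES]
    total = sum(weights)
    return sum(a * w for a, w in zip(AGES, weights)) / total


young = posterior_mean(prior_young)
old = posterior_mean(prior_old)
print(f"prior centred on 6500 BP -> posterior mean {young:.0f} BP")
print(f"prior centred on 8500 BP -> posterior mean {old:.0f} BP")
```

The same likelihood yields two different posterior means, one pulled toward each prior. When the data are only weakly informative, as in this toy, the choice of prior does much of the work.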

Religion

Chapter XXI. Reconstructed myth.

Comparative Indo-European mythology — Émile Benveniste, Georges Dumézil, Calvert Watkins, Bruce Lincoln — has reconstructed elements of the religious-mythological system of Proto-Indo-European speakers. The reconstructions are more speculative than the linguistic ones but are based on the same comparative method.

Reconstructed mythological figures: *Dyḗws ph₂tḗr, "Sky Father" (Greek Zeus patēr, Latin Iūpiter, Sanskrit Dyaus pitár); *H₂éwsōs, dawn goddess (Greek Eos, Latin Aurora, Sanskrit Uṣas); the divine twins (Greek Dioskouroi, Sanskrit Aśvins).

Reconstructed narrative themes: the dragon-slaying hero (Indic Indra slaying Vṛtra; Greek Apollo or Heracles slaying serpents; Norse Thor and the Midgard serpent; Hittite Tarhunna and the dragon Illuyanka). The cosmic three-fold social organisation (priests / warriors / producers, Dumézil's idéologie tripartite) has been argued to recur across the family.

How much of this is genuinely reconstructable — vs. parallel cultural development — is contested. The language reconstruction is on much firmer ground than the mythological one.

Genetics & ideology

Chapter XXII. The political shadow.

The Indo-European question has been politically loaded since the 19th century. The discovery of an Indo-Iranian-European linguistic family was repurposed by 19th-century race theorists into the "Aryan race" — a fiction the Nazis built into murderous policy.

The damage was lasting. Indo-European studies as a field withdrew from cultural reconstruction in the postwar decades; archaeologists became wary of "migration" explanations of any kind; the discipline of comparative mythology was tarred by its association with ideologically committed practitioners.

The 21st-century ancient-DNA work has reopened the migration question on a strictly empirical basis, with no implicit ranking of the populations identified. The migrations are real; they tell us about prehistory; they have no contemporary political meaning. Bones don't believe in race. Modern Indo-Europeanists are explicit on this.

Out-of-India

Chapter XXIII. The contested alternative.

A minority of scholars, primarily of Indian nationalist orientation (Koenraad Elst, Shrikant Talageri, Subhash Kak), have argued that the Indo-European homeland was India, and that languages spread out of South Asia rather than into it. The view is not accepted in the mainstream comparative linguistics literature, for several substantive reasons.

The Vedic Sanskrit horse-and-chariot vocabulary, which is shared with other branches of the family, describes technology that arrived in South Asia from the steppe rather than spreading outward from South Asia. The reconstructed PIE flora and fauna (beech, oak, salmon, beaver, bear, wolf) point to a temperate homeland rather than a South Asian one. The Narasimhan ancient-DNA paper (Science, 2019) found steppe-derived ancestry arriving in northern India around 1500 BCE, exactly when Indo-Aryan languages would have arrived on the linguistic evidence.

The Out-of-India view persists for reasons that are mostly extra-academic. The scientific case against it is overwhelming.

Loanwords

Chapter XXIV. What was borrowed.

Late Proto-Indo-European reconstructions show signs of contact with neighbouring non-Indo-European languages. Several reconstructed roots in agriculture and metallurgy may be borrowings from languages now lost — perhaps from the pre-Indo-European populations of the Pontic-Caspian region.

The Anatolian branch shows substantial substrate influence from Hattic and Hurrian, the non-Indo-European languages of pre-Hittite Anatolia. The Greek substrate has long been argued to include a "Pelasgian" or "pre-Greek" layer — words for figs, olives, certain place-names ending in -nthos, -ssos — from the populations Greek-speakers encountered.

The Indo-Iranian branch shows extensive contact with Dravidian (the major non-Indo-European family of South Asia) and with Burushaski and other isolates. The retroflex consonants of modern Indic languages are very plausibly a Dravidian areal influence. Sanskrit has hundreds of Dravidian loanwords.

The lesson: Indo-European languages did not arrive in linguistically empty regions. They mixed.

Reading list

Chapter XXV. Twenty essentials.

Watch & Read

Chapter XXVI. Watch & read.

↑ Let's Talk About PIE — Reconstructing Old Languages

Watch · How Do We Know the Indo-European Languages Are Related?
Watch · The History of the European languages 4000 BC–2021 AD

Modern reach

Chapter XXVII. The empire of speakers.

Indo-European languages count roughly 3.2 billion native speakers, about 40% of humanity, and serve as the lingua franca for many more. The top ten by native speakers:

Hindi-Urdu (~600 million native), Spanish (~480 million), English (~380 million native, ~1.5 billion total speakers), Bengali (~265 million), Portuguese (~260 million), Russian (~150 million), Punjabi (~125 million), German (~95 million native), Marathi (~83 million), French (~80 million native, ~300 million total).

The reach is the consequence of three sequential expansions: the original Bronze Age dispersal, the Roman imperial extension of Latin, and the early-modern European colonial expansion that carried English, Spanish, Portuguese, French, and Dutch across the Americas, Africa, and Asia. Each expansion was a contingent historical event; cumulatively, they produced the linguistic landscape of the modern world.

[Figure: The modern reach of the Indo-European family.]

The future

Chapter XXVIII. Where the field goes.

Three frontiers in contemporary Indo-European studies.

Ancient DNA at higher resolution. The 2015 papers were a beginning; the field now has thousands of sequenced ancient genomes from across Eurasia, with thousands more coming. Each year sharpens the picture of which populations moved where, when. The Tocharian puzzle is open; the timing of the Greek arrival in the southern Balkans is being refined; the role of the Caucasus as a population reservoir is now visible.

Anatolian and Hittite philology. The cuneiform archives of Hattusa are still being edited and translated. Each decade adds Hittite vocabulary, syntax, and culture to what we know. The Hittite-Indo-European interface remains the freshest theoretical frontier of the family.

Phylogenetic methods. Bayesian and other quantitative approaches, with better data and better priors, are converging on shared dates. The next frontier is incorporating structural-feature data alongside cognate sets, modelling lateral transfer, and integrating with the genomic timeline to produce a genuinely joint linguistic-genetic-archaeological model of the dispersal.

[Figure: The branching structure of Indo-European, with dated nodes.]

Summary

Chapter XXIX. What we can say.

What two and a half centuries of Indo-European research has established with reasonable confidence:

1. The Indo-European languages share a single common ancestor, Proto-Indo-European, spoken approximately 6,500 years ago.

2. The reconstructed proto-language can be specified at the level of phonology, morphology, and substantial portions of the lexicon.

3. The proto-language was almost certainly spoken on the Pontic-Caspian steppe by populations associated with the Yamnaya cultural complex.

4. The dispersal carried the language family west into Europe (~3000–2500 BCE) and east-south into Iran and India (~2000–1500 BCE), in migrations now confirmed by ancient DNA.

5. The Anatolian branch split off earliest; the family proper dates from after that split.

The remaining frontiers are timing details, the relationships among the early branches, and the integration of linguistic, archaeological, and genetic evidence into a single chronological model. These are the kinds of questions on which progress is steady.

Colophon

The end of the deck.

Indo-European — Volume XVI, Deck 3 of The Deck Catalog. Set in Source Serif Pro with monospace technical apparatus. Linen #eef0ee; steppe-green and kurgan-red accents.

Thirty-two leaves on the family three of every seven humans speak. From Calcutta 1786 to ancient DNA 2015.

FINIS
