A field guide · folio I · 13 leaves

Language
/ 7,000 voices

An atlas of the human capacity to mean — its sounds, its scripts, its families, its futures.

I / XIII

I. The species

A creature that talks

Somewhere between 50,000 and 150,000 years ago, anatomically modern humans began to speak in a way no other animal does: recursively, about things absent in place and time, and without limit.

The lowered larynx, fine motor control of the tongue, and a rewired brain are part of the equipment. So is FOXP2, a gene whose human variant differs from the chimpanzee version by just two amino acids, and whose mutation catastrophically disrupts speech and grammar.

Language is not one organ. It is a conspiracy of organs recruited for a new task.

FOXP2 · forkhead box P2

Human:  ...N...S...
Chimp:  ...T...N...

Two substitutions. The KE family, a London pedigree with a disrupted FOXP2 gene, gave linguistics its first clear genetic anchor for the faculty of language (Lai et al., 2001).

Estimates
~7,000 living languages today
II / XIII

II. The big families

A few trunks, a thousand branches

Most of the world's speakers belong to a small handful of language families. Most of the world's languages belong to families you have probably never heard of.

Family          Examples                                            Speakers
Indo-European   English, Spanish, Hindi, Russian, Persian, Greek    ~3.2 B
Sino-Tibetan    Mandarin, Cantonese, Tibetan, Burmese               ~1.4 B
Niger-Congo     Swahili, Yoruba, Zulu, Igbo (~1,500 langs)          ~700 M
Afro-Asiatic    Arabic, Hebrew, Amharic, Hausa                      ~600 M
Austronesian    Malay, Tagalog, Javanese, Maori, Hawaiian           ~400 M
Dravidian       Tamil, Telugu, Kannada, Malayalam                   ~250 M

Also: Turkic · Japonic · Koreanic · Uralic · ... and isolates: Basque, Korean*, Ainu, Burushaski · + ~140 more
* Korean is often classed as an isolate; the Koreanic label groups it with Jeju.
III / XIII

III. Indo-European

From a single tongue, half the world

Between roughly 4500 and 2500 BC on the Pontic-Caspian steppe, a small population spoke what philologists reconstruct as Proto-Indo-European (PIE). Their words for "wheel," "horse," "yoke," and "to drive" survive everywhere their descendants rode.

PIE *ph₂tḗr
Latin pater · Greek πατήρ
Sanskrit पितृ pitṛ́ · German Vater
English father · Persian پدر pedar

~3 B people speak a language descended from PIE today
Figure: Indo-European family tree (time axis: 5000 BC · 2000 BC · 500 BC · 1000 AD · today). Proto-Indo-European branches into Celtic (Irish, Welsh), Italic / Romance (French, Spanish, Italian), Germanic (English, German, Dutch), Balto-Slavic (Russian, Polish), Hellenic (Greek), and Indo-Iranian (Hindi, Persian, Bengali).
~10 living branches · ~450 living languages · > 3 billion speakers
Extinct branches not shown: Anatolian (Hittite), Tocharian
IV / XIII

IV. Sound

The phoneme & its variations

A phoneme is the smallest unit of sound that distinguishes one word from another. English has about 44. Rotokas (Papua New Guinea) has 11. !Xóõ (southern Africa) has, by some counts, more than 160.
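
The test for "distinguishes one word from another" is the minimal pair: two words that differ in exactly one sound. A minimal sketch, assuming a hypothetical toy lexicon of broad English transcriptions (the word list and the function name `minimal_pairs` are illustrative, not drawn from any reference inventory):

```python
from itertools import combinations

# Hypothetical toy lexicon: word -> broad transcription, one symbol per phoneme.
LEXICON = {
    "pat": ("p", "æ", "t"),
    "bat": ("b", "æ", "t"),
    "pit": ("p", "ɪ", "t"),
    "thin": ("θ", "ɪ", "n"),
    "sin": ("s", "ɪ", "n"),
    "sing": ("s", "ɪ", "ŋ"),
}


def minimal_pairs(lexicon):
    """Return (word_a, word_b, sound_a, sound_b) for words that differ in exactly one segment."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(lexicon.items(), 2):
        if len(p1) != len(p2):
            continue  # this sketch only compares words of equal length
        diffs = [(a, b) for a, b in zip(p1, p2) if a != b]
        if len(diffs) == 1:
            pairs.append((w1, w2, *diffs[0]))
    return pairs


for w1, w2, a, b in minimal_pairs(LEXICON):
    print(f"{w1} / {w2}: /{a}/ contrasts with /{b}/")
```

Each pair it reports is evidence that the two sounds are separate phonemes in this toy lexicon; a real analysis would need far more data and careful transcription.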

The International Phonetic Alphabet (IPA) gives every speech sound a unique symbol — read in monospace, like coordinates.

[p] plosive · [θ] think · [ʃ] shoe · [ŋ] sing · [ɹ] red · [ʔ] uh-oh · [ǂ] palatal click · [ɮ] lateral fric.

Clicks · Khoisan

The !Xóõ language uses clicks as full consonants:

ǀ   ǃ   ǂ   ǁ

dental · alveolar · palatal · lateral

Tone · Mandarin

The syllable ma means four different things depending on pitch contour:

mā 媽 — mother
má 麻 — hemp
mǎ 馬 — horse
mà 罵 — to scold
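
In pinyin, the pitch contour is written as a diacritic on the vowel, so the tone can be read off mechanically. A minimal sketch, assuming the conventional tone numbering 1 to 4 and using Unicode decomposition to separate the mark from the letter (the function name `tone_of` is illustrative):

```python
import unicodedata

# Combining diacritics used by pinyin, mapped to the conventional tone numbers.
TONE_MARKS = {
    "\u0304": 1,  # macron: high level (mā)
    "\u0301": 2,  # acute: rising (má)
    "\u030c": 3,  # caron: dipping (mǎ)
    "\u0300": 4,  # grave: falling (mà)
}


def tone_of(syllable):
    """Return (bare_syllable, tone_number); tone 0 means no tone mark was found."""
    tone = 0
    base = []
    # NFD splits an accented vowel into the plain letter plus a combining mark.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            base.append(ch)
    return unicodedata.normalize("NFC", "".join(base)), tone


for word in ["mā", "má", "mǎ", "mà"]:
    print(word, tone_of(word))
```

All four inputs reduce to the same bare syllable; only the tone number differs, which is the whole point of the slide.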

V / XIII

V. Writing

From picture to letter

Writing is roughly 5,400 years old — a baby compared with speech. It evolved, more than once, through a sequence of compressions.

Stage 1 · Pictographs

𓃭   𓅓   ☼

A picture is the thing. Sumerian proto-cuneiform, early Egyptian hieroglyphs, Chinese oracle-bone script.

Stage 2 · Ideographs

山   木   安

A sign for an idea, not a word. Mostly a myth in pure form — Han characters mix logograms, phonetic hints, and radicals.

Stage 3 · Syllabaries

か   き   く

One sign = one syllable. Japanese kana, Cherokee, Linear B. Compact for some phonologies, awkward for others.

Stage 4 · Abjads

ا ب ت

Consonants only, with vowels inferred. Phoenician, Hebrew, Arabic. Compact, but reading aloud requires already knowing the language.

Stage 5 · Abugidas

क का कि

Consonant base + vowel diacritic. The Brahmic family — Devanagari, Tamil, Thai, Tibetan.
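
Unicode encodes Devanagari the same way: a consonant letter followed by a dependent vowel sign. A minimal sketch that rebuilds the three samples above from their code points (the helper `syllable` and the ASCII vowel keys are illustrative names):

```python
import unicodedata

KA = "\u0915"  # क DEVANAGARI LETTER KA; bare, it carries the inherent vowel "a"

# Dependent vowel signs that replace the inherent vowel.
VOWEL_SIGNS = {
    "aa": "\u093e",  # ा DEVANAGARI VOWEL SIGN AA
    "i": "\u093f",   # ि DEVANAGARI VOWEL SIGN I
}


def syllable(consonant, vowel=None):
    """Consonant base plus an optional vowel sign, in stored (logical) order."""
    if vowel is None:
        return consonant
    return consonant + VOWEL_SIGNS[vowel]


for v in (None, "aa", "i"):
    s = syllable(KA, v)
    print(s, [unicodedata.name(ch) for ch in s])
```

Note that the sign for "i" is drawn to the left of the consonant but stored after it: the encoding follows the sound, and the font handles the shape.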

Stage 6 · Alphabets

A B Γ Я

Independent letters for vowels and consonants. A Greek innovation, c. 800 BC. Cheap to learn, easy to print.

VI / XIII

VI. Script samples

The shapes of writing

Α α · Greek · ~800 BC · 24 letters
Я Д · Cyrillic · ~9th c. AD · 33 letters
ا ب ت · Arabic · ~4th c. AD · abjad · RTL
अ आ क · Devanagari · ~7th c. AD · abugida
字 文 言 · Han / 漢字 · ~1200 BC · ~50,000 chars
한 글 · Hangul · 1443 · featural alphabet

Hangul, invented under King Sejong, is unique: its letter shapes literally diagram the position of the tongue and lips. A linguistic instrument disguised as an alphabet.
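
The same regularity shows up in how Hangul is encoded. Each syllable block is assembled from two or three letters, and Unicode lays the blocks out arithmetically, so a block can be taken apart with integer division. A minimal sketch of that decomposition, assuming the standard Unicode layout (19 initial consonants, 21 vowels, 28 final slots counting "none"); the function names are illustrative:

```python
import unicodedata

S_BASE = 0xAC00  # first precomposed Hangul syllable, 가
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # initials, vowels, finals (0 = no final)


def jamo_indices(block):
    """Return (initial, vowel, final) indices for one precomposed Hangul syllable."""
    index = ord(block) - S_BASE
    if not 0 <= index < L_COUNT * V_COUNT * T_COUNT:
        raise ValueError(f"{block!r} is not a precomposed Hangul syllable")
    lead, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, tail = divmod(rest, T_COUNT)
    return lead, vowel, tail


def jamo(block):
    """Return the component letters, using Unicode canonical decomposition."""
    return list(unicodedata.normalize("NFD", block))


for block in "한글":
    print(block, jamo_indices(block), [unicodedata.name(ch) for ch in jamo(block)])
```

This only shows how blocks are composed from letters; the articulatory logic of the letter shapes themselves lives in the glyphs, not the code points.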

VII / XIII

VII. Grammar

Who eats whom, in what order

If a sentence has a subject, an object, and a verb, there are six possible orderings — but only two of them dominate the world's languages.
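
The count of six is just 3! = 6 permutations of three slots. A minimal sketch that enumerates them, using "the cat eats the mouse" as the running example (the `ROLES` mapping and the function name `orderings` are illustrative):

```python
from itertools import permutations

# S = subject, V = verb, O = object.
ROLES = {"S": "the cat", "V": "eats", "O": "the mouse"}


def orderings():
    """Map each word-order label (e.g. 'SOV') to an English gloss in that order."""
    result = {}
    for order in permutations("SOV"):
        label = "".join(order)
        gloss = " ".join(ROLES[role] for role in order)
        result[label] = gloss.capitalize() + "."
    return result


for label, gloss in orderings().items():
    print(label, gloss)
```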

Order   Example                     Languages                                  Share
SOV     The cat the mouse eats.     Japanese, Turkish, Hindi, Korean, Latin    ~45%
SVO     The cat eats the mouse.     English, Mandarin, Spanish, Russian        ~42%
VSO     Eats the cat the mouse.     Irish, Welsh, Classical Arabic, Hawaiian   ~9%
VOS     Eats the mouse the cat.     Malagasy, Fijian                           ~3%
OVS     The mouse eats the cat.     Hixkaryana (Carib)                         <1%
OSV     The mouse the cat eats.     Warao (rare); Yoda                         <1%

Free word order

Some languages — Latin, Russian, Finnish — let you reorder words almost arbitrarily, because case endings on the nouns mark who's doing what to whom.

fēlēs mūrem edit.
mūrem fēlēs edit.
edit fēlēs mūrem.

All three: the cat eats the mouse.
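
Why the order does not matter can be shown mechanically: if each word form carries its own role, a parser never needs to look at position. A minimal sketch, assuming a hypothetical three-entry lexicon that ignores real ambiguities (fēlēs, for instance, can also be a plural form):

```python
import unicodedata


def _nfc(text):
    """Normalize so that precomposed and combining macrons compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical toy lexicon: surface form -> (gloss, role marked by the ending).
FORMS = {
    _nfc("fēlēs"): ("cat", "subject"),   # nominative singular
    _nfc("mūrem"): ("mouse", "object"),  # accusative singular
    _nfc("edit"): ("eats", "verb"),      # third person singular present
}


def parse(sentence):
    """Assign roles from word forms alone, ignoring word order."""
    roles = {}
    for word in _nfc(sentence).rstrip(".").split():
        gloss, role = FORMS[word]
        roles[role] = gloss
    return roles


sentences = ["fēlēs mūrem edit.", "mūrem fēlēs edit.", "edit fēlēs mūrem."]
for s in sentences:
    print(s, parse(s))
```

All three sentences yield the same role assignment. Run the same trick on English and it fails at once, because "cat" and "mouse" look identical as subject and object.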

"The limits of my language mean the limits of my world."

— Wittgenstein, Tractatus 5.6

VIII / XIII

VIII. Universals & parameters

What every language shares

Joseph Greenberg (1963) drew up 45 statistical universals from a sample of 30 languages — observations like:

  • If a language has prepositions, it tends to have noun-before-genitive order.
  • Almost every language distinguishes nouns from verbs.
  • Vowel inventories cluster: most have /i a u/.

Noam Chomsky went further, proposing that beneath surface variation all languages share a universal grammar: a finite set of parameters (head-initial vs head-final, pro-drop on/off) that the child sets by listening.

Linguistic typology is the field that maps the joint distribution of these settings across the world.

Figure: IPA vowel chart, front ← → back, close / mid / open. Vowels shown: i u y ɯ ɪ ʊ e o ə ɛ ɔ æ a ɑ. Burgundy = the near-universal /i a u/.
IX / XIII

IX. Change

How English became English

Grimm's Law · ~500 BC

A regular sound shift that moved Proto-Indo-European stops in Germanic:

*p → f    *t → þ    *k → h
*b → p    *d → t    *g → k

PIE *ph₂tḗr → Old English fæder. PIE *tréyes → English three. PIE *kerd- → English heart.
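
"Regular" here means the shift can be written as a lookup table and applied to every word alike. A minimal sketch covering only the six correspondences listed above, using simplified root spellings without laryngeals or accents (the function name `apply_grimm` and the ASCII spellings are illustrative assumptions):

```python
# The six shifts listed above; the voiced aspirate series is left out.
GRIMM = {"p": "f", "t": "þ", "k": "h", "b": "p", "d": "t", "g": "k"}


def apply_grimm(word):
    """Apply all shifts at once, one segment at a time.

    A single pass matters: chaining replacements would turn b into p
    and then that new p into f, which is not what happened.
    """
    return "".join(GRIMM.get(segment, segment) for segment in word)


for root in ["pater", "treyes", "kerd"]:
    print(f"*{root} -> {apply_grimm(root)}")
```

The outputs are only rough pre-forms. Later changes, such as the voicing that gives Old English fæder its d, are outside this sketch.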

The Great Vowel Shift · 1400–1700

Across three centuries, English long vowels rotated upward and split. Spelling, frozen by the printing press, did not move with them.

bite  [biːtə] → [baɪt]
meet  [meːt]  → [miːt]
house [huːs]  → [haʊs]
name  [naːmə] → [neɪm]

This is why English spelling looks like a war crime against phonetics.

Language change is inevitable, regular, and faster than you think. The English of 1400 is, to a modern reader, nearly a foreign tongue. The English of 800 — Beowulf — is one.

X / XIII

X. Endangered

One dies every two weeks

~40% of all languages are endangered
~1 every 14 days: rate of language extinction (UNESCO est.)
~96% of speakers use just 4% of languages

Most of the world's linguistic diversity sits on a long, thin tail: thousands of languages, each with a few hundred or a few thousand speakers, mostly transmitted orally, mostly without economic incentive for younger speakers to keep them.

When a language disappears, what disappears with it is hard to enumerate: a botanical taxonomy, a system of kinship terms, a way of pointing at time, a body of stories, a grammar of feeling.

Recently extinct (death of last fluent speaker):
Eyak (Alaska, 2008) · Livonian (Latvia, 2013) · Klallam (Washington, 2014) · ...
Critically endangered: Wukchumni (California)

XI / XIII

XI. The future

Translation, preservation, & the rise of English

Three forces are reshaping the linguistic map:

  • Machine translation — neural models now cover 200+ languages well, and another ~1,000 passably. The translator's friction approaches zero for the majority of speakers.
  • Preservation — recording, transcribing, training models on small languages while fluent elders are alive. Living Tongues, ELAR, The Endangered Languages Project.
  • English's rise — ~1.5 B speakers (most of them L2). The default language of science, aviation, and the internet.

Whether the future is one tongue or many depends on choices made now.

Map: Major language families · approximate distribution today. Regions labeled: Indo-European*, Indo-European, Niger-Congo, Afro-Asiatic, IE / Drav., Sino-Tibetan, Austronesian, Austr./Pama.
* In the Americas, Indo-European overlays many indigenous families (Quechuan, Mayan, Na-Dené, ...).
XII / XIII

XII. Further reading

Sources & rabbit holes

Books
  • The Power of Babel — John McWhorter
  • The Unfolding of Language — Guy Deutscher
  • Through the Language Glass — Guy Deutscher
  • The Horse, the Wheel, and Language — David Anthony
  • Empires of the Word — Nicholas Ostler
  • Language Files — Ohio State University
Tools
  • WALS · World Atlas of Language Structures
  • Glottolog · genealogical reference
  • Ethnologue · speaker counts
  • PHOIBLE · phoneme inventories

"A different language is a different vision of life."

— Federico Fellini

— end of folio I —

Compiled MMXXVI · cream paper, dark ink, deep teal, burgundy.

XIII / XIII