The Forgotten History of the Dictionary: From Babylonian Word Lists to Wiktionary
The modern English dictionary is a strange kind of artifact. It contains roughly 250,000 entries, each with a definition, etymology, pronunciation guide, and usage examples. It is treated as authoritative for spelling and meaning, consulted in legal disputes, used to settle bar arguments, and assumed to be a basically neutral description of the language. Almost none of these features are inherent properties of dictionaries; almost all of them are recent institutional achievements with concrete histories. The dictionary is one of those objects that feels timeless precisely because the institutional work that produced it has been so successful at hiding itself.
The history of the dictionary is a 5,000-year arc of word lists, glossaries, encyclopedias, and finally the systematic comprehensive dictionaries of the modern type. The patterns are surprisingly revealing about which civilizations took language seriously, what motivated them, and what political work was done by deciding what a word meant.
The 5,000-year prehistory
The earliest known bilingual dictionaries are Sumerian-Akkadian word lists from around 2300 BCE: clay tablets with Sumerian words in one column and Akkadian translations in another. The genre, called lexical lists in modern Assyriology, grew out of monolingual Sumerian sign and word lists that go back to the late fourth millennium BCE, and it was a major intellectual product of Mesopotamian scribal schools. The Urra=Hubullu list ran to 24 tablets and covered roughly 9,700 entries. The motivation was practical: Akkadian had displaced Sumerian as a spoken language but Sumerian remained the language of religion and high learning, so educated scribes needed translation tools. The lexical lists are also one of the earliest cases of someone deciding that the totality of words in a language is a topic worth systematic treatment.
Greek and Roman scholarship produced glossaries (lists of obscure words from particular authors) and lexicons (broader word collections), but no comprehensive dictionary in the modern sense. The Suda, a tenth-century Byzantine encyclopedia, is sometimes called a dictionary; it is closer to an encyclopedia organized alphabetically. The Chinese tradition produced dictionaries earlier and more systematically than the Western tradition. The Erya, dating to around the 3rd century BCE, organized words by semantic category. The Shuowen Jiezi by Xu Shen, completed around 100 CE, organized 9,353 characters by 540 radicals (graphical components) — an organizing principle still recognizable in modern Chinese dictionaries. Indian grammatical tradition produced Sanskrit lexicons, including Amarasimha's Amarakosha, around the 6th century CE, organized by topic and used as a teaching text.
The pattern in the prehistory is that dictionaries are produced by societies with formal scribal training and a translation problem. Mesopotamia had Sumerian-to-Akkadian; medieval Europe had Latin-to-vernacular; China had archaic-to-classical. The dictionaries served the institutional needs of education and religious or legal interpretation, not general literacy.
The vernacular dictionary problem
The first European vernacular dictionaries appeared in the 16th century, partly driven by the Reformation's emphasis on vernacular Bible translation and partly by the Renaissance interest in standardizing emerging national languages. Robert Cawdrey's Table Alphabeticall, published in 1604, is conventionally counted as the first English dictionary; it contained 2,543 hard words with brief definitions and was aimed at "ladies, gentlewomen, or any other unskilfull persons." It is essentially a translation aid from learned vocabulary to ordinary English, not a comprehensive dictionary of the language.
The 18th century produced the first English dictionaries that aimed at comprehensiveness. Nathan Bailey's Universal Etymological English Dictionary of 1721 ran to about 40,000 entries; Samuel Johnson's Dictionary of the English Language in 1755 ran to about 42,000 entries with usage examples drawn from English literature. Johnson's dictionary is the more famous because of its prefatory essay (a remarkable document on the impossibility of fixing language), its sometimes-tendentious definitions ("oats: a grain which in England is generally given to horses, but in Scotland supports the people"), and its influence on dictionary methodology. Johnson worked nine years with six assistants and a budget that did not nearly cover the costs.
The Académie française had begun work on its official Dictionnaire de l'Académie française in 1635, completing the first edition in 1694. The Académie's institutional model, a state-sanctioned body whose explicit role was to police what counted as proper French, became the template for many European national dictionaries and for much of the world's institutional thinking about language standardization. England conspicuously rejected this model; the OED was eventually produced by a private learned society (the Philological Society) with university press support, not by a state language authority.
The OED and the comprehensive principle
The Oxford English Dictionary, originally proposed in 1857, took 70 years to complete its first edition (1928). Its central methodological innovation was historical principles: each entry traced the word's usage from earliest attestation to the present, with citations drawn from a corpus of English literature compiled by thousands of volunteer readers. The OED's compilers did not prescribe meaning; they recorded usage and let the citations show how the word had been used over time.
The OED's reading program is one of the largest crowdsourced research projects in the history of scholarship. Volunteers read through assigned books, identified interesting words and their contexts, and submitted slips with the citations to the OED office in Oxford. Over six million slips accumulated by the time the first edition was complete. James Murray, the principal editor for most of the project's duration, ran the operation from a corrugated-iron shed in his garden that he called the Scriptorium.
The OED established the comprehensive descriptive dictionary as the standard form for English-language reference works. Webster's Third New International Dictionary in 1961 followed the same principles for American English, controversially including informal usages and dropping pejorative usage labels. The 1961 edition's reception revealed how much the dictionary's perceived authority depended on the existence of usage labels: if the dictionary listed "ain't" without calling it "nonstandard," many readers concluded that the dictionary was endorsing nonstandard usage. The descriptive-vs-prescriptive debate that followed was less about linguistics than about who got to certify what counted as proper English.
The political work of dictionaries
The interesting question about dictionaries is not what is in them but who gets to decide. The Académie française decides what is proper French. The Royal Spanish Academy decides what is proper Spanish. The Academy of the Hebrew Language, founded in 1953, decides what is proper Hebrew. Each of these institutions exists in part to coordinate language standardization across a national or linguistic community, and each carries political weight that exceeds its purely lexicographic role. When the Académie's resistance to feminized French job titles became a cultural battleground in 2014, the underlying question was not really about word forms; it was about who has the authority to declare what counts as language.
The English-language tradition's resistance to a state language authority has had its own consequences. The OED and the Webster's dictionaries are the products of publishers with no special legal status; their decisions about including or excluding words have no force of law. This has led to a healthier, more flexible language standardization in some respects (English absorbs new words quickly and unselfconsciously) and to a less coherent one in others (legal texts have to specify which dictionary's edition is authoritative for purposes of contractual interpretation, and they often do).
The Wikipedia/Wiktionary moment
The 21st century has produced a new dictionary form that is structurally different from anything that came before. Wiktionary, launched in 2002, has more entries than any printed dictionary in history (over 8 million across all languages as of 2025, several million in English alone). Its entries are continuously updated, its etymologies are crowd-sourced, and its quality control depends on the same wiki dynamics that produced Wikipedia. The OED's six-million-citation reading program has become a six-million-edit history log.
Wiktionary is uneven — some languages and topics are extraordinarily well-covered, others are weak — but the central observation is structural: a comprehensive dictionary is no longer a multi-decade institutional project requiring tens of millions of dollars and a corporate publisher. It can be a continuously-updated public resource maintained by anyone interested. The institutional model that produced the OED took 70 years to complete a single edition; Wiktionary is updated thousands of times per day.
The deeper observation
The history of the dictionary is a useful corrective to the assumption that reference works are neutral records of fact. Every dictionary is the product of decisions about what counts as a word, what counts as a meaning, whose usage counts as authoritative, and what gets left out. The Sumerian lexical lists left out everything that was not relevant to scribal training. Johnson's dictionary left out usages he found vulgar and included etymologies that were sometimes wrong. The OED aims to include everything but tilts toward literary citation. Wiktionary includes vastly more than the OED but with uneven coverage and quality. Each represents a snapshot of what its creators thought a dictionary should be, and what should be in one. The question of "what does this word mean" is always also a question of "according to whom," and the answer is always the product of an institution making decisions about authority, scope, and audience. The dictionary that arrives in students' hands in 2026 is the result of five thousand years of those decisions, most of them invisible.