a dictionary in transition 11.07.2006, 8:47 AM
posted by ben vershbow
James Gleick had a fascinating piece in the Times Sunday magazine on how the Oxford English Dictionary is reinventing itself in the digital age. The O.E.D. has always had to keep up with a rapidly evolving English language. It took over 60 years and two major supplements to arrive at a second edition in 1989, around the same time Tim Berners-Lee and others at the CERN particle physics lab in Switzerland were coming up with the World Wide Web. Ever since then, the O.E.D. has been hard at work on a third edition, but under radically different conditions. Now not only the language but the forms in which the language is transmitted are in an extreme state of flux:
In its early days, the O.E.D. found words almost exclusively in books; it was a record of the formal written language. No longer. The language upon which the lexicographers eavesdrop is larger, wilder and more amorphous; it is a great, swirling, expanding cloud of messaging and speech: newspapers, magazines, pamphlets; menus and business memos; Internet news groups and chat-room conversations; and television and radio broadcasts.
Crucial to this massive language research program is a vast alphabet soup known as the Oxford English Corpus, a growing database of more than a billion words, culled mostly from the web, which O.E.D. lexicographers analyze through various programs that compare and contrast contemporary word usages in contexts ranging from novels and academic papers to teen chat rooms and fan sites. Together this data comprises what the O.E.D. calls "the fullest, most accurate picture of the language today." (I'm curious to know how broadly they survey the world's general adoption of English; I'm under the impression that it's still largely an Anglo-American affair.)
Marshall McLuhan famously summarized the shift from oral tradition to the written word as "an eye for an ear": a general migration of thought and expression away from the folkloric soundscapes of tribal society toward encounters by individuals with visual symbols on a page, a movement that climaxed in the age of print, and which McLuhan saw at last reversed in the global village of electronic mass media. The curious thing that McLuhan did not live long enough to witness was the fusion of eye-ear cultures in the fast-moving textual traditions of cell phones and the Internet. Written language has acquired an immediacy and a malleability almost matching oral speech, and the effect is a disorienting blurring of boundaries where writing is almost the same as speaking, reading more like overhearing.
So what is a dictionary to do? Or be? Such fundamental change in the process of maintaining "the definitive record of the English language" must have an effect on the product. Might the third "edition" be its final, never-ending one? Gleick again:
No one can say for sure whether O.E.D.3 will ever be published in paper and ink. By the point of decision, not before 20 years or so, it will have doubled in size yet again. In the meantime, it is materializing before the world's eyes, bit by bit, online. It is a thoroughgoing revision of the entire text. Whereas the second edition just added new words and new usages to the original entries, the current project is researching and revising from scratch -- preserving the history but aiming at a more coherent whole.
They've even experimented with bringing readers into the process, working with the BBC earlier this year to solicit public aid in locating first usages for a list of particularly hard-to-trace words. One wonders how far they'd go in this direction. It's one thing to let people contribute at the edges -- the 50 words in that list are all from the 20th century -- but to open the full source code is quite another. It seems the dictionary's challenge is to remain a sturdy ark for the English language during this period of flood, and to proceed under the assumption that we may have seen the last of the land.
(image by Kenneth Moyle)
sol gaitan on November 9, 2006 2:15 PM:
Written language, and image, have acquired the immediacy that electronic communication systems have. It is interesting to realize that "the blurring of boundaries where writing is almost the same as speaking, reading more like overhearing" is symptomatic more than procedural. We seem to be always marching a step behind. Even when communications were much slower, words like tomato or canoe eventually found their place in dictionaries, albeit after they had been widely used. Now, we are finding not only that the next edition of the O.E.D. is "materializing before the world's eyes" but that what those new words aim to signify is also materializing before the world's eyes. And that is neither the realm of paper and ink nor of private enterprise. In a way it is a return to orality, perhaps a "third orality" (following Ong). Knowledge becomes aggregative again, but the impulse to formulate it responds to the need for perpetuation of memory in an organized and intelligible way, which is a product of the written word. If the printing press marked the switch from ear to eye, technology has brought along a synthesis that requires, and allows for, new ways of formulation.
dan visel on November 9, 2006 3:26 PM:
Something that might be worth looking at as a data point is www.urbandictionary.com. This is a more or less completely open dictionary: anyone can add their own definitions for any words they like. To my mind, it's a failure: as an example, try searching for your name - any name! - and you'll almost certainly find something offensive. There's a lot of interesting material in there - for deciphering current slang, it's probably the best thing going - but the signal-to-noise ratio is precariously low. There's still an extremely important role for editors.
ben vershbow on November 9, 2006 6:01 PM:
To be fair, the Urban Dictionary is just looking at slang and by that measure it's pretty good (check out the entry for Santorum).
Then there's the Wiktionary (from the folks who brought you Wikipedia), which seems a little further toward the O.E.D. end of things except that, like the Urban, it's entirely user-created. One nice feature is that it's multilingual so an entry (if it has been well developed) will link to the equivalent word in other languages -- for example see "river."
Sol, you put into words what I think I was trying to say! The return to orality colliding with the persistent "impulse to formulate." In many ways that's the intersection we've set our sights on, and the focus of all these networked book experiments -- texts with conversations happening inside. We're still a long way from synthesis but it's a start.