Listing entries tagged with Libraries, Search and the Web


yahoo! ui design library Post date  02.16.2006, 7:08 PM

There are several reasons that Yahoo! released some of their core UI code for free. A callous read of this would suggest that they did it to steal back some goodwill from Google (still riding the successful Google API release from 2002). A more charitable soul could suggest that Yahoo! is interested in making the web a better place, not just in their market-share. Two things suggest this: the code is available under an open BSD license, and Yahoo! has released its design patterns alongside it. The code is for playing with; the design patterns for learning from.

The code is squarely aimed at folks like me who would struggle mightily to put together a default library to handle complex interactions in JavaScript using AJAX (all the rage now) while dealing with the intricacies of modern and legacy browsers. Sure, I could pull together the code from different sources, test it, tweak it, break it, tweak it some more, etc. Unsurprisingly, I’ve never gotten around to it. The Yahoo! code release will save me at least a hundred hours. Now I can get right down to designing the interaction, rather than dealing with technology.

The design patterns library is a collection of best practice instructions for dealing with common web UI problems, providing both a solution and a rationale, with a detailed explanation of the interaction/interface feedback. This is something that is more familiar to me, but still stands as a valuable resource. It is a well-documented alternate viewpoint and reminder from a site that serves more users in one day than I’m likely to serve in a year.

Of course Yahoo! is hoping to reclaim some mind-space from Google with developer community goodwill. But since the code is general release, and not brandable in any particular way (it’s all under-the-hood kind of stuff), it’s a little difficult to see the release as a directly marketable item. It really just seems like a gift to the network, and hopefully one that will bear lovely fruit. It’s always heartening to see large corporations opening their products to the public as a way to grease the wheels of innovation.

Posted by jesse wilbur at 07:08 PM | Comments (3)
tags: BSD , Libraries, Search and the Web , Remix , design , design_pattern , gift_economy , innovation , interaction , javascript , license , user_interface , y! , yahoo!

who really needs to turn the pages? Post date  02.15.2006, 6:16 PM

The following post comes from my friend Sally Northmore, a writer and designer based in New York who lately has been interested in things like animation, video game theory, and (right up our alley) the materiality of books and their transition to a virtual environment. A couple of weeks ago we were talking about the British Library's rare manuscript digitization project, "Turning the Pages" -- something I'd been meaning to discuss here but never gotten around to doing. It turns out Sally had some interesting thoughts about this so I persuaded her to do a brief write-up of the project for if:book. Which is what follows below. Come to think of it, this is especially interesting when juxtaposed with Bob's post earlier this week on Jefferson Han's amazing gestural interface design. Here's Sally... - Ben

The British Library's collaboration with multimedia impresarios at Armadillo Systems has led to an impressive publishing enterprise, making available electronic 3-D facsimiles of their rare manuscript collection.

"Turning the Pages", available in CD-ROM, online, and kiosk format, presents the digital incarnation of these treasured texts, allowing the reader to virtually "turn" the pages with a touch and drag function, "pore over" texts with a magnification function, and in some cases, access extras such as supplementary notes, textual secrets, and audio accompaniment.

turning pages mozart.jpg
Pages from Mozart's thematic catalogue -- a composition notebook from the last seven years of his life. Allows the reader to listen to works being discussed.

The designers ambitiously mimicked various characteristics of each work in their 3-D computer models. For instance, the shape of a page of vellum turning differs from the shape of a page of paper. It falls at a unique speed according to its weight; it casts a unique shadow. The simulation even allows for a discrepancy in how a page would turn depending on which corner of the page you decide to peel from.

Online visitors can download a library of manuscripts in Shockwave, although these versions are a bit clunkier and don't provide the flashier thrills of the enormous touch screen kiosks the British Library now houses.

turning pages map.jpg
Mercator's first atlas of Europe - 1570s

Online, the "Turning the Pages" application forces you to adapt to the nature of its embodiment: to physically re-learn how to use a book. A hand cursor invites the reader to turn each page with a click-and-drag maneuver of the mouse. Sounds simple enough, but I struggled to get the momentum of the drag just right so that the page actually turned. In a few failed attempts, the page lifted just so... only to fall back into place again. Apparently, if you can master the Carpal Tunnel-inducing rhythm, you can learn to manipulate the page-turning function even further, grabbing multiple pages at once for a faster, abridged read.

Providing high-resolution scans of rare editions for the general public to experience, a public that otherwise might never "touch," say, the Lindisfarne Gospels, deserves its kudos. Hey, democratic, right? Armadillo Systems provides a list of compelling raisons d'être on their site to this effect. But the content of these texts is already available in reprintable (democratic!) form. Is the virtual page-turning function really necessary for greater understanding of these works, or a game of academic scratch-n-sniff?

turning pages davinci.jpg
The "enlarge" function even allows readers to reverse the famous mirror writing in Leonardo da Vinci's notebooks

At the MLA conference in D.C. this past December, where the British Library had set up a demonstration of "Turning the Pages", this was the question most frequently asked of the BL's representative. Who really needs to turn the pages? I learned from the rep's response that, well, nobody does! Scholars are typically more interested in studying the page, and the turning function hasn't proven to enhance or revive scholarly exploration. And surely, the Library enjoyed plenty of biblio-clout and tourist traffic before this program?

But the lure of new, sexy technology can't be underestimated. From what I understood, the techno-factor is an excellent beacon for attracting investors and funding in multimedia technology. Armadillo's web site provides an interesting sales pitch:

By converting your manuscripts to "Turning the Pages" applications you can attract visitors, increase website traffic and add a revenue stream - at the same time as broadening access to your collection and informing and entertaining your audience.

The program reveals itself to be a peculiar exercise, tangled in its insistence on fetishizing aspects of the material body of the text: the weight of vellum, the karat of gold used to illuminate, the shape of the binding. Such detail and love for each material manuscript went into this project to recreate, as best possible, the "feel" of handling these manuscripts.

Under ideal circumstances, what would the minds behind "Turning the Pages" prefer to create? The original form of the text—the "alpha" manuscript—or the virtual incarnation? Does technological advancement seduce us into valuing the near-perfect simulation over the original? Are we more impressed by the clone, the "Dolly" of hoary manuscripts? And, would one argue that "Turning the Pages" is the best proxy for the real thing, or, another “thing” entirely?

Posted by sally northmore at 06:16 PM | Comments (5)
tags: Libraries, Search and the Web , book_craft , books , design , design_curmudgeonry , digitization , interface , library , manuscript , museum , preservation , reading , turning_the_pages , user_interface

can there be a compromise on copyright? Post date  02.08.2006, 7:19 AM

The following is a response to a comment made by Karen Schneider on my Monday post on libraries and DRM. I originally wrote this as just another comment, but as you can see, it's kind of taken on a life of its own. At any rate, it seemed to make sense to give it its own space, if for no other reason than that it temporarily sidelined something else I was writing for today. It also has a few good quotes that might be of interest. So, Karen said:

I would turn back to you and ask how authors and publishers can continue to be compensated for their work if a library that would buy ten copies of a book could now buy one. I'm not being reactive, just asking the question--as a librarian, and as a writer.

This is a big question, perhaps the biggest, since economics will define the parameters of much that is being discussed here. How do we move from an old economy of knowledge based on the trafficking of intellectual commodities to a new economy where value is placed not on individual copies of things that, as a result of new technologies, are effortlessly copiable, but rather on access to networks of content and the quality of those networks? The question is brought into particularly stark relief when we talk about libraries, which (correct me if I'm wrong) have always been more concerned with the pure pursuit and dissemination of knowledge than with the economics of publishing.

Consider, as an example, the photocopier -- in many ways a predecessor of the world wide web in that it is designed to deconstruct and multiply documents. Photocopiers were unbundling books in libraries long before there was any such thing as Google Book Search, helping users break through the commodified shell to get at the fruit within.

I know there are some countries in Europe that funnel a share of proceeds from library photocopiers back to the publishers, and this seems to be a reasonably fair compromise. But the role of the photocopier in most libraries of the world is more subversive, gently repudiating, with its low hum, sweeping light, and clackety trays, the idea that there can really be such a thing as intellectual property.

That being said, few would dispute the right of an author to benefit economically from his or her intellectual labor; we just have to ask whether the current system is really serving authors' interests, let alone the public interest. New technologies have released intellectual works from the restraints of tangible property, making them easily accessible, eminently exchangeable and never out of print. This should, in principle, elicit a hallelujah from authors, or at least the many who have written works that, while possessed of intrinsic value, have not succeeded in their role as commodities.

But utopian visions of an intellectual gift economy will ultimately fail to nourish writers who must survive in the here and now of a commercial market. Though peer-to-peer gift economies might turn out in the long run to be financially lucrative, and in unexpected ways, we can't realistically expect everyone to hold their breath and wait for that to happen. So we find ourselves at a crossroads where we must soon choose as a society either to clamp down (to preserve existing business models), liberalize (to clear the field for new ones), or compromise.

In her essay "Books in Time," Berkeley historian Carla Hesse gives a wonderful overview of a similar debate over intellectual property that took place in 18th Century France, when liberal-minded philosophes -- most notably Condorcet -- railed against the state-sanctioned Paris printing monopolies, demanding universal access to knowledge for all humanity. To Condorcet, freedom of the press meant not only freedom from censorship but freedom from commerce, since ideas arise not from men but through men from nature (how can you sell something that is universally owned?). Things finally settled down in France after the revolution and the country (and the West) embarked on a historic compromise that laid the foundations for what Hesse calls "the modern literary system":

The modern "civilization of the book" that emerged from the democratic revolutions of the eighteenth century was in effect a regulatory compromise among competing social ideals: the notion of the right-bearing and accountable individual author, the value of democratic access to useful knowledge, and faith in free market competition as the most effective mechanism of public exchange.

Barriers to knowledge were lowered. A system of limited intellectual property rights was put in place that incentivized production and elevated the status of writers. And by and large, the world of ideas flourished within a commercial market. But the question remains: can we reach an equivalent compromise today? And if so, what would it look like? Creative Commons has begun to nibble around the edges of the problem, but love it as we may, it does not fundamentally alter the status quo, focusing as it does primarily on giving creators more options within the existing copyright system.

Which is why free software guru Richard Stallman announced in an interview the other day his unqualified opposition to the Creative Commons movement, explaining that while some of its licenses meet the standards of open source, others are overly conservative, rendering the project bunk as a whole. For Stallman, ever the iconoclast, it's all or nothing.

But returning to our theme of compromise, I'm struck again by this idea of a tax on photocopiers, which suggests a kind of micro-economy where payments are made automatically and seamlessly in proportion to a work's use. Someone who has done a great deal of thinking about such a solution (though on a much more ambitious scale than library photocopiers) is Terry Fisher, an intellectual property scholar at Harvard who has written extensively on practicable alternative copyright models for the music and film industries (Ray and I first encountered Fisher's work when we heard him speak at the Economics of Open Content Symposium at MIT last month).

The following is an excerpt from Fisher's 2004 book, "Promises to Keep: Technology, Law, and the Future of Entertainment", that paints a relatively detailed picture of what one alternative copyright scheme might look like. It's a bit long, and as I mentioned, deals specifically with the recording and movie industries, but it's worth reading in light of this discussion since it seems it could just as easily apply to electronic books:

....we should consider a fundamental change in approach.... replace major portions of the copyright and encryption-reinforcement models with a variant of....a governmentally administered reward system. In brief, here’s how such a system would work. A creator who wished to collect revenue when his or her song or film was heard or watched would register it with the Copyright Office. With registration would come a unique file name, which would be used to track transmissions of digital copies of the work. The government would raise, through taxes, sufficient money to compensate registrants for making their works available to the public. Using techniques pioneered by American and European performing rights organizations and television rating services, a government agency would estimate the frequency with which each song and film was heard or watched by consumers. Each registrant would then periodically be paid by the agency a share of the tax revenues proportional to the relative popularity of his or her creation. Once this system were in place, we would modify copyright law to eliminate most of the current prohibitions on unauthorized reproduction, distribution, adaptation, and performance of audio and video recordings. Music and films would thus be readily available, legally, for free.

Painting with a very broad brush...., here would be the advantages of such a system. Consumers would pay less for more entertainment. Artists would be fairly compensated. The set of artists who made their creations available to the world at large--and consequently the range of entertainment products available to consumers--would increase. Musicians would be less dependent on record companies, and filmmakers would be less dependent on studios, for the distribution of their creations. Both consumers and artists would enjoy greater freedom to modify and redistribute audio and video recordings. Although the prices of consumer electronic equipment and broadband access would increase somewhat, demand for them would rise, thus benefiting the suppliers of those goods and services. Finally, society at large would benefit from a sharp reduction in litigation and other transaction costs.

While I'm uncomfortable with the idea of any top-down, governmental solution, this certainly provides food for thought.
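
For the arithmetically inclined, the payout rule at the heart of Fisher's proposal (each registrant receives a share of the tax pool proportional to how often the work is heard or watched) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `reward_shares`, the work names, the play counts, and the size of the pool are all hypothetical and do not come from Fisher's book.

```python
from fractions import Fraction


def reward_shares(pool, plays):
    """Split a tax-funded pool among registered works.

    Each work receives pool * (its estimated plays / total estimated plays).
    Fractions are used so the shares always sum exactly to the pool.
    """
    total = sum(plays.values())
    if total == 0:
        # Nothing was heard or watched, so nothing is paid out.
        return {work: Fraction(0) for work in plays}
    return {work: Fraction(pool) * count / total for work, count in plays.items()}


# Hypothetical usage estimates for three registered works.
estimated_plays = {"song_a": 600, "film_b": 300, "song_c": 100}

# Hypothetical pool raised through taxes for this payment period.
payouts = reward_shares(1_000_000, estimated_plays)

for work, amount in payouts.items():
    print(work, float(amount))
```

The hard part of the real scheme is of course not this division but the estimate that feeds it: how a government agency would measure what is actually heard and watched.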

Posted by ben vershbow at 07:19 AM | Comments (8)
tags: Copyright and Copyleft , DRM , IP , Libraries, Search and the Web , Publishing, Broadcast, and the Press , condorcet , copyleft , copyright , creative_commons , enlightenment , france , free_software , intellectual_property , libraries , music , open_source , photocopy , printing , richar_stallman , xerox

DRM and the damage done to libraries Post date  02.06.2006, 7:51 AM

nypl.jpg
New York Public Library

A recent BBC article draws attention to widespread concerns among UK librarians (concerns I know are shared by librarians and educators on this side of the Atlantic) regarding the potentially disastrous impact of digital rights management on the long-term viability of electronic collections. At present, when downloads represent only a tiny fraction of most libraries' circulation, DRM is more of a nuisance than a threat. At the New York Public Library, for instance, only one "copy" of each downloadable ebook or audio book title can be "checked out" at a time -- a frustrating policy that all but cancels out the value of its modest digital collection. But the implications further down the road, when an increasing portion of library holdings will be non-physical, are far more grave.

What these restrictions in effect do is place locks on books, journals and other publications -- locks for which there are generally no keys. What happens, for example, when a work passes into the public domain but its code restrictions remain intact? Or when materials must be converted to newer formats but can't be extracted from their original files? The question we must ask is: how can librarians, now or in the future, be expected to effectively manage, preserve and update their collections in such straitjacketed conditions?

This is another example of how the prevailing copyright fundamentalism threatens to constrict the flow and preservation of knowledge for future generations. I say "fundamentalism" because the current copyright regime in this country is radical and unprecedented in its scope, yet traces its roots back to the initially sound concept of limited intellectual property rights as an incentive to production, which, in turn, stemmed from the Enlightenment idea of an author's natural rights. What was originally granted (hesitantly) as a temporary, statutory limitation on the public domain has spun out of control into a full-blown culture of intellectual control that chokes the flow of ideas through society -- the very thing copyright was supposed to promote in the first place.

If we don't come to our senses, we seem destined for a new dark age where every utterance must be sanctioned by some rights holder or licensing agent. Free thought isn't possible, after all, when every thought is taxed. In his "An Answer to the Question: What is Enlightenment?" Kant condemns as criminal any contract that compromises the potential of future generations to advance their knowledge. He's talking about the church, but this can just as easily be applied to the information monopolists of our times and their new tool, DRM, which, in its insidious way, is a kind of contract (though one that is by definition non-negotiable since enforced by a machine):

But would a society of pastors, perhaps a church assembly or venerable presbytery (as those among the Dutch call themselves), not be justified in binding itself by oath to a certain unalterable symbol in order to secure a constant guardianship over each of its members and through them over the people, and this for all time: I say that this is wholly impossible. Such a contract, whose intention is to preclude forever all further enlightenment of the human race, is absolutely null and void, even if it should be ratified by the supreme power, by parliaments, and by the most solemn peace treaties. One age cannot bind itself, and thus conspire, to place a succeeding one in a condition whereby it would be impossible for the later age to expand its knowledge (particularly where it is so very important), to rid itself of errors, and generally to increase its enlightenment. That would be a crime against human nature, whose essential destiny lies precisely in such progress; subsequent generations are thus completely justified in dismissing such agreements as unauthorized and criminal.

We can only hope that subsequent generations prove more enlightened than those presently in charge.

Posted by ben vershbow at 07:51 AM | Comments (4)
tags: Copyright and Copyleft , DRM , IP , Libraries, Search and the Web , books , copyright , digital , digitization , ebooks , enlightenment , fundamentalism , intellectual_property , kant , libraries , library , philosophy , public_domain , scholarship

google gets mid-evil Post date  01.30.2006, 3:46 PM

At the World Economic Forum in Davos last Friday, Google CEO Eric Schmidt assured a questioner in the audience that his company had in fact thoroughly searched its soul before deciding to roll out a politically sanitized search engine in China:

We concluded that although we weren't wild about the restrictions, it was even worse to not try to serve those users at all... We actually did an evil scale and decided not to serve at all was worse evil.

(via Ditherati)

Posted by ben vershbow at 03:46 PM | Comments (0)
tags: Libraries, Search and the Web , Network_Freedom , censorship , china , evil , free_speech , google , internet , search , web

illusions of a borderless world Post date  01.27.2006, 3:57 PM

china google falun gong.jpg

A number of influential folks around the blogosphere are reluctantly endorsing Google's decision to play by China's censorship rules on its new Google.cn service -- what one local commentator calls a "eunuch version" of Google.com. Here's a sampler of opinions:

Ethan Zuckerman ("Google in China: Cause For Any Hope?"):

It’s a compromise that doesn’t make me happy, that probably doesn’t make most of the people who work for Google very happy, but which has been carefully thought through...

In launching Google.cn, Google made an interesting decision - they did not launch versions of Gmail or Blogger, both services where users create content. This helps Google escape situations like the one Yahoo faced when the Chinese government asked for information on Shi Tao, or when MSN pulled Michael Anti’s blog. This suggests to me that Google’s willing to sacrifice revenue and market share in exchange for minimizing situations where they’re asked to put Chinese users at risk of arrest or detention... This, in turn, gives me some cause for hope.

Rebecca MacKinnon ("Google in China: Degrees of Evil"):

At the end of the day, this compromise puts Google a little lower on the evil scale than many other internet companies in China. But is this compromise something Google should be proud of? No. They have put a foot further into the mud. Now let's see whether they get sucked in deeper or whether they end up holding their ground.

David Weinberger ("Google in China"):

If forced to choose — as Google has been — I'd probably do what Google is doing. It sucks, it stinks, but how would an information embargo help? It wouldn't apply pressure on the Chinese government. Chinese citizens would not be any more likely to rise up against the government because they don't have access to Google. Staying out of China would not lead to a more free China.

Doc Searls ("Doing Less Evil, Possibly"):

I believe constant engagement — conversation, if you will — with the Chinese government, beats picking up one's very large marbles and going home. Which seems to be the alternative.

Much as I hate to say it, this does seem to be the sensible position -- not unlike opposing America's embargo of Cuba. The logic goes that isolating Castro only serves to further isolate the Cuban people, whereas exposure to the rest of the world -- even restricted and filtered -- might, over time, loosen the state's monopoly on civic life. Of course, you might say that trading Castro for globalization is merely an exchange of one tyranny for another. But what is perhaps more interesting to ponder right now, in the wake of Google's decision, is the palpable melancholy felt in the comments above. What does it reveal about what we assume -- or used to assume -- about the internet and its relationship to politics and geography?

A favorite "what if" of recent history is what might have happened in the Soviet Union had it lasted into the internet age. Would the Kremlin have managed to secure its virtual borders? Or censor and filter the net into a state-controlled intranet -- a Union of Soviet Socialist Networks? Or would the decentralized nature of the technology, mixed with the cultural stirrings of glasnost, have toppled the totalitarian state from beneath?

Ten years ago, in the heady early days of the internet, most would probably have placed their bets against the Soviets. The Cold War was over. Some even speculated that history itself had ended, that free-market capitalism and democracy, on the wings of the information revolution, would usher in a long era of prosperity and peace. No borders. No limits.

jingjing_1.jpg chacha.jpg
"Jingjing" and "Chacha." Internet police officers from the city of Shenzhen who float over web pages and monitor the cyber-traffic of local users.

It's interesting now to see how exactly the opposite has occurred. Bubbles burst. Towers fell. History, as we now realize, did not end, it was merely on vacation; while the utopian vision of the internet -- as a placeless place removed from the inequities of the physical world -- has all but evaporated. We realize now that geography matters. Concrete features have begun to crystallize on this massive information plain: ports, gateways and customs houses erected, borders drawn. With each passing year, the internet comes more and more to resemble a map of the world.

Those of us tickled by the "what if" of the Soviet net now have ourselves a plausible answer in China, which, through a stunning feat of pipe control -- a combination of censoring filters, on-the-ground enforcement, and general peering over the shoulders of its citizens -- has managed to create a heavily restricted local net in its own image. Barely a decade after the fall of the Iron Curtain, we have the Great Firewall of China.

And as we've seen this week, and in several highly publicized instances over the past year, the virtual hand of the Chinese government has been substantially strengthened by Western technology companies willing to play by local rules so as not to be shut out of the explosive Chinese market. Tech giants like Google, Yahoo!, and Cisco Systems have proved only too willing to abide by China's censorship policies, blocking certain search returns and politically sensitive terms like "Taiwanese democracy," "multi-party elections" or "Falun Gong". They also specialize in precision bombing, sometimes removing the pages of specific users at the government's bidding. The most recent incident came just after New Year's when Microsoft acquiesced to government requests to shut down the MSN Spaces site of popular muckraking blogger Zhao Jing, aka Michael Anti.

MS_and_China.jpg
One of many angry responses that circulated the non-Chinese net in the days that followed.

We tend to forget that the virtual is built of physical stuff: wires, cable, fiber -- the pipes. Whoever controls those pipes, be it governments or telecoms, has the potential to control what passes through them. The result is that the internet comes in many flavors, depending in large part on where you are logging in. As Jack Goldsmith and Timothy Wu explain in an excellent article in Legal Affairs (adapted from their forthcoming book Who Controls the Internet?: Illusions of a Borderless World), China, far from being the boxed-in exception to an otherwise borderless net, is actually just the uglier side of a global reality. The net has been mapped out geographically into "a collection of nation-state networks," each with its own politics, social mores, and consumer appetites. The very same technology that enables Chinese authorities to write the rules of their local net enables companies around the world to target advertising and gear services toward local markets. Goldsmith and Wu:

...information does not want to be free. It wants to be labeled, organized, and filtered so that it can be searched, cross-referenced, and consumed....Geography turns out to be one of the most important ways to organize information on this medium that was supposed to destroy geography.

Who knows? When networked devices truly are ubiquitous and can pinpoint our location wherever we roam, the internet could be censored or tailored right down to the individual level (like the empire in Borges' fable that commissions a one-to-one map of its territory that upon completion perfectly covers every corresponding inch of land like a quilt).

The case of Google, while by no means unique, serves well to illustrate how threadbare the illusion of the borderless world has become. The company's famous credo, "don't be evil," just doesn't hold up in the messy, complicated real world. "Choose the lesser evil" might be more appropriate. Also crumbling upon contact with air is Google's famous mission, "to organize the world's information and make it universally accessible and useful," since, as we've learned, Google will actually vary the world's information depending on where in the world it operates.

Google may be behaving responsibly for a corporation, but it's still a corporation, and corporations, in spite of well-intentioned employees, some of whom may go to great lengths to steer their company onto the righteous path, are still ultimately built to do one thing: get ahead. Last week in the States, the get-ahead impulse happened to be consonant with our values. Not wanting to spook American users, Google chose to refuse a Dept. of Justice request for search records to aid its anti-pornography crackdown. But this week, not wanting to ruffle the Chinese government, Google compromised and became an agent of political repression. "Degrees of evil," as Rebecca MacKinnon put it.

The great irony is that technologies we romanticized as inherently anti-tyrannical have turned out to be powerful instruments of control, highly adaptable to local political realities, be they state or market-driven. Not only does the Chinese government use these technologies to suppress democracy, it does so with the help of its former Cold War adversary, America -- or rather, the corporations that in a globalized world are the de facto co-authors of American foreign policy. The internet is coming of age and with that comes the inevitable fall from innocence. Part of us desperately wanted to believe Google's silly slogans because they said something about the utopian promise of the net. But the net is part of the world, and the world is not so simple.

Posted by ben vershbow at 03:57 PM | Comments (3)
tags: ISP , Libraries, Search and the Web , Network_Freedom , broadband , capitalism , china , cyberspace , democracy , evil , falun_gong , free_speech , geography , globalization , glocalization , good , google , human_rights , search , spectrum , technology

the economics of open content Post date  01.23.2006, 9:31 AM

For the next two days, Ray and I are attending what promises to be a fascinating conference in Cambridge, MA -- The Economics of Open Content -- co-hosted by Intelligent Television and MIT Open CourseWare.

This project is a systematic study of why and how it makes sense for commercial companies and noncommercial institutions active in culture, education, and media to make certain materials widely available for free—and also how free services are morphing into commercial companies while retaining their peer-to-peer quality.

They've assembled an excellent cross-section of people from the emerging open access movement, business, law, the academy, the tech sector and from virtually every media industry to address one of the most important (and counter-intuitive) questions of our age: how do you make money by giving things away for free?

Rather than continue, in an age of information abundance, to embrace economic models predicated on information scarcity, we need to look ahead to new models for sustainability and creative production. I look forward to hearing from some of the visionaries gathered in this room.

More to come...

Posted by ben vershbow at 09:31 AM | Comments (0)
tags: Copyright and Copyleft , Education , Libraries, Search and the Web , academia , conferences_and_excursions , copyleft , copyright , free_software , gift_economy , library , open_access , open_content , publishing , scholarship

cheney and google Post date  01.21.2006, 6:27 PM

(this is a follow-up to ben's recent post "the book is reading you.")

i rarely read Maureen Dowd but the headline of her column in today's New York Times, "Googling past the Graveyard," caught my attention. Dowd calls Dick Cheney on the carpet for asking Google to release the search records of U.S. citizens. while i'm horrified that the govt. would even consider asking for such information, i'm concerned that the way this particular issue is playing out, Google is being portrayed as the poor beleaguered neutral entity caught between an over-reaching bureaucracy and its citizens. Cheney will expire eventually. in the meantime Google will collect even more data. Google is a very big corporation, whose power will grow over time. in the long run, why aren't people outraged that this information is in Google's hands in the first place? shouldn't we be?

Posted by bob stein at 06:27 PM | Comments (5)
tags: Libraries, Search and the Web , cheney , google , government , privacy

the book is reading you Post date  01.19.2006, 1:42 PM

I just noticed that Google Book Search requires users to be logged in on a Google account to view pages of copyrighted works.

They provide the following explanation:

Why do I have to log in to see certain pages?

Because many of the books in Google Book Search are still under copyright, we limit the amount of a book that a user can see. In order to enforce these limits, we make some pages available only after you log in to an existing Google Account (such as a Gmail account) or create a new one. The aim of Google Book Search is to help you discover books, not read them cover to cover, so you may not be able to see every page you're interested in.

So they're tracking how much we've looked at and capping our number of page views. Presumably a bone tossed to publishers, who I'm sure will continue suing Google all the same (more on this here). There's also the possibility that publishers have requested information on who's looking at their books -- geographical breakdowns and stats on click-throughs to retailers and libraries. I doubt, though, that Google would share this sort of user data. Substantial privacy issues aside, that's valuable information they want to keep for themselves.

That's because "the aim of Google Book Search" is also to discover who you are. It's capturing your clickstreams, analyzing what you've searched and the terms you've used to get there. The book is reading you. Substantial privacy issues aside, (it seems more and more that's where we'll be leaving them) Google will use this data to refine Google's search algorithms and, who knows, might even develop some sort of personalized recommendation system similar to Amazon's -- you know, where the computer lists other titles that might interest you based on what you've read, bought or browsed in the past (a system that works only if you are logged in). It's possible Google is thinking of Book Search as the cornerstone of a larger venture that could compete with Amazon.

There are many ways Google could eventually capitalize on its books database -- that is, beyond the contextual advertising that is currently its main source of revenue. It might turn the scanned texts into readable editions, hammer out licensing agreements with publishers, and become the world's biggest ebook store. It could start a print-on-demand service -- a Xerox machine on steroids (and the return of Google Print?). It could work out deals with publishers to sell access to complete online editions -- a searchable text to go along with the physical book -- as Amazon announced it will do with its Upgrade service. Or it could start selling sections of books -- individual pages, chapters etc. -- as Amazon has also planned to do with its Pages program.

Amazon has long served as a valuable research tool for books in print, so much so that some university library systems are now emulating it. Recent additions to the Search Inside the Book program such as concordances, interlinked citations, and statistically improbable phrases (where distinctive terms in the book act as machine-generated tags) are especially fun to play with. Although first and foremost a retailer, Amazon feels more and more like a search system every day (and its A9 engine, though seemingly always on the back burner, is also developing some interesting features). On the flip side Google, though a search system, could start feeling more like a retailer. In either case, you'll have to log in first.

Posted by ben vershbow at 01:42 PM | Comments (5)
tags: Copyright and Copyleft , Libraries, Search and the Web , POD , amazon , books , e-commerce , e-publishing , ebooks , google , google_book_search , google_print , internet , print_on_demand , privacy , publishing , search , web

who owns the network? Post date  01.12.2006, 5:15 PM

Susan Crawford recently floated the idea of the internet network (see comments 1 and 2) as a public trust that, like America's national parks or seashore, requires the protection of the state against the undue influence of private interests.

...it's fine to build special services and make them available online. But broadband access companies that cover the waterfront (literally -- are interfering with our navigation online) should be confronted with the power of the state to protect entry into this self-owned commons, the internet. And the state may not abdicate its duty to take on this battle.

Others argue that a strong government hand will create as many problems as it fixes, and that only true competition between private, municipal and grassroots parties -- across not just broadband, but multiple platforms like wireless mesh networks and satellite -- can guarantee a free net open to corporations and individuals in equal measure.

Discussing this around the table today, Ray raised the important issue of open content: freely available knowledge resources like textbooks, reference works, scholarly journals, media databases and archives. What are the implications of having these resources reside on a network that increasingly is subject to control by phone and cable companies -- companies that would like to transform the net from a many-to-many public square into a few-to-many entertainment distribution system? How open is the content when the network is in danger of becoming distinctly less open?

Posted by ben vershbow at 05:15 PM | Comments (12)
tags: ISP , Libraries, Search and the Web , Network_Freedom , broadband , internet , open_access , open_content

digital universe and expert review Post date  01.06.2006, 5:09 PM

The notion of expert review has been tossed around in the open-content community for a long time. Philosophically, those who lean towards openness tend to sneer at the idea of formalized expert review, trusting in the multiplied consciousness of the community to maintain high standards through less formal processes. Wikipedia is obviously the most successful project in this mode. The informal process has the benefit of speed, and avoids bureaucracy—something which raises the barrier to entry, and keeps out people who just don't have the time to deal with 'process.'

The other side of that coin is the belief that experts and editors encourage civil discourse at a high level; without them you'll end up with mob rule and lowest common denominator content. Editors encourage higher quality writing and thinking. Thinking and writing better than others is, in a way, the definition of expert. In addition, editors and experts tend to have a professional interest in the subject matter, as well as access to better resources. These are exactly the kind of people who are not discouraged by higher barriers to entry, and they are, by extension, the people that you want to create content on your site.

Larry Sanger thinks that, anyway. A Wikipedia co-founder, he gave an interview on news.com about a project that plans to create a better Wikipedia, using a combination of open content development and editorial review: The Digital Universe.

You can think of the Digital Universe as a set of portals, each defined by a topic, such as the planet Mars. And from each portal, there will be links to the best resources on the Web, including a lot of resources of different kinds that are prepared by experts and the general public under the management of experts. This will include an encyclopedia, as well as public domain books, participatory journalism, forums of various kinds and so forth. We'll build a community of experts and an online collaborative network of independent organizations, each of which has authority over its own discipline to select material and to build resources that are together displayed through a single free-information platform.

I have experience with the editor model from my time at About.com. The About.com model is based on 'guides'—nominal (and sometimes actual) experts on a chosen topic (say NASCAR, or anesthesiology)—who scour the internet, find good resources, and write articles and newsletters to facilitate understanding and keep communities up to date. The guides were overseen by a bevy of editors, who tended mostly to enforce the quotas for newsletters and set the line on quality. About.com has its problems, but it was novel and successful during its time.

The Digital Universe model is an improvement on the single guide model; it encourages a multitude of people to contribute to a reservoir of content. Measured by available resources, the Digital Universe model wins, hands down. As with all large, open systems, emergent behaviors will add even more to the system in ways that we cannot predict. The Digital Universe will have its own identity and quality, which, according to the blueprint, will be further enhanced by expert editors, shaping the development of a topic and polishing it to a high gloss.

Full disclosure: I find the idea of experts "managing the public" somehow distasteful, but I am compelled by the argument that this will bring about a better product. Sanger's essay on eliminating anti-elitism from Wikipedia clearly demonstrates his belief in the 'expert' methodology. I am willing to go along, mindful that we should be creating material that not only leads people to the best resources, but also allows them to engage more critically with the content. This is what experts do best. However, I'm pessimistic about experts mixing it up with the public. There are strong, and as I see it, opposing forces in play: an expert's reputation vs. public participation, industry cant vs. plain speech, and one expert opinion vs. another.

The difference between Wikipedia and the Digital Universe comes down, fundamentally, to the importance placed on authority. We'll see what shape the Digital Universe takes as the stresses of maintaining an authoritative process clash with the anarchy of the online public. I think we'll see that adopting authority as your rallying cry is a volatile position in a world of empowered authorship and a universe of alternative viewpoints.

Posted by jesse wilbur at 05:09 PM | Comments (3)
tags: About.com , Libraries, Search and the Web , authority , authors , digital_universe , editors , experts , open_content , trust , wikipedia

questions about blog search and time Post date  01.06.2006, 8:17 AM

Does anyone know of a good way to search for old blog entries on the web? I've just been looking at some of the available blog search resources and few of them appear to provide any serious advanced search options. The couple of major ones I've found that do (after an admittedly cursory look) are Google and Ice Rocket. Both, however, appear to be broken, at least when it comes to dates. I've tried them on three different browsers, on Mac and PC, and in each case the date menus seem to be frozen. It's very weird. They give you the option of entering a specific time range but won't accept the actual dates. Maybe I'm just having a bad tech day, but it's as if there's some conceptual glitch across the web vis a vis blogs and time.

Most blog search engines are geared toward searching the current blogosphere, but there should be a way to research older content. My first thought was that blog search engines crawl RSS feeds, most of which do not transmit the entirety of a blog's content, just the more recent. That would pose a problem for archival search.

Does anyone know what would be the best way to go about finding, say, old blog entries containing the keywords "new orleans superdome" from late August to late September 2005? Is it best to just stick with general web search and painstakingly comb through for blogs? If we agree that blogs have become an important kind of cultural document, then surely there should be a way to find them more than a month after they've been written.
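To make the feed problem concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two-item feed is invented, and `search_feed` is a hypothetical helper, not the way any real blog search engine is known to work. It filters feed items by keyword and date range, and shows that a search relying on the feed alone can only ever match what the feed still carries.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical RSS 2.0 feed: like most feeds, it carries only the latest posts.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example blog</title>
  <item>
    <title>Looking back at the New Orleans Superdome</title>
    <pubDate>Thu, 05 Jan 2006 10:00:00 +0000</pubDate>
  </item>
  <item>
    <title>Holiday reading list</title>
    <pubDate>Tue, 20 Dec 2005 09:30:00 +0000</pubDate>
  </item>
</channel></rss>"""


def search_feed(feed_xml, keywords, start, end):
    """Return titles of items that contain every keyword and were
    published between start and end (inclusive)."""
    root = ET.fromstring(feed_xml)
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # pubDate uses the RFC 822 date format, which email.utils can parse.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        text = title.lower()
        in_range = start <= published <= end
        if in_range and all(k.lower() in text for k in keywords):
            hits.append(title)
    return hits


if __name__ == "__main__":
    late_august = datetime(2005, 8, 25, tzinfo=timezone.utc)
    late_september = datetime(2005, 9, 30, tzinfo=timezone.utc)
    # The feed holds nothing from this window, so the search comes back empty.
    print(search_feed(SAMPLE_FEED, ["new orleans", "superdome"],
                      late_august, late_september))
```

The date filter itself is trivial; the gap is in the data. Unless the crawler saved each entry when it first appeared in the feed, or crawled the blog's own archive pages, there is nothing from the earlier window left to search.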

Posted by ben vershbow at 08:17 AM | Comments (5)
tags: Blogosphere , Libraries, Search and the Web , archives , blog_search , blogging , blogs , history , research , search

why google and yahoo love wikipedia Post date  12.29.2005, 3:16 PM

From Dan Cohen's excellent Digital Humanities Blog comes a discussion of the Wikipedia story that Cohen claims no one seems to be writing about — namely, the question of why Google and Yahoo give so much free server space and bandwidth to Wikipedia. Cohen points out that there's more going on here than just the open source ethos of these tech companies: in fact, the two companies are becoming increasingly dependent on Wikipedia as a resource, both as something to repackage for commercial use (in sites such as Answers.com), and as a major component in the programming of search algorithms. Cohen writes:

Let me provide a brief example that I hope will show the value of having such a free resource when you are trying to scan, sort, and mine enormous corpora of text. Let's say you have a billion unstructured, untagged, unsorted documents related to the American presidency in the last twenty years. How would you differentiate between documents that were about George H. W. Bush (Sr.) and George W. Bush (Jr.)? This is a tough information retrieval problem because both presidents are often referred to as just "George Bush" or "Bush." Using data-mining algorithms such as Yahoo's remarkable Term Extraction service, you could pull out of the Wikipedia entries for the two Bushes the most common words and phrases that were likely to show up in documents about each (e.g., "Berlin Wall" and "Barbara" vs. "September 11" and "Laura"). You would still run into some disambiguation problems ("Saddam Hussein," "Iraq," "Dick Cheney" would show up a lot for both), but this method is actually quite a powerful start to document categorization.
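The approach Cohen describes can be roughed out in a few lines of Python. This is only a sketch under loud assumptions: it does not call Yahoo's Term Extraction service, swapping in a plain word-set comparison instead, and the two reference strings are invented stand-ins for the Wikipedia entries. The helper names (`distinctive_terms`, `classify`) are hypothetical.

```python
import re
from collections import Counter

# Invented stand-ins for the two Wikipedia entries; real entries are far longer.
REFERENCE_TEXTS = {
    "George H. W. Bush": "Bush Barbara Berlin Wall Gulf War Iraq Saddam Hussein Dick Cheney",
    "George W. Bush": "Bush Laura September 11 Iraq Saddam Hussein Dick Cheney",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def distinctive_terms(references):
    """For each label, keep only the terms that appear in no other reference."""
    token_sets = {label: set(tokenize(text)) for label, text in references.items()}
    result = {}
    for label, tokens in token_sets.items():
        others = set().union(*(t for other, t in token_sets.items() if other != label))
        result[label] = tokens - others
    return result


def classify(document, references):
    """Return the label whose distinctive terms occur most often in the
    document, or None when the top scores are tied."""
    counts = Counter(tokenize(document))
    scores = {
        label: sum(counts[term] for term in terms)
        for label, terms in distinctive_terms(references).items()
    }
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    print(classify("President Bush spoke about the fall of the Berlin Wall",
                   REFERENCE_TEXTS))
    # Shared terms such as "Iraq" are discarded, so this one stays ambiguous.
    print(classify("Bush discussed Iraq with Dick Cheney", REFERENCE_TEXTS))
```

Dropping the shared vocabulary is what makes the toy version work at all, and it also reproduces the disambiguation problem Cohen mentions: a document that only talks about Iraq and Cheney gives the classifier nothing to go on.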

Cohen's observation is a valuable reminder that all of the discussion of Wikipedia's accuracy and usefulness as an academic tool is really only skimming the surface of how and why the open-source encyclopedia is reshaping the way knowledge is made and accessed. Ultimately, the question of whether or not Wikipedia should be used in the classroom might be less important than whether — or how — it is used in the boardroom, by companies whose function is to repackage, reorganize and return "the people's knowledge" to the people at a tidy profit.

Posted by lisa lynch at 03:16 PM | Comments (7)
tags: Libraries, Search and the Web , google , wikipedia , yahoo

librivox -- free public domain books read aloud by volunteers Post date  12.19.2005, 9:26 AM

Just read a Dec. 16th Wired article about Canadian Hugh McGuire's brilliant new venture, Librivox. Librivox is creating and distributing free audiobooks by asking volunteers to create audio files of works of literature in the public domain. The files are hosted on the Internet Archive and are available in MP3 and OGG formats.

Thus far, Librivox — which has only been up for a few months — has recorded about 30 titles, relying on dozens of volunteers. The website promotes the project as the "acoustical liberation of the public domain" and claims that the ultimate goal is to liberate all public domain works of literature. For now, titles cataloged on the website include L. Frank Baum's The Wizard of Oz, Joseph Conrad's The Secret Agent and the U.S. Constitution.

Using Librivox couldn't be easier: clicking on an entry will bring you to a screen which allows you to select a Wikipedia entry on the book in question, the e-Gutenberg file of the book, an alternate Zip file of the book, and the Librivox audio version, available chapter by chapter with the names of each volunteer reader noted prominently next to the chapter information.

I listened to parts of about a half-dozen book chapters to get a sense of the quality of the recordings, and I was impressed. The volunteers have obviously chosen books they are passionate about, and the recordings are lively, quite clear and easy to listen to. As a regular audiobook listener, I was struck by the fact that while most literary audiobooks are read by authors who tend to work hard at conveying a sense of character, the Librivox selections seemed to convey, more than anything, the reader's passion for the text itself; i.e., for the written word. Here at the Institute we've been spending a fair amount of time trying to figure out when a book loses its book-ness, and I'd argue that while some audiobooks blur the boundary between book and performance, the Librivox books remind us that a book reduced to a stream of digitally produced sound can still be very much a book.

The site's definitely worth a visit, and, if you've got a decent voice and a few spare hours, there's information about how to become a volunteer reader yourself. And finally, don't miss the list of other audiolit projects on the lower right-hand corner of the homepage: there are many voices out there, reading many books — including Japanese Classical Literature For Bedtime, if you're so inclined.

Posted by lisa lynch at 09:26 AM | Comments (1)
tags: Libraries, Search and the Web , audiobooks , domain , librivox , public

google book search debated at american bar association Post date  12.15.2005, 3:50 PM

Last night I attended a fascinating panel discussion at the American Bar Association on the legality of Google Book Search. In many ways, this was the debate made flesh. Making the case against Google were high-level representatives from the two entities that have brought suit, the Authors' Guild (Executive Director Paul Aiken) and the Association of American Publishers (VP for legal counsel Allan Adler). It would have been exciting if Google, in turn, had sent representatives to make their case, but instead we had two independent commentators, law professor and blogger Susan Crawford and Cameron Stracher, also a law professor and writer. The discussion was vigorous, at times heated -- in many ways a preview of arguments that could eventually be aired (albeit under a much stricter clock) in front of federal judges.

The lawsuits in question center around whether Google's scanning of books and presenting tiny snippet quotations online for keyword searches is, as Google claims, fair use. As I understand it, the use in question is the initial scanning of full texts of copyrighted books held in the collections of partner libraries. The fair use defense hinges on this initial full scan being the necessary first step before the "transformative" use of the texts, namely unbundling the book into snippets generated on the fly in response to user search queries.

google snippets.jpg
...in case you were wondering what snippets look like

At first, the conversation remained focused on this question, and during that time it seemed that Google was winning the debate. The plaintiffs' arguments seemed weak and a little desperate. Aiken used carefully scripted language about not being against online book search, just wanting it to be licensed, quipping "we're just throwing a little gravel in the gearbox of progress." Adler was a little more strident, calling Google "the master of misdirection," using the promise of technological dazzlement to turn public opinion against the legitimate grievances of publishers (of course, this will be settled by judges, not by public opinion). He did score one good point, though, saying Google has betrayed the weakness of its fair use claim in the way it has continually revised its description of the program.

Almost exactly one year ago, Google unveiled its "library initiative" only to re-brand it several months later as a "publisher program" following a wave of negative press. This, however, did little to ease tensions and eventually Google decided to halt all book scanning (until this past November) while they tried to smooth things over with the publishers. Even so, lawsuits were filed, despite Google's offer of an "opt-out" option for publishers, allowing them to request that certain titles not be included in the search index. This more or less created an analog to the "implied consent" principle that legitimates search engines caching web pages with "spider" programs that crawl the net looking for new material.

In that case, there is a machine-to-machine communication taking place and web page owners are free to insert instructions that tell spiders not to crawl or cache their pages, or can simply place certain content behind a firewall. By offering an "opt-out" option to publishers, Google enables essentially the same sort of communication. Adler's point (and this was echoed more succinctly by a smart question from the audience) was that if Google's fair use claim is so air-tight, then why offer this middle ground? Why all these efforts to mollify publishers without actually negotiating a license? (I am definitely concerned that Google's efforts to quell what probably should have been an anticipated negative reaction from the publishing industry will end up undercutting its legal position.)
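For readers who haven't seen the web version of this opt-out, it usually takes the form of a robots.txt file that a well-behaved spider checks before fetching a page. Here is a minimal sketch using the `urllib.robotparser` module from Python's standard library. The robots.txt contents, the crawler name `ExampleBot` and the URLs are all hypothetical, and the helper names are my own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser, user_agent, url):
    """Ask whether the site's rules permit this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for url in ("http://example.com/books/index.html",
                "http://example.com/private/draft.html"):
        print(url, may_crawl(rules, "ExampleBot", url))
```

Note that the default here is inclusion: a page is crawled unless its owner says otherwise, and compliance is voluntary on the crawler's part. That is the same default the publisher opt-out sets for books.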

Crawford came back with some nice points, most significantly that the publishers were trying to make a pretty egregious "double dip" into the value of their books. Google, by creating a searchable digital index of book texts -- "a card catalogue on steroids," as she put it -- and even generating revenue by placing ads alongside search results, is making a transformative use of the published material and should not have to seek permission. Google had a good idea. And it is an eminently fair use.

And it's not Google's idea alone; they just had it first and are using it to gain a competitive advantage over their search engine rivals, who, in their turn, have tried to get in on the game with the Open Content Alliance (which, incidentally, has decided not to make a stand on fair use as Google has, and is doing all its scanning and indexing in the context of license agreements). Publishers, too, are welcome to build their own databases and to make them crawl-able by search engines. Earlier this week, HarperCollins announced it would be doing exactly that with about 20,000 of its titles. Aiken and Adler say that if anyone can scan books and make a search engine, then all hell will break loose and millions of digital copies will be leaked into the web. Crawford shot back that this lawsuit is not about net security issues, it is about fair use.

But once the security cat was let out of the bag, the room turned noticeably against Google (perhaps due to a preponderance of publishing lawyers in the audience). Aiken and Adler worked hard to stir up anxiety about rampant ebook piracy, even as Crawford repeatedly tried to keep the discussion on course. It was very interesting to hear, right from the horse's mouth, that the Authors' Guild and AAP both are convinced that the ebook market, tiny as it currently is, is within a few years of exploding, pending the release of some sort of iPod-like gadget for text. At that point, they say, Google will have gained a huge strategic advantage off the back of appropriated content.

Their argument hinges on the fourth determining factor in the fair use exception, which evaluates "the effect of the use upon the potential market for or value of the copyrighted work." So the publishers are suing because Google might be cornering a potential market!!! (Crawford goes further into this in her wrap-up.) Of course, if Google wanted to go into the ebook business using the material in their database, there would have to be a licensing agreement, otherwise they really would be pirating. But the suits are not about a future market, they are about creating a search service, which should be ruled fair use. If publishers are so worried about the future ebook market, then they should start planning for business.

To echo Crawford, I sincerely hope these cases reach the court and are not settled beforehand. Larger concerns about Google's expansionist program aside, I think they have made a very brave stand on the principle of fair use, the essential breathing space carved out within our over-extended copyright laws. Crawford reminded the room that intellectual property is NOT like physical property, over which the owner has nearly unlimited rights. Copyright is a "temporary statutory monopoly" originally granted ("with hesitation," Crawford adds) in order to incentivize creative expression and the production of ideas. The internet scares the old-guard publishing industry because it poses so many threats to the security of their product. These threats are certainly significant, but they are not the subject of these lawsuits, nor are they Google's, or any search engine's, fault. The rise of the net should not become a pretext for limiting or abolishing fair use.

Posted by ben vershbow at 03:50 PM | Comments (2)
tags: Copyright and Copyleft , Libraries, Search and the Web , copyright , ebooks , fair_use , google , google_book_search , publishing

wikipedia update: author of seigenthaler smear confesses Post date  12.12.2005, 10:14 AM

According to a Dec 11 New York Times article, Daniel Brandt, a book indexer who runs the site Wikipedia Watch, helped to flush out the man who posted the false biography of USA Today and Freedom Forum founder John Seigenthaler on Wikipedia. After Brandt discovered the post issued from a small delivery company in Nashville, the man in question -- 38-year-old Brian Chase -- sent a letter of apology to Seigenthaler and resigned from his job as operations manager at the company.

According to the Times, Chase claims that he didn't realize that Wikipedia was used as a serious research tool: he posted the information to shock a co-worker who was familiar with the Seigenthaler family. Seigenthaler, who complained in a USA Today editorial last week about the protections afforded to the "volunteer vandals" who post anonymously in cyberspace, told the New York Times that he would not seek damages from Chase.

Responding to the fallout from Seigenthaler's USA Today editorial, Wikipedia founder James Wales changed Wikipedia's policies so that posters now must all be registered with Wikipedia. But, as Brandt shows, it takes work to remain anonymous in cyberspace. Though I'm not sure that I believe Chase's professed astonishment that anyone would take his post seriously (why else would it shock his co-worker?), it seems clear that he didn't think what he was doing so outrageous that he ought to make a serious effort to hide his tracks.

Meanwhile, Wales has become somewhat irked by Seigenthaler's continuing attacks on Wikipedia. Posting to the threaded discussion of the issue on the mailing list of the Association for Internet Researchers, Wikipedia's founder expressed exasperation about Seigenthaler's telling the Associated Press this morning that "Wikipedia is inviting [more regulation of the internet] by its allowing irresponsible vandals to write anything they want about anybody." Wales wrote:

*sigh* Facts about our policies on vandalism are not hard to come by. A statement like Seigenthaler's, a statement that is egregiously false, would not last long at all at Wikipedia.

For the record, it is just absurd to say that Wikipedia allows "irresponsible vandals to write anything they want about anybody."

--Jimbo

Posted by lisa lynch at 10:14 AM | Comments (1)
tags: Libraries, Search and the Web , seigenthaler , wikipedia

the poetry archive - nice but a bit mixed up Post date  12.09.2005, 11:40 AM

Last week U.K. Poet Laureate Andrew Motion and recording producer Richard Carrington rolled out The Poetry Archive, a free (sort of) web library that aims to be "the world's premier online collection of recordings of poets reading their work" -- "to help make poetry accessible, relevant and enjoyable to a wide audience." The archive naturally focuses on British poets, but offers a significant selection of English-language writers from the U.S. and the British Commonwealth countries. Seamus Heaney is serving as president of the archive.

For each poet, a few streamable mp3s are available, including some rare historic recordings dating back to the earliest days of sound capture, from Robert Browning to Langston Hughes. The archive also curates a modest collection of children's poetry, and invites teachers to use these and other recordings in the classroom, also providing tips for contacting poets so schools, booksellers and community organizations (again, this is focused on Great Britain) can arrange readings and workshops. While some of this advice seems useful, it reads more like a public relations/education services page on a publisher's website. Is this a public archive or a poets' guild?

The Poetry Archive is a nice resource as both historic repository and contemporary showcase, but the mission seems a bit muddled. They say they're an archive, but it feels more like a CD store.

Throughout, the archive seems an odd mix of public service and professional leverage for contemporary poets. That's all well and good, but it could stand a bit more of the former. Beyond the free audio offerings (which are quite skimpy), CDs are available for purchase that include a much larger selection of recordings. The archive is non-profit, and they seem to be counting in significant part on these sales to maintain operations. Still, I would add more free audio, and focus on selling individual recordings and playlists as downloads -- the iTunes model. Having streaming teasers and for-sale CDs as the only distribution models seems wrong-headed, and a bit disingenuous if they are to call themselves an archive. It would also be smart to sell subscriptions to the entire archive, with institutional rates for schools. Podcasting would also be a good idea -- a poem a day to take with you on your iPod, weaving poetry into daily life.

There's a growing demand on the web for the spoken word, from audiobooks and podcasts to performed poetry. The archive would probably do a lot better if they made more of their collection free, and at the same time provided a greater variety of ways to purchase recordings.

Posted by ben vershbow at 11:40 AM | Comments (2)
tags: Libraries, Search and the Web , archive , audio , audiobooks , library , literature , mp3 , poetry , sound

tipping point? Post date  12.08.2005, 7:36 AM

An article by Eileen Gifford Fenton and Roger C. Schonfeld in this morning's Inside Higher Ed claims that over the past year, libraries have accelerated the transition towards purchasing only electronic journals, leaving many publishers of print journals scrambling to make the transition to an online format:

Faced with resource constraints, librarians have been required to make hard choices, electing not to purchase the print version but only to license electronic access to many journals — a step more easily made in light of growing faculty acceptance of the electronic format. Consequently, especially in the sciences, but increasingly even in the humanities, library demand for print has begun to fall. As demand for print journals continues to decline and economies of scale of print collections are lost, there is likely to be a tipping point at which continued collecting of print no longer makes sense and libraries begin to rely only upon journals that are available electronically.

According to Fenton and Schonfeld, this imminent "tipping point" will be a good thing for larger publishing houses which have already begun to embrace an electronic-only format, but smaller nonprofit publishers might "suffer dramatically" if they don't have the means to convert to an electronic format in time. If they fail, and no one is positioned to help them, "the alternative may be the replacement of many of these journals with blogs, repositories, or other less formal distribution models."

Fenton and Schonfeld's point that electronic distribution might substantially change the format of some smaller journals echoes other expressions of concern about the rise of "informal" academic journals and repositories, mainly voiced by scientists who worry about the decline of peer review. Most notably, the Royal Society of London issued a statement on Nov. 24 warning that peer-reviewed scientific journals were threatened by the rise of "open access journals, archives and repositories."

According to the Royal Society, the main problem in the sciences is that government and nonprofit funding organizations are pressing researchers to publish in open-access journals, in order to "stop commercial publishers from making profits from the publication of research that has been funded from the public purse." While this is a noble principle, the Society argued, it undermines the foundations of peer review and compels scientists to publish in formats that might be unsustainable:

The worst-case scenario is that funders could force a rapid change in practice, which encourages the introduction of new journals, archives and repositories that cannot be sustained in the long term, but which simultaneously forces the closure of existing peer-reviewed journals that have a long-track record for gradually evolving in response to the needs of the research community over the past 340 years. That would be disastrous for the research community.

There's more than a whiff of resistance to change in the Royal Society's citing of 340 years of precedent; more to the point, however, their position statement downplays the depth of the fundamental opposition between the open access movement in science and traditional journals. As Roger Chartier notes in a recent issue of Critical Inquiry, "Two different logics are at issue here: the logic of free communication, which is associated with the ideal of the Enlightenment that upheld at the sharing of knowledge, and the logic of publishing based on the notion of author's rights and commercial gain."

As we've discussed previously on if:book, the fate of peer review in the electronic age is an open question: as long as peer review is tied to the logic of publishing, its fate will be determined at least as much by the still-evolving market for electronic distribution as by the needs of the various research communities which have traditionally valued it as a method of assessment.

Posted by lisa lynch at 07:36 AM | Comments (0)
tags: Education , Libraries, Search and the Web , library , peer_review , publishing , royal_society_of_london

google libraries podcast now available Post date  12.07.2005, 11:33 AM

In case you missed Open Source's Monday hour on Google Book Search... Listen here. Podcast RSS here. Show summary here.

Posted by ben vershbow at 11:33 AM | Comments (1)
tags: Libraries, Search and the Web , ebook , google , google_book_search , google_print , library , podcast , publishing

google on the air Post date  12.06.2005, 12:34 AM

librarybrazil.jpg

Open Source's hour on the Googlization of libraries was refreshingly light on the copyright issue and heavier on questions about research, reading, the value of libraries, and the public interest. With its book-scanning project, Google is a private company taking on the responsibilities of a public utility, and Siva Vaidhyanathan came down hard on one of the company's chief legal reps for the mystery shrouding their operations (scanning technology, algorithms and ranking system are all kept secret). The rep reasonably replied that Google is not the only digitization project in town and that none of its library partnerships are exclusive. But most of his points were pretty obvious PR boilerplate about Google's altruism and gosh darn love of books. Hearing the counsel's slick defense, your gut tells you it's right to be suspicious of Google and to keep demanding more transparency, clearer privacy standards and so on. If we're going to let this much information come into the hands of one corporation, we need to be very active watchdogs.

Our friend Karen Schneider then joined the fray and as usual brought her sage librarian's perspective. She's thrilled by the possibilities of Google Book Search, seeing as it solves the fundamental problem of library science: that you can only search the metadata, not the texts themselves. But her enthusiasm is tempered by concerns about privatization similar to Siva's and a conviction that a research service like Google can never replace good librarianship and good physical libraries. She also took issue with the fact that Book Search doesn't link to other library-related search services like Open Worldcat. She has her own wrap-up of the show on her blog.

Rounding out the discussion was Matthew G. Kirschenbaum, a cybertext studies blogger and professor of English at the University of Maryland. Kirschenbaum addressed the question of how Google, and the web in general, might be changing, possibly eroding, our reading practices. He nicely put the question in perspective, suggesting that scattershot, inter-textual, "snippety" reading is in fact the older kind of reading, and that the idea of sustained, deeply immersed involvement with a single text is largely a romantic notion tied to the rise of the novel in the 18th century.

A satisfying hour, all in all, of the sort we should be having more often. It was fun brainstorming with Brendan Greeley, the Open Source "blogger-in-chief," on how to put the show together. Their whole bit about reaching out to the blogosphere for ideas and inspiration isn't just talk. They put their money where their mouth is. I'll link to the podcast when it becomes available.

image: Real Gabinete Português de Literatura, Rio de Janeiro - Claudio Lara via Flickr

Posted by ben vershbow at 12:34 AM | Comments (2)
tags: Libraries, Search and the Web , copyright , digitization , ebook , google , google_book_search , google_print , library , literature , metadata , reading , search

thinking about google books: tonight at 7 on radio open source Post date  12.05.2005, 4:58 PM

While visiting the Experimental Television Center in upstate New York this past weekend, Lisa found a wonderful relic in a used book shop in Owego, NY -- a small, leatherbound volume from 1962 entitled "Computers," which IBM used to give out as a complimentary item. An introductory note on the opening page reads:

The machines do not think -- but they are one of the greatest aids to the men who do think ever invented! Calculations which would take men thousands of hours -- sometimes thousands of years -- to perform can be handled in moments, freeing scientists, technicians, engineers, businessmen, and strategists to think about using the results.

This echoes Vannevar Bush's seminal 1945 essay on computing and networked knowledge, "As We May Think", which more or less prefigured the internet, web search, and now, the migration of print libraries to the world wide web. Google Book Search opens up fantastic possibilities for research and accessibility, enabling readers to find in seconds what before might have taken them hours, days or weeks. Yet it also promises to transform the very way we conceive of books and libraries, shaking the foundations of major institutions. Will making books searchable online give us more time to think about the results of our research, or will it change the entire way we think? By putting whole books online do we begin the steady process of disintegrating the idea of the book as a bounded whole and not just a sequence of text in a massive database?

The debate thus far has focused too much on the legal ramifications -- helped in part by a couple of high-profile lawsuits from authors and publishers -- failing to take into consideration the larger cognitive, cultural and institutional questions. Those questions will hopefully be given ample air time tonight on Radio Open Source.

Tune in at 7pm ET on local public radio or stream live over the web. The show will also be available later in the week as a podcast.

Posted by ben vershbow at 04:58 PM | Comments (0)
tags: Libraries, Search and the Web , books , copyright , ebook , google , google_book_search , google_print , library , literature , radio , research , university

the role of note taking in the information age Post date  12.03.2005, 3:19 PM

An article by Ann Blair in a recent issue of Critical Inquiry (vol 31 no 1) discusses the changing conceptions of the function of note-taking from about the sixth century to the present, and ends with a speculation on the way that textual searches (such as Google Book Search) might change practices of note-taking in the twenty-first century. Blair argues that "one of the most significant shifts in the history of note taking" occurred in the beginning of the twentieth century, when the use of notes as memorization aids gave way to the use of notes as an aid to replace the memorization of too-abundant information. With the advent of the net, she notes:

Today we delegate to sources that we consider authoritative the extraction of information on all but a few carefully specialized areas in which we cultivate direct experience and original research. New technologies increasingly enable us to delegate more tasks of remembering to the computer, in that shifting division of labor between human and thing. We have thus mechanized many research tasks. It is possible that further changes would affect even the existence of note taking. At a theoretical extreme, for example, if every text one wanted were constantly available for searching anew, perhaps the note itself, the selection made for later reuse, might play a less prominent role.

The result of this externalization, Blair notes, is that we come to think of long-term memory as something that is stored elsewhere, in "media outside the mind." At the same time, she writes, "notes must be rememorated or absorbed in the short-term memory at least enough to be intelligently integrated into an argument; judgment can only be applied to experiences that are present to the mind."

Blair's article doesn't say that this bifurcation between short-term and long-term memory is a problem: she simply observes it as a phenomenon. But there's a resonance between Blair's article and Naomi Baron's recent Los Angeles Times piece on Google Book Search: both point to the fact that what we commonly have defined as scholarly reflection has increasingly become a process of database management. Baron seems to see reflection and database management as being in tension, though I'm not completely convinced by her argument. Blair, less apocalyptic than Baron, nonetheless gives me something to ponder. What happens to us if (or when) all of our efforts to make the contents of our extrasomatic memory "present to our mind" happen without the mediation of notes? Blair's piece focuses on the epistemology rather than the phenomenology of note taking -- still, she leads me to wonder what happens if the mediating function of the note is lost, when the triangular relation between book, scholar and note becomes a relation between database and user.

Posted by lisa lynch at 03:19 PM | Comments (1)
tags: Libraries, Search and the Web , book , google , internet , note_taking , search

killing the written word? Post date  12.02.2005, 10:41 AM

A November 28 Los Angeles Times editorial by American University linguistics professor Naomi Baron adds another element to the debate over Google Print [now called Google Book Search, though Baron does not use this name]: Baron claims that her students are already clamoring for the abridged, extracted texts and have begun to feel that book-reading is passé. She writes:

Much as automobiles discourage walking, with undeniable consequences for our health and girth, textual snippets-on-demand threaten our need for the larger works from which they are extracted... In an attempt to coax students to search inside real books rather than relying exclusively on the Web for sources, many professors require references to printed works alongside URLs. Now that those "real" full-length publications are increasingly available and searchable online, the distinction between tangible and virtual is evaporating.... Although [the debate over Google Print] is important for the law and the economy, it masks a challenge that some of us find even more troubling: Will effortless random access erode our collective respect for writing as a logical, linear process? Such respect matters because it undergirds modern education, which is premised on thought, evidence and analysis rather than memorization and dogma. Reading successive pages and chapters teaches us how to follow a sustained line of reasoning.

As someone who's struggled to get students to go to the library while writing their papers, I think Baron's making a very important and immediate pedagogical point: what will professors do after Google Book Search allows their students to access bits of "real books" online? Will we simply establish a policy of not allowing the online excerpted material to "count" in our tally of students' assorted research materials?

On the other hand, I can see the benefits of having a student use Google Book Search in their attempt to compile an annotated bibliography for a research project, as long as they were then required to look at a version of the longer text (whether on or off-line). I'm not positive that "effortless random access" needs to be diametrically opposed to instilling the practice of sustained reading. Instead, I think we've got a major educational challenge on our hands whose exact dimensions won't be clear until Google Book Search finally gets going.

Also: thanks to UVM English Professor Richard Parent for posting this article on his blog, which has some interesting ruminations on the future of the book.

Posted by lisa lynch at 10:41 AM | Comments (7)
tags: Libraries, Search and the Web , google_book_search , literacy

katrina archive on internet archive Post date  12.01.2005, 2:26 PM

The Internet Archive has just established an archive dedicated to preserving the online response to the Katrina catastrophe. According to the Archive:

The Internet Archive and many individual contributors worked together to put together a comprehensive list of websites to create a historical record of the devastation caused by Hurricane Katrina and the massive relief effort which followed. This collection has over 25 million unique pages, all text searchable, from over 1500 sites. The web archive commenced on September 4th.

If you try to link to the Internet Archive today, you might not get through, because everyone is on the site talking about the Grateful Dead's decision to allow free downloading.

Posted by lisa lynch at 02:26 PM | Comments (0)
tags: Libraries, Search and the Web , archive , internet , katrina

google print on deck at radio open source Post date  12.01.2005, 8:07 AM

Open Source, the excellent public radio program (not to be confused with "Open Source Media") that taps into the blogosphere to generate its shows, has been chatting with me about putting together an hour on the Google library project. Open Source is a unique hybrid, drawing on the best qualities of the blogosphere -- community, transparency, collective wisdom -- to produce an otherwise traditional program of smart talk radio. As host Christopher Lydon puts it, the show is "fused at the brain stem with the world wide web." Or better, it "uses the internet to be a show about the world."

The Google show is set to air live this evening at 7pm (ET) (they also podcast). It's been fun working with them behind the scenes, trying to figure out the right guests and questions for the ideal discussion on Google and its bookish ambitions. My exchange has been with Brendan Greeley, the Radio Open Source "blogger-in-chief" (he's kindly linked to us today on their site). We agreed that the show should avoid getting mired in the usual copyright-focused news peg -- publishers vs. Google etc. -- and focus instead on the bigger questions. At my suggestion, they've invited Siva Vaidhyanathan, who wrote the wonderful piece in the Chronicle of Higher Ed. that I talked about yesterday (see bigger questions). I've also recommended our favorite blogger-librarian, Karen Schneider (who has appeared on the show before), science historian George Dyson, who recently wrote a fascinating essay on Google and artificial intelligence, and a bunch of cybertext studies people: Matthew G. Kirschenbaum, N. Katherine Hayles, Jerome McGann and Johanna Drucker. If all goes well, this could end up being a very interesting hour of discussion. Stay tuned.

UPDATE: Open Source just got a hold of Nicholas Kristof to do an hour this evening on Genocide in Sudan, so the Google piece will be pushed to next week.

Posted by ben vershbow at 08:07 AM | Comments (0)
tags: Libraries, Search and the Web , Online , copyright , google , google_book_search , google_print , library , open_source , podcast , publishing , radio , radio_open_source , search , web

sober thoughts on google: privatization and privacy Post date  11.30.2005, 8:18 AM

nypl reading room.jpg

Siva Vaidhyanathan has written an excellent essay for the Chronicle of Higher Education on the "risky gamble" of Google's book-scanning project -- some of the most measured, carefully considered comments I've yet seen on the issue. His concerns are not so much for the authors and publishers that have filed suit (on the contrary, he believes they are likely to benefit from Google's service), but for the general public and the future of libraries. Outsourcing to a private company the vital task of digitizing collections may prove to have been a grave mistake on the part of Google's partner libraries. Siva:

The long-term risk of privatization is simple: Companies change and fail. Libraries and universities last.....Libraries should not be relinquishing their core duties to private corporations for the sake of expediency. Whichever side wins in court, we as a culture have lost sight of the ways that human beings, archives, indexes, and institutions interact to generate, preserve, revise, and distribute knowledge. We have become obsessed with seeing everything in the universe as "information" to be linked and ranked. We have focused on quantity and convenience at the expense of the richness and serendipity of the full library experience. We are making a tremendous mistake.

This essay contains in abundance what has largely been missing from the Google books debate: intellectual courage. Vaidhyanathan, an intellectual property scholar and "avowed open-source, open-access advocate," easily could have gone the predictable route of scolding the copyright conservatives and spreading the Google gospel. But he manages to see the big picture beyond the intellectual property concerns. This is not just about economics, it's about knowledge and the public interest.

What irks me about the usual debate is that it forces you into a position of either resisting Google or being its apologist. But this fails to get at the real bind we all are in: the fact that Google provides invaluable services and yet is amassing too much power; that a private company is creating a monopoly on public information services. Sooner or later, there is bound to be a conflict of interest. That is where we, the Google-addicted public, are caught. It's more complicated than hip versus square, or good versus evil.

Here's another good piece on Google. On Monday, The New York Times ran an editorial by Adam Cohen that nicely lays out the privacy concerns:

Google says it needs the data it keeps to improve its technology, but it is doubtful it needs so much personally identifiable information. Of course, this sort of data is enormously valuable for marketing. The whole idea of "Don't be evil," though, is resisting lucrative business opportunities when they are wrong. Google should develop an overarching privacy theory that is as bold as its mission to make the world's information accessible - one that can become a model for the online world. Google is not necessarily worse than other Internet companies when it comes to privacy. But it should be doing better.

Two graduate students at Stanford in the mid-90s recognized that search engines would be the most important tools for dealing with the incredible flood of information that was then beginning to swell, so they started indexing web pages and working on algorithms. But as the company has grown, Google's admirable-sounding mission statement -- "to organize the world's information and make it universally accessible and useful" -- has become its manifest destiny, and "information" can now encompass the most private of territories.

At one point it simply meant search results -- the answers to our questions. But now it's the questions as well. Google is keeping a meticulous record of our clickstreams, piecing together an enormous database of queries, refining its search algorithms and, some say, even building a massive artificial brain (more on that later). What else might they do with all this personal information? To date, all of Google's services are free, but there may be a hidden cost.

"Don't be evil" may be the company motto, but with its IPO earlier this year, Google adopted a new ideology: they are now a public corporation. If web advertising (their sole source of revenue) levels off, then investors currently high on $400+ shares will start clamoring for Google to maintain profits. "Don't be evil to us!" they will cry. And what will Google do then?

images: New York Public Library reading room by Kalloosh via Flickr; archive of the original Google page

Posted by ben vershbow at 08:18 AM | Comments (7)
tags: Copyright and Copyleft , Libraries, Search and the Web , books , copyright , ethics , google , google_book_search , google_print , intellectual_property , libraries , library , literature , privacy , publishing , university

virtual libraries, real ones, empires Post date  11.28.2005, 12:36 PM

Last Tuesday, a Washington Post editorial written by Librarian of Congress James Billington outlined the possible benefits of a World Digital Library, a proposed LOC endeavor discussed last week in a post by Ben Vershbow. Billington seemed to imagine the library as sort of a United Nations of information: claiming that "deep conflict between cultures is fired up rather than cooled down by this revolution in communications," he argued that a US-sponsored, globally inclusive digital library could serve to promote harmony over conflict:

Libraries are inherently islands of freedom and antidotes to fanaticism. They are temples of pluralism where books that contradict one another stand peacefully side by side just as intellectual antagonists work peacefully next to each other in reading rooms. It is legitimate and in our nation's interest that the new technology be used internationally, both by the private sector to promote economic enterprise and by the public sector to promote democratic institutions. But it is also necessary that America have a more inclusive foreign cultural policy -- and not just to blunt charges that we are insensitive cultural imperialists. We have an opportunity and an obligation to form a private-public partnership to use this new technology to celebrate the cultural variety of the world.

What's interesting about this quote (among other things) is that Billington seems to be suggesting that a World Digital Library would function in much the same manner as a real-world library, and yet he's also arguing for the importance of actual physical proximity. He writes, after all, about books literally, not virtually, touching each other, and about researchers meeting up in a shared reading room. There seems to be a tension here, in other words, between Billington's embrace of the idea of a world digital library, and a real anxiety about what a "library" becomes when it goes online.

I also feel like there's some tension here — in Billington's editorial and in the whole World Digital Library project — between "inclusiveness" and "imperialism." Granted, if the United States provides Brazilians access to their own national literature online, this might be used by some as an argument against the idea that we are "insensitive cultural imperialists." But there are many varieties of empire: indeed, as many have noted, the sun stopped setting on Google's empire a while ago.

To be clear, I'm not attacking the idea of the World Digital Library. Having watched the Smithsonian invest in, and waffle on, some of their digital projects, I'm all for a sustained commitment to putting more material online. But there needs to be some careful consideration of the differences between physical libraries and virtual ones -- as well as a bit more discussion of just what a privately funded digital library might eventually morph into.

Posted by lisa lynch at 12:36 PM | Comments (0)
tags: Libraries, Search and the Web , cultural , digital , google , imperialism , internet , libraries

explosion Post date  11.22.2005, 2:10 PM

A Nov. 18 post on Adam Green's Darwinian Web makes the claim that the web will "explode" (does he mean implode?) over the next year. According to Green, RSS feeds will render many websites obsolete:

The explosion I am talking about is the shifting of a website's content from internal to external. Instead of a website being a "place" where data "is" and other sites "point" to, a website will be a source of data that is in many external databases, including Google. Why "go" to a website when all of its content has already been absorbed and remixed into the collective datastream.

Does anyone agree with Green? Will feeds bring about the restructuring of "the way content is distributed, valued and consumed?" More on this here.

Posted by lisa lynch at 02:10 PM | Comments (5)
tags: Libraries, Search and the Web , Online , Publishing, Broadcast, and the Press , RSS , blogging , blogs , darwin , darwinism , google , internet , singularity , syndication , web , xml

world digital library Post date  11.22.2005, 7:41 AM

The Library of Congress has announced plans for the creation of a World Digital Library, "a shared global undertaking" that will make a major chunk of its collection freely available online, along with contributions from other national libraries around the world. From The Washington Post:

...[the] goal is to bring together materials from the United States and Europe with precious items from Islamic nations stretching from Indonesia through Central and West Africa, as well as important materials from collections in East and South Asia.

Google has stepped forward as the first corporate donor, pledging $3 million to help get operations underway. At this point, there doesn't appear to be any direct connection to Google's Book Search program, though Google has been working with LOC to test and refine its book-scanning technology.

Posted by ben vershbow at 07:41 AM | Comments (0)
tags: Libraries, Search and the Web , books , digital , google , library , library_of_congress , literature , preservation , scanning

online retail influencing libraries Post date  11.21.2005, 12:07 PM

The NY Times reports on new web-based services at university libraries that are incorporating features such as personalized recommendations, browsing histories, and email alerts, the sort of thing developed by online retailers like Amazon and Netflix to recreate some of the experience of browsing a physical store. Remember Ranganathan's fourth law of library science: "save the time of the reader." The reader and the customer are perhaps becoming one and the same.

It would be interesting if a social software system were emerging for libraries that allowed students and researchers to work alongside librarians in organizing the stacks. Automated recommendations are just the beginning. I'm talking more about value added by the readers themselves (Amazon does this with reader reviews, Listmania, and So You'd Like To...). A social card catalogue with a tagging system and other reader-supplied metadata where readers could leave comments and bread crumb trails between books. Each card catalogue entry with its own blog and wiki to create a context for the book. Books are not just surrounded by other volumes on the shelves, they are surrounded by people, other points of view, affinities -- the kinds of things that up to this point were too vaporous to collect. This goes back to David Weinberger's comment on metadata and Google Book Search.

Posted by ben vershbow at 12:07 PM | Comments (3)
tags: Libraries, Search and the Web , Social Software , books , folksonomy , librarian , library , metadata , reading , social_software , tagging , taxonomy

google print is no more Post date  11.18.2005, 8:06 AM

Not the program, of course, just the name. From now on it is to be known as Google Book Search. "Print" obviously struck a little too close to home with publishers and authors. On the company blog, they explain the shift in emphasis:

No, we don't think that this new name will change what some folks think about this program. But we do believe it will help a lot of people understand better what we're doing. We want to make all the world's books discoverable and searchable online, and we hope this new name will help keep everyone focused on that important goal.

Posted by ben vershbow at 08:06 AM | Comments (1)
tags: Libraries, Search and the Web , books , copyright , google , google_book_search , google_print , publishing , search

the book in the network - masses of metadata Post date  11.15.2005, 6:42 PM

In this weekend's Boston Globe, David Weinberger delivers the metadata angle on Google Print:

...despite the present focus on who owns the digitized content of books, the more critical battle for readers will be over how we manage the information about that content -- information that's known technically as metadata.

...we're going to need massive collections of metadata about each book. Some of this metadata will come from the publishers. But much of it will come from users who write reviews, add comments and annotations to the digital text, and draw connections between, for example, chapters in two different books.

As the digital revolution continues, and as we generate more and more ways of organizing and linking books-integrating information from publishers, libraries and, most radically, other readers-all this metadata will not only let us find books, it will provide the context within which we read them.

The book in the network is a barnacled spirit, carrying with it the sum of its various accretions. Each book is also its own library by virtue not only of what it links to itself, but of what its readers are linking to, of what its readers are reading. Each book is also a milk crate of earlier drafts. It carries its versions with it. A lot of weight for something physically weightless.

Posted by ben vershbow at 06:42 PM | Comments (0)
tags: ISBN , Libraries, Search and the Web , books , ebook , electronic_literature , folksonomy , google , google_print , hypertext , library , literature , marginalia , metadata , social_software , tagging , weinberger

having browsed google print a bit more... Post date  11.14.2005, 4:53 AM

...I realize I was over-hasty in dismissing the recent additions made since book scanning resumed earlier this month. True, many of the fine wines in the cellar are there only for the tasting, but the vintage stuff can be drunk freely, and there are already some wonderful 19th century titles, at this point mostly from Harvard. The surest way to find them is to search by date, or by title and date. Specify a date range in advanced search or simply enter, for example, "date: 1890" and a wealth of fully accessible texts comes up, any of which can be linked to from a syllabus. An astonishing resource for teachers and students.

The conclusion: Google Print really is shaping up to be a library, that is, of the world pre-1923 -- the current line of demarcation between copyright and the public domain. It's a stark reminder of how over-extended copyright is. Here's an 1899 English printing of The Mahabharata:

mahabharata.jpg

A charming detail found on the following page is this old Harvard library stamp that got scanned along with the rest:

mahabharata harvard stamp.jpg

Posted by ben vershbow at 04:53 AM | Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web , OCR , copyright , ebook , fair_use , google , google_print , library , mahabharata , scan

google print's not-so-public domain Post date  11.03.2005, 4:16 PM

Google's first batch of public domain book scans is now online, representing a smattering of classics and curiosities from the collections of libraries participating in Google Print. Essentially snapshots of books, they're not particularly comfortable to read, but they are keyword-searchable and, since no copyright applies, fully accessible.

The problem is, there really isn't all that much there. Google's gotten a lot of bad press for its supposedly cavalier attitude toward copyright, but spend a few minutes browsing Google Print and you'll see just how publisher-centric the whole affair is. The idea of a text being in the public domain really doesn't amount to much if you're only talking about antique manuscripts, and these are the only books that they've made fully accessible. Daisy Miller's copyright expired long ago but, with the exception of Harvard's illustrated 1892 copy, all the available scanned editions are owned by modern publishers and are therefore only snippeted. This is not an online library, it's a marketing program. Google Print will undeniably have its uses, but we shouldn't confuse it with a library.

(An interesting offering from the stacks of the New York Public Library is this mid-19th century biographic registry of the wealthy burghers of New York: "Capitalists whose wealth is estimated at one hundred thousand dollars and upwards...")

Posted by ben vershbow at 04:16 PM | Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web , OCR , books , copyright , ebook , google , google_print , library , literature , public_domain , scan

a better wikipedia will require a better conversation Post date  10.28.2005, 1:04 PM

There's an interesting discussion going on right now under Kim's Wikibooks post about how an open source model might be made to work for the creation of authoritative knowledge -- textbooks, encyclopedias etc. A couple of weeks ago there was some discussion here about an article that, among other things, took some rather cheap shots at Wikipedia, quoting (very selectively) a couple of shoddy passages. Clearly, the wide-open model of Wikipedia presents some problems, but considering the advantages it presents (at least in potential) -- never out of date, interconnected, universally accessible, bringing in voices from the margins -- critics are wrong to dismiss it out of hand. Holding up specific passages for critique is like shooting fish in a barrel. Even Wikipedia's directors admit that most of the content right now is of middling quality, some of it downright awful. It doesn't then follow to say that the whole project is bunk. That's a bit like expelling an entire kindergarten for poor spelling. Wikipedia is at an early stage of development. Things take time.

Instead we should be talking about possible directions in which it might go, and how it might be improved. Dan for one, is concerned about the market (excerpted from comments):

What I worry about...is that we're tearing down the old hierarchies and leaving a vacuum in their wake.... The problem with this sort of vacuum, I think, is that capitalism tends to swoop in, simply because there are more resources on that side....

...I'm not entirely sure if the world of knowledge functions analogously, but Wikipedia does presume the same sort of tabula rasa. The world's not flat: it tilts precariously if you've got the cash. There's something in the back of my mind that suspects that Wikipedia's not protected against this – it's kind of in the state right now that the Web as a whole was in 1995 before the corporate world had discovered it. If Wikipedia follows the model of the web, capitalism will be sweeping in shortly.

Unless... the experts swoop in first. Wikipedia is part of a foundation, so it's not exactly just bobbing in the open seas waiting to be swept away. If enough academics and librarians started knocking on the door saying, hey, we'd like to participate, then perhaps Wikipedia (and Wikibooks) would kick up to the next level. Inevitably, these newcomers would insist on setting up some new vetting mechanisms and a few useful hierarchies that would help ensure quality. What would these be? That's exactly the kind of thing we should be discussing.

The Guardian ran a nice piece earlier this week in which they asked several "experts" to evaluate a Wikipedia article on their particular subject. They all more or less agreed that, while what's up there is not insubstantial, there's still a long way to go. The biggest challenge then, it seems to me, is to get these sorts of folks to give Wikipedia more than just a passing glance. To actually get them involved.

For this to really work, however, another group needs to get involved: the users. That might sound strange, since millions of people write, edit and use Wikipedia, but I would venture that most are not willing to rely on it as a bedrock source. No doubt, it's incredibly useful to get a basic sense of a subject. Bloggers (including this one) link to it all the time -- it's like the conversational equivalent of a reference work. And for certain subjects, like computer technology and pop culture, it's actually pretty solid. But that hits on the problem right there. Wikipedia, even at its best, has not gained the confidence of the general reader. And though the Wikimaniacs would be loath to admit it, this probably has something to do with its core philosophy.

Karen G. Schneider, a librarian who has done a lot of thinking about these questions, puts it nicely:

Wikipedia has a tagline on its main page: "the free-content encyclopedia that anyone can edit." That's an intriguing revelation. What are the selling points of Wikipedia? It's free (free is good, whether you mean no-cost or freely-accessible). That's an idea librarians can connect with; in this country alone we've spent over a century connecting people with ideas.

However, the rest of the tagline demonstrates a problem with Wikipedia. Marketing this tool as a resource "anyone can edit" is a pitch oriented at its creators and maintainers, not the broader world of users. It's the opposite of Ranganathan's First Law, "books are for use." Ranganathan wasn't writing in the abstract; he was referring to a tendency in some people to fetishize the information source itself and lose sight that ultimately, information does not exist to please and amuse its creators or curators; as a common good, information can only be assessed in context of the needs of its users.

I think we are all in need of a good Wikipedia, since in the long run it might be all we've got. And I'm in no way opposed to its spirit of openness and transparency (I think the preservation of version histories is a fascinating element and one which should be explored further -- perhaps the encyclopedia of the future can encompass multiple versions of "the truth"). But that exhilarating throwing open of the doors should be tempered with caution and with an embrace of the parts of the old system that work. Not everything need be thrown away in our rush to explore the new. Some people know more than other people. Some editors have better judgement than others. There is such a thing as a good kind of gatekeeping.

If these two impulses could be brought into constructive dialogue then we might get somewhere. This is exactly the kind of conversation the Wikimedia Foundation should be trying to foster.

Posted by ben vershbow at 01:04 PM | Comments (8)
tags: Education , Libraries, Search and the Web , Online , authority , encyclopedia , library , open_source , web , wiki , wikibooks , wikimedia , wikipedia

microsoft joins open content alliance Post date  10.26.2005, 9:06 AM

Microsoft's forthcoming "MSN Book Search" is the latest entity to join the Open Content Alliance, the non-controversial rival to Google Print. ZDNet says: "Microsoft has committed to paying for the digitization of 150,000 books in the first year, which will be about $5 million, assuming costs of about 10 cents a page and 300 pages, on average, per book..."
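
As a rough check on the ZDNet estimate, the quoted assumptions can simply be multiplied out. The sketch below uses only the figures in the quote (150,000 books, about 300 pages each, about 10 cents a page); the function name is made up for illustration. It comes to roughly $4.5 million, which squares with "about $5 million" once rounded up.

```python
def digitization_cost_cents(books: int, pages_per_book: int, cents_per_page: int) -> int:
    """Total scanning cost in cents: books x average pages x per-page cost."""
    return books * pages_per_book * cents_per_page


# Figures taken from the ZDNet quote; all of them are stated there as approximations.
total_cents = digitization_cost_cents(books=150_000, pages_per_book=300, cents_per_page=10)
total_dollars = total_cents / 100

print(f"${total_dollars:,.0f}")  # prints $4,500,000
```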

Apparently having learned from Google's mistakes, OCA operates under a strict "opt-in" policy for publishers vis-a-vis copyrighted works (whereas with Google, publishers have until November 1 to opt out). Judging by the growing roster of participants, including Yahoo, the National Archives of Britain, the University of California, Columbia University, and Rice University, not to mention the Internet Archive, it would seem that less hubris equals more results, or at least lower legal fees. Supposedly there is some communication between Google and OCA about potential cooperation.

Also a story in the NY Times.

Posted by ben vershbow at 09:06 AM | Comments (2)
tags: Libraries, Search and the Web , Microsoft , OCA , books , brewster_kahle , copyright , google , google_print , library , open_content_alliance , search , web , yahoo

to some writers, google print sounds like a sweet deal Post date  10.25.2005, 9:25 AM

Wired has a piece today about authors who are in favor of Google's plans to digitize millions of books and make them searchable online. Most seem to agree that obscurity is a writer's greatest enemy, and that the exposure afforded by Google's program far outweighs any intellectual property concerns. Sometimes to get more you have to give a little.

The article also mentions the institute.

Posted by ben vershbow at 09:25 AM | Comments (0)
tags: Libraries, Search and the Web , Publishing, Broadcast, and the Press , books , copyright , google , google_print , publishing , search , web , writing

debating google print Post date  10.22.2005, 5:53 PM

The Washington Post has run a pair of op-eds, one from each side of the Google Print dispute. Neither says anything particularly new. Moreover, they reinforce the perception that there can be only two positions on the subject -- an endemic problem in newspaper opinion pages with their addiction to binaries, where two cardboard boxers are allotted their space to throw a persuasive punch. So you're either for Google or against it? That's awfully close to you're either for technology -- for progress -- or against it. Unfortunately, like technology's impact, the Google book-scanning project is a little trickier to figure out, and a more nuanced conversation is probably in order.

The first piece, "Riches We Must Share...", is submitted in support of Google by University of Michigan President Sue Coleman (a partner in the Google library project). She argues that opening up the elitist vaults of the world's great (English) research libraries will constitute a democratic revolution. "We believe the result can be a widening of human conversation comparable to the emergence of mass literacy itself." She goes on to deliver some boilerplate about the "Net Generation" -- too impatient to look for books unless they're online etc. etc. (great to see a major university president being led by the students instead of leading herself).

Coleman then devotes a couple of paragraphs to the copyright question, failing to tackle any of its controversial elements:

Universities are no strangers to the responsible management of complex copyright, permission and security issues; we deal with them every day in our classrooms, libraries, laboratories and performance halls. We will continue to work within the current criteria for fair use as we move ahead with digitization.

The problem is, Google is stretching the current criteria of fair use, possibly to the breaking point. Coleman does not acknowledge or address this. She does, however, remind the plaintiffs that copyright is not only about the owners:

The protections of copyright are designed to balance the rights of the creator with the rights of the public. At its core is the most important principle of all: to facilitate the sharing of knowledge, not to stifle such exchange.

All in all a rather bland statement in support of open access. It fails to weigh in on the fair use question -- something about which the academy should have a few things to say -- and does not indicate any larger concern about what Google might do with its books database down the road.

The opposing view, "...But Not at Writers' Expense", comes from Nick Taylor, writer, and president of the Authors Guild (which sued Google last month). Taylor asserts that mega-rich Google is trampling on the dignity of working writers. But a couple of paragraphs in, he gets a little mixed up about contemporary publishing:

Except for a few big-name authors, publishers roll the dice and hope that a book's sales will return their investment. Because of this, readers have a wealth of wonderful books to choose from.

A dubious assessment, since publishing conglomerates are not exactly enthusiastic dice rollers. I would counter that risk-averse corporate publishing has steadily shrunk the number of available titles, counting on a handful of blockbusters to drive the market. Taylor goes on to defend not just the publishing status quo, but the legal one:

Now that the Authors Guild has objected, in the form of a lawsuit, to Google's appropriation of our books, we're getting heat for standing in the way of progress, again for thoughtlessly wanting to be paid. It's been tradition in this country to believe in property rights. When did we decide that socialism was the way to run the Internet?

First of all, it's funny to think of the huge corporations that dominate the web as socialist. Second, this talk about being paid for appropriating books for a search database is revealing of the two totally different worldviews that are at odds in this struggle. The authors say that any use of their book requires a payment. Google sees including the books in the database as a kind of payment in itself. No one with a web page expects Google to pay them for indexing their site. They are grateful that they do! Otherwise, they are totally invisible. This is the unspoken compact that underpins web search. Google assumed the same would apply with books. Taylor says not so fast.

Here's Taylor on fair use:

Google contends that the portions of books it will make available to searchers amount to "fair use," the provision under copyright that allows limited use of protected works without seeking permission. That makes a private company, which is profiting from the access it provides, the arbiter of a legal concept it has no right to interpret. And they're scanning the entire books, with who knows what result in the future.

Actually, Google is not doing all the interpreting. There is a legal precedent for Google's reading of fair use established in the 2003 9th Circuit Court decision Kelly v. Arriba Soft. In the case, Kelly, a photographer, sued Arriba Soft, an online image search system, for indexing several of his photographs in their database. Kelly believed that his intellectual property had been stolen, but the court ruled that Arriba's indexing of thumbnail-sized copies of images (which always linked to their source sites) was fair use: "Arriba's use of the images serves a different function than Kelly's use – improving access to information on the internet versus artistic expression." Still, Taylor's "with who knows what result in the future" concern is valid.

So on the one hand we have many writers and most publishers trying to defend their architecture of revenue (or, as Taylor would have it, their dignity). But I can't imagine how Google Print would really be damaging that architecture, at least not in the foreseeable future. Rather it leverages it by placing it within the frame of another architecture: web search. The irony for the authors is that the current architecture doesn't seem to be serving them terribly well. With print-on-demand gaining in quality and legitimacy, online book search could totally re-define what is an acceptable risk to publishers, and maybe more non-blockbuster authors would get published.

On the other hand we have the universities and libraries participating in Google's program, delivering the good news of accessibility. But they are not sufficiently questioning what Google might do with its database down the road, or the implications of a private technology company becoming the principal gatekeeper of the world's corpus.

If only this debate could be framed in a subtler way, rather than the for-Google-or-against-it paradigm we have now. I'm cautiously optimistic about the effect of having books searchable on the web. And I tend to believe it will be beneficial to authors and publishers. But I have other, deep reservations about the direction in which Google is heading, and feel that a number of things could go wrong. We think the censorship of the marketplace is bad now in the age of publishing conglomerates. What if one company has total control of everything? And is keeping track of every book, every page, that you read. And is reading you while you read, throwing ads into your peripheral vision. I'm curious to hear from readers what they feel could be the hazards of Google Print.

Posted by ben vershbow at 05:53 PM | Comments (4)
tags: Libraries, Search and the Web , Publishing, Broadcast, and the Press , academy , books , copyright , google , google_print , michigan , publishing , writing

google is sued... again Post date  10.20.2005, 8:08 AM

This time by publishers. Penguin Group USA, McGraw-Hill, Pearson Education, Simon & Schuster and John Wiley & Sons. The gripe is the same as with the Authors Guild, which filed suit last month alleging "massive copyright infringement." Publishers fear a dangerous precedent is set by Google's scanning of books to construct what amounts to a giant card catalogue on the web. Google claims "fair use" (see rationale), again pointing out that for copyrighted works only tiny "snippets" of text are displayed around keywords (though perhaps this is not yet fully in effect - I was searching around in this book and was able to look at quite a lot).

Google calls the publishers' suit "near-sighted." And it probably is. The benefit to readers and researchers will be tremendous, as will (Google is eager to point out) the exposure for authors and publishers. But Google Print is undoubtedly an earth-shaking program. Look at the reaction in Europe, where alarm bells rung by France warned of cultural imperialism, an English-drenched web. Heads of state and culture convened and initial plans for a European digital library have been drawn up.

What the transatlantic flap makes clear is that Google's book scanning touches a deep nerve, and the argument over intellectual property, significant though it is, distracts from a more profound human anxiety -- an anxiety about the form of culture and the shape of thoughts. If we try to grope back through the millennia, we can find an analogy in the invention of writing.

The shift from oral to written language froze speech into stable strings that could be transmitted and stored over distance and time. This change not only affected the modes of communication, it dramatically refigured the cognitive makeup of human beings (as McLuhan, Ong and others have described). We are currently going through another such shift. The digital takes the freezing medium of text and throws it back into fluidity. Like the melting of polar ice caps, it unsettles equilibriums, changes weather patterns. It is a lot to adjust to, and we wonder if our great-great-grandchildren will literally think differently from us.

But in spite of this disorienting new fluidity, we still have print, we still have the book. And actually, Google Print in many ways affirms this since its search returns will point to print retailers and brick-and-mortar libraries. Yet the fact remains that the canon is being scanned, with implications we can't fully perceive, and future uses we can't fully predict, and so it is understandable that many are unnerved. The ice is really beginning to melt.

In Phaedrus, Plato expresses a similar anxiety about the invention of writing. He tells the tale of Theuth, an Egyptian deity who goes around spreading the new technology, and one day encounters a skeptic in King Thamus:

...you who are the father of letters, from a paternal love of your own children have been led to attribute to them a power opposite to that which they in fact possess. For this discovery of yours will create forgetfulness in the minds of those who learn to use it; they will not exercise their memories, but, trusting in external, foreign marks, they will not bring things to remembrance from within themselves. You have discovered a remedy not for memory, but for reminding. You offer your students the appearance of wisdom, not true wisdom. They will be hearers of many things and will have learned nothing; they will appear to be omniscient and will generally know nothing; they will be tiresome company, having the show of wisdom without the reality.

As I type, I'm exhibiting wisdom without the reality. I've read Plato, but nowhere near exhaustively. Yet I can slash and weave texts on the web in seconds, throw together a blog entry and send it screeching into the commons. And with Google Print I can get the quote I need and let the rest of the book rot behind the security fence. This fluidity is dangerous because it makes connections so easy. Do we know what we are connecting?

Posted by ben vershbow at 08:08 AM | Comments (5)
tags: Copyright and Copyleft , Libraries, Search and the Web , Transliteracies , copyright , google , literacy , mcluhan , ong , plato , publishing , search , web

google expands book-scanning project to europe Post date  10.18.2005, 8:56 AM

This week Google will be paying a visit to the Frankfurt Book Fair to talk with European publishers and chief librarians (including arch nemesis Jean-Noël Jeanneney) about eight new local incarnations of Google Print. (more)

Posted by ben vershbow at 08:56 AM | Comments (0)
tags: Libraries, Search and the Web , Online , books , copyright , ebook , europe , frankfurt , google , internet , library , publishing , search , web

nicholas carr on "the amorality of web 2.0" Post date  10.17.2005, 9:00 AM

Nicholas Carr, who writes about business and technology and formerly was an editor of the Harvard Business Review, has published an interesting though problematic piece on "the amorality of web 2.0". I was drawn to the piece because it seemed to be questioning the giddy optimism surrounding "web 2.0", specifically Kevin Kelly's rapturous late-summer retrospective on ten years of the world wide web, from Netscape IPO to now. While he does poke some much-needed holes in the carnival floats, Carr fails to adequately address the new media practices on their own terms and ends up bashing Wikipedia with some highly selective quotes.

Carr is skeptical that the collectivist paradigms of the web can lead to the creation of high-quality, authoritative work (encyclopedias, journalism etc.). Forced to choose, he'd take the professionals over the amateurs. But put this way it's a Hobson's choice. Flawed as it is, Wikipedia is in its infancy and is probably not going away. Whereas the future of Britannica is less sure. And it's not just amateurs that are participating in new forms of discourse (take as an example the new law faculty blog at U. Chicago). Anyway, here's Carr:

The Internet is changing the economics of creative work - or, to put it more broadly, the economics of culture - and it's doing it in a way that may well restrict rather than expand our choices. Wikipedia might be a pale shadow of the Britannica, but because it's created by amateurs rather than professionals, it's free. And free trumps quality all the time. So what happens to those poor saps who write encyclopedias for a living? They wither and die. The same thing happens when blogs and other free on-line content go up against old-fashioned newspapers and magazines. Of course the mainstream media sees the blogosphere as a competitor. It is a competitor. And, given the economics of the competition, it may well turn out to be a superior competitor. The layoffs we've recently seen at major newspapers may just be the beginning, and those layoffs should be cause not for self-satisfied snickering but for despair. Implicit in the ecstatic visions of Web 2.0 is the hegemony of the amateur. I for one can't imagine anything more frightening.

He then has a nice follow-up in which he republishes a letter from an administrator at Wikipedia, which responds to the above.

Encyclopedia Britannica is an amazing work. It's of consistent high quality, it's one of the great books in the English language and it's doomed. Brilliant but pricey has difficulty competing economically with free and apparently adequate....

...So if we want a good encyclopedia in ten years, it's going to have to be a good Wikipedia. So those who care about getting a good encyclopedia are going to have to work out how to make Wikipedia better, or there won't be anything.

Let's discuss.

Posted by ben vershbow at 09:00 AM | Comments (5)
tags: Libraries, Search and the Web , OS , Online , Publishing, Broadcast, and the Press , Social Software , Web2.0 , amateur , blog , blogging , blogs , book , books , britannica , collective , encyclopedia , encyclopedia_britannica , internet , journalism , mainstream_media , media , msm , open_content , open_source , publishing , web , web_2.0 , wiki , wikipedia

google dystopia Post date  10.10.2005, 10:06 AM

Google as big brother -- the paranoia certainly seems to be creeping into the mainstream. "Op-Art" by Randy Siegel from today's NY Times:

google 2084.jpg

Posted by ben vershbow at 10:06 AM | Comments (0)
tags: 1984 , 2084 , Libraries, Search and the Web , NYTimes , Online , algorithm , art , cartoon , dystopia , editorial , google , information , internet , newspaper , orwell , paranoia , privacy , satire , search , technology , web

welcome to the 19th century Post date  10.10.2005, 12:30 AM

The following was posted by Gary Frost as a comment to our post on Neil Postman's "Building a Bridge to the 18th Century." Gary recently returned from the Mississippi coast where he was part of a team helping to assess library and museum damage after Katrina.

The mystic advise that we walk into the darkness. Postman’s only qualification is that we do futurism with the right gear. But we cannot wander off into the future with enough AA batteries. An archeologist at the storm damaged Jefferson Davis presidential library greeted me saying; “Welcome to the 19th century.” He was not kidding. No water, no electricity, no gas, no groceries. He was digging up the same artifacts for the second time in the immense debris fields left by Katrina.

We were driven to a manuscript era and we were invigorated to do our best. Strangely the cell phones worked and we talked to Washington from the 19th century. We asked if the Nation was still interested in the culture of the deep south. Not really, Transformers were at work and in our mobile society the evacuees had left for good. The army trucks were building new roads over the unmarked gravesites of 3000 Confederate veterans, who in their old age, came to Jeff Davis’ home to die.

We were left hanging about the future and technologies were a sidebar. It wasn’t really important that the 19th century had invented instantaneous communication, digital encoding or photographic representation or that the 21st century was taking the credit for its exploitation of these accomplishments. The gist was that the future deserved to be informed and not deluded. The gist was that the future would be fulfilled as a measure of its use of the accomplishments of a much longer past.

Posted by ben vershbow at 12:30 AM | Comments (1)
tags: Libraries, Search and the Web , archive , book , books , confederacy , confederate , digital , gulf , gulf_coast , history , hurricane , hurricane_katrina , jefferson_davis , katrina , library , literature , mississippi , paper , preservation , progress , reading , rescue , south , technology

ubu, king again Post date  10.07.2005, 1:21 AM

It's nice to see that UbuWeb, the great public web library of the avant garde, is back online after "a long summer of rebuilding." At times when the web feels depressingly shallow, Ubu can be the perfect medicine. Among the many masterworks you will find is Samuel Beckett's "Film" (1965), starring a very old Buster Keaton. It's wonderful that anyone can watch this online (I've just spent half an hour in its thrall).

beckett film.jpg

Also worth checking out are /ubu Editions - handsomely designed electronic texts ranging across an interesting selection of poetry, prose and theatre, including Ron Silliman's "The Chinese Notebook," which Dan blogged about a couple weeks back. These, like everything else on Ubu, are free.

Posted by ben vershbow at 01:21 AM | Comments (1)
tags: Libraries, Search and the Web , Online , avant_garde , avantgarde , beckett , buster_keaton , curated , ebook , experimental , fiction , film , gallery , internet , keaton , library , media , museum , music , poetry , samuel_beckett , silliman , theatre , ubu , ubuweb , web

yahoo! announces book-scanning project to rival google's Post date  10.03.2005, 2:00 PM

Yahoo, in collaboration with The Internet Archive, Adobe, O'Reilly Media, Hewlett Packard Labs, the University of California, the University of Toronto, The National Archives of England, and others, will be participating in The Open Content Alliance, a book and media archiving project that will greatly enlarge the body of knowledge available online. At first glance, it appears the program will focus primarily on public domain works, and in the case of copyrighted books, will seek to leverage the Creative Commons.

Google Print, on the other hand, is more self-consciously a marketing program for publishers and authors (although large portions of the public domain will be represented as well). Google aims to make money off its indexing of books through keyword advertising and click-throughs to book vendors. Yahoo throwing its weight behind the "open content" movement seems on the surface to be more of a philanthropic move, but clearly expresses a concern over being outmaneuvered in the search wars. But having this stuff available online is clearly a win for the world at large.

The Alliance was conceived in large part by Brewster Kahle of the Internet Archive. He announced the project on Yahoo's blog:

To kick this off, Internet Archive will host the material and sometimes helps with digitization, Yahoo will index the content and is also funding the digitization of an initial corpus of American literature collection that the University of California system is selecting, Adobe and HP are helping with the processing software, University of Toronto and O'Reilly are adding books, Prelinger Archives and the National Archives of the UK are adding movies, etc. We hope to add more institutions and fine tune the principles of working together.

Initial digitized material will be available by the end of the year.

More in:
NY Times
Chronicle of Higher Ed.

Posted by ben vershbow at 02:00 PM | Comments (0)
tags: Libraries, Search and the Web , archive , book , books , brewster_kahle , digital , digitize , ebook , google , google_print , googleprint , internet_archive , kahle , library , literature , reading , scanning , yahoo , yahoo!

learning from failure: the dot com archive Post date  09.22.2005, 11:37 AM

The University of Maryland's Robert H. Smith School of Business is building an archive of primary source documents related to the dot com boom and bust. The Business Plan Archive contains business plans, marketing plans, venture presentations and other business documents from thousands of failed and successful Internet start-ups. In the upcoming second phase of the project, the archive's creator, assistant professor David A. Kirsch, will collect oral histories from investors, entrepreneurs, and workers, in order to create a complete picture of the so-called internet bubble.

With support from the Alfred P. Sloan Foundation, The Library of Congress, and Maryland's business school, Mr. Kirsch is creating a teaching tool as well as an historical archive. Students in his management and organization courses at Maryland's School of Business must choose a company from the archive and analyze what went wrong (or right). Scholars and students at other institutions are also using it for course assignments and research.

An article in the Chronicle of Higher Education, Creating an Archive of Failed Dot-Coms, points out that Mr. Kirsch won't profit much, despite the success of the archive.

Mr. Kirsch concedes that spending his time building an online archive might not be the best marketing strategy for an assistant professor who would like to earn tenure and a promotion. Online scholarship, he says, does not always generate the same respect in academic circles that publishing hardcover books does.

"My database has 39,000 registered users from 70 countries," he says. "If that were my book sales, it would be the best-selling academic book of the year."

Even so, Mr. Kirsch believes, the archive fills an important role in preserving firsthand materials.

"Archivists and scholars normally wait around for the records of the past to cascade down through various hands to the netherworld of historical archives," he says. "With digital records, we can't afford to wait."

Posted by Kim White at 11:37 AM | Comments (0)
tags: Libraries, Search and the Web , archive , business , businessplan , dotcom , history_of_interactive_media , internet , primarysource

making visible the invisible: george legrady installation at seattle central library Post date  09.16.2005, 6:37 PM

A nice companion piece to the "database of intentions" is George Legrady's new installation, "Making Visible the Invisible," at the Rem Koolhaas-designed Seattle Central Library. Six large LED display panels suspended above the "mixing chamber" on the library's fifth floor display a series of visualizations depicting the circulation of library books and other media across time and classification area, providing "a real-time living picture of what the community is thinking."

(image: "KeyWord Map Attack")

Legrady described the project at the Transliteracies conference this past June in Santa Barbara. At that time, Bob blogged:

the pinpoint accuracy of computer-searches, leaves those of us lucky enough to have spent time in library stacks, nostalgic for the unexpected discovery of something we didn't know we were looking for but which just happened, serendipitously, to be on a nearby shelf. George Legrady, artist and prof at UC Santa Barbara, just showed a project he is working on for the new public library in Seattle that gave the first glimpse of serendipity in online library searching which lets you see all the books that have recently been checked out on a particular subject. Beautiful and Exciting.

(image: "Vital Statistics")

(image: "Floating Titles")

(image: "Dot Matrix Rain")


Other observations:

"New piece for Central Library pushes art to the technical edge" in Seattle Post Intelligencer

Information Aesthetics profile

Posted by ben vershbow at 06:37 PM | Comments (0)
tags: Libraries, Search and the Web , architecture , art , book , books , circulation , datavisualization , georgelegrady , infovis , infoviz , installation , koolhaas , legrady , library , public , reading , remkoolhaas , sculpture , seattle , visualization

the database of intentions Post date  09.16.2005, 11:16 AM

Interesting edition of Open Source last week on "Google Sociology" with David Weinberger and John Battelle, author of the just-published "The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture". Listen here.

Weinberger has some interesting things to say about Google (and the other search engines) as "publishers." I have some thoughts on that too. More to come later.

Battelle has done a great deal of thinking on search from a variety of angles: the technology of search, the economics of search, and the more esoteric dimensions of a "search" culture. He touches briefly on this last point, laying out a construct that is probably treated more extensively in his book: the "database of intentions." By this he means the archive, or "artifact," of the world's search queries. A picture of the collective consciousness formed by the questions everyone is asking. Even now, when logged in to Google, a history of all your search query strings is kept - your own database of intentions. The potential value of this database is still being determined, but obvious uses are targeted advertising, and more relevant search results based on analysis of search histories.

As regards the collective database of intentions, Battelle speculates that future advances in artificial intelligence will likely draw on this enormous crop of information about how humans think and seek.

Posted by ben vershbow at 11:16 AM | Comments (0)
tags: Libraries, Search and the Web , Online , algorithm , audio , battelle , database , google , internet , listen , opensource , podcast , radio , radioopensource , search , searchengine , web , weinberger

the virtual library: lending audio books online Post date  08.30.2005, 9:03 AM

This past year, most of my reading (for better or for worse) has been done online. When I visit my local library it is to check out DVDs, or to take my son to story hour, or to use the library's free wireless. When I am there, I notice that many of the other patrons are there for the same reason. There's always a waiting list for the computers and a line of patrons with arms full of DVDs waiting to check out.

It's no surprise that libraries are looking for ways to extend these popular digital offerings in order to better serve their patrons and to stay relevant in the digital age. A recent article in Technology Review by Michael Hill reports that libraries have "considered the needs of younger readers and those too busy to visit," and are beginning to offer downloadable digital audio books. "This is a way for us to have library access 24/7," says Barbara Nichols Randall, director of the Guilderland Public Library in suburban Albany. As an added bonus, you never have to worry about late fees. Here's how it works:

A patron with a valid library card visits a library Web site to borrow a title for, say, three weeks. When the audiobook is due, the patron must renew it or find it automatically "returned" in a virtual sense: The file still sits on the patron's computer, but encryption makes it unplayable beyond the borrowing period.
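The lending rule just described can be sketched in a few lines of Python. This is only an illustration of the date logic, not OverDrive's actual system: the `Loan` class and its method names are hypothetical, and a real service enforces the cutoff through encryption and licensing rather than a simple date comparison.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Loan:
    """A hypothetical record of one borrowed audiobook."""
    title: str
    checked_out: datetime
    period: timedelta = timedelta(weeks=3)  # "say, three weeks"

    @property
    def due(self) -> datetime:
        # The moment the file stops playing.
        return self.checked_out + self.period

    def is_playable(self, now: datetime) -> bool:
        # The file stays on the patron's computer either way;
        # only playback is cut off once the period ends.
        return now < self.due

    def renew(self, now: datetime) -> "Loan":
        # Renewing starts a fresh lending period from the renewal date.
        return Loan(self.title, now, self.period)
```

Nothing has to be returned or collected here: once `is_playable` turns false the title is, in effect, back in the collection, which is why a late fee can never arise.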

"The patron doesn't have to do anything after the lending period," said Steve Potash, chief executive of OverDrive. "The file expires. It checks itself back into the collection. There's no parts to lose. It's never damaged. It can never be late."

Posted by Kim White at 09:03 AM | Comments (0)
tags: Libraries, Search and the Web , OverDrive , audiobook , download , lending , library

lexis nexis as multimedia library Post date  08.02.2005, 12:20 PM

Lexis Nexis, an indispensable resource for the more-than-casual web researcher, announced it will be adding video to its news and archive database, as part of its pay-as-you-go AlaCarte service. Clips will cost a few bucks apiece, more or less what a text article does now, and can be emailed to other readers for an extra fee.

The service will be powered by Critical Mention, a Manhattan start-up with a growing video database and several big licensing agreements already under its belt. The move into video parallels recent developments at Google, and particularly Yahoo!, whose video search engine makes it easier to track down clips across the web. But Lexis Nexis will be aimed at more rigorous researchers, primarily businesses, universities, and government agencies.

This brings a broadcast medium into what has traditionally been textual territory, underscoring some of the big questions: what does it mean to "write" with video? What does it mean to "quote" video?

(image: Critical Mention)

Posted by ben vershbow at 12:20 PM | Comments (0)
tags: Libraries, Search and the Web

the librarian in the techno-spa Post date  07.20.2005, 6:54 PM

The uncritical embrace of technology plagues American universities and consumers alike, whose credo is "adopt first, ask questions later." An assistant professor of English from an unnamed Midwestern liberal arts college writes in the Chronicle of Higher Education of his dismay at the changes underway in American university libraries, where traditional stacks are left to deteriorate while money is lavished on fancy "techno-spas," transforming research sanctuaries into digital rec centers. The article is written in response to a very real trend, brought to wider public attention in a NY Times article in May about an initiative at the University of Texas, Austin library in which approximately 90,000 books are to be relocated from the Flawn Academic Center to other libraries around campus to make way for a "24-hour information commons."

Benton's rhapsodizing on the pleasures of the stacks can be trying:

I once had a useful, relevant book fall on my head like Newton's apple. Perhaps it was pushed there by some ghostly scholar, one of my forebears whom I might consider myself privileged to join in the posthumous academy of spectral stack walkers.

But his overall criticism is correct. Many universities have adopted a servile stance, catering to what they perceive to be a new breed of restless, multi-tasking student. But the "customer is always right" philosophy probably isn't doing the students any favors in the long run. A generation is coming of age lost somewhere between the old print-based hierarchies of knowledge and the new Googlesque. And they aren't receiving much in the way of guidance. A university president needs shiny groves of sleek new computers to wow the funders and alumnae, just as he needs a winning football team. The business of universities and the business of technology march ahead together without much thought for what kind of citizen they might be producing.

From Benton:

Library administrators have had to make hard choices as costs have risen, their missions have expanded, and their budgets have failed to keep pace. But I am not so sure that the techno-spa model should be adopted so uncritically. Who will profit most from the transformation now and in the future, as fees and updates for new technologies continue indefinitely? Is that transformation really about the demands of students? If so, should we conform to their expectations, or make an effort to reshape them against the grain of the culture?

Alas, at many institutions, there is no longer much room for books on our central campuses. But we do have room for coffee bars, sports facilities, and a collection of other expensive, space-consuming amenities.

For that reason, I find it hard to accept that digitization is motivated primarily by constrained budgets and limited space. The money is there, and so is the space. It's just that colleges want to spend the money and use the space for something else that, presumably, will make them more competitive among students who are, perhaps, more interested in amenities than education.

One purpose of universities is to provide insulation from the world at large for the cultivation of sensitive minds. Universities might consider extending this principle to technology, applying the brakes on what could be a runaway train. The Amish, who, to say the least, are loath to adopt new technologies, ask first, when confronted with a new invention, how it might change them. We could learn something from that. The answer isn't to hold candlelight vigils for the death of the card catalogue or the scribbled margin note, but rather to ask at each step how this is changing us, and whether we think it is a good thing.

(image by kendrak, via Flickr)

Posted by ben vershbow at 06:54 PM | Comments (2)
tags: Libraries, Search and the Web

publishers fire another volley at google library Post date  07.18.2005, 12:57 PM

Last week, the Association of Learned and Professional Society Publishers (ALPSP) joined the escalating chorus of concern over the legality of Google's library project, echoing a letter from the Association of American University Presses in May warning that by digitizing library collections without the consent of publishers, Google was about to perpetrate a massive violation of copyright law. The library project has been a troublesome issue for the search king ever since it was announced last December. Resistance first came from across the Atlantic where French outrage led a unified European response to Google's perceived anglo-imperialism, resulting in plans to establish a European digital library. More recently, it has come from the anglos themselves, namely publishers, who, in the case of the ALPSP, "absolutely dispute" Google's claim that the project falls within the "fair use" section of the US Copyright Act. From the ALPSP statement (download PDF):

The Association of Learned and Professional Society Publishers calls on Google to cease unlicensed digitisation of copyright materials with immediate effect, and to enter into urgent discussions with representatives of the publishing industry in order to arrive at an appropriate licensing solution for ‘Google Print for Libraries’. We cannot believe that a business which prides itself on its cooperation with publishers could seriously wish to build part of its business on a basis of copyright infringement.

In the relatively brief history of intellectual property, libraries have functioned as a fair use zone - a haven for the cultivation of minds, insulated from the marketplace of ideas. As the web breaks down boundaries separating readers from remote collections, with Google stepping in as chief wrecking ball, the idea of fair use is being severely tested.

Posted by ben vershbow at 12:57 PM | Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web , Publishing, Broadcast, and the Press

vimeo open to the public - new constellations in the sky Post date  07.15.2005, 1:13 PM

In February, I stumbled upon a wonderful new site for storing and sharing video clips, which, until recently, was being tested in closed beta. Now fully open to the public, Vimeo aims to do for short form video what Flickr does for photos, openly citing the photo-sharing phenom as inspiration. Right now, it's pretty basic. Create a free account and you can start uploading compressed clips (8MB weekly limit), adding tags, and browsing what other users have put up. Over time, I expect they'll start adding some Flickr-esque features (like in-house email, groups, video sets, calendar, favorites and who knows what else). When Vimeo first came onto my radar, they had an interesting feature that allowed you to string several clips together within a single tag, creating an ad hoc montage. They still say on their "about" page that "several clips can be played together to create a movie," but I could no longer figure out how to do that.

All in all, given how troublesome it can be to get video working on the web, Vimeo seems to be off to a very smooth start. Something I hope they figure out is how to make it easy for users to post video to blogs. If they could rig up a basic form that automatically embeds a clip into a blog post, it would be a tremendous boon to the incipient video blogging community. And if they could provide basic video editing tools, then they might have something really big on their hands. (I've posted here my inaugural upload to Vimeo - a column from the ruins at Caesarea, from my recent trip to Israel.)

Though just barely off the ground, I have a feeling that Vimeo could evolve into something serious. Another exciting launch is Odeo, the podcast hub. Taken together as a constellation, these three ventures - Flickr, Vimeo, and Odeo - are constructing the beginnings of a vast media commons, tiny when compared with the giant 20th century media industries, but maybe not for long (see post on the London bombings). Another recently launched site, ourmedia, seeks to create a similar kind of homebrew media repository, offering (through a partnership with the Internet Archive) to "host your media forever - for free." But so far, I've been much more impressed with the aforementioned image/video/sound trio. Different media present different challenges, and there's something to be said for doing one thing really well rather than trying to do everything sort of well. I've found ourmedia's interface frustrating. It's difficult to browse for media, and sometimes hard to open an item once you've found it. Flickr, Vimeo and Odeo all provide a dynamic tagging system, making it much easier for users to dig and explore. ourmedia has no such system. What ourmedia is exploring more intensively (and Odeo too) is the need for an editorial voice, maintaining a rotating roster of volunteer editors, whose duties, among other things, include constructing the site's homepage. Odeo, on the other hand, is more clearly descended from traditional broadcast media, namely radio. The site is organized into channels, each with its own signature mix of programming. But unlike a radio station, an Odeo channel has fluid boundaries. Listeners can pick and choose programs, constructing their own broadcast.

But ultimately, we can't leave it up to these sites to make selections for us. Flickr has its own blog where the site's creators draw attention to noteworthy material. But more interesting is Flickrzen, a blog posting regular "reportages" of the most compelling photos turning up on Flickr. Flickr Pix Photo Magazine is on a similar mission. Of course, with millions of photographs already on Flickr, and thousands pouring in as I write, it would be impossible for any single editorial body to exhaustively survey the whole repository. So there's definitely room for more of these curatorial ventures. They are the next step.

Posted by ben vershbow at 01:13 PM | Comments (5)
tags: Libraries, Search and the Web , Social Software

the internet public library turns ten Post date  07.07.2005, 9:35 PM

The Internet Public Library was created in 1995 by a group of graduate students led by Prof. Joseph Janes at the University of Michigan to "ask interesting and important questions about interconnections of libraries, librarians, and librarianship with a distributed networked environment."

Over the last ten years, the IPL has expanded its mission to create a public service organization and a learning/teaching environment. According to a 2003 press release:

Through the IPL, librarians and library students learn to integrate the use of the Internet into their professional practice. Internet users get help in navigating the sea of information on the Internet in order to find information they actually need and can use. By training librarians, students, and to some extent users, in using, searching, and evaluating the Internet, the IPL improves information literacy, a much-needed skill in the 21st century. Librarians and library students learn from IPL's examples, thus relieving them of the need to constantly "reinvent the wheel." Internet users spend less time wading through garbage and more time getting their real work done.

Posted by Kim White at 09:35 PM | Comments (0)
tags: Libraries, Search and the Web

american libraries are wired, with doors wide open Post date  06.24.2005, 1:15 PM

From today's NY Times: "Almost All Libraries Offer Free Web Access":

The study, which was conducted by researchers at Florida State University, found that 98.9 percent of libraries offer free public Internet access, up from 21 percent in 1994 and 95 percent in 2002. It also found that 18 percent of libraries have wireless Internet access and 21 percent plan to get it within the next year.

Even in an age of online reading, the library still has tremendous significance as a physical commons. When wi-fi coverage in cities becomes comprehensive, we should still be able to get free access at our local library. Another way that public libraries can stay relevant is to offer free on-site access to pay services: things like Lexis-Nexis, subscription-only web periodicals, and even web-delivered movies and television.

Posted by ben vershbow at 01:15 PM | Comments (0)
tags: Libraries, Search and the Web

weaving libraries into the web Post date  06.23.2005, 5:07 PM

A great feature of the Firefox web browser is the little search window built right into the toolbar next to the address field. It's set to Google as a default, but you can add other common search engines or knowledge bases like Yahoo, IMDB, Amazon, eBay, Wikipedia, dictionaries and others - a customized reference suite right in your browser. What if you could put a card catalogue in there too? John Wohlers, of the Todd Library at Waubonsee Community College in Sugar Grove, Illinois has built a searchlet that effectively does this. It's not like Google Print, where you can actually browse scanned copies of the book, but it takes a step toward integrating libraries with the web - an important move if they are to remain relevant in a world where browsers and search engines are the primary research tools.

Wohlers is also working on building library search into desktop tools. Windows users can find instructions here for putting the Todd Library catalogue into your Microsoft® Office 2003 Research Pane.
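At bottom, a toolbar searchlet of this kind turns whatever you type into a query URL and sends the browser there. Here is a rough Python sketch of that step. The catalogue address and the parameter name `q` are placeholders invented for illustration; they are not the Todd Library's real endpoint, and this is not Wohlers' actual plugin.

```python
from urllib.parse import urlencode

# Hypothetical catalogue endpoint, used only for illustration.
CATALOG_SEARCH_URL = "https://catalog.example.edu/search"


def catalog_search_url(terms: str,
                       base: str = CATALOG_SEARCH_URL,
                       param: str = "q") -> str:
    """Build the URL a toolbar search box would open for the given terms."""
    # urlencode escapes spaces and punctuation so the terms are safe in a URL.
    return f"{base}?{urlencode({param: terms})}"
```

Pointing the same function at a different `base` and `param` is all it would take to cover another catalogue, which is roughly why these little plugins are so easy to multiply.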

(via The Shifted Librarian)

Posted by ben vershbow at 05:07 PM | Comments (0)
tags: Libraries, Search and the Web

Gataga - social bookmark search and exploration engine Post date  06.20.2005, 2:43 PM

We came across this the other day - an engine for searching the social bookmarking commons. Gataga allows you to search by tag across several popular web-clipping services including del.icio.us, furl, and others. Gataga's simple interface looks a lot like Google's, but the similarity ends there. The only ranking system is time - the most recent links come up at the top. So Gataga is a nice tool for the moment's glimpse of the links people are saving, but that's about all.
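To make the time-only ranking concrete, here is a small Python sketch of a tag search across several bookmarking feeds. The `Bookmark` record and the `search_by_tag` function are invented for illustration and assume each service can hand over a list of saved links; this is a guess at the general shape, not Gataga's actual code.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Bookmark:
    """A hypothetical saved link from any bookmarking service."""
    url: str
    tags: FrozenSet[str]
    saved_at: datetime
    service: str


def search_by_tag(tag: str, *feeds: Iterable[Bookmark]) -> List[Bookmark]:
    """Pool every feed, keep links carrying the tag, newest first."""
    matches = [b for feed in feeds for b in feed if tag in b.tags]
    # Recency is the only ranking signal: no popularity, no relevance score.
    return sorted(matches, key=lambda b: b.saved_at, reverse=True)
```

Because the sort key is the save date alone, a link saved a minute ago always outranks one that a thousand people saved last month, which is what makes the results feel like a wire service.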

Bit by bit, the web is being catalogued by its users. But at the moment, Gataga (and the rest of these bookmarking tools) works more like a wire service than a library. Tags are sort of like a reporter's "beat" and Gataga provides RSS feeds for all possible queries, so you can track areas of interest. But if you want to use it as an archive, you'll have some pretty serious digging to do.

In the early days of the web, sites sprang up like Voice of the Shuttle (VOS) that thoughtfully catalogued interesting links. The fact that there was a single editor ensured that things stayed fairly organized, that broken links were repaired, and dead ones pruned. But as the web grew, the model quickly became unmanageable. Alan Liu, who single-handedly managed VOS from 1994-1999, said it came to the point where he was spending 2-3 hours per night simply combing for dead links. VOS allowed the community to suggest sites, but the burden of organizing, annotating, and "weeding" fell solely on Liu. The rise of blogs made it easier and less stressful to gather links, but ensured that it was a casual affair - a kind of day-to-day grazing. Of course, all blogs have archives, but they are not terribly useful (Dan talks about this here).

With social bookmarking, we seem to be laying the foundation for something more sustainable - "the only group that can organize everything is everybody." The next step is for librarians, archivists, and new kinds of editors and curators to start making sense of this wilderness of tags.

Posted by ben vershbow at 02:43 PM | Comments (0) | TrackBack
tags: Libraries, Search and the Web

book returned to library 78 years late Post date  06.20.2005, 2:36 AM

In the San Francisco Chronicle:

The Oakland Public Library announced Friday that a man returned an overdue book -- 78 years after his now-deceased aunt checked it out of the Melrose Branch.

Posted by ben vershbow at 02:36 AM | Comments (1)
tags: Libraries, Search and the Web

serendipity Post date  06.18.2005, 1:16 PM

the pinpoint accuracy of computer-searches, leaves those of us lucky enough to have spent time in library stacks, nostalgic for the unexpected discovery of something we didn't know we were looking for but which just happened, serendipitously, to be on a nearby shelf. George Legrady, artist and prof at UC Santa Barbara, just showed a project he is working on for the new public library in Seattle that gave the first glimpse of serendipity in online library searching which lets you see all the books that have recently been checked out on a particular subject. Beautiful and Exciting.

Posted by bob stein at 01:16 PM | Comments (0)
tags: Libraries, Search and the Web , Transliteracies , conferences_and_excursions

reading over your shoulder Post date  06.16.2005, 9:09 AM

A particularly offensive section of the Patriot Act was slapped down yesterday in Congress. From Reuters:

The U.S. House of Representatives on Wednesday defied President Bush by approving a measure making it harder for federal agents to secretly gather information on people's library reading habits and bookstore purchases.

Posted by ben vershbow at 09:09 AM | Comments (0)
tags: Libraries, Search and the Web

web news as gated community Post date  06.10.2005, 10:25 AM

Just found out about this on diglet. Launched in April, The National Digital Newspaper Program (NDNP) is a joint effort of the Library of Congress and the National Endowment for the Humanities to create a comprehensive web archive of the nation's public domain newspapers.

Ultimately, over a period of approximately 20 years, NDNP will create a national, digital resource of historically significant newspapers from all the states and U.S. territories published between 1836 and 1922. This searchable database will be permanently maintained at the Library of Congress (LC) and be freely accessible via the Internet.

(A similar project is getting underway in France.)

It's frustrating that this online collection will stop at 1922. Ordinary libraries maintain up-to-date periodical archives and make them available to anyone willing to make the trip. But if they put those collections on the web, they'll be sued. Archives are one of the few ways newspapers have figured out to make money on the web, so they're not about to let libraries put their microfilm and periodical reading rooms online. The paradigm has flipped: in print, you pay for the current day's edition, but the following day it ends up in the trash, or wrapping a fish. The passage of 24 hours makes it worthless. On the web, most news is free. It's the fish wrap that costs you.

The web has utterly changed what things are worth. When a news site asks them to pay, most people hightail it out of there and never look back. Even being asked to register is enough to deter many readers. But come September, the New York Times will start charging a $50 annual fee for what it considers its most distinctive commodities - editorials, op-eds, and selected other features. Is a full subscription site far off? With its prestige and vast readership, the Times might be able to pull it off. But smaller papers are afraid to start charging, even as they watch their print circulation numbers plummet. If one paper puts up a tollbooth, it instantly becomes irrelevant to millions of readers. There will always be a public highway somewhere nearby.

A friend at the Columbia School of Journalism told me that the only way newspapers can be profitable on the web is if they all join together in some sort of league and charge bulk subscription fees for universal access. If there's a wholesale move to the pay model, then readers will have no choice but to shell out. It will be like paying for cable service, where each newspaper is a separate channel. The only time you register is when you pay the initial fee. From then on, it's clear sailing.

It's a compelling idea, but could just be collective suicide for the newspapers. There will always be free news on offer somewhere. Indian and Chinese wire services might claim the market while the prestigious western press withers away. Or people will turn to state-funded media like the BBC or Xinhua. Then again, people might be willing to pay if it means unfettered access to high quality, independent journalism. And with newspapers finally making money on web subscriptions, maybe they'd start loosening up about their archives.

Posted by ben vershbow at 10:25 AM | Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web , Publishing, Broadcast, and the Press

"an invaluable resource that they had an extremely limited role in creating" Post date  06.09.2005, 2:11 PM

Good piece today in Wired on the transformation of scientific journals. There's a general feeling that commercial publishers like Reed Elsevier enjoy unreasonable control over an evolving body of research that should be freely available to the public. With exorbitant subscription fees, affordable only for large institutions, most journals are effectively inaccessible, and the authors retain few or no reproduction rights. Recently, however, free article databases have sprung up on the web - The Public Library of Science (PLoS), BioMed Central, and NIH's PubMed - some of which, like PLoS, have begun publishing their own journals. It's a welcome change, considering how much labor and treasure is poured into scientific publications (from funders, private and public, and from the scientists themselves), and yet how little is gotten in return. Shifting to a non-profit model, as PLoS has done, preserves much of the financial architecture that supports the production of journals, but totally revolutionizes the distribution.

PLoS journals are free and allow authors to retain their copyrights, as long as they allow their work to be freely shared and distributed (with full credit given, naturally). They also require that authors pay $1,500 from their grants, or directly from their sponsors or institutions, to have their work published. These groups pay the bulk of the $10 billion that goes to scientific and medical publishers each year, and what do they get in return? Limited access to the research they funded, and no right to reuse the information.

"It's ridiculous to give publishers complete control of an invaluable resource that they had an extremely limited role in creating," Eisen said (Eisen teaches genetics and is a founder of PLoS).

But what is in many ways the tougher question is how to shift the architecture of prestige - peer review - to these new kinds of journals.

Posted by ben vershbow at 02:11 PM | Comments (1)
tags: Copyright and Copyleft , Education , Libraries, Search and the Web

visual bookmarks Post date  06.02.2005, 12:21 PM

wist jaws tag.jpg Wists is a visual bookmarking system for the web, doing for images what del.icio.us does for web pages. It's like browsing the web with a camera, or creating your own hand-selected Google image search. Find an image you want to keep track of and Wists will create a thumbnail for you, linking back to the original site. If it's a whole page you want to capture, Wists will take an automatic screenshot of the entire page. Add a title, tags and description and it goes into the system - a photo album of the web. Much like del.icio.us, Wists arranges popular tags on the sidebar and allows you to browse the latest entries. It also enables you to add other users' bookmarks to your own gallery, clearing the slate for your own tags and descriptions. Best of all, it keeps track of people you've taken items from, and people who have taken items from you. Trails become apparent and the archive becomes interconnected. Here's a grab of my "jaws" tag page - combing around for images, I found an amusing juxtaposition.

These are the kind of basic curatorial tools that would be great on Flickr. Currently, you are only able to apply tags to your own photos, or those of friends, family or mutual contacts. But part of the fun of Flickr is browsing the photos of total strangers. You can comment on any photo or mark it as a favorite, but there is no way to curate your own collection of images from the community at large. Wists suggests how the gap between del.icio.us and Flickr might be bridged.

Posted by ben vershbow at 12:21 PM | Comments (0)
tags: Libraries, Search and the Web

Google Print gets its own address Post date  06.01.2005, 11:04 AM

Google Print now has its own exclusive search page. But make no mistake, this is not a library. Google makes it very clear in a paragraph intended to reassure nervous publishers:

Google Print is a book marketing program, not an online library, and as such your entire book will not be made available online unless you expressly permit it.

If you reach your limit of permitted pages you get this:

googleprintrestrictedpage.jpg

Posted by ben vershbow at 11:04 AM | Comments (0)
tags: Libraries, Search and the Web

self-destructing books Post date  05.27.2005, 11:07 AM

In January I bought my first ebook (ISBN: B0000E68Z2), which is published by Wiley. I have one copy on my laptop and a backup on my external harddrive. Last week, I downloaded and installed Adobe Professional (writer 6.0) from our company network (Norwegian School of Management, BI) - during the installation some files from the Adobe version that I downloaded and installed when I bought the ebook (from Amazon.com UK) were deleted. Since then, I have not been able to access my ebook - I have tried to get help from our computer staff but they have not been able to help me.

Adobe thinks that I'm using another computer, while I'm not - and it didn't help to activate the computer through some Adobe DRM Activator stuff. Now I have spent at least 10 hours trying to access my ebook - hope you can help...

Boing Boing points to this story illustrating the fundamental flaws of digital rights management (DRM) - about a Norwegian prof who paid $172 for an ebook on Amazon UK only to have it turn to unreadable gibberish after updating his Acrobat software. He made several pleas for help - to Acrobat, to Wiley (the publisher), and to Amazon. All were in vain. It turns out that after reading the story in Boing Boing (in the past 24 hours, I guess), Wiley finally sent a replacement copy. But the problem of built-in obsolescence in ebooks goes unaddressed.

I'm convinced that encrypting single "copies" is lunacy. For everything we gain with electronic texts - search, multimedia, connection to the network etc. - we lose much in the way of permanence and tactility. DRM software only makes the loss more painful. Publishers need to get away from the idea of selling "copies" and start experimenting with charging for access to a library of titles. You pay for the service, not for the copy. Digital books are immaterial - so the idea of the "copy" has to be revised.

Another example of old thinking with new media is the New York Public Library's ebook collection. That "copies" of electronic titles are set to expire after 21 days is not surprising. The "copy" is "returned" automatically and you sweep the expired file like a husk into the trash. What's incredible is that the library only allows one "copy" to be checked out at a time, entirely defeating one of the primary virtues of electronic books: they can always be in circulation. Clearly terrified by the implications of the new medium (or of the retribution of publishers), the NYPL keeps ebooks on an even tighter tether than they do their print books. As a result, they've set up a service that's too frustrating to use. They should rethink this idea of the single "copy" and save everyone the "quote" marks.

Posted by ben vershbow at 11:07 AM | Comments (1) | TrackBack
tags: Copyright and Copyleft , Libraries, Search and the Web

academic publishers get snippety with Google Post date  05.25.2005, 12:37 PM

Last Friday, the Association of American University Presses (AAUP) sent Google a long letter expressing concern over what might amount to "systematic infringement of copyright on a massive scale" in its library project. BusinessWeek reports. The AAUP letter can be read here. Much of it asks Google to clarify its position on a number of points - to provide, as it were, the fine print on Google Print. Here's a great item:

Snippet is used so consistently in describing Google Print for Libraries that it's taking on the status of a technical term, and thus requires a specific definition. How long is a "snippet?"

Google defends its mass digitization project on the grounds of "fair use" (Section 107 of the US Copyright Act). In other words, it asserts the right to copy copyrighted materials and make them browseable on the web for research purposes as long as they restrict the amount that can be seen for free. Any commercial use of the text will take place only in the context of a publisher agreement. Publishers have the right to opt out, and apparently a couple already have, though most are holding their breath and waiting to see if they might be able to profit from Google's project. The tricky question is, can a book that has been withheld from the publisher program be included in the library program?

You could say that the web is one enormous copying machine. And so fair use questions are more important than ever before. Will Google be the juggernaut that breaks down the door into a more permissive fair use era for all? Or will they use their power to establish an exclusive, Google-only, fair use zone, and set up a cartel with publishers? Or will a few well-aimed lawsuits sink the project before it gets off the ground?

Posted by ben vershbow at 12:37 PM | Comments (0)
tags: Libraries, Search and the Web

librarians set up shop at Wikipedia Post date  05.24.2005, 7:04 AM

"We librarians flatter ourselves that we know a thing or two about organizing information. It's time we stepped up and contributed to Wikipedia: not just to its content but to its structures and technologies. This project page is intended to provide a rallying point for these activities."

Posted by ben vershbow at 07:04 AM | Comments (0)
tags: Libraries, Search and the Web

libraries improve the front end Post date  05.15.2005, 2:01 PM

PCLpatio.gif There's a nice article in yesterday's NY Times on how some university libraries are rethinking how they arrange their space - moving, or redistributing print collections to make way for an "electronic information commons." It's not about abandoning the books, or relegating them to a lesser status. It's more about re-positioning them as a sort of physical database. If a library is a big computer, then the database exists toward the back end. These days, digging through the stacks, even if it results in a paper return, is generally done digitally. That's the front end, or the interface, and this is what smart libraries are seeking to improve. Research and scholarly production may be going digital, but the social, conversational space of the university, and in turn, the library, is still vital. The strategy for the libraries, then, is to restructure and expand that space as a compelling social software environment - one that is both physical and virtual. Sounds good in theory, but I'm not sure that's how these facilities will actually be used. Turning libraries into a hi-tech rec center might be sacrificing more than it saves.

The article focuses on reorganization efforts at the University of Texas at Austin. Press release here concerning the transformation of the Flawn Academic Center (likely a more durable link than the Times story).

(image: Perry-Castañeda Library at UT Austin)

Posted by ben vershbow at 02:01 PM | Comments (0)
tags: Libraries, Search and the Web

Europe aims canon at Google Post date  05.06.2005, 9:21 AM

Google is riding high. With nearly 50 percent market share, it is the most widely used search engine on the web. It is even beginning to act suspiciously like a portal (notice the "login" link tucked discreetly in the upper right corner?), handling your mail, hosting your blog, helping you find files on your desktop, and even storing a history of all your web searches. (No doubt, Google's expansion into web-based applications has Microsoft scared - they've always considered software their turf.) Google recently patented a system that ranks news searches by the quality and credibility of the source. Gorgeous satellite maps have put the surface of the earth in our web browser and left people breathless (see the Google Sightseeing blog, or the memorymap tag on Flickr as examples of sheer exultation in seeing the world through Google maps). Late last year, Google announced plans to digitize and put online major portions of the libraries of Stanford, U. Michigan, Harvard, Oxford and the NY Public Library. And on top of all this, its stock has continued to soar. Already this year, the newly public company shattered predictions, earning $369.1 million in the first quarter alone, more than covering the cost of the projected 10-year library scanning project. It seems there is no limit to what Google might do.

bibliothequepourtous.jpg That is precisely what has Europeans so worried. Across the Atlantic, Google is coming to be seen as yet another symbol of American cultural hegemony, bestriding the web like a colossus. And the library project touches a particularly sensitive nerve, raising questions of cultural heritage - and cultural destiny. If the future of libraries is solely in Google's hands, what will be left out in the process? Will English become the lingua franca not only for politics and commerce, but for all intellectual discourse? Not content simply to ask questions, Europe has responded. In February, Jean-Noel Jeanneney, chief librarian of the Bibliothèque Nationale de France, warned that Google Print would effectively anglicize the world's knowledge, and called for a French digitization effort to beat back the surging English tide. Less than a month later, President Jacques Chirac gave Jeanneney's proposal the green light. Then, last week, nineteen national libraries, evidently moved by France's determination, signed a joint motion urging the creation of a giant pan-European digital library to counterbalance the nascent Googlian stacks. A couple of days ago, 16 EU culture ministers, several heads of state, and over 800 artists and intellectuals met in Paris to close the deal, issuing a strong, continent-wide directive to preserve and promote culture, beginning with the digitization of European library collections.

More serious digitization projects are undoubtedly a good thing. The European effort, being a more purely civic enterprise, might in fact turn out far better than Google Print, which clearly has a large commercial dimension (deals with publishers, advertising etc.). The Euro initiative might produce bona fide electronic editions, not just searchable scans - fully structured, annotated and perhaps employing other scholarly resources (but let's not hold our breath). To be fair, Google has never said it wants to be the only gig in town. Rather, they hope to act as a catalyst for other digitization efforts. And judging by Europe's reaction, it seems to be working. Looking at this latest transatlantic folly, it's funny to think of the Bush administration trying to undercut European unity, splitting the continent into "old" and "new" in the hope of fishing out support for the Iraq war. We've seen where that kind of destructive diplomacy has led us. But quite wonderfully, Google appears to have achieved the opposite, galvanizing a united Europe with a big, visionary idea. If the Euro library project exists for no other reason than the perceived imperialism of Google, then so be it. It will result in a great gift for all. If only our foreign policy were so deft.

(image by libraryman via Flickr)

Posted by ben vershbow at 09:21 AM | Comments (0)
tags: Libraries, Search and the Web

Google talks to the librarians Post date  04.30.2005, 10:32 AM

Joy Weese Moll, a soon-to-be graduate of the School of Information Science and Learning Technologies at the University of Missouri, and author of the blog Wanderings of a Student Librarian, has written a useful overview of Google's Print and Scholar initiatives - actually a session report from the Association of College & Research Libraries conference earlier this month. Summarized by Moll are surprisingly harmonious remarks by Adam Smith, product manager for Google's library-related projects, and John Price Wilkin, a top librarian at the University of Michigan (and one of Google's pilot partners).

"Smith made it very clear that this project is in its infancy. Google considers itself to be an international company and intends to participate in digitization projects in other countries and other languages. Smith acknowledged that Google cannot digitize everything. Rather, Google wants to be a catalyst for digitization efforts, not the only game in town. Google’s digitization project will help them build tools that will improve the searching of digital libraries created by universities, governments, and other organizations."

Among other things, Wilkin points out that the mass digitization of library collections "has already proven to be a factor in driving clarification of intellectual property rights, including the orphan copyright issue."

Published in Cites and Insights. Link via Bibliotheke.

Posted by ben vershbow at 10:32 AM | Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web

"the only group that can organize everything is everybody" Post date  04.21.2005, 4:33 PM

Some more thoughts on Clay Shirky's keynote lecture on "folksonomies" at the Interactive Multimedia Culture Expo this past weekend in New York (see earlier post, "as u like it - a networked bibliography").

Shirky talks about the classification systems of libraries - think card catalogues. Each card provides taxonomical directions for locating a book on a library's shelves. And like shelves, the taxonomies are rigid, hierarchical - "cleaving nature at the joints," in his words. The rigidity of the shelf tends to fossilize cultural biases, and makes it difficult to adjust to changing political realities. The Library of Congress, for instance, devotes the same taxonomic space to the history of Switzerland and the Balkan Peninsula as it does to the history of all Africa and Asia. See the table below (source: Wikipedia).

LChistoryclassifyy.jpg

Or take the end of the Cold War. When the Soviet Union disintegrated, the world re-arranged itself. An old order was smashed. Dozens of political and cultural identities poured out of stasis. Imagine the re-shelving effort that was required! Librarians shuddered, knowing this was a task that far exceeded their physical and temporal resources. And so they opted to keep the books where they were, changing the section's header to "the former Soviet Union." Problem solved. Well, sort of.

When communication and transportation were slower, libraries had a chance of keeping up with the world. But the management of humanity's paper memory has become too cumbersome and complex - too heavy - to register every nuance, shock, and twist of history and human thought. Now, with the web becoming our library, there is, quoting Shirky again, "no shelf," and it's possible to have more fluid, more flexible ways of classifying knowledge. But the web has been slow to realize this. Look at Yahoo!, which, since first appearing on the scene, has organized its content under dozens of categories, imposing the old shelf-based model. As a result, their home page is the very picture of information overload. Google, on the other hand, decided not to impose these hierarchies, hence their famously spartan portal. Given the speed and frequency with which we can document every moment of our lives in every corner of the world, in every conceivable media - and considering that this will only continue to increase - there is no way that the job of organizing it all can be left solely to professional classifiers. Shirky puts it succinctly: "the only group that can organize everything is everybody."

That's where folksonomy comes in - user-generated taxonomy built with metadata, such as tags. Everybody can apply tags that reflect their sense of how things should be organized - their own personal Dewey Decimal System. There is no "top level." There are no folders. There is no shelf. Categories can overlap endlessly, like a sea of Venn diagrams. The question is, how do we prevent things from becoming incoherent? If there are as many classifications as there are footsteps through the world, then knowledge ceases to be a tool we can use. And though folksonomy frees us from the rigid, top-down hierarchies of the shelf, it subjects us to the brutal hierarchy of the web, which is time.

The web tends to privilege content that is new or recently updated. And tagging systems, in their present stage of development, are no different. Like blogs, tag searches place new content at the top, while the old stuff gets swiftly buried. As of this writing, there are nearly 24,000 photos on Flickr tagged with "China" (and this with Flickr barely a year old). You get the recent uploads first and must dig for anything older. Sure, you can run advanced searches on multiple tags to narrow the field, but how can you be sure you've entered the right tags to find everything that you're looking for? With Flickr, it is by and large the photographers themselves that apply the tags, so we have to be mind readers to guess the more nuanced classifications. Clearly, we'll need better tools if this is all going to work. Far from becoming obsolete, librarians may in fact become the most important people of all. It's not difficult to imagine their role shifting from the management of paper archives to the management of tags. They are, after all, the original masters of metadata. Different schools of tagging could emerge and we would subscribe to the ones we most trust, or that mesh best with our own view of things. Librarians could become the sages of the web.
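To make the tag-search problem above concrete, here is a minimal Python sketch of a tag index. It is only an illustration and says nothing about how Flickr or del.icio.us actually work; the `TagIndex` class, its methods, and the sample photos are all hypothetical. It shows the two behaviors discussed: a multi-tag search narrows the field by intersecting tag sets, and results come back newest first, so older items sink.

```python
from collections import defaultdict
from datetime import datetime


class TagIndex:
    """Hypothetical in-memory folksonomy: items carry free-form tags."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)  # tag -> set of item ids
        self._posted_at = {}                   # item id -> datetime posted

    def add(self, item_id, tags, posted_at):
        """Record an item under each of its (case-insensitive) tags."""
        self._posted_at[item_id] = posted_at
        for tag in tags:
            self._items_by_tag[tag.lower()].add(item_id)

    def search(self, *tags):
        """Return ids carrying ALL of the given tags, newest first."""
        if not tags:
            return []
        tag_sets = [self._items_by_tag.get(t.lower(), set()) for t in tags]
        matches = set.intersection(*tag_sets)
        return sorted(matches, key=lambda i: self._posted_at[i], reverse=True)


index = TagIndex()
index.add("photo-1", ["China", "Beijing"], datetime(2004, 6, 1))
index.add("photo-2", ["china", "food"], datetime(2005, 4, 1))
index.add("photo-3", ["China", "Beijing", "food"], datetime(2005, 1, 15))

print(index.search("china"))             # all three, most recent upload first
print(index.search("china", "beijing"))  # narrowed to photos with both tags
```

Note that the search only finds what the taggers chose to write: a photo of Beijing tagged "Peking" would never surface, which is exactly the mind-reading problem described above.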

It's easy to get preoccupied with the volume of information we're dealing with today. But the issue of time, which I raised earlier, should also be foremost in our minds. If libraries were to shake as violently and often as the world, they would crumble. They are not newsrooms. They are not bazaars. Like writing, libraries create stable, legible forms out of swirling passions. They provide refuge. Their cool, peaceful depths enable analysis and abstraction. They provide an environment in which the world can appear at a distance, spread out on literate strands that may be read in calm and quiet. As a library, the web feels more like the real world - sometimes too much so. It throbs with life, with momentary desires, with sudden outbursts. It is hypersensitive to change. But things pile up, or vanish altogether. I may have the smartest, most intuitive tags in the world, but in a year they might become nothing more than headstones for dead links. It is ironic that with greater access to more knowledge than ever before, we tend to live in a perpetual present. If folksonomies are truly where we're headed, then we must find ways to overcome the awful forgetfulness of the web. Otherwise, we may regret leaving the old, stubborn, but dependable shelf behind.

Posted by ben vershbow at 04:33 PM | Comments (2)
tags: Libraries, Search and the Web

as u like it - a networked bibliography Post date  04.19.2005, 12:05 PM

This past weekend I attended some of the keynote lectures at the Interactive Multimedia Culture Expo at the Chelsea Art Museum in New York. Among the speakers was Clay Shirky, who gave a quick, energetic talk on "folksonomies" - user-generated taxonomies (i.e. tags) - and how they are changing, from the bottom up, the way we organize information. Folksonomies are still in an infant stage of development, and it remains to be seen how they will develop and refine themselves. Already, it is getting to be a bit confusing and overwhelming. We are in the process of building, collectively, one tag at a time, a massive library. Clearly, we need tools that will help us navigate it.

citesulike.jpg Something to watch is how folksonomies are converging with social software platforms like Flickr. What's interesting is how communities form around specific interests - photos, for instance - and develop shared vocabularies. You also have the bookmarking model pioneered by del.icio.us, which essentially empowers each individual web user as a curator of links. People can link to your page, or subscribe with a feed reader. Eventually, word might spread of particular "editors" with particularly valuable content, organized particularly well. New forms of authority are thereby engendered.

Shirky mentioned an interesting site that is sort of a cross between these two models. CiteULike takes the tag-based bookmark classification system of del.icio.us and applies it exclusively to papers in academic journals, thereby carving out a defined community of interest, like Flickr.

"CiteULike is a free service to help academics to share, store, and organise the academic papers they are reading. When you see a paper on the web that interests you, you can click one button and have it added to your personal library. CiteULike automatically extracts the citation details, so there's no need to type them in yourself. It all works from within your web browser. There's no need to install any special software."

Essentially, CiteULike is an enormous networked bibliography. On the first page, recently posted papers are listed under the header, "everyone's library." To the right is an array of the most popular tags, varying in size according to popularity (like in Flickr). Each tag page has an RSS feed that you can syndicate. You can also form or join groups around a specific subject area. As of this writing, there are articles bookmarked from 6,498 journals, primarily in biology and medicine, "but there is no reason why, say, history or philosophy bibliographies should not be equally prevalent." So says Richard Cameron, who wrote the site this past November and is its sole operator. Citations are automatically extracted for bookmarked articles, but only if they come from a source that CiteULike supports (list here, scroll down). You can enter metadata manually if you are not submitting from a vetted source, but your link will appear only on your personal bookmarks page, not on the homepage or in tag searches. This is to maintain a peer review standard for all submitted links, and to guard against "lunatics." CiteULike says it is looking to steadily expand its pool of supported sources.
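The popularity-sized tag array mentioned above is simple to approximate. The following Python sketch is a guess at the general technique, not CiteULike's or Flickr's actual code; the function name, the point-size range, and the sample counts are assumptions made for illustration. It scales each tag's font size linearly between a minimum and a maximum according to how often the tag is used.

```python
def tag_cloud_sizes(tag_counts, min_pt=10, max_pt=28):
    """Map each tag's use count to a font size, scaled linearly."""
    if not tag_counts:
        return {}
    lowest = min(tag_counts.values())
    highest = max(tag_counts.values())
    span = highest - lowest
    sizes = {}
    for tag, count in tag_counts.items():
        # If every tag is equally popular, fall back to the smallest size.
        fraction = (count - lowest) / span if span else 0.0
        sizes[tag] = round(min_pt + fraction * (max_pt - min_pt))
    return sizes


# Hypothetical counts, purely for illustration.
sizes = tag_cloud_sizes({"evolution": 90, "genomics": 50, "philosophy": 10})
print(sizes)
```

A linear scale is the simplest choice; when a few tags dominate, a logarithmic scale would keep the rare ones legible.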

CiteULike might eventually fizzle out. Or it might mushroom into something massively popular (it's already running in five additional languages). Perhaps it will merge with other social software platforms into a more comprehensive folksonomic universe. Perhaps Google will buy it up. It's impossible to predict. But CiteULike is a valuable experiment in harnessing the power of focused communities, and in creating the tools for navigating our nascent library. It might also solve some of the problems put forth in Kim's post, "weaving textbooks into the web." Worth keeping an eye on.

Posted by ben vershbow at 12:05 PM | Comments (0) | TrackBack
tags: Libraries, Search and the Web , the_networked_book

find it rip it mix it share it Post date  04.15.2005, 8:14 AM

That's the slogan for the just-launched Creative Archive License Group - a DRM-free audio/video/still image repository maintained by the BBC to provide "fuel for the creative nation." Other members include Channel 4, Open University, and the British Film Institute (bfi). Imagine if the big three US networks, PBS, NPR and the MOMA film archive were to do such a thing...

Posted by ben vershbow at 08:14 AM | Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web

amazon: inching toward semantic Post date  04.13.2005, 5:08 PM

amazonconcordance.jpg

Sometime in the last few days, Amazon.com unveiled three new features for its Inside the Book search: "books on related topics," a "100 most frequently used" concordance (above is the concordance for Orality and Literacy by Walter J. Ong), and "text stats." The stats are pretty funny - in addition to page, word and character count, they measure a book's "complexity" as well as its "readability" according to three established indexes, including the famous and amusingly named "Fog Index" (as though it rated the density of mental fog between a reader and a book). It also includes so-called "fun stats" like words per dollar and words per ounce.
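For the curious, these statistics are easy to approximate. The Python sketch below is not Amazon's implementation; the function names are hypothetical and the counts are assumed to be supplied by the caller. It builds a word-frequency concordance, computes the Gunning Fog Index from its standard formula (0.4 times the sum of the average sentence length and the percentage of "complex" words, conventionally those of three or more syllables), and works out words per dollar.

```python
import re
from collections import Counter


def concordance(text, top_n=100):
    """Return the top_n most frequent words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


def fog_index(word_count, sentence_count, complex_word_count):
    """Gunning Fog Index computed from precomputed counts."""
    average_sentence_length = word_count / sentence_count
    percent_complex = 100 * complex_word_count / word_count
    return 0.4 * (average_sentence_length + percent_complex)


def words_per_dollar(word_count, price_in_dollars):
    """One of the 'fun stats': how many words each dollar buys."""
    return word_count / price_in_dollars


print(concordance("the cat and the hat", top_n=1))
print(fog_index(word_count=100, sentence_count=5, complex_word_count=10))
print(words_per_dollar(word_count=50000, price_in_dollars=10))
```

Counting syllables reliably is the hard part of the Fog Index, which is why this sketch takes the complex-word count as an input rather than guessing at it.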

Some of these features seem a little trivial, but there's no denying that Amazon is moving surely and steadily toward a comprehensive semantic browsing system (other recent innovations are Statistically Improbable Phrases (SIPS) and Citations). Though still crude compared to what it might eventually become, you can begin to glimpse the pleasures and uses it will afford. Amazon can never replace the social and tactile pleasures of browsing a physical bookstore, but it's doing a good job at making the virtual bookstore a more exciting place.

Posted by ben vershbow at 05:08 PM | Comments (0) | TrackBack
tags: Libraries, Search and the Web

simple answers to simple questions Post date  04.08.2005, 4:57 PM

Looking for simple facts on the web can be a frustrating business. Over time, we bookmark sites that reliably deliver the goods - things like basic geographical data, conversion scales for measurements, biographical summaries, or anything else that we need to quickly grab, plug in, and move on. But it all takes much longer than it should, and in looking for such things, we're plagued as much by the nuance of internet search as by its imprecision. It's all part of learning how to deal with this massive web we've created, and the state of blindness to which it reduces us. Search engines are really the only tool we have for groping through a pitch black sea of information, where the ineluctable modality is meaning, not the visible (for more on this, read Steven Pemberton's talk from the Decade of Web Design conference, which if:book attended this January in Amsterdam).

Well Google has helped us to see, just a little bit better, the little nuggets and factual crystals that we so often sift for in our blindness - by unveiling a new Q&A feature for basic web search (article via Bibliotheke). Plug in a search like "earth distance sun," or "copernicus date of death," and you get exactly what you're looking for right above the stack of general results:

googleq&a.jpg
or

googleq&a2.jpg

It's the kind of small, thoughtful innovation that makes you appreciate Google's attention to detail and sensitivity to the problem of blindness. Other search engines like Ask Jeeves offer a similar feature, but Google includes the information's source (a source they've vetted and deemed reliable) and a link to that page. For example, in the case of basic geography and demographics, the link might be to the CIA's World Factbook. Even if you just grab the fact and run, it's comforting to have seen a trustworthy citation, though some might grumble about the CIA.

It would be fantastic if this kind of quick fact extraction could be tailored to different search needs. Imagine a "writer's search toolbox" combining every conceivable reference resource that an author might need. Enter "synonym for think" and right at the top you get an entire thesaurus search result: "analyze, appraise, appreciate, brood, cerebrate, chew, cogitate, comprehend, conceive...." Enter "idiom with humble" and you get "eat humble pie," "Be it ever so humble, there’s no place like home," etc. Or search for rhymes, poetic forms, grammar guidelines, literary terms, writer bios, quotes, etymologies - anything. It's good news that search is being refined in this way, and competition among giants seems, in the end, to be good for the average web browser. Whatever helps us spend less time scouring and more time on the things that are important to us.

Posted by ben vershbow at 04:57 PM | Comments (0)
tags: Libraries, Search and the Web

baking google's cookies Post date  03.22.2005, 8:24 AM

google book_icon.gif Bibliotheke points to the recent adventures of Greg Duffy, a talented Texas college student who figured out how to read entire copyrighted books in Google Print by "baking" the cookies (data sent to your web browser by a website to store preferences for specific sites and pages) Google uses to impose search limits on protected material. Duffy took on the challenge largely out of curiosity, but doesn't deny that he fantasizes about his chutzpah landing him a job at Google. He hasn't been hired yet, but he did manage to attract a great deal of attention and over 10,000 hits to his site from more than 60 countries. And in the sudden commotion, he mysteriously disappeared from Google's web search results, only to reappear shortly after Google Print had been fixed to repel the hack. Any connection between the two events was cheerily denied by a Google representative writing in the comments on Duffy's blog under the nom de plume "Google Guy." Conspiracy theories abound, but Duffy has retained an excellent sense of humor throughout the whole affair, and still makes no secret of his hopes that sheer audacity and display of chops might yet get him hired by the juggernaut he so admires and loves to tease.

It's a bit tech-heavy, but it's worth reading his post and the updates that follow, if for no other reason than for his amusing riff on the cookie motif.

"So recently I wrote some software to grab and store up a bunch of cookies, keep them for more than 24 hours, and then automate searching for pages by this method. If I wanted to view page 100, the software would search for it and attempt to extract the image with a regular expression. If that doesn't work, it will search for page 99 and extract the "next page" link to get to page 100. It will continue doing this for page 101, 98, and 102 until it finds the correct page. Whenever a cookie would hit the hard limit, I'd replace it with a new cookie from the queue. By grabbing the "next" and "previous" links automatically in this "inductive" fashion and using the search for skipping, I could view an entire book on Google Print with one click every time. I later modified the software to spit out a PDF of the book. I used simple components like GoogleCookie (cookie with accessible properties), GoogleCookieOven (queue with "baking time", i.e. it only pops when the head of the queue is old enough to get the ability to search), and GoogleCookieBaker (thread that keeps the oven full of baking cookies by querying Google for new ones when the number drops below a certain threshold)."

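For the curious, here is a minimal Python sketch of the "oven" idea in the passage above. It is not Duffy's code: the names `Cookie`, `CookieOven` and `probe_order` are made up for illustration, the clock is injectable so the example can run without waiting a day, and nothing here talks to Google or any other site. It only shows the two mechanisms the quote describes: a queue whose head is released once it has aged long enough, and the page order 100, 99, 101, 98, 102.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class Cookie:
    value: str        # opaque cookie contents (hypothetical)
    baked_at: float   # timestamp at which the cookie was obtained


class CookieOven:
    """FIFO queue that releases its head only once it has aged long enough."""

    def __init__(self, baking_seconds, clock=time.time):
        self.baking_seconds = baking_seconds
        self._clock = clock          # injectable so tests need not wait
        self._queue = deque()

    def put(self, value):
        # New cookies go to the back and start "baking" immediately.
        self._queue.append(Cookie(value, self._clock()))

    def pop_ready(self):
        """Return the oldest cookie if it is old enough, otherwise None."""
        if not self._queue:
            return None
        head = self._queue[0]
        if self._clock() - head.baked_at < self.baking_seconds:
            return None
        return self._queue.popleft()

    def __len__(self):
        return len(self._queue)


def probe_order(target, max_distance):
    """Yield the target page, then its neighbours, alternating below and above."""
    yield target
    for distance in range(1, max_distance + 1):
        yield target - distance
        yield target + distance


if __name__ == "__main__":
    print(list(probe_order(100, 2)))
```

A "baker" thread in the quoted design would simply call `put` whenever `len(oven)` drops below some threshold; that part is left out here because it depends on how new cookies are obtained.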
Posted by ben vershbow at 08:24 AM | Comments (0) | TrackBack
tags: Copyright and Copyleft , Libraries, Search and the Web

chirac vs. google Post date  03.17.2005, 10:31 AM

French President Jacques Chirac has instructed the Bibliothèque Nationale de France to "draw up a plan" for a comprehensive online library of European literature to counter what is seen as the inevitably Anglo-Saxon bias of Google Print (see "non, merci").

Reuters story: Chirac Rivals Google with French Online Book Plan

Posted by ben vershbow at 10:31 AM | Comments (0)
tags: Libraries, Search and the Web

l.i. library lends ipods Post date  03.03.2005, 5:01 PM

Boing Boing points to a library in Long Island that has recently started lending mp3 audio books on iPod shuffles, even throwing in cassette adapters and FM transmitters for listening in the car. The library claims that it is saving a lot of money in the long run, since mp3 audio books cost significantly less than books on CD.

>>Wired story

Posted by ben vershbow at 05:01 PM | Comments (0)
tags: Libraries, Search and the Web , Microlit

non, merci Post date  02.23.2005, 7:35 PM

Jean-Noel Jeanneney, the head of France's national library (BNF), has raised a "battle cry" (Le Figaro) against the cultural and linguistic imperialism of America. But this time, it's not about Big Macs and slang coming to massacre the French language. It's about Google and its plans to digitize libraries, which, Jeanneney says, will put a distinctly anglo stamp on the greater part of the world's knowledge (Reuters). Encouraging Europe to take part in this massive project seems like a good idea - for the sake of diversity, but more important, to offer a possible alternative to Google's approach, which was devised in the absence of any real competition. Google Print's interface is limited to a snapshot tour of a book, with minimal search capabilities. They're essentially doing for books what A9 is doing for streets, with souped-up scanners instead of trucks with camera mounts. It's a browsing tool and not much more.

Google's stock is soaring not only because it is a great engine, but also because it has pioneered a new kind of search-based advertising. There's been a lot of high-minded conjecture (e.g.) as to what Google Print might mean for humanity - rhapsodic allusions to Borges and the library of Alexandria. But the great global library of our dreams probably won't be created by Google. You could say that we are all creating it, that the web is that library. But without getting too breathless, think of the fact that with each passing year we move further and further into a paperless world. We will need well-designed electronic books in a well-designed electronic library, or matrix of libraries. So it's heartening that a serious institution like BNF wants to get in on the game. Maybe they can do better. A good indication that they could is their recently announced project (sorry, only French link) to build a free online archive of 130 years of French newspapers and periodicals - 29 publications in total, running from 1814 to 1944. But then again, perhaps they simply want to secure a place in Google's illustrious coalition of the willing: Harvard, Oxford, U. of Michigan, Stanford, and the New York Public Library.

Bibliothèque Nationale de France

Posted by ben vershbow at 07:35 PM | Comments (2)
tags: Libraries, Search and the Web

the web in the world Post date  02.14.2005, 7:07 PM

In ten years, the world wide web has become an indispensable fact of life. Where do we take it next? At the conference's closing plenary session, Peter Lunenfeld asked a similar question: "What is the next big dream that will keep us going? Are we out of ideas?" He then offered something called "urban computing" as a possible answer.

Here is my attempt (rather long, I apologize) to jump on that dream...

I live in New York, and in the past few years I've observed a transformation. My neighborhood coffee shop looks like an advertisement for Apple. At any given time, no fewer than two thirds of the customers are glued to their laptops, with mugs of coffee steaming in perilous proximity. Power cords snake among the tables and plug into strips deployed around the cafe floor. Go to the counter and they'll be happy to give you a dog-eared business card bearing the password to their wireless network. Of course, people have been toting around notebook computers since they first became available in the mid-80s, and they've certainly been no strangers to coffee shops. But with the introduction of Wi-Fi people are flocking in droves. Some kind of exodus has begun.

It's a familiar sight throughout the more cosmopolitan neighborhoods of the city. Go to any Starbucks on the Upper West Side and you're competing with half a dozen other customers for a space on their too-few powerstrips. And their Wi-Fi service isn't even free. And come spring, I predict the same will occur in the city's parks, especially those downtown, which are rapidly being integrated into a massive wireless infrastructure. No single entity is responsible for this, rather a lattice of different initiatives working toward a common goal: free high speed Wi-Fi coverage across Manhattan.

Mobile web and messaging technologies have already created a new breed of roving web users. Cell phones, PDAs, text messaging, Bluetooth, RSS, podcasting (the list goes on...) have swept into our daily life like a tidal wave. More and more, we're able to read, search, capture, edit, and send on the go, and with satellite-fed positioning technologies, we can pinpoint our location at any given time. What we have is the beginnings of a kind of "augmented reality" where information relates intimately to place, and vice versa. The world itself can now be as searchable, linkable, and informative as the web - a synthesis, or overlay, of real and virtual realities.

So the next big dream could be the evolution of the web into something more than a desktop system - into something that we can use while moving, and interact with anywhere.

Imagine an entire city redesigned as a communication platform. This has recently become a hot emerging field of research: "urban computing", or "locative media". It has also entered the choppy waters of urban planning. In his talk at the conference, New York sociologist and Silicon Alley chronicler Michael Indergaard spoke of developers in lower Manhattan hoping to recast the hobbled financial district as a high-tech place "to work and play" - a weave of communications and public space. Given the tangle of ego, money and politics that has converged on Ground Zero, it's hard to predict how this will pan out (for examples of recent initiatives, see Downtown Alliance, Spectropolis). Or witness the hotly debated proposal to make the entire city of Philadelphia into one giant Wi-Fi hotspot.

Ultimately, the idea is to overlay the complex of buildings, streets and public spaces - the entire fabric of urban life - with the interconnectivity of the web. The implications for culture are as enormous as they are for business. Cities have always been the most brilliant cultural dynamos. Take the simplest, most essential urban act - walking through town - and you will discover a rich attendant tradition: from the eternal stroller, the flaneur, to the urban dérive, or drift, "a technique of rapid passage through varied ambiances," as expounded by the French Situationists of the 1960s, most notably Guy-Ernest Debord. Or take the traditions of busking, street photography, or the countless films that employ the city as canvas (take, for example, Woody Allen, who time and again has used the geography and rhythm of Manhattan in his neurotic self-portraits). And let's not forget what is probably the greatest literary treatment of consciousness in the modern city: Joyce's Ulysses. In each case, the city is interpreted as a matrix of "psychogeographies" (to borrow again from the Situationists) for which the amble, the stroll, the idle on the stoop, are the most effective tools of discovery (see earlier post City Chromosomes - An SMS Chronicle). The city is also the realm of chance: a kind of particle accelerator of circumstance that whips up coincidences with adrenalizing rapidity. It's not hard to see how this is similar to our movements in virtual space - surfing, browsing, chatting, searching, stumbling upon newness - and how these two ways of movement could act in concert when computers are unmoored from desks.

The beginnings of this can be found at the intersection of photography and the web. Digital cameras and camera phones have become incredibly popular. People are snapping pictures everywhere they go, then sharing them through email, image messaging, or online image banking services like Ofoto or Flickr. New technologies are in the works that will allow people to plug in photos the same way they plug in keywords on a search engine, effectively asking "what am I looking at?" or "where have I seen this before?" (see earlier post, Hyperlinking the Eye of the Beholder). Amazon's new yellow pages service allows you to see photos of the business you are looking for, and to virtually stroll, or scroll, down the block to get a sense of the surrounding environment (see earlier post, From Aspen to A9). A student project called "Grafedia", presented recently at the Tisch ITP winter show, allows people to put "hyperlinks" in any physical environment. As described by its inventor John Geraci:

"Grafedia is hyperlinked text, written by hand onto physical surfaces and linking to rich media content - images, video, sound files, and so forth. It can be written anywhere - on walls, in the streets, or in bathroom stalls. Grafedia can also be written in letters or postcards, on the body as tattoos, or anywhere you feel like putting it. Viewers "click" on these grafedia hyperlinks with their cell phones by sending a message addressed to the word + "@grafedia.net" to get the content behind the link."

I can imagine Grafedia as a fun urban diversion - a trail of crumbs, a flirtation, a new way of marking territory - or even as a kind of x-ray vision into closed spaces. A discreet little tag could alert a passerby to shady goings-on within. Though primitive, these new technologies represent the seeds for new social practices, new ways of creating and sharing meaning.

And locative media need not be restricted to cities. GPS technology presents new possibilities for tracing a meaningful path through the physical world - global hide-and-seek, or more frightening, a means of surveillance. A recreational pastime, "geocaching," has already sprung up around this technology. In the game, which resembles a kind of global scavenger hunt, participants identify "caches" on the web and track them down with their GPS device. Caches are used as a kind of exchange for anything people might like to share - books, software, recipes, jewelry, clothing, art (anything, really) - and as a stop along the trail to other caches. Clearly, the thrill of finding, of tracking, is the main attraction here.

Solitary roaming through nature could also, theoretically, be "plugged in" to the mobile web. One of the conference presenters, John Chris Jones, has for the past four years been writing a peripatetic web journal beamed in from his Wordsworthian walks through the English countryside. I for one would find this to be a terrible intrusion into solitude, but I offer up the possibility without judgement, keeping in mind that Thoreau's "isolated" cabin on Walden Pond was less than a mile from the train tracks.

In the end, societies and cultures work through presence - through people, things, and ideas interacting in the same environment. Far from stripping individual locales of their unique qualities, the web in its infant decade has enabled people to explore and converse like never before about their particular place in the world, while simultaneously connecting every point on the map - the realization of McLuhan's "global village". The next step, it would seem, is to shed the physical restrictions of desktop computing so the village can be explored on foot.

Posted by ben vershbow at 07:07 PM | Comments (0)
tags: Libraries, Search and the Web , conferences_and_excursions

collecting and archiving the future book Post date  02.08.2005, 8:02 PM

The collection and preservation of digital artworks has been a significant issue for museum curators for many years now. The digital book will likely present librarians with similar challenges, so it seems useful to look briefly at what curators have been grappling with.

At the Decade of Web Design Conference hosted by the Institute for Networked Cultures, Franziska Nori spoke about her experience as researcher and curator of digital culture for digitalcraft at the Museum for Applied Art in Frankfurt am Main. The project set out to document digital craft as a cultural trend. Digital crafts were defined as “digital objects from everyday life,” mostly websites. Collecting and preserving these ephemeral, ever-changing objects was difficult, at best. A choice had to be made between manual selection and automatic harvesting. Nori and her associates chose manual selection. The advantage of manual selection was that critical faculties could be employed. The disadvantage was that subjective evaluations regarding an object’s relevance were not always accurate, and important work might be left out. If we begin to treat blogs, websites, and other electronic ephemera as cultural output worthy of preservation and study (i.e. as books), we will have to find solutions to similar problems.

The pace at which technology renews and outdates presents a further obstacle. There are, currently, two ways to approach durability of access to content. The first is to collect and preserve hardware and software platforms, but this is extremely expensive and difficult to manage. The second is to emulate the project in updated software. In some cases, the artist must write specs for the project, so it can be recreated at a later date. Both these solutions are clearly impractical for digital librarians who must manage hundreds of thousands of objects. One possible solution for libraries is to encourage proliferation of objects. Open source technology might make it possible for institutions to share data/objects, thus creating “back-up” systems for fragile digital archives.

Nori ended her presentation with two observations. "Most societies create their identity through an awareness of their history." This, she argues, compels us to find ways to preserve digital communications for posterity. She notes that cultural historians, artists, and researchers "are worried about a future where these artifacts will not be accessible."

Posted by Kim White at 08:02 PM | Comments (0)
tags: Libraries, Search and the Web , conferences_and_excursions

from aspen to A9 Post date  02.07.2005, 7:04 PM

Amazon's search engine A9 has recently unveiled a new service: yellow pages "like you've never seen before."

"Using trucks equipped with digital cameras, global positioning system (GPS) receivers, and proprietary software and hardware, A9.com drove tens of thousands of miles capturing images and matching them with businesses and the way they look from the street."

All in all, more than 20 million photos were captured in ten major cities across the US. Run a search in one of these zip codes and you're likely to find a picture next to some of the results. Click on the item and you're taken to a "block view" screen, allowing you to virtually stroll down the street in question (watch this video to see how it works). You're also allowed, with an Amazon login, to upload your own photos of products available at listed stores. At the moment, however, it doesn't appear that you can contribute your own streetscapes. But that may be the next step.

I can imagine online services like Mapquest getting into, or wanting to get into, this kind of image-banking. But I wouldn't expect trucks with camera mounts to become a common sight on city streets. More likely, A9 is building up a first-run image bank to demonstrate what is possible. As people catch on, it would seem only natural that they would start accepting user contributions. Cataloging every square foot of the biosphere is an impossible project, unless literally everyone plays a part (see Hyperlinking the Eye of the Beholder on this blog). They might even start paying - tiny cuts, proportional to the value of the contribution. Everyone's a stringer for A9, or Mapquest, or for their own, idiosyncratic geo-caching service.

A9's new service does have a predecessor though, and it's nearly 30 years old. In the late 70s, the Architecture Machine Group, which later morphed into the MIT Media Lab, developed some of the first prototypes of "interactive media." Among them was the Aspen Movie Map, developed in 1978-79 by Andrew Lippman - a program that allowed the user to navigate the entirety of this small Colorado city, in whatever order they chose, in winter, spring, summer or fall, and even permitting them to enter many of the buildings. The Movie Map is generally viewed as the first truly interactive computer program. Now, with the explosion of digital photography, wireless networked devices, and image-caching across social networks, we might at last be nearing its realization on a grand scale.

Posted by ben vershbow at 07:04 PM | Comments (0)
tags: Libraries, Search and the Web , history_of_interactive_media

hyperlinking the eye of the beholder Post date  01.13.2005, 3:20 PM

What if instead of just taking dorky pictures of your friends you could use your camera phone as an image swab, culling visual samples of the world around you and plugging them into a global database? Every transmitted picture would then be cross referenced with the global image bank and come back with information about what you just shot. A kind of "visual Google."

This may not be so far away. Take a look at this interview in TheFeature with computer vision researcher Hartmut Neven. Neven talks about "hyperlinking the world" through image-recognition software he has developed for handheld devices such as camera phones. If it were to actually expand to the scale Neven envisions (we're talking billions of images), could it really work? Hard to say, but it's quite a thought - sort of a global brain of Babel. Think of the brain as a library where information is accessed by sense (in this case vision) queries. Then make it earth-sized.

Here, in Neven's words, is how it would work:

"You take a picture of something, send it to our servers, and we either provide you with more information or link you to the place that will. Let's say you're standing in front of the Mona Lisa in the Louvre. You take a snapshot with your cameraphone and instantly receive an audio-visual narrative about the painting. Then you step out of the Louvre and see a cafe. Should you go in? Take a shot from the other side of the street and a restaurant guide will appear on your phone. You sit down inside, but perhaps your French is a little rusty. You take a picture of the menu and a dictionary comes up to translate. There is a huge variety of people in these kinds of situations, from stamp collectors, to people who want to check their skin melanoma, to police officers who need to identify the person in front of them."

But the technology has some very frightening implications as well, chief among them its potential for biometric human identification through "iris scanning and skin texture analysis." This could have some fairly sensible uses, like an added security layer for banking and credit, but we're dreaming if we think that will be the extent of it. Already, the Los Angeles Police Department is testing facial recognition programs based on Neven's work - a library of "digital mugshots" that can be cross referenced with newly captured images from the street. Add this to a second Patriot Act and you've got a pretty nasty cocktail.

Posted by ben vershbow at 03:20 PM | Comments (0)
tags: Libraries, Search and the Web

what's a library? Post date  01.09.2005, 9:52 PM

In a recent discussion in these pages, Gary Frost has suggested that the Google library model would be premised on an inter-library loan system, "extending" the preeminence of print. Sure, enabling "inside the book" browsing of library collections will allow people to engage remotely with print volumes on distant shelves, and will help them track down physical copies if they so desire. But do we really expect this to be the primary function of such a powerful resource?

We have to ask what this Google library initiative is really aiming to do. What is the point? Is it simply a search tool for accessing physical collections, or is it truly a library in its own right? A library encompasses architecture, other people, temptations, distractions, whispers, touch. If the Google library is nothing more than a dynamic book locator, then it will have fallen terribly short of its immense potential to bring these aforementioned qualities into virtual space. Inside-the-book browsing is a sad echo of actual tactile browsing in a brick-and-mortar library. It’s a tease, or more likely, a sales hook. I think that's far more likely to be the way people would use Google to track down print copies - consistent with Google's current ad-based revenue structure.

But a library is not a retail space – it is an open door to knowledge, a highway with no tolls. How can we reinvent this in networked digital space?

Posted by ben vershbow at 09:52 PM | Comments (10) | TrackBack
tags: Libraries, Search and the Web

more thoughts on salinas Post date  01.05.2005, 4:28 PM

Instead of becoming obsolete or extinct, local libraries should become portals to the global catalogue - a place where every conceivable text is directly obtainable. Instead of a library card, I might have a portable PC tablet that I use for all my e-texts, and I could plug into the stacks to download or search material. In this way, each library is every library.

But community libraries shouldn't simply be a node on the larger network. They should cultivate their unique geographical and cultural situation and build themselves into repositories of local knowledge. By being freed, literally, of the weight of general print collections, local branches could really focus on cultivating rich, site-specific resources and multimedia archives of the surrounding environment.

In Salinas, for example, there are two bookshelves of Chicano literature at the Cesar Chavez Library - a precious, unique resource that will soon be inaccessible as libraries close to solve the city's budget crisis. With all library collections digitized, you wouldn't have to physically be in Salinas to access the Chicano shelves, but Salinas would remain the place where the major archival work is conducted, and where the storehouse of material artifacts is located.

It would be a shame for libraries to lose their local character, or for knowledge to become standardized because of big equalizers like Google. But when federal and municipal money is so tight that libraries are actually closing down, can we really expect the digitization of libraries to be achieved by anyone but the big commercial entities (like Google)? And if they're the ones in charge, can we really count on getting the kind of access to books that libraries once provided? (image: Cesar Chavez Library, Salinas)

Posted by ben vershbow at 04:28 PM | Comments (1)
tags: Libraries, Search and the Web

another ann arbor thought: borders and google Post date  01.04.2005, 1:27 PM

Ann Arbor is also the birthplace of Borders Books, a megabookstore similar to Barnes & Noble. It started in the 70s as a small used bookstore and evolved into a superstore which, according to their website, serves "some 30 million customers annually in over 1,200 stores."

The Borders website credits their success to "a revolutionary inventory system that tailored each store’s selection to the community it served." In other words, they applied small bookstore strategy (get to know the particulars of a customer’s reading habits) on a larger scale. Since Google has chosen Ann Arbor as one locus of its nascent megalibrary, I got to thinking: what might these two distributors have in common (besides A2)? Google might be taking a cue from Borders when it designs the cyberlibrarian to accompany its digital collection. The small bookstore owner learns, through interaction, what a particular community wants. Borders’ inventory system tracked what the client was buying and selling. Google may, likewise, be able to track your buying and selling, your searching and asking. Perhaps the automaton Google librarian will "know" you based on information accumulated by all the various Google searches you have conducted. Problem is, that’s marketing strategy, not educational strategy. Will the Google librarian be able to make intuitive leaps leading the browser to things he/she is not familiar with rather than to more of what he/she already knows? How will search engines answer the need for this kind of expertise?

Posted by Kim White at 01:27 PM | Comments (1)
tags: Libraries, Search and the Web

closing down salinas Post date  01.03.2005, 3:05 PM

The 150,000 citizens of Salinas, California will soon be without a single public library – a drastic measure taken by the city to solve a drastic budget crisis. After a pair of last-minute ballot measures failed to win funds for the city’s embattled libraries, the doors will soon close on what is, for many curious minds, the only resource in town.

Is the local community library going extinct? (image: John Steinbeck Library, Salinas)

Posted by ben vershbow at 03:05 PM | Comments (0)
tags: Libraries, Search and the Web

after a holiday visit to ann arbor: the u of m library & google Post date  01.01.2005, 5:21 PM

I have fond memories of studying in the University of Michigan libraries (as a high school student, and later as a U of M undergraduate). The physical space of the library and its seemingly endless stacks of books allowed deep exploration of even the most obscure topics, and gave me a sense of how vast (and how limited) the universe of human thought really is. How is this going to be translated in the virtual space of Google’s digital library?

Isn’t it the job of the University library to provide a young scholar with opportunities to “see” the scope of human knowledge, while at the same time offering a kind of temple space for the engagement of these books? Without the marble staircases, the chandeliers, the stone pilasters, the big oak tables, the reference room, the stained glass windows, the hushed silence, how will we get the message that books are important, and that understanding them requires a particular “space”? The physical space of the library serves as a metaphor and a reminder of the serious mental space that needs to be carved out for productive study. What will we lose when that space becomes “virtual”? Are we “saving” space by putting everything in the computer? Or are we losing it?

Posted by Kim White at 05:21 PM | Comments (15)
tags: Libraries, Search and the Web

google takes on u. of michigan library - the numbers Post date  12.22.2004, 12:48 PM

- 7,000,000: Volumes in the U-M library to be digitized.
- 2,380,000,000: Estimated number of pages.
- 743,750,000,000: Estimated number of words.
- 1,600: Years it would take U-M to digitize all 7 million volumes without Google's special technology.
- Fewer than 7: Years it will take to digitize the volumes with Google's technology.
- $1 billion: Estimated value of the project to U-M.

Source: John Wilkin, associate university librarian, library information technology and technical and access services, University of Michigan
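A quick back-of-the-envelope check on the figures above, in Python. The inputs are the numbers as listed; the derived averages (pages per volume, words per page, volumes per year) are my own arithmetic, not part of the source, and the "with Google" rate treats "fewer than 7 years" as exactly 7, so it is a lower bound.

```python
# Figures as listed in the post.
volumes = 7_000_000
pages = 2_380_000_000
words = 743_750_000_000
years_without_google = 1_600
years_with_google = 7  # "fewer than 7", so used here as an upper bound

# Derived averages (not in the source).
pages_per_volume = pages / volumes
words_per_page = words / pages
volumes_per_year_without = volumes / years_without_google
volumes_per_year_with = volumes / years_with_google  # lower bound on the rate

print(f"pages per volume:        {pages_per_volume:,.1f}")
print(f"words per page:          {words_per_page:,.1f}")
print(f"volumes/year (U-M only): {volumes_per_year_without:,.0f}")
print(f"volumes/year (Google):   at least {volumes_per_year_with:,.0f}")
```

So the estimates assume an average book of 340 pages at about 312 words a page, and a scanning rate of roughly a million volumes a year.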

U-M's entire library to be put on Google - Detroit Free Press

Posted by ben vershbow at 12:48 PM | Comments (0)
tags: Libraries, Search and the Web

tower of babel or trivial pursuit? Post date  12.20.2004, 3:59 PM

Read New York Times Article
In an article in yesterday’s NY Times, Alberto Manguel compares the Genesis story of Babel and the ambitions of the library at Alexandria with their alleged modern-day counterpart—Google’s commitment to digitize all human knowledge. Are we constructing a modern-day tower of Babel—a monument to the hubris of what might be possible if we could just get a little smarter? Will Google help us find answers to perennial puzzlers like: where did we come from? Is anyone or anything in charge? And, what’s the meaning of it all? I went online to find out. I Googled the question “What is the meaning of it all?” and got the following:

The Meaning of Emmanuel
... "What is the meaning of it all?" "What is its purpose?" The human tendency always is to forget origins. And now that Christmas has grown to be such a ...

The Kubrick Site: John Morgan on 2001 vs. 2010
... What is the meaning of it all? Is there a God? What is the purpose of Art? Is there a merging of Art and Science?' Where Clarke in comparison only asks ...

The meaning of life, the universe and everything
... What is the meaning of it all? 'Antennae' colliding galaxies. When we contemplate the unimaginable vastness of the universe, the incredible diversity ...

London theater musical on stage in London's West End Shaftesbury ...
... But what is the meaning of it all? Well, mainly that the dreamy idealist, Boney, had all he needed in Anastasia Barzee’s sweetly trilling Jo and never ...

'Rings' actor: 'It'll be the biggest film of all time'
... What is the meaning of it all? In some ways, that sort of inquiry is completely unfashionable. "I often think one of the reasons people are dismissive ...

Becoming a Wise Elder
... Questions such as "What is the meaning of it all?" and "Does my life make any kind of difference to anyone?" were very unlikely to arise. ...

Psychology Today: Still news
... PT: What is the meaning of it all now? BB: There was a recklessness in Kennedy's life that I didn't see, a sexual recklessness I don't understand. ...

None of these offerings brought me closer to a substantive answer. Demoralized by the thought of having to go through the other 517 possibilities, I decided to respond to the suggestion at the top of my page:
Tip: Have a question? Ask the researchers at Google Answers.

I clicked "Google Answers" and entered my question: What is the meaning of it all?

Then I had to set a price for my question between $2 and $200. I clicked on “How do I price my question?” and found the following guidelines:
*The more you pay, the more time and effort a Researcher will likely spend on your answer. However, this depends somewhat on the nature of your question.
*Above all - try to pay what the information is worth to you, not what you think you can get it for - that is the best way to get a good answer - but only you can know the value of the information you seek.

Hmm, what is the information worth to me?

I took a look at Google’s examples to get an idea of where my question might fit on the pay scale. Fifty dollars is the “minimum price appropriate for complex, multi-part questions. Researchers will typically spend at least one hour on $50 questions and be very responsive to follow-up questions.” One-hundred-dollar questions merit two to four hours of “highly thorough research.” Examples of hundred-dollar questions included “Parking in New York City” and “How does infant-family bonding develop?” The two-hundred-dollar question required researchers to “spend extensive amounts of time (4 hours plus).” Examples of $200 questions included “Searching for Barrett's Ginger Beer,” “Applications using databases,” and “What is the impact of a baby with Down's Syndrome on its family?”

None of those examples seemed to be in the same league with “what’s the meaning of it all?” Can a Google researcher find the answer in 4 hours? Probably not, although I do wonder what they would come up with. Anyway, the point of all this is that Google is set up to search out trivial, quotidian sorts of things, and it will be interesting to see how, or if, they can make the transition from those who can tell you how to “search for Barrett’s Ginger Beer” to gatekeepers of all human knowledge.

Posted by Kim White at 03:59 PM | Comments (0)
tags: Libraries, Search and the Web , babel , google , internet , meaning , search , semantic_web , web

enter the cybrarian Post date  12.18.2004, 3:02 PM

The recent buzz surrounding Google's library initiative has everyone talking about the future of research, which inevitably raises the question: how will the digitization of library collections change the role of the librarian? I would guess that, far from becoming obsolete, their role will in fact be elevated in importance, if not necessarily in status. They could very well come to be our indispensable guides through the labyrinth - if perhaps invisible ones, engineering behind the digital walls.

It's also important to consider the question of visualization. When you run a search on Google you are given an enormous list. This is already deeply ingrained in the day-to-day business of finding information. But these lists are basically the electronic equivalent of scrolls, with the items algorithmically determined to be most relevant placed at the top. Sooner or later we have to admit that using scrolls for this kind of business is ludicrous. There has to be a better way of arraying these vast harvests of information, one that allows the researcher to zoom across degrees of specificity and through associative chains of context and meaning. I see no reason why a search shouldn't take place in some kind of virtual library, emulating the physical architecture of research settings, and allowing for some of the associative or accidental echoes that so often enrich a paper trail blazed through a brick-and-mortar library. Or cannot knowledge resemble a tree, or an arterial matrix? Must we be bound to the scroll?

Returning to the question of the librarian's role, I recalled this passage from James J O'Donnell's 1996 paper The Pragmatics of the New: Trithemius, McLuhan, Cassiodorus:

"The librarians of the world have, moreover, already led the way, for academics at least, into the new information environment, not least because they are caught between rising demand from their customers (faculty and students) and rising supply and prices from their suppliers, and so have already been making reality-based decisions about ownership versus access, print versus electronics, and so on. In short, they are just now our leading pragmatists. Can we imagine a time in our universities when the librarians are the well-paid principals and the teachers their mere acolytes in a distribution chain? I do not think we can or should rule out that possibility for a moment"

oldgoogle.jpg

Related articles:

"Questions and Praise for Google Web Library" - NY Times
"Google's library plan 'a huge help'" - USA Today
"Making books readable on computer proves trying task" - USA Today

Also, I found this on Searchblog. For a trip down memory lane, check out the original Google in the Stanford archives (click on picture to right). Unfortunately, although it seems interactive, a search just brings up a bunch of stylesheets.

Posted by ben vershbow at 03:02 PM | Comments (0)
tags: Libraries, Search and the Web , google , google_print , information_architecture , infoviz , librarian , library , michigan , search

google and big brother Post date  12.15.2004, 7:35 PM

Can Google remain true to its promise, "don't be evil," now that it has shareholders to worry about, advertisers to please, and an ever-increasing reach into the repositories of human knowledge? Google still gives you that warm and fuzzy feeling. It's got the goofy name, those cute seasonal tailorings of its masthead, the lava lamps. And this is not to mention the various amusing pastimes - the "Google Whack" game in which you try to find two words that cohabit only one of the search engine's eight billion web pages; or every writer's guilty pleasure, the Googling of the self, the "auto-Google," that delicious act of cyber-onanism.

But where might it lead? One day, when I open my fridge, might a sensor not read my searching eye and know that I am looking for milk? And knowing that I have run out, suggest an array of retailers who might be able to replenish my supply? Could Google come to mediate every exchange of information, no matter how inane, or how carnal?

Or could it come to resemble something like the Central Intelligence Corporation in Neal Stephenson's Snow Crash - a cross between the CIA, the Library of Congress, and DARPA's "Total Information Awareness" program?

MercuryNews.com | 12/14/2004 | Does Google move augur commercialization of libraries?

Posted by ben vershbow at 07:35 PM | Comments (0)
tags: Libraries, Search and the Web , evil , google , internet , library , library_of_congress , neal_stephenson , privacy , search , surveillance , web

books behind bars - the Google library project Post date  12.14.2004, 4:34 PM

How useful will this service be for in-depth research when copyrighted books (which will account for a huge percentage of searchable texts) cannot be fully accessed? In such cases, a person will be able to view only a selection of pages (depending on agreements with publishers), and will find themselves bombarded with a variety of retail options. On a positive note, the search will be able to refer the user to any local libraries where the desired book is available, but still, the focus here remains squarely on digital texts as simply a means of getting to print texts.

Absent a major paradigm shift with regard to the accessibility and inherent virtue of electronic texts, this ambitious project will never achieve its full potential. For someone searching outside the public domain, the Google library project may amount to nothing more than a guided tour through a prison of incarcerated texts. I've found this to be true so far with Google Scholar - it turned up a lot of interesting stuff, but much of it was password protected or required purchase.

article in Filter: Google -- 21st Century Dewey Decimal System (washingtonpost.com)

Posted by ben vershbow at 04:34 PM | Comments (0)
tags: Libraries, Search and the Web , books , copyright , digitization , ebooks , google , google_book_search , google_print , google_scholar , libraries , library

NYPL ebook collection leaves much to be desired Post date  12.10.2004, 1:51 PM

I just checked out two titles from the New York Public Library's ebook catalog, only to learn, to my great astonishment, that those books are now effectively "checked out," and cannot be downloaded again by anyone else until my copies time out.

It boggles the mind that NYPL would go to the trouble of establishing a collection of electronic titles, only to wipe out every advantage offered by digital texts. In fact, they do more than simply keep the ebooks on the level of print; they limit them further than that, since there are generally multiple copies of most print titles in the NYPL system.

The people responsible for this catalog have either entirely failed to grasp the concept of infinitely accessible, screen-based books, or they grasp it all too well and are trying to stunt it at its inception, perhaps out of fear of extinction of the print librarian. More likely, they are under heavy pressure from a paranoid copyright regime. Whatever the reason, the new ebook catalog shows a total lack of imagination and offers almost no tangible benefit for the reader.

Beyond that, the books themselves are poorly designed and unpleasant to read. My downloaded copy of Conrad's Heart of Darkness (which, by the way, I found in the "Romance" section) evidences no more than ten minutes' worth of design work, and appears to be simply a cut-and-pasted ASCII file from Gutenberg with a garish graphic slapped on the cover. My copy of Chain of Command by Seymour Hersh was a bit more respectable – more or less a PDF facsimile of the print edition.

On an amusing note, the "literary criticism" section is populated almost entirely by Cliff's Notes.

Posted by ben vershbow at 01:51 PM | Comments (0)
tags: DRM , Libraries, Search and the Web , books , copyright , design , e-publishing , ebook , ebooks , internet , libraries , library , manhattan , new_york , publishing

3,000 electronic titles at new york public library Post date  12.09.2004, 2:26 AM

This is the third Times article this week on e-books. What's happening?

"Libraries Reach Out, Online"

Posted by ben vershbow at 02:26 AM | Comments (0)
tags: Libraries, Search and the Web , books , ebook , ebooks , library , new_york