Listing entries tagged with Libraries, Search and the Web
google: i'll be your mirror
03.07.2006, 5:25 PM
From notes accidentally published on Google's website, leaked into the blogosphere (though here from the BBC): plans for the GDrive, a mirror of users' hard drives.
With infinite storage, we can house all user files, including e-mails, web history, pictures, bookmarks, etc; and make it accessible from anywhere (any device, any platform, etc).
I just got a shiver -- a keyhole glimpse of where this is headed. Google's stock made a shocking dip last week after its Chief Financial Officer warned investors that growth of its search and advertising business would eventually slow down. The sudden panicked thought: how will Google realize its manifest destiny? You know: "organizing the world's information and making it universally accessible (China notwithstanding) and useful"? How will it continue to feed itself?
Simple: storage.
Google, as it has already begun to do (Gmail, get off my back!), wants to organize our information and make it universally accessible and useful to us. No more worries about backing up data -- Google's got your back. No worries about saving correspondence -- Google's got that. They've got your shoebox of photographs, your file cabinet of old college papers, your bank records, your tax returns. All nicely organized and made incredibly useful.
But as we prepare for the upload of our lives, we might pause to ask: exactly how useful do we want to become?
Posted by ben vershbow at 05:25 PM
tags: Libraries, Search and the Web , Online , google , hard_drive , mirror , privacy , search
RDF = bigger piles
03.06.2006, 4:30 PM
Last week at a meeting of all the Mellon-funded projects I heard a lot of discussion about RDF as a key technology for interoperability. RDF (Resource Description Framework) is a data model for machine-readable metadata and a necessary but not sufficient requirement for the semantic web. On top of this data model you need applications that can read RDF. On top of the applications you need the ability to understand the meaning in the RDF-structured data. This is the really hard part: matching the meaning of two pieces of data from two different contexts still requires human judgement. There are people working on the complex algorithmic gymnastics to make this easier, but so far it's still in the realm of the experimental.
So why pursue RDF? The goal is to make human knowledge, implicit and explicit, machine readable. Not only machine readable, but automatically shareable and reusable by applications that understand RDF. Researchers pursuing the semantic web hope that by precipitating an integrated and interoperable data environment, application developers will be able to innovate in their business logic and provide better services across a range of data sets.
Why is this so hard? Well, partly because the world is so complex, and although RDF is theoretically able to model an entire world's worth of data relationships, doing it seamlessly is just plain hard. You can spend time developing an RDF representation of all the data in your world, then someone else will come along with their own world, with their own set of data relationships. Being naturally friendly, you take in their data and realize that they have a completely different view of the categories "Author," "Creator," "Keywords," etc. Now you have a big, beautiful dataset, with a thousand similar, but not equivalent, pieces. The hard part is determining the relationships among those pieces.
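To make that concrete, here is a minimal sketch in plain Python of two small "worlds" described as subject/predicate/object triples. Everything in it is an assumption made up for illustration: the `book:` identifiers, the `lib:` vocabulary, and the `HUMAN_MAPPING` table are hypothetical, and the `dc:` names only loosely echo Dublin Core. It is not how any particular RDF toolkit works, just the shape of the problem.

```python
# Each RDF-style statement is a (subject, predicate, object) triple.
# All identifiers and prefixes below are illustrative, not from a real dataset.
ours = {
    ("book:1", "dc:creator", "Jane Doe"),
    ("book:1", "dc:subject", "cartography"),
}
theirs = {
    ("book:2", "lib:author", "Doe, Jane"),
    ("book:2", "lib:keywords", "maps; atlases"),
}

# Merging is the easy part: a graph is just a set of triples.
merged = ours | theirs


def objects(graph, predicate):
    """Return every object attached to the given predicate, sorted."""
    return sorted(o for (_, p, o) in graph if p == predicate)


# A query written against one vocabulary silently misses the other.
creators_naive = objects(merged, "dc:creator")

# A person has to decide that lib:author means the same thing as dc:creator.
HUMAN_MAPPING = {"lib:author": "dc:creator", "lib:keywords": "dc:subject"}


def normalize(graph, mapping):
    """Rewrite predicates according to a hand-made equivalence table."""
    return {(s, mapping.get(p, p), o) for (s, p, o) in graph}


creators_mapped = objects(normalize(merged, HUMAN_MAPPING), "dc:creator")
```

The union takes one line; the mapping table is where the human judgement goes. Even after mapping, "Jane Doe" and "Doe, Jane" are still two different strings, which is the "similar, but not equivalent" residue described above.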
We immediately considered how RDF and Sophie would work together. RDF importing/exporting in Sophie could provide value by preparing Sophie for integration with other RDF-capable applications. But, as always, the real work is figuring out what it is that people could do with this data. Helping users derive meaning from a dataset raises the question: what kind of meaning are we trying to help them discover? A universe of linguistic analysis? Literary theory? Historical accuracy? I think a dataset that enabled all of these would be 90% metadata and 10% data. This raises another huge issue: entering semantic metadata requires skill and time, and is therefore relatively rare.
In the end, RDF creates bigger, better piles of data, intact with provenance and other unique characteristics derived from the originating context. This metadata is important information that we'd rather hold on to than irrevocably discard, but it leaves us stuck in a labyrinth of data until we create the tools to guide us out. RDF is ten years old, yet it hasn't achieved the acceptance of other solutions, like XML Schemas or DTDs. They have succeeded because they solve limited problems in restricted ways and require relatively little effort to implement. RDF's promise is that it will solve much larger problems with solutions that have more richness and complexity; but ultimately the act of determining meaning or negotiating interoperability between two systems is still a human function. The undeniable fact remains: it's easy to put everyone's data into RDF, but that just leaves the hard part for last.
Posted by jdwilbur at 04:30 PM
tags: Libraries, Search and the Web , Mellon , RDF , Sophie , interoperability , semantic_web , the_networked_book
yahoo! ui design library
02.16.2006, 7:08 PM
There are several reasons that Yahoo! released some of their core UI code for free. A callous read of this would suggest that they did it to steal back some goodwill from Google (still riding the successful Google API release from 2002). A more charitable soul could suggest that Yahoo! is interested in making the web a better place, not just in their market share. Two things suggest this: the code is available under an open BSD license, and the release includes a library of design patterns. The code is for playing with; the design patterns are for learning from.
The code is squarely aimed at folks like me who would struggle mightily to put together a default library to handle complex interactions in JavaScript using AJAX (all the rage now) while dealing with the intricacies of modern and legacy browsers. Sure, I could pull together the code from different sources, test it, tweak it, break it, tweak it some more, etc. Unsurprisingly, I've never gotten around to it. The Yahoo! code release will literally save me at least a hundred hours. Now I can get right down to designing the interaction, rather than dealing with technology.
The design patterns library is a collection of best practice instructions for dealing with common web UI problems, providing both a solution and a rationale, with a detailed explanation of the interaction/interface feedback. This is something that is more familiar to me, but still stands as a valuable resource. It is a well-documented alternate viewpoint and reminder from a site that serves more users in one day than I'm likely to serve in a year.
Of course Yahoo! is hoping to reclaim some mind-space from Google with developer community goodwill. But since the code is general release, and not brandable in any particular way (it's all under-the-hood kind of stuff), it's a little difficult to see the release as a directly marketable item. It really just seems like a gift to the network, and hopefully one that will bear lovely fruit. It's always heartening to see large corporations opening their products to the public as a way to grease the wheels of innovation.
Posted by jesse wilbur at 07:08 PM
tags: BSD , Libraries, Search and the Web , Remix , design , design_pattern , gift_economy , innovation , interaction , javascript , license , user_interface , y! , yahoo!
who really needs to turn the pages?
02.15.2006, 6:16 PM
The following post comes from my friend Sally Northmore, a writer and designer based in New York who lately has been interested in things like animation, video game theory, and (right up our alley) the materiality of books and their transition to a virtual environment. A couple of weeks ago we were talking about the British Library's rare manuscript digitization project, "Turning the Pages" -- something I'd been meaning to discuss here but never gotten around to doing. It turns out Sally had some interesting thoughts about this so I persuaded her to do a brief write-up of the project for if:book. Which is what follows below. Come to think of it, this is especially interesting when juxtaposed with Bob's post earlier this week on Jefferson Han's amazing gestural interface design. Here's Sally... - Ben
The British Library's collaboration with multimedia impresarios at Armadillo Systems has led to an impressive publishing enterprise, making available electronic 3-D facsimiles of their rare manuscript collection.
"Turning the Pages", available in CD-ROM, online, and kiosk format, presents the digital incarnation of these treasured texts, allowing the reader to virtually "turn" the pages with a touch and drag function, "pore over" texts with a magnification function, and in some cases, access extras such as supplementary notes, textual secrets, and audio accompaniment.

Pages from Mozart's thematic catalogue -- a composition notebook from the last seven years of his life. Allows the reader to listen to works being discussed.
The designers ambitiously mimicked various characteristics of each work in their 3-D computer models. For instance, the shape of a page of vellum turning differs from the shape of a page of paper. It falls at a unique speed according to its weight; it casts a unique shadow. The simulation even allows for a discrepancy in how a page would turn depending on which corner of the page you decide to peel from.
Online visitors can download a library of manuscripts in Shockwave, although these versions are a bit clunkier and don't provide the flashier thrills of the enormous touch-screen kiosks the British Library now houses.

Mercator's first atlas of Europe - 1570s
Online, the "Turning the Pages" application forces you to adapt to the nature of its embodiment--to physically re-learn how to use a book. A hand cursor invites the reader to turn each page with a click-and-drag maneuver of the mouse. Sounds simple enough, but I struggled to get the momentum of the drag just right so that the page actually turned. In a few failed attempts, the page lifted just so... only to fall back into place again. Apparently, if you can master the Carpal Tunnel-inducing rhythm, you can learn to manipulate the page-turning function even further, grabbing multiple pages at once for a faster, abridged read.
Providing high-resolution scans of rare texts to a general public that otherwise might never "touch," say, the Lindisfarne Gospels certainly deserves kudos. Hey, democratic, right? Armadillo Systems provides a list of compelling raisons d'être on their site to this effect. But the content of these texts is already available in reprintable (democratic!) form. Is the virtual page-turning function really necessary for greater understanding of these works, or is it a game of academic scratch-n-sniff?

The "enlarge" function even allows readers to reverse the famous mirror writing in Leonardo da Vinci's notebooks
At the MLA conference in D.C. this past December, where the British Library had set up a demonstration of "Turning the Pages", this was the question most frequently asked of the BL's representative. Who really needs to turn the pages? I learned from the rep's response that, well, nobody does! Scholars are typically more interested in studying the page, and the turning function hasn't proven to enhance or revive scholarly exploration. And surely, the Library enjoyed plenty of biblio-clout and tourist traffic before this program?
But the lure of new, sexy technology can't be underestimated. From what I understood, the techno-factor is an excellent beacon for attracting investors and funding in multimedia technology. Armadillo's web site provides an interesting sales pitch:
By converting your manuscripts to "Turning the Pages" applications you can attract visitors, increase website traffic and add a revenue stream - at the same time as broadening access to your collection and informing and entertaining your audience.
The program reveals itself to be a peculiar exercise, tangled in its insistence on fetishizing aspects of the material body of the text--the weight of vellum, the karat of gold used to illuminate, the shape of the binding. Such detail and love for each material manuscript went into this project to recreate, as best possible, the "feel" of handling these manuscripts.
Under ideal circumstances, what would the minds behind "Turning the Pages" prefer to create? The original form of the text--the "alpha" manuscript--or the virtual incarnation? Does technological advancement seduce us into valuing the near-perfect simulation over the original? Are we more impressed by the clone, the "Dolly" of hoary manuscripts? And, would one argue that "Turning the Pages" is the best proxy for the real thing, or, another "thing" entirely?
Posted by sally northmore at 06:16 PM
tags: Libraries, Search and the Web , book_craft , books , design , design_curmudgeonry , digitization , interface , library , manuscript , museum , preservation , reading , turning_the_pages , user_interface
can there be a compromise on copyright?
02.08.2006, 7:19 AM
The following is a response to a comment made by Karen Schneider on my Monday post on libraries and DRM. I originally wrote this as just another comment, but as you can see, it's kind of taken on a life of its own. At any rate, it seemed to make sense to give it its own space, if for no other reason than that it temporarily sidelined something else I was writing for today. It also has a few good quotes that might be of interest. So, Karen said:
I would turn back to you and ask how authors and publishers can continue to be compensated for their work if a library that would buy ten copies of a book could now buy one. I'm not being reactive, just asking the question--as a librarian, and as a writer.
This is a big question, perhaps the biggest, since economics will define the parameters of much that is being discussed here. How do we move from an old economy of knowledge based on the trafficking of intellectual commodities to a new economy where value is placed not on individual copies of things that, as a result of new technologies, are effortlessly copiable, but rather on access to networks of content and the quality of those networks? The question is brought into particularly stark relief when we talk about libraries, which (correct me if I'm wrong) have always been more concerned with the pure pursuit and dissemination of knowledge than with the economics of publishing.
Consider, as an example, the photocopier -- in many ways a predecessor of the world wide web in that it is designed to deconstruct and multiply documents. Photocopiers were unbundling books in libraries long before there was any such thing as Google Book Search, helping users break through the commodified shell to get at the fruit within.
I know there are some countries in Europe that funnel a share of proceeds from library photocopiers back to the publishers, and this seems to be a reasonably fair compromise. But the role of the photocopier in most libraries of the world is more subversive, gently repudiating, with its low hum, sweeping light, and clackety trays, the idea that there can really be such a thing as intellectual property.
That being said, few would dispute the right of an author to benefit economically from his or her intellectual labor; we just have to ask whether the current system is really serving authors' interests, let alone the public interest. New technologies have released intellectual works from the restraints of tangible property, making them easily accessible, eminently exchangeable and never out of print. This should, in principle, elicit a hallelujah from authors, or at least the many who have written works that, while possessed of intrinsic value, have not succeeded in their role as commodities.
But utopian visions of an intellectual gift economy will ultimately fail to nourish writers who must survive in the here and now of a commercial market. Though peer-to-peer gift economies might turn out in the long run to be financially lucrative, and in unexpected ways, we can't realistically expect everyone to hold their breath and wait for that to happen. So we find ourselves at a crossroads where we must soon choose as a society either to clamp down (to preserve existing business models), liberalize (to clear the field for new ones), or compromise.
In her essay "Books in Time," Berkeley historian Carla Hesse gives a wonderful overview of a similar debate over intellectual property that took place in 18th Century France, when liberal-minded philosophes -- most notably Condorcet -- railed against the state-sanctioned Paris printing monopolies, demanding universal access to knowledge for all humanity. To Condorcet, freedom of the press meant not only freedom from censorship but freedom from commerce, since ideas arise not from men but through men from nature (how can you sell something that is universally owned?). Things finally settled down in France after the revolution and the country (and the West) embarked on a historic compromise that laid the foundations for what Hesse calls "the modern literary system":
The modern "civilization of the book" that emerged from the democratic revolutions of the eighteenth century was in effect a regulatory compromise among competing social ideals: the notion of the right-bearing and accountable individual author, the value of democratic access to useful knowledge, and faith in free market competition as the most effective mechanism of public exchange.
Barriers to knowledge were lowered. A system of limited intellectual property rights was put in place that incentivized production and elevated the status of writers. And by and large, the world of ideas flourished within a commercial market. But the question remains: can we reach an equivalent compromise today? And if so, what would it look like?
Creative Commons has begun to nibble around the edges of the problem, but love it as we may, it does not fundamentally alter the status quo, focusing as it does primarily on giving creators more options within the existing copyright system.
Which is why free software guru Richard Stallman announced in an interview the other day his unqualified opposition to the Creative Commons movement, explaining that while some of its licenses meet the standards of open source, others are overly conservative, rendering the project bunk as a whole. For Stallman, ever the iconoclast, it's all or nothing.
But returning to our theme of compromise, I'm struck again by this idea of a tax on photocopiers, which suggests a kind of micro-economy where payments are made automatically and seamlessly in proportion to a work's use. Someone who has done a great deal of thinking about such a solution (though on a much more ambitious scale than library photocopiers) is Terry Fisher, an intellectual property scholar at Harvard who has written extensively on practicable alternative copyright models for the music and film industries (Ray and I first encountered Fisher's work when we heard him speak at the Economics of Open Content Symposium at MIT last month).
The following is an excerpt from Fisher's 2004 book, "Promises to Keep: Technology, Law, and the Future of Entertainment", that paints a relatively detailed picture of what one alternative copyright scheme might look like. It's a bit long, and as I mentioned, deals specifically with the recording and movie industries, but it's worth reading in light of this discussion since it seems it could just as easily apply to electronic books:
....we should consider a fundamental change in approach.... replace major portions of the copyright and encryption-reinforcement models with a variant of....a governmentally administered reward system. In brief, here's how such a system would work. A creator who wished to collect revenue when his or her song or film was heard or watched would register it with the Copyright Office. With registration would come a unique file name, which would be used to track transmissions of digital copies of the work. The government would raise, through taxes, sufficient money to compensate registrants for making their works available to the public. Using techniques pioneered by American and European performing rights organizations and television rating services, a government agency would estimate the frequency with which each song and film was heard or watched by consumers. Each registrant would then periodically be paid by the agency a share of the tax revenues proportional to the relative popularity of his or her creation. Once this system were in place, we would modify copyright law to eliminate most of the current prohibitions on unauthorized reproduction, distribution, adaptation, and performance of audio and video recordings. Music and films would thus be readily available, legally, for free.
Painting with a very broad brush...., here would be the advantages of such a system. Consumers would pay less for more entertainment. Artists would be fairly compensated. The set of artists who made their creations available to the world at large--and consequently the range of entertainment products available to consumers--would increase. Musicians would be less dependent on record companies, and filmmakers would be less dependent on studios, for the distribution of their creations. Both consumers and artists would enjoy greater freedom to modify and redistribute audio and video recordings. Although the prices of consumer electronic equipment and broadband access would increase somewhat, demand for them would rise, thus benefiting the suppliers of those goods and services. Finally, society at large would benefit from a sharp reduction in litigation and other transaction costs.
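The payout step in Fisher's scheme is simple arithmetic, and a rough sketch may help show what "proportional to the relative popularity" would mean in practice. The function name `allocate`, the work titles, the play counts, and the size of the pool are all made up for illustration; nothing here comes from Fisher's book beyond the proportional rule itself.

```python
def allocate(pool, plays):
    """Split a fixed revenue pool in proportion to each work's estimated plays.

    `pool` is the tax money available for the period; `plays` maps a
    registered work to its estimated number of plays. Both are hypothetical
    inputs for this sketch.
    """
    total = sum(plays.values())
    if total == 0:
        # Nothing was played this period, so nothing is paid out.
        return {work: 0.0 for work in plays}
    return {work: pool * count / total for work, count in plays.items()}


# Illustrative figures only.
payouts = allocate(1_000_000, {"song-a": 600, "film-b": 300, "song-c": 100})
```

With these made-up numbers, the work with 60% of the plays receives 60% of the pool. The hard part in a real system would be estimating the play counts, which the sketch simply takes as given.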
While I'm uncomfortable with the idea of any top-down, governmental solution, this certainly provides food for thought.
Posted by ben vershbow at 07:19 AM
tags: Copyright and Copyleft , DRM , IP , Libraries, Search and the Web , Publishing, Broadcast, and the Press , condorcet , copyleft , copyright , creative_commons , enlightenment , france , free_software , intellectual_property , libraries , music , open_source , photocopy , printing , richar_stallman , xerox
DRM and the damage done to libraries
02.06.2006, 7:51 AM
A recent BBC article draws attention to widespread concerns among UK librarians (concerns I know are shared by librarians and educators on this side of the Atlantic) regarding the potentially disastrous impact of digital rights management on the long-term viability of electronic collections. At present, when downloads represent only a tiny fraction of most libraries' circulation, DRM is more of a nuisance than a threat. At the New York Public Library, for instance, only one "copy" of each downloadable ebook or audio book title can be "checked out" at a time -- a frustrating policy that all but cancels out the value of its modest digital collection. But the implications further down the road, when an increasing portion of library holdings will be non-physical, are far more grave.
What these restrictions in effect do is place locks on books, journals and other publications -- locks for which there are generally no keys. What happens, for example, when a work passes into the public domain but its code restrictions remain intact? Or when materials must be converted to newer formats but can't be extracted from their original files? The question we must ask is: how can librarians, now or in the future, be expected to effectively manage, preserve and update their collections in such straitjacketed conditions?
This is another example of how the prevailing copyright fundamentalism threatens to constrict the flow and preservation of knowledge for future generations. I say "fundamentalism" because the current copyright regime in this country is radical and unprecedented in its scope, yet traces its roots back to the initially sound concept of limited intellectual property rights as an incentive to production, which, in turn, stemmed from the Enlightenment idea of an author's natural rights. What was originally granted (hesitantly) as a temporary, statutory limitation on the public domain has spun out of control into a full-blown culture of intellectual control that chokes the flow of ideas through society -- the very thing copyright was supposed to promote in the first place.
If we don't come to our senses, we seem destined for a new dark age where every utterance must be sanctioned by some rights holder or licensing agent. Free thought isn't possible, after all, when every thought is taxed. In his "An Answer to the Question: What is Enlightenment?" Kant condemns as criminal any contract that compromises the potential of future generations to advance their knowledge. He's talking about the church, but this can just as easily be applied to the information monopolists of our times and their new tool, DRM, which, in its insidious way, is a kind of contract (though one that is by definition non-negotiable since enforced by a machine):
But would a society of pastors, perhaps a church assembly or venerable presbytery (as those among the Dutch call themselves), not be justified in binding itself by oath to a certain unalterable symbol in order to secure a constant guardianship over each of its members and through them over the people, and this for all time: I say that this is wholly impossible. Such a contract, whose intention is to preclude forever all further enlightenment of the human race, is absolutely null and void, even if it should be ratified by the supreme power, by parliaments, and by the most solemn peace treaties. One age cannot bind itself, and thus conspire, to place a succeeding one in a condition whereby it would be impossible for the later age to expand its knowledge (particularly where it is so very important), to rid itself of errors, and generally to increase its enlightenment. That would be a crime against human nature, whose essential destiny lies precisely in such progress; subsequent generations are thus completely justified in dismissing such agreements as unauthorized and criminal.
We can only hope that subsequent generations prove more enlightened than those presently in charge.
Posted by ben vershbow at 07:51 AM
tags: Copyright and Copyleft , DRM , IP , Libraries, Search and the Web , books , copyright , digital , digitization , ebooks , enlightenment , fundamentalism , intellectual_property , kant , libraries , library , philosophy , public_domain , scholarship
google gets mid-evil
01.30.2006, 3:46 PM
At the World Economic Forum in Davos last Friday, Google CEO Eric Schmidt assured a questioner in the audience that his company had in fact thoroughly searched its soul before deciding to roll out a politically sanitized search engine in China:
We concluded that although we weren't wild about the restrictions, it was even worse to not try to serve those users at all... We actually did an evil scale and decided not to serve at all was worse evil.
(via Ditherati)
Posted by ben vershbow at 03:46 PM
tags: Libraries, Search and the Web , Network_Freedom , censorship , china , evil , free_speech , google , internet , search , web





