Listing entries tagged with google


sober thoughts on google: privatization and privacy Post date  11.30.2005, 8:18 AM

Siva Vaidhyanathan has written an excellent essay for the Chronicle of Higher Education on the "risky gamble" of Google's book-scanning project -- some of the most measured, carefully considered comments I've yet seen on the issue. His concerns are not so much for the authors and publishers that have filed suit (on the contrary, he believes they are likely to benefit from Google's service), but for the general public and the future of libraries. Outsourcing to a private company the vital task of digitizing collections may prove to have been a grave mistake on the part of Google's partner libraries. Siva:

The long-term risk of privatization is simple: Companies change and fail. Libraries and universities last.....Libraries should not be relinquishing their core duties to private corporations for the sake of expediency. Whichever side wins in court, we as a culture have lost sight of the ways that human beings, archives, indexes, and institutions interact to generate, preserve, revise, and distribute knowledge. We have become obsessed with seeing everything in the universe as "information" to be linked and ranked. We have focused on quantity and convenience at the expense of the richness and serendipity of the full library experience. We are making a tremendous mistake.

This essay contains in abundance what has largely been missing from the Google books debate: intellectual courage. Vaidhyanathan, an intellectual property scholar and "avowed open-source, open-access advocate," easily could have gone the predictable route of scolding the copyright conservatives and spreading the Google gospel. But he manages to see the big picture beyond the intellectual property concerns. This is not just about economics, it's about knowledge and the public interest.

What irks me about the usual debate is that it forces you into a position of either resisting Google or being its apologist. But this fails to get at the real bind we all are in: the fact that Google provides invaluable services and yet is amassing too much power; that a private company is creating a monopoly on public information services. Sooner or later, there is bound to be a conflict of interest. That is where we, the Google-addicted public, are caught. It's more complicated than hip versus square, or good versus evil.

Here's another good piece on Google. On Monday, The New York Times ran an editorial by Adam Cohen that nicely lays out the privacy concerns:

Google says it needs the data it keeps to improve its technology, but it is doubtful it needs so much personally identifiable information. Of course, this sort of data is enormously valuable for marketing. The whole idea of "Don't be evil," though, is resisting lucrative business opportunities when they are wrong. Google should develop an overarching privacy theory that is as bold as its mission to make the world's information accessible - one that can become a model for the online world. Google is not necessarily worse than other Internet companies when it comes to privacy. But it should be doing better.

Two graduate students at Stanford in the mid-90s recognized that search engines would be the most important tools for dealing with the incredible flood of information that was then beginning to swell, so they started indexing web pages and working on algorithms. But as the company has grown, Google's admirable-sounding mission statement -- "to organize the world's information and make it universally accessible and useful" -- has become its manifest destiny, and "information" can now encompass the most private of territories.

At one point it simply meant search results -- the answers to our questions. But now it's the questions as well. Google is keeping a meticulous record of our clickstreams, piecing together an enormous database of queries, refining its search algorithms and, some say, even building a massive artificial brain (more on that later). What else might they do with all this personal information? To date, all of Google's services are free, but there may be a hidden cost.

"Don't be evil" may be the company motto, but with its IPO earlier this year, Google adopted a new ideology: they are now a public corporation. If web advertising (their sole source of revenue) levels off, then investors currently high on $400+ shares will start clamoring for Google to maintain profits. "Don't be evil to us!" they will cry. And what will Google do then?

images: New York Public Library reading room by Kalloosh via Flickr; archive of the original Google page

Posted by ben vershbow at 8:18 AM | Comments (7)
tags: Copyright and Copyleft , Libraries, Search and the Web , books , copyright , ethics , google , google_book_search , google_print , intellectual_property , libraries , library , literature , privacy , publishing , university

flushing the net down the tubes Post date  11.29.2005, 8:11 AM

Grand theories about upheavals on the internet horizon are in ready supply. Singularities are near. Explosions can be expected in the next six to eight months. Or the whole thing might just get "flushed" down the tubes. This last scenario is described at length in a recent essay in Linux Journal by Doc Searls, which predicts the imminent hijacking of the net by phone and cable companies who will turn it into a top-down, one-way broadcast medium. In other words, the net's utopian moment, the "read/write" web, may be about to end. Reading Searls' piece, I couldn't help thinking about the story of radio and a wonderful essay Brecht wrote on the subject in 1932:

Here is a positive suggestion: change this apparatus over from distribution to communication. The radio would be the finest possible communication apparatus in public life, a vast network of pipes. That is to say, it would be if it knew how to receive as well as to transmit, how to let the listener speak as well as hear, how to bring him into a relationship instead of isolating him. On this principle the radio should step out of the supply business and organize its listeners as suppliers....turning the audience not only into pupils but into teachers.

Unless you're the military, law enforcement, or a short-wave hobbyist, two-way radio never happened. On the mainstream commercial front, radio has always been about broadcast: a one-way funnel. The big FM tower to the many receivers, "prettifying public life," as Brecht puts it. Radio as an agitation? As an invitation to a debate, rousing families from the dinner table into a critical encounter with their world? Well, that would have been neat.

Now there's the internet, a two-way, every-which-way medium -- a stage of stages -- that would have positively staggered a provocateur like Brecht. But although the net may be a virtual place, it's built on some pretty actual stuff. Copper wire, fiber optic cable, trunks, routers, packets -- "the vast network of pipes." The pipes are owned by the phone and cable companies -- the broadband providers -- and these guys expect a big return (far bigger than they're getting now) on the billions they've invested in laying down the plumbing. Searls:

The choke points are in the pipes, the permission is coming from the lawmakers and regulators, and the choking will be done....The carriers are going to lobby for the laws and regulations they need, and they're going to do the deals they need to do. The new system will be theirs, not ours....The new carrier-based Net will work in the same asymmetrical few-to-many, top-down pyramidal way made familiar by TV, radio, newspapers, books, magazines and other Industrial Age media now being sucked into Information Age pipes. Movement still will go from producers to consumers, just like it always did.

If Brecht were around today I'm sure he would have already written (or blogged) to this effect, no doubt reciting the sad fate of radio as a cautionary tale. Watch the pipes, he would say. If companies talk about "broad" as in "broadband," make sure they're talking about both ends of the pipe. The way broadband works today, the pipe running into your house dwarfs the one running out. That means more download and less upload, and it's paving the way for a content delivery platform every bit as powerful as cable on an infinitely broader band. Data storage, domain hosting -- anything you put up there -- will be increasingly costly, though there will likely remain plenty of chat space and web mail provided for free, anything that allows consumers to fire their enthusiasm for commodities through the synapse chain.

If the net goes the way of radio, that will be the difference (allow me to indulge in a little dystopia). Imagine a classic Philco cathedral radio but with a few little funnel-ended hoses extending from the side that connect you to other listeners. "Tune into this frequency!" "You gotta hear this!" You whisper recommendations through the tube. It's sending a link. Viral marketing. Yes, the net will remain two-way to the extent that it helps fuel the market. Web browsers, like the old Philco, would essentially be receivers, enabling participation only to the extent that it encouraged others to receive.

You might even get your blog hosted for free if you promote products -- a sports shoe with gelatinous heels or a music video that allows you to undress the dancing girls with your mouse. Throw in some political rants in between to blow off some steam, no problem. That's entrepreneurial consumerism. Make a living out of your appetites and your ability to make them infectious. Hip recommenders can build a cosy little livelihood out of their endorsements. But any non-consumer activity will be more like amateur short-wave radio: a mildly eccentric (and expensive) hobby (and they'll even make a saccharine movie about a guy communing with his dead firefighter dad through a ghost blog).

Searls sees it as above all a war of language and metaphor. The phone and cable companies will dominate as long as the internet is understood fundamentally as a network of pipes, a kind of information transport system. This places the carriers at the top of the hierarchy -- the highway authority setting the rules of the road and collecting the tolls. So far the carriers have managed, through various regulatory wrangling and court rulings, to ensure that the "transport metaphor" has prevailed.

But obviously the net is much more than the sum of its pipes. It's a public square. It's a community center. It's a market. And it's the biggest publishing system the world has ever known. Searls wants to promote "place metaphors" like these. Sure, unless you're a lobbyist for Verizon or SBC, you probably already think of it this way. But in the end it's the lobbyists that will make all the difference. Unless, that is, an enlightened citizens' lobby begins making some noise. So a broad, broad as in broadband, public conversation should be in order. Far broader than what goes on in the usual progressive online feedback loops -- the Linux and open source communities, the creative commies, and the techno-hip blogosphere, that I'm sure are already in agreement about this.

Google also seems to have an eye on the pipes, reportedly having bought thousands of miles of "dark fiber" -- pipe that has been laid but is not yet in use. Some predict a nationwide "Googlenet." But this can of worms is best saved for another post.

Posted by ben vershbow at 8:11 AM | Comments (3)
tags: brecht , broadband , broadcast , cable , fiber , google , internet , linux , media , net , net_neutrality , radio , short_wave , telecom , telephone , tubes , utopia , verizon , web

virtual libraries, real ones, empires Post date  11.28.2005, 12:36 PM

Last Tuesday, a Washington Post editorial written by Library of Congress librarian James Billington outlined the possible benefits of a World Digital Library, a proposed LOC endeavor discussed last week in a post by Ben Vershbow. Billington seemed to imagine the library as sort of a United Nations of information: claiming that "deep conflict between cultures is fired up rather than cooled down by this revolution in communications," he argued that a US-sponsored, globally inclusive digital library could serve to promote harmony over conflict:

Libraries are inherently islands of freedom and antidotes to fanaticism. They are temples of pluralism where books that contradict one another stand peacefully side by side just as intellectual antagonists work peacefully next to each other in reading rooms. It is legitimate and in our nation's interest that the new technology be used internationally, both by the private sector to promote economic enterprise and by the public sector to promote democratic institutions. But it is also necessary that America have a more inclusive foreign cultural policy -- and not just to blunt charges that we are insensitive cultural imperialists. We have an opportunity and an obligation to form a private-public partnership to use this new technology to celebrate the cultural variety of the world.

What's interesting about this quote (among other things) is that Billington seems to be suggesting that a World Digital Library would function in much the same manner as a real-world library, and yet he's also arguing for the importance of actual physical proximity. He writes, after all, about books literally, not virtually, touching each other, and about researchers meeting up in a shared reading room. There seems to be a tension here, in other words, between Billington's embrace of the idea of a world digital library, and a real anxiety about what a "library" becomes when it goes online.

I also feel like there's some tension here -- in Billington's editorial and in the whole World Digital Library project -- between "inclusiveness" and "imperialism." Granted, if the United States provides Brazilians access to their own national literature online, this might be used by some as an argument against the idea that we are "insensitive cultural imperialists." But there are many varieties of empire: indeed, as many have noted, the sun stopped setting on Google's empire a while ago.

To be clear, I'm not attacking the idea of the World Digital Library. Having watched the Smithsonian invest in, and waffle on, some of their digital projects, I'm all for a sustained commitment to putting more material online. But there needs to be some careful consideration of the differences between physical libraries and virtual ones -- as well as a bit more discussion of just what a privately-funded digital library might eventually morph into.

Posted by lisa lynch at 12:36 PM | Comments (0)
tags: Libraries, Search and the Web , cultural , digital , google , imperialism , internet , libraries

explosion Post date  11.22.2005, 2:10 PM

A Nov. 18 post on Adam Green's Darwinian Web makes the claim that the web will "explode" (does he mean implode?) over the next year. According to Green, RSS feeds will render many websites obsolete:

The explosion I am talking about is the shifting of a website's content from internal to external. Instead of a website being a "place" where data "is" and other sites "point" to, a website will be a source of data that is in many external databases, including Google. Why "go" to a website when all of its content has already been absorbed and remixed into the collective datastream.

Does anyone agree with Green? Will feeds bring about the restructuring of "the way content is distributed, valued and consumed?" More on this here.

Posted by lisa lynch at 2:10 PM | Comments (5)
tags: Libraries, Search and the Web , Online , Publishing, Broadcast, and the Press , RSS , blogging , blogs , darwin , darwinism , google , internet , singularity , syndication , web , xml

world digital library Post date  11.22.2005, 7:41 AM

The Library of Congress has announced plans for the creation of a World Digital Library, "a shared global undertaking" that will make a major chunk of its collection freely available online, along with contributions from other national libraries around the world. From The Washington Post:

...[the] goal is to bring together materials from the United States and Europe with precious items from Islamic nations stretching from Indonesia through Central and West Africa, as well as important materials from collections in East and South Asia.

Google has stepped forward as the first corporate donor, pledging $3 million to help get operations underway. At this point, there doesn't appear to be any direct connection to Google's Book Search program, though Google has been working with LOC to test and refine its book-scanning technology.

Posted by ben vershbow at 7:41 AM | Comments (0)
tags: Libraries, Search and the Web , books , digital , google , library , library_of_congress , literature , preservation , scanning

google print is no more Post date  11.18.2005, 8:06 AM

Not the program, of course, just the name. From now on it is to be known as Google Book Search. "Print" obviously struck a little too close to home with publishers and authors. On the company blog, they explain the shift in emphasis:

No, we don't think that this new name will change what some folks think about this program. But we do believe it will help a lot of people understand better what we're doing. We want to make all the world's books discoverable and searchable online, and we hope this new name will help keep everyone focused on that important goal.

Posted by ben vershbow at 8:06 AM | Comments (1)
tags: Libraries, Search and the Web , books , copyright , google , google_book_search , google_print , publishing , search

all your base are belong to google Post date  11.16.2005, 7:04 AM

Google Base is live and ready for our stuff.

In AP: "New Project Will Expand Google's Reach"

Posted by ben vershbow at 7:04 AM | Comments (0)
tags: Online , advertising , classifieds , craigslist , ebay , etail , google , google_base , search , web

the book in the network - masses of metadata Post date  11.15.2005, 6:42 PM

In this weekend's Boston Globe, David Weinberger delivers the metadata angle on Google Print:

...despite the present focus on who owns the digitized content of books, the more critical battle for readers will be over how we manage the information about that content -- information that's known technically as metadata.

...we're going to need massive collections of metadata about each book. Some of this metadata will come from the publishers. But much of it will come from users who write reviews, add comments and annotations to the digital text, and draw connections between, for example, chapters in two different books.

As the digital revolution continues, and as we generate more and more ways of organizing and linking books -- integrating information from publishers, libraries and, most radically, other readers -- all this metadata will not only let us find books, it will provide the context within which we read them.

The book in the network is a barnacled spirit, carrying with it the sum of its various accretions. Each book is also its own library by virtue not only of what it links to itself, but of what its readers are linking to, of what its readers are reading. Each book is also a milk crate of earlier drafts. It carries its versions with it. A lot of weight for something physically weightless.

Posted by ben vershbow at 6:42 PM | Comments (0)
tags: ISBN , Libraries, Search and the Web , books , ebook , electronic_literature , folksonomy , google , google_print , hypertext , library , literature , marginalia , metadata , social_software , tagging , weinberger

having browsed google print a bit more... Post date  11.14.2005, 4:53 AM

...I realize I was over-hasty in dismissing the recent additions made since book scanning resumed earlier this month. True, many of the fine wines in the cellar are there only for the tasting, but the vintage stuff can be drunk freely, and there are already some wonderful 19th century titles, at this point mostly from Harvard. The surest way to find them is to search by date, or by title and date. Specify a date range in advanced search or simply enter, for example, "date: 1890" and a wealth of fully accessible texts comes up, any of which can be linked to from a syllabus. An astonishing resource for teachers and students.

The conclusion: Google Print really is shaping up to be a library, that is, of the world pre-1923 -- the current line of demarcation between copyright and the public domain. It's a stark reminder of how over-extended copyright is. Here's an 1899 English printing of The Mahabharata:

mahabharata.jpg

A charming detail found on the following page is this old Harvard library stamp that got scanned along with the rest:

mahabharata harvard stamp.jpg

Posted by ben vershbow at 4:53 AM | Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web , OCR , copyright , ebook , fair_use , google , google_print , library , mahabharata , scan

pages à la carte Post date  11.04.2005, 7:20 AM

The New York Times reports on programs being developed by both Amazon and Google that would allow readers to purchase online access to specific sections of books -- say, a single recipe from a cookbook, an individual chapter from a how-to manual, or a particular short story or poem from an anthology. Such a system would effectively "unbind" books into modular units that consumers patch into their online reading, just as iTunes blew apart the integrity of the album and made digital music all about playlists. We become scrapbook artists.

It seems Random House is in on this too, developing a micropayment model and consulting closely with the two internet giants. Pages would sell for anywhere between five and 25 cents each.

Posted by ben vershbow at 7:20 AM | Comments (1)
tags: Publishing, Broadcast, and the Press , Transliteracies , amazon , books , e-commerce , google , google_print , literature , media_consumption , publishing , randomhouse , reading

google print's not-so-public domain Post date  11.03.2005, 4:16 PM

Google's first batch of public domain book scans is now online, representing a smattering of classics and curiosities from the collections of libraries participating in Google Print. Essentially snapshots of books, they're not particularly comfortable to read, but they are keyword-searchable and, since no copyright applies, fully accessible.

The problem is, there really isn't all that much there. Google's gotten a lot of bad press for its supposedly cavalier attitude toward copyright, but spend a few minutes browsing Google Print and you'll see just how publisher-centric the whole affair is. The idea of a text being in the public domain really doesn't amount to much if you're only talking about antique manuscripts, and these are the only books that they've made fully accessible. Daisy Miller's copyright expired long ago but, with the exception of Harvard's illustrated 1892 copy, all the available scanned editions are owned by modern publishers and are therefore only snippeted. This is not an online library, it's a marketing program. Google Print will undeniably have its uses, but we shouldn't confuse it with a library.

(An interesting offering from the stacks of the New York Public Library is this mid-19th century biographic registry of the wealthy burghers of New York: "Capitalists whose wealth is estimated at one hundred thousand dollars and upwards...")

Posted by ben vershbow at 4:16 PM | Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web , OCR , books , copyright , ebook , google , google_print , library , literature , public_domain , scan