Listing entries tagged with google_print
the book is reading you
01.19.2006, 1:42 PM
I just noticed that Google Book Search requires users to be logged in on a Google account to view pages of copyrighted works.
They provide the following explanation:
Why do I have to log in to see certain pages?Because many of the books in Google Book Search are still under copyright, we limit the amount of a book that a user can see. In order to enforce these limits, we make some pages available only after you log in to an existing Google Account (such as a Gmail account) or create a new one. The aim of Google Book Search is to help you discover books, not read them cover to cover, so you may not be able to see every page you're interested in.
So they're tracking how much we've looked at and capping our number of page views. Presumably a bone tossed to publishers, who I'm sure will continue suing Google all the same (more on this here). There's also the possibility that publishers have requested information on who's looking at their books -- geographical breakdowns and stats on click-throughs to retailers and libraries. I doubt, though, that Google would share this sort of user data. Substantial privacy issues aside, that's valuable information they want to keep for themselves.
That's because "the aim of Google Book Search" is also to discover who you are. It's capturing your clickstreams, analyzing what you've searched and the terms you've used to get there. The book is reading you. Substantial privacy issues aside, (it seems more and more that's where we'll be leaving them) Google will use this data to refine Google's search algorithms and, who knows, might even develop some sort of personalized recommendation system similar to Amazon's -- you know, where the computer lists other titles that might interest you based on what you've read, bought or browsed in the past (a system that works only if you are logged in). It's possible Google is thinking of Book Search as the cornerstone of a larger venture that could compete with Amazon.
There are many ways Google could eventually capitalize on its books database -- that is, beyond the contextual advertising that is currently its main source of revenue. It might turn the scanned texts into readable editions, hammer out licensing agreements with publishers, and become the world's biggest ebook store. It could start a print-on-demand service -- a Xerox machine on steroids (and the return of Google Print?). It could work out deals with publishers to sell access to complete online editions -- a searchable text to go along with the physical book -- as Amazon announced it will do with its Upgrade service. Or it could start selling sections of books -- individual pages, chapters etc. -- as Amazon has also planned to do with its Pages program.
Amazon has long served as a valuable research tool for books in print, so much so that some university library systems are now emulating it. Recent additions to the Search Inside the Book program such as concordances, interlinked citations, and statistically improbable phrases (where distinctive terms in the book act as machine-generated tags) are especially fun to play with. Although first and foremost a retailer, Amazon feels more and more like a search system every day (and its A9 engine, though seemingly always on the back burner, is also developing some interesting features). On the flip side Google, though a search system, could start feeling more like a retailer. In either case, you'll have to log in first.
Posted by ben vershbow at 01:42 PM
| Comments (5)
tags: Copyright and Copyleft , Libraries, Search and the Web , POD , amazon , books , e-commerce , e-publishing , ebooks , google , google_book_search , google_print , internet , print_on_demand , privacy , publishing , search , web
google libraries podcast now available
12.07.2005, 11:33 AM
In case you missed Open Source's Monday hour on Google Book Search... Listen here. Podcast RSS here. Show summary here.
Posted by ben vershbow at 11:33 AM
| Comments (1)
tags: Libraries, Search and the Web , ebook , google , google_book_search , google_print , library , podcast , publishing
google on the air
12.06.2005, 12:34 AM
Open Source's hour on the Googlization of libraries was refreshingly light on the copyright issue and heavier on questions about research, reading, the value of libraries, and the public interest. With its book-scanning project, Google is a private company taking on the responsibilities of a public utility, and Siva Vaidhyanathan came down hard on one of the company's chief legal reps for the mystery shrouding their operations (scanning technology, algorithms and ranking system are all kept secret). The rep reasonably replied that Google is not the only digitization project in town and that none of its library partnerships are exclusive. But most of his points were pretty obvious PR boilerplate about Google's altruism and gosh darn love of books. Hearing the counsel's slick defense, your gut tells you it's right to be suspicious of Google and to keep demanding more transparency, clearer privacy standards and so on. If we're going to let this much information come into the hands of one corporation, we need to be very active watchdogs.
Our friend Karen Schneider then joined the fray and as usual brought her sage librarian's perspective. She's thrilled by the possibilities of Google Book Search, seeing as it solves the fundamental problem of library science: that you can only search the metadata, not the texts themselves. But her enthusiasm is tempered by concerns about privatization similar to Siva's and a conviction that a research service like Google can never replace good librarianship and good physical libraries. She also took issue with the fact that Book Search doesn't link to other library-related search services like Open Worldcat. She has her own wrap-up of the show on her blog.
Rounding out the discussion was Matthew G. Kirschenbaum, a cybertext studies blogger and professor of english at the University of Maryland. Kirschenbaum addressed the question of how Google, and the web in general, might be changing, possibly eroding, our reading practices. He nicely put the question in perspective, suggesting that scattershot, inter-textual, "snippety" reading is in fact the older kind of reading, and that the idea of sustained, deeply immersed involvement with a single text is largely a romantic notion tied to the rise of the novel in the 18th century.
A satisfying hour, all in all, of the sort we should be having more often. It was fun brainstorming with Brendan Greeley, the Open Source on "blogger-in-chief," on how to put the show together. Their whole bit about reaching out to the blogosphere for ideas and inspiration isn't just talk. They put their money where their mouth is. I'll link to the podcast when it becomes available.
image: Real Gabinete Português de Literatura, Rio de Janeiro - Claudio Lara via Flickr
Posted by ben vershbow at 12:34 AM
| Comments (2)
tags: Libraries, Search and the Web , copyright , digitization , ebook , google , google_book_search , google_print , library , literature , metadata , reading , search
thinking about google books: tonight at 7 on radio open source
12.05.2005, 4:58 PM
While visiting the Experimental Television Center in upstate New York this past weekend, Lisa found a wonderful relic in a used book shop in Owego, NY -- a small, leatherbound volume from 1962 entitled "Computers," which IBM used to give out as a complimentary item. An introductory note on the opening page reads:
The machines do not think -- but they are one of the greatest aids to the men who do think ever invented! Calculations which would take men thousands of hours -- sometimes thousands of years -- to perform can be handled in moments, freeing scientists, technicians, engineers, businessmen, and strategists to think about using the results.
This echoes Vannevar Bush's seminal 1945 essay on computing and networked knowledge, "As We May Think", which more or less prefigured the internet, web search, and now, the migration of print libraries to the world wide web. Google Book Search opens up fantastic possibilities for research and accessibility, enabling readers to find in seconds what before might have taken them hours, days or weeks. Yet it also promises to transform the very way we conceive of books and libraries, shaking the foundations of major institutions. Will making books searchable online give us more time to think about the results of our research, or will it change the entire way we think? By putting whole books online do we begin the steady process of disintegrating the idea of the book as a bounded whole and not just a sequence of text in a massive database?
The debate thus far has focused too much on the legal ramifications -- helped in part by a couple of high-profile lawsuits from authors and publishers -- failing to take into consideration the larger cognitive, cultural and institutional questions. Those questions will hopefully be given ample air time tonight on Radio Open Source.
Tune in at 7pm ET on local public radio or stream live over the web. The show will also be available later in the week as a podcast.
Posted by ben vershbow at 04:58 PM
| Comments (0)
tags: Libraries, Search and the Web , books , copyright , ebook , google , google_book_search , google_print , library , literature , radio , research , university
google print on deck at radio open source
12.01.2005, 8:07 AM
Open Source, the excellent public radio program (not to be confused with "Open Source Media") that taps into the blogosphere to generate its shows, has been chatting with me about putting together an hour on the Google library project. Open Source is a unique hybrid, drawing on the best qualities of the blogosphere -- community, transparency, collective wisdom -- to produce an otherwise traditional program of smart talk radio. As host Christopher Lydon puts it, the show is "fused at the brain stem with the world wide web." Or better, it "uses the internet to be a show about the world."
The Google show is set to air live this evening at 7pm (ET) (they also podcast). It's been fun working with them behind the scenes, trying to figure out the right guests and questions for the ideal discussion on Google and its bookish ambitions. My exchange has been with Brendan Greeley, the Radio Open Source "blogger-in-chief" (he's kindly linked to us today on their site). We agreed that the show should avoid getting mired in the usual copyright-focused news peg -- publishers vs. Google etc. -- and focus instead on the bigger questions. At my suggestion, they've invited Siva Vaidhyanathan, who wrote the wonderful piece in the Chronicle of Higher Ed. that I talked about yesterday (see bigger questions). I've also recommended our favorite blogger-librarian, Karen Schneider (who has appeared on the show before), science historian George Dyson, who recently wrote a fascinating essay on Google and artificial intelligence, and a bunch of cybertext studies people: Matthew G. Kirschenbaum, N. Katherine Hayles, Jerome McGann and Johanna Drucker. If all goes well, this could end up being a very interesting hour of discussion. Stay tuned.
UPDATE: Open Source just got a hold of Nicholas Kristof to do an hour this evening on Genocide in Sudan, so the Google piece will be pushed to next week.
Posted by ben vershbow at 08:07 AM
| Comments (0)
tags: Libraries, Search and the Web , Online , copyright , google , google_book_search , google_print , library , open_source , podcast , publishing , radio , radio_open_source , search , web
sober thoughts on google: privatization and privacy
11.30.2005, 8:18 AM
Siva Vaidhyanathan has written an excellent essay for the Chronicle of Higher Education on the "risky gamble" of Google's book-scanning project -- some of the most measured, carefully considered comments I've yet seen on the issue. His concerns are not so much for the authors and publishers that have filed suit (on the contrary, he believes they are likely to benefit from Google's service), but for the general public and the future of libraries. Outsourcing to a private company the vital task of digitizing collections may prove to have been a grave mistake on the part of Google's partner libraries. Siva:
The long-term risk of privatization is simple: Companies change and fail. Libraries and universities last.....Libraries should not be relinquishing their core duties to private corporations for the sake of expediency. Whichever side wins in court, we as a culture have lost sight of the ways that human beings, archives, indexes, and institutions interact to generate, preserve, revise, and distribute knowledge. We have become obsessed with seeing everything in the universe as "information" to be linked and ranked. We have focused on quantity and convenience at the expense of the richness and serendipity of the full library experience. We are making a tremendous mistake.
This essay contains in abundance what has largely been missing from the Google books debate: intellectual courage. Vaidhyanathan, an intellectual property scholar and "avowed open-source, open-access advocate," easily could have gone the predictable route of scolding the copyright conservatives and spreading the Google gospel. But he manages to see the big picture beyond the intellectual property concerns. This is not just about economics, it's about knowledge and the public interest.
What irks me about the usual debate is that it forces you into a position of either resisting Google or being its apologist. But this fails to get at the real bind we all are in: the fact that Google provides invaluable services and yet is amassing too much power; that a private company is creating a monopoly on public information services. Sooner or later, there is bound to be a conflict of interest. That is where we, the Google-addicted public, are caught. It's more complicated than hip versus square, or good versus evil.
Here's another good piece on Google. On Monday, The New York Times ran an editorial by Adam Cohen that nicely lays out the privacy concerns:
Google says it needs the data it keeps to improve its technology, but it is doubtful it needs so much personally identifiable information. Of course, this sort of data is enormously valuable for marketing. The whole idea of "Don't be evil," though, is resisting lucrative business opportunities when they are wrong. Google should develop an overarching privacy theory that is as bold as its mission to make the world's information accessible - one that can become a model for the online world. Google is not necessarily worse than other Internet companies when it comes to privacy. But it should be doing better.
Two graduate students in Stanford in the mid-90s recognized that search engines would the most important tools for dealing with the incredible flood of information that was then beginning to swell, so they started indexing web pages and working on algorithms. But as the company has grown, Google's admirable-sounding mission statement -- "to organize the world's information and make it universally accessible and useful" -- has become its manifest destiny, and "information" can now encompass the most private of territories.
At one point it simply meant search results -- the answers to our questions. But now it's the questions as well. Google is keeping a meticulous record of our clickstreams, piecing together an enormous database of queries, refining its search algorithms and, some say, even building a massive artificial brain (more on that later). What else might they do with all this personal information? To date, all of Google's services are free, but there may be a hidden cost.
"Don't be evil" may be the company motto, but with its IPO earlier this year, Google adopted a new ideology: they are now a public corporation. If web advertising (their sole source of revenue) levels off, then investors currently high on $400+ shares will start clamoring for Google to maintain profits. "Don't be evil to us!" they will cry. And what will Google do then?
images: New York Public Library reading room by Kalloosh via Flickr; archive of the original Google page
Posted by ben vershbow at 08:18 AM
| Comments (7)
tags: Copyright and Copyleft , Libraries, Search and the Web , books , copyright , ethics , google , google_book_search , google_print , intellectual_property , libraries , library , literature , privacy , publishing , university
google print is no more
11.18.2005, 8:06 AM
Not the program, of course, just the name. From now on it is to be known as Google Book Search. "Print" obviously struck a little too close to home with publishers and authors. On the company blog, they explain the shift in emphasis:
No, we don't think that this new name will change what some folks think about this program. But we do believe it will help a lot of people understand better what we're doing. We want to make all the world's books discoverable and searchable online, and we hope this new name will help keep everyone focused on that important goal.
Posted by ben vershbow at 08:06 AM
| Comments (1)
tags: Libraries, Search and the Web , books , copyright , google , google_book_search , google_print , publishing , search
the book in the network - masses of metadata
11.15.2005, 6:42 PM
In this weekend's Boston Globe, David Weinberger delivers the metadata angle on Google Print:
...despite the present focus on who owns the digitized content of books, the more critical battle for readers will be over how we manage the information about that content-information that's known technically as metadata....we're going to need massive collections of metadata about each book. Some of this metadata will come from the publishers. But much of it will come from users who write reviews, add comments and annotations to the digital text, and draw connections between, for example, chapters in two different books.
As the digital revolution continues, and as we generate more and more ways of organizing and linking books-integrating information from publishers, libraries and, most radically, other readers-all this metadata will not only let us find books, it will provide the context within which we read them.
The book in the network is a barnacled spirit, carrying with it the sum of its various accretions. Each book is also its own library by virtue not only of what it links to itself, but of what its readers are linking to, of what its readers are reading. Each book is also a milk crate of earlier drafts. It carries its versions with it. A lot of weight for something physically weightless.
Posted by ben vershbow at 06:42 PM
| Comments (0)
tags: ISBN , Libraries, Search and the Web , books , ebook , electronic_literature , folksonomy , google , google_print , hypertext , library , literature , marginalia , metadata , social_software , tagging , weinberger
having browsed google print a bit more...
11.14.2005, 4:53 AM
...I realize I was over-hasty in dismissing the recent additions made since book scanning resumed earlier this month. True, many of the fine wines in the cellar are there only for the tasting, but the vintage stuff can be drunk freely, and there are already some wonderful 19th century titles, at this point mostly from Harvard. The surest way to find them is to search by date, or by title and date. Specify a date range in advanced search or simply enter, for example, "date: 1890" and a wealth of fully accessible texts comes up, any of which can be linked to from a syllabus. An astonishing resource for teachers and students.
The conclusion: Google Print really is shaping up to be a library, that is, of the world pre-1923 -- the current line of demarcation between copyright and the public domain. It's a stark reminder of how over-extended copyright is. Here's an 1899 english printing of The Mahabharata:
A charming detail found on the following page is this old Harvard library stamp that got scanned along with the rest:
Posted by ben vershbow at 04:53 AM
| Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web , OCR , copyright , ebook , fair_use , google , google_print , library , mahabharata , scan
pages à la carte
11.04.2005, 7:20 AM
The New York Times reports on programs being developed by both Amazon and Google that would allow readers to purchase online access to specific sections of books -- say, a single recipe from a cookbook, an individual chapter from a how-to manual, or a particular short story or poem from an anthology. Such a system would effectively "unbind" books into modular units that consumers patch into their online reading, just as iTunes blew apart the integrity of the album and made digital music all about playlists. We become scrapbook artists.
It seems Random House is in on this too, developing a micropayment model and consulting closely with the two internet giants. Pages would sell for anywhere between five and 25 cents each.
Posted by ben vershbow at 07:20 AM
| Comments (1)
tags: Publishing, Broadcast, and the Press , Transliteracies , amazon , books , e-commerce , google , google_print , literature , media_consumption , publishing , randomhouse , reading
google print's not-so-public domain
11.03.2005, 4:16 PM
Google's first batch of public domain book scans is now online, representing a smattering of classics and curiosities from the collections of libraries participating in Google Print. Essentially snapshots of books, they're not particularly comfortable to read, but they are keyword-searchable and, since no copyright applies, fully accessible.
The problem is, there really isn't all that much there. Google's gotten a lot of bad press for its supposedly cavalier attitude toward copyright, but spend a few minutes browsing Google Print and you'll see just how publisher-centric the whole affair is. The idea of a text being in the public domain really doesn't amount to much if you're only talking about antique manuscripts, and these are the only books that they've made fully accessible. Daisy Miller's copyright expired long ago but, with the exception of Harvard's illustrated 1892 copy, all the available scanned editions are owned by modern publishers and are therefore only snippeted. This is not an online library, it's a marketing program. Google Print will undeniably have its uses, but we shouldn't confuse it with a library.
(An interesting offering from the stacks of the New York Public Library is this mid-19th century biographic registry of the wealthy burghers of New York: "Capitalists whose wealth is estimated at one hundred thousand dollars and upwards...")
Posted by ben vershbow at 04:16 PM
| Comments (0)
tags: Copyright and Copyleft , Libraries, Search and the Web , OCR , books , copyright , ebook , google , google_print , library , literature , public_domain , scan
the creeping (digital) death of fair use
11.02.2005, 1:13 PM
Meant to post about this last week but it got lost in the shuffle... In case anyone missed it, Tarleton Gillespie of Cornell has published a good piece in Inside Higher Ed about how sneaky settings in course management software are effectively eating away at fair use rights in the academy. Public debate tends to focus on the music and movie industries and the ever more fiendish anti-piracy restrictions they build into their products (the latest being the horrendous "analog hole"). But a similar thing is going on in education and it is decidely under-discussed.
Gillespie draws our attention to the "Copyright Permissions Building Block," a new add-on for the Blackboard course management platform that automatically obtains copyright clearances for any materials a teacher puts into the system. It's billed as a time-saver, a friendly chauffeur to guide you through the confounding back alleys of copyright.
But is it necessary? Gillespie, for one, is concerned that this streamlining mechanism encourages permission-seeking that isn't really required, that teachers should just invoke fair use. To be sure, a good many instructors never bother with permissions anyway, but if they stop to think about it, they probably feel that they are doing something wrong. Blackboard, by sneakily making permissions-seeking the default, plays to this misplaced guilt, lulling teachers away from awareness of their essential rights. It's a disturbing trend, since a right not sufficiently excercised is likely to wither away.
Fair use is what oxygenates the bloodstream of education, allowing ideas to be ideas, not commodities. Universities, and their primary fair use organs, libraries, shouldn't be subjected to the same extortionist policies of the mainstream copyright regime, which, like some corrupt local construction authority, requires dozens of permits to set up a simple grocery store. Fair use was written explicitly into law in 1976 to guarantee protection. But the market tends to find a way, and code is its latest, and most insidious, weapon.
Amazingly, few academics are speaking out. John Holbo, writing on The Valve, wonders:
Why aren’t academics - in the humanities in particular - more exercised by recent developments in copyright law? Specifically, why aren’t they outraged by the prospect of indefinite copyright extension?......It seems to me odd, not because overextended copyright is the most pressing issue in 2005 but because it seems like a social/cultural/political/economic issue that recommends itself as well suited to be taken up by academics - starting with the fact that it is right here on their professional doorstep...
Most obviously on the doorstep is Google, currently mired in legal unpleasantness for its book-scanning ambitions and the controversial interpretation of fair use that undergirds them. Why aren't the universities making a clearer statement about this? In defense? In concern? Soon, when search engines move in earnest into video and sound, the shit will really hit the fan. The academy should be preparing for this, staking out ground for the healthy development of multimedia scholarship and literature that necessitates quotation from other "texts" such as film, television and music, and for which these searchable archives will be an essential resource.
Fair use seems to be shrinking at just the moment it should be expanding, yet few are speaking out.
Posted by ben vershbow at 01:13 PM
| Comments (4)
tags: Copyright and Copyleft , DRM , Education , academy , blackboard , copyright , fair_use , google , google_print
microsoft joins open content alliance
10.26.2005, 9:06 AM
Microsoft's forthcoming "MSN Book Search" is the latest entity to join the Open Content Alliance, the non-controversial rival to Google Print. ZDNet says: "Microsoft has committed to paying for the digitization of 150,000 books in the first year, which will be about $5 million, assuming costs of about 10 cents a page and 300 pages, on average, per book..."
Apparently having learned from Google's mistakes, OCA operates under a strict "opt-in" policy for publishers vis-a-vis copyrighted works (whereas with Google, publishers have until November 1 to opt out). Judging by the growing roster of participants, including Yahoo, the National Archives of Britain, the University of California, Columbia University, and Rice University, not to mention the Internet Archive, it would seem that less hubris equals more results, or at least lower legal fees. Supposedly there is some communication between Google and OCA about potential cooperation.
Also story in NY Times.
Posted by ben vershbow at 09:06 AM
| Comments (2)
tags: Libraries, Search and the Web , Microsoft , OCA , books , brewster_kahle , copyright , google , google_print , library , open_content_alliance , search , web , yahoo
to some writers, google print sounds like a sweet deal
10.25.2005, 9:25 AM
Wired has a piece today about authors who are in favor of Google's plans to digitize millions of books and make them searchable online. Most seem to agree that obscurity is a writer's greatest enemy, and that the exposure afforded by Google's program far outweighs any intellectual property concerns. Sometimes to get more you have to give a little.
The article also mentions the institute.
Posted by ben vershbow at 09:25 AM
| Comments (0)
tags: Libraries, Search and the Web , Publishing, Broadcast, and the Press , books , copyright , google , google_print , publishing , search , web , writing
debating google print
10.22.2005, 5:53 PM
The Washington Post has run a pair of op-eds, one from each side of the Google Print dispute. Neither says anything particularly new. Moreover, they enforce the perception that there can be only two positions on the subject -- an endemic problem in newspaper opinion pages with their addiction to binaries, where two cardboard boxers are allotted their space to throw a persuasive punch. So you're either for Google or against it? That's awfully close to you're either for technology -- for progress -- or against it. Unfortunately, like technology's impact, the Google book-scanning project is a little trickier to figure out, and a more nuanced conversation is probably in order.
The first piece, "Riches We Must Share...", is submitted in support of Google by University of Michigan President Sue Coleman (a partner in the Google library project). She argues that opening up the elitist vaults of the world's great (english) research libraries will constitute a democratic revolution. "We believe the result can be a widening of human conversation comparable to the emergence of mass literacy itself." She goes on to deliver some boilerplate about the "Net Generation" -- too impatient to look for books unless they're online etc. etc. (great to see a major university president being led by the students instead of leading herself).
Coleman then devotes a couple of paragraphs to the copyright question, failing to tackle any of its controversial elements:
Universities are no strangers to the responsible management of complex copyright, permission and security issues; we deal with them every day in our classrooms, libraries, laboratories and performance halls. We will continue to work within the current criteria for fair use as we move ahead with digitization.
The problem is, Google is stretching the current criteria of fair use, possibly to the breaking point. Coleman does not acknowledge or address this. She does, however, remind the plaintiffs that copyright is not only about the owners:
The protections of copyright are designed to balance the rights of the creator with the rights of the public. At its core is the most important principle of all: to facilitate the sharing of knowledge, not to stifle such exchange.
All in all a rather bland statement in support of open access. It fails to weigh in on the fair use question -- something about which the academy should have a few things to say -- and does not indicate any larger concern about what Google might do with its books database down the road.
The opposing view, "...But Not at Writers' Expense", comes from Nick Taylor, writer, and president of the Authors' Guild (which sued Google last month). Taylor asserts that mega-rich Google is tramping on the dignity of working writers. But a couple of paragraphs in, he gets a little mixed up about contemporary publishing:
Except for a few big-name authors, publishers roll the dice and hope that a book's sales will return their investment. Because of this, readers have a wealth of wonderful books to choose from.
A dubious assessment, since publishing conglomerates are not exactly enthusiastic dice rollers. I would counter that risk-averse corporate publishing has steadily shrunk the number of available titles, counting on a handful of blockbusters to drive the market. Taylor goes on to defend not just the publishing status quo, but the legal one:
Now that the Authors Guild has objected, in the form of a lawsuit, to Google's appropriation of our books, we're getting heat for standing in the way of progress, again for thoughtlessly wanting to be paid. It's been tradition in this country to believe in property rights. When did we decide that socialism was the way to run the Internet?
First of all, it's funny to think of the huge corporations that dominate the web as socialist. Second, this talk about being paid for appropriating books for a search database is revealing of the two totally different worldviews that are at odds in this struggle. The authors say that any use of their book requires a payment. Google sees including the books in the database as a kind of payment in itself. No one with a web page expects Google to pay them for indexing their site. They are grateful that they do! Otherwise, they are totally invisible. This is the unspoken compact that underpins web search. Google assumed the same would apply with books. Taylor says not so fast.
Here's Taylor on fair use:
Google contends that the portions of books it will make available to searchers amount to "fair use," the provision under copyright that allows limited use of protected works without seeking permission. That makes a private company, which is profiting from the access it provides, the arbiter of a legal concept it has no right to interpret. And they're scanning the entire books, with who knows what result in the future.
Actually, Google is not doing all the interpreting. There is a legal precedent for Google's reading of fair use established in the 2003 9th Circuit Court decision Kelly v. Arriba Soft. In the case, Kelly, a photographer, sued Arriba Soft, an online image search system, for indexing several of his photographs in their database. Kelly believed that his intellectual property had been stolen, but the court ruled that Arriba's indexing of thumbnail-sized copies of images (which always linked to their source sites) was fair use: "Arriba’s use of the images serves a different function than Kelly’s use – improving access to information on the internet versus artistic expression.” Still, Taylor's "with who knows what result in the future" concern is valid.
So on the one hand we have many writers and most publishers trying to defend their architecture of revenue (or, as Taylor would have it, their dignity). But I can't imagine how Google Print would really be damaging that architecture, at least not in the foreseeable future. Rather it leverages it by placing it within the frame of another architecture: web search. The irony for the authors is that the current architecture doesn't seem to be serving them terribly well. With print-on-demand gaining in quality and legitimacy, online book search could totally re-define what is an acceptable risk to publishers, and maybe more non-blockbuster authors would get published.
On the other hand we have the universities and libraries participating in Google's program, delivering the good news of accessibility. But they are not sufficiently questioning what Google might do with its database down the road, or the implications of a private technology company becoming the principal gatekeeper of the world's corpus.
If only this debate could be framed in a subtler way, rather than the for-Google-or-against-it paradigm we have now. I'm cautiously optimistic about the effect of having books searchable on the web. And I tend to believe it will be beneficial to authors and publishers. But I have other, deep reservations about the direction in which Google is heading, and feel that a number of things could go wrong. We think the cencorship of the marketplace is bad now in the age of publishing conglomerates. What if one company has total control of everything? And is keeping track of every book, every page, that you read. And is reading you while you read, throwing ads into your peripheral vision. I'm curious to hear from readers what they feel could be the hazards of Google Print.
Posted by ben vershbow at 05:53 PM
| Comments (4)
tags: Libraries, Search and the Web , Publishing, Broadcast, and the Press , academy , books , copyright , google , google_print , michigan , publishing , writing
yahoo! announces book-scanning project to rival google's
10.03.2005, 2:00 PM
Yahoo, in collaboration with The Internet Archive, Adobe, O'Reilly Media, Hewlett Packard Labs, the University of California, the University of Toronto, The National Archives of England, and others, will be participating in The Open Content Alliance, a book and media archiving project that will greatly enlarge the body of knowledge available online. At first glance, it appears the program will focus primarily on public domain works, and in the case of copyrighted books, will seek to leverage the Creative Commons.
Google Print, on the other hand, is more self-consciously a marketing program for publishers and authors (although large portions of the public domain will be represented as well). Google aims to make money off its indexing of books through keyword advertising and click-throughs to book vendors. Yahoo throwing its weight behind the "open content" movement seems on the surface to be more of a philanthropic move, but clearly expresses a concern over being outmaneuvered in the search wars. But having this stuff available online is clearly a win for the world at large.
The Alliance was conceived in large part by Brewster Kahle of the Internet Archive. He announced the project on Yahoo's blog:
To kick this off, Internet Archive will host the material and sometimes helps with digitization, Yahoo will index the content and is also funding the digitization of an initial corpus of American literature collection that the University of California system is selecting, Adobe and HP are helping with the processing software, University of Toronto and O'Reilly are adding books, Prelinger Archives and the National Archives of the UK are adding movies, etc. We hope to add more institutions and fine tune the principles of working together.Initial digitized material will be available by the end of the year.
More in:
NY Times
Chronicle of Higher Ed.
Posted by ben vershbow at 02:00 PM
| Comments (0)
tags: Libraries, Search and the Web , archive , book , books , brewster_kahle , digital , digitize , ebook , google , google_print , googleprint , internet_archive , kahle , library , literature , reading , scanning , yahoo , yahoo!
enter the cybrarian
12.18.2004, 3:02 PM
The recent buzz surrounding Google's library intitiative has everyone talking about the future of research, which inevitably raises the question: how will the digitization of library collections change the role of the librarian? I would guess that, far from becoming obsolete, their role will in fact be elevated in importance, if not necessarily in status. They could very well come to be our indispensible guides through the labyrinth - if perhaps invisible, engineering behind the digital walls.
It's also important to consider the question of visualization. When you run a search on Google you are given an enormous list. This is already deeply ingrained in the day-to-day business of finding information. But these lists are basically the electronic equivelant of scrolls, with the items algorithmically determined to be most relevant placed at the top. But sooner or later we have to admit that using scrolls for this kind of business is ludicrous. There has to be a better way of arraying these vast harvests of information in a way that allows the researcher to zoom across degrees of specificity and through associative chains of context and meaning. I see no reason why a search shouldn't take place in some kind of virtual library, emulating the physical architecture of research settings, and allowing for some of the associative or accidental echoes that so often enrich a paper trail blazed through a brick-and-mortar library. Or cannot knowledge resemble a tree, or an arterial matrix? Must we be bound to the scroll?
Returning to the question of the librarian's role, I recalled this passage from James J O'Donnell's 1996 paper The Pragmatics of the New: Trithemius, McLuhan, Cassiodorus:
"The librarians of the world have, moreover, already led the way, for academics at least, into the new information environment, not least because they are caught between rising demand from their customers (faculty and students) and rising supply and prices from their suppliers, and so have already been making reality-based decisions about ownership versus access, print versus electronics, and so on. In short, they are just now our leading pragmatists. Can we imagine a time in our universities when the librarians are the well-paid principals and the teachers their mere acolytes in a distribution chain? I do not think we can or should rule out that possibility for a moment"
Related articles:
"Questions and Praise for Google Web Library" - NY Times
"Google's library plan 'a huge help'" - USA Today
"Making books readable on computer proves trying task" - USA Today
Also, I found this on Searchblog. For a trip down memory lane, check out the original Google in the Stanford archives (click on picture to right). Unfortunately, although it seems interactive, a search just brings up a bunch of stylesheets.
Posted by ben vershbow at 03:02 PM
| Comments (0)
tags: Libraries, Search and the Web , google , google_print , information_architecture , infoviz , librarian , library , michigan , search
books behind bars - the Google library project
12.14.2004, 4:34 PM
How useful will this service be for in-depth research when copyrighted books (which will account for a huge percentage of searchable texts) cannot be fully accessed? In such cases, a person will be able to view only a selection of pages (depending on agreements with publishers), and will find themselves bombarded with a variety of retail options. On a positive note, the search will be able to refer the user to any local libraries where the desired book is available, but still, the focus here remains squarely on digital texts as simply a means of getting to print texts.
Absent a major paradigm shift with regard to the accessibility and inherent virtue of electronic texts, this ambitious project will never achieve its full potential. For someone searching outside the public domain, the Google library project may amount to nothing more than a guided tour through a prison of incarcerated texts. I've found this to be true so far with Google Scholar - it turned up a lot of interesting stuff, but much of it was password protected or required purchase.
article in Filter: Google -- 21st Century Dewey Decimal System (washingtonpost.com)
Posted by ben vershbow at 04:34 PM
| Comments (0)
tags: Libraries, Search and the Web , books , copyright , digitization , ebooks , google , google_book_search , google_print , google_scholar , libraries , library








