Listing entries tagged with google_book_search


1 | 2 | 3 | 4 | 5 | 6

google books API Post date  03.13.2008, 2:03 PM

Good news. Google has finally released an API (?) for Google Book Search:

Web developers can use the Books Viewability API to quickly find out a book's viewability on Google Book Search and, in an automated fashion, embed a link to that book in Google Book Search on their own sites.

As an example of the API in use, check out the Deschutes Public Library in Oregon, which has added a link to "Preview this book at Google" next to the listings in their library catalog. This enables Deschutes readers to preview a book immediately via Google Book Search so that they can then make a better decision about whether they'd like to buy the book, borrow it from a library or whether this book wasn't really the book they were looking for.

Tim Spalding of Library Thing has some initial comments on limitations:

The GBS API is a big step forward, but there are some technical limitations. Google data loads after the rest of the page, and may not be instant. Because the data loads in your web browser, with no data "passing through" LibraryThing servers, we can't sort or search by it, and all-library searching is impossible. You can get something like this if you create a Google Books account, which is, of course, the whole point.

(via Peter Brantley)

Posted by ben vershbow at 2:03 PM | Comments (0)
tags: api , books , google , google_book_search , library , librarything

how to keep google's books open Post date  11.27.2007, 5:27 PM

Whip-smart law blogger Frank Pasquale works through his evolving views on digital library projects and search engines, proposing a compelling strategy for wringing some public good from the tangle of lawsuits surrounding Google Book Search. It hinges on a more expansive (though absolutely legally precedented) interpretation of fair use that takes the public interest and not just market factors into account. Recommended reading. (Thanks, Siva!)

Posted by ben vershbow at 5:27 PM | Comments (1)
tags: copyright , digitization , fairuse , google , google_book_search , library

au courant Post date  11.06.2007, 8:33 AM

Paul Courant is the University Librarian at the University of Michigan as well as a professor of economics. And he now has a blog. He leads off with a response to critics (including Brewster Kahle and Siva Vaidhyanathan) of Michigan's book digitization partnership with Google. Siva responds back on Googlization of Everything. Great to see a university librarian entering the public debate in this way.

Posted by ben vershbow at 8:33 AM | Comments (2)
tags: digitization , google , google_book_search , libraries

"digitization and its discontents" Post date  11.06.2007, 8:14 AM

Anthony Grafton's New Yorker piece "Future Reading" paints a forbidding picture of the global digital library currently in formation on public and private fronts around the world (Google et al.). The following quote sums it up well - ?a refreshing counterpoint to the millenarian hype we so often hear w/r/t mass digitization:

The supposed universal library, then, will be not a seamless mass of books, easily linked and studied together, but a patchwork of interfaces and databases, some open to anyone with a computer and WiFi, others closed to those without access or money. The real challenge now is how to chart the tectonic plates of information that are crashing into one another and then to learn to navigate the new landscapes they are creating. Over time, as more of this material emerges from copyright protection, we'll be able to learn things about our culture that we could never have known previously. Soon, the present will become overwhelmingly accessible, but a great deal of older material may never coalesce into a single database. Neither Google nor anyone else will fuse the proprietary databases of early books and the local systems created by individual archives into one accessible store of information. Though the distant past will be more available, in a technical sense, than ever before, once it is captured and preserved as a vast, disjointed mosaic it may recede ever more rapidly from our collective attention.

Grafton begins and ends in a nostalgic tone, with a paean to the New York Public Library and the critic Alfred Kazin: the poor son of immigrants, City College-educated, who researched his seminal study of American literature On Native Grounds almost entirely with materials freely available at the NYPL. Clearly, Grafton is a believer in the civic ideal of the public library - ?a reservoir of knowledge, free to all - ?and this animates his critique of the balkanized digital landscape of search engines and commercial databases. Given where he appears to stand, I wish he could have taken a stab at what a digital public library might look like, and what sorts of technical, social, political and economic reorganization might be required to build it. Obviously, these are questions that would have required their own article, but it would have been valuable for Grafton, whose piece is one of those occasional journalistic events that moves the issue of digitization and the future of libraries out of the specialist realm into the general consciousness, to have connected the threads. Instead Grafton ends what is overall a valuable and intelligent article with a retreat into print fetishism - ?"crowded public rooms where the sunlight gleams on varnished tables....millions of dusty, crumbling, smelly, irreplaceable documents and books" - ?which, while evocative, obscures more than it illuminates.

Incidentally, those questions are precisely what was discussed at our Really Modern Library meetings last month. We're still compiling our notes but expect a report soon.

Posted by ben vershbow at 8:14 AM | Comments (4)
tags: books , digitization , google , google_book_search , libraries , reallymodernlibrary , search

googlesoft gets the brush-off Post date  10.22.2007, 1:55 PM

This is welcome. Several leading American research libraries including the Boston Public and the Smithsonian have said no thanks to Google and Microsoft book digitization deals, opting instead for the more costly but less restrictive Open Content Alliance/Internet Archive program. The NY Times reports, and explains how private foundations like Sloan are funding some of the OCA partnerships.

Posted by ben vershbow at 1:55 PM | Comments (0)
tags: Microsoft , digitization , google , google_book_search , library , publicdomain

siva podcast on the googlization of libraries Post date  09.10.2007, 7:39 PM

We're just a couple of days away from launching what promises to be one of our most important projects to date, The Googlization of Everything, a weblog where Siva Vaidhyanathan (who's a fellow here) will publicly develop his new book, a major critical examination of the Google behemoth. As an appetizer, check out this First Monday podcast conversation with Siva on the subject of Google's book activities (mp3, transcript).

An excerpt:

Q: So what's the alternative? Who are the major players, what are the major policy points?

SIVA: I think this is an important enough project where we need to have a nationwide effort. We have to have a publicly funded effort. Guided, perhaps led by the Library of Congress, certainly a consortium of public university libraries could do just as well to do it.

We're willing to do these sorts of big projects in the sciences. Look at how individual states are rallying billions of dollars to fund stem cell research right now. Look at the ways the United States government, the French government, the Japanese government rallied billions of dollars for the Human Genome Project out of concern that all that essential information was going to be privatized and served in an inefficient and unwieldy way.

So those are the models that I would like to see us pursue. What saddens me about Google's initiative, is that it's let so many people off the hook. Essentially we've seen so many people say, "Great now we don't have to do the digital library projects we were planning to do." And many of these libraries involved in the Google project were in the process of producing their own digital libraries. We don't have to do that any more because Google will do it for us. We don't have to worry about things like quality because Google will take care of the quantity.

And so what I would like to see? I would like to see all the major public universities, public research universities, in the country gather together and raise the money or persuade Congress to deliver the money to do this sort of thing because it's in the public interest, not because it's in Google's interest. If it really is this important we should be able to mount a public campaign, a set of arguments and convince the people with the purse strings that this should be done right.

Posted by ben vershbow at 7:39 PM | Comments (3) | TrackBack
tags: copyright , google , google_book_search , googlization , library , podcast , sivavaidhyanathan

e-book developments at amazon, google (and rambly thoughts thereon) Post date  09.07.2007, 1:40 PM

The NY Times reported yesterday that the Kindle, Amazon's much speculated-about e-book reading device, is due out next month. No one's seen it yet and Amazon has been tight-lipped about specs, but it presumably has an e-ink screen, a small keyboard and scroll wheel, and most significantly, wireless connectivity. This of course means that Amazon will have a direct pipeline between its store and its device, giving readers access an electronic library (and the Web) while on the go. If they'd just come down a bit on the price (the Times says it'll run between four and five hundred bucks), I can actually see this gaining more traction than past e-book devices, though I'm still not convinced by the idea of a dedicated book reader, especially when smart phones are edging ever closer toward being a credible reading environment. A big part of the problem with e-readers to date has been the missing internet connection and the lack of a good store. The wireless capability of the Kindle, coupled with a greater range of digital titles (not to mention news and blog feeds and other Web content) and the sophisticated browsing mechanisms of the Amazon library could add up to the first more-than-abortive entry into the e-book business. But it still strikes me as transitional - ?a red herring in the larger plot.

A big minus is that the Kindle uses a proprietary file format (based on Mobipocket), meaning that readers get locked into the Amazon system, much as iPod users got shackled to iTunes (before they started moving away from DRM). Of course this means that folks who bought the cheaper (and from what I can tell, inferior) Sony Reader won't be able to read Amazon e-books.

But blech... enough about ebook readers. The Times also reports (though does little to differentiate between the two rather dissimilar bits of news) on Google's plans to begin selling full online access to certain titles in Book Search. Works scanned from library collections, still the bone of contention in two major lawsuits, won't be included here. Only titles formally sanctioned through publisher deals. The implications here are rather different from the Amazon news since Google has no disclosed plans for developing its own reading hardware. The online access model seems to be geared more as a reference and research tool -? a powerful supplement to print reading.

But project forward a few years... this could develop into a huge money-maker for Google: paid access (licensed through publishers) not only on a per-title basis, but to the whole collection - ?all the world's books. Royalties could be distributed from subscription revenues in proportion to access. Each time a book is opened, a penny could drop in the cup of that publisher or author. By then a good reading device will almost certainly exist (more likely a next generation iPhone than a Kindle) and people may actually be reading books through this system, directly on the network. Google and Amazon will then in effect be the digital infrastructure for the publishing industry, perhaps even taking on what remains of the print market through on-demand services purveyed through their digital stores. What will publishers then be? Disembodied imprints, free-floating editorial organs, publicity directors...?

Recent attempts to develop their identities online through their own websites seem hopelessly misguided. A publisher's website is like their office building. Unless you have some direct stake in the industry, there's little reason to bother know where it is. Readers are interested in books not publishers. They go to a bookseller, on foot or online, and they certainly don't browse by publisher. Who really pays attention to who publishes the books they read anyway, especially in this corporatized era where the difference between imprints is increasingly cosmetic, like the range of brands, from dish soap to potato chips, under Proctor & Gamble's aegis? The digital storefront model needs serious rethinking.

The future of distribution channels (Googlezon) is ultimately less interesting than this last question of identity. How will today's publishers establish and maintain their authority as filterers and curators of the electronic word? Will they learn how to develop and nurture literate communities on the social Web? Will they be able to carry their distinguished imprints into a new terrain that operates under entirely different rules? So far, the legacy publishers have proved unable to grasp the way these things work in the new network culture and in the long run this could mean their downfall as nascent online communities (blog networks, webzines, political groups, activist networks, research portals, social media sites, list-servers, libraries, art collectives) emerge as the new imprints: publishing, filtering and linking in various forms and time signatures (books being only one) to highly activated, focused readerships.

The prospect of atomization here (a million publishing tribes and sub-tribes) is no doubt troubling, but the thought of renewed diversity in publishing after decades of shrinking horizons through corporate consolidation is just as, if not more, exciting. But the question of a mass audience does linger, and perhaps this is how certain of today's publishers will survive, as the purveyors of mass market fare. But with digital distribution and print on demand, the economies of scale rationale for big publishers' existence takes a big hit, and with self-publishing services like Amazon CreateSpace and Lulu.com, and the emergence of more accessible authoring tools like Sophie (still a ways away, but coming along), traditional publishers' services (designing, packaging, distributing) are suddenly less special. What will really be important in a chaotic jumble of niche publishers are the critics, filterers and the context-generating communities that reliably draw attention to the things of value and link them meaningfully to the rest of the network. These can be big companies or light-weight garage operations that work on the back of third-party infrastructure like Google, Amazon, YouTube or whatever else. These will be the new publishers, or perhaps its more accurate to say, since publishing is now so trivial an act, the new editors.

Of course social filtering and tastemaking is what's been happening on the Web for years, but over time it could actually supplant the publishing establishment as we currently know it, and not just the distribution channels, but the real heart of things: the imprimaturs, the filtering, the building of community. And I would guess that even as the digital business models sort themselves out (and it's worth keeping an eye on interesting experiments like Content Syndicate, covered here yesterday, and on subscription and ad-based models), that there will be a great deal of free content flying around, publishers having finally come to realize (or having gone extinct with their old conceits) that controlling content is a lost cause and out of synch with the way info naturally circulates on the net. Increasingly it will be the filtering, curating, archiving, linking, commenting and community-building -? in other words, the network around the content -? that will be the thing of value. Expect Amazon and Google (Google, btw, having recently rolled out a bunch of impressive new social tools for Book Search, about which more soon) to move into this area in a big way.

Posted by ben vershbow at 1:40 PM | Comments (17) | TrackBack
tags: amazon , authority , books , ebooks , filtering , google , google_book_search , network , publishing

"the bookish character of books": how google's romanticism falls short Post date  08.15.2007, 4:59 PM

tristramgbs.gif

Check out, if you haven't already, Paul Duguid's witty and incisive exposé of the pitfalls of searching for Tristram Shandy in Google Book Search, an exercise which puts many of the inadequacies of the world's leading digitization program into relief. By Duguid's own admission, Lawrence Sterne's legendary experimental novel is an idiosyncratic choice, but its many typographic and structural oddities make it a particularly useful lens through which to examine the challenges of migrating books successfully to the digital domain. This follows a similar examination Duguid carried out last year with the same text in Project Gutenberg, an experience which he said revealed the limitations of peer production in generating high quality digital editions (also see Dan's own take on this in an older if:book post). This study focuses on the problems of inheritance as a mode of quality assurance, in this case the bequeathing of large authoritative collections by elite institutions to the Google digitization enterprise. Does simply digitizing these - ?books, imprimaturs and all - ?automatically result in an authoritative bibliographic resource?

Duguid's suggests not. The process of migrating analog works to the digital environment in a way that respects the orginals but fully integrates them into the networked world is trickier than simply scanning and dumping into a database. The Shandy study shows in detail how Google's ambition to organizing the world's books and making them universally accessible and useful (to slightly adapt Google's mission statement) is being carried out in a hasty, slipshod manner, leading to a serious deficit in quality in what could eventually become, for better or worse, the world's library. Duguid is hardly the first to point this out, but the intense focus of his case study is valuable and serves as a useful counterpoint to the technoromantic visions of Google boosters such as Kevin Kelly, who predict a new electronic book culture liberated by search engines in which readers are free to find, remix and recombine texts in various ways. While this networked bibliotopia sounds attractive, it's conceived primarily from the standpoint of technology and not well grounded in the particulars of books. What works as snappy Web2.0 buzz doesn't necessarily hold up in practice.

As is so often the case, the devil is in the details, and it is precisely the details that Google seems to have overlooked, or rather sprinted past. Sloppy scanning and the blithe discarding of organizational and metadata schemes meticulously devised through centuries of librarianship, might indeed make the books "universally accessible" (or close to that) but the "and useful" part of the equation could go unrealized. As we build the future, it's worth pondering what parts of the past we want to hold on to. It's going to have to be a slower and more painstaking a process than Google (and, ironically, the partner libraries who have rushed headlong into these deals) might be prepared to undertake. Duguid:

The Google Books Project is no doubt an important, in many ways invaluable, project. It is also, on the brief evidence given here, a highly problematic one. Relying on the power of its search tools, Google has ignored elemental metadata, such as volume numbers. The quality of its scanning (and so we may presume its searching) is at times completely inadequate. The editions offered (by search or by sale) are, at best, regrettable. Curiously, this suggests to me that it may be Google's technicians, and not librarians, who are the great romanticisers of the book. Google Books takes books as a storehouse of wisdom to be opened up with new tools. They fail to see what librarians know: books can be obtuse, obdurate, even obnoxious things. As a group, they don't submit equally to a standard shelf, a standard scanner, or a standard ontology. Nor are their constraints overcome by scraping the text and developing search algorithms. Such strategies can undoubtedly be helpful, but in trying to do away with fairly simple constraints (like volumes), these strategies underestimate how a book's rigidities are often simultaneously resources deeply implicated in the ways in which authors and publishers sought to create the content, meaning, and significance that Google now seeks to liberate. Even with some of the best search and scanning technology in the world behind you, it is unwise to ignore the bookish character of books. More generally, transferring any complex communicative artifacts between generations of technology is always likely to be more problematic than automatic.

Also take a look at Peter Brantley's thoughts on Duguid:

Ultimately, whether or not Google Book Search is a useful tool will hinge in no small part on the ability of its engineers to provoke among themselves a more thorough, and less alchemic, appreciation for the materials they are attempting to transmute from paper to gold.

Posted by ben vershbow at 4:59 PM | Comments (9) | TrackBack
tags: books , digitization , duguid , ebooks , google , google_book_search , gutenberg , kevin_kelly , library , tristramshandy

cornell joins google book search Post date  08.08.2007, 1:48 PM

...offering up to 500,000 items for digitization. From the Cornell library site:

Cornell is the 27th institution to join the Google Book Search Library Project, which digitizes books from major libraries and makes it possible for Internet users to search their collections online. Over the next six years, Cornell will provide Google with public domain and copyrighted holdings from its collections. If a work has no copyright restrictions, the full text will be available for online viewing. For books protected by copyright, users will just get the basic background (such as the book's title and the author's name), at most a few lines of text related to their search and information about where they can buy or borrow a book. Cornell University Library will work with Google to choose materials that complement the contributions of the project's other partners. In addition to making the materials available through its online search service, Google will also provide Cornell with a digital copy of all the materials scanned, which will eventually be incorporated into the university's own digital library.

Posted by ben vershbow at 1:48 PM | Comments (0) | TrackBack
tags: cornell , digitization , google , google_book_search , libraries , library , publicdomain