questions about blog search and time 01.06.2006, 8:17 AM
posted by ben vershbow
Does anyone know of a good way to search for old blog entries on the web? I've just been looking at some of the available blog search resources, and few of them appear to provide any serious advanced search options. The couple of major ones I've found that do (after an admittedly cursory look) are Google and Ice Rocket. Both, however, appear to be broken, at least when it comes to dates. I've tried them on three different browsers, on Mac and PC, and in each case the date menus seem to be frozen. It's very weird. They give you the option of entering a specific time range but won't accept the actual dates. Maybe I'm just having a bad tech day, but it's as if there's some conceptual glitch across the web vis-à-vis blogs and time.
Most blog search engines are geared toward searching the current blogosphere, but there should be a way to research older content. My first thought was that blog search engines crawl RSS feeds, most of which do not transmit the entirety of a blog's content, just the more recent. That would pose a problem for archival search.
Does anyone know what would be the best way to go about finding, say, old blog entries containing the keywords "new orleans superdome" from late August to late September 2005? Is it best to just stick with general web search and painstakingly comb through for blogs? If we agree that blogs have become an important kind of cultural document, then surely there should be a way to find them more than a month after they've been written.
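For illustration only, here is a rough sketch of what a keyword-plus-date-range search over feed items might look like, assuming the older entries are still available as RSS 2.0. The sample feed, the item contents, and the function name search_items are all hypothetical, and this is not how Google or Ice Rocket actually work. It also shows the limitation mentioned above: the search can only see whatever items the feed still carries.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical sample feed; a real feed usually holds only the most recent posts.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Inside the New Orleans Superdome</title>
      <description>Reports from the Superdome after the storm.</description>
      <pubDate>Wed, 31 Aug 2005 14:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Superdome repairs announced</title>
      <description>New Orleans officials discuss the Superdome roof.</description>
      <pubDate>Mon, 05 Dec 2005 09:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Weekend reading</title>
      <description>Links about books and publishing.</description>
      <pubDate>Sat, 03 Sep 2005 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def search_items(rss_text, keywords, start, end):
    """Return titles of items that contain every keyword and were
    published between start and end (inclusive)."""
    root = ET.fromstring(rss_text)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        body = item.findtext("description", default="")
        # RSS 2.0 dates use the RFC 822 format, which email.utils can parse.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        text = f"{title} {body}".lower()
        has_keywords = all(k.lower() in text for k in keywords)
        if has_keywords and start <= published <= end:
            matches.append(title)
    return matches


results = search_items(
    SAMPLE_RSS,
    ["new orleans", "superdome"],
    datetime(2005, 8, 25, tzinfo=timezone.utc),
    datetime(2005, 9, 30, tzinfo=timezone.utc),
)
print(results)
```

The December post mentions the same keywords but falls outside the date range, so only the late-August entry is returned.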
dave munger on January 6, 2006 8:44 AM:
AltaVista advanced search gives a date range option, but as you point out, you still have to comb through to find just the blog entries, if that's what you're interested in.
I do believe that Google Blog Search uses only RSS feeds, but it "remembers" about 6-8 months back. I wonder if it will maintain that history into the future (i.e., a year from now, you'll be able to search 18 months back).
Christopher Harris on January 6, 2006 11:02 AM:
I have been looking at this need for a while, starting with an initial thought that librarians need to find a way to catalog the blogosphere. I have been looking at RSS/XML code and the MARC format, and I can't help but think there has to be a way to embed some additional information in an RSS feed that could lead to an initial cataloging of blog posts.
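One way this might be sketched, purely as an assumption and not an established cataloging practice, is to add Dublin Core elements (creator, date, subject) to each RSS item through an XML namespace. The helper name build_item and the example post details below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace, commonly bound to the "dc" prefix.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_item(title, link, creator, date, subjects):
    """Build an RSS <item> carrying extra Dublin Core metadata."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # Cataloging-style fields live in the dc namespace, so a feed reader
    # that does not understand them can simply ignore them.
    ET.SubElement(item, f"{{{DC_NS}}}creator").text = creator
    ET.SubElement(item, f"{{{DC_NS}}}date").text = date
    for subject in subjects:
        ET.SubElement(item, f"{{{DC_NS}}}subject").text = subject
    return item


# Hypothetical post details, for illustration only.
item = build_item(
    title="Inside the New Orleans Superdome",
    link="http://example.org/2005/08/superdome",
    creator="Example Author",
    date="2005-08-31",
    subjects=["Hurricane Katrina", "New Orleans"],
)
print(ET.tostring(item, encoding="unicode"))
```

Whether fields like these could be mapped cleanly onto MARC records is a separate question that this sketch does not try to answer.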
bowerbird on January 6, 2006 1:54 PM:
> If we agree that blogs have become
> an important kind of cultural document,
> then surely
actually, i think the jury is still out on that one.
a big part of what i see in blogs is a retreat back to
a unilateral one-to-many mode, where reaction is
safely confined to a "letters to the editor" section
-- oh, excuse me, i mean a "comments" section.
to my mind, the bidirectional many-to-many mode
of listserves is _still_ a better model for the future.
the problems with listserves (the "temporary" nature
of e-mail, and the "merry-go-round" of topics that
never come to a resolution) will be overcome when
we find a way to accumulate their expressed wisdom
into a collaborative collection of finely-honed thought.
as those words likely convey to you, i see the future as
some kind of a hybrid between a listserve and a wiki,
with an infusion of editing via collaborative filtering.
but blogs? well, as much as i love the individual voice,
i see their "emergence" as a short-lived phenomenon
that took advantage of google's pagerank algorithms
(via incestuous crosslinking) to focus attention on a few
needles while the content haystack was relatively small,
when the individual was still able to out-think a crowd
because we hadn't yet invented a mechanism that let us
boil down all our collective intelligence and experience.
when we _do_ have such mechanisms, the individual voice
will come to be appreciated more for its ability to explore
the _outer_edges_of_the_dartboard_, not hit its bullseye.
or, to put it another way, we don't need any individuals
to "represent the voice of the people", not when we can
access that voice directly, in an "ask-the-audience" mode;
but we will always (and even more so in the future) need
individuals to give us the unique voice of the individual.
so blogs will still be "an important cultural document"
in the future, but not in the way that i think you mean.
genevieve tucker on January 7, 2006 7:03 AM:
There are some noises out there already about systematising the metadata or tags added to blogposts - I've written a couple of sketchy posts, with links throughout to StructuredBlogging (which my library school lecturer was quite interested to hear about) and Semantic Blogging, championed by one Peter Morville - here's my post, http://austlit.edublogs.org/2005/12/21/searching-and-structure-a-link-drop/.
Have you tried the Vivisimo search engine, Clusty, at www.clusty.com? It has a neat little clustering arrangement and can be used to search only blogs. One of the only features it doesn't have is searching by date posted - BUT they want to hear from you if you have any suggestions for improvement to the service. Might be something no one has asked for yet.
David Tiley on January 9, 2006 6:25 AM:
Not just a search-by-date problem - we can't discover the post date of items or news stories until we are into the site.
Around, say, Katrina, I was doing a lot of time-specific searching.