Listing entries tagged with data
a girl goes to work (infographic video)
06.26.2006, 10:24 AM
It's not often that you see infographics with soul. Even though visuals are significantly more fun to look at than actual data tables, the oversimplification of infographics tends to suck out the interest in favor of making things quickly comprehensible (often to the detriment of the true data points, like the 2000 election map). This Röyksopp video, on the other hand, a delightful crossover between games, illustration, and infographic, is all about the storyline and subverts data to be a secondary player. This is not pure data visualization on the lines of the front page feature in USA Today. It is, instead, a touching story encased in the traditional visual language and iconography of infographics. The video's currency belies its age: it won the 2002 MTV Europe music video award for best video.
Our information environment is growing both more dispersed and more saturated. Infographics serve as a filter, distilling hundreds of data points down into comprehensible form. They help us peer into the impenetrable data pools in our day-to-day lives, and, in the best case, provide an alternative way to reevaluate our surroundings and make better decisions. (Tufte has also famously argued that infographics can be used to make incredibly poor decisions--caveat lector.)
But infographics do something else; more than visual representations of data, they are beautiful renderings of the invisible and obscured. They stylishly separate signal from noise, bringing a sense of comprehensive simplicity to an overstimulating environment. That's what makes the video so wonderful. In the non-physical space of the animation, the datasphere is made visible. The ambient informatics reflect the information saturation that we navigate every day (some with more serenity than others), but the woman in the video is unperturbed by the massive complexity of the systems that surround her. Her bathroom is part of a maze of municipal water pipes; she navigates the public transport grid with thousands of others; she works at a computer terminal dealing with massive amounts of data (rendered for her in dancing, and therefore somewhat useless, infographics: a clever wink to the audience); she eats food from a worldwide system of agricultural production that delivers it to her (as far as she can tell) in mere moments. This is the complexity that we see and we know and we ignore, just like her. This recursiveness and reference to the real is deftly handled. The video is designed to emphasize the larger picture and allows us to make connections without being visually bogged down in the particulars and textures of reality. The girl's journey from morning to pint is utterly familiar, yet rendered at this larger scale and with the pointed clarity of an information graphic, the narrative is beautiful and touching.
Posted by jesse wilbur at 10:24 AM
| Comments (1)
tags: data , infographic , media , royksopp , video , youtube
if not rdf, then what?
03.28.2006, 11:35 AM
I posted about RDF and the difficulty the web development community has had fully adopting RDF and ontologies as a method of metadata organization. I said that one of the reasons was the relative complexity of RDF and the cost of generating useful metadata (as opposed to just enough information to solve the current problem). Simon St. Laurent has a nice redux of the matter. I won't try to duplicate that, but I do want to explain some of the details about RDF. Though I made a case for how complex RDF is when used to create fully relational data sets, I didn't do a very good job of explaining how simple RDF is in principle. RDF proponents believe they are building the future. I'm not entirely convinced, but I want to take a close look at RDF before I consider other solutions.
RDF seems overwhelming, but in the inimitable words of Squire Patsy, "It's only a model!" A model, in this case, that can represent digital and real things and their relationships. The promise of RDF is that it can describe everything using a combination of unique identifiers, properties and property values.
Unique Identifiers
The heart of RDF is the unique identifier. Your name is a unique identifier, but only as long as there is no one else in the room who answers to [your-name-here]. This, clearly, is not a good way to create a universal identification system. Your social security number is a unique identifier in this country, but it doesn't signify much in China, and the system is not extensible (we'd run out of numbers if we tried to SSN the Chinese). Your email address is a unique identifier on the Internet, and it works pretty well as one. A Uniform Resource Identifier (URI) is a little more extensible, and, since it's longer than an email address, can provide more information. You can use a URI to identify something, even if it can't be retrieved through the web. A product at Amazon.com, for example, could have a unique URI, even though you still need a truck to bring it to you.
Properties
If we look at objects in the real world, they have physical properties, like size, color, and hardness. An example: my kitchen table. It's a three dimensional object, so it has height, width, length. It's made of wood, it has been stained. It also has informational properties: the date I purchased it, the person who sold it to me, the area of the country it came from, the level of personal attachment I have for the thing. Each of these properties can be put into RDF, by linking it to a schema that defines the property in a normative fashion. It'll make a little more sense when I give an example. But for that to happen I need to describe...
Property Values
Property values are the names, numbers, and dates that make properties make sense. My kitchen table is 78" long x 28" wide x 34" tall, dark-walnut stained, and soft (as wood goes). I bought it in February, 2002 from Joe Komenda, and I'm never going to part with it (even though it isn't really NYC apartment sized). Property values are the easy part of the metadata. Associating property values to properties, and properties to normative schemas, that's when things get tricky.
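Before any XML gets involved, the model itself is small enough to sketch in a few lines. What follows is only an illustration in Python, not RDF tooling: the Triple type and the values_of helper are names I made up for the sketch, while the table URI and the kt namespace are the same fake ones used in the XML example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: identifier, property, property value."""
    subject: str  # unique identifier (a URI)
    prop: str     # property, itself named by a URI from some schema
    value: str    # property value (a literal, or another URI)


# Fake URIs taken from the kitchen-table example in this post.
TABLE = "http://www.jdwilbur.fake/furniture/kitchen-table"
KT = "http://www.jdwilbur.fake/furniture#"

triples = [
    Triple(TABLE, KT + "length", "78"),
    Triple(TABLE, KT + "width", "28"),
    Triple(TABLE, KT + "height", "34"),
    Triple(TABLE, KT + "year", "2002"),
]


def values_of(subject: str, prop: str) -> list[str]:
    """Hypothetical helper: every value recorded for one property of one thing."""
    return [t.value for t in triples if t.subject == subject and t.prop == prop]


print(values_of(TABLE, KT + "height"))
```

Everything RDF adds on top of this (schemas, syntax, shared vocabularies) exists to make sure two people mean the same thing by the middle column.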
Here's the example I promised (bound in an XML format):
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:kt="http://www.jdwilbur.fake/furniture#"
  xmlns:geom2d="http://nurl.org/0/geom2d/1.0/"
  xmlns:map="http://nurl.org/0/geography/map/1.0/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.jdwilbur.fake/furniture/kitchen-table">
    <kt:height>34</kt:height>
    <kt:width>28</kt:width>
    <kt:length>78</kt:length>
    <kt:price>150</kt:price>
    <kt:month>February</kt:month>
    <kt:year>2002</kt:year>
    <dc:coverage>
      <geom2d:Point>
        <map:srs rdf:resource="http://nurl.org/0/geography/SRSCatalog/wgs84" />
        <geom2d:x>-123.817</geom2d:x>
        <geom2d:y>46.183</geom2d:y>
      </geom2d:Point>
    </dc:coverage>
    <kt:seller rdf:resource="http://www.komenda.fake/Joseph%20Komenda#" />
    <kt:sellit>Never ever ever</kt:sellit>
  </rdf:Description>
</rdf:RDF>
http://www.jdwilbur.fake/furniture/kitchen-table: The URI of my kitchen table
kt:height: The property height from my schema defined here: http://www.jdwilbur.fake/furniture#
34: The property value that tells me how tall my table is. I would infer from the schema that the value is in inches, not millimeters or light-years.
For the purposes of this example, I've made up my own fake schema (which would be a bunch of lines of XML similar to the example above) and included three real ones: Dublin Core dc, Geomap 2d geom2d for mapping coordinates, and map to relate the coordinates to physical locations. My schema, kt (which stands for the words kitchen table), includes some special properties like seller and sellit. The seller, Joe Komenda, has his own URI (it appears after rdf:resource). The others are fairly standard, but have a specific meaning in my personal context. The only other tricky part is the geographic coordinates, because I'm using three different schemas to define a geographic point. (It's just an example taken from mapbureau. It could resolve to the middle of the Pacific Ocean for all I know.)
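The identifier / property / value breakdown above can be checked mechanically. Here is a rough sketch using only Python's standard-library XML parser on a trimmed-down, well-formed version of the table description. It is not a conforming RDF parser (it only reads flat properties directly under rdf:Description, so the nested geographic point is left out), and the flat_triples function is a name invented for this illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed copy of the kitchen-table description (flat properties only).
DOC = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:kt="http://www.jdwilbur.fake/furniture#">
  <rdf:Description rdf:about="http://www.jdwilbur.fake/furniture/kitchen-table">
    <kt:height>34</kt:height>
    <kt:width>28</kt:width>
    <kt:length>78</kt:length>
    <kt:seller rdf:resource="http://www.komenda.fake/Joseph%20Komenda#" />
  </rdf:Description>
</rdf:RDF>
"""


def flat_triples(xml_text):
    """Return (identifier, property URI, value) for each flat property."""
    root = ET.fromstring(xml_text)
    out = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for child in desc:
            # ElementTree spells a qualified tag as "{namespace}local";
            # joining the two parts gives the full property URI.
            namespace, local = child.tag[1:].split("}", 1)
            resource = child.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                value = resource  # the value is another identified thing
            else:
                value = (child.text or "").strip()  # the value is a literal
            out.append((subject, namespace + local, value))
    return out


for triple in flat_triples(DOC):
    print(triple)
```

Note how kt:height expands to the schema URI plus the local name. That expansion is what lets two data sets agree (or disagree) about what "height" means.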
The obvious point here is that writing RDF is hard. We need automated tools to help us compose in this syntax, which is convoluted but requires perfection to work. Humans are not perfect; RDF is not our language. RDF also requires front-loading: developing schemas, choosing terms and URIs, and finding prior art so that terms can be reused. We need tools to help us manage that aspect. And we need applications that demand RDF. Currently, the demand for RDF is low because it mostly serves to maintain the richness of a data set for some future application, not the ones I work with every day.
So if RDF, syntactically difficult but conceptually easy, cannot get adopted, what is the alternative? The web API. A wide variety of new web applications and services are accompanied by an API. It seems like you can hardly be part of Web 2.0 without one. What does the API have that RDF doesn't? Simplicity. Familiarity. You cannot interact with an API unless you follow the rules. Fine. Same with RDF. But the rules of an API fall into the familiar realm of setting parameters, grabbing previously named functions, and following the documentation. This is like a caffeinated beverage for developers: they instinctively know how to consume it. More than that, APIs mean that people can innovate on an interface level, even if they don't have serious coding chops. I've seen the Google API implemented in twenty minutes. This is a more fluid way to develop; one that feels more comfortable even if it sacrifices information richness. We'll get to RDF one day, maybe in Web 3.5, but until then we will take small steps towards data sharing and interoperability with APIs.
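To make the contrast concrete, here is roughly what "setting parameters and following the documentation" looks like from the developer's side. This is a sketch only: the endpoint, the parameter names, and the build_request function are all hypothetical and do not describe any real service's API. Nothing is sent over the network; the code just assembles the request URL.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only (not a real service).
BASE_URL = "http://api.example.fake/search"


def build_request(query, max_results=10, fmt="xml"):
    """Assemble a request URL from a few named parameters."""
    params = {"q": query, "max": max_results, "format": fmt}
    # urlencode handles escaping, so the caller never touches raw syntax.
    return BASE_URL + "?" + urlencode(params)


url = build_request("kitchen table", max_results=5)
print(url)
```

There is no schema to design and no vocabulary to agree on up front. That is exactly the convenience, and exactly the information richness being traded away.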
Posted by jdwilbur at 11:35 AM
| Comments (7)
tags: RDF , api , data , dublin_core , interoperability , property , schema , syntax , uri , value , web_2.0 , xml
memory fails
09.26.2005, 3:30 PM

Posted by ben vershbow at 03:30 PM
| Comments (0)
tags: CD , beautiful , circle , compactdisc , corrosion , data , disc , disk , history , image , lost , memory , round , rust , time
flash memory: "the digital paper age"?
09.13.2005, 3:47 PM
Heads are spinning in response to Samsung's planned release of a 16 gigabyte flash drive - a string of eight 2GB flash memory cards. Flash memory is solid state data storage, as opposed to the conventional hard drive, which contains spinning mechanical parts. The implication is that the price of memory for computers will soon drop dramatically, as will the amount of energy used to power them. Moreover, you will be able to carry millions upon millions of pages on something the size of a keychain (people will probably start using smaller ones as business cards before too long). There's definitely something reassuring about the solidity - to rely entirely on a single, rickety hard drive, or a network, to store documents is incredibly risky and unreliable. Plus, these cards are far more tolerant of shocks, bad weather and all around abuse.
Chosun Ilbo describes the remarks of Hwang Chang-gyu, Samsung's chief executive, who said:
...the development signaled the opening of the "digital paper age." "In the same way that civilization rapidly progressed after paper was invented 2,000 years ago, flash memory will serve as the 'digital paper' to store all kind of information from documents to photos and videos in the future. Mobile storage devices like CDs and hard disks will gradually disappear over the next two or three years, and flash memory will dominate the information age."
Posted by ben vershbow at 03:47 PM
| Comments (3)
tags: The Ideal Device? , computer , data , datastorage , flash , flashmemory , gadgets , gigabyte , harddrive , korea , memory , paper , paperless , samsung , technology



