|Click on the thumbnails to view the original PDF.|
This is a visualization of only the nouns and verbs in the first chapter of the book (AGONY on The Cave(TM)), organized only according to sentences and paragraphs. My main intent with this series was to try to combine all the information available into a single visualization, so I thought to use paragraphs, sentences, word frequency, part-of-speech tagging and anything else I could think of. I found it difficult to map too many dimensions, however, and since I was doing graph-based visualizations I wanted to keep the node count down, or Graphviz takes an eternity.
I think this severely truncated text is still somewhat understandable. As an aside, evidently it is not as easy to read scrambled intra-word text as I thought. I struggled with including “determiners” (here is the list of parts of speech that the OpenNLP part-of-speech tagger assigns: Penn Treebank Parts of Speech). It seemed egregious to me to eliminate the word no, which is classified as either “DT” (determiner) or “UH” (interjection); however, there are other interjections which are less important. In the end, my rationalization for eliminating the concept of negation is that you can still (mentally) mine the underlying connections, and the logical comparison is less important than the connection. At least, that’s my story and I’m sticking to it. It has some strange consequences: “No pain no gain” in the text becomes pain -> gain in the sentence visualization. Red circles are verbs (any type, just looking for the “VB” prefix in the POS tags) and blue squares are nouns (likewise, just looking for the “NN” prefix).
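Roughly, the prefix filtering described above could be sketched like this in Python. This is not the code that produced the visualizations: the function name and the hand-tagged example sentence are made up for illustration, and the tags are assumed to be Penn Treebank tags already produced by a tagger such as OpenNLP's.

```python
def keep_nouns_and_verbs(tagged_sentence):
    """Keep only nouns and verbs from a list of (word, tag) pairs.

    Tags are assumed to be Penn Treebank tags; only the prefix is checked,
    so "NN", "NNS", "NNP" all count as nouns and "VB", "VBD", "VBG" as verbs.
    """
    kept = []
    for word, tag in tagged_sentence:
        if tag.startswith("NN"):
            kept.append((word, "noun"))
        elif tag.startswith("VB"):
            kept.append((word, "verb"))
        # Everything else (determiners, interjections, ...) is dropped,
        # which is how the negation in "no" disappears.
    return kept


# Hand-tagged example (hypothetical tagger output).
sentence = [("No", "DT"), ("pain", "NN"), ("no", "DT"), ("gain", "NN")]
print(" -> ".join(word for word, _ in keep_nouns_and_verbs(sentence)))
# prints: pain -> gain
```

Checking only the prefix keeps the filter short, at the cost of lumping every noun and verb subtype together.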
For these two visualizations, I used the text of another chapter (ANALOG on Katamari Damacy) and knocked it down to only the nouns. However, I wanted to actually use the graph structure to try to convey some information (the earlier paragraph/sentence visualization doesn’t have any edges), so I created an index of nouns, and then edges from each index word (in a black box) to any occurrences in the text. I then rendered it using “dot” for the hierarchical one and “neato” for the energy-minimized one. For the energy-minimized one (the one I was really shooting for) the nodes overlap, and I attempted to fix this using neato’s Voronoi and scaling overlap features. Sadly, the Voronoi code seems to take forever. I also tried the fix here, to no avail. The scaling just crashes Graphviz. So, not what I wanted, but better than nothing.
|dot renders things more quickly than neato, I was impatient, and once I saw it I really liked it.|
|this is closest to my original intention with the Katamari Damacy visualizations, since it resembles a big ball.|
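For the curious, the noun index graph described above could be generated along these lines. This is a sketch under assumptions, not the original code: the function name, node ids and styling are invented here, and the input is assumed to be a list of sentences already reduced to their nouns. The `overlap` graph attribute is the Graphviz setting that neato uses for overlap removal (values such as `scale` or `voronoi`); dot ignores it.

```python
def noun_index_dot(sentences, overlap=None):
    """Build Graphviz DOT text linking each index noun to its occurrences.

    `sentences` is a list of lists of nouns (one inner list per sentence).
    Words are assumed not to contain double quotes.
    """
    lines = ["digraph noun_index {"]
    if overlap is not None:
        # Only meaningful for neato and similar layouts.
        lines.append(f'  overlap="{overlap}";')

    index_ids = {}  # lowercased noun -> id of its index node
    edges = []
    for s, nouns in enumerate(sentences):
        for w, noun in enumerate(nouns):
            key = noun.lower()
            if key not in index_ids:
                # One boxed index node per distinct noun.
                index_ids[key] = f"idx{len(index_ids)}"
                lines.append(f'  {index_ids[key]} [label="{key}", shape=box];')
            # One node per occurrence, identified by sentence and position.
            occurrence = f"s{s}_w{w}"
            lines.append(f'  {occurrence} [label="{noun}", shape=box, color=blue];')
            edges.append(f"  {index_ids[key]} -> {occurrence};")

    lines.extend(edges)
    lines.append("}")
    return "\n".join(lines)


# Tiny made-up example: two sentences, nouns only.
print(noun_index_dot([["ball", "prince"], ["ball"]], overlap="scale"))
```

Saving the output to a file, say `nouns.dot`, and running `dot -Tpdf nouns.dot -o nouns.pdf` gives the hierarchical layout, while `neato -Tpdf nouns.dot -o nouns.pdf` gives the energy-minimized one.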