User:Laurier Rochon/prototyping/slinkybookreader: Difference between revisions
(Created page with "== CGI/Python + D3 js library + markov chains = Slinky Text Reader == *Check it out [http://pzwart3.wdka.hro.nl/~lrochon/cgi-bin/slinkydink/work.class.cgi?url=http://www.gutenbe...") |
No edit summary |
||
Line 6: | Line 6: | ||
* You can change the URL to open/parse by giving in another page (preferably from the Gutenberg collection...there is some cleaning code specific to those files) | * You can change the URL to open/parse by giving in another page (preferably from the Gutenberg collection...there is some cleaning code specific to those files) | ||
[[File:Slink1. | [[File:Slink1.png]] | ||
== What is does == | == What is does == | ||
Line 22: | Line 22: | ||
* start writing the SVG nodes (circles), whenever you find a node that already exists, create another link (SVG line) instead of another node | * start writing the SVG nodes (circles), whenever you find a node that already exists, create another link (SVG line) instead of another node | ||
[[File:Slink2. | [[File:Slink2.png]] | ||
[[File:Slink3. | [[File:Slink3.png]] | ||
Revision as of 19:50, 26 October 2011
CGI/Python + D3 js library + markov chains = Slinky Text Reader
- Check it out here
- You can change the amount of nodes (words) displayed by changing the "limit" param
- You can change the URL to open/parse by giving in another page (preferably from the Gutenberg collection...there is some cleaning code specific to those files)
What is does
So I was wandering rather aimlessly in the gutenberg database, and couldn't find much that was so compelling to me. So I decided to just make a simple "work" parser that would pick apart the different pieces of a text and maybe rearrange them. I created a small class that made a "work" object, on which you could call the markovize() method, which effectively did what is says. I thought then, for fun, that I could try to replicate the graphs I built last year, but in animation. The idea was that the system would "read" out to you a book, and when a non-unique word occurred, the story would "split" in many paths, so that you can follow the one you preferred. Erm...I can't say the graph reads in that way, but that was the objective. As new nodes are being pumped out, the rest would move away and create a certain narrative that one could follow on the screen. D3 for js seemed a good fit for this.
What happens, in order
- checks the url param - uses urlib2 to grab it
- parses the file, creates a markov chain by gathering all the unique words of the text, then pairing every word with an index from the unique words list
- write the whole thing in JSON format using simpleJSON (very nice json lib...). the data structure looks like so {"source":"1","target":"2","word":"bananas"}
- if all of the above occurs smoothly, use JS and jQuery to load up the JSON file
- start writing the SVG nodes (circles), whenever you find a node that already exists, create another link (SVG line) instead of another node
A few issues encountered along the way
- D3 documentation sucks. I had worked with it before though
- nodes and links : I wanted to write links dynamically as the nodes popped up - but the links are required 2 points to draw from and to. So how do you draw a line to a node that doesn't exist yet? I had make a little change (that took me a while to figure out) to report that drawing to later, after the "target" node is drawn.
- related to previous problem - my data structure reflected links, but I was considering it as "words" of the markov chain. Things didn't add up logically until I wrapped my head around the fact that every json dictionary describes a link, not a node.