2009 206
Revision as of 14:22, 3 March 2009
Toward a navigable text
Acquiring
Today we are working with the text of 10 poems by Edgar Allan Poe, from Project Gutenberg.
Processing
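import re
import sys

# Map each lowercased word to the number of times it appears.
wc = {}
for line in sys.stdin:
    line = line.rstrip()
    # Split on runs of non-letters; "+" avoids splitting on empty matches.
    words = re.split("[^a-zA-Z]+", line)
    for word in words:
        word = word.lower()
        if word:  # skip empty strings left at the edges of the line
            wc[word] = wc.get(word, 0) + 1

# Print the words in alphabetical order with their counts.
for word in sorted(wc):
    print(word, wc[word])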
Now we make a function that takes a file and turns it into a "word count dictionary". Then we can use this function on different poems.
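One way this could look is sketched below. The function name `word_count` and the in-memory sample text are assumptions made for illustration; they are not part of the original script. The function takes any open file object, so the same code can be pointed at each poem in turn.

```python
import io
import re


def word_count(f):
    """Return a dict mapping each lowercased word in file object f to its count."""
    wc = {}
    for line in f:
        # Same splitting rule as the script above: runs of non-letters separate words.
        for word in re.split("[^a-zA-Z]+", line.rstrip()):
            word = word.lower()
            if word:  # skip empty strings left at the edges of the line
                wc[word] = wc.get(word, 0) + 1
    return wc


# Stand-in for an open poem file (hypothetical sample text);
# in practice you would pass something like open("poem.txt").
sample = io.StringIO(
    "Once upon a midnight dreary,\n"
    "while I pondered, weak and weary,\n"
)
counts = word_count(sample)
for word in sorted(counts):
    print(word, counts[word])
```

Because the function returns the dictionary instead of printing it, the counts for different poems can be kept side by side and compared later.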