2010 1.09: Difference between revisions

From XPUB & Lens-Based wiki

Revision as of 22:26, 6 December 2010

Read an RSS feed from a URL given on the command line

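#!/usr/bin/env python3
import sys
import feedparser  # third-party library: pip install feedparser

# Use the URL given on the command line, or fall back to a default feed.
try:
    url = sys.argv[1]
except IndexError:
    url = "http://feeds.bbci.co.uk/news/rss.xml"

# Parse the feed and print the title of each entry.
feed = feedparser.parse(url)
for e in feed.entries:
    print(e.title)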
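
To make it clearer what the feed parser hands back, here is a rough sketch that pulls item titles out of an RSS 2.0 document using only the standard library. It is not how feedparser works internally; `xml.etree.ElementTree` is swapped in purely for illustration, and the `SAMPLE_RSS` text and the `item_titles` helper are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for illustration (not a real feed).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First story</title></item>
    <item><title>Second story</title></item>
  </channel>
</rss>"""


def item_titles(rss_text):
    """Return the <title> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    # Only <item> elements are visited, so the channel's own title is skipped.
    return [item.findtext("title", default="") for item in root.iter("item")]


for title in item_titles(SAMPLE_RSS):
    print(title)
```

Real feeds come in several formats (RSS variants, Atom) and are often malformed, which is the reason to prefer a dedicated parser over a sketch like this.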

Words

Turns a text into an alphabetical list of unique words. Attempts to strip punctuation (leading and trailing only) and lowercases everything.

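#!/usr/bin/env python3

import sys, string

# Map each word (lowercased, surrounding punctuation stripped) to its count.
words = {}
for line in sys.stdin:
    for word in line.split():
        word = word.lower().strip(string.punctuation)
        if not word:  # token was only punctuation, e.g. "--"
            continue
        words[word] = words.get(word, 0) + 1

# Print the unique words in alphabetical order on a single line.
print(" ".join(sorted(words.keys())))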
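
One detail of `strip(string.punctuation)` is easy to miss: it removes punctuation only from the two ends of a token, and a token made up entirely of punctuation becomes an empty string. The sketch below wraps the same idea in a function so it can be tried on a string instead of standard input. The `unique_words` name and the sample sentence are invented for this example.

```python
import string


def unique_words(text):
    """Return the sorted unique words in text, lowercased, ends stripped of punctuation."""
    found = set()
    for token in text.split():
        word = token.lower().strip(string.punctuation)
        if word:  # drop tokens that were nothing but punctuation
            found.add(word)
    return sorted(found)


print(unique_words('The cat -- the "Cat" -- didn\'t sit.'))
```

For this sample the result is `['cat', "didn't", 'sit', 'the']`: the quotes and the final full stop are gone, the two dashes are dropped, and the apostrophe inside "didn't" survives because it is not at either end.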


Word counts

Grab words (as above) and display one per line followed by the number of times the word appears.

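#!/usr/bin/env python3

import sys, string

# Map each word (lowercased, surrounding punctuation stripped) to its count.
words = {}
for line in sys.stdin:
    for word in line.split():
        word = word.lower().strip(string.punctuation)
        if not word:  # token was only punctuation, e.g. "--"
            continue
        words[word] = words.get(word, 0) + 1

# Print one word per line, alphabetically, followed by its count.
for (word, count) in sorted(words.items()):
    print(word, count)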
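
As a possible variation, the counting can be handed to `collections.Counter` from the standard library, which also makes it easy to list the most frequent words first instead of alphabetically. This is only a sketch of an alternative, not the approach used above; the `count_words` helper and the sample lines are invented for this example.

```python
import string
from collections import Counter


def count_words(lines):
    """Count words across lines, lowercased and with surrounding punctuation stripped."""
    counts = Counter()
    for line in lines:
        for token in line.split():
            word = token.lower().strip(string.punctuation)
            if word:  # ignore tokens that were only punctuation
                counts[word] += 1
    return counts


# Hypothetical sample input; any iterable of lines works, including sys.stdin.
sample = ["the cat sat", "The cat ran."]
for word, count in count_words(sample).most_common():
    print(word, count)
```

`most_common()` returns the pairs ordered by descending count, so "the" and "cat" (2 each) come before "sat" and "ran" (1 each).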