Revision as of 21:32, 23 September 2010

Feedparser

Feedparser is a Python library for reading RSS and Atom feeds. It isolates your code from many of the differences between feed formats and versions.

http://feedparser.org/

import feedparser

d = feedparser.parse("http://feedparser.org/docs/examples/atom10.xml")
print(d.entries)

Reading feeds from local files

Feedparser can also read feeds from local files: simply pass a (relative) file path to the parse function instead of a URL. This is very useful when testing because:

1. It's faster than loading the live feed every time.
2. You can keep working, even if a feed is unavailable, or you have no network.
3. When you encounter a problem, you can keep testing the same feed data over and over (keeping copies of "interesting" feeds as needed).

For example, when processing a feed from the New York Times Online, I might use the following to load a live feed:

feed = feedparser.parse("http://www.nytimes.com/services/xml/rss/nyt/GlobalHome.xml")

To do this locally, I can first use wget to download the feed (note the -O option to pick the filename to save to):

wget http://www.nytimes.com/services/xml/rss/nyt/GlobalHome.xml -O nytimes.xml

... and then load the feed in Python with:

feed = feedparser.parse("nytimes.xml")
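
If you would rather stay inside Python than call wget, the download step can be sketched with the standard library alone. The helper name cached_feed_path is hypothetical (it is not part of feedparser), and the sketch assumes Python 3. It downloads the feed only when the local copy does not exist yet, so repeated test runs keep reusing the same data.

```python
import os
import shutil
import urllib.request


def cached_feed_path(url, filename):
    """Return filename, first downloading url into it if it does not exist yet.

    Hypothetical helper for illustration; roughly "wget <url> -O <filename>",
    except that an existing local copy is never overwritten.
    """
    if not os.path.exists(filename):
        with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
            shutil.copyfileobj(response, out)
    return filename
```

The returned path can be handed straight to the parser, for example feedparser.parse(cached_feed_path("http://www.nytimes.com/services/xml/rss/nyt/GlobalHome.xml", "nytimes.xml")). Delete the local file whenever you want a fresh copy of the feed.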
