Prototyping/Download Sample Cut-up Share
== Resources ==
* http://lxml.de/tutorial.html
* http://www.w3.org/TR/xpath/
Revision as of 10:41, 26 October 2011
http://www.openclipart.org/docs/api
Get some feeds. NB: wget's -O option (that's a CAPITAL letter O, not a zero) lets you save each download under a sensible filename of your choice.
wget http://www.openclipart.org/media/feed/rss/woman -O woman.xml
wget http://www.openclipart.org/media/feed/rss/man -O man.xml
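If wget is not available, the same download can be sketched in Python. This is a minimal sketch, assuming Python 3 and that the feed URL follows the pattern of the two wget commands above; the names `FEED_BASE`, `feed_url` and `download_feed` are made up for this example and are not part of the openclipart API.

```python
import urllib.parse
import urllib.request

# Assumption: every feed lives under this prefix, as in the wget commands above
FEED_BASE = "http://www.openclipart.org/media/feed/rss/"

def feed_url(term):
    """Build the RSS feed URL for a search term (hypothetical helper)."""
    # quote() percent-encodes characters such as spaces
    return FEED_BASE + urllib.parse.quote(term)

def download_feed(term, filename=None):
    """Save the feed for `term` to `filename` (default: '<term>.xml')."""
    if filename is None:
        filename = term + ".xml"
    # Plays the role of wget's -O option: fetch the URL, write it to filename
    urllib.request.urlretrieve(feed_url(term), filename)
    return filename
```

Calling `download_feed("woman")` and `download_feed("man")` should then leave woman.xml and man.xml in the current directory, as the wget commands do.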
Creating a simple HTML page from the feed
# Written for Python 3 (print is a function)
import sys, lxml.etree

# Read in the XML file named on the command line.
# Passing the filename lets lxml work out the encoding from the XML declaration.
doc = lxml.etree.parse(sys.argv[1])

# This is a Python dictionary containing
# the XML "namespaces" that we may use
NS = {
    'media': 'http://search.yahoo.com/mrss/',
    'dc': 'http://purl.org/dc/elements/1.1/',
    'cc': 'http://creativecommons.org/ns#',
    'atom': 'http://www.w3.org/2005/Atom',
}

# Do something with each item individually (maybe extracting the names)
items = doc.xpath("//item")
print(len(items), "items")
for item in items:
    # Each xpath() call returns a list; [0] takes the first match
    svg = item.xpath(".//enclosure/@url")[0]
    thumbnail_url = item.xpath(".//media:thumbnail/@url", namespaces=NS)[0]
    creator = item.xpath(".//dc:creator/text()", namespaces=NS)[0]
    title = item.xpath(".//title/text()")[0]
    link = item.xpath(".//link/text()")[0]
    # Print one linked thumbnail per item
    print("""<div>
<a href="{1}"><img src="{2}" />{0}</a>
</div>""".format(title, link, thumbnail_url))
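The script above assumes that every item has a title, link and thumbnail, and that none of those values contain characters such as & or <. A feed may not always be that tidy: `[0]` on an empty result list raises an IndexError, and unescaped text can break the HTML. The following is a minimal sketch of two helpers that guard against both, assuming Python 3; `first` and `render_item` are hypothetical names chosen for this example, not part of lxml.

```python
import html

def first(results, default=""):
    """Return the first XPath result, or `default` when the list is empty."""
    return results[0] if results else default

def render_item(title, link, thumbnail_url):
    """Build one <div> for the page, escaping text and attribute values."""
    return """<div>
<a href="{1}"><img src="{2}" />{0}</a>
</div>""".format(
        html.escape(title),                     # escapes &, < and >
        html.escape(link, quote=True),          # also escapes quotes for attributes
        html.escape(thumbnail_url, quote=True),
    )
```

Inside the loop this could be used as `title = first(item.xpath(".//title/text()"), "untitled")` followed by `print(render_item(title, link, thumbnail_url))`.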