RSS Feed: Difference between revisions

From XPUB & Lens-Based wiki
(28 intermediate revisions by one other user not shown)
RSS stands for either RDF Site Summary or else for Really Simple Syndication; RSS is a format for publishing lists on the web, such as the last posts to a blog, or the latest audio files of a podcast. RSS is designed to make it easy for software, like a "pod catcher" or a feed reader to automatically collect and download information from websites that a user has "subscribed" to. Feeds can be useful to write scripts that use public websites as services to request, for instance, the latest images added to Flickr with a given tag, or to search a set of news sites for their last headlines.
RSS feeds are a way of publishing lists on the web, such as the latest posts to a blog or the audio files of a podcast. RSS originally meant RDF Site Summary, was popularized by Dave Winer and the blogging community as Really Simple Syndication, and is now said to stand for [[wikipedia:RSS|Rich Site Summary]]. RSS is designed to make it easy for software, like a "pod catcher" or a feed reader, to automatically collect and download information from websites that a user has "subscribed" to. Feeds are also useful for writing scripts that use public websites as services, for instance to request the latest images added to Flickr with a given tag, or to search a set of news sites for their latest headlines.
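To make the idea of a feed as a machine-readable list concrete, here is a minimal sketch that reads the titles and links out of an RSS document using only the Python standard library. The sample feed, its URLs, and the function name <code>list_items</code> are made up for illustration; a real script would fetch the XML from a feed URL instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, inlined so the sketch runs without network access
SAMPLE_RSS = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <link>http://example.org/</link>
    <description>Latest posts</description>
    <item>
      <title>First post</title>
      <link>http://example.org/posts/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/posts/2</link>
    </item>
  </channel>
</rss>"""


def list_items(rss_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(rss_text.encode("utf-8"))
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in list_items(SAMPLE_RSS):
        print(title, link)
```

A feed reader does essentially this on a schedule for every subscribed feed, then shows or downloads whatever items are new.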


Some examples of public feeds
* [http://www.flickr.com/services/feeds/ Flickr feeds]
* [http://www.flickr.com/services/feeds/ Flickr] [http://api.flickr.com/services/feeds/photos_public.gne?format=json json] [http://stackoverflow.com/questions/86163/why-do-i-need-a-flickr-api-key api key?] [http://api.flickr.com/services/feeds/photos_public.gne?tags=woman "woman"] [http://api.flickr.com/services/feeds/photos_public.gne?tags=man "man"]
* [http://www.bbc.co.uk/news/10628494 BBC News] [http://feeds.bbci.co.uk/news/education/rss.xml "education & family"] [http://feeds.bbci.co.uk/news/business/rss.xml "business"]
* [http://archive.org/help/rss.php Internet Archive]
* [http://www.youtube.com/t/rss_feeds YouTube]
* [https://developer.vimeo.com/apis/simple#response-formats Vimeo ("simple api")] ... is a search feed advanced only?
* [http://openclipart.org/docs/api Open Clip Art] [http://openclipart.org/media/feed/rss/woman "woman"] [http://openclipart.org/media/feed/rss/man "man"]
* [http://4chanarchive.org/brchive/feeds.php 4chan]
* A custom [http://jamiedubs.com/rss-feed-of-your-tumblr-dashboard Tumblr] feed creator (Tumblr itself doesn't offer RSS feeds)
* Uncovering a [http://ahrengot.com/tutorials/facebook-rss-feed/ Facebook] feed
* Getting at a [http://thenextweb.com/twitter/2011/06/23/how-to-find-the-rss-feed-for-any-twitter-user/ Twitter] RSS feed
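Several of the links above differ only in their query string (for example <code>tags=woman</code> versus <code>tags=man</code>), so a script can build feed URLs for any tag. The following is a small sketch based on the Flickr links in the list; the function name <code>flickr_feed_url</code> is hypothetical, and combining <code>tags</code> with <code>format</code> in one request is an assumption, since the list only shows them separately.

```python
from urllib.parse import urlencode

# Base URL taken from the Flickr links in the list above
FLICKR_PUBLIC_FEED = "http://api.flickr.com/services/feeds/photos_public.gne"


def flickr_feed_url(tag, fmt=None):
    """Build the public-photos feed URL for one tag.

    fmt is optional; "json" matches the format=json link above.
    Sending both parameters together is an assumption.
    """
    params = {"tags": tag}
    if fmt is not None:
        params["format"] = fmt
    # urlencode also escapes spaces and special characters in the tag
    return FLICKR_PUBLIC_FEED + "?" + urlencode(params)


if __name__ == "__main__":
    print(flickr_feed_url("woman"))
    print(flickr_feed_url("man", fmt="json"))
```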


=== Finding feeds embedded in an HTML page ===


If you view the "[http://en.wikipedia.org/wiki/Special:RecentChanges Recent Changes]" page of Wikipedia and view the source, you will find something like the following:
<source lang="html4strict">
<link rel="alternate" type="application/atom+xml" title="&quot;Special:RecentChanges&quot; Atom feed" href="/w/index.php?title=Special:RecentChanges&amp;feed=atom" />
</source>

Similarly, viewing the source of an article's change [http://en.wikipedia.org/w/index.php?title=Software_development_process&action=history history] reveals a [http://en.wikipedia.org/w/index.php?title=Software_development_process&feed=atom&action=history feed].

== Examples ==

[[Die Zeit]]

[[Category:RSS]]

=== Die Zeit ===
A little script to see the current categories of the RSS feed of the German newspaper "Die Zeit":

http://pzwart3.wdka.hro.nl/~mwocher/cgi-bin/Rss_Zeit6.cgi

<source lang="python">
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# CGI script: count the items per category in the Die Zeit RSS feed and
# print each category name at a font size proportional to its count.

import urllib.request
from collections import Counter

import lxml.etree

# Categories to display, in output order
CATEGORIES = [
    "Politik", "Wirtschaft", "Gesellschaft", "Kultur", "Meinung",
    "Wissen", "Digital", "Studium", "Karriere", "Lebensart",
    "Reisen", "Auto", "Sport",
]

# Scaling factor: font size in pixels per item in a category
Faktor = 10

# Namespace prefixes for XPath queries on namespaced feed elements
# (not needed for the plain <item> and <category> elements counted below)
NS = {
    'media': 'http://search.yahoo.com/mrss/',
    'dc': 'http://purl.org/dc/elements/1.1/',
    'cc': 'http://creativecommons.org/ns#',
    'atom': 'http://www.w3.org/2005/Atom',
}

# Read from the live URL and parse the XML
f = urllib.request.urlopen("http://newsfeed.zeit.de/index")
doc = lxml.etree.parse(f)

# XPath expressions are paths into a document, much like file system paths.
# Count the first category of each item; skip items that have none.
counts = Counter()
for item in doc.xpath("//item"):
    categories = item.xpath(".//category/text()")
    if categories:
        counts[categories[0]] += 1

# A CGI response starts with a header block followed by a blank line
print("Content-Type: text/html; charset=utf-8")
print()
print("""<html>
<head><title>Sample CGI Script</title></head>
<body>""")
for name in CATEGORIES:
    print('<div style="font-size:{0}px;">{1}</div>'.format(counts[name] * Faktor, name))
print("</body></html>")

# urls = doc.xpath("//enclosure/@url")
</source>

Latest revision as of 09:48, 8 October 2012

RSS feeds are a way of publishing lists on the web, such as the latest posts to a blog or the audio files of a podcast. RSS originally meant RDF Site Summary, was popularized by Dave Winer and the blogging community as Really Simple Syndication, and is now said to stand for Rich Site Summary. RSS is designed to make it easy for software, like a "pod catcher" or a feed reader, to automatically collect and download information from websites that a user has "subscribed" to. Feeds are also useful for writing scripts that use public websites as services, for instance to request the latest images added to Flickr with a given tag, or to search a set of news sites for their latest headlines.

Some examples of public feeds

Finding feeds embedded in an HTML page

If you view the "Recent Changes" page of Wikipedia, and view the source you will find something like the following:

<link rel="alternate" type="application/atom+xml" title="&quot;Special:RecentChanges&quot; Atom feed" href="/w/index.php?title=Special:RecentChanges&amp;feed=atom" />

Similarly, viewing the source of an Article's change history reveals a feed.
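Looking for such link elements can be automated. The sketch below scans an HTML page for link tags with rel="alternate" and a feed MIME type, and resolves each href against the page URL. It uses only the Python standard library; the class and function names are hypothetical, and the sample page is a reduced stand-in built around the Wikipedia link element quoted above, with a stylesheet link added to show that non-feed links are ignored.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise Atom and RSS feeds
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkParser(HTMLParser):
    """Collect the absolute URL of every <link rel="alternate"> feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />
        if tag != "link":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        mime = (attrs.get("type") or "").lower()
        href = attrs.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            # The parser has already unescaped &amp; in the attribute value
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html_text, base_url):
    """Return the feed URLs advertised in the given HTML text."""
    parser = FeedLinkParser(base_url)
    parser.feed(html_text)
    return parser.feeds


# Reduced sample page (hypothetical apart from the Atom link element)
SAMPLE_HTML = """<html><head>
<link rel="stylesheet" href="/style.css" />
<link rel="alternate" type="application/atom+xml" title="&quot;Special:RecentChanges&quot; Atom feed" href="/w/index.php?title=Special:RecentChanges&amp;feed=atom" />
</head><body></body></html>"""

if __name__ == "__main__":
    page_url = "http://en.wikipedia.org/wiki/Special:RecentChanges"
    for url in find_feeds(SAMPLE_HTML, page_url):
        print(url)
```

Resolving the href against the page URL matters because sites often publish it as a relative path, as Wikipedia does here.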

Examples

Die Zeit