
XPUB & Lens-Based wiki

Web Spider in Python


Revision as of 18:28, 4 March 2014 by Michael Murtaugh


Using html5lib

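import urllib.request

import html5lib

url = "http://wikipedia.org/"
# Download the raw page bytes (Python 3: urlopen lives in urllib.request)
html = urllib.request.urlopen(url).read()
# Parse into an ElementTree; namespaceHTMLElements=False keeps tag names plain ("a")
tree = html5lib.parse(html, namespaceHTMLElements=False)
# Visit every link element anywhere in the document
for a in tree.findall(".//a"):
    print("a element", a)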
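
The loop above only prints each element object. A spider also needs the address each link points to, so a possible next step is to read the href attribute and resolve it against the page URL. The following is a minimal sketch in Python 3, not part of the original page: the function names extract_links and fetch_links are hypothetical, and dropping "#fragment" parts and skipping duplicates are assumptions about what a spider would want.

```python
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


def extract_links(tree, base_url):
    """Return absolute, de-duplicated URLs for every <a href> in an ElementTree."""
    links = []
    for a in tree.findall(".//a"):
        href = a.get("href")
        if not href:
            continue  # anchors without an href have nothing to follow
        # Resolve relative links against the page URL, then drop any "#fragment"
        absolute = urldefrag(urljoin(base_url, href)).url
        if absolute not in links:
            links.append(absolute)
    return links


def fetch_links(url):
    """Download url, parse it with html5lib, and return the links it contains."""
    import html5lib  # third-party; imported here so extract_links works without it

    html = urlopen(url).read()
    tree = html5lib.parse(html, namespaceHTMLElements=False)
    return extract_links(tree, url)
```

Calling fetch_links("http://wikipedia.org/") should return a list of absolute URLs. A spider could then add those URLs to a queue of pages still to visit.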

