HTML5lib is a Python module for parsing HTML documents.
https://code.google.com/p/html5lib/