Scraping
Revision as of 14:00, 11 April 2009
Scraping (also screen scraping) is the process of extracting data from output that was formatted for human readers, such as a web page, rather than from an interface designed for programs.
In the course, we have used the library BeautifulSoup to parse and manipulate HTML pages in Python.
Other interesting libraries to consider:
* [http://codespeak.net/lxml/ lxml]
* [http://code.google.com/p/html5lib/ html5lib]
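
To make the idea of extracting data from an HTML page concrete, here is a minimal sketch that pulls every link out of a page. It deliberately uses only the standard library's html.parser module instead of BeautifulSoup, so that it runs without installing anything. The names LinkExtractor, extract_links and SAMPLE_HTML are made up for this illustration and do not come from the course material.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) pairs found in the HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page, invented for this example.
SAMPLE_HTML = """
<html><body>
  <ul>
    <li><a href="http://codespeak.net/lxml/">lxml</a></li>
    <li><a href="http://code.google.com/p/html5lib/">html5lib</a></li>
  </ul>
</body></html>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(text, href)
```

With BeautifulSoup the same extraction is usually much shorter, typically a loop over the result of find_all('a') (findAll in older releases). The event-based parser above is shown only because it has no dependencies.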