Calendars:Networked Media Calendar/Networked Media Calendar/16-03-2011 -Event 1

From XPUB & Lens-Based wiki
11-18 | Nicolas Maleve - Thematic Project


=== Cookbook Recipes for Goodiff Workshop ===


* [[Simplifying_HTML_by_removing_"invisible"_parts]]
* [[Stripping all the tags from HTML to get pure text]]
* [[Looking up synonym-sets for a word]]
* [[Splitting text into sentences]]
* [[Removing common words / stopwords]]
* [[Finding capitalized words]]
* [[Extracting parts of an HTML document]]
* [[Extracting the text contents of a node]]
* [[Turning part of a page back into code (aka serialization)]]
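
As a rough idea of what a few of the recipes listed above involve, here is a minimal sketch in Python using only the standard library. It is an assumption-laden illustration, not the code from the linked recipe pages: the function names, the tiny stopword list, and the regular expressions are all hypothetical, and the workshop recipes may rely on different tools.

```python
import re
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it"}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping "invisible" parts (script and style)."""

    INVISIBLE = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.INVISIBLE:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.INVISIBLE and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a script/style element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def strip_tags(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def remove_stopwords(words):
    """Drop words that appear in the (hypothetical) STOPWORDS set."""
    return [w for w in words if w.lower() not in STOPWORDS]


def capitalized_words(text):
    """Find words that start with one uppercase ASCII letter."""
    return re.findall(r"\b[A-Z][a-z]+\b", text)


if __name__ == "__main__":
    page = ("<html><head><style>p{color:red}</style></head><body>"
            "<p>Google changed its Terms. The policy is new!</p>"
            "<script>var x=1;</script></body></html>")
    text = strip_tags(page)
    print(text)
    print(split_sentences(text))
    print(capitalized_words(text))
    print(remove_stopwords("The policy is new".split()))
```

The sentence splitter and the capitalized-word pattern are intentionally naive (abbreviations such as "e.g." and non-ASCII capitals are not handled); a dedicated NLP library would be a more robust choice for real documents.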

Revision as of 14:03, 16 March 2011