Calendars:Networked Media Calendar/Networked Media Calendar/16-03-2011 -Event 1: Difference between revisions
; TOS selected words frequency in time (by Dusan and Natasa)
* [[Goodiff_TOS_word_frequency code]]
* [https://spreadsheets.google.com/pub?key=0AgT6KLPteXsOdF84Y0F3RWpxQnQ2ODFOLVA3RG9XWFE&output=html Facebook TOS]
* [https://spreadsheets.google.com/pub?key=0AgT6KLPteXsOdHRuczQxUEU4dWxjWmNjaUtKb2JfM1E&single=true&gid=0&output=html Skype TOS]
Revision as of 20:32, 17 March 2011
11-18 | Nicolas Maleve - Thematic Project
Cookbook Recipes for Goodiff Workshop
- Simplifying HTML by removing "invisible" parts
- Stripping all the tags from HTML to get pure text
- Looking up synonym-sets for a word
- Splitting text into sentences
- Removing common words / stopwords
- Finding capitalized words
- Extracting parts of an HTML document
- Extracting the text contents of a node
- Turning part of a page back into code (aka serialization)
- 16-03-2011_Danny_Fabien_Mirjam
- TOS selected words frequency in time (by Dusan and Natasa)
- Simple statistics TOS
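
Several of the recipes listed above (removing "invisible" parts, stripping tags, removing stopwords, finding capitalized words, counting selected words in a TOS) can be chained into one small pipeline. The following is only a minimal sketch using the Python standard library, not the code written at the workshop: the function names, the tiny stopword list, and the regular expressions are all assumptions made for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "you", "we", "our", "your", "with"}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    INVISIBLE = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside an "invisible" element

    def handle_starttag(self, tag, attrs):
        if tag in self.INVISIBLE:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.INVISIBLE and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Strip all tags and invisible parts, returning whitespace-normalized text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def word_frequencies(text, selected=None):
    """Count lowercase words, ignoring stopwords.

    If `selected` is given, return counts only for those words
    (0 for words that do not occur).
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    if selected is not None:
        return {w: counts[w] for w in selected}
    return dict(counts)


def capitalized_words(text):
    """Return words that start with an uppercase letter, in order of appearance."""
    return re.findall(r"\b[A-Z][a-z]+\b", text)
```

Running `word_frequencies(html_to_text(page), ["privacy", "data"])` on each dated snapshot of a TOS page would give one row per date, which is roughly the shape of data a "frequency in time" table needs.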